Mathematical Physics
2000
Edited by
A Fokas, A Grigoryan, T Kibble, B Zegarlinski
Imperial College, London
Imperial College Press
Published by
Imperial College Press
57 Shelton Street, Covent Garden, London WC2H 9HE

Distributed by
World Scientific Publishing Co. Pte. Ltd.
P O Box 128, Farrer Road, Singapore 912805
USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
Library of Congress Cataloging-in-Publication Data
Mathematical physics 2000 / edited by A. Fokas ... [et al.].
p. cm.
ISBN 186094230X (alk. paper)
1. Mathematical physics-Congresses. I. Fokas, A. S., 1952- . II. Title.
QC 19.2.1538 2000
530.15-dc21 00-037042
British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.
Copyright © 2000 by Imperial College Press (Copyright of each article is owned by the contributors). All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
Printed in Singapore by Uto-Print
PREFACE

The International Congress on Mathematical Physics in the year 2000 is to be held at Imperial College, London. It occurred to the local organizers that, since this is a natural time to look back at the achievements of the twentieth century and forward to the opportunities of the twenty-first, it would be an appropriate occasion on which to ask a number of distinguished mathematical physicists to contribute their personal perspectives on some aspects of the discipline. We did not try to impose any general structure or theme; nor did we aim for comprehensive coverage. Instead, we invited a number of experts and left them to select their own topics and approaches. The result is seventeen diverse and highly individual articles on a wide variety of topics, providing many fascinating insights into our field.

We are very grateful to the authors who agreed to write articles for this volume. We also wish to acknowledge the support and expert assistance of the publishers, Imperial College Press. The book is to be ready in time for the International Congress in July 2000. We hope it will be of interest to all the participants, and indeed to mathematicians and physicists in general, especially to young people just starting out on their careers.
A. Fokas
A. Grigoryan
T. Kibble
B. Zegarlinski
CONTENTS

Preface

Modern Mathematical Physics: What It Should Be
L.D. Faddeev

New Applications of the Chiral Anomaly
Jürg Fröhlich and Bill Pedrini

Fluctuations and Entropy Driven Space-Time Intermittency in Navier-Stokes Fluids
Giovanni Gallavotti

Superstrings and the Unification of the Physical Forces
Michael B. Green

Questions in Quantum Physics: A Personal View
Rudolf Haag

What Good are Quantum Field Theory Infinities?
Roman Jackiw

Constructive Quantum Field Theory
Arthur Jaffe

Fourier's Law: A Challenge to Theorists
F. Bonetto, J.L. Lebowitz and L. Rey-Bellet

The "Corpuscular" Structure of the Spectra of Operators Describing Large Systems
R.A. Minlos

Vortex- and Magneto-Dynamics: A Topological Perspective
H.K. Moffatt

Gauge Theory: The Gentle Revolution
L. O'Raifeartaigh

Random Matrices as Paradigm
L. Pastur

Wavefunction Collapse as a Real Gravitational Effect
Roger Penrose

Schrödinger Operators in the Twenty-First Century
Barry Simon

The Classical Three-Body Problem: Where is Abstract Mathematics, Physical Intuition, Computational Physics Most Powerful?
H.A. Posch and W. Thirring

Infinite Particle Systems and Their Scaling Limits
S.R.S. Varadhan

Supersymmetry: A Personal View
B. Zumino
MODERN MATHEMATICAL PHYSICS: WHAT IT SHOULD BE
L. D. FADDEEV
Steklov Mathematical Institute, St Petersburg 191011, Russia

When somebody asks me what I do in science, I call myself a specialist in mathematical physics. As I have been there for more than 40 years, I have some definite interpretation of this combination of words: "mathematical physics." Cynics or purists can insist that this is neither mathematics nor physics, adding comments with a different degree of malice. Naturally, this calls for an answer, and in this short essay I want to explain briefly my understanding of the subject. It can be considered as my contribution to the discussion about the origin and role of mathematical physics and thus to be relevant for this volume.

The matter is complicated by the fact that the term "mathematical physics" (often abbreviated as MP in what follows) is used in different senses and can have rather different content. This content changes with time, place and person.

I did not study properly the history of science; however, it is my impression that, in the beginning of the twentieth century, the term MP was practically equivalent to the concept of theoretical physics. Not only Henri Poincaré, but also Albert Einstein, were called mathematical physicists. Newly established theoretical chairs were called chairs of mathematical physics. It follows from the documents in the archives of the Nobel Committee that MP had a right to appear both in the nominations and discussion of the candidates for the Nobel Prize in physics [1]. Roughly speaking, the concept of MP covered theoretical papers where mathematical formulae were used.

However, during an unprecedented bloom of theoretical physics in the 20s and 30s, an essential separation of the terms "theoretical" and "mathematical" occurred. For many people, MP was reduced to the important but auxiliary course "Methods of Mathematical Physics" including a set of useful mathematical tools. The monograph of P. Morse and H. Feshbach [2] is a classical example of such a course, addressed to a wide circle of physicists and engineers.

On the other hand, MP in the mathematical interpretation appeared as a theory of partial differential equations and variational calculus. The monographs of R. Courant and D. Hilbert [3] and S. Sobolev [4] are outstanding illustrations of this development. The theorems of existence and uniqueness based on the variational principles, a priori estimates, and imbedding theorems for functional spaces comprise the main content of this direction. As a student of O. Ladyzhenskaya, I was immersed in this subject since the 3rd year of my undergraduate studies at the Physics Department of Leningrad University. My fellow student N. Uraltseva now holds the chair of MP exactly in this sense.

MP in this context has as its source mainly geometry and such parts of classical mechanics as hydrodynamics and elasticity theory. Since the 60s a new impetus to MP in this sense was supplied by Quantum Theory. Here the main apparatus is functional analysis, including the spectral theory of operators in Hilbert space, the mathematical theory of scattering and the theory of Lie groups and their representations. The main subject is the Schrödinger operator. Though the methods and concrete content of this part of MP are essentially different from those of its classical counterpart, the methodological attitude is the same. One sees the quest for the rigorous mathematical theorems about results which are understood by physicists in their own way.

I was born as a scientist exactly in this environment. I graduated from the unique chair of Mathematical Physics, established by V.I. Smirnov at the Physics Department of Leningrad University already in the 30s. In his venture V.I. Smirnov got support from V. Fock, the world famous theoretical physicist with very wide mathematical interests. Originally this chair played the auxiliary role of being responsible for the mathematical courses for physics students. However, in 1955 it got permission to supervise its own diploma projects, and I belonged to the very first group of students using this opportunity.

As I already mentioned, O.A. Ladyzhenskaya was our main professor. Although her own interests were mostly in nonlinear PDEs and hydrodynamics, she decided to direct me to quantum theory. During the last two years of undergraduate studies I was to read the monograph of K.O. Friedrichs, "Mathematical Aspects of Quantum Field Theory," and relate it to our group of 5 students and our professor at a special seminar. At the same time my student friends from the chair of Theoretical Physics were absorbed in reading the first monograph on Quantum Electrodynamics by A. Akhiezer and V. Berestetsky. The difference in attitudes and language was striking and I was to become accustomed to both.

After my graduation O.A. Ladyzhenskaya remained my tutor but she left me free to choose research topics and literature to read. I read both mathematical papers (i.e. on direct and inverse scattering problems by I.M. Gelfand and B.M. Levitan, V.A. Marchenko, M.G. Krein, A.Ya. Povzner) and "Physical Review" (i.e. on formal scattering theory by M. Gell-Mann, M. Goldberger, J. Schwinger and H. Ekstein) as well. Papers by I. Segal, L. Van Hove and R. Haag added to my first impressions on Quantum Field Theory taken from K. Friedrichs.

In the process of this self-education my own understanding of the nature and goals of MP gradually deviated from the prevailing views of the members of the V. Smirnov chair. I decided that it is more challenging to do something which is not known to my colleagues from theoretical physics rather than supply existence theorems. My first work on the inverse scattering problem, especially for the multi-dimensional Schrödinger operator, and that on the three body scattering problem confirm that I really tried to follow this line of thought.

This attitude became even firmer when I began to work on Quantum Field Theory in the middle of the 60s. As a result, my understanding of the goal of MP was drastically modified. I consider as the main goal of MP the use of mathematical intuition for the derivation of really new results in fundamental physics. In this sense, MP and Theoretical Physics are competitors. Their goals in unraveling the laws of the structure of matter coincide. However, the methods and even the estimates of the importance of the results of work may differ quite significantly.

Here it is time to say in what sense I use the term "fundamental physics." The adjective "fundamental" has many possible interpretations when applied to the classification of science. In a wider sense it is used to characterize the research directed to unraveling new properties of physical systems. In the narrow sense it is kept only for the search for the basic laws that govern and explain these properties.

Thus, all chemical properties can be derived from the Schrödinger equation for a system of electrons and nuclei. Alternatively, we can say that the fundamental laws of chemistry in a narrow sense are already known. This, of course, does not deprive chemistry of the right to be called a fundamental science in a wide sense.

The same can be said about classical mechanics and the quantum physics of condensed matter. Whereas the largest part of physical research lies now in the latter, it is clear that all its successes, including the theory of superconductivity and superfluidity, Bose-Einstein condensation and the quantum Hall effect, have a fundamental explanation in the nonrelativistic quantum theory of many body systems.

An unfinished physical fundamental problem in a narrow sense is the physics of elementary particles. This puts this part of physics into a special position. And it is here where modern MP has the most probable chances for a breakthrough.

Indeed, until recent time, all physics developed along the traditional circle: experiment – theoretical interpretation – new experiment. So the theory traditionally followed the experiment. This imposes a severe censorship on the theoretical work. Any idea, bright as it is, which is not supplied by the experimental knowledge at the time when it appeared is to be considered wrong and as such must be abandoned. Characteristically the role of censors might be played by theoreticians themselves, and the great L. Landau and W. Pauli were, as far as I can judge, the most severe ones. And, of course, they had very good reason.

On the other hand, the development of mathematics, which is also to a great extent influenced by applications, has nevertheless its internal logic.
Ideas are judged not by their relevance but more by esthetic criteria. The totalitarianism of theoretical physics gives way to a kind of democracy in mathematics and its inherent intuition. And exactly this freedom could be found useful for particle physics. This part of physics traditionally is based on the progress of accelerator techniques. The very high cost and restricted possibilities of the latter soon will become an uncircumventable obstacle to further development. And it is here that mathematical intuition could give an adequate alternative. This was already stressed by famous theoreticians with mathematical inclinations. Indeed, let me cite a paper [5] by P. Dirac from the early 30s:

"The steady progress of physics requires for its theoretical formulation a mathematics that gets continually more advanced. This is only natural and to be expected. What, however, was not expected by the scientific workers of the last century was the particular form that the line of advancement of the mathematics would take, namely, it was expected that the mathematics would get more complicated, but would rest on a permanent basis of axioms and definitions, while actually the modern physical developments have required a mathematics that continually shifts its foundations and gets more abstract. Non-euclidean geometry and non-commutative algebra, which were at one time considered to be purely fictions of the mind and pastimes for logical thinkers, have now been found to be very necessary for the description of general facts of the physical world. It seems likely that this process of increasing abstraction will continue in the future and that advance in physics is to be associated with a continual modification and generalization of the axioms at the base of mathematics rather than with logical development of any one mathematical scheme on a fixed foundation. There are at present fundamental problems in theoretical physics awaiting solution, e.g., the relativistic formulation of quantum mechanics and the nature of atomic nuclei (to be followed by more difficult ones such as the problem of life), the solution of which problems will presumably require a more drastic revision of our fundamental concepts than any that have gone before. Quite likely these changes will be so great that it will be beyond the power of human intelligence to get the necessary new ideas by direct attempts to formulate the experimental data in mathematical terms. The theoretical worker in the future will therefore have to proceed in a more indirect way. The most powerful method of advance that can be suggested at present is to employ all the resources of pure mathematics in attempts to perfect and generalise the mathematical formalism that forms the existing basis of theoretical physics, and after each success in this direction, to try to interpret the new mathematical features in terms of physical entities."

Similar views were expressed by C.N. Yang. I did not find a compact citation, but the spirit of his commentaries to his own collection of papers [6] shows this attitude. Also he used to tell this to me in private discussions.

I believe that the dramatic history of setting the gauge fields as a basic tool in the description of interactions in Quantum Field Theory gives a good illustration of the influence of mathematical intuition on the development of fundamental physics. Gauge fields, or Yang-Mills fields, were introduced to the wide audience of physicists in 1954 in a short paper by C.N. Yang and R. Mills [7], dedicated to the generalization of the electromagnetic fields and the corresponding principle of gauge invariance. The geometric sense of this principle for the electromagnetic field was made clear as early as the late 20s due to the papers of V. Fock [8] and H. Weyl [9]. They underlined the analogy of the gauge (or gradient in the terminology of V. Fock) invariance of electrodynamics and the equivalence principle of the Einstein theory of gravitation. The gauge group in electrodynamics is commutative and corresponds to the multiplication of the complex field (or wave function) of the electrically charged particle by a phase factor depending on the space-time coordinates. Einstein's theory of gravity provides an example of a much more sophisticated gauge group, namely the group of general coordinate transformations. Both H. Weyl and V. Fock were to use the language of the moving frame with spin connection, associated with local Lorentz rotations. Thus the Lorentz group became the first nonabelian gauge group and one can see in [8] essentially all the formulas characteristic of nonabelian gauge fields. However, in contradistinction to the electromagnetic field, the spin connection enters the description of the space-time and not the internal space of electric charge.

In the middle of the 30s, after the discovery of the isotopic spin in nuclear physics, O. Klein [10] introduced the corresponding noncommutative group and the affiliated vector field. And apparently it was here that the censorship of W. Pauli played its killing role for very good reason, based on the experimental facts. The problem was that of mass: the mass of would-be vector particles is equal to zero classically; the only known massless particles (and accompanying long-range interactions) are the photon and graviton. There is no room for the massless quanta of the hypothetical charged vector fields, so the theory must be rejected and forgotten.

Thus there is no wonder that Yang received the same reaction when he presented his work at Princeton in 1954. The dramatic account of this event can be found in his commentaries [6]. Pauli was in the audience and immediately raised the question about mass. It is evident from Yang's text that Pauli was well acquainted with the differential geometry of nonabelian vector fields but did not allow himself to speak about them. As we know now, the boldness of Yang and his esthetic feeling finally were vindicated. And it can be rightly said that C.N. Yang proceeded according to mathematical intuition.

In 1954 the paper of Yang and Mills did not move to the forefront of high energy theoretical physics. However, the idea of the charged space with noncommutative symmetry group acquired more and more popularity due to the increasing number of elementary particles and the search for the universal scheme of their classification. And at that time the decisive role in the promotion of the Yang-Mills fields was also played by mathematical intuition.

At the beginning of the 60s, R. Feynman worked on the extension of his own scheme of quantization of the electromagnetic field to the gravitation theory of Einstein. A purely technical difficulty, the abundance of the tensor indices, made his work rather slow. Following the advice of M. Gell-Mann, he exercised first on the simpler case of the Yang-Mills fields. To his surprise, he found that a naive generalization of his diagrammatic rules designed for electrodynamics did not work for the Yang-Mills field. The unitarity of the S-matrix was broken.
Feynman restored the unitarity in one loop by reconstructing the full scattering amplitude from its imaginary part and found that the result can be interpreted as a subtraction of the contribution of some fictitious particle. However, his technique became quite cumbersome beyond one loop. His approach was gradually developed by B. DeWitt [11]. It must be stressed that the physical senselessness of the Yang-Mills field did not preclude Feynman from using it for mathematical construction.

The work of Feynman [12] became one of the starting points for my work in Quantum Field Theory, which I began in the middle of the 60s together with Victor Popov. Another equally important point was the mathematical monograph by A. Lichnerowicz [13], dedicated to the theory of connections in vector bundles. From Lichnerowicz's book it followed clearly that the Yang-Mills field has a definite geometric interpretation: it defines a connection in the vector bundle, the base being the space-time and the fiber the linear space of the representation of the compact group of charges. Thus, the Yang-Mills field finds its natural place among the fields of geometrical origin between the electromagnetic field (which is its particular example for the one-dimensional charge) and Einstein's gravitation field, which deals with the tangent bundle of the Riemannian space-time manifold.

It became clear to me that such a possibility cannot be missed and, notwithstanding the unsolved problem of zero mass, one must actively tackle the problem of the correct quantization of the Yang-Mills field.
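The geometric statement about connections can be written out in formulas. What follows is only a sketch in one common convention (Hermitian generators, coupling constant absorbed into the field); the symbols $T_a$, $g(x)$, $D_\mu$, $F_{\mu\nu}$ and $\chi$ are notation assumed here for illustration and are not taken from the essay.

```latex
% Yang-Mills field as a connection: a Lie-algebra-valued vector field
A_\mu(x) = A_\mu^a(x)\,T_a , \qquad D_\mu = \partial_\mu - i A_\mu .
% Curvature (field strength) of the connection
F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu - i\,[A_\mu, A_\nu] ,
\qquad [D_\mu, D_\nu] = -i\,F_{\mu\nu} .
% Gauge transformation by a group-valued function g(x) acting in the fiber
\psi \mapsto g\,\psi , \qquad
A_\mu \mapsto g A_\mu g^{-1} - i\,(\partial_\mu g)\,g^{-1} , \qquad
F_{\mu\nu} \mapsto g F_{\mu\nu} g^{-1} .
% One-dimensional charge: g = e^{i\chi}, all commutators vanish
A_\mu \mapsto A_\mu + \partial_\mu \chi , \qquad
F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu .
```

In the abelian case the commutator term drops out and the familiar electromagnetic gauge transformation is recovered, which is the sense in which electrodynamics appears as the special case of a one-dimensional charge.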
The geometric origin of the Yang-Mills field gave a natural way to resolve the difficulties with the diagrammatic rules. The formulation of the quantum theory in terms of Feynman's functional integral happened to be most appropriate from the technical point of view. Indeed, to take into account the gauge equivalence principle one has to integrate over the classes of gauge equivalent fields rather than over every individual configuration. As soon as this idea is understood, the technical realization is rather straightforward. As a result V. Popov and I came out at the end of 1966 with a set of rules valid for all orders of perturbation theory. The fictitious particles appeared as auxiliary variables giving the integral representation for the nontrivial determinant entering the measure over the set of gauge orbits.

Correct diagrammatic rules of quantization of the Yang-Mills field, obtained by V. Popov and me in 1966-1967 [14, 15], did not attract the immediate attention of physicists. Moreover, the time when our work was done was not favorable for it. Quantum Field Theory was virtually forbidden, especially in the Soviet Union, due to the influence of Landau. "The Hamiltonian is dead": this phrase from his paper [16], dedicated to the anniversary of W. Pauli, shows the extreme of Landau's attitude. The reason was quite solid; it was based not on experiment, but on the investigation of the effects of renormalization, which led Landau and his coworkers to believe that the renormalized physical coupling constant is inevitably zero for all possible local interactions. So there was no way for Victor Popov and me to publish an extended article in a major Soviet journal. We opted for the short communication in "Physics Letters" and were happy to be able to publish the full version in the preprint series of the newly opened Kiev Institute of Theoretical Physics. This preprint was finally translated into English by B. Lee as a Fermilab preprint in 1972, and from the preface to the translation it follows that it was known in the West already in 1968.

A decisive role in the successful promotion of our diagrammatic rules into physics was played by the works of G. 't Hooft [17], dedicated to the Yang-Mills field interacting with the Higgs field (and which ultimately led to a Nobel Prize for him in 1999), and the discovery of dimensional transmutation (the term of S. Coleman [18]). The problem of mass was solved in the first case via spontaneous symmetry breaking. The second development was based on asymptotic freedom. There exists a vast literature dedicated to the history of this dramatic development. I refer to the recent papers of G. 't Hooft [19] and D. Gross [20], where the participants in this story share their impressions of this progress. As a result, the Standard Model of unified interactions got its main technical tool. From the middle of the 70s until our time it remains the fundamental base of high energy physics. For our discourse it is important to stress once again that the paper [14], based on mathematical intuition, preceded the works made in the traditions of theoretical physics.

The Standard Model did not complete the development of fundamental physics in spite of its unexpected and astonishing experimental success. The gravitational interactions, whose geometrical interpretation is slightly different from that of the Yang-Mills theory, are not included in the Standard Model. The unification of quantum principles, Lorentz-Einstein relativity and Einstein gravity has not yet been accomplished. We have every reason to conjecture that the modern MP and its mode of working will play the decisive role in the quest for such a unification.
Indeed, the new generation of theoreticians in high energy physics have received an incomparably higher mathematical education. They are not subject to the pressure of old authorities maintaining the purity of physical thinking and/or terminology. Furthermore, many professional mathematicians, tempted by the beauty of the methods used by physicists, moved to the position of the modern mathematical physics. Let us cite from the manifesto, written by R. MacPherson during the organization of the Quantum Field Theory year at the School of Mathematics of the Institute for Advanced Study at Princeton:

"The goal is to create and convey an understanding, in terms congenial to mathematicians, of some fundamental notions of physics, such as quantum field theory. The emphasis will be on developing the intuition stemming from functional integrals. One way to define the goals of the program is by negation, excluding certain important subjects commonly pursued by mathematicians whose work is motivated by physics. In this spirit, it is not planned to treat except peripherally the magnificent new applications of field theory, such as Seiberg-Witten equations to Donaldson theory. Nor is the plan to consider fundamental new constructions within mathematics that were inspired by physics, such as quantum groups or vertex operator algebras. Nor is the aim to discuss how to provide mathematical rigor for physical theories. Rather, the goal is to develop the sort of intuition common among physicists for those who are used to thought processes stemming from geometry and algebra."

I propose to call the intuition to which MacPherson refers that of mathematical physics. I also recommend the reader to look at the instructive drawing by R. Dijkgraaf on the dust cover of the volumes of lectures given at the School [21].

The union of these two groups constitutes an enormous intellectual force.
In the next century we will learn if this force is capable of substituting for the traditional experimental base of the development of fundamental physics and pertinent physical intuition.

References

1. B. Nagel, The Discussion Concerning the Nobel Prize of Max Planck, Science Technology and Society in the Time of Alfred Nobel (New York: Pergamon, 1982).
2. P. Morse and H. Feshbach, Methods of Theoretical Physics (New York: McGraw-Hill, 1953).
3. R. Courant and D. Hilbert, Methoden der mathematischen Physik (Berlin: Springer, 1931).
4. S.L. Sobolev, Nekotorye primeneniya funktsional'nogo analiza v matematicheskoi fizike (Some Applications of Functional Analysis in Mathematical Physics) (Leningrad: Leningrad. Gos. Univ., 1950).
5. P. Dirac, Quantized Singularities in the Electromagnetic Field, Proc. Roy. Soc. London A 133, 60-72 (1931).
6. C.N. Yang, Selected Papers 1945-1980 with Commentary (San Francisco: Freeman, 1983).
7. C.N. Yang and R. Mills, Conservation of Isotopic Spin and Isotopic Gauge Invariance, Phys. Rev. 96, 191-195 (1954).
8. V. Fock, L'équation d'onde de Dirac et la géométrie de Riemann, J. Phys. et Rad. 10, 392-405 (1929).
9. H. Weyl, Electron and Gravitation, Zeit. Phys. 56, 330-352 (1929).
10. O. Klein, On the Theory of Charged Fields: Submitted to the Conference New Theories in Physics, Warsaw, 1938, Surv. High Energy Phys. 5, 269 (1986).
11. B. DeWitt, Quantum Theory of Gravity II: The manifest covariant theory, Phys. Rev. 162, 1195-1239 (1967).
12. R.P. Feynman, Quantum Theory of Gravitation, Acta Phys. Polon. 24, 697-722 (1963).
13. A. Lichnerowicz, Théorie globale des connexions et des groupes d'holonomie (Roma: Ed. Cremonese, 1955).
14. L. Faddeev and V. Popov, Feynman Diagrams for the Yang-Mills Field, Phys. Lett. B 25, 29-30 (1967).
15. V. Popov and L. Faddeev, Perturbation Theory for Gauge-Invariant Fields, Preprint, National Accelerator Laboratory, NAL-THY-57 (1972).
16. L. Landau, in Theoretical Physics in the twentieth century, a memorial volume to Wolfgang Pauli, ed. M. Fierz and V. Weisskopf (Cambridge, USA, 1956).
17. G. 't Hooft, Renormalizable Lagrangians for Massive Yang-Mills Fields, Nucl. Phys. B 35, 167-188 (1971).
18. S. Coleman, Secret Symmetries: An Introduction to Spontaneous Symmetry Breakdown and Gauge Fields: Lecture given at 1973 International Summer School in Physics Ettore Majorana, Erice (Sicily), 1973, Erice Subnucl. Phys., 1973.
19. G. 't Hooft, When was Asymptotic Freedom Discovered? Rehabilitation of Quantum Field Theory, Preprint, hep-th/9808154 (1998).
20. D. Gross, Twenty Years of Asymptotic Freedom, Preprint, hep-th/9809080 (1998).
21. R. Dijkgraaf, Quantum Fields and Strings: A course for mathematicians, vols. I, II (AMS, IAS, 1999).
NEW APPLICATIONS OF THE CHIRAL ANOMALY*

JÜRG FRÖHLICH AND BILL PEDRINI
Theoretical Physics, ETH-Hönggerberg, CH-8093 Zürich, Switzerland
E-mail: [email protected]; pedrini@itp.phys.ethz.ch
We describe consequences of the chiral anomaly in the theory of quantum wires, the (quantum) Hall effect, and of a four-dimensional cousin of the Hall effect. We explain which aspects of conductance quantization are related to the chiral anomaly. The four-dimensional analogue of the Hall effect involves the axion field, whose time derivative can be interpreted as a (space-time dependent) difference of chemical potentials of left-handed and right-handed charged fermions. Our four-dimensional analogue of the Hall effect may play a significant role in explaining the origin of large magnetic fields in the (early) universe.
1 What is the chiral anomaly?
The chiral abelian anomaly was discovered, in the past century, by Adler, Bell and Jackiw, after earlier work on $\pi^0$-decay starting with Steinberger and Schwinger; see e.g. [1] and references given there. It has been rederived in many different ways of varying degree of mathematical rigor by many people. Diverse physical implications, especially in particle physics, have been discussed. It is hard to imagine that one may still be able to find new, interesting implications of the chiral anomaly that specialists have not been aware of for many years. Yet, until very recently (in the past century, but only two to three years ago) this turned out to be possible, and we suspect that further applications may turn up in the future! This little review is devoted to a discussion of physical implications of the chiral anomaly that have been discovered recently.

Before we turn to physics, we recall what is meant by "chiral (abelian) anomaly". In general terms, one speaks of an anomaly if some quantum theory violates a symmetry present at the classical level (i.e., in the limit where $\hbar \to 0$). By "violating a symmetry" one means that it is impossible to construct a unitary representation of the symmetry transformations of the classical system underlying some quantum theory on the Hilbert space of pure state vectors of the quantum theory. (By "violating a dynamical symmetry" is meant that it is impossible to construct such a representation that commutes with the unitary time evolution of the quantum theory.) It is quite clear that understanding anomalies can be viewed as a problem in group cohomology theory. A fundamental example of an anomalous symmetry group is the group of all symplectic transformations of the phase space of a classical Hamiltonian system underlying some quantum theory.
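The remark about group cohomology can be made slightly more concrete. The following is a minimal sketch of one standard way in which cohomology enters, not a formula taken from this review; the symbols $G$, $U(g)$, $\omega$ and $\beta$ are introduced here for illustration only. Since pure states are rays, a symmetry group $G$ may be represented up to phases, and these phases form a 2-cocycle.

```latex
% Projective unitary representation of a symmetry group G (illustrative notation)
U(g)\,U(h) = \omega(g,h)\,U(gh), \qquad \omega(g,h) \in U(1), \quad g,h \in G .
% Associativity of operator multiplication forces the 2-cocycle condition
\omega(g,h)\,\omega(gh,k) = \omega(g,hk)\,\omega(h,k) .
% Redefining U(g) \to \beta(g)\,U(g) changes \omega by a coboundary
\omega(g,h) \;\to\; \omega(g,h)\,\frac{\beta(g)\,\beta(h)}{\beta(gh)} .
```

If no choice of $\beta$ removes $\omega$, its class in $H^2(G, U(1))$ is nontrivial and only a projective representation exists, so that a genuine unitary representation of $G$ on the Hilbert space cannot be constructed.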
* This review is dedicated to the memory of Louis Michel, the theoretician and the friend.
The anomalies considered in this review are ones connected with infinite-dimensional groups of gauge transformations which are symmetries of some classical Lagrangian systems with infinitely many degrees of freedom (Lagrangian field theories). Thus, we consider a theory of charged, massless fermions coupled to an external electromagnetic field in Minkowski space-time of even dimension $2n$. Let $\gamma^0, \gamma^1, \ldots, \gamma^{2n-1}$ denote the usual Dirac matrices, and define
$$\gamma := i\,\gamma^0\gamma^1\cdots\gamma^{2n-1}. \tag{1.1}$$
Then $\gamma$ anti-commutes with the covariant Dirac operator
$$D := i\gamma^\mu(\partial_\mu - iA_\mu), \tag{1.2}$$
where $A$ is the vector potential of the external electromagnetic field. Let $\psi(x)$ denote the Dirac spinor field and $\bar\psi(x)$ the conjugate field. We define the vector current, $J^\mu$, and the axial vector current, $\tilde J^\mu$, by
$$J^\mu := \bar\psi\gamma^\mu\psi, \qquad \tilde J^\mu := \bar\psi\gamma^\mu\gamma\psi. \tag{1.3}$$
At the classical level, these currents are conserved,
$$\partial_\mu J^\mu = 0, \qquad \partial_\mu \tilde J^\mu = 0, \tag{1.4}$$
on solutions of the equations of motion ($D\psi = 0$). The conservation of the vector current is intimately connected with the electromagnetic gauge invariance of the theory,
$$\psi(x) \mapsto e^{i\chi(x)}\psi(x), \qquad \bar\psi(x) \mapsto \bar\psi(x)\,e^{-i\chi(x)}, \qquad A_\mu(x) \mapsto A_\mu(x) + \partial_\mu\chi(x), \tag{1.5}$$
where $\chi(x)$ is a test function on space-time. When $\chi$ is constant in $x$ the transformations (1.5) are a global symmetry of the classical theory corresponding to the conserved quantity
$$Q = \int d\vec{x}\; J^0(x^0,\vec{x}), \tag{1.6}$$
which is the electric charge. The conservation of $Q$ (independence of $x^0$) follows, of course, from the fact that the Noether current $J^\mu$ associated with (1.5) satisfies the continuity equation (1.4). The conservation of the axial vector current $\tilde J^\mu$, in the classical theory, is connected with the invariance of the theory under local chiral rotations
$$\psi(x) \mapsto e^{i\alpha(x)\gamma}\psi(x), \qquad \bar\psi(x) \mapsto \bar\psi(x)\,e^{i\alpha(x)\gamma}, \qquad A_\mu(x) \mapsto A_\mu(x) + \gamma\,\partial_\mu\alpha(x), \tag{1.7}$$
where $\alpha(x)$ is a test function on space-time. In particular, when $\alpha$ is a constant the transformations (1.7) are a global symmetry of the classical theory corresponding to the conserved charge
$$\tilde Q = \int d\vec{x}\; \tilde J^0(x^0,\vec{x}) \tag{1.8}$$
(which, according to (1.4), is independent of $x^0$). It turns out that, in the quantum theory, the local chiral rotations (1.7) do not leave quantum-mechanical transition amplitudes invariant, and the axial vector current $\tilde J^\mu$ is not a conserved current, for arbitrary external electromagnetic fields. This phenomenon is called the chiral (abelian) anomaly.
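The classical conservation law for the axial current rests on the algebraic fact that $\gamma$, as defined in (1.1), anti-commutes with every $\gamma^\mu$ in even dimension. The following sketch checks this for $n = 1$ and $n = 2$ in one explicit representation. The representation (Pauli matrices in two dimensions, the Dirac representation in four) and all function names are assumptions chosen for illustration; they are not taken from the text.

```python
def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def kron(a, b):
    """Kronecker product of two square matrices."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]


def chirality(gammas):
    """gamma := i * gamma^0 gamma^1 ... gamma^{2n-1}, as in (1.1)."""
    prod = gammas[0]
    for g in gammas[1:]:
        prod = matmul(prod, g)
    return [[1j * entry for entry in row] for row in prod]


def max_anticommutator(gammas):
    """Largest absolute entry of gamma*gamma^mu + gamma^mu*gamma over all mu."""
    g = chirality(gammas)
    n = len(g)
    worst = 0.0
    for gm in gammas:
        left, right = matmul(g, gm), matmul(gm, g)
        worst = max(worst, max(abs(left[i][j] + right[i][j])
                               for i in range(n) for j in range(n)))
    return worst


ID2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
I_SY = [[0, 1], [-1, 0]]  # i * sigma_y, squares to -1

# n = 1: gamma^0 = sigma_x, gamma^1 = i sigma_y (metric signature +,-)
GAMMAS_2D = [SX, I_SY]
# n = 2: Dirac representation, gamma^0 = diag(1,1,-1,-1), gamma^j = [[0, s_j], [-s_j, 0]]
GAMMAS_4D = [kron(SZ, ID2)] + [kron(I_SY, s) for s in (SX, SY, SZ)]

if __name__ == "__main__":
    print(max_anticommutator(GAMMAS_2D), max_anticommutator(GAMMAS_4D))
```

Any other representation of the Clifford algebra would serve equally well, since the anti-commutation property is representation independent.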
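Let us see where the chiral anomaly comes from, for theories in two and four space-time dimensions. We start with the discussion of two-dimensional theories. We consider a quantum theory which has a conserved vector current $J^\mu$ and, if the external electromagnetic field vanishes, a conserved axial vector current $\tilde J^\mu$, i.e.,
$$\partial_\mu J^\mu = 0, \qquad \partial_\mu \tilde J^\mu = 0. \tag{1.9}$$
In two space-time dimensions, $J^\mu$ and $\tilde J^\mu$ are related to each other by
$$\tilde J^\mu = \varepsilon^{\mu\nu} J_\nu, \tag{1.10}$$
where $\varepsilon^{00} = \varepsilon^{11} = 0$, $\varepsilon^{01} = -\varepsilon^{10} = 1$. The continuity equation $\partial_\mu J^\mu = 0$ has the general solution
$$J^\mu(x) = \frac{q}{2\pi}\,\varepsilon^{\mu\nu}(\partial_\nu\varphi)(x), \tag{1.11}$$
where $\varphi$ is a scalar field and $q$ is the electric charge. By (1.10), conservation of the axial vector current then requires
$$\Box\,\varphi(x) = 0. \tag{1.12}$$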
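Thus, if the vector- and axial vector currents are conserved then the potential $\varphi$ of the vector current is a massless free field. The theory of the massless free field is an example of a Lagrangian field theory. It has an action functional, $S$, given by
$$S(\varphi) = \frac{1}{4\pi}\int d^2x\,(\partial_\mu\varphi)(x)(\partial^\mu\varphi)(x). \tag{1.13}$$
The "momentum", $\Pi(x)$, canonically conjugate to $\varphi(x)$ is
$$\Pi(x) = \frac{\delta S(\varphi)}{\delta(\partial_t\varphi(x))} = \frac{1}{2\pi}\,\frac{\partial\varphi(x)}{\partial t}, \tag{1.14}$$
where $t = x^0$ denotes time (the "velocity of light" $c = 1$). In the quantum theory, $\varphi$ and $\Pi$ are operator-valued distributions on Fock space satisfying the equal-time canonical commutation relations
$$[\varphi(t,x), \Pi(t,y)] = i\,\delta(x-y). \tag{1.15}$$
Since
$$J^\mu(x) = \frac{q}{2\pi}\,\varepsilon^{\mu\nu}(\partial_\nu\varphi)(x) \quad\text{and}\quad \tilde J^\mu(x) = \varepsilon^{\mu\nu}J_\nu(x) = \frac{q}{2\pi}\,(\partial^\mu\varphi)(x),$$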
eq. (1.15) yields the well known anomalous commutator
$$[J^0(t,x), \tilde J^0(t,y)] = i\,\frac{q^2}{2\pi}\,\delta'(x-y). \tag{1.16}$$
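Next, we imagine that the system is coupled to a classical external electric field $E(x)$. In two space-time dimensions, the electric field is given in terms of the electromagnetic vector potential $A_\mu$ by
$$E(x) = \varepsilon^{\mu\nu}(\partial_\mu A_\nu)(x). \tag{1.17}$$
The action functional for the theory in an external electric field is given by
$$S(\varphi, A) = \frac{1}{4\pi}\int d^2x\,(\partial_\mu\varphi)(x)(\partial^\mu\varphi)(x) + \frac{1}{q}\int d^2x\,J^\mu(x)A_\mu(x) = \frac{1}{4\pi}\int d^2x\,\big\{(\partial_\mu\varphi)(x)(\partial^\mu\varphi)(x) + 2\,\varepsilon^{\mu\nu}(\partial_\nu\varphi)(x)A_\mu(x)\big\}. \tag{1.18}$$
The equation of motion (Euler-Lagrange equation) obtained from the action functional (1.18) is
$$\Box\,\varphi(x) = E(x). \tag{1.19}$$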
Using (1.10) and (1.11), we see that equation (1.19) is equivalent to
$$\partial_\mu \tilde J^\mu(x) = \frac{q}{2\pi}\,E(x), \tag{1.20}$$
i.e., the axial vector current fails to be conserved in a non-vanishing external electric field $E$. Equation (1.20) is the standard expression of the chiral anomaly in two space-time dimensions. From the currents $J^\mu$ and $\tilde J^\mu$ one can construct chiral currents, $J^\mu_\ell$ and $J^\mu_r$, for left-moving and right-moving modes by setting
$$J^\mu_\ell := J^\mu - \tilde J^\mu, \qquad J^\mu_r := J^\mu + \tilde J^\mu. \tag{1.21}$$
They satisfy the equations
$$\partial_\mu J^\mu_{\ell/r}(x) = \mp\frac{q}{2\pi}\,E(x). \tag{1.22}$$
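From eqs. (1.17) and (1.22) we infer that one can define modified chiral currents, $\mathcal{J}^\mu_{\ell/r}$, which are conserved:
$$\mathcal{J}^\mu_{\ell/r}(x) := J^\mu_{\ell/r}(x) \pm \frac{q}{2\pi}\,\varepsilon^{\mu\nu}A_\nu(x). \tag{1.23}$$
Then
$$\partial_\mu \mathcal{J}^\mu_{\ell/r}(x) = 0,$$
but the currents $\mathcal{J}^\mu_{\ell/r}$ fail to be gauge-invariant. Nevertheless the conserved charges,
$$N_\ell := \int dx\,\mathcal{J}^0_\ell(t,x), \qquad N_r := \int dx\,\mathcal{J}^0_r(t,x), \tag{1.24}$$
are gauge-invariant. They count the total electric charge of left-moving and of right-moving modes, respectively, present in a physical state of the system.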
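The distinction just made, between currents that are not gauge-invariant and charges that are, can be illustrated numerically. Under $A_\mu \mapsto A_\mu + \partial_\mu\chi$ the density $\mathcal{J}^0_{\ell/r}$ of (1.23) shifts by $\pm\frac{q}{2\pi}\,\partial_1\chi$, which is a total spatial derivative and integrates to zero for a gauge function that decays at spatial infinity. The sketch below assumes a particular bump-shaped $\chi$ and uses simple finite differences; the function names, the choice of $\chi$ and the numerical parameters are all hypothetical.

```python
import math


def chi(t, x):
    """Hypothetical gauge function: a smooth bump decaying at spatial infinity."""
    return math.exp(-(x - 0.3 * t) ** 2) * math.cos(t)


def d_dx(f, t, x, h=1e-5):
    """Central finite-difference approximation of the spatial derivative."""
    return (f(t, x + h) - f(t, x - h)) / (2 * h)


def density_shift(t, x, q=1.0):
    """Shift of the modified charge density under A -> A + d(chi):
    (q / 2 pi) * eps^{01} * d_1 chi, up to the overall sign (+ for l, - for r)."""
    return q / (2 * math.pi) * d_dx(chi, t, x)


def charge_shift(t, q=1.0, x_max=12.0, n=4000):
    """Trapezoidal integral of density_shift over [-x_max, x_max]."""
    dx = 2 * x_max / n
    total = 0.0
    for k in range(n + 1):
        x = -x_max + k * dx
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * density_shift(t, x, q)
    return total * dx


if __name__ == "__main__":
    print("local shift at (t, x) = (0, 0.5):", density_shift(0.0, 0.5))
    print("shift of the total charge at t = 0:", charge_shift(0.0))
```

The local shift is of order one while the integrated shift vanishes to numerical accuracy, in line with the gauge invariance of $N_\ell$ and $N_r$.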
The anomalous commutators are given by
$$[J^0_{\ell/r}(t,x), J^0_{\ell/r}(t,y)] = \mp\, i\,\frac{q^2}{\pi}\,\delta'(x-y).$$
The left-moving / right-moving charged fields of the theory can be expressed as normal-ordered exponentials of spatial integrals of $J^0_{\ell/r}(x)$, i.e., as vertex operators; they transform correctly under gauge transformations.
This completes our review of the chiral anomaly and of anomalous commutators in two dimensions, and we now turn to four- (or higher-) dimensional systems. We consider charged, massless Dirac fermions described by a Dirac spinor field $\psi(x)$ and its conjugate field $\bar\psi(x) = \psi^*(x)\gamma^0$. We study the effect of coupling these fields to external vector- and axial-vector potentials, $A_\mu$ and $Z_\mu$, respectively. The theory of these fields provides an example of Lagrangian field theory, the action functional being given by
$$S(\bar\psi, \psi; A, Z) := \int d^{2n}x\;\bar\psi(x)\,D_{A,Z}\,\psi(x), \tag{1.25}$$
where the covariant Dirac operator is
$$D_{A,Z} = i\gamma^\mu(\partial_\mu - iA_\mu - iZ_\mu\gamma), \tag{1.26}$$
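with $\gamma$ (= "$\gamma^5$") as in eq. (1.1). The fields $A_\mu$ and $Z_\mu$ are arbitrary external fields (i.e., they are not quantized, for the time being). We define the effective action, $S_{\rm eff}(A,Z)$, by
$$e^{iS_{\rm eff}(A,Z)} := {\rm const}\int \mathcal{D}\bar\psi\,\mathcal{D}\psi\; e^{iS(\bar\psi,\psi;A,Z)}, \tag{1.27}$$
where the constant is chosen such that $S_{\rm eff}(A = 0, Z = 0) = 0$, and $\hbar$ and $c$ have been set to 1. After Wick rotation,
$$t = x^0 \to -ix^0, \qquad A_0 \to iA_0, \qquad Z_0 \to iZ_0, \qquad \gamma^0 \to -i\gamma^0, \tag{1.28}$$
eq. (1.27) reads
$$e^{-S^E_{\rm eff}(A,Z)} = {\rm const}\int \mathcal{D}\bar\psi\,\mathcal{D}\psi\; e^{-S^E(\bar\psi,\psi;A,Z)}, \tag{1.29}$$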
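where the integral on the R.S. is interpreted as a renormalized Gaussian Berezin integral. Thus
$$e^{-S^E_{\rm eff}(A,Z)} = {\det}_{\rm ren}(D_{A,Z}), \tag{1.30}$$
where, after Wick rotation, $D_{A,Z} = i\gamma^\mu(\partial_\mu - iA_\mu - iZ_\mu\gamma)$ is an anti-hermitian elliptic operator, and the subscript "ren" indicates that (for $n \ge 2$) a multiplicative renormalization must be made. The effective action $S^E_{\rm eff}(A,Z)$ is the generating function for the Euclidean Green functions of the vector- and axial vector currents. At non-coinciding arguments, the connected Green functions
$$\langle J^{\mu_1}(x_1)\cdots\tilde J^{\nu_1}(y_1)\cdots\rangle^c_{A,Z} \tag{1.31}$$
are obtained, up to powers of $i$ and $q$, by functional differentiation of $S^E_{\rm eff}(A,Z)$ with respect to $A_{\mu_1}(x_1), \ldots$ and $Z_{\nu_1}(y_1), \ldots$; here $q$ is the electric charge, and $\langle(\cdot)\rangle^c_{A,Z}$ denotes a connected expectation value. We should like to understand how $S^E_{\rm eff}(A,Z)$ changes under the gauge transformations
$$A_\mu \to A_\mu + \partial_\mu\chi, \qquad Z_\mu \to Z_\mu + \partial_\mu\alpha. \tag{1.32}$$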
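Following Fujikawa [2], we perform a phase transformation and a chiral rotation of $\psi$ and $\bar\psi$ under the integral on the R.S. of eq. (1.29). We set
$$\psi'(x) = e^{i(\chi(x)+\alpha(x)\gamma)}\,\psi(x), \qquad \bar\psi'(x) = \bar\psi(x)\,e^{-i(\chi(x)-\alpha(x)\gamma)}. \tag{1.33}$$
Then
$$S^E(\bar\psi', \psi'; A + d\chi, Z + d\alpha) = S^E(\bar\psi, \psi; A, Z), \tag{1.34}$$
where $d\chi$ denotes the gradient, $(\partial_\mu\chi)$, of $\chi$. Next, we must determine the Jacobian, $J$, of the transformation (1.33),
$$\mathcal{D}\bar\psi'\,\mathcal{D}\psi' =: J\,\mathcal{D}\bar\psi\,\mathcal{D}\psi. \tag{1.35}$$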
Obviously, phase transformations, $\psi' = e^{i\chi}\psi$, $\bar\psi' = \bar\psi\,e^{-i\chi}$, have Jacobian $J = 1$. However, this may not be so for chiral rotations. Formally, under chiral rotations, the Jacobian turns out to be
$$J = \exp\big[2i\,{\rm Tr}(\alpha\gamma)\big]. \tag{1.36}$$
The problem with the R.S. of (1.36) is that, a priori, it is ill-defined. Let us assume that non-compact Euclidean space-time is replaced by a $2n$-dimensional sphere. Then $D_{A,Z}$ has discrete spectrum, with eigenvalues $i\lambda_m$ corresponding to eigenspinors $\psi_m(x)$, $m \in \mathbb{Z}$. Formally,
$${\rm Tr}(\alpha\gamma) = \sum_m \int d^{2n}x\;\alpha(x)\,\psi_m^*(x)\,\gamma\,\psi_m(x).$$
We regularize the R.S. by replacing it by
$$\sum_m e^{-(\lambda_m/M)^2}\int d^{2n}x\;\alpha(x)\,\psi_m^*(x)\,\gamma\,\psi_m(x) \tag{1.37}$$
and, afterwards, letting $M \to \infty$. Expression (1.37) is nothing but
$${\rm Tr}\big(\alpha\gamma\,e^{(D_{A,Z}/M)^2}\big). \tag{1.38}$$
From Alvarez-Gaumé's calculations [3] concerning the index theorem, for example, we infer that
$$\lim_{M\to\infty}{\rm Tr}\big(\alpha\gamma\,e^{(D_{A,Z}/M)^2}\big) = -\int d^{2n}x\;\alpha(x)\,\mathcal{A}(x), \tag{1.39}$$
where $\mathcal{A}(x)$ is the index density described more explicitly below. From (1.39) and (1.36) we obtain that
$$J = \exp\Big[-2i\int d^{2n}x\;\alpha(x)\,\mathcal{A}(x)\Big]. \tag{1.40}$$
With (1.34), (1.35) and (1.29), eq. (1.40) yields
$$S^E_{\rm eff}(A + d\chi, Z + d\alpha) = S^E_{\rm eff}(A, Z) - 2i\int d^{2n}x\;\alpha(x)\,\mathcal{A}(x). \tag{1.41}$$
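When combined with (1.31), eq. (1.41) is seen to yield
$$\left[\frac{\delta S^E_{\rm eff}(A + d\chi, 0)}{\delta\chi(x)}\right]_{\chi=0} = -\,\partial_\mu\!\left(\frac{\delta S^E_{\rm eff}(A, 0)}{\delta A_\mu(x)}\right) = \frac{i}{q}\,\partial_\mu\langle J^\mu(x)\rangle_A = 0, \tag{1.42}$$
and
$$\left[\frac{\delta S^E_{\rm eff}(A, Z + d\alpha)}{\delta\alpha(x)}\right]_{Z=\alpha=0} = -\,\partial_\mu\!\left(\left[\frac{\delta S^E_{\rm eff}(A, Z)}{\delta Z_\mu(x)}\right]_{Z=0}\right) = \frac{i}{q}\,\partial_\mu\langle \tilde J^\mu(x)\rangle_A = -2i\,\mathcal{A}(x), \tag{1.43}$$
i.e.,
$$\partial_\mu\langle J^\mu(x)\rangle_A = 0, \qquad \partial_\mu\langle \tilde J^\mu(x)\rangle_A = -2q\,\mathcal{A}(x). \tag{1.44}$$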
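Introducing the chiral currents
$$J^\mu_\ell := J^\mu - \tilde J^\mu, \qquad J^\mu_r := J^\mu + \tilde J^\mu, \tag{1.45}$$
where $J^\mu_{\ell/r}$ is the current of left-handed/right-handed fermions, we see that (1.44) is equivalent to
$$\partial_\mu\langle J^\mu_\ell(x)\rangle_A = 2q\,\mathcal{A}(x), \qquad \partial_\mu\langle J^\mu_r(x)\rangle_A = -2q\,\mathcal{A}(x). \tag{1.46}$$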
Locally, we can solve the equation
$$\delta\omega(x; A) = \mathcal{A}(x), \tag{1.47}$$
where $\delta$, the co-differential, is the dual of exterior differentiation $d$, the solution $\omega(\,\cdot\,; A)$ being a 1-form. The 1-form $\omega(\,\cdot\,; A)$ is, however, not gauge-invariant. We may now define modified currents,
$$\mathcal{J}^\mu_{\ell/r}(x) := J^\mu_{\ell/r}(x) \mp 2q\,\omega^\mu(x; A). \tag{1.48}$$
They are not gauge-invariant, but, according to eqs. (1.47), (1.48), they are conserved, i.e.,
$$\partial_\mu \mathcal{J}^\mu_{\ell/r} = 0. \tag{1.49}$$
Passing to the operator formulation of quantum field theory (i.e., undoing the Wick rotation (1.28), which amounts to Osterwalder-Schrader reconstruction), the conserved currents $\mathcal{J}^\mu_{\ell/r}$ give rise to conserved charges,
$$N_{\ell/r} := \int d\vec{x}\;\mathcal{J}^0_{\ell/r}(t,\vec{x}), \tag{1.50}$$
which (for gauge-transformations continuous at infinity) are gauge-invariant.
We should like to determine the equal-time commutators of the (gauge-invariant) currents $J^0_{\ell/r}(x)$. Let $\mathcal{V}$ denote the affine space of configurations of external electromagnetic vector potentials, $A$, corresponding to static electromagnetic fields. We consider the Hilbert bundle, $\mathcal{H}$, over $\mathcal{V}$ whose fibre, $\mathcal{F}_A$, at a point $A \in \mathcal{V}$ is the Fock space of state vectors of free, chiral (e.g., left-handed) fermions coupled to the vector potential $A$. Then $\mathcal{H}$ carries a projective representation, $U$, of the group $\mathcal{G}$ of time-independent electromagnetic gauge transformations,
$$g = (g^\chi(x)), \qquad g^\chi(x) = e^{i\chi(x)}, \qquad \chi(x) = \chi(\vec{x}) \quad \text{(indep. of } t),$$
with the following properties:
(i) $U(g) : \mathcal{F}_A \to \mathcal{F}_{A+d\chi}$,
and, on the fibre $\mathcal{F}_{A+d\chi}$,
(ii) $U(g)\,\psi(x; A)\big|_{\mathcal{F}_A}\,U(g)^{-1} = e^{i\chi(x)}\,\psi(x; A + d\chi)$,
where $\psi(x; A)$ is the Dirac spinor field acting on $\mathcal{F}_A$ (and similarly for $\bar\psi(x; A)$). The generator, $G(\chi)$, of the gauge transformation $U(g^\chi(\cdot))$ is given by
$$G(\chi) = \int d\vec{x}\;\chi(\vec{x})\,G(\vec{x}),$$
where
$$G(\vec{x}) = -\,i\,\nabla\cdot\frac{\delta}{\delta A(\vec{x})} + \frac{1}{q}\,J^0_\ell(\vec{x}; A). \tag{1.51}$$
Here
$$\nabla\cdot\frac{\delta}{\delta A} = \sum_{j=1}^{2n-1}\partial_j\,\frac{\delta}{\delta A_j}.$$
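Locally, the (phase) factor of the projective representation $U$ of $\mathcal{G}$ can be made trivial by redefining the generators $G$:
$$G(\vec{x}) \to \hat G(\vec{x}) := -\,i\,\nabla\cdot\frac{\delta}{\delta A(\vec{x})} + \frac{1}{q}\,\mathcal{J}^0_\ell(\vec{x}; A). \tag{1.52}$$
The operators $\hat G(\vec{x})$ generate a representation of the group $\mathcal{G}$ of gauge transformations on $\mathcal{H}$ iff
$$[\hat G(t,\vec{x}), \hat G(t,\vec{y})] = 0 \tag{1.53}$$
for all times $t$. That (1.52) is the right choice of generators compatible with (1.53) follows, heuristically, from the fact that $\mathcal{J}^\mu_\ell(x; A)$ is a conserved current. Because the current $J^\mu_\ell(x; A)$ is gauge-invariant, we have that
$$\Big[\nabla\cdot\frac{\delta}{\delta A(\vec{x})},\; J^0_\ell(\vec{y}; A)\Big] = 0. \tag{1.54}$$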
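Thus, using (1.48), (1.54) and (1.53), we find that
$$0 = [\hat G(t,\vec{x}), \hat G(t,\vec{y})] = \frac{1}{q^2}\,[J^0_\ell(t,\vec{x}), J^0_\ell(t,\vec{y})] + 2i\left\{\nabla_{\vec{x}}\cdot\frac{\delta\omega^0(\vec{y}; A)}{\delta A(\vec{x})} - \nabla_{\vec{y}}\cdot\frac{\delta\omega^0(\vec{x}; A)}{\delta A(\vec{y})}\right\}. \tag{1.55}$$
This equation determines the anomalous commutator
$$[J^0_\ell(t,\vec{x}), J^0_\ell(t,\vec{y})] = [\mathcal{J}^0_\ell(t,\vec{x}), \mathcal{J}^0_\ell(t,\vec{y})]. \tag{1.56}$$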
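Of course, our arguments are heuristic, but, hopefully, provide a reasonably clear idea about the origin of anomalous commutators. A more erudite, mathematically clean derivation of (1.55) can be based on an analysis of the cohomology of $\mathcal{G}$; see e.g. [4].
In order to arrive at explicit versions of eqs. (1.46), (1.47) and (1.55) in various even dimensions, we must know the explicit expressions for the index density $\mathcal{A}(x)$ and the one-form $\omega(x; A)$. We shall not have any occasion to consider systems coupled to a non-trivial chiral gauge field $Z$. We therefore set $Z = 0$. Then, in two space-time dimensions,
$$\mathcal{A}(x) = -\frac{1}{4\pi}\,E(x), \tag{1.57}$$
by comparison of (1.44) with (1.20), and, by (1.47) and (1.57),
$$\omega^\mu(x; A) = -\frac{1}{4\pi}\,\varepsilon^{\mu\nu}A_\nu(x), \tag{1.58}$$
see also (1.23) and (1.17). In four space-time dimensions
$$\mathcal{A}(x) = -\frac{1}{16\pi^2}\;{}^*(F\wedge F)(x) = -\frac{1}{32\pi^2}\,F_{\mu\nu}(x)\,\tilde F^{\mu\nu}(x), \qquad \tilde F^{\mu\nu} := \tfrac{1}{2}\,\varepsilon^{\mu\nu\lambda\rho}F_{\lambda\rho}, \tag{1.59}$$
where $\wedge$ denotes the exterior product and $*$ the "Hodge dual". By eq. (1.47),
$$\omega^\mu(x; A) = -\frac{1}{32\pi^2}\,\varepsilon^{\mu\nu\lambda\rho}A_\nu(x)\,F_{\lambda\rho}(x). \tag{1.60}$$
Thus eqs. (1.46) read
$$\partial_\mu\langle J^\mu_{\ell/r}(x)\rangle_A = \mp\frac{q}{8\pi^2}\;{}^*(F\wedge F)(x), \tag{1.61}$$
and, from eqs. (1.55), (1.56) and (1.60), we conclude that
$$[J^0_{\ell/r}(t,\vec{x}), J^0_{\ell/r}(t,\vec{y})] = \pm\, i\,\frac{q^2}{4\pi^2}\,\big(\vec{B}(t,\vec{x})\cdot\nabla\big)\,\delta(\vec{x}-\vec{y}), \tag{1.62}$$
a well known result; see [1].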
The key fact reviewed in this section, from which all other results can be derived, is eq. (1.41), i.e.,
$$S^E_{\rm eff}(A + d\chi, Z + d\alpha) = S^E_{\rm eff}(A, Z) - 2i\int d^{2n}x\;\alpha(x)\,\mathcal{A}(x). \tag{1.63}$$
In order to describe a system in which only the left-handed fermions are charged, while the right-handed fermions are neutral, one may just set
$$A = -Z = a \tag{1.64}$$
in eq. (1.63), where $a$ is the electromagnetic vector potential to which the left-handed fermions are coupled; see (1.25), (1.26) and (1.46). Denoting the effective action of this system by $W_\ell(a)$, we find from (1.63) and (1.64) that
$$W_\ell(a + d\chi) = W_\ell(a) + 2i\int d^{2n}x\;\chi(x)\,\mathcal{A}(x). \tag{1.65}$$
Similarly,
$$W_r(a + d\chi) = W_r(a) - 2i\int d^{2n}x\;\chi(x)\,\mathcal{A}(x), \tag{1.66}$$
for charged right-handed fermions. Eqs. (1.65) and (1.66) show that a theory of massless chiral fermions coupled to an external electromagnetic field is anomalous, in the sense that it fails to be gauge-invariant. But let us imagine that space-time, $M^{2n}$, is the boundary of a $(2n+1)$-dimensional half-space $M$ (i.e., $\partial M$ = physical space-time $\cong \mathbb{R}^{2n}$). Let $A$ denote a smooth $U(1)$-gauge potential on $M$ which is continuous on $\partial M$, with
$$A\big|_{\partial M} = a. \tag{1.67}$$
Let $\omega^{2n+1}(\,\cdot\,; A)$ denote the usual Chern-Simons $(2n+1)$-form on $M$. The Chern-Simons action on $M$ is defined by
$$S_{CS}(A) := i\int_M \omega^{2n+1}(\xi; A), \tag{1.68}$$
where $\xi$ denotes a point in $M$. It should be recalled that
$$\omega^{2n+1}(\,\cdot\,; A + d\chi) = \omega^{2n+1}(\,\cdot\,; A) + d\chi\wedge({}^*\mathcal{A}). \tag{1.69}$$
Since $d({}^*\mathcal{A}) = 0$, $d\chi\wedge({}^*\mathcal{A}) = d(\chi\,({}^*\mathcal{A}))$, and hence, by Stokes' theorem,
$$S_{CS}(A + d\chi) = S_{CS}(A) + i\int_{\partial M}\chi(x)\,({}^*\mathcal{A})(x) = S_{CS}(A) + i\int_{\partial M} d^{2n}x\;\chi(x)\,\mathcal{A}(x). \tag{1.70}$$
It follows that
$$W_{\ell/r}(a) \mp S_{CS}(A) \quad \text{is gauge-invariant.} \tag{1.71}$$
This result has a $(2n+1)$-dimensional interpretation (see [5]): Consider a (parity-violating) theory of massive, charged fermions described by $2^n$-component Dirac spinors on a $(2n+1)$-dimensional space-time $M$ with non-empty boundary $\partial M$. These fermions are minimally coupled to an external electromagnetic vector potential $A$. We impose some anti-selfadjoint spectral boundary conditions on the $(2n+1)$-dimensional, covariant Dirac operator $D_A$. The action of the system is given by
$$S(\bar\psi, \psi; A) := \int_M d^{2n+1}\xi\;\bar\psi(\xi)\,(D_A + m)\,\psi(\xi), \tag{1.72}$$
where $m$ is the bare mass of the fermions. The effective action of the system is defined by
$$e^{-S^E_{\rm eff}(A)} = {\det}_{\rm ren}(D_A + m), \tag{1.73}$$
where the subscript "ren" indicates that renormalization may be necessary to define the R.S. of (1.73). Actually, for $n = 1$, no renormalization is necessary; but, for $n = 2$, e.g. an infinite charge renormalization must be made. It turns out that, for $n = 1$ and $n = 2$ (after renormalization),
$$S^E_{\rm eff}(A) = W_{\ell/r}\big(A|_{\partial M}\big) \mp S_{CS}(A) + O\Big(\frac{1}{m}\Big), \tag{1.74}$$
up to a Maxwell term depending on renormalization conditions, where the correction terms are manifestly gauge-invariant; see [5, 6]. (Whether the R.S. of (1.74) involves $W_\ell$ or $W_r$ depends on the definition of $D_A$.) The physical reason underlying the result claimed in eq. (1.74) is that, in a system of massive fermions described by $2^n$-component Dirac spinors confined to a space-time $M$ with a non-empty, $2n$-dimensional boundary $\partial M$, there are massless, chiral fermionic surface modes propagating along $\partial M$.
This completes our heuristic review of aspects of the chiral abelian anomaly that are relevant for the physical applications to be discussed in subsequent sections. The abelian anomaly is, of course, but a special case of the general theory of anomalies involving also non-abelian, gravitational, global, . . . anomalies. In recent years, this theory has turned out to be important in connection with the theory of branes in string theory and with understanding aspects of M-theory. But, in this review, such applications will not be described.
In Sect. 2, we describe physical systems, important features of which can be understood as consequences of the two-dimensional chiral anomaly: incompressible (quantum) Hall fluids and ballistic wires. In Sect. 3, we describe degrees of freedom in four dimensions which may play an important role in the generation of seeds for cosmic magnetic fields in the very early universe. This will turn out to be connected with the four-dimensional chiral anomaly. In Sect. 4, a brief review of the theory of "transport in thermal equilibrium through gapless modes" developed in [7] is presented. In Sects. 5 and 6, we combine the results of this section with those in Sect. 4 to derive physical implications of the chiral anomaly for the systems introduced in Sects. 2 and 3. Some conclusions and open problems are described in Sect. 7.
2
Quantized conductances
The original motivation for the work described in this review has been to provide simple and conceptually clear explanations of various formulae for quantized conductances, which have been encountered in the analysis of experimental data. Here are some typical examples.
Example 1. Consider a quantum Hall device with, e.g., an annular (Corbino) geometry. Let $V$ denote the voltage drop in the radial direction, between the inner and the outer edge, and let $I_H$ denote the total Hall current in the azimuthal direction. The Hall conductance, $G_H$, is defined by
$$G_H = I_H/V. \tag{2.1}$$
One finds that if the longitudinal resistance vanishes (i.e., if the two-dimensional electron gas in the device is "incompressible") then $G_H$ is a rational multiple of $e^2/h$, i.e.,
$$G_H = \frac{n}{d}\,\frac{e^2}{h}, \qquad n = 0, 1, 2, \ldots, \quad d = 1, 2, 3, \ldots \tag{2.2}$$
In (2.2), $e$ denotes the elementary electric charge and $h$ denotes Planck's constant. Well established Hall fractions, $f_H := n/d$, in the range $0 < f_H \le 1$ are listed in Fig. 1 (see [8]; and [9, 10, 11] for general background).
Example 2. In a ballistic (quantum) wire, i.e., in a pure, very thin wire without back scattering centers, one finds that the conductance $G_W = I/V$ ($I$: current through the wire, $V$: voltage drop between the two ends of the wire) is given by
$$G_W = 2N\,\frac{e^2}{h}, \qquad N = 0, 1, 2, \ldots, \tag{2.3}$$
under suitable experimental conditions ("small" $V$, temperature not "very small", "adiabatic gates"); see [12, 13].
Example 3. In measurements of heat conduction in quantum wires, one finds that the heat current is an integer multiple of a "fundamental" current which depends on the temperatures of the two heat reservoirs at the ends of the wire. If electromagnetic waves are sent through an "adiabatic hole" connecting two half-spaces one approximately finds an "integer quantization" of electromagnetic energy flux.
Our task is to attempt to provide a theoretical explanation of these remarkable experimental discoveries; hopefully one that enables us to predict further related effects.
Conductance quantization is observed in a rather wide temperature range. It appears that it is only found in systems without dissipative processes. When it is observed it is insensitive to small changes in the parameters specifying the system and to details of sample preparation; i.e., it has universality properties. It will turn out that the key feature of systems exhibiting conductance quantization is that they have conserved chiral charges (such conservation laws will only hold approximately, i.e., in slightly idealized systems). Once one has understood this point, the right formulae follow almost automatically, and one arrives at natural generalizations.
Figure 1. Observed Hall fractions $\sigma_H = n_H/d_H$ in the interval $0 < \sigma_H \le 1$. [Plot not reproduced; legible labels: Wigner crystal, Fermi liquid behaviour, carrier freeze-out, domain of attraction of $\sigma_H = 1$.]
In order to give a first indication how the effects described here might be related to the two-dimensional chiral anomaly, we consider Example 1, the quantum Hall effect, in more detail. For readers not familiar with this remarkable effect [14], we summarize some of its key features.
A quantum Hall fluid (QHF) is an interacting electron gas confined to some domain in a two-dimensional plane (an interface between a semiconductor and an insulator, with compensating background charge) subject to a constant magnetic field $\vec{B}^{(0)}$ transversal to the confinement plane. Among experimental control parameters is the filling factor, $\nu$, defined by
$$\nu := \frac{n^{(0)}}{B^{(0)}_\perp/(h/e)},$$
where $n^{(0)}$ is the (constant) electron density, $B^{(0)}_\perp$ is the component of the magnetic field $\vec{B}^{(0)}$ perpendicular to the plane of the fluid, and $h/e$ is the quantum of magnetic flux. The filling factor $\nu$ is dimensionless.
Transport properties of a QHF in an external electric field (of small frequency) are described by the equation
$$\vec{J}(t,\vec{x}) = \begin{pmatrix}\sigma_L & -\sigma_H\\ \sigma_H & \sigma_L\end{pmatrix}\vec{E}(t,\vec{x}), \tag{2.4}$$
where $\vec{x}$ is a point in the sample, $\vec{J}$ is the bulk electric current parallel to the sample plane and $\vec{E}$ is the component of the external electric field parallel to the sample plane. Furthermore, $\sigma_L$ denotes the longitudinal conductivity, and $\sigma_H$ is the transverse, or Hall, conductivity. In two dimensions, conductances and conductivities have the same dimension of [(charge)$^2$/action], and it is not difficult to see that
$$G_H = \sigma_H. \tag{2.5}$$
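To attach numbers to formulae (2.2), (2.3) and to the filling factor, the following sketch evaluates the conductance quantum $e^2/h$ in SI units. It assumes the exact SI values of $e$ and $h$ and takes the flux quantum to be $h/e$; the function names and the sample density and field are hypothetical.

```python
from fractions import Fraction

E_CHARGE = 1.602176634e-19   # elementary charge in coulomb (exact SI value)
PLANCK_H = 6.62607015e-34    # Planck constant in joule seconds (exact SI value)
CONDUCTANCE_QUANTUM = E_CHARGE ** 2 / PLANCK_H  # e^2/h in siemens


def hall_conductance(n, d):
    """G_H = (n/d) e^2/h, as in (2.2)."""
    return float(Fraction(n, d)) * CONDUCTANCE_QUANTUM


def wire_conductance(channels):
    """G_W = 2N e^2/h, as in (2.3)."""
    return 2 * channels * CONDUCTANCE_QUANTUM


def filling_factor(density, b_perp):
    """nu = n / (B / (h/e)), with the density in m^-2 and the field in tesla."""
    return density * PLANCK_H / (E_CHARGE * b_perp)


if __name__ == "__main__":
    print("e^2/h in siemens:", CONDUCTANCE_QUANTUM)
    print("h/e^2 in ohm:", 1 / CONDUCTANCE_QUANTUM)
    print("G_H at fraction 1/3:", hall_conductance(1, 3))
    print("G_W for N = 1:", wire_conductance(1))
    print("nu for n = 1e15 m^-2, B = 4.1357 T:", filling_factor(1e15, 4.1357))
```

The inverse of the conductance quantum is the resistance scale on which Hall plateaux are measured in practice.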
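Experimentally, one observes that the longitudinal conductivity, $\sigma_L$, vanishes whenever the Hall conductivity $\sigma_H$ takes one of the quantized values (2.2). We describe the external electromagnetic field by the field tensor
$$F = (F_{\mu\nu}) = \begin{pmatrix} 0 & E_1 & E_2\\ -E_1 & 0 & B\\ -E_2 & -B & 0\end{pmatrix}, \tag{2.6}$$
where $E_1$ and $E_2$ are the components of an external electric field in the plane of the sample, and $B$ is the component of an external magnetic field, $\vec{B}$, perturbing the constant field $\vec{B}^{(0)}$ perpendicular to the sample plane ($\vec{B}_{\rm total} = \vec{B}^{(0)} + \vec{B}$). We define $J^0(x)$ to denote the sum of the electron charge density in the space-time point $x = (t,\vec{x})$ and the uniform background charge density $e\,n^{(0)}$. We set $J^\mu := (J^0, \vec{J})$. From the three-dimensional homogeneous Maxwell equations (Faraday's law),
$$\partial_\mu F_{\nu\lambda} + \partial_\nu F_{\lambda\mu} + \partial_\lambda F_{\mu\nu} = 0, \tag{2.7}$$
the continuity equation for the electric current density (conservation of electric charge),
$$\partial_\mu J^\mu = 0, \tag{2.8}$$
and from the transport equation (2.4) with $\sigma_L = 0$, it follows [8] that
$$J^0 = \sigma_H B. \tag{2.9}$$
Equations (2.4), for $\sigma_L = 0$, and (2.9) can be combined to the equation
$$J^\mu(\xi) = \frac{\sigma_H}{2}\,\varepsilon^{\mu\nu\lambda}F_{\nu\lambda}(\xi) \tag{2.10}$$
of Chern-Simons electrodynamics [5]. Eqs. (2.10) describe the response of an incompressible QHF to an external electromagnetic field (perturbing the constant magnetic field $\vec{B}^{(0)}$).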
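The claim that the covariant equation (2.10) packages the Hall law (2.4) at $\sigma_L = 0$ together with (2.9) can be checked component by component. The sketch below does so for sample field values. It assumes the index placement $F_{01} = E_1$, $F_{02} = E_2$, $F_{12} = B$, the convention $\varepsilon^{012} = +1$ and a factor $\sigma_H/2$ in (2.10); these conventions are reconstructed here for illustration, and other sign conventions would permute the signs in the output.

```python
from itertools import permutations


def levi_civita3():
    """Map each permutation of (0, 1, 2) to its sign, with eps^{012} = +1."""
    signs = {}
    for p in permutations(range(3)):
        inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])
        signs[p] = -1 if inversions % 2 else 1
    return signs


EPS3 = levi_civita3()


def field_tensor(e1, e2, b):
    """Antisymmetric F_{mu nu} with F_01 = E_1, F_02 = E_2, F_12 = B."""
    return [[0.0, e1, e2], [-e1, 0.0, b], [-e2, -b, 0.0]]


def chern_simons_current(sigma_h, e1, e2, b):
    """J^mu = (sigma_H / 2) eps^{mu nu lambda} F_{nu lambda}."""
    F = field_tensor(e1, e2, b)
    J = [0.0, 0.0, 0.0]
    for (mu, nu, la), sign in EPS3.items():
        J[mu] += 0.5 * sigma_h * sign * F[nu][la]
    return J


def hall_response(sigma_h, e1, e2, b):
    """Charge density sigma_H * B and the dissipationless Hall current
    (J^1, J^2) = sigma_H * (-E_2, E_1)."""
    return [sigma_h * b, -sigma_h * e2, sigma_h * e1]


if __name__ == "__main__":
    print(chern_simons_current(2.0, 3.0, 5.0, 7.0))
    print(hall_response(2.0, 3.0, 5.0, 7.0))
```

Both functions return the same triple, so the single equation for $J^\mu$ carries exactly the information of the Hall law and of the relation between charge density and magnetic field.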
Unfortunately, eqs. (2.10) are compatible with the continuity equation (2.8) for $J^\mu$ only if $\sigma_H$ is constant throughout space-time. But realistic samples have a finite extension. The finite extension of the sample, confined to a space-time region $\Omega = D\times\mathbb{R}$, where $D$ is e.g. a disk or an annulus, is taken into account by setting the Hall conductivity $\sigma_H(\xi)$ to zero outside $\Omega$, i.e.,
$$\sigma_H(\xi) = \sigma_H\,\chi_\Omega(\xi), \tag{2.11}$$
for $\xi\in\mathbb{M}^3$, where $\sigma_H$ is the (constant) value of the Hall conductivity inside the sample, and $\chi_\Omega$ is the characteristic function of $\Omega$. Taking the divergence of eq. (2.10), we get that
$$\partial_\mu J^\mu = \frac{\sigma_H}{2}\,\varepsilon^{\mu\nu\lambda}(\partial_\mu\chi_\Omega)\,F_{\nu\lambda}, \tag{2.12}$$
i.e., $\partial_\mu J^\mu$ fails to vanish on the boundary, $\partial D$, of the sample. However, conservation of electric charge is a fundamental law of nature for closed systems. Thus, there must be an electric current, $J^\mu_{\partial\Omega}$, localized on the boundary $\partial\Omega$ of the sample space-time such that the total electric current
$$J^\mu_{\rm total} = J^\mu + J^\mu_{\partial\Omega} \tag{2.13}$$
satisfies the continuity equation. The boundary current $J^\mu_{\partial\Omega}$ must be tangential to the boundary $\partial\Omega$ of the sample space-time. Hence it determines a current density, $I^\alpha$, on the (1+1)-dimensional space-time $\partial\Omega$, where the index $\alpha$ refers to a choice of coordinates on $\partial\Omega$. Eq. (2.12) and the continuity equation for $J^\mu_{\rm total}$ then imply that
$$\partial_\alpha I^\alpha = -\frac{\sigma_H}{2}\,\varepsilon^{\alpha\beta}F_{\alpha\beta}. \tag{2.14}$$
This equation identifies $I^\alpha$ as an anomalous current. Thus, there must be chiral modes (left-movers or right-movers, depending on the orientation of $\Omega$ and the direction of the external magnetic field) propagating along the boundary. They carry the well known diamagnetic edge currents. If $J^\alpha_\ell$ (or $J^\alpha_r$) denotes the corresponding quantum-mechanical current operator then the edge current $I^\alpha$ is given by the quantum-mechanical expectation value, $\langle J^\alpha_{\ell/r}\rangle_A$, of $J^\alpha_\ell$ (or $J^\alpha_r$). The currents $J^\alpha_{\ell/r}$ have the anomalous commutators
$$[J^0_{\ell/r}(t,x), J^0_{\ell/r}(t,y)] = \mp\, i\,\sigma_H\,\delta'(x-y), \tag{2.15}$$
see eqs. (1.21) and (1.16), and hence generate a chiral $u(1)$-current algebra with central charge given by $\sigma_H$.
We now return to the physics of the bulk of an incompressible QHF. The absence of dissipation ($\sigma_L = 0$) in the transport of electric charge through the bulk can be explained by the existence of a mobility gap in the energy spectrum between the ground state energy of the QHF and the energies of extended, excited bulk states. This property motivates the term "incompressible": It is not possible to add an additional electron to, or subtract one from the fluid by injecting only an arbitrarily small amount of energy. An important consequence of incompressibility is that the total electric charge is a good quantum number to label different sectors of physical states of an incompressible QHF (at zero temperature).
We propose to study the bulk physics of incompressible QHF's in the scaling limit, in order to describe the universal transport laws of such fluids. For this purpose, we consider a QHF confined to a sample of diameter $\propto\theta$, where $\theta$ is a dimensionless scale factor. The scaling limit is the limit where $\theta\to\infty$, with distances and time rescaled by a factor $\theta^{-1}$. In rescaled coordinates, the fluid is thus confined to a sample of constant finite diameter. The presence of a positive mobility gap in the system implies that, in the scaling limit, the effective theory describing an incompressible QHF must be a "topological field theory". The states of a topological field theory are indexed by static, pointlike sources localized in the bulk and labelled by certain charge quantum numbers which generate a fusion ring; see [8, 11].
It is not difficult [10] to find the effective action, $S_{\rm eff}(A)$, in the scaling limit, where $A$ is the electromagnetic vector potential of the external electromagnetic field $F_{\mu\nu}$, see eq. (2.6). A possible starting point is eq. (2.10), relating the expectation value of the electric current to the external electromagnetic field:
$$J^\mu(\xi) = \frac{\delta S_{\rm eff}(A)}{\delta A_\mu(\xi)} = \frac{\sigma_H}{2}\,\varepsilon^{\mu\nu\lambda}F_{\nu\lambda}(\xi). \tag{2.16}$$
The solution of eq. (2.16) is
$$S_{\rm eff}(A) = \frac{\sigma_H}{2}\int_\Omega d^3\xi\;\varepsilon^{\mu\nu\lambda}A_\mu(\xi)\,\partial_\nu A_\lambda(\xi), \tag{2.17}$$
i.e., $S_{\rm eff}$ is proportional to the Chern-Simons action $S_{CS}$. The Chern-Simons action is not invariant under gauge transformations of $A$ that do not vanish on the boundary $\partial\Omega$ of the sample. Since electromagnetic gauge invariance is a fundamental property of quantum-mechanical systems, eq. (2.17) for $S_{\rm eff}(A)$ must be corrected by a boundary term. Let $a$ denote the restriction of $A$ to the boundary $\partial\Omega$ of the sample. Then, as pointed out in eq. (1.71), the expression $W_{\ell/r}(a)\mp S_{CS}(A)$ is gauge-invariant, where $W_{\ell/r}(a)$ is the effective action of charged chiral modes propagating along $\partial\Omega$. Thus, in the scaling limit,
$$S_{\rm eff}(A) = \sigma_H\,\big[W_{\ell/r}(a) \mp S_{CS}(A)\big], \tag{2.18}$$
(the choice of $\ell$ or $r$ depending on the orientation of $\Omega$ and the direction of the external magnetic field, as noted above).
3
Branes, axions and charged fermions
The very early universe is filled with a hot plasma of charged leptons, quarks, gluons, photons, . . . . At a time after the big bang when the temperature $T$ is of the order of 80 TeV chirality flips of light charged leptons, in particular of right-handed electrons, constitute a dynamical process slower than the expansion rate of the universe. Thus, for $T > 80$ TeV, the chiral charges, $N_\ell$ and $N_r$, defined in eq. (1.50) of Sect. 1, are approximately conserved for electrons. They are related to an approximate chiral symmetry of the electronic sector of the standard model.
Among other results, we shall attempt to show that if, in the very early universe, the chemical potentials of left-handed and right-handed electrons are different from each other, this may give rise to the generation of large, cosmic magnetic fields [15] (see also [7] for a similar, independent suggestion). This effect is, in a sense explained in Sects. 4 and 6, an effect in equilibrium statistical mechanics. However, this is precisely what may make it appear quite unnatural and implausible: The chiral charges, $N_\ell$ and $N_r$, are not really conserved; leptons are massive. The very early universe is not really in an equilibrium state, and the chemical potentials of left-handed and right-handed electrons neither have an unambiguous meaning, nor would they be space- and time-independent. It may then be wrong, or, at least, misleading, to invoke results from equilibrium statistical mechanics to explore effects in the physics of the very early universe.
A way out from these difficulties can be found by seeking inspiration from an analogy with the quantum Hall effect: Consider a quantum Hall fluid (QHF), confined to a strip of macroscopic width $\ell$ in the plane. If the QHF is incompressible then there are no light (gapless) modes propagating through the bulk of the sample; but, as shown in the last section, there are gapless, chiral modes propagating along the boundaries of the sample. Let $\Omega$ denote the space-time of the fluid; it is a slab of width $\ell$ in three-dimensional Minkowski space. The two components of the boundary, $\partial\Omega$, of $\Omega$ are denoted by $\partial_+\Omega$, $\partial_-\Omega$, respectively. As shown in the last section, eq.
(2.18) (see also [10] for more details), the effective action of such an incompressible QHF (in the scaling limit) is given by
$$S_{\rm eff}(A) = \sigma_H\,\big[W_\ell(a_+) + W_r(a_-) - S_{CS}(A)\big], \tag{3.1}$$
(if the direction of the external magnetic field $\vec{B}^{(0)}$ is chosen appropriately, given an orientation of $\Omega$). In (3.1), $A$ is an external electromagnetic vector potential on $\Omega$, and
$$a_\pm := A\big|_{\partial_\pm\Omega} \tag{3.2}$$
is the restriction of the 1-form $A$ to a component, $\partial_\pm\Omega$, of the boundary of $\Omega$; $W_{\ell/r}(\cdot)$ is the two-dimensional, anomalous effective action for charged, chiral (left-moving, or right-moving, respectively) surface modes propagating along $\partial_+\Omega$, $\partial_-\Omega$, respectively; and $S_{CS}(\cdot)$ is the three-dimensional topological Chern-Simons action, see (2.17). Many universal features of the quantum Hall effect can be derived directly from eq. (3.1).
Suppose, in analogy to what we have just discussed, that the world, as known to us, is a movie showing the dynamics of light modes propagating along two parallel 3-branes in a five-dimensional space-time, $M$. More precisely, we imagine that $M$ is a slab of width $\ell$ in five-dimensional space-time, $\mathbb{M}^5$, the two components, $\partial_+M$ and $\partial_-M$, of the boundary of $M$ being identified with the two parallel 3-branes. Let us imagine that, through the five-dimensional bulk $M$ of the system, a massive, charged, four-component spinor field $\psi$ propagates. We consider the response of this system to coupling the charged fermions described by $\psi$ to a five-dimensional, external electromagnetic vector potential, $\hat A$. By $A_\pm$ we denote the four-dimensional vector potentials on $\partial_\pm M$ obtained by restricting $\hat A$ to $\partial_\pm M$. As discussed at the end of Sect. 1, there are chiral, left-handed or right-handed, charged, fermionic surface modes propagating along $\partial_+M$, $\partial_-M$, which are coupled to $A_+$, $A_-$, respectively; see [6]. In eq. (1.74), the effective action of this system has been reported. It is given by
$$S^E_{\rm eff}(\hat A) = W_\ell(A_+) + W_r(A_-) - S_{CS}(\hat A) + (4\ell e^2)^{-1}\int_M d^5\xi\;\hat F_{MN}(\xi)\,\hat F^{MN}(\xi) + \cdots, \tag{3.3}$$
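where the dots stand for terms $\sim O(1/m)$, and the renormalization conditions have been chosen in such a way that the constant $e^2$ in front of the five-dimensional Maxwell term is the four-dimensional fine-structure constant. The components, $\hat A_M$, of $\hat A$ are denoted by
$$\hat A_\mu =: A_\mu, \quad \mu = 0, 1, 2, 3, \qquad \hat A_4 =: \varphi. \tag{3.4}$$
We require that the restrictions of $\hat A$ to the two branes coincide,
$$\hat A_\mu(x, x^4 = \ell) = \hat A_\mu(x, x^4 = 0) = A_\mu(x), \quad\text{i.e.,}\quad A_+ = A_- = A. \tag{3.5}$$
This requirement is met if we assume that
$$\hat A(x, x^4) \ \text{is independent of}\ x^4. \tag{3.6}$$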
In this case,
Scstf) = £sJNr{FAAFA)
=
^J/4x
*{X) £^X"F^{X) Fx>(x)
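where $N = \mathbb{R}^4$ is a slice through $M$ parallel to $\partial_\pm M$, $\mu,\nu = 0,1,2,3$, and $F_A = (F_{\mu\nu})$ is the four-dimensional field tensor (the trivial integration over $x^4$ has produced the factor $\ell$). Furthermore, the Maxwell term on the R.S. of (3.3) reduces to
$$ \frac{1}{4e^2}\left\{ \int_N d^4x\, F_{\mu\nu}(x)F^{\mu\nu}(x) + 2\int_N d^4x\, (\partial_\mu\varphi)(x)(\partial^\mu\varphi)(x) \right\} . \tag{3.8} $$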
Finally,
$$ W_\ell(A_+ = A) + W_r(A_- = A) = S_{\rm eff}(A) , \tag{3.9} $$
with $S_{\rm eff}(A) = S_{\rm eff}(A, Z = 0)$ as in eqs. (1.29), (1.30). Thus, the complete effective action of the system is given by
S& (V; A) = S&(A) + ^
J^
+ 4^2 {jN#*F\[x)+2JNd*x{Vvf(x)}
.
(3.10)
Clearly, there is something quite unnatural about this approach, namely conditions (3.5) and (3.6)! If $A_+$ were different from $A_-$ then the fermionic effective action $S_{\rm eff}(A) = S_{\rm eff}(A, Z = 0)$ would be replaced by $S_{\rm eff}(A, Z)$, where $A = \tfrac{1}{2}(A_+ + A_-)$ and $Z = \tfrac{1}{2}(-A_+ + A_-)$. Thus the surface modes would not only couple to the electromagnetic field, but also to a chiral gauge field $Z$ for which there is no experimental evidence, and the gauge fields would sample a five-dimensional space-time.

These unnatural features can be avoided by following Connes' formulation of gauge theories with fermions [16]. Then the effective action displayed in eq. (3.10) can be reproduced as follows: One sets $M = N \times \mathbb{Z}_2$, $N = \mathbb{R}^4$, and treats the discrete "fifth dimension", $\mathbb{Z}_2$, by using elementary tools from non-commutative geometry [16]. By adding a "non-commutative", five-dimensional Chern-Simons action, as constructed in [17], to Connes' version of the Yang-Mills action (for a U(1)-gauge field) and to the standard fermionic effective action, one can reproduce actions like the one in eq. (3.10); see [17]. There is no room, here, to review the details of these constructions.

In analogy to what we have discussed above, one may argue that string theories arise as effective theories of surface modes propagating along 9-branes in an "eleven-dimensional" space-time, starting from eleven-dimensional M-theory (with anomalies of the surface theories cancelled by certain eleven-dimensional Chern-Simons actions). One realization of this idea appears in [18]. But we shall not pursue these ideas any further in this review.

Instead, we ask whether the effective action in (3.10) ought to look familiar to people holding a conventional point of view that physical space-time is four-dimensional. The answer is "yes"! The scalar field $\varphi$ appearing in the effective action on the R.S. of (3.10) can be interpreted as the axion.
The axion field was originally introduced by Peccei and Quinn [19] to solve the strong CP problem. There are various reasons, including, primarily, experimental ones, to feel unhappy about introducing an axion into the standard model. But there is also a good reason to do so: String theory predicts the existence of an axion, the "model-independent axion" first described by Witten [20].

The argument in favor of the model-independent axion goes as follows: String theory tells us that there must exist a second-rank antisymmetric tensor field, i.e., a two-form, $B_{\mu\nu}$. The gauge-invariant field strength, $H$, a three-form, corresponding to $B$ is given by
$$ H = dB - \omega_{3YM} + \omega_{3G} , \tag{3.11} $$
where $d$ denotes exterior differentiation, and $\omega_{3YM}$ and $\omega_{3G}$ are the gauge-field ("Yang-Mills") and gravitational (Lorentz) Chern-Simons three-forms. (The coefficients in front of these Chern-Simons forms are proportional to the number, $N_f$, of species of fermions coupled to the gauge- and gravitational fields. In the following we shall set $N_f = 1$.) The field strength $H$ is invariant under the gauge transformations $B \to B + d\lambda$, where $\lambda$ is an arbitrary one-form, and under gauge- and local Lorentz transformations accompanied by shifts of $B$. The equation of motion of $H$ is
$$ \partial^\mu H_{\mu\nu\lambda} = 0 , \tag{3.12} $$
or $\delta H = 0$, where $\delta$ is the co-differential. We consider the components of $B_{\mu\nu}$ with $\mu, \nu = 0,\dots,3$ and assume that $B$ is independent of coordinates of internal dimensions (of the string theory target). Then, in four-dimensional (non-compact) space-time, the three-form $H$ is dual to a one-form, $Z$, and the equation of motion (3.12) becomes
$$ \partial_\mu Z_\nu - \partial_\nu Z_\mu = 0 , \quad \text{or} \quad dZ = 0 . \tag{3.13} $$
By Poincaré's lemma,
$$ Z_\mu = \partial_\mu a , \quad \text{or} \quad Z = da , \tag{3.14} $$
where a is a scalar field. By (3.11), the scaling dimension of a is two. Introducing a constant, £, with the dimension of length, we set 15 (33-15 )
«« = = ^ JJ2 P P ..
where $\varphi$ has scaling dimension 1 ($e^2$ is the fine-structure constant). From $d^2 = 0$ and (3.11) we obtain the equation
$$ dH(x) = {*\mathcal{A}}(x) + \text{const.}\ \mathrm{tr}\,(R(x) \wedge R(x)) , \tag{3.16} $$
where
$$ {*\mathcal{A}}(x) = -\frac{1}{32\pi^2}\,(F(x) \wedge F(x)) \tag{3.17} $$
is the index density, see eq. (1.59) ($*$ denotes the Hodge dual), and $R(x)$ is the Riemann curvature tensor. Assuming that space-time is flat, hence $R = 0$, and considering the special case where the electromagnetic field is the only gauge field in the system, we obtain
$$ dH = -\frac{1}{32\pi^2}\,(F_A \wedge F_A) . \tag{3.18} $$
Recalling that $H$ is dual to $Z = da$,
see (3.13)—(3.15), we find that (3.18) yields the following equation of motion for ip: D
^ = - I T T *{FAAFA)
(3.19)
.
62 IT
This equation is the Euler-Lagrange equation corresponding to the action functional 2 ^Jd"x{V
, ,
(3.20)
which reproduces the R.S. of (3.10), up to the fermionic effective action and the Maxwell term! The second term in (3.20) can be understood as arising from coupling fermions to the axion. The term in the bare action of the fermions describing their coupling to the axion is given by
e
-Jd4xHfll/Xh'iYlXrP
= ljdixdll
i'litl>,
(3.21)
where 7 = 7 5 - Carrying out the Berezin integral over the fermionic degrees of freedom — see eq. (1.29) — we find an effective action for the fermions given by S& (A,Z
= '- d^j
= Sfff (A,2 = 0 ) - uJdAx = S$,(A)
- ^ | ^ A F , ) ,
(3.22)
in accordance with (3.20). The first equation in (3.22) is eq. (1.41), the second follows from (1.59). Thus, coupling charged Dirac fermions to an external electromagnetic vector potential A and an axion

4 Transport in thermal equilibrium through gapless modes
In this section we prepare the ground for a theoretical explanation of effects such as the ones described in Sects. 2 (Examples 1 through 3) and 3. We consider a quantum-mechanical system $S$ whose dynamics is determined by a Hamiltonian $H$, which is a selfadjoint operator on the Hilbert space $\mathcal{H}$ of pure state vectors of $S$ with discrete energy spectrum. It is assumed that the system obeys conservation laws described by some conserved "charges" $N_1,\dots,N_L$ commuting with all observables of the system. Hence
$$ [H, N_\ell] = 0 , \qquad [N_\ell, N_k] = 0 , \qquad \ell, k = 1,\dots,L , \tag{4.1} $$
(e.g. in the sense that the spectral projections of $H$ and of $N_\ell$, $N_k$ commute with one another, for all $k$ and $\ell$).

The system $S$ is coupled to $L$ reservoirs, $\mathcal{R}_1,\dots,\mathcal{R}_L$, with the property that the expectation value of the conserved charge $N_\ell$ in a stationary state of $S$ can be tuned to some fixed value through exchange of "quasi-particles" between $S$ and $\mathcal{R}_\ell$, i.e., through a current between $S$ and $\mathcal{R}_\ell$ that carries "$N_\ell$-charge", for all $\ell = 1,\dots,L$. We are interested in describing a thermal equilibrium state of $S$ coupled to $\mathcal{R}_1,\dots,\mathcal{R}_L$, at a temperature $T = (k_B\beta)^{-1}$. According to Gibbs, we should work in the grand-canonical ensemble. The reservoirs $\mathcal{R}_1,\dots,\mathcal{R}_L$ then enter the description of the thermal equilibrium of $S$ only through their chemical potentials $\mu_1,\dots,\mu_L$. The chemical potential $\mu_\ell$ is a thermodynamic parameter canonically conjugate to the charge $N_\ell$; in particular, the dimension of $\mu_\ell \cdot N_\ell$ is that of an energy. According to Landau and von Neumann, the thermal equilibrium state of $S$ at temperature $(k_B\beta)^{-1}$ in the grand-canonical ensemble, with fixed values of $\mu_1,\dots,\mu_L$, is given by the density matrix
$$ \rho_{\beta,\underline\mu} = \Xi_{\beta,\underline\mu}^{-1}\, \exp\Big[-\beta\Big(H - \sum_{\ell=1}^{L}\mu_\ell N_\ell\Big)\Big] , \tag{4.2} $$
where the grand partition function $\Xi_{\beta,\underline\mu}$ is determined by the requirement that
$$ \mathrm{Tr}\,\rho_{\beta,\underline\mu} = 1 . \tag{4.3} $$
(It is assumed here that $\exp[-\beta(H - \sum_\ell \mu_\ell N_\ell)]$ is a trace-class operator on $\mathcal{H}$, for all $\beta > 0$; we are studying a system in a compact region of physical space.) The equilibrium expectation of a bounded operator, $a$, on $\mathcal{H}$ is defined by
$$ \langle a\rangle_{\beta,\underline\mu} := \mathrm{Tr}\,(\rho_{\beta,\underline\mu}\, a) . \tag{4.4} $$
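The definitions (4.2)-(4.4) can be made concrete in a small toy model. The following is only a sketch under stated assumptions: a four-level system whose Hamiltonian and single charge are diagonal in the same basis, with made-up energies, charges, $\beta$ and $\mu$; the function names are hypothetical and are not taken from the text.

```python
import numpy as np

def grand_canonical_state(energies, charges, beta, mu):
    """Diagonal density matrix exp[-beta (H - mu N)] / Xi for commuting H and N.

    Toy assumption: H and N are diagonal in the same basis, with one charge only.
    """
    e = np.asarray(energies, dtype=float)
    n = np.asarray(charges, dtype=float)
    weights = np.exp(-beta * (e - mu * n))
    return np.diag(weights / weights.sum())  # division by Xi enforces Tr rho = 1

def expectation(rho, a):
    """Equilibrium expectation <a> = Tr(rho a)."""
    return np.trace(rho @ a)

# Illustrative numbers only (not from the text).
energies = [0.0, 1.0, 1.0, 2.5]
charges = [0.0, 1.0, -1.0, 0.0]
rho = grand_canonical_state(energies, charges, beta=2.0, mu=0.3)
N = np.diag(charges)
mean_charge = expectation(rho, N)
```

With a positive chemical potential the state of charge $+1$ is favoured over the state of charge $-1$ at equal energy, so the mean charge comes out positive; this is the sense in which $\mu_\ell$ tunes the expectation value of $N_\ell$.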
Let $J(x) = (J^0(x), \underline J(x))$ be a conserved quantum-mechanical current density of $S$, where $x = (\underline x, t)$, $t$ is time and $\underline x$ is a point of physical space contained inside $S$. We are interested in calculating the expectation values of products of components of $J$ in the state $\rho_{\beta,\underline\mu}$; in particular, we should like to calculate $\langle \underline J(x)\rangle_{\beta,\underline\mu}$. Of course, if the dimension of space is larger than one, $\langle \underline J(x)\rangle_{\beta,\underline\mu}$ vanishes unless rotation invariance is broken by some external field. If $\underline J(x)$ is a vector current then $\langle \underline J(x)\rangle_{\beta,\underline\mu}$ vanishes unless invariance of the state $\rho_{\beta,\underline\mu}$ under space-reflection and time reversal is broken. This happens if some of the charges $N_1,\dots,N_L$ are not invariant under space-reflection and time reversal, i.e., if they are chiral. To say that $J$ is conserved means that it satisfies the continuity equation
$$ \partial_\mu J^\mu = 0 , \tag{4.5} $$
where $x^0 = t$ denotes time, and $\partial_\mu = \partial/\partial x^\mu$. If the space-time of the system $S$ is topologically trivial ("star-shaped") then eq. (4.5) implies that there is a globally defined vector field $\underline\varphi(x)$ such that
$$ J^0(x) = \frac{q}{2\pi}\,\mathrm{div}\,\underline\varphi(x) , \qquad \underline J(x) = -\frac{q}{2\pi}\,\frac{\partial}{\partial t}\,\underline\varphi(x) , \tag{4.6} $$
with $q$ the electric charge.
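Let us suppose that
$$ \frac{\partial}{\partial t}\,\underline\varphi(x) = \frac{i}{\hbar}\,[H, \underline\varphi(x)] . \tag{4.7} $$
[Technically, we are treading on somewhat slippery ground here; but we shall proceed formally, in order to explain the key ideas on a few pages.] From (4.6) and (4.7) we derive that
$$ \langle \underline J(x)\rangle_{\beta,\underline\mu} = -\frac{iq}{h}\,\langle [H, \underline\varphi(x)]\rangle_{\beta,\underline\mu} . \tag{4.8} $$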
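Formally, the R.S. of (4.8) vanishes, because $\langle(\cdot)\rangle_{\beta,\underline\mu}$ is a time-translation invariant state. However, the field $\underline\varphi$ turns out to have ill-defined zero-modes, and it is not legitimate to pretend that $\langle [H, \underline\varphi(x)]\rangle_{\beta,\underline\mu}$ vanishes. Note that
$$ [H, \underline\varphi(x)] = \Big[H - \sum_{\ell=1}^{L}\mu_\ell N_\ell ,\ \underline\varphi(x)\Big] + \sum_{\ell=1}^{L}\mu_\ell\,[N_\ell, \underline\varphi(x)] , \tag{4.9} $$
and that the expectation value
$$ \Big\langle \Big[H - \sum_{\ell=1}^{L}\mu_\ell N_\ell ,\ \underline\varphi(x)\Big] \Big\rangle_{\beta,\underline\mu} $$
vanishes. This can be seen by replacing the Hamiltonian $H$ by a regularized Hamiltonian $H^{(\varepsilon)}$ generating a dynamics that eliminates the zero-modes of $\underline\varphi$. One replaces the state $\rho_{\beta,\underline\mu}$ by a regularized state $\rho^{(\varepsilon)}_{\beta,\underline\mu}$ proportional to $\exp[-\beta(H^{(\varepsilon)} - \sum_\ell \mu_\ell N_\ell)]$, and we set $\langle a\rangle^{(\varepsilon)}_{\beta,\underline\mu} = \mathrm{tr}\,(\rho^{(\varepsilon)}_{\beta,\underline\mu}\, a)$ for any bounded operator $a$ on $\mathcal{H}$. Then
$$ \langle \underline J(x)\rangle_{\beta,\underline\mu} = -\frac{iq}{h}\,\lim_{\varepsilon\to 0}\Big\{ \Big\langle \Big[H^{(\varepsilon)} - \sum_{\ell=1}^{L}\mu_\ell N_\ell ,\ \underline\varphi(x)\Big] \Big\rangle^{(\varepsilon)}_{\beta,\underline\mu} + \sum_{\ell=1}^{L}\mu_\ell\, \big\langle [N_\ell, \underline\varphi(x)] \big\rangle^{(\varepsilon)}_{\beta,\underline\mu} \Big\} . \tag{4.10} $$
Obviously
$$ \Big\langle \Big[H^{(\varepsilon)} - \sum_{\ell=1}^{L}\mu_\ell N_\ell ,\ \underline\varphi(x)\Big] \Big\rangle^{(\varepsilon)}_{\beta,\underline\mu} = 0 , \tag{4.11} $$
and one might be tempted to expect that $\lim_{\varepsilon\to 0}\langle [N_\ell, \underline\varphi(x)]\rangle^{(\varepsilon)}_{\beta,\underline\mu}$ vanishes, for all $\ell$, because the charges $N_\ell$ are conserved. However, as long as the regularization is present ($\varepsilon \neq 0$), these charges are not conserved, and there is no guarantee that the second term on the R.S. of (4.10) vanishes!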
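The role of cyclicity of the trace in (4.11) can be checked in a finite-dimensional toy model. The sketch below makes several assumptions that are not in the text: a six-level system, random diagonal $H$ and $N$ (so that they commute), and a random complex matrix `a` standing in for the field operator. In such a model the charge is exactly conserved and there are no zero-modes, so both terms vanish; the non-vanishing second term discussed above cannot arise there.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 6

# Toy assumption: H and N are diagonal in the same basis, hence [H, N] = 0.
H = np.diag(rng.uniform(0.0, 3.0, dim))
N = np.diag(rng.integers(-2, 3, dim).astype(float))
beta, mu = 1.5, 0.4  # illustrative values

weights = np.exp(-beta * (np.diag(H) - mu * np.diag(N)))
rho = np.diag(weights / weights.sum())  # grand-canonical state, Tr rho = 1

def comm(x, y):
    """Commutator [x, y]."""
    return x @ y - y @ x

# A generic operator standing in for the field (hypothetical placeholder).
a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))

first_term = np.trace(rho @ comm(H - mu * N, a))   # analogue of (4.11)
second_term = np.trace(rho @ comm(N, a))           # analogue of <[N, phi]>
```

Both traces are zero up to rounding, because $\rho$ commutes with $H - \mu N$ and with $N$ separately. The toy model therefore illustrates why (4.11) is "obvious", and also why a non-zero current in the sum rule requires a setting beyond it.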
We conclude that
$$ \langle \underline J(x)\rangle_{\beta,\underline\mu} = -\frac{iq}{h}\,\sum_{\ell=1}^{L}\mu_\ell\, \lim_{\varepsilon\to 0}\big\langle [N_\ell, \underline\varphi(x)] \big\rangle^{(\varepsilon)}_{\beta,\underline\mu} . \tag{4.12} $$
Eq. (4.12) might be called a current sum rule. Let us assume that the conserved charges $N_\ell$, $\ell = 1, 2, \dots$, are given as integrals of the 0-components of conserved currents over space. Then the current sum rule (4.12) implies that if $\langle \underline J(x)\rangle_{\beta,\underline\mu} \neq 0$ there must be gapless modes in the system. The proof, see [7], is analogous to the proof of the Goldstone theorem in the theory of broken continuous symmetries.

The sum rule (4.12) is the main result of this section. A careful derivation of equation (4.12) and of our analogue of the Goldstone theorem could be given by using the operator-algebra approach to quantum statistical mechanics [21]. But, in order to reach our punch line on a reasonable number of pages, we refrain from entering into a careful technical discussion.

5 Conductance quantization in ballistic wires and in incompressible quantum Hall fluids
In this section, we combine the results of Sects. 2 and 4, in order to gain insight into the phenomena of conductance quantization, as discussed at the beginning of Sect. 2. We first study a ballistic wire, i.e., a very thin, long, clean conductor without back scattering centers (impurities). The ends of the wire are connected to two reservoirs filled with electrons at chemical potentials $\mu_\ell$, $\mu_r$, respectively, with
$$ \mu_\ell - \mu_r = V , \tag{5.1} $$
where $V$ is the voltage drop through the wire. A ballistic wire is a three-dimensional, elongated metallic object with a tiny cross section in the plane perpendicular to its principal axis. Thus, at low temperature, the three-dimensional nature of the wire merely implies that there are several, say $N$, species of electrons labelled by discrete quantum numbers that originate from the motion in the plane perpendicular to the axis of the wire. Every species of electrons forms a one-dimensional Luttinger liquid [22], and these Luttinger liquids may interact with each other. Every Luttinger liquid has two conserved vector current operators, $J^{(i,s)\mu}$, and conserved chiral current operators, $J^{(i,s)\mu}_{\ell/r}$, where $s = \uparrow, \downarrow$ denotes the magnetic quantum number of the electrons in the $i$th Luttinger liquid ("spin up" and "spin down"), and $i = 1,\dots,N$. The chiral current operators $J^{(i,s)\mu}_{\ell/r}$ are as in eqs. (1.21)–(1.23). The total electric current operator and the total chiral current operators are given by
$$ J^\mu = \sum_{i=1}^{N}\sum_{s=\uparrow,\downarrow} J^{(i,s)\mu} , \qquad J^\mu_{\ell/r} = \sum_{i=1}^{N}\sum_{s=\uparrow,\downarrow} J^{(i,s)\mu}_{\ell/r} . \tag{5.2} $$
They are conserved. The total electric charge operators counting the electric charges of chiral (left-moving and right-moving) modes in the wire are the operators $N_\ell$ and $N_r$ defined in eq. (1.24). Their expectation values in a thermal equilibrium state of the wire are tuned by the chemical potentials, $\mu_\ell$, $\mu_r$, respectively, of the reservoirs at the right and left end of the wire. Imagine that the wire is kept at a constant temperature $\beta^{-1}$. Our description of the electron gas in the wire in terms of a finite number of Luttinger liquids correctly captures electric transport properties of the wire only if $\beta^{-1}$ and $eV$, with $e$ the elementary electric charge, are tiny as compared to the energy scale of the motion in the plane perpendicular to the axis of the wire. (However, $\beta^{-1}$ and $eV$ should be large as compared to the energy scale of weak back scattering centers.) We shall assume that these conditions are met. Then we may apply the current sum rule (4.12) derived in the last section, and the formulae for the anomalous commutators derived in Sect. 1, see (1.16) and the equation after (1.24), in order to calculate the electric current, $I$, in the wire corresponding to a voltage drop $V$. The current sum rule (4.12) yields
iq_
h
{^([^.^)]>^+^r([^r,^(x)]>/9iJ ,
(5.3)
where {x)
=
(5.4)
i ^ ( 8 l ^ ) ) w
see eq. (1.11), and $q = -e$, because the electric charge of an electron is equal to minus the elementary electric charge. Plugging (5.4) and (5.2) into eq. (5.3) and recalling eq. (1.24) and the anomalous commutator
J/;/ , O (K,O.^ V ) (£.O = ±i — &a> 6„< S(x-y)
,
(5-5)
see eqs. (1.11), (1.15), (1.16), we find that N
I = -
Ht
N?-'\
+ /ir ( [tfr(,'''\ = — X 2N X
2N — V h
^(X,<)])
(fll-Hr)
(5.6)
Thus, we have derived the formula
$$ G_W = \frac{I}{V} = 2N\,\frac{e^2}{h} , \tag{5.7} $$
as claimed in Example 2 at the beginning of Sect. 2. Of course, the number, $N$, of Luttinger liquids of electrons in the wire depends on the mean Fermi energy of the wire (at zero temperature) and hence on the electron density in the wire and can be tuned.

The quantization of the Hall conductance of an incompressible Hall fluid in a Hall sample with, e.g., an annular (Corbino) geometry (see Example 1) can be understood by arguments very similar to those used for quantum wires. Let $V$ denote the voltage drop between the outer and the inner edge of the sample. We assume that $eV$ and the temperature $\beta^{-1}$ are tiny, as compared to the mobility gap in the bulk of the fluid. Let us also assume, temporarily, that the electric field created by connecting the outer and inner edge to the two leads of a battery with voltage drop $V$ does not penetrate into the bulk of the sample (i.e., that, in the bulk, it is screened completely). If this assumption (which will actually turn out to be irrelevant, later) is made then the entire Hall current, $I_H$, in the sample is carried by the chiral modes propagating along the edges of the sample, i.e., $I_H$ is given by the expectation value of the sum, $\mathcal{J}_\ell + \mathcal{J}_r$, of the edge currents, $\mathcal{J}_\ell$, $\mathcal{J}_r$. For an appropriate choice of orientation, $\mathcal{J}_\ell$ is the current at the outer edge and $\mathcal{J}_r$ is the current at the inner edge of the sample. The two edges are separated by the bulk, and, for a macroscopic sample, tunnelling of quasi-particles from one edge to the other one can be neglected for all practical purposes. This implies that the currents, $\mathcal{J}_\ell$ and $\mathcal{J}_r$, and hence the charge operators $N_\ell$ and $N_r$ defined in eq. (1.24), are conserved to very high accuracy. The anomalous commutators of $\mathcal{J}_\ell$ and $\mathcal{J}_r$ are given in eq. (2.15), and the analogue of eqs. (1.11) and (5.4) is
r { x )
=e^S.e^
(d^)(x).
(5.8)
Inserting these equations into the current sum rule (4.12), one finds that IH = - \ sfa
{W([A^(z,0]>^ + M [ ^ W z > 0 ] V j
2
e = -^
IH
(p-t - Pr)
=
(5.9) 2
These arguments do not make it clear why the Hall fraction $f_H = (h/e^2)\,\sigma_H$ is a rational number, and we have no clue, so far, which rational numbers may turn up in physical samples. Understanding the rational quantization of $f_H$ is not quite an easy matter; see [8, 11]. Here we can only sketch some key ideas. Let $\psi(\underline x, t)$ denote the field (a "chiral vertex operator") creating an electron or a hole propagating along the inner (or along the outer) edge of the sample. This field has the form
$$ \psi(\underline x, t) = {:}\,e^{iq\varphi(\underline x, t)}\,{:}\ \varepsilon(\underline x, t) , \tag{5.10} $$
where $q$ is a real number to be determined, and $\varepsilon$ is a neutral so-called simple current of a rational chiral conformal field theory describing chiral modes of zero charge propagating along the edge. The field $\psi(\underline x, t)$ must carry electric charge $\pm e$. Using formula (5.8) and recalling that $\varepsilon$ has zero electric charge, we find that
$$ q = 1/\sqrt{f_H} . \tag{5.11} $$
Furthermore, the field $\psi(\underline x, t)$ must obey Fermi statistics (because electrons and holes are fermions). Hence it must have half-integer "conformal spin", i.e.,
$$ s_\psi = 1/2 \ \ \mathrm{mod.}\ 1 . \tag{5.12} $$
By eq. (5.10), the conformal spin of $\psi$ is given by
$$ s_\psi = \frac{q^2}{2} + s_\varepsilon = \frac{1}{2 f_H} + s_\varepsilon , \tag{5.13} $$
where $s_\varepsilon$ is the conformal spin of $\varepsilon$. Because $\varepsilon$ is a simple current of a rational chiral conformal field theory, $s_\varepsilon$ is a rational number, i.e., $s_\varepsilon = k/\ell$, with $k$ and $\ell$ two relatively prime integers. Thus (5.12) and (5.13) imply that
$$ \frac{1}{2 f_H} + \frac{k}{\ell} = 1/2 \ \ \mathrm{mod.}\ 1 . \tag{5.14} $$
It follows that $f_H$ is a rational number. For more details see [8, 23, 24] and, especially, [11]. Properties of the rational chiral conformal field theories that may appear in the context of the quantum Hall effect are discussed in [8, 11]. One noteworthy result is that, unless $f_H$ is an integer, there must be chiral modes (quasi-particles) of fractional electric charge and fractional statistics, sometimes called Laughlin vortices, propagating along the edges of the sample.

Let us see what happens if the electric field $\underline E$ can penetrate into the bulk of an incompressible quantum Hall fluid. Electric transport in such Hall fluids can be understood by combining the arguments outlined above with Hall's law in the bulk. The total Hall current, $I_H$, is given by
$$ I_H = I_H^{\rm edge} + I_H^{\rm bulk} , \tag{5.15} $$
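where $I_H^{\rm edge}$ is the edge current studied above, and $I_H^{\rm bulk}$ is a current carried by extended bulk states. Let $\gamma$ denote an arbitrary smooth oriented curve connecting a point on the inner edge to a point on the outer edge of the sample. Then
$$ I_H^{\rm bulk} = \sum_{k,l}\int_\gamma \epsilon_{kl}\, J^k(x)\, ds^l(x) , \tag{5.16} $$
where $J^k$ is the $k$-component of the bulk current; see eq. (2.4). As usual,
$$ J^k(\underline x, t) = \langle \mathcal{J}^k(\underline x, t)\rangle_A = \delta S_{\rm eff}(A)/\delta A_k(\underline x, t) . \tag{5.17} $$
By eqs. (2.17), (2.18), the R.S. of (5.17) is given by
$$ \delta S_{\rm eff}(A)/\delta A_k(\underline x, t) = \sigma_H\, \epsilon^{kl} E_l(\underline x, t) , \tag{5.18} $$
see also (2.4) (with $\sigma_L = 0$). Thus
$$ I_H^{\rm bulk} = \sigma_H \int_\gamma \underline E(\underline x, t)\cdot d\underline s(x) . \tag{5.19} $$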
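Before the edge and bulk contributions are combined, it may help to make the rationality condition (5.14) concrete. The sketch below assumes that (5.14) reads $1/(2f_H) + s_\varepsilon = 1/2$ mod 1 with rational $s_\varepsilon$, and solves it for $f_H$ for each integer $n$ on the right-hand side; the function name and the sample values of $s_\varepsilon$ are hypothetical.

```python
from fractions import Fraction

def hall_fractions(s_eps, n_values):
    """Solve 1/(2 f) + s_eps = 1/2 + n for f, one solution per integer n.

    s_eps is the (rational) conformal spin of the neutral simple current.
    Values of n for which 1/f would vanish are skipped.
    """
    solutions = []
    for n in n_values:
        inverse_f = 2 * (Fraction(1, 2) + n - s_eps)  # this is 1/f
        if inverse_f != 0:
            solutions.append(1 / inverse_f)
    return solutions

# With a trivial neutral sector (s_eps = 0) one obtains f = 1/(2n + 1).
laughlin_like = hall_fractions(Fraction(0), range(3))
# A non-trivial rational spin still gives rational fractions.
other = hall_fractions(Fraction(1, 4), range(2))
```

For $s_\varepsilon = 0$ the solutions are $1, 1/3, 1/5, \dots$, and any rational $s_\varepsilon$ yields rational $f_H$, which is the content of the statement following (5.14). Which of these values occur in physical samples is not decided by this arithmetic.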
We have shown in eq. (5.9) that
$$ I_H^{\rm edge} = \sigma_H\,(\mu_\ell - \mu_r) . \tag{5.20} $$
Thus, combining (5.15), (5.19) and (5.20), we conclude that
$$ I_H = I_H^{\rm edge} + I_H^{\rm bulk} = \sigma_H \Big( \mu_\ell - \mu_r + \int_\gamma \underline E(\underline x, t)\cdot d\underline s(x) \Big) . \tag{5.21} $$
But the expression in the parentheses on the R.S. of (5.21) is nothing but the total voltage drop $V$ between the outer and the inner edge. Hence (5.21) implies that
$$ I_H = \sigma_H V , \tag{5.22} $$
as desired.

Transport phenomena such as heat conduction through a quantum wire or a Hall sample (see Example 3 at the beginning of Sect. 2) can be studied along similar lines: In a physical system where modes of different chirality do not interact with each other (such as the modes at the inner and at the outer edge of the sample containing an incompressible quantum Hall fluid) the left-moving and the right-moving modes can be coupled to different reservoirs at different temperatures $\beta_\ell^{-1}$ and $\beta_r^{-1}$. This results in a non-zero heat current given by an expectation value of the component $T^{01}$ of the energy-momentum tensor of the conformal field theory describing the chiral modes in an equilibrium state where the left-movers are at temperature $\beta_\ell^{-1}$ and the right-movers at temperature $\beta_r^{-1} \neq \beta_\ell^{-1}$. (Such expectation values can be calculated from Virasoro characters.) These ideas lead to a conceptually clean understanding of the effects described in Example 3 at the beginning of Sect. 2.

6 A four-dimensional analogue of the Hall effect, and the generation of large, cosmic magnetic fields in the early universe
In this section, we further explore the four-dimensional analogue of the Hall effect described in Sect. 3. We shall apply our findings to exhibit effects that may play an important role in early-universe cosmology. Our results represent an elaboration upon those in [15, 7]. We start our analysis by studying a system of massless Dirac fermions coupled to an external electromagnetic field in four-dimensional Minkowski space. Using results derived in Sects. 1 and 4, we derive equations analogous to eqs. (5.3)-(5.6) for the conductance of a quantum wire. From Sect. 1 we recall the expression for the anomalous commutators between vector- and axial-vector — or chiral currents. [ £ % ( < . « ) , Jt°,r(t,y)]
= ± «' £ 2 (B(x,t)
■ V) S(x-y)
,
(6.1)
where q is the charge of the fermions — see eq. (1.62) — and [j?(t,x),
Jr(t,y)}
= 0.
(6.2)
With (1.45) and (1.48), these equations yield 2
[j°r
(t,y) , J°{t,x)]
= ± i £_
(R(y,t)
•V j
8(x-y)
,
(6.3)
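where $J^\mu$ is the $\mu$-component of the conserved vector current. In Sect. 4, we have introduced the vector potential, $\underline\varphi$, of $J^\mu$:
$$ J^0(x) = \frac{q}{2\pi}\,\mathrm{div}\,\underline\varphi(x) , \qquad \underline J(x) = -\frac{q}{2\pi}\,\frac{\partial}{\partial t}\,\underline\varphi(x) . \tag{6.4} $$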
Eqs. (6.3) and (6.4) imply that
Jt,r(y>t) - £(*■<)] =±i-^R(y,t)
6{x-y)
±curl n ( x - y , < )
(6.5)
where $\underline\Pi$ is some vector-valued distribution. Next, we recall that the operators
$$ N_{\ell/r} := \int d\underline y\ J^0_{\ell/r}(\underline y, t) \tag{6.6} $$
are conserved. They are interpreted as the electric charge operators for left-handed/right-handed fermionic modes. The chemical potentials conjugate to $N_{\ell/r}$ are denoted by $\mu_{\ell/r}$. Let us imagine that, at very early times in the evolution of our universe (or others), there was an asymmetry in the population of left-handed and right-handed fermionic modes (as argued in [15] for the example of electrons before the electroweak phase transition). Then
$$ \mu_\ell \neq \mu_r , \tag{6.7} $$
in the state of the universe at those very early times. Let us furthermore imagine that the state of the universe at those early times was, to a good approximation, a thermal equilibrium state at an inverse temperature $\beta$ ($< (80\ \mathrm{TeV})^{-1}$, as argued in [15]) and with chemical potentials $\mu_\ell$ and $\mu_r$. (It may well be that this is an unrealistic assumption. It will subsequently turn out that it is unimportant!) Under these assumptions, we may apply the current sum rule (4.12) derived in Sect. 4. Combining eqs. (6.5), (6.6) and (4.12), and using that $\int d\underline y\ \mathrm{curl}\,\underline\Pi(\underline x - \underline y, t) = 0$, for all $\underline x$, $t$, we find that
K3
_
~
=
~ £h
(W-M£(i),
(6.8)
as claimed in [7]. This equation is the analogue of (5.6). Treating the electromagnetic field as a classical, but dynamical field, its dyna mics is governed by Maxwell's equations, Y • B_ = 0 , V A £ + dtE
= 0,
and Y • E = {J°)p,E , V_AB-dtE=
(J)pijL .
(6.9)
There is no reason to imagine that the charge density, $\langle J^0\rangle_{\beta,\underline\mu}$, in the very early universe is different from zero. In the last equation of (6.9), the current on the R.S. is given by eq. (6.8). Actually, assuming that there are some dissipative processes evolving in the early universe, an equation for the current,
$$ \underline j(x) := \langle \underline J(x)\rangle_{\beta,\underline\mu} , $$
more realistic than (6.8) may be
$$ \underline j(x) = \sigma_L\, \underline E(x) + \sigma_T V\, \underline B(x) , \tag{6.10} $$
where $\sigma_L$ is an Ohmic longitudinal conductivity, and $\sigma_T$ is the analogue of the "transverse" or Hall conductivity; furthermore,
$$ V := \mu_\ell - \mu_r \tag{6.12} $$
is the analogue of the voltage drop considered in the Hall effect. The quantity $\sigma_T$ is "quantized", just like the Hall conductivity: If there are $N \geq 1$ species of charged, massless fermions, with electric charges $q_1,\dots,q_N$, then
** = " 4 ^ 1 ^ 1
■
(613)
which is the precise analogue of a formula for the quantization of the Hall conductivity derived in [8], and, for $q_j = \pm e$, $j = 1,\dots,N$, of eq. (5.6). Let us temporarily assume that $\sigma_L = 0$ (i.e., we neglect dissipative processes). Then Maxwell's equations, together with eq. (6.10) (for $\sigma_L = 0$) and the assumption that the charge density vanishes, yield the following system of linear equations:
$$ \nabla\cdot\underline B = 0 , \quad \nabla\wedge\underline E + \partial_t\underline B = 0 , \quad \nabla\cdot\underline E = 0 , \quad \nabla\wedge\underline B - \partial_t\underline E = \sigma_T V\,\underline B . \tag{6.14} $$
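Because all coefficients are constant, these equations can be solved by Fourier transformation, and it is enough to construct propagating wave solutions corresponding to an arbitrary, but fixed wave vector $\underline k$. The equations $\nabla\cdot\underline B = \nabla\cdot\underline E = 0$ imply that
$$ \underline k\cdot\hat{\underline B} = \underline k\cdot\hat{\underline E} = 0 , \tag{6.15} $$
i.e., that only the components of the Fourier transforms $\hat{\underline B}$ and $\hat{\underline E}$ of $\underline B$ and $\underline E$ (evaluated at the wave vector $\underline k$) perpendicular to $\underline k$ can be non-zero. If we denote the components of $\hat{\underline B}$ and $\hat{\underline E}$ perpendicular to $\underline k$ by $\hat{\underline B}^T$, $\hat{\underline E}^T$, respectively, the remaining equations in (6.14) yield
$$ \partial_t \begin{pmatrix} \hat{\underline B}^T \\ \hat{\underline E}^T \end{pmatrix} = K(\underline k) \begin{pmatrix} \hat{\underline B}^T \\ \hat{\underline E}^T \end{pmatrix} , \tag{6.16} $$
where (in an orthonormal basis chosen in the plane perpendicular to $\underline k$) the matrix $K(\underline k)$ is given by
$$ K(\underline k) = \begin{pmatrix} 0 & 0 & 0 & -ik \\ 0 & 0 & ik & 0 \\ -\sigma_T V & ik & 0 & 0 \\ -ik & -\sigma_T V & 0 & 0 \end{pmatrix} , \tag{6.17} $$
with $k = |\underline k|$. The circular frequency of a propagating wave solution of (6.14) with wave vector $\underline k$ is given by $\omega(k)$, where $i\omega(k)$ is an eigenvalue of $K(\underline k)$. By (6.17),
$$ \omega(k)^2 = k^2 \pm k\,\sigma_T V , \tag{6.18} $$
as one readily checks. Thus, if
$$ |\underline k| = k < \sigma_T V \tag{6.19} $$
there are two purely imaginary frequencies, and eqs. (6.14) have solutions $(\underline B(\underline x, t), \underline E(\underline x, t))$ growing exponentially fast in time and with the property that
$$ \underline B(\underline x, t)\cdot\underline E(\underline x, t) \neq 0 . \tag{6.20} $$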
It is almost as easy to solve Maxwell's equations (6.9), with J_ given by (6.10), for
< (arV)2
,
(6.21)
one again finds exponentially growing electromagnetic fields; (perturbation theory). Dissipative processes will subsequently damp out electric fields.

In [15], calculations similar to those just presented are used to argue that, in the very early universe, large, cosmic electromagnetic fields may have been generated as a consequence of an asymmetric population of left-handed and right-handed electron modes ($q = -e$). However, these arguments rest on rather shaky hypotheses (the state of the early universe is assumed to be a thermal equilibrium state, and the charges $N_\ell$ and $N_r$, see eq. (6.6), are assumed to be approximately conserved). We propose to reconsider these arguments in the light of the analogy between the (2+1)-dimensional (bulk) description of the Hall effect and the (4+1)-dimensional description of chiral fermions discussed at the beginning of Sect. 3, eqs. (3.3) through (3.10).

What we have described, so far, in this section are calculations analogous to those reported in eqs. (5.6), (5.8) and (5.9). Next, we generalize our analysis in a way analogous to that followed in eqs. (5.15) through (5.22), starting from the effective action given in (3.10); (see also (3.20)). We integrate out all degrees of freedom (quarks, gluons, leptons, the weak gauge fields $W$, $Z$, etc.), except for the electromagnetic and the axion field. We have seen, at the beginning of Sect. 3, eqs. (3.4), (3.10), that the axion could be viewed as the four-component of a five-dimensional electromagnetic vector potential, $\hat A$, which does not depend on the coordinate, $x^4$, in the direction perpendicular to the four-dimensional branes on which we live; see (3.6). We could pursue a five- (or higher-) dimensional approach to early-universe cosmology (as presently popular), but let's not! We propose to view the axion as the "model-independent (invisible)" axion first described in [20]. It has a geometrical origin (in superstring theory). It couples to all gauge fields present in the system through a term
^
J y{FwAFw)
,
(6.22)
where $F_W$ is the field strength of a gauge field $W$ appearing in our theoretical description, and to the curvature tensor $R$; see (3.16). All gauge fields, except for the electromagnetic vector potential $A$, shall be integrated out. The (Euclidean-region) functional integrals have the form
Jdn(W)exp ^LJ
■U(V)
(6.23)
Since .^ ^ (Fw A Fw) is the index density, the integrand in U(ip) can be shown to be periodic in
VAV
,
(6.24)
with boundary conditions (A(ti), f{t\)) = (Am,ipm) and (A(t2), pfo)) = (^outi ¥>out)- In (6.24), Seff(A,
= ^Jd"x{F^x)F^(x) +
3 2 ^ / ^(X)
( F A F) {X)
~
+ 2
(dltV>)(x)(&'
UM
+
W{A)
'
(6 25)
-
where $W(A)$ is of higher than second order in $A$ and arises from integrating out all charged fields in the theory*; furthermore, $e^2$ is the effective (one-loop renormalized) fine-structure constant. It is not necessary, in this approach, to assume that all the fermions in the theory be massless. They can acquire masses through the Higgs-Kibble mechanism. (The arguments of complex chiral Higgs fields then contain a term proportional to the axion field $\varphi$ which, however, can be absorbed in a change of variables.) Furthermore, calculating transition amplitudes with the help of Feynman path integrals does not presuppose that the system is in or close to thermal equilibrium.

* $W$ depends on the boundary conditions, at times $t_1$ and $t_2$, imposed on the fields that have been integrated out.

We now insert expression (6.25) into the functional integral (6.24) and try to evaluate the latter by using a semi-classical expansion based on the stationary-phase method. The equations for the saddle point are
$$ \delta S_{\rm eff}(A, \varphi)/\delta A_\mu(x) = 0 , \qquad \delta S_{\rm eff}(A, \varphi)/\delta\varphi(x) = 0 . \tag{6.26} $$
To simplify matters, we consider solutions of these equations describing fairly small electromagnetic fields and an axion field that varies only slowly in space-time. Then we can neglect the term W(A) in (6.25) and we may omit all contributions to U (
a J
"
a
^=4h * ( fAF >-^) •
(6-27)
(and we have set $c = 1$ and $\hbar = 1$). Let $J_M^\mu$ denote the magnetic current that could be present if there were magnetic monopoles moving through the early universe. Then the full set of Maxwell-Dirac-axion equations reads
^ {(d^) F""+W£} , The first equation in (6.28) replaces the homogeneous Maxwell equations, [dliF*"/ — 0, for JVM = 0 ) . I n vector notation, the system of equations (6.28) reads V ■ B = J°M , V A £ + B = lM
,
V - £ = | J {(5.
le"1
= - — {
a ^ = - ~ E - B -
U'dp) .
(6.29)
O 7T
In order to gain some insight into properties of solutions of these highly non-linear equations, we study their linearization around various special solutions. Already this part of the analysis, let alone a study of the full, non-linear equations, is quite lengthy; see [26] for a beginning. Here we just sketch results in a few interesting special situations. We shall first assume that $J_M^\mu = 0$, i.e., that there aren't any magnetic monopoles around.

(i)
We set U(
where V is a constant. Linearizing (6.29) around (6.30), we obtain the equations Y • B_ = 0 , V A I + 5
=
0,
V • £ = 0 ,
=
- T - T V I ,
e2 V A B - E
8 7T
Dy> = 0 .
(6.31)
W i t h the exception of the wave equation for the axion field
4nh
'
which is precisely eq. (6.11), with q — e! Recall t h a t , in the analysis presented at the beginning of this section,
$V = \mu_\ell - \mu_r$. This equation and (6.30) tell us that, apparently, the field $\ell\dot\varphi$ has the interpretation of the difference of chemical potentials of left- and right-handed fermions! This interpretation magically fits with the five-dimensional interpretation of the axion field
J ^ E K 7
r
-dsK
= y (far4 ^ 4 ( 0 = Jdx4
,
(6.32)
K= \
where $\xi = (x, x^4) = (t, \mathbf{x}, x^4)$, and we have assumed in the first equality that $E$ does not depend on $x^4$ (see assumption (3.6)) and $E_4$ does not depend on $x$. Since, for solution (6.30),
E^)
=m = j
eq. (6.32) yields
$$\int \sum_{K=1}^{4} E_K \, ds^K = V. \tag{6.33}$$
This shows that, in the five-dimensional interpretation of the axion, $V$ is the "voltage drop" between the two four-dimensional branes corresponding to the lower and upper face of the five-dimensional slab. This observation makes the analogy between the effects studied here and the Hall effect yet a little more precise.
Solutions of eqs. (6.31) have been studied earlier in this section; see (6.16) through (6.20). They have unstable modes growing exponentially in time, with $\mathbf{B}(\mathbf{x},t)\cdot\mathbf{E}(\mathbf{x},t) \neq 0$.
(ii) Now $U(\varphi) \neq 0$; $U'(\varphi)(x) := \delta U(\varphi)/\delta\varphi(x)$ is a periodic function with minima at '- n, $n = 0, \pm 1, \pm 2, \ldots$. We linearize equations (6.29) around the solution $\mathbf{E} = \mathbf{B} = 0$, $\varphi =$
= -U'(
.
(6.34)
This is the equation of motion of a planar pendulum in a force field with potential U. We have learnt in our courses on elementary mechanics how to solve (6.34), using energy conservation. For "small energy", a solution,
+ B = 0, Ic1 Y • E = 0 , V A B_ - E = - —-= (pcB_ ,
(6.35)
O IT
which can be solved by Fourier transformation in the space variables. The equations for the components, .6 and E_ , of the Fourier components of B_ and E_ perpendicular to the wave vector k_ are two Mathieu equations of the form
where $k = |\mathbf{k}|$ and $h_k(t)$ depends on $k$ and is linear in
(6.36)
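Parametric resonance in a Mathieu-type equation can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it uses the textbook canonical form $\ddot x + (a - 2q\cos 2t)\,x = 0$ rather than the specific coefficients of eq. (6.36), and the function name `mathieu_max_amplitude` and the parameters `a`, `q` are hypothetical choices made for this example. It integrates the equation with a fourth-order Runge-Kutta step and reports the largest amplitude reached.

```python
import math


def mathieu_max_amplitude(a, q, t_end=100.0, dt=0.01):
    """Integrate x'' + (a - 2 q cos 2t) x = 0 with RK4 and return max |x(t)|.

    Canonical Mathieu form (an assumption of this sketch, not eq. (6.36)).
    Initial data: x(0) = 1, x'(0) = 0.
    """
    def accel(t, x):
        # Restoring force with a periodically modulated frequency.
        return -(a - 2.0 * q * math.cos(2.0 * t)) * x

    x, v, t = 1.0, 0.0, 0.0
    max_abs = abs(x)
    for _ in range(int(round(t_end / dt))):
        k1x, k1v = v, accel(t, x)
        k2x, k2v = v + 0.5 * dt * k1v, accel(t + 0.5 * dt, x + 0.5 * dt * k1x)
        k3x, k3v = v + 0.5 * dt * k2v, accel(t + 0.5 * dt, x + 0.5 * dt * k2x)
        k4x, k4v = v + dt * k3v, accel(t + dt, x + dt * k3x)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        t += dt
        max_abs = max(max_abs, abs(x))
    return max_abs


# a = 1 lies inside the first instability interval for q = 0.2,
# a = 2 lies between the first and the second interval.
resonant = mathieu_max_amplitude(1.0, 0.2)
detuned = mathieu_max_amplitude(2.0, 0.2)
```

For the resonant choice the amplitude grows roughly exponentially, while for the detuned choice it stays of order one; this is the qualitative dichotomy between unstable and stable intervals of the parameter.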
In solving this equation one encounters the phenomenon of parametric resonance, i.e., for $k$ in a family of intervals, eq. (6.36) has a solution growing exponentially in time. Hence the electromagnetic field has unstable modes growing exponentially in time and with $\mathbf{B}\cdot\mathbf{E} \neq 0$. Parametric resonance has appeared in cosmology in other contexts. In our analysis it plays an entirely natural and essentially model-independent role and may help to explain where large, cosmic (electro)magnetic fields might come from. Of course, eqs. (6.29) are Lagrangian equations of motion. They are derived from the action functional (6.25), (with $W = 0$ and $U$ independent of derivatives of
Clearly, it would be interesting to construct finite-energy solutions of eqs. (6.29), with an initial axion field depending not only on time but also on space. Of particular significance is situation (ii), with $U \neq 0$. Interpreting $\dot\varphi$ as a difference of chemical potentials for left- and right-handed fermions, we are thus considering states of the universe with spatially varying, time-dependent chemical potentials triggering an asymmetric population of left-handed and right-handed fermionic modes. This asymmetry gradually disappears, due to chirality-changing processes, and the field energy stored in axionic degrees of freedom is reshuffled into certain electromagnetic field modes, triggering the growth of cosmic electromagnetic fields. Large electric fields rapidly die out because of dissipative processes (the energy loss from the electric field into matter degrees of freedom is described by $\mathbf{E}\cdot\mathbf{J} \propto \sigma_L |\mathbf{E}|^2 + \ldots$). But large magnetic fields may survive for a comparatively long time.
Describing these phenomena within the approximation of linearizing eqs. (6.29) (possibly supplemented by a dissipative Ohmic term) around special solutions, including space-dependent ones, of infinite or finite energy, is feasible [26]. But our understanding of the effects of the non-linearities in eqs. (6.29) remains, not surprisingly, very rudimentary. Some speculations on the role played by magnetic monopoles in the effects described here are contained in the last section; see also [26].
7
Conclusions and outlook
In this review we have shown how the chiral, abelian anomaly helps to explain important features of the (quantum) Hall effect, such as the existence of edge currents and aspects of the quantization of the Hall conductivity, and of its four-dimensional cousin, which may play a significant role in explaining the origin of large, cosmic magnetic fields. Our analysis is essentially model-independent, a fact that makes it quite trustworthy. How significant the four-dimensional variant of the Hall effect is in early-universe cosmology remains to be understood in more detail. This will require a better understanding of orders of magnitude of various physical quantities and of the properties of solutions of the non-linear Maxwell-Dirac-axion equations (6.29). A beginning has been made in [15, 26]. There is no doubt that the following equations $J_{\mathrm{bulk}}$
= T
* F ,
(JJedge =
—
(7.1)
with ar — ^H, for bulk- and edge-currents of an incompressible Hall fluid (see eqs. (2.10) and (2.14)), and J" = vrlfaip)
F> +
VJ»M}
,
(7.2)
where $\sigma_T = -\frac{1}{8\pi^2\hbar}\bigl(\sum_{j=1}^{N} q_j^2\bigr)$, with $N$ the number of species of charged fermions with electric charges $q_1, \ldots, q_N$ (see eqs. (6.13) and (6.28)), are significant laws of nature connected with the chiral anomaly. For the future, it would be important to gain a better understanding of the contents of equations (6.29) (possibly corrected by dissipative terms and/or ones coming from $\delta W(A)/\delta A_\mu(x)$, which have been neglected), including the role played
by magnetic monopoles and dyons ($J_M^\mu \neq 0$). (Eqs. (6.29) and their fully quantized counterparts appear to offer some clue for understanding (axion-driven) monopole-anti-monopole annihilation, triggering the growth of certain modes of the electromagnetic field.) Some understanding of these issues has been gained in [26]; but much work remains to be done. We have also studied the influence of gravitational fields on the processes described in Sect. 6 [26] (in analogy to the "geometric" (or gravitational) Hall effect in 2 + 1 dimensions described in the third paper quoted under [10] and to the phenomenon of "quantized" heat currents in quantum wires mentioned in Sects. 3 and 5). But there is no room here to describe our results in detail. Our findings will have to be combined with cosmic evolution equations.
In this review, we have only quoted literature that we used in carrying out the calculations described here. Many further references may be found in [7, 8, 10, 15, 20, 26].
Acknowledgments
The results described in Sects. 2, 4 and 5 have been obtained in collaboration (of J.F.) with A. Alekseev and V. Cheianov [7], in continuation of earlier work with T. Kerler, U. Studer and E. Thiran. We thank these colleagues, Chr. Schweigert and Ph. Werner for many useful discussions. We are grateful to R. Durrer, E. Seiler and D. Wyler for drawing our attention to some useful earlier work in the literature and for encouragement.
References
1. R. Jackiw, in "Current Algebra and Its Applications", S.B. Treiman, R. Jackiw and D.J. Gross (eds.), Princeton University Press, Princeton NJ, 1972. L. Alvarez-Gaume and E. Witten, Nucl. Phys. B 234, 269 (1983).
2. K. Fujikawa, Phys. Rev. Letters 42, 1195 (1979); Phys. Rev. D 21, 2848 (1980); Phys. Rev. D 22, 1499 (1980); Phys. Letters 171 B, 424 (1986).
3. L. Alvarez-Gaume, Commun. Math. Phys. 90, 161 (1983).
4. R. Stora, in: "New Developments in Quantum Field Theory and Statistical Mechanics", M. Levy and P. Mitter (eds.), Plenum, New York 1977, p. 201. L.D. Faddeev, Phys. Letters 145 B, 81 (1984). J. Mickelsson, Commun. Math. Phys. 97, 361 (1985). B. Zumino, Nucl. Phys. B 253, 477 (1985).
5. S. Deser, R. Jackiw and S. Templeton, Ann. of Phys. 140, 372 (1982). A.N. Redlich, Phys. Rev. Letters 52, 18 (1984). I. Affleck, J. Harvey and E. Witten, Nucl. Phys. B 206, 413 (1982). E. Witten, Nucl. Phys. B 249, 557 (1985).
6. C.G. Callan and J.A. Harvey, Nucl. Phys. B 250, 427 (1985). S. Chandrasekharan, Phys. Rev. D 49, 1980 (1994). D.B. Kaplan and M. Schmaltz, Phys. Letters 368 B, 44 (1996).
7. A.Yu. Alekseev, V.V. Cheianov, and J. Fröhlich, Phys. Rev. Letters 81, 3503 (1998).
8.
9.
10.
11.
12. 13. 14. 15.
16. 17. 18. 19. 20.
21.
22.
J. Fröhlich, in: "Les Relations entre les Mathematiques et la Physique Theorique" (Festschrift for the 40th anniversary of the IHES), Louis Michel (ed.), Presses Universitaires de France, Paris 1998. See also: A.Yu. Alekseev, V.V. Cheianov and J. Fröhlich, Phys. Rev. B 54, R 17 320 (1996). J. Fröhlich and T. Kerler, Nucl. Phys. B 354, 369-417 (1991). J. Fröhlich and E. Thiran, J. Stat. Phys. 76, 209-283 (1994). J. Fröhlich, T. Kerler, U.M. Studer and E. Thiran, Nucl. Phys. B 453 [FS], 670-704 (1995). J. Fröhlich, U.M. Studer and E. Thiran, J. Stat. Phys. 86, 821-897 (1997). R.E. Prange and S.M. Girvin (eds.), "The Quantum Hall Effect", 2nd ed., Graduate Texts in Contemporary Physics, Springer-Verlag, Berlin, Heidelberg, New York 1990. M. Stone (ed.), "Quantum Hall Effect", World Scientific Publ. Co., Singapore 1992. X.G. Wen, Phys. Rev. B 40, 7387 (1989). X.G. Wen and A. Zee, Phys. Rev. B 46, 2290 (1992). J. Fröhlich and U.M. Studer, Rev. Mod. Phys. 65, 733 (1993). J. Fröhlich, B. Pedrini, Chr. Schweigert, J. Walcher, "Universality in Quantum Hall Systems: Coset Construction of Incompressible States", preprint cond-mat/0002330. B.J. van Wees et al., Phys. Rev. Lett. 60, 848 (1988). A. Yacoby et al., Phys. Rev. Lett. 77, 4612 (1996). K. von Klitzing, G. Dorda and M. Pepper, Phys. Rev. Letters 45, 494 (1980). D.C. Tsui, H.L. Stormer and A.C. Gossard, Phys. Rev. Letters 48, 1559 (1982). I.I. Tkachev, Sov. Astron. Lett. 12, 305 (1986). M. Turner and L. Widrow, Phys. Rev. D 37, 2743 (1988). M. Joyce and M. Shaposhnikov, Phys. Rev. Letters 79, 1193 (1997), (astro-ph/9703005). A. Connes, "Noncommutative Geometry", Academic Press, New York, London, Tokyo 1994, (especially Chapter VI, Sect. 5). A.H. Chamseddine and J. Fröhlich, J. Math. Phys. 35, 5195 (1994). P. Hořava and E. Witten, Nucl. Phys. B 460, 506 (1996); Nucl. Phys. B 475, 94 (1996). R.D. Peccei and H.R. Quinn, Phys. Rev. Lett. 38, 1440 (1977). E. Witten, Phys. Letters 149 B, 351 (1984). J.E. Kim, "Cosmic Axion", 2nd Intl. Workshop on Gravitation and Astrophysics, Univ. of Tokyo 1997, astro-ph/9802061. D. Ruelle, "Statistical Mechanics (Rigorous Results)", W.A. Benjamin, New York, Amsterdam 1969. O. Bratteli and D. Robinson, "Operator Algebras and Quantum Statistical Mechanics", vol. I and II, Springer-Verlag, Berlin, Heidelberg, New York 1979. S. Tomonaga, Progr. Theor. Phys. 5, 544 (1950). J.M. Luttinger, J. Math. Phys. 4, 1154 (1963). D.C. Mattis and E.H. Lieb, J. Math. Phys. 6, 304 (1965). J. Solyom, Adv. Phys. 28, 209 (1979). F.D.M. Haldane, J. Phys. C 14, 2585 (1981).
23. N. Read, Phys. Rev. Lett. 65, 1502 (1990).
24. J. Fröhlich et al., "The Fractional Quantum Hall Effect, Chern-Simons Theory, and Integral Lattices", in Proc. of ICM '94, S.D. Chatterji (ed.), Birkhäuser Verlag, Basel, Boston, Berlin 1995.
25. C. Vafa and E. Witten, Phys. Rev. Letters 53, 535 (1983).
26. Ph. Werner, diploma thesis, ETH-Zurich, spring 2000; J. Fröhlich, B. Pedrini and Ph. Werner, in preparation.
FLUCTUATIONS AND ENTROPY DRIVEN SPACE-TIME INTERMITTENCY IN NAVIER-STOKES FLUIDS
GIOVANNI GALLAVOTTI
Fisica, Università di Roma "La Sapienza", Piazzale Aldo Moro 2, 00153 Roma, Italia
E-mail: gallavotti@roma1.infn.it
We analyze the physical meaning of fluctuations of the phase space contraction rate, which we also call entropy creation rate, and its observability in space-time intermittency phenomena. For concreteness we consider a Navier-Stokes fluid.
1
The chaotic hypothesis in turbulence.
Consider a Navier-Stokes (NS) fluid in a container $V$ which we take, for simplicity, cubic with periodic boundary conditions and subject to a constant volume force $f$:
$$\dot u + R\, u\cdot\partial u = -\partial p + f + \Delta u, \qquad \partial\cdot u = 0, \qquad R = f L^3 \nu^{-2} \tag{1.1}$$
in a container of side $L = 1$, where $R$ is the Reynolds number. We can suppose that $\int_V u\, dx = 0$ (because of translation invariance). We assume the chaotic hypothesis:
Chaotic hypothesis: Asymptotic motions of a turbulent flow develop on an attracting set $A$ in phase space on which time evolution $u \to S_t u$ can be regarded as a transitive Anosov system for the purposes of computing time averages in stationary states.
Here we investigate some assumptions under which the hypothesis acquires some non trivial predictive value with implications that can have experimental relevance. For earlier reviews on the chaotic hypothesis see [15, 12, 17]. A recent one is [30].
2
The OK41 cut-off.
Anxiety often mars the beginning of any discussion on the NS equations: it is a fact that to date there is no theory that allows a constructive solution of the equations via a controlled approximation scheme. Nevertheless most people rapidly recover and adopt the viewpoint that "physically there is an effective ultraviolet cut-off" and the NS equations can be reduced to ordinary equations:
The OK41 cut-off hypothesis: There exists $\kappa_0 > 0$ such that if the NS equation, Eq. (1.1), is truncated in momentum space at $K(R) = R^{\kappa_0}$ (or higher) then the physically relevant predictions are not affected.
The OK41 theory, see [25], assigns to $\kappa_0$ the value 3/4. Therefore the flows of physical interest should be described by Eq. (1.1) truncated at $|k| < K(R)$, i.e.
$$\dot u_k + iR \sum_{\substack{k_1 + k_2 = k \\ |k_i| < K(R)}} (u_{k_1}\cdot k_2)\, \Pi_k u_{k_2} = -k^2 u_k + f_k \tag{2.1}$$
where $u(x) = \sum_{k \neq 0} e^{i k\cdot x} u_k$ and $\Pi_k$ is the projection orthogonal to $k$; $k = 2\pi n$ with $n$ an integer components vector.
Eq. (2.1) admits an a priori bound on the energy, $\dot E/2 = -\sum_k k^2 |u_k|^2 + \sum_k f_k\cdot \bar u_k$, which implies that asymptotically in time the energy is bounded by $E \le 2|f|^2/(2\pi)^4$.
We shall call $\mu_R$ the "statistics" of the NS equation, defined as the probability distribution on phase space (i.e. on the space of the velocity fields $\{u_k\}$, $|k| < K(R)$) which describes, at Reynolds number $R$, the stationary state averages of observables $F(u)$, i.e. the probability distribution such that
$$\lim_{T\to\infty} T^{-1}\int_0^T F(S_t u)\, dt = \int F(v)\, \mu_R(dv) \tag{2.2}$$
for almost all initial data $u$. The distribution $\mu_R$ exists and is unique because of the chaotic hypothesis and it is also called the SRB distribution of the stationary state of Eq. (2.2).
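To get a feeling for the size of the truncated phase space, one can count the wave vectors $k = 2\pi n$ that survive the cut-off $|k| < K(R)$. The following Python sketch does this by brute force; the function names `ok41_cutoff` and `count_modes` are hypothetical, the default exponent 3/4 is the OK41 value quoted above, and the count refers to wave vectors only (each one carries a vector amplitude $u_k$ orthogonal to $k$).

```python
import math


def ok41_cutoff(reynolds, kappa0=0.75):
    """Ultraviolet cut-off K(R) = R**kappa0 (kappa0 = 3/4 in the OK41 theory)."""
    return reynolds ** kappa0


def count_modes(cutoff):
    """Count nonzero integer vectors n in Z^3 with |2 pi n| < cutoff."""
    radius = cutoff / (2.0 * math.pi)
    n_max = int(math.floor(radius))
    count = 0
    for nx in range(-n_max, n_max + 1):
        for ny in range(-n_max, n_max + 1):
            for nz in range(-n_max, n_max + 1):
                n2 = nx * nx + ny * ny + nz * nz
                # Exclude n = 0 and keep only wave vectors inside the sphere.
                if 0 < n2 < radius * radius:
                    count += 1
    return count


# Example: number of retained wave vectors at a moderate Reynolds number.
modes_at_R_1000 = count_modes(ok41_cutoff(1000.0))
```

For large cut-off the count grows like the volume of a sphere of radius $K(R)/2\pi$, i.e. like $K(R)^3$, which is why direct enumeration is only practical for moderate $R$.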
3
NS and GNS equations: viscosity and vorticity ensembles. Equivalence.
For the purposes of a conceptual analysis stressing the analogy with the theory of ensembles in statistical mechanics we temporarily introduce a control parameter $\lambda$ in front of the Laplacian in Eq. (1.1) and in front of the $-k^2 u_k$ term in Eq. (2.1), bearing in mind, however, that we are interested in $\lambda = 1$ only. The stationary distributions will then depend on $\lambda, R$ and will be denoted $\mu_{\lambda,R}$, so that the previously introduced SRB distribution $\mu_R$, Eq. (1.1), is $\mu_R = \mu_{1,R}$ with the new notations.
The collection $\mathcal{E}^{NS}$ of all the probability distributions $\mu_{\lambda,R}$ is the collection of all the stationary states of the fluid, at varying Reynolds number $R$. It is an ensemble in the sense of statistical mechanics and it will be called the "viscosity ensemble" for the NS equations.
The second idea is that the same fluid can be studied, rather than by the NS equations Eq. (2.1), by considering the Euler equations subject to a dissipation mechanism that keeps the vorticity $S = \int_V (\partial u)^2 dx$ bounded. This "thermostatting" effect can be achieved by imposing various types of forces $\vartheta$ on the system so that $S = \mathrm{const}$, leading to
$$\dot u + R\, u\cdot\partial u = -\partial p + f + \vartheta, \qquad \partial\cdot u = 0, \qquad R = f L^3 \nu^{-2}. \tag{3.1}$$
As an example we can consider the force obtained by imposing the constraint that $S$ remains identically constant via Gauss' principle of least effort, cf. [12], appendix. It corresponds to, cf. [12],
$$\dot u + R\, u\cdot\partial u = -\partial p + f + \nu_G(u)\,\Delta u, \qquad \partial\cdot u = 0 \tag{3.2}$$
where $\nu_G(u)$ is an easily determined multiplier defined so that $S$ is exactly conserved, namely
$$\nu_G(u) = -\,\frac{\int_V \bigl(f\cdot\Delta u - R\,\Delta u\cdot(u\cdot\partial u)\bigr)\,dx}{\int_V (\Delta u)^2\, dx} \tag{3.3}$$
where V is the container region. The collection £ of the statistics jis,R for Eq. (3.1) will be called the "vorticity ensemble". We establish a correspondence between elements of the ensembles £ and £ by saying that two elements U\,R £ £ and Us,R £ £ are correspondent if i*sM"G) = A
(3.4)
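The mechanism behind a Gaussian multiplier such as $\nu_G$ can be checked on a toy model. The Python sketch below is an illustration only, not the GNS equations themselves: it assumes finitely many real mode amplitudes $u_j$ with squared wave numbers $k_j^2$, a constrained quantity $S = \sum_j k_j^2 u_j^2$, and a generic "drift" $d_j$ standing in for transport plus forcing, so that $\dot u_j = d_j - \alpha\, k_j^2 u_j$. Requiring $\dot S = 0$ gives $\alpha = \sum_j k_j^2 u_j d_j / \sum_j k_j^4 u_j^2$. All names (`gauss_multiplier`, `constraint_rate`, `k2`, `drift`) are hypothetical.

```python
def gauss_multiplier(k2, u, drift):
    """Multiplier alpha making S = sum(k2_j * u_j**2) stationary.

    k2    : squared wave numbers k_j**2
    u     : real mode amplitudes u_j
    drift : non-dissipative part d_j of the equation of motion
    """
    numerator = sum(k * x * d for k, x, d in zip(k2, u, drift))
    denominator = sum(k * k * x * x for k, x in zip(k2, u))
    return numerator / denominator


def constraint_rate(k2, u, drift):
    """dS/dt along du_j/dt = d_j - alpha * k2_j * u_j with the Gaussian alpha."""
    alpha = gauss_multiplier(k2, u, drift)
    return sum(2.0 * k * x * (d - alpha * k * x) for k, x, d in zip(k2, u, drift))


k2 = [1.0, 4.0, 9.0]
u = [0.3, -0.2, 0.1]
drift = [1.0, 0.5, -0.7]
alpha = gauss_multiplier(k2, u, drift)
rate = constraint_rate(k2, u, drift)  # vanishes up to rounding errors
```

Note that the multiplier changes sign when the sign of `drift` is reversed at fixed `u`: unlike a constant viscosity it is not positive definite, which is the feature that makes the thermostatted dynamics compatible with reversibility.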
We shall call "local' an observable F(u) that depends only on finitely many Fourier components of the velocity field u (this is locality in momentum space) and £ is the family of the local observables. The analogy with statistical mechanics is quite manifest if the following conjecture holds, 12>13, Equivalence conjecture: Let U\,R £ £ and us,R £ £ be corresponding elements of the viscosity and vorticity ensembles, i.e. if S and A are related by Eq. (3.4), then it is
(3-5)
lim ^ 4 S = 1 for all local observables F £ £ with non zero average.
In other words the statistics of the irreversible NS equation Eq. (1.1) and of the reversible "Gaussian NS equation" Eq. (3.1), called GNS equation, form two equivalent ensembles of stationary states of the fluid. By "reversible" we mean that there is an involutory map $I$ of phase space which anticommutes with time evolution, i.e.
$$I^2 = 1, \qquad I S_t = S_{-t} I \quad \text{for all } t \tag{3.6}$$
and the GNS equations are reversible because $\nu_G(u)$ is odd under the transformation $Iu(x) = -u(x)$.
A similar conjecture has been proposed in certain models of non equilibrium statistical mechanics, [12, 31, 20].
For reference purposes we write explicitly the GNS equations with the OK41 cut-off:
$$\dot u_k + iR \sum_{\substack{k_1 + k_2 = k \\ |k_i| < K(R)}} (u_{k_1}\cdot k_2)\, \Pi_k u_{k_2} = -\nu_G(u)\, k^2 u_k + f_k \tag{3.7}$$
with the same notations of Eq. (2.1).
The analogy with equilibrium theory of ensembles is: the parameter $R$ plays the role of the volume while $\lambda$ that of the temperature and $S$ that of the energy. Therefore the viscosity ensemble is the analogue of the canonical ensemble and the vorticity ensemble is the analogue of the microcanonical ensemble. The limit $R \to \infty$ is analogous to the "thermodynamic limit", [17]. We see also why it is useful to introduce the parameter $\lambda$: if we stick to $\lambda = 1$ then effectively we consider only a single stationary state $\mu_{1,R} = \mu_R$ and not an ensemble: this state is "the same" (in the sense Eq. (3.5)) as the state $\mu_{S_R,R}$ if $S_R$ is so defined that $\mu_{S_R,R}(\nu_G) = 1$. The parameter $\lambda$ will be set to its physical value 1 from now on.
The conjecture of equivalence was proposed in [13] and discussed in several other papers, see for instance [14]. It has been investigated by simulations in [26] with results that seem moderately satisfactory.
4
Time reversal and fluctuation theorem.
We now consider the NS equation Eq. (2.1) and try to find some of its properties under the assumption that it is equivalent to the corresponding GNS equation, i.e. Eq. (2.1) with $-k^2 u_k$ replaced by $-\nu_G(u)\, k^2 u_k$. We assume the chaotic hypothesis and the OK41 cut-off and furthermore that
Transitivity and axiom C: Either the full ellipsoid in phase space
$$\Bigl\{ u \;\Big|\; \sum_{|k| < K(R)} k^2 |u_k|^2 = S_R \Bigr\} \tag{4.1}$$
is densely visited by the evolutions of all data starting on it apart from a zero volume set, or alternatively the evolution on this ellipsoid verifies a geometric property called "axiom C".
Axiom C says that if the system is not transitive because there is an attracting set $A$ that is smaller than the full phase space (i.e. the ellipsoid Eq. (4.1) in this case) then, considering the simple case in which this happens because in phase space there are just a non dense attracting set and a repelling set (also not dense),
1. the attracting and the repelling sets are smooth manifolds and all their points, but a set of zero surface area, generate dense trajectories on them, and
2. the stable manifold of the points on the attracting set crosses transversally the repelling set and vice versa the unstable manifold of a point on the repelling set crosses transversally the attracting set.
For details, which will not be really necessary here, we refer to [5]. This implies that either the system is transitive or that its restriction to the attracting set is transitive.
The interest of the axiom C notion is that it is a geometric property that has a remarkable consequence for systems admitting a time reversal symmetry $I$ but with an attracting set $A$ which is not the full phase space and, therefore, is mapped by $I$ onto a repelling set $IA$ different from $A$. If the axiom holds one can define, [5], a map $P : A \leftrightarrow IA$, of the attracting set $A$ on the repelling set $IA$, which commutes with time evolution and with $I$, and
$$I P S_t = S_{-t} I P \tag{4.2}$$
i.e. the restriction of the transitive evolution $S_t$ to the attracting set $A$ is still reversible, although it is such for a new time reversal operation, namely $IP$, see [5, 16].
If a reversible evolution verifies axiom C and depends on a parameter and, as the parameter varies, it develops an attracting set $A \neq IA$ that is not the full phase space then the restriction of the evolution to the attracting set is time reversible with respect to a new time reversal symmetry.
In other words in axiom C systems time reversal symmetry $I$ cannot be really broken: if there is a spontaneous breakdown (such has to be considered the "breaking", as a parameter varies, of phase space into an attracting set $A$ smaller than phase space and a repelling set $IA$ different from $A$, [16]) a new time reversal $PI$ is "spawned".
The axiom C property is stable under perturbations: changing slightly parameters a system keeps this property if it has it to begin with. The transitivity (or axiom C) property is relevant because of the following theorem
Theorem (fluctuation theorem): Let
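$$p = \frac{1}{\tau \langle\sigma\rangle_+} \int_{-\tau/2}^{\tau/2} \sigma(S_t u)\, dt \tag{4.3}$$
which we call "average over a time span $\tau$ of the (dimensionless) phase space contraction at $u$" has a probability of being in the interval $[p, p + dp]$ of the form
$$\pi_\tau(p)\, dp = \mathrm{const}\; e^{\zeta(p)\tau + O(1)}\, dp \qquad \text{with} \qquad \zeta(-p) = \zeta(p) - \langle\sigma\rangle_+\, p \tag{4.4}$$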
for all $p$. This theorem can be found in [10] for evolution maps and in [21] for flows (which is the version we use here): see also [30]. The quantity $p$ depends on $u$. The quantity $\langle\sigma\rangle_+$ is also called "average entropy creation rate" and $p = p(u)$ is the dimensionless entropy creation rate averaged over a time $\tau$ and in the point $u$: see [1, 29, 23, 30]. We recall that entropy in systems out of equilibrium is not defined (yet), so that this name need not be taken too seriously and it might eventually reveal itself inappropriate.
The above result should not be confused (as it is conceptually and technically different) with other apparently similar statements, see [6]. It was discovered as an experimental relation in a numerical simulation, [9], where the role of the SRB distributions and of time reversal were also suggested to be a possible reason for its validity: this led to its proof for Anosov maps in [10] and for flows in [21].
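The symmetry $\zeta(-p) = \zeta(p) - \langle\sigma\rangle_+\, p$ can be made concrete with a simple worked case. As an assumption made purely for illustration (the theorem does not say that fluctuations are Gaussian), take the quadratic rate function $\zeta(p) = -\langle\sigma\rangle_+ (p-1)^2/4$, which is maximal at $p = 1$; the Python sketch below checks that it satisfies the symmetry and evaluates the logarithm of the ratio $\pi_\tau(-p)/\pi_\tau(p)$. The function names are hypothetical.

```python
def zeta_gaussian(p, sigma_plus):
    """Quadratic (Gaussian) rate function, an illustrative assumption."""
    return -sigma_plus * (p - 1.0) ** 2 / 4.0


def symmetry_defect(p, sigma_plus):
    """zeta(-p) - zeta(p) + sigma_plus * p; zero if the symmetry holds."""
    return zeta_gaussian(-p, sigma_plus) - zeta_gaussian(p, sigma_plus) + sigma_plus * p


def log_probability_ratio(p, sigma_plus, tau):
    """log of pi_tau(-p) / pi_tau(p) to leading order in tau."""
    return tau * (zeta_gaussian(-p, sigma_plus) - zeta_gaussian(p, sigma_plus))


# With p = 1, sigma_plus = 0.5 and tau = 10 the log-ratio is -sigma_plus * tau.
example = log_probability_ratio(1.0, 0.5, 10.0)
```

In this example the reversed value $p = -1$ is suppressed by a factor $e^{-\langle\sigma\rangle_+ \tau}$ relative to the typical value $p = 1$, with no adjustable parameter entering the ratio.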
A key feature of the theorem is that it contains no free parameters: its generality makes it a mechanical identity in the same sense, although of course of not comparable importance, as the heat theorem of Boltzmann, [2, 3], see also [17]. In the case of axiom C systems Eq. (4.4) still holds, because the evolution restricted to $A$ is transitive and reversible by Eq. (4.2), but $\sigma$ has to be replaced by the contraction rate
(4-5)
for all $p$. The GNS case is not among the (important) cases in which $\sigma$ and $\sigma_0$ are proportional, see [8, 32, 5, 4]; although heuristic arguments can be given, [13], suggesting that nevertheless a relation like Eq. (4.5) might hold. In some cases in which proportionality between $\sigma$ and $\sigma_0$ can be established, at least on heuristic grounds, the proportionality factor is just $1 - d(A)/d$ if $d$ is the dimension of phase space and $d(A)$ is the dimension of the attractor, but unfortunately such cases are very rare, [5, 4]. Finally it is worth writing explicitly the expression of the phase space contraction
k2)^G(u)-([Aip-Audx)(f[(Au)2-
E JV
\!L\
- RAu
JV
• (A(u • du)) - R(Au) ■ ( A M ) • (<9«) - RAu
■ (Adu)u
+
(4.6)
■ A2u]dx
) / / (Au)2dx Jv In this expression (straightforwardly derived by imposing that S is exactly constant on motions verifying Eq. (3.7)) the first term seems to be the dominant one at large R so that + i/(u)Au
a(u)~(
^
i2)^(M)=f/C(«)^G(w),
K(R) oc R3K°+2
(4.7)
\k\
which, if the side LQ of the box is not L = 1 would be written with K(R) replaced by K.Lo(R) - E|*|
Fluctuation patterns and an extension of Onsager-Machlup fluctuations theory.
A physical interpretation of the fluctuation theorem, when it holds, can be found along with proposals for its test in experiments. We need first some consequences of the (technique of proof of the) fluctuation theorem. Given an observable $F(u)$ we say that in its evolution observed over a time interval of size $\tau$ it follows a pattern $t \to h(t)$ if $F(S_t u) = h(t)$ for $t \in [-\tau/2, \tau/2]$.
We assume that $F$ has well defined time reversal parity $\varepsilon = \pm 1$: $F(Iu) = \varepsilon F(u)$, for simplicity; and we say that the pattern $Ih(t) = \varepsilon h(-t)$ is the time reversed pattern of $h$.
Fluctuation patterns are the main object of analysis in the theory of Onsager-Machlup, which deals with the probability of observing a fluctuation pattern $h$ for an observable in the linear response regime (i.e. strictly speaking it deals with derivatives of various quantities with respect to the strength of the forcing terms evaluated at 0 forcing). The following theorem can be regarded as a result of the same type without the restriction that the system is in the linear response regime.
Theorem (entropy creation as a fluctuations driver): Consider a time reversible evolution verifying the chaotic hypothesis and transitivity. Let $H, K$ be two local observables (of given time reversal parity) and denote $\mu_{R,\tau,p}$ the SRB distribution conditioned to a (dimensionless) phase space average contraction $p$ over a time span $\tau$. Let $h, k$ be two fluctuation patterns for $H, K$ and let $Ih, Ik$ be their time reversal patterns. Then if $\mu_{R,\tau,p}(\text{pattern of } H = h)$ denotes the probability that $H$ follows the pattern $h$ in the time span $\tau$ in which the average dimensionless phase space contraction is $p$ it is
$$\frac{\mu_{R,\tau,p}(\text{pattern of } H = h)}{\mu_{R,\tau,p}(\text{pattern of } K = k)} = \frac{\mu_{R,\tau,-p}(\text{pattern of } H = Ih)}{\mu_{R,\tau,-p}(\text{pattern of } K = Ik)}$$
for large $\tau$. If the system verifies axiom C the same result holds with $IP$ (see Eq. (4.2)) replacing $I$ (without requiring any relation between the total phase space contraction rate and the rate of contraction of the surface of the attractor).
In other words the relative probabilities of fluctuation patterns of $H$ and $K$ observed in a time span $\tau$ during which the average entropy creation rate is $p\langle\sigma\rangle_+$ are the same as those of the time reversed patterns in a time span $\tau$ in which the average entropy creation rate is the opposite: $-p\langle\sigma\rangle_+$.
This allows us to give a physical interpretation to $p$: namely if we look at the evolution on time lapses of size $\tau$ we see that the average entropy creation rate $p$ will usually be $p = 1$; observing $p \neq 1$ will be rare and the fraction of times we shall observe it is $e^{(\zeta(p) - \zeta(1))\tau}$: hence events in which $p \neq 1$ will be rare and random (i.e. intermittent) and they will take place at rate $\zeta(1) - \zeta(p)$.
The above theorem shows that when $p$ is significantly different from 1 "things go very wrong". The frequency of findings of a time interval of size $\tau$ during which the time reversed patterns are relatively as probable as the normal patterns will be given by $e^{-\langle\sigma\rangle_+ \tau}$ no matter which observable $H$ we look at: an independence property that can be checked, in principle, in an experiment.
Hence a physical interpretation of $p$ is that it is a quantitative measurement of the degree of reversibility that is observed. The larger $1 - p$ is the more "unintuitive behavior" will be observed. For $p = -1$ everything would be dramatically different from what is expected.(a) The time intervals during which anomalous behavior is observed are rare so that their manifestation is intermittent and we call this phenomenon "entropy driven intermittency": the function $\zeta(p)$ describes the phenomenon quantitatively.
(a) "If entropy creation rate could be changed in sign for a minute around Niagara falls then during that minute their water would be more likely to go up rather than down". One "just" has to change the sign of the entropy creation rate, no extra effort needed!
6
Entropy driven intermittency. Observability.
We now address the question: is this intermittency observable?; is its rate function C(p) measurable? Clearly oo so t h a t there should be serious doubts about the observability of so rare fluctuations. However if we look at a small subsystem in a little volume Vo of linear size L0 we can regard it again as a fluid enclosed in a box VQ described by the same reversible G N S equations. We can imagine, therefore, t h a t this small system also verifies a fluctuation relation in the sense that if, c.f.r. Eq. (4.7), Eq. (3.3) av0{u)
-
fCLo(R)i/G(u) fVo{
VG(U)
(6.1)
Jv„(A«) 2 <^
then it should be that the fluctuations of <x averaged over a time span r are con trolled by rate functions Cv(p) and Cv0(p) that we can expect to be, for R large CV(P)=C(P)£L(.R)I
Cv0(p)=C(p)£L0(fl), (
and
(6-2)
v+JCLo(R)
We recall t h a t , c.f.r. Eq. (4.7), ICL0(R) — Tl\k\
(6.3)
for T large, where the local fluctuation rate C(p) verifies (assuming transitivity or axiom C) C(-P)=C(P)-5>P*9
(6.4)
with i) = 1 in the transitive case and perhaps ^ 1 when the attracting set is smaller t h a n phase space. Therefore by observing the frequency of intermittency one can gain some access to the function C(p)Note t h a t one will necessarily observe a given fluctuation somewhere in the fluid if Lo is taken small enough: in fact the entropy driven intermittency takes place not only in time but also in space. T h u s we shall observe inside a box of size LQ
"somewhere" in the total volume V of the system a fluctuation of size 1 — p with high probability if (L/LofeGW-fmrKLoW ~i
(6.5)
and the special event p = — 1 will occur with high probability if (L/Z, 0 ) 3 e _ 5 f + T ' C t o(*)~i
(6.6)
by Eq. (6.4). Once this event is realized the fluctuation patterns will have relative probabilities as described in §5. An attempt at interpreting the experiment performed by Ciliberto and Laroche on convecting water at room temperature in terms of the above theory is in [20]. The idea and the possibility of local fluctuation theorems have been developed and tested first numerically, [22], and then theoretically, [19], by showing that it indeed works at least in some models (with homogeneous dissipation like the GNS and NS equations) which are simple enough to allow us to build a mathematically complete theory of the phase space contraction fluctuations. Of course if the quantity (
3. Boltzmann, L.: Über die Eigenschaften monozyklischer und anderer damit verwandter Systeme, in "Wissenschaftliche Abhandlungen", ed. F.P. Hasenöhrl, vol. III, p. 122-152, Chelsea, New York, 1968, (reprint).
4. Bonetto, F., Gallavotti, G.: Reversibility, coarse graining and the chaoticity principle, Communications in Mathematical Physics, 189, 263-276, 1997.
5. Bonetto, F., Gallavotti, G., Garrido, P.: Chaotic principle: an experimental test, Physica D, 105, 226-252, 1997.
6. Cohen, E.G.D., Gallavotti, G.: Note on Two Theorems in Nonequilibrium Statistical Mechanics, Journal of Statistical Physics, 96, 1343-1349, 1999.
7. Ciliberto, S., Laroche, C.: An experimental verification of the Gallavotti-Cohen fluctuation theorem, Journal de Physique, 8, 215-222, 1998.
8. Dettmann, C.P., Morriss, G.P.: Proof of conjugate pairing for an isokinetic thermostat, Physical Review 53 E, 5545-5549, 1996.
9. Evans, D.J., Cohen, E.G.D., Morriss, G.P.: Probability of second law violations in shearing steady flows, Physical Review Letters, 71, 2401-2404, 1993.
10. Gallavotti, G., Cohen, E.G.D.: Dynamical ensembles in nonequilibrium statistical mechanics, Physical Review Letters, 74, 2694-2697, 1995. [And: Dynamical ensembles in stationary states, Journal of Statistical Physics, 80, 931-970, 1996].
11. Gallavotti, G.: Chaotic hypothesis: Onsager reciprocity and fluctuation-dissipation theorem, Journal of Statistical Physics, 84, 899-926, 1996.
12. Gallavotti, G.: New methods in nonequilibrium gases and fluids, Open Systems and Information Dynamics, Vol. 6, 101-136, 1999 (original in chao-dyn #9610018).
13. Gallavotti, G.: Dynamical ensembles equivalence in fluid mechanics, Physica D, 105, 163-184, 1997.
14. Gallavotti, G.: Ipotesi per una introduzione alla Meccanica dei Fluidi, "Quaderni del CNR-GNFM", vol. 52, p. 1-428, Firenze, 1997. English translation in progress: available at http://ipparco.roma1.infn.it.
15. Gallavotti, G.: Chaotic dynamics, fluctuations, non-equilibrium ensembles, Chaos, 8, 384-392, 1998. See also [12].
16. Gallavotti, G.: Breakdown and regeneration of time reversal symmetry in nonequilibrium Statistical Mechanics, Physica D, 112, 250-257, 1998.
17. Gallavotti, G.: Statistical Mechanics, Springer Verlag, 1999.
18. Gallavotti, G.: Fluctuation patterns and conditional reversibility in nonequilibrium systems, Annales de l'Institut H. Poincaré, 70, 429-443, 1999.
19. Gallavotti, G.: A local fluctuation theorem, Physica A, 263, 39-50, 1999. And Chaotic Hypothesis and Universal Large Deviations Properties, Documenta Mathematica, extra volume ICM98, vol. I, p. 205-233, 1998, also in chao-dyn 9808004.
20. Gallavotti, G.: Ergodic and chaotic hypotheses: nonequilibrium ensembles in statistical mechanics and turbulence, chao-dyn # 9905026; and Non equilibrium in statistical and fluid mechanics. Ensembles and their equivalence. Entropy driven intermittency., chao-dyn # 0001???.
21. Gentile, G.: Large deviation rule for Anosov flows, Forum Mathematicum, 10, 89-118, 1998.
58
22. Gallavotti, G., Perroni, F.: An experimental test of the local fluctuation theorem in chains of weakly interacting Anosov systems, preprint, 1999, in http://ipparco. romal. infn. it at the 1999 page. 23. Gallavotti, G., Ruelle, D.: SRB states and non-equilibrium statistical mech anics close to equilibrium, Communications in Mathematical Physics, 190, 279-285, 1997. 24. Hoover, W. G.: Time reversibility, Computer simulation, and Chaos, World Scientific, 1999. 25. Landau, L., Lifchitz, E.: Mecanique des fluides, MIR, Moscou, 1971. 26. Rondoni, L., Segre, E.: Fluctuations in two dimensional reversibly damped turbulence, Nonlinearity, 12, 1471-1487, 1999. 27. Ruelle, D.: A measure associated with Axiom A attractors, American Journal of Mathematics, 98, 619-654, 1976. 28. Ruelle, D.: Sensitive dependence on initial conditions and turbulent behavior of dynamical systems, Annals of the New York Academy of Sciences, 356, 408416, 1978. This is the first place where the hypothesis analogous to the later chaotic hypothesis was formulated (for fluids): however the idea was exposed orally at least since the talks given to illustrate the technical work 27 , which appeared as a preprint and was submitted for publication in 1973 but was in print three years later. 29. Ruelle, D.: Positivity of entropy production in nonequilibrium statistical mech anics, Journal of Statistical Physics, 85, 1-25, 1996. And Ruelle, D.: Entropy production in nonequilibrium statistical mechanics, Communications in Math ematical Physics, 189, 365-371, 1997. 30. Ruelle, D.: Smooth dynamics and new theoretical ideas in non-equilibrium stat istical mechanics, Journal of Statistical Physics, 95, 393-468, 1999. 31. Ruelle, D.: A remark on the equivalence of isokinetic and isoenergetic thermo stats in the thermodynamic limit, preprint, to appear in Journal of Statistical Physics, 1999. 32. 
Woitkowsky, M.P., Liverani, C : Conformally Symplectic Dynamics and Sym metry of the Lyapunov Spectrum, Communications in Mathematical Physics, 194, 47-60, 1998.
Internet: Author's preprints at: http://ipparco.roma1.infn.it
e-mail: giovanni.gallavotti@roma1.infn.it
Archived: mp_arc #00-15; nlin.CD/0001021 (nlin@xyz.lanl.gov)
SUPERSTRINGS AND THE UNIFICATION OF THE PHYSICAL FORCES
MICHAEL B. GREEN
DAMTP, Wilberforce Road, Cambridge CB3 0WA
E-mail: [email protected]
1 Introduction
At the turn of the twentieth century there were few hints of the imminent arrival of relativity, despite obvious theoretical shortcomings of the aether theory, while quantum theory was even less anticipated, despite the obvious need for a microscopic theory of matter. The century that followed was one of spectacular progress.
The discovery of the electron in 1897 marked the beginning of elementary particle physics. The obvious problems associated with point charges in classical electromagnetism notwithstanding, the standard picture of the 'elementary' particles that has evolved over the course of the century has been based on point-like fundamental constituents. Thus, the quarks, leptons and bosons of the Standard Model are described as point-like quanta of the Yang-Mills quantum field theories that describe the electromagnetic, weak and strong forces. The problem of the classical divergences of electromagnetism is replaced in quantum field theory by the somewhat milder problem associated with ultraviolet divergences. While the renormalization procedure treats these divergent quantities consistently, it does not pretend to provide a fundamental description of physics at the shortest possible distance scales. Consequently, the Standard Model has a large number of undetermined parameters. Furthermore, it is not a unified theory since it treats the electro-weak and strong forces separately and does not even attempt to describe the force of gravity.
Gravity itself was transformed in the early part of the last century by its synthesis with space-time geometry in the theory of general relativity, which raised well-known new issues relating to classical singularities. The divergences due to point masses in Newtonian gravity are replaced by the singularities of general relativity. One possible resolution of this problem is to assume that singularities are always hidden behind event horizons and are therefore inaccessible to measurement. However, the presence of event horizons raises profound new issues in the context of quantum mechanics. The original arguments due to Hawking appear to imply that the quantum mechanical S-matrix cannot be defined in the presence of an event horizon: a black hole formed by coherent quantum states describing in-falling matter subsequently decays by radiating an incoherent statistical mixture of outgoing matter in the form of Hawking radiation. This apparent disaster was based on a semi-classical argument in which the matter is quantized in a fixed gravitational background. Clearly a complete analysis would require that the gravitational sector is also treated quantum mechanically. But the ultraviolet properties of any straightforward quantization of the gravitational field are disastrous: they lead to non-renormalizable divergences that point to the necessity for a radical change of classical geometry at distances close to the Planck scale. This apparent incompatibility of quantum mechanics and general relativity has proved a very fertile area for testing the fundamental principles of any underlying theory of quantum gravity.
String theory is a framework that encompasses just such a radical change in the short distance nature of the physical laws. One of the most compelling features of string theory is the intuitive simplicity of its original formulation. In a nutshell, string theory is a generalization of all earlier particle theories in which the fundamental particles are no longer described as points but arise as different modes of excitation of an extended, string-like, object. This very simple paradigm has two immediate and dramatic consequences:
• The elementary particles are unified: they are identified with different modes of a single kind of string.
• String theory has no ultraviolet divergences: the perturbative quantum theory is well-defined!
At distances large compared to the Planck distance string theory necessarily describes the familiar laws of general relativity coupled to Yang-Mills theory, but the non-zero string scale leads to new physics at smaller distances.
This description of string theory is based on a semi-classical perturbative formulation in which the string is viewed as a particle moving through a fixed background geometry. Scattering amplitudes are defined by a series of diagrams that are analogous to the Feynman diagrams that define perturbative approximations to conventional quantum field theories. Whereas a particular Feynman diagram is a sum over histories (world-lines) of point-like particles that interact at singular nodes, a string diagram describes the continuous world-sheet swept out by the interacting strings.
This sheet is a two-dimensional Riemann surface whose genus is the number of loops that defines the order of the diagram. There is therefore a very elegant and economical classification of string diagrams: one independent diagram at each order in perturbation theory.
It appears that supersymmetry is a crucial ingredient in string theory. This extension of Poincaré symmetry to space-times with extra fermionic dimensions automatically relates fermionic states to bosonic states and is therefore a profound unifying principle. But supersymmetry is not only aesthetically pleasing; it has become an essential ingredient in almost any attempt to unify the forces beyond the Standard Model. At least up to now, the only consistent versions of string theory are those that embody supersymmetry.
Although the series of superstring diagrams has an elegant description in terms of two-dimensional surfaces embedded in space-time, this is only a perturbative approximation to some underlying structure that must include a description of the quantum geometry of the target space as well as the strings propagating through it. It is this 'nonperturbative' description of the theory that is at the heart of developments of the last few years. An impressive feature of these developments is the manner in which they combine many ideas in theoretical physics that were developed some time ago, often for unrelated reasons. Of central importance is the emerging network of deep relationships between Yang-Mills quantum field theory and quantum gravity.
This article will begin with an overview of the development and structure of perturbative string theory. This will be followed by a review of some of the fascinating developments that have led to the understanding of non-perturbative features of the theory. These are based largely on the discovery of duality transformations that generalize the duality between electricity and magnetism that is present in certain supersymmetric Yang-Mills field theories. The non-perturbative extension of string theory, which is related to a quantum version of eleven-dimensional supergravity, has come to be known as 'M theory'. Some of the recent ideas that have arisen in trying to decipher M theory and to formulate it in terms of more fundamental principles will be surveyed. Although this story is by no means complete, the recent developments offer a number of fascinating insights into previously intractable problems.
The bibliography at the end of this article gives a list of books, technical review articles and more general articles from which detailed references to the original literature can be obtained.
2 String perturbation theory
2.1 The origins of string theory
The fact that it took sixty-five years from the formulation of special relativity to develop a theory of a relativistic string is remarkable given the long-standing problems with point particles. Even more remarkable is the manner in which string theory emerged directly out of the experimental data of the late 1960's. This was an era of confusion in quantum field theory. Despite the striking successes of quantum electrodynamics there seemed little hope of explaining the weak or strong forces by conventional field theory. In particular, the large number of hadronic resonances (mesons and baryons) showed systematic regularities that could seemingly not be accounted for by a local field theory. Fig. 1 demonstrates a very striking linear relationship between the spin and (mass)$^2$ of a sequence of meson resonances which form a straight-line 'Regge trajectory',
$$J = \alpha_0 + l_s^2\,(\mathrm{mass})^2, \tag{1}$$
where the intercept has the value $\alpha_0 \sim 0.5$ and the string scale $l_s$ is a new dimensional quantity that defines the rest tension of the string, $T = (2\pi l_s^2)^{-1}$, and sets the scale of the mass splittings (around 1 GeV/$c^2$ in the case of the mesons). This straight line also extends to the region of negative (mass)$^2$, which is measured in high energy hadronic scattering at small momentum transfer. Many other meson and baryon resonances form similar linear Regge trajectories.
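Relation (1) can be made concrete with a minimal Python sketch that evaluates a linear Regge trajectory and the associated string tension. The intercept 0.5 is the value quoted above; the slope `L_S_SQUARED = 1.0`, in units of (GeV/$c^2$)$^{-2}$, is only an illustrative assumption suggested by the quoted scale of the mass splittings, not a fitted value, and the function names are hypothetical.

```python
import math

ALPHA_0 = 0.5       # intercept quoted in the text (spin is dimensionless)
L_S_SQUARED = 1.0   # ASSUMED slope l_s^2 in (GeV/c^2)^-2, for illustration only


def regge_spin(mass_squared, alpha_0=ALPHA_0, l_s_squared=L_S_SQUARED):
    """Spin J on the trajectory J = alpha_0 + l_s^2 * (mass)^2."""
    return alpha_0 + l_s_squared * mass_squared


def mass_squared_for_spin(spin, alpha_0=ALPHA_0, l_s_squared=L_S_SQUARED):
    """Invert the trajectory: (mass)^2 at which a given spin appears."""
    return (spin - alpha_0) / l_s_squared


def string_tension(l_s_squared=L_S_SQUARED):
    """Rest tension T = 1 / (2 * pi * l_s^2)."""
    return 1.0 / (2.0 * math.pi * l_s_squared)


if __name__ == "__main__":
    for spin in range(1, 7):
        print(spin, mass_squared_for_spin(spin))
    print("tension:", string_tension())
```

With these assumed numbers, successive integer spins are separated by equal steps in (mass)$^2$, which is the straight-line pattern described in the text.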
Fig. 1: The experimental spectrum of the mesonic resonances on the leading Regge trajectory, which is a straight line. (Axes: spin $J$ against (mass)$^2$ in (GeV/$c^2$)$^2$.)
This pattern, together with the fact that the resonances are relatively long lived and combined with data on the high energy scattering of hadrons, led to a qualitative understanding of the systematics of hadronic scattering amplitudes. The Veneziano model, which appeared in 1968, was a very simple guess for a particular meson scattering matrix element, but it took a couple of years of rapid developments before it was realized that this is actually the Born approximation to the scattering amplitude of relativistic strings. Straight-line Regge trajectories such as (1) are a characteristic feature of a relativistic string, where the intercept $\alpha_0$ depends on details of the model. In this picture the meson resonances were thought of as modes of excitation of an open string with free end-points that carry the quark and antiquark charges. Closed strings described hadrons with no quark quantum numbers, nowadays referred to as 'glueballs'. These developments followed from a formulation of string dynamics by Nambu and Goto that was based on an action that was taken to be the area of the string world-sheet embedded in Minkowski space-time,
$$S = -\frac{1}{2\pi l_s^2}\int d^2\xi\,\sqrt{-\det g}, \tag{2}$$
which is the simplest reparameterization-invariant function of the world-sheet coordinates $\xi^\alpha = (\sigma, \tau)$. The induced metric, $g$, is defined in terms of the embedding coordinates, $X^\mu(\xi)$ ($\mu = 0, 1, \dots, d-1$), by
$$g_{\alpha\beta} = \eta_{\mu\nu}\,\partial_\alpha X^\mu\,\partial_\beta X^\nu, \tag{3}$$
where $\eta_{\mu\nu}$ is the $d$-dimensional Minkowski metric. It is surprising that, despite the intense interest in relativistic dynamics, it took over sixty years from the formulation of special relativity to the formulation of the classical dynamics of a relativistic string (although Dirac had formulated a similar action for a relativistic membrane in the early 1960's$^a$).
The theory developed rapidly in the early 1970's and was generalized in various ways to include internal structure on the string. A notable development was the inclusion of fermionic modes propagating on the world-sheet, which gave a string spectrum with fermionic as well as bosonic states. Although not fully appreciated at the time, this was to lead to the version of the theory with supersymmetry on the two-dimensional world-sheet.
However, the early formulations of string theory as a possible theory of extended hadrons suffered from all sorts of problems. Among these was the fact that the vacuum of the original bosonic and fermionic theories was ill-defined due to the presence of tachyonic states. This problem was eventually resolved in the context of the fermionic string by realizing that the sum over the spin structures of the world-sheet fermions projects out the tachyon, giving a stable theory with space-time supersymmetry in addition to supersymmetry on the two-dimensional world-sheet.
The major problem for string theory as a theory of hadrons was the fact that an open string possesses a massless spin-one state that is interpreted as a Yang-Mills gauge particle, while a closed string possesses a massless spin-two state that is interpreted as a graviton. Since there are no massless spin-one or spin-two states in the spectrum of hadrons, this spelled disaster for the idea that strings describe hadrons in any obvious manner.
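The induced metric (3) and the area density that enters the Nambu-Goto action (2) can be illustrated with a small Python sketch. It takes the two tangent vectors $\partial_\tau X^\mu$ and $\partial_\sigma X^\mu$ at one point of the world-sheet and returns $g_{\alpha\beta}$ and $\sqrt{-\det g}$. The signature $(-,+,+,+)$, the choice $d = 4$ and all function names are assumptions made for brevity; the text does not fix them.

```python
import math

# ASSUMED Minkowski metric eta = diag(-1, +1, +1, +1), with d = 4 for brevity.
ETA = [-1.0, 1.0, 1.0, 1.0]


def minkowski_dot(u, v):
    """Contract two space-time vectors with the diagonal metric ETA."""
    return sum(e * a * b for e, a, b in zip(ETA, u, v))


def induced_metric(dx_dtau, dx_dsigma):
    """2x2 induced metric g_ab = eta_mu_nu d_a X^mu d_b X^nu, ordered (tau, sigma)."""
    g_tt = minkowski_dot(dx_dtau, dx_dtau)
    g_ts = minkowski_dot(dx_dtau, dx_dsigma)
    g_ss = minkowski_dot(dx_dsigma, dx_dsigma)
    return [[g_tt, g_ts], [g_ts, g_ss]]


def area_density(g):
    """Integrand sqrt(-det g) of the area action."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return math.sqrt(-det)


if __name__ == "__main__":
    # Static straight string X = (tau, sigma, 0, 0).
    print(area_density(induced_metric([1, 0, 0, 0], [0, 1, 0, 0])))
    # Same string moving transversely with velocity v: X = (tau, sigma, v*tau, 0).
    v = 0.6
    print(area_density(induced_metric([1, 0, v, 0], [0, 1, 0, 0])))
```

For the transversely moving string the density is reduced by the factor $\sqrt{1 - v^2}$, the string analogue of the proper-time factor in the action of a relativistic point particle.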
Following the advent of QCD in the early 1970's it was understood that the hadrons could be described in terms of confined quarks and gluons, which are the constituents of a conventional Yang-Mills field theory. In this picture the mesons can indeed be viewed, at least in some approximation, as the modes of excitation of an effective string describing a tube of confined colour electric flux. The precise relationship of this effective 'hadronic' string to the 'fundamental' string has yet to be completely elucidated.
The presence of the Yang-Mills and graviton states in the string spectrum was unfortunate for a theory of hadrons but fits in beautifully with the idea that string theory is a unified theory of all the forces, including gravity. Indeed it was shown in the mid 1970's that simple S-matrix elements for the scattering of these string theory states are the same as those of general relativity coupled to Yang-Mills field theory. We thus see that the forces we are anxious to describe, general relativity coupled to Yang-Mills, emerge automatically from the quantum mechanical theory of an extended object, a (super)string. No such conditions arise when considering the relativistic quantum mechanics of point-like particles.
2.2 The superstring spectrum
It was not until the early 1980's that supersymmetric string perturbation theories were explicitly constructed and shown to have some remarkable properties. The superstring spectrum consists of an infinite sequence of higher-spin massive fermionic and bosonic states. At every mass level there are equal numbers of fermionic and bosonic excitations, as required by supersymmetry. A particularly striking feature of the spectrum is that the massless ground states include the various kinds of particles that are required in the Standard Model (quarks, leptons, gauge bosons, scalar bosons) together with the graviton and gravitino that are characteristic of supergravity. The string mass scale is now to be interpreted as a scale close to the Planck scale. This means that the massive states decouple at the low energies attained in terrestrial conditions, leaving an effective 'low energy' field theory of some version of supergravity coupled to Yang-Mills fields of the kind that enter into more conventional unified field theories. However, deviations from the classical laws arise at short distances, where the size of the string becomes important. This makes the short-distance properties of string theory much less singular than in ordinary quantum field theory, and the problematic ultraviolet infinities of traditional theories of quantum gravity do not arise in superstring theory.
$^a$ P.A. Dirac, An Extensible Model of the Electron, Proc. Roy. Soc. Lond. A 268, 57 (1962).
There are several different types of perturbative superstring theories that are simplest to express in ten space-time dimensions. The 'type IIA' and 'type IIB' theories are closed-string theories with extended ($\mathcal{N} = 2$) ten-dimensional supersymmetry with no gauge symmetries when the ten-dimensional space-time is flat. The 'type I' theory is a theory of unoriented open strings coupled to closed strings which has $\mathcal{N} = 1$ ten-dimensional supersymmetry. The open-string sector is charged under the gauge group $SO(32)$. Finally, there are two further $\mathcal{N} = 1$ theories in ten dimensions which are called 'heterotic' closed-string theories and carry the gauge groups $SO(32)$ and $E_8 \times E_8$, respectively. All of these have quantum loop expansions that are free of divergences as well as the potentially disastrous chiral gauge and gravitational anomalies.
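The statement that each mass level contains equal numbers of bosonic and fermionic excitations can be checked at the massless level with a few lines of Python. The sketch below uses the standard light-cone counting of physical states in ten dimensions, which is an assumption brought in from outside the text: a massless vector has 8 transverse polarizations, and a Majorana-Weyl spinor has 16 real components, of which half survive on shell. The variable names are hypothetical.

```python
# ASSUMPTION: standard light-cone counting of physical states in d = 10.
D = 10
TRANSVERSE = D - 2                 # 8 transverse directions

# Massless open-string level: one vector and one Majorana-Weyl spinor.
open_bosons = TRANSVERSE           # 8 vector polarizations
open_fermions = 16 // 2            # 16 real components, halved on shell

# Massless closed-string level (type II): product of left- and
# right-moving (8 bosonic + 8 fermionic) states.
closed_bosons = open_bosons * open_bosons + open_fermions * open_fermions
closed_fermions = 2 * open_bosons * open_fermions

if __name__ == "__main__":
    print("open:  ", open_bosons, "bosons,", open_fermions, "fermions")
    print("closed:", closed_bosons, "bosons,", closed_fermions, "fermions")
```

Under these assumptions the open-string massless level has 8 + 8 states and the type II closed-string massless level has 128 + 128, so the numbers match in both cases.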
Fig. 2: (a) A one-loop Feynman diagram of conventional quantum field theory. (b) The corresponding perturbation theory diagram in a closed-string theory. A single surface, or world-sheet, is swept out as two initial strings interact to produce two final strings. The absence of nodes in this process implies the absence of the ill-defined infinities of conventional Feynman diagrams.
These five superstring perturbation expansions about ten-dimensional Minkowski space are now known to be distinct perturbative approximations to the same underlying theory, rather than distinct theories. Scattering amplitudes are calculated by an obvious generalization of the series of Feynman diagrams of conventional perturbative quantum field theory. Just as a conventional Feynman diagram (such as the one-loop diagram in fig. 2(a)) is a network of lines representing the trajectories swept out by point-like particles moving through space-time as they interact with each other, the stringy analogue of a Feynman diagram (fig. 2(b)) represents the two-dimensional world-sheet swept out by strings interacting as they move through space-time. The diagram is evaluated by a path integral that sums over all world-surfaces that connect the incoming string states with the outgoing ones.
The series of higher-loop diagrams defines a superstring perturbation expansion that has no ultraviolet divergences. The divergences of Feynman diagrams of ordinary quantum field theory arise from configurations in which the interaction points (the nodes in fig. 2(a)) approach each other. However, string theory world-sheets such as fig. 2(b) have no singular points and therefore no possibility of ultraviolet divergences. In the presence of supersymmetry other (infrared) divergences are also absent and the diagrams are finite. Superstring perturbation theory is so constrained that it only has a Poincaré-invariant vacuum state in the critical space-time dimension $d = d_c = 10$. However, there are very many solutions to the equations that define the theory in which the six unnecessary spatial dimensions curl up to an unobservably small size, leaving the four familiar observable dimensions.
The background space-time metrics associated with such compactifications preserve at least some of the supersymmetry and are parameterized by a number of constants, or 'moduli'. These are the constant values of the scalar fields that come from components of the metric and other tensor fields in the ten-dimensional theory. Until supersymmetry is completely broken the potential of these scalar fields is flat and their values are not determined.
2.3 Strings moving in curved background space-time
The properties of strings moving in general space-time geometries are described by the action (2) in a general background space-time metric, $G_{\mu\nu}$. In contemplating the path integral over world-surfaces it is very convenient to also introduce an intrinsic world-sheet metric, $g_{\alpha\beta}(\xi)$, so that the action can be written in the form of a coordinate-invariant two-dimensional sigma model,
$$S = \frac{1}{4\pi l_s^2}\int d^2\xi\,\Bigl(\sqrt{g}\,g^{\alpha\beta}\,G_{\mu\nu}(X) + \epsilon^{\alpha\beta}B_{[\mu\nu]}(X)\Bigr)\,\partial_\alpha X^\mu(\xi)\,\partial_\beta X^\nu(\xi) + \frac{1}{4\pi}\int d^2\xi\,\sqrt{g}\,\Phi(X)\,R^{(2)} + \cdots \tag{4}$$
Classically, this action is equivalent to (2) (suitably generalized to a curved background that also includes the antisymmetric two-form potential, $B_{\mu\nu}(X)$, and the scalar dilaton, $\Phi(X)$) and generates the same equations of motion. The dots indicate the coupling of the string world-sheet to other massless background fields, such as gauge bosons, as well as the couplings of the world-sheet fermions that are associated with supersymmetry. An $l$-loop contribution to closed-string perturbation theory is given by a functional integral over the bosonic and fermionic embedding coordinates ($X^\mu$ and $\psi^\mu$) and the metric of the form,
$$A = \int DX\,D\psi\,Dg\,\exp(-S), \tag{5}$$
where the integration is over a genus-$l$ world-sheet. The coupling of the scalar dilaton field to the two-dimensional world-sheet curvature, $\int d^2\xi\,\sqrt{g}\,\Phi\,R^{(2)}/4\pi$, is special and has been separated in (4). For constant values of the dilaton, $\Phi = \Phi_0$, this becomes $\Phi_0\,\chi$, where $\chi$ is the Euler character of the world-sheet, which is equal to $2 - 2l$. This gives a weight $e^{2\Phi_0(l-1)}$ to the path integral on a genus-$l$ world-sheet, which means that the string coupling constant $g_s$ can be defined in terms of the constant value of the dilaton field,
$$g_s = e^{\Phi_0}. \tag{6}$$
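The way the constant dilaton organizes the loop expansion can be checked numerically. The minimal Python sketch below takes the weight $e^{2\Phi_0(l-1)}$ quoted above and confirms that it equals $g_s^{2l-2}$ with $g_s = e^{\Phi_0}$, so that each extra loop costs a factor $g_s^2$. The function names and the sample value of $\Phi_0$ are hypothetical.

```python
import math


def string_coupling(phi0):
    """String coupling g_s = exp(Phi_0), as in Eq. (6)."""
    return math.exp(phi0)


def genus_weight(phi0, genus):
    """Weight exp(2 * Phi_0 * (l - 1)) of a genus-l world-sheet."""
    return math.exp(2.0 * phi0 * (genus - 1))


if __name__ == "__main__":
    phi0 = -1.5  # illustrative weak-coupling value, g_s = e^{-1.5}
    g_s = string_coupling(phi0)
    for genus in range(4):
        print(genus, genus_weight(phi0, genus), g_s ** (2 * genus - 2))
```

In this normalization the sphere ($l = 0$) carries $g_s^{-2}$, the torus ($l = 1$) carries no power of the coupling, and each further handle contributes $g_s^2$.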
More generally, it is a distinctive feature of string theory that all coupling constants are determined by expectation values of such moduli fields.
At leading order in $l_s^2$ the sigma model defined by the action (4) is invariant under two-dimensional diffeomorphisms and Weyl transformations, or rescalings of the intrinsic metric (ignoring for the moment the dilaton coupling, which is of higher order in $l_s^2$). The local diffeomorphism and Weyl symmetry may be fixed by choosing the metric to be a fiducial metric, $g = \hat{g}$, subject to well-understood topological restrictions. For example, if the world-sheet is a sphere the conformal gauge may be defined by the choice
$$g_{\alpha\beta} = \hat{g}_{\alpha\beta} = e^{-\sigma}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{7}$$
In this gauge the action (4) has a residual conformal invariance (still ignoring the dilaton term) and has no dependence on $\sigma$. This is a crucial symmetry for the quantum consistency of string theory. The conditions that are required for the conformal symmetry to persist in the quantum theory are embodied in the conditions that the renormalization group beta functions vanish, $\beta^\phi = 0$, where $\phi$ is any of the background fields, $\phi = G_{\mu\nu}, B_{[\mu\nu]}, \Phi, \dots$. The classical dilaton coupling in (4) transforms anomalously in a manner that precisely cancels a quantum anomaly in the Weyl invariance at $O(l_s^2)$ in the critical space-time dimension $d = d_c$, where $d_c = 26$ for the bosonic string and $d_c = 10$ for the superstring. For example, the conditions that the beta functions should vanish lead to an equation of the form,
$$R_{\mu\nu} - \tfrac{1}{2}\,G_{\mu\nu}R + \cdots + O(l_s^2) = 0. \tag{8}$$
To leading order at low energy the background metric is therefore required to satisfy Einstein's equations, where $\cdots$ represents contributions that depend on the other background fields that couple to the metric in a generally covariant manner. The expressions for all the $\beta^\phi$ functions can be calculated up to any given order in the inverse string tension, $l_s^2$. So we see that the conditions (8) have a beautiful geometrical interpretation as generalizations of the field equations for the fields $G_{\mu\nu}, B_{\mu\nu}, \Phi, \dots$ that arise in general relativity coupled to matter. It turns out that theories with $d < d_c$ (sub-critical string theories) can always be reinterpreted as critical string theories with non-Poincaré-invariant vacuum states.
Since the background fields generally also include Yang-Mills gauge bosons, the vanishing beta function equations such as (8) resemble the kinds of equations that we would expect to encounter in a unified theory of all the forces. It is important for the consistency of string theory that the small fluctuations of the background fields are manifested as the propagating states of the string. For example, the massless spin-two state in the string spectrum can indeed be shown to correspond to a small fluctuation in the background metric of the sigma model (4). The sum over fermionic spin structures is necessary, among other reasons, to ensure that the sigma model gives a local world-sheet theory. This leads to theories with space-time supersymmetry as described earlier.
The classification of consistent solutions of the vanishing beta function conditions (the string theory vacuum equations of motion) is the subject of superconformal field theory, which involves the study of representations of the superconformal algebra (the super-Virasoro algebra) on world-sheets of arbitrary genus. This has emerged as a major enterprise that is of intrinsic mathematical interest, as well as of interest in areas of theoretical physics such as two-dimensional statistical mechanics and two-dimensional turbulence.
The equations of motion of the massless superstring states at distances much greater than $l_s$ that follow from the vanishing of the beta functions can be derived as the Lagrange equations resulting from an effective action, referred to as (9), that is a generalization of the Einstein-Hilbert action and in which many terms have been suppressed, including the terms involving fermions that are needed for supersymmetry. Even as far as its classical properties are concerned, the kind of supergravity described by (9) differs in a very significant manner from conventional general relativity by the presence of the factor $e^{-2\Phi}$ multiplying the curvature. This means that Newton's constant is related to the string scale by
$$G_N \sim g_s^2\,l_s^8, \tag{10}$$
which is small at weak string coupling. Since the gravitational field due to an object of mass $M$ is proportional to $G_N M$, the gravitational field of such an object is weak when $g_s$ is small, as it is in string perturbation theory. We will see later that there are solitonic solutions to string theory known as 'D-branes', with large masses proportional to $1/g_s$. Such objects have gravitational fields proportional to $g_s$ so that, in perturbation theory, the back reaction on the space-time geometry is negligible even though the objects are very massive.
Equations such as (8) only correspond to the familiar classical field equations at distance scales large compared to the string length scale, $l_s$. At shorter distances the higher-order terms in the action (9) become important and the short-distance properties of the theory are radically different from those of classical field theory. Of course, just such a radical change in the short-distance physics is required in order to circumvent the terrible ultraviolet properties associated with quantum gravity: the string scale acts as a kind of regulator that modifies the short-distance physics in a manner that leads to finite amplitudes for string scattering in string perturbation theory.
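The scaling (10) and the weak gravitational field of a D-brane can be traced in a short worked estimate. The following is only a hedged sketch: it assumes that the curvature term of the ten-dimensional effective action has the schematic form shown on the first line, with the overall normalization, numerical factors and signature conventions chosen for illustration rather than taken from the text.

```latex
% ASSUMED schematic curvature term (normalization and signs illustrative):
S \;\supset\; \frac{1}{l_s^{8}} \int d^{10}x \,\sqrt{-G}\; e^{-2\Phi}\, R .

% For a constant dilaton, \Phi = \Phi_0, Eq. (6) gives e^{-2\Phi_0} = g_s^{-2}:
S \;\supset\; \frac{1}{g_s^{2}\, l_s^{8}} \int d^{10}x \,\sqrt{-G}\; R .

% Comparing with the Einstein-Hilbert normalization
% (16\pi G_N)^{-1} \int d^{10}x \sqrt{-G}\, R and dropping numerical factors:
G_N \;\sim\; g_s^{2}\, l_s^{8} .

% A soliton of mass M \sim 1/g_s (in string units) then sources a field
G_N M \;\sim\; g_s^{2} \cdot \frac{1}{g_s} \;=\; g_s \quad (\text{in string units}),
% which vanishes as g_s \to 0 even though M itself grows.
```

Read this way, the single factor $e^{-2\Phi}$ accounts both for the smallness of Newton's constant at weak coupling and for the statement that very heavy D-branes nevertheless produce only a weak back reaction in perturbation theory.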
2.4 Superstring phenomenology
The intense research that followed the initial indications of consistency of superstring perturbation theory led to significant progress in making contact between the theory and observed physics. The string equations have many solutions in which six of the nine spatial dimensions are curled up and the very large symmetry is broken. In this manner, plausible and natural connections can be made with the Standard Model starting from one of the five ten-dimensional string perturbation theories, the $E_8 \times E_8$ heterotic string. In making such connections the topological properties of the compact six-dimensional space have a direct relationship to the resulting spectrum of elementary particles and properties of their forces. For example, its Euler character determines the number of families of quarks and leptons, while the Yukawa interactions are determined by the intersection form. Many of these constructions use a Calabi-Yau manifold (a compact Ricci-flat Kähler manifold), or an orbifold generalization, as the internal space. This leads to four-dimensional theories with a single unbroken supersymmetry. At low energies such compactifications are closely related to field theories that are of direct phenomenological interest in describing the Standard Model.
More precise connections with observed physics cannot be pinned down until the properties of superstring theory are understood beyond the perturbative approximation. For example, within perturbative string theory supersymmetry necessarily remains unbroken and there is no potential for the moduli fields. This means that their expectation values are arbitrary parameters corresponding to marginal operators in the two-dimensional conformal field theory that describes the compact part of the target space. But supersymmetry is most certainly broken in nature, otherwise all the observed elementary particles would be massless. While it is true that particle masses are tiny compared to the Planck mass of $10^{19}$ GeV, we really want to understand the values of these tiny numbers. For this it is vital to understand the non-perturbative mechanism that leads to supersymmetry breaking. It is equally vital to understand non-perturbative properties in order to describe phenomena such as the quantum mechanics of black holes, cosmology of the early universe and a host of other properties of the theory. Furthermore, we have seen that the superstring perturbation theory equations have very many degenerate solutions; only non-perturbative effects can distinguish between these and pick out a unique genuine solution to the complete theory. A potentially significant issue is whether, once the moduli are fixed by non-perturbative effects, there will be a small coupling constant with which to define a useful perturbation expansion.
3
Non-perturbative string theory
A conceptually complete theory of quantum gravity cannot be based on a background-dependent perturbation theory. The series of diagrams such as fig. 2(b) at best defines a semi-classical approximation which describes quantum mechanical string-like particles moving through a more or less classical space-time. Whereas the Feynman diagrams of quantum electrodynamics give an approximation to a complete theory (Maxwell's electromagnetism), string theory has evolved the other way around. It has been defined by the series of perturbative approximations but the fundamental formulation of the theory, from which the approximations can be derived, is lacking. In such a complete formulation the notion of string-like particles would arise only as an approximation, as would the whole notion of classical space-time.

In fact, perturbative string theory can be shown to be incomplete by a very concrete argument based on the large-order behaviour of the series of string diagrams, which leads to the conclusion that the series is not Borel summable. This means that non-perturbative contributions are essential in order to make sense of the theory. The situation is analogous to that in QCD, where the series of Feynman diagrams is also not Borel summable, which is an indication of the presence of non-perturbative instanton effects that behave as a power of $e^{-1/g^2}$. In the case of string theory the large-order behaviour leads to the conclusion that the non-perturbative effects behave as powers of $e^{-1/g_s}$. Such behaviour was then identified with the presence in the theory of solitons with masses of order $1/g_s$.

Supersymmetry provides a key tool in the systematic investigation of such non-perturbative aspects of string theory. It is well known that field theories with supersymmetry satisfy very strong analyticity constraints. Over the past five or six years these constraints have been used to determine non-perturbative properties of certain highly non-trivial four-dimensional quantum field theories for the first time. The developments in superstring theory since 1994 have enlarged this to encompass non-perturbative features of string theory, in which the field theories arise as special cases. It is these developments that have formed the core of the recent advances described by the term 'M theory'.
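The link between large-order growth and the form of the non-perturbative effects can be illustrated with a small numerical sketch. It assumes, purely for illustration, model series whose $n$-th term is $n!\,g^{2n}$ (field-theory-like growth) or $(2n)!\,g^{2n}$ (string-like growth); these coefficients and the function names are assumptions, not taken from the text. The smallest term of such a divergent series sets the accuracy that perturbation theory can reach, and its logarithm is roughly $-1/g^2$ in the first case and $-1/g$ in the second.

```python
import math

def log_smallest_term(g, factorial_arg, n_max=2000):
    """Natural log of the smallest term k! * g^(2n) of a model divergent series.

    factorial_arg(n) returns k: k = n models field-theory-like growth,
    k = 2n models string-like growth (both are illustrative assumptions).
    """
    logs = [math.lgamma(factorial_arg(n) + 1) + 2 * n * math.log(g)
            for n in range(1, n_max)]
    return min(logs)

def field_theory_like(g):
    """Smallest term of sum n! g^(2n); expected to scale like exp(-1/g^2)."""
    return log_smallest_term(g, lambda n: n)

def string_like(g):
    """Smallest term of sum (2n)! g^(2n); expected to scale like exp(-1/g)."""
    return log_smallest_term(g, lambda n: 2 * n)

g = 0.1
print("field-theory-like:", field_theory_like(g), "compare with", -1 / g**2)
print("string-like:      ", string_like(g), "compare with", -1 / g)
```

The agreement is only up to slowly varying prefactors, which is all that the argument in the text relies on.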
3.1
Dualities
At the heart of these developments has been the discovery of a number of different kinds of duality transformations that relate the theory at different points in the moduli space. These include 'T-duality', which applies order by order in string perturbation theory, and 'S-duality' or 'U-duality', which relate the theory at weak and strong coupling.

T-duality

Even though T-duality is a property of string perturbation theory it points to some interesting features of the microscopic description of space-time in string theory. This can be seen in the simplest example in which one dimension of space is a circle of radius $R$. In that case there are Kaluza-Klein states, familiar in any quantum system in a box, that have masses $m/R$ (where $m$ is an integer), which become infinitely massive and decouple from the theory as $R \to 0$. However, in string theory new states arise from a closed string winding around the circular dimension $n$ times. Such stable winding states have masses proportional to $nR$, so they become massless when $R$ vanishes and do not decouple. In fact, the bosonic string theory is invariant under the interchange of $m$ with $n$ and the replacement of $R$ by $l_s^2/R$. At the same time the coupling constant (the value of the dilaton field) has to be transformed in a specific manner. This T-duality transformation is a
symmetry of all the Feynman diagrams of the bosonic string theory and implies that a very small circular dimension is indistinguishable from a very large dimension! This already suggests that $l_s$ should be interpreted as a minimum length in string theory. In the case of superstring theories T-duality is not a symmetry but it relates the two distinct perturbative closed-string theories: the type IIB theory on a circle of radius $r_B$ is identical to the type IIA theory on a circle of radius $r_A = l_s^2/r_B$. In other words, these two perturbation expansions are really describing the same underlying theory! This is the first example of the web of duality relationships that link all of the apparently different string theories.

This property of string perturbation theory extends to more general backgrounds in which there is an abelian isometry. It means that spaces that are geometrically distinct in conventional general relativity may be identical within string theory, hinting at a very large extension of the symmetry of general relativity under coordinate transformations. A particularly significant example of T-duality is the 'mirror' symmetry of Calabi-Yau spaces, which is relevant to the compactification of superstring theory to four space-time dimensions.

S-duality and solitons in quantum field theory

Many of the important non-perturbative effects in superstring theory discovered in the past five years have incorporated older results and conjectures concerning properties of solitons in supersymmetric quantum field theory. A prime example of such a soliton is the well-known magnetic monopole solution of four-dimensional Yang-Mills gauge theory, which is topologically stable and has a mass proportional to $1/e$, where $e$ is the electric charge (the gauge coupling), so it is not present in the perturbative theory. The presence of such magnetically charged particles raises the possibility of a symmetry between electricity and magnetism.
Such a symmetry was conjectured by Montonen and Olive in 1977 and there is by now overwhelming evidence that it is a property of maximally supersymmetric ($\mathcal{N} = 4$) Yang-Mills theory. This is a symmetry that interchanges $e$ with $1/e$ and at the same time interchanges the roles of the fundamental quanta (the electric charges) with the solitonic particles (the magnetic charges); in the quantum theory they are equally fundamental! The electric charge is inverted in this interchange so that the strong coupling limit of the original theory with electrically charged quanta is identical to the theory with magnetically charged quanta. This highly non-perturbative statement, also known as 'S-duality', would be impossible to verify without the very special features of supersymmetry that allow a controlled extrapolation from weak to strong coupling.

The fundamental electrically charged particles and the magnetically charged monopoles of supersymmetric Yang-Mills theory are BPS states (after Bogomol'nyi, Prasad and Sommerfield) that preserve only a subset of the supersymmetries. Such states, which are annihilated by a combination of the supercharges, are stable against renormalization and typically have masses that are exactly known functions of the coupling constants. The extension of these ideas to Yang-Mills theories with less supersymmetry (particularly, the work on $\mathcal{N} = 2$ theories by Seiberg and Witten) has beautifully clarified some very subtle properties of four-dimensional quantum field theory, as well as making important links with issues in topology.
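The T-duality example of a circular dimension discussed earlier in this subsection can be checked with a few lines of code. The sketch below assumes the standard form of the momentum and winding contribution to the closed-string mass squared, $(m/R)^2 + (nR/l_s^2)^2$, with oscillator contributions omitted and $l_s = 1$; the function names are hypothetical.

```python
import math

L_S = 1.0  # string length scale, set to one (assumed units)

def mass_squared(m, n, radius, l_s=L_S):
    """Momentum (m) plus winding (n) contribution to the closed-string mass squared.

    Oscillator contributions are omitted; they do not depend on the radius.
    """
    return (m / radius) ** 2 + (n * radius / l_s**2) ** 2

def t_dual(m, n, radius, l_s=L_S):
    """Exchange momentum and winding numbers and replace R by l_s^2 / R."""
    return n, m, l_s**2 / radius

# A state with 3 units of momentum and 1 unit of winding on a small circle
state = (3, 1, 0.05)
print(mass_squared(*state), mass_squared(*t_dual(*state)))
```

The two printed values agree, so the spectrum on a circle of radius $R$ cannot be distinguished from the spectrum on a circle of radius $l_s^2/R$.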
Solitons in superstring theory
When these ideas are generalized from Yang-Mills quantum field theory to string theory several new features arise. Firstly, since string theory contains general relativity the BPS states are generalizations of extreme Reissner-Nordström solitonic black holes which may carry a variety of different conserved charges. However, string theory also contains black objects of higher dimension known as 'black p-branes'. These are solitons with extension in $p$ spatial dimensions (where $p = 0$ for a black hole) with extreme versions that are BPS states of string theory. These objects are associated with conserved charges that generalize the electric and magnetic charges carried by the fundamental quanta and the solitonic monopoles in four-dimensional Yang-Mills theory. These charges couple to $(p+1)$-form potentials that are generalizations of the Maxwell one-form, $A_\mu$. For example, all string theories contain the second-rank potential, $B_{\mu\nu}$, the 'Neveu-Schwarz-Neveu-Schwarz' (NS⊗NS) potential. The type II theories also contain $(p+1)$-form potentials, $C^{(p+1)}$, the 'Ramond-Ramond' (R⊗R) potentials. Such potentials enter the type IIA closed-string theory (where $p = 0, 2, 4, 6$) and the type IIB closed-string theory (with $p = -1, 1, 3, 5, 7$). Although the potentials have been known to exist for a long time the fundamental strings do not carry the associated charges and it is only recently that the objects carrying R⊗R charges have been discovered. These are a special class of p-branes known as 'Dirichlet-branes', or 'D-branes'. The configuration with $p = -1$ is point-like in space-time and is an analogue of an instanton, known as a D-instanton. It is also notable that the type IIB theory possesses 1-branes and 5-branes carrying R⊗R charges (D1-branes and D5-branes) in addition to the NS⊗NS 1-brane and 5-brane. In fact, there is an infinite set of $(p, q)$ 'dyonic' 1-branes and 5-branes carrying NS⊗NS charge $p$ and R⊗R charge $q$, where $p$ and $q$ are co-prime.
These are analogous to the infinite set of dyonic states in maximally supersymmetric Yang-Mills theory.

A classical BPS p-brane solution of the low-energy field theory, either type IIA or IIB supergravity, is described by a nontrivial solution of the equations for gravity coupled to the dilaton and the particular antisymmetric potential that couples to the p-brane under consideration. This is a generalization of a Reissner-Nordström black hole. For example, the metric for an extreme Dp-brane living in ten dimensions is given by

$$ds^2 = \frac{1}{\sqrt{H(r)}}\left(-dt^2 + \sum_{i=1}^{p} dx^i dx^i\right) + \sqrt{H(r)}\,\sum_{a=1}^{9-p} dr^a dr^a, \qquad (11)$$

and the dilaton solution given by

$$e^{\phi} = g_s\, H(r)^{\frac{3-p}{4}}, \qquad (12)$$

where $H$ is the harmonic function

$$H(r) = 1 + d_p\, g_s\, \frac{l_s^{7-p}}{r^{7-p}} \qquad (13)$$

($d_p$ is a constant), which is singular at the horizon $r = \sqrt{r^a r^a} = 0$.

3.2
D-branes
Fig. 3: A Dirichlet p-brane (Dp-brane) is a p-dimensional surface on which open strings can end. The diagram illustrates a time sequence in which two open strings with end-points fixed on a D1-brane move towards each other and annihilate into a closed string that moves off the surface. Processes of this type describe the radiation (closed strings) emitted when a black hole evaporates.
Solitonic p-branes were originally discovered by studying the solutions to the classical supergravity field theories which are low energy approximations to string theory and did not have an intrinsically stringy interpretation. However, the dynamics of D-branes is explicitly described by configurations of string as shown in fig. 3. A D-brane is an extended p-dimensional surface on which open-ended superstrings are free to move with their end-points tethered to the brane. The open-string boundary conditions break the translation invariance in the transverse directions, leading to Goldstone modes. These are world-volume scalar fields, $X^i(\xi)$ ($i = p+1, \ldots, 9$), that represent the fluctuating transverse D-brane coordinates. Similarly half the supersymmetry is broken by the presence of the D-brane, leading to world-volume goldstino fields which are fermionic coordinates, $\psi^a(\xi)$, that carry a space-time spinor index. In addition to these world-volume fields a D-brane carries a world-volume vector potential field, $A_\alpha(\xi)$ ($\alpha = 0, \ldots, p$). These world-volume fields, which determine the long wavelength behaviour of D-branes, are the massless states of the open strings moving on the D-brane in fig. 3.
Open superstring ground states are massless gauge particles and their spin-half fermionic partners, so the low energy dynamics of a single D-brane is a version of supersymmetric Maxwell theory. For a Dp-brane embedded in space-time of small curvature (compared to the string scale, $l_s^{-2}$) the dynamics is summarized by an action of the form

$$S^{(p)} = S^{(p)}_{DBI} + S^{(p)}_{WZ}, \qquad (14)$$

where the Dirac-Born-Infeld term is given by

$$S^{(p)}_{DBI} = \frac{1}{(2\pi)^p\, l_s^{p+1}} \int d^{p+1}\xi\; e^{-\phi} \sqrt{-\det\left(G_{\alpha\beta} - l_s^2\, \mathcal{F}_{\alpha\beta}\right)} + \cdots, \qquad (15)$$
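while the Wess-Zumino term is the integral of a $(p+1)$-form, which has the schematic form

$$S^{(p)}_{WZ} \propto \int \left[\, C \wedge e^{\mathcal{F}} \,\right]_{p+1}, \qquad (16)$$

where $C$ denotes the formal sum of the R⊗R potentials, factors of $l_s$ are suppressed, and only the $(p+1)$-form part of the integrand is retained.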
The metric in (15) is the pull-back of the target-space metric to the world-volume of the D-brane, $G_{\alpha\beta} = \partial_\alpha X^i \partial_\beta X^j G_{ij}$, where $X^i$ are scalar world-volume fields. The quantity $\mathcal{F}$ is a two-form defined by $\mathcal{F} = F - B$, where $F = dA$ is the field strength of the Born-Infeld electromagnetic $(p+1)$-dimensional vector potential and $B$ is the NS⊗NS antisymmetric potential of the bulk theory. The action (15) unites the membrane action of Dirac, for a p-dimensional membrane embedded in ten dimensions, with the Born-Infeld version of electromagnetism living inside the $(p+1)$-dimensional world-volume.

This appearance of the Born-Infeld action as a key ingredient in string theory is one of the most fascinating of the recent developments. Recall that Born and Infeld originally introduced their nonlinear electrodynamics as a way of avoiding the inconsistencies of point charges in Maxwell electrodynamics^b. It is now seen to be the effective action of open string theory in the presence of D-branes. More generally, the world-volume D-brane action also contains terms that depend on the extrinsic curvature of the embedding of the D-brane in the target space-time. It is not obvious how to generalize the Dirac-Born-Infeld action to a non-abelian gauge group, although it is easy to generalize the Wess-Zumino action.

For many practical purposes it is sufficient to use the low energy expansion of the supersymmetric Dirac-Born-Infeld action, which is a supersymmetric Maxwell theory. In the cases with maximal supersymmetry this action is identified with the dimensional reduction from ten dimensions of $\mathcal{N} = 1$ supersymmetric Maxwell theory to the $(p+1)$-dimensional world-volume of the Dp-brane where, in static gauge, the world-volume coordinates $\xi^\alpha$ are identified with $X^\alpha$. The Yang-Mills potential of the ten-dimensional theory decomposes into longitudinal and transverse components $A_\mu = (A_\alpha, X^i)$ ($\mu = 0, \ldots, 9$), where $A_\alpha$ is the abelian Born-Infeld potential.
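The statement that the low energy expansion of the Dirac-Born-Infeld action is Maxwell theory can be checked numerically. The following is a minimal sketch under simplifying assumptions that are not in the text: a flat D3-brane world-volume metric, constant transverse scalars, vanishing $B$, and $l_s^2$ absorbed into $F$. In that setting $\sqrt{-\det(\eta - F)} = 1 + \tfrac{1}{4} F_{\alpha\beta} F^{\alpha\beta} + O(F^4)$, and the helper names below are hypothetical.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]  # flat world-volume metric, signature (-,+,+,+)

def det(matrix):
    """Determinant by Laplace expansion along the first row (fine for 4 x 4)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0.0
    for col, entry in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * det(minor)
    return total

def field_strength(e, b):
    """Antisymmetric 4 x 4 F built from 'electric' e and 'magnetic' b components."""
    f = [[0.0] * 4 for _ in range(4)]
    f[0][1], f[0][2], f[0][3] = e
    f[1][2], f[1][3], f[2][3] = b[2], -b[1], b[0]
    for i in range(4):
        for j in range(i):
            f[i][j] = -f[j][i]
    return f

def dbi_density(f):
    """sqrt(-det(eta - F)), the Born-Infeld Lagrangian density in these units."""
    m = [[(ETA[i] if i == j else 0.0) - f[i][j] for j in range(4)]
         for i in range(4)]
    return math.sqrt(-det(m))

def maxwell_density(f):
    """1 + (1/4) F_ab F^ab, with indices raised by the diagonal metric ETA."""
    f_squared = sum(ETA[i] * ETA[j] * f[i][j] ** 2
                    for i in range(4) for j in range(4))
    return 1.0 + 0.25 * f_squared

weak_field = field_strength(e=(0.01, 0.02, 0.0), b=(0.0, 0.01, 0.03))
print(dbi_density(weak_field), maxwell_density(weak_field))
```

For weak fields the two densities differ only at fourth order in the field strength, which is the sense in which Maxwell theory is the low energy limit.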
Versions of the supersymmetric D-brane action have been constructed in which the static gauge condition is not imposed so that the action is manifestly supercovariant. Such actions are invariant under a fermionic 'κ' symmetry of the kind that was needed originally in order to express the superstring in a manifestly space-time supersymmetric manner.

^b M. Born, L. Infeld, Foundations of the new field theory, Proc. Roy. Soc. Lond. A 144 (1934) 425.
Fig. 4: A configuration of two parallel Dp-branes separated by a distance L. (a) The ground state preserves half of the supersymmetry. Open strings moving on the separated Dp-branes describe two independent supersymmetric U(1) gauge theories. (b) A BPS state with a string stretching between the two D-branes preserves one quarter of the supersymmetry. The string, with mass proportional to L, describes a W-boson living in the (p+1)-dimensional world-volume. In the coincident (L = 0) limit this becomes massless, leading to an enhancement to a U(2) gauge group.
Thus, the low energy dynamics of Dp-branes is controlled by maximally supersymmetric Yang-Mills theory in $p+1$ dimensions. Consider, for example, two parallel D-branes separated in the $X^1$ direction by a distance $L$, as in fig. 4(a). The ground state of this system is a BPS state that breaks one half of the supersymmetry. The separation $X^1 = L$ is interpreted as an expectation value for a scalar world-volume field in the D-brane, so this is a Higgs mechanism from the perspective of the supersymmetric Yang-Mills theory. The BPS condition guarantees that there is no force between the two parallel D-branes, just as in the case of two static BPS magnetic monopoles in the Yang-Mills-Higgs system (although in this case, unlike the case with monopoles, the metric on the moduli space is flat).

Now consider an excited state, such as in fig. 4(b), where the stretched fundamental string represents a W-boson of mass proportional to $L$. In fact, the system is isomorphic to supersymmetric $SU(2) \times U(1)$ Yang-Mills-Higgs theory broken to $U(1) \times U(1)$. The state with the stretched string in its ground state is again a BPS state, but one that breaks 3/4 of the supersymmetry. When the two Dp-branes become coincident ($L = 0$) the stretched string becomes massless, giving extra massless gauge particles, and the unbroken gauge symmetry is restored (to $SU(N) \times U(1)$ in the case of $N$ coincident Dp-branes). More generally, the massive stretched string represents an electrically-charged W-boson if it is a fundamental string or a magnetic monopole if it is a D-string.

We see that the transverse coordinates, $X^i$, which are interpreted as scalar world-volume fields, are actually commuting elements of a non-abelian algebra. In fact, $N$ parallel D-branes have transverse coordinates that are $N \times N$ matrices, $X^i_{ab}$, which do not generally commute with each other. The classical geometry of $N$ separated D-branes is obtained only for situations in which they do commute.
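The counting behind the symmetry enhancement can be sketched in a few lines. The example assumes that the lowest state of a string stretched between branes $a$ and $b$ has a mass equal to a tension (set to one here) times their separation, and that each massless such state is a gauge boson; the function names are hypothetical. It also checks that diagonal coordinate matrices commute while a matrix with off-diagonal entries does not.

```python
def stretched_string_masses(positions, tension=1.0):
    """Mass of the lowest string state stretched between each ordered pair of branes."""
    n = len(positions)
    return [[tension * abs(positions[a] - positions[b]) for b in range(n)]
            for a in range(n)]

def massless_gauge_bosons(positions, tol=1e-12):
    """Number of massless open-string gauge states (the dimension of the gauge group)."""
    masses = stretched_string_masses(positions)
    return sum(1 for row in masses for mass in row if mass < tol)

def commutator(x, y):
    """Matrix commutator [x, y] for square matrices given as nested lists."""
    n = len(x)
    xy = [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    yx = [[sum(y[i][k] * x[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [[xy[i][j] - yx[i][j] for j in range(n)] for i in range(n)]

# Two separated branes: U(1) x U(1), two massless gauge bosons
print(massless_gauge_bosons([0.0, 1.0]))
# Two coincident branes: U(2), four massless gauge bosons
print(massless_gauge_bosons([0.0, 0.0]))

# Diagonal (classical) coordinates commute; an off-diagonal excitation does not
x1 = [[0.0, 0.0], [0.0, 1.0]]
x2 = [[2.0, 0.0], [0.0, 3.0]]
off_diagonal = [[0.0, 1.0], [1.0, 0.0]]
print(commutator(x1, x2), commutator(x1, off_diagonal))
```

For $N$ coincident branes the count is $N^2$, the dimension of $U(N) \simeq SU(N) \times U(1)$ up to discrete identifications.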
The appearance of matrix-valued coordinates is one of a number of indications that non-commutative geometry enters into the underlying geometry of string theory. Although its precise role has yet to be clarified in detail, it plays a key role in the matrix model to be outlined in section 4.

A very large number of more general configurations of D-branes have been studied, in which the branes can be of different types, need not be parallel, and may preserve a small amount of supersymmetry or break it completely. They can be embedded in curved space-times and wrap nontrivially around compact cycles. In many cases the correspondence between the D-brane world-volume theory and Yang-Mills field theory has led to a great deal of insight into the structure of quantum field theories in various domains of moduli space.

D-branes have a mass per unit volume proportional to $1/g_s$, so they are extremely massive for small $g_s$. The other p-branes, which carry the NS⊗NS antisymmetric tensor charge, have masses of order $1/g_s^2$ and are therefore even more massive at weak coupling. As we saw earlier, the version of general relativity that emerges from string theory is one in which the value of Newton's constant is proportional to $g_s^2$ (see (9)), so that the Planck scale becomes small when $g_s$ becomes small. This means that the gravitational field of the very massive D-branes is small at weak string coupling. In contrast, the gravitational field of the more familiar solitons with $M \sim 1/g_s^2$ is large. We conclude from this that there is a sensible weak coupling expansion in the background of D-branes in which it is consistent to neglect the back reaction of the brane on space-time. This is a key point in the description of black holes by D-branes that will be briefly described later.

Another important fact about D-branes is that their size is very small compared to the string scale.
In fact, the study of the high energy scattering of two slowly moving D0-branes reveals that the natural length scale is the eleven-dimensional Planck scale, which is much smaller than the string scale when $g_s$ is small (they are related by $l_p = g_s^{1/3} l_s$). This is one indication that string theory sees an extra spatial dimension beyond the ten obvious ones, a fact that is a key aspect of the duality symmetries.

3.3
Equivalence of all superstring perturbation 'theories'
The fact that BPS states preserve some supersymmetry allows justifiable extrapolations to be made from weak to strong coupling. Since, at large coupling (large $g_s$), the BPS p-brane masses become small, these solitons can become the lightest objects in the theory.^c In this case it is often possible to reinterpret them as the 'fundamental' particles of another perturbation expansion which is given as a power series in $1/g_s$. This is the stringy analogue of S-duality. More generally the combination of T and S dualities relates all five different kinds of string theory, just as the electric-magnetic duality related the electric and magnetic charges in Yang-Mills theory.

In order to exploit the full power of the duality relationships it is necessary to consider all the ways in which the extended BPS solitons can wrap around dimensions of the space in which they are embedded. For example, if one embedding dimension is a circle, a two-brane (a membrane soliton) can wrap around this dimension leaving one unwrapped direction, so that the soliton looks like a string. This solitonic string of the original weakly-coupled compactified theory can then be reinterpreted as a fundamental string when the coupling becomes strong. In this way the fundamental strings of any of the five weakly-coupled superstring theories can arise as solitonic states of another theory arrived at by expanding in the inverse charge. This means that none of the perturbation expansions should be viewed as fundamental; they are rather different approximations to the same theory.

A clear example of this phenomenon arises in the comparison of the heterotic string theory, which has ten-dimensional $\mathcal{N} = 1$ supersymmetry and a gauge group that is either $SO(32)$ or $E_8 \times E_8$, with the type IIA string theory, which has ten-dimensional $\mathcal{N} = 2$ extended supersymmetry and no gauge group. When the IIA theory is compactified on the Ricci-flat four-manifold K3 it is found to be identical (as far as can be ascertained) to the heterotic string compactified on a four-torus, $T^4$. This equivalence applies not only at generic points in the moduli space, where the heterotic string gauge symmetry group is broken to an abelian subgroup, but also to points where the gauge symmetries are enhanced. At these points new massless gauge particles arise which are interpreted in the IIA theory as M2-branes wrapped around homology 2-cycles in K3 that shrink to zero area at the enhancement points. This gives a quite novel geometric description of the origin of Yang-Mills gauge symmetry. Furthermore, the presence of such extra massless states whose origin is non-perturbative resolves apparently paradoxical divergences of string perturbation theory at isolated points in moduli space. Such divergences had previously been puzzling inconsistencies of strings compactified on Calabi-Yau spaces.

By means of such arguments all of the five apparently distinct superstring theories have been shown to be different weak coupling approximations to the same underlying theory.

^c We are here imagining that the p-brane is wrapped around a p-cycle of finite volume so that its mass is finite.
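The coupling dependence that drives these arguments can be summarized in a short sketch. It uses only the scalings quoted in the discussion of D-branes: masses of order one, $1/g_s$ and $1/g_s^2$ (in string units) for fundamental strings, D-branes and NS⊗NS solitons, and a Newton constant proportional to $g_s^2$. Numerical factors and the volumes of wrapped cycles are dropped, and the dictionary and function names are hypothetical.

```python
# Power of g_s in the mass of each object (string units, order-one factors dropped)
MASS_EXPONENT = {
    "fundamental string": 0,
    "D-brane": -1,
    "NS soliton": -2,
}

def mass(kind, g_s):
    """Mass scaling of the object with the string coupling."""
    return g_s ** MASS_EXPONENT[kind]

def gravitational_strength(kind, g_s):
    """Newton's constant (proportional to g_s^2) times the mass of the object."""
    return g_s**2 * mass(kind, g_s)

for g_s in (0.1, 10.0):
    for kind in MASS_EXPONENT:
        print(g_s, kind, mass(kind, g_s), gravitational_strength(kind, g_s))
```

At weak coupling the D-brane is heavy but its gravitational field is of order $g_s$, while at strong coupling it becomes lighter than the original fundamental string, which is the regime in which a dual perturbation expansion becomes appropriate.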
3.4
M theory
A most important feature of these stringy dualities is that a parameter that enters as a charge in one perturbation expansion might appear as the radius of a compact spatial dimension in another. In this case, as the charge becomes large a new large dimension appears in the dual theory: the dual theory lives in a space with an extra dimension. One particularly important example of this arises in ten-dimensional type IIA string theory, whose strong coupling limit is known to be equivalent to an eleven-dimensional theory. This is not a string theory at all, but in its classical approximation is eleven-dimensional supergravity, which was first formulated in 1978 but did not play an obvious role in string theory until quite recently. In contrast to string theory, there is no perturbative expansion of eleven-dimensional supergravity since there is no scalar field to act as a coupling constant. The radius of the extra (eleventh) dimension, $R_{11}$, is related to the coupling constant of the ten-dimensional string theory by $R_{11}/l_p = e^{2\phi/3} = g_s^{2/3}$ (where $l_p$ is the eleven-dimensional Planck scale). This gives a novel interpretation of the string perturbation expansion as an expansion in powers of the radius of the eleventh dimension.
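The relation between the scales can be made concrete with a small sketch. It combines the relation $R_{11}/l_p = g_s^{2/3}$ quoted above with the standard relation $l_p = g_s^{1/3} l_s$ between the eleven-dimensional Planck scale and the string scale, which is taken here as an assumption; the function names are hypothetical. Together they give $R_{11} = g_s l_s$, so the eleventh dimension is invisible in string perturbation theory and opens up at strong coupling.

```python
def planck_length_11d(g_s, l_s=1.0):
    """Eleven-dimensional Planck length, assuming l_p = g_s^(1/3) * l_s."""
    return g_s ** (1.0 / 3.0) * l_s

def radius_11(g_s, l_s=1.0):
    """Radius of the eleventh dimension from R_11 / l_p = g_s^(2/3)."""
    return g_s ** (2.0 / 3.0) * planck_length_11d(g_s, l_s)

for g_s in (0.01, 1.0, 100.0):
    print(g_s, planck_length_11d(g_s), radius_11(g_s))
```

For small $g_s$ the radius is smaller than the Planck length, which is why classical supergravity is not a reliable guide in the regime where string perturbation theory applies.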
The eleven-dimensional quantum theory that arises by considering strongly coupled string theory can be taken as one definition of M theory. Unfortunately, this rather implicit definition does not provide a microscopic formulation of the theory. However, many interesting properties have been abstracted from properties of its long wavelength limit, which is classical eleven-dimensional supergravity. In that theory the only field that is a differential form is the three-form potential, $C^{(3)}$. This couples electrically to a solitonic two-brane (the M2-brane) and magnetically to a five-brane (the M5-brane), which are the only p-branes in the eleven-dimensional theory. All of the BPS states (the fundamental strings and p-branes described earlier) of lower-dimensional string theory can be obtained by compactification.

The simplest way of relating eleven-dimensional supergravity to string theory is to consider the eleventh dimension to be a circle. One of the spatial directions of the M2-brane may be wrapped around this circle, resulting in a ten-dimensional theory with maximal ($\mathcal{N} = 2$) supersymmetry with a string that is extended in the unwrapped direction of the membrane. This string may then be identified with a fundamental superstring of the type IIA theory. If the M2-brane is not wrapped around the extra dimension it describes the D2-brane of the ten-dimensional theory. In a similar manner the M5-brane may either be wrapped or not wrapped around the extra dimension, giving rise to the D4-brane and the 5-brane of the ten-dimensional type IIA theory, respectively. The D0-brane of the type IIA theory is simply identified with the Kaluza-Klein charge of the fundamental fields of eleven-dimensional supergravity compactified on a circle.
Similarly, the D6-brane of the IIA theory is identified with a Kaluza-Klein monopole of eleven-dimensional supergravity, with its vector potential being the Kaluza-Klein vector potential that arises from the metric component $G_{\mu 11}$. The p-branes of the type IIB theory can then be obtained by T-duality from the type IIA theory compactified to nine dimensions on a circle. [The eight-brane of the IIA theory and the nine-brane of the IIB theory have rather special status and will not be described here.]

Instead of reducing from eleven dimensions on a circle it is possible to consider the eleventh dimension to be an interval of length $R_{11}$, which is bounded by two ten-dimensional boundaries. This configuration has non-zero curvature which leads to a breaking of half the supersymmetry. At small $R_{11}$ it is identified with the background of the ten-dimensional heterotic string theory with a coupling constant $g_s = (R_{11}/l_p)^{3/2}$ and an $E_8 \times E_8$ gauge group. Intriguingly, this gauge group is realized by placing one factor of $E_8$ on each of the two space-time boundaries. This picture of gauge fields living on sub-domains of a larger space in which the gravitational forces propagate is the prototype for recent novel suggestions of how string theory may make contact with phenomenology, as we will see later.

Viewed from this perspective it appears that all the lower-dimensional string perturbation expansions can be obtained from eleven-dimensional supergravity by dimensional reduction, but this is too simple-minded. In order to arrive at string perturbation theory $g_s$ must be very small, which corresponds to a compactification radius $R_{11}$ that is smaller than the Planck distance, a limit in which the classical supergravity approximation is not generally valid. In practice, the most complete attempt at a microscopic theory is the Matrix Model (described in the next section)
in which the eleven-dimensional theory arises as the limit of type IIA string theory at large coupling.

Substantial regions of the moduli space of vacua of M theory have been mapped out in this manner. The very large variety of different perturbative superstring theories in ten and fewer dimensions, as well as eleven-dimensional supergravity, are explicitly seen as approximations to the full underlying theory when the moduli that parameterize supersymmetric vacuum states approach different limiting values. Thus, all these disparate theories are embedded in one underlying theory. On the one hand, this encapsulates the immense progress in string theory over the past few years. On the other hand, the presence of continuous moduli is a sign that our present understanding is limited to the subspace of the complete theory in which supersymmetry is unbroken. In order to agree with observed physics the theory must break supersymmetry. Supersymmetry breaking could well arise from a dynamical mechanism akin to the Nambu-Jona-Lasinio mechanism but this is not yet under firm control in string theory. Such a mechanism would create a potential for the moduli, hopefully lifting their degeneracy and leading to a disconnected nontrivial solution that represents the physics of our experience.

4
Present directions
Many new approaches have been pursued in the past two years in the search for a non-perturbative formulation of the theory. A characteristic feature of most of these developments is that they have illuminated deep connections between quantum Yang-Mills theory and quantum gravity. As we have seen, Yang-Mills interactions (open string theory) and gravitational interactions (closed string theory) are naturally related within string theory due to the fact that an open string forms a closed string when its end-points join. This simple relationship lies behind some of the most fascinating recent ideas.

The Matrix Model

The Matrix Model builds the eleven-dimensional theory by considering the large-$N$ limit of the supersymmetric quantum mechanical problem that describes $N$ interacting D0-branes, or D-particles, which are solitonic BPS states of the type IIA superstring theory. As we saw earlier such a system is described by the compactification to zero spatial dimensions ($p = 0$) of ten-dimensional supersymmetric Yang-Mills theory. This can be described by the one-dimensional action,
$$S_M = \frac{1}{2 g_s l_s} \int dt\; \mathrm{tr}\left( (\dot{X}^i)^2 + \frac{1}{2}\,([X^i, X^j])^2 + \cdots \right), \qquad (17)$$
where the $N \times N$ matrices $X^i$ ($i = 1, \ldots, 9$) are the non-commuting coordinates of the D-particles and $\cdots$ indicates the fermionic terms. This is a supersymmetric quantum mechanical system with a rather subtle potential. Classically, the minima of the potential arise when $[X^i, X^j] = 0$, which are the configurations in which the coordinates $X^i$ can be written in diagonal form

$$X^i = \mathrm{diag}(y^i_1, \ldots, y^i_N), \qquad (18)$$

where $y^i_r$ ($r = 1, \ldots, N$) are interpreted as the degenerate classical positions of $N$ separated D-particles. The classical minima of the potential form flat valleys. Small fluctuations around these minima are described by harmonic oscillators in the off-diagonal coordinates transverse to the valleys. Since the sides of the valleys get steeper as the separation gets larger the zero-point energy is non-constant if the fermionic contributions are ignored. However, the fermionic vacuum energy in the theory with maximal supersymmetry precisely cancels this effect and the valleys remain marginal directions in the complete quantum theory.

It can easily be argued that in order for the duality relationships to be valid the system of $N$ D-particles described by (17) should have precisely one threshold bound state (a state of zero binding energy) for any value of $N$. This is a subtle property of the quantum mechanical system described by (17) that has not yet been proved in full generality. However, if this is assumed to be correct it leads to a picture in which, in the large-$N$ limit, the quantum system of $N$ D-particles in ten dimensions generates an eleven-dimensional theory possessing the expected membrane states (the M2-branes) in the low energy limit. In a sense the D-particles play the role of partons, or constituents, of the membrane in a fashion that is analogous to the way in which string 'bits' are the constituents of strings in light-cone gauge string theory.

There have been a host of further developments in the Matrix Model. Particularly significant have been the attempts to formulate it in more general backgrounds. For example, when a spatial dimension is compactified on a circle extra states have to be added, corresponding to strings that end on the D-particles but wind around the compact dimension.
This is equivalent to replacing the supersymmetric quantum mechanical system of D-particles with a system of D-strings, which is described by two-dimensional Yang-Mills quantum field theory with sixteen supersymmetries. As more dimensions are compactified the model has to be generalized further to account for yet more states that are ignored in higher dimensions. For example, the theory in eight non-compact dimensions is defined by toroidally compactified four-dimensional $\mathcal{N} = 4$ supersymmetric Yang-Mills theory.

This is an ambitious programme that has led to many significant insights into the structure of non-perturbative string theory. However, at present it suffers from severe difficulties. Firstly, the process of reducing dimensions requires an understanding of higher-dimensional theories. Already with six compact dimensions the theory has to be formulated in terms of the 'little string theory'. This is a fascinating six-dimensional theory of non-gravitational strings that is not a local field theory and whose properties have not yet been properly understood. Beyond this, when there are four or fewer non-compact space-time dimensions not much is understood as yet. An even worse problem with the present formulations of the Matrix Model is that the formalism is manifestly background dependent. This may be adequate for understanding M theory in specific backgrounds but is obviously not the fundamental way of describing quantum gravity.

Non-commutative geometry
It has long been suspected that some version of non-commutative geometry should be the appropriate geometric arena for describing quantum gravity. We
have just seen that the Matrix Model provides one very specific example in which the lack of commutativity of space-time coordinates arises via the dynamics of D-branes. An earlier example arose in the late 1980's in the structure of open-string field theory, which rapidly fell from grace since it appeared impossible to incorporate the closed-string, or gravitational, sector in a consistent manner. Most recently, it has been realized how the low energy limit of open-string field theory gives rise to an interesting version of non-commutative Yang-Mills quantum field theory when there is a non-zero, but constant, bulk NS-NS antisymmetric tensor potential. This makes contact, for example, between the Dirac-Born-Infeld action and non-commutative Yang-Mills theory. At the very least, this has given new insight needed for exploring areas of moduli space that have so far remained uncharted. Beyond that, it may point to the correct setting for the quantum geometry needed to understand the fundamentals of string theory.

Black holes in string theory
Included among the panoply of solitons in the low energy field theory that approximates string theory are the familiar black hole solutions that arise in any theory that contains gravity. Although general macroscopic black holes are presumably very complex objects which can radiate by emitting Hawking radiation, there are wide classes of stable black holes that carry various kinds of charges. These are generalizations of extreme Reissner-Nordstrom or Kerr black holes and behave very much like ordinary, but very massive, elementary particles. They are seen in string theory as explicit states composed of superstrings and p-branes. In many situations these extreme or near-extreme black holes can be explicitly constructed as configurations of D-branes.

A well-studied example is a black hole that is constructed by compactifying IIB superstring theory on $T^5$ (a five-torus) with $M$ D5-branes wrapped around the torus together with $N$ D1-branes wrapped on a circle $S^1$ in one of the $T^5$ directions with quantized momentum $P$ in the same direction. This is a five-dimensional extreme Reissner-Nordstrom black hole that can also be discovered as a classical solution in type IIB supergravity theory compactified on $T^5$, with charges $M$, $N$ and $P$ arising from various components of the tensor fields in the ten-dimensional theory. However, the D-brane construction allows a detailed quantum mechanical enumeration of the states of this system, at least in the weak coupling limit, $g_s \ll 1$. The resulting entropy is

$$S = 2\pi\sqrt{MNP}, \tag{19}$$
which agrees exactly with the entropy calculated from the classical solution using Hawking's formula $S = \mathrm{Area}/4\pi$. Thus, we see a microscopic origin for the geometrical black hole entropy. This realizes the long-held hope that microscopic black holes and elementary particles should have a unified origin.

It is easy to move slightly away from the extreme limit and consider the behaviour of near-extreme black holes. These decay by the Hawking process, eventually returning to the extreme limit. The excited states of the string theory black hole are
again explicitly constructed in terms of the dynamics of open strings that propagate on the D-brane. These open string excitations can collide and combine to form a closed string which propagates in the bulk, thereby appearing as Hawking radiation as the black hole decays to its ground state (see fig. 3). The cross section for this process, including all details of the greybody factors, agrees exactly with the rate predicted by Hawking. In the string calculation unitarity is maintained at each stage and there is no question of violation of quantum mechanics.

However, these calculations are based on perturbative approximations that are valid at weak gravitational strengths where the D-brane approximation is controllable. This is far from the regime in which the black hole has a large radius and can be described by the semi-classical methods used by Hawking. Further understanding is needed in order to address the issue of what happens in the decay of an arbitrary black hole. These issues highlight the subtleties in the interplay between quantum mechanics and general relativity.

Field theory on the brane

A fertile line of development is one which exploits the fact that the long wavelength dynamics of p-branes in string theory or M theory is described by a field theory in the (p+1)-dimensional world-volume. The parameters of this field theory (the charges, masses and the amount of supersymmetry) depend upon how the p-brane is immersed in the ten- or eleven-dimensional space-time. By considering various configurations of intersecting branes with different values of p, immersed in different configurations, it has been possible to rederive most of the results concerning nontrivial supersymmetric quantum field theories of Seiberg and Witten and considerably extend them in a geometrically appealing manner. Particularly interesting is the insight that has been gained into the structure of the six-dimensional little string theories that are related to the 5-branes.
The World as a brane

The idea that the four-dimensional field theories that form the Standard Model can be defined by embedding a 3-brane in higher dimensions is the subject of a most intriguing series of recent speculations. In this picture our apparently four-dimensional physical laws are associated with a 3-brane that is embedded in a higher-dimensional space-time or a four-dimensional boundary of a higher-dimensional space-time. According to this idea the fields of the Standard Model live on the 3-brane while the (super)gravity fields live in the higher-dimensional bulk. The extra compact spatial dimensions may then be much larger than is generally supposed. If $m$ dimensions have radii $R$ much larger than the fundamental string scale (i.e. $R \gg l_s$) there are deviations from the inverse square law of Newtonian gravity on scales less than or of order $R$. In this case it is easy to see that the effective four-dimensional Planck length scale, $l_P$, which determines the observed value of Newton's gravitational constant, is related to the fundamental string scale (which is approximately the fundamental Planck scale) by

$$G_N^{1/2} = l_P = R^{-m/2}\, l_s^{(m+2)/2}. \tag{20}$$
This could allow the string scale to be as low as $l_s^{-1} \sim 1$ TeV with $R$ as large as 1 mm if there are two large dimensions, $m = 2$! With $m = 6$ the string scale is still about 1 TeV if $R \sim (10\ \mathrm{MeV})^{-1}$. Surprisingly, this scenario is not ruled out by present experimental limits since the well-studied phenomenology of the Standard Model remains unchanged when the non-gravitational forces are confined to the usual four-dimensional space-time. The fact that the gravitational forces permeate the higher dimensions is difficult to test with present experimental bounds for gravitational effects from terrestrial or cosmological observations.

This is a very dramatic possibility since it would mean that the observed effective Planck mass scale is reduced to an experimentally attainable value. This picture has also been argued to lessen the problem of the hierarchy of mass scales in unified theories. However, there appear to be severe problems with naturalness. For example, there is no obvious dynamical reason for the presence of large extra dimensions or for the unification of gauge couplings. Although this general picture may be motivated by string theory considerations it has not yet been directly obtained from a string or M theory starting point. Possibly it would arise by compactification from something like the Horava-Witten picture of M theory in an eleven-dimensional space-time with boundaries.

A very interesting version of this scenario has been suggested in which the extra dimensions need not be compact but the space transverse to the three-brane has a very nontrivial metric. In suitable examples a potential well is generated in the transverse directions so that a gravitational bound state is localized on the three-brane. Four-dimensional coordinate invariance guarantees that this state is a massless graviton. Although the precise details of this picture may appear unrealistic, it suggests a novel mechanism for interpreting our four-dimensional universe.
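As an aside on the compact case, the sizes quoted after relation (20) can be checked with a few lines of arithmetic. The Python sketch below is only an illustration: it reads (20) as $l_P = R^{-m/2} l_s^{(m+2)/2}$, assumes a four-dimensional Planck mass of about $1.2 \times 10^{19}$ GeV, uses $\hbar c \approx 1.973 \times 10^{-16}$ GeV m to convert units, and ignores all factors of order one. The function names are hypothetical.

```python
import math

HBAR_C_GEV_M = 1.973269804e-16  # hbar*c in GeV*m, converts GeV^-1 to metres
PLANCK_MASS_GEV = 1.22e19       # assumed four-dimensional Planck mass (approximate)


def radius_in_inverse_gev(m, string_scale_gev, planck_mass_gev=PLANCK_MASS_GEV):
    """Solve l_P = R^(-m/2) * l_s^((m+2)/2) for R, in units of GeV^-1."""
    l_p = 1.0 / planck_mass_gev
    l_s = 1.0 / string_scale_gev
    # Squaring the relation gives l_P^2 = l_s^(m+2) / R^m.
    return (l_s ** (m + 2) / l_p ** 2) ** (1.0 / m)


def radius_in_metres(m, string_scale_gev):
    """Same radius converted to metres."""
    return radius_in_inverse_gev(m, string_scale_gev) * HBAR_C_GEV_M


if __name__ == "__main__":
    for m in (2, 6):
        r = radius_in_inverse_gev(m, 1000.0)  # string scale of 1 TeV
        print(m, radius_in_metres(m, 1000.0), "m, 1/R =", 1000.0 / r, "MeV")
```

With these inputs the sketch gives a radius of roughly 2 mm for $m = 2$ and an inverse radius of a few MeV for $m = 6$, consistent in order of magnitude with the values quoted in the text.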
In this non-compact picture a three-brane may be embedded in higher dimensions with all the forces, including gravity, localized on the world-volume of the three-brane. The effect of the higher dimensions is again to radically change the relationship between the observed Planck scale and the fundamental length scale that enters in the higher-dimensional theory.

Holography

Very general arguments suggest that any theory of quantum gravity will satisfy a 'holographic' principle. This implies that the information within any space-time volume is encoded on the boundary enclosing the volume. Furthermore, there is at most one bit of information per Planck unit of the surface area. Thus, in this picture the physics of the bulk is not extensive: the number of physical degrees of freedom is that of a field theory at discrete points spaced by a Planck distance in one less dimension. Such a principle has been argued to be a natural property of string theory due to its nonlocal properties. This seems to indicate that string theory, or any potential theory of quantum gravity, may possess an immense gauge symmetry that permits the reduction of the degrees of freedom to those living on the boundary of any volume. According to 't Hooft, requiring a theory of quantum gravity to be holographic also requires a radical change in the structure of quantum theory, which would arise as an approximation to a more classical theory. Although this deep principle has still to be understood in full generality it has a clear implementation in particular backgrounds such as anti de-Sitter space-times.
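The counting in the holographic principle can be made concrete with a rough numerical sketch. The Python lines below take the statement of one bit per Planck unit of area literally, ignore factors of order one, and assume a spherical region and a Planck length of about $1.6 \times 10^{-35}$ m. The function names are illustrative only.

```python
import math

PLANCK_LENGTH_M = 1.616e-35  # approximate Planck length in metres (assumed value)


def boundary_bits(radius_m):
    """Upper bound on information: one bit per Planck area of the bounding sphere."""
    area = 4.0 * math.pi * radius_m ** 2
    return area / PLANCK_LENGTH_M ** 2


def naive_volume_cells(radius_m):
    """Number of Planck-sized cells filling the ball, as an extensive count would give."""
    volume = (4.0 / 3.0) * math.pi * radius_m ** 3
    return volume / PLANCK_LENGTH_M ** 3


if __name__ == "__main__":
    for r in (1.0, 2.0):
        print(r, boundary_bits(r), naive_volume_cells(r))
```

Doubling the radius multiplies the boundary count by four but the naive volume count by eight, which is the sense in which the physics of the bulk is not extensive.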
Anti de-Sitter geometry and superconformal Yang-Mills theory

One of the most interesting developments of the past two years has been the explicit realization of the holographic principle in a special class of space-time backgrounds. The examples in this class can be obtained by considering the geometry that arises when a large number $N$ of p-branes are superimposed. This is the limit in which the classical supergravity p-brane metric of the form (11) is a good approximation to the corresponding string theory or M theory geometry. As a specific example consider the D3-brane metric obtained by setting $p = 3$ in (11) and (12) (in this case the dilaton field is constant). The large-$N$ limit of relevance is one in which the string scale is taken to zero, $l_s \to 0$, with a simultaneous rescaling of coordinates to $U = r/l_s^2$. This defines the near-horizon limit in which the constant term in the harmonic function (13) is negligible so that $H = R^4/r^4$ (where $R$ is a constant). In this limit the metric (11) becomes $AdS_5 \times S^5$, which has isometry group $SO(4,2) \times SO(6)$. Both the $AdS_5$ and the $S^5$ factors have a scale $L$. This background space-time is one with a boundary that can be chosen to have the topology $R \otimes S^3 \otimes S^5$ where the first and second factors are the time-like and spatial directions that form the boundary of $AdS_5$.

However, we saw earlier that $N$ D3-branes are described by four-dimensional $SU(N)$ $\mathcal{N} = 4$ supersymmetric Yang-Mills theory coupled to the gravitational background fields. In the small $l_s$ limit under consideration the background gravity decouples leaving four-dimensional superconformal Yang-Mills field theory living on the boundary of the bulk space-time. The symmetry group of this boundary theory is the group $SO(4,2) \times SU(4)$, where the first factor expresses the conformal invariance of the Yang-Mills theory and the second factor is the R-symmetry group characteristic of maximal ($\mathcal{N} = 4$) supersymmetry.
Thus, the isometry (super)group of the background is the same as the (super)conformal symmetry group of the boundary theory. These facts were the motivation for the conjecture that string theory in an $AdS_5 \times S^5$ background is equivalent to $\mathcal{N} = 4$ superconformal Yang-Mills theory on the boundary. More precisely, the amplitudes of the bulk IIB superstring theory are functions of the values of the string fields at points on the boundary, labeled by $x_r^\mu$ ($\mu = 0, 1, 2, 3$). The conjecture states that these boundary fields are the sources of composite gauge-invariant operators in the Yang-Mills theory and the amplitudes are to be identified with correlation functions of these operators,

$$\exp(-S_{IIB})\big|_{\{\Phi(x_r)\}} = \int DA\, \exp\bigl(-S_{YM}[A] + \mathcal{O}_r(x_r)\,\Phi(x_r)\bigr). \tag{21}$$
In this expression, $S_{IIB}$ is the superstring effective action that is a functional of the boundary values $\{\Phi(x_r)\}$ and $\mathcal{O}_r(x_r)$ are the composite Yang-Mills operators to which these fields couple. According to this conjecture the coupling constant of string theory is the square of the Yang-Mills coupling, $g_s = g_{YM}^2$, while the 't Hooft coupling $\lambda = g_{YM}^2 N$ is equated with $L^4/l_s^4$, where $L$ is the scale of the $AdS_5$ (which is equal to the radius of the $S^5$). Little is known about the bulk string theory in this background in general but the supergravity limit in which $l_s \ll L$ with $g_s \ll 1$ has been well studied. However, this corresponds to the limit in which $N \to \infty$ with large 't Hooft coupling, $\lambda \gg 1$,
in which almost nothing is known about the Yang-Mills theory. Conversely, taking $N \to \infty$ keeping $\lambda \ll 1$ with $g_s$ …
Further issues and objectives
It would have been gratifying to start the new century with a resolution of the profound problems that are encountered in constructing a unified theory, such as string theory. However, we have seen that important issues concerning the underlying description of quantum gravity remain to be resolved. One of the most obvious problems is that all the suggested microscopic models of M theory are background dependent. A truly background independent formulation would not make a
distinction between the target space-time and the embedded objects; both concepts should emerge from a novel kind of quantum geometry. However, despite the lack of an obvious punch-line the evolution of superstring theory has clarified some important issues and has provided a clear focus for tackling others.

Since string theory differs so radically from conventional quantum field theories it would be most disappointing if it did not have important experimental consequences. However, the present state of understanding of the theory does not allow detailed predictions beyond certain general statements. The discovery of supersymmetry in the next generation of accelerators is the most obvious and immediate experimental possibility. Nowadays supersymmetry, which invokes fermionic space-time coordinates, underlies the greater part of phenomenology beyond the Standard Model as well as being of central theoretical importance within string theory. Although its discovery would not necessarily test the details of string theory, it would certainly be of great significance to our understanding of space-time.

The most conspicuous observational problem facing any quantum theory of gravity is the explanation of the vanishing of the cosmological constant, which is measured to be zero to astonishing accuracy. It is difficult to imagine how it can vanish or be small in any standard theory, which most naturally leads to a value at least $10^{120}$ times too large. Although it vanishes in the superstring perturbation approximation, the key question is why it should vanish when supersymmetry is broken by non-perturbative effects.

String theory should have an impact on many other cosmological issues, particularly on early universe cosmology, which is determined by the kind of short-distance physics embodied in quantum gravity at the Planck scale.
When the non-perturbative extension of string theory is properly understood it should have an impact on this area and perhaps clarify the subject of quantum cosmology. The novel possibility described in the last section that the fundamental Planck energy scale is much smaller than normally supposed would lead to Planck-scale effects that might also be recreated in the next generation of accelerators. Although this seems very far out, the fact that this is not yet ruled out makes it a subject worthy of investigation.

Finally, an optimist might hope that there is a lesson to be learned from the development of General Relativity, which led to an unexpected explanation of the precession of the perihelion of Mercury. It is to be hoped that once the logical status of string theory is understood there will be analogous unforeseen experimental consequences.
BIBLIOGRAPHY

Textbooks

M.B. Green, J.H. Schwarz and E. Witten, Superstring Theory, CUP, 1987.
J. Polchinski, String Theory, CUP, 1998.
D. Lüst and S. Theisen, Lectures on String Theory, Springer-Verlag, 1989.
S.-T. Yau (Ed.), Essays on Mirror Manifolds, International Press, 1992.
E. Kiritsis, Introduction to Superstring Theory, Leuven University Press, 1997.
J. Fuchs, Affine Lie Algebras and Quantum Groups: An Introduction with Applications in Conformal Field Theory, CUP, 1992.

Lecture notes and technical reviews

J. Scherk, An introduction to the theory of dual models and strings, Rev. Mod. Phys. 47, 123 (1975).
M.J. Duff, R.R. Khuri and J.X. Lu, String solitons, hep-th/9412184; Phys. Rep. 259, 213 (1995).
J.H. Schwarz, Lectures on superstring and M theory dualities (ICTP Spring School and TASI Summer School), Nucl. Phys. Proc. Suppl. 55B, 1 (1997); hep-th/9607201.
J. Polchinski, TASI lectures on D-branes, hep-th/9611050.
P.K. Townsend, Four lectures on M theory, hep-th/9612121.
H. Ooguri and Z. Yin, TASI lectures on perturbative string theories, hep-th/9612254.
G.T. Horowitz, Quantum states of black holes, gr-qc/9704072.
T. Banks, Matrix theory, Nucl. Phys. Proc. Suppl. 67, 180 (1998); hep-th/9710231.
D. Bigatti and L. Susskind, Review of matrix theory, hep-th/9712072.
N. Nekrasov and A. Schwarz, Instantons on noncommutative $R^4$ and (2,0) superconformal six-dimensional theory, hep-th/9802068; Comm. Math. Phys. 198, 689-703 (1998).
C.P. Bachas, Lectures on D-branes, hep-th/9806199.
N. Arkani-Hamed, S. Dimopoulos and G. Dvali, Phenomenology, Astrophysics and Cosmology of Theories with Sub-millimeter Dimensions and TeV Scale Quantum Gravity, hep-ph/9807344; Phys. Rev. D 59, 086004 (1999).
G. 't Hooft, Quantum gravity as a quantum dissipative system, gr-qc/9903084; Class. Quant. Grav. 16, 3263-3279 (1999).
A. Sen, Non-BPS states and branes in string theory, talk given at 3rd APCTP Winter School on Duality in Fields and Strings, hep-th/9904207.
O. Aharony, S.S. Gubser, J. Maldacena, H. Ooguri and Y. Oz, Large N field theories, string theory and gravity, hep-th/9905111.
L. Randall and R. Sundrum, An alternative to compactification, hep-th/9906064; Phys. Rev. Lett. 83, 4690-4693 (1999).
N. Seiberg and E. Witten, String theory and non-commutative geometry, hep-th/9908142; JHEP 9909:032 (1999).
I. Antoniadis, Mass Scales in String and M-theory, Strings '99, Potsdam, hep-th/9909212.
QUESTIONS IN QUANTUM PHYSICS: A PERSONAL VIEW

RUDOLF HAAG
Waldschmidtstraße 4b, D-83727 Schliersee-Neuhaus, Germany
An assessment of the present status of the theory, some immediate tasks which are suggested thereby and some questions whose answers may require a longer breath since they relate to significant changes in the conceptual and mathematical structure of the theory.
1
Introduction
Personal views are shaped by past experiences and so it may be worthwhile pondering a little about accidental circumstances which channelled the course of one's own thinking. Meeting the right person at the right time, stumbling across a book or article which suddenly opens a new window. Fifty years ago, as eager students at Munich University just entering the phase of our own scientific research, we were studying the enormous papers by Julian Schwinger on Quantum Electrodynamics, following the arguments line by line but not really grasping the message. I remember the feelings of frustration, realizing that we were far away from the centers of action. But, mixed with this, also some dismay which did not only refer to the enormous arsenal of formalism in the new developments of QED but began with the standard presentation of the interpretation of Quantum Theory. I remember long discussions with my thesis advisor Fritz Bopp, often while circling some blocks of streets ten times late in the evening, where we looked in vain for some reality behind the enigma of wave-particle dualism. Why should physical quantities correspond to operators in Hilbert space? Why should probabilities be described as absolute squares of amplitudes? And so on. Since we did not seem to make much headway by such efforts I decided to postpone philosophy and concentrate on learning what was really done, which aspects were used in an essential way and were responsible for the miraculous success of Quantum Theory. Leave aside for a while questions of interpretation discussed by Bohr and Heisenberg, extrapolations like Dirac's transformation theory or von Neumann's theory of measurement and return to the more pragmatic attitude pervading for instance the book by Leonard Schiff on Quantum Mechanics. For this purpose Walter Heitler's book 'Quantum Theory of Radiation' (second part of the first edition) proved immensely helpful.
Here I saw what the typical problems were which Quantum Field Theory tried to address and I also learned to appreciate the progress made meanwhile by covariant perturbation theory and Feynman diagrams. The next great piece of luck for me was (indirectly) caused by Niels Bohr. In the course of the planning of a great joint European laboratory for high energy physics (now CERN) he saw the need to introduce a young generation of theoretical physicists in Europe to this area and offered the hospitality of his Institute. He called an international conference in 1952 which I could attend and a year later I got a fellowship for spending a year in Copenhagen. This clearly ended the frustration of being isolated from the great world. The first fringe benefit of the
1952 conference was a garden party at the residence of Niels Bohr where I met Arthur Wightman who gave me the invaluable advice to read the 1939 paper by Wigner on the representations of the inhomogeneous Lorentz group. Returning to Munich I followed the advice and it was a revelation. Not only by putting an end to our concern about wave equations for particles with higher spin but because here I recognized a most natural starting point. The group, nowadays called the Poincare group, is the symmetry group of the geometry of space-time according to the theory of special relativity. What would be more natural than to ask for the irreducible representations of this group? Equally remarkable was the result that these representations (more precisely those of positive energy) correspond to the quantum theory of the simplest physical system, a single particle. It has just two attributes: a mass and a spin. Everything that can be observed for such a system, including, as far as possible, a position at given time, could be expressed within the group algebra. No reference to canonical commutation relations, guessed from classical mechanics, nor to the wave picture. The wave equation arises from the irreducibility requirement. I could not understand why this paper had remained almost unnoticed by the physics community for such a long time. In fact, even in 1955 I was introduced at some conference in Paris as: 'He is one who has read the 1939 paper of Wigner'. This was indeed my major claim to fame.

Probably all the young theorists who had the privilege of spending a year as members of the CERN theoretical study group in Copenhagen will remember this as a wonderful time.
The atmosphere at the Institute with the spiritus loci, emanating from the personality of Niels Bohr, and upheld by Aage Bohr and Christian Møller who was the official director of the study group, combined in a rare way scientific aspirations on the highest level with a friendliness discouraging any competitive struggles and allowing everyone to proceed at his own pace. Though there was some joint topic suggested, at my time it was the Tamm-Dancoff method and alternatives to perturbation theory, with some emphasis on nuclear physics, everyone was allowed to work on the subject of his own choice. So, after a short excursion into nuclear physics I returned to my pet subjects. Prominent among them was collision theory in quantum mechanics and field theory. The widely used recipe of 'adiabatic switching off of the interaction' appeared to me not only as ugly but also as highly suspect because the notion of 'interaction' was not clear a priori and if one switched off the wrong thing one would decompose a nucleus into its fragments. This led me to appreciate the physical significance of various topologies for vectors and operators in Hilbert space. Since in high energy physics the experiments were not concerned with fields but with particles there was the idea that the role of a field was just to interpolate between incoming and outgoing free fields which were associated to some specific type of particle. Trying to implement this I unfortunately used the wrong topology. But just at the end of my stay in Copenhagen we received a preprint of the paper by Lehmann, Symanzik and Zimmermann who did things correctly and thereby provided an elegant algorithm relating 'Green's functions' in quantum field theory to scattering amplitudes for particles.

Speaking of Copenhagen and quantum theory the 'Copenhagen interpretation' immediately comes to mind. I prefer to call it the 'Copenhagen spirit' or, more specifically, the natural philosophy of Niels Bohr.
I did have some opportunities
to talk to the great master but, in spite of my great admiration and some efforts, this was not fruitful. It was only in later years that I understood the depth of various parts of his philosophy. But there always remained one disagreement which came from the question: what are we trying to do and what is guiding us?

Physics began by the recognition that there are relations between phenomena which are reproducible. These could be studied systematically, isolating simple processes, controlling and refining the conditions under which they occur. The formulation of the regularities found and the unification of the results of many different experiments by one coherent picture was achieved by a mapping into abstract worlds: a world of appropriate concepts and a world of mathematical structures dealing with relations within and between various sets of mental constructs, one of them being the set of complex numbers. This endeavor manifestly led to some level of understanding of 'the laws of nature' as evidenced by the development of a technology which provided mankind with enormous powers to serve their conveniences and vices. But what was understood and what was the relative role in this process played by observations, by creation of concepts and by mathematics? When Dirac wrote on the first pages of the 1930 edition of his famous book: 'The only object of theoretical physics is to calculate results that can be compared with experiment', this can hardly be taken at face value. As he often testified later, he was searching for beauty and he found it in mathematical structures. So much so that he preferred to look for beautiful mathematics first and consider their possible physical relevance later. Indeed, the road from phenomena to concepts and mathematics is not a one-way street.
As the studies shifted from coarser to finer features the theory could not be derived directly from experiments but, as Einstein put it, it had to be freely invented and tested subsequently by suggesting experiments. In this passage back and forth between phenomena and mental structures many aspects entered which cannot be rationalized. The belief in harmony, simplicity, beauty are driving forces and they relate more to musicality than to logic or observations.

There was one further highly significant and somewhat accidental occurrence in shaping my subsequent work and this may also illustrate the above remarks. Professors F. Bopp and W. Maak in Munich decided in 1955 that it was important to exchange experience between theoretical physicists and mathematicians. This initiative was not rewarded by visible success. The number of participants dwindled quickly and the enterprise ended after a few months. But for me it was of paramount importance. I was introduced there to rather recent work of the Russian mathematicians Gelfand and Naimark on involutive, normed algebras and to the work of von Neumann on operator algebras and reduction theory. I saw that Hilbert space resides in some wider setting which, at least from the mathematical point of view, constitutes a rather canonical structure resulting from a few natural structural relations. Besides the standard algebraic operations it needed a *-operation (involution) and one is led to a natural topology induced by a unique 'minimal regular norm'. It appeared highly likely that this structure was behind the scenes in the mathematical formalism of quantum theory. The prototype of such an algebra is furnished by group theory. There a most important tool is the consideration of functions of the group with values in the complex numbers. They form obviously a linear space because we can multiply them by complex numbers
and add them. If there is a distinguished measure on the group (which is the case for compact groups and, up to a normalization factor, for locally compact ones) the group multiplication defines a product in the space of these functions, the convolution product. The inverse in the group defines a *-operation in this algebra of functions by $f^*(g) = \overline{f(g^{-1})}$. The resulting algebra yields the representation theory of the group. An irreducible representation corresponds to a minimal (left) ideal in the algebra.

If in the following I try to describe things which I believe to have learned concerning the interrelation between observed phenomena, concepts and mathematical structures I must precede this with some apologies. The inadequate handling of references is due to the state of disorder in my notes and lack of time. The abstractions used in describing the procedures of acquiring knowledge may be too schematic. There is a painful gap between their qualitative character and the very precise mathematical structures into which they are mapped.

2
Concepts and Mathematics in Quantum Theory
A paper describing some fascinating recent experiments was entitled 'Reality or Illusion?' These experiments (see e.g. refs. 1, 2, 3) have lent impetus to the long-standing discussion about the meaning of reality in quantum theory. Do the discoveries force us to abandon the naive idea of an outside world called nature whose laws we try to find? What is the role of the observer? Do the puzzles relate to the mind-body problem? Many different views concerning such questions have been voiced throughout the past seventy years. So if I try to express mine it may be pardonable to proceed in an extremely pedantic fashion.

A single experiment of the type alluded to above combines many individual clicks of some detectors. Though each click is unique and neither repeatable nor predictable even under optimally controlled circumstances, we may regard it as a 'real fact' in the sense in which these words are used in any other context. Let us call it an 'event'. Its existence is not dependent on the state of consciousness of human individuals. In modern times it is usually registered automatically, stored in computer memories and there is no dispute between the members of a group of experimenters about its 'reality'. The outcome of the experiment refers to the frequency with which a particular configuration of events occurs in many runs and this is reported as a probability of the phenomenon under precisely described circumstances. To be accepted, this result must be reproducible by any other group of scientists that is willing to invest time and resources in repeating the experiment.

The events mentioned are coarse. A detector is macroscopic. We regard macroscopic bodies as 'real objects' and statements about their placement in space and time as 'real attributes'. The word 'real' just means here that such objects, events and their space-time attributes belong to common experience shared by many persons and do not depend on the state of consciousness of an individual.
The observed relations between them constitute the only empirical basis from which the 'free invention' of a theory can proceed. With regard to this mental task there is a piece of wisdom which I learned from F. Hund. It might be called Hund's zeroth
rule. He pointed out that the progress of physical theory depended on the lucky circumstance that always some effects were small enough to remain unnoticed or could be disregarded as insignificant at the time a particular piece of the theory was proposed. We cannot take many steps at the same time. We should regard a theory always as preliminary; it will disregard some fine features of which we are luckily ignorant or which we neglect in order to obtain a tractable idealization.

The purpose of these lengthy elaborations is twofold. First, I do not think that physics can make any contribution to the mind-body problem. The attempt to explain some puzzling aspects of quantum physics by invoking subjective impressions and the role of the consciousness of individual human beings is not an appropriate answer. Secondly, the concept of event is necessary in quantum physics. It is an independent concept. The mental picture that it corresponds to an interaction process between an atomic object and a macroscopic one is misleading because experiments tell us that there are no atomic objects in an ontological sense (see below). Of course, from this we may conclude that there are no macroscopic objects either and that their apparent reality results from an asymptotic idealization. This is true (see e.g. the discussion in ref. 4 of the emergence of classical concepts due to large size and decoherence). But the idealization is covered by Hund's zeroth rule and is essential for the form of the present theory. If we want to avoid it we must take the next step in the development of the theory.

Let us address now some specific aspects.

1. In experiments we usually (necessarily?) distinguish two parts: a source which determines the probability assignments (subsumed under the notion of 'state'), and a set of detectors to whose responses the probabilities apply.
Though in the description of the source a variety of considerations enter which will have to be looked at more closely, we shall, for simplicity of language, just idealize it as characterized by some pattern of 'source events'. The total setting, consisting of source events and target events, where the former determine a conspicuous probability assignment for the occurrence of the latter, may be called a 'quantum process'. Bohr emphasized the indivisibility of the process as one of the key lessons of quantum theory. This poses the question of how we can isolate such a process from the rest of the world. In technical terms, what do we have to take into account in 'preparing a state' in order to get a reproducible probability assignment for a pattern of target events (defined by some arrangement of detectors)? Here we are helped by lucky circumstances. We live in a reasonably steady environment; its influence does not change rapidly in space and time. So, if we are stupid in the state preparation we just get an uninteresting probability assignment, a 'very impure state'. The art of the experimenter is needed to improve state preparation and render the probability assignment as conspicuous as possible. It appears, however, that there is a limit to such improvements, idealized by the notion of a 'pure state'.

2. Given some definite process one would like to assign to it a 'physical system' as the agent producing the target events or, more carefully, as the messenger between source and target events. This is clearly a mental construct. Can we attach any element of reality to it? If we focus on a single event involving one detector far removed from the source we may think of a single particle as this messenger. But we may also consider patterns of several events, seen in coincidence arrangements
of detectors far removed from the source and from each other. Then we sometimes find correlations in their joint probabilities which are of a very peculiar type. If we believe that there is a specific messenger from the source to each target event (for instance a particle) then, whatever notion of state we try to assign to those, we cannot represent the joint probability for the pattern of events as arising from joint probabilities for a corresponding set of individual states of the messengers. This is the conclusion to be drawn from the violation of Bell's inequality. It is not so easily seen in the first discussions which focused on hidden variables but emerges clearly in ref. 5.

Another equally surprising effect has been demonstrated by Hanbury Brown and Twiss. They start from two entirely independent source events (for instance photons emitted from two far distant surface regions of a star which happen to arrive almost simultaneously in the observatory). So they can cause a coincidence in two detectors. Each detector responds to one photon but can, of course, not distinguish from which source event it comes. By varying the difference of the optical paths from the telescope exit to the two detectors one finds varying intensity correlations in the coincidence signals. This means that the cause for the response of one detector cannot be attributed to the arrival of either a messenger from the left edge or from the right edge of the star. Both photons work together though there is no phase relationship between the two emission acts. There is a causal relation between the pattern of two source events and the pattern of two target events but it cannot be split into causal ties between single events.

Taken together these experiences imply that the notion of a 'physical system' does not have independent reality. What is relevant for the click of a single detector is some notion of 'partial state' prevailing in its neighborhood.
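The violation of Bell's inequality referred to above can be checked numerically in the simplest textbook setting. The sketch below is only an illustration under stated assumptions: it uses the CHSH form of the inequality (a commonly used variant, not necessarily the form discussed in ref. 5), a spin singlet as the state prepared by the source, and spin measurements in the x-z plane as detector settings. All function and variable names are hypothetical.

```python
from itertools import product

import numpy as np

# Pauli matrices used to build spin observables with outcomes +1 and -1.
PAULI_X = np.array([[0, 1], [1, 0]], dtype=complex)
PAULI_Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Singlet state (|01> - |10>)/sqrt(2) of two spin-1/2 objects (assumed toy model).
SINGLET = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)


def spin_along(theta):
    """Spin observable along a direction at angle theta in the x-z plane."""
    return np.cos(theta) * PAULI_Z + np.sin(theta) * PAULI_X


def correlation(theta_a, theta_b):
    """Expectation value of the product of the two detector outcomes."""
    joint = np.kron(spin_along(theta_a), spin_along(theta_b))
    return float(np.real(SINGLET.conj() @ joint @ SINGLET))


def chsh_quantum(a, a_alt, b, b_alt):
    """CHSH combination E(a,b) - E(a,b') + E(a',b) + E(a',b')."""
    return (correlation(a, b) - correlation(a, b_alt)
            + correlation(a_alt, b) + correlation(a_alt, b_alt))


def chsh_local_bound():
    """Largest |CHSH| when each messenger carries predetermined outcomes."""
    return max(abs(x * y - x * y_alt + x_alt * y + x_alt * y_alt)
               for x, x_alt, y, y_alt in product((1, -1), repeat=4))


quantum_value = abs(chsh_quantum(0.0, np.pi / 2, np.pi / 4, 3 * np.pi / 4))
print(chsh_local_bound(), quantum_value)
```

With these settings the quantum value comes out close to 2√2, above the bound 2 obtained when every messenger carries its own predetermined answers. Probabilistic mixtures of such assignments cannot exceed that bound either, which is the sense in which the joint probability does not arise from individual states of the messengers.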
In both of these examples (the correlation experiment of the Bell type and the experiment of Hanbury Brown and Twiss) this is described as an impure state of a single particle. In the EPR example as discussed by Bell it is determined by one source event, the decay of an unstable particle. In the second example it is caused by two source events. The probability for the response of both detectors in coincidence depends on the partial state in the union of the two neighborhoods and this is apparently not determined by the pair of partial states around the individual detectors. Thus quantum physics exemplifies the saying: 'the whole is more than the sum of its parts' and it does so in extreme fashion. The Pauli principle claims that all electrons in the universe are correlated. The reality behind the mental picture of a physical system consisting of a certain number of particles refers to a certain set of events with causal connections between them, manifested by the existence of a probability for the total process. In an ideal experiment this is obtained by counting the number of times the pattern of target events is realized, dividing it by the number of times the source events occurred. In the usually prevailing cases where the source is not adequately known we can still determine relative probabilities of different patterns of target events assuming that the source remains constant.

The holistic aspect mentioned above is often called the 'essential non-locality' of quantum theory. But this is an unfortunate terminology because the only reason why we can talk about specific processes at all resides in the locality of individual events and the causal structure of space-time.

3. The reader may have wondered why I specialized the usual notion of 'observable' to that of a detector and talked about events instead of measuring results
indicated by the position of a pointer of some instrument. The spectral resolution of self-adjoint operators which played such an essential role in the development of early quantum mechanics has not even been mentioned so far. One reason for this is the problem of how to achieve the mapping from a particular arrangement of instruments to its representative in the mathematical scheme. In early quantum mechanics the idea that we consider a physical system consisting of a definite number of particles seemed to pose no problem (a beautiful illustration of Hund's zeroth rule). The degrees of freedom were positions and momenta, appearing in the canonical formalism of classical mechanics on equal footing. Though it became clear that these degrees of freedom could not be real attributes of the system one still talked about measuring one of them (or simple functions of them like energy and angular momentum). Bohr emphasized that the full description of the experimental arrangement is needed 'to tell our friends what we learned' and that this could only be done in plain language. But, since the classical degrees of freedom persisted, this description of the arrangement could ultimately be summarized by one mathematical object which corresponded in a symbolic way to a classical quantity.

How can one proceed in this passage from the description of an arrangement of hardware to a mathematical symbol relating to the system? The primary piece of information is given by the placement of macroscopic bodies in space-time. These bodies perform different functions. Some parts may be considered as 'state preparation procedures' representing the source events. Other parts yield the measuring result which is an unresolvable phenomenon, an unpredictable decision in nature, a coarse event. Its primary attribute is an approximate position in space-time.
The representation of the whole arrangement (apart from the primary source) by a self-adjoint operator, interpreted as describing the measurement of a quantity related to some function of the classical degrees of freedom involves the theory (Schrödinger equation) in conjunction with idealizations and approximations which are transparent only in simple cases. The operators corresponding to momentum and energy have clear significance as generators of translations in space and time but are only indirectly related to observations, which in the last resort concern the position of an event in space-time. The position operator of a particle at a prescribed time yields spectral projectors which can approximately characterize an event. But the assumed existence of a family of mutually exclusive events with certainty that one of them must happen is an extrapolation which becomes highly unnatural in relativistic situations. This is mildly indicated already by the ambiguities arising in the attempt to define a position operator in Wigner's analysis.

The fundamental discovery, that the 'elementary particles', formerly believed to be the building stones of matter, are not eternal but can be created or destroyed in processes, forces us to consider states whose particle content is not only varying but undefined in some regions. While the concepts of 'system' or 'particle' suggest some object existing in an ontological sense, the concept of 'state' belongs to the realm of possibilities (potentialities, propensities) for the realization of something coming into existence, an event. This would not be so in a deterministic theory but if we believe that the indeterminacy in the prediction of phenomena, inherent in the formulation of quantum physics, is a feature of the laws of nature and not just due to ignorance which could be lifted by future studies then the distinction
between the realm of possibilities and the realm of facts becomes imperative. The 'state' belongs to the former. Strictly speaking it provides a quantitative description of a contribution to the probability for the occurrence of events. The other contribution is given by the placement and type of detectors.

Thus also the notions of 'system' and 'particle' belong to the realm of possibilities. But they retain their importance. They allow us to classify (at least under favorable circumstances) the possible partial states in a region by a denumerable set. This procedure involves the center piece of the mathematics of quantum theory: the superposition principle and eigenvalue problems; more generally, the determination of invariant subspaces in a complex linear space with respect to the action of the symmetry group of the theory. The intuitive steps leading to the recognition that this mathematical structure (Hilbert space, involutive algebras, representation theory of groups) offers the key to quantum theory appear to me as a striking corroboration of Einstein's emphasis on free creations of the mind and Dirac's conviction that beauty and simplicity provide guidance.

Returning to our line of argument: the ordering of states in classes by the concept of a 'system' corresponds to the selection of an invariant subspace, under the action of the symmetry group. A minimal invariant subspace, an irreducible representation, may be called an elementary system. Its attributes are group characters. If we consider only the symmetry group of space-time, the Poincaré group, the irreducible representations give us states of a stable system, a system which could persist eternally if it were alone in the world and no events could occur. This simple system is a single particle. Its attributes are a value of the mass and the spin, which define a group character.
The simplest systems play such an important role for observations because in many experiments the partial state pertaining to a large but limited region of space-time can be very closely approximated by the restriction of a global single particle state to the region. This will, in fact, be the standard situation in the overwhelming part of space-time if the mean density of matter is small. To obtain a basis in the space of single particle states we must choose some maximal set of commuting generators. In this choice the generators of space-time translations, whose spectral values are energy-momentum 4-vectors, play a preferred role in the following respect. If we look in a region whose extension is small compared to its mean distance from source events then the partial state there is well approximated by a mixture of parts of plane waves, (improper) eigenstates of the generators of translations, each belonging to a specific energy-momentum vector.

We confined attention so far to the Poincaré group describing the space-time symmetry. The full symmetry group includes 'gauge symmetries' whose characters are charge quantum numbers. The first example was the electric charge with its description as a character of U(1). The generalizations of this in high energy physics led to flavor and color multiplets associated with the groups SU(2), SU(3). To avoid misunderstandings it must be stressed that we talk here about a global gauge group. The significance of local gauge invariance will be addressed later.
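The labels mentioned in this paragraph can be written out explicitly. The following is a sketch in standard textbook notation, none of which appears in the text itself: it assumes the metric signature (+,-,-,-), units with ħ = c = 1, translation generators P^μ, Lorentz generators M_{ρσ}, the Pauli-Lubanski vector W^μ (whose overall sign convention varies between authors), and an integer-valued charge q in units of the elementary charge.

```latex
\begin{align}
  P^\mu P_\mu &= m^2 ,\\
  W^\mu W_\mu &= -\,m^2\, s(s+1), \qquad
  W^\mu = \tfrac{1}{2}\,\varepsilon^{\mu\nu\rho\sigma} P_\nu M_{\rho\sigma},
  \qquad m > 0,\\
  \chi_q\bigl(e^{i\theta}\bigr) &= e^{iq\theta}, \qquad q \in \mathbb{Z}.
\end{align}
```

Under these assumptions the first two lines fix the mass m and the spin s of a massive irreducible representation of the Poincaré group, and the third line is the character of U(1) that labels the electric charge.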
2.1 Conclusions
Position and momentum belong to different parts of the scheme. Position is an (approximate) attribute of an event, not of a particle, and the event marks a position in space-time, not a position in space at an arbitrarily assumed time (as the picture of a world line for a particle would suggest). In simple cases the event may be regarded as the interaction process between a particle and a detector. But the notion of 'particle' does not correspond to that of an object existing in any ontological sense. It relates to the simplest type of global state and describes possibilities, not facts. The notion of 'partial state' demands in addition that we ignore all possible events outside some chosen region and thus ignore possible correlations with outside events.

The concepts of 'particle' and 'physical system' arise from the possibility of ordering global states into distinct classes defined by the symmetry group of the theory. A particle corresponds to an irreducible representation of this group. Its attributes are group characters. A system corresponds to some subrepresentation of the tensor product of irreducible representations. Experience tells us that only a countable set of irreducible representations (particle types) appears in nature. The determination of these (the masses, spins, charge quantum numbers of physical particles) is one of the tasks of the theory.

In observations we are concerned with partial states which result by the restriction of global states to some regions in which we choose to place detectors. For a fixed global state the partial state in a region can be approximately described by the restriction of a global state which belongs to the class of some specific system. In other words: if we focus attention on some particular region then the global state may tell us for instance that in there the probability for events is almost the same as that predicted from the restriction of some single particle state.
Indeed, if we choose the region sufficiently small then it will usually suffice to consider only mixtures of single particle states with definite momenta. The existence of zero mass particles complicates this picture somewhat, as evidenced by laser beams and by infrared problems where the number of particles is no longer useful for the description of a partial state. The analysis of global states in terms of various systems approximating the partial states in various regions of space and time is the other task of the theory (the theory of collision processes).

In the whole scheme we still need an observer. No facts are created if no detectors are around anywhere. Though the consciousness of an individual plays no role (it was eliminated by the assumed 'as if' reality of macroscopic bodies and coarse events) the scheme still appears somewhat artificial. It is a description of what we may learn by experiments. But looking at the detailed mathematical structure, developed to cope with the above mentioned tasks of the theory, it seems clear that the notions of macroscopic bodies and coarse events are asymptotic concepts. If, on the other hand, we wish to replace them by finer ones we encounter difficulties. They can, I believe, not be overcome without a radical change of the formalism involving our understanding of space and time. As long as there is an enormous disparity between collision partners, one being a macroscopic body the other an atomic object, we can talk about an approximate position of the event and give
upper bounds for its uncertainty, relating to the size and the time of sensitivity of the (effective part of the) detector, and we can give lower bounds for the energy-momentum transfer needed to overcome the barriers against the appearance of a significant change. This suffices for practical purposes but does not seem to be the ultimate answer if we look at the vertex of a high energy event in a storage ring.

3 The Mathematical Structure in Relativistic Quantum Physics and its Interpretation
In quantum field theory the basic mathematical objects, the fields, are functions of points in space-time. These are singular objects which have to be smeared out over some finite regions to yield observables which can be represented by operators in a Hilbert space. There are problems. Some serious ones are related to gauge invariance, specifically to the local gauge principle first encountered in quantum electrodynamics (QED). If one wants to avoid 'unphysical states' one has to restrict attention to gauge invariant quantities. From these one may hope to construct algebras of observables. To be precise: we abstract from this heuristic consideration that we can obtain a normed, involutive algebra (for short, a C*-algebra) for each bounded, open region $\mathcal{O}$ of space-time. The correspondence

$$\mathcal{O} \to \mathcal{A}(\mathcal{O}) \tag{1}$$
between regions and algebras yields one essential piece of information for the analysis of the consequences of the theory. We call $\mathcal{A}(\mathcal{O})$ the algebra of observables of the region $\mathcal{O}$. There are some natural relations between these local algebras. Obvious is the inclusion relation:

(i) $\mathcal{O}_1 \subset \mathcal{O}_2$ implies $\mathcal{A}(\mathcal{O}_1) \subset \mathcal{A}(\mathcal{O}_2)$.

This allows the definition of a global C*-algebra $\mathfrak{A}$ as the 'inductive limit', the completion of the union of all local algebras in the norm topology. The second important relation reflects the causal structure of space-time:

(ii) If $\mathcal{O}_1$ is space-like to $\mathcal{O}_2$ then $\mathcal{A}(\mathcal{O}_1)$ and $\mathcal{A}(\mathcal{O}_2)$ commute.

The third basic ingredient is covariance with respect to the Poincaré group $\mathcal{P}$. We need a realization of $\mathcal{P}$ by automorphisms of $\mathfrak{A}$; to each element $g \in \mathcal{P}$ there is an automorphism of $\mathfrak{A}$ denoted by $\alpha_g$ which should have the obvious geometric significance:

(iii) If $A \in \mathcal{A}(\mathcal{O})$, then $\alpha_g A \in \mathcal{A}(g\mathcal{O})$, where $g\mathcal{O}$ denotes the region resulting from shifting $\mathcal{O}$ by $g$.

It is convenient to assume that these algebras have a common unit element. We call the structure defined by the correspondence (1) with the properties mentioned a (covariant) net of local algebras.

A general state $\omega$ corresponds to a normalized, positive linear form, i.e. a linear function $A \to \omega(A)$ from the algebra $\mathfrak{A}$ to the complex numbers which takes real, non-negative values on the positive elements of the algebra:

$$\omega(A^*A) \ge 0 \ \text{ for any } A \in \mathfrak{A}; \qquad \omega(1) = 1. \tag{2}$$

A partial state in some region $\mathcal{O}$ is defined in the same way with $\mathfrak{A}$ replaced by $\mathcal{A}(\mathcal{O})$. It corresponds to the restriction of a class of global states to the subalgebra considered.
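The definition of states and partial states as linear forms can be tried out in a finite-dimensional toy model. The sketch below rests on assumptions that go beyond the text: the global algebra is replaced by the 4x4 complex matrices, the local subalgebra by the elements of the form B ⊗ 1 with B a 2x2 matrix, and a state is represented by a density matrix. Local algebras in quantum field theory are not matrix algebras of this kind, so this only illustrates conditions (2) and the notion of restriction. All names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_density_matrix(dim):
    """Positive matrix with unit trace, standing in for a generic state."""
    m = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = m @ m.conj().T
    return rho / np.trace(rho).real


def state(rho):
    """Linear form A -> omega(A) = Tr(rho A) on the matrix algebra."""
    return lambda a: complex(np.trace(rho @ a))


def restrict_to_first(rho, d1, d2):
    """Density matrix of the restriction to the subalgebra {B (x) 1}."""
    return np.trace(rho.reshape(d1, d2, d1, d2), axis1=1, axis2=3)


d1, d2 = 2, 2
rho = random_density_matrix(d1 * d2)
omega = state(rho)

# Conditions (2): positivity on elements A*A and normalization.
a = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
positivity = omega(a.conj().T @ a).real
normalization = omega(np.eye(d1 * d2)).real

# Partial state: the same linear form evaluated only on the subalgebra.
rho_1 = restrict_to_first(rho, d1, d2)
b = rng.normal(size=(d1, d1)) + 1j * rng.normal(size=(d1, d1))
global_value = omega(np.kron(b, np.eye(d2)))
partial_value = state(rho_1)(b)

# A different global state with the same restriction to the subalgebra.
rho_other = np.kron(rho_1, np.eye(d2) / d2)
same_partial_state = np.allclose(restrict_to_first(rho_other, d1, d2), rho_1)
```

In this toy setting `global_value` and `partial_value` agree, and `rho_other` is a second global state with the same restriction, which illustrates why a partial state corresponds to a whole class of global states.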
From section 2 we see that the physical interpretation requires a characterization of those elements of $\mathfrak{A}$ which represent detectors for an event in a region $\mathcal{O}$. The first guess might be to identify the projectors in $\mathcal{A}(\mathcal{O})$ with such detectors. This is, however, not sufficient. A detector, in contrast to a source, must be passive; it should not click in the vacuum situation. We must control the energy-momentum transfer. Starting from any element $A \in \mathfrak{A}$ we can construct elements

$$A(f) = \int (\alpha_x A)\, f(x)\, d^4x, \tag{3}$$
where $x$ refers to a translation in space-time. If the Fourier transform of the function $f$ has support in a region $\Delta$ in $p$-space then the energy-momentum transfer of $A(f)$ is limited to $\Delta$. Therefore we add to the structure described so far the (somewhat over-idealized) assumption that there exists a ground state $\omega_0$, the vacuum, which is invariant with respect to the Poincaré group and is annihilated by any $L \in \mathfrak{A}$ which is of the form (3) with support $\Delta$ outside of the closed forward cone $\overline{V}^+$ (positive time-like vectors in $p$-space including 0). This assumption, called the spectrum condition, allows us to define detectors which are approximately associated to a region $\mathcal{O}$ in position space and to some window $\Delta$ in $p$-space indicating the minimal energy-momentum of the 'atomic object' needed for the response of the detector. Any element

$$P = L^*L \quad \text{with} \quad \|P\| = 1, \quad L = A(f) \tag{4}$$
represents such a detector if we start with $A \in \mathcal{A}(\mathcal{O})$ and choose the function $f$ so that (apart from small negligible tails) the support in $x$-space is a small region around the origin and its Fourier transform is practically zero outside of the region $\Delta$. Of course this does not yield a precise localization or momentum transfer but this is not relevant if we think of a detector as a macroscopic body. Let us note that $P$ will in general not be a projector but this is not necessary either, because we need not consider the negation, an instrument which indicates with certainty that no event has happened.

The above characterization of a detector does not tell us what the detector detects. But we discover that such additional information is not needed a priori. If the net is given and a vacuum state exists (the spectrum condition), then we have the tools to analyze the physical content of the theory by studying the response of coincidence arrangements, represented by products of $P_k$ belonging to mutually space-like situated localization regions, in any state. A single particle state, for instance, can be defined as a state which is 'simply localized at all times', i.e. never capable of producing a coincidence of two (space-like separated) detectors:

$$\omega(P_1 P_2) = 0 \tag{5}$$
for any such choice of the $P_k$ ($k = 1, 2$), but $\omega(P) \neq 0$ for some $P$. For further elaborations see ref. 6.

A net satisfying only the requirements mentioned so far need not yield a physically reasonable theory. It may, for instance, describe no particles at all or a non-denumerable number of different types. Further properties are needed. Some necessary conditions are known which concretize the structure considerably and relate to various physical aspects ranging from the appearance of charge quantum
numbers in particle physics to properties of thermodynamic equilibrium states (refs. 6, 7). But we do not know yet how to formulate restrictive conditions powerful enough to define a specific net, let alone the ambitious aim of constructing a net whose physical content is corroborated by experiments.

It is my personal conviction that in this step the local gauge principle plays a crucial role. This assessment stems partly from progress in theoretical high energy physics in the past decades and partly from my belief in simplicity and naturalness of fruitful basic concepts. The principle mentioned tells us that we should not try to focus on global symmetries. In a local theory the symmetries should only govern the structure in the small and the comparison of their action in different regions needs additional information which is called a 'connection' because the comparison depends on the way we pass from one region to the other. In the two important classical field theories which have proved their worth for physics, Maxwell's electrodynamics and Einstein's general relativity, this principle is encoded. In the former it was recognized rather late and refers to the gauge symmetry related to electric charge; in the latter it was one of the guiding principles and refers to the Poincaré symmetry of space-time. The Lorentz part, which keeps one point in space-time fixed, is reduced to a local symmetry for the tangent space at this point; the translations are replaced by the connection.

Quantum physics as we know and use it is anchored on the uncritical acceptance of space-time as an arena in which we can place instruments, an arena with known geometry including a causal structure. Some aspects of the theory depend also on the existence of a global symmetry for this geometry.^a If we lose this anchor completely we enter an area in which the conceptual structure and mathematical formalism of quantum physics cannot persist.
In this area the problem mentioned at the end of section 2 and some of the questions addressed in the next section may become imperative. So, to stay on the present level, we wish to keep global Poincaré symmetry and only reduce the internal symmetries, relating to the charge structure, to local significance needing the definition of a connection. In a classical field theory the formalism of Yang-Mills theories, generalizing electrodynamics to non-Abelian local internal symmetry groups, is well understood, using the notions of sections and connections in a fiber bundle. The transfer of this formalism to quantum theory is highly nontrivial and, in my opinion, not yet adequately understood. If we use the approach via algebras of observables sketched above then the incorporation of the additional structure due to (not directly observable) local internal symmetries is obscured by the singular nature of points and lines used in the classical case. To handle this we need knowledge about the short distance behavior (ultraviolet limit) of the theory. A few tentative suggestions concerning the notion of a quantum connection are given in ref. 7. I consider the clear understanding of how local internal symmetries can be incorporated in a well defined mathematical structure as one of the most important immediate aims on which many subsequent developments may hinge. It constitutes, of course, a hybrid theory since the global nature of the geometric symmetries is kept. So it may not be of primary importance to clarify whether the continuum limit really exists.

^a Quantum physics in 'curved space-time' (representing a given, external gravitational field) retains the first part of these requirements. This means that the net structure of local algebras with the properties (i), (ii) persists. The loss of (iii) implies that the spectrum condition has to be replaced. A considerable amount of work has been devoted to this problem but we shall not discuss it here; it would be beyond the scope of this paper.

4 Retrospective and Perspectives
Comparing the picture sketched so far with the discussions on the interpretation of quantum theory seventy years ago we may note: 1. The 'language of classical physics' stressed so much by Bohr as indispensable for the observer (to enable him to tell what was done and learned) remains an essential ingredient but, if we disregard questions of convenience, it may be reduced to the description of geometric relations in the placement of various macroscopic bodies and the coarse events observed in space and time. All further information is contained in the mathematical structure. The correspondence principle needed to map the description into the realm of mathematical symbols is provided by the reference to classical space-time and its geometric symmetry on both sides. It is the correspondence (1) together with the action of the translation group, needed to characterize the passive nature of detectors by (4). Apart from this the global symmetry of space-time is needed in two respects. In a single experiment because it studies the statistical relations in an ensemble of many individual event patterns which occur at different times and in the communication with other observers who would like to test the results in a different region of space-time. 2. The indivisibility of a process emphasized by Bohr leads to the concept of an event as an irreducible unit and it manifests itself also in the holistic aspect of the causal relations between events. The isolation of an individual process as a distinguishable, coherent part in the history of the universe, a pattern of events which can be considered by itself without mentioning its ties with other parts, depends to some extent on the choice of the observer of how much he wants to consider but this choice is limited by the requirement that it must lead to a well defined conspicuous probability assignment for the total process, a requirement which can be precisely fulfilled only in a steady environment. 3. 
The distinction between possibilities and facts which appears to be unavoidable in a formulation of indeterministic laws implies a distinction between future and past. Bohr mentions the 'essential irreversibility inherent in the very concept of observation'. If the term observation does not mean that the ultimate responsibility for deciding what constitutes a fact is delegated to the consciousness of an individual human being (footnote: I think this would be too unreliable to be useful for the purposes of physics) then we must accept the essential irreversibility inherent in the concept of an event. This endows the 'arrow of time' with an intrinsic significance in the physical theory and corresponds to a picture of reality as evolving in successive steps of a process with a moving boundary, separating past facts from future possibilities. It corresponds to the picture drawn by the philosopher A.N. Whitehead [8]. This does not conflict with the existence of a time reversal symmetry of the theory which describes a symmetry in the probability assignments for processes. The significance of the arrow of time is encoded in the existing theory by
the spectrum condition for the energy-momentum of states entering in the characterization of detectors (which provide one contribution for the probability of an event). The time reversal operator, being anti-unitary, does not change the sign of the energy.
Turning now to perspectives for future development of the theory, we might take a few hints from the preceding discussion. First, that all symmetries should be considered as local but that we should not associate the meaning of local with a point in a space-time continuum but with a possible event. A pattern of events with a web of causal ties between them bears some analogy to a section in a fiber bundle whose base space is the set of events and the typical fiber is a direct sum of representations of the symmetry group. The causal ties provide the connection. The dynamical law must then describe the probability assignment for different possibilities of growth of such a pattern in the evolution process in which possibilities turn into facts and the boundary between past and future changes. Included in this task is the determination of the subset of representations in the fiber of an event, the generalized eigenvalue problem yielding the relation between masses, spins and charge quantum numbers. I shall not try to elaborate on the many questions connected with such a picture and its relation to existing formalism. This is beyond the scope of this paper and the capabilities of its author.
References
1. Ph. Blanchard and A. Jadczyk (eds.), Quantum Future, Proc. Przesieka Conf. 1997, Springer Verlag, Heidelberg, 1999.
2. R. Bonifacio (ed.), Mysteries, Puzzles and Paradoxes in Quantum Mechanics, Proc. Lake Garda Conf. 1998, AIP Conf. Proc. 461.
3. Proc. Lake Garda Conf. 1999, to appear.
4. R. Omnes, The Interpretation of Quantum Mechanics, Princeton University Press, 1994.
5. J.F. Clauser and M.A. Horne, Phys. Rev. D 10, 526 (1974).
6. R. Haag, Local Quantum Physics, second edition, Springer Verlag, Heidelberg, 1996.
7. D. Buchholz and R. Haag, The quest for understanding in relativistic quantum physics, preprint hep-th/9910243, to appear in J. Math. Phys., special issue.
8. A.N. Whitehead, Process and Reality, Macmillan Publishing Co., 1927.
WHAT GOOD ARE QUANTUM FIELD THEORY INFINITIES?
ROMAN JACKIW
Center for Theoretical Physics, Massachusetts Institute of Technology, Cambridge, MA 02139-4307
A lesson for the new millennium from quantum field theory: Not all field-theoretic infinities are bad. Some give rise to finite, symmetry-breaking effects, whose consequences are observed in Nature.
Quantum field theory is the most successful theoretical structure in physics, with applications that range from the short distances of subatomic particles to the microscopic dimensions characterizing atomic, chemical, and condensed matter physics, and on to the astronomical distances where quantum field theory fuels "inflation" - a speculative but completely physical analysis of early universe cosmology. Remarkably, no experimental observation has contradicted the predictions that are made by appropriate field theoretical models for the relevant phenomena. When accurate calculation is feasible and precision experiments are available, numerical agreement between theory and experiment extends to many significant places, as for example in the ground-state energies of simple atoms like hydrogen and helium, or in the magnetic moments of electrons and muons.
Nevertheless, this gloriously successful invention of the human mind is logically defective in that some well-posed questions cannot be answered - a computation that should resolve the question can yield ambiguous or meaningless answers. This happens because the available methods of calculation encounter infinities that either persist, leading to meaningless results, or cancel among themselves, leaving ambiguous, undetermined "finite" parts. There is no reason to suppose that this defect should be attributed to the method of (approximate) computation - it appears to be intrinsic to interacting quantum field theory when excitations are point particles and interactions are local. (More specifically, I am referring to ultraviolet infinities, which arise because various integrals over intermediate energies diverge at their high-energy end, which corresponds to short distances in position space. There are also other infinities, like diverging perturbative series or infrared/large-distance singularities associated with long-range forces.
But these infinities are less troublesome, because they are attributed to the approximation method and are not viewed as intrinsic defects of quantum field theory.)
For physically relevant models, but not including gravity theory, it has been possible to isolate the infinities by the "renormalization" procedure, which hides them and also permits unambiguous calculation of quantities not contaminated by the infinities. Within this framework definite numerical results have been obtained, which in principle explain all observed fundamental processes. (Failure to tame infinities in quantum gravity has thus far been irrelevant for practical purposes, because all presently observed manifestations of gravitational forces are described by the classical Newton-Einstein theory.)
In spite of the great success of quantum field theory, its infinities notwithstanding, there are many who remain unconvinced by the pragmatism of renormalization. Dirac and Schwinger, who count among the creators of quantum field theory and renormalization theory, respectively, ultimately rejected their constructs because of the infinities. But even those who accept renormalization disagree about its ultimate efficacy at well-defining a theory. Some argue that sense can be made only of "asymptotically free" renormalizable field theories - in these theories the interaction strength decreases with increasing energy. On the contrary, it is claimed that asymptotically nonfree models, like electrodynamics and $\phi^4$-theory, do not define quantum theories, even though they are renormalizable - it is said "they do not exist." Yet electrodynamics is the most precisely verified quantum field theory, while the $\phi^4$-model is a necessary component of the "standard model" for elementary particle interactions, which thus far has met no experimental contradiction.
The ultraviolet infinities appear as a consequence of space-time localization of excitations and of their interactions. (Sometimes it is claimed that field-theoretic infinities arise from the unhappy union of quantum theory with special relativity. But this does not describe all cases - later I shall discuss a nonrelativistic, ultraviolet-divergent, and renormalizable field theory.) Therefore choosing models with extended excitations and interactions provides a way for avoiding ultraviolet infinities. These days "string theory" is a model with precisely such extended features, and all quantum effects - including gravitational ones - are ultraviolet finite. This very desirable state of affairs has persuaded many that fundamental physical theory in the next millennium should be based on the string paradigm (generalized to encompass even more extended structures, like membranes and so on).
This will replace quantum field theory, which although marred by its ultraviolet infinities has served us well in the twentieth century.
My goal in this essay is to argue that at least some of the divergences of quantum field theory must not be viewed as unmitigated defects. On the contrary, they convey crucially important information about the physical situation, without which most of our theories would not be physically acceptable. The stage where my unconventional considerations play a role is that of symmetry, symmetry breaking, and conserved quantum numbers, so next I have to review these ideas.
Physicists are mostly agreed that ultimate laws of Nature enjoy a high degree of symmetry. Presence of symmetry implies absence of complicated and irrelevant structure, and our conviction that this is fundamentally true reflects an ancient aesthetic prejudice - physicists are happy in the belief that Nature in its fundamental workings is essentially simple. Moreover, there are practical consequences of the simplicity entailed by symmetry - it is easier to understand the predictions of physical laws. For example, working out the details of very-many-body motion is beyond the reach of actual calculations, even with the help of computers. But taking into account the symmetries that are present allows understanding at least some aspects of the motion, and charting regularities within it.
Symmetries bring with them conservation laws - an association that is precisely formulated by Noether's theorem. Thus time-translation symmetry, which states that physical laws do not change as time passes, ensures energy conservation; space-translation symmetry, the statement that physical laws take the same form at different spatial locations, ensures momentum conservation. For another example,
we note that the quantal description makes use of complex numbers. But physical quantities are real, so complex phases can be changed at will, without affecting physical content. This invariance against phase redefinition, called gauge symmetry, leads to charge conservation. The above examples show that symmetries are linked to constants of motion. Identifying such constants on the one hand satisfies our urge to find regularity and permanence in natural phenomena, and on the other hand we are provided with useful markers for ordering physical data.
Moreover, a large degree of symmetry in the mathematical formulation of physically successful quantum field theory models is desirable not only aesthetically but also practically. Symmetry facilitates unraveling the consequences of the complicated dynamical model; more importantly, the presence of symmetry is required for a successful renormalization of the infinities, so that unambiguous answers can be extracted from the formalism.
However, in spite of our preference that descriptions of Nature be enhanced by a large amount of symmetry and characterized by many conservation laws, we must recognize that actual physical phenomena rarely exhibit overwhelming regularity. Therefore, at the very same time that we construct a physical theory with intrinsic symmetry, we must find a way to break the symmetry in physical consequences of the model. Progress in physics can be frequently seen as the resolution of this tension.
In classical physics, the principal mechanism for symmetry breaking, realized already within Newtonian mechanics, is through boundary and initial conditions on dynamical equations of motion. For example, radially symmetric dynamics for planetary motion allows radially nonsymmetric, noncircular orbits with appropriate initial conditions. But this mode of symmetry breaking still permits symmetric configurations - circular orbits, which are rotationally symmetric, are allowed.
In quantum mechanics, which anyway does not need initial conditions to make physical predictions, we must find mechanisms that prohibit symmetric configurations altogether.
In the simplest, most direct approach to symmetry breaking, we suppose that in fact dynamical laws are not symmetric, but that the asymmetric effects are "small" and can be ignored "in first approximation." Familiar examples are the breaking of rotational symmetry in atoms by an external electromagnetic field or of isospin symmetry by the small electromagnetic interaction. However, this explicit breaking of symmetry is without fundamental interest for the exact and complete theory; we need more intrinsic mechanisms that work for theories that actually are symmetric.
A more subtle idea is spontaneous symmetry breaking, where the dynamical laws are symmetric, but only asymmetric configurations are actually realized (because the symmetric ones are energetically unstable). This mechanism, urged on particle physicists by Heisenberg, Anderson, Nambu, and Goldstone, is readily illustrated by the potential energy profile possessing left-right symmetry and depicted in the Figure. The left-right symmetric value at the origin is a point of unstable equilibrium; stable equilibrium is attained at one of the two reflection-unsymmetric points ±a. Moreover, in quantum field theory, the energy barrier separating the two asymmetric configurations is infinite and no tunneling occurs between them. Once the system settles in one or the other location, left-right parity is absent. One
says that the symmetry of the equations of motion is "spontaneously" broken by the stable solution.
[Figure: energy density plotted against the field configuration, with the horizontal axis marked at -a, 0, and +a.]
Left-right symmetric energy density. The symmetric point at 0 is energetically unstable. Stable configurations are at ±a. Because field theory is defined in an infinite volume, the finite energy density separating ±a produces an infinite energy barrier and tunneling is suppressed. The system settles into state +a or -a and left-right symmetry is spontaneously broken.
While the predictions of a theory with spontaneously broken symmetry no longer follow the patterns that one would find if the symmetry were present in the solutions, one important benefit of the symmetry remains: the renormalization procedure is unaffected. So the mechanism of spontaneous symmetry breaking accomplishes the phenomenologically desired reduction of formal symmetries without endangering renormalization, but it does not reduce them enough.
Fortunately there exists a further, even more subtle mode of symmetry breaking, with which we can further suppress symmetries, thereby bringing our theories in accord with observed phenomena. Here one crucially relies on the various ultraviolet infinities of local quantum field theory, for which the renormalization procedure (needed to make sense of the theory) cannot be carried out in a manner consistent with the symmetry. Nevertheless the symmetry breaking effects are finite, even though they arise from infinities.
This mode of symmetry breaking is called anomalous or quantum mechanical, and in order to explain it, let me begin by recalling that the quantum revolution did not erase our reliance on the earlier, classical physics. Indeed, when proposing a theory, we begin with classical concepts and construct models according to the rules of classical, prequantum physics. We know, however, such classical reasoning is not in accord with quantum reality. Therefore, the classical model is reanalyzed by the rules of quantum physics (which comprise the true laws of Nature), that is, the classical model is quantized.
Differences between the physical pictures drawn by a classical description and a quantum description are of course profound. To mention the most dramatic, we recall that dynamical quantities are described in quantum mechanics by operators, which need not commute. Nevertheless, one expects that some universal concepts transcend the classical/quantal dichotomy, and enjoy rather the same role in quantum physics as in classical physics.
For a long time it was believed that symmetries and conservation laws of a theory are not affected by the transition from classical to quantum rules. For example, if a model possesses translation and gauge invariance on the classical level, and consequently energy/momentum and charge are conserved classically, it was believed that after quantization the quantum model is still translation and gauge invariant so that the energy/momentum and charge operators are conserved within quantum mechanics, that is, they commute with the quantum Hamiltonian operator. But now we know that in general this need not be so. Upon quantization, some symmetries of classical physics may disappear when the quantum theory is properly defined in the presence of its infinities. Such tenuous symmetries are said to be anomalously broken; although present classically, they are absent from the quantum version of the theory, unless the model is carefully arranged to avoid this effect.
The nomenclature is misleading. At its discovery, the phenomenon was unexpected and dubbed "anomalous." By now the surprise has worn off, and the better name today is "quantum mechanical" symmetry breaking.
Anomalously or quantum mechanically broken symmetries play several and crucial roles in our present-day physical theories. In some instances they save a model from possessing too much symmetry, which would not be in accord with experiment.
In other instances the desire to preserve a symmetry in the quantum theory places strong constraints on model building and gives experimentally verifiable predictions; more about this later [1].
Now I shall describe two specific examples of the anomaly phenomenon. Consider first massless fermions moving in the background of an electromagnetic field. Massive, spin-1/2 fermions possess two spin states - up and down - but massless fermions can exist with only one spin state (out of two), called a helicity state, in which spin is projected along (or against) the direction of motion. So the massless fermions with which we are here concerned carry only one helicity and these are an ingredient in present-day theories of quarks and leptons. Moreover, they also arise in condensed matter physics, not because one is dealing with massless, single-helicity particles, but because a well-formulated approximation to various many-body Hamiltonians can result in a first-order matrix equation that is identical to the equation for single-helicity massless fermions, that is, a massless Dirac-Weyl equation for a spinor $\Psi$.
If we view the spinor field $\Psi$ as an ordinary mathematical function, we recognize that it possesses a complex phase, which can be altered without changing the physical content of the equation that $\Psi$ obeys. We expect therefore that this instance of gauge invariance implies charge conservation. However, in a quantum field theory
the charge operator $Q$ is not conserved; rather
\[ \frac{dQ}{dt} = \frac{i}{\hbar}[H, Q] \propto \int_{\text{volume}} \mathbf{E} \cdot \mathbf{B}, \]
where $\mathbf{E}$ and $\mathbf{B}$ are the background electric and magnetic fields in which our massless fermion is moving - gauge invariance is lost!
One way to understand this breaking of symmetry is to observe that our model deals with massless fermions and conservation of charge for single-helicity fermions makes sense only if there are no fermion masses. But quantum field theory is beset by its ultraviolet infinities, which must be controlled in order to do a computation. This is accomplished by regularization and renormalization, which introduces mass scales for the fermions, and we see that the symmetry is anomalously broken by the ultraviolet infinities of the theory.
The phase-invariance of single-helicity fermions is called chiral (gauge) symmetry, and chiral symmetry has many important roles in the standard model, which involves many kinds of fermion fields, corresponding to the various quarks and leptons. In those channels where a gauge vector meson couples to the fermions, chiral symmetry must be maintained to ensure gauge invariance. Consequently, fermion content must be carefully adjusted so that the anomaly disappears. This is achieved because the proportionality constant in the above failed conservation law involves a sum over all the fermion charges, $\sum_n q_n$, so if that quantity vanishes the anomaly is absent. In the standard model the sum indeed vanishes, separately for each of the three fermion families. For a single family this works out as follows:
three quarks           q_n = 2/3     =>    2
three quarks           q_n = -1/3    =>   -1
one charged lepton     q_n = -1      =>   -1
one neutrino lepton    q_n = 0       =>    0
                       sum_n q_n     =     0
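The arithmetic in the table can be checked mechanically. The following is a minimal sketch in Python using exact rational arithmetic; the labels "up-type" and "down-type" and the function names are illustrative assumptions, not part of the original text, and the multiplicity of three is taken directly from the table's "three quarks" entries.

```python
from fractions import Fraction

# One fermion family as listed in the table: (label, multiplicity, charge q_n).
# Labels are hypothetical names chosen for this sketch; the multiplicities and
# charges are the ones quoted in the table.
family = [
    ("up-type quark", 3, Fraction(2, 3)),
    ("down-type quark", 3, Fraction(-1, 3)),
    ("charged lepton", 1, Fraction(-1)),
    ("neutrino", 1, Fraction(0)),
]


def contributions(fermions):
    """Return each entry's total contribution (multiplicity times charge)."""
    return {label: count * charge for label, count, charge in fermions}


def charge_sum(fermions):
    """Return the sum of charges over all fermions, counted with multiplicity."""
    return sum(count * charge for _, count, charge in fermions)


if __name__ == "__main__":
    for label, value in contributions(family).items():
        print(f"{label:16s} {value}")
    print("sum of charges:", charge_sum(family))
```

Exact fractions are used instead of floating point so that the vanishing of the sum is an identity rather than an approximate cancellation.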
In channels to which no gauge vector meson couples, there is no requirement that the anomaly vanish, and this is fortunate. A theoretical analysis shows that chiral gauge invariance in the up-down quark channel prohibits the two-photon decay of the neutral pion (which is composed of up and down quarks). But the decay does occur with the invariant decay amplitude of 0.025 ± 0.001 GeV$^{-1}$. Before anomalous symmetry breaking was understood, this decay could not be fitted into the standard model, which seemed to possess the decay-forbidding chiral symmetry. Once it was realized that the relevant chiral symmetry is anomalously broken, this obstacle to phenomenological viability of the standard model was removed. Indeed since the anomaly is completely known, the decay amplitude can be completely calculated (in the approximation that the pion is massless) and one finds 0.025 GeV$^{-1}$, in excellent agreement with experiment.
We must conclude that Nature knows about and makes use of the anomaly mechanism. On the one hand fermions are arranged into gauge-anomaly-free representations, and the requirement that anomalies disappear "explains" the charges
of elementary fermions. On the other hand the pion decays into two photons because of an anomaly in an ungauged channel. It is therefore paradoxical but true that in local quantum field theory these phenomenologically desirable results are facilitated by ultraviolet divergences, which give rise to finite symmetry anomalies, derived from infinities.
The observation that infinities of quantum field theory lead to anomalous symmetry breaking allows comprehending a second example of quantum-mechanical breaking of yet another symmetry - scale invariance. Like the space-time translations mentioned earlier, which lead to energy-momentum conservation, scale transformations also act on space-time coordinates, but in a different manner. They dilate the coordinates, thereby changing the units of space and time measurements. Such transformations will be symmetry operations in models that possess no fundamental parameters with time or space dimensionality, and therefore do not contain an absolute scale for units of space and time. Our quantum chromodynamical (QCD) model for quarks is free of such dimensional parameters, and it would appear that this theory is scale invariant - but Nature certainly is not! The observed variety of different objects with different sizes and masses exhibits many different and inequivalent scales. Thus if scale symmetry of the classical field theory, which underlies the quantum field theory of QCD, were to survive quantization, experiment would have grossly contradicted the model, which therefore would have to be rejected. Fortunately, scale symmetry is quantum mechanically broken, owing to the scales that are introduced in the regularization and renormalization of ultraviolet singularities. Once again a quantum field-theoretic pathology has a physical effect, a beneficial one - an unwanted symmetry is anomalously broken, and removed from the theory.
A different perspective on the anomaly phenomenon comes from the path integral formulation of quantum theory, where one integrates over classical paths the phase exponential of the classical action:
\[ \text{Quantum Mechanics} \iff \int_{(\text{measure on paths})} e^{i(\text{classical action})/\hbar} \]
When the classical action possesses a symmetry, the quantum theory will respect that symmetry if the measure on paths is unchanged by the relevant transformation. In the known examples (chiral symmetry, scale symmetry) anomalies arise precisely because the measure fails to be invariant and this failure is once again related to infinities. The measure is an infinite product of measure elements for each point in the space-time where the quantum (field) theory is defined; regulating this infinite product destroys its apparent invariance.
Yet another approach to chiral anomalies, which arise in (massless) fermion theories, makes reference to the first instance of regularization/renormalization, used by Dirac to remove the negative-energy solutions to his equation. Recall that to define a quantum field theory of fermions, it is necessary to fill the negative-energy sea and to renormalize the infinite mass and charge of the filled states to zero. In modern formulations this is achieved by "normal ordering", but for our purposes it is better to remain with the more explicit procedure of subtracting the infinities, that is, renormalizing them.
It can then be shown that in the presence of an external gauge field, the distinction between "empty" positive-energy states and "filled" negative-energy states cannot be drawn in a gauge-invariant manner, for massless, single-helicity fermions. Within this framework, the chiral anomaly comes from the gauge noninvariance of the infinite negative-energy sea. Since anomalies have physical consequences, we must assign physical reality to this infinite negative-energy sea.
Actually, in condensed matter physics, where a Dirac-type equation governs electrons, owing to a linearization of dynamical equations near the Fermi surface, the negative-energy states do have physical reality. They correspond to filled, bound states, while the positive energy states describe electrons in the conduction band. Consequently, chiral anomalies also have a role in condensed matter physics, when the system is idealized so that the negative-energy sea is taken to be infinite.
In this condensed matter context another curious, physically realized, and infinity-driven phenomenon has been identified. When the charge of the filled negative states is renormalized to zero, one is subtracting an infinite quantity, and rules have to be agreed upon so that no ambiguities arise when infinite quantities are manipulated. With this agreed-upon subtraction procedure, the charge of the vacuum is zero, and filled states of positive energy carry integer units of charge. Into the system one can insert a soliton - a localized structure that distinguishes between different domains of the condensed matter. In the presence of such a soliton, one needs to recalculate charges using the agreed-upon rules for handling infinities and one finds, surprisingly, a noninteger result, typically half-integer: the negative-energy sea is distorted by the soliton to yield a half-unit of charge.
The existence of fractionally charged states in the presence of solitons has been experimentally identified in polyacetylene. We thus have another example of a physical effect emerging from infinities of quantum field theory [2].
Let me conclude my qualitative discussion of anomalies with an explicit example from quantum mechanics, whose wave functions provide a link between particle and field-theoretic dynamics. My example also dispels any suspicion that ultraviolet divergences and the consequent anomalies are tied to the complexities of relativistic quantum field theory. The nonrelativistic example shows that locality is what matters.
Recall first the basic dynamical equation of quantum mechanics: the time independent Schrodinger equation for a particle of mass $m$ moving in a potential $V(r)$ with energy $E$:
\[ \left( -\nabla^2 + \frac{2m}{\hbar^2} V(r) \right) \psi(r) = \frac{2m}{\hbar^2} E\, \psi(r). \]
In its most important physical applications, this equation is taken in three spatial dimensions and $V(r)$ is proportional to $1/r$ for the Coulomb force relevant in atoms. Here we want to take a different model with potential that is proportional to the inverse square, so that the Schrodinger equation is presented as
\[ \left( -\nabla^2 + \frac{\lambda}{r^2} \right) \psi(r) = k^2 \psi(r), \qquad k^2 = \frac{2m}{\hbar^2} E. \]
In this model, transforming the length scale is a symmetry: because the Laplacian scales as $r^{-2}$, $\lambda$ is dimensionless and in the above there is no intrinsic unit of length.
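The scaling statement can be made explicit in one line. The following is a sketch of the standard dilation argument, with the dimensionless coupling written as $\lambda$ and a hypothetical rescaled wave function $\psi_a$ introduced only for this illustration:

```latex
% Assume \psi solves the inverse-square equation at wave number k:
%   (-\nabla^2 + \lambda / r^2) \psi(r) = k^2 \psi(r).
% Define the dilated function \psi_a(r) = \psi(a r), with a > 0.
\begin{align*}
\left( -\nabla^2 + \frac{\lambda}{r^2} \right) \psi_a(r)
  &= a^2 \left[ \left( -\nabla^2 + \frac{\lambda}{r^2} \right) \psi \right](a r) \\
  &= a^2 k^2\, \psi(a r)
   = (a k)^2\, \psi_a(r).
\end{align*}
% Both the Laplacian and 1/r^2 pick up the same factor a^2, so \psi_a solves
% the same equation with the same \lambda and with k replaced by a k.
```

Solutions at different energies are therefore dilated copies of one another, which is why a dimensionless quantity computed from them cannot depend on $k$.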
A consequence of scale invariance is that the scattering phase shifts and the S matrix, which in general depend on energy, that is, on $k$, are energy independent in scale-invariant models. And indeed when the above Schrodinger equation is solved, one verifies this prediction of the symmetry by finding an energy-independent S matrix. Thus scale invariance is maintained in this example - there are no surprises.
Let us now look to a similar model, but in two dimensions with a $\delta$-function potential, which localizes the interaction at a point:
\[ \left( -\nabla^2 + \lambda\, \delta^2(r) \right) \psi(r) = k^2 \psi(r). \]
Since in two dimensions the two-dimensional $\delta$-function scales as $1/r^2$, the above model also appears scale invariant; $\lambda$ is dimensionless. But in spite of the simplicity of the local contact interaction, the Schrodinger equation suffers a short-distance, ultraviolet singularity at $r = 0$, which must be renormalized. Here is not the place for a detailed analysis, but the result is that only the s-wave possesses a nonvanishing phase shift $\delta_0$, which shows a logarithmic dependence on energy:
\[ \cot \delta_0 = \frac{2}{\pi} \ln kR + \frac{1}{\lambda}. \]
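The energy dependence of this phase shift can be seen numerically. The sketch below evaluates the quoted formula in Python; the function names and the sample values of $k$, $R$, and $\lambda$ are illustrative assumptions chosen only to show that rescaling $k$ by a factor $a$ shifts $\cot \delta_0$ by $(2/\pi) \ln a$, whereas a scale-invariant model would give no shift.

```python
import math


def cot_delta0(k, R, lam):
    """Evaluate cot(delta_0) = (2/pi) ln(k R) + 1/lam, the formula quoted above."""
    return (2.0 / math.pi) * math.log(k * R) + 1.0 / lam


def delta0(k, R, lam):
    """Return the phase shift in the interval (0, pi) from its cotangent."""
    return math.atan2(1.0, cot_delta0(k, R, lam))


def shift_under_rescaling(k, a, R, lam):
    """Change in cot(delta_0) when the wave number is rescaled from k to a*k."""
    return cot_delta0(a * k, R, lam) - cot_delta0(k, R, lam)


if __name__ == "__main__":
    # Hypothetical sample values, in units where lengths are measured in R.
    R, lam = 1.0, 0.5
    for k in (0.1, 1.0, 10.0):
        print(f"k = {k:5.1f}   delta_0 = {delta0(k, R, lam):.4f} rad")
    print("shift for a = 10:", shift_under_rescaling(0.7, 10.0, R, lam))
```

The shift depends only on the rescaling factor, not on the starting $k$, which is the signature of a purely logarithmic violation of scale invariance.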
$R$ is a scale that arises in the renormalization, and scale symmetry is decisively and quantum mechanically broken. The scattering is nontrivial solely as a consequence of anomalously broken scale invariance. (It is easily verified that the two-dimensional $\delta$-function in classical theory, where there are no anomalies and it is scale invariant, produces no scattering.)
To make sense of the above phase shift in the limit $R \to \infty$, one must "renormalize" the bare coupling constant $\lambda$, allowing it to depend on $R$ in just such a way that $\cot \delta_0$ is $R$-independent (for large $R$). Alternatively, one recognizes that the S matrix $e^{2i\delta_0}$ possesses a pole, corresponding to a bound state with energy
\[ E_B = -\frac{\hbar^2}{2mR^2}\, e^{-\pi/\lambda}. \]
Therefore one may reexpress $\delta_0$ in terms of $E_B$, rather than $R$. With this substitution, dependence on $\lambda$ disappears (as it must since $\lambda$ is $R$-dependent) and the dimensionless (infinite) coupling constant $\lambda$ has been traded for a dimensional and physical parameter $E_B$:
\[ \cot \delta_0 = \frac{1}{\pi} \ln \frac{E}{|E_B|}. \]
Similar anomalous breaking of scale invariance occurs in relativistic field theory, and perhaps explains the appearance of a dimensional mass parameter in QCD as a replacement for the dimensionless, but renormalization dependent, coupling constant [3].
I believe that as the millennium draws to a close, and we look forward eagerly to the new physics ideas that will flourish in the new era, one very important lesson we should take from quantum field theory is not to banish all its infinities. Apparently the mathematical language with which we are describing Nature cannot account for all natural phenomena in a clear fashion. Recourse must be made to contradictory formulations involving infinities, which nevertheless lead to accurate
descriptions of experimental facts in finite terms. It will be most interesting to see how string theory and its evolutions, which purportedly are completely finite and consistent, will handle this issue, which has been successfully, if paradoxically, resolved in quantum field theory.
Acknowledgments
This work is supported in part by funds provided by the U.S. Department of Energy (D.O.E.) under contract #DE-FC02-94ER40818.
References
1. Anomalous symmetries in quantum field theory are discussed by R. Jackiw in S. Treiman, R. Jackiw, B. Zumino, and E. Witten, Current Algebra and Anomalies (Princeton University Press/World Scientific, Princeton, NJ/Singapore, 1985); by S. Adler in Lectures on Elementary Particles and Quantum Field Theory, S. Deser, M. Grisaru, and H. Pendleton, eds. (MIT Press, Cambridge, MA, 1970); and in a monograph on the subject by R. Bertlmann, Anomalies in Quantum Field Theory (Oxford University Press, Oxford, UK, 1996).
2. For more details on fractional charge see R. Jackiw and J.R. Schrieffer, "Solitons with Fermion Number 1/2 in Condensed Matter and Relativistic Field Theories," Nucl Phys B190 [FS3], 253 (1981); R. Jackiw, "Fractional Fermions," Comments Nucl Part Phys 13, 15 (1984).
3. Further discussion of this model and a pedagogical account of anomalies is by B. Holstein, "Anomalies for Pedestrians," Am J Phys 61, 142 (1993).
CONSTRUCTIVE QUANTUM FIELD THEORY

ARTHUR JAFFE
Harvard University, Cambridge, MA 02138, USA
E-mail: jaffe@claymath.org
We review the emergence of constructive quantum field theory, we discuss how it fits into the framework of mathematics and physics, and we point to a major unsolved question.
1 Background
The pioneering work of early non-relativistic quantum theory led to the understanding that quantum dynamics on Hilbert space is a comprehensive predictive framework for microscopic phenomena. From the Bohr atom, through the non-relativistic quantum theory of Schrödinger and Heisenberg, and the relativistic Dirac equation for hydrogen, agreement between calculation and experiment improved rapidly over time. The incorporation of special relativity and field theory into quantum theory extended the scope of perturbative calculations, and these were tested through precision measurements of spectra and magnetic moments.

Beginning in the 1940's, experimental tests of the Lamb shift and the anomalous magnetic moment of the electron detected effects that one can ascribe to fluctuations in quantum electrodynamics. These effects deviated numerically from the predictions arising from equations that describe a fixed number of particles, so they were accurate tests of the quantum field hypothesis. Today these experiments have evolved to yield quantitative agreement with the most precise observations and calculations achieved in physics. For example, the anomalous magnetic moment of the electron is known theoretically and experimentally to amazing precision: (g - 2)/2 = 0.001159652200(±40).

The success of this work, as well as the success of other less accurate, but compelling, predictions for weak and strong interactions, convinces us to accept quantum field theory as the correct physical arena to describe particle physics down to the Planck scale. But the success of relativistic field theory calculations and of perturbative renormalization also led to a logical puzzle: is there any physically-relevant, relativistic quantum field theory that is also mathematically consistent?
Put differently, can one give a mathematically complete example of any non-linear theory, relevant for the description of interacting particles, whose solutions incorporate relativistic covariance, positive energy, and causality? One must understand perturbative renormalization in order to resolve this problem, and have control over renormalization from a non-perturbative (or "exact") point of view. In fact, one needs to overcome sophisticated problems, such as whether a field theory may appear correct on a perturbative level, while it may have no meaning at a non-perturbative level. Doubts about quantum electrodynamics or scalar meson theory were raised early by Dyson and Landau. They recur from the point of view of the renormalization group in the work of Kadanoff and Wilson, as well as in the analysis of "asymptotic freedom" in the 1970's. In four dimensions, this has focused attention on finding a solution to a non-abelian Yang-Mills theory.

Assuming a positive answer to this existence question, can one then develop a calculational scheme to determine properties of such an example, both perturbatively and non-perturbatively? Strong-coupling calculations, as well as calculations near critical values of the coupling constants, have been the most elusive to understand. Thus one wants to understand both the quantitative structure of field theories, as well as qualitative features such as the dependence of the theories as functions on the space of coupling constant parameters.

At the same time, physicists have tried to make further theoretical progress through ambitious attempts to embed quantum field theory within a theory of strings, by which they hope to combine quantum theory with general relativity, and to predict the structure of space-time. There is also the appealing attempt to integrate non-commutative geometry into the picture. One would like to introduce the notion of quantization directly at the level of space-time, and to describe field theories on quantum space-time, rather than applying quantization to fields that live on a classical space-time. For the time being, all these methods remain beyond the realm of full understanding.

2 The Emergence of CQFT
Constructive quantum field theory (CQFT) was formulated forty years ago as an effort to find specific examples of non-linear quantum fields that fit within a mathematically complete description of quantum mechanics. Prior to the constructive field theory program, only a few exactly soluble relativistic field theories existed. Either they were free fields, or else they had solutions that could be expressed as functions of free fields (such as the Schwinger and Thirring models). In either case, they described (or appeared to describe) particles without interaction or scattering.

The emergence of constructive quantum field theory led to the direct attack on showing that solutions exist to the variational equations arising from particular Lagrangians. Constructive field theorists took perturbation theory as a reliable guide to the behavior of a particular equation. One could incorporate this information into the mathematical analysis, leading to the proof of exact results based on perturbative guidance.

The most basic questions raised by CQFT in the 1960's revolved about whether examples of quantum field theories exist within the axiomatic frameworks formulated by Wightman or by Haag and Kastler. The Wightman axioms describe the vacuum expectation values of products of fields that transform covariantly under the action of the group of inhomogeneous Lorentz transformations $\{a, \Lambda\}$, where $\Lambda$ designates a homogeneous Lorentz transformation and where $a$ denotes a space-time translation. Thus they involve constructing the fields […]

The unitary representation $U(a, \Lambda)$ of the Lorentz group of space-time symmetries determines a *-automorphism group of transformations of fields,

$$U(a,\Lambda)\,\varphi\,U(a,\Lambda)^{*}. \tag{1}$$

Scalar boson fields are characterized by the transformation property

$$U(a,\Lambda)\,\varphi(x)\,U(a,\Lambda)^{*} = \varphi(\Lambda x + a), \tag{2}$$

where $x = (\vec{x}, t)$ denotes a space-time point, and spinor fields or vector fields have their own transformation laws. The Haag-Kastler axioms deal with the fields (or bounded functions of the fields), along with a positive energy representation of the automorphism group […]

3 The First Examples
Over the following twelve years, 1965-1976, constructive field theorists finally achieved the non-perturbative construction of field theories with non-linear interaction in two-dimensional and in three-dimensional space-times. Both the Hamiltonian approach and the functional integral approach ultimately proved successful. In fact, using features of both methods turned out to be the most powerful point of view. Looked at in perspective, it is a shame that Symanzik abandoned his approach when he did.

Early progress on answering these questions in a two-dimensional space-time owed a great deal to the 1965 paper of Nelson⁶ showing stability for the $\varphi^4$ Hamiltonian of a field […] but does not in itself give uniqueness of that extension (essential self-adjointness), and its concomitant dynamics $\varphi(x,t) = e^{itH}\varphi(x,0)e^{-itH}$. The proof of essential self-adjointness, as well as the first non-trivial example of solutions to the relativistic field equation in a 2-dimensional Minkowski space-time $(x,t) \in M^2$, appeared in a series of papers written by this author jointly with Glimm.⁷,⁸,⁹,¹⁰ This hyperbolic, non-linear equation for $\varphi(x,t)$ has the form

$$\Box\varphi + \varphi + \lambda\varphi^3 = 0, \tag{3}$$

where $\Box$ denotes the wave operator $\Box\varphi = \varphi_{tt} - \varphi_{xx}$.

The equation (3) is known as the $\varphi^4_2$ wave equation, where the subscript denotes the dimension of space-time, and the quartic power signifies that the interaction energy density […]

$$[\varphi(x,0), \varphi_t(x',0)] = i\delta(x-x'). \tag{4}$$

Here $\delta$ denotes the Dirac measure. In addition one requires

$$[\varphi(x,0), \varphi(x',0)] = [\varphi_t(x,0), \varphi_t(x',0)] = 0. \tag{5}$$
The presence of the singularities in (4) clearly indicates the singularity of the solutions $\varphi$, and signals that the non-linear power $\varphi^3$ that occurs in (3) requires a special definition; this is a simple example of the phenomenon of "renormalization." In this case the "cube" is replaced by a cubic polynomial $\varphi^3 - 3c\varphi$, where $c$ is a divergent constant, and the coefficient 3 is conventional. This constant is suggested by lowest order perturbation theory. Let us consider a sequence of regularized equations as introduced above, parameterized by $\kappa$, and with non-linearity $\varphi_\kappa^3 - 3c_\kappa\varphi_\kappa$. This non-linearity converges as $\kappa \to \infty$ and $c_\kappa \sim \ln\kappa$, if we choose the coefficient of the logarithm according to lowest order perturbation theory. Thus the individual terms in

$$\varphi^3 = \lim_{\kappa\to\infty}\left(\varphi_\kappa^3 - 3c_\kappa\varphi_\kappa\right) \tag{6}$$

have no meaning, but their sum is defined by a well-behaved limit and provides the non-linearity in the wave equation (3). The root of the difficulty in establishing stability stems from the fact that the "normal-ordered" interaction energy, integrated over a compact domain $I$,

$$V_I = \frac{\lambda}{4}\int_I \left(\varphi^4 - 6c\varphi^2 + 3c^2\right) dx = \frac{\lambda}{4}\int_I :\!\varphi^4\!:\, dx, \tag{7}$$

is a densely defined operator, but it is unbounded from below. Here the colons : : denote substituting a Hermite polynomial for each monomial of $\varphi$, also called normal ordering. Stability is the statement that the operator $H(I) = H_0 + V_I$ is bounded from below, where $H_0$ is the Hamiltonian for the linear wave equation $\Box\varphi + \varphi = 0$.
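The normal-ordering prescription described above can be checked concretely in a one-variable toy setting. The sketch below is an illustration and not part of the construction in the text: it assumes that the field is replaced by a single centered Gaussian variable of variance c, builds the Wick-ordered monomials by the standard Hermite three-term recursion, and confirms the polynomials x³ − 3cx and x⁴ − 6cx² + 3c² that appear in the renormalized cube and in (7). The function names are hypothetical.

```python
from fractions import Fraction


def wick_monomial(n, c):
    """Coefficients (lowest degree first) of the Wick-ordered monomial :x^n:
    for a centered Gaussian variable of variance c, using the recursion
    :x^(k+1): = x * :x^k: - k * c * :x^(k-1):."""
    prev, cur = [Fraction(0)], [Fraction(1)]  # placeholder for k = -1, and :x^0: = 1
    for k in range(n):
        shifted = [Fraction(0)] + cur  # multiply :x^k: by x
        padded = prev + [Fraction(0)] * (len(shifted) - len(prev))
        nxt = [s - k * c * p for s, p in zip(shifted, padded)]
        prev, cur = cur, nxt
    return cur


def gaussian_moment(k, c):
    """E[x^k] for a centered Gaussian of variance c: (k-1)!! * c^(k/2) if k is even."""
    if k % 2:
        return Fraction(0)
    result = Fraction(1)
    for j in range(1, k, 2):
        result *= j * c
    return result


def expectation(coeffs, c):
    """Gaussian expectation of the polynomial with the given coefficients."""
    return sum(a * gaussian_moment(k, c) for k, a in enumerate(coeffs))


c = Fraction(5, 2)              # any positive variance works; this value is arbitrary
cube = wick_monomial(3, c)      # coefficients of x^3 - 3 c x
quartic = wick_monomial(4, c)   # coefficients of x^4 - 6 c x^2 + 3 c^2
```

In this toy model the Wick-ordered monomials of positive degree have zero Gaussian expectation. In the field theory the constant c diverges as the cutoff is removed, which the single-variable model does not capture.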
4 Quantum Theory as Statistical Physics

The introduction by Schwinger of Euclidean quantum field theory, and the realization by Kurt Symanzik that Euclidean fields have beautiful, Euclidean-covariant functional integral representations,⁴ led to a fascination with Euclidean phenomena. These functional integral representations yield a Feynman-Kac representation of the heat kernel $e^{-tH}$. They also have the interpretation of being a statistical mechanics average of classical fields (over a configuration space of Euclidean field configurations), weighted by the Boltzmann probability $e^{-S_E}$. Here $S_E$ denotes the classical action functional continued to Euclidean (imaginary) time. The Euclidean $n$-point Green's functions are the $n$th moments of the measure. Furthermore, Symanzik showed that a […] a quantum field acting on that space. This method assumes an a priori Markovian structure that can easily be verified for the free Euclidean field, and thus gave rise to the free Markov field. However, this structure is special to bosons, and it presented difficulty in verifying the Markov hypothesis for interacting fields. The Markov framework is sufficient, but it does not give an equivalence of Euclidean and Hilbert space methods. It was used in very interesting ways by Guerra, Rosen, and Simon.¹⁴,¹⁵

Osterwalder and Schrader discovered another solution based on properties of the Euclidean Green's functions.¹⁶ They showed that their axioms for Euclidean Green's functions are equivalent to the Wightman theory on Minkowski space. Following this discovery, Euclidean fields became the fundamental tool to investigate Minkowski field theory. The beautiful simplicity of this approach is that it relies on a positivity condition, and in many cases this positivity is easy to preserve in approximating (cutoff) field theories. Their condition of reflection (or Osterwalder-Schrader) positivity yields the existence of the Hilbert space $\mathcal{H}$, along with the existence of a positive-energy Hamiltonian, and an analytic continuation from Euclidean space to Minkowski space. In the case of a free bosonic field with two-point Euclidean Green's function $C$, with $P_+$ the orthogonal projection in $L^2$ onto $t > 0$, and with $\Theta$ the reflection in the $t = 0$ plane, the Osterwalder-Schrader positivity condition states

$$0 \le P_+\Theta C P_+. \tag{8}$$

It turns out¹⁷ that this positivity is a consequence of the classical operator monotonicity condition for Green's functions of the Laplacian on $P_+L^2(\mathbb{R}^d)$,

$$C_D \le C_N, \quad\text{where}\quad C = (-\Delta + m^2)^{-1}, \tag{9}$$

and where $C_D$ (respectively $C_N$) represents the Green's function of $-\Delta + m^2$ with Dirichlet (respectively Neumann) data on the $t = 0$ plane. The reflection positivity condition also can be established for interacting scalar, fermion, or gauge fields with local interactions. Since positivity is preserved under limits, appropriate convergence of Euclidean Green's functions yields positivity after removal of the cutoffs.

The Euclidean methods lead to mathematically-sound, functional-integral representations of the solutions to field theory problems. These representations often reflect underlying symmetries of the field theories. The Euclidean methods also apply to theories with fermions, at least for examples with interactions that are quadratic in the fermions.¹⁸ This is the case for free and for "Yukawa type" interactions, used extensively in physics.

These methods have been realized in the two-dimensional and three-dimensional examples. The explicit integral representations lend themselves to the non-perturbative analysis of the examples. Basically two methods were developed to analyze these functional integrals. The correlation inequality method¹⁵,¹¹ was suitable for a certain class of bosonic Lagrangians. It is based on the fact that in these examples the Euclidean Green's functions are positive and monotonic in the volume, as is often the case in bosonic classical statistical mechanics. The expansion method¹⁹,²⁰,¹¹ is based on developing convergent expansions in the coupling constant $\lambda$. This method applies for coupling constants $0 < \lambda$ sufficiently small. These expansions are not power series (which are known to diverge); however the $n$th term in these expansions is $O(\lambda^n)$ as $\lambda \to 0$. The $n$th term has the form $\lambda^n G_n(\lambda)$, where $|G_n(\lambda)| \le O(1)$ as $\lambda \to 0$. These expansion methods are not limited to bosonic Lagrangians, but they are limited to the coupling constant $\lambda$ being small. The expansion methods, when applicable, ultimately yield greater qualitative control over the solutions. In some cases both methods can be used, a great advantage. We return to these two methods in the following section.
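The remark that power series in the coupling constant diverge can be illustrated in a zero-dimensional caricature. This caricature is an assumption of the sketch, not one of the field theories discussed here: the functional integral is replaced by the ordinary integral Z(λ) = (2π)^(-1/2) ∫ exp(-x²/2 - λx⁴) dx. Its formal power series has the n-th term (-λ)ⁿ (4n-1)!!/n!, whose size eventually grows factorially, while low-order partial sums still approximate Z(λ) well for small λ. The function names and the numerical parameters are hypothetical choices.

```python
import math


def z_exact(lam, half_width=12.0, steps=200_000):
    """Trapezoid-rule value of (2*pi)^(-1/2) * integral of exp(-x^2/2 - lam*x^4)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-0.5 * x * x - lam * x ** 4)
    return total * h / math.sqrt(2.0 * math.pi)


def series_term(n, lam):
    """n-th term of the formal series: (-lam)^n * E[x^(4n)] / n!, E[x^(4n)] = (4n-1)!!."""
    double_factorial = 1
    for j in range(1, 4 * n, 2):
        double_factorial *= j
    return (-lam) ** n * double_factorial / math.factorial(n)


def partial_sum(order, lam):
    """Sum of the formal series up to and including the given order."""
    return sum(series_term(n, lam) for n in range(order + 1))


lam = 0.01
exact = z_exact(lam)
errors = [abs(partial_sum(order, lam) - exact) for order in range(41)]
best_order = min(range(len(errors)), key=errors.__getitem__)
```

The error first decreases with the order and then grows without bound, so the series is asymptotic rather than convergent. The convergent expansions described above avoid this behaviour because their n-th term is λⁿ times a bounded function of λ, not a fixed coefficient.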
5 The Wightman Axioms and a Mass Gap

The Osterwalder-Schrader construction shows that if a Euclidean functional integral satisfies reflection positivity and certain bounds on moments, then it yields a relativistic field theory. Furthermore spectral properties of the Hamiltonian and the momentum operators in the resulting field theory can be deduced from cluster properties of moments of the measure that determines the functional integral. Thus the Euclidean functional integrals became a natural and powerful tool for the study of the details of the spectrum of the Hamiltonian.

We assume that the Hamiltonian $H$ is positive, $0 \le H$, and furthermore that it has the normalized eigenvector $\Omega$ with eigenvalue 0, namely $H\Omega = 0$. The Hamiltonian $0 \le H$ is defined to have a mass gap if in addition $H$ has no spectrum in some interval $(0, m)$, where $m > 0$. The physical consequence of the existence of a mass gap is that the lightest particle described by the Hamiltonian $H$ has a mass greater than or equal to $m$. Assuming that the Hamiltonian $H$ and the momentum operator $P$ commute, we can define the mass operator as the positive square root $M = \sqrt{H^2 - P^2}$. The mass operator labels Lorentz-invariant hyperboloids in the energy-momentum spectrum, and in a covariant theory, a mass gap $(0, m)$ for $H$ means there also exists a mass gap $(0, m)$ in the spectrum of $M$.

Consider the situation where the unit vector $\Omega$ is a null vector (ground state) of $H$, and any vector $\chi \in \mathcal{H}$ satisfies the bound

$$\left|(\chi, e^{-tH}\chi) - |(\Omega,\chi)|^2\right| \le \|\chi\|^2 e^{-mt}, \tag{10}$$

for all $t \ge 0$. This situation is equivalent to the statement that 0 is a simple eigenvalue of $H$ with eigenvector $\Omega$, and that $H$ has a mass gap of magnitude $m$. In fact, we would like a sufficient condition on expectations to ensure the existence of a mass gap. One needs only to show that there exists a densely defined functional $f(\chi)$ on $\mathcal{H}$ with domain $\mathcal{D}(f) \ni \Omega$, such that

$$\left|(\chi, e^{-tH}\chi) - |(\Omega,\chi)|^2\right| \le |f(\chi)|^2 e^{-mt} \tag{11}$$

holds for all $\chi \in \mathcal{D}(f)$ and all $t > 0$. It then follows that 0 is a simple eigenvalue of $H$, and that a mass gap of magnitude $m$ exists.

Cluster expansions for field theory were developed and applied in the early 1970's.¹⁹,²⁰ This method was originally developed for the two-dimensional examples with vectors $\chi$ having the form $\chi = A\Omega$, where $A$ denotes a monomial function of spatially-averaged, time-zero fields and of heat kernels $e^{-s_jH}$, for $s_j > 0$. Denote this set of $A$'s by $\mathfrak{A}$. Finite linear combinations of vectors $\chi = A\Omega$, $A \in \mathfrak{A}$, are dense in $\mathcal{H}$. If we insert vectors $\chi = A\Omega$ into the expectation (11), we obtain a straightforward functional integral representation. The functional $f(\chi)$ is a norm on $\mathfrak{A}$, with the property $f(A\Omega) = \|A\|_a = \|A^*\|_a$; it is defined through a complicated inductive construction. We show that for $A \in \mathfrak{A}$, there exists a constant $m > 0$, such that for $t > 0$,

$$(\Omega, A^* e^{-tH} A\Omega) - |(\Omega, A\Omega)|^2 \le \|A\|_a^2\, e^{-mt}. \tag{12}$$
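The relation between exponential decay of expectations and a gap in the spectrum can be seen in a finite-dimensional toy model. The sketch below makes an assumption that is not in the text: H is a diagonal matrix with lowest eigenvalue 0 (the ground state) and smallest positive eigenvalue m. It checks a bound of the form (10) for a sample vector and recovers m from the large-time decay rate of the connected expectation. The function names and the numerical values are hypothetical.

```python
import math


def gap_bound_holds(energies, chi, times, tol=1e-12):
    """Check |(chi, e^{-tH} chi) - |(Omega, chi)|^2| <= ||chi||^2 e^{-m t}
    for a diagonal H with eigenvalues `energies`; energies[0] == 0 belongs to
    the ground state Omega and m is the smallest positive eigenvalue."""
    m = min(e for e in energies if e > 0)
    norm_sq = sum(abs(a) ** 2 for a in chi)
    vacuum_overlap_sq = abs(chi[0]) ** 2
    for t in times:
        expectation = sum(abs(a) ** 2 * math.exp(-t * e)
                          for a, e in zip(chi, energies))
        if abs(expectation - vacuum_overlap_sq) > norm_sq * math.exp(-m * t) + tol:
            return False
    return True


def connected(energies, chi, t):
    """(chi, e^{-tH} chi) - |(Omega, chi)|^2, computed without cancellation."""
    return sum(abs(a) ** 2 * math.exp(-t * e)
               for a, e in zip(chi[1:], energies[1:]))


def decay_rate(energies, chi, t1, t2):
    """Exponential decay rate of the connected expectation between t1 and t2."""
    ratio = connected(energies, chi, t2) / connected(energies, chi, t1)
    return -math.log(ratio) / (t2 - t1)


energies = [0.0, 1.5, 2.0, 3.7]          # toy spectrum with gap m = 1.5
chi = [0.6, 0.5 + 0.2j, -0.3, 0.4j]      # arbitrary sample vector
bound_ok = gap_bound_holds(energies, chi, [0.0, 0.1, 1.0, 5.0, 20.0])
rate = decay_rate(energies, chi, 30.0, 40.0)
```

In the field theory the decay estimate comes from the cluster expansion of a functional integral and the spectrum is not known in advance; the toy model only illustrates why such an estimate controls the gap.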
The consequence of these bounds is the proof of the existence of a mass gap, uniform for equations with the non-linear interaction restricted to a bounded domain. They also led to the proof of existence and certain regularity properties of the infinite volume limit.

The cluster expansion methods generate a convergent expansion of expectations of fields, for a sufficiently small, strictly-positive coupling constant $\lambda$. Consider an expectation in a $\varphi^4$-theory of a product of fields localized in a space-time region $\mathcal{O}'$, and with the $\varphi^4$-interaction localized in a region $\mathcal{O}$ containing $\mathcal{O}'$. One performs the expansion for fixed $\mathcal{O}$, but uses the form of the resulting terms to compare different $\mathcal{O}$, and to analyze the limit as $\mathcal{O} \nearrow \mathbb{R}^2$. The terms given by the expansion are expectations with the $\varphi^4$ interaction localized in a region $\mathcal{O}''$, where $\mathcal{O}' \subset \mathcal{O}'' \subset \mathcal{O}$. Furthermore, the magnitude of the sum of terms localized in $\mathcal{O}''$ is exponentially small in the size of $\mathcal{O}''$. This estimate is independent of $\mathcal{O}$, and is sufficiently strong to allow the comparison of different volumes $\mathcal{O}$. We therefore can estimate the convergence of the original expectation to a limit as the volume $\mathcal{O}$ of interaction tends to infinity.

With these methods, one could establish the first non-trivial example of the Wightman axioms.²⁰ Viewed differently, this body of work established the mathematical compatibility of quantum field theory with special relativity. While this work only applied in two-dimensional space-time, it marked the crossing of a major set of obstacles barring progress. In this case, it also showed that the non-perturbative treatment of the equations and their renormalization could be understood, at least for these examples, in terms of perturbation theory.

6 Three Dimensions

The extension of existence to three-dimensional Minkowski space-time for […] where […] (13)

Here […] called phase cell localization. This allowed one to analyze degrees of freedom associated with a given length scale, and to use these estimates inductively, to analyze degrees of freedom associated with twice the length scale. Phase cell localization ideas are related to ideas of Kadanoff and Wilson's renormalization group. While the paper establishing this result was finished in 1973,²¹ it resulted from an evolution of methods and ideas over about four years.

There are two physical renormalizations responsible for the difficulty in analyzing the three-dimensional phenomenon. Each corresponds to a mathematical difficulty that required new techniques to overcome. The first renormalization revolves around the Hamiltonian. As in two dimensions, the normal-ordered Hamiltonian (in the case that the interaction is confined to a bounded volume) is a densely defined bilinear form on Fock space. However, unlike two dimensions, this form does not yield a densely defined operator (as the continuity required by the Riesz representation theorem is not valid). The mathematical problem comes down to understanding how to modify the Hamiltonian in order to obtain a dense operator domain. In fact, to obtain a Hamiltonian operator one must add to the normal-ordered Hamiltonian three renormalization terms: a mass-renormalization term (that is quadratic in the time-zero field $\varphi$ and that diverges logarithmically as a function of a momentum cut-off $\kappa$), as well as two vacuum energy renormalization terms that are independent of the field $\varphi$. One of the constant terms is linearly divergent in the momentum cut-off $\kappa$, while the second constant term is logarithmically divergent in $\kappa$.

The second problem is associated with renormalization of the Hilbert space, namely renormalization of the state vectors on which the Hamiltonian acts. This arises because the second-order, linearly-divergent, vacuum-energy renormalization constant in the Euclidean action $S_2$ for the time interval $[0,t]$ can be written $S_2 = tE_2 + A_2 + o(1)$, as $t \to \infty$. It is the case for constants $\alpha_2$ and $\beta_2$, that $tE_2 \sim t\alpha_2\lambda^2\kappa$ is the linearly divergent vacuum energy renormalization of $tH$ as it occurs in the heat kernel $e^{-tH}$, and $A_2 \sim \beta_2\lambda^2\ln\kappa$ is a logarithmically-divergent remainder that is also time-independent. This remainder gives rise to a multiplicative renormalization of each wave function by the constant $e^{-A_2/2} \sim e^{-\beta_2\lambda^2\ln\kappa/2}$. The constant $A_2$ also forces a change of representation of the canonical commutation relations, and the limiting Hamiltonian ($\kappa \to \infty$) acts on a Hilbert space carrying a representation of the CCR that is unitarily inequivalent to the representation for free fields, namely the Fock representation.

This is quite different from the two-dimensional theory. The representation of the Heisenberg relations for the solution to the two-dimensional non-linear wave equation, restricted to any bounded space-time region, is locally unitarily equivalent to the representation for the free fields.⁹ This local equivalence of representations is known as the "locally Fock" property.

The analysis of these two effects in the three-dimensional theory took considerable work.²¹ Once that stability was in hand, the generalization of the cluster expansion method for small, positive $\lambda$ followed in the case of three-dimensional space-time,²² and it led to the first example of a non-trivial Wightman theory on $M^3$.
7 Digging Deeper

The focus of CQFT was and remains not only to establish existence, but also to develop methods aimed at establishing quantitative and qualitative properties of the particular examples. Thus constructive quantum field theorists did not only attempt to justify expected phenomena, but they also aimed at the broader exploration of physics at a fundamental level, consistent with historical precedents of mathematical integrity. This work also led to establishing physical properties of these examples, including many features of their particle spectrum, the description of scattering in these examples, and the qualitative behavior of the examples as a function of the coupling constants. In this section, we mention only a few of the many phenomena that have captured our imaginations in the above field theories, and about which mathematically complete results have been established.

7.1 Particles and Scattering

An initial question to answer concerns whether these quantum fields have particle states and whether the solutions to the field theory describe scattering of these particles. There are two standard text-book methods to recover scattering data from a field theory: the theory of Lehmann, Symanzik, and Zimmermann using Green's functions to construct S-matrix elements, and the alternate approach of Haag and Ruelle, based on a construction of the wave operator in the Hilbert space. Both these methods require as a hypothesis the existence of an isolated one-particle mass hyperboloid in the energy-momentum spectrum, or equivalently the existence of an isolated eigenvalue $m > 0$ for the mass operator $M$. This requirement of an isolated eigenvalue is more subtle than the existence of a mass gap. It entails both a lower gap and an upper gap in the spectrum of $M$, as well as the existence of the eigenvalue $m$. If 0 and $m > 0$ are eigenvalues of $M$, one expects continuous spectrum on the interval $[2m, \infty)$, and possible additional eigenvalues in the interval $(m, 2m)$ in the case of an attractive interaction. These eigenvalues can be interpreted as the masses of particles that are bound states of two mass-$m$ particles, with the lowering of the total mass attributed to the binding energy. In the […] equations are repulsive and they do not have bound states,²³ while […]

7.2 Phase Transitions and Non-Uniqueness

A second qualitative phenomenon occurs for the $\lambda\varphi^4$ equation with $\lambda$ large, namely non-uniqueness of the infinite volume limit. Associated with this is the non-uniqueness of the Euclidean Green's functions, of the solutions to the equation (3), and of the ground state of $H$. In physics, this non-uniqueness is known as the existence of a phase transition or as degeneracy of the ground state. Phase transitions often occur along with the breaking of a symmetry, in this case breaking of the $\varphi \to -\varphi$ symmetry of the Lagrangian. There are two successful methods to study this phenomenon. One approach is to develop a new cluster expansion that is valid in the region $\lambda \gg 1$. One can change the parameterization of this equation,²⁵ by varying the mass implicit in the definition of the constant $c_\kappa$ in (6), and thereby show that (3) for $\lambda \gg 1$ is equivalent to an equation

$$\Box\varphi - [\ldots] = 0, \tag{14}$$

with a negative linear term and with $0 < \lambda \ll 1$. This parameterization illustrates the two semi-classical minima of the polynomial […]
7.3 Zero Mass and Twists

Super-symmetry is an additional algebraic structure in quantum boson-fermion theories. The mathematical introduction of fermions requires the existence of a $\mathbb{Z}_2$-grading $\Gamma$, namely a self-adjoint square root of unity on $\mathcal{H}$. The eigenspaces of $\Gamma$ are defined to be bosonic or fermionic states, respectively. The grading acts on linear transformations by $B \to B^\Gamma = \Gamma B\Gamma$, and modulo questions of domain, every linear transformation can be decomposed uniquely as a sum of two parts that are even (bosonic) and odd (fermionic) under this action. Super-symmetry revolves about the existence of a self-adjoint fermionic operator $Q$ that is the square root of $H = Q^2$. This charge $Q$ has a geometric interpretation, as one can define the differential given by the graded commutator $dB = [Q, B]_\Gamma = QB - B^\Gamma Q$. Such a structure also arises in non-commutative geometry.²⁸

Super-symmetric examples have been extensively studied in constructive quantum field theory, especially in 2-dimensional cylindrical space-times.²⁹,³⁰ In these and other cases of two-dimensional constructive field theory, one studies interacting Hamiltonians that are perturbations of free, massive, super-symmetric fields. One routinely introduces a mass, since in two dimensions the massless, scalar boson field is singular.¹ However, very interesting Lagrangians arise for which a family of super-symmetric interactions (with parameter $0 < \lambda \le 1$) appear to share a common Lie group of symmetries of $H$, and for which the $\lambda = 0$ endpoint is free. However, in these cases, the Lie symmetry and super-symmetry appear incompatible with an unperturbed massive theory. This forces the question of how to deal with the two-dimensional, massless, bosonic interaction within the framework of constructive quantum field theory. We have discussed twist fields, namely multivalued fields on cylindrical space-times that allow massless interactions of a two-dimensional sort.³¹,³²,³³,³⁴,³⁵
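The grading, the even/odd decomposition, and the graded commutator described at the start of this section can be made concrete with small matrices. The sketch below is an illustration under assumptions that are not in the text: the Hilbert space is replaced by a four-dimensional space, the grading is the diagonal matrix diag(1, 1, -1, -1), and Q is an arbitrarily chosen real symmetric odd matrix. It also checks the identity d(dB) = HB - BH, which follows from the definitions by a short computation using ΓQΓ = -Q. All names are hypothetical.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add(a, b, sign=1):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(s, a):
    """Entrywise multiplication by the scalar s."""
    return [[s * x for x in row] for row in a]


# Grading: the first two basis vectors are "bosonic", the last two "fermionic".
GAMMA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def graded(b):
    """Action of the grading on a matrix: B -> Gamma B Gamma."""
    return matmul(GAMMA, matmul(b, GAMMA))


def even_odd(b):
    """Unique decomposition of B into even and odd parts under the grading."""
    half = Fraction(1, 2)
    return scale(half, add(b, graded(b))), scale(half, add(b, graded(b), -1))


# An odd, real symmetric (hence self-adjoint) charge and its square.
Q = [[0, 0, 1, 2], [0, 0, 0, 3], [1, 0, 0, 0], [2, 3, 0, 0]]
H = matmul(Q, Q)


def d(b):
    """Graded commutator dB = QB - (Gamma B Gamma) Q."""
    return add(matmul(Q, b), matmul(graded(b), Q), -1)


B = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B_even, B_odd = even_odd(B)
```

Exact integer and rational arithmetic is used so that the algebraic identities can be compared with equality instead of a numerical tolerance.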
Fundamental to the notion of quantum field is the assumption that the abelian group of space and time translations of $S^1 \times \mathbb{R}$ has a unitary representation on $\mathcal{H}$ generated by the self-adjoint, commuting operators $P$ and $H$, where $P$ is the momentum. This translation group $e^{ixP+itH}$ implements the space-time translations of fields, so for the bosonic field with components labelled by $i$,

$$\varphi_i(x - x', t + t') = e^{ix'P + it'H}\varphi_i(x,t)e^{-ix'P - it'H}, \tag{15}$$

while for the fermionic field with components $\psi_{\alpha,i}$,

$$\psi_{\alpha,i}(x - x', t + t') = e^{ix'P + it'H}\psi_{\alpha,i}(x,t)e^{-ix'P - it'H}. \tag{16}$$

The unitary twist group $e^{i\theta J}$ has a generator $J$ commuting with $H$ and $P$, and satisfies

$$e^{i\theta J}\varphi_i(x,t)e^{-i\theta J} = e^{i\theta\Omega_i}\varphi_i(x,t). \tag{17}$$

The fermionic time-zero field $\psi_{\alpha,i}$ satisfies

$$e^{i\theta J}\psi_{\alpha,i}(x,t)e^{-i\theta J} = e^{i\theta\Omega_{\alpha,i}}\psi_{\alpha,i}(x,t). \tag{18}$$

The twisting angles $\Omega = \{\Omega_i, \Omega_{\alpha,i}\}$ are given constants that characterize the twist generator $J$, up to an additive constant $c/2$, chosen so that $\pm J$ have the same spectrum. Then the zero-particle vector $\Omega_0 \in \mathcal{H}$ satisfies

$$J\Omega_0 = -c\,\Omega_0, \quad\text{with}\quad c = \sum_{i=1}^{n}\left(\Omega_{2,i} - \Omega_{1,i}\right). \tag{19}$$

A twist quantum field on a circle $S^1$ is a field for which these two groups are related. If the circle has length $\ell$, then

$$\varphi_i(x + \ell, t) = e^{i\chi_i}\varphi_i(x,t), \tag{20}$$

and

$$\psi_{\alpha,i}(x + \ell, t) = e^{i\chi_{\alpha,i}}\psi_{\alpha,i}(x,t), \tag{21}$$

for all $x \in S^1$ and $t \in \mathbb{R}$. The set of twisting angles $\chi = \{\chi_i, \chi_{\alpha,i}\}$ is taken so that no twisting phase equals one. In case the superpotential $V$ satisfies the quasi-homogeneity condition

$$V(z) = \sum_{i=1}^{n}\Omega_i z_i \frac{\partial V(z)}{\partial z_i}, \tag{22}$$

we choose

$$\{\chi_i, \chi_{\alpha,i}\} = \{\Omega_i\phi, \Omega_{\alpha,i}\phi\}. \tag{23}$$
7.4 Twists Break Super-symmetry

The twist fields act in many ways like massive fields, and in fact the Fourier momenta are shifted from zero by the amount $\chi/\ell$, choosing the $\chi$ appropriate for each component. Thus twist fields are not infra-red singular at $k = 0$ like periodic fields, and twist fields can yield translation-invariant, twist-invariant Hamiltonians. However, twist fields are not totally compatible with super-symmetry. On the other hand, one finds that for a complex interaction where one expects $N = 2$ super-symmetry, one can preserve half the supercharges. Furthermore, one can estimate the errors in the full super-symmetry algebra. One expects in a periodic example that there are self-adjoint charges $Q_1$ and $Q_2$ such that

$$Q_1^2 = H + P, \quad\text{and}\quad Q_2^2 = H - P. \tag{24}$$

In fact, we come closest to this situation when $H$ has the form

$$H = H_0 + \int_0^{\ell} H_I(x)\,dx, \tag{25}$$

where

$$H_I(x) = \sum_{j}\left|V_j(\varphi(x))\right|^2 + \sum_{i,j}\psi_{1,i}(x)\psi_{2,j}(x)^{*}V_{ij}(\varphi(x)) + \sum_{i,j}\psi_{2,i}(x)\psi_{1,j}(x)^{*}V_{ij}(\varphi(x))^{*}. \tag{26}$$

Then there is an operator $Q_1$ invariant under translations and twists, and an operator $Q_2$ such that

$$Q_1^2 = H + P, \quad\text{and}\quad Q_2^2 = H - P + \mathcal{R}. \tag{27}$$

Here $\mathcal{R}$ is an operator independent of $\lambda$, and satisfies for some constants $0 < c$,

$$\pm\mathcal{R}\ [\ldots]\ + I. \tag{28}$$

Such an estimate allows us to define and study the twisted partition function

$$\mathfrak{Z}^V = \mathrm{Tr}_{\mathcal{H}}\left(\Gamma e^{-i\theta J - i\sigma P - \beta H}\right). \tag{29}$$

Amazingly, this partition function $\mathfrak{Z}^V$, which is a geometric invariant,²⁸,³⁶,³⁷ can be computed. It displays a hidden modular symmetry, and it can be expressed in terms of elementary theta functions. In fact $\mathfrak{Z}^V$ depends on $V$ only through its universality class determined by the numbers $\{\Omega_i\}$.³³ In terms of $\tau = (\sigma + i\beta)/\ell$ […]
8 For the Millennium: Gauge Theory in Four Dimensions

Relations between field theory and geometry also arise in gauge theories. The field $F$ is the curvature of a connection $A$, and the classical equations for $F$ transform covariantly under a change of coordinates (change of gauge). Classical Yang-Mills fields take values in the Lie algebra of the gauge group. Despite the far-reaching success of constructive quantum field theory, the original puzzle explained in §1 remains unresolved. Can one find a non-trivial, non-linear quantum field in four-dimensional space-time? The most promising candidate for a non-trivial and physically-interesting field theory on Minkowski 4-space is the Yang-Mills theory with an $SU(2)$ gauge group. The Yang-Mills field $F$ is defined in terms of a Lie-algebra valued connection $A$,

$$F = dA + A \wedge A. \tag{31}$$

The Euclidean Yang-Mills Lagrangian is $\|F\|^2$, where the squared norm includes a trace over $SU(2)$ and an integral over space-time. Perturbation theory involves the study of the interaction in powers of the non-linearity arising from $A \wedge A$, and it indicates that this Yang-Mills example is asymptotically free. The physical interaction becomes weaker at high energy, and for this reason, the objections from perturbation theory suggesting the triviality of […]
2. Arthur Jaffe, Existence Theorems for a Cut-off X)-2 model and other applications of high temperature expansions, Part II: The Cluster Expansion, in Constructive Quantum Field Theory, A.S. Wightman (ed.), Springer Lecture Notes in Physics Volume 25, Springer, Berlin, New York (1973).
20. James Glimm, Arthur Jaffe, and Thomas Spencer, The Wightman axioms and particle structure in the weakly coupled P(\ quantum field theories, Ann. Physics 97 (1976), 80-135. 23. Thomas Spencer and Francesco Zirilli, Scattering states and bound states in \V(
39. Jacques Magnen, Vincent Rivasseau, and Roland Seneor, Construction of YM$_4$ with an infrared cutoff, Commun. Math. Phys. 155 (1993), 325-383.
FOURIER'S LAW: A CHALLENGE TO THEORISTS
F. BONETTO
Department of Mathematics, Rutgers University, 110 Frelinghuysen Road, Piscataway NJ 08854.
Present address: IHES, 75 route de Chartres, 91440 Bures sur Yvette, France
email: [email protected]

J. L. LEBOWITZ
Department of Mathematics and Physics, Rutgers University, 110 Frelinghuysen Road, Piscataway NJ 08854.
Present address: IHES, 75 route de Chartres, 91440 Bures sur Yvette, France
email: [email protected]

L. REY-BELLET
Department of Mathematics, Rutgers University, 110 Frelinghuysen Road, Piscataway NJ 08854.
Present address: Department of Mathematics, University of Virginia, Kerchof Hall, Charlottesville VA 22903
email: [email protected]
We present a selective overview of the current state of our knowledge (more precisely of our ignorance) regarding the derivation of Fourier's Law, $J(r) = -\kappa \nabla T(r)$; J the heat flux, T the temperature and $\kappa$ the heat conductivity. This law is empirically well tested for both fluids and crystals, when the temperature varies slowly on the microscopic scale, with $\kappa$ an intrinsic property which depends only on the system's equilibrium parameters, such as the local temperature and density. There is however at present no rigorous mathematical derivation of Fourier's law and ipso facto of Kubo's formula for $\kappa$, involving integrals over equilibrium time correlations, for any system (or model) with a deterministic, e.g. Hamiltonian, microscopic evolution.
1
Introduction
There are at least two distinct situations in which Fourier's Law is observed to hold with high precision:
1. An isolated macroscopic system which is prepared at some initial time, say t = 0, with a nonuniform temperature $T_0(r)$, e.g. a fluid or solid in a domain $\Lambda$ surrounded by effectively adiabatic walls. At t > 0, the temperature will change, due to the heat, i.e. energy, current, with the energy density satisfying the conservation equation:
$$c_v(T)\, \frac{\partial}{\partial t} T(r,t) = -\nabla \cdot J = \nabla \cdot [\kappa \nabla T] , \tag{1}$$
where $c_v(T)$ is the specific heat per unit volume and we have assumed that there is no mass flow or other mode of energy transport besides heat conduction (we also ignore for simplicity any variations in density or pressure). Eq.(1) is to be solved subject to the initial condition $T(r, 0) = T_0(r)$ and no heat flux
across the boundary of $\Lambda$. The stationary state, achieved as $t \to \infty$, is then one of uniform temperature T determined by the constancy of the total energy. For our purposes we can also think of $\Lambda$ as a torus, i.e. having periodic boundary conditions.
2. We consider the system in contact with heat reservoirs which specify a time invariant temperature $T_\alpha$ at points of the boundary $r \in (\partial\Lambda)_\alpha$ in contact with the $\alpha$-th heat reservoir, $\alpha \geq 1$. When the system has come to a stationary state (again assuming no matter flow) its temperature will be given by the solution of Eq.(1) with the left side set equal to zero,
$$\nabla \cdot J(r) = \nabla \cdot (\kappa \nabla \tilde T(r)) = 0 , \tag{2}$$
subject to the boundary condition $\tilde T(r) = T_\alpha$ for $r \in (\partial\Lambda)_\alpha$ and no flux across the rest of the boundary which is insulating or periodic in the direction perpendicular to the heat flow. A simple example of this situation is the usual set up for a Benard experiment in which the top and bottom of a fluid in a cylindrical slab of height h and cross sectional area A are kept at different temperatures $T_h$ and $T_b$, respectively. (To avoid convection one has to make $T_h > T_b$ or keep $|T_h - T_b|$ small.) Assuming uniformity in the directions perpendicular to the vertical x-axis one has in the stationary state a temperature profile $\tilde T(x)$ with $\tilde T(0) = T_b$, $\tilde T(h) = T_h$ and $\kappa(\tilde T)\, d\tilde T/dx = \mathrm{Const.}$ for $x \in (0, h)$.
From a physical point of view, which is how we presented them, the two cases are conceptually very similar (some physicists would even say identical). We have implicitly assumed that the system is described fully by specifying its temperature T(r,t) everywhere in $\Lambda$. What this means on the microscopic level is that we imagine the system to be in local thermal equilibrium (LTE). To make this a bit more precise we might think of the system as being divided up (mentally) into many little cubes, each big enough to contain very many atoms yet small enough on the macroscopic scale to be accurately described, at a specified time t, as a system in equilibrium at temperature $T(r_i, t)$, where $r_i$ is the center of the i-th cube. For slow variation in space and time we can then use a continuous description T(r,t).
This notion is made precise in the so called hydrodynamic scaling limit (HSL) where the ratio of micro to macro scale goes to zero [47,79,57]. The macroscopic coordinates r and t are related to the microscopic ones q and $\tau$ by $r = \epsilon q$ and $t = \epsilon^{\alpha} \tau$, i.e. if $\Lambda$ is a cube of macroscopic sides $l$, then its sides, now measured in microscopic length units, are of length $L = \epsilon^{-1} l$. We then suppose that at t = 0 our system of $N = \rho L^d$ particles with Hamiltonian
$$H(P,Q) = \sum_{i=1}^{N} \left[ \frac{p_i^2}{2m} + \sum_{j \neq i} \phi(q_j - q_i) + u(q_i) \right] = \sum_{i=1}^{N} \frac{p_i^2}{2m} + \mathcal{V}(Q) \tag{3}$$
is described by an equilibrium Gibbs measure with a temperature $T(r) = T(\epsilon q)$: roughly speaking the phase space ensemble density has the form,
$$\mu_0(P,Q) \sim \exp\left\{ -\sum_{i=1}^{N} \beta_0(\epsilon q_i) \left[ \frac{p_i^2}{2m} + \sum_{j \neq i} \phi(q_j - q_i) + u(q_i) \right] \right\} , \tag{4}$$
where $Q = (q_1, \ldots, q_N) \in \Lambda^N$, $P = (p_1, \ldots, p_N) \in \mathbb{R}^{dN}$, $\phi(q)$ is some short range inter particle potential, $u(q_i)$ an external potential and $\beta_0^{-1}(r) = T_0(r)$ [79]. In the limit $\epsilon \to 0$, $\rho$ fixed, the system at t = 0 will be macroscopically in LTE with a local temperature $T_0(r)$ (as already noted we suppress here the variation in the particle density n(r)). We are interested in the behavior of a macroscopic system, for which $\epsilon \ll 1$, at macroscopic times t > 0, corresponding to microscopic times $\tau = \epsilon^{-\alpha} t$, $\alpha = 2$ for heat conduction or other diffusive behavior. The implicit assumption then made in the macroscopic description given earlier is that since the variations in $T_0(r)$ are of order $\epsilon$ on a microscopic scale, then for $\epsilon \ll 1$, the system will, also at time t, be in a state very close to LTE with a temperature T(r,t) that evolves in time according to Fourier's law, Eq.(1).
From a mathematical point of view the difficult problem is proving that the system stays in LTE for t > 0 when the dynamics are given by a Hamiltonian time evolution. This requires proving that the macroscopic system has some very strong ergodic properties, e.g. that the only time invariant measures locally absolutely continuous w.r.t. Lebesgue measure are, for infinitely extended spatially uniform systems, of the Gibbs type [30,70,63]. This has only been proven so far for systems evolving via stochastic dynamics, e.g. interacting Brownian particles or lattice gases. In these systems the relevant conserved quantity is usually the particle density rather than the energy density. We shall not discuss such stochastic evolutions here but refer the reader to [79,47] for a mathematical exposition.
The only Hamiltonian system for which a macroscopic transport law has been derived is a gas of noninteracting particles moving among a fixed array of periodic convex scatterers (periodic Lorentz gas or Sinai billiard).
For this system one can prove a diffusion equation like Eq.(1) for the density of the particles, both for the initial value and the suitably defined stationary state problem, with the (self)diffusion constant given by the Einstein-Green-Kubo formula [10,56]. Unfortunately, the absence of interactions between particles makes this system a poor model for heat conduction in realistic systems. In particular there is no mechanism for achieving LTE. The speed of each particle |v| does not change in the course of time and the diffusion constant for each particle is proportional to its speed. The diffusion equation for the density mentioned above is therefore in fact a set of separate uncoupled equations for particles with specified speeds. It corresponds to the usual diffusion equation only when all the particles have the same speed. To remedy this problem it would be necessary to add interactions between the moving particles, e.g. instead of points make them little balls, and then derive coupled equations for the diffusion of both particle and energy densities. This is what we would consider a satisfactory answer to the challenge in the title of this article and we offer a bottle of very good wine to anyone who provides it. We believe that this system, with only two conservation laws and an external source (the fixed convex scatterers) for chaotic dynamics, may be the simplest Hamiltonian system for which such results could be proven rigorously.
Just how far we are from such results will become clear as we describe our current mathematical understanding of the stationary nonequilibrium state (SNS) of macroscopic systems whose ends are, as in the example of the Benard problem, kept at fixed temperatures $T_1$ and $T_2$. The heat conductivity in this situation can
be defined precisely without invoking LTE. To do this we let J be the expectation value in the SNS, i.e. we assume that the SNS is described by a phase-space measure (whose existence we discuss later), of the energy or heat current flowing from reservoir 1 to reservoir 2. We then define the conductivity $\kappa_L$ as $J/(A\, \delta T/L)$ where $\delta T/L = (T_1 - T_2)/L$ is the effective temperature gradient for a cylinder of microscopic length L and uniform cross section A, and $\kappa(T)$ as the limit of $\kappa_L$ when $\delta T \to 0$ ($T_1 = T_2 = T$) and $L \to \infty$ [54]. The existence of such a limit with $\kappa$ positive and finite is what one would like to prove.
2
Heat Conduction in Gases
Before going on to a mathematical discussion of heat conducting SNS, we turn briefly to the "kinetic theory" analysis of heat conduction in gases. This is historically the first example of a microscopic description of this macroscopic phenomenon. It goes back to the works of Clausius, Maxwell and Boltzmann [9] who obtained a theoretical expression for the heat conductivity of gases, $\kappa \sim \sqrt{T}$, independent of the gas density. This agrees with experiment (when the density is not too high) and was a major early achievement of the atomic theory of matter [9].
Clausius and Maxwell used the concept of a "mean free path" $\lambda$: the average distance a particle (atom or molecule) travels between collisions in a gas with particle density $\rho$. Straightforward analysis gives $\lambda \sim 1/\rho\pi\sigma^2$, $\sigma \sim$ an "effective" hard core diameter of a particle. They considered a gas with temperature gradient in the x-direction and assumed that the gas is (approximately) in local equilibrium with density $\rho$ and temperature T(x). Between collisions a particle moves a distance $\lambda$ carrying a kinetic energy proportional to T(x) from x to $x + \lambda/\sqrt{3}$, while in the opposite direction the amount carried is proportional to $T(x + \lambda\sqrt{3})$. Taking into account the fact that the speed is proportional to $\sqrt{T}$ the amount of energy transported per unit area and time across a plane perpendicular to the x-axis J is approximately,
$$J \sim \rho \sqrt{T}\, [T(x) - T(x + \lambda\sqrt{3})] \sim -\sigma^{-2} \sqrt{T}\, \frac{dT}{dx} , \tag{5}$$
and so $\kappa \sim \sqrt{T}$ independent of $\rho$, in agreement with experiment.
It was clear to the founding fathers that starting with a local equilibrium situation (corresponding to a Maxwellian distribution of velocities) there will develop, as time goes on, a deviation from LTE. They reasoned however that this deviation from local equilibrium will be small when $(\lambda/T)\, dT/dx \ll 1$, the regime in which Fourier's law is expected to hold, and the above calculation should yield, up to some factor of order unity, the right heat conductivity. In fact if one computes the heat flux at a point x by averaging the microscopic energy current at x, $j = \rho v (\frac{1}{2} m v^2)$, over the one particle distribution function f(r,v,t) then it is only the deviation from local equilibrium which makes a contribution. The result however is essentially the same as Eq.(5). This was shown by Boltzmann who derived an accurate formula for $\kappa$ in gases by using the Boltzmann equation. If one takes $\kappa$ from experiment the above analysis yields a value for $\sigma$, the effective size of an atom or molecule, which turns out to be close to other determinations of the characteristic size of an atom [9]. This gave evidence for the reality of atoms and the molecular theory of heat.
Using ideas of hydrodynamical space and time scaling described earlier it is possible to derive a controlled expansion for the solution of the stationary Boltzmann equation describing the steady state of a gas coupled to temperature reservoirs at the top and bottom [23,24,25]. The coupling is implemented by the imposition of "Maxwell boundary conditions": when a particle hits the left (right) wall it gets reflected with a distribution of velocities
$$f_\alpha(dv) = \frac{m^2}{2\pi (kT_\alpha)^2}\, |v_x| \exp\left[ -\frac{m v^2}{2 k T_\alpha} \right] dv , \qquad \alpha = 1, 2 \tag{6}$$
corresponding to a temperature $T_1$ ($T_2$) at the left (right) wall. One then shows [23,24,25] that for $\epsilon \ll 1$, $\epsilon$ being now the ratio $\lambda/L$, the Boltzmann equation for f in the slab has a time independent solution which is close to a local Maxwellian, corresponding to LTE (apart from boundary layer terms), with a local temperature and density given by the solution of the Navier-Stokes equations which incorporate Fourier's law as expressed in Eq.(2). The main mathematical problem is in controlling the remainder in an asymptotic expansion of f in powers of $\epsilon$. This requires that the macroscopic temperature gradient, i.e. $|T_1 - T_2|/h$, where $h = \epsilon L$ is the thickness of the slab on the macroscopic scale, be small. Even if this apparently technical problem could be overcome we would still be left with the question of justifying the Boltzmann equation for such steady states and of course it would not tell us anything about dense fluids or crystals. In fact the Boltzmann equation itself is really closer to a macroscopic than to a microscopic description. It is obtained in a well defined kinetic scaling limit in which in addition to rescaling space and time the particle density goes to zero [79], i.e. $\lambda \gg \sigma$.
3
Heat conduction in insulating crystals
Excellent accounts of the historical development of the theory of heat conduction in solids exist [72,51] so we will content ourselves here with some brief remarks. In (electrically) insulating solids, heat is transmitted through the vibrations of the lattice (in conductors the electronic contribution is in general much larger than the contribution due to the lattice vibrations). In order to use concepts of kinetic theory, it is useful to picture a solid as a gas of phonons which can store and transmit heat. In a perfectly harmonic crystal, the phonons behave like a gas of noninteracting particles and therefore the thermal current will not decrease with the length of the crystal placed between two thermal reservoirs. Thus a perfectly harmonic crystal has an infinite thermal conductivity: in the language of kinetic theory $\sigma = 0$ and the mean free path $\lambda$ is infinite. A real crystal is not harmonic and, in the phonon picture, any thermal current will be degraded by the anharmonic forces in the lattice. Another source of finite thermal conductivity may be the lattice imperfections and impurities which will scatter the phonons and degrade the thermal current too.
Debye [18] devised a kind of kinetic theory for phonons in order to describe thermal conductivity. One assumes that a small gradient of temperature is imposed and that the collisions between phonons maintain local equilibrium. An elementary argument [2] gives a thermal conductivity analogous to Eq.(5) obtained in Section 2 for gases (remembering however that the density of phonons is itself a function of T)
$$\kappa \sim c_v c^2 \tau . \tag{7}$$
Comparing Eq.(7) and Eq.(5) we see that $\rho$ has been replaced by $c_v$, the specific heat of the phonons, $\sqrt{T}$ by c, the (mean) velocity of the phonons, and $\lambda$ by $c\tau$, where $\tau$ is the effective mean free time between phonon collisions. The thermal conductivity depends on the temperature via $\tau$ and a more refined theory is needed to account for this dependence. Peierls [71] used a Boltzmann type equation for phonons to investigate this problem. The Peierls theory singles out one phenomenon which gives rise to a finite thermal conductivity [72]. The momentum of phonons in collisions is conserved only modulo a vector of the reciprocal lattice. One can therefore classify the collisions of phonons into two classes: the ones where phonon momentum is conserved (the normal processes) and the ones where the initial and final momenta differ by a non-zero reciprocal lattice vector (the umklapp processes). Peierls theory may be summarized (very roughly) as follows: in the absence of umklapp processes the mean free path and thus the thermal conductivity of an insulating solid is infinite. A success of Peierls theory is to describe correctly the temperature dependence of the thermal conductivity [2]. Furthermore, on the basis of this theory, one does not expect a finite thermal conductivity in 1-dimensional mono-atomic lattices with pair interactions: this seems so far a correct prediction, see Section 10.
The justification of the Boltzmann equation for phonons has been questioned [46]. Various alternative mechanisms have been proposed which would give rise to a finite thermal conductivity, but it seems fair to say that, so far, no better theory of heat conduction in insulators has been proposed.
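The order-of-magnitude estimates of Eq.(5) and Eq.(7) can be checked with a few lines of arithmetic. The following is a minimal sketch and not part of the original analysis: the function names are hypothetical, units are arbitrary, and every numerical factor of order unity is set to one, as the $\sim$ signs in the text allow. It illustrates that the gas estimate $\kappa \sim \rho \lambda \sqrt{T}$ loses its dependence on the density once $\lambda \sim 1/\rho\pi\sigma^2$ is inserted.

```python
import math


def mean_free_path(rho, sigma):
    """Kinetic-theory mean free path, lambda ~ 1 / (rho * pi * sigma^2)."""
    return 1.0 / (rho * math.pi * sigma ** 2)


def gas_conductivity(rho, sigma, temperature):
    """Clausius-Maxwell estimate kappa ~ rho * lambda * sqrt(T).

    Assumption: all order-one constants (including the sqrt(3) of Eq.(5))
    are dropped, so only the scaling with rho, sigma and T is meaningful.
    """
    return rho * mean_free_path(rho, sigma) * math.sqrt(temperature)


def phonon_conductivity(c_v, c, tau):
    """Debye-type estimate kappa ~ c_v * c^2 * tau of Eq.(7)."""
    return c_v * c ** 2 * tau


if __name__ == "__main__":
    # The density drops out of the gas estimate: the three values coincide.
    for rho in (0.01, 0.1, 1.0):
        print(rho, gas_conductivity(rho, sigma=1.0, temperature=4.0))
    # Phonon estimate: rho -> c_v, sqrt(T) -> c, lambda -> c * tau.
    print(phonon_conductivity(c_v=1.0, c=2.0, tau=0.5))
```

In this sketch the temperature dependence of the phonon estimate would enter only through `tau`, which is the quantity a more refined theory has to supply.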
As Peierls himself puts it [72]: "It seems there is no problem in modern physics for which there are on record as many false starts, and as many theories which overlook some essential feature, as in the problem of the thermal conductivity of [electrically] non-conducting crystals".
To find a mathematical description of thermal conduction in crystals we need to specify the Hamiltonian of the system or at least some appropriately idealized version of it. A model crystal is characterized by the fact that all atoms oscillate around given equilibrium positions. The equilibrium positions can be thought of as the points of a regular lattice in $\mathbb{R}^d$. For simplicity we will assume that the lattice is simply $\mathbb{Z}^d$. Although d = 3 is the physical situation one can be interested also in the case d = 1,2. (The d = 1 system may show finite thermal conductivity without violating the Peierls criteria if we admit one particle, non momentum conserving interactions.) Let $\Lambda \subset \mathbb{Z}^d$ be a finite set and denote by N its cardinality. Each atom is identified by its position $x_i = i + q_i$ where $i \in \Lambda$ is the equilibrium position and $q_i \in \mathbb{R}^d$ is the displacement of the particle at lattice site i from this equilibrium position, and we denote by $p_i$ its momentum and m its mass. Since inter atomic forces in real solids have short range, it is reasonable to assume that the atoms interact only with their nearest neighbors via a potential that depends only on the relative distance with respect to the equilibrium distance. As already noted it is useful to allow an external confining 1-body potential which breaks the translation invariance. Accordingly the Hamiltonians that we consider have the general form
$$H(P,Q) = \sum_{i \in \Lambda} \frac{p_i^2}{2m} + \sum_{|i-j|=1} V(q_i - q_j) + \sum_{i \in \Lambda} U_i(q_i) = \sum_{i \in \Lambda} \frac{p_i^2}{2m} + \mathcal{V}(Q) , \tag{8}$$
where $P = (p_i)_{i \in \Lambda}$ and analogously for Q. We shall further assume that as $|q| \to \infty$ so do $U_i(q)$ and $V(q)$. The addition of $U_i(q)$ pins down the crystal and ensures that $\exp[-\beta H(P,Q)]$ is integrable with respect to dP dQ and thus the corresponding Gibbs measure is well defined. Observe that for many purposes it is enough to put the potential $U_i$ on only some of the atoms, e.g. the ones on the boundary of $\Lambda$ [75,67]. We note finally that when $\Lambda \subset \mathbb{Z}^d$ one can still consider that $p_i$ and $q_i \in \mathbb{R}^\nu$, $\nu \neq d$, but we will generally assume that $q_i \in \mathbb{R}^d$.
Remark: While the Hamiltonian in Eq.(8) looks similar to that in Eq.(3) the meaning and domain of the Q variables are entirely different. In a fluid all the particles are identical and the particle with label i interacts with any other particle whose position $q_j$ is close to $q_i$, $q_i \in \Lambda \subset \mathbb{R}^d$, $i = 1, \ldots, N$. The pair interaction potential $\phi(q)$ is of finite range, e.g. hard balls, or decays rapidly with distance. For the crystal in Eq.(8) $q_i$ is the deviation from an equilibrium position $i \in \Lambda \subset \mathbb{Z}^d$, etc.
4
Microscopic models of heat reservoirs
To produce a stationary heat flow in a system, be it a gas or a crystal, the system must be coupled to at least two heat reservoirs at different temperatures. A physical coupling is one which acts only at the boundary of the system leaving the dynamics in the bulk purely Hamiltonian.
Since a realistic description of heat reservoirs and coupling is out of the question various model reservoirs have been used in analytical and numerical studies. We give here some examples which will be used later (other choices are of course possible). The expectation is that the different models will give the same behavior away from the boundary when the system is macroscopic. This has not been proven in any example, see [32,33].
4.1
Stochastic reservoirs.
We have already discussed one such model of reservoirs commonly used for fluids in Section 2. This corresponds to the Maxwell boundary condition given in Eq.(6) for a gas in a rectangular slab. More generally given a fluid in a domain $\Lambda \subset \mathbb{R}^3$ a particle hitting the wall of the container confining the system at a point $r \in \partial\Lambda$ will bounce back into $\Lambda$ with a Maxwellian distribution of momenta
$$f_r(dp) = \frac{\beta(r)^2}{2\pi m^2}\, p \cdot n(r)\, e^{-\beta(r) p^2/2m}\, dp , \tag{9}$$
where n(r) is the inward directed unit vector normal to $\partial\Lambda$ at r and $\beta^{-1}(r)$ is the preassigned temperature at r.
For solids, which are usually not confined to any fixed spatial region by external walls, it is sometimes mathematically convenient to use Langevin type reservoirs which act on the atoms at the "edge" of the crystal. For definiteness we will choose $\Lambda$ to be a chain of particles or a parallelepiped in higher dimension (with suitable boundary conditions), $\Lambda = \{ i \in \mathbb{Z}^d ;\ 1 \leq i_k \leq N_k ,\ 1 \leq k \leq d \}$. We assume that the particles at the "left" boundary $\{ i \in \Lambda ;\ i_1 = 1 \}$ are coupled to a heat reservoir at temperature $T_L$ and that the particles at the "right" boundary $\{ i \in \Lambda ;\ i_1 = N_1 \}$ are coupled to a heat reservoir at temperature $T_R$. We set $\mathcal{L} = N_1$ the length of the crystal and $A = N_2 \cdots N_d$ its cross section.
For the particles at the boundary of the crystal in contact with a heat reservoir, the Hamiltonian equations of motion are modified by the addition of an Ornstein-Uhlenbeck process:
$$m_i \ddot q_i = -\nabla_{q_i} \mathcal{V}(Q) - \lambda_\alpha p_i/m_i + (2\lambda_\alpha T_\alpha)^{1/2}\, \xi_\alpha(t) . \tag{10}$$
In Eq.(10), $\alpha \in \{L, R\}$ are indices of the reservoirs, $\lambda_\alpha$ describes the strength of the coupling to the reservoir at temperature $T_\alpha$, and $\xi_\alpha(t)$ is a white noise, i.e., a Gaussian random process with covariance $\langle \xi_\alpha(t) \xi_\beta(s) \rangle = \delta_{\alpha\beta}\, \delta(t - s)$. The form of the coefficients is chosen so that the dynamics satisfy detailed balance. This implies in particular that if the system is coupled to a single reservoir at temperature T, then the Gibbs measure with density $Z^{-1} \exp(-T^{-1} H(P,Q))$ is a stationary state of the system.
With any such choice of stochastic process to model the reservoirs the dynamics is described by a stationary Markov process in the phase space of the system $\Omega$.
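The detailed balance property of the Langevin reservoirs of Eq.(10) can be probed numerically. The following is a minimal sketch under stated assumptions, not the construction used in the references: it takes a short chain with unit masses, an assumed pinned harmonic potential $\mathcal{V}(Q) = \sum_i q_i^2/2 + \sum_i (q_{i+1} - q_i)^2/2$, and a simple first-order (Euler-type) time discretization. The function name and all parameter values are hypothetical. When both reservoirs have the same temperature T, the time average of $p_i^2$ should be close to T for every atom, up to statistical and discretization errors.

```python
import math
import random


def kinetic_temperatures(n=3, t_left=1.0, t_right=1.0, lam=1.0,
                         dt=0.02, steps=200_000, seed=0):
    """Time-averaged p_i^2 for a chain with Langevin reservoirs at both ends.

    Assumptions (not from the text): unit masses, n >= 2, and the pinned
    harmonic potential V(Q) = sum_i q_i^2/2 + sum_i (q_{i+1} - q_i)^2/2.
    """
    rng = random.Random(seed)
    q = [0.0] * n
    p = [0.0] * n
    sums = [0.0] * n
    burn_in = steps // 10
    # Noise amplitude per step: sqrt(2 * lambda * T * dt), cf. Eq.(10).
    amp_left = math.sqrt(2.0 * lam * t_left * dt)
    amp_right = math.sqrt(2.0 * lam * t_right * dt)
    for step in range(steps):
        # Hamiltonian part: force_i = -dV/dq_i.
        force = [-x for x in q]
        for i in range(n - 1):
            stretch = q[i + 1] - q[i]
            force[i] += stretch
            force[i + 1] -= stretch
        for i in range(n):
            p[i] += force[i] * dt
        # Reservoir part: friction plus white noise on the two end atoms only.
        p[0] += -lam * p[0] * dt + amp_left * rng.gauss(0.0, 1.0)
        p[-1] += -lam * p[-1] * dt + amp_right * rng.gauss(0.0, 1.0)
        for i in range(n):
            q[i] += p[i] * dt
        if step >= burn_in:
            for i in range(n):
                sums[i] += p[i] * p[i]
    samples = steps - burn_in
    return [s / samples for s in sums]
```

Calling `kinetic_temperatures(t_left=1.0, t_right=1.0)` returns one averaged value per atom. Choosing `t_left` different from `t_right` gives a crude picture of a stationary nonequilibrium state, although a harmonic chain is known to be a poor model for Fourier's law, as discussed in Section 3.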
4.2
Hamiltonian reservoirs.
In this case the reservoirs themselves are modeled by infinite Hamiltonian systems and the full system consisting of reservoirs+system is Hamiltonian$^a$. An alternative but equivalent point of view is to start with an infinite system and to consider a finite subsystem of it as the system and the remaining part as the reservoirs. A non-equilibrium situation is obtained by choosing suitable initial conditions for the part of the total system which describes the reservoirs, e.g. the initial conditions of the reservoirs are assumed to be distributed according to a Gibbs measure with corresponding temperatures$^b$.
The simplest version of such a total system, studied in [77,80,69], consists of an infinite chain. The "left" reservoir consists of the particles with labels in $(-\infty, -N]$ and the right reservoir consists of the particles with labels in $[N, +\infty)$. The system consists of the particles in the middle. At time t = 0, the reservoirs are assumed to be in thermal equilibrium at temperatures $T_L$ and $T_R$. This is readily generalized to higher dimensions.
$^a$ A series of results has been obtained for systems coupled to a single reservoir, such as nonrelativistic atoms coupled to the quantized electromagnetic field at temperature zero [3,4,5], finite level atoms coupled to a boson field at positive temperature [43,44,19], classical particles coupled to a scalar field at temperature zero ' l 9 ' 5 0 or at positive temperature 4 S .
$^b$ For quantum spin systems axioms are formulated in [78] which establish the existence of a stationary state and its mixing property for finite systems coupled to several reservoirs at different temperatures.
4.3
Hamiltonian/Stochastic reservoir.
While Hamiltonian reservoirs are in principle the right ones to use they are totally intractable without further simplifications. When this is done it is actually possible to find examples in which one starts with a Hamiltonian reservoir and, by integrating out the degrees of freedom of the reservoirs, ends up with a stochastic evolution. Many models of this type have been constructed [38,39,28]. We describe here a model considered in [45,21,22,20]. The system is a finite chain of anharmonic oscillators coupled at each end to a reservoir modeled by a linear d-dimensional wave equation, which is the continuum limit of a d-dimensional lattice of harmonic oscillators. The dynamics of the infinite system, crystal+reservoirs, is Hamiltonian. One makes the statistical assumption that, at time t = 0, the reservoirs are in thermal equilibrium at temperatures $T_L$ and $T_R$. Since the reservoirs are linear, one may integrate them out, and, by our assumptions on the initial conditions of the reservoirs, the resulting dynamics for the crystal is stochastic, though in general not Markovian. Nevertheless, the fact that the reservoirs are described by a wave equation, together with special choices of the coupling between the reservoirs and the chain, permits [21] enlarging the phase space of the crystal with a finite number of auxiliary variables so that the dynamics is Markovian on the enlarged phase space. In the simplest case of coupling one variable per reservoir is enough and the resulting equations for the N oscillators are
$$\begin{aligned}
\ddot q_1 &= -\nabla_{q_1} \mathcal{V}(Q) + r_L , \\
\ddot q_j &= -\nabla_{q_j} \mathcal{V}(Q) , \qquad j = 2, \ldots, N-1 , \\
\ddot q_N &= -\nabla_{q_N} \mathcal{V}(Q) + r_R , \\
\dot r_L &= -\gamma_L (r_L - \lambda_L^2 q_1) + (2 \gamma_L \lambda_L^2 T_L)^{1/2}\, w_L(t) , \\
\dot r_R &= -\gamma_R (r_R - \lambda_R^2 q_N) + (2 \gamma_R \lambda_R^2 T_R)^{1/2}\, w_R(t) .
\end{aligned} \tag{11}$$
In Eqs.(11) $\lambda_L$ and $\lambda_R$ describe the coupling strength to the reservoirs, $\gamma_L$, $\gamma_R$ are parameters describing the coupling and $w_L$, $w_R$ are white noises. If the temperatures of both reservoirs are the same, $T_L = T_R = T$, then the stationary state is given by the generalized Gibbs measure with density
$$Z^{-1} \exp\left( -\frac{1}{T}\, G(P,Q,R) \right) , \tag{12}$$
where Z is a normalization constant and the generalized "Hamiltonian" G is given by
$$G(P,Q,R) = \left( \frac{r_L^2}{2\lambda_L^2} - q_1 r_L \right) + \left( \frac{r_R^2}{2\lambda_R^2} - q_N r_R \right) + H(P,Q) . \tag{13}$$
If one integrates the generalized Gibbs state, Eq.(12), over the auxiliary variables $r_L$ and $r_R$ one finds
$$\int dr_L\, dr_R\, Z^{-1} \exp\left( -\frac{1}{T}\, G(P,Q,R) \right) = \tilde Z^{-1} \exp\left( -\frac{1}{T}\, H_{\mathrm{eff}}(P,Q) \right) , \tag{14}$$
where, completing the squares in $r_L$ and $r_R$ in Eq.(13), $H_{\mathrm{eff}}(P,Q) = H(P,Q) - \lambda_L^2 q_1^2/2 - \lambda_R^2 q_N^2/2$ and $\tilde Z$ is a normalization constant. In view of this it is natural to consider $H_{\mathrm{eff}}(P,Q)$ as the energy of the chain.
4.4
Thermostats.
A fourth way of modeling the reservoirs is by deterministic (non-Hamiltonian) forces [27,31]. Such models of reservoirs are usually called thermostats. An example of such reservoirs, widely used in numerical work, is given by the so called Nosé-Hoover [68,41] thermostats. Imposing these thermostats on small parts of the system$^c$ (on the left and on the right) $\Lambda_L$ and $\Lambda_R$, the equations of motion of particles in those regions of the box are respectively
$$m \ddot q_i = -\nabla_{q_i} \mathcal{V}(Q) - \zeta_L \dot q_i , \qquad m \ddot q_i = -\nabla_{q_i} \mathcal{V}(Q) - \zeta_R \dot q_i , \tag{15}$$
where $q_i \in \Lambda_\alpha$ for a fluid and $i \in \Lambda_\alpha$ for a crystal, $\alpha \in \{L, R\}$. The variables $\zeta_\alpha$ model the action of the thermostat and satisfy the equations
In Eq.(16), $\Theta$ is interpreted as the response time of the reservoir and $T_\alpha$ is the temperature of the $\alpha$-th reservoir.
A limiting case of Eqs.(15)(16) is when we let $\Theta \to 0$. This limit can be formally taken and the model becomes equivalent to the so called Gaussian thermostat. This means that one computes $\zeta_\alpha$ as a function of P and Q in such a way that the kinetic energy of the particles in $\Lambda_L$ or $\Lambda_R$ is a constant of the motion. After a simple calculation one gets for the chain:
$$\zeta_L(Q,P) = \frac{\sum_{i \in \Lambda_L} p_i \left( f(q_i - q_{i-1}) - f(q_{i+1} - q_i) + f_i(q_i) \right)}{\sum_{i \in \Lambda_L} p_i^2} , \tag{17}$$
and similarly for $\zeta_R(Q,P)$. Here $f_i(q) = -\nabla U_i(q)$. One may also prescribe Gaussian thermostats in which the total energies, instead of just the kinetic energies, in $\Lambda_L$ and $\Lambda_R$ are kept fixed.
5
Existence and Nature of Heat Conducting SNS
Suppose we are given a system described by a Hamiltonian of the form Eq.(3) or Eq.(8) and that we have chosen a given model of heat reservoirs. We shall now formulate a sequence of statements (of increasing mathematical difficulty) on the properties of the resulting dynamical system.
$^c$ It is also possible to give "mechanized" models of heat transport in which one imposes the presence of a heat flow through the application of a particular force on the bulk of the system. This strong modification of the Hamiltonian character of the dynamics seems unnatural to us, although it can be useful for numerical simulations, and we will not discuss these models here [27,81].
5.1
Existence, Uniqueness and approach to the Stationary State.
The first property that we want to prove is existence and if possible also uniqueness of a stationary state. For the case when all reservoirs are at the same temperature T existence is generally obvious; after all, the reservoirs are chosen so that they leave the canonical Gibbs distribution or some variation of it invariant under the time evolution. Uniqueness and approach to this equilibrium state present more of a problem and may not even be true for certain types of reservoirs and initial states.
The real problem of interest for us is when the reservoirs are at different temperatures. We expect that if the dynamics is stochastic (e.g. for models 4.1 and 4.3 of reservoirs) then "almost any" initial distribution of the state of the system converges to a unique stationary state which is mixing. This is however, in general, a mathematically non-trivial problem. The isolated system has a non-compact phase space and has many invariant states. The coupling to the reservoirs induces a drift towards a state determined by the reservoirs. Since however the coupling to the reservoirs occurs only at the boundary, the proof of the existence of an invariant measure requires a good understanding of how energy is transmitted through the system. There are in fact only a few examples (to be discussed in Section 6) where this behavior has been proven.
For general Hamiltonian or thermostated reservoirs the problem seems to be mathematically out of reach at the present time. Starting in a state that corresponds to the product of two equilibrium states for the reservoirs times a generic initial distribution for the crystal, we then expect that in the long time limit the marginal distribution for the system will approach some limit. When the two infinite reservoirs are initially at different temperatures, the limiting state should describe a system having a temperature gradient and a heat flow.
Observe that in general the state of each reservoir at times t > 0 will not be the invariant state at a given temperature which is stationary for the isolated reservoir. It is this fact which makes the problem of general Hamiltonian reservoirs much more difficult than that of stochastic reservoirs. It is only in very special cases (essentially no interaction inside the reservoir), such as that discussed in subsection 4.3, that this can be dealt with.

For thermostated systems the temperature of the reservoirs is already given by the equations of motion, so we expect again to have a unique invariant distribution. Moreover these systems have the property that the phase space volume is not conserved by the dynamics so that, in general, no invariant measure will be absolutely continuous with respect to Lebesgue measure. In this case we need a criterion to choose the "physical" invariant distribution. A natural choice is the so-called SRB states. These can be characterized by assuming that the system was in equilibrium in the very distant past and that at some point a forcing was switched on which drove the system to a steady state distribution. More mathematically this means that we consider the weak limit of a probability distribution absolutely continuous
with respect to Lebesgue measure, e.g. the canonical distribution that characterizes the system when all thermostats have the same temperature, under the time evolution, i.e.

$$ \lim_{t \to +\infty} \Phi_t^{*} \lambda(dX), \qquad X = (P,Q) \tag{18} $$
where $\Phi_t^{*}$ indicates the adjoint and $\lambda(dX)$ is the given initial distribution. We observe that we often cannot choose the Lebesgue distribution directly because the phase space of our system is not compact, as for the Nosé-Hoover thermostat, Eq.(16).

Although the definition Eq.(18) is interesting for its similarity to the ones used in the previous system+reservoirs models, another characterization of these measures is obtained by saying that they represent the statistics of the motion. More precisely, given any observable, the average of this observable with respect to the SRB distribution is equal to its time average along a trajectory starting from almost every point (with respect to Lebesgue measure). In formulae, if $\mu_{SRB}$ is the SRB distribution then

$$ \lim_{T \to \infty} \frac{1}{T} \int_0^T dt\, \delta_{\Phi_t(X)} = \mu_{SRB}(dX) \tag{19} $$

for a set of X of full (or at least positive) Lebesgue measure. In Eq.(19) the limit is to be understood as a weak limit (d).

5.2 Heat Flow and Entropy Production in Reservoirs.
Since our interest here is specifically in Fourier's law, our next question about the stationary state of the system coupled to two reservoirs at different temperatures is the existence and nature of the heat flux across the system. It is clear that even existence is not automatic: most trivially, just imagine that our system is composed of two noninteracting parts each coupled to a single reservoir. To study the heat flux through the system we first define a local energy density. For a fluid let $q \in \Lambda \subset \mathbb{R}^d$; then

$$ h(q,P,Q) = \sum_{i=1}^{N} \delta(q - q_i) \left[\, \cdots \,\right] \tag{20} $$
where the square bracket is identical to that in Eq.(4). For a crystal with nearest neighbor interactions we define the local energy density at site i as

$$ h(i,P,Q) = \frac{p_i^2}{2m} + U_1(q_i) + \frac{1}{2} \sum_k \left( V(q_i - q_{i-1_k}) + V(q_{i+1_k} - q_i) \right) \tag{21} $$

where $1_k$ is the d-dimensional vector with all components 0 except the k-th, which is equal to 1.

(d) The only thermostated "physical" model for which a SNS corresponding to an SRB measure with the desired properties has been proven is the Moran-Hoover model of a single particle moving among fixed periodic scatterers (see Section 1) to which is added an external electric field and a Gaussian thermostat [14].
Given the local energy density we can define a local microscopic heat flow $\Psi$ through the continuity equation. To avoid repetition we shall do so only for the crystal. Writing

$$ \frac{d}{dt} h(i,P,Q) = \nabla \Psi(i) \tag{22} $$

where $\nabla \Psi(i) = \sum_k \partial_k \Psi_k(i)$ with $\partial_k \Psi_j(i) = (\Psi_j(i + 1_k) - \Psi_j(i))/2$, it is easy to verify that

$$ \Psi_k(i) = f(q_i - q_{i+1_k}) \cdot \frac{p_i + p_{i+1_k}}{2m} \tag{23} $$

where $f(q) = -\nabla V(q)$. We will usually be interested in the heat flow $\Phi(j)$ through the plane $\{ i \in \Lambda ;\ i_1 = j \}$. It is clear that we can integrate Eq.(22) and obtain
l<'l
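The local energy balance behind Eqs.(21)-(23) can be checked numerically. The following is a minimal sketch, not taken from the text, for a one-dimensional chain under illustrative assumptions: an even pair potential V(x) = x^2/2 + beta x^4/4, a harmonic on-site potential, and the helper names `dV`, `dU`, `heat_flow`, `local_energy_rate` and `max_continuity_residual`, all of which are hypothetical. It evaluates dh(i)/dt from Hamilton's equations and compares it with the backward difference Psi(i) - Psi(i-1), which is the form for which the identity holds exactly with this choice of h(i) and Psi; the normalization of the lattice gradient used in Eq.(22) may differ.

```python
import random


def dV(x, beta=0.5):
    """Derivative of the (assumed) even pair potential V(x) = x**2/2 + beta*x**4/4."""
    return x + beta * x ** 3


def dU(x, omega2=1.0):
    """Derivative of the (assumed) on-site potential U_1(x) = omega2 * x**2 / 2."""
    return omega2 * x


def heat_flow(q, p, i, m=1.0):
    """Psi(i) = f(q_i - q_{i+1}) (p_i + p_{i+1}) / (2m), with f = -V'."""
    return -dV(q[i] - q[i + 1]) * (p[i] + p[i + 1]) / (2.0 * m)


def local_energy_rate(q, p, i, m=1.0):
    """dh(i)/dt for an interior site i, obtained from Hamilton's equations."""
    left = q[i] - q[i - 1]
    right = q[i + 1] - q[i]
    force = -dU(q[i]) - dV(left) + dV(right)      # dp_i/dt
    kinetic = p[i] * force / m                    # d/dt of p_i^2 / 2m
    onsite = dU(q[i]) * p[i] / m                  # d/dt of U_1(q_i)
    bonds = 0.5 * (dV(left) * (p[i] - p[i - 1])
                   + dV(right) * (p[i + 1] - p[i])) / m
    return kinetic + onsite + bonds


def max_continuity_residual(n_sites=8, m=1.0, seed=0):
    """Largest |dh(i)/dt - (Psi(i) - Psi(i-1))| over interior sites."""
    rng = random.Random(seed)
    q = [rng.uniform(-1.0, 1.0) for _ in range(n_sites)]
    p = [rng.uniform(-1.0, 1.0) for _ in range(n_sites)]
    return max(
        abs(local_energy_rate(q, p, i, m)
            - (heat_flow(q, p, i, m) - heat_flow(q, p, i - 1, m)))
        for i in range(1, n_sites - 1)
    )
```

Calling `max_continuity_residual()` on a random configuration should return a value at the level of floating-point rounding, since the balance is an algebraic identity and does not depend on the particular point of phase space.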
Of course the heat current inside the system in the steady state is just the energy flux from one reservoir to the other, presumably from the one with the higher temperature to the one with the lower one. Let $S^t$ be the time evolution for observables (averaged over the realizations if the evolution is stochastic). Since the equations of motion are Hamiltonian except at the boundary, one finds that the time derivative of the energy is given by

$$ \frac{d}{dt} S^t H(P,Q) = -S^t (\Phi_L + \Phi_R) \tag{25} $$

where $\Phi_L$ ($\Phi_R$) depends only on the variables of the left (right) boundary of the system. It is natural to interpret $\Phi_L$ as the flow of energy from the system to the left heat reservoir, and similarly for $\Phi_R$. We suppose that a stationary state $\mu$ exists and, for any observable f, we set $\mu(f) = \int f\, d\mu$. One obtains

$$ -\mu(\Phi_L + \Phi_R) = \mu\left( \frac{d}{dt} H(P,Q) \right) = 0, \tag{26} $$

and therefore

$$ \mu(\Phi_L) = -\mu(\Phi_R). \tag{27} $$
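The balance (27) can be made concrete on a small solvable example. The sketch below is an illustration under stated assumptions rather than a construction from the text: a harmonic chain of N unit-mass particles with unit nearest-neighbor springs and fixed ends, whose first and last particles are coupled to Langevin reservoirs with friction gamma, so that Phi_L = gamma (p_1^2 - T_L) and Phi_R = gamma (p_N^2 - T_R). The function names `solve_linear`, `stationary_covariance` and `boundary_fluxes` are hypothetical. Because the dynamics is linear, the stationary covariance C solves the Lyapunov equation A C + C A^T + D = 0, handled here by plain Gaussian elimination so that only the standard library is needed.

```python
def solve_linear(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [M[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            if factor != 0.0:
                for c in range(col, n + 1):
                    aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def stationary_covariance(N, gamma, T_left, T_right):
    """Stationary covariance of X = (q_1..q_N, p_1..p_N); requires N >= 2."""
    n = 2 * N
    A = [[0.0] * n for _ in range(n)]   # drift matrix of dX = A X dt + noise
    D = [[0.0] * n for _ in range(n)]   # noise covariance per unit time
    for i in range(N):
        A[i][N + i] = 1.0               # dq_i/dt = p_i
        A[N + i][i] = -2.0              # springs to both neighbors (or wall)
        if i > 0:
            A[N + i][i - 1] = 1.0
        if i < N - 1:
            A[N + i][i + 1] = 1.0
    A[N][N] -= gamma                    # friction on the first particle
    A[n - 1][n - 1] -= gamma            # friction on the last particle
    D[N][N] = 2.0 * gamma * T_left
    D[n - 1][n - 1] = 2.0 * gamma * T_right
    # Unknowns C[k][l] are flattened as k*n + l; one equation per entry (i, j).
    M = [[0.0] * (n * n) for _ in range(n * n)]
    rhs = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            row = i * n + j
            for k in range(n):
                M[row][k * n + j] += A[i][k]   # (A C)_{ij}
                M[row][i * n + k] += A[j][k]   # (C A^T)_{ij}
            rhs[row] = -D[i][j]
    flat = solve_linear(M, rhs)
    return [flat[i * n:(i + 1) * n] for i in range(n)]


def boundary_fluxes(N, gamma, T_left, T_right):
    """Return (mu(Phi_L), mu(Phi_R), bulk current from particle 1 to 2)."""
    C = stationary_covariance(N, gamma, T_left, T_right)
    phi_L = gamma * (C[N][N] - T_left)
    phi_R = gamma * (C[2 * N - 1][2 * N - 1] - T_right)
    # Power delivered by the spring (1,2) to particle 2: <(q_1 - q_2) p_2>.
    bulk = C[0][N + 1] - C[1][N + 1]
    return phi_L, phi_R, bulk
```

For instance, `boundary_fluxes(4, 1.0, 2.0, 1.0)` should return two boundary fluxes of opposite sign, with the flux into the colder right reservoir positive and equal to the bulk current; in this Gaussian example the current is carried entirely by a position-momentum covariance.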
To check that the heat flux is indeed as expected it is useful to define the entropy production $\sigma$ of the reservoirs as

$$ \sigma = \frac{\Phi_L}{T_L} + \frac{\Phi_R}{T_R}, \tag{28} $$

i.e., $\sigma$ is the sum of the energy flows into the reservoirs divided by the temperatures of the reservoirs. This (microscopic) definition of the reservoirs' entropy production is in accordance with our notion of a heat reservoir at a specified temperature. It does not require that the system itself be close to equilibrium. A convenient way of proving that if $T_L > T_R$ then heat is flowing through the system from left to right is to show that in the steady state

$$ \mu(\sigma) \geq 0 \tag{29} $$

and

$$ \mu(\sigma) = 0 \quad \text{if and only if} \quad T_L = T_R. \tag{30} $$
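The link between Eqs.(29)-(30) and the direction of the heat flow follows in one line from Eqs.(27) and (28). The short derivation below uses only the symbols already defined above.

```latex
\mu(\sigma)
  = \frac{\mu(\Phi_L)}{T_L} + \frac{\mu(\Phi_R)}{T_R}
  = \left( \frac{1}{T_R} - \frac{1}{T_L} \right) \mu(\Phi_R)
  = \frac{T_L - T_R}{T_L T_R} \, \mu(\Phi_R) .
```

Hence, for $T_L > T_R$, strict positivity of $\mu(\sigma)$ is equivalent to $\mu(\Phi_R) > 0$, i.e. to a net flow of energy into the right (colder) reservoir.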
(One may also consider the heat flow $\Phi(i)$ (or the corresponding $J(q)$ for a fluid) inside the system and formally define a corresponding entropy production $\sigma_i = (T_R^{-1} - T_L^{-1}) \Phi(i)$. One obviously has that $\mu(\Phi_L) = \mu(\Phi(i))$ in the stationary state, but $\sigma_i$ is not the macroscopic entropy production density inside the system [27,15,16].)

If $T_L$ and $T_R$ are close, one expects linear response theory to be valid. Setting $T = (T_L + T_R)/2$ and $\delta T = T_L - T_R$, formal perturbation theory gives

$$ \mu(\sigma) = \int_0^\infty dt\, \mu_0(\sigma S_0^t \sigma) + \text{lower orders in } \delta T, \tag{31} $$

$$ \mu(\Phi_R) = \frac{\delta T}{T^2} \int_0^\infty dt\, \mu_0(\Phi_R S_0^t \Phi_R) + \text{lower orders in } \delta T. \tag{32} $$

It is important to note that in Eqs.(31) and (32) the reservoirs are still present via the time evolution $S_0^t$. Although very similar, this is not the Green-Kubo formula, which will be discussed below.

5.3 Fourier's Law.
Assuming that 1. and 2. have been proved, we can then define the heat conductivity $\kappa_{\mathcal{L}}$ as in Section 1, where $\mathcal{L}$ is the length of the system (fluid or crystal) in microscopic units and A the area of its cross section. Since $\delta T / \mathcal{L}$ is the average temperature gradient, the heat conductivity at temperature T should then be given by

$$ \kappa = \lim_{\mathcal{L} \to \infty} \lim_{\delta T \to 0} \frac{\mathcal{L}}{\delta T} \left( \mu(\Phi) / A \right), \tag{33} $$

i.e., $\kappa = \kappa(T)$ is the heat flux per unit area divided by the temperature gradient. One might also have taken the limit $A \to \infty$ in Eq.(33). As might be expected, it is the limit $\mathcal{L} \to \infty$ which is the crux of the matter.

6 Summary of Exact Results
We now summarize briefly the limited number of results relating to points 1-3 of the last section.

6.1 Fluid systems.
Consider a system of N particles in $\Lambda \subset \mathbb{R}^d$, $d \geq 2$, with Hamiltonian given in Eq.(3), such that $u(q_i) = 0$ and the pair potential $\phi(|q|)$ is positive, with $\phi(0) = C_1$ and $-C_2 < \phi'(|q|) < 0$, $0 < C_1, C_2 < \infty$, for $|q| > 0$. Then it was shown in [36,37,35] that, using Maxwell boundary conditions Eq.(9) with temperature $T(r) > 0$, $r \in \partial\Lambda$ ($\Lambda$ a regular domain), there exists a unique stationary $\mu$. Furthermore this $\mu$ is absolutely continuous with respect to Lebesgue measure on $\Omega' = \Lambda^N \times \mathbb{R}^{dN}$ and is approached, as $t \to \infty$, from almost any initial (P,Q), i.e. the set of points in $\Omega'$ for which the approach may fail has Lebesgue measure zero.

The argument makes use of the boundedness of the force acting on any particle in the interior of $\Lambda$. This assures that any particle with a sufficiently high speed will hit the boundary with only little deviation from a straight path. This, and the fact that the force $-\phi'(|q|)$ is everywhere positive, ensures effective contact between the system and the stochastic boundaries which, according to Eq.(9), "spread" the velocity of particles which hit them. This yields something like a "Harris condition" which guarantees existence, uniqueness and approach to the stationary state.

Using the general technique developed in [6], Eq.(29) is immediate for systems in contact with stochastic reservoirs satisfying detailed balance. It is probably also possible to prove inequality (30) for such systems, but this has not been done generally as far as we know; see below.

6.2 Harmonic Crystal.
A system with Hamiltonian given by Eq.(8) in which both V and $U_1$ (when it does not vanish) are quadratic functions of their arguments is an ideal harmonic crystal. When such a system is placed in contact with stochastic reservoirs of the Langevin type the resulting process, and thus also the stationary measure, is Gaussian and one only needs to compute the covariances. This was done essentially explicitly for a chain in [75]. The most important difference with the equilibrium state is that there are now non-vanishing covariances between position and momentum variables, proportional to $\delta T$. One finds uniqueness and approach to the stationary measure $\mu$ that satisfies Eqs.(29) and (30). As already mentioned, however, the heat flux $\mu(\Phi)$ is essentially independent of $\mathcal{L}$, and $\kappa_{\mathcal{L}}$ defined in Eq.(33) grows as $\mathcal{L}$. (For the case of "random masses" $\kappa_{\mathcal{L}}$ grows as $\sqrt{\mathcal{L}}$ [13,69,76].) The solution of [75] was extended to d > 1 in [67], where various possibilities for $U_1$ were also considered, e.g. pinned down everywhere, only at the boundary, or nowhere. Looking at the invariant measure in the limit $\mathcal{L} \to \infty$ one finds that the decay of the position-momentum covariance is rapid when $U_1 \neq 0$ (at least on the boundary) but that it does not decay at all when $U_1 = 0$.

The case of an infinite harmonic chain with left and right portions acting as Hamiltonian reservoirs was investigated in [80]. The results are qualitatively the same as for the stochastic reservoirs: the heat current remains proportional to the initial temperature difference as $t \to \infty$ and the system approaches its stationary state, which is again a Gaussian measure (e).

(e) For an infinite quantum harmonic chain with a special particle subject to a sufficiently small non-harmonic potential, the existence of stationary states and their mixing property has been established, both for KMS states [29,65] and for SNS [29].

6.3 Anharmonic Crystals.

The anharmonic crystal coupled to Hamiltonian/stochastic reservoirs described by Eq.(11) has been investigated in [21,22,20]. Technical conditions on the growth at infinity of the potential are needed: either that V is quadratic at infinity [21] or more general polynomial growth [20]. (In the latter case, the one-body potential U grows more slowly at infinity than the two-body potential V.) One assumes also that the two-body potential is strictly convex, and this condition alone implies [22] that the stationary state is unique. Under these conditions the following results hold:

1. Existence and uniqueness of the stationary state $\mu$. The stationary state is mixing, i.e. any initial distribution converges to the stationary state as $t \to \infty$. The stationary state has a $C^\infty$ density which decays at infinity at least as fast as a Gibbs state with temperature equal to the maximum of the temperatures of the reservoirs.

2. The stationary state is conducting: one has $\mu(\Phi_R) = 0$ if and only if $T_L = T_R$, and $\mu(\Phi_R) > 0$ if $T_L > T_R$. Linear response theory is valid: for a large class of observables f, the expectation value $\mu(f)$ is a real-analytic function of the temperature difference $\delta T$. In particular, near equilibrium, one obtains, with $T = (T_L + T_R)/2$,

$$ \mu(\Phi_R) = \frac{\delta T}{T^2} D + O(\delta T^2). \tag{34} $$
Eq.(32) for the coefficient D has not been proved, but rather the slightly weaker form

$$ D = \mu_0\left( \Phi_R (L_0^{-1} \Phi_R) \right), \tag{35} $$

where $\mu_0$ is the Gibbs state Eq.(12) with temperature T and $L_0$ is the generator of the Markovian semigroup $S_0^t$ associated with the stochastic differential equations (11) with $T_L = T_R = T$. Notice that, formally, one has

$$ L_0^{-1} = \int_0^\infty dt\, S_0^t \tag{36} $$

and inserting Eq.(36) into Eq.(35) yields Eq.(32). In order to prove Eq.(36), one presumably needs some information on the decay of correlations. This has not been obtained so far. Nothing is known about the dependence of D on $\mathcal{L}$ and thus, ipso facto, about the validity of Fourier's law.

7 The Green-Kubo Formula
It is clear that, aside from the case of the harmonic crystal, which does not satisfy Fourier's law, none of the exact results quoted in the last section says anything about the local structure, e.g. about local equilibrium in the SNS. This means in particular that at this time we have no rigorous way of relating the local heat flux $\mu(\Phi)$ to the gradient of the local temperature as defined in Eq.(4). Even in the absence of LTE one can define a local kinetic temperature by means of the average local kinetic energy. Thus for the crystal at site i, $T(i) = \mu(p_i^2/m)/d$. For the harmonic crystal T(i) is found to be uniform away from the ends, i.e. there is no temperature gradient. Even accepting such a definition of temperature (in numerical simulations, to be discussed later, T(i) is one of the most directly measured quantities) we are completely lacking at this point any rigorous or even formal connection between the $\kappa$ defined in Eq.(33) and the usual Green-Kubo formula for the conductivity, which is defined in terms of the time evolution of an isolated system in equilibrium. Denoting by $S_0^t$ the Hamiltonian evolution of the isolated system, the thermal conductivity $\kappa_{GK}$ is given by [53,66]

$$ \kappa_{GK} = \lim_{\mathcal{L} \to \infty} \frac{1}{T^2} \frac{\mathcal{L}}{A} \int_0^\infty dt\, \langle \Phi S_0^t \Phi \rangle \tag{37} $$

where $\langle \cdot \rangle$ denotes the microcanonical average and the energy density is chosen such that it corresponds to the thermodynamic energy at temperature T (since we are in equilibrium this is the same as the kinetic temperature). If the total momentum $\Pi$ is conserved it has to be set equal to zero. Alternatively one may use in Eq.(37) truncated correlation functions $\langle \Phi S_0^t \Phi \rangle^T_{E,\Pi} = \langle \Phi S_0^t \Phi \rangle_{E,\Pi} - \langle \Phi \rangle^2_{E,\Pi}$ and then average over E and $\Pi$ using the canonical distribution (f). One expects that the equivalence of equilibrium ensembles will extend also to this case.

The Green-Kubo formula, Eq.(37), also makes sense for a system with a few degrees of freedom, where it can be related to the variance in the fluctuation of the current; see the article of Bunimovich and Spohn [11] for a discussion. There is no clear connection however between the integral in Eq.(37) for a small system and the $\kappa$ in Fourier's law. From a mathematical point of view it is not even clear how to prove equivalence for macroscopic systems, i.e. show that $\kappa = \kappa_{GK}$. It would be nice to find even a formal argument establishing the equivalence.

8 Entropy Production and Large Deviations
(f) If one does not fix $\Pi$ in an equilibrium ensemble for which the total momentum is conserved then the integral in Eq.(37) is divergent [74], but this does not say anything about $\kappa_{GK}$.

The proper definition of microscopic entropy production has attracted much attention in recent years. The interest comes from the observation of an interesting symmetry property in the large deviation functional associated to the phase space volume contraction rate in thermostated systems. This property was first observed numerically in [26] and then proved under a strong hyperbolicity condition in [31]. For such systems the phase space volume contraction has a strong connection to the entropy production of Section 5. Using this connection the fluctuation theorem has been extended to large deviations of the entropy production of various stochastic systems [52,58]. For crystals with stochastic reservoirs or for the model considered in Section 4.3, only formal proofs of the Gallavotti-Cohen fluctuation theorem are available so far.

The Gallavotti-Cohen fluctuation theorem can be formulated as follows. In both deterministic and stochastic systems one identifies an observable

$$ \sigma_t(X) = \frac{1}{t} \int_0^t ds\, \sigma(X(s)) \tag{38} $$

where for deterministic dynamics X(t) is the trajectory of the system, while for stochastic dynamics X(t) is a particular realization of the random process. Then by the large deviation principle (assumed to hold) there exists an e(p) such that for any interval I

$$ \lim_{t \to \infty} -\frac{1}{t} \log \mathrm{Prob}(\sigma_t(X) \in I) = \inf_{p \in I} e(p). \tag{39} $$
The fluctuation theorem asserts that the odd part of e(p) is linear with slope -1, i.e. e(p) - e(-p) = -p.

In the deterministic case the entropy production variable that one considers is, as already mentioned, the phase space contraction rate which, in the case of Gaussian thermostats, coincides with the entropy production as we defined it in Subsection 5.2. This can be easily seen by computing the divergence of the equations of motion quoted at the end of Subsection 4.4. Already for the case of the Nosé-Hoover thermostat the phase space contraction and the entropy production are not the same quantities, although their average values in the SNS are the same. For anharmonic chains with a two-body potential given by $V(x) = x^2 + \beta x^4$ the validity of the fluctuation theorem has been checked numerically in [59] for the entropy production (but not for the phase space contraction).

The connection between the Gallavotti-Cohen fluctuation theorem and its various generalizations can be related to the following observation. For a stochastic model described by a Markov process one can consider the measure P on the path space induced by the evolution. Let $\phi$ be a path leading from X to Y in phase space in the time interval [0,t], and consider now the transformation which maps the path $\phi$ into the time reversed path leading from IY to IX, where I is the involution on phase space which reverses the velocities of all particles. This transformation maps the measure P into a new measure $\bar{P}$ which is absolutely continuous with respect to P, with Radon-Nikodym derivative given by

$$ \frac{d\bar{P}}{dP}(\phi) = \exp\left( R(\phi(t)) - R(\phi(0)) + \int_0^t \sigma(\phi(s))\, ds \right), \tag{40} $$
where $\sigma$ is the entropy production defined in Eq.(28). Eq.(40) states, roughly speaking, that the probability of a time reversed path is equal (up to a boundary term) to the probability of that path times the exponential of the integrated entropy production along this path. This property can be taken as a general definition of entropy production [64] and can be seen as a generalization of detailed balance. In fact in the equilibrium case ($T_L = T_R = T$), the right hand side of Eq.(40) depends on $\phi$ only through its endpoints and is equal to $\exp(T^{-1}(H(\phi(t)) - H(\phi(0))))$, and this is precisely detailed balance.

For thermostated systems the above calculation is quite delicate because, although one can still consider the measure P on the path space, it will not be absolutely continuous with respect to the time reversed measure $\bar{P}$. The original proof of the fluctuation theorem in [31] can be thought of as based on the construction of a series of approximants to the measure P which satisfy a relation formally similar to Eq.(40).
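A discrete analogue of the path-reversal relation can be checked by brute force. The sketch below is an illustration under assumptions of its own, not the construction of [64] or [31]: a three-state discrete-time Markov chain with hypothetical transition matrices `K_DRIVEN` (biased around a cycle) and `K_REVERSIBLE` (symmetric), paths of a fixed number of steps started from a distribution `rho`, and no velocity-reversing involution. For each path it computes W, the logarithm of the ratio between the probability of the path and that of its reversal, which splits into a boundary term plus a sum of log K(x,y)/K(y,x) along the path. Summing P(path) exp(-W) over all paths must then give exactly 1, and the mean of W is non-negative.

```python
import itertools
import math

# Hypothetical transition matrices; every row sums to 1.
K_DRIVEN = [[0.5, 0.4, 0.1],
            [0.1, 0.5, 0.4],
            [0.4, 0.1, 0.5]]        # favors the cycle 0 -> 1 -> 2 -> 0
K_REVERSIBLE = [[0.50, 0.25, 0.25],
                [0.25, 0.50, 0.25],
                [0.25, 0.25, 0.50]]  # symmetric, hence detailed balance


def path_probability(path, K, rho):
    """Probability of a path started from the distribution rho."""
    prob = rho[path[0]]
    for x, y in zip(path, path[1:]):
        prob *= K[x][y]
    return prob


def path_entropy(path, K, rho):
    """W = log P(path) - log P(reversed path): boundary term plus bulk sum."""
    w = math.log(rho[path[0]]) - math.log(rho[path[-1]])
    for x, y in zip(path, path[1:]):
        w += math.log(K[x][y] / K[y][x])
    return w


def fluctuation_sums(K, rho, n_steps):
    """Return (sum over paths of P exp(-W), mean of W) by full enumeration."""
    total, mean_w = 0.0, 0.0
    for path in itertools.product(range(len(K)), repeat=n_steps + 1):
        prob = path_probability(path, K, rho)
        w = path_entropy(path, K, rho)
        total += prob * math.exp(-w)
        mean_w += prob * w
    return total, mean_w
```

With the uniform distribution, which is stationary for both matrices, the symmetric chain gives W = 0 on every path, the analogue of detailed balance, while the driven chain gives a strictly positive mean of W that grows linearly with the number of steps.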
9 Local equilibrium
Since we have nothing to say about LTE in Hamiltonian heat conducting systems we discuss briefly some models where the Hamiltonian dynamics in the bulk is modified by the addition of stochastic forces. These models show that a mechanism sufficiently strong to destroy the coherence of the phonons does indeed produce both a temperature gradient and a normal thermal conductivity, even in one-dimensional systems.

The first such model [7,76] is a harmonic chain coupled to self-consistent reservoirs. Each particle of the chain is coupled to its own stochastic reservoir at temperature $T_i$, $i = 1, \ldots, N$, modeled by Langevin dynamics. The reservoirs coupled to the particles in the bulk supposedly simulate the effect of strong anharmonic interactions. The reservoirs coupled to the first and last particles are at fixed temperatures $T_1 = T_L$ and $T_N = T_R$. The temperature of the remaining reservoirs is fixed by a self-consistency condition: one requires that in the stationary state there is no net energy exchange between the reservoirs and the particles in the bulk. It is argued [7] that in the limit of large N the system exhibits a temperature gradient, and Fourier's law is indeed satisfied [76].

Another model [17] is a one-dimensional chain of quantum mechanical atoms (each of which has a finite number of energy levels). The first and the last atoms of the chain are each coupled to heat reservoirs at temperatures $T_L$ and $T_R$. The atoms in the chain are not directly coupled to each other, but each pair of nearest neighbors in the chain is coupled to a heat reservoir. Since each atom has only a finite number of energy levels, it is possible to choose the coupling to the intermediate reservoirs in such a way that no energy is exchanged between the particles in the bulk and the reservoirs. Compared to the previous model, no self-consistency condition is needed and the transfer of energy between the intermediate reservoirs and the system always vanishes, not only in the stationary state. The model is studied in the weak coupling (or Van Hove) limit, where the coupling between the chain and all reservoirs goes to zero and the time is suitably rescaled. In this limit the evolution is described by a quantum semigroup. For a suitable choice of couplings the model can be solved exactly and it exhibits a temperature gradient and Fourier's law.

Another model considered [48] consists of a chain of uncoupled harmonic oscillators ($U_1(q_i) = q_i^2/2$, $V(q_i - q_j) = 0$). The oscillators at the boundary are coupled to heat reservoirs modeled by Glauber processes which thermalize the oscillators according to the Gibbs distribution at temperatures $T_L$ and $T_R$. The energy is exchanged between the oscillators in the chain according to the following (microcanonical) procedure: at each pair of nearest neighbor sites there is a clock with exponential law; when it rings, the energy of the pair of particles is redistributed in a uniform way, keeping the total energy of the system constant. This model is exactly solvable and satisfies Fourier's law.
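The last model lends itself to a direct simulation. The following is a minimal sketch under simplifying assumptions that are not taken from [48]: time is discretized into events, each event picks uniformly either one of the two boundary reservoirs or one of the nearest-neighbor bonds (which amounts to giving all clocks the same rate), a boundary event resamples the site energy from an exponential distribution with mean T_L or T_R, and the function name `simulate_kmp` and all parameter values are hypothetical. For this event rule the stationary mean energies satisfy a discrete Laplace equation, so the mean profile is linear in the site index.

```python
import random


def simulate_kmp(n_sites, T_left, T_right, n_events, burn_in, seed=0):
    """Return the event-averaged energy of each site in the stationary regime."""
    rng = random.Random(seed)
    energy = [1.0] * n_sites
    sums = [0.0] * n_sites
    for step in range(burn_in + n_events):
        # k = 0: left reservoir, k = n_sites: right reservoir,
        # otherwise the bond between sites k-1 and k.
        k = rng.randrange(n_sites + 1)
        if k == 0:
            energy[0] = rng.expovariate(1.0 / T_left)     # mean T_left
        elif k == n_sites:
            energy[-1] = rng.expovariate(1.0 / T_right)   # mean T_right
        else:
            pair_energy = energy[k - 1] + energy[k]
            u = rng.random()                              # uniform split
            energy[k - 1] = u * pair_energy
            energy[k] = (1.0 - u) * pair_energy
        if step >= burn_in:
            for i in range(n_sites):
                sums[i] += energy[i]
    return [s / n_events for s in sums]


def expected_profile(n_sites, T_left, T_right):
    """Linear solution of the mean-energy balance for the event rule above."""
    slope = (T_left - T_right) / n_sites
    return [T_left - slope * (i + 0.5) for i in range(n_sites)]
```

For five sites with T_L = 2 and T_R = 1, `expected_profile` gives mean energies 1.9, 1.7, 1.5, 1.3, 1.1, and the averages returned by `simulate_kmp` should approach these values up to statistical error. The mean energy current per event is then proportional to (T_L - T_R)/N, which is the scaling required by Fourier's law.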
10 Numerical results
The availability of fast computers has permitted the investigation of heat current carrying SNS via numerical simulations. Many such simulations have been performed, see e.g. [12,73,59,60,42,40,1,34] for recent works on various models. Earlier works were not always consistent, but recently careful simulations of both one- and two-dimensional crystals have become available [62]. A coherent picture now seems to emerge from the numerical results [62]. We describe this briefly and refer the reader to [61] for a more complete overview.

For obvious reasons, it is convenient in numerical works to keep the number of degrees of freedom as low as possible. On the other hand stochastic differential equations, like the ones used in the Langevin stochastic reservoirs, require very good random number generators and present numerical problems connected with the singularity of the covariance of the white noise. For this reason the simulations done using the deterministic thermostat are easier and likely more reliable. In any case there is good agreement (in the data if not always in the interpretation) between simulations with different thermostats.

Since it is not easy to look for invariant measures with numerical simulations one typically computes the time average of a few interesting observables along a given trajectory. Assuming the validity of Eq.(19) this represents averages with respect to the invariant measure. This permits investigation of questions like the temperature profile in a chain in the steady state or the value of the conductivity for which, as we have seen, there is no analytical result at this time. We will focus mainly on this last question.

1. In one dimension, where most of the simulations have been conducted, the conductivity, when $U_1(q) = 0$, appears to behave, as a function of the length $\mathcal{L}$ of the chain, as $\mathcal{L}^\alpha$ with $\alpha$ a positive exponent, $\alpha \approx 0.4$ for the anharmonic chain.
On the other hand if $U_1(q) \neq 0$ (typically one considers $U_1(q) = \frac{1}{2} \omega^2 q^2$) one finds a finite conductivity if some nonlinearity is present in the system (for the linear case $\alpha = 1$). In this situation the exact form of the nonlinearity seems irrelevant: analogous results are found adding a 4-th order term to U(q) as well as to V(q). Similar results are obtained if one compactifies the configuration space of Q, e.g. by considering each q as a point on a torus, obtaining what can be called a chain of rotators. In all cases the system has a well defined, approximately linear, temperature profile, although there can be a finite jump between the temperature of the first and last oscillator and the temperature of the respective thermostats. Moreover for one-dimensional systems simulations using different kinds of reservoirs appear to yield similar values of the exponent $\alpha$.

2. Recent simulations seem to show that the conductivity in two dimensions is logarithmically divergent if $U_1(q) = 0$. Although it is not easy to see a logarithm in such a situation, the simulation in [62] strongly suggests this conclusion. Moreover in this case one can try to compute the thermal conductivity $\kappa_{GK}$ as given in Eq.(37) (g) and compare it with the $\kappa$ obtained from Eq.(33) (without the limit $\mathcal{L} \to \infty$). The agreement obtained from the numerical data supports the validity of the equality $\kappa = \kappa_{GK}$. Although we do not know of any results in this direction, we believe that adding a confining U(q) to this system will make the conductivity finite.

3. We further expect that the conductivity in three dimensions will be finite, with or without any on-site potential.

The above picture can be interpreted in terms of the Peierls theory which, as mentioned in Section 4, relies on umklapp processes to produce a finite conductivity. We also note that in one and two dimensions there is no stable crystal without an on-site potential, but there is localization in $d \geq 3$. The variance in the deviation of an atom near the center from its equilibrium position, when the boundary atoms are tied down, grows like $\mathcal{L}$ in d = 1, like $\log \mathcal{L}$ in d = 2 and is finite in d = 3 [8].

(g) In this case the equilibrium ensemble used is $\langle \cdot \rangle_{E,\Pi}$ described in the comments after Eq.(37).

Acknowledgments

L. R.-B. is supported in part by the Swiss National Science Foundation. F. B. and J. L. were supported by NSF grant DMR-9813268 and AFOSR grant F4962098-1-0207. They acknowledge the hospitality of the IHES where this work was completed.

References

1. K. Aoki and D. Kusnezov, Preprint 1999 http://xxx.lanl.gov/ps/chao-dyn/9910015.
2. N.W. Ashcroft and N.D. Mermin, Solid State Physics, Saunders College 1988.
3. V. Bach, J. Froehlich, and I.M. Sigal, Adv. Math., 137 (205), 1998.
4. V. Bach, J. Froehlich, and I.M. Sigal, Adv. Math., 137 (299), 1998.
5. V. Bach, J. Froehlich, and I.M. Sigal, Commun. Math. Phys., 207 (249), 1999.
6. P.G. Bergman and J.L. Lebowitz, Phys. Rev., 99 (2), 1955.
7. M. Bolsterli, M. Rich, and W.M. Visscher, Phys. Rev. A, 1 (1086), 1970.
8. H.J. Brascamp, E.H. Lieb and J.L. Lebowitz, Bulletin of the International Statistical Institute, 4 (393), 1975.
9. S. Brush, The kind of motion we call heat, North Holland, 1976.
10. L. Bunimovich and Ya. G. Sinai, Commun. Math. Phys., 78 (479), 1981.
11. L. Bunimovich and H. Spohn, Commun. Math. Phys., 176 (661), 1996.
12. G. Casati, J.
Ford, F. Vivaldi, and W.M. Visscher, Phys. Rev. Lett., 52 (1861), 1984.
13. A. Casher and J.L. Lebowitz, J. Math. Phys., 12 (1701), 1971.
14. N.I. Chernov, G.L. Eyink, J.L. Lebowitz, and Ya. G. Sinai, Commun. Math. Phys., 154 (569), 1993.
15. N.I. Chernov and J.L. Lebowitz, Phys. Rev. Lett., 71 (2831), 1995.
16. N.I. Chernov and J.L. Lebowitz, J. Stat. Phys., 86 (953), 1997.
17. E.B. Davies, J. Stat. Phys., 18 (161), 1978.
18. P. Debye, Vorträge über die Kinetische Theorie der Wärme, Teubner, 1914.
19. J. Derezinski and V. Jaksic, Preprint 1998.
20. J.-P. Eckmann and M. Hairer, Preprint 1999 http://xxx.lanl.gov/ps/chao-dyn/9909035.
21. J.-P. Eckmann, C.-A. Pillet, and L. Rey-Bellet, Commun. Math. Phys., 201 (657), 1999.
22. J.-P. Eckmann, C.-A. Pillet, and L. Rey-Bellet, J. Stat. Phys., 95 (305), 1999.
23. R. Esposito, J.L. Lebowitz and R. Marra, Commun. Math. Phys., 160 (49), 1994.
24. R. Esposito, J.L. Lebowitz and R. Marra, J. Stat. Phys., 78 (389), 1995.
25. R. Esposito, J.L. Lebowitz and R. Marra, J. Stat. Phys., 90 (1129), 1998.
26. D.J. Evans, E.G.D. Cohen, and G.P. Morriss, Phys. Rev. Lett., 71 (2401), 1993.
27. D.J. Evans and G.P. Morriss, Statistical Mechanics of Nonequilibrium Liquids, Academic Press, San Diego 1990.
28. J. Farmer, S. Goldstein, and E.R. Speer, J. Stat. Phys., 34 (263), 1984.
29. F. Fidaleo and C. Liverani, J. Stat. Phys., (98), 2000.
30. J. Fritz, T. Funaki, and J.L. Lebowitz, Probab. Theory Related Fields, 99 (211), 1994.
31. G. Gallavotti and E.G.D. Cohen, J. Stat. Phys., 80 (931), 1995.
32. G. Gallavotti, Physica D, 105 (163), 1997.
33. G. Gallavotti, Chaos, 8 (384), 1998.
34. C. Giardina, R. Livi, A. Politi, and M. Vassalli, Preprint 1999.
35. S. Goldstein, C. Kipnis and N. Ianiro, J. Stat. Phys., 41 (915), 1985.
36. S. Goldstein, J.L. Lebowitz and E. Presutti, in Colloquia Mathematica Societatis János Bolyai 27 (403), 1979.
37. S. Goldstein, J.L. Lebowitz and E. Presutti, in Colloquia Mathematica Societatis János Bolyai 27 (421), 1979.
38. S. Goldstein, J.L. Lebowitz and K. Ravishankar, Commun. Math. Phys., 85 (419), 1982.
39. S. Goldstein, J.L. Lebowitz and K. Ravishankar, J. Stat. Phys., 43 (303), 1986.
40. T. Hatano, Phys. Rev. E, 59 (R1), 1999.
41. W.G. Hoover, Computational Statistical Mechanics, Elsevier 1991.
42. B. Hu, B. Li and H. Zhao, Phys. Rev. E, 57 (2992), 1998.
43. V. Jaksic and C.-A. Pillet, Commun. Math. Phys., 176 (619), 1996.
44. V. Jaksic and C.-A. Pillet, Commun. Math. Phys., 178 (627), 1996.
45. V. Jaksic and C.-A. Pillet, Acta Math., 181 (245), 1998.
46. E.A. Jackson, Rocky Mount. J. Math., 8 (127), 1978.
47. C. Kipnis and C. Landim, Scaling limits of interacting particle systems, Springer, Berlin 1999.
48. C. Kipnis, C. Marchioro and E. Presutti, J. Stat. Phys., 27 (65), 1982.
49. A. Komech and H. Spohn, Nonlinear Anal., 33 (13), 1998.
50. A. Komech, H. Spohn and M. Kunze, Comm. Partial Differential Equations, 22 (307), 1997.
51. R. Kubo, Acta Phys. Austr., Suppl., 10 (301), 1973.
52. J. Kurchan, J. Phys. A, 31 (3719), 1998.
53. R. Kubo, M. Toda and N. Hashitsume, Statistical Physics II, Springer Series in Solid State Sciences, Vol 31, Springer, Berlin 1991.
54. J.L. Lebowitz, Prog. Theor. Phys. Suppl., 64 (35), 1978.
55. J.L. Lebowitz and P.G. Bergman, Ann. Physics, 1 (1), 1957.
56. J.L. Lebowitz and H. Spohn, J. Stat. Phys., 19 (633), 1978.
57. J.L. Lebowitz, E. Presutti and H. Spohn, J. Stat. Phys., 51 (841), 1988.
58. J.L. Lebowitz and H. Spohn, J. Stat. Phys., 95 (333), 1999.
59. S. Lepri, R. Livi, and A. Politi, Phys. Rev. Lett., 78 (1896), 1997.
60. S. Lepri, R. Livi, and A. Politi, Europhys. Lett., 43 (271), 1998.
61. S. Lepri, R. Livi, and A. Politi, In preparation.
62. S. Lippi and R. Livi, Preprint 1999 http://xxx.lanl.gov/ps/chao-dyn/9910034.
63. C. Liverani and S. Olla, Commun. Math. Phys., 189 (481), 1997.
64. C. Maes, J. Stat. Phys., 95 (367), 1999.
65. H. Maassen, M. Guta and D. Botvich, Preprint 1999 http://www.ma.utexas.edu/mp_arch/99/99-220.ps.gz.
66. J.A. McLennan, Introduction to nonequilibrium statistical mechanics, Prentice Hall, 1989.
67. H. Nakazawa, Supplement of the Progress of Theoretical Physics, 45 (231), 1970.
68. S. Nosé, J. Chem. Phys., 81 (511), 1984.
69. A.J. O'Connor and J.L. Lebowitz, J. Math. Phys., 15 (692), 1974.
70. S. Olla, S.R.S. Varadhan, H.-T. Yau, Commun. Math. Phys., 155 (523), 1993.
71. R.E. Peierls, Quantum Theory of Solids, University press 1956.
72. R.E. Peierls, in Theoretical Physics in the Twentieth Century, Interscience 1960.
73. T. Prosen and M. Robnik, J. Phys. A: Math. Gen., 25 (3449), 1992.
74. T. Prosen and D.K. Campbell, Preprint 1999 http://xxx.lanl.gov/ps/chao-dyn/9910024.
75. Z. Rieder, J.L. Lebowitz, and E.H. Lieb, J. Math. Phys., 8 (1073), 1967.
76. M. Rich and W.M. Visscher, Phys. Rev. B, 11 (2164), 1975.
77. R.J. Rubin and W.L. Greer, J. Math. Phys., 12 (1686), 1971.
78. D. Ruelle, Preprint 1999 http://www.ma.utexas.edu/mp_arch/99/99-220.ps.gz.
79. H. Spohn, Large Scale Dynamics of Interacting Particles, Springer Verlag, 1991.
80. H. Spohn and J.L. Lebowitz, Commun. Math. Phys., 54 (97), 1977.
81. F. Zhang, D.J. Isbister and D.J. Evans, Preprint 1999, http://xxx.lanl.gov/ps/chao-dyn/9910024.
THE "CORPUSCULAR" STRUCTURE OF THE SPECTRA OF OPERATORS DESCRIBING LARGE SYSTEMS

R.A. MINLOS

Institute for Information Transmission Problems, Russian Academy of Sciences, Bolshoi Karetny per., 19, Moscow, 101447, Russia
E-mail: minl@iitp.ru
Here we explain the remarkable feature of the spectra of "many component" operators: the natural appearance of states which can be considered as states of one, two, ... and so on "particles". We give several nontrivial examples of this phenomenon for some operators of mathematical physics.

Contents
I. The example: N quantum particles
II. Infinite component systems (general definitions)
   1. What kinds of systems are considered here?
   2. One-particle subspace
   3. Two-particle subspaces
III. Concrete classes of operators
   1. The lattice model of the quantum harmonic crystal
   2. Generators of stochastic dynamics
   3. Hamiltonians of quantum infinite lattice systems (in the ground state)
   4. Hamiltonians of lattice models of Euclidean quantum field theory
Acknowledgments
References
One of the most amazing facts in the quantum theory of systems with infinitely many degrees of freedom (the theory of fields, statistical physics, stochastic dynamics and so on) is the following: objects which we interpret as particles appear in the theory by themselves. The states of these particles (or quasiparticles, as they are sometimes called) are defined by their momentum, spin and maybe also some collection of parameters usually characterizing the symmetries of the system. In addition the energy $E$ of the particle is defined completely by the values of these parameters (a so-called dispersion law for the particle).

I The example: N quantum particles
Start with a comparatively well-studied example: the system of $N$ nonrelativistic quantum particles. The energy operator (Hamiltonian) $H_N$ for such a system has the form

$$H_N = -\frac{\hbar^2}{2m}\sum_{i=1}^{N}\Delta_{x_i} + \sum_{1\le i<j\le N} V(x_i - x_j) \tag{1.1}$$

and acts in the Hilbert space $L_2((R^3)^N)$ of functions $f$ of $N$ variables $x_i \in R^3$, $i = 1,\dots,N$. Here $\hbar$ is Planck's constant, $m$ is the mass of the particles (in the following we put $\hbar^2/2m = 1$ by a choice of scales), $\Delta_{x_i}$ is the Laplace operator w.r.t. the variable $x_i$, and $V(\xi)$, $\xi \in R^3$, is a pair interaction potential decreasing very fast: $V(\xi) \to 0$ when $|\xi| \to \infty$.

Consider first the case $N = 1$:

$$H_1 f = -\Delta f, \qquad f \in L_2(R^3) \tag{1.2}$$
Evidently this operator has a continuous spectrum, and the eigenfunctions of this spectrum forming a basis (see [2]) have the form of "waves"

$$u^{\{1\}}_p(x) = \exp\{i(x,p)\}, \qquad p \in R^3 \tag{1.3}$$

Every such wave describes the state of a "freely moving particle" with momentum $p$ and energy

$$E_{\{1\}}(p) = p^2 \tag{1.4}$$
Now in the case $N = 2$,

$$H_2 f = -(\Delta_{x_1} + \Delta_{x_2}) f + V(x_1 - x_2) f, \tag{1.5}$$

there is a subspace $\mathcal{H}_{\{1\},\{2\}} \subset L_2((R^3)^2)$ invariant with respect to the operator $H_2$ and the group of translations:

$$f(x_1,x_2) \to f(x_1 + s, x_2 + s), \qquad s \in R^3 \tag{1.6}$$
This subspace is spanned by the eigenfunctions of the continuous spectrum of the operator $H_2$:

$$u^{\{1\},\{2\}}_{p_1,p_2}(x_1,x_2) = \exp\{i(x_1,p_1) + i(x_2,p_2)\} + \delta_{p_1,p_2}(x_1,x_2) \tag{1.7}$$

where $\delta_{p_1,p_2}(x_1,x_2) \to 0$ if $|x_1 - x_2| \to \infty$. These functions describe the states of two particles "moving asymptotically freely at infinity" with momenta $p_1$ and $p_2$. The energy of these states is equal to

$$E_{\{1\},\{2\}} = p_1^2 + p_2^2 \tag{1.8}$$
Besides the eigenfunctions of the form (1.7) there can exist other eigenfunctions. Namely, after passing to the variables

$$R = \frac{x_1 + x_2}{2} \ \text{(the center of gravity)}, \qquad r = x_1 - x_2 \ \text{(the relative position of particles)}$$

the Hamiltonian $H_2$ may be written in the form

$$H_2 f = -\tfrac{1}{2}\Delta_R f - 2\Delta_r f + V(r) f = \left(-\tfrac{1}{2}\Delta_R + h_2\right) f \tag{1.9}$$

In the case when there are eigenfunctions $\psi_1(r),\dots,\psi_k(r) \in L_2(R^3)$ of the auxiliary operator $h_2 = -2\Delta_r + V(r)$ with the eigenvalues $\lambda_1 < \lambda_2 < \dots < \lambda_k < 0$, the operator $H_2$ has eigenfunctions of the continuous spectrum of the form

$$u^{\{1,2\},j}_{P}(x_1,x_2) = \psi_j(x_1 - x_2)\, e^{i\left(P, \frac{x_1+x_2}{2}\right)} \tag{1.10}$$

with energy

$$E^{(j)}_{\{1,2\}}(P) = \tfrac{1}{2}P^2 + \lambda_j, \qquad j = 1,\dots,k \tag{1.11}$$
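The passage to the center of gravity and the relative position rests on the identity $\Delta_{x_1} + \Delta_{x_2} = \tfrac{1}{2}\Delta_R + 2\Delta_r$. The following minimal sketch checks this identity numerically in one space dimension by central finite differences; the test function `g`, the step size and all function names are illustrative assumptions and are not part of the original text.

```python
import math


def second_diff(func, point, direction, step=1e-3):
    """Central second difference of a two-variable function along `direction`."""
    x, y = point
    dx, dy = direction
    forward = func(x + step * dx, y + step * dy)
    backward = func(x - step * dx, y - step * dy)
    return (forward - 2.0 * func(x, y) + backward) / step ** 2


def g(R, r):
    """Hypothetical smooth test function of the center of gravity R and relative position r."""
    return math.exp(-r * r) * math.cos(1.3 * R) + R * R * r


def f(x1, x2):
    """The same function written in the particle coordinates x1, x2."""
    return g((x1 + x2) / 2.0, x1 - x2)


def laplacian_particles(x1, x2):
    """(d^2/dx1^2 + d^2/dx2^2) f, computed in the particle coordinates."""
    return second_diff(f, (x1, x2), (1, 0)) + second_diff(f, (x1, x2), (0, 1))


def laplacian_jacobi(x1, x2):
    """(1/2) d^2/dR^2 + 2 d^2/dr^2 applied to g, computed in the variables (R, r)."""
    R, r = (x1 + x2) / 2.0, x1 - x2
    return 0.5 * second_diff(g, (R, r), (1, 0)) + 2.0 * second_diff(g, (R, r), (0, 1))


def bound_state_energy(P, lam):
    """Energy (1/2) P^2 + lambda_j of a two-particle bound state moving with momentum P."""
    return 0.5 * sum(p * p for p in P) + lam


if __name__ == "__main__":
    print(laplacian_particles(0.4, -0.7), laplacian_jacobi(0.4, -0.7))
    print(bound_state_energy((1.0, 2.0, 2.0), -0.5))
```

The two printed Laplacians should agree up to the discretization error of the finite differences, which is the only approximation made here.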
Each eigenfunction of the series (1.10) describes a so-called bound state of two particles with binding energy $\lambda_j$, moving together with momentum $P$. The space $\mathcal{H}^{(j)}_{\{1,2\}}$ spanned by eigenfunctions of the form (1.10) is invariant with respect to the operator $H_2$ and the group of translations (1.6). It turns out that there are no other branches of the spectrum of $H_2$, that is

$$L_2((R^3)^2) = \mathcal{H}_{\{1\},\{2\}} \oplus \mathcal{H}^{(1)}_{\{1,2\}} \oplus \dots \oplus \mathcal{H}^{(k)}_{\{1,2\}} \tag{1.12}$$

This assertion is called "asymptotic completeness" of the states (1.7) and (1.10).

The case $N = 3$:

$$H_3 f = -(\Delta_{x_1} + \Delta_{x_2} + \Delta_{x_3}) f + (V(x_1 - x_2) + V(x_1 - x_3) + V(x_2 - x_3)) f \tag{1.13}$$
In this case there are the following subspaces of $L_2((R^3)^3)$ invariant w.r.t. $H_3$ and the group of translations.

1. The subspace $\mathcal{H}_{\{1\},\{2\},\{3\}}$, which is spanned by the eigenfunctions of the continuous spectrum of $H_3$:

$$u^{\{1\},\{2\},\{3\}}_{p_1,p_2,p_3} = \exp\{i(p_1,x_1) + i(p_2,x_2) + i(p_3,x_3)\} + \delta_{p_1,p_2,p_3}(x_1,x_2,x_3) \tag{1.14}$$

where $\delta_{p_1,p_2,p_3} \to 0$ when all pairwise distances $|x_1 - x_2|$, $|x_2 - x_3|$, $|x_3 - x_1|$ go to infinity. The eigenfunctions (1.14) describe the states of three "asymptotically free particles" with the momenta $p_1, p_2, p_3$. The energy of these states is equal to

$$E_{\{1\},\{2\},\{3\}}(p_1,p_2,p_3) = p_1^2 + p_2^2 + p_3^2 \tag{1.15}$$

2. The spaces

$$\mathcal{H}^{(j)}_{\{1,2\},\{3\}}, \quad \mathcal{H}^{(j)}_{\{1,3\},\{2\}}, \quad \mathcal{H}^{(j)}_{\{2,3\},\{1\}}, \qquad j = 1,\dots,k,$$

are spanned by the eigenfunctions of $H_3$ (for the case $\mathcal{H}^{(j)}_{\{1,2\},\{3\}}$)

$$u^{\{1,2\},\{3\},j}_{P,p} = \psi_j(x_1 - x_2)\, e^{i\left(P, \frac{x_1+x_2}{2}\right)} e^{i(p,x_3)} + \delta^{(j)}_{P,p}(x_1,x_2,x_3) \tag{1.16}$$

where $\delta^{(j)}_{P,p} \to 0$ when $\min\{|x_3 - x_1|, |x_3 - x_2|\} \to \infty$. These functions describe the states of three particles where two of them (1 and 2) are bound and the third is "asymptotically free". The energy of these states is equal to

$$E^{(j)}_{\{1,2\},\{3\}}(P,p) = p^2 + E^{(j)}_{\{1,2\}}(P) = p^2 + \tfrac{1}{2}P^2 + \lambda_j \tag{1.17}$$

The spaces $\mathcal{H}^{(j)}_{\{1,3\},\{2\}}$, $\mathcal{H}^{(j)}_{\{2,3\},\{1\}}$ are defined in a similar way.
3. Finally there are the subspaces $\mathcal{H}^{(j)}_{\{1,2,3\}}$, $j = 1,\dots,m$, of the bound states of three particles. They are constructed similarly to the bound states of two particles. Introduce the variables

$$R = \tfrac{1}{3}(x_1 + x_2 + x_3) \ \text{(the center of gravity)}, \qquad r_{1,2} = x_1 - x_2, \qquad r_{1,3} = x_1 - x_3$$

The operator $H_3$ is written in these variables in the form

$$H_3 = -\tfrac{1}{3}\Delta_R + h_3$$

where $h_3$ is a differential (elliptic) operator acting in the space of functions of two variables $r_{1,2}$, $r_{1,3}$. Let $\psi_1(r_{1,2},r_{1,3}),\dots,\psi_m(r_{1,2},r_{1,3})$ be eigenfunctions of the operator $h_3$ with eigenvalues $\mu_1 < \mu_2 < \dots < \mu_m$. Then the functions

$$u^{\{1,2,3\},j}_{P} = e^{i\left(P, \frac{x_1+x_2+x_3}{3}\right)} \psi_j(x_1 - x_2, x_1 - x_3), \qquad j = 1,\dots,m \tag{1.18}$$

are eigenfunctions of the continuous spectrum of $H_3$ with eigenvalues

$$E^{(j)}_{\{1,2,3\}}(P) = \tfrac{1}{3}P^2 + \mu_j \tag{1.19}$$

They describe the bound states of three particles with binding energy $\mu_j$ which move together with momentum $P$.

In this case again "asymptotic completeness" holds, that is, there are no other invariant subspaces of the operator $H_3$.

In a similar way the invariant subspaces of the operators $H_N$, $N = 4, 5, \dots$ are defined. Each of them is defined by some partition of the $N$ particles into groups $\{1,\dots,N\} = Q_1 \cup Q_2 \cup \dots \cup Q_s$, $Q_j \cap Q_{j'} = \emptyset$, $j \ne j'$, and by a choice for every group $Q_j = (i_1,\dots,i_k)$, $k \le N$, of a corresponding bound state of $k$ particles. A bound state of $N$ particles is defined by the position of the center of gravity of these particles and by a choice of an eigenfunction of an auxiliary operator $h_N$ (like the operators $h_2$ and $h_3$) depending only on the relative positions of the particles.

Remark. Note that every bound-state subspace ($\mathcal{H}^{(j)}_{\{1,2\}}$ for two particles or $\mathcal{H}^{(j)}_{\{1,2,3\}}$ for three) can be considered as the space of the states of a new (composite) particle. According to this point of view there is some number of one-particle subspaces, and the other subspaces describe the states of several "particles which move asymptotically freely" with the momenta $p_1,\dots,p_m$; the energy of such states is equal to the sum of the energies of all "particles".
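The labelling of invariant subspaces by partitions of the particles into groups can be made concrete by enumerating the partitions themselves. The sketch below is only an illustration (the function names are hypothetical): it lists all partitions of $\{1,\dots,N\}$ into non-empty groups, and for $N = 3$ it reproduces the five types of subspaces described above, namely three free particles, the three ways of binding a pair, and one three-particle bound state. It does not account for the additional choice of a bound state within each group.

```python
def set_partitions(items):
    """Yield every partition of the list `items` into non-empty groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing group in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or let it form a new group of its own.
        yield [[first]] + partition


def channels(n):
    """All partitions of the particle labels 1..n into groups."""
    return list(set_partitions(list(range(1, n + 1))))


if __name__ == "__main__":
    for partition in channels(3):
        print(partition)
```

The number of partitions grows quickly with $N$ (these are the Bell numbers), which gives a feeling for how many types of invariant subspaces the operator $H_N$ can have.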
II Infinite component systems (general definitions)

II.1 What kinds of systems are considered here?

Here we study lattice "spin" systems. The state of such a system is given by an infinite configuration of "spins"

$$Q = \{q_x,\ x \in Z^d\}$$

on the $d$-dimensional lattice $Z^d$ taking values $q_x$ in some set $S$ (the "spin" space). The totality of these configurations (the phase space of the system) we denote by $\Omega$ ($= S^{Z^d}$) and suppose that it is provided with a probability measure $\mu$. The average with respect to $\mu$ is denoted by $\langle\cdot\rangle_\mu$. In what follows we shall consider operators acting in the Hilbert space $\mathcal{H} = L_2(\Omega,\mu)$. Let $\tau_s$, $s \in Z^d$, be the shift of configurations along $s \in Z^d$:

$$(\tau_s Q)_x = q_{x-s}, \qquad Q = \{q_x,\ x \in Z^d\} \in \Omega$$

We assume that the measure $\mu$ is invariant with respect to all shifts. Evidently the operators acting in $\mathcal{H}$,

$$(U_s f)(Q) = f(\tau_s^{-1} Q), \qquad f \in \mathcal{H},$$

define a unitary representation of the group of shifts.

II.2 One-particle subspace

Let $H$ be some selfadjoint operator acting in $\mathcal{H}$, commuting with the group $\{U_s, s \in Z^d\}$ and such that

$$H \cdot 1 = 0 \tag{2.1}$$

($1 \in \mathcal{H}$ is the function identically equal to unity on $\Omega$). The subspace $\mathcal{H}_1 \subset \mathcal{H}$ is called a one-particle subspace for $H$ if it is invariant with respect to $H$ and $U_s$ and cyclic with respect to the group $\{U_s\}$. More exactly: in $\mathcal{H}_1$ there is an orthonormal basis

$$\{v_x,\ x \in Z^d\} \tag{2.2}$$

labelled by points of the lattice $Z^d$ and such that

$$U_s v_x = v_{x+s} \tag{2.3}$$

Then $H v_x = \sum_y a_{x,y} v_y$ (because $\mathcal{H}_1$ is invariant with respect to $H$) and $a_{x,y} = c(x - y)$ (because $H$ commutes with $U_s$). We suppose that the function $c(z)$, $z \in Z^d$, decreases very fast at infinity. As a rule

$$|c(z)| < \mathrm{const}\; e^{-\kappa |z|} \tag{2.4}$$

for some $\kappa > 0$. From (2.1), (2.3) and (2.4) it follows that $\mathcal{H}_1$ is orthogonal to $1$, which means

$$\langle v_x \rangle_\mu = 0 \quad \text{for every } x \in Z^d \tag{2.5}$$
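Let $W_1$ be the unitary map

$$W_1 : \mathcal{H}_1 \to L_2(T^d, d\lambda)$$

such that

$$W_1 v_x = e^{i(x,\lambda)} \tag{2.6}$$

Here $T^d$ is the $d$-dimensional torus, $d\lambda = \frac{1}{(2\pi)^d}\, d\lambda^{(1)} \dots d\lambda^{(d)}$ is the usual normalized Haar measure on $T^d$ ($\lambda = (\lambda^{(1)},\dots,\lambda^{(d)}) \in T^d$),

$$(x,\lambda) = \sum_{i=1}^{d} x^{(i)} \lambda^{(i)}, \qquad x = (x^{(1)},\dots,x^{(d)})$$

This map transforms the operators $H_1 = H|_{\mathcal{H}_1}$ and $U_s^{(1)} = U_s|_{\mathcal{H}_1}$ into the operators

$$\tilde H_1 = W_1 H_1 W_1^{-1} \quad \text{and} \quad \tilde U_s^{(1)} = W_1 U_s^{(1)} W_1^{-1}$$

in $L_2(T^d, d\lambda)$, which act by the formulas

$$(\tilde H_1 f)(\lambda) = \tilde c(\lambda) f(\lambda), \qquad f \in L_2(T^d, d\lambda) \tag{2.7}$$

$$(\tilde U_s^{(1)} f)(\lambda) = \exp\{i(s,\lambda)\} f(\lambda), \qquad \lambda \in T^d \tag{2.8}$$

where

$$\tilde c(\lambda) = \sum_{z \in Z^d} c(z)\, e^{i(z,\lambda)}$$

From (2.7) and (2.8) it follows that the spectrum of the operators $U_s^{(1)}$ and $H_1$ is absolutely continuous and the spectrum of $H_1$ coincides with the set of values of the function $\tilde c(\lambda)$:

$$E(\lambda) = \tilde c(\lambda) \tag{2.9}$$

The functions $\delta_\mu(\cdot) = \{\delta(\cdot - \mu),\ \mu \in T^d\}$ are generalized eigenfunctions of the continuous spectrum of the operators $\tilde U_s^{(1)}$ and $\tilde H_1$ (see [3]). They have decompositions

$$\delta_\mu(\lambda) = \sum_{x \in Z^d} e^{-i(\mu,x)} e^{i(\lambda,x)} \tag{2.10}$$

From here we get that the formal series

$$F_\mu(Q) = \sum_{x \in Z^d} e^{-i(\mu,x)} v_x(Q) \tag{2.11}$$

which can be considered as generalized functions on $\Omega$ (linear functionals on a suitable space of quasilocal functions on $\Omega$) are also eigenfunctions of the continuous spectrum of $U_s^{(1)}$ and $H_1$. The series (2.11) is interpreted in concrete physical situations as a "wave of excitations" (electromagnetic wave, sound wave, spin wave and so on) with quasimomentum $\mu \in T^d$ and "energy" (dispersion) $E(\mu) = \tilde c(\mu)$.

Remark 1. In some cases, when the system has additional degrees of freedom, the basis (2.2) in the one-particle subspace $\mathcal{H}_1$ is provided with an additional index: $\{v_x^\alpha,\ x \in Z^d,\ \alpha \in A\}$ ($|A| < \infty$), and the operators $U_s$ again act in this basis by the formula (2.3), but $H_1$ acts by the formula

$$H_1 v_x^\alpha = \sum_{\alpha', y} C_{\alpha,\alpha'}(x - y)\, v_y^{\alpha'} \tag{2.12}$$

where $C(z) = (C_{\alpha,\alpha'}(z))_{\alpha,\alpha' \in A}$ is a selfadjoint matrix. The spectrum of $H_1$ in $\mathcal{H}_1$ coincides with the set of eigenvalues $\{\nu_j(\lambda),\ j = 1,\dots,|A|\}$ of the matrices

$$\tilde C(\lambda) = \sum_{z} C(z)\, e^{i(\lambda,z)}$$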
Two-particle subspaces
Let us assume that the operator H in H has a one-particle subspace %\ with fixed basis {vx, x £ 2id) and that c(x — y) as above are the matrix elements of E\ in this basis. The subspace "H2 C "H is called a two-particle subspace for H if it is invariant with respect to H and U, and is provided with an orthonormal basis {v{xuXl),(xux2)CZd}
(2.13)
labelled by unordered pairs (xi,X2) of points such that U>v(*i,x,) = v(*i+s,x? + ,)
HV(XuX3) =Y^C(Xl x\
- X\)V(*\,X7)+YlC(X2
(2.14)
~ X'2)V(XLS',) +
x'2
J2 ^{(^i.^),^!,^)}^',,^)
(2-!5)
Here the quantity S , {(ii, X2), (x[, x'2)} (the so called "connected part" of the matrix element of H in the basis (2.13)) satisfies the following conditions: i) translation invariance: S{(Xi +s,x2
+ s), {x\ + s, x'2 + s)} = 5 { ( n , 12), (x'ltx'2)}
(2.16)
158
ii) fast decrease: S{(xl,x2),(x[,x'2)}^0
(2.17)
when max{|a:i - x2\, \x\ - x'2\, |a?i - x[\, \x2 - x'2\} -> oo As a rule S{(xi,X2),{x'l,x'2)}
satisfies a so-called "tree-like" estimate: < const e-Kd<*f*^'V
\S{{xi,x2),(x'vx2)}\
(2.18)
for some n > 0. Here d^, A C Z d , is the minimal length of any connected subgraph G C Zd such that A C V{G) where V{G) is the set of vertices of G. The unitary map W2 : U2 -»• Lfm(Td x T d , dAidAa) : W2v(xi, x2)eXl , r3 (Ai,A 2 ) = f^[exp{z'(Ai,a:i)-fi(A2,a;2)}+exp{i(Ai,a;2) + i(A2,Xi)}]
a^ ^ x 2
[exp{i(x,Ai + A2)}
x i = a ; 2 = 2: (2.19)
transforms the operators £/j ' = Us\%^ and i7 2 = H\n2 into operators in L2ym(Td x T d ): (t>i 2) /)(Ai, A2) = exp{i*(Ai + A 2 )}/(A!, A2) (tf 2 /)(A,, A2) = (c(A0 + c(A 2 ))/(Ai, A2) + (2.20) /
#(Ai, A2; ^ 1 , v2)f(f*i > H2)d(iid(i2
Here /£T(Ai, A2; A*I> A*2) is a smooth kernel concentrated on the surface { A i , A 2 ; P l , p 2 : Ai + A2 = in + fi3} C (Td)4
(2.21)
The passage to variables A = Ai + A2, A = Ai allows us to represent the space L2(Td x Td) as a direct integral (see [4]) Ls2ym(Td xTd)=
I L2A){Td)d\
where L2 '{Td) C L2{Td) is the subspace of functions / G L2(Td) with respect to the change A to A — A: /(A) = /(A - A)
(2.22)
symmetrical
159
Figure 1. The typical view of the two-particle spectrum (the continuous band lies between $\min_\lambda b_\Lambda(\lambda)$ and $\max_\lambda b_\Lambda(\lambda)$).
In addition, under the representation (2.22) the operators $\tilde U_s^{(2)}$ and $\tilde H_2$ are represented as direct integrals of the operators $U_s^{(2)}(\Lambda) = e^{i(s,\Lambda)} E^{(\Lambda)}$ and $H_2(\Lambda)$:

$$(H_2(\Lambda) f)(\lambda) = b_\Lambda(\lambda) f(\lambda) + \int K_\Lambda(\lambda,\mu) f(\mu)\, d\mu \tag{2.23}$$

where

$$b_\Lambda(\lambda) = \tilde c(\lambda) + \tilde c(\Lambda - \lambda), \tag{2.24}$$

$K_\Lambda(\lambda,\mu) = K(\lambda, \Lambda - \lambda; \mu, \Lambda - \mu)$ and $E^{(\Lambda)}$ is the unit operator in $L_2^{(\Lambda)}(T^d)$. The operators (2.23) for any $\Lambda \in T^d$ are operators of Friedrichs' type (see [5]). The spectrum of such an operator consists of an absolutely continuous part coinciding with the set of values of $b_\Lambda(\cdot)$, and maybe a finite set of eigenvalues

$$\kappa_1(\Lambda),\dots,\kappa_s(\Lambda), \qquad s = s(\Lambda) \tag{2.25}$$

These eigenvalues for all $\Lambda \in T^d$, except maybe a finite number of values of $\Lambda$, lie outside the continuous spectrum of $H_2(\Lambda)$. The positions of the spectra of the operators $H_2(\Lambda)$ are represented in Figure 1 (for the case $d = 1$).

The hatched area in Figure 1 is the "main part" of the spectrum of $H$. The (generalized) eigenfunctions of this part describe the "asymptotically free" movement of two "waves of excitation" with quasimomenta $\lambda_1$ and $\lambda_2$ ($\Lambda = \lambda_1 + \lambda_2$ is their total quasimomentum) and energy $E = \tilde c(\lambda_1) + \tilde c(\lambda_2)$. Denote by $\mathcal{H}_2^{main} \subset \mathcal{H}_2$ the subspace spanned by these functions. The heavy lines in Figure 1 are called the spectrum of "bound states". The corresponding eigenfunctions describe the movement of two connected waves with total quasimomentum $\Lambda$ and "energy" $E(\Lambda) = \kappa(\Lambda)$.

Remarks 2. a) In many cases the basis (2.13) in $\mathcal{H}_2$ is labelled by pairs $(x_1,x_2)$ of different points only: $x_1 \ne x_2$. Then instead of the space $L_2^{sym}(T^d \times T^d)$ in (2.19) one should consider the subspace $\hat L_2^{sym}(T^d \times T^d) \subset L_2^{sym}(T^d \times T^d)$ orthogonal to functions of the form $h(\lambda_1 + \lambda_2)$. Correspondingly the space $L_2^{(\Lambda)}(T^d)$ in (2.22) is changed to the subspace $\hat L_2^{(\Lambda)} \subset L_2^{(\Lambda)}$ which is orthogonal to a constant.

b) In the case described in Remark 1 the basis (2.13) in $\mathcal{H}_2$ also is labelled by pairs of additional indices $(\alpha_1,\alpha_2)$ and the operator $H_2$ in this basis acts by the formula

$$H_2 v^{\alpha_1,\alpha_2}_{(x_1,x_2)} = \sum_{\alpha_1', x_1'} C_{\alpha_1,\alpha_1'}(x_1 - x_1')\, v^{\alpha_1',\alpha_2}_{(x_1',x_2)} + \sum_{\alpha_2', x_2'} C_{\alpha_2,\alpha_2'}(x_2 - x_2')\, v^{\alpha_1,\alpha_2'}_{(x_1,x_2')} + \sum S_{(\alpha_1,\alpha_2),(\alpha_1',\alpha_2')}\{(x_1,x_2),(x_1',x_2')\}\, v^{\alpha_1',\alpha_2'}_{(x_1',x_2')}$$

where $\{C_{\alpha,\alpha'}(z)\}$ is the same matrix as in (2.12). The other formulas are changed in a similar way. In particular the continuous spectrum of the operator $H_2(\Lambda)$ for any $\Lambda \in T^d$ consists of the set of eigenvalues $\{\nu_j^{(\Lambda)}(\lambda),\ \lambda \in T^d,\ j = 1,\dots,|A|\}$ of the matrix B
Then the operator $H_2^{aux} = H_1 \otimes E + E \otimes H_1$ acts in the basis $v^{aux}_{(x_1,x_2)}$ by

$$H_2^{aux} v^{aux}_{(x_1,x_2)} = \sum_{x_1'} c(x_1 - x_1')\, v^{aux}_{(x_1',x_2)} + \sum_{x_2'} c(x_2 - x_2')\, v^{aux}_{(x_1,x_2')}$$

(in some cases, as mentioned in (a), $x_1' \ne x_2$ in the first sum and $x_2' \ne x_1$ in the second). Define the embedding $J$ of the space $\mathcal{H}_2^{aux}$ into the space $\mathcal{H}$ by the relations

$$J v^{aux}_{(x_1,x_2)} = v_{x_1} \cdot v_{x_2}$$

Then in some cases there has been proved the existence of the limits

$$W_\pm = \text{s-}\lim_{t \to \pm\infty} e^{itH} J e^{-itH_2^{aux}}$$

and the equalities

$$W_\pm U_s^{aux} = U_s W_\pm, \qquad W_\pm H_2^{aux} = H W_\pm$$

d) In a similar way, one can define three-, four- and more-particle subspaces for the operator $H$ having a fixed one-particle subspace $\mathcal{H}_1$. The spectra of $H$ in these subspaces again have a "corpuscular" structure like that of the $N$-particle Schroedinger operator (see [7],[2]). The technique invented for the investigation of the latter operators (see [8],[9]) can be applied to study the spectrum of $H$ in $k$-particle subspaces. The construction of Haag and Ruelle's scattering theory can also be used in this case.
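The continuous part of the two-particle spectrum at fixed total quasimomentum $\Lambda$ is the range of $b_\Lambda(\lambda) = \tilde c(\lambda) + \tilde c(\Lambda - \lambda)$ from (2.24). As a rough numerical illustration, the sketch below computes this band for $d = 1$ and a hypothetical nearest-neighbour dispersion $\tilde c(\lambda) = c_0 + c_1 \cos\lambda$ (the values of $c_0$, $c_1$ and all function names are assumptions made for the example), and compares it with the closed form $2c_0 \pm 2|c_1 \cos(\Lambda/2)|$, which follows from $\cos\lambda + \cos(\Lambda - \lambda) = 2\cos(\Lambda/2)\cos(\lambda - \Lambda/2)$.

```python
import math


def dispersion(lam, c0=1.0, c1=-0.2):
    """Hypothetical one-particle dispersion c0 + c1 cos(lam) in dimension d = 1."""
    return c0 + c1 * math.cos(lam)


def band(total, samples=4000, c0=1.0, c1=-0.2):
    """Numerical (min, max) of b(lam) = c(lam) + c(total - lam) over a grid on the circle."""
    values = [
        dispersion(lam, c0, c1) + dispersion(total - lam, c0, c1)
        for lam in (2 * math.pi * k / samples for k in range(samples))
    ]
    return min(values), max(values)


def band_closed_form(total, c0=1.0, c1=-0.2):
    """Band edges 2 c0 -/+ 2 |c1 cos(total / 2)| for the same dispersion."""
    half_width = 2 * abs(c1 * math.cos(total / 2))
    return 2 * c0 - half_width, 2 * c0 + half_width


if __name__ == "__main__":
    for total in (0.0, 1.0, math.pi):
        print(total, band(total), band_closed_form(total))
```

For this particular dispersion the band shrinks to a single point at $\Lambda = \pi$, so near this value of the total quasimomentum even a small kernel $K_\Lambda$ is not small compared with the width of the band.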
III Concrete classes of operators

Here we give some examples of operators $H$ from mathematical physics for which one- and two-particle subspaces may be constructed.

III.1 The lattice model of the quantum harmonic crystal
This is the simplest of the models described above. The space of the values of spins is $S = R$ (and $\Omega = R^{Z^d}$). First one needs to construct a well-defined Hamiltonian for this system (together with the corresponding Hilbert space). The formal Hamiltonian has the form

$$H = -\sum_{x \in Z^d} \frac{\partial^2}{\partial q_x^2} + a \sum_{x \in Z^d} q_x^2 + b \sum_{\substack{|x-y|=1 \\ x,y \in Z^d}} q_x q_y \tag{3.1}$$

where $2d|b| < a$ (this condition comes from requiring positivity of the quadratic form $a \sum_x q_x^2 + b \sum_{|x-y|=1} q_x q_y$).

In order to give sense to the expression (3.1), consider the operator $H_\Lambda$ in a finite volume $\Lambda \subset Z^d$:

$$H_\Lambda f = -\sum_{x \in \Lambda} \frac{\partial^2 f}{\partial q_x^2} + \Big(a \sum_{x \in \Lambda} q_x^2 + b \sum_{\substack{|x-y|=1 \\ x,y \in \Lambda}} q_x q_y\Big) f \tag{3.2}$$

which acts in the Hilbert space $\mathcal{H}_\Lambda = L_2(R^\Lambda, dQ_\Lambda)$ of functions $f(Q_\Lambda)$, $Q_\Lambda = \{q_x, x \in \Lambda\} \in R^\Lambda$. It is easy to check that the ground state of this operator is equal to

$$\Psi_\Lambda(Q_\Lambda) = C_\Lambda \exp\Big\{-\frac{1}{2} \sum_{x,y \in \Lambda} J^\Lambda_{x,y} q_x q_y\Big\} \tag{3.3}$$

where $C_\Lambda$ is a normalizing factor and the matrix

$$J^\Lambda = \{J^\Lambda_{x,y}\} = (A^\Lambda)^{1/2} \tag{3.4}$$

and

$$A^\Lambda = \{A^\Lambda_{x,y}\}, \qquad A^\Lambda_{x,y} = \begin{cases} a, & x = y \\ b, & |x - y| = 1 \\ 0, & |x - y| > 1 \end{cases} \qquad x,y \in \Lambda \tag{3.5}$$

Now consider the Gaussian measure $\mu_\Lambda$ in the space $R^\Lambda$ with density w.r.t. Lebesgue measure $dQ_\Lambda$

$$\frac{d\mu_\Lambda}{dQ_\Lambda} = [\Psi_\Lambda(Q_\Lambda)]^2 = C_\Lambda^2 \exp\Big\{-\sum_{x,y \in \Lambda} J^\Lambda_{x,y} q_x q_y\Big\} \tag{3.6}$$
The unitary map

$$L_2(R^\Lambda, dQ_\Lambda) \to L_2(R^\Lambda, d\mu_\Lambda): \qquad f \to f/\Psi_\Lambda$$

transforms the operator $H_\Lambda$ (3.2) into the operator

$$\hat H_\Lambda = -\sum_{x \in \Lambda} \frac{\partial^2}{\partial q_x^2} + 2 \sum_{x \in \Lambda} \Big(\sum_{y \in \Lambda} J^\Lambda_{x,y} q_y\Big) \frac{\partial}{\partial q_x} + \mathrm{tr}\, J^\Lambda \tag{3.7}$$

We omit the constant $\mathrm{tr}\, J^\Lambda$ and pass to the limit $\Lambda \nearrow Z^d$. Because in this case $A^\Lambda \to A$ and $J^\Lambda \to J = A^{1/2}$, where

$$A_{x,y} = \begin{cases} a, & x = y \\ b, & |x - y| = 1 \\ 0, & |x - y| > 1 \end{cases} \qquad x,y \in Z^d \tag{3.8}$$

the measure $\mu_\Lambda$ on $R^\Lambda$ converges weakly to a Gaussian measure $\mu$ on $R^{Z^d}$ with covariance matrix $\langle q_x q_y \rangle_\mu = \tfrac{1}{2}(J^{-1})_{x,y}$ and average $\langle q_x \rangle_\mu = 0$. In the limit one obtains the operator $H$ acting in $\mathcal{H} = L_2(R^{Z^d}, \mu)$:

$$H = -\sum_{x \in Z^d} \frac{\partial^2}{\partial q_x^2} + 2 \sum_{x \in Z^d} \Big(\sum_{y \in Z^d} J_{x,y} q_y\Big) \frac{\partial}{\partial q_x} \tag{3.9}$$

It is a real operator describing the excitations of the ground state for the infinite system of harmonic oscillators. It turns out that the space $\mathcal{H}$ admits a complete decomposition into $k$-particle invariant subspaces of the operator $H$. The one-particle subspace $\mathcal{H}_1 \subset \mathcal{H}$ is spanned by the functions

$$\{q_x,\ x \in Z^d\}$$

The energy of the one-particle state with quasimomentum $\lambda \in T^d$ is equal to

$$E(\lambda) = 2\Big(a + 2b \sum_{i=1}^{d} \cos\lambda_i\Big)^{1/2}, \qquad \lambda = (\lambda_1,\dots,\lambda_d) \tag{3.10}$$

The two-particle subspace $\mathcal{H}_2 \subset \mathcal{H}$ is spanned by the functions

$$\{q_x q_y - \langle q_x q_y \rangle_\mu,\ x,y \in Z^d\}$$

and does not contain bound states. The subsequent 3-, 4-, ..., $N$-particle invariant subspaces $\mathcal{H}_N \subset \mathcal{H}$ are constructed with the help of the so-called Wick monomials (see [1]) and also do not contain bound states.
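The role of the matrix $J = A^{1/2}$ and of the condition $2d|b| < a$ can be illustrated on a small example. The sketch below is a simplification made for illustration: it takes $d = 1$ and replaces the finite volume by a periodic ring of $n$ sites, so that the matrix $A$ (with entries $a$ on the diagonal and $b$ between nearest neighbours) is diagonalized by plane waves with eigenvalues $a + 2b\cos\theta$. The square root is built mode by mode and then checked against $J^2 = A$; all function names are hypothetical.

```python
import math


def coupling_matrix_on_ring(a, b, n):
    """Matrix A on the ring Z_n: a on the diagonal, b between nearest neighbours (n >= 3)."""
    A = [[0.0] * n for _ in range(n)]
    for x in range(n):
        A[x][x] = a
        A[x][(x + 1) % n] = b
        A[x][(x - 1) % n] = b
    return A


def symbol_of_J(a, b, theta):
    """Fourier symbol of J = A^(1/2): the square root of a + 2 b cos(theta)."""
    return math.sqrt(a + 2 * b * math.cos(theta))


def sqrt_matrix_on_ring(a, b, n):
    """Matrix J = A^(1/2) on the ring, assembled from its Fourier symbol."""
    if 2 * abs(b) >= a:
        raise ValueError("positivity requires 2|b| < a in dimension d = 1")
    # J depends only on x - y; the symbol is even, so only cosines survive.
    row = [
        sum(symbol_of_J(a, b, 2 * math.pi * k / n) * math.cos(2 * math.pi * k * z / n)
            for k in range(n)) / n
        for z in range(n)
    ]
    return [[row[(x - y) % n] for y in range(n)] for x in range(n)]


def matmul(X, Y):
    """Plain product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    a, b, n = 2.0, 0.5, 8
    J = sqrt_matrix_on_ring(a, b, n)
    A = coupling_matrix_on_ring(a, b, n)
    J2 = matmul(J, J)
    print(max(abs(J2[x][y] - A[x][y]) for x in range(n) for y in range(n)))
```

If $2|b| \ge a$ the symbol $a + 2b\cos\theta$ is no longer strictly positive for all $\theta$ and the square root ceases to define a positive matrix, which is the finite-size counterpart of the positivity condition on the quadratic form.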
In the following we consider examples of operators $H$ for which a complete decomposition of the Hilbert space $\mathcal{H}$ into $k$-particle subspaces is not available at the present time (except for the case of the Glauber dynamics for the one-dimensional Ising model, see below). In the cases considered here the operators $H = H(\beta)$ depend on some small parameter $\beta$ such that the spectral decomposition of the original operator $H_0 = H(\beta = 0)$ is simple enough. Also all operators $H \ge 0$ and have a nondegenerate eigenvector $\Psi_0 = 1$ with eigenvalue 0 (ground state; see condition (2.1)). The spectra of $H$ in the one- and two-particle subspaces described here are lower branches of the spectrum of $H$: the remaining part of its spectrum lies to the right of these branches. In what follows we do not specify it precisely. Moreover the one-particle spectrum of $H$ in all the examples below has the form

$$\tilde c(\lambda;\beta) = c_0(\beta) + c_1 \beta \Big(\sum_{i=1}^{d} \cos\lambda^{(i)}\Big) + k(\lambda,\beta) \tag{3.11}$$

where $k(\lambda,\beta) = o(\beta)$, $\lambda = (\lambda^{(1)},\dots,\lambda^{(d)}) \in T^d$. Here $c_0(\beta)$ is some function of $\beta$ and $c_1$ is a constant. If the basis (2.2) has additional indices $\alpha \in A$ and $\tilde C(\lambda;\beta)$ is a matrix, the decomposition (3.11) is also true and $C_0(\beta)$ and $C_1$ are matrices.

III.2 Generators of stochastic dynamics

Consider a "spin" lattice system of statistical physics with a compact metric space $S$ of spins and some formal "classical" Hamiltonian:

$$h_{cl}(Q) = \sum_{A} \Phi_A(Q) \tag{3.12}$$
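where $Q = \{q_x, x \in Z^d\} \in S^{Z^d} = \Omega$ is the configuration of "spins" $q_x \in S$ and ...

$$\dots + dW_x(t), \qquad x \in Z^d$$

where $\nabla_q$ is the gradient at the point $q \in S$ and $\{W_x, x \in Z^d\}$ is a family of independent "Brownian motions" on $S$ (see [10]). For small $\beta$ ...

$$Hf = -\sum_{x \in Z^d} \Delta_{q_x} f - \beta \sum_{x \in Z^d} (\nabla_{q_x} h_{cl}, \nabla_{q_x} f)_{q_x}, \qquad f \in \mathcal{H}. \tag{3.13}$$

Here $\Delta_q$ is the Laplace-Beltrami operator at the point $q \in S$ and $(\cdot,\cdot)_q$ is the scalar product in the tangent space at this point (the gradient $\nabla_{q_x} h_{cl} = \sum_{A:\, x \in A} \nabla_{q_x} \Phi_A$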
is well defined). At the present time the following cases of such dynamics have been studied (for $\beta \ll 1$).

A. XY-model (or model of rotators, see [11]). Here $S = S^1 \subset R^2$ is a circle, $\Omega = \{q_x \in S^1, x \in Z^d\}$, and the formal Hamiltonian is

$$h_{cl} = \sum_{\substack{|x-y|=1 \\ x,y \in Z^d}} (q_x, q_y) = \sum_{|x-y|=1} \cos(\phi_x - \phi_y) \tag{3.14}$$
where $q = (\sin\phi, \cos\phi)$, $0 \le \phi < 2\pi$. The existence has been proved of two one-particle subspaces $\mathcal{H}_1^+$ and $\mathcal{H}_1^-$ of $H$ with identical spectra of the form (3.11), where $c_0 = 1$, $c_1 = -2$. The subspaces $\mathcal{H}_1^\pm$ are eigenspaces with eigenvalues $e^{\pm i\alpha}$ respectively for the symmetry group $\{V_\alpha, \alpha \in S^1\}$ generated by the shifts of values of configurations

$$Q \to Q + \alpha = \{q_x + \alpha,\ x \in Z^d\}$$

In addition there are three two-particle subspaces $\mathcal{H}_2^{++}$, $\mathcal{H}_2^{+-}$, $\mathcal{H}_2^{--}$ which are eigenspaces of the group $V_\alpha$ with eigenvalues $e^{2i\alpha}$, $1$, $e^{-2i\alpha}$. For $d = 1$ it has been proved that the operators $H$ in $\mathcal{H}_2^{++}$ and $\mathcal{H}_2^{--}$ have no bound states, but in $\mathcal{H}_2^{+-}$ such bound states exist for values of $\Lambda$ (the total quasimomentum) close to $\pi$ (see [12]).

B. XYZ-model (see [13]). Here $S = S^2 \subset R^3$ (a two-dimensional sphere), $\Omega = \{q_x = (\theta_x, \phi_x) \in S^2,\ x \in Z^d\}$ and

$$h_{cl}(Q) = \sum_{\substack{|x-y|=1 \\ x,y \in Z^d}} (q_x, q_y) = \sum_{x,y:\ |x-y|=1} \big(\cos\theta_x \cos\theta_y + \sin\theta_x \sin\theta_y \cos(\phi_x - \phi_y)\big)$$
For this case (for small $\beta$) the existence has been proved of a one-particle subspace $\mathcal{H}_1$ with basis $\{v_x^m,\ x \in Z^d,\ m = -1, 0, 1\}$. This space is invariant with respect to the symmetry group $(T_g, g \in SO_3)$ generated by the group of rotations $g \in SO_3$ of values of configurations $Q$. The representation $g \to T_g$ of the group $SO_3$ is a multiple of the irreducible representation of $SO_3$ with the weight $l = 1$ (see [14]). The spectrum of $H$ in $\mathcal{H}_1$ has the form (3.11) where the matrices $C_0 = 2(\delta_{m,m'})$ and $C_1 = -\tfrac{8}{3}(\delta_{m,m'})$. The two-particle subspaces for this model have not been studied.

C. The model of a classical anharmonic crystal (see [15]). In this case $S = R^1$, $\Omega = \{q_x \in R^1, x \in Z^d\}$ and

$$h_{cl}(Q) = \sum_{x \in Z^d} q_x^{2s} + \frac{\alpha}{2} \sum_{\substack{|x-y|=1 \\ x,y \in Z^d}} (q_x - q_y)^2$$

where $s > 2d + 1$ is an integer and $\alpha > 0$ is a parameter. The existence has been proved of a one-particle subspace (for small $\beta$) with spectrum of the form (3.11) where $c_0(\beta) = \beta^{1/s} k_1$ and $c_1 = c_1(\alpha)$ is some function of $\alpha$. Here $k_1$ is the smallest positive eigenvalue of the Schroedinger operator in $L_2(R^1, dq)$

$$-\frac{d^2 f}{dq^2} + v(q) f, \qquad f \in L_2(R^1, dq)$$

and $v(q) = s^2 q^{4s-2} - s(2s-1) q^{2s-2}$.
The two-particle subspace for this model has not been studied.

D. The Glauber dynamics for the Ising model (see [16]). In this case $S = \{-1, 1\}$, $\Omega = \{q_x \in S, x \in Z^d\}$ and

$$h_{cl} = \sum_{\substack{|x-y|=1 \\ x,y \in Z^d}} q_x q_y + h \sum_{x \in Z^d} q_x$$

The Glauber dynamics for the Ising model is a stationary Markov process $\{Q(t), t \in R^1\}$ with local interaction (see [17]) given by the intensity of a spin "flip" $q_x \to -q_x$ at the point $x \in Z^d$ of the form

$$c(x,Q) = \frac{e^{-\beta h_{cl}(Q^x)}}{e^{-\beta h_{cl}(Q^x)} + e^{-\beta h_{cl}(Q)}} = \frac{1}{1 + e^{-2\beta q_x \left(\sum_{|y-x|=1} q_y + h\right)}}$$

Here $Q^x$ is a configuration differing from $Q$ only at the point $x$. Such a process exists, is unique and reversible for small $\beta$. The corresponding operator is

$$(Hf)(Q) = -\sum_{x \in Z^d} c(x,Q)\,[f(Q^x) - f(Q)]$$

There are one- and two-particle subspaces $\mathcal{H}_1$ and $\mathcal{H}_2$ of this operator. The spectrum of $H$ in $\mathcal{H}_1$ has the form (3.11) with $c_0(\beta) = 1$ and $c_1 = c_1(h)$ some function of $h$. The existence of bound states of $H$ in the space $\mathcal{H}_2$ has not been studied for $d \ge 2$. In the one-dimensional case ($d = 1$) for $h = 0$ and arbitrary $\beta$ the whole decomposition of $\mathcal{H}$ into invariant ($k$-particle) subspaces of $H$ has been constructed (see [18]):

$$\mathcal{H} = \bigoplus_{k=0}^{\infty} \mathcal{H}_k$$

The operators $U_s^{(k)}$ and $H$ in $\mathcal{H}_k$ are unitarily equivalent to the operators

$$(\tilde U_s^{(k)} f)(\lambda_1,\dots,\lambda_k) = \exp\{is(\lambda_1 + \dots + \lambda_k)\} f(\lambda_1,\dots,\lambda_k)$$

$$(\tilde H_k f)(\lambda_1,\dots,\lambda_k) = \Big(\sum_{i=1}^{k} \tilde c(\lambda_i)\Big) f(\lambda_1,\dots,\lambda_k)$$

acting in the space $L_2^{asym}((T^1)^k)$ of antisymmetrical functions of $k$ variables $\lambda_1,\dots,\lambda_k \in T^1$. Here $\tilde c(\lambda) = 1 - \tanh\beta \cos\lambda$.
166
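III.3 Hamiltonians of quantum infinite lattice systems (in the ground state)

Consider the lattice quantum spin system with compact metric spin space $S$ with measure $dq$ and formal energy operator

$$H = -\sum_{x \in Z^d} \Delta_{q_x} + \beta \sum_{|x-y|=1} \Phi(q_x, q_y) \tag{3.15}$$

where $\Phi(q_1,q_2)$, $q_1, q_2 \in S$, is a symmetrical smooth function on $S \times S$. In order to give an exact sense to the expression (3.15), consider the selfadjoint operator

$$H_\Lambda = -\sum_{x \in \Lambda} \Delta_{q_x} + \beta \sum_{\substack{x,y \in \Lambda \\ |x-y|=1}} \Phi(q_x, q_y)$$

($\Lambda \subset Z^d$ is a finite set) acting in the Hilbert space $L_2(S^\Lambda, d^\Lambda q)$. This operator is bounded below and has discrete spectrum

$$E_\Lambda^{(0)} < E_\Lambda^{(1)} \le E_\Lambda^{(2)} \le \dots$$

The smallest eigenvalue $E_\Lambda^{(0)}$ is non-degenerate; let $\Psi_\Lambda^0 > 0$ be the corresponding normalized eigenfunction. Consider the new measure $\mu_\Lambda$ on $S^\Lambda$

$$d\mu_\Lambda = |\Psi_\Lambda^0|^2\, d^\Lambda q$$

and define the unitary map from the Hilbert space $L_2(S^\Lambda, d^\Lambda q)$ into $L_2(S^\Lambda, d\mu_\Lambda)$:

$$f \to f/\Psi_\Lambda^0 \in L_2(S^\Lambda, d\mu_\Lambda), \qquad f \in L_2(S^\Lambda, d^\Lambda q)$$

It is easy to calculate that the operator $H_\Lambda$ transforms into the operator $\hat H_\Lambda + E_\Lambda^{(0)}$ where

$$\hat H_\Lambda f = -\sum_{x \in \Lambda} \Delta_{q_x} f - 2 \sum_{x \in \Lambda} (\nabla_{q_x}(\ln \Psi_\Lambda^0), \nabla_{q_x} f), \qquad f \in L_2(S^\Lambda, d\mu_\Lambda)$$

is a selfadjoint operator in $L_2(S^\Lambda, d\mu_\Lambda)$. The operator $\hat H_\Lambda \ge 0$ and $\hat H_\Lambda 1 = 0$. This operator is the operator of energy of "the excitations of the ground state". It turns out that for a large class of manifolds $S$ and $\beta$ ... $\{w_x, x \in Z^d\}$ on $\Omega$: $w_x = U_x w_0$, such that

$$\lim_{\Lambda \nearrow Z^d} \nabla_{q_x} \ln \Psi_\Lambda^0 = \nabla_{q_x} w_x$$

iii) the operator

$$H = -\sum_{x \in Z^d} \Delta_{q_x} - 2 \sum_{x \in Z^d} (\nabla_{q_x} w_x, \nabla_{q_x}\,\cdot\,)$$

acting in the Hilbert space $\mathcal{H} = L_2(\Omega, \mu)$ is selfadjoint (see details in [19] and [20]). At the present time the operator $H$ has been studied in the following cases: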
A. $S = T^d$ ($d$-dimensional torus) (see [19]). Here a one-particle subspace $\mathcal{H}_1$ has been constructed with the basis

$$\{v_x^\alpha,\ x \in Z^d,\ \alpha = (\sigma, i),\ \sigma = \pm 1,\ i = 1,\dots,d\} \tag{3.16}$$

The spectrum of $H$ in $\mathcal{H}_1$ has the form (3.11) with matrices $C_0 = (\delta_{\alpha,\alpha'})$ and $C_1 = \{c_{\alpha\alpha'}\}$, where the matrix elements $c_{\alpha\alpha'} = c_{\alpha\alpha'}(\Phi)$ depend on the function $\Phi$. In the gauge invariant case $\Phi(q_1,q_2) = f(q_1 - q_2)$, where $f$ is an even function on $T^d$, the matrix $\tilde C(\lambda,\beta)$ is diagonal, $\tilde C(\lambda,\beta) = (\tilde c(\lambda,\alpha)\delta_{\alpha,\alpha'})$, and $\tilde c(\lambda,(+,i)) = \tilde c(\lambda,(-,i))$, $i = 1,\dots,d$. The case of the two-particle subspace of $H$ has not been studied.

B. $S = S^2$ (see [20]). For this case also the one-particle subspace $\mathcal{H}_1$ has been constructed. The basis (3.16) in $\mathcal{H}_1$ has an additional index $m = -1, 0, 1$ and the matrix $\tilde C(\lambda,\beta)$ has as above the form (3.11), where $C_0 = 2(\delta_{m,m'})$ and $C_1 = \{c_{m,m'}\}$, where $c_{m,m'}$ depend on $\Phi$. In the gauge-invariant case $\Phi(gq_1, gq_2) = \Phi(q_1,q_2)$, where $g \in SO_3$ is a rotation matrix, $\tilde C(\lambda,\beta)$ is diagonal. In this case the operators $H$ and $U_s$ commute with the representation $g \to T_g$ of the group $SO_3$ acting in $\mathcal{H}$. The space $\mathcal{H}_1$ is invariant with respect to $T_g$ and the representation of $SO_3$ in $\mathcal{H}_1$ is a multiple of the irreducible one with weight $l = 1$. The two-particle subspaces have not been studied in this case.

III.4 Hamiltonians of lattice models of Euclidean quantum field theory

Because this case is explained in great detail in the book [1] we recall it very briefly. In [1] one can find references to the original works. Consider the $(d+1)$-dimensional lattice $Z^d \times Z^1 = Z^{d+1}$ and the space $\hat\Omega = S^{Z^{d+1}}$ of configurations $Q = \{q_z,\ z = (x,t) \in Z^{d+1},\ x \in Z^d,\ t \in Z^1\}$, $q_z \in S$, of the field. Here $S$ is some compact (or finite) set provided with some "free" measure $dq$. Then the formal Euclidean action of the field (classical Hamiltonian) is given by the formula

$$U(Q) = \beta \sum_{|z_1 - z_2| = 1} \Phi_2(q_{z_1}, q_{z_2}) + h \sum_{z} \Phi_1(q_z)$$

The corresponding Gibbsian field (unique for $\beta \ll 1$) with distribution $\nu_{\beta,h}$ on $\hat\Omega$ is a Markov field on the lattice $Z^{d+1}$ and can be considered as a stationary reversible Markov chain $\{Q(t), t \in Z^1\}$, $Q(t) = \{q_{x,t}, x \in Z^d\}$, with the space of states $S^{Z^d} = \Omega$ and stationary measure

$$\mu_{\beta,h} = \nu_{\beta,h}|_{\Sigma_0}$$

($\Sigma_0$ is the $\sigma$-algebra generated by the values $\{q_{(x,0)}, x \in Z^d\}$). Let $T_t$, $t \in Z^1_+$, be the stochastic semigroup of this chain acting in the (physical) Hilbert space $\mathcal{H} = L_2(\Omega, \mu_{\beta,h})$. Then the selfadjoint operator

$$H = -\ln T_1 \tag{3.17}$$

is called the Hamiltonian of the corresponding model of a Euclidean quantum field.
Besides the "matter fields" described above there are so-called gauge fields, given on the edges of the lattice $Z^{d+1}$ and taking values in some compact group $G$ (see [21]). The Euclidean action of such a field (and the more general action describing a gauge field interacting with a "matter field") also generates a Gibbsian Markov field (Markov chain) on the lattice $Z^{d+1}$. Its Hamiltonian is defined by the formula (3.17). At the present time the following models of such fields have been studied for small $\beta$.

A. Ising model on the lattice $Z^{d+1}$ (see above). Here there are the one- and two-particle subspaces $\mathcal{H}_1$ and $\mathcal{H}_2$. The spectrum of $H$ in $\mathcal{H}_1$ has the form (3.11) with $c_0(\beta) = \ln|\beta|$ and $c_1 = c_1(h)$ some function of $h$. For the cases $d = 1$ and $d \ge 3$ there are no bound states in the two-particle subspace $\mathcal{H}_2$. The result is not known for the case $d = 2$. For $d = 1$ and $h = 0$ the spectrum of $H$ has been calculated completely (see [22]).

B. The model of rotators (see above). Here there are two one-particle subspaces $\mathcal{H}_1^+$, $\mathcal{H}_1^-$ and three two-particle subspaces $\mathcal{H}_2^{++}$, $\mathcal{H}_2^{+-}$, $\mathcal{H}_2^{--}$ of $H$ (compare with the generator of stochastic dynamics for this model, see above). The spectrum of $H$ in $\mathcal{H}_1^+$ and $\mathcal{H}_1^-$ is the same and has the form (3.11) with $c_0 = \ln|\beta|$ and $c_1 = c_1(h)$. Nothing is known about the existence of bound states in the two-particle subspaces.

C. Models of a gauge field with gauge groups $G = U(1)$ and $G = SU(2)$. In the first case there are $d(d-1)$ and in the second one $\frac{d(d-1)}{2}$ one-particle subspaces with the same spectrum of the form (3.11), where $c_0 = 4\ln\beta$, $c_1 = 0$, $k(\lambda,\beta) = c_2 \beta^4 \cos\lambda^{(1)} + O(\beta^8)$. In addition there are "excited" one-particle subspaces (see [23]) for both cases with the spectrum $\sim 6\ln|\beta|$.

D. The lattice model of quantum chromodynamics (see [24] and [25]). This model describes the interaction of so-called "quarks" with a gauge field.
For this case the four one-particle subspaces (corresponding to scalar, pseudoscalar, vector and pseudovector mesons) have been studied and the spectrum of $H$ calculated in these subspaces.

Acknowledgments

I would like to thank RFFI for financial support (grants 97-01-00714 and 99-01-00284).

References

1. V.A. Malyshev, R.A. Minlos, Linear infinite-particle operators, Transl. of Math. Monographs 143 (AMS, Providence, RI, 1995).
2. M. Reed, B. Simon, Methods of modern mathematical physics, vol. 3, 4 (Academic Press, New York, 1978).
3. I.M. Gelfand, G.E. Shilov, Some questions of theory of differential equations (Russian), Series "Generalized functions", v. 3 (Fizmatgiz, Moscow, 1958).
4. D.R. Yafaev, Mathematical scattering theory (AMS, Providence, RI).
5. K.O. Friedrichs, 1) On the perturbation of continuous spectra, Comm. Appl. Math. 115, 249-272 (1948), 2) Über die Spektralzerlegung eines Integraloperators, Math. Ann. 115, 249-272 (1938).
6. R. Jost, The general theory of quantized fields (AMS, Providence, RI, 1965).
7. L.D. Faddeev, O.A. Yakubovski, Lecture on quantum mechanics (Russian) (Publishing Leningrad University, 1980).
8. L.D. Faddeev, Mathematical questions of quantum scattering theory for the system of three particles (Russian), Proc. Math. Institute of Steklov 69 (1963).
9. S.R. Merkuriev, L.D. Faddeev, Quantum scattering theory for the system of several particles (Russian) (Nauka, Moscow, 1985).
10. S. Albeverio, A. Daletskii, Yu. Kondratiev, Infinite systems of stochastic differential equations and some lattice models on compact Riemannian manifolds, Ukr. Math. J. 49, 326-337 (1997).
11. Yu.G. Kondratiev, R.A. Minlos, One-particle subspaces in the stochastic XY-model, J. Stat. Phys. 87, 613-641 (1997).
12. E.A. Zhizhina, Two-particle spectrum of the generator for stochastic model of planar rotators at high temperatures, J. Stat. Phys. 91, 343-368 (1999).
13. N. Angelescu, R.A. Minlos, V.A. Zagrebnov, The lower spectral branch of the generator of the stochastic dynamics for classical Heisenberg model (in press).
14. I.M. Gelfand, R.A. Minlos, Z.Ya. Shapiro, Representations of the rotation and Lorentz groups and their applications (Pergamon Press, Oxford, 1963).
15. R.A. Minlos, Yu.M. Suhov, On the spectrum of the generator of an infinite system of interacting diffusion, Commun. Math. Phys. 206, 463-489 (1999).
16. R. Minlos, Invariant subspaces of the stochastic Ising high temperature dynamics, Markov Processes and Related Fields 2, 263-284 (1996).
17. Th.M. Liggett, Interacting particle systems (Springer-Verlag, N.Y., Berlin, 1985).
18. R.A. Minlos, A.G. Trishch, Complete spectral decomposition of a generator of Glauber dynamics for one-dimensional Ising model (Russian), Uspekhi Mathem. Nauk 49, 209-210 (1994).
19. E.A. Zhizhina, Yu.G. Kondratiev, R.A. Minlos, The lower branches of the spectrum of Hamiltonians of infinite quantum system with compact space of "spin" (Russian), Trudy Mosk. Mat. Ob-va. 60, 259-302 (1999).
20. N. Angelescu, R.A. Minlos, V.A. Zagrebnov, One-particle branch of the spectrum of Hamiltonian of lattice weak-coupling field with the values on two-dimensional sphere $S^2$ (in press).
21. E. Seiler, Gauge theories as a problem of constructive quantum field theory and statistical mechanics, Lecture Notes in Phys. 159 (Springer-Verlag, Berlin, Heidelberg, N.Y., 1982).
22. E.A. Zhizhina, The excited one-particle states of transfer-matrix of gauge lattice field of Yang-Mills with gauge group U(1) (Russian), Teor. Mat. Fizika 74, No. 2 (1988).
23. R.A. Minlos, E.A. Zhizhina, Meson states in lattice QCD, Adv. Sov. Math. 5, 113-138 (AMS, Providence, RI, 1991).
24. J. Fröhlich, C. King, Meson masses and the U(1) problem in lattice QCD, Nuclear Phys. B 290, 157-187 (1988).
VORTEX- AND MAGNETO-DYNAMICS - A TOPOLOGICAL PERSPECTIVE
H.K. MOFFATT
Isaac Newton Institute for Mathematical Sciences, 20 Clarkson Road, Cambridge CB3 0EH
1 Introduction
The subject of vortex dynamics, within the broader field of fluid dynamics, was initiated by the pioneering studies of Helmholtz (1858) and Thomson (Lord Kelvin) (1869) on the laws of vortex motion. These laws are encapsulated in Kelvin's circulation theorem, which applies to the motion of an ideal (i.e. inviscid) fluid in which pressure $p$ and density $\rho$ are functionally related, i.e. $p = p(\rho)$, and any body forces acting on the fluid are irrotational. Under these conditions, Kelvin showed that the circulation round any closed circuit $C$ moving with the fluid is conserved:

$$\kappa = \oint_C \mathbf{u}\cdot d\mathbf{x} = \text{cst.} \tag{1}$$

Here, $\mathbf{u}(\mathbf{x},t)$ represents the velocity field. The circulation $\kappa$ may equally be expressed as the flux of vorticity $\boldsymbol{\omega} = \operatorname{curl}\mathbf{u}$ across any orientable surface $S$ spanning $C$:

$$\kappa = \int_S \boldsymbol{\omega}\cdot\mathbf{n}\, dS, \tag{2}$$
and, since this applies to every material circuit $C$, including the boundaries of infinitesimal surface elements, it is readily deduced that "vortex lines are frozen in the fluid", i.e. the vorticity field $\boldsymbol{\omega}(\mathbf{x},t)$ is transported with the flow. Kelvin immediately recognized that this result implied also the conservation of any linkage or any knottedness that might exist in the vorticity field at some reference instant $t = 0$. For example, if the vorticity field in a fluid is zero everywhere except in a closed tube which is knotted in the form of a knot $K$, then this topology of the vorticity field is conserved for all time. (It should be noted here and subsequently that this sort of result holds only for so long as the fluid can be regarded as truly inviscid; this is an idealisation that is never realised exactly in practice, except perhaps in liquid helium II in which quantum effects provide alternative complications.) It was this insight that led Kelvin to propose his 'vortex theory of atoms' in which a correspondence is conjectured between atoms of different elements and knots of different knot types. This theory, although subsequently abandoned, provided a powerful stimulus for the major study of the classification of knots undertaken by Tait (1898, 1900) and for the subsequent development of topology as a distinct branch of mathematics.
2 Helicity and its topological interpretation
Remarkably, almost 100 years elapsed following Kelvin's great paper before the discovery of an invariant of the Euler equations of fluid motion which is truly topological in character and which indeed provides a natural bridge between fluid dynamics and topology. This invariant (J.-J. Moreau 1961, Moffatt 1969) is the helicity of a flow, defined as follows. Let $S$ be any closed orientable surface moving with the fluid on which $\boldsymbol{\omega}\cdot\mathbf{n} = 0$, i.e. the vorticity field is tangential to $S$; $S$ may be described as a 'vorticity surface', a condition that clearly persists if it holds for $t = 0$. The helicity of the flow in the volume $V$ inside $S$ is then defined by

$$\mathcal{H} = \int_V \mathbf{u}\cdot\boldsymbol{\omega}\, dV, \tag{3}$$

and this quantity is conserved under evolution governed by the Euler equations. To show this, it is best to express the Euler equations in Lagrangian form, viz.

$$\frac{D\mathbf{u}}{Dt} \equiv \left(\frac{\partial}{\partial t} + \mathbf{u}\cdot\nabla\right)\mathbf{u} = -\nabla h, \tag{4}$$

where $h = \int \rho^{-1}\,dp$ (the need for the 'barotropic' condition $p = p(\rho)$ may be seen here). The curl of (4) coupled with the equation of mass conservation in the form

$$\frac{D\rho}{Dt} = -\rho\,\nabla\cdot\mathbf{u} \tag{5}$$

leads to the vorticity equation in the form

$$\frac{D}{Dt}\left(\frac{\boldsymbol{\omega}}{\rho}\right) = \left(\frac{\boldsymbol{\omega}}{\rho}\right)\cdot\nabla\mathbf{u}. \tag{6}$$
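Now from (3), it is readily shown that

$$\frac{d\mathcal{H}}{dt} = \int_V \frac{D\mathbf{u}}{Dt}\cdot\boldsymbol{\omega}\, dV + \int_V \mathbf{u}\cdot\frac{D}{Dt}\left(\frac{\boldsymbol{\omega}}{\rho}\right)\rho\, dV, \tag{7}$$

and, on using (4) and (6), this reduces to

$$\frac{d\mathcal{H}}{dt} = \int_S (\mathbf{n}\cdot\boldsymbol{\omega})\left(-h + \tfrac{1}{2}\mathbf{u}^2\right) dS = 0, \tag{8}$$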
on using the essential condition $\mathbf{n}\cdot\boldsymbol{\omega} = 0$ on $S$. Hence, $\mathcal{H}$ is indeed constant. Note that this result does not require that the flow be incompressible, although it does hold also in this special case (with $\rho$ = cst.). In general, it holds under precisely the same conditions that govern Kelvin's circulation theorem: inviscid fluid, barotropic flow, and irrotational body forces. It should be evident therefore that $\mathcal{H}$ must admit a topological interpretation. That it does so is best seen through consideration of the simplest possible 'prototypical' linkage of vortex lines: consider the situation in which $\boldsymbol{\omega}$ is zero except inside two unknotted but linked vortex tubes of circulations $\kappa_1$ and $\kappa_2$ and of small cross-sections; and suppose that the vortex lines within each such tube are themselves unlinked closed curves. Then $\mathcal{H}$ may be evaluated by first integrating across the cross-section of each tube, then along their axes $C_1$ and $C_2$. The result is

$$\mathcal{H} = \pm 2n\kappa_1\kappa_2, \tag{9}$$
where $n$ is the number of times that $C_1$ winds round $C_2$ before closing on itself (the Gauss linking number of $C_1$ and $C_2$), and the $+$ or $-$ is chosen according as this linkage (which is oriented by the direction of vorticity within each tube) is right-handed or left-handed. The velocity field $\mathbf{u}$ can be expressed in terms of vorticity $\boldsymbol{\omega}$ by the Biot-Savart law, and this leads to the well-known expression for $n$ as an integral:

$$n = \frac{1}{4\pi}\oint_{C_1}\oint_{C_2} \frac{(d\mathbf{x}_1\wedge d\mathbf{x}_2)\cdot(\mathbf{x}_1 - \mathbf{x}_2)}{|\mathbf{x}_1 - \mathbf{x}_2|^3}. \tag{10}$$

This is the fundamental topological invariant of the two closed curves $C_1$ and $C_2$, and the bridge between topology and fluid dynamics is therefore established by the simple result (9). The situation is not so simple when knotted, as opposed to linked, vortex tubes are considered (Moffatt & Ricca 1992). Suppose now that $\boldsymbol{\omega}$ is zero except in a single closed vortex tube whose axis $C$ is knotted in the form of a knot $K$. Suppose that each vortex line in the tube is a closed curve 'nearly parallel' to $C$, by which we mean simply that if $C'$ is one such curve, then $C$ and $C'$ form the edges of a closed ribbon of small width. Let $\mathbf{N}(s)$ be a unit spanwise normal directed from $C$ to $C'$ on this ribbon, where $s$ is arclength on $C$; then the twist of the ribbon (see, for example, Fuller 1971) is defined by

$$Tw = \frac{1}{2\pi}\oint_C \left(\mathbf{N}(s)\wedge\mathbf{N}'(s)\right)\cdot d\mathbf{x}. \tag{11}$$
This twist can be decomposed in two parts:

$$Tw = \frac{1}{2\pi}\oint_C \tau(s)\, ds + N, \tag{12}$$

where $\tau(s)$ is the torsion on $C$ and $N$, an integer, is the number of rotations of $\mathbf{N}(s)$ relative to the Frenet triad $(\mathbf{t}, \mathbf{n}, \mathbf{b})$ of unit vectors on $C$. If such a ribbon is cut and one of the cut ends twisted through $2\pi$ and then rejoined, then $N$ changes by $\pm 1$, depending on the sense of twist. $N$ may be described as the 'intrinsic twist' of the ribbon. Now suppose that the vortex tube is 'uniformly twisted' in the sense that every pair of vortex lines $C', C''$ has the same value of intrinsic twist $N$. Then the result analogous to (9) is the following:

$$\mathcal{H} = h\kappa^2, \tag{13}$$

where

$$h = Wr + Tw, \tag{14}$$
$Tw$ is given by (11), and the writhe $Wr$ is given by

$$Wr = \frac{1}{4\pi}\oint_C\oint_C \frac{(d\mathbf{x}\wedge d\mathbf{x}')\cdot(\mathbf{x} - \mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|^3}, \tag{15}$$

i.e. by the Gauss formula but with the integral taken twice round the same curve. Under continuous deformation of $C$, both $Wr$ and $Tw$ vary continuously, but their sum is invariant (Calugareanu 1961, White 1969). Again therefore it is conservation of helicity that actually underlies (via (14) and (15)) the essentially topological invariance of writhe plus twist. There is a further subtlety in relation to the decomposition of twist (12) in which the first term depends only on $C$, while the second depends on the mutual configuration of $C$ and $C'$. Under continuous deformation of the ribbon, it may at discrete instants pass through 'inflexional configurations', i.e. configurations for which $C$ contains an inflexion point at which the curvature $c(s)$ is zero, and the torsion is undefined. As the ribbon passes through such a configuration, the integral of the torsion $\tau(s)$ jumps by $\pm 2\pi$, but there is a corresponding jump $\mp 1$ in the integer $N$, so that the sum in (12) varies continuously through the transition. The delicate interplay of writhe, torsion and intrinsic twist can be visualised in the process of stretching a twisted ribbon (see figure 1).

Figure 1. Conversion of writhe to twist through continuous distortion of a ribbon: (a) Wr = 1, Tw = 0; (b) Wr + Tw = 1; (c) Wr = 0, Tw = 1 [From Moffatt and Ricca (1992)].
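The Gauss double integral that appears in (10) and (15) is easy to check numerically. The following is a minimal sketch, not taken from the article: it assumes two unit circles arranged as a simple link (the curves `ring_1` and `ring_2` and all function names are illustrative choices), replaces each curve by a closed polyline, and applies a midpoint rule to the integrand.

```python
import math

def closed_polyline(point_fn, n):
    """Sample n points of a closed curve parametrised on [0, 2*pi)."""
    return [point_fn(2.0 * math.pi * k / n) for k in range(n)]

def segments(points):
    """Return (midpoint, edge vector) for every edge of a closed polyline."""
    segs = []
    n = len(points)
    for k in range(n):
        p, q = points[k], points[(k + 1) % n]
        mid = tuple(0.5 * (a + b) for a, b in zip(p, q))
        edge = tuple(b - a for a, b in zip(p, q))
        segs.append((mid, edge))
    return segs

def cross(a, b):
    """Vector product a ^ b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def gauss_integral(curve_1, curve_2, same_curve=False):
    """Midpoint-rule estimate of (1/4pi) sum (dx1 ^ dx2).(x1 - x2)/|x1 - x2|^3.

    With same_curve=True the coincident pairs are skipped, which gives a
    discrete estimate of the writhe integral (15)."""
    segs_1, segs_2 = segments(curve_1), segments(curve_2)
    total = 0.0
    for i, (m1, d1) in enumerate(segs_1):
        for j, (m2, d2) in enumerate(segs_2):
            if same_curve and i == j:
                continue
            r = tuple(a - b for a, b in zip(m1, m2))
            dist = math.sqrt(sum(c * c for c in r))
            c = cross(d1, d2)
            total += sum(ci * ri for ci, ri in zip(c, r)) / dist ** 3
    return total / (4.0 * math.pi)

# Assumed test geometry: a unit circle in the plane z = 0 and a unit circle
# in the plane y = 0 passing through the centre of the first one.
ring_1 = closed_polyline(lambda s: (math.cos(s), math.sin(s), 0.0), 200)
ring_2 = closed_polyline(lambda t: (1.0 + math.cos(t), 0.0, math.sin(t)), 200)

linking_number = gauss_integral(ring_1, ring_2)
writhe_of_circle = gauss_integral(ring_1, ring_1, same_curve=True)
print(linking_number, writhe_of_circle)
```

For this configuration the first number should come out close to $\pm 1$ (the sign depends on the orientations chosen for the two circles), and the writhe of a planar circle vanishes because $d\mathbf{x}\wedge d\mathbf{x}'$ is normal to the plane containing $\mathbf{x} - \mathbf{x}'$. Multiplying the linking number by $2\kappa_1\kappa_2$ then gives the helicity of (9).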
3 Magnetic relaxation
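For simplicity, let us focus now on the case of incompressible (or volume-preserving) flow with $\rho$ = cst., for which $\nabla\cdot\mathbf{u} = 0$, and equation (6) may be written in the equivalent form

$$\frac{\partial\boldsymbol{\omega}}{\partial t} = \nabla\wedge(\mathbf{u}\wedge\boldsymbol{\omega}). \tag{16}$$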
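This equation is of course nonlinear (through the dependence of $\boldsymbol{\omega}$ on $\mathbf{u}$). It proves fruitful to consider an associated linear equation having a very similar structure, namely

$$\frac{\partial\mathbf{B}}{\partial t} = \nabla\wedge(\mathbf{v}\wedge\mathbf{B}), \tag{17}$$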
where $\nabla\cdot\mathbf{B} = 0$, $\nabla\cdot\mathbf{v} = 0$ and $\mathbf{B}$ and $\mathbf{v}$ are otherwise independent fields. Equation (17) means that the field $\mathbf{B}(\mathbf{x},t)$ is transported by the 'velocity' field $\mathbf{v}(\mathbf{x},t)$, the flux of $\mathbf{B}$ through every material circuit being conserved. Equation (17) is in fact the equation satisfied by a magnetic field $\mathbf{B}$ in a perfectly conducting fluid moving with velocity $\mathbf{v}$. This interpretation may be helpful, but is by no means essential to the argument; we shall however use the terminology of magnetohydrodynamics (MHD) in what follows. The important property of (17) is that, no matter what the field $\mathbf{v}$ may be, it conserves the topological structure of $\mathbf{B}$, at least for all finite time; the question of what may happen as $t \to \infty$ is of particular interest, and will be discussed below. Let us define the 'energy' $M$ of the field $\mathbf{B}$ in the obvious way, i.e.

$$M = \tfrac{1}{2}\int_V \mathbf{B}^2\, dV, \tag{18}$$

where the integral is taken over the domain $V$ of fluid, and it is supposed that

$$\mathbf{n}\cdot\mathbf{B} = 0, \quad \mathbf{n}\cdot\mathbf{v} = 0 \quad\text{on } \partial V. \tag{19}$$
We pose the question: can we choose $\mathbf{v}(\mathbf{x},t)$ in such a way that the energy $M$ decreases to a minimum compatible with the conserved topology of $\mathbf{B}$? In fact, there are various possible ways of choosing $\mathbf{v}$ to achieve this end. The simplest choice (irrelevant constant factors being set equal to unity) is

$$\mathbf{v} = \mathbf{j}\wedge\mathbf{B} - \nabla p, \tag{20}$$

where $\mathbf{j} = \nabla\wedge\mathbf{B}$ (the current distribution, if $\mathbf{B}$ is indeed a magnetic field), and $p$ (the 'pressure field') is chosen so that $\nabla\cdot\mathbf{v} = 0$ and $\mathbf{n}\cdot\mathbf{v} = 0$ on $\partial V$. We may write (20) in the more compact form

$$\mathbf{v} = (\mathbf{j}\wedge\mathbf{B})_S, \tag{21}$$
where the notation $(\ldots)_S$ is used to denote the 'solenoidal projection' of a vector field. With this choice of $\mathbf{v}$, equation (17) becomes

$$\frac{\partial\mathbf{B}}{\partial t} = \nabla\wedge\big((\mathbf{j}\wedge\mathbf{B})_S\wedge\mathbf{B}\big), \tag{22}$$

an evolution equation with cubic nonlinearity. It follows moreover that

$$\frac{dM}{dt} = \int \mathbf{B}\cdot\nabla\wedge(\mathbf{v}\wedge\mathbf{B})\, dV = \int (\nabla\wedge\mathbf{B})\cdot(\mathbf{v}\wedge\mathbf{B})\, dV = -\int \mathbf{v}^2\, dV, \tag{23}$$

on using the boundary condition (19) to eliminate the pressure term. Thus the energy of the field $\mathbf{B}$ does decrease monotonically for so long as $\mathbf{v} \neq 0$. However the conserved topology of $\mathbf{B}$ implies that (if this topology is nontrivial) there is a positive lower bound for $M$ (Freedman 1988).

Figure 2. Relaxation of two linked flux tubes which lose energy through contraction.

Again, the prototypical configuration of two linked flux tubes makes this clear (figure 2): the field 'relaxes' as a result of contraction of the $\mathbf{B}$-lines (due to the Maxwell tension associated with the Lorentz force); both the fluxes $\Phi_1$ and $\Phi_2$ and the volumes $V_1$, $V_2$ are conserved during this process, which must evidently be arrested when the two tubes make contact with each other. 'Nontriviality' of the topology means simply that there exist field lines which cannot be continuously contracted to a point without 'trapping' other field lines in the process. Thus, as $t \to \infty$, we must conclude that, for any nontrivial field topology,

$$\int \mathbf{v}^2\, dV \to 0 \quad\text{and}\quad M \to M^E, \tag{24}$$
where $M^E\,(> 0)$ is the asymptotic (relaxed) energy. Unless singularities of $\mathbf{v}$ appear during this relaxation process (a possibility that appears extremely unlikely, but has not as yet been rigorously eliminated), it follows further that $\mathbf{v} \to 0$ and $\mathbf{B}(\mathbf{x},t) \to \mathbf{B}^E(\mathbf{x})$ as $t \to \infty$. From (20), the field $\mathbf{B}^E(\mathbf{x})$ satisfies

$$\mathbf{j}^E\wedge\mathbf{B}^E = \nabla p^E, \tag{25}$$
i.e. it is a magnetostatic equilibrium with pressure $p^E(\mathbf{x})$. The linked tube example suggests rather strongly that in general, the relaxed field $\mathbf{B}^E(\mathbf{x})$ may contain tangential discontinuities (as where the two tubes ultimately make contact); these tangential discontinuities of $\mathbf{B}^E$ are current sheets, and we should here emphasise that it is the assumption of perfect conductivity that permits the appearance of such current sheets. If the least resistivity is permitted in the fluid, then the current sheets will diffuse to finite thickness, and may be subject to 'resistive instabilities'; here we deliberately exclude such resistive effects.
4 Magnetic knots
The above relaxation process is particularly intriguing and illuminating when we consider the case of an initial field $\mathbf{B}_0(\mathbf{x})$ with 'knotted tube' topology. If the field lines within such a tube are 'uniformly twisted' so that the helicity (cf (13)) is given by $\mathcal{H} = h\Phi^2$, where $\Phi$ is the axial flux of $\mathbf{B}_0$, then the key parameters that remain constant during the relaxation process are $h$, $\Phi$ and the volume $V$ of the tube (the fluid being still supposed incompressible). Hence the asymptotic energy $M^E$ is determined by these three parameters, no others being available. Dimensional analysis then leaves only one possibility:

$$M^E = m(h)\,\Phi^2 V^{-1/3}, \tag{26}$$
a result first obtained by Moffatt (1990). Here $m(h)$ is a function of the dimensionless parameter $h$, and this function is determined (in principle) solely by the topology of the tube knot $K$. It may of course happen that there are multiple equilibrium states, which may be ordered so that

$$0 < m_0(h) < m_1(h) < m_2(h) < \ldots. \tag{27}$$

We may then talk of the 'energy spectrum' of the knot,

$$\mathbf{m}(h) = \{m_0(h), m_1(h), m_2(h), \ldots\}. \tag{28}$$
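The dimensional argument behind (26) can be made explicit as a small linear problem. The sketch below is illustrative only: it assumes, as stated later in this section, that $\mathbf{B}$ is scaled to have the units of velocity, so that flux, volume and energy carry the length and time exponents listed in the code; the names `FLUX`, `VOLUME`, `ENERGY`, `FREQUENCY` and `exponents` are hypothetical.

```python
from fractions import Fraction

# Dimensions written as (length exponent, time exponent), with B scaled
# so that it has the units of velocity (an assumption of this sketch).
FLUX = (3, -1)       # Phi = B * area          -> L^3 T^-1
VOLUME = (3, 0)      # V                       -> L^3
ENERGY = (5, -2)     # M = (1/2) int B^2 dV    -> L^5 T^-2
FREQUENCY = (0, -1)  # a frequency             -> T^-1

def exponents(target, base_1=FLUX, base_2=VOLUME):
    """Solve base_1**a * base_2**b == target for (a, b) by Cramer's rule."""
    det = base_1[0] * base_2[1] - base_2[0] * base_1[1]
    a = Fraction(target[0] * base_2[1] - base_2[0] * target[1], det)
    b = Fraction(base_1[0] * target[1] - target[0] * base_1[1], det)
    return a, b

print(exponents(ENERGY))     # exponents of Phi and V in an energy
print(exponents(FREQUENCY))  # exponents of Phi and V in a frequency
```

Since $h$ is dimensionless, it can enter only through an undetermined prefactor; the two exponents returned for `ENERGY` are the powers of $\Phi$ and $V$ in (26), and the same bookkeeping applies to any other quantity built from $\Phi$ and $V$, such as a frequency.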
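This type of argument may now be carried somewhat further. Suppose we consider the relaxed state of lowest energy $m_0(h)\Phi^2V^{-1/3}$. The corresponding magnetostatic equilibrium $\mathbf{B}^E(\mathbf{x})$ can exist in an incompressible fluid at rest. Let us suppose that that fluid is ideal (i.e. inviscid as well as perfectly conducting). The equilibrium is stable (being one of minimum magnetic energy). We may ask, in the spirit of Kelvin, what are the normal modes of vibration about such an equilibrium? If we linearise the equations of ideal MHD,

$$\frac{\partial\mathbf{B}}{\partial t} = \nabla\wedge(\mathbf{u}\wedge\mathbf{B}), \tag{29}$$

$$\frac{D\mathbf{u}}{Dt} = -\nabla P + \mathbf{j}\wedge\mathbf{B}, \tag{30}$$

about the equilibrium state $\mathbf{u} = 0$, $\mathbf{B} = \mathbf{B}^E(\mathbf{x})$, then, writing

$$\mathbf{u} = \mathbf{u}_1(\mathbf{x})e^{i\omega t}, \quad \mathbf{B} = \mathbf{B}^E(\mathbf{x}) + \mathbf{B}_1(\mathbf{x})e^{i\omega t}, \tag{31}$$

we obtain

$$i\omega\mathbf{B}_1 = \nabla\wedge(\mathbf{u}_1\wedge\mathbf{B}^E), \tag{32}$$

$$i\omega\mathbf{u}_1 = -\nabla P_1 + \mathbf{j}_1\wedge\mathbf{B}^E + \mathbf{j}^E\wedge\mathbf{B}_1, \tag{33}$$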
an eigenvalue problem (when coupled with appropriate boundary conditions) for the pair of fields $\{\mathbf{u}_1(\mathbf{x}), \mathbf{B}_1(\mathbf{x})\}$. Note that in the notation used here, the field $\mathbf{B}$ has been scaled so that its units are those of velocity (actually, $\mathbf{B}$ is the local Alfvén velocity). Here again there is presumably a spectrum of frequencies $0 < \omega_0 < \omega_1 < \omega_2 < \ldots$, the fundamental frequency $\omega_0$ being of greatest interest. Since this is determined in principle by the field $\mathbf{B}^E(\mathbf{x})$ which is in turn determined uniquely by $h$, $\Phi$ and $V$, we may conclude on dimensional grounds that

$$\omega_0 = \Omega_0(h)\,\Phi V^{-1}, \tag{34}$$

where again $\Omega_0(h)$ is a dimensionless function of $h$ determined solely by the topology of the knot $K$. Of course there will presumably be a matrix $\Omega_{ij}(h)$ of such functions, where $j$ labels the member of the spectrum of relaxed fields, and $i$ labels the normal mode of vibration of this member. It is of course one thing to assert the existence of such functions; it is an altogether more difficult matter, as yet beyond analytical or computational capabilities, to determine and evaluate them.

5 The relaxation of chaotic fields
The situation considered in §4 in which all magnetic field lines are closed curves is very exceptional. The field lines of an arbitrary field $\mathbf{B}(\mathbf{x})$ are the trajectories of the system

$$\frac{dx}{B_x(x,y,z)} = \frac{dy}{B_y(x,y,z)} = \frac{dz}{B_z(x,y,z)}, \tag{35}$$

and if the components $B_x$, $B_y$, $B_z$ are nonlinear functions of $(x,y,z)$, then in general these trajectories are chaotic within the domain $\mathcal{D}$ of definition of the field. An example of such a chaotic field within a sphere $|\mathbf{x}| < 1$ has been studied by Bajer & Moffatt (1990); the field is quadratic in the space variables, i.e.

$$B_i = c_{ijk}x_jx_k, \tag{36}$$
the tensor $c_{ijk}$ being such that $\nabla\cdot\mathbf{B} = 0$ and $\mathbf{n}\cdot\mathbf{B} = 0$ on $|\mathbf{x}| = 1$. Even for this simple form of nonlinearity, the lines of force of $\mathbf{B}$ are chaotic: they are not closed curves, neither do they lie on a family of surfaces. The system (35) is technically 'non-integrable'. Figure 3 shows a Poincaré section of the field for a particular choice of $c_{ijk}$ - it shows the points in which a single field line of $\mathbf{B}$ intersects the plane of section; the widespread scatter of these points is a familiar symptom of chaotic behaviour; at the same time, one should note the existence of a certain order within this chaos, an order that can be analysed and understood by means of 'adiabatic' techniques.

Figure 3. Poincaré section showing the intersections of a single chaotic field line of a quadratic field of the form (36) with an equatorial plane of section in the sphere $|\mathbf{x}| < 1$ (from Bajer & Moffatt 1990).

Suppose now that we adopt such a field as the initial field $\mathbf{B}_0(\mathbf{x})$ in the relaxation process described in §3. During relaxation, the chaotic character of the field clearly persists - there is no obvious mechanism by which a field line which is initially chaotic could, under transport by a continuous velocity field, rearrange itself to lie upon a surface. The inference is that the relaxed field $\mathbf{B}^E(\mathbf{x})$ must also therefore exhibit the above symptom of chaos. But here we are driven to a curious conclusion. The relaxed field $\mathbf{B}^E$ satisfies the magnetostatic condition (25), so that

$$\mathbf{B}^E\cdot\nabla p^E = 0, \tag{37}$$

i.e. the field lines of $\mathbf{B}^E$ lie on surfaces $p^E$ = cst. If, by the above argument, they do not lie on such surfaces, then $\nabla p^E$ must be identically zero in the region of chaos, and so $\mathbf{j}^E\wedge\mathbf{B}^E = 0$. It then follows that

$$\mathbf{j}^E = \alpha\mathbf{B}^E \quad\text{where}\quad \mathbf{B}^E\cdot\nabla\alpha = 0, \tag{38}$$
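and again, $\mathbf{B}^E$-lines lie on surfaces $\alpha$ = cst., unless $\nabla\alpha = 0$. It therefore appears that $\mathbf{B}^E$-lines can be chaotic in some subdomain $\mathcal{D}'$ of $\mathcal{D}$ only if

$$\nabla\wedge\mathbf{B}^E = \alpha\mathbf{B}^E \quad\text{in } \mathcal{D}', \tag{39}$$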
with $\alpha$ constant in $\mathcal{D}'$. Equation (39) expresses the fact that $\mathbf{B}^E$ is a Beltrami field in $\mathcal{D}'$, a very special type of field. As first pointed out by Arnol'd (1974, 1986), there cannot possibly be enough generality in the solutions of (39) (if applied to $\mathcal{D}$ rather than $\mathcal{D}'$) to accommodate the arbitrary topology that may be assumed for the initial chaotic field $\mathbf{B}_0(\mathbf{x})$. How can we escape this paradox? The explanation that has been suggested (Moffatt 1985) is that, within any chaotic field, there are always 'islands of regularity' (large islands can be clearly seen in the example of figure 3, but there are many smaller islands also, below the level of visual detection). Under relaxation, the boundaries of these islands may become considerably distorted, and the subdomain $\mathcal{D}'$ of chaos acquires a correspondingly complex geometry. On this picture, the complexity of the initial field $\mathbf{B}_0$ translates to complexity of the geometry of the domain $\mathcal{D}'$ in which the relaxed field is chaotic. Whether this is the correct explanation must await direct numerical simulation of the relaxation process, a computational experiment that has not as yet been accomplished in three dimensions.

6 Two-dimensional relaxation
Numerical relaxation to minimum energy states has however been carried out for two-dimensional fields. These have the advantage that the topology can be completely prescribed (Moffatt 1999) in terms of the homoclinic field lines (or 'separatrices') through all the hyperbolic neutral points (i.e. 'saddle points') of the field. If placed on the sphere $S^2$, each such separatrix is a figure-of-eight and the generic separatrix structure consists of two families of nested figure-of-eights. Under relaxation, the incompressibility condition implies that the area $A(\chi)$ inside any field line $\chi$ = cst. remains constant; the function $A(\chi)$ (or set of such functions for different regions within separatrix loops) is called the signature of the field (Moffatt 1986a) and is invariant during relaxation; it is in effect a topological property of the field. It has been pointed out in §3 that tangential discontinuities of $\mathbf{B}$ may appear during relaxation as $t \to \infty$. This can occur also in the two-dimensional case: such behaviour is located near the saddle points, and results from the collapse of the separatrices (to zero angle) near the saddle points (Linardatos 1993), a behaviour that has been subsequently re-examined and confirmed by Vainshtein et al (1999). It may be conjectured that saddle points of a field $\mathbf{B}$ will play an equally significant role in three-dimensional relaxation; but it is by no means essential that saddle points be present to initiate such discontinuities (see Parker 1994 for an extended discussion of the spontaneous formation of such discontinuities in the important context of the solar coronal magnetic field).

7 Analogous Euler flows
What, it may be asked, does magnetic relaxation, in a perfectly conducting viscous fluid, have to do with the problem that we started with, namely the flow of an inviscid non-conducting fluid in the absence of any magnetic effects? The answer is that it provides a powerful, albeit indirect, method for establishing the existence of steady Euler flows (i.e. steady solutions of the Euler equations of an incompressible fluid) having arbitrary streamline (NB not vortex line) topology. For the equation for such steady flow may be written in the form

$$\mathbf{u}\wedge\boldsymbol{\omega} = \nabla H, \tag{40}$$

where $H = p/\rho + \tfrac{1}{2}\mathbf{u}^2$ is the Bernoulli function, and $\boldsymbol{\omega} = \nabla\wedge\mathbf{u}$. There is an obvious analogy between equations (25) and (40) through the identifications

$$\mathbf{B}^E \to \mathbf{u}, \quad \mathbf{j}^E \to \boldsymbol{\omega}, \quad p_0 - p^E \to H, \tag{41}$$
where $p_0$ is an arbitrary constant. Thus the magnetic relaxation mechanism, which establishes the existence of fields $\mathbf{B}^E$ satisfying (25) (and recall that $\mathbf{j}^E = \nabla\wedge\mathbf{B}^E$), simultaneously determines an analogous Euler flow $\mathbf{u}$ via the analogy (41). Of course, care must be taken to ensure that the boundary conditions on the flow are compatible with the analogy (see Moffatt 1985). Thus, for example, since the arguments of §§3 and 4 establish the existence of 'knotted magnetic flux tube equilibria' for any knot class $K$, it follows via the above analogy that steady Euler flows having similarly knotted streamtubes also exist! It is not quite as visualised by Kelvin who considered knotted vortex tubes; there may exist steady knotted vortex tube configurations, but no technique has as yet been found to prove the existence of such configurations. Note that the tangential discontinuities of $\mathbf{B}^E$ (i.e. current sheets) that may appear during the relaxation process translate via the analogy (41) to tangential discontinuities of $\mathbf{u}$, i.e. vortex sheets, imbedded within the Euler flows thus determined. Now it is well-known that vortex sheets are prone to instability (the Kelvin-Helmholtz instability) and one may infer that the steady Euler flows may be unstable within the context of the Euler equations despite the fact that the analogous magnetostatic equilibria are, by their construction, stable within the context of the magnetohydrodynamic equations in a viscous, perfectly conducting, fluid. This may appear surprising, but it should be recognised that the analogy (41) applies only to the steady states, but not to the (different) problems of the stability of these steady states. The differences between the two types of stability problem have been discussed by Moffatt (1986b); and it has in fact been shown by Rouchon (1991) that the sufficient condition for stability of an Euler flow obtained by Arnol'd (1966) is never satisfied for flows that are fully 3-dimensional and lack any obvious symmetry.
Thus, although the magnetic relaxation technique yields a rich harvest of information about the existence of steady solutions of the Euler equations, the downside is that any such solution of nontrivial topology is almost certainly unstable.

8 Relaxation to steady solutions of the MHD equations
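Let us consider the full MHD equations for an ideal (i.e. inviscid, perfectly conducting) fluid in the form

$$\frac{\partial\mathbf{u}}{\partial t} = \mathbf{u}\wedge\boldsymbol{\omega} + \mathbf{j}\wedge\mathbf{B} - \nabla H, \tag{42}$$

$$\frac{\partial\mathbf{B}}{\partial t} = \nabla\wedge(\mathbf{u}\wedge\mathbf{B}), \tag{43}$$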
and let us now, following Vladimirov, Moffatt & Ilin (1999), construct a relaxation process that yields topologically interesting steady solutions of these equations. Note first that there are two classes of topological invariants (or 'Casimirs') associated with (42), (43); these are first the magnetic helicity invariants

$$\mathcal{H}_M = \int_V \mathbf{A}\cdot\mathbf{B}\, dV, \tag{44}$$

where $\mathbf{B} = \nabla\wedge\mathbf{A}$ (cf (3)), and $V$ is any material volume on whose surface $\mathbf{B}\cdot\mathbf{n} = 0$; and second the cross-helicity invariants

$$\mathcal{H}_C = \int_V \mathbf{u}\cdot\mathbf{B}\, dV. \tag{45}$$

The cross-helicity is topological in that it provides a measure of 'mutual linkage' between vorticity and magnetic fields; this is conserved even though vortex lines are no longer frozen in the fluid, the Lorentz force in (42) being in general rotational (Moffatt 1969). In addition to these topological invariants, the total energy

$$E = \tfrac{1}{2}\int (\mathbf{u}^2 + \mathbf{B}^2)\, dV \tag{46}$$
is also an invariant of (42), (43), the integral being over the whole fluid domain. The evolution of the system (42), (43) follows a trajectory on which $E$ = cst., this trajectory lying on an 'isomagnetovortical' folium in the function space of solenoidal fields $\{\mathbf{u}(\mathbf{x}), \mathbf{B}(\mathbf{x})\}$ - i.e. a subspace in which the set (44), (45) of Casimirs take prescribed values. To construct a relaxation process in which energy decreases, we must obviously modify the dynamics in some way, and we seek to do this in a special way that still conserves the Casimirs. This can be done by replacing (42), (43) by the modified equations

$$\frac{\partial\mathbf{u}}{\partial t} = \mathbf{v}\wedge\boldsymbol{\omega} + \mathbf{c}\wedge\mathbf{B} - \nabla H, \tag{47}$$

$$\frac{\partial\mathbf{B}}{\partial t} = \nabla\wedge(\mathbf{v}\wedge\mathbf{B}), \tag{48}$$
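where $\mathbf{v}$ and $\mathbf{c}$ are arbitrary fields satisfying $\nabla\cdot\mathbf{v} = \nabla\cdot\mathbf{c} = 0$ (and corresponding boundary conditions). It may be verified directly that the Casimirs (44), (45) do indeed survive this modification; however we now find that

$$\frac{dE}{dt} = -\int \{\mathbf{v}\cdot(\mathbf{u}\wedge\boldsymbol{\omega} + \mathbf{j}\wedge\mathbf{B}) + \mathbf{c}\cdot(\mathbf{u}\wedge\mathbf{B})\}\, dV, \tag{49}$$

and this is, in general, non-zero. We can ensure that $E$ decreases through choosing $\mathbf{v}$ and $\mathbf{c}$ in an obvious way (cf (21)):

$$\mathbf{v} = (\mathbf{u}\wedge\boldsymbol{\omega} + \mathbf{j}\wedge\mathbf{B})_S, \quad \mathbf{c} = (\mathbf{u}\wedge\mathbf{B})_S. \tag{50}$$

It then follows that

$$\frac{dE}{dt} = -\int (\mathbf{v}^2 + \mathbf{c}^2)\, dV, \tag{51}$$

so that $E$ is monotonic decreasing for so long as $\mathbf{v}$ and/or $\mathbf{c}$ are nonzero. Moreover, the Cauchy-Schwarz inequality,

$$\tfrac{1}{2}\int (\mathbf{u}^2 + \mathbf{B}^2)\, dV \ge \left|\int \mathbf{u}\cdot\mathbf{B}\, dV\right| = |\mathcal{H}_C|, \tag{52}$$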
here places an obvious positive lower bound on $E$ whenever the total cross-helicity is non-zero. (Actually a nonzero value of $\mathcal{H}_C$ for any subdomain $V$ bounded by a magnetic surface is sufficient to provide a lower bound for $E$.) Hence, $E$ tends to a positive limit, and so, from (51), excluding the possibility of (point) singularities appearing in $\mathbf{v}$ or $\mathbf{c}$, we must have $\mathbf{v} \to 0$, $\mathbf{c} \to 0$. Hence, from (50), the limit fields $\{\mathbf{u}^E(\mathbf{x}), \mathbf{B}^E(\mathbf{x})\}$ satisfy precisely the equations of steady MHD. It would of course be nice to go to the limit of zero magnetic field in the above argument, which would yield a relaxation procedure for the Euler equations. However, the lower bound (52) gives no useful information in this limit, and the energy can relax to zero in this situation.
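For readers who want the intermediate steps, the bound (52) can be spelled out in two lines. The following is a short worked version, using only the pointwise Cauchy-Schwarz inequality and the elementary fact that $(|\mathbf{u}| - |\mathbf{B}|)^2 \ge 0$; it is offered as a sketch of the standard argument rather than as the authors' own derivation.

```latex
\begin{align*}
|\mathbf{u}\cdot\mathbf{B}|
  &\le |\mathbf{u}|\,|\mathbf{B}|
   \le \tfrac{1}{2}\left(\mathbf{u}^2 + \mathbf{B}^2\right)
  && \text{(pointwise, since } (|\mathbf{u}| - |\mathbf{B}|)^2 \ge 0\text{)}, \\
|\mathcal{H}_C|
  = \left|\int \mathbf{u}\cdot\mathbf{B}\, dV\right|
  &\le \int |\mathbf{u}\cdot\mathbf{B}|\, dV
   \le \tfrac{1}{2}\int \left(\mathbf{u}^2 + \mathbf{B}^2\right) dV = E .
\end{align*}
```

When $\mathbf{B} \equiv 0$ the left-hand side vanishes identically, which is why the bound carries no information in the Euler limit.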
So we have the curious result that a technique is available for treatment of the ideal MHD equations, but this technique fails for what, on the face of it, is a simpler system, namely the Euler equations for ideal fluids. The Euler equations, in a sense, emerge victorious, resistant as yet to the above type of general treatment that is available for more complex systems. The great, and enduring, difficulty of the Euler equations lies in their purity, within which the central intractable nonlinearity continues to defy progress at a fundamental level. It is this purity and associated intractability that lies at the heart of the still unsolved problem of turbulence - a problem that will continue to challenge and frustrate for many decades into the 21st century.

References

1. Arnol'd, V.I. 1966 Ann. Inst. Fourier, 16, 316-361.
2. Arnol'd, V.I. 1974 Proc. Summer School in Differential Equations, Erevan, Armenian SSR Acad. Sci. [English transl: Sel. Math. Sov., 5, (1986), 327-345].
3. Bajer, K. & Moffatt, H.K. 1990 J. Fluid Mech., 212, 337-363.
4. Calugareanu, G. 1961 Czech. Math. J., 11, 588-625.
5. Freedman, M.H. 1988 J. Fluid Mech., 194, 549-551.
6. Fuller, F.B. 1971 Proc. Nat. Acad. Sci. USA, 68, 815-819.
7. Helmholtz, 1858
8. Kelvin, Lord (W. Thomson) 1869 Trans. Roy. Soc. Edin., 25, 217-260.
9. Linardatos, D. 1993 J. Fluid Mech., 246, 569-
10. Moffatt, H.K. 1969 J. Fluid Mech., 35, 117-129.
11. Moffatt, H.K. 1985 J. Fluid Mech., 159, 359-378.
12. Moffatt, H.K. 1986a J. Fluid Mech., 166, 359-378.
13. Moffatt, H.K. 1986b J. Fluid Mech., 173, 289-302.
14. Moffatt, H.K. 1990 Nature, 347, 367-369.
15. Moffatt, H.K. 1999 The topology of scalar fields in 2D and 3D turbulence, Proc. IUTAM Symp. on Geometry and Statistics of Turbulence, Tokyo (to appear).
16. Moffatt, H.K. & Ricca, R.L. 1992 Proc. R. Soc. Lond. A, 439, 411-429.
17. Moreau, J.-J. 1961 C.R. Acad. Sci. Paris, 252, 2810-2812.
18. Parker, E.N. 1994 Spontaneous Current Sheets in Magnetic Fields (Oxford Univ. Press).
19. Rouchon, P. 1991 Eur. J. Mech. B/Fluids, 10, 651-661.
20. Tait, P.G. 1898, 1900 Scientific Papers (Cambridge University Press).
21. Vainshtein, S.I., Mikic, Z., Rosner, R. & Linker, J.A. 1998 Preprint, University of Chicago.
22. Vladimirov, V.A., Moffatt, H.K. & Ilin, K.I. 1999 J. Fluid Mech., 390, 127-150.
23. White, J.H. 1969 Am. J. Math., 91, 693-728.
GAUGE THEORY: THE GENTLE REVOLUTION
L. O'RAIFEARTAIGH
Dublin Institute for Advanced Studies, Dublin, Ireland
Email: [email protected]
Contents
1. Introduction
2. The Classical Era
  2.1. Electromagnetism and Gravity
  2.2. Impact of Quantum Mechanics
  2.3. Dimensional Reduction
3. The Yang-Mills Era
  3.1. Advent of Non-Abelian Gauge Theory
  3.2. The Gauge Principle
  3.3. Difficulties with Non-Abelian Gauge Theory
4. The Electroweak Era
  4.1. Phenomenology
  4.2. Spontaneous Symmetry Breaking
  4.3. Salam-Weinberg Model
  4.4. Renormalization
  4.5. Experimental Success
5. The Strong Interaction Era
  5.1. Phenomenology
  5.2. Asymptotic Freedom
  5.3. Confinement
6. The Standard Model
  6.1. Standard Model of Electroweak and Strong Interactions
  6.2. Outlook
  6.3. String Path to Non-Abelian Gauge Theory
7. Gauge-Fixing, BRST and Constraint Systems
  7.1. Faddeev-Popov Gauge-Fixing
  7.2. Gribov Ambiguities
  7.3. BRST Theory
  7.4. Gauge Theory as a First-Class Constraint System
8. The Rich Tapestry of Gauge Theory
  8.1. Solitons
  8.2. Loops
  8.3. Chern-Simons Theories
  8.4. Anomalies
9. Summary
1 Introduction
From the point of view of fundamental physics the first quarter of the twentieth century must rank as one of the most revolutionary of all time, including, as it did, the discoveries of the Quantum Theory and of both Special and General Relativity. In comparison, the rest of the century may seem to be relatively tame, a period of gradual progress and consolidation, as in the two centuries following Galileo and Newton. However, in the course of the century there has been a further development, the evolution of gauge-theory, which could lay claim to being as important as either special relativity or quantum theory. Indeed it might be argued that gauge theory
is more important than either of the latter two theories on the grounds that, being a theory of forces, it is dynamical rather than kinematical. Like gravitational theory, it determines, not the stage on which the fundamental interactions take place, but the interactions themselves. The discovery of gauge-theory is not considered to be revolutionary because, instead of coming in one fell swoop like the earlier discoveries, it came as the accumulation of many intermediate steps, spread over the entire century. Indeed, the evolution of the theory was very slow and took many surprising turns before the present unified picture emerged. And the last word may not yet have been spoken.
In this article I should like to sketch, in rough chronological order, how the mosaic that constitutes the present version of gauge theory was gradually constructed from its individual pieces. As the literature on the subject is vast I have given as references only books and some review articles for each chapter. The list is by no means complete but should prevent the text being disfigured with too many reference numbers.
2 The Classical Era
2.1 Electromagnetism and Gravity
Gauge theory originated in Electrodynamics, where the content of the second set of Maxwell's equations was that the electromagnetic field could be expressed as the curl of a four-vector
$$\sum_{\mathrm{cyclic}} \partial_\lambda F_{\mu\nu} = 0 \quad\rightarrow\quad F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu \qquad (2.1)$$
However, the four-vector $A_\mu$ was left undetermined up to transformations of the form $A_\mu \rightarrow A_\mu + \partial_\mu\alpha$, where the $\alpha$'s were arbitrary scalar functions. These transformations are now called gauge-transformations. For some time the freedom represented by the gauge-transformations was regarded as at best a simplifying aspect of electrodynamics and at worst a nuisance.
Gauge theory next appeared in Einstein's gravitational theory, where it took the form of coordinate transformations, although it was only after the advent of dimensional reduction and the tetrad formalism that the relationship between coordinate and gauge transformations became clear. The gravitational theory, based on Riemannian geometry, inspired the mathematician Levi-Civita to reformulate Riemannian geometry in a form in which the fundamental entity was not the metric but the concept of parallel transport, mediated by covariant derivatives $\nabla_\mu = \partial_\mu + \{\,\}_\mu$, where the $\{\,\}$ are the Christoffel symbols. This concept was quickly seized upon by Cartan and Weyl to generalize Riemannian geometry to a new form of differential geometry in which the Christoffel symbols were replaced by so-called connections $\Gamma^{\alpha}_{\beta\mu}(x)$, whose only essential property was that they should transform in the same manner as the Christoffel symbols with respect to coordinate transformations. In particular there was no requirement that they be related to a metric, which made them much more general than the original Christoffel symbols. Over the course of the following thirty years the Cartan-Weyl ideas were developed into the branch of mathematics that now goes under the name of fibre-bundle theory. From the beginning it was pointed out by Weyl that the new point of view provided a close connection between electromagnetism and gravity. Indeed, if one uses the tetrads $e^a_\mu$, where the indices $a$ label the rigid internal Lorentz space, to define the connections
$$\Gamma^{a}_{\mu b} = e^{a}_{\nu}\,\nabla_\mu e^{\nu}_{b} \qquad (2.2)$$
then their transformation law with respect to the local Lorentz transformations $L(x)$ is
$$\Gamma_\mu \rightarrow L^{-1}\Gamma_\mu L + L^{-1}\partial_\mu L \qquad (2.3)$$
which can be regarded as a non-abelian generalization of the electromagnetic gauge-transformation. However, Weyl's 1918 attempt to construct a unified theory of electromagnetism and gravitation on the basis of this analogy failed because the gauge-transformations he proposed for electromagnetism were actually scale transformations. But his attempt left two legacies. One is gauge-theory as we know it today. The other was the word gauge, which he had introduced to describe the scale-change of the metric.
2.2 The Impact of Quantum Mechanics
With the arrival of wave-mechanics, the ideas of Weyl underwent a renaissance. Schrödinger and London independently arrived at a way of describing the behaviour of a quantum mechanical particle of electrical charge $e$ in an electromagnetic field. Schrödinger combined the classical electromagnetic gauge-principle, $p_\mu \rightarrow p_\mu + eA_\mu$, with the correspondence principle $p_\mu \rightarrow \frac{\hbar}{i}\partial_\mu$ of quantum mechanics to arrive at the gauge principle $\partial_\mu \rightarrow \partial_\mu + \frac{ie}{\hbar}A_\mu$ for the wave-equation. This proposal not only succeeded in combining quantum-mechanics with electrodynamics but also provided a direct link with gravitation and the mathematical theory of connections, for both of which the covariant derivative was the central object. London took a different path. In a modification of the earlier gravitational work of Weyl, he proposed that, in the presence of an electromagnetic field, the wave-function $\psi(x)$ of the charged particle should be changed to $e^{\frac{ie}{\hbar}\int dy^\mu A_\mu}\,\psi(x)$. It is easy to see that Schrödinger's proposal is nothing but the differential form of London's and it is interesting to note that London's formulation already incorporated the non-local quantum electromagnetic phenomenon known as the Aharonov-Bohm effect. Inspired by these advances, Weyl went one step further in 1929 and proposed that, far from being simply an interesting accompaniment of electromagnetism, gauge-invariance was at its very core. In fact, he proposed that gauge-invariance be elevated to the rank of a principle from which the whole of electromagnetic theory could be derived as follows: Given that quantum mechanics is invariant under constant phase-transformations $\psi(x) \rightarrow e^{i\alpha}\psi(x)$, where the $\alpha$'s are constant phases, introduce the requirement that it be invariant even when the parameter $\alpha$ is spacetime dependent, $\alpha \rightarrow \alpha(x)$. This requirement necessitates the introduction of a connection $A_\mu(x)$, with all the properties of the electromagnetic gauge-potential.
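Weyl's requirement can be checked in a few lines. The following is a sketch only, under assumed conventions that the text does not fix: $\hbar$ is kept explicit, the covariant derivative is taken to be $D_\mu = \partial_\mu + \frac{ie}{\hbar}A_\mu$, and the sign of the accompanying shift of $A_\mu$ is chosen to match.

```latex
% Assumed conventions: D_\mu = \partial_\mu + (ie/\hbar) A_\mu, local phase \alpha(x).
\begin{align*}
\psi(x) &\rightarrow \psi'(x) = e^{i\alpha(x)}\psi(x), \\
\partial_\mu\psi' &= e^{i\alpha}\bigl(\partial_\mu + i\,\partial_\mu\alpha\bigr)\psi
  \qquad\text{(the extra term spoils invariance)}, \\
A_\mu &\rightarrow A'_\mu = A_\mu - \tfrac{\hbar}{e}\,\partial_\mu\alpha, \\
D'_\mu\psi' &= e^{i\alpha}\Bigl(\partial_\mu + i\,\partial_\mu\alpha
  + \tfrac{ie}{\hbar}A_\mu - i\,\partial_\mu\alpha\Bigr)\psi
  = e^{i\alpha}D_\mu\psi .
\end{align*}
```

With these conventions the compensating shift of $A_\mu$ is a gradient, that is, a gauge-transformation of the form met in Section 2.1 with scalar function $-\frac{\hbar}{e}\alpha(x)$, which is why the connection demanded by local phase invariance can be identified with the electromagnetic potential.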
The importance of this principle was not appreciated at the time and Weyl's paper was regarded as no more than an alternative formulation of electromagnetism. But, as we shall see later, when more general transformations than phase-transformations were considered, the profundity, indeed the necessity, of Weyl's principle became evident.
2.3 Dimensional Reduction
In the meantime, attempts by Nordström (1914), Kaluza (1922) and Klein (1926) to find a unified theory of electromagnetism and gravitation, using what is now called dimensional reduction, had clarified the relationship between gauge and coordinate transformations, in particular the relationship between the gauge transformations of electromagnetism and the coordinate transformations of gravitational theory. What these attempts showed was that if, in a 5-dimensional space, the dependence of the variables on the fifth coordinate were suppressed, then the 5-dimensional theory of gravity decomposed into a 4-dimensional theory of gravity plus electromagnetism. Furthermore, the coordinate transformations of the 5-dimensional space split into the coordinate transformations of the 4-dimensional space plus the gauge transformations of electromagnetism. From this point of view the gauge-transformations of electromagnetism emerged as nothing but a special case of coordinate transformations. Later, of course, dimensional reductions from higher-dimensional spaces were also considered and these led to non-abelian gauge-theories.
With the appearance of Weyl's 1929 paper the classical era of gauge-theory came to an end. Attention during the next twenty years was focussed not so much on electromagnetism and gravity as on the strong and weak nuclear interactions, and these interactions appeared at first sight to have nothing at all to do with gauge-theory.
3 The Yang-Mills Era
Apart from a remarkable paper by Klein in 1938, the next decisive phase in the development of gauge theory came in the period 1953-6. During this short period the idea that the electromagnetic gauge-theory, based on U(1) phase-transformations, could be extended to other internal symmetry groups was considered. The internal symmetry group mostly in mind was the isospin group SU(2) of the strong nuclear interactions.
3.1 Advent of Non-Abelian Gauge Theory
The celebrated paper of the period is the Yang-Mills paper of 1954, in which the extension of U(1) gauge theory to SU(2) was completed in a self-consistent manner at the classical level, and turned out to have an elegant structure. It should be mentioned, however, that the structure found by Yang and Mills was found independently by a number of other workers, namely Shaw, Pauli and Utiyama. Shaw generalized to SU(2) an SO(2) version of electrodynamics due to Schwinger but published his work only in his 1955 Ph.D. thesis. Pauli used dimensional reduction from six dimensions. He assumed that the extra two dimensions formed the surface of a 2-sphere and was thus led naturally to an SU(2) Yang-Mills theory. But, because of difficulties with the fermionic mass-spectrum, he did not publish his work or pursue the matter further. Utiyama, who was motivated partly by gravitational theory in the tradition of Weyl, was the first to extend the idea to all semi-simple Lie groups, compact and otherwise, but delayed the publication of his work until 1956 because he thought (wrongly) that he had been preceded by Yang and Mills. Of course, the mention of these other contributors is not meant to detract in any way from the merit of Yang and Mills, who are rightly credited with being the first to formulate and publish the SU(2) theory. Rather it is intended to illustrate how ideas which have been germinating for some time come to fruition along different lines.
3.2 The Gauge Principle
The basic idea of Yang-Mills theory mirrors the 1929 proposal of Weyl. One starts by supposing that $\psi(x)$ is a matter-field of any spin and that the Action is invariant under the action $\psi(x) \rightarrow U(g)\psi(x)$ of a unitary representation $U(g)$ of a rigid internal Lie symmetry group $G$ with elements $g$. Following Weyl, one then makes the demand that the Action be invariant even if the group-elements $g$ are made spacetime dependent, $g \rightarrow g(x)$. For the mass and potential terms in the Action this is automatically the case, but for the kinetic terms it is not because the derivatives $\partial_\mu\psi(x)$ do not transform covariantly. To remedy this one must introduce a gauge-potential $A_\mu(x)$ lying in the adjoint representation of the Lie algebra of $G$ and replace the derivative $\partial_\mu\psi(x)$ by the covariant derivative $D_\mu\psi(x) = (\partial_\mu + A_\mu(x))\psi(x)$. To guarantee the covariance of this derivative, i.e. to guarantee that $D_\mu\psi(x)$ will transform in the same way as $\psi(x)$ itself, the gauge-potential must transform according to
$$A_\mu(x) \rightarrow U(g(x))^{-1}\bigl(\partial_\mu + A_\mu(x)\bigr)U(g(x)) \qquad (3.1)$$
In that case, the Action will be invariant. That is not quite the end of the story, however, because, if one were to stop at that point, the Action would contain the gauge-potentials $A_\mu(x)$ without any derivatives, in which case they would act only as Lagrange multipliers and lead to a constrained system. Accordingly, it is necessary to introduce kinetic terms for the $A_\mu(x)$ and, of course, these terms should themselves be gauge-invariant. If it is required that the kinetic terms contain only the first derivatives of the gauge-potentials and are at most quadratic in these derivatives, then (up to an overall constant) the choice of kinetic term is unique, namely
$$\mathrm{tr}(F_{\mu\nu}F^{\mu\nu}) \quad\text{where}\quad F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu, A_\nu] \qquad (3.2)$$
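That the kinetic term is gauge-invariant follows from the transformation law (3.1). A short derivation is sketched here; the shorthand $U \equiv U(g(x))$, $a_\mu \equiv U^{-1}A_\mu U$ and $u_\mu \equiv U^{-1}\partial_\mu U$ is introduced only for this illustration and is not notation used in the text.

```latex
% Shorthand (assumed for this sketch): a_\mu = U^{-1} A_\mu U,  u_\mu = U^{-1} \partial_\mu U.
\begin{align*}
A'_\mu &= a_\mu + u_\mu, \qquad \partial_\mu U^{-1} = -\,U^{-1}(\partial_\mu U)\,U^{-1}, \\
\partial_\mu A'_\nu - \partial_\nu A'_\mu
  &= U^{-1}(\partial_\mu A_\nu - \partial_\nu A_\mu)U
     + [a_\nu, u_\mu] - [a_\mu, u_\nu] - [u_\mu, u_\nu], \\
[A'_\mu, A'_\nu]
  &= U^{-1}[A_\mu, A_\nu]U + [a_\mu, u_\nu] + [u_\mu, a_\nu] + [u_\mu, u_\nu], \\
F'_{\mu\nu} &= U^{-1}F_{\mu\nu}U
  \quad\Longrightarrow\quad
  \mathrm{tr}(F'_{\mu\nu}F'^{\mu\nu}) = \mathrm{tr}(F_{\mu\nu}F^{\mu\nu}).
\end{align*}
```

The inhomogeneous pieces cancel only because of the commutator term in (3.2); the abelian expression $\partial_\mu A_\nu - \partial_\nu A_\mu$ on its own would not transform covariantly.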
This is the end of the story and leads to the gauge-principle, which may be formulated as
$$L(\psi, \partial_\mu\psi) \rightarrow \tfrac14\,\mathrm{tr}(F_{\mu\nu}F^{\mu\nu}) + L(\psi, D_\mu\psi) \qquad (3.3)$$
where the factor 1/4 is conventional. In this formulation the 1929 proposal of Weyl comes to full fruition.
3.3 Difficulties with Non-Abelian Gauge Theories
In spite of the intrinsic beauty of classical non-abelian gauge theories they appeared at first sight to have nothing to do with the nuclear interactions, and they were regarded only as interesting toy models. Indeed the nuclear interactions appeared to be anything but vectorial. Worse still, non-abelian gauge-theories predicted massless gauge-fields, which would imply that the nuclear interactions would be long-range, in manifest contradiction to experiment. On the theoretical side, the quantization appeared to present insuperable difficulties, since the methods of making the theory renormalizable, unitary and infra-red convergent that had been used successfully for quantum electrodynamics seemed to fail in the non-abelian case. In fact, the beauty of classical non-abelian gauge theory seemed to be matched only by the ugliness of the quantized version. That the difficulties encountered were real and only partly deficiencies in technique has in the meantime become clear. Indeed they have been solved only by the introduction of some radically new ideas. Historically, however, the attempts to apply non-abelian gauge theory to the weak nuclear interactions preceded the solution of these problems, as we shall now describe.
4 The Electroweak Era
4.1 Phenomenology
The weak-interaction processes are essentially generalizations of the $\beta$-decay process $n \rightarrow p + e + \nu$ discovered by Becquerel and the Curies at the turn of the last century and, until the arrival of gauge theory, were described phenomenologically by a four-fermi interaction of the form $G\,j_\sigma\cdot j_\sigma$, where $G$ is a coupling constant and the $j$'s are currents of the form $j_\sigma = \bar\psi(x)\sigma\psi(x)$, the $\sigma$'s being Dirac matrices and the dot indicating a Lorentz invariant summation. This theory was regarded, not as a fundamental one, but as an effective theory, valid at the available low energies but ultimately unsatisfactory because at higher energies it would lead to violations of the unitary bound on cross-sections. For this reason it was also not renormalizable. The dimensionality of the coupling constant $G$ was also considered to be an indication that the interaction was not fundamental. Initially there was much experimental uncertainty about the form of the $\sigma$'s, partly because of a reluctance to consider the possibility that they might be parity-violating, but soon after parity violation was established in 1956 it was found that, to a good approximation, the $\sigma$'s were vectorial, actually of the $V - A$ form $\gamma_\mu(1 - \gamma_5)$. This discovery constituted an important step toward gauge theory because it suggested that the interaction was mediated by vector mesons. The only problem was that the vector meson would have to be massive, with a mass much larger than the energies under consideration. Indeed, the standard Yukawa theory of mesons shows that the effective coupling for an interaction mediated by a vector meson of mass $m$ and coupled to currents of momenta $p$ and $q$ with dimensionless coupling constant $g$ is $G = g^2/(m^2 + (p - q)^2)$. For massless mesons such as the photon this expression reduces to a long-range Coulomb-like interaction $G = g^2/(p - q)^2$, and it reduces to a constant $G = g^2/m^2$ only if the exchanged mass $m$ is large compared to the momentum-exchange $(p - q)$. The result is that, while the $V - A$ theory suggested a vector meson theory, it required $m \neq 0$ and thus did not immediately suggest a gauge theory, for which $m = 0$. Nevertheless, the gauge idea was considered by a number of people, notably Salam and Schwinger, in the hope that the mass-problem might be resolved later. In particular, Schwinger proposed an SU(2) model of the weak interactions, and, when this model did not fit the phenomenology, it was extended in 1959 by his student Glashow to an SU(2) x U(1) model. Similar models, based on this group, were proposed by Salam and Ward in the early sixties.
4.2 Spontaneous Symmetry Breaking
Although the gauge-models of the weak interactions were not taken seriously at first due to the absence of a mechanism for generating gauge-field masses, it was not long before proposals to generate masses for the gauge-fields began to appear. The proposals were based on the fact that in solid state physics such a mechanism already existed. This mechanism had been invented to explain the Meissner effect, namely the fact that in a superconductor the magnetic flux lines are expelled from the interior to the surface. A simple heuristic explanation was first given by Ginzburg and Landau in 1950, who simulated the material of the superconductor by a charged (complex) scalar field $\phi(x)$, which in the superconducting phase was assumed to take the form $\phi(x) = c + \theta(x)$, where $c$ is a non-zero constant and $\theta(x)$ is the remainder that varies with $x$ and falls off with distance. The standard static Lagrangian for such a field in interaction with a magnetic field $B$ is
$$\int d^3x\left[\tfrac12 B^2 + \tfrac12 (D\phi)^2 + V(\phi)\right] \qquad (4.1)$$
where the potential, which can be a function of $\phi^*\phi$ only because of gauge-invariance, was assumed to be of the approximate form
$$V = \mu\,\phi^*\phi + \lambda(\phi^*\phi)^2, \qquad \lambda > 0 \qquad (4.2)$$
The parameter $\mu$ was assumed to be strongly temperature-dependent and to take negative values below a critical temperature. The superconducting phase was characterized by these negative values of $\mu$. In that case the potential minimum is seen to occur, not at $\phi = 0$ but at $\phi^*\phi = c^2$ where $c^2 = |\mu|/2\lambda$. A particular value of $\phi$ with this absolute value must be chosen as minimum, thus breaking the symmetry $\phi \rightarrow e^{i\alpha}\phi$. Choosing the minimal value to be $\phi = c$ by convention, and expanding the field as $\phi(x) = c + \theta(x)$, where $\theta(x)$ is the 'true' field that depends on $x$ and falls off at large distances, it is easily seen that the Lagrangian takes the form
$$\int d^3x\left[\tfrac12 B^2 + \tfrac12 e^2c^2A^2 + \tfrac12 (D\theta)^2 + e^2c\,\theta_1A^2 + ec\,\theta_2\,\partial A + V(\theta)\right] \qquad (4.3)$$
where $\theta = \theta_1 + i\theta_2$ and $V(\theta)$ is a potential that takes its minimum at $\theta = 0$. In the gauge $\partial_\mu A_\mu = 0$ this leads to a field equation of the form
$$(\Delta + e^2c^2)A = j(x) \qquad (4.4)$$
where $j(x)$ is the electromagnetic current. This field equation differs from the usual Maxwell one by the term $e^2c^2$ on the left-hand side. The presence of this term means that, once inside the superconductor, the magnetic field falls off like $e^{-ecS}$, where $S$ is the distance from the surface. This explains the Meissner effect, and the quantity $(ec)^{-1}$ became known as the penetration length. Later, the BCS theory of superconductivity put the heuristic Ginzburg-Landau theory on a firmer footing and identified the scalar field $\phi(x)$ as a bound state of two electrons, called a Cooper pair.
The relevance of all this for particle physics is that, in the relativistic version of the Lagrangian, the term $e^2c^2\vec{A}^{\,2}$ becomes $e^2c^2A_\mu A^\mu$, which means that $ec$ could also be interpreted as a photon mass, and it was soon suggested that a similar mechanism might be used to generate masses for the non-abelian gauge-fields. However, this suggestion raised some interesting questions. First, the fact that the minimum of the potential was not at $\phi = 0$ meant that the phase symmetry $\phi \rightarrow e^{i\alpha}\phi$ on which the electromagnetic theory was based was now broken. So was gauge-invariance now violated? Second, although the breaking of a symmetry $G$ down to a subsymmetry $H$ was a standard occurrence in other areas of physics such as crystallography, it was known that if $G$ and $H$ were Lie groups, the breakdown had to be accompanied by the appearance of dim$(G/H)$ massless fields, called Goldstone fields. But no Goldstone fields were seen experimentally. Indeed, whereas conventional wisdom suggested that the Ginzburg-Landau model should have two massless fields, namely a massless gauge-field and a massless Goldstone field, experiment showed that in the superconducting phase there were none.
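The location of the symmetry-breaking minimum, $c^2 = |\mu|/2\lambda$, and the resulting penetration length $(ec)^{-1}$ can be checked numerically. The sketch below is illustrative only: the function names and the numerical values of $\mu$, $\lambda$ and $e$ are hypothetical choices in arbitrary units, and the brute-force grid search merely stands in for the elementary calculus.

```python
import math


def gl_potential(rho, mu, lam):
    """Ginzburg-Landau potential V = mu*rho + lam*rho**2, with rho = phi* phi >= 0."""
    return mu * rho + lam * rho ** 2


def minimize_on_grid(mu, lam, rho_max=10.0, steps=100_000):
    """Brute-force search for the rho >= 0 that minimises V (illustrative only)."""
    best_rho, best_v = 0.0, gl_potential(0.0, mu, lam)
    for i in range(1, steps + 1):
        rho = rho_max * i / steps
        v = gl_potential(rho, mu, lam)
        if v < best_v:
            best_rho, best_v = rho, v
    return best_rho


def penetration_length(e, mu, lam):
    """(e c)^-1 with c^2 = |mu| / (2 lam); meaningful only in the broken phase mu < 0."""
    c = math.sqrt(abs(mu) / (2.0 * lam))
    return 1.0 / (e * c)


# Hypothetical illustrative values in arbitrary units (not taken from the text).
MU_BROKEN, MU_SYMMETRIC, LAM, E = -2.0, 2.0, 0.5, 0.3

rho_broken = minimize_on_grid(MU_BROKEN, LAM)
rho_symmetric = minimize_on_grid(MU_SYMMETRIC, LAM)

print("broken phase:    rho_min =", rho_broken, "vs |mu|/(2 lam) =", abs(MU_BROKEN) / (2 * LAM))
print("symmetric phase: rho_min =", rho_symmetric)
print("penetration length (e c)^-1 =", penetration_length(E, MU_BROKEN, LAM))
```

For negative $\mu$ the minimum sits on the circle $\phi^*\phi = c^2$ rather than at the origin, so a definite phase has to be picked; for positive $\mu$ the search returns the symmetric point $\phi = 0$.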
The mechanism by which this rather surprising situation comes about was investigated by a number of authors, notably Higgs, Brout and Englert, and the explanation may be understood in a simple way as follows: In the unbroken phase of the theory (minimum of the potential at $\phi = 0$) there are six fields altogether, two real scalar fields and four components of the gauge-field. But, because of the Gauss law and the gauge-fixing, two components of the gauge-field are not physical and may be eliminated (by the Gupta-Bleuler mechanism for example), leaving only four physical fields. Since the change to the superconducting phase should not change the number of physical degrees of freedom there should therefore be four physical fields in the broken phase also. However, since in this phase the gauge-field is massive and has therefore three physical components, it follows that one of the gauge-field components and one of the scalar fields must be non-physical. Hence there must be a mechanism which eliminates one scalar field and one component of the gauge-field. In fact, it turns out that there is a gauge in which the Goldstone component of the scalar field and the longitudinal component $\partial_\mu A_\mu$ vanish, which implies that in general gauges the effects of these two massless fields must cancel. This cancellation is called the Higgs mechanism.
Higgs' original work was for EM theory but the idea was quickly generalized by Kibble to the non-abelian case. In the latter case, the scalar field, now called the Higgs field, is assigned to a non-trivial representation of the gauge-group $G$, and the unbroken Lagrangian takes a similar form to the abelian one, namely
$$\int d^4x\left[\tfrac14\,\mathrm{tr}(F_{\mu\nu}F^{\mu\nu}) + \tfrac12 (D\phi)^2 + V(\phi)\right] \qquad (4.5)$$
where $V(\phi)$ is a group-invariant potential. In the broken phase the minimum of the potential occurs at some non-symmetrical point $\phi = c \neq 0$ and when the field is expanded in the form $\phi(x) = c + \theta(x)$, where $c$ is constant and $\theta(x)$ falls off for large $x$, the broken phase Lagrangian takes the form
$$\int d^4x\left[\tfrac14\,\mathrm{tr}(F_{\mu\nu}F^{\mu\nu}) + \tfrac12 e^2(A_\mu\cdot\sigma c)^2 + e\,(A_\mu\cdot\sigma c,\, D_\mu\theta) + \tfrac12 (D\theta)^2 + U(\theta)\right] \qquad (4.6)$$
where the inner-product is in the space of the $\phi$-field representation. This Lagrangian obviously contains a mass-term proportional to $ec$ for the gauge-fields, as in the abelian case. But there is a difference in that the symmetry group may break down to a non-trivial subgroup $H$, in which case only dim$(G/H)$ of the gauge-fields become massive, the dim$(H)$ gauge-fields belonging to the residual (unbroken) gauge-group $H$ remaining massless. The Goldstone fields are those in the directions $\sigma c$ in the $\phi$-field representation. The Higgs mechanism still works. But now it works in the sense that there exists a gauge (called the unitary or physical gauge) in which the dim$(G/H)$ Goldstone fields and the longitudinal components of gauge fields belonging to the coset $G/H$ both vanish. This implies, as before, that in any gauge the contributions of these two sets of fields must cancel, a phenomenon that is sometimes described by saying that dim$(G/H)$ of the gauge-fields eat up the Goldstone fields and become massive.
4.3 Salam-Weinberg Model
Inspired by the Higgs mechanism, Weinberg and Salam independently proposed that the previous Glashow and Salam-Ward models be equipped with a Higgs scalar. By choosing the 2-dimensional representation of SU(2) for the Higgs field, and taking into account that, to have any chance of renormalizability, the potential would have to be at most quartic, they arrived at a Lagrangian of the form
$$L = \tfrac14 F_{\mu\nu}F^{\mu\nu} + \tfrac14(\vec F_{\mu\nu}\cdot\vec F^{\mu\nu}) + \bar\psi\,\gamma\cdot D\,\psi + (D\phi)^2 + L_{\mathrm{matter}} \qquad (4.7)$$
where $F$ and $\vec F$ are the U(1) and SU(2) gauge-fields respectively, the covariant derivative is
$$D_\mu = \partial_\mu + g\,\frac{\vec\sigma}{2}\cdot\vec A_\mu + \frac{1}{2} f A_\mu \qquad (4.8)$$
where the $\sigma$'s are the Pauli matrices, and the matter-field Lagrangian is given by
$$L_{\mathrm{matter}} = \sum_a G_a(\bar\psi^a_L, \phi)\,\psi^a_R + \lambda(\phi^\dagger\phi - c^2)^2 \qquad (4.9)$$
where the subscripts $L$ and $R$ denote the left and right-handed chiral components of the fermion fields, and the inner-product is in the 2-dimensional space of the fundamental representation of SU(2). The SU(2) assignments of the fermion fields obviously exclude invariant mass terms and this is why the matter-field Lagrangian consists only of Yukawa terms and a Higgs potential. This is today's standard model of the electroweak interactions. In the broken phase the physical gauge-fields, i.e. the combinations of gauge-fields with definite masses, are
where $A^\gamma_\mu$ is the electromagnetic field, and, up to quantum corrections, the masses are easily computed from the spontaneous breakdown to be
$$M_W = cg, \qquad M_Z = c\sqrt{f^2+g^2} \qquad\text{and}\qquad M_{A^\gamma} = 0 \qquad (4.11)$$
When the quantum corrections are taken into account, these are in excellent agreement with experiment. In terms of the physical gauge-fields the covariant derivative takes the form
D
" = d" +ffi?**+ v/lF+7)^ 0 (*3 + jr^Q) + eA7Q
(412)
where $e = fg/\sqrt{f^2+g^2}$. Thus the interaction of $(e, \nu_e)$ with the gauge-fields, for example, is given by eej.eAl
+ ^=(nlie)W+
+
±=(elltu)W~ (4.13)
+ VP+92{n»v
+ Hrfse + , 2
P+92
2
n^ejZ°
Here the first three terms are the traditional weak and electromagnetic couplings and the last term is the modification introduced by the standard model. Note that the neutral current coupled to $Z^0$ is parity-violating, which allows its contribution to atomic processes to be distinguished from the dominating electromagnetic contributions. The electromagnetic coupling constant determines the combination $gf/\sqrt{f^2+g^2}$ of the gauge-field couplings and the Fermi coupling constant $G$ determines the dimensional constant $c$ in the Higgs potential. Indeed, from $G = g^2/2M_W^2$ we have $c = 1/\sqrt{2G}$. Furthermore, from the spontaneous symmetry breakdown it follows that the fermion masses are given by $m_a = cG_a$ and this means that the Yukawa couplings are also known. Thus the only unknown parameters in the standard model are the so-called weak angle $\tan\theta = f/g$ and the constant $\lambda$ in the Higgs potential. The value of the weak angle can be determined both from a number of experiments on the current-current interactions and also from the gauge-field mass-ratio $M_Z/M_W$, all of which are in agreement. Thus the only experimentally unknown parameter is $\lambda$, which awaits experimental information about the Higgs field.
4.4 Renormalization
The original standard model was classical and there remained the question as to whether the quantized version was renormalizable. Indeed the consensus of opinion at the time of its proposal was that non-abelian gauge-theories were not renormalizable, and there was no reason to think that the standard model should be an exception. The natural gauge to investigate the question was the unitary gauge but early attempts to prove renormalizability in this gauge failed, essentially because the gauge-field propagators had numerators of the form $g_{\mu\nu} - k_\mu k_\nu/m^2$ which led to divergences that increased with the order of perturbation. For other gauges, even the gauge-fixing presented problems, which were solved only by the introduction of ghost-fields, as will be described in detail in Chapter 7. Furthermore, it turned out that the traditional methods of regularization, such as Pauli-Villars, could not be made gauge-invariant. A major breakthrough came with the introduction of dimensional regularization, which is manifestly gauge invariant. Using the ghost formalism and dimensional regularization 't Hooft and Veltman in the early seventies succeeded in proving that (modulo anomalies and Gribov ambiguities, which will be discussed later) non-abelian gauge-theory was renormalizable in both the broken and unbroken phases.
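The obstruction met in the unitary gauge can be made concrete by simple power counting. The comparison below is a sketch that assumes the standard textbook form of the propagators; overall factors of $i$ and the $i\epsilon$ prescription are conventions not fixed by the text and are omitted.

```latex
% Assumed textbook forms; overall factors and the i\epsilon prescription are omitted.
\begin{align*}
\Delta_{\mu\nu}(k) &= \frac{g_{\mu\nu} - k_\mu k_\nu/m^2}{k^2 - m^2}
  \;\xrightarrow{\;k\to\infty\;}\; -\,\frac{k_\mu k_\nu}{m^2 k^2} = O(1)
  && \text{(massive vector field, unitary gauge)}, \\
\Delta_{\mu\nu}(k) &= \frac{g_{\mu\nu}}{k^2} = O(k^{-2})
  && \text{(massless gauge field, Feynman gauge)}.
\end{align*}
```

Because the $k_\mu k_\nu/m^2$ term does not fall off at large momentum, internal massive vector lines give no convergence factor in loop integrals, which is consistent with the statement that the divergences grow with the order of perturbation.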
4.5 Experimental Success
The success of the renormalization program forced the experimentalists to take the non-abelian gauge theory of the weak interactions more seriously and within a few years they had not only found the neutral current that was predicted by the model, but had discovered that the fine details of both the charged and neutral weak currents were in excellent agreement with its predictions. More specifically they found that the results of six independent experiments could be described using the only new parameter available, namely the weak angle $\theta$. An interesting point was that the relevant experiments were low-energy ones. In particular, the prediction that a part of the neutral current would be parity-violating was tested in atomic physics. In the event, the experimental evidence for the standard model currents coming from the investigation of the currents alone was so convincing that the model was awarded the Nobel prize in 1979, long before the gauge or Higgs particles that it predicted were found. The gauge particles $W^\pm$ and $Z^0$ were discovered ten years later, using the highest accelerator energies available, and the search for the Higgs field is still in progress. Apart from the Higgs field, present experiments agree with the standard model to an accuracy that is well beyond the accuracy that might reasonably be expected. Indeed, the present dilemma is that, although there is general agreement that the standard model is incomplete, because of the reducible form of the gauge-group and the rather ad hoc assignments of the matter-fields, the agreement between experiment and theory is so good that there is no experimental hint as to how it should be modified.
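The tree-level relations of Section 4.3 that tie the weak angle to the couplings and masses, namely $\tan\theta = f/g$, $e = fg/\sqrt{f^2+g^2}$, $M_W = cg$, $M_Z = c\sqrt{f^2+g^2}$ and $G = g^2/2M_W^2$, can be collected in a short numerical sketch. The function name and the input values are hypothetical and in arbitrary units; the normalizations are simply those quoted in the text, which differ by numerical factors from other common conventions.

```python
import math


def electroweak_tree_level(f, g, c):
    """Tree-level quantities in the normalization quoted in the text.

    f, g : U(1) and SU(2) gauge couplings
    c    : dimensional constant in the Higgs potential
    """
    norm = math.hypot(f, g)              # sqrt(f^2 + g^2)
    theta = math.atan2(f, g)             # weak angle, tan(theta) = f / g
    m_w = c * g                          # M_W = c g
    return {
        "theta": theta,
        "e": f * g / norm,               # electromagnetic coupling
        "M_W": m_w,
        "M_Z": c * norm,                 # M_Z = c sqrt(f^2 + g^2)
        "M_A": 0.0,                      # photon stays massless
        "G": g ** 2 / (2.0 * m_w ** 2),  # Fermi constant, G = g^2 / (2 M_W^2)
    }


# Hypothetical inputs in arbitrary units, chosen only for illustration.
result = electroweak_tree_level(f=0.35, g=0.65, c=250.0)
for name, value in result.items():
    print(f"{name:>5} = {value:.6g}")
```

Two consequences are easy to read off from the output: the mass ratio $M_W/M_Z$ equals $\cos\theta$, and $G$ depends only on $c$, which is why low-energy measurements of $G$, $e$ and $\theta$ suffice to predict the gauge-field masses at tree level.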
5 The Strong Interaction Era
As we have just seen, the electroweak theory was suggested by the phenomenology, notably the $V - A$ theory. In contrast, the major breakthrough in the gauge-theory of the strong interactions came from a theoretical discovery, namely the asymptotic freedom of non-abelian gauge-theories. In this chapter the path to the strong interaction theory and the crucial role of asymptotic freedom will be sketched.
5.1 Phenomenology
For the strong interactions there was no analogue of the vectorial four-fermi coupling that had suggested the electroweak gauge theory. On the contrary, until the seventies, the standard assumption was that the strong interactions were mediated by the pseudo-scalar mesons and their higher spin resonances. Although it was known from the study of their form factors that the nucleons and mesons were extended objects, the idea of their fundamental nature persisted, and was formulated in the form of analytic S-matrix theory and also through the bootstrap hypothesis. These theories assumed a kind of nuclear democracy, in which the composite nature of each field was due to the presence of the others. However, the discovery in the early sixties that the isospin symmetry of the strong interactions generalized to an SU(3) (flavour) symmetry, in which the mesons and nucleons were not in the fundamental representation, opened the way for the hypothesis that the mesons and nucleons were composite states of more fundamental particles, called quarks, which had spin one-half and belonged to the fundamental representation of SU(3). This hypothesis received some strong independent experimental support some years later when it was discovered that, in electron-proton scattering at very high energy, the electrons scattered as if they were encountering free point particles, which were soon identified as the quarks. The situation was analogous to that in the Rutherford atomic experiments, where the alpha particles scattered off the (effectively point-like) nuclei. Further support for the quark hypothesis came from the observation that the higher baryon-meson resonances fitted exactly into the quark pattern, the most spectacular example being the spin three-half $\Omega^-$ particle.
This led to an extensive investigation into quark spectroscopy, the results of which were so affirmative that, in spite of the fact that individual quarks were not seen experimentally, the quark model was soon taken for granted. According to the quark model the baryons are composites of three quarks and the mesons are composites of quark-anti-quark pairs. However, there was a price to be paid. First, as just mentioned, the quarks are not seen experimentally, and this forced the introduction of the rather ad hoc hypothesis that they were confined in the nucleons by some, as yet unknown, property of the strong nuclear forces. Second, so that the baryons would be anti-symmetric in the quark quantum numbers in accordance with the spin-statistics theorem, a new quantum number had to be introduced. The new quantum number was called colour and, since it had to take at least three values, this led to the introduction of a new internal symmetry group called SU(3)-colour, with the quarks in its fundamental (three-dimensional) representation, and all the visible particles in the trivial representation (colourless). And there the matter rested for some years.
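The role of colour can be made concrete in a standard textbook form that is not part of the original text. Taking a colour index a = 1, 2, 3 on the quark fields as an assumed notation, the colourless combinations are built with the invariant tensors of SU(3). The sketch below indicates why at least three colours are needed for a state such as the $\Omega^-$, whose flavour, spin and ground-state spatial parts are all symmetric.

```latex
% Assumed notation: q^a is a quark field with colour index a = 1,2,3.
% Colourless (singlet) baryon and meson combinations:
\[
  B \;\sim\; \epsilon_{abc}\, q^a q^b q^c ,
  \qquad
  M \;\sim\; \bar q_a\, q^a .
\]
% Example: the spin three-half state \Omega^- = (s\uparrow\, s\uparrow\, s\uparrow)
% is symmetric in flavour, spin and (ground-state) space, so the required
% overall antisymmetry has to be carried by the colour indices:
\[
  |\Omega^-\rangle \;\sim\; \frac{1}{\sqrt{6}}\,\epsilon_{abc}\,
  |s^a\!\uparrow\; s^b\!\uparrow\; s^c\!\uparrow\rangle .
\]
```

Since $\epsilon_{abc}$ needs three distinct index values to be non-zero, fewer than three colours would not allow such a state.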
5.2
Asymptotic Freedom
The next breakthrough in the strong interactions followed soon after the proof of renormalizability of non-abelian gauge theory. Once the renormalizability was established it was possible to consider the renormalization group equation, in particular to consider the scale behaviour of the gauge-coupling constant. In the other known renormalizable interactions such as QED and $\phi^4$ theory, the strength of the coupling constant increased with energy-scale, but to everybody's surprise, it turned out that for non-abelian gauge theories it decreased with energy-scale. Because the interaction became weaker as the energy increased this phenomenon became known as ultra-violet asymptotic freedom, or simply asymptotic freedom. Since the weakening of the coupling with increasing energy could explain the point-particle-like scattering of electrons at high energy, and the converse of asymptotic freedom, namely infra-red slavery, could explain the quark-confinement (their binding force would increase with decreasing energy and therefore increasing distance), the discovery of asymptotic freedom led immediately to the idea that the strong interactions were mediated by non-abelian gauge fields, called gluons. The serendipity of the situation was that there was no need to invent a new gauge-group for the gluons because a suitable internal symmetry group already existed, namely the SU(3)-colour group. The theory, therefore, is that the SU(3)-colour group is a gauge-group in the sense of Weyl and that the strong interactions are mediated by its gauge-fields, the gluons. This theory is known as quantum chromodynamics (QCD), and according to it, the gluons belong to the adjoint representation of SU(3). Hence they are coloured and confined, like the quarks, and are not expected to be seen directly. In contrast to the electroweak gauge-theory, QCD is assumed to be unbroken, and so requires no Higgs fields. Of course, asymptotic freedom is violated if the number of matter fields becomes too large, so it puts an upper bound on their number. Although the bound is relatively low for fermions, it is still high enough to permit the number of quarks required to describe the observed particle-spectrum. From the Lagrangian point of view QCD is formally the same as QED, the Lagrangian being simply
$$L = \tfrac{1}{4}\,\mathrm{tr}(F_{\mu\nu}F^{\mu\nu}) + \bar\psi\,(\gamma\cdot D + M)\,\psi \qquad (5.1)$$
where M is a mass-matrix for the quarks. Here the electromagnetic gauge group U(1) is replaced by the (unbroken) non-abelian gauge group SU(3), the photon is replaced by eight gluons in the adjoint representation of SU(3) and the electron is replaced by the quark fields $\psi$. The idea is that these interactions form a basis for the observed meson-baryon interactions in the same way that electromagnetic interactions form a basis for molecular interactions. Thus the mesons and nucleons are the molecules of this theory. However, the non-abelian structure makes a huge difference. The SU(3) theory is asymptotically free, whereas electromagnetism is not, and the gluons are confined, whereas the photon is not. Furthermore, because the group is unbroken, the problems of massless gauge-fields and infra-red divergences that plagued the original non-abelian gauge theory resurface. But they are now thought to play a vital role in the confinement process.
5.3
Confinement
In spite of the beauty of the non-abelian theory of the strong interactions it has not yet provided a quantitative solution to the confinement problem. However, there are many fruitful ideas on the subject. One of the most appealing is that confinement is due to a (dual) Meissner effect in which the roles of electricity and magnetism are interchanged. The idea is that, although a superconductor expels magnetic flux lines, it makes an exception in the case of vortices. These are one-dimensional flux lines which, for topological reasons, force their way through the interior of the superconductor. The idea is that, for some similar, but as yet not fully understood, topological reasons, the electric flux lines within the nucleus play the same role as the magnetic vortex lines within the superconductor. Since they are then one-dimensional, the electromagnetic potential is linear rather than Coulomb-like and thus produces a force that does not fall off with distance. In other words, the further the quarks and gluons are pulled apart, the greater the energy needed to separate them, which would explain confinement. This explanation, of course, requires the existence of a duality between electricity and magnetism, but there is actually some strong evidence for the existence of such a duality from other sources, e.g. from lattice gauge theory computations, from instanton theory and from supersymmetric models. Other explanations of confinement are based on the existence of solitons such as monopoles in non-abelian gauge theory (section 8.1) and on the existence of Gribov horizons (section 7.2).
Other aspects of QCD, such as scattering, are more susceptible to experimental verification and to the extent that the theory can make predictions the agreement with experiment is good. The problem is that, because of the strength of the coupling, perturbation theory does not work, unless accompanied by some phenomenological input to account for confinement and other non-perturbative effects. The best that can be done directly is to use semi-classical approximation methods and lattice computations. Both of these methods have been fruitful and all the available evidence suggests that the QCD theory is correct.
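Two quantitative statements of this chapter, the decrease of the coupling with energy scale and the upper bound on the number of quark flavours, can be illustrated with the standard one-loop formulas. The sketch below is not taken from the original text: the function names are hypothetical, the one-loop coefficient b0 = (11 Nc − 2 nf)/3 is the usual textbook expression, and the reference value alpha = 0.118 at 91 GeV is an assumed illustrative input rather than a result of this chapter.

```python
import math


def beta0(n_colours: int, n_flavours: int) -> float:
    """One-loop coefficient b0 = (11*Nc - 2*nf)/3 for SU(Nc) with nf quark flavours."""
    return (11 * n_colours - 2 * n_flavours) / 3


def max_flavours(n_colours: int = 3) -> int:
    """Largest nf for which b0 > 0, i.e. for which the coupling still decreases with energy."""
    nf = 0
    while beta0(n_colours, nf + 1) > 0:
        nf += 1
    return nf


def alpha_s(q: float, mu: float, alpha_mu: float,
            n_flavours: int, n_colours: int = 3) -> float:
    """One-loop running: alpha(q) = alpha(mu) / (1 + alpha(mu)*b0/(4*pi)*ln(q^2/mu^2))."""
    b0 = beta0(n_colours, n_flavours)
    return alpha_mu / (1 + alpha_mu * b0 / (4 * math.pi) * math.log(q**2 / mu**2))


if __name__ == "__main__":
    # Flavour bound for SU(3)-colour at one loop.
    print("max flavours:", max_flavours(3))
    # Assumed reference point (illustrative only): alpha = 0.118 at mu = 91 GeV.
    for q in (10.0, 91.0, 1000.0):
        print(q, alpha_s(q, mu=91.0, alpha_mu=0.118, n_flavours=5))
```

At this order the bound for SU(3) is sixteen flavours, comfortably above the six quarks of the standard model, and the coupling computed at 1000 GeV is smaller than at 10 GeV. The one-loop formula is only a guide at low scales, where the coupling is large and perturbation theory fails, as noted above.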
6
The Standard Model
Once the QCD model of the strong interactions is accepted the strong interactions fall into line with all the others. Gravity is still a little different but the other three interactions are very similar and, although they have not yet been unified, they can be combined in a single model. In this chapter we describe the group structure of this model and discuss the outlook for unification.
6.1
Standard Model of Electroweak and Strong Interactions
The gauge theory for the combined electroweak and strong interactions is based on the Lie algebra SU(3) × SU(2) × U(1) which is just the direct sum of the strong and electroweak algebras. The only real problem is to arrange the fermion fields in the correct patterns. This is not quite trivial. First, the classification of fermions in the particle-tables shows that there is a six-fold correlation between the quantum numbers of the observed fermions which is encapsulated in the formula Q =
representation of electroweak SU(2), and since this applies in particular to quarks, it forces the introduction of a new set of quarks called charmed quarks, with the assignment of the nuclear quarks (u and d) to one fundamental representation of $SU(2)_{\rm ew}$ and of the charmed and strange (c and s) to another. Although quarks are not seen, the existence of the charmed quarks has been verified experimentally by the detection of meson and baryon composites with charm quantum numbers. For reasons connected with the U(1) anomaly (see section 8.4) each quark doublet has to be accompanied by a lepton-doublet, and in that case the obvious assignments are
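$$\begin{pmatrix} \mathbf{u} & \mathbf{d} \\ \nu_e & e \end{pmatrix} \qquad \begin{pmatrix} \mathbf{c} & \mathbf{s} \\ \nu_\mu & \mu \end{pmatrix}$$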
where the boldface particles are quarks (belonging to the fundamental representation of SU(3)-colour) and the others are the leptons (neutral with respect to SU(3)-colour). The second sector decays into the first via charged flavour-changing currents and the rates are in good agreement with experiment. The later experimental discovery of the $\tau$-particle, similar to the electron and muon, forced the introduction of yet another quark-lepton sector, with the $\nu_\tau$ and $\tau$-lepton accompanied by a quark doublet {t, b}, so that the full set of quarks and leptons is actually
$$\begin{pmatrix} \mathbf{u} & \mathbf{d} \\ \nu_e & e \end{pmatrix} \qquad \begin{pmatrix} \mathbf{c} & \mathbf{s} \\ \nu_\mu & \mu \end{pmatrix} \qquad \begin{pmatrix} \mathbf{t} & \mathbf{b} \\ \nu_\tau & \tau \end{pmatrix}$$
The three sets of particles shown are called generations, and although one might expect more generations to follow, there is good experimental evidence from both particle physics and astrophysics which shows that there are no further generations (at least at reasonable mass-levels). This is quite a puzzle since resonances usually continue indefinitely and, in fact, it is just an updated version of the old electron-muon puzzle. At present the best that can be done for the generations is to encapsulate the ignorance concerning them in a 3 × 3 unitary matrix, called the Cabibbo-Kobayashi-Maskawa (CKM) matrix, which connects the 3 generations, and whose entries describe the masses and the decay rates of the particles.
6.2
Outlook
The three irreducible gauge-groups in the standard model have different coupling constants and this suggests that the present version is the broken remnant of a more unified theory in which there is just one coupling constant. In particular it has been suggested that the standard gauge-group is a subgroup of a simple, grand-unified, gauge-group. However, while there are many arguments in favour of (and against) this hypothesis, no successful model has yet been constructed.
Whatever the final truth about QCD, it has brought about a complete revolution in our way of thinking about the strong interactions. Far from being exceptional forces mediated by pseudo-scalar mesons, they appear to be archetypal non-abelian gauge forces, completely in line with the other fundamental interactions, including gravity. Indeed the strong interactions theory bears an uncanny resemblance to electrodynamics, being unbroken and differing essentially only in the non-abelian character of its gauge-group and the strength of the coupling.
6.3
String Path to Gauge Theory
In the previous sections the historical path through which the theory of the strong interactions changed from the old pseudo-scalar meson mediated theory to a gauge-theory was outlined. There the roles of asymptotic freedom and of the colour group were crucial, and one might ask whether in their absence, the meson theory could ever have led to gauge theory. The simple answer is yes. About a decade after the construction of the standard model of the nuclear interactions, string theory, which originated in the theory of the strong interactions alone, was found to incorporate not only the kind of gauge-theory appropriate to the strong and electroweak interactions but gravitational theory as well. Indeed, in contrast to local field theory, where gauge-theory is an optional extra whose validity rests on experimental evidence, in string theory it is an intrinsic part of the structure, as we shall now try to explain briefly.
As already mentioned, the failure of perturbation theory for the strong interactions led to alternative formulations, by far the most successful of which was Scattering-Matrix or S-Matrix theory (also called the theory of dispersion relations). The idea was that the scattering matrix, considered as a function of the momenta of the scattered particles, could be determined from a few basic axioms such as unitarity and causality, together with some experimental input concerning the scattered particles, identified as poles. Although the theory was not successful in the sense that the S-matrix could be determined uniquely from the input, or that it could be exactly computed, it was very successful from the phenomenological point of view. It also had the effect of creating a strong intuition about the nature of the strong interactions.
The essential feature of the theory was that causality implied that the S-matrix was analytic in the invariants formed from the external momenta and one of the most important consequences of this was crossing symmetry, which related the energies and angular momenta in different channels. A model of the S-matrix that incorporated analyticity, crossing symmetry and many other desirable features (but not unitarity) was constructed by Veneziano in the late sixties. Soon afterwards it was realized that this model could be derived from a two-dimensional string theory. Thus string theory came from S-matrix theory via the Veneziano model. In string theory the fundamental objects are not point particles but one-dimensional strings of microscopic length. They may be described by a complex coordinate $\sigma$ =
$$\int DX\, e^{S(X)} \prod_{a=1}^{n} \int d^2\sigma\; e^{i p_a\cdot X(\sigma)} \qquad (6.1)$$
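where the $p_a$ are the momenta of the external particles, the $X^\mu$, $\mu = 1 \ldots d$ are space-time coordinates, assumed to be scalar functions of the string variable $\sigma$, the vertices $\int d^2\sigma\, e^{ip\cdot X(\sigma)}$ are generalizations of the standard Feynman vertices $e^{ip\cdot x}$, and the Action S is
$$S(X) = \kappa\int d^2\sigma\; g^{\alpha\beta}\,\eta_{\mu\nu}\,\partial_\alpha X^\mu\,\partial_\beta X^\nu \qquad (6.2)$$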
where $g^{\alpha\beta}$ and $\eta_{\mu\nu}$ are the metrics in the two-dimensional string space and the d-dimensional Lorentz space respectively, and $\kappa$ is a constant called the string tension. The constant $\kappa$ is assumed to be of order of the Planck scale, so that the length of the string, which is proportional to $\kappa^{-1}$, is of order $10^{-33}$ cm. The remarkable feature of string theory is that the Veneziano n-point functions of ordinary field theory are just the expectation values of n vertices in a two-dimensional field theory. The equivalence between the Veneziano amplitudes and the two-dimensional expectation values (and their exact computation) depends critically on the conformal invariance of the string Action, and because the two-dimensional theory has a conformal anomaly proportional to (d − 26) the equivalence holds only for d = 26, that is to say, only for (25 + 1)-dimensional Lorentz spaces. In the supersymmetric version the number of Lorentz dimensions can be reduced to 9 + 1. But there is no known version in which it can be reduced to 3 + 1. This does not necessarily constitute a flaw of string theory, because of the possibility of dimensional reduction. Indeed, the fact that string theory provides a natural environment for dimensional reduction may turn out to be an asset.
What has all this got to do with gauge-theory? The point is that an analysis of the mass-spectrum of the string shows that, in addition to a scalar tachyon (that disappears in the supersymmetric case) and the part of the spectrum that is at least of order of the Planck mass, it contains a finite number of zero-mass particles. Furthermore, apart from scalars and fermions, the corresponding particles have spins 1 and 2 for the open and closed strings respectively, and thus are candidates for gauge fields and gravitons. The question is whether they are successful candidates.
To investigate this question one lets $\xi_\mu$ and $\xi_{\mu\nu}$ denote the polarization vectors of the spin 1 and 2 particles respectively and examines the behaviour with respect to the gauge-transformations
$$\xi_\mu \to \xi_\mu + \phi(p)\,p_\mu \quad\text{and}\quad \xi_{\mu\nu} \to \xi_{\mu\nu} + \phi_\mu(p)\,p_\nu + \phi_\nu(p)\,p_\mu \qquad (6.3)$$
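of the vertices corresponding to the respective particles, where the $\phi(p)$'s are arbitrary functions. The vertices are the obvious generalizations of (6.1), namely
$$V(\xi,p) = \int ds\; e^{ip\cdot X(s)}\,\xi_\mu\,\partial_s X^\mu(s) \qquad (6.4)$$
for spin 1, where s is the space-like component of $\sigma$, and
$$\int d^2\sigma\; e^{ip\cdot X(\sigma)}\, g^{\alpha\beta}\,\xi_{\mu\nu}\,\partial_\alpha X^\mu\,\partial_\beta X^\nu \qquad (6.5)$$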
for spin 2, where, for normal-ordering reasons, we must have
$$\xi^\mu{}_\mu = 0 \quad\text{and}\quad p^\mu\xi_{\mu\nu} = 0 \qquad (6.6)$$
in the second case. It turns out that the gauge transformations (6.3) can be absorbed by the integrations in (6.4) and (6.5) with the result that the vertices are gauge-covariant. Thus the Veneziano S-matrix elements for spin 1 and spin 2 fields
derived from string theory are gauge-covariant, and this permits the spin 1 and 2 particles to be identified with ordinary gauge fields and gravitons (using also (6.6) in the spin 2 case). The fact that it is integration over $\sigma$ that allows the gauge-transformation to be absorbed shows that the gauge-invariance is a consequence of the extended nature of the string.
String theory was so closely connected with the strong interactions that, although the above gauge-properties were known quite early, they were regarded as a nuisance rather than an advantage. It was only in the mid-eighties, when string theory was shown to be anomaly-free, and the gauge-theories of the nuclear interactions were already ten years old, that the idea emerged that string theory might describe the other fundamental interactions, indeed might be a so-called 'theory of everything'.
Before leaving string theory the problem of dimensions should perhaps be mentioned. As we have seen, string theory works only in 25 + 1 or 9 + 1 Lorentz dimensions and thus needs dimensional reduction to relate it to the ordinary 4-dimensional theories. But, as mentioned earlier, this apparent shortcoming could turn out to be a positive feature, since it introduces dimensional reduction in a natural way and may select the experimentally observed gauge-theories from the myriad of possible ones. Indeed the so-called toroidal reductions from 26 to 10 dimensions limit the number of gauge groups to subgroups of SO(32) and E(8) × E(8) and have contributed to the present interest in the generalization of string theory to membrane theory. The reduction from 10 to 4 is less structured, but has produced a number of interesting proposals.
Thus, although so far there has been no real phenomenological success of string theory, in the sense of fixing the gauge-groups precisely, or finding a realistic mass-scale for the observed particles, it may yet open the way for an explanation of the standard model and its relation to gravity. In any case it has produced a radically new way of thinking about gauge theories.
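The statement in this section that the gauge transformation of the spin 1 polarization is absorbed by the integration over the string variable can be made explicit in one line. The following is a sketch under an assumed form of the vertex, $V(\xi,p) = \int ds\, e^{ip\cdot X(s)}\,\xi_\mu\,\partial_s X^\mu(s)$, and it leaves aside the treatment of the boundary contributions, which depends on details not given here.

```latex
% Assumed form of the spin 1 vertex:
%   V(\xi,p) = \int ds\, e^{ip\cdot X(s)}\, \xi_\mu\, \partial_s X^\mu(s).
% Change under \xi_\mu \to \xi_\mu + \phi(p)\,p_\mu, using
%   \partial_s e^{ip\cdot X(s)} = i\,p_\mu\,\partial_s X^\mu(s)\, e^{ip\cdot X(s)} :
\[
  \delta V
  = \phi(p)\int ds\; e^{ip\cdot X(s)}\, p_\mu\,\partial_s X^\mu(s)
  = -\,i\,\phi(p)\int ds\; \partial_s\!\left(e^{ip\cdot X(s)}\right).
\]
% The variation is the integral of a total derivative, so it reduces to
% boundary terms and does not affect the bulk of the vertex.
```

The argument only works because the vertex is integrated along the string, which is the sense in which the gauge-invariance reflects the extended nature of the string.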
7
Gauge-Fixing, BRST and Constraint Systems
We begin this chapter with a discussion of the gauge-fixing problem in gauge theories. A remarkable fact is that this problem, which at first sight would seem to be the most arbitrary and least symmetrical aspect of gauge-theory, actually leads to a very elegant subsymmetry, BRST symmetry. The BRST symmetry also serves to emphasize another aspect of gauge theory, namely the fact that it is a first-class constrained system. Hence we shall include BRST theory and the place of gauge theories in the general theory of constraints.
7.1
Faddeev-Popov Gauge-Fixing
To put the gauge-fixing problem in historical perspective, it should be mentioned that, even prior to the construction of the standard model, it had been discovered by Feynman and DeWitt, in attempts to quantize gravity, that the quantization of gauge-theories required in general the introduction of extra fields $c(x)$ and $\bar c(x)$, called ghost fields because they did not occur as external lines in Feynman diagrams.
They satisfied the canonical equal-time commutation relations
$$\{b(x), c(y)\} = \delta(x-y) \qquad \{\bar c(x), \bar b(y)\} = \delta(x-y) \qquad (7.1)$$
which implied that they operated on a Hilbert space with indefinite metric, and they obeyed the opposite spin-statistics relations to physical fields. The origin of the ghost-terms was found by Faddeev and Popov in the context of the path-integral formulation of Yang-Mills theories, where they showed that the ghosts came from the gauge-fixing. Their point was that gauge-fixing was not simply a matter of inserting a factor $\delta(\chi)$ into the functional integral, where $\chi = 0$ is the classical gauge-fixing function, since this would be inconsistent with the identity
$$\delta(\chi_1) = \delta(\chi_2)\,\det\!\left(\frac{\delta\chi_2}{\delta\chi_1}\right) \qquad (7.2)$$
Consistency required that the delta-function be accompanied by a factor that would produce the Jacobian on making a change of gauge, and, since the only group invariant variables are the parameters $\alpha$ the only possible factor was $\det(\delta\chi/\delta\alpha)$. Thus the gauge-fixed path-integral proposed by Faddeev and Popov was
$$\int dA_\mu\; \delta(\chi)\,\det\!\left(\frac{\delta\chi}{\delta\alpha}\right) e^{i\int L(A_\mu,\psi) + j^\mu A_\mu} \qquad (7.3)$$
where $j^\mu$ is the current. The ghost fields $c$ and $\bar c$ then emerged in a representation of the Jacobian as the functional integral, namely
$$\det\!\left(\frac{\delta\chi}{\delta\alpha}\right) = \int dc\,d\bar c\; e^{\bar c\,(\delta\chi/\delta\alpha)\,c} \qquad (7.4)$$
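The representation of a determinant by an integral over anticommuting variables can be checked in a finite-dimensional analogue. The following is a standard sketch that is not taken from the original text; the Berezin normalization $\int dc_i\,d\bar c_i\,\bar c_i c_i = 1$ and the 2 × 2 matrix M are assumptions made for illustration.

```latex
% Assumed conventions: c_i, \bar c_i (i = 1,2) all anticommute, and the
% Berezin integral is normalized so that \int dc_i\, d\bar c_i\; \bar c_i c_i = 1.
% The exponential series stops at second order because each variable squares to zero:
\[
  e^{\bar c_i M_{ij} c_j}
  = 1 + \bar c_i M_{ij} c_j
  + \tfrac{1}{2}\,(\bar c_i M_{ij} c_j)^2 .
\]
% Only the quadratic term contains all four variables, and
% \bar c_1 c_2 \bar c_2 c_1 = -\,\bar c_1 c_1 \bar c_2 c_2 :
\[
  \tfrac{1}{2}\,(\bar c_i M_{ij} c_j)^2
  = M_{11}M_{22}\,\bar c_1 c_1 \bar c_2 c_2
  + M_{12}M_{21}\,\bar c_1 c_2 \bar c_2 c_1
  = (M_{11}M_{22} - M_{12}M_{21})\,\bar c_1 c_1 \bar c_2 c_2 ,
\]
% so that the integral picks out the determinant:
\[
  \int \prod_{i=1}^{2} dc_i\, d\bar c_i \; e^{\bar c_i M_{ij} c_j} = \det M .
\]
```

An integral over ordinary commuting variables with the same exponent would instead give an inverse power of the determinant, which is why the ghosts have to obey the opposite statistics.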
The unitary gauge was distinguished by the fact that the Jacobian was unity and required no ghosts. Similar results hold, of course, for QED, but there the effect had not been noticed because the ghosts decouple.
7.2
Gribov Ambiguities
Although the Faddeev-Popov procedure represented a major breakthrough in gauge-fixing, it left open the question of uniqueness. The question was whether a particular choice of local gauge-fixing function $\chi$ fixed the gauge completely, i.e. whether the condition $\chi = 0$ left no room for any gauge transformations except g = 1. This question was first raised by Gribov, who pointed out that, although the Coulomb gauge $\nabla\cdot A = 0$, with $A \to 0$ for $x \to \infty$, fixes the gauge uniquely in the abelian case, it does not do so for SU(2). In fact, the gauge-fields $A^f = f^{-1}\partial f$, which are gauge-equivalent to the trivial solution A = 0, satisfy the Coulomb conditions provided
$$\partial\,(f^{-1}\partial f) = 0 \qquad f(0) = 1 \qquad f(\infty) = \pm\, i\,\frac{\sigma\cdot x}{r} \qquad (7.5)$$
and this equation has a denumerably infinite set of solutions. Gribov's statement was the prototype of a more general one that the gauge cannot always be fixed uniquely using a local gauge-fixing function $\chi$. The different solutions of a putative gauge-fixing condition are called Gribov copies. It was also shown by Singer that, for a pure non-abelian theory on $S_4$, no continuous global map from the space $\mathcal{A}/\mathcal{G}$
of gauge-invariant orbits to the space $\mathcal{A}$ of gauge-potentials existed, which meant that no continuous, global gauge-fixing was possible. In fact, the general situation is that, for non-abelian gauge-theories, the topology of the orbit-space $\mathcal{A}/\mathcal{G}$ (which depends of course on the boundary conditions) is so complicated that a unique global gauge-fixing is the exception rather than the rule. The best that can be done is to gauge-fix in patches and up to Gribov copies. Luckily, the existence of such patches and copies depends strongly on $g^{-1}$, where g is the gauge-coupling constant, so it does not affect perturbation theory. As might be expected from the fact that the gauge-fixing breaks down when the Faddeev-Popov determinant is zero, there is a close connection between the Gribov ambiguity and the zero-modes $\alpha_0$ of the Faddeev-Popov operator $\delta\chi/\delta\alpha$. In fact, since
$$\frac{\delta\chi}{\delta\alpha} = [G, \chi] \qquad (7.6)$$
where the G's are the generators of the gauge-transformations, the gauge transformations $e^{i\int \alpha_0 G}$ with the zero-modes as parameters commute with the gauge condition $\chi$ and therefore generate Gribov copies. Thus, if there are subspaces, called Gribov horizons, on which the Faddeev-Popov determinant vanishes these are a signal for the existence of Gribov copies. The original idea of Gribov was that the existence of zeros for the Faddeev-Popov determinant was related to confinement, and, although there is no firm evidence for this so far, there are enough indications for the idea to be taken seriously.
7.3
BRST Theory
Leaving aside the question of Gribov ambiguities we return to the Faddeev-Popov gauge-fixing. Soon after the proof of renormalizability of non-abelian gauge theory was established, it was found that this gauge-fixing procedure was actually equivalent to a very elegant substructure of non-abelian gauge theory, called BRST theory, as follows: First, it should be noted that, by using a Lagrange multiplier B, the Faddeev-Popov gauge-fixed integral may be written as
$$\int d\psi\,dA_\mu\,dB\,dc\,d\bar c\; e^{i\int L(A_\mu,\psi) + B\chi + \bar c\,(\delta\chi/\delta\alpha)\,c} \qquad (7.7)$$
In other words Faddeev-Popov gauge-fixing can be implemented by simply inserting a factor
$$\int dB\,dc\,d\bar c\; e^{i\int B\chi + \bar c\,(\delta\chi/\delta\alpha)\,c} \qquad (7.8)$$
in the path-integral. The remarkable discovery of BRST was that, if the gauge-parameters $\alpha(x)$ of the ordinary infinitesimal gauge-transformations were replaced by the ghost-fields $c(x)$, and the transformations extended to B and the ghost-fields by including
$$\delta\bar c = B \qquad \delta c_a = f_{abc}\,c_b c_c \qquad \delta B = 0 \qquad (7.9)$$
then the gauge-fixed path-integral was invariant with respect to these transformations. Furthermore, the variations were nilpotent, $\delta^2 = 0$, and the Action density in the Faddeev-Popov factor (7.8) could be written as
$$B\chi + \bar c\,(\delta\chi/\delta\alpha)\,c = \delta\Psi \quad\text{where}\quad \Psi = \bar c\,\chi \qquad (7.10)$$
In other words, the Faddeev-Popov Action density was not only BRST invariant but an exact BRST derivative. The generator for the BRST transformations is
$$\Omega = c_a G_a \qquad (7.11)$$
where the $G_a$ are the generators of the ordinary gauge-transformations (section 7.4). This operator is nilpotent, $\Omega^2 = 0$, in accordance with $\delta^2 = 0$, and the fact that the Faddeev-Popov term is an exact BRST derivative $\delta\Psi$ is a consequence of the cohomology associated with this nilpotency. The practical advantage of BRST theory is that, although BRST invariance is equivalent to gauge-invariance, the nilpotency makes it much easier to deal with, and thus it provides a much more satisfactory and precise way of establishing various properties of the theory, notably the renormalization properties.
7.4
Gauge Theory as a First-Class Constraint System
The BRST formalism emphasizes an aspect of gauge-theories that has not been discussed so far, namely their relationship to constrained theories, and we conclude this chapter by discussing this aspect. We first recall briefly the Dirac theory of first and second class constraints. For a classical canonical system with the usual phase-space coordinates $p_i, q_j$, for $i,j = 1 \ldots N$, a first-class set of constraints $G_a(q,p)$, $a = 1 \ldots n < N$ is a set which closes under Poisson-bracket commutation and such that
$$\det\{G_a, G_b\}_0 = 0 \qquad (7.12)$$
where the subscript zero means that the determinant is to be evaluated on the constrained surface $G_a = 0$. The Poisson bracket action of the $G_a$ on the original phase-space generates a group, called a gauge-group in the wider sense, and there are obviously 2N − n independent phase-space invariants with respect to this group. The invariants do not constitute a reduced system because they are not canonical, indeed the number of them may be odd. The true constrained system, or reduced system, is obtained by introducing a further, complementary, set of n constraints $\chi_a$, $a = 1 \ldots n$, with the property that, if the $G_a$ and $\chi_a$ are denoted collectively by $C_s$, $s = 1 \ldots 2n$ then
$$\det \Delta_{st} \neq 0, \qquad \Delta_{st} = \{C_s, C_t\}_0 \qquad (7.13)$$
where the subscript zero means that the determinant is to be computed on the constrained surface $C_s = 0$. The system of constraints $C_s$ is called a second-class system. The reduced system is then defined by
$$X^* = X - \{X, C_s\}_0\,\Delta^{-1}_{st}\,C_t \qquad (7.14)$$
where the X's are any functions of the p's and q's. The reduced system is canonical, of dimension 2(N − n), and has Poisson brackets
$$\{X^*, Y^*\} = \{X, Y\}_0 - \{X, C_s\}_0\,\Delta^{-1}_{st}\,\{C_t, Y\}_0 \qquad (7.15)$$
Its dynamics is determined by the projected Hamiltonian
$$H^* = H - \{H, C_s\}_0\,\Delta^{-1}_{st}\,C_t \qquad (7.16)$$
and in many cases this is the same as the original Hamiltonian.
For many years attempts to find a satisfactory quantum version of Dirac's constraint theory failed, principally because of the ambiguity and technical difficulties in defining $\Delta^{-1}$. But after the advent of the path-integral and BRST theory it was found that the formalism actually generalizes very naturally to quantum theory as follows: Given a phase-space path-integral
$$\int dp\,dq\; e^{i\int d^4x\,(p\dot q - H)} \qquad (7.17)$$
the first-class constraints G are implemented by introducing a factor
$$F_1 = \delta(G) = \int dA\; e^{i\int A\,G} \qquad (7.18)$$
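and the complementary constraints are added by changing this factor to
$$F_2 = \int dA\,dB\,db\,d\bar b\,dc\,d\bar c\; e^{i\int \{\Omega,\Psi\}} \qquad (7.19)$$
where the B is a Lagrange multiplier canonically conjugate to A, the $b, \bar b$ and $c, \bar c$ are separate pairs of ghost fields and
$$\Omega = c_a G_a + b_a B_a + f_{abc}\,\bar c_a c_b c_c \qquad \Psi = \bar c_a\chi_a + \bar b_a A_a \qquad (7.20)$$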
The original path-integral with this factor inserted is the path-integral for the reduced system. On evaluating the Poisson-bracket $\{\Omega, \Psi\}$ and carrying out the integration over $b$ and $\bar b$ the factor becomes
$$\int dA\,dB\,dc\,d\bar c\; e^{i\int A_a G_a + B_a\chi_a + \bar c\,[FP]\,c + \bar c\,[BFV]\,c} \qquad (7.21)$$
where
$$[FP] = \det\{G_a, \chi_b\} \qquad [BFV] = \det\{B_a, \chi_b\} \qquad (7.22)$$
which will be recognized as a slight extension of the BRST factor of the previous section. Note that in this formalism the Faddeev-Popov determinant is identified as the Dirac determinant $\det\{G, \chi\}$. Of course, this determinant may have zeros, so the Gribov problem is still present. From the above we see that, in both the classical and quantum versions, there are three levels in the theory of first-class constraints:
2N-dimensional, unconstrained, canonical system
(2N − n)-dimensional, first-class constrained, non-canonical, system
2(N − n)-dimensional, second-class constrained, canonical system
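The three levels can be seen in the smallest possible example. The sketch below is an illustration that is not taken from the original text: it assumes N = 2 and n = 1, with the first-class constraint $G = p_1$, the complementary constraint $\chi = q_1$, and the convention $\{q_i, p_j\} = \delta_{ij}$, and it uses the reduced bracket in the form {X*, Y*} = {X, Y} − {X, C_s} Δ⁻¹_st {C_t, Y} quoted above.

```latex
% Toy example (assumed for illustration): N = 2, n = 1,
% first-class constraint G = p_1, complementary constraint \chi = q_1.
\[
  C_1 = G = p_1, \qquad C_2 = \chi = q_1, \qquad \{q_i, p_j\} = \delta_{ij}.
\]
% Matrix of Poisson brackets of the constraints and its inverse:
\[
  \Delta_{st} = \{C_s, C_t\} =
  \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},
  \qquad
  \Delta^{-1}_{st} =
  \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},
  \qquad \det \Delta = 1 \neq 0 .
\]
% Reduced brackets: the constrained pair drops out, the other pair is untouched.
\[
  \{q_1^*, p_1^*\} = 1 - \{q_1, C_1\}\,\Delta^{-1}_{12}\,\{C_2, p_1\} = 1 - 1 = 0,
  \qquad
  \{q_2^*, p_2^*\} = 1 .
\]
```

Here the unconstrained system has dimension 2N = 4, the surface $p_1 = 0$ has the odd dimension 2N − n = 3 and is therefore not canonical, and the reduced system $(q_2, p_2)$ has dimension 2(N − n) = 2.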
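Where do the Yang-Mills gauge theories fit into this scheme? They fit in at the intermediate, non-canonical, level. To see this consider the path-integral
$$Z(j) = \int dA\; e^{i\int \frac12\left((\partial_0 A)^2 - B^2\right) + A\cdot j} = \int dA\,d\pi\; e^{i\int -\frac12(\pi^2 + B^2) + \pi\cdot\partial_0 A + A\cdot j} \qquad (7.23)$$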
for a three-vector field A, where $B = \partial\wedge A + [A, A]$, and impose the first-class constraints
$$G = D\cdot\pi + j_0 = 0 \qquad (7.24)$$
where $j_\mu$ is the current. Using Lagrange multipliers $A_0$ the constrained path-integral $Z_c(j)$ becomes
$$Z_c(j) = \int dA\,dA_0\,d\pi\; e^{i\int -\frac12(\pi^2+B^2) + \pi\cdot\partial_0 A + A\cdot j - A_0 G} \qquad (7.25)$$
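Letting
$$E = \partial_0 A - \partial A_0 + [A, A_0] = \partial_0 A - DA_0 \qquad (7.26)$$
and using partial integration, we see that this can be written as
$$Z_c(j) = \int d\pi\,dA_\mu\; e^{i\int -\frac12(\pi^2+B^2) + \pi\cdot E + j^\mu A_\mu} \qquad (7.27)$$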
which, on integrating out the conjugate momenta $\pi$, takes the form
$$Z_c(j) = \int dA_\mu\; e^{i\int \frac12(E^2 - B^2) + j^\mu A_\mu} \qquad (7.28)$$
Since this is just the path-integral for a Yang-Mills theory it shows that the Yang-Mills theories are simply systems of three-dimensional vectors, subject to the first-class constraint G = 0. The constraint will immediately be recognized as the Gauss law. From the Gauss law we see that the gauge-transformations may be written as
$$g(x) = e^{i\int d^3x\,\alpha(x)\,(D\cdot\pi + j_0)} \qquad (7.29)$$
where $\alpha(x)$ are the group parameters. For Euclidean or Minkowski space-time the rigid gauge-group $G_0$, for which $\alpha$ is constant, can be distinguished from the local gauge-group $G(x)$, for which $\alpha(x) \to 0$ as $|x| \to \infty$, and the full gauge-group is the semi-direct product $G_0 \wedge G(x)$ of these two groups with $G(x)$ as invariant subgroup. For other, compact, space-times the gauge-group may map into the space-time itself in a non-trivial way, in which case the gauge group has topologically inequivalent extensions that are not generated by the Gauss law. For example, if $\pi_n$ denotes the homotopy group for $S_n$, then since $\pi_1(U(1))$ and $\pi_3(SU(2))$ are both non-trivial, the U(1) on the circle $S_1$ and SU(2) on the sphere $S_3$ have topologically inequivalent extensions which are not generated by the Gauss law. The physical importance of topological extensions will be seen in section 8.1.
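The simplest case of such topologically inequivalent extensions, U(1) on the circle, can be illustrated numerically. The sketch below is not part of the original text and its function names are hypothetical: it computes the winding number of a map from the circle into U(1), the integer that labels the classes of $\pi_1(U(1)) = Z$. Maps with different winding numbers cannot be deformed into one another, and only those with winding number zero are continuously connected to the identity.

```python
import cmath
import math
from typing import Callable


def winding_number(g: Callable[[float], complex], samples: int = 2000) -> int:
    """Winding number of a closed map theta -> g(theta) in U(1), theta in [0, 2*pi].

    Accumulates the phase change between neighbouring sample points; each step
    must change the phase by less than pi for the result to be reliable.
    """
    total = 0.0
    for k in range(samples):
        a = g(2 * math.pi * k / samples)
        b = g(2 * math.pi * (k + 1) / samples)
        total += cmath.phase(b / a)  # phase difference in (-pi, pi]
    return round(total / (2 * math.pi))


def power_map(n: int) -> Callable[[float], complex]:
    """The map theta -> exp(i*n*theta), a representative of the class n."""
    return lambda theta: cmath.exp(1j * n * theta)


def deformed_map(theta: float) -> complex:
    """A smooth deformation of the n = 1 map; the winding number is unchanged."""
    return cmath.exp(1j * (theta + 0.5 * math.sin(theta)))


if __name__ == "__main__":
    for n in (-2, 0, 1, 3):
        print(n, winding_number(power_map(n)))
    print("deformed:", winding_number(deformed_map))
```

The deformed map shows the point of the classification: a continuous change of the gauge function leaves the integer fixed, so the classes are separated from each other.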
8
The Rich Tapestry of Gauge Theory
Once the standard gauge theories have been firmly established by experiment, it is legitimate to ask whether, like gravitational theory, they have structures that make them desirable in their own right. The short answer is yes. Indeed they turn out to have an intrinsic elegance reminiscent of gravitational theory but also a richness of structure not found in alternative theories. We have already seen this in the BRST structure and in this chapter we wish to sketch some other attractive features. Many of these are connected to topology, with which gauge theory has a natural affinity. Of course, topology in physics is not restricted to gauge theory. It plays an important role in other branches of physics, such as the classification of defects in solid state physics and the Berry phase of atomic and nuclear physics. In gauge-theory, however, it plays a fundamental and pervasive role, and manifests itself in a variety of ways. Three of its most important manifestations are in the existence of solitons (vortices, monopoles, instantons), in the concept of Wilson and Polyakov loops, and in the construction of field-theories that are purely topological (Chern-Simons theories).
8.1
Solitons
In the case of the solitons the topology enters through the fact that the gauge-group, or some part of it, such as a subgroup or coset, can be mapped in a non-trivial way into the boundary of space-time. Instantons. The simplest example is the case of instantons. In four-dimensional Euclidean space the requirement that the pure Yang-Mills Action $\int \mathrm{tr}\,F^2$ be finite implies that on the sphere $S_3$ at infinity the gauge-potential must take the form $A_\mu(x) = g^{-1}\partial_\mu g$. Because $\pi_3(SU(2)) = Z$ the maps $g(x) \to S_3$ are generally not trivial and, as already mentioned, they cannot be gauged to zero by the gauge-transformations generated by the Gauss condition. This means that the finite-Action gauge-potentials can be classified into distinct topological sectors characterized by the boundary conditions on $A_\mu(x)$ given by the elements $n$ of $Z$. These sectors are separated by infinite action. If one writes the pure Yang-Mills Action in the form
$$\frac{1}{4}\int d^4x\,\mathrm{tr}(F^2) = \frac{1}{8}\int d^4x\,[\mathrm{tr}(F \pm \tilde F)^2 \mp 2\,\mathrm{tr}(F\tilde F)] \qquad (8.1)$$
where $\tilde F$ is the dual of $F$ formed with the Levi-Civita symbol, the second term on the right-hand side is a pure divergence whose integral over the surface at infinity is precisely the topological charge $n$. From this it follows that the Action is bounded below by the absolute value of the topological charge $n$. Furthermore, the minimal Action configurations are the solutions of the self-dual equation $F = \tilde F$. These minimal configurations are the instantons. What is important, however, is not so much the instantons themselves as the existence of the topological sectors that they represent. In general the existence of instantons shows that the Yang-Mills path-integral must be summed over the various topological sectors. The summation can seriously
alter the structure of the theory as has recently been shown explicitly in N = 2 supersymmetric QCD models, where the instanton contribution to the effective superpotential can be computed exactly and changes the effective superpotential from the perturbative function
$$\int d^3x\,[\tfrac{1}{2}\mathrm{tr}(B^2 + (D\phi)^2) + V(\phi)] = \int d^3x\,[\tfrac{1}{2}\mathrm{tr}(B \pm D\phi)^2 + V(\phi) \mp \nabla\cdot\mathrm{tr}(B\phi)] \qquad (8.2)$$
Finiteness of the energy requires that $\mathrm{tr}(\phi^2)$ be constant and $D\phi$ be zero on the sphere $S_2$ at infinity and since $\pi_2(SU(2)/U(1)) = Z$ these two conditions imply the existence of non-trivial maps from $\phi(x)$ to $S_2$ characterized by integers $n$. This means that the finite-energy Yang-Mills-Higgs configurations form topologically inequivalent sectors labelled by the topological charge $n$. The last term on the right-hand side of (8.2) is a pure divergence whose integral $\int d\Omega\,\mathrm{tr}(B\phi)$ over the sphere at infinity is precisely this charge. From the form of this integral it is clear that it is the charge corresponding to the component of the magnetic field in the direction of the Higgs field and it is for this reason that the configurations are called magnetic monopoles. In fact they are smooth non-abelian generalizations of the electromagnetic point-monopoles proposed by Dirac, whose charge was also quantized. For positive potentials the energy is bounded below by the absolute value of the topological charge $n$, and if the potential is zero the minimal energy configurations are the solutions of the first-order differential equations $D\phi = \mp B$. Vortices. The case of vortices is similar to that of monopoles except that the field configurations are assumed to be independent of one of the space-coordinates,
the Yang-Mills field is the electromagnetic field and the Higgs field has just one complex component. Thus the effective Hamiltonian is
$$H = \tfrac{1}{2}\int d^2x\,[B^2 + |D\phi|^2 + 2V(\phi)] \qquad (8.3)$$
The finiteness of the energy requires that $\phi$ be unitary and $D\phi$ be zero on the circle at infinity, and since $\pi_1(U(1)) = Z$ the maps $\phi$ to $S_1$ may be topologically non-trivial and the configurations form topological sectors defined by the charge $n$. From the conditions on $\phi$ at infinity it is easy to see that this charge is given by the magnetic flux
$$n = \frac{1}{4\pi i}\oint(\phi^* d\phi - \phi\,d\phi^*) = \frac{1}{2\pi}\oint A_\mu dx^\mu = \frac{1}{2\pi}\int d^2x\,B(x) \qquad (8.4)$$
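As a rough numerical illustration of the quantization expressed by (8.4), the following Python sketch samples a unit-modulus field on the circle at infinity and recovers the integer $n$ from the accumulated phase. The function name `winding_number` and the sample configurations are illustrative assumptions, not taken from the text; NumPy is assumed to be available.

```python
import numpy as np

def winding_number(phi):
    """Winding number of a closed loop of unit-modulus complex samples.

    phi: 1-D complex array sampled in order around the circle at infinity;
    the last sample is implicitly joined back to the first.
    """
    # Phase increment between neighbouring samples, wrapped into (-pi, pi].
    steps = np.angle(np.roll(phi, -1) / phi)
    return int(np.rint(steps.sum() / (2.0 * np.pi)))

theta = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)

vortex = np.exp(3j * theta)                                    # winds three times
deformed = np.exp(1j * (3 * theta + 0.5 * np.sin(5 * theta)))  # smooth deformation of it
antivortex = np.exp(-2j * theta)                               # winds twice the other way

print(winding_number(vortex), winding_number(deformed), winding_number(antivortex))
```

The smooth deformation leaves the integer unchanged, which is the sense in which the sectors are topological: no continuous change of $\phi$ on the circle can alter $n$.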
Thus for the vortices it is the total magnetic flux that is quantized. The importance of the vortices is that they can penetrate superconductors in spite of the fact that other magnetic flux lines are expelled. The reason they can do so is that, because of their non-trivial topology, it would require an infinite amount of energy to expel them. The critical potential that corresponds to $V = 0$ for the monopoles is $V = (|\phi|^2 - 1)^2/8$, which corresponds to the superconductor being in a critical phase between what are called type I and type II superconductors. In that case the Hamiltonian may be written in the form
$$H = \tfrac{1}{2}\int d^2x\,[(\tfrac{1}{2}(|\phi|^2 - 1) \pm B)^2 + |D\phi|^2 \pm A\wedge\partial|\phi|^2 \pm B] \qquad (8.5)$$
where the last term is the magnetic flux. The second-last term may be absorbed into the kinetic term $|D\phi|^2$ for the Higgs field to form a sum of two squares, and the minimal energy configurations are given by the solutions of the first-order differential equations obtained by setting the three squared terms equal to zero. For all the solitons the lower bounds on the Action and energy given by the topological charges are called Bogomolny bounds. The topological sectors just described are not merely of mathematical interest but are thought to play a fundamental role in the physics of gauge theories. They are all thought to play an important role in the confinement of quarks and gluons. Although this view has not been substantiated in any quantitative analytical way so far, it is supported by lattice QCD computations. These computations also indicate the existence of local chiral lumps, which can be attributed to the formation of local instantons.
8.2 Gauge-Invariant Loops
In a gauge-theory physical observables must be gauge-invariant and in electromagnetism it is customary to regard the functionals of the field-strengths $F_{\mu\nu}$ as the only gauge-invariants. This is true if the underlying space-time is topologically trivial, but if it is not, there is a further, independent, set of gauge-invariants, namely the Stokes loops
$$e^{i\oint A_\mu dx^\mu} \qquad (8.6)$$
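To make the independence of the Stokes loops from the field-strengths concrete, here is a small numerical sketch in Python. It assumes an idealised flux line through the excluded origin of the plane, with a potential that is curl-free everywhere else, and compares the loop integral for a circle that encircles the origin with one that does not. The names `loop_integral`, `flux_line_potential` and the value of `FLUX` are hypothetical choices made for illustration only.

```python
import numpy as np

def loop_integral(A, centre, radius, samples=2000):
    """Approximate the line integral of A around a counter-clockwise circle."""
    t = np.linspace(0.0, 2.0 * np.pi, samples, endpoint=False)
    dt = 2.0 * np.pi / samples
    x = centre[0] + radius * np.cos(t)
    y = centre[1] + radius * np.sin(t)
    dx = -radius * np.sin(t) * dt   # tangent components times the step
    dy = radius * np.cos(t) * dt
    Ax, Ay = A(x, y)
    return float(np.sum(Ax * dx + Ay * dy))

FLUX = 0.7  # assumed flux carried by the line through the origin

def flux_line_potential(x, y):
    """Potential of an idealised flux line: its curl vanishes for r > 0."""
    r2 = x**2 + y**2
    return -FLUX * y / (2.0 * np.pi * r2), FLUX * x / (2.0 * np.pi * r2)

around = loop_integral(flux_line_potential, (0.0, 0.0), 1.0)  # encircles the origin
away = loop_integral(flux_line_potential, (3.0, 0.0), 1.0)    # does not encircle it

stokes_around = np.exp(1j * around)   # non-trivial phase factor
stokes_away = np.exp(1j * away)       # trivial phase factor
print(around, away)
```

Although the field-strength vanishes on both loops, only the loop that cannot be contracted in the punctured plane carries a non-trivial phase factor.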
In quantum mechanics this fact is highlighted by the Aharonov-Bohm effect, if it is interpreted as taking place in a plane with the origin (representing a magnetic flux-line) excluded. In that case the Aharonov-Bohm phase-factor for the electron is just the Stokes integral for a loop encircling the origin. In non-abelian theory the corresponding loop-integrals acquire an even greater importance. Because of the non-abelian nature of the gauge-potentials they must be path-ordered and, for an open loop, they take the form
$$g(x) = P e^{\int_0^x A_\mu dy^\mu} \qquad (8.7)$$
They may also be regarded as the solutions of the equations
$$\partial_t g(x) = (n\cdot A)\,g(x), \qquad g(0) = 1 \qquad (8.8)$$
along a line with parameter $t$ and tangent $n$ joining $x_\mu$ to the origin. The trace of the integrals taken around closed loops
$$W_L = \mathrm{tr}(P e^{\oint A_\mu dy^\mu}) \qquad (8.9)$$
are gauge-invariants. They are usually called Wilson loops in general and Polyakov loops in the QCD case. As we shall see they are central to knot-theory, where the knot invariants are nothing but their expectation values in Chern-Simons backgrounds. In physics they have many applications, for example they may play the role of (non-local) Higgs fields. But perhaps their greatest importance lies in the fact that they may be used as signals for confinement: a system is said to be in the confinement phase or not according as the logarithm of the expectation value of the loop is proportional to the area enclosed by the loop or to the length of the loop.
8.3 Chern-Simons Theories
A remarkable offshoot of instanton theory was the realization that it was possible to construct gauge-theories that were purely topological. In general, topological theories are those for which the Action does not depend on the metric of the background space, and for topological gauge-theories the Action is typically 3-dimensional and of the Chern-Simons form
$$S_{CS} = \int d^3x\,[A\wedge F + \tfrac{2}{3}A\wedge(A\wedge A)] \qquad (8.10)$$
If gauge-theory is regarded as a retreat from metrical geometry, then topological gauge-field theory must be the ultimate haven, since both the gauge-field and the Action are independent of the metric. Chern-Simons Actions have many interesting properties. They were first introduced as infra-red regulators for (2+1)-dimensional gauge-theory but perhaps their most unexpected and important application is in pure mathematics, where they provide a very natural way of generating the invariants of knot theory. As already mentioned, the knot-invariants are nothing but the expectation values of Wilson loops with respect to Chern-Simons Actions
$$\int dA\;e^{S_{CS}}\prod_{a=1} W_{L_a} \qquad (8.11)$$
Another application of Chern-Simons theory is to 3-dimensional gravity where it has been found that $SL(2,R) \times SL(2,R)$ Chern-Simons theory is equivalent to 3-dimensional gravity with a negative cosmological constant. This equivalence has led on to many other interesting results, such as the exact computation of the entropy of 3-dimensional black-holes. It is actually possible to induce Chern-Simons terms from standard Actions. For example, if the path-integral for the Maxwell-Dirac Action $\int \tfrac{1}{4}F^2 + \bar\psi\,\gamma\cdot D\,\psi$ in (2+1) dimensions is integrated over the fermion field $\psi$ it leads to an effective Action of the form $F^2/4 + A\wedge F$. Since the original Action is parity-invariant but the induced Chern-Simons term $A\wedge F$ is not, its induction also produces a parity-anomaly. As in the induced case, Chern-Simons Actions can be used as supplementary terms in non-topological Actions. For example, Anyons may be incorporated in field theory in a natural way by considering the abelian Action $A\cdot j + \theta A\wedge F$ where $\theta$ is an arbitrary parameter. This Action leads to the field equations $j = \theta F$, which, on integrating the time-component over the two space-dimensions, yields $Q = \theta\Phi$ where $Q$ is the charge and $\Phi$ is the magnetic flux. If the charges are moved around one another in 2-space then, by the Aharonov-Bohm effect, they pick up a phase proportional to $\theta$ times the flux and thus may be identified as Anyons. Another interesting example is the Wess-Zumino Action
$$\int d^2x\,\mathrm{tr}(JJ) + \int d^3x\,\mathrm{tr}(J\wedge[J\wedge J]), \qquad J(x) = g^{-1}(x)\,dg(x) \qquad (8.12)$$
where the $g(x)$ are Lie-group elements and the 2-dimensional space is supposed to be the boundary of the 3-dimensional one. Here the 3-dimensional topological term is added in order to make the theory chiral, which it does by reducing the field equations to the form
$$\partial_+ J_- = 0, \qquad \partial_- J_+ = 0 \qquad (8.13)$$
where $\partial_\pm = \partial_1 \pm i\partial_2$ are the chiral derivatives. This means that the general solution for the classical field $g(x)$ is $g(x) = g_L(x_+)\,g_R(x_-)$, where $g_L$ and $g_R$ are arbitrary differentiable functions.
8.4 Anomalies
In the wider sense, anomalies may be defined as classical symmetries that are broken at the quantum level, the simplest example being provided by the harmonic oscillator $H = a^\dagger a$, for which the classical symmetry $a \leftrightarrow a^\dagger$ is broken by the quantum vacuum condition $a|0\rangle = 0$. However, in particle physics and field theory anomalies have come to be regarded as symmetries of the Action which are not symmetries of the measure in the path-integral. Normally, in the path-integral
$$\int d\mu(\phi(x))\,e^{S(\phi(x))} \qquad (8.14)$$
where $\phi(x)$ denotes any set of fields and $S(\phi)$ is the classical Action, it is possible to find a measure $d\mu(\phi)$ with the same symmetries as $S(\phi)$. But this is not always the case. Typically, if the Action has two symmetries, there may not exist a measure that has both. The classic example is the Action $\int \bar\psi\gamma_5\gamma\cdot\partial\psi$ which is both Lorentz
and chiral invariant. But just as there is no mass term which is both Lorentz and chiral invariant, so also there is no translationally-invariant measure that respects both symmetries, the measures $d\bar\psi\,d\psi$ and $d\psi^\dagger d\psi$ violating chiral and Lorentz invariance respectively. Since Lorentz invariance is regarded as the more fundamental the measure $d\bar\psi\,d\psi$ is usually chosen. Thus, axial gauge theory leads to a chiral anomaly even in the abelian case. The particular anomaly is called the U(1) anomaly and it manifests itself experimentally in a number of spectacular ways. First, it mediates the electromagnetic decay of the neutral $\pi$-meson, $\pi^0 \to 2\gamma$, which is the dominant decay mode and would be suppressed by at least an order of magnitude in the absence of the anomaly. Secondly, as will be discussed below, it requires that the number of quark and lepton families be equal and that the quarks have just three colours. The U(1) anomaly is actually the forerunner of many others and in non-abelian Yang-Mills theories even vector currents may lead to anomalies. Other interesting anomalies are the gravitational and scale ones. In all cases the anomalies lead to a violation of symmetry and hence to a violation of the Noether conservation law, a situation that can be formulated as
$$\partial_\mu j^\mu(x) = 0 \quad\to\quad \partial_\mu j^\mu(x) = A(x) \qquad (8.15)$$
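where $A(x)$ is the Anomaly. In the U(1) case it takes the form
$$A(x) = e^2\,\tilde F_{\mu\nu}F^{\mu\nu} \quad\text{where}\quad \tilde F_{\mu\nu} = \tfrac{1}{2}\epsilon_{\mu\nu\sigma\tau}F^{\sigma\tau} \qquad (8.16)$$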
Gauge theories are a natural habitat for anomalies, and they are closely connected with other interesting aspects, such as their topological properties. For example, on a compact space-time, the global U(1) anomaly is nothing but the Atiyah-Singer index
$$\int d^4x\,e^2\,\tilde F_{\mu\nu}F^{\mu\nu} = (n_+ - n_-) \qquad (8.17)$$
where $n_\pm$ are the numbers of left- and right-handed zero-modes of the Dirac operator. Apart from their direct physical properties and mathematical interest, anomalies play a vital role in the renormalization of gauge-theories. In fact, for a theory to be renormalizable the anomalies must vanish at high-energy. This does not mean, of course, that they should vanish for individual fields or subsets of fields, but only when all fields are taken into account. Thus the condition that they should cancel puts constraints on the field assignments, in particular on the fermion field assignments. For example, in the standard electroweak model both the U(1) and gravitational anomalies have the opposite signs for the leptons and quarks, and it is the requirement that these anomalies vanish that forces the numbers of leptons and quarks to be equal (modulo colour) and the number of colours to be three. This is why the experimental discovery of the $\tau$ lepton led immediately to the prediction of the existence of the $t$ and $b$ quarks.
9 Summary
During the course of the 20th century gauge-theory has developed from being just an interesting but minor part of electrodynamics to being the basic principle on
which all the known fundamental interactions are based. The change, though revolutionary, has come about in a series of small steps, which I have endeavoured to sketch in rough chronological order. It will have been noticed that the gauge-picture emerged in a completely different manner for each of the four different interactions. For electromagnetism it emerged from the Maxwell field equations, two of which are nothing but the conditions for the existence of a gauge-potential. In Einstein's gravity it emerged in the context of general coordinate transformations, which were later interpreted as gauge-transformations in the Vierbein formalism. For the weak interactions it was first suggested by the $V - A$ form of the weak currents, but became acceptable only after a long struggle which showed that there existed a gauge-invariant mechanism for mass-breaking and that the non-abelian theory was renormalizable. Finally, for the strong interactions, it was introduced only when it was found that non-abelian gauge theory was the unique renormalizable theory which was asymptotically free, and could therefore explain both high-energy electron-proton scattering and quark confinement. It should be pointed out that, although gauge theory provides a unifying picture of all the known fundamental interactions, it does not provide a completely unified theory. Even for the closely intertwined weak and electromagnetic interactions, the existence of two independent coupling constants shows that they are not really unified. The strong Lagrangian, though formally similar to the electromagnetic one, has qualitatively different properties, and gravitation is still widely different from all the other interactions, both in its form and its structure. In fact it is the only gauge interaction whose geometry is metrical and the only one that is not renormalizable.
The challenge for the next century will be to find a truly unified theory of all four interactions, and perhaps of new interactions still to be discovered. String theory, which is the only theory so far that combines gravitation with the other forces in a natural way, already provides some clues as to what the future may hold, but it is too early to say whether it is just a forerunner of the true theory, as the Bohr-orbit theory was the forerunner of quantum mechanics. In any case gauge theory has turned out to be at the core of fundamental physics and impinges on many other areas of physics, such as constraint theory, integrable systems and superconductivity. It is also deeply intertwined with many branches of mathematics, such as fibre-bundle theory, knot theory and topology. Its basic structure is geometrical and as a result it has an elegant internal structure. It has also turned out to have many unexpected and interesting aspects such as anomalies and BRST symmetry. It is undoubtedly a discovery for which the twentieth century will be remembered.
Acknowledgement
I am much indebted to Dr. Sreedhar Vinnekota for help in preparing this typescript.
RANDOM MATRICES AS PARADIGM
L. PASTUR
Center of Theoretical Physics of the CNRS, Luminy, case 907, 13288, Marseille, France [*]
E-mail: pastur@cpt.univ-mrs.fr
[*] On leave from Department of Mathematics, University Paris 7, France
We present an outline of the random matrix theory, mostly of its spectral and probabilistic aspects, and a commented bibliography of related works aimed to show problems, links and applications of the field.
1 Introduction
Random matrix theory is an active field of mathematics and physics. Initiated in the 1920s-30s by statisticians and introduced into physics in the 1950s-60s by Wigner and Dyson, the field, after about two decades of "normal science" development restricted mainly to nuclear physics and having already at that time a strong mathematical physics flavor, has been very active since the end of the 1970s under the flow of accelerating impulses from quantum field theory, quantum mechanics (quantum chaos) and condensed matter theory in physics, and from probability theory, combinatorics, operator theory, and number theory in mathematics. The activity of random matrix studies in physics is, in my opinion, higher than in mathematics. This makes the field extremely interesting from the point of view of mathematical physics. There are numerous fascinating problems that have to be understood and studied mathematically, not to mention many already mathematically well-posed important problems that require nontrivial methods, including those of mathematical physics. The theory deals with integrals over matrix measures defined on various sets of matrices of an arbitrary (mostly large) dimension. Matrix integrals proved to be partition functions of models of quantum field theory and statistical mechanics and generating functions of numerical characteristics of graphs and topological manifolds; they satisfy finite-difference and differential relations connected to many important integrable systems and geometrical and topological objects. However, the matrix integrals themselves, their dependence on parameters, etc. are of considerable use only in a part of the random matrix theory (RMT), related to quantum field theory, combinatorics, integrable systems and some other branches. And even there the integrals can often be interpreted in spectral terms related to eigenvalues and eigenvectors of random matrices, whose probability law is a matrix measure entering the integral.
These spectral and probabilistic aspects are widely represented in the RMT and its numerous applications, beginning from the dawn of the theory in the late 1920s in statistics, via its rise in the 1950s-60s in nuclear physics, to the present flourishing state, when the applications of the RMT include a wide variety of seemingly unrelated domains (see Section 3 and the Bibliography for a list). Random matrices became an efficient source of insights and models for the spectral structure of various sufficiently complex operators, reproducing quantitatively a number of their important spectral properties, independent of their specific aspects. It is the degree of universality of the ideas and results of the RMT and the diversity of their applications that make a strong impression and motivated the paper and its title [a]. However, wishing to demonstrate the paradigm nature of the RMT in the framework of this volume, I encountered a problem. Indeed, the directions, links and applications of the theory currently being explored are too diverse and numerous to be duly described in a contribution to the volume. Thus I decided to realize my strong wish by using the form of a commented bibliography, a form that can be viewed as a miniversion of a commented reprint collection, the latter form being represented, among many, by the excellent book by E. Lieb and D. Mattis [18]. This is done in Section 3 of the paper and in References. Their first and second parts are devoted to results of the RMT and to its applications and links. To make comments and references more understandable, I supplement them with an outline of objects, problems and methods of the RMT (Section 2), thus providing the reader with a minimum number of key words of the field. They are illustrated by respective facts for the archetypal Gaussian random matrices, taken mostly from the book [19], a classic reference item of the field. This Section can also be viewed as an effort to present a formal scheme of the part of the RMT that deals with spectra of random matrices. The actual number of problems, methods and results of the RMT considerably exceeds that given below and I do not intend (and am not able) to discuss many of the developments. Thus I just refer the interested reader to the bibliography. Since, however, the number of references related to random matrices is well above a thousand, the list and the comments are inevitably incomplete and subjective and I apologize strongly for not mentioning many important results. Other authors would choose and would comment differently but it seems to me that few among those who are familiar with the RMT would disagree on its beauty and utility, and on its influence on a number of domains of physics and mathematics.
The list contains review works and a collection of papers, mostly either first or recent. The first part of the list consists of review works discussing several topics related to the RMT. Particular (in my opinion) topics are divided into three subsections, devoted to the RMT and to its physical and mathematical applications respectively. I hope that despite this somewhat pointillist style the paper, containing no proofs and even no detailed explanations, will be viewed as something more than a "keyboard of references" [b]. By the way, the list of references, whose titles should be considered as a part of the text, includes approximately the same number of physical and mathematical works. Thus I also hope that both a mathematician and a physicist, wishing to learn more about the field, will be receptive to physical and mathematical works respectively, seeing behind the rules of reasoning and of writing of another field its often deep and fascinating content. I do not mention mathematical physicists, since, in my opinion, that attitude is an important part of our profession.
[a] Since, however, it seems that the term "paradigm" is not unambiguous, I explain in the Appendix the meaning that I use.
[b] The expression belongs to the Soviet poet O. Mandelshtam (O. Mandelshtam, Conversation about Dante, in: The Collected Critical Prose and Letters, Collins Harvill, London, 1991).
2 Outline of the Theory
2.1 Generalities
As was indicated in the Introduction, the paper deals mostly with spectral aspects of the RMT and its applications. The main subject of this part of the theory is the large-$n$ asymptotic form of the eigenvalue distribution of $n \times n$ matrices, whose probability distribution is given in terms of the matrix elements. In other words, the goal of the theory is to "transfer" probabilistic information from matrix elements to the spectrum. Formulated in so general a form, the goal of the RMT is similar to that of the random operator theory (ROT), the spectral theory of the Schroedinger operator with random potential in particular. However, in the latter the emphasis is put on the analysis of spectral types (pure point, absolutely continuous, etc.), i.e., in fact, on the spatial behavior of eigenfunctions (solutions of the respective differential or finite-difference equations), while in the former we are mainly interested in the asymptotic behavior of eigenvalues as $n \to \infty$, although statistical properties of eigenvectors are also of considerable interest for a number of applications. A typical example of a random matrix ensemble is the Gaussian Orthogonal Ensemble (GOE), introduced by Wigner and defined for any $n \ge 1$ as an $n \times n$ matrix of the form
$$M_n = \{M^{(n)}_{jk}\}_{j,k=1}^{n}, \qquad M^{(n)}_{jk} = n^{-1/2}W_{jk}, \qquad (1)$$
where $W_{jk}$, $j,k = 1,\ldots,n$ are Gaussian random variables, defined by the relations
$$W_{jk} = W_{kj}, \quad E\{W_{jk}\} = 0, \quad E\{W_{j_1k_1}W_{j_2k_2}\} = w^2(\delta_{j_1j_2}\delta_{k_1k_2} + \delta_{j_1k_2}\delta_{k_1j_2}). \qquad (2)$$
The symbol $E\{\ldots\}$ here and below denotes the expectation with respect to the corresponding probability distribution (in the case above this is the Gaussian measure in $R^{n(n+1)/2}$). In other words, the matrix elements of $M_n$ are Gaussian random variables, independent modulo the symmetry condition, with zero mean and the variances
$$E\{(M^{(n)}_{jk})^2\} = (1+\delta_{jk})\,w^2/n. \qquad (3)$$
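A minimal numerical sketch of the definition (1)-(3) may help fix the normalization. It assumes NumPy; the function name `sample_goe`, the seed and the matrix size are illustrative choices, not taken from the text. The code builds a symmetric Gaussian matrix with the covariance (2), rescales it by $n^{-1/2}$ as in (1), and computes its spectrum, which for large $n$ is expected to fill the interval $[-2w, 2w]$ (the Wigner semicircle law).

```python
import numpy as np

def sample_goe(n, w=1.0, rng=None):
    """Draw one n x n GOE matrix M_n = n^{-1/2} W with the covariance (2)."""
    rng = np.random.default_rng() if rng is None else rng
    g = rng.normal(size=(n, n))
    # Symmetrise: off-diagonal entries get variance w^2, diagonal ones 2 w^2.
    W = w * (g + g.T) / np.sqrt(2.0)
    return W / np.sqrt(n)

n, w = 400, 1.0
M = sample_goe(n, w, rng=np.random.default_rng(0))
eigenvalues = np.linalg.eigvalsh(M)      # real spectrum of a symmetric matrix

second_moment = np.mean(eigenvalues**2)  # equals Tr M^2 / n, close to w^2
print(eigenvalues.min(), eigenvalues.max(), second_moment)
```

The scaling $n^{-1/2}$ is what keeps the spectrum bounded as $n$ grows: without it the eigenvalues of $W$ would spread over an interval of width of order $n^{1/2}$.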
A typical example of a random operator is the one-dimensional discrete Schroedinger operator with a random i.i.d. potential. More generally, a random operator $M$ is a measurable function defined on a probability space and assuming its values in operators in an infinite-dimensional Hilbert space, say in $l^2(Z)$ [34]. Under weak conditions a random operator acting in $l^2(Z)$ or in $l^2(Z_+)$ is uniquely determined by the sequence $\{M_n\}_{n\ge1}$ of its restrictions to subspaces spanned by vectors having only the first $n$ coordinates non-zero. In particular, the eigenstructures of $M_n$ for large $n$ and that of $M$ are strongly related. In the case of a random matrix ensemble we also have a sequence $\{M_n\}_{n\ge1}$ of $n \times n$ matrices with random entries but we have no "limiting" random operator because "too many" entries of $M_n$ are of the same order of magnitude. For random operators the number of entries of $M_n$ of the same order of magnitude is proportional to $n$, while for random matrices it always grows faster than $n$. For example, in the GOE case we have $n^2$ non-zero entries of the same order of magnitude.
However, since this order is $O(n^{-1/2})$, the GOE matrices have a well defined limiting eigenvalue distribution and other spectral characteristics, similarly to random operators. Here is an example of an ensemble that "interpolates" between random operators and random matrices. Namely, for any odd integers $n$ and $b \le n$ consider $n \times n$ matrices $M_{n,b}$ with entries
$$M^{(n,b)}_{jk} = (2\beta+1)^{-1/2}\phi((j-k)/\beta)\,W_{jk}, \qquad (4)$$
where $b = 2\beta+1$, $\phi$ is the indicator of the interval $[-1,1]$ and $W_{jk}$, $j,k = -\nu, -\nu+1, \ldots, \nu$, $n = 2\nu+1$ are as in (2). In this case $M_{n,b}$ is a matrix having $b$ nonzero diagonals around the principal one. If $b$ is independent of $n$, then $\{M_{n,b}\}_{n\ge1}$ defines a finite difference operator of the order $b-1$ acting in $l^2(Z)$, the Jacobi matrix in the case $b = 3$. In the general case of an integrable even $\phi$ in (4) with an $n$-independent $b$ we also have a well defined self-adjoint finite difference random operator in $l^2(Z)$, although of infinite order [34]. On the other hand, if $b = n$, we obtain the GOE. In all other cases when $b \to \infty$ as $n \to \infty$ we obtain band random matrices. Band matrices possess a number of interesting properties, in particular they allow one to study interesting crossover phenomena between random matrices and random operators in the rate of growth of $b$ [96, 101, 121] and effects of interaction on transport properties of disordered systems [155]. In conclusion, we may also say that random matrices are "related" to random operators as double array collections of random variables are "related" to sequences of i.i.d. (ergodic) random variables, or that the RMT is a kind of mean-field version of the ROT.
2.2 Ensembles
The RMT deals with many different random matrix ensembles. We mention some of the most popular ones. The GOE probability law defined by (1) and (2) can also be written in the form
$$P_n(dM) = Z_n^{-1}\exp\{-n\,\mathrm{Tr}\,M^2/4w^2\}\,dM, \tag{5}$$
where $Z_n$ is the normalization constant and
$$dM = \prod_{1\le j\le k\le n} dM_{jk} \tag{6}$$
is the "Lebesgue" measure in the space of real symmetric matrices.

This distribution possesses two properties: i) it is invariant with respect to orthogonal transformations of $\mathbf{R}^n$; and ii) the matrix elements are independent random variables (modulo the symmetry conditions). It can be easily shown that, under the condition that the first moments of the $M_{jk}$'s are finite, these two properties determine the GOE uniquely. This motivates two classes of generalizations. The first class consists of ensembles having an orthogonally invariant but not necessarily matrix-element independent probability distribution. Typical representatives are given by the law
$$P_n(dM) = Z_n^{-1}\exp\{-n\,\mathrm{Tr}\,V(M)/4w^2\}\,dM, \tag{7}$$
in which $V: \mathbf{R} \to \mathbf{R}$ is an arbitrary function that is bounded below and grows fast enough at infinity, typically a polynomial of even degree. These ensembles can be used to describe physical systems having no preferential basis. We will call them the invariant ensembles or the one-matrix models. They appeared in the study of the large-color limit in quantum field theory and in string theory $^{7}$ and later found other applications $^{14,150}$. The second class consists of ensembles whose functionally independent matrix elements in a certain basis are statistically independent, i.e. ensembles whose probability law factorizes into a product of distributions of the matrix elements in this basis:
$$P_n(dM) = \prod_{1\le j\le k\le n} p_{jk}^{(n)}(dM_{jk}). \tag{8}$$
The corresponding random matrices can be associated with physical systems having a preferential basis and appear, in particular, in condensed matter physics. They are also quite natural from the probabilistic point of view. This class goes back to Wigner $^{58}$, and we will call the corresponding ensembles the Wigner ensembles (or the Wigner matrices). If the distributions $p_{jk}^{(n)}$ in (8) have a finite second moment, it is often useful to write these matrices in the form (1), where now the distributions of the entries $W_{jk}^{(n)}$ of the matrix $W_n$ may depend on $j$, $k$ and $n$. For the sake of technical simplicity we will assume that they are independent of $n$. This assumption allows one to consider directly all $W_{jk}$, $j,k = 1,2,\dots$, as defined on the same probability space and to find an optimal form of a number of important facts related to the Wigner ensembles (for example, the convergence with probability 1 in problem $(p_1)$ below). Conditions (2) include the equalities $\overline{M}_n = \{\mathbf{E}\{M_{jk}\}\}_{j,k=1}^{n} = 0$. In many interesting cases this assumption is not natural and needs to be replaced by the condition that the matrix $\overline{M}_n$ just behaves sufficiently regularly as $n \to \infty$. More generally, one may consider the deformed Wigner ensemble, whose matrices have the form
$$M_n^{(0)} + M_n, \tag{9}$$
where $M_n$ is a Wigner random matrix and $M_n^{(0)}$ is a non-random (or random but independent of $M_n$) matrix whose normalized eigenvalue counting measure (NCM, see Sect. 2.3) converges weakly to a certain measure. A similar modification can be applied to many other ensembles of the RMT.

One more important random matrix ensemble, related to the sample covariance matrices, has been known for a long time in statistics (see Sect. 3.3.1 and $^{163,165}$). Let $X_\mu$, $\mu = 1,\dots,m$, $X_\mu = \{X_{j\mu}\}_{j=1}^{n}$, be i.i.d. random vectors of $\mathbf{R}^n$. Assume that all components $X_{j\mu}$ of $X_\mu$ are i.i.d. random variables and that $\mathbf{E}\{X_{j\mu}\} = 0$, $\mathbf{E}\{X_{j\mu}^2\} = x^2$. Regarding $X = \{X_{j\mu}\}_{j,\mu=1}^{n,m}$ as an $m \times n$ matrix and denoting by $X^T$ the transposed matrix, we can define the random matrix
$$S_{n,m} = n^{-1}X^TX. \tag{10}$$
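As an illustration of definition (10), here is a minimal sketch that builds $S_{n,m}$ from Gaussian data, assuming NumPy; the function name `sample_covariance_matrix` is hypothetical, and the Gaussian choice for the components is only one admissible example of i.i.d. entries with zero mean and variance $x^2$.

```python
import numpy as np

def sample_covariance_matrix(n, m, x=1.0, rng=None):
    """Return S_{n,m} = n^{-1} X^T X for an m x n matrix X of i.i.d. N(0, x^2) entries."""
    rng = np.random.default_rng() if rng is None else rng
    # Row mu of X is the random vector X_mu in R^n.
    big_x = x * rng.standard_normal((m, n))
    return big_x.T @ big_x / n
```

The result is an $n \times n$ symmetric non-negative matrix, and its normalized trace $n^{-1}\mathrm{Tr}\,S_{n,m}$ has expectation $(m/n)\,x^2$, which stays bounded when $m$ and $n$ grow proportionally.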
In the case when the $X_\mu$ are i.i.d. Gaussian random vectors, the distribution of the matrix $S_{n,m}$ coincides, after the replacement of the factor $1/n$ by the factor $1/m$, with the distribution of the sample covariance matrix of $m+1$ i.i.d. Gaussian vectors (see formula (65)). This distribution is a particular case of the Wishart distribution with $n$ degrees of freedom, well known in statistics since the late 20s, for Gaussian vectors whose components are correlated in general. We will call the matrices (10) constructed from arbitrary (not necessarily Gaussian) random vectors the generalized Wishart ensemble or, sometimes, the sample covariance matrices. Besides statistics, positive definite matrices of this form were considered in the nuclear physics context $^{11}$ and in solid state physics $^{150}$, where they are used to model universal conductance fluctuations and other transport properties of small metallic particles and quantum dots. This branch of solid state theory is now known as mesoscopics. There the ensemble (10) defined by Gaussian vectors with i.i.d. components is often called the Laguerre ensemble, because in the orthogonal polynomial approach (see Sect. 2.4) one needs the Laguerre orthogonal polynomials to analyze the ensemble. Regarding (10) as a parametric form of this class of positive definite random matrices, we can write the probability distribution of the Laguerre ensemble in the following compact form:
$$P_{n,m}(dA) = Z_{n,m}^{-1}\exp\{-n\,\mathrm{Tr}\,A^TA/2x^2\}\,dA, \quad dA = \prod_{\mu=1,\,j=1}^{m,\,n} dA_{\mu j}, \tag{11}$$
where $A = X/\sqrt{n}$ and $Z_{n,m}$ is the normalization factor. This suggests that an analogue of (7) in this case is defined by (11) in which $A^TA$ is replaced by $V(A^TA)$.

All random matrix ensembles mentioned above consist of real symmetric matrices. According to the group analysis by Dyson (see paper 61 in $^{8}$), real symmetric matrices can model physical systems that are time-reversal and rotation invariant, Hermitian matrices with complex elements can model systems that are not time-reversal invariant (for example, systems in the presence of an external magnetic field), and the quaternion real matrices, whose elements are $2 \times 2$ blocks (see e.g. $^{19}$ for their definition and properties), can model the time-reversal but not rotation invariant systems. The related mathematical fact is that real, complex and quaternion matrices comprise the only irreducible matrix algebras over the real field $^{37}$. In the context of the RMT these symmetries, complemented by the idea of "total ignorance", are realized as invariance of the probability distribution of the corresponding ensembles with respect to all orthogonal, unitary and symplectic transformations respectively. Combined with the requirement of statistical independence of all functionally independent entries, this condition leads to the GOE (5), the Gaussian Unitary Ensemble (GUE), and the Gaussian Symplectic Ensemble (GSE), defined by analogous formulas in which the measure $dM$ is the product of differentials of all functionally independent entries. In particular, in the case of the GUE
$$dM = \prod_{j=1}^{n} dM_{jj}\prod_{1\le j<k\le n} d\,\mathrm{Re}\,M_{jk}\,d\,\mathrm{Im}\,M_{jk}. \tag{12}$$
Correspondingly, one can consider two generalizations of the GUE and the GSE, analogous to (8) and (7). We will use below mostly ensembles of Hermitian matrices because they are technically simpler. On the other hand, they are related to a number of interesting problems of quantum field theory, analysis, number theory, and mathematical physics.

We mention also the circular ensembles, introduced by Dyson. They are subsets of the unitary group consisting of symmetric unitary, all unitary and self-dual unitary matrices, and their probability laws are uniquely determined by the requirement of invariance with respect to the transformations natural for these sets of matrices (see papers 57-61 in $^{8}$ and $^{19}$). This leads to the Circular Orthogonal, Circular Unitary and Circular Symplectic Ensembles respectively. These ensembles are technically simpler than the Gaussian Ensembles. Their natural generalizations can be obtained by replacing the "uniform" distribution on the respective set of matrices by the distribution having the density (cf. (7))
$$C_n^{-1}\exp\{-n\,\mathrm{Tr}\,V(U+U^*)\} \tag{13}$$
with respect to the uniform distribution. These ensembles are used in quantum field theory in some models of gauge fields (see $^{7,133}$), and in nuclear physics, condensed matter theory and quantum chaos to model time evolution, chaotic scattering, universal conductance fluctuations, eigenvalue statistics of periodically driven systems, etc. $^{11,118,126}$. Certain problems of condensed matter theory lead to the Poisson ensembles of unitary matrices, whose density with respect to the Haar measure of $U(n)$ is $\det(1-CU)^{-\beta p-2}$, where $C$ is a fixed contraction ($\|C\| < 1$), $p$ is an integer and $\beta = 1,2,4$ as above $^{150}$. This is a generalization of the circular unitary ensemble (which is the case $C = 0$).

In the statistical mechanics of neural networks one encounters the ensemble (cf. (10))
$$M_{jk} = n^{-1}\sum_{\mu=1}^{m} f(\theta_j^{\mu} - \theta_k^{\mu}), \tag{14}$$
where $f(\theta)$ is a $2\pi$-periodic function with zero mean and such that $f(-\theta) = f^*(\theta)$, and $\{\theta_j^{\mu}\}$, $j = 1,\dots,n$, $\mu = 1,\dots,m$, are i.i.d. random variables uniformly distributed over the unit circle. These matrices can be viewed as a mean field version of the random matrix operators $\{G(x_j - x_k)\}_{j,k\in\mathbf{Z}}$, where $G(x)$, $x \in \mathbf{R}^d$, is an even function decaying fast at infinity and $\{x_j\}_{j\in\mathbf{Z}}$ is the Poisson random field in $\mathbf{R}^d$. This random operator appears in the ROT when studying the impurity band of the Schroedinger operator with a Poisson type random potential $^{32}$.

We also mention a systematic method to generate random matrix ensembles $^{38}$. It is based on the maximization of the "entropy" functional
$$-\int p(M)\log p(M)\,dM \tag{15}$$
of the ensemble probability density with respect to the corresponding standard measure (measures (6), (12), the Haar measure, etc.). This can be viewed as an analogue of the Gibbs variational principle of statistical mechanics. The Gaussian ensembles maximize (15) under the constraint $\mathbf{E}\{\mathrm{Tr}\,M^2\} = \mathrm{const}$, the Poisson ensembles result from the constraints $\mathbf{E}\{U^p\} = C^p$, $p = 1,2,\dots$, and the invariant ensembles (7) follow from the continuous family of constraints corresponding to a fixed density
of states $\rho_n(\lambda) = \mathbf{E}\{n^{-1}\mathrm{Tr}\,\delta(\lambda - M)\}$ (see (30)) of an ensemble and $V(\lambda) = \int\log|\lambda-\mu|\,\rho_n(\mu)\,d\mu + \mathrm{const}$.

One more class of ensembles, known as the multi-matrix models, is defined on the product of $p \ge 2$ copies of the matrix spaces used above (i.e. real symmetric, Hermitian or real quaternion matrices) by the probability law $^{7}$
$$P(dM_1,\dots,dM_p) = Z_{n,p}^{-1}\exp\Big\{-n\,\mathrm{Tr}\Big[\sum_{j=1}^{p}V_j(M_j) + c\sum_{j=1}^{p-1}M_jM_{j+1}\Big]\Big\}\prod_{j=1}^{p}dM_j, \tag{16}$$
where $Z_{n,p}$ is the normalization factor, $V_j$, $j = 1,\dots,p$, are as in (7), possibly coinciding, and $c$ is a (coupling) constant.

We have been discussing so far Hermitian or unitary matrices or their real and quaternion analogues. Now we mention more general classes, e.g. normal, or even just general real, complex or quaternion matrices having no symmetry conditions imposed on their entries. Because of the more complex eigenstructure of these matrices, even in the generic case of simple eigenvalues typical in the RMT, these matrices are less studied, despite a field of applications that has been growing fast in recent years. The main technical difficulties come from the possibility for eigenvalues to fill, as $n \to \infty$, domains of the complex plane (disks, ellipses, etc.) and from the nonorthogonality of eigenvectors. Archetypal ensembles here are again Gaussian ensembles, defined in the case of real matrices by (11) for $m = n$, but now $A$ itself is the matrix that we are interested in. Analogously, in the case of complex matrices we have the law $^{110}$
$$Z_n^{-1}\exp\{-n\,\mathrm{Tr}\,M^*M/4w^2\}\prod_{1\le j,k\le n} d\,\mathrm{Re}\,M_{jk}\,d\,\mathrm{Im}\,M_{jk}. \tag{17}$$
2.3 Quantities, Problems
Consider now one of the basic quantities of the RMT, more precisely, of that part of it that deals with eigenvalue statistics. This is the normalized eigenvalue counting measure (NCM) $N_n$ of an $n \times n$ matrix $M_n$, defined for any interval $\Delta$ of the spectral axis as
$$N_n(\Delta) = \sharp\{\lambda_l \in \Delta\}/n, \tag{18}$$
where $\lambda_l$, $l = 1,\dots,n$, are the eigenvalues of $M_n$. The measure was introduced in the RMT by Wigner in the early 50s. Its analogue for the Laplacian in a bounded domain $\Lambda$ of $\mathbf{R}^d$, $d \ge 1$, with certain self-adjoint boundary conditions was introduced in spectral theory by H. Weyl at the beginning of this century. In this case the role of $n$ is played by the volume of the domain. The measure, or rather the limit as $n \to \infty$ of its formal density, corresponding to a sequence of domains $\{\Lambda_n\}_{n\ge1}$ expanding to the whole $\mathbf{R}^d$ as $n \to \infty$, is called the density of states and has been known in physics since the time of Rayleigh and Debye. In fact, H. Weyl was motivated by Debye's question on the independence of the density of states of the shape of the domains $\Lambda_n$ and of the boundary conditions on their surfaces. In the case of homothetically expanding domains (say, cubes or spheres centered at the origin) a simple rescaling reduces Debye's question on the properties of the large size limit of the respective distribution function $N_n(\lambda) = N_n(]-\infty,\lambda[)$ for a fixed energy $\lambda$ to the question on the large energy asymptotic behavior of this function for a fixed domain. We come to the problem whose study now comprises an important branch of the spectral theory of PDE's, related to harmonic and semi-classical analysis, ergodic theory, differential geometry, etc. (see e.g. the books $^{30}$). In the ROT as well as in the RMT the initial meaning of the normalized counting measure is mostly used, i.e. its asymptotic behavior for large $n$ is a subject of considerable interest, although the Weyl counting function plays an important role in quantum chaos studies (see $^{125,126}$).

By using the NCM we can formulate several widely studied problems of the RMT. They concern the large-$n$ behavior of the

1) expectation of the NCM
$$\overline{N}_n(\Delta) = \mathbf{E}\{N_n(\Delta)\}; \tag{19}$$
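2) covariance of the NCM
$$\mathrm{Cov}\{N_n(\Delta_1),N_n(\Delta_2)\} = \mathbf{E}\{N_n(\Delta_1)N_n(\Delta_2)\} - \mathbf{E}\{N_n(\Delta_1)\}\mathbf{E}\{N_n(\Delta_2)\}, \tag{20}$$
in particular, its variance
$$\mathrm{Var}\{N_n(\Delta)\} = \mathrm{Cov}\{N_n(\Delta),N_n(\Delta)\} = \mathbf{E}\{N_n^2(\Delta)\} - \mathbf{E}^2\{N_n(\Delta)\}; \tag{21}$$
3) probability distribution of the NCM
$$E^{(n)}(l;\Delta) = \mathbf{P}\{N_n(\Delta) = l/n\}, \tag{22}$$
in particular the hole probability
$$E^{(n)}(\Delta) = \mathbf{P}\{N_n(\Delta) = 0\}. \tag{23}$$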
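The quantities (18) and (23) are straightforward to estimate from sampled spectra. The sketch below is a rough illustration under stated assumptions: NumPy is used, the half-open interval $[a,b)$ stands in for $\Delta$, and the helper names `ncm`, `empirical_hole_probability` and `goe_spectra` are hypothetical.

```python
import numpy as np

def ncm(eigenvalues, a, b):
    """Normalized counting measure N_n([a, b)) of a spectrum, as in (18)."""
    lam = np.asarray(eigenvalues, dtype=float)
    inside = (lam >= a) & (lam < b)
    return np.count_nonzero(inside) / lam.size

def empirical_hole_probability(spectra, a, b):
    """Fraction of sampled spectra with no eigenvalue in [a, b): a Monte Carlo estimate of (23)."""
    return float(np.mean([ncm(lam, a, b) == 0.0 for lam in spectra]))

def goe_spectra(n, trials, w=1.0, seed=0):
    """Spectra of independent GOE matrices normalized as in (1)-(3)."""
    rng = np.random.default_rng(seed)
    spectra = []
    for _ in range(trials):
        g = rng.standard_normal((n, n))
        m = w * (g + g.T) / np.sqrt(2.0 * n)
        spectra.append(np.linalg.eigvalsh(m))
    return spectra
```

For a fixed interval the estimated hole probability is close to 0 when the interval lies well inside the bulk of the eigenvalues and close to 1 when it lies far outside of them.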
The intervals in the above formulas may depend on $n$. If they do not depend on $n$, a convenient formalization of the large-$n$ behavior of the NCM is its weak convergence, i.e. the large-$n$ behavior of the integrals (linear statistics)
$$N_n[\varphi] = \int\varphi(\lambda)\,N_n(d\lambda) \tag{24}$$
with a fixed continuous function $\varphi$ as $n \to \infty$, i.e. convergence of these integrals in an appropriate probabilistic sense to a non-random limit, most often in the variance (or, equivalently in our case, in probability), sometimes with probability 1. Borrowing the terminology from the ROT, in fact, from condensed matter theory, we will call the weak limit of the NCM, or rather the corresponding distribution function, if it exists, the integrated density of states (IDS) of a given random matrix ensemble, and its density, if it exists, the density of states (DOS). We will denote the IDS and the corresponding measure by $N$ and its density by $\rho$. We will call the support of $N$ the spectrum, although the meaning of this term in the RMT is more restricted than in the ROT. We will use the notation
$$\sigma = \mathrm{supp}\,N. \tag{25}$$
In the ROT the existence with probability 1 of the IDS is a simple consequence of the ergodic theorem $^{34}$, i.e. the respective proof requires much less effort than the computation of the IDS, its asymptotic analysis on various parts of the spectrum, and the study of its role in spectral analysis and its applications. In the RMT the requirement of existence of a non-trivial weak limit of the NCM plays an important role, because it fixes the scale of the global asymptotic regime of the theory. Indeed, since random matrices have "too many" entries of the same order of magnitude, "too many" eigenvalues can escape to infinity as $n \to \infty$, so that the weak limit of the NCM will be zero. Thus, we have to assume that a majority of entries vanish as $n \to \infty$. For example, we have the normalization factor $n^{-1/2}$ in the GOE (5) and the factor $n^{-1}$ in the sample covariance matrices (10). A simple but useful necessary condition of a correct normalization is boundedness in $n$ of the expectations of some of the moments of the NCM:
$$\overline{\mu}_{n,l} = \mathbf{E}\{\mu_{n,l}\}, \quad \mu_{n,l} = \int_{\mathbf{R}}\lambda^l\,N_n(d\lambda) = n^{-1}\mathrm{Tr}\,M_n^l, \quad l = 1,2,\dots. \tag{26}$$
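The role of the normalization can be seen directly on the moments (26). The following sketch, assuming NumPy and using the hypothetical helpers `symmetric_gaussian` and `ncm_moment`, compares the second moment of a symmetric Gaussian matrix with and without the factor $n^{-1/2}$ (the case $w = 1$ of (2)).

```python
import numpy as np

def symmetric_gaussian(n, rng):
    """Symmetric matrix W with the covariance (2) for w = 1."""
    g = rng.standard_normal((n, n))
    return (g + g.T) / np.sqrt(2.0)

def ncm_moment(matrix, l):
    """Moment mu_{n,l} = n^{-1} Tr M^l of (26), computed from the eigenvalues."""
    lam = np.linalg.eigvalsh(matrix)
    return float(np.mean(lam ** l))

rng = np.random.default_rng(0)
n = 300
w_mat = symmetric_gaussian(n, rng)
mu2_raw = ncm_moment(w_mat, 2)                      # grows like n
mu2_scaled = ncm_moment(w_mat / np.sqrt(n), 2)      # stays of order 1
mu4_scaled = ncm_moment(w_mat / np.sqrt(n), 4)      # stays of order 1
```

Without the factor $n^{-1/2}$ the second moment grows proportionally to $n$, so the eigenvalues spread out and no non-trivial weak limit of the NCM can exist; with it, the moments remain bounded.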
For instance, it is easy to find that the second moment of the matrices (8) of a Wigner ensemble is bounded in $n$ under condition (2), and that the first moment of the sample covariance matrix (10) also has this property under an analogous condition on the random vectors $X_\mu$. We thus come to the first problem of the RMT:

$(p_1)$ Prove existence of the IDS for a given matrix ensemble, i.e. existence of a nontrivial weak non-random limit $N$ of the random normalized counting measure (18) as $n \to \infty$.

In general, even if a proper normalization of a given matrix ensemble is known, the proof of existence of the IDS in the RMT is more involved than in the ROT. In many cases it is carried out in two steps. The first (and easier) step is the proof that the fluctuations of the NCM (say, its variance (21)) vanish as $n \to \infty$ (see e.g. $^{11}$ for a proof in the case of ensembles with weakly dependent elements and $^{85}$ for invariant ensembles), and the second step is the proof of existence of the weak limit of the expectation (19) of the NCM. This step is as a rule more difficult, being often equivalent to the explicit computation of the IDS. The situation is a bit similar to that in the statistical mechanics of disordered spin systems, spin glasses in particular. There, in the case of a short range interaction, the proof of existence of a non-random thermodynamic limit of the free energy per site (known as its selfaveraging property) is again a simple consequence of the ergodic theorem, while the computation of the limit is a hard problem.
On the other hand, in the mean field models, say in the most widely studied Sherrington-Kirkpatrick and Hopfield models, where the interaction matrix is given by (8) and (10) respectively, one can prove comparatively easily the vanishing of the variance of the free energy (known as its weak selfaveraging property), but the proof of existence of the limit of the expectation of the free energy per site is still one of the most challenging open problems of the field. Recalling the mean field models of translation invariant systems (e.g. the Curie-Weiss model) and the beautiful physical theory of replica symmetry breaking $^{33}$, one may believe that the proof of existence of the last limit is of the same level of difficulty as the explicit computation of the limit and has to result in deriving a variational formula for the limit and in describing the respective extremum solution. Thus we can divide problem $(p_1)$ above into two subproblems.

$(p_1')$ Prove vanishing of fluctuations of the NCM as $n \to \infty$.

$(p_1'')$ Prove existence of a non-trivial weak limit of $\overline{N}_n$ as $n \to \infty$, give estimates of the rate of convergence, and work out efficient methods for the computation of the limit and for its asymptotic analysis on various parts of the support. In particular, prove existence of the density of states, i.e. absolute continuity of the limit.

Problem $(p_1')$ usually includes a proof of the vanishing of the variance (21) and an estimate of the respective rate. In the case of the GUE
$$\mathrm{Cov}\{N_n(\Delta_1),N_n(\Delta_2)\} = (1+o(1))\begin{cases}\mathrm{const}/n, & \Delta_1\cap\Delta_2 \ne \emptyset,\\ \mathrm{const}/n^2, & \Delta_1\cap\Delta_2 = \emptyset,\end{cases} \quad n \to \infty. \tag{27}$$
By writing the NCM in the form $N_n(\Delta) = n^{-1}\sum_{l=1}^{n}\chi_\Delta(\lambda_l)$, where $\chi_\Delta$ is the indicator of $\Delta$, we see that the NCM is a linear statistic of the random variables $\{\lambda_l\}_{l=1}^{n}$. Replacing $\chi_\Delta$ by a function
(28)
where $C$ depends on $\varphi$ and on the moments of the matrix elements, but is independent of $n$. Thus, fluctuations of the $N_n[\varphi]$
statistics, deviation
Because of the representation (1), the matrix elements of the Gaussian Ensembles can always be thought of as defined on the same probability space of semi-infinite tables $W = \{W_{jk}\}_{j,k=1}^{\infty}$ of $n$-independent Gaussian variables. Thus, the bound (28) implies the convergence with probability 1 of the NCM of the respective ensemble to the nonrandom IDS, provided that the latter exists, i.e. provided that problem $(p_1'')$ is solved. As for the last problem, there are several methods for its solution, based on combinatorial, functional analytic and variational methods and producing diverse forms of the IDS (see e.g. $^{23,57}$). Here we only mention that for the Gaussian Ensembles defined by the law (5) on the probability spaces of real symmetric, Hermitian or real quaternion matrices, which will be labelled by the index $\beta = 1,2,4$ respectively, the IDS $N_\beta$ is given by the famous Wigner semicircle law $^{19}$
$$N_\beta(d\lambda) = \rho_\beta(\lambda)\,d\lambda, \quad \rho_\beta(\lambda) = (2\pi\beta w^2)^{-1}\sqrt{(4\beta w^2 - \lambda^2)_+}, \tag{29}$$
where $x_+ = \max\{0,x\}$. Thus, the main output of the study of the global regime is the IDS. It serves as an input for a more detailed study of the eigenvalue statistics: the next-order in $1/n$ terms of $\overline{N}_n$, the explicit form of the covariance, the hole probability (23), and the local regime (see below for its definition). Suppose that the IDS is absolutely continuous and, moreover, that the expectation $\overline{N}_n$ of the NCM also has this property:
$$\overline{N}_n(d\lambda) = \rho_n(\lambda)\,d\lambda. \tag{30}$$
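The semicircle law (29) can be compared with the NCM of a single large matrix. The sketch below is an illustration under stated assumptions: it uses NumPy, the closed-form distribution function of the semicircle density with radius $2\sqrt{\beta}\,w$, and the hypothetical helper names `semicircle_ids` and `goe_eigenvalues`.

```python
import numpy as np

def semicircle_ids(lam, w=1.0, beta=1):
    """Distribution function N_beta(]-inf, lam]) of the semicircle density (29)."""
    radius = 2.0 * np.sqrt(beta) * w
    x = np.clip(np.asarray(lam, dtype=float) / radius, -1.0, 1.0)
    # Integral of (2 / (pi R^2)) sqrt(R^2 - t^2) from -R to lam, with x = lam / R.
    return 0.5 + (x * np.sqrt(1.0 - x ** 2) + np.arcsin(x)) / np.pi

def goe_eigenvalues(n, w=1.0, seed=0):
    """Eigenvalues of one GOE matrix normalized as in (1)-(3)."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((n, n))
    return np.linalg.eigvalsh(w * (g + g.T) / np.sqrt(2.0 * n))

lam = goe_eigenvalues(400, w=1.0, seed=1)
empirical = np.mean((lam >= -1.0) & (lam <= 1.0))          # NCM of [-1, 1]
predicted = semicircle_ids(1.0) - semicircle_ids(-1.0)     # IDS of [-1, 1]
```

For $n$ of a few hundred the two numbers typically agree to within a few percent, in line with the vanishing of the fluctuations of the NCM discussed above.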
In particular, this is the case for the GOE (5) (see formula (58) below). An important problem of the RMT is the asymptotic form of $\rho_n(\lambda)$ in a properly scaled $n$-dependent neighborhood of those points of the spectrum (25) where the DOS $\rho(\lambda) = \lim_{n\to\infty}\rho_n(\lambda)$ is either zero or infinity. We will call these points special. Particular cases of special points are the edges of the spectrum. In the case of the Invariant Ensembles (7) of Hermitian matrices, known as the Unitary Invariant Ensembles, this problem is strongly related to the double scaling limit in string theory and in two dimensional quantum gravity. Recent progress in this field (see the reviews $^{7,133,137}$) implies that the asymptotic form of $\rho_n(\lambda)$ in a neighborhood of the edges of the spectrum corresponding to special (called critical) potentials $V$ in (7) can be expressed via special solutions of the Schroedinger equation whose potential is a special solution of certain nonlinear ODE's, various Painleve equations in particular, related to integrals of motion of completely integrable PDE's, the Korteweg-de Vries equation first of all. In the canonical case of the GUE the respective Schroedinger equation is the Airy equation, the order of magnitude of the neighborhood of the edge $2\sqrt{2}w$ of the semicircle law (29) is $n^{-2/3}$, and the respective asymptotic form of $\rho_n(\lambda)$ is $^{91,95}$
$$\lim_{n\to\infty} n^{1/3}\sqrt{2}w\,\rho_n(2\sqrt{2}w + \sqrt{2}w\,n^{-2/3}s) = h(s), \quad h(s) = -s[\mathrm{Ai}(s)]^2 + [\mathrm{Ai}'(s)]^2. \tag{31}$$
In particular,
$$h(s) = \begin{cases}\pi^{-1}|s|^{1/2} - \cos(4|s|^{3/2}/3)\,(4\pi|s|)^{-1} + O(|s|^{-5/2}), & s \to -\infty,\\ (8\pi s)^{-1}\exp\{-4s^{3/2}/3\}\,(1+o(1)), & s \to \infty.\end{cases} \tag{32}$$
Thus we have oscillating corrections to the prelimit mean DOS $\rho_n$ on the spectrum and the exponential decay of $\rho_n$ outside of the spectrum. Similar questions arise in quantum chromodynamics $^{139}$. We come to the next problem of the RMT:

$(p_3)$ Find the asymptotic form of the prelimit mean DOS $\rho_n(\lambda)$ of (30) in a properly chosen neighborhood of special points of the DOS $\rho(\lambda) = \lim_{n\to\infty}\rho_n(\lambda)$.

It follows from the definition of the IDS that a typical number of eigenvalues in a small neighborhood $\delta\lambda$ of a given point $\lambda_0$ of the spectrum is asymptotically $nN(\delta\lambda)$, i.e. the IDS describes the "macroscopic" behavior of eigenvalues. This is why the global regime is often called macroscopic. Thus, if one is interested in properties of a finite number of eigenvalues belonging to this neighborhood, one has to rescale the spectral axis by the typical distance (spacing) $\delta\lambda/(nN(\delta\lambda))$ between eigenvalues. Under assumption (30) the typical distance is
$$D_n(\lambda_0) = 1/n\rho_n(\lambda_0). \tag{33}$$
The asymptotic regime of the RMT defined by this scale in the case $\rho_n(\lambda_0) \ne 0$ (the "bulk" of the spectrum) is called the local (or microscopic) regime or the scaling limit.

Existence of the IDS, i.e. the weak convergence in probability of $N_n$, say, for all open intervals, to a non-random limit, implies a simple form of the limiting hole probability. Namely, we have first that if an interval $\Delta$ intersects the spectrum, then
$$\lim_{n\to\infty}E^{(n)}(\Delta) = 0, \quad \Delta\cap\sigma \ne \emptyset, \tag{34}$$
where $\sigma$ is defined in (25). On the other hand,
$$1 - E^{(n)}(\Delta) = \mathbf{P}\{nN_n(\Delta) \ge 1\} \le n\,\mathbf{E}\{N_n(\Delta)\}, \tag{35}$$
i.e. if $\lim_{n\to\infty} n\,\mathbf{E}\{N_n(\Delta)\} = 0$ for $\Delta\cap\sigma = \emptyset$, then
$$\lim_{n\to\infty}E^{(n)}(\Delta) = 1, \quad \Delta\cap\sigma = \emptyset. \tag{36}$$
We see that in the global regime the leading term of the hole probability is typically given by the simple formulas (34) and (36). According to the definition of the IDS, we have asymptotically at most $o(n)$ eigenvalues in any interval $\Delta$ of the spectral axis such that $\Delta\cap\sigma = \emptyset$. Bound (35) suggests more. Assume that $n\overline{N}_n(\Delta)$ vanishes sufficiently fast for $\Delta\cap\sigma = \emptyset$, e.g. so that the right-hand sides of (35) are summable in $n$ (for the GUE $\overline{N}_n(\Delta)$ is exponentially small in $n$ for $\Delta$ lying outside of the spectrum, see (32)). Then, assuming that all random matrices $M_n$ of a given random matrix ensemble are defined on the same probability space, we obtain by the Borel-Cantelli lemma that with probability 1 there are no eigenvalues in any interval outside of the spectrum of the ensemble for all sufficiently large $n$. This observation provides one more motivation for problem $(p_3)$ and suggests the following problem.

$(p_4)$ Study asymptotic statistical properties of the extreme eigenvalues of a given random ensemble, in particular:

$(p_4')$ Prove that the maximum (the minimum) eigenvalue of matrices of a given matrix ensemble converges in probability or with probability 1 to the extreme right (extreme left) edge of the spectrum, and find the rate of convergence.

The assertion of problem $(p_4')$ implies that the random norm of matrices of a given ensemble converges with probability 1 to the maximum modulus of the edges of the spectrum. Recall that for random operators the spectrum and the norm are non-random and the norm always coincides with the maximum modulus of the spectrum edges. The probability distribution of the maximum eigenvalue is obviously
$$\mathbf{P}_n\{\lambda_{max} \le \lambda\} = E^{(n)}(]\lambda,\infty[). \tag{37}$$
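The convergence of the extreme eigenvalues to the edges of the spectrum can be observed numerically. The following is a minimal sketch for the GOE normalized as in (1)-(3), assuming NumPy; the helper name `goe_extreme_eigenvalues` is hypothetical, and for $\beta = 1$ the edges of the semicircle law (29) are $\pm 2w$.

```python
import numpy as np

def goe_extreme_eigenvalues(n, w=1.0, seed=0):
    """Smallest and largest eigenvalue of one GOE matrix with entry variances (3)."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((n, n))
    lam = np.linalg.eigvalsh(w * (g + g.T) / np.sqrt(2.0 * n))  # ascending order
    return lam[0], lam[-1]

lam_min, lam_max = goe_extreme_eigenvalues(500, w=1.0, seed=0)
operator_norm = max(abs(lam_min), abs(lam_max))  # norm of a symmetric matrix
```

For $n = 500$ both extreme eigenvalues are typically within a few hundredths of $\pm 2w$, consistent with a deviation on the scale $n^{-2/3}$.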
In the situation of problem $(p_4')$, $\lim_{n\to\infty}\mathbf{P}_n\{\lambda_{max} \le \lambda\} = \chi_{]-\infty,0]}(\sigma_{max} - \lambda)$, where $\chi_{]-\infty,0]}$ is the indicator of the left semi-axis and $\sigma_{max}$ is the extreme right edge of the spectrum. This result can be viewed as belonging to the global regime. We may be interested in more detailed information. In this case we have to act as in the case of the density of states near an edge of the spectrum. Then we are led to the problem:

$(p_4'')$ Find an asymptotic form of the distribution (37) in a properly chosen neighborhood of the right edge $\sigma_{max}$ of the spectrum, i.e. for $\lambda = \sigma_{max} + n^{-\alpha}\mu$, $\alpha > 0$.

Analogous problems can be considered in a properly chosen neighborhood of other edges of the spectrum, i.e. endpoints of the spectrum gaps. In the case of the GUE, according to (31), $\alpha = 2/3$ and the respective limiting distribution (of the norm) can be expressed via a special solution of the Painleve-II equation $^{95}$.

Returning to the hole probability (23) for intervals that are in the bulk of the spectrum, where the DOS is strictly positive and the mean spacing (33) is of the order $1/n$, we can formulate in the light of the above one of the most widely studied and used problems of the RMT:

$(p_5)$ For any integer $l \ge 0$, any $s > 0$ and $\lambda_0$ such that $\rho(\lambda_0) \ne 0$, find the limiting probability distribution (22)
$$E(l;s) = \lim_{n\to\infty}E^{(n)}(l;s), \quad E^{(n)}(l;s) = \mathbf{P}\{N_n((\lambda_0,\lambda_0+s/n\rho_n(\lambda_0))) = l/n\}, \tag{38}$$
in particular

$(p_5')$ Find under the same conditions the limiting hole probability
$$E(s) = \lim_{n\to\infty}E^{(n)}(s), \quad E^{(n)}(s) = \mathbf{P}\{N_n((\lambda_0,\lambda_0+s/n\rho_n(\lambda_0))) = 0\}. \tag{39}$$
Under certain conditions on the ensemble probability law, the derivative of $E^{(n)}(s)$ with respect to $s$ is, up to sign, the conditional probability to have no eigenvalues between $\lambda_0$ and $\lambda_0 + s/n\rho_n(\lambda_0)$ provided that there is an eigenvalue at $\lambda_0$ $^{11,16,19}$. In other words, $E'(s)$ is the limiting probability distribution of distances (spacings) between nearest neighbor eigenvalues in an $O(1/n)$ neighborhood of a point $\lambda_0$ of the bulk of the spectrum. The RMT owes a considerable part of its significance and development to the fact that it provides the spacing distribution which fits remarkably well the spacing distribution of a wide variety of chaotic and disordered quantum mechanical and wave systems (see Sect. 3.2.2). Besides, the same spacing distribution is found for the high-lying zeros of the Riemann $\zeta$-function and of similar functions (see Sect. 3.3.3). In the case of the GUE, $E(s)$ is independent of $\lambda_0$ and is equal to the Fredholm determinant of the integral operator $Q_{2,s}$, defined on the interval $[0,s]$ by the kernel
$$\frac{\sin\pi(x-y)}{\pi(x-y)}, \tag{40}$$
i.e. if $E_2(s)$ denotes the limit (39) in this case, then
$$E_2(s) = \det(1 - Q_{2,s}). \tag{41}$$
Moreover, if $D(g,s) = \det(1 - gQ_{2,s})$, $0 < g \le 1$, then $^{19,26}$
$$E(l;s) = \frac{(-1)^l}{l!}\,\frac{\partial^l}{\partial g^l}D(g,s)\Big|_{g=1}.$$
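The Fredholm determinant in (41) and the function $D(g,s)$ can be approximated numerically by discretizing the integral operator with a quadrature rule. The sketch below is only an illustration under stated assumptions: it takes $Q_{2,s}$ to have the standard sine kernel $\sin\pi(x-y)/\pi(x-y)$ on $[0,s]$, uses Gauss-Legendre nodes from NumPy, and the function name `gue_hole_probability` and its parameters are hypothetical.

```python
import numpy as np

def gue_hole_probability(s, g=1.0, order=40):
    """Quadrature approximation of D(g, s) = det(1 - g Q_{2,s}); g = 1 gives E_2(s)."""
    # Gauss-Legendre nodes and weights on [-1, 1], mapped to the interval [0, s].
    nodes, weights = np.polynomial.legendre.leggauss(order)
    x = 0.5 * s * (nodes + 1.0)
    wq = 0.5 * s * weights
    # np.sinc(t) = sin(pi t) / (pi t), i.e. the sine kernel evaluated at x - y.
    kernel = np.sinc(x[:, None] - x[None, :])
    root_w = np.sqrt(wq)
    # Symmetrized discretization: det(I - g * W^{1/2} K W^{1/2}).
    matrix = np.eye(order) - g * root_w[:, None] * kernel * root_w[None, :]
    return float(np.linalg.det(matrix))
```

For small $s$ the result behaves like $1 - s$, and it decreases monotonically as $s$ grows, as a hole probability should. Differentiating the computed $D(g,s)$ numerically in $g$ at $g = 1$ would give approximations of $E(l;s)$.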
On the other hand, let $\{\lambda_l\}_{l\ge1}$ be i.i.d. random variables whose probability law has the density $d$, and let $\{M_n\}_{n\ge1}$ be a sequence of diagonal matrices having the i.i.d. diagonal entries $\lambda_l$, $l = 1,\dots,n$. It is evident that the sequence defines a random diagonal operator in $l^2(\mathbf{Z}_+)$. Its density of states is $\rho = d$, moreover, $\rho_n = d$, and if $\rho(\lambda_0) \ne 0$, the limit (38) is
$$E(l;s) = e^{-s}s^l/l!, \tag{42}$$
i.e. the Poisson distribution. In particular, the spacing distribution $E'(s)$ of diagonal random operators is given by the Poisson law, independent of the nature of an ensemble (its density $d$) in the standard scaling (33). The same spacing distribution has the Schroedinger operator on the pure point part of its spectrum 100 >".

The spacing distribution given by (41) and known as the Wigner-Dyson distribution is completely different. Its density, usually denoted by $p(s)$, has the asymptotic behavior $C_1s^2(1+o(1))$ as $s \to 0$ and $C_2\exp\{-C_3s^2\}(1+o(1))$ as $s \to \infty$. In particular, the small-$s$ behavior signifies the phenomenon known as repulsion of levels, showing that degeneration of eigenvalues is unlikely. The phenomenon is pertinent to a wide variety of complex chaotic and mesoscopic disordered systems. The description of the spectrum fluctuations of such systems of different origin and nature given by random matrices is to a large extent phenomenological, i.e. it is not yet derived from intrinsic properties of the respective systems, and an ensemble suitable for a given system is determined mainly by the symmetry of the system (see however Sect. 3.2.2 and 3.2.5). In this situation the degree of dependence of the predictions of the RMT on the form of the ensemble probability law (say, on the function $V$ in (7)) is of considerable importance. It turns out that the local regime of the RMT is to a large extent independent of the ensemble. More precisely, the full information on the ensemble is encoded in a few parameters (functions), such as the DOS, the spectrum edges, etc. This fact, verified in a number of different particular cases by both analytic and numerical arguments, is known in the RMT as universality. The scope of this important property is rather broad. Following statistical mechanics, we can speak about several universality classes into which a respective local eigenvalue statistics (i.e. the limiting properties of the point process $\nu_{n,\lambda_0}(s) = nN_n([\lambda_0,\lambda_0+s/n\rho_n(\lambda_0)])$) falls.
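The contrast between the Poisson and the Wigner-Dyson spacing statistics, and in particular the repulsion of levels, is easy to see in a simulation. The following sketch is an illustration under stated assumptions: it uses NumPy, samples Hermitian Gaussian matrices scaled so that the semicircle law is supported on $[-2,2]$ (which corresponds to $w = 1/\sqrt{2}$ in (29) with $\beta = 2$), rescales spacings by the limiting density as in (33), and the function names are hypothetical.

```python
import numpy as np

def gue_unfolded_spacings(n, trials, seed=0):
    """Bulk nearest-neighbor spacings of GUE matrices, rescaled by n * rho as in (33)."""
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(trials):
        a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
        h = (a + a.conj().T) / (2.0 * np.sqrt(n))    # E|h_jk|^2 = 1/n
        lam = np.linalg.eigvalsh(h)
        mid = 0.5 * (lam[1:] + lam[:-1])
        keep = np.abs(mid) < 1.0                     # stay well inside the bulk
        rho = np.sqrt(4.0 - mid[keep] ** 2) / (2.0 * np.pi)
        out.append(n * rho * np.diff(lam)[keep])
    return np.concatenate(out)

def poisson_unfolded_spacings(n, seed=0):
    """Spacings of n i.i.d. points with density d = 1 on [0, 1], rescaled by n * d."""
    rng = np.random.default_rng(seed)
    lam = np.sort(rng.uniform(0.0, 1.0, size=n))
    return n * np.diff(lam)

s_gue = gue_unfolded_spacings(200, 10, seed=0)
s_poisson = poisson_unfolded_spacings(2000, seed=0)
small_gue = np.mean(s_gue < 0.1)          # very small: levels repel
small_poisson = np.mean(s_poisson < 0.1)  # about 1 - exp(-0.1)
```

Both samples have mean spacing close to 1 after rescaling, but spacings below 0.1 are rare for the Hermitian matrices and occur for roughly one spacing in ten for the independent points.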
The RMT possesses four universality classes for the bulk of the spectrum: the Poisson class and the three Wigner-Dyson classes, given by the respective Gaussian and/or Circular Ensembles of the corresponding symmetry. Then we can formulate the best known version of universality as
$(p_6)$ Prove (or disprove) that the local eigenvalue statistics of a given matrix ensemble falls into one of the (four) universality classes.
For the bulk of the spectrum the problem can be viewed as a refinement of problem $(p_5)$, and for the spectrum edges it is a refinement of problem $(p_4')$. Write the hole probability as $E^{(n)}(\Delta) = \sum_{l=0}^{n} (-1)^l (l!)^{-1} P_{n,l}(\Delta)$, $P_{n,l}(\Delta) = P_{n,l}(\Delta_1,\dots,\Delta_l)|_{\Delta_1=\dots=\Delta_l=\Delta}$, where $P_{n,l}(\Delta_1,\dots,\Delta_l)$ is the $l$-th marginal of the symmetrized joint probability law of eigenvalues. In many cases the eigenvalue joint
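probability law is absolutely continuous with respect to the Lebesgue measure in $\mathbb{R}^n$. Denoting its symmetrized density by $p_n(\lambda_1,\dots,\lambda_n)$ and introducing, analogously to statistical mechanics, the $l$-point correlation functions
$$R_l^{(n)}(\lambda_1,\dots,\lambda_l) = n(n-1)\cdots(n-l+1)\int_{\mathbb{R}^{n-l}} p_n(\lambda_1,\dots,\lambda_l,\lambda_{l+1},\dots,\lambda_n)\,d\lambda_{l+1}\cdots d\lambda_n, \qquad (43)$$
we can write that
$$E^{(n)}(\Delta) = 1 + \sum_{l=1}^{n}\frac{(-1)^l}{l!}\int_{\Delta^l} R_l^{(n)}(\lambda_1,\dots,\lambda_l)\,d\lambda_1\cdots d\lambda_l.$$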
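As a sanity check of this expansion, consider a toy case that is not taken from the text: assume the $n$ points are independent and uniformly distributed on $[0,n]$ (so the mean density is 1) and take $\Delta = [0,s]$ with $0 \le s \le n$. Under these assumptions the series can be summed in closed form and reproduces the exponential law of the Poisson case:

```latex
% Assumption (illustrative only): n i.i.d. points, uniform on [0, n].
\begin{align*}
p_n(\lambda_1,\dots,\lambda_n) &= n^{-n}
  \quad\text{on } [0,n]^n, \\
R_l^{(n)}(\lambda_1,\dots,\lambda_l) &= n(n-1)\cdots(n-l+1)\, n^{-l}
  \quad\text{on } [0,n]^l, \\
\int_{\Delta^l} R_l^{(n)}\, d\lambda_1\cdots d\lambda_l
  &= \frac{n!}{(n-l)!}\left(\frac{s}{n}\right)^{l}, \\
E^{(n)}(\Delta) &= \sum_{l=0}^{n}\frac{(-1)^l}{l!}\,\frac{n!}{(n-l)!}
  \left(\frac{s}{n}\right)^{l}
  = \left(1-\frac{s}{n}\right)^{n}
  \;\longrightarrow\; e^{-s}, \qquad n\to\infty .
\end{align*}
```

The closed form $(1 - s/n)^n$ is also what one obtains directly as the probability that none of the $n$ independent points falls into $[0,s]$, which confirms the signs and the factor $1/l!$ in the expansion.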
We see that the hole probability in the local limit (39) is related to the same limit of all the correlation functions (43). Thus, we can consider the following refinement of problem $(p_6)$.
$(p_6')$ Prove that uniformly on compacts of $\mathbb{R}^l$, $l \ge 2$, there exist the limits
$$\lim_{n\to\infty}\,(n\rho_n(\lambda_0))^{-l} R_l^{(n)}\big(\lambda_0+\xi_1/n\rho_n(\lambda_0),\dots,\lambda_0+\xi_l/n\rho_n(\lambda_0)\big) \qquad (44)$$
which do not depend on the ensemble within a universality class and thus coincide with the same limits for the Gaussian (or any other) Ensemble of the respective class (symmetry).
In sufficiently complex cases one may study the simplest second connected correlator $T_2^{(n)}(\lambda_1,\lambda_2) = R_2^{(n)}(\lambda_1,\lambda_2) - R_1^{(n)}(\lambda_1)R_1^{(n)}(\lambda_2)$ and its rescaled form
$$Y_2^{(n)}(\xi_1,\xi_2) = (n\rho_n(\lambda_0))^{-2}\,T_2^{(n)}\big(\lambda_0+\xi_1/n\rho_n(\lambda_0),\,\lambda_0+\xi_2/n\rho_n(\lambda_0)\big). \qquad (45)$$
Thus, one can consider an important particular case of the previous problems.
$(p_6'')$ Prove that the limit
$$Y_2(\xi_1,\xi_2) = \lim_{n\to\infty} Y_2^{(n)}(\xi_1,\xi_2) \qquad (46)$$
exists uniformly on compacts of $\mathbb{R}^2$ and coincides with that for the Gaussian ensemble of the same universality class (symmetry). For the GUE the limit is 19
$$Y_{2,2}(\xi_1,\xi_2) = \left(\frac{\sin\pi(\xi_1-\xi_2)}{\pi(\xi_1-\xi_2)}\right)^{2} \qquad (47)$$
and the analogous limits $Y_{\beta,2}$, $\beta = 1, 4$, for the GOE and the GSE are given in the literature. The form of the Fourier transform of $Y_{\beta,2}$ is responsible for the so called "correlation hole" in the long time dynamics of chaotic systems, and the slow decay of $Y_{\beta,2}$ leads to the "rigidity" (or stiffness) of the random matrix spectra, explaining, for example, the universal fluctuations in mesoscopic samples 14, 150. The pair correlation function $Y_{\beta,2}$ also defines the two important statistical characteristics most often used in the statistical analysis of random spectra and other similar numerical sequences. They are the variance $\Sigma^2(s)$ of the eigenvalue (level) number $\nu_{n,\lambda_0}(s) = nN_n((\lambda_0, \lambda_0 + s/n\rho_n(\lambda_0)))$ and the expectation of the so called $\Delta_3$-statistic (or spectral rigidity), giving the least mean square deviation of the "staircase" $\nu_{n,\lambda_0}(s)$ from the best fit by a straight line 19. In the scaling limit the level number variance is $\Sigma^2(s) = \pi^{-2}(\log 2\pi s + \gamma + 1) + O(s^{-1})$, $s \to \infty$, where $\gamma$ is the Euler constant. Comparison with the respective result for the Poisson (completely random) sequence, $\Sigma^2_{\mathrm{Poisson}}(s) = s$, shows the famous "stiffness" of the random matrix spectra. Analogously, the limiting expectation of the statistic $\Delta_3$ is $(2\pi^2)^{-1}(\log 2\pi s + \gamma - 5/4) + O(s^{-1})$, $s \to \infty$. On the other hand, despite these strong correlations, the level number satisfies the central limit theorem 80, 87.
In many interesting physical cases the relevant operator (Hamiltonian, scattering matrix, transfer matrix) depends on some external parameter. It can be a control parameter, an external field or even time and coordinates. The dependence is especially important in cases when it can change the local eigenvalue statistics, in particular the symmetry of the ensemble. For example, the matrix $M_n = M_n^{(1)} + i\alpha M_n^{(2)}$, where $M_n^{(1,2)}$ are i.i.d. GOE matrices, belongs to the GOE if $\alpha = 0$ and to the GUE if $\alpha = 1$, i.e. $i\alpha M_n^{(2)}$ is the term breaking the GOE symmetry (it can model the effects of a magnetic field). Another important example is the sum of a diagonal matrix and a Gaussian matrix, modelling the transition from regular to chaotic motion 43. Besides, the dependence of eigenvalues on a (time dependent) parameter can provide an interesting mechanism of dissipation and transport in quantum systems 156 and leads to interesting statistical laws for the first derivative (level velocity) and the second derivative (level curvature) of eigenvalues with respect to the parameter in question 86. It is often not too hard to find the dependence on the respective parameter of the global regime quantities, the NCM first of all.
However, the local regime requires considerable effort and may lead to new asymptotic phenomena. We thus come to the following problem.
$(p_7)$ Study the dependence of the local eigenvalue statistics of a given parametric family of ensembles on a parameter, in particular on the "interpolation" parameter whose variation changes the universality class (the transition or crossover problem for the eigenvalue statistics).
For the transition GOE $\to$ GUE mentioned above, the joint eigenvalue distribution has the form (52) with $\beta = 1$ multiplied by $|\det\{f(\lambda_j - \lambda_k)\}_{j,k=1}^{n}|^{1/2}$, where $f(\lambda) = \mathrm{erf}\big((1-\alpha^2)\lambda/8\alpha^2 w^2\big)$ 19, and all the correlation functions (43) can be written in a form similar to that of the GOE, e.g. the form that uses quaternion determinants. If $\alpha \ne 0$ is independent of $n$, then in the $n \to \infty$ limit all the limiting correlation functions (44) coincide with those of the GUE. But if $\alpha$ is scaled as $\alpha = n^{-1/2}t$, then in the same limit one obtains $t$-dependent correlation functions equal to those of the GOE for $t = 0$ and tending pointwise to those of the GUE as $t \to \infty$.
We have been discussing statistical properties of eigenvalues of random matrices. Statistical properties of their eigenvectors are also of interest. For invariant ensembles the joint eigenvector distribution is the properly normalized Haar measure on the respective group. In the limit $n \to \infty$ this leads to the Gaussian distribution for any finite number of the eigenvector components in any fixed basis, and to the $\chi^2$-distribution for their squares, called in nuclear physics the partial widths. The distribution has been used in nuclear physics since the 50s under the name of the Porter-Thomas distribution 3, 24, 117. Statistical properties of eigenvectors are also important in other applications of the RMT, where one encounters many noninvariant ensembles. Thus we come to the following problem.
$(p_8)$ Study statistical properties of eigenvectors of a given random matrix ensemble, in particular moments of their components (known as the inverse participation ratios), covariances and distribution functions.
We mention in conclusion that most of the problems studied so far concerning the eigenstructure of non-Hermitian matrices are in essence similar to those discussed above, and we refer the reader to book 19 and to Sect. 3.1.7 for the respective results and references.

2.4 Methods
The most direct method of the asymptotic study of the NCM is the asymptotic analysis of its moments (26). The method was introduced by Wigner to prove the convergence of the expectation of $N_n$ to the semicircle law (29), first for real symmetric matrices whose elements take two values of equal modulus, and then for matrices with independent entries, whose elements are symmetrically distributed and have moments of all orders 58 (problem $(p_1')$ above). Later it was noticed 12 that under the same conditions and by essentially the same arguments one can prove the vanishing of the variation (21) of the NCM (problem $(p_1'')$ above), i.e. the convergence of the NCM's to the semicircle law in probability. Since that time the method has been regularly used for random matrices having independent or weakly dependent entries, the Wigner ensembles and the sample covariance matrix ensembles in particular. A drawback of the method is that it requires the existence of the moments of the matrix elements up to a high order (often all orders) and a rather involved combinatorial analysis of the various terms appearing after computing the mathematical expectations of the r.h.s. of (26). Related to this is the fact that the method is not precise enough in the local regime (see however the recent papers 71, 93, 94, where certain problems concerning the local regime in a neighborhood of the spectrum edges of the semicircle law were solved by this method).
The sequence of all the moments of a non-negative unit measure $m$ can be viewed as one of its integral transforms, which determines the measure uniquely under certain conditions. In conventional probability another integral transform, the Fourier transform, proved to be more efficient in a number of important problems. It turns out that in the RMT a rather convenient tool, allowing one to avoid cumbersome combinatorics, is the Stieltjes transform (known also as the Cauchy or the Borel transform) defined as
$$f(z) = \int\frac{m(d\lambda)}{\lambda - z}, \qquad \mathrm{Im}\,z \ne 0 \qquad (48)$$
(see the literature for its properties). Taking as the measure in (48) the NCM (18) of a real symmetric or Hermitian matrix, we obtain in view of the spectral theorem that if $g_n(z)$ is the Stieltjes transform of the NCM, then $g_n(z) = n^{-1}\mathrm{Tr}\,G(z)$, where
$$G(z) = (M - z)^{-1} \qquad (49)$$
is the resolvent of a matrix $M$. The transform was introduced in the RMT in 55 and, being combined with the resolvent identity, provides an efficient tool for the study of the global regime (see 11, 22, 57) and of certain problems intermediate between the global and local regimes for both classes of ensembles (see e.g. 61, 70, 54), but mostly for ensembles with independent and weakly dependent entries (sometimes even not independent but just uncorrelated 51). There are several versions of the method, based on various versions of the finite rank perturbation formulas and differentiation formulas 57. We will call the method the resolvent method.
The next two methods, applicable to invariant ensembles, are based on the explicit form of the joint probability distribution of eigenvalues. Namely, write the spectral theorem for, say, real symmetric matrices as $M_{jk} = \sum_{l=1}^{n}\lambda_l\psi_{lj}\psi_{lk}$, where $\psi_l = \{\psi_{lj}\}_{j=1}^{n}$, $l = 1,\dots,n$, are the eigenvectors of $M$. Use this formula to carry out the change of variables in (7) from the collection of functionally independent matrix elements to the collection of eigenvalues and eigenvectors. It can be shown easily that the Jacobian of this change of variables depends on the eigenvalues $-\infty < \lambda_1 < \dots < \lambda_n < \infty$ as
$$\Delta(\lambda_1,\dots,\lambda_n) = \prod_{1\le j<k\le n}(\lambda_j - \lambda_k). \qquad (50)$$
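Returning briefly to the resolvent method: the relation between the Stieltjes transform (48) of the NCM and the normalized trace of the resolvent (49) can be checked by hand for a small matrix. The following sketch assumes a hypothetical 2x2 real symmetric matrix with entries `a`, `b`, `d` and an arbitrary point `z` off the real axis; the function names are illustrative and do not come from the text.

```python
import math


def symmetric_2x2_eigenvalues(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]]."""
    center = 0.5 * (a + d)
    radius = 0.5 * math.hypot(a - d, 2.0 * b)
    return [center - radius, center + radius]


def stieltjes_of_counting_measure(eigenvalues, z):
    """f(z) = int m(d lambda) / (lambda - z) for m = n^{-1} * sum of unit point masses."""
    return sum(1.0 / (lam - z) for lam in eigenvalues) / len(eigenvalues)


def normalized_resolvent_trace_2x2(a, b, d, z):
    """n^{-1} Tr (M - z)^{-1} for M = [[a, b], [b, d]] via the explicit 2x2 inverse."""
    determinant = (a - z) * (d - z) - b * b
    # The inverse of [[p, q], [q, r]] has trace (p + r) / (p * r - q * q).
    return ((a - z) + (d - z)) / determinant / 2.0


a, b, d = 1.0, 0.7, -0.4   # hypothetical matrix entries
z = 0.3 + 0.5j             # any point with Im z != 0
lhs = stieltjes_of_counting_measure(symmetric_2x2_eigenvalues(a, b, d), z)
rhs = normalized_resolvent_trace_2x2(a, b, d, z)
print(abs(lhs - rhs))
```

The two numbers agree up to rounding, and the imaginary part of the transform is positive whenever $\mathrm{Im}\,z > 0$, as it must be for the Stieltjes transform of a non-negative measure.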
A slightly more detailed argument leads to the formula
$$Q_n^{-1}\exp\Big\{-n\sum_{l=1}^{n}V(\lambda_l)\Big\}\,\Delta(\lambda_1,\dots,\lambda_n)\,d\Lambda\,d\Psi \qquad (51)$$
for the ensemble probability law (5), where $d\Lambda = \prod_{l=1}^{n}d\lambda_l$, $\Psi$ varies in the part of the orthogonal group $O(n)$ whose matrices have the first non-zero element of each column positive, and $d\Psi$ is the Haar measure on $O(n)$ restricted to this part of the group. In the RMT one most often deals with symmetric functions of eigenvalues that do not depend on eigenvectors. In this case we can restrict ourselves to the symmetrized form of the joint eigenvalue probability density, defined now in the whole of $\mathbb{R}^n$. An analogous argument applied to the unitary ($\beta = 2$) and the symplectic ($\beta = 4$) ensembles leads to the general formula 11, 19
$$p_{n\beta}(\lambda_1,\dots,\lambda_n) = Q_{n\beta}^{-1}\exp\Big\{-n\sum_{l=1}^{n}V(\lambda_l)\Big\}\,|\Delta(\lambda_1,\dots,\lambda_n)|^{\beta}. \qquad (52)$$
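Formula (52) is easy to evaluate numerically up to the normalization constant, and doing so in two ways makes the passage to the exponential (Coulomb gas) form (54) concrete. The sketch below is an illustration under assumptions: the quadratic potential with $w = 1$, the sample eigenvalues and the function names are all hypothetical choices, and the normalization factor is deliberately omitted.

```python
import math


def log_density_product_form(eigenvalues, potential, beta):
    """log of exp{-n sum V(lambda_l)} |Delta(lambda)|^beta, normalization omitted."""
    n = len(eigenvalues)
    vandermonde = 1.0
    for j in range(n):
        for k in range(j + 1, n):
            vandermonde *= abs(eigenvalues[j] - eigenvalues[k])
    return -n * sum(potential(x) for x in eigenvalues) + beta * math.log(vandermonde)


def log_density_coulomb_form(eigenvalues, potential, beta):
    """Same quantity written as -n * (external field energy + pair interaction / n)."""
    n = len(eigenvalues)
    energy = sum(potential(x) for x in eigenvalues)
    for j in range(n):
        for k in range(j + 1, n):
            # Repulsive logarithmic interaction between the charges j and k.
            energy += (beta / n) * math.log(1.0 / abs(eigenvalues[j] - eigenvalues[k]))
    return -n * energy


def quadratic_potential(x, w=1.0):
    """V(lambda) = lambda^2 / (4 w^2); w = 1 is an arbitrary choice for the example."""
    return x * x / (4.0 * w * w)


sample = [-1.2, -0.3, 0.4, 1.5]   # hypothetical eigenvalue configuration
values = {
    beta: (
        log_density_product_form(sample, quadratic_potential, beta),
        log_density_coulomb_form(sample, quadratic_potential, beta),
    )
    for beta in (1, 2, 4)
}
print(values)
```

For each $\beta$ the two evaluations coincide up to rounding, and both tend to $-\infty$ when two eigenvalues approach each other, which is the level repulsion built into the factor $|\Delta|^{\beta}$.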
For the circular ensembles (13) we have a similar formula (which dates back to the results of H. Weyl of the 20s in the case $V = 0$ 37):
$$C_{n\beta}^{-1}\exp\Big\{-n\sum_{l=1}^{n}V(\cos\theta_l)\Big\}\prod_{1\le j<k\le n}|e^{i\theta_j} - e^{i\theta_k}|^{\beta}\,d\theta_1\cdots d\theta_n \qquad (53)$$
for the joint probability distribution of the eigenvalues $\{e^{i\theta_l}\}_{l=1}^{n}$ of these unitary matrices, where $C_{n\beta}$ is the normalization factor and $\beta = 1, 2, 4$ for the Circular Orthogonal, Circular Unitary and Circular Symplectic Ensembles respectively. Writing (52) in the exponential form
$$p_{n\beta}(\lambda_1,\dots,\lambda_n) = Q_{n\beta}^{-1}\exp\Big\{-n\Big(\sum_{l=1}^{n}V(\lambda_l) + \frac{\beta}{n}\sum_{1\le j<k\le n}\log|\lambda_j - \lambda_k|^{-1}\Big)\Big\} \qquad (54)$$
we can interpret this form as the Gibbs distribution of $n$ linear charges on the real line (a one-dimensional conductor) subjected to the external field $V$. The factor $n$ in front of the parentheses in (54) plays the role of the inverse temperature. In view of the same factor in the denominator in front of the second term in the parentheses, this means that the Gibbs distribution (54) corresponds to the mean field description of the system of linear charges cooled to zero temperature simultaneously with the thermodynamic limit $n \to \infty$. This explains the efficiency of the electrostatic analogy, establishing in particular links of the RMT with the variational methods of real analysis via the minimization problem for the electrostatic energy
$$\mathcal{E}[m] = \int_{\mathbb{R}}V(\lambda)\,m(d\lambda) + \frac{\beta}{2}\int_{\mathbb{R}^2}\log|\lambda - \mu|^{-1}\,m(d\lambda)\,m(d\mu) \qquad (55)$$
defined for any nonnegative unit measure $m$. This problem is known in analysis as the minimum energy problem in an external field 35. The electrostatic analogy was suggested by Wigner 59, who used it to give a heuristic derivation of the semicircle law (29) for the GOE ($V(\lambda) = \lambda^2/4w^2$, $\beta = 1$). Similar arguments were used later in 132 in the context of the planar approximation of gauge field theory, where $V$ is an even quartic polynomial and $\beta = 2$. The analogy was also strongly developed by Dyson (see papers 57-58 in 8) in his analysis of the large-$s$ behavior of the hole probability.
Another method based on formula (52) is known as the orthogonal polynomial method. Its simplest and most developed version corresponds to the unitary invariant case $\beta = 2$. The method is based on the representation of the function $\Delta(\lambda_1,\dots,\lambda_n)$ of (50) as the Vandermonde determinant, which can be written as $\Delta(\lambda_1,\dots,\lambda_n) = \prod_{l=0}^{n-1}\gamma_l^{-1}\det\{p_{j-1}^{(n)}(\lambda_k)\}_{j,k=1}^{n}$, where $\{p_l^{(n)}(\lambda)\}_{l=0}^{\infty}$ are the polynomials orthonormal on the whole axis with respect to the weight $w_n(\lambda) = \exp\{-nV(\lambda)\}$, i.e. $p_l^{(n)}(\lambda)$ is a polynomial of degree $l$ having a positive coefficient $\gamma_l$ in front of $\lambda^l$ and satisfying the orthogonality condition
$$\int_{-\infty}^{\infty}e^{-nV(\lambda)}p_l^{(n)}(\lambda)p_m^{(n)}(\lambda)\,d\lambda = \delta_{lm}. \qquad (56)$$
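The orthogonality condition (56) can be verified numerically in the one case where the polynomials are classical. The sketch below assumes the Gaussian weight $e^{-x^2}$ (that is, $nV(\lambda) = \lambda^2$, with $n$ absorbed into the variable), for which the orthonormal polynomials are normalized Hermite polynomials; the three-term recurrence, the integration window and the function names are choices made for this example, not taken from the text.

```python
import math


def hermite_functions(x, count):
    """psi_0..psi_{count-1} at x, where psi_l = exp(-x^2/2) * p_l(x) and the p_l
    are orthonormal with respect to the weight exp(-x^2)."""
    psi = [math.pi ** -0.25 * math.exp(-0.5 * x * x)]
    if count > 1:
        psi.append(math.sqrt(2.0) * x * psi[0])
    for l in range(1, count - 1):
        # Three-term recurrence for the orthonormal Hermite functions.
        psi.append(
            math.sqrt(2.0 / (l + 1)) * x * psi[l]
            - math.sqrt(l / (l + 1)) * psi[l - 1]
        )
    return psi[:count]


def gram_matrix(count, half_width=12.0, step=0.01):
    """Trapezoid-rule approximation of int psi_l(x) psi_m(x) dx over [-half_width, half_width]."""
    steps = int(round(2.0 * half_width / step))
    gram = [[0.0] * count for _ in range(count)]
    for i in range(steps + 1):
        x = -half_width + i * step
        weight = step if 0 < i < steps else 0.5 * step
        psi = hermite_functions(x, count)
        for l in range(count):
            for m in range(count):
                gram[l][m] += weight * psi[l] * psi[m]
    return gram


gram = gram_matrix(4)
for row in gram:
    print(["%.6f" % entry for entry in row])
```

The computed Gram matrix is the identity up to quadrature error, which is the content of (56) for this weight; for a general potential $V$ the polynomials are not classical and would have to be generated, for instance, by orthogonalizing monomials numerically.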
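This important observation by Mehta allows the correlation functions (43) for $\beta = 2$ to be written in the following determinantal form
$$R_l^{(n)}(\lambda_1,\dots,\lambda_l) = \det\{K_n(\lambda_j,\lambda_k)\}_{j,k=1}^{l}, \qquad K_n(\lambda,\mu) = \sum_{l=0}^{n-1}\psi_l^{(n)}(\lambda)\psi_l^{(n)}(\mu), \qquad (57)$$
where $K_n(\lambda,\mu)$ is known in analysis as the reproducing kernel of the orthonormal system $\{\psi_l^{(n)}(\lambda)\}_{l=0}^{\infty}$, $\psi_l^{(n)}(\lambda) = e^{-nV(\lambda)/2}p_l^{(n)}(\lambda)$. This leads to the following formulas for the prelimit density of states of (30), the variance of the linear statistics (24), and the hole probability (23):
$$\rho_n(\lambda) = n^{-1}K_n(\lambda,\lambda), \qquad \mathrm{Var}\{N_n[\varphi]\} = \frac{1}{2n^2}\int\!\!\int|\varphi(\lambda) - \varphi(\mu)|^2K_n^2(\lambda,\mu)\,d\lambda\,d\mu, \qquad (58)$$
$$E^{(n)}(\Delta) = \det(1 - K_{n,\Delta}) \qquad (59)$$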
where $K_{n,\Delta}$ is the integral operator defined on a set $\Delta \subset \mathbb{R}$ by the reproducing kernel. Analogous formulas are valid for the circular ensembles of unitary matrices defined by (13). In this case the formulas date back to the Weyl integration formulas for class functions on classical compact groups 37. In the simplest case $V = 0$ the role of the orthogonal functions is played by the Fourier system $\{(2\pi)^{-1/2}e^{il\theta}\}_{l\in\mathbb{Z}}$, which makes the respective formulas especially simple and transparent. The determinantal formula (57) is an important element of the RMT formalism. Its analogues are valid in a number of cases where the joint probability density of eigenvalues is not necessarily written in the form (52), and the kernel $K_n(\lambda,\mu)$ is not necessarily a reproducing kernel of an orthonormal system and not even necessarily a symmetric function 41, 90.
Efficient use of these formulas in the rigorous part of the RMT is mostly based on asymptotic formulas for the respective orthogonal polynomials. In the case of the Gaussian (5) and the Laguerre (11) ensembles these are the Hermite and the Laguerre polynomials, whose asymptotic properties are well known. This is how the results for these ensembles mentioned above, especially in the local regime, were obtained. However, the use of the orthogonal polynomial technique in the general case of invariant ensembles (7) requires asymptotic formulas for the polynomials orthogonal with respect to the weight $w_n(\lambda) = \exp\{-nV(\lambda)\}$. These formulas were obtained only recently 202, 204 for real analytic $V$'s and lead to a proof of the universality property (problem $(p_5)$) for these ensembles. We would like to stress, however, that the scope of the orthogonal polynomial method is by no means restricted to the use of asymptotic formulas. Here is a simple illustration. The formula (58) for the variance of the linear eigenvalue statistics and elementary properties of the reproducing kernel imply immediately that $\mathrm{Var}\{N_n[\varphi]\} \le 2\sup_{\lambda\in\mathbb{R}}|\varphi(\lambda)|^2/n$ [...] 133, 137. It should be remarked that the mathematical status of many of these beautiful findings is not yet worked out.
Similar formulas are valid for the orthogonal and symplectic ensembles 19. They require, however, new elements: quaternion determinants and skew-orthogonal polynomials. This makes asymptotic results on these ensembles more difficult to obtain and thus much less complete than for the unitary ensembles with a general $V$ in (7).
As for the unitary invariant ensembles, the quantum field theory applications of the RMT brought a new elegant form of the orthogonal polynomial method for polynomial $V$ in (7). It is based on a canonical commutation relation and is particularly efficient in the study of the local regime near certain non-generic soft spectrum edges (see Sect. 3.2.5 and Sect. 3.3.3) in terms of representations of the commutation relation by pseudodifferential operators. The method reveals 85 links of the RMT with integrable systems and semiclassical analysis and is also applicable to the multi-matrix models (16) 7, 137. One more important technical novelty, introduced to treat the multi-matrix models and in particular to find their joint eigenvalue distribution, is the integration formula for the unitary group obtained by Harish-Chandra in the group representation context and rediscovered by Itzykson and Zuber in the RMT context (see e.g. 7). Another derivation of the joint distribution was given by Mehta (see 19). The heat equation in matrix variables used by these authors has proved to be an efficient tool of the RMT from the work of Dyson (see paper 60 in 8) until now 14, 150. The technique based on the integration formulas can be combined with versions of the orthogonal polynomial method (nonlocally orthogonal polynomials 19, which can be viewed as a version of the transfer matrix method of statistical mechanics), or just with contour integration 14, 90, or with the Grassmann integration and the $\sigma$-model techniques, which are asymptotically exact in the RMT because of its mean-field character 9, 11. The last two techniques, as well as the technique based on the Fokker-Planck (heat) equation for the joint eigenvalue probability density for ensembles containing a varying parameter 150, 151, 153, are widely used in the theoretical physics part of the RMT. These techniques are often more robust with respect to variation of the ensembles considered, but their mathematical status in many instances remains to be understood.

3 Comments to the Bibliography

3.1 Random Matrix Theory
3.1.1 Global Regime

Recall that in the global regime we study the large-$n$ behavior of the normalized eigenvalue counting measure (18) for $n$-independent intervals of the spectral axis, i.e. its weak convergence (problem $(p_1)$ of Sect. 2.3). This requires a certain $n$-dependent normalization of random matrices and assumes, as the first step, a proof of the existence of a nontrivial limit $N(\Delta) = \lim_{n\to\infty}E\{N_n(\Delta)\}$ for any $n$-independent $\Delta$ (problem $(p_1')$). The problem was first considered by Wigner in the 50s, who first gave a heuristic derivation of the semicircle law (29) for the GOE by using the electrostatic analogy 59 (reprinted in 24), and then a derivation of the semicircle law for $\lim_{n\to\infty}E\{N_n\}$ for real symmetric matrices with independent entries by the moment method, thus requiring the existence of all moments 58. A close to optimal condition for the weak convergence in probability of the NCM (18) of these ensembles to the semicircle law has the form of the Lindeberg condition, well known in probability, required here for each row of the matrices 56. By replacing this condition by an analogue of the Lindeberg condition for all matrix elements,
$$\lim_{n\to\infty}n^{-2}\sum_{j,k=1}^{n}\int_{|w|>\tau\sqrt{n}}w^2\,P\{W_{jk}^{(n)}\in dw\} = 0, \qquad \forall\tau > 0, \qquad (60)$$
Girko proved that this condition is optimal for the validity of the semicircle law, just as the standard Lindeberg condition is optimal for the validity of the central limit theorem in probability theory (see 11 for this and related results for the Wigner ensemble). Thus we have here a kind of "macroscopic universality": the IDS of the Wigner ensembles (8) depends only on the variance of the entries. Another manifestation of this property is a universal form of the covariance of the Stieltjes transforms of the NCM 52, 61, 70 for the invariant ensembles whose spectrum consists of one interval and for the Wigner ensembles.
Results on the Wigner ensembles were obtained by a version of the resolvent approach, proposed initially for the generalized Wishart ensembles (10) in 55 and giving the convergence of the NCM (18) in probability or even with probability 1. For modern forms of this approach see 11, 70, 57. The approach is sufficiently robust. One can, for example, replace the condition of independence of the entries by the condition that a part of them be only orthogonal 51, or consider dependent entries whose correlation function decays sufficiently fast as the distance to the principal diagonal tends to infinity (see e.g. 22 for a review), or take $\mathrm{Im}\,z = O(n^{-\alpha})$, $0 < \alpha < 1/4$ 54. One can also consider "deformations" (9) of the Wigner random matrices $M_n$ by another matrix $M_n^{(0)}$, independent of $M_n$, in particular non-random. Assuming that the NCM's of $M_n^{(0)}$ converge weakly to a measure $N_0$ and denoting by $f_0$ its Stieltjes transform, we have under the same condition (60) that the NCM's of the sum $M_n^{(0)} + M_n$ converge to a nonrandom limit, whose Stieltjes transform $f$ is the unique solution of the functional equation 56
$$f(z) = f_0(z + \beta w^2f(z)) \qquad (61)$$
(see 22, 57 for a collection of similar equations for various ensembles). An analogue of this functional equation in the case of the deformed generalized Wishart ensemble (10) is
$$f(z) = f_0\big(z + cx/(1 + xf(z))\big), \qquad (62)$$
where $c = \lim_{n\to\infty,\,m\to\infty}m/n$. Analogous equations for multiplicative families of positive and unitary random matrices were found in 55. In the particular case $f_0 = -z^{-1}$ we obtain the Stieltjes transform of the generalized Wishart matrices. We refer the reader to 1, 5, 11, 97 and to the references therein for various properties of the respective limiting eigenvalue distributions.
For invariant ensembles (7) the variational method based on formulas (52) and (54) is more efficient. It dates back to Wigner 59, who used the method in the form of the electrostatic analogy. Rigorous results, yielding the weak convergence in probability of the NCM's of invariant ensembles with a locally Hölder $V$ in (7), are given in 50, 69, 203. Notice that the respective proofs are valid for all $\beta$ in (52), and not only for $\beta = 1, 2, 4$. A new property of the IDS of invariant ensembles, compared with the Wigner ensembles, is the possibility for its support to consist of several intervals. This is the origin of new phenomena, like the phase transition type behavior when two intervals merge 4, 133, but also of certain technical difficulties. Assuming that the potential $V$ in (7) is smooth enough (say, a polynomial), one can prove that the number of intervals is finite and that their end points can be found selfconsistently from the respective variational problem for the electrostatic energy (55). The DOS is zero at all these endpoints and they are called the soft edges. Generically the DOS behaves as $\rho \sim \delta^{1/2}$ in the neighborhood of a soft edge $c$, where $\delta = |\lambda - c|$. By tuning the potential we can obtain the edge behavior $\rho \sim \delta^{p-1/2}$, $p = 1, 3, \dots$, for polynomial $V$'s of degree not less than $2p$, and a polynomial of degree $2p$ producing this DOS is called critical. These potentials are important in some models of string theory 7, being responsible for the double scaling limit in certain models of the theory. Another type of singularity arises in the merging of two intervals of the spectrum into one interval, either because of the tuning of the potential in invariant ensembles (7), where $\rho \sim \delta^2$, or because of the choice of a two-interval initial measure $N_0$ in (61) for the deformed Wigner ensembles, where $\rho \sim \delta^{1/3}$ 90. A different endpoint arises if the spectrum is restricted a priori to a certain interval. For instance, in the case of positive definite matrices (10) the spectrum always occupies the positive semi-axis. Here the DOS behaves as $\rho \sim \delta^{-1/2}$ near zero. This is an example of the hard edge. We will need this classification later.
A new wave of interest in the global regime is motivated by recent studies in operator algebras associated with free groups 164, 170. It turns out that random matrices give a rich analytical model for the notion of freeness, proposed by Voiculescu and central in these studies. An interesting definition of the free convolution of measures is a far reaching generalization of relations (61) and (62), and the related notions and results provide a new tool for the study of various interesting problems of operator algebras. On the other hand, these results yield a new conceptual form of many known results of the global regime of the RMT. In particular, the free analogue of infinitely divisible measures and the respective analogue of the Levy-Khinchin formula show that the roles of the Gaussian and the Poisson distributions in free probability are played by the semicircle law and by the distributions defined in (62) for $f_0(z) = -z^{-1}$ respectively. The notion of the $R$-transform of a measure allows one to formulate in an elegant form the results on addition and multiplication of independent random matrices.

3.1.2 Corrections, Fluctuations and Large Deviations
Most of the problems and results related to the global regime have the nature of the law of large numbers. Thus, as in conventional probability, the corresponding asymptotic analysis includes certain results on the vanishing of fluctuations of the respective random variables as $n \to \infty$ (see, e.g. (28)). Moreover, one is naturally led to the study of other standard components of probabilistic analysis: the central limit theorem, large fluctuations and the large-$n$ expansions (problems $(p_1'')$ and $(p_2)$ of Sect. 2.3). For example, the large-color approximation of quantum gauge field theory 4, 132, 134 suggests that the large parameter of the theory should be not $n$ but $n^2$, at least in the case of unitary invariant ensembles. This implies, in particular, that, say, for the normalized linear statistic $N_n[\varphi]$ [...] 0, and the central limit theorem for a collection of any number $p$ of $nN_n[(\cdot - z_j)^{-1}]$, $\mathrm{Im}\,z_j > 0$, $j = 1,\dots,p$. For the unitary invariant ensembles (7), (12) a rigorous bound $\mathrm{Var}\{N_n[\varphi]\} = O(n^{-(1+\alpha)})$ with an $\alpha$-Hölder function [...] of the covariance of $N_n[(\cdot - z_{1,2})^{-1}]$, $\mathrm{Im}\,z_{1,2} > 0$, was found in 61 and in 64 by a version of the $1/n$-expansion and by an ansatz for the asymptotic form of the orthogonal polynomials (56) respectively, for the case when the support of the DOS is one interval. The analysis of the case of multi-interval support for invariant ensembles has additional technical difficulties. This is why the central limit theorem for linear statistics in this class of ensembles is known only for the case when the spectrum consists of one interval 69 (this is the case, for instance, for convex $V$'s 50). For analogous results concerning unitary matrices see 65, 68.
An important question is the rate of convergence in the central limit theorem. For the circular ensembles of unitary Haar distributed random matrices (13), physical results on the planar approximation in quantum gauge field theory suggest the exponential rate of convergence $O(\exp\{-\mathrm{const}\cdot n\})$ 66. The same suggestion follows from the results of 65 on the central limit theorem for traces of powers of matrices. A rigorous proof 68 gives a superexponential bound $O(\exp\{-\mathrm{const}\cdot n\log n\})$ for the unitary group, and an exponential bound for the orthogonal and the symplectic groups. These and other similar results are closely related, via an explicit expression for the generating functional of the NCM (see e.g. 23, 68, 69), to the strong Szegő theorem for determinants of Toeplitz operators in the case of $U(n)$ and to its generalizations in the case of other ensembles 68, 69, an interesting subject of modern analysis (see 29 for results and references).
For large deviation type results for the NCM of various ensembles see 62, 63, 67, where the large deviation principle for the NCM is established and it is shown in particular that the role of the rate functional in the large deviation formulas is played by the electrostatic energy (55), as is suggested by the electrostatic and statistical mechanics analogies 59, 50, 132.

3.1.3 Norms and Spectrum Edges
We give here some details concerning problems $(p_4')$. Exponential bounds for the probability of large values of the Euclidean norm of the Wishart matrices (11) date back to the 40s (see 77, where they are ascribed to Bargmann, Montgomery and von Neumann). Convergence with probability 1 of the norms of the GUE matrices to the right hand edge of the support of the semicircle law (29) follows from results by Bronk of the early 60s (see 74), who used formula (58) for $\rho_n$ and the Plancherel-Rotach asymptotic formula for the Hermite polynomials to find that the prelimit mean DOS $\rho_n$ of (30) is exponentially small already at the distance $\delta = O(n^{-2/3})$ from the spectrum edge of the GUE (cf. (32)), thus leading to an exponential bound for the r.h.s. of (35). For another derivation of this result and its applications see 167. We see that as $n \to \infty$ we have outside of the spectrum not just $o(n)$ eigenvalues, as is suggested by the convergence of the NCM to the semicircle law with probability 1, but no eigenvalues at all. Likewise, this should be contrasted with the standard matrix inequalities (Gershgorin, Cassini, etc.), implying that the spectrum has to be within an interval of length $O(\sqrt{n})$.
Analogous results for the generalized Wishart ensembles (10) whose entries have moments of all orders were obtained by Geman 76, and later by Bai, Girko, Krishnaiah, Silverstein and Yin in the late 80s for the Wigner and the generalized Wishart ensembles under the condition of the existence of the moment of order 4 for i.i.d. entries and of order $4 + \varepsilon$, $\varepsilon > 0$, for the general case. This was done by the moment and by the resolvent methods, see 1, 11, 72 for results and references, in particular those that concern the minimum eigenvalue of the generalized Wishart ensemble and the eigenvalues lying in the gaps of the spectrum of the respective deformed ensembles. For exponential bounds for the probability of deviations of norms from the respective spectrum edge in various ensembles see e.g. 73, 75, 168.
These results are of importance in operator algebras, the local theory of Banach spaces, statistical mechanics of spin glasses and neural networks, numerical methods, etc. As for invariant ensembles, the results here are much less complete. Bounds implying that the norm is finite with probability 1 were obtained in 50 for the $C_{\mathrm{loc}}$ potentials and all $\beta$. For the unitary invariant ensembles with real analytic potentials, precise bounds analogous to those for the GUE can in principle be deduced from the recent results 202, 203 on the asymptotics of orthogonal polynomials, following the scheme based on formulas (58).

3.1.4 Local Regime (bulk)

The local regime is one of the most studied and used in the RMT and its applications, because it allows one to describe the striking similarity of the local properties of spectra (and even of just "chaotically" originated sequences of numbers) coming from situations that are otherwise quite different. Recall that in this regime we are interested in the statistical properties of a finite number of eigenvalues or, in the scale fixed by the global regime, in intervals of the spectral axis having length of the order of the typical level spacing (33) (problems $(p_5)$ and $(p_6)$ of Sect. 3.2). The respective results look simpler (but are not always simpler to prove) for the bulk of the spectrum.
The regime was also introduced by Wigner in the 50s to explain the experimentally found phenomenon of the repulsion of nuclear energy levels (neutron resonances), see [3, 14, 19, 24]. Later the regime became important in modeling the spectrum fluctuations in quantum and wave systems whose classical counterparts are chaotic (quantum chaos) (see Sect. 3.2.2). The quantities of principal interest are the hole probability (23), defining the spacing distribution and describing the short range eigenvalue correlations, and the pair correlator (46), responsible for the long range correlation of eigenvalues (see [11, 14, 16, 19] for discussions of definitions and general properties of these and other characteristics of the local regime).

Results for ensembles with independent entries are far from being complete. We mention only the heuristic results of [70] for the smoothed pair correlator of the Wigner ensembles. The study of these topics for invariant ensembles, the unitary invariant ones first of all, is much more advanced, mostly because of formulas like (57)-(58), reducing the analysis to the computation of the limit
$$Q(\xi,\eta) = \lim_{n\to\infty} (n\rho_n(\lambda))^{-1} Q_n(\xi,\eta), \qquad Q_n(\xi,\eta) = K_n\bigl(\lambda + \xi/n\rho_n(\lambda),\ \lambda + \eta/n\rho_n(\lambda)\bigr), \tag{63}$$
where $K_n(\lambda,\mu)$ is defined in (57). The most direct method to compute the limit for the unitary invariant ensemble is the use of (57) and of asymptotic formulas for orthogonal polynomials. This is how the limiting sin-kernel (40) was found by Gaudin for the GUE and the
certain linear combinations of these kernels with arguments $\pm\xi \pm \eta$ for the GOE and for the GSE, i.e. for the quadratic potential $V = \lambda^2/4w^2$ and the Hermite polynomials respectively (see [19]). The more general case of potentials leading to a one- or two-interval DOS was studied in [15] by using a version of the orthogonal polynomial method. For real analytic potentials $V$ in (7) in the unitary invariant case, formulas (40) and (47) were derived in [202, 204] by using asymptotic formulas for the respective orthogonal polynomials obtained in these papers. This implies the universality of the local eigenvalue statistics for this class of ensembles. For the case of potentials belonging locally to $C^3$ and satisfying certain regularity properties at infinity, this fact was established in [85] without using asymptotic formulas for orthogonal polynomials. These rigorous results were preceded by physical papers [64, 81, 82, 84, 138] and many others, treating various interesting aspects of the subject, in particular the cases of $\beta = 1, 4$, where rigorous proofs are still absent.

The concept of universality is central in the local regime, reflecting the remarkable similarity of the eigenvalue statistics of different operators that one encounters in various branches of science, ranging from quantum chromodynamics and number theory to room acoustics and microwave cavities. At regular points of the DOS there exist three universality classes, corresponding to $\beta = 1, 2, 4$ and described by the Wigner-Dyson local eigenvalue statistics, established first for the Gaussian and Circular Ensembles, and the Poisson statistics, pertinent to weakly dependent point processes in probability and to integrable or localized systems in physics (see Sect. 3.2.2 and 3.2.5). Other universality classes correspond to special points of the DOS of the same ensembles (see the next subsection), to other ensembles [16, 39, 46, 49, 79] (and are expected for random operators [78, 92]), and to transition regimes corresponding to changes of the symmetry of an ensemble as the result of varying a parameter in the ensemble probability law (a kind of symmetry breaking phenomenon in the RMT, playing an important role in applications, see e.g. [3, 14, 19, 150, 123] and problem (p7) of Sect. 2.3).

Provided that universality is established, one needs to know properties of the limiting point process, its hole probability $E_\beta(s)$ in particular. The Wigner-Dyson processes are quite different from the Poisson process (42). In particular, we have [19]
$$E_\beta(s) = 1 - s + a_\beta s^{2+\beta} + O(s^{4+\beta}), \quad s \to 0,$$
thus the spacing probability density $p_\beta(s) = E_\beta''(s)$ is
$$p_\beta(s) = b_\beta s^{\beta} + O(s^{2+\beta}), \quad s \to 0,$$
demonstrating the repulsion of eigenvalues in the local regime. These results follow easily from the iterations of the integral operator defined by the kernel (40) and its analogues for the GOE and GSE. The large-$s$ behavior of the hole probability is much harder to find. It may seem that at least the first term of $\log E_\beta(s)$ should be of the form const $\cdot\, s$, because, according to the weak Szegő-Kac theorem for Toeplitz operators [29], the asymptotic form of the Fredholm determinant of the operator $1 - Q_s$, where $Q_s$ is the integral operator defined on the interval $[0, s]$ by a difference kernel (63), is $\log E(s) = s \int_{\mathbb{R}} \log(1 - Q(k))\,dk\,(1 + o(1))$, $s \to \infty$, where $Q(k)$ is the symbol of $Q_s$. However, for kernel (40) the Fourier transform is the indicator of the interval $[-\pi, \pi]$, thus the last integral diverges to $-\infty$. It turns out that the leading term of the large-$s$ asymptotics of $\log E_\beta(s)$ is $-c_\beta s^2$, and in fact one can construct the asymptotic expansion of $\log E_\beta(s)$. We refer the reader to [19] for derivations of this expansion
given before 1991, and to [26] and [206] for recent results.

3.1.5 Local Regime (special points)
We discuss here results related to problems (p3), (p'J) and (p6). As in the preceding subsection, they can be divided into two groups. The first includes topics related to the universality classes, known and expected. The second group includes proofs of universality for various classes of ensembles.

Based on results for the GUE [88, 91] and on the universality idea, one can conjecture that the local regime in an $O(n^{-2/3})$ neighborhood of a generic soft edge, where the DOS behaves as $\delta^{1/2}$, is defined by the kernel that can be obtained from (40) by replacing $\sin \pi\xi$ by $\mathrm{Ai}(\xi)$ and $\cos \pi\eta$ by $\pi\,\mathrm{Ai}'(\eta)$, where Ai is the Airy function (see (31)-(32)), and in an $O(n^{-2})$ neighborhood of the generic hard edge, where the DOS is asymptotically $\delta^{-1/2}$, one has to replace $\sin \pi\xi$ by $J_{1/2}(\sqrt{\xi})$ and $\cos \pi\eta$ by $\pi\sqrt{\eta}\,J_{1/2}'(\sqrt{\eta})$, where $J_{1/2}$ is the Bessel function of the order 1/2. A similar regime exists for the chiral ensembles (64). Development of the scheme of study of the local regime outlined in the previous subsection, in particular of the large-$s$ behavior of the spacing distribution, requires beautiful analysis related to integrable systems, the Painlevé transcendents, etc. (see [88]). One more universality class corresponds to the closure of a gap in the spectrum, when its two intervals merge into one. The respective DOS, obtained from (61) with a symmetric and correspondingly tuned "unperturbed" IDS $N_0$, behaves near zero as $\delta^{1/3}$, and the respective limiting kernel is again defined by two functions (analogues of Ai and Ai$'$) given by a certain integral representation and related to the classical system of ODEs found by Garnier [90]. An infinite series of universality classes was found in the quantum field theory context while studying the double scaling limit [7]. They have to correspond to $O(n^{-2/(2p+1)})$ neighborhoods of non-generic soft edges where the DOS behaves as $\delta^{p-1/2}$, $p = 3, 5, \dots$ [89]. Interesting classes appear in number theory [16].
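The substitution rule for the soft edge can be written out explicitly. The following is a sketch under the assumption that (40) is the sine kernel in its usual form, expanded by the addition formula for the sine; the form of (40) is assumed here, not quoted from the text.

```latex
% Assumed form of (40), expanded by the addition formula:
\frac{\sin \pi(\xi-\eta)}{\pi(\xi-\eta)}
  = \frac{\sin\pi\xi\,\cos\pi\eta - \cos\pi\xi\,\sin\pi\eta}{\pi(\xi-\eta)} .
% Replacing  sin(pi x) -> Ai(x)  and  cos(pi x) -> pi Ai'(x)  in both arguments:
K_{\mathrm{soft}}(\xi,\eta)
  = \frac{\mathrm{Ai}(\xi)\,\pi\,\mathrm{Ai}'(\eta) - \pi\,\mathrm{Ai}'(\xi)\,\mathrm{Ai}(\eta)}{\pi(\xi-\eta)}
  = \frac{\mathrm{Ai}(\xi)\,\mathrm{Ai}'(\eta) - \mathrm{Ai}'(\xi)\,\mathrm{Ai}(\eta)}{\xi-\eta} .
```

Under this assumption the factors of $\pi$ cancel and one arrives at the expression commonly called the Airy kernel.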
This was a brief overview of universality classes. As for proofs of universality for various classes of the random matrix ensembles, the situation is much less satisfactory. The Airy kernel can in principle be obtained for unitary invariant ensembles by using the recent asymptotic results for orthogonal polynomials with the weight $\exp\{-nV\}$ [202, 204]. As for the Wigner ensembles, we mention recent results [93, 94], obtained by using the moment method and a rather sophisticated combinatorial analysis.

3.1.6 Random Matrix Theory and Random Operator Theory
Links between the RMT and the ROT are natural consequences of their common spectral content, although the emphases of these domains are somewhat different. We mention here papers [103, 97] devoted to the analysis of certain random operators (in fact coupled Wigner random matrices) whose IDS and other spectral characteristics asymptotically coincide with those of the deformed semicircle law (61). A similar model, motivated by the free probability context, was considered in [102]. As for the local regime, the statement that the pure point part of the spectrum of the Schrödinger operator with a random potential obeys the Poisson statistics (42) was proved in [100] for the whole spectrum in the one-dimensional case and in [99] for neighborhoods of the spectrum edges in the multidimensional case (strong localization regime). In [98] random matrices are an important component of the approach to the study of the absolutely continuous spectrum. Concerning interrelations between the pure point and the absolutely continuous spectrum, one of the most important and least understood topics of the ROT, a widely believed conjecture is that the Poisson local eigenvalue statistics is pertinent to the pure point spectrum (localized states), while the Wigner-Dyson local statistics (thus the RMT description) has to be expected under certain conditions in the case of the absolutely continuous spectrum (extended states), responsible for transport properties of disordered systems [14]. In this context the validity of the conjecture was recently addressed in physical papers [78, 128] (see also Sect. 3.2.2). The form of the local eigenvalue statistics at the border points (supposedly existing) between the pure point and the absolutely continuous components is a matter of discussion in the theoretical physics literature (see e.g. [14, 92]).
Random matrices can also be used to model the transition from the pure point to the absolutely continuous spectrum resulting from varying certain parameters, a problem that is so far out of the reach of rigorous analysis in the framework of the ROT. For instance, according to numerical and scaling arguments [96], supported by theoretical physics computations [101], eigenvectors of band random matrices (4) are localized if $b_n = o(n^{1/2})$ and are delocalized otherwise, and the ensemble of sparse random matrices, e.g. matrices with independent entries whose distribution has an atom at zero, can exhibit similar behavior that results from varying the size of the atom.

3.1.7 Non-Hermitian Matrices
The title is an abbreviation for all random matrices that do not belong to the classes discussed above, that is non-Hermitian, not real symmetric, quaternion real or respective unitary. These matrices are used to describe, for example, open physical systems [126], where the role of energy levels is played by resonances, whose real parts are characteristic energies while their imaginary parts determine the life times, the time delays, etc. A related domain is statistical scattering theory (see e.g. [118] for a version) that is of use in nuclear physics and in condensed matter [3, 14, 150]. Non-Hermitian chiral random matrices appear in quantum field theory [140].

First results on non-Hermitian matrices were obtained by Ginibre [110], who considered the Gaussian ensembles (17) of complex matrices and their real and quaternion analogues. In this case one can again integrate over the "angle" variables (eigenvectors) and obtain the joint probability of all (complex) eigenvalues, i.e. analogues of formula (52), although the respective arguments rely much more strongly on the Gaussian form of the density. However, in the case of complex normal matrices [105], defined by (17) with exponent $-n\,\mathrm{Tr}\,V(M^*M)$ (cf. (7)), this distribution can also be obtained. In this case one even has an analogue of the method of orthogonal polynomials, including analogues of the determinantal formulas (57)-(59), in which the role of orthogonal polynomials is played just by monomials $\{z^l\}_{l=0}^{\infty}$. The analogue of the semicircle law in the Gaussian case is the uniform distribution on the disc, known as the circular law [110]. This limiting eigenvalue distribution, as well as its generalization known as the elliptic law, was later obtained for a broad class of matrices with independent entries (see [1, 11, 106] for results and references).
The main technical tool in the study of the global regime is a generalization of the Stieltjes transform and the resolvent method proposed by Girko in the 80s (see his book [11]) and reducing the problem to the study of certain Hermitian matrices. However, in the complex case this transform is much harder to use because an eigenvalue can lie at any point of the complex plane. This makes the integrand of the transform singular, as well as certain expressions arising in the use of the resolvent method, and one needs special, often quite nontrivial, arguments to justify the method. Unfortunately, a number of interesting recent physics papers do not pay sufficient attention to this important point.

Because of the technical difficulties mentioned above, rigorous results on the local regime are not numerous. For the Gaussian complex matrices (17) one has the cubic law of the repulsion of the eigenvalues [113] instead of the quadratic law for the Hermitian case [19]. The same fact is valid for more general Gaussian ensembles [109] and for rotationally invariant ensembles of normal matrices [105], thus demonstrating a universality, although restricted, of this law for complex matrices. The law agrees well with the spacing distribution of generators of dissipative quantum maps in the chaotic regime [126] and with the local statistics of the lattice Dirac operator for non-zero chemical potential [140]. For real (but not symmetric) matrices we mention the fact that the number of real eigenvalues is of the order $\sqrt{n}$ and they are uniformly distributed over the whole real axis with the density $\sqrt{2/\pi}$ relative to $\sqrt{n}$ [10?]. We mention also an interesting asymptotic regime of weak non-Hermiticity found in [109] for the complex Gaussian matrices of the form $M_1 + iaM_2$, where $M_1$ and $M_2$ are i.i.d. GUE matrices distributed according to law (17). For $a = 0$ this is the GUE and for $a = 1$ this is ensemble (17). Thus, this is a version of problem (pe) of Sect. 2.3, reminiscent of the GOE-GUE transition discussed there.
In this case one can find all the correlation functions by using an interesting version of the orthogonal polynomial method, and if $a$ is scaled as $a = n^{-1/2}t$, one obtains in the scaling limit a nontrivial dependence of the local regime on the parameter $t$, giving the GUE results (see e.g. (47)) for $t = 0$ and converging pointwise to the correlation functions of ensemble (17) as $t \to \infty$. Physical aspects of non-Hermitian matrices are discussed in [9, 108, 112, 126, 140].

3.2 Physics

3.2.1 Nuclear Physics
This is the first branch of physics where random matrices were seriously used and which was a principal source of motivation of the RMT in the 50s-70s [119, 27]. This period is well documented in [3, 14, 19, 24, 117]. The main topic of initial interest was the repulsion of the nuclear energy levels and resonances, discovered experimentally in the 40s-50s. The RMT provided a sufficiently adequate theoretical description of the phenomenon in the framework of the local regime (see Sect. 2.3), although this description was phenomenological in the sense that it was not deduced from first principles, i.e. as a well defined and justified asymptotic regime of the many-body quantum mechanical problem. We note also that while the RMT predictions fit well the experimentally obtained histograms of nuclear level spacings, the overall density of nuclear levels is rather different from the semicircle law (see however [150]). Nevertheless, because of the universality the density can be regarded as a kind of fitting parameter of the theory that defines the energy scale for spacings (see formula (33)). This point of view was made explicit in Dyson's proposal of the early 60s to consider invariant ensembles of unitary matrices, having a priori the constant DOS.

Nuclear physics also originated other directions of the RMT that later acquired an independent and important status. One is the idea of complexity of the spectrum of many-body quantum systems that was suggested by N. Bohr in the nuclear physics context [3]. The idea got considerable attention in the 60s [24], being applied to atoms and molecules, and evolved in the 70s-80s into the vast domain now known as quantum chaos, which strongly uses the RMT. One has also to mention the stochastic fluctuations of the neutron scattering cross section, caused by strong overlap of nuclear resonances. The fluctuations were predicted by Ericson in the early 60s, who used a simple statistical model for the nuclear scattering matrix. Later more complete theories of this phenomenon were developed, including the introduction of new ensembles and the anticipation of several future developments, such as the weak localization, band matrices, random transfer and scattering matrices, parametric fluctuations, etc. (see [3, 14] for detailed discussions and references).

3.2.2 Complex Spectra (Quantum Chaos)
The field is flourishing and a considerable part of it intersects the RMT. The origin of the intersection goes back to nuclear physics [3, 24]. The evidence, growing during the 60s-70s, on the pertinence of random matrices in the description of the spectrum fluctuations of quantum and wave systems and on connections with their classical dynamics has crystallized in two ideas (conjectures), according to which the local energy level statistics of classically integrable quantum systems are close to the Poisson distribution [122], and the local energy level statistics of quantum systems whose classical counterparts are chaotic (nicely ergodic) are well described by the Wigner-Dyson statistics [124]. The physics literature on the subject is vast and covers a wide variety of systems of different origin and nature, having chaotic spectra and described by the RMT, thus demonstrating the universality of the phenomenon [121, 125, 126, 129]. In particular, the volume of numerical evidence on the validity and refinement of these conjectures is impressive, but theoretical proofs are scarce so far. We mention papers [129, 127] on the validity of the Poisson statistics and physical papers [78, 128] on the justification of the Wigner-Dyson statistics.

3.2.3 Quantum Field Theory
A number of rather complete physical reviews and proceedings describe techniques and results related to random matrices in quantum field theory. They treat the large-color approximation of the gauge field theory, non-perturbative formulations of two-dimensional quantum gravity and the string theory, description of certain critical regimes and related topics of integrable systems, statistical mechanics, topology, analysis, etc. [4, 6, 7, 133, 138].

Most of these developments deal with matrix integrals over Hermitian matrices, arising in various approximation schemes. Historically the first scheme was suggested by 't Hooft [134], who proposed to replace the gauge group $U(3)$ by $U(n)$ with the subsequent asymptotics $n \to \infty$. This led to ensemble (7) with a polynomial $V$. The interrelation of the approximation with the eigenvalue distribution of this ensemble, as well as with combinatorics and topology of certain graphs, was given in [132, 2] (see also a rather complete reprint collection [4]). Next came the matrix model approach to the two-dimensional Euclidean quantum gravity and low-dimensional strings, based on their discretized versions (lattice approximations, replacement of the functional integral by a sum over triangulations) and on their formal equivalence to (7) and to more complex matrix models, like the multi-matrix models (16); then the continuum limit or the sum over all genera was carried out via a special limiting procedure, known as the double scaling limit. The random matrix content of the limit in the simplest cases is an analysis of the local regime in neighborhoods of the spectrum edges of ensemble (7) for special $V$'s [89], and its mathematical physics status has yet to be understood.

However, spectral aspects are somewhat implicit and interpretational in this part of quantum field theory, whose primary concern is the behavior of matrix integrals as functions of various parameters. These aspects became explicit in the recent approach to a special phase transition (spontaneous breaking of the chiral symmetry) in the gauge field theory coupled to fermions [139]. After certain approximations the problem reduces to the study of the spectrum structure of the Dirac operator in a random (quenched) gauge field in a neighborhood of the origin of the spectral axis. Thus, in fact, we obtain a problem of the ROT, but a rather hard one. Therefore the problem is further simplified by replacing the Dirac operator by the random matrix ensemble consisting of chiral matrices
$$M_n = \begin{pmatrix} 0 & C \\ C^* & 0 \end{pmatrix}, \tag{64}$$
that preserve the chiral symmetry of the Dirac equation relevant to the problem, in particular the symmetry of its spectrum with respect to the origin. This can be viewed as one more implementation of universality, according to which a pertinent random matrix ensemble is determined mainly by the symmetry of the problem. As a result, we come to the study of the local spectral statistics of these matrices near the origin, that is to a version of problem (p3). Recent, more detailed studies based on the concepts of the scaling theory of localization led to physical criteria of applicability of the RMT approach to this problem [136] (as in the analogous problem in condensed matter theory, see Sect. 3.2.6).

3.2.4 Statistical Physics
We remark first that most of the random matrix results in quantum field theory have a statistical mechanics interpretation, just because the respective matrix integrals of exponentials of matrix polynomials can be interpreted as partition functions of some statistical mechanics models. This concerns both the integrals written in terms of the matrices themselves and those written in terms of their eigenvalues, i.e. after integration over the "angle" variables, when we obtain a one-dimensional log-gas in an external field (see (54)). Related interpretations view these integrals as spin models (Ising, Potts) on a random lattice (or, equivalently, disordered spin models with annealed disorder) [143] and/or as partition functions of random surfaces, and lead to a number of new critical regimes and phase transitions [6, 13, 141, 145]. There are also one-dimensional many-body quantum models, like the Calogero-Sutherland model, and classical models, like the Pechukas gas (see [14, 150, 147]), that are related to the Fokker-Planck equation for the joint eigenvalue probability density of parametric random matrix ensembles. There are also indications that exact solvability of statistical mechanics models is related to the amount of stochasticity of the spectra of their transfer matrices [144]. On the other hand, random matrix facts (eigenvalue distribution, norm estimates) are of use in the study of disordered spin systems, spin glasses and neural networks (see e.g. [142, 146]), where the GOE (5) and the Wishart matrices (11) play the role of the interaction matrices in the most widely studied models of the field [33].

3.2.5 Condensed Matter
These applications of the RMT now comprise a very active and broad field, well described in recent reviews [14, 148, 150]. The idea to use the Wigner-Dyson statistics in the description of electronic properties (in fact, the electric polarization) of small metallic particles was suggested by Gorkov and Eliashberg [152], in contradistinction to earlier suggestions by Fröhlich and Kubo to use the equidistant and the Poisson statistics respectively (see e.g. the lecture by B. Mühlschlegel in [125]). This suggestion was later substantiated and strongly developed by Efetov (see the recent book [9] for results and references), who gave a theoretical physics proof of the validity of these statistics (corresponding to all three symmetry classes $\beta = 1, 2, 4$) for spectral fluctuations in small disordered metallic particles by applying the Grassmann integration method. The method is now one of the most efficient in the field. It is based on a certain representation of the resolvent (49) or other generating functions as the "double" Grassmann-Riemann integral of the exponential of the quadratic form defined by the matrix, thus allowing the respective expectation to be efficiently computed.

A crucial link between spectral fluctuations and the physically most important transport phenomena of disordered systems, given by Thouless in the 70s, determined to a large extent the further development of the theoretical physics of disordered systems (see [31] for a general review and [149] for a discussion related to the field). In the case of macroscopic systems this is the subject of the localization theory, whose mathematical aspects are studied by the ROT. In the case of systems of restricted geometry, whose experimental study became possible due to the progress of microelectronics and whose theoretical study was strongly motivated by the discovery of universal (independent of the sample size and/or degree of disorder) conductance fluctuations, this is mesoscopics.
The mesoscopic regime is defined by the condition that the inelastic scattering length is larger than the system size. As a result the electrons move coherently through the sample despite the multiple chaotic scattering either by impurities or by the sample boundary. The regime includes three subregimes: localized, diffusive and ballistic, the latter already in the scope of quantum chaos. This leads to a wide variety of problems and results related to the RMT and to its links with other techniques of theoretical physics. There are two basic non-perturbative approaches, dealing mostly with the RMT models of Hamiltonians of mesoscopic samples (quantum dots and quantum wires) and of their scattering and/or transfer matrices, and using respectively (and mostly) the Grassmann integration technique and the technique of the phenomenological (based on the maximum entropy principle) Fokker-Planck type equation for the joint probability density of eigenvalues as a function of the wire length [151, 153]. An equation of this type was first used by Dyson in the early 60s (see paper 60 in [8]) to describe the dependence of the random matrix eigenvalues on a slowly varying parameter (transition between local statistics, between ensembles, etc., see problem (p7) in Sect. 2.3). Besides eigenvalues, eigenvectors are also of considerable interest in this field. For an account of recent results see [14, 150, 154].

3.3 Mathematics

3.3.1 Probability and Statistics
Statistics, more precisely multivariate statistical analysis, is historically the first field where random matrices appeared. Namely, let $Y^\mu$, $\mu = 1, \dots, p$, be a collection of i.i.d. random vectors of $\mathbb{R}^n$ (a sample of size $p$ from an $n$-variate distribution). Then the sample mean vector is $\bar{Y} = p^{-1}\sum_{\mu=1}^{p} Y^\mu$, and the sample covariance matrix $S = \{S_{jk}\}_{j,k=1}^{n}$ is
$$S_{jk} = (p-1)^{-1}\sum_{\mu=1}^{p}\bigl(Y^\mu_j - \bar{Y}_j\bigr)\bigl(Y^\mu_k - \bar{Y}_k\bigr). \tag{65}$$
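Formula (65) translates directly into code. The following is a minimal sketch in plain Python, with the sample given as a list of $p$ vectors of length $n$; the function names are hypothetical and chosen only for this illustration.

```python
def sample_mean(vectors):
    """Sample mean vector: componentwise average of the p sample vectors."""
    p = len(vectors)
    n = len(vectors[0])
    return [sum(v[j] for v in vectors) / p for j in range(n)]


def sample_covariance(vectors):
    """Sample covariance matrix S_jk with the (p - 1)^{-1} normalization of (65)."""
    p = len(vectors)
    if p < 2:
        raise ValueError("at least two sample vectors are needed")
    n = len(vectors[0])
    mean = sample_mean(vectors)
    # Centre every sample vector at the sample mean.
    centered = [[v[j] - mean[j] for j in range(n)] for v in vectors]
    return [[sum(c[j] * c[k] for c in centered) / (p - 1) for k in range(n)]
            for j in range(n)]


if __name__ == "__main__":
    sample = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
    print(sample_mean(sample))
    print(sample_covariance(sample))
```

The division by $p - 1$ rather than $p$ compensates for the one degree of freedom used to estimate the mean.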
$\bar{Y}$ and $S$ are unbiased estimates of the expectation $E\{Y^\mu\}$ and the covariance matrix $\Sigma = \{\Sigma_{jk}\}_{j,k=1}^{n}$, $\Sigma_{jk} = \mathrm{Cov}\{Y^\mu_j, Y^\mu_k\}$. It can be shown that if the random vectors are Gaussian, then $\bar{Y}$ and $S$ are independent and the distribution of $S$ is the same as the distribution of the matrix $(n/m)\,S_{n,m}$, where $m = p - 1$, and $S_{n,m}$ is matrix (10). In this case $S$ has the Wishart distribution, well known in statistics, a multivariate analogue of the $\chi^2$-distribution. The eigenstructure of these random matrices is the subject of the principal component analysis [163, 165], dating back to the early 30s. This is why already in these early studies a number of important formulas for various spectral characteristics of real symmetric matrices were obtained and used [165]. In particular, it was shown that the joint probability distribution of eigenvalues of ensembles whose law is invariant with respect to orthogonal transformations has the form (52) with $\beta = 1$ and with the factor $\exp\{-n\sum_{j} V(\lambda_j)\}$ replaced by a non-negative symmetric function $w_n(\lambda_1, \dots, \lambda_n)$. This and related results valid for arbitrary finite $n$ and $m$ were obtained in statistics by several methods, including the use of the differential-geometric structure of the orthogonal group (see e.g. [163] for a review), and were later rederived in several contexts. However, the asymptotic analysis of the respective formulas has been carried out in statistics only for a large number of observations $m \to \infty$, but for a fixed, $m$-independent number of observable parameters $n$. The regime $m \to \infty$, $n \to \infty$, $m/n \to a > 0$ was first considered in [55] in the spectral context, motivated by Wigner's results for the Gaussian ensembles and by certain problems of the ROT and its applications (see e.g. [32, 34]). However, at approximately the same time, Kolmogorov, apparently independently, proposed to study the same asymptotic regime in statistics (see e.g. [159], where this idea was applied to the problem of computing the probability of misclassification in the discriminant analysis). For the modern state of this branch of multivariate analysis see [S, 60, 1]. Note that random matrices have also arisen in the statistical analysis of errors of linear numerical algorithms since the 40s [77] and in studies of computational complexity [75, 168].
As for a non-conventional probabilistic setting, random matrices are an important ingredient of a version of non-commutative probability, known as free probability. Many limiting eigenvalue distributions are actually the distributions of free random variables. The field is motivated by studies in operator algebras related to free groups and leads to new concepts and results of probabilistic and functional analytic content as well as to new tools and results in operator algebras [170, 164].

3.3.2 Spaces, Operators, Algebras
The local theory of Banach spaces, as well as related topics of the structure of linear operators and the geometry of convex bodies of arbitrary finite dimensions, efficiently uses probabilistic concepts and techniques to prove existence, "typicality" and "stabilization" of a wide variety of geometrical objects and to find optimal bounds for their quantitative characteristics (see e.g. [169] for reviews of the approach). This includes certain randomized linear geometric procedures, in particular random matrices with Gaussian entries, and the necessity to have sufficiently precise bounds for their norms [166] and for singular numbers, i.e. square roots of eigenvalues of the Wishart matrices (11) and their analogues for non-Euclidean metrics. Exponential bounds for singular numbers of Gaussian matrices that are close to extreme ones were obtained in [168] and led to several important consequences concerning the structure of finite-dimensional normed spaces and their bases. These bounds were subsequently improved and used in the analysis of questions of the theory of computational complexity, reducing to the question of the exact order of $E\{\log(\|A^{-1}\| \cdot \|A\|)\}$ (see [168] and references therein). Another recent topic of operator algebras concerns sharp estimates of the order of magnitude of norms of linear combinations of elements of an algebra with random coefficients that are random Gaussian matrices [167].

One more active field is free probability, a part of the theory of operator algebras, especially those that originated from free groups. Roughly speaking, a free probability space is an algebra (a von Neumann algebra) "modeled" by random matrices of large order, and the expectation functional is modeled by the operation $E\{n^{-1}\mathrm{Tr}\,\dots\}$, where $E\{\dots\}$ denotes the expectation with respect to random matrices (Gaussian or unitary most often).
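The role of the functional $n^{-1}\mathrm{Tr}$ can be illustrated numerically. The following is a rough sketch, not a construction from the text: it draws one real symmetric Gaussian matrix and evaluates $n^{-1}\mathrm{Tr}\,M^2$ and $n^{-1}\mathrm{Tr}\,M^4$. The normalization (off-diagonal variance $1/n$, diagonal variance $2/n$) and the function names are assumptions; with this normalization the limiting semicircle law is supported on $[-2, 2]$ and its second and fourth moments are 1 and 2.

```python
import math
import random


def wigner_matrix(n, rng):
    """Real symmetric n x n Gaussian matrix (assumed normalization:
    off-diagonal variance 1/n, diagonal variance 2/n)."""
    m = [[0.0] * n for _ in range(n)]
    for j in range(n):
        m[j][j] = rng.gauss(0.0, math.sqrt(2.0 / n))
        for k in range(j + 1, n):
            m[j][k] = m[k][j] = rng.gauss(0.0, math.sqrt(1.0 / n))
    return m


def normalized_trace_moments(m):
    """Return (n^{-1} Tr M^2, n^{-1} Tr M^4) for a real symmetric matrix M."""
    n = len(m)
    # Since M is symmetric, (M^2)_{jk} is the dot product of rows j and k.
    m2 = [[sum(a * b for a, b in zip(m[j], m[k])) for k in range(n)]
          for j in range(n)]
    tr2 = sum(m2[j][j] for j in range(n)) / n
    # M^2 is symmetric too, so Tr M^4 is the sum of squares of its entries.
    tr4 = sum(x * x for row in m2 for x in row) / n
    return tr2, tr4


if __name__ == "__main__":
    rng = random.Random(2000)
    print(normalized_trace_moments(wigner_matrix(100, rng)))
```

Already for a single matrix of moderate size the two values should lie close to 1 and 2, since the fluctuations of normalized traces are of the order $1/n$; no averaging over the ensemble is performed in this sketch.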
Important problems concerning the von Neumann algebras related to free groups were solved or substantially advanced using the free probability approach, with random matrices as an efficient analytical model; see in particular [170, 164].

3.3.3 Number Theory
It is an idea of Polya and Hilbert that the nontrivial zeros of Riemann's zeta function comprise the spectrum of an operator D acting on an infinite-dimensional Hilbert space. Riemann's hypothesis would then be equivalent to the self-adjointness of $i(D - \mathrm{Id}/2)$. Later, in the 1970s, the idea was developed by Montgomery [174] (see also the paper [175] for related events and reflections). Let $\cdots < \gamma_{-1} < 0 < \gamma_1 < \cdots < \gamma_n < \cdots$ be the sequence of the imaginary parts of the nontrivial zeros of $\zeta$ ($\gamma_{-j} = -\gamma_j$), and let $\tilde\gamma_j = \gamma_j \log \gamma_j / 2\pi$, so that the $\tilde\gamma_j$ have mean spacing 1 (cf. the renormalization procedure in (33)). It was shown that, assuming the Riemann hypothesis, we have for the limiting pair correlation distribution

$$\lim_{N\to\infty} \frac{1}{N} \sum_{1 \le j \ne k \le N} f(\tilde\gamma_j - \tilde\gamma_k) = \int_{-\infty}^{\infty} f(x) \left[ 1 - \left( \frac{\sin \pi x}{\pi x} \right)^2 \right] dx,$$

valid for any f whose Fourier transform is compactly supported in (-1, 1). It was also conjectured that the above equality should hold without restriction on the size of the support of the Fourier transform of f. This became known as the pair correlation conjecture; it was later largely confirmed numerically [176]. A remarkable feature of the above relation is that it coincides with that for the normalized eigenvalues of the GUE random matrices, i.e. for $\tilde\gamma_j = \lambda_j / n\rho_n(\lambda_j)$, see formulas (45)-(47). This established a link between Riemann's hypothesis and the RMT and thus indirectly supported the Polya-Hilbert view.

Similar and more complete results, establishing analogous relations for all higher correlation functions (44) (known as the GUE conjecture), but again with restrictions on the support of the test functions, were obtained in [177], not only for the zeros of the zeta function but also for a large class of Dirichlet L-functions. The restriction on the test functions in these results guarantees the dominating contribution of the diagonal terms of multiple sums over primes. To obtain the general result one needs to consider the off-diagonal terms, which brings in the finer structure of primes. In [172] the problem of computing the higher correlators of zeros from the off-diagonal terms in the prime sums is considered (partly heuristically) by applying a conjecture of Hardy and Littlewood on prime pairs to an explicit formula relating primes and zeros. The key observation is that the multiple sums over primes determining the higher correlators of zeros can be effectively analyzed by using only the binary properties of primes. In the recent book [16] the GUE law is established unconditionally for the zeros of families of L-functions of geometric objects defined over finite fields, in which case the analog of Riemann's hypothesis is known by the work of A. Weil in the case of curves and of Deligne in the general case. Related questions are discussed in the review works [171, 173, 129].
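The comparison described in this subsection can be sketched numerically. The snippet below is an illustration, not taken from the original text. It evaluates the pair statistic $(1/N)\sum_{j \ne k} f(x_j - x_k)$ for a finite list of points normalized to mean spacing 1, together with the integral of f against the GUE pair density $1 - (\sin \pi x / \pi x)^2$, which is the standard form of the right-hand side in Montgomery's result. The function names, the midpoint quadrature and its cutoff are assumptions chosen for illustration, and no zero data are included.

```python
import math


def gue_pair_density(x):
    """Limiting GUE pair-correlation density 1 - (sin(pi x) / (pi x))^2."""
    if x == 0.0:
        return 0.0
    s = math.sin(math.pi * x) / (math.pi * x)
    return 1.0 - s * s


def pair_statistic(points, f):
    """(1/N) * sum over ordered pairs j != k of f(points[j] - points[k])."""
    n = len(points)
    total = 0.0
    for j in range(n):
        for k in range(n):
            if j != k:
                total += f(points[j] - points[k])
    return total / n


def gue_prediction(f, cutoff=20.0, steps=40000):
    """Midpoint-rule approximation of the integral of f(x) * gue_pair_density(x)
    over [-cutoff, cutoff]; f is assumed to be negligible outside this interval."""
    h = 2.0 * cutoff / steps
    total = 0.0
    for i in range(steps):
        x = -cutoff + (i + 0.5) * h
        total += f(x) * gue_pair_density(x)
    return total * h
```

Given a list of normalized zeros or unfolded eigenvalues, one would compare `pair_statistic(points, f)` with `gue_prediction(f)`. The test function f is supplied by the user and, for the proven statement, should have a Fourier transform supported in (-1, 1).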
3.3.4 Integrable Systems

We discuss here classical integrable systems. Quantum integrable systems were mentioned in Sect. 3.2.4. There are several origins of integrable systems in the RMT and its applications. One of them is in quantum field theory, where it was found that matrix integrals over Hermitian or unitary matrices with the weights (7), (16), and (13), regarded as functions of the coefficients of the potential V, solve certain integrable ODEs and PDEs and are τ-functions of some integrable systems: the Korteweg-de Vries (KdV), the Kadomtsev-Petviashvili, the Toda, etc. These topics are extensively discussed in [7, 137, 178, 180]. Next, studies of critical regimes of quantum field theory and of the convergence of lattice approximations led to the double scaling limit, described via the hierarchies of conservation laws of integrable PDEs and the flows defined by them (e.g. the higher KdV flows) and the related solutions of the canonical commutation relation [P, Q] = 1 [7]. The RMT interpretation of these findings is the local regime near a non-generic spectrum edge, where the DOS goes to zero as $\delta^{p-1/2}$, where δ is the distance to the edge and p > 1 is an odd integer [89].

Further, writing the Fredholm determinant of the sin-kernel (40) and the analogous ones for a union of intervals, we obtain according to [88, 90, 179] that the determinants giving the limit hole probability (see e.g. (41)) are expressed via solutions of a completely integrable system of PDEs, provided that the respective functions of the kernel (analogues of $\sin \pi\xi$, $\mathrm{Ai}(\xi)$, etc.) satisfy certain ODEs. This allows one, in particular, to develop efficient procedures for the asymptotic analysis of the respective determinants. Related subjects concern the various transcendents, the Painleve and others, appearing in these expressions.

3.3.5 Topology
Topological and related combinatorial aspects of the RMT grew up mostly from quantum field theory, beginning with the use of surfaces of non-zero genus in the large-color approximation of quantum chromodynamics, where the formal perturbation theory series can be resummed as a series in $n^{-2}$, n being the number of colors, and the power of $n^{-2}$ is the genus of the surface on which one has to draw the respective Feynman diagrams. The expansion is known as the topological expansion [2, 7, 135]. Similar techniques were also developed in [183, 186] to compute the Euler characteristic of the moduli spaces (spaces of parameters) $M_{g,k}$ of genus-g algebraic curves with k marked points (or punctures) and, eventually, to prove [185] Witten's conjecture [187] (also deduced from the analysis of the matrix models of quantum gravity) on the intersection numbers (cohomology classes) of $M_{g,k}$, including new matrix integrals [184, 185] and the link of their large-n limits with the (higher) Korteweg-de Vries evolution PDEs in the parameters of these integrals [181, 184]. From the point of view of quantum field theory, this result can be regarded as an equivalence between a version of quantum gravity known as topological gravity and the Hermitian matrix model defined by (7) in the double scaling limit [7].

3.3.6 Combinatorics and the Large-n Group Representations
The idea of interpreting matrix integrals over Hermitian matrices as generating functions of numerical characteristics of classes of graphs drawn on Riemann surfaces of nonzero genus dates back to 't Hooft's paper [134] on the planar approximation in quantum field theory and was elaborated in [132, 2]. In this framework the graphs are just Feynman diagrams of the zero-dimensional gauge field theory whose "gauge" group is U(n); the generating function of all graphs is the partition function, and that of all connected graphs is the free energy of the theory (see also the later paper [192]). In their combinatorial aspects these findings are related to earlier results in the field [196] and were followed by the application of matrix integrals to the problem of enumeration of maps, i.e. graphs drawn on (embedded in) a surface, according to its genus [183]. For the combinatorial content of these findings, including results following from the analysis of more complex matrix models (with non-polynomial potentials, multi-matrix models, etc.), see e.g. [198], and for aspects related to quantum field theory, topology and integrable systems see [7].

Another recent direction deals with asymptotic problems of enumerative combinatorics and the representation theory of symmetric groups of large order. Here there are a number of interesting problems, such as Ulam's problem on the length of the longest increasing subsequence of a random permutation with respect to the Haar measure on the permutation group, the asymptotic shape of the Young tableaux and the respective fluctuations, etc. (see e.g. [197]). These problems were actively studied and solved by using the RMT and related techniques [189, 191, 195]. A rather general point of view on these problems and their links with group representations is presented in [190]. The large-order structure of representations of classical groups, in particular certain summation procedures over characters of irreducible representations and their combinatorial content, was considered in [192]. Several problems of enumerative combinatorics are connected with random nonnegative matrices, their permanents, etc. (see the recent book [193]).

A considerable combinatorial component, also related to the RMT, is present in the free probability developments [194]. On the other hand, the RMT itself has needed a considerable amount of combinatorial analysis and graph enumeration when dealing with the moment method, ever since its introduction by Wigner (see the recent works [1, 71, 93]).

3.3.7 Analysis
Analytical components of the RMT are numerous. We mention just several. Matrix integral representations of orthogonal polynomials go back to the classics [36]. The multidimensional integrals involving the joint eigenvalue density of invariant ensembles are related to Selberg's integrals (see Chapter 17 of [19] for a detailed discussion). Results on the variational method of determining the DOS [50, 203] are based on the minimization of the electrostatic energy (55), which is an important object of real analysis [35].

The orthogonal polynomial method, one of the most powerful in the RMT, requires asymptotic formulas for polynomials orthogonal with respect to the weight $\exp\{-nV(\lambda)\}$, thereby connecting the RMT with semi-classical and other methods of modern asymptotic analysis (see the recent book [203]). The RMT was one of the serious motivations for, and applications of, the recent progress in this field, which is based on a special integrable system [205] related to those of quantum field theory (known there as the (pre)string equations), on the Riemann-Hilbert form of the solution of this system by the inverse spectral method, and on the subsequent analysis of strongly oscillating Cauchy integrals by the method now known as the non-linear steepest descent method (see [202, 203, 204]). Analogously, several topics of the RMT (existence of the IDS, the central limit theorem for the IDS and related quantities, the large-s behavior of the hole probability) use the weak and strong Szego theorems and their analogues [69, 199, 200], as well as Painleve transcendents and connection formulas for them [26].

In free probability (see Sects. 3.3.1, 3.3.2 and [170, 164]), where the RMT is one of the efficient analytical models, new binary operations on probability measures have appeared, related to the addition and multiplication of free variables, and thus of random matrices (see (61) and (62) as simple examples of these operations), and now known as the free additive and multiplicative convolutions. This led to a new domain of harmonic analysis, which studies various properties of these operations.

It is also worth mentioning the beautiful links between the integrability aspects of the RMT and the isomonodromy method of the analytic theory of differential equations, the related asymptotic analysis, etc. [138, 205], and the skew-orthogonal and non-local polynomials appearing in the analysis of various ensembles [19].
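The free additive convolution mentioned in this subsection can be computed at the level of moments. The sketch below is an illustration that is not part of the original text. It relies on the standard fact that free cumulants add under free additive convolution, and on the moment-cumulant recursion $m_n = \sum_{s=1}^{n} \kappa_s\,[z^{n-s}]\,M(z)^s$ with $M(z) = \sum_i m_i z^i$ and $m_0 = 1$. All function names are hypothetical.

```python
def _power_coeffs(series, power, degree):
    """Coefficients up to z^degree of (sum_i series[i] z^i) ** power."""
    result = [1] + [0] * degree
    for _ in range(power):
        new = [0] * (degree + 1)
        for i, a in enumerate(result):
            if a == 0:
                continue
            for j in range(degree + 1 - i):
                new[i + j] += a * series[j]
        result = new
    return result


def moments_from_free_cumulants(kappa, order):
    """kappa[s] is the s-th free cumulant (kappa[0] is ignored).
    Returns the moments [m_0, ..., m_order] with m_0 = 1."""
    m = [1] + [0] * order
    for n in range(1, order + 1):
        # Only m_0 .. m_{n-1} enter the right-hand side, so m[n] can be filled in place.
        m[n] = sum(kappa[s] * _power_coeffs(m, s, n - s)[n - s] for s in range(1, n + 1))
    return m


def free_cumulants_from_moments(m):
    """Invert the recursion: m = [1, m_1, ..., m_order] -> [0, kappa_1, ..., kappa_order]."""
    order = len(m) - 1
    kappa = [0] * (order + 1)
    for n in range(1, order + 1):
        lower = sum(kappa[s] * _power_coeffs(m, s, n - s)[n - s] for s in range(1, n))
        kappa[n] = m[n] - lower  # the s = n term enters with coefficient 1
    return kappa


def free_additive_convolution(m_a, m_b):
    """Moments of the free additive convolution of two measures given by their moments."""
    order = min(len(m_a), len(m_b)) - 1
    k_a = free_cumulants_from_moments(m_a[: order + 1])
    k_b = free_cumulants_from_moments(m_b[: order + 1])
    return moments_from_free_cumulants([x + y for x, y in zip(k_a, k_b)], order)
```

For instance, convolving the unit-variance semicircle moments `[1, 0, 1, 0, 2]` with themselves gives `[1, 0, 2, 0, 8]`, the moments of the semicircle law of variance 2.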
Acknowledgments

I am thankful to the many colleagues with whom I discussed the topics of the paper at various stages of its writing.

Appendix

The Greek word παράδειγμα means an example or pattern that serves for future work. This is how the word is widely used, say, in the Encyclopedia Britannica and in many scientific writings, as can be seen, for instance, from the titles extracted from standard databases. As a technical term it is used in linguistics to denote the system of all inflectional forms of a word (declensions of a noun, conjugations of a verb, etc.). The word acquired a much broader (and vaguer) meaning after T. Kuhn's famous book (c), where he describes the history of science as a cyclic process: there are periods of "normal science", characterized by a "paradigm", i.e. by a kind of consensus, a system of rules (agreements) on what phenomena are relevant, what problems are important, what counts as a solution of a problem, etc., and there are scientific revolutions, resulting in a paradigm shift; a new paradigm is to a large extent inconsistent with the old one.

Although one may doubt the extent of validity of Kuhn's cycle theory (d), the term itself became much more widely used (even fashionable), often meaning a complex of ideas and attitudes that is pertinent to a certain sufficiently large branch of science and may be only partly formalized, but that has strong heuristic and even aesthetic potential and a sufficiently broad range of applicability, triggering interesting theoretical developments and shedding light on many experimental and numerical findings. This usage can also be seen from the titles of publications drawn from databases. Examples: chaos paradigm, energy landscape paradigm, scaling paradigm, soliton paradigm, string paradigm, spin glass paradigm. I feel that my usage of the word is close to this usage.

The above may seem trivial to most readers. On the other hand, discussing the title with a number of colleagues, I found that different people may understand the word differently or even not understand it at all (e). In view of that, the material of the Appendix may be useful.

(c) T.S. Kuhn, The Structure of Scientific Revolutions, University of Chicago Press, Chicago, 1962.
(d) S. Weinberg, The revolution that didn't happen, New York Review of Books, 8 Oct. 1998.
(e) No wonder. According to certain critics, in Kuhn's book the word is used in more than twenty ways.

References

1 Books, Proceedings, Reviews

1.1 General Topics of the Random Matrix Theory

1. Z.D. Bai, Methodologies in spectral analysis of large-dimensional random matrices: a review, Statist. Sinica 9 (1999) 611-677.
2. D. Bessis, C. Itzykson, J.-B. Zuber, Quantum field theory techniques in graphical enumeration, Adv. in Appl. Math. 1 (1980) 109-157.
3. T.A. Brody, J. Flores, J.B. French, P.A. Mello, A. Pandey, S.S.M. Wong, Random-matrix physics: spectrum and strength fluctuations, Rev. Modern Phys. 53 (1981) 385-479.
4. E. Brezin, S.R. Wadia (Eds.), The Large N Expansion in Quantum Field Theory and Statistical Physics. From Spin Systems to 2-dimensional Gravity, World Scientific, River Edge, NJ, 1993.
5. J.E. Cohen, H. Kesten, C.M. Newman (Eds.), Random Matrices and Their Applications, Contemporary Mathematics 50, AMS, Providence, RI, 1986.
6. F. David, P. Ginsparg, J. Zinn-Justin (Eds.), Fluctuating Geometries in Statistical Mechanics and Field Theory, North-Holland, Amsterdam, 1996.
7. P. Di Francesco, P. Ginsparg, J. Zinn-Justin, 2D gravity and random matrices, Phys. Rep. 254 (1995) 1-133.
8. F.J. Dyson, Selected Papers, AMS, Providence, RI, 1996.
9. K. Efetov, Supersymmetry in Disorder and Chaos, Cambridge University Press, Cambridge, 1997.
10. R. Fernandez, J. Frohlich, A. Sokal, Random Walks, Critical Phenomena, and Triviality in Quantum Field Theory, Springer, Berlin, 1992.
11. V.L. Girko, Theory of Random Determinants, Kluwer, Dordrecht, 1991.
12. U. Grenander, Probabilities on Algebraic Structures, Wiley, NY, 1963.
13. D.J. Gross, T. Piran, S. Weinberg (Eds.), Two-dimensional Quantum Gravity and Random Surfaces, World Scientific, River Edge, NJ, 1992.
14. T. Guhr, A. Mueller-Groeling, H.A. Weidenmueller, Random matrix theories in quantum physics: common concepts, Phys. Rep. 299 (1998) 189-425.
15. E. Kanzieper, V. Freilikher, Spectra of large matrices: a method of study, Diffuse Waves in Complex Media, Kluwer, Dordrecht, 1999, pp. 165-211.
16. N. Katz, P. Sarnak, Random Matrices, Frobenius Eigenvalues, and Monodromy, AMS, Providence, RI, 1999.
17. H. Kunz, Matrices Aleatoires en Physique, Presses Polytechniques et Universitaires Romandes, Lausanne, 1998.
18. E. Lieb, D.C. Mattis, Mathematical Physics in One Dimension, Academic Press, NY, 1966.
19. M.L. Mehta, Random Matrices, Academic Press, Boston, MA, 1991.
20. V. Malyshev, Probability related to quantum gravity. Planar pure gravity, Russ. Math. Surveys 55 (1999) 3-46.
21. W.H. Olson, V.R.R. Uppuluri, Asymptotic distribution of eigenvalues of random matrices, Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Vol. III, Univ. California Press, Berkeley, CA, 1972, pp. 615-644.
22. L. Pastur, Eigenvalue distribution of random matrices: Some recent results, Ann. l'Inst. Henri Poincare (Phys. Theor.) 64 (1996) 325-337.
23. L. Pastur, Spectral and probabilistic aspects of matrix models, Algebraic and Geometric Methods in Mathematical Physics, Kluwer, Dordrecht, 1996, pp. 207-247.
24. C.E. Porter, Statistical Theories of Spectra: Fluctuations, Academic Press, NY, 1965.
25. Proceedings of the IV Wigner Symposium, World Scientific, River Edge, NJ, 1996.
26. C. Tracy, H. Widom, Introduction to random matrices, Geometric and Quantum Aspects of Integrable Systems, Lecture Notes in Phys. 424, Springer, Berlin, 1993, pp. 103-130.
27. E. Wigner, Random matrices in physics, SIAM Rev. 9 (1967) 1-23.

1.2 Related and Auxiliary Topics

28. N.I. Akhiezer, I.M. Glazman, Theory of Linear Operators in Hilbert Space, Dover, NY, 1993.
29. A. Bottcher, B. Silbermann, Analysis of Toeplitz Operators, Springer, Berlin, 1990.
30. V. Ivrii, Microlocal Analysis and Precise Spectral Asymptotics, Springer-Verlag, Berlin, 1998.
31. P. Lee, T.V. Ramakrishnan, Disordered electronic systems, Rev. Mod. Phys. 57 (1985) 287-337.
32. I.M. Lifshitz, S.A. Gredeskul, L.A. Pastur, Introduction to the Theory of Disordered Systems, Wiley, NY, 1988.
33. M. Mezard, G. Parisi, M. Virasoro, Spin Glass Theory and Beyond, World Scientific, NJ, 1987.
34. L. Pastur, A. Figotin, Spectra of Random and Almost-Periodic Operators, Springer, Berlin, 1992.
35. E.B. Saff, V. Totik, Logarithmic Potentials with External Fields, Springer, Berlin, 1997.
36. G. Szego, Orthogonal Polynomials, AMS, Providence, RI, 1975.
37. H. Weyl, Classical Groups, Princeton University Press, Princeton, 1948.

2 Random Matrix Theory

2.0 Ensembles, Quantities, Methods

38. R. Balian, Random matrices and information theory, Nuovo Cimento 57 (1968) 183-193.
39. E. Bogomolny, O. Bohigas, M.P. Pato, Distribution of eigenvalues of certain matrix ensembles, Phys. Rev. E (3) 55 (1997) 6707-6718.
40. G. Bonnet, F. David, Renormalization group for matrix models with branching interactions, Nuclear Phys. B 552 (1999) 511-528.
41. A. Borodin, Biorthogonal ensembles, Nuclear Phys. B 536 (1999) 704-732.
42. P. Cizeau, J.-P. Bouchaud, Theory of Levy matrices, Phys. Rev. E 50 (1994) 1810-1817.
43. T. Guhr, A. Müller-Groeling, Spectral correlations in the crossover between GUE and Poisson regularity: on indication of scales, J. Math. Phys. 38 (1997) 1870-1887.
44. V. Kazakov, External matrix field problem and new multicriticalities in (-2) dimensional random lattice, Nuclear Phys. B 354 (1991) 614-624.
45. V. Kazakov, P. Zinn-Justin, Two matrix model with ABAB interactions, Nuclear Phys. B 546 (1999) 647-668.
46. K. Muttalib, Y. Chen, M. Ismail, V. Nicopoulos, New family of unitary random matrices, Phys. Rev. Lett. 71 (1993) 471-475.
47. T. Nagao, P.J. Forrester, Transitive ensembles of random matrices related to orthogonal polynomials, Nuclear Phys. B 530 (1998) 742-762.
48. G. Parisi, Statistical properties of random matrices and the replica method, The Mathematical Beauty of Physics, World Scientific, River Edge, NJ, 1997, pp. 98-112.
49. M.R. Zirnbauer, Supersymmetry for systems with unitary disorder: circular ensembles, J. Phys. A: Math. Gen. 29 (1996) 7113-7136; Riemannian symmetric superspaces and their origin in random-matrix theory, J. Math. Phys. 37 (1996) 4986-5018.

2.1 Global Regime

50. A. Boutet de Monvel, L. Pastur, M. Shcherbina, On the statistical mechanics approach to the random matrix theory: the integrated density of states, J. Stat. Phys. 79 (1995) 585-611.
51. A. Boutet de Monvel, D. Shepelsky, M. Shcherbina, On the integrated density of states for a certain ensemble of random matrices, Random Oper. Stochastic Equations 6 (1998) 331-338.
52. E. Brezin, A. Zee, Universal relation between Green functions in random matrix theory, Nuclear Phys. B 453 (1995) 531-551.
53. A. Khorunzhy, Eigenvalue distribution of large random matrices with correlated entries, Mat. Fiz. Anal. Geom. 3 (1996) 80-101.
54. A. Khorunzhy, On smoothed density of states for Wigner random matrices, Random Oper. Stochastic Equations 5 (1997) 147-162.
55. V. Marchenko, L. Pastur, The eigenvalue distribution in some ensembles of random matrices, Math. USSR Sbornik 1 (1967) 457-483.
56. L. Pastur, On the spectrum of random matrices, Teor. Math. Phys. 10 (1972) 67-74.
57. L. Pastur, A simple approach to the global regime of the random matrix theory, Mathematical Results in Statistical Mechanics, S. Miracle-Sole, J. Ruiz, V. Zagrebnov (Eds.), World Scientific, Singapore, 1999, pp. 429-454.
58. E. Wigner, On the distribution of the roots of certain symmetric matrices, Ann. of Math. 67 (1958) 325-327.
59. E. Wigner, Statistical properties of real symmetric matrices with many dimensions, Proc. 4th Canad. Math. Congr., Banff, 1957, pp. 174-184 (reprinted in [24]).

2.2 Corrections, Fluctuations and Large Deviations
60. G. Akemann, Universal correlators for multi-arc complex matrix models, Nuclear Phys. B 507 (1997) 475-500.
61. J. Ambjorn, J. Jurkiewicz, Yu. Makeenko, Multiloop correlators for two-dimensional quantum gravity, Phys. Lett. B 251 (1990) 517-524.
62. G. Ben Arous, A. Guionnet, Large deviations for Wigner's law and Voiculescu's non-commutative entropy, Probab. Theory Related Fields 108 (1997) 517-542.
63. G. Ben Arous, O. Zeitouni, Large deviations from the circular law, ESAIM Probab. Statist. 2 (1998) 123-134.
64. E. Brezin, A. Zee, Universality of the correlations between eigenvalues of large random matrices, Nuclear Phys. B 402 (1993) 613-627.
65. P. Diaconis, M. Shahshahani, On the eigenvalues of random matrices, J. Appl. Prob. 31A (1994) 49-62.
66. Y. Goldshmidt, 1/n expansion in two dimensional quantum gravity, J. Math. Phys. 21 (1980) 1842-1850.
67. F. Hiai, D. Petz, Eigenvalue density of the Wishart matrix and large deviations, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 1 (1998) 633-646.
68. K. Johansson, On random matrices from the compact groups, Ann. of Math. 145 (1997) 519-545.
69. K. Johansson, On fluctuations of eigenvalues of random Hermitian matrices, Duke Math. J. 91 (1998) 151-204.
70. A. Khorunzhy, B. Khoruzhenko, L. Pastur, Random matrices with independent entries: asymptotic properties of the Green function, J. Math. Phys. 37 (1996) 5033-5060.
71. Ya. Sinai, A. Soshnikov, Central limit theorem for traces of large random symmetric matrices with independent matrix elements, Bol. Soc. Brasil. Mat. (N.S.) 29 (1998) 1-24.

2.3 Norms, Spectrum Edges

72. Z.D. Bai, J.W. Silverstein, No eigenvalues outside the support of the limiting spectral distribution of large-dimensional sample covariance matrices, Ann. Prob. 26 (1998) 316-345.
73. A. Boutet de Monvel, M. Shcherbina, On the norm and eigenvalue distribution of large random matrices, Math. Notes 57 (1995) 475-484.
74. B. Bronk, Accuracy of the semicircle approximation for the density of eigenvalues of random matrices, J. Math. Phys. 5 (1964) 215-220.
75. A. Edelman, Eigenvalues and condition numbers of random matrices, SIAM J. Matrix Anal. Appl. 9 (1988) 543-560.
76. S. Geman, A limit theorem for the norm of random matrices, Ann. Prob. 8 (1980) 252-261.
77. J. von Neumann, H. Goldstine, Numerical inversion of matrices of large order, Proc. Amer. Math. Soc. 2 (1951) 188-202.

2.4 Local Regime, Universality (bulk)

78. A.V. Andreev, B.D. Simons, B.L. Altshuler, Energy level correlations in disordered metals: beyond universality, J. Math. Phys. 37 (1996) 4968-4985.
79. A. Altland, M. Zirnbauer, Non-standard symmetry classes in mesoscopic normal-superconducting hybrid structure, Phys. Rev. B 55 (1997) 1142-1161.
80. O. Costin, J.L. Lebowitz, Gaussian fluctuations of random matrices, Phys. Rev. Lett. 75 (1995) 69-72.
81. G. Hackenbroich, H.A. Weidenmueller, Universality of random-matrix results for non-Gaussian ensembles, Phys. Rev. Lett. 74 (1995) 4118-4121.
82. R.D. Kamien, H.D. Politzer, M.B. Wise, Universality of random-matrix predictions for the statistics of energy levels, Phys. Rev. Lett. 60 (1988) 1995-1998.
83. G. Lenz, F. Haake, Transitions between universality classes of random matrices, Phys. Rev. Lett. 65 (1990) 2325-2328.
84. A. Mirlin, Ya. Fyodorov, Universality of level correlation function of sparse random matrices, J. Phys. A: Math. Gen. 24 (1991) 2273-2286.
85. L. Pastur, M. Shcherbina, Universality of the local eigenvalue statistics for a class of unitary invariant matrix ensembles, J. Stat. Phys. 86 (1997) 109-147.
86. B. Simons, B.L. Altshuler, Universality in spectra of disordered and chaotic systems, Phys. Rev. B 48 (1993) 5422-5438.
87. A. Soshnikov, Level spacings distribution for large random matrices: Gaussian fluctuations, Ann. of Math. 148 (1998) 573-617.
88. C.A. Tracy, H. Widom, Fredholm determinants, differential equations and matrix models, Commun. Math. Phys. 163 (1994) 33-72.

2.5 Local Regime, Universality (special points)

89. M. Bowick, E. Brezin, Universal scaling of the tail of the density of eigenvalues in random matrix models, Phys. Lett. B 268 (1991) 21-28.
90. E. Brezin, S. Hikami, Level spacing of random matrices in an external source, Phys. Rev. E (3) 58 (1998) 7176-7185; Universal singularity at the closure of a gap in the random matrix theory, Phys. Rev. E (3) 57 (1998) 4140-4149.
91. P. Forrester, The spectrum edges of random matrix ensembles, Nuclear Phys. B 402 (1993) 709-728.
92. V.E. Kravtsov, I.V. Lerner, B.L. Altshuler, A.G. Aronov, Universal spectral correlations at the mobility edge, Phys. Rev. Lett. 72 (1994) 888-891.
93. Ya.G. Sinai, A.A. Soshnikov, A refinement of Wigner's semicircle law in a neighborhood of the spectrum edge for random symmetric matrices, Funct. Anal. Appl. 32 (1998) 114-131.
94. A. Soshnikov, Universality of the edge of the spectrum in Wigner random matrices, Comm. Math. Phys. 207 (1999) 697-734.
95. C.A. Tracy, H. Widom, Level spacing distributions and the Airy kernel, Comm. Math. Phys. 159 (1994) 151-174; Level spacing distributions and the Bessel kernel, Comm. Math. Phys. 161 (1994) 289-309.

2.6 Random Matrix Theory and Random Operator Theory

96. G. Casati, L. Molinari, F. Izrailev, Scaling properties of band random matrices, Phys. Rev. Lett. 64 (1990) 1851-1854.
97. Ya. Fyodorov, A. Mirlin, Localization in ensemble of sparse random matrices, Phys. Rev. Lett. 67 (1991) 2049-2052; Scaling properties of localization in random band matrices: a σ-model approach, Phys. Rev. Lett. 67 (1991) 2405-2409.
98. A. Khorunzhy, L. Pastur, Limits of infinite interaction radius, dimensionality and the number of components for random operators with off-diagonal randomness, Commun. Math. Phys. 153 (1993) 605-646.
99. J. Magnen, G. Poirot, V. Rivasseau, The Anderson model as a matrix model, Nuclear Phys. B Proc. Suppl. 58 (1997) 149-162.
100. N. Minami, Local fluctuation of the spectrum of a multidimensional Anderson tight binding model, Comm. Math. Phys. 177 (1996) 709-725.
101. S.A. Molchanov, The local structure of the spectrum of the one-dimensional Schrodinger operator, Comm. Math. Phys. 78 (1980/81) 429-446.
102. P. Neu, R. Speicher, Rigorous mean-field model for coherent-potential approximation: Anderson model with free random variables, J. Statist. Phys. 80 (1995) 1279-1308.
103. F. Wegner, Disordered electronic systems as a model of interacting matrices, Phys. Rep. 67 (1980) 15-24.
104. J.X. Zhang, U. Grimm, R.S. Rommer, M. Schreiber, Level-spacings distribution of planar quasiperiodic tight-binding model, Phys. Rev. Lett. 80 (1998) 3996-3999.

2.7 Non-Hermitian Matrices
105. L.L. Chau, O. Zaboronsky, On the structure of correlation functions in the normal matrix model, Comm. Math. Phys. 196 (1998) 203-247.
106. A. Edelman, The probability that a random real Gaussian matrix has k real eigenvalues, related distributions, and the circular law, J. Multivariate Anal. 60 (1997) 203-232.
107. A. Edelman, E. Kostlan, M. Shub, How many eigenvalues of a random matrix are real?, J. Amer. Math. Soc. 7 (1994) 247-267.
108. Ya. Fyodorov, H.-J. Sommers, Statistics of resonance poles, phase shifts and time delays in quantum chaotic scattering: random matrix approach for systems with broken time-reversal invariance, J. Math. Phys. 38 (1997) 1918-1981.
109. Y. Fyodorov, B. Khoruzhenko, H.-J. Sommers, Universality in the random matrix spectra in the regime of weak non-Hermiticity, Ann. l'Inst. H. Poincare (Phys. Theor.) 68 (1998) 449-489.
110. J. Ginibre, Statistical ensembles of complex, quaternion, and real matrices, J. Math. Phys. 6 (1965) 440-449.
111. R. Grobe, F. Haake, H.-J. Sommers, Quantum distinction of regular and chaotic dissipative motion, Phys. Rev. Lett. 61 (1988) 1899-1902.
112. G. Le Caër, J.S. Ho, The Voronoi tesselation from eigenvalues of complex random matrices, J. Phys. A: Math. Gen. 23 (1996) 3279-3295.
113. G.R. Oas, Universal cubic eigenvalue repulsion for random normal matrices, Phys. Rev. E 55 (1997) 205-211.

2.8 Miscellaneous, Related

114. P. Bleher, X. Di, Correlation between zeros of a random polynomial, J. Stat. Phys. 88 (1997) 269-305.
115. E. Bogomolny, O. Bohigas, P. Leboeuf, Quantum chaotic dynamics and random polynomials, J. Stat. Phys. 85 (1996) 639-679.
116. M. Isopi, C. Newman, The triangle law for Lyapunov exponents of large random matrices, Comm. Math. Phys. 143 (1992) 591-598.

3 Physics

3.1 Nuclear Physics

117. N. Ullah, Matrix Ensembles in Many-Nucleon Problems, Clarendon, Oxford, 1987.
118. J.J.M. Verbaarschot, H.A. Weidenmuller, M.R. Zirnbauer, Grassmann integration in stochastic quantum physics: the case of compound-nucleus scattering, Phys. Rep. 129 (1985) 367-438.
119. E. Wigner, On the statistical distribution of the widths and spacings of nuclear resonance levels, Proc. Cambridge Philos. Soc. 47 (1951) 790-798.

3.2 Spectra of Complex Systems (Quantum Chaos)
120. A.V. Andreev, O. Agam, B.D. Simons, B.L. Altshuler, Semiclassical field theory approach to quantum chaos, Nuclear Phys. B 482 (1996) 536-566.
121. G. Casati, B. Chirikov (Eds.), Quantum Chaos. Between Order and Disorder, Cambridge University Press, Cambridge, 1995.
122. M.V. Berry, M. Tabor, Level clustering in the regular systems, Proc. Roy. Soc. A 356 (1977) 375-394.
123. O. Bohigas, Random matrix theories and chaotic dynamics, Chaos and Quantum Physics, North-Holland, Amsterdam, 1991, pp. 87-199.
124. O. Bohigas, M.-J. Giannoni, G. Schmidt, Characterization of chaotic quantum spectra and universality of level fluctuations, Phys. Rev. Lett. 52 (1984) 1-4.
125. M.-J. Giannoni, A. Voros, J. Zinn-Justin (Eds.), Chaos and Quantum Physics, North-Holland, Amsterdam, 1991.
126. F. Haake, Quantum Signatures of Chaos, Springer-Verlag, Berlin, 1991.
127. J. Marklof, Spectral form factors of rectangle billiards, Commun. Math. Phys. 199 (1998) 169-202.
128. B.A. Muzykanskii, D.E. Khmelnickii, Effective action in the theory of quasiballistic disordered conductors, JETP Lett. 62 (1995) 76-83.
129. Ya.G. Sinai, Mathematical problems in the theory of quantum chaos, Lecture Notes in Math. 1469 (1991) 41-59.
130. M. Wilkinson, Diffusion and dissipation in complex quantum systems, Phys. Rev. A 41 (1990) 4645-4659.
131. G. Zaslavskii, Statistics of energy spectra, Sov. Phys. Uspekhi 22 (1979) 788-804.

3.3 Quantum Field Theory
132. E. Brezin, C. Itzykson, G. Parisi, J.-B. Zuber, Planar diagrams, Comm. Math. Phys. 59 (1978) 35-51.
133. K. Demeterfi, Two-dimensional quantum gravity, matrix models and string theory, Inter. Journ. of Mod. Phys. A 8 (1993) 1185-1244.
134. G. 't Hooft, Planar diagram theories for string interaction, Nuclear Phys. B 72 (1974) 461-472.
135. V. Rivasseau, Isosystolic inequalities and the topological expansion for random surfaces and matrix models, Comm. Math. Phys. 139 (1991) 183-200.
136. B. Seif, T. Wettig, T. Guhr, Spectral correlations of the massive QCD Dirac operator at finite temperature, Nuclear Phys. B 548 (1999) 475-490.
137. A. Mironov, 2d gravity and matrix models. I, Int. J. Mod. Phys. A 9 (1994) 4355-4406.
138. G. Moore, Matrix models of 2D gravity and isomonodromic deformations, Progr. Theor. Phys. Suppl. 102 (1990) 255.
139. E.V. Shuryak, J.J.M. Verbaarschot, Random matrix theory and spectral sum rules for the Dirac operator in QCD, Nuclear Phys. A 560 (1993) 306-316.
140. J.J.M. Verbaarschot, Random matrix theory and QCD at non-zero chemical potential, Nuclear Phys. A 642 (1998) 305C-317C.
141. 142. 143. 144. 145. 146. 147.
148. 149. 150. 151. 152. 153. 154. 155. 156.
3-4 Statistical Mechanics M. Bowick, Random matrices and random surfaces, Nuclear Phys. B Proc. Suppl 63A (1998) 77C - 88C. L.F. Cugliandoloo, J. Kurchan, G. Parisi, F. Ritort, Matrix model as solvable glass mode, Phys. Rev. Let. 74 (1995) 1012-1015. I. Rostov, Solvable statistical models on a random lattice, Nuclear Phys. B Proc. Suppl. 45A (1996) 13-28. H. Meyer, J.-C. Agles d'Auriac, H. Bruus, Spectral properties of statistical mechanics models, J. Phys. A: Math. Gen. 29 (1998) L483-L488. D. Nelson, T. Piran and S. Weinberg (Eds.), Statistical Mechanics of Mem branes and Surfaces, World Scientific, Teaneck, NJ, 1989. M. Shcherbina, B. Tirozzi, The free energy of a class of Hopfield models, J. Stat. Phys. 72 (1993) 113-125. B.D. Simons, P.A. Lee, B.L. Aitshuler, Exact results for quantum chaotic sys tems and one-dimensional fermions from matrix models, Nuclear Phys. B 409 (1993) 487-508. 3.5 Condensed Matter E. Akkermans, G. Montabaux, J.-L. Pichard, J. Zinn-Justin (Eds.), Mesoscopu: Quantum Physics, North-Holland, Amsterdam, 1996. B. Aitshuler, B. Shklovski Repulsion of energy levels and conductance of small metallic sumples, Sov. Phys. JETP 64 (1980) 127-138. C.W.J. Beenakker, Random-matrix theory of quantum transport, Rev. Mod. Phys. 69 (1997) 731-847 O.N. Dorokhov, Transmission coefficient and the localization length of an elec tron in N bound disordered chains, JETP Lett.36 (1982) 318-323. L.P. Gorkov, E.M. Eliashberg, Repulsion of energy levels and and the conduct ance of small metallic particles, Sov. Phys. JETP 64 (1965) 940-947. P.A. Mello, P.Pereyra, N.Kumar, Macroscopic approach to multichannel dis ordered conductors, Ann. of Phys. 181 (1988) 290-315. A. Mirlin, Spatial structure of anomalously localized states in disordered sys tems, J. Math. Phys. 38 (1997) 1886-1916. D. Shepelyanski, Coherent propagation of two particles in random potantial, Phys. Rev. Lett. 73 (1994) 2607-2610. M. 
Wilkinson, Diffusion and dissipation in complex quantum systems, Phys. Rev. A41 (1990) 4645-4659. 4 Mathematics
4-1 Probability and Statistics 157. T. Akuzawa, M. Wadati, Diffusion on symmetric spaces of type AIII and ran dom metrix theories for rectangular matrices, J .Phys. A: Math. Gen. 31 (1998) 1713-1732. 158. H. Bercovici, V. Pata, Stable laws and domains of attraction in free probability theory, Ann. of Math. (2) 149 (1999), 1023-1060. 159. A.D. Deev, Representation of statistics of discriminant analysis, and asymp totic expansion when space dimensions are comparable with sample size, Dokl.
263
Akad.Nauk SSSR 195 (1970) 759-762. 160. V.L. Girko, Statistical Analysis of Observations of Increasing Dimension, Kluwer, Dordrecht, 1995. 161. A. Gupta, V.L. Girko (Eds.), Multidimensional Statistical Analysis and Theory of Random Matrices, VSP, Utrecht, 1996. 162. D. Grabiner, Brownian motion in a Weyl chamber, non-colliding particles, and random matrices, Ann. Inst. H. Poincare Probab. Statist. 35 (1999) 177-204. 163. R.J. Muirhead, Aspects of Multivariate Statistical Theory, Wiley, New York, 1982. 164. D. Voiculescu (Ed.), Free Probability Theory, Fields Institute Communications 12, AMS, Providence, RI, 1997. 165. S.S. Wilks, Mathematical Statistics, Princeton University Press, princeton, 1943. 4-2 Spaces, Operators, Algebras 166. E.D. Gluskin, Norms of random matrices and width of finite-dimensional sets, Math. USSR Sb. J. 48 (1984 ) 173-182. 167. U. Haagerup, S. Thorbjornsen, Random matrices and K -theory for exact Calgebras, Doc. Math. 4 (1999) 341-450. 168. S.J. Szarek, Spaces with large distance to /£° and random matrices, Amer. J. Math. 112 (1990) 899-942; Condition numbers of random matrices, J. Com plexity 7 (1991) 131-149. 169. V. Milman, Randomness and patterns in convex geometrical analysis, Proceed ings of the International Congress of Mathematicians, Vol. II (Berlin, 1998), pp. 665-674. 170. D. Voiculescu, Free probability theory: random matrices and von Neumann algebras, Proceedings of the International Congress of Mathematicians (Zurich, 1994), V. 2, Birkhauser, Basel, 1995, pp. 227-241. 4-3 Number Theory 171. M.V. Berry, J.P. Keating, The Riemann zeros and eigenvalue asymptotics, SIAM Rev. 41 (1999) 236-266. 172. E.B. Bogomolny, J.P. Keating, Random matrix theory and the Riemann zeros. I: Three- and four-point correlations, Nonlinearity 8 (1995) 1115-1131; //: npoint correlations, Nonlinearity 9 (1996) 911-935. 173. E. Bogomolny, Spectral statistics. Proceedings of the International Congress of Mathematicians, Vol. Ill (Berlin, 1998), pp. 
99-108 174. H.L. Montgomery, Distribution of the zeros of the Riemann zeta function, Pro ceedings of the International Congress of Mathematicians (Vancouver, B. C , 1974), Vol. 1, Canad. Math. Congress, Montreal, Que., 1975, pp. 379-381. 175. F.J. Dyson, Missed opportunities, Bull. Amer. Math. Soc. 78 (1972), 635-652. 176. A.M. Odlyzko, \020th zeros of the Riemann zeta function and 70 millions of its neighbors, ATT Bell Telephone Labs. Preprint (1989) 177. Z. Rudnick, P. Sarnak, Zeros of principal L-functtons and random matrix the ory, Duke Math. J. 81 (1996) 269-322.
264
4-4 Integrable Systems, Transcendents 178. M. Adler, P. van Moerbeke, Matrix integrals, Toda symmetries, Virasoro con straints, and orthogonal polynomials, Duke Math. J. 80 (1995) 863-911; The spectrum of coupled random matrices, Ann. of Math. (2) 149 (1999) 921-976. 179. J. Hamad, C.Tracy, H.Widom, Hamiltonian structure of equations appearing in random matrices, Low-Dimensional Topology and Quantum Field Theory, Plenum, NY, 1993, pp. 231-245. 180. M. Jimbo, T. Miwa, Y. Mori, M. Sato, Density matrix of an impenetrable Bose gas and the fifth Painleve transcendent, Physica I D (1980) 80-158. 4-5 Topology 181. L. Chekhov, Matrix model tools and geometry of moduli spaces, Acta Appl. Math. 48 (1997) 33-90. 182. P. Di Francesco, C. Itzykson, Quantum intersection rings. The moduli space of curves, Progr. Math., 129, Birkhauser Boston, Boston, MA, 1995, pp.81-148. 183. J. Harer, D. Zagier, The Euler characteristic of the moduli space of curves, Invent. Math. 85 (1986) 457-485. 184. C. Itzykson, J.-B. Zuber, Combinatorics of the modular group. II. The Kont sevich integrals, Internat. J. Modern Phys. A 7 (1992) 5661-5705. 185. M. Kontsevich, Intersection theory on the moduli space of curves and the matrix Airy function, Comm. Math. Phys. 147 (1992) 1-23. 186. R.C. Penner, Perturbative series and the moduli space of Riemann surfaces, J. Differential Geom. 27 (1988), no. 1, 35-53. 187. E. Witten, Two-dimensional gravity and intersection theory on moduli space, Surveys in Differential Geometry 1 (1991) 243-310. 188. 189.
190. 191. 192.
193. 194.
195. 196.
4-6 Combinatorics and Large-n Representation Theory I. Arefyeva, I. Volovich, Knots and matrix models, Infin. Dimens. Anal. Quan tum Probab. 1 (1998) 167-173. J. Baik, P. Deift, K. Johansson, On the distribution of the length of the longest increasing subsequence of random permutations, J. Amer. Math. Soc. 12 (1999) 1119-1178. A. Borodin, G. Olshanski, Point processes and the infinite symmetric group, Math. Res. Lett. 5 (1998) 799-816. K. Johansson, The longest increasing subsequence in a random permutation and a unitary random matrix model, Math. Res. Lett. 5 (1998) 63-82. V. Kazakov, M. Staudacher, T. Wynter, Character expansion methods for ma trix models of dually weighted graphs, Commun. Math. Phys. 177 (1996) 451-468; Almost flat planar diagrams, .Commun. Math. Phys. 179 (1996) 235-256. V. Sachkov, Probability Methods in Combinatorial Analysis, Cambridge Uni versity Press, Cambridge, 1997 R. Speicher, Combinatorial theory of the free product with amalgamation and operator-valued free probability theory, Mem. Amer. Math. Soc. 132(1998), no. 627. C.A. Tracy, H. Widom, Random unitary matrices, permutations and Painleve, Comm. Math. Phys. 207 (1999) 665-686. W.T. Tutte, A census of planar triangulations, Canad. J. Math. 14 (1962)
265 21-38. 197. A.M. Vorshik, Asymptotic combinatorics and algebraic analysis, Proceed ings of the International Congress of Mathematicians (Zurich, 1994), Vol.2, Birkhauser, Basel, 1995, pp.1384-1394. 198. A.Zvonkin, Matrix integrals and map enumeration: an accessible introduction, Math. Comp. Modelling, 26 (1997) 281-304. 4-7 Analysis 199. E.L. Basor, Connections between random matrices and Szego limit theorems, Contemp. Math., 237, AMS, Providence, RI, 1999, pp.1-7. 200. E.L. Basor, H. Widorn, Determinants of Airy operators and applications to random matrices, J. Statist. Phys. 96 (1999) 1-20. 201. P. Biane, On the free convolution with a semi-circular distribution, Indiana Univ. Math. J. 46 (1997) 705-718. 202. P. Bleher, A. Its, Scmiclassical asymptotics of orthogonal polynomials, Riemann-Hilbert problem, and universality in the matrix model, Ann. of Math. 150 (1999) 185-266. 203. P.A. Deift, Orthogonal Polynomials and Random Matrices: a Riemann-Hilbert Approach, Courant Institute of Mathematical Sciences, NY, 1999. 204. P. Deift, T. Kriecherbauer, K. McLaughlin, S. Venakides, X. Zhou, Uniform asymptotics for polynomials orthogonal with respect to varying exponential weights and applications to universality guestwns in random matrix theory, Comm. Pure Appl. Math. 52 (1999) 1335-1425. 205. A. Fokas, A. Its, A. Kitaev, The isomonodromy approach to matrix models in 2D quantum gravity, Comm. Math. Phys. 147 (1992) 395-430. 206. H. Widorn, Asymptotics for the Fredholm determinant of the sine kernel on a union of intervals, Comm. Math. Phys. 171 (1995) 159-180.
WAVEFUNCTION COLLAPSE AS A REAL GRAVITATIONAL EFFECT

ROGER PENROSE

Mathematical Institute, Oxford, UK; Gresham College, London, UK; Center for Gravitational Physics and Geometry, Penn. State University, USA
1 Quantum state reduction
There are many different philosophical attitudes to quantum theory's measurement problem or, as I prefer to call it, the measurement paradox. This notwithstanding, there is very little difference between these various schools of thought when it actually comes to handling the measurement issue in practice. The quantum state $|\psi\rangle$ of a system is simply taken to evolve, for the most part, according to unitary evolution U (which we can take to be controlled by the Schrödinger equation); but from time to time, whenever it is considered that a measurement has indeed taken place, the state $|\psi\rangle$ is taken to jump to another state $|\phi\rangle$, which is some eigenstate of the operator Q that describes the particular measurement being performed. The probability that $|\psi\rangle$ jumps to $|\phi\rangle$, rather than to any other eigenstate of Q, is given by the squared modulus $|\langle\psi|\phi\rangle|^2$ of the complex amplitude $\langle\psi|\phi\rangle$, assuming that $|\psi\rangle$ and $|\phi\rangle$ are normalized states (otherwise we simply take the probability as $\langle\psi|\phi\rangle\langle\phi|\psi\rangle / \langle\psi|\psi\rangle\langle\phi|\phi\rangle$). The jumping of the state from $|\psi\rangle$ to $|\phi\rangle$ is state-vector reduction R, sometimes referred to as the collapse of the wave function.

The U and R processes (referred to by von Neumann as, respectively, "process 2" and "process 1" in his famous book$^{39}$) are clearly completely different mathematical procedures. Whereas U is fully deterministic and linear in its action on the state $|\psi\rangle$, the procedure R is non-deterministic and is not a linear action. It appears to be a common view amongst physicists (and the one seemingly espoused by von Neumann himself) that this incompatibility is merely apparent, and results from the approximation that arises when one attempts to treat an evolving quantum system in isolation from its environment. An isolated system always evolves in accordance with U, on this view.
But as soon as a system's state gets significantly entangled with that of its environment, it becomes inappropriate to describe the system on its own by a unitarily evolving state-vector. In such circumstances, a description in terms of a density matrix is employed instead, where the large number of ("unobservable") degrees of freedom in the environment are "summed over", yielding the density-matrix description of the quantum system under consideration. This density matrix is then interpreted as describing a probability mixture of possible alternatives.

It is considered that this is what is involved in any measurement process. The measuring apparatus and its environment comprise large numbers of "thermal" degrees of freedom. These swamp the intricate phase relations in the system that would otherwise require our treating its quantum superpositions (correctly) as superpositions, and allow us, instead, to treat them (approximately) as probability-weighted alternatives. It is argued that although, strictly speaking, the combined system/apparatus/environment still has a single quantum state, the system itself can be treated, at least to a very good approximation, as a classical probability mixture of alternatives. Accordingly (so it is claimed), there is an effective emergence of an R-type evolution for the system itself, arising from a U-type evolution of the entire system. This loss of phase coherence is referred to as environmental decoherence.

There are many good reasons why such a description can give only a "stop-gap" position on the measurement process. As John Bell has described it, this interpretation can, at best, be merely FAPP ("for all practical purposes"), rather than being able to provide us with a resolution of the R/U conflict at a fundamental level. The evolution of the full quantum state, describing the system, measuring apparatus, and any remaining environment (including the conscious experimenter, if necessary) should be in accord with the deterministic U-process, and there are no probabilities involved. To obtain a probability-weighted mixture of alternatives, rather than a single state involving complex-number-weighted alternatives, something other than unitary evolution has to be invoked. One reason for this is that a density-matrix description is itself only a "stop-gap" measure, being employed as a convenience simply because it is impractical to keep track of all the degrees of freedom that are involved in the system's environment. At the fundamental level, in which all these degrees of freedom are taken into account, there would still be some quantum state that describes the full system including its entire environment (with all alternatives still in quantum superposition), even though this state would not be knowable in practice.
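The "summing over" of environmental degrees of freedom described above can be made concrete with a small numerical sketch. It assumes, purely for illustration, a two-state system coupled to a two-state environment; the function names `reduced_density_matrix` and `joint_state`, and the amplitudes chosen, are hypothetical and not taken from the text. The sketch shows only that the off-diagonal (phase) terms of the system's density matrix are suppressed as the environment states become orthogonal; the combined state remains a single pure state throughout, which is the point made above.

```python
import math

def reduced_density_matrix(amplitudes):
    """Trace out the environment from a pure joint state.

    amplitudes[i][k] is the amplitude of |system i>|environment k>.
    Returns rho[i][j] = sum_k a[i][k] * conj(a[j][k]), normalized to unit trace.
    """
    norm2 = sum(abs(a) ** 2 for row in amplitudes for a in row)
    n = len(amplitudes)
    return [
        [
            sum(a * complex(b).conjugate()
                for a, b in zip(amplitudes[i], amplitudes[j])) / norm2
            for j in range(n)
        ]
        for i in range(n)
    ]

def joint_state(z, w, overlap):
    """Build z|0>|e0> + w|1>|e1>, where <e0|e1> = overlap (real, in [0, 1])."""
    e0 = [1.0, 0.0]
    e1 = [overlap, math.sqrt(1.0 - overlap ** 2)]
    return [[z * c for c in e0], [w * c for c in e1]]

# Illustrative amplitudes with |z|^2 + |w|^2 = 1 (an assumption of this sketch).
z, w = 0.6, 0.8j

# Environment unaffected by the system: the phase relation between z and w survives.
rho_isolated = reduced_density_matrix(joint_state(z, w, 1.0))

# Environment records the alternative perfectly: the off-diagonal terms vanish.
rho_decohered = reduced_density_matrix(joint_state(z, w, 0.0))

print(abs(rho_isolated[0][1]))   # magnitude of the coherence term, |z||w|
print(abs(rho_decohered[0][1]))  # coherence term after tracing out the environment
```

The diagonal entries, $|z|^2$ and $|w|^2$, are the same in both cases; only the phase information available to the system on its own is lost.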
As a further point, even if we do take the density matrix as representing our best description of quantum reality, any particular interpretation of it as a probability mixture of states requires further assumptions, and this introduces certain problematic issues of consistency as the density matrix evolves. In all cases except for pure states, any such interpretation is very far from unique.$^{16,34}$ These further assumptions are necessary if we want to single out the required probability-weighted mixture of alternatives from certain other mixtures of alternatives that are superpositions of these.

Where do these probabilities actually come from? From a "conventional" perspective, one might say that these probabilities are to be regarded as arising from a lack of knowledge of the details of the environment, or perhaps from an observer's consciousness randomly threading its way through a multitude of linearly superposed alternative universes; yet the remarkably simple and accurate squared-modulus rule of the R-process, in the standard applications of quantum mechanics, seems to be independent of the detailed nature of the environment and of the details of whatever it is that constitutes a conscious observer.$^{a}$ It is my own firm opinion that no explanation for the ubiquitous and precise squared-modulus rule of the R-process can be obtained without a deeper investigation of the very foundations of quantum mechanics. I would contend that U-evolution is itself merely a marvellous approximation to something even more precise, according to which the quantum evolution from a "small" to a "large" system (in some appropriate interpretation of these terms) objectively involves a procedure closely approximated by R.

$^{a}$See Home,$^{15}$ pp. 92-94 and Penrose,$^{34}$ Chapter 29.
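The non-uniqueness of the probability-mixture interpretation of a density matrix, noted above, can be checked directly in the simplest case. The following sketch assumes a two-state system and two illustrative ensembles (the names `ensemble_density_matrix`, `mixture_a` and `mixture_b` are hypothetical); the states of the second ensemble are superpositions of those of the first, yet both ensembles yield the same density matrix.

```python
import math

def ensemble_density_matrix(ensemble):
    """Return rho = sum_n p_n |v_n><v_n| for a list of (probability, state) pairs."""
    dim = len(ensemble[0][1])
    rho = [[0j] * dim for _ in range(dim)]
    for p, v in ensemble:
        for i in range(dim):
            for j in range(dim):
                rho[i][j] += p * v[i] * complex(v[j]).conjugate()
    return rho

s = 1 / math.sqrt(2)

# Equal mixture of the two basis alternatives |0> and |1>.
mixture_a = [(0.5, [1, 0]), (0.5, [0, 1])]

# Equal mixture of the superpositions (|0> + |1>)/sqrt(2) and (|0> - |1>)/sqrt(2).
mixture_b = [(0.5, [s, s]), (0.5, [s, -s])]

rho_a = ensemble_density_matrix(mixture_a)
rho_b = ensemble_density_matrix(mixture_b)

# Largest entry-wise difference between the two density matrices.
max_difference = max(
    abs(rho_a[i][j] - rho_b[i][j]) for i in range(2) for j in range(2)
)
print(max_difference)
```

Since the two density matrices agree to rounding error, nothing in the density matrix itself singles out one set of alternatives over the other; that choice has to come from a further assumption.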
2 Gravitational OR
It is not my purpose here to provide a detailed discussion of the weaknesses (nor of the strengths) of the various "conventional" approaches to the measurement paradox, despite the fact that, in my opinion, it is important to appreciate why none of them really explains why R comes about. (See Penrose,$^{34}$ Chapter 29, for a detailed discussion.) Instead, I wish to acquaint the reader with a particular resolution of this paradox that involves a (comparatively mild) and well-motivated deviation from the standard rules of U-quantum mechanics. In this scheme, R indeed takes place as a real phenomenon, and it is not to be taken as an illusion of our awareness, nor as an approximation, nor merely a convenience. I use the acronym OR (objective reduction) to denote this supposed objectively real reduction of the quantum state.

A number of suggestions for OR schemes have been put forward over the years.$^{25,12,35}$ The particular scheme that I am promoting here maintains that deviations from U-evolution occur only when gravitational effects become significant; accordingly, R is taken to be an implication of quantum gravity, although this involves an interpretation of the term "quantum gravity" that differs from that in all mainstream proposals, according to which U-evolution is taken as axiomatic.

The tentative suggestion that "quantum gravity" might conceivably involve an objective reduction of the quantum state can be traced back to Feynman,$^{10}$ although Feynman himself never pursued this suggestion seriously in his own work on quantum gravity. Gaining some initial inspiration from Feynman's tentative remarks, however, the Hungarian physicist F. Károlyházy has been the standard-bearer of this kind of idea since the mid-1960s, but in more recent years it has been taken up by a number of other researchers.$^{17,18,19,21,20,36,26,5,6,7,13,27,28,30,9}$ I have given a particular motivation for a version of this proposal,$^{31}$ which arises from what I perceive to be a profound conflict between the principles of quantum mechanics and those of Einstein's general relativity.

To understand this conflict, consider a situation in which there is a lump of material resting on a horizontal surface, the lump constituting a quantum system that is stationary in a (stationary) external gravitational field. For the purposes of argument, let us try to assume the standard procedures of U-evolution, so that the lump has a state-vector $|\psi\rangle$ that is an eigenstate of the time-translation operator $T$. In ordinary flat space we have $T = \partial/\partial t$, where standard Minkowskian (Cartesian) coordinates are being used, $t$ being the usual time-coordinate. In a curved space-time $\mathcal{M}$, it is more appropriate to consider $T$ to be a Killing vector field in $\mathcal{M}$, representing the infinitesimal translational symmetry of $\mathcal{M}$ that expresses $\mathcal{M}$'s stationary nature. Thus, $T$ is a timelike vector field; and if we assume that $\mathcal{M}$ is asymptotically flat in some appropriate sense, then we may take $T$ to be normalized by the requirement that it becomes a unit timelike vector at infinity. We do not expect $T$ to remain a unit vector in the finite regions of the space-time $\mathcal{M}$, its squared length $g_{ab}T^aT^b$ being a measure of the gravitational potential at each point of $\mathcal{M}$. (In fact, in the Newtonian limit $G \to 0$, we have $g_{ab}T^aT^b = 1 + 2\Phi$, where $\Phi$ is the Newtonian
potential.) The Schrödinger equation for our stationary lump takes the form

$$i\hbar\, T|\psi\rangle = E|\psi\rangle,$$

where $E$ is the energy (eigenvalue) of the system.

Now let us suppose that our lump is put into a quantum linear superposition of two different locations, displaced from one another by a horizontal translation. Let us take the individual quantum states to be our original one $|\psi\rangle$ and a horizontally displaced one $|\psi'\rangle$, which is therefore also stationary, with the same energy eigenvalue $E$:

$$i\hbar\, T|\psi'\rangle = E|\psi'\rangle.$$

The superposition

$$|\Psi\rangle = z|\psi\rangle + w|\psi'\rangle,$$

where $z$ and $w$ are fixed complex-number amplitudes (satisfying $|z|^2 + |w|^2 = 1$ for normalized states), clearly also satisfies the same stationarity condition:

$$i\hbar\, T|\Psi\rangle = E|\Psi\rangle,$$

with the same energy eigenvalue $E$ as before. There is a complete degeneracy between all these superpositions, and each is just as stationary as each of the others. This is indeed what is to be expected in standard quantum mechanics, no matter how large the lump might be.

However, we have so far assumed that the gravitational field of the lump itself can be ignored, so that the system is described as existing in a fixed stationary background space-time. Let us now ask what happens when the lump's gravitational field is also taken into consideration. First, consider the case where the quantum state merely describes a lump in a single location. How is its gravitational field to be described? For this, it would seem that we need to call upon the (still missing) quantum theory of gravity, for we are asking for a description of the gravitational field of a quantum source, namely of the lump itself.

The position that I am adopting, in this discussion, is that conventional ideas should be assumed to hold true until we are presented with good reasons to believe that they should not. In accordance with these conventional ideas, there should be a quantum-gravity state that represents a classical-like limit according to which the quantum lump's gravitational field, in combination with that of the background, is indeed closely described by a stationary classical solution of the Einstein equations. There seems to be nothing problematic about this, in the case of a single lump location, and the Killing vector $T$ now refers to this slightly distorted space-time. The stationary Schrödinger equation can be written down, just as before, although now there is a (mild) element of approximation involved, because the gravitational (space-time) field is treated classically. Can we treat the superposition of two lump locations in a similar way?
The difficulty that now arises is that there are now two distinct Killing vectors, say $T$ and $T'$, referring to the two slightly differing stationary space-times that arise from each of the two separate lump locations. We appear to have two different Schrödinger equations, one for each lump location, differing through the fact that each has a slightly different "$\partial/\partial t$". In order to have just a single Schrödinger equation governing the evolution of the superposed quantum system, we appear to have to identify the two Killing vectors $T$ and $T'$, or else to find a new operator which both $T$ and $T'$ approximate.

Each of $T$ and $T'$ is a vector field, described geometrically by a family of "arrows" drawn on space-time. It is natural to think of $T$ and $T'$ as two separate fields of arrows on the one space-time, but this is technically incorrect. For $T$ and $T'$ are not simply two different vector fields on one space-time. They are vector fields on two different space-times $\mathcal{M}$ and $\mathcal{M}'$, respectively. To identify $T$ with $T'$, we would have to identify these two space-times, so that they act as a single manifold. Is there a difficulty about this? Yes, there is, because one of the basic tenets of Einstein's general relativity, the principle of general covariance, tells us that individual space-time points are not provided with identifying labels (special coordinates) which could tell us "which point is which", so that a pointwise identification could be achieved. There is no canonical way of asserting which point of $\mathcal{M}'$ is to be regarded as the same point in $\mathcal{M}$.

This issue is a familiar one to those who are concerned with the foundational questions raised by quantum gravity. It is a standard position on this issue that $\mathcal{M}$ and $\mathcal{M}'$ would be taken as abstract (pseudo-)Riemannian manifolds and a quantum superposition of one with the other would itself be taken entirely abstractly, and it would not entail any pointwise identification of one space with the other. However, we are now trying to do quantum physics within such an abstractly linearly superposed pair of distinct space-times. We see that this is fundamentally problematic, since there appears to be no canonical time-translation operator, in terms of which the Schrödinger equation for a superposed quantum system is to be formulated.
To express the notion of "stationarity" for our lump superposition, we indeed seem to require a single time-derivative operator, with respect to which the superposed state is to be an eigenvector.

One way that such an issue would be treated within the "conventional" procedures of quantum gravity would be to regard the space-times $\mathcal{M}$ and $\mathcal{M}'$ to be identifiable at infinity, and to define a common time-translation operator for the two space-times in terms of their asymptotic properties. This is still somewhat problematic, since the action of this operator on a wavefunction is now non-local or non-canonical (or both), and it has a quite different character from the normal $\partial/\partial t$ operator (local and unique up to Lorentz boosts) of standard quantum mechanics. It remains to be seen whether a consistent framework can be developed with such ill-defined operators, leaving the local rules of ordinary quantum mechanics undisturbed.

In the absence of a full resolution of such issues, it appears to be appropriate to take the standpoint that, at least to a good approximation, one may attempt to "identify" the operators $T$ and $T'$ as the required time-translation operator of the Schrödinger equation, but only approximately, taking note of a suitable measure $E_G$ of the "error" that is involved in this identification. Since this error refers to an "uncertainty" in the interpretation of "$\partial/\partial t$", we may think of $E_G$ as an uncertainty in the energy, given by "$i\hbar\,\partial/\partial t$", of the superposed system. My own standpoint is that this is a fundamental irreducible energy-uncertainty that results from a deep conflict between the basic principles of quantum mechanics and of Einstein's general relativity. For quantum mechanics depends upon a notion of $\partial/\partial t$ whereas general relativity demands general covariance, whence $\partial/\partial t$ is ill defined in such a superposition.

We may relate this energy to Heisenberg's uncertainty principle in the standard way that one does in the case of an unstable nucleus. With a decaying nucleus, the lifetime of the nucleus is reciprocally related (with factor $\hbar$) to an uncertainty in the energy of the nucleus. This suggests that the quantum superposition of our two lumps has a finite lifetime, also reciprocally related to its energy uncertainty. Accordingly, our superposition is itself unstable, and will decay into one or the other of the actually stationary states of which it is a superposition, namely $|\psi\rangle$ or $|\psi'\rangle$, in a time-scale of the order of $\hbar/E_G$.
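The reciprocal relation between lifetime and energy uncertainty can be illustrated with a few lines of arithmetic. This is only a sketch of the order-of-magnitude estimate $\hbar/E$: the function name `lifetime_from_energy_uncertainty` is hypothetical, the two sample energies are arbitrary illustrative inputs rather than values from the text, and no claim is made about numerical factors of order unity.

```python
HBAR_JOULE_SECONDS = 1.054571817e-34   # reduced Planck constant (CODATA 2018), J s
EV_IN_JOULES = 1.602176634e-19         # one electron volt in joules (exact SI value)

def lifetime_from_energy_uncertainty(energy_joules):
    """Order-of-magnitude lifetime hbar / E for an energy uncertainty E (in joules)."""
    if energy_joules <= 0:
        raise ValueError("energy uncertainty must be positive")
    return HBAR_JOULE_SECONDS / energy_joules

# An unstable state with an energy width of 1 eV (illustrative input).
tau_width_1ev = lifetime_from_energy_uncertainty(1.0 * EV_IN_JOULES)

# A hypothetical gravitational energy uncertainty E_G of 1e-33 J (illustrative input only).
tau_superposition = lifetime_from_energy_uncertainty(1.0e-33)

print(f"lifetime for 1 eV width:     {tau_width_1ev:.3e} s")
print(f"lifetime for E_G = 1e-33 J:  {tau_superposition:.3e} s")
```

The smaller the energy uncertainty, the longer the superposition persists, so that on this proposal a superposition with negligible $E_G$ would be effectively stable.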
3 Newtonian approximation
It is difficult to say a great deal more than this in the general case, when gravitational fields are large and full general relativity is required. But in most practical situations, gravitational fields can be adequately treated within the framework of Newtonian gravitational theory. It should not be assumed, however, that the preceding considerations can be ignored in the Newtonian limit. In the first place, it must be made clear that whereas in Newtonian theory there is an absolute notion of time, as described by some universal time parameter $t$, there need be no absolute notion of "$\partial/\partial t$". For we must bear in mind that the meaning of "$\partial/\partial t$" involves the notion that the remaining coordinates (say $x$, $y$ and $z$) are to be held constant. (This is reflected in the "chain rule", whereby $\partial/\partial t'$ differs from $\partial/\partial t$ by an amount $(\partial x/\partial t')\,\partial/\partial x + (\partial y/\partial t')\,\partial/\partial y + (\partial z/\partial t')\,\partial/\partial z$, even though $t' = t$.)

In the standard formalism of Newtonian theory, there is a flat absolute space-time background in which to specify $\partial/\partial t$. But this formalism does not incorporate the principle of equivalence at a fundamental level. We recall that Einstein's principle of equivalence provides the underlying reason for adopting the principle of general covariance in gravitation theory. The principle of equivalence asserts, for example, that a uniform gravitational field is physically equivalent to the complete absence of any gravitational field; yet with respect to the flat space-time background, these two situations appear as different, free-fall in one being described by straight world-lines, but not in the other. This principle is, however, completely incorporated within Cartan's reformulation of Newtonian theory,$^{2,11,38}$ and this formalism is undoubtedly the one that should be used for a proper formulation of the Newtonian limit of the ideas expressed above. See Christian.$^{3}$

At the present stage of understanding, it appears to be adequate, provisionally, to use the standard Newtonian framework; nevertheless, the underlying ideas behind Cartan's approach should be kept in mind. Tentatively, we regard each of $\mathcal{M}$ and $\mathcal{M}'$ to be the ordinary flat space-time of the standard Newtonian framework, and we try to consider that $\mathcal{M}$ and $\mathcal{M}'$ are, in some sense, approximately (pointwise) identified with one another, in such a way that the respective Killing vector
fields $T$ and $T'$, on $\mathcal{M}$ and $\mathcal{M}'$, are themselves approximately identified. We shall try to estimate the error that is involved in performing these identifications.

Let us suppose that Nature is, in some sense, antagonistic to an identification between space-times unless the notion of free-fall in one of the space-times agrees with the notion of free-fall in the other. If the notions of free-fall do not match, then we try to estimate the error involved in performing the identification. It turns out that the natural error measure that arises in this context is essentially an energy, and we can interpret this energy as a fundamental uncertainty in the total energy of the superposed system, this uncertainty arising from the incompatibility of the two time-derivative operators $T$ and $T'$ that feature in its Schrödinger equation.

Using the Newtonian description, we have (negative) gravitational potential functions $\Phi$ and $\Phi'$ corresponding to the two mass distributions $\rho$ and $\rho'$ in the two lump configurations. The distributions of mass can be treated as classical, but more appropriately, for a single stationary quantum system $|\psi\rangle$, we can take for its mass distribution $\rho$ the expectation value of the mass distribution in the quantum state $|\psi\rangle$. The Newtonian potential must then satisfy Poisson's equation

$$\nabla^2\Phi = 4\pi G\rho$$

for this mass distribution ($G$ being Newton's gravitational constant), and the "Newtonian force per unit mass" or "acceleration field" will be $-\nabla\Phi$. We can estimate the local "error" in identifying the two space-times $\mathcal{M}$ and $\mathcal{M}'$ as the (squared) measure whereby free-fall in one differs from free-fall in the other, i.e.

$$|\nabla\Phi' - \nabla\Phi|^2$$

(a quantity that has some degree of invariance, because it is unchanged if we add the same acceleration field in each of $\mathcal{M}$ and $\mathcal{M}'$). The integral of this over a particular time-slice $t = \mathrm{const.}$, multiplied by $(4\pi G)^{-1}$, is interpreted as our total "error" measure $E_G$ at time $t$. The factor $(4\pi G)^{-1}$ is incorporated to give the entire expression the dimensions of an energy, although the particular choice of numerical factor $4\pi$ is not very clearly motivated here, and it may turn out to be the case that some other multiple would be more appropriate. Integrating by parts and using Poisson's equation, our quantity can be re-expressed as

$$
\begin{aligned}
E_G &= \frac{1}{4\pi G}\int |\nabla\Phi' - \nabla\Phi|^2 \, d^3x \\
    &= \frac{1}{4\pi G}\int (\nabla\Phi' - \nabla\Phi)\cdot(\nabla\Phi' - \nabla\Phi)\, d^3x \\
    &= -\frac{1}{4\pi G}\int (\nabla^2\Phi' - \nabla^2\Phi)(\Phi' - \Phi)\, d^3x \\
    &= -\int (\rho' - \rho)(\Phi' - \Phi)\, d^3x \\
    &= G\int\!\!\int \frac{\bigl(\rho(x) - \rho'(x)\bigr)\bigl(\rho(y) - \rho'(y)\bigr)}{|x - y|}\, d^3x\, d^3y,
\end{aligned}
$$
which is the gravitational self-energy of the difference between the mass distributions that are involved in the quantum superposition of the two lump locations. We insert this expression into the formula $t_G \approx \hbar/E_G$ to obtain [...]

$$G\iint \frac{\rho(x)\,\rho'(y)}{|x - y|}\, d^3x\, d^3y,$$

which measures the energy that it would cost to displace one instance of our lump of material from the gravitational field of the other. This differs from our original expression for $E_G$ only when the two individual states of which our superposition is composed have different individual gravitational self-energies.ᶜ

Both of these expressions for $E_G$ (with $t_G \approx \hbar/E_G$) had been suggested earlier by Diósi,⁶ although his preference seems to have been for the gravitational interaction energy, above. Diósi also suggested some kind of stochastic dynamics, whereby the quantum state would spontaneously reduce. According to Diósi,⁸ he is now of the persuasion that any such explicit dynamics would encounter serious consistency conditions with (local) causality or energy conservation, or both (cf. also Ghirardi, Grassi, and Rimini,¹³ who suggested a partial remedy to this difficulty with Diósi's original scheme, at the expense of having to re-introduce an arbitrary additional scale parameter). In my own view, it is probably premature to put forward an explicit dynamics at this stage of understanding. Any such dynamics would be necessarily non-local and very probably non-computable.³⁰ The present article does not address the issue of an actual dynamics for OR. It is somewhat "minimalist" in its approach, merely providing general reasons for an effect of this nature and for the specific time-scale $t_G \approx \hbar/E_G$ that is being put forward here.

ᵇ There is also other strong evidence that Nature's own choice of "quantization" procedure for general relativity must be fundamentally non-standard. This evidence is to be found in the gross time-asymmetry that occurs in the structure of space-time singularities (cf. ref. 29, Chapter 7), which is the one direct piece of observational evidence concerning the implications of the actual nature of "true quantum gravity"!

ᶜ An example for which these alternative expressions may be expected to give different answers occurs with a cloud chamber, where the two states in superposition could be a charged particle entering the chamber and the particle not entering the chamber. The gravitational self-energy in the mass distribution in the condensing droplets would differ from that in the uniform distribution of vapour that occurs when there are no droplets.

4 Magnitude of the effect
It would be easy to believe that any effect that depends upon the gravitational field of an ordinary quantum system should be so tiny that it can be ignored completely. However, this is not necessarily the case, as a simple calculation will reveal. Think of a small sphere, of uniform density, which we are to put into a linear superposition of two slightly displaced positions. How large does it have to be for there to be a noticeable effect according to the above scheme? It turns out that it need not be very large at all, in everyday terms, although large for a quantum system. The value of $E_G$ depends upon the distance of displacement, but we find that as we increase the separation between the two components of the superposition, most of the effect comes from the displacement from coincidence to the contact position. The total remaining increase in $E_G$ all the way out to infinity, starting from contact, is only a further five sevenths of this original amount, for a uniform sphere. Thus, with the following rough numbers I shall ignore the actual distance displaced, assuming that this displacement is at least of the order of the diameter of the sphere. We then obtain the following approximate "decay-time" $t_G$, for a superposition between two lump positions, for a sphere of radius $a$ and uniform density $\rho$:

$$t_G \approx \frac{\hbar}{20\,G\rho^2 a^5}.$$
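The estimate above is easy to evaluate numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, SI units and rounded CODATA constants are assumed, and the helper `e_g_separated` assumes the double-integral form of $E_G$ together with Newton's shell theorem for two non-overlapping uniform spheres. Under those assumptions the value of $E_G$ at contact works out to $(112\pi^2/45)\,G\rho^2a^5 \approx 24.6\,G\rho^2a^5$, in line with the rounded factor 20, and the further increase out to infinity is five sevenths of the contact value.

```python
import math

# Physical constants in SI units (rounded CODATA values; assumed here).
HBAR = 1.054571817e-34   # reduced Planck constant, J s
G = 6.67430e-11          # Newton's gravitational constant, m^3 kg^-1 s^-2


def decay_time(radius_m, density=1000.0):
    """Rough decay time t_G = hbar / (20 G rho^2 a^5) for a uniform sphere."""
    return HBAR / (20.0 * G * density**2 * radius_m**5)


def e_g_separated(radius_m, separation_m, density=1000.0):
    """E_G for two non-overlapping uniform spheres (separation >= 2a).

    Assumption of this sketch: with E_G equal to G times the double
    integral of (rho - rho')(rho - rho') / |x - y|, the shell theorem
    gives E_G = G M^2 (12 / (5 a) - 2 / d).
    """
    if separation_m < 2.0 * radius_m:
        raise ValueError("formula only holds for non-overlapping spheres")
    mass = 4.0 / 3.0 * math.pi * radius_m**3 * density
    return G * mass**2 * (12.0 / (5.0 * radius_m) - 2.0 / separation_m)


if __name__ == "__main__":
    # Water-density spheres of a few sample radii (given in cm).
    for radius_cm in (1e-5, 1e-4, 1e-3):
        t = decay_time(radius_cm * 1e-2)
        print(f"a = {radius_cm:.0e} cm: t_G ~ {t:.2e} s")
```

The density default of 1000 kg/m³ corresponds to water; other materials only change the prefactor through $\rho^2$.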
We find that for a water-density sphere of radius $10^{-5}$ cm, the decay time is measured in hours; if of radius $10^{-4}$ cm, the decay time is about a twentieth of a second; if of radius $10^{-3}$ cm, the decay time is roughly one millionth of a second. Such decay times, for a quantum superposition, do not seem at all unreasonable.

In these considerations, I have made the unrealistic assumption that the body is completely uniform, whereas in fact actual physical bodies have a granular nature, being composed of atoms. We may consider that the atoms themselves are composed of nucleons and electrons, and that the nucleons are composed of quarks. Electrons and quarks are regarded as point particles, in conventional theory, and we might imagine that there is a serious difficulty in applying the above proposal in the context of point particles. Indeed, the gravitational self-energy of the difference between two delta-function mass distributions is manifestly infinite (the positive and negative delta functions each contributing to the self-energy with the same sign), and this would seem to give $t_G = 0$, i.e. instantaneous reduction. If this were the case, then quantum effects would never occur at all, which is in blatant contradiction with a vast body of observational data!

Clearly this is not the correct prescription. For a point-particle's wavefunction to have a mass-distribution expectation value which is a delta function, the particle's wavefunction itself would have to be a position-space delta function. Such a wavefunction cannot be a stationary state. Yet, the mass distributions that are involved in the above computation of the gravitational self-energy, leading to the reduction time $t_G$, have to be taken to be, indeed, stationary states. Such mass distributions are always spread out, to some degree, and cannot be position-space delta-functions. Nevertheless, the granular nature of the lump can have a significant effect on the computed value of $E_G$. We need a clear proposal telling us how to regard the basic mass distributions in our superpositions.

We shall be concerned, here, only with total states $|\Psi\rangle$ that are superpositions of pairs of states $|\psi\rangle$ and $|\psi'\rangle$, each of these two constituents being stationary. The gravitational field of the constituent $|\psi\rangle$ is taken to be a classical Newtonian field with potential $\Phi$, whose source is the expectation value of the mass density distribution $\rho$ in $|\psi\rangle$. This potential $\Phi$ is also to play a role as an ordinary potential term in the (stationary) Schrödinger equation for $|\psi\rangle$. This gives a non-linear coupled pair of equations that I refer to as the (stationary) Schrödinger-Newton equations.³²,²²,²³ The gravitational field of $|\psi'\rangle$ is treated correspondingly, giving rise to $\Phi'$, also in accordance with the stationary Schrödinger-Newton equations. The quantity $E_G$ is then unambiguously defined as the gravitational self-energy of the field whose potential is $\Phi' - \Phi$, as given in §3, above.

This proposal actually allows us to treat a "lump" that consists of a single point particle. It turns out that there is indeed an appropriate stationary solution to the Schrödinger-Newton equation for a point particle.²²,²³ If we take the particle to have the mass of a neutron, then the spatial spread in the wavefunction for this solution is about $10^9$ light years, which is getting on for the radius of the entire observable universe. A superposition of two such states, displaced with respect to each other, would only reduce to one or the other in a period that is enormously longer than the age of the universe, according to this scheme (absurdly longer than a neutron's radioactive decay time). There is clearly no conflict between the scheme being proposed here and experiments confirming quantum interference for individual neutrons.⁴⁰

Now, suppose that instead of a uniform sphere or a point particle, our "lump" is some realistic small macroscopic object. We may reasonably expect, when the gravitational energy is small compared with other energies involved in the lump, that the appropriate stationary solution of the Schrödinger-Newton equation would be very well approximated by a stationary solution of the Schrödinger equation alone (i.e. without the gravitational potential term $\Phi$), with fixed centre of mass. I would anticipate that the role of $\Phi$ in the Schrödinger-Newton equation in such circumstances is, in effect, simply to break a degeneracy that would occur if the $\Phi$-term were not present, whereby linear combinations of this solution with arbitrary spatial translations of it are also all stationary.

For definiteness, let us take our lump to be a small crystal for which the nuclei have rather well-defined locations. We do not expect these nuclear locations to be concentrated at exact points; rather, they would have a certain spread, depending upon the nature of the crystal. All this would be determined by the Schrödinger equation, where we assume that the crystal is in a stationary state at zero temperature. This, indeed, is appropriate for the experiment to be described in the next section. We find, for example, for a (Mössbauer-like) crystal with about $10^{15}$ nuclei, the value of
5 The FELIX experiment
I now describe a proposal for an experiment (FELIX = Free-orbit Experiment with Laser-Interferometry X-rays) that ought to be able to measure whether or not gravitational OR indeed takes place in accordance with the scheme set forth above. The idea is to place a number of tiny crystals, each perhaps not much larger than a speck of dust, into quantum superpositions of two slightly differing positions, where the physical displacement between the two components, in each case, is of the rough order of a nuclear diameter. This is done by hitting each crystal with a laser-produced X-ray photon that has been previously put into a quantum superposition of two beams, only one of which encounters the crystal.ᵈ The arrangement is illustrated in Figure 1. We see that after leaving the laser, each photon encounters a beam-splitter, and it is the transmitted part of the photon's state that encounters the crystal, being thereby reflected, and imparting some momentum to the crystal. The two reflected parts of the photon's state (one reflected from the crystal, the other from the beam-splitter) have to be kept coherent for about $10^{-1}$ s, which is the rough expectation of the crystal's state-reduction time, according to the above scheme, for a crystal with about $10^{15}$ nuclei. It is envisaged that this be achieved by performing the entire experiment out in space, where the laser, beam-splitter, and collection of crystals are on one space platform (although for technical reasons, it might turn out to be more effective to place the laser itself on a separate platform). The two components of each photon's state are then kept coherent by sending them to a distant space platform on which there are (probably two) X-ray mirrors, of the type that can exactly reverse the photon's momentum, returning each component precisely to its point of reflection (at the crystal or beam-splitter, respectively).
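The required coherence time fixes the scale of the platform separation. As a rough arithmetic check, and only a sketch: the function names are hypothetical, the photon is assumed to travel at the vacuum speed of light out to a mirror and straight back, and the mean Earth diameter is taken as about 12,742 km.

```python
# Minimal sketch: light-travel time between two platforms.
C = 299_792_458.0            # speed of light in vacuum, m/s (exact by definition)
EARTH_DIAMETER_M = 1.2742e7  # mean diameter of the Earth, m (approximate)


def round_trip_time(separation_m):
    """Time for a photon to travel out to a mirror and back, in seconds."""
    return 2.0 * separation_m / C


def separation_for(round_trip_s):
    """Platform separation giving the requested round-trip time, in metres."""
    return C * round_trip_s / 2.0


if __name__ == "__main__":
    print(f"round trip over one Earth diameter: "
          f"{round_trip_time(EARTH_DIAMETER_M):.3f} s")
    print(f"separation for a 0.1 s round trip: "
          f"{separation_for(0.1) / 1e3:.0f} km")
```

A round trip of about a tenth of a second therefore corresponds to a separation of roughly 15,000 km, comparable to an Earth diameter.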
If the distance between these two space platforms is of the order of an Earth diameter, then the complete trip will take roughly one tenth of a second, as required. Provided that the photon does not encounter any disturbing influence en route, such as an encounter with a stray molecule, there ought not to be significant environmental decoherence resulting from the photon's journey(s). Each crystal itself is thus placed in a quantum superposition of two slightly different locations: a "Schrödinger's cat"! One of these is slightly displaced, on account of the impact of the X-ray photon; the other is left undisturbed because the photon has been reflected from the beam-splitter rather than the crystal. Each crystal is suspended appropriately, perhaps by a suitable carbon fibre, thus providing a restoring force of just the right amount to bring the displaced component of the crystal exactly back to its original location (that prior to the impact by the photon) in the above time-scale ($\sim 10^{-1}$ s). The part of the photon's state which was initially reflected from the crystal is timed to return so as to impact it again exactly at the crystal's point of return, this impact being just such as to cancel the effect of the photon's first impact, and thereby to reduce the crystal to rest. By arranging the path lengths appropriately, the timing of the return of the other part of the photon's state (the one originally reflected from the beam-splitter) is taken to be such that the two returning components of the photon now both finally arrive at the beam-splitter exactly together, with relative phases such that if there have been no decoherence or OR effects in the whole process, then the two photon components will cohere exactly, and return the way that they came in, namely back into the laser. A detector placed in the alternative location, as indicated in Figure 1, will then detect nothing. But if decoherence or OR has occurred, then there would be a 50% probability of detection.

[Figure 1. The FELIX experiment. Labels: suspension, crystal, beam-splitter, detector, X-ray laser, space platform 1, X-ray mirrors, space platform 2, Earth-diameter separation.]

ᵈ I have received considerable stimulation and assistance from a number of people for the idea of this experiment. The idea of employing a Mössbauer-like crystal in an experiment of this general nature was suggested by Johannes Dapprich, and many details, together with some general encouragement, were provided by Anton Zeilinger and his group, as they then were, at the Institute of Physics in Innsbruck. Assistance with aspects of the space-based experiment described here was provided by Anders Hansson.

It would be a crucial feature of this experiment that, in addition to all the needed spatial and temporal precision (no mean task, in itself), there is no significant loss of coherence due to conventional processes, in any part of the experiment. The X-ray photon has to be kept from undesired encounters, and the various reflections cleanly achieved. In particular, the photon's impact on the crystal must be such that the crystal responds as a whole to the encounter, acting as one rigid body, in the manner of a Mössbauer crystal, rather than having internal modes of vibration excited. Moreover, the crystal itself must be undisturbed by decohering influences.
This applies to the suspension, so that no internal phonons are excited.ᵉ Any other tendency to disturbance of the crystal while in its superposed state, such as by stray molecules or electromagnetic influences, must be kept at a minimum. The crystal itself should be maintained at a low temperature (presumably close to absolute zero) and free of electric charge, to ensure the stationary nature of its state and to prevent it radiating. The idea would be to keep the conventional decoherence effects down as low as or lower than that predicted by the above OR scheme. Then it should be possible to pick out from the "noise" (of conventional decoherence) the very specific signature for the reduction rate predicted by the gravitational OR scheme, and thereby determine whether or not it is present. The experiment would have to be repeated a large number of times, not only to ascertain whether or not the expected 50% detection rate actually takes place at all when the time-scale of the experiment significantly exceeds [...] to be described by momentum-state wave trains.

There is a possibly problematic issue arising, however, when we consider the motion of the crystal. As the photon's wave train impinges upon the crystal, it will not convey an instantaneous impulse to it, but there will be a linear superposition of such impulses, spread over the temporal extent of the wave-train of the impinging X-ray photon. Owing to this consequent spread in the crystal's state, it is not cleanly a superposition of just two states, each of which is closely stationary. Thus, the preceding discussion (of §3) may not, strictly speaking, apply. Yet, I do not think that this is a matter of particular concern, because the spread in the crystal's state resulting from the present considerations will be small by comparison with the relatively gross (perhaps nuclear diameter or more) separation between the two crystal locations involved in the crystal's overall movement, where I am assuming that the spatial spread in the photon's wavefunction is small compared with the (say, Earth-diameter) separation of the space platforms.

ᵉ It may be that the suspension could be eliminated altogether, for such a space-based experiment, if a more complicated geometry is adopted to deal with the crystal's momentum in some other way, perhaps using multiple photon impacts.
6 Alternatives to the OR scheme?
There is some flexibility in the particular OR scheme that I have argued for here, the most obvious being that there might be some possible overall numerical factor in the reduction time, suggested by the $2\pi$-ambiguity referred to in §3. Indeed, the theoretical considerations of §2 and §3 do not seem to provide a convincing clear-cut numerical factor relating $E_G$ (the gravitational self-energy of the difference between the two gravitational fields involved) and the reciprocal of the average decay time [...] as in the schemes of Pearle²⁴,²⁵ and others.³⁵ In such models the R-process takes place continuously, so that the amplitudes that provide the weighting factors in a linear superposition (say of two displaced stationary states) would evolve with time. One of these amplitudes would ultimately increase to reach unit modulus and the remaining amplitude(s) would ultimately decrease to zero. It is an interesting question whether continuous state-reduction models such as these can be observationally distinguished, by experiments like FELIX, from those in which OR is taken to occur at a specific time, but with some random variation after the preparation of the state, as was being argued for in §2 and §3.

Some of the other gravitational OR schemes that have been suggested¹⁷,¹⁸,¹⁹,²⁰,⁶,¹³,³⁶ could well give quite different answers for the FELIX experiment. It would be interesting to see, in detail, how all these schemes might be experimentally distinguished from one another. In this connection, it should be made clear that all these schemes envisage that all state reduction takes place according to the specific OR scheme in question. This applies, in particular, to the detector in the FELIX experiment. Thus, if the state of the crystal has not spontaneously fully reduced by the time the photon returns to the beam-splitter (the amplitudes merely having been changed), the final reduction would occur in the detector itself.

In the specific gravitational OR proposal of §§2,3, I have envisaged superpositions of only two separate stationary states. It is not completely clear how superpositions of three or more such states should be handled. Different proposals for this could also be checked experimentally by FELIX-type schemes, but with more complicated beam-splitter arrangements. I do not have a specific proposal for a relevant OR theory, however, and it would probably be best to wait to see how the two-state systems fare before venturing into further experimental modifications.

As a final comment, it may well be that there are Earth-based experiments that could address the gravitational OR-issue with present-day technology. One ingenious modification of the experiment has been suggested by Lucien Hardy.¹⁴ This involves two X-ray photons, one of which first puts the (Mössbauer-like) crystal into a superposition (pre-selected by the photon finally entering a specified detector), and the other of which is carefully timed to come along later and respond to this superposition in a way that has a clear observational signature. It is not clear to me that there would be any advantages in this over FELIX, but this idea (and a number of other possibilities) is certainly worth exploring.
Acknowledgments

I am grateful to many colleagues for helpful and stimulating discussions, and to NSF for support under contract PHY 93-96246.
References

1. D. Bohm and B. Hiley, The Undivided Universe (Routledge, London, 1994).
2. E. Cartan, Sur les équations de la gravitation d'Einstein, J. Math. Pures et Appl. 1, 141-203 (1922) (cf. p. 194).
3. J. Christian, Exactly soluble sector of quantum gravity, Phys. Rev. D 56, 4844-77 (1997).
4. J. Christian, Why the quantum must yield to gravity, in Physics Meets Philosophy at the Planck Scale, eds. C. Callender and N. Huggett (Cambridge University Press, Cambridge, 2000).
5. L. Diósi, Phys. Lett. 120A, 377 (1987).
6. L. Diósi, Models for universal reduction of macroscopic quantum fluctuations, Phys. Rev. A 40, 1165-74 (1989).
7. L. Diósi and B. Lukács, Ann. Phys. 44, 488 (1987).
8. L. Diósi, personal communication (1999).
9. J. Ellis and D.V. Nanopoulos, Vacuum fluctuations and decoherence in mesoscopic and macroscopic systems, in Symposium on Flavour-Changing Neutral Currents: Present and Future Studies (UCLA, U.S.A., 1997).
10. R.P. Feynman, F.B. Morinigo and W.G. Wagner, The Feynman Lectures on Gravitation (Addison Wesley, Reading, Mass., 1995), §1.4, pp. 11-15.
11. K. Friedrichs, Math. Ann. 98, 566 (1927).
12. G.C. Ghirardi, A. Rimini and T. Weber, Unified dynamics for microscopic and macroscopic systems, Phys. Rev. D 34, 470 (1986).
13. G.C. Ghirardi, R. Grassi and A. Rimini, Continuous-spontaneous-reduction model involving gravity, Phys. Rev. A 42, 1057-64 (1990).
14. L. Hardy, personal communication (1998).
15. D. Home, Conceptual Foundations of Quantum Physics: An Overview from Modern Perspectives (Plenum Press, New York and London, 1997).
16. L.P. Hughston, R. Jozsa and W.K. Wootters, A complete classification of quantum ensembles having a given density matrix, Phys. Letters A 183, 14-18 (1993).
17. F. Károlyházy, Gravitation and quantum mechanics of macroscopic bodies, Nuovo Cim. A 42, 390 (1966).
18. F. Károlyházy, Gravitation and quantum mechanics of macroscopic bodies, Magyar Fizikai Folyóirat 12, 24 (1974).
19. F. Károlyházy, A. Frenkel and B. Lukács, On the possible role of gravity on the reduction of the wave function, in Quantum Concepts in Space and Time, eds. R. Penrose and C.J. Isham (Oxford University Press, Oxford, 1986) pp. 109-28.
20. T.W.B. Kibble, Is a semi-classical theory of gravity viable? in Quantum Gravity 2: A Second Oxford Symposium, eds. C.J. Isham, R. Penrose and D.W. Sciama (Oxford Univ. Press, Oxford, 1981) pp. 63-80.
21. A.B. Komar, Qualitative features of quantized gravitation, Int. J. Theor. Phys. 2, 157-60 (1969).
22. I. Moroz, R. Penrose and K.P. Tod, Spherically-symmetric solutions of the Schrödinger-Newton equations, Class. Quant. Grav. 15, 2733-42 (1998).
23. I. Moroz and K.P. Tod, An analytic approach to the Schrödinger-Newton equations, Nonlinearity, to appear.
24. P. Pearle, Models for reduction, in Quantum Concepts in Space and Time, eds. C.J. Isham and R. Penrose (Oxford Univ. Press, Oxford, 1985) pp. 84-108.
25. P. Pearle, Combining stochastic dynamical state-vector reduction with spontaneous localization, Phys. Rev. A 39, 2277-89 (1989).
26. P. Pearle and E.J. Squires, Gravity, energy conservation and parameter values in collapse models, Durham University preprint DTP/95/13 (1995).
27. R. Penrose, Time-asymmetry and quantum gravity, in Quantum Gravity 2: A Second Oxford Symposium, eds. D.W. Sciama, R. Penrose and C.J. Isham (Oxford University Press, Oxford, 1981) pp. 244-72.
28. R. Penrose, Gravity and state-vector reduction, in Quantum Concepts in Space and Time, eds. R. Penrose and C.J. Isham (Oxford University Press, Oxford, 1986) pp. 129-146.
29. R. Penrose, The Emperor's New Mind: Concerning Computers, Minds, and the Laws of Physics (Oxford University Press, Oxford, 1989).
30. R. Penrose, Shadows of the Mind: An Approach to the Missing Science of Consciousness (Oxford University Press, Oxford, 1994).
31. R. Penrose, On gravity's role in quantum state reduction, Gen. Rel. Grav. 28, 581-600 (1996).
32. R. Penrose, Quantum computation, entanglement and state-reduction, Phil. Trans. Roy. Soc. Lond. A 356, 1927-39 (1998).
33. R. Penrose, The Large, the Small and the Human Mind (Cambridge University Press, Cambridge, 2nd paperback edn., 1999).
34. R. Penrose, The Road to Reality (Vintage, 2000) to appear.
35. I.C. Percival, Primary state diffusion, Proc. R. Soc. Lond. A 447, 189-209 (1994).
36. I.C. Percival, Quantum spacetime fluctuations and primary state diffusion, Proc. R. Soc. Lond. A 451, 503-13 (1995).
37. E. Schrödinger, Die gegenwärtige Situation in der Quantenmechanik, Naturwissenschaften 23, 807-812, 823-828, 844-849 (1935). (Translation by J.T. Trimmer in Proc. Amer. Phil. Soc. 124, 323-38 (1980)), in Quantum Theory and Measurement, eds. J.A. Wheeler and W.H. Zurek (Princeton Univ. Press, Princeton, 1983).
38. A. Trautman, Foundations and current problems of general relativity theory, in Lectures on General Relativity, Brandeis 1964 Summer Institute on Theoretical Physics, vol. I, ed. A. Trautman, F.A.E. Pirani and H. Bondi (Prentice-Hall, Englewood Cliffs, N.J., 1965) pp. 7-248.
39. J. von Neumann, The Mathematical Foundations of Quantum Mechanics (Princeton University Press, Princeton, 1955), p. 351. [English translation, from German original, by Robert T. Beyer.]
40. A. Zeilinger, R. Gaehler, C.G. Shull and W. Mampe, Single and double slit diffraction of neutrons, Rev. Mod. Phys. 60, 1067 (1988).
SCHRÖDINGER OPERATORS IN THE TWENTY-FIRST CENTURY
BARRY SIMON Division of Physics, Mathematics, and Astronomy, 253-37 California Institute of Technology, Pasadena, CA 91125, USA E-mail: [email protected]
1 Introduction
Yogi Berra is reputed to have said, "Prediction is difficult, especially about the future." Lists of open problems are typically lists of problems on which you expect progress in a reasonable time scale and so they involve an element of prediction. We have seen remarkable progress in the past fifty years in our understanding of Schrödinger operators, as I discussed in Simon¹. In this companion piece, I present fifteen open problems. In 1984, I presented a list of open problems in Mathematical Physics, including thirteen in Schrödinger operators. Depending on how you count (since some are multiple), five have been solved.

We will focus on two main areas: anomalous transport (Section 2) where I expect progress in my lifetime, and Coulomb energies where some of the problems are so vast and so far from current technology that I do not expect them to be solved in my lifetime. (There is a story behind the use of this phrase. I have heard that when Jeans lectured in Göttingen around 1910 on his conjecture on the number of nodes in a cavity, Hilbert remarked that it was an interesting problem but it would not be solved in his lifetime. Two years later, Hilbert's own student, Weyl, solved the problem using in part techniques pioneered by Hilbert. So I figure the use of that phrase is a good jinx!) In a final section, I present two other problems.

2 Quantum Transport and Anomalous Spectral Behavior
For the past twenty-five years, a major thrust has involved the study of Schrödinger operators with ergodic potentials and unexpected spectral behavior of Schrödinger operators in slowly decaying potentials. (This is discussed in Sections 5 and 7 of Simon¹.) The simplest models of ergodic Schrödinger operators involve finite difference approximations. The first is the prototypical random model and the second, the prototypical almost periodic model.

Example 2.1. (Anderson model) Let $V_\omega(n)$ be a multisequence of independent, identically distributed random variables with distribution uniform on $[a, b]$. Here $n \in \mathbb{Z}^\nu$ is the multisequence label and $\omega$ the stochastic label. On $\ell^2(\mathbb{Z}^\nu)$, define

$$(h_\omega u)(n) = \sum_{|j| = 1} u(n + j) + V_\omega(n)\,u(n).$$
Example 2.2. (Almost Mathieu equation) On $\ell^2(\mathbb{Z})$, define

$$(h_{\alpha,\lambda,\theta}u)(n) = u(n + 1) + u(n - 1) + \lambda\cos(\pi\alpha n + \theta)\,u(n).$$
Here $\alpha$, $\lambda$ are fixed parameters, where $\alpha$ is usually required to be irrational and $\lambda$ is a coupling constant. $\theta$ runs in $[0, 2\pi)$ and plays a role similar to the $\omega$ of Example 2.1.

It is known that the Anderson model has spectrum $[a - 2\nu, b + 2\nu]$ and that if $\nu = 1$, the spectrum is dense pure point with probability 1, and if $\nu \geq 2$, this is true if $|b - a|$ is large enough (we will not try to recount the history here; see Simon² for proofs of these facts and some history) and also there is some pure point spectrum near the edges of the spectrum when $|b - a|$ is small.

Problem 1. (Extended states) Prove for $\nu \geq 3$ and suitable values of $b - a$ that the Anderson model has purely absolutely continuous spectrum in some energy range.

This is the big kahuna of this area, the problem whose solution will make a splash outside the field. In fact, just proving that there is any a.c. spectrum will cause a big stir. The belief is that for $|b - a|$ small, there is a subinterval $(c, d) \subset [a - 2\nu, b + 2\nu] = \sigma(h_\omega)$ on which the spectrum is purely a.c. and that on the complement of this interval, the spectrum is dense pure point. As $|b - a|$ increases beyond a critical value, $|d - c|$ goes to zero.

Problem 2. (Localization in two dimensions) Prove that for $\nu = 2$, the spectrum of the Anderson model is dense pure point for all values of $b - a$.

This is the general belief among physicists, although the claims for this model have fluctuated in time.

Problem 3. (Quantum diffusion) Prove that for $\nu \geq 3$ and values of $|b - a|$ where there is a.c. spectrum that $\sum_{n \in \mathbb{Z}^\nu} n^2\,|e^{itH}(n, 0)|^2$ grows as $ct$ as $t \to \infty$. That is, $\langle x(t)^2\rangle^{1/2} \sim ct^{1/2}$.

For scattering states, of course, the a.c. spectrum leads to ballistic behavior (i.e., $\langle x(t)^2\rangle^{1/2} \sim ct$) rather than diffusive behavior. This problem is one of a large number of issues concerning the long time dynamics of Schrödinger operators with unusual spectral properties.
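The quantity in Problem 3 can be explored numerically on a small lattice. The sketch below is illustrative only and says nothing about the conjectured regime $\nu \geq 3$: it takes $\nu = 1$, truncates the lattice to a finite chain with zero boundary values, and approximates the propagator by a truncated Taylor series, all of which are assumptions of this example (as are the function names). For $V = 0$ the mean-square displacement is exactly $2t^2$ (ballistic), while a strongly disordered Anderson potential suppresses the spreading.

```python
import random


def apply_h(psi, potential):
    """Apply (h u)(n) = u(n+1) + u(n-1) + V(n) u(n) on a finite chain.

    Sites outside the chain are treated as zero (Dirichlet truncation).
    """
    size = len(psi)
    out = [0j] * size
    for k in range(size):
        left = psi[k - 1] if k > 0 else 0j
        right = psi[k + 1] if k < size - 1 else 0j
        out[k] = left + right + potential[k] * psi[k]
    return out


def evolve(psi, potential, total_time, dt=0.05, order=20):
    """Approximate exp(-i t h) psi with a Taylor series on each small step."""
    steps = max(1, round(total_time / dt))
    step = total_time / steps
    for _ in range(steps):
        term = list(psi)
        result = list(psi)
        for k in range(1, order + 1):
            # term_k = (-i * step / k) * h * term_{k-1}
            term = [(-1j * step / k) * v for v in apply_h(term, potential)]
            result = [r + v for r, v in zip(result, term)]
        psi = result
    return psi


def mean_square_displacement(potential, total_time, half_width):
    """Sum of n^2 |psi_t(n)|^2 for a state started at the site n = 0."""
    psi = [0j] * (2 * half_width + 1)
    psi[half_width] = 1 + 0j
    psi = evolve(psi, potential, total_time)
    return sum((k - half_width) ** 2 * abs(a) ** 2 for k, a in enumerate(psi))


def anderson_potential(half_width, a, b, seed=0):
    """Independent samples, uniform on [a, b], one per lattice site."""
    rng = random.Random(seed)
    return [rng.uniform(a, b) for _ in range(2 * half_width + 1)]


if __name__ == "__main__":
    hw, t = 40, 5.0
    free = mean_square_displacement([0.0] * (2 * hw + 1), t, hw)
    disordered = mean_square_displacement(anderson_potential(hw, -5.0, 5.0), t, hw)
    print(f"free: {free:.6f}   disordered: {disordered:.6f}")
```

The chain half-width of 40 sites is chosen so that, for $t = 5$, the wave packet never reaches the boundary to any numerically relevant degree.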
An enormous amount is now known about the almost Mathieu model whose study is a fascinating laboratory. I would mention three remaining problems about it:

Problem 4. (Ten Martini problem) Prove for all $\lambda \neq 0$ and all irrational $\alpha$ that $\operatorname{spec}(h_{\alpha,\lambda,\theta})$ (which is $\theta$ independent) is a Cantor set, that is, that it is nowhere dense.

The problem name comes from an offer of Mark Kac. Bellissard-Simon³ proved the weak form of this for Baire generic pairs of $(\alpha, \lambda)$. It would be interesting to prove this even just at the self-dual point $\lambda = 2$.

Problem 5. Prove for all irrational $\alpha$ and $\lambda = 2$ that $\operatorname{spec}(h_{\alpha,\lambda,\theta})$ has measure zero.
This is known (Last⁴) for all irrational $\alpha$'s whose continued fraction expansion has unbounded entries. But it is open for $\alpha$ the golden mean, which is the value with the most numerical evidence! To prove this, one will need a new understanding of the problem.

Problem 6. Prove for all irrational $\alpha$ and $\lambda < 2$ that the spectrum is purely absolutely continuous.

It is known (Last⁵, Gesztesy-Simon⁶) that the Lebesgue measure of the a.c. spectrum is the same as the typical Lebesgue measure of the spectrum for all irrational $\alpha$ and $\lambda < 2$. The result is known (Jitomirskaya⁷) for all $\alpha$'s with good Diophantine properties but is open for other $\alpha$'s. One will need a new understanding of a.c. spectrum to handle the case of Liouville $\alpha$'s.

While we have focused on the almost Mathieu equation, the general almost periodic problem needs more understanding.

As for slowly decaying potentials, I will mention two problems:

Problem 7. Do there exist potentials $V(x)$ on $[0, \infty)$ so that $|V(x)| \leq C|x|^{-1/2 - \varepsilon}$ for some $\varepsilon > 0$ and so that $-\frac{d^2}{dx^2} + V$ has some singular continuous spectrum?
It is known that such models always have a.c. spectrum on all of $[0, \infty)$ (Remling 8, Christ-Kiselev 9, Deift-Killip 10, Killip 11). It is also known (Naboko 12, Simon 13) that such models can also have dense point spectrum. Can they have singular continuous spectrum as well?
Problem 8. Let $V$ be a function on $\mathbb{R}^\nu$ which obeys
$$\int |x|^{-\nu+1} |V(x)|^2 \, d^\nu x < \infty.$$
Prove that $-\Delta + V$ has a.c. spectrum of infinite multiplicity on $[0, \infty)$ if $\nu \ge 2$.
If $\nu = 1$, this is the result of Deift-Killip 10 (see also Killip 11). Their result implies the conjecture in this problem for spherically symmetric potentials (which is where the $|x|^{-\nu+1}$ comes from).
3
Coulomb Energies
The past thirty-five years have seen impressive development in the study of energies of Schrödinger operators with Coulomb potentials (see Sections 9 and 11 of Simon 1 or the review of Lieb 14), of which the high points were stability of matter, the three-term asymptotics of the total binding energy of a large atom, and some considerable information on how many electrons a given nucleus can bind. While these results involve deep mathematics, except for stability of matter, they are very remote from problems of real physics. Since one does not often fully ionize an atom, total binding energies are not important, but rather single ionization energies are. Understanding the binding energies of atoms and molecules is a huge task for mathematical physics. The problems in this section may be signposts along the way. As we progress, the problems will get less specific. We will deal throughout with fermion electrons. $\mathcal{H}_f$ will be the space of functions antisymmetric in spin and space in $L^2(\mathbb{R}^{3N}; \mathbb{C}^{2^N})$.
Define $H(N, Z)$ to be the Hamiltonian on $\mathcal{H}_f$,
$$H(N, Z) = \sum_{i=1}^{N} \Bigl(-\Delta_i - \frac{Z}{|x_i|}\Bigr) + \sum_{i<j} \frac{1}{|x_i - x_j|}$$
and
$$E(N, Z) = \min \operatorname{spec} H(N, Z).$$
$N_0(Z)$ is defined to be the smallest value of $N$ for which $E(N+j, Z) = E(N, Z)$ for $j = 1, 2, 3, \ldots$. Ruskai 15,16 and Sigal 17,18 showed such an $N_0(Z)$ exists. Lieb 19 showed that $N_0(Z) \le 2Z$ and Lieb et al. 20 that $N_0(Z)/Z \to 1$ as $Z$ goes to infinity. By Zhislin 21, we know $N_0(Z) \ge Z$.
Problem 9. Prove that $N_0(Z) - Z$ is bounded as $Z \to \infty$.
It is not an unreasonable conjecture that $N_0(Z)$ is always either $Z$ or $Z + 1$. One has (see Simon 1 for detailed references)
$$E(Z) = \min_N E(N, Z) = aZ^{7/3} + bZ^2 + cZ^{5/3} + o(Z^{5/3}),$$
but more physically significant is the ionization energy $(\delta E)(Z) = E(Z, Z - 1) - E(Z, Z)$.
Problem 10. What is the asymptotics of $(\delta E)(Z)$ as $Z \to \infty$?
There is a closely related issue: to define a radius of an atom (perhaps that $R(Z)$ so that $N - 1$ electrons are within the ball of radius $R$) and determine the asymptotics of $R(Z)$.
Problem 11. Make mathematical sense of the shell model of an atom.
This is a vague problem, but the issue is what the most popular model used by atomic physicists and chemical physicists has to do with the exact quantum theory. Here is an even vaguer problem:
Problem 12. Is there a mathematical sense in which one can justify from first principles current techniques for determining molecular configurations?
Drug designers and others use computer programs that claim to determine configurations of fairly large molecules. While one technique these programs use is called ab initio, all that means is they use few parameter molecular orbitals. This problem should be viewed as asking for some precise way to go from fundamental quantum theory to configuration of macromolecules. Finally,
Problem 13. Prove that the ground state of some neutral system of molecules and electrons approaches a periodic limit as the number of nuclei goes to infinity.
That is, prove crystals exist from first quantum principles.
4
Other Problems
Here are two final open problems:
Problem 14. Prove the integrated density of states, $k(E)$, is continuous in the energy.
For a definition of $k(E)$, see Cycon et al. 22. Continuity is known in one dimension and for the discrete case, but has been open in the higher-dimensional continuum case for over fifteen years.
Problem 15. Prove the Lieb-Thirring conjecture on their constants $L_{\gamma,\nu}$ for $\nu = 1$ and $\frac12 < \gamma < \frac32$.
$L_{\gamma,\nu}$ is defined to be the smallest constant so that
$$\sum_j |e_j(V)|^\gamma \le L_{\gamma,\nu} \int |V(x)|^{\gamma + \nu/2} \, d^\nu x,$$
where $e_j(V)$ is the $j$th negative eigenvalue of $-\Delta + V$ on $L^2(\mathbb{R}^\nu)$. Here $\gamma \ge \frac12$ in $\nu = 1$ dimension and $\gamma > 0$ in dimensions $\nu \ge 2$. Two lower bounds on $L_{\gamma,\nu}$ can be computed: the quasiclassical value $L^{\mathrm{cl}}_{\gamma,\nu}$ and the best constant $L^{(1)}_{\gamma,\nu}$ for one bound state (which is related to best constants in Sobolev inequalities). For $\nu = 1$, Lieb-Thirring 23 conjectured
$$L_{\gamma,\nu} = \max\bigl(L^{\mathrm{cl}}_{\gamma,\nu}, L^{(1)}_{\gamma,\nu}\bigr),$$
which is $L^{\mathrm{cl}}_{\gamma,\nu}$ if $\gamma \ge \frac32$ and $L^{(1)}_{\gamma,\nu}$ if $\frac12 \le \gamma \le \frac32$. The conjecture is known to hold if $\gamma \ge \frac32$ (Aizenman-Lieb 24) and if $\gamma = \frac12$ (Hundertmark-Lieb-Thomas 25). Also open is the best value of the constant if $\nu \ge 2$ and $0 < \gamma < \frac32$. It is known that if $\nu \ge 8$ and $\gamma = 0$, $L_{\gamma,\nu} > \max\bigl(L^{\mathrm{cl}}_{\gamma,\nu}, L^{(1)}_{\gamma,\nu}\bigr)$ with strict inequality.
Acknowledgments
This material is based upon work supported by the National Science Foundation under Grant No. DMS-9707661. The Government has certain rights in this material.
References
1. B. Simon, J. Math. Phys. (to appear).
2. B. Simon, in Proc. Mathematical Quantum Theory, II: Schrödinger Operators, eds. J. Feldman, R. Froese and L. Rosen (CRM Proc. Lecture Notes 8, 1995).
3. J. Bellissard and B. Simon, J. Funct. Anal. 48, 408 (1982).
4. Y. Last, Commun. Math. Phys. 164, 421 (1994).
5. Y. Last, Commun. Math. Phys. 151, 183 (1993).
6. F. Gesztesy and B. Simon, Acta Math. 176, 49 (1996).
7. S. Jitomirskaya, Ann. of Math. (to appear).
8. C. Remling, Commun. Math. Phys. 193, 151 (1998).
9. M. Christ and A. Kiselev, J. Amer. Math. Soc. 11, 771 (1998).
10. P. Deift and R. Killip, Commun. Math. Phys. 203, 341 (1999).
11. R. Killip, in preparation.
12. S.N. Naboko, Theor. Math. Phys. 68, 18 (1986).
13. B. Simon, Proc. Amer. Math. Soc. 125, 203 (1997).
14. E. Lieb, Bull. Amer. Math. Soc. 22, 1 (1990).
15. M.B. Ruskai, Commun. Math. Phys. 82, 457 (1982).
16. M.B. Ruskai, Commun. Math. Phys. 85, 325 (1982).
17. I.M. Sigal, Commun. Math. Phys. 85, 309 (1982).
18. I.M. Sigal, Ann. Phys. 157, 307 (1984).
19. E. Lieb, Phys. Rev. A29, 3018 (1984).
20. E. Lieb et al., Phys. Rev. Lett. 52, 994 (1984).
21. G.M. Zhislin, Tr. Mosk. Mat. Obs. 9, 81 (1960).
22. H. Cycon et al., Schrödinger Operators With Application to Quantum Mechanics and Global Geometry (Springer, Berlin, 1987).
23. E. Lieb and W. Thirring, in Studies in Mathematical Physics: Essays in Honor of Valentine Bargmann, eds. E. Lieb, B. Simon, and A. Wightman (Princeton University Press, Princeton, 1976).
24. M. Aizenman and E. Lieb, Phys. Lett. 66A, 427 (1978).
25. D. Hundertmark, E. Lieb, and L. Thomas, Adv. Theor. Math. Phys. 2, 719 (1998).
THE CLASSICAL THREE-BODY PROBLEM - WHERE IS ABSTRACT MATHEMATICS, PHYSICAL INTUITION, COMPUTATIONAL PHYSICS MOST POWERFUL?
H. A. POSCH
Institut für Experimentalphysik, Universität Wien, Boltzmanngasse 5, A-1090 Wien, Austria
E-mail: [email protected]
W. THIRRING
Institut für Theoretische Physik, Universität Wien, Boltzmanngasse 5, A-1090 Wien, Austria
E-mail: [email protected]
We show how different aspects of the restricted three-body problem can be understood with physical intuition, rigorous mathematics and computer simulations. The first explains the short time stability, the second tells us when it is stable for all times, and the third shows when and why chaos takes over.
1
Introduction
The three-body problem is very old (see Reference 1 for a historic review which starts even with the Babylonians) and an immense literature has accumulated over the centuries. How can one think that one can make a new contribution to it? It is not that we possess new observational data, but the computer puts us in a better position than previous generations. Any idea which would have taken years to verify or falsify with a slide rule can now be settled within seconds. Furthermore, unlike astronomers we can change the mass ratios at will to understand the various mechanisms and to see when and why things become chaotic. Of course, a general solution is impossible and would also be too complicated to be of any use. So we concentrate on some limited but relevant questions on the limiting situation where one body is so light that it does not influence the (circular) motion of the two others. The answers to these questions require different tools and we shall formulate them such that they make use of physical intuition, rigorous analysis and computational methods.
Question 1. Even if the second body is much lighter than the heaviest one, its influence on the third is much less than a naive estimate would tell us. For instance, $M_{\rm Jupiter}/M_\odot \sim 1/1000$, but without the sun it would take Mars at rest only about 200 years to fall freely into Jupiter. But its Kepler orbit is stable for a much longer time, merely its eccentricity is about twice that of the Earth. What exactly is the mechanism which stabilizes the orbit?
Answer 1. The radial motion of nearly circular orbits is like a harmonic oscillator, and the influence of Jupiter is like periodic kicks (better: pulls). From the kicked oscillator one knows that the amplitude of the induced oscillations gets damped again if one is not at a resonance, and the kicks get out of phase. We shall underpin this by an elementary calculation and illustrate it by computer simulations below. If resonance conditions apply, the amplitude increases linearly with time, but then one gets into the nonlinear domain and out of phase with the kicks. Whether this comes in time to quench the oscillations or whether the situation is already out of hand depends on the strength of the kicks, i.e. $M_J$.
Question 2. For which initial conditions can one guarantee stability ad aeternitatem?
Answer 2. Since the orbits can become so complex, this question cannot be settled by naive models, and computers cannot calculate to $t = \infty$. So this is the domain of mathematical proofs. But for them one has to be prepared for the worst situation, and any rational frequency ratio is a possible resonance. Though one can show that for small perturbations there are regions of finite measure which are stable, one had to cut out (perhaps unnecessarily) so many pieces in phase space that for the system sun + Jupiter + small planet one is numerically still far away from a proof of stability.
Question 3. One has learned at school that if there is no other constant than the Hamiltonian, the system becomes ergodic. Computer studies show that for confining potentials $|x_i - x_j|^\nu$, $\nu > 0$, the orbits for several particles seem ergodic on the energy-angular momentum shell 2. Is this still true here?
Answer 3. According to Answer 2, for small perturbations this is not the case. But only the computer can give a hint how strong the perturbation has to be for ergodicity. Answer 1 gives a clue for the mechanism of instability. If the kicks are too strong, so that the planet will spill over and come near the sun or Jupiter before the quenching becomes effective, it will be completely thrown out of its orbit and there is no stabilizing mechanism any more. A simple estimate shows that this happens for $M_J/M_\odot > 1/100$, and then the computer shows that there are large chaotic regions but they contain islands of regularity.
They shrink with increasing $M_J/M_\odot$ and look rather weird, not like a submanifold given by another constant of motion $K(x, y, p_x, p_y) = \text{const}$. Sometimes they are connected by a small bottleneck with other parts of the energy shell and the orbit fails to find the hole in a reasonable time. The impression one gets from these considerations is that our solar system must be very cleverly constructed to be stable over such a long time 3,4. Extensive computer-aided calculations show that the Liapunov time in the planetary system is of the order of $10^7$ years and, thus, much shorter than its age. Jupiter is not too heavy but far enough from the sun to carry most of the angular momentum. This stabilizes the plane of motion, otherwise the inclination of the orbits would be random. Furthermore, all planetary orbits are nearly circular, and the two groups of outer and inner planets are fairly evenly spaced. Presumably, in the early solar system there were many more planets, but their orbits did not comply with the above stability specifications, so they collided, fell into the sun or were thrown out of the solar system. In the newly-discovered planetary systems, where the heaviest planet has about 1/10 of the mass of the central star, the orbits of the other unseen planets must be so chaotic that they cannot provide a sufficiently well-tempered climate for life to exist.
Figure 1. Effective potential $V_{\rm eff}$ for the 2:1 resonance, $r_0 = 2^{-2/3}$. The dashed curve is the harmonic approximation for the kicked-oscillator model.
2
Intuitive Argument
We consider here the situation where the two heavy bodies ("sun and Jupiter") make a circular orbit, and the third (the "planet") has a negligible mass (restricted 3-body problem). Furthermore, all move in the same plane. For the planet's motion the configuration space is 2-dimensional, the phase space is 4-dimensional, and there is one constant of the motion, the Hamiltonian in the rotating system (equivalent to the "Jacobi constant"). In those parts of phase space where the planet cannot escape, no other constant is known and we have the simplest situation of a nonintegrable system. We shall start with an almost circular orbit of the planet, because in our solar system most eccentricities are small and these orbits are apparently the most stable ones. Without Jupiter the effective radial potential is (in suitable units a)
$$V_{\rm eff}(r) = -\frac{1}{r} + \frac{L^2}{2r^2},$$
and the circular orbit is in the minimum of this potential. Here, $L$ is the angular momentum. The potential is depicted in Fig. 1. Now we shall naively guess what the effect of Jupiter might be on an orbit inside its circle. We are interested in mass ratios $M_\odot/M_J$ between 10 and 1000, so Jupiter should not immediately throw the planet out of orbit. Since the force is $\sim M_J |x - x_J|^{-2}$ it should be most noticeable when the planet is on Jupiter's side of the sun and Jupiter pulls the planet outward of the minimum of $V_{\rm eff}$. Of course, there will also be an azimuthal force, but this will be first accelerating and then decelerating, so we think it will largely average out and forget about it. About the force $f$ of Jupiter, we only assume that it is periodic with a period $\tau = 2\pi/(\omega - 1)$, where $\omega$ is the unperturbed angular velocity of the planet, that of Jupiter being unity in our units. $\tau$ is the time between successive conjunctions of Jupiter and the planet. Though the orbit of Jupiter is strictly periodic, the one of the planet is not, so $f(t) = f(t + \tau)$ is not quite correct. But we think it is a good approximation. Thus, if we concentrate on the radial motion of the planet, the complex coordinate $z = p_r + i(r - r_0)$, $V'_{\rm eff}(r_0) = 0$, obeys
$$\dot z(t) = i\omega z(t) + f(t), \qquad V''_{\rm eff}(r_0) = \omega^2, \qquad (1)$$
near the minimum $r_0$. In the solution
$$z(t) = e^{i\omega t} z(0) + \int_0^t dt' \, e^{i\omega(t - t')} f(t') \qquad (2)$$
the two terms have spectra $\{\omega\}$ and $\{\omega\} \cup \mathbb{Z}$, respectively. In particular,
$$\int_0^\tau dt' \, f(t') \, e^{i\omega(\tau - t')} =: e^{i\omega\tau} K$$
shows that for all times the change of $z$ during a period $\tau$,
$$z(\tau) = e^{i\omega\tau} \bigl(z(0) + K\bigr), \qquad (3)$$
depends only on $\omega = r_0^{-3/2}$ and the constant $K$. Since the detailed form of $f(t)$ does not enter, this gives us confidence that (3) might be a good guess, and we iterate it to
$$z(n\tau) = e^{i\omega n\tau} \Bigl(z(0) + K \, \frac{1 - e^{-i\omega n\tau}}{1 - e^{-i\omega\tau}}\Bigr), \qquad n \in \mathbb{Z}. \qquad (4)$$
Footnote a: Reduced units are used for which the sum of the masses $M_\odot + M_J$ of the primaries, the sun-Jupiter distance, and the angular velocity of Jupiter are unity.
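The one-period map (3) and its closed form (4) are easy to check against each other numerically. The following Python sketch iterates $z \to e^{i\omega\tau}(z + K)$ and compares the result with (4); it also shows the linear growth $z(n\tau) = nK$ at the resonance $\omega = 2$, where (4) is not defined. The function names and the numerical values ($r_0 = 0.6$, $K = 0.0086$) are illustrative assumptions, not values fixed by the text.

```python
import cmath
import math

def kick_map(z, K, omega):
    """One period of the map (3): z -> exp(i*omega*tau) * (z + K)."""
    tau = 2.0 * math.pi / (omega - 1.0)       # time between conjunctions
    return cmath.exp(1j * omega * tau) * (z + K)

def iterate(z0, K, omega, n):
    """Apply the one-period map n times."""
    z = z0
    for _ in range(n):
        z = kick_map(z, K, omega)
    return z

def closed_form(z0, K, omega, n):
    """Closed form (4); only valid off resonance, where exp(-i*omega*tau) != 1."""
    tau = 2.0 * math.pi / (omega - 1.0)
    q = cmath.exp(-1j * omega * tau)
    return cmath.exp(1j * omega * n * tau) * (z0 + K * (1 - q**n) / (1 - q))

if __name__ == "__main__":
    K = 0.0086                                # illustrative kick strength
    omega = 0.6 ** -1.5                       # omega = r0^(-3/2) for r0 = 0.6
    for n in (1, 5, 20, 100):
        print(n, abs(iterate(0j, K, omega, n)), abs(closed_form(0j, K, omega, n)))
    # At the resonance omega = 2 the amplitude grows linearly in n.
    print(abs(iterate(0j, K, 2.0, 100)), 100 * K)
```

Off resonance the modulus $|z(n\tau)|$ stays below $2K/|1 - e^{-i\omega\tau}|$ for $z(0) = 0$, which is the "thrust reversal" in its simplest form.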
To get an idea of the planetary motion, we have in Fig. 2 replaced the effect of Jupiter by periodic kicks, $f(t) = K \sum_n \delta(t - n\tau)$, where, in a generous mood, we have computed $K$ as half of the total accumulated force of a planet passing Jupiter on a straight line with the correct minimal distance $1 - r_0$ and relative velocity $v = 1/\sqrt{r_0} - 1$,
$$K = \frac{M_J}{2} \int_{-\infty}^{\infty} dt \, \frac{1 - r_0}{\bigl[(1 - r_0)^2 + v^2 t^2\bigr]^{3/2}} = \frac{M_J}{v(1 - r_0)}. \qquad (5)$$
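The value of the kick strength in (5) can be checked by direct quadrature. The sketch below assumes that the integrand is the component of Jupiter's attraction perpendicular to a straight-line fly-by at distance $1 - r_0$ with relative speed $v = 1/\sqrt{r_0} - 1$; this reading of (5), the substitution $t = \frac{1 - r_0}{v}\tan u$ and the function names are assumptions made for illustration.

```python
import math

def kick_strength(m_j, r0, steps=20000):
    """Half of the time-integrated transverse pull of Jupiter during a
    straight-line fly-by (assumed reading of eq. (5)), by the midpoint rule."""
    d = 1.0 - r0                         # minimal distance to Jupiter
    v = 1.0 / math.sqrt(r0) - 1.0        # relative velocity
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        u = -0.5 * math.pi + (i + 0.5) * h
        t = (d / v) * math.tan(u)        # maps (-pi/2, pi/2) onto the real line
        force = m_j * d / (d * d + v * v * t * t) ** 1.5
        dt_du = (d / v) / math.cos(u) ** 2
        total += force * dt_du * h
    return 0.5 * total

def kick_strength_exact(m_j, r0):
    """Closed form M_J / (v * (1 - r0))."""
    return m_j / ((1.0 / math.sqrt(r0) - 1.0) * (1.0 - r0))

if __name__ == "__main__":
    for r0 in (0.55, 0.60, 0.62, 2.0 ** (-2.0 / 3.0)):
        print(r0, kick_strength(0.001, r0), kick_strength_exact(0.001, r0))
```

For $M_J = 0.001$ the kick is of order $10^{-2}$, small compared with the orbital momentum of order one, which is why many kicks are needed before anything noticeable happens.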
We do not insist on this hair-raisingly crude approximation, but to our surprise it worked rather well, as will be shown below. What we learn from (4) is that the periodic pull of Jupiter excites radial oscillations of the planet, but unless $\omega = 2, 3, 4, \ldots$, these oscillations eventually get out of phase with the period of the pull. Thus, after some time there will be a "thrust reversal", and the oscillations will be damped again until one comes close to the original configuration. In more detail, the influence of Jupiter will be most noticeable near a resonance $\omega = g + \epsilon$, $g \in \mathbb{Z}$, $\epsilon$
Figure 2. Radial planetary motion, perturbed by Jupiter, for various unperturbed circular-orbit radii $r_0$. The mass ratio $M_J/M_\odot = 0.001/0.999$. $r(t)$ denotes the separation from the sun. The smooth lines are the "exact" computer-simulation results, and the dashed lines are for the kicked-oscillator model described in the text. From top to bottom: $r_0 = 0.55$, 0.60, 0.62, and $1/2^{2/3} = 0.62996$.
near 1/2 we get thrust reversal and for $n\epsilon$ near 1 they go back to the order of $K$. Since $K$ is of the order $M_J/M_\odot \sim 10^{-3}$, only a small region near $\omega \in \mathbb{Z}$ is dangerous. However, even $\omega = g$ might not be catastrophic because the resonances have a built-in self-quenching mechanism. If we start, say, with $\omega = r_0^{-3/2} = 2$, $\tau = 2\pi$, then $r_{\max} = r_0 + \max_{2\pi n < t < 2\pi(n+1)} \operatorname{Im} z(t)$ will determine the frequency after some time. The harmonic approximation to $V_{\rm eff}$ will break down and $\omega$ becomes $r_{\max}^{-3/2} \ne 2$. Hence, we will get thrust reversal, and whether this comes in time before $r_{\max} \sim r_0 + nK$ is close to one depends on the strength of $K$. To follow this analytically by improving our crude model is very tedious and at this stage it is better to consult the computer to see what is going to happen. For our numerical work in this section the equations of motion are derived from the Hamiltonian in the center-of-mass frame rotating with Jupiter,
$$H = \frac12 \bigl(p_x^2 + p_y^2\bigr) + y p_x - x p_y - \frac{M_\odot}{r_\odot} - \frac{M_J}{r_J}, \qquad (6)$$
where $r_\odot$ and $r_J$ are the distances of the planet from the sun and from Jupiter, the sun and Jupiter are located at $(M_J, 0)$ and $(-M_\odot, 0)$, respectively, and where $M_\odot + M_J = 1$. They are integrated with a variable-step-size Runge-Kutta algorithm of fourth order, keeping the energy constant to 10 significant digits for 30,000 Jupiter periods. Since in this section only slightly perturbed circular orbits are considered, no regularization of the equations of motion is required 6. In all cases, the planet is initially located on the $x$-axis at $x(0) = r(0) - M_J$, with a velocity in $y$-direction corresponding to the respective unperturbed circular orbit ($M_J = 0$) with radius $r_0$.
In Fig. 2 we compare the "exact" simulation results (smooth lines) with the predictions of the kicked-oscillator model (dashed lines) for a perturbed orbit near and at the 2:1 resonance. The mass ratio $M_J/M_\odot = 0.001/0.999$. As before, $r(t)$ denotes the radial distance from the sun. The unperturbed radius $r_0$ corresponds, from top to bottom, to 0.55, 0.60, 0.62, and $2^{-2/3} = 0.62996$, and is indicated by the labels. According to this model, $r(t)$ oscillates between the kicks occurring at the times $n\tau$, $n = 0, 1, 2, \ldots$, with the unperturbed angular velocity $\omega$ and with an amplitude determined from (4). It is surprising that away from the major 2:1 resonance at $r_0 = 2^{-2/3}$ this simple model gives a rather good description of the eccentricity of the orbit. Not unexpectedly, the model breaks down at the resonance, for which it predicts an undisturbed linear increase of the amplitude with time, whereas the exact oscillations are damped by self-quenching as mentioned above. From the different scales in Fig. 2 we infer that the oscillations are much less pronounced when one moves away from the resonance. To study this phase mismatch between the orbit and the periodic pull in more detail, we show in Fig. 3 the radial oscillations at resonance, $r_0 = 2^{-2/3}$.
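A minimal version of such a simulation fits in a few lines. The Python sketch below integrates Hamilton's equations for a rotating-frame Hamiltonian of the form $H = \frac12(p_x^2 + p_y^2) + y p_x - x p_y - M_\odot/r_\odot - M_J/r_J$ and monitors the conservation of $H$. It is only a sketch under stated assumptions: it uses a fixed-step fourth-order Runge-Kutta scheme instead of the variable-step one mentioned above, no regularization, a short run, and it places the planet at distance $r_0$ from the sun on the side away from Jupiter; all names are hypothetical.

```python
import math

M_J = 0.001                    # mass of "Jupiter" in reduced units
M_S = 1.0 - M_J                # mass of the "sun"
SUN = (M_J, 0.0)               # fixed positions in the co-rotating frame
JUP = (-M_S, 0.0)

def hamiltonian(s):
    """Rotating-frame energy (Jacobi constant up to convention)."""
    x, y, px, py = s
    rs = math.hypot(x - SUN[0], y)
    rj = math.hypot(x - JUP[0], y)
    return 0.5 * (px * px + py * py) + y * px - x * py - M_S / rs - M_J / rj

def deriv(s):
    """Hamilton's equations: (dx/dt, dy/dt, dpx/dt, dpy/dt)."""
    x, y, px, py = s
    rs3 = math.hypot(x - SUN[0], y) ** 3
    rj3 = math.hypot(x - JUP[0], y) ** 3
    fx = -M_S * (x - SUN[0]) / rs3 - M_J * (x - JUP[0]) / rj3   # -dV/dx
    fy = -M_S * y / rs3 - M_J * y / rj3                          # -dV/dy
    return (px + y, py - x, py + fx, -px + fy)

def rk4_step(s, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))
    k1 = deriv(s)
    k2 = deriv(shifted(k1, 0.5 * dt))
    k3 = deriv(shifted(k2, 0.5 * dt))
    k4 = deriv(shifted(k3, dt))
    return tuple(si + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

def simulate(r0=0.6, t_end=20.0, dt=1e-3):
    """Return (r_min, r_max, |H(t_end) - H(0)|) for a nearly circular start."""
    s = (SUN[0] + r0, 0.0, 0.0, 1.0 / math.sqrt(r0))   # assumed initial state
    e0 = hamiltonian(s)
    r_min = r_max = r0
    for _ in range(int(round(t_end / dt))):
        s = rk4_step(s, dt)
        r = math.hypot(s[0] - SUN[0], s[1])
        r_min, r_max = min(r_min, r), max(r_max, r)
    return r_min, r_max, abs(hamiltonian(s) - e0)

if __name__ == "__main__":
    print(simulate())
```

Changing `M_J` at the top is the numerical counterpart of "changing the mass ratios at will"; for close encounters a fixed step without regularization is no longer adequate.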
The perturbed amplitude starts to grow linearly with time, until it reaches the nonlinear regime of the effective radial potential depicted in Fig. 1, and the trajectory gets out of phase with Jupiter. As a consequence, the radial displacement is quenched again and the whole process is repeated. This phase mismatch becomes apparent also in Fig. 4, where the time intervals $\Delta$ between successive maxima of $r(t)$ in Fig. 3 are plotted at the end of each interval. $\Delta$ differs significantly from $\pi$, which is the unperturbed period of the planet in this case, equal to half the period of Jupiter. For most of the time, $\Delta < \pi$, and the phase shift accumulates until the force exerted by Jupiter damps the motion again.
Figure 3. Radial oscillations of the perturbed planetary orbit for the 2:1 resonance with Jupiter. The mass ratio $M_J/M_\odot = 0.001/0.999$. $r$ is the distance from the sun.
So far we have restricted our discussion to the radial oscillations, which according to Fig. 3 turn out not to be symmetrical around $r_0$. Surprisingly, the largest amplitudes occur for $r < r_0$, for which the effective potential $V_{\rm eff}$ increases more steeply than for $r > r_0$. This subtlety cannot be captured by the kicked-oscillator model and severely limits our intuition. A closer look at the exact computer-generated trajectories reveals that the largest amplitudes for $r(t)$ mainly occur in a direction not aligned with Jupiter in our co-rotating frame. Also fractional resonances like the Saturn-Jupiter 2:5 resonance depicted in Fig. 5 are not contained in (4). As the figure shows, however, the radial oscillations are small and show an interesting double periodicity. This orbit is not bound by the Jacobi constant (see Fig. 7) to a finite region in configuration space. Nevertheless it is stable for a long time due to the action of the Coriolis forces in the rotating frame.
3
Rigorous Mathematics
One of the dogmas of classical statistical mechanics is that even if a system is not in equilibrium, since in addition to $H$ there are some other constants of the motion, a little speck of dust ("Staubkörnchen") will break them and render the system ergodic. Many great scientists tried to prove that, or even thought that they could prove it, but finally light was shed on this question by Kolmogorov, Arnold and Moser (KAM theorem) 7. What they proved was not that some constants persist for small perturbations but that in regions of phase space with a finite measure the orbit stays on a submanifold homeomorphic to a torus. Thus, for small perturbations the system does not become ergodic.
Figure 4. Time difference $\Delta$ between successive maxima for the perturbed orbit shown in Fig. 3, plotted at times $t$ at the end of each interval. The unperturbed planetary period is $\pi$.
The proof proceeds as follows. If we have an integrable system with action variables $I_j$ and an unperturbed Hamiltonian $H_0(I_j)$ and add $\lambda H'(I_j, \varphi_j)$,
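with the Fourier expansion and the unperturbed frequencies
$$H'(I, \varphi) = \sum_{k \in \mathbb{Z}^m} H_k(I) \, e^{i(k \cdot \varphi)}, \qquad \omega_j = \frac{\partial H_0}{\partial I_j}, \quad j = 1, \ldots, m,$$
and a generating function $S(I, \varphi)$ we set
$$\omega_j \frac{\partial S}{\partial \varphi_j} + H'(I, \varphi) = H_{k=0}(I), \qquad (7)$$
then
$$H = H_0(I) + \lambda H_{k=0}(I) + \lambda^2 H'_2(I, \varphi). \qquad (8)$$
(7) is solved by
$$S(I, \varphi) = -\sum_{k \ne 0} \frac{H_k(I)}{i(\omega \cdot k)} \, e^{i(k \cdot \varphi)} \qquad (9)$$
and we fail if
a1) $(\omega \cdot k) = 0$ for some $0 \ne k \in \mathbb{Z}^m$, or
a2) $\sum_{k \ne 0}$ diverges.
Figure 5. Radial oscillations of the perturbed planetary orbit for a fractional 2:5 resonance with Jupiter. The mass ratio $M_J/M_\odot = 0.001/0.999$, and the unperturbed radius $r_0 = (2/5)^{-2/3} = 1.8420$.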
a1) means that the $\omega_j$ are not linearly independent, $\exists \, 0 \ne k \in \mathbb{Z}^m$, $\omega_1 k_1 + \omega_2 k_2 + \cdots + \omega_m k_m = 0$, and we have the resonance situations considered in Sect. 2. Although a term in (9) becomes infinite in this case, this does not mean that in the orbit something becomes infinite. It only means that it cannot be described by (8). To see this more explicitly consider a simplified "Jupiter-Saturn" resonance
$$H = 2I_1 + 5I_2 + \lambda \sin(5\varphi_1 - 2\varphi_2) \qquad (10)$$
$$\Rightarrow \quad \dot\varphi_1 = 2, \quad \dot\varphi_2 = 5, \quad \varphi_1(t) = \varphi_1(0) + 2t, \quad \varphi_2(t) = \varphi_2(0) + 5t,$$
$$I_1(t) = I_1(0) - 5\lambda t c, \qquad I_2(t) = I_2(0) + 2\lambda t c, \qquad c = \cos\bigl(5\varphi_1(0) - 2\varphi_2(0)\bigr).$$
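The linear drift of the actions in the resonant model can be confirmed by integrating Hamilton's equations directly. The following Python sketch does this for a Hamiltonian of the form $H = 2I_1 + 5I_2 + \lambda\sin(5\varphi_1 - 2\varphi_2)$ with the convention $\dot\varphi_j = \partial H/\partial I_j$, $\dot I_j = -\partial H/\partial\varphi_j$; the sign convention, the value of $\lambda$, the initial data and the function names are assumptions chosen for illustration.

```python
import math

LAM = 0.01   # illustrative coupling constant

def deriv(state):
    """Hamilton's equations for H = 2*I1 + 5*I2 + LAM*sin(5*phi1 - 2*phi2)."""
    phi1, phi2, i1, i2 = state
    c = math.cos(5.0 * phi1 - 2.0 * phi2)
    return (2.0, 5.0, -5.0 * LAM * c, 2.0 * LAM * c)

def rk4(state, t_end, dt=1e-2):
    """Integrate with fixed-step fourth-order Runge-Kutta up to t_end."""
    def shifted(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))
    for _ in range(int(round(t_end / dt))):
        k1 = deriv(state)
        k2 = deriv(shifted(state, k1, 0.5 * dt))
        k3 = deriv(shifted(state, k2, 0.5 * dt))
        k4 = deriv(shifted(state, k3, dt))
        state = tuple(s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

if __name__ == "__main__":
    start = (0.3, 0.1, 1.0, 1.0)             # (phi1, phi2, I1, I2) at t = 0
    c = math.cos(5.0 * start[0] - 2.0 * start[1])
    for t in (1.0, 5.0, 10.0):
        end = rk4(start, t)
        print(t, end[2] - start[2], -5.0 * LAM * t * c)
```

The combination $2I_1 + 5I_2$ stays constant along the flow, so the growth takes place along a single direction in action space.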
Thus, nothing drastic happens except that the action variables increase linearly in time. Mathematically, this is harmless, since the group structure of the time evolution tells us that the worst case is exponential growth. In reality this would be catastrophic if it were to go on forever, but we have seen in Sect. 2 that the linear increase of the amplitude of oscillation is quenched by nonlinear effects, which break the resonance. Nevertheless, in our strategy we have to be prepared for the worst and stay away from points in phase space where the frequencies are rationally related. In fact, in our restricted three-body problem we seem to be in trouble right at the beginning because in the two-body Kepler problem the angular and radial frequencies $(\omega_\varphi, \omega_r)$ are not only rationally related on some points but equal in all of phase space where $H < 0$. This difficulty is spurious since we have to go into the frame rotating with Jupiter and there (Ref. 5, 4.4.12) the Hamiltonian becomes ($M_J = \mu$, $M_\odot = 1$, $x_J = (1, 0)$)
$$H = \frac12 \Bigl(p_r^2 + \frac{L^2}{r^2}\Bigr) - \frac{1}{r} - L$$
and $\omega_\varphi = \omega_r - 1$. (Jupiter is now fixed and for circular orbits $\omega_r = r_0^{-3/2}$, so for $r_0 = 1$ we have $\omega_\varphi = 0$.) However, the perturbation $H' = \mu (r^2 - 2r\cos\varphi + 1)^{-1/2}$ is not a polynomial in the exponentials of the angle variables since $r$ is rather complicated when expressed by action-angle variables (Ref. 5, 5.3.15,2). Thus all $H_{k_1,k_2}(I)$ will be different from zero and to avoid $\omega_r k_1 + \omega_\varphi k_2 = 0$ we have to delete all rational $\omega_\varphi/\omega_r = 1 - r_0^{3/2}$. Since this set is dense in phase space, $H = H^\infty(I_k)$ cannot hold in an open set and we still seem to be in trouble. One might cherish some hope because this set has no interior points and is of measure zero. This hope is destroyed by a2). For the series (9) to converge we need not only $(\omega \cdot k) \ne 0$ but it has to stay sufficiently far away from zero. However, since the rationals are dense in $\mathbb{R}$ we can approximate $\omega_r/\omega_\varphi$ closely by $k_2/k_1$ if the $k$'s are sufficiently big. So the situation can be saved only if the $H_k$ decrease sufficiently with increasing $k$. It is known that if $H'$ is $r$-times differentiable $H_k$ decreases with a power $r$, and if $H'$ is analytic it decreases exponentially. Away from $r = 1$, $\varphi = 0$ we have the latter situation, so $H_k$ can beat any power. Thus, if
$$|H_k| \le c \, e^{-\rho|k|}, \qquad |k| = \sum_{j=1}^{m} |k_j|,$$
in the regions of phase space where for some $n$ we have
$$|(\omega \cdot k)| \ge \frac{\epsilon}{|k|^n} \qquad \forall \, 0 \ne k \in \mathbb{Z}^m, \qquad (11)$$
there is no problem with the convergence in (9) since
$$\sum_k |k|^n \, e^{-\rho|k| + i(k \cdot \varphi)} < \infty.$$
We even have analyticity for $|\operatorname{Im} \varphi_j| < \rho$. But are there $\omega$'s which satisfy (11)? The good set $G$ is in our planar case
$$G = \Bigl\{(\omega_r, \omega_\varphi) : \forall k \ne 0 \quad \Bigl|\frac{\omega_\varphi}{\omega_r} - \frac{k_1}{k_2}\Bigr| > \frac{\epsilon}{k_2^{\,n}}\Bigr\}, \qquad (12)$$
so its complement does not only contain all the rationals. It even contains an open neighbourhood of each of them. To some extent this agrees with our previous experience where it did not make much difference whether one is exactly on the resonance or just close, but now we learn that the bad set $G^c$ is not only dense but also open. It is surprising that there is still something left over for $G$, and people with a brilliant physics intuition thought that there is not. Yet a simple consideration shows that the measure of $G^c$ goes with $\epsilon$ to zero. We may consider in our case $0 < \omega_\varphi/\omega_r < 1$, so $k_1$ and $k_2$ have the same sign (say positive) and $k_1 < k_2 + 1$. Now we just add the length of the dangerous intervals around $\omega_\varphi/\omega_r$ given by (12). Since they might overlap we get an inequality, which, however, goes in the right direction,
$$\mu(G^c) \le \sum_{k_2=1}^{\infty} \sum_{k_1=1}^{k_2+1} \frac{2\epsilon}{k_2^{\,n}} = 2\epsilon \sum_{k_2=1}^{\infty} \frac{k_2 + 1}{k_2^{\,n}}. \qquad (13)$$
Thus for $n$ sufficiently big and for small $\epsilon$, there is a lot left over for $G$ where first order perturbation theory works.
b) The Iteration. If we include $\lambda H_{k=0}$ into $H_0$ then the Hamiltonian regains its original form except that $\lambda$ is replaced by $\lambda^2$. Before starting the same procedure again we have to check whether the resonance condition holds. In fact, the new term will add $\lambda \, \partial H_{k=0}/\partial I_j$ to the frequencies and may break a resonance in $H_0$, the effect we encountered in Sect. 2. However, by the same token it may also throw us into a resonance and we have to be able to avoid that by moving a little with the action variables. This would not help in the simple example (10) where the frequencies are fixed. One needs at least some quadratic terms in the action variables such that the Hessian
$$C = \Bigl(\frac{\partial^2 H_0}{\partial I_i \, \partial I_j}\Bigr), \qquad \det C \ne 0.$$
If $H_0$ is quadratic in the $I_j$ one can manage this with some effort 5; for the general case we recommend Ref. 11, or for more courageous people the original paper by Arnold 8.
c) The Convergence. In the terminology of physicists we have carried out a renormalization group transformation, and now we have to prove that it leads to a fixed point. What one needs
is that for some norm $\| \cdot \|$ at each step $\|H'_n\|$ gets smaller than the square of the previous one, since the recursive relation
$$\|H'_{n+1}\| \le \gamma \delta^{-3} \|H'_n\|^2$$
implies
$$\gamma \delta^{-3} \|H'_{n+1}\| \le \bigl(\gamma \delta^{-3} \|H'_1\|\bigr)^{2^n}.$$
Thus, if $\gamma \delta^{-3} \|H'_1\| < 1$, for $n \to \infty$ $\|H'_n\|$ converges to zero and we have reached our goal. The constant $\gamma$ contains among other things the perturbation parameter $\lambda$, and by making it sufficiently small we can always satisfy this inequality. The estimate of $\delta$ is very cumbersome and contains also $\|C^{-1}\|$ and, alas, the price for stability from now to eternity is high. An estimate by Hénon 10 limits $\lambda$ to be $< 10^{-50}$. Celletti and Chierchia 9 have truncated $\sum_k H_k e^{i(k \cdot \varphi)}$ and got the limit down to $10^{-6}$, but what one would need is $M_J/M_\odot \sim 10^{-3}$. This truncation is not mathematically rigorous but physically reasonable since in our planetary system there are more important influences. This is like in music where a consonant interval contains higher overtones which are strongly dissonant, but they are only faintly excited and do not bother us.
4
The Computer
If the perturbation MJ/MQ becomes bigger than about 1/100, then the orbits come out of their harmonic shelter too soon for the stabilizing factor to become effective, and the orbit will come close to the sun or Jupiter. Then they will be thrown out of their original circle and the orbit becomes chaotic. This is demonstrated in Fig. 6 for a few "exact" perturbed trajectories distinguished by the mass ratio MJ/MQ. In terms of the kicked-oscillator model, for the quenching mechanism to be effective it is essential that the beat it strictly observed. If the planet is thrown out too far of its harmonic regime, where the frequency of the radial motion is independent of the amplitude, it never gets the rhythm. Thus, it is bound to happen that sometimes a few kicks come into phase and throw the planet beyond the point of no return. Then there is no stabilizing mechanism, and chaos prevails. For these large perturbations the analysis of Sect. 3 certainly does not apply, and the first guess is that then the system becomes ergodic. For this to be true one first has to make sure that the orbit remains in a compact region in phase space. If it escapes to infinity then with probability one it has also come from infinity and one has a scattering situation. In this case one even has the maximal number of constants of the motion (three in our case) and one is in the opposite extreme of ergodicity. However, in the rotating frame the Hamiltonian (6) can be written 5 ' 6 H = \ [(Px + y)2 + (py - x)2} + fi(i, y), thus like for a particle in a constant magnetic field perpendicular to the plane of motion and a potential
Q(x,y) = -2{MQrl
+ Mjrj) - - \-±
+— j +
^ —
301
1.0
1/999
0.8 0.6 0.4 0.2
1.0
1/99
0.8 0.6
*rtJ\!\l\j\!\J\l^^
0.4 0.2
Figure 6. Radial planetary motion, perturbed by Jupiter, for the unperturbed circular-orbit radius ro = 0.6. The mass ratio MJ/MQ increases from top to bottom as indicated by the label.
(see Fig. 7). Here, r© = [(x - 1/2)2 + y2]1'2 and rj = [(x + 1/2)2 + y2}1'2 are the distance of the planet from the sun and from Jupiter, respectively, where we use for convenience a rotating frame in which the sun is located at (1/2,0) and Jupiter at (-1/2,0). The regions Q < E are time invariant and are compact in configuration space for sufficiently low energy E. So the question is in this case whether the energy shell H = E is covered uniformly by the orbit or whether it is divided further by hitherto not-discovered constants. We shall see that neither
Figure 7. The surface $\Omega(x,y)$ for a mass ratio $M_J/M_\odot = 1/9$. The sun is located at (0.5,0), Jupiter at (-0.5,0).
Figure 8. Cut through the $\Omega$-surface of Fig. 7 along the x axis. The mass ratio $M_J/M_\odot = 1/9$. The horizontal line corresponds to an energy $E = -1.795$.
seems to be the case. Since neither physical intuition nor rigorous mathematics is in a position to answer this question, we have to avail ourselves of modern computer technology. Ergodicity means that the time average of the orbit gives a homogeneous density on the energy shell. The former we have to calculate on the computer and
Figure 9. Probability density in configuration space (x, y) for an energy $E = -1.795$ corresponding to the horizontal dashed line in Fig. 8. The points of this stroboscopic map are taken from a single chaotic trajectory lasting for about 30000 Jupiter years. The mass ratio $M_J/M_\odot = 1/9$. The sun in this rotating frame is located at (0.5,0), Jupiter at (-0.5,0).
the latter, $\delta(H(x,y;p_x,p_y) - E)$, becomes particularly simple when projected onto configuration space, as follows from the more general Bohr-van Leeuwen-type theorem (see Ref. 14, 2.5.39,1):
Theorem (14). In two dimensions the microcanonical density in configuration space of a particle in an arbitrary potential and arbitrary magnetic field is constant in the energetically allowed region.
Proof: With
\[ H = \tfrac{1}{2}\left[(p_x - A_x(x,y))^2 + (p_y - A_y(x,y))^2\right] + V(x,y), \]
one finds
\[ \rho(x,y) = \int dp_x\,dp_y\,\delta(H - E) = \int dv_x\,dv_y\,\delta\!\left(\tfrac{1}{2}(v_x^2 + v_y^2) + V(x,y) - E\right) = 2\pi\,\Theta(E - V(x,y)), \]
with $v_i = p_i - A_i$ and $\Theta$ the step function.
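The content of Theorem (14) can be checked numerically. The following is only a minimal sketch, not part of the original computation: it smears the delta function into a thin energy shell of width `eps` and measures the shell's area on a grid in the velocity plane. The function name `shell_density`, the grid parameters and the sample values of $E$ and $V$ are all illustrative assumptions.

```python
import math

def shell_density(E, V, eps=0.1, half_width=2.0, n=400):
    """Approximate rho = integral of delta(v^2/2 + V - E) over the velocity plane.

    The delta function is smeared into a shell of width eps in energy; the
    area of that shell on an n x n midpoint grid, divided by eps, estimates rho.
    All parameter values here are illustrative assumptions.
    """
    h = 2.0 * half_width / n
    count = 0
    for i in range(n):
        vx = -half_width + (i + 0.5) * h
        for j in range(n):
            vy = -half_width + (j + 0.5) * h
            if abs(0.5 * (vx * vx + vy * vy) + V - E) < 0.5 * eps:
                count += 1
    return count * h * h / eps

E = -1.0
allowed_shallow = shell_density(E, V=-1.5)   # E - V = 0.5, allowed region
allowed_deep = shell_density(E, V=-2.5)      # E - V = 1.5, allowed region
forbidden = shell_density(E, V=-0.5)         # E - V < 0, forbidden region
print(allowed_shallow, allowed_deep, forbidden, 2.0 * math.pi)
```

In the allowed region both estimates should be close to $2\pi$, independent of how deep the potential is at that point, and the estimate vanishes where $V > E$. The vector potential never enters because it is removed by the shift $v_i = p_i - A_i$.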
Figure 10. Double-sided Poincaré map for mass ratio $M_J/M_\odot = 1/9$ and an energy $E = -1.795$ corresponding to the horizontal dashed line in Fig. 8. The sun in the rotating frame is located at (0.5,0), Jupiter at (-0.5,0). In some of the larger regularity islands the closed sections of regular tori for the same total energy are also shown.
To study this chaotic behaviour in more detail we have followed the dynamical evolution on the computer. Since we are concerned with long chaotic trajectories, a regularization procedure according to Birkhoff is used to remove the singularities at the positions of both primaries (Refs. 13, 6). In combination with a fourth-order Runge-Kutta algorithm with variable time step we ascertain that the energy is conserved to 10 significant digits over the whole length of the simulation. In Fig. 9 a stroboscopic map reflecting the probability density in configuration space is shown. The energy $E = -1.795$ was chosen to allow for a narrow channel between the sun and Jupiter, and corresponds to the dashed horizontal line in Fig. 8. The initial configuration for this trajectory, which is followed for 30000 Jupiter years, is at the position of the central saddle point between the sun and Jupiter in Fig. 7, with the planet velocity pointing towards the sun. Clearly, the distribution of points in Fig. 9 is almost homogeneous. The fact that the theorem does not strictly apply is a consequence of the fact that the system is not ergodic. This may be seen more clearly by looking at other phase-space projections, say onto the $(x, v_x)$-plane, which are harder to treat theoretically. In Fig. 10 a double-sided Poincaré map in the $(x, v_x)$-plane is shown for the same chaotic trajectory as in Fig. 9. The plotted points correspond to states for which the
velocities $\dot{y} = v_y$ in the y-direction may be positive or negative. The results show that there are large islands of regularity in the chaotic sea, so the system is not ergodic. Nevertheless, the consequences of Theorem (14) are quite well satisfied, and the density in configuration space is nearly homogeneous. In Fig. 10 the sections of a few regular tori are also shown in some of the regularity islands for the same energy, $E = -1.795$.
Acknowledgments
We gratefully acknowledge helpful discussions with Heide Narnhofer and Rudolf Dvorak, and support from the Fonds zur Förderung der wissenschaftlichen Forschung, Grant No. P11428-PHY.
References
1. M. Gutzwiller, Rev. Mod. Phys. 70, 589 (1998).
2. Lj. Milanovic, H.A. Posch, and W. Thirring, Phys. Rev. E 57, 2763 (1998).
3. J. Laskar, Icarus 88, 266 (1990).
4. J. Laskar and Ph. Robutel, Nature 361, 608 (1993).
5. W. Thirring, Classical Mathematical Physics: Dynamical Systems and Field Theories (Springer, New York, 1997).
6. V. Szebehely, Theory of Orbits: The Restricted Problem of Three Bodies (Academic Press, New York, 1967).
7. J. Moser, Stable and Random Motions in Dynamical Systems (Princeton University Press, Princeton, 1973).
8. V. Arnold, Dokl. Akad. Nauk. 142, 758 (1962).
9. A. Celletti and L. Chierchia, Planet. Space Sci. 46, 1433 (1998).
10. M. Henon, Bulletin Astronom. Soc. 3, 49 (1966).
11. G. Gallavotti, The Elements of Mechanics (Springer, New York, 1983).
12. The Dynamical Behaviour of our Planetary System, R. Dvorak and J. Henrard (eds.) (Kluwer, Dordrecht, 1997).
13. G. D. Birkhoff, Rend. Circ. Mat. Palermo 39, 1 (1915); also Collected Mathematical Papers, Vol. 1, p. 628 (Am. Math. Soc., New York, 1950).
14. W. Thirring, A Course in Mathematical Physics Vol. IV: Quantum Mechanics of Large Systems (Springer, New York, 1982).
INFINITE PARTICLE SYSTEMS AND THEIR SCALING LIMITS
S.R.S. VARADHAN
Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012, USA
E-mail: [email protected]
The subject of nonequilibrium statistical mechanics deals with the question of how a complex system with one or more conserved quantities approaches its equilibrium. Although initially investigated in the context of classical mechanical systems that conserve mass, momenta and energy, its paradigm is relevant in the study of stochastic particle systems with conserved quantities and multiple equilibria. In fact the inherent noise in the stochastic systems provides a degree of stability that is absent in classical systems. During the last twenty years several such models have been studied, with varying amounts of noise, and a body of rigorous results has been obtained. A convenient class of examples goes by the general name of lattice gas models. The physical space is the lattice $Z^d$ of points with integer coordinates. The sites $x \in Z^d$ may be occupied by particles. The state of the system is described by specifying the number of particles $\eta(x)$ present at the site $x$. It is also possible that there are several types of particles, in which case $\eta(x)$ has multiple components specifying the number of particles of each type. The state space then is the collection $\Omega$ of all possible such functions $\omega = \eta(\cdot)$. The dynamics is a stochastic process, indeed a Markov process, with $\Omega$ as its state space. The infinitesimal generator of this process specifies the dynamics. Since the state space is essentially discrete, the generator describes all possible potential changes that can happen in the system and the rate at which these changes occur. The rate is the intensity parameter of the Poisson process whose events trigger the change. The rate of a certain change $\sigma$ will in general depend on the current state $\omega = \eta(\cdot)$, and a full specification of the dynamics is achieved by the family of rate functions $\lambda(\sigma, \omega)$. If $f : \Omega \to R$ is a function on $\Omega$, the infinitesimal generator is given by
\[ (\mathcal{A}f)(\omega) = \sum_{\sigma} \lambda(\sigma,\omega)\,[f(\sigma\omega) - f(\omega)] \]
where $\sigma\omega$ is the new state after the change. While the list of possible potential changes is usually infinite, if $f$ is a local function, all but a finite number of them satisfy $f(\sigma\omega) = f(\omega)$. But still, actually constructing the dynamics from the infinitesimal generator requires doing some analysis, not unlike integrating systems with infinitely many degrees of freedom. Liggett 1 is a good source to see how this is done. One could replace $Z^d$ by its periodic version $Z_N^d$ where the coordinates are considered modulo $N$. Then the new state space $\Omega_N$ is a finite set and the analytical problem of constructing the dynamics goes away. The randomness in the dynamics comes from the random occurrences of the underlying Poisson times when the changes occur, and the interaction inherent in making the rates depend on
$\omega$ adds a layer of complication, making the order in which events occur important. Let us suppose for simplicity that there are two types of particles and the changes involve one of the particles migrating to a neighboring site. Since the total number of particles of each type in the system is conserved, we expect that in $\Omega_N$ the equilibria will consist of a family $\mu_{n_1,n_2}$ of probability distributions on $\Omega_N$ corresponding to specified numbers of particles $n_1, n_2$ of each type. If the dynamics is translation invariant, then the equilibria will be too. We now let $N \to \infty$ in such a way that
\[ \lim_{N\to\infty} \frac{n_i}{N^d} = \rho_i \]
exists, and expect (in the absence of any phase transitions) limits $\mu_{\rho_1,\rho_2}$ on $\Omega$ which should be a family of equilibria for the dynamics on $\Omega$. If we start from an arbitrary initial state in $\Omega_N$ with $n_1$ and $n_2$ particles of each type, it may take a long time before the statistical equilibrium $\mu_{n_1,n_2}$ is reached. Translation invariance requires the particles to be more or less uniformly distributed and this may require roughly $O(N^{d+1})$ migrations. The most optimistic estimate of the time needed for this is $O(N)$. If the particles wander aimlessly or diffuse we may have to wait for a much longer time of $O(N^2)$. Depending therefore on the situation, in timescale $Nt$ or $N^2t$, something interesting may happen. The system may reach a local equilibrium, and if we rescale $Z_N^d$ by $y = x/N$, near $y \in T^d$ and at time $N^\alpha t$ (with $\alpha = 1$ or 2 as the case may be), the local equilibrium may be specified by $\rho_1(t,y), \rho_2(t,y)$, and in the microscale of lattice length the joint statistics of $\{\eta(x, N^\alpha t)\}$ when $x/N$ is near $y$ is presumably close to the ones specified by the stationary process $P_{\rho_1(t,y),\rho_2(t,y)}$ on $\Omega$. This raises the related issue of how to determine $\rho(t,y)$ as functions of $t$ and $y$. Often one can establish a partial differential equation (a system in our case) that governs the evolution of $\rho_1$ and $\rho_2$. The equation is the analog of the Euler equation. If $\alpha = 1$ it is usually a nonlinear hyperbolic system and if $\alpha = 2$ it is parabolic. The initial conditions of the microscopic evolution determine the initial conditions of the partial differential equation (macroscopic dynamics). The primary goal is to determine the appropriate differential equation and then establish a rigorous connection between microscopic and macroscopic dynamics. Because of the stochastic nature of the problem, there are questions of fluctuation theory as well as large deviations that are interesting but often prove difficult.
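On the finite periodic lattice the dynamics defined by a family of rates can be simulated directly: wait an exponential time whose parameter is the total rate, then pick one of the possible changes with probability proportional to its rate. The following is a minimal sketch under stated assumptions. The helper names `gillespie_step` and `ring_hops`, the single particle type, and the hop rates 1.0 (right) and 0.5 (left) are hypothetical choices for illustration, not a model taken from the text.

```python
import random

def gillespie_step(state, transitions, rng):
    """Perform one jump of a continuous-time Markov chain.

    transitions(state) must return a list of (rate, new_state) pairs, playing
    the role of the rate functions lambda(sigma, omega).
    """
    moves = [(rate, new) for rate, new in transitions(state) if rate > 0.0]
    total = sum(rate for rate, _ in moves)
    if total == 0.0:
        return float("inf"), state           # absorbing state: nothing can happen
    waiting_time = rng.expovariate(total)     # exponential clock with the total rate
    threshold = rng.random() * total          # choose a change proportionally to its rate
    running = 0.0
    for rate, new in moves:
        running += rate
        if threshold < running:
            return waiting_time, new
    return waiting_time, moves[-1][1]         # guard against rounding at the upper end

def ring_hops(eta, right=1.0, left=0.5):
    """Hypothetical rates: every occupied site sends one particle to a neighbour."""
    n = len(eta)
    moves = []
    for x in range(n):
        if eta[x] > 0:
            for step, rate in ((1, right), (-1, left)):
                new = list(eta)
                new[x] -= 1
                new[(x + step) % n] += 1
                moves.append((rate, tuple(new)))
    return moves

rng = random.Random(0)
state, t = (3, 0, 0, 0, 0, 0, 0, 0, 0, 0), 0.0
while t < 5.0:
    dt, state = gillespie_step(state, ring_hops, rng)
    t += dt
print(t, state)
```

Every change moves one particle to a neighbouring site, so the total particle number is conserved along the trajectory, which is the conservation law behind the family of equilibria indexed by particle number.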
We shall give a brief description of what is known and what is not for some specific models. The simplest model is the asymmetric simple exclusion model on $Z$, where the possible states of the system consist of having either one particle or no particle (of a single type) at each site. It can be described by either specifying the set $A$ of occupied sites or the function $\eta(x)$ which equals 1 if there is a particle at $x$ and 0 otherwise. The possible changes are the movement of a single particle from an occupied site $x$ to an empty site $x'$. There is a limit $r$ such that this can happen only if $|x - x'| \le r$. The rate at which this change occurs is $p(x' - x)$. We normalize so that $\sum_{|z| \le r} p(z) = 1$, and of course $p(z) = 0$ if $|z| > r$. It is convenient to
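introduce the notation
\[ \eta^{x,x'}(z) = \begin{cases} \eta(x) & \text{if } z = x' \\ \eta(x') & \text{if } z = x \\ \eta(z) & \text{if } z \neq x, x' \end{cases} \]
so that the infinitesimal generator can be written as
\[ (\mathcal{A}f)(\eta) = \sum_{x,x'} p(x' - x)\,\eta(x)\,(1 - \eta(x'))\,[f(\eta^{x,x'}) - f(\eta)]. \]
The quantity
\[ m = \sum_{z} z\,p(z) \]
plays an important role. If we assume that $m \neq 0$, say $m > 0$ to be specific, the limiting macroscopic equation is
\[ \frac{\partial \rho}{\partial t} + m\,\frac{\partial}{\partial y}\left[\rho(1 - \rho)\right] = 0. \qquad (1) \]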
If we start with the physical space of $Z$ and the configuration space of $\Omega$, equation (1) is to be considered on $[0,\infty) \times R$. If we start from $Z_N$ and $\Omega_N$ then we need to view equation (1) with periodic boundary conditions $\rho(t,0) = \rho(t,1)$ on $[0,\infty) \times [0,1]$ or on $[0,\infty) \times T^1$, where $T^1$ is the 1-torus. The initial condition $\rho(0,\cdot) = \rho_0(\cdot)$ for equation (1) is determined by the initial configuration of the microscopic evolution $\eta_0(\cdot)$, which may or may not be random. In any case, it is natural to assume that if the probability distribution of $\eta_0(\cdot)$ is specified by $\mu_N(0)$ on $\Omega$ or $\Omega_N$, it is consistent with $\rho_0(\cdot)$ in the sense that for any test function $J(\cdot)$, with compact support on $R$ or on $T^1$ as the case may be,
\[ \lim_{N\to\infty} \frac{1}{N} \sum_{x} J\!\left(\frac{x}{N}\right)\eta(x) = \int J(y)\,\rho_0(y)\,dy \qquad (2) \]
in probability with respect to $\mu_N(0)$. The stochastic dynamics gives us a distribution $\mu_N(Nt)$ at microscopic time $Nt$ or macroscopic time $t$. Equation (1) gives us (hopefully!) a unique solution $\rho(t,\cdot)$ at the macroscopic time $t$. The first step in any theory is to establish a rigorous connection between the two. The least one should expect is that for suitable test functions
\[ \lim_{N\to\infty} \frac{1}{N} \sum_{x} J\!\left(\frac{x}{N}\right)\eta(x) = \int J(y)\,\rho(t,y)\,dy \qquad (3) \]
in probability with respect to $\mu_N(Nt)$. One could formulate stronger versions of this relationship. First a few words about how equation (1) is formally arrived at. If we take the functional
\[ f_N(\eta) = \frac{1}{N} \sum_{x} J\!\left(\frac{x}{N}\right)\eta(x) \]
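and compute $N(\mathcal{A}f_N)(\eta)$ (the prefactor $N$ is to account for the speedup to the macroscopic time scale), one gets asymptotically
\[ N(\mathcal{A}f_N)(\eta) \simeq \frac{1}{N} \sum_{x,x'} J'\!\left(\frac{x}{N}\right)\eta(x)\,(1 - \eta(x'))\,(x' - x)\,p(x' - x). \]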
We now remark that the one-parameter family of equilibria on $\Omega$ relevant for this problem are the Bernoulli product measures $\{\mu_\rho\}$ with $\mu_\rho[\eta(x) = 1] = \rho$. If we now assume that we are nearly in a local equilibrium and the relation (3) is valid, then it is easily seen that
\[ N(\mathcal{A}f_N)(\eta) \simeq m \int J'(y)\,\rho(t,y)\,(1 - \rho(t,y))\,dy. \]
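The averaging step behind this formula is that under a Bernoulli product measure the expected current through a site is $m\rho(1-\rho)$. This can be verified by exact enumeration, as in the following minimal sketch. The function name `mean_current`, the jump kernel `p` and the density value are hypothetical choices made only for illustration.

```python
from itertools import product

def mean_current(p, rho):
    """Exact expectation of sum_z z p(z) eta(0) (1 - eta(z)) under Bernoulli(rho).

    p maps nonzero jump sizes z to rates p(z); all configurations of the sites
    within the jump range of the origin are enumerated with their product weights.
    """
    r = max(abs(z) for z in p)
    sites = list(range(-r, r + 1))
    total = 0.0
    for config in product((0, 1), repeat=len(sites)):
        eta = dict(zip(sites, config))
        weight = 1.0
        for occupied in config:
            weight *= rho if occupied == 1 else 1.0 - rho
        current = sum(z * pz * eta[0] * (1 - eta[z]) for z, pz in p.items())
        total += weight * current
    return total

p = {1: 0.6, -1: 0.1, 2: 0.3}            # hypothetical jump kernel, sums to 1
m = sum(z * pz for z, pz in p.items())   # mean jump
rho = 0.3
print(mean_current(p, rho), m * rho * (1.0 - rho))
```

The two printed numbers agree up to rounding because occupation variables at distinct sites are independent under the product measure, so each term contributes $z\,p(z)\,\rho(1-\rho)$.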
It is not difficult to go from here to equation (1). By Markov process theory we have
\[ f(x(t)) - f(x(0)) = \int_0^t (\mathcal{A}f)(x(s))\,ds + M(t) \]
where the noise term $M(t)$ (a martingale) is easily seen, in our context, to become negligible for large $N$, by an explicit estimation of its mean square. The equation (1) however is not so nice. Even with smooth initial data $\rho_0(\cdot)$ regular solutions exist only up to some time. One is therefore forced into the consideration of weak solutions. While weak solutions exist, uniqueness is not valid in the class of all weak solutions. There is however a natural notion of entropy solutions. Within the class of entropy solutions there is existence and uniqueness for the initial value problem. Moreover smooth solutions are always entropy solutions. One expects equation (3) to be valid, i.e. the entropy solution $\rho(t,y)$ to be the correct choice. This has been proved in a wide variety of situations, but not in all cases. A natural initial distribution $\mu_N(0)$ is the product measure with
\[ \mu_N(0)[\eta(x) = 1] = N \int_{(2x-1)/2N}^{(2x+1)/2N} \rho_0(y)\,dy. \]
This will automatically satisfy (2). With this initial condition it is known by Rezakhanlou 15 that (3) holds for $\mu_N(Nt)$, with $\rho(t,y)$ being the entropy solution with initial condition $\rho_0(\cdot)$. A stronger version stating that the specific relative entropy
\[ H_N(t) = \frac{1}{N}\,h\big(\mu_N(Nt)\,\big|\,\lambda_N(t)\big) \]
satisfies
\[ \lim_{N\to\infty}\ \sup_{0 \le t \le T} H_N(t) = 0 \]
can be proved using the methods of Yau 23, provided the solution $\rho(t,y)$ is smooth on $[0,T] \times T^1$ and satisfies $a \le \rho(t,y) \le 1 - a$ for some $a > 0$. Here $\lambda_N(t)$ is
constructed from $\rho(t,\cdot)$ the same way as $\mu_N(0)$ was constructed from $\rho_0(\cdot)$. That is to say, it is a product measure with
\[ \lambda_N(t)[\eta(x) = 1] = N \int_{(2x-1)/2N}^{(2x+1)/2N} \rho(t,y)\,dy. \]
In the special case of totally asymmetric simple exclusion, that is when $p(1) = 1$ and $p(z) = 0$ for $z \neq 1$, a stronger result is valid. According to Seppäläinen 17, for any initial condition satisfying (2), (3) always holds for the entropy solution $\rho(t,y)$. The problem of large deviations is currently being worked out by Jensen 9 for the special case of the totally asymmetric simple exclusion process. If we start from an arbitrary deterministic initial condition that corresponds to the macroscopic profile in the sense of (2), we can ask how small the probabilities of large deviation from the entropy solution will be. We would like them to be no smaller than $e^{-CN}$ for some finite $C$. It turns out that at this level the only possible deviations are weak solutions $\rho(t,y)$ of equation (1) that do not necessarily satisfy the entropy condition. The precise exponential constant $C$ depends on the deviation $\rho(t,y)$ and measures the amount by which the entropy condition is violated. More precisely, with the choice of the convex entropy $h(\rho) = \rho\log\rho + (1-\rho)\log(1-\rho)$ and the corresponding flux $f(\rho)$ determined by $f'(\rho) = h'(\rho)(1 - 2\rho)$, smooth solutions satisfy
\[ \frac{\partial}{\partial t}h(\rho(t,y)) + \frac{\partial}{\partial y}f(\rho(t,y)) = 0. \qquad (4) \]
+ —f(p(t,y)<0
(5)
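Condition (5) can be made concrete for a single moving discontinuity. The following minimal sketch assumes the totally asymmetric case $p(1) = 1$, so that the particle flux is $\rho(1-\rho)$, and uses the closed form $f(\rho) = \rho(1-\rho)\log\frac{\rho}{1-\rho} - \rho$, which is one antiderivative of $h'(\rho)(1-2\rho)$. For a jump from `rho_left` to `rho_right` travelling at the Rankine-Hugoniot speed $s = 1 - \rho_{left} - \rho_{right}$, the distribution $\psi$ is a point mass on the shock line with weight $[f] - s[h]$, where $[\cdot]$ denotes right value minus left value. The function names are hypothetical.

```python
import math

def h(rho):
    """Convex entropy h(rho) = rho log rho + (1 - rho) log(1 - rho)."""
    return rho * math.log(rho) + (1.0 - rho) * math.log(1.0 - rho)

def entropy_flux(rho):
    """An antiderivative of h'(rho) (1 - 2 rho), assuming the flux rho (1 - rho)."""
    return rho * (1.0 - rho) * math.log(rho / (1.0 - rho)) - rho

def shock_entropy_production(rho_left, rho_right):
    """Weight of psi on a single shock moving with the Rankine-Hugoniot speed."""
    speed = 1.0 - rho_left - rho_right
    jump_h = h(rho_right) - h(rho_left)
    jump_f = entropy_flux(rho_right) - entropy_flux(rho_left)
    return jump_f - speed * jump_h

up_jump = shock_entropy_production(0.2, 0.8)     # lower density behind, higher in front
down_jump = shock_entropy_production(0.8, 0.2)   # the reversed, nonphysical shock
print(up_jump, down_jump)
```

The weight is negative for the jump with the lower density behind and the higher density in front, so (5) holds there, and it is positive for the reversed jump, which therefore violates the entropy condition.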
The rate function for large deviations turns out to be finite only when the distribution $\psi$ in (5) is a signed measure of bounded variation, in which case it is equal to the total mass of the positive part in the Jordan decomposition of $\psi$. The entropy condition, roughly speaking, says that for $t > 0$, if there is a simple discontinuity in the solution $\rho(t,y)$ at some point $y = y_0$, then $\rho(t,y_0 - 0) < \rho(t,y_0 + 0)$, i.e. a shock must necessarily have higher density in front and a lower density behind. An interesting result concerning the corresponding microscopic system is that while there is some noise, presumably of order $\sqrt{N}$, on the location of the shock in the microscopic scale, its extent is microscopically of order $O(1)$. The exact statistical description of the shock profile has been calculated in Derrida-Lebowitz-Speer 4 for the special case of nearest neighbour jumps. An important but easy consequence of the entropy condition is that the total specific entropy
\[ H(t) = \int h(\rho(t,y))\,dy, \]
which is conserved for smooth solutions, is always nonincreasing. Moreover its decrease is accounted for by the shocks, with each shock making its own contribution to the reduction rate of the total entropy. (Note that the physical entropy is $-H(t)$.) In addition the shock profile creates an entropy loss at the microscopic level that matches the macroscopic loss due to the shock. A nonphysical shock, i.e. one with $\rho(t,y_0 - 0) > \rho(t,y_0 + 0)$, pumps in entropy, and in order to produce it, the corresponding amount of entropy has to be put into the microscopic particle system. This explains the rate function for large deviations. We now move on to the case when $m = \sum_z z\,p(z) = 0$. This requires diffusive scaling and the macroscopic time scale is $N^2t$. The limiting equation will be a nonlinear parabolic equation of the form
\[ \frac{\partial \rho}{\partial t} = \frac{1}{2}\,\frac{\partial}{\partial y}\left[D(\rho)\,\frac{\partial \rho}{\partial y}\right] \qquad (6) \]
D(p) = D=±Y,*iP(*) T h e effect of the interaction due to exclusion is not felt a t all. The reason is t h a t because of symmetry the space of linear functions £3 X a(x)r)(x) is left invariant by the generator and therefore the evolution. This makes possible a fair amount of explicit calculation. Equations for densities t h a t normally should not close do close giving us the linear heat equation. By using the symmetry of p ( ) we see t h a t
" 2 -4N ^£•'(>(*) N' K
^P(X'-X)T,(Z)(1-T,(X'))[J&)-J(?-)] v l
N £-^rx x,x'
"
"
"
"
"N'
V
./V'
N2
— £>(*' - *)[*>(*) - n(x'))[J(^) - Ajf)] x.x'
= 2*I>arf*M*>'"(£>
(7)
z,x
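The cancellation used in (7) is an exact identity on a periodic lattice: for symmetric $p(\cdot)$ the exclusion factor $1 - \eta(x')$ drops out when the generator acts on a linear function. The following minimal sketch checks this by brute force for a random configuration. The function names, the ring size and the kernel are hypothetical choices for illustration.

```python
import random

def generator_on_linear(eta, J, p):
    """Sum over x, x' of p(x'-x) eta(x) (1 - eta(x')) [J(x') - J(x)] on a ring."""
    n = len(eta)
    total = 0.0
    for x in range(n):
        for z, pz in p.items():
            y = (x + z) % n
            total += pz * eta[x] * (1 - eta[y]) * (J[y] - J[x])
    return total

def linear_form(eta, J, p):
    """The same sum with the exclusion factor removed: it is linear in eta."""
    n = len(eta)
    return sum(
        eta[x] * sum(pz * (J[(x + z) % n] - J[x]) for z, pz in p.items())
        for x in range(n)
    )

rng = random.Random(0)
n = 12
eta = [rng.randint(0, 1) for _ in range(n)]            # random exclusion configuration
J = [rng.random() for _ in range(n)]                   # arbitrary test function on the ring
p_sym = {1: 0.3, -1: 0.3, 2: 0.2, -2: 0.2}             # hypothetical symmetric kernel
lhs = generator_on_linear(eta, J, p_sym)
rhs = linear_form(eta, J, p_sym)
print(lhs, rhs)
```

The difference between the two sums is $\sum_{x,x'} p(x'-x)\,\eta(x)\eta(x')[J(x') - J(x)]$, which is antisymmetric under exchanging $x$ and $x'$ when $p$ is symmetric and therefore vanishes. This is why the density equation closes in the symmetric case.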
In this model, because of diffusive scaling, although there is deterministic behaviour of the density profile, each individual particle exhibits random diffusive behavior. If we tag a particle and follow its motion in equilibrium (the case of $p(1) = p(-1) = \tfrac{1}{2}$ has to be excluded on $Z$ because the particles are all locked in their relative positions), it was shown in Kipnis-Varadhan 11 that there is diffusive behavior with a diffusion constant $S(\rho)$ that depends on the equilibrium density $\rho$. If we start from an arbitrary initial configuration that produces a density profile $\rho(t,y)$ which is a solution of the heat equation (6) with $D(\rho) = D$, one can expect that the tagged particle now has an inhomogeneous diffusive behavior with a Kolmogorov backward generator of the form
\[ \mathcal{L}_t = \frac{1}{2}\,a(t,y)\,\frac{\partial^2}{\partial y^2} + b(t,y)\,\frac{\partial}{\partial y}. \qquad (8) \]
One can expect that $a(t,y) = S(\rho(t,y))$. Self-consistency requires that $\rho(\cdot,\cdot)$ should also be a solution of the Fokker-Planck equation corresponding to (8), and this can be used to determine $b(\cdot,\cdot)$. While there is no general proof of this for every particle, it is a result of Rezakhanlou 16 that this is indeed true for most of the particles.
If we take the periodic case with a total number of $\rho N$ particles and look at the empirical process
\[ R_N = \frac{1}{N} \sum_{j} \delta_{y_j(N^2\,\cdot)} \]
as random measures on the space of trajectories $D([0,T], T^1)$, the law of large numbers asserts the convergence in probability of $R_N$ to the measure $Q$ corresponding to the process with generator $\mathcal{L}_t$ and initial distribution $\rho_0(\cdot)$. Large deviations in this context have been worked out in Quastel-Rezakhanlou-Varadhan 13 (some of the results require technical assumptions on the regularity of the function $S(\rho)$ that have so far been established only when $d \ge 3$). Generalizations to other lattice gas models that are reversible relative to Gibbs measures with interactions that satisfy mixing conditions have been considered in Varadhan-Yau 21. While they still lead, under diffusive scaling, to densities evolving according to equations of the form (6), the difficulties encountered in establishing such results are considerable. In the calculations carried out in (7) we were fortunate on two accounts. First, we were able to do summation by parts twice and get rid of two factors of $N$. Secondly, the object we ended up with was again linear, so the equations closed. There are models (called gradient types) that allow the first step. The paradigm about local equilibria allows us to replace the nonlinear terms by their expectations, leading to the more general equation of the type given by (6) with a nonlinear $D(\cdot)$. This is not so bad. Entropy methods of Guo-Papanicolaou-Varadhan 7, as well as relative entropy methods of Yau 23, can be applied. However for most models the first reduction is not possible. The conservation property allows us to perform summation by parts only once, leading to
\[ N \sum_{x,x'} p(x' - x)\,W_{x,x'}(\eta)\left[J\!\left(\frac{x'}{N}\right) - J\!\left(\frac{x}{N}\right)\right]. \]
$W_x(\cdot)$ is the current density and has mean 0 under every equilibrium. Earlier $W$ was itself a gradient, which made another summation by parts possible. Now one has to replace $W_x(\eta)$ by $D\,(\eta(x) - \eta(x+1))$ not just at the level of expectations but rather at the level of fluctuations. The constant $D(\rho)$ will in general depend on $\rho$. The calculation of $D(\rho)$, which is essentially the Green-Kubo diffusion coefficient, involves the projection (in a rather fancy Hilbert space) of the current $W$ onto the one-dimensional space of density fluctuations, very much in the spirit of the Boltzmann-Gibbs principle, and is carried out in every global equilibrium $\mu_\rho$. One uses precise estimates (Yau 24) on the spectral gap in finite volumes, which inevitably get worse as the volume increases due to the presence of multiple equilibria. A fair amount of the machinery of large deviations has to be used in order to make the results proved in equilibrium applicable when the process is far away from any global equilibrium. See Varadhan 19, Quastel 12 or Kipnis-Landim-Olla 10 for examples
of this approach. One has to develop so much machinery just to establish the basic evolution equation of the densities that proving results on large deviations in these situations should not be much harder. Fluctuations in equilibrium are also accessible at this point. An Ornstein-Uhlenbeck limit under appropriate scaling can be established. In the nongradient case, since this involves the fluctuation-dissipation relation, the Green-Kubo diffusion coefficient $D(\rho)$ will appear. For instance, see Quastel ? or Spohn 18 for examples. Fluctuation results in nonequilibrium appear to be very inaccessible at this time. There is only one example, in Chang-Yau 3, of a one-dimensional model. There are conceptual difficulties because the normalization $N^{d/2}$ used to scale the fluctuations in $d$ dimensions gets stronger with increasing dimension. One has more or less the same situation when we consider the simple exclusion model that we started out with, but with $p(\cdot)$ being asymmetric and with mean 0. This is a nongradient model and can be analysed in a similar manner, as was done by Xu 22. There is an additional complication due to the nonreversible nature of the dynamics, even in equilibrium, making the projection necessary to compute the Green-Kubo coefficient non-orthogonal. If we go from lattice models to continuum models the situation is far less satisfactory. For the interacting Brownian motions model, the basic density behavior is known in $d = 1$, where there are no phase transitions (see Varadhan 20), and the relative entropy methods of Yau 23 are applicable in higher dimensions away from phase transitions. For a very special interaction, the tagged particle motion in nonequilibrium has been studied in Grigorescu 8. This interaction interpolates between the case of hard rods of zero size (reflecting) and the free noninteracting case. A fair amount of explicit calculation is done, including a closed-form formula for $S(\rho)$.
The equilibrium fluctuation theory exists below phase transitions (Spohn 18). Finally let us return to the totally asymmetric simple exclusion model with $p(1) = 1$ that we considered earlier. If we start with an initial density $\rho_0(y) = \tfrac{1}{2} + \frac{u_0(y)}{2N}$ that is nearly constant and speed up time by an additional factor of $N$, then formally the solution takes the form
\[ \rho(t,y) = \frac{1}{2} + \frac{u(t,y)}{2N} \]
with $u(t,y)$ satisfying the equation
\[ \frac{\partial u}{\partial t} = u\,\frac{\partial u}{\partial y}. \]
Clearly at this stage the dissipation term should reemerge and the correct equation could be
\[ \frac{\partial u}{\partial t} = u\,\frac{\partial u}{\partial y} + \frac{D}{2}\,\frac{\partial^2 u}{\partial y^2}. \]
While this is suspect in $d = 1$, the analogous results for $d \ge 3$ have been established in Esposito-Marra-Yau 5. Models having different types of particles with preferences to go in different directions and interactions that change types by collisions while conserving momenta have been studied in Esposito-Marra-Yau 6 and Quastel-Yau 14, leading to incompressible Navier-Stokes equations.
References
1. T.M. Liggett, Interacting Particle Systems, Springer, 1985.
2. C.C. Chang, Equilibrium fluctuations of gradient reversible particle systems, Probab. Theory Related Fields 100 (1994), 269-283.
3. C.C. Chang and H.T. Yau, Fluctuations of one-dimensional Ginzburg-Landau models in nonequilibrium, Comm. Math. Phys. 145 (1992), 209-234.
4. B. Derrida, J.L. Lebowitz and E.R. Speer, Shock profiles for the asymmetric simple exclusion process in one dimension, J. Statist. Phys. 89 (1997), 135-167.
5. R. Esposito, R. Marra and H.T. Yau, Diffusive limit of asymmetric simple exclusion, Rev. Math. Phys. 6 (1994), 1233-1267.
6. R. Esposito, R. Marra and H.T. Yau, Navier-Stokes equations for stochastic particle systems on the lattice, Comm. Math. Phys. 182 (1996), 395-456.
7. M.Z. Guo, G.C. Papanicolaou and S.R.S. Varadhan, Nonlinear diffusion limit for a system with nearest neighbor interactions, Comm. Math. Phys. 118 (1988), 31-59.
8. I. Grigorescu, Self-Diffusion for Brownian Motions with Local Interaction (to appear in Ann. Prob.).
9. L. Jensen, Large Deviations for the asymmetric simple exclusion process (Ph.D. thesis under preparation).
10. C. Kipnis, C. Landim and S. Olla, Hydrodynamic limit for a non-gradient system: The generalized symmetric exclusion process, Comm. Pure Appl. Math. 47 (1994), 1475-1545.
11. C. Kipnis and S.R.S. Varadhan, Central limit theorem for additive functionals of reversible Markov processes and applications to simple exclusions, Comm. Math. Phys. 104 (1986), 1-19.
12. J. Quastel, Diffusion of color in simple exclusion process, Comm. Pure Appl. Math. 45 (1992), 623-679.
13. J. Quastel, F. Rezakhanlou and S.R.S. Varadhan, Large Deviations for the symmetric simple exclusion process in dimensions d ≥ 3, Prob. Th. Rel. Fields 113 (1999), 1-84.
14. J. Quastel and H.T. Yau, Lattice gases, large deviations, and the incompressible Navier-Stokes equations, Ann. of Math. 148 (1998), 51-108.
15. F. Rezakhanlou, Hydrodynamic limit for attractive particle systems on Z^d, Comm. Math. Phys. 140 (1991), 417-448.
16. F. Rezakhanlou, Propagation of chaos for symmetric simple exclusion, Comm. Pure Appl. Math. 47 (1994), 943-957.
17. T. Seppäläinen, Existence of hydrodynamics for the totally asymmetric simple K-exclusion process, Ann. Prob. 27 (1999), 361-415.
18. H. Spohn, Equilibrium fluctuations for interacting Brownian particles, Comm. Math. Phys. 103 (1986), 1-33.
19. S.R.S. Varadhan, Nonlinear diffusion limit for a system with nearest neighbor interactions, II, Asymptotic Problems in Probability Theory: Stochastic Models and Diffusions on Fractals, K.D. Elworthy and N. Ikeda (Editors), Pitman Research Notes in Mathematics Series 283 (1991), 75-130.
20. S.R.S. Varadhan, Scaling limits for interacting diffusions, Comm. Math. Phys. 135 (1991), 313-353.
21. S.R.S. Varadhan and H.T. Yau, Diffusive limit of lattice gas with mixing conditions, Asian J. Math. 1 (1997),
22. L. Xu, Hydrodynamics for asymmetric mean zero simple exclusion, Ph.D. Thesis, New York University (1993).
23. H.T. Yau, Relative entropy and the hydrodynamics of Ginzburg-Landau models, Lett. Math. Phys. 22 (1991), 63-80.
24. H.T. Yau, Logarithmic Sobolev inequality for lattice gases with mixing conditions, Comm. Math. Phys. 181 (1996), 367-408.
SUPERSYMMETRY: A PERSONAL VIEW
B. ZUMINO
Physics Department, University of California, Berkeley CA 94720, U.S.A.
I shall discuss some basic facts about supersymmetry, in a way that I hope will be understandable to non-experts. A (very little) knowledge of relativistic quantum mechanics will be assumed. I shall restrict myself almost entirely to rigid (global) supersymmetry in four space-time dimensions, although the super-Higgs effect will be discussed briefly. Supergravity and supersymmetry in diverse dimensions form a sizable body of work that has become an integral part of superstring theory, and that can be described clearly only in lengthy monographs written for readers with considerable technical background knowledge. Even within the restricted scope I have imposed on myself, I was constrained to select a few topics that I consider important and which do not require a highly technical description. No attempt at completeness is made in the text or in the references. Topics which could not be covered include holomorphic techniques, duality and "BPS saturation" and various aspects of dynamical supersymmetry breaking.
1 A brief history of the beginning of supersymmetry
It appears that four-dimensional supersymmetry has been discovered independently three times: first in Moscow, by Golfand and Likhtman, then in Kharkov, by Volkov and Akulov, and Volkov and Soroka, and finally by Julius Wess and me, who collaborated at CERN in Geneva and in Karlsruhe. Julius and I were totally unaware of the earlier work. I find it more remarkable that Volkov and his collaborators didn't know about the work of Golfand and Likhtman, since all of them were writing papers in Russian in Soviet journals. For information on the life and work of Golfand and Likhtman, I refer to the Yuri Golfand Memorial Volume [10], to appear shortly. For information on Volkov's life and work, I refer to the Proceedings of the 1997 Volkov Memorial Seminar in Kharkov [11].

Supersymmetry is a symmetry which relates the properties of integral-spin bosons to those of half-integral-spin fermions. The generators of the symmetry form what has come to be called a superalgebra, which is a super extension of the Poincaré Lie algebra of quantum field theory (Lorentz transformations and space-time translations) by fermionic spinorial generators $Q_\alpha$. In a superalgebra both commutators and anticommutators occur. In supersymmetry the anticommutator of two $Q_\alpha$ equals the total momentum which generates space-time translations:
$$\{Q_\alpha, \bar Q_\beta\} = 2(\gamma_\mu)_{\alpha\beta}P^\mu,$$
while the commutator of a $Q_\alpha$ with the momentum vanishes: $[Q_\alpha, P_\mu] = 0$. This is the simplest (N = 1) supersymmetry which uses a single Majorana spinor $Q_\alpha$, and is the one which was discussed in all the works mentioned above. The commutator of a Lorentz generator with $Q_\alpha$ is fixed by the Lorentz transformation law of the spinor. In the above I have used the Pauli notation for Dirac and Majorana spinors, and the Majorana representation with real gammas.
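As a concrete illustration of what a Majorana representation with real gammas looks like, the following sketch builds one possible set of real 4x4 gamma matrices and checks the Clifford algebra numerically. The particular Kronecker-product basis, the names `GAMMA`, `ETA` and `clifford_defect`, and the metric convention $\eta_{00} = -1$ are assumptions made for this example; many equivalent real bases exist, and this one is not claimed to be the basis used in the original papers.

```python
import numpy as np

# Real 2x2 building blocks: two real symmetric Pauli matrices and the
# real antisymmetric matrix i*sigma_2 (called EPS here).
S1 = np.array([[0, 1], [1, 0]])
S3 = np.array([[1, 0], [0, -1]])
EPS = np.array([[0, 1], [-1, 0]])
ID2 = np.eye(2, dtype=int)

# Assumed metric convention: eta_00 = -1, spatial components +1.
ETA = np.diag([-1, 1, 1, 1])

# One possible real basis (an illustrative choice, not a unique one).
GAMMA = [
    np.kron(EPS, S1),   # gamma^0, squares to -1
    np.kron(S1, ID2),   # gamma^1, squares to +1
    np.kron(S3, ID2),   # gamma^2, squares to +1
    np.kron(EPS, EPS),  # gamma^3, squares to +1
]

def clifford_defect(gammas, eta):
    """Largest entry of {gamma^mu, gamma^nu} - 2 eta^{mu nu} over all index pairs."""
    worst = 0
    for mu, g_mu in enumerate(gammas):
        for nu, g_nu in enumerate(gammas):
            anticommutator = g_mu @ g_nu + g_nu @ g_mu
            target = 2 * eta[mu, nu] * np.eye(4)
            worst = max(worst, np.abs(anticommutator - target).max())
    return worst

# Product of all four gammas; real in this basis.
GAMMA5 = GAMMA[0] @ GAMMA[1] @ GAMMA[2] @ GAMMA[3]

print("Clifford defect:", clifford_defect(GAMMA, ETA))
print("gamma5 squared equals -1:", np.array_equal(GAMMA5 @ GAMMA5, -np.eye(4)))
```

In this basis the product $\gamma^0\gamma^1\gamma^2\gamma^3$ is real and squares to $-1$, so it cannot be diagonal over the real numbers; making it diagonal requires a complex change of basis.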
I think it is fair to say that Julius Wess and I, in addition to discovering supersymmetry independently, clarified its role as a space-time symmetry of a local relativistic quantum field theory satisfying all the standard axioms. We also gave examples of interacting theories involving scalars and spinors as well as theories with gauge fields. At the very beginning we noticed the characteristic cancellation of divergences which is now referred to as nonrenormalization theorems (according to Shifman, Evgeny Likhtman also observed some of these cancellations, see the volume I referred to above). In our very first paper we even introduced the idea of supergravity, a generalization of Einstein's gravity in which the graviton is part of a supermultiplet with other fields, although we gave no explicit Lagrangian at that time.

There is still no direct evidence that supersymmetry is a symmetry of the physical world, that elementary particles arrange themselves in supermultiplets of spins differing by half a unit. It must be broken, since if unbroken it would predict that the particles in a supermultiplet have the same mass. Finding the correct breaking mechanism is probably still the basic unsolved problem.

The strongest hint that supersymmetric field theories may apply to the physical world comes from the nonrenormalization theorems, which provide a possible solution to the so-called hierarchy problem. In elementary particle physics there appear several very different energy scales, such as the electroweak scale, the grand-unification scale and the Planck scale. Without supersymmetry it is very difficult to understand the stability of these different scales, which would be strongly affected by radiative corrections, like the quadratic self-mass of scalar fields: in supersymmetric field theories the self-mass of scalar fields is only logarithmically divergent, just as that of spinors.
The work of Golfand and Likhtman and that of Volkov and collaborators went to a large extent unnoticed. By contrast, the first three preprints Julius and I wrote immediately aroused the interest of numerous theoretical physicists, even before publication, and the subject took on a life of its own, to which we both continued to contribute, together and separately, with other collaborators, to the best of our ability.

Our early papers also gave rise to renewed interest by mathematicians in the theory of superalgebras. Eventually a complete classification of simple and semisimple superalgebras was obtained, analogous to Cartan's classification of Lie algebras, and even the prefix "super" was adopted by mathematicians. Unfortunately the Poincaré superalgebra is not semisimple, just as the Poincaré Lie algebra is not, although it can be obtained by a suitable contraction. The general classification of superalgebras does not seem to be very useful in physics, because superalgebras cannot be used, like Lie algebras, as purely internal symmetries. But one can never tell.

In four space-time dimensions a very convenient notation is the two-component notation of van der Waerden, with dotted and undotted indices taking the values 1 and 2. Thus a Dirac spinor can be written as
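$$\psi = \begin{pmatrix} \chi_\alpha \\ \bar\psi^{\dot\alpha} \end{pmatrix}$$
and a Majorana spinor,
$$\psi = \begin{pmatrix} \chi_\alpha \\ \bar\chi^{\dot\alpha} \end{pmatrix},$$
where the bar here indicates complex conjugation. From the four-component point of view this is the same as working in the Weyl representation where $\gamma_5$ is diagonal. The details can be found in the references given at the end of the paper. With this notation, the basic algebra given above takes the form
$$\{Q_\alpha, \bar Q_{\dot\beta}\} = 2\sigma^\mu_{\alpha\dot\beta}P_\mu, \qquad \{Q_\alpha, Q_\beta\} = \{\bar Q_{\dot\alpha}, \bar Q_{\dot\beta}\} = 0.$$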
In space-times of dimension higher (or lower) than four the most convenient notation depends on the number of dimensions. We shall stay in four and follow van der Waerden, but sometimes go back to using Pauli's notation.

2 Extended supersymmetry
An important development in supersymmetry was the study of extended supersymmetry and the realization that the algebra can contain central charges, i.e. scalar generators which commute with all generators of the super-Poincaré algebra. The generators of N-extended supersymmetry satisfy the relations
$$\{Q^A_\alpha, \bar Q_{\dot\beta B}\} = 2\sigma^\mu_{\alpha\dot\beta}P_\mu\,\delta^A_B, \qquad \{Q^A_\alpha, Q^B_\beta\} = \epsilon_{\alpha\beta}Z^{AB},$$
where $Z^{AB} = -Z^{BA}$ and the indices $A, B$ take the values $1, 2, \ldots, N$. They correspond to the fundamental representation of a semisimple Lie algebra, usually SU(N), which is an automorphism of the supersymmetry. The algebraic possibility of central charges in extended supersymmetry was discovered by Haag, Lopuszanski and Sohnius [12], who showed that it was consistent with all axioms of local relativistic quantum field theory.

It was important to find examples of quantum field theories with central charges. Witten and Olive [13] showed that in supersymmetric nonabelian gauge theories central charges can appear as topological charges. They considered the N = 2, O(3) Yang-Mills theory and showed that when the gauge theory has soliton solutions which are magnetically charged ('t Hooft-Polyakov monopoles) or even magnetically and electrically charged (dyons), these charges appear in the supersymmetry algebra as central charges. Assuming that the algebra is exact at the quantum level, they were able to obtain exact results for particle masses.

Central charges can sometimes be understood by considering first supersymmetries in more than four space-time dimensions, where the supersymmetry charges and the momentum have more components. When one compactifies the extra dimensions, one obtains extended supersymmetry in four dimensions, and the extra components of the momentum appear as scalar central charges in four dimensions.

Townsend and collaborators [14] discovered that central charges in four dimensions can originate from operators other than the momentum in higher dimensions.
These are tensorial charges which take nonvanishing values on geometric configurations such as strings and membranes. One can then wish to discuss the possibility of nonscalar central charges in four dimensions. This leads to a generalization of the algebra discussed above, i.e.
$$\{Q^A_\alpha, \bar Q_{\dot\beta B}\} = 2\sigma^\mu_{\alpha\dot\beta}\left(\delta^A_B P_\mu + Z^A_{\mu B}\right), \qquad \{Q^A_\alpha, Q^B_\beta\} = \epsilon_{\alpha\beta}Z^{AB} + \sigma^{\mu\nu}_{\alpha\beta}Z^{AB}_{\mu\nu},$$
where $Z^A_{\mu A} = 0$ and $Z^{AB}_{\mu\nu} = Z^{BA}_{\mu\nu} = -Z^{AB}_{\nu\mu}$. These new charges are central in the sense that they commute with the generators $Q$ and $P$ of the super-translation algebra (obviously not with the Lorentz generators). They correspond respectively to string charges in the adjoint of SU(N) and to membrane charges in the two-index symmetric representation of SU(N). Notice that while the scalar charges cannot appear in the N = 1 supersymmetry algebra, these vector and tensor charges can. Concrete examples have been given, especially by Shifman and collaborators [15]. A systematic algebraic treatment was given by Ferrara and Porrati [16].

It should be mentioned that, for an infinite string or an infinite membrane or domain wall, the value of the total vectorial or tensorial charge is infinite and one should talk about densities. The algebra can be formulated as follows. Consider for instance N = 1, so that
$$\{Q_\alpha, Q_\beta\} = \sigma^{\mu\nu}_{\alpha\beta}\int d^3x\, J_{0\mu\nu}(x),$$
where $J_{\rho\mu\nu}(x) = \epsilon_{\rho\sigma\mu\nu}\partial^\sigma A(x)$, so that $\partial^\rho J_{\rho\mu\nu} = 0$, and $A$ is a scalar field which depends on the model. Then
$$\{Q_\alpha, J_{\rho\beta}(x)\} = \sigma^{\mu\nu}_{\alpha\beta}\,\epsilon_{\rho\sigma\mu\nu}\,\partial^\sigma A(x) + \cdots.$$
Ordinarily, upon integration over $d^3x$, the r.h.s. will vanish. However, assume that the solution $A(x)$ has a kink. For instance, let the kink be in the $x^3$ direction, and correspond to a flat domain wall in the $x^1, x^2$ plane. Then the above term in the r.h.s. becomes proportional to $\sigma^{12}_{\alpha\beta}\,\partial_3 A$, and its integral over $x^3$ is proportional to the difference between the values of $A$ for $x^3 > 0$ outside the wall and $x^3 < 0$ outside the wall. Integration over $x^1$ and $x^2$ gives an infinite result and should not be performed over the infinite $x^1, x^2$ plane, but the density in the plane is finite.

3 Breaking of supersymmetry
In this section I return to the Pauli notation for spinors. As already mentioned, supersymmetry cannot be exact in nature. Perhaps the most important problem in the field is the question of the mechanism and the scale of supersymmetry breaking. In particular, can supersymmetry be broken spontaneously without breaking the translational invariance of the Poincaré group?

There is a very simple (fortunately incorrect) argument which shows that this cannot happen. A very simple consequence of the basic algebra is that the total energy is given by
$$H = \frac{1}{4}\sum_{\alpha=1}^{4} Q_\alpha^\dagger Q_\alpha,$$
and similar equations for the total three-momentum. Now, translational invariance requires that the vacuum state satisfy $P^\mu|0\rangle = 0$; as a consequence, $\langle 0|Q_\alpha^\dagger Q_\alpha|0\rangle = 0$ and therefore $Q_\alpha|0\rangle = 0$, which means that supersymmetry is exact. Vice versa, if supersymmetry is violated, so is translational invariance, it seems. In the early days this argument was made more compelling by the fact that explicit calculations based on the simplest scalar-spinor models showed that there can be more than one vacuum solution (in the tree approximation) but those which break supersymmetry spontaneously are unstable.

This puzzling situation was cleared up when examples of field theories which exhibit spontaneous breaking even in the tree approximation were found. O'Raifeartaigh [18] showed that scalar-spinor models (with a minimum of three chiral multiplets) can break supersymmetry for suitable choices of the parameters in the Lagrangian. Fayet and Iliopoulos [17] showed that supersymmetry breaking can occur also in a supersymmetric U(1) gauge theory.

Examination of these models made it clear that the above formal argument is incorrect for a very simple reason: when supersymmetry is spontaneously broken the operators $Q_\alpha$ and $H$ do not exist in infinite space (they are infinite). Their algebra must be replaced by an algebra involving the supercurrent density $J^\mu_\alpha(x)$, which is related to the charge by
$$Q_\alpha = \int d^3x\, J^0_\alpha(x),$$
and the energy-momentum density $\theta^{\lambda\mu}(x)$, which is related to the energy by
$$H = \int d^3x\, \theta^{00}(x).$$
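The formal positivity argument given earlier in this section is pure linear algebra whenever the charges exist as operators, which is exactly the assumption that fails in infinite space. A minimal finite-dimensional sketch of that step follows. The matrices are random stand-ins and are not a representation of the supersymmetry algebra; the function names, the dimension `n = 6` and the choice of "vacuum" vector are all hypothetical and serve only to illustrate that a sum of the form $\frac{1}{4}\sum_\alpha Q_\alpha^\dagger Q_\alpha$ has a zero-energy state exactly when every $Q_\alpha$ annihilates that state.

```python
import numpy as np

def hamiltonian(charges):
    """H = (1/4) * sum_a Q_a^dagger Q_a for a list of square matrices Q_a."""
    return sum(q.conj().T @ q for q in charges) / 4.0

def random_charges(n, count=4, seed=0):
    """Random complex n x n matrices standing in for the charges (toy data)."""
    rng = np.random.default_rng(seed)
    return [rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
            for _ in range(count)]

def with_invariant_vacuum(charges, vacuum):
    """Project each charge so that it annihilates the given unit vector."""
    projector = np.eye(len(vacuum)) - np.outer(vacuum, vacuum.conj())
    return [q @ projector for q in charges]

n = 6
vacuum = np.zeros(n, dtype=complex)
vacuum[0] = 1.0

generic = random_charges(n)                         # no state is annihilated
symmetric = with_invariant_vacuum(generic, vacuum)  # vacuum is annihilated

# Both spectra are non-negative; only the second one reaches zero.
e_generic = np.linalg.eigvalsh(hamiltonian(generic))
e_symmetric = np.linalg.eigvalsh(hamiltonian(symmetric))

print("lowest energy, generic charges:     ", e_generic.min())
print("lowest energy, invariant vacuum:    ", e_symmetric.min())
```

In the field theory the analogous statement has to be made for densities, since the integrated charges and the total energy need not exist when the symmetry is spontaneously broken.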
The transformation property of the supercurrent under supersymmetry is known:
$$\delta J^\mu = [J^\mu, \bar Q\epsilon] = 2\gamma_\lambda\theta^{\lambda\mu}\epsilon + \text{derivative terms},$$
where $\epsilon$ is an infinitesimal anticommuting spinorial parameter. Taking the vacuum expectation value the derivative terms drop out by translational invariance and we obtain
$$\langle 0|\{\bar Q_\beta, J^\mu_\alpha(x)\}|0\rangle = 2(\gamma_\lambda)_{\alpha\beta}\langle 0|\theta^{\lambda\mu}|0\rangle.$$
When supersymmetry is spontaneously broken,
$$\langle 0|\theta_{\lambda\mu}|0\rangle = -E\,\eta_{\lambda\mu},$$
with $E$ a positive constant (our $\eta_{00} = -1$). Then the total energy of the vacuum is infinite although the energy density is not. Notice that the above (anti-)commutators have meaning because $\{\bar J^\nu_\beta(y), J^\mu_\alpha(x)\} = 0$ for space-like separation $y - x$, so that the integral in $\bar Q_\beta$, which can be taken at time $x^0$, does not extend to infinity.

When supersymmetry is spontaneously broken, the theory has a massless fermion which is the analogue of the Goldstone boson, the Goldstone fermion or goldstino. The transformation law of a spinor has the general form
$$\{\bar Q_\beta, \psi_\alpha\} = F\delta_{\beta\alpha} + \cdots,$$
where $F$ is a scalar field and the dots denote derivatives and other terms with vanishing vacuum expectation value. If $\langle 0|F|0\rangle \neq 0$, $\psi_\alpha$ is a goldstino. Then
$$\langle 0|\{\bar J^\mu_\beta(y), \psi_\alpha(x)\}|0\rangle = \langle 0|F|0\rangle\,(\gamma^\mu\gamma^\lambda)_{\beta\alpha}\,\frac{\partial}{\partial y^\lambda}\Delta(y-x),$$
where $\Delta(y-x)$ is the standard zero mass scalar commutator function, which satisfies
$$\Box\Delta = 0, \qquad \Delta(z)\big|_{z^0=0} = 0, \qquad \frac{\partial}{\partial z^0}\Delta(z)\Big|_{z^0=0} = -\delta^3(z).$$
The above formula is dictated by the requirements of causality and current conservation,
$$\frac{\partial}{\partial y^\mu}\bar J^\mu_\beta(y) = 0,$$
and gives the correct result for the anticommutator with $\bar Q$. Thus the goldstino has zero mass. Notice that the coupling of a massless fermion to the supercurrent could also have the form
$$\langle 0|\{\bar J^\mu_\beta(y), \varphi_\alpha(x)\}|0\rangle = c\,\partial^\mu(\gamma_\lambda)_{\beta\alpha}\,\partial^\lambda\Delta(y-x),$$
where $c$ is a constant. Upon integration this formula gives $\langle 0|\{\bar Q_\beta, \varphi_\alpha\}|0\rangle = 0$, so $\varphi_\alpha$ is not a goldstino and its presence does not signal the breaking of supersymmetry.

It was thought at first that the goldstino could give us a good description of a massless neutrino, but it was soon realized, by de Wit and Freedman [19], that the amplitudes involving goldstinos satisfy low energy theorems, analogous to those for low energy Goldstone bosons, that are not satisfied by massless neutrinos. So there are no candidates for the goldstino in nature. Does this mean that supersymmetry is not broken spontaneously? Fortunately not. If rigid (or, as some say, global) supersymmetry is gauged, i.e. made x-dependent, as it is in supergravity, the corresponding massless gauge field is a spin-$\frac{3}{2}$ field $\psi_{\mu\alpha}(x)$. It is called the gravitino because it is the superpartner of the graviton. When supersymmetry is spontaneously broken a phenomenon occurs which is the analogue of the Higgs effect in ordinary gauge theories, the super-Higgs effect. Just as in the Higgs effect the Goldstone boson is absorbed into the gauge vector and produces a massive vector field, in the super-Higgs effect the goldstino combines with the gravitino to form a massive spin-$\frac{3}{2}$ field, the massive gravitino. This is a bonus to both rigid supersymmetry and supergravity because neither massless goldstinos nor massless gravitinos exist in nature. The qualitative features of the super-Higgs effect were discussed by Volkov and Soroka [20] and by Deser and myself [21].

This would seem to be a satisfactory approach but unfortunately things are not so easy. The simplest supersymmetric extension of the standard model of particle physics, the so-called minimal supersymmetric standard model, constrains the masses of superpartners in an unacceptable way. The scenarios mostly pursued at this time involve postulating the existence of "hidden sectors" where supersymmetry is spontaneously broken and of "messengers" which communicate the supersymmetry breaking to our world, the "observed sector". So one can have "gauge mediated" or "supergravity mediated" models. These may seem flights of fancy, but the existence of different sectors and of mediation among them is actually a feature of superstring theory.

There are other ways of breaking supersymmetry of course. One can try the approach of dynamical breaking, by considering radiative corrections or the effect of instantons in nonabelian gauge theories. Alternatively one can modify the Lagrangian by adding explicit symmetry breaking terms, which should however be "soft" in order not to spoil the cancellation of ultraviolet divergences characteristic of supersymmetric theories.
This latter approach has, of course, the disadvantage of introducing additional parameters, with the corresponding loss of predictive power.

4 Quantum groups and quantum spaces
The use of anticommuting variables in supersymmetry has inspired some to consider other algebraic structures. A very interesting one is the algebra of quantum groups and quantum spaces, which I shall try to describe now by using some very simple examples. Quantum groups emerge as "hidden" symmetries in the study of integrable systems and of some two-dimensional quantum field theories, such as the so-called Wess-Zumino-Witten models. Here I shall take a different approach and consider the possibility of deformations of Minkowski space and of the Poincaré group and even the possibility that a deformed quantum field theory may be developed which is preferable to our standard quantum field theory.

A simple example of quantum space is the quantum plane. Its coordinates $x$ and $y$ do not commute, instead they satisfy the commutation relation
$$xy - qyx = 0,$$
where $q$ is a generic complex number, the deformation parameter; the special value $q = 1$ gives commuting coordinates. Notice the difference between the above commutation relation and the Heisenberg commutation relation between $p$ and $x$. If one considers $x$ and $y$ as the two components of a vector and acts on them with a linear transformation
$$x' = ax + by, \qquad y' = cx + dy,$$
where the matrix elements $a, b, c, d$ commute with $x$ and $y$ and satisfy the commutation relations
$$ab = qba, \quad ac = qca, \quad bd = qdb, \quad cd = qdc, \quad ad - da = q\,bc - \frac{1}{q}\,cb, \quad bc = cb,$$
one can verify easily that $x'$ and $y'$ satisfy the same commutation relations as $x$ and $y$,
$$x'y' - qy'x' = 0.$$
In algebraic language, the algebra of the quantum plane is preserved under the coaction of the quantum group $GL_q(2)$ represented by quantum matrices
$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
with elements satisfying the above commutation relations and nonvanishing quantum determinant $ad - qbc$.

In an ordinary plane the differentials satisfy a Grassmann algebra, i.e. they anticommute and their square is zero. For the quantum plane
$$(dx)(dy) + \frac{1}{q}(dy)(dx) = 0, \qquad (dx)^2 = (dy)^2 = 0,$$
a deformation of the Grassmann algebra in two variables. These commutation relations are also preserved by the coaction of $GL_q(2)$,
$$(dx') = a(dx) + b(dy), \qquad (dy') = c(dx) + d(dy),$$
$a, b, c, d$ commuting with $(dx)$ and $(dy)$. The commutation relations of $GL_q(2)$ can actually be derived from the requirement that the stated commutation relations of both the coordinates and the differentials are covariant under the coaction of the quantum group.

Finally, one can write commutation relations between coordinates and differentials which are also covariant under the coaction of the quantum group [22].

If one takes a second copy of the quantum matrix,
$$M' = \begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix},$$
i.e. its matrix elements satisfy among themselves the same commutation relations as $a, b, c, d$,
$$a'b' = qb'a', \qquad a'c' = qc'a', \qquad \text{etc.}$$
and takes them to commute with $a, b, c, d$, the elements of the matrix $M'' = M'M$ obtained by matrix multiplication,
$$a'' = a'a + b'c, \qquad b'' = a'b + b'd, \qquad \text{etc.}$$
again satisfy the same commutation relations as $a, b, c, d$ and its quantum determinant is the product of those of $M'$ and $M$. This is usually referred to as the quantum group property of $GL_q(2)$ quantum matrices.

Having coordinates and differentials, one can consider functions and differential forms on the quantum plane and define derivatives and a differentiation operator $d$ satisfying $d^2 = 0$ as well as (partial) differential equations and even invariant integration.

It is clear that $GL_q(2)$ is not a group in the usual sense, although it is a deformation of the group GL(2). The ideas sketched above find a precise mathematical description in the language of Hopf algebras. For this I refer to the literature at the end of this article [23].

We have sketched here the deformation of the calculus on a bosonic plane. Analogous developments lead to the concept of fermionic quantum plane [24]. The algebraic relations sketched here for the two-dimensional quantum plane can be generalized to a higher dimensional quantum hyperplane covariant under $GL_q(n)$. Quantum deformations of Minkowski and Euclidean space and of the corresponding Lorentz and orthogonal covariance groups have also been discussed. It is possible to formulate equations describing the deformation of classical and even quantum field theory, where the fields are functions of noncommuting space-time coordinates.

In my own work with Julius Wess and other collaborators [25], the original motivation for studying these quantum spaces and quantum groups was the idea that the smearing of the space-time points due to the noncommutation of the coordinates could give rise to a realistic regularization of the divergences of quantum field theory. In my own work I found this not to be the case: the singularity of the Green's function of the deformed Klein-Gordon equation, for instance, is the same as for the undeformed case $q = 1$.
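The covariance of the quantum plane stated earlier in this section can be checked mechanically. The sketch below is one way to do it, under assumptions of my own choosing: words in the generators are rewritten into the ordering a < b < c < d < x < y using the stated commutation relations, with exact rational values of $q$ rather than a symbolic parameter. The function names (`swap_rule`, `normal_order`, `quantum_plane_defect`) and the ordering convention are hypothetical and not taken from the literature.

```python
from fractions import Fraction

ORDER = "abcdxy"  # assumed normal ordering: a < b < c < d < x < y

def swap_rule(left, right, q):
    """Rewrite an out-of-order adjacent pair as a list of (coefficient, word)."""
    if left in "xy" and right in "abcd":
        # Matrix elements commute with the coordinates.
        return [(Fraction(1), right + left)]
    rules = {
        "yx": [(1 / q, "xy")],                               # xy = q yx
        "ba": [(1 / q, "ab")],                               # ab = q ba
        "ca": [(1 / q, "ac")],                               # ac = q ca
        "db": [(1 / q, "bd")],                               # bd = q db
        "dc": [(1 / q, "cd")],                               # cd = q dc
        "cb": [(Fraction(1), "bc")],                         # bc = cb
        "da": [(Fraction(1), "ad"), (-(q - 1 / q), "bc")],   # ad - da = (q - 1/q) bc
    }
    return rules[left + right]

def normal_order(poly, q):
    """Bring every word of poly (dict: word -> coefficient) into normal order."""
    result = {}
    stack = list(poly.items())
    while stack:
        word, coeff = stack.pop()
        if coeff == 0:
            continue
        for i in range(len(word) - 1):
            left, right = word[i], word[i + 1]
            if ORDER.index(left) > ORDER.index(right):
                for c, new in swap_rule(left, right, q):
                    stack.append((word[:i] + new + word[i + 2:], coeff * c))
                break
        else:  # no out-of-order pair left: the word is normal ordered
            result[word] = result.get(word, 0) + coeff
    return {w: c for w, c in result.items() if c != 0}

def multiply(p1, p2):
    """Product of two polynomials in the noncommuting generators."""
    out = {}
    for w1, c1 in p1.items():
        for w2, c2 in p2.items():
            out[w1 + w2] = out.get(w1 + w2, 0) + c1 * c2
    return out

def quantum_plane_defect(q):
    """Normal-ordered form of x'y' - q y'x'; an empty dict means it vanishes."""
    q = Fraction(q)
    x_new = {"ax": Fraction(1), "by": Fraction(1)}  # x' = a x + b y
    y_new = {"cx": Fraction(1), "dy": Fraction(1)}  # y' = c x + d y
    defect = multiply(x_new, y_new)
    for word, coeff in multiply(y_new, x_new).items():
        defect[word] = defect.get(word, 0) - q * coeff
    return normal_order(defect, q)

print(quantum_plane_defect(Fraction(3, 2)))
```

An empty result for a given $q$ means that every normal-ordered monomial of $x'y' - qy'x'$ cancels for that value. This check concerns covariance only; it says nothing about the short-distance singularities discussed above, which the deformation leaves unchanged.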
The qualitative reason for this is that the deformation parameter $q$ is a pure number while one would need a deformation with a parameter having dimension of a length or a mass (however, some people still believe that the deformations discussed above can lead to a regularization).

Deformations with parameters having a dimension have also been studied. One way to obtain this is through contraction of $q$-deformations. For instance one can study the $q$-deformation of de Sitter space, quantum de Sitter space. The algebra has two parameters, $q$ and the radius $R$ of de Sitter space. For $q = 1$, as $R \to \infty$, the covariance algebra of de Sitter space contracts to the Poincaré algebra of Minkowski space. However, if one starts from finite $q$ and $R$ and goes to the limit $q = 1$ and $R = \infty$ in a judicious manner, one is left with an algebra with a dimensional parameter.

Other popular choices of commutation relations among the space-time coordinates are the canonical structure
$$[x^\mu, x^\nu] = i\theta^{\mu\nu}, \qquad \theta^{\mu\nu} = -\theta^{\nu\mu},$$
and the Lie-algebra structure
$$[x^\mu, x^\nu] = iC^{\mu\nu}{}_{\lambda}\,x^\lambda,$$
which both involve dimensional parameters and ordinary (not $q$-deformed) commutators. The canonical structure may appear too trivial at first, but is closely related to certain situations in superstring theory when a constant antisymmetric background two-index tensor is present. A detailed study of the divergences of a quantum field theory in a space-time with canonical structure has been done very recently [26].

References

For those who wish to learn more about the subject and perhaps go on to do research, I can suggest the following.

Books:
1. J. Wess and J. Bagger, Supersymmetry and Supergravity, 2nd ed. (Princeton University Press, Princeton, NJ, 1991). The two-component van der Waerden notation is explained in detail and used here.
2. S. Weinberg, The Quantum Theory of Fields, vol. III, Supersymmetry (Cambridge University Press, Cambridge, 2000). This book uses the standard four-component Pauli notation for spinors.
3. J. Polchinski, String Theory (Cambridge University Press, Cambridge, 1998) (two volumes).

Reviews:
4. P. Fayet and S. Ferrara, Supersymmetry, Physics Reports 32 C, no. 5, 249-334 (1977).
5. P. van Nieuwenhuizen, Supergravity, Physics Reports 68, no. 4, 189-398 (1981).

Collections of Published Papers:
6. Supersymmetry, S. Ferrara, ed. (North-Holland, Amsterdam and World Scientific, Singapore, 1987) (two volumes).
7. Supergravities in diverse dimensions, A. Salam and E. Sezgin, eds. (North-Holland, Amsterdam and World Scientific, Singapore, 1989).

Recent Reviews
The above reviews and collections of papers are very useful for the periods they cover. Relatively recent reviews with emphasis on phenomenology are:
8. C. Csaki, The Minimal Supersymmetric Standard Model, Mod. Phys. Lett. A 11, 599 (1996).
9. S. Martin, A Supersymmetry Primer, hep-ph/9709356, extended version of the article in the book "Perspectives in Supersymmetry", G.L. Kane, ed. (World Scientific, Singapore, 1998).

Specific References
10. Yuri Golfand Memorial Volume, M. Shifman, ed. (World Scientific, Singapore, to appear).
11. D. Volkov Memorial Seminar, J. Wess and V.P. Akulov, eds. (Springer, Berlin, 1998).
12. R. Haag, J.T. Lopuszanski and M. Sohnius, Nucl. Phys. B 88, 257 (1975).
13. E. Witten and D. Olive, Phys. Lett. B 78, 97 (1978).
14. P.K. Townsend, hep-th/9507048, Proceedings of PASCOS 95, J. Bagger, ed. (World Scientific, Singapore, 1996).
15. A. Kovner, M. Shifman and A. Smilga, hep-th/9706089v2, Phys. Rev. D 56, 7978 (1997). Other related papers by Shifman and collaborators are quoted here and in the following paper.
16. S. Ferrara and M. Porrati, hep-th/9711116v2, Phys. Lett. B 423, 255 (1998).
17. P. Fayet and J. Iliopoulos, Phys. Lett. 51 B, 461 (1974).
18. L. O'Raifeartaigh, Nucl. Phys. B 96, 331 (1975).
19. B. de Wit and D.Z. Freedman, Phys. Rev. Lett. 35, 827 (1975).
20. D. Volkov and V.A. Soroka, JETP Lett. 18, 312 (1973).
21. S. Deser and B. Zumino, Phys. Rev. Lett. 38, 1433 (1977).
22. J. Wess and B. Zumino, Nucl. Phys. (Proc. Suppl.) B18, 302 (1990); this paper explains the role of the R-matrix in the noncommutative calculus.
23. A highly mathematical account of the quantum group approach (written with the idea of deformed quantum field theory in mind) can be found in the book: S. Majid, Foundations of Quantum Group Theory (Cambridge University Press, Cambridge, 1995).
24. B. Zumino, Mod. Phys. Lett. A6, 1225 (1991); this paper considers the calculus on a fermionic hyperplane of arbitrary dimension.
25. C.S. Chu, P.M. Ho and B. Zumino, hep-th/9608188, in Quantum Fields and Quantum Space Time, G. 't Hooft et al., eds. (Plenum Press, New York, 1997), p. 281, as well as numerous papers in collaboration with J. Wess and others.
26. S. Minwalla, M. Van Raamsdonk and N. Seiberg, hep-th/9912072.