A Career In Theoretical Physics, Second Edition (World Scientific Series in 20th Century Physics 35)


Author: P. W. Anderson



World Scientific Series in 20th Century Physics - Vol. 35

A CAREER IN THEORETICAL PHYSICS Second Edition

PHILIP W. ANDERSON


World Scientific Series in 20th Century Physics

Published

Vol. 21 Spectroscopy with Coherent Radiation — Selected Papers of Norman F. Ramsey (with Commentary), edited by N. F. Ramsey
Vol. 22 A Quest for Symmetry — Selected Works of Bunji Sakita, edited by K. Kikkawa, M. Virasoro and S. R. Wadia
Vol. 23 Selected Papers of Kun Huang (with Commentary), edited by B.-F. Zhu
Vol. 24 Subnuclear Physics — The First 50 Years: Highlights from Erice to ELN, by A. Zichichi, edited by O. Barnabei, P. Pupillo and F. Roversi Monaco
Vol. 25 The Creation of Quantum Chromodynamics and the Effective Energy, by V. N. Gribov, G. 't Hooft, G. Veneziano and V. F. Weisskopf, edited by L. N. Lipatov
Vol. 26 A Quantum Legacy — Seminal Papers of Julian Schwinger, edited by K. A. Milton
Vol. 27 Selected Papers of Richard Feynman (with Commentary), edited by L. M. Brown
Vol. 28 The Legacy of Leon Van Hove, edited by A. Giovannini
Vol. 29 Selected Works of Emil Wolf (with Commentary), edited by E. Wolf
Vol. 30 Selected Papers of J. Robert Schrieffer — In Celebration of His 70th Birthday, edited by N. E. Bonesteel and L. P. Gor'kov
Vol. 31 From the Preshower to the New Technologies for Supercolliders — In Honour of Antonino Zichichi, edited by B. H. Wiik, A. Wagner and H. Wenninger
Vol. 32 In Conclusion — A Collection of Summary Talks in High Energy Physics, edited by J. D. Bjorken
Vol. 33 Formation and Evolution of Black Holes in the Galaxy — Selected Papers with Commentary, edited by H. A. Bethe, G. E. Brown and C.-H. Lee
Vol. 35 A Career in Theoretical Physics, 2nd Edition, by P. W. Anderson

For information on Vols. 1-20, please visit http://www.worldscibooks.com/series/wsscp_series.shtml

PHILIP W. ANDERSON
Princeton University, USA

World Scientific
New Jersey • London • Singapore • Beijing • Shanghai • Hong Kong • Taipei • Chennai

Published by World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

The author and publisher would like to thank the following publishers for their permission to reproduce the reprinted papers found in this volume:

American Association for the Advancement of Science (Science)
American Institute of Physics (AIP Conf. Proc., Physics Today)
American Physical Society (Phys. Rev., Phys. Rev. Lett., Rev. Mod. Phys.)
Birkhäuser Verlag (Ann. Henri Poincaré)
EDP Sciences (J. de Physique Colloque)
Elsevier Science Publishers (J. Phys. Chem. Solids, Solid State Commun., Physica, Phys. Reports, Stud. Hist. Phil. Mod. Phys.)
Institute of Physics (J. Phys. F, J. Phys. Condens. Matter)
Perseus Books (Emerging Syntheses in Science)
Springer-SBM (Valence Instabilities and Related Narrow Band Phenomena, Proc. of 1983 NATO/CAP Inst. "Moment Formation in Solids", The Universality of Physics: A Festschrift in honor of Deng Feng Wang)
Taylor and Francis [http://www.tandf.co.uk/journals] (Philosophical Magazine, Solid State Physics, Comments on Solid State Physics, Transition Metal Compounds)

To those publishers who have not granted us permission before publication, we have taken the liberty to reproduce their articles without consent. However we shall acknowledge them in future editions of this work.

Library of Congress Cataloguing-in-Publication Data
Anderson, P. W. (Philip W.), 1923-
A career in theoretical physics (2nd ed.) / Philip W. Anderson.
p. cm. — (World Scientific series in 20th century physics; v. 35)
ISBN 981-238-865-6 - ISBN 981-238-866-4 (pbk)
1. Mathematical physics. I. Title. II. Series.
QC20.A55 2004
2004-063652
530.1-dc22
CIP

Copyright © 2004 this edition by World Scientific Publishing Co. Pte. Ltd. All rights reserved.

Printed in Singapore.

PREFACE TO THE SECOND EDITION

The plan which informed the selection of papers in the original volume was to stick as far as possible to summaries and reviews, since there was and is no possibility of covering all possible subjects. I think this may not have been a very good idea, since it slights the original research papers with their greater historical value. This volume is the only summary of my career likely to be published — after all, consider its title — and therefore I have omitted some of these reviews, and substituted some of the research papers — for instance, the paper on line broadening which was a summary of my thesis, and the original paper on localization theory. There is a second somewhat more embarrassing reason for this revision. At the time of the earlier edition, I was gung-ho for a theory of the mechanism of high Tc superconductivity which was shortly proven not to be the primary one. Among the "Central Dogmas" which I proudly reprinted in the book, number V was just plain wrong. I now belatedly understand what led me astray, but only because I was conclusively shown to be wrong, in the gentlest possible manner, by the careful experimental work of several colleagues, notably Kam Moler and collaborators. Fortunately, the correct theory is also partly based on the ideas I expressed in 1987, so not all is lost; but the papers in this section — involving nearly ten years of my professional life — have had to be mostly replaced, and some of the commentary rewritten.


INTRODUCTORY ESSAY TO THE FIRST EDITION

It was in the winter of 1966-7 that I first tried to express my philosophy of what was important in science — or at least, of what was important that was possible for me. I had been invited to spend a month in the pleasant climate of La Jolla as the Regents' Lecturer at UCSD, visiting a department staffed by many of my old friends from Bell Labs, such as Bernd Matthias, Harry Suhl, and George Feher, who had been recruited by Walter Kohn, who was also by courtesy a Bell alumnus. One of the lectures I gave was called "More is Different". The original version, now lost, went over with the audience like a lead balloon, at least according to that portion of the audience most closely related to me. But a cleaned-up version, which was published in Science in 1972, has been surprisingly influential. I have chosen to reprint it, out of chronological order, as an additional introduction to this collection.

I already had this overall vision, if only very vaguely, as I completed my thesis and other early work on the breadths of spectral lines. This work was among the first to approach what was normally thought of as a dissipative process, and dealt with by Boltzmann equations for the microscopic variables, in terms of the macroscopic moments of the system as a whole. In my own mind I called it the "one big molecule" method. I was beginning to see that somehow, starting from the fundamental atomic-level laws of quantum theory, and without any more or less mystical appeals to the difference between the atomic domain and the macroscopic domain where deterministic Newtonian physics held sway, one could really understand many of the properties of matter in that macroscopic domain. I also experienced two other insights: that these properties were often extraordinary; and that this understanding was often intellectually challenging and deep, at a level which was for me at least more exciting and worthwhile than any other purely intellectual exercise.
Thus I was launched into the search for the "Aufbauprinzips" of the everyday world. I admired Fröhlich's book on dielectric theory because it took the same point of view; I encountered it in my early work on ferroelectrics. Much more important, I came to see the crucial importance of the concept of broken symmetry, which has been a lifelong interest. Broken symmetry is the clearest instance of the process of emergence which lies behind "More is different". My interest began with the antiferromagnetic ground state work in 1952-3, from which I developed the concept of the Goldstone boson. In applying similar thinking to superconductivity, the Higgs phenomenon and the concept of Generalized Rigidity arose, the latter referring to the similarities between superfluidity, superconductivity, and elastic rigidity of all sorts. The topological theory of defects and dissipation is another facet of the conceptual structure of broken symmetry.

The first natural direction in which to seek for understanding of the real world is the explicit quantitative calculation of properties of real materials. This enterprise is particularly rewarding when the basic principles which govern these properties are new, or when the effects are subtle. This has been the mainspring of much of my work, from the early work on magnetic resonance and other spectra, to superexchange, the microscopic theory of superconductivity, and even work on the electronic structure of solids, foreshadowing the corrections to local density theory, and a paper where the idea is to develop a method which explains and exploits the strong locality of the chemical bond, perhaps the most fundamental observation in the field of chemistry, which nonetheless seems to interest quantum chemists not at all.

In 1956, stimulated by a pioneering paper on percolation by Hammersley and by some puzzling experiments, I started to look at the next level of complexity, those phenomena which are intrinsic to disorder. The first result was the theory of localisation of waves. This was slow of acceptance, and, perhaps discouraged by this fact, it was not until 1969 that I returned to it and also began to look at other disorder phenomena such as glass and spin glass. Spin glass stimulated a very fertile period of worldwide activity, first in the development of totally novel methodology, such as a new statistical mechanics of nonergodic systems, and then in applications of the new methodology to a "cornucopia" of new ideas and concepts and problems: neural networks, computer algorithms, rugged evolutionary landscapes, even immunology.
This has left me involved in such fields as economics and the origin of life. The interest in complex, emergent phenomena made it natural for me to participate at an early stage in the Santa Fe Institute, founded on the twin principles of interdisciplinarity and of the importance and tractability of genuinely complex systems.

Left behind, in my old field of quantum condensed matter physics, was a worry about the validity and completeness of the theories of many-body quantum systems which had occupied that world in the 1950's and 60's. My work on the Anderson model of magnetic impurities had led on to the "infrared catastrophe" in Fermi systems which was the key to my work on the Kondo phenomenon which solved that isolated many-body system but only at the cost of going beyond the conventional perturbation theory. I was also disturbed by a class of strongly interacting superconducting materials, as well as our notable lack of success in dealing with the field of quantum valence fluctuations, i.e. "heavy electrons". With the appearance of the high Tc cuprates in 1987, my apprehensions became manifestly justified, and I began to formulate theories of metals based on new principles. This is the area which now most absorbs me, and where my efforts have been rewarded with the discovery of several new and fascinating phenomena.

I reach my 70th year still involved in two major, very different types of problem. Although there is no easy way to round this one off, I have tried to give a flavor of the ideas involved. My original vision seems to me to have been sound. The problem of understanding the "here and now" world seems to show no sign of being exhausted. The contribution of physics is the method of dealing correctly both with the substrate from which emergence takes place, and with the emergent phenomenon itself. Examples are legion: for one, superconductivity cannot be properly understood simply as a phenomenology without understanding electrons and their interactions, nor on the other hand as a property of a small number of electrons without taking into account the macroscopic system. Ever newer insights into the nature of the world around us will continually arise from this style of doing science. There will be physics even in a world without the SSC.

Regarding the choice of articles in this book: the sheer volume of either a complete selection of research articles or, even more, a complete set of reviews would have been much too great. I have given preference to materials which are for one reason or another not easy of access: lecture notes and out of print sources, for example. I have tried but failed to provide a selection of all major topics; the book is long enough. I have included a few items on history as I saw it, and on general aspects of science.

Oxford, December 2, 1993


CONTENTS

Preface to the Second Edition

Introductory Essay to the First Edition

More Is Different Science 177, 393-396 (1972)

1. Pressure Broadening in the Microwave and Infrared Regions Phys. Rev. 76, 647-661 (1949)

2. Theory of Ferroelectric Behavior of Barium Titanate Ceramic Age 57, 29-34 (1951)

3. Use of Stochastic Methods in Line-Broadening Problems Kyoto Lectures, March 1954

4. An Approximate Quantum Theory of the Antiferromagnetic Ground State Phys. Rev. 86, 694-701 (1952)

5. Qualitative Considerations on the Statistics of the Phase Transition in BaTiO3-type Ferroelectrics Conference Proceedings, Lebedev Physics Institute, USSR Academy of Sciences, Nov. 1958, Fizika Dielektrikov (Moscow), 1960, pp. 290-297

6. Ordering and Antiferromagnetism in Ferrites Phys. Rev. 102, 1008-1013 (1956)

7. Absence of Diffusion in Certain Random Lattices Phys. Rev. 109, 1492-1505 (1958)

8. New Method in the Theory of Superconductivity Phys. Rev. 110, 985-986 (1958)

9. New Approach to the Theory of Superexchange Interactions Phys. Rev. 115, 2-13 (1959)

10. Theory of Dirty Superconductors J. Phys. Chem. Solids 11, 26-30 (1959)

11. Calculation of the Superconducting State Parameters with Retarded Electron-Phonon Interaction (with P. Morel) Phys. Rev. 125, 1263-1271 (1962)

12. Generalized Bardeen-Cooper-Schrieffer States and the Proposed Low-Temperature Phase of Liquid He3 (with P. Morel) Phys. Rev. 123, 1911-1934 (1961)

13. Localized Magnetic States in Metals Phys. Rev. 124, 41-53 (1961)

14. Plasmons, Gauge Invariance, and Mass Phys. Rev. 130, 439-442 (1963)

15. Hard Superconductivity: Theory of the Motion of Abrikosov Flux Lines (with Y.B. Kim) Rev. Mod. Phys. 36, 39-43 (1964)

16. Probable Observation of the Josephson Superconducting Tunneling Effect (with J.M. Rowell) Phys. Rev. Lett. 10, 230-232 (1963)

17. Image of the Phonon Spectrum in the Tunneling Characteristic between Superconductors (with J.M. Rowell and D.E. Thomas) Phys. Rev. Lett. 10, 334-336 (1963)

18. Exchange in Magnetic Insulators in Transition Metal Compounds, ed. E.R. Schatz, Buhl Int. Conference on Materials, Pittsburgh (Gordon and Breach, New York), 1964, pp. 17-28

19. Superconductivity (Two Opinions) (with B.T. Matthias) Science 144, 373-381 (1964)

20. Coherent Matter Field Phenomena in Superfluids in Some Recent Definitions in the Basic Sciences, Vol. 2 (Belfer Graduate School of Sciences, Yeshiva University, New York, 1969), ed. A. Gelbart, pp. 21-40

21. Considerations on the Flow of Superfluid Helium Rev. Mod. Phys. 38, 298-310 (1966)

22. Multiple-Scattering Theory and Resonances in Transition Metals (with W.L. McMillan) in Proc. of the Int. School of Physics 'Enrico Fermi' XXXVII (Academic Press, New York), 1967, pp. 50-86

23. Infrared Catastrophe in Fermi Gases with Local Scattering Potentials Phys. Rev. Lett. 18, 1049-1051 (1967)

24. The Kondo Effect I Comments on Solid State Physics 1, no. 2, 31-36 (1968-69)

The Kondo Effect II Comments on Solid State Physics 1, no. 6, 190-197 (1968-69)

Kondo Effect III: The Wilderness — Mainly Theoretical Comments on Solid State Physics 3, no. 6, 153-158 (1971)

Kondo Effect IV: Out of the Wilderness Comments on Solid State Physics 5, no. 3, 73-79 (1973)

25. Superconductivity in the Past and the Future in Superconductivity, Vol. 2, ed. R. Parks (Marcel Dekker, New York), 1968, pp. 1343-1358

26. Macroscopic Coherence and Superfluidity in Contemporary Physics, Vol. 1 (IAEA, Vienna), 1969, pp. 47-54

27. The Fermi Glass: Theory and Experiment Comments on Solid State Physics 2, 193-198 (1970)

28. Space-Time and Scaling Techniques in the Kondo Problem in Proc. of 12th Int. Conf. on Low Temperature Physics, ed. E. Kanda (Academic Press, 1971), pp. 657-660

29. Anomalous Low-temperature Thermal Properties of Glasses and Spin Glasses (with B.I. Halperin and C.M. Varma) Philos. Mag. 25, 1-9 (1972)

30. Comments on the Maximum Superconducting Transition Temperature (with M.L. Cohen) AIP Conf. Proc., 1971, pp. 17-27

Comment on "Model for an Exciton Mechanism of Superconductivity" (with J.C. Inkson) Phys. Rev. B 8, 4429-4432 (1973)

31. Resonating Valence Bonds: A New Kind of Insulator? Materials Research Bulletin 8, 153-160 (1973)

32. Anisotropic Superfluidity in 3He: A Possible Interpretation of Its Stability as a Spin-Fluctuation Effect (with W.F. Brinkman) Phys. Rev. Lett. 30, 1108-1111 (1973)

33. Conference Summary in Collective Properties of Physical Systems, eds. B. Lundqvist and S. Lundqvist, Proc. of Nobel Symposium, Goteborg, Sweden, 13 June 1973 (Academic Press, 1974), pp. 266-271

34. Asymptotically Exact Methods in the Kondo Problem (with G. Yuval) in Magnetism, vol. V, ed. H. Suhl (Academic Press, 1973), pp. 217-236

35. Many-Body Effects at Surfaces in Elementary Excitations in Solids, Molecules and Atoms (Plenum, 1974), pp. 1-29

36. Conductivity from Charge or Spin Density Waves (with P.A. Lee and T.M. Rice) Solid State Communications 14, 703-709 (1974)

37. Uses of Solid State Analogies in Elementary Particle Theory in Proc. of Conf. on Gauge Theories and Modern Field Theory, eds. R. Arnowitt and P. Nath (MIT Press, 1976), pp. 311-335

38. Possible Consequences of Negative U Centers in Amorphous Materials J. de Physique Colloque, No. 4, 339-342 (1976)

39. Theory of Spin Glasses (with S.F. Edwards) J. Phys. F 5, 965-974 (1975)

40. Solution of "Solvable Model of a Spin Glass" (with D.J. Thouless and R.G. Palmer) Philos. Mag. 35, 593-601 (1977)

41. Phase Slippage without Vortex Cores: Vortex Textures in Superfluid 3He (with G. Toulouse) Phys. Rev. Lett. 38, 508-511 (1977)

42. Scaling Theory of Localization: Absence of Quantum Diffusion in Two Dimensions (with E. Abrahams, D.C. Licciardello and T.V. Ramakrishnan) Phys. Rev. Lett. 42, 673-676 (1979)

43. Some General Thoughts about Broken Symmetry in Symmetries and Broken Symmetries in Condensed Matter Physics, ed. N. Boccara (IDSET, Paris, 1981), pp. 11-20

44. The Rheology of Neutron Stars: Vortex Line Pinning in the Crust Superfluid (with M.A. Alpar, D. Pines and J. Shaham) Philos. Mag. A45, 227-238 (1982)

45. Localization Redux Physica 117B & 118B, 30-35 (1983)

46. New Method for Scaling Theory of Localization. II: Multi-Channel Theory of a "Wire" and Possible Extension to Higher Dimensionality Phys. Rev. B 23, 4828-4836 (1981)

47. Definition and Measurement of the Electrical and Thermal Resistances (with H.-L. Engquist) Phys. Rev. B 24, 1151-1154 (1981)

48. Suggested Model for Prebiotic Evolution: The Use of Chaos Proc. Natl. Acad. Sci. (USA) 80, 3386-3390 (1983)

49. Chemical Pseudopotentials Phys. Reports 110, Nos. 5&6, 311-319 (1984)

50. Spin Glass Hamiltonians: A Bridge between Biology, Statistical Mechanics and Computer Science in Emerging Syntheses in Science, ed. D. Pines (Addison-Wesley, 1987), pp. 17-20

51. Measurement in Quantum Theory and the Problem of Complex Systems in The Lessons of Quantum Theory, ed. J. de Boer, E. Dal and O. Ulfbeck, talk given at the Niels Bohr Centenary Symposium, Copenhagen, October 1985 (Elsevier Science, 1986), pp. 23-34

52. It's Not Over Till the Fat Lady Sings Talk given at History of Superconductivity, APS Meeting, March 1987

53. Spin Glass I: A Scaling Law Rescued Physics Today 41#1, 9 (1988)

Spin Glass II: Is There a Phase Transition? Physics Today 41#3, 9 (1988)

Spin Glass III: Theory Raises Its Head Physics Today 41#6, 9 (1988)

Spin Glass IV: Glimmerings of Trouble Physics Today 41#9, 9 (1988)

Spin Glass V: Real Power Brought to Bear Physics Today 42#7, 9 (1989)

Spin Glass VI: Spin Glass as Cornucopia Physics Today 42#9, 9 (1989)

Spin Glass VII: Spin Glass as Paradigm Physics Today 43#3, 9 (1990)

54. Epilogue in Valence Instabilities and Related Narrow-band Phenomena, ed. R.D. Parks (Plenum, 1977), pp. 389-396

55. Present Status of Theory: 1/N Approach in Proc. of 1983 NATO/CAP Inst. "Moment Formation in Solids", ed. W.J.L. Buyers (Plenum), pp. 313-326

56. The Problem of Fluctuating Valence in f-Electron Metals in Windsurfing in the Fermi Sea, Vol. 1, eds. T.T.S. Kuo and J. Speth (Elsevier Science, 1987), pp. 61-67

57. Gutzwiller-Hubbard Lattice-Gas Model with Variable Density: Application to Normal Liquid 3He (with D. Vollhardt and P. Wölfle) Phys. Rev. B 35, 6703-6715 (1987)

58. Some Ideas on the Aesthetics of Science Lecture given at the 50th Anniversary Seminar of the Faculty of Science and Technology, Keio University, Japan, May 1989

59. Theoretical Paradigms for the Sciences of Complexity Nishina Memorial Lecture, Department of Physics, Keio University, Japan, May 1989

60. 50 Years of the Mott Phenomenon: Insulators, Magnets, Solids, and Superconductors as Aspects of Strong-Repulsion Theory in Frontiers and Borderlines in Many-Particle Physics, Proceedings of the Enrico Fermi International School of Physics, Varenna, July 1987 (North-Holland, 1987), pp. 1-40

61. Theories of Fullerene Tc's Which Will Not Work

62. The Reverend Thomas Bayes, Needles in Haystacks, and the Fifth Force Physics Today 45#1, 9 (1992)

63. The Eightfold Way to the Theory of Complexity: A Prologue in Complexity, eds. G. Cowan, D. Pines and D. Meltzer (Addison-Wesley, 1994), pp. 7-16

64. Magnetic Field Induced Confinement in Strongly Correlated Anisotropic Materials (with S.P. Strong and D.G. Clarke) Phys. Rev. Lett. 73, 1007-1010 (1994)

65. Physics: The Opening to Complexity Proc. Natl. Acad. Sci. (USA) 92, 6653-6654 (1995)

66. Beyond Chaos: Singular Distributions and Power Laws

67. Essay Review — Science: A 'Dappled World' or a 'Seamless Web'? Stud. Hist. Phil. Mod. Phys. 32, 487-494 (2001)

68. RVB Redux: A Synergistic Theory of High Tc Cuprates in The Universality of Physics, A Festschrift in Honor of Deng Feng Wang, ed. Khuri et al. (Kluwer/Plenum, 2001), pp. 3-8

69. Physics of the Pseudogap Phase of High Tc Cuprates, or, RVB meets Umklapp J. Phys. Chem. Solids 63, 2145-2148 (2002)

70. In Praise of Unstable Fixed Points: The Way Things Actually Work Physica B 318, 28-32 (2002)

71. Rise of Complexity, 1953-2002 Ann. Henri Poincare 4, S1-S6 (2003)

72. The Physics behind High-temperature Superconducting Cuprates: The 'Plain Vanilla' Version of RVB (with P.A. Lee, M. Randeria, T.M. Rice, N. Trivedi and F.C. Zhang) J. Phys. Condens. Matter 16, R755-R769 (2004)


Reprinted from Science, 4 August 1972, Volume 177, pp. 393-396

More Is Different

Broken symmetry and the nature of the hierarchical structure of science.

P. W. Anderson

The author is a member of the technical staff of the Bell Telephone Laboratories, Murray Hill, New Jersey 07974, and visiting professor of theoretical physics at Cavendish Laboratory, Cambridge, England. This article is an expanded version of a Regents' Lecture given in 1967 at the University of California, La Jolla.

The reductionist hypothesis may still be a topic for controversy among philosophers, but among the great majority of active scientists I think it is accepted without question. The workings of our minds and bodies, and of all the animate or inanimate matter of which we have any detailed knowledge, are assumed to be controlled by the same set of fundamental laws, which except under certain extreme conditions we feel we know pretty well.

It seems inevitable to go on uncritically to what appears at first sight to be an obvious corollary of reductionism: that if everything obeys the same fundamental laws, then the only scientists who are studying anything really fundamental are those who are working on those laws. In practice, that amounts to some astrophysicists, some elementary particle physicists, some logicians and other mathematicians, and few others. This point of view, which it is the main purpose of this article to oppose, is expressed in a rather well-known passage by Weisskopf (1):

"Looking at the development of science in the Twentieth Century one can distinguish two trends, which I will call 'intensive' and 'extensive' research, lacking a better terminology. In short: intensive research goes for the fundamental laws, extensive research goes for the explanation of phenomena in terms of known fundamental laws. As always, distinctions of this kind are not unambiguous, but they are clear in most cases. Solid state physics, plasma physics, and perhaps also biology are extensive. High energy physics and a good part of nuclear physics are intensive. There is always much less intensive research going on than extensive. Once new fundamental laws are discovered, a large and ever increasing activity begins in order to apply the discoveries to hitherto unexplained phenomena. Thus, there are two dimensions to basic research. The frontier of science extends all along a long line from the newest and most modern intensive research, over the extensive research recently spawned by the intensive research of yesterday, to the broad and well developed web of extensive research activities based on intensive research of past decades."

The effectiveness of this message may be indicated by the fact that I heard it quoted recently by a leader in the field of materials science, who urged the participants at a meeting dedicated to "fundamental problems in condensed matter physics" to accept that there were few or no such problems and that nothing was left but extensive science, which he seemed to equate with device engineering.

The main fallacy in this kind of thinking is that the reductionist hypothesis does not by any means imply a "constructionist" one: The ability to reduce everything to simple fundamental laws does not imply the ability to start from those laws and reconstruct the universe. In fact, the more the elementary particle physicists tell us about the nature of the fundamental laws, the less relevance they seem to have to the very real problems of the rest of science, much less to those of society.

The constructionist hypothesis breaks down when confronted with the twin difficulties of scale and complexity. The behavior of large and complex aggregates of elementary particles, it turns out, is not to be understood in terms of a simple extrapolation of the properties of a few particles. Instead, at each level of complexity entirely new properties appear, and the understanding of the new behaviors requires research which I think is as fundamental in its nature as any other. That is, it seems to me that one may array the sciences roughly linearly in a hierarchy, according to the idea: The elementary entities of science X obey the laws of science Y.

X                                   Y
solid state or many-body physics    elementary particle physics
chemistry                           many-body physics
molecular biology                   chemistry
cell biology                        molecular biology
...                                 ...
psychology                          physiology
social sciences                     psychology

But this hierarchy does not imply that science X is "just applied Y." At each stage entirely new laws, concepts, and generalizations are necessary, requiring inspiration and creativity to just as great a degree as in the previous one. Psychology is not applied biology, nor is biology applied chemistry. In my own field of many-body physics, we are, perhaps, closer to our fundamental, intensive underpinnings than in any other science in which nontrivial complexities occur, and as a result we have begun to formulate a general theory of just how this shift from quantitative to qualitative differentiation takes place. This formulation, called the theory of "broken symmetry," may be of help in making more generally clear the breakdown of the constructionist converse of reductionism. I will give an elementary and incomplete explanation of these ideas, and then go on to some more general speculative comments about analogies at

Copyright© 1972 by the American Association for the Advancement

1

of Science

other levels and about similar phenomena.

Before beginning this I wish to sort out two possible sources of misunderstanding. First, when I speak of scale change causing fundamental change I do not mean the rather well-understood idea that phenomena at a new scale may obey actually different fundamental laws, as, for example, general relativity is required on the cosmological scale and quantum mechanics on the atomic. I think it will be accepted that all ordinary matter obeys simple electrodynamics and quantum theory, and that really covers most of what I shall discuss. (As I said, we must all start with reductionism, which I fully accept.) A second source of confusion may be the fact that the concept of broken symmetry has been borrowed by the elementary particle physicists, but their use of the term is strictly an analogy, whether a deep or a specious one remaining to be understood.

Let me then start my discussion with an example on the simplest possible level, a natural one for me because I worked with it when I was a graduate student: the ammonia molecule. At that time everyone knew about ammonia and used it to calibrate his theory or his apparatus, and I was no exception. The chemists will tell you that ammonia "is" a triangular pyramid, NH₃, with the nitrogen negatively charged and the hydrogens positively charged, so that it has an electric dipole moment (μ), negative toward the apex of the pyramid. Now this seemed very strange to me, because I was just being taught that nothing has an electric dipole moment. The professor was really proving that no nucleus has a dipole moment, because he was teaching nuclear physics, but as his arguments were based on the symmetry of space and time they should have been correct in general.

I soon learned that, in fact, they were correct (or perhaps it would be more accurate to say not incorrect) because he had been careful to say that no stationary state of a system (that is, one which does not change in time) has an electric dipole moment. If ammonia starts out from the above unsymmetrical state, it will not stay in it very long. By means of quantum mechanical tunneling, the nitrogen can leak through the triangle of hydrogens to the other side, turning the pyramid inside out, and, in fact, it can do so very rapidly. This is the so-called "inversion," which occurs at a frequency of about 3 × 10¹⁰ per second. A truly stationary state can only be an equal superposition of the unsymmetrical pyramid and its inverse. That mixture does not have a dipole moment. (I warn the reader again that I am greatly oversimplifying and refer him to the textbooks for details.)

I will not go through the proof, but the result is that the state of the system, if it is to be stationary, must always have the same symmetry as the laws of motion which govern it. A reason may be put very simply: In quantum mechanics there is always a way, unless symmetry forbids, to get from one state to another. Thus, if we start from any one unsymmetrical state, the system will make transitions to others, so only by adding up all the possible unsymmetrical states in a symmetrical way can we get a stationary state. The symmetry involved in the case of ammonia is parity, the equivalence of left- and right-handed ways of looking at things. (The elementary particle experimentalists' discovery of certain violations of parity is not relevant to this question; those effects are too weak to affect ordinary matter.)

Having seen how the ammonia molecule satisfies our theorem that there is no dipole moment, we may look into other cases and, in particular, study progressively bigger systems to see whether the state and the symmetry are always related. There are other similar pyramidal molecules, made of heavier atoms. Hydrogen phosphide, PH₃, which is twice as heavy as ammonia, inverts, but at one-tenth the ammonia frequency. Phosphorus trifluoride, PF₃, in which the much heavier fluorine is substituted for hydrogen, is not observed to invert at a measurable rate, although theoretically one can be sure that a state prepared in one orientation would invert in a reasonable time.
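The two-state picture of the inversion can be illustrated numerically. The sketch below is not taken from the article: the basis labels, the tunneling amplitude `A`, the dipole value `mu`, and the choice of units (ħ = 1) are assumptions made purely for illustration. It shows that both stationary states carry zero dipole moment, while a molecule prepared in one pyramidal configuration has a dipole that oscillates at the inversion frequency.

```python
import numpy as np

# Basis: |L> (nitrogen on one side of the hydrogen plane) and |R> (other side).
# All numbers are hypothetical, in units where hbar = 1.
E0 = 0.0   # energy of each pyramidal configuration
A = 0.5    # tunneling amplitude between the two pyramids (assumed value)
mu = 1.0   # dipole moment of one pyramid, arbitrary units

H = np.array([[E0, -A], [-A, E0]])      # two-state Hamiltonian
D = np.array([[mu, 0.0], [0.0, -mu]])   # dipole operator: +mu in |L>, -mu in |R>

# Stationary states are the symmetric and antisymmetric superpositions.
energies, states = np.linalg.eigh(H)
stationary_dipoles = [float(states[:, k] @ D @ states[:, k]) for k in range(2)]
splitting = energies[1] - energies[0]   # equals 2*A, the inversion splitting


def dipole_at(t):
    """Dipole expectation value at time t for a molecule prepared in |L>."""
    psi0 = np.array([1.0, 0.0], dtype=complex)
    coeffs = states.conj().T @ psi0                      # expand in eigenstates
    psi_t = states @ (np.exp(-1j * energies * t) * coeffs)
    return float(np.real(psi_t.conj() @ D @ psi_t))
```

In this toy model `dipole_at(t)` follows `mu * cos(2 * A * t)`: the unsymmetrical state is not stationary, and only the equal superpositions are. A heavier or more tightly bound molecule would correspond to a smaller `A` and hence a slower inversion.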
We may then go on to more complicated molecules, such as sugar, with about 40 atoms. For these it no longer makes any sense to expect the molecule to invert itself. Every sugar molecule made by a living organism is spiral in the same sense, and they never invert, either by quantum mechanical tunneling or even under thermal agitation at normal temperatures. At this point we must forget about the possibility of inversion and ignore the parity symmetry: the symmetry laws have been, not repealed, but broken.

If, on the other hand, we synthesize our sugar molecules by a chemical reaction more or less in thermal equilibrium, we will find that there are not, on the average, more left- than right-handed ones or vice versa. In the absence of anything more complicated than a collection of free molecules, the symmetry laws are never broken, on the average. We needed living matter to produce an actual unsymmetry in the populations.

In really large, but still inanimate, aggregates of atoms, quite a different kind of broken symmetry can occur, again leading to a net dipole moment or to a net optical rotating power, or both. Many crystals have a net dipole moment in each elementary unit cell (pyroelectricity), and in some this moment can be reversed by an electric field (ferroelectricity). This asymmetry is a spontaneous effect of the crystal's seeking its lowest energy state. Of course, the state with the opposite moment also exists and has, by symmetry, just the same energy, but the system is so large that no thermal or quantum mechanical force can cause a conversion of one to the other in a finite time compared to, say, the age of the universe.

There are at least three inferences to be drawn from this. One is that symmetry is of great importance in physics. By symmetry we mean the existence of different viewpoints from which the system appears the same. It is only slightly overstating the case to say that physics is the study of symmetry. The first demonstration of the power of this idea may have been by Newton, who may have asked himself the question: What if the matter here in my hand obeys the same laws as that up in the sky; that is, what if space and matter are homogeneous and isotropic?

The second inference is that the internal structure of a piece of matter need not be symmetrical even if the total state of it is. 
I would challenge you to start from the fundamental laws of quantum mechanics and predict the ammonia inversion and its easily observable properties without going through the stage of using the unsymmetrical pyramidal structure, even though no "state" ever has that structure. It is fascinating that it was not until a couple of decades ago (2) that nuclear physicists stopped thinking of the nucleus as a featureless, symmetrical little ball and realized that while it really never has a dipole moment, it can become football-shaped or plate-shaped. This has observable consequences in the reactions and excitation spectra that are studied in nuclear physics, even though it is much more difficult to demonstrate directly than the ammonia inversion. In my opinion, whether or not one calls this intensive research, it is as fundamental in nature as many things one might so label. But it needed no new knowledge of fundamental laws and would have been extremely difficult to derive synthetically from those laws; it was simply an inspiration, based, to be sure, on everyday intuition, which suddenly fitted everything together.

The basic reason why this result would have been difficult to derive is an important one for our further thinking. If the nucleus is sufficiently small there is no real way to define its shape rigorously: Three or four or ten particles whirling about each other do not define a rotating "plate" or "football." It is only as the nucleus is considered to be a many-body system, in what is often called the N → ∞ limit, that such behavior is rigorously definable. We say to ourselves: A macroscopic body of that shape would have such-and-such a spectrum of rotational and vibrational excitations, completely different in nature from those which would characterize a featureless system. When we see such a spectrum, even not so separated, and somewhat imperfect, we recognize that the nucleus is, after all, not macroscopic; it is merely approaching macroscopic behavior. Starting with the fundamental laws and a computer, we would have to do two impossible things (solve a problem with infinitely many bodies, and then apply the result to a finite system) before we synthesized this behavior.

A third insight is that the state of a really big system does not at all have to have the symmetry of the laws which govern it; in fact, it usually has less symmetry. 
The outstanding example of this is the crystal: Built from a substrate of atoms and space according to laws which express the perfect homogeneity of space, the crystal suddenly and unpredictably displays an entirely new and very beautiful symmetry. The general rule, however, even in the case of the crystal, is that the large system is less symmetrical than the underlying structure would suggest: Symmetrical as it is, a crystal is less symmetrical than perfect homogeneity. Perhaps in the case of crystals this appears to be merely an exercise in confusion. The regularity of crystals

could be deduced semiempirically in the mid-19th century without any complicated reasoning at all. But sometimes, as in the case of superconductivity, the new symmetry (now called broken symmetry because the original symmetry is no longer evident) may be of an entirely unexpected kind and extremely difficult to visualize. In the case of superconductivity, 30 years elapsed between the time when physicists were in possession of every fundamental law necessary for explaining it and the time when it was actually done. The phenomenon of superconductivity is the most spectacular example of the broken symmetries which ordinary macroscopic bodies undergo, but it is of course not the only one. Antiferromagnets, ferroelectrics, liquid crystals, and matter in many other states obey a certain rather general scheme of rules and ideas, which some many-body theorists refer to under the general heading of broken symmetry. I shall not further discuss the history, but give a bibliography at the end of this article (3). The essential idea is that in the so-called N → ∞ limit of large systems (on our own, macroscopic scale) it is not only convenient but essential to realize that matter will undergo mathematically sharp, singular "phase transitions" to states in which the microscopic symmetries, and even the microscopic equations of motion, are in a sense violated. The symmetry leaves behind as its expression only certain characteristic behaviors, for instance, long-wavelength vibrations, of which the familiar example is sound waves; or the unusual macroscopic conduction phenomena of the superconductor; or, in a very deep analogy, the very rigidity of crystal lattices, and thus of most solid matter. 
There is, of course, no question of the system's really violating, as opposed to breaking, the symmetry of space and time, but because its parts find it energetically more favorable to maintain certain fixed relationships with each other, the symmetry allows only the body as a whole to respond to external forces. This leads to a "rigidity," which is also an apt description of superconductivity and superfluidity in spite of their apparent "fluid" behavior. [In the former case, London noted this aspect very early (4).] Actually, for a hypothetical gaseous but intelligent citizen of Jupiter or of a hydrogen cloud somewhere in the galactic center, the properties of ordinary crystals might well be a more baffling and intriguing puzzle than those of superfluid helium.


I do not mean to give the impression that all is settled. For instance, I think there are still fascinating questions of principle about glasses and other amorphous phases, which may reveal even more complex types of behavior. Nevertheless, the role of this type of broken symmetry in the properties of inert but macroscopic material bodies is now understood, at least in principle. In this case we can see how the whole becomes not only more than but very different from the sum of its parts. The next order of business logically is to ask whether an even more complete destruction of the fundamental symmetries of space and time is possible and whether new phenomena then arise, intrinsically different from the "simple" phase transition representing a condensation into a less symmetric state. We have already excluded the apparently unsymmetric cases of liquids, gases, and glasses. (In any real sense they are more symmetric.) It seems to me that the next stage is to consider the system which is regular but contains information. That is, it is regular in space in some sense so that it can be "read out," but it contains elements which can be varied from one "cell" to the next. An obvious example is DNA; in everyday life, a line of type or a movie film have the same structure. This type of "information-bearing crystallinity" seems to be essential to life. Whether the development of life requires any further breaking of symmetry is by no means clear. Keeping on with the attempt to characterize types of broken symmetry which occur in living things, I find that at least one further phenomenon seems to be identifiable and either universal or remarkably common, namely, ordering (regularity or periodicity) in the time dimension. A number of theories of life processes have appeared in which regular pulsing in time plays an important role: theories of development, of growth and growth limitation, and of the memory. Temporal regularity is very commonly observed in living objects. 
It plays at least two kinds of roles. First, most methods of extracting energy from the environment in order to set up a continuing, quasi-stable process involve time-periodic machines, such as oscillators and generators, and the processes of life work in the same way. Second, temporal regularity is a means of handling information, similar to information-bearing spatial regularity. Human spoken language is an example, and it

is noteworthy that all computing machines use temporal pulsing. A possible third role is suggested in some of the theories mentioned above: the use of phase relationships of temporal pulses to handle information and control the growth and development of cells and organisms (5).

In some sense, structure (functional structure in a teleological sense, as opposed to mere crystalline shape) must also be considered a stage, possibly intermediate between crystallinity and information strings, in the hierarchy of broken symmetries. To pile speculation on speculation, I would say that the next stage could be hierarchy or specialization of function, or both. At some point we have to stop talking about decreasing symmetry and start calling it increasing complication. Thus, with increasing complication at each stage, we go on up the hierarchy of the sciences. We expect to encounter fascinating and, I believe, very fundamental questions at each stage in fitting together less complicated pieces into the more complicated system and understanding the basically new types of behavior which can result.

There may well be no useful parallel to be drawn between the way in which complexity appears in the simplest cases of many-body theory and chemistry and the way it appears in the truly complex cultural and biological ones, except perhaps to say that, in general, the relationship between the system and its parts is intellectually a one-way street. Synthesis is expected to be all but impossible; analysis, on the other hand, may be not only possible but fruitful in all kinds of ways: Without an understanding of the broken symmetry in superconductivity, for instance, Josephson would probably not have discovered his effect. [Another name for the Josephson effect is "macroscopic quantum-interference phenomena": interference effects observed between macroscopic wave functions of electrons in superconductors, or of helium atoms in superfluid liquid helium. These phenomena have already enormously extended the accuracy of electromagnetic measurements, and can be expected to play a great role in future computers, among other possibilities, so that in the long run they may lead to some of the major technological achievements of this decade (6).] For another example, biology has certainly taken on a whole new aspect from the reduction of genetics to biochemistry and biophysics, which will have untold consequences. So it is not true, as a recent article would have it (7), that we each should "cultivate our own valley, and not attempt to build roads over the mountain ranges . . . between the sciences." Rather, we should recognize that such roads, while often the quickest shortcut to another part of our own science, are not visible from the viewpoint of one science alone.

The arrogance of the particle physicist and his intensive research may be behind us (the discoverer of the positron said "the rest is chemistry"), but we have yet to recover from that of some molecular biologists, who seem determined to try to reduce everything about the human organism to "only" chemistry, from the common cold and all mental disease to the religious instinct. Surely there are more levels of organization between human ethology and DNA than there are between DNA and quantum electrodynamics, and each level can require a whole new conceptual structure.

In closing, I offer two examples from economics of what I hope to have said. Marx said that quantitative differences become qualitative ones, but a dialogue in Paris in the 1920's sums it up even more clearly:

FITZGERALD: The rich are different from us.
HEMINGWAY: Yes, they have more money.

References

1. V. F. Weisskopf, in Brookhaven Nat. Lab. Publ. 888T360 (1965). Also see Nuovo Cimento Suppl. Ser. 1 4, 465 (1966); Phys. Today 20 (No. 5), 23 (1967).
2. A. Bohr and B. R. Mottelson, Kgl. Dan. Vidensk. Selsk. Mat. Fys. Medd. 27, 16 (1953).
3. Broken symmetry and phase transitions: L. D. Landau, Phys. Z. Sowjetunion 11, 26, 542 (1937). Broken symmetry and collective motion, general: J. Goldstone, A. Salam, S. Weinberg, Phys. Rev. 127, 965 (1962); P. W. Anderson, Concepts in Solids (Benjamin, New York, 1963), pp. 175-182; B. D. Josephson, thesis, Trinity College, Cambridge University (1962). Special cases: antiferromagnetism, P. W. Anderson, Phys. Rev. 86, 694 (1952); superconductivity, P. W. Anderson, ibid. 110, 827 (1958); ibid. 112, 1900 (1958); Y. Nambu, ibid. 117, 648 (1960).
4. F. London, Superfluids (Wiley, New York, 1950), vol. 1.
5. M. H. Cohen, J. Theor. Biol. 31, 101 (1971).
6. J. Clarke, Amer. J. Phys. 38, 1075 (1969); P. W. Anderson, Phys. Today 23 (No. 11), 23 (1970).
7. A. B. Pippard, Reconciling Physics with Reality (Cambridge Univ. Press, London, 1972).

Pressure Broadening in the Microwave and Infrared Regions Phys. Rev. 76, 647-661 (1949)

This is a condensation of most of my thesis. For two years of postwar grad school, 1945-47, I took the (mostly wonderful) courses available at Harvard from Schwinger, van Vleck, Furry, and visitors such as Goudsmit and Gorter, and, aside from that, enjoyed myself, eventually winding up as one of a group which congealed around Tom Lehrer, Dave Robinson, and others like Chan Davis and Bob Welker, which was more about bridge, singing, puzzles and sci-fi than physics. I had the GI Bill and a tiny scholarship, and would have been happy to go on indefinitely as Tom Lehrer actually did. But in the summer of 1947 I met the right girl and things suddenly became more serious. Van had given me, as one of his last three graduate students (another was Tom Kuhn), a marvelous problem, in spite of my marginal performance on my qualifying exam. It arose from one of the postwar revolutions in spectroscopy which radar-derived coherent sources allowed: unprecedentedly detailed molecular spectroscopy. I solved it using a series of novel techniques borrowed from Schwinger's courses among other sources. I felt that after this there was little more to be done in this field and wanted to move on to other things, an attitude which one job interviewer interpreted as reprehensible, and so chose not to hire me.


Reprinted from THE PHYSICAL REVIEW, Vol. 76, No. 5, 647-661, September 1, 1949. Printed in U. S. A.

Pressure Broadening in the Microwave and Infra-Red Regions*

P. W. ANDERSON†

Harvard University, Cambridge, Massachusetts‡
(Received May 12, 1949)

In this paper a generalized theory of collision broadening is developed, adequate for predicting line breadths in the microwave and infra-red regions. This theory differs from previous ones in taking into account transitions among quantum states caused by collisions, although it is limited to a classical picture of the relative motion of the colliding molecules as a whole. Formulas are derived for computing approximate line broadening collision cross-sections from known intermolecular interactions, and the results of the theory are successfully compared with experiments on self-broadening in the ammonia inversion spectrum and the vibrational band spectra of HCl and HCN. Some cases of foreign gas broadening in the microwave region are examined, but it is concluded that in general the Van der Waals interaction, commonly assumed to be the force important in causing foreign gas broadening, is not adequate to cause the observed broadenings. The more complicated types of forces which become important at short range would have to be taken into account to give good agreement with experiment in these cases.

PART I. THEORY

A. Introduction

THE theories of pressure broadening presented up to this time can easily be shown to be inapplicable in the microwave and infra-red regions of the spectrum. The primary purpose of this paper is, therefore, to develop a theory which can give accurate results in these regions, and to compare these results with the available experimental data.

The older theories, in particular the Fourier integral treatment developed primarily by Lorentz, Lenz, and Weisskopf,¹⁻⁴ have given good results in the optical region.⁴,⁵ However, this theory is essentially in contradiction to the theory of dielectric relaxation developed by Debye and others,⁶⁻⁸ which treats broadening in the limiting case of a spectral line of zero frequency.⁸ Both theories involve computing the spectrum due to a radiating molecule randomly interrupted by collisions with other molecules. However, the mechanism by which the interruption is effected is essentially different in the two theories. Debye's relaxation theory considers interruptions due only to reorientations of the radiating molecule; in a quantum theory, this is equivalent to considering transitions among the various spatially degenerate quantum states. Contrariwise, the Fourier integral theory explicitly assumes that no transitions are caused by collisions. Then the effect of a collision is computed by means of the adiabatic approximation and is equivalent to a relative phase-shift of the radiation before and after the collision, with no amplitude change.

* The material in this paper is a part of the author's thesis submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at Harvard University.
† National Research Council Predoctoral Fellow.
‡ Now at Bell Telephone Laboratories, Murray Hill, New Jersey.
1 H. A. Lorentz, Proc. Amst. Acad. Sci. 8, 59 (1906).
2 W. Lenz, Zeits. f. Physik. 80, 423 (1933).
3 V. F. Weisskopf, Phys. Zeits. 34, 1 (1933).
4 E. Lindholm, Ark. Mat. Astron. Och Fys. 32, 17 (1945).
5 H. Margenau and W. W. Watson, Rev. Mod. Phys. 8, 22 (1936).
6 P. Debye, Polar Molecules (Chemical Catalog Company, New York, 1929), Chapter V.
7 W. Kauzmann, Rev. Mod. Phys. 14, 12 (1942).
8 J. H. Van Vleck and V. F. Weisskopf, Rev. Mod. Phys. 17, 227 (1945).

B. Generalization of the Fourier Integral

It is clear that it is necessary, in order to compute pressure broadening in any frequency range, to develop an expression for the spectral shape of a pressure-broadened line which encompasses the two limiting cases of high and low frequencies. In other words, a formula must be obtained which includes the effects both of phase-shifts and of transitions.

Before presenting this formula, we should emphasize the physical picture upon which it is based. This picture is equivalent to that employed by Foley⁹ in his re-derivation of the standard Fourier integral formula by means of quantum radiation theory. The basic assumption we call the "assumption of a classical path": we assume that for all collisions of interest the colliding molecules can be taken to be point dipoles following definite paths; thus their interaction is a definite function of the time, $H_1(t)$. The justification for this lies in the fact that one can think of the motion of the molecules in terms of packets of translational wave functions; by the uncertainty principle the allowable spreads $\Delta p$ and $\Delta q$ in position and momentum space are $\Delta p\,\Delta q \cong \hbar$. We seldom or never consider in detail collisions in which the molecules under consideration pass at distances less than the kinetic theory diameter, $\cong 5$ Å. Then an uncertainty in position of 1 Å will lead to no great ambiguity in magnitude or type of intermolecular forces. The corresponding uncertainty in velocity is

$$\Delta v / v \cong 0.3/(\text{molecular weight})^{1/2}, \tag{1}$$

which is a small percentage for most molecules. Calculations carried out by Lindholm¹⁰ for many optical problems, not using this assumption, have indicated that the assumption is normally valid, in consideration

9 H. M. Foley, Phys. Rev. 69, 616 (1946).
10 E. Lindholm, Dissertation, Uppsala, 1942.

of the large quantum numbers of the partial waves whose contributions were important.

The problem is reduced, by this assumption, to finding the spectrum radiated by a molecule, whose unperturbed Hamiltonian we designate by $H_0$. This molecule undergoes random perturbations due to collisions, representable by a time-dependent interaction Hamiltonian $H_1(t)$. The problem can be solved following Foley's method, by the use of general radiation theory; or it is possible to extend the correspondence-principle radiation theory of Klein and Pauli to our problem by analogy. Either method leads directly to the following formula, which we present without proof:**

$$I(\omega) = \text{const.} \times \omega^4\, \mathrm{Tr}\left\{\rho_0 \left[\int_{-\infty}^{\infty} dt\, e^{-i\omega t}\mu_z(t)\right] \times \left[\int_{-\infty}^{\infty} dt'\, e^{i\omega t'}\mu_z(t')\right]\right\}. \tag{2}$$

Here $\omega$ is the angular frequency for which we desire the spectral intensity $I(\omega)$; $\rho_0$ is an assumed initial density matrix for molecules in the gas, in fact essentially the density matrix for the time $t = -\infty$; $\mu_z(t)$ is the Heisenberg time-dependent operator (see reference 12) for the z-component of the dipole moment (if we consider radiation polarized in the z-direction); and the "Tr" means that one must take the diagonal sum of the indicated matrix product. $\mu_z(t)$, of course, satisfies the Heisenberg time equation

$$i\hbar\,\dot{\mu}_z = \mu_z H - H\mu_z, \tag{3}$$

where

$$H = H_0 + H_1(t). \tag{4}$$

Finally, we have omitted all numerical factors other than the frequency-dependent one $\omega^4$. These factors can easily be shown to come correctly from the general theory.

The formula (2) gives the intensity for spontaneous emission. The approach through spontaneous emission is chosen in order to avoid the use of a time-dependent density matrix.†† The intensity for absorption must be derived from (2) by use of the Einstein relations.¹¹ Some confusion as to whether the correct Einstein relations come out of the theory actually does appear when the classical path method, or equivalent absorption methods, are used;⁸ we have been able to clarify this point to some extent, but our work on this point will not be reported in this paper.

It is interesting to rewrite (2) for the case in which the standard Fourier integral of Foley and his predecessors is valid. This is the case in which the perturbing Hamiltonian $H_1(t)$ is either diagonal already, or can be made so in a simple manner (following Foley) by replacing transitions to high-energy levels by their second-order effects; that is, we replace the so-called Van der Waals forces, a second-order phenomenon, by a first-order Hamiltonian. This, too, can be shown to be valid from the general equations of motion and the radiation theory. Then Eq. (3) integrates explicitly; a typical matrix element is

$$\mu_{nm}(t) = \mu_{nm}(0)\exp\left\{\frac{1}{i\hbar}\left[(E_m - E_n)t + \int_0^t \big[(m|H_1|m) - (n|H_1|n)\big]\,dt'\right]\right\}. \tag{5}$$

The unperturbed frequency is $\omega_{nm}$, given by $\hbar\omega_{nm} = E_m - E_n$. $E_m$ is the $m$th eigenvalue of the unperturbed Hamiltonian $H_0$. Equation (5) defines the relative phase-shift $\eta_{nm}$ for the two levels $n$ and $m$. We can then use the Boltzmann density matrix

$$(n|\rho_B|m) = \delta_{nm}\exp(-E_n/kT) \times \Big[\sum_n \exp(-E_n/kT)\Big]^{-1} \tag{6}$$

for $\rho_0$, and $I$ in (2) separates into the following contributions $I_{mn}$ for each of the various spectral lines $m$ and $n$:

$$I_{mn}(\omega) = \omega^4 \exp(-E_n/kT)\left|\int_{-\infty}^{\infty} dt\,\exp\{-i[(\omega - \omega_{mn})t + \eta_{mn}(t)]\}\right|^2. \tag{7}$$

This is the form of the Fourier integral used by Weisskopf, etc.‡‡

** The intensity in classical theory is given by the average over all states of $\omega^4\left|\int \mu_z(t)e^{i\omega t}\,dt\right|^2$. The corresponding quantum-mechanical quantity is obtained by replacing all observables (in this case $\mu_z(t)$) by quantum-mechanical operators, multiplying by the density matrix, and taking the trace. Thus Eq. (2) is reasonable; a rigorous proof can be given. For this procedure see W. Pauli, Handbuch der Physik, 2. Aufl., Band 24, 1. Teil, p. 149.
†† See, for instance, R. Karplus and J. Schwinger, Phys. Rev. 73, 1020 (1948).
11 A. Einstein, Verh. d. D. Phys. Ges. 18, 318 (1916); Phys. Zeits. 18, 121 (1917).
‡‡ It is interesting to note that the derivation of (7) employed by Foley fails to lead to the correct external frequency factor $\omega^4$. If carefully followed, Foley's theory gives a factor here of $\omega^2\omega_{nm}^2$, which is by no means correct at low or zero frequencies, where the line-breadths are comparable to $\omega_{mn}$. Even if Foley's "adiabatic assumption" ($H_1$ diagonal) were correct at low frequencies, these factors would be in error. The reason is that the general theory must be retained, at least until the point where a partial integration can be performed analogously to that which one would use in the classical theory to go from the Fourier integral of the vector potential $A$ ($\sim\dot{\mu}$) to that of $\mu(t)$ itself.
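The phase-shift picture behind the Fourier integral form (7) can be explored with a small Monte Carlo sketch. It is not part of the paper and rests on assumptions chosen only for illustration: collisions arrive as a Poisson process with a hypothetical rate `rate`, and every collision completely randomizes the phase (a "strong collision" model). Under those assumptions the phase correlation function decays as `exp(-rate * tau)`, and its Fourier transform is a Lorentzian whose half-width equals the collision rate. The function names are hypothetical.

```python
import numpy as np


def phase_correlation(rate, taus, n_traj=20000, seed=0):
    """Monte Carlo estimate of <exp(i[eta(t + tau) - eta(t)])>.

    Assumed model (not from the paper): the number of collisions in an
    interval tau is Poisson with mean rate * tau, and each collision adds an
    independent phase shift drawn uniformly from [0, 2*pi).
    """
    rng = np.random.default_rng(seed)
    taus = np.asarray(taus, dtype=float)
    phi = np.empty(len(taus), dtype=complex)
    for j, tau in enumerate(taus):
        n_coll = rng.poisson(rate * tau, size=n_traj)
        # A sum of one or more uniform phase shifts is again uniform mod 2*pi,
        # so one uniform draw stands in for the accumulated shift.
        shift = np.where(n_coll > 0, rng.uniform(0.0, 2 * np.pi, size=n_traj), 0.0)
        phi[j] = np.mean(np.exp(1j * shift))
    return phi


def line_shape(omega, omega0, rate):
    """Normalized Lorentzian: Fourier transform of exp(i*omega0*tau - rate*|tau|)."""
    return (rate / np.pi) / ((omega - omega0) ** 2 + rate ** 2)
```

For example, `phase_correlation(2.0, [0.0, 0.25, 0.5, 1.0])` should track `exp(-2.0 * tau)` to within sampling noise, and `line_shape` falls to half its peak value at a detuning equal to `rate`. Weaker collisions, with small phase shifts, would broaden the line less; that case is not modeled here.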

7

PRESSURE

BROADENING

C. Introduction of the Time-Development Matrix T

This operator is the solution of (9') with initial condition T(t) = 1; it answers the question "what happens in the time-interval t-*l'f". In addition, we use the fact that the density matrix, as a function of time, transforms according to T(i) but inversely to other matrices. Then (11) becomes:

I t is well known that it is possible to write the timedependent operators p(l) in Eq. (2) in the following manner :12 U.(t)=U-Kt)»oU(t),

(8)

where ihid/d^U^iHe+H^U.

7(o)) = const. Xw 2

(9)

If MO is the value of n at a time U, then U(t0) = 1 is the initial condition on Eq. (9). The time-development operator U is not as useful to us as another operator T, which is defined in much the same way. However, in deriving T one uses a Heisenberg representation in relation to the unperturbed energy H0. Therefore the equation for M(0 is MW = T - 1 W exp(-ff 0 */tft) w exp{H0t/ik)T(i),

Xexp(-*fcW')(e| T(t-*t') | a).

(8') (9')

I(u) = w4(const.)Y.de J

(10)

I Po| *)

Xexp(io,cdt)(d\T{t)\e)

~"

dt Y,abc e x p [ —

i(0)bc+0>de)t]

*'e--''(e|r-KOI/)(/lMo|g) Xexp(-i«fot')(g\T(t')\a).

(ID

D . Re-expression in Correlation-Function Form, and Simplification Using the Assumption of Short Collisions Formula (10) can be simplified greatly by the insertion of a new time-development matrix T{t')T-l{t).

(15)

We observe that (15) has the form of an average over the time (. I t consists of three essentially different terms: p(0, which depends on the state of the molecule a t /; the T(t—H-\-T) factors, which depend on the collisions which occur between t and / + T ; and the phase-factor. At this point we first limit ourselves to the "impact" type of theory: we assume that the collisions are short compared to the time between them. Under this assumption, it is easy to show that unless wbc+Vde is less than 1/T, the inverse of the time between collisions, (15) will essentially vanish. For this reason, and for other reasons which we shall not consider here, it is considered legitimate to drop from (15) all terms for which ubc+oide^Q- This is equivalent to assuming that all lines considered are either not resolved at all compared to the line-breadth—in which case we treat them as degenerate—or are well separated.*** I t is also legitimate, since the phase and T-factors are independent of the actual state of the molecule, to take the average of pit) over the history of the molecule

x (^"'(Jir-HOIcXclMoM)

X J*

I

X(d\^\e)(e\T(t^t+T)\a).

(8")

We may now insert this explicit form for p.(t) into the Fourier integral formula (2), and obtain the following complicated expression: Y^abcd^fM

(14)

X(a|p(0|i)(»|Mo|c)(c|r-'(«-»H-r)|
Xexp(-io>klt)(k\no\l)(l\T\n).

4

dr exp\j(u—ude)T]
where

HM(0|»)=£(w|r-M£)

= COnSt. X «

(13)

In order to use the considerable simplification made available by the Wiener-Khintchine relations in evaluating our formulas, 13 it is expedient to re-express them in correlation-function form. By means of the simple substitution t' = t+T, we obtain

The diagonal elements of T give the phase-shifts due to the interaction Hi, while the off-diagonal elements—or their squared absolute values—give the transition probabilities due to this interaction. In terms of eigenstates of H0, (8') is

1(a)

exp(—KO^O

X(c\T-\t-+t')\d)(d\IM>\e)

jf?i' is the interaction Hamiltonian with the timedependence due to ffo inserted: (m| H^ \ n) = (m] H^t) | n) exp[(£„-E n )t/ih~].

J dt I die

XY.abcde(a\p{t)Ib)(b\no\c)

and that for $T$ is $i\hbar\,\dot{T}(t) = H_1'(t)\,T$.


13 See, for instance, M. C. Wang and G. E. Uhlenbeck, Rev. Mod. Phys. 17, 323 (1945).
*** In Section H we will discuss further the terms which are here dropped.
12 Born u. Jordan, Elementare Quantenmechanik (Verlag J. Springer, Berlin, 1930).

T{t-^t') =

(12)


separately. Clearly, we obtain for $\rho_{\mathrm{Av}}$ the Boltzmann distribution (6). Let us now introduce a degenerate index $m$. Each state will be designated by a nondegenerate index $(a, b, c, d, \cdots)$ and a degenerate index $(m, m', \cdots)$. Then, letting the "initial" state be designated by $i$, the final by $f$, (14) and (15) are

$$I(\omega)=\sum_{i,f}\int d\tau\,\exp[i(\omega_{if}-\omega)\tau]\,(i|\rho_B|i)\,\varphi_{if}(\tau), \tag{16}$$

where

$$\varphi_{if}(\tau)=\sum_{m,m',m'',m'''}\int dt\,(i,m|\mu|f,m')\,(f,m'|T^{-1}(t\to t+\tau)|f,m'')\,(f,m''|\mu|i,m''')\,(i,m'''|T(t\to t+\tau)|i,m), \tag{17}$$

or

$$\varphi_{if}(\tau)=\mathrm{Tr}\int dt\,[\mu^{if}\,T^{f\,-1}(t\to t+\tau)\,\mu^{fi}\,T^{i}(t\to t+\tau)], \tag{17'}$$

where the superscripts designate "Teilmatrices" for the given non-degenerate indices.

E. Solution for the Correlation Function in the Standard Optical Case

To illustrate our method of finding the correlation function and to furnish some introduction to the line-shapes to be expected, we shall re-derive the impact theory solution of the standard Fourier integral (7). This formula, re-written in correlation-function form by means of (13), and exclusive of numerical factors, is

$$I(\omega)=\int d\tau\,\exp[i(\omega_{if}-\omega)\tau]\,\varphi(\tau), \tag{18}$$

where

$$\varphi(\tau)=\int dt\,\exp\{i[\eta_{if}(t+\tau)-\eta_{if}(t)]\}.$$

The assumption of the impact theory is that the collisions are short compared to the time between them. Then

$$\eta_{if}(t+\tau)-\eta_{if}(t)=\sum_{\text{collisions from } t \text{ to } t+\tau}\eta_c, \tag{19}$$

where $\eta_c$ are the phase-shifts due to the various collisions occurring in the interval $t\to t+\tau$:

$$\eta_c=\frac{1}{\hbar}\int dt\,[(i|H_1'|i)-(f|H_1'|f)]. \tag{20}$$

We divide the types of collisions into classes according to the paths followed by the molecules; then, corresponding to each "element of cross section" $d\sigma_k$ we can assume that there is a definite phase-shift $\eta_k$. Then the probability that a collision in $d\sigma_k$ occurs in a given, short time $d\tau$ is given by

$$p(d\sigma_k)=nv\,d\tau\,d\sigma_k, \tag{21}$$

where $n$ is the number of molecules per cc, and $v$ is their velocity (assuming, as is always justifiable for collision-broadening problems, that all molecules have the average velocity $v$). Consider the integrand of the expression (18) for $\varphi(\tau)$:

$$f(t,\tau)=\exp\{i[\eta_{if}(t+\tau)-\eta_{if}(t)]\}. \tag{22}$$

If in the time $d\tau$ after $\tau$ a collision of type $k$ occurs, we have

$$f(t,\tau+d\tau)=\exp(i\eta_k)\,f(t,\tau),\qquad f(t,\tau+d\tau)-f(t,\tau)=[\exp(i\eta_k)-1]\,f(t,\tau). \tag{23}$$

Now let us take the integral, which is equivalent to averaging over all times $t$:

$$d\varphi(\tau)=([\exp(i\eta_k)-1]\,f(t,\tau))_{\text{Average over } t}. \tag{24}$$

This is the point at which the impact theory assumption is vital. We must be able to say that the time $d\tau$ is short compared to the interval between collisions, and so contains at most one collision; on the other hand, $d\tau$ must be quite long compared with the duration of a collision, so that there is no correlation between what happens in the interval $d\tau$ and what happens in the preceding or following times. Then we can take the averages in (24) separately, and have

$$d\varphi(\tau)=-nv\,d\tau\,\sigma\,\varphi(\tau), \tag{25}$$

$$\varphi(\tau)=e^{-nv\sigma\tau}, \tag{26}$$

where

$$\sigma=\int d\sigma\,(1-e^{i\eta})=\sigma_r-i\sigma_i.$$

It is easy to see that $\varphi(-\tau)=\varphi^*(\tau)$, as is required for the reality of $I(\omega)$; then we get for the intensity distribution

$$I(\omega)=\text{const.}\times\frac{2(nv\sigma_r)}{(\omega-\omega_{if}-nv\sigma_i)^2+(nv\sigma_r)^2}. \tag{27}$$

This is a dispersion line-form with

$$\text{half-width } 2\pi(\Delta\nu)_{1/2}=nv\sigma_r,\qquad \text{shift } 2\pi\Delta\nu=nv\sigma_i. \tag{28}$$

F. Solution for the Correlation Function in the Classical Debye Case

Since the classical Debye problem of the spectrum of a rotating dipole perturbed by collisions has frequently been considered by absorption methods,6-8 in particular the very general method in the appendix of the paper by Van Vleck and Weisskopf, it is of interest to show how easily this problem is solved by correlation function methods. Here we can see that the spectrum is

$$I(\omega)=\int d\tau\,e^{i\omega\tau}\,\varphi(\tau),\qquad \varphi(\tau)=\int dt\,\mu_z(t+\tau)\,\mu_z(t), \tag{29}$$

where $\mu_z(t)$ is the z-component of the rotator's dipole moment at a time $t$. Again we consider the function

$$f(t,\tau)=\mu_z(t+\tau)\,\mu_z(t),\qquad f(t,\tau+d\tau)-f(t,\tau)=\mu^2\,(\cos\psi(t+\tau+d\tau)-\cos\psi(t+\tau))\cos\psi(t), \tag{30}$$

where $\mu_z(t)=\mu\cos\psi(t)$, $\psi$ being the angle of the dipole relative to the z-axis. Our basic assumptions will be (a) that the probability of a collision taking the rotator from the element of solid angle $d\omega$ into $d\omega'$ in a time $d\tau$ is

$$nv\,d\sigma_{\omega\to\omega'}\,d\tau, \tag{31}$$

and (b) that this probability is obviously independent of the azimuthal angle $\phi$.

Since, upon averaging the difference of cosines in (30) over all possible collisions, the last term in (32) will vanish because of the factor $\cos\phi$, we can use

$$\cos\psi(t+\tau+d\tau)-\cos\psi(t+\tau)=\cos\psi(t+\tau)\,(\cos\theta-1). \tag{33}$$

Then we actually perform the average of (30) over all times, using again our impact theory assumption of short collisions. It is clear that the result is:

$$d\varphi(\tau)=nv\,d\tau\int d\sigma_\theta\,(\cos\theta-1)\,\varphi(\tau),\qquad \varphi(\tau)=e^{-nv\sigma\tau}. \tag{34}$$

The two terms in $\cdots\,(t\to t+\tau)|m')(m'|\mu_z|m'')\,\cdots$

$$(m|\,(T^{i\,-1}(t\to t+\tau)\,\mu_z\,T^{f}(t\to t+\tau))_{\text{average over } t}\,|m'''). \tag{36}$$

Suppose we apply a rotation of the coordinate axes to (36) in the sense that we do all of our computations with respect to a new set of axes $x', y', z' = R(x, y, z)$. We must use the fact that, since in the course of the time average we consider collisions from all conceivable directions and of all possible types in equal number, the effect of the $(T^{i\,-1}\mu\,T^{f})$ factor is essentially an isotropic one. (This is equivalent to the step (32)$\to$(33).)

$$(T^{-1}(R\mu_z)T)_{\mathrm{Av}}=\lambda_{z'x}\,(T^{-1}\mu_xT)_{\mathrm{Av}}+\lambda_{z'y}\,(T^{-1}\mu_yT)_{\mathrm{Av}}+\lambda_{z'z}\,(T^{-1}\mu_zT)_{\mathrm{Av}}, \tag{37}$$

where the $\lambda$'s are the direction cosines of the new axis system with respect to the old. Thus the quantity (36) has the transformation property of a vector component. But the $i, f$ matrix elements of any vector component are determined, exclusive of a constant factor, by the transformation property.14 Thus we can say

$$(m|\,(T^{i\,-1}(t\to t+\tau)\,\mu_z\,T^{f}(t\to t+\tau))_{\text{Av over } t}\,|m''')=F(\tau)\,(m|\mu_z|m'''), \tag{38}$$

where $F(\tau)$ is a scalar. We are now in a position to set up the differential

††† The case of higher multipole transitions is easily worked out and, interestingly, leads usually to a different breadth.
14 E. U. Condon and G. H. Shortley, Theory of Atomic Spectra (Cambridge University Press, Cambridge, 1935), Chapter III.
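The step from (32) to (33) in the Debye case rests on an azimuthal average. Equation (32) is not legible in this copy; the sketch below assumes it is the usual spherical-triangle relation $\cos\psi' = \cos\psi\cos\theta + \sin\psi\sin\theta\cos\phi$ for a deflection $\theta$ at azimuth $\phi$ about the old dipole direction, and checks that averaging over $\phi$ leaves $\cos\psi\cos\theta$, which is the content of (33). The function names are hypothetical.

```python
import math

def cos_psi_after(psi, theta, phi):
    """Cosine of the new polar angle after a deflection theta at azimuth phi (assumed form of Eq. (32))."""
    return math.cos(psi) * math.cos(theta) + math.sin(psi) * math.sin(theta) * math.cos(phi)

def azimuthal_average(psi, theta, steps=3600):
    """Average of cos(psi') over a uniform grid of azimuthal angles."""
    total = sum(cos_psi_after(psi, theta, 2.0 * math.pi * j / steps) for j in range(steps))
    return total / steps

# The cos(phi) term averages away, leaving cos(psi) * cos(theta) as used in Eq. (33).
difference = azimuthal_average(0.7, 1.1) - math.cos(0.7) * math.cos(1.1)
```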


equation for

$$\varphi(\tau)=\sum_{m,m'''}|(m|\mu_z|m''')|^2\,F(\tau),$$

$$d\varphi(\tau)=\sum_{m,m'''}|(m|\mu_z|m''')|^2\,dF(\tau)=\mathrm{Tr}\int dt\,\mu_z\,[T^{i\,-1}(t\to t+\tau+d\tau)\,\mu_z\,T^{f}(t\to t+\tau+d\tau)-T^{i\,-1}(t\to t+\tau)\,\mu_z\,T^{f}(t\to t+\tau)]. \tag{39}$$

We introduce a matrix $T_d=T(t+\tau\to t+\tau+d\tau)$ (the notation $T_d$ indicates that this matrix is not a differential); then

$$T(t\to t+\tau+d\tau)=T(t\to t+\tau)\,T_d. \tag{40}$$

Again we can make use of the reasoning of Section E, (24)$\to$(25): we use the assumption of short collisions to enable us to take independent averages of the happenings in $t\to t+\tau$ and $t+\tau\to t+\tau+d\tau$. Then we introduce Eq. (40), Eq. (38) for $F(\tau)$, and get for (39):

$$d\varphi(\tau)=F(\tau)\,\mathrm{Tr}[\mu_z\,(T_d^{i\,-1}\mu_z\,T_d^{f}-\mu_z)]_{\text{Av. over all } d\tau}. \tag{41}$$

Now we introduce the probability assumption (21) for types of collisions:

$$p(d\sigma \text{ in } d\tau)=nv\,d\sigma\,d\tau.$$

If a collision of type $d\sigma$ occurs in $d\tau$, the $T$ matrix $T_d$ will be characteristic of the collision $d\sigma$. In fact, $T$ is obtained by integrating Eq. (9') from $t=-\infty$ to $+\infty$, using for $H_1'(t)$ the time-dependent interaction Hamiltonian for just one collision of type $d\sigma$ occurring in the whole range of $t$. The initial condition is $T(-\infty)=1$. This integration is legitimate because of our assumption that $d\tau$ can be chosen long compared to the duration of one collision. We call the resulting $T$ matrix $T(d\sigma)$; then the average in (41) gives the following equation for $F$:

$$dF(\tau)/d\tau=-nv\sigma F(\tau),\qquad F(\tau)=e^{-nv\sigma\tau}, \tag{42}$$

$$\sigma=\int d\sigma\left[1-\frac{\mathrm{Tr}(T^{i\,-1}(d\sigma)\,\mu_z\,T^{f}(d\sigma)\,\mu_z)}{\mathrm{Tr}(\mu_z\mu_z)}\right]. \tag{43}$$

It is obvious that the shift in frequency and the line-breadth follow from the real and imaginary parts of the number $\sigma$, precisely according to (28). Thus the problem of the line-shape is solved in principle. Before proceeding further with the evaluation of $\sigma$, the "collision cross section for line-broadening," it is necessary to generalize formula (43) to a case which we shall frequently treat: that in which the interaction Hamiltonian $H_1'(t)$ is a matrix involving the quantum-mechanical states of both the perturbing molecule and the radiating molecule. The line of reasoning is practically identical, except, of course, that the radiation occurs according to the operator

$$\mu_z(1)+\mu_z(2). \tag{44}$$

The two molecules of interest are called (1) and (2). It is clear that it is legitimate to use only the first half of (44), regarding, therefore, molecule (1) as the radiator, (2) as a perturber (the line-shape for molecule (2) will be included in the implicit average over all molecules in the gas). Then, in place of the transformation property "vector component," we use essentially "vector" (molecule (1)) $\times$ "scalar" (molecule (2)). Without going through the reasoning for the case of two molecules, which differs from the preceding only in that the Boltzmann factors for molecule (2) must be included and that $T$-matrices involving any transition whatever for molecule (2) are of interest, we present the correct result for $\sigma$:

$$\sigma=\int d\sigma\,\Big\{1-\Big[\sum_{m_1,m_1',m_1'',m_1''',a_2,b_2}(m_1a_2|T^{i\,-1}(d\sigma)|m_1'b_2)\,(\rho_B)_{b_2}\,(m_1'|\mu_z|m_1'')\,(m_1''b_2|T^{f}(d\sigma)|m_1'''a_2)\,(m_1'''|\mu_z|m_1)\Big]\times\Big[\sum_{m_1,m_1'}|(m_1|\mu_z|m_1')|^2\Big]^{-1}\Big\}. \tag{45}$$

$a_2$, $b_2$ designate the various states of the second molecule. Two possible cases occur in which this expression can be simplified to a usable form. First we have the case in which no transitions occur in molecule (2) quantum numbers, except perhaps for degenerate ones; second, that case in which transitions in molecule (2) quantum numbers can occur only simultaneously with those of molecule (1), so that $T^{i}$ and $T^{f}$ are still essentially diagonal in molecule (2) because they are so in molecule (1) (this is the interesting case corresponding to resonant interactions). In either case, $\sigma$ reduces to the following simple form:

$$\sigma=\sum_{a_2}\sigma_{a_2},\qquad \sigma_{a_2}=(\rho_B)_{a_2}\int d\sigma\left[1-\frac{\mathrm{Tr}(T^{i a_2\,-1}(d\sigma)\,\mu_z\,T^{f a_2}(d\sigma)\,\mu_z)}{(2J_{a_2}+1)\,\mathrm{Tr}(\mu_z\mu_z)}\right]. \tag{46}$$

The trace includes a summation over $m_2$, the molecule (2) degenerate index, and $2J_{a_2}+1$ is the degeneracy of the $a_2$ level.

H. Reduction of $\sigma$ to Computable Form

1. Use of Rotational Invariance to Relate Matrix Elements to "Collision Axes"

In the form (43) it is still practically impossible to compute $\sigma$, since we must find the $T$-matrices not only for every impact parameter, or distance of closest approach of the two molecules during collision, but for every set of direction angles of the colliding molecules relative to the polarization direction. It is possible to use the fact that the $\int d\sigma$ implies an average over all conceivable sets of direction angles to remove this latter difficulty. We do not do this rigorously, since it involves some rather long group theory. The following argument, however, indicates the method. The integrand of (43) is, as it stands, rotation invariant, since it is a trace. Essentially, this means that the component of $\mu$ we deal with is immaterial. Suppose we can compute the $T$-matrices for one specific set of direction angles (for instance, suppose we know them for a system quantized along $b$, the impact parameter). Then it is possible to leave these $T$-matrices as they stand and to take the direction average by rotating the z-axis, i.e., the component of $\mu$ which is used in (43). Since (43) is quadratic in $\mu$, it is allowable merely to average over the x, y, and z-directions. Then we get simply

•z>G'.W)=iE*».f

2Trbdb

fix, y, z fix, y, z

'«],

(47)

where $T(b)$ designates the $T$-matrix in the "collision-axis" coordinate system. Since a great deal of further manipulation with (47) is necessary, we shall use an alternative form for it, based on the identity between the vector matrix and the Wignerian coefficients.12

(48)

(TO I H.V | n) = (jfljun | j'/l/iO)

and similarly for l/VZO^zbiju,,), with 0 replaced by ± 1 . Then it is easy to show that (47) is the same as (j/ljim\jflfiM) r= f »^n

2-wbdb £ M, m, i M,m,m'

(2ji+l)

XlSmm^^-(m\T^(b)\m')(ix'\Tf(b)\ixU.

(jAjtm'\j/ln'M)

(47')

A similar formula can be written in the two molecule case of Eq. (46).
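A quick plausibility check on the trace formula (43) is the non-degenerate case, where each $T$-matrix reduces to a single phase factor. The sketch below assumes $T^{i}=e^{-i\eta_i}$ and $T^{f}=e^{-i\eta_f}$ (1x1 blocks, an assumption made here for illustration) and takes the integrand in the form $1-\mathrm{Tr}(T^{i\,-1}\mu\,T^{f}\mu)/\mathrm{Tr}(\mu\mu)$. Under those assumptions the integrand collapses to $1-e^{i(\eta_i-\eta_f)}$, the phase-shift form $1-e^{i\eta}$ that enters the cross section of Section E. The function name is hypothetical.

```python
import cmath

def broadening_integrand(eta_i, eta_f, mu=1.0):
    """1 - Tr(T_i^{-1} mu T_f mu) / Tr(mu mu) for non-degenerate levels (1x1 blocks)."""
    t_i_inverse = cmath.exp(1j * eta_i)   # inverse of T_i = exp(-i eta_i)
    t_f = cmath.exp(-1j * eta_f)          # T_f = exp(-i eta_f)
    return 1.0 - (t_i_inverse * mu * t_f * mu) / (mu * mu)

# Equal phase-shifts in the two levels give no broadening and no shift.
no_effect = broadening_integrand(0.8, 0.8)
```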

2. Introduction of an Approximate Form for T.

For the purpose of computing $\sigma$ from (47), an expression for the $T$-matrices in terms of the interatomic forces must still be found. This cannot be done exactly. However, a device which has been frequently used is the following expansion of $T$ in successive approximations with the assumption that $\int H_1'\,dt/\hbar$ is small. We use the following procedure:

$$T=T_0+T_1+T_2+\cdots,\qquad T_0=1,$$

$$i\hbar\,\dot{T}_1=H_1'T_0;\qquad T_1(t)=\int_{-\infty}^{t}\frac{H_1'(t')}{i\hbar}\,dt',$$

$$i\hbar\,\dot{T}_2=H_1'T_1;\qquad T_2(t)=\int_{-\infty}^{t}\frac{H_1'(t')}{i\hbar}\,dt'\int_{-\infty}^{t'}\frac{H_1'(t'')}{i\hbar}\,dt'',\ \text{etc.} \tag{49}$$

If we could assume that $H_1'(t')$ commuted with $H_1'(t'')$ at all times, the result of this process would be simply the scalar solution of the linear differential Eq. (9'):

$$T(t)=\exp\left[\int_{-\infty}^{t}\frac{H_1'(t')}{i\hbar}\,dt'\right]. \tag{50}$$

On the other hand, it is clear from the definition (10) of $H_1'$ that it does not commute with itself at all times, but only because $H_1$ and $H_0$ do not commute. $H_1(t)$ is a function of the coordinates only, in general, and will always commute with itself. An investigation was made into the size and meaning of the "non-commuting" terms by which (50) differs from the accurate solution of (49). Those terms involving high frequency elements of $H_1$ ("high" compared to the rate of change of $H_1$) were found to be precisely equivalent to the terms obtained in (50) by including, in $H_1$ itself, the 2nd-, 3rd-, etc., order forces due to the high frequency elements of $H_1'$. On the other hand, such terms are small if they involve low frequency elements of $H_1'$. Therefore it seems permissible to ignore the non-commuting terms and to use (50) for $T$, including at least the second-order (Van der Waals') forces in $H_1$. These terms have not been computed accurately according to (49), because they appear first in the successive terms $T_2$ (shift) and $T_4$ (broadening). The Van der Waals' procedure introduces the second-order forces in the same manner as the first-order ones in $T_1$ and $T_2$, where they are much easier to handle. It is an interesting check on our method to note that the second order effects of high levels would derive naturally in the correct form from the general theory. For simplicity we limit ourselves to the first three terms of (50). These are

$$T(b)\cong T_0+T_1+T_2,\qquad T_0=1;\qquad T_1=-iP;\qquad T_2=-P^2/2, \tag{51}$$

where

$$P=\frac{1}{\hbar}\int_{-\infty}^{\infty}H_1'(t)\,dt. \tag{51'}$$
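The truncation in (51) can be illustrated for the commuting (scalar) case, where (50) gives $T=e^{-iP}$ exactly. The sketch below is only an illustration under that assumption: it compares the three-term series $1-iP-P^2/2$ with the exponential for a scalar $P$ and shows the error growing as $|P|^3/6$, which is why the expansion is useful only for weak (large-$b$) collisions. The function names are hypothetical.

```python
import cmath

def t_three_terms(p):
    """First three terms of the expansion of exp(-i p): T0 + T1 + T2."""
    return 1.0 - 1j * p - p * p / 2.0

def t_exact(p):
    """Scalar solution T = exp(-i p), valid when the interaction commutes with itself."""
    return cmath.exp(-1j * p)

def truncation_error(p):
    """Magnitude of the difference between the exact scalar T and the three-term series."""
    return abs(t_exact(p) - t_three_terms(p))

# The error is bounded by |p|^3 / 6: small for p = 0.1, no longer small for p = 1.5.
errors = {p: truncation_error(p) for p in (0.1, 0.5, 1.5)}
```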


paper.16 Also using the fact that Tr1= — Ti, we get

FIG. 1. Curves labeled Approximation No. 1, Approximation No. 2, Approximation No. 3, and the probable shape of the true curve, a and b (2 possibilities).

3. Use of the Approximation for T in Computing $\sigma$ and Line-Broadening

r (St)o

Let us repeat the formula (47) here, in a slightly different form:

(52)

s(b)= z

JAP'M)

(54b) "—'r

2ji

2jf+

Hzv<-»|») 2-

4

;

2ji+l

(i,m\P*\i,m) 2ji+l

(M|ZY|M)

hZ

1

2//+1 J

(f,»\I»\f,*y

2jf+l

(54c')

J'

The second type of term, however, must be left in its original form; we call the sum of these (.SVJmiddie: 1

2jd-l

m, m', M

r

L i

a= f 2irbdbS{b), (JAj m I JAliM) (jAjiM\

(m\I (mlP^m)

—r

Referring to the definition (51) of $P$, it appears that this, the first "shift" term in $\sigma$, is just the first term in the simple phase-shift approach, averaged over all directions. There are two possibilities for obtaining second-order terms in the $S$ sum. We have the choice of taking $T_0^{i}\times T_2^{f}$ or vice versa, or of taking the terms involving $T_1^{i}\times T_1^{f}$. The former pair of terms follow precisely as did the $S_1$ terms; we call these $(S_2)_{\text{outer}}$.

\

|

1 \

E

L"—it

\ \ 1 1 ,--- S -~A 3<

r h

(S*) middle —

E

2ji~\~l M, m, m'

(jAjim\JAvM)

$\times[\delta_{mm'}\delta_{\mu\mu'}-(m|T^{i\,-1}(b)|m')(\mu'|T^{f}(b)|\mu)]$. We are going to expand $S(b)$ in powers of the $P$-matrix of Eq. (51'):

$$S=S_0+S_1+S_2+\cdots. \tag{53}$$

X (jAjtm' | JAv-'M) iim \ P | im') (f/ \P\fr).

Further terms are prohibitively difficult; in any case, the "non-commuting" part of $T$ can no longer be neglected in $T_3$.‡‡‡

In order to get terms of the zero'th order in $P$, we must pick $T^{i}=T_0^{i}$, $T^{f}=T_0^{f}$, and we get

$$S_0=0. \tag{54a}$$

The first-order terms are obtained by including either $T^{i}=T_0$, $T^{f}=T_1$ or vice versa. The result is

(JAjwljAiiM) . (JAJ,-m'\JA^M)

S1=- £ M, m, m'

2yt~f- 1

x[(«|r1i-Mw)^+(M'|r1/|M)5„m-]This is greatly simplified by the fact that the Wignerian coefficient (jAji>n\JAl*M) vanishes unless m=n+M; thus if M=M', m=m', or vice versa, and Si is simply 5i=-E

E

|07ii,H;7W)l2

M m or n

x[(«|r1^|m)+(M|r1/|M)]. The sums over M are easily performed, one by the unitarity of the Wignerian coefficients, and one by this and the "permutation of indices" property of these coefficients which appears most easily from Racah's

13

(54c")

4. The Interpolation Process

Since the expansion (53) of $S(b)$ is valid only for small $P$, and $P$ increases very strongly with decreasing $b$ ($r^{-3}$ for first-order, $r^{-6}$ for second-order forces), obviously the first two non-vanishing terms of $S$, derived above, are of no help for collisions for which $b$ is small. For this range of $b$ we must use physical reasoning, based on the fact that "strong" collisions such as these must have an effect equivalent to complete interruption of the radiation. The $T$-matrix for complete interruption

15 G. Racah, Phys. Rev. 62, 438 (1942).
‡‡‡ It is clear that in (54c) the degenerate and non-degenerate levels play essentially different parts. One can interpret these terms by saying that transitions to non-degenerate levels act like complete interruptions of the radiation, while transitions among degenerate levels take on the character of phase-shifts to a greater or lesser extent, whence the terms (54c"). This phenomenon, and the question of when a level is or is not "degenerate," are allied to the general problem of coherence of radiation of different frequencies which crops up in many fields. In this case, the problem is fairly simple. It is easy to see that the different behavior of degenerate and non-degenerate levels follows from the step of dropping the terms of finite frequency in Section D, Eqs. (13) and (14). We noted there that this step was allowable if the frequency difference of the levels was much greater than $1/\tau$. In addition, (54c") can be shown to vanish for first-order dipole-dipole interactions in any case, so that observable effects of this cause seldom appear.

can have two forms: For the pure phase-shift case, that of an arbitrary phase-factor, averaging to zero; for the transition case, that of complete off-diagonality, meaning that the molecule has certainly changed its state. In either case $S(b)$ contains only the $\delta_{mm'}\delta_{\mu\mu'}$ factors, and a simple summation shows that

$$S(b\ \text{small})=1\ \text{or averages to 1.} \tag{55}$$

The problem of interpolating between the region of large $b$, where $S(b)$ is small and satisfies (54), and that of small $b$, or (55), is insoluble in principle. However, several simple models and some general reasoning have led us to conclude that a definite upper limit for the real part of $\sigma$, and a probable lower limit, can be assigned on the basis of two rather arbitrary choices of $S(b)$, while a third choice gives a good working approximation. Figure 1 illustrates the various possibilities. Our three approximate "ideas" for curves are labeled "approximation #1, 2, 3." #1 is essentially the $S(b)$ curve used in the phase-shift theory.

$$S_{\#1}(b)=1-\cos(2S_2(b))^{1/2}. \tag{56}$$

It is easy to show that because of the factor $b$ in (52a) this is an upper limit for all conceivable shapes. One can look upon the true curve as an average of several curves like #1, with different asymptotic strengths; then it might well resemble curve (a) or (b). #2 is the approximation obtained by using $S_2(b)$ from (54c) to the point at which $S=1$; then we use exactly (55). #3 is a rather arbitrarily set lower limit

$$S_{\#3}(b)=1-\exp[-S_2(b)]. \tag{56'}$$

We will find that the three approximations are never more than 20 percent from each other, which is usually less than experimental error. Where the experiments are more accurate, #2 does indeed seem to be the best choice.

5. Some Remarks on the Matrix "P"

Before concluding the theoretical part of this work it is well to make some remarks about the magnitude of the $a, b$ elements of $P$ as a function of the separation in energy of the levels $a$, $b$. An element of $P$ is given by

$$(a|P|b)=\frac{1}{\hbar}\int dt\,\exp(i\omega_{ab}t)\,(a|H_1(t)|b). \tag{57}$$

$H_1(t)$ will always be either a first- or second-order dipole-dipole interaction; therefore it will have a form like

$$H_1(t)=Kr^{-3}\ \text{or}\ Kr^{-6}. \tag{58}$$

We can always assume that the paths followed by the colliding molecules are straight lines, since one can show that if the molecules collide with sufficient strength to have curved paths it is quite certain that the radiation will be completely interrupted. Then

$$r^2=b^2+v^2t^2 \tag{59}$$

and

$$(a|P|b)=\frac{K}{\hbar}\int_{-\infty}^{\infty}dt\,\exp(i\omega_{ab}t)\,(b^2+v^2t^2)^{-3/2\ \text{or}\ -3}. \tag{60}$$

Now we define

$$x=vt/b,\qquad k=b\,\omega_{ab}/v, \tag{61}$$

following Foley;9 then (60) becomes

$$(a|P|b)=\frac{K}{\hbar\,(b^2v\ \text{or}\ b^5v)}\int_{-\infty}^{\infty}\frac{e^{ikx}\,dx}{(1+x^2)^{3/2\ \text{or}\ 3}}. \tag{60'}$$

These integrals can easily be done by contour integration; however, the main features of the two solutions are the same. Both integrals have nearly the same value as at $k=0$ ("fast" collisions or degenerate levels) for a fairly large region of the parameter $k$ (relating speed of collision to frequency) approaching $k=1$. Then there is quite a rapid falling off until at $k=4$ or 5 (60') essentially vanishes. We may interpret this according to the uncertainty principle. As long as the collisions are sufficiently fast ($k<1$), the energy-difference corresponding to $\omega_{ab}$ acts as though it were negligible: $\Delta E\,\Delta t>h$ implies that there is a region of "vagueness in energy" of the width $h/\Delta t$, with $\Delta t$ the time in which a collision occurs. Since the $P$-matrix element gives, by (51), the probability of occurrence of transitions between the states $a$ and $b$, these transitions will occur as though these states were resonant throughout the region $k<1$.

SUMMARY OF THE THEORY

At this point it will clarify the exposition to present a summary of the theory up to this point. We have, essentially, found the line shape of a pressure-broadened spectral line under the following assumptions:

(a) that the relative motion of the molecules is classical; this enables us to write down the generalized Fourier integral (2).

(b) that the time between collisions is long compared to the duration of the collisions; this enables us to separate the Fourier integral into component parts referring to the separate spectral lines ((16) and (17)), to find a general expression for the correlation function (42), and to write down the broadening cross-section $\sigma$ occurring in this expression in terms of collision operators $T$ (43).

(c) that the collision operators are smooth functions of the collision parameter $b$ between large $b$ (where we have the expansion (51)) and small $b$, where complete interruption occurs. Then we are able to apply the interpolation process, and obtain for $\sigma$ the expression (52), with $S(b)$ given by the formulas (56) fitted to the large-$b$ expansion (54).
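The spread between the three interpolation choices can be checked numerically. The sketch below is an illustration under stated assumptions, not the author's computation: it takes $S_2(b)=(A/b)^4$ (the $b^{-4}$ dependence of a first-order dipole-dipole interaction, with $A$ a hypothetical strength parameter), uses $S_{\#1}=1-\cos(2S_2)^{1/2}$, $S_{\#2}=\min(1,S_2)$ and $S_{\#3}=1-\exp(-S_2)$, and evaluates $\sigma=\int 2\pi b\,db\,S(b)$ in units of $2\pi A^2$. With the substitution $u=(A/b)^2$ this ratio is $\tfrac12\int_0^\infty S\,u^{-2}\,du$.

```python
import math

def s_of_b(kind, s2):
    """Interpolated S(b) for a given value of the second-order sum S2(b)."""
    if kind == 1:    # phase-shift form, assumed Eq. (56)
        return 1.0 - math.cos(math.sqrt(2.0 * s2))
    if kind == 2:    # S2 out to the point S = 1, then complete interruption
        return min(1.0, s2)
    if kind == 3:    # lower-limit form, assumed Eq. (56')
        return 1.0 - math.exp(-s2)
    raise ValueError("kind must be 1, 2 or 3")

def c_factor(kind, u_max=200.0, steps=200_000):
    """sigma / (2 pi A^2) for S2 = (A/b)^4, integrating over u = (A/b)^2."""
    h = u_max / steps
    total = 0.0
    for j in range(steps):
        u = (j + 0.5) * h                       # midpoint rule avoids u = 0
        total += s_of_b(kind, u * u) / (u * u)  # S2 = u^2 in this variable
    total *= h
    total += 1.0 / u_max    # tail u > u_max (small b), where S is 1 on average
    return 0.5 * total

factors = {kind: c_factor(kind) for kind in (1, 2, 3)}
```

Under these assumptions the three ratios come out close to 1.11, 1.00 and 0.885, i.e. within roughly 12 percent of the working approximation #2, consistent with the statement that the approximations never differ by more than 20 percent.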


TABLE I. Comparison of theory and experiment.

                          Δν (cm$^{-1}$/atmos.)
Theory, approx. #1        0.86
Theory, approx. #2        0.77
Theory, approx. #3        0.68
Experiment                0.74

TABLE II. Comparison of theory with experiment: relative breadths in the ammonia spectrum.

  J      K      Theoretical breadths:   Experimental data:   Quasi-empirical formula:
                this thesis             Bleaney              Bleaney
 (a)    (b)     (c)                     (d)                  (e)
  2      1      2.6                     2.6                  2.7
  3      1      2.3                     2.4                  2.2
  3      2      3.3                     3.2                  3.4
  3      3      4.5                     4.5                  4.5
  4      4      4.6                     4.5                  4.6
  5      1      1.8                     1.8                  1.6
  5      2      2.5                     2.6                  2.5
  5      3      3.3                     3.3                  3.3
  5      5      4.8                     4.7                  4.65
  6      3      2.8                     3.1                  3.1
  6      4      3.5                     3.6                  3.6
  6      6      4.8                     4.7                  4.7
  7      5      3.7                     4.1                  3.8
  7      6      4.3                     3.9                  4.3
  8      7      4.3                     4.1                  4.35
 10      9      4.5                     4.2                  4.5
 11      9      4.1                     2.9                  4.2

function with respect to inversion.

$$\psi_+=\psi_{+K}+\psi_{-K},\qquad \psi_-=\psi_{+K}-\psi_{-K}. \tag{1}$$

Here $+K$ and $-K$ denote the "single-ended" states, whose degeneracy is raised by the well-known inversion or "tunnel effect." First we consider the interaction between two ammonia molecules (first-order only). The Hamiltonian of this interaction is

$$H=\frac{1}{r^3}\left[\boldsymbol{\mu}_1\cdot\boldsymbol{\mu}_2-\frac{3(\boldsymbol{\mu}_1\cdot\mathbf{r})(\boldsymbol{\mu}_2\cdot\mathbf{r})}{r^2}\right], \tag{2}$$

where $\boldsymbol{\mu}_1$ and $\boldsymbol{\mu}_2$ are the dipole moment matrices of the two molecules considered, and $r$ is the distance between the molecules. Equation (2) can easily be expanded in terms of direction-angles in a system of reference in which the impact parameter $b$ is the polar axis of spherical coordinates for each molecule. Then $\theta_1$,
(3)

Data are half-widths in cm$^{-1}\times10^{-4}$ at 0.5 mm Hg pressure.
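The quasi-empirical column (e) of Table II can be regenerated from the line quantum numbers alone. The sketch below is a reconstruction under assumptions: the exponent of Bleaney's formula is garbled in this copy and is taken here to be 1/3, i.e. $\Delta\nu\propto[K^2/J(J+1)]^{1/3}$, normalized so that the $J=3$, $K=3$ line has breadth 4.5; the assignment of the three data columns to (c), (d), (e) is likewise inferred. The names are hypothetical.

```python
# (J, K) for the seventeen lines of Table II and the tabulated column (e).
LINES = [(2, 1), (3, 1), (3, 2), (3, 3), (4, 4), (5, 1), (5, 2), (5, 3), (5, 5),
         (6, 3), (6, 4), (6, 6), (7, 5), (7, 6), (8, 7), (10, 9), (11, 9)]
COLUMN_E = [2.7, 2.2, 3.4, 4.5, 4.6, 1.6, 2.5, 3.3, 4.65,
            3.1, 3.6, 4.7, 3.8, 4.3, 4.35, 4.5, 4.2]

def bleaney_breadth(j, k, ref=(3, 3), ref_breadth=4.5):
    """Breadth from the assumed form [K^2 / (J(J+1))]^(1/3), normalized to the reference line."""
    def shape(jj, kk):
        return (kk * kk / (jj * (jj + 1))) ** (1.0 / 3.0)
    return ref_breadth * shape(j, k) / shape(*ref)

def deviations():
    """Absolute differences between the assumed formula and the tabulated column (e)."""
    return [abs(bleaney_breadth(j, k) - e) for (j, k), e in zip(LINES, COLUMN_E)]
```

With the 1/3 exponent, sixteen of the seventeen tabulated entries are reproduced to within rounding; the $J=6$, $K=3$ entry differs by a little over 0.1, which may reflect a misprint or an extraction error rather than the formula.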

The procedure to be followed in computing line shapes is this: given an assumed type of intermolecular forces, one computes the matrix $P$ (51') in the collision-parameter coordinate system. The interaction energy $H_1$ is given by (10). The elements of this matrix involving the initial and final states are then averaged over directions according to (54b) and (54c' and c"). The $S_2$ averages (54c) are employed in the approximations (56) and the
We must obtain the matrix elements of the combinations of functions of $\theta$ and $\phi$ which occur here. One theorem simplifies this problem greatly. All matrix elements vanish between states which are both $+$ or both $-$, while the $(+|\mu|-)$ elements are precisely those of the ordinary symmetrical top for states of the correct $J$, $K$ and $M$. The proof is obvious upon observing that

$$(K|\mu|K)=-(-K|\mu|-K);\qquad (K|\mu|-K)\cong 0. \tag{4}$$

The latter holds true within the order of the tunnel frequency compared to a vibrational frequency, or $10^{-3}$. Then (1) leads directly to our theorem. We present here a tabulation of the required matrix elements:16

(JKM\ cosfl \J KM) = KM/J{J+1) (/ KMlsinBe^lJK

Self-Broadening

Mil) = =FK/J(J+ \){{J±M+

We shall begin with various problems involving permanent dipole interactions, since there is much experimental work on them and since, in general, the "difficult" term $(S_2)_{\text{middle}}$ vanishes in these problems. Self-broadening in the inversion spectrum of ammonia has been studied most extensively. The states of the ammonia molecule, exclusive of vibrational and electronic quantum numbers, are specified by four quantum numbers: $J$ (total angular momentum), $M$ (equatorial), $K$ (symmetric top quantum number), and $\pm$, denoting the symmetry of the wave-

1)(/TM))!,

(JKM\cos6\J+\KM) ![(/+l)2-A2][(/+l)'-Jlf2]}* (/+l)[(2/+l)(2/+3)]* [(/ 2 -A 2 )(/ 2 -ilf 2 );p {JKM\cosB\J-\KM)=7[(2/+l)(2/-l)]» 16


R. de L. Kronig, and I. Rabi, Phys. Rev. 29, 262 (1927).

PRESSURE (JKM\suite**\J-l

KM±

BROADENING

1)

2

[_(J -K*){J^FM-\)(J^FM)J ~

/[(2/-l)(2/+l)]»

'

$$(J\,K\,M|\sin\theta\,e^{\pm i\phi}|J+1\ K\ M\pm1)=\frac{\{[(J+1)^2-K^2](J+1\pm M)(J+2\pm M)\}^{1/2}}{(J+1)[(2J+1)(2J+3)]^{1/2}} \tag{5}$$

Of course, these elements are to be understood, according to our theorem above, as always off-diagonal in the index $\pm$, which is not explicitly written in. There are two kinds of matrix elements of the "interaction matrix" $P$ which do not vanish, for all practical purposes: (a) those corresponding to a simple "first-order Stark effect" interaction involving the $J\to J$ elements of (5), and (b) those corresponding to rotational resonance and involving $J\to J\pm1$ elements of (5). All others are negligible because of the large energy-changes (order of 10 cm$^{-1}$) involved; we neglect the second order effects and vibrational resonances, the latter because few vibrationally excited molecules are present. However, it is necessary to observe that there is no resonance effect as far as the inversion frequency is concerned. The parameter $k$ of Section H-5, Part I, is small for the inversion frequency, and therefore collisions of $+$ molecules with $+$ molecules or of $-$ with $-$, in which each type must undergo a $+\to-$ transition (total energy change about 1.6 cm$^{-1}$) are as effective in broadening as the truly resonant $+,-$ collisions.

This applies as far as both effects (a) and (b) are concerned. The parameter $k$ is actually

$$k=\frac{b\,\omega}{v}=\frac{b(\text{Å})\times10^{-8}\times3\times10^{10}\times1.6\times2\pi}{8\times10^{4}}=b(\text{Å})\times0.035.$$

We shall never consider impact parameters $b$ greater than 15 Å. A $k$ of 0.5 may lead to an error of a few percent. These considerations lead us to the following procedure: we shall calculate the $P$-matrix (Part I, Eq. (51)) ignoring the exponential time factors entirely, and then substitute the resulting $P$-matrix into the sum $S_2$ (I, Eq. (54c)). Since we are ignoring the time-factor, it is permissible to perform the time-integral of (3) previously to taking the matrix elements. We use the formulas:

$$r^2=b^2+v^2t^2,\qquad \cos\psi=b/r,\qquad \sin\psi=vt/r, \tag{6}$$
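The size of the parameter $k=b\,\omega/v$ for the inversion frequency can be evaluated directly from the numbers quoted in the text. The sketch below assumes an energy change of 1.6 cm$^{-1}$, a molecular speed of $8\times10^{4}$ cm/sec and $c=3\times10^{10}$ cm/sec; the function name and default arguments are illustrative.

```python
import math

SPEED_OF_LIGHT_CM_S = 3.0e10

def k_parameter(b_angstrom, wavenumber_cm=1.6, speed_cm_s=8.0e4):
    """k = b * omega / v, with omega = 2 pi c * (wavenumber) and b given in Angstrom."""
    omega = 2.0 * math.pi * SPEED_OF_LIGHT_CM_S * wavenumber_cm   # rad/sec
    return b_angstrom * 1.0e-8 * omega / speed_cm_s

k_per_angstrom = k_parameter(1.0)    # about 0.038, close to the rounded 0.035 in the text
k_at_largest_b = k_parameter(15.0)   # about 0.57 at the largest impact parameter considered
```

With these inputs the coefficient evaluates to about 0.038 per Å rather than exactly 0.035, so the quoted figure appears to be rounded; either way $k$ stays near 0.5 at $b=15$ Å, as stated.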

At this point it is very easy to calculate the correct S2 sums. We present merely the result, which includes the identical sums for initial and final states. For the simple Stark effect we get the following S 2 sum, applicable to collisions of the radiating molecule, quantum numbers Ji and Ki, with molecules of any J2 and K2 (n2 in the general formula (46) of Part I for the two-molecule case being here J 2 , K2). KSK* S2(J2, K2)—~ o j W / ^ + l X ^ / d - l ) (Stark effect).

8 u4 52(/i-l,X2) = 9 bVh2 (.J1*-K1*)(.Ji2-K*i)

X-

h

/12(2/1+1)(2J1-1)

-(rot. res.)

(9a)

•

(9b)

52(/1+I,A:2)=-

9 bW

KJi+iy-KSjLiJi+D'-Kt'i X2

(/1+l) (2/1+l)(2/1+3)

For collisions with molecules of $J_2=J_1\pm1$, one must add (9a) or (9b) and (8) together to get the total $S_2$ sum. These are $(S_2)_{\text{outer}}$ sums; no $(S_2)_{\text{inner}}$ sums enter in this problem because the elements of $\mu$ are all off-diagonal in the non-degenerate index $\pm$. We must now obtain the quantity $\sigma$ (or rather the separate $\sigma(J_2, K_2)$ to average over $J_2$ and $K_2$ by Part I, Eq. (46)). Without bothering to do the integrals,17 we present here the three possible $\sigma$'s for our three approximations of Part I, Section H-4. We use the notation

$$S_2(J_2,K_2)=A^4/b^4. \tag{10}$$
Then the three approximations give #1. #2. #3.

C=l.ll C=1.00 C= 0.885

2M2

K2)XC,

#1, #2, #3.

b2vh

-sinfli sin02 sinpi sin<s2].

(11)

where C has the values

-[_— COS01 COS02

/:

(8j

For the two rotational resonance $S_2$'s, in case the radiating molecule $J_1$, $K_1$ collides with a molecule with $J_2=J_1\pm1$, we get

and the result is Exit


(7)

17 H. Jensen, Zeits. f. Physik 80, 448 (1933).


(11')

Then we have Eq. (46) of Part I, and

$$(\Delta\nu)_{\text{cm}^{-1}}=(nv/2\pi c)\,\sigma, \tag{12}$$

which determine the line-breadth from Eqs. (8)-(11). Two types of experimental information are available. The most extensively measured single line in the ammonia inversion spectrum is the $J_1=3$, $K_1=3$ line.18,19 For this line we can neglect, with one or two percent error, the effect of the rotational resonance sums (9a) and (9b). Then we can compare our theoretical values from (8)-(12) with the experimental results.§ The result is shown in Table I (all breadths are half-widths in cm$^{-1}$/atmos. at 0°C). This shows the agreement of our theory with experiment as far as absolute values of the broadening are concerned. Certainly the experimental value lies within our computational error, and it is interesting that the working approximation #2 is fairly close.

The second type of experimental data available is that on the relative breadths for all the lines in the ammonia spectrum, taken by Bleaney and Penrose.20 In Table II we present our theoretical results on relative values. In order to avoid the ambiguity due to the interpolation procedure, we have normalized the theoretical value for the 33 line to fit the experimental value precisely. Columns (a) and (b) give the $J$ and $K$ values for the various lines; column (c) gives our values using the Stark effect, from Eq. (8), plus an approximately computed correction for the rotational resonance effect, from Eq. (9). These values are normalized as explained above. Column (d) gives Bleaney and Penrose's experimental data, and column (e) gives predictions from a theoretical formula of Bleaney based on the arbitrary assumption that the broadening effect of a collision depends on the maximum interaction energy of the two molecules. This formula is

$$(\Delta\nu)\propto(K^2/(J(J+1)))^{1/3}.$$

It does not seem very well based since it takes no account of rotational resonance. All breadths are in $10^{-4}$ cm$^{-1}$ at 0.5 mm pressure.

The computational error in our approximate method for the rotational resonance is about 0.1 cm$^{-1}\times10^{-4}$. Bleaney states that his experimental error is of the order of 0.2 cm$^{-1}\times10^{-4}$. Within these errors, our agreement is excellent, except for the last four lines, which seem to allow no reasonable explanation. The fortuitous agreement of Bleaney's formula (e) with experiment is interesting, but not, apparently, significant.

B. Other Cases: HCN and HCl Vibrational Band Spectra

Lindholm21,10 has taken a great deal of experimental data on line-breadths in the vibrational bands of HCl and HCN. He has also made theoretical calculations of the line-breadths for these spectra, as has Foley.9 Before giving our own theory, it will make our procedure somewhat clearer to indicate how these older theories worked.## For cases in which resonances occur, as in the molecules under consideration, the standard practice has been to find a mean square direction-averaged energy perturbation due to resonance, for instance22

-0)'f

«F%)K/-^/-D =

\

J2

( 2 / - l ) ( 2 / + 3 ) J R3

(13)

in the case of rotational resonance of dumbbell molecules. Then this energy change is inserted as the diagonal B(t) in the simple phase-shift theory. This is for precise resonance; if imprecise resonance occurs, the "quasi-resonance" type of calculation such as the following is used. If two levels are separated by a small difference in energy $\Delta$, and an interaction $V_{ik}$ is assumed between them, the energy E of the perturbed levels is computed from the secular equation

$$\begin{vmatrix} (\Delta/2) - E & V_{ik} \\ V_{ik} & (-\Delta/2) - E \end{vmatrix} = 0. \qquad (14)$$

FIG. 2. Comparison of theoretical and experimental results.

18 B. Bleaney and R. P. Penrose, Proc. Roy. Soc. A189, 358 (1947).
19 C. H. Townes, Phys. Rev. 70, 166 (1946).
§ Bleaney's experimental value seems the soundest.
20 B. Bleaney and R. P. Penrose, Proc. Phys. Soc. London 59, 424 (1947); 60, 540 (1948).

21 E. Lindholm, Zeits. f. Physik 109, 223 (1938).
## Lindholm in reference 10 himself stated that he found this type of theory very bad in atomic resonance problems and that he used it only because his own rigorous theory was too difficult in these cases.
22 H. Margenau, Rev. Mod. Phys. 11, 1 (1939).
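The secular equation (14) is the eigenvalue problem of a symmetric 2×2 matrix, so its roots can be checked numerically. The following Python sketch is only an illustration: the function names are hypothetical and do not come from the paper, and energies are in arbitrary units. It returns both roots, evaluates the determinant at a trial energy, and computes the level shift obtained when one follows the branch that reduces to $+\Delta/2$ at $V_{ik} = 0$.

```python
import math


def secular_roots(delta, v):
    """Both roots E of det([[delta/2 - E, v], [v, -delta/2 - E]]) = 0."""
    root = math.hypot(delta / 2.0, v)  # sqrt((delta/2)^2 + v^2)
    return (-root, root)


def secular_determinant(delta, v, energy):
    """Value of the 2x2 determinant of Eq. (14) at a trial energy."""
    return (delta / 2.0 - energy) * (-delta / 2.0 - energy) - v * v


def quasi_resonance_shift(delta, v):
    """Shift of the upper level away from +delta/2 (assumes delta >= 0).

    This follows the single branch that reduces to +delta/2 at v = 0,
    which is the choice made in the quasi-resonance method.
    """
    return secular_roots(delta, v)[1] - delta / 2.0


if __name__ == "__main__":
    for delta, v in [(1.0, 0.0), (1.0, 0.5), (0.0, 0.5)]:
        lower, upper = secular_roots(delta, v)
        print(delta, v, lower, upper, quasi_resonance_shift(delta, v))
```

At exact resonance (delta = 0) the two roots are symmetric about zero, so neither branch is singled out by continuity in the coupling; picking one of them is what introduces a net shift.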


Then

$E = \pm\left((\Delta/2)^2 + (V_{ik})^2\right)^{1/2}.$

In the application of the method used by these authors, a mean square perturbation such as (13) is introduced as $V_{ik}$, and that solution of (14) chosen which reduces to the correct level at $V_{ik} = 0$. Then this E is used in the phase-shift method.

Two objections can be made to this idea, considered simply as an interpolation method between the first order, resonance region and the second-order, $r^{-6}$ force region for V. In the first place, the criterion for the point at which the change in type of forces occurs is whether $V = \Delta$, apparently independent of the speed of collisions, which enters into the criterion for our own theory. This can be shown to be not a serious discrepancy, using the fact that on both theories a collision must be both long and strong enough to cause a certain P-matrix element before it has an appreciable effect. However, the two criteria are not identical; the quantitative results of the two theories differ considerably.

The second objection is more fundamental. A particular choice of the solution of (14) is made; thus in general it appears that a center-frequency shift should be expected, larger the closer to resonance the levels come. However, experimentally and theoretically it can be shown that there is no shift due to precisely resonant collisions: both signs in (14) occur when $\Delta = 0$. This contradiction is a true defect in the quasi-resonance theory.

We are not able to present a good theory of the interpolation region between first order ($r^{-3}$) forces and $r^{-6}$ forces, or even, in the particular case of the rotator, of the second-order, $r^{-6}$, region. Therefore, since the answers for HCl depend strongly on how these regions are treated, we do not present detailed work on this molecule. We are able to attain fair (30 percent) agreement with Lindholm's observed widths for this spectrum, but no better agreement than that of his quasi-resonance theory. However, his theory predicts much greater breadths for the P than the R branch, which are not observed; our method of treating near-resonance could never lead to such a great difference, and thus agrees better with experiment.

We have been able to make fairly quantitative calculations for the HCN molecule. This molecule is a dumbbell rotator with a large dipole moment, $2.65 \times 10^{-18}$ esu. The observed spectra are the P-branches of the two bands at 11500 cm$^{-1}$ ($v_1 = 1$, $v_3 = 3$) and 12500 cm$^{-1}$ ($v_3 = 4$). There are several types of interactions to be considered.

(1) Vibrational resonance; this is neglected because the dipole moment for this interaction is quite small.

(2) Rotational resonance of the ground state ($J = J_1$) with unexcited colliding molecules, having $J = J_1 \pm 1$.

(3) Rotational resonance of the excited state ($J = J_1 - 1$) with unexcited colliding molecules having $J = J_1$, $J_1 - 2$.


(4) Second-order Keesom alignment forces, operative primarily when the colliding molecule is not in resonance with either state of the radiator.

Effect (1) is ignored. Effect (2) we can compute on much the same basis we have already used for ammonia: in fact, we can take over the $S_2$ sums (9) bodily from the ammonia problem, if we specialize to $K = 0$ and realize we must take only half of the sum there used, since only one state of the two involved in the radiation is undergoing the first-order perturbation.

$$S_2(J_2 = J_1 + 1) = \frac{4\mu^4}{9 b^4 v^2 \hbar^2}\,\frac{(J_1+1)^2}{(2J_1+1)(2J_1+3)},$$

$$S_2(J_2 = J_1 - 1) = \frac{4\mu^4}{9 b^4 v^2 \hbar^2}\,\frac{J_1^2}{(2J_1-1)(2J_1+1)}. \qquad (15)$$

The effect (3) is not a precise resonance, because the moment of inertia of the HCN molecule depends on vibrational quantum number. If we express the rotational energy approximately as

$E_v(J) = B_v J(J+1)$, (16)

then23

$B_v = 1.4878 - 0.0093\,(v_1 + \tfrac{1}{2}) + 0.0007\,(v_2 + 1) - 0.0108\,(v_3 + \tfrac{1}{2})$.

B is therefore about 0.04 cm$^{-1}$ different in excited and normal states. Since the quantum exchanged in the rotational resonance is about 2BJ wave numbers, the total amount by which the excited and ground states fail to be in resonance is approximately

$\Delta\omega(\text{off resonance}) = 0.08\,J$ cm$^{-1}$. (17)

Then our parameter k is

$k = b\,\Delta\omega / v = 2.9 \times 10^{-3}\, b(\text{Å})\, J$. (17')

The maximum collision parameters for this problem are about 30 Å; then

$k < 0.09\,J$ (17'')

and it is only for values of $J \sim 10$ that an appreciable correction for failure to resonate should appear. We shall ignore this correction entirely, both because it should not be large for the interesting lines and because it is extremely difficult to compute. Then for the effect (3) we also use Eqs. (15), except that $J_1$ must be replaced by $J_1 + 1$.

Lindholm finds a large correction for quasi-resonance in this case, not depending much on J. We do not agree with this result. In addition, this is the point in which his HCl treatment differs greatly from ours: we find that if (as is necessary for most lines) a correction is to be applied for this effect, it should be fairly symmetric between the P and R branches of the band.

23 G. Herzberg, Infrared and Raman Spectra (D. Van Nostrand, New York, 1945), p. 393.
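The off-resonance estimate can be retraced numerically. The Python sketch below is an illustration under stated assumptions, not the paper's computation: the leading constant 1.4878 cm$^{-1}$ and the coefficient $2.9 \times 10^{-3}$ are readings of partly illegible figures in this reprint (the exponent is inferred from the bound $k < 0.09\,J$ at $b \approx 30$ Å), and the function names are hypothetical.

```python
def b_rot(v1, v2, v3):
    """Rotational constant B_v of HCN in cm^-1 (formula quoted from Herzberg)."""
    return 1.4878 - 0.0093 * (v1 + 0.5) + 0.0007 * (v2 + 1.0) - 0.0108 * (v3 + 0.5)


def off_resonance_per_j(v1, v2, v3):
    """Detuning per unit J in cm^-1: 2 * (B_ground - B_excited)."""
    return 2.0 * (b_rot(0, 0, 0) - b_rot(v1, v2, v3))


def k_parameter(b_angstrom, j, coeff=2.9e-3):
    """k = b * delta_omega / v, using the numerical coefficient of Eq. (17')."""
    return coeff * b_angstrom * j


if __name__ == "__main__":
    # The two observed bands: (v1=1, v3=3) and (v3=4).
    print(off_resonance_per_j(1, 0, 3))  # close to 0.08 cm^-1 per unit J
    print(off_resonance_per_j(0, 0, 4))  # close to 0.08 cm^-1 per unit J
    print(k_parameter(30.0, 1))          # upper bound on k per unit J
```

Both bands give a detuning near 0.08 cm$^{-1}$ per unit J, and at the largest impact parameter the coefficient reproduces the bound of Eq. (17''), so k approaches unity only for J of order 10.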


Effect (4) we have not been able to treat rigorously by our own methods, since the sums required become quite difficult for the second-order alignment forces. It seems likely that the results for these forces obtained by the simple direction-averaged phase-shift theory should not be very far wrong; in order to include this important effect, we have simply borrowed the result of Lindholm and Foley for these forces (they agree) and added their line-breadths for these forces to our own for resonant forces. This simple addition is legitimate, since the important contributions these forces make come from different collisions from those contributing to the resonant line breadth.

We use the "working approximation" number (2). Then the cross section is given by (11) with $C = 1$, and A defined by (10). We use $S_2$ from (15). The line breadth can then be computed from (12) and Eq. (46) of Part I, for all resonant collisions (effects (2) and (3): sum over $J_2 = J_1 - 2$, $J_1 - 1$, $J_1$, $J_1 + 1$) with a contribution computed from Lindholm's figure (7) added in for the alignment forces.

Before comparing our results with Lindholm's experiments in Fig. 2, we should like to make some comments about his experimental data. These were taken in the photographic infra-red, by means of densitometer analysis of plates. His best data are those on the $11.5 \times 10^3$ cm$^{-1}$ band at pressures of 25 cm and 58 cm Hg; he also has data on the $12.5 \times 10^3$ cm$^{-1}$ band at three pressures, 25, 40, and 58 cm Hg. The most obvious fact about his original breadth data is that they are not proportional to the pressure, as one would expect on any theory at these pressures, but that instead they are given by

$\Delta\nu = \text{const.} + (\text{const.}) \times \text{pressure}$, (18)

within the experimental fluctuations. One can interpret the constant in this expression as an experimental broadening due to slit width, finite resolving power, etc.; this would be the case if these effects were summarized by a dispersion form of curve, to be smeared with the true line breadth. This constant is about 0.13 cm$^{-1}$ in half-width, a very reasonable value for experimental effects. To get the true width one should subtract this constant from all data.****

In Fig. 2 we present both sets of experimental data: the actual observations at 58 cm extrapolated to 1 atm. pressure using simple proportionality to pressure, and the same with the constant term in (18) subtracted, which we expect to be the true impact theory line breadth. The legend explains that the lower (o) values are the corrected data. In addition, we present three theoretical curves for comparison: (1) is our own theory for resonance, borrowing the Lindholm-Foley values for the small contribution of alignment forces; (2) is Lindholm's curve for this spectrum; and (3) is Foley's published values for the HCN spectrum. Lindholm's curve is appreciably lower than Foley's for two reasons. He has a correction for quasi-resonance, which is, in our opinion, unfounded. Foley need not use this correction in any case, since his theory applied to the 14μ fundamental band. In addition, it seems that even taking into account this correction Foley's values are some 20 percent higher than Lindholm's; perhaps this is a numerical error in Foley's work, since we get agreement with Lindholm's values upon re-computing by their common method.

Two forms of agreement can be claimed for our curve: (a) excellent agreement with the experimental variation with J, a factor which is unchanged by any method of correcting for experimental error; and (b) good agreement quantitatively with the corrected data.

**** In a private communication, Lindholm suggests that while there must be some error of the type suggested here, he cannot agree that it could be quite so large. One may consider that the correction for experimental error is uncertain by perhaps half its total value, and thus that the data are uncertain by 0.06 cm$^{-1}$/atmos. Gordy has recently reported microwave measurements on the HCN rotational spectrum (Rev. Mod. Phys. 20, 668 (1948)) and it is to be hoped that the observation of one or two of these lines will definitely settle the question.

TABLE III. Comparison of measured and computed diameters.

Radiator   Perturber   (a) Computed line-breadth   (b) Measured        (c) Kinetic-theory
                       diameter (Å)                diameter (Å)        diameter (Å)
NH3        He          1.5                         2.4                 3.2
NH3        H2          1.9                         3.5                 3.4
NH3        N2          2.6                         6.4                 3.4
NH3        O2          2.5                         4.8                 3.5
NH3        A           2.6                         4.6                 3.7
H2O        Air         3.3                         5.6                 3.5
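The correction described by Eq. (18) amounts to fitting a straight line to breadth against pressure and discarding the intercept. The Python sketch below shows that procedure on made-up numbers: the three widths are hypothetical values constructed for illustration, not Lindholm's measurements, and the function names are assumptions. It relies on the point made in the text that, for dispersion-shaped profiles, the instrumental and true half-widths simply add.

```python
def fit_line(pressures, widths):
    """Least-squares straight line: width = intercept + slope * pressure."""
    n = len(pressures)
    mean_p = sum(pressures) / n
    mean_w = sum(widths) / n
    sxx = sum((p - mean_p) ** 2 for p in pressures)
    sxy = sum((p - mean_p) * (w - mean_w) for p, w in zip(pressures, widths))
    slope = sxy / sxx
    return mean_w - slope * mean_p, slope


def corrected_width_per_atm(pressures_cm_hg, widths):
    """Return (instrumental constant, true width at 76 cm Hg)."""
    intercept, slope = fit_line(pressures_cm_hg, widths)
    return intercept, slope * 76.0


if __name__ == "__main__":
    pressures = [25.0, 40.0, 58.0]   # cm Hg
    widths = [0.23, 0.29, 0.362]     # cm^-1, hypothetical data
    constant, true_width = corrected_width_per_atm(pressures, widths)
    naive = widths[-1] * 76.0 / pressures[-1]  # simple proportionality
    print(constant, true_width, naive)
```

With these illustrative numbers the naive extrapolation from 58 cm Hg overstates the width per atmosphere, which is the sense of the difference between the two sets of points in Fig. 2.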

C. Foreign Gas Broadening: Van der Waals Interactions

Broadening of microwave lines by pressures of foreign gases has been the subject of some experimental investigations.20,24 The theoretical approach to the problem of foreign gas broadening has generally been that of computing the Van der Waals interactions between the molecules concerned and using these to obtain broadening cross sections. We have followed this procedure with our more accurate theory.

In general, it can be shown that the $r^{-6}$ term in the interaction between two molecules is, for the Debye induction type of forces:

$H_{\text{ind.}} = (\alpha_2 \mu_1^2 / 2r^6)(1 + 3\cos^2\theta)$, (19a)

where $\alpha_2$ is the polarizability for the non-polar molecule, assumed isotropic, $\mu_1$ the dipole moment of the polar molecule, and $\theta$ the angle which $\mu_1$ makes with the intermolecular distance r. Very similarly, the London dispersion forces for two interacting non-polar molecules, one assumed isotropic, are approximately given by

$H_{\text{disp.}} = [\ldots]\,(3\alpha' + \alpha'' + (\alpha'' - \alpha')\cos^2\theta)$, (19b)

where $\alpha_2$ is still the polarizability of the isotropic molecule, $I_1$ and $I_2$ are the fundamental frequencies from one-term dispersion formulas, or the ionization potentials, for the two molecules, and $\alpha'$ and $\alpha''$ are the polarizabilities along the two different axes of the polarizability ellipsoid of the non-isotropic molecule,22 assumed to be an ellipsoid of rotation. (A somewhat similar formula holds in case the anisotropic molecule has three different polarizabilities.) $\theta$ is the angle between the axis of the polarizability ellipsoid and the intermolecular distance r.

For states separated by microwave frequency differences, the isotropic parts of (19) make no contribution to broadening, and one need consider only an interaction of the form

$H = \text{const.} \times (\cos^2\theta / r^6)$. (20)

$S_2$ sums can be computed using this interaction, inserting appropriate constants for various cases. These sums are quite difficult, both because the $\cos^2\theta$ term introduces essentially a second-order Legendre polynomial symmetry into the matrix P, and because the difficult term $(S_2)_{\text{middle}}$ does not vanish. They are evaluated easily only by using the group-theoretical methods of Racah;15 however, using these methods they can be done in general, even for forces of more complicated symmetry.

We do not present this evaluation because the results, in general, do not have any relation to experimental results. It is found that the collision broadening cross sections $\sigma$ calculated by means of the known Van der Waals forces come out usually to be considerably smaller than the kinetic theory cross sections computed from viscosity, etc. This means that the forces which cause broadening must be the higher order and exchange forces which come into prominence at the diameter corresponding to the kinetic theory cross section. As would be expected, the measured broadening cross sections are at least as great as the kinetic theory ones in most cases.

Table III summarizes the situation very well. Column (b) is from Bleaney and Penrose's paper, except the H2O line, which comes from Becker and Autler.24 Column (c) is from Stuart's book.25

One more problem was computed. The microwave spectrum of oxygen has been measured26 in some detail. This case is even more complicated than the preceding, because transitions among states having different total angular momenta (J) but the same K, or molecular quantum number, must be considered. $J = K + S$, where S is the spin angular momentum of O2, $|S| = 1$. The splitting of the states $J = K \pm 1$, $K$ is very small, since this is the line observed at microwave frequencies, and thus this type of transition occurs easily upon collision. Another theorem of Racah was used to simplify this computation. Here the agreement with experiment was somewhat better than would be expected from the kinetic theory diameters and Table II, but this is not significant since the experimental data are only on the entire unresolved set of lines near 2 cm$^{-1}$.

I should like to express my gratitude for the wise direction, help, and encouragement given me in my work on this problem by Professor J. H. Van Vleck.

24 G. E. Becker and S. H. Autler, Phys. Rev. 70, 300 (1946).
25 H. A. Stuart, Molekülstruktur (Verlag J. Springer, Berlin, 1934), Table II.
26 J. H. Van Vleck, Phys. Rev. 71, 413 (1947).


Theory of Ferroelectric Behavior of Barium Titanate Ceramic Age 57, 29-34 (1951)

This is my first review paper — a forerunner of the "soft mode" paper of 1958. Bill Shockley hired me at Bell to work on his ideas on ferroelectricity and I did, but soon became more interested in generalities, feeling that calculations were premature. He was unhappy with my work, which seemed to ignore his ideas for detailed calculations; but he taught me a lot of solid state physics and the background and the experience were both valuable to me.


Theory of Ferroelectric Behavior of Barium Titanate*

I — Review of Experimental Information

The purpose of this paper is, first, to give a review of certain typical experimental information on ferroelectrics, particularly on barium titanate and its sister-structures, which have absorbed most of the recent effort in this field; and second, to present a rough idea of the present theoretical picture of the atomic mechanisms underlying the phenomenon of ferroelectricity, again concentrating on barium titanate. Since it does not seem necessary to attempt to give complete references in a paper of this character, particularly because a quite complete bibliography of the subject is available in Von Hippel's excellent review article,1 it should be made clear at the start that the writer is borrowing material from various sources.**

The first and most obvious question to be answered is: what is a ferroelectric, and why is it so called? The name does not come from any relation to iron, but from the very close analogy between the phenomena of ferroelectricity and the better-known ferromagnetism. By definition, ferroelectricity is the dielectric analogue of ferromagnetism.2 Fig. 1 shows the most striking point of this analogy: the ferroelectric hysteresis loop in barium titanate (BaTiO3). Here we are plotting the dielectric induction D versus the applied field E, as can be done on a cathode-ray oscilloscope with certain rather simple circuit arrangements. The full scale of D on these diagrams is roughly $2 \times 10^7$ volts/cm, while the full scale in E is only about $4 \times 10^3$ volts/cm. Thus the mean or "true" dielectric constant $\epsilon = D/E$, corresponding to the slope of the loop as a whole, is about $10^4$. On the other hand, the second diagram in Fig. 1, in which the effect of applying a small extra voltage at various points of the loop is indicated, shows that the effective reversible dielectric constant $\epsilon_{\text{rev.}}$ for small fields is only about 1000, at all points of the loop, and that for small fields there is little or no hysteresis loss. To those who are familiar with ferromagnetism, this will indicate that small fields are causing little or no domain motion, since $\epsilon_{\text{rev.}}$ also is about equal to the saturation dielectric constant, which represents the dielectric constant of a sample in which all domains are completely aligned, and cannot be realigned by the measuring field.

* Presented at Tenth Symposium on Ceramic Dielectrics, School of Ceramics, Rutgers University, New Brunswick, N. J., November 17, 1950.
** The general rule followed in this paper is that if a development is covered in some detail by Von Hippel, no reference for such is noted.
1 A. Von Hippel, Rev. Mod. Phys. 22, 221 (July 1950).
2 Thus an electret, by this definition, is not a ferroelectric, since very few of the ferromagnetic properties have analogues in electrets.

By P. W. ANDERSON, Bell Telephone Laboratories, Murray Hill, N. J.

(Reprinted from CERAMIC AGE, Newark, N. J., April, 1951)

Fig. 1: Hysteresis loops in BaTiO3.

On the ferroelectric hysteresis loop, one can define quantities entirely analogous to those which characterize magnetic loops: the coercive field $E_c$ which is required to return the induction to zero, the remanent polarization $P_r = D_r / 4\pi$ at zero applied field, and the saturation polarization, the polarization $P_s$ of a hypothetical specimen with all domains parallel at no field. This latter quantity can be obtained by linearly extrapolating the saturation line at high fields back to the $E = 0$ axis: this is shown in Fig. 1. In BaTiO3 the saturation polarization is larger than in the other classes of ferroelectrics, about $16 \times 10^{-6}$ coul/cm². The shapes of these hysteresis loops, as in ferromagnetism, are very sensitive to small amounts of impurity and even to the previous thermal and electrical history of a single specimen. Nevertheless, $P_s$ and, usually, $\epsilon_{\text{rev.}}$ are roughly constant for a given substance.

The second similarity between ferromagnetism and ferroelectricity lies in the Curie point phenomenon. Fig. 2 shows a plot of $\epsilon_{\text{rev.}}$ versus the temperature for a single crystal of BaTiO3. A very high peak occurs at the "Curie point" $T_c = 120°$; above this peak the substance behaves like a relatively normal dielectric of high dielectric constant, while below 120°, as shown in the figure, there is complicated crystal-direction dependence for $\epsilon_{\text{rev.}}$. In this region we have also the field dependence of the dielectric properties indicated by the hysteresis loops of Fig. 1.

Fig. 2: Dielectric constant as a function of temperature for a single crystal of barium titanate.

Returning to the region above $T_c$, Fig. 3 shows the D versus E plot here. At temperatures only slightly higher than $T_c$, shown by the curve labelled $T_c + 2°$, the hysteresis loop has disappeared, flattening out to a single curve which, nonetheless, still shows some evidence of dielectric saturation in the curvature at high fields. At still higher temperatures the dielectric constant is lower, and the characteristic is perfectly linear as in an ordinary dielectric.

Another characteristic of the region above the Curie point which is strikingly similar to the behavior of ferromagnets is the Curie-Weiss law. This law states that the inverse of the susceptibility (magnetic or electric) is proportional to the temperature, i.e.

$1/\epsilon = A + BT$ (1)

and that actually the intercept at $1/\epsilon = 0$ is approximately $T_c$, i.e.

$\epsilon \simeq \dfrac{\text{const}}{T - T_c}$ (2)

The correctness of this law is demonstrated in Fig. 4. The small deviations near the Curie point are found also in ferromagnets, but are not quite the same in detail.

Thermally, the electric and magnetic Curie points are not quite so similar in structure. Although both show singularities in the specific heat at this point, the ferroelectric singularity is considerably smaller for the BaTiO3 class than is the usual ferromagnetic specific heat singularity. Also, while most (not all) known ferromagnetic Curie points are thermodynamically classified as second-order, or λ-point, transitions, with no latent heat or discontinuity in volume or other "first order" properties, it is possible that the BaTiO3 Curie point is a very slightly

Fig. 3: D vs. E for BaTiO3 above the Curie point (curve labelled $T_c + 2°$).

first-order change with a true volume discontinuity. Further reference will be made later to some of these points.

A third striking experimental similarity between ferromagnetism and ferroelectricity lies in the domain structure. It is well known that in order to explain the magnetic hysteresis loop Weiss and others postulated that a ferromagnetic specimen is divided into a system of regions, or domains, each polarized to saturation in a certain direction. At zero applied field, however, the magnetic domain polarizations add up to zero polarization for the specimen as a whole. Through beautiful theoretical and experimental work3 the domains in ferromagnets have come in quite recent years to be understood and definitely "pinned down" theoretically, as well as made visible experimentally with the powder pattern technique. The opposite order is the rule in ferroelectrics: the domains are easily visible and, in some cases, have been the first observed evidence of ferroelectricity: it has been easier to identify a ferroelectric optically by looking for domains than electrically by finding a hysteresis loop, although the latter, it must be said, is a far surer test, since domain-like twinning is found to be surprisingly common.

Fig. 4: $1/\epsilon$ as a function of temperature in BaTiO3.

To explain why the domains are so easily visible we must first go into the crystallography of BaTiO3. Above the Curie point, this substance crystallizes in the cubic perovskite structure, as shown in Fig. 5. One may think of the large barium and oxygen ions as lying on the corners and face-centers of a cube, respectively, together making up a face-centered structure; the small titanium ions are then at the cube centers, surrounded by an octahedron of oxygen ions. At $T_c$ and below this structure is found, by x-rays, to deform into a tetragonal unit cell, with one of the three cubic axes (arbitrarily of course) lengthening, the other two shortening, to become the c and a axes respectively of the tetragonal structure. The ratio c to a is 1.01 at room temperature. Recent x-ray results indicate that in addition to this general deformation the relative ionic positions shift slightly, the Ti ions moving 0.044 Å in the direction of the c-axis, the O ions moving very slightly the other way.4 This is confirmed by the fact that the spontaneous polarization $P_s$ has been found, by experiments on single crystals, to lie along the c-axis. It is fairly certain that the positively-charged Ti ions move in the direction of $P_s$ and the negative O ions in the opposite direction. One can think of the tetragonal deformation as an electrostrictive effect.

Fig. 5: Crystal structure of BaTiO3.

3 For a review see C. Kittel, Rev. Mod. Phys. 21, 541 (1949).
4 G. C. Danielson, unpublished. W. Kaenzig, Phys. Rev. 80, 94 (1950).

Fig. 6: Proportionality of strain and $P_s^2$.

It is found that an "electrostrictive" law

$\Delta(c/a) = \text{const} \times P^2$ (3)

is obeyed if we insert for P the saturation polarization $P_s$. The same law, with the same constant, is found to hold above the Curie point for polarizations due to applied fields. There, of course, the effect is true electrostriction. Fig. 6 shows how equation (3) is obeyed. The deviations at lower temperatures are due not to the failure of the law but to inability to properly saturate the crystal when measuring $P_s$, even in the so-called "single-domain" crystals. The cubic magnetic materials also have this same slight tetragonal distortion, although in these the effect always is thought of as magnetostriction superposed upon a basically cubic structure, while the BaTiO3 effect is so large that calling it electrostriction is rather a stretch of the imagination.

Because of this tetragonality a polarized piece of BaTiO3 is optically anisotropic, with different indices of refraction in the two different directions; with a polarizing microscope, then, it is seen easily that a typical crystal of BaTiO3 is subdivided into many domains, polarized in various directions, just as the existence of the hysteresis loop would lead one to suspect. A typical domain pattern is seen in Fig. 7.
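As a rough order-of-magnitude illustration of the electrostrictive law (3), the constant can be estimated from two figures quoted in this paper: the room-temperature axial ratio c/a = 1.01 and the saturation polarization of about $16 \times 10^{-6}$ coul/cm². The Python sketch below assumes that these two figures refer to the same temperature and that the strain is measured from the cubic value c/a = 1; neither assumption is stated in the text, and the function names are hypothetical.

```python
def electrostrictive_constant(strain, polarization):
    """Constant in strain = const * P^2, from one (strain, P) pair."""
    return strain / polarization ** 2


def spontaneous_strain(const, polarization):
    """Strain predicted by the quadratic law for a given polarization."""
    return const * polarization ** 2


if __name__ == "__main__":
    strain_room = 1.01 - 1.0   # c/a - 1, assumed measured from the cubic cell
    p_sat = 16e-6              # coul/cm^2, value quoted in the text
    const = electrostrictive_constant(strain_room, p_sat)
    print(const)                                   # in cm^4/coul^2
    print(spontaneous_strain(const, p_sat / 2.0))  # strain at half the polarization
```

Because the law is quadratic, halving the polarization reduces the strain by a factor of four, which is why a plot of strain against $P_s^2$ rather than $P_s$ gives a straight line.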

Fig. 7: A typical domain pattern in BaTiO3.

Precisely as in ferromagnetism, it is found always that the component of polarization perpendicular to a domain wall is the same on each side of the wall, so that no free charge need form in the wall to compensate the surface charge density which would otherwise appear. Such a charge density would have a high electrostatic self-energy.

Two more lines of experimental work, not directly connected with the analogy the writer has been developing, should be mentioned. The first is the search for more ferroelectric materials. It is found that many compounds with structures similar to BaTiO3 (the "perovskite structure") are also ferroelectric.5 Thus BaTiO3 is not an isolated phenomenon as are Rochelle salt and KDP, the other two ferroelectric substances known; this, with its relatively simple structure, contributes to the interest in work on BaTiO3.

Because of the practical importance of this subject, I should also say something about the piezoelectric effect in BaTiO3. Some confusion has arisen as to whether this effect should be called electrostriction or a true piezo-effect. In the precise sense of the words, below the Curie point the strain is proportional to applied field in each single domain of a BaTiO3 crystal, and thus we have to do with a pure piezo-effect; however, in two ways the phenomenon is related closely to electrostriction. First, the actual piezo-electric constant is readily derived from the electrostrictive equation (3) which we have already mentioned, as follows:

$\text{strain} = \Delta(c/a) = \text{const} \times P^2$ (4)

$\dfrac{d}{dE}\left(\Delta(c/a)\right) = \text{const} \times 2P\,\dfrac{dP}{dE}$

Thus the change of strain with field involves the constant of electrostriction, the saturation polarization, and the dielectric susceptibility $dP/dE$; since the latter two quantities are large, the result is a large and useful effect. In another sense, in an actual unpolarized crystal or ceramic the effect of a field is not at first to give a linear piezo-effect, because the deformation of domains polarized in one direction is cancelled completely by that of domains polarized in opposite directions. Thus, it is necessary to "pole" a specimen for piezo-electric work by the application of high fields, and accordingly to leave the specimen in an at least partially polarized state.

5 Matthias, Phys. Rev. 76, 175, 430, 1886 (1949); Shirane et al., Phys. Rev. 80, 485 (1950); Smolenskii, J. Tech. Phys. USSR 20, 137 (1950).
6 P. Weiss, Phys. Zeits. 9, 358 (1908).

II — Review of the Theory of Ferroelectricity

In commencing the theoretical part of this paper, one can remark that the main task of the theoretician has been to try to de-emphasize the close analogy of ferromagnetism and ferroelectricity, at least as far as the actual basic mechanism is concerned.

The ideas involved in the present theoretical picture of the mechanism of ferroelectricity are not new and not complicated. We must go back to 1908, the year in which P. Weiss first proposed an acceptable, if somewhat ad hoc, theory of ferromagnetism.6 In his proposal he cited the familiar formula with which the names of Lorentz, Lorenz, Clausius, Mossotti, Debye, and others have been connected in its various aspects:

$F = E + \dfrac{4\pi P}{3}$ (electric case)

$F = H + \dfrac{4\pi M}{3}$ (magnetic case) (5)

In this formula, F is the "internal" field, the field which actually acts to polarize the individual molecule inside the material one is discussing. The external applied field is E or H, the polarization of the substance per unit volume P or M, and the important point of the formula is that it does contain a contribution of P or M.

The derivation of this formula is a matter of simple electro- or magnetostatics. It can be best understood by referring to Fig. 8. We take the particular molecule on which we are trying to compute the acting field, and surround it with an imaginary sphere of macroscopic size. If the symmetry of the substance under consideration is as high as cubic (this includes isotropy, as in liquids) the field due to all the molecules inside this sphere can be shown rigorously to cancel out, if the polarization inside the sphere is uniform. The problem of finding the field due to all charges outside the sphere, then, is simply one of ordinary macroscopic electrostatics; it involves the fields of: (a) the surface charge on the sphere left by cutting it out; (b) the surface charges on the dielectric surfaces ((a) and (b) involve P); and (c) the charge on the condenser plates. The result of such a calculation is the Lorentz formula (5).

Fig. 8: Internal field in a dielectric.

The term $\frac{4\pi}{3}P$ is, as von Hippel has called it, a "bootstraps" term, i.e. a term by which the polarization is capable of pulling itself up by its own bootstraps. In more conventional language, this is a "cooperative force" term. To show how this term acts, let us compute the polarization of a molecule in the dielectric under an external field:

$P_{\text{mol}} = \alpha F$ (6)

This equation defines the polarizability $\alpha$ of a molecule. Then let us find the total polarization of a cc of the substance containing n molecules, and finally insert for F equation (5):

$P_{\text{tot}} = n\alpha F = n\alpha\left(E + \dfrac{4\pi P_{\text{tot}}}{3}\right)$

or,

$\chi = \dfrac{P_{\text{tot}}}{E} = \dfrac{n\alpha}{1 - \frac{4\pi}{3}n\alpha}$ (7)

The susceptibility $\chi$ is connected with the dielectric constant by

$$\epsilon = 1 + 4\pi\chi \qquad (8)$$

Now it is clear that without the "bootstraps" term we get the "naive" or Drude formula for susceptibility which is normally used in gases:

$$\chi = n\alpha \qquad (9)$$

Thus the $\frac{4\pi}{3}$ term has increased the susceptibility and the dielectric constant. In fact, if we imagine that $\frac{4\pi}{3}n\alpha$ is allowed to increase toward the value unity, we see that $\chi$ and $\epsilon$ increase without limit, and a little thought will convince one that if $\frac{4\pi}{3}n\alpha$ becomes greater than 1, the polarization will become infinite without the application of any external field at all. However, of course equation (6) breaks down when F and P become too large, and in such a region we will have merely a finite spontaneous polarization, just as is observed in ferroelectricity and ferromagnetism. This is the phenomenon which has been called the "$\frac{4\pi}{3}$ catastrophe".

In the case of magnetism, the actual fact is that all known substances at normal temperatures have $\frac{4\pi}{3}n\alpha \ll 1$; thus it was necessary for Weiss6 to postulate boldly an additional cooperative force term N:

$$F = H + \left(\frac{4\pi}{3} + N\right)M$$

and N had to be taken of order of magnitude $10^4$. This completely unsubstantiated hypothesis was remarkably successful, when combined with the domain idea, in explaining ferromagnetism in considerable detail, since the precise behavior of $\alpha$, and of the high field equation corresponding to (6), was being discovered by Langevin and others at about the same time.

7 W. Heisenberg, Zeits. für Phys. 49, 619 (1928).
8 The best discussion of these general concepts is given by H. Fröhlich, Theory of Dielectrics (Oxford University Press, London, 1949). Also see J. Pirenne, Helv. Phys. Acta 22, 479 (1949).
9 J. H. Van Vleck, J. Chem. Phys. 5, 320 (1937).
10 J. Pirenne, Physica 15, 1019 (1949).
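The runaway described by equations (7) to (9) is easy to see numerically. The following is a minimal sketch and not part of the original paper: the function names and the sample values of $\frac{4\pi}{3}n\alpha$ are illustrative assumptions. It compares the Drude value (9) with the Lorentz-corrected susceptibility (7) and the dielectric constant (8) as $\frac{4\pi}{3}n\alpha$ approaches unity.

```python
import math

def lorentz_susceptibility(n_alpha):
    """Equation (7): chi = n*alpha / (1 - (4*pi/3) * n*alpha)."""
    x = (4.0 * math.pi / 3.0) * n_alpha  # the "bootstraps" parameter (4*pi/3)*n*alpha
    if x >= 1.0:
        # Beyond this point the linear relation (6) no longer applies.
        raise ValueError("(4*pi/3)*n*alpha >= 1: linear theory breaks down")
    return n_alpha / (1.0 - x)

def drude_susceptibility(n_alpha):
    """Equation (9), the formula without the bootstraps term: chi = n*alpha."""
    return n_alpha

def dielectric_constant(chi):
    """Equation (8): epsilon = 1 + 4*pi*chi."""
    return 1.0 + 4.0 * math.pi * chi

if __name__ == "__main__":
    # Sample values of (4*pi/3)*n*alpha chosen only for illustration.
    for x in (0.1, 0.5, 0.9, 0.99, 0.999):
        n_alpha = 3.0 * x / (4.0 * math.pi)
        chi = lorentz_susceptibility(n_alpha)
        print(f"(4pi/3)n*alpha={x:5.3f}  chi_Drude={drude_susceptibility(n_alpha):.4f}  "
              f"chi_Lorentz={chi:.3f}  epsilon={dielectric_constant(chi):.1f}")
```

The corrected susceptibility grows without limit as the parameter nears 1, while the Drude value stays below $3/4\pi$.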

The explanation of the factor N, however, had to wait 20 years for Heisenberg, who in 1928 showed that a roughly similar force resulted from the quantum-mechanical exchange effect.7 In dielectric substances, on the other hand, there is no such limitation on $n\alpha$, as the existence of numerous substances with dielectric constants greater than 10 will testify. We see here the first major point of difference between ferroelectricity and ferromagnetism; no such esoteric cooperative force as the exchange force is necessary in the dielectric case, since the simple electrostatic "Lorentz force" seems always to be quite adequate. On the other hand, in the dielectric case the shoe is almost too much on the other foot. Extrapolation from gas polarizabilities for dipolar substances by the Curie law led, for many polar liquids, to values of the polarizability which indicated that $n\alpha$ should be considerably greater than the ferroelectric value. That is why Van Vleck named the problem the "$\frac{4\pi}{3}$ catastrophe": the theory seems to lead to the prediction, for example, that water at 100°C should be ferroelectric, and presumably therefore frozen or at least of considerably different physical character from the water we know. The theoreticians were rescued from their theoretical catastrophe in the 1930's when Onsager showed that for polar liquids, at least, the original Lorentz field equation (5) is incorrect. The reason for this is that while on the average the polarization of the sphere is uniform, actually at any instant the permanent dipole of the central molecule affects the distribution in the sphere, leading to a situation in which the field due to the interior of the sphere does not cancel out when the order of taking averages is correctly handled. He concluded from his formula that the "catastrophe" could not occur at all for permanent dipoles; this may not be quite correct, but it is probable that the Curie temperatures would be so low that other interactions than the electrostatic ones would already have taken over the situation, and one should more properly speak of phase or order-disorder transitions, such as occur in the hydrogen halides, than of Lorentz catastrophes.8 Van Vleck9 has shown, however, that the problem of the Onsager inner field does not arise at all in the case of solids with "harmonic" or induced polarization, i.e. in substances in which there are no permanent dipoles, but only charges elastically bound to equilibrium positions. This demonstrates to us the second important difference between ferromagnetism and ferroelectricity: ferromagnetism always involves permanent magnetic dipoles (the field factor N is not of magnetostatic origin, and thus is not affected by the Onsager calculation) while ferroelectricity probably cannot involve permanent electric dipoles. It is possible that ferroelectricity can occur only in materials with large polarizabilities of the induced type. Rochelle salt and KDP have long been thought to be dipolar-type ferroelectrics; however, Pirenne10 has recently shown that the isotope effect in the latter substance cannot be explained on the dipolar hypothesis, and that probably KDP is at least an intermediate, more complicated case than either the harmonic (induced) or the dipolar.

A model of BaTiO$_3$ has been proposed which frequently is called the "rattling titanium" hypothesis. It is assumed that the titanium ion "rattles" among six equivalent equilibrium positions, one near to each of the oxygens

Fig. 9: Models for BaTiO$_3$.

in the octahedron. A schematic diagram of this situation is given in the first half of Fig. 9. This model leads to quite a high polarizability for the Ti ion, which as we shall see later is an unnecessary frill; on the other hand, so far as Onsager field effects are concerned, the "rattling" ion is simply equivalent to a free rotating dipole, which as we have seen is not likely to exhibit ferroelectricity. Most theoreticians interested in ferroelectricity now feel that while the "rattling" model is a convenient visualization aid, as used for instance in Von Hippel's article,1 the second diagram, representing a pure "induced polarization" model, is far more likely to be correct. Another aspect of ferroelectricity in the BaTiO$_3$ class which shows the superiority of the "induced" model is

demonstrated by a quantitative consideration of the Curie-Weiss law above $T_c$. As we saw in equation (2) and Fig. 4, the susceptibility above the Curie point is given approximately by

$$\chi = \frac{K}{T - T_c} \qquad (10)$$

Now above the Curie point $\frac{4\pi}{3}n\alpha$ is less than one, so that equation (7) holds:

$$\chi = \frac{n\alpha}{1 - \frac{4\pi}{3}n\alpha} \qquad (7)$$

In all of the permanent dipole-like cases (square well, rattling titanium, or true rotating dipole) one and only one temperature dependence of the polarizability is to be expected, that given by the familiar dipolar "Curie law":

$$n\alpha = \frac{C}{T} \qquad (11)$$

This can be inserted into equation (7), and it is found indeed that an expression of the form (10) results, as follows:

$$\chi = \frac{C/T}{1 - \frac{4\pi}{3}\frac{C}{T}} = \frac{C}{T - \frac{4\pi}{3}C} = \frac{C}{T - T_c} \qquad (12)$$

with $T_c = \frac{4\pi}{3}C$, but then $K = C$. It is a simple matter to test this equation in the case of BaTiO$_3$; $T_c$ is known to be about 400° Kelvin, which gives us

$$K = \frac{3}{4\pi}T_c \sim 100$$

However, the measured value of the numerator of (10) is larger than 10,000. This means that the dipolar type of model is in error numerically by a factor of more than two orders of magnitude.11 On the other hand, the induced polarization model is capable of giving as large a value of K in (10) as we like. The reason for this is that the temperature coefficient of induced polarizability is generally quite small, of the order of $10^{-5}$ to $10^{-6}$ per degree (the same order as coefficients of volume expansion); actually, this coefficient is due to the combination of the effects of change of n with temperature with those of a slight change in shape of the potential troughs in which the ions find themselves as the lattice changes size and the thermal agitation changes. Both of these effects are of the order of magnitude we have mentioned. Then if we postulate a small negative temperature coefficient $\delta$ for $n\alpha$, we get

$$n\alpha = (n\alpha)_{T=T_c}\,\bigl(1 - \delta(T - T_c)\bigr) = \frac{3}{4\pi}\bigl(1 - \delta(T - T_c)\bigr) \qquad (13)$$

This, substituted into (7), gives

$$\chi = \frac{3}{4\pi}\,\frac{1 - \delta(T - T_c)}{\delta(T - T_c)} \simeq \frac{3}{4\pi\delta}\,\frac{1}{T - T_c} \qquad (14)$$

Thus

$$K = \frac{3}{4\pi\delta} \qquad (15)$$

and this tells us that by a choice of $\delta$ which is quite consistent with normal magnitudes we may easily explain the value of K which is observed. Thus we have a second very sound and convincing argument that the polarization in BaTiO$_3$ is of induced type.

11 This discussion is similar to that given by Slater, Phys. Rev. 78, 748 (1950), but, as Slater points out, very similar conclusions were reached by Devonshire, Phil. Mag. 40, 1040 (1949), Ginsburg, J. Exp. Th. Phys. USSR 12, 239 (1948), and Richardson and myself (unpublished). The two latter were more concerned with the size of the specific heat anomaly than with the Curie law; the two phenomena are, however, easily shown to be closely related by a simple thermodynamic argument, and in fact the specific heat anomaly is several orders of magnitude too small to be explained by a dipolar mechanism.

We may now say a little about what this kind of thinking gives us in terms of a picture for BaTiO$_3$. In the first place, at optical frequencies this is a fairly normal material, having the slightly high index of refraction 2.4. The application of equation (7) to $n^2 = \epsilon_{\text{optical}}$ gives us

$$\frac{4\pi}{3}\,n\alpha_{\text{optical}} \simeq .60 \qquad (16)$$

This part of the polarizability is, as is well known, due entirely to the motions of the electrons, since the ionic motions cannot follow optical frequencies. Thus it is only necessary for us

Fig. 10: Dispersion and loss in BaTiO$_3$ as a function of frequency (cycles per second).

to find an induced type of polarizability, coming from ionic motions, which is large enough so that added to (16) it gives 1:

$$\frac{4\pi}{3}\,n\,(\alpha_{\text{optical}} + \alpha_{\text{ionic motion}}) > 1 \qquad (17)$$

Such polarizabilities, proceeding entirely from the harmonic binding of ions (we think particularly of the motion of the highly-charged titanium ion), are not at all surprising; LiF, for instance, has a value as high as this. Actually, recent computations by Slater11 have shown that even this amount of $\alpha_{\text{ionic}}$ is not required, due to a peculiarity of the Lorentz field in the perovskite structure. (The original Lorentz calculation does not hold quite exactly for the O ions, but instead a larger Lorentz field is to be expected.)

A number of rather fantastic explanations have been suggested to explain the "surprising" phenomenon of ferroelectricity. As we have seen, it is not necessary to go to any great lengths to understand this phenomenon; what has always seemed to me a little surprising is that there are not more known ferroelectrics, since a combination of circumstances in which (17) is obeyed seems not to be a priori unusual. Perhaps, some have suggested, many well-known substances are ferroelectric, but do not have Curie points at measurable temperatures and have such large coercive fields that the polarization is to all intents and purposes completely frozen in and thus unobservable.

One final point on the two models of ferroelectricity, in relation to some recent experimental evidence, should be made. Many investigators have measured the frequency response of the dielectric constant of BaTiO$_3$ at room temperature, and all have found more or less the behavior of Fig. 10.1 At first glance it appears that here is

a typical relaxation spectrum, indicating that some kind of potential barrier must be surmounted in order for the dielectric polarization to change; the polarizable entities thus cannot follow rapid motion of the field. This seems at first a very strong argument for the "rattling titanium" model of Fig. 9, since this model provides an immediate visualization for this potential barrier. In fact, a reasonable size for the barrier between Ti positions gives a very good value for the relaxation frequency. However, recent measurements by Powles and Jackson12 in England have thrown considerable doubt on this interpretation. These workers have measured the losses as a function of temperature, at frequencies near the relaxation frequency. Some of their results are shown in Fig. 11.

Fig. 11: High-frequency loss in BaTiO$_3$ as a function of temperature (T in degrees C).

12 Proc. Inst. Elec. Eng. part 3, p. 383 (1949).

We see that the losses drop off very sharply above the Curie point; other work, on (Ba-Sr)TiO$_3$ mixtures, indicates that even the remaining losses seem to be illusory, and thus that BaTiO$_3$ is essentially a lossless material above the Curie point. This behavior cannot be explained on the relaxation idea; a relaxation frequency such as is given by the "rattling Ti" model cannot disappear so abruptly simply because of the disappearance of spontaneous polarization. The induced polarization model correctly predicts that there should be no large losses at any frequency lower than the fundamental infrared frequency, which should appear at a few tenths of a millimeter wavelength. The observed losses below the Curie point have been ascribed to two possible phenomena: (1) Slater's suggestion (made in informal discussion) is that these losses are acoustic radiation losses from the small particles of the ceramics upon which all these measurements have been made. This suggestion can be

checked by making measurements at high frequencies on single crystals. (2) The second possibility is that the losses have something to do with domain effects; for instance, they may be domain wall motion relaxation such as is observed in some ferromagnets. This may be wrong in BaTiO$_3$, but is appearing more and more likely in some of the newer materials, in which something very like small-signal hysteresis losses seem to be present. Here also there are indications, however, that the losses are not ordinary relaxation losses.

In this second, theoretical, part of this paper I have tried to present the idea that BaTiO$_3$ and its sister materials are simply victims of the long-predicted and rather well-understood Lorentz "$\frac{4\pi}{3}$ catastrophe." The atomic mechanism leading to the ferroelectric phenomena in these crystals is thus accidental, in the sense that quantitative prediction is rather hopeless, but certainly not surprising. The electronic and ionic polarizability, added together, pass the "critical value" at the Curie point. The results of this occurrence are the various experimentally observed phenomena discussed in the first portion of the paper.
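The numerical side of the argument above (the Curie-Weiss constant K for the dipolar and induced models, and the optical value of $\frac{4\pi}{3}n\alpha$) can be checked in a few lines. This is a minimal sketch under stated assumptions, not part of the original paper: the function names are hypothetical, the dipolar constant is taken as $K = \frac{3}{4\pi}T_c$, the induced constant as $K = 3/(4\pi\delta)$, and the optical parameter follows from combining equations (7) and (8) with $\epsilon = n^2$, which gives $(\epsilon - 1)/(\epsilon + 2)$.

```python
import math

def curie_constant_dipolar(t_c):
    """Dipolar (Curie-law) model: K = (3 / (4*pi)) * T_c."""
    return 3.0 * t_c / (4.0 * math.pi)

def curie_constant_induced(delta):
    """Induced-polarization model: K = 3 / (4*pi*delta),
    where delta is the temperature coefficient of n*alpha per degree."""
    return 3.0 / (4.0 * math.pi * delta)

def lorentz_parameter_from_index(refractive_index):
    """(4*pi/3)*n*alpha obtained from equations (7) and (8) with
    epsilon = refractive_index**2, i.e. (epsilon - 1) / (epsilon + 2)."""
    eps = refractive_index ** 2
    return (eps - 1.0) / (eps + 2.0)

if __name__ == "__main__":
    print("dipolar K for T_c = 400 K:", curie_constant_dipolar(400.0))
    for delta in (1e-5, 1e-6):  # range of coefficients quoted in the text
        print("induced K for delta =", delta, ":", curie_constant_induced(delta))
    print("(4pi/3) n alpha at index 2.4:", lorentz_parameter_from_index(2.4))
```

With $T_c$ = 400 the dipolar model stays near 100, while coefficients of $10^{-5}$ to $10^{-6}$ per degree put the induced model above the measured 10,000.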

Use of Stochastic Methods in Line-Broadening Problems Kyoto Lectures, March 1954

These lectures, given at the very young Yukawa Institute as a final thanks to Japan, for my Fulbright visit in 1953-54, are a summary of my approach to line-broadening problems, the field of my thesis and of much of my early work at Bell. During my stay with Kubo in Tokyo he began to go on with these ideas to develop the correlation function methods which became so important in condensed matter theory. (As did Lax at Bell, but not with so much effect.) My paper with Weiss in Rev. Mod. Phys., in spite of its appearance in a review journal, was not a review. Several reviews appeared on my approach to pressure broadening but none were by me.


USE OF STOCHASTIC METHODS IN LINE-BROADENING PROBLEMS

P. W. ANDERSON

In recent years there has been a revival of interest in the old subject of the breadths of spectral lines. This revival is the result of the experimental opening up of the radiofrequency range of spectra: microwave gas spectroscopy, microwave magnetic resonance (magnetic resonance of electronic magnetic moments) and nuclear radio-frequency magnetic resonance. The radio-frequency range has the great advantage that monochromatic sources are available, so that the contour of a spectral line can be studied in complete detail insofar as the absorption of energy is greater than the background and noise effects.

You are all familiar with the basic problem of line-breadth theory. An isolated quantum-mechanical system (atom or molecule) always has discrete energy levels and radiates or absorbs discrete frequencies. In any macroscopic sample the spectrum will, however, not be discrete but essentially continuous, because the atoms or molecules interact. Observed line breadths contain this effect, which will be the subject of these lectures; they also contain at least three other effects: (1) The breadth due to the interaction with the radiation field, and (2) The instrumental breadths due to slit widths or noise-modulation of the source; these can be essentially completely removed in the radio range. (3) The Doppler breadth due to thermal motions of atoms: in gases this can be overwhelmed by increasing the density, in solids it is negligible. There is then a new opportunity to study the interaction of atoms by quantitative comparison of theories with the wealth of new experiments. Naturally, the existing theories were not completely adequate and there has been a development of new theoretical methods; it is with a group of these methods, as well as with some of the older ones, that I wish to deal today.
These methods all have in common the fact that they approach the problem as a problem in stochastic theory, that is, the theory of functions which vary in a random fashion in time. The interest in these problems from the standpoint of stochastic theory lies in the fact that they represent a rather unusual application of this theory which goes beyond the more normal type of problem. In most physical applications of stochastic theory, what is desired is the distribution of the function in question, given some kind of information about the process; for instance, consider the study of resistances or mobilities.


The question is what the final distribution of velocities is, given the transition probabilities; however, in the line-broadening problem, which is in its essence equivalent to the problem of relaxation, one must in addition find the distribution as a function of time in some way. This will be much clearer expressed mathematically.

What one might consider the most fundamental theorem of the line-broadening theory is the Fourier integral theorem. Suppose that we are studying magnetic or electric dipole radiation from some macroscopic quantum-mechanical system, whose Hamiltonian is $\mathcal{H}$. We can define an operator $\mu(t)$ which is the Heisenberg representation of the total dipole moment operator of the system. The Fourier integral theorem states that the spectrum of the absorption or emission by this dipole moment is

$$I(\omega) = \text{const} \times \mathrm{Tr}\left[\rho\,\left|\int e^{-i\omega t}\mu(t)\,dt\right|^2\right]. \qquad (1)$$

$\rho$ is the equilibrium density matrix. This expression will appear self-evident to most of you, practically being the definition of the spectrum. However, since a simple proof can be given I shall give it: Label the eigenvalues of the macroscopic system $E_i$, $E_j$, etc. Now we all know that

$$I(\omega) = \text{const} \times \sum_{E_i, E_j\ (E_j - E_i = \hbar\omega)} e^{-E_i/kT}\,|(i|\mu|j)|^2 .$$

The limitation on the sum may be removed by defining

$$(i|\mu(t)|j) = e^{-i(E_i - E_j)t/\hbar}\,(i|\mu|j)$$

and selecting the appropriate frequency by using the $\delta$-function expansion

$$\delta(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ixt}\,dt$$

so that

$$I(\omega) = \sum_{E_i,E_j} e^{-E_i/kT}\int e^{-i\omega t}\,(i|\mu(t)|j)\,dt \int e^{i\omega t'}\,(j|\mu(t')|i)\,dt'$$

which, re-expressed as a trace, is exactly expression (1), since $\mathcal{H}\mu(t) - \mu(t)\mathcal{H} = i\hbar\dot{\mu}$ is the definition of $\mu(t)$ and leads to the matrix elements defined above.

The above is still not obviously a problem in stochastic theory. However, it becomes so when one physical assumption common to all the problems of which I speak is made. This is the assumption that

$$\mathcal{H} = \mathcal{H}_0 + \mathcal{H}_1 + \mathcal{H}_m \qquad (2)$$

with the following conditions

$$\text{(a)}\ \ \mathcal{H}_m \gg \mathcal{H}_1, \qquad \text{(b)}\ \ [\mathcal{H}_m, \mathcal{H}_0] = 0, \qquad \text{(c)}\ \ [\mathcal{H}_m, \mu] = 0. \qquad (3)$$

The meaning of the three parts is the following.

(a) $\mathcal{H}_0$ is the unperturbed Hamiltonian of the isolated, static atoms or molecules. In the gaseous pressure-broadening problem, it includes the internal degrees of freedom of the molecules. In the magnetic resonance problem it includes $-\boldsymbol{\mu}\cdot\mathbf{H}_0$, the Zeeman energy, as well as any of the fine and hyperfine structure terms we wish to include.

(b) $\mathcal{H}_1$ is that part of the intermolecular or interatomic interactions which is effective in broadening lines. Usually this is a relatively small part: for instance, in magnetic resonance only magnetic forces can broaden lines, while ordinary exchange repulsion, valence attraction, van der Waals' forces, etc., have no effect. In pressure broadening only anisotropic interactions can broaden microwave lines since these are only of the rotational type. In optical spectra there are also often special circumstances which allow the assumption of $\mathcal{H}_1$ to be small.

(c) Finally, $\mathcal{H}_m$ is the remainder of the Hamiltonian. Always it contains the translational kinetic energy of the atoms and molecules; also, usually the major portion of the interatomic interaction. The choice is made strictly on the basis of the problem at hand.

Now! Why does $\mathcal{H}_m$ commute with $\mathcal{H}_0$ and $\mu$? The answer is different for the two cases. In pressure broadening, the degrees of freedom in $\mathcal{H}_0$ are internal ones, while $\mathcal{H}_m$ is the mutual translational motion as well as that major part of the interaction which does commute with $\mu$ and $\mathcal{H}_0$ (as most of the isotropic interactions will, for other than electronic transitions). In magnetic resonance, $\mathcal{H}_m$ is entirely non-magnetic in character, $\mathcal{H}_0$ entirely magnetic, and the theorem is obvious. There is one special case which we shall consider later: the famous case of "exchange narrowing" in which $\mathcal{H}_m$ is the energy of Heisenberg exchange interactions. I hope you will let us treat this when we come to it. Finally, as to $\mathcal{H}_m \gg \mathcal{H}_1$.
This may be justified on a purely experimental basis. Namely, since $\mathcal{H}_m$ is the total interaction energy it must have the value at least kT per degree of freedom or roughly so, at least when it can be treated classically as in most of our applications. Thus we may think of $\mathcal{H}_m$/atom as of order kT. But observed spectral line breadths are seldom more than 1 cm$^{-1}$ or at worst a few cm$^{-1}$. The reason is easy to see: anything broader is usually too broad and weak to be visible, and in any case in the radio range it is no longer a "line" but a band. But the interaction energies of interest are certainly of the order of the line breadth or smaller and not much larger. Thus $\mathcal{H}_m \gg \mathcal{H}_1$ if T is larger than a few degrees.
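The closing estimate can be checked with one unit conversion. The sketch below is an illustration added here, not part of the lectures; the function names are hypothetical, and the only physical input is the rounded value of Boltzmann's constant divided by hc, about 0.695 cm$^{-1}$ per kelvin. It expresses kT in cm$^{-1}$ and finds the temperature at which kT equals a given line breadth.

```python
# Boltzmann constant divided by h*c, in cm^-1 per kelvin (rounded value).
K_OVER_HC = 0.695

def thermal_energy_wavenumbers(temperature_k):
    """kT expressed in cm^-1 at the given temperature in kelvin."""
    return K_OVER_HC * temperature_k

def temperature_for_breadth(breadth_cm1):
    """Temperature in kelvin at which kT equals a line breadth given in cm^-1."""
    return breadth_cm1 / K_OVER_HC

if __name__ == "__main__":
    print("kT at 300 K, in cm^-1:", thermal_energy_wavenumbers(300.0))
    for breadth in (1.0, 3.0):  # "1 cm^-1 or at worst a few cm^-1"
        print("kT equals", breadth, "cm^-1 at T =", temperature_for_breadth(breadth), "K")
```

At room temperature kT is roughly two hundred times a 1 cm$^{-1}$ breadth, and the two become comparable only at a few kelvin, in line with the statement above.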


This makes it possible now to discuss the problem in stochastic terms. First, we can see that $\mathcal{H}_0$ is easily disposed of. $\mathcal{H}_0$ can always be easily diagonalized; one must do this in order to know what are the spectral lines under discussion. Let its diagonal energies be $E_j^0$, $E_k^0$, etc.; their differences are $\hbar\omega_{jk}^0$. Naturally $E_j^0$, $E_k^0$ are always highly degenerate, but they are distinct, discrete energies. We perform the canonical transformation on all operators $\mu$, $\mathcal{H}_1$, $\mathcal{H}_m$

$$(A(t))_{jk} = A'(t)_{jk}\,e^{i\omega_{jk}^0 t} \qquad (4)$$

and the time-dependence of the primed quantities due to $\mathcal{H}_0$ is eliminated;

$$i\hbar\dot{A}' = [\mathcal{H}'_m + \mathcal{H}'_1,\ A'] \qquad (5)$$

while the spectrum becomes

$$I(\omega) = \text{const} \times \mathrm{Tr}\sum_{j,k} e^{-E_j/kT}\int dt\int dt'\; e^{i(\omega - \omega_{jk}^0)(t - t')}\,\mu'^{*}_{jk}(t)\,\mu'_{jk}(t') \qquad (6)$$

It is assumed that we have found a way to define $\mathcal{H}_0$ so that its eigenvalues are well separated, no matter how degenerate they may be; a very careful discussion of the handling of this separation and what to do if it fails would be out of place here but is contained in a paper by Kubo and Tomita for those who are dubious (also in my paper on pressure broadening for that case). $\mu$ and $\rho$ are still operators but now their dimensionality is reduced to operating within subspaces which are split apart by the various eigenvalues of $\mathcal{H}_0$. This process of separation into lines is sometimes difficult but almost never impossible. We now have

$$I_{jk}(\omega) = \mathrm{Tr}\,\rho_j\int dt\int dt'\;\mu'^{*}_{jk}(t)\,\mu'_{jk}(t')\,e^{i(\omega - \omega_{jk})(t - t')} = \int e^{-i(\omega - \omega_{jk})\tau}\,\varphi_{jk}(\tau)\,d\tau \qquad (7)$$

where

$$\varphi_{jk}(\tau) = \mathrm{Tr}\,\rho_j\,\bigl\langle \mu'^{*}_{jk}(t)\,\mu'_{jk}(t + \tau)\bigr\rangle_{\text{avg over time and statistics}} \qquad (8)$$

is the well known correlation function. Now we are well on the way. Since $\mu$ and $\mathcal{H}_m$ commute,

$$i\hbar\dot{\mu}' = [\mathcal{H}'_1(t),\ \mu'] \qquad (9)$$

while

$$i\hbar\dot{\mathcal{H}}'_1 = [\mathcal{H}'_m(t),\ \mathcal{H}'_1(t)]. \qquad (10)$$

We now make the stochastic assumption: that $\mathcal{H}_m$, which essentially represents the thermal motion of the atoms as a whole, develops in time in an entirely random way relative to the interactions $\mathcal{H}_1$ and the unperturbed motions $\mathcal{H}_0$. Thus the time-behavior of $\mathcal{H}_1$ and of $\mu$ in turn is random in the sense that it is controlled by the random thermal motion $\mathcal{H}'_m$. A real justification of the randomness assumption for $\mathcal{H}_m$ would have to go deep into the fundamentals of statistical mechanics. Here let us say only that it is the identical assumption of randomness which is made in theories of diffusion, conductivity, etc., in calculating transition probabilities. An interesting point is that we expect our approach to fail when $\mathcal{H}_1 \sim \mathcal{H}_m$. This failure is identical with the failure of conductivity theory which leads to superconductivity: when the interactions are stronger than the thermal reservoir, normal irreversible theory breaks down and completely new methods are always required. No known line-breadth theory is valid in the case $\hbar\Delta\omega > kT$. (Just as conductivity theory breaks down for $\hbar/\tau > kT$.)

The problem then is: under the assumption that random thermal motion controls the time-dependence of $\mathcal{H}_1$, find the spectrum as defined above. I think that at this point it is appropriate to begin to specialize to actual cases. First I should like to discuss the pressure-broadening case, then later the magnetic resonance case. The virtue of the pressure-broadening problem is that for a sufficiently rarified gas a nearly exact solution can be found in principle, satisfactory in most cases. Only one further assumption need be made, that the translational motion of the molecules is classical. This assumption is nearly always quite satisfactory, as may be easily verified from the uncertainty principle. One can easily show that wave-packets can be formed which have a negligible extent in both momentum and position. The effect of $\mathcal{H}_m$ on $\mathcal{H}_1$ is then easily calculated on the basis that the molecules follow classical paths relative to each other, so that $\mathcal{H}_1$ becomes simply a function of the internal degrees of freedom and the time. For instance, the interaction between two molecules passing at a reasonably large distance from each other is

$$\mathcal{H}_1 = F(r_{ij},\ \text{internal degrees of freedom}) = F\!\left(\sqrt{b^2 + v^2t^2},\ \text{degrees of freedom}\right)$$
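The classical-path idea can be illustrated with a short numerical sketch. It is an illustration added here, not part of the lectures. The straight-line separation $\sqrt{b^2 + v^2t^2}$ is taken from the formula above, but the inverse-sixth-power form of the interaction, the unit strength and all function names are assumptions made only so that the time integral over one passage can be compared with a known closed form, $3\pi/(8vb^5)$.

```python
import math

def separation(b, v, t):
    """Distance between the molecules on a straight classical path."""
    return math.sqrt(b * b + v * v * t * t)

def interaction(b, v, t, strength=1.0, power=6):
    """Hypothetical interaction strength / r**power (assumed form, illustration only)."""
    return strength / separation(b, v, t) ** power

def collision_integral(b, v, strength=1.0, power=6, span=50.0, steps=100000):
    """Trapezoidal estimate of the time integral of the interaction over one
    passage, from t = -span*b/v to t = +span*b/v."""
    t_max = span * b / v
    h = 2.0 * t_max / steps
    total = 0.5 * (interaction(b, v, -t_max, strength, power)
                   + interaction(b, v, t_max, strength, power))
    for i in range(1, steps):
        total += interaction(b, v, -t_max + i * h, strength, power)
    return total * h

if __name__ == "__main__":
    b, v = 2.0, 3.0
    exact = 3.0 * math.pi / (8.0 * v * b ** 5)  # closed form for the 1/r**6 case
    print("numerical:", collision_integral(b, v), " closed form:", exact)
```

The integral falls off as $b^{-5}$ for this assumed power law, which is why distant passages contribute little.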

The technique which is most appropriate here is to use the method of time-development matrices. In the first place, since in the rarified gas all the molecules may be treated independently we focus our attention on a single radiating molecule and watch the development of its moment as a function of time. From the standpoint of the single molecule $\mathcal{H}_1$ becomes a time-dependent Hamiltonian acting on its internal coordinates. For instance, thinking of a simple case of foreign-gas broadening by polarizable classical molecules of a dipolar molecule the interaction might be $\boldsymbol{\mu}\cdot\mathbf{E}$ (due to induced dipole), which gives something like

$$\frac{\mu^2\cos^2\theta(\mu, r_{ij})}{r_{ij}^6} = \frac{\mu^2\cos^2\theta(\mu, r_{ij})}{(b^2 + v^2t^2)^3}$$

and the part of the Hamiltonian acting on the internal coordinates of the molecule would be just $f(t)\cos^2\theta$. Now we introduce a quantity called the time development matrix $T(t)$ such that

$$\mu'(t) = T^{-1}(t)\,\mu'(0)\,T(t) \qquad (11)$$

in fact for any operator $A'(t) = T^{-1}(t)\,A'(0)\,T(t)$, which satisfies the equations

$$i\hbar\dot{T} = \mathcal{H}'_1(t)\,T$$

and $T(0) = 1$. It is of course unitary. In the case in which there is no time-dependence and in addition $\mathcal{H}_0$ has not been transformed out of the problem, so that $\mathcal{H}_1 \neq f(t)$,

$$T = e^{i\mathcal{H}_1 t/\hbar};$$

however, such an assumption in this case is erroneous: we find ourselves ignoring second and higher-order perturbation effects, which can lead to errors. However, often (as in the case of van der Waals forces) one may include the higher-order perturbations in an effective Hamiltonian, and then in such a case it is often allowable to set

$$T(t) = e^{(i/\hbar)\int_0^t \mathcal{H}'_1(t')\,dt'}. \qquad (12)$$

Still another matrix must be introduced for the problem we consider. This is the matrix $T(t)$ for a special condition: namely, that only one collision has occurred between $t = 0$ and $t = t'$. Namely, it is the solution for $t = +\infty$ of

$$i\hbar\dot{T} = \mathcal{H}'_1 T$$

if $\mathcal{H}_1$ represents only one collision. It is a function of the collision parameter b and any other variables (such as angles) which may be necessary to specify the collision. In the special case in which the T may be written as above

$$T(b) = e^{i\eta(b)}. \qquad (13)$$

This is a form I often use; of course it can always be written this way in any case. $T(b)$ is closely related to the usual collision matrix: it is its analogue under the assumption of classical paths but quantum-mechanical internal motions (of course in the interaction representation).

It is now possible to state our stochastic problem in a very definite form. Namely, it will be remembered that the original problem was to find

$$\varphi_{jk}(\tau) = \mathrm{Tr}\,\bigl\langle \mu^{*}_{jk}(0)\,\mu_{jk}(\tau)\bigr\rangle_{\text{avg}} \qquad (8)$$

but this may be written, for a sufficiently rarified gas that all collisions are independent, and in which the lengths of collisions may be neglected relative to the time between them; and if we introduce the degenerate indices m of the various operators,

$$\varphi_{jk}(\tau) = \sum_{m,m',m'',m'''} \mu^{*}_{jm,km'}(0)\,\Bigl\langle \Bigl(\prod_{\text{coll in }\tau} T^{(j)}\Bigr)^{-1}_{mm''}\,\mu_{jm'',km'''}(0)\,\Bigl(\prod_{\text{coll in }\tau} T^{(k)}\Bigr)_{m'''m'} \Bigr\rangle$$

I introduce a new notation involving operators with four indices operating on two-index quantities. In this notation

$$T^{-1}AT = (T^{-1} \times T)\,A. \qquad (14)$$

In indices

$$(T^{-1}AT)_{mm'} = \sum_{m''m'''} (T^{-1} \times T)_{mm';\,m''m'''}\,A_{m''m'''} = \sum_{m''m'''} \bigl(T^{-1}_{mm''}\,T_{m'''m'}\bigr)\,A_{m''m'''}.$$

The quantity in brackets is the direct product of the operators $T^{-1}$ and $T$, and except for the extra indices, operates entirely as though it were an ordinary linear operation. I like to call this operator U. This technique has then the advantage that instead of the rather inconvenient operation $T^{-1}\mu T$ we may use instead the simpler algebra of ordinary linear operators.
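The four-index notation can be tried out on a small example. The following is a minimal sketch added for illustration and not taken from the lectures. It assumes the index convention $U_{mm';m''m'''} = T^{-1}_{mm''}T_{m'''m'}$, stores U as an ordinary $n^2 \times n^2$ matrix, and uses an arbitrary 2×2 unitary matrix as a stand-in for a collision matrix; all function names are hypothetical.

```python
import cmath
import math

def matmul(a, b):
    """Ordinary matrix product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(a):
    """Conjugate transpose; for a unitary matrix this is also its inverse."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def direct_product(t_inv, t):
    """Four-index operator U[(m, m'), (m'', m''')] = t_inv[m][m''] * t[m'''][m'],
    stored as an (n*n) x (n*n) matrix."""
    n = len(t)
    return [[t_inv[m][m2] * t[m3][m1] for m2 in range(n) for m3 in range(n)]
            for m in range(n) for m1 in range(n)]

def apply_four_index(u, a):
    """Apply U to the two-index quantity A by treating A as a vector of length n*n."""
    n = len(a)
    flat = [a[i][j] for i in range(n) for j in range(n)]
    out = [sum(u[r][c] * flat[c] for c in range(n * n)) for r in range(n * n)]
    return [out[i * n:(i + 1) * n] for i in range(n)]

def sample_unitary(theta, phi):
    """A 2x2 unitary matrix, used only as a stand-in for a collision matrix T."""
    c, s = math.cos(theta), math.sin(theta)
    phase = cmath.exp(1j * phi)
    return [[c, -s * phase.conjugate()], [s * phase, c]]

def max_abs_difference(a, b):
    """Largest elementwise difference between two square matrices."""
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))

if __name__ == "__main__":
    t = sample_unitary(0.7, 0.3)
    a = [[1.0, 2.0j], [-2.0j, 3.0]]
    direct = matmul(matmul(dagger(t), a), t)                    # T^-1 A T
    via_u = apply_four_index(direct_product(dagger(t), t), a)   # (T^-1 x T) A
    print("largest difference:", max_abs_difference(direct, via_u))
```

Because U acts as an ordinary matrix, two successive transformations are represented by the matrix product of their U operators, which is what makes a product over collisions convenient to average.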

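The expression for $\varphi_{jk}(\tau)$ becomes, then

$$\varphi_{jk}(\tau) = \mathrm{Tr}\,\Bigl[\mu^{*}_{jk}(0)\,\Bigl\langle \prod_{\text{collisions}} U^{jk}(\sigma)\Bigr\rangle\,\mu_{jk}(0)\Bigr]$$

where

$$U^{jk}(\sigma) = T^{(j)\,-1} \times T^{(k)}. \qquad (15)$$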

The average may be taken by a simple differential-equation technique. We calculate

$$\langle\mu_{jk}(\tau + d\tau)\rangle - \langle\mu_{jk}(\tau)\rangle = d\langle\mu_{jk}(\tau)\rangle$$

Now the rarified-gas assumption tells us that the averages over collisions in $d\tau$ and collisions in $\tau$ are independent, since the collisions are short and well-separated. We may then assume $d\tau$ to be short compared to the time between collisions, and long compared to the duration of a collision. First we calculate the quantity (actually $\langle\mu_{jk}(\tau)\rangle$) multiplying $\mu^{*}_{jk}(0)$:

$$\frac{d}{d\tau}\langle\mu_{jk}(\tau)\rangle = \frac{1}{d\tau}\,\Bigl\langle \prod_{\text{collision in } d\tau} U^{jk}(\sigma) - 1 \Bigr\rangle_{\text{avg}}\,\langle\mu_{jk}(\tau)\rangle \qquad (16)$$

(since the second average is just the quantity itself). Now let us consider the statistics of the collisions. The probability of a collision whose parameters fall within the limits $d\sigma$ about a given value ($d\sigma = 2\pi b\,db\ \times$ (possibly) functions of angles, $\sin\theta\,d\theta\,d\varphi$ etc., functions of the state of the colliding molecule with appropriate Boltzmann factors, and so forth) is $n(v)\,v\,dv\,d\sigma\,d\tau$, where $n(v)$ is the number-density/cc with velocity v. It is usually an unimportant assumption quantitatively to assume $v = \bar{v}$, $\int n(v)\,dv = n$, the total density. Then the average becomes

$$\frac{d\langle\mu_{jk}\rangle}{d\tau} = -n\bar{v}\int d\sigma\,\bigl(1 - U^{jk}(d\sigma)\bigr)\,\langle\mu_{jk}\rangle. \qquad (17)$$

This equation may of course be solved in general, and we get $\mu_{jk}$ as a sum of decaying exponentials;

$$\mu_{jk} = \sum_i a_i\,e^{-n\bar{v}\sigma_i\tau}$$

where $\sigma_i$ are the eigenvalues of the constant operator

$$\sigma = \int d\sigma\,\bigl(1 - U^{jk}(d\sigma)\bigr). \qquad (18)$$
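It may help to see what one decaying exponential of this kind does to the spectrum. The sketch below is an illustration added here, not part of the lectures. It assumes a correlation function proportional to $e^{-n\bar{v}\sigma\tau}$ for $\tau > 0$, with its complex conjugate for $\tau < 0$, and a transform kernel $e^{-i(\omega - \omega_0)\tau}$; the sign convention of the kernel, the sample numbers and the function names are all assumptions. Under them the line is a Lorentzian whose half-width is $n\bar{v}\,\mathrm{Re}\,\sigma$ and whose center is displaced by $n\bar{v}\,\mathrm{Im}\,\sigma$, with the direction of the displacement depending on the sign convention.

```python
import cmath

def line_shape(omega, omega0, n, v_bar, sigma):
    """Closed form of 2*Re( integral from 0 to infinity of
    exp(-1j*(omega-omega0)*tau) * exp(-n*v_bar*sigma*tau) dtau )."""
    gamma = n * v_bar * sigma                 # complex decay rate n * v_bar * sigma
    return 2.0 * (1.0 / (gamma + 1j * (omega - omega0))).real

def line_shape_numeric(omega, omega0, n, v_bar, sigma, steps=200000):
    """The same quantity by trapezoidal integration, as a check on the closed form."""
    gamma = n * v_bar * sigma
    tau_max = 40.0 / gamma.real               # integrand is negligible beyond this
    h = tau_max / steps
    rate = gamma + 1j * (omega - omega0)
    total = 0.5 * (1.0 + cmath.exp(-rate * tau_max))
    for i in range(1, steps):
        total += cmath.exp(-rate * i * h)
    return 2.0 * (total * h).real

if __name__ == "__main__":
    n, v_bar, sigma, omega0 = 1.0, 1.0, 0.5 + 0.2j, 10.0   # illustrative numbers only
    for offset in (-1.0, -0.7, -0.2, 0.3, 1.0):
        omega = omega0 + offset
        print(offset, line_shape(omega, omega0, n, v_bar, sigma),
              line_shape_numeric(omega, omega0, n, v_bar, sigma))
```

With the sample numbers the peak sits at an offset of $-0.2$ and falls to half its height $0.5$ away on either side. A sum of several exponentials would give a sum of such Lorentzians.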

It turns out, however, that only one of the eigenvalues of this quantity enters in the final result, because of group theory. A very simple way of seeing this is to note that the transformation properties under rotation of $\langle\mu_{jk}(\tau)\rangle$ are just exactly those of $\mu_{jk}(0)$, since the average over all types of collisions is rotationally isotropic: collisions occur from all directions with equal probability, so their effect is isotropic. But all of you know the theorem that matrix elements of a vector-component internal to a degenerate state are determined. Thus $\langle\mu_{jk}(\tau)\rangle = \text{const} \times \mu_{jk}(0)$ and

$$\frac{d\mu_{jk}}{d\tau} = \text{const} \times \mu_{jk}.$$

Thus

$$\mu_{jk} = \mu_{jk}(0)\,e^{-n\bar{v}\sigma_1\tau} \qquad (19)$$

where $\sigma_1$ is a single eigenvalue of the $\sigma$ matrix. A simple exercise in group theory, or actually the simple operation

$$\sigma_1 = \frac{\mathrm{Tr}\,\bigl[\mu^{*}_{jk}(0)\,\sigma\,\mu_{jk}(0)\bigr]}{\mathrm{Tr}\,|\mu_{jk}(0)|^2} \qquad (20)$$

will suffice to compute this eigenvalue. I do not want to bother you here with numerical details of results. The above treatment has actually given satisfactory results in some cases in which the interatomic forces are well known and the molecules simple.

During my stay in Japan I have been attempting to apply the above techniques to more general problems. Since the relaxation behavior of every matrix, not only $\mu$, can be computed by them, in particular $\rho$, the density matrix, one expects that the technique really contains the solution of the problems of energy transfer and relaxation. However, the classical path method is not too well adapted to such problems because one can show that it does not give a quite satisfactory treatment of thermal equilibrium: there is, in fact, no differentiation between upward and downward energy transfers. I am sure that this snag can be surmounted in time by some simple new technique and the whole problem solved for sufficiently rarefied gases.

I should like to discuss one more pressure-broadening problem before going on to magnetic resonance. This is again of stochastic theory interest because it stems from the very old method of Markoff for random walk distribution functions, which has itself been directly used by Holtsmark and later Margenau for special situations. A simplification of the basic problem of calculating (8), which is often extremely valuable and very seldom troublesome quantitatively, is to act as though the levels $j$ and $k$ were non-degenerate and transitions were not allowed: that is, the so-called "adiabatic assumption" that collisions take place adiabatically. Then the only thing happening during a collision is an adiabatic change in the energy levels $E_j$ and $E_k$, and

$$\mu_{jk}(t) = \mu_{jk}(0)\,\exp\Big(i\int_0^t\frac{E_j-E_k}{\hbar}\,dt'\Big) = \mu_{jk}(0)\,\exp\Big(i\int_0^t\omega_{jk}(t')\,dt'\Big).$$

$\omega_{jk}$ will, of course, be some function of the relative position of perturbing molecules and radiating molecules. Good results can be obtained from this assumption even in cases in which it is obviously untrue, by substituting for $\omega_{jk}(t')$ some mean squared value for the substates calculated on a static basis, or some similar trick. A good fact to remember is that the adiabatic assumption never leads to large errors unless one is directly concerned in calculating transition probabilities.

Now it turns out that an additional, rather reasonable assumption removes the assumption of rarefaction of the gas completely; namely, that when a given molecule is interacting with more than one perturber, the frequency perturbation is the sum of those due to the separate ones:

$$\omega_{jk}(1,\ldots,N) = \sum_{n=1}^{N}\omega_{jk}(R_n),$$

where $R_n$ is the distance of the $n$th perturber from the initial molecule. For a sufficiently rarefied gas this is a pretty good assumption because only one perturber at a time is near the given atom most of the time. Also, for second-order van der Waals' forces the additivity assumption is pretty good. Then the correlation function becomes

$$\varphi(\tau) = \Big\langle e^{i\int_0^\tau\omega_{jk}\,dt}\Big\rangle_{\rm avg} = \Big\langle\prod_n e^{i\int_0^\tau\omega_{jk}(R_n)\,dt}\Big\rangle_{\rm avg}.$$

We assume that the substance is still in the gaseous state so that the distributions of the various perturbers are independent (in the liquid state molecules knock against each other and have some correlation). Then we may take a separate average for each term and get

$$\varphi(\tau) = \Big[\Big\langle e^{i\int_0^\tau\omega_{jk}(R(t))\,dt}\Big\rangle_{\rm avg}\Big]^{N},$$

where $N$ is the total number of perturbers. Eventually we want to set $N\to\infty$ but $N/V$ = number density = $n$ = const.

How do we do the average? It is an average over all the possible paths $R(t)$ of a given perturbing particle. We can specify these by giving the positions $R(0)$ at $t=0$ and the vectorial velocities, as well as the quantum state of the perturbing particle. For isotropic, classical perturbers this state is constant and the direction of approach immaterial; for such perturbers we may (again assuming all velocities equal to $\bar v$ with little error) take all velocities in the same direction, and we need average only over the initial position $R(0)$. For more complicated situations one could easily find an averaging technique which would be appropriate. For our simple special case, however,

$$\langle\;\cdot\;\rangle_{\rm avg} = \frac{1}{V}\int dV_{R(0)}\quad\text{(all space)}.$$

[Figure: coordinate system for the path of a perturber, with labels $x$, $v$, $b$, $R(0)$, $R(t) = R(0) + vt$, and "radiation".]

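I like to express the position in the coordinate system shown above, in terms of $\phi$, $b$, and $x$; then

$$|R(t)|^2 = b^2 + (x - vt)^2,\qquad \omega_{jk}(R(t)) = \omega_{jk}\big(\sqrt{b^2+(x-vt)^2}\big),$$

while

$$\int dV = \int_0^\infty b\,db\int_0^{2\pi}d\phi\int_{-\infty}^{\infty}dx,$$

so

$$\Big\langle e^{i\int_0^\tau\omega\,dt}\Big\rangle_{\rm avg} = \frac{1}{V}\int dV\,\exp\Big(i\int_0^\tau\omega\big(\sqrt{b^2+(x-vt)^2}\big)\,dt\Big).$$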

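A trick is useful here. As $V\to\infty$ the above integral diverges, as does the denominator. This may be corrected by writing

$$\frac{1}{V}\int dV\,\Big(1-\big(1-e^{i\int_0^\tau\omega\,dt}\big)\Big) = 1-\frac{V'(\tau)}{V},$$

where

$$V'(\tau) = \int_0^\infty b\,db\int_0^{2\pi}d\phi\int_{-\infty}^{\infty}dx\,\Big(1-\exp\Big(i\int_0^\tau\omega\big(\sqrt{b^2+(x-vt)^2}\big)\,dt\Big)\Big).$$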

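Now the limiting process is easy:

$$\lim_{N,V\to\infty}\Big[1-\frac{V'(\tau)}{V}\Big]^{N} = \lim_{V\to\infty}\Big[1-\frac{V'(\tau)}{V}\Big]^{nV} = e^{-nV'(\tau)}.$$

And

$$I(\omega)\propto\int_{-\infty}^{\infty}e^{-i\omega\tau}\,e^{-nV'(\tau)}\,d\tau.$$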
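The limiting process above can be checked numerically. The following is a minimal sketch, not part of the lecture: the function name and the sample value chosen for $V'(\tau)$ are hypothetical, and a complex value is used only because $V'$ is complex in general.

```python
import cmath


def finite_volume_average(v_prime: complex, density: float, volume: float) -> complex:
    """Return [1 - V'/V]**N for N = density * volume independent perturbers."""
    n_perturbers = density * volume
    return (1 - v_prime / volume) ** n_perturbers


if __name__ == "__main__":
    v_prime = 0.3 + 0.2j  # hypothetical value of V'(tau), for illustration only
    density = 2.0         # n = N / V, held fixed while V grows
    for volume in (1e1, 1e3, 1e6):
        print(volume, finite_volume_average(v_prime, density, volume))
    # The values above should approach the exponential limit exp(-n V').
    print("limit", cmath.exp(-density * v_prime))
```

As the volume grows at fixed density, the finite-volume product approaches $e^{-nV'}$, which is why only the number density appears in the final correlation function.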

Let me just calculate two limiting cases for $V'$. For small $\tau$,

$$\int_0^\tau\omega\big(\sqrt{b^2+(x-vt)^2}\big)\,dt = \tau\,\omega(R_0)$$

and

$$V' = \int dV\,\big(1-e^{i\omega(R_0)\tau}\big).$$
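As a worked illustration of the small-$\tau$ result, one can evaluate $V'$ in closed form for a specific interaction. The form $\omega(R_0) = C/R_0^6$ (an isotropic van der Waals-type shift with constant $C>0$) is an assumption introduced here for illustration, and $\tau>0$ is assumed. The derivation uses the standard integrals $\int_0^\infty u^{-3/2}(1-\cos u)\,du = \int_0^\infty u^{-3/2}\sin u\,du = \sqrt{2\pi}$.

```latex
V'(\tau) = \int_0^\infty 4\pi R_0^2\,dR_0\,\bigl(1 - e^{iC\tau/R_0^6}\bigr)

% substitute u = C\tau/R_0^6, so that
% 4\pi R_0^2\,dR_0 = -\tfrac{2\pi}{3}\,(C\tau)^{1/2}\,u^{-3/2}\,du
V'(\tau) = \frac{2\pi}{3}\,(C\tau)^{1/2}\int_0^\infty u^{-3/2}\bigl(1 - e^{iu}\bigr)\,du
         = \frac{2\pi}{3}\,(1 - i)\,(2\pi C\tau)^{1/2}

\varphi(\tau) = e^{-nV'(\tau)}
              = \exp\!\Bigl[-\frac{2\pi}{3}\,n\,(1 - i)\,(2\pi C\tau)^{1/2}\Bigr]
```

Under this assumption the correlation function decays as the exponential of $\sqrt{\tau}$ rather than of $\tau$, and because every $\omega(R_0)$ has the same sign the phase factor is one-sided, in line with the remark below that only one wing of the line appears.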

This result is identical with that of the so-called "statistical" theory, which makes a direct application of Markoff's method under the assumption that $I(\omega)$ is the distribution of momentary frequencies $\omega(R)$, with no effects of $d\omega/dt$ or any higher rates of change. The statement is often made that this approximation is valid on the wings of the line, or for sufficiently high pressures. Really the criterion is one of sufficiently small $\tau$. This is roughly equivalent to the other two, as we see from the expression for $I(\omega)$, but in one important case, namely that in which all the $\omega(R)$ are of the same sign, this necessarily gives only one wing of the line, the other vanishing completely, and even on the wing, at high densities, we must calculate a first correction. This is not too hard with the present technique and a method has been found. These calculations are fairly long and have not been published yet, so I can give you only the result: that the intensity of the "wrong" wing falls off exponentially as a power of $\omega$ close to the first:

[Figure: line shape, with curves labeled "statistical theory" and "1st correction".]

This result is entirely different from that obtained by Lindholm by a different method, which I am sure is incorrect. The correction terms on the "regular" wing agree with those published by Holstein.


The other limiting case is also easily derived: large $\tau$. For $\tau$ large compared with the duration of the collision, only those $R(0)$ need be considered which are such that a collision occurs. The picture is

[Figure: a collision occurs for starting positions in the range $-v\tau < x < 0$.]

For all such $x$ the integral $\int_0^\tau\omega(R)\,dt$ may be approximated by $\int_{-\infty}^{\infty}\omega(R)\,dt = \eta(b)$, a function of $b$ alone, giving the "phase-shift" due to a single collision of parameter $b$. If $x<-v\tau$ or $x>0$ no collision occurs:

$$\int_0^\tau\omega(R)\,dt = 0,\qquad 1-e^{i\int_0^\tau\omega(R)\,dt} = 0.$$

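Then

$$V' = \int_0^\infty b\,db\int_0^{2\pi}d\phi\int_{-v\tau}^{0}dx\,\big(1-e^{i\eta(b)}\big) = 2\pi v\tau\int_0^\infty b\,db\,\big(1-e^{i\eta(b)}\big);$$

for $\tau<0$, $\eta\to-\eta$, we get

$$V'(-\tau) = V'^{*}(\tau).$$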
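The large-$\tau$ result $V' = 2\pi v\tau\int_0^\infty b\,db\,(1-e^{i\eta(b)})$ can be evaluated numerically once a phase shift $\eta(b)$ is specified. The sketch below is illustrative only: the function names are hypothetical, and the step-function phase shift is an assumed toy model chosen because its integral is known exactly, $\pi b_0^2\,(1-e^{i\eta_0})$.

```python
import cmath
import math
from typing import Callable


def impact_cross_section(phase_shift: Callable[[float], float],
                         b_max: float, steps: int = 20000) -> complex:
    """Midpoint-rule estimate of 2*pi * integral_0^b_max b (1 - exp(i*eta(b))) db."""
    db = b_max / steps
    total = 0j
    for k in range(steps):
        b = (k + 0.5) * db  # midpoint of the k-th interval
        total += b * (1 - cmath.exp(1j * phase_shift(b))) * db
    return 2 * math.pi * total


def step_phase_shift(b: float, eta0: float = 1.0, b0: float = 1.0) -> float:
    """Assumed toy model: constant phase shift eta0 for b < b0, zero outside."""
    return eta0 if b < b0 else 0.0


if __name__ == "__main__":
    sigma = impact_cross_section(step_phase_shift, b_max=2.0)
    print("numerical:", sigma)
    print("analytic: ", math.pi * (1 - cmath.exp(1j)))
```

The result is complex in general: multiplied by $n v\tau$ in the exponent of the correlation function, its real part damps $\varphi(\tau)$ while its imaginary part only changes the phase.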

This gives the impact theory limit:

$$\sigma = 2\pi\int_0^\infty b\,db\,\big(1-e^{i\eta(b)}\big).$$

This is the result our previous theory would have given if specialized to this case. It can be shown that the first correction to $V'$ is a constant and that $V'$ can be expanded as $a\tau + b + c/\tau + \ldots$, but $b$ is a very complicated integral and I have not succeeded in evaluating it completely as yet. Only this last step prevents a numerically satisfactory solution for the entire problem, since two terms for $V'$ for $\tau$ small and for $\tau$ large should be easy to fit together numerically, giving the whole of $V'$.

Now I should like to turn to the discussion of magnetic resonance line breadths. The magnetic resonance case, while describable in the same stochastic terms, has a number of essential differences. One of the most vital is that the perturbation $\mathcal{H}'(t)$ is not intermittent as in the pressure broadening case, but continuously present. This destroys the simplifications we could use when we could separate the collisions. Also, a new phenomenon appears, that of narrowing, because of the following consideration. In pressure broadening the rate of collisions determines the breadth ($n\bar v\sigma$, you remember) and $\mathcal{H}'$ during a collision is usually fairly large; it must be at least such that

$$\frac{\mathcal{H}'\,t_{\rm coll}}{\hbar} > 1,\qquad\text{i.e.}\qquad\frac{\mathcal{H}'}{\hbar} > \frac{1}{t_{\rm coll}}.$$

Thus we are never confronted with a situation in which the time-rate of change of frequency is greater than $\mathcal{H}'/\hbar$. However, with $\mathcal{H}'$ constantly present there is no reason why its time-rate of change need be less than $\mathcal{H}'/\hbar$ itself, since there is no effective lower limit on $\mathcal{H}'$ relative to the rates of motion. When $\mathcal{H}'$ changes rapidly relative to its own magnitude (which determines the line-breadth, and therefore the important range of values $\tau$ in $\varphi(\tau)$) it is possible for $\mathcal{H}'$ to average to a small value over these intervals of time, and for the line to become narrower rather than broader as the motion becomes more rapid.

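Explicit picture:

[Figure: line breadth plotted against rate of motion, with $\mathcal{H}'/\hbar$ marked, comparing the case of $\mathcal{H}'$ intermittently present with that of $\mathcal{H}'$ constantly present.]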

Interest at the present time is focused on this "narrowing" range. I should like to discuss two topics of interest in this range: Kubo's quantum mechanical theory for $\mathcal{H}'$ continuously distributed; and my own simplified theory for narrowing of fine structure. The latter may be less familiar to most of you so I shall start with it and go on to Kubo's theory if I have time.

In my theory I start with the same simplifying assumption that I used in the last pressure-broadening case: the assumption of adiabatic motion. This is of course invalid in magnetic resonance in most cases, because there are important off-diagonal elements of the perturbation Hamiltonian (not however for non-identical atoms: the dilution problem), but in the same spirit as we discussed before the adiabatic technique may be modified by using the correct mean values of frequencies or mean squares including off-diagonal terms, and the resulting errors are usually small. Then the problem is to calculate

$$\varphi(\tau) = \Big\langle e^{i\int_0^\tau\omega(t)\,dt}\Big\rangle_{\rm avg}$$

where $\omega(t)$ is a stochastic function about which we may be able to know or guess something from the physical situation. For instance, in a crystalline solid in which $\mathcal{H}'$ is the dipolar interactions of local fields we guess that $\omega$ has a Gaussian distribution with the well-known mean squared breadth as calculated by Van Vleck. My paper on exchange narrowing dealt with such Gaussian cases, but they are far better treated by Kubo's method so I won't discuss them here.

There are a large number of types of cases which are, however, not properly treated as yet by Kubo's method. Among these are all cases of fine structure, in which $\omega(t)$ may take on one or another of a set of discrete values, but the motion is such as to cause transitions among them (e.g., exchange between electrons with nuclear hyperfine structure, in which the nuclei are fixed in definite states but the electrons change places; diffusion in solids with some kind of crystal fine structure, etc.). Also, most cases in which the motion is the same kind of barrier-jumping, like diffusion in solids, are not well approximated by Gaussian methods. At present the only attack on such problems is to make the opposite assumption that the random function $\omega(t)$ is Markoffian. A Markoffian function $f(t)$ has the property that the probability $W(f_2|f_1,t)$ that, given $f_1$ at $t=0$, the value is $f_2$ at time $t$ is independent of the earlier history (the values at $t<0$), no matter what the value of $t$ is. Such a process can be thought of as proceeding in discrete jumps, whose probability is determined by the value only at the present time.

Let us present some of the basic mathematics of Markoffian functions. The entire process is determined by the second-order transition probabilities $W(f_2|f_1,\Delta t)$ which, if the process is to act sensibly at $\Delta t\to0$, must be capable of being written

$$W(f_2|f_1,\Delta t) = \delta_{f_1f_2} - \Pi(f_2,f_1)\,\Delta t,$$

where now $\Pi$ alone controls the process completely. The process can be expected to be stationary if it is a physical one: that is, there is a probability function $W_1(f)$ such that

$$\sum_{f_1}W_1(f_1)\,W(f_2|f_1,t) = W_1(f_2)$$

or,

$$\sum_{f_1}\Pi(f_2,f_1)\,W_1(f_1) = 0$$

(the above is a form of the Smoluchowski equation: in general, the time-rate of change of a distribution $W'$ is $\dfrac{dW'(f)}{dt} = \dfrac{W'(f,t+\Delta t)-W'(f,t)}{\Delta t} = -\sum_{f'}\Pi(f,f')\,W'(f')$, and this is it). Another theorem is that the probability of taking on at least one value is unity:

$$\sum_{f_2}W(f_2|f_1,\Delta t) = 1\;\longrightarrow\;\sum_{f_2}\Pi(f_2,f_1) = 0.$$

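This shows that $\Pi$ is a non-symmetric matrix with the eigenvector $W_1$ on the right belonging to the eigenvalue zero.

The following simplified technique of calculating $\varphi(\tau)$ [...] the final state $\omega_2$, and realize that eventually we must average over $\omega_1$ with probability $W_1$ and over $\omega_2$ with equal probability for all cases (i.e., multiply the matrix $F(\tau)_{\omega_1,\omega_2}$ by the two eigenvectors of $\Pi$ on the left and right):

$$F(\tau+\Delta\tau)_{\omega_1,\omega_3} = \sum_{\omega_2}F(\tau)_{\omega_1,\omega_2}\,W(\omega_3|\omega_2,\Delta\tau)\,e^{i\omega_3\Delta\tau} = F(\tau)_{\omega_1,\omega_3}\,e^{i\omega_3\Delta\tau} - \sum_{\omega_2}\Pi(\omega_3,\omega_2)\,F(\tau)_{\omega_1,\omega_2}\,\Delta\tau$$

or,

$$\frac{d}{d\tau}\big(F(\tau)\big)_{\omega_1,\omega_3} = i\omega_3\,F(\tau)_{\omega_1,\omega_3} - \sum_{\omega_2}F(\tau)_{\omega_1,\omega_2}\,\Pi(\omega_3,\omega_2).$$

In matrix language,

$$\frac{dF}{d\tau} = i\omega F - \Pi F,\qquad\varphi(\tau) = \mathbf{1}\cdot F\cdot W_1,$$

if we define a diagonal matrix $i\omega$ and $\Pi$, $F$ in obvious ways.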
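The matrix equation $dF/d\tau = (i\omega-\Pi)F$ can be made concrete with the smallest possible example. The sketch below assumes a two-state model that is not worked out in the lecture: frequencies $+\delta$ and $-\delta$, an equal jump rate in each direction (so $W_1 = (1/2, 1/2)$), and $F(0)$ equal to the unit matrix. All function and variable names are hypothetical.

```python
import cmath
import math


def two_state_eigenvalues(delta: float, rate: float):
    """Eigenvalues of M = i*omega - Pi for the two frequencies +delta and -delta.

    Pi = [[rate, -rate], [-rate, rate]] (each column sums to zero), so
    trace(M) = -2*rate and det(M) = delta**2.
    """
    root = cmath.sqrt(rate * rate - delta * delta)
    return -rate + root, -rate - root


def correlation_function(tau: float, delta: float, rate: float) -> complex:
    """phi(tau) = 1 . exp(M*tau) . W1 with W1 = (1/2, 1/2).

    exp(M*tau) is built with Sylvester's formula for a 2x2 matrix with
    distinct eigenvalues, so this is not valid exactly at rate == delta.
    """
    m = [[1j * delta - rate, rate],
         [rate, -1j * delta - rate]]
    lam1, lam2 = two_state_eigenvalues(delta, rate)
    e1, e2 = cmath.exp(lam1 * tau), cmath.exp(lam2 * tau)
    w1 = (0.5, 0.5)
    phi = 0j
    for row in range(2):          # sum over final states (left vector of ones)
        for col in range(2):      # weight initial states with W1
            ident = 1.0 if row == col else 0.0
            entry = ((m[row][col] - lam2 * ident) * e1
                     - (m[row][col] - lam1 * ident) * e2) / (lam1 - lam2)
            phi += entry * w1[col]
    return phi


if __name__ == "__main__":
    for rate in (0.01, 0.5, 50.0):
        print(rate, two_state_eigenvalues(1.0, rate))
```

For slow jumping the eigenvalues are close to $\pm i\delta - \text{rate}$, two lines at the unperturbed frequencies each broadened by the jump rate. For fast jumping one eigenvalue approaches $-\delta^2/(2\,\text{rate})$, a single central line that becomes narrower as the motion becomes more rapid.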

Now clearly we have the complete solution if we can but work it out, since we need merely find the eigenvalues $\lambda_j$ of $(i\omega-\Pi)_{\omega_1,\omega_2}$ and the appropriate coefficients of their eigenvectors relative to $W_1$, and then we shall have

$$\varphi(\tau) = \sum_jC_j\,e^{\lambda_j\tau},$$

and $I(\omega)$, in turn, will be a sum of resonance distributions

$$I(\omega)\propto\mathrm{Re}\sum_j\frac{C_j}{i\omega-\lambda_j}.$$

This general procedure, while clear and straightforward in principle, cannot always be carried out. For more than 4 $\omega_j$'s we get in general an insoluble secular equation. Therefore it pays to calculate first- and second-order approximations in the two cases (a) $\Pi>\omega$; (b) $\omega>\Pi$.

Case (b) is simple: $\omega$ is already a diagonal matrix, $(\omega)_{jk} = \omega_j\delta_{jk}$. The first-order approximation is

$$\lambda_j = i\omega_j - \Pi_{jj}.$$

$\Pi_{jj}$ is the rate of transitions away from the given frequency $j$, i.e., the inverse lifetime ($\tau_j = 1/\Pi_{jj}$) of that frequency. The coefficients of these approximate eigenvectors are just $W_1(\omega_j)$: one gets in zeroth approximation for slow motion $\varphi(\tau) = \sum_jW_1(\omega_j)\,e^{i\omega_j\tau}$, the unperturbed distribution; and second, a Lorentz distribution about each unperturbed eigenvalue. Next we get the second-order perturbation: [...]

The next term is thus a shift of the center frequencies toward the center of the distribution, since $\Pi$ must lead more often to the center than the wings. This shift could have been predicted from the theorem that $\overline{\Delta\omega^2}$ is independent of the rate of motion $\Pi$ (provable directly from the expression for [...]).

For details of actual cases I should refer you to my paper to appear in the Journal of the Physical Society of Japan.

A most interesting case is the one in which the possible frequencies $\omega_j$ form a continuous distribution. In this case the perturbation techniques break down for the limit $\Pi\to0$, so that (as Kubo has pointed out to me) the best solution is to use the time differential equation directly, which can be done in simple cases.

As a last demonstration of the stochastic techniques which can be applied in these problems I should like to discuss Kubo's technique for the so-called "Gaussian" case in which the distribution of $\omega$ is continuous and the motion not of the abrupt "Markoffian" type but more gradual. Here it is possible to proceed without the initial assumption of adiabatic motion. The technique is closely related to the so-called "moment method". In the moment method we take the expression

$$\varphi(\tau) = \mathrm{Tr}\big(\mu(\tau)\mu^*(0)\big)_{\rm avg}\big/\mathrm{Tr}\big(\mu^2\big)_{\rm avg}$$

and calculate $\mu(\tau)$ as a power series in $\tau$. There are the well-known relationships

$$\frac{d^2\varphi}{d\tau^2}\Big|_{\tau=0} = -\overline{\omega^2},\qquad\frac{d^4\varphi}{d\tau^4}\Big|_{\tau=0} = \overline{\omega^4},$$

and so forth; and since in the absence of exchange or motion $\overline{\omega^4}\simeq3(\overline{\omega^2})^2$ one may set approximately

$$I(\omega)\simeq e^{-\omega^2/2\overline{\omega^2}}.$$

Kubo has observed that this is equivalent to expanding [...] ($\tau\to\infty$) must approach zero.

In the case in which there is motion the second moment remains the same but the expansion in this form becomes invalid. This is because [...]; but the averaging process after a finite interval $\tau$ has intervened causes this expression to fail rapidly for $\tau\neq0$. A more hopeful expansion can be obtained by calculating $\mu(\tau)$ by means of its true equation of motion

$$i\hbar\frac{d\mu}{dt} = [\mathcal{H}'(t),\mu]$$

rather than

$$i\hbar\frac{d\mu}{dt} = [\mathcal{H}'(0),\mu].$$

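The true equation may be solved by successive approximations:

$$\mu_0 = \mu(0),\qquad\dot\mu_1 = \frac{1}{i\hbar}[\mathcal{H}'(t),\mu(0)],\ \ldots;$$

we expand not in terms of $\tau$ but of $\mathcal{H}'$. We then set $\langle\mu^*(\tau)\mu(0)\rangle_{\rm avg}$ [...] $+\ldots$ (again by the exponential assumption; $\langle\mu_1\rangle_{\rm avg}$ vanishes on general principles). Explicitly,

$$[\mathcal{H}'(t),[\mathcal{H}'(t'),\mu]] = \mathcal{H}'(t)\mathcal{H}'(t')\mu - \mathcal{H}'(t)\mu\mathcal{H}'(t') - \mathcal{H}'(t')\mu\mathcal{H}'(t) + \mu\mathcal{H}'(t')\mathcal{H}'(t).$$

Now one may set

$$\langle\mathcal{H}'(t)\mathcal{H}'(t')\rangle_{\rm avg} = \langle\mathcal{H}'^2\rangle\,f(t'-t),$$

where [...], which can be shown to be

$$\varphi(\tau) = \exp\Big[-\overline{\omega^2}\int_0^\tau(\tau-t)\,f(t)\,dt\Big].$$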

This is the identical answer which is obtained by the adiabatic method; but Kubo's technique gives an explicit technique for calculating the correlation function $f$ rather than allowing for guesswork. The two techniques turn out to coincide exactly in the one case, exchange narrowing, in which both can be carried out; but obviously Kubo's has a far sounder basis. This expression leads to the narrowing phenomenon in well-known ways. For $\mathcal{H}'$ slowly varying, $f\simeq1$ and

$$\int_0^\tau(\tau-t)\,f(t)\,dt = \tau^2/2,$$

which leads to the unperturbed Gaussian distribution. For $\mathcal{H}'$ rapidly varying, we set

$$\int_0^\infty f(t)\,dt = \frac{1}{\omega_e}$$

and $\int_0^\infty t\,f(t)\,dt$ ($\sim1/\omega_e^2$) negligible, so

$$\varphi(\tau)\simeq e^{-\overline{\omega^2}\,\tau/\omega_e},$$

a line of breadth $\overline{\omega^2}/\omega_e$.

This concludes my survey of stochastic methods used in line-broadening problems. This by no means exhausts all the methods which have been or will be used, nor all the problems to which they might be applied. The subject of line-shapes of the electronic resonances in metals, where one has electrons diffusing in and out of the skin depth region, is a very important one, which has been discussed on this basis by Kittel and his co-workers, with the result that they have been the first to observe electronic paramagnetic resonance of really free electrons. Kittel's theory of diffusion is

$$\dot M_+ = i\omega_0M_+ + i\gamma M_zH_+,\qquad H_+ = e^{-kx+i\omega t} = H(x(t))\,e^{i\omega t}.$$

Solve for the forced solution:

$$\frac{d}{dt}\big[M_+e^{-i\omega_0t}\big] = e^{i(\omega-\omega_0)t}\,\text{const.}\times H(t),\qquad M_+e^{-i\omega_0t} = \int e^{i(\omega-\omega_0)t}\,\text{const.}\times H(t)\,dt.$$

Thus $M_+(\omega)$ is a Fourier component of $H(t)$, which can be found by Khinchine theory:

An electron diffusing in the skin depth ($k$ of order 1/skin depth) will break off when it diffuses out of the skin, so that $\Delta\omega\sim1/T$, with $T$ the time to diffuse out. With $\lambda$ the mean free path and $l$ the skin depth,

$$l = \sqrt{Tv\lambda},\qquad\text{so}\qquad\Delta\omega = \frac{1}{T} = \frac{v\lambda}{l^2},$$

where

$$l^2 = \frac{c^2}{\sigma\omega},\qquad\sigma = \frac{ne^2\lambda}{2mv},$$

so

$$\Delta\omega = \frac{v\lambda\,\sigma\omega}{c^2} = \frac{ne^2\lambda\,v\lambda\,\omega}{2mv\,c^2},\qquad\frac{\Delta\omega}{\omega} = n\Big(\frac{e^2}{2mc^2}\Big)\lambda^2\simeq nr\lambda^2$$

($r$ is the classical electron radius).

Other stochastic methods have been used for nuclear resonances by Torrey, while Slichter and co-workers have discussed the effects of diffusion in an inhomogeneous external magnetic field in liquids. One of the most important fields remains as yet almost untouched: that of relaxation processes as opposed to line-shapes themselves. This has been started by Bloembergen, Purcell and Pound for nuclear magnetic resonance, but they had the same trouble one has with the stochastic method in pressure-broadening relaxation: the transition probabilities as computed by this method do not lead to the Boltzmann distribution, so the energy distribution is not properly managed. They seem to have satisfactorily solved this problem, but not to have carried out their investigations very far in various directions. With the new interest of Dicke, Hahn and others in transient techniques the relaxation problem has become much more important and more thought needs to be given to it.

With this I conclude these two lectures on stochastic methods in line-broadening problems. I hope that I have been able to interest some of you in carrying these studies further, or in applying the stochastic methods used here to other problems.


An Approximate Quantum Theory of the Antiferromagnetic Ground State Phys. Rev. 86, 694-701 (1952)

Gregory Wannier and I, as well as B. T. Matthias and Jack Galt, showed up for work on the same day at Bell Labs. At least two of the four of us (probably the theorists) were surplus to "nosecount" and so it was not surprising that they put Gregory and me in a big office together. Chatting with Gregory was a great learning experience for me. A year or so later Charlie Kittel came wandering into our office and asked, "Do you know about antiferromagnetism? Why don't you work on that?" He must have just heard about Shull's neutron diffraction experiments demonstrating its existence, and of course its relative, ferrimagnetism, where the spins do not compensate, was of great interest technically, and was soon to be the subject of beautiful experiments at Bell by William Shockley, and later Galt, on domain wall motion. Our negative answer to Charlie's question was soon corrected, Gregory producing a well-known paper showing the non-ordering of the triangular Ising lattice, and I producing several papers of which the first two earned me an invited paper, independence from Bill Shockley, and a raise, though they were not sparkling intellectually. Although Kittel's stay at Bell was brief (he was soon to decamp for Berkeley, to space opened up by their "Loyalty Oath" losses), he was very influential at that early stage in my career, and in moving research at the Labs into new directions, for which Shockley did not thank him. This paper was an attempt to answer questions raised by Bethe and Hulthén's work on the one-dimensional antiferromagnetic chain, still open in the minds of such as Landau and Kramers, as to when and whether a quantum antiferromagnet orders. It contains the seeds of all my further ideas about broken continuous symmetries and Goldstone modes, and the relation between microscopic and macroscopic physics. I cherish the memory of being called down to the Institute for Advanced Study to spend an afternoon discussing it with H. A. Kramers, shortly before his death.
It also seems to have impressed Kubo, who had also worked on antiferromagnetic spin waves, and probably led to the invitation to spend eight months in Japan in 1953-4. That turned out to be even more of an adventure than we had bargained for.


An Approximate Quantum Theory of the Antiferromagnetic Ground State P. W. ANDERSON

Bell Telephone Laboratories, Murray Hill, New Jersey
(Received January 30, 1952)

A careful treatment of the zero-point energy of the spin-waves in the Kramers-Heller semiclassical theory of ferromagnetics leads to surprisingly exact results for the properties of the ground state, as shown by Klein and Smith. An analogous treatment of the antiferromagnetic ground state, whose properties were unknown, is here carried out and justified. The results are expected to be valid to order 1/S or better, where S is the spin quantum number of the separate atoms. The energy of the ground state is computed and found to lie within limits found elsewhere on rigorous grounds. For the linear chain, there is no long-range order in the ground state; for the simple cubic and plane square lattices, a finite long-range order in the ground state is found. The fact that this order can be observed experimentally, somewhat puzzling since one knows the ground state to be a singlet, is explained.

I. INTRODUCTION

Kramers and Heller¹ were the first to give a semiclassical derivation of Bloch's spin-wave theory of ferromagnetism.² They started with a classical ferromagnetic lattice, with classical spins and the $-J\,\mathbf{S}_i\cdot\mathbf{S}_j$ Hamiltonian for exchange, found coordinates which represented classically the small vibrations of the system, and quantized these coordinates. In this way they obtained a theory which in principle should be good only to first order in 1/S (if the classical results are considered zero'th order) but which actually, perhaps by coincidence, is nearly exact. However, they ignored the zero-point energy of their quantized vibrations without considering its significance very deeply. Klein and Smith³ have recently pointed out the significance of the zero-point energy, and have shown that only if it is included does one get the correct energy for the ferromagnetic ground state from the Kramers-Heller treatment. That is, the true value of the classical angular momentum $S_c$ which should be used for atoms with spin quantum number $S$ is

$$S_c = [S(S+1)]^{1/2}.$$

Therefore, since classically one can expect to be able to set all spins exactly parallel, the energy available from any $-J\,\mathbf{S}_i\cdot\mathbf{S}_j$ interaction is $-JS_c^2 = -JS(S+1)$. The total zero-point energy of the spin waves turns out to be exactly such as to increase this to $-JS^2$, the correct value.

Hulthén⁴ worked out the equivalent quantization problem for small vibrations of simple antiferromagnetic lattices from their classical equilibrium state. Here one has the opposite sign, $+J\,\mathbf{S}_i\cdot\mathbf{S}_j$, for the nearest neighbor interaction, and the classical expectation is that successive spins will align themselves antiparallel. However, Hulthén ignored the zero-point energy.

Bethe had worked out the exact ground state for the antiferromagnetic linear chain of atoms with spin one-half.⁵,⁶ The result was so different from what Hulthén had assumed to be true in his spin-wave theory that he considered his former result to be of questionable meaning. Aside from this simplest case, no rigorous treatment of the antiferromagnetic ground state has appeared. For this reason the very basis for the recent theoretical work which has treated antiferromagnetism similarly to ferromagnetism⁷,⁸ remains in question. In particular, since the Bethe-Hulthén ground state is not ordered,⁹ it has not been certain whether an ordered state was possible on the basis of simple $\mathbf{S}_i\cdot\mathbf{S}_j$ interactions. The best proof that it is seems so far to be the experimental results of Shull et al.¹⁰ using neutron diffraction, in which they show that certain substances do have ordered antiparallel spin arrangements. In connection with these experimental results, a theory of the ground state seems also of interest for comparison with observed values of the magnetization of the various sublattices.

In this paper we apply the results of the spin-wave theory⁴ to the approximate determination of the ground state energy and wave function in a fashion similar to the work of Klein and Smith,³ by including the zero-point energy and motion. This can be done for reasonably general lattices (we do explicitly the linear, plane square, and simple cubic lattices) and for arbitrary spin quantum number $S$. The results obtained should be valid to order $1/S$ (or perhaps even $1/ZS$, where $Z$ is the number of nearest neighbors) since the approximations of the so-called "semiclassical" spin-wave theory are valid quantum-mechanical approximations to this order.

We are able to get a result for the ground-state energy in all cases. This energy lies between the rigorous limits which have been derived on the variational principle,¹¹ $-\tfrac12NJZS^2$ and $-\tfrac12NJZS^2(1+1/ZS)$. In fact, it is roughly equal to but slightly lower than $-\tfrac12NJZS^2(1+1/2ZS)$. For the linear chain with $S=\tfrac12$, there is good agreement with the rigorous result of Bethe in spite of the fact that this is the worst possible case for the approximation method, which in fact should break down here, as we shall see.

More interesting results are obtained for the long-range order. The quantity related to long-range order which the theory furnishes directly is $\langle S_z\rangle$, the average z-component of the spin in one of the two sublattices into which all the lattices we treat can be subdivided. z is the direction in which we assume the spins of our original unperturbed (i.e., classical) system were pointing. It turns out that the average spin on a given atom is reduced by a small correction factor of order of magnitude $1/2Z$ in the case of 2- and 3-dimensional lattices. This result may be susceptible of experimental verification by accurate neutron diffraction measurements. In the case of the one-dimensional lattice $\langle S_z\rangle$ vanishes: The separate sublattices are on the average in singlet states. A little consideration of the situation for this lattice, in which it requires at most an energy of the order $J$ to break up the long-range order, while $N$ perturbation terms of order $J$ are available to destroy the ordered state, convinces one that this is a reasonable result, and in fact it can be shown to agree with the result for the rigorous ground state derived by Hulthén and Bethe.⁵,⁶,⁹

Physically, the situation is this: It is known that in the one- and two-dimensional lattices in ferromagnetism only an infinitesimal amount of thermal energy in the spin-waves is necessary to break up long-range order. In ferromagnetism, however, the zero-point energy of the spin waves is not adequate to break up long-range order because the quantized amplitudes of the long-wavelength spin-waves, which are those important in the thermal destruction of order, are small. In the antiferromagnetic case it seems reasonable to expect that the total amount of energy necessary to break up long-range order is again infinitesimal in these two cases; this is certainly true in the classical limit of large $S$. However, here the zero-point energy can do the job, because the large amplitudes of the longest wavelength spin-waves permit a larger proportion of the zero-point energy to go into them as compared to the unimportant short-wavelength waves. This is only true for the one-dimensional case; the two-dimensional case is probably similar to the one- and two-dimensional ferromagnets, having long-range order in the ground state but losing it immediately under thermal excitation.

Another result which has bearing on experiments, while helping to give internal consistency to the theory, can be derived by looking at the properties of the spin-waves of longest wavelength, which actually represent motion of the transverse components of the total spin of the entire sublattice. The sublattice spin, in the ground state at least, certainly does not maintain a constant, definite direction, since the ground state is a singlet. Therefore, the amplitudes of motion of these transverse components of the total spin show an apparent divergence, indicating that the coupled total spins of the sublattices are indeed compelled to rotate around at will in space. One might think that since the approximate quantization of the spin-waves requires that we assume a constant, large spin for the sublattice, the fact that our theory then tells us that this is not true represents an inconsistency in our reasoning. This objection can be shown to be invalid in several ways, particularly by assuming that a small anisotropy energy is present which holds the spins constant in the z-direction. It then appears that the required anisotropy is actually infinitesimal, so that one can easily go to the limit of zero anisotropy, using the anisotropy merely as a convergence factor.

Another way to answer this objection is to assume that our theory really starts out from "wave packets" of states, chosen in such a way that the z-component of the total spin is roughly constant. Inquiry into the properties of the longest wavelength spin-waves shows that the energy required to form such a packet is infinitesimal of order $1/N$, where $N$ is the number of atoms in the lattice. This energy is very small. An equivalent result, of experimental interest, is that the time required for the total spin to drift around from one orientation to another essentially different one, in case we prepared the system originally in a state of definite spin orientation, is of order $N$, and thus extremely large. It is clear that the fact that the neutron diffraction experiments do indicate definite sublattice arrangements, in spite of having been averaged over some finite time, is perfectly reasonable.

¹ G. Heller and H. A. Kramers, Proc. Roy. Acad. Sci. Amsterdam 37, 378 (1934).
² F. Bloch, Z. Physik 61, 206 (1930).
³ M. J. Klein and R. S. Smith, Phys. Rev. 80, 1111 (1951).
⁴ L. Hulthén, Proc. Roy. Acad. Sci. Amsterdam 39, 190 (1936).
⁵ L. Hulthén, Arkiv. Mat., Astron. Fysik 26A, No. 1 (1938).
⁶ H. A. Bethe, Z. Physik 71, 205 (1931).
⁷ J. H. Van Vleck, J. Chem. Phys. 9, 85 (1941).
⁸ P. W. Anderson, Phys. Rev. 79, 350, 705 (1950); C. Kittel, Phys. Rev. 82, 565 (1951).
⁹ This fact has been checked by Kramers and Kasteleijn (Kramers, private communication).
¹⁰ C. G. Shull and J. S. Smart, Phys. Rev. 76, 1256 (1949); also Shull, Wollan, and Strauser, Phys. Rev. 83, 333 (1951).
¹¹ P. W. Anderson, Phys. Rev. 83, 1260 (1951).

53

This argument, justifying the assumption of a large, constant z-component of spin on the sublattice, is admittedly somewhat circular: we have used the spin waves, derived on the basis of this assumption, to verify it. However, it is difficult to find any reason why the equation of motion of $(S_z)_{\mathrm{total}}$ should change very much if the correction terms of the theory were included. In any case, the argument shows internal consistency. Every other result furnished by the theory is at least not in disagreement with expectations arrived at by other reasoning. For instance, the fact that the sublattice spins are indeterminate in direction agrees with our knowledge of singlet states; the energy values lie between close limits set by a rigorous argument;11 order does not exist in one dimension, agreeing with the rigorous result,6,6,9 while it does exist in three dimensions, agreeing with experiment.10 Further calculations which will be reported later on models for a ferrimagnet, and for an antiferromagnet with next nearest neighbor interaction, again give results whose consistency with one's expectations furnishes further strong confirmation of the theory. Thus, while we cannot claim to have proved rigorously that this picture of the ground state is the correct one, it can be said that the probability is high that it is.

II. SPIN WAVES IN AN ANTIFERROMAGNET

Hulthen4 has carried through the quantization of the spin waves in an antiferromagnet, but we shall repeat the derivation here, both for ready reference and because his way of doing it is not easily adapted to our purpose. The Hamiltonian for an antiferromagnet is taken to be, assuming that the Heisenberg exchange coupling is responsible for the phenomena,

$$H = J \sum_{(j,k)} \mathbf{S}_j \cdot \mathbf{S}_k, \tag{1}$$

with $J$ positive. The notation $\sum_{(j,k)}$ will be used repeatedly to mean a sum over all pairs $j$ and $k$ of neighboring atoms. $j$ and $k$, of course, each represent a set of $D$ numbers, where $D$ is the dimensionality; they may be thought of as $D$-dimensional vectors. We are assuming nearest neighbor interaction, and also we shall work only with lattices which can be divided into two sublattices with the neighbors of one all on the other and vice versa; the linear, plane square, and simple cubic lattices only will be used as examples. None of these restrictions is essential to carrying out a theory of this general type.

The basic assumption we make in deriving the semiclassical spin waves is that the state of the antiferromagnet is not greatly different from the classical ground state in which the spins of one sublattice all point in one direction (say +z), the spins of the other all in the other direction. This is an assumption which must be justified, and which cannot, unfortunately, be justified by external reasoning, as is possible in the ferromagnetic case. We must use for this justification results of the theory itself (as well as its internal consistency and its agreement with other ideas). Mathematically, we assume

$$S_{zj} \approx +S, \qquad S_{zk} \approx -S. \tag{2}$$

Here and later, atoms labeled $j$ are always on sublattice (1), while those labeled $k$ are on sublattice (2). For the linear chain, for example, this would mean that $j$ is odd and $k$ is even. Where there is some possibility of confusion, we shall also use a superscript (1) or (2) to denote the sublattice. Now

$$S_z^2 = S_c^2 - (S_x^2 + S_y^2), \tag{3}$$

where $S_c$ is the "classical" total spin of an atom with spin quantum number $S$,

$$S_c = [S(S+1)]^{1/2}. \tag{4}$$

If our fundamental assumption, Eq. (2), is correct, we can use the following binomial theorem expansion for the z components,

$$S_{zj} \approx S_c - (S_{xj}^2 + S_{yj}^2)/2S_c, \qquad S_{zk} \approx -S_c + (S_{xk}^2 + S_{yk}^2)/2S_c. \tag{5}$$

Then the Hamiltonian is, by substitution,

$$H = -\tfrac12 ZNJS_c^2 + \tfrac12 ZJ\Big\{\sum_j (S_{xj}^2 + S_{yj}^2) + \sum_k (S_{xk}^2 + S_{yk}^2)\Big\} + J\sum_{(j,k)} (S_{xj}S_{xk} + S_{yj}S_{yk}). \tag{6}$$

This is valid to first order in the small quantities $S_x^2$ and $S_y^2$. We introduce two sets of spin waves, one pair for each sublattice:

$$S_{xj} = (2S/N)^{1/2}\sum_\lambda \exp(i\lambda\cdot j)\,Q_\lambda, \qquad S_{yj} = (2S/N)^{1/2}\sum_\lambda \exp(-i\lambda\cdot j)\,P_\lambda,$$
$$S_{xk} = (2S/N)^{1/2}\sum_\lambda \exp(-i\lambda\cdot k)\,R_\lambda, \qquad S_{yk} = -(2S/N)^{1/2}\sum_\lambda \exp(i\lambda\cdot k)\,S_\lambda. \tag{7}$$

Here the wave number $\lambda$ runs only over $N/2$ values from $-\pi$ to $\pi$, giving $2N$ coordinates in all;12 for example, for the linear chain of length $N$,

$$\lambda = 2\pi n/N, \qquad n = -\tfrac12 N + 2, \ldots, -2, 0, 2, \ldots, \tfrac12 N. \tag{8}$$

Thus, we have the correct number of degrees of freedom. To show the commutation relations, let us write down the inverse of (7):

$$Q_\lambda = (2/NS)^{1/2}\sum_j \exp(-i\lambda\cdot j)\,S_{xj}, \qquad P_\lambda = (2/NS)^{1/2}\sum_j \exp(i\lambda\cdot j)\,S_{yj},$$
$$R_\lambda = (2/NS)^{1/2}\sum_k \exp(i\lambda\cdot k)\,S_{xk}, \qquad S_\lambda = -(2/NS)^{1/2}\sum_k \exp(-i\lambda\cdot k)\,S_{yk}. \tag{7'}$$

That $Q_\lambda$ commutes with $Q_{\lambda'}$, $R_{\lambda'}$, and $S_{\lambda'}$, etc., is obvious. We need to verify only the following commutation rules:

$$[Q_\lambda, P_{\lambda'}] = \frac{2}{NS}\sum_j\sum_{j'} \exp[-i(\lambda\cdot j - \lambda'\cdot j')]\,[S_{xj}, S_{yj'}] = \frac{2i}{NS}\sum_j \exp[-i j\cdot(\lambda-\lambda')]\,S_{zj}.$$

Under our assumption (2) as to the sublattice spins, this gives

$$[Q_\lambda, P_{\lambda'}] = \delta_{\lambda\lambda'}\,\frac{2i}{NS}\sum_j S_{zj} \approx i\,\delta_{\lambda\lambda'}, \tag{9}$$

$$[R_\lambda, S_{\lambda'}] = -\frac{2i}{NS}\sum_k \exp[i k\cdot(\lambda-\lambda')]\,S_{zk} \approx i\,\delta_{\lambda\lambda'}. \tag{10}$$

We use the well-known equation

$$\sum_j \exp[i(\lambda-\lambda')\cdot j] = \tfrac12 N\,\delta_{\lambda\lambda'}. \tag{11}$$

12 The use of $\lambda$ as a wave number, or inverse wavelength, for the spin waves, is perhaps unfortunate; it is hoped that no confusion will be caused.

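Note that the sign change in $S_\lambda$ comes from the requirement that the two commutators (9) and (10) be equal. Now we must substitute (7) into the Hamiltonian (6). First we replace $S_{xj}^2$ by $|S_{xj}|^2$, etc., which simplifies the presentation somewhat but does not change the results. Then, of course [using (11)],

$$\sum_j S_{xj}^2 = \frac{2S}{N}\sum_j\sum_\lambda\sum_{\lambda'} \exp[i j\cdot(\lambda-\lambda')]\,Q_\lambda Q_{\lambda'}^{*} = S\sum_\lambda |Q_\lambda|^2,$$

written $S\sum_\lambda Q_\lambda^2$ from here on, and so forth for the squared terms. The cross-terms are

$$\sum_{(j,k)} (S_{xj}S_{xk} + S_{yj}S_{yk}) = \frac{2S}{N}\sum_{(j,k)}\sum_{\lambda,\lambda'} \exp[i(\lambda\cdot j - \lambda'\cdot k)]\,(Q_\lambda R_{\lambda'} - P_\lambda S_{\lambda'})$$
$$= \frac{2S}{N}\sum_{\lambda,\lambda'}\sum_{(k-j)} \exp[-i\lambda'\cdot(k-j)]\,(Q_\lambda R_{\lambda'} - P_\lambda S_{\lambda'}) \sum_j \exp[i(\lambda-\lambda')\cdot j]$$
$$= S\sum_\lambda (Q_\lambda R_\lambda - P_\lambda S_\lambda)\sum_{(k-j)} \exp[i\lambda\cdot(k-j)],$$

again using (11); the neighbor sum is unchanged under $\lambda \to -\lambda$ for the lattices considered here. The sum over $(k-j)$, of course, is meant to run only over neighbors. We define

$$2D\gamma_\lambda = \sum_{(k-j)} \exp[i\lambda\cdot(k-j)], \tag{12}$$

where $D$ is the dimensionality of the lattice. For the three lattices we consider, the linear, plane square, and simple cubic, $\gamma_\lambda$ is

$$\gamma_\lambda = \frac{1}{D}\sum_{i=1}^{D} \cos\lambda_i. \tag{13}$$

(The $\lambda_i$ are the $D$ components of the vector $\lambda$.) Then the total Hamiltonian (6) is, in terms of the spin-wave coordinates (again only for our three lattices):

$$H = -NDJS_c^2 + DJS\sum_\lambda \big[(P_\lambda^2 + Q_\lambda^2 + R_\lambda^2 + S_\lambda^2) + 2\gamma_\lambda (Q_\lambda R_\lambda - P_\lambda S_\lambda)\big]. \tag{14}$$

For our three lattices, of course, $Z = 2D$. Similar expressions may obviously be found for other simple lattice types. $H$ is not quite in normal coordinate form, but a canonical transformation may easily be found to make it so. This transformation is

$$P_\lambda = (p_{1\lambda} + p_{2\lambda})/\sqrt2, \quad S_\lambda = (p_{1\lambda} - p_{2\lambda})/\sqrt2, \quad Q_\lambda = (q_{1\lambda} + q_{2\lambda})/\sqrt2, \quad R_\lambda = (q_{1\lambda} - q_{2\lambda})/\sqrt2. \tag{15}$$

It leaves the commutation relations unchanged. The Hamiltonian then becomes

$$H = -DJNS_c^2 + DJS\sum_\lambda \big[q_{1\lambda}^2(1+\gamma_\lambda) + p_{1\lambda}^2(1-\gamma_\lambda) + q_{2\lambda}^2(1-\gamma_\lambda) + p_{2\lambda}^2(1+\gamma_\lambda)\big]. \tag{16}$$

The eigenvalues of the harmonic oscillator terms in (16) are easily found. The eigenvalues of

$$H = (p^2/m) + m\omega^2 q^2, \qquad [q, p] = i, \tag{17}$$

are

$$H = (2n+1)\,\omega, \tag{17'}$$

so that

$$E = -DJNS_c^2 + DJS\sum_\lambda \big[(2n_{1\lambda}+1)(1-\gamma_\lambda^2)^{1/2} + (2n_{2\lambda}+1)(1-\gamma_\lambda^2)^{1/2}\big]. \tag{18}$$

For the ground state all $n_\lambda = 0$, so that

$$E_g = -DJNS_c^2 + 2DJS\sum_\lambda (1-\gamma_\lambda^2)^{1/2}. \tag{19}$$

The frequencies of the spin waves fall into two identical branches, as we see by (18); using (17), these frequencies are

$$\omega_\lambda = DJS\,(1-\gamma_\lambda^2)^{1/2}. \tag{20}$$

Notice that $\gamma_\lambda \approx 1 - \tfrac12\lambda^2$, $\lambda \to 0$, so that

$$\omega_\lambda \approx DSJ\lambda, \qquad \lambda \to 0. \tag{21}$$

This dispersion law is quite different from the ferromagnetic case, where $\omega \sim \lambda^2$. The resulting decrease in magnetization of a sublattice, and specific heat, will vary as $T^3$ at low temperatures if this dispersion law is correct. Thus, they will differ from the ferromagnetic case in which these quantities vary as $T^{3/2}$. It is questionable whether this could be observed.

The difference in dispersion laws is not the most important difference between the two types of spin waves; it is the difference in amplitude per quantum of excitation which leads to the more striking effects. Since the potential and kinetic energies of a harmonic oscillator are on the average the same,

$$\langle q_{1\lambda}^2\rangle_{\mathrm{Av}}(1+\gamma_\lambda) = \langle p_{1\lambda}^2\rangle_{\mathrm{Av}}(1-\gamma_\lambda) = \tfrac12(1-\gamma_\lambda^2)^{1/2},$$
$$\langle q_{1\lambda}^2\rangle_{\mathrm{Av}} = \tfrac12\big[(1-\gamma_\lambda)/(1+\gamma_\lambda)\big]^{1/2}, \qquad \langle p_{1\lambda}^2\rangle_{\mathrm{Av}} = \tfrac12\big[(1+\gamma_\lambda)/(1-\gamma_\lambda)\big]^{1/2}, \tag{22}$$

in the ground state. These quantities are also proportional to the extra amplitude per quantum of excitation. Note that, for small $\lambda$,

$$\langle p_{1\lambda}^2\rangle_{\mathrm{Av}} \propto 1/\lambda; \tag{23}$$

the mean amplitude of $p_{1\lambda}$ per quantum diverges as $1/\lambda$ (i.e., as the wavelength) for very long wavelengths. This could be predicted from the dispersion law (21), because one expects from analogy with ferromagnetism that it requires only an energy proportional to $\lambda^2$ to create a periodic disturbance in the spins of a certain amplitude and of wavelength $1/\lambda$; since here we require an energy proportional to $\lambda$ per quantum, the amplitude of disturbance per quantum must be as $1/\lambda$.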
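The step from (14) to (16) and the small-$\lambda$ limit of (20) can be spot-checked numerically. What follows is a minimal sketch, not part of the original paper: it treats the coordinates as ordinary commuting numbers (enough to compare the two quadratic forms, though it says nothing about the commutators), works in units of $DJS$, and all function names are illustrative assumptions.

```python
import math
import random


def gamma(lam):
    """gamma_lambda of Eq. (13): mean of cos(lambda_i) over the D components."""
    return sum(math.cos(x) for x in lam) / len(lam)


def bracket_14(P, Q, R, S, g):
    """Square bracket of Eq. (14) for a single wave number (classical numbers)."""
    return (P * P + Q * Q + R * R + S * S) + 2.0 * g * (Q * R - P * S)


def bracket_16(q1, p1, q2, p2, g):
    """Square bracket of Eq. (16) for a single wave number."""
    return (q1 * q1 * (1 + g) + p1 * p1 * (1 - g)
            + q2 * q2 * (1 - g) + p2 * p2 * (1 + g))


def to_PQRS(q1, p1, q2, p2):
    """Transformation (15), returning (P, Q, R, S)."""
    r2 = math.sqrt(2.0)
    return (p1 + p2) / r2, (q1 + q2) / r2, (q1 - q2) / r2, (p1 - p2) / r2


def omega(lam):
    """Spin-wave frequency of Eq. (20) in units of D*J*S."""
    g = gamma(lam)
    return math.sqrt(max(0.0, 1.0 - g * g))


# Compare the two quadratic forms at random points.
rng = random.Random(0)
max_diff = 0.0
for _ in range(1000):
    q1, p1, q2, p2 = (rng.uniform(-1.0, 1.0) for _ in range(4))
    g = rng.uniform(-1.0, 1.0)
    P, Q, R, S = to_PQRS(q1, p1, q2, p2)
    diff = abs(bracket_14(P, Q, R, S, g) - bracket_16(q1, p1, q2, p2, g))
    max_diff = max(max_diff, diff)

# Linear chain: omega / lambda should approach 1 (in units of D*J*S) as lambda -> 0.
small = 1e-3
ratio_1d = omega([small]) / small
print(max_diff, ratio_1d)
```

For the linear chain $\gamma_\lambda = \cos\lambda$, so the frequency in these units is simply $|\sin\lambda|$, which makes the linear dispersion (21) explicit.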

III. THE ZERO-POINT ENERGY

We will compute first the energy of the ground state as it is given by (19), assuming for the moment that the basic premise (2) is right. In (19) the sum over $\lambda$ may be replaced by an integral, since the values (8) of $\lambda$ are quite dense. Since there are $\tfrac12 N$ values of $\lambda$,

$$\sum_\lambda (1-\gamma_\lambda^2)^{1/2} = \frac{N}{2}\,\big\langle(1-\gamma_\lambda^2)^{1/2}\big\rangle_{\mathrm{Av}} = \frac{N}{2}\,\frac{1}{(2\pi)^D}\int_{-\pi}^{\pi}\!\cdots\!\int_{-\pi}^{\pi} d^D\lambda\,\Big[1 - \Big(\frac{1}{D}\sum_i \cos\lambda_i\Big)^2\Big]^{1/2}, \tag{24}$$

$$\sum_\lambda (1-\gamma_\lambda^2)^{1/2} = (\tfrac12 N)\,I_D.$$

This equation defines $I_D$. A rough value of $I_D$ may be obtained by expanding the square root in (24) by the binomial theorem and keeping only the first term,

$$I_D \approx 1 - \tfrac12\Big\langle\Big(\frac{1}{D}\sum_i \cos\lambda_i\Big)^2\Big\rangle_{\mathrm{Av}} = 1 - \langle\cos^2\lambda\rangle_{\mathrm{Av}}/2D, \qquad I_D \approx 1 - 1/4D;$$

since $D = Z/2$, this is equivalent to

$$I_D \approx 1 - 1/2Z. \tag{25}$$

This value leads to a rough ground-state energy-value, from (19):

$$E_g \approx -\tfrac12 NZJ\,[S_c^2 - S(1 - 1/2Z)] \approx -\tfrac12 NZJS^2(1 + 1/2ZS). \tag{26}$$

In this approximation, the correction factor for the energy is $1 + 1/2ZS$, which certainly lies between the rigorous limits 1 and $1 + 1/ZS$.11 For most known antiferromagnets this correction is fairly small, as discussed in reference 11.

The approximation (25) for $I_D$ is not very good because of the slow convergence of the series of which (25) is the first two terms. More accurate values of $I_D$ have been computed for $D = 1$, 2 and 3.13 For $D = 1$, $I_D$ is

$$I_1 = \frac{1}{2\pi}\int_{-\pi}^{\pi} |\sin\lambda|\,d\lambda = \frac{2}{\pi}. \tag{27}$$

Thus, the energy for the linear chain is

$$(E_g)_{D=1} = -NJ\,[S(S+1) - (2/\pi)S] = -NJS^2(1 + 0.363/S). \tag{28}$$

This is to be compared with the rigorous ground-state energy in the case $S = \tfrac12$ for the linear chain, computed by Bethe,6

$$\ldots = -1.73\,NJS^2. \tag{29}$$

This agreement is good, which is unexpected, both because $S = \tfrac12$ is a bad case for our assumption that $S$ is large,14 and because, as we shall see, in this case the basic premise that the sublattice spin is large and nearly equal to $S$, on the average, is incorrect. One can only suppose that the basic premise is fulfilled temporarily over large enough regions of the lattice that the energy parameter is not badly approximated.

An interesting comment which might be appended here is that for the antiferromagnet, as for the ferromagnet, the theory gives entirely correct results for the case $N = 2$: two atoms coupled by one exchange interaction. For the antiferromagnetic case, (8) gives $\lambda = \pi$ only, and thus $\gamma_\lambda = -1$, and the zero-point energy vanishes entirely. This is correct: for two spins, the singlet state has energy $-JS(S+1)$. The zero-point motion, however, does not vanish in this case, and as discussed in Sec. V shows us that no directional preferences exist. For the ferromagnetic case one also obtains the correct ground-state energy, $-JS^2$.

The integrals $I_2$ and $I_3$ were evaluated numerically, essentially by computing or approximating higher terms in the binomial series for the square root.13 The results were

$$I_2 = 1 - 0.158, \qquad I_3 = 1 - 0.097, \tag{30}$$

so that the energies are

$$(E_g)_{D=2} = -2NJS^2(1 + 0.158/S), \tag{31}$$

$$(E_g)_{D=3} = -3NJS^2(1 + 0.097/S). \tag{32}$$

The energies (28), (31), and (32) are all somewhat lower than the rough approximation (26) but lie between the rigorous limits derived on the variational principle.11

IV. THE TOTAL SPIN OF THE SUBLATTICE

A parameter which represents the state of long-range order of the lattice is the total spin of one of the two sublattices. It can easily be shown that the square of this total spin divided by the square of the number of atoms in the sublattice is, except for quantities infinitesimal to order $1/N$, the more usual order parameter [...]

13 I am indebted to Dr. R. W. Hamming for helpful suggestions in connection with the evaluation of these integrals. Dr. J. M. Luttinger has kindly pointed out that one of them, $J_2$, has been shown to be an elliptic integral by G. N. Watson [Quart. J. Math. 10, 266 (1939)]. His more exact value agrees well with our numerical integration.

14 The direct use of the expansion (5) is rather dubious in case $S = \tfrac12$. One can justify at least the long-wavelength spin-wave approximation on another basis, however: For the long wavelengths one can think that there are relatively large regions of parallel spin, leading to large $S$'s in total so that something like (5) holds. In other words, one sums $S_{zj}S_{zk}$ over fairly large regions first, and then expands into an expression like (5). Since the contribution to the energy is most critical for the long wavelengths, this procedure might well lead to good energy eigenvalues. In other words, for long wavelengths we think more nearly in terms of the phenomenological type of spin-wave theory of C. Herring and C. Kittel [Phys. Rev. 81, 869 (1951)].
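The lattice averages $I_D$ defined through (24) are easy to re-evaluate today. The sketch below is an illustration only, not the method used in the paper (which summed the binomial series): it applies a plain midpoint rule on the cube $-\pi < \lambda_i < \pi$, and the function names and grid sizes are arbitrary assumptions.

```python
import itertools
import math


def lattice_average(f, D, n):
    """Midpoint-rule average of f(gamma) over -pi < lambda_i < pi,
    with gamma = (1/D) * sum_i cos(lambda_i) as in Eq. (13)."""
    h = 2.0 * math.pi / n
    cos_vals = [math.cos(-math.pi + (m + 0.5) * h) for m in range(n)]
    total = 0.0
    for combo in itertools.product(cos_vals, repeat=D):
        g = sum(combo) / D
        total += f(g)
    return total / n ** D


def zero_point_integral(D, n):
    """I_D of Eq. (24): lattice average of (1 - gamma^2)^(1/2)."""
    return lattice_average(lambda g: math.sqrt(max(0.0, 1.0 - g * g)), D, n)


def rough_value(D):
    """First-order binomial estimate quoted before Eq. (25): 1 - 1/(4D)."""
    return 1.0 - 1.0 / (4.0 * D)


I1 = zero_point_integral(1, 2000)
I2 = zero_point_integral(2, 300)
I3 = zero_point_integral(3, 40)

for D, I in ((1, I1), (2, I2), (3, I3)):
    # 1 - I_D is the coefficient of 1/S in the energies (28), (31), (32).
    print(D, round(I, 4), round(1.0 - I, 4), round(rough_value(D), 4))
```

The printed values of $1 - I_D$ can be set against the coefficients in (28), (31) and (32), and the last column shows how far the rough estimate $1 - 1/4D$ sits from the more accurate integrals.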

The total spin, however, is the more direct physical parameter, since it is what is measured in a neutron diffraction experiment: what is measured is the average spin per atom parallel to the total sublattice spin, which comes directly from the total spin.

The total z-component of the spin of a sublattice, which we shall call $(S_z)_{\mathrm{tot}}^{(1)}$ (or (2), for the second sublattice), is easily computed, simply by adding up the expressions (5) for the whole sublattice. We shall see later that the x and y components may be neglected; for the time being we shall assume this to be true. Let us, then, start from (5) and compute

$$(S_z)_{\mathrm{tot}}^{(1)} = \sum_j S_{zj} = (\tfrac12 N)S_c - \sum_j (S_{xj}^2 + S_{yj}^2)/2S_c. \tag{33}$$

By (7),

$$\sum_j (S_{xj}^2 + S_{yj}^2) = S\sum_\lambda (Q_\lambda^2 + P_\lambda^2),$$

and by (15)

$$\sum_\lambda (Q_\lambda^2 + P_\lambda^2) = \sum_\lambda \tfrac12\big(q_{1\lambda}^2 + p_{1\lambda}^2 + q_{2\lambda}^2 + p_{2\lambda}^2 + 2q_{1\lambda}q_{2\lambda} + 2p_{1\lambda}p_{2\lambda}\big),$$

so that

$$(S_z)_{\mathrm{tot}}^{(1)} = (\tfrac12 N)S_c - (S/2S_c)\sum_\lambda \tfrac12\big(q_{1\lambda}^2 + p_{1\lambda}^2 + q_{2\lambda}^2 + p_{2\lambda}^2 + 2q_{1\lambda}q_{2\lambda} + 2p_{1\lambda}p_{2\lambda}\big).$$

In the ferromagnetic case, $(S_z)_{\mathrm{tot}}$ is a constant of the motion. This is not true for the sublattices here, as could in fact be shown from the original Hamiltonian (1). The sum of $(S_z)^{(1)}$ and $(S_z)^{(2)}$ is, however, a constant of the motion, since this is the z-component of total spin. Because of this lack of constancy we must content ourselves with average values,

$$\langle (S_z)_{\mathrm{tot}}^{(1)}\rangle_{\mathrm{Av}} = \tfrac12 NS_c - (S/4S_c)\sum_\lambda \langle q_{1\lambda}^2 + p_{1\lambda}^2 + q_{2\lambda}^2 + p_{2\lambda}^2\rangle_{\mathrm{Av}}, \tag{34}$$

since $q_{1\lambda}q_{2\lambda}$ and $p_{1\lambda}p_{2\lambda}$ vanish on the average. The average values of the $q$'s and $p$'s are given in (22), so that

$$\langle (S_z)_{\mathrm{tot}}^{(1)}\rangle_{\mathrm{Av}} = \frac{N}{2}S_c - \frac{S}{4S_c}\sum_\lambda\Big[\Big(\frac{1-\gamma_\lambda}{1+\gamma_\lambda}\Big)^{1/2} + \Big(\frac{1+\gamma_\lambda}{1-\gamma_\lambda}\Big)^{1/2}\Big] = \frac{N}{2}S_c - \frac{S}{2S_c}\sum_\lambda \frac{1}{(1-\gamma_\lambda^2)^{1/2}}. \tag{35}$$

Again we can replace the $\lambda$-sum by an integral,

$$\sum_\lambda \frac{1}{(1-\gamma_\lambda^2)^{1/2}} = \frac{N}{2}\,\frac{1}{(2\pi)^D}\int_{-\pi}^{\pi}\!\cdots\!\int_{-\pi}^{\pi} \frac{d^D\lambda}{\big[1 - \big(\frac{1}{D}\sum_i \cos\lambda_i\big)^2\big]^{1/2}}, \tag{36}$$

$$\sum_\lambda \frac{1}{(1-\gamma_\lambda^2)^{1/2}} = \frac{N}{2}\,J_D.$$

This equation defines $J_D$.

It is obvious that the integral $J_1$ diverges logarithmically. This means that the original assumption that long-range order exists in the one-dimensional case is wrong; no component of the sublattice spin is finite, and the method breaks down. This, as we pointed out in the introduction, is in agreement with the rigorous result of Bethe and Hulthen.6,6,9 For the other two cases, the integral does not diverge. Because of the slow convergence, however, a rough approximation such as (25) is quite useless, so that it is necessary to use values computed in much the same way as the exact values (30) for $I_2$ and $I_3$.13 These are

$$J_2 = 1.393, \qquad J_3 = 1.156. \tag{37}$$

This leads to the following values for the total z-component of spin (in all cases we neglect terms of order $1/S$, so that we set $S_c = S + \tfrac12$, etc.):

$$\langle S_{z\,\mathrm{tot}}\rangle_{\mathrm{Av}}\big|_{D=2} = \tfrac12 N(S - 0.197), \tag{38}$$

$$\langle S_{z\,\mathrm{tot}}\rangle_{\mathrm{Av}}\big|_{D=3} = \tfrac12 N(S - 0.078). \tag{39}$$

These corrections to $S$ are small, and one expects them to become smaller as the number of neighbors increases beyond $Z = 6$. Nonetheless, it is entirely possible that this correction term (or the similar term that would be present in more complicated lattices) can eventually be observed by neutron diffraction methods.

The results (38) and (39) cannot immediately be taken at their face values. To see why this is so, let us compute the x- and y-components of $(S_{\mathrm{tot}})^{(1)\ \mathrm{or}\ (2)}$. These are given by the particular spin wave coordinates for $\lambda = 0$:

$$\sum_j S_{xj} = (\tfrac12 NS)^{1/2} Q_0, \qquad \sum_k S_{xk} = (\tfrac12 NS)^{1/2} R_0,$$
$$\sum_j S_{yj} = (\tfrac12 NS)^{1/2} P_0, \qquad \sum_k S_{yk} = -(\tfrac12 NS)^{1/2} S_0. \tag{40}$$

The two quantities

$$\sum_j S_{xj} + \sum_k S_{xk} = S_{x\,\mathrm{tot}} = (NS)^{1/2} q_{10}, \qquad \sum_j S_{yj} + \sum_k S_{yk} = S_{y\,\mathrm{tot}} = (NS)^{1/2} p_{20} \tag{41}$$

[by (15)] have finite mean squares, as we see by (22); the squares are somewhat smaller than or of order $N$, so that the actual quantities are always very small. This was to be expected from the fact that $S_{\mathrm{tot}}$ is a constant of the motion for the Hamiltonian (1). We see that we have, by our basic assumption (2), accidentally limited ourselves exclusively to singlet states or at least states of very low total spin quantum number. This is acceptable; the ground state is certainly a singlet,5 while actually the majority of all states have very small total spins.

On the other hand, the differences between the two x- and y-components,

$$\sum_j S_{xj} - \sum_k S_{xk} = (NS)^{1/2} q_{20}, \qquad \sum_j S_{yj} - \sum_k S_{yk} = (NS)^{1/2} p_{10}, \tag{42}$$

have divergent mean square amplitudes, as we saw in

(23). The meaning of this fact is perfectly clear from our basic knowledge of the problem. The ground state of the antiferromagnet is certainly a singlet state, and a singlet state cannot, by general reasoning, show in any way a preference for one direction over another. Therefore, while the spins of the two sublattices can certainly be said to be opposite in direction, in the ground state of the lattice, on an average basis, we cannot define the direction in space of the spin of either one. One may think of the system as similar to the singlet state of two spins, $(\alpha\beta - \beta\alpha)$, in which the spins are certainly opposite but each is directed with equal probability in all directions. The quantities (42) are the degrees of freedom which represent the rotation of the two oppositely directed sublattices in space. We know that in the ground state, at least, we cannot "pin down" these degrees of freedom in any way; the lattices will rotate around at will in the course of time. The apparently infinite zero-point amplitude of motion is the mathematical result in this theory of this fact, and is therefore not to be thought of as a fault of the theory.

The tendency of the pair of lattices to rotate around is, however, a weak one, as we can show in two ways. One way is more physical: since every real lattice will certainly have some kind of anisotropy, we introduce in the Hamiltonian (6) an anisotropy of axial symmetry which makes the z-axis the preferred direction. It is easily verified that such an anisotropy energy can be expressed by

$$H_{\mathrm{anis}} = K\Big[\sum_j (S_{xj}^2 + S_{yj}^2) + \sum_k (S_{xk}^2 + S_{yk}^2)\Big], \tag{43}$$

where $K$ is the anisotropy energy constant per atom. (43) can be re-expressed in terms of the spin waves (7) and is

$$H_{\mathrm{anis}} = KS\sum_\lambda (Q_\lambda^2 + R_\lambda^2 + P_\lambda^2 + S_\lambda^2). \tag{44}$$

Then the total Hamiltonian becomes, instead of (14),

$$H = -NDJS_c^2 + DJS\sum_\lambda\big[(P_\lambda^2 + Q_\lambda^2 + R_\lambda^2 + S_\lambda^2)(1 + K/DJ) + 2\gamma_\lambda(Q_\lambda R_\lambda - P_\lambda S_\lambda)\big]$$
$$= -NDJS_c^2 + DJS\sum_\lambda\big[q_{1\lambda}^2(1 + K/DJ + \gamma_\lambda) + p_{1\lambda}^2(1 + K/DJ - \gamma_\lambda) + q_{2\lambda}^2(1 + K/DJ - \gamma_\lambda) + p_{2\lambda}^2(1 + K/DJ + \gamma_\lambda)\big]. \tag{45}$$

The amplitudes (22) become

$$\langle q_{1\lambda}^2\rangle_{\mathrm{Av}} = \langle p_{2\lambda}^2\rangle_{\mathrm{Av}} = \tfrac12\Big[\frac{1 + (K/DJ) - \gamma_\lambda}{1 + (K/DJ) + \gamma_\lambda}\Big]^{1/2}, \qquad \langle p_{1\lambda}^2\rangle_{\mathrm{Av}} = \langle q_{2\lambda}^2\rangle_{\mathrm{Av}} = \tfrac12\Big[\frac{1 + (K/DJ) + \gamma_\lambda}{1 + (K/DJ) - \gamma_\lambda}\Big]^{1/2}, \tag{46}$$

and, in particular,

$$\Big\langle\Big(\sum_j S_{xj} - \sum_k S_{xk}\Big)^2\Big\rangle_{\mathrm{Av}} = NS\,\frac{1}{(2K/DJ)^{1/2}} \tag{47}$$

for very small $K$. This shows us that, unless $K$ is small in the order $1/N$, the root mean square value of $\sum_j S_{xj}$ or $\sum_j S_{yj}$ to be expected is reduced from infinity to something of the order $\sqrt{N}$, completely negligible in comparison with the value of $\sum_j S_{zj}$, which is of order $N$. Thus, we may consider an infinitesimal $K$ to be present as a convergence factor, and then the truly interesting physical parameter,

$$\big[\langle (S_{\mathrm{tot}}^{(1)})^2\rangle\big]^{1/2} = \big[\langle (S_{x\,\mathrm{tot}}^{(1)})^2 + (S_{y\,\mathrm{tot}}^{(1)})^2 + (S_{z\,\mathrm{tot}}^{(1)})^2\rangle\big]^{1/2},$$

is simply given by $\langle (S_z)_{\mathrm{tot}}^{(1)}\rangle_{\mathrm{Av}}$, which is the value computed in (38) or (39).

A second way of clarifying the meaning of the divergence of the amplitudes (42) is the following. Let us return to the consideration of the original Hamiltonian (16):

$$H = -DNJS_c^2 + DJS\sum_\lambda\big[q_{1\lambda}^2(1+\gamma_\lambda) + p_{1\lambda}^2(1-\gamma_\lambda) + q_{2\lambda}^2(1-\gamma_\lambda) + p_{2\lambda}^2(1+\gamma_\lambda)\big].$$

For spin waves $\lambda = 0$, the energy terms are

$$H_{\lambda=0} = 2DJS\,(q_{10}^2 + p_{20}^2). \tag{48}$$

Now besides the ground state of the $\lambda = 0$ oscillators, there will be a number of higher states, which happen, since $\omega = 0$, to form a continuum. As the first question let us ask: how much energy will it cost to form a wave packet of higher states which, instead of having a large amplitude of $p_{10}$ or $q_{20}$, has a reasonably small one? This is easily answered by means of (48). From the commutation relations it can be derived that

$$p_\lambda = -i\,\partial/\partial q_\lambda, \qquad q_\lambda = i\,\partial/\partial p_\lambda. \tag{49}$$

Thus, if we limit $q_\lambda$ to some definite range, say $\Delta q_\lambda$, we must contribute a certain amount of $p_\lambda$, given in order of magnitude by

$$\Delta p_\lambda \approx 1/\Delta q_\lambda, \tag{50}$$

or vice versa. Thus, to limit $q_{20}$ to a value $\Delta q_{20}$ we require

$$E_{\mathrm{lim}} = 2DJS/(\Delta q_{20})^2. \tag{51}$$

Thus

$$E_{\mathrm{lim}} = \frac{2DJS}{\big[\Delta\big(\sum_j S_{xj} - \sum_k S_{xk}\big)\big]^2/NS} = \frac{2DJ}{N}\Big[\frac{NS}{\Delta\big(\sum_j S_{xj} - \sum_k S_{xk}\big)}\Big]^2. \tag{52}$$

From this relation it becomes clear that to limit the rotation of the sublattice spins to a finite angle (this means limiting $\sum_j S_{xj}$, etc., to a value of order of magnitude $NS$) the excess energy required is only of order of magnitude $1/N$, and thus is extremely small. It is clear that we can even limit $\sum_j S_{xj}$ to a quantity of order of magnitude $N^{1/2+\alpha}$, $\alpha > 0$, without requiring any perceptible energy.
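To make the orders of magnitude in Eq. (52) concrete, the short sketch below evaluates $E_{\mathrm{lim}}$ for the two cases just described. It is an illustration only: energies are in units of $J$, and the numerical values of $D$, $S$, $N$, the angle and $\alpha$ are arbitrary assumptions, not values from the paper.

```python
def e_lim(D, J, S, N, delta):
    """Eq. (52): energy needed to confine (sum_j S_xj - sum_k S_xk)
    to a spread delta, i.e. 2*D*J*S * (N*S) / delta**2."""
    return 2.0 * D * J * S * (N * S) / delta ** 2


def e_lim_finite_angle(D, J, S, N, theta):
    """Spread of order theta*N*S: rotation confined to a finite angle theta."""
    return e_lim(D, J, S, N, theta * N * S)


def e_lim_power_law(D, J, S, N, alpha):
    """Spread of order N**(1/2 + alpha) with alpha > 0."""
    return e_lim(D, J, S, N, N ** (0.5 + alpha))


D, J, S = 3, 1.0, 0.5  # hypothetical simple cubic lattice, energies in units of J
for N in (10 ** 6, 10 ** 12, 10 ** 18):
    # The first column falls off as 1/N, the second as N**(-2*alpha).
    print(N, e_lim_finite_angle(D, J, S, N, 0.1), e_lim_power_law(D, J, S, N, 0.25))
```

In the finite-angle case the expression reduces to $2DJ/(N\theta^2)$, which is the $1/N$ behavior stated in the text; in the power-law case it reduces to $2DJS^2 N^{-2\alpha}$.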

An equivalent way of looking at Eq. (52) is this: Suppose the lattice had been looked at at one time (say with neutron diffraction) and it had been determined that the sublattices were pointing in the $+z$ and $-z$ directions, respectively. How long would it then take for them to turn to the x-directions under the effects of the Hamiltonian (1)? The answer to this is given by

$$(\hbar \times \text{frequency of rotation}) \approx E_{\mathrm{lim}}; \qquad \text{Frequency of rotation} \approx J/\hbar N; \qquad \text{Time to rotate} \approx \hbar N/J. \tag{53}$$

This time is of the order of magnitude $10^8$ sec = 3 years.

These two arguments based on Eq. (52), I think, show that our original assumption that $S_{z\,\mathrm{total}} \sim NS$ is justified, at least according to its own consequences. It seems very difficult indeed to find a reason why, at least in order of magnitude, (52) should not hold even for the rigorous problem. If we assume initially that $(S_z^{(1)})$ is large, the $S_x$, $S_y$ commutation relation tells us that $S_z$ can then be localized easily, and this in turn justifies the entire theory, which tells us then that $S_z$ is large. This argument also has the consequence that we can definitely predict the result of observing the spins on the two sublattices. The direction of the spins will no doubt in any real case be determined primarily by anisotropy energies. Their magnitude is given to a fair approximation by (39) or the equivalent formula for the lattice under consideration.

V. SOME CONCLUSIONS

The apparent success of the spin wave theory in elucidating the complications of the ground state of the antiferromagnet leads one to have some confidence in its validity as a description of the higher lying states; thus such things as Bloch wall theory may be capable of being carried over practically unchanged from ferromagnet theory. This is of some practical interest in connection with ferrimagnetism, which will also be discussed in a way similar to the present theory in a later paper.

It seems, too, that the "sublattice" picture of antiferromagnetism which has been used in some theoretical papers6,7 is at least not qualitatively affected by the present picture of the complications of the ground state. This is of importance because these theories have had considerable success in explaining experimental results in a semiquantitative way, and because they do lean rather heavily on the assumption that the ground state has order and is not greatly different from what one would naively expect.

The only approximate energies which seem to have been given in the literature for two- and three-dimensional antiferromagnets are the energies of very small aggregates (6 and 8 spins) with periodic boundary conditions, which were computed by Harrison.15 However, these energies lie even below the rigorous limits computed previously,11 because of the fact that the aggregates are so small that certain paired electrons can have an over-riding effect (just as in the linear case the binding energy of the two-spin case is far too great). Thus, it does not seem fruitful to make a comparison with Harrison's results.

The conclusions of this paper about long-range order have implications in the theory of metallic binding. If one describes a substance by the Heitler-London model, (1) is at least a fair approximation for the effective spin coupling, unless one has a case such as diamond in which the interatomic exchange integrals may be thought of as over-riding the internal spin-coupling which leads to a total $S$ for the atom. (Of course, such substances as ionic crystals have $S = 0$, and we do not treat them.) In most metals, however, and particularly in the single-electron metals such as the alkalis, the description by the Heitler-London model certainly leads to (1). Then, as we have shown, (1) requires a long-range spin order to be present, which is probably not the case in most metals. Thus, we conclude that the Heitler-London model is not even a good qualitative description of metallic binding, in agreement with Schubin and Wonsowski16 and Mott;17 the band theory must be used in metals, the Heitler-London theory may be used in insulators.

I have profited greatly in doing this work from stimulating discussions with a number of my colleagues, particularly Dr. Wannier, Dr. Herring, Dr. Holden, and Dr. Kittel. Dr. Hamming's valuable help has been mentioned in the text.

15 R. J. Harrison, M.I.T. Research Laboratory of Electronics Technical Report, No. 172 (1950).
16 S. Schubin and S. Wonsowski, Proc. Roy. Soc. (London) 145, 159 (1934); also Physik. Z. Sowjetunion 7, 292 (1935) and 10, 348 (1936).
17 N. F. Mott, Proc. Phys. Soc. (London) A62, 416 (1949).
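As a numerical cross-check on Sec. IV, the integrals $J_D$ of Eq. (36) and the resulting sublattice spin reductions of (38) and (39) can be estimated with a simple midpoint rule. This is a sketch under stated assumptions rather than the paper's own procedure: the grid sizes are arbitrary, the grid must have an even number of points per axis so that it never lands on $\lambda = 0$, and the integrable $1/|\lambda|$ singularity makes convergence slow (roughly first order in the grid spacing for $D = 2$).

```python
import itertools
import math


def j_integral(D, n):
    """Midpoint-rule estimate of J_D, the average of (1 - gamma^2)^(-1/2)
    over -pi < lambda_i < pi, with gamma the mean of cos(lambda_i).
    n must be even so that no grid point has gamma = +1 or -1 exactly."""
    if n % 2:
        raise ValueError("n must be even")
    h = 2.0 * math.pi / n
    cos_vals = [math.cos(-math.pi + (m + 0.5) * h) for m in range(n)]
    total = 0.0
    for combo in itertools.product(cos_vals, repeat=D):
        g = sum(combo) / D
        total += 1.0 / math.sqrt(1.0 - g * g)
    return total / n ** D


def spin_reduction(J_D):
    """Reduction of the spin per atom implied by (35) and (36) when
    S_c is replaced by S + 1/2 and S/S_c by 1: (J_D - 1)/2."""
    return 0.5 * (J_D - 1.0)


J2 = j_integral(2, 400)
J3 = j_integral(3, 50)
print(round(J2, 3), round(spin_reduction(J2), 3))
print(round(J3, 3), round(spin_reduction(J3), 3))

# For the linear chain the estimate keeps growing as the grid is refined,
# reflecting the logarithmic divergence of J_1.
J1_coarse = j_integral(1, 100)
J1_fine = j_integral(1, 10000)
print(round(J1_coarse, 2), round(J1_fine, 2))
```

The two-dimensional and three-dimensional estimates can be compared with the values quoted in (37), and the linear-chain estimates illustrate why no finite sublattice spin is obtained in one dimension.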

Qualitative Considerations on the Statistics of the Phase Transition in BaTiO3-type Ferroelectrics
Conference Proceedings, Lebedev Physics Institute, Academy of Sciences of the USSR, Nov. 1958, Fizika Dielektrikov (Moscow), 1960, pp. 290-297

This is the aforesaid soft mode paper. I went to Moscow to meet the Landau group and to talk with them and with Shirkov about superconductivity, but the invitation was to a dielectrics meeting. The work was from my notebooks of 1950-54, and I felt that in sending me Bell was paying me its "dielectric bill" for all that work under Shockley.

QUALITATIVE CONSIDERATIONS ON THE STATISTICS OF THE PHASE TRANSITION IN BaTiO3-TYPE FERROELECTRICS

P. W. ANDERSON Bell Telephone Laboratories Murray Hill, New Jersey

The phase transition of BaTiO3 is remarkably free of fluctuation effects relative to other similar cooperative transitions. This is related to the fact that the singularity which causes it involves only a very few of the modes of motion of the lattice, the remainder staying regular and harmonic. Various experimental consequences of these ideas are explored. The concept that the individual ions "rattle" in multiple-minimum potentials is shown to be misleading.

Barium titanate has occupied a special place in the minds of those interested in ferroelectrics because of its relatively simple structure and properties, and because of the technically important magnitude of some of its ferroelectric parameters. It is typical of a class of ionic ferroelectric crystals for which the theory of the phase transition is probably quite simple; it is our purpose to show why this is so, and some of the consequences of the theory. I should apologize for the always qualitative, and sometimes trivial, nature of the reasoning; my excuse is that in the long history of the subject, and even in the many years since I first suggested some of these things, there have been no publications along these lines.

The theory of an abnormal ionic dielectric like BaTiO3 is best approached by understanding a normal ionic dielectric of relatively high dielectric constant. For this the old methods of Debye1 for studying thermal expansion due to anharmonic forces are adequate, although we shall re-express them in a more convenient form. We shall simplify the dielectric by including only two branches of the vibration spectrum, the acoustic and the lowest (and presumably most completely polar) optical one. The zeroth approximation to the potential energy is given by the harmonic forces:

$$V_0 = \sum_k \big[s^a(k)\,q^a_{-k}q^a_k + s^o(k)\,q^o_{-k}q^o_k\big]. \tag{1}$$

$q^a_k$ and $q^o_k$ are the acoustical and optical mode coordinates respectively, and $s^a_k$ and $s^o_k$ the stiffnesses. The $q_k$'s are linear combinations of displacements like $x_j^n$, $j$ standing for the unit cell and $n$ for the atom in that cell:

[...]

$N$ being the total number of cells $j$. The independent harmonic oscillators $q_k$ are perturbed by the presence of anharmonic terms of various sorts, all resulting because the interatomic potential is steeper for closer approaches, less steep for distant ones, than a parabola. Typical terms are:

H' = N-^

£

A,a'7(*)tftf«:t_*

it,it' (<*>/?,7)=a

+

iV

Z^ 4 k,k',Q

1kQ-k+Ql-k'-Qlk'

+ ...

(2)

The basic approximation in a normal crystal is that the anharmonicity is small, so that in calculating the partition function or various averages it is correct to expand the exponential $e^{-\beta(H_0+H')} \approx e^{-\beta H_0}(1 - \beta H')$. (In most ferroelectrics classical theory is adequate, so for $H_0$ read $V_0$.) Thus all problems, to a certain order, reduce to finding averages of $H'$, usually multiplied by polynomials in one or a few $q_k$'s, using the unperturbed potential $V_0$. In calculating such averages we are helped by a theorem: Only quantities having zero total momentum (i.e. uniform throughout the crystal) have finite averages. Such are $|q_k|^2$,
(3)

k,a/3

N-*J2 [2Ar^(A:-^) + Af^(0)]|^| 2 |^| 2 in which both sums include the values k = 0.


(4)

The final observation is that averages of $q$'s for different $k$'s are independent. Using this concept it is easy to show quite generally that, to lowest order in the anharmonicity, all the thermal properties follow from a set of effective Hamiltonians, one for each mode, which are obtained by averaging (3) and (4) over all but the mode of interest. For instance,
#eff(tf ) = **!?* I" + i V - 1 / 2 | ^ | 2 ^ ( A 3 ( 0 ) + 2A 3 (fc))(^) P + iV-1|^|2E(

2 A

^ - r ) + A4(°))(l^l2)'

(5)

fc;/J

HM)

=« |

2

+ JV-1/2A3(0)(g0a3 +

+ ^^77E(A3(°) +

2A

1lf4) 3(fc))(kf| 2 )

k

+ ^l'?o a | 2 E( 2A ^') + A ^ 0 ) ) ^ | 2 ) -

(6)

k

From (5) we see that the anharmonicity leads to a thermal variation of stiffness (and thus of frequency) for each mode; when this variation is large we know the method fails. The macroscopic properties, on the other hand, depend on the equations for the $q_0$'s. $N^{1/2}q_0^a$ are the dilatation and other uniform strains, so that as the vibrations produce, through the cubic anharmonic terms, a linear term in $q_0^a$, this forces a displacement of $q_0^a$, which is the cause of thermal expansion. Similarly the last term changes the compressibility: these results are like the old Debye theory.1

The optical modes $q_0^o$, and especially the transverse ones for which there are no depolarization forces, control the dielectric properties. In fact, the ionic polarization of the crystal is

$$P = \sum_{j,n} e_n x_j^n \cong eN^{1/2}q_0^o. \tag{7}$$

The energy in an external field is $-EP$. Symmetry arguments show, when the crystal has a center of symmetry, that
+ 5 ° k T + 2JV-1/2A3(0)|,?00|2<7oa

+ w-'kSl2 Y, (2M*) + A4(o))(l
+ skeBOtotf.

(8)

The dielectric constant, or more exactly the polarizability $(\epsilon - 1)/4\pi$, can be derived from (7) and (8) using $\partial H/\partial q_0^o = 0$; we get

$$\ldots\; 4\pi Ne^2 \;\ldots \tag{9}$$

From (8) we see that there are two temperature-dependent effects on $\epsilon$. The $\lambda_3$ term is the effect of thermal expansion, and increases $\epsilon$ with $T$ because the more expanded lattice is softer. This is the normal behavior in low dielectric constant materials. The $\lambda_4$ term comes mostly from the optical branch and is the effect of the steep sides of the ions' potential wells. When the stiffness of the optical vibrations is lower (from (9), this is high $\epsilon$) this will predominate, as is observed.

A last idea from the theory of simple dielectrics which is useful is that of the dipolar contribution to $s(k)$. Ignoring the charges on the ions, there are certain short-range restoring forces due mostly to ionic repulsion, which would lead to a stiffness $s_{\mathrm{rep}}(k)$. Then there are the purely electrostatic interactions of the dipoles caused by the displaced ions:

$$\sum_{i,j} \frac{e^2}{2 r_{ij}^3}\left[\mathbf{x}_i \cdot \left(1 - 3\hat{\mathbf{r}}_{ij}\hat{\mathbf{r}}_{ij}\right) \cdot \mathbf{x}_j\right],$$

which, being quadratic in the displacements $x$, can also be transformed into a stiffness contribution

$$s_{\mathrm{dip}}(k)\,|q_k^o|^2 = -\tfrac{1}{2}\,|q_k^o|^2\, n e^2 L_k. \qquad (10)$$

$L_k$ here is the usual Lorentz local field parameter for $k = 0$ (i.e., for cubic sites $4\pi/3$ for transverse modes, $-8\pi/3$ for longitudinal ones$^2$), and varies with $k$ in the manner calculated by Heller and Marcus$^3$ and others. A sketch of a hypothetical $k$-dependence is given in Fig. 1. Then the optical branch stiffnesses may roughly be written

$$s^o(k) = s_{\mathrm{rep}}(k) - \tfrac{1}{2}\, n e^2 L_k.$$

When the Lorentz correction nearly cancels the repulsive term we have a high-dielectric constant material; when it is slightly greater at k = 0 we get a ferroelectric. Let us now go on to discuss the ferroelectric case. Starting at absolute zero, we have two choices: to discuss normal modes expanding about the actual ferroelectric state at T = 0, which leads to a highly complicated picture; or to expand about the hypothetical cubic state, assuming that deviations from it are not too large; this is the fruitful way. In the absence of the dipolar interactions, we imagine the potential expanded about the cubic state to be very normal, with a normally positive stiffness for all optical modes, and reasonably small anharmonic terms. Because of the dipolar interactions, however, the effective stiffness at T = 0 of a relatively small range of optical modes near k = 0 is

Fig. 1. Schematic dependence of the Lorentz sum L(k) on k in a cubic crystal.

Fig. 2. Contributions to s(k) in BaTiO3.

negative, as shown in Fig. 2. Clearly the most unstable mode (with the largest negative stiffness) is $q_0^o$, which will have an effective potential in the above formalism of

$$V_{\mathrm{eff}} = s_0^o(\mathrm{eff})\,|q_0^o|^2 + 3N^{-1}\lambda_4(0)\,|q_0^o|^4 + \cdots \qquad (11)$$


$q_0^o$ has the famous double-minimum potential, and will displace to the position of one of the minima, with a value of order $N^{1/2}$. This is the ferroelectric polarization. From the above expressions we would expect the 4th-order term in the potential to limit the displacement. Slater and Devonshire,$^4$ however, have shown that the positive $\lambda_4$ contribution is outweighed by a negative contribution involving the electrostrictive term $|q_0^o|^2 q_0^a$ in second order, so that the displacement is really limited by sixth-order terms, a fact which is responsible for the, at first, rather puzzling abruptness of the ferroelectric transition. Nonetheless the total displacement of $q_0^o$ is rather small, in BaTiO3 and, to a lesser extent, in several similar ferroelectrics such as KNbO3, etc. This displacement of $q_0^o$ has the effect, because of the term $N^{-1}|q_k|^2\,(2\lambda_4(k) + \lambda_4(0))\,\langle|q_0^o|^2\rangle$, of increasing the stiffness of all the other modes with negative stiffnesses so that they all have positive stiffnesses. Nonetheless, we can expect these stiffnesses to be small and the anharmonic terms to be fairly large for these $k$'s.

The basic simplicity of the problem is this: that these modes with peculiar behavior represent only a very tiny fraction of all modes, and even inserting into $\sum_k \lambda\,|q_k|^2$ the worst possible values for their mean displacements, they do not contribute an appreciable amount. Thus it is perfectly satisfactory to calculate the effective Hamiltonians as though these modes were reasonably harmonic, having the effective stiffnesses we compute; they then contribute very little, and the temperature dependence of the effective stiffness remains perfectly normal except for the small correction due to the displacement of $q_0^o$. The situation is shown in Fig. 3, where the unperturbed stiffnesses, those at low $T$ with $q_0^o$ displaced, and the stiffness near $T_c$ are all shown.
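The remark that a negative fourth-order term leaves the displacement to be limited by sixth-order terms, and that this makes the transition abrupt, can be illustrated with a generic Landau-Devonshire expansion. The sketch below is only an illustration under stated assumptions: the free energy $F = \tfrac12 a P^2 - \tfrac14 \beta P^4 + \tfrac16 \gamma P^6$, the symbol $P$ standing in for the displacement of $q_0^o$, and the values $\beta = \gamma = 1$ are all hypothetical and are not taken from the paper.

```python
import math

def free_energy(p2, a, beta=1.0, gamma=1.0):
    """Illustrative free energy as a function of u = P^2 (assumed form):
    F = a*u/2 - beta*u^2/4 + gamma*u^3/6, with beta, gamma > 0."""
    return 0.5 * a * p2 - 0.25 * beta * p2 ** 2 + gamma * p2 ** 3 / 6.0

def equilibrium_p2(a, beta=1.0, gamma=1.0):
    """P^2 at the global minimum of F for quadratic coefficient a.

    Stationary points with P != 0 satisfy a - beta*u + gamma*u^2 = 0;
    the larger root is the local minimum. It is the equilibrium state
    only if its free energy lies below F(0) = 0.
    """
    disc = beta ** 2 - 4.0 * gamma * a
    if disc < 0.0:
        return 0.0  # no polarized stationary point at all
    p2 = (beta + math.sqrt(disc)) / (2.0 * gamma)
    return p2 if free_energy(p2, a, beta, gamma) < 0.0 else 0.0

def transition_point(beta=1.0, gamma=1.0):
    """Closed-form transition: a_c = 3*beta^2/(16*gamma), jump P^2 = 3*beta/(4*gamma)."""
    return 3.0 * beta ** 2 / (16.0 * gamma), 3.0 * beta / (4.0 * gamma)

if __name__ == "__main__":
    a_c, jump = transition_point()
    print("a_c =", a_c, " jump in P^2 =", jump)
    for a in (0.30, 0.20, 0.18, 0.0, -0.2):
        print(a, equilibrium_p2(a))
```

In this toy model the order parameter jumps from zero to a finite value at $a_c > 0$, that is, while the quadratic coefficient is still positive. With a positive fourth-order term the same expansion would instead give a continuous onset at $a = 0$.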
This explains one of the rather surprising observed facts about BaTiO3: that the quadratic term in the Devonshire free energy equation continues to vary smoothly and linearly as $A(T - T_0)$ near the transition, in spite of the large fluctuations implied by the high dielectric constant.$^5$ A second fact is that the thermal vibrations, as measured by x-ray or neutron scattering, should not change much at $T_c$, and in particular should remain spherical. This is being confirmed by the most recent measurements.$^6$ Another interesting physical effect which has not been tested is also implied by Fig. 3. This is that near, but somewhat above, $T_c$ the stiffness for transverse optical modes, and thus the reststrahl frequency, should decrease to very low values. Rough estimates suggest $\lambda \sim 1$-$3$ mm.

In summary, we have pointed out that under the special circumstances which may apply to barium titanate, the temperature variation of the parameters in the effective Hamiltonian, which is equivalent to Devonshire's free energy function, should be controlled by modes of motion which are not sensitive to any fluctuations caused by the ferroelectric transition, which explains the experimental success of simple analytic forms for this function. Essentially, not the whole of the substance, but only a few modes of very long

wavelength, experience the double-minimum potential, so that the physical theory is entirely different from that expected if each ion sees a strongly anharmonic, double-minimum potential.

Fig. 3. Temperature dependence of s(k) in BaTiO3.

References
1. P. Debye, in Vorträge über die kinetische Theorie der Materie und der Elektrizität, M. Planck et al. (Teubner, Leipzig, 1914).
2. In noncubic materials, see G. Skanavi, Doklady Akad. Nauk 59, 231 (1948).
3. W. R. Heller and A. Marcus, Phys. Rev. 81, 809 (1951).
4. A. F. Devonshire, Phil. Mag. 40, 1040 (1949); 42, 1065 (1951); J. C. Slater, Phys. Rev. 78, 748 (1950).
5. Drougard, Landauer and Young, Phys. Rev. 98, 1010 (1955).
6. Frazer, Danner and Pepinsky, Phys. Rev. 100, 745 (1955).

Ordering and Antiferromagnetism in Ferrites Phys. Rev. 102, 1008-1013 (1956)

This is the best-known of a number of papers I have written involving what are now known as "frustrated" lattices, lattices into which it is hard to fit an ordered state. The first was one of my earlier papers on antiferromagnetism referred to in the discussion of the previous paper, and of course Gregory's triangular lattice paper had the same theme; I also had a summer student, Frank Stern, do spin wave calculations on a face-centered cubic antiferromagnet to show that it wouldn't order at finite T, which he published in 1954 under his own name. The subject is fashionable these days, so this is getting a late burst of citations, probably undeserved. After returning from Japan, I had a relatively fallow period serving as the "house theorist" on magnetism, for instance giving my Japanese magnetism lectures to a nascent group supposed to be developing magnetic-based devices. This is the only paper from this period which is much cited.

Reprinted from The Physical Review, Vol. 102, No. 4, 1008-1013, May 15, 1956. Printed in U.S.A.

Ordering and Antiferromagnetism in Ferrites

P. W. ANDERSON
Bell Telephone Laboratories, Murray Hill, New Jersey
(Received January 9, 1956)

The octahedral sites in the spinel structure form one of the anomalous lattices in which it is possible to achieve essentially perfect short-range order while maintaining a finite entropy. In such a lattice nearest-neighbor forces alone can never lead to long-range order, while calculations indicate that even the long-range Coulomb forces are only 5% effective in creating long-range order. This is shown to have many possible consequences both for antiferromagnetism in "normal" ferrites and for ordering in "inverse" ferrites.

I. LATTICE OF OCTAHEDRAL SITES

The ferrites are a class of oxides of iron-group metals, many of them of technical importance as ferromagnets, which crystallize in the spinel structure or structures closely related to it. The ideal ferrite has the formula AB2O4 (e.g., NiFe2O4) and the smaller metal ions A and B occupy certain interstices between the large oxygen ions, which latter are arranged in an approximation to the cubic close-packed structure.

FIG. 1. Photograph of a model of the spinel lattice. The dark balls are oxygen; the tetrahedral sites are connected to their neighboring oxygens by four diagonal bonds, the octahedral by six vertical and horizontal ones.

The structure is shown in Fig. 1.$^1$ The distortion of the lattice of oxygen ions is such that a cell of 32 oxygens has cubic symmetry again. There are, for each oxygen, one interstice surrounded by an octahedron of oxygen and two surrounded by a tetrahedron; half of the former and only one-eighth of the latter are occupied by metal ions. This means that in the unit cell there are 8 "tetrahedral sites" and 16 "octahedral sites." In a "normal" spinel, the 8 A ions occupy the 8 tetrahedral sites, the 16 B ions the octahedral ones. In an "inverse" spinel, 8 of the B ions occupy the tetrahedral sites, the other 8 and the 8 A's occupying the octahedral sites. Ferrites are known which range all the way from purely normal to purely inverse.

We are here interested in two problems, both having to do with ordering on the octahedral sites: (a) the problem of atomic ordering in inverse ferrites; (b) in normal ferrites with small or no magnetic moments on the A ions, the problem of antiferromagnetic ordering of spins.

To attack these problems we need to study carefully only the crystal lattice of the magnetic ions, particularly that of the octahedral sites. The occupied tetrahedral sites form a diamond-type lattice, the octahedral sites (see Fig. 2) a somewhat more complex cubic lattice which could be generated from this tetrahedral site lattice by displacing it through half the cube edge and then placing an atom at the center of each bond,

1 T. F. W. Barth and E. Posnjak, Z. Krist. 82, 325 (1932).

rather than at the original sites. The lattice of octahedral sites is thus identical with the lattice of oxygen atoms in high-temperature cristobalite.

The fact whose consequences we wish to explore in this paper is the following: on this lattice one can create perfect order, in so far as nearest neighbors are concerned, and nonetheless have a finite entropy; in other words, assuming nearest neighbor interactions only, one can find a number $W$ of configurations of $Nx$ A atoms and $N(1-x)$ B atoms which all have the lowest possible energy, $W$ being such that $\ln W$ is proportional to $N$. This statement can be proved explicitly when $x$ is $1/2$ or less than or equal to $1/4$ (correspondingly, also $\geq 3/4$) and seems likely to be true for all other values a fortiori since $1/4$ and $1/2$ are the values at which the best-looking long-range ordered arrangements occur.

II. THE ZERO-POINT ENTROPY

Of course neither of our two problems corresponds exactly to the situation in which this lattice can be shown rigorously to have a zero-point entropy. This situation is essentially the Ising model with only nearest neighbor forces. In the problem of ordering of metallic ions in the octahedral sites, one can expect the Ising model to be statistically sound, since there are only two states of each site, and the problem is essentially classical; but the interaction is not only between nearest neighbors because a large fraction of it is Coulomb energy.$^2$ However, we shall later show that the Coulomb energy of the different structures of perfect short-range order does not differ by very much; in particular, we calculate explicitly the energies of a class of structures containing $2^{N^{2/3}}$ members and find them all within 5% of the lowest. Thus there will be considerable meaning to a calculation of the zero-point entropy in this case.
In the case of antiferromagnetic ordering, the interaction will probably be short-range, because of the rather large distances and long chains of atoms connecting those B-sites which are not nearest neighbors; however, the counting of states and the whole statistical problem will be complicated by the fact that spins are quantum-mechanical vectors and not classical scalars as in the Ising model. Thus we have no hope of making quantitative estimates of zero-point entropy in this case. However, two lines of evidence indicate that nonetheless most lattices which have zero-point entropy in the Ising model will also be at least disordered at absolute zero in the real quantum theory of antiferromagnetism. First, if one thinks of the spins as classical vectors these lattices usually have an even greater zero-point entropy due to the extra freedom of rotation.$^3$ Second, Stern$^4$ has shown that the antiferromagnetic spin wave theory indicates that such systems will not order.

2 de Boer, van Santen, and Verwey, J. Chem. Phys. 18, 1032 (1950).
3 Not quite always; the triangular lattice [G. H. Wannier, Phys. Rev. 79, 357 (1950)] is an exception in which one can find an ordered configuration of real vectors.
4 F. Stern, Phys. Rev. 94, 1412 (1954).

FIG. 2. (a) Perspective view of the lattice of octahedral sites in spinel; (b) projection of the octahedral site lattice on the (100) plane. Legend: top layer (Z = 0), 2nd, 3rd, and 4th layers.

In this section, then, we present some results relating to the Ising model ordering problem, as applied to the lattice of octahedral sites of the spinels. First we discuss the 50-50 problem: $x = 1/2$. Here we find that our problem is very closely related to the old problem of the zero-point entropy of ice. The lattice we are discussing can be seen to be made up of tetrahedra of sites connected at their corners [see Fig. 2(a)]. These tetrahedra are themselves arranged in a diamond lattice. If, instead, they were arranged in a lattice of hexagonal symmetry (as in high-temperature quartz rather than cristobalite) the resulting lattice would be identical with ice. We have, then, a lattice which we might call "cubic ice."

We wish to put on our sites an equal number of + and − signs, in such a way that we get the maximum number of +− pairs. This can easily be seen to be possible only by putting on each tetrahedron two +'s and two −'s; if any tetrahedron has three +'s and one −, or vice versa, the effect is merely to substitute one like pair for one unlike pair, which of course is unfavorable. In ice, the corresponding criterion of the Pauling theory$^5$ is that two of the protons be near the oxygen (which is at the center of the tetrahedron) and two farther away. This seems very similar and is actually an identical criterion, for one can divide the tetrahedra of our lattice into two sets with opposite orientations such that all those in one set touch only those in the other. Then let the +'s on one set mean "proton near the oxygen atom," and vice versa for the other set, and the identity of the two problems is obvious. Pauling has estimated the entropy in the ice problem, and since this estimate is immediately applicable here and since we shall use a similar method later we repeat his derivation.
If we take only one of the two sets of tetrahedra (both contain $N/4$ members, where $N$ is the number of sites of the lattice), we can make $6^{N/4}$ arrangements satisfying the condition that two + and two − be on each, since there are 6 arrangements of each tetrahedron. Now we approximate the number of these that are acceptable to the second set of tetrahedra by saying that the probability that any one of these tetrahedra is correct is $3/8$ (as it is) and that the probabilities of different ones are uncorrelated (as they are not). Then we have

$$W_{x=1/2} = (3/8)^{N/4}\,(6)^{N/4} = \left(\sqrt{3/2}\right)^{N}, \qquad (1)$$

$$S_{x=1/2} = k \ln W = R \ln\sqrt{3/2} = 0.202R. \qquad (2)$$

(Note that all entropies quoted in this paper are per mole of octahedral sites, which is 1/2 mole of ferrite.) Onsager$^6$ has proved rigorously that this estimate is actually a lower limit; one might estimate that it is probably no more than 10-20% wrong.
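The counting behind Eqs. (1) and (2) is easy to check by brute force. The sketch below enumerates the 16 sign assignments of a single tetrahedron, confirms the factors 6 and 3/8 used above, and evaluates the Pauling-type entropy per site in units of $R$. The function names are illustrative only, and the independence of the second-set tetrahedra is the same approximation the text makes, not an exact result.

```python
import math
from itertools import product

def count_tetrahedron_states(n_plus):
    """Number of the 2^4 sign assignments of one tetrahedron with n_plus '+' signs."""
    return sum(1 for signs in product((+1, -1), repeat=4)
               if signs.count(+1) == n_plus)

def pauling_entropy_per_site(n_plus):
    """Pauling-type estimate of the zero-point entropy per site, in units of R.

    First set of N/4 tetrahedra: `allowed` arrangements each.
    Second set of N/4 tetrahedra: each is assumed to be acceptable with the
    probability it would have if its four sites were independent, with a
    fraction n_plus/4 of '+' signs (the approximation used in the text).
    """
    x = n_plus / 4.0
    allowed = count_tetrahedron_states(n_plus)
    p_ok = allowed * x ** n_plus * (1.0 - x) ** (4 - n_plus)
    # W = (allowed * p_ok)^(N/4), so ln W / N = (1/4) ln(allowed * p_ok)
    return 0.25 * math.log(allowed * p_ok)

if __name__ == "__main__":
    print("arrangements with two + and two -:", count_tetrahedron_states(2))
    print("fraction of all 16 assignments:", count_tetrahedron_states(2) / 16)
    print("S/R for x = 1/2:", pauling_entropy_per_site(2))
```

For `n_plus = 2` the function evaluates $\tfrac12 \ln(3/2)$, which agrees with Eq. (2) to within rounding.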


The case of $x = 1/4$ is also of special interest since lithium ferrite (LiFe5O8) corresponds to this, and is actually one of the few ferrites which do have long-range order. One can here set a rigorous lower limit (see Appendix I) of $S > 0.081R$. Pauling's method gives $S = 0.131R$, and is probably again an underestimate by a small factor. This answer is obtained as follows: The first set of tetrahedra has $4^{N/4}$ configurations of 1 + and 3 −. The probability of any single one of the second tetrahedra then being 1 + and 3 − is $4(1/4)(3/4)^3 = (3/4)^3$. In the spirit of Pauling's technique, we reduce the total number of configurations by this factor for each tetrahedron. Then

$$W = (3/4)^{3N/4} \times 4^{N/4}, \qquad S = \tfrac14 R \ln(27/16) = 0.131R.$$

For values of $x$ less than $1/4$, the zero-point entropy is obviously finite. We can find (actually in an infinite number of ways) a set of $N/4$ sites, none of which are neighbors of each other. We have $xN$ atoms to assign to these sites, which can be done in

$$(\tfrac14 N)!\,/\,[(\tfrac14 N - Nx)!\,(Nx)!]$$

ways. The logarithm of this number is proportional to $N$. Values in this range, as well as in the range $1/4 < x < 1/2$, are not of particular interest, although it is almost certain that all possible values of $x$ lead to a zero-point entropy. This is the major conclusion of this section.

III. THEORY AND EXPERIMENT ON ATOMIC ORDERING

If the forces between atoms in ferrites were short-range, we should predict that there would be no cases of long-range ordering on the octahedral sites, but rather that short-range order should set in at a rather high temperature. However, some of the forces are Coulomb ones and no doubt there are other forces of longer than nearest neighbor range. We have calculated Coulomb ordering energies of various structures with $x = 1/2$ and $1/4$ in order to verify our guess, based on the last section, that there should not be a very great Coulomb energy difference between the different structures of perfect short-range order.

In Fig. 3 is shown the ordered structure postulated by Verwey, Haayman, and Romeyn$^7$ for Fe3O4 (again showing only the octahedral sites). The lattice of octahedral sites, when visualized from a 100 direction, can be seen to be made up of lines of atoms laid down in 011 directions. The lines in the same 100 planes are exactly twice as far apart as their own internal spacing, which is $d = a/2\sqrt{2}$, $a$ being the cubic unit cell edge. In successive 100 planes of the cell these lines lie at 90° to each other: $011$, $0\bar{1}1$, $011$, etc.; while the lines two planes apart fall in the spaces of the previous set. In

5 L. Pauling, The Nature of the Chemical Bond (Cornell University Press, Ithaca, 1938), p. 303.
6 L. Onsager (private communication).
7 Verwey, Haayman, and Romeyn, J. Chem. Phys. 15, 181 (1947).

two of the possible sets of 100 planes these lines of the reference 7 ordered structure are alternating ABAB etc.; they also have a particular relationship to each other (lines of the same 100 plane "in phase," i.e., A closest to A and B to B). It is easily seen, however, that any structure made up from a set of $011$ and $0\bar{1}1$ alternating lines having any relative phase whatever is one of the short-range ordered structures for $x = 1/2$. These do not make up all short-range ordered structures, but only a set of $\sim 2^{N^{2/3}}$ of them, since there remains only the freedom to choose one 100 plane of the lattice arbitrarily.$^8$

The Coulomb energies of this set of structures can be quite accurately calculated by the original Madelung method.$^9$ In the Madelung method, one divides the lattice into neutral lines of atoms and calculates both the self-energy and the potential at external points of these lines. This potential falls off rapidly with distance, making it easy to sum up the interactions of the lines with each other to rather good accuracy. In the present case the method is particularly suitable because the lines in successive planes do not interact, since they cross each other exactly half-way between the two kinds of charges. Thus it is only next-nearest neighboring lines which interact; third nearest again do not and further neighbors can be neglected.

The self-energy of ordering of the lines is, for ordering of charges $q_1$ and $q_2$,

$$E_{\mathrm{self}} = -[(q_1 - q_2)^2/2d]\,\ln 2 = -0.9802\,(q_1 - q_2)^2/a.$$

The potential caused by an alternating line at a distance $nd$ directly opposite one of the positive atoms of the line can be written, to three-figure accuracy,

$$\phi_{\mathrm{opp}} = (4/d)\,K_0(\pi n). \qquad (5a)$$

Each line has 6 next-neighbor lines with which it can interact, two at $n = 2$ in its own 100 plane and four at $n = \sqrt{3}$ in the next-neighbor plane. At most four can be (on the average) opposite in sign; the most favorable case is that in which these are all the $n = \sqrt{3}$ lines, which happens to be the order of reference 7 (Fig. 3). One can show by adding up expressions of the form (5a) that the additional energy is then

$$(E_{\mathrm{inter}})_{\min} = -0.0206\,(q_1 - q_2)^2/a, \qquad (5b)$$

leading to a total energy

$$E_{\mathrm{tot}} = -1.001\,(q_1 - q_2)^2/a \qquad (6)$$

in accordance with reference 1. The most unfavorable case is that of all lines "in phase," which has an energy of

$$(E_{\mathrm{inter}})_{\max} = +0.0310\,(q_1 - q_2)^2/a, \qquad (7)$$

with a total of $-0.9492$. All of the $2^{N^{2/3}}$ structures it is possible to make out of alternating lines lie between these limits and thus within 5% of the minimum; one suspects that actually the remainder also lie primarily in this range.

FIG. 3. The Verwey-ordered structure on the octahedral sites (projected on 100 plane). Legend: top layer, 2nd, 3rd, and 4th layers.

The total ordering energy (6) is of order of magnitude 2-3 eV, i.e., $> 10^4$ °K. The extra energy to be gained by long-range order is, however, of order a few percent of this or 500-1000 °K. This may be still further reduced by polarization screening effects, so that the transition temperature 100°K for Fe3O4 is not surprising.$^{10}$ However, it seems certain that Fe3O4 will be short-range ordered far above the transition temperature. In fact the observed entropy change in the transition has been found to be little more than $0.3R$ $^{11}$ per mole of octahedral sites rather than the $R \ln 2 = 0.69R$ to be expected in a transition from complete disorder to complete order, in rough agreement with Eq. (2). Long-range ordering on the octahedral sites is observed in no other AB2O4 inverse ferrite, although the normal vs inverse ordering, with its only slightly larger motivation in Coulomb energy, often occurs. This fact is probably explained by the above considerations: that the energy to be gained by long- as opposed to short-range ordering is very small, while the entropy change is still large. (We appeal here to the qualitative relationship $T_c \approx \Delta u/\Delta s$.) Thus the transition temperatures are too low and the ions are not mobile enough to permit ordering.

8 Not in the structure of reference 7, but in all others, one finds hexagons of alternating A and B, in general a number of order $N$ of them. These may be rotated through 60° without changing short-range order. These two operations (sliding lines and turning hexagons) probably generate all possible structures.
9 E. Madelung, Physik. Z. 19, 524 (1918).
10 J. H. Van Santen, Philips Research Repts. 5, 282 (1950), has shown that Coulomb and other long-range effects should in any case tend to create short- rather than long-range order and to lower transition temperatures. It is however not clear whether our considerations are logically independent of Van Santen's, so that we do not rely on his effect.
11 J. E. Kunzler (private communication).

There does exist one proven case of atomic ordering on the octahedral sites: the ferrite LiFe5O8, which has been shown$^1$ to have 1 Li to 3 Fe ordered on these sites and to have a transition at 1200°K. The ordering energy of the observed arrangement has been computed elsewhere$^1$ to be

$$E_{\mathrm{obs}} = -0.712\,(q_1 - q_2)^2/a. \qquad (9)$$
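The line sums of Sec. III, Eqs. (5a) to (7), can be checked numerically with a few lines of code. The sketch below is a reading of the text under stated assumptions rather than the author's own procedure: energies are taken per pair of sites in units of $(q_1 - q_2)^2/a$, each ion is assumed to carry an ordering charge $\pm(q_1 - q_2)/2$, only the leading Bessel term of Eq. (5a) is kept, and $d = a/2\sqrt{2}$. The helper `bessel_k0` is a hypothetical stand-in that evaluates the standard integral $K_0(x) = \int_0^\infty e^{-x\cosh t}\,dt$ by the trapezoid rule.

```python
import math

def bessel_k0(x, step=0.01, t_max=6.0):
    """Modified Bessel function K0(x) via K0(x) = integral_0^inf exp(-x cosh t) dt.

    Plain trapezoid rule with half weight at t = 0; the integrand is even in t
    and decays extremely fast, so this is ample for the x > 1 needed here.
    """
    n_steps = int(round(t_max / step))
    total = 0.5 * math.exp(-x)
    for j in range(1, n_steps + 1):
        total += math.exp(-x * math.cosh(j * step))
    return total * step

A_OVER_D = 2.0 * math.sqrt(2.0)  # from d = a / (2 sqrt 2)

def line_energies():
    """Self and inter-line energies per pair of sites, units of (q1 - q2)^2 / a.

    Assumed bookkeeping: one neighboring line at distance n*d contributes
    +/- K0(pi*n) * (q1 - q2)^2 / d, negative when the line is opposite in sign.
    """
    e_self = -0.5 * math.log(2.0) * A_OVER_D
    k_sqrt3 = bessel_k0(math.pi * math.sqrt(3.0))  # four lines at n = sqrt(3)
    k_two = bessel_k0(2.0 * math.pi)               # two lines at n = 2
    # Most favorable: the four n = sqrt(3) lines opposite, the two n = 2 lines in phase.
    e_inter_min = A_OVER_D * (-4.0 * k_sqrt3 + 2.0 * k_two)
    # Most unfavorable: all six neighboring lines in phase.
    e_inter_max = A_OVER_D * (+4.0 * k_sqrt3 + 2.0 * k_two)
    return e_self, e_inter_min, e_inter_max

if __name__ == "__main__":
    e_self, e_min, e_max = line_energies()
    print("E_self      =", e_self)
    print("E_inter,min =", e_min, " total =", e_self + e_min)
    print("E_inter,max =", e_max, " total =", e_self + e_max)
    print("spread / |E_min| =", (e_max - e_min) / abs(e_self + e_min))
```

Under these assumptions the three quoted coefficients, 0.9802, 0.0206 and 0.0310, are reproduced to the stated three-figure accuracy, and the spread between the best and worst line structures comes out at about 5% of the total.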

In Appendix II we calculate the energies, not of this observed arrangement but of a group of structures of multiplicity $2^{N^{2/3}}$ included among the linearly ordered structures. All of these structures have lower Coulomb energies than (9), the lowest being

$$E_{\min} = -0.751\,(q_1 - q_2)^2/a, \qquad (10)$$

some 5% below (9). This indicates that Coulomb energy is not the motivation for the long-range ordering, which suggests that perhaps some kind of valence force is at work here due to the chemical dissimilarity of Li and Fe. Under these circumstances we do not consider it too surprising that the transition temperature is so much higher than that of Fe3O4, since these unknown forces are clearly in control of the situation.

IV. ANTIFERROMAGNETISM ON THE OCTAHEDRAL SITES

Our comments on this subject are necessarily very brief and qualitative, both because (as already mentioned) the theory cannot apply directly to quantum-mechanical vectors and because the observational data are sketchy and unclear.

First we should emphasize that the present notions cannot, unfortunately, help to explain the complex patterns observed by Corliss and Hastings$^{12}$ for ZnFe2O4 and ZnCr2O4 since in these substances what short-range order exists points to a ferromagnetic nearest-neighbor interaction.

For classical vectors with nearest neighbor interaction, instead of the 6 best ways of arranging each tetrahedron there are an infinity: all ways in which $S_{\mathrm{tot}} = 0$. For quantum-mechanical vectors $S$, there are only $2S + 1$, but of course the situation is much more complicated than that. One can only expect that for reasonably large $S$ the classical behavior is not too badly approximated. If so, the least-energy state may be any complicated combination of vectors pointing in various directions, with actual long-range order only enforced by what one expects to be rather small second-neighbor forces.

Antiferromagnetism in the octahedral sites has been invoked in the theory of Yafet and Kittel,$^{13}$ which suggests that with antiferromagnetic coupling on the octahedral sites and a relatively weak coupling of these with the tetrahedral sites, the octahedral sites will remain antiferromagnetically ordered but will turn to some extent into the direction opposite to the tetrahedral sites, thus giving a total magnetization which however will not be related simply to the atomic moments. We suggest that this phenomenon may occur but that the order on the octahedral sites will be, at least at higher temperatures, short- rather than long-range.

ACKNOWLEDGMENTS

I am indebted to Professor L. Onsager for an interesting and helpful discussion of these problems, and to L. Corliss and J. Hastings for early discussions of their results.

12 L. Corliss and J. Hastings (private communication).
13 Y. Yafet and C. Kittel, Phys. Rev. 87, 290 (1952).

APPENDIX I. RIGOROUS LOWER LIMIT TO THE ZERO-POINT ENTROPY IN THE 3-1 CASE

We here exhibit a group of ordered states of the 3-1 order problem on the octahedral sites which have an entropy $0.0809R$. In order to do this it is useful to look at the lattice from still another viewpoint, along a 111 axis. In the 111 direction there are planes of atoms in a triangular lattice of triangle edge $2d$, interspersed between denser planes (3 times as many atoms) which have the so-called "kagome" lattice.$^{14}$ Figure 4 shows this lattice with the atoms in the two adjoining planes. Note that each small triangle in the kagome lattice is associated with an atom in one of the two neighboring planes, either above or below the plane; these triangles and the atoms associated with them form tetrahedra which must contain three A and one B atom apiece to satisfy the perfect nearest neighbor ordering. Our set of states are those in which, in all of these tetrahedra, the B atom is in the triangle belonging to the kagome lattice. Figure 5 shows a small portion of the kagome lattice having this structure: A2B.

Now we associate with the points at the centers of the large hexagons of the kagome lattice plus or minus signs according to the rule that every A atom represents a +− bond, every B a ++ or −−. That this rule works is clear from Fig. 5. These + and − form a triangle lattice in the perfect antiferromagnetic order; and from any arrangement of such a lattice we can make a perfect arrangement of the A2B kagome lattice. But the zero-point entropy of the triangular lattice has been calculated by Wannier$^{3,16}$ to be

$$(S_0)_\triangle = 0.323R, \qquad (A1)$$

so that, since there is one triangle site to every four octahedral sites, we get the zero-point entropy of our set of states to be

$$S_{\mathrm{oct}}(A_3B) > \tfrac14 S_\triangle = 0.0809R. \qquad (A2)$$

This is the result quoted in the text.

14 G. F. Newell and E. Montroll, Revs. Modern Phys. 25, 373 (1953).
16 J. Wahl has pointed out that due to a trivial error in the last step the final numerical result for $S_0$ in Wannier's paper is $0.323R$, not $0.338R$.
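The bound (A2) can be set beside the Pauling-type estimate for the 3-1 case quoted in the main text. The sketch below is a small arithmetic check, not part of the original derivation: the function names are illustrative, the triangular-lattice entropy is the three-figure value of Eq. (A1), and the Pauling estimate again assumes independent second-set tetrahedra.

```python
import math

WANNIER_TRIANGULAR_ENTROPY = 0.323  # units of R, as quoted in Eq. (A1)

def lower_bound_a3b(s_triangle=WANNIER_TRIANGULAR_ENTROPY):
    """Entropy per octahedral site of the kagome-ordered states, units of R.

    One triangular-antiferromagnet site corresponds to four octahedral sites.
    """
    return s_triangle / 4.0

def pauling_estimate_a3b():
    """Pauling-type estimate for one '+' and three '-' per tetrahedron, units of R."""
    allowed = math.comb(4, 1)                    # 4 arrangements per tetrahedron
    p_ok = allowed * (1 / 4) * (3 / 4) ** 3      # chance a second-set tetrahedron fits
    return 0.25 * math.log(allowed * p_ok)       # (1/4) ln(27/16)

if __name__ == "__main__":
    print("rigorous lower bound:", lower_bound_a3b())
    print("Pauling estimate:    ", pauling_estimate_a3b())
```

With the three-figure input the quotient is about $0.0808R$, which matches the quoted $0.0809R$ to within the rounding of the Wannier value, and it lies below the Pauling estimate of about $0.131R$, as a lower bound should.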

FIG. 4. (111) projection of the lattice of octahedral sites, showing the "kagome" plane lattice and adjacent sites. Legend: sites in the kagome plane, sites in the plane above, sites in the plane below.

FIG. 5. A portion of the kagome lattice with A2B ordering and the equivalent triangular lattice. Legend: sites of the equivalent triangular antiferromagnet.

+ 2 - 2 +2 •••: 3-1

APPENDIX II. COULOMB E N E R G I E S OF S O M E STATES OF T H E 3-1 CASE

We calculate here the Coulomb energies of a still different set of states of the ASB case. This set of states resembles those of the AB case which we investigated in Sec. I l l very closely. In fact, we again divide the lattice into lines in Oil and Oil directions. However, we make only one set of lines alternating, the other set being all A. This gives us a 3-1 ratio and makes the nearest neighbors of all B atoms A. Again we can expect that the best arrangement will put coplanar lines in phase, those out of the plane and only vSrf away out of phase. This happens to be the ordered situation one gets by taking the triangular lattice planes of Appendix I to be B and all of the Kagome planes A. We should mention that the ordered structure observed by Braun 1 for LiFeeOs belongs to neither this group nor to that of Appendix I ; we stated in the text that its Coulomb energy is quite poor. Our technique now is to make A be a charge of — 1, B of + 3 (we must have a neutral lattice to get convergent answers). We can eliminate the contribution of the all-^4 lines by a trick. We consider the ABAB lines to be the superposition of + 1 + 1 + 1 • • • and

77

3-1=1 1 1 1 + 2-2 2-2 = 3i+92.

(A3)

The all-$A$ lines have charges $-1 = q_3$. $q_1$ and $q_3$ taken together are simply the ordered arrangement of Fe₃O₄, and have an energy

$$E_{q_1,q_3} = -(1.001/a)\times(2)^2 \tag{A4}$$

per pair of sites. The alternating arrangement $q_2$ simply does not interact with either $q_1$ or $q_3$ because in both cases there are as many $+-$ as $++$ pairs at any distance. Thus Eq. (A4) gives the entire contribution, including self-energy, of $q_1$ and $q_3$. The energy of $q_2$ is exactly the same thing we calculated in Sec. III, except that there are only half as many sites and $\Delta q = 4$:

$$E_{q_2}^{\min} = -(1.001/a)\times(16/2), \qquad E_{q_2}^{\max} = -(0.949/a)\times(16/2), \tag{A5}$$

so that for the whole lattice we add (A4) and (A5) and take into account that the charge difference is 4; we get

$$E_{\min} = -(0.751/a)(\Delta q)^2, \qquad E_{\max} = -(0.725/a)(\Delta q)^2.$$

Absence of Diffusion in Certain Random Lattices Phys. Rev. 109, 1492-1505 (1958)

There followed in 1957-9 a small-scale version of an Annus Mirabilis (no invidious comparison with Einstein intended, just that this kind of sudden creative flowering does happen with some generality). One possible causative factor was a change in the situation at Bell Labs: after a series of losses of major theorists including Bardeen, Kittel, and shortly H.W. Lewis and Wannier, Bill Baker, then the vice president for research, decided to encourage us to form a separate theory department rather than being beholden to experimental masters. We (Peter Wolff, Conyers Herring, and I) responded by escalating ourselves into a full-scale academic department with postdocs, an elective chairman, and many other privileges such as relatively casual travel and sabbaticals, which were standard in academic physics at the time but unprecedented in an industrial lab. Also for a couple of years we had an extensive summer visitor program including nuclear and particle physicists such as Keith Brueckner, John Taylor, and John Ward as well as an expansion of our solid state visitor program: to Walter Kohn and Quin Luttinger we added among others Philippe Nozières, David Pines, Elihu Abrahams, and as a summer student Bob Schrieffer. To this was added substantial salary improvement, since our losses at that expansive time were often to better-paid academic jobs.

In the year 1958 I produced this paper, on which my Nobel prize was mostly based, as well as the papers on gauge invariance in superconductivity, the paper on superexchange, and the dirty superconductor paper. (The soft mode paper was older work.) The gestation of this one had occurred in 1956 while I was still in my house theorist mode, but it gained from the independence and self-confidence of the new PWA. Mathematical physicists have made an industry of "proving" aspects of the results of this very mathematical paper.
I took great care with the mathematics and indeed as far as I can see most of their work amounts to restatement of the same argument. That the effort was worth something I felt was proved by the fact that two of the colleagues I most admired, Quin Luttinger and Walter Kohn, expressed opposite views: Walter thought it must be wrong, Quin that it was obvious.

Reprinted from THE PHYSICAL REVIEW, Vol. 109, No. 5, 1492-1505, March 1, 1958. Printed in U. S. A.

Absence of Diffusion in Certain Random Lattices

P. W. ANDERSON
Bell Telephone Laboratories, Murray Hill, New Jersey
(Received October 10, 1957)

This paper presents a simple model for such processes as spin diffusion or conduction in the "impurity band." These processes involve transport in a lattice which is in some sense random, and in them diffusion is expected to take place via quantum jumps between localized sites. In this simple model the essential randomness is introduced by requiring the energy to vary randomly from site to site. It is shown that at low enough densities no diffusion at all can take place, and the criteria for transport to occur are given.

I. INTRODUCTION

A NUMBER of physical phenomena seem to involve quantum-mechanical motion, without any particular thermal activation, among sites at which the mobile entities (spins or electrons, for example) may be localized. The clearest case is that of spin diffusion¹,²; another might be the so-called impurity band conduction at low concentrations of impurities. In such situations we suspect that transport occurs not by motion of free carriers (or spin waves), scattered as they move through a medium, but in some sense by quantum-mechanical jumps of the mobile entities from site to site. A second common feature of these phenomena is randomness: random spacings of impurities, random interactions with the "atmosphere" of other impurities, random arrangements of electronic or nuclear spins, etc.

Our eventual purpose in this work will be to lay the foundation for a quantum-mechanical theory of transport problems of this type. Therefore, we must start with simple theoretical models rather than with the complicated experimental situations on spin diffusion or impurity conduction. In this paper, in fact, we attempt only to construct, for such a system, the simplest model we can think of which still has some expectation of representing a real physical situation reasonably well, and to prove a theorem about the model. The theorem is that at sufficiently low densities, transport does not take place; the exact wave functions are localized in a small region of space. We also obtain a fairly good estimate of the critical density at which the theorem fails. An additional criterion is that the forces be of sufficiently short range (actually, falling off as $r\to\infty$ faster than $1/r^3$), and we derive a rough estimate of the rate of transport in the $V\sim 1/r^3$ case.

Such a theorem is of interest for a number of reasons: first, because it may apply directly to spin diffusion among donor electrons in Si, a situation in which Feher³ has shown experimentally that spin diffusion is negligible; second, and probably more important, as an example of a real physical system with an infinite number of degrees of freedom, having no obvious oversimplification, in which the approach to equilibrium is simply impossible; and third, as the irreducible minimum from which a theory of this kind of transport, if it exists, must start. In particular, it re-emphasizes the caution with which we must treat ideas such as "the thermodynamic system of spin interactions" when there is no obvious contact with a real external heat bath.

The simplified theoretical model we use is meant to represent reasonably well one kind of experimental situation: namely, spin diffusion under conditions of "inhomogeneous broadening."⁴ We assume that we have sites $j$ distributed in some way, regularly or randomly, in three-dimensional space; the array of sites we call the "lattice." We then assume we have entities occupying these sites. They may be spins or electrons or perhaps other particles, but let us call them spins here for brevity. If a spin occupies site $j$ it has energy $E_j$ which (and this is vital) is a stochastic variable distributed over a band of energies completely randomly, with a probability distribution $P(E)\,dE$ which can be characterized by a width $W$. Finally, we assume that between the sites we have an interaction matrix element $V_{jk}(r_{jk})$, which transfers the spins from one site to the next. $V_{jk}$ may or may not itself be a stochastic variable with a probability distribution.

If one thinks of the mobile entities as up or down electron spins which can occupy various impurity sites, such as color or donor centers, in a crystal, then the random energies $E_j$ are the hyperfine interactions with the surrounding nuclei (Si²⁹ for the donors, alkali and halide nuclei for color centers) and $P(E)$ is the line-shape function. In this example, $V_{jk}$ is that part of the interaction which allows an up spin on atom $j$ to flip down while a down spin on $k$ flips up, and the simple process we study is the motion of a single "up" spin among "down" spins.

The "impurity-band" example would again make the sites donors or acceptors, but the $E_j$'s would be energy fluctuations of the donor ground state caused perhaps by Coulomb interactions with randomly placed charged centers; the moving entity would be a single ionized donor. We would have to assume the states of the different donors to be orthogonal, which is no restriction if $V$ is arbitrary. More generally, the situation described by our theorem probably holds in the low-concentration limit and the low-energy tail of almost any model for an impurity band.

One important feature which is missing from our simple model is contact with any external thermal reservoir. When the present theorem holds, some such contact will actually control the transport processes. Our purpose is only to show that the model in itself provides no such reservoir and permits no transport, in spite of its large size and random character; study of the real relaxation and transport processes must come later.

Our basic technique is to place a single "spin" on site $n$ at an initial time $t=0$, and to study the behavior of the wave function thereafter as a function of time. Our fundamental theorem may be restated as: if $V(r_{jk})$ falls off at large distances faster than $1/r^3$, and if the average value of $V$ is less than a certain critical $V_c$ of the order of magnitude of $W$; then there is actually no transport at all, in the sense that even as $t\to\infty$ the amplitude of the wave function around site $n$ falls off rapidly with distance, the amplitude on site $n$ itself remaining finite.

One can understand this as being caused by the failure of the energies of neighboring sites to match sufficiently well for $V_{jk}$ to cause real transport. Instead, it causes virtual transitions which spread the state, initially localized at site $n$, over a larger region of the lattice, without destroying its localized character. More distant sites are not important because the probability of finding one with the right energy increases much more slowly with distance than the interaction decreases.

This theorem leaves two regions of failure (and therefore transport) to be investigated, namely, $V\propto 1/r^3$, as in spin diffusion by dipolar interactions, and $V\sim W$, or the high-concentration limit. In both cases, the methods used to prove the fundamental "nontransport" theorem will probably allow us to outline an approach to the transport problem. In the $1/r^3$ case, we show that transport may be much slower than the estimates of reference 2 would predict. In the case $V\sim W$, we show that transport finally occurs not by single real jumps from one site to another but by the multiplicity of very long paths involving multiple virtual jumps from site to site.

¹ N. Bloembergen, Physica 15, 386 (1949).
² A. M. Portis, Phys. Rev. 104, 584 (1956).
³ G. Feher (private communication).
⁴ A. M. Portis, Phys. Rev. 91, 1071 (1953).

II. SUMMARY OF THE REASONING

Since the mathematical development is fairly complicated and involves lengthy consideration of each of a number of points, we should like to summarize the reasoning rather fully in this section, leaving the proofs and details to later sections.

First, then, let us set up the simple model which we study. The equation for the time-dependence of the probability amplitude $a_j$ that a particle is on the site $j$ is:

$$i\dot{a}_j = E_j a_j + \sum_{k\neq j} V_{jk}a_k. \tag{1}$$

Here we measure energies in frequency units, so we can set $\hbar = 1$. Equation (1) simply restates the assumptions about the model made in the Introduction. We study the Laplace transform of Eq. (1): let

$$f_j(s) = \int_0^\infty e^{-st}a_j(t)\,dt, \tag{2}$$

and then

$$i[sf_j(s) - a_j(0)] = E_j f_j + \sum_{k\neq j} V_{jk}f_k. \tag{3}$$

The variable $s$ must be studied as an arbitrary complex variable with positive or zero real part. The transport problem which interests us is: suppose we know the probability distribution $|a_j(0)|^2$ at time $t=0$, and that it is appreciable only in a certain range of frequency $E_j$ or space $r_j$. Then how fast, if at all, does this probability distribution diffuse away from this region?
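As a rough numerical illustration of the transport question just posed, Eq. (1) can be solved exactly for a small finite lattice by diagonalizing its matrix. The sketch below is not part of the original paper. It assumes a one-dimensional open chain with nearest-neighbor coupling `V` and site energies drawn uniformly from a band of width `W`; the function name and all parameter values are illustrative choices. It returns the long-time average of $|a_0(t)|^2$ for a spin started on the middle site.

```python
import numpy as np

def long_time_return_probability(W, V=1.0, n_sites=201, seed=0):
    """Long-time average of |a_0(t)|^2 for Eq. (1) on a finite open chain.

    Assumptions (illustrative, not from the paper): 1D chain, nearest-neighbor
    coupling V, site energies E_j uniform in (-W/2, W/2).
    """
    rng = np.random.default_rng(seed)
    E = W * (rng.random(n_sites) - 0.5)          # random site energies E_j
    hop = np.eye(n_sites, k=1) + np.eye(n_sites, k=-1)
    H = np.diag(E) + V * hop                     # matrix on the right of Eq. (1)
    _, evecs = np.linalg.eigh(H)
    start = n_sites // 2                         # spin starts on the middle site
    weights = evecs[start, :] ** 2               # overlap of start site with each eigenstate
    # For a nondegenerate spectrum the time average of |a_start(t)|^2
    # equals the sum of the squared weights.
    return float(np.sum(weights ** 2))

if __name__ == "__main__":
    for W in (0.0, 2.0, 20.0):
        print(W, long_time_return_probability(W))
```

A value of order unity means the amplitude stays near the starting site, while a value of the order of the inverse chain length means it has spread over the whole chain. A chain is used only because it is cheap; it localizes far more readily than the three-dimensional lattices treated in the text.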

The simplest question we can ask is to assume $a_0(0)=1$ for a particular atom $j=0$ and inquire how $a_j$ varies with time, or $f_j(s)$ with $s$. In particular, for very small real part of $s$ we are studying the behavior as $t\to\infty$; for instance, $\lim_{s\to 0} sf_j(s)$ is in fact $\langle a_j(\infty)\rangle_{\mathrm{Av}}$.

Equation (3) can be written

$$f_j(s) = \frac{i\delta_{0j}}{is-E_j} + \sum_{k\neq j}\frac{1}{is-E_j}V_{jk}f_k(s). \tag{4}$$

In one approach to ordinary transport theory⁵ this equation is solved by iteration, in which the equations (4) not involving $f_0(s)$ are solved for $f_j(s)$ in terms of $f_0(s)$:

$$f_j(s) = \frac{1}{is-E_j}V_{j0}f_0(s) + \sum_k \frac{1}{is-E_j}V_{jk}\frac{1}{is-E_k}V_{k0}f_0(s) + \cdots. \tag{5}$$

Then the zeroth equation becomes

$$f_0(s) = \frac{i}{is-E_0} + \frac{1}{is-E_0}\Big(\sum_k V_{0k}\frac{1}{is-E_k}V_{k0} + \sum_{k,l}V_{0k}\frac{1}{is-E_k}V_{kl}\frac{1}{is-E_l}V_{l0} + \cdots\Big)f_0(s). \tag{6}$$

Let us call the quantity

$$\sum_k\frac{(V_{0k})^2}{is-E_k} + \sum_{k,l}\frac{V_{0k}V_{kl}V_{l0}}{(is-E_k)(is-E_l)} + \cdots = V_c(0). \tag{7}$$

In many cases the first term suffices. Studying this first term, we see that it can be written

$$V_c(0) = \sum_k (V_{0k})^2\Big(\frac{-E_k}{s^2+E_k^2} - \frac{is}{s^2+E_k^2}\Big). \tag{8}$$

In the limit as $s\to 0$, the first part of this is obviously just the second-order perturbation $-\Delta E^{(2)}$ of the energy. The second part may be written

$$\lim_{s\to 0}\mathrm{Im}(V_c) = -i\pi\sum_k (V_{0k})^2\delta(E_k) - is\sum_{k,\,E_k\neq 0}\frac{(V_{0k})^2}{E_k^2} = -\frac{i}{\tau} - isK. \tag{9}$$

Of course, if the first of these terms, the usual transition probability, is finite the second is indeterminate. We include both in order to see what happens if the first term does vanish.

⁵ See, for example, W. Kohn and J. M. Luttinger, Phys. Rev. 108, 590 (1957).

Now the equation for $f_0(s)$ is

$$f_0(s) = \frac{i}{is-E_0} + \frac{1}{is-E_0}\Big(-\Delta E^{(2)} - \frac{i}{\tau} - isK\Big)f_0(s). \tag{10}$$

The solution for $f_0$ is as follows. If $\tau$ is finite, one gets the usual result of perturbation theory:

$$f_0(s) = \frac{1}{s + (1/\tau) + i(E_0 - \Delta E^{(2)})}, \tag{10A}$$

which represents a state of perturbed energy $E_0 - \Delta E^{(2)}$ decaying at a rate $e^{-t/\tau}$. If, on the other hand, $\tau$ is infinite [$\mathrm{Im}(V_c)\to 0$ as $s\to 0$], then the constant $K$ enters and the amplitude is

$$f_0(s) = \frac{1}{s(1+K) + i(E_0 - \Delta E^{(2)})}. \tag{10B}$$

This is a state of the same perturbed energy which does not decay, but has a finite amplitude ($t\to\infty$) reduced from unity by the ratio $1/(1+K)$. $K$ is then simply a measure of how much this state has spread to the neighbors by virtual transitions, and has nothing to do with real transport.

Our technique, then, will be to study the behavior of the quantity $V_c(s)$ defined in (7) by an infinite series, just as in the usual transport theory. However, the quantities entering into (7) are completely different in character because of the fact that we have chosen to start with localized states of random energies $E_j$. In the usual case, there are an infinite number of states $j$ connected to any state 0 by infinitesimal matrix elements $V_{0j}$, so that within any small range of energies there will be many possible energy-conserving transitions, no one of which takes place with particularly large probability. In such a case the first term of (9) is a meaningful limit of a certain integral. Here, only a few $V_{0j}$'s are large, and the energies they lead to are stochastically distributed, so that whether or not energy can be conserved is a probability question.

We find that the quantity $V_c(s)$ must be studied as a probability variable: that is, we pick a starting atom 0 and an arbitrary energy $E$ [imaginary part of $s$; $\mathrm{Re}(s)\to 0$] and study the probability distribution of $V_c$. Our study then resolves itself into three parts: First, we study the first term (8); second, we discuss the convergence of the series of higher order perturbations. Both of these questions we can resolve in the sense that there is a region in which, with probability unity, $\mathrm{Im}(V_c)\to 0$ as $\mathrm{Re}(s)\to 0$ and the series is convergent. These two parts we shall discuss here briefly, and expand upon in Secs. III and IV. Finally, we must decide whether this kind of convergence in a probability

sense is meaningful, and in particular whether the choice of an arbitrary energy is correct. Since this seems reasonable from the start, we shall not go into it further here, but reserve the discussion for Sec. V. We find there that this convergence means that the states are localized, but that it is not easy to assign a correspondence between the perturbed and unperturbed states.

Let us, then, go ahead with the first two questions. The important quantity for the first is $\mathrm{Im}(V_c)$. Let

$$\mathrm{Im}(V_c) = -is\sum_k\frac{(V_{0k})^2}{s^2+E_k^2} \equiv -isX(s). \tag{11}$$

We note that $X$ also represents the quantity

$$X(s) = \sum_j |f_j/f_0|^2 \tag{11'}$$

in first order, as is clear from (5). This points up the interpretation that as $s\to 0$ a finite $X$ means no real transport.

Using the Holtsmark-Markoff⁶ method, in the next section we calculate the probability distribution of this sum $X$, and find that if $V(r)\sim 1/r^{3+\epsilon}$, where $\epsilon>0$, then $X(0)$ has a distribution law with a perfectly finite most probable value, while the distribution falls off as $1/X^{3/2}$ for large values of $X$. This latter fact shows, as can also be verified directly, that the mean value of $X$ is infinite. Clearly this is merely the result of an infinitesimal fraction of atoms having a very large value. It can be shown that even these few large values are illusory, and that by suitably redefining the localized states in accordance with multiple-scattering theory⁷ the probability of such divergences is greatly reduced.⁸ In reference 2 the transition probability is calculated by taking the mean of (11) over all possible starting atoms. The resulting finite transition probability is therefore meaningless, as discussed above.

⁶ J. Holtsmark, Ann. Physik 58, 577 (1919).
⁷ See, for instance, K. M. Watson, Phys. Rev. 105, 1388 (1957), also the references therein.
⁸ In this case this is particularly simple: near such an infinity $\mathrm{Re}(V_c)$ will also be large so that the energy will simply be changed to a value at which $\mathrm{Im}(V_c)$ is finite.

The case of $V(r)=A/r^3$, or normal dipolar spin diffusion, is a special one. In this case $X(0)\to\infty$ for all atoms, but the divergence is so extremely weak that $V_c(0)$ is finite. In fact, the distribution of $X(s)$ has the same form, but the most probable value is roughly

$$[X(s)]_{\mathrm{M.P.}} \sim \frac{n^2A^2}{W^2}\Big[\sinh^{-1}\Big(\frac{W}{2s}\Big)\Big]^2, \tag{12}$$

where $n$ is the density of sites and $W$ is the width of the distribution of the $E_j$. This most probable value diverges as the square of $\ln s$ when $s\to 0$. This case leads to a transport theory with decays slower than exponential. Since the divergence of (12) is caused by large values of $r$, single, rather long, jumps have an important effect on the transport process. The existence of this transport process for $V\sim 1/r^3$ provides a counter-example for those who may think the present theorem is self-evident for small enough $V$.

One may roughly estimate transport times by noting that, by the defining relationship (2), that value of $s$ for which $sf(s)$ is of order unity is something like the inverse of a time of decay due to spin diffusion. Equation (12) shows that this time increases exponentially with $W/nA$ for large values; for example, for Si donors at $10^{16}$ concentration we compute a rate of $\exp(-10^4)\ \mathrm{sec}^{-1}$.

On the other hand, for $V\sim 1/r^{3+\epsilon}$ (or exponential, as it often is), the first term of perturbation theory leads to a vanishing rate of transport independent of $V$ or $W$. Thus, if transport is to appear at all it must come in higher terms, and in fact it is easy to convince oneself that it can come only by a divergence of the whole series for $V_c$. Therefore it is of great importance to our theory to learn how to handle the sums of products

$$V_c(0) = \sum_{j,k,l,\ldots,m\neq 0} V_{0j}\frac{1}{is-E_j}V_{jk}\frac{1}{is-E_k}V_{kl}\cdots\frac{1}{is-E_m}V_{m0}, \tag{13}$$

which represent the possibility of successive virtual transitions until, at possibly some very great distance from site 0, a real, phase-destroying process can occur.

Our method for this problem, set forth in Sec. IV, involves both the idea of calculating a probability distribution rather than a mean for these terms, and also a modification of the multiple-scattering methods of Watson⁷ in order to eliminate certain troublesome repeated terms. We must do this elimination first. Certain terms in (13) are apparently very large because of the fact that there is no prohibition on repeated indices. Suppose, for instance, that $V_{jk}$ is particularly large and that both $is-E_j$ and $is-E_k$ are particularly small so that

$$\Big|\frac{1}{is-E_j}V_{jk}\frac{1}{is-E_k}V_{kj}\Big| > 1.$$

Then there is the possibility of a term like

$$V_{0l}\frac{1}{is-E_l}\cdots V_{mj}\frac{1}{is-E_j}V_{jk}\frac{1}{is-E_k}V_{kj}\frac{1}{is-E_j}\times V_{jk}\frac{1}{is-E_k}V_{kj}\frac{1}{is-E_j}V_{jk}\frac{1}{is-E_k}V_{kn}\cdots\frac{1}{is-E_p}V_{p0}.$$

Such a term will get larger the more times we repeat $V_{jk}$. We can represent the terms of our series by closed diagrams through the various sites of the lattice, starting and ending at 0 (see Fig. 1), and this type of diagram involves a "ladder" which repeatedly runs back and forth from $j$ to $k$. Physically, we can think of it as resulting from a pair of closely coupled atoms.

The technique of Watson⁷,⁹ shows us that we may eliminate all repeated indices in a self-consistent way by including in the energy denominator for atom $k$ the perturbed energy $V_c(k)$ calculated from just such a series of terms as (13). A complicating factor is that $V_c(k)$ must be calculated from a series of diagrams which do not include any indices which have previously appeared before in the particular term of $V_c(0)$ we are calculating. That is, if we want the term

$$V_{03}\frac{1}{e_3}V_{32}\frac{1}{e_2}V_{21}\frac{1}{e_1}V_{10},$$

where for brevity we introduce the usual "propagator" notation

$$is - E_j - V_c(j) = e_j, \tag{14}$$

then the propagator $e_2$, for instance, is given by

$$e_2 = is - E_2 - V_c^{0,1}(2) = is - E_2 - \sum_{j,l,\ldots,k\neq 0,1} V_{2j}\frac{1}{e_j}V_{jl}\frac{1}{e_l}\cdots\frac{1}{e_k}V_{k2}, \tag{15}$$

and again, each of the propagators in this series must be appropriately modified not to include either 0, 1, or any of the previous indices in the $V_c^{0,1}(2)$ series. Thus we may now write

$$V_c(0) = \sum_{k,j,\ldots,i\neq 0} V_{0k}\frac{1}{is-E_k-V_c^{0}(k)}V_{kj}\frac{1}{is-E_j-V_c^{0,k}(j)}V_{j\cdots}\cdots\frac{1}{is-E_i-V_c^{0,k,j,\ldots}(i)}V_{i0}. \tag{16}$$

All of this, of course, involves a self-consistent type of reasoning, since it is only if these series converge that we can find $V_c(j)$ in this way, and therefore that we can define the modified series. We say in defense that clearly we can always make the sum converge for large enough $s$, and also that the $V_c$'s in the higher terms, since they may have many forbidden indices, are more convergent than those we derive from them.

The prohibition of repeated indices has two useful consequences. The most obvious is to prevent extensive correlations between successive factors $V/e$ of a given product. However, it also introduces a useful and important correlation. Namely, suppose that one factor of our term, $V_{kl}/e_k^{0\cdots}$, is particularly large. The $V_c$ of the previous factor, say $V_{lm}/e_l^{0\cdots}$, will contain the term

$$(V_{kl})^2/e_k^{0\cdots}.$$

Therefore this previous factor would contain the large factor in its denominator, leading to a tendency to cancel. On a quantitative basis, first think of all $V$'s as having the same order of magnitude. Then, since the other terms of the denominator $e_l$ will all be of order $W$ or less, it is easy to see that we simply decrease the total unless

$$|V_{kl}|^2/e_k < W, \qquad |V_{kl}/e_k| < W/V. \tag{17}$$

$W$ is the breadth of the distribution $P(E)$. We shall use the limitation (17) in our later computations. We note that it is meaningless if $V$ is small; but the work of Sec. IV will show that small values of $V$ are never important. In any case the results do not depend sensitively on the existence of this limitation.

Actually, (17) is only the most important of an extensive system of correlations, since similar considerations hold for any group of factors starting from atom $k$ and ending at atom $l$, if there is a distinct return path to atom $k$ from $l$ which has a finite factor. The fact that isolated large factors are not important, however, should apply even more strongly to groups of factors. We merely note that we will probably underestimate the limiting density, even using (17).

⁹ For this purpose, one could equally well use the method of E. Feenberg, Phys. Rev. 74, 206 (1948).

FIG. 1. Diagrams corresponding to terms in the perturbation expansion of $V_c$. (A) may be large and must be summed over; (B) is a legitimate term.

Our problem now is to study series of the form (16) in which the $E$'s, to a lesser extent the $V_c$'s, and possibly the $V_{jk}$'s are stochastic variables. We want primarily to find whether such series converge in some sense as $s\to 0$; if so, we have found a self-consistent set of localized wave functions. If the convergence limit on $s$ is finite it may still be possible to calculate some transport properties, but that can be reserved for later work.

In the fourth section we discuss such series. The principle of this discussion is the following: since the terms $T_L$ of the series having a given length (number of denominators, say) $L$ are themselves random variables, we try to find a distribution function for their values. This distribution can be expressed as a number distribution:

$$(\text{Average number of terms of length } L \text{ between } T_L \text{ and } T_L + dT_L) = n(T_L)\,dT_L. \tag{18}$$

(It is necessary to use a number rather than a probability distribution because, when $V$ extends to large $r$, the total number of terms of any length is infinite.) The techniques involved in getting such a distribution are closely related to the Markoff method of random walk theory, although we use, for convenience, the Laplace transform and the convolution theory for it.

We are able to get $n(T_L)$ explicitly in two cases. In both cases we make the unimportant simplification that the distribution function $P(E)$ of the energies $E_j$ is flat:

$$P(E) = 1/W \ \text{ for } -\tfrac{1}{2}W < E < \tfrac{1}{2}W, \qquad P(E) = 0 \ \text{ for } |E| > \tfrac{1}{2}W, \tag{19}$$

and we neglect the influence of $V_c$ on the frequency denominators except for the limitation (17).

In the first case we assume that $V_{jk}$ is finite only between "nearest neighbors," of which there are some finite number $Z$; between these neighbors, it has a constant value $V$. Then the problem simplifies to finding the probability distribution $P(\Pi_D)\,d\Pi_D$ of the product of denominators

$$\Pi_D = \frac{1}{e_1}\frac{1}{e_2}\frac{1}{e_3}\cdots\frac{1}{e_L}. \tag{20}$$

Given $P(\Pi_D)$, we can use the idea of the "connectivity" from the percolation theory of Broadbent and Hammersley.¹⁰ The connectivity $K$ for any given lattice with near-neighbor connections only is defined by exactly the relation we want:

$$(\text{Number of nonrepeating paths of length } L \text{ leading from any given atom}) \sim K^L. \tag{21}$$

$K$ is generally of order $Z-1$ to $Z-2$. Then obviously

$$n(T)\,dT = (K^L/V^L)\,P(T/V^L)\,dT. \tag{22}$$

¹⁰ S. R. Broadbent and J. M. Hammersley, Proc. Cambridge Phil. Soc. 53, 629 (1957). This work suggested some features of our approach to the present problem.

A second case we are able to solve is that of the purely random lattice in which $V$ falls off as some power of $r$. Unfortunately, we have to ignore the restriction of nonrepeating paths in this case, so that we rather badly overestimate the sum. On the other hand, this at least tells us whether or not large $r$'s are important since this restriction is not important for large jumps. Thus by this case we can show rigorously that $V\sim 1/r^{3+\epsilon}$ is the correct restriction on the range of $V$.

In each case we come out with an $n(T)$ of the following form:

$$n(T)\,dT = [F(K,W/V)]^L\,\frac{dT}{T^2}\,L(T), \tag{23}$$

where $L(T)$ is a slowly varying function relative to $T$. This form allows us to make use of the following result, which is implicit in (for instance) the theory of the Holtsmark distribution. The probability distribution of the sum of a collection of random terms of random sign such as (23) is the same as the distribution of the single largest term (a) for values greater than or of the order of the most probable or median value of the sum; and (b) if $L(T)$ is increasing, or decreasing no more rapidly than $T^{-1/2}$. This is essentially the same as the theorem that the force on a dipole in an unpolarized gas, or on an electron in a discharge, comes primarily from the nearest neighbor.

Since $L(T)$ obeys this condition very well in both cases, at least for reasonable values of $V/W$, etc., we may immediately get the critical values of the parameters from (23). We know now that, if

$$\Sigma = \sum_{n=1}^{K^L}(\pm T_n) \tag{24}$$

(where we use for clarity the case of a finite number of neighbors), then

$$P(\Sigma)\,d\Sigma \sim F^L(K,W/V)\,\frac{d\Sigma}{\Sigma^2}\,L(\Sigma). \tag{25}$$

First we find $(W/V)_0$ to satisfy

$$F^L(K,(W/V)_0)\,L(1) = 1. \tag{26}$$

If $(W/V)$ has a value even very slightly greater than this, the most probable value of $\Sigma$ will be small of order $e^{-L}$ and the probability of a value $=1$ will also be of order $e^{-L}$. Now we consider $L\to\infty$: The number of $\Sigma$'s only increases as $L$, while with probability $\sim 1-e^{-L}$ their value is less than $e^{-L}$. Therefore the series converges almost always if

$$W/V > (W/V)_0, \tag{27}$$

which is the desired criterion. In Sec. IV the critical values will be discussed numerically. A typical estimate would be $(W/V)_0 = 26$ for $K = 4.5$ (about correct for the simple cubic lattice). With this result the theorem is established.
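The statement that a sum of random-sign terms with a $dT/T^2$ number distribution behaves like its single largest term can be checked numerically. The following is only a sketch under simplifying assumptions that are not in the paper: the slowly varying factor $L(T)$ is set to 1, the terms are independent, and they are cut off below at $T=1$. The function name and sample sizes are illustrative.

```python
import numpy as np

def sum_vs_largest_term(n_terms=1000, n_trials=1000, seed=0):
    """Compare |sum of random-sign terms| with the largest term.

    Assumption (illustrative): terms have density 1/T^2 for T >= 1,
    generated as T = 1/u with u uniform on (0, 1].
    """
    rng = np.random.default_rng(seed)
    u = 1.0 - rng.random((n_trials, n_terms))     # uniform on (0, 1], never zero
    T = 1.0 / u                                   # density 1/T^2 on [1, infinity)
    signs = rng.choice([-1.0, 1.0], size=T.shape) # random sign for every term
    abs_sums = np.abs((signs * T).sum(axis=1))
    largest = T.max(axis=1)
    # Medians are used because the means of both quantities diverge.
    return float(np.median(abs_sums)), float(np.median(largest))

if __name__ == "__main__":
    med_sum, med_max = sum_vs_largest_term()
    print("median |sum| =", med_sum, " median largest term =", med_max)
```

In this toy setting the two medians come out comparable and both grow in proportion to the number of terms, which is the sense in which one large term controls the sum.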

1498

The integration here depends on the more stringent condition F ~ l / r 3 + < for large r. The probability distribution, which will be valid for large X at least, is familiar from line-broadening theory 12 :

IU. PROBABILITY DISTRIBUTION OF THE FIRST TERM OF Vc Before Eq. (11) we related the transition rate to a certain quantity X(s), and showed that if X(s) remains finite as s —» 0 no real transport takes place, in the first order of perturbation theory. Now X(s) is a sum over all possible single jumps:

(28)

X(s)=Z-

k ^+£*2

The probability distribution of such a sum is best calculated by the Markoff method, 11 as modified by Holtsmark. 6 In this method we find the Fourier transform of the probability distribution P(X):

C eixX
(29)

{ - „ ( / [ l - e x p ( ^ ) ] ^ ) j

(30)

P(X)=

P{X) =

W

e x

P

The average is to be taken over the probability distribution of E, P(E), and n is the density of sites. Let us write out the important integral in the exponent of (30):

r"

r*

r

/'^W\i

'-L^H^vM-^n

4rr r<°

r

WJ0

J-

4TT

XixW(r) dE\ 1 1 —exp

])

r"

/•"

=—(*)*

L E?

r*drV(r) I

(32)

/•-

2

r(

= G) »—•

" S. Chandrasekhar, Revs. Modern Phys. 15, 1 (1943). I am indebted to L. R. Walker for suggesting the use of the Markoff method here.

(34)

We see that / indeed has a logarithmic singularity as s —> 0. A simple case for which we can evaluate the integral (34) is the flat distribution (19) of width W for which W

•^MDIQ'

y/=—Isinh-'l-JK-).

(35)

This leads to the probability distribution of the sum X: 4\*2nA \ 1 ) — sinh-

( 3r*Wi/Xi

-Q

Xexp -

sinh-'f-)

I L 3IF

(-)

, (36)

\2^/J VA'/I

which was discussed in Sec. I I . IV. DISTRIBUTIONS OF HIGH-ORDER TERMS IN THE PERTURBATION THEORY In order to simplify the later manipulations, and to make closer contact with the Watson theory, it will be useful to expand some of the formalism of the problem. Equation (4) presents our basic equations of motion:

(V(r))

(33)

(E?+s>)i

iio

fM/x\i

.

X\

P(E)dE

3

(3i)

The behavior of P(X) for large X depends on the behavior of / for sufficiently small *. Let us first consider the case s = 0. Now for small enough x, and a finite E (say of order W), the exponential expixV2/E? can be expanded in a power series in x, and the integration over r done (so long as V is finite and falls off faster than r~ ! ) to obtain terms which go as x* or higher powers for small x. Thus only the behavior for small E is important, and we can neglect the variation of P(E), replacing it by a constant, \/W. Then

W/

m 4 T C" r ixA2 -\ r /=— P(E)dE\ d(r*) 1 - e x p 2 3 J^, J0 \.r°(E?+s )}

4rA,\/x\l =

) —

1 I

For large X, this falls off as X~*, as stated in Sec. I I : while the mean of X is divergent, the probability that X is larger than some value X0 decreases as X 0 - i . Thus, for any given starting atom n, the renormalization constant K may be large, but the probability that T is finite is exactly zero. We see that none of these conclusions are valid for the case V = r~3 for large r. In this case a finite s must be retained. Again looking at the integral (31), let us substitute V(r) = A/r* for all r.13 Then we do the integration over r first:

where ,

exp \-[2nT(i)—

WX>

a H. 13

is—Ej

-+£•

1

-v„Ms).

* is—Ej

Margenau, Phys. Rev. 48, 755 (1935). We make this substitution since there exists some r0 beyond which V— 04/^)2^0. We point out that Jlr" leads to terms in / of identical form with those we have already found, and thus independent of J.


Let us consider simultaneously all possible initial starting atoms 0. The $f(s)$'s which result will form a matrix $f_{jn}$, of which our $f$'s are the particular row $f_{j0}$. It is simple to introduce as the "Green's function" the matrix

$$(j|W|k) = -i f_{jk}(s), \tag{37}$$

which satisfies the equation

$$(j|W|k) = \frac{\delta_{jk}}{is-E_j} + \sum_l \frac{1}{is-E_j}\,V_{jl}\,(l|W|k); \tag{38}$$

or, if one introduces the matrix

$$(j|a|k) = \delta_{jk}(is-E_j),$$

then $W$ satisfies

$$W = \frac{1}{a} + \frac{1}{a}VW. \tag{39}$$

The Møller wave matrix $\Omega$ is defined also, as

$$\Omega = Wa, \tag{40}$$

and satisfies the equation

$$\Omega = 1 + \frac{1}{a}V\Omega. \tag{41}$$

The general philosophy of the multiple-scattering theory is to try to separate and identify, in these matrices $\Omega$ or $W$, effects which are "coherent" in the sense that they involve perturbations in which, finally, the system returns coherently to the initial state, from effects which involve real (as opposed to virtual) transitions to other states. Another way to put it is that this is an attempt to define a new set of states, once the perturbation $V$ is applied, which correspond in some sense to the unperturbed states, and thus to have left over only the effects of whatever real transitions may occur. In principle this is exactly what we are attempting in this paper. The coherent effects are mainly included in a diagonal coherent wave matrix $\Omega_c$, and the rest of the problem is contained in the "model operator" $F$:

$$\Omega = F\Omega_c, \tag{42}$$

where we want $F$ to satisfy an equation like

$$F = 1 + \frac{1}{a-V_c}\,PVF, \tag{43}$$

where $P$ is an operator which prevents in some way the repetition of indices in the perturbation series for $F$, while $V_c$ is a correction which must therefore be made to the energy $a$. It is perhaps simplest just to go ahead and show what must be done. The solution of Eq. (41) may be written out as a perturbation series by direct iteration:

$$(j|\Omega|k) = \delta_{jk} + \frac{1}{a_j}V_{jk} + \sum_l \frac{1}{a_j}V_{jl}\frac{1}{a_l}V_{lk} + \sum_{l,m}\frac{1}{a_j}V_{jl}\frac{1}{a_l}V_{lm}\frac{1}{a_m}V_{mk} + \cdots. \tag{44}$$

A very direct way to eliminate repeated indices is as follows: take any given term of (44) and, starting from the right, look through until we find the first repetition of the index $k$. Between these two repetitions comes a factor:

$$\left(\frac{1}{a_k}V_{kl}\frac{1}{a_l}V_{lm}\cdots\frac{1}{a_n}V_{nk}\right).$$

Next we continue to look from right to left and find another similar factor, etc., until we find all such factors and no more $k$ indices occur. Now everything which remains to the left will also appear multiplied by the factors we have found in all other possible combinations and repetitions, and in fact by all possible such factors in all such combinations. Summing all such factors, we get for that series which comes to the right of the last repetition of index $k$:

$$1 + \frac{1}{a_k}V_{kk} + \left(\frac{1}{a_k}V_{kk}\right)^2 + \cdots + \sum_{l\neq k}\frac{1}{a_k}V_{kl}\frac{1}{a_l}V_{lk} + \cdots = \left[1 - \left(\frac{1}{a_k}V_{kk} + \sum_{l\neq k}\frac{1}{a_k}V_{kl}\frac{1}{a_l}V_{lk} + \cdots\right)\right]^{-1} = \frac{a_k}{a_k - V_c(k)} = \frac{a_k}{e_k} = (\Omega_c)_k, \tag{45}$$

where $V_c(k)$ is defined as in (13). The corresponding term of $\Omega$ is then in the form

$$\frac{1}{a_j}V_{jl}\frac{1}{a_l}V_{lm}\cdots\frac{1}{a_n}V_{nk}\,(\Omega_c)_k.$$
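The resummation in (45) can be checked numerically on a very small system. The sketch below is an illustration rather than part of the original argument: it takes three sites, treats the complex number `Z` as the quantity $is$ in $a_j = is - E_j$, and computes $V_c(k)$ in closed form by inverting on the two sites other than $k$ (a standard partitioning identity, assumed here to sum every excursion that leaves $k$ and returns without revisiting it). All names and numerical values are hypothetical.

```python
# Check, for three sites, that the diagonal element of W = (a - V)^(-1)
# equals 1 / (a_k - V_c(k)), as the resummation in Eq. (45) asserts.

def inverse_3x3(m):
    """Inverse of a 3x3 complex matrix by the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[x / det for x in row] for row in adj]


def self_energy(z, energies, v, k):
    """V_c(k): V_kk plus all returns to k that pass only through other sites."""
    p, q = [j for j in range(3) if j != k]
    # 2x2 block of (z - E - V) restricted to the sites other than k
    m_pp = z - energies[p] - v[p][p]
    m_qq = z - energies[q] - v[q][q]
    m_pq = -v[p][q]
    m_qp = -v[q][p]
    det = m_pp * m_qq - m_pq * m_qp
    inv = {
        (p, p): m_qq / det, (q, q): m_pp / det,
        (p, q): -m_pq / det, (q, p): -m_qp / det,
    }
    total = v[k][k]
    for l in (p, q):
        for m in (p, q):
            total += v[k][l] * inv[(l, m)] * v[m][k]
    return total


def compare(z, energies, v):
    """Return pairs (exact W_kk, resummed 1/(a_k - V_c(k))) for each site."""
    full = [[(z - energies[i] if i == j else 0.0) - v[i][j]
             for j in range(3)] for i in range(3)]
    w = inverse_3x3(full)
    return [(w[k][k], 1.0 / (z - energies[k] - self_energy(z, energies, v, k)))
            for k in range(3)]


ENERGIES = [0.3, -0.7, 1.1]          # illustrative site energies E_j
V = [[0.0, 0.4, 0.1],                # illustrative symmetric couplings V_jk
     [0.4, 0.0, 0.25],
     [0.1, 0.25, 0.0]]
Z = complex(0.2, 0.05)               # stands for "is"
PAIRS = compare(Z, ENERGIES, V)
```

The two members of each pair in `PAIRS` should agree to rounding error, which is the content of $(\Omega_c)_k = a_k/(a_k - V_c(k))$ for this toy case.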


We now begin the same process as in (45), looking for repetitions of the index $n$ which appear next. We collect together all such terms, and finally find that we can replace $a_n$ by $a_n - V_c^{k}(n)$, where

$$V_c^{k}(n) = V_{nn} + \sum_{l\neq k,n} V_{nl}\frac{1}{a_l}V_{ln} + \sum_{l,m\neq k,n} V_{nl}\frac{1}{a_l}V_{lm}\frac{1}{a_m}V_{mn} + \cdots. \tag{46}$$

This process may be repeated until we come to the end of the given term. Thus we have successfully expanded $\Omega$ as

$$\Omega = F\Omega_c, \qquad \Omega_c = a/e, \qquad F_{j\neq k} = \frac{1}{e_j}V_{jk} + \sum_{l\neq k,j}\frac{1}{e_j}V_{jl}\frac{1}{e_l}V_{lk} + \sum_{l\neq m,k,j}\ \sum_{m\neq j,k}\frac{1}{e_j}V_{jl}\frac{1}{e_l}V_{lm}\frac{1}{e_m}V_{mk} + \cdots. \tag{47}$$

The final step is obvious: in each of the $V_c$'s in (45) itself we eliminate repeated indices in an exactly similar manner, obtaining expressions like (16) for $V_c$.14 Now the usefulness of such expressions in the usual multiple-scattering theory comes only from the fact that the limitations on the sums are unimportant, since there are an infinite number of $V_{jk}$'s starting from any $j$, each being small. In our case we have a different kind of fortunate circumstance which allows us to get around this problem; namely, all of the quantities of the theory are stochastic variables, so that all we really wish to know is the distribution function of the $e_j$'s, which except for the restriction (17) is practically that of the $E_j$'s unless $V_c$ is quite large. Even if $V_c$ were large, one might study it as a stochastic variable.

We now have the problem of calculating probability distributions for products such as the terms of (47). Because the only question is that of convergence we are interested only in the terms of very high order, that is, those of order $L$ with $L \gg 1$. We call such a term $T$:

$$T_{jk}^{L} = \frac{1}{e_j}V_{jl}\frac{1}{e_l}V_{lm}\cdots\frac{1}{e_n}V_{nk} = \exp\{[\ln V_{jl} + \ln V_{lm} + \cdots\ (L\ \text{terms})] - [\ln e_j + \ln e_l + \cdots\ (L\ \text{terms})]\}. \tag{48}$$

In Sec. II we presented the two cases in which it has been possible to calculate explicitly the number distribution of $T$. We take up the simplest first: the case in which $V$ is a constant, so that the only stochastic element in (48) is the denominators. In this case

$$T = V^{L}\prod_{j=1}^{L}\frac{1}{e_j}. \tag{49}$$

To find the convergence limits, we go to $s = 0$ immediately. Then clearly $T$ has random sign, which must be taken into account later in summing the $T$'s. As we discussed before, $V_c$ is important only in that it causes a certain restriction (17) on the magnitude of the separate factors of the product; otherwise we shall make the approximation of neglecting it altogether as an unimportant correction to the stochastic variable $E_j$. Since we are for simplicity confining ourselves to the flat distribution $P(E) = 1/W$ of Eq. (19), we can express (17) semiquantitatively. The smallest denominator $a$ which will not seriously affect the probability distribution of subsequent factors we define to be $\Delta/2$. The quantity $\Delta/2$ satisfies the criterion that the contribution to $V_c$ from it must be less than the maximum possible $E_j$:

$$\frac{V^2}{\tfrac12\Delta} < \tfrac12 W, \qquad \Delta > \frac{4V^2}{W}. \tag{50}$$

Our approximation is simply to take this as a lower limit on $e$ and modify (19) to

$$P(e_j) = \frac{1}{W-\Delta}, \quad \tfrac12\Delta < |e_j| < \tfrac12 W; \qquad P(e_j) = 0, \quad |e_j| < \tfrac12\Delta \ \text{or}\ |e_j| > \tfrac12 W. \tag{51}$$

We shall find that the use of (51) changes our convergence limits by less than a factor of 2, in spite of its apparent importance in eliminating singular factors; this is our justification for the crude approximation.

We now take up the question of the probability distribution of $T$. Let us define the variable $S$ as

$$\prod_{j=1}^{L}\frac{1}{|e_j|} = \frac{e^{S}}{(\tfrac12 W)^{L}}. \tag{52}$$

The range of $S$ starts from zero, so that we can apply a Laplace transformation to its probability distribution:

$$F_L(p) = \int_0^\infty e^{-pS}P(S)\,dS = \left\langle \prod_{j}\exp\{p[\ln|e_j| - \ln(\tfrac12 W)]\}\right\rangle_{\rm av}. \tag{53}$$

Since all the $e_j$'s are independent variables, the average in (53) is just the product of the separate averages. Thus

$$F_L(p) = \left[\frac{W}{W-\Delta}\,\frac{1-(\Delta/W)^{p+1}}{p+1}\right]^{L} \qquad (p > -1). \tag{54}$$

For regions of $S$ in which we may neglect $\Delta$, this is very easily inverted:

$$\Delta \to 0: \qquad F_L(p) \simeq \frac{1}{(p+1)^{L}}, \qquad P(S) = \frac{S^{L-1}e^{-S}}{(L-1)!}. \tag{55}$$

Since $T = [V^{L}/(\tfrac12 W)^{L}]\,e^{S}$, to our order of approximation we have

$$P(T)\,dT = \left(\frac{2eV}{W}\right)^{L}\left[\frac{\ln T}{L} - \ln\frac{2V}{W}\right]^{L}\frac{dT}{T^{2}}, \tag{56}$$

which shows that it is of the form (23).

14 I am indebted to P. A. Wolff for many helpful discussions on the above, and in particular for pointing out that (43) is true only in the sense of the perturbation series.
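The single-factor average behind (54) and the $\Delta \to 0$ form (55) lend themselves to a quick numerical check. The sketch below is only an illustration under the flat distribution (51): it compares the closed form of one factor of (54) with a direct midpoint integration, and it samples $S$ for $\Delta = 0$, where (55) is a gamma distribution whose mean and variance both equal $L$. The function names, the seed, and the parameter values are assumptions made for the example.

```python
import math
import random


def factor_average_closed(p, w, delta):
    """One factor of Eq. (54): average of (2|e|/W)^p over Delta/2 < |e| < W/2."""
    ratio = delta / w
    return (w / (w - delta)) * (1.0 - ratio ** (p + 1.0)) / (p + 1.0)


def factor_average_numeric(p, w, delta, steps=200000):
    """The same average by midpoint integration over |e|."""
    lo, hi = delta / 2.0, w / 2.0
    h = (hi - lo) / steps
    total = 0.0
    for n in range(steps):
        e = lo + (n + 0.5) * h
        total += (2.0 * e / w) ** p
    # |e| is uniformly distributed on [lo, hi]
    return total * h / (hi - lo)


def sample_s(length, w, rng):
    """Draw S = sum_j ln(W / 2|e_j|) for Delta = 0, as defined by Eq. (52)."""
    s = 0.0
    for _ in range(length):
        e = 0.5 * w * (1.0 - rng.random())   # |e| in (0, W/2], never zero
        s += math.log(w / (2.0 * e))
    return s


CLOSED = factor_average_closed(0.5, 2.0, 0.2)
NUMERIC = factor_average_numeric(0.5, 2.0, 0.2)

L_ORDER = 5
RNG = random.Random(12345)
SAMPLES = [sample_s(L_ORDER, 2.0, RNG) for _ in range(20000)]
MEAN_S = sum(SAMPLES) / len(SAMPLES)
VAR_S = sum((x - MEAN_S) ** 2 for x in SAMPLES) / (len(SAMPLES) - 1)
```

`CLOSED` and `NUMERIC` should coincide to the accuracy of the quadrature, and `MEAN_S` and `VAR_S` should both lie close to `L_ORDER`, as expected of the distribution $S^{L-1}e^{-S}/(L-1)!$.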

It is interesting to note that while (55) is a reasonably narrow distribution, satisfactorily obeying the central limit theorem, the fact that the quantity of interest is exponential in $S$ transfers our attention to what, in (55), appears to be the extreme tail of the distribution. This is a characteristic of this problem. On the other hand, our task is simplified, in that nothing smaller than factors exponential in $L$ affects our results.

Without neglecting $\Delta$, we can get the full distribution with sufficient accuracy from (54) by applying the inversion formula for the Laplace transform,

$$P(S) = \frac{1}{2\pi i}\int_{-i\infty}^{i\infty} F_L(p)\,e^{pS}\,dp, \tag{57}$$

and using the method of steepest descents. Writing out (57) in appropriate form for this method, we obtain

$$P(S) = \frac{1}{2\pi i}\int \exp\left\{pS - L\left[\ln(p+1) + \ln\left(1-\frac{\Delta}{W}\right) - \ln\left(1-\left(\frac{\Delta}{W}\right)^{p+1}\right)\right]\right\}dp. \tag{58}$$

First notice that as $p$ becomes large and positive, the exponent will approach $+\infty$ unless $S < 0$. If $S < 0$, we can find a path via $p \to +\infty$ and $P(S) = 0$, as it should be. Similarly, as $p \to -\infty$, the $S$ term dominates if $S > L\ln(W/\Delta)$, so that, correctly, $P(S > L\ln(W/\Delta)) = 0$. Within these limits, there is a saddle point for finite $p$ which may be found by differentiating the exponent. At the point $1+p = 0$, the exponent changes character, from not depending on $\Delta/W$ above this point to depending primarily on it below. It is instructive to expand the exponent about this point:

$$-L\ln(1+p) + L\ln\left[1-\left(\frac{\Delta}{W}\right)^{1+p}\right] \simeq L\ln\ln\frac{W}{\Delta} - \tfrac12 L(1+p)\ln\frac{W}{\Delta} + \cdots. \tag{59}$$

As we see, this point is actually not a singularity. Taking derivatives, we find that the condition that the saddle be here is $S = \tfrac12 L\ln(W/\Delta)$, and that at this point

$$P\left(\tfrac12 L\ln\frac{W}{\Delta}\right) \simeq e^{-S}\left[\frac{\ln(W/\Delta)}{1-\Delta/W}\right]^{L}. \tag{60}$$

As $S$ gets appreciably larger than this value, $F_L(p)$ and the exponent again become simple when we can neglect 1 relative to $(\Delta/W)^{p+1}$. Here it becomes necessary to continue the logarithms in (58) to negative values of their arguments, but this can be done by referring to (54) and noticing that we must change the sign of $1+p$ and $1-(\Delta/W)^{p+1}$ simultaneously. Then the saddle-point condition is

$$S - \frac{L}{1+p} + L\ln\frac{\Delta}{W} = 0, \tag{61}$$

and the probability of such values of $S$ is

$$P(S) \simeq e^{-S}\left[\frac{e\,(\ln(W/\Delta) - S/L)}{1-\Delta/W}\right]^{L}. \tag{62}$$

The results (56), (60), and (62) may be summarized in the following way:

$$P(S) \simeq \left(\frac{e}{1-\Delta/W}\right)^{L} e^{-S}\,[\psi(S/L)]^{L}, \tag{63}$$

where $\psi$ is a slowly varying function which may be estimated in each of the three regions:

I: $\tfrac12 L\ln(W/\Delta) \gg S > 0$, $\quad \psi(S/L) \simeq S/L$;
II: $S \simeq \tfrac12 L\ln(W/\Delta)$, $\quad \psi(S/L) \simeq \ln(W/\Delta)/e$;
III: $L\ln(W/\Delta) > S \gg \tfrac12 L\ln(W/\Delta)$, $\quad \psi(S/L) \simeq \ln(W/\Delta) - (S/L)$. (64)

The function $\psi$ is easily plotted approximately from these results and is shown in Fig. 2.

FIG. 2. The function $\psi$ of Eq. (64), giving the slowly varying part of $P(S)$.

For the probability distribution of $T$, we obtain

$$P(T)\,dT = \frac{dT}{T^{2}}\left[\frac{2eV}{W}\,\frac{\psi\left(\dfrac{\ln T}{L} - \ln\dfrac{2V}{W}\right)}{1-\Delta/W}\right]^{L}. \tag{65}$$
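As a worked check of the region III estimate, the saddle-point algebra can be written out explicitly. This is a sketch under the stated approximation $(\Delta/W)^{p+1} \gg 1$ for $1+p < 0$; the symbol $\Phi$ for the exponent is introduced here only for illustration.

$$\begin{aligned}
F_L(p) &\simeq \left[\frac{(\Delta/W)^{p+1}}{-(1+p)\,(1-\Delta/W)}\right]^{L},\\
\Phi(p) &= pS - L\ln[-(1+p)] - L\ln(1-\Delta/W) + L(1+p)\ln(\Delta/W),\\
\Phi'(p) = 0 &\;\Rightarrow\; -(1+p) = \frac{1}{\ln(W/\Delta) - S/L},\\
\Phi_{\rm saddle} &= -S + L + L\ln\left[\ln(W/\Delta) - S/L\right] - L\ln(1-\Delta/W).
\end{aligned}$$

Up to the slowly varying prefactor of the steepest-descent integral, this gives $P(S) \sim e^{-S}\{e[\ln(W/\Delta) - S/L]/(1-\Delta/W)\}^{L}$, which is the region III form with $\psi(S/L) \simeq \ln(W/\Delta) - S/L$. The requirement $-(1+p) > 0$ reproduces the limit $S < L\ln(W/\Delta)$.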

Except for the most remote parts of region III, (65) is of the form (23) and satisfies our condition. We have not explored very carefully the corrections to the saddle-point method, both because we need only terms of the order $e^{L}$, and because the results are clearly in accordance with expectations.

Before studying (65) numerically, let us go on to the second case which can be solved; namely,

$$V = V_0/r^{3N}. \tag{66}$$

We find a solution only in the limited sense that various crude approximations are made for small $r$; what we try to do is to assure ourselves that the region of large $r$ and small $V$ is not important, in spite of the extra mathematical difficulties to which it leads. These "crude approximations" are threefold: (a) We ignore the fact that a path leading to an atom from its nearest neighbor must leave it via a further neighbor, i.e., in each factor we allow $V$ to be randomly distributed, ignoring the favorable correlations caused by the restriction to nonrepeating paths. (b) We ignore (17) and thus can use (55) and (56) as the distribution function of the denominators, again overestimating the effects of large $V$'s without disturbing the small $V$ region. (c) We limit $|V|$ simply by introducing a minimum radius $a$; the maximum $V$ is then

$$V_m = V_0/a^{3N}. \tag{67}$$

The distribution of the $V$'s may be approximated by using a perfectly random distribution of neighbor distances: $n(r)\,dr = 4\pi\rho r^{2}\,dr$, where $\rho$ is the density of sites. From this and (67) the distribution of $V$ is immediately

$$n(V)\,dV = \frac{4\pi\rho}{3N}\,V_0^{1/N}\frac{dV}{V^{1+1/N}} = \frac{4\pi\rho a^{3}}{3N}\left(\frac{V}{V_m}\right)^{-(1+1/N)} d\!\left(\frac{V}{V_m}\right), \quad V < V_m; \qquad n(V) = 0, \quad V > V_m. \tag{69}$$

The numerator in (68) may take on values from $V_m$ to zero; the inverse of the denominator, from $\infty$ to a certain minimum. Again the mathematics is more familiar if we study the logarithm, which is a variable extending from $-\infty$ to $\infty$. Thus it is necessary to use now the bilateral Laplace transform. This causes no difficulty in (53), (55), and (56) since $S > 0$ anyhow; we simply reinterpret these as the corresponding bilateral formulas.

The logarithmic variable $\Sigma$, corresponding to $S$, we define by

$$\prod_{j=1}^{L} V_j = V_m^{L}\,e^{\Sigma}, \tag{70}$$

and the quantities (68) whose distribution we want are

$$T = \frac{V_m^{L}}{(W/2)^{L}}\,e^{S+\Sigma}. \tag{71}$$

Thus, once we have the distribution of $\Sigma$, we can find the distribution of $S+\Sigma$ by convolution of the bilateral Laplace transforms, and thence find that of $T$. The required Laplace transform is

$$V_L(p) = \int P(\Sigma)\,e^{-p\Sigma}\,d\Sigma = \left[\int_0^{V_m}\left(\frac{V}{V_m}\right)^{-p} n(V)\,dV\right]^{L} \tag{72}$$

$$= \left[\frac{-(a^{3}/N r_s^{3})}{p + 1/N}\right]^{L}, \qquad p < -1/N. \tag{73}$$

Here we have defined

$$\tfrac43\pi r_s^{3} = 1/\rho. \tag{74}$$

The transform of the denominators, $F_L(p)$, was convergent for $p > -1$ [Eq. (55)]. Thus for $N > 1$, or forces falling off more rapidly than $1/r^{3}$, the two transforms have a common strip of convergence. This is the criterion on range we already expected from Sec. III. Now let

$$X = S + \Sigma, \qquad \int n(X)\,dX\,e^{-pX} = N_L(p). \tag{75}$$

Also, by the convolution theorem,

$$N_L(p) = \left[\frac{-(a^{3}/N r_s^{3})}{(1+p)(p+1/N)}\right]^{L}. \tag{76}$$

The inversion of this is simple if we apply the shifting operator to $p$, bringing the origin to the center of the convergence strip:

$$p' = p + \tfrac12(1+1/N), \qquad N_L(p') = \int \exp(-p'X)\,n'(X)\,dX, \qquad n'(X) = n(X)\,e^{\frac12(1+1/N)X}. \tag{77}$$

Then

$$N_L(p') = \left[\frac{a^{3}}{N r_s^{3}}\,\frac{1}{\tfrac14(1-1/N)^{2} - p'^{2}}\right]^{L}.$$

The standard inversion formula now gives us

$$n'(X) = \frac{1}{2\pi i}\int_{-i\infty}^{i\infty}\frac{dp'\,\exp(p'X)\,(a^{3}/N r_s^{3})^{L}}{[\tfrac14(1-1/N)^{2} - p'^{2}]^{L}}. \tag{78}$$

This integral can be expressed in terms of the Bessel function $K_{L-\frac12}$ (which is actually a polynomial in elementary functions) by Basset's formula,15 which in

15 G. N. Watson, Bessel Functions (Cambridge University Press, Cambridge, 1944), p. 172.

turn can be expanded by the well-known asymptotic expansion for large order.16 The result, for our purposes, could be foreseen by realizing that in (78) practically all the contribution will come from $p' = 0$. Using (77), and neglecting unimportant constant factors, it is

$$n(X) \simeq \left[\frac{4a^{3}}{N r_s^{3}(1-1/N)^{2}}\right]^{L}\exp\left[-\tfrac12(1+1/N)X\right]. \tag{79}$$

Now we get the number distribution of the terms $T$, using (71):

$$n(T)\,dT = \left[\frac{4a^{3}}{N r_s^{3}(1-1/N)^{2}}\right]^{L}\left[\frac{2V_m}{W}\right]^{\frac12 L(1+1/N)}\frac{dT}{T^{1+\frac12(1+1/N)}}. \tag{80}$$
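The change of variable from $X$ to $T$ that carries (79) into (80) can be written out in three lines. This is a sketch that uses only the definition (71); no new quantities are introduced.

$$\begin{aligned}
T = \frac{V_m^{L}}{(W/2)^{L}}\,e^{X} \;&\Rightarrow\; X = \ln T + L\ln\frac{W}{2V_m}, \qquad dX = \frac{dT}{T},\\
\exp\left[-\tfrac12(1+1/N)X\right] &= \left(\frac{2V_m}{W}\right)^{\frac12 L(1+1/N)} T^{-\frac12(1+1/N)},\\
n(T)\,dT = n(X)\,dX &= \left[\frac{4a^{3}}{N r_s^{3}(1-1/N)^{2}}\right]^{L}\left(\frac{2V_m}{W}\right)^{\frac12 L(1+1/N)}\frac{dT}{T^{1+\frac12(1+1/N)}}.
\end{aligned}$$

For $N > 1$ the exponent $\tfrac12(1+1/N)$ is less than unity, so the distribution of $T$ has a long power-law tail.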

Again we find a distribution of the form (23), satisfying very well the condition that the distribution of large values be that of the largest term. Now the only remaining task to complete the discussion is to study the criterion for localization numerically, using (23) and (26). First we shall deal very briefly with the unrealistic second case. Upon using (80), Eq. (26) becomes _r

L

iL<1+1/A )

l Lr2F„-| f2F»]

4a3 3 T 2 »V s 5 (1-1/A ) J

\Nr H\-\/Ny\

'

It is interesting to put this in the following form: W =-•

w-i—-1/A *J

SNKK+1)

(81)

2

T

LA*

The $3N$th root of the denominator on the left can be thought of as an effective radius of interaction; namely, contributions from much greater distances are certainly of no importance. Except when $N$ is very close to unity this radius is smaller than the mean nearest-neighbor distance ($\simeq 0.8 r_s$) and strongly dependent upon $a$. This tells us two things: first, that the infinite range of the potential is, unless $N \simeq 1$, of no importance, and the important interactions are with close neighbors; and second, that this particular calculation, which did not take into account the important correlations of near neighbors introduced by the restriction to nonrepeating paths, is of no direct value. An order-of-magnitude guess at the correct answer to this problem might be gotten by inserting something like $r_s$ for $a$ in (81); this eliminates the false effect of single, very close neighbors, which by the restriction to nonrepeating paths should make no contribution. We see then that $r_{\rm eff} \simeq r_s$, or approximately the mean nearest-neighbor distance. The resulting $V/W$ is rather surprisingly large and tends to explain Feher's result that even when a considerable fraction of the atoms

have a close neighbor for which V>W, there is still little or no spin diffusion. These difficulties with correlations confine our quantitative work to the one case in which this problem is solved for us, at least in principle, by Hammersley's work: The regular lattice with a finite number Z of equal interactions. Here we apply very directly the discussion of Eqs. (22)-(26): that is, we find the probability distribution of the sum of KL terms from that of the largest term, and thence find a critical (W/V)o below which all the higher terms are exponentially small with exponentially large probability. The criterion for applicability of the basic theorem in the case of the present distribution (65) turns out to be In

2V

lnr

A

L

<1,

J. W. Nicholson, Phil. Mag. 20, 938 (1910).

91

(82)

which we can show to be true at the critical $V/W$ in all cases. Equation (26) for the critical $W/V$ is, when one uses (65) and takes the $L$th root,

$$\frac{2eK\,(V/W)_c\,\psi\{-\ln[2(V/W)_c]\}}{1-(\Delta/W)_c} = 1. \tag{83}$$

Since we are really quite uncertain as to exactly how to take the correlations into account, we shall solve (83) first in the case of no correlation, which gives us an upper limit on $(W/V)_0$, as well as in the case of a finite $\Delta$ given by (50).

I. Upper limit. Here we assume that $\Delta = 0$ and that $\psi$ for region I is always correct, so that

$$eK\ln(W/2V)_{\rm u.l.} = (W/2V)_{\rm u.l.}. \tag{84}$$

A rough solution by iteration is clearly $(W/2V)_{\rm u.l.} = eK\ln(eK)$. More accurate solutions can be obtained by plotting (84) and are given in Fig. 3. The values are rather large; for instance, at $K = 4.5$, approximately correct for the simple cubic lattice, $(W/2V)_{\rm u.l.} = 45$. The rather large effect of increasing connectivity is interesting.

II. Lower limit. Next let us do the calculation taking into account the approximate lower limit (50). It turns out that in this case the $T$ of interest is in region II. In fact, with the definition (50), $-\ln(2V/W)_c = \tfrac12\ln(W/\Delta)$, so that $S$ is exactly correct for region II. This leads to

$$\frac{2K\ln(W/2V)_c}{1-(4V^{2}/W^{2})_c} = (W/2V)_c. \tag{85}$$
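Equations (84) and (85) are transcendental in $x = W/2V$, but the iteration mentioned in the text converges quickly when carried further. The sketch below is a minimal illustration, not the procedure used for Fig. 3: it iterates $x \leftarrow eK\ln x$ for the upper limit and $x \leftarrow 2K\ln x/(1 - 1/x^{2})$ for the estimate with finite $\Delta$, using $\Delta/W = 4V^{2}/W^{2} = 1/x^{2}$ from (50). The function names, the starting value, and the tolerance are assumptions.

```python
import math


def solve_fixed_point(rhs, start=10.0, tol=1e-12, max_iter=10000):
    """Iterate x <- rhs(x) until successive values agree to within tol."""
    x = start
    for _ in range(max_iter):
        new_x = rhs(x)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    raise RuntimeError("fixed-point iteration did not converge")


def upper_limit(k):
    """Larger root x = W/2V of Eq. (84): e*K*ln(x) = x."""
    return solve_fixed_point(lambda x: math.e * k * math.log(x))


def best_estimate(k):
    """Root x = W/2V of Eq. (85): 2*K*ln(x) / (1 - 1/x**2) = x."""
    return solve_fixed_point(
        lambda x: 2.0 * k * math.log(x) / (1.0 - 1.0 / x ** 2))


K_CUBIC = 4.5                       # connectivity quoted for the simple cubic lattice
X_UPPER = upper_limit(K_CUBIC)
X_BEST = best_estimate(K_CUBIC)
```

For $K = 4.5$ the upper-limit root comes out in the forties, of the same size as the value 45 quoted above, and the root of (85) lies below it, as the two curves of Fig. 3 require. The iteration is only expected to converge where the slope of the right-hand side at the root is less than unity, which fails as $K$ approaches 1.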

Except for the correction in the denominator, which is negligible for all but the smallest values of $K$, this is the same equation as before with 2 replacing $e$. Its solution is the lower curve of Fig. 3. It is very unlikely that (85) is accurate for $K \sim 1$, so no plot has been made in this region. This concludes our estimates of the critical ratio for transport.

FIG. 3. Numerical estimates for the critical $W/2V$, the ratio of line width to interaction, for transport, plotted against connectivity $K$. The upper curve is a quasi-exact upper limit; the lower one is our best estimate.

V. MEANING OF CONVERGENCE OF THE PERTURBATION THEORY

The results of Secs. III and IV may be stated simply as follows:

(1) The first few terms of perturbation theory are convergent in the sense that $\mathrm{Im}(V_c)$ is finite with probability unity, at any particular randomly chosen point on the energy axis.

(2) The terms of order $L$ or higher are, at any particular random point on the energy axis, exponentially small in $L$, with a probability that differs from unity only by an amount exponentially small in $L$, under the appropriate conditions.

We discuss here the question of what this tells us about the actual perturbed energy states. Our conclusion is that we can show that a typical perturbed state is localized with unity probability; but that we cannot prove that it is possible to assign localized perturbed states a one-to-one correspondence with localized unperturbed states in any obvious way, so that perhaps with very small probability a few states may not be localized in any clear sense.17

Equation (37) introduces the "Green's function" $(j|W(s)|k) = -if_{jk}(s)$. Let us define

$$s = \sigma - iE. \tag{86}$$

... final amplitudes on the various atoms by the prescription:

$$\lim_{\sigma\to0} i\sigma\,(j|W|k) = a_{jk}(E), \tag{87}$$

where

$$W = \frac{1}{is - H_0 - V} = \frac{1}{i\sigma + E - H}. \tag{88}$$

[This is just (39).] The scheme of the multiple-scattering method is to replace $V$ in (88) by a number $V_c$, which can, of course, only be done in the diagonal elements:

$$(j|W|j) = \frac{1}{i\sigma + E - E_j - V_c^{(j)}(i\sigma + E)}. \tag{89}$$

Then the general elements of $W$ are obtained by using the model operator $F$ of (42) and (47):

$$(j|W|k) = (j|F|k)(k|W|k). \tag{90}$$

It is important that in the expansion (47) of $F$ the particular denominator occurring in (89) never appears.

Let us start by thinking of a large but finite system, so that we know the energy levels are not a continuum. Then the exact energies of the problem are clearly the poles of $W$, which occur at

$$E - E_j - V_c^{(j)}(E) = 0. \tag{91}$$

This is just the expression one gets in Brillouin-Wigner perturbation theory. At such an energy, the amplitude at $t \to \infty$ is given by

$$a_{jj}(E) = \lim_{\sigma\to0}\frac{i\sigma}{i\sigma - i\,\mathrm{Im}[V_c^{(j)}(i\sigma + E)]}, \tag{92}$$

so that if $\mathrm{Im}\,V_c$ converges and this convergence remains as $N \to \infty$, this is finite, not zero as in usual transport theory.

Thus our proof that, at an arbitrarily chosen point $E$, $\mathrm{Im}\,V_c$, or any of the quantities of the theory, converge, and (as we could also show) $(j|F|k)$ falls off rapidly with distance, would be complete if (91) furnished us with only one solution per $E_j$, and if this solution were displaced by a finite and indeterminate amount from the poles of $V_c$ or $F$. Unfortunately, this is only true of some of the solutions of (91), those which correspond to nearby localized functions; but

Why this is so is just as apparent from the simplest second-order theory as from the whole sum. Let us write (91) to second order:

$$E - E_j - \sum_k \frac{(V_{jk})^{2}}{E - E_k} = 0.$$

Clearly this has a solution $E_0(j)$ near $E_j$ at which the sum takes on the value

$$\sum_k \frac{(V_{jk})^{2}}{E_j - E_k}$$

approximately. There is no close relation between $E_0(j)$ and any of the poles of $V_c$ to this order. But even for very small $V_{jk}$'s it also has a solution $E_0(k)$ near $E_k$, given by

$$E_0(k) - E_k \simeq \frac{(V_{jk})^{2}}{E_k - E_j} = (\delta E)_k. \tag{93}$$

This solution is closer to $E_k$, the weaker the coupling with $j$. Let us now compute $a_{jj}(E)$ at this energy, from (92). It is

$$a_{jj}(E_0(k)) = \lim_{\sigma\to0}\frac{i\sigma}{i\sigma\left[1 + (V_{jk})^{2}/(\delta E)_k^{2}\right]} \simeq \frac{(V_{jk})^{2}}{(V_{jk})^{2} + (E_j - E_k)^{2}} \ll 1. \tag{94}$$
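For a single pair of sites the second-order form of (91) is a quadratic and can be solved exactly, which gives a simple check on (93) and (94). The sketch below is an illustration with hypothetical energies and coupling: it finds both roots of $(E - E_j)(E - E_k) = V_{jk}^{2}$, identifies the one near $E_k$, and evaluates its weight on site $j$ as the residue $1/[1 + V_{jk}^{2}/(E - E_k)^{2}]$ of $(j|W|j)$ at that root.

```python
import math


def two_site_levels(e_j, e_k, v):
    """Exact roots of E - E_j - v**2 / (E - E_k) = 0."""
    mean = 0.5 * (e_j + e_k)
    half = 0.5 * (e_j - e_k)
    root = math.sqrt(half * half + v * v)
    return mean - root, mean + root


def weight_on_j(energy, e_k, v):
    """Residue of (j|W|j) at a root: 1 / (1 + v**2 / (E - E_k)**2)."""
    return 1.0 / (1.0 + v * v / (energy - e_k) ** 2)


E_J, E_K, V_JK = 0.0, 1.0, 0.05      # illustrative values, weak coupling
LEVELS = two_site_levels(E_J, E_K, V_JK)
NEAR_K = min(LEVELS, key=lambda e: abs(e - E_K))
NEAR_J = min(LEVELS, key=lambda e: abs(e - E_J))

SHIFT_K = NEAR_K - E_K                                   # compare with Eq. (93)
SHIFT_93 = V_JK ** 2 / (E_K - E_J)
WEIGHT_K = weight_on_j(NEAR_K, E_K, V_JK)                # compare with Eq. (94)
WEIGHT_94 = V_JK ** 2 / (V_JK ** 2 + (E_J - E_K) ** 2)
WEIGHT_J = weight_on_j(NEAR_J, E_K, V_JK)
```

The root near $E_k$ is displaced by very nearly $V_{jk}^{2}/(E_k - E_j)$ and carries only a small weight on site $j$, while the root near $E_j$ carries the rest, so the two weights sum to unity.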

In a similar way, one can show that $(j|F|k) \sim (E - E_k)/|V_{jk}|$, so that

$$\lim_{V_{jk}\to0} a_{jk}(E_0(k)) = 0 \tag{95}$$

also.

In the finite random lattice with all orders of perturbations the same considerations obviously hold; now, however, $E_k$ is a perturbed energy, and $V_{jk}$ includes virtual effects. In principle all the same considerations apply, since except for the exact localized state $E_k$ of (93) we have proved that all contributions to $V$'s and $F$'s fall off with distance sufficiently fast. Taking the limit $N \to \infty$ cannot change the above arguments for most states. However, because one of the distant wave functions may pop up at any point on the energy axis, it is not easy to see a way to assign a one-to-one correspondence of sites $j$ and perturbed energies. This seems hardly necessary for a qualitative understanding of what is happening, however.

The difficulty lies in the fact that $V_c$ is not really a continuous function in any sense. What can be done is to eliminate distant neighbors beyond a certain radius $R$. Then we find an appropriate perturbed energy $E_j'(R)$. The contribution to $V_c$, $\delta V_c(R)$, from beyond $R$ is a probability variable which can be made to have as narrow a distribution as desired, but at all $R$ may be large at a few $E$. Thus it is probable, but not certain, that a state localized around $j$ has an energy within a predetermined $\delta V_c(R)$ of $E_j'(R)$. However, as we increase the size of the system we can never find an $R$ beyond which $E_j'(R)$ is always as close as we like to the correct value. The fault lies in the Brillouin-Wigner technique, and is similar to problems as yet unsolved in other theories using this technique.

VI. ACKNOWLEDGMENTS

Aside from the help of Dr. L. R. Walker and Dr. P. A. Wolff already mentioned, I should like to thank a number of my colleagues for discussing this work with me; in particular, Dr. E. Abrahams, Dr. G. Feher, Dr. C. Herring, Dr. D. Pines, and Dr. G. H. Wannier.

New Method in the Theory of Superconductivity Phys. Rev. 110, 985-986 (1958)

This Letter, sandwiched in between two much longer papers on the same subject (Phys. Rev. 110, 827 and Phys. Rev. 112, 1900), summarizes very succinctly their conclusions and method. It not only solves the dilemma of gauge invariance but presents my somewhat primitive version of the new methods being invented in many-body theory by Martin and students, by Montroll and Ward, and above all by the Russian group around Gor'kov. I called mine the "equation of motion" method, and it was almost simultaneously presented by Morrel Cohen. My "Babylonian" emphasis on the problem at hand probably saved me from further involvement in what was in fact a very fruitful but very crowded area of research, competing with all these brilliant people, and sent me off into other directions.


Reprinted from T H E PHYSICAL REVIEW, Vol. 110, No. 4, 985-986, May 15, 1958 Printed in U. S. A.

New Method in the Theory of Superconductivity P. W.

ANDERSON

Bell Telephone Laboratories, Murray Hill, New Jersey (Received February 24, 1958)

Previous work1 has shown that plasmon effects play an important, if hidden, role in the Bardeen-Cooper-Schrieffer theory2 of superconductivity: they are essential for understanding the zero-momentum pairing condition. For this reason, at least, a deeper justification of the B.C.S. theory must be good enough to treat plasmon (long-range Coulomb) effects correctly. Our method obtains the ground state and elementary excitations of the B.C.S. superconductor while including the Coulomb forces and the higher-order corrections from the phonon forces by a method equivalent to the Sawada-Brout theory of correlation energy3 (which is exact in the high-density limit). One way of expressing the Sawada-Brout theory is to write down the full equation of motion of the quantity4

$$\rho_{k,\sigma}^{Q} = c^{*}_{k+Q,\sigma}\,c_{k,\sigma}, \tag{1}$$

which is, with only Coulomb interactions, [H,PK „«] = ( e k + Q - ek)pk, „«+47re2A2Q-1 x£^or2£k^(pk^<)*(Pk,,Q-"-Pk+9,,Q-').

[ , f f , p k , „ Q ] = (e k + Q— £k)pk,
Q

6 k 0 = C_ H C k t^& k °* = c k t*C_ k ; *.

(3)

where p = ]T k , , Pk, <, . This leads to the plasma oscillations as well as giving the Coulomb corrections to the energies of the individual particle modes. An observation which is trivial here but will not be in our generalization is the presence of nonphysical modes, for example | k [ , | k + Q | > £ f , for which (3) gives the correct energy but which, when applied to the Fermi sea or to anything derived via (3)

(4)

When the equations of motion (3) are generalized to include terms which are linear when the b's as well as re's are finite, they involve the quantities Jk Q * = Ck + QT*c_ k i*,

J k Q = c_ k _Qi(; k t;

(5)

and including the extended form of the attractive phonon interaction used by B.C.S., 1,2 the equations of motion are [i7,p k t Q ] = (e k + Q -6k)p k t Q +W# ! 0- 2 fl- 1 pQ(re k t-»k + QT) -Ek'(iF-2«2ft2i2-1|k-k'[-2) X[M(*kQ-(&_k'-Q-Q)*) +6k»(6_k._Q-<3)*-&k+Q»Jk-«],

(2)

In the sum almost all of the terms are products of two PkQ; terms which are not involve products Ck, ,*ct, „ = rek|(r, and are of two types: the terms q = Q , and exchange terms. Sawada shows the exchange and nonlinear terms to be negligible in the high-density limit (this is the random-phase approximation) by comparing with the Gell-Mann—Brueckner theory, 5 so that the equation of motion simplifies to +4veWQ-2ar1PQ(nk.„-nk+Q,„),

from it, are zero identically. The absence of these modes is a subsidiary condition on the theory. Our method generalizes this by observing that in the B.C.S. ground state of the superconductor not only the »k, <- have finite averages, but also the quantities

(6)

(where we neglect Q relative to k—k' wherever permissible) and [H,6 k Q]= - (ek + Q -(-6 k )6 k Q-2« 2 A 2 Q- 1 ^ 2 X(6k°PtQ+*k+Q0PiQ)-Ek'(l^-2^2ft2fi-1lk-k'|-2) X [ M ( p k t Q - i - p - k ^ H Q ) + 6 k ' Q ( » k + » - k - Q ) ] . (7) There are additional equations for the deviations 56k° and 5»k,« of the b°'s and re's from their ground state values. These contain constant terms unless a certain condition for equilibrium is satisfied, which for V = 0 leads trivially to the Fermi sea but in the superconducting case is the B.C.S. integral equation determining the ground state. The linear terms then give equations similar to (6) and (7) with Q~2pQ terms absent. Equations (6) and (7) become manageable only when the auxiliary conditions which eliminate nonphysical modes (single-particle excitations which do not jump the energy gap) are used. In this step it was necessary to use Bogoliubov's description of the ground state in

terms of a transformed set of fermions.6 We find the following types of excitations:

I. A set of individual particle modes displaced by an amount of order $\Omega^{-1}$ from the continuous spectrum with energy gap of the Bardeen theory.

II. A set of plasma modes with a dispersion relation very like that of Bohm and Pines.4 In the absence of the plasma ($Q^{-2}$) terms, these occupy the energy gap and obey some of the relationships given in reference 1, but have a phonon-like dispersion law.7

III. A single $Q = 0$ mode at $\omega = 0$ which corresponds to a coordinate conjugate to the total number of electrons $N$. The zero-point motion of this mode automatically fixes $N$; this feature is what allows us to work with the $b$ operators, which do not conserve this number.

From this the following conclusions may be drawn: (1) By calculating the zero-point energies and motions of these modes as in reference 3, the next higher-order corrections to the B.C.S. theory could be obtained. We justify this theory in that we show that

these reasonably large corrections do not change the excitation spectrum in essentials. (2) Collective and plasma effects indeed reinforce the B.C.S. theory in the way predicted in reference 1, even though the exact matrix elements in B.C.S. are incorrect because of the neglect of collective effects.

I am indebted to H. Suhl for help with some of the calculations and for many conversations. A more complete description of the method will be published later.

1 P. W. Anderson, Phys. Rev. 110, 985 (1958), this issue.
2 Bardeen, Cooper, and Schrieffer, Phys. Rev. 108, 1175 (1957). This is called B.C.S. hereafter.
3 K. Sawada, Phys. Rev. 106, 372 (1957); Sawada, Brueckner, Fukuda, and Brout, Phys. Rev. 108, 507 (1957); R. Brout, Phys. Rev. 108, 515 (1957).
4 This is related to a method used by D. Bohm and D. Pines, Phys. Rev. 92, 609 (1953), Appendix II.
5 M. Gell-Mann and K. A. Brueckner, Phys. Rev. 106, 364 (1957).
6 N. N. Bogoliubov, J. Exptl. Theoret. Phys. U.S.S.R. (to be published). Similar relations were also derived by J. G. Valatin (private communication via J. Bardeen).
7 By Feynman's argument [see reference 1; also Phys. Rev. 94, 262 (1954)], this implies that zero-point motions greatly reduce the long-range correlation in the Bardeen theory.


New Approach to the Theory of Superexchange Interactions Phys. Rev. 115, 2-13 (1959)

In that eventful year of 1958 I had time to spend two months visiting Charlie Kittel's group in Berkeley. I replaced Freeman Dyson, his summer visitor for the previous two years, and benefited from the considerable funding Charlie had found for Dyson only to the extent of our having a marvelous house perched on the Berkeley hills next to the Radlab. Many people — Cissie Geballe, who was out visiting her influential family, Jim Phillips, Charlie himself in spite of an earlier squabble not worth mentioning here, and Tom Lehrer being at the Hungry I, among others, conspired to make our visit a pleasant and interesting one, and a chance encounter with Leslie Orgel provided the final key piece of this paper (as I remark rather redundantly in the notes on the next paper.) As will appear later, the Mott interaction and the magnetic effects resulting from it were, from this time onwards, a crucial theme of my career, taking equal billing with broken symmetry and with disorder effects. I consider this paper the central proof of the Mott concepts. Incidentally, it is a "new" method because in one of the earlier papers about antiferromagnetism I had used a brute force method for it, based on an old paper of Kramers'.


Reprinted from The Physical Review, Vol. 115, No. 1, 2-13, July 1, 1959. Printed in U.S.A.

New Approach to the Theory of Superexchange Interactions P. W. ANDERSON

Bell Telephone Laboratories, Murray Hill, New Jersey
(Received February 4, 1959)

The theory of indirect exchange in poor conductors is examined from a new viewpoint in which the d (or f) shell electrons are placed in wave functions assumed to be exact solutions of the problem of a single d-electron in the presence of the full diamagnetic lattice. Inclusion of d-electron interactions leads to three spin-dependent effects which, in the usual order of their sizes, we call: superexchange per se, which is always antiferromagnetic; direct exchange, always ferromagnetic; and an indirect polarization effect analogous to nuclear indirect exchange. Superexchange itself is shown to be closely related to the poor conductivity, in agreement with experiment. By means of crystal field theory the parameters determining superexchange can be estimated, and in favorable cases (NiO, LaFeO₃) the exchange integrals can be evaluated with accuracy of several tens of percent. Qualitative understanding of the whole picture of exchange in iron group oxides and fluorides follows from these ideas.

I. INTRODUCTION

In spite of a considerable literature, a qualitative understanding of the superexchange phenomenon has been lacking. Superexchange may be defined as the exchange mechanism taking place in transition-metal salts in which the ions are fairly well separated by normally diamagnetic groups, and in which conduction is poor at all normal temperatures.¹,² The experimental fact which is most striking is that in almost every known case this interaction is antiferromagnetic; very few insulators are true ferromagnets.³ (Ferrimagnetism of course results from antiferromagnetic interactions.) This generalization suggests that there may be a distinct, reasonably universal mechanism for superexchange, and perhaps even that it is related to the Mott mechanism⁴ which prevents conduction.

Aside from not convincingly explaining this experimental point, the theories which have been advanced have all suffered from one major difficulty: although it is obvious that a perturbation technique is correct, since the exchange effect is small compared to most energies of the problem, the perturbation problem is very difficult to define satisfactorily. Usually electron interaction problems are solved by starting from some suitable set of single-electron states satisfying an assumed one-electron Hamiltonian, and introducing the interaction as a perturbation; but the appropriate one-electron states here should have the symmetry of the problem, i.e., be running waves, while clearly the outstanding feature of this problem is that the electrons are localized, and actually the localization is caused by their interactions.⁴ In addition, another large effect is the overlap of d-electrons onto neighboring diamagnetic groups, which is neither small nor easy to take into account, especially because of the problem of orthogonalization.

As a result of these difficulties, it is not even clear whether the apparently very different schemes which have been proposed are actually distinct, or what their relationship is.⁵ We shall suggest here that many of them are roughly correct in that they represent various ways of looking at parts, larger or smaller, of the same physical mechanism.

The subject divides itself naturally into two parts. First we have to write down the rather abstract, general theory of d-electron spins and their interactions. The first and most important part of this job is to understand the concept of a single d-electron in the presence of the diamagnetic lattice. Such an electron is actually a quasi-particle like the polaron, because it carries with it a cloud of associated electronic polarization. In particular, it will have an associated spin polarization.⁶ The two most important properties of these particles are their kinetic energy, or desire to delocalize themselves, and their repulsion when they are too near each other. Whenever this repulsion predominates and prevents metallic conduction and the formation of bands, we show that the opposite tendency to delocalize causes a necessarily antiferromagnetic interaction which we identify as superexchange.

The physical basis for this is simply that antiparallel electrons can gain energy by spreading into nonorthogonal overlapping orbitals, where parallel electrons cannot. This term is the first in a series of types of interactions, of which we discuss the first few, including all suggestions so far made. Quantitative estimates verify that superexchange dominates when it is present.

The superexchange interaction may be expressed simply in terms of parameters related to other properties of the crystal. In the second part of the paper we choose to estimate the most important from ligand field theory⁷ (which may be thought of as the theory of the isolated quasi-spin). We find that we can make at least a semiquantitative comparison with experiment for oxide and fluoride antiferromagnets, and explain most of the qualitative regularities in the data.

¹ H. A. Kramers, Physica 1, 182 (1934).
² P. W. Anderson, Phys. Rev. 79, 350 (1950).
³ R. R. Heikes, Phys. Rev. 99, 1232 (1955).
⁴ N. F. Mott, Proc. Phys. Soc. (London) A62, 416 (1949).
⁵ A number of these are discussed in J. Yamashita and J. Kondo, Phys. Rev. 109, 730 (1958); also see R. K. Nesbet, Ann. Phys. 4, 87 (1958).
⁶ It will occasionally be convenient to have a name for these particles. We call them "quasi-spins" or "spin quasi-particles."
⁷ For a review see J. S. Griffith and L. E. Orgel, Quart. Revs. (London) 11, 381 (1957).

II. ISOLATED SPIN QUASI-PARTICLE

As was remarked in the introduction, the only satisfactory single-electron wave functions in a solid-state problem are running waves, and we indeed start the discussion with a single running-wave d-electron isolated in the diamagnetic crystal. We must imagine that the nuclear charges are adjusted in such a way that the d-electron sees a potential not far from what it sees in the real crystal; this is not difficult. This d-electron in the real crystal is not very accurately simply a superposition of atomic d-electron functions, for three reasons, in order of numerical importance:

(1) It must be orthogonalized to all of the wave functions of the electrons already present. In particular, the O²⁻ or F⁻ electrons will have formed partially covalent bonds with d-orbitals on the ions, according to the Van Vleck "molecular orbital" scheme as adapted by Stevens and Owen,⁸,⁹ so that the wave function has an antibonding admixture of anion functions. Since the quasi-spin is assumed to be in principle an exact solution of the one-electron problem plus the diamagnetic crystal, there is no need to confuse ourselves with any additional one-electron transfer effects,¹⁰ as opposed to the true polarization effects we discuss shortly. It should be emphasized that the main virtue of the concept is this negative result, that it contains from the start all one-electron effects. We shall soon show that the most important part of the exchange results simply from these altered one-electron functions; what the spin quasi-particle concept does is to isolate these one-electron effects from polarization effects, making it possible to estimate them separately and show the latter are usually small.

(2) The electron, being a charged particle, is surrounded by an accompanying cloud of electronic dielectric polarization. Note that we must not take into account lattice polarization because most of the motions of our particles will be virtual ones with such large perturbation denominators that the motions are too rapid for the nuclei to follow. This electronic dielectric polarization is unimportant except for its numerical effect in reducing somewhat the Coulomb interactions of the particles, because it is not spin-dependent; we shall not therefore even write it down.

(3) More important for spin effects is the spin-dependent polarization caused by the exchange effect. For running d-electrons this takes the form of virtual excitations from filled bands into empty ones, while the d-electron jumps to a different k-state in the d-band; parallel electrons in the full bands go to parallel states, while antiparallel ones exchange spins with the d-electrons. The full expression, in perturbation theory, of this effect is explored in the Appendix.

In second-quantized notation,¹¹ we can call the many-electron wave function of the diamagnetic lattice $\Psi_0$, and define a fermion creation operator $s^*_{k\sigma}$ such that the many-electron wave function of a single quasi-particle of momentum $k$ and spin $\sigma$ is

$$\Psi_{k\sigma} = s^*_{k\sigma}\,\Psi_0. \tag{1}$$

The s's are a set of properly anticommuting fermion operators. We should also note that really there are 5 bands of s's, which may be of complicated degenerate forms, especially in cubic crystals. An s might be expressed in terms of a series of the true one-electron operators c as (see the Appendix)

$$s^*_{k\sigma} = c^{d*}_{k\sigma} + \sum_{k',q,\sigma'}\ \sum_{e,f}\,[\ldots]\; c^{*}_{k+q,\sigma'}\,c_{k',\sigma'}\,c^{*}_{k'-q,\sigma} + (\text{ordinary polarization}) + \cdots \tag{2}$$

(the coefficient and the band superscripts of the three operators in the second term are illegible in this copy). Here the c*'s are the creation operators for a complete orthonormal set of one-electron wave functions, which are chosen to be in some sense the best possible such set (presumably obtained by a Hartree-Fock method). One might define such a set precisely by the criterion that in (2) only one one-electron term should appear. Here a $c^d$ refers to a one-electron […]

$$\psi_\sigma(r) = \sum_k \Big\{ c^{d}_{k\sigma}\,\varphi^{d}_{k}(r) + c^{f}_{k\sigma}\,\varphi^{f}_{k}(r) + \sum_{e} c^{e}_{k\sigma}\,\varphi^{e}_{k}(r) \Big\}, \tag{3}$$

⁸ J. H. Van Vleck, J. Chem. Phys. 3, 807 (1935).
⁹ K. W. H. Stevens, Proc. Roy. Soc. (London) A219, 542 (1953); J. Owen, Proc. Roy. Soc. (London) A227, 183 (1954).
¹⁰ P. W. Anderson and H. Hasegawa, Phys. Rev. 100, 675 (1955).
¹¹ P. Jordan and E. Wigner, Z. Physik 47, 631 (1928); for a clearer treatment and applications to such problems as the present one see W. Heisenberg, Ann. Physik 10, 888 (1931); V. Fock, Z. Physik 75, 622 (1932). A brief review for those somewhat unfamiliar: $c_k^*$ is an operator which is thought of as creating a single electron in the one-electron orbital $\varphi_k(r)$, while its Hermitian conjugate $c_k$ destroys this electron. The operator $n_k = c_k^* c_k$ counts the electrons in state k in the sense that a state with exactly n electrons in k diagonalizes it with eigenvalue n. The Pauli principle and Fermi statistics are insured by the anticommutation relations,

$$c_k^*\,c_{k'} + c_{k'}\,c_k^* = \delta_{kk'}, \qquad c_k\,c_{k'} + c_{k'}\,c_k = 0,$$

the k = k' relation limiting n to 0 or 1 and the rest enforcing antisymmetry. The assumption we make is that the "clothed," physical quasi-particle operators like the $s_k^*$, which create and destroy the electron together with its accompanying polarization, also exist and obey the same relations. This assumption is, though unproved except in perturbation theory, basic both to usual solid-state theory and quantum field theory. The field operator $\psi_\sigma^*(r)$ introduced in (3) creates an electron at point r rather than in state k; clearly the relationship is the linear transformation (3). Even this much knowledge of second quantization will not be necessary to understand the physical ideas in this paper.
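The anticommutation relations reviewed in this footnote can be checked on explicit matrices. The sketch below is only an illustration, not part of the paper: it assumes the standard Jordan-Wigner matrix construction, and the helper names `fermion_ops` and `anticomm` are hypothetical. It builds three fermion modes and confirms that the number operator has eigenvalues 0 and 1 only.

```python
import numpy as np

def fermion_ops(n_modes):
    """Annihilation operators c_k as real matrices (Jordan-Wigner construction)."""
    a = np.array([[0.0, 1.0], [0.0, 0.0]])  # single-mode annihilator: a|1> = |0>
    z = np.diag([1.0, -1.0])                # sign string that enforces antisymmetry
    eye = np.eye(2)
    ops = []
    for k in range(n_modes):
        m = np.array([[1.0]])
        for factor in [z] * k + [a] + [eye] * (n_modes - k - 1):
            m = np.kron(m, factor)
        ops.append(m)
    return ops

def anticomm(x, y):
    return x @ y + y @ x

c = fermion_ops(3)
dim = c[0].shape[0]
for k in range(3):
    for kp in range(3):
        expected = np.eye(dim) if k == kp else np.zeros((dim, dim))
        # c_k* c_k' + c_k' c_k* = delta_kk'  (the matrices are real, so .T is the adjoint)
        assert np.allclose(anticomm(c[k].T, c[kp]), expected)
        # c_k c_k' + c_k' c_k = 0
        assert np.allclose(anticomm(c[k], c[kp]), 0.0)

n0 = c[0].T @ c[0]                          # number operator for mode 0
eigs = sorted(set(np.round(np.linalg.eigvalsh(n0), 12)))
print("eigenvalues of n_0:", eigs)
```

The k = k' relation is what restricts the occupation to 0 or 1; the sign strings `z` carry the antisymmetry between different modes.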

where the […] and also higher many-electron terms. The most important property of the s's defined in (2) as single quasi-particles is their kinetic energy

$$E_K = \sum_{k,\sigma} \epsilon(k)\, s^*_{k\sigma}\, s_{k\sigma}; \tag{4}$$

again we understand that the sum really contains five bands. If $\epsilon(k)$ were a constant, an equally satisfactory set of one-electron starting functions would be the localized Wannier functions formed from the s's,

$$s^*(R,\sigma) = N^{-1/2} \sum_k e^{ik\cdot R}\, s^*_{k\sigma}. \tag{5}$$

Even when the variation of $\epsilon$ with k is simply small relative to other energies in the problem (as it will turn out to be) the localized function represents a good starting point, and the variation of $\epsilon$ may often be treated as a perturbation. This variation is small because it does not come from the direct transfer of an electron from the diamagnetic ion to the paramagnetic one, but from the higher order effect, transfer all the way from cation to cation. Here we see one of the advantages of the quasi-spin concept: in starting with these localized functions we have already included all the interactions of the d-electrons with the diamagnetic ions, which may be large; we use as a perturbation only the overlap from one metal ion to another, which is much smaller.

A second useful feature of the localized spin quasi-particles is that, neglecting $\epsilon(k)$ in the large energy denominators involved in the many-electron parts, the expressions for such things as the spin-polarization greatly simplify, as is shown in the Appendix, and take on the form one expects for truly localized functions. This form is almost identical with the corresponding spin-polarization around a nucleus which causes nuclear indirect exchange.¹²

To get a full set of five localized functions the sums in (5) must actually be confined to any appropriate set of five bands in turn. The resulting five localized functions form a reducible representation of the ion's point group and so, in for example the cubic case, can be recombined to form the usual irreducible sets: $t_{2g}$, of symmetry like $(xy, yz, zx)$, and $e_g$ $(y^2 - x^2,\ 2z^2 - x^2 - y^2)$. We shall introduce a new index n and call this set of localized quasi-particle operators $s_n(R,\sigma)$. Explicitly, we can write the running-wave quasi-spin in the $\mu$th branch of the d-band in terms of the $s_n$ as

$$s^{\mu *}_{k\sigma} = N^{-1/2} \sum_n a_{n\mu} \sum_R e^{-ik\cdot R}\, s_n^*(R,\sigma),$$

and this, inserted in the kinetic energy expression (4), gives the energy in terms of localized functions

$$E_K = \sum_{n,\sigma,R} \epsilon_{cn}\, s_n^*(R,\sigma)\, s_n(R,\sigma) + \sum_{n,n',R\neq R',\sigma} b^{nn'}_{R-R'}\, s_n^*(R,\sigma)\, s_{n'}(R',\sigma) \tag{6}$$

¹² N. F. Ramsey, Phys. Rev. 91, 303 (1953); M. A. Ruderman and C. Kittel, Phys. Rev. 96, 99 (1954); N. Bloembergen and T. J. Rowland, Phys. Rev. 97, 1679 (1955).
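The passage from running waves to localized functions can be illustrated with a one-band toy model. The sketch below is an assumption-laden illustration rather than anything taken from the paper: it uses a single band on a ring of N sites with a hypothetical dispersion $\epsilon(k) = \epsilon_c + 2b\cos k$, and shows that the Wannier-type change of basis turns the diagonal kinetic energy into an on-site energy plus nearest-neighbor transfer integrals. The numerical values of `eps_c` and `b` are arbitrary.

```python
import numpy as np

N = 16                      # sites on a ring (illustrative choice)
eps_c, b = 1.0, 0.05        # hypothetical on-site energy and nearest-neighbor transfer integral
k = 2 * np.pi * np.arange(N) / N
eps_k = eps_c + 2 * b * np.cos(k)   # narrow band: variation small compared with eps_c

# Localized functions: s(R) = N^{-1/2} sum_k exp(ikR) s_k (the sign convention is a choice)
R = np.arange(N)
F = np.exp(1j * np.outer(R, k)) / np.sqrt(N)   # F[R, k]
H_local = F @ np.diag(eps_k) @ F.conj().T      # kinetic energy in the localized basis

print("on-site energy       :", H_local[0, 0].real)
print("nearest-neighbor b   :", H_local[0, 1].real)
print("second-neighbor term :", abs(H_local[0, 2]))
```

In this toy case the Fourier transform of $\epsilon(k)$ returns exactly the two constants that were put in, which is the sense in which the crystal-field energies and transfer integrals can be regarded as the fundamental parameters of the localized description.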

(by group theory the first term must clearly be diagonalized by the symmetric functions $s_n$). The $\epsilon_{cn}$ are the energies in the crystal field, which may be somewhat larger than the constants b which play the part of one-electron "transfer integrals" for the localized functions. In principle both can be obtained by Fourier transformation from the $a_{n\mu}$'s and the $\epsilon(k)$, but in actual practice one will compute or measure them from properties of the localized functions. We shall regard the $\epsilon_c$'s and the b's, the latter of which are in practice appreciable only for near neighbors, as fundamental constants of the theory.

The only purpose then of this whole excursion has been to make the following observation: that the $s_n^*(R,\sigma)\Psi_0$ are a set of many-electron functions representing spins localized on cations, for which the only off-diagonal matrix elements of the full Hamiltonian are the transfer integrals $b_{R-R'}$. Thus, aside from these transfer integrals, all the further properties of the spins must follow from true many-electron interaction effects. (We shall find that the b's are of the order of half an electron volt.)

III. INTERACTIONS OF QUASI-PARTICLES

Having set up the picture of individual quasi-spins, we shall now write down their interactions in the order of their numerical importance. The first few are simply the various electrostatic interactions of the purely one-electron parts, which may be written down by starting with the electrostatic energy

$$V = \tfrac{1}{2}\int dr \int dr' \sum_{\sigma,\sigma'} \psi^*_{\sigma'}(r')\,\psi^*_{\sigma}(r)\, \frac{e^2}{|r-r'|}\, \psi_{\sigma}(r)\,\psi_{\sigma'}(r'), \tag{7}$$

and expanding $\psi_\sigma(r)$ in terms of the localized functions analogously to (3),

$$\psi_\sigma(r) \cong \sum_{R,n} s_n(R,\sigma)\, f_n(r-R) + \sum \text{other bands}, \tag{8}$$

$f_n(r)$ being the one-electron Wannier function as normally understood. Actually, (8) is just the first terms of a series which is to be understood as the inverse of (2), an expansion of $\psi$ (or the c's) in powers of the many-electron operators s. Higher many-electron terms in the energy, however, are more easily written down by direct perturbation theory than by writing the $\psi$'s in (7) and (8) in terms of many-electron functions.

Inserting (8) into (7), we get as the largest term the Coulomb repulsion of the electrons when they are on the same ion:

$$U_{\mathrm{rep}} = \tfrac{1}{2} \sum_{R,n,n',\sigma,\sigma'} U_{nn'}\, s^*_{n'}(R,\sigma')\, s^*_n(R,\sigma)\, s_n(R,\sigma)\, s_{n'}(R,\sigma'), \tag{9}$$

where

$$U_{nn'} = \int dr \int dr'\, |f_n(r)|^2\, |f_{n'}(r')|^2\, e^2\, |r-r'|^{-1}. \tag{10}$$

The coefficients $U_{nn'}$ are of the order of 15 electron volts for free-ion functions, but are reduced in the solid by dielectric polarization. To a rough approximation the U's do not depend on n and n'. What dependence there is, is given by the so-called Slater integrals, and will be discussed shortly. For the present $U_{nn'} = U$. The effect of the large magnitude of $U_{\mathrm{rep}}$ is that where possible the electrons remain equally shared among the ions. It costs an energy

$$mU - (m-1)U = U, \tag{11}$$

to take an electron from an ion with the correct number, m, of electrons (and thus m−1 repulsive interactions with the given one) to one with m+1 electrons and thus m repulsive interactions. This in fact is how we estimate U.

Because $U_{nn'}$ is so much bigger than the transfer integrals $b_{R-R'}$, the zeroth approximation we use should diagonalize the energy $U_{\mathrm{rep}}$ rather than $E_K$: that is, should occupy the localized wave functions s(R) with individual electrons rather than the running-wave functions $s_k$. The lowest states will be those with some exact number m of quasi-spins on each ion; the perturbed virtual states, states with one or more electrons transferred to other ions. These lowest states form a zeroth order degenerate manifold because of the various possibilities of spin […]

$$\tfrac{1}{2} \sum_{R\neq R'}\ \sum_{n,n',\sigma,\sigma'} U(R-R')\, s^*_{n'}(R',\sigma')\, s^*_n(R,\sigma)\, s_n(R,\sigma)\, s_{n'}(R',\sigma'), \tag{12}$$

$$U(R-R') = \iint |f_n(r-R)|^2\, |f_{n'}(r'-R')|^2\, e^2\, |r-r'|^{-1}\, dr\, dr'. \tag{13}$$

In the actual many-electron ions the $U_{nn'}$'s of course differ slightly from one n, n' to another, by the so-called Slater integrals. These, the $\epsilon_c(n)$, and the shortly-to-be described Hund's rule internal exchange integrals give us a single-ion problem whose diagonalization is the main province of ligand field theory. (See reference 7.) The result, however, is, except for the cases of V³⁺ ($d^2$) and Co²⁺ […]

Equation (13) will obviously not depend much on n and n'. Since in all zeroth order states the occupation is uniform, (12) has no effect except to reduce the cost of transferring an electron from R to R'. Two electrons at a typical ionic distance in oxides, 4 Å, repel each other by perhaps 6 electron volts; subtracting this from the isolated-ion 15 electron volts, an upper limit on this cost is about 10 electron volts, further to be reduced by dielectric polarization.¹⁴ (Note that the observed activation energy for d-band conduction will be even smaller because of ionic polarization; this might be 2-4 electron volts, while the effective $U_{nn'}$ might be 5-10 electron volts.)

Next comes the true exchange interaction within the individual cations. This is the Coulomb repulsion of the overlap charge:

$$V_{\mathrm{ex}} = \tfrac{1}{2} \sum_{R,\,n\neq n',\,\sigma,\sigma'} J_{nn'}\, s^*_{n'}(R,\sigma')\, s^*_n(R,\sigma)\, s_{n'}(R,\sigma)\, s_n(R,\sigma'), \tag{14}$$

where

$$J_{nn'} = \int dr \int dr'\, f_n^*(r)\, f_{n'}(r)\, f^*_{n'}(r')\, f_n(r')\, e^2\, |r-r'|^{-1} \tag{15}$$

is the usual exchange integral. Note that if we confine ourselves to a subspace in which

$$s_n^*(+)s_n(+) + s_n^*(-)s_n(-) = n_n(+) + n_n(-) = 1$$

(i.e., orbital state n is certainly occupied), then

$$n_n(+) - n_n(-) = 2S_z^{\,n}, \qquad s_n^*(+)s_n(-) = S_+^{\,n} = S_x^{\,n} + iS_y^{\,n}, \qquad s_n^*(-)s_n(+) = S_-^{\,n} = S_x^{\,n} - iS_y^{\,n}, \tag{16}$$

where the capital $S^n$'s are the spin vector associated with state n. Using these relationships, we arrive at the well-known identity between (14) and a spin interaction:

$$V_{\mathrm{ex}} = -\tfrac{1}{2} \sum_{R,\,n\neq n'} J_{nn'} \left( \tfrac{1}{2} + 2\, S^n \cdot S^{n'} \right). \tag{17}$$

This interaction will remove the degeneracy of relative spin orientations within the single cations, according to the usual Hund's rule reasoning (see reference 13).

So far no terms have affected the degeneracy of relative spin orientation of spins on neighboring ions; now we shall bring forth the largest terms which do. By far the largest, when it occurs, is a spin-dependent force resulting from the terms already presented. The mutual repulsion $U_{nn'}$ of electrons on the same ion prevents the permanent occupation of ionized states, and thus makes the d-band substances insulators; but by virtually occupying, to a small extent, ionized states which are connected to the ground state by the transfer integrals b, the system can gain a certain amount of energy. However, only ionized states in which the electron transfers to an empty state n, σ can contribute; since the transfer integrals b carry the electron without change of spin, this means that the energy is gained only in the presence of an antiparallel neighbor, which means an antiferromagnetic effect. Clearly this part of our theory is a generalization of the "alternant orbitals" idea¹⁵ whereby the wave functions for opposite spins need not be orthogonal to each other and can spread out on to each other's respective sites. What is new is that we have extended the idea to much more general situations and shown that it is closely connected with the insulating property of the transition-metal salts.

First we shall write this interaction down for the simple model of one orbital per spin which has been the province of most previous discussions of superexchange²,⁵; then we shall write down the more general expression. In such a simple model all the degenerate states in the ground-state manifold have exactly one electron per ion, while all the excited states with one transferred electron have energy U. Between any pair of ions at a distance R−R' there is only one $b_{R-R'}$; this must act twice to return the state to one of the ground manifold. Thus the second-order perturbation of the energy is

$$\Delta E = -\sum_{R,R',\sigma,\sigma'} \frac{|b_{R-R'}|^2}{U}\, s^*(R,\sigma)\, s(R',\sigma)\, s^*(R',\sigma')\, s(R,\sigma'). \tag{18}$$

The combination of quasi-particle operators is much simplified when we realize that in the ground states one and only one orbital per ion is occupied, so that the identities (16) hold. Then

$$\begin{aligned}
& s^*(R,+)s(R',+)s^*(R',+)s(R,+) + s^*(R,-)s(R',-)s^*(R',-)s(R,-) \\
&\quad = \tfrac{1}{2}\{ n_R(+)[n_{R'}(-) + 1 - n_{R'}(+)] + n_R(-)[n_{R'}(+) + 1 - n_{R'}(-)] \} \\
&\quad = \tfrac{1}{2}\{ 1 - [n_R(+) - n_R(-)][n_{R'}(+) - n_{R'}(-)] \} \\
&\quad = \tfrac{1}{2} - 2 S_z(R) S_z(R').
\end{aligned} \tag{19}$$

[Note that here the constant term has opposite sign to that in (17). This is a real difference and may be observable.] Also,

$$\begin{aligned}
& s^*(R,+)s(R',+)s^*(R',-)s(R,-) + s^*(R,-)s(R',-)s^*(R',+)s(R,+) \\
&\quad = -[S_x(R) + iS_y(R)][S_x(R') - iS_y(R')] - [S_x(R) - iS_y(R)][S_x(R') + iS_y(R')] \\
&\quad = -2S_x(R)S_x(R') - 2S_y(R)S_y(R'),
\end{aligned} \tag{20}$$

so that

$$\Delta E = \text{const} + \sum_{R,R'} \frac{2\,|b_{R-R'}|^2}{U}\, S_R \cdot S_{R'}. \tag{21}$$

This is the antiferromagnetic exchange effect. Our estimates would put the effective exchange integral at 1/50-1/10 electron volts, or 200 to 1000 degrees. As we shall see, all other effects are orders of magnitude smaller, so that it is not surprising that this has been the outstanding effect experimentally.

The above mechanism leads to a nonvanishing result only when R, σ' and R', σ are occupied states; i.e., when the d-states from and to which transfer takes place are half full. This corresponds to the "both more than half full" case of Anderson² and in fact that reference contains, without understanding it, a clumsy but in principle correct calculation, as to a greater or lesser extent do later papers.⁵

In the more complicated 5-orbital d-band, there will be many exchange integrals of this same kind, between each pair of occupied orbitals. There will in addition be off-diagonal exchange terms like

$$\sum \frac{b^{nn'}_{R-R'}\, b^{n'n''}_{R-R'}}{U}\, s_n^*(R,\sigma)\, s_{n'}(R',\sigma)\, s^*_{n'}(R',\sigma')\, s_{n''}(R,\sigma'), \tag{22}$$

and more complicated types. These off-diagonal terms will usually connect to states which are higher than the ground states by crystal field energies, although in the anomalous cases like Co⁺⁺ they can play some role. We shall discuss the numerical aspect of the above superexchange mechanism in the next part of the paper; now we go on to other effects.

A third-order perturbation effect seems to be one of the major contributions when one of the states to which transfer takes place is completely empty. This is the third-order effect of transfer together with the internal exchange effect (14). Suppose that state n on R is half full, state n' on R' is completely empty (presumably because of the crystal field effects), and $b^{nn'}_{R-R'}$ is fairly large. Also suppose that there is a state n'' which is partially occupied on atom R'. Then there is a third-order interaction

$$\Delta E_3 = \sum_{\sigma,\sigma',\sigma'',\sigma'''}\ \sum_{R,R'} \frac{|b^{nn'}_{R-R'}|^2\, J_{n'n''}}{U^2} \times [\text{product of eight quasi-particle operators, illegible in this copy}].$$

¹⁴ There will also be an additional reduction, similar to the reduction of 10-20% in the Slater integrals and spin-orbit coupling in ligand field theory (see reference 8), caused by the reduction in the amplitude of the wave function on the central ion due to spreading to the ligands.
¹⁵ P.-O. Löwdin, Phys. Rev. 97, 1509 (1955).
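The second-order superexchange result, an antiferromagnetic coupling of strength $2|b|^2/U$ per ordered pair of ions (that is, $4b^2/U$ per pair), can be compared with an exact diagonalization of the simplest case, two ions with one orbital each. The sketch below is an illustration under stated assumptions and is not the paper's calculation: it assumes a two-site model with a single transfer integral `b` and on-site repulsion `U`, uses hypothetical helper names, and picks values inside the ranges quoted in the text (b of order half an electron volt, U of 5-10 electron volts).

```python
import numpy as np

def fermion_ops(n_modes):
    """Annihilation operators as real matrices (Jordan-Wigner construction)."""
    a = np.array([[0.0, 1.0], [0.0, 0.0]])
    z = np.diag([1.0, -1.0])
    eye = np.eye(2)
    ops = []
    for k in range(n_modes):
        m = np.array([[1.0]])
        for factor in [z] * k + [a] + [eye] * (n_modes - k - 1):
            m = np.kron(m, factor)
        ops.append(m)
    return ops

def two_site_levels(b, U):
    """Energy levels of two electrons on two one-orbital ions."""
    c = fermion_ops(4)   # modes: (ion 1, up), (ion 1, down), (ion 2, up), (ion 2, down)
    n = [op.T @ op for op in c]
    hop = sum(c[i].T @ c[i + 2] + c[i + 2].T @ c[i] for i in (0, 1))  # spin-conserving transfer
    H = b * hop + U * (n[0] @ n[1] + n[2] @ n[3])                     # repulsion on the same ion
    two = np.isclose(np.diag(sum(n)), 2.0)                            # keep two-electron states only
    return np.linalg.eigvalsh(H[np.ix_(two, two)])                    # ascending order

b, U = 0.5, 8.0                       # electron volts, illustrative values
levels = two_site_levels(b, U)
gap = levels[1] - levels[0]           # triplet energy minus singlet energy
estimate = 4 * b**2 / U               # second-order perturbation theory
K_B = 8.617e-5                        # Boltzmann constant in eV per kelvin
print(f"exact splitting    : {gap:.4f} eV  ({gap / K_B:.0f} K)")
print(f"second-order 4b^2/U: {estimate:.4f} eV")
```

The singlet lies below the triplet, so the coupling is antiferromagnetic, and for b much smaller than U the exact splitting approaches the perturbative value. With these inputs the splitting falls near the upper end of the 1/50-1/10 electron volt range quoted above.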

This interaction operating on a state with n' completely unoccupied vanishes unless σ' = σ and unless σ''' = σ'' (that is, J operates only on the transferred electron n', […]). […] facts, we get

$$\Delta E_3 = \sum \frac{|b^{nn'}_{R-R'}|^2\, J_{n'n''}}{U^2} \times [\text{product of four quasi-particle operators on } n, R \text{ and } n'', R', \text{ illegible in this copy}]. \tag{23}$$

This is then like a positive, ferromagnetic exchange between n, R and n'', R' [we could again show that this is equivalent to $S_n(R)\cdot S_{n''}(R')$]. This is the "less than half full" interaction of previous work²; we see it is less by the order of $J_{n'n''}/U$ or about a factor 5-10 than the antiferromagnetic effect, so will only appear when that is absent and under conditions where b is large or U small.

The next effect is the true direct exchange, which comes by taking the Coulomb repulsion of the overlap charge:

$$E_{\mathrm{ex}} = \tfrac{1}{2} \sum_{R,R',n,n'}\ \sum_{\sigma,\sigma'} J^{nn'}_{R-R'}\, s^*_{n'}(R',\sigma')\, s^*_n(R,\sigma)\, s_{n'}(R',\sigma)\, s_n(R,\sigma'), \tag{24}$$

where

$$J^{nn'}_{R-R'} = \int dr \int dr'\, f_n^*(r-R)\, f_{n'}(r-R')\, f^*_{n'}(r'-R')\, f_n(r'-R)\, e^2\, |r-r'|^{-1}. \tag{25}$$

This is, by an obvious general theorem, always positive (ferromagnetic). We can understand why it is usually less than superexchange by an order-of-magnitude argument. All of these interactions will be estimated by assuming that the overlap of neighboring quasi-particle wave functions really takes place entirely on the intervening diamagnetic ion. Suppose that the amount of overlap is $\epsilon$: i.e.,

$$f_n \cong \varphi_n - \epsilon\, \varphi(\mathrm{O}^{2-}); \tag{26}$$

Then each of the f's in (25) must be of order $\epsilon$ because this is the amplitude in the overlap region. Aside from this factor $\epsilon^4$, (25) is just an atomic exchange integral $J_{\mathrm{at}}$ on the O ion, of the order of 1/10 ry. The expression $b^{nn'}_{R-R'}$ contains $\epsilon^2$ and is, aside from that, a one-electron term value T, or energy of interaction between the O electron and the one-electron potential; as a general rule term values are ~1 ry, and are thus an order of magnitude bigger than exchange integrals. U itself is a repulsive Coulomb interaction between single electrons, which is usually intermediate in magnitude between term values and exchange integrals. Thus

$$b^2 U^{-1} \sim \epsilon^4 T^2 U^{-1} \gg \epsilon^4 J_{\mathrm{at}};$$

the ratio being slightly more than an order of magnitude. On the other hand, the ferromagnetic effect (23) is $b^2U^{-1}$ multiplied by a factor $J_{\mathrm{at}}U^{-1}$, and so may be expected to be at most only slightly larger than direct exchange; both are, however, ferromagnetic so they add. Actually (23) is just the direct exchange with the transferred part of the wave function, so (23) and (24) are two halves of the same effect, direct exchange with the "alternant orbital," nonorthogonally overlapping, function. We use, here and later, the fact that true ferromagnetic exchange integrals are always small. This is because they are the self-energy of an overlap charge which, because of orthogonality, must be as often positive as negative, and almost always has a node where the wave function itself is largest.

The last effect we take up here is the third in this group of interactions which are only an order or so smaller than true superexchange. Undoubtedly many higher-order interaction effects exist, coupling groups of three spins, etc.; but these will be clearly smaller than the smallest of the effects we consider. This third effect is essentially exchange between one spin and the spin polarization of its neighbor, and is the complete analog of the Ramsey-Kittel-Ruderman-Bloembergen¹² mechanism of indirect exchange interaction between nuclear spins. So far our interactions have really been electron-pair effects depending on the direct action of the exclusion principle: superexchange because the antiparallel spins can transfer over on to each other's ions, while the other two are direct exchange, in the first case of the transferred "superexchange" charge, in the second of the actual overlap of the wave functions (clearly effects of closely the same order of magnitude). In all three cases the sign is unequivocal.

The true indirect exchange interaction is probably best written down directly as an energy perturbation, starting from the one-electron approximation. In the Appendix we show that, in the approximation that the quasi-spins are fairly well localized, the effect of the part of the exchange which causes spin-polarization on a localized spin is

[Equation (27), illegible in this copy: the action of V on $c_n^{d*}(R,\sigma)$, a sum of terms $J_n(e,f;K,k')\exp(iK\cdot R)$ multiplied by one-electron operators.]

where

[Equation (28), illegible in this copy: $J_n(e,f;K,k')$ is a double integral over r and r' of $e^2|r-r'|^{-1}$ with $f_n^*(r')\,f_n(r)$ and one-electron functions of the bands e and f.]

The second-order perturbation comes from applying (27) twice, the second time returning e, k+K and f, k' to their proper places (thus maintaining the identity of […]

[Equation (29), illegible in this copy: a sum over n, R, n', R', σ, σ' of products of localized d operators, with coefficient $\sum_{k,k',e,f} \exp[i(k-k')\cdot(R-R')]\, |J_n(e,f;k-k',k')|^2 / [\epsilon_e(k) - \epsilon_f(k')]$.]

CrF 3 , 16 but two orders of magnitude weaker than the usual paramagnetic shift in the iron group, just as we would expect from our estimates. On the other hand, the coefficient in (29) depends in a complicated way on relative phases of wave functions, etc., and actually has been shown to be ferromagnetic in some metal and semiconductor cases.11 Thus we do not feel that the sign is at all predictable without a more detailed discussion and a clear knowledge of the states e and / . This is the last interaction we shall discuss. There are obviously many higher terms, but they will all be smaller by one or more orders of magnitude; while there will be essentially no cases in which all four of these interactions are particularly small. That is, superexchange dominates in fairly concentrated or fairly covalent cases except when transfer occurs into empty shells, in which case direct exchange dominates. Dilute or noncovalent cases will always be dominated by the lowest order indirect exchange. 17

The sum over k and k' in (29) is rather hard to estimate, but we can get a rough idea of it in the following way. Except for the rather small variation of J with k and k', the sum will vanish unless both e„(k) and e/(k) are functions of k. This reflects the fact that both the electron and the hole must actually travel from R to R' in order to recombine there. Thus, taking into account that the electron and hole are probably in fairly deep, narrow bands, we expand e.(k) = Ee+Z

x

exp(jk-3i)|9x+ •••, (30)

e/(k)

= £ / + E exp(ik-3i)|8x'+- • •, x

and expanding the denominator in (29) under the assumption that /Sx and /3x' are small, we get |/„(«/) |2|8R-R'|8R-R''

E2

, E.-E,

(£.-£,)»

IV. BRIEF DISCUSSION OF THE COMPARISON WITH EXPERIMENT

where J„(ef) is an exchange integral with the Wannier functions of bands e and / : / „ = fdrdr' J

8

ANDERSON

/„*(,)/.(,)—!_ff*(r')fjt'). I r— r |

(31)

/ / we know, since the only core bands which could propagate effectively are the s and p bands on the anions. / , is probably some combination involving s or p functions on the cation; so / „ will be of order t (for the overlap with / / ) times an atomic exchange integral. The number /Sxis probably not too small, ~ \ (Ee—Ej), since the e band may be fairly wide; but j3x' may well be as little as 1/10 to 1/100 of Ee—Ef, which is of the order 20 electron volts or so. An estimate is 0\/ (E„—Ef) ~ t . Thus we get a result

or slightly smaller than direct exchange. Note, however, that this definitely falls off more slowly with t, so that very distant ions or very inner shells will tend to retain this mechanism rather than true superexchange. It is not clear what the sign of this mechanism will be. Physically, we expect the spin-polarization to draw parallel electrons toward the d shell (being the result of true exchange), and so to tend to take the parallel electrons from sp shells of the anions and put them in higher shells of the cation. This leaves antiparallel spin-polarization near the neighbor cation, so we expect an antiferromagnetic sign. Note also that the nuclear magnetic resonance shift at the anion will be diamagnetic. Such shifts are indeed observed in Gd(H2O)6+++ and in

The two parameters which enter the superexchange energy are the repulsive energy U and the transfer integral b. One could get at U either by knowing the observed activation energy for conduction in the d-band, and adding to that the ionic polarization energy; or by starting from U for a free ion and subtracting the nearest-neighbor repulsion and other corrections. Both of the quantities in the first method are hard to estimate, and in particular experimental values of the d-band energy are few and questionable. As a guess by the first method, the activation energy is 2-3 electron volts, and the ionic polarization energy18

$$\left(\frac{e^2}{K_{el}\,r_0} - \frac{e^2}{K_{tot}\,r_0}\right),$$

where K is the dielectric constant, and where r_0 is an effective radius of the ion itself (say of order 1-1.5 Å). Taking a value of K_tot of 5, this is about 3-4 electron volts, giving an estimate of 6 electron volts, with no particular variation from cation to cation. More reliable perhaps, especially for comparisons, is to take the values of U for free ions from tables,19 reduce by a 4-ev nearest-neighbor correction and by perhaps 4 ev for electronic polarization, take off another 10%

16 R. G. Shulman (to be published); Knox, Shulman, and Waite (to be published).
17 This clear separation suggests a nomenclature which we adopt here and suggest for the future: namely, the "alternant orbital" mechanism = superexchange, the next two = direct exchange, and the perturbation term = indirect exchange (since it is the term analogous to nuclear spin indirect exchange).
18 Rittner, Hutner, and Dupre, J. Chem. Phys. 17, 198, 204 (1949) discuss the polarization energy in detail, but more exact values than the above "cavity" estimate hardly seem justified.
19 Charlotte E. Moore, Atomic Energy Levels, National Bureau of Standards, Circular No. 467 (U. S. Government Printing Office, Washington, D. C., 1949 and 1952).


TABLE I. U-value estimates, in electron volts.

Ion      Third ionization   Second ionization   Free-ion U   Corrections   U
         potential          potential
Mn++     33.7               13.8                19.9         10.0          9.9
Fe++     30.6               16.0                14.6         9.5           5.1
Co++     33.5               17.0                16.4         9.6           5.8
Ni++     35.2               18.2                17.0         9.7           6.3
Cu++     36.8               20.3                16.5         9.6           5.9
Fe+++    ~53.5              30.6                ~23          ~13           ~10

TABLE II. Transfer integrals.

for the covalency correction, and arrive at the values in Table I. Fe+++ is only roughly estimated; also a 20% covalency correction for this is used. It is striking that only Mn++ and Fe+++ show any appreciable variation from constancy; a value of 6 ev, with perhaps 9-10 for the half-filled shells, would make good sense. There is no reason to expect the identity of the anion to change these numbers much. We may guess that the numbers in Table I are not in error by much more than one electron volt, i.e., ~±20%. This accuracy comes from the fact that between the experimental value of around 2-3 ev,20 and the ionic value (less 6 ev for the proximity correction and 10% for covalency) the difference is accounted for entirely by electronic and ionic polarization energy of the surroundings. We cannot change the distribution between these by very much (say at most an electron volt, or 50%); hence our 1-ev confidence in Table I.

In estimating b we are helped by a very fortunate circumstance: that b is closely related to the crystal field parameter Dq. 10Dq is the difference in energy of a t from an e orbital, and is in general made up of two contributions7: a "covalency" contribution coming from the overlap effects, because the e orbitals point towards the octahedral ligands, the t's away; and a purely electrostatic repulsion. Thus Dq (aside from this electrostatic contribution) is a measure of the extent to which the d-function has spread onto the anions. One way to see how this "covalency" contribution is related to b is to look at the three-center problem of two d orbitals on opposite sides of an anion. b is essentially the difference in energy between the state in which the wave function has the same sign on each cation and that antisymmetric about the anion. The former, by symmetry, can contain no anion p-function, the latter no s-function. Thus this difference is the same as the effect of removing all p-function admixture from the wave function (assuming the s admixture relatively small). But that is also at least an important contribution to the crystal field.

A less rough estimate comes from perturbation theory. Starting with an e orbital at energy 0, a p orbital at energy −E, and a (now O-- to cation, not cation to cation) transfer integral b', the energy perturbation of the e orbital is +b'²/E. On the other hand, the effective transfer integral to an equivalent e orbital is also b'²/E.

20 F. J. Morin, Bell System Tech. J. 37, 1047 (1958).

Orbit on       Direction    Orbit on       Direction    b
first atom     to first     second atom    to second
e_{x²-y²}      x, y         e_{x²-y²}      x, y         (5/2)Dq
e_{x²-y²}      z            e_{x²-y²}      x, y, z      0
e_{x²-y²}      x, y         e_{z²}         x, y         (5/2√3)Dq
e_{x²-y²}      x, y         e_{z²}         z            (5/√3)Dq
e_{z²}         x, y         e_{z²}         x, y         (5/6)Dq
e_{z²}         x, y         e_{z²}         z            (5/3)Dq
e_{z²}         z            e_{z²}         z            (10/3)Dq

Under the assumption that the overlap of a t orbital is relatively negligible, this perturbation is the covalent contribution from this anion to 10Dq. To get the precise relationship, define e_{x²-y²} to be the orbital of x²−y² symmetry and e_{z²} that of 2z²−x²−y². The normalized relative amplitudes in the various directions are as follows:

               x           y           z
e_{x²-y²}      1/2         1/2         0
e_{z²}         1/(2√3)     1/(2√3)     1/√3

and 10Dq is made up of the sum of the contributions from all the anions, i.e., it is equal to the contribution which would result from unity wave function amplitude. Using this, we can find the transfer integrals b in terms of the covalent part of Dq for all the possible combinations of orbitals. This is shown in Table II. Thus we may work out the various exchange integrals given the covalent contribution to Dq, using the formula

$$J = 2b^2 U^{-1}. \tag{32}$$

From Table II it is easily verified that the total interaction in any direction between two half-filled e_g-shells (two electrons) is equivalent to exchange between two electrons with an exchange integral

$$J_{\mathrm{eff}}\,(\text{filled } e_g) = 2\left[(10/3)Dq\right]^2 U^{-1}. \tag{33}$$
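The entries of Table II and the sum rule behind Eq. (33) can be checked numerically. The sketch below is an illustration, not part of the original paper. It assumes, following the perturbation argument above, that the transfer integral between two e orbitals sharing an anion is the product of their normalized amplitudes toward that anion times 10Dq. The names `AMPLITUDE`, `transfer_integral` and `total_b_squared` are hypothetical.

```python
from math import sqrt

# Normalized amplitudes of the two e orbitals toward the ligands on each axis
# (magnitudes taken from the amplitude table in the text).
AMPLITUDE = {
    "x2-y2": {"x": 0.5, "y": 0.5, "z": 0.0},
    "z2": {"x": 1 / (2 * sqrt(3)), "y": 1 / (2 * sqrt(3)), "z": 1 / sqrt(3)},
}


def transfer_integral(orb1, dir1, orb2, dir2):
    """Transfer integral b in units of Dq (assumption: b = a1 * a2 * 10Dq)."""
    return 10.0 * AMPLITUDE[orb1][dir1] * AMPLITUDE[orb2][dir2]


def total_b_squared(direction):
    """Sum of b^2 (in units of Dq^2) over all pairs of e orbitals on two
    cations that face the shared anion along the same axis."""
    orbitals = list(AMPLITUDE)
    return sum(
        transfer_integral(o1, direction, o2, direction) ** 2
        for o1 in orbitals
        for o2 in orbitals
    )


if __name__ == "__main__":
    # A few Table II entries, in units of Dq.
    print(transfer_integral("x2-y2", "x", "x2-y2", "x"))  # the (5/2)Dq entry
    print(transfer_integral("z2", "z", "z2", "z"))        # the (10/3)Dq entry
    # The summed b^2 is the same along every axis, which is what Eq. (33) uses.
    for axis in "xyz":
        print(axis, total_b_squared(axis), (10 / 3) ** 2)
```

Under this assumption the summed b² comes out as [(10/3)Dq]² along each axis, so a single effective pair with b = (10/3)Dq reproduces the total e-shell interaction.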

Thus, in a substance with a shell of n d-electrons, the exchange integral for the total spin S = n/2 is

$$J = 2n^{-2}\left[(10/3)Dq\right]^2 U^{-1}. \tag{34}$$

The simplest substance for a comparison with experiment is NiO. Here we have just the half-filled shell contributing, with no effect of the t-electrons and therefore a minimum of nearest-neighbor interaction. According to Van Vleck,21 the Curie temperature is given in molecular field theory by

$$T_N = 2JzS(S+1)/3k. \tag{35}$$
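Equations (33) to (35) are easy to evaluate numerically. The sketch below is a rough check rather than the author's own computation. It takes Dq = 850 cm^-1 (one tenth of the Ni++ entry in Table IV) and U = 6.3 ev (Table I), and it assumes z = 6 neighbors, which is an inference from the quoted result rather than a number stated in the text. The function names and the rounded physical constants are choices made for this illustration.

```python
# Physical constants, rounded.
EV_PER_INVERSE_CM = 1.239842e-4   # energy of 1 cm^-1 in electron volts
BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV per kelvin


def j_eff_kelvin(dq_inverse_cm, u_ev):
    """Eq. (33): J_eff = 2 [(10/3) Dq]^2 / U, expressed as a temperature."""
    b_ev = (10.0 / 3.0) * dq_inverse_cm * EV_PER_INVERSE_CM
    return 2.0 * b_ev**2 / u_ev / BOLTZMANN_EV_PER_K


def neel_temperature(j_eff_k, n_electrons, z_neighbors):
    """Eqs. (34)-(35): J = J_eff / n^2 and T_N = 2 J z S(S+1) / 3 with S = n/2.

    Temperatures are in kelvin, so Boltzmann's constant is already absorbed.
    """
    spin = n_electrons / 2.0
    j_k = j_eff_k / n_electrons**2
    return 2.0 * j_k * z_neighbors * spin * (spin + 1.0) / 3.0


if __name__ == "__main__":
    j_eff = j_eff_kelvin(dq_inverse_cm=850.0, u_ev=6.3)
    t_n = neel_temperature(j_eff, n_electrons=2, z_neighbors=6)
    print(f"J_eff ~ {j_eff:.0f} K, J ~ {j_eff / 4:.0f} K, T_N ~ {t_n:.0f} K")
```

With these inputs the sketch gives values within a few per cent of the J_eff = 460°, J = 115° and T_N = 920° quoted for NiO. With z = 6 the molecular field formula reduces to T_N = J_eff (S+1)/S.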

The crystal field parameters for O-- octahedra are about the same as for H2O octahedra,20 and in both cases are probably mostly covalent.7 Thus we can use

21 J. H. Van Vleck, J. Chem. Phys. 9, 85 (1941).


TABLE III. Exchange integrals and comparison with experiment.

Cation   J_eff calc   T_N calc    T_N renorm.   T_N (oxide)   θ obs        T_N (M++F2)   θ (M++F2)   T_N (LaM+++O3)   T_N (M+++F3)
                                  to NiO        obs
Cu++     585-290°     1170-585°   (675-340°)    453? 220?
Ni++     460          920         (520)         520                        73.2          ~100
Co++     525          879         495           290(326)a     280(530)a    37            50
Fe++     620          930         525           185(194)a     ? (545)a     78.3          117
Mn++     240          340         190           117           610          66.5          97
Fe+++    750          1050                                                                           750              394
Co+++    2200                                                                                                         460
Mn+++                                                                                                100              43
Cr+++                                                                                                320              80

a See reference 24.

the H2O value, Dq = 850 cm^-1.

Inserting this and U from Table I into (34), we get J_eff = 460°, J = 115°, and from (35) with S = 1, (T_N)_NiO = 920°. The observed value is22 T_N = 520°, so that our calculation has come out 80% too high. The molecular field theory is known to give too high a value for the Néel temperature relative to J by the order of 20-30%; the theory behind the Curie-Weiss constant is more accurate but θ has not been measured for NiO. Thus we consider our error to be about 40-50%. The U value, as we have said, is probably fairly accurate. The b value, on the other hand, cannot be fixed with any exactness. The first and probably least serious error is the fact that we assume in comparing Dq and b that there is no overlap of the t (xy, yz, etc.) functions with the O--, i.e., we assume no π bonding. The magnitude of actual t-shell exchange integrals suggests that this is perhaps 1/3 to 1/2 of the σ contribution, so that b may be underestimated by 25% or so because of this. A source of error which should be about the same, and of opposite sign, is the neglect of the s-function contribution to both Dq and b. Since the s and pσ functions change relative signs from one side to the other of the anion, s will add to Dq and subtract from b. We might think of the pσ function as being slightly hybridized with s. Because of this compensation of these two errors, the main problem is the third one: the relative amounts of the electrostatic and covalent contributions to Dq. From Table III, which contains calculations similar to those we made on NiO for other iron group cations, we will see that the calculated T_N is closest for the trivalent ion Fe+++, next best (~ twice too high) for the divalent ions in oxides, and quite a bit too high for fluorides. Although fluorides have much the same Dq as oxides, they are generally expected to be much less covalent, while the triply-charged ion Fe+++ in oxides is the most covalent situation.
Thus the major uncertainty in our calculations is in this question, and we really must consider our J's as upper limits until some method of more accurate estimation of b is worked out. In reference 7 will be found a discussion of the relative importance of covalent and electrostatic contributions, while in reference 9 we find actual measurements of the electron transfer, unfortunately not of the energy. We do not expect, however, very much variation in these correction factors among doubly charged ions in O--, so we can predict to a fair extent the T_N's which would be observed for the series MnO to CuO, normalizing to the observed NiO value, and neglecting other complicating effects which certainly exist. In Table III we give a list of actually calculated J's and T_N's for these substances as well as for a Fe+++–O--–Fe+++ 180° bond, and of T_N's for the divalent oxides renormalized to make NiO fit. The J's in this table are the effective J's of (33) for a single pair of electrons equivalent to all the e interactions. Then the T_N's are obtained from this simply by the quantum correction S(S+1)/S² = (S+1)/S (the various factors, z, etc., dropping out). The Dq values we use are listed in Table IV, and are the most recent, those given by Holmes and McClure.23 Table III illustrates that while many qualitative features are explained by our theory, the quantitative picture is as yet not clear and complete. Let us take up the various aspects in order:

(A) Oxides M++O--. We have included the three cases of Cu++, Co++, and Fe++ not because they are expected to agree quantitatively with theory but to show the general order-of-magnitude correlation with theory, particularly of the Curie-Weiss constant θ. In Cu++ the question is one of how the e_{x²-y²} orbitals are aligned in the given crystal structure; depending on this, there is an uncertainty by a factor 2, as we indicate in the table. In Co++ and Fe++ the problem is the effect of orbital degeneracy, as well as (in Co++) the mixture of e and t occupancies. The problem of exchange in the presence of orbital degeneracy has been discussed by Kanamori,24 who showed that θ, and to a lesser extent T_N, are depressed in Co++ by orbital degeneracy. The actual

22 C. H. La Blanchetais, J. Phys. Radium 12, 765 (1951).
23 O. G. Holmes and D. S. McClure, J. Chem. Phys. 26, 1686 (1957).
24 J. Kanamori, Progr. Theoret. Phys. Kyoto 17, 177 (1957).


TABLE IV. Octahedral Dq values for iron group ions.

values he gives are shown in parentheses. It is clear that a serious discrepancy exists only in Fe++. However, it may be that there are some ferromagnetic components to the interactions in both Co++ and Fe++, as well as the effect of nearest-neighbor exchange in the latter which we now discuss. The reliable comparison is MnO vs NiO. This is not nearly as bad as it looks, because we expect25 the nearest-neighbor interactions, large because of the large number of t electrons and π bonds, to depress the Curie point by an unknown factor which can easily be as large as 2. What is striking is the large reduction of T_N in MnO, as compared to the increasing amount of nearest-neighbor interaction in the series NiO to MnO, as indicated by the increasing θ–T_c ratio.26

(B) Lanthanates LaM+++O3. These we include mainly to show the excellent agreement for Fe3+–O--–Fe3+. This is not unexpected because there is much less electrostatic correction to b here. Note also that in spite of the Dq's of Mn+++ and Cr+++ being very much larger even than Fe+++, the t interactions (π bonds) operative here are much weaker. This is our first encounter with the sets of rules suggested by Kanamori and Nagamiya,27 Wollan, Child, Koehler, and Wilkinson,28 and by Goodenough29 on semiempirical grounds. These rules are precisely equivalent to the predictions one would make from the present ideas about superexchange (or actually, even from the primitive notions of reference 2). The rules are (together with the explanation by these ideas):
(a) When filled e orbitals overlap a given anion, the result is strong antiferromagnetism (ordinary superexchange between e(σ) orbitals).
(b) When a filled e orbital and an empty one overlap opposite ends of an anion, the result is ferromagnetism (direct exchange between the e orbital and the t shell).
(c) When empty e's overlap an anion, the result is weaker antiferromagnetism (π-bond superexchange of the t shells).
In rules (b) and (c) the weakness of the interaction is only relative, because normally these involve trivalent cations with large, covalent Dq's where the comparable e-shell superexchanges involve divalent ions.

(C) Difluorides. These repeat the general pattern of the NaCl structure oxides, except that there is no θ–T_c difference, and that because of the angle of the bond we expect π–σ interaction (i.e., t–e) as well as e–e and t–t. Again, Co++ is small, presumably due to orbital effects, and Fe++ is not as large as expected. Mn++ does

25 F. Stern, Phys. Rev. 94, 1412 (1954).
26 P. W. Anderson, Phys. Rev. 79, 705 (1950).
27 J. Kanamori, J. Phys. Chem. Solids (to be published); T. Nagamiya, "State of atoms in magnetic crystals," Conference of Welch Foundation, December, 1958.
28 Wollan, Child, Koehler, and Wilkinson, Phys. Rev. 112, 1132 (1958).
29 J. B. Goodenough, J. Phys. Chem. Solids 6, 287 (1958).

No. of d     Ion      10Dq (cm^-1)
electrons
1            Ti+++    20 300
2            V+++     18 000
3            Cr+++    17 000
3            V++      11 800
4            Cr++     13 900
4            Mn+++    21 000
5            Mn++     7800
5            Fe+++    13 800
6            Fe++     10 000
6            Co+++    ~18 000
7            Co++     9800
8            Ni++     8500
9            Cu++     12 500

not drop severely, probably because of the t–e interaction. All are much smaller than the oxides, reflecting most probably a much smaller covalency in spite of the Dq's being comparable to those in the oxides.

(D) Trifluorides. Again we find the sharp drop below the half-filled shells. It is clear, from the fact that the Mn++–Fe+++ ratio is not greater than in oxides (the trifluorides have 180° bonds), that the angular factor does not play a very large part in the difluorides. The Co3+ case is most interesting because it shows that the special properties of Fe3+ are indeed due to its being the only usual trivalent ion with a greater than half-filled shell. We predict actually an even larger value for CoF3; perhaps there are some complications with low-spin states as in the lanthanates of Co3+.29

V. CONCLUSION

The main purpose of this paper has been to show from an entirely theoretical point of view how the main motivation of superexchange must be the desire of the electrons to set their spins antiparallel so that they may spread out into not quite orthogonal, overlapping orbitals. In doing this we find that the exchange effect is expressed in terms of two parameters: the repulsive energy of coincident d-electrons, U, which besides being an empirical parameter (

reached. It shall again be emphasized that more accurate computations, and a wider comparison with experiment, are possible and desirable.

Let us call the integral in the above expression

ACKNOWLEDGMENTS

I have been helped by discussions with C. T. Ballhausen, C. Kittel, F. J. Morin, L. E. Orgel, T. Moriya, and J. C. Phillips. R. G. Shulman has helped immeasurably at several stages of this work. This work was begun during a stay at the University of California, made possible by a grant from the U. S. Carbon Corporation.

APPENDIX

In this Appendix the spin-polarization part of the many-electron wave function of a single spin will be derived to the lowest order of perturbation theory, and transformed to localized wave functions. We also demonstrate the simplifications appropriate when localization is a good approximation. By starting from an optimized set of one-electron wave functions
/• I drdt'- • • = « ( k " ' + k " - k ' - k ) / k ( e , / ; k ' , k " ) ; ^ ., ., . , , , .. »,T, • , . ~s • then the Fperturbed function SW in v(A2) is ' M>__ y^ j ie f-k'h") k',k",»',«,/

(A4)

X(ek+k'-k""+ek"' J — tv1— «k li ) _1 Xck",,' Ckv'Ck+k'-k",/ ^o-

(A5)

This is the origin of the spin polarization. It consists of virtually putting the d-electron with unchanged spin into an empty band e while replacing it with a spin-independent excitation from band f into band d. Now, if we introduce Wannier functions

ck,f
i W =I E


(K,o-) = A' * 2_ e~'

ck„ ,

the localized quasi-particle is approximately

/

+ E Ck„Vk<(r)].

^*(R,
(3)

E

e

MeJ;

k',k")

kk'k"e/ff'

X[ek+k'-k"' i +*k"
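We remember that f denotes normally full bands, e normally empty ones and d the d-band. To get the interaction energy between electrons we must insert this into the Coulomb repulsion (7):

$$V = \tfrac{1}{2}\sum_{\sigma,\sigma'}\int\!\!\int d\mathbf{r}\,d\mathbf{r}'\;\psi_\sigma^*(\mathbf{r})\,\psi_{\sigma'}^*(\mathbf{r}')\;e^2\,|\mathbf{r}-\mathbf{r}'|^{-1}\,\psi_{\sigma'}(\mathbf{r}')\,\psi_\sigma(\mathbf{r}).$$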

X.(etk'RCk",„>d )ck',»'/Ck+k'-k", / •

This turns out to contain various localized cd(R',cr') functions with R' near R, so is quite complicated. If it is at all a good approximation to use localized particles, we may simplify (A6) very much by using

(?)

Take as unperturbed state

/a\

*=6k,^„=ek,^n/*.-'*]*™..

(A6)

(AD

(b)

6

<J_e * ^ g MeJ.k,r)

The perturbed state is --N-

E

Cdtdt1!

R',R" J

where S*=ZE0-H0yw*.

(A2)

Xexpp(k"-R"-k-R')]

The part of this which leads to spin polarization is that in which electron r' goes from the d-band to an empty one, while electron r goes from a full band into the d-band; for this we take the c_f component of ψ_σ, the c_d of ψ_σ*, c_e of ψ_σ'*, and c_d of ψ_σ'. This means that the

v/^Vr—R"1 Wr'—R'l

./•,*Cr'1

, +

/ / ( , r1

~

r

~A7"-1 E I didr' e2\ r— r' | - 1 R ' J

relevant part of V is F->

1

X

E Ck-v-^k-^Ck/ck'.^ «,/,^',k',/",k"'

expB(k"-k).R']r(r-R0/V-R')

Xw+i-k"''Mw'(r)

x J W e ' j r - r r V ^ V )

- W ; k " - W , /

X^k"'**(r)^k''(r)¥>k' (r')-

(A3)

(AT)

which is not a function of k except through the combination k''−k. This allows us to sum (A6) first over k, keeping k''−k fixed, which leads to

«*=

Z

J(e,f; k"-k,k>)

k',k"— k,
X (e k . + k _ k -«- e*.')-1 e x p p ( k " - k ) • R] Xck,+(k_k")./V,
(A8)


This is the expression used in the text, which is entirely analogous to the polarization accompanying nuclear indirect exchange. Clearly there are also both the nonlocalized parts and terms in which, instead of cd'(t,
Theory of Dirty Superconductors J. Phys. Chem. Solids 11, 26-30 (1959) (appeared 1960)

This introduces the importance of time reversal to superconductivity and the scattered state representation which was used so ably by Tsuneto and by the de Gennes group.* The names "Dirty Superconductors" and "Dirty Limit" are used extensively to this day. The ideas were developed during a summer of enjoying Charlie Kittel's hospitality at Berkeley in 1958, and actually written up before (9), which was in winter '58; but the publication was delayed by problems at Pergamon Press. To me this paper and its implications were the conclusive experimental proof of the essential correctness of BCS, because superconductivity's insensitivity to most impurities is such a striking, universal fact, while its extreme sensitivity to magnetic impurities is equally striking. Many people seem not to understand the great logical force of such an argument. A note on chronology: my paper on localization, though published in '58, was completed in summer '57. Late summer '57 through spring '58 was occupied primarily in working out the "Goldstone" and "Higgs" consequences of broken gauge invariance and hence in sorting out gauge invariance in the BCS theory. Paper (10), then, was my next logical step in superconductivity. That summer I also heard Orgel's lecture on transition metal complexes, which were an essential step in the superexchange work which I did over the winter '58—'59.

My version is the answer to some questions first posed by Hal (H. W.) Lewis, who was at Bell en route from IAS to Santa Barbara.


J. Phys. Chem. Solids

Pergamon Press 1959. Vol. 11. Nos. 1/2 pp. 26-30.

THEORY OF DIRTY SUPERCONDUCTORS P. W. ANDERSON Bell Telephone Laboratories, Murray Hill, New Jersey (Received 3 March 1959)

Abstract: A B.C.S. type of theory (see BARDEEN, COOPER and SCHRIEFFER, Phys. Rev. 108, 1175 (1957)) is sketched for very dirty superconductors, where elastic scattering from physical and chemical impurities is large compared with the energy gap. This theory is based on pairing each one-electron state with its exact time reverse, a generalization of the k up, −k down pairing of the B.C.S. theory which is independent of such scattering. Such a theory has many qualitative and a few quantitative points of agreement with experiment, in particular with specific-heat data, energy-gap measurements, and transition-temperature versus impurity curves. Other types of pairing which have been suggested are not compatible with the existence of dirty superconductors.

ONE of the most striking experimental facts about superconductivity is that it is often insensitive to enormous amounts of physical and chemical impurities. For one example, several substances in essentially an amorphous state have been shown to be superconductors, such as bismuth and beryllium films laid down at liquid-helium temperatures.(1) As another example, there are disordered alloy systems with 20-50 per cent of chemical scattering centers, but with transition temperatures comparable with those of pure elements.(2) These quantities of crystal imperfections are large enough to scatter the electrons at an extremely rapid rate. In fact, if we were to take the mean free time before scattering for the electron as a measure of the electrons' uncertainty in energy, that uncertainty in energy is large compared not only with the energy gap ε0, but with the Debye energy ħω_D. Plane-wave states for the electrons definitely have this very large degree of energy uncertainty.

On the other hand, the experiments of SERIN et al.(3) have shown that, starting with a pure single crystal of a superconducting material, there is usually a rather sharp initial drop in the superconducting transition temperature as the first small percentage of chemical imperfection is added. They show that this initial drop is proportional to the extra resistivity caused by these imperfections, and therefore proportional to the amount of scattering. If the impurities which are introduced are magnetic ions rather than ordinary chemical impurities, MATTHIAS et al.(4) have shown that this initial sharp drop continues, and superconductivity is very soon destroyed. On the other hand, for ordinary impurities the sharp drop stops rather soon and is replaced by a more gradual behavior, which seems to be determined primarily by the fact that the impurity adds or subtracts electrons from the band, changes the density of states, and in various ways gradually varies the parameters of the free electrons.

Thus we may divide superconductivity into two regions: (1) the region of relatively pure superconductors where scattering has a rather sharp effect on superconducting transition temperatures; and (2) the region of very imperfect superconductors, where additional scattering has very little effect. It is the purpose of the present paper to give a theory of this region of the "dirty" superconductor. The fundamental assumption we will make is that in this region the problem of the electron wave functions is best solved by first diagonalizing the scattering interaction between the electrons and the impurities, and then calculating the phonon interactions between electrons. Finally, one calculates from this the superconducting properties. That is, we find a new set of one-electron wave functions for the electrons, and then solve the problem of the interactions of the electrons in terms of these, rather than in terms of ordinary

plane-wave functions, such as are appropriate in the region of the pure superconductor. All of this, of course, assumes that the scattering is perfectly elastic, as it is from any form of chemical or physical imperfection. (At the low temperatures where superconductivity is important, inelastic phonon scattering does not play an important role.) So what we do is simply to assume that somebody has solved for us the extremely difficult problem of the wave functions of the electrons in the presence of the scatterers, and write down the resulting wave function ψ_nσ:

$$\psi_{n\sigma} = \sum_k (n|k)\,\phi_{k\sigma}, \tag{1}$$

where φ_kσ are the Bloch waves and (n|k) the unitary transformation solving the scattering problem. (We give ψ_n a spin index σ; this may not be valid in heavy elements because of spin-orbit coupling, but the theory is still valid there.)

The basic observation which we make is that if ψ_nσ is such an exact one-electron wave function in the presence of scatterers, and if the scatterers are nonmagnetic, the time-reversed wave function, (ψ_nσ)*, is also an exact wave function of the one-electron Hamiltonian. What is more, (ψ_nσ)* has the same energy as ψ_nσ itself. We shall call this energy E_n. The time-reversed wave function (ψ_nσ)* is:

$$(\psi_{n\sigma})^* = \sum_k (n|k)^*\,\phi_{-k,-\sigma}. \tag{2}$$

Now, having the correct one-electron wave functions, we proceed to derive the phonon interaction between these electrons. It is not at all correct simply to transform the B.C.S. interaction(5) directly into the interaction between these new wave functions ψ_nσ; the reason for this is that ψ_nσ contains plane wave functions which have quite different energies, and in particular, in the presence of strong scattering contains wave functions from outside of the region where the B.C.S. electron interaction is attractive. Therefore, this method would lead us to the conclusion that the interaction between electrons is altered rather seriously. Actually, it is correct instead to write down the interaction between these new scattered-electron wave functions and the phonons, and then to do the second-order perturbation theory which gives us the interaction between the electrons caused by the phonons. When the calculation is done this way, the energy denominators in the second-order perturbation theory contain the energies not of the initial plane-wave states, E_k, but rather the energies E_n of the scattered one-electron states. Whether or not the interaction is attractive depends primarily on what these energy denominators are. Therefore, we find that the attractiveness or not of the interaction is now a function not of E_k, the energy of the plane-wave states, but of E_n, the energy of the scattered-electron states. Without going into excessive detail, we simply write down the part of the interaction which corresponds to the B.C.S. truncated Hamiltonian:(5)

$$\mathcal{H}_{\mathrm{red}} = -\sum_{n,n'} V_{nn'}\,c_n^*\,c_{-n}^*\,c_{-n'}\,c_{n'}, \qquad V_{nn'} = \sum_{k,k'} |(n|k)|^2\,|(n'|k')|^2\,\frac{|M_{k-k'}|^2}{\hbar\omega_{k-k'}}. \tag{3}$$

Here c_n* and c_{-n}* are creation and destruction operators for electrons in state ψ_nσ and (ψ_nσ)*, M_{k-k'} and ħω_{k-k'} have the usual meaning, and (n|k) is defined in equation (1). This interaction is summed over all the plane wave functions which are contained in the scattered function n, with a coefficient which is given by the square of the amount of the state contained in the state n. Normalization requires the following:

$$\sum_k |(n|k)|^2 = 1. \tag{4}$$
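The role of the normalization (4) can be illustrated numerically. The sketch below is a toy model under loud assumptions, not a solution of any scattering problem: the transformation (n|k) is replaced by an arbitrary real orthogonal matrix built from random plane rotations, and the plane-wave interaction `v_plane` is a hypothetical matrix. It evaluates the weighted average V_nn' = Σ |(n|k)|² |(n'|k')|² V_kk'. For a constant V_kk' the average returns the same constant, and otherwise it lies between the smallest and largest V_kk'. All function names are invented for this illustration.

```python
import math
import random


def random_orthogonal(size, n_rotations=200, seed=0):
    """Build an orthogonal matrix as a product of random plane rotations."""
    rng = random.Random(seed)
    u = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n_rotations):
        i, j = rng.sample(range(size), 2)
        angle = rng.uniform(0.0, 2.0 * math.pi)
        c, s = math.cos(angle), math.sin(angle)
        # Rotate columns i and j; every row keeps unit length.
        for row in u:
            row[i], row[j] = c * row[i] - s * row[j], s * row[i] + c * row[j]
    return u


def averaged_interaction(u, v_plane):
    """V_nn' = sum over k, k' of |(n|k)|^2 |(n'|k')|^2 V_kk'."""
    size = len(u)
    weights = [[amp * amp for amp in row] for row in u]
    return [
        [
            sum(
                weights[n][k] * weights[m][kp] * v_plane[k][kp]
                for k in range(size)
                for kp in range(size)
            )
            for m in range(size)
        ]
        for n in range(size)
    ]


if __name__ == "__main__":
    size = 8
    u = random_orthogonal(size)
    # Eq. (4): each scattered state is normalized over the plane waves.
    print([round(sum(a * a for a in row), 12) for row in u])
    constant = [[1.0] * size for _ in range(size)]
    v = averaged_interaction(u, constant)
    # Largest deviation from the constant value; zero up to rounding error.
    print(max(abs(x - 1.0) for row in v for x in row))
```

A real orthogonal matrix is used only because it is the simplest unitary transformation to generate without external libraries; only the weights |(n|k)|² enter the average.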

Because of equation (4), if the parameters entering in the interaction were constants, as was assumed by B.C.S., the interaction would be exactly the same for the scattered state as it was for the plane-wave state. As it is, the interaction is not a constant, but at best only roughly so, and expression (3) picks out of the total interaction only the constant part. That is, the interaction (3) is strictly the average interaction over all the states going to make up the scattered state n. Since the states which make up this scattered state are, at least under conditions of strong scattering, taken more or less randomly from all the regions of the Fermi surface, we conclude that the interaction will be (aside


from a smooth energy dependence) a constant, averaged over the entire Fermi surface. The rest of the interaction, the part which was removed in the truncated Hamiltonian of B.C.S., is changed in a very radical way. In the pure substance the rest of the interaction can be thought of as an interaction between pairs which do not have exactly zero total momentum but rather some finite momentum. However, under conditions of strong scattering this part of the interaction does not take this form at all; each individual matrix element is smaller by a number of the order of the total number of electrons. This is because the momentum selection rule no longer holds, so that there is no reason why any individual matrix element should vanish. This increases the total number of matrix elements and must therefore decrease their magnitude. Thus, in the scattered state the B.C.S. part of the interaction, or rather the transformed B.C.S. part, plays a much more obviously unique role than it does in the pure superconductor. One final comment about this interaction: obviously if the scattering is strong enough the procedure which we have followed, of first diagonalizing the one-electron Hamiltonian including scattering, and then introducing the electron-phonon interaction and the interaction between the electrons which results from it, is correct. When we ask the question: at what degree of scattering is this no longer the correct procedure ? the first guess would be that one should find some average amount of electron-phonon interaction and when the scattering becomes less than that the procedure is no longer correct. However, it is fairly easy to convince oneself that this is not the correct way, but that the diagonalization of the one-electron part of the Hamiltonian is correct at very much smaller amounts of scattering. 
As a matter of fact, the procedure we have followed seems to be correct until the actual interaction between the electrons caused by the electron-phonon interaction begins to come into play. That is, it is correct until $\hbar/\tau$ becomes comparable with the energy gap, which is a reasonable measure of the electron-electron interaction. This is, in fact, the experimental criterion for the transition from the region of strong scattering, as here defined, to the region of weak scattering.(3) Now we shall draw some physical conclusions

from these ideas. First we should observe that it is possible to solve the B.C.S. integral equation and derive a theory of superconductivity just as well in the new situation, with the new averaged interaction and the scattered wave functions $\psi_n$, as it was in the old situation with the old interaction and the plane-wave functions $\psi_k$. A general result, which is fairly easy to prove, is that the energy gap, and therefore the transition temperature, will always be slightly smaller for the scattered states than they would be in the pure case, essentially because the average taken in the scattered case is not as favorable as one gets in the pure case. This explains why it is that in the pure superconductor region the transition temperature drops so radically, and yet stops dropping after one gets into the region of strong scattering; and at the same time it explains why this drop is relatively small, because one does not expect the difference between the two energy gaps to be very large. Thus, we see that these ideas explain fairly satisfactorily the general features of SERIN'S results. It is also clear that when the time-reversal transformation cannot be made, that is, when the energy of the state $\psi_n$ is not the same as the energy of $\psi_{-n}$, all this cannot be done, and the transition temperature will continue to drop as the degree of magnetic scattering increases.

An interesting question is what size of particles and at what degree of scattering will superconductivity actually cease. The first point is that, as long as the particle size remains fairly large, no quantity of scattering which leaves the substance a metal would seem to be capable of actually destroying superconductivity, because the average which is taken over the Fermi surface does not depend in any important way on the actual amount of scattering.
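The crossover criterion stated above can be made concrete with a short numerical sketch. It assumes that the criterion compares the scattering energy $\hbar/\tau$ (with $\tau$ a scattering time) with the energy gap. The gap of 1 meV and the Fermi velocity of $10^6$ m/s are illustrative assumptions, not values taken from the paper.

```python
# Sketch: scattering time at which hbar/tau equals the energy gap.
# The gap (1 meV) and Fermi velocity (1e6 m/s) are illustrative assumptions.

HBAR_EV_S = 6.582119569e-16  # reduced Planck constant in eV*s


def crossover_scattering_time(gap_ev: float) -> float:
    """Return tau (seconds) for which hbar/tau equals the gap (in eV)."""
    return HBAR_EV_S / gap_ev


def mean_free_path(tau_s: float, fermi_velocity_m_s: float) -> float:
    """Return the mean free path (metres) for a given scattering time."""
    return fermi_velocity_m_s * tau_s


if __name__ == "__main__":
    tau = crossover_scattering_time(1.0e-3)
    print(f"crossover scattering time ~ {tau:.2e} s")
    print(f"corresponding mean free path ~ {mean_free_path(tau, 1.0e6):.2e} m")
```

Under these assumptions, scattering times shorter than the printed value would place a sample in the strong-scattering ("dirty") region as defined in the text.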
On the other hand, on reducing the particle size, we will begin to get to the point at which the scattered wave functions $\psi_n$ have energies $E_n$ which are separated by discrete energy gaps. That is, their energies must extend over something like the total Fermi energy, which is about 10 eV; and if there were only about a thousand electrons, that would mean that the energy differences between the states would be of the order of 0.01 eV. With such energy differences among the $E_n$, superconductivity would no longer be possible; in fact, it is fairly easily seen that the energy differences must be less than the energy gap. That means that


particles with fewer than about $10^4$ to $10^5$ electrons will begin to be affected. So far no experiments are reported on particles of this order of magnitude of size, although the particles used in REIF'S nuclear-resonance experiments are beginning to approach this point.(6) Another conclusion is that the B.C.S. theory in its original form, assuming a constant interaction, will be more nearly correct in the dirty superconductor region than it will be for pure superconductors; that is, in the impure superconductors the interaction is relatively a constant, and therefore the energy gap itself will be a constant and the thermal and other results of B.C.S. should be nearly exact. On the other hand, in pure single crystals of superconductors, one can very quickly show that the energy gap will be a strong function of the momentum vector on the Fermi surface, because most superconductors have fairly complicated Fermi surfaces, and it would be a miracle if the interactions were sufficiently constant to maintain a constant energy gap. There are two types of experiments which bear on this point. One type of experiment is the electronic specific heat and, in fact, various recent experiments(7) show deviations from the exponential specific-heat curve of the B.C.S. theory in the direction which would be expected if the energy gap were a function of position on the Fermi surface. Experiments on less perfect single crystals have shown fewer such deviations, as is to be expected. On the other hand, most of the direct experiments, in particular the experiments of TINKHAM and his co-workers(8) on the optical measurement of the energy gap and the experiments of HEBEL and SLICHTER(9) measuring the

density of states by the relaxation time in nuclear resonance, are necessarily undertaken in the dirty superconductor region. In the case of TINKHAM'S experiments, the measurement is necessarily made near a surface, while the other measurement is made on small particles. Therefore, neither of these types of experiments would have been expected to show any considerable anisotropy of the energy gap. The former do not; the structure observed in the latter experiments(10) is probably a collective excitation.(11) The question of the experimental investigation of the anisotropy is a fascinating one which remains open so far as I know. Our conclusion is that the recent experiments on the detailed investigation of the specific-heat curve must


be considered as being more of a confirmation of the B.C.S. theory than vice versa, because they show that the expected anisotropy of the energy gap is actually there.

In conclusion I should like to make a number of acknowledgements and apologies. In the first place various notions about time reversal and superconductivity have appeared independently in a number of places. Most particularly, ABRAHAMS and WEISS at Rutgers have made similar calculations, and BARDEEN and MATTIS(12) have used a related wave function. In the second place, the idea of the anisotropy of the energy gap occurred independently to COOPER and to PIPPARD and HEINE.(13) Thus, the purpose of the present paper is merely to summarize in a physically consistent way all of these ideas, and to show that there is good agreement qualitatively with experiments. I should also acknowledge interesting discussions with C. HERRING and H. SUHL.

A final comment is that the nucleus is itself a dirty superconductor in the sense that the nucleus is a very fine particle with only a very small number of Fermi particles in it. Thus, it is not surprising to find that the theory of nuclei contains a "pairing" concept for $+m$, $-m$ pairs(14) which shows very great similarity to the discussion in this paper. It is hard to see how any explanation other than time-reversal invariance will explain all these facts; in particular, the suggestions of inexact pairings or pairings of parallel-spin electrons advanced relative to the explanation of Knight shift results(15) certainly fail of agreement with the facts about dirty superconductors. It seems to the author that these considerations represent the strongest arguments that the theory of superconductivity must have some relation to zero-momentum, opposite-spin pairs.
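The particle-size argument made earlier in the paper, and the remark about the nucleus as a very fine particle, both rest on the same arithmetic: roughly N one-electron levels are spread over the Fermi energy, so the mean spacing is about the Fermi energy divided by N. The sketch below reproduces the 0.01 eV figure from the text and estimates the electron count at which the spacing equals the gap. The gap values used are illustrative assumptions, not numbers quoted in the paper.

```python
# Sketch of the level-spacing estimate: about N levels spread over the Fermi energy.
# The gap values below are illustrative assumptions, not values quoted in the paper.


def mean_level_spacing(fermi_energy_ev: float, n_electrons: float) -> float:
    """Mean spacing (eV) between one-electron levels in a small particle."""
    return fermi_energy_ev / n_electrons


def critical_electron_count(fermi_energy_ev: float, gap_ev: float) -> float:
    """Electron count at which the mean level spacing equals the energy gap."""
    return fermi_energy_ev / gap_ev


if __name__ == "__main__":
    # The text's example: 10 eV Fermi energy, about a thousand electrons.
    print("spacing for 1000 electrons:", mean_level_spacing(10.0, 1000), "eV")
    for gap in (1.0e-3, 1.0e-4):
        count = critical_electron_count(10.0, gap)
        print(f"gap {gap:g} eV -> spacing equals gap near N = {count:.0f}")
```

Particles with fewer electrons than the printed counts would have level spacings larger than the assumed gap, which is the regime in which the text expects superconductivity to be affected.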


REFERENCES

1. LAZAREV B. G., SUDOVTSEV A. I. and SMIRNOV A. P., Zh. Eksper. Teor. Fiz. 33, 1059 (1958); (translation) Sov. Phys. (J.E.T.P.) 6, 816 (1958).
2. MATTHIAS B. T., WOOD E. A., CORENZWIT E. and BALA V. B., J. Phys. Chem. Solids 1, 188 (1956).
3. LYNTON E. A., SERIN B. and ZUCKER M., J. Phys. Chem. Solids 3, 165 (1957).
4. MATTHIAS B. T., SUHL H. and CORENZWIT E., Phys. Rev. Letters 1, 92 (1958).
5. BARDEEN J., COOPER L. N. and SCHRIEFFER J. R., Phys. Rev. 108, 1175 (1957).
6. REIF F., Phys. Rev. 106, 208 (1957).
7. PHILLIPS N. E., Phys. Rev. Letters 1, 363 (1958). BOORSE H. A., Phys. Rev. Letters 2, 391 (1959).
8. RICHARDS P. L. and TINKHAM M., Phys. Rev. Letters 1, 318 (1958).
9. HEBEL L. C. and SLICHTER C. P., Phys. Rev. 107, 901 (1957).
10. TINKHAM M., RICHARDS P. L. and GINSBURG D., private communication.
11. ANDERSON P. W., Phys. Rev. 112, 1900 (1958).
12. BARDEEN J. and MATTIS D. C., Phys. Rev. 111, 412 (1958).
13. PIPPARD A. B. and HEINE V., Conference on Properties of Metals at Low Temperature, Geneva, N.Y. (1958).
14. BOHR A. and MOTTELSON B. R., private communication; M. G. MAYER, Phys. Rev. 78, 22 (1950).
15. HEINE V. and PIPPARD A. B., Phil. Mag. 3, 1046 (1958).


Calculation of the Superconducting State Parameters with Retarded Electron-Phonon Interaction (with P. Morel) Phys. Rev. 125, 1263-1271 (1962)

This paper mentions that there will be structure in the energy gap function associated with phonons: when I brought this out in a talk at Birmingham in late '61, R. Peierls asked how it might be measured. At most a few weeks later I heard from J.M. Rowell at Bell Labs, that it had been, in the tunneling characteristic. The next few years left BCS with a sound quantitative basis. The paper owed much to Schrieffer's discussions with me and with Pierre Morel, my student inherited from D. Pines; Pierre was simultaneously serving as science attache at the French embassy.


Reprinted from THE PHYSICAL REVIEW, Vol. 125, No. 4, 1263-1271, February 15, 1962 Printed in U. S. A.

Calculation of the Superconducting State Parameters with Retarded Electron-Phonon Interaction

P. MOREL
French Embassy, New York, New York

P. W. ANDERSON
Bell Telephone Laboratories, Murray Hill, New Jersey

(Received October 10, 1961)

The energy gap and other parameters of the superconducting state are calculated from the Bardeen-Cooper-Schrieffer theory in Gor'kov-Eliashberg form, using a realistic retarded electron-electron interaction via phonons and including the Coulomb repulsion. The solution is facilitated by observing that only the local phonon interaction, mediated entirely by short-wavelength phonons, is important, and that a good approximation for the phonon spectrum is therefore an Einstein model rather than Debye model. The resulting equation is solved by an approximate iteration procedure. The results are similar to earlier gap equations but the derivation gives a precise meaning to the interaction and cutoff parameters of earlier theories. The numerical results are in good order-of-magnitude agreement with the observed transition temperatures but lead to an isotope effect at least 15% less than the accepted $-\tfrac12$ exponent ($T_c$ proportional to $M^{-1/2}$). Also, the present theory predicts that all metals should be superconductors, although those not observed to do so would have remarkably low transition temperatures.

INTRODUCTION

Although the original Bardeen, Cooper, and Schrieffer (BCS) theory of superconductivity$^1$ has gone far in explaining the basic processes which account for the condensation of fermion systems, it must still be considered as a phenomenological theory with respect to the use which is made of a rather nonphysical "effective potential" to describe the complex Coulomb and phonon-induced interactions between the electrons in a metal. We may recall that the BCS effective potential is instantaneous and displays a strongly oscillating behavior in coordinate space (since its Fourier transform is sharply cut off in momentum space). Since the strength $V$ of this effective interaction appears only as an adjustable parameter, the BCS model is still adequate to describe most properties of superconductors; it is difficult, however, to see how this parameter $V$ could be related to the retarded, time-dependent electron-phonon interaction and the essentially instantaneous Coulomb repulsion or a combination thereof.

On the contrary, several physical reasons lead us to the conclusion that the coordinate space dependence of the interaction potential plays a very subsidiary role whereas its retardation in time is indeed of major importance in determining the various cutoff phenomena. Firstly, it appears that the quite long range part (in space) of the phonon-induced interaction results primarily from the emission and absorption of very long wavelength phonons. These phonons can enter either through "direct" processes, in which the dielectric screening of the metal is nearly complete, or through "umklapp" processes. These latter processes

have been supposed in the past$^{2,3}$ to play a major role, but this appears very much in doubt in view of the deformation potential theorem which requires that the effective potential acting to cause scattering by any phonon be, to first order, proportional to the strain produced by the phonon and not to the displacement. In this perspective, an umklapp process may be viewed as the excitation of a true phonon corresponding to an actual deformation of the lattice plus a stationary wave corresponding to a translation of the lattice as a whole. Since the electron wave functions follow adiabatically a translation of the lattice, only the actual deformation causes any scattering. Consequently, umklapp processes are no more effective than direct ones. The net result, then, is that the phonon-induced interaction is mediated primarily through short-wavelength phonons. It is well known experimentally and theoretically that the short-wavelength part of the phonon spectrum is rather sharply peaked about a few definite frequencies, so that it is a quite good approximation to think in terms of an Einstein model of a few groups of single-frequency, nonpropagating phonons. And of course, phonons which do not propagate in space lead directly to a localized interaction.

A second line of reasoning leading to the same conclusion stems from the particular form of the Gor'kov-Eliashberg energy gap equation.$^{4,5}$ In these works, the energy gap function in coordinate space

1 J. Bardeen, L. N. Cooper, and J. R. Schrieffer, Phys. Rev. 108, 1175 (1957); hereafter referred to as BCS.
2 D. Pines, Phys. Rev. 109, 280 (1958).
3 P. Morel, J. Phys. Chem. Solids 10, 277 (1959).
4 L. P. Gor'kov, J. Exptl. Theoret. Phys. (U.S.S.R.) 34, 735 (1958). [Translation: Soviet Phys. JETP 7, 505 (1958).] G. M. Eliashberg, J. Exptl. Theoret. Phys. (U.S.S.R.) 38, 966 (1960). [Translation: Soviet Phys. JETP 11, 696 (1960).]
5 P. W. Anderson, Proceedings of the Seventh International Conference on Low-Temperature Physics (University of Toronto Press, Toronto, 1960).


and time is

$$\Sigma(\mathbf{r}-\mathbf{r}',\,t-t') = F(\mathbf{r}-\mathbf{r}',\,t-t')\,V(\mathbf{r}-\mathbf{r}',\,t-t'),$$

where $V$ is the retarded potential caused by phonons and $F$ is the Gor'kov pair Green's function. $F$ has a long range in space and oscillates with a wavelength of the order of $k_0^{-1}$ ($k_0$ is the Fermi momentum), while the spatial dependence of $V$ is only related to the particular features of the phonon spectrum and may display an oscillating behavior with wavelength of the order of the inverse of the Debye momentum. Destructive interference makes the product of $F$ and $V$ small for all but quite small spatial distances $(\mathbf{r}-\mathbf{r}')$. Thus, we find once more that only the short-range part of the interaction potential is important.

It is therefore desirable that a treatment actually taking into account the time dependence of the electron-phonon interaction be developed, with the hope that the main features of the BCS model, and particularly the isotope effect, may be retrieved by this approach. Fortunately, a new and powerful analysis of the condensation phenomenon in terms of particle propagators has been derived by several authors,$^4$ thereby providing us with a suitable formalism for our purpose.

In the first section, we shall briefly set up our notation and write the Dyson equation for the problem. In Sec. II, we reduce this equation to a single integral equation after integrating over all spatial variables; we then solve this equation approximately by a suitable iteration process in the case of electron-phonon interaction only and within the frame of the Einstein model; we find that the corresponding gap equation is identical to the BCS equation. It has been known since the work of Bogoliubov$^6$ that the screened Coulomb interaction acts to a degree like a hard core, at least in the instantaneous interaction model previously used, and so is quite ineffective in weakening the phonon-induced attraction. In Sec. III, we find that this result does indeed hold in our time-dependent treatment; adding an essentially instantaneous repulsive potential to the retarded phonon-induced potential, we solve the integral gap equation to find identically Bogoliubov's result. All the above results are derived in the case where the phonon spectrum reduces to a single frequency. We investigate in Sec. IV the effect of the spreading of this spectrum about one central frequency and we find closely similar results (identical in the weak-coupling limit) thereby justifying a posteriori our use of the Einstein model. Finally (Sec. V), we derive the expression for the isotope effect and we compute the transition temperatures for nontransition metals. Unfortunately, as in the calculation of Swihart,$^7$ the exponent in the isotope effect is found to differ appreciably (10 to 20%) from the ideal value $\tfrac12$, more in fact than the quoted


errors of the present experimental data. On the other hand, we find a general order-of-magnitude agreement between the computed and measured transition temperatures better than that of Morel.$^3$

I. THE DYSON EQUATION

Following closely the notation of Gor'kov and Eliashberg,$^4$ we define the Green's function $G(x-x')$ describing the propagation of an ordinary single particle:

$$G(x-x') = i\langle T\,\psi_+(x)\,\psi_+^*(x')\rangle = i\langle T\,\psi_-(x)\,\psi_-^*(x')\rangle, \tag{1}$$

as well as the Green's function $F(x-x')$ describing the merging of two particles into the condensed phase or the inverse process (i.e., the creation or the destruction of a "ground pair"):

$$F(x-x') = i\,e^{i\mu(t+t')}\langle T\,\psi_+(x)\,\psi_-(x')\rangle = i\,e^{-i\mu(t+t')}\langle T\,\psi_-^*(x)\,\psi_+^*(x')\rangle, \tag{2}$$

where the last equality is obtained by time reversal (provided we use a representation in which the state vectors are invariant under this transformation). Here $x$ stands for the four-vector $\mathbf{x}, t$ and $T$ is the usual time ordering operator (see reference 4). The average in (1) is taken in the ground state $|\varphi_0(N)\rangle$ corresponding to a total number $N$ of particles; the averages in (2) are taken between the ground states $|\varphi_0(N)\rangle$ and $|\varphi_0(N\pm 2)\rangle$. Lumping together in $\epsilon_k$ the kinetic energy of a single particle and the mass correction due to the self-energy in the normal fluid, we may write the Dyson equations in momentum space:

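$$G(\mathbf{k},\omega) = \frac{\omega+\epsilon_k}{\omega^2-\epsilon_k^2-\Sigma^2(\mathbf{k},\omega)+i\delta}, \qquad F(\mathbf{k},\omega) = \frac{\Sigma(\mathbf{k},\omega)}{\omega^2-\epsilon_k^2-\Sigma^2(\mathbf{k},\omega)+i\delta}, \tag{3}$$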

where $\Sigma(\mathbf{k},\omega)$ is the self-energy corresponding to the processes in which two particles either merge into the condensed phase or emerge from it. This self-energy plays the part of the "energy gap" of BCS theory since the energy spectrum of the individual excitations of the condensed fluid is now given, in the low-energy limit, by

$$E_k = [\epsilon_k^2 + \Sigma^2(\mathbf{k},0)]^{1/2}. \tag{4}$$
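Reading Eq. (4) as $E_k = [\epsilon_k^2 + \Sigma^2(\mathbf{k},0)]^{1/2}$, a minimal sketch shows why the self-energy acts as an energy gap: the excitation energy never falls below $|\Sigma|$, and the minimum sits at the Fermi surface ($\epsilon_k = 0$). The function name and the sample values are illustrative assumptions; energies are in arbitrary units.

```python
import math


def excitation_energy(eps_k: float, gap: float) -> float:
    """Excitation energy sqrt(eps_k**2 + gap**2), with gap = Sigma(k, 0)."""
    return math.hypot(eps_k, gap)


if __name__ == "__main__":
    gap = 1.0  # illustrative value of Sigma(k, 0)
    for eps in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"eps_k = {eps:+.1f}  ->  E_k = {excitation_energy(eps, gap):.3f}")
```

Far from the Fermi surface the result approaches $|\epsilon_k|$, so the self-energy only matters within an energy of order the gap around $\epsilon_k = 0$.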

$\Sigma(\mathbf{k},\omega)$ is related to the propagator $F(\mathbf{k},\omega)$ by

$$\Sigma(\mathbf{k},\omega) = \frac{i}{(2\pi)^4}\int \Gamma(\mathbf{k}-\mathbf{k}',\,\omega-\omega')\,F(\mathbf{k}',\omega')\,d^3k'\,d\omega', \tag{5}$$

where $\Gamma$ is the exact vertex part, i.e., the sum of the contributions of all possible interaction processes or combinations of elementary electron-phonon interaction:

$$H_{e\text{-}ph} = -i\sum_{\mathbf{k}\mathbf{k}'}\left(\frac{N}{M}\right)^{1/2}\frac{Z\,q\,V(q)}{[2\omega_q]^{1/2}}\,(b_q+b_{-q}^*)\,c_{\mathbf{k}'}^*c_{\mathbf{k}} = i\sum_{\mathbf{k}\mathbf{k}'}\alpha_q\,(b_q+b_{-q}^*)\,c_{\mathbf{k}'}^*c_{\mathbf{k}}, \tag{6}$$

and direct Coulomb interaction:

$$H_{\rm Coul} = \sum_{\mathbf{k},\mathbf{k}',\mathbf{K}} V(\mathbf{k}'-\mathbf{k})\,c_{\mathbf{k}'}^*c_{\mathbf{K}-\mathbf{k}'}^*c_{\mathbf{K}-\mathbf{k}}c_{\mathbf{k}}. \tag{7}$$

In the above expressions, $N$, $M$, and $Z$ are, respectively, the number of ions per unit volume, the mass, and the valency of the ions; $b_q$ ($b_q^*$) and $c_k$ ($c_k^*$) are the usual annihilation (creation) operators of the phonon and electron fields, respectively; $V(q)$ is the Fourier transform of the screened electrostatic potential:

$$V(q) = \frac{4\pi e^2}{q^2+k_s^2} = \frac{4\pi e^2}{q^2+4\pi e^2N_0}, \tag{8}$$

where $k_s^{-1}$ is the screening radius (for the Fermi-Thomas model, $k_s^2$ is proportional to the density of states $N_0$ on the Fermi surface). Finally, let us remark that the phonon momentum $\mathbf{q}$ appearing in Eq. (6) is equal to $(\mathbf{k}'-\mathbf{k})$ only if this vector happens to be in the first Brillouin zone (normal process). In general, $\mathbf{q}$ is a function of the vector $(\mathbf{k}'-\mathbf{k})$ given by the momentum conservation relation:

$$\mathbf{q} = \mathbf{k}'-\mathbf{k}+\mathbf{K}_N, \tag{9}$$

where $\mathbf{K}_N$ is a suitable vector of the reciprocal lattice. It must be clearly stated that the summation in (6) extends over all $\mathbf{k}$ and $\mathbf{k}'$ near the Fermi surface, since, because of the periodicity of the crystal, the deformation potential caused by the phonon of momentum $\mathbf{q}$ has nonvanishing matrix elements not only for scattering from the electron state $\mathbf{k}$ to the state $\mathbf{k}'$ such that $\mathbf{k}'-\mathbf{k}=\mathbf{q}$, but also to the states $\mathbf{k}'$ satisfying the more general condition (9). Because of the deformation potential theorem, the matrix element is approximately the same whether the scattering is a normal or an "umklapp" process, for a given $\mathbf{q}$.

Now, we shall restrict the vertex part $\Gamma$ to its lowest order terms only, i.e., the sum of

$$D_0(\mathbf{q},\omega) = \alpha_q^2\,u_q(\omega), \tag{10}$$

and $-V(q)$ for the phonon induced and Coulomb interactions respectively. Here $\omega_q$ and $\eta_q$ are the real and imaginary parts of the frequency of the phonon of momentum $\mathbf{q}$. Since the phonon damping is rather small ($\eta_q$ is usually of the order of $10^{-2}\omega_q$), the frequency-dependent factor $u_q(\omega)$ may be written:

$$u_q(\omega) = \frac{\omega_q^2}{\omega_q^2-\omega^2} + \frac{i\pi}{2}\,\omega_q\,[\delta(\omega+\omega_q)+\delta(\omega-\omega_q)]. \tag{11}$$

It has been shown by Migdal$^8$ that this simplification is acceptable for the phonon-electron interaction since higher-order terms are of the order of $(M)^{-1/2}$ or smaller. We have approximated the exact Coulomb interaction by an instantaneous potential, neglecting high-order retarded polarization terms: this is perfectly acceptable since dispersion occurs only at rather high frequencies of the order of the plasma frequency and is completely negligible in the small energy range we are considering here. We are also neglecting mixed terms corresponding to a combination of phonon exchange and Coulomb interactions: these terms are certainly smaller than $(M)^{-1/2}$. The gap equation (5) together with (3) and (10) constitutes therefore the mathematical formulation of the problem. We shall now be concerned with simplifying and solving this equation in the perspective outlined in the introduction.

6 N. N. Bogoliubov, V. V. Tolmachev, and D. V. Shirkov, A New Method in the Theory of Superconductivity (1958) (translation: Consultants Bureau, Inc., New York, 1959).
7 J. C. Swihart, Phys. Rev. 116, 45 (1959).

II. SOLUTION OF THE GAP EQUATION FOR ELECTRON-PHONON COUPLING ONLY

We shall first neglect the Coulomb repulsion altogether and consider only the contribution $D_0(\mathbf{q},\omega)$, so that the gap equation (5) becomes

$$\Sigma(\mathbf{k},\omega) = \frac{i}{(2\pi)^4}\int \alpha_q^2\,u_q(\omega-\omega')\,F(\mathbf{k}',\omega')\,k'^2\sin\theta'\,d\theta'\,d\varphi'\,dk'\,d\omega', \tag{12}$$

where $F(\mathbf{k}',\omega')$ depends only upon the kinetic energy:

$$\epsilon_k = v_0(k-k_0) \tag{13}$$

($v_0$ is the velocity of the electrons on the Fermi level) and $\mathbf{q}$ is given by (9). Since $F$ vanishes rapidly$^9$ when $k'$ is allowed to depart from the Fermi surface, we may replace the integration over $k'$ by integration over the energy and the angular variables. Moreover, we see from (6) that

McA-k +q'<

1 r

k? 2

i

N0Lk +q ]

-i(14)

We have made use of the standard expression for the velocity of sound (see reference 3). Note that this factor is almost independent of $q$, so that we may replace $q^2$ by a mean value of the order of the square of the Debye momentum (more precisely f2). The factor

8 A. B. Migdal, J. Exptl. Theoret. Phys. (U.S.S.R.) 34, 1438 (1958). [Translation: Soviet Phys. JETP 7, 996 (1958).]
9 The $q^{-2}$ dependence of $\Gamma(q,\omega)$ insures the convergence of the original equation (5). Replacing the integration over $k'$ by integration over $\epsilon'$ may, however, introduce an artificial divergence in the case of an instantaneous interaction. We shall in this case cut off the energy integration at the Fermi energy.


function centered at $\omega_q = \omega_1$, we shall proceed to integrate (15) over the energy. Before doing so, however, we write this equation in the equivalent form:

$$\Sigma(\omega) = \frac{i\lambda}{2\pi}\int_{(L)} dz\;U(\omega-z)\int\frac{\Sigma(z)\,d\epsilon'}{D^2(z)-\epsilon'^2}, \tag{18}$$

$$D(z) = [z^2-\Sigma^2(z)]^{1/2}. \tag{19}$$

FIG. 1. Contour of integration (L) in the complex ($z$) plane for Eq. (18). Note that (L) does not cross the branch line of the function $D(z)$ on the real axis (heavy line).

$u_q(\omega-\omega')$ is strongly dependent upon the phonon frequency $\omega_q$. For polyvalent metals, however, the relation between $\omega_q$ and the vector $(\mathbf{k}'-\mathbf{k})$ is quite complicated, so that the angular average of (12) requires a complicated procedure taking into account the structure of the crystal (see Morel, reference 3). On the other hand, since we expect $\Sigma(\mathbf{k},\omega)$ to be essentially independent of $\mathbf{k}$, we may average (12) over all $\mathbf{k}$ in the energy shell. This is equivalent to averaging $u_q(\omega-\omega')$ over the phonon spectrum (all phonon momenta are roughly equiprobable). This approximation, which is akin to the spirit of BCS theory, is of course self-consistent since it removes all $\mathbf{k}$ dependence in the right-hand side of (12); it amounts simply to assuming that the gap is isotropic, which has been shown to be approximately correct from an experimental point of view.$^{10}$ We obtain then

$$\Sigma(\omega) = \frac{i\lambda}{2\pi}\int d\epsilon'\int d\omega'\,F(\epsilon',\omega')\,U(\omega-\omega'), \tag{15}$$

and $U(\omega)$ is the average of the phonon propagator $u_q(\omega)$ over the phonon spectrum, represented by the distribution function $g(\omega_q)$:

$$U(\omega) = \int u_q(\omega)\,g(\omega_q)\,d\omega_q. \tag{17}$$

P. W. Anderson

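$$\Sigma(\omega) = \lambda\int_{(L)}\frac{\omega_1}{2}\left[\frac{1}{\omega_1-z+\omega-i\eta}+\frac{1}{\omega_1+z-\omega-i\eta}\right]\frac{\Sigma(z)}{2D(z)}\,dz. \tag{20}$$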

As we expect from the time-reversal invariance of $F$ and $V$, this equation is compatible with an even solution on the real axis. On the other hand, the detailed features of the solution do not appear clearly. It is, therefore, illuminating to go back temporarily to the time representation of relations (3) and (18):

$$F(\mathbf{k},t) \approx [\Sigma(E_k)/2E_k]\,e^{-iE_k|t|}, \qquad \Sigma(t) = \lambda U(t)\,C(t) = \lambda U(t)\int F(\mathbf{k}',t)\,d\epsilon'. \tag{21}$$

The oscillating time dependence of the integrand is smeared out by the integration and $C(t)$ is a rather slowly varying function of time (see Sec. IV) so that the resulting $\Sigma(t)$ behaves essentially like $U(t)$. Consequently, we expect that the solution of the integral equation (20) may be approximately proportional to $U(\omega)$. In order to test this possibility, we shall try to solve it by iteration, starting from the trial function: $\Sigma_1(\omega) = \Delta\,U(\omega)$,

It is worthwhile to pause at this point and remark that $\lambda$ is not directly proportional to the density of states on the Fermi surface $N_0$ as the analogy with BCS's expression $N_0V$ tends to suggest. The dependence of $\lambda$ upon $N_0$ enters only through $k_s^2$ and we find particularly that however large $N_0$ may be or however complicated the structure of the crystal may be [$(q^2)_{\rm av}$
2(o>) = X

2 ( 0 = X£/(0C(0 = X£/(0 /

.ks-+%qD2J

U{a)-

(19)

&-X>(z)y.

Note that $D(z)$ retains the same determination when $z$ follows the contour (L) represented in Fig. 1. If we choose the determination in the upper half of the complex plane, we obtain in a straightforward fashion:

(IS)

where X is a parameter which plays the role of the "NoV" oi BCS: k? -f , (16)

(18)

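$$\Sigma_2(\omega) = \lambda\Delta\left\{\int_0^{\Delta}\frac{U(z)\,U(\omega-z)\,dz}{i[\Delta^2-z^2]^{1/2}} + \int_{\Delta}^{\infty}\frac{U(z)\,U(\omega-z)\,dz}{[z^2-\Delta^2]^{1/2}}\right\}, \tag{22}$$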

where we have taken advantage of the smallness of $\Sigma(\omega)$ to replace it by the constant $\Sigma(0) = \Delta$ in the expression for $D(z)$. These integrals cannot be computed with any accuracy near the singularity $\omega \approx \omega_1$; however, in the regions both above and below this singularity, the integration can be carried out approximately and

123

CALCULATION

OF

SUPERCONDUCTING

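one finds:

$$\Sigma_2(\omega) = \lambda\Delta\left[\ln(2\omega_1/\Delta)\,U(\omega) + \tfrac12(1-i\pi/2)\,\omega/\omega_1 + O(\omega^2)\right]; \quad \omega \ll \omega_1,$$

$$\Sigma_2(\omega) = \lambda\Delta\left[\ln(2\omega_1/\Delta)\,U(\omega) - i\pi(\omega_1^3/\omega^3) + O(1/\omega^4)\right]; \quad \omega \gg \omega_1. \tag{23}$$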

This expression is reasonably similar to the trial function $\Sigma_1(\omega)$ and, indeed, converges toward $\Sigma_1$ in the weak coupling limit ($\lambda \ll 1$), provided that

$$1/\lambda = \ln(2\omega_1/\Delta), \tag{24}$$

identical to the BCS gap equation. Finally, let us note, for further reference, that 2 2 (0) is real and also that 2 2 (w) has no singularity at o>= 2wi although its complete analytical expression includes the term: XAwi3 2O)(OJ— 2wi-(-2^)

K— -1) L

\ u — oil

oil/

•In

which corresponds to a pole with a vanishing residue.

III. SOLUTION OF THE GAP EQUATION INCLUDING THE COULOMB INTERACTION

On account of the reasonable success of the above scheme, we wish to extend it to solve the gap equation including the Coulomb repulsion $-V(q)$. Since Eq. (5) is linear with respect to the vertex part $\Gamma$, the introduction of the essentially instantaneous Coulomb interaction is equivalent to adding a constant to the frequency-dependent potential, i.e., replace $\lambda U(\omega)$ by

$$U'(\omega) = \lambda U(\omega) - \mu, \tag{25}$$

where $\mu$ is the angular average of $V(q)$:

$$\mu = \frac{1}{4\pi^2v_0}\int_0^{2k_0}\frac{4\pi e^2}{k_s^2+q^2}\,q\,dq = \frac{k_s^2}{8k_0^2}\ln\frac{k_s^2+4k_0^2}{k_s^2}. \tag{26}$$

We are looking for a solution displaying the general behavior of $U'(\omega)$ and more precisely, we shall start the iteration process with the trial function:

$$\Sigma_1'(\omega) = \Delta[(1+\xi)\,U(\omega)-\xi], \tag{27}$$

with two adjustable parameters $\Delta$ and $\xi$. We have then

$$\Sigma_2'(\omega) = \Delta\left\{\int_0^{\Delta}\frac{\{(1+\xi)U(z)-\xi\}\,U'(\omega-z)}{i[\Delta^2-z^2]^{1/2}}\,dz + \int_{\Delta}^{\epsilon_F}\frac{\{(1+\xi)U(z)-\xi\}\,U'(\omega-z)}{[z^2-\Delta^2]^{1/2}}\,dz\right\}. \tag{28}$$

Note that we have introduced a cutoff of the order of the Fermi energy in the last integral in order to prevent the logarithmic divergence due to the constant term $\xi\mu$ in the numerator of the integrand (see footnote 9). As before, these integrals cannot be computed near the singularity $\omega \approx \omega_1$ but we obtain the following expressions for $\Sigma_2'(\omega)$ in the low- and high-frequency limits, respectively:

$$\Sigma_2' = \Delta\left[\lambda\ln(2\omega_1/\Delta) - (1+\xi)\mu\ln(2\omega_1/\Delta) + \xi\mu\{\ln(2\epsilon_F/\Delta)-i\pi/2\} + (\omega/\omega_1)\{\tfrac12\lambda(1+\xi)(1-i\pi/2)+(i\pi/2)\xi\lambda\} + O(\omega^2)\right], \quad \omega \ll \omega_1,$$

$$\Sigma_2' = -\Delta\left[\mu(1+\xi)\ln(2\omega_1/\Delta) - \xi\mu\{\ln(2\epsilon_F/\Delta)-i\pi/2\} + i\pi\lambda\xi\,\omega_1/\omega + O(1/\omega^2)\right], \quad \omega \gg \omega_1. \tag{29}$$

Although (27) and (29) are not as accurately self-consistent as (22) and (23) in the previous section,$^{11}$ $\Sigma_2'$ does indeed converge towards $\Sigma_1'$ in the weak-coupling limit ($\lambda, \mu \ll 1$). In this limit, the adjustable parameters $\Delta$ and $\xi$ must satisfy

$$(\lambda-\mu)\ln(2\omega_1/\Delta) + \mu\xi\ln(\epsilon_F/\omega_1) = 1,$$

$$\mu\ln(2\omega_1/\Delta) - \mu\xi\ln(\epsilon_F/\omega_1) = \xi. \tag{30}$$

Hence,

$$\ln\left(\frac{2\omega_1}{\Delta}\right) = \left[\lambda - \frac{\mu}{1+\mu\ln(\epsilon_F/\omega_1)}\right]^{-1}. \tag{31}$$

This relation is identical to the equation found by Bogoliubov and coworkers$^6$ using a similar model, with the difference that the cutoff $\omega_1$ of the phonon-induced interaction and the cutoff $\epsilon_F$ of the instantaneous Coulomb repulsion appear now as the consequence of the frequency dependence of these interactions rather than as arbitrary cutoffs in momentum space of an effective instantaneous interaction. The effect of the Coulomb repulsion is indeed a reduction of the parameter "$N_0V$" as expected but we find also that the Coulomb repulsion is somewhat less effective than the phonon-induced attraction on account of its instantaneous character {$\mu$ appears in reference 6, Eq. (3.7) reduced by the factor $[1+\mu\ln(\epsilon_F/\omega_1)]^{-1}$ of the order of 0.4}.

11 The most striking deficiency is the appearance of an imaginary term $-i(\pi/2)\xi\mu$ in the zero-order term of the expansion (29). It is likely that this imaginary term (negligible in the weak-coupling limit) is spurious and due to our introducing an artificially sharp cutoff of the term proportional to $\xi\mu$ at the Fermi energy.

IV. EFFECT OF THE FINITE RANGE OF THE PHONON SPECTRUM

The actual phonon spectrum of a solid extends over a finite range, the width $2\omega_2$ of which may be a significant fraction of the Debye frequency (of the order of 20% for example). Accordingly, we see from (17) that the singularities of the average phonon induced interaction $U(\omega)$ are spread over the same range and therefore, $U(\omega)$ is much smoother than its components $u_q(\omega)$. Equivalently, $U(t)$ has only a rather short range in time, of the order of $\omega_2^{-1}$, since the components $u_q(t)$ interfere destructively if the time interval $t$ is of the order of $\omega_2^{-1}$ or larger. This may be best demonstrated by taking a simple model for the phonon spectrum, namely a Lorentz distribution:

;W=-W

(cO, —COl) +C0 2

Ci(0 = A[ln2-Ci(AO]+A[cos(co 1 OCi(co l O +sin(i/)-7r/2}]-»(ir/2)A +*(x/2)Aeri<»i-i'«><j

ln(e^coiO-| ln(l+e 2 Tco!¥). Both the third and the fourth terms are imaginary and we notice that the latter introduces a correction proportional to exp[— 2i{oii—ico2)i] in the expression for 2 2 (i) or equivalently a pole at 2(coi—ico2) in the corresponding expression for 22(co). Moreover, this extra term will in turn bring about corrections at 3co1; 4coi, etc., upon iteration. Our intent is, of course, to neglect these high-energy corrections (which are quite small in the weak-coupling limit) but we must not overlook the fact that the corrections at 2coi, 4coi, 6uh • • • bring small imaginary contributions which ultimately cancel the imaginary contribution of the third term — i(x/2)A of (37) [for 22(co) is the average of (29) with respect to the phonon frequency and, therefore, 2 2 (0) must be real]. In order to be consistent, we shall therefore neglect both imaginary terms of (37) and retain only

^ W = |(wi-*w2)e _ i " l l "e-" ! l ", 1

1

(33)

U(u) = i(wi— tco2) -COi + CO —K02

U l - CO — i « 2 -

Note the rapid damping due to the large imaginary part of the pseudo-phonon frequency (coi—ico2). Because U(u) is actually smoother than uq{oi), one sees physically that the actual solution of (20) should be smoother than (27). However, the computation scheme used in the previous sections is not suited for this generalization because it relies upon the sharpness of the singularity of ut(w). We shall, therefore, take a different approach to the problem, involving the transformation of Eq. (20) to the time representation (21). Our iteration procedure now consists in the following steps. Firstly, we choose a trial function Zi(AO = A [ ( l + { - r ) t f ( A > ) + M ( « ) - £ ] ,

C 1 (;)~A[ln(2co 1 /A)-|ln(l+« 2 Tco 1 ^)], 0
(34)

where A, £, and f are adjustable parameters and A (co) an even function of the frequency, smoothly decreasing from 1 at co = 0 to zero at u = ± c o 1 . Secondly, we carry (34) into the expression for C(t):

c«=c(-o=

(«

2D(z)e

(35)

C2 (i)~A[ln2 - Ci (A0 + Ci (fat) ] ^AQnCcoj/A)-! l n a - r - i ^ W / 2 ) ] .

(i+f-f)Ci(0,

fc,(0,

C3(0^A[ln2-Ci(A/)+Ci(cF0] ~A[ln(2eF/A)-i ln(l+e2W*2)].

(36)

and finally, we shall Fourier-transform the resulting expression for 2 2 in order to compare it to the initial 2i. In the course of this program, however, we shall need to make some approximations, the most important of which is described below. Since expression (35) is practically linear with respect to 2 [we may replace 2 (z) by 2 (0) = A in the expression for D(z)~\, the three terms of (34) contribute, respectively,

(40)

Collecting terms and performing the required Fourier transformation, we obtain the expression for 22(co) in the three regions: co~0, co~coi, and co»« 1 , respectively : 2 2 " (co) =XA[ln(2co 1 /A)+0.23(l+£)+0.14f] OJ~0

- M A [ l n ( 2 c o i / A ) - f l n 2 - f ln(6 F /coi)], 2 2 " (co) =XAC/(co)[ln(2co 2 /A)+0.28-r-0.12£+0.12f] to » f c ) l

-MA[ln(2co1/A)-fln2-Jln(6F/co1)],

-fc.W,

to the final expesstion of C{t). The most important

(39)

Finally, the last term of (34) is cut off at a frequency of the order of the Fermi energy; thus:

Thirdly, we compute 2 2 (0 according to relation (21):

2,(0 = [ } i 7 ( 0 - ^ W ] C ( 0 ,

(38)

Here y is the Euler constant. Similarly, the second term of (34) is cut off at a frequency of the order of fcoi and therefore

Zi&o "dz.

(37)

where Si(z) and Ci(z) are the sine and cosine integral functions. In spite of its appearance, the second bracket is a perfectly smooth function of t, increasing monotonically from — °o at t = 0 [logarithmic divergence, of the order of ln(coiZ)] and approaching zero like — {oiity- for large /. Consequently, this function is very accurately approximated by

2

Note that this mathematically simple model is still a reasonably accurate approximation of the "wellbehaved" phonon spectra found for a rather large class of metals. It would not be a satisfactory approximation for multi-peaked spectra observed in some instances. The corresponding phonon-induced interaction is found immediately to be

1

ANDERSON

term is the first one and is found to be (for positive t):

(32) 2

W.

2 2 " (co) = - M A Q n ( 2 c o ! / A ) - f l n 2 - ^ l n ( ^ / c o ! ) ] .

125

(41)
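The step from (37) to (38) rests on replacing the second bracket of (37) by a smooth logarithmic form. The sketch below tabulates both functions of $x = \omega_1 t$ so that the replacement can be inspected numerically. It is only an illustration: the function names are ours, and the sine and cosine integrals are evaluated from their standard power series, which is adequate only for moderate $x$ (here up to about 10).

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, the gamma of Eq. (38)


def sine_cosine_integrals(x, terms=120):
    """Return (Si(x), Ci(x)) for x > 0 from their Maclaurin series.

    Si(x) = sum_{n>=0} (-1)^n x^(2n+1) / ((2n+1) (2n+1)!)
    Ci(x) = gamma + ln x + sum_{n>=1} (-1)^n x^(2n) / (2n (2n)!)
    """
    si = 0.0
    ci = EULER_GAMMA + math.log(x)
    power = 1.0  # running value of x**k / k!
    for k in range(1, terms + 1):
        power *= x / k
        sign = -1.0 if (k // 2) % 2 else 1.0
        if k % 2:          # odd k contributes to Si
            si += sign * power / k
        else:              # even k contributes to Ci
            ci += sign * power / k
    return si, ci


def exact_bracket(x):
    """Second bracket of Eq. (37): cos(x) Ci(x) + sin(x) (Si(x) - pi/2)."""
    si, ci = sine_cosine_integrals(x)
    return math.cos(x) * ci + math.sin(x) * (si - math.pi / 2)


def smooth_approximation(x):
    """Replacement used in the text: ln(e^gamma x) - (1/2) ln(1 + e^(2 gamma) x^2)."""
    g = math.exp(EULER_GAMMA)
    return math.log(g * x) - 0.5 * math.log(1.0 + (g * x) ** 2)


if __name__ == "__main__":
    print("   x        exact      approx")
    for x in (0.001, 0.01, 0.1, 1.0, 5.0, 10.0):
        print(f"{x:8.3f} {exact_bracket(x):12.6f} {smooth_approximation(x):12.6f}")
```

Both functions share the logarithmic divergence at small $\omega_1 t$ and both tend to zero from below at large $\omega_1 t$; the table printed by the script shows how they compare in between.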

CALCULATION OF SUPERCONDUCTING STATE PARAMETERS

Adjusting $\Delta$, $\xi$, and $\zeta$ to fit $\Sigma_2$ and $\Sigma_1$ in these regions, we find the following self-consistency condition:

$$\ln\left(\frac{2\omega_1}{\Delta}\right) = \left[\frac{\lambda}{1-0.23\lambda} - \frac{\mu}{1+\mu\ln(\epsilon_F/\omega_1)}\right]^{-1}, \tag{42}$$

almost identical to (31). Note that the phonon-induced interaction (attraction) is enhanced by the factor $(1-0.23\lambda)^{-1}$, typically of the order of 1.05 to 1.1. This close similarity with the results of Sec. III justifies a posteriori our use of the simplified Einstein model; this also encourages us to place actual numbers on the parameters $\lambda$ and $\mu$ and compute the corresponding transition temperature. We have plotted the self-consistent "gap-function" $\Sigma_2(\omega)$ for the typical case $\lambda = 0.3$, $\mu = 0.25$, $\omega_2 = \omega_1/5$ and $\ln(\epsilon_F/\omega_1) = 6$ (Fig. 2). Note the striking similarity with the simple model of BCS and Bogoliubov, i.e., a constant gap, cutoff at
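As a rough numerical illustration of condition (42), the sketch below evaluates $\Delta/\omega_1$ for the typical case quoted above ($\lambda = 0.3$, $\mu = 0.25$, $\ln(\epsilon_F/\omega_1) = 6$), with and without the enhancement factor $(1-0.23\lambda)^{-1}$. It assumes that Eq. (31), which is not reproduced in this excerpt, has the same form as (42) without the enhancement factor; the function names are ours.

```python
import math


def mu_star(mu, log_ratio):
    """Reduced Coulomb parameter mu / (1 + mu ln(eps_F / omega_1))."""
    return mu / (1.0 + mu * log_ratio)


def gap_over_omega1(lam, mu, log_ratio, enhancement=True):
    """Delta / omega_1 from ln(2 omega_1 / Delta) = 1 / (lam_eff - mu*).

    With enhancement=True, lam_eff = lam / (1 - 0.23 lam) as in Eq. (42).
    With enhancement=False, lam_eff = lam (the form ASSUMED here for Eq. (31)).
    """
    lam_eff = lam / (1.0 - 0.23 * lam) if enhancement else lam
    coupling = lam_eff - mu_star(mu, log_ratio)
    return 2.0 * math.exp(-1.0 / coupling)


if __name__ == "__main__":
    lam, mu, log_ratio = 0.3, 0.25, 6.0
    print("mu*                  :", mu_star(mu, log_ratio))
    print("enhancement factor   :", 1.0 / (1.0 - 0.23 * lam))
    print("Delta/omega_1, (42)  :", gap_over_omega1(lam, mu, log_ratio, True))
    print("Delta/omega_1, no enh:", gap_over_omega1(lam, mu, log_ratio, False))
```

Because the coupling enters through an exponential, even an enhancement of a few percent changes the gap noticeably, which is why the two conditions are "almost identical" in form but not in the numbers they produce.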

V. TRANSITION TEMPERATURE AND ISOTOPE EFFECT

It is interesting at this point to use our model to evaluate the critical temperature (or rather the corresponding "$N_0V$") for some well-behaved metals and compare our predictions with experimental data. Firstly, we notice that the two most important parameters characterizing the electronic behavior of metals, i.e., the electron density (or Fermi momentum) and the density of individual states on the Fermi level, enter the expressions for $\lambda$ and $\mu$ only through the combination:

$$a^2 = k_s^2/4k_0^2 = 4\pi e^2 N_0/4k_0^2. \tag{43}$$

We may compute both $k_0$ and $N_0$ for the Fermi-Thomas model from the interelectron spacing (obtained from basic crystallographic data). Now, the actual density of states on the Fermi surface is the Fermi-Thomas value corrected for the effective mass (which in turn is estimated from specific-heat measurements). From (26), it is clear that

$$\mu = \tfrac{1}{2}a^2\ln[(1+a^2)/a^2]. \tag{44}$$

FIG. 2. Plot of the self-consistent energy-dependent "gap function" $\Sigma(\omega)$ for typical values of the parameter. The full line and the dashed line represent the real and imaginary part of $\Sigma(\omega)$, respectively. The step function represents the gap function used by Bogoliubov (equal to 1 from 0 to $\omega_1$ and to $-\xi$ from $\omega_1$ to the Fermi energy).

On the other hand, we have not accurately computed the angular average of the strength of the phonon-induced interaction because of the complication brought by the existence of umklapp processes; we have, rigorously,

$$\lambda = \int_0^1 \frac{a^2}{a^2+q^2/4k_0^2}\,x\,dx, \tag{45}$$

where $x = |\mathbf{k}-\mathbf{k}'|/2k_0$ is equal to $q/2k_0$ only for normal processes. For alkali metals, it is indeed a fair approximation to neglect the effect of umklapp processes and take

$$\lambda = \int_0^1 \frac{a^2}{a^2+x^2}\,x\,dx = \tfrac{1}{2}a^2\ln\frac{1+a^2}{a^2}. \tag{46}$$

For polyvalent metals, however, the umklapp processes play an important role and it is preferable to replace $q^2/4k_0^2$ by the mean value over the first Brillouin zone. For metals with a simple structure (quasi-spherical zone), this mean value is of the order of $\tfrac{1}{2}(4Z)^{-2/3}$; then

$$\lambda = \tfrac{1}{2}\,\frac{a^2}{a^2+\tfrac{1}{2}(4Z)^{-2/3}}. \tag{47}$$

Now, it must be emphasized that the above expression is no more than an order of magnitude estimate of $\lambda$; for, we have not taken into account any effect of the crystalline structure although it is clear that it determines the mean value of $q^2/4k_0^2$. Also, (47) is based on the assumption that the screening radius of the electron-ion interaction is the same as the screening radius of the direct Coulomb interaction between electrons and may be estimated on the basis of the Fermi-Thomas model. This assumption is very much open to doubt, particularly for the heaviest ions. In any case, it may be seen from Table I that this simple model leads to a fair order of magnitude agreement between the "$N_0V$" estimated from experimental data using the BCS expression for the critical temperature and our parameter:

$$\lambda - \mu^* = \lambda - \frac{\mu}{1+\mu\ln(\epsilon_F/\omega_1)}, \tag{48}$$

computed from basic crystallographic and thermal data. We have also plotted both parameters $\lambda$ and $\mu^*$ versus $a^2$ for monovalent, bivalent, and tetravalent metals [using a typical value $\ln(\epsilon_F/\omega_1) = 6$]. Since $a^2$ is always of the order of 0.3 or larger (0.8 to 1 for alkali metals), this plot indicates that most if not all metals should be superconducting (see Fig. 3). On the other hand, the computed critical temperature is exceedingly low if $(\lambda - \mu^*)$ is smaller than 0.15, say. For example, the critical temperature of sodium would be of the order of $10^{-3}$ °K; it is then safe to assume that even if perfectly pure sodium were superconducting at such low temperatures in zero magnetic field,


TABLE I. In this table, $a^2$ is computed from crystallographic data and $m^*/m$ is estimated from the electronic specific heat. These data, as well as the Debye temperature and the critical temperature (columns 3 and 4) are taken from the "American Institute of Physics Handbook," McGraw-Hill, 1957. $N_0V$ (column 8) is estimated from the critical temperature with the help of the BCS equation. The exponent of the isotope effect (last column) is derived from this experimental value of $N_0V$ with the help of relation (50).

Element: Na K Cu Au Mg Ca Zn Cd Hg Al In Tl Sn Pb Ti Zr V Nb Ta Mo U

$a^2$: 0.67 0.83 0.45 0.51 0.45 0.55 0.39 0.43 0.43 0.35 0.40 0.415 0.37 0.38 0.32 0.51 0.28 0.29 0.29 0.27 0.30

$m^*/m$: 1.6 1.15 1.1 1.3 0.75 0.9 0.75 2 1.5 1.35 1.15 1.2 2.1 ~3 1.1 ~9 ≈8 ~5 1.9 8

$\theta_D$ (°K): 160 100 343 164 342 220 235 164 70 375 109 100 195 96 430 265 338 320 230 360 200

$T_c$ (°K): 0.9 0.56 4.16 1.2 3.4 2.4 3.75 7.22 0.4 0.55 4.9 8.8 4.4 1.1

the least residual field (magnetic impurities) would exceed the very small critical field and prevent superconductivity. This argument practically excludes all monovalent metals. Finally, we notice that all experimental values of "$N_0V$" for metals appear to be smaller than 0.4 (see Table I), in good agreement with our prediction since $\mu^*$ is practically 0.10 in all cases and $\lambda$ cannot exceed 0.5 (to the first order of the expansion in the electron-phonon interaction).
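The statement that $\mu^*$ is practically 0.10 in all cases can be checked with a few lines of arithmetic. The sketch below assumes the reading $\mu = \tfrac{1}{2}a^2\ln[(1+a^2)/a^2]$ of Eq. (44), the reduction $\mu^* = \mu/[1+\mu\ln(\epsilon_F/\omega_1)]$ of Eq. (48), and the typical value $\ln(\epsilon_F/\omega_1) = 6$ quoted in the text; the function names are ours, and the sample values of $a^2$ simply span the range appearing in Table I.

```python
import math


def coulomb_mu(a2):
    """Angular-averaged screened Coulomb parameter, as read from Eqs. (26) and (44)."""
    return 0.5 * a2 * math.log((1.0 + a2) / a2)


def coulomb_mu_star(a2, log_ratio=6.0):
    """Reduced parameter mu* = mu / (1 + mu ln(eps_F / omega_1)), Eq. (48)."""
    mu = coulomb_mu(a2)
    return mu / (1.0 + mu * log_ratio)


if __name__ == "__main__":
    for a2 in (0.27, 0.30, 0.45, 0.67, 0.83):
        print(f"a^2 = {a2:5.2f}   mu = {coulomb_mu(a2):.3f}   mu* = {coulomb_mu_star(a2):.3f}")
```

Under these assumptions $\mu$ itself varies by more than 50% across the range of $a^2$, while the logarithmic reduction compresses $\mu^*$ into a narrow band around 0.1.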

$\lambda$: 0.25 0.25 0.20 0.18 0.32 0.27 0.25 0.23 0.37 0.33 0.34 0.32 0.34 0.40 0.41 0.37 0.47 0.47 0.45 0.38 0.47

$\mu^*$: 0.12 0.12 0.10 0.10 0.12 0.11 0.09 0.09 0.10 0.10 0.10 0.09 0.10 0.10 0.11 0.11 0.12 0.12 0.11 0.10 0.12

$\lambda - \mu^*$: 0.13 0.13 0.10 0.08 0.20 0.16 0.16 0.14 0.27 0.23 0.24 0.23 0.24 0.30 0.30 0.26 0.35 0.35 0.34 0.28 0.35

$(N_0V)_{\mathrm{exp}}$: 0.18 0.175 0.35 0.175 0.29 0.27 0.25 0.39 0.14 0.16 0.24 0.32 0.25 0.19

$-(d\Delta/\Delta)(M/dM)$: 0.35 0.34 0.46 0.34 0.44 0.43 0.42 0.47 0.25 0.30 0.41 0.45 0.42 0.36
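The last column of Table I follows from the $(N_0V)_{\mathrm{exp}}$ column through the semiempirical relation (50), $-(d\Delta/\Delta)(M/dM) = \tfrac{1}{2}[1 - 0.01(N_0V)^{-2}]$. The short sketch below recomputes the exponent from each tabulated $N_0V$ and prints it next to the tabulated value; the pairs are copied from the two columns in the order in which they appear, and the function name is ours.

```python
def isotope_exponent(n0v, mu_star=0.1):
    """Exponent -(dDelta/Delta)(M/dM) = (1/2) [1 - (mu*/N0V)^2], relation (50) for mu* = 0.1."""
    return 0.5 * (1.0 - (mu_star / n0v) ** 2)


# (N0V, tabulated exponent) pairs, in the order listed in Table I
TABLE_I_PAIRS = [
    (0.18, 0.35), (0.175, 0.34), (0.35, 0.46), (0.175, 0.34),
    (0.29, 0.44), (0.27, 0.43), (0.25, 0.42), (0.39, 0.47),
    (0.14, 0.25), (0.16, 0.30), (0.24, 0.41), (0.32, 0.45),
    (0.25, 0.42), (0.19, 0.36),
]

if __name__ == "__main__":
    for n0v, tabulated in TABLE_I_PAIRS:
        print(f"N0V = {n0v:5.3f}   computed = {isotope_exponent(n0v):.3f}   table = {tabulated:.2f}")
```

The recomputed values agree with the tabulated ones to within about 0.01, and the departure from the ideal value 0.5 grows quickly as $N_0V$ decreases.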

Isotope Effect

The isotope effect consists in the dependence of the energy gap $\Delta$ upon the phonon frequency $\omega_1$ (the phonon frequency itself is proportional to $M^{-1/2}$, everything else being equal). Differentiating (31) with respect to the phonon frequency, we obtain in a straightforward fashion:

$$d\Delta/\Delta = -\tfrac{1}{2}(dM/M)\{1 - [\lambda\ln(2\omega_1/\Delta) - 1]^2\}. \tag{49}$$

Using the property $\mu^* \approx 0.1$, we can derive from (49) a semiempirical relation:

$$d\Delta/\Delta = -\tfrac{1}{2}(dM/M)[1 - 0.01(N_0V)^{-2}], \tag{50}$$

indicating a significant discrepancy from the "ideal" value $-\tfrac{1}{2}(dM/M)$ postulated by BCS. Note that the discrepancy is larger for low temperature superconductors (see Table I).

FIG. 3. Plot of the parameters $\lambda$ and $\mu^*$ vs $a^2$ [see expressions (44) to (48)] for different valencies $Z = 1$, 2, and 4.

VI. CONCLUSION

The positive aspect of our results is that they confirm the approximations to the effective potential made in the past, particularly Bogoliubov's approximation, and give a precise meaning to each parameter which can, therefore, in principle, be computed from the basic crystallographic data. Also, the orders of magnitude of the computed transition temperatures are generally satisfactory.

The negative features are, firstly, the isotope effect, which is particularly in conflict with the results of Geballe and Matthias on zinc; secondly, the prediction that all metals should superconduct at sufficiently low temperatures; and thirdly, the fact that it is not obvious from the theory why no material with $T_c$ higher than 18°K has yet been found. This last point is not really too disturbing since the drastic simplifications on which this theory is based could only be expected to be reasonable in the weak-coupling limit; in the strong-coupling case of high-transition-temperature superconductors, such effects as the phonon scattering lifetime and, in general, the complex diagrams which have been omitted from our Eq. (12), should be incorporated in the theory.

On the other hand, the first and second difficulties are of a more serious nature for they do occur in the weak-coupling limit. Because our results are essentially insensitive to the actual computation scheme [compare Eqs. (31) and (42), and also Eq. (6.21) of reference 6], we do not feel that our solution of the basic equation (5) could be in serious error, at least in the weak-coupling limit. Moreover, the predicted transition temperatures are not remarkably sensitive to the errors, or approximations, which do remain in our treatment, with the possible exception of the cases in which zone boundary perturbations are too extensive at the Fermi surface (such as Bi, Sb, ...), so that the free-electron Fermi sphere is a very poor approximation. This suggests strongly that the results derived here are exactly the logical conclusions of the original assumptions of the BCS theory. Consequently, what serious difficulties, such as the isotope effect, are to be found, seem to us to require either a new look at the basic assumptions of the theory, possibly using a different interaction taking into account more complex diagrams than the lowest order diagrams implicitly contained in the BCS treatment, or re-examination of the experimental evidence.

ACKNOWLEDGMENT

It is a pleasure to thank Professor J. R. Schrieffer and Professor P. Nozières for helpful discussions of the Green's function formalism of Sec. II.

Generalized Bardeen-Cooper-Schrieffer States and the Proposed Low-Temperature Phase of Liquid He3 (with P. Morel) Phys. Rev. 123, 1911-1934 (1961)

Pierre was my first serious student, and as is often the case he was among the best. He was bequeathed to me by David Pines, who had left Princeton for Illinois, and couldn't move Pierre from his job in NY. This paper later got Pierre his first promotion in France, when there was a brief but unconfirmed flutter about discovery of the He-3 phase, but of course we were 12 years previous and wrong about L. Nonetheless Pierre went on to a distinguished career in space science. I had had the idea for non-s-wave superconductivity for a year or more but only really began working on it as the result of two stimuli: Pierre's arrival, and Keith Brueckner's interest in it vis-à-vis He-3, which I hadn't known much about. Keith put a student on it immediately and rushed out a paper, on which I demanded equal billing for Pierre. It was well Brueckner did this because we scooped Emery and Sessler by only a month or so, but Brueckner's methods were sloppy and he focused on Tc which I felt we had no justification for predicting, so we followed up with the formal theory of such a state in a sequence of papers. I gave the first before an extraordinarily distinguished audience at the international congress at Utrecht in 1960, feeling very brave to choose such a speculative topic for my first such occasion. The d-wave state we focussed such care upon was not the right one, but the principles we pioneered (the possibility of multiple ordered states, the meaning of the order parameter and the like) were seminal to all further work.

PHYSICAL REVIEW, VOLUME 123, NUMBER 6, SEPTEMBER 15, 1961

Generalized Bardeen-Cooper-Schrieffer States and the Proposed Low-Temperature Phase of Liquid He3

P. W. ANDERSON
Bell Telephone Laboratories, Murray Hill, New Jersey

AND

P. MOREL
French Embassy, New York, New York

(Received May 15, 1961)

Particle interactions in a Fermi gas may be such as to attract pairs near the Fermi surface more strongly in $l = 1$, 2, 3 or higher states than in the simple spherically symmetrical $s$ state. In that case the Bardeen-Cooper-Schrieffer condensed state must be generalized, and the resulting state is an anisotropic superfluid. We have studied the properties of this type of state in considerable detail, especially for $l = 1$ and 2. We have derived expressions for the energy, the moment of inertia, the magnetic susceptibility and the specific heat. We also derive the density correlation function and the density-current density correlation; in some cases the latter implies that the liquid has net surface currents and a net orbital angular momentum. The ground state for $l = 2$ is different from those previously considered, and has cubic symmetry and no net angular momentum. A general method for replacing the possibly rather complicated potential by a simple scattering matrix is given. A brief discussion of possible collective effects is included. We apply our results to liquid He3; after correction for scattering by a method due to Suhl, it is found that the predicted transition should take place below 0.02 °K. Other possible applications are suggested.

I. INTRODUCTION

SINCE the publication of the Bardeen, Cooper, and Schrieffer (BCS) theory of superconductivity,1 attempts have been made to extend their method to describe possible condensations of other interacting fermion systems, particularly liquid helium-3. It has been recently observed by several authors2-4 that the problem of determining the ground state of a fermion

1 J. Bardeen, L. N. Cooper, and J. R. Schrieffer, Phys. Rev. 108, 1175 (1957).
2 J. C. Fisher (private communication); D. J. Thouless, Ann. Phys. 10, 553 (1960).
3 K. A. Brueckner, T. Soda, P. W. Anderson, and P. Morel, Phys. Rev. 118, 1442 (1960).
4 V. J. Emery and A. M. Sessler, Phys. Rev. 119, 43 (1960).

P. W. ANDERSON AND P. MOREL

assembly with attractive forces has solutions which correspond to condensing the fermion pairs into $l = 1, 2, 3, \ldots$ states rather than into the spherically symmetric $s$ state of BCS theory. Brueckner et al.3 first pointed out that the forces in liquid helium-3 are favorable to a condensation into an $l = 2$ state at a temperature of the order of 0.07°K. This situation is a consequence of the strongly repulsive short-range part of the He3-He3 interaction potential which dominates the first few terms of its expansion in terms of spherical harmonics, thus preventing a condensation in the $l = 0$ or 1 configurations. In higher order, however, (beginning with $l = 2$) the long-range attractive part contributes more than the repulsive core so that the balance is attractive and a condensation is possible. It is not inconceivable that a similar situation could occur in other cases of interacting fermion systems, such as electrons in transition metals.

It must be pointed out that the existence of such condensed states has not been observed yet in any physical system. Measurements5 of the specific heat of liquid helium-3 at temperatures as low as 0.008°K failed to show any transition to an ordered state; this experimental result does not, however, preclude a condensed state of the type we are discussing here, for the critical temperature $T_c$ derived for an ideal fluid (without scattering) is certainly well above the true transition temperature for the real fluid. Indeed, Suhl argues that a transition to a condensed state cannot take place unless the energy broadening due to scattering is smaller than $kT_c$.6 One can see from self-diffusion measurements5 that the amount of particle-particle scattering is still quite large at 0.03°K and that the resulting energy broadening will be small enough to allow a condensation only at a significantly lower temperature, probably of the order of 0.01° to 0.02°K. It is thus possible that the stable state of liquid helium-3 be a condensed state at the absolute zero and at temperatures close to zero, even though helium-3 is still a Fermi liquid in the range of temperatures which are attainable now. In any case, these nonisotropic condensed states appear interesting enough to justify as complete an investigation as possible of their properties and of the conditions for condensation.

We first introduce a general theory of ideal many-fermion systems specifically designed to include possible condensation of the particles into nonspherically symmetrical pair states, following a straightforward generalization of the method of Anderson7 (Sec. II); then, we renormalize the interaction potential in order to eliminate formally divergent matrix elements (resulting from the strong repulsive core of the He3-He3 potential) and reduce it to an "effective potential" acting only near the Fermi surface (Sec. III). The problem of finding the ground state is then reduced to solving a BCS-type nonlinear equation: Sec. IV is devoted to the treatment of this equation and the derivation of the properties of the ground state, particularly the ground state energy (condensation energy). We present in Sec. V a novel approach to the derivation of the thermodynamics of the condensed system, based on Anderson's formalism; the critical temperature $T_c$, the specific heat, and the paramagnetic susceptibility of the condensed fluid are evaluated. The two following sections are devoted to the study of the flow properties of this model; in Sec. VI we show that a small fraction of the particles take part in a spontaneous circulation in the ground state at 0°K, because the average correlation between the density of particles and the current-density fails to vanish on account of the anisotropy of the ground state; in Sec. VII, we show that the condensed fluid is superfluid as could be expected from the analogy with the superconducting property of the condensed electron gas in metals, and in accordance with the finding of Glassgold and Sessler.8 Finally, we discuss the alterations of the properties derived in the previous sections for an ideal fluid, brought about by scattering and in general departure of the real fluid from the ideal model; we emphasize particularly in this last section (VIII) the significant lowering of the transition temperature by scattering in the case of liquid helium-3.

II. PRESENTATION OF THE FORMALISM

We write the Hamiltonian of our system of interacting fermions in second quantization and expand the wave function in terms of plane waves:

$$H = \sum_{\mathbf{k},\sigma}\epsilon_{\mathbf{k}}\,c_{\mathbf{k},\sigma}^{*}c_{\mathbf{k},\sigma} - \sum_{\mathbf{k},\mathbf{k}',\mathbf{q}}\sum_{\sigma,\sigma'} V_{\mathbf{k}\mathbf{k}'}\,c_{\mathbf{k}',\sigma}^{*}c_{\mathbf{q}-\mathbf{k}',\sigma'}^{*}c_{\mathbf{q}-\mathbf{k},\sigma'}c_{\mathbf{k},\sigma}. \tag{2.1}$$

Here $c_{\mathbf{k},\sigma}^{*}$ and $c_{\mathbf{k},\sigma}$ are, respectively, the creation and annihilation operators of the fermion field, $\epsilon_{\mathbf{k}}$ the energy appropriate to single-particle excitations of the system, and $V_{\mathbf{k}\mathbf{k}'}$ the interaction potential matrix element. The most basic approach to treat the Hamiltonian (2.1) would be to take $\epsilon_{\mathbf{k}}$ as the "bare" particle kinetic energy and $V_{\mathbf{k}\mathbf{k}'}$ equal to the Fourier transform of the free space He3-He3 interaction potential, and to derive all many-body effects, including a possible condensation, from there. We shall not, however, follow this course since we are only interested in finding the energy difference (or condensation energy) between the condensed state and the Fermi states of a system of "dressed" particles. Anderson7 and others9 have shown that the usual many-body effects, leading to a renormalization of the Hamiltonian in terms of quasi-particles or "dressed" particles, are probably substantially unaffected by the transition into a BCS-type condensed state, when the condensation involves only a small fraction of the total number of particles; the many-body corrections, such as the exchange energy, the screening of the interaction potential, etc., are practically the same in the normal and condensed states. Consequently, it is adequate to replace the basic Hamiltonian (2.1) by the renormalized Hamiltonian which is the result of a complete treatment of the usual many-body effects in the normal state, that is to say to include effective-mass corrections into $\epsilon_{\mathbf{k}}$ and to use a screened potential instead of the free-space interaction potential. Furthermore, Anderson has shown that the reduced Hamiltonian $H_{\mathrm{red}}$

We find, in this case, that the odd part of $V_{\mathbf{k}\mathbf{k}'}$ can be canceled by symmetrizing the interaction terms of the reduced Hamiltonian. Since we shall later use the expansion of $V_{\mathbf{k}\mathbf{k}'}$ in terms of spherical harmonics, it is pertinent to remark that only even terms of this expansion remain after the symmetrization: even $l$ terms are associated with antiparallel pairing. In the case of parallel pairing, we can instead define the operators:

$$b_{\mathbf{k},\sigma}^{*} = c_{\mathbf{k},\sigma}^{*}c_{-\mathbf{k},\sigma}^{*}, \qquad b_{\mathbf{k},\sigma} = c_{-\mathbf{k},\sigma}c_{\mathbf{k},\sigma}.$$

Now, only odd terms are left upon symmetrization of the interaction, meaning that the odd $l$ terms of the expansion of $V_{\mathbf{k}\mathbf{k}'}$ in terms of spherical harmonics are associated with parallel pairing. A slight complication arises in this case because the two spin populations are left uncoupled by the interaction: one can then write two "half-Hamiltonians," one for each spin, and consider, in principle, the case of unequal populations of the two spin states. It can be shown, however, that this situation could not occur in the weak-coupling limit,10 so that the average values of the above operators are independent of the spin; this allows us to forget the distinction between the two spin states in this case and to write the reduced Hamiltonian both in the cases of parallel and antiparallel pairing as

$$H_{\mathrm{red}} = 2\sum_{\mathbf{k}}\epsilon_{\mathbf{k}}n_{\mathbf{k}} - 2\sum_{\mathbf{k}\mathbf{k}'}V_{\mathbf{k}\mathbf{k}'}\,b_{\mathbf{k}'}^{*}b_{\mathbf{k}}. \tag{2.2}$$

Following Anderson's method, we express the three operators $b_{\mathbf{k}}^{*}$, $b_{\mathbf{k}}$ and $n_{\mathbf{k}}$ in terms of one vectorial operator $\boldsymbol{\sigma}_{\mathbf{k}}$, formally identical, in the occupation space of the pair state $\mathbf{k}\uparrow, -\mathbf{k}\downarrow$ (antiparallel pairing) or $\mathbf{k}\uparrow, -\mathbf{k}\uparrow$ (parallel pairing), to the spin $\tfrac{1}{2}$ operator in spin space:

$$b_{\mathbf{k}}^{*} = \tfrac{1}{2}(\sigma_{\mathbf{k}u} - i\sigma_{\mathbf{k}v}), \qquad b_{\mathbf{k}} = \tfrac{1}{2}(\sigma_{\mathbf{k}u} + i\sigma_{\mathbf{k}v}), \qquad n_{\mathbf{k}} = \tfrac{1}{2}(1 - \sigma_{\mathbf{k}w}).$$

We now carry these expressions into (2.2) and also slightly alter the definition of $\epsilon_{\mathbf{k}}$ to the effect that, from now on, this energy will be measured relative to the Fermi level. Omitting an irrelevant constant, we find

$$H_{\mathrm{red}} = -\sum_{\mathbf{k}}\epsilon_{\mathbf{k}}\sigma_{\mathbf{k}w} - \tfrac{1}{2}\sum_{\mathbf{k}\mathbf{k}'}V_{\mathbf{k}\mathbf{k}'}(\sigma_{\mathbf{k}u}\sigma_{\mathbf{k}'u} + \sigma_{\mathbf{k}v}\sigma_{\mathbf{k}'v}). \tag{2.3}$$

FIG. 1. Representation of the mean value of the "pseudospin" operator $\boldsymbol{\sigma}_{\mathbf{k}}$ in the occupation space of the individual particle state $\mathbf{k}$.

The analogy with the problem of ferromagnetism leads us to use the so-called "semiclassical method" for the purpose of finding the ground state of the system; this method consists in replacing the spin operators (here the operators $\boldsymbol{\sigma}_{\mathbf{k}}$) by their mean values, which can be treated as ordinary vectors of moduli equal to 1, in their respective spin (occupation) spaces $(Ouvw)_{\mathbf{k}}$.11 Let then $\chi_{\mathbf{k}}$ and $\varphi_{\mathbf{k}}$ be the usual latitude and longitude angles defining the direction of the vector $\langle\boldsymbol{\sigma}_{\mathbf{k}}\rangle$ (see Fig. 1). In terms of these new variables, the mean value

5 A. C. Anderson, H. R. Hart, Jr., and J. C. Wheatley, Phys. Rev. Letters 5, 133 (1960); A. C. Anderson, G. L. Salinger, W. A. Steyert, and J. C. Wheatley, ibid. 6, 331 (1961).
6 H. Suhl, Bull. Am. Phys. Soc. 6, 119 (1961).
7 P. W. Anderson, Phys. Rev. 112, 1900 (1958).
8 A. E. Glassgold and A. M. Sessler, Nuovo cimento 19, 723 (1961).
9 E.g., N. N. Bogolyubov, Uspekhi Fiz. Nauk 67, 549 (1959); Y. Nambu, Phys. Rev. 117, 648 (1960).
10 Indeed, if the interaction has a reasonable strength, the binding energy gained by concentrating more particles in one spin state than in the other is more than offset by the resulting increase of the kinetic energy.
11 G. Heller and H. A. Kramers, Proc. Acad. Sci. Amsterdam 37, 378 (1934); M. J. Klein and R. S. Smith, Phys. Rev. 80, 1111 (1951); P. W. Anderson, ibid. 86, 694 (1952).

GENERALIZED BCS STATES

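$E$ of the reduced Hamiltonian is

$$E = -\sum_{\mathbf{k}}\epsilon_{\mathbf{k}}\cos\chi_{\mathbf{k}} - \tfrac{1}{2}\sum_{\mathbf{k}\mathbf{k}'}V_{\mathbf{k}\mathbf{k}'}\sin\chi_{\mathbf{k}}\sin\chi_{\mathbf{k}'}\cos(\varphi_{\mathbf{k}} - \varphi_{\mathbf{k}'}). \tag{2.4}$$

The ground state at $T = 0$°K is found by minimizing $E$ with respect to $\chi_{\mathbf{k}}$ and $\varphi_{\mathbf{k}}$:

$$\tan\chi_{\mathbf{k}}\,e^{i\varphi_{\mathbf{k}}} = \frac{1}{\epsilon_{\mathbf{k}}}\sum_{\mathbf{k}'}V_{\mathbf{k}\mathbf{k}'}\sin\chi_{\mathbf{k}'}\,e^{i\varphi_{\mathbf{k}'}}, \tag{2.5}$$

$$E_g = -\sum_{\mathbf{k}}\epsilon_{\mathbf{k}}(\cos\chi_{\mathbf{k}} + \tfrac{1}{2}\sin\chi_{\mathbf{k}}\tan\chi_{\mathbf{k}}). \tag{2.6}$$

The condensation energy is then the difference between $E_g$ and the energy corresponding to the Fermi state ($\cos\chi_{\mathbf{k}} = -1$ inside the Fermi sphere and $+1$ outside):

$$W_0 = \sum_{\mathbf{k}}[\,|\epsilon_{\mathbf{k}}| - \epsilon_{\mathbf{k}}(\cos\chi_{\mathbf{k}} + \tfrac{1}{2}\sin\chi_{\mathbf{k}}\tan\chi_{\mathbf{k}})]. \tag{2.7}$$

Although this energy $W_0$ is usually quite small, it is indeed the main contribution to the binding energy gained by the pairing of the particles. The contribution of the interaction terms included in $H_{\mathrm{red}}$ to the usual correlation energy is altogether negligible (to order $1/N$). On the other hand, the usual correlation energy results from terms which are not included in the reduced Hamiltonian (scattering, exchange, exchange-scattering matrix elements, ...) and therefore, is not significantly altered by the condensation.12

III. EFFECTIVE POTENTIAL

Since we are interested here in the weak coupling limit, we consider only states of the system not too different from the Fermi state; we expect, in other words, the distribution of individual particle states in the ground state of the system to be essentially identical to the Fermi distribution outside a fairly thin shell about the Fermi surface. Restricting our interest to these quasi-Fermi distributions, we assume that $|\sin\chi_{\mathbf{k}}|$ is significantly different from zero only inside the shell $|\epsilon_{\mathbf{k}}| < \xi$ in momentum space, $\xi$ being much smaller than the Fermi energy. It is clear that most of the binding energy results from the contribution to (2.7) of transitions inside this shell. This region near the Fermi surface is also the domain where Eq. (2.5) is nonlinear, for $\sin\chi_{\mathbf{k}}$ is almost equal either to $+$ or $-\tan\chi_{\mathbf{k}}$ outside the shell (that is to say, $\chi_{\mathbf{k}} \to \pi$ below the Fermi level and $\chi_{\mathbf{k}} \to 0$ above the Fermi level, if the actual distribution of individual particle states approaches the Fermi distribution). Let us then split the summation on the right-hand side of (2.5) into summation inside and outside the shell:

$$\tan\chi_{\mathbf{k}}\,e^{i\varphi_{\mathbf{k}}} = \frac{1}{\epsilon_{\mathbf{k}}}\sum_{\mathbf{k}'}^{\mathrm{in}}V_{\mathbf{k}\mathbf{k}'}\sin\chi_{\mathbf{k}'}\,e^{i\varphi_{\mathbf{k}'}} + \frac{1}{\epsilon_{\mathbf{k}}}\sum_{\mathbf{k}'}^{\mathrm{out}}V_{\mathbf{k}\mathbf{k}'}\sin\chi_{\mathbf{k}'}\,e^{i\varphi_{\mathbf{k}'}}, \tag{3.1}$$

and replace $\sin\chi_{\mathbf{k}'}$ in the second term on the right-hand side by

$$\sin\chi_{\mathbf{k}'} = \frac{\epsilon_{\mathbf{k}'}}{|\epsilon_{\mathbf{k}'}|}\tan\chi_{\mathbf{k}'} + O(|\sin^3\chi_{\mathbf{k}'}|). \tag{3.2}$$

We obtain thereby, as far as the summation outside the shell is concerned, a linear integral equation which can be solved by iteration. Substituting expression (3.1) for $\tan\chi_{\mathbf{k}}$ in (3.2), we find:

$$\begin{aligned}
\tan\chi_{\mathbf{k}}\,e^{i\varphi_{\mathbf{k}}} &= \frac{1}{\epsilon_{\mathbf{k}}}\sum_{\mathbf{k}'}^{\mathrm{in}}V_{\mathbf{k}\mathbf{k}'}\sin\chi_{\mathbf{k}'}\,e^{i\varphi_{\mathbf{k}'}} + \frac{1}{\epsilon_{\mathbf{k}}}\sum_{\mathbf{k}'}^{\mathrm{out}}V_{\mathbf{k}\mathbf{k}'}\frac{\epsilon_{\mathbf{k}'}}{|\epsilon_{\mathbf{k}'}|}\tan\chi_{\mathbf{k}'}\,e^{i\varphi_{\mathbf{k}'}}\\
&= \sum_{\mathbf{k}'}^{\mathrm{in}}\frac{V_{\mathbf{k}\mathbf{k}'}\sin\chi_{\mathbf{k}'}\,e^{i\varphi_{\mathbf{k}'}}}{\epsilon_{\mathbf{k}}} + \sum_{\mathbf{k}''}^{\mathrm{in}}\sum_{\mathbf{k}'}^{\mathrm{out}}\frac{V_{\mathbf{k}\mathbf{k}'}V_{\mathbf{k}'\mathbf{k}''}\sin\chi_{\mathbf{k}''}\,e^{i\varphi_{\mathbf{k}''}}}{\epsilon_{\mathbf{k}}|\epsilon_{\mathbf{k}'}|}\\
&\quad + \sum_{\mathbf{k}'''}^{\mathrm{in}}\sum_{\mathbf{k}''}^{\mathrm{out}}\sum_{\mathbf{k}'}^{\mathrm{out}}\frac{V_{\mathbf{k}\mathbf{k}'}V_{\mathbf{k}'\mathbf{k}''}V_{\mathbf{k}''\mathbf{k}'''}\sin\chi_{\mathbf{k}'''}\,e^{i\varphi_{\mathbf{k}'''}}}{\epsilon_{\mathbf{k}}|\epsilon_{\mathbf{k}'}||\epsilon_{\mathbf{k}''}|} + \cdots.
\end{aligned} \tag{3.3}$$

By relabeling the summation indexes, this equation can be written in the more convenient form:

$$\tan\chi_{\mathbf{k}}\,e^{i\varphi_{\mathbf{k}}} = \frac{1}{\epsilon_{\mathbf{k}}}\sum_{\mathbf{k}'}^{\mathrm{in}}U_{\mathbf{k}\mathbf{k}'}\sin\chi_{\mathbf{k}'}\,e^{i\varphi_{\mathbf{k}'}}, \tag{3.4}$$

where the "effective potential" $U_{\mathbf{k}\mathbf{k}'}$ is defined by

$$U_{\mathbf{k}\mathbf{k}'} = V_{\mathbf{k}\mathbf{k}'} + \sum_{\mathbf{k}''}^{\mathrm{out}}V_{\mathbf{k}\mathbf{k}''}\frac{1}{|\epsilon_{\mathbf{k}''}|}V_{\mathbf{k}''\mathbf{k}'} + \sum_{\mathbf{k}''}^{\mathrm{out}}\sum_{\mathbf{k}'''}^{\mathrm{out}}V_{\mathbf{k}\mathbf{k}''}\frac{1}{|\epsilon_{\mathbf{k}''}|}V_{\mathbf{k}''\mathbf{k}'''}\frac{1}{|\epsilon_{\mathbf{k}'''}|}V_{\mathbf{k}'''\mathbf{k}'} + \cdots. \tag{3.5}$$

Note that Eq. (3.4) is quite similar to (2.5) although much simplified since the summation is to be extended only on a small region of momentum space; on the other hand, the original potential $V$ is replaced by the rather awkward expression (3.5). In this respect, we shall find it convenient to use instead of this definition of $U_{\mathbf{k}\mathbf{k}'}$ the equivalent integral equation:

$$U_{\mathbf{k}\mathbf{k}'} = V_{\mathbf{k}\mathbf{k}'} + \sum_{\mathbf{k}''}^{\mathrm{out}}V_{\mathbf{k}\mathbf{k}''}\frac{1}{|\epsilon_{\mathbf{k}''}|}U_{\mathbf{k}''\mathbf{k}'}. \tag{3.6}$$

12 See reference 7, Sec. V.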
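The relation between the series (3.5) and the integral equation (3.6) can be illustrated with a deliberately oversimplified one-channel model. In the sketch below the matrices $V_{\mathbf{k}\mathbf{k}'}$ and $U_{\mathbf{k}\mathbf{k}'}$ are replaced by single numbers $v$ and $u$, and the sum over states outside the shell of $1/|\epsilon_{\mathbf{k}''}|$ by a single positive number $g$; these scalars and the function names are our assumptions, not quantities defined in the paper. With the sign convention of (2.1), positive $v$ is attractive and negative $v$ repulsive.

```python
def effective_potential_series(v, g, order):
    """Partial sum of the scalar analogue of Eq. (3.5): v + v g v + v g v g v + ..."""
    total, term = 0.0, v
    for _ in range(order + 1):
        total += term
        term *= g * v
    return total


def effective_potential_closed(v, g):
    """Solution of the scalar analogue of Eq. (3.6): u = v + v g u, i.e. u = v / (1 - v g)."""
    return v / (1.0 - v * g)


if __name__ == "__main__":
    g = 2.0
    # Weak attraction: the series converges to the closed form.
    print(effective_potential_series(0.2, g, 60), effective_potential_closed(0.2, g))
    # Increasingly strong repulsion: the series diverges term by term,
    # but the closed form stays finite and tends to -1/g.
    for strength in (1e1, 1e3, 1e6):
        print(strength, effective_potential_closed(-strength, g))
```

In this toy model the integral-equation form remains well defined for an arbitrarily strong repulsive $v$, where the term-by-term series is useless, which is the practical reason for preferring (3.6) to (3.5) when the bare potential has a hard core.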

GENERALIZED We notice furthermore that the terms of the second order in sinXk cancel from the expression (2.6) for the ground-state energy, in the limit of small sinXk (that is to say, outside the shell): in

Ea— — £ «k(cosXk+§ sinXk tanXk) k out

-£[M+0(|sinXk|<)].

(3.7)

k

The last term on the right-hand side of (3.7) is of the order of $\Delta^3/\xi^3$ (where $\Delta$ is a measure of the condensation energy near the Fermi level) and must be neglected in the weak-coupling limit. In this "effective potential" scheme, therefore, the binding energy $W_0$ results exclusively from the contribution of transitions inside the shell. We thereby justify the "cutoff" hypothesis of BCS, since we have obtained an equivalent expression for the ground-state energy by replacing the actual potential $V_{kk'}$ by the effective potential $U_{kk'}$, defined by (3.6) if both $k$ and $k'$ are inside the shell $|\epsilon_k| < \xi$ and equal to zero otherwise. This procedure was a necessary step in the helium-3 problem because of the presence of a practically divergent repulsive core, which is to a great extent renormalized out in the integral Eq. (3.6). It is very close to the Bethe-Goldstone scheme, and Eq. (3.6) can indeed be re-expressed in position space and integrated like a Bethe-Goldstone equation.13 It is, in addition, extremely convenient in any weak-coupling case, since it allows the replacing of the energy-dependent $V_{kk'}$ by the substantially energy-independent $U_{kk'}$, thereby leading in general to a soluble integral equation (since this effective potential is factorizable). To elaborate on this last point, let us expand Eq. (3.6) in terms of spherical harmonics, with the help of the "addition theorem" for Legendre polynomials.14 The expansion of $V_{kk'}$ is

$$V_{kk'} = \int e^{-i\mathbf{k}\cdot\mathbf{r}}\,V(r)\,e^{i\mathbf{k}'\cdot\mathbf{r}}\,d\tau = \sum_{l=0}^{\infty}\frac{2\pi^2(2l+1)}{(kk')^{1/2}}\,P_l(\hat k\cdot\hat k')\int_0^\infty J_{l+1/2}(k'r)\,V(r)\,J_{l+1/2}(kr)\,r\,dr = \sum_{l=0}^{\infty}\frac{2l+1}{2}\,V_l(k,k')\,P_l(\hat k\cdot\hat k'), \tag{3.8}$$

where we denote by $\hat k$ the unit vector parallel to $\mathbf{k}$. Here $P_l(u)$ is the (un-normalized) Legendre polynomial of order $l$ and $J_{l+1/2}$ is the Bessel function of order $l+\tfrac12$. In the limit of small $r$, $J_{l+1/2}(kr)$ is approximately proportional to $(kr)^{l+1/2}$; consequently the short-range part of the potential $V(r)$ (inside a range of the order of $k_F^{-1}$) contributes little to the high-order terms of expansion (3.8). This property is particularly significant in the case of a long-range attractive potential with a repulsive core; although the core may be strongly repulsive, the long-range part eventually dominates the higher order terms of (3.8), so that $V_l$ is positive (corresponding to an attractive interaction) for large enough $l$. Similarly, the expansion of $U_{kk'}$ is

$$U_{kk'} = \sum_{l=0}^{\infty}\frac{2l+1}{2}\,U_l(k,k')\,P_l(\hat k\cdot\hat k'). \tag{3.9}$$

These coefficients at large values of $l$ are of course important only because the equations we are to solve are for pairs of particles in the presence of the Fermi sea; the repulsive centripetal barrier, which for ordinary wave equations always ensures that a spherical potential binds $l=0$ states most strongly, is here at least partially overcome by the zero-point kinetic energy of the degenerate gas. Carrying these expressions into (3.6) and replacing the summation by integration over the corresponding domain of momentum space,15 we obtain

$$U_l(k,k') = V_l(k,k') + \frac{1}{4\pi^2}\int_{|\epsilon_q|>\xi} V_l(k,q)\,\frac{1}{|\epsilon_q|}\,U_l(q,k')\,q^2\,dq. \tag{3.10}$$

We can now demonstrate that the divergence due to the repulsive core has been eliminated from the effective potential. For the sake of the argument, let us take $V(r)$ as a square potential of fixed range and arbitrarily large strength $A$. Then $V_l(k,k')$ is proportional to $A$ and (3.10) can only be satisfied if $U_l(k,k')$ is independent of $A$, hence finite. Thus divergent terms cancel each other on the right-hand side of Eq. (3.6) or (3.10), so that $U_l(k,k')$ is finite, even though $V_l(k,k')$ may be formally infinite.

Furthermore, we note that we only need to solve Eq. (3.10) in a very small range of values of $k$ and $k'$: our basic assumption, relevant to the weak-coupling case, is indeed that the thickness $2\xi$ of the domain of summation is much smaller than the Fermi energy, so that the relative variations of the moduli $k$ and $k'$ are

13 H. A. Bethe and J. Goldstone, Proc. Roy. Soc. (London) A238, 551 (1957).
14 If $\theta,\varphi$ and $\theta',\varphi'$ are the polar angles of $\hat k$ and $\hat k'$, then $P_l(\hat k\cdot\hat k') = \frac{4\pi}{2l+1}\sum_{m=-l}^{l} Y_{lm}(\theta,\varphi)\,Y_{lm}^*(\theta',\varphi')$.
15 We assume the system is large enough so that $\sum_k$ is approximately equal to $\int d\tau_k/8\pi^3$. More precisely, we shall replace $\sum_k^{\rm in}$ by $(N_0/2\pi)\int_{|\epsilon|<\xi}d\epsilon\int_{\theta=0}^{\pi}\int_{\varphi=0}^{2\pi}\sin\theta\,d\theta\,d\varphi$ and $\sum_k^{\rm out}$ by [...]

their mean values on the Fermi surface: $U_l(k_F,k_F)$, or $U_l$ for short. The $U_l$ are not well (intrinsically) defined parameters of the system, since their values depend logarithmically upon the arbitrary parameter $\xi$ (we shall show, however, that the energy of the system and the other physical quantities of interest are independent of $\xi$). In order to derive this dependence explicitly, let us assume that the function $U_l(q,k_F)$ of the one variable $q$ is increased by an infinitesimal amount $\delta U_l(q,k_F)$, proportional to $U_l(q,k_F)$:

$$U_l(q,k_F) + \delta U_l(q,k_F) = U_l(q,k_F)\,[1+\delta x].$$

This function is the solution of Eq. (3.10) corresponding to a slightly different value $\xi+\delta\xi$ of the parameter:

$$(1+\delta x)\,U_l = V_l + \frac{1+\delta x}{4\pi^2}\int_{|\epsilon_q|>\xi+\delta\xi} V_l(k_F,q)\,\frac{1}{|\epsilon_q|}\,U_l(q,k_F)\,q^2\,dq. \tag{3.11}$$

Subtracting (3.10) from (3.11) we obtain in first order:

$$\delta U_l = -N_0U_l^2\,d\xi/\xi; \tag{3.12}$$

$$1/N_0U_l = \ln\xi + \text{constant}, \tag{3.13}$$

where $N_0$ is one-half the density of individual-particle states at the Fermi level.

IV. GROUND STATE AT T = 0°K

We can more conveniently rewrite Eq. (3.4) in the form

$$\tan X_k\,e^{i\phi_k} = C(\theta,\varphi)/\epsilon_k, \tag{4.1}$$

thereby separating the angular dependence from the energy dependence. [...]

$$C(\theta,\varphi) = \sum_{l,m} 2\pi U_l\,Y_{lm}(\theta,\varphi)\sum_{k'}\frac{Y_{lm}^*(\theta',\varphi')\,C(\theta',\varphi')}{[\epsilon_{k'}^2 + |C(\theta',\varphi')|^2]^{1/2}}. \tag{4.2}$$

This is a nonlinear integral equation, the nonlinearity resulting from the presence of the term $|C(\theta',\varphi')|^2$ [...]

$$C(\theta,\varphi) = \sum_m \Delta_m Y_{lm}(\theta,\varphi). \tag{4.3}$$

These considerations are borne out by a more refined evaluation of the amount of coupling between different terms of the expansion (3.9) of the effective potential (see Appendix A). One finds that the error involved in solving the simplified equation

$$C(\theta,\varphi) = 2\pi U_l\sum_m Y_{lm}(\theta,\varphi)\sum_{k'}\frac{Y_{lm}^*(\theta',\varphi')\,C(\theta',\varphi')}{[\epsilon_{k'}^2 + |C(\theta',\varphi')|^2]^{1/2}} \tag{4.4}$$

instead of (4.2) is quite small; the amount of other spherical harmonics in a predominantly $l$-type solution of (4.2) never exceeds a few percent and, moreover, the error in the ground-state energy involved in neglecting this slight mixing is very small indeed (0.1%). We shall therefore simplify the problem of finding the ground state by taking into account only the most attractive term of the expansion of the effective potential (which is likely to give the most favorable condensed state). Similarly, one could also argue that different $Y_{lm}$ (for a given $l$) are slightly coupled in Eq. (4.4), and one sees easily that this equation has indeed simple solutions like

$$C(\theta,\varphi) = \Delta_m Y_{lm}(\theta,\varphi). \tag{4.5}$$

For such solutions, (4.4) reduces exactly to the BCS type integral equation:

$$1 = 2\pi U_l\sum_{k'}\frac{|Y_{lm}(\theta',\varphi')|^2}{[\epsilon_{k'}^2 + \Delta_m^2\,|Y_{lm}(\theta',\varphi')|^2]^{1/2}}. \tag{4.6}$$

On the other hand, the perfect decoupling of different $m$ in (4.6) is only the result of the heuristic "Ansatz" (4.5), and there can be no doubt that (4.4) has also mixed solutions like (4.3): for example, the solutions derived from (4.5) by a rotation of the coordinate system. We have therefore performed a complete analysis of this problem for the case $l=2$ (condensed state of helium-3) in Appendix B. After removing the irrelevant degrees of freedom (three for rotational invariance and one for gauge invariance16), we have found that Eq. (4.4) for $l=2$ has at least two independent mixed solutions in addition to the three simple solutions $\Delta_0Y_{20}$, $\Delta_1Y_{21}$, and $\Delta_2Y_{22}$.

At this point, numerical computations are necessary to proceed and choose the most favorable configuration among these solutions. This computation is straightforward enough for the simple solutions (4.5); after integrating (4.6) over the energy and discarding terms of the order of $\Delta_m^2/\xi^2$ (negligible in the weak-coupling case), we obtain

$$\Delta_m = 2\Gamma\xi\exp(-1/N_0U_l), \tag{4.8}$$

where $\Delta_m$ plays the part of the energy gap derived by BCS and where the constant $\Gamma$ is defined by

$$\ln\Gamma = -\tfrac12\int\!\!\int |Y_{lm}(\theta,\varphi)|^2\,\ln|Y_{lm}(\theta,\varphi)|^2\,d\Omega.$$

16 We have seen in Sec. II that the physical situation is not changed by adding the same arbitrary phase angle to all $\phi_k$. Such a gauge transformation is equivalent to multiplying $C(\theta,\varphi)$ by an arbitrary phase factor.
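The constant $\Gamma$ can be evaluated numerically for the simple solutions. The sketch below is not the authors' computation: it assumes the form $\ln\Gamma = -\tfrac12\int |Y_{lm}|^2 \ln |Y_{lm}|^2\, d\Omega$ (what one obtains by integrating (4.6) over the energy in the weak-coupling limit), uses standard normalized spherical harmonics, and a plain midpoint rule in $x=\cos\theta$. The names `Y2`, `ln_gamma` and `TABLE_I` are hypothetical helpers introduced only for this illustration.

```python
import math

# |Y_lm|^2 as a function of x = cos(theta); it does not depend on the azimuth.
# Standard normalized spherical harmonics are assumed.
Y2 = {
    (0, 0): lambda x: 1.0 / (4 * math.pi),
    (1, 0): lambda x: 3.0 / (4 * math.pi) * x * x,
    (1, 1): lambda x: 3.0 / (8 * math.pi) * (1 - x * x),
    (2, 0): lambda x: 5.0 / (16 * math.pi) * (3 * x * x - 1) ** 2,
    (2, 1): lambda x: 15.0 / (8 * math.pi) * x * x * (1 - x * x),
    (2, 2): lambda x: 15.0 / (32 * math.pi) * (1 - x * x) ** 2,
}

# Values of ln(Gamma) as printed in Table I of the text, for comparison.
TABLE_I = {(0, 0): 1.263, (1, 0): 1.048, (1, 1): 1.201,
           (2, 0): 1.020, (2, 1): 1.131, (2, 2): 1.131}


def ln_gamma(l, m, n=100_000):
    """Assumed form: ln(Gamma) = -1/2 * integral of |Y|^2 ln|Y|^2 over the sphere."""
    f2 = Y2[(l, m)]
    h = 2.0 / n                      # step in x = cos(theta) on [-1, 1]
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h     # midpoint of the i-th cell
        w = f2(x)
        if w > 0.0:                  # w*ln(w) -> 0 at the nodes of Y_lm
            total += w * math.log(w)
    # The azimuthal integration contributes a factor 2*pi.
    return -0.5 * 2 * math.pi * total * h


if __name__ == "__main__":
    for key in sorted(TABLE_I):
        print(key, round(ln_gamma(*key), 4), "Table I:", TABLE_I[key])
```

Under these assumptions the computed values agree with the tabulated ones to within a few parts in a thousand, and the near equality of the $Y_{21}$ and $Y_{22}$ entries is reproduced.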

TABLE I. Values of $\ln\Gamma$ for different solutions of Eq. (4.2). The binding energy is proportional to $\Gamma^2$ for a given $U_l$.

                                      l=0     l=1     l=2     l=3
m=0                                   1.263   1.048   1.020   1.010
m=1                                           1.201   1.131   1.090
m=2                                                   1.131   1.123
m=3                                                           1.075
Ground-state configuration (4.12)                     1.154

We have listed the values of $\ln\Gamma$ for the simple solutions $Y_{lm}$ up to $l=3$ in Table I. Note that widely different configurations yield close values of $\Gamma$ and, consequently, correspond to almost equal condensation energies [according to expression (4.13) for the condensation energy]. We wish to get at the same convenient relation (4.8) for mixed solutions also, and therefore we define

$C(\theta,\varphi) =$ [...] (4.10)

$\ln\Gamma = -\tfrac12\int\!\!\int |f(\theta,\varphi)|^2$ [...] (4.11)

We find that the $p$-type ground state is given by the simple solution $Y_{11}$; the $d$-type ground-state configuration, on the other hand, is not a simple solution but rather a mixture of $Y_{20}$ and $Y_{22}$ spherical harmonics, in accordance with a suggestion of Thouless.17 More precisely, the most favorable [...] (4.12)

17 D. Thouless (private communication); see also D. Thouless, Ann. Phys. 10, 553 (1960).

This configuration does indeed yield an energy about 5% lower than the energy corresponding to either of the simple solutions $Y_{21}$ or $Y_{22}$ (see Table I).

Although the expression (4.8) for $\Delta$ formally involves the "cutoff" parameter $\xi$, we see from (3.13) that the dependence upon $\xi$ actually cancels out; $\Delta$ and consequently the ground-state energy are indeed independent of the "cutoff" scheme we have introduced to compute them. Equation (4.13) shows that $\Delta$ must be maximized in order to maximize the ground-state energy. Equations (4.11), (4.10), and (4.8), in turn, show that, in order to do so, we must maximize $-\langle |f|^2\ln|f|^2\rangle_{\rm av}$ relative to $\langle |f|^2\rangle_{\rm av}$ itself. Since the above function is convex downward, this means that we must minimize the total variation of $|f|^2$; essentially, we must make the "gap" spread as uniformly as possible over the Fermi sphere. This concept is very useful in visualizing what is likely to be a good solution and what its properties are. First, of course, $m=0$ and other real solutions are very bad because clearly we can always find another real spherical harmonic which is large when [...]

Individual Particle Excitation Spectrum

As BCS pointed out, the condensation perturbs strongly the distribution of individual-particle states near the Fermi level because of the appearance of an extra "condensation energy" term in the expression for the individual-particle excitation energy:

$$E_k = [\epsilon_k^2 + |C(\theta,\varphi)|^2]^{1/2}. \tag{4.14}$$

For $l=0$ this extra term is a constant $\epsilon_0$; the corresponding spectrum has therefore a gap of width $2\epsilon_0$. In the non-spherically symmetrical condensed states, however, [...]

$$N(E) = \frac{N_0}{4\pi}\int\!\!\int_{\Sigma}\frac{E\,d\Omega}{[E^2 - |C(\theta,\varphi)|^2]^{1/2}}. \tag{4.16}$$

The integration here is extended over the region $\Sigma$ of the sphere where $|C(\theta,\varphi)|$ is smaller than $E$. It is clear from (4.16) that the excitation energy spectrum is altered only near the Fermi level, since the right-hand side reduces to the normal value $N_0$ if $E$ is large compared to $\Delta$. Furthermore, one notices that the integral on the right-hand side is finite, except maybe when $E$ is equal to a relative maximum of [...]

$$|C(\theta,\varphi)| = \epsilon_0\sin\theta.$$

In this special case, (4.16) can be integrated exactly and one finds:

$$N(E) = \frac{N_0|E|}{2\epsilon_0}\,\ln\left|\frac{E+\epsilon_0}{E-\epsilon_0}\right|.$$

The density of states $N(E)$ vanishes like $N_0(E/\epsilon_0)^2$ in the small excitation energy limit, and becomes infinite at $E=\epsilon_0$.

$l=2$:

$$|C(\theta,\varphi)| = \tfrac12\epsilon_0\,[(3\cos^2\theta-1)^2 + 3\sin^2 2\varphi\,\sin^4\theta]^{1/2}.$$

In the small energy limit, the domain of integration $\Sigma$ splits into 8 subdomains $\Sigma_i$ in the neighborhood of the nodes of the function $|C(\theta,\varphi)|$: $\theta=\theta_0$ or $\pi-\theta_0$, [...]

$$N(E) = 8\,\frac{N_0}{4\pi}\int\!\!\int_{\Sigma_i}\frac{E\,d\Omega}{[E^2-|C|^2]^{1/2}}, \tag{4.17}$$

where $|C|$ is approximately proportional to the distance from the node. Transforming (4.17) to a new system of coordinates $\theta'$, [...]

$$N(E) = 8\,\frac{N_0E}{4\pi}\int\!\!\int\frac{d\Omega'}{[E^2 - 2\epsilon_0^2\sin^2\theta']^{1/2}} = 2N_0\left(\frac{E}{\epsilon_0}\right)^2. \tag{4.18}$$

On the other hand, we have seen in Appendix B that [...] integral on the right-hand side of (4.16) is therefore always finite and reaches a maximum (about $1.43N_0$) at $E=\epsilon_0$ (see Fig. 2).

V. THERMODYNAMICS OF THE SYSTEM

Let us now go back to our "occupation space" formalism of Sec. II, and notice that the total energy (2.4) of the system can be written

$$E = -\sum_k \mathbf{E}_k\cdot\boldsymbol{\sigma}_k, \tag{5.1}$$

[...] (5.2)

We can still use here the effective potential $U_{kk'}$ instead of $V_{kk'}$, for the derivation which led to Eq. (3.4) at the absolute zero is very nearly exact at temperatures finite but much smaller than the cutoff energy $\xi$. Since we are only interested in the range of temperature below the critical temperature $T_c$, this condition is fulfilled and we can write instead of (5.2):

[...] (5.3)

Since the potential $U_{kk'}$ is separable, $\mathbf{E}_k$ is indeed the product of a $k$-dependent factor by a (self-consistent) sum over all states $k'$ inside the cutoff. It is all right then to write the total energy in the form (5.1), for each term of the sum on the right-hand side depends only upon its subscript $k$ (there is no hidden dependence upon the other $\boldsymbol{\sigma}_{k'}$, except through the self-consistent field $\mathbf{E}_k$).

FIG. 2. Comparison of the individual-particle excitation energy spectra for normal fluid (constant density of states equal to $N_0$), an s-type condensed fluid ($l=0$) and a d-type condensed fluid ($l=2$).
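The density of states for the gap function $|C| = \epsilon_0\sin\theta$ can be checked by evaluating the angular integral (4.16) numerically and comparing it with the closed form $(N_0E/2\epsilon_0)\ln|(E+\epsilon_0)/(E-\epsilon_0)|$. The following is a minimal sketch, not the authors' procedure: the function names are hypothetical, energies are measured in units of $\epsilon_0$, and a change of variable $x = x_E + (1-x_E)s^2$ (with $x=\cos\theta$) is assumed in order to remove the inverse-square-root singularity at the edge of the region $|C|<E$.

```python
import math


def closed_form(E, eps0=1.0):
    """N(E)/N_0 from the exact integration quoted in the text."""
    e = E / eps0
    return 0.5 * e * math.log(abs((1.0 + e) / (1.0 - e)))


def dos_p_state(E, eps0=1.0, n=20_000):
    """N(E)/N_0 from (4.16) for |C| = eps0*sin(theta), by midpoint quadrature."""
    e = E / eps0
    h = 1.0 / n
    total = 0.0
    if e < 1.0:
        # Only the polar caps with sin(theta) < e contribute.
        x_edge = math.sqrt(1.0 - e * e)          # cos(theta) where |C| = E
        for i in range(n):
            s = (i + 0.5) * h
            x = x_edge + (1.0 - x_edge) * s * s  # substitution removes the singularity
            total += 2.0 * math.sqrt(1.0 - x_edge) / math.sqrt(x + x_edge)
    else:
        # The whole sphere contributes; the integrand is regular for e > 1.
        for i in range(n):
            x = (i + 0.5) * h
            total += 1.0 / math.sqrt(x * x + e * e - 1.0)
    return e * total * h


if __name__ == "__main__":
    for E in (0.01, 0.5, 0.9, 2.0, 50.0):
        print(E, dos_p_state(E), closed_form(E))
```

With these assumptions the quadrature reproduces the closed form on both sides of $E=\epsilon_0$, the $(E/\epsilon_0)^2$ behavior at small energy, and the approach to the normal value $N_0$ at large energy.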

We shall consider first for guidance the case of the Fermi distribution, for which the pseudo-field $\mathbf{E}_k$ reduces to the $w$ component $\epsilon_k$, and $\boldsymbol{\sigma}_k$ is equal to $+1$ and parallel to the $w$ axis (corresponding to full occupancy of the pair $k$, $-k$ below the Fermi level and both $k$ and $-k$ empty above the Fermi level). There are altogether four possible occupation-number combinations of the individual-particle states $k$, $-k$: the ground state, two single-particle excited states (one occupied, one empty) and one excited pair state (two holes below the Fermi level, or two excited particles above the Fermi level). The same considerations apply to the condensed state, where the pseudo-field is a vector of modulus $E_k$ [given by (4.14)], not parallel to the $w$ axis in general. The reduced Hamiltonian (2.2) simply does not act on the two single-particle states, so that the pseudo-field $\mathbf{E}_k$ is ineffective and they simply act like two extra states of the pseudo-spin with energy 0. The occupation vector [...] straightforward to compute the "sum-over-states" $Z$ and the related thermodynamic quantities for each subsystem $k$, $-k$ and consequently the over-all thermodynamic behavior of the system.

TABLE II. States of the pseudo-spin vector.

State                              sigma_k (parallel to E_k)    Energy
Ground pair                        +1                           -E_k
Single-particle excitation: k       0                            0
Single-particle excitation: -k      0                            0
Excited pair                       -1                           +E_k

Critical Temperature

The thermodynamic average of [...]

$$\langle\sigma_k\rangle = \frac{e^{\beta E_k} - e^{-\beta E_k}}{e^{\beta E_k} + e^{-\beta E_k} + 2} = \tanh(\beta E_k/2), \tag{5.4}$$

where $\beta$ is $1/kT$. Consequently, Eq. (3.4) must be replaced by

$$\tan X_k\,e^{i\phi_k} = \frac{1}{\epsilon_k}\sum_{k'}^{\rm in} U_{kk'}\sin X_{k'}\,e^{i\phi_{k'}}\tanh(\beta E_{k'}/2). \tag{5.5}$$

The critical temperature $T_c$ is the temperature above which no solution [...] linear equation for which the superposition principle is valid: all simple solutions $Y_{lm}$ (for a given $l$ and various $m$) and combinations of simple solutions are equivalent energy-wise. Equation (5.5) does indeed split into $2l+1$ equivalent equations:

$$1 = 2\pi U_l\sum_{k'}|Y_{lm}(\theta',\varphi')|^2\,\frac{1}{\epsilon_{k'}}\tanh(\beta_c\epsilon_{k'}/2). \tag{5.6}$$

After performing the angular integration, we find the BCS equation

$$1 = N_0U_l\int_0^{\xi}\tanh(\beta_c\epsilon/2)\,\frac{d\epsilon}{\epsilon}, \tag{5.7a}$$

or

$$kT_c = 1.14\,\xi\exp(-1/N_0U_l). \tag{5.7b}$$

Again, it is straightforward to verify that the formal dependence of $T_c$ upon the cutoff parameter $\xi$ actually cancels out by virtue of (3.13). It is interesting to compare this critical temperature with the "gap" parameter $\epsilon_0$ defined in the previous section. We have listed the values of the dimensionless ratio $2\epsilon_0/kT_c$ for a few different angular configurations (Table III).

TABLE III. Critical temperatures and specific heats of some generalized BCS states.

Configuration C(theta,phi)    2 eps_0 / kT_c    (C - gamma T_c)/gamma T_c at T = T_c
l=0                           3.50              1.55
l=1, m=1                      4.03              1.29
l=2, m=2                      4.20              1.085
d state (4.12)                4.93
l=3, m=2                      4.26              1.055

Specific Heat

The total energy of the system at finite temperature is evidently

$$E = -\sum_k E_k\,\frac{e^{\beta E_k} - e^{-\beta E_k}}{e^{\beta E_k} + e^{-\beta E_k} + 2} = -\sum_k E_k\tanh(\beta E_k/2), \tag{5.8}$$

and the specific heat $C$ is therefore

$$C = \frac{dE}{dT} = 2k\beta^2\sum_k\frac{e^{\beta E_k}}{(1+e^{\beta E_k})^2}\left(E_k^2 + \beta E_k\frac{dE_k}{d\beta}\right). \tag{5.9}$$

In the low-temperature limit, the modulus $\Delta$ of [...]

$$C_{T\to0} = 2k\beta^2\sum_k\frac{E_k^2\,e^{\beta E_k}}{[1+e^{\beta E_k}]^2}. \tag{5.10}$$

It is straightforward to see that (5.10) gives the expected result for the normal Fermi fluid, if $E_k$ is replaced by $\epsilon_k$:19

$$C_n = 4J_2k^2N_0T = \gamma T. \tag{5.11}$$

For a condensed system, however, the excitation energy spectrum is strongly perturbed near the Fermi level and a quite different temperature dependence is expected. One finds, respectively:

p-Type Condensed State ($l=1$, $m=1$): [...]

d-Type Condensed State [State (4.12)]: Since the statistical factor on the right-hand side of (5.9) or (5.10) is strongly peaked at $E_k=0$, the domain of integration practically splits into the 8 subdomains $\Sigma_i$ near the nodes of [...]

It must be emphasized that the above results have been derived by taking into account only the contribution of individual-particle excitation modes of the system and neglecting all collective excitations. This procedure is rigorous in the case of superconducting electrons only, because the long-range Coulomb repulsive force prevents in this case the existence of low-energy collective excitation modes lying in the gap. It is not so for a system of neutral particles (with a short-range interaction), and it has indeed been shown by Anderson7 that a neutral Fermi gas condensed into an s-type state has low-energy excitation modes corresponding to long-wavelength (small wave vector $q$) longitudinal waves:

$$\omega = \frac{1}{\sqrt3}\,\frac{k_Fq}{m}. \tag{5.13}$$

On the other hand, the spectrum of collective excitation modes in the case of helium-3 (anisotropic d-type condensed state) is certainly more complicated than the simple case of the isotropic neutral Fermi gas. In addition to longitudinal waves, there will be transverse or "rotational" waves corresponding to a gradual rotation, in space, of the anisotropy axis of the condensed-state configuration. These rotational modes are rather analogous to spin waves in the usual ferromagnets, and one expects therefore that their frequency may be proportional to the square of their wave vector $q$; furthermore, one can estimate the energy necessary to cause this gradual rotation by noticing that all condensation energy will be lost if a $2\pi$ rotation takes place within a distance of the order of the cohesion length $r_c$ (see next section). The energy spectrum of the rotational modes will therefore be very approximately

$$\omega \simeq \Delta\,r_c^2\,q^2. \tag{5.14}$$

It is straightforward to show that the longitudinal modes (5.13) give a small contribution to the specific heat:

$$C_{\rm long} = 0.04\,(\gamma T_c)(T/T_c)^3, \tag{5.15}$$

of the order of 1% of the contribution (5.12) of individual-particle excitations. On the other hand, the total energy of the rotational modes at the temperature $T$ is the statistical average:

$$E_{\rm rot} = \int\frac{\omega\,dn}{e^{\beta\omega}-1}, \tag{5.16}$$

where $n$ is the number of modes per unit volume with an energy smaller than $\omega$: $n = q^3/6\pi^2$. We obtain from relation (5.14):

$$E_{\rm rot} = \frac{1}{4\pi^2\Delta^{3/2}r_c^3}\int_0^\infty\frac{\omega^{3/2}\,d\omega}{e^{\beta\omega}-1} = \frac{1.8\,(kT)^{5/2}}{4\pi^2\Delta^{3/2}r_c^3},$$

and therefore:

$$C_{\rm rot} = \frac{dE_{\rm rot}}{dT} = \frac{4.5\,k}{4\pi^2r_c^3}\left(\frac{kT}{\Delta}\right)^{3/2}. \tag{5.17}$$

This contribution to the specific heat vanishes like $T^{3/2}$ only and will therefore eventually dominate in the low-temperature limit; comparing the expressions (5.12) and (5.17), and noting that the cohesion length is of the order of $k_F^{-1}\epsilon_F/\epsilon_0$, one finds in the case of helium-3:

[...] (5.18)

This ratio is of the order of unity only when the temperature $T$ is of the order of $10^{-3}T_c$ and, consequently, the contribution to the specific heat from these rotational collective modes becomes important in a range of temperature well below the transition temperature and also well below the temperature which could conceivably be attained by present-day cryogenic techniques. These estimates of the contribution of collective excitation modes of the system justify then our claim that they are negligible in the interesting range of temperatures.

It is also interesting to evaluate the specific heat near the critical temperature $T_c$. In this limit, one can easily derive the expression for $d(E_k^2)/d\beta$ from Eq. (5.5) if one assumes that the configuration $f(\theta,\varphi)$ of the

18 N. N. Bogoliubov, V. V. Tolmachev and D. V. Shirkov, A New Method in the Theory of Superconductivity (Academy of Sciences of U.S.S.R., Moscow, 1958).
19 We have used here the definite integrals: $J_n = \int_0^\infty\frac{t^ne^t\,dt}{(1+e^t)^2} = \int_0^\infty\frac{nt^{n-1}\,dt}{1+e^t}$. The first four $J_n$ are: $J_1 = 0.693$; $J_2 = 1.64$; $J_3 = 5.4$, and $J_4 = 23$.

stable state does not change; that is to say,

$$\frac{d(E_k^2)}{d\beta} = |f(\theta,\varphi)|^2\,\frac{d\Delta^2}{d\beta}. \tag{5.19}$$

Using the result derived by BCS in the $l=0$ case, one finds

[...] 10.2 [...] (5.20)

We have listed the values of the dimensionless ratio $(C-\gamma T_c)/\gamma T_c$, which measures the relative magnitude of the specific heat discontinuity at the critical temperature, for the simple configurations $\Delta_mY_{lm}(\theta,\varphi)$ up to $l=3$ (Table III). We would expect the specific heat discontinuity to be of the same order for the ground-state configuration (4.12) as for the $Y_{22}$ configuration; this, however, is only of academic interest, since it appears that the actual transition of liquid helium-3 into a condensed state certainly cannot take place at the computed critical temperature $T_c$ (see Sec. VIII).

Paramagnetic Susceptibility

Since the condensation into an even-$l$ configuration is based on the formation of pairs of particles with antiparallel spins, one expects that the freedom of the individual-particle spins and, therefore, the over-all paramagnetic susceptibility are reduced in the condensed state. At the absolute zero, the spins have no freedom at all since all particles must be paired and the susceptibility must vanish. On the other hand, condensation into an odd-$l$ configuration leaves the spin-up and spin-down populations uncoupled (exactly like the Fermi state), and therefore the susceptibility of a p-type or f-type condensed fluid is essentially the same as the susceptibility of a normal fluid.20 Let us then restrict our attention to the case of antiparallel pairing. The excitations of the subsystem $k\uparrow$, $-k\downarrow$ are still those of Table II but, in this case, the degeneracy of the two single-particle excitations is lifted by the paramagnetic interaction; their energies are now $+\mu H$ and $-\mu H$, respectively ($\mu$ is the nuclear magneton in the case of helium-3). The average paramagnetic moment at the temperature $T$ is then

$$M_H = \mu\sum_k\frac{\sinh(\beta\mu H)}{\cosh(\beta E_k) + \cosh(\beta\mu H)}. \tag{5.21}$$

This expression is identical to the expression derived by Nosanov and Vasudevan21 by Bogoliubov's method. In the weak-field limit, (5.21) reduces to the expression found by Yosida for the special case $l=0$:22

$$M_H = \chi H = 2\mu^2\beta H\sum_k\frac{e^{\beta E_k}}{(e^{\beta E_k}+1)^2}. \tag{5.22}$$

It must be emphasized that, in view of the smallness of the nuclear magneton, one is practically always in the weak-field limit: fields of the order of $10^6$ oersteds must still be considered as "weak" at 0.01°K. Again, the summation on the right-hand side of (5.22) involves a statistical factor strongly peaked at $E_k=0$, and therefore the integration can be performed by the same technique as for the specific heat (5.10). One finds that the paramagnetic susceptibility vanishes in the low-temperature limit like $1.08(T/T_c)^2$, as expected (see Fig. 3).

FIG. 3. Comparison of the variation of the paramagnetic susceptibility versus temperature for an s-type condensed fluid ($l=0$) and a d-type condensed fluid ($l=2$).

VI. DENSITY AND CURRENT-DENSITY CORRELATION IN THE GROUND STATE

One expects, in the case of the weakly interacting condensed fluid we are considering, that the condensation is a rather long-range cooperative phenomenon, involving the coherent ordering of the momenta of a large number of particles; one expects, in other words, that the condensed state is characterized by a special long-range correlation between the particles. Furthermore, the average particle current around a given particle does not need to vanish everywhere as in a

20 The susceptibility may be slightly increased for an odd-$l$ condensed state, if the coupling is strong enough to yield a sizeable increase of the condensation energy when the two spin populations become unequal.
21 L. H. Nosanov and R. Vasudevan, Phys. Rev. Letters 6, 1 (1961).
22 K. Yosida, Phys. Rev. 110, 769 (1958).

normal fluid or an isotropic condensed fluid, since the usual symmetry argument does not apply in the case of the anisotropic configurations under discussion. It is therefore instructive to evaluate the average particle density and the average current density versus the distance from a given particle, that is to say, the mean values of the operators $n(\mathbf{x})n(\mathbf{x}')$ and $n(\mathbf{x})\mathbf{j}(\mathbf{x}')$ in the ground state; $n$ and $\mathbf{j}$ are the usual density and current density operators in second quantization:

$$n = \Psi^*\Psi, \tag{6.1}$$

$$\mathbf{j} = -(i\hbar/2)\,[\Psi^*\nabla\Psi - (\nabla\Psi^*)\Psi]. \tag{6.2}$$

The ground state is the vector $\Psi_0 = \prod_k[\cos(X_k/2)$ [...] consider here the case of antiparallel pairing; the same results would be found in the case of parallel pairing, only with the opposite spin combination. Expanding the operators $n$ in Fourier series, we obtain immediately:

$$\langle n_\sigma(\mathbf{x})\,n_{\sigma'}(\mathbf{x}')\rangle = \sum_{k_1k_2k_3k_4}\langle\Psi_0|c^*_{k_1\sigma}c_{k_2\sigma}c^*_{k_3\sigma'}c_{k_4\sigma'}|\Psi_0\rangle\,e^{i(\mathbf{k}_2-\mathbf{k}_1)\cdot\mathbf{x}}\,e^{i(\mathbf{k}_4-\mathbf{k}_3)\cdot\mathbf{x}'}. \tag{6.3}$$

We are only interested in the space average of this correlation function, which depends only upon the relative distance $\mathbf{r} = \mathbf{x}'-\mathbf{x}$ and the spins:

$$\langle n_\sigma n_{\sigma'}(\mathbf{r})\rangle_{\rm av} = \sum_{k_1k_2k_3k_4}\langle\Psi_0|c^*_{k_1\sigma}c_{k_2\sigma}c^*_{k_3\sigma'}c_{k_4\sigma'}|\Psi_0\rangle\,e^{i(\mathbf{k}_4-\mathbf{k}_3)\cdot\mathbf{r}}\,\delta(\mathbf{k}_1-\mathbf{k}_2+\mathbf{k}_3-\mathbf{k}_4). \tag{6.4}$$

In the case of antiparallel pairing, we obtain in a straightforward fashion:

$$\langle n(\uparrow)n(\uparrow)\rangle_{\rm av} = n\,[n + P_F(r)],\qquad \langle n(\uparrow)n(\downarrow)\rangle_{\rm av} = n^2 + \tfrac14\Big[\sum_k\sin X_k\,e^{i(\mathbf{k}\cdot\mathbf{r}+\phi_k)}\Big]\Big[\sum_{k'}\sin X_{k'}\,e^{-i(\mathbf{k}'\cdot\mathbf{r}+\phi_{k'})}\Big]. \tag{6.5}$$

Here $n$ is the total density of particles of one spin (up or down) and $P_F(r)$ is the exchange correlation or "Fermi hole." Note that the above results include the correlation effects due to the statistics of the particles and to the condensation, but not the spin-independent "correlation hole" due to the direct interaction. This many-body effect is, however, included in the formalism from the very beginning, since $c_k^*$ creates a quasi-particle, that is to say, a particle dressed by its cloud of correlated particles. We have implicitly assumed that this rather short-range many-body effect, which determines the effective mass of the quasi-particles, is essentially unaffected by the transition into a condensed state. In other words, we are specifically considering the case where the condensation is a long-range phenomenon in which the close region (in position space) does not play a major role: the short-range "correlation hole" and the long-range correlation due to condensation:

$$\langle nn\rangle = \tfrac14\Big|\sum_k\sin X_k\,e^{i(\mathbf{k}\cdot\mathbf{r}+\phi_k)}\Big|^2, \tag{6.6}$$

are well decoupled in position space. Let us remember that this correlation (6.6) affects particles with opposite spins in the case of antiparallel pairing and particles with the same spin for parallel pairing. In any case, we see that this extra term in expression (6.5) is the contribution of the terms $k_1=-k_3=k$ and $k_2=-k_4=k'$ in the sum on the right-hand side of (6.4). The same terms also provide a finite contribution to the density-current correlation:

$$\langle n\mathbf{j}\rangle = \frac{\hbar}{8}\Big\{\Big[\sum_k\mathbf{k}\sin X_k\,e^{i(\mathbf{k}\cdot\mathbf{r}+\phi_k)}\Big]\Big[\sum_{k'}\sin X_{k'}\,e^{-i(\mathbf{k}'\cdot\mathbf{r}+\phi_{k'})}\Big] + {\rm c.c.}\Big\} = \frac{\hbar}{8}\{\mathbf{J}I^* + \mathbf{J}^*I\}, \tag{6.7}$$

where we have defined the integrals:

$$I(\mathbf{r}) = \sum_k\sin X_k\,e^{i(\mathbf{k}\cdot\mathbf{r}+\phi_k)}, \tag{6.8a}$$

$$\mathbf{J}(\mathbf{r}) = \sum_k\mathbf{k}\sin X_k\,e^{i(\mathbf{k}\cdot\mathbf{r}+\phi_k)}. \tag{6.8b}$$

We note that the second integral is the gradient of the first multiplied by $(-i)$, so that we shall only need to compute $I(\mathbf{r})$. Let us first remark that if $I(\mathbf{r})$ is written

$$I(\mathbf{r}) = |I(\mathbf{r})|\,e^{i\tau(\mathbf{r})}, \tag{6.9a}$$

then:

$$\mathbf{J}(\mathbf{r}) = -i\,e^{i\tau(\mathbf{r})}\nabla_r|I(\mathbf{r})| + I(\mathbf{r})\nabla_r\tau(\mathbf{r}). \tag{6.9b}$$

Since the first term of this expression of $\mathbf{J}(\mathbf{r})$ cancels in (6.7), we obtain thereby a general relation between the correlation functions $\langle nn\rangle$ and $\langle n\mathbf{j}\rangle$:

$$\langle n\mathbf{j}\rangle = \tfrac14\hbar|I|^2\nabla_r\tau(\mathbf{r}) = \hbar\langle nn\rangle\nabla_r\tau(\mathbf{r}). \tag{6.10}$$

This relation is really all we need to know if the condensed state under consideration has rotational symmetry around its anisotropy axis $Oz$ (case of a simple $lm$ configuration, such as the ground state for p-type condensation). Indeed one sees directly from the expression (6.8a), by performing the rotation in momentum space which brings the $Ozx$ plane onto the radius vector $\mathbf{r}$, that the phase factor of $I(\mathbf{r})$ is simply [...]

of equal swept areas (velocity decreasing as $1/r$). The total angular momentum of this correlation current is therefore proportional to the total number of correlated particles $n_c$:

$$n_c = \tfrac14\int\!\!\int\!\!\int\sin X_k\sin X_{k'}\,e^{i(\phi_k-\phi_{k'})}\,e^{i(\mathbf{k}-\mathbf{k}')\cdot\mathbf{r}}\,d\tau_r\,\frac{d\tau_k}{8\pi^3}\,\frac{d\tau_{k'}}{8\pi^3} = \tfrac14\int\!\!\int\sin X_k\sin X_{k'}\,e^{i(\phi_k-\phi_{k'})}\,\delta^3(\mathbf{k}-\mathbf{k}')\,\frac{d\tau_k\,d\tau_{k'}}{8\pi^3} = \tfrac14\int\sin^2X_k\,\frac{d\tau_k}{8\pi^3}. \tag{6.12}$$

A straightforward integration shows that $n_c$ is of the order of $N_0\epsilon_0$, that is to say of the order of $\epsilon_0/\epsilon_F$ times the total density of particles $n$. The total angular momentum of the correlation currents in a volume $V$ of the condensed fluid is in this case:

$$L = m\hbar n_cV \simeq m\hbar(\epsilon_0/\epsilon_F)\,nV. \tag{6.13}$$

We notice, however, that this spontaneous circulation in the ground state is not a volume effect [contrary to what expression (6.13) seems to suggest] because the correlation current around one particle inside the fluid is exactly canceled by the currents around the neighboring particles. On the other hand, this cancellation does not obtain on the boundary of the sample and, if the same anisotropy axis is retained on the surface of the fluid (completely ordered state, presumably stable at absolute zero), these uncanceled correlated particle currents add up to form a sheet current on the surface, the total angular momentum of which is just $L$ as given by the relation (6.13).

This discussion is not applicable to the true $l=2$ ground state (4.12), however, since the corresponding angular dependent energy gap function $C(\theta,\varphi)$ has no rotational symmetry. We need therefore to compute $\mathbf{J}(\mathbf{r})$ explicitly. The interesting region in position space is the long-range region, where $k_Fr$ is much larger than 1; this suggests computing $I(\mathbf{r})$ by the so-called stationary phase technique (see Appendix C). We have found:

$$I(\mathbf{r}) = \frac{2N_0\cos(k_Fr)}{[\ldots]}\,C(\hat r)\,K_0\!\left(\frac{r|C(\hat r)|}{\hbar v_F}\right), \tag{6.14}$$

where $K_0$ is the modified Bessel function of the second kind and order zero. This remarkable result shows that the long-range correlation $\langle nn\rangle$ in a given direction depends only upon the value of the gap function [...] the order of $\hbar v_F/|C|$. The range of the correlation function $\langle nn\rangle$ is therefore anisotropic and extends to infinity in the direction of the nodes of $|C(\hat r)|$. We may, however, define an average coherence length $r_c$:

$$r_c \simeq \frac{\hbar v_F}{\Delta} \simeq \frac{\epsilon_F}{\epsilon_0}\,k_F^{-1}. \tag{6.15}$$

Going back to the general expression (6.10) of the correlated particles' current density, we note that the only complex factor in the expression (6.14) of $I(\mathbf{r})$ is $C(\hat r)$; consequently, the correlation current has no radial component. It is straightforward to compute the tangential components from (6.14) for any gap function [...]

$$\langle nj\rangle_\varphi = \hbar\langle nn\rangle\,[\ldots]\,\frac{\sin^2\theta\,(3\cos^2\theta-1)\cos2\varphi}{16\pi|f|^2},\qquad \langle nj\rangle_\theta = \hbar\langle nn\rangle\,[\ldots]\,\frac{\sin2\theta\,\sin2\varphi}{16\pi|f|^2}. \tag{6.16}$$

We have sketched the correlation current pattern on a sphere for this configuration (Fig. 4). Note that the current flows around the nodes of the gap function $C(\hat r)$ and vanishes at both the nodes and the maxima of $C(\hat r)$. These currents add up in such a way that the whole distribution has no net angular momentum about any [...]

FIG. 4. Correlation current pattern at a fixed distance from a particle for the d-type ground-state configuration (4.12). The points N are the nodes of the angle-dependent gap function $|C(\hat r)|$; M are its maxima.

142

P.

1924

W. A N D E R S O N

subtle character that it may be rather difficult to detect it in physical measurements.

VII. FLOW PROPERTIES, SUPERFLUIDITY

As we mentioned in the Introduction, the experimental investigation of liquid helium-3 properties has not yet disclosed any departure from normal Fermi liquid behavior; in particular, it has been found that the flow properties of helium-3 are those of a normal fluid, that is to say, helium-3 does not participate in the superfluid flow of helium-4 in mixtures of both isotopes.23 If, however, a condensation of the type we have been discussing above does indeed occur at very low temperature, one expects that liquid helium-3 is not a normal fluid near absolute zero. The analogy with the Meissner effect of superconducting electrons condensed into the l=0 configuration indicates that condensed helium-3 would be superfluid. We shall show by a straightforward generalization of the BCS derivation of the Meissner effect that the alteration of the excitation energy spectrum brought about by the condensation entails superfluidity, in accordance with the suggestion of Brueckner et al.3 The same result has been found by Glassgold and Sessler by a somewhat different method.8

The most convenient theoretical approach to superfluidity is to consider the flow of the fluid in a rotating container. Since we do not know how to treat the interaction between the fluid and the rotating wall of the container in the laboratory coordinate system, we shall transform our problem into the rotating coordinate system in which the container is at rest; as Blatt et al.24 pointed out, we have to perform this canonical transformation to rotating coordinates in order to do statistical mechanics at all. This, of course, is identical to the "cranking model" procedure proposed by Inglis25 for nuclei. The new (transformed) Hamiltonian is

$$H' = H - \boldsymbol{\omega}\cdot\mathbf{L}, \tag{7.1}$$

where L is the total angular momentum of all particles. Note that H' is not numerically equal to the Hamiltonian H of the system at rest; the extra term is a perturbation of the first order in the rotation speed ω, corresponding to the Coriolis forces. Using the relation:

$$\boldsymbol{\omega}\cdot(\mathbf{r}_i\times\nabla_i) = (\boldsymbol{\omega}\times\mathbf{r}_i)\cdot\nabla_i, \tag{7.2}$$

the perturbation can be written

$$-\boldsymbol{\omega}\cdot\mathbf{L} = i\hbar\sum_{\text{all particles}}(\boldsymbol{\omega}\times\mathbf{r}_i)\cdot\nabla_i. \tag{7.3}$$

Let us introduce the vector field A(r), with A as the speed of a point attached to the rotating coordinate system with respect to the fixed coordinate system:

$$\mathbf{A}(\mathbf{r}) = \boldsymbol{\omega}\times\mathbf{r}. \tag{7.4}$$

In second quantization, the perturbation to the Hamiltonian is then

$$H_R = i\hbar\int\Psi^*(\mathbf{r})\,[\mathbf{A}(\mathbf{r})\cdot\nabla]\,\Psi(\mathbf{r})\,d\tau = -(2\pi)^{3/2}\,\hbar\sum_{\mathbf{k},\mathbf{q},\sigma}\mathbf{k}\cdot\mathbf{a}(\mathbf{q})\,c^*_{\mathbf{k}+\mathbf{q},\sigma}c_{\mathbf{k}\sigma}, \tag{7.5}$$

where we have used the plane wave expansions of the operators Ψ and Ψ*, and where a(q) is the (formal) Fourier transform of A(r). This procedure is not strictly correct since the plane waves fail to satisfy the boundary conditions of the system (cylindrical container) and a Fourier-Bessel expansion of the operators Ψ and Ψ* should be used instead. We expect, however, that the conclusions we shall draw from the comparison of the (formal) Fourier transform of the current operator j(r) with the (formal) Fourier transform of the speed field A(r) will be strongly indicative of the exact behavior of the model. Let

$$\mathbf{j}(\mathbf{r}) = \frac{\hbar}{2m}\sum_{\mathbf{k},\mathbf{q},\sigma}(2\mathbf{k}+\mathbf{q})\,e^{-i\mathbf{q}\cdot\mathbf{r}}\,c^*_{\mathbf{k}+\mathbf{q},\sigma}c_{\mathbf{k}\sigma} \tag{7.6}$$

be the expansion of the current density operator (6.2). Since the expectation value of the current vanishes in the system at rest, this expectation value in the rotating system is, in the first order in ω,

$$\mathbf{j}'(\mathbf{r}) = \sum_i\frac{\langle\Psi_0|H_R|\Psi_i\rangle\langle\Psi_i|\mathbf{j}|\Psi_0\rangle}{E_0-E_i} + \text{c.c.}, \tag{7.7}$$

where Ψ_0 is the ground-state wave function at the temperature T (energy E_0):

$$\Psi_0 = \prod_{\text{ground pairs}}\Bigl(\cos\frac{\chi_k}{2}\,e^{i\psi_k/2} + \sin\frac{\chi_k}{2}\,e^{-i\psi_k/2}\,b_k^*\Bigr)\prod_{\text{excited pairs}}\Bigl(\cos\frac{\chi_{k'}}{2}\,e^{-i\psi_{k'}/2}\,b_{k'}^* - \sin\frac{\chi_{k'}}{2}\,e^{i\psi_{k'}/2}\Bigr)\prod_{\text{single particles}}\bigl(c^*_{k''\sigma}\bigr)\,|\Psi_{\text{vac}}\rangle. \tag{7.8}$$

23 J. G. Daunt, R. E. Probst, H. L. Johnson, and L. T. Aldrich, Phys. Rev. 72, 502 (1947), and J. Chem. Phys. 15, 759 (1947).
24 J. M. Blatt, S. T. Butler, and M. R. Schafroth, Phys. Rev. 100, 481 (1955).
25 D. R. Inglis, Phys. Rev. 96, 1059 (1954).
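The rewriting of the rotational perturbation rests on the vector identity ω·(r×∇) = (ω×r)·∇ quoted in this section. As a minimal numerical sketch (not part of the original paper), the identity can be checked on random real vectors; here `g` is a hypothetical stand-in for the gradient of a wave function at the point `r`, and all function names are illustrative.

```python
import random

def cross(a, b):
    """Cross product of two 3-vectors given as sequences."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def max_identity_error(trials=1000, seed=0):
    """Largest |omega.(r x g) - (omega x r).g| over random real vectors."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        omega = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        r = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        g = [rng.uniform(-1.0, 1.0) for _ in range(3)]  # stands in for grad(psi) at r
        lhs = dot(omega, cross(r, g))
        rhs = dot(cross(omega, r), g)
        worst = max(worst, abs(lhs - rhs))
    return worst

print(max_identity_error())
```

The printed value should be at the level of floating-point rounding, reflecting that the two sides are the same scalar triple product.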

The summation in (7.7) extends over all excited states Ψ_i (energy E_i). Fortunately, states corresponding to different anisotropy axes are orthogonal,26 so that the angular degeneracy can be disregarded. We remark furthermore that the perturbation H_R due to the rotation of the coordinate system is formally identical to the magnetic perturbation considered by BCS. We shall therefore obtain a formally identical result:

26 The scalar product of two states Ψ_i and Ψ_i′, corresponding to different z axes, is an infinite product of factors including at least one zero (for the individual particle state k such that χ_k = χ_k′ = π/2, ψ_k = ψ_k′ ± π).

j'(r)=—(2T)I E 2m

(2k+q)^--'[a(-q).k]

lim (7.9)

where L(k,k') = 2Z—1 i

—^-.

(7.10)

Eo-Ei

Even for nonspherically symmetric distributions, L(k,k') has a simple expression in the limit k'—* k: e0Et

limL(k,k') = dEk

(1+e^k) 2

limj'(q)= £k[a(q)-k> . *-* m k (l+e<«k) 2

2

2

(1+e^k) 2

(7.14)

2a*£ cos 0.

Because the statistical factor is strongly peaked at E_k = 0 in the low-temperature limit, the domain of integration practically reduces to the eight subdomains S_i near the nodes of |C(θ,φ)| (see the last paragraph of Sec. IV). Replacing k² sin²θ and k² cos²θ by their average values at the node, respectively (2/3)k_F² and (1/3)k_F², we can therefore replace (7.14) by limj'(q) = 4/S—a(q) I dU I d (l+e^k)2 3irs J Joo = 47 2 «a(q)(*r/e u ) 2

(7.12)

Since the statistical factor is strongly peaked at ε = 0, the integration here reduces practically to an integration in a shell near the Fermi surface. For a normal Fermi fluid, we can replace k[a(q)·k] by the angular average (k_F²/3)a(q), and we find the expected result: e^de limj n '(q) = 2/J— a(q) f = »a(q). (l+e«<)2 «-° 3^ Jft

M

ciyk1 sin20

(7.15)

or, more conveniently:

0B

e ^

4ir3

e***

I dO I de

(7.11)

The (formal) Fourier transform of the current is then in the long-wavelength limit: 2¥B


but the following terms cancel upon integration over
k,q

XL(k,k+q),


(7.13)
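The normal-fluid result relies on the statistical factor e^{βε}/(1+e^{βε})² being sharply peaked at ε = 0 and, once multiplied by β, integrating to unity over the shell. The sketch below checks this numerically and also checks that the resulting coefficient equals the particle density n. The free-gas density of states per spin, N_0 = m k_F/(2π²ħ²), and units with ħ = m = 1 are assumptions of this illustration, not statements taken from the paper.

```python
import math

def thermal_weight(beta, eps):
    """beta*e^{beta*eps}/(1+e^{beta*eps})^2, written as beta/(4 cosh^2(beta*eps/2))."""
    return beta / (4.0 * math.cosh(0.5 * beta * eps) ** 2)

def integrate_weight(beta, half_width=40.0, steps=20000):
    """Trapezoidal integral of the weight over |eps| <= half_width/beta."""
    a = -half_width / beta
    h = 2.0 * half_width / beta / steps
    total = 0.5 * (thermal_weight(beta, a) + thermal_weight(beta, -a))
    for i in range(1, steps):
        total += thermal_weight(beta, a + i * h)
    return total * h

def free_gas_density(k_f):
    """Particle density of a free spin-1/2 Fermi gas, n = k_F^3 / (3 pi^2)."""
    return k_f ** 3 / (3.0 * math.pi ** 2)

def normal_current_coefficient(k_f, m=1.0, hbar=1.0):
    """(2 hbar^2 / m) * (k_F^2 / 3) * N_0, with N_0 the assumed free-gas density of states per spin."""
    n0 = m * k_f / (2.0 * math.pi ** 2 * hbar ** 2)
    return 2.0 * hbar ** 2 * k_f ** 2 * n0 / (3.0 * m)

for beta in (0.5, 10.0, 200.0):
    print(beta, integrate_weight(beta))
print(normal_current_coefficient(1.3), free_gas_density(1.3))
```

Under these assumptions the weight integrates to one for every β, so the shell integration returns the coefficient n multiplying a(q), independent of temperature.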

This result (7.13) means, of course, that the thermodynamically stable state of a fluid in a rotating container is such that the fluid is at rest with respect to the container, that is to say, rotates at the same speed as the container:

j_n(r) = nA(r). (7.13a)

We shall show that it is not so for a condensed fluid and, more precisely, that the statistical average j′ of the current density vanishes in the low-temperature limit for finite rotation speed of the container; in other words, the fluid at large (i.e., as far as large-scale motions, corresponding to the long-wavelength Fourier components, are concerned) does not participate in the motion of the container at 0°K. In accordance with the usual meaning of superfluidity in the context of oscillating-disk27 experiments, for example, we can characterize this behavior as "superfluid behavior." For the anisotropic condensed state (4.12), we cannot use an angular average of the term k(a·k) as above and we must instead compute separately the three components of j′. Since Ek depends only upon sin22
'j'(q)

•-im(T/Tcy.

(7.15a)

Note that the variation of the long-wavelength Fourier components of the current density versus temperature is approximately the same as the variation of the paramagnetic susceptibility (see Fig. 3). The reason why (7.15) is isotropic is, of course, that the ground state (4.12) has cubic symmetry for |f(θ,

27 E. L. Andronikashvili, J. Phys. Moscow 10, 201 (1946); see also K. R. Atkins, Liquid Helium (Cambridge University Press, New York, 1959), for further references.
28 K. A. Brueckner and J. L. Gammel, Phys. Rev. 109, 1023 (1958).
29 The magnitude of the l=2 coefficient is approximately equal to the magnitude of the (repulsive) l=0 coefficient; this gives us a very useful indication of the relative magnitude of the coefficients U_0 and U_2.

T_c (where this equation becomes linear) and have found that the critical temperature for the transition into an l=2 condensed state is the highest; these authors have estimated the critical temperature T_c on the basis of the best phenomenological potential known and of the quasi-particle excitation spectrum computed by Brueckner and Gammel28; they have found a probable value of 0.07°K. It must be emphasized, however, that this result is quite sensitive to even small discrepancies between the phenomenological potential (adjusted to fit low-pressure compressibility data) and the true interaction potential in a dense phase. In particular, any screening would have the effect of reducing the long-range attractive part of the potential and therefore decreasing the transition temperature. We shall nevertheless use this value 0.07°K for T_c as an indication of the strength of the effective interaction U_2 (possibly too large by a significant amount).

Thermal Properties in the Low-Temperature Limit

Knowing the ground-state configuration (4.12) for d-type condensation, we can correlate the thermal properties of the fluid near 0°K with the interaction parameter U_2 or, equivalently, the critical temperature T_c. We have found that the individual particle excitation spectrum is strongly modified in a shell of thickness ε_0 near the Fermi surface:

ε_0 = 2.46 kT_c = 0.17°K. (8.1)
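The arithmetic behind (8.1), and the comparison with the Fermi energy of 2.5°K quoted in this section, can be reproduced in a few lines. This is only an illustrative sketch: energies are expressed in kelvin (i.e., divided by Boltzmann's constant), and the function name is hypothetical.

```python
def shell_thickness_kelvin(t_c, ratio=2.46):
    """Shell thickness eps_0 in kelvin, using eps_0 = ratio * k * T_c as in Eq. (8.1)."""
    return ratio * t_c

T_C = 0.07       # K, estimated critical temperature used in the text
E_FERMI = 2.5    # K, normal-fluid Fermi energy quoted in the text

eps0 = shell_thickness_kelvin(T_C)
print(round(eps0, 2))            # shell thickness in kelvin
print(round(eps0 / E_FERMI, 3))  # fraction of the Fermi energy
```

The ratio of roughly 7% is what the text means when it says that ε_0 is not truly negligible compared with the Fermi energy.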

Since ε_0 is not truly negligible with respect to the normal fluid Fermi energy ε_F = 2.5°K,30 the weak-coupling assumption on which this treatment is based does not hold well. One expects, however, that the rather large coupling does not significantly affect the results derived in the low-temperature limit, since only excitation modes very close to the Fermi level are important in this limit. One can, for example, estimate accurately the amount of condensation energy per particle at 0°K with the help of (4.13):

breaks down at temperatures of the order of T_c where excitations with an energy of the order of ε_0 become important; we shall see in particular that the scattering has the effect of destroying the pairs almost as soon as they are formed and, therefore, prevents their accumulation and the transition of the system into a condensed state. One expects, then, that the actual transition temperature will be significantly below the critical temperature T_c for an ideal gas.

Effect of Scattering on the Transition Temperature

Following an argument first proposed by Suhl,6 we shall represent the scattering by adding an imaginary part i(ħ/τ) to the individual particle excitation energy ε_k and we shall investigate the condition for stability of the Fermi state with respect to the formation of pairs. Let then [H,bf]=ub^

-2(*+£). • 2 ( 1 - 2 / 0 E^kk'&k-*

be the equation of motion of the operator b_k* creating an excited pair, in the presence of the Fermi sea; f_k is the Fermi-Dirac distribution function. In this scheme, the appearance of an imaginary eigenvalue ω characterizes the onset of instability of the Fermi state with respect to the formation of pairs. Defining Ω = ω − 2i(ħ/τ),

3 A2 - ~ 2 X 1 0 " 3 °K.

(8.5)

we derive from (8.4), with the help of the expansion (3.9): 2 tanh(86 k /2) °° «~n

£ £ 2^UlYlm(d,
fi—2ek

Wv

(8.4)

Hw=-I in

x £ Ylm*(e,y)bk'*. (8.6)

(8.2)

32ir ep

On the other hand, higher energy individual particle excitations b_k* or c_k* (excitation energy of the order of ε_0) are no longer well-behaved quasi-particles: they rapidly lose their individuality and disintegrate into many-particle excitation modes by the process of inelastic scattering. The relaxation time τ for this process can be estimated from the spin diffusion data reported by Wheatley and his co-workers6: one finds that τ is still as small as 6×10⁻¹¹ second at 0.03°K (corresponding to an energy broadening ħ/τ = 0.12°K, of the order of magnitude of ε_0). Consequently, our treatment

k'

Multiplying both sides of this equation by Fim*(0,
\Ylm(e,
1 = 4TUIT,

•

k

(8-7)

Q— 2€ k

Taking advantage of the symmetry with respect to the Fermi level (ε_k → −ε_k), we can replace the sum on the right-hand side of (8.7) by a sum over the states above the Fermi level only:

30 We use here the value n = 1.64×10²² particles/cm³ derived from the molecular volume data obtained by L. Goldstein [Phys. Rev. 117, 375 (1960)] and the effective mass m* = 2m computed by Emery and Sessler (see reference 4).

$$1 = 4\pi U_l\sum_{|\mathbf{k}|>k_F}|Y_{lm}(\theta,\varphi)|^2\,\frac{4\epsilon_k\tanh(\beta\epsilon_k/2)}{4\epsilon_k^2-\Omega^2}. \tag{8.8}$$


Let us first consider the case of negligible scattering where Ω reduces to ω. We have plotted the variation of the right-hand side of (8.8) versus ω² in this case (Fig. 5). The discussion of Eq. (8.8) follows immediately: If U_l<0 (repulsive), the equation has only real solutions and consequently the Fermi state is always stable with respect to the formation of l-type pairs. If U_l>0 (attractive), (8.8) has only real solutions at high temperature but has two conjugate imaginary solutions (ω²<0) below the critical temperature T_c defined by the relation:

$$1 = 4\pi U_l\sum_{|\mathbf{k}|>k_F}|Y_{lm}(\theta,\varphi)|^2\,\frac{\tanh(\epsilon_k/2kT_c)}{\epsilon_k},$$

identical to (5.6). These imaginary solutions +iω_2 and −iω_2 correspond to a damped perturbation and a diverging perturbation, respectively; in other words, the instability of the Fermi state is characterized by the appearance of an eigenvalue of Eq. (8.8) in the lower half of the complex plane.

Let us now go back to the actual case where the scattering is important and the imaginary part i(ħ/τ) of the excitation energy is not negligible. Above the critical temperature T_c, the eigenvalues of (8.8) are complex, ±ω_1 + 2i(ħ/τ), and describe a damped oscillation (with the relaxation time τ/2). Below T_c, on the other hand, (8.8) has two pure imaginary solutions:

$$\omega = \pm i\omega_2 + 2i(\hbar/\tau), \tag{8.9}$$

which are both in the upper half of the complex plane if the energy broadening ħ/τ is larger than ω_2/2. We have plotted the variation of both quantities versus temperature (Fig. 6); ω_2 increases rapidly near T_c and then

FIG. 6. Plot of the inverse scattering relaxation time and of the imaginary solution of Eq. (8.8) versus temperature. The Fermi state becomes unstable below the crossing point of these two curves (arrow).
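The crossing point described in the caption of Fig. 6 can be estimated with a short calculation. The sketch below converts the quoted relaxation time into an energy broadening in kelvin and then locates the temperature at which 2ħ/τ, assumed to follow a T² law, equals ω_2. Two assumptions are made for illustration only: ω_2 is taken at its low-temperature limiting value of 1.15kT_c at the crossing, and the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34  # J s
K_B = 1.380649e-23      # J/K

def broadening_kelvin(tau_seconds):
    """Energy broadening hbar/tau expressed in kelvin."""
    return HBAR / (K_B * tau_seconds)

def crossing_temperature(t_ref, ratio_at_ref):
    """Temperature where 2*hbar/tau (taken proportional to T^2) equals omega_2.

    ratio_at_ref is (2*hbar/tau)/omega_2 at the reference temperature t_ref,
    with omega_2 assumed constant at its low-temperature limit.
    """
    return t_ref / math.sqrt(ratio_at_ref)

print(broadening_kelvin(6e-11))         # compare with the 0.12 K quoted in the text

# (a) using the statement that 2*hbar/tau is about twice omega_2 at 0.03 K
print(crossing_temperature(0.03, 2.0))

# (b) using hbar/tau = 0.12 K at 0.03 K and omega_2 = 1.15 k T_c with T_c = 0.07 K
ratio = 2.0 * 0.12 / (1.15 * 0.07)
print(crossing_temperature(0.03, ratio))
```

Both routes give a crossing somewhat below 0.03°K, at roughly 0.02°K, which is the order of magnitude stated for the actual transition temperature.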

levels off and approaches the value 1.15kT_c near 0°K; ħ/τ, on the other hand, follows a T² law down to 0.03°K. At this temperature, the lowest attained experimentally, 2ħ/τ is about twice as large as the limiting value of ω_2 and consequently the Fermi state is still stable. If we can make the reasonable assumption that the T² law holds in the 0.01°-0.03°K range, we find that the two curves cross at 0.02°K approximately. One expects therefore, on the basis of this argument, that the actual transition temperature for helium-3 is close to 0.02°K. Note that for the typical (low temperature) superconductors, the electron-electron scattering is quite small in the range of temperature where the transition occurs, so that the condensation does take place substantially at the critical temperature T_c computed in the weak-coupling limit.

FIG. 5. Plot of the function on the right-hand side of (8.8) versus ω². The intersections with the straight line y=1 determine the solutions of this equation.

Flow Properties

Since the density-current density correlation does not vanish in the ground state, there are spontaneous macroscopic currents on the boundary of the fluid at rest at 0°K; however, these currents are small (even in one single ordered domain) since they involve only a fraction njn of the particles, that is to say, about 1%. One may argue furthermore that the amount of energy necessary to cause a very gradual rotation of the anisotropy axis is quite small indeed (see the paragraph on Specific Heat in Sec. V); one expects therefore that there will be no long-range order with respect to the orientation of the anisotropy axis even at extremely low temperature. Since the currents in randomly oriented domains cancel each other, one does not expect that such a spontaneous circulation could be observed. All this is valid for a pure m state; for the cubically symmetric state (4.12) it is true a fortiori. On the other hand, the appearance of a condensed state would be striking if the fluid is in a moving coordinate system (rotating container) since the condensed phase is superfluid, according to (7.15). One has found that the condensed fluid current vanishes like T² for a given rotation speed in the low-temperature limit. The ground state for d-type condensation is such that this temperature dependence is isotropic (independent of the relative disposition of the anisotropy axis Oz and the rotation axis).

Magnetic Properties

It is interesting to note that the paramagnetic susceptibility χ of the condensed fluid also vanishes like T² in the low-temperature limit, following the same variation law as the current in a rotating container:

χ/χ_n = j/j_n. (8.10)

This suggests a two-fluid model of the condensed system. The condensed state is then described as a mixture of a "normal" phase (the relative amount of which is χ/χ_n and vanishes at 0°K) and a superfluid phase which does not participate in the flow of the fluid and has zero paramagnetic susceptibility.

IX. CONCLUSION

Our subject may be separated into two divisions: first, the general question of a Fermi gas with reasonably weak interactions which are attractive in pair states l≠0, and second, the specific properties of liquid helium-3. As far as the first division is concerned, we have concluded, by means of a generalization of the BCS theory, that there is an anisotropic condensed state of the gas with various characteristic superfluid properties, that is lower in energy than the normal Fermi gas state. This state may be best described as a condensation of Cooper pairs with zero linear momentum, but with an orbital angular momentum depending on the interaction potential. For angular momentum l=1, the gas as a whole has a net angular momentum, and may be described as a kind of "orbital ferromagnet," but for l=2 the net angular momentum is zero; the ground state of the pairs is most probably such that the physical properties will have cubic symmetry. It can be conjectured that the same techniques will lead to similar states for l>2 also, although we have not computed such states because they probably have no physical realization. (Note that this cubically symmetric state is quite different from what we believed to be the ground state in earlier publications.)

There still remains the question, even in a relatively weakly interacting Fermi gas, as to whether the methods we have used are correct or complete. The next most urgent problem is to study the collective oscillation spectrum around the ground state which we have found. In particular, by this means (or by the equivalent technique of Thouless2) we could determine whether our state is stable as compared to similar configurations of finite rather than zero linear momentum. Unfortunately, the study of collective oscillations leads into complicated mathematics which we have not yet worked out.

The second division of our paper concerned itself primarily with liquid helium-3. The forces in He3 are rather strong. The most serious question is whether we can rely on a "Fermi liquid" type of picture at sufficiently low temperatures, having quasi-particle rather than pure single-particle excitations with a spectrum similar to that of an ideal Fermi gas. Our best reason for relying on this picture is that it does qualitatively explain the present measurements of various properties of helium-3 rather well; and also, that it can be demonstrated that the forces in the metallic Fermi gas are not weak either (perhaps not as strong by a factor of 2 as in helium), but nonetheless in many cases the Fermi liquid picture appears to be valid. If so, we have demonstrated reasonably well that some condensed state will probably form. Whether the state is a BCS-like one or something more complicated depends on the validity of our approximations; at what temperature, on the specific assumptions we and others have made about forces and effective masses, but most specifically on the amount of scattering. It is necessary in order for the condensed pairs to stay together that ħ/τ ≲ ε_0 ~ kT_c, and this can only be satisfied below 0.02°K. This has never been conclusively shown to be more than a necessary condition on the transition temperature, so we must consider it an upper limit.

If, in fact, the transition is l=2 and is to the cubic state we suggest, it is interesting to speculate on how the specific nature of the ground state might be observed. Tensor properties such as moment of inertia, susceptibility, etc., will be isotropic; only a nontensorial property such as surface tension or surface current will show the effect.
Since even in the pure m state the surface current was found to be nearly unmeasurably small, it appears that the nature of the ground state will be difficult to elucidate. Perhaps the clearest result will be the density of states near zero energy, which will appear in low-temperature susceptibility and specific heat. Nonetheless, since this anisotropy is the newest and most fundamental property of the ground state, it is to be hoped that it can be directly observed.

One might add that the appearance of anisotropy in a liquid is not physically a surprising result. The majority of phase transitions have the same feature, that the symmetry below the transition temperature is not as high as that above. That symmetry changes have not in the past been associated with superfluidity appears to be of no particular significance.

We have investigated this new type of superfluidity both as an interesting purely mathematical possibility and because it may apply to helium-3. It is interesting to speculate whether any further examples may arise. In nuclear matter, experimentally shell effects appear to obscure the question of what might occur in a sufficiently extended volume; preliminary calculations showed3 that actually the latter might well prefer a d state for like pairs, a triplet s state for T=1 unlike pairs. The complications of the situation in nuclear matter (long-range Coulomb forces, boundary effects, isotopic spin effects, etc.) do not seem to lend themselves well to our formalism in any case.

In metals it is not entirely impossible that our anisotropic BCS states could occur. A necessary generalization for metals is to realize that the Fermi surface and the forces have only the invariance of the crystal point group, not of the full rotation group. The forces must now be classified according to representations of this point group, and the type of state which appears depends on which type of representation has the most attractive potential. There are three situations which might arise:

(1) The natural generalization of BCS is the case in which the most attractive potential belongs to the identity representation. The energy gap parameter may be somewhat anisotropic in this case but will be everywhere real and, in general, everywhere finite (at least near the Fermi surface). The behavior is normal BCS with anisotropic gap.

(2) A case which does not occur in the spherical gas is a one-dimensional representation of odd parity. For instance, in an axial crystal the solution like l=1, m=0 might be lower either than l=0 or l=1, m=±1. Here the gap will be real but will in general have zeros and we can expect triplet pairing rather than singlet, so that the Knight shift would not change much. (Zeros may not occur if k values on the appropriate reflection plane are not represented on the Fermi surface. Note that in this case zeros are not limited to null points, so that the density of states near zero energy may be high.)

(3) Finally, we come to the case of multidimensional representations. Here, in general, the gap function is complex and has point zeros most usually. Representations may be odd (triplet) or even (singlet).

There are, of course, a number of experimental facts which one would like to attribute to the presence of cases 2 and 3. The existence of zeros in the energy gap function is a natural explanation of the severe curvatures in the specific heat characteristic observed in a number of cases: Pb, Hg, and most recently, Mo-Re alloys.31 The Knight shift characteristic of a triplet state is also a tempting explanation of some data. The difficulty, however, is that superconductivity has been found to be so extremely insensitive to impurity and boundary scattering, which destroy the intrinsic symmetry of the crystal and should spoil rather easily any coherence which changes in sign or phase from one portion of the Fermi surface to the next, i.e., all but case (1). Thus we must reserve judgment on whether cases (2) and (3) have been observed, while realizing that in all likelihood there is no real reason why the forces in some metal or group of metals cannot be favorable to their occurrence. The differences from the simple case (1) are rather subtle and difficult to determine experimentally.

31 F. J. Morin (private communication).

ACKNOWLEDGMENTS

We should like to acknowledge gratefully the help of D. J. Thouless and E. Jakeman, particularly in indicating the need for a new l=2 ground state, and of H. Suhl in explaining to us his ideas on scattering before publication. We also acknowledge discussions and preprints from W. M. Fairbank, K. A. Brueckner, T. Soda, A. Glassgold, V. J. Emery, A. M. Sessler, J. C. Fisher, B. T. Matthias, and V. M. Galitskii.

APPENDIX A

We intend to evaluate here the error for the ground-state configuration C(θ,
in \c{e,v)\*h\c{d,
(A.i)

* 4|> k *+|C(^)|']'' After integrating over the energy, we find then the rather convenient expression: SE=-

• f' SC{B,v)C*(0,
(A.2)

&r.•J

On the other hand, we can derive from Eq. (4.2) the expression for the error 8C(9,
+-

UvPi.{k-%')\

|6C(flW)

(A.3)

(We have neglected terms of the order of A2/£2, negligible in the weak coupling limit, as well as higher than first-order terms in &C.) Now, one can expect that the


sum K,

MOREL

and therefore i- YVm*Ylm K=2rN0-lT. •, 1=

SCO) =

(A.4)

NiVlUvAK!-Yim{e,
Ek

The energy perturbation is then is rather small on account of the orthogonality of spheriSE= - 0V0A2/8x) (NoUt) (NoU^K1. cal harmonics (it would vanish if Ex had no angular dependence). We conclude that: (1) the leading term In the / = 1 case, coupling occurs only with the 1=3, 5, of SC(6,
m

r,.„*(0'y)c(0V)

dt'dW. (A.5)

0.07

for / ' = 3 ,

0.0056

for / ' = 5 ,

0.001

for / ' = 7 .

Note that the factor NaUv can be of the same order of magnitude as or smaller than NoUi (the ground state This term vanishes if I and /' have different parity (no would be a /'-state otherwise). On the other hand, coupling between the even and odd parts of the po- NoUi may be significantly smaller than one if £ can be tential) and is small otherwise (of the order of K). chosen much larger than kTc. (2) 5C{0,
*Ylm*(d',<e')8CV(8',
* / / / -

SCw=—N0UoA f f f 4* J JJ

C(e,
~0.007(iVoD'o)A. We are thus satisfied that SCm is considerably smaller than \C\ and therefore that SCm is truly negligible with respect to 5C<1). We can then proceed: &C^ =

iff

N,U^C^Ym{e,
-dedQ £k

~0.00084(AT0?7o) (iV„£/ 2 )AF 2 o(M, and finally :

SCV^NvUvAYtUe,?)

XJ J J

itdQ,

= _ _ ! _ / — ) ^ o f ; o A f f ( 3 cos 2 0-l) lnC(6,v>)da 4Al6*7 J J

and finally, we carry this last expression into (A.2). We shall consider three cases (corresponding to the situation found for the / = 0 , 1, and 2 condensations). C(6,
F 20

8£=0.0013(iV0A2/85r)(-Aroc7o)(JVot/2). (A.7)

de,dQ'=NaUl.AKYlm($,
32 See reference 3, Fig. 1. The K matrix used in this reference is approximately equivalent to our effective potential U, in the limit of large "cutoff" parameter |.

This coupling produces indeed a positive perturbation of the ground state energy of the order of 1/1000th of the condensation energy and therefore much smaller than the energy difference between the simple Y_20 configuration and the true ground state (4.6%). Let us finally remark that expression (A.7) is truly independent of the cutoff parameter ξ, although it involves the ξ-dependent U_0 and U_2; in fact, we have seen in Sec. III that

$$1/|N_0U_0| = -1/N_0U_0 = -\ln\xi + \text{constant}. \tag{A.8}$$

This relation, together with (3.13), indicates that the product (N_0U_0)(N_0U_2) is independent of ξ.

APPENDIX B

Our purpose here is to find all independent solutions of Eq. (4.1), which we shall write for studying conveniently its invariance properties:

21+1

i„

C(k')

p-Type Solutions (l=1)

The most general combination of p-type spherical harmonics is:

C(k) = Ax + By + Cz + i(A'x + B'y + C'z), (B.4)

where x, y, and z are the three Cartesian coordinates of the unit vector k. The real part represents a plane which can always be brought by a suitable rotation onto the Oxy plane, for example; in this case, C_r² is invariant by reflection in the planes Oxy, Oyz, and Ozx. The rotation must therefore bring the plane represented by C_i onto one of the three planes Oxy, Oyz, or Ozx (unless C_i is identically zero). On account of rotational equivalence, we find that the independent solutions are

Az or Δ_0Y_10,

Ax + iBy or Δ_1Y_11 + Δ_{-1}Y_{1,-1}.

A rather simple computation shows that the mixed solution (Δ_1 = Δ_{-1}) or the simple solution Δ_0Y_10 are not as favorable as Δ_1Y_11 (see Table I).

d-Type Solutions (l=2)

C(k) is in general a complex function of the angular The most general combination of rf-type spherical coordinates $ and

(B.3b)

The relations (B.3) can only be fulfilled simultaneously, one being the consequence of the other. Finally, we see that (B.l) is also invariant under gauge transformations [multiplying C(k) by a constant phase factor] since the nonlinear factor on the right-hand side of this equation involves only | C(k) | 2 .


Ci{k)'=A'{2zi-xi-yi)+Bl(xi-f), or C'xy, or D'yz, or Ezx.

(B.7)

We conclude that the (/-type solutions of (B.l) are the 16 combinations of (B.6) and (B.7); not all these com-


C(k) = A 0 F 2 o+A 2 (F 2 2 + F 2 ,_ 2 ),

This can be written in the equivalent form: C(e, P ) = A 0 F 2 o+A 2 F 22 +A^ 2 F 2 ,_ 2 ,

(x 2 -j, 2 )

On account of gauge invariance, one can multiply C(k) by a suitable phase factor to make the coefficient of the first term real (A'=Q). This solution is therefore:

MOREL

(2) C(fe) = A ( 2 z 2 - x 2 - i / 2 ) - f - B ( x 2 - ! / 2 ) - H C x y

binations are independent solutions, fortunately (on account of rotational and gauge invariance). We shall now study all independent combinations individually. (1) C(k) = (A+iA') ( 2 z 2 - x 2 - y 2 ) + (B+iB')

P.

where the three coefficients A0, A2, and A_2 are real and given by the set of nonlinear equations: Ao=aooAo+ao2(A2+A_2),

(B.8)

A 2 = Oo2Ao+«22A2-f 622A_2,

where A0 is real and A2 may a priori be complex. Carrying this expression into Eq. (4.4), we find that A0 and A2 must satisfy the following set of nonlinear equations:

A_ 2 =a 02 Ao+*22A2-r-022A_ 2 .

Ao= aooAo+2ao2A2,

(aoo— l)Ao+ao2(A2+A_2) = 0, ao2A0+ (o2»— 1) (A 2 +A_ s ) = 0,

where we have defined:

a 2 2 - & 2 2 - l = 0. •«

a^—NvUi

r

I dt I dQ •'o J {

r

F2*F2,

(B.14)

33

(B.14) has the following solutions : •

E*

A 0 =0,

(B.10)

r YU*

I dt I d& . Jo J £k The imaginary part cancels since |C(«)| is invariant under the transformation if> —• — —SII^=—NoUi\5W> lnr

+ ffRe(Y2*Y2lt,)ln\f\da{

The numerical tabulation of these coefficients versus A0/A2 indicates that (B.9) has four solutions: A 2 =0,

lnr =1.02;

A 2 =1/ V / 8A,

lnr = 0.98;

A 0 =1/2A, Ao=0,

A 2 =v'3/'\/8A, A2 = A,

lnr=1.02; lnT = 0.98.

l n r = 1.131; lnr=0.98; l n r = 1.154;

$$C(\mathbf{k}) = \Delta\bigl[\tfrac{1}{\sqrt{2}}\,Y_{20} + \tfrac{1}{2}\,(Y_{21}-Y_{2,-1})\bigr], \tag{a}$$

$$C(\mathbf{k}) = \Delta\bigl[\tfrac{1}{2}\,(Y_{21}-Y_{2,-1}) + \tfrac{1}{2}\,(Y_{22}+Y_{2,-2})\bigr], \tag{b}$$

with 6 and 4 nodes, respectively. Both configurations belong to group (2); (a) and (b) can indeed be brought to the form (B.12) with Δ_2 ≠ Δ_{-2} by a 90° rotation.

f Re(F2„2) ln|/|
A 0 =\/3/2A,

A_ 2 =0,

and possibly also solutions of the general type (B.12) with Δ_2 ≠ Δ_{-2}.

Note added in proof. V. Emery has indeed brought to our attention the possibility that solutions displaying a low degree of symmetry may exist. This author analyzes Eq. (5.5) in the limit of a vanishing gap, corresponding to the situation near the critical temperature. He finds that the first order expansion of (5.5) in powers of C(k) has, in addition to the solutions C_1 to C_5 below, the two following solutions:

(B.ll)

A 0 =A,

A 2 =A,

A 0 =0, A 2 =-A_ 2 =A/VZ, A 0 =A/v2, A 2 = - A _ 2 = A / 2 ,

b^NoUi

b^-NoUtf

(B.13)

Discounting the solution (B.8) studied above, this system is equivalent to

(B.9)

A2=ao2Ao+(a22+&22)A2,

(B.12)

33 It is interesting to note that the condition am — 1 = an. — bn — 1 =0, for the existence of a solution of the type C(0,*>) = AoF2o +Az(I,22— Fj,-2), is exactly fulfilled when \C(9,
J ( r m y In I /1 dU=»J2 sin22P | F 22 1 a In | /1 dU with

\f(9,v) l ! = iF2oH-sin»2? [ F221». After performing a ir/4 rotation around the 2 axis (
A 90° rotation brings the first into the third, the second to the fourth, so these are only two independent solutions. These solutions are therefore not favorable (see Table I) and this could be related to the fact that the zeros of the "gap" |C(k)| form continuous lines (instead of being discrete nodes as for the most favorable solutions).

Using the cubic symmetry of |f|, we find indeed that:

$$\int 3(x^2-y^2)^2\ln|f|\,d\Omega = \int\bigl[2(z^2-y^2)^2 + 2(x^2-z^2)^2 - (x^2-y^2)^2\bigr]\ln|f|\,d\Omega = \int(2z^2-x^2-y^2)^2\ln|f|\,d\Omega.$$
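The chain of equalities above combines two facts: a polynomial identity, (2z²−x²−y²)² = 2(z²−y²)² + 2(x²−z²)² − (x²−y²)², and the invariance of |f| under permutations of the cube axes. The following sketch checks both numerically. The weight `w` is a hypothetical permutation-symmetric stand-in for ln|f| and is not the gap function of the paper.

```python
import itertools
import random

def lhs(x, y, z):
    """(2z^2 - x^2 - y^2)^2"""
    return (2 * z * z - x * x - y * y) ** 2

def rhs(x, y, z):
    """2(z^2 - y^2)^2 + 2(x^2 - z^2)^2 - (x^2 - y^2)^2"""
    return (2 * (z * z - y * y) ** 2
            + 2 * (x * x - z * z) ** 2
            - (x * x - y * y) ** 2)

def w(x, y, z):
    """Hypothetical cubic-symmetric weight standing in for ln|f|."""
    return x ** 4 + y ** 4 + z ** 4

def max_identity_error(trials=1000, seed=1):
    """Largest pointwise difference between the two polynomial forms."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        p = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        worst = max(worst, abs(lhs(*p) - rhs(*p)))
    return worst

def max_symmetry_error(trials=200, seed=2):
    """Compare permutation sums of 3(x^2-y^2)^2 w and (2z^2-x^2-y^2)^2 w."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        p = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        s1 = sum(3 * (q[0] ** 2 - q[1] ** 2) ** 2 * w(*q)
                 for q in itertools.permutations(p))
        s2 = sum(lhs(*q) * w(*q) for q in itertools.permutations(p))
        worst = max(worst, abs(s1 - s2))
    return worst

print(max_identity_error(), max_symmetry_error())
```

Summing over the axis permutations of each sample point mimics the angular integration with a cubic-symmetric weight, so the second check mirrors the first and last members of the equality.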

GENERALIZED

It has not been demonstrated yet whether these solutions satisfy Eq. (5.5) to all order of its expansion, or not (private communication). (3)

C(k)^A(2z*-z>-y*)+B(x*-y*)+iCzy

This is equivalent to set (2) under a 90° rotation (4) C(k) =


also have discrete nodes and therefore we do not expect that they could improve much upon the configuration (4.12). Furthermore, the low-temperature properties derived for such configurations would be essentially the same as those we have found for the configuration (4.12). Thus, in any case the configuration (4.12) yields accurate results about the properties of the ground state for a d-type condensed system.

Azx+iBzy APPENDIX C

This is equivalent to C(0,*>) = A,F 21 +A_iF 2 ,_ 1 ,

(B.15)

where the two coefficients Ai and A_i are real and solutions of A_i=JnAi-r-a u A_i, or rather A!+A_ 1 = («u+*ii)(Ai+A_,), Ai— A_i= (a u —6 u )(Ai-A_i).

(B.16)

One sees immediately that (B.18) has only four possible solutions: A!=A, A_i = 0, l n r = 1.131; A!=0, A_i = A, Ai=A_i,

The purpose of this appendix is to compute the integral: C(k) J ( r ) = E smXke' = £ e i k r , (C.l)

*

in the long-range limit (kFr^>l), by the stationary phase method. Here C(k) is the angle-dependent gap function appropriate to any solution we may choose to consider; we think particularly of the 1=2 ground state (4.12). To do this, let us express the integrand in terms of new angular coordinates: the cosine u of the angle between k and r, and the azimuth angle

m-

lnr== 1.131; lnr=0.98;

Ai=-A_i,

lnr=0.98. 2

(B.17)

2

[The last two are the same (# — y ) solution we found under (1)J The final result then is that we have found five inequivalent solutions, all highly symmetrical but not particularly simple. In terms of direction cosines, these are: or * 2 - y 2 = F 2 i ± F 2 , _ , , lnr = 0.98;

d=xy 2

2

2

C2=22 -x -y =F2,0,

lnr=1.02;

Ct=(x+iy)z=Y3,i,

lnT= 1.131;

C 4 = {x+iyf=

l n r = 1.131;

F2,2,

Cf=2!+rf+e!y(f=Vl),

* IV+|C(*)|»]»

No /•+« = — I it I 47T"'_J

dip I du

"0

•'—1

e%>

c^+ic(«,v)i»:»

Let us consider first the integration with respect to the variable u. Now, if e?^0, the integrand is an analytic function of u and we can therefore replace the integration from — 1 to + 1 on the real axis by integration on the contour T in the complex plane (see Fig. 7); we have then:

r•+1

J-i

C{u,
C(u,
«-*» ~du-

-du -i

:

[*

i+i

™

D C(u,
du. •

l n r = 1.154. (B.18)

The most favorable solution we have found thus far [that is to say, excluding possibly a more complicated solution of the general type (B.12)] is the configuration $C_5$. It appears that the value of lnr and therefore the condensation energy is linked to the multiplicity and distribution of the zeros of $|C(\mathbf{k})|$. On this account, $C_5$ is better than $C_2$, for example, since the former has only eight discrete nodes, whereas the latter has two lines of nodes. It is likely that the special configuration $C_5$ is the ground-state configuration on account of its high degree of symmetry. On the other hand, we have not quite excluded the three-parameter solutions (B.12) which we have not studied completely, and they may be slightly more favorable than (4.12); these solutions


(C2)

/

(C.3)

l*+\C(u,9)\*$

Since
FIG. 7. Contour for integrating the function (C.3) by the stationary phase method.


variation of the factor C(M,ip)/(e 2 +|C'| 2 )* in the small interval where the integrand is sizeable, we obtain: +1 C{u,
1

C{+\)eikr

Observing now that $k$ varies only slightly in the interval of integration (near the Fermi surface), we shall use its expansion in powers of $\epsilon$:

-du-

k=kp-\-e/fivF-\

/ i iy+|c(«^)i*:i»

(C7)

.

Hence:

C(-l)e-** r

eitr=eikFr[cos

(rf/JiVp)+i

s in (rt/hv F)

].

(C.8)

(C.4)

The second term on the right-hand side is odd and therefore cancels in the integration. Neglecting second-order Note that this result is independent of the variable

~[^+|C(-l)|*]»'

X

eikT±e~ikr C(f) -du= X! [V+|C(«,v>)| 2 ]* ikr [f«+|CW|»]» C(u,
sin£pr,

/ even

(C.9) i coskpr, I odd where the + sign corresponds to odd configurations The integrand is the product of a monotonically decreasing function of e, by a very rapidly oscillating and the — sign to even configurations. factor (in the long-range limit); we can then extend the The deformation of the contour of integration over u upon which (C.5) is based is not allowable if e is near interval of integration to (— <*>, +<*>) without introzero and C(u,tp) goes through a zero in the range ducing a significant error [the error involved in this — 1 ), i.e., ff/^kpr^. We tour. I t can be verified easily that the contribution from obtain finally: integrating around this branch point is negligible, as sink pr, I even 2NdC(f) /r\C(f)\ follows: carrying out first the « integration by the 7(r) = ~K, V fivF / I —icosk FT, I odd kpr method used later in getting (CIO), we obtain a (CIO) function like C(u,C(«,
(C.5)

which is of order (l//&pr)2 smaller than (CIO) in the i£V==:(ir/2z)V*, z—>». interesting case in which the zeros of \C\ are isolated This exponential factor, exp[—r\ C(f) \/iii>p], dominates points on the sphere. the dependence of I(r) upon the distance r in the longWe may write then: range limit. a N£(t) /•+« eikr±e-ik' de ' E. T. Whittaker and G. N. Watson, Course of Modern Analysis /(r) = (C.6) (Cambridge University Press, New York, 1946), Chap. XVII, Example 40. 2ir J_s C^+|C(r)|»]*A"


Localized Magnetic States in Metals Phys. Rev. 124, 41-53 (1961)

This paper was mentioned in my Nobel Prize citation along with localization and superexchange, causing a slight frisson of guilt in me because I learned about resonant states from a talk by Blandin at a small workshop on magnetic metals at Brasenose College, Oxford in Oct 1959. (We were not slow to take advantage of our new freedom at Bell to travel; this was my third European trip in ten months.) Friedel and Blandin's ideas had only the weaknesses that they were not focussed on the specific problem I addressed, which as noted in the paper arose from Matthias' question; and that they used a generalized notion of "exchange", not emphasizing the Mott-Hubbard "U". In a meeting abstract Clogston and I pointed out that therefore some details might be different (signs of couplings, for instance, about which I won a bet against Walter Marshall made at that same workshop). Friedel's work on alloys and impurity states deserves more recognition than it has had. Nonetheless the "Anderson Model" has had a long and not wholly undeserved run.

PHYSICAL REVIEW, VOLUME 124, NUMBER 1, OCTOBER 1, 1961

Localized Magnetic States in Metals P. W. ANDERSON

Bell Telephone Laboratories, Murray Hill, New Jersey (Received May 31, 1961)

The conditions necessary in metals for the presence or absence of localized moments on solute ions containing inner shell electrons are analyzed. A self-consistent Hartree-Fock treatment shows that there is a sharp transition between the magnetic state and the nonmagnetic state, depending on the density of states of free electrons, the s-d admixture matrix elements, and the Coulomb correlation integral in the d shell; that in the magnetic state the d polarization can be reduced rather severely to nonintegral values, without appreciable free electron polarization because of a compensation effect; and that in the nonmagnetic state the virtual localized d level tends to lie near the Fermi surface. It is emphasized that the condition for the magnetic state depends on the Coulomb (i.e., exchange self-energy) integral, and that the usual type of exchange alone is not large enough in d-shell ions to allow magnetic moments to be present. We show that the susceptibility and specific heat due to the inner shell electrons show strongly contrasting behavior even in the nonmagnetic state. A calculation including degenerate d orbitals and d-d exchange shows that the orbital angular momentum can be quenched, even when localized spin moments exist, and even on an isolated magnetic atom, by kinetic energy effects.

I. INTRODUCTION

Recent experimental results¹ have shown that the occurrence of localized magnetic moments on iron-group ions dissolved in nonmagnetic metals is a more widespread, more systematic, but at the same time, more complicated phenomenon than previously suspected. In dilute solutions of iron, cobalt, and to a lesser extent, nickel, the appearance of such moments seems to be a function mainly of the host metal, although the moments observed differ greatly from one solute to another. As a function of the Matthias "electron concentration" parameter, one finds sharply defined regions where localized moments are strictly absent, interspersed with regions at the edges of which the moments appear almost discontinuously. The regions of localized moments have a noticeable negative correlation with the superconducting transition temperature, and a rather less strong correlation with low density of states as measured by the low-temperature specific heat $\gamma T$.

Such localized moments in metals have been often accepted as an experimental fact without examination of their meaning or of the conditions for their occurrence. Friedel and collaborators have made a start at an understanding of the phenomenon²; extending their work and some of the concepts set forth by us earlier,³,⁴ we will here attempt to describe formally, in a highly simplified model, a quantum state of the metal in which such a moment exists, and to discuss the conditions required for its stability.

The fundamental conceptual difficulty here is that such a localized moment cannot be satisfactorily described within the usual type of one-electron theory which is adequate for the properties of nonmagnetic solids. The first problem is that in polyvalent metals the bandwidths are so great that the energies of the one-electron states which might contain the magnetic electrons are certainly coincident with a free-electron band. One-electron theory does not permit localization of such a state; the best one can do is to use a virtual state,² a state which, left to its own devices, would decay into one of the continuum of free-electron states.

The second problem is that it is very difficult to understand how, in a usual Hartree-Fock theory, the states of opposite spin on the ion can be empty while the parallel spin states are full. For instance, in solutions of Fe in Mo-Nb alloys, the Fe ion changes from being nonmagnetic to having a moment corresponding to about two electrons when the concentration is varied by only a few percent. This is impossible to credit as gradual filling up on one-electron energy states, even if we accept exchange as strongly favoring the parallel-spin states. There must in fact be a nearly discontinuous change in the quantum state of the many-electron problem.

The picture we suggest is founded on the same type of concept which is valuable in insulating magnetic materials⁵ and which has been suggested as being essential in magnetic metals⁶: That the magnetic state is characterized by being the state in which the Coulomb correlation integral of electrons in inner shell states is a major parameter of the problem and must be included in the Hartree-Fock treatment from the first. Under these circumstances, the Hartree-Fock fields for electrons of different spins differ not only by exchange integrals but by true Coulomb integrals, and only this circumstance makes localized moments possible in the iron group.

The essence of what we do is a self-consistent calculation of whether this localized moment exists. The principle of the method is this: Assuming that the localized moment exists, this means that a d-shell state

¹ B. T. Matthias, M. Peter, H. J. Williams, A. M. Clogston, E. Corenzwit, and R. C. Sherwood, Phys. Rev. Letters 5, 542 (1960); B. T. Matthias and A. M. Clogston (to be published). There is, of course, a large literature on specific examples of localized moments, for which the reader is referred to extensive bibliographies in the papers of Friedel. (See reference 2.) The cited papers, however, are the first to bring out clearly the nature of the phenomenon of appearance and disappearance of localized moments as a function of the continuous variation of the nature of the solvent metal.
² P. de Faget de Casteljau and J. Friedel, J. phys. radium 17, 27 (1956); J. Friedel, Can. J. Phys. 34, 1190 (1956); J. phys. radium 19, 573 (1958); Suppl. Nuovo cimento VII, 287 (1958); A. Blandin and J. Friedel, J. phys. radium 19, 573 (1958).
³ P. W. Anderson, Oxford Discussion on Magnetism, 1959 (unpublished).
⁴ P. W. Anderson and A. M. Clogston, Bull. Am. Phys. Soc. 6, 124 (1961).
⁵ P. W. Anderson, Phys. Rev. 115, 2 (1959).
⁶ J. H. Van Vleck, Revs. Modern Phys. 25, 223 (1953); reference 3.
Nevertheless, in many other ways the results—but not the methods—are best described in terms of a model with actual localized electrons rather than in terms of a many-electron band type of model. Second it will be worthwhile to emphasize the comparisons and distinctions with respect to the earlier work of Friedel.2 (See especially the paper of Friedel and Blandin.) Friedel has foreshadowed qualitatively some aspects of our ideas, most particularly in the fact that the only logical description of localized magnetic electrons on impurities in metals is in terms of virtual states. His discussion of these virtual states in terms of scattering theory, however, is considerably different, because he does not identify them closely with localized atomic states, as we find strong physical reasons to do. (In some of his work this connection has been foreshadowed.) On the other hand, there are several vital differences. The most important is that he uses as the term which splits the spin up-spin down energies not the Coulomb integral

$$U=\int |\varphi_{\rm loc}(1)|^2\, e^2 r_{12}^{-1}\, |\varphi_{\rm loc}(2)|^2\, d\tau,$$

but true atomic exchange integrals. In a sense U is an exchange integral, namely the exchange self-energy of

of our results but merely set out a rather schematized model to show the principles which we believe apply in practice. In a last section of this paper we will discuss possible further extensions and applications.

II. THE HAMILTONIAN; APPROXIMATIONS OF THE MODEL

The model we use can best be summarized by writing down the Hamiltonian:

$$H=H_{0f}+H_{0d}+H_{\rm corr}+H_{sd}. \tag{1}$$

Here $H_{0f}$ is the unperturbed energy of the free-electron system in second-quantized notation:

$$H_{0f}=\sum_{k,\sigma}\epsilon_k n_{k\sigma}. \tag{2}$$

$\epsilon_k$ is the energy of the free-electron state of momentum $k$, $n_{k\sigma}$ the number operator for momentum $k$ and spin

$$H_{0d}=E(n_{d+}+n_{d-}). \tag{3}$$

A real question with respect to Eq. (3) is the precise definition of the localized eigenfunction

On the other hand, this separation and our theory are only easily applicable if

$$H_{\rm corr}=U n_{d+}n_{d-}, \tag{4}$$

$$U=\int |\varphi_d(\mathbf r_1)|^2\,|\varphi_d(\mathbf r_2)|^2\, e^2\,|\mathbf r_1-\mathbf r_2|^{-1}\, d\tau_1\, d\tau_2 .$$

We have here neglected the correlation energy of the electrons in "free-electron" states and the d-free repulsion. The free states, even if, as in such metals as Y or Sc, they are partially d states, are very much more extended throughout the unit cell than are the localizable states near the top of the d band⁸,⁹; for this reason, and particularly because the free electrons experience a much more effective screening than the inner shells do, it is reasonable that U, in effect, might be ~10 ev for inner-shell d functions but only a few volts for free functions. Energy-level tables even for free atoms show that s-s and s-d repulsive energies are 2-3 ev smaller than d-d energies; but we rely mainly on the fact that the free electrons in the metal are much more spread out and much better screened than in the atom. It has been suggested in the past¹¹ that the d-d repulsion might be to some extent screened out by s electrons; we think that such screening would require a kinetic energy loss to the s electrons as great as or greater than the correlation energy gained, and would be quite ineffective.

As we mentioned earlier, U is formally the exchange self-energy of the state

where both K and J are U; this shows the parallelism with normal exchange, but also shows why the term is often dropped, because the J part is really only a one-electron energy $n_{+}+n_{-}$.⁷ A sound argument from experiment that U must be large is the observation that in some cases Hund's rule is obeyed, so that intra-atomic exchange is clearly not screened in the d shell; and mechanisms which screen U would also tend to screen the exchange integrals.

The fourth essential part of the Hamiltonian is the s-d interaction term

$$H_{sd}=\sum_{k,\sigma}V_{dk}\left(c^*_{k\sigma}c_{d\sigma}+c^*_{d\sigma}c_{k\sigma}\right). \tag{5}$$

This type of s-d interaction is a purely one-electron energy, entirely different from the s-d exchange interaction which enters the Zener type of theory. The d function usually has a symmetry different from the Wannier functions on the atom it occupies, so that matrix elements with the Wannier functions of the band on the same atom vanish; this is why such effects as we discuss do not usually appear in a one-atom theory. But there is no reason whatever for the matrix elements with Wannier functions on neighboring atoms to vanish. These matrix elements may be estimated to be of the order 2-3 ev or even more, from various sources; for instance, the lowering of the binding energy of Cu and other noble metals relative to the alkalies can be ascribed to s-d interaction.¹² The effect of symmetry allows us to write down a useful expression for $V_{dk}$ in terms of Wannier functions $a(\mathbf r-\mathbf R_n)$ belonging to the band:

$$V_{dk}=\frac{1}{\sqrt N}\int \varphi_d^*(\mathbf r)\,\mathcal H_{\rm HF}(\mathbf r)\sum_n e^{i\mathbf k\cdot\mathbf R_n}a(\mathbf r-\mathbf R_n)\,d\tau=\frac{1}{\sqrt N}\sum_{\mathbf R_n\neq 0}e^{i\mathbf k\cdot\mathbf R_n}V(\mathbf R_n). \tag{6}$$

This has been used⁴ to evaluate some of the perturbation theory results for polarization and other parameters. It will not be necessary in this paper, however, to simplify $V_{dk}$ except to assume that the mean-square matrix element, averaged over a surface of constant energy $\epsilon_k$, does not vary radically as we vary $\epsilon_k$.

We treat the Hamiltonian (1) entirely in the Hartree-Fock approximation. That and the approximations already made in setting up the Hamiltonian are the only essential approximations; the one-electron problem posed by Eq. (5) as a perturbation will be completely

FIG. 1. Unperturbed energy levels in the absence of s-d admixture.

J. H. Wood, Phys. Rev. 117, 714 (1960).
⁹ In cases like Sc, Ti, and other lower d-shell ions, it may be most convenient to treat the free-electron states as orthogonalized
approach is simpler and more physical.
¹¹ C. Herring, see reference 6.
¹² A second source for this estimate is the size of the similar matrix elements between d functions and ligand s and p functions in magnetic insulators, as estimated from covalency effects or superexchange integrals (see reference 5).

III. SOME SIMPLE LIMITS

Let us first, however, exhibit the two simple limiting cases, so that we can see the alternatives available. Let us imagine first that V is very small and U very large; for simplicity, place E a certain distance below the unperturbed Fermi surface, E+U above it. (See Fig. 1.) It is clear what will happen in this case: one electron, whether of spin up or down, will fall into the d shell state, since it is below the Fermi surface. Now because of the term $H_{\rm corr}$, the effective one-electron energy in Hartree-Fock approximation for the other d shell state is now E+U, so that that state will remain empty. Thus one electron and one only is in the d shell, and this electron has a mean spin $s=\tfrac12$, not zero as it would be if it really were half an electron apiece in spin-up and spin-down states. This state, then, is the magnetic state. It is interesting to note that in this state there is a large spin-dependence of the

Hartree field—if we occupy

$$\Phi_0=\prod_{\epsilon_n<\epsilon_F}c^*_{n\sigma}\,\Phi_{\rm vac}, \tag{7}$$

where the one-electron state creation operators $c^*_{n\sigma}$ and energies $\epsilon_{n\sigma}$ are solutions of

$$[H,c^*_{n\sigma}]_{\rm av}\,\Phi_0=\epsilon_{n\sigma}c^*_{n\sigma}\,\Phi_0, \tag{8}$$

the "av" meaning that three fermion terms in the commutator are to be evaluated in terms of average values for the state $\Phi_0$. In the case of our simple Hamiltonian (1) these equations are very easy to solve. Let us set

$$c^*_{n\sigma}=\sum_k (n|k)_\sigma c^*_{k\sigma}+(n|d)_\sigma c^*_{d\sigma}; \tag{9}$$

and the equations for the various unperturbed operators are

$$[H,c^*_{k\sigma}]_{\rm av}=\epsilon_k c^*_{k\sigma}+V_{kd}\,c^*_{d\sigma}; \tag{10}$$

and

$$[H,c^*_{d\sigma}]_{\rm av}=[E+U\langle n_{d,-\sigma}\rangle]\,c^*_{d\sigma}+\sum_k V_{dk}\,c^*_{k\sigma}. \tag{11}$$

The resulting "equations of motion" for the relevant coefficients $(n|k)$ and $(n|d)$ are obtained by substitution of Eqs. (9), (10), and (11) into Eq. (8):

$$\epsilon_{n\sigma}(n|k)_\sigma=\epsilon_k(n|k)_\sigma+V_{kd}(n|d)_\sigma, \tag{12a}$$

$$\epsilon_{n\sigma}(n|d)_\sigma=[E+U\langle n_{d,-\sigma}\rangle](n|d)_\sigma+\sum_k V_{dk}(n|k)_\sigma. \tag{12b}$$

V. GREEN'S FUNCTION SOLUTION OF ONE-ELECTRON EQUATIONS

Since these equations can, for practical purposes, be solved exactly by Green's function methods, we shall not stop to give the perturbation theory answers, even though they suffice under many circumstances for qualitative results. The results we want to use are never actually the individual quantities $(n|k)_\sigma$ or the actual continuum level energies $\epsilon_n$, but certain averages over these such as, for instance, the mean density of admixture of the $d\sigma$ state into the continuum levels of energy $\epsilon$, i.e.,

$$\rho_{d\sigma}(\epsilon)=\sum_n \delta(\epsilon_n-\epsilon)\,|(n|d)_\sigma|^2 .$$

This will be the most important quantity because it determines the d-function occupation number. This and all other quantities of interest may be obtained most directly by studying the Green's function:

$$G(\epsilon+is)=1/(\epsilon+is-H). \tag{13}$$

In the representation n of exact eigenstates, G is diagonal:

$$G^{\sigma}_{nn'}(\epsilon+is)=\delta_{nn'}/(\epsilon+is-\epsilon_{n\sigma}), \tag{13a}$$

but in the unperturbed state representation, its matrix elements give the required densities; for instance,

$$\rho_{d\sigma}(\epsilon)=\sum_n |(d|n)_\sigma|^2\,\lim_{s\to 0}\frac{1}{\pi}\frac{s}{s^2+(\epsilon-\epsilon_n)^2}=-\frac{1}{\pi}\,{\rm Im}[G^{\sigma}_{dd}(\epsilon)]; \tag{14}$$

while the total modified density of states is $\rho_\sigma(\epsilon)=-(1/\pi)\,{\rm Im}[{\rm Tr}\,G^{\sigma}(\epsilon)]$, etc.

The equations for G are derived from Eq. (12), using $\sum_\nu(\epsilon+is-H)_{\mu\nu}G_{\nu\kappa}=\delta_{\mu\kappa}$. They are, making use of the abbreviations

$$E_\sigma=E+U\langle n_{d,-\sigma}\rangle, \tag{15}$$

and

$$\xi=\epsilon+is, \tag{16}$$

$$(\xi-E_\sigma)G^{\sigma}_{dd}-\sum_k V_{dk}G^{\sigma}_{kd}=1; \tag{17a}$$

$$(\xi-\epsilon_k)G^{\sigma}_{kd}-V_{kd}G^{\sigma}_{dd}=0; \tag{17b}$$

$$(\xi-E_\sigma)G^{\sigma}_{dk}-\sum_{k'}V_{dk'}G^{\sigma}_{k'k}=0; \tag{17c}$$

and

$$(\xi-\epsilon_{k'})G^{\sigma}_{k'k}-V_{k'd}G^{\sigma}_{dk}=\delta_{k'k}. \tag{17d}$$

From Eqs. (17a) and (17b) we may immediately solve for $G_{dd}$:

$$G^{\sigma}_{dd}(\xi)=\Big[\xi-E_\sigma-\sum_k |V_{dk}|^2(\xi-\epsilon_k)^{-1}\Big]^{-1}. \tag{18}$$

P. O. Löwdin, Phys. Rev. 97, 1509 (1955).
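As a rough numerical illustration of Eq. (18) and of the density (14), the sketch below evaluates $G_{dd}$ for a discretized flat band and compares $-(1/\pi)\,{\rm Im}\,G_{dd}$ with a Lorentzian of half-width $\pi V^2\rho$. Everything numerical here is an assumption made for the illustration and not taken from the paper: the band limits, the level spacing, a k-independent $V_{dk}=V$, the target width, the finite broadening `s`, and the function names.

```python
import numpy as np

def g_dd(eps, e_sigma, v, eps_k, s):
    """G_dd(eps + i s) from Eq. (18), with a constant matrix element V_dk = v."""
    xi = eps + 1j * s
    self_energy = np.sum(v**2 / (xi - eps_k))
    return 1.0 / (xi - e_sigma - self_energy)

def lorentzian(eps, center, half_width):
    """Normalized Lorentzian of the given half-width."""
    return (half_width / np.pi) / ((eps - center) ** 2 + half_width**2)

# Hypothetical flat band: 2001 equally spaced levels on [-10, 10].
eps_k = np.linspace(-10.0, 10.0, 2001)
rho = 1.0 / (eps_k[1] - eps_k[0])    # band states per unit energy
delta = 0.5                          # chosen width, delta = pi * v^2 * rho
v = np.sqrt(delta / (np.pi * rho))
e_sigma = 0.0                        # d level placed at the band center
s = 0.05                             # broadening, a few level spacings

energies = np.linspace(-3.0, 3.0, 601)
rho_d = np.array([-g_dd(e, e_sigma, v, eps_k, s).imag / np.pi for e in energies])
# The finite broadening s simply adds to the half-width of the virtual level.
rho_ref = lorentzian(energies, e_sigma, delta + s)

max_rel_err = np.max(np.abs(rho_d - rho_ref) / rho_ref)
print(f"largest relative deviation from a Lorentzian: {max_rel_err:.3f}")
```

The two curves agree closely near the center of the virtual level. The residual deviation in the wings comes from the real part of the sum over k, which for a finite band is a small energy-dependent shift of the d level.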

The sum in Eq. (18) may be evaluated:

$$\lim_{s\to 0}\sum_k\frac{|V_{dk}|^2}{\epsilon-\epsilon_k+is}=-i\pi\langle V^2\rangle_{\rm av}\,\rho(\epsilon). \tag{19}$$

Here we have neglected the effective energy shift of the d state,

$$\Delta E_d=P\sum_k\frac{|V_{dk}|^2}{\epsilon-\epsilon_k}, \tag{20}$$

because it may be taken into account simply by shifting the assumed unperturbed energy E. If the density of states is reasonably constant $\Delta E_d$ will not change very radically as $E_\sigma$ changes. Thus we see that, except for the energy shift, $G_{dd}$ behaves exactly as if there were an energy state, the "virtual state", at $\xi=E_\sigma-i\Delta$, where the "width parameter" $\Delta$ of the virtual state is defined by

$$\Delta=\pi\langle V^2\rangle_{\rm av}\,\rho(\epsilon). \tag{21}$$

We usually assume that $\Delta$ is a constant parameter, roughly independent of $E_\sigma$. The density distribution (14) of the d state is

$$\rho_{d\sigma}(\epsilon)=\frac{1}{\pi}\frac{\Delta}{(\epsilon-E_\sigma)^2+\Delta^2}. \tag{22}$$

Equation (22) is the most important formula for our self-consistent treatment, but it will be of interest in the problem of polarization in the free-electron bands to work out the remainder of the elements of G. Starting with the off-diagonal elements of Eq. (17d), we obtain

$$G^{\sigma}_{k'k}=\frac{V_{k'd}\,G^{\sigma}_{dk}}{\xi-\epsilon_{k'}}\qquad(k'\neq k).$$

This may then be substituted into Eq. (17c) to give

$$\Big[\xi-E_\sigma-\sum_{k'\neq k}|V_{dk'}|^2(\xi-\epsilon_{k'})^{-1}\Big]G^{\sigma}_{dk}=V_{dk}\,G^{\sigma}_{kk},$$

which, using Eq. (19), is

$$\big[\xi-E_\sigma+i\Delta+|V_{dk}|^2(\xi-\epsilon_k)^{-1}\big]G^{\sigma}_{dk}=V_{dk}\,G^{\sigma}_{kk}. \tag{23}$$

Now, the diagonal element of Eq. (17d) gives us

$$G^{\sigma}_{kk}=\Big[\xi-\epsilon_k-\frac{|V_{dk}|^2}{\xi-E_\sigma+i\Delta+|V_{dk}|^2(\xi-\epsilon_k)^{-1}}\Big]^{-1}=(\xi-\epsilon_k)^{-1}+\frac{|V_{dk}|^2}{(\xi-\epsilon_k)^2(\xi-E_\sigma+i\Delta)}. \tag{24}$$

This can be interpreted to say that, although each $G_{kk}$ has a pole at precisely the same energy, so that its perturbed energy is unshifted, nonetheless a certain amount of its density is to be found in the region of the virtual state: the admixture effect.⁴ This is given by

$$-\frac{1}{\pi}\,{\rm Im}\,G^{\sigma}_{kk}(\epsilon)=\frac{|V_{dk}|^2}{(\epsilon-\epsilon_k)^2}\,\rho_{d\sigma}(\epsilon), \tag{25}$$

near the virtual state. There are also shifts near the pole in the density, which correspond to the polarization effect computed in perturbation theory in reference 4. We shall see later that the compensation theorem of that reference may be derived directly from (24).

VI. SELF-CONSISTENCY CONDITIONS FOR LOCALIZED MOMENTS

In the preceding section we obtained the expression (22) for the density of d admixture in the continuum states of energy $\epsilon$. In order to determine the number of d electrons of a given spin

$$\langle n_{d\sigma}\rangle=\frac{1}{\pi}\int_{-\infty}^{\epsilon_F}\frac{\Delta\,d\epsilon}{(\epsilon-E_\sigma)^2+\Delta^2}=\frac{1}{\pi}\cot^{-1}\Big(\frac{E_\sigma-\epsilon_F}{\Delta}\Big). \tag{26}$$

Now we must make this equation self-consistent, which involves solving simultaneously the two equations (15) and (26):

$$\langle n_{d+}\rangle=\frac{1}{\pi}\cot^{-1}\Big(\frac{E-\epsilon_F+U\langle n_{d-}\rangle}{\Delta}\Big),\qquad \langle n_{d-}\rangle=\frac{1}{\pi}\cot^{-1}\Big(\frac{E-\epsilon_F+U\langle n_{d+}\rangle}{\Delta}\Big). \tag{27}$$

FIG. 2. Density of state distributions in a magnetic case. The "humps" at $E+U\langle n_-\rangle$ and $E+U\langle n_+\rangle$ are the virtual d "levels" of width $2\Delta$, for up and down spins, respectively. The numbers of electrons $\langle n_+\rangle$ and $\langle n_-\rangle$ occupying them are to be computed from the area of the unshaded portion, below the Fermi surface.
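The closed form in Eq. (26) can be checked against a direct numerical integration of the Lorentzian density (22). The following is a minimal sketch under stated assumptions: the function names, the finite lower cutoff that stands in for $-\infty$, and the sample parameter values are all illustrative choices, not part of the paper.

```python
import math

def rho_d(eps, e_sigma, delta):
    """Lorentzian density of the virtual d level, Eq. (22)."""
    return (delta / math.pi) / ((eps - e_sigma) ** 2 + delta**2)

def arccot(z):
    """Inverse cotangent on the branch (0, pi), as needed in Eq. (26)."""
    return math.pi / 2 - math.atan(z)

def occupation_closed_form(e_sigma, eps_f, delta):
    """Right-hand side of Eq. (26)."""
    return arccot((e_sigma - eps_f) / delta) / math.pi

def occupation_numeric(e_sigma, eps_f, delta, lower=-2000.0, steps=400_000):
    """Midpoint-rule integral of rho_d from a large negative cutoff up to eps_f."""
    h = (eps_f - lower) / steps
    return h * sum(rho_d(lower + (i + 0.5) * h, e_sigma, delta) for i in range(steps))

# Illustrative (assumed) cases: level below, at, and above the Fermi level.
cases = [(-1.0, 0.0, 1.0), (0.0, 0.0, 1.0), (2.0, 0.0, 0.5)]
results = [
    (occupation_closed_form(*case), occupation_numeric(*case)) for case in cases
]
for case, (exact, numeric) in zip(cases, results):
    print(case, round(exact, 4), round(numeric, 4))
```

A level sitting exactly at the Fermi surface is half filled, and the occupation moves smoothly toward one or zero as the level is pushed below or above $\epsilon_F$ on the scale of $\Delta$.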

Equation (27) is the fundamental equation of our self-consistent method. To show its meaning, we have plotted out in Fig. 2 a typical magnetic case. We show the two virtual states in terms of their distributions $\rho_{d\sigma}(\epsilon)$ from Eq. (22) centered around the self-consistent energies $E_\sigma$ from Eq. (15). The empty portions of the two distributions are shaded.

To show the possibilities inherent in Eq. (27), let us plot out $\langle n_{d+}\rangle$ vs $\langle n_{d-}\rangle$ in two cases. (See Fig. 3.) For simplicity we have chosen $\epsilon_F-E=U/2$, and values of $U/\Delta$ of 1 and 5. We see that when $U/\Delta$ is rather small, such as the case $U/\Delta=1$, there is only one place where Eq. (27) is consistent, at $\langle n_+\rangle=\langle n_-\rangle=\tfrac12$. But as U becomes larger, the maximum slope of the $\cot^{-1}$ curve becomes steeper and at $U/\Delta=5$ we find the "localized" case in which there are three possible solutions, one at $\langle n_+\rangle=\langle n_-\rangle=\tfrac12$ but another pair at $\langle n_+\rangle=1-\langle n_-\rangle=0.822$, while $\langle n_+-n_-\rangle=m=0.644$, the moment in Bohr magnetons. The magnetic solutions are the stable ones energetically (this is obvious because the Hartree-Fock equations are variational; thus all three solutions must be extrema, and since the end points of the range are clearly not minima, this means that the two outside solutions must be minima, the center one a maximum).

Before giving a physical discussion of these localized moment states, let us work out some of the numerical consequences of Eq. (27). Let us introduce the dimensionless parameter

$$y=U/\Delta, \tag{28}$$

the ratio of the Coulomb integral to the width of the virtual state. When y is large, correlation is large and localization is easy, while y small represents the "normal", nonmagnetic situation. The parameter

$$x=(\epsilon_F-E)/U \tag{29}$$

is also useful; $x=0$ means the empty d state is right at the Fermi level, while $x=1$ puts E+U at the Fermi level. $x=\tfrac12$, where E and E+U are symmetrically disposed about the Fermi level, is the most favorable case for magnetism. We shall find that $0<x<1$ is the only magnetic range, but in the nonmagnetic cases x can be outside this range. Inserting Eqs. (28) and (29) into Eq. (27), and dropping angular brackets, we rewrite that equation

$$\pi n_{d\pm}=\cot^{-1}[y(n_{d\mp}-x)]. \tag{27'}$$
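The pair of equations (27') can be solved by simple alternating iteration, which gives a quick way to reproduce the two cases discussed above ($y=5$ and $y=1$ at $x=\tfrac12$). This is only a sketch: the iteration scheme, starting guesses, tolerance, and function names are assumptions chosen for illustration and are not the method used in the paper.

```python
import math

def arccot(z):
    """Inverse cotangent on the branch (0, pi)."""
    return math.pi / 2 - math.atan(z)

def next_occupation(n_other, y, x):
    """One side of Eq. (27'): occupation of one spin given the other."""
    return arccot(y * (n_other - x)) / math.pi

def solve_self_consistent(y, x, n_plus=0.9, n_minus=0.1, tol=1e-12, max_iter=10_000):
    """Alternate the two equations (27') until both occupations stop changing."""
    for _ in range(max_iter):
        new_plus = next_occupation(n_minus, y, x)
        new_minus = next_occupation(new_plus, y, x)
        if abs(new_plus - n_plus) < tol and abs(new_minus - n_minus) < tol:
            return new_plus, new_minus
        n_plus, n_minus = new_plus, new_minus
    raise RuntimeError("iteration did not converge")

# Magnetic case of Fig. 3(a): y = 5, x = 1/2, started from a polarized guess.
n_plus, n_minus = solve_self_consistent(5.0, 0.5)
moment = n_plus - n_minus

# Nonmagnetic case of Fig. 3(b): y = 1, x = 1/2, same polarized starting guess.
p_plus, p_minus = solve_self_consistent(1.0, 0.5)

print(f"y=5: n+={n_plus:.3f}, n-={n_minus:.3f}, m={moment:.3f}")
print(f"y=1: n+={p_plus:.3f}, n-={p_minus:.3f}")
```

Starting from a polarized guess, the $y=5$ iteration settles on a polarized solution close to the values quoted in the text, while for $y=1$ the same starting point relaxes to the unpolarized solution $n_+=n_-=\tfrac12$.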

Let us investigate some special cases.

A. Magnetic limit: $y\gg 1$, x not small or too near 1. Then the $\cot^{-1}$ is either close to zero or to $\pi$, and $n_{d\pm}$ is near zero or one. Assume $n_{d+}\simeq 1$, $n_{d-}\simeq 0$. Then

$$\pi n_{d+}\simeq\pi-\frac{1}{y(x-n_{d-})},\qquad \pi n_{d-}\simeq\frac{1}{y(n_{d+}-x)}.$$

These may be approximately solved to obtain:

$$x(1-n_{d+})=(1-x)\,n_{d-},$$

$$m=n_{d+}-n_{d-}\simeq 1-\frac{1}{\pi y x(1-x)-1}. \tag{30}$$

FIG. 3. (a) Self-consistency plot of $\langle n_+\rangle$ vs $\langle n_-\rangle$ for a typical magnetic case. $y=U/\Delta=5$, $x=(\epsilon_F-E)/U=\tfrac12$. Note three possible solutions. (b) A typical nonmagnetic case. $y=1$, while $x=\tfrac12$. The width is five times greater and only one intersection appears.
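The large-y estimate (30) for the moment can be compared with a direct numerical solution of Eq. (27'). The sketch below repeats a simple alternating iteration so that it runs on its own; the sample values of y and x, the iteration scheme, and the function names are assumptions made for illustration.

```python
import math

def arccot(z):
    """Inverse cotangent on the branch (0, pi)."""
    return math.pi / 2 - math.atan(z)

def solve_moment(y, x, tol=1e-12, max_iter=10_000):
    """Iterate Eq. (27') from a fully polarized guess and return m = n+ - n-."""
    n_plus, n_minus = 1.0, 0.0
    for _ in range(max_iter):
        new_plus = arccot(y * (n_minus - x)) / math.pi
        new_minus = arccot(y * (new_plus - x)) / math.pi
        converged = abs(new_plus - n_plus) < tol and abs(new_minus - n_minus) < tol
        n_plus, n_minus = new_plus, new_minus
        if converged:
            return n_plus - n_minus
    raise RuntimeError("iteration did not converge")

def moment_magnetic_limit(y, x):
    """Approximate moment from Eq. (30), intended for y >> 1."""
    return 1.0 - 1.0 / (math.pi * y * x * (1.0 - x) - 1.0)

# Illustrative (assumed) parameter choices deep in the magnetic region.
comparisons = []
for y, x in [(20.0, 0.5), (20.0, 0.3)]:
    comparisons.append((solve_moment(y, x), moment_magnetic_limit(y, x)))
    print(y, x, comparisons[-1])
```

For these values the approximate and iterated moments agree to well within one percent, which is the sense in which Eq. (30) describes the reduction of the moment below one Bohr magneton.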

B. Nonmagnetic cases. Here we may assume $n_{d+}=n_{d-}=n$. Then Eq. (27) becomes

$$\cot\pi n=y(n-x). \tag{31}$$

In the simplest subcase, n will not be far from $\tfrac12$ (the final virtual state lying within a width of the Fermi level), so that

$$\cot\pi n\simeq\pi(\tfrac12-n), \tag{32}$$

and

$$n\simeq\frac12\,\frac{1+2xy/\pi}{1+y/\pi}. \tag{33}$$

We see that n tends to take on the value of $\tfrac12$, meaning that the effective energy level stays near the Fermi level. The effective energy of the d state relative to the Fermi level is

$$E_{\rm eff}=U(n-x)=\frac{U}{2}\,\frac{1-2x}{1+y/\pi}=\frac{\Delta y}{2}\,\frac{1-2x}{1+y/\pi}. \tag{34}$$

Thus this is a very good approximation when y is small or, when x is even reasonably close to $\tfrac12$, for all relevant y. In the opposite limit (y largish, x near 1 or 0), the most interesting region is x near 0 (results for $x\simeq 1$ are symmetric), y quite large. Here $\cot\pi n\simeq 1/\pi n$, and we have

$$y(n-x)\simeq 1/\pi n. \tag{35}$$

Solving by successive approximations, we get

$$n\simeq 1/(\pi y)^{1/2}+\tfrac12 x+\cdots. \tag{36}$$

C. The Transition Curve. Next it is interesting to trace out the transition curve from magnetic to nonmagnetic behavior. Clearly, on the transition curve $n_{d+}=n_{d-}$, so that Eq. (31) is one condition on the curve. The second may be obtained by differencing Eq. (27). The result is

$$\pi/\sin^2\pi n_c=y_c. \tag{37}$$

Here the subscript c refers to the values on the critical curve of transition into the magnetic case. Equations (31) and (37) cannot be solved in simple form for $y_c$ vs x, but can be expressed simply in terms of $n_c$ and x:

$$y_c=\frac{\cot\pi n_c}{n_c-x},\qquad \sin 2\pi n_c=2\pi(n_c-x). \tag{38}$$
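Because Eqs. (37) and (38) give both $y_c$ and x explicitly in terms of $n_c$, the transition curve can be traced parametrically without solving any equation. The sketch below does this for a few values of $n_c$ and confirms that each point also satisfies Eq. (31). The function name and the sample values of $n_c$ are assumptions made for illustration.

```python
import math

def transition_point(n_c):
    """Return (x, y_c) on the critical curve for a given n_c in (0, 1).

    y_c comes from Eq. (37); x comes from the second relation in Eq. (38).
    """
    y_c = math.pi / math.sin(math.pi * n_c) ** 2
    x = n_c - math.sin(2.0 * math.pi * n_c) / (2.0 * math.pi)
    return x, y_c

curve = []
for n_c in (0.5, 0.4, 0.3, 0.2, 0.1):
    x, y_c = transition_point(n_c)
    # Residual of Eq. (31), cot(pi n) = y (n - x), evaluated on the curve.
    residual = 1.0 / math.tan(math.pi * n_c) - y_c * (n_c - x)
    curve.append((n_c, x, y_c, residual))
    print(f"n_c={n_c:.1f}  x={x:.4f}  y_c={y_c:.3f}  pi/y_c={math.pi / y_c:.3f}")
```

The smallest critical ratio, $y_c=\pi$, occurs at the symmetric point $x=\tfrac12$, and $y_c$ grows rapidly as x approaches 0, so that magnetism near the edges of the range $0<x<1$ requires a much narrower virtual level.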

From Eqs. (38) and (37) we can obtain the following approximations for $x_c\simeq\tfrac12$ and $x_c\simeq 0$ (the case $x_c\simeq 1$ follows by symmetry):

$$y_c\simeq\pi+\frac{\pi^3}{4}\,(x_c-\tfrac12)^2+\cdots,\qquad x_c\simeq\tfrac12; \tag{39a}$$

and

$$y_c\simeq\Big(\frac{4\pi}{9x_c^2}\Big)^{1/3}+\cdots,\qquad x_c\simeq 0. \tag{39b}$$

FIG. 4. Regions of magnetic and nonmagnetic behavior. Curve gives $x_c$ vs $\pi/y_c=\pi\Delta/U$.

These results are summarized in Fig. 4, which gives the transition curve as a function of x and jr/y, and in Fig. 5, which plots n and, where they are different, (nd+} and (w<j_) as functions of ir/y for two typical cases, x=\ and x—\. The plot vs r/y is essentially a plot vs A, the width of the virtual state. In order to get a feeling for orders of magnitude, in the iron group U is expected to be about 10 ev. The density of states is fairly widely variable; in an s band as in Cu it might be of the order ^ (ev) - 1 , while some of the rf-band metals might have densities twice to three times that. The least well-known parameter is (F 2 ) a v . If one ascribes a fair fraction of the binding energy surplus of iron-group metals, such as Cu, as compared to non-

motion of the Fermi level (changes in x). Note, however, that the shape of Fig. 3 indicates some insensitivity to x. For rare-earth solutes, on the other hand, $U\sim 15$ ev and $V_{\rm av}\sim 1$ ev at most, so that magnetic cases are to be expected almost exclusively. In Appendix I, we discuss the somewhat more realistic case of an orbitally degenerate localized level, as is usually expected to be present in d or f bands on high symmetry sites. In such a case, one must take into account the repulsion of the electrons with parallel spins but different orbits, as well as the fact that this repulsion is decreased by the atomic exchange integral J (~1-2 ev in the iron group). This is the usual "exchange" effect. It turns out that the condition for "splitting" of the Hartree field is less stringent when we include J: The condition analogous to Eq. (37) is

IN

METALS

^*^~>^

0.8 0.6

J ,

(40)

0.2

2A

— i

1

i

i

i

0.S

which shows that the required U is decreased by the presence of J. Note, however, that in the absence of U, we require $J>2\pi\Delta$. If our quantitative estimate of $\Delta$ is even approximately right, this shows that exchange of the usual sort is utterly incapable of causing magnetism in the d shell. It is interesting that there is a second condition,

i

i

(W

„(«) = -

I J

dklmGwie)

J

—+ —,

(41)

1

sinV» c ' 2A which expresses the requirement that the Hartree fields for the two orbitally degenerate states become different. Where Eq. (40) is satisfied, but not Eq. (41), we will observe "quenching" of the orbital magnetic moment by the kinetic energy of interaction with the band, but no quenching of the spin. As far as I know, this is actually the behavior of the iron group solutes in most cases; the fact that such behavior is not simply explained from the usual point of view has not been remarked before.

= •K

/•» Im I

p{tk)deh{e-tk)"1 VdkK(e-E,)+iAl

X

1+-

Introduce P(«*) = P ( 0 — ( « * — t)(dp/de)\tk+..

Pfree(e) = p ( < 0 + -

dtk

(24)

(43)

|Fik|2(e-£„) ( « - ^ ) 2 + A2

(44)

The most interesting thing about this result is that if p(ik), the density of states, is constant in the first place, the presence of a virtual state anywhere in the spectrum fails to affect the free electron density a t all; only the d state itself modifies the total density of states. Since we can obtain the total number of free electrons of a given spin simply by integrating Eq. (44) u p to the Fermi surface:

(e-eky(t-E„+iA)

Now the total density in energy of free-electron states is given by summing this over all k, which is very simple since the dependence of (24) on tk is so simple.

.,

and by a simple contour integration we obtain

VI. POLARIZATION OF FREE ELECTRONS; THE COMPENSATION THEOREM The "compensation theorem," 4 which shows that the net free electron polarization roughly cancels between the admixture and antiferromagnetic effects, may be derived very easily from the diagonal elements Gkk of the Green's function from Eq. (24). T h a t equation gives

(42)

•'oo

dP(e)

Gkk'(e)=(e-n)-1+

i

FIG. 5. Two typical plots of (n+) and (»_) as a function of ir/y — irA/U. (n+) — (»_) gives the net number of spins, and vanishes at the transition yc for the given x. (a) x=\; (b) x — \.

T

yo>

i

1.2

w/y

PIT TV

=

J

0.1

U w —=yc>— 2 A sin xre<,

49

»frc

J

tF

P!ree'(e)d(,

(45)

this will be entirely unchanged, and there will be no

164

50

P.

W.

ANDERSON

ordinary ionic moments in insulators14; and they can exhibit ferromagnetic or antiferromagnetic behavior depending on their interactions. We will make no att-E, dpU) tempt here to explore the complexities of this situation. 2 IF de(46) A«tre, In the nonmagnetic region, however, the localized it ( e - E ^ + A 2 virtual d states can have a very considerable effect on If we assume for simplicity that dp/de is roughly con- the metallic properties, such as spin susceptibility and stant in the region of interest it turns out that even the specific heat; and in particular, these two measurements, first-order corrections tend to cancel in the polarization: which are often considered as both measuring the density of states at the Fermi surface, vary quite difV\*dp r(eF-E+)2+A* ferently, and do not measure the same quantity. We A«/+—Aiif~= (Am)/ .(47) have, thus, a simple model to exemplify the effects 2 2 2 de L(£_-6f) +A (£_which exchange and correlation can have upon these (For this observation I am indebted to conversations properties. In this section we compute these quantities near absolute zero and in the nonmagnetic case only. with A. M. Clogston.) The effect of an external magnetic field is to shift This cancellation appears to be more or less forthe + spin Fermi level relatively to the — spin level tuitous. There are two contributions physically: by an amount 2 p.H. This shift will change the occupa(1) The d function becomes mixed with free electron tions of the virtual levels, leading to relative shifts of wave functions and vice versa. Clearly, this leads to no these levels themselves. net total polarization, since it is merely a unitary transIn the preceding section, we proved that at least if formation of the wave function. However, it has the the density of states p(th) in the band was reasonably effect of reducing the d polarization, because some of constant, any motion of the virtual levels could not the free functions above e? become partially d. 
At this affect the net polarization of the free electrons. That stage, however, there is still a net spin of precisely one theorem is still valid here, so that the free-electron band electron; and to compensate the d polarization, some will contribute its unperturbed susceptibility X0. previously unpolarized free electrons are polarized. The motions of the d electron density are simply com(2) The free electron functions also have an energy puted as follows: The shifts of the effective positions shift of the virtual levels are A£ k „= | Vdk\»(ek-E,)/l(ek-E„y+A*J, &E+=-nH+U{dnJ), (48) dE-=p.H+U(6n+). which leads to a negative first-order free electron polarization. This polarization is a true polarization rather than an admixture, similar in nature to a Zener- The resulting changes in population may be obtained Nabarro-Kittel-Ruderman-Yosida polarization. In par- by differentiating Eq. (27): ticular, this polarization requires a relaxation process in (x/sinVw)(5»+) = - SE+/A, (49) order to follow the time dependence of the localized 2 (>/sin 7r»)(5»_)= - 5£_/A, spin in the presence of a variable external field, where the admixture polarization, being a high-frequency or, subtracting the two equations of Eq. (49), effect, will follow immediately. Thus, any g shifts caused (w/sm2irn){5n+— SnJ) = y(8n+— 5nJ)-\-2p.H/A. by free electron polarization will tend to have antiferromagnetic sign. (We leave out of account here the Thus, the susceptibility is true s-d exchange polarization which results from the M p(5n+— in-) relatively small s-d exchange integral. That has a definitely ferromagnetic sign. In the rare-earth group, this H H ferromagnetic effect may be relatively larger.) 2M2 2M2 The spacial distribution of the polarization is not 50) as directly obtainable from the Green's functions. For A(7r/sinV»—y) (ITA/sin2irn)—U this reason and for the sake of brevity, we leave this for a later publication. 
Equation (50) shows that as the system approaches the critical condition, Eq. (37), for magnetism, the VII. SUSCEPTIBILITY AND SPECIFIC HEAT absolute zero susceptibility per impurity increases, In the region of localized magnetic states, the sus- becoming infinite at the critical density of states. This ceptibility and specific heat—at least those caused by is quite reasonable physically. 14 the solute—are probably controlled, near absolute zero, We have not demonstrated this explicitly, but it is reasonably by the interaction of the solute moments. These mo- obvious from the symmetries of the problem. It is interesting that, as in similar case of nonspherical nuclei, the entity which roments, although they are not integral numbers of Bohr tates the is effectively the Hartree field, not the individual electrons' magnetons, are both localized and free to rotate, like moments. net polarization of free electrons, if p(tk) is rigorously constant. Otherwise, there will be a change:

£

165

LOCALIZED

MAGNETIC
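The divergence described by Eq. (50) can be illustrated numerically. The following is only a minimal sketch: the function names `d_susceptibility` and `enhancement` are hypothetical, units with $\mu = \Delta = 1$ are an assumption made for convenience, and the occupation $n$ is held fixed at 1/2 (the symmetric case $x = 1/2$) rather than being recomputed self-consistently for each $y$.

```python
import math

def d_susceptibility(n, y, delta=1.0, mu=1.0):
    """Impurity susceptibility of Eq. (50): 2 mu^2 / (Delta * (pi / sin^2(pi n) - y)).

    n is the occupation per spin of the virtual level and y = U / Delta.
    The default values mu = delta = 1 are illustrative units, not values from the text.
    """
    y_critical = math.pi / math.sin(math.pi * n) ** 2  # right-hand side of Eq. (37)
    if y >= y_critical:
        raise ValueError("y >= y_c: magnetic region, Eq. (50) does not apply")
    return 2.0 * mu ** 2 / (delta * (y_critical - y))

def enhancement(n, y):
    """Ratio of the susceptibility at finite U to its U = 0 value, at fixed n."""
    return d_susceptibility(n, y) / d_susceptibility(n, 0.0)

if __name__ == "__main__":
    n = 0.5  # assumed symmetric case, for which y_c = pi
    for fraction in (0.0, 0.5, 0.9, 0.99):
        y = fraction * math.pi
        print(f"y / y_c = {fraction:4.2f}   chi / chi(U=0) = {enhancement(n, y):8.2f}")
```

At fixed $n$ the ratio reduces to $1/(1 - y/y_c)$, so the enhancement grows without bound as $y$ approaches $y_c$, which is the behavior stated after Eq. (50).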

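The behavior of the added specific heat is quite different. Again, the result is extremely simple although the argument is rather long-winded. The standard formulas for the number of electrons and mean energy$^{16}$ are, if the density of states is $\rho_d+\rho_f$,

$$n = \int_0^{\epsilon_F} (\rho_d+\rho_f)\, d\epsilon + \frac{\pi^2}{6}k^2T^2\frac{d\rho_d}{d\epsilon}, \qquad (51)$$

since $\rho_f$ is assumed constant for simplicity; and

$$\bar{\epsilon} = \int_0^{\epsilon_F} \epsilon(\rho_d+\rho_f)\, d\epsilon + \frac{\pi^2(kT)^2}{6}\left(\epsilon_F\frac{d\rho_d}{d\epsilon} + \rho_d + \rho_f\right). \qquad (52)$$

In this case, not only $\epsilon_F$ can change with temperature, but also $\rho_d$, because if there is a shift $\delta n_d$ in the occupation of the d states, the virtual state energy $E_d$ shifts:

$$\delta E_d = U\,\delta n_d. \qquad (53)$$

First, we express the conservation of $n$ (neglecting $\rho_d$ relative to $\rho_f$ where possible, and keeping only lowest-order terms in $T$):

$$\frac{dn}{dT} = 0 = \rho_f\frac{d\epsilon_F}{dT} + \int_0^{\epsilon_F}\frac{d\rho_d}{dT}\, d\epsilon + \frac{\pi^2}{3}k^2T\frac{d\rho_d}{d\epsilon}.$$

Now

$$\frac{d\rho_d}{dT} = \frac{d\rho_d}{dE_d}\frac{dE_d}{dT} = -\frac{d\rho_d}{d\epsilon}\frac{dE_d}{dT}, \qquad (54)$$

so

$$0 = \rho_f\frac{d\epsilon_F}{dT} - \rho_d\frac{dE_d}{dT} + \frac{\pi^2}{3}k^2T\frac{d\rho_d}{d\epsilon}. \qquad (55)$$

The net change in number of d electrons is

$$\delta n_d = -\delta n_f = -\delta\epsilon_F\,\rho_f,$$

so we get

$$\frac{dn_d}{dT} = -\rho_d U\frac{dn_d}{dT} + \frac{\pi^2}{3}k^2T\frac{d\rho_d}{d\epsilon},$$

or

$$\frac{dn_d}{dT} = \frac{(\pi^2/3)k^2T\,(d\rho_d/d\epsilon)}{1+U\rho_d}. \qquad (56)$$

Now we return to Eq. (52), which we differentiate, again only to first order in $T$, in order to get $C_v$:

$$C = \frac{d\bar{\epsilon}}{dT} = \epsilon_F\rho_f\frac{d\epsilon_F}{dT} + \int_0^{\epsilon_F}\epsilon\frac{d\rho_d}{dT}\, d\epsilon + \frac{\pi^2}{3}k^2T\left(\epsilon_F\frac{d\rho_d}{d\epsilon} + \rho_d + \rho_f\right).$$

From Eq. (55), we can combine the first and third terms; we also use Eq. (54) and get

$$C = \epsilon_F\left(\rho_d\frac{dE_d}{dT}\right) - \int_0^{\epsilon_F}\epsilon\frac{d\rho_d}{d\epsilon}\frac{dE_d}{dT}\, d\epsilon + \frac{\pi^2}{3}k^2T(\rho_d+\rho_f).$$

Finally, the second term may be integrated by parts; the result is

$$C = \frac{dE_d}{dT}\int_0^{\epsilon_F}\rho_d\, d\epsilon + \frac{\pi^2}{3}k^2T(\rho_d+\rho_f) = n_d\frac{dE_d}{dT} + \frac{\pi^2}{3}k^2T(\rho). \qquad (57)$$

The new term is almost obvious; it says that the virtual level's energy just shifts by $\delta E_d$ as a whole. Using Eq. (56), we get explicitly:

$$C = \frac{\pi^2}{3}k^2T\left[\frac{n_d\,U\,(d\rho_d/d\epsilon)|_{\epsilon_F}}{1+U\rho_d} + \rho_d\right], \qquad (58)$$

where $\rho_d$ is the energy density of the impurity state,

$$\rho_d(\epsilon) = \frac{1}{\pi}\frac{\Delta}{(\epsilon-E_d)^2+\Delta^2}; \qquad \rho_d(\epsilon_F) = \frac{\sin^2 \pi n_d}{\pi\Delta}; \qquad (59)$$

$$\frac{d\rho_d}{d\epsilon} = \frac{2}{\pi}\frac{\Delta(E_d-\epsilon)}{[(\epsilon-E_d)^2+\Delta^2]^2}.$$

Note that in Eq. (58) the sign in the denominator is +; there is no tendency whatever towards a singularity. This is because the instability refers only to opposite motions of the two Fermi levels. At best, the specific heat from the anomalous term will be 20-30% of the density of states term $\rho_d$. Thus, the specific heat is a much more accurate measure of density of states than is the susceptibility.

$^{16}$ F. Seitz, Modern Theory of Solids (McGraw-Hill Book Company, Inc., New York, 1940), p. 150.

CONCLUSIONS AND DISCUSSION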

A rather schematized model of the electronic structure of a metal containing a solute atom with one or more inner-shell orbitals available has been worked out in the preceding pages. While the schematic character of the model should not be ignored, we feel that nonetheless it contains the essential physics of the phenomena for such solutions as Mn, Fe, or other iron-group elements in Cu, Ag, and Au; for the same elements in early members of the transition series such as Sc, Y, Ti, Zr, and possibly V, Nb, and Mo; and many rare-earth solutions. The essential criteria for our model's more-or-less literal validity are twofold: The "inner shell" orbital must be very different from the Wannier functions for the free electron bands in the solvent, so that a distinct localized orbital can be defined apart from band functions; and it must be sufficiently sharply localized that its Coulomb self-energy integral is not strongly screened out, while those of the band electrons are. Under these circumstances, the competition between this Coulomb integral and the matrix elements (essentially kinetic energy) connecting the local state with the
band states is undoubtedly the determining element as to whether or not a magnetic state is established. The parts of the problem which we have schematized out will have subsidiary importance. Some of the things we have left out, with a brief discussion of their importance, are:

(1) True s-d exchange of the type normally postulated in s-d exchange theories. This will undoubtedly be present to a limited extent. Even in the free atoms, s-d exchange is a relatively small effect (1-2 ev). We may expect an appreciable screening of the Coulomb interactions for s electrons which may further reduce the effect. The primary result of s-d exchange will be to induce a more or less positive polarization of the free electrons, of the same order of magnitude, perhaps, as the terms ignored in the compensation theorem.

(2) s-d Coulomb repulsion. We have discussed this to some degree; as Herring$^{10}$ has pointed out, this may be expected to reduce the effective $U$ by compensating for missing d charge with s charge, but we believe the magnetic case to be the situation in which this effect is small. There would appear to be two distinct situations for impurities in metals. In such a case as Zn in Cu, the impurity may be considered simply as an extra positive charge, which is to be screened out by a deformation and modification of the free-electron band itself. But in the type of situation we envisage (for instance, Mn in Cu) the charge is expected to be compensated not by a deformation of the Cu s band, but by emptying levels approximating to the orbitals of the free atom, because these orbitals are of an entirely different symmetry and size from the functions available in the band. In other words, in such cases, it is far better to use as a starting approximation the neutral impurity atom itself, added to the, say, Cu matrix.
This type of situation, in which the atomic properties of the solute are not strongly affected by solution, is probably the more widespread, especially when solvent and solute are widely different. It is in this case that one does not expect the free bands to be strongly perturbed, or to compensate effectively Coulomb effects in the inner shell. Finally, free-electron correlation and exchange have been explicitly ignored; this is probably well justified in that we think of the free electrons as "quasi-particles," for which these effects are taken into account in the $\epsilon_k$ and the screened interactions. A final question is whether a real many-body theory would give answers radically different from the Hartree-Fock results. Since our theory is exact in both limits ($U \to 0$ or $\infty$), we expect only numerical modifications; in particular, the spin-up and spin-down electrons can probably correlate the times at which they occupy the localized state to some extent, reducing the effect of $U$ and making the magnetic criterion even more severe. Having justified the model in the cases to which it does apply, what of situations in which it does not? One

is led to suspect that the philosophy and some of the results may still be applicable. For example, in Co in Pd 16 one has no right to assume that the Co d shell function is any more local than the Pd one. On the other hand, the philosophy suggests to us that the large Pd susceptibility indicates that Pd is very close to entering the "magnetic state" on its own hook. In this case, it is reasonable that the perturbation due to Co could localize moments on neighboring Pd atoms; this might explain the observations qualitatively. In general, in such cases, one does not expect merely the decrease in d magnetization we have discussed here; generally, there will be larger polarization effects, which we emphatically have not studied, on the solvent d electrons themselves. In general, we have not attempted to discuss the complex subject of interactions of d electrons on different atoms. It is the great conceptual simplification of the impurity problem that it is possible to separate the question of the existence of the "magnetic state" entirely from the actually irrelevant question of whether the final state is ferromagnetic, antiferromagnetic, or paramagnetic. We would suggest that it is this concept of separating these two problems which will be of the greatest value in the far more difficult problem of the ferro- and antiferromagnetic metals. One result that can be carried over without much alteration into even the pure band theory of ferromagnetism has to do with the polarization of free electrons by the d electrons. There, as in the impurity case, the positions of the d levels of the two separate spins will be radically different, leading to precisely the same two contributions to the polarization: a positive admixture, and a negative true polarization, for precisely the same reason. These two contributions will tend to cancel and the net result will be that the d polarization will be reduced relative to that in the absence of such interaction. 
This reduction may be fairly large, leading to the interesting possibility that the 0.6 and 1.7 Bohr magnetons of Ni and Co may represent an "unperturbed"
APPENDIX A

Self-Consistent Theory for a Degenerate d Level

Let us generalize the Hamiltonian (1) in such a way as to include two degenerate d levels

$$H = \sum_{k,\sigma}\epsilon_k n_{k\sigma} + E\sum_{j,\sigma} n_{j\sigma} + \sum_{k,j,\sigma} V_{jk}\left(c_{j\sigma}^{*}c_{k\sigma} + c_{k\sigma}^{*}c_{j\sigma}\right) + (U-J)(n_{1+}n_{2+} + n_{1-}n_{2-}) + U(n_+ n_-). \qquad (A.1)$$

Here $j = 1$ or 2 and the new parameter $J$, the exchange integral proper, is

$$J = \int \phi_1^{*}(1)\,\phi_2^{*}(2)\,\frac{e^2}{r_{12}}\,\phi_2(1)\,\phi_1(2)\, d\tau_1\, d\tau_2. \qquad (A.2)$$

$E$ is again the unperturbed position of the two d levels, assumed degenerate as they might be in a cubic crystal. The $V_{jk}$ must be different (in fact, different functions belonging to the same symmetry under the group of the Brillouin zone) for the two different $j$, but we can expect the widths and shifts of the two virtual levels to be the same, nonetheless, because of the symmetry.

By precisely the same techniques as in Sec. IV of the main body of the paper, we spread each effective d level out into a virtual level of width $\Delta$. The effective Hartree field for each level depends on the occupation of all the other levels through the terms in $U$ and $J$ in Eq. (A.1):

$$E_{\mathrm{eff}}(1+) = E + U\left(\langle n_{1-}\rangle + \langle n_{2-}\rangle + \langle n_{2+}\rangle\right) - J\langle n_{2+}\rangle, \qquad (A.3)$$

etc. With these equations and Eq. (26) of the text, we arrive at the fundamental equations:

$$\cot(\pi n_{1+}) = [E_{\mathrm{eff}}(1+) - \epsilon_F]/\Delta, \qquad (A.4)$$

etc. We introduce the parameters

$$x = (\epsilon_F - E)/U; \qquad y = U/\Delta; \qquad j = J/\Delta, \qquad (A.5)$$

and Eq. (A.4) becomes

$$\cot \pi n_{1+} = y(n_{1-} + n_{2-} + n_{2+}) - j n_{2+} - xy,$$
$$\cot \pi n_{2+} = y(n_{1-} + n_{2-} + n_{1+}) - j n_{1+} - xy,$$
$$\cot \pi n_{1-} = y(n_{1+} + n_{2+} + n_{2-}) - j n_{2-} - xy, \qquad (A.6)$$
$$\cot \pi n_{2-} = y(n_{1+} + n_{2+} + n_{1-}) - j n_{1-} - xy.$$

We will not work out all the details of the various results based on Eq. (A.6) but simply give a few general formulas. First, in the nonmagnetic state, clearly all the $n$'s are equal, $n_{j\sigma} = n$, and the state is determined by

$$\cot \pi n = (3y - j)n - xy. \qquad (A.7)$$

This has the same form as the nonmagnetic case in the text, and a fair approximation is often

$$n \simeq \frac{1}{2}\left(\frac{1 + 2xy/\pi}{1 + (3y-j)/\pi}\right),$$

which will be valid when $x \sim 1$. Second, we can study the condition that magnetic moments just appear by differentiating Eq. (A.6), obtaining a set of linear equations in the $\delta n$'s which have a nonvanishing solution only on the transition curves:

$$-\frac{\pi}{\sin^2 \pi n}\,\delta n_{1+} = y(\delta n_{1-} + \delta n_{2-}) + (y-j)\,\delta n_{2+},$$
$$-\frac{\pi}{\sin^2 \pi n}\,\delta n_{2+} = y(\delta n_{1-} + \delta n_{2-}) + (y-j)\,\delta n_{1+}, \qquad (A.8)$$

and symmetrically for the $n_-$'s. Letting

$$\delta n_+ = \delta n_{1+} + \delta n_{2+}; \qquad \delta n_- = \delta n_{1-} + \delta n_{2-}, \qquad (A.9)$$

we may add these pairs of equations together, and subtract the two sums, to obtain:

$$\frac{\pi}{\sin^2 \pi n}(\delta n_+ - \delta n_-) = (y+j)(\delta n_+ - \delta n_-),$$

or

$$\frac{\pi}{\sin^2 \pi n_c} = y_c + j_c. \qquad (A.10)$$

Equations (A.10) and (A.7) must be solved simultaneously to obtain the critical concentration for a given $x$, $y$, and $j$. Note, however, that the most favorable possible condition occurs when $n_c = 1/2$, and that then

$$y_{\max} = \pi - j. \qquad (A.11)$$

Thus $j$ makes it easier to form the localized state, in general at least. It is also possible that the Hartree field for orbital state 1 becomes different from that for orbital state 2; the corresponding condition follows from

$$\frac{\pi(\delta n_{1+} - \delta n_{2+})}{\sin^2[\pi(n_{1+}+n_{2+})/2]} = (y-j)(\delta n_{1+} - \delta n_{2+}),$$

$$\frac{\pi}{\sin^2 \pi (n_{1+})_{c'}} = y_c - j_c. \qquad (A.12)$$

Since $j$ is positive, this condition is always less easily satisfied than Eq. (A.10). It is, however, entirely possible for both to be satisfied, especially if after polarization of the spins $n_{1+} + n_{2+}$ is roughly unity. As discussed in the text, this latter situation represents the closest approach to a truly localized magnetic state, with both an orbital and spin magnetic moment. The former case, in which $E_{1+}^{\mathrm{eff}} = E_{2+}^{\mathrm{eff}}$, is probably more often encountered. We have here in miniature, as it were, a model of the Van Vleck-Brooks orbital quenching mechanism in the magnetic metals.
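The nonmagnetic self-consistency condition, Eq. (A.7), has no closed-form solution, but it is easy to solve numerically and to compare with the linearized approximation quoted after it. The sketch below is illustrative only: the function names are hypothetical, bisection is a choice made here rather than a method from the paper, and it assumes $3y > j$ so that the left side minus the right side of Eq. (A.7) decreases monotonically in $n$ and has a single root in $0 < n < 1$.

```python
import math

def nonmagnetic_n(x, y, j, tol=1e-12):
    """Solve Eq. (A.7), cot(pi n) = (3y - j) n - x y, for 0 < n < 1 by bisection.

    Assumes 3y > j, so f(n) below is strictly decreasing with exactly one root.
    """
    def f(n):
        return 1.0 / math.tan(math.pi * n) - (3.0 * y - j) * n + x * y

    lo, hi = 1e-9, 1.0 - 1e-9  # cot diverges to +inf at 0 and to -inf at 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def linearized_n(x, y, j):
    """Approximation quoted after Eq. (A.7), from expanding cot(pi n) about n = 1/2."""
    return 0.5 * (1.0 + 2.0 * x * y / math.pi) / (1.0 + (3.0 * y - j) / math.pi)

def critical_y(n_c, j):
    """Value of y on the transition curve from Eq. (A.10): y_c = pi / sin^2(pi n_c) - j."""
    return math.pi / math.sin(math.pi * n_c) ** 2 - j

if __name__ == "__main__":
    x, y, j = 1.0, 2.0, 0.5  # illustrative parameter values, not taken from the paper
    print("exact n      :", nonmagnetic_n(x, y, j))
    print("linearized n :", linearized_n(x, y, j))
    print("y_c at n_c = 1/2 :", critical_y(0.5, j))
```

Note that `critical_y` only evaluates Eq. (A.10) for a given $n_c$; locating the actual transition still requires Eqs. (A.7) and (A.10) to hold simultaneously, as the text states.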

Plasmons, Gauge Invariance, and Mass Phys. Rev. 130, 439-442 (1963)

The physics behind the Higgs phenomenon is not, in spite of the belatedly added reference to Schwinger's opaque remarks, in any way beholden to Schwinger in its inception, since it was a direct transcription into particle language of my work (referred to above) on broken gauge invariance in superconductivity. My ideas on broken symmetry in both vacua and in condensed matter systems did receive great impetus from some brief contacts with Y. Nambu: his phrase "independent orthogonal Hilbert spaces" was very liberating and expressed what I had been groping towards since 1953. I also brushed shoulders, at least, with Steve Weinberg at the Cavendish; and as I acknowledge, J.C. Taylor was at Bell for a summer and we talked at length. Higgs acknowledges this paper as the source of his ideas. Brout was a friend and a regular visitor at Bell, but I do not remember if we discussed this subject.

Reprinted from THE PHYSICAL REVIEW, Vol. 130, No. 1, 439-442, 1 April 1963. Printed in U. S. A.

Plasmons, Gauge Invariance, and Mass

P. W. ANDERSON
Bell Telephone Laboratories, Murray Hill, New Jersey
(Received 8 November 1962)

Schwinger has pointed out that the Yang-Mills vector boson implied by associating a generalized gauge transformation with a conservation law (of baryonic charge, for instance) does not necessarily have zero mass, if a certain criterion on the vacuum fluctuations of the generalized current is satisfied. We show that the theory of plasma oscillations is a simple nonrelativistic example exhibiting all of the features of Schwinger's idea. It is also shown that Schwinger's criterion that the vector field $m \neq 0$ implies that the matter spectrum before including the Yang-Mills interaction contains $m = 0$, but that the example of superconductivity illustrates that the physical spectrum need not. Some comments on the relationship between these ideas and the zero-mass difficulty in theories with broken symmetries are given.

RECENTLY, Schwinger$^{1}$ has given an argument strongly suggesting that associating a gauge transformation with a local conservation law does not necessarily require the existence of a zero-mass vector boson. For instance, it had previously seemed impossible to describe the conservation of baryons in such a manner because of the absence of a zero-mass boson and of the accompanying long-range forces.$^{2}$ The problem of the mass of the bosons represents the major stumbling block in Sakurai's attempt to treat the dynamics of strongly interacting particles in terms of the Yang-Mills gauge fields which seem to be required to accompany the known conserved currents of baryon number and hypercharge.$^{3}$ (We use the term "Yang-Mills" in Sakurai's sense, to denote any generalized gauge field accompanying a local conservation law.)

The purpose of this article is to point out that the familiar plasmon theory of the free-electron gas exemplifies Schwinger's theory in a very straightforward manner. In the plasma, transverse electromagnetic waves do not propagate below the "plasma frequency," which is usually thought of as the frequency of long-wavelength longitudinal oscillation of the electron gas. At and above this frequency, three modes exist, in close analogy (except for problems of Galilean invariance implied by the inequivalent dispersion of longitudinal and transverse modes) with the massive vector boson mentioned by Schwinger. The plasma frequency is equivalent to the mass, while the finite density of electrons leading to divergent "vacuum" current fluctuations resembles the strong renormalized coupling of Schwinger's theory. In spite of the absence of low-frequency photons, gauge invariance and particle conservation are clearly satisfied in the plasma.

In fact, one can draw a direct parallel between the dielectric constant treatment of plasmon theory$^{4}$ and Schwinger's argument. Schwinger comments that the commutation relations for the gauge field $A$ give us one sum rule for the vacuum fluctuations of $A$, while those for the matter field give a completely independent value for the fluctuations of matter current $j$. Since $j$ is the source for $A$ and the two are connected by field equations, the two sum rules are normally incompatible unless there is a contribution to the $A$ rule from a free, homogeneous, weakly interacting, massless solution of the field equations. If, however, the source term is large enough, there can be no such contribution and the massless solutions cannot exist.

The usual theory of the plasmon does not treat the electromagnetic field quantum-mechanically or discuss vacuum fluctuations; yet there is a close relationship between the two arguments, and we, therefore, show that the quantum nature of the gauge field is irrelevant. Our argument is as follows:

The equation for the electromagnetic field is

$$p^2 A_\mu = (k^2 - \omega^2)A_\mu(\mathbf{k},\omega) = 4\pi j_\mu(\mathbf{k},\omega).$$

A given distribution of current $j_\mu$ will, therefore, lead to a response $A_\mu$ given by

$$A_\mu = \frac{4\pi}{p^2}\, j_\mu = \frac{4\pi}{k^2-\omega^2}\, j_\mu. \qquad (1)$$

(1) is merely the statement that only the electromagnetic current can be a source of the field; it is required for general gauge invariance and charge conservation according to the usual arguments.

The dynamics of the matter system (of the plasma in that case, of the vacuum in the elementary particle problem) determine a second response function, the response of the current to a given electromagnetic or Yang-Mills field. Let us call this response function

$$j_\mu(\mathbf{k},\omega) = -K_{\mu\nu}(\mathbf{k},\omega)A_\nu(\mathbf{k},\omega). \qquad (2)$$

By well-known arguments of gauge invariance, $K_{\mu\nu}$ must have a certain form: Schwinger points out that in the relativistic case it must be proportional to $p_\mu p_\nu - g_{\mu\nu}p^2$, and equivalent arguments give one the same form in superconductivity.$^{5}$ It will be convenient to consider, for simplicity, only the gauge

$$p_\mu A_\mu = 0. \qquad (3)$$

Then the response is diagonal: $K_{\mu\nu} = -g_{\mu\nu}K$. For a plasma with $n$ carriers of charge $e$ and mass $M$ it is simply (in the limit $p \to 0$)

$$K = ne^2/M. \qquad (4)$$

In an insulator the response is not relativistically invariant. If the insulator has magnetic polarizability $\alpha_m$ and electric $\alpha_e$, the response equations may be written, in the gauge (3),

$$j_\mu = -\alpha_e p^2 A_\mu \quad \text{(longitudinal and time components)},$$
$$\mathbf{j} = -\alpha_m p^2 \mathbf{A} \quad \text{(transverse components)}.$$

In a truly relativistic situation such as our normal picture of a vacuum, we expect

$$j_\mu = -\alpha p^2 A_\mu \qquad (5)$$

to describe normal polarizable behavior.

Since we cannot turn off the interactions, we do not actually observe the responses (1), (2), or (5). If we insert a test particle, its field $A_\mu^{0}$ induces a current $j_\mu$, which in turn acts as the source for an internal field $A_\mu'$:

$$A_\mu' = +\frac{4\pi}{p^2}\, j_\mu, \qquad j_\mu = -K(A_\mu^{0} + A_\mu'),$$

or, the total field is modified to

$$A_\mu = A_\mu^{0} + A_\mu' = \left[\frac{p^2}{p^2 + 4\pi K}\right]A_\mu^{0}. \qquad (6)$$

The pole at which $A$ propagates freely occurs at a mass (frequency)

$$m^2 = -p^2 = 4\pi K, \qquad (7)$$

which in a conductor is

$$m^2 = \omega^2 - k^2 = \omega_p^2. \qquad (8)$$

$\omega_p$ is the usual plasma frequency $(4\pi ne^2/M)^{1/2}$.

It is not necessary here to go in detail into the relationship between longitudinal and transverse behavior of the plasmon. In the limit $p \to 0$ both waves propagate according to (8). The longitudinal plasmon is generally thought of as entirely an attribute of the plasma, while the transverse ones are considered to result from modification of the propagation of real photons by the medium. This is reasonable in the classical case because the longitudinal plasmon disappears at a certain cutoff energy and has a different dispersion law; but in a Lorentz-covariant theory of the vacuum it would be indistinguishable from the third component of a massive vector boson of which the transverse photons are the two transverse components.

$^{1}$ J. Schwinger, Phys. Rev. 125, 397 (1962).
$^{2}$ T. D. Lee and C. N. Yang, Phys. Rev. 98, 1501 (1955).
$^{3}$ J. J. Sakurai, Ann. Phys. (N. Y.) 11, 1 (1961).
$^{4}$ P. Nozières and D. Pines, Phys. Rev. 109, 741 (1958).
$^{5}$ M. R. Schafroth, Helv. Phys. Acta 24, 645 (1951).

How, then, if we were confined to the plasma as we are to the vacuum and could only measure renormalized quantities, might we try to determine whether, before turning on the effects of electromagnetic interaction, $A$ had been a massless gauge field and $K$ had been finite? As far as we can see, this is not possible; it is, nonetheless, interesting to see what the criterion is in terms of the actual current response function to a perturbation in the Lagrangian

$$\delta L = j_\mu\, \delta A_\mu. \qquad (9)$$

This will turn out to be identical to Schwinger's criterion. The original "bare" response function was $K$:

$$j_\mu = -K\,\delta A_\mu.$$

Taking into account the interaction, however, we must correct for the induced fields and currents, and we get

$$j_\mu = -K'\,\delta A_\mu^{0} = -K\left[\frac{p^2}{p^2 + 4\pi K}\right]\delta A_\mu^{0} \simeq -\frac{p^2}{4\pi}\,\delta A_\mu^{0}; \qquad p^2 \to 0. \qquad (10)$$

Thus, the new response to an applied perturbing field (9) is very like that of an ordinary polarizable medium. The only difference from an ordinary polarizable "vacuum" with bare response (5) is that in that case as $p \to 0$

$$K' \to \left[\frac{\alpha}{1 + 4\pi\alpha}\right]p^2, \qquad (11)$$

so that the coefficient of $p^2/4\pi$ is less than unity. This criterion is precisely the same as Schwinger's criterion

$$\int B_1(m^2)\, dm^2 = 1,$$

where $B_1(m^2)$ is the weight function for the current vacuum fluctuations. This can be shown by a simple dispersion argument. Schwinger expresses the unordered product expectation value of the current as

$$\langle j_\mu(x)\, j_\nu(x') \rangle = \int dm^2\, m^2 B_1(m^2) \int \frac{dp}{(2\pi)^3}\, e^{ip(x-x')}\, \eta_+(p)\, \delta(p^2+m^2)\,(p_\mu p_\nu - g_{\mu\nu}p^2).$$

The Fourier transform of the corresponding retarded Green's function is our response function:

$$K'(p) = \int \frac{dm^2\, m^2 B_1(m^2)}{p^2+m^2}\,(p_\mu p_\nu - g_{\mu\nu}p^2)$$

and

$$\lim_{p\to 0} K'(p) = (p_\mu p_\nu - g_{\mu\nu}p^2) \int dm^2\, B_1(m^2).$$

Thus, (aside from a factor $4\pi$ which Schwinger has not used in his field equation) his criterion is also that the polarizability $\alpha'$, here expressed in terms of a dispersion integral, have its maximum possible value, 1.

The polarizability of the vacuum is not generally considered to be observable6 except in its p dependence (terms of order $p^4$ or higher in K). In fact, we can remove (11) entirely by the conventional renormalization of the field and charge

$$A_r = A Z^{-1/2}, \quad e_r = e Z^{1/2}, \quad j_r = j Z^{1/2}.$$

Z, here, can be shown to be precisely

$$Z = 1 - 4\pi\alpha' = 1 - \int_0^\infty dm^2\, B_1(m^2).$$

Thus, the renormalization procedure is possible for any merely polarizable "vacuum," but not for the special case of the conducting "plasma" type of vacuum. In this case, no net true charge remains localized in the region of the dressed particle; all of the charge is carried "at infinity" corresponding to the fact, well known in the theory of metals, that all the charge carried by a quasi-particle in a plasma is actually on the surface. Nonetheless, conservation of particles, if not of bare charge, is strictly maintained. Note that the situation does not resemble the case of "infinite" charge renormalization because the infinity in the vacuum polarizability need only occur at $p^2 = 0$.

Either in the case of the polarizable vacuum or of the "conducting" one, no low-energy experiment, and even possibly no high-energy one, seems capable of directly testing the value of the vacuum polarizability prior to renormalization. Thus, we conclude that the plasmon is a physical example demonstrating Schwinger's contention that under some circumstances the Yang-Mills type of vector boson need not have zero mass. In addition, aside from the short range of forces and the finite mass, which we might interpret without resorting to Yang-Mills, it is not obvious how to characterize such a case mathematically in terms of observable, renormalized quantities.

We can, on the other hand, try to turn the problem around and see what other conclusions we can draw about possible Yang-Mills models of strong interactions from the solid-state analogs. What properties of the vacuum are needed for it to have the analog of a conducting response to the Yang-Mills field?

Certainly the fact that the polarizability of the "matter" system, without taking into account the interaction with the gauge field, is infinite need not bother us, since that is unobservable. In physical conductors we can see it, but only because we can get outside them and apply to them true electromagnetic fields, not only internal test charges.

More serious is the implication (obviously physically from the fact that $\alpha$ has a pole at $p^2 = 0$) that the "matter" spectrum, at least for the "undressed" matter system, must extend all the way to $m^2 = 0$. In the normal plasma even the final spectrum extends to zero frequency, the coupling rather than the spectrum being affected by the screening. Is this necessarily always the case?

The answer is no, obviously, since the superconducting electron gas has no zero-mass excitations whatever. In that case, the fermion mass is finite because of the energy gap, while the boson which appears as a result of the theorem of Goldstone7,8 and has zero unrenormalized mass is converted into a finite-mass plasmon by interaction with the appropriate gauge field, which is the electromagnetic field. The same is true of the charged Bose gas.

It is likely, then, considering the superconducting analog, that the way is now open for a degenerate-vacuum theory of the Nambu type9 without any difficulties involving either zero-mass Yang-Mills gauge bosons or zero-mass Goldstone bosons. These two types of bosons seem capable of "canceling each other out" and leaving finite mass bosons only. It is not at all clear that the way for a Sakurai3 theory is equally uncluttered. The only mechanism suggested by the present work (of course, we have not discussed non-Abelian gauge groups) for giving the gauge field mass is the degenerate vacuum type of theory, in which the original symmetry is not manifest in the observable domain. Therefore, it needs to be demonstrated that the necessary conservation laws can be maintained.

6 We follow here, as elsewhere, the viewpoint of W. Thirring, Principles of Quantum Electrodynamics (Academic Press Inc., New York, 1958), Chap. 14.
7 J. Goldstone, Nuovo Cimento 19, 154 (1961).
8 J. Goldstone, A. Salam, and S. Weinberg, Phys. Rev. 127, 965 (1962).
9 Y. Nambu and G. Jona-Lasinio, Phys. Rev. 122, 345 (1961).
10 Y. Nambu, Phys. Rev. 117, 648 (1960).
11 P. W. Anderson, Phys. Rev. 110, 827 (1958).

I should like to close with one final remark on the Goldstone theorem. This theorem was initially conjectured, one presumes, because of the solid-state analogs, via the work of Nambu10 and of Anderson.11 The theorem states, essentially, that if the Lagrangian


possesses a continuous symmetry group under which the ground or vacuum state is not invariant, that state is, therefore, degenerate with other ground states. This implies a zero-mass boson. Thus, the solid crystal violates translational and rotational invariance, and possesses phonons; liquid helium violates (in a certain sense only, of course) gauge invariance, and possesses a longitudinal phonon; ferromagnetism violates spin rotation symmetry, and possesses spin waves; superconductivity violates gauge invariance, and would have a zero-mass collective mode in the absence of long-range Coulomb forces.

It is noteworthy that in most of these cases, upon closer examination, the Goldstone bosons do indeed become tangled up with Yang-Mills gauge bosons and, thus, do not in any true sense really have zero mass. Superconductivity is a familiar example, but a similar phenomenon happens with phonons; when the phonon frequency is as low as the gravitational plasma frequency, $(4\pi G\rho)^{1/2}$ (wavelength ~$10^4$ km in normal matter), there is a phonon-graviton interaction: in that case, because of the peculiar sign of the gravitational interaction, leading to instability rather than finite mass.12 Utiyama13 and Feynman have pointed out that gravity is also a Yang-Mills field. It is an amusing observation that the three phonons plus two gravitons are just enough components to make up the appropriate tensor particle which would be required for a finite-mass graviton. Spin waves also are known to interact strongly with magnetostatic forces at very long wavelengths,14 for rather more obscure and less satisfactory reasons.

We conclude, then, that the Goldstone zero-mass difficulty is not a serious one, because we can probably cancel it off against an equal Yang-Mills zero-mass problem. What is not clear yet, on the other hand, is whether it is possible to describe a truly strong conservation law such as that of baryons with a gauge group and a Yang-Mills field having finite mass.

I should like to thank Dr. John R. Klauder for valuable conversations and, particularly, for correcting some serious misapprehensions on my part, and Dr. John G. Taylor for calling my attention to Schwinger's work.
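The order of magnitude quoted above for the gravitational plasma frequency can be checked with a short numerical sketch. The density (5 g/cm^3) and sound speed (5 km/s) used below are illustrative assumptions for "normal matter", not values taken from the paper, and the function names are hypothetical.

```python
import math

G_CGS = 6.674e-8  # gravitational constant in cm^3 g^-1 s^-2


def gravitational_plasma_frequency(rho):
    """Return (4*pi*G*rho)**0.5 in rad/s for a mass density rho in g/cm^3."""
    return math.sqrt(4.0 * math.pi * G_CGS * rho)


def crossover_wavelength_km(rho, sound_speed_cm_s):
    """Wavelength (km) at which a sound wave has the gravitational plasma frequency."""
    omega = gravitational_plasma_frequency(rho)
    return 2.0 * math.pi * sound_speed_cm_s / omega / 1.0e5


# Assumed illustrative inputs: rho = 5 g/cm^3, sound speed = 5 km/s.
omega_g = gravitational_plasma_frequency(5.0)
wavelength_km = crossover_wavelength_km(5.0, 5.0e5)
print(f"omega_g = {omega_g:.2e} rad/s, wavelength = {wavelength_km:.2e} km")
```

With these assumed inputs the wavelength comes out at the order of $10^4$ km, in line with the estimate in the text.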

12 J. H. Jeans, Phil. Trans. Roy. Soc. London 101, 157 (1903).
13 R. Utiyama, Phys. Rev. 101, 1597 (1956); R. P. Feynman (unpublished).
14 L. R. Walker, Phys. Rev. 105, 390 (1957).

Hard Superconductivity: Theory of the Motion of Abrikosov Flux Lines (with Y.B. Kim) Rev. Mod. Phys. 36, 39-43 (1964)

The point here was a demonstration theory that flux lines do move and cause dissipation, which was a totally new and deep concept; not the specific assumptions of the model, which have been occasionally set up as a straw man in subsequent work. Kim noticed the slow decay of magnetization in the "critical state" and I proposed a logarithmic fit, on the basis of a "creep" mechanism like that of dislocations. Later we observed flux "flow", as well as noisy "flux jumps". But the exciting idea was the generalization of old ideas of dissipation by dislocation motion in solids to the general concept of defect motion in ordered systems: the genesis of one of the main lines of the theory of broken symmetry. We thought this work very important, and were unhappy about Bell management's decision to oppose publicity for it, on the basis of Matthias' and Geballe's objections that it negated their discovery of high-field superconductors.

Reprinted from REVIEWS OF MODERN PHYSICS, Volume 36, No. 1 (Part 1 of Two Parts), January 1964. Printed in U. S. A.

Hard Superconductivity: Theory of the Motion of Abrikosov Flux Lines

P. W. ANDERSON and Y. B. KIM
Bell Telephone Laboratories, Inc., Murray Hill, New Jersey

The central concept in the theory of what one might call the "critical phenomena" of hard superconductors (critical currents, critical fields, decay of persistent currents, "excess" voltages, etc.) is clearly Abrikosov's notion of the quantized flux line.1 This is made almost obvious by the remark that, because of the now universally accepted validity of the quantization of flux through superconductors, the smallest possible breakdown of superconductivity is the motion of a single quantum of magnetic flux through the wire or ring. Thus, in all cases so far conceived, the lowest activation energy for any critical breakdown is that for the motion (and creation, if necessary) of single Abrikosov flux lines. This statement is independent of whether the mechanism for hard superconductivity is the GLAG one or the Mendelssohn sponge theory, although we assume the former to be valid in most cases. Even the decay of currents in true soft superconductors under $\alpha$-particle bombardment2 is probably best explained by the threading of Abrikosov lines through normal holes punched by the $\alpha$ particles.3

The purpose of this paper is to see how many of the phenomena of hard superconductivity we can understand qualitatively in terms of the thermally activated motion of Abrikosov lines past pinning centers, without going into unnecessary detail on the nature of the pinning centers (whether they are dislocations, cavities, precipitates, etc.) or the precise internal structure of the superconductor.

Our task, then, is to study the process, presumably thermally activated barrier penetration, by which flux lines move. Let us then suppose that we have a superconductor penetrated by a magnetic field H and carrying a bulk current, for simplicity $\perp \mathbf{H}$, $\mathbf{J} = c\nabla\times\mathbf{H}/4\pi$. The magnetic field will penetrate in the form of Abrikosov lines; their density is clearly not uniform because of J, and we expect their arrangement is to some extent irregular. The magnetic energy per unit volume is $H^2/8\pi$; we can think of this as a magnetic pressure exerted by the flux lines on each other, and in the absence of pinning centers this pressure would have to be equalized by a rearrangement of the lines, leading to J = 0. Examination of Abrikosov's theory shows that actually at all but low fields the internal and external fields are nearly the same, so that we usually assume B = H, a minor simplification of which Friedel et al. have considered the errors.4

1 A. A. Abrikosov, Zh. Eksperim. i Teor. Fiz. 32, 1442 (1957) [English transl.: Soviet Phys.-JETP 5, 1174 (1957)].
2 P. de Feo and G. Sacerdoti, Phys. Letters 2, 264 (1962).
3 N. Cabibbo and S. Doniach, Phys. Letters 4, 29 (1963). We have proposed a slightly different mechanism.
4 J. Friedel, P. G. de Gennes, and J. Matricon, Appl. Phys. Letters 2, 119 (1963).

In finding the rate of the activation process we need to know two things: the driving force exerted by the magnetic pressure, and the nature of the barriers. The former is more available to us theoretically but, on the other hand, more complex, entirely because it is really transmitted only through the interactions between the flux lines themselves. We can only guess at the correct results before more detailed computations of the behavior of nonuniform distributions of flux lines become available.

In Abrikosov's paper, in the appropriate case for hard superconductors ($\kappa \gg 1$, $H_{c2} > H > H_{c1}$) the interaction free energy of the lines is written in two equivalent ways:

$$F_{\rm int} = (1/8\pi)\,\mathbf{H}\cdot(\mathbf{H} - \delta^2\nabla^2\mathbf{H}) \qquad (1)$$

$\kappa$ being the famous Landau-Ginsburg parameter $\delta_0/\xi_0$, $\delta$ the penetration depth, and $H_c$ the thermodynamic critical field. The two-dimensional position vectors of the lines are the $\mathbf{r}_i$.

In the case most usual in hard superconductors, the distances $|\mathbf{r}_i - \mathbf{r}_j|$ are small compared to $\delta$, $\nabla^2 H$ is relatively small, and so the interaction free energy is practically given by the naive expression for the magnetic energy. This would mean that the force per unit volume would be given by $\nabla F = -\mathbf{H}\times(\nabla\times\mathbf{H})/4\pi = \mathbf{J}\times\mathbf{H}/c$, the Lorentz force; and the force per flux line is then $\mathbf{J}\times\boldsymbol{\phi}_0/c$ per unit length, where the magnitude of $\boldsymbol{\phi}_0$ is the flux unit $hc/2e$ and it is directed along the flux line.

It is very useful to notice that (2) is formally the same as the electrostatic interaction between lines of electric charge, except that it is screened away at the penetration depth $\delta$; i.e., $K_0$ has the logarithmic nature of two-dimensional charge interactions as $r \to 0$, while it is $\sim e^{-r/\delta}$ as $r \to \infty$. This tells us that, because of this relatively long-range interaction, local perturbations of the line density are very unfavorable energetically: for instance, simply putting in locally one extra flux line costs an energy of the order of $H\phi_0$ per unit length, which is much greater than the energy available from any reasonable pinning centers. Thus the arrangement can be irregular only on a scale greater than $\delta$: locally the density must be uniform, and any local variation must be only a slight increase in the local density spread out over a region of radius $> \delta$.

This is the idea behind the concept of "flux bundles"5: that in fact, while it is probably the individual flux line's internal structure (of size $\sim\xi_0$) which is caught by a pinning center, that line individually cannot jump over the barrier alone, because it would get badly out of equilibrium with the local density in its neighborhood; but rather a whole bundle of lines, of radius $\sim\delta$, must move simultaneously. Therefore, of course, it is the force on the total bundle which acts against the pinning barrier. On the other hand, while the line density must be very uniform, the $K_0$ function is actually a very slowly varying one near $r \to 0$ so that the arrangement of lines need not be crystallographically regular, and the bundles can slide past each other reasonably easily.

The free energy of a bundle in the region of a barrier, then, can now be written down. The force is about $JH\delta^2 l/c = J\phi_0 n_b l/c$, where $l$ is the effective length of line over which the force acts, presumably the distance between pinning centers, and $n_b$ is the number of lines in a bundle; as a function of position x of the bundle we have

$$F_{\rm bundle} = JH\delta^2 l\,x/c.$$

The size of the barrier is presumably about $\xi_0$, and the appropriate scale factor for its energy is $(H_c^2/8\pi)\xi_0^3$; suppose a fraction p of this is effective. The total barrier free energy then (realizing that we have both p and l as undetermined parameters so that we need not specify the determinable ones more precisely) is

$$F_b = (pH_c^2\xi_0^3/8\pi) - (JH\delta^2 l\xi_0/c). \qquad (3)$$

It is perhaps barely worthwhile to make a guess at the pre-exponential factors in the rate equation, even though almost all the results are controlled entirely by (3). A particular barrier will allow one of the lines in the bundle through at a rate/sec of

$$R = \omega_0 e^{-F_b/kT}. \qquad (4)$$

$\omega_0$ is a vibration frequency of the bundle, $\sim 10^5$ to $10^{10}$/sec. Since the bundles are of width $\delta$, the rate per unit area is obtained by dividing by $\delta$. The equation for diffusion of flux density $|B|$ is given by finding the rate at which flux enters and leaves a small element of volume:

$$d|B|/dt = -\nabla\cdot(\phi_0\mathbf{R}/\delta), \qquad (5)$$

where the gradient is two-dimensional, $\phi_0$ is the flux unit, and R is of magnitude (4) and directed in the direction of the gradient of magnetic pressure

$$\boldsymbol{\alpha} = \nabla p = \nabla(H^2/8\pi) = \mathbf{J}\times\mathbf{H}/c. \qquad (6)$$

We shall often use the notation $\alpha$ for the appropriate combination in the force term even when JH is not correct.

5 P. W. Anderson, Phys. Rev. Letters 9, 309 (1962).

There are two cases in which we would obviously expect the above reasoning to fail: the two extremes of the Abrikosov state, near $H_{c2}$ and $H_{c1}$, the upper

and lower critical fields, respectively. Near the lower critical field the lines are more than a distance $\delta$ apart; also the differences between B and H, and therefore the complications due to surface currents, etc., are much greater. Qualitatively, we expect that as the number $n_b \to 1$, the value of the field will matter less and less, and the effective force will reduce to that on a single line, proportional to J alone. This is indeed the qualitative behavior in many materials, e.g., those in which we find the force term as $J(H + B_0)$, with $B_0$ of the order of $H_{c1}$.6 But no justification for this particular form has appeared.

As $H \to H_{c2}$, the lines will be forced together until the forces between them are no longer the long-range, smooth electromagnetic forces, varying as $\ln|\mathbf{r}_i - \mathbf{r}_j|$, but are the much more steeply varying forces which ensue when the regions in which $\psi \neq$ const overlap. One would expect the bundle concept to fail completely, and the "hard core" interaction between lines to lead to new effects. One suggestion one might make would be that the lattice of lines may become rigid, so that the bundles can no longer slide independently past each other. In that case the rate might be expected to become much slower, a possible explanation for the "peak effect."7

The two immediate conclusions which were drawn from (5) were the critical current curve and the flux creep rate equation.6 The "critical current" came from supposing that the critical parameters as measured in most cases (notably Kim's experiments6) simply represented a point at which the rate R became immeasurably slow. Call this rate $R_c$. Then we have

$$kT\ln(R_c/\omega_0) = -(F_b)_c$$

or

$$\frac{(JH)_{\rm crit}}{c} = \frac{pH_c^2}{8\pi}\,\frac{\xi_0^2}{\delta^2 l} - \frac{kT}{\delta^2 l\xi_0}\ln\frac{\omega_0}{R_c}. \qquad (7)$$

A similar expression was found to agree reasonably well with the temperature dependence of $\alpha_{\rm crit}$ in Refs. 5 and 6. This expression is somewhat better qualitatively because one has the most structure-sensitive parameter l in both terms.

The flux creep rate equation was also treated approximately in Ref. 5. Equation (5) may be put in a more useful form by writing it as an equation for the pressure gradient $\alpha$ itself (for simplicity we specialize to a one-dimensional situation as in a tube wall):

$$\frac{d\alpha}{dt} = \frac{d}{dt}\left(\frac{H}{4\pi}\frac{dH}{dx}\right) = \frac{d}{dx}\left[\frac{H\phi_0}{4\pi\delta}\frac{dR(\alpha)}{dx}\right].$$

Usually, R will depend exponentially on $\alpha$. H, of course, will depend only roughly linearly on $\alpha$. Also, we really do not quite understand the pre-exponential factors in the rate equation anyhow, and an extra factor H could not easily be checked experimentally in most cases. Thus, it is easiest and within errors of the theory to neglect the derivative of H relative to that of R, and we obtain

$$\frac{d\alpha}{dt} = \frac{H\phi_0\omega_0}{4\pi\delta}\,e^{-F_0/kT}\,\frac{d^2}{dx^2}e^{\alpha/\alpha_1} = K_0\frac{d^2}{dx^2}e^{\alpha/\alpha_1}, \qquad (8)$$

where

$$\alpha_1 = \frac{kT}{\delta^2 l\xi_0} = \frac{F_0/\delta^2 l\xi_0 - \alpha_{\rm crit}}{\ln(\omega_0/R_c)}. \qquad (9)$$

$\alpha_1$ is usually very small, of the order of $10^{-2}\alpha_{\rm crit}$, which means that the barriers are indeed high compared to kT. $K_0$ is defined by this equation and $F_0$ is the force-free barrier height $F_b(\alpha = 0)$.

A steadily decaying solution of (8) is

$$e^{\alpha/\alpha_1} = (-\alpha_1 x^2 + bx + c)/2K_0 t,$$
$$\alpha = \alpha_1\{\ln[(-\alpha_1 x^2 + bx + c)/2K_0] - \ln t\}. \qquad (10)$$

This logarithmic time dependence has been repeatedly observed.6,8

Another solution of the nonlinear diffusion equation (8) is the steady-state one, such as one would obtain physically by supplying power from an external source to maintain a current through a tube or plate sample:

$$(d/dx)\,e^{\alpha/\alpha_1} = c, \qquad \alpha = \alpha_1(\ln c + \ln x). \qquad (11)$$

Because $\alpha_1$ is so small c will be enormously large in all cases, and thus $\alpha$ will be essentially constant throughout the sample, as is physically obvious from the exponential rate dependence on $\alpha$.

6 Y. B. Kim, C. F. Hempstead, and A. R. Strnad, Phys. Rev. Letters 9, 306 (1962).
7 For instance, S. H. Autler, E. S. Rosenblum, and K. H. Gooen, Phys. Rev. Letters 9, 489 (1962).
8 Y. B. Kim, C. F. Hempstead, and A. R. Strnad, Phys. Rev. 131, 2486 (1963).

We have made no progress in studying the transient solutions which are relevant when one quickly applies external fields or currents to a hard superconductor. Clearly, the effect will be to create local concentrations of magnetic pressure which could diffuse away but may well not be able to do so stably, i.e., in such a way as to decay continuously into a steady-state solution. We are currently studying this problem with the idea that possible instabilities in Eq. (8) may well be related to the phenomena of magnet instability.

A whole range of further applications of these ideas are suggested by the observation that the existence of flux creep clearly implies power dissipation in all hard superconductors.8 That is to say not only that apparently perfectly superconducting samples below the so-called "critical curve" are still dissipating power, and thus offering resistance to current flow, at a finite rate, but that apparently nonsuperconducting, resistive samples far above the usually accepted critical conditions are also often truly superconducting in the thermodynamic sense, especially under transient conditions before thermal or flux diffusion instabilities have had time to occur.

A simple way to derive the resistive power dissipation is to start from (5) and Maxwell's equation

$$\nabla\times\mathbf{E} = -\dot{\mathbf{B}}/c$$

in a simple one-dimensional case where we assume we have B in the z direction, J and E in the y direction, and flux creeping in the x direction. Then,

$$\frac{dE_y}{dx} = \frac{\phi_0\omega_0}{\delta c}\,\frac{d}{dx}\exp\left(-\frac{F_0}{kT} + \frac{\alpha}{\alpha_1}\right);$$

i.e., to maintain the flux we have to apply an E field

$$E_y = \frac{\phi_0\omega_0}{\delta c}\exp\left(-\frac{F_0}{kT} + \frac{\alpha}{\alpha_1}\right). \qquad (12)$$

There is, then, clearly a power dissipation

$$P = E_y j_y = \frac{j_y\phi_0\omega_0}{\delta c}\exp\left(\frac{\alpha}{\alpha_1} - \frac{F_0}{kT}\right) \qquad (13)$$

in the material.

This immediately suggests the possibility of severe thermal instabilities in hard superconductors. Let us write down the equation for the heat content of the material:

$$\dot{Q} = C\frac{dT}{dt} = K\nabla^2 T + P.$$

K is the heat conductivity. Suppose a small temperature fluctuation $\delta T(\mathbf{r},t)$ occurs. Whether it grows or decays is determined by the equation

$$C\frac{d\,\delta T}{dt} = K\nabla^2(\delta T) + \frac{dP}{dT}\,\delta T. \qquad (14)$$

Suppose the fluctuation occurs on a scale of size r. Then for stability we must have

$$\frac{K}{r^2} > \frac{dP}{dT} = \frac{P}{T}\left(\frac{F_0}{kT} - \frac{\alpha}{\alpha_1} - \frac{1}{k}\frac{dF_0}{dT}\right).$$

By differentiating Eq. (7) we may obtain an expression for the dimensionless ratio in parentheses:

$$\frac{F_0}{kT} - \frac{\alpha}{\alpha_1} - \frac{1}{k}\frac{dF_0}{dT} = -\frac{T}{\alpha_1}\left(\frac{d\alpha}{dT}\right)_R, \qquad (15)$$

where $(d\alpha/dT)_R$ means that we fix the rate R. Experimental data tell us that this ratio is of the order $10^2$ to $10^3$, which means that thermal instability is a severe problem. That is, the increase in temperature $\Delta T$ of the given region over the surroundings caused by the power input P is of the order

$$\Delta T \sim Pr^2/K,$$

so this tells us that

$$\frac{\Delta T}{T} < \left[-\frac{T}{\alpha_1}\left(\frac{d\alpha}{dT}\right)_R\right]^{-1} \sim 10^{-2}\ \text{to}\ 10^{-3} \qquad (16)$$

is the stability requirement: a very tiny rise in temperature can presage a complete thermal breakdown.

In particular, if, because of excessively rapid current or field changes or of "weak spots," the stress $\alpha$ becomes concentrated in a small region, P will be exponentially larger locally while $K/r^2$ only increases as the inverse square of the size, so that local thermal instability can be a problem. The obvious practical morals are three: first, that magnet configurations allowing good thermal conduction are vital; second, the rather discouraging remark that so far good hard superconductors are also bad thermal conductors for obvious reasons, so that the better the magnet the worse the stability problem will be; and third, because of the factor $1/r^2$ very big magnets will have extra stability problems.

The final remark we would like to make is that the existence of this effective resistivity caused by flux creep allows us to investigate experimentally the creep rate over a much wider range of $\alpha$ than was possible with the original flux decay measurements, by measuring the resistivity of wire samples. In particular, one can go to far higher stresses: the "resistive state" of superconductors.8 In this region Kim et al. have found that the exponential law $R \propto e^{\alpha/\alpha_1}$ begins to fail and is replaced by a roughly linear relation $R \propto \alpha$. This occurs when the effective barrier $F_0 - (\alpha/\alpha_1)kT$ is no longer large compared to kT; we would then expect a viscous resistance to flux line motion, a process of "flux flow" rather than "flux creep," analogous to the motion of magnetic domain walls above the coercive field. Kim has suggested a very useful semi-empirical formula for the velocity of flux lines:

$$v^{-1} = (\text{const}\times\alpha)^{-1} + (\text{const}\times e^{\alpha/\alpha_1})^{-1}. \qquad (17)$$
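The crossover between creep and flow that such a semi-empirical formula describes can be illustrated numerically. The sketch below assumes the interpolation takes the reciprocal-sum form $1/v = 1/(a\alpha) + 1/(b\,e^{\alpha/\alpha_1})$; the coefficients `flow_coeff` and `creep_coeff` stand in for the unspecified constants and are hypothetical, as is the function name.

```python
import math


def flux_line_velocity(alpha, alpha1, flow_coeff, creep_coeff):
    """Interpolate between a linear (flow) law and an exponential (creep) law.

    Assumed form: 1/v = 1/(flow_coeff*alpha) + 1/(creep_coeff*exp(alpha/alpha1)).
    Requires alpha > 0. The slower of the two processes limits the velocity.
    """
    flow = flow_coeff * alpha
    creep = creep_coeff * math.exp(alpha / alpha1)
    return 1.0 / (1.0 / flow + 1.0 / creep)


# Hypothetical constants chosen only to make the two regimes visible.
ALPHA1 = 1.0
FLOW = 1.0
CREEP = math.exp(-20.0)

v_low = flux_line_velocity(5.0, ALPHA1, FLOW, CREEP)    # creep-limited regime
v_high = flux_line_velocity(40.0, ALPHA1, FLOW, CREEP)  # flow-limited regime
print(v_low, v_high)
```

At small stress the exponential term is the bottleneck and the velocity follows the creep law; at large stress the exponential is enormous and the velocity approaches the linear flow law.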


In summary, the general qualitative ideas of flux creep theory appear to be soundly based and to allow for the qualitative and semiquantitative understanding of a wide range of phenomena. Many problems remain, both for detailed quantitative study and even for better qualitative understanding. To list a few of these:

(1) More detailed understanding of flux line interactions, in particular a sounder basis for the "bundle" concept and an understanding of $B_0$ and of the peak effect.

(2) The peculiar transient pulses observed by Kim et al.8 These support the creep idea qualitatively, but are too large to be individual lines or bundles and too small to be instabilities. Are they avalanches?

(3) The nonlinear diffusion equation: when is it unstable?

(4) The viscous state. This is a completely new theoretical problem and I know of no obvious way even of approaching it from fundamental theory.
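The logarithmic decay law of the flux creep equation can be checked numerically. The sketch below assumes the nonlinear diffusion equation in the form $\partial\alpha/\partial t = K_0\,\partial^2 e^{\alpha/\alpha_1}/\partial x^2$ and the decaying solution $e^{\alpha/\alpha_1} = (-\alpha_1 x^2 + bx + c)/2K_0 t$; the parameter values are arbitrary illustrative choices and the function names are hypothetical.

```python
import math


def alpha_solution(x, t, alpha1, k0, b, c):
    """Decaying solution alpha(x, t) = alpha1 * (ln[f(x)/(2*K0)] - ln t)."""
    f = -alpha1 * x * x + b * x + c
    return alpha1 * (math.log(f / (2.0 * k0)) - math.log(t))


def diffusion_residual(x, t, alpha1, k0, b, c, h=1.0e-3):
    """Return d(alpha)/dt - K0 * d^2/dx^2 exp(alpha/alpha1) by central differences."""
    def a(xx, tt):
        return alpha_solution(xx, tt, alpha1, k0, b, c)

    def g(xx):
        return math.exp(a(xx, t) / alpha1)

    dalpha_dt = (a(x, t + h) - a(x, t - h)) / (2.0 * h)
    d2g_dx2 = (g(x + h) - 2.0 * g(x) + g(x - h)) / (h * h)
    return dalpha_dt - k0 * d2g_dx2


# Arbitrary illustrative parameters (not from the paper).
PARAMS = dict(alpha1=0.01, k0=1.0, b=0.0, c=10.0)

residual = diffusion_residual(0.5, 2.0, **PARAMS)
drop_per_decade = alpha_solution(0.5, 1.0, **PARAMS) - alpha_solution(0.5, 10.0, **PARAMS)
print(residual, drop_per_decade)
```

The residual vanishes to within finite-difference error, and the stress falls by $\alpha_1 \ln 10$ for each decade of time, which is the signature of logarithmic creep.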


Probable Observation of the Josephson Superconducting Tunneling Effect (with John M. Rowell) Phys. Rev. Lett. 10, 230-232 (1963)

Having discussed his theory with Josephson and others at the Cavendish, where I spent the academic year 1961-2 at the invitation of Neville Mott, and as an early Visiting Fellow of Churchill College, I alerted John Rowell on my return to the probability that the effect predicted by Josephson was real. I am on the paper because I helped devise ways of making sure it was not a short in the junction; and because under the necessity caused by actually seeing the effect I had to understand why others hadn't and thus to begin to understand the whole concept of coherence and generalized rigidity. I now believe that Giaever may have seen but ignored it. A few months later Ali Dayem and I demonstrated the AC Josephson effect in a metallic microjunction, giving Bell Labs a strong patent position in Josephson devices which they ignored. (For quite a while all SQUIDs were point contact and would have been covered by the patent. They may still be for all I know.) I have been accused of "exploiting" Josephson's discovery with these patents, but after all that's what Bell experimentalists are paid for, and the two experimentalists were the real "discoverers" — we had no intention, or possibility in law, of patenting other than our own discoveries. I went to great lengths, and succeeded all too well, to avoid the Matthew effect with Josephson, thereby depriving John Rowell of much of the credit he deserved.

VOLUME 10, NUMBER 6, PHYSICAL REVIEW LETTERS, 15 MARCH 1963

PROBABLE OBSERVATION OF THE JOSEPHSON SUPERCONDUCTING TUNNELING EFFECT

P. W. Anderson and J. M. Rowell
Bell Telephone Laboratories, Murray Hill, New Jersey
(Received 11 January 1963)

Josephson1 has recently predicted theoretically a new kind of tunneling current through an insulating barrier between two superconducting metals. This anomalous current behaves like a direct tunneling of condensed electron pairs between the Fermi surfaces of the two metals. When the voltage difference across the barrier is zero, the current should be dc but may range between limits of the order of magnitude of the usual tunnel current above the gap, depending on the relative value of the phase of the energy gap function or "second Green's function F" on the two sides, while at a nonzero voltage difference $\Delta V$ it is alternating at a frequency $2e\Delta V/h$.

We have observed an anomalous dc tunneling current at or near zero voltage in very thin tin oxide barriers between superconducting Sn and Pb, which we cannot ascribe to superconducting leakage paths across the barrier, and which behaves in several respects as the Josephson current might be expected to.

Figure 1 shows an X-Y recorder plot of the tunneling current vs voltage for one of these structures at ~1.5°K. The lead and tin films

are both approximately 2000 Å thick, and the junction has dimensions 0.025 × 0.065 cm² and a resistance (both metals normal) of 0.4 Ω. Voltage is applied to two arms of the junction from a 1-kΩ potentiometer and the resulting current flow is measured as voltage across a series resistor of 10 Ω. The voltage appearing across the barrier is taken directly from the other two arms of the junction. Figure 2 shows the plot with current scale expanded to show the anomalous region near the origin. The current at first increases up to a value of 0.3 mA with no voltage appearing across the barrier. At this point the junction becomes unstable and may fluctuate back and forth between the vertical characteristic and the expected "two-superconductor" characteristic. With a small increase in current, the junction settles stably on the latter. Expansion of the voltage scale showed some fluctuations even at lower currents in many cases.

FIG. 1. Current-voltage characteristic for a tin-tin oxide-lead tunnel structure at ~1.5°K, (a) for a field of $6\times10^{-3}$ gauss and (b) for a field 0.4 gauss.

FIG. 2. Current-voltage characteristic for structure of Fig. 1 with expanded current scale. Note the conductance at low voltages in a field ~20 gauss.

One possible explanation that will be suggested is, of course, that in such thin junctions we have not avoided small superconducting shorts across the barrier. There are, however, four experimental points suggesting that this is indeed the Josephson effect.

(1) As pointed out in Josephson's Letter,1 the effect should be quite sensitive to magnetic fields. Since the proposed proportionality to $\mathrm{Im}(\Delta_{\rm Sn}^{*}\Delta_{\rm Pb})$ is not gauge invariant, we expect an additional dependence on $\sin\int_1^2 (2e/\hbar c)\,\mathbf{A}\cdot d\mathbf{l}$, where A is the vector potential, which will lead to cancellation of currents in various parts of the barrier if the magnetic flux flowing between the superconductors reaches one or two quanta. With an area of ~$10^{-7}$ cm², this corresponds to about 1 gauss. We have found (see Fig. 2) that when the junction was carefully shielded by a mu-metal can with a measured interior field of $6\times10^{-3}$ gauss, the vertical characteristic reached 0.65 mA, with no shielding (0.4 gauss), 0.30 mA, and when less than 20 gauss was applied the anomalous behavior was not observed. Fine superconducting filaments should show anomalously high, not low, critical fields.

(2) The effect can only occur if both metals are superconducting and should be proportional to ~$|\Delta_{\rm Sn}\Delta_{\rm Pb}|$. On cooling the junction we find that the vertical characteristic appears at the tin transition (as measured by the negative resistance region of width $\Delta_{\rm Sn}$ first appearing at $\Delta_{\rm Pb}$) within experimental error. It seems unlikely that the "superleaks" would have precisely the same $T_c$ as the tin film.
(3) Critical currents of finely divided specimens never exceed $10^7$ A/cm². Our observed maximum current of 0.65 mA corresponds, then, to an area A of superleaks of not less than $6\times10^{-11}$ cm². Assuming a rather poor conductivity of $10^4$ ohm⁻¹ cm⁻¹ for the filaments, this leads to a conductance, which should be observed even when the filaments are normal, of

$$Y \geq \frac{10^4 \times 6\times10^{-11}}{10^{-7}} = 6\ \text{mho},$$

whereas (Fig. 2) we observe with an applied field at most $2\times10^{-2}$ mho, most of which is probably thermally excited quasi-particle tunneling.

(4) We attempt to "burn out" the leaks by passing increasing currents through the junction, checking the anomalous current between each step. We observed no change in the vertical characteristic for a number of junctions (resistances up to 1 Ω), until (at a voltage usually between 0.3 and 0.6 volt) destruction of the junction resulted. Metallic leaks should burn out before the junction as a whole does.

These last arguments seem to us nearly to exclude the conducting leak hypothesis.

The maximum current we observe is still, even in the best units, about an order of magnitude less than predicted. Other questions concern the fluctuation effects and the fact that thin junctions are necessary to see the effect. We believe that these questions are probably all best elucidated by looking at the effect as related to a coupling energy between the phases of the gap functions on the two sides. Calculations which will be reported elsewhere show that this energy is proportional to the negative cosine of the phase difference, and in magnitude is

$$\Delta E = (\hbar/e)J_1,$$

where $J_1$ is the maximum amplitude of Josephson's predicted current. This energy coupling is reduced both by the presence of magnetic fields and by driving current through the unit.

In order to observe the dc Josephson effect, this energy must be large enough to keep the phases on the two sides coupled against thermal or other fluctuations. The total magnitude of $\Delta E$ for the entire unit is rather small: of order 1 or a few eV in the thin barriers, $10^{-2}$ eV in the more normal units. Obviously the thicker units are completely unstable against thermal fluctuations (remember that most of the circuit is at room temperature or higher). The thin barriers are more stable, but as we drive larger currents through them, or apply stray magnetic fields, we lower the coupling energy, and random fluctuations will eventually destroy the dc current and replace it with noise.

A perfectly rigorous way to think of this process is that the energy $\Delta E$ serves as a barrier against the passage of quantized flux lines through the unit. It will act as a superconductor only if fluctuations over the applied magnetic stresses cannot drive lines through the barrier region at an appreciable rate.

We wish to acknowledge the assistance of L. Kopf in the preparation of the tunnel units, and discussions with V. Ambegaokar. B. D. Josephson informs us that he has independently reached some of the theoretical conclusions of the last few paragraphs.

1 B. D. Josephson, Phys. Letters 1, 251 (1962).
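A few of the order-of-magnitude estimates in the Letter can be reproduced with a short sketch. The functions below are illustrative helpers, not anything from the paper: the alternating-current frequency uses $2e\Delta V/h$ as stated in the text, the coupling energy assumes the prefactor $\hbar/e$ (the numerical factor is an assumption here), and the minimum leak area divides the observed current by the quoted limiting current density of $10^7$ A/cm².

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C
HBAR = H_PLANCK / (2.0 * math.pi)


def josephson_frequency_hz(voltage_v):
    """Frequency 2*e*V/h of the alternating current at a voltage difference V."""
    return 2.0 * E_CHARGE * voltage_v / H_PLANCK


def coupling_energy_ev(current_a):
    """Coupling energy (hbar/e)*J1 in eV; the hbar/e prefactor is an assumption."""
    energy_joule = HBAR * current_a / E_CHARGE
    return energy_joule / E_CHARGE


def minimum_leak_area_cm2(current_a, max_current_density_a_cm2=1.0e7):
    """Smallest filament area able to carry the current at the limiting density."""
    return current_a / max_current_density_a_cm2


freq_1uv = josephson_frequency_hz(1.0e-6)   # frequency for a 1 microvolt bias
energy = coupling_energy_ev(0.65e-3)        # observed maximum current, 0.65 mA
area = minimum_leak_area_cm2(0.65e-3)
print(f"{freq_1uv:.3e} Hz, {energy:.2f} eV, {area:.1e} cm^2")
```

Under these assumptions a 1 microvolt bias corresponds to a frequency of several hundred MHz, the coupling energy for 0.65 mA is a few eV, and the minimum leak area is about $6\times10^{-11}$ cm², consistent with the figures quoted in the text.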

Image of the Phonon Spectrum in the Tunneling Characteristic between Superconductors (with John M. Rowell and D.E. Thomas) Phys. Rev. Lett. 10, 334-336 (1963)

John was reluctant to exploit the Josephson Effect too extensively because we thought this line of work more important — it was after all the ultimate experimental proof of the BCS mechanism. We took the data down to Penn to show them to Bob Schrieffer, who I knew was developing an on line gap-equation-solving capability. Thus I returned the favor he had done Morel and me by pointing us toward Eliashberg. He and students Scalapino and Wilkins provided an accompanying letter using our suggested modeling of the phonon spectrum. I wrote up all this history in 1987 as "When the Fat Lady Sang."


PHYSICAL REVIEW LETTERS, VOLUME 10, NUMBER 8, 15 APRIL 1963

IMAGE OF THE PHONON SPECTRUM IN THE TUNNELING CHARACTERISTIC BETWEEN SUPERCONDUCTORS J. M. Rowell, P. W. Anderson, and D. E. Thomas Bell Telephone Laboratories, Murray Hill, New Jersey (Received 14 March 1963)

The equation for the energy gap function Δ(E) in a superconductor, using the Eliashberg¹ phonon interaction between electrons, predicts that Δ(E) will exhibit structure at energies determined by the phonon energies, specifically at the energy gap Δ plus multiples of the phonon frequencies² (assuming an Einstein phonon spectrum). It is evident that structure in the energy gap function Δ(E) will be reflected in the conductance vs voltage plot for a metal-insulator-superconductor tunnel junction, since the conductance is proportional to the density of states.³ Structure in lead which can be related to the phonon spectrum has been reported by Giaever⁴ and structure at phonon harmonics, also in Pb, by Rowell, Chynoweth, and Phillips.⁵ Considerable refinement in experimental technique has allowed us to make a much more detailed quantitative study of this phenomenon. This and the next Letter report the following developments: (a) In Pb we have resolved the structure in detail and can assign much of it to specific Van Hove singularities expected from neutron measurements of the Pb phonon spectrum.⁶ (b) As a result we are able to propose a detailed model for Pb in terms of phonon-electron coupling strengths and phonon frequencies, which is consistent with the observed gap and which quantitatively predicts the over-all shape of the density of states vs energy curve. (c) We have also found phonon structure in Al which correlates with the known phonon spectrum⁷ of that metal, as well as in Sn for which no phonon data are available.

(a) Figure 1 shows a plot of essentially d²I/dV² against voltage V (I is the current) for a Pb-Pb tunnel junction at 1.3°K. The "two-superconductor" junction is used to give a sharp peak in density of states with which to probe the phonon structure. The applied ac signal used to obtain d²I/dV² was 42 μV rms (~kT) and no resultant smearing of the structure is expected. It can be shown theoretically⁸ that at a Van Hove singularity (a discontinuity in derivative of the phonon density of states, resulting from a stationary point in the phonon frequency ω vs k), the second derivative of the tunnel current should show a discontinuity. Predicted Van Hove singularities from reference 6 (these are only those along symmetry directions, and are certainly not a complete set) are denoted by arrows. The quoted errors in the neutron work as well as possible temperature dependence of the frequencies are large enough that discrepancies between discontinuities and observed points of steep slope are not significant.


The finite widths of the discontinuities appear to be caused mainly by the fact that the gaps themselves are not sharp, presumably because of inhomogeneity in the films. This is demonstrated by the lack of discontinuity in I at a bias of 2Δ in the I-V characteristic. The smaller structure of Fig. 1 is reproducible from unit to unit; it is interesting that the peak marked "?" corresponds in energy to the structure observed in infrared absorption.⁹ [This is also true for tin (see Fig. 2, curve C).] From Fig. 1 it is evident that two prominent groups of phonons are involved, a transverse group near 4.5 mV and a longitudinal one near 8.5 mV. Harmonics and sum frequencies of these two groups are prominent in the structure at still higher biases,⁵ and these are marked on curve B of Fig. 2. In reference 5 the longitudinal peak was assigned to a harmonic, accounting for the amplitude discrepancy noted there.

(b) The two groups of phonons are the basis of the model, simplified for machine calculation, which Schrieffer, Scalapino, and Wilkins (following Letter³) have used in the calculation of the function Δ(E) in Pb. The meaningful prediction of their calculation is the smoothed density of states which corresponds closely to the first derivative dI/dV; we show the experimental curve for an Al-Pb junction at 1.5°K (Al not superconducting), in Fig. 2 (curve A).

(c) In Fig. 3 we show the dI/dV vs V plot from 25 to 45 mV for an Al-Pb junction at 0.35°K.

FIG. 1. d²I/dV² vs V (measured from Δ) for a Pb-Pb junction at 1.3°K. Arrows indicate the bias for Van Hove singularities expected from the data of Brockhouse et al. (Horizontal axis: (V − Δ) in millivolts.)

FIG. 2. (a) (dI/dV)_S/(dI/dV)_N vs V (measured from Δ) for an Al-Pb junction at 1.5°K. (b) d²I/dV² vs V (measured from Δ) for a Pb-Pb junction at 1.3°K. Arrows mark the sum and harmonic positions of the main structures of Fig. 1. (c) d²I/dV² vs V (measured from Δ) for a Sn-Sn junction at 1.3°K. (Horizontal axis: (V − Δ) in millivolts.)

FIG. 3. (dI/dV)_S/(dI/dV)_N vs V (measured from Δ) for an Al-Pb junction at 0.40°K compared with the phonon spectrum for aluminum. (Axes: (V − Δ) in millivolts; ν in 10¹³ cps.)

Also shown is Walker's⁷ Al phonon spectrum (recalculated by Phillips⁷). It will be seen that the prominent longitudinal peak at 34 mV, as well as the end point of the spectrum at 39 mV, are reflected in the tunnel characteristic. Structure at lower biases was masked by the Pb but an Al-Al sandwich will be investigated. Figure 2 (curve C) also shows d²I/dV² vs V for a Sn-Sn sandwich and a large amount of structure is observed, some near the Debye energy of 17 mV. The rather surprising low-energy structures were at first thought to be due to Pb impurity in the Sn films but were exactly reproduced after thorough cleaning of the evaporation system and use of a high-purity tin source. The fundamental structure in tin is only as strong as the sum and harmonic structure in lead (curve B) and therefore about A the magnitude of the lead fundamental peaks.

We wish to acknowledge valuable discussions with J. R. Schrieffer, D. J. Scalapino, and J. W. Wilkins, and the assistance of L. Kopf in the preparation of the tunnel junctions and of J. Klein with the differentiation techniques.
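The Letter identifies two prominent phonon groups in Pb, a transverse one near 4.5 mV and a longitudinal one near 8.5 mV, and notes that their harmonics and sums appear at higher bias. As a small illustrative sketch, the expected two-phonon positions can be enumerated as follows. The function name and the labels "T" and "L" are assumptions made for the example, and only the two quoted fundamental energies come from the text.

```python
from itertools import combinations_with_replacement


def multiphonon_biases(fundamentals_mv, max_phonons=2):
    """List (label, bias) pairs for sums of 2..max_phonons fundamentals.

    fundamentals_mv maps a group label to its energy in mV, measured from
    the gap. Biases are returned in increasing order.
    """
    positions = {}
    groups = sorted(fundamentals_mv.items())
    for n in range(2, max_phonons + 1):
        for combo in combinations_with_replacement(groups, n):
            label = "+".join(name for name, _ in combo)
            positions[label] = sum(energy for _, energy in combo)
    return sorted(positions.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Fundamental energies quoted in the text for Pb.
    pb_groups = {"T": 4.5, "L": 8.5}
    for label, bias in multiphonon_biases(pb_groups):
        print(f"{label}: {bias} mV above the gap")
```

For the two Pb groups this gives a transverse harmonic at 9 mV, a sum at 13 mV, and a longitudinal harmonic at 17 mV above the gap, all within the bias range covered by Fig. 2.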


¹G. M. Eliashberg, Zh. Eksperim. i Teor. Fiz. 38, 966 (1960) [translation: Soviet Phys. JETP 11, 696 (1960)].
²P. Morel and P. W. Anderson, Phys. Rev. 125, 1263 (1962).
³The exact expressions will be given in the following Letter [J. R. Schrieffer, D. J. Scalapino, and J. W. Wilkins, Phys. Rev. Letters 10, 336 (1963)].
⁴I. Giaever, H. R. Hart, Jr., and K. Megerle, Phys. Rev. 126, 941 (1962).
⁵J. M. Rowell, A. G. Chynoweth, and J. C. Phillips, Phys. Rev. Letters 9, 59 (1962).
⁶B. N. Brockhouse, T. Arase, G. Caglioti, K. R. Rao, and A. D. B. Woods, Phys. Rev. 128, 1099 (1962).
⁷J. C. Phillips, Phys. Rev. 113, 147 (1959); C. B. Walker, Phys. Rev. 103, 547 (1956).
⁸P. W. Anderson, J. R. Schrieffer, and D. J. Scalapino (to be published). These discontinuities do not appear in the normal-metal-superconductor theory of reference 3, but only in the two-superconductor characteristic; this is in agreement with our observations.
⁹D. M. Ginsberg and M. Tinkham, Phys. Rev. 118, 990 (1960); D. M. Ginsberg, P. L. Richards, and M. Tinkham, Phys. Rev. Letters 3, 337 (1959); P. L. Richards, Phys. Rev. Letters 7, 412 (1961).


Exchange in Magnetic Insulators Transition Metal Compounds, Buhl Int. Conf. on Materials, Pittsburgh, ed. E.R. Schatz (Gordon and Breach, New York, 1964), pp. 17-28

In the Buhl lecture of 1963, I talked about the "magnetic state" which characterizes all strongly repulsive systems, an insight still very valuable and unavailable to those trained in the school which nearly monopolizes U.S. solid-state education, and which emphasizes band calculations and one-electron theory. This latter school is very much beholden to the influence of J.C. Slater and his students. The motivation for the Buhl lecture was really Bob Shulman's simultaneous infatuation with his Jaguar and with food and wine; the former took us to Pittsburgh to partake of the latter unforgettably, with Bob Heikes and his wife. Heikes ran the workshop and, before his distinguished management career, contributed interesting ideas in this area. The "magnetic state" concept was a first crude vision of the dichotomy which becomes increasingly clear as the years go on, between systems based on the "one-electron" physics of weakly interacting electrons, which can be well treated by "LDA" methods and direct computation, as opposed to magnetic systems which are dominated by the electronic repulsion and persist in defying even cleverly modified versions of those methods.


EXCHANGE IN MAGNETIC INSULATORS

PHILIP W. ANDERSON

It's been my experience that giving a talk on material to which my own contribution was mostly completed a number of years ago is a very frustrating thing. In the first place, the advances in the field which have been made recently were by other people and it's always a difficult thing to describe properly other people's work. More serious is the fact that when one has completed a reasonably large job, one is always completely convinced that nothing further can be done and the theory has a beautiful, completed, closed appearance. A couple of years later one finds that this has faded somewhat and one is tempted to start off down the various new avenues of thinking which have opened up since the last time one looked at the subject. So I will admit that the longest part of my talk, which will be about my own work, is really in the nature of a very long introduction and, from my point of view, I am going through it only to reach the more interesting new things which have been opened up by the work of Toru Moriya, Bob Shulman and Conyers Herring. I should mention that practically everything I'm talking about was done in collaboration with Shulman, Sugano, and Moriya.

The contributions I made to this field a while back were based on a name and the conceptual background to which this name led one. The name, the idea, is the idea of the Magnetic State of an insulating material. The concept which is implied is that there is a specific kind of state of the gas of electrons in a substance which one should call the magnetic state, which is distinguishable from other kinds of states of the electron gas, and which one should understand before one really starts thinking about such properties as exchange. I can best define what I mean by state by giving some examples: the metallic state, the insulating or semiconducting state, or the superconducting state.
Perhaps the most rigorous way of defining this concept of state is in terms of the low lying elementary excitations of the system. For instance, in the metallic state, these low lying elementary excitations are very numerous and have energies which go right down to zero; they are near a surface in momentum space, and they carry charge. In an insulator or semiconductor on the other hand, the low lying elementary excitations have an energy gap, are near isolated points in momentum space and carry charge. The basic point that I would make about this idea of state is that one can't really hope to make any progress, for example, in understanding the electronic behavior of the metal, the superconductor, or the semiconductor unless one has firmly in mind the basic distinguishing characteristics which make it a metal as opposed to a superconductor, say, and roughly why. For instance, start with a finite number of individual sodium atoms far apart, and start bringing them closer together. We could know everything we liked about the properties of each of these individual sodium atoms or even about combinations of a few of them but nothing about them is going to tell us anything about the properties of metallic sodium, i.e., of a macroscopic piece of sodium metal in which we have put a macroscopic number of sodium atoms together. Nothing is going to tell us until we actually try it out that the electrons are going to come off the individual atoms and take on free electron properties.

I don't mean to imply that the science of magnetism, before the magnetic state came along, was really in such a miserable position as that one would be in if one would try to understand the idea of a metal without the idea of the free electron, but I do mean that there is a distinguishable magnetic state and that the reason for its occurrence is rather uniform in all cases. Using the idea that there is such a state simplifies and clarifies thinking about magnetism. The reason why, for so many years, one could talk reasonably intelligently about magnetism, whereas one couldn't possibly talk about the metallic state without realizing it was a distinguishable state of matter, is that the case in which magnetism is favored is the case in which the individual atoms are reasonably far apart. In a zeroth rough approximation they act like the individual isolated atoms would act. For instance an isolated atom of iron or manganese or gadolinium or sodium itself is magnetic. If we keep the atoms far enough apart, we have examples of magnetic substances. So it's a kind of zeroth approximation to treat magnetism as though the atoms or ions were just individual separated atoms and treat their interactions as perturbations.
But it still is important to understand that when we put the atoms together into a solid there is a real problem as to whether the substance will take on the magnetic or metallic or some other state. As I said, if we bring sodium atoms together they do not go into the magnetic state; they go into a state in which the free electrons have come off the sodium atoms and paired their spins, and the resultant metal may be diamagnetic or paramagnetic but certainly doesn't have the Curie-Weiss behavior that is connected with the magnetic state. We bring the atoms of gadolinium together and we find that they do stay magnetic. For this audience, actually, I hardly need to go in very great detail into the background of what is really happening in the magnetic state. You all know that the man who first explained what is going on was Mott in 1949. Mott imagined that we had some lattice of ions of the kind which we normally expect to be magnetic, say Ni⁺⁺ ions (see Fig. 1). In order to make the system neutral, of course, we have to make an ionic crystal out of it and we have to put something in between. We can put in oxide ions or we can talk about a really dilute substance like nickel sulfate. Then we can compare the energy of a hypothetical magnetic state with a non-magnetic state. First, suppose that each Ni⁺⁺ ion has the configuration (d⁸) and a moment of two Bohr magnetons. You can imagine that each nickel ion has precisely its own 8 d electrons; and then the spins are free and the electrons are localized. Then one can take an electron from one of these Ni⁺⁺ ions, making it an Ni⁺⁺⁺, and put that electron on a very distant Ni⁺⁺ ion, and you will have an Ni⁺

Fig. 1. (Lattice of Ni⁺⁺ ions separated by ligands: O, SO₄, F, etc.)

Fig. 2. (Lattice of Ni⁺⁺ ions after one electron has been moved from one ion to a distant one, leaving an Ni⁺⁺⁺ and an Ni⁺.)

ion with 9 d electrons. The Ni⁺⁺⁺ has only 7. (See Fig. 2.) Two energy changes will have occurred. The first energy change is quite obvious. When we took the electron from one ion and put it on another, we were taking it from an ion where there were only 7 other d electrons and were putting it on an ion where there are 8. Thus the extra electron is repelled by the other extra electron on that ion by an amount which to zeroth approximation is given by the Coulomb repulsion integral between two d electrons: we can call it U and estimate it by the integral

$$U = e^2 \int d^2(\mathbf{r}_1)\,\frac{1}{r_{12}}\,d^2(\mathbf{r}_2)\,d\mathbf{r}_1\,d\mathbf{r}_2 .$$

This is the repulsive potential energy between two d electrons in the same orbital d. When one directly evaluates that by integration, one doesn't come out with a very reasonable value; for actually doing the problem, it is sensible to take the difference in energies between an Ni⁺⁺⁺ and an Ni⁺ ion as compared to a pair of Ni⁺⁺ ions from tables. One comes out with numbers of the order of 10 to 15 electron volts. So that's quite a large number and quite a large amount of energy that we've lost.

Fig. 3.

We've regained, however, a certain amount of energy because now the hole on the Ni⁺⁺⁺ ion and the extra electron on the Ni⁺ ion can hop back and forth from one atom to another, and so they are essentially free electrons now instead of localized electrons. If we imagine that there is a d band in the substance (see Fig. 3), the hole can occupy the very highest position in the band and the electron, if it likes, can occupy the very lowest position in the band. Say that the band width is b; we've gained back an energy of order b because the localized states of course are equal mixtures of all the states in the band. Now, again writing down an integral which does not really represent the most exact way of going at it but is what it would be in this highly oversimplified uncorrelated kind of picture, the band width will be roughly given by the hopping integral between the two local sites:

$$b_{jk} = \int d\mathbf{r}\; d(\mathbf{r}-\mathbf{R}_j)\left[\frac{p^2}{2m}+V\right] d(\mathbf{r}-\mathbf{R}_k).$$

R_j and R_k are the positions of neighboring sites. We find then that if the energy U, the Coulomb repulsive energy, is much greater than b, then we haven't regained enough energy so that the electrons really will continue to separate themselves and become free electrons; therefore they will stay in the localized magnetic state. If U is of order b (actually it turns out that because of various numerical factors and because of screening the relationship is not symmetrical in U and b), one goes into the metallic state, the electrons separate and free themselves from their sites and one has a metal. It is actually found that in certain oxides of the lower d series, there probably is a thermal transition between the two possible states, magnetic and metallic.

The great advantage of these ideas in thinking about problems having to do with magnetic materials is that they give one, in the first place, a zeroth order state, from which one can apply perturbation theory. This zeroth order state (or rather the zeroth order states) are then those in which all the electrons stay on their individual sites. The first excitations are those in which the electrons hop from one site to a nearby site, thereby losing an energy U; then one can go on from that to a perturbation theory in principle and hope that that perturbation theory converges. If it converges then one has an elementary excitation or quasi-particle way of thinking about magnetic problems. In a metal, it is useful to think of one's elementary excitations as being very like free moving single electrons; in the zeroth approximation the wave function of a single electron is just a plane wave, the first approximation is a Bloch wave in which one has modified the wave function for the effects of the background lattice, but it still is a single electron function, and the last approximation is the quasi-particle. And one can think on any one of these three levels depending on how accurately one wants to solve the problem.

Analogously, in the magnetic state one thinks of one's zeroth order of approximation as the electrons in single, isolated atom electronic states, and then one has to take into account in the first approximation the fact that the electron is in a periodic lattice containing ions of Ni⁺⁺ and all kinds of other things, especially the nearby ligands. So from this, one must go to ligand field theory states which have a certain amount of the appropriate ligand functions mixed into the atomic functions, and possibly even the d functions are changed. Finally one comes to a really accurate quasi-particle theory in which one thinks of the electron as completely renormalized for all the possible many kinds of electron excitations that should surround it as it sits on its individual site with its given spin. For instance, a certain amount of spin polarization of the background lattice will be carried with it of course. So the lowest energy excitations are the rotations of the spins of the electrons on the localized sites; to this zeroth approximation there is no interaction between the spin directions on the different sites.
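Mott's comparison described above, in which creating a separated hole and extra electron costs U and regains an energy of order b, can be put into a very crude sketch. The function names and the `threshold` parameter are assumptions made for illustration; the lecture itself stresses that numerical factors and screening make the real criterion asymmetric in U and b, so the threshold of 1 used here is not a value from the text.

```python
def pair_creation_cost(coulomb_u_ev, bandwidth_b_ev):
    """Net energy (eV) to create a free hole and a free electron.

    U is lost to on-site repulsion; an energy of order b is regained
    because the carriers can move to the band edges.
    """
    return coulomb_u_ev - bandwidth_b_ev


def favoured_state(coulomb_u_ev, bandwidth_b_ev, threshold=1.0):
    """Return 'magnetic' if U / b exceeds the assumed threshold, else 'metallic'."""
    if coulomb_u_ev / bandwidth_b_ev > threshold:
        return "magnetic"
    return "metallic"


if __name__ == "__main__":
    # U of order 10 to 15 eV is quoted in the text; b = 1 eV is hypothetical.
    print(pair_creation_cost(12.0, 1.0), favoured_state(12.0, 1.0))
    # Hypothetical wide-band case where the regained energy exceeds U.
    print(pair_creation_cost(1.0, 2.0), favoured_state(1.0, 2.0))
```

The point of the sketch is only the sign of the net cost: when it is large and positive the electrons stay localized and their spins remain free, which is the zeroth order state from which the perturbation theory starts.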
To a higher approximation, there will be such interactions and it's these interactions that we're interested in finding; but in general the low lying excitations are spin reorientations without real motion of electrons from one site to another, only virtual motions of the electrons between the sites. Then one has real charge carrying excitations, the lowest charge carrying excitations in the zeroth approximation having an energy U. We can correct U then for admixture, for polarization, etc., and then there is presumably some real physical renormalized U which represents the lowest lying energy for excitation of a free electron in the material. So from this point of view we have a certain number of parameters which can be thought of as either integrals over the atomic functions or on a different level as renormalized parameters. The important ones then are basically the quantities b and U, and a number of others we shall talk about shortly.

Two important points I would like to make before I go any further. One is that the biggest effect, the most interesting effect of the background lattice on this magnetic state problem, is the admixture of the background lattice wave functions into the one electron wave function. Once one has taken that into account it is possible, though of course it can't be proved, that the further perturbations due to the background lattice are not really very big. A second point is that it is never correct, under the circumstances in which the magnetic state is the correct state, to talk as though one's basic wave functions were eigenfunctions of the one electron problem, because the one electron problem is periodic, and the eigenfunctions of a periodic Hamiltonian are running waves. We see here that the starting wave functions, because of the size of U as opposed to b, must necessarily be localized wave functions, and the only possible zeroth approximation is to make them localized. Thus we must be very careful not to make the standard assumption, that one always makes in all other many body problems, that the one electron Hamiltonian is diagonalized by the starting wave functions.

Now starting from this point of view one can identify terms in the interactions of the spins on two neighboring sites. The biggest term is something which I have called kinetic exchange. Let's start from our one-electron orbitals on the two neighboring Ni⁺⁺ ions. The zeroth approximation is the atomic function

$$U_m(\mathbf{r}-\mathbf{R}_j).$$

We have said that we have to admix into these orbitals a certain amount of the p function on the ligand, for instance an F⁻ function, with an admixture parameter λ:

$$\phi = U_m(\mathbf{r}-\mathbf{R}_j) - \lambda U_p .$$

Then if we make the same kind of wave function on a second Ni⁺⁺ we do not come out with an orthogonal function, and it is very convenient to work with orthogonal wave functions, so we orthogonalize. Then our basic wave function also contains a certain amount of U_m on the other atom, i.e. a term in U_m(r − R_k), whose coefficient is second order in λ, very small. Let us study the energy as a function of the relative spin of an electron in

$$\psi_d(\mathbf{R}_j)\ [\ldots]\ +\ \frac{b_{jk}}{U}\,\psi_d(\mathbf{R}_k).$$

The energy improvement that it gets by doing that will be −b_jk²/U, but the electron on R_k can do the same thing so we get an energy contribution, if the spins are anti-parallel, of −2b_jk²/U. This is always anti-ferromagnetic, and I contend it is the largest spin coupling energy in the problem. Therefore in general the spins want to be anti-parallel. That conclusion is true only when b_jk does not vanish; if the wave function on R_j, for instance, is a dπ with a node pointing along the line toward the ligand whereas the wave function on R_k is a dσ, b_jk vanishes by symmetry. In summary, if there is a b_jk, kinetic exchange is probably the largest contribution.

The second biggest contribution is what I call potential exchange and that comes about because the space wave functions of the two electrons, though orthogonal, do overlap. When the two electrons are anti-parallel, as I said, that has no particular effect, but when the two electrons are parallel we know that there will be what is called the exchange hole in the overlap region. This occurs because the two electrons really can never be at the same point at the same time. Thus the parallel state is favored because the electrons are not quite as close together when they are parallel as we would think. The correction relative to just taking a straightforward Coulomb interaction integral between the two wave functions is the potential exchange integral, given by

$$\int\!\!\int \psi_d(\mathbf{r}_1-\mathbf{R}_j)\,\psi_d(\mathbf{r}_1-\mathbf{R}_k)\,\frac{e^2}{|\mathbf{r}_1-\mathbf{r}_2|}\,\psi_d(\mathbf{r}_2-\mathbf{R}_j)\,\psi_d(\mathbf{r}_2-\mathbf{R}_k)\,d\mathbf{r}_1\,d\mathbf{r}_2 .$$

We see that this exchange is the self energy of the overlap charge distribution, given by

$$\rho_{jk}(\mathbf{r}) = \psi_d(\mathbf{r}-\mathbf{R}_j)\,\psi_d(\mathbf{r}-\mathbf{R}_k).$$

[...] as to when exchange will be positive, when negative, when it will be relatively large and when it will be relatively small, in terms of the covalency parameters and the crystal field parameters. These rules tend to work out very nicely in most cases; in fact I don't know of any cases which really contradict them. On the other hand, the really crucial test is to try to calculate something from first principles and really get a value for one exchange integral. There is a particularly fortunate case that one can calculate, the case of KNiF₃. The most fortunate thing about it is that Shulman and Sugano have already done the ligand field theory. One can think of the ligand field theory as the theory of the isolated elementary excitation of the electron gas, and the exchange theory is a theory of the interacting electrons. We have in this a calculation of the isolated electron problem which leads to reasonable values for various parameters and particularly for the crystal field parameter. The second nice thing about KNiF₃ is that it's particularly conveniently set up for the exchange problem. See Fig. 4.

Fig. 4. (KNiF₃ structure; K on planes above and below.)
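The kinetic exchange estimate given earlier, an energy of −2b_jk²/U for anti-parallel spins on two neighboring sites, can be checked against a simple model. The sketch below compares it with the exact singlet energy of a two-site, two-electron model with hopping b and on-site repulsion U (the standard two-site Hubbard model). That model is brought in here purely as an illustration and is not part of the lecture. Function names and the sample values of b and U are likewise assumptions.

```python
import math


def kinetic_exchange_energy(b_jk, u):
    """Second-order energy of the anti-parallel pair: -2 * b_jk**2 / U."""
    return -2.0 * b_jk ** 2 / u


def two_site_singlet_energy(b_jk, u):
    """Exact singlet ground energy of the two-site model (triplet is at 0).

    It is the lower eigenvalue of the 2x2 matrix [[0, -2b], [-2b, U]]
    coupling the covalent and ionic singlets.
    """
    return 0.5 * (u - math.sqrt(u * u + 16.0 * b_jk * b_jk))


if __name__ == "__main__":
    b, u = 0.1, 10.0  # hypothetical values in eV with U >> b
    perturbative = kinetic_exchange_energy(b, u)
    exact_singlet = two_site_singlet_energy(b, u)
    # The anti-parallel product state is an equal mixture of singlet and
    # triplet, so to leading order its energy is half the singlet lowering.
    print(perturbative, 0.5 * exact_singlet)
```

For U much larger than b the two numbers agree closely, and the agreement degrades as b/U grows, which is the regime where the perturbation expansion from the localized state is expected to fail.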

One main point is that the next nearest neighbor nickels are practically completely out of touch with each other. One really can treat the classical problem here of two nickel ions on opposite sides of the fluorine ion. It is even true that the only magnetic electrons on the nickel are in d_z² orbitals which point toward the fluorine and so one can have the classical picture of two d_z² orbitals interacting through a pσ. As I said, Shulman and Sugano have calculated the crystal field splitting 10Dq. A calculation which leads to 10Dq is certain to give a value for b_jk because the two processes which lead to the major contributions to these two things are very similar. A major contribution to 10Dq comes from the interaction between the d sigma orbital and the p sigma orbital on the fluorine reacting back again, essentially, on the Ni⁺⁺ ion. The major contribution to b_jk presumably comes from the same admixture and interacting with the other nickel. It is really practically the same process. Thus from Shulman and Sugano's numbers, we calculated a very satisfactory answer for the quantity 2b_jk²/U. U now is something one gets from atomic tables and corrects with various corrections which seem reasonable and accurate to within any kind of accuracy that we are talking about here, which is not much better than 50%. We came out with a value for J_jk. One can calculate the Curie point, knowing this; we came out with a Curie point which was 75%, as I remember, too high. We were very happy about that because we knew that the potential exchange interaction was of the opposite sign to this, and we expected it to be somewhat smaller, and so it looked as though this was about the right answer.

Now, however, with Moriya doing most of the work, we have calculated the potential exchange integral. We didn't do it before simply because the potential exchange integral involves a lot of three center integrals which are very hard to do. Using the expression

$$\psi_d(\mathbf{r}-\mathbf{R}_j) = U_m(\mathbf{r}-\mathbf{R}_j) - \lambda U_p - \gamma\,U_{\mathrm{Ni}}(\mathbf{r}-\mathbf{R}_k)$$

and putting in an s term also, and pumping that into the potential exchange integral, one can thereby divide it up into integrals involving all the various possible exchange integrals between d on atom j, p or s on the fluorine and d on atom k. There turn out to be something like 18 different 2 and 3 center integrals. Most of them had been done by Shulman and Sugano but a number of the three center ones had to be estimated by Moriya in a way which I think is fairly satisfactory. The way he estimated them was this: for instance, suppose you want to do

$$\int\!\!\int \left[U_m(\mathbf{r}-\mathbf{R}_j)\right]^2 \frac{1}{|\mathbf{r}-\mathbf{r}'|}\,U_p(\mathbf{r}')\,U_m(\mathbf{r}'-\mathbf{R}_k)\,d\mathbf{r}\,d\mathbf{r}';$$

that is a three center integral but you can, from looking at values of 2 center integrals, make a fairly good estimate of where the center of charge of the overlap between p and d on atom R_k is. One knows that the center of the charge distribution (U_m)² is right on atom R_j; then the integral is obviously pretty close to the overlap between p and d divided by the distance between the centers of charge. One can make a number of estimates like that; I'm sure they're close enough to right for the time being, and so one can calculate the potential exchange integral. The result was of the same size as the kinetic exchange integral so we came out with a transition temperature of zero, essentially.

Now this was surprising because it represented an unusually large failure of what is called McWeeny's rule, which is the rule I mentioned before, that exchange integrals are always considerably smaller than direct Coulomb integrals. The biggest Coulomb integral involved is pretty obviously the Coulomb integral on the fluorine, times the admixture of the fluorine orbital into R_j twice and the admixture twice into R_k. That admixture gives one a term of order λ⁴ × (Coulomb integral of U_p). It turns out that the exchange integral was of the order of 1/3 of that term; in other words, one has a cancellation only to 1/3. If one looks at typical potential exchange integrals in other cases, one finds that there is usually a reduction factor of from 1/5 to 1/35. So we're in worse shape here than one usually is with regard to McWeeny's rule, and we don't quite understand why it seems to come out that way.

This may really not be as bad as all that because these calculations are extremely sensitive to the values of the various parameters one chooses. One really doesn't know U to within better than 20% or maybe even 30 or 40%; on the other hand, to make things correct again, one would have had to change U by about 80% and we didn't want to do that; that seemed unlikely to be correct, since it's about twice our expected error. Another parameter one could change is λ_p. This integral is more sensitive to λ_p than b_jk/U, and the sensitivity to λ_p is such that we would only have to change it by 30 or 40% to get it back into agreement; but we believe we know λ_p better than that. For one thing, it's an experimental parameter coming from nuclear resonance results; and for another thing the experimental and theoretical values check, and all the theoretical values involving λ_p for the properties of the isolated ion check, so why should this one property convince us that we must use a λ_p 30% smaller? That's somewhat outside, but definitely outside, the error we would like to assign to it.

Now, in view of the fact that there are 20 or 30% effects involved here, we went back to our original calculation and found that we had probably not done it much better than 20 or 30% even from a purely theoretical point of view; so Shulman, Sugano and I are presently involved, with the help of M. Karplus, in redoing the original calculation of b_jk.
It turns out that $b_{jk}$ does involve at least two rather nasty three-center integrals, which we had approximated by assuming Phillips' rule that wave functions which are properly orthogonalized do not see very large contributions from the core potentials. We used this rule to approximate certain integrals, particularly the integral involving the nickel $j$, nickel $k$, and the one-electron potential on the fluorine. That may or may not be right, so we are going back and recalculating that. My own personal bias, on the other hand, is that this and the other possibilities already mentioned are not likely to be the major source of the difficulty. The reason for this is clear from a number of things, perhaps most directly from thinking about the recent work of Conyers Herring on exchange. Herring's approach is quite different from the one here: it starts from a more purely theoretical point of view. He starts from the idea that, as I said, the magnetic state is the state of atoms basically at large distances from each other; he says to himself, then, let's start with our atoms infinitely far apart and calculate always the very first term in their interaction energy which depends on the relative orientations of their spins. He was able to show, for instance, in this limit, that
we'll call the limit $R$ goes to infinity, that the Heisenberg Hamiltonian

$$\sum_{ij} J_{ij}\, \mathbf{S}_i \cdot \mathbf{S}_j$$

is a complete description of the spin interaction problem. It is correct to order $e^{-2R/a_H}$, where $a_H$ is the radius of the typical orbits on the different atoms. In the limit that $R$ goes to infinity, that goes to zero very rapidly. Thus his approach was to attack the problem from the point of view that he was going to expand in this exponential. From that point of view, one has only to consider pairs, and to consider the interaction of pairs in the order of $e^{-2R/a_H}$. It all sounds very simple, but the first thing one finds out when one looks at it is that even the simplest of all the exchange problems of this sort, simply an isolated pair of hydrogen atoms, the problem of H$_2$, has never been solved. One thinks of H$_2$, the hydrogen molecule, as completely solved by numerical methods, but numerical methods don't help you much when you get into a mathematical limit, because the term you're looking for is exponentially smaller than the terms that the computing machine is giving you. So it's simply impossible to have the computing machine do this "easy" problem. It can't be done. Herring did in fact solve the problem of H$_2$ finally, for the first time, in the course of this work, but that is not the major result that we're interested in now. What I'd like to talk about is certain very general features of his work. There were three qualitative results on the problem of the exchange between two atoms very far apart. Point No. 1 is that the intermediate region between the two atoms is very much larger than the region in which the electrons are near their separate atoms. Specifically, since the configuration space of this problem is 6-dimensional (there are 2 electrons and each one of them has 3 coordinates), the amount of space in which both electrons are quite far from both atoms is much larger than the amount of space in which they are close to the atoms.
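The remark that a computing machine cannot reach the large-separation limit can be made concrete with a small sketch. It assumes modern double-precision arithmetic (not the machines of the period), a total energy of order one in arbitrary units, and an exchange term that scales purely as $e^{-2R/a_H}$ with all prefactors dropped; the function names are hypothetical and chosen only for this illustration.

```python
import math
import sys

def exchange_scale(r_over_a):
    """Leading exponential e^(-2R/a_H) quoted in the talk; prefactors are omitted (assumption)."""
    return math.exp(-2.0 * r_over_a)

def survives_in_double(term, background=1.0):
    """True if adding `term` to an O(1) background energy changes the float64 result."""
    return (background + term) - background != 0.0

# Relative spacing of double-precision numbers near 1.0.
EPS = sys.float_info.epsilon

for r_over_a in (5, 10, 15, 20, 25):
    term = exchange_scale(r_over_a)
    resolved = survives_in_double(term)
    print(f"R/a_H = {r_over_a:2d}   e^(-2R/a_H) = {term:.3e}   resolved: {resolved}")
```

Under these assumptions the exponential drops below the rounding error of the background energy somewhere between $R/a_H = 15$ and $20$, so a direct difference of two total energies carries no information about the exchange term beyond that point.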
In general, almost all the exchange integral of the lowest order comes from the intermediate region, where all the particles in the problem are very far apart from each other and where the free-atom wave functions are not good, so it's not very useful to use an atomic-wave-function kind of theory. This Point No. 1 is not likely to be as important in our kind of problem, because we've got an F$^-$ in between the two atoms. It has a core potential which is very deep and quite attractive to the electrons that are involved in the exchange, and therefore it is very likely that the electron, in getting back and forth between the two atoms, will usually go by way of the fluorine atom. We expect to get a fair approximation by confining our considerations to atomic wave functions, although looking at Herring's work without thinking about the presence of the intermediate atom, we might think we would not be correct in doing so. Point No. 2: he finds in general that in his problem, in H$_2$ specifically, the potential and kinetic terms very nearly exactly cancel. One can write out the exchange integral in
various ways in his theory, and one of them is to write a rigorous wave function $\chi(r_1, r_2)$ derived from rigorous solutions of the problem. This rigorous two-particle wave function has electron one centered around atom $j$ and electron two centered around atom $k$. There is also a wave function one calls $P\chi$, which is the permuted wave function, and then one computes the exchange integral by

$$J = \int \chi(r_1, r_2)\, H(r_1, r_2)\, P\chi(r_1, r_2)\, dr_1\, dr_2.$$

This Hamiltonian $H$ of course contains kinetic energy terms, core potential terms, and $e^2/r_{12}$. The term in $e^2/r_{12}$ is a kind of a general potential exchange; that in $p^2/2m + V_{\text{core}}$ is a generalized kinetic exchange. He found that these two terms in the hydrogen molecule problem very nearly cancelled. Now at first you would think this cancellation is analogous to our cancellation; again, I think that is not a correct conclusion. The point is that in the isolated-atom problem so far one is working with neutral atoms, and therefore if electron one is near atom $j$ and electron two is somewhere in all the rest of space, the potentials of the core and of the other electron very nearly cancel, and so, to a zeroth approximation, the other electron doesn't see the first atom; that is the basic reason why one gets this cancellation. That is simply not true if they are positive ions instead of atoms. So the kinetic term should be larger in our problem, because we're talking not about hydrogen atoms but about atoms which have rather deep core potentials, very much stronger than the potentials involved in H$_2$, and also about atoms which have a charge. Thus we have Conclusion No. 1, that exchange comes from all space: not true in our case, or not necessarily true, although that is yet to be shown. Conclusion No. 2 is that potential and kinetic terms cancel; that can't be true in our case in general, in any case involving ions or in any case involving core potentials, because our potentials are just very much larger. Conclusion No. 3, which probably is true, is that correlation, particularly in the $e^2/r_{12}$ potential exchange term, is very important. In fact, the only difference between the Heitler-London exchange answer for the H$_2$ problem and the true exchange answer is that at really large distance the Heitler-London answer changes sign and the potential exchange term becomes larger than the kinetic exchange term.
There is a general theorem that says that can't be true, and the only way one can satisfy this general theorem is by concluding that the correlation between the two electrons keeps them farther apart than the straightforward one-electron theory would tell you. Thus the conclusion, which I think is probably true for us also, is that our problem is correlation in the potential exchange term. To see that this can have a big effect, one can make a very simple calculation. One of the many integrals of Shulman and Sugano which we used in our calculation of the potential exchange term was the integral involving the Coulomb interaction of two $p$ electrons on the fluorine ion,

$$\int \varphi_p^2(r_1)\, \frac{e^2}{r_{12}}\, \varphi_p^2(r_2)\, dr_1\, dr_2.$$
If you calculate that straightforwardly from just the wave functions, this Coulomb interaction is 23 electron volts. This is just calculated. There's another way to estimate it, and that is to go to tables of ionization potentials and ask yourself what is the difference in energy between a fluorine ($-$) ion and a fluorine ($+$) ion. The electron affinity of fluorine is roughly 4 volts, and the ionization energy of fluorine (the $E$ of F$^+$) is roughly 17 volts. Now if we subtract these two, we get the difference between an F$^-$ and an F$^+$ relative to a pair of F atoms. Again, we're calculating a $U$ straight from the atomic tables by saying we are going to calculate the process F + F goes to F$^+$ + F$^-$, and that difference, if we assume that the wave functions do not change in the interim, is the Coulomb interaction of two electrons. That difference then is 13 eV experimentally. This is actually an overestimate, because in this case we are working, on the average, with fluorine wave functions which are considerably more compressed than F$^-$ wave functions presumably are. We really should calculate the problem F$^-$ + F$^-$ going to F + F$^{--}$, but we can't calculate that because it doesn't have the appropriate electrons in it, and besides nobody has ever seen F$^{--}$; this presumably would have a still smaller Coulomb interaction effect. So we can conclude that we have seriously overestimated the Coulomb interaction term in the potential exchange interaction, that probably the correct potential exchange interaction can be reduced by a factor something like a half for the correlation effect, and that we are then in reasonable agreement. That is the best we can do at this point. So in conclusion we must concede that exchange in d-band ionic crystals looks like a more subtle and difficult problem than it looked 2 or 3 years ago. I still believe, personally, that we have calculated the major component and that we are working in the right ballpark.
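The table estimate above is simple arithmetic, sketched here using only the rounded figures quoted in the talk (23 eV from the wave functions, about 4 eV electron affinity, about 17 eV ionization energy). The function and variable names are hypothetical, and the ratio is only as good as those rounded inputs.

```python
# Rounded values quoted in the talk, in electron volts.
COULOMB_FROM_WAVE_FUNCTIONS_EV = 23.0
ELECTRON_AFFINITY_F_EV = 4.0
IONIZATION_ENERGY_F_EV = 17.0

def atomic_table_u(ionization_ev, affinity_ev):
    """Energy cost of F + F -> F+ + F-: pay the ionization energy, regain the affinity."""
    return ionization_ev - affinity_ev

u_tables = atomic_table_u(IONIZATION_ENERGY_F_EV, ELECTRON_AFFINITY_F_EV)
reduction = u_tables / COULOMB_FROM_WAVE_FUNCTIONS_EV

print(f"U from atomic tables: {u_tables:.0f} eV")
print(f"Ratio to the calculated Coulomb integral: {reduction:.2f}")
```

The ratio comes out a little above one half, and since the talk argues that the table value is itself an overestimate, this is in line with the suggested reduction of the potential exchange term by something like a factor of a half.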
If one wanted actually to do more problems, one would get roughly the right value for the exchange integrals. That this is an achievement can be pointed out by the fact that, as far as I know, no magnetic Curie point has ever been calculated from anything like first principles to within a factor of 10. As far as I know, very few second-order transition points of any kind, in fact no second-order transitions except possibly the superconducting transition of lead, have ever been calculated to within 100% accuracy. So even 100% accuracy for a second-order transition point is still a remarkably satisfactory result.

Discussion

W. J. Carr Jr., Westinghouse: Isn't it true that also in simpler problems, say in two-atom problems, what you call the kinetic exchange, although it's larger, is approximately equal to the potential exchange? I'm not sure of that, but I believe it's correct.

Dr. Philip W. Anderson: Yes, in all cases the kinetic exchange is somewhat bigger than the potential exchange; it has to be, unless there is actual orthogonality between the two wave functions.

W. J. Carr Jr.: But they are still the same order of magnitude, aren't they?

Dr. Philip W. Anderson: Yes, that is what I said in the table. In the only case which
has been really calculated rigorously, H$_2$, they are of very much the same order of magnitude. I said, however, that I don't think that is a general theorem that we can apply in every other case.

J. Appel, General Atomic: I would like to ask Dr. Anderson whether he can make any comments on how large the correlation correction to the $b_{jk}$ could be in a particular case, say nickel oxide.

Dr. Philip W. Anderson: It is very hard to say. I have not thought much about it. I have a feeling that the correlation correction to $b_{jk}$ is less than it is to $U$ or to this potential exchange parameter. I don't know how much less, but I think considerably so. For one of the important things to realize about $b$ is that although you write it using the whole one-electron potential $H$, there are ways of rewriting it with only the kinetic energy, and it is really basically an off-diagonal kinetic energy matrix element. One can approximately take out the dependence on the core potentials, and in those circumstances it should be less sensitive to correlation effects.

Dr. Frederic Keffer, University of Pittsburgh: Can you get any estimate out of your theory of the biquadratic exchange which is now believed to be important in manganese oxides?

Dr. Philip W. Anderson: Yes, you can get estimates. It is just a higher term in the kinetic exchange, but presumably there are also potential terms. One can get the order of magnitude very easily. It has got to be of the order, I think, of $b^4/U^3$, but there may be a factor of 10 in the coefficient because there are many more terms possible here than there are otherwise.

R. F. Wood, Oak Ridge: In your wave function you mix in F$^-$ orbitals in a molecular-orbital type of calculation. You're trying to get the description of the electron in the vicinity of the fluorine. Do you feel that this approach really allows enough flexibility in the wave function, or should you perhaps mix in higher excited states on the fluorine atom also?
You say that you can measure the covalency parameter through NMR experiments. But if you did allow higher excited fluorine functions into the calculation, what then would you be comparing your experimental results on covalency to theoretically?

Dr. Philip W. Anderson: Well, the second part is not mine to answer. I think that's really directed to Bob Shulman and the resonance experimentalists. To the first part I would say yes: if one makes a correlation correction to the part of the potential exchange that comes from the F$^-$, a considerable part of that is coming from excitations to higher states on the F$^-$. You can describe an expansion or a contraction of one of
the F$^-$ orbitals, such as one would have if another electron goes in, as an excitation to a lot of higher states. I think it will be there, and I don't know, something like 10% is certainly kind of a minimum order of magnitude for what it would be.

R. F. Wood: I really wonder if this covalency parameter is very well defined except, perhaps, on this somewhat simplified model. There's only one parameter entering into the wave function that you are using now, as I understand.

Dr. Philip W. Anderson: Yes.

R. G. Shulman, Bell Labs: I'll try to answer the second part of the question. Certainly if you could mix in excited states of fluorine, or any number of other excited states of the isolated atoms in the crystal, one should be able to get a better wave function and a better solution. There is no disagreement about it. But the question is how far can you go by mixing in the first, most obvious excited state, which is the state starting with the fluorine electron in the 2s 2p orbitals. So you go that far. You do your experiments and you interpret your experiments in terms of a certain amount of that mixture of 2p 2s. You calculate also, assuming that this is the only degree of departure from the free ion of the nickel plus 2. And you get pretty good agreement with whatever parameters of the d electrons you are able to calculate, which are fairly extensive. In fact, at that first go-around, the agreement was good enough in the calculation that we didn't feel justified in increasing the flexibility of the wave function. I mean, you could always make it a better wave function, but what would the criterion be? The criterion should be that it would give better agreement with experimental results.
Well, we had pretty good agreement with experimental results, considering the accuracy of the measurements and the accuracy of the calculation, so that we felt that there was no justification in invoking some new physical mechanism, which, say, would be an excited state of one of the two atoms involved; so what you say is certainly true in the ideal case. Now the consequences of that upon this second case, the more complicated case of how things interact through the fluorine, have been answered.

Dr. Philip W. Anderson: Perhaps I did not make it clear enough when I was talking about 10 to 20 percent accuracy. I don't think from a one-electron point of view you would want to put in that much of some higher state. I think the calculations in terms of just the states we're using are probably very accurate from a one-electron point of view. They are probably the best you can do, and you will get relatively minor corrections by including higher states. What I meant when I said there are probably a lot of higher states around is that in the correlation effects there will be a lot of excitations and a lot of polarization effects involving higher excited states, probably.

R. F. Wood: This says that you think you have a fairly good approximation to the Hartree-Fock one-electron state.

Dr. Philip W. Anderson: Yes.

R. G. Shulman: There too I would like to point out that the approximation is quite good to
the antibonding state, because we are concerned with an electron which is primarily a d electron. So the d electron involves a small amount of fluorine mixture, and you minimize the energy of the mixture. But the properties that we were concerned with were the properties of a magnetic electron, primarily a d electron, and we set up the Hamiltonian with a nickel surrounded by six fluorines in order to calculate the properties of an electron which was centered there, and detached it from the crystal. I don't think we would want to make claims about knowing very much about the p electrons on the fluorines, because they are out toward the edge of our problem and did not have the magnetic properties that we were generally concerned with. The problems we were concerned with were the electrons in an orbital which was predominantly d in character, with just a small admixture of the p.

John B. Goodenough, Lincoln Lab, M.I.T.: In your estimate of the $b_{jk}$ that entered the kinetic energy term, you did not make clear how much of the splitting $10Dq$ was assumed to be due to the admixture of wave functions from the fluorine ions. There is also a large electrostatic portion to $10Dq$. What do you feel are the relative magnitudes of the electrostatic and the covalent contributions to this splitting?

Dr. Philip W. Anderson: Well, that again is a question for Bob, but I know the answer. The answer is there are lots and lots of contributions. I said that's the largest one, and it is perhaps the largest one, but there are many contributions. For instance, when one is thinking about these things qualitatively one does not think exchange contributions are going to be very large; they are, they're quite large. The Shulman-Sugano calculation of the crystal field theory has all of the various kinds of terms in, and they all turn out to be roughly of the same order of magnitude.
You can regroup them in certain ways, and perhaps after regrouping a large fraction comes from the covalency, but not by any means all. There's an electrostatic contribution, an s contribution, a p contribution, etc.; it's a mess. Incidentally, they add up in an entirely different way in $b_{jk}$, so that agreement between the two is strictly coincidental.

R. G. Shulman: In answer to your question about the point charge contributions to the crystal field splitting: we did publish it, but it is a very long thing to read, and it goes like this. It depends upon what you mean by point charge contribution, but let us take the semiclassical model of the point charge and calculate it in a quantum-mechanical way, not in a semiclassical way. You see, the semiclassical way of calculating the point charge model is to assume that you have d electrons in the metal in quantum-mechanical orbitals, and then you have point charges out here; although they are really atoms, they are not really quantum-mechanical, they are point charges. Now of course this point charge calculation gives a certain contribution to $10Dq$ of the correct sign, and in the case of potassium nickel fluoride it gives a contribution of about 1200 wave numbers, whereas the major part of the calculated value of 6,000 wave numbers or so came from the off-diagonal matrix element which Anderson mentioned.

Superconductivity (Two Opinions) (with B.T. Matthias) Science 144, 373-381 (1964)

A relic of my (mostly) love (some) hate relationship with Bernd. I came to be sure, even before 1987, that the full story of many superconductors included something new, while he became increasingly depressed by the evidence that phonons were right. Bernd's great strength was his refusal to play by other people's rules. With the metallurgists and chemists who were his natural competitors he exploited the narrowness of vision which led them to synthesize and characterize compounds in traditional ways, without recognizing that modern solid-state physics had opened up a cornucopia of useful and interesting classifications such as ferroelectricity, ferromagnetism ("pro" or "anti") and superconductivity, which could be recognized with essentially trivial measurement techniques. With theorists, it was our focus on simple models and simplest cases which was the Achilles' heel he found and, to be sure, used to the hilt to mystify and confuse us. The theorists loved to make elaborate theories of complicated effects, requiring beautiful precision measurements on the simplest, cleanest possible materials. Their success in doing this convinced them that they had "solved superconductivity". But Bernd realized that the average nonspecialist had no idea of the intellectual structure of the field and that, to him, the proper role of theory is to predict whether substance A or substance B is superconducting at all, and if so what is its critical temperature. (Reporters are the ultimate nonspecialists, but even physicists in other fields fall easily into this mode of thinking, and in fact one completely ignores it to one's peril.) 
Such questions are extraordinarily difficult, in large part because the answer comes mostly from negative results: substance A contains atoms Z, X, and Y which can combine in thousands of different ways, all of which must be less stable than A for it to exist at all in the given chemical state; it must then choose the physical state (crystal structure, magnetic structure, electronic structure) which favors superconductivity. So one is asking the theorist to do thousands of (to him) irrelevant calculations, most of them impossibly difficult, before he starts. To make the situation clearer, perhaps as many as a couple of dozen crystal structures, and no melting points, have been calculated accurately from first principles; and Tc is in principle (though not in fact) very much more complicated than either.

Thus Bernd pooh-poohed the predictive power of theory very convincingly for many years, and even entrapped many theorists into too literal interpretations of the parameters they fed into model theories, hence straying unsuccessfully into his bailiwick of chemical intuition and empirical rules. But with the increasing sophistication of the genuine microscopic theory especially exploited by Rowell, Dynes and McMillan, he seemed to lose his enthusiasm for the bizarre speculations he was wont to tease us with. As you will see later, I, on the other hand, was increasingly disturbed by the evidence that there were persistent "bad actors" among the higher Tc superconductors which did not obey the rules and had generally mysterious properties. Almost certainly, Bernd's claim to immortality will be that he dragged solid-state physics, much against its will, into taking seriously the many, many substances which did not fit our neat, logical — but probably incorrect — categories. Bernd and I could both have used a touch of the philosophy of General Semantics: superconductor A is not identical with superconductor B until proven so.

24 April 1964, Volume 144, Number 3617

Superconductivity

Two physicists, P. W. Anderson and B. T. Matthias, approach this important phenomenon from different points of view.

I. A Theoretical Approach

The conditions under which this article is being written are unusual. With the other side of the coin being ably presented by my colleague, B. T. Matthias, I will not have to qualify my statements or judiciously distribute credits and concessions, but can flatly state my opinions, for what they are worth. I suspect that I will be proved wrong in some measure; I hope the fact of my stating these opinions will stimulate other physicists to try to prove me wrong. The fact is that the theory of superconductivity, to which Bardeen, Cooper, and Schrieffer (1) made the most important contribution, which was announced almost exactly 7 years ago (2), has had unprecedented and amazing success in changing this phenomenon from one of the most obscure to one of the simplest and best understood of all the phenomena relating to the properties of matter. Since this statement appears to contradict what Matthias says in the discussion that follows, and since much (not all) of the disagreement is a matter of semantics and of philosophy, I would like to discuss briefly some of the elementary facts about the nature of scientific theories and of scientific proof which we all think we understand but which, on a little deeper examination, we find are not precisely the way we imagined.

P. W. Anderson is affiliated with Bell Telephone Laboratories, Murray Hill, N.J. B. T. Matthias is on the staff of the department of physics, University of California, La Jolla, and the Bell Telephone Laboratories, Murray Hill, N.J.

Nature of Scientific Theories

The scientific method advances, we all learned, by induction from experimental facts. The scientist experiments, evolves hypotheses, tests them against more experiment, applies Ockham's razor, and has learned something. That is of course understood, but I suspect many physicists do not think carefully about how the process really goes in a mature science such as the physics of matter. The experiments against which a theory must be tested are not merely those under direct consideration but the ones carried out over the past 50 to 100 years which have given us all-but-absolute, unshakable confidence in a certain structure of fundamental laws: quantum mechanics, relativity, statistical mechanics, the consequences of symmetry, the regular nature of crystals, the band theory, and so on. I could give you tens of examples of the following general rule: a theory which contradicts some of these accepted principles and agrees with experiment is usually wrong; one which is consistent with them but disagrees with experiment is often not wrong, for we often find that experimental results change, and then the results fit the theory.

Here is a simple example: some time ago Heisenberg and Koppe proposed a theory of superconductivity (3) which predicted correctly that no transition temperature would be higher than 19°K. It was not a bad theory at the time, but Koppe is the first to agree now that it really was not a very reasonable approach, and that the theory is in disagreement with several now well-known principles of solid-state theory. I suspect he would also agree that a theory which was consistent with all the a priori known facts of solid-state theory but did not predict a definite value for the upper limit on the transition temperature $T_c$ would certainly have been more valid than his.

Thus the first (many would say the main) task of a solid-state theorist confronted with a phenomenon (like superconductivity, ferromagnetism, or, an example I shall continue to use, semiconductivity) is to find a way in which the special behavior of matter under consideration can be accounted for in a way that is merely consistent with all the things we already know about solids and may already know about the phenomenon. Actually, this is usually, for practical purposes, the end of the story: there is usually only one sensible way to account for the facts about the phenomenon consistently with known principles of physics.

Band Theory of Semiconductivity

Let me take an example which may be familiar to many readers: semiconductivity. The band theory of semiconductivity is mostly attributable to the work of Wilson (4) in 1932, and by 1940 or so, Mott and Gurney could say flatly (5), "The accepted theory of semiconductors is due to Wilson." If we examine the experimental data up to that time we find that this acceptance rested on little else than the plausibility of the theory; no quantitative, and very little qualitative, data existed until after World War II. The theory had certainly predicted nothing quantitatively. The thermoelectric effect, the Hall effect, and the conductivity were found to behave in a manner made plausible by Wilson's theory, and that was about it. From 1945 to 1950, such detailed qualitative achievements as Pearson and Bardeen's wide-ranging experimental study of silicon (6) and the various experiments associated with transistor action verified the theory in qualitative detail; but still the quantitative basis of the phenomena was practically untouched: the band structure in all its detail, on which all of the energetics and much of the transport theory depended, and the precise coupling constants which controlled the scattering and thus controlled all transport phenomena. Yet one could hardly find anyone aware of the evidence who would question the basic rightness of the theory.

In the final stage of the advances of the past decade or so, two developments typical of an absolutely mature (perhaps even a dying) science have occurred: the correlation of detailed experiment with detailed quantum chemical calculations of band structures, and the beginning of the development of a "rigorous" mathematical theory. (I use quotes because of course the "proofs" in such theories always depend on the intrinsically unprovable convergence of the extraordinarily complicated infinite series which result from perturbation theory, as well as on major, and unstated, idealizations of the physical world, as when all crystalline imperfections are neglected.) Semiconductivity theory is one of the few instances of a theory in which the quantitative aspect, the third stage, has been handled with a reasonable degree of completeness.
I have an idea that even though superconductivity theory is so young, it is already entering its third stage—the stage of quantitative energy and transition-temperature calculations, the stage in which there are some pretensions to mathematical rigor.

The BCS Theory

From the very first the Bardeen-Cooper-Schrieffer (BCS) theory of superconductivity was remarkably successful in its qualitative and semiquantitative aspects. Perhaps this is as good a time as any to give a brief account of it. The theory stems from an idea of Leon Cooper's, published in late 1956 (7). He showed that if the electrons in a metal have an attractive interaction, the commonly accepted state, the so-called "Fermi sea," is not stable, for the reason that two electrons taken from the top of the Fermi sea (the so-called Fermi level) can then always form a bound-pair state, thereby lowering their energy relative to the Fermi level. Since any two electrons may do this, a macroscopic number of bound pairs may form, changing the state qualitatively.

A wave function for a bound pair of electrons may be written

$$\psi(r_1, r_2) = \varphi(r_1 - r_2)\, \exp[iK \cdot (r_1 + r_2)]$$

where $\hbar K$ is the momentum of the center of mass and

$$\varphi(r_1 - r_2) = \sum_k \exp[ik \cdot r_1 - ik \cdot r_2]\, f(k).$$

Thus, $\psi$ is a superposition of states in which the two electrons simultaneously occupy one-electron states with momenta $(-k + K)$ and $(k + K)$. The most reasonable, and in fact the correct, value of $K$ is zero, hence this is the origin of the famous $k, -k$ pairing. It turns out, also, that the spins of the two electrons are opposite too: the pairs bind in a state whose spin, momentum, and angular momentum are zero.

It was the momentum condition which gave Bardeen, Cooper, and Schrieffer the clue to go farther. They picked out that part of the interaction in which bound pairs of fixed total momentum zero occurred and found that a mathematically tractable (in fact exact) theory of that part could be developed. The resulting theory can be described in physical terms by saying that the entire gas of free electrons condenses into a single state of bound pairs. (One recognizes at once that the occupied electron states deep in the Fermi sea are almost always occupied in $k, -k$ pairs even in the normal state, so that they actually do not change much in this condensation; the properties of both normal and superconducting metals are entirely determined by states only near the Fermi level. This is the main way in which the exclusion principle enters the problem.) To unbind a pair and create an unbound particle or "normal quasi-particle" costs the binding energy of the pair and thus gives rise to the famous "energy gap." But this extra particle is neither a hole nor an electron, because it keeps emitting and absorbing bound pairs into the zero-momentum state, changing from electron to hole when it emits, and vice versa. This is the brilliant and rather tricky new feature of this theory which makes it so successful: the essential discarding, in a sense, of the detailed and exact conservation of particle number.

Electron pairs may be added to or subtracted from the sea of bound pairs; when one metal is in contact with another metal, this is the process which determines the Fermi level. Because these electron pairs occupy the whole metal (their momentum is fixed at zero, usually) they may be added or [...]ing in concert, any attempt to change the momentum of a single pair will effectively unbind that pair, whereas a change in the momentum of the whole electron gas is a macroscopic process involving macroscopic energy barriers; the "rigidity" of the bound pairs leads to quantization of flux.

The analogy with rigidity of a solid is close and enlightening. Though a bar of steel is full of vacancies, phonons, and other defects, we still are not surprised to find that it transmits a force from one end to the other without loss. That force is transmitted by the lattice of condensed atoms as a whole, the random motions of individual atoms being too small to disturb it. Similarly, supercurrents are transmitted by the condensed electronic state as a whole.


Schafroth, Blatt, Butler Theory

This picture of the superconducting state is rather similar to a proposal made by Schafroth, Blatt, and Butler (8), before the BCS theory was proposed: that superconductivity is caused by a Bose condensation of bound pairs of electrons. Just as helium atoms are a bound state of six fermions but obey Bose statistics, they argued, pairs of electrons can act like bosons and condense into the lowest energy state, K = 0, as it is suspected helium atoms do in the superfluid. Several years later, in fact, Blatt and others completed the proof that all the results predicted by the BCS theory can be made, by dint of great effort, to follow from such a picture. I think it is fair to say that this equivalence, and the importance of the ideas of Blatt, Butler, and Schafroth, have been overlooked to some extent.

Why do I then give the lion's share of credit to Bardeen, Cooper, and Schrieffer? Why isn't the BCS theory just a highly convenient working approximation within the Blatt-Butler-Schafroth theory? There are three reasons.

1) A very practical one: it is not obvious that, without the hints of the BCS theory, the mathematical results which correspond so closely to experiment would ever have been obtainable by Blatt and his associates from their very difficult and complicated theory. In any case, Bardeen, Cooper, and Schrieffer were certainly the first, by some years, to give any mathematical results for comparison with experimental results; in the event, the agreement was amazingly good. Before the BCS theory was advanced, electron pairs were merely a plausible hypothesis among other hypotheses.

2) A difference in emphasis which is practically one of principle: according to the BCS theory the thermal breakdown of superconductivity is dominated by the breakup of pairs (the energy gap) and not by their evaporating as pairs from the condensate of pairs for which K = 0. In metals, pairs in which K ≠ 0 practically do not exist, because of electromagnetic interaction effects, in contrast to the situation for ⁴He (9). This physical fact was not recognized in the Blatt-Butler-Schafroth theory until much later.

3) The most important difference, one of principle: actually the BCS theory is a better and more useful one than the corresponding theories pertaining to helium II that existed at the time Blatt et al. constructed their theory, even better than theories held until very recently, when the usefulness of the BCS type of theory began to be recognized (10).
A theory of the BCS type, to be workable, had perforce to introduce breakdown of the assumption of exact conservation of particle number, which now turns out to be by far the best way of describing superfluidity as well as superconductivity.

Let me make this important point another way. Photons are bosons. In dealing with microscopic quantum processes we find it useful to deal with photons as particles, and to work with states which we describe as containing a fixed small number of photons. In such a state the electric field fluctuates, particularly in phase, and phase and photon number are conjugate variables. In dealing with "classical" (wavelike) electromagnetic phenomena on a macroscopic scale, on the other hand, we are familiar with the procedure of dealing directly with the electric field as a fixed "c-number." We know, then, that the number of photons must be uncertain, but that disturbs no one. For macroscopic coherence and interference phenomena it is far better to use eigenstates of the field operator E, not the number operator n. We can of course make transformations from one to the other, because the wave and particle pictures are dual and equivalent; but for classical interference phenomena the description in terms of waves (fields) is far better.

Bardeen, Cooper, and Schrieffer had perforce (though they tried very hard not to) to use states emphasizing the field of electron pairs, rather than the number, in this same way. Basically, the phenomenon of superfluidity is best described in terms of the electron field. This choice of coherent states (11) in the particle field led directly to the gauge-invariance difficulties which troubled some critics of the theory in the first year or so after it was proposed, but by about 1959 that problem had been cleared up (9). Now, on the other hand, it is clear that in such phenomena as the Josephson tunneling effect or flux quantization the phase of the electron wave field has real physical meaning (12), and thus a strict interpretation of the requirements of number conservation and gauge invariance, such as was insisted upon by the Australian school, cannot be maintained.

Shortly after the BCS theory was proposed, a number of other theories based on similar ideas appeared.
All of these in the end turned out to be essentially equivalent to BCS, although for a computation of a particular kind, any of the four (to my knowledge) others [Bogolyubov and Valatin's quasiparticles (13); Koppe (14), Suhl, and Anderson's (9) pseudospin approach; Nambu's spinor self-energy (15); and Gor'kov's Green-function theory (16)] might be superior. As I said, Blatt's theory is also equivalent to BCS, but I know of no applications in which it is preferable, and the correct version appeared only much later.
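
The photon analogy used earlier in this section can be made concrete with a short numerical sketch. It assumes a single-mode coherent state in the sense of reference (11), for which the photon-number distribution is Poissonian with mean |alpha|^2; the function names and the truncation `n_max` are illustrative choices, not taken from the text. The point it shows is that a state with a definite field amplitude has an uncertain particle number, and that the relative uncertainty shrinks as the field becomes macroscopic.

```python
import math

def coherent_number_distribution(alpha_abs, n_max=400):
    """Photon-number probabilities P(n) for a coherent state with |alpha| = alpha_abs > 0.

    Assumes the standard Poisson form P(n) = exp(-|alpha|^2) |alpha|^(2n) / n!,
    truncated at n_max (an illustrative cutoff).
    """
    mean = alpha_abs ** 2
    probs = []
    for n in range(n_max + 1):
        # Work with logarithms so that n! does not overflow for large n.
        log_p = -mean + n * math.log(mean) - math.lgamma(n + 1)
        probs.append(math.exp(log_p))
    return probs

def number_statistics(alpha_abs, n_max=400):
    """Return (mean, variance) of the photon number for the truncated distribution."""
    probs = coherent_number_distribution(alpha_abs, n_max)
    mean = sum(n * p for n, p in enumerate(probs))
    var = sum((n - mean) ** 2 * p for n, p in enumerate(probs))
    return mean, var

if __name__ == "__main__":
    for alpha_abs in (1.0, 3.0, 10.0):
        mean, var = number_statistics(alpha_abs)
        print(f"|alpha|={alpha_abs}: <n>={mean:.3f}, spread/mean={math.sqrt(var) / mean:.3f}")
```

For a Poisson distribution the variance equals the mean, so the relative spread falls as 1/|alpha|: the number is never sharp, yet for a large field it is sharply peaked in relative terms.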

Predictions and Experiment

Let us return to the mainstream of the discussion. The BCS theory led immediately to a number of predictions about the properties of superconductors, which depended on a small number of semiempirical parameters: the transition temperature or the energy gap at T = 0; the density of states; the velocity at the Fermi surface. The properties considered fell into two classes, and in both of them the predictions were, on the whole, in excellent agreement with experiment. In the first class were phenomena which depended primarily on the existence of an energy gap and its temperature dependence: the specific heat, the Meissner effect, optical absorption. In the second were a number of phenomena which depended critically on the "coherence factors" of the theory.

It was fashionable for a number of years to say knowledgeably that, after all, the BCS theory was not "proved," that only in the matter of the energy gap was there agreement between experiment and theory. This was not true. In the original paper it was pointed out that the striking difference between nuclear relaxation, which peaks at Tc and then drops, and ultrasonic relaxation, which drops at Tc, was a coherence effect depending delicately on the form of the theory.

This type of coherence effect can be explained in a very simple way. As we pointed out, the BCS assumption is that the pairs have zero spin, zero angular momentum, and zero linear momentum. Another way of saying this is that the paired electrons exemplify time reversal: they are in states in which the momentum and spin of one are reflections of those of the other. Many perturbing effects (scattering from a surface or a chemical impurity, or from an acoustic wave) are static in nature, that is, "time-reversal-invariant." Such a perturbation does not affect the bound pairs in a superconductor, in the BCS approximation (17).
A magnetic perturbation, on the other hand (for example, the magnetic field of a nucleus or of a magnetic impurity) has just the opposite nature. This is the source of the two contrasting kinds of effects. One of the most striking predictions of this type to be supported by experiment is that ordinary chemical impurities do not lower Tc much and do narrow the distribution of energy gaps (18), whereas magnetic impurities do sharply lower Tc and "smear out" the gap distribution, effects leading in extreme cases to "gapless superconductivity." The effect on Tc was measured before it was arrived at theoretically, but the effect on the gap was predicted before it was measured (19). According to the BCS concept, these impurity-scattering phenomena are diagnostic for pairing of electrons of the simple type assumed in the theory. So far all superconductors investigated show them. (Because impurities are universally present it is very hard not to observe them!)

The one effect which appeared to violate the coherence-factor predictions was the Knight shift. To my mind the violation is more apparent than real, reflecting a failure of communication, in that the theorists have been discussing an idealized, pure system and the experiments are made on some of the most impure and inhomogeneous specimens of metals ever prepared. In investigations of the Knight shift in pure, bulk specimens, experimental results have not yet contradicted theory.

A second qualitative triumph of the BCS theory began in Russia. In 1959 Gor'kov announced that his version of the theory led to the Landau-Ginsburg phenomenological equations of superconductivity, with an effective charge e* = 2e (20). The importance of this discovery was not at that time widely appreciated in the West, mainly because we did not appreciate a 1957 paper of Abrikosov's (21) which derived from these equations the entire theoretical apparatus necessary for understanding hard superconductivity (the technologically important superconducting magnets, in particular) and, rather obscurely, demonstrated the quantization of flux through a superconductor in units of hc/e* = hc/2e. As a result of this failure of perception on the part of Western theorists, both of these phenomena were first clearly brought out by the experimentalists (22).
I am glad to give the experimentalists the credit of discovery if I may at the same time point out that, like a marginal note of Fermat's or a 17th-century anagram, Abrikosov had hidden away the truth, for us all to recognize afterward: the theory was all right, it was just that we were stupid.

It is clear from all this that by 1960 or 1961 the experimental support for the theory of superconductivity was overwhelmingly convincing, more so than that supporting, say, Wilson's equally plausible theory of semiconductors at the time the transistor was developed. For some reason (perhaps primarily psychological) physicists were, on the whole, much less ready to accept it than they have been to accept new theories in other, similar cases, but speculation on the precise reasons is hardly appropriate here.
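
The size of the flux unit hc/2e quoted in this section is easy to evaluate. The sketch below works in SI units, where the same quantity reads h/2e, and uses the exact SI defining values of h and e; the function and variable names are illustrative. For comparison it also prints the value that an effective charge e* = e would give, which is twice as large.

```python
# Exact SI defining constants (2019 redefinition of the SI).
PLANCK_H = 6.62607015e-34            # J s
ELEMENTARY_CHARGE = 1.602176634e-19  # C

def flux_quantum(effective_charge_in_e=2):
    """Flux unit h/e* in webers, for an effective charge e* given in units of e."""
    return PLANCK_H / (effective_charge_in_e * ELEMENTARY_CHARGE)

if __name__ == "__main__":
    print(f"pairs   (e* = 2e): {flux_quantum(2):.6e} Wb")
    print(f"singles (e* =  e): {flux_quantum(1):.6e} Wb")
```

The pair value is about 2.07e-15 Wb, so a measurement of the flux trapped in a small superconducting cylinder distinguishes e* = 2e from e* = e by a factor of two.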

The Third Stage

The first two stages of the development of the theory of superconductivity had, then, been compressed into 4 or 5 years. What about the much more difficult third stage: the quantitative prediction of transition temperatures, for instance, or of the condensation energy? I must again remind you that this kind of question has not been solved in most other cases: we don't have any good theory of melting points, of ferromagnetic or antiferromagnetic Curie points, or of ferroelectric transitions. But we are beginning to understand the superconducting transition quantitatively.

The approximation for the interaction suggested in the original Bardeen-Cooper-Schrieffer paper was admittedly extremely crude. It is known that as an electron passes through the lattice of ion cores in a metal it displaces the ions (polarizes them) in such a way as to attract other electrons. At the same time, its own negative charge repels other electrons. Bardeen and his co-workers assumed that these two mechanisms were independent (they are not, of course, since the electron polarizes the ion cores by electrostatic interaction also) and that the criterion for superconductivity was that the lattice polarization, or "phonon" effect, was the larger. Since that effect was attractive only for electrons differing in energy by less than a largest phonon energy $\hbar\omega_D$, an artificial cutoff (above which no interaction at all took place) was introduced at that energy. The result was the famous equation

$$3.5\,kT_c = \varepsilon_0 = \hbar\omega_D \exp[-1/N(0)(V_{\mathrm{phonon}} - V_{\mathrm{coul}})]$$

This equation gives correctly the order of magnitude of the transition temperatures, since it is expected that the coupling constant N(0)V should be considerably less than unity, and it gives the right isotope effect for many metals: the dependence of $\varepsilon_0$, and of Tc which is proportional to it, on $\hbar\omega_D$ and thus on the isotopic mass of the ions as $M^{-1/2}$. One could hardly expect that such an admittedly crude approximation could be relevant to the real criterion for superconductivity, and the very first more detailed discussion, by Swihart (23), showed that the isotope effect was an artifact of the cutoff assumption. Nonetheless, for a few years this equation seems to have been taken seriously, even to some extent by its authors.

Bogolyubov had already briefly, and Swihart more thoughtfully, approached the problem from a more realistic point of view; but the advance which made the quantitative approach possible was made by Eliashberg (24), whose ideas were amplified and clarified by Morel and Anderson (25), by Schrieffer, Scalapino, and Wilkins (26, 27), and by others.

The new feature brought in by these people was the idea that it was not only more correct but also more useful to emphasize the retarded nature of the electron-electron interaction in superconductors. This can be understood very simply if we think about the actual physical interaction between two electrons in a metal. We know that after an electron has moved rapidly through the metal, the first thing that occurs is a reaction by the other electrons which screens off the electrostatic potential at a very short distance, less than a unit cell radius in most metals. This screening is instantaneous (or may be considered so for our purposes), and so practically no particle beyond the screening radius ever experiences any appreciable potential energy of interaction. Thus, the second electron must be very close in space and time to see this part of it. The second reaction is the slow response of the metal ions which are attracted briefly to the electron's negative charge. The ions move toward the region where the electron was and, in a time equivalent to one lattice-vibration period, return to their unperturbed positions, executing a damped vibration. Their initial, largest swing is such that an attractive potential builds up in the region where the electron had been, but the swing occurs about a lattice-vibration period after the electron, moving with the Fermi velocity, has left the region. Again, most of this attractive, retarded part of the potential is felt only in a volume of space very close to the actual path of the electron. Thus we see that the whole interaction potential between electrons has a very short range in space but is of a complicated nature in time, with rather long-range parts. The repulsive part is nearly instantaneous, the attractive part is retarded. The partner of the given electron, then, should avoid the space-time region where the potential is repulsive and stay in that where it is attractive.

Eliashberg generalized the equations of the BCS-Gor'kov theory to allow one to use a potential that depended upon both space differences and time differences. The essential result is a single integral equation in space and time (or, when transformed through Fourier analysis, in momentum and energy variables) for an "energy gap" which is a function of momentum k and energy E. The obvious approximation, which turns out to be even better founded than the foregoing discussion indicates, is to neglect the k-dependence (that is, the space-dependence) of this function and to use only the very much simpler equation in one variable, the energy, which results. This single, one-dimensional integral equation is manageable even for very complicated systems; what is more, as Schrieffer has pointed out, this theory is nearly rigorous, in the sense that it includes most of the complicated renormalization effects which might otherwise be large quantitative corrections. I like to call it the "E.A.S.Y." energy-gap equation, for Eliashberg, Anderson, Schrieffer, and Y. Nambu, all of whom contributed (among others) to its development and proper use.

At the same time it happened that developments in experiments on tunneling between superconductors have made it possible to measure the energy-gap function Δ(E), or at least quantities very closely related to it, in all its detail. The most detailed experiments have been performed on lead by Rowell et al. (28).
By correlating these experiments with the theory we have made the following two advances: (i) it was demonstrated that the tunneling data could result from a gap equation of the given form, and this, in view of the complexity of these data, all but verifies the form of the equation (29); (ii) we then showed that the strength of the electron-phonon coupling given from the tunneling data is consistent both with the transition temperature of lead and with the electron-phonon resistivity of lead (30). In this way we have realized, in a rather backward fashion, the decade-old hope of Frohlich and Bardeen that Tc could be calculated from the resistivity; rather, we have calculated both Tc and the resistivity from the much more detailed knowledge of the phonons and the electron-phonon coupling furnished us by the tunneling experiments (with, I should mention, a strong assist from measurements of the phonon spectrum made by means of neutron scattering).
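
The separation of scales behind the retardation argument earlier in this section can be illustrated with rough numbers. The sketch below assumes a Fermi velocity of 10^6 m/s and a Debye temperature of 100 K; both are round illustrative values, not data from the text, and the function names are hypothetical. It estimates how far an electron travels during one lattice-vibration period, taking the Debye frequency as the characteristic lattice frequency.

```python
import math

HBAR = 1.054571817e-34   # J s
K_B = 1.380649e-23       # J / K

def lattice_period(theta_debye):
    """Period (s) of a vibration at the Debye frequency, omega_D = k_B * theta_D / hbar."""
    omega_d = K_B * theta_debye / HBAR
    return 2.0 * math.pi / omega_d

def distance_travelled(fermi_velocity, theta_debye):
    """Distance (m) covered at the Fermi velocity during one lattice-vibration period."""
    return fermi_velocity * lattice_period(theta_debye)

if __name__ == "__main__":
    v_f = 1.0e6      # m/s, assumed round value
    theta = 100.0    # K, assumed round value
    d = distance_travelled(v_f, theta)
    print(f"period   ~ {lattice_period(theta):.2e} s")
    print(f"distance ~ {d * 1e10:.0f} angstrom")
```

With these inputs the electron is thousands of angstroms away by the time the ions complete their swing, which is the sense in which the attractive part of the interaction is delayed in time while staying close to the electron's path in space.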

Questions and Answers

A number of questions immediately arise. (i) Can we do the same thing with other metals? Answer: Yes, slowly, when good tunneling data are available, as they are not yet for most other metals; when we know more about their phonon spectra than we now do; and if the spectrum is as easily analyzed as that of lead. (ii) Is there any case in which we know as much from other sources as the tunneling data tell us in the case of lead? There is one: in many semiconductors we have good, detailed information about electron-phonon scattering, band structures, dielectric constants, and so on; and in degenerate semiconductors we can often vary some of the parameters (notably the number of electrons) at will. Cohen has modified the BCS-Gor'kov theory in a manner appropriate to the case of the degenerate semiconductor and has successfully predicted, to everyone's surprise, in what special circumstances degenerate semiconductors might become superconducting (31).

A third crucial test for the quantitative theory of the interactions which cause superconductivity is that of the isotope effect. In the absence of detailed information about the electron-phonon coupling one might hope to use the transition temperature as a parameter for determining this coupling and at least predict the isotope effect. In the retarded-interaction theory mentioned earlier, the source of the isotope effect is in the retarded part of the interaction caused by ion displacement, since this part will inevitably scale in time accurately with the lattice vibration period and thus with the isotopic mass. That is to say, the only effect of a change of isotopic mass will be to change the vibration period of the ion. Any deviation from $M^{-1/2}$ must measure the effectiveness of the Coulomb part of the interactions, which is instantaneous.

It is well known in the theory of nuclear interactions that particles interacting by way of a "hard core" (that is, a strong, short-range repulsive interaction) can to a great extent avoid its effects by modifying their wave functions in the region of the hard core. This modification brings in very-high-momentum states but does not cost very much energy. In the case of superconductivity theory we have a "hard core" in time, not in space, but the same principle operates: the electrons can avoid each other in this region by bringing very-high-energy components into their wave functions. But in a solid this may or may not be possible, depending on whether the bands are narrow or wide. When they are wide, as in most simple metals, the avoidance is nearly complete, and the isotope effect is nearly −1/2; but in narrow-band metals such as the transition metals the theory, as worked out by Garland (32) (after Anderson and Morel, who did not compute the problem accurately enough numerically or include the effect of variable band width), can lead to intermediate, zero, or even, in extreme cases, positive values for the isotope effect. On this issue theory and experiment have fought to a standoff: theory predicted intermediate values (even, fortuitously, the right value for molybdenum!) first (23, 25), but experiment found near-zero values before the theory was accurately enough developed to provide a basis for understanding them (33).

Matthias brings up some exciting open questions in these areas, which, at the risk of finally ruining my reputation for accuracy, I shall answer here to the best of my ability. (i) Will other mechanisms occur? (ii) Have they occurred? To the latter question, the answer is very probably no. In most of the more complicated mechanisms, electrons seem to be paired in anisotropic or otherwise unusual states, which are broken up by impurity scattering. We must look to very pure metals for anything really new. Will other mechanisms occur? Probably.
The requirement for the formation of pairs is that the interaction be attractive not everywhere but simply in some, not necessarily very large, region of space and time. Many interactions (perhaps even screened Coulomb ones) have this property, but the transition temperature may be exponentially low. It is amusing to note that the original BCS criterion (that the interaction be attractive in momentum space) is not necessary or even, in many cases, satisfied; the vital thing is the real space-time interaction, not its Fourier transform.

Are all metals superconducting? Probably most are; many theorists have accepted for some time the idea that most of them must be. Sodium has an almost vanishingly weak interaction in much of the relevant region, so its Tc may be very, very low; magnetic metals, to be superconducting, must be extremely pure, since their pair states are, perforce, not BCS pairs and will be sensitive to scattering, and the Tc of these metals may be very low. But I can see no reason why some component of the interaction cannot be attractive somewhere in every case.

P. W. ANDERSON

References and Notes

1. J. Bardeen, L. N. Cooper, J. R. Schrieffer, Phys. Rev. 108, 1175 (1957).
2. J. R. Schrieffer, paper presented at the Am. Phys. Soc. meeting, 22 Mar. 1957.
3. W. Heisenberg, Z. Naturforsch. 2a, 185 (1947); H. Koppe, ibid. 3a, 1 (1948).
4. A. H. Wilson, Proc. Roy. Soc. London A133, 458 (1931); A134, 277 (1932).
5. N. F. Mott and R. W. Gurney, Electronic Processes in Ionic Crystals (Oxford Univ. Press, New York, 1940).
6. G. L. Pearson and J. Bardeen, Phys. Rev. 75, 865 (1949).
7. L. N. Cooper, ibid. 104, 1189 (1956).
8. M. R. Schafroth, S. T. Butler, J. M. Blatt, Helv. Phys. Acta 30, 93 (1957). A full and very readable, if one-sided, account may be found in Blatt's forthcoming book (Academic Press, New York). The proof is mostly in J. M. Blatt, Progr. Theoret. Phys. Kyoto 23, 447 (1960) and T. Matsubara and J. M. Blatt, ibid., p. 451.
9. P. W. Anderson, Phys. Rev. 110, 985 (1958); P. W. Anderson, ibid., p. 827; P. W. Anderson, ibid. 112, 1900 (1958); N. N. Bogolyubov, Usp. Fiz. Nauk 67, 549 (1959).
10. E. P. Gross, J. Math. Phys. 4, 195 (1963); P. C. Martin, ibid. 10, 208 (1963).
11. I use the phrase "coherent state" here in the sense in which it is used by R. J. Glauber, Phys. Rev. 131, 2766 (1963).
12. P. W. Anderson, "Weak superconductivity," in "Ravello 1963 Spring School Notes," E. R. Caianello, Ed. (Academic Press, in press).
13. N. N. Bogolyubov, Zh. Eksperim. i Teor. Fiz. 34, 58 (1958); J. G. Valatin, Nuovo Cimento 7, 843 (1958).
14. H. Koppe and B. Muhlschlegel, Z. Physik 151, 613 (1958).
15. Y. Nambu, Phys. Rev. 117, 648 (1960).
16. L. P. Gor'kov, Zh. Eksperim. i Teor. Fiz. 34, 735 (1958).
17. The importance of time reversal here was recognized by D. C. Mattis and J. Bardeen, Phys. Rev. 111, 817 (1958); a more complete discussion was given by P. W. Anderson, Phys. Chem. Solids 11, 26 (1959); also see P. W. Anderson, Proc. Intern. Conf. Low Temp. Phys. 7th, Toronto, Ont. (1961).
18. For a discussion of chemical impurities on Tc, see E. A. Lynton, B. Serin, M. Zucker, Phys. Chem. Solids 3, 165 (1957); the theory is discussed in a host of papers, published after the papers cited in 17; of these later papers, that of D. Markowitz and L. P. Kadanoff, Phys. Rev. 131, 563 (1963), is the clearest. For discussion of the gap distribution from the standpoint of experiments, see Y. Masuda, Phys. Rev. 126, 1221 (1962); from the standpoint of theory, see L. Gruenberg, in preparation.
19. Concerning the effect of magnetic impurities on Tc, early experimental and theoretical work by H. Suhl and B. T. Matthias [Phys. Rev. 114, 977 (1959)] and by others, was followed by the more accurate theory of A. A. Abrikosov and L. P. Gor'kov [Zh. Eksperim. i Teor. Fiz. 36, 319 (1959)]. This paper also predicted the effect on the gap seen by F. Reif and M. A. Woolf [Phys. Rev. Letters 9, 315 (1962)].
20. L. P. Gor'kov, Soviet Phys. JETP (English Transl.) 9, 1364 (1959).
21. A. A. Abrikosov, Zh. Eksperim. i Teor. Fiz. 32, 1442 (1957).
22. B. S. Deaver and W. M. Fairbanks, Phys. Rev. Letters 7, 43 (1961); R. Doll and M. Nabauer, ibid., p. 51.
23. J. C. Swihart, Phys. Rev. 116, 45 (1959).
24. G. M. Eliashberg, Zh. Eksperim. i Teor. Fiz. 38, 966 (1960).
25. P. Morel and P. W. Anderson, Phys. Rev. 125, 1263 (1962).
26. J. R. Schrieffer, D. J. Scalapino, J. M. Wilkins, Phys. Rev. Letters 10, 336 (1963).
27. J. R. Schrieffer, D. J. Scalapino, J. M. Wilkins, in preparation.
28. J. M. Rowell, P. W. Anderson, D. E. Thomas, Phys. Rev. Letters 10, 334 (1963).
29. This striking agreement may be seen in Fig. 2 of J. R. Schrieffer et al. (26); the staggering amount of detail available from tunneling curves is shown in Fig. 1 of J. M. Rowell et al. (28), and its detailed interpretation is discussed by Scalapino and Anderson, Phys. Rev., in press.
30. Y. Wada, Rev. Mod. Phys., in press.
31. M. L. Cohen, Phys. Rev., in press.
32. J. M. Garland, Phys. Rev. Letters 11, 111 (1963); ibid., p. 114.
33. T. H. Geballe, B. T. Matthias, G. W. Hull, E. Corenzwit, ibid. 6, 275 (1961).

II. The Facts

Written discussions between experimentalists and theorists are not too frequent these days, and if they are presented at all, the authors are trying very hard either to agree or to ignore each other. The field of superconductivity is no exception. It is particularly difficult to follow the two sides of an argument when the interval between presentations extends over weeks or months. We hope the argument can be followed more easily in a two-part discussion such as this. I am indeed fortunate in having the present theoretical aspect of the BCS approach to superconductivity presented by my friend P. W. Anderson, who has been so instrumental and essential in this field. [Another theoretical point of view was given 2 years ago by H. Frohlich (1). While he did not make any detailed quantitative statements, his general approach seemed to reflect the occurrence of superconductivity in a truer way.] For once I will not have the feeling of being unfair when I say that even today the theory cannot in any way predict the occurrence of superconductivity, much less the transition temperature. Reading Anderson's article can hardly lead one to another conclusion.

For a long time experiments have indicated the almost general occurrence (2) of superconductivity in most metals that were sufficiently pure and cold. We have hoped it could be shown theoretically that a condensation of one kind or another in a Fermi gas should always occur under ideal conditions. Experimentally the answer seems to be in the affirmative. It will be up to the reader to decide whether this answer has now been given theoretically as well.

Theoretical Development

During the last 13 years the theory of superconductivity presented by Frohlich and Bardeen (3) was based on an electron-phonon interaction. Right from the outset in 1950 we were told that this was the final solution to the last unsolved problem in solid-state physics. Since then we seem to have gone through quite a number of "final" stages in the theoretical development. The present one is the BCS theory, named after its three authors, Bardeen, Cooper, and Schrieffer (4), and presented in 1957. There has also been a more general theory by Schafroth, Butler, and Blatt (5), formulated in 1954, which is based on the Bose-Einstein condensation of electron pairs.

As a rule these theories have been preceded by experimental results, and they have in general been unable to predict results. However, there are some notable exceptions. Frohlich predicted a gap in the electron spectrum on the top of the Fermi surface in the superconducting state for a one-dimensional model. This gap was indeed detected optically, by microwave spectroscopy (6) as well as by specific heat measurements (7). He also predicted the isotope effect, according to which the transition temperature Tc varies with the inverse square root of the atomic mass, thus clearly indicating a correlation with lattice vibrations. In 1957, Bardeen, Cooper, and Schrieffer presented another electron-phonon theory (4), which explained the existence of the gap by means of pairs of electrons with opposite spins. They correctly predicted the behavior of nuclear relaxation in the superconducting state. They also predicted the relation between the gap width and the transition temperature for the elements. In 1961 Onsager (8) predicted, on the basis of the Schafroth theory, the existence of quantized flux of only half the value of that calculated by London. The factor 2 was, again, due to the formation of pairs of electrons with twice the charge of a single electron.
All these theoretical predictions have been verified experimentally. However, there have been just as many theoretical predictions that have not been verified by experiment. These include the prediction that the Knight shift in superconductors vanishes, that higher transition temperatures exist, and that the isotope effect, with a value of $M^{-1/2}$, is general; in addition, the theory has also incorrectly predicted the electronic heat conductivity.

Unfortunately, one question remained almost totally ignored in most theories and experiments; namely, What are the critical conditions for the occurrence of superconductivity itself? Derivations of a criterion were first attempted by Frohlich and Bardeen, and later by Bardeen, Cooper, and Schrieffer. The latter group actually gave an equation for the transition temperature itself; this equation, however, contained an interaction constant that cannot be calculated at present. Apart from this difficulty, the critical conditions for superconductivity could not be predicted by this equation either. For example, according to the equation, yttrium and lanthanum should have the same transition temperature, that of yttrium being possibly a little higher, since both have the same N(0) and almost the same V and Debye temperature. However, yttrium is not superconducting down to 0.07°K and α-lanthanum is superconducting at about 5°K. This difference is discussed later. Moreover, this formula, cited by Anderson, is not only crude, as he says, but also incorrect for the transition elements, since the dependence of Tc on N(0) is in most cases the exact opposite of that stated in the formula. For example, the Tc of yttrium, rhodium, and platinum decreases with an increase in N(0). Since the formula was proposed it seems to have been discarded completely because it does not present the criteria for the occurrence of superconductivity, which, on the other hand, are easily given by a simple empirical rule (9).
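
How strongly the transition-temperature equation discussed above depends on the interaction constant can be seen from a small numerical sketch. It assumes the weak-coupling form Tc = prefactor × θ_D × exp[−1/N(0)V], with the prefactor 1.13 taken from the usual textbook statement of the BCS result rather than from this text, and it further assumes that the Debye temperature scales as $M^{-1/2}$ while N(0)V does not depend on the isotopic mass. The function names and the numerical inputs are hypothetical; the sketch illustrates the formula only and does not settle the disagreement described here.

```python
import math

def bcs_tc(theta_debye, coupling, prefactor=1.13):
    """Weak-coupling estimate Tc = prefactor * theta_D * exp(-1 / coupling).

    theta_debye : Debye temperature in kelvin
    coupling    : dimensionless N(0)V, assumed positive and well below 1
    prefactor   : 1.13 is the usual weak-coupling value (an assumption here)
    """
    return prefactor * theta_debye * math.exp(-1.0 / coupling)

def isotope_exponent(mass, coupling, theta_ref=100.0, mass_ref=100.0, step=1e-4):
    """Numerical d(ln Tc)/d(ln M), assuming theta_D ~ M**-0.5 and a mass-independent N(0)V."""
    def ln_tc(m):
        theta = theta_ref * math.sqrt(mass_ref / m)
        return math.log(bcs_tc(theta, coupling))
    numerator = ln_tc(mass * (1 + step)) - ln_tc(mass * (1 - step))
    denominator = math.log(1 + step) - math.log(1 - step)
    return numerator / denominator

if __name__ == "__main__":
    # Hypothetical inputs: one Debye temperature, three nearby coupling constants.
    for coupling in (0.20, 0.25, 0.30):
        print(f"N(0)V={coupling:.2f}: Tc ~ {bcs_tc(200.0, coupling):.2f} K")
    print("isotope exponent:", round(isotope_exponent(100.0, 0.25), 6))
```

Because N(0)V sits in an exponent, a change from 0.20 to 0.30 raises the estimate by a factor exp(5/3), roughly five, so the formula can only be tested when the interaction constant is known independently. Under the stated scaling assumption the isotope exponent comes out as −1/2.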

Theory versus Description

Let me deviate for a moment and explain why I think the word theory is really inappropriate. A theory, according to the current Oxford English Dictionary, is defined as "a conception or mental scheme of something to be done, or of the method of doing it. A systematic statement of rules or principles to be followed." According to the same source, a description is defined as "the combination of qualities or features that marks out or serves to describe a particular class."

Clearly, since the present "theories" are unable to state the rules for the occurrence of superconductivity, which, in my opinion, are essential in any explanation of the phenomenon, they should really be considered descriptions or at best models of superconductivity. It is, therefore, not surprising that the acceptance of the generality of the phenomenon, which had been considered rather limited until recently, is the result of the empirical approach of finding a great number of superconductors. The recent theoretical conjectures of Morel and Anderson (10), Casimir (11), and Fröhlich (12) support the experimentally arrived at conclusion that most metals, if sufficiently pure and cold, will eventually undergo a condensation of one kind or another. In the overwhelming number of cases this will be the onset of superconductivity, but there are rare exceptions, such as ferromagnetism. In the latter case the electron spins, instead of being antiparallel, are all aligned. It is also possible that different and still unknown transitions or condensations will occur in the millidegree temperature range. These might include phenomena such as nuclear ferromagnetism, predicted 25 years ago by Fröhlich and Nabarro, or other types of alignment.

Occurrence of Superconductivity

The conclusion that superconductivity is a rather widespread phenomenon resulted from the discovery, in recent years, of almost 1000 superconductors as compared with the 30 to 40 superconductors known about 13 years ago. Quite early in the experimental studies it was possible to state a simple rule (9) for predicting the transition temperature of a given metal. If the metal is a nontransition element, or a combination of nontransition elements, superconductivity almost invariably occurs above 0.3°K and the transition temperature generally increases slightly with n, the number of valence electrons per atom. All electrons outside of filled shells are considered valence electrons. While the chemical valence is sometimes related to this number, it has no meaning for n > 7 or 8, from a chemical viewpoint. The transition temperatures in systems of nontransition elements are not very high; they reach a maximum of 8.8°K in the lead-bismuth system. All nontransition elements investigated to date show the isotope effect predicted by Fröhlich and Bardeen and later formulated in the BCS theory. Quite frequently the combination of nontransition elements results in semiconductors, in which case an insulator rather than a superconductor is expected as the temperature is decreased. However, even some of the semiconductors, such as doped germanium telluride, become superconducting below 0.3°K, as recently reported by J. K. Hulm at the International Conference on the Science of Superconductivity held at Colgate University.

The transition elements form the second group of metals in the periodic system. The transition temperatures at which these elements and their compounds (with either transition or nontransition elements) become superconducting are strongly dependent on the number of valence electrons per atom for the element and on the average number of valence electrons for the composite systems. This is in sharp contrast to findings for the nontransition-metal group. For the transition elements the maxima in transition temperature correspond to values near 5 and 7 for number of valence electrons per atom, are very pronounced (Fig. 1), and are rather independent of crystal structure. Hamilton and Jensen (13) recently found that the distribution of the transition temperature is quite symmetric with respect to column VI of the periodic system if lanthanum and uranium are excluded. This means, as far as the d-electrons are concerned, that the symmetry is with respect to a half-filled d-shell. Although the points shown in Fig. 1 are values for the elements only, the curve was actually traced through many intermediate points not shown in the figure. These points represent the transition temperatures of solid solutions of the elements in each other. The shape of the curve also persists for intermetallic compounds, but for these the height and precise location of the maximum transition temperatures depend somewhat on the crystal structure. No symmetry or regularity of this kind has ever been observed for the nontransition elements. Rather high transition temperatures occur frequently in systems containing a transition element. The maximum transition temperature known today is 18.05°K for the compound Nb3Sn, which has the rather complicated β-wolfram type crystal structure and a ratio for number of valence electrons per atom of 4.75.


None of the transition elements or their compounds has ever shown an isotope effect proportional to $M^{-1/2}$, the value reported for the five nontransition elements measured to date. In ruthenium (14) there is no observable effect at all, and in molybdenum (15) the effect is proportional to $M^{-1/3}$. The variations in transition temperature, together with this entirely different isotope effect, would suggest to an unprejudiced observer that there is a drastic difference in the mechanisms causing superconductivity in the two groups of metals. Another difference between the two groups is noticed when their maximum attainable transition temperatures are compared. For alloys of the transition elements this temperature is 18°K, about twice the value for the nontransition elements. Lanthanum and uranium are exceptions with respect to the symmetrical distribution of transition temperatures in the periodic system. A reason for this has recently been postulated (13). Both are the beginning elements of the 4f and 5f series, without having any occupied f levels. However, these f levels cannot be far from being occupied, and thus may act as virtual levels. The existence of these virtual levels had previously been proposed as a possible cause of superconductivity by means of a magnetic interaction (16), rather than by means of the usual phonon mechanism.
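The isotope-effect comparison above amounts to fitting an exponent $\alpha$ in $T_c \propto M^{-\alpha}$. The following is a minimal sketch of that arithmetic, not a procedure taken from the article; the function names and the sample masses and temperatures are hypothetical and chosen only for illustration.

```python
import math


def isotope_exponent(mass_a, tc_a, mass_b, tc_b):
    """Estimate alpha in Tc ∝ M**(-alpha) from two isotopic samples."""
    return -math.log(tc_a / tc_b) / math.log(mass_a / mass_b)


def tc_from_power_law(mass, alpha, ref_mass, ref_tc):
    """Tc predicted for `mass`, given a reference sample and an exponent."""
    return ref_tc * (mass / ref_mass) ** (-alpha)


if __name__ == "__main__":
    # Hypothetical reference isotope: mass 100 (atomic units), Tc = 1.000 K.
    ref_mass, ref_tc = 100.0, 1.000
    heavy_mass = 104.0
    cases = [("M^(-1/2)", 0.5), ("M^(-1/3)", 1.0 / 3.0), ("no effect", 0.0)]
    for label, alpha in cases:
        tc = tc_from_power_law(heavy_mass, alpha, ref_mass, ref_tc)
        fitted = isotope_exponent(ref_mass, ref_tc, heavy_mass, tc)
        print(f"{label:10s} Tc(heavy) = {tc:.4f} K, fitted alpha = {fitted:.3f}")
```

A vanishing exponent, as reported for ruthenium, corresponds to identical transition temperatures for the two isotopes, while the three cases differ only by a few parts in a hundred in $T_c$ for this mass ratio, which is why such measurements need good temperature resolution.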

Superconductivity and Ferromagnetism

Many years ago the close relationship between superconductivity and ferromagnetism became apparent in investigations of these phenomena in isomorphous compounds (17). Since then we have found many examples which indicate not only that this close relationship exists but also that very often magnetic interactions must be responsible for the occurrence of superconductivity, and the reverse may even be true. This can be illustrated by a few examples: (i) U6Fe is the superconducting compound of uranium with the highest transition temperature, higher than that of U6Co or U6Mn; (ii) iron in titanium raises the transition temperature of titanium faster than any other element does; (iii) however, if iron lowers the transition temperature by being localized, as, for instance, in molybdenum and some of its alloys, it does this also much more drastically than either cobalt or manganese does. Scandium is not superconducting above 0.08°K, but solid solutions of chromium in scandium (and, so far as we now know, only of chromium) are superconducting near 3° to 4°K. This effect of chromium is presumably due to the magnetic interactions, similar to those of iron in titanium. Iron in scandium, in turn, has a magnetic moment and leads to antiferromagnetism. The compound ZrZn2 crystallizes in a cubic structure (type C15) which results in superconductivity in most other zirconium compounds. However, ZrZn2 becomes ferromagnetic at 35°K. La3In is the lanthanum compound with the highest superconducting transition temperature, but the analogous compound with scandium, Sc3In, becomes ferromagnetic. So far this kind of free electron ferromagnetism has been observed only in these two compounds, which were found among 6000 intermetallic phases. The general rule is that superconductivity will occur. There never has been any theoretical attempt to link the two phenomena. Yet our results indicate that the mechanisms leading to superconductivity are very often closely related to those causing ferromagnetism. Recently members of the Russian and Japanese schools have presented a number of papers describing magnetic interactions which lead to superconductivity (18). Strangely enough, these papers have been ignored thus far. Could it be that history is repeating itself, and that communications on magnetic interaction are receiving the same treatment from Western theorists that the Gor'kov and Abrikosov discoveries received? Since none of the existing theories can either give criteria for the phenomenon of superconductivity or permit prediction of the transition temperature, it is impossible, for me at least, to understand recent attempts to enforce another consolidation of theory and experiment at any price. Let me illustrate this by the following example.
Before the absence of an isotope effect in ruthenium was discovered experimentally, none of the existing theories had ever considered the possibility of the absence of this effect. In their 1961 review article, Bardeen and Schrieffer (19), referring to a different Coulomb cutoff in Bogolyubov's theory of superconductivity, stated, "If this calculation were valid, there would be two serious difficulties: (1) the exponent of the isotope effect would be expected to depart significantly from 1/2, contrary to experiment, and (2) the effect of the Coulomb interactions would be reduced so much that nearly all metals would be expected to be superconducting." Since then we have shown experimentally the absence of an isotope effect in ruthenium and also the deviation from the factor 1/2. It has also been shown experimentally that most metals are superconducting. It is not clear to me why suddenly these serious difficulties should no longer exist, as Garland has recently claimed they no longer do (20). On the other hand, the different mechanisms for the occurrence of superconductivity, as well as the general existence of the phenomenon itself, have long since been deduced, though purely empirically.

Fig. 1. Variation in superconductivity of the transition elements in the periodic system. (Horizontal axis: number of electrons.)

Summary

In summarizing, let me say again that some of the theoretical conclusions have been verified experimentally. These include the gap in the electron spectrum, the nuclear relaxation in the superconducting state, and the lattice-electron interaction for the nontransition elements, as well as the fact that pairs are involved in the condensation, as shown by quantized flux measurements. However, the predictions of the theory, especially those with respect to the Knight shift in superconductors and the isotope effect for the transition elements, have not been borne out. The theory has since been modified to account for the non-vanishing Knight shift found by experiment, and also for

the experimentally determined isotope effect of the transition elements. However, there is no word at all on when and where to find a superconductor in the transition metals. Unfortunately, the validity of the electron-phonon interaction has been, in my opinion, overstated to the point that a formula for the transition temperature was given which has now failed under the weight of the periodic system. Another result of the BCS formulation of the electron-phonon hypothesis is the conclusion that superconductivity is the result of a more or less delicate balance between the electron-phonon interaction and the Coulomb repulsion. For this reason most theories have considered superconductivity an accident rather than the normal ground state of most metals. However, experimental results indicate that superconductivity is indeed a very general phenomenon. In contrast to the BCS formulation, Fröhlich, in paragraph 7 of his review article (1), has proposed that the occurrence of superconductivity as well as that of superfluidity might be understood in a very general manner as a result of an appropriate quantum hydrodynamics. The difference in the transition temperatures of yttrium and lanthanum cannot be understood on the basis of an electron-phonon interaction, but it has been explained on the basis of a magnetic interaction. The transition elements have many properties which can be symmetrically arranged according to the elements' position in the periodic system. The behavior of the transition temperature at which superconductivity occurs is another such symmetric function, a fact which gives added credence to the generality of the phenomenon. Since the rule that the maximum transition temperatures should occur when the number of valence electrons per atom is around 5 or 7 is fairly simple, it would seem that the correct underlying explanation should be equally simple. I believe that it has yet to be given.
Should it ever be stated, we might then expect an answer to our second question: Why has it been relatively easy, within the last 10 years, to reach transition

temperatures of 17° to 18°K in many intermetallic systems and impossible to raise this value even by as little as half a degree? It is not that we have not tried. More than 6000 metals have been checked; 18°K has been reached very often but never exceeded. Another result has emerged from the experimental investigation of all these metals. As stated before, we found ferromagnetism in two compounds which had been expected to be superconducting, namely ZrZn2, a compound composed entirely of superconducting elements, and Sc3In, wherein, to date, only indium has been found to be superconducting. The extreme rarity of these occurrences indicates that ferromagnetism is a much less likely kind of condensation than superconductivity. While there are many superconducting compounds formed entirely of nonsuperconducting elements, all ferromagnetic compounds, with the exception of the two mentioned, contain either chromium, manganese, iron, cobalt, or nickel or an f-electron element. Superconductivity can be predicted empirically on a routine basis today with very few exceptions. However, prediction of ferromagnetism is not possible at present, since the incidence of 2 in 6000 is rather unfavorable. Perhaps in this case we have not been looking in the right direction, and have thereby given the theory a chance to precede the results of experimental search. Where do we stand with respect to cooperative phenomena in general? As I have tried to show, ferromagnetism is a rare and restricted phenomenon in comparison with superconductivity. In the field of ferroelectricity there was a superstition for 20 years that the occurrence was due to one specific mechanism, that of the hydrogen bond. Today we know that ferroelectricity is a general phenomenon. We have a great number of ferroelectrics, and the mechanisms responsible for the effect range from order-disorder among hydrogen bonds to the polarization of sulfur.
Thus, I think ferroelectricity is better understood theoretically than superconductivity is. To summarize in the form of a final


question, to which the ultimate and final answer must come only from the theory: Will the electrons in any and every metal that is sufficiently pure and cold always undergo a condensation? If so, in a particular metal, at what temperature will condensation occur and of what type will it be? Clearly, we do not yet expect any quantitative answer to the problems of superconductivity. As Anderson mentions, not even melting points can be calculated today. Melting, however, is a universal phenomenon, and every material, when heated above 4000°C, will eventually either melt or sublime. It seems that a similar answer should also be given for electrical conductivity as a general phenomenon. When sufficiently cold and pure, what will a metal do? I think it will usually become superconducting.

B. T. MATTHIAS

References and Notes

1. H. Fröhlich, Rep. Progr. Phys. 24, 1 (1961).
2. B. T. Matthias, T. H. Geballe, V. B. Compton, Rev. Mod. Phys. 35, 1 (1963).
3. J. Bardeen, Phys. Rev. 79, 167 (1950); ibid. 80, 567 (1950); H. Fröhlich, ibid. 79, 845 (1950); Proc. Phys. Soc. London A63, 778 (1950).
4. J. Bardeen, L. N. Cooper, J. R. Schrieffer, Phys. Rev. 108, 1175 (1957).
5. M. R. Schafroth, ibid. 96, 1442 (1954); M. R. Schafroth, S. T. Butler, J. M. Blatt, Helv. Phys. Acta 30, 93 (1957).
6. H. Fröhlich, Proc. Roy. Soc. London A223, 296 (1954); M. A. Biondi and M. P. Garfunkel, Phys. Rev. 116, 853 (1959); P. L. Richards and M. Tinkham, ibid. 119, 575 (1960).
7. A. Brown, M. W. Zemansky, H. A. Boorse, Phys. Rev. 92, 52 (1953).
8. L. Onsager, Phys. Rev. Letters 7, 50 (1961).
9. B. T. Matthias, Phys. Rev. 97, 74 (1955).
10. P. Morel and P. W. Anderson, ibid. 125, 1263 (1962).
11. H. B. G. Casimir, Z. Physik 171, 246 (1963).
12. H. Fröhlich, Phys. Letters 7, 346 (1963).
13. D. C. Hamilton and M. J. Jensen, Phys. Rev. Letters 11, 205 (1963).
14. T. H. Geballe, B. T. Matthias, G. W. Hull, Jr., E. Corenzwit, ibid. 6, 275 (1961).
15. B. T. Matthias, T. H. Geballe, E. Corenzwit, G. W. Hull, Jr., Phys. Rev. 129, 1025 (1963).
16. B. T. Matthias, V. B. Compton, H. Suhl, E. Corenzwit, ibid. 115, 1597 (1959).
17. B. T. Matthias, E. Corenzwit, W. H. Zachariasen, ibid. 112, 89 (1958).
18. A. I. Akhiezer and I. Y. Pomeranchuk, Zh. Eksperim. i Teor. Fiz. 36, 859 (1959); A. I. and I. A. Akhiezer, ibid. 43, 2208 (1962); I. A. Privorotskii, ibid., p. 2255; J. Kondo, Progr. Theoret. Phys. Kyoto 29, 1 (1963).
19. J. Bardeen and J. R. Schrieffer, in Progress in Low Temperature Physics, C. J. Gorter, Ed. (North-Holland, Amsterdam, 1961), vol. 3, p. 203.
20. J. W. Garland, Jr., Phys. Rev. Letters 11, 111 (1963); ibid., p. 114.
21. I thank T. H. Geballe for his many good criticisms and stimulating discussions, and, last but not least, for being with me in a field that is not greatly illuminated at present by the gray world of theory.

Coherent Matter Field Phenomena in Superfluids Some Recent Definitions in the Basic Sciences, Vol. 2, (Belfer Graduate School of Sciences, Yeshiva Univ., New York, 1969), ed. A. Gelbart, pp. 21-40

A first shot at a general theory of broken symmetry and at the implications of the Josephson effect; as such much shorter than my article in Gorter's Progress in Low Temp. Physics of 1967. The discussion remarks are important: they refer back to a paper (not in my bibliography) taking issue with a remark of Gorkov and Galitskii, and reflect my growing conviction that broken symmetry has a lot to say about the fundamentals of quantum mechanics and vice versa.


COHERENT MATTER FIELD PHENOMENA IN SUPERFLUIDS

P. W. Anderson
BELL TELEPHONE LABORATORIES, MURRAY HILL, NEW JERSEY

It is ironic that during the same years that the elementary-particle physicists, at the highest energy end of the spectrum, have been doing their best to discard as many as possible of the properties of the quantum field operator, at the very lowest energy end of the spectrum, in the two low temperature phenomena of superconductivity and superfluidity, we have been provided with a series of very direct demonstrations which seem to bring the quantum particle field almost into the ordinary, tangible, macroscopic realm. This will be the burden of what I will be trying to say today. In these phenomena the quantum particle field plays a role very similar to the roles, with which we are familiar, of the classical fields, the electromagnetic and gravitational fields, in our ordinary macroscopic experience. This has been shown by a series of experiments of various kinds on coherence in quantum fluids. I would like here to emphasize more the basic meaning of these experiments than to describe in detail their results and their experimental mechanisms. It is interesting that nowhere in this discussion will I have to treat in very much detail the microscopic theory of either of these two superfluids, either the superfluid, helium, or the superconducting electron gas in metals. In fact, the foundations of the subject of quantum coherence were laid long before there were any acceptable microscopic theories of these phenomena. One could even argue that there is no acceptable microscopic theory of helium to this day. These foundations were laid by Landau, Penrose, Onsager, London and many others. Even the two types of macroscopic quantization, the quantization of vorticity and the quantization of magnetic flux, were proposed by 1949, long before


there were any microscopic theories. The first approach to the properties of superfluids and superconductors which contained the possibility of understanding these phenomena formally and theoretically was, however, proposed in 1951 and 1956 by Penrose and Onsager [1]. This is still, in a sense, equivalent to the accepted way of discussing these phenomena, and since it is historically the first and possibly the best known, it seems to be a good starting point for our discussion today. The proposal they made was that one could essentially define superfluidity of a Bose system such as liquid He as a state in which the density matrix of the system factorized in a certain special way. They said that the density matrix $\rho = \rho(r, r') = \langle \psi^*(r)\psi(r') \rangle$, which is defined as the average of the product of the field creation operator at point r and the destruction operator at point r', could be factorized into $f^*(r) f(r')$, a function of r times a function of r' plus small terms. The density matrix $\rho(r, r')$ of any ordinary system tends to vanish as the points r and r' attain macroscopic distances from each other. The "small terms" are normal, i.e., go to zero as $|r - r'|$ goes to infinity. The term which has no dependence on the relative positions of r and r' is characteristic of superfluidity. So the statement is that this is equivalent to superfluidity; I hope I will be able to make that point clear in the rest of the talk. I should say, of course, that at absolute zero the average is taken in some hypothetical ground state of the system; on the other hand, at finite temperatures one does not use ground states but a thermal average. Beliaev [2] extended this concept of factorization to the time-dependent Green's functions. He observed that the density matrix, which is the average of the time independent field operators, is a special case of the general Green's function, which is the average of the Heisenberg field operators $\langle \psi^*(r, t)\psi(r', t') \rangle$.
He assumed that that could be factorized into $f^*(r, t) f(r', t')$ + small terms. Almost immediately and in a practically succeeding issue of the Journal of Experimental and Theoretical Physics, Gor'kov [3] observed that the same kind of theory could be generalized to apply to the electrons in superconductivity. It was not possible to have this kind of theory for just the single field operators because of the exclusion principle, but if one substituted instead the two-particle Green's function, which has a pair of field operators $\psi^*$, and a pair of $\psi$'s, one could do the same thing for the fermions. One could talk about the fermion gas in the metal in terms of pairs of electrons which then have sufficiently similar commutation relations to bosons so that one can carry out the same kind of thing. So Gor'kov introduced a function $F^*(X_1, X_2)$ of two variables $X_1$ and $X_2$ (including both time and space variables) and factorized his two-particle Green's function, a function of four variables, into pairs of two variable functions:

$$G(X_1, X_2, X_3, X_4) = F^*(X_1, X_2)\, F(X_3, X_4) + \text{small terms}$$

Again the important term is not correlated between variables of the two different F's, but now one has a strong correlation between $X_1$ and $X_2$. One thinks of $F^*$ as a kind of a two-particle wave function, and the two bound electrons in such a wave function should remain close together. Gor'kov showed that this theory, if one took F to be homogeneous in space, was completely equivalent to the then just developed Bardeen-Cooper-Schrieffer (BCS) theory of superconductivity, which now, of course, has come to be recognized as the correct microscopic theory of superconductivity [4]. As far as the superfluid (helium) version is concerned all reasonably useful microscopic theories that have been proposed have led to this kind of behavior but, as I say, no really satisfactory microscopic theory yet exists. In superconductivity there already existed a theory of Ginsburg and Landau [5], which seems to describe superconductivity from a phenomenological point of view very satisfactorily. This theory describes the properties of superconductors in terms of a function which they called the order parameter function $\Psi(r)$, but the basic equations of the theory were remarkably similar to the equations for ordinary one-particle quantum mechanics in terms of a single-particle wave function $\Psi(r)$. So this function came to be known as the wave function although its meaning was not at that time understood. Gor'kov very shortly showed that in the appropriate limiting cases he could derive the Landau-Ginsburg theory from his theory [6], in which he identified $\Psi(r)$ with the function $F(X, X)$ at identical values of the variables. So he gave the function $\Psi$ a physical meaning. It was only later that the corresponding theory for superfluidity, which is not quite as useful, was proposed by Gross and Pitaevskii [7]. Coherence phenomena can be treated perfectly satisfactorily using this Penrose-Onsager "off-diagonal long-range order" scheme.
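As a concrete instance of the Penrose-Onsager factorization, one can take the textbook case of a uniform Bose gas in a volume $\Omega$ with $N_0$ particles in the zero-momentum state. This is an illustrative sketch only: the symbols $\Omega$, $N_0$, and $n_k$ (the occupation of momentum $k$) are introduced here and are not notation from the talk.

```latex
\rho(r, r') = \langle \psi^*(r)\,\psi(r') \rangle
  = \frac{N_0}{\Omega}
  + \frac{1}{\Omega} \sum_{k \neq 0} n_k\, e^{-i k \cdot (r - r')},
\qquad
f(r) = \sqrt{N_0 / \Omega}.
```

The first term does not depend on $|r - r'|$ and so survives at macroscopic separation; it is the factorized piece $f^*(r) f(r')$. The sum over $k \neq 0$, for a smooth $n_k$, decays to zero at large separation and plays the role of the "small terms".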
Unfortunately, all of those who until recently have chosen to work with this theory have made an additional assumption which is not necessary, and which makes the amount of work one has to go through in order to derive results considerably greater. This assumption is, that one should work always with states (in making the averages with which one forms either the density matrices or the Green's functions), with fixed definite total particle number N, i.e., total number of helium atoms or total number of pairs of electrons. In the case where one does the thermal


average over a grand canonical ensemble, it is sometimes convenient, of course, to use many values of the particle number, but even there it was assumed that there was no coherence between states of different particle number. Now, so long as one is talking about a single isolated superfluid system, a single hunk, say, of superconductor or a single bucket of superfluid with a cover on it so that nothing evaporates, then, of course, the number of helium atoms or the number of electron pairs is perfectly fixed and the Hamiltonian must commute with the number operator. It seems extremely reasonable that, since the number of particles is a perfectly good quantum number, we should insist, in taking our average, on keeping the total number of particles fixed. Unfortunately, it begins to be a little more difficult if we start trying to do interference experiments. In doing the kind of experiment we are going to talk about, we will be thinking very characteristically of systems which consist of two separate parts between which we have some kind of connection, and the basic feature of that connection is that it can transfer particles coherently from one side of the system, or one piece of the system, to the other. We will, in other words, be talking about systems such as those in which we have two regions of space separated by "slits" as you do in an electromagnetic interference experiment. What is more, it will be very hard, in discussing these coherence experiments, as it would be very hard in discussing for example electromagnetic interference experiments, to keep the slits open at all times. You would often like to compare what happens when you close off a slit with what happens when the slits are open. You would like to be able to turn off or on switches, superconducting switches or superfluid switches, here and there in your system. 
And under those circumstances, there is no reason whatever to expect that you are going to get satisfactory results by working with states in which the numbers of particles on the two sides are absolutely fixed and in which there are not allowed to be coherent superpositions of states in which the particles are differently partitioned between the two sides. Let me suppose I have a system composed of two pieces connected by such a switch, and that I start with each piece having ODLRO, i.e., being described by

$$G(x, x') = f^*(x)\, f(x'),$$

but each having just N/2 particles. It turns out that there is no way that I can make up from this description the correct description of the system with the switch open, i.e., with ODLRO embracing the full composite system, in which x or x' may range over both halves. Thus no separate piece of a superfluid system is adequately described by a single


state with ODLRO. Of course, it is possible to do things properly, by starting from a supermacroscopic system which has definite fixed particle number, and which contains every piece that I am ever going to attach to my system, i.e., every superconductor that I may ever bring into contact with my system or every reservoir of liquid helium that I may be interested in. But that is just inconvenient. It is much easier to do it in a different way. The different way is to satisfy our original requirement of factorization by taking as our basis states (confirming afterward that what we do is all right), states in which there is an average of the particle operator itself.

$$\langle \Psi | \psi^*(r, t) | \Psi \rangle = \langle \psi^*(r, t) \rangle = f^*(r, t)$$

Clearly, when we do this our original assumption of ODLRO is satisfied:

$$\langle \psi^*(r, t)\, \psi(r', t') \rangle = \langle \psi^*(r, t) \rangle \langle \psi(r', t') \rangle + \text{other terms},$$

but equally clearly, the reverse is not necessarily true. In other words, this implies that I still have my original assumption but I have made an extra assumption in addition. Clearly also these are states which do not have a fixed number of particles because I have destroyed a particle between one $\Psi$ and the other, so at least the state has components with $N$ and with $N - 1$ particles, i.e., with two different numbers of particles. These states that I am going to use are very similar to, except for certain special features, the coherent states of the electromagnetic field with which people have begun to realize it is best to work in quantum optics. On the other hand, the states that we are confronted with in superfluidity and superconductivity do not have the simple complete coherence that one can get in the electromagnetic analogy. The maximum value that the mean value of $\psi$ can have is the square root of the density, because $\langle \psi^* \psi \rangle = \rho$, but in the helium case, it was shown in the original paper of Penrose and Onsager (and the estimate has been refined recently by MacMillan) that one gets of the order of $0.1\sqrt{\rho}$ for the absolute value.
In the superconductivity case, the situation is even less coherent, one gets of the order of $10^{-4}$ of the maximum possible value. Nonetheless, these numbers are clearly recognizable as numbers of order unity rather than numbers of order $10^{-23}$, and therefore the coherence is macroscopic in magnitude rather than microscopic. Another way of putting it is that we all recognize that when we deal with macroscopic systems such as tables, chairs, solar systems and so on, it is extremely inconvenient to work with eigenstates of the conserved quantities such as total momentum, total parity, or anything like that;

224

26

P. W. Anderson

it is much more convenient to work with wave packets of the eigenstates which have the property of localizing reasonably closely the macroscopic variables of the system, and that is exactly what we are doing here. One can make wave-packet transformations back and forth and demonstrate that we are working with the same kind of system. The only condition we have is that someone sometime has to sit down with the equations and decide that the zero-point motion that we are ignoring in forming our wave packet, the diffusion of the wave packet, is quantitatively negligible compared with other fluctuations in the system. You have to decide, when you are talking about the solar system, that the uncertainty in the position of the sun is utterly negligible. In this case, you have to decide when you are talking about a macroscopic superfluid that the uncertainty in definition of the wave function of the quantum field is essentially negligible and it will not diffuse. I have done that calculation [8] and it does indeed turn out that even in the worst possible cases the zero-point motion is negligible compared to thermal fluctuations. Now after a long prologue we come to the basic idea of what I am going to do today, which is to deal with the mean value of the quantum particle field as though it were one of the macroscopic thermodynamic and dynamical variables of the system, just as in, say, elasticity of crystals we deal with the position of a particular part of the crystal as a macroscopic thermodynamical variable, or in ferromagnetism we deal with the magnetization as a macroscopic variable. Both are, in fact, variables which destroy symmetry principles but are well known by our daily experience to be usable as thermodynamic variables. As in these cases, it is perfectly possible to define the variable not only for the total system but also for individual pieces of the system, small cells macroscopic from the atomic point of view but microscopic from our point of view. 
In other words, one can deal with coarse-grained averages as well as with macroscopic averages. It is quite important that $f$, the mean value of the wave field, is a complex order parameter. It contains an absolute value and a phase:

$$f = |f|\,e^{i\varphi}$$

The two real parameters $|f|$ and $\varphi$ turn out to be two different kinds of thermodynamic variables. In the thermodynamics of order-disorder systems, for example, we have a kind of thermodynamic variable, the long-range order in that case, which is characterized by the fact that while it is extremely convenient in discussing the thermodynamics and in keeping our concepts straight, there is no way we can get in there with a force and act on that particular variable. It is hard to find a force
that pulls differently on copper and gold atoms. Similarly in this case, the magnitude of the mean value of the quantum field is not a variable that we can get in and manipulate. So this is not of much value to us in discussing macroscopic dynamics. The phase $\varphi$ of the mean field, on the other hand, has the opposite property. It is a real dynamical variable such as the ones I gave as examples previously, the magnetization or the position of a piece of crystal. It is a variable corresponding to which there are forces, and thus is simultaneously a dynamical and a thermodynamical variable. The fact that it is a dynamical variable follows from the fact that the number operator $N_i$ which corresponds to any one of our partial systems is the conjugate dynamical variable to the phase operator
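$$N = i\,\partial/\partial\varphi.$$

It is clear that in forming the mean value $\langle\psi\rangle$ the term on the left with phase $e^{-i\varphi(N-1)}$ connects only with that on the right with phase $e^{i\varphi N}$, so that

$$\langle\psi\rangle = |f|\,e^{i\varphi}.$$

Thus the transformation matrix between phase and number operators is $e^{i\varphi N}$, and

$$\varphi = -i\,\partial/\partial N.$$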

There are limitations on the usability of these commutation relations but those are relevant only for very small systems, and we are talking specifically about a large system, so that we can really say this is a precise relationship. In addition this gives us an illustration of how to form wave packets for our system.
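As a quick consistency check of the conjugacy used here, the commutator can be worked out directly. This is only a sketch under stated assumptions: it takes the representation $\varphi = -i\,\partial/\partial N$ acting on a smooth test function $g(N)$, and treats $N$ as a continuous variable, which is the large-system approximation the text refers to.

```latex
\begin{aligned}
[\varphi, N]\,g(N)
  &= -i\,\frac{\partial}{\partial N}\bigl(N\,g\bigr) + i\,N\,\frac{\partial g}{\partial N}
   = -i\,g(N), \\
\text{so that}\qquad [N, \varphi] &= i .
\end{aligned}
```

The same commutator follows from $N = i\,\partial/\partial\varphi$ acting on functions of $\varphi$, so the two representations describe one conjugate pair.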


Now the essential dynamical equations for the whole phenomenon of superfluidity follow from these two ideas: that we are going to take states in which there are mean values of the particle field, and that the phase and the number operators are conjugate dynamical variables. The equations those lead to are, of course, the equations of motion of these two dynamical variables. We can write down the equation of motion of the number operator for one of our pieces of a dynamical system, connected by a superfluid connection to another system. Let the number and phase operators of the two systems be $N_1$, $N_2$ and $\varphi_1$, $\varphi_2$. The equation of motion of the number $N_1$ is given by the standard quantum mechanical equation of motion, which, using the fact that these are conjugate dynamical variables, can be written

$$i\hbar\,\dot{N}_1 = [\mathcal{H}, N_1] = -i\,\partial\mathcal{H}/\partial\varphi_1,$$

and correspondingly we have the other Hamilton's equation for this pair of conjugate variables

$$i\hbar\,\dot{\varphi}_1 = [\mathcal{H}, \varphi_1] = i\,\partial\mathcal{H}/\partial N_1.$$

These equations are essentially equivalent to two equations which are familiar in the theory of superconductivity. The first is more or less equivalent to the London-Ginsburg-Landau current equation ([5], [9]), the equation which determines the supercurrent in a superfluid system. The second is an equation which has had a shorter history, although a similar equation has occurred in both the theory of superfluidity and the theory of superconductivity. It is equivalent to London's acceleration equation [9], telling how the system responds if we apply forces. As I have written them down, these are equations for operators, but they hold, of course, for mean values; so these are equally good equations for the mean value of the number operator and for the mean value of the phase in our wave packets. The current equation is historically the first, and the most straightforward of the interference experiments that I promised to talk about are based on it. Of course, for an isolated system, as I said, the Hamiltonian commutes with the number operator, i.e., the Hamiltonian must obey gauge invariance, and be independent of the total phase. That just says, of course, that the number of particles moving into or out of an isolated system is zero. The situation is quite different if we talk now about two separate pieces of a superfluid system. Now imagine that we have two separate pieces of a superfluid system connected by a superfluid bridge (we must always remember that when I say superfluid I mean either helium or superconductors, in which case I am talking about pairs of electrons). Our very definition of superfluidity has said that we have a


mean value of the quantum field operator in this whole system. If the energy were independent of the relative phase of the quantum field operator on the two sides I would not have such a mean value because on the average the two phases would fluctuate completely independently. So by my very definition of superfluidity I have said that there must be a dependence of the total energy on the relative phase on the two sides; there must be an energy which must be a minimum where the phases are equal, and must increase on each side of that minimum. Incidentally, of course, this tells me why superfluidity and superconductivity are low-temperature phenomena, because as thermal fluctuations increase, eventually at high enough temperatures the system is going to overcome this energy (which can be traced in every case to the quantum zero-point energy of the individual particles). If, then, there is a dependence of the energy on $(\varphi_1 - \varphi_2)$, $U = U(\varphi_1 - \varphi_2)$, the equation of motion for the number gives

$$\frac{dN_1}{dt} = -\frac{dN_2}{dt} = -\frac{1}{\hbar}\,\frac{dU}{d(\varphi_1 - \varphi_2)}.$$

We can equally well apply this to two small, semimacroscopic neighboring cells within an individual bucket of superfluid and we have again the same equation but in that case we get that the current will be the variational derivative of the energy with respect to the gradient of the phase:

$$J_s = \frac{1}{\hbar}\,\frac{\delta U}{\delta(\nabla\varphi)}.$$

This is the coarse-grained equivalent of the previous equation. Again we expect that the energy as a function of the gradient of the phase must be a minimum when the gradient is zero, and that it will be a quadratic function for small enough values of the gradient:

$$U = n_s(T)\,(\hbar^2/2m)\,(\nabla\varphi)^2.$$

As the coefficient it is convenient to put in $\hbar^2/2m$ times a perfectly arbitrary function $n_s(T)$ of the temperature and the internal state of the system, and then the current is the variational derivative of this with respect to the gradient of the phase and so may be written

$$J_s = n_s(T)\,(\hbar/m)\,\nabla\varphi.$$
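The step from the quadratic energy to the current can be written out explicitly. This is only a sketch, assuming that $U$ is the energy density, that $n_s(T)$ is held fixed in the variation, and that the current is $(1/\hbar)$ times the variational derivative as stated in the text:

```latex
\begin{aligned}
U   &= n_s(T)\,\frac{\hbar^2}{2m}\,(\nabla\varphi)^2, \\
J_s &= \frac{1}{\hbar}\,\frac{\delta U}{\delta(\nabla\varphi)}
     = \frac{1}{\hbar}\,n_s(T)\,\frac{\hbar^2}{m}\,\nabla\varphi
     = n_s(T)\,\frac{\hbar}{m}\,\nabla\varphi .
\end{aligned}
```

The factor of 2 from differentiating the square cancels the 2 in $\hbar^2/2m$, and one power of $\hbar$ cancels against the $1/\hbar$ prefactor, which is what leaves a combination with the dimensions of a velocity multiplying $n_s(T)$.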

Well, it is clear now why I have written the coefficient the way I did. I wanted to get a quantity which looked as though it was a velocity, a


superfluid velocity $v_s = (\hbar/m)\nabla\varphi$ multiplying a superfluid number of particles. At this point, this is just a definition of the superfluid velocity. At absolute zero, in a perfectly homogeneous system, we can prove, in fact, that this is the velocity of the superfluid and that $n_s(T)$ is equal to the total number of particles or of electron pairs in the system. You prove that very simply by taking the perfectly homogeneous system, and translating it at a certain velocity, which must then be the superfluid velocity. I am aware of no proofs that $v_s$ defined in this way means anything else in any real system, either superconducting or superfluid. In any real bucket of helium, any experimentalist like myself who has worked with it will tell you that there are always counterflows of normal and superfluid particles, and in any real superconductor there are impurities. Even at absolute zero in the presence of impurities $n_s$ is not identically equal to $N$. Thus this is an idealization and one's reasons for calling this $v_s$ are reasons basically of convenience rather than necessity. It is sometimes but not always a good approximation of the actual particle velocity in the system. But it is the current equation which is the basic equation. In the case of superconductivity, another somewhat more familiar route may be taken to this equation. In the case of superconductivity you can recognize immediately that I have told you a whopper when I said that the energy is a functional of the gradient of the phase of the mean value of the wave field operator because that, of course, is not a gauge invariant statement. In the case of superconductivity my particles are charged; there is an electromagnetic field present and I can change the phase of the wave function arbitrarily if I at the same time change the value of the vector potential of that electromagnetic field.
So in order to maintain gauge invariance I have to write instead that $U$ is a functional of $(\hbar\nabla\varphi - 2e\mathbf{A}/c)$ where $\mathbf{A}$ = vector potential. $2e$ occurs in the factor rather than $e$ because $\varphi$ is the phase of a two-particle wave function. Then, of course, I take the well-known identity that the current is equal to $1/c$ times the variational derivative of the energy with respect to the vector potential and that gives me the electromagnetic current in this case. At this point I would like to make two things clear. The first is that this discussion of the supercurrent already makes it very plausible (and I believe probably it can be made more than plausible by someone more patient than I) that superfluidity as we understand it, the existence of superflows in the absence of driving forces, and this type of coherence within the wave function or the off-diagonal long-range order concept to which it is equivalent, are both necessarily and sufficiently related to each other. Thus we suggest it is impossible to have supercurrents without off-diagonal long-range order and vice versa. The one seems plausible because there is only one variable, $\mathbf{A}$, or the gradient of the phase, to


which the current operator is conjugate, and therefore it does not seem that we can get supercurrents unless we have a dependence of the energy on the appropriate variable. Conversely, if we have a dependence of the energy on this variable we must have supercurrents. Now this seems absolutely trivial except that there are papers in the literature during the past year which violate both directions of the implication. There are papers producing hypothetical phases with off-diagonal long-range order which do not have any supercurrents, and there are papers producing hypothetical superfluid phases which have no off-diagonal long-range order, so that it may be worth pointing out that they seem to be necessarily and sufficiently connected. The second importance of this part of the discussion is that this is the point at which macroscopic quantization begins to enter the picture. The first point at which macroscopic quantization enters is that $\langle\psi\rangle$ is to be taken as a macroscopic variable of the system, and therefore it is a physical quantity. Thus it is not by any means to be taken as multiple valued. Therefore, as you follow this quantity around some circuit within which you have, say, two bridges between two superconducting samples (i.e., some multiply connected circuit) the value of that quantity must come back to the same value when we return to the same point. The phase, however, need not return to the same value: it can return to the same value $+\,2\pi$ or $4\pi$ or any integer multiple of $2\pi$. So we immediately get that there is a kind of quantization of a circuit integral. As we go around any closed curve within our superfluid, the integral of the gradient of the phase must be equal to $2n\pi$. And that is the basic quantization statement we have to make. Now in the case of superfluidity, as I have said, we often take the superfluid velocity to be $(\hbar/m)\nabla\varphi$

$\oint \mathbf{A}\cdot d\mathbf{l}$ is equal to $hc/2e$. This line integral in turn is easily shown to be the magnetic flux. Again this is an approximation because we have had to use a circuit in which there is no supercurrent and in many instances, which have actually been checked out experimentally, the sample can be made so small that there is supercurrent at every point. That inexactness in the macroscopic quantization is thus one which can actually be proved experimentally. So it is important to remember that the quantization in the sense of the line integral of $\nabla\varphi$

$4\pi$, etc.:

[Figure 1. Strong superconductivity.]

For reasons which are easy enough to understand but that I do not want to go into here, when you actually start working with this system, it tends almost always to pass through the intersections between the parabolas without paying any attention to the other parabola, so that it is not possible to get from one of these states to another easily. The tremendous contribution that Josephson [10] made to the subject is the fact that he discovered a way in which this periodic energy curve could be made to be simply sinusoidal. It did not then have any of these annoying


metastable states up at high energy so that the system would follow the sinusoidal behavior as one changed the phase. The original Josephson junction was simply a tunneling barrier between two pieces of superconductor separated by an extremely thin film of insulator, but now it is realized that simply extremely thin wires or extremely thin bits of superconducting film behave in much the same way. Almost all the macroscopic interference experiments make use of this kind of junction between two pieces of superfluid, except for the original macroscopic flux quantization experiment by Doll-Näbauer [11] and Deaver-Fairbank [11]. What is the simplest way in which this could be used? We have two pieces of superconductor connected by an insulating film, and in order to arrive at this energy curve as a function of phase, I have had to assume that the phase difference is a certain fixed value everywhere in the junction. That need not be so, however, if I insert a magnetic field through the insulating barrier and parallel to it. Then the field is given by the curl of the vector potential so I can say that there is a vector potential perpendicular to the barrier.

[Figure 2. Energy versus phase difference $\varphi_1 - \varphi_2$; horizontal axis marked at $0$, $2\pi$, $4\pi$.]

Thus the phase difference is a linear function of $x$ (Figure 3). Then if we draw the sinusoidal phase-energy curve we see that for small enough $H$ ($\ll hc/4\lambda w e$, it turns out, where $w$ is the width of the junction and $\lambda$ the penetration depth) the phase difference is smeared only over a small portion of a cycle and the total energy is still sinusoidal; but when $H$ is big enough to cover a full cycle, there is no longer any total dependence of energy on phase; it encompasses a whole $2\pi$ of the phase axis and then the energy clearly does not depend on where I am along the curve (see Figure 2). Therefore, since the total current is the derivative of the energy with respect to the phase, the total current will be exactly zero. In fact you can follow through the argument and convince yourself that the total current follows a perfect one-slit interference pattern. This experiment was the first check by Rowell and myself [12] that we were really seeing this kind of junction.
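The one-slit pattern described above can be checked numerically. What follows is a minimal sketch, not the analysis used in the experiment. It assumes a uniform junction whose local current density is proportional to the sine of the local phase difference, with the phase varying linearly across the junction. The function names and the normalisation (junction width 1, zero-field current 1) are illustrative choices that do not come from the text.

```python
import math


def max_junction_current(flux_ratio, samples=2000):
    """Largest total current through a uniform junction, in units of its
    zero-field value, when flux_ratio flux quanta thread the junction.

    Assumed model: local current density = sin(phase), with
    phase(x) = phi0 + 2*pi*flux_ratio*x for 0 <= x <= 1.
    """
    real_part = 0.0
    imag_part = 0.0
    # Midpoint-rule average of exp(i * theta) across the junction (phi0 = 0).
    for k in range(samples):
        x = (k + 0.5) / samples
        theta = 2.0 * math.pi * flux_ratio * x
        real_part += math.cos(theta)
        imag_part += math.sin(theta)
    real_part /= samples
    imag_part /= samples
    # Maximising the average of sin(phi0 + theta) over the free phase phi0
    # gives the modulus of the average of exp(i * theta).
    return math.hypot(real_part, imag_part)


def one_slit(flux_ratio):
    """Closed-form one-slit pattern |sin(pi f) / (pi f)| for comparison."""
    if flux_ratio == 0:
        return 1.0
    a = math.pi * flux_ratio
    return abs(math.sin(a) / a)


if __name__ == "__main__":
    for f in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(f, round(max_junction_current(f), 6), round(one_slit(f), 6))
```

In this model the total current vanishes whenever a whole number of flux quanta threads the junction, which is the statement in the text that the energy no longer depends on phase once the phase difference covers a full $2\pi$.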


[Figure 3. Two superconductors separated by an insulating barrier.]

Fig. 4. The field dependence of the Josephson current in a Pb-I-Pb junction at 1.3°K. (Horizontal axis: field, in gauss.)


Once you are convinced that you can do that then you can immediately see (we should have, but Mercereau did) the next experiment to do. The next experiment to do is to make two of these things and separate them by a hole. In other words, I connect two superconducting reservoirs by two Josephson junctions and have a hole in the middle through which I pass a magnetic field; of course, the magnetic field also had to be in the two Josephson junctions. In that case, I get the original pattern for each of the junctions but the phase at one is correlated with the phase at the other depending on how much of the vector potential comes from the magnetic field in the hole. So as I increase the magnetic field, since the hole is much bigger than the junction I get a two-slit interference pattern with a one-slit interference pattern as its envelope. This was checked by Mercereau et al. [13].

Fig. 5. Cross-section of a Josephson junction pair vacuum-deposited on a quartz substrate (d). A thin oxide layer (c) separates thin (~1000 Å) tin films (a) and (b). The junctions (1) and (2) are connected in parallel by superconducting thin film links forming an enclosed area (A) between junctions. Current flow is measured between films (a) and (b).

Fig. 6. Josephson current versus magnetic field (B, in milligauss) for two junctions in parallel showing interference effects. Magnetic field applied normal to the area between junctions. Curve (A) shows interference maxima spaced at $\Delta B = 8.7 \times 10^{-3}$ G, curve (B) spacing $\Delta B = 4.8 \times 10^{-3}$ G. Maximum Josephson current indicated here is approximately $10^{-3}$ A.
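The two-slit pattern with a one-slit envelope can be sketched in the same spirit. This is an illustrative model only: it assumes two identical junctions, takes the maximum current of the pair to be modulated by $|\cos(\pi\Phi_{\text{hole}}/\Phi_0)|$, and multiplies by the single-junction envelope $|\sin(\pi\Phi_J/\Phi_0)/(\pi\Phi_J/\Phi_0)|$. The function name, its arguments, and the area ratio used in the demonstration loop are hypothetical and are not taken from the experiment.

```python
import math


def two_junction_current(hole_flux_ratio, junction_flux_ratio):
    """Maximum supercurrent of two identical junctions in parallel, in units
    of its zero-field value (assumed model, see the accompanying text).

    hole_flux_ratio:     flux through the hole, in flux quanta
    junction_flux_ratio: flux through each junction, in flux quanta
    """
    # Two-slit factor: the relative phase of the two junctions is set by
    # the flux enclosed between them.
    two_slit = abs(math.cos(math.pi * hole_flux_ratio))
    # One-slit envelope from the flux threading each individual junction.
    a = math.pi * junction_flux_ratio
    envelope = 1.0 if a == 0 else abs(math.sin(a) / a)
    return two_slit * envelope


if __name__ == "__main__":
    area_ratio = 50.0  # hypothetical: hole area / junction area
    for step in range(0, 11):
        hole_flux = 0.25 * step
        current = two_junction_current(hole_flux, hole_flux / area_ratio)
        print(hole_flux, round(current, 4))
```

Because the hole is much larger than a junction, the cosine factor goes through many periods before the envelope changes appreciably, which is the behaviour described for Fig. 6.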


The next experiment, also carried out by Mercereau et al., was to check the idea of Bohm and Aharonov that the vector potential plays a special role in quantum mechanics relative to classical mechanics. They wanted to check that one could get an interference phenomenon without there being any actual magnetic field acting on the electrons so they made the same device but now enclosed the magnetic field in a solenoid, and insofar as possible there was no magnetic field from the solenoid in any other parts of the apparatus. Sure enough, now the one-slit pattern is gone, but one gets a simple diffraction pattern up to as large magnetic fields as they used [14]. Now, finally, Mercereau and his co-workers at Ford used the same thing as a very sensitive ammeter [15]. They carried out the final possibility, which is to make two Josephson junctions but now one connects the two sides by wire which is wound noninductively. So now they tested the supercurrent of the junctions in parallel in order to test the relative phase of the two junctions. This then is testing the Josephson current as a function of the superfluid velocity, i.e., as a function of $\nabla\varphi$, where $\nabla$

$$\nabla\Bigl(\frac{\partial\mathcal{H}}{\partial N}\Bigr) = m\,\frac{\partial}{\partial t}\Bigl(\frac{\hbar}{m}\nabla\varphi\Bigr).$$

Here $\partial\mathcal{H}/\partial N$ is just precisely the definition that we normally use of the chemical potential. In any system, no matter how dreadfully off equilibrium it may be, this is the chemical potential. The gradient of the chemical potential is the accepted definition of the force on the system, so what this tells you is that the superfluid velocity obeys the equation of motion $m\dot{v}_s = F$. That is why I call it the acceleration equation. There is, then, only one way in which we can have an actual difference in potential across a superfluid, and that is to allow acceleration to take place. We can do that very directly: for instance, suppose I had two buckets of liquid helium with an opening between them, I could make the difference in head on the two sides finite but I would


know then that unless something else happened, the liquid helium would accelerate and one would get just U-tube oscillations; one would never get any dissipation. Now fortunately, or perhaps unfortunately, that is not the only way in which acceleration can take place. The other possibility is the relatively new idea of phase slippage by vortex motion [16]. In order to have a phase difference, a change in the phase around a circuit, I have to have a hole in the middle of it, otherwise I am going to have a point which is a branch point of the macroscopic particle field. I do not like that, but I have to assume that there may be situations in which one has to introduce branch lines of the quantum particle field. In fact, the simplest such arrangement I can introduce in order for the phase not to be single valued within a single macroscopic sample of superfluid is a line at which you take the mean value of the particle field to be zero, and then around that branch line I can allow the phase to have a variation of $2\pi$: around that branch line I have $\oint \nabla\varphi\cdot d\mathbf{l} = 2\pi$. This object is called a vortex core and occurs both in superconductivity and superfluidity. In superconductivity there is often a quantum unit of flux associated with it [17]. In superfluidity it is the well-known quantized vortex [18]. Now the concept of phase slippage comes from the fact that I can get a time-dependent difference in phase from one point in a sample of superfluid to another by the following procedure. Let us take two points in the superfluid, point 1 and point 2, and a path between them. If I have a quantized vortex line on one side of the path, the phase difference from 1 to 2 may have one value. Now I move my vortex line to the other side of the path; and because I have a branch line of the phase, I now know that my phase integral, $\nabla\varphi$

$$\mu_1 - \mu_2 = h\,\langle dn/dt\rangle,$$

where $\langle dn/dt\rangle$ is the average rate of passage of vortices through the path. Now, this tells us two things. One: it tells us the conditions under which dissipation can occur in superfluids: it can occur only if there are


vortex lines. It turns out that in liquid helium it is very easy to create vortex lines and so one almost always has dissipation of some sort. There are two kinds of superconductors. The first is the so-called soft superconductors, which are very resistant to the presence of vortex lines, and pass supercurrents without any dissipation whatever. The second is the hard superconductors, which contain vortex lines and always have an associated dissipation [19]. Second, and more interesting from the point of view of interference experiments, is the idea of using this expression to connect a potential difference with a frequency. You can see this is like Schrödinger's equation $i\hbar\,\partial\psi/\partial t = H\psi$, i.e., $(i\hbar/\psi)\,\partial\psi/\partial t$ equal to an energy. So I can associate a frequency with an energy difference simply by Einstein's frequency condition, which when it is applied in superconductivity is called Josephson's frequency condition. A typical simplified experiment, not far from the experiment that we actually carried out, is the following. I can take two buckets of helium with a partition between them, with a little hole in the partition, and then I can make a level difference so that the liquid is trying to flow through the hole. As the liquid flows through the hole it turns out that it probably blows smoke rings. In any case it creates vortices and these vortices move continuously out of the hole so if I make a path from the liquid surface in one bucket to the other, vortices pass across that path at a rate determined by the head difference. How can I tell whether they are passing at that rate? I can tell by introducing an ultrasonic transducer which modulates the flow through the hole at the appropriate rate, i.e., I can introduce an a.c. flow and synchronize the motion of those vortices. When the a.c. flow is synchronized with the height difference I would expect a singularity to occur in the flow and that indeed is exactly what we have observed [20].
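The frequency condition lends itself to a quick numerical estimate. The sketch below is an illustration under stated assumptions, not a description of the actual apparatus. It takes the chemical potential difference per helium atom to be $m g\,\Delta z$ for a head difference $\Delta z$, and the energy difference per electron pair to be $2eV$ for a voltage $V$, then divides by Planck's constant. The constant values are standard SI figures (the helium-4 atomic mass is approximate), and the function names are hypothetical.

```python
H_PLANCK = 6.62607015e-34   # Planck constant, J s (exact SI value)
E_CHARGE = 1.602176634e-19  # elementary charge, C (exact SI value)
HE4_MASS = 6.6465e-27       # mass of a helium-4 atom, kg (approximate)
G_EARTH = 9.80665           # standard gravity, m/s^2


def frequency_from_energy(delta_mu_joules):
    """Frequency associated with an energy difference: nu = delta_mu / h."""
    return delta_mu_joules / H_PLANCK


def superconductor_frequency(voltage_volts):
    """Frequency for electron pairs across a voltage: nu = 2 e V / h."""
    return frequency_from_energy(2.0 * E_CHARGE * voltage_volts)


def helium_frequency(head_difference_m):
    """Frequency for helium atoms across a level difference:
    nu = m g dz / h (assumes the chemical potential difference is purely
    gravitational)."""
    return frequency_from_energy(HE4_MASS * G_EARTH * head_difference_m)


if __name__ == "__main__":
    print("1 microvolt :", superconductor_frequency(1e-6), "Hz")
    print("1 mm head   :", helium_frequency(1e-3), "Hz")
```

Under these assumptions a head difference of one millimetre corresponds to a frequency of roughly 100 kHz, in the ultrasonic range, while one microvolt corresponds to a few hundred megahertz.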
This is interesting because it is the only one of these interference experiments so far which has been carried out in a precisely analogous way in both superconductivity and superfluidity [21]. What is the conclusion to be drawn from all this? In the first place, probably the presently most interesting conclusion is that our ideas about superconductivity and superfluidity seem to be right. That is particularly interesting in liquid helium, where the microscopic theory is not in as strong condition as the microscopic theory is in superconductivity. In the second place, something which probably is much more important for the more distant future is that there are here the germs of a series of devices which act on an entirely different principle from anything which has appeared on the macroscopic level previously. It is so different, in fact, and so exciting, that, to anyone who is familiar with what is going on, it is intensely frustrating that we have not been able to make more use


of them than we have. One of the reasons why I think we have not is that somehow these things are so different from what we are used to in our technology that we do not even yet have the sense to realize what we need them for, but I am sure that somewhere we need them.

QUESTION. Would it be possible for several types of off-diagonal long-range order, several types of condensation to occur simultaneously?

ANSWER. Well, that is an interesting question. That was proposed by Gor'kov and Galitskii. The question is very simple. What they suggested is essentially that you might also have a second term in the factorization, ((r)
REFERENCES

1. O. Penrose, Phil. Mag. 42, 1373 (1951); O. Penrose and L. Onsager, Phys. Rev. 104, 576 (1956). It has been pointed out to me that the concept of ODLRO is actually mentioned in the paper of V. L. Ginsburg and L. D. Landau, in 1950, which as we will shortly discuss gives the phenomenological theory of superconductivity. These authors failed to observe the limitation that fermions could not exhibit ODLRO except as pairs. It is clear, however, that by 1958, when the papers of Beliaev and Gor'kov to which we later refer appeared, the concept was clearly understood by the Russian group. It is unfortunate that the paper of Yang, Rev. Mod. Phys. 34, 694 (1962), which has had great influence in this country, did not acknowledge the Russians' priority.
2. S. Beliaev, J. exp. theor. Phys. 34, 417 (1958).
3. L. P. Gor'kov, J. exp. theor. Phys. 34, 735 (1958).
4. J. Bardeen, L. N. Cooper, and J. R. Schrieffer, Phys. Rev. 108, 1175 (1957).
5. V. L. Ginsburg and L. D. Landau, J. exp. theor. Phys. 20, 1064 (1950).
6. L. P. Gor'kov, J. exp. theor. Phys. 36, 1918 (1959).
7. E. P. Gross, Nuovo Cim. 20, 454 (1961); L. P. Pitaevskii, Sov. Phys. JETP 13, 451 (1961).
8. P. W. Anderson, in Lectures on the Many-Body Problem (E. R. Caianiello, Ed.), Vol. 2, p. 113. Academic Press, New York, 1964.
9. F. London, Proc. Roy. Soc. 152A, 24 (1935).
10. B. D. Josephson, Physics Lett. 1, 251 (1962).
11. R. Doll and M. Näbauer, Phys. Rev. Lett. 7, 51 (1961); B. S. Deaver and W. Fairbank, Phys. Rev. Lett. 7, 43 (1961).
12. P. W. Anderson and J. M. Rowell, Phys. Rev. Lett. 10, 230 (1963); J. M. Rowell, Phys. Rev. Lett. 11, 200 (1963).
13. R. C. Jaklevic, J. J. Lambe, A. H. Silver, and J. E. Mercereau, Phys. Rev. Lett. 12, 159 (1964).
14. R. C. Jaklevic, J. J. Lambe, A. H. Silver, and J. E. Mercereau, Phys. Rev. Lett. 12, 274 (1964).
15. R. C. Jaklevic, J. J. Lambe, A. H. Silver, and J. E. Mercereau, Phys. Rev. 140, A1628 (1965).
16. P. W. Anderson, Rev. Mod. Phys. (1966) to be published.
17. The theoretical discovery of the superconducting flux line is certainly due to A. A. Abrikosov, J. exp. theor. Phys. 32, 1442 (1957).
18. R. P. Feynman, Prog. of Low Temp. Physics (C. J. Gorter, Ed.), Vol. 1, Ch. II (1955).
19. P. W. Anderson, Phys. Rev. Lett. 9, 309 (1962); Y. B. Kim, C. F. Hempstead, and A. R. Strnad, Phys. Rev. Lett. 9, 306 (1962); Phys. Rev. 131, 2486 (1963).
20. P. L. Richards and P. W. Anderson, Phys. Rev. Lett. 14, 540 (1965).
21. P. W. Anderson and A. H. Dayem, Phys. Rev. Lett. 13, 195 (1964).


Considerations on the Flow of Superfluid Helium
Rev. Mod. Phys. 38, 298-310 (1966) (Sussex LT Meeting, 1965)

I feel this is the cleanest discussion of superfluidity available. Note that on many points this is contradictory or orthogonal to Landau orthodoxy as pronounced by Khalatnikov. Whether Landau would have agreed was never clarified because of his accident.


Reprinted from REVIEWS OF MODERN PHYSICS, Vol. 38, No. 2, 298-310, April 1966 Printed in U. S. A.

Considerations on the Flow of Superfluid Helium*

P. W. ANDERSON
Bell Telephone Laboratories, Murray Hill, New Jersey

First, we show that the most important equations of the dynamics of the two types of superfluids, He II and superconductors, follow quite directly from the simple assumption that the quantum field of the particles has a mean value which may be treated as a macroscopic variable. The background of this ansatz is also discussed. Second, we apply these equations to various physical situations in He II, notably the orifice geometry and the superfluid film, and show how they, and particularly the idea of phase slippage accompanying all dissipative processes, can be applied and what kinds of macroscopic interference phenomena may be expected. The effect of synchronization in the ac interference experiment is discussed.

Beliaev⁶ extended this for helium to a Green's function theory in which

E n F

,

ext «P>+

ext)/

FIG. 2. Termination of $4\pi$ vortex texture in a "hedgehog," which is the texture equivalent of a "monopole" attached to the end of a vortex line.

pattern of high superfluid velocity and thus the characteristic and puzzling problem of vortex nucleation will not be as serious. On the other hand, the phase slippage theorem [the first equality of Eq. (1)] is still valid between any two points in the fluid where the orientation of $\mathbf{l}$ does not change. This is because the usual argument from the number-phase commutation relation or from gauge invariance¹ that $d\varphi/dt$ equals $dE/dN$ is universal to all superfluids, but, if $\mathbf{l}$ is moving too rapidly, it is not safe to equate $(\partial E/\partial N)$ with $\mu$: hence the proviso on motions of $\mathbf{l}$. $\varphi$ is, of course, not a velocity potential but is uniquely definable when $\mathbf{l}$ is fixed, as it will be effectively, in most physical situations, at boundaries by the boundary condition on $\mathbf{l}$, and in many other situations in the bulk of the system because of orienting flows and fields or the Cross "normal pinning" effect.⁵ Equation (1), not its character as a velocity potential, is the key to the importance of the phase

(3)

Equation (3) implies the more general remark⁷ that a current source may always be inserted as a phase-dependent term in the Hamiltonian: H,

,=

-dN/dtt„(p.

Hence, a current source exerts an appropriate force on any texture whose motion can allow phase slippage. The final physical fact is that textures move relatively slowly and dissipatively in ³He because of the Cross "normal pinning" effect,⁵ so that vortex textures can only affect very low-frequency phenomena. We envisage three flow regimes. (1) For high-frequency phenomena such as fourth sound and vibrating wires, or for large chemical potential differences, ³He will behave like a conventional but anisotropic superfluid, since orbital motion will be pinned and phase slippage can only occur by motion of conventional vortices. (2) In the absence of a magnetic field and for moderately low frequencies, l will be free to orient at will and vortex textures will play a great role in causing dissipation. The critical velocity for nucleating textures should be extremely low, of order ħ/mR where R is an apparatus size, or ~10⁻² cm/sec. This is even lower than what is observed in heat flow experiments by Wheatley's group, but in the right range. It seems possible that in the absence of a magnetic field almost any flow will fill the sample with enough vorticity to damp out fluctuations and the relatively quiet behavior at low fields could be a turbulent regime. (3) In a magnetic field l is oriented by the dipolar energy for structures larger than the length R_s of order ~100 μm (Ref. 4) determined by the ratio of dipolar to current energy. The vortex structures must be of this order and thus contain velocities of order ħ/mR_s ~ 10⁻¹ cm/sec. This is again the right order of magnitude, but a bit fast relative to some critical velocities measured in resonance


VOLUME 38, NUMBER 9

PHYSICAL

REVIEW

and other experiments. In both of these cases, it is easily possible that a regular motion of vortex structures can be set up under appropriate flow conditions, which we speculate may be related to certain observations of regular and irregular orbital fluctuations.⁶ It is interesting to speculate on the outcome of a measurement of quantized vorticity in ³He-A by the Vinen⁸ vibrating-wire experiment or otherwise. At a Vinen wire the boundary condition will require l to be radial and the phase may rotate by any integer number of units 2π; a texture in, for instance, a cylinder can simply add or subtract 4π to this, so that any integer amount of vorticity is possible. However, the results might be chaotic in the absence of a field because of vortex textures throughout the liquid. With a field the wire can again have integer vorticity but in the surrounding liquid the 4π double vorticity is the most stable vortex line, consisting of a "core" which is a Fig. 1 texture of size ~R_s, and a conventional outer region. Thus, one may tend to add or subtract double units. In summary, the most important point to be made is that dissipation in this superfluid is qualitatively different from that in other superfluids and in most broken-symmetry systems, in that it can occur by the motion of textures (rather like "topological solitons") and not only by singularities of the order parameter. Thus, the property of superfluidity takes a very novel form in this case.

LETTERS

28 FEBRUARY 1977

We wish to acknowledge discussions with W. F. Brinkman and M. C. Cross, and D. J. Thouless's suggestion that the Vinen experiment should be examined. We thank the Aspen Institute for hospitality during the preparation of this Letter.

¹P. W. Anderson, Rev. Mod. Phys. 38, 298 (1966).
²P.-G. de Gennes, in Collective Properties of Physical Systems, edited by B. Lundqvist and S. Lundqvist (Academic, New York, 1973).
³G. Toulouse and M. Kleman, J. Phys. (Paris), Lett. 37, 149 (1976). It was, of course, already proven that vorticity is not quantized in A [see, for example, R. Graham, Phys. Rev. Lett. 33, 1431 (1974)], but the topological argument gives a new and very fundamental point of view.
⁴W. F. Brinkman and D. D. Osheroff, Phys. Rev. Lett. 32, 584 (1974). See also P. W. Anderson and W. F. Brinkman, in The Helium Liquids, edited by J. G. M. Armitage and I. E. Farquhar (Academic, New York, 1975), p. 315.
⁵M. C. Cross and P. W. Anderson, in Proceedings of the Fourteenth International Conference on Low Temperature Physics, Otaniemi, Finland, 1975, edited by M. Krusius and M. Vuorio (North-Holland, Amsterdam, Vol. 1, p. 29
⁶See J. C. Wheatley, Rev. Mod. Phys. 47, 415 (1975), especially Sect. VII; also D. N. Paulson, M. Krusius, and J. C. Wheatley, Phys. Rev. Lett. 37, 599 (1976).
⁷P. W. Anderson, in The Many Body Problem, edited by E. R. Caianello (Academic, New York, 1964).
⁸W. F. Vinen, Proc. Roy. Soc. London, Ser. A 260, 218 (1961).

Scaling Theory of Localization: Absence of Quantum Diffusion in Two Dimensions (with E. Abrahams, D.C. Licciardello and T.V. Ramakrishnan) Phys. Rev. Lett. 42, 673-676 (1979)

The "Gang of IV" paper. I had the idea of fitting the classical and localized limits together with some kind of interpolated scaling curve in the midst of a lecture one day. Then it was at lunch the same day with Elihu, who often dropped in at Princeton, where I maintained a visitor office for him, and Rama, who was with us for several years off and on, that Rama recalled the old diagram work by Langer which we used as a starting point. He and Elihu then rapidly put together the resummation of backscattering diagrams which is "weak localization theory". Don L did some simulational verification for us; besides, the original inspiration for a scaling theory borrowed a lot from Licciardello and Thouless' earlier paper. This is a favorite of mine for several reasons. One personal, over and above my love and regard for the three co-authors: In the news stories around my Nobel Prize someone (was it Watson at Brookhaven?) had wondered publicly whether the prize would end my creativity "as it so often has on the award of a Nobel" (I don't know much evidence for that, actually), and this was a prompt answer. Second, it had been frustrating that localization had never before had a formalism that could provide employment for the hungry mob of eager postdoc theorists and clever experimentalists; it had been a kind of isolated mystery, out of the reach of standard methods, theoretical and experimental. The period of three to five years which ensued was delightful in the warm atmosphere in which groups from India, Japan, Eastern and Western Europe, and the US cooperatively carried the enterprise forward. Some of this is summarized in the review paper 45 in the present volume.

VOLUME 42, NUMBER 10   PHYSICAL REVIEW LETTERS   5 MARCH 1979

Scaling Theory of Localization: Absence of Quantum Diffusion in Two Dimensions

E. Abrahams
Serin Physics Laboratory, Rutgers University, Piscataway, New Jersey 08854
and
P. W. Anderson,(a) D. C. Licciardello, and T. V. Ramakrishnan(b)
Joseph Henry Laboratories of Physics, Princeton University, Princeton, New Jersey 08540
(Received 7 December 1978)

Arguments are presented that the T = 0 conductance G of a disordered electronic system depends on its length scale L in a universal manner. Asymptotic forms are obtained for the scaling function $\beta(G) = d\ln G/d\ln L$, valid for both $G \ll G_c \simeq e^2/\hbar$ and $G \gg G_c$. In three dimensions, $G_c$ is an unstable fixed point. In two dimensions, there is no true metallic behavior; the conductance crosses over smoothly from logarithmic or slower to exponential decrease with L.

Scaling theories of localization have been discussed by Thouless and co-authors¹⁻³ and by Wegner.⁴ Recently Schuster,⁵ using methods related to those of Aharony and Imry,⁶ has proposed a close relationship of the localization problem to a dirty XY model of the same dimensionality. Wegner has proposed a scaling for dimensionality d > 2 of the conductivity

$\sigma \sim (E - E_c)^{(d-2)\nu}$,   (1)

while Schuster identifies ν as the correlation-length exponent of the XY model, = 1/2 at d > 4. The latter proposes a universal jump of conductivity for d = 2 given by $e^2/\hbar\pi^2$. This is not inconsistent with the results of Wegner⁴ and is in rough agreement with the early calculations of Ref. 2. It has not been clear how (1) could be reconciled with the physical ideas of Mott⁷ as related to the beginning of a scaling theory by Thouless.³ We here develop a renormalization-group scheme based on the Mott-Thouless arguments, which in many essential ways agrees with Wegner's results, and in other ways severely disagrees. In particular, we recover (1) for d > 2, where ν is the localization-length exponent below $E_c$. This is in clear contradiction to Mott,⁷ who argues that in all cases the conductivity jumps to zero at $E_c$. At d = 2, we find no jump in σ but a steep crossover from exponential to very slow dependence on size. There is no true metallic conductivity. These results were presaged by Thouless and co-workers⁸,⁹ to some extent, with Ref. 8 indicating a transition region for three dimensions, and Ref. 9 a size-dependent minimum metallic conductivity. Our ideas are based on the relationship¹ between conductance as determined by the Kubo-Greenwood formula and the response to perturbation of boundary conditions in a finite sample described by Thouless and co-workers:

$\dfrac{\text{``}V\text{''}}{W} = \dfrac{\Delta E}{dE/dN} = \dfrac{2\hbar}{e^2}\,G = \dfrac{2\hbar}{e^2}\,\sigma L^{d-2}$.   (2)

Here G is the conductance (not conductivity σ) of a hypercube of size $L^d$ [here $L \gg L_0$ ($L_0$ = microscopic size)], dE/dN is the mean spacing of its energy levels, and ΔE is the geometric mean of the fluctuation in energy levels caused by replacing periodic by antiperiodic boundary conditions. Actually, when "V/W" is relatively large, it is hard to match the energy levels and, in fact, ΔE is defined using the curvature for small χ when we replace periodic $\psi(n+1) = \psi(1)$ by $\psi(n+1) = e^{i\chi}\psi(1)$ boundary conditions. This procedure is valid throughout the range of interest.³ We will comment on the validity of (2) in a fuller paper, but here we add the following remarks. The equivalence of the Kubo-Greenwood formula and the breadth of the distribution of ΔE as described in Ref. 3a is not quite precisely provable but does not depend as stated in that reference on independence of momentum matrix elements $p_{\alpha\beta}$ and energy difference $E_\alpha - E_\beta$ between two states, only on a uniform distribution of those $E_\alpha - E_\beta$ which have large $p_{\alpha\beta}$. Our scaling theory depends on the following ideas. (I) We define a generalized dimensionless conductance which we call the "Thouless number" as a function of scale L:

$g(L) = \dfrac{\Delta E(L)}{dE(L)/dN} = \dfrac{G(L)}{e^2/2\hbar}$,   (3)

where we now contemplate a small finite hypercube of size L. In the case $L \gg l$, the mean free path, we may use (2) to define a conductance G(L) which is not related directly to the macroscopic conductivity but is a function of L, and is

© 1979 The American Physical Society


defined by (3) from the average of the Thouless energy-level differences at scale L. When L < l, there is phase coherence on a scale L and g is no longer given by (3) but it can be shown that $(e^2/\hbar)g = G$ can be defined as the conductance of a hypercube imbedded in a perfect crystal. (II) We remark that g(L) is the relevant dimensionless ratio which determines the change of energy levels when two hypercubes are fitted together. This is the hypothesis of Thouless and can be justified in several ways on physical grounds. For instance, once L > mean free path, the phase relationships for an arbitrary integration of the wave equation across the cube are as random from one side to another as those between wave functions on different cubes. This could be shown to be related to Wegner's "neglect of eigenvalues far from $E_F$" by a scaling argument. In this limit g(L) represents [as indicated in (2)] the "V/W" of an equivalent many-level Anderson model where each block has $(L/L_0)^d$ energy levels and a width of spectrum $W = (dE/dN)(L/L_0)^d$. We cannot see how any statistical feature of the energy levels other than this coupling/granularity ratio can be relevant. (III) We then contemplate combining $b^d$ cubes into blocks of side bL and computing the new $\Delta E'/(dE/dN)'$ at the resulting scale bL. The result will be

$g(bL) = f(b, g(L))$,   (4)

or in continuous terms

$d\ln g(L)/d\ln L = \beta(g(L))$.   (5)

The scaling trajectory has only one parameter, g. (IV) At large and small g we can get the asymptotics of β from general physical arguments. For large g, macroscopic transport theory is correct and, as in (2), $G(L) = \sigma L^{d-2}$, so that

$\lim_{g\to\infty} \beta_d(g) = d - 2$.   (6)

For small g (V/W ≪ 1), exponential localization is surely valid and therefore g falls off exponentially with L. Thence

$\lim_{g\to 0} \beta_d(g) = \ln[g/g_a(d)]$.   (7)

Here $g_a$ is a dimensionless ratio of order unity. From the asymptotics (6) and (7), we may sketch the universal curve $\beta_d(g)$ in d = 1, 2, 3 dimensions (Fig. 1). The central assumption of Fig. 1 is continuity: Since β represents the blocking of finite groups of sites, it can have no built-in singularity, and hence it would be unreasonable for it to have the cusp indicated by the dashed line: This is the curve which would be required to give the Mott-Schuster jump in conductivity for d = 2. The only singularities, then, must be fixed points β = 0. Physically, it is also certain that β is monotonic in g, since smaller V/W surely always means more localization. In constructing Fig. 1, we have used perturbation theory in V/W which shows that the first deviation of β from $\ln(g/g_a)$ is positive, with

$\beta = \ln(g/g_a)\,[1 + \alpha g + \cdots]$,   (8)

since this is essentially just the "locator" perturbation series first discussed by Anderson.¹⁰ The steepening of the slope of β given by (8) makes ν ≤ 1, as we shall see, for d = 3 or greater. For large g, we suppose that β may be calculated as a perturbation series in $W/V = g^{-1}$:

$dg/d\ln L = g\,(d - 2 - a/g + \cdots)$.   (9)

FIG. 1. Plot of β(g) vs ln g for d > 2, d = 2, d < 2. g(L) is the normalized "local conductance." The approximation $\beta = s\ln(g/g_c)$ is shown for g > 2 as the solid-circled line; this unphysical behavior necessary for a conductance jump in d = 2 is shown dashed.
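The one-parameter flow of Eq. (5) can be explored numerically. The sketch below is only an illustration, not the authors' calculation: it assumes a simple interpolating form, β(g) = d − 1 − (1 + g) ln(1 + 1/g), chosen because it is monotonic and has the two limits quoted in (6) and (7) (it tends to d − 2 − 1/(2g) at large g and to ln g + d − 1 at small g). The function names `beta`, `flow` and `critical_g`, and the starting values, are hypothetical.

```python
import math


def beta(ln_g: float, d: int) -> float:
    """Assumed interpolating beta function, evaluated at ln g.

    beta(g) = d - 1 - (1 + g) * ln(1 + 1/g)   (an illustrative choice)
    Large g:  beta -> d - 2 - 1/(2g)   (Ohmic limit with a 1/g correction)
    Small g:  beta -> ln(g) + (d - 1)  (exponential localization)
    """
    g = math.exp(ln_g)
    # ln(1 + 1/g) written as ln(1 + g) - ln(g) so it stays finite for tiny g
    log_term = math.log1p(g) - ln_g
    return d - 1.0 - (1.0 + g) * log_term


def flow(g0: float, d: int, ln_scale: float, steps: int = 2000) -> float:
    """Integrate d ln g / d ln L = beta(g) from L0 to L0*exp(ln_scale) (RK4)."""
    y = math.log(g0)          # y = ln g
    h = ln_scale / steps      # step in ln L
    for _ in range(steps):
        k1 = beta(y, d)
        k2 = beta(y + 0.5 * h * k1, d)
        k3 = beta(y + 0.5 * h * k2, d)
        k4 = beta(y + h * k3, d)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return math.exp(y)


def critical_g(d: int) -> float:
    """Find the unstable fixed point beta(g_c) = 0 by bisection (needs d > 2)."""
    lo, hi = math.log(1e-6), math.log(1e6)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if beta(mid, d) < 0.0:
            lo = mid
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))


if __name__ == "__main__":
    gc = critical_g(3)
    print("d = 3 fixed point g_c:", gc)
    print("d = 3, start above g_c:", flow(1.2 * gc, 3, 2.0))
    print("d = 3, start below g_c:", flow(0.8 * gc, 3, 2.0))
    print("d = 2, start at g0 = 5:", flow(5.0, 2, 3.0))
```

With this assumed β, a d = 3 sample started slightly above g_c flows toward larger g and one started below flows toward smaller g, while in d = 2 the conductance decreases with scale from any starting value, which is the qualitative content of Fig. 1.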


The first correction term in this series may be estimated in perturbation theory by considering backscattering processes of the sort first discussed by Langer and Neal¹¹ in their analysis of the dependence of resistivity on impurity concentration. The use of Langer and Neal involves a rather subtle question. Converting the calculation of Langer and Neal to dimensionality 2 in particular, one obtains

$g(L) = g_0 - g_1 \ln\Lambda$,   (10)

where Λ is a length cutoff for a certain divergent integral of second order in the density of scatterers. Langer and Neal assume this cutoff is l, the mean free path; for scales L < l, it should, of course, be L and we obtain just the result expected from (9), $d\ln g/d\ln L \sim -g^{-1}$. On the other hand, our universality argument seems to require L > l. We have restudied the cutoff question and will show in a fuller paper that their cutoff is not correct. On the other hand, we have been unable to show definitively that the mean free path does not represent a relevant scale for the problem, since once L > l, we find, for example, that the coefficient of ln Λ depends on l. We must rely rather on our general arguments from continuity and regularity, and an intuition that only g is relevant, to suppose that a series development of β in $g^{-1}$ should exist, once $L \gg l$.

The consequences of Fig. 1 and Eqs. (8) and (9) are as follows: For d > 2, the β function has a zero at $g_c$ of order unity. It is an unstable fixed point which signals the mobility edge. The critical behavior can be estimated by integrating β starting from a microscopic $L_0$ and with $g_0$ near $g_c$. We use the linear approximation

$\beta = s\ln(g/g_c)$,

where s > 1, since α > 0 in (8). For $g_0 > g_c$, we obtain

$\sigma = \dfrac{A e^2}{\hbar L_0^{d-2}}\,\epsilon^{(d-2)/s}$,   (11)

where A is of order unity. The distance to the mobility edge is measured by

$\epsilon = \ln(g_0/g_c) \approx (g_0 - g_c)/g_c$,   (12)

and the factor $Ae^2/\hbar L_0^{d-2}$ in (11) is the Mott conductivity which here appears in the scaling form proposed by Wegner. In the localized regime ($g_0 < g_c$), we get

$L_{\mathrm{loc}} \sim L_0\,|\epsilon|^{-1/s}$,   (13)

so that the exponent of the localization length is the inverse slope of β at $g_c$,

$\nu = 1/s$.   (14)

These results again agree with those of Wegner. In two dimensions, we have a strikingly different picture (see Fig. 1). Instead of a sharp mobility edge there is no critical $g_c$ where $\beta(g_c) = 0$, but β is always negative so that in all cases $g(L \to \infty) = 0$. Instead of a sharp universal minimum metallic conductivity, there is a universal crossover from logarithmic to exponential behavior which for many experimental purposes may resemble a sharp mobility edge fairly closely. If we extrapolate the form we would deduce from Langer's perturbation-theory calculation, on the "extended" side of the crossover

$g \approx g_0 - A g_c \ln(L/L_0)$,   (15)

the conductivity decreases logarithmically with scale until $g \sim g_c$ at the scale $L_1$, where

$L_1 = L_0 \exp\!\left[\dfrac{1}{A}\left(\dfrac{g_0}{g_c} - 1\right)\right]$.   (16)

From this point on g decreases exponentially with L, the localization length being of order $L_1$ as given by (16). This type of behavior was already anticipated from computer studies,⁹ but the nature of the actual solution is surprising to say the least, as well as the fact that it appears to have been anticipated in terms of a divergence of perturbation theory in the weak-coupling limit by Langer and Neal.¹¹

This work is supported in part by the National Science Foundation Grants No. DMR 78-03015 and No. DMR 76-23330-A-1, and by the U. S. Office of Naval Research.

(a) Also at Bell Laboratories, Murray Hill, N. J. 07974.
(b) On leave from Indian Institute of Technology, Kanpur, India.
¹J. T. Edwards and D. J. Thouless, J. Phys. C 5, 807 (1972).
²D. C. Licciardello and D. J. Thouless, J. Phys. C 8, 4157 (1975).
³ᵃD. J. Thouless, Phys. Rep. 13C, 93 (1974).
³ᵇD. J. Thouless, to be published.

⁴F. J. Wegner, Z. Phys. 25, 327 (1976).
⁵H. G. Schuster, Z. Phys. 31, 99 (1978).
⁶A. Aharony and Y. Imry, J. Phys. C 10, L487 (1977).
⁷N. F. Mott, Metal Insulator Transitions (Taylor and Francis, London, 1974).
⁸B. J. Last and D. J. Thouless, J. Phys. C 7, 699 (1974).
⁹D. C. Licciardello and D. J. Thouless, J. Phys. C 11, 925 (1978).
¹⁰P. W. Anderson, Phys. Rev. 109, 1492 (1958).
¹¹J. S. Langer and T. Neal, Phys. Rev. Lett. 16, 984 (1966).

Some General Thoughts about Broken Symmetry Symmetries and Broken Symmetries in Condensed Matter Physics, ed. N. Boccara, IDSET (Paris, 1981), pp. 11-20

This is my most complete summary of the theory of broken symmetry in condensed matter systems.


Some general thoughts about broken symmetry P. W. Anderson Bell Laboratories, Murray Hill, New Jersey 07974, USA and Princeton University *, Princeton, New Jersey 08544, USA

I. Introduction It is appropriate to commemorate the discovery of piezoelectricity, one of the most useful of broken symmetry phenomena, with a general discussion of broken symmetry as it manifests itself in condensed matter systems. Arguments from broken symmetry, however, go back even further in French science : Louis Pasteur's deduction that fermentation was a spontaneous life process on the basis of the optical activity of fermentation products is to me one of the most miraculously early and deep insights in the history of science. It is striking that so much of that history has taken place within a few hundred meters of where we now stand. The more theoretical physicists penetrate the ultimate secrets of the microscopic nature of the universe, the more the grand design seems to be ultimate symmetry and ultimate simplicity. But all of the interesting parts of the universe, at least to us, are, like the earth itself as well as our own bodies, markedly complex and markedly unsymmetric. In the most elementary sense, then, we are surrounded by « broken symmetry », the result undoubtedly of some sequence of catastrophes. What I want to do here is to discuss the general rules which govern this process of the development of complexity and the breaking of symmetry in particular kinds of cases. In particular, I want to point out that there is a complete, rather satisfactory theoretical structure describing one particular type of broken symmetry, namely that which occurs in equilibrium condensed matter systems, such as the crystals which exhibit piezoelectricity. It is this kind of broken symmetry object which allows us to build structures, communicate, make measurements, calculate, locate ourselves — in essence, carry on all the everyday business of life. It is clear that an equally important kind of broken symmetry occurs in many dissipative systems when they are driven far from equilibrium. 
These systems exhibit nonlinear instabilities at which new structures appear, which have often been called [1] « dissipative structures » and discussed on a parallel basis with the equilibrium structures of broken symmetry systems. Surely there are obvious parallels between these two types of systems: for instance, they both come under the general umbrella of « catastrophe theory » [2], and the second kind often (but not always) changes the symmetry of the system in which it occurs, if only in some trivial fashion. What I want to do here is to pin down the properties of condensed matter systems exhibiting broken symmetry in some detail. After all, the existence of broken symmetry in itself may not be of any use or significance : for instance, fully developed turbulence can be thought of as a broken symmetry state, yet seems to be totally chaotic and without definable structure. When one speaks of « dissipative structures » and makes the analogy to equilibrium phases, there is an implied analogy to their structural properties. Thus it is relevant to exhibit these and their underlying sources. The second part of the talk will consist almost entirely of questions rather than answers. To my mind there exists neither a theoretical nor an experimental basis for deciding whether (or not) dissipative systems have structural properties analogous to equilibrium ones, and I am essentially presenting a program of questions one may ask in this field.

* The work at Princeton University was supported in part by the National Science Foundation Grant No. DMR 78-03015, and in part by the U.S. Office of Naval Research Grant No. N00014-77-C-0711.

II. Equilibrium broken symmetry and the concept of rigidity

Landau was the first person to emphasize the important role which symmetry plays in the phase transitions of equilibrium condensed matter systems [3]. It is this, rather than Ehrenfest's concept of 1st vs. nth order, which gives us our most fundamental classification scheme of phases and the most basic theorems of solid state physics. In modern terms, one envisages a basic symmetry group of the underlying particles and of space, G_0, containing not only such elements as rotations (isotropy of space) and translations (homogeneity of space) but time-reversal invariance, in some cases spin-rotation invariance, etc. Landau observed the very important fact that condensed states often exhibit lower symmetry than G_0; for instance, where a molecular liquid is homogeneous and isotropic and exhibits the group G_0, a nematic liquid crystal is anisotropic, a smectic one inhomogeneous. The new group of, for instance, the nematic, G_1, has only rotation symmetry about the director D.

FIG. 1a. Nematic liquid crystal in the disordered state. The line segments represent the rodlike molecules of the nematic. Averaging molecular orientations over macroscopic distances yields zero.

FIG. 1b. For a suitable choice of thermodynamic parameters, the nematic enters the ordered state, with the appearance of a macroscopic order parameter (the director D). The system is no longer isotropic, but has chosen a special direction : rotational symmetry has been broken.


Thus phase transitions often involve a change of symmetry. We classify phase transitions into 3 basic classes :
1) Same symmetry in the two phases : as, e.g., liquid-gas, the Mott metal-insulator transition [4]. In this case the transition may be first-order, and the line of first-order transitions can, and often does, end in a critical point where the free energies of the two phases coincide and the transition simply disappears. Too often the formal similarity of this case to certain aspects of case 2) is allowed to obscure the physical difference in principle.
2) G_1 is a subgroup of G_0 : the symmetry of G_0 is « broken ». In this case there can never be a disappearance of the transition line, which may be first or second order depending on details. Symmetry cannot change continuously: what I have called the First Theorem of condensed matter physics. Since the two phases differ in symmetry, they may have the same free energy and the same physical parameters given by the derivatives of the free energy such as T = ∂F/∂S, P = −∂F/∂V, and yet be distinct : a higher-order transition is permitted, even along a line of critical points, which line may continue into a line of first-order transitions.
3) The uninteresting case is unrelated symmetries G_0 and G_1; in this case only a first-order transition is possible, since only by impossible coincidence could all parameters coincide.
In order to do this Landau added one concept which is vitally important to all of these questions : that of the « order parameter » η. Unfortunately, the concept of the order parameter still remains somewhat mysterious, although in most simple cases its choice is obvious. This concept is only of interest in the « broken symmetry » case 2). In case 1), any of the relevant thermodynamic variables will serve to distinguish the degree of difference of the two phases, as for instance in the Mott transition the number of free carriers, etc.
In the broken symmetry case where a loss of symmetry is present, Landau introduces an « order parameter » η indicating the degree of broken symmetry which, as he points out, is usually also the degree to which the system has « ordered »: for instance, in the liquid crystal, the degree to which the molecules are no longer arbitrarily oriented but have now oriented themselves along a specific direction. In this example, one might use ⟨cos² θ⟩ − 1/3 as the order parameter, where θ is the angle of a molecule with the director D. This « order parameter » is very important in many ways, not the least of which is that it is a new thermodynamic variable, and often either is or contains a dynamic one as well. As far as I know there is no complete characterization of the « order parameter » but it must certainly have the following important properties (and I have the impression that the term is, unfortunately, often used without understanding these restrictions) :
1) It must be a variable which is operated on by the group generators of G_0/G_1 = H, the group representing the amount of lost symmetry. This is in fact often called the « group of the order parameter » and defines « order parameter space » [5, 6].
2) Since these group generators (at least for all continuous groups G_0) are necessarily dynamical as well as thermodynamic variables, since they are operators in the Hilbert space of the underlying quantum mechanics, the order parameter must contain, at least, a dynamical variable or variables conjugate to these.
3) Finally, the order parameter must be a useful quantitative measure of the degree to which G_0 has been broken. It has then a thermodynamic as well as dynamic significance.
In such cases as order-disorder transitions (described by the Ising model) where the broken symmetry is discrete, this may be its only role : the order parameter can be merely the difference in mean population of the two or more relevant states on a given site. In the more interesting continuous situation, there are two cases. First, there is the case of spins, where the algebra is totally defined by the spin components, which are themselves the generators of spin rotations. In this case the group generators, S_x, S_y, S_z, are self-contained, and their mean value ⟨S⟩ can be taken as the order parameter. Since the average of any of these group generators is, by symmetry, a constant of the motion, the ferromagnetic case represents a very rare example: a case in which the order parameter is also a constant of the motion, which has important consequences for the nature of the relevant Goldstone bosons. It is too bad that in illustrating the concept of broken symmetry, the ferromagnetic example is often used : it is extremely atypical. More typical in its complication is the antiferromagnetic case. Here an additional but discrete symmetry is broken, that of the lattice translation, and the order parameter is not the mean magnetization but the sublattice magnetization, or more generally a Fourier component of the magnetization. While this is not a constant of the motion, it still can be related to generators of the continuous part of the symmetry group.
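As a small numerical illustration of the nematic order parameter ⟨cos² θ⟩ − 1/3 mentioned above, the following sketch evaluates it for two limiting samples. It is only a sketch under stated assumptions: the director is taken as known, each molecule is represented by the cosine of its angle to the director, and the helper names `nematic_order` and `isotropic_sample` are hypothetical.

```python
import random


def nematic_order(cos_thetas):
    """Return <cos^2(theta)> - 1/3 for cosines of molecule-director angles."""
    values = list(cos_thetas)
    mean_sq = sum(c * c for c in values) / len(values)
    return mean_sq - 1.0 / 3.0


def isotropic_sample(n, seed=0):
    """Cosines for n randomly oriented molecules (assumed isotropic liquid).

    For directions uniform on the sphere, cos(theta) is uniform on [-1, 1].
    """
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


if __name__ == "__main__":
    # Disordered (isotropic) phase: the order parameter averages to about zero.
    print("isotropic:", nematic_order(isotropic_sample(50000)))
    # Fully aligned molecules: cos(theta) = 1 for every molecule.
    print("aligned:  ", nematic_order([1.0] * 1000))
```

The quantity vanishes (up to sampling noise) when the orientations are random and reaches 2/3 for perfect alignment, so it measures the degree to which rotational symmetry has been lost.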

FIG. 2. Variation of magnetization M with temperature T in a simple ferromagnet. This is a typical second-order phase transition, in which the order parameter grows continuously from zero as T is lowered below a critical temperature T_c.

The cases of density waves (solid lattice or smectic liquid crystal) and superfluidity are more typical. Let us first discuss the simple one, superfluidity [7]. Here the broken symmetry is the gauge symmetry of interactions which locally conserve particle number, and the group generator is then the number operator, which may be written i ∂/∂φ, φ the phase or gauge variable. But the phase itself is not a suitable order parameter because it is periodic and its origin is meaningless. Hence it is natural to use ⟨e^{iφ}⟩ = ψ(r) as an order parameter.

This exhibits the most straightforward type of order parameter. The phase of ψ is the relevant dynamical variable, independently of the magnitude which is purely statistical in nature. This is the general rule : the order parameter contains a phase-angle-like quantity which is both a dynamical variable and reflects the original symmetry, in the sense that the free energy cannot depend on the value of φ, only on relative values in different parts of the sample. φ may move about freely in the space defined by the group H (in this case the group of gauge symmetry U(1)). In the isotropic ferro- and antiferromagnetic cases, « angle » is the spin direction, a dynamical variable free to rotate on the sphere. In real (as opposed to model) anisotropic ferromagnets, the spin retains its dynamical character in such phenomena as domain wall motion and spin waves. The nematic liquid crystal behaves in much the same way. But the density waves bring in added complications. The phase angle variable in this case is position in space, and it is reasonable to use the Fourier components of the density as order parameters : ρ_G = ⟨e^{iG·r}⟩. Clearly, this contains the displacement u as a phase parameter : ρ_G(u) = ⟨e^{iG·(r+u)}⟩ is a density wave displaced by a distance u. Thus the strain u is the dynamical variable and |ρ_G| the statistical one for this case. In the crystal lattice case, where the number of independent G's is equal to the dimensionality of space, u is the only independent angle variable in a certain sense. This is despite the fact that G, of course, also can rotate freely in space. However, as Halperin and Nelson have emphasized [8] (and as was already evident in the old Shockley-Read construction of a grain boundary as a dislocation array) one must introduce a large, finite density of defects in the strain, essentially destroying positional order, before orientational readjustment of parts of the crystal independently is permitted.
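The statement that the displacement u enters ρ_G only as a phase can be checked directly on a toy example. The sketch below assumes a perfect one-dimensional lattice of point particles with spacing a and the shortest reciprocal-lattice vector G = 2π/a; the function names `density_wave_order` and `displaced`, and the numerical values, are hypothetical choices made for illustration.

```python
import cmath
import math


def density_wave_order(positions, G):
    """Return rho_G = <exp(i G r)>, averaged over the particle positions."""
    return sum(cmath.exp(1j * G * r) for r in positions) / len(positions)


def displaced(positions, u):
    """Rigidly shift every particle by the displacement u."""
    return [r + u for r in positions]


if __name__ == "__main__":
    a = 1.0                                # assumed lattice spacing
    G = 2.0 * math.pi / a                  # shortest reciprocal-lattice vector
    lattice = [n * a for n in range(100)]  # perfect 1D lattice

    rho_0 = density_wave_order(lattice, G)
    rho_u = density_wave_order(displaced(lattice, 0.1 * a), G)

    # The magnitude is unchanged; only the phase advances by G*u.
    print("|rho_G| before and after:", abs(rho_0), abs(rho_u))
    print("phase shift:", cmath.phase(rho_u), "expected G*u:", G * 0.1 * a)
```

A rigid shift leaves |ρ_G| untouched and rotates its phase by G·u, which is the sense in which u plays the role of the phase-angle variable for a density wave.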
Thus in the true lattice case, only the strain is a relevant dynamical object, although the crystal overall may rotate as well. This is not the case in the cholesteric and smectic, which are perhaps the most difficult to characterize in principle of all ordered states. Here strain $\mathbf{u}$ and G are both free angle variables, but there is what Volovik and Mineev [9] have called an « integral constraint » restricting the angular variation of G: namely, $\oint \mathbf{G}\cdot d\mathbf{s} = 0$. That is, the layers may bend but the distance between layers must be constant. The nature of singularities, at least, in these two cases is not understood as yet. An important and little-appreciated caution must be added on the dynamics of the density wave situation. The « strain » or « position » variable « u » is that of the density wave itself, which only usually, not always, means the position of the whole substance. As Overhauser first pointed out for spin density waves [10], and Leggett for solids [11], the mass carried along by the density wave fluctuations or « phasons » as the density wave moves may or may not be the whole mass density. This question


P.W. Anderson

is a subtle and complex one related to the question of Mott or Wigner metal-insulator transitions. If there is a true energy gap for excitation of particles in the self-consistent lattice potential (as there is for all real solids, but not for smectics or electron density waves in some cases), the lattice carries all the mass. From the order parameter and its dynamic nature flow many of the useful and important properties of condensed broken symmetry systems: the Goldstone and Higgs boson excitations, the long-range elastic-like forces (such as Suhl-Nakamura interactions in magnets) but most important of all the property I call generalized rigidity. Important examples of generalized rigidity are true rigidity, superconductivity, superfluidity, and hysteresis in magnets. The order parameter, which is invariably a thermal average of some local quantity, can be defined locally, at least if its variation is sufficiently slow. The magnetization direction of an antiferromagnet, or the phase of a superfluid, can vary from place to place in a given sample. It is only reasonable to suppose that the extra free energy caused by such variation grows only slowly as the rate of variation increases: i.e.,

$$F = F_0(|\eta|, T, \ldots) + F_1(|\eta|, T) : (\nabla\varphi)^2 + \cdots + a\,(\nabla|\eta|)^2 + \cdots .$$

We write the gradient terms only schematically, as far as their tensor character goes. This free energy determines the degree of fluctuation of $\eta$ and hence $\varphi$ by conventional Gibbs theory (treating $\eta$ as a conventional thermodynamic variable whose local average value we can constrain at will for purposes of calculating F). $F_0$ does not depend on $\varphi$ at all, by our original symmetry. The $(\nabla\varphi)^2$ term is necessary if the broken symmetry ordered state is to be stable. This equation implies the rigidity property. First, it is clear that with even a very small force applied only locally we can move $\varphi$ about at will in the whole sample. This is because the dependence of $F_0$ on the origin of

III. Generalization of the order parameter and broken symmetry concept, especially « dissipative structures »

In several special cases one has attempted to define an « order parameter » which does not have the full properties, for instance, that of being a dynamical variable; the present version of the order parameter in the spin glass is one example [12]. It is not at all clear, at least theoretically and probably experimentally, that true rigidity exists in this system. My own preference is to leave such aberrant cases to one side and recognize that the analogy is a dangerous one. It is for this reason that I am disturbed by the common uses of the terms « broken symmetry », « order parameter », and « dissipative structure » in the theory of nonlinear instabilities of driven systems. The attempt is to draw the analogy with equilibrium structures of broken-symmetry systems. It is proposed that if such properties existed, they would have important consequences in our understanding of the self-organization of living systems [13, 14]. One can argue endlessly about words rather than meanings in this area, so I would like to make a very clear distinction between what one might hope to be a useful « dissipative structure » and something which, while it has broken symmetry per se, and contains visible structure and something which might be described as an order parameter, is nonetheless an artefact which does not have the properties which might be useful for purposes of self-organization. The two properties which I would consider essential for self-organization are: a) Autonomy. b) Rigidity. Both of these are properties of condensed broken symmetry systems, and we may ask if they are exhibited in any well-understood dissipative systems.

FIG. 3. — In a first-order transition, such as the liquid to solid crystal transition shown here, the order parameter will exhibit a discontinuous jump at the transition with an associated release (or absorption) of latent heat.

FIG. 4. — Illustration (somewhat schematic) of generalized rigidity. An external force (the crank) couples to the order parameter at one end of the system, represented as a gear. A change in the order parameter at any point in the ordered system is transmitted to all other parts of the system (first gear turns the second gear). The second gear turns the second crank : a force has been transmitted from one end of the system to the other via the order parameter.
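The transmission of force through the order parameter can be illustrated with a discretized version of the gradient term in the free energy. The sketch below is an assumption-laden toy, not the author's construction: a one-dimensional chain of phases with energy $(K/2)\sum_i (\varphi_{i+1}-\varphi_i)^2$, whose two end phases are held fixed. The function name `relax_phase_chain` and all numerical values are hypothetical.

```python
import numpy as np

def relax_phase_chain(n_sites, phi_left, phi_right, stiffness=1.0):
    """Minimise E = (K/2) * sum (phi[i+1] - phi[i])**2 with both end phases fixed."""
    n_free = n_sites - 2
    # Stationarity for each interior site: 2*phi[i] - phi[i-1] - phi[i+1] = 0.
    A = 2.0 * np.eye(n_free) - np.eye(n_free, k=1) - np.eye(n_free, k=-1)
    b = np.zeros(n_free)
    b[0] += phi_left      # known neighbour of the first interior site
    b[-1] += phi_right    # known neighbour of the last interior site
    interior = np.linalg.solve(A, b)
    phi = np.concatenate(([phi_left], interior, [phi_right]))
    energy = 0.5 * stiffness * np.sum(np.diff(phi) ** 2)
    # Generalised force carried by each bond; it is uniform at equilibrium.
    torque = stiffness * np.diff(phi)
    return phi, energy, torque

phi, energy, torque = relax_phase_chain(n_sites=51, phi_left=0.0, phi_right=0.5)
```

In this toy the twist applied at one end spreads into a uniform gradient, so every bond carries the same generalized force: the « crank » at one end is felt at the other, which is the content of the gear picture.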


By autonomy of a structure I mean that its space or time structure should not be predetermined in terms of the scale of the external boundary conditions (as opposed to the microscopic scale of atoms or molecules, for instance). An example of a dissipative system which does have autonomy of scale is a dye laser or any laser where the precise mode of oscillation is not predetermined by careful mode selection techniques: the wavelength of light is an autonomous microscopic scale irrelevant to the scale of the apparatus. This is not the case in such classical systems as Benard or Couette convection cells, which are controlled in size by one of the apparatus dimensions. Autonomy is necessary if one is to speak of self-organization as opposed to predetermined organization. The second property, that of rigidity, seems to me also essential. If the structure is to be stable, carry out actions, and, above all, to serve as a substrate for information, it is essential that it be rigid: that is, that it have the two properties of (1) having internal degrees of freedom which are not predetermined; (2) having a freedom which may be stably manipulated and which can exert action at a distance.


FIG. 5. — The Benard instability in rectangular geometry. A layer of fluid between two horizontal rectangular plates is heated from below. When a sufficient thermal gradient is reached between top and bottom plates, convection arises in the form of rolls. In this cutaway edge-on view, the arrows represent the fluid velocity.

I am unaware of any work in the literature which demonstrates these properties in any well-understood dissipative system. Most of the conventional hydrodynamic systems which show regular roll patterns are not really autonomous. In these systems one often defines an « order parameter » which is the inhomogeneous component of velocity or flow, and under sufficiently restrictive conditions a kind of Gibbs free energy functional of the order parameter can be derived which gives the equations of motion near the instability by differentiation. But the assumptions which go into this derivation seem to preclude its general use, or the derivation of rigidity properties from it; nor is it clear that this « order parameter » has the real properties of the condensed systems order parameters, such as freedom to move within an order parameter space, locality, etc. In lasers and in turbulent systems, as well as in some chemical oscillation systems, autonomy seems to be available but not order or rigidity; in general, these systems

FIG. 6. — Couette flow: A fluid is placed between two cylinders with different rotational velocities about their axes. When the velocity gradient exceeds a critical value, rolls of vortices form. In this view the cylinder is cut along its length.

appear to be very chaotic. Whether this is a general state of affairs needs to be studied more carefully. If one enquires how living systems seem to go about building autonomous, rigid structures, one finds a fascinating mixture of dissipative and condensation processes at work. One seems to see dissipative stages initiating condensation, such as the formation of membranes, for example, and the condensed systems in turn controlling dissipative stages. Haken has emphasized this structure. It is not clear that the present idea of « dissipative structures » in the theory of nonlinear systems is at all relevant to this process.

FIG. 7. — In a laser, a standing wave of excitation density is set up between two end plates, or mirrors, resulting in emission of a beam of coherent radiation. [Figure labels: Applied Power; Emitted Radiation (Coherent).]


Acknowledgments

The above ideas have evolved over many years and months, and in discussions with many people. Their present form is much influenced by recent discussions with D. Stein, L. Sneddon, A. Iberall and H. Soodak. Some of the ideas about broken symmetry evolved in a lecture course given at Quing Hua University, and I am grateful for their hospitality and the assistance of Charlies Xie.

References

[1] For example, H. Haken, Synergetics (Springer-Verlag, Berlin) 1977.
[2] R. Thom, Structural Stability and Morphogenesis (W. A. Benjamin, New York) 1975.
[3] L. D. Landau and E. M. Lifshitz, Statistical Physics (Addison-Wesley) 1958.
[4] N. F. Mott, Metal Insulator Transition (Taylor and Francis, London) 1974.
[5] L. Michel, Rev. Mod. Phys. 52 (1980) 617.
[6] N. D. Mermin, Rev. Mod. Phys. 51 (1979) 591.
[7] P. W. Anderson, Rev. Mod. Phys. 38 (1966) 298.
[8] See B. I. Halperin, in N. Boccara, Symmetries and Broken Symmetries in Condensed Matter Physics (IDSET, Paris) 1981, p. 183.
[9] G. E. Volovik and V. P. Mineev, Zh. Eksp. Teor. Fiz. 72 2256; 73 (1977) 767.
[10] A. W. Overhauser, see for instance Phys. Rev. 167 (1968) 691.
[11] A. J. Leggett, Phys. Rev. Lett. 25 (1970) 1543.
[12] S. F. Edwards and P. W. Anderson, J. Phys. F 5 (1975) 965.
[13] H. Haken, to be published (Proc. Dubrovnik Conference on Self-Organization, E. Yates, ed.).
[14] See G. Nicolis and I. Prigogine, in N. Boccara, Symmetries and Broken Symmetries in Condensed Matter Physics (IDSET, Paris) 1981, p. 35.


The Rheology of Neutron Stars: Vortex Line Pinning in the Crust Superfluid (with M.A. Alpar, D. Pines and J. Shaham) Philos. Mag. A45, 227-238 (1982)

The "Glitch" phenomenon is nearly an unequivocal proof that neutron stars are at least partially solid. It is very hard to obtain two time scales differing by $\sim 10^{13}$, with no intermediate scales, without a breakdown of rigidity. Ruderman, Pines and Shaham proposed actual "starquakes" caused by the strains generated by spindowns, but soon accepted the mechanism Itoh and I proposed in a brief letter in Nature, of "vorticity jumps", equivalent to "flux jumps" in a superconducting magnet. Our letter was given an award by the Observer magazine for "the most incomprehensible title of the year". Pines and Shaham are responsible for introducing me to the Aspen Center for Physics, which played a great role in my life on my return to the US in 1975 and for 15 years thereafter. We wrote too many papers on too few data, of which this is a typical one, but this is perhaps natural in view of the generally apathetic response to our work in the astrophysics community.


PHILOSOPHICAL MAGAZINE A, 1982, VOL. 45, No. 2, 227-238

The rheology of neutron stars
Vortex-line pinning in the crust superfluid

By P. W. ANDERSON
Bell Laboratories, Murray Hill, New Jersey 07974, U.S.A. and Princeton University‡, Princeton, New Jersey 08544, U.S.A.

M. A. ALPAR†
Bogazici University§, Istanbul, Turkey and Princeton University‡, Princeton, New Jersey 08544, U.S.A.

D. PINES†
University of Illinois, Champaign-Urbana, Illinois 61801, U.S.A.

and J. SHAHAM†
Racah Institute of Physics, Hebrew University, Jerusalem, Israel

[Received 10 April 1981 and accepted 15 July 1981]

ABSTRACT
After a discussion of the general physics of neutron stars, we give a brief discussion of the 'glitch' phenomenon and its relation to superfluidity, and finally a rather detailed study of the physics of vortex-line pinning in the crust lattice.

† Work supported in part by the National Science Foundation, through Grant NSF PHY78-04404 and by NASA Grant NSG-7653.
‡ The work at Princeton University was supported in part by the National Science Foundation Grant No. DMR78-03015 and in part by the U.S. Office of Naval Research Grant No. N00014-77-C-0711.
§ The work at Bogazici University was supported in part by the Space Sciences Research Unit of the Scientific and Technical Research Foundation of Turkey.
© 1982 Bell Telephone Laboratories, Incorporated

§ 1. INTRODUCTION

Very soon after the discovery of pulsars and the convincing evidence that these were Landau's 'neutron stars', rotating at frequencies of about a hertz, observation began to show very-large-period irregularities in their rotations. The famous Crab pulsar, 900 years old and still rotating about 30 times per second, showed noisy irregularities including one, or sometimes a few, large events (which came to be called 'glitches' rather than noise) where the rotation frequency changed suddenly by $\Delta\Omega/\Omega \sim 10^{-9}$ to $10^{-8}$. The Vela pulsar is somewhat quieter on average, but has up to now shown four giant glitches where the period changes by $\sim 2 \times 10^{-6}$. These are enormous events in terms of total energy ($\sim 10^{43}$ erg), although unaccompanied by visible radiations. Two other,


older and slower, pulsars have exhibited giant Vela-style glitches, as well as, possibly, the X-ray stars Her X-1 and Vela X-1. Ruderman (1969) had already, before the first of these observations, pointed out that the neutron star is covered by a solid crust of nuclei in a crystalline lattice, embedded in a Fermi gas of neutrons and electrons, the former gas increasing sharply in density with depth. A solid layer naturally provides the rigidity necessary for catastrophic breakdowns, such as 'starquakes', and a starquake mechanism was immediately proposed (Ruderman 1969, Baym and Pines 1971, Pines, Shaham and Ruderman 1973). Apart from the planets, these glitches are the only evidence of rigid matter anywhere in the universe. Everything else is either gaseous or at best dusty, as far as we can ascertain. Even the white dwarf stars are likely to be too hot to solidify, and in any case show us no evidence of rigidity. It is only the coincidence of having the remarkable rotating 'searchlight' of pulse emission from the neutron stars (as yet by no means completely understood) which allows us to time their rotations with spectacular accuracy, and hence to follow the internal dynamics of these stars. It is now accepted that the energy source which fuels the continued emissions of these stars is the flywheel energy of rotation, of the order of $10^{46}$ erg for a conventional 1 s period pulsar, and $\sim 10^{49}$ erg for the fast rotators Vela and Crab. This rotation, left over from the angular momentum of the parent supernova, has stored much of the gravitational energy of the original star. It carries around the highly compressed magnetic field of the star ($\sim 10^{12}$ G) and, by mechanisms which are plausible but not precisely known, this leads to pulses of radio emission at each rotation of the magnetic pole. The energy radiated matches the regular rate of slowdown which is observed.
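The flywheel energies quoted above can be checked to order of magnitude from $E = \tfrac{1}{2} I \Omega^2$. The sketch below assumes a moment of inertia of $10^{45}$ g cm$^2$, a commonly used round number that is not stated in the paper; the Crab period follows from the quoted 30 rotations per second, and the Vela period of about 0.089 s is the value shown on the axis of fig. 4.

```python
import math

MOMENT_OF_INERTIA = 1.0e45  # g cm^2; assumed typical value, not taken from the paper

def rotational_energy_erg(period_s, moment_of_inertia=MOMENT_OF_INERTIA):
    """Rotational kinetic energy (1/2) I Omega^2 in erg for a given spin period."""
    omega = 2.0 * math.pi / period_s
    return 0.5 * moment_of_inertia * omega ** 2

periods = {"1 s pulsar": 1.0, "Vela": 0.0892, "Crab": 1.0 / 30.0}
energies = {name: rotational_energy_erg(p) for name, p in periods.items()}
```

With this assumed moment of inertia the 1 s pulsar comes out at a few times $10^{46}$ erg and the Crab at about $10^{49}$ erg, while Vela lands somewhat below $10^{49}$ erg, so the quoted figures are reproduced to within the order-of-magnitude accuracy intended.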
Most pulsars are slowing down at a rate indicating an age of only $\sim 10^{6}$ years; the Crab is only 930 years old (by Chinese observation of the supernova) and the Vela about $10^{4}$ years. Apparently most pulsars 'turn off' for unknown reasons at a few million years of age; the older ones shine only when they have a binary companion to shower down accreting matter upon them, as in Her X-1. If these fascinating objects are our only solid companions in the universe, it may behove us to try to understand their modified version of solid-state physics and, in particular, its most striking manifestation, the glitch. Mechanisms not involving solidity have been proposed, notably encounters with planetary objects or hydrodynamic instabilities, but neither of these can agree either with the remarkable between-glitch stability of the period, or with the sudden, spectacular events which take place (in a day or so at most; no glitch has yet been followed through its actual occurrence). Thus attention must focus on the solid-state aspects of the star, on the 'rheology' of the neutron star matter in some general sense: what kinds of catastrophic 'fracture' or breakdown events can occur, and what kinds of frictional phenomena may follow upon these events? It is in this context that a paper dedicated to Nicolas Cabrera can deal with the apparently exotic astrophysical subject of neutron star physics. It was in the dislocation theory of the strength of solids that many of the concepts used in these considerations first became clear: the vital importance of the topology of elementary defects in condensed phases, and of the interaction of defects with each other and with the lattice.


§ 2. HYPOTHESES TO EXPLAIN GLITCHES

As we pointed out, the first hypothesis immediately suggested is a starquake. As the star slows down, its equilibrium shape becomes less oblate and a rigid elastic layer must eventually reach its elastic limit and break. This may still be reasonable for the very recent Crab pulsar. As for the Vela pulsar, however, it is not possible to store and release $10^{-6}$ of the star's rotational energy every two years as elastic energy in the solid crust, since that is roughly 1% of the total energy loss of the entire star in that period. The total amount of energy due to elastic rigidity of figure can be at most $(\Delta I/I)^2$ times the total star energy, and the centrifugal acceleration $a = \Omega^2 R$ in Vela is only about $10^{-3}$ of the gravitational acceleration g, hence $(\Delta I/I)^2 \sim 10^{-6}$. The elastic energy is only a fraction of this, perhaps 1%, because of elastic relaxation. Hence not even the entire elastic energy amounts to $10^{-6}$, much less the amount accumulated in a few years due to slowdown. In fact, the possible mechanisms of the glitch phenomenon are very severely restricted by the unthinkably large magnitude $\Delta\Omega/\Omega = 2 \times 10^{-6}$. A corresponding event on Earth would be an earthquake of magnitude 15 or 16 on the logarithmic Richter scale! Alternatively, an asteroid 100 miles on a side would have to fall in every 2 or 3 years without tidal disintegration in a gravitational field $10^{8}$ times our own. We have estimated (Anderson et al. 1980) from the number of Vela-type events observed in older ($\sim 10^{6}$ years) pulsars (namely two) and the total duration of timing observations on such stars (of the order of 200 star-years) that approximately 1% of all pulsar slowdown (for quiescent pulsars, i.e. other than the Crab) takes place in connection with these catastrophic events.
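The energy argument against starquakes in Vela is simple arithmetic and can be laid out explicitly. The sketch below uses only the round numbers quoted in the paragraph above; the variable names are illustrative.

```python
import math

accel_ratio = 1.0e-3          # centrifugal / gravitational acceleration in Vela (from the text)
relaxation_fraction = 1.0e-2  # "perhaps 1%" of the rigidity energy is truly elastic (from the text)
required_fraction = 1.0e-6    # fraction of rotational energy a glitch must release (from the text)

# Upper bound on the energy of elastic rigidity of figure, (Delta I / I)^2,
# in units of the total rotational energy of the star.
rigidity_energy_bound = accel_ratio ** 2

# Elastic energy actually available after elastic relaxation.
elastic_energy_fraction = relaxation_fraction * rigidity_energy_bound

# Factor by which the available elastic energy falls short of one glitch.
shortfall = required_fraction / elastic_energy_fraction
```

On these numbers the whole elastic reservoir is about a hundred times too small to power a single Vela glitch, before even asking how much of it could be recharged in two years.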
Since all sources of energy other than the rotation of the stars are known to be negligible in these neutron stars (nuclear processes are dead, gravitational collapse has reached its endpoint, and heat is just much too small, as is the energy of the stellar magnetic field and electric current system), this fundamental rotation must be the source powering the glitches as well. We were led unequivocally to a single type of event. To understand what might be taking place, we have to understand the structure of the star in more detail. Rather accurate estimates of the nature of the neutron-rich matter which makes up these stars are possible, since in most of the star its density is not very different from that in ordinary nuclei, and in much of the rest it is rather low-density and may be calculated from known scattering data. Ruderman's early estimates were refined by nuclear physicists, notably Negele and Vautherin (1975) and Baym and Pethick (1975). It is generally accepted that the star consists of 'crust' and 'core' regions (fig. 1). There is, first, a core composed of a mixture of neutron and proton superfluids with a relativistic electron Fermi gas, about 10 km in radius, at a density of about $10^{14}$ g cm$^{-3}$. The bulk of the mass is a BCS superfluid of paired neutrons. The evidence for superfluidity of both neutrons and protons in the terrestrial nucleus is overwhelming; hence we are quite sure it occurs at the same density in the neutron star. Once the density of neutrons has decreased with radius to somewhat below nuclear density, it turns out that because of the attractive interactions of neutrons and protons, it is energetically favourable for some of the neutrons and all of the protons to clump together into moderate-sized nuclei containing perhaps 50 protons and about three times as many neutrons.

Fig. 1. Sketch of the structure of a neutron star. The magnitudes of the various radii and densities can vary by ~50% depending on estimates of the parameters in the equation of state. [Densities marked at the layer boundaries in the figure: $7 \times 10^{6}$ g cm$^{-3}$, $4.3 \times 10^{11}$ g cm$^{-3}$ and $2.4 \times 10^{14}$ g cm$^{-3}$.]

The situation is made very clear in fig. 2, from Negele and Vautherin (1975). The nuclei must remain about this size, otherwise the protons' electrostatic repulsion will destroy them. The electrons which neutralize the charge of the protons are very uniform in density because they are relativistic in energy with an $E_F$ of ~200 MeV and a very long screening distance; hence the nuclei see quite large electrostatic fields from each other (about 30 or 40 nuclei are within a screening radius of any given one). Thus the forces which maintain the nuclei in a rigid lattice are exactly calculable: the nuclei form a pure electrostatic 'Wigner' lattice, all of whose elastic properties are fairly easily determined. The background neutron fluid between nuclei continues to be a superfluid, interpenetrating the nuclei and electrons. In fact, the energy gap and $T_c$ of this superfluid (shown in fig. 3) (Hoffberg, Glassgold, Richardson and Ruderman 1970, Takatsuka 1972) are expected to rise as the density decreases, at first, and then to fall once the density falls below about $10^{13}$ g cm$^{-3}$. The nuclei have about the same superfluid energy gap as the neutrons in the core. Thus this crustal matter is very interesting: it is a lattice as well as two superfluids, one in the nuclei and one outside them. This is not at all impossible: precisely the same description is true of terrestrial superconductors, in which the electrons are superconducting but the nuclei form a rigid lattice. The crust is a few kilometres thick; near the surface the density falls below $10^{11}$ g cm$^{-3}$, where the neutron fluid disappears and one has 'ordinary' white dwarf matter. This outer region has negligible dynamical influence because it is so light. The 'core superfluid' played an important role in early theories.
It was observed that immediately after a glitch (which always represented a sudden increase in rotation frequency) the rate of slowdown would increase by an

Fig. 2. Typical density variation in the crust region. The horizontal axes denote proton and neutron density distributions occurring along a line joining the centres of two adjacent unit cells. (After Negele and Vautherin 1975.) [Panels are labelled by the baryon density $n_b$ and by the nuclear species (Zr, Sn).]

amount which is fractionally large compared to $\Delta\Omega/\Omega$: $\Delta\dot{\Omega}/\dot{\Omega} \sim 10^{-2}$ in Vela (fig. 4, from Downs (1981), shows a typical event; see also Reichley and Downs 1971). The observed glitch, however caused, was in this theory presumed not to have affected the neutron superfluid because that fluid was dynamically independent of the crust and of the charged components which rotate with the crust nuclei because of the enormous strength of the electromagnetic coupling via the magnetic field. The spin-down after a glitch was then caused by the slow spin-up process of this core superfluid.

Fig. 3. Energy gap of superfluid neutrons as a function of density. The curve labelled H is from Hoffberg et al. (1970); T is an independent calculation for comparison. [Horizontal axis: neutron number density, in units of $10^{37}$ cm$^{-3}$.]

More recent theories (Alpar 1977, Anderson and Itoh 1975, Anderson, Alpar, Pines and Shaham 1980, Alpar, Anderson, Pines and Shaham 1981, Pines, Shaham, Alpar and Anderson 1981) have had to come to terms with the idea that coupling processes can be found which couple in the core superfluid, and that the entire dynamics of the glitch process is related to the superfluid component of the crust. As we have discussed in detail elsewhere (Alpar 1977), we believe that there is a considerable layer of the star's crust where the quantized vorticity of the neutron superfluid is pinned to the lattice. (The coherence length in these stars, unlike that in ordinary superconductors, is smaller than the lattice constant and therefore vorticity can pin to the lattice itself, not just to imperfections.) Thus the angular velocity $\Omega$ of the superfluid can only change by vortex creep or by vorticity jumps, the latter causing the glitches. The detailed scenario which we propose has been described elsewhere (Alpar et al. 1981). We now concentrate on the actual physical processes related to vorticity pinning.
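To give a feeling for what quantized vorticity means in a rotating neutron superfluid, the sketch below evaluates the vorticity quantum $\kappa = h/2M_N$, the area density of vortex lines $n_v = 2\Omega/\kappa$ and the vortex spacing $l_v = (\kappa/2\Omega)^{1/2}$, relations that appear in § 3 of the paper. The rounded physical constants and the function name `vortex_array` are assumptions of this illustration; the Vela period of about 0.089 s is read from fig. 4.

```python
import math

PLANCK_H = 6.626e-27      # erg s (rounded)
NEUTRON_MASS = 1.675e-24  # g (rounded)

# Quantum of vorticity for a superfluid of paired neutrons, in cm^2/s.
KAPPA = PLANCK_H / (2.0 * NEUTRON_MASS)

def vortex_array(period_s):
    """Return (area density in cm^-2, spacing in cm) of vortex lines at a given spin period."""
    omega = 2.0 * math.pi / period_s
    n_v = 2.0 * omega / KAPPA
    l_v = math.sqrt(KAPPA / (2.0 * omega))
    return n_v, l_v

vela_n_v, vela_l_v = vortex_array(0.0892)
```

For Vela this gives several times $10^{4}$ vortex lines per square centimetre, spaced a few times $10^{-3}$ cm apart, which is enormous compared with the nuclear lattice spacing and is why each line meets the lattice essentially on its own.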

Fig. 4. A typical Vela glitch event. The change in slope is barely perceptible on this scale, but is easily measurable. [Axes: pulse period against date of observation (UT).]

§ 3. WEAK AND STRONG PINNING OF VORTEX LINES IN THE NEUTRON-STAR CRUST SUPERFLUIDS

Most of the excess neutrons occupy Bloch states of the lattice. These neutrons have different local densities inside and outside the potential wells set up by the nuclei at lattice sites. The difference in the corresponding values of the superfluid gap $\Delta(\rho)$ (Hoffberg et al. 1970, Takatsuka 1972) results in a difference between the energy cost of a vortex line going through a nucleus and one that threads through the interstitial regions. This leads to forces that pin vortex lines to nuclei or repel them from nuclei, depending on which situation is energetically favourable (Alpar 1977). Since the rotational properties of a superfluid are determined by the density and distribution of quantized vortices in it, pinning has important consequences for pulsar dynamics. In particular, we have proposed the sudden release of pinned vorticity as the cause of pulsar glitches, and the coupling of the pinned superfluid as the cause of the subsequent slowdown of pulsars, through a vortex creep process (Anderson et al. 1980, Pines et al. 1981) very similar to creep of dislocations in solids. In this paper we describe the behaviour of pinning forces as a function of density in the neutron star crust. The pinning energy per nucleus is:

$$E_p = \left[\frac{\cdots}{E_F(\rho_1)} - \frac{\cdots}{E_F(\rho_0)}\right] V, \qquad (1)$$

where $\rho_1$ and $\rho_0$ are the superfluid neutron densities inside and outside the nuclei, and V is the volume of the nucleus. The pinning force $F_p$ is obtained by dividing $E_p$ by the larger of $\xi$ and $R_N$, where $\xi$ is the coherence length and $R_N$ is the nuclear radius ($\xi = \hbar v_F/\Delta$). Pinned vortices move together with the crust lattice, at the angular velocity $\Omega_c$ of the neutron-star crust. However, the angular velocity of the superfluid, $\Omega_s(r)$, is determined by the number of vortices enclosed in a circular contour at radius r from the star's rotation axis. Thus $\Omega_s(r)$ is greater than $\Omega_c$ because of pinning. A vortex will move with the ambient fluid unless a force is exerted on it. The pinning force keeps the vortices moving at $\Omega_c < \Omega_s$. The equation of motion is

$$\mathbf{f}_p = \rho\,\boldsymbol{\kappa} \wedge (\mathbf{v}_s - \mathbf{v}_c) = \rho\,\boldsymbol{\kappa} \wedge [(\boldsymbol{\Omega}_s - \boldsymbol{\Omega}_c) \wedge \mathbf{r}], \qquad (2)$$

where $\mathbf{f}$ is the force per unit length of vortex line, $\rho$ the superfluid density and $\boldsymbol{\kappa}$ the vorticity of the line. For vortex lines in the neutron BCS superfluid, $\kappa = h/2M_N$, where h is Planck's constant and $M_N$ the neutron mass. The right-hand side of eqn. (2) is called the Magnus force. Unpinning will start when $\Omega_s - \Omega_c$ increases, through the continuous slowdown $\dot{\Omega}_c$ of the pulsar, to a critical value which the available pinning force cannot sustain. We obtain a scale for the critical angular velocity difference for unpinning by balancing $f_p$, the pinning force per unit vortex length, against the Magnus force:

$$\delta\Omega_{cr} = (\Omega_s - \Omega_c)_{cr} = \frac{F_p}{a\rho\kappa r} = \frac{E_p}{x a \rho\kappa r}, \qquad (3)$$

where x is the larger of $\xi$ and $R_N$, and a the distance between pinning nuclei along the vortex line. In the spherical geometry of the neutron star, for


regions of the crust close to the rotational axis, r is much smaller than the actual radius of the star, $r_* \sim 10^{6}$ cm. However, the equatorial regions, with $r \sim r_*$, can already have $\delta\Omega_{cr}$ much stronger than the critical difference $\dot{\Omega} t_g$ that builds up between successive glitches in some layers, while in other layers in the equatorial plane pinning can be weak enough to give $\delta\Omega_{cr} \lesssim \dot{\Omega} t_g \sim 10^{-2}$ in the Vela pulsar. We therefore investigate a cylindrical model of the star, i.e. we take the equatorial plane, where $\delta\Omega_{cr}$ is not strengthened by the $r^{-1}$ factor, and investigate its variation as a function of density through the crust. $\delta\Omega_{cr}$ can be calculated as a function of the density $\rho$ in the neutron star crust, using lattice structure and gap values appropriate to each density value (Alpar 1977). From eqn. (3) we see that $\delta\Omega_{cr}$ is largest when a is the nearest-neighbour spacing in the crust lattice. This can happen if the lattice is oriented so that a line of nuclei coincides with each vortex line, that is, the lattice is oriented parallel to $\boldsymbol{\Omega}$, the current angular velocity of the neutron star. We do not expect this to be the usual case, since conditions when the lattice froze would have been entirely different and undoubtedly cracking and flow of the crust have occurred on many occasions. A vortex line is likely to find the lattice oriented at some angle to its preferred direction $\boldsymbol{\Omega}$. The coupling between vortices in the vortex array of spacing $l_v = (\kappa/2\Omega)^{1/2}$ limits the amplitudes and wavelengths of distortions on individual vortex lines to less than $l_v$; but since pinning encounters occur on length scales of the order of the lattice, the contribution of the vortex array rigidity is negligible: the energy involved, $(\rho\kappa^2/2\pi)(a/l_v)^2$ per unit length of vortex line displaced a length a, is much smaller than the energies of pinning and vortex line bending we discuss below.
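The scale set by eqn. (3), read as $\delta\Omega_{cr} = E_p/(x a \rho \kappa r)$, can be made concrete with a short calculation. The following sketch is an illustration only: the pinning energy, the lengths x and a, and the density are hypothetical round numbers chosen to be of a plausible order for the crust, not values quoted in the paper, and the function name `critical_lag` is invented here.

```python
import math

PLANCK_H = 6.626e-27      # erg s (rounded)
NEUTRON_MASS = 1.675e-24  # g (rounded)
MEV_TO_ERG = 1.602e-6
FM_TO_CM = 1.0e-13

KAPPA = PLANCK_H / (2.0 * NEUTRON_MASS)  # vorticity quantum h / 2 M_N, cm^2/s

def critical_lag(pinning_energy_mev, x_fm, a_fm, rho_g_cm3, r_cm):
    """Critical lag E_p / (x a rho kappa r) in rad/s, evaluated in cgs units."""
    e_p = pinning_energy_mev * MEV_TO_ERG
    x = x_fm * FM_TO_CM
    a = a_fm * FM_TO_CM
    return e_p / (x * a * rho_g_cm3 * KAPPA * r_cm)

# Hypothetical inputs: 1 MeV pinning energy, x = 10 fm, a = 50 fm,
# superfluid density 1e13 g/cm^3, equatorial radius 1e6 cm.
lag = critical_lag(1.0, 10.0, 50.0, 1.0e13, 1.0e6)

# Pinning on only every third nucleus along the line (a three times larger).
lag_sparse = critical_lag(1.0, 10.0, 150.0, 1.0e13, 1.0e6)

# Closer to the rotation axis (r halved) the same pinning sustains twice the lag.
lag_inner = critical_lag(1.0, 10.0, 50.0, 1.0e13, 0.5e6)
```

With these illustrative inputs the critical lag is roughly 16 rad/s, far above the $\sim 10^{-2}$ needed for Vela, and it falls in proportion as the spacing a between pinning nuclei grows, which is why the orientation of the lattice relative to the vortex line matters.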
We therefore neglect the other vortices and ask if a vortex line can optimize its pinning by displacing the nuclei from their equilibrium sites in the lattice. The pinning force is to be compared with the force needed to displace the charged nuclei from their lattice equilibrium sites by a distance comparable to the size of each nucleus, the length scale of the pinning force. This lattice displacement force is

Only in the layers of the crust where the pinning energy is larger than about 3 MeV will the vortices be able to displace nuclei so as to achieve optimal pinning along much of the length of the vortex, and under these conditions eqn. (3) can be used with $a$ a few times the lattice spacing. In fig. 5 we plot $\delta\Omega_{cr}$ as a function of the density in the neutron star crust, and indicate the transition between the strong ($F_p > F_L$) and weak ($F_p < F_L$) pinning layers. At still stronger values of the pinning force, there has been a proposal that the pinning forces may lead to a breaking of the lattice (Ruderman 1976). This will happen when $F_p$ reaches the critical value $F_c$, the force per nucleus that will break the lattice:

$$\mu\phi_c = \frac{L}{a}\,F_c\,n_v, \tag{5}$$

where $\mu$ is the shear modulus $z^2e^2/a^4$, $\phi_c$ is the critical strain angle of the lattice, $L$ the length of a vortex line going through the pinning lattice ($L \approx 10^4$ to $10^5$ cm, the thickness of the crust pinning layer) and $n_v$ the area density of vortices, $n_v = 2\Omega/\kappa$. Ruderman (1976) has used $\phi_c \approx 10^{-4}$ to $10^{-5}$ to argue that $F_c$ is smaller than both $F_L$ and $F_p$, so that the lattice breaks before individual nuclei can be displaced or vortex lines can unpin from the nuclei. We feel that this argument misconstrues the role of the rigidity of the crust. The crustal matter as a whole is almost in gravitational equilibrium (except for a small degree of shape distortion due to elastic rigidity). Only crustal fractures which are either neutral to, or helpful with respect to, gravitational equilibrium are possible. An outward flow of crustal matter is unthinkable; yet such a flow is the only way to respond dynamically to the pinning forces. Hence the overall, global pinning force is not contained by elasticity but by the much stronger gravitational shape forces, to which the force is conveyed by the energies of compositional and gravitational equilibrium. We feel that the crust-breaking mechanism is not dynamically plausible. However, we do feel that it is easy, in fact likely, for vorticity motion to trigger true elastic quakes of the type originally proposed (Ruderman 1969).

Fig. 5. $\delta\Omega_{cr}$ resulting from pinning as a function of density in neutron star crust. The upper curve was calculated using $\Delta(\rho)$ given by Hoffberg et al. (1970). The lower curve is the result of reducing $\Delta(\rho)$ by a factor of three-quarters. The transition at $F_L = F_p$ is indicated for each curve.

Figure 5 shows $\delta\Omega_{cr}$ calculated using the gap values given by Hoffberg et al. (1970). The transition to strong pinning as discussed above is indicated. Clearly, the $\delta\Omega_{cr}$ values from eqn. (2) are too high (except in the inner and outer boundaries of the crust) to have $\delta\Omega_{cr} \approx \dot\Omega t_g \approx 10^{-2}$ (for the Vela glitches) which we require to build up $(\Omega_s - \Omega_c)$ to $\delta\Omega_{cr}$ to start the next glitch by unpinning. However, $\delta\Omega_{cr}$ will be reduced from the values given by eqn. (2) for two different reasons.
First, an uncertainty in $\Delta(\rho)$ by a factor of two means that $\delta\Omega_{cr}$ (which is proportional to $\Delta^2$ or $\Delta^3$, depending on whether the coherence length exceeds $R_N$) could everywhere be smaller than calculated using the $\Delta(\rho)$ given by Hoffberg et al. (1970). The $\delta\Omega_{cr}$ resulting from two model $\Delta(\rho)$ runs are shown in fig. 5. Such general reduction of $\delta\Omega_{cr}$, however, does not lead to sufficiently small values of $\delta\Omega_{cr}$.

A possibly more significant reduction will occur because of the disruption of the lattice near a vortex line. The vortex line may unpin simply by carrying its accompanying nuclei along with it through the lattice, as originally suggested by Anderson and Itoh (1975). This means that the effective $F_p$ is limited by $F_L$, and is probably considerably smaller than $F_L$. In fact, all errors are clearly in such a direction as to reduce $F_p$, and thus the excessive size of $\delta\Omega_{cr}$ is not important here.

On the other hand, where $F_p < F_L$, the vortex lines cannot pin optimally by pulling individual nuclei out of their equilibrium sites. They also cannot, for practical purposes, bend, because their energy per unit length is $E_F k_F$, which is greater than 100 MeV/fermi. Instead, the lattice will deform more smoothly, on a length scale $L$ which we can estimate by equating the pinning energy to the energy of distortion per site, $\sim E_L b/L$, where $E_L$ is a typical lattice energy $E_L = F_L b = z^2e^2R_N/b^2$. The length scale $L$ will be the distance between effective pinning sites, and will go as

$$\frac{b}{L} \approx \frac{E_p}{E_L},$$

so that, finally, the effective pinning energy is

$$E_p' \approx E_p\,\frac{E_p}{E_L}$$

or

$$F_p' \approx \frac{F_p^2}{F_L}. \tag{6}$$
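The two regimes described above can be summarised in a small sketch. It assumes eqn. (6) is read as $F_p' \approx F_p^2/F_L$ in the weak-pinning regime ($F_p < F_L$) and that in the strong regime the effective force is simply capped at $F_L$, following the remark that the effective $F_p$ is "limited by" $F_L$. The function name and the hard cap are illustrative choices, not the authors' prescription.

```python
def effective_pinning_force(f_p, f_l):
    """Effective pinning force per nucleus (same units as the inputs).

    Assumed reading of the text:
      weak pinning   (f_p <  f_l): F_p' ~ f_p**2 / f_l   (eqn. (6))
      strong pinning (f_p >= f_l): effective force capped at f_l
    """
    if f_p <= 0 or f_l <= 0:
        raise ValueError("forces must be positive")
    if f_p < f_l:
        return f_p * f_p / f_l
    return f_l


if __name__ == "__main__":
    for f_p in (0.5, 1.0, 2.0, 4.0, 8.0):
        print(f_p, effective_pinning_force(f_p, 4.0))
```

The two branches join continuously at $F_p = F_L$, and below the boundary the effective force falls off quadratically, which is the rapid fall-off the text attributes to the weak-pinning layer.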

The pinning force falls off very much more rapidly than $F_p$ as $F_p$ gets smaller; this rapid fall-off is shown in fig. 5 as the dashed line. Eventually $L$ will become so long that the lattice can be assumed to be rigid and the line encounters nuclei as it would if it were a random straight line, i.e. at

i.e. about one nucleus every $10^{-2}$ to $10^{-3}$ lattice constants. At this point the pinning force must be estimated statistically, and we will also begin to have bending at about this scale. In any case a reduction as given approximately by eqn. (6) is not a bad estimate. Thus, as we proceed to higher densities from the strong pinning layer boundary $F_p = F_L$, we have a rather sudden transition to random pinning and a weak pinning layer, given by eqn. (6). For all higher densities up to the inner boundary of the crust, we have weak pinning with $\delta\Omega_{cr}$ as low as or lower than $\dot\Omega t_g$.

§ 4. CONCLUSION

The vortex-lattice interaction in neutron stars provides an interesting example of a transition from strong to weak random pinning, depending on the relative strength of binding forces in the lattice and pinning forces. Reduced effective values of pinning forces, which result from vortex line energy as well as lattice geometry and randomness, lead to important dynamical effects in pulsars.


In particular, we have ascribed (Alpar et al. 1981) the 'glitches' to a sudden release of the pinned vorticity in a volume containing about $10^{-2}$ of the moment of inertia of the star. This is triggered by the high density of vorticity built up in the boundary layer between weak and strong pinning regions. This hypothesis seems to be the only one which explains all the data, especially when such a large fraction of the slowing-down process is involved in the glitches. The more detailed consequences for astrophysics of these rheological considerations are not appropriately discussed here, and have been the subject of a previous paper. We felt that in a volume dedicated to Nicolas Cabrera, it might be very appropriate to discuss the 'farthest out' consequences of the kinds of detailed understanding of defect structures and motion in solids in which he was a very early and important pioneer.

REFERENCES

ALPAR, M. A., 1977, Astrophys. J., 213, 527.
ALPAR, M. A., ANDERSON, P. W., PINES, D., and SHAHAM, J., 1981, Astrophys. J. Lett., 249, L29.
ANDERSON, P. W., ALPAR, M. A., PINES, D., and SHAHAM, J., 1980, Proceedings of the Symposium on Pulsars, IAU Symp. No. 95 (International Astronomical Union).
ANDERSON, P. W., and ITOH, N., 1975, Nature, Lond., 256, 25.
BAYM, G., and PETHICK, C. J., 1975, A. Rev. nucl. Sci., 25, 27.
BAYM, G., and PINES, D., 1971, Ann. Phys., 66, 816.
DOWNS, G. S., 1981, Astrophys. J., 249, 687.
HOFFBERG, M., GLASSGOLD, A. E., RICHARDSON, R. W., and RUDERMAN, M. A., 1970, Phys. Rev. Lett., 24, 775.
NEGELE, J. W., and VAUTHERIN, D., 1975, Nucl. Phys. A, 207, 298.
PINES, D., SHAHAM, J., ALPAR, M. A., and ANDERSON, P. W., 1981, Prog. theor. Phys., Supplement No. 69, p. 376.
PINES, D., SHAHAM, J., and RUDERMAN, M. A., 1973, Physics of Dense Matter, edited by C. J. Hansen, IAU Symp. No. 53 (Dordrecht : Reidel), p. 189.
REICHLEY, P., and DOWNS, G. S., 1971, Nature, Phys. Sci., 234, 48.
RUDERMAN, M. A., 1969, Nature, Lond., 223, 597 ; 1976, Astrophys. J., 203, 213.
TAKATSUKA, I., 1972, Prog. theor. Phys., 48, 517.


Localization Redux Physica 117B and 118B, 30-35 (1983)

This is a final summary of my work in the second "weak localization" period of localization theory, which was inaugurated by the "Gang of 4" papers in 1978. During this period we all (Abrahams, Fukuyama, Vollhardt and Wölfle, Ramakrishnan, Lee, Thouless and occasionally some Russians) met regularly at Aspen in the summers. This cooperative stress-free atmosphere seems to have been a casualty of high-$T_c$, funding and job shortages, the cutbacks at Bell and IBM, or a combination of these: it doesn't exist any more.

Physica 117B & 118B (1983) 30-36 North-Holland Publishing Company

LOCALIZATION REDUX

P. W. Anderson

Bell Laboratories, Murray Hill, New Jersey 07974, U.S.A. and Joseph Henry Laboratories, Princeton University, Princeton, N. J. 08544, U.S.A.

I survey recent progress in the field of localization. Theory and experiment are in excellent agreement in two-dimensional systems such as inversion layers and thin metal films. The metal-insulator transition in three-dimensional systems remains an interesting problem as exemplified by highly accurate data on doped semiconductors, but on the metallic side perturbational theory has had a number of successes.

Localization, a very old concept and a continuing controversy, keeps bringing itself back to the center of the stage, year after year, which is why I use the word "Redux". The source of the latest reintroduction is a series of developments, both experimental and theoretical, which have begun to bring scientific precision into what used to be a highly speculative and controversial field. One of the characteristics of the field has been an almost worldwide, semicollaborative effort on the theoretical side, and parallel evolution at different places on the experimental side.

The theoretical cooperation was much aided by two timely workshops, one in Russia at Yerevan, one at Aspen last summer, so that in naming important workers in the field I have to pinpoint groups from at least six countries. My own collaborators$^{2-4}$ have included especially Ramakrishnan, Licciardello, Lee, Abrahams, Fisher and Shapiro from the Bell Labs-Princeton-Rutgers group, as well as Thouless$^{5}$ and Fukuyama;$^{6}$ particularly useful work which found its way into the workshops has been done by Wegner,$^{7}$ Götze,$^{8}$ Vollhardt and Wölfle$^{9}$ in Germany; Altshuler, Aronov$^{10}$ and others$^{11}$ in Russia; Kawabata,$^{12}$ Hikami and others$^{13}$ in Japan; McMillan$^{14}$ in the U.S.; McKane and Stone$^{15}$ in England; and many, many others. In experiment, the Bell group,$^{17}$ especially Dynes and collaborators, and the Cambridge group with Pepper and others$^{18}$ have run neck and neck on two-dimensional work, while there is important work by a Belgian group with Deutscher;$^{19}$ on three dimensions the only precision work as yet published is by Thomas, Rosenbaum and coworkers,$^{20}$ of Bell-Princeton. Notable work at Illinois,$^{21}$ IBM$^{22}$ and elsewhere$^{23}$ on various aspects is very worth mentioning.

1. INTRODUCTION TO QUANTUM TRANSPORT

Let us briefly introduce the set of concepts which have brought us to where we are. The basic problem is that of the fundamental limitations of quantum transport, of which the most characteristic example is metallic conduction at low temperature, where, as we all know, there is maximum conductivity at $T \to 0$ which decreases due to thermal agitation, in contrast to ordinary transport processes which only occur due to thermal agitation. Realizing that it is sure to fail in the end, we borrow from simple Boltzmann transport theory the expression for the conductivity

$$\sigma = \frac{ne^2\tau}{m} = \begin{cases} \dfrac{e^2}{h}\,\dfrac{2k_F}{3\pi}\,(k_F\ell) & \text{(in 3 dimensions)} \\[2ex] \dfrac{e^2}{h}\,(k_F\ell) & \text{(in 2 dimensions)} \end{cases}$$

Note the appearance of the characteristic quantum unit of conductance

$$\frac{e^2}{h} \approx \frac{1}{25{,}000}\ (\text{ohms})^{-1}.$$

Using the quantum Einstein relation between conductivity and diffusion, we relate these Boltzmann transport theory results to a second aspect of quantum transport, diffusion by tunneling from site to site:

$$\sigma = e^2\,\frac{dn}{dE}\,D, \qquad D \simeq \frac{\hbar}{m}\,(k_F\ell),$$

where $\hbar/m$ is the fundamental quantum unit of diffusivity and $k_F\ell = 2\pi\ell/\lambda$ ($\lambda$ the de Broglie wavelength) appears in this view as a coherence factor for tunneling at different sites. (We note that, for a tight-binding band of energy width $B$,

$$\frac{\hbar}{m} \simeq a^2\,\frac{B}{\hbar} \simeq a^2/(\text{time for a quantum jump}).)$$
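A short numerical check may help fix the scales in this section. The sketch below evaluates the conductance quantum $e^2/h$ and verifies that the three-dimensional Drude expression $ne^2\tau/m$, with $n = k_F^3/3\pi^2$ and $\tau = \ell m/\hbar k_F$ for a free-electron Fermi sphere, equals $(e^2/h)(2k_F/3\pi)(k_F\ell)$. The function names and the sample values of $k_F$ and $\ell$ are illustrative assumptions.

```python
import math

# SI constants
E_CHARGE = 1.602176634e-19    # C
H_PLANCK = 6.62607015e-34     # J s
HBAR = H_PLANCK / (2.0 * math.pi)
M_ELECTRON = 9.1093837015e-31  # kg


def conductance_quantum():
    """e^2 / h in siemens (ohm^-1)."""
    return E_CHARGE ** 2 / H_PLANCK


def sigma_2d(kf_l):
    """2D Boltzmann sheet conductance (e^2/h) * (k_F l), in siemens."""
    return conductance_quantum() * kf_l


def sigma_3d(k_f, mfp):
    """3D Boltzmann conductivity (e^2/h) * (2 k_F / 3 pi) * (k_F l), in S/m."""
    return conductance_quantum() * (2.0 * k_f / (3.0 * math.pi)) * (k_f * mfp)


def sigma_3d_drude(k_f, mfp):
    """Same quantity from n e^2 tau / m for a free-electron Fermi sphere."""
    n = k_f ** 3 / (3.0 * math.pi ** 2)   # electron density, spin included
    tau = mfp * M_ELECTRON / (HBAR * k_f)  # l / v_F
    return n * E_CHARGE ** 2 * tau / M_ELECTRON


if __name__ == "__main__":
    print("h/e^2 (ohm):", 1.0 / conductance_quantum())
    # Assumed sample values: k_F = 1e10 m^-1, mean free path 1 nm
    print(sigma_3d(1e10, 1e-9), sigma_3d_drude(1e10, 1e-9))
```

The resistance $h/e^2$ comes out between 25,000 and 26,000 ohms, consistent with the round figure quoted above.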

The concept of localization evolved from my observation that $D$ couldn't be reduced below the $k_F\ell \approx 1$ limit because of a failure to break up coherence and prevent reverse tunneling, which Mott equated to the Ioffe limit $k_F\ell > 1$ on the ratio of mean free path to wavelength. Over the years, Mott developed the concept$^{24}$ of a "minimum metallic conductivity" of about $e^2/\hbar a$, which many data seemed to support, at which the conductivity dropped to zero.

2. SCALING THEORY: GANG OF IV AND SCATTERING THEORY

By combining diagrammatic transport theory with a scaling technique, the Gang of Four (myself, Abrahams, Ramakrishnan, Licciardello$^{2}$) gave a probably correct account of the localization regime proper. A basic component is the Thouless conjecture$^{5}$ that (a) conductance can be defined on a microscopic scale, for a sample of only a few atoms, even; (b) the conductance $g$ in units of $e^2/\hbar$ is the only dimensionless coupling constant characterizing the wave functions and energy states, or at least characterizing the diffusion of wave packets. The mean conductance, then, of a sample of side $2L$ made up from a group of smaller samples of size $L$ chosen from the appropriate statistical universe must be a function only of the mean conductances of the smaller samples of size $L$:

$$\bar g(2L) = f[\bar g(L)],$$

which may be written

$$\frac{d\ln g}{d\ln L} = \beta(g).$$

To this we add the macroscopic result that $g \to \infty$ is the classical Ohm's law limit, for which $g = \sigma L^{d-2}$ ($d$ the dimensionality) and $\beta = d-2$, as well as the localization result for the limit $g \to 0$, $g \approx e^{-\alpha L}$ or $\beta \approx \ln g$, so that the curves of $\beta$ vs. $g$ must look roughly as in Fig. 1.

Fig. 1: Schematic diagram of scaling function $\beta$ vs. log $g$ from G... paper.

Finally, to get precise results we took advantage of an old perturbation theory scheme of Langer and Neal.$^{25}$ The ordinary theory is the lowest-order diagrams involving scattering of an electron-hole pair by the impurity potential:

[Diagrams: lowest-order impurity-scattering diagrams for the electron-hole pair, + etc., summing to $\sigma_{\rm Boltz}$]

and the first corrections to $g$, which are of order $1/g$, come from summing all of what we called the "maximally crossed diagrams" like:

[Diagram: maximally crossed diagrams]

which closely resemble the "Cooper pair" diagrams of superconductivity theory, and are now called "cooperons". Since $g$ is proportional to $\ell$ which is proportional to $1/v^2$, $v$ being the impurity scattering potential, a series in scattering is a series in $1/g$ and it is natural for this series to start with $1/g$; but oddly enough all higher terms have been shown to vanish, and the actual behavior for large $g$ is dominated by the one term.

In the case of two dimensions, since $d-2$ vanishes the $1/g$ term is the whole story and leads to the famous logarithmic correction to the conductivity: a correction which is verified in thin films by Dolan and Osheroff$^{17}$ and others$^{19,22,23}$ and in MOSFETs by Bishop et al$^{17}$ and Uren et al.$^{18}$ It can be demonstrated unequivocally by its strong, anisotropic magnetic field dependence: the "cooperon" diagrams are, as in superconductivity, very field dependent. The actual formula is:

[Equation for $\sigma(T,H)$: $\sigma_0$ corrected by a term in $e^2/2\pi^2\hbar$ times a difference of digamma functions, $\psi(a+\tfrac12) - \ldots$, with $H$ entering through $2eH\ell_e\ell_i$.]

$\ell_i$ = inelastic length $= v_F\tau_i$; $\ell_e$ = elastic length.

Note this allows determination of the inelastic scattering time $\tau_i$, which we shall see later is very important.
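The scaling equation $d\ln g/d\ln L = \beta(g)$ of Section 2 can be explored numerically once some $\beta(g)$ is chosen. The sketch below uses a toy interpolation, $\beta(g) = (d-2) - \ln(1 + 1/g)$, which is an assumption made here purely for illustration: it reproduces the two limits quoted in the text ($\beta \to d-2$ with a $-1/g$ correction at large $g$, and $\beta \approx \ln g$ at small $g$) but is not the function computed in the papers cited. The flow is integrated in the variable $\ln g$ with simple Euler steps.

```python
import math


def beta(ln_g, d):
    """Toy scaling function beta = (d - 2) - ln(1 + 1/g), evaluated stably from ln g.

    This interpolation is an illustrative assumption, not the published beta(g).
    """
    x = -ln_g  # ln(1 + 1/g) = softplus(x) with x = -ln g
    if x > 0:
        softplus = x + math.log1p(math.exp(-x))
    else:
        softplus = math.log1p(math.exp(x))
    return (d - 2) - softplus


def flow(ln_g0, d, ln_l_max=10.0, steps=10000):
    """Integrate d(ln g)/d(ln L) = beta from ln L = 0 to ln_l_max; returns final ln g."""
    ln_g = ln_g0
    h = ln_l_max / steps
    for _ in range(steps):
        ln_g += h * beta(ln_g, d)
    return ln_g


if __name__ == "__main__":
    # d = 2: beta < 0 everywhere, so any starting conductance scales downward.
    print("2d, g0 = 10:", flow(math.log(10.0), 2))
    # d = 3: unstable fixed point where ln(1 + 1/g_c) = 1, i.e. g_c = 1/(e - 1).
    print("3d, g0 = 1.0:", flow(0.0, 3))
    print("3d, g0 = 0.3:", flow(math.log(0.3), 3))
```

In this toy model the two-dimensional flow always drifts toward smaller $g$, while in three dimensions starting values on either side of the fixed point run away in opposite directions, which is the qualitative picture of a critical point described in the text.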

In three dimensions the same calculation predicts not a mobility edge but a critical point, but apparently all behavior near this point is modified by interactions. That there really is a critical point and not a minimum metallic conductivity is beautifully demonstrated by the experimental data of Paalanen et al$^{20}$ shown in Fig. 2. It is likely that Mott$^{24}$ has severely underestimated the critical range of $\sigma$, and that all the measurements shown here are within the critical range. One-dimensional experimental results are also as yet in flux, although Skocpol et al at Bell Labs Holmdel have promising data, as well as Tinkham's group$^{29}$ and Sacharoff et al$^{30}$ at Harvard.

Fig. 2: Results of Paalanen et al$^{20}$ on conductivity in the impurity band near the metal-insulator transition. In the lower curve stress is used as a vernier parameter. [Axes: $\sigma$ against $n$ ($10^{18}$ cm$^{-3}$) and against stress $S$ (kbar).]

The mathematical assumptions of the scaling theory have been very completely investigated in the one-dimensional case by myself and coworkers,$^{2-4}$ Az'bel and more recently others. The result is to place the basic Thouless and G$_{\rm IV}$ conjectures on a very sound footing, in addition to teaching us some very subtle bits of the mathematics of scaling theory.

3. INTERACTIONS: THE DIFFUSON CONCEPT

Unfortunately, this very beautiful story is extremely far from complete: there is a second, and indeed often dominant, source of corrections of similar type to conduction in poor metals, namely the electron-electron interactions. These corrections were pointed out years ago for the special case of electron-phonon interactions by Schmid, and some of the theory for three dimensions given by Altshuler.$^{10}$ But it wasn't until the Yerevan workshop that the two-dimensional case was done by Altshuler, Aronov and Lee, and the experimental observability of the effects began to be obvious.

The basic discovery here is that it is not really possible to treat random impurity scattering and the effects of interactions completely independently, as we have always done: essentially, letting the interactions renormalize the particles and make them into quasiparticles with new numbers but not new properties. For instance, there are scattering corrections to quantities as simple as the exchange self-energy (which determines the effective mass) and which make the density of states change in a singular manner at the Fermi surface. These are called "diffuson" corrections and originally were put in as a "vertex correction" at each interaction due to scattering. The diagram for self-energy is as shown:

[Diagram: exchange self-energy with diffuson vertex corrections]

The physical meaning is that after emitting a phonon, for instance, the electron does not propagate away as fast, so interacts more strongly with it. Lee, Abrahams, Ramakrishnan and I$^{3}$ showed how to introduce diffuson corrections using the exact scattered wave functions and much simplify and generalize the calculation. But essentially it always involves introducing a "diffuson propagator"

$$\frac{1}{i\omega + Dq^2}$$

and doing integrals

$$\int \frac{d^dq}{i\omega + Dq^2}$$

gives us

$$\ln\omega\ \ \text{(2d)}, \qquad \omega^{-1/2}\ \ \text{(1d)}, \qquad \omega^{1/2}\ \ \text{(3d)},$$

where in many of the corrections $\omega \to kT/\hbar$, or alternatively $g\mu_BH/\hbar$ under a magnetic field. The $\omega^{1/2}$ corrections for 3d have been nicely demonstrated in tunneling density of states by Bob Dynes; in fact, they are a very old and puzzling result we saw 15 years ago and did not explain. Some data are shown in Fig. 3.

Fig. 3: Data from Hertel et al$^{33}$ showing tunneling conductivity from amorphous Si-Nb. Inset shows experimental set-up with varying Nb content. Tunneling conductance measures density of states at Fermi level. [Horizontal axis: square root of voltage, (mV)$^{1/2}$.]

There are also effects on the conductivity, which are akin to those of localization. For a few months we had a good strong controversy going as to which was important, but we now know both are. The fortunate fact is that the magnetic field dependence is entirely different from that of localization, so the three groups of experimentalists$^{17-19}$ have very neatly been able to identify separately the contributions to the logarithmic behavior of localization and of interactions. I give for the record the extra resistivity in its dependence on magnetic field, and assure you that Uren et al,$^{18}$ Bishop et al$^{17}$ and Deutscher et al$^{19}$ have verified this formula in detail:

[Equation for $(\delta\sigma)_{\rm int}$: prefactor $e^2/2\pi^2\hbar$, a logarithmic term with coefficient $(2-F)$, and a field-dependent term in $g(h)$.]

$$h = \frac{g\mu_BH}{kT}, \qquad 0 \le F \le 1 \quad (F = \text{screening factor}),$$

$$g(h) = \int_0^\infty d\omega\,\frac{d^2}{d\omega^2}\left[\frac{\omega}{e^{\omega}-1}\right]\ln\left|1 - \frac{h^2}{\omega^2}\right|.$$

The verification of the localization contribution has left us with detailed measurements of the inelastic scattering time $\tau_i$. In all of the experimental results, this scattering time has been found to be surprisingly short and very slowly varying with temperature. This was a source of much of the early confusion: using classic methods, we supposed that this would go as $T^{-p}$ with $p \approx 3$ or greater. There are, however, very striking diffuson effects, especially on the inelastic scattering by phonons. The most recent work by Lee, Altshuler and Aronov has somewhat revised our earlier work but still gives a large value for $\tau_i^{-1}$, linear in $T$, in 2d, and $T^{2/3}$ in 1 dimension (a result which implies the failure of Fermi liquid theory in that case). The large value and its agreement with theory is shown in Bishop et al's figure$^{34}$ (4), which he has loaned me. It appears that the large $T^{2/3}$ result is not at all incompatible with the most recent 1d data.$^{28-30}$

Fig. 4: Data from Bishop et al.$^{34}$ Inelastic scattering time $\tau_i$ in inversion layers as a function of rate of scattering to illustrate large diffuson effects. [Legend: measured and calculated values; horizontal axis: $D$ (cm$^2$/sec).]
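The frequency dependences quoted in Section 3 for the diffuson integrals follow from a simple rescaling. The worked steps below are a sketch under the assumption of an upper momentum cutoff $q_c$ (a symbol introduced here, not in the text), with numerical prefactors ignored.

```latex
% Rescale q = (\omega/D)^{1/2} x in the diffuson integral, with cutoff q_c:
\[
  I_d(\omega) \;=\; \int_{q<q_c} \frac{d^dq}{i\omega + Dq^2}
  \;=\; D^{-d/2}\,\omega^{\,d/2-1} \int_{x<x_c} \frac{d^dx}{i + x^2},
  \qquad x_c = q_c\,(D/\omega)^{1/2}.
\]
% d = 1: the x-integral converges as x_c -> infinity, so
\[
  I_1(\omega) \;\propto\; \omega^{-1/2}.
\]
% d = 2: the x-integral grows logarithmically with the cutoff, so
\[
  I_2(\omega) \;\propto\; \ln x_c \;=\; \tfrac12 \ln\!\frac{Dq_c^2}{\omega}
  \;\sim\; -\tfrac12\ln\omega .
\]
% d = 3: the leading piece is a frequency-independent cutoff term;
% the first frequency-dependent correction is
\[
  I_3(\omega) \;=\; \text{const}\times q_c \;+\; O\!\left(\omega^{1/2}\right).
\]
```

Replacing $\omega$ by $kT/\hbar$ then turns these into the $\ln T$, $T^{-1/2}$ and $T^{1/2}$ behaviours in two, one and three dimensions respectively.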

Let us conclude with a remark and a problem still unsolved for future theorists. The remark is that G$_{\rm IV}$ implies that $\sigma$, and therefore $D$, is a function of $L$, effectively, of $q$ itself. The true "diffusion kernel" near the mobility edge is not

$$\frac{1}{i\omega + Dq^2} \qquad\text{but}\qquad \frac{1}{i\omega + D(q)\,q^2},$$

a much more singular function in 3 dimensions, for instance, implying that at the mobility edge very major changes take place. Muttalib, Ramakrishnan and I$^{35}$ have proposed a theory of impurity effects on superconductors using this, for instance.

The problem is that even using this, the best available theories as yet do not explain the observations in 3d. I showed you a figure of the Paalanen et al$^{20}$ data already (Fig. 2), showing that apparently $\sigma$ drops off as $(n-n_c)^{\eta}$ with $\eta = 1/2$. Theory and experiment are in agreement that (again in contradiction to Mott's assertions over many years) there is an Efros-Shklovskii gap$^{36}$ or pseudogap at the mobility edge, but its nature and shape are not understood. I should reemphasize that agreement of weak localization and interaction experiment and theory is in pretty good shape, even to the extent of seeing the predicted negative sign $T^{1/2}$ singularities$^{20}$ and the reverse-sign spin-orbit effect (beautiful data of Bergmann$^{37}$). It is very interesting that the exponent $\eta = 1/2$ does not satisfy the Harris-Mott criteria$^{24,38}$ for stability against inhomogeneity, and indeed the Paalanen et al data$^{20}$ show an unexplained and unexpected exponential tail. I leave you with a last intriguing puzzle: the close resemblance of Paalanen's $\sigma$ vs. $n$ and optical absorption $\alpha$ vs. $\omega$ for the band tails in amorphous semiconductors.$^{39}$

REFERENCES

1. P. W. Anderson, Phys. Rev. 109, 1492 (1958); A. F. Ioffe and A. R. Regel, Prog. in Semiconductors 4, 237 (1960); N. F. Mott, Metal-Insulator Transitions (Taylor and Francis, London, 1974).
2. E. Abrahams, P. W. Anderson, D. C. Licciardello and T. V. Ramakrishnan, Phys. Rev. Lett. 42, 673 (1979); ibid 43, 718 (1979).
3. E. Abrahams, P. W. Anderson, P. A. Lee and T. V. Ramakrishnan, Phys. Rev. B 24, 6783 (1981).
4. P. A. Lee, Phys. Rev. Lett. 42, 1492 (1979); P. A. Lee, J. Non-Cryst. Sol. 35, 21 (1980); P. W. Anderson, D. J. Thouless, E. Abrahams and D. S. Fisher, Phys. Rev. B 22, 3519 (1980); E. Abrahams and T. V. Ramakrishnan, J. Non-Cryst. Sol. 35, 15 (1980); B. Shapiro and E. Abrahams, Phys. Rev. B 24, 4889 (1981).
5. D. C. Licciardello and D. J. Thouless, J. Phys. C 8, 4157 (1975); ibid C 11, 925 (1978); D. J. Thouless, Phys. Rev. Lett. 39, 1167 (1977); D. J. Thouless, Solid St. Comm. 34, 683 (1980).
6. H. Fukuyama, J. Phys. Soc. Jpn. 48, 2169 (1980); ibid 49, 644 (1980); ibid 50, 3407 (1981); Y. Ono, D. Yoshioka, H. Fukuyama, J. Phys. Soc. Jpn. 50, 2143 (1981); D. Yoshioka, Y. Ono, H. Fukuyama, J. Phys. Soc. Jpn. 50, 3419 (1981).
7. F. J. Wegner, Z. Phys. B 35, 207 (1979).
8. W. Götze, Solid State Comm. 27, 1393 (1978); W. Götze, J. Phys. C 12, 1279 (1979); W. Götze, P. Prelovsek, P. Wölfle, Solid St. Comm. 30, 369 (1979); W. Götze, Philos. Mag. 43, 219 (1981); A. Gold, W. Götze, J. Phys. C 14, 4049 (1981); D. Belitz, W. Götze, Philos. Mag. B 43, 517 (1981); D. Belitz, A. Gold, W. Götze, Z. Phys. B 44, 273 (1981).
9. D. Vollhardt, P. Wölfle, Phys. Rev. Lett. 45, 842 (1980); Phys. Rev. B 22, 4666 (1980).
10. B. L. Altshuler and A. G. Aronov, Zh. Eksp. Teor. Fiz. 77, 2028 (1979) (Soviet Phys. JETP 50, 968 (1979)); B. L. Altshuler, A. G. Aronov and P. A. Lee, Phys. Rev. Lett. 44, 1288 (1980); B. L. Altshuler, D. Khmelnitskii, A. I. Larkin and P. A. Lee, Phys. Rev. B 22, 5142 (1980); B. L. Altshuler, A. G. Aronov, D. E. Khmelnitskii and A. I. Larkin, Zh. Eksp. Teor. Fiz. 81, 768 (1981) (Soviet Phys. JETP 54, 411 (1981)).
11. L. P. Gor'kov, A. I. Larkin and D. E. Khmelnitskii, JETP (Soviet) Lett. 30, 228 (1979); A. I. Larkin, JETP Lett. 31, 219 (1980); K. B. Efetov, A. I. Larkin, D. E. Khmelnitskii, JETP (Soviet) 52, 568 (1980).
12. A. Kawabata, Solid St. Comm. 34, 431 (1980); J. Phys. Soc. Jpn. 49, Suppl. A, 375 (1980); Solid St. Comm. 38, 823 (1981).
13. S. Hikami, A. I. Larkin, Y. Nagaoka, Progr. Theor. Phys. 63, 707 (1980); S. Hikami, Phys. Rev. B 24, 2671 (1981).
14. W. L. McMillan, Phys. Rev. B 24, 2739 (1981).
15. A. J. McKane, M. Stone, Ann. Phys. 131, 131 (1981).
16. F. J. Wegner, Z. Phys. B 25, 327 (1976); F. J. Wegner, Phys. Rev. 67, 15 (1980) and references therein; B. Shapiro, Phys. Rev. Lett. 48, 823 (1982); B. Shapiro, Phys. Rev. B 25, 4266 (1982); M. Kaveh and N. F. Mott, J. Phys. C 14, L177 (1981).
17. D. J. Bishop, D. C. Tsui and R. C. Dynes, Phys. Rev. Lett. 44, 1153 (1980); ibid 46, 360 (1981); R. C. Dynes and J. P. Garno, Phys. Rev. Lett. 46, 137 (1981); G. J. Dolan and D. D. Osheroff, Phys. Rev. Lett. 43, 721 (1979).
18. M. J. Uren, R. A. Davies and M. Pepper, J. Phys. C 13, L985 (1980).
19. G. Deutscher, O. Entin-Wohlman and Y. Shapira, Phys. Rev. B 22, 4264 (1980); T. Chui, G. Deutscher, P. Lindenfeld and W. L. McLean, Phys. Rev. B 23, 6172 (1981).
20. M. A. Paalanen, T. F. Rosenbaum, G. A. Thomas and R. N. Bhatt, Phys. Rev. Lett. 48, 1284 (1982); T. F. Rosenbaum, K. Andres, G. A. Thomas, P. A. Lee, ibid 46, 568 (1981); T. F. Rosenbaum, R. F. Milligan, G. A. Thomas, P. A. Lee, T. V. Ramakrishnan, R. N. Bhatt, K. DeConde, H. Hess and T. Perry, ibid 47, 1758 (1981); T. F. Rosenbaum, K. Andres, G. A. Thomas and R. N. Bhatt, ibid 45, 1723 (1980); G. A. Thomas, T. F. Rosenbaum and R. N. Bhatt, ibid 46, 1435 (1981); G. A. Thomas, A. Kawabata, Y. Ootuka, S. Katsumoto, S. Kobayashi and W. Sasaki, Phys. Rev. B 24, 4886 (1981); H. A. Hess, K. DeConde, T. F. Rosenbaum and G. A. Thomas, ibid B 25, 5585 (1982); G. A. Thomas, Y. Ootuka, S. Katsumoto, S. Kobayashi and W. Sasaki, ibid B 25, 4288 (1982); R. N. Bhatt, ibid B 24, 3630 (1981) and ibid, in press.
21. W. L. McMillan and J. Mochel, Phys. Rev. Lett. 46, 556 (1981); B. W. Dodson, W. L. McMillan, J. M. Mochel and R. C. Dynes, Phys. Rev. Lett. 46, 46 (1981).
22. P. Chaudhari and H.-U. Habermeier, Phys. Rev. Lett. 44, 40 (1980), and Solid St. Comm. 34, 687 (1980); A. B. Fowler, A. Hartstein and R. A. Webb, Phys. Rev. Lett. 48, 196 (1982).
23. Y. Kawaguchi and S. Kawaji, J. Phys. Soc. Jpn. 48, 699 (1980); K. Morizaki, Phil. Mag. B 42, 979 (1980); R. G. Wheeler, Phys. Rev. B 24, 6783 (1981); Z. Ovadyahu and Y. Imry, Phys. Rev. B 24, 7439 (1981); S. Kobayashi, F. Komori, Y. Ootuka and W. Sasaki, J. Phys. Soc. Jpn. 49, 1635 (1980); ibid 50, 1051 (1981).
24. N. F. Mott, Phil. Mag. 26, 1015 (1972); ibid, B 44, 265 (1981).
25. J. S. Langer and T. Neal, Phys. Rev. Lett. 16, 984 (1966).
26. J. Bardeen, L. N. Cooper, J. R. Schrieffer, Phys. Rev. 106, 162 (1957); ibid 108, 1175 (1957).
27. N. Giordano, Phys. Rev. B 22, 5635 (1980); J. T. Masden and N. Giordano, Physica 107B, 3 (1981); N. Giordano, W. Gilson and D. E. Prober, Phys. Rev. Lett. 43, 725 (1979).
28. W. J. Skocpol, L. D. Jackel, E. L. Hu, R. E. Howard and L. A. Fetter, unpublished.
29. A. E. White, M. Tinkham, W. J. Skocpol and D. C. Flanders, Phys. Rev. Lett. 48, 1752 (1982).
30. A. C. Sacharoff, R. M. Westervelt and J. Bevk, unpublished.
31. M. Az'bel, J. Phys. C 13, L797 (1980); Phys. Rev. Lett. 46, 675 (1981); ibid 47, 1015 (1981); Phys. Rev. 25, 849 (1982) and references therein.
32. A. Schmid, Z. Physik 259, 421 (1973); ibid 271, 251 (1974).
33. G. Hertel, D. J. Bishop, R. C. Dynes, E. Spencer and J. M. Rowell, unpublished.
34. D. J. Bishop, R. C. Dynes and D. C. Tsui, Phys. Rev. B 26 (1982).
35. P. W. Anderson, K. A. Muttalib and T. V. Ramakrishnan, unpublished.
36. A. L. Efros and B. I. Shklovskii, J. Phys. C 8, L49 (1975).
37. G. Bergmann, Phys. Rev. Lett. 49, 162 (1982); ibid 48, 1046 (1982); ibid 43, 1357 (1979); and Phys. Rev. B 25, 2937 (1982).
38. A. B. Harris, J. Phys. C 7, 1671 (1974).
39. For example, J. Tauc in The Optical Properties of Solids, ed. F. Abeles (North Holland, Amsterdam, 1970) p. 277.

New Method for Scaling Theory of Localization II: Multi-Channel Theory of a "Wire" and Possible Extension to Higher Dimensionality Phys. Rev. B 23, 4828-4836 (1981)

This is part of the cooperative development discussed above, but has always had a special place in my mind as most foreshadowing what were to be the next series of developments: the "universal fluctuations", quantized resistance, etc, most of which are corollaries of this point of view. This was the third way of looking at localization, one channel at a time — pioneered decades before by Rolf Landauer, but in a way also contained in my original paper, which emphasized the singular distribution of path amplitudes. It makes the GIV technique's dependence on averages over randomness look shaky indeed.


PHYSICAL REVIEW B VOLUME 23, NUMBER 10 15 MAY 1981

New method for scaling theory of localization. II. Multichannel theory of a "wire" and possible extension to higher dimensionality

P. W. Anderson

Bell Laboratories, Murray Hill, New Jersey 07974 and Princeton University, Princeton, New Jersey 08540

(Received 10 December 1980)

In this paper we use the techniques of the scattering formalism for localization to discuss a real wire of finite cross section. The paper derives the phenomenon of localization in such a wire and suggests a heuristic extension to higher dimensionality.

IXa)2

I. INTRODUCTION

In a previous paper$^{1}$ I have discussed and evaluated the Landauer formula for quantum conductance of a one-dimensional chain of random scatterers, and in another$^{2}$ discussed how the Landauer formula in terms of a scattering formalism may be related to the Thouless formula in terms of boundary-condition sensitivity. In the first paper we indicated that a corresponding general formula for the conductance of a real wire of finite cross section could be derived similarly, and that formula is repeated here. Az'bel has given similar results.$^{3}$ These results are actually virtually identical to formulas for the tunneling conductance of a barrier which are well known in the literature.$^{4}$ In this paper I will try to show how to make a scaling theory whereby the conductance of a long wire may be derived in terms of the statistics of its component parts and the phenomenon of localization thereby demonstrated for this more realistic model. In a final section of the paper, I will heuristically extend the results to higher dimensionality, confirming the conjectures of previous work.

II. CONDUCTANCE FORMULA IN THE MULTICHANNEL CASE

The preceding paper in this series gave an expression for the conductance of a multichannel system with elastic scattering only. (This is a valid model of a conducting wire at zero temperature.)

"2rt

•

Ek -£k

(1)

Here $t$ and $r$ are the transmission and reflection matrices on the energy shell in channel variables. As stated in Ref. 1, in the limit of a very large number of channels $n$, $t_{\alpha\beta} \sim 1/n$ and therefore $\sum_\beta |t_{\alpha\beta}|^2 \sim O(1/n) \ll 1$. In this case we can forget the denominators. Let us define $g$ as we did in the previous paper by $G = (e^2/2\pi\hbar)\,g$; we may then write

$$g = \mathrm{Tr}(t t^\dagger). \tag{2}$$

The definition of a channel, of course, varies depending on the physical model we envision, and is never very definitive but also never a serious problem. For a good metal, we define a channel as a $k_{\mathrm{transverse}}$ which belongs to any $k$ near the Fermi surface. Those channels with no Fermi-level density in them will simply have no transmission $t$. This definition, and the question of what kind of medium is "outside" the scatterer, is almost physically irrelevant. We think for definiteness of a piece of impure metal sandwiched into a perfect crystalline wire, and recognize that any scattering due to mismatch will, of course, be a part of the contact resistance and legitimately included. In essence, the channels are defined as allowed states at the Fermi level in the highly conducting "contacts" through which we supply the current to our resistive wire. As for the concept of "reservoir" in statistical mechanics, the details are unimportant.

One is at first tempted to diagonalize $r_{\alpha\beta}$ or $t_{\alpha\beta}$ in channel variables $\alpha$ and $\beta$, which can in fact be done in the case where $H$ fields are absent and the scattering matrix is symmetric: $S = \tilde{S}$, so also is $r$. We note that

$r$ may be diagonalized by a pseudounitary transformation $r' = \tilde{U} r U$, with $U$ unitary, and then $t$ by a transformation on the outgoing channels. This temptation must be avoided. The correct channels to use are the natural ones for the perfect reservoir system attached to the resistive sample, i.e., channels where the incoming velocity $v_\alpha = \partial E_\alpha/\partial p_x$ is a diagonal operator. This is because the cancellation of density of states and velocity factors which is responsible for the simple form of (1) does not work in any other representation. That is, Eq. (1) is to be thought of as multiplied by a factor

$$\langle v_\alpha \rangle_{\mathrm{av}}\,\langle 1/v_\alpha \rangle_{\mathrm{av}},$$

and there is no reason at all why the averages of these factors over a wide variety of channels should be each other's inverse if they are not diagonal. That it is not appropriate to diagonalize can also be seen by a second way of deriving (1), which is by assuming a given flow of current $J$ and calculating the momentum lost to the sample, which must be proportional to $V$. Again, the calculation must be done in eigenstates of $p_x$ to make sense.

The formula as given is clearly valid for a set of conducting wires of different resistance in parallel, which is not true of the one proposed by Az'bel.³,⁴ But, in the only case either of us considers very seriously, where $R \to 1$, there is no real difference. Our later work can be used to show that the differences among the various formulas come in only in order $n^{-3/2}$.

We have shown in another paper² that for a truly one-dimensional chain our definition of conductance (which clearly can be extended in itself to any number of dimensions) can be related at least to that of Thouless, in terms of the effect of boundary conditions on energy levels, which itself was related by Thouless to the Kubo formula.⁵ But our definition is more fundamental than that of Kubo, does not involve eigenstates of an artificial Hamiltonian or boundary conditions, and above all applies naturally to finite systems where Kubo's formula is ambiguous and Thouless' is not precise. Since finite systems such as tunnel junctions and small bits of wire are real and do exhibit finite resistance, it can be physically incorrect to use a formula depending on a large volume limit like Kubo's, or that of conventional transport theory for that matter. Among other properties, to specify densities and spacings of eigenstates one needs the scattering matrix as a function of energy $S(E)$, while our formula is on energy shell only.

© 1981 The American Physical Society

III. SCALING

As we did for a single channel, we will start from a microscopic scale at which we assume $t$ to be a member of some given probability distribution $P(t)$, and add "black boxes" in series (see Fig. 1) with the $\alpha$ channels of box $t_1$ feeding into the $\beta$ channels of $t_2$. The probability distribution of the resultant conductance $G$ is determined by the composition law

$$t = t_1\,\frac{1}{1 - r_1 r_2}\,t_2. \tag{3}$$

FIG. 1. Schematic representation of composition of two scatterers to increase scale by a factor 2.

The key assumption (very easily satisfied in the multichannel case) is complete phase randomness between matrix elements $(t_1)_{\alpha\beta}$ and $(t_2)_{\alpha\beta}$, etc. This is surely correct for lengths past the mean free path $l$, which is in this case always very much shorter than the localization length $L_1$ (in general, $L_1 \simeq nl$ since $|t|^2$ is of order $1/e$ at $l$). We also assume a second kind of randomness: a random mixing of channels involving no correlation between the eigenchannels of $t_1$ and $t_2$, so that any channel $\alpha$ couples randomly to each channel $\beta$. This is the assumption of essential one-dimensionality. It states that there is no "transverse" localization, i.e., transverse dimensions are small compared to localization lengths.

Although we are free to redefine by unitary transformations four sets of channels (incoming and outgoing for scatterers 1 and 2), use of this freedom to diagonalize $S$, $t$, or $r$ is not very physically meaningful. In any case the various factors of (3) cannot be simultaneously diagonalized as far as we can see; and, if they could be, the eigenvalues would have an obscure relationship to conductance because of the requirement of transforming to natural channels for the input and output problems. It is meaningful, however, to discuss the eigenvalues of one factor of (3), namely

$$M = \frac{1}{1 - r_1 r_2}. \tag{4}$$

This matrix describes the multiple backscattering problem between scatterers 1 and 2, internally to the compound sample. Because in the limit of large $n$, $|r_1|$ and $|r_2|$ are both of order $1 - 1/n$, this scattering is repeated $\sim n$ times before the particle gets transmitted or reflected. Thus this is the sensitive, crucial part of the problem, and without treating it properly, it is not even possible to understand classical Ohm's law behavior. The other thing about $M$ is that, unlike $t$ or $r$, it is a matrix both of whose indices refer to the same set of channels: ingoing from the left into scatterer 2. Thus it is appropriate to try to diagonalize it by a unitary transformation. In order to understand the properties of $M$, we will have to study the large-$n$ limit. Unitarity of the $S$ matrix gives us

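$$1 = r r^\dagger + t t^\dagger,$$

$$(r r^\dagger)_{\alpha\alpha} = 1 - (t t^\dagger)_{\alpha\alpha} = 1 - \sum_\beta |t_{\alpha\beta}|^2, \tag{5a}$$

$$(r r^\dagger)_{\alpha\beta} = -\sum_\gamma t_{\alpha\gamma}\, t^*_{\beta\gamma}. \tag{5b}$$

Now in the large-$n$ limit, in order for the conductance to be finite on the scale where localization is relevant, we must have, by (2),

$$t_{\alpha\beta} \sim 1/n.$$

Thus, since the phases in (5b) are surely random, (5b) is $\sim 1/n^{3/2}$, and so by (5a) $r r^\dagger \to 1$: $r$ is unitary to lowest order in $1/n$. We may make $r$ unitary to high accuracy by multiplying it by $1/\sqrt{1 - g/n}$, since the fluctuations in $\sum_\beta |t_{\alpha\beta}|^2$ will also be of relative order $1/\sqrt{n}$. In order to study $M$ consider the matrix

$$N = r_1 r_2, \tag{6}$$

of which it is a function. $N$ is, as we have just shown, the product of unitary matrices and approximately unitary to order $1/n^{3/2}$. Thus we can diagonalize it to this order or better, and its eigenvalues are $N_\lambda$:

$$|N_\lambda| \simeq \{[1 - (g_1/n)][1 - (g_2/n)]\}^{1/2} \simeq [1 - (g_1 + g_2)/n]^{1/2} \simeq e^{-(g_1+g_2)/2n}. \tag{7a}$$

Thus the amplitudes of the $N_\lambda$ are fixed and close to unity; the phases, however, are arbitrary, since the phases of $r_2$ and $r_1$ are completely uncorrelated. Thus, the eigenvalues $N_\lambda$ are randomly arrayed with density $n/2\pi$ in angle around a circle of radius $(\langle |N|^2 \rangle_{\mathrm{av}})^{1/2} \simeq 1 - (g_1 + g_2)/2n$. (It is not necessarily true that the eigenvalue spacing is random, but this does not appear to cause much effect: we return to this in the concluding section.) In fact, we may achieve a slightly greater accuracy by defining $g_1$ and $g_2$ appropriate to each eigenvalue $\lambda$ by

$$g_\lambda^{(1),(2)} = n \sum_\alpha |t^{(1),(2)}_{\alpha\lambda}|^2$$

and then

$$|N_\lambda| = \left(1 - \frac{g_\lambda^{(1)} + g_\lambda^{(2)}}{n}\right)^{1/2}. \tag{7b}$$

Now that we understand the eigenvalues of $N$ and thus of $M$, we may compute $g$ from Eqs. (2) and (3):

$$g = \mathrm{Tr}\,t t^\dagger = \sum_{\alpha\beta}\sum_{\lambda\lambda'} (t_1)_{\alpha\lambda} M_\lambda (t_2)_{\lambda\beta}\,(t_1)^*_{\alpha\lambda'} M^*_{\lambda'} (t_2)^*_{\lambda'\beta}.$$

Now, again, the phases of $M_\lambda$ and $M_{\lambda'}$, and of the various $t$'s are totally incoherent, so that to an order more accurate than $n^{-1/2}$, we can assume

$$g = \sum_{\alpha\beta\lambda} |(t_1)_{\alpha\lambda}|^2\, |M_\lambda|^2\, |(t_2)_{\lambda\beta}|^2, \tag{8}$$

and then this is by (7) equal to

$$g = \frac{1}{n^2}\sum_\lambda \frac{g_\lambda^{(1)} g_\lambda^{(2)}}{|1 - N_\lambda|^2}.$$

Multiplying by the factors just mentioned,

$$g = \frac{1}{n^2}\sum_\lambda \frac{g_\lambda^{(1)} g_\lambda^{(2)}}{1 + |N_\lambda|^2 - 2|N_\lambda|\cos\theta_\lambda} = \frac{1}{n^2}\sum_\lambda \frac{g_\lambda^{(1)} g_\lambda^{(2)}}{(1 - |N_\lambda|)^2 + 2|N_\lambda|(1 - \cos\theta_\lambda)},$$

and we will recognize that aside from small fluctuations, this is by (7) equal to

$$g = \sum_\lambda \frac{g_\lambda^{(1)} g_\lambda^{(2)}}{[(g_\lambda^{(1)} + g_\lambda^{(2)})/2]^2 + 2n^2(1 - \cos\theta_\lambda)}. \tag{9}$$

We recognize that this is a highly convergent sum, to which only values of $\theta_\lambda \sim \lambda/n$ contribute appreciably, and for $n$ large it is precise to replace the cosine by its expansion:

$$g = \sum_\lambda \frac{g_\lambda^{(1)} g_\lambda^{(2)}}{[(g_\lambda^{(1)} + g_\lambda^{(2)})/2]^2 + n^2\theta_\lambda^2}. \tag{10}$$

The phase angles $\theta_\lambda$ of the $n$ eigenvalues are distributed evenly from $-\pi$ to $\pi$. It is instructive to take a simple phase average of $g$, which gives us the ensemble average:

$$\langle g \rangle_{\mathrm{av}} = \left\langle \frac{n}{2\pi}\int_{-\pi}^{\pi} d\theta\, \frac{g_1 g_2}{[(g_1 + g_2)/2]^2 + n^2\theta^2} \right\rangle_{\mathrm{av}} = \left\langle \frac{g_1 g_2}{g_1 + g_2} \right\rangle_{\mathrm{av}} = \left\langle \frac{1}{\rho_1 + \rho_2} \right\rangle_{\mathrm{av}}.$$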
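The phase average of Eq. (10) can be checked numerically. The following is only a minimal sketch under stated assumptions: $g_1$ and $g_2$ are held fixed for every eigenvalue, and the $n$ eigenphases are drawn independently and uniformly from $(-\pi, \pi)$, which is the "random spacing" idealization discussed in the text. All function names are illustrative and not part of the original paper.

```python
import math
import random


def sample_g(g1, g2, n, rng):
    """One realization of Eq. (10) with n independent, uniform eigenphases (assumption)."""
    half_sum = 0.5 * (g1 + g2)
    total = 0.0
    for _ in range(n):
        theta = rng.uniform(-math.pi, math.pi)
        total += g1 * g2 / (half_sum ** 2 + (n * theta) ** 2)
    return total


def mean_g_finite_n(g1, g2, n):
    """Exact phase average of Eq. (10) for uniform phases on (-pi, pi) at finite n."""
    half_sum = 0.5 * (g1 + g2)
    return g1 * g2 / (math.pi * half_sum) * math.atan(n * math.pi / half_sum)


def mean_g_large_n(g1, g2):
    """Large-n limit of the phase average: g1*g2/(g1+g2), i.e. resistances add."""
    return g1 * g2 / (g1 + g2)


def monte_carlo_mean(g1, g2, n, samples, seed=0):
    """Average of Eq. (10) over many independent sets of eigenphases."""
    rng = random.Random(seed)
    return sum(sample_g(g1, g2, n, rng) for _ in range(samples)) / samples


if __name__ == "__main__":
    g1, g2, n = 4.0, 6.0, 200
    print("Monte Carlo :", monte_carlo_mean(g1, g2, n, samples=4000))
    print("finite n    :", mean_g_finite_n(g1, g2, n))
    print("large n     :", mean_g_large_n(g1, g2))
```

The three numbers should agree to within the Monte Carlo scatter, and the finite-$n$ value approaches $g_1 g_2/(g_1 + g_2)$ as $n$ grows. The spread between individual realizations of `sample_g` is what the probability distributions discussed next are meant to describe.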

The phase average $g_1 g_2/(g_1 + g_2)$ is then the same result as one gets for the average transmission in the pure one-dimensional, one-channel case (see Paper I): this adds classically in that

$$\frac{1}{\langle g \rangle_{\mathrm{av}}} = \left[\langle (\rho_1 + \rho_2)^{-1} \rangle_{\mathrm{av}}\right]^{-1}, \tag{11}$$

which gives us Ohm's law in the limit in which the distribution in $g$ is sufficiently narrow to obey the law of large numbers. This will clearly be the case when the $\theta_\lambda$ are closely spaced relative to the distance $|\langle |N_\lambda| \rangle_{\mathrm{av}} - 1|$ between the circle of eigenvalues and 1, i.e., when $g_1, g_2 \gg 1$; the large conductance limit. In this case many eigenvalues contribute but in the opposite case only the smallest few are relevant.

Again, as in the pure 1D case, all of the important action is in the statistics: it is essential to calculate probability distributions rather than means. Let us then calculate from (10) the probability distribution $P(g)\,dg$:

$$P(g) = \int \prod_\lambda P(g_1, g_2, \theta_\lambda)\,dg_1\,dg_2\,d\theta_\lambda\;\delta\!\left(g - \sum_\lambda \frac{g_1 g_2}{[(g_1 + g_2)/2]^2 + n^2\theta_\lambda^2}\right).$$

We evaluate this by the well-known technique of Markov and Kolmogorov:⁶

$$P(g) = \frac{1}{2\pi}\int d\mu\, e^{-i\mu g}\, e^{-\psi(\mu)},$$

$$\psi(\mu) = n\int P(g_1, g_2, \theta)\left[1 - \exp\!\left(\frac{i\mu g_1 g_2}{[(g_1 + g_2)/2]^2 + n^2\theta^2}\right)\right] d\theta\,dg_1\,dg_2 \tag{12a}$$

$$= \frac{1}{2\pi}\int P(g_1) P(g_2)\,dg_1\,dg_2 \int dx \left[1 - \exp\!\left(\frac{i\mu g_1 g_2}{[(g_1 + g_2)/2]^2 + x^2}\right)\right]. \tag{12b}$$

It does not seem to be easy to get $P(g)$ explicitly. However, it seems adequate to study it in each of two limits: $g_1 + g_2 \gg 1$ and $\ll 1$.

First, consider $g_1, g_2 \gg 1$. In all cases, it is clear from (12b) that exceptionally large or small values of $g_1$ and $g_2$ are not very important (if they were, we should have had to deal more carefully with $|N_\lambda|$, for one thing). This is because the combination

$$\frac{g_1 g_2}{[(g_1 + g_2)/2]^2 + x^2}$$

is small if $g_1$ or $g_2 \to \infty$ or 0, so that exceptional values don't contribute strongly. It will be convenient to treat as secondary the fluctuations of $g_1$ and $g_2$ to begin with, therefore, and concentrate on those caused by the phase. Let us take $g_1$ and $g_2$ fixed and calculate the conditional probability

$$P(g|g_1, g_2) = \frac{1}{2\pi}\int d\mu\,\exp[-i\mu g - \psi(\mu|g_1 g_2)],$$

where

$$\psi(\mu|g_1 g_2) = \frac{1}{2\pi}\int dx\left[1 - \exp\!\left(\frac{i\mu g_1 g_2}{[(g_1 + g_2)/2]^2 + x^2}\right)\right] = -i\mu\,\frac{g_1 g_2}{g_1 + g_2} + \mu^2\,\frac{g_1^2 g_2^2}{(g_1 + g_2)^3} + \cdots. \tag{13}$$

It is a well-known theorem (Ref. 7) of this method that the power-series coefficients in $\mu$ give the first, second, etc., moments of the probability distribution. Thus the first two moments are always given by [as we already saw in (11) for the first one]

$$\langle g \rangle = \frac{g_1 g_2}{g_1 + g_2}, \tag{14}$$

$$\langle (\Delta g)^2 \rangle = \frac{2 g_1^2 g_2^2}{(g_1 + g_2)^3}. \tag{15}$$

This is irrespective of the size of $g_1, g_2$, although we will see that (14) and (15) become nearly irrelevant for small $g$. The behavior of $P(g)$ depends crucially on the size of (15) since this is the first real term of $\psi(\mu)$. If $g_1 g_2 \gg 1$, $\langle (\Delta g)^2 \rangle$ [in (15)] is $\gg 1$ and $e^{-\psi(\mu)}$ falls off to small values for $\mu \sim 1$. Then the power series (13) is a good representation of $\psi$, the next terms in the power series become successively very much smaller in the relevant region, and a Gaussian representation of $P(g)$ is accurate:

$$P(g|g_1, g_2) \sim \exp\!\left[-\frac{(g - \langle g \rangle)^2}{2\langle (\Delta g)^2 \rangle}\right]. \tag{16}$$

It is perhaps worthwhile to make this more explicit with a simplification: set $g_1 = g_2 = g_0$. (The same manipulations, slightly more complicated, work for $g_1 \neq g_2$.) Then

$$\psi(\mu|g_0) = g_0\,\varphi(\mu), \qquad \varphi(\mu) = \frac{1}{2\pi}\int dy\left[1 - \exp\!\left(\frac{i\mu}{1 + y^2}\right)\right].$$

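Here $\varphi(\mu)$ is a universal function which has a good convergent power series well represented by its first terms for $\mu \ll 1$, but is very badly expressed by this power series for $\mu \gtrsim 1$. (We will see that in this limit, it is $\propto \mu^{1/2}$.) Thus even though $(\langle (\Delta g)^2 \rangle_{\mathrm{av}})^{1/2}$ is not extremely small compared to $\langle g \rangle_{\mathrm{av}}$ in this region, the power-series argument tells us that the Gaussian is a good approximation throughout this region, since the expansion is in $\mu \sim 1/g$.

We now wish to transform (16) into a distribution in an appropriate variable $f(g)$ which has the all-important property of additive mean, as explained in Paper I:

$$\langle f(g) \rangle_{\mathrm{av}} = \langle f(g_1) \rangle_{\mathrm{av}} + \langle f(g_2) \rangle_{\mathrm{av}}.$$

To a zeroth approximation, this is the case for the resistance $\rho = 1/g$ in the limit $g \to \infty$. That is, when

$$\frac{\langle (\Delta g)^2 \rangle_{\mathrm{av}}}{(\langle g \rangle_{\mathrm{av}})^2} \sim \frac{1}{\langle g \rangle_{\mathrm{av}}} \ll 1,$$

we may write

$$\langle \rho \rangle_{\mathrm{av}} = \left\langle \frac{1}{g} \right\rangle_{\mathrm{av}} = \frac{1}{\langle g \rangle} + \frac{\langle (\Delta g)^2 \rangle}{\langle g \rangle^3} + \cdots = \frac{1}{g_1} + \frac{1}{g_2} + \frac{2}{g_1 g_2} + \cdots. \tag{17}$$

This is the result if $g_1$ and $g_2$ are fixed: averaging (17) gives

$$\langle \rho \rangle_{\mathrm{av}} = \langle \rho_1 \rangle_{\mathrm{av}} + \langle \rho_2 \rangle_{\mathrm{av}} + 2\langle \rho_1 \rangle_{\mathrm{av}}\langle \rho_2 \rangle_{\mathrm{av}} + \cdots, \tag{18}$$

which is additive to lowest order but exhibits localization in higher, and in fact the localizing term is identical to that in the one-channel case. In the one-channel case the appropriate linearizing variable was

$$f(g) = \ln(1 + \rho) = \rho - \frac{\rho^2}{2} + \cdots.$$

In the present case there seems no easy way to determine the full functional form of $f(g)$ but we can approach it from its limiting values: first as a power series in $\rho$. That is, we write

$$f(g) = \rho - a\rho^2 + \cdots. \tag{19}$$

We determine $a$ by demanding additive means:

$$\langle f \rangle_{\mathrm{av}} = \langle \rho - a\rho^2 \rangle_{\mathrm{av}} = \langle \rho_1 - a\rho_1^2 \rangle_{\mathrm{av}} + \langle \rho_2 - a\rho_2^2 \rangle_{\mathrm{av}},$$

$$\langle \rho - a\rho^2 \rangle = \rho_1 + \rho_2 + 2\rho_1\rho_2 - a\left(\frac{1}{\langle g \rangle^2} + \frac{3\langle (\Delta g)^2 \rangle}{\langle g \rangle^4}\right) + \cdots$$

$$= \rho_1 + \rho_2 + 2\rho_1\rho_2 - a[(\rho_1 + \rho_2)^2 + 6\rho_1\rho_2(\rho_1 + \rho_2)] + \cdots$$

$$= \rho_1 - a\rho_1^2 + \rho_2 - a\rho_2^2 + (2 - 2a)\rho_1\rho_2 + (\text{higher order}) + \cdots.$$

If $a = 1$, (19) is additive to lowest order. This result holds throughout the region of the Gaussian distribution. Borrowing from the one-channel case, we guess that the form of the additive variable $f(g)$ should be a version of that appropriate to that case, possibly involving a new conductance scale:

$$f = \tfrac{1}{2}\ln(1 + 2\rho) = \tfrac{1}{2}\ln\!\left(1 + \frac{2}{g}\right). \tag{20}$$

(20) has the power-series expansion (19). We now confirm that a similar form (20) holds in the opposite limit $\rho \gg 1$, $g \ll 1$. In this opposite limit, we may neglect $(g_1 + g_2)^2$ relative to $x^2$ in the exponent and

$$\psi(\mu|g_1 g_2) = \frac{1}{2\pi}\int dx\left[1 - \exp\!\left(\frac{i\mu g_1 g_2}{x^2}\right)\right] = \left(\frac{-i\mu g_1 g_2}{\pi}\right)^{1/2}. \tag{21}$$

The corresponding probability distribution

$$P(g) = \frac{1}{2\pi}\int d\mu\,\exp\!\left[-i\mu g - \left(\frac{-i\mu g_1 g_2}{\pi}\right)^{1/2}\right]$$

is calculated by an exact saddle-point integration about the saddle point, and one obtains

$$P(g) = \frac{(g_1 g_2)^{1/2}}{2\pi\, g^{3/2}}\,\exp\!\left(-\frac{g_1 g_2}{4\pi g}\right). \tag{22}$$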
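Returning briefly to the weak-scattering regime, the consistency of the guessed variable (20) with the composition law (18) can be checked directly. The sketch below is illustrative only: it assumes that (18) is truncated after the $2\rho_1\rho_2$ term and applied to fixed (non-fluctuating) resistances, which is an idealization rather than the paper's full statistical treatment. The function names are hypothetical.

```python
import math


def compose(rho1, rho2):
    """Composition law of Eq. (18), truncated after the 2*rho1*rho2 term (assumption)."""
    return rho1 + rho2 + 2.0 * rho1 * rho2


def f(rho):
    """Candidate additive variable of Eq. (20): f = (1/2) ln(1 + 2 rho)."""
    return 0.5 * math.log1p(2.0 * rho)


def rho_after_doublings(rho0, k):
    """Resistance after k successive doublings of the sample length."""
    rho = rho0
    for _ in range(k):
        rho = compose(rho, rho)
    return rho


if __name__ == "__main__":
    rho0 = 0.01
    for k in (0, 2, 4, 6, 8, 10):
        rho = rho_after_doublings(rho0, k)
        # Columns: doublings, Ohmic estimate 2^k * rho0, composed rho, f(rho)
        print(k, 2 ** k * rho0, rho, f(rho))
```

Under this truncation $1 + 2\rho = (1 + 2\rho_1)(1 + 2\rho_2)$, so $f$ is exactly additive and doubles at every step, while $\rho$ itself starts out Ohmic and then grows exponentially with length once it is of order unity.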

4

ps = 2,

p-0

p s = ve"y

= \ . 764,

P_ Ps

p~x>.

1+

a-fe-a-]}

(giff2> (s-i +S2>:

where $\rho_c$ is, as in the previous paper, the classically calculated additive resistivity, and the second version holds if we start at weak scattering with $\rho_0$ very small.

It is now essential to carry out the second stage of the prescription of the earlier paper:¹ to make sure that the variable with additive mean has a variance which scales at least according to a weak law of large numbers, so that in the limit of large scale lengths, the quantity $f(g)$ will have a Gaussian distribution. This may be checked in the two limits. At very small $\rho$, it is unnecessary to include the small correction term in the power series in $\rho$, and we have

which is consistent with the two means we have quoted in E q s . (14) and (15) When p » l , the logarithmic form (19) reduces to p o lnp/p o = p 0 ln(l/p o ^). We a r e therefore interested in ((lnp)) av which may be calculated from (22) as follows:

Let 1/

=(-2ta)

= - - o£a- f

is i )x2

e'

^' ' (\nx)dx

,

which is a known integral 2


The scaling, then, is not exactly calculable except in these limits, but it is probably quite accurate to treat $\rho_s$ as though it were constant and then we would have, starting at a length $L_0$ with starting resistivity $\rho_0$, the Landauer form

is made, a cutoff for large $g$ results which must converge the means. In fact, the cutoff occurs approximately at

(M


where the scale resistivity $\rho_s$ is a slowly varying function of $\rho$ with the two limits

Note that (22) has no finite mean or variance, since at large $g$ it falls off only as $g^{-3/2}$. A more exact calculation would confirm the values (14) and (15), since when the correction

g = const


/4r

lnxdx = l n 7 T - y - 0 . 5 6 8

(*2lAg))\. =„

TJa

=(p)\Mg\

= 2p 1 p 2 (p 1 +p 2 ) ,

so that • (ln7T - y) ,

(23)

(26)

which goes adequately to zero as $\rho \to 0$. It is fairly clear, since the distribution is most singular as $\rho \to \infty$, that $\langle (\Delta f)^2 \rangle_{\mathrm{av}}$ rises monotonically from this to its value in the limit $\rho \to \infty$. The integral for $\ln^2 g$ can be done only numerically as far as was ascertained: the relevant integral is

where $\gamma$ is Euler's constant, 0.57722. This means that the combination which is additive in this limit is (choosing the sign to obtain a positive quantity, and noting that multiplicative factors are irrelevant)

- f

Agh

e-^^'ilnxfdx^l.M.

•"Jo

F r o m this we obtain that In

1.764 '

(24)

lim<[M^)]">„ = ( ( / - / ) 2 ) „ = 5.01 .

(27)

g-a

It is interesting that this is about the same ratio to the one-channel result as the scale factor on the resistance. To repeat the argument of the previous paper,¹ we may majorize the variance in the distribution $P(f)$ after scaling by noting that the added variance at each scaling step is less than (27). Thus the total variance satisfies $\Delta^2(L)$
This limiting result fits on to the result in the extended limit only if we allow the resistivity scale to vary slowly between the two limits. We may write

'•K)'

Ag) = P.l-

(25)

with


which is to say that which in turn is satisfied by A2 (L) = - A 2 + const X L .

exp LS-J^- + ±-A2/p2) - const (28) or

We plotted this result, as well as a plausible actual course for $\Delta^2(L)$, in Fig. 3 of that paper, which is reproduced here as Fig. 2. This also compares our later results [see Eq. (29)] with the one-channel case. We see that the rms breadth asymptotically approaches $\sqrt{L}$ in this variable, which is suitably small.

As, again, for the simple chain, it is possible to study $\Delta^2(L)$ a little more precisely. In the first place, it is possible to solve the scaling equation implied by the "extended" limit (26). We must add the variances of the starting distributions $P(f_1)$ and $P(f_2)$ to that caused by the composition process, namely (26):

A 2 - 2 ( / > a v P s = 2p c p s = 3.33p c = 3.33p 0 L .

$$\Delta^2(L) = \Delta^2(L_1) + \Delta^2(L_2) + 2\rho_1\rho_2(\rho_1 + \rho_2).$$

This is satisfied by

$$\Delta^2(L) = \tfrac{2}{3}\rho^3(L), \tag{29}$$

because $\rho$ is to first approximation linear in $L$ and $L^3 = L_1^3 + L_2^3 + 3L_1 L_2(L_1 + L_2)$. Thus the variance in the extended limit rises much more slowly, initially, than for a single chain, where it rises as $L^2$. The ultimate slope of the linear portions of $\Delta^2(L)$ for large $\rho$ may, again, be estimated by using the exact mean value of $g$, (13). If $\langle g \rangle_{\mathrm{av}}$ is not to vary exponentially in this limit, we must have fdfe-
~ const,

tf(L)

-7T?/3

IV. DISCUSSION AND IMPLICATIONS FOR HIGHER DIMENSIONALITIES

Unlike the results for truly one-dimensional chains of one or a few channels discussed in the previous paper, these results are not rigorously derivable nor entirely precise quantitatively. The deep assumption is that the different channels or eigenstates $\lambda$ of the back-reflection matrix $N$ can be discussed independently. There is the problem already mentioned, that in fact they may not be quite independent in that transmission between two of them can interfere after taking into account the $t$ matrices of the separate boxes 1 and 2, but I am sure this is not serious because of the smallness and random phasing of these terms.

All of the effects of such interactions of the different channels would surely be to enhance localization. In the first place, one might expect that insofar as the $N_\lambda$'s are eigenvalues of a random matrix problem, they may to some extent repel like the eigenvalues of the energy in a random potential. I do not, however, believe that this is physically the case, although I have not produced a rigorous argument. (Unpublished calculations by Lee show, however, that individual eigenvalues of $r$ can be very different in magnitude, but we see no reason for such a correlation in $N$.) In any case this simply enhances their relative spacing, which plays no special role in the problem.

In the second place, transverse localization will act to anomalously decrease the $g$'s which determine the magnitudes of the $N_\lambda$ and space the eigenvalues further apart, because they will lead to a correlation between the $N_\lambda$'s and the $g_1$ and $g_2$ matrix elements which couple to them. That is, once the sample width is wider than the localization length, it should be necessary to treat it roughly as $W/L_{\mathrm{loc}}$ separate one-dimensional systems (where $W$ is the width). Interpolation between this concept and the considerations to be given shortly should give quite a satisfactory heuristic theory of the $d$-dimensional system.

In the opposite case, that of a $d$-dimensional sample not too wide relative to the localization length, we may derive the scaling for $d$ dimensions directly from that for one dimension by the following transformation. Let us consider a sample with one-dimensional topology, but which does

FIG. 2. Schematic representation of the variance of f(L). The solid line gives the rigorous upper limit, the dashed line gives the estimate for the linear chain from the previous paper, and the dotted line gives our best estimate from the present work.


not by adding equal lengths but by adding pieces with equal ratios of radii

not have a constant c r o s s section. In our t e r m s this means that it does not have equal numbers of ingoing channels from the left and outgoing channels to the right (see F i g . 3). In this c a s e the m a t r i c e s r[ and r 2 a r e no longer equal in size: one is N*N and the other M*M, while t_ is N*M. Unitarity conditions still require that rr1 +tf =r'*r' +ttt

=

ln^=ln^. R2 R1 In this case the fundamental scaling law (18)

„ =, v

l, becomes

so that the conductance is still symmetric and for practical purposes given by

(p), v (R 3 ,R 1 )=(p 1 )„(R 2 ,R 1 )

Two such unsymmetrical objects may still be compounded by the law

t=k-

Here we have set $R_3 - R_2 \ll R_2 - R_1$ and so small, in fact, as to make the resistance of the sector $R_3 - R_2$ perfectly classical. This may be converted into a differential equation:

-u

and the properties of the matrix $N$ are the same. Thus, as far as we can see, the manipulations of the previous section are perfectly valid so long as transverse localization does not interfere. We may make such a nonuniform sample in the shape of a sector of an annulus of opening angle $\theta$ (see figure). In scaling up such a sample, we note that we get equal increments of resistance

dp _ 1 dlnR i + 2 p ocle ^n(l+2p)|*i

£ "T

I I

1 I

=

(ln|)^

Now there is nothing in this argument which prevents us from allowing $\theta$ to approach $2\pi$, at which point the sample becomes strictly two dimensional. Thus this expression in the limit $\theta = 2\pi$ should give us $g(L)$, the effective two-dimensional conductance as a function of scale length $L$. This may be simply done by comparing

1 -»2 dimensions' the "fan"transformation. iqnneis in


i . __n channels 5 out

p(D 1

27T(7cl

•

gn L

~'g{D

Rt

which leads to g(L) = g0-^lnL/L0

+ --- .

This happens to be exactly the result obtained by perturbation theory in Ref. 8. We do not believe this to be a coincidence but think the reasoning correct in the large-$g$ limit. Incidentally, the result [Eq. (29)] shows that fluctuations are much less severe in the higher dimensionalities than in the 1D chain. The same trick may be extended to higher dimensionality, where indeed the numerical coefficient of the first term agrees perfectly with perturbation theory in three dimensions as well. Thus this method seems to obtain a valid first-order correction in powers of $1/g$. Combined with the idea of transverse localization with Az'bel's methods, one could undoubtedly interpolate a quite satisfactory estimate of the $\beta(g)$ curve of Ref. 8

FIG. 3. Sketch of the fan transformation which carries the one-dimensional, many-channel case into 2 or higher dimensions. Instead of symmetric scatterers, we use random scatterers with $n$ channels on the left and $n' > n$ on the right. Doing this repetitively gives us a sample which is topologically 2 or $n$ dimensional.

at physical dimensionalities of 2 or 3. The next and more difficult problem is to apply this method to the essentially two- or three-dimensional questions of Hall effect and magnetoresistance, and possibly to generalize to interacting systems. We feel that as it is it places the quantum transport theory on a much firmer foundation than heretofore.

ACKNOWLEDGMENTS

I acknowledge extensive discussions with D. S. Fisher, T. V. Ramakrishnan, P. A. Lee, and especially E. Abrahams on this subject. The work at Princeton University was supported in part by the National Science Foundation Grant No. DMR 78-03015, and in part by the U. S. Office of Naval Research Grant No. N00014-77-C-0711.

¹P. W. Anderson, D. J. Thouless, E. Abrahams, and D. S. Fisher, Phys. Rev. B 22, 3519 (1980).
²P. W. Anderson and P. A. Lee, Prog. Theor. Phys. (in press).
³M. Az'bel, unpublished.
⁴M. Az'bel, unpublished.
⁵D. J. Thouless, in Ill-Condensed Matter, edited by R. Balian, R. Maynard, and G. Toulouse (North-Holland, Amsterdam, 1979), p. 1.
⁶See, for example, S. Chandrasekhar, Rev. Mod. Phys. 15, 1 (1943); the method was used in P. W. Anderson, Phys. Rev. 109, 1492 (1958).
⁷See, for an early use, P. W. Anderson, J. Phys. Soc. Jpn. 9, 316 (1954).
⁸E. Abrahams et al., Phys. Rev. Lett. 42, 673 (1979).

Definition and Measurement of the Electrical and Thermal Resistances (with H.-L. Engquist) Phys. Rev. B 24, 1151-1154 (1981)

This is the first or at least one of the first papers about what is now called "nanoelectronics", which excuses its appearance here. Engquist came from Sweden on his own money, and it was a pleasure to work with him; but his health has several times prevented his return.


PHYSICAL REVIEW B

VOLUME 24, NUMBER 2

15 JULY 1981

Definition and measurement of the electrical and thermal resistances

H.-L. Engquist* and P. W. Anderson
Joseph Henry Laboratories of Physics, Princeton University, Princeton, New Jersey 08544
(Received 17 April 1981)

An ideal potentiometer and a thermometer are defined in order to show the experimental situations leading to Landauer's formula for the electrical resistance and to the corresponding expression for the thermal resistance.

PACS numbers: 72.10.Bg

Recently, Economou and Soukoulis (ES),¹ and thereafter Fisher and Lee (FL),²,³ derived from the Kubo formula an expression for the dc electrical conductance, $G$, of a one-dimensional chain for noninteracting (spinless) electrons, $G = (e^2/h)T$, where $T$ is the transmission coefficient. This result contradicts Landauer's⁴ formula (LF) for the electrical conductance, $G = (e^2/h)T/(1 - T)$. This disagreement is discomforting since LF was used in recent treatments⁵,⁶ of the electrical conductance of one-dimensional chains. We want here to rederive LF from an experimental point of view since the exact experimental conditions leading to LF have remained unclear. We stress that the electrical resistance is defined in terms of chemical potential differences, rather than voltage differences. Alternative and independent discussions of LF have recently been given in Refs. 7-10.

In the derivation of LF in Ref. 5 the difference between the chemical potentials of the reservoirs at the ends of the sample was directly related to the net current through the sample. The chemical potential, however, is not well defined when a net current is flowing through the walls of a reservoir. To avoid this problem, we introduce an ideal potentiometer (Fig. 1). The in- and outgoing currents are equal, leading to a well-defined chemical potential $\mu$. The operational definition of the potentiometer is given by the $S$ matrix

5 =

'aj8

'ay

r

hy

»

currents to and from the reservoir are determined by the phase-averaged transmission and reflection coefficients. The difference between the outgoing current $j_A$ from the reservoir $A$ (Fig. 2) with chemical potential $\mu_A$ and the current at equilibrium ($\mu_A = 0$) is

$$j_A = \frac{e}{2}\int_{-\infty}^{\infty} [f(E - \mu_A) - f(E)]\,v(E)\,N(E)\,dE, \tag{2}$$

where $v(E)$ is the velocity, $N(E)$ is the density of states per unit energy and length, and $f(E)$ is the Fermi distribution function. The energy is measured relative to the chemical potential of the sample at equilibrium. Since $v = dE/dp$ and $N(E) = (1/\pi\hbar)(dp/dE)$, Eq. (2) can be simplified to

$$j_A = \frac{e}{h}\int_{-\infty}^{\infty} [f(E - \mu_A) - f(E)]\,dE = \frac{e}{h}\,\mu_A. \tag{3}$$

Our aim is to relate $j_A$ and $j_B$ to the net current through the sample $j$. The ingoing currents to the reservoirs $A$ and $B$ consist of the transmitted parts of $j_1 = (e/h)\mu_1$ and $j_2 = (e/h)\mu_2$ and of the reflected

(1)

where $|t_{\alpha\gamma}| = |t_{\beta\gamma}| = \sqrt{\epsilon}$ and $|r_\gamma| = 1 - \epsilon$, $\epsilon \ll 1$. The subscripts $\alpha$, $\beta$, and $\gamma$ refer to the left and right transport channels and to the connection to the reservoir, respectively, and $\epsilon$ is a controllable design parameter, the strength of coupling of the potentiometer. For the following discussion we do not need to specify the remaining elements of the $S$ matrix except for requiring that the unitarity conditions are fulfilled and that $|t_{\alpha\beta}| = 1 + O(\epsilon)$, $|r_\alpha| = |r_\beta| = O(\epsilon^n)$, with $n \ge \tfrac{1}{2}$. Since the reservoir has a finite width,¹¹ the

FIG. 1. Ideal potentiometer (thermometer). The in- and outgoing electrical (thermal) currents of the reservoir are equal, leading to a well-defined chemical potential (temperature). Since the reservoir has a finite width, an average over the phases of the reflection and transmission coefficients for the in- and outgoing currents has to be taken.

588

©1981 The American Physical Society

24

RAPID COMMUNICATIONS

1152

by using Eqs. (3) and (8) and the corresponding expressions for the reservoir at B. Since (-df/dE) = 8(£) at 7 = 0 we have rederived Landauer's 4 formula for the electrical resistance 12 Re\~

FIG. 2. Measurement of the resistance. The currents $j_1$ and $j_2$ are impressed from left and right, respectively. The potentiometers (thermometers) divert a small part of the current from the channel into the reservoirs at A and B. The difference between the currents $j_A$ and $j_B$ determines the potential (temperature) difference.

parts of $j_A$ and $j_B$. The transmission amplitude from the left ingoing channel to the reservoir, $t_{1A}(E)$, including terms to order $\sqrt{\epsilon}$, is

$$t_{1A}(E) = \sqrt{\epsilon}\, e^{i\phi_1(E)} + \sqrt{\epsilon}\, |r(E)|\, e^{i\phi_2(E)} , \qquad (4)$$

where $r(E)$ is the reflection amplitude of the sample. Owing to the finite width of the reservoir an average over the relative paths of the directly transmitted wave to the reservoir and of the reflected wave from the sample has to be taken; i.e., $T_{1A}$ is given by its phase average,

$$\langle T_{1A}(E) \rangle = \epsilon\, [1 + |r(E)|^2] .$$
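The phase average of the amplitude in Eq. (4) can be checked numerically. The following is a minimal sketch and not part of the original paper: it assumes that the directly transmitted and the reflected wave carry independent phases drawn uniformly from $[0, 2\pi)$, and the names `phase_averaged_T1A`, `eps`, and `r_abs` are illustrative only.

```python
import math
import random


def phase_averaged_T1A(eps, r_abs, samples=200_000, seed=0):
    """Monte Carlo estimate of <|t_1A|^2> for
    t_1A = sqrt(eps) e^{i phi1} + sqrt(eps) |r| e^{i phi2}.

    Assumption (not from the paper): phi1 and phi2 are independent and
    uniformly distributed on [0, 2*pi).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        phi1 = rng.uniform(0.0, 2.0 * math.pi)
        phi2 = rng.uniform(0.0, 2.0 * math.pi)
        direct = complex(math.cos(phi1), math.sin(phi1))
        reflected = r_abs * complex(math.cos(phi2), math.sin(phi2))
        t = math.sqrt(eps) * (direct + reflected)
        total += abs(t) ** 2
    return total / samples


if __name__ == "__main__":
    eps, r_abs = 0.01, 0.6
    estimate = phase_averaged_T1A(eps, r_abs)
    closed_form = eps * (1.0 + r_abs ** 2)  # eps [1 + |r|^2]
    print(estimate, closed_form)
```

The cross term $2\epsilon |r| \cos(\phi_1 - \phi_2)$ averages to zero, which is why only $\epsilon[1 + |r(E)|^2]$ survives.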

K0)

(10)

• > • " = T XT [fr(E)

-fr0(E)]E

dE

(5)

M.

••'-oo

'(0)|

ej

The expression of ES$^{1}$ and FL$^{2,3}$ is dubious since it relates the difference in chemical potential over the sample to the end reservoirs which are not in internal equilibrium. In order to get a well-defined potential we need to apply measurement reservoirs, for which the zero net flow condition is applied rigorously. This corresponds to a four-probe experiment, where the potentiometers do not discriminate between the left- and right-going flows. We can use a similar procedure to derive the formula for the thermal resistance. The potentiometer is replaced by a thermometer defined by the same scattering matrix (1). The temperature is determined by the Fermi distribution function, for which the in- and outgoing heat currents are equal at the reservoir. The temperature is the only variable since the chemical potential is constant in a heat conduction experiment. When the temperature of the unperturbed sample is $T_0$, the extra heat current from a reservoir at temperature $T$ is

The ingoing currents at A originating from the external currents $j_1$ and $j_2$ are

h

8^/if

dE

3h

(6)

dE

-(r-7-0)

(11)

Since $j_A^{\mathrm{th}}(1 - 2\epsilon)$ is the part of the heat current from the reservoir at A that is reflected back, the condition for zero net heat flow through the walls of the reservoir is

and /•+0O

^

2

J

3L

l'(£)|

(7)

dE .

Since jA(\ - 2 e ) is the part of the current from the reservoir at A that is reflected back, the condition for zero net flow through the walls of the reservoir is C~[l + \r(E)\2]

jA=jA(\-2z)+e-{^ h

Ti-T0 + €hT0

dE

•*— oo

M. dE

dE

/•+0O

h

dE

dE

+e

J •/—oo §HAB*r — oo

\r(E)\

T2-T0

hT0

*J~ oo

|/(£)|2£2

(8) r

th

3h

df dE dE

U(£)|2£2

df dE dE

/—oo

dE

(12)

(13)

) , ! T

IT Kg In

/•+•»,

J

dE

dE

\r(E)\2E2

\

M.

d£ dE

dE

If we exchange the subscripts 1 and 2 in Eq. (12), we obtain the equation determining $j_B^{\mathrm{th}}$. By using Eqs. (11) and (12) we readily derive the thermal resistance of the sample

The difference in chemical potential between A and B, $\delta\mu_{AB}$, can then be related to $j$,

AL

£™[\+\r(.E)\2]E2

The Wiedemann-Franz law (9)

2

d£ dE dE

•K el

7T 2

kB_

e


To ,

(14)


[Figure: reservoirs A, C, and B, each labeled $\mu$ or $T$.]

FIG. 3. Extra potentiometer (thermometer) on the sample. The resistance between two points in the channel depends in general on the scattering properties of the total system; i.e., no "local" resistance can be defined.

is valid when $|r(E)|^2 = |r(0)|^2$ for energies near the Fermi level. Since the reflection coefficient is a rapidly varying function of energy in one dimension, we do not expect the Wiedemann-Franz law to be satisfied even for elastic scattering. This is in contradiction to the usual relaxation-time approximation for elastic scattering.

We now apply a potentiometer (thermometer) to the sample (Fig. 3). The transmission amplitude for electrons from the left to the reservoir at C is, to order $\sqrt{\epsilon}$,

r|;,(g)k'*'

(0

Iir(i)=ve

[l+h(£)|f'*°] rrrs

The subscripts 1 and 2 refer to the scatterers to the left and to the right of the measurement point C. Since the reservoir has a finite width the current from the left channel to the reservoir at C is determined by the phase-averaged transmission coefficient

x:

....

• Ui)

|r,(£)|

h(£)l

l-|r,(£)|2|r2(£)|

S/X^B, is then

M

9£

dE

'JC

M.

r(£)|

(17)

dE

9£

Similarly, the ratio of the temperature difference between A and C to the total temperature difference, TAi thermal conduction is 9TAC $TAB

1+

+-

JC

(16)

( T\c(E)) is given by Eq. (16) with the subscripts 1 and 2 exchanged. Since e « | K £ ) | 2 , the applied potentiometer (thermometer) does not change the transmission coefficient of the sample. The potential difference between A and C, bfx.AC, in terms of

l-|r,(£)||i-2(£)k'*3<£)

1+

|(,(£)|2[l+h(£)|2] i - M t f I''2 (£)

-£

h(£)|2-|r2(£)|2

l-h(£)|2h(£)|

1L 8£

dE

'JC | r ( £ ) l £ 2

2

8£

dE

for

(18)

The measured resistances for sections of the chain add linearly. Because of long-range phase coherence for each energy, however, they are determined by all scatterers in the sample, not just those between the measurement points. The "classical" case of local resistances corresponds to a complete phase randomization of the reflection coefficient for each energy, i.e., $|r(E)|^2 = \langle |r(E)|^2 \rangle$. Only when the inelastic scattering is strong enough to destroy phase coherence between the measurement points can local resistances be defined.
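The link between phase randomization and "classical" local resistances can be illustrated with two scatterers in series. The sketch below is an illustration under stated assumptions, not the authors' calculation: it uses the textbook two-barrier expression $T = T_1 T_2 / (1 + R_1 R_2 - 2\sqrt{R_1 R_2}\cos\phi)$ with a single round-trip phase $\phi$, averages $T$ uniformly over $\phi$, and compares the dimensionless ratio $\langle R \rangle / \langle T \rangle$ with the sum of the two individual ratios $|r|^2/|t|^2$. All function names are hypothetical.

```python
import math


def series_transmission(T1, T2, phi):
    """Transmission probability of two scatterers in series for round-trip phase phi."""
    R1, R2 = 1.0 - T1, 1.0 - T2
    return T1 * T2 / (1.0 + R1 * R2 - 2.0 * math.sqrt(R1 * R2) * math.cos(phi))


def phase_averaged_ratio(T1, T2, steps=20_000):
    """Ratio <R>/<T> after averaging T uniformly over phi (midpoint rule)."""
    mean_T = sum(
        series_transmission(T1, T2, 2.0 * math.pi * (k + 0.5) / steps)
        for k in range(steps)
    ) / steps
    return (1.0 - mean_T) / mean_T


def landauer_ratio(T):
    """Single-scatterer ratio |r|^2 / |t|^2 for transmission probability T."""
    return (1.0 - T) / T


if __name__ == "__main__":
    T1, T2 = 0.7, 0.4
    # With the phase averaged out, the two ratios simply add.
    print(phase_averaged_ratio(T1, T2), landauer_ratio(T1) + landauer_ratio(T2))
    # At a fixed phase the combined ratio depends on phi and does not add.
    print(landauer_ratio(series_transmission(T1, T2, 0.0)))
```

In this toy setting the averaged transmission is $T_1 T_2/(1 - R_1 R_2)$, so the ratios add exactly; with a fixed phase the result depends on both scatterers together, which is the sense in which no local resistance exists for a coherent chain.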

ACKNOWLEDGMENT We would like to acknowledge partial support from the NSF under Grant No. DMR 802063.


$^{11}$The finite width of the connection to the measurement reservoir means, of course, that it has many, say, $N$ channels. Thus we have $|t_{\alpha\gamma_i}| = |t_{\beta\gamma_i}| = \sqrt{\epsilon/N}$ and $|r_{\gamma_i}| = 1 - \epsilon/N$ for each channel $\gamma_i$. Since Eq. (3) is $j_A = (Ne/h)\mu_A$, the relevant Eqs. (9), (13), (17), and (18) are independent of $N$. Hence we limit ourselves to a single-channel connection to the measurement reservoir.

$^{12}$We expect corrections to the relation $\delta\mu = e\,\delta V$ to be of the order $\lambda_D/L$, where $\delta V$ is the difference in the average electrostatic potential between the measurement reservoirs (voltage differences at zero current through the sample have already been subtracted), $\lambda_D$ is the Debye screening length, and $L$ is the linear dimension of the reservoir. Since the reservoirs have to be macroscopic, these corrections are negligible.

*On leave from the Department of Theoretical Physics, Royal Institute of Technology, S-100 44 Stockholm, Sweden.
$^{1}$E. N. Economou and C. M. Soukoulis, Phys. Rev. Lett. 46, 618 (1981).
$^{2}$D. S. Fisher and P. A. Lee, Bull. Am. Phys. Soc. 26, 217 (1981).
$^{3}$D. S. Fisher and P. A. Lee, Phys. Rev. B 23, XXX (1981).
$^{4}$R. Landauer, Philos. Mag. 21, 863 (1970).
$^{5}$P. W. Anderson, D. J. Thouless, E. Abrahams, and D. S. Fisher, Phys. Rev. B 22, 3519 (1980).
$^{6}$P. W. Anderson, Phys. Rev. B 23, 4828 (1981).
$^{7}$M. Ya. Azbel, Phys. Lett. 78A, 410 (1980).
$^{8}$M. Ya. Azbel, J. Phys. C 13, L797 (1980).
$^{9}$D. J. Thouless (unpublished).
$^{10}$D. C. Langreth and E. Abrahams, Phys. Rev. B (in press).


Suggested Model for Prebiotic Evolution: The Use of Chaos PNAS 80, 3386-3390 (1983)

This is one of a sequence, of which more complete versions were produced with the help of the computer skills of first Dan Stein and next Dan Rokhsar, then an undergraduate. I include it to selfishly stake a claim as one of the co-inventors of the "rugged landscape" evolution model, along with Gerard Weisbuch, Stu Kauffman, and no doubt others. This was the gestation period of the science of complexity, and I met all kinds of wonderful people doing similarly speculative things in wonderful, if sometimes weird, places: Keystone, CO; Dubrovnik; Werner Erhard's mansion in San Francisco; Klosters; and, eventually, we congealed around the Santa Fe Institute.


Proc. Natl Acad. Sci. USA Vol. 80, pp. 3386-3390, June 1983 Evolution

Suggested model for prebiotic evolution: The use of chaos (origin of life/spin glass) P. W. ANDERSON

Bell Laboratories, Murray Hill, New Jersey 07974; and Joseph Henry Laboratories, Princeton University, Princeton, New Jersey 08544

Contributed by P. W. Anderson, February 28, 1983

ABSTRACT I elaborate on an earlier model for prebiotic evolution by adding a selection function depending randomly on base-pair correlations along the replicating RNA chains. By analogy with the equilibrium statistics of spin glass, this should give a wide variety of metastable "species" that can mutate or combine symbiotically.

Problems of Prebiotic Evolution. Anderson and Stein (1) have reported an attempt to model the first crucial stages in the evolutionary process, those stages in which the first information-containing molecules appear. There are of course many points in the process that led from inanimate matter to life as we know it that could be called crucial, and few of them have been investigated far enough that one can see a clear resolution of their manner of occurrence on the actual early earth. It has seemed, however, and I feel that on this point several previous investigators agree [e.g., Eigen (2, 3), Orgel (4, 5), and others], that the central question is that discussed in ref. 1: From essentially uniquely characterized simple molecules, which even when autocatalytically generated do not have any resemblance to a self-replicating information string such as is necessary if evolutionary choice is to begin to operate, how can such information strings develop?

The other important stages are as follows. First, the synthesis of a sufficient concentration of primitive organic molecules (phosphates, amines, purines and pyrimidines and such) and their activation into states capable of polymerizing into large molecules: investigations (6, 7) have suggested that, in the energy-rich environment of the primitive earth, moderately complex organic molecules can appear, but certainly no unique pathway has been proven.
Nonetheless, this process is a purely chemical kinetic problem, and it seems likely that there were, in the enormous variety of microenvironments available on the early planet, many occasions for the collection of an organic soup having considerable concentrations of such molecules in energy-rich forms. No question of principle seems to be involved here.

I choose to separate the next stage into two parts, in contradiction to Eigen (2, 3) who favors the simultaneous appearance of nucleic acids and proteins. I cannot see how the enzymes can have evolved simultaneously with the nucleic acid chains, because these are structurally unrelated molecules and both of great complexity. Peptides can easily have been chemically involved (say, as terminations) with the first nucleic acid chains, but I believe the "code" must have evolved quite separately and later. The code could even be a consequence of symbiosis of very primitive tRNA-like organisms (8, 9). The coupling of nucleic acid chains and proteins to produce the code is then a third event, which to my eye contains serious questions of probability and of detail but ones unaffected by the questions of principle I will discuss here. This point of view is made more reasonable by the fact that RNA is used for purely structural purposes both in ribosomes and, of course, in tRNA itself, so it is not true that all biological catalysts are proteins.

All life now lives within a cell membrane, even where, as with the Qβ phage, it has little of the other paraphernalia of normal organisms, even dispensing with DNA and not using host enzymes to replicate. Many vital processes involve the membrane structurally. The cell membrane must have appeared early, but I see no difficulty with the idea that it initially appeared as an entirely separate construct in the environment and only later was manufactured as a substitute for natural lipid layers, or even was evolved or co-opted to allow the living molecules to grow elsewhere than the pores in clays or rocks in which they may have first appeared. Lipid membrane is an equilibrium structure, not a dissipative one in any sense, and for the primitive chemical species to fold it around themselves is not an implausible step.

In the final section of this paper I will discuss some of the alternative approaches more fully, after my own ideas are before the reader. Let me now proceed directly to what I consider to be the problems.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Some Quasi-Philosophical Questions. Living organisms may be the only stable example of the (essentially philosophical) concept called "dissipative structures." As Haken and others (refs. 10 and 11 and refs. cited therein) have emphasized, most of the structures we encounter in nature are "equilibrium structures": at the microscopic level, such broken symmetry structures as crystals, magnetic domains, liquid crystals, and lipid membranes; at the macroscopic level, rocks, stars, and planets.
They have postulated a second type of structure, organized not as a consequence of equilibrium thermodynamics, but rather as a result of a sufficiently large deviation from equilibrium forcing a driven system that is very far from equilibrium into some kind of organized state. The more we study systems that appear to do this the more we recognize that stable organized dissipative structures are very much the exception rather than the rule. Even the simplest driven dynamical systems are liable to evolve into a chaotic state. The example Prigogine (11), for instance, has most often described, the Benard convection cell, has recently been shown to be intrinsically chaotic under almost all conditions (12). Whatever dissipative structures exist are surely remarkably restricted in their domain of stability, quite unlike living entities. For these reasons, I believe that the concept of dissipative structures, as it has been elucidated so far, may not be relevant to the origin of life. In any case, it is by no means sufficient, in discussing the question of whether "chance and necessity" can really lead to life, to point to the case of bifurcations in dissipative systems as an example of organized structure. We must ask for more.


CONSTRUCTION OF THE MODEL

I focus here on two aspects of life on which previous models have not been satisfactory to me: stability and diversity. It is easy to invent a model that can produce a great diversity of different "polynucleotides" if one neglects the real complications of the chemistry of these molecules. The scheme proposed in ref. 1 had this property. We used two complementary "bases", which we called A and B (modeled really on guanine and cytosine), and treated them as totally equivalent "letters" in a binary alphabet. By setting up a kinetics in which strings of letters A-B-A-A ... catalyze the formation of the complementary structure B-A-B-B ... by the Crick-Watson conjugation process, one can arrive at strings of arbitrary length and complexity. Unfortunately, there is no particular reason why any outcome of this process will remain stable in any particular structure. We encountered this difficulty very early in our computations, in the fact that we could not in any way evaluate what we could call "success": we could not identify when a "species" had evolved, because all species were equivalent and could presumably mutate at will into each other. As far as we carried our calculations, which may not have been adequate, we found simply a chaotic mass of rather unrelated strings. This is the case when stability is absent: any one species will mutate into any other.

It is equally easy to set up a process with a stable result but only at the expense of losing a crucial degree of diversity. There are two simple ways to do this. One is to take into account the genuine and real difference in local free energy between different groupings of bases. For instance, it seems likely that G-C-G-C-G-C ... is a very stable polymer. If, in our A-B model, we inserted a strong preference for A-B pairing, we could get a system that could very stably grow in the A-B-A-B ... mode and nothing else.
Any one stable structure is uninteresting because, essentially, it contains no information: there is no diversity on which evolution can build to create ever more complex structures. I have a sense that this is the fate of almost any simple realization of Eigen's idea of the complex autocatalytic cycle. Such a cycle is likely to be so specific that it permits only one outcome (or, if certain simple symmetries such as reflection are involved, two at most). Virtually all the seriously proposed mechanisms so far suggested are caught in this difficulty. Stability and diversity together are the key, and this suggested the one system in equilibrium statistical mechanics that has both of these properties: the spin glass. This possibility was suggested by Hopfield's ideas on associative memory, available in preprint form. (Discussions with Hopfield have been most useful.) The spin glass is a magnetic system that has the conventional-looking spin Hamiltonian,

$$\mathcal{H} = -\sum_{ij} J_{ij}\, S_i \cdot S_j , \qquad [1]$$

but the exchange integrals $J_{ij}$ are random functions of the pair of variables $(i, j)$, which take on both positive and negative values. The key element in the Hamiltonian that makes the substance a spin glass is the property of "frustration," which is slightly ill-defined (13, 14) but means that there are many loops (cycles) for which $(J_{ij} J_{jk} \cdots J_{ni})$ is negative, so that all of the interactions in the loop cannot be satisfied simultaneously. If there are as many such frustrated cycles as there are acceptable ones, the state becomes highly degenerate, and it seems there are very large numbers of states that are local minima of the energy, not far in energy from the deepest one or ones and essentially different each from the other. The number of such states seems to increase exponentially with the number of spins (15). It has been shown (16) that in some models, specifically


when $J_{ij}$ is nonzero for a very large number $M \gg 1$ of other spins $j$, the relaxation between different metastable states is extremely slow. These different metastable states have the required properties of stability and diversity in the equilibrium case.

A number of conditions for the existence of stable spin glass states are worth noting. In the first place, if we are to model a system involving base-pair conjugation on a linear RNA chain, we will map the spins $S_i$ onto the bases: let us initially set guanine = +, cytosine = −, for instance; later, we can give $S_i$ more values if we choose. Then the "system" must be a linear chain of spins $S_i$ of some length $N$. It is essential that $J_{ij}$ be long-range; otherwise, even the equilibrium structure is not stable [the condition as $N \to \infty$ is $J_{ij}$ « $1/(i - j)$]. Most particularly, $J_{ij}$ must not be a simple short-range function of $(i - j)$; otherwise, the ground state is periodic and hence, by our definition, trivial. In fact, if we choose $M$ and $N$ comparable, both problems are solved, so that is the model we shall use. Without these restrictions, we have not a hope of producing appropriate objects.

How are we to understand the appearance of such properties in some realistic early world? What we have not yet done is to take into account adequately the complex chemistry of the RNA polymer system. There are a number of kinetic as well as equilibrium restraints on the system that have not been taken into account: I list a few that are already known (17, 18): (i) Poly(G) is probably the strongest bound of all polymers, but it has the unfortunate property of self-conjugating in four-strand helices. Thus, any appreciable length of parallel $S$ [implying poly(G) G-G-G-G ... for at least one of the conjugate pairs] is not viable. (ii) After a length of four or five bases, RNA can fold back on itself and self-conjugate. Thus any appreciable length of G-C-G-C ...
or in fact any self-conjugating sequence such as C-C-G-G-C-C-G-G is not viable as far as reproduction is concerned. This is a very strong restriction, because it implies that these sequences must not occur anywhere along the chain and is hence the equivalent of very long-range interactions. (iii) The ends are special points with very-long-range implications because, in real systems, they will surely be attached to either substrate or peptides. For instance, in Orgel's present experiments, he finds that chains of 13, 25 or 26, and possibly about 39 bases are particularly stable, probably because the helix repeat distance is 12 bases and the chain is growing at a flat surface. (iv) Coupling to peptides can occur in several ways. The simplest is chain termination and, if the peptide is polymerized to any appreciable length, it can influence catalysis at quite distant sites along the RNA chain. Another form of coupling is the replacement of a RNA base by intercalation of an aromatic ring from an amino acid. Of course, once the code or its primitive predecessor starts to evolve, this provides a very effective long-range correlation between distant sites along the RNA chain. The random frustrated model is perhaps an even better one for evolution at a more advanced stage. (v) Finally, the environment can contain many other influences that appear random or chaotic from the point of view of a growing RNA chain. Any periodic crystal structure of a substrate, or any periodic structure of a lipid membrane in or near which the RNA is growing, would have a chaotic-looking influence on the RNA, except in the infinitely unlikely case that the periodicity of the substrate matched one of the two periods (base spacing or helix repeat) of the RNA. This is only a partial list of the possible types of essentially conflicting influences that will affect the growth or survival rates of an autocatalyzing RNA chain.
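The two properties the spin glass is invoked for, frustration and a multiplicity of metastable states, can be made concrete on a very small system. The following is a minimal sketch under stated assumptions rather than anything taken from the paper: the couplings are $\pm 1$ between every pair of spins (so that the number of partners $M$ is comparable to $N$), the spins are Ising variables $S_i = \pm 1$, and a "local minimum" is a configuration that no single spin flip can lower in energy. All function names are hypothetical.

```python
import itertools
import random


def random_couplings(n, rng):
    """Symmetric couplings J[i][j] = +/-1 between every pair (an assumed ensemble)."""
    J = [[0] * n for _ in range(n)]
    for i, j in itertools.combinations(range(n), 2):
        J[i][j] = J[j][i] = rng.choice((-1, 1))
    return J


def energy(J, S):
    """E = -sum_{i<j} J_ij S_i S_j, with each pair counted once."""
    n = len(S)
    return -sum(J[i][j] * S[i] * S[j] for i in range(n) for j in range(i + 1, n))


def frustrated_triangle_fraction(J):
    """Fraction of three-spin loops whose product J_ij J_jk J_ki is negative."""
    loops = list(itertools.combinations(range(len(J)), 3))
    bad = sum(1 for i, j, k in loops if J[i][j] * J[j][k] * J[k][i] < 0)
    return bad / len(loops)


def is_local_minimum(J, S):
    """True if no single spin flip lowers the energy."""
    n = len(S)
    for i in range(n):
        local_field = sum(J[i][j] * S[j] for j in range(n) if j != i)
        # Flipping S_i changes E by 2 * S_i * local_field; negative means downhill.
        if 2 * S[i] * local_field < 0:
            return False
    return True


def count_local_minima(J):
    """Exhaustive count over all 2^n configurations (only feasible for small n)."""
    n = len(J)
    return sum(1 for S in itertools.product((-1, 1), repeat=n) if is_local_minimum(J, S))


if __name__ == "__main__":
    rng = random.Random(0)
    n = 10
    ferro = [[0 if i == j else 1 for j in range(n)] for i in range(n)]
    glass = random_couplings(n, rng)
    for name, J in (("ferromagnet", ferro), ("random +/-J", glass)):
        print(name, frustrated_triangle_fraction(J), count_local_minima(J))
```

The unfrustrated ferromagnet has only the two trivial minima (all up, all down), which is the "stable but uninformative" case; the random couplings are included so the two counts can be compared directly on the same footing.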
Let me summarize the effects of all such environmental factors on the RNA chains by a death function

$$D_N(S_1 \ldots S_N) \qquad [2]$$

for chains of length $N$. As in previous work, I assume a regular fluctuation of temperature that periodically breaks apart and re-forms conjugated pairs of nucleic acid (NA) chains. I introduce a distribution of chains of all possible lengths,

$$p(t; S_1, S_2, S_3 \ldots S_N) , \qquad [3]$$

and the death function gives the difference in $p$ caused by environmental factors

$$\Delta p_{\mathrm{env}} = p(t + \Delta t; S_1 \ldots S_N) - p(t; S_1 \ldots S_N) = -D(S_1 \ldots S_N)\, p . \qquad [4]$$

In the actual primitive system, as well as in the simulation, we can define $D$ and $p$ only probabilistically. $D$ represents the probability per cycle of death of a particular exemplar $(S_1 \ldots S_N)$. We assume that $D$ is a chaotic random function of all its arguments. Noting that $S_i^2 = 1$, $S_i^{2n+1} = S_i$, $D$ is necessarily linear in all $S_i$ and can be expanded in successive correlations between its arguments:

$$D_N = \sum_i h_i S_i + \sum_{ij} J_{ij} S_i S_j + \sum_{ijk} C_{ijk} S_i S_j S_k + \cdots . \qquad [5]$$

Now it is likely that $h_i = 0$, because the conjugation mechanism imposes population symmetry between cytosine and guanine at every site in detail. Whenever guanine appears, in the next generation (if the system is to be viable), cytosine must appear and vice versa. Thus expansion 5 starts with $J_{ij}$, and we shall for simplicity cut it off at this term as well. This is a rather cavalier step and only justifiable as a heuristic hypothesis about the nature of such random functions, which must be discussed.

Imagine a minimum of the $D$ function in which the various "spins" have values $S_i$. We can imagine changing a few of the bases $S_i \to -S_i$ and making an expansion such as 5, in which we can think of the coefficients as a kind of partial derivative: $h_i = \Delta D/\Delta S_i$, $J_{ij} = \Delta^2 D/\Delta S_i \Delta S_j$, etc. These coefficients will now be functions of all the other values of $S$ but near a particular $D$ minimum they will not be strong functions because each base interacts with many ($N$) others and, in particular, $C_{ijk}$ will be $\sim 1/N^{1/2}$ of $J_{ij}$, etc. Thus, locally the behavior will be like that of the truncated series. In particular, because of conjugation symmetry, there will be many minima, even locally. Moving from one minimum to another quite far away, we will not expect the values of $J_{ij}$ to remain constant but since, for the spin glass, we know there to be $O(e^{N})$ local minima, and a large number of these to be quite deep, the local expansion will include at least a relatively large number of local minima.

More generally, there seems to be a new class of functions of many variables of which the spin-glass Hamiltonian is an example and of which our desired $D$ function may be another, as is, perhaps, the length of a "traveling salesman" tour, according to the work of Kirkpatrick et al. (19). I would like to call these generally "frustrated random functions."
It is hoped that the appropriate survival functions for evolution problems have this frustrated random character and thus are best modeled by a spin-glass function; but of course a more theoretical proof of such a vague concept is out of the question. Thus, I take

$$D_N = \sum_{ij} J_{ij} S_i S_j , \qquad [6]$$

and in fact it is almost certainly adequate to assume that $J_{ij}$ is independent of $N$. Note the very important role played by the symmetry of the conjugation mechanism for the simplification of 5 to 6. This is crucial. The properties of random field models are totally different from those of spin glasses and much less interesting. In particular, they usually have a unique stable solution, at least when the random field is large enough and, in the present case, that is a disaster because it means that no information is generated. Most other simple (or complex) autocatalytic cycles do not seem to me to have this property that is so vital. It is perhaps even a bit disturbing to have to suggest that Crick-Watson base-pair conjugation, or at least the concept of complementary strings, may be absolutely and uniquely essential in getting life started.

Orgel has pointed out that my use of conjugation symmetry to eliminate the "random-field" case of $h_i \neq 0$ is suspect. This was a key part of the argument because I see it as essential that the death function have many nearly equivalent minima whereas, with large random fields $h_i$, the solution tends to be unique. The problem is that conjugation acts to reverse the order of the bases, i.e., that 3' → 5' is replicated on 5' → 3' and hence is reversed in order. Thus G-C produces the identical G-C, not the symmetric C-G. The following argument arose in a discussion with D. S. Rokhsar. The fact that replication of a chain of length $L$ produces $-S_{L-i}$ from $S_i$ can be expressed by adding the two symmetry-related death functions $D^{\sigma}(S_i) = D(S_i \ldots) + D(-S_{L-i})$. We now expand linearly in $S_i$ and get

$$D^{\sigma}(S) = \sum_i \left[ h_i(S_j)\, S_i + h_i(-S_{L-j})\, (-S_{L-i}) \right] ,$$

where the second $h_i$ is the same function of the variables $-S_{L-j}$ as the first one is of $S_j$. In particular, if $h_i$ has an average value, $\bar{h}_i$, we can write

$$D^{\sigma}(S) = \sum_i \left\{ \bar{h}_i\, (S_i - S_{L-i}) + S_i\, [h_i(S_j) - \bar{h}_i] - S_{L-i}\, [h_i(-S_{L-j}) - \bar{h}_i] \right\} .$$

This means that the random field acts on the variable

$$\Delta p_{\mathrm{source}}(S_1 = \pm 1) = +s , \qquad [7]$$

where the source $s$ is chosen arbitrarily and more or less sets the time scale. (ii) At the beginning, there is a fortuitous density


i.e., we will set up a density diffusion via

of short polymers, so that conjugation will have a point from which to start: $p_{\mathrm{initial}}(S_1 S_2 S_3)$, say, = finite for several values of $S_1 S_2 S_3$ including sign changes. (iii) Temperature is cycled so as to form and break up conjugate pairs at intervals $\Delta t$. When chains are conjugated in a contiguous fashion, they will polymerize (any failure to do so can be ignored as a nonevent or included in the death rate $D$). Thus, we envisage processes such as

$$\Delta p(S_1 \ldots S_j \ldots S_N) = \sum_j -\epsilon \left[ p(S_1 \ldots S_j \ldots S_N) - p(S_1 \ldots {-S_j} \ldots S_N) \right] . \qquad [15]$$

It may also be useful to allow for a rate of breakage and of erroneous transcription; in fact, I would prefer an algorithm that

+); (- + + + + -)

(+ + -);


--+);

- ++++A

(- + + + + - )

[8a]

or more simply

(-); (+); (-); ( + - - + ) -

(-)(+) (7)

- +),' etc.

This is easier to program in an algorithm than it is to express in terms of probability densities. It allows the growth of chains of increasing length by using the active monomers in such a way that $\Delta p(-\mathbf{S})$ is positively driven by $p(+\mathbf{S})$ (where $\mathbf{S}$ is a vector $S_1 \ldots S_N$). In general terms, let us introduce a vector notation for partitions of the set of spins $S_1 \ldots S_N$: $(S_1 \ldots S_j S_{j+1} \ldots S_N) = (\mathbf{S}_1, \mathbf{S}_2)$, where $\mathbf{S}_1 = (S_1 \ldots S_j)$, $\mathbf{S}_2 = (S_{j+1} \ldots S_N)$, etc. That is, we can describe $p(S_1 \ldots S_N)$ in terms of its partitions into $p$ groups of $l_\nu$ spins

$$S_1 \ldots S_N = \mathbf{S} = (S_1 \ldots S_{l_1};\ S_{l_1+1} \ldots S_{l_1+l_2};\ S_{l_1+l_2+1} \ldots) = (\mathbf{S}_1, \mathbf{S}_2, \mathbf{S}_3 \ldots) , \qquad [9]$$

where

$$\sum_{\nu} l_\nu = N . \qquad [10]$$

Now, we describe processes such as 8 by the equation AA^(SI,SS) = ^

M S J M S J M S O , " S I , -S2,

S3)

contained breakage, but 14 should be adequate to move us occasionally out of local minima of the death function. Let us summarize the algorithm. We start, according to ii, with a few two- or three-base polymers, being sure to include both + + and + — sequences. We then subject them to a series of temperature cycles as follows: (1) Add 2s monomers, in random proportion + and —. (2) Allow conjugation between randomly chosen chains. An algorithm might be to make randomly chosen pairings and test for fit starring at one end and conjugating when possible; if any unpaired chains remain, again pair these along with all unsaturated conjugated structures, test for fit, and so on for a few stages. This will drive the system very hard relative to nature, but I think not in an essentially unrealistic way. (3) Split apart the resulting chains and apply D: First, for each chain, calculate the appropriate value of D. This will be of random sign and hence not a.true death function. Also, absolute values of D will be bigger for larger chains. I believe it is appropriate to add a constant to D of order -N (the probability of damage is of order size) so as to make most values negative. Then, randomly select chains to annihilate with probability ( 1 - e - W + fW/l),

So,Sj Apc™j(Sl) = -bPcmjiSi,

[8b]

SJ = Afto^Sj)

[11]

for all possible sets of pairs Si, S2. We especially allow processes that increase the total length of chains such as

( + - - + ) ; (- + +); ( + - + -)-» f ( + - 7 t ) (7 + +)l

where/^ has a variance of order/ and both are quite small. If 16 is less than 0, we should reproduce S rather than annihilating it with the appropriate probability. I am not sure that the normalization in the length is suitable but do not wish to introduce too many arbitrary parameters. Finally, we introduce the radiation (iv) by reversing spins on all chains with a very small probability £ per site. This is repeated until, one hopes, clusters of chains of similar structure appear.
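The temperature-cycle algorithm summarized above can be sketched in a few dozen lines. What follows is a deliberately crude illustration and not the author's program: conjugation is replaced by a stand-in in which every chain templates its sign-reversed complement and a few random pairs are joined end to end; the death function is the pair-correlation form $D = \sum_{i<j} J_{ij} S_i S_j$ with one fixed random $J_{ij}$ reused for all chain lengths; the death probability $1 - e^{-(D + cN)}$, clipped at zero, is an assumed form (negative values are simply treated as "no death" rather than as reproduction); and `MAX_LEN`, `damage`, `cap`, and all function names are hypothetical.

```python
import math
import random

MAX_LEN = 12  # hypothetical cap on chain length, to keep the sketch small


def make_couplings(rng, scale=0.3):
    """Fixed random J[i][j], reused for every chain length (J assumed independent of N)."""
    J = [[0.0] * MAX_LEN for _ in range(MAX_LEN)]
    for i in range(MAX_LEN):
        for j in range(i + 1, MAX_LEN):
            J[i][j] = J[j][i] = rng.gauss(0.0, scale)
    return J


def death_function(J, chain):
    """Pair-correlation death function D = sum_{i<j} J_ij S_i S_j."""
    n = len(chain)
    return sum(J[i][j] * chain[i] * chain[j] for i in range(n) for j in range(i + 1, n))


def death_probability(J, chain, damage=0.05):
    """Assumed form: 1 - exp(-(D + damage * N)), clipped to [0, 1)."""
    rate = death_function(J, chain) + damage * len(chain)
    return 1.0 - math.exp(-rate) if rate > 0.0 else 0.0


def cycle(pop, J, rng, monomers=4, eps=0.01, cap=400):
    """One temperature cycle acting on a list of chains (tuples of +/-1)."""
    # (1) add monomers of random sign
    pop = pop + [(rng.choice((-1, 1)),) for _ in range(monomers)]
    # (2) stand-in for conjugation: each chain templates its complement,
    #     and a few random pairs are joined so that chains can grow
    pop = pop + [tuple(-s for s in c) for c in pop]
    for _ in range(monomers):
        a, b = rng.choice(pop), rng.choice(pop)
        if len(a) + len(b) <= MAX_LEN:
            pop.append(a + b)
    # (3) selection by the death function
    pop = [c for c in pop if rng.random() >= death_probability(J, c)]
    # (4) mutation: flip each site with small probability eps
    pop = [tuple(-s if rng.random() < eps else s for s in c) for c in pop]
    # keep the population bounded (an artifact of this sketch, not of the model)
    if len(pop) > cap:
        pop = rng.sample(pop, cap)
    return pop


def run(cycles=30, seed=1):
    """Start from a few short polymers and apply repeated cycles."""
    rng = random.Random(seed)
    J = make_couplings(rng)
    pop = [(1, 1), (1, -1), (-1, 1, 1)]
    for _ in range(cycles):
        pop = cycle(pop, J, rng)
    return J, pop


if __name__ == "__main__":
    J, pop = run()
    print(len(pop), sorted(set(pop), key=pop.count, reverse=True)[:5])
```

Because the assumed $D$ contains only pair terms, a chain and its sign-reversed complement have the same death probability, which is the conjugation symmetry the argument relies on. Whether clusters of similar chains actually emerge depends on the parameters and on the details that this sketch leaves out.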

J J //^ j-, ( + -- + -++);(+-+ -, [12] For these, Ap„nj(So, Si, S,, S3) = kp(-Si, -SJp(S0, Si)p(S2, S3) [13] with, again, the appropriate Ap values for the constituent molecules as well to allow conservation of monomers, (ID) Finally, to avoid indefinite growth and to allow for "mutation," we must build in an error rate. This can take several forms, and I may have chosen too special a one, but until actual simulations are carried out, let us propose that between conjugations, we allow a small percentage of SiS2 ... Sj ... Ss~* S1S2 ••• —Sj ... S„;

[14]

[16]

DISCUSSION

By this time, most of the motivation in setting up this model will have become clear. I hope that this model is capable of mimicking the behavior of the origin of molecular evolution, in the sense that a modern-day statistical physicist could describe as "being in the same universality class with the origin of life." That is, we cannot hope by the finiteness of our lives to work in the original time scale nor can we guess precisely the chemical nature of the original molecules or the actual boundary conditions and constraints which were present. What we can do is to attempt to show that in a well-defined mathematical model, which in principle contains no inherent fudge factors that prejudice the outcome, a transition such as that between inanimate molecules and life does occur. I would argue that


Evolution: Anderson

Proc. Natl. Acad. Sci. USA 80 (1983)

previous attempts to set up a model or description of this process have missed some essential features. In particular, "chance and necessity" alone will not do it, even aided by the idea of self-organization via dissipative structures. I argue that, in addition, chaos is a precondition: we require a survival probability that is a fixed but chaotic function of the molecular composition. Hartman (20) [following Cairns-Smith (21)] has proposed a logically consistent scheme in which the primitive replicating stage is not chains of RNA having a linear array of different bases but layers of silicate minerals having a two-dimensional array of different ions. From the "universality class" point of view, this proposal is totally equivalent: the crucial stage requires conjugation symmetry and random interaction with the environment. I have a personal preference, based on Ockham's razor, for the present model, but, of course, cannot exclude any possible scenario a priori. Given this "quenched chaos," I hope that we can model the evolution of metastable clusters of similar molecular chains or "species" in the Eigen sense and show that these species can occasionally radiate by mutation into other forms. We can also hope that species can find it possible to interact and combine or recombine, or even to devour each other. (The modeling of this stage will probably require further development of the scheme.) The question then resolves itself into whether the computer program as outlined does indeed successfully lead to the outcome predicted on heuristic analogue grounds. This will be discussed elsewhere.

I am indebted to J. J. Hopfield and L. Orgel for extensive discussion of the subject of this paper and to F. E. Yates for introducing me to the general area discussed here. I thank H. Hartman for valuable discussion and for an early suggestion that spin glass is a system of biological interest. I also thank D. Rokhsar and D. Stein for discussion and communication of preliminary results. Some of this work was done while I was a Fairchild Scholar at the California Institute of Technology, and I wish to express my gratitude to the Fairchild Foundation and to the hospitality of the Institute.

1. Anderson, P. W. & Stein, D., in Self-Organizing Systems: The Emergence of Order, ed. Yates, F. E. (Plenum, New York), in press.
2. Eigen, M. & Schuster, P. (1977) Naturwissenschaften 64, 541-565.
3. Eigen, M. & Schuster, P. (1978) Naturwissenschaften 65, 341-369.
4. Orgel, L. E. (1972) Isr. J. Chem. 10, 287-292.
5. Miller, S. & Orgel, L. E. (1974) Origins of Life on Earth (Prentice-Hall, Englewood Cliffs, NJ).
6. Oparin, A. E. (1964) The Chemical Origin of Life (Thomas, Springfield, IL).
7. Miller, S. L. & Urey, H. C. (1967) Science 130, 245-251.
8. Hopfield, J. J. (1978) Proc. Natl. Acad. Sci. USA 75, 4334-4338.
9. Eigen, M. & Winkler-Oswatitsch, R. (1981) Naturwissenschaften 68, 282-292.
10. Haken, H., ed. (1978) Synergetics: An Introduction: Nonequilibrium Phase Transitions and Self-Organization in Physics, Chemistry, and Biology (Springer, Berlin), 2nd Ed.
11. Prigogine, I. (1978) Science 201, 777-785.
12. Siggia, E. D. & Zippelius, A. (1981) Phys. Rev. Lett. 47, 835-838.
13. Toulouse, G. (1977) Commun. Phys. 2, 115-119.
14. Anderson, P. W. (1978) J. Less-Common Metals 62, 291-294.
15. Edwards, S. F. & Anderson, P. W. (1976) J. Phys. F 6, 1927-1937.
16. Sompolinsky, H. (1981) Phys. Rev. Lett. 47, 935-938.
17. Sleeper, H. L. & Orgel, L. E. (1979) J. Mol. Evol. 12, 357-364.
18. Lohrmann, R. & Orgel, L. E. (1979) J. Mol. Evol. 12, 237-257.
19. Kirkpatrick, E. S., Gelatt, C. D., Jr., & Vecchi, M. P., Science, in press.
20. Hartman, H. (1975) Origins Life 6, 423-428.
21. Cairns-Smith, A. G. (1966) J. Theor. Biol. 10, 53-88.


Chemical Pseudopotentials (Symposium to mark the 70th Birthday of G.H. Wannier, Eugene, Oregon, 1983) Physics Reports 110, #5-6, 311-319 (1984)

This is my best review of the "chemical pseudopotential" theory, a methodology which is meant to clarify many of the empirical facts of chemistry, such as locality of bonding, the meaning of ions in molecules and solids, the meaning of directed bonds and bond-angle forces, etc. It seems to be wholly out of synch with modern trends in theoretical chemistry; chemists do not ask the "why?" questions but only calculate particular cases.


PHYSICS REPORTS (Review Section of Physics Letters) 110, Nos. 5 & 6 (1984) 311-319. North-Holland, Amsterdam

Chemical Pseudopotentials

P.W. ANDERSON

Bell Laboratories, Murray Hill, N.J. 07974, U.S.A. and Joseph Henry Laboratories, Princeton University, Princeton, N.J. 08544, U.S.A.

The subject of so-called chemical or localized pseudopotentials will be reviewed. These pseudopotentials are based on the concept of Wannier functions and one can derive a self-consistent wave equation which such localized functions satisfy. As pointed out by Adams and by Gilbert, the optimum such functions are not orthonormalized, as was remarked by Wannier himself many years ago. These functions have been the subject of many successful chemical calculations by D.J. Bullett, which will be reviewed.

My early career in physics was furthered by a number of very lucky breaks, and one of these was the happenstance that I arrived at Bell Labs in 1949 on the same day as Gregory Wannier. Both of us had been shoe-horned in over the quota or "nosecount", and as a result there was no office space, and we were assigned to sit together in a small conference room for the time being. Gregory very quickly decided that a physics department had to have afternoon tea, and that conference room, whose successor still exists, remained the tea room even after we were found offices. But in the meantime I, who had dropped out of the solid state physics course at Harvard, had had time to take an apprenticeship from Gregory, and in particular to learn the magic of Wannier functions, and the Wannier transformation, at the hands of the originator. The Wannier functions are still one of the most useful but underutilized methodologies of solid state physics, and in particular it is in the language of Wannier functions that I feel the chemical implications of band theory are most effectively expressed. Of course the reason is that most of the concepts of chemistry are local concepts, such as bonds, ions, complexes, etc., while band theory is a global structure, in which the wave functions permeate the entire system and the eigenenergies depend on the position of every atom everywhere. There is no a priori reason why band theory should lead to such a chemically intuitive result as that the carbon-carbon single bond should have roughly the same energy and bond length whenever it appears, the O²⁻ ion should have a constant radius and negative electron affinity, etc. This same weakness is shared by band theory's chemical equivalent, the molecular orbital theory of Hund and Mulliken.
From the very first, there was a vain attempt to restore locality by the use of atomic states and the valence-bond idea, very much advocated by Pauling, but only Pauling's great ingenuity in applying the vague concept of "resonance" and his enormous prestige kept this scheme afloat as long as it has been: it is just not a valid way of doing quantum mechanics, and fails completely in the case of metals and of the organic chemical equivalent of metals, namely aromatic compounds and graphite, and is not very useful elsewhere except in the hands of a master empiricist such as Pauling. Nonetheless many (by no means all) chemical properties seem to be roughly understandable on a local basis: how is this compatible with quantum mechanics? This is the enigma to which Wannier functions give us a very precise and clear answer. Not only that, but with a bit of ingenuity it is possible

© Elsevier Science Publishers B.V. (North-Holland Physics Publishing Division)

Electrons in atoms and solids

to modify the local functions in such a way as to give one a simple, accurate and serviceable method for quantum chemical calculations. The participants in this work have been Gregory himself, W.H. Adams Jr., Tom Gilbert, myself (also with one paper jointly with John Weeks and Alan Davidson), and most particularly David Bullett, who has been almost solely responsible for the chemical and solid state applications. Let us then recall some of the basic facts about Wannier functions. Bloch's general theorem requires that the eigenstates of a periodic potential $V_\tau(r)$ with

$$V_\tau(r+\tau) = V_\tau(r) \tag{1}$$

be expressible as

$$\phi_k(r) = e^{ik\cdot r}\,u_k(r) \tag{2}$$

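where $u_k$ is periodic with period $\tau$. The corresponding energy $\epsilon_k$ must also be periodic with period $G = 2\pi\tau^{-1}$, since $\phi_{k+G}(r)$ is also a solution equivalent to $\phi_k$. The Wannier function is defined as [1]

$$W(r-R) = \frac{1}{\sqrt{N}}\sum_k e^{-ik\cdot R}\,\phi_k(r) \tag{3}$$

(the sum is over a Brillouin zone of $k$'s, $N$ in number). Using (2),

$$W(r-R) = \frac{1}{\sqrt{N}}\sum_k e^{ik\cdot(r-R)}\,u_k(r).$$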

The form of this function depends upon $R$, but if $R \to R + \tau$ it becomes the same function of $(r - (R + \tau))$. $W$ is not an eigenstate: instead the energy matrix elements are the Fourier transform of the eigenenergy $\epsilon_k$:

$$(W(r-R)\,|H|\,W(r-[R+\tau])) = \frac{1}{N}\sum_{k,k'}\big(e^{-ik\cdot R}\phi_k(r),\ \epsilon_{k'}\,e^{-ik'\cdot(R+\tau)}\phi_{k'}(r)\big) = \frac{1}{N}\sum_k e^{-ik\cdot\tau}\,\epsilon_k \tag{4}$$

and in particular $(W|H|W) = \bar{\epsilon}_k$, the average eigenenergy in the band. It is equally possible to reverse the Fourier transform and get back to $\phi_k$ by

$$\phi_k = \frac{1}{\sqrt{N}}\sum_\tau e^{ik\cdot(R+\tau)}\,W(r-(R+\tau)). \tag{5}$$
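As a numerical illustration of (4), the following minimal Python sketch evaluates the Fourier sum for a one-dimensional nearest-neighbour tight-binding band, $\epsilon_k = \epsilon_0 - 2t\cos k$. The band form, the parameter values and the function names are assumptions chosen for illustration; they are not taken from the text.

```python
import cmath
import math


def band_energy(k, e0=0.0, t=1.0):
    """Assumed illustrative band: eps_k = e0 - 2 t cos(k), lattice constant 1."""
    return e0 - 2.0 * t * math.cos(k)


def wannier_matrix_element(tau, n_k=64, e0=0.0, t=1.0):
    """(1/N) sum_k exp(-i k tau) eps_k over a uniform grid of N k-points
    spanning one Brillouin zone, as in Eq. (4)."""
    total = 0j
    for j in range(n_k):
        k = -math.pi + 2.0 * math.pi * j / n_k
        total += cmath.exp(-1j * k * tau) * band_energy(k, e0, t)
    return total / n_k


if __name__ == "__main__":
    # tau = 0 gives the mean band energy; tau = 1 gives the hopping element;
    # more distant neighbours vanish for this particular band.
    for tau in range(4):
        value = wannier_matrix_element(tau, e0=0.5, t=1.0)
        print(f"tau = {tau}: matrix element = {value.real:+.6f}")
```

For this band the diagonal element is the mean energy $\epsilon_0$ and the only non-zero off-diagonal element is the nearest-neighbour one, $-t$, which is the sense in which the Wannier representation turns a global band structure into a few local numbers.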

A first approximation to a simple tight-binding band is (5) with $W \to \psi_n(r - R_n)$, with $\psi$ the atomic function centered around $R_n$, but the $\psi$'s are not orthogonal. To get to the orthonormalized $W$ from the non-orthogonal $\psi$ we can use the Wannier-Bloch transformation, normalize the resulting pseudo-Bloch function to get the Bloch functions and return; or use the Löwdin technique $W = (1 + S)^{-1/2}\psi$, where $S$ is the overlap matrix. Some important properties of $W$: clearly $W$ is not unique; one may insert in (3) an arbitrary phase or "gauge" function $e^{i\varphi(k)}$ and obtain an equivalent $W$. Blount [2] has shown that minimizing $\langle r^2\rangle$ gives a unique most localized $W$ with certain simple properties. In particular, for a band separated from all others by gaps, the best $W$ is exponentially localized:

$$W(r-R) \sim e^{-\kappa|r-R|}$$

(a theorem due to Kohn) [3], where $\kappa$ may be shown to be related to the widths of the band gaps. (To be precise, $k$ as a function of energy moves into the complex plane at the band gaps, and $\kappa$ is the maximum of $|\mathrm{Im}\,k|$ in the band gap.) Finally, as Wannier showed [4], $W$ satisfies a pseudopotential-type wave equation:

$$(H - H_{ps})\,W(r-R) = \bar{\epsilon}_k\,W(r-R) \tag{6}$$

where

$$H_{ps}W = \int dV' \sum_{\tau\neq 0} W(r-(R+\tau))\,W(r'-(R+\tau))\,H\,W(r'-R) \tag{7}$$

is an integral operator $\int dV'\,F(r', r)\,W(r' - R)$. The kernel depends on the Wannier function itself, but in principle, (7) could be solved self-consistently for $W$ without solving the Bloch equation at all; but in practice, the non-uniqueness of $W$ in particular makes this exercise almost impossible. Nonetheless, already at this stage one can see that for all substances where there is an appreciable energy gap at the Fermi surface the chemical (that is, the energetic) properties are determined by the Wannier functions of the occupied bands, which are local and satisfy the local, if self-consistent, wave equation (6). This is one of the most important and least regarded theorems of solid state physics. It also extends to the electronic response functions (by slight generalization) so it also means that in these cases essentially local force systems can explain vibration spectra and other atomic properties. But the key point is that the Wannier function energy is the mean energy of the whole band and, with all bands full or empty, the detailed band structure is energetically irrelevant. Some generalizations are necessary before we see the full value of this idea. The most important is to "symmetry orbitals" [5]: it may often be that a group of bands is separated from the others by gaps, even though the individual bands are interleaved in a complex way. The most famous example is the s and p bands in the diamond structure, where in general a group of four valence bands are intimately connected and separated by gaps from others. When this is the case one can always find a set of Wannier functions representing the whole multiple set of bands, symmetry-related to each other with more than one per unit cell. In the diamond case, there is one per covalent bond, and they are in essence bonding linear combinations of sp³ valence orbitals.
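The Löwdin construction $W = (1+S)^{-1/2}\psi$ mentioned above can be sketched numerically. The example below is a minimal illustration under stated assumptions: a ring of six identical orbitals with a nearest-neighbour overlap of 0.2 (both numbers hypothetical), and a Newton-Schulz iteration for the inverse square root, which is my choice of numerical method and converges only because the overlap matrix is close to the identity.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def ring_overlap(n_sites, s):
    """Full overlap matrix 1 + S for a ring with nearest-neighbour overlap s."""
    o = identity(n_sites)
    for i in range(n_sites):
        o[i][(i + 1) % n_sites] += s
        o[i][(i - 1) % n_sites] += s
    return o


def inverse_sqrt(overlap, steps=40):
    """Newton-Schulz iteration X <- X (3 I - O X X) / 2, started from X = I.
    It converges to O^(-1/2) when O is sufficiently close to the identity."""
    n = len(overlap)
    x = identity(n)
    for _ in range(steps):
        oxx = matmul(overlap, matmul(x, x))
        correction = [[(3.0 if i == j else 0.0) - oxx[i][j] for j in range(n)]
                      for i in range(n)]
        x = [[0.5 * v for v in row] for row in matmul(x, correction)]
    return x


overlap = ring_overlap(6, 0.2)
c = inverse_sqrt(overlap)              # W_i = sum_j c[i][j] psi_j
gram = matmul(c, matmul(overlap, c))   # overlaps (W_i|W_j) of the new functions

if __name__ == "__main__":
    for row in gram:
        print(" ".join(f"{v:+.6f}" for v in row))
```

The matrix `gram` should come out as the identity to within rounding, confirming that the transformed functions are orthonormal; the rows of `c` show how each orthogonalized function acquires small tails on the neighbouring sites.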
It is only necessary to run through a series of examples to see that invariably the Wannier functions of the occupied bands mirror the chemical reality. As contrasted with the bond orbitals of covalent


crystals, in ionic crystal structures the valence band Wannier functions are the p orbitals of the negative halogen or chalcogen ions. In complexed structures such as S O r or S O ; - they are the bonding orbitals of the negative ion complex, etc. Long after Wannier's work Adams and Gilbert made the breakthrough which allowed these ideas to take practical form: the abandonment of orthogonality as a criterion for Wannier functions of filled bands. These authors pointed out the unexpected result that a complete, linearly independent but non-orthogonal set of orbitals can be found which are more localized, and often much more localized, than the Wannier functions, and can in general be determined uniquely. To see that this is certainly so, let us calculate, for instance, r2 for a function in which we subtract off a fraction e. of the neighboring Wannier functions:

$$\phi(r-R) = N\Big[W(r-R) - \epsilon\sum_\tau W(r-(R+\tau))\Big]$$

$$(\phi|\phi) = N^2(1+\epsilon^2) = 1$$

$$\langle r^2\rangle_\phi = N^2\Big[\langle r^2\rangle_W - 2\epsilon\sum_\tau (W_R|r^2|W_{R+\tau}) + O(\epsilon^2)\Big].$$

Clearly $N - 1$ and all other corrections are of order $\epsilon^2$; the off-diagonal term is of order $\epsilon$ and may be chosen to decrease $\langle r^2\rangle$, so $W$ cannot have the minimum $\langle r^2\rangle$. Both Adams and Gilbert [6,7] gave much more exhaustive discussions showing that a very appreciable improvement could be made. Interestingly, Gregory suggested this to me many years before, in terms of a remark of Wigner's: that actually, the original atomic functions were sure to be more satisfactory in some way than his orthogonal ones. In fact, as we will see, the "ultralocalized" functions often closely resemble atomic ones. These authors, especially Gilbert, pointed out that such functions can be defined by satisfying self-consistent equations like (6) and (7), which there were strong arguments to believe gave more localized solutions. But certain prejudices and limitations left their work purely on a theoretical level. Next came my own contribution [8], which was to point out that there are sets of ultralocalized or "ultra-Wannier" functions which obey a far simpler self-consistent wave equation than (6), so simple in fact that it can be used as the basis for sophisticated band or molecular calculations. In other words, the idea was to combine the conceptual advances of the pseudopotential theory, so valuable in band calculations, with those of the ultralocalized pseudo-atomic Wannier functions, to make a practical calculational scheme based on local, atomic concepts and not on band theory. The strong locality of the wave functions means that all energetic quantities are much more local than Wannier theory suggests. The simplest version of this theory is presented in my Physical Review Letter and in a Physical Review paper [9] on the use of these methods as a basis for the famous "Hückel" theory of aromatic organic compounds.
The Hückel system of π-electrons on the C atoms of an organic compound is perhaps the neatest possible illustration of the method. To a good level of approximation, we may schematize the potential energy as a sum of atomic potentials, centered on the individual atoms,

$$H = T + \sum_n V_n(r). \tag{8}$$

We define the atomic "ultra-Wannier" function $\phi_n(r)$ by a self-consistent wave equation

$$H_n\phi_n(r) = (T + V_n(r))\,\phi_n(r) + \sum_{m\neq n}\Big[V_m(r)\,\phi_n(r) - \Big(\int \phi_m(r')\,V_m(r')\,\phi_n(r')\,dV'\Big)\phi_m(r)\Big] = E_n\,\phi_n(r). \tag{9}$$

In fact, it is unnecessary to be as inaccurate as (8); one may use as "$V_m$" the difference between the true effective potential (as obtained by density functional methods or, as in most of our work, by the "Wigner trick" of locating the correlation hole on the local atom, which is for chemical problems often more accurate) and the single-atom potential $V_n$. The method allows remarkable flexibility. In general, the pseudopotential operator

$$H^{ps}(r, r') = \sum_{m\neq n} V_m(r')\,\big(\delta(r-r') - \phi_m(r)\,\phi_m(r')\big) \tag{10}$$

is (a) relatively small, (b) repulsive: the occupied or valence band states usually oversaturate the potential $V_m$ (as one can easily show using completeness of the full set of wave functions), hence the wave function $\phi_n$ is either close to the unperturbed atomic function or even more localized. Thus a remarkably accurate approximation is usually $(T + V_n)\phi_n = E_n\phi_n$: the atomic wave functions themselves. This is the same near-miracle which happens in conventional pseudopotential theory: the naive approach works remarkably well. A second, and even more surprising, miracle occurs when we examine the secular equation which determines the band energies. The form of the pseudopotential equation allows us to expand the actual "Bloch" wave functions $\phi_a$ as linear combinations of the ultra-Wannier functions $\phi_n$ (note that there is absolutely no need to orthogonalize at any point): completeness alone allows us to write

$$\phi_a = \sum_n (a|n)\,\phi_n \tag{11}$$

and we can write

$$E_a\phi_a = H\phi_a = \sum_n (a|n)\,H\phi_n = \sum_n (a|n)\Big(E_n\phi_n + \sum_{m\neq n}(m|V_m|n)\,\phi_m\Big).$$

Picking out the coefficient of $\phi_n$, we get

$$(E_a - E_n)(a|n) = \sum_{m\neq n}(n|V_n|m)(a|m) \tag{12}$$

so that the secular equation is

$$\big\|(E_a - E_n)\,\delta_{nm} - V_{nm}\big\| = 0,$$

where $V_{nm} = (n|V_n|m)$ for $m \neq n$.
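A secular equation of this overlap-free form is easy to check on a small example. The sketch below assumes a six-membered ring (benzene-like) in which every site has the same $E_n$ and only nearest neighbours are coupled by a single matrix element $\beta$; the ring, the value $\beta = -1$ and the function names are illustrative assumptions. It verifies that the symmetry-adapted coefficients $(a|n) \propto e^{2\pi i q n/6}$ satisfy Eq. (12) with $E_a = E_n + 2\beta\cos(2\pi q/6)$.

```python
import cmath
import math


def ring_coupling(n_sites, beta):
    """Matrix V_nm with a single nearest-neighbour element beta on a ring."""
    v = [[0.0] * n_sites for _ in range(n_sites)]
    for n in range(n_sites):
        v[n][(n + 1) % n_sites] = beta
        v[n][(n - 1) % n_sites] = beta
    return v


def bloch_solution(q, n_sites, e_site, beta):
    """Candidate solution of Eq. (12): energy E_a and coefficients (a|n)."""
    e_a = e_site + 2.0 * beta * math.cos(2.0 * math.pi * q / n_sites)
    coeffs = [cmath.exp(2j * math.pi * q * n / n_sites) / math.sqrt(n_sites)
              for n in range(n_sites)]
    return e_a, coeffs


def secular_residual(q, n_sites, e_site, beta):
    """Largest violation of (E_a - E_n)(a|n) = sum_m V_nm (a|m) over all n."""
    e_a, c = bloch_solution(q, n_sites, e_site, beta)
    v = ring_coupling(n_sites, beta)
    return max(
        abs((e_a - e_site) * c[n] - sum(v[n][m] * c[m] for m in range(n_sites)))
        for n in range(n_sites)
    )


if __name__ == "__main__":
    for q in range(6):
        e_a, _ = bloch_solution(q, 6, 0.0, -1.0)
        print(f"q = {q}: E_a = {e_a:+.4f}, residual = {secular_residual(q, 6, 0.0, -1.0):.1e}")
```

No overlap matrix appears anywhere in the calculation, which is the point of the remark that follows: the non-orthogonality of the local functions has been absorbed into their definition.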

The secular equation has three remarkable qualities: (1) the off-diagonal matrix elements are simple potential matrix elements, hence very short-range, and devoid of the messy three-center integrals of so-called "a priori" methods; (2) overlap terms are totally absent. This is the well-known character of the "Hückel" secular equation, and a very accurate value of "β" was determined in the original paper; (3) note that the total energy of the band is simply $\sum_n E_n$: only the energy eigenvalue of the local pseudo-wave function enters the total energy. This means that this scheme is particularly adapted to chemical bonding problems where it is the mean energy of the whole band which is relevant. For instance, Bullett has shown that, instead of atomic wave functions, in covalent systems bonding ultra-Wannier functions may be derived which give an excellent account of bonding in tetrahedral semiconductors [10]. Following on this lead, these same methods have been taken up by Dave Bullett especially, in Cambridge and later at Bath [11]. I don't want to talk at too great length about work which is not my own, but I want to brag a little bit about the extraordinarily complex quantum chemical problems on which Dave has been able to demonstrate their usefulness. Fig. 1 [12] is an example of his work on complex borides, systems where even the best of other techniques have been unavailing, whereas he is able to put together the full quantitative quantum chemical picture for these phases. Let me emphasize once

Fig. 1a [12]. Schematic section through the α-rhombohedral boron structure perpendicular to the three-fold axis. Broken lines symbolize the three-center bonds linking adjacent icosahedra in the basal plane; the other atoms are linked to icosahedra in neighboring layers by regular two-center bonds.

Fig. 1b [12]. Calculated density of states histogram for electrons in α-rhombohedral boron. Energies are measured in eV relative to the valence band maximum.

Fig. 1c [12]. Part of the 84-atom unit that forms the basic element of the β-rhombohedral boron structure. Each vertex of the central (shaded) icosahedron is attached to a half-icosahedron; these link together along the rhombohedral cell edges.

Fig. 1d [12]. Calculated density-of-states histogram for the "ideal" β-rhombohedral boron structure. Energies are measured in eV relative to the Fermi level; the full valence band would accommodate 320 electrons per unit cell, compared with the 105 × 3 = 315 electrons available.

again that there is really nothing "semi-empirical" or "pseudo" or phony about these techniques: they are as capable as any others of starting strictly from atoms or even earlier and putting together an entire complex structure. Much of their relative obscurity has been the feeling of physicists and quantum chemists that anything so easy must be fake in some way: but this is no more true of these than of other pseudopotential methods, and perhaps less; and they have the added blessing that one deals with physically significant atomic wave functions throughout. What I want to do last is to talk about some of the original dreams I had for these methods that have not yet been fully realized. The method is a particularly clean way to understand the overlap repulsion (often miscalled "exchange repulsion") effect which is the fundamental repulsive force which keeps molecules apart (among other things: I would argue that it is also responsible for bond angle forces, for instance). This force results not from mysterious many-body "exchange forces" but simply from the Pauli principle limitation which prevents two electrons from occupying the same piece of Hilbert space together. We recognize that in our formalism this force must appear in the mean effect of the pseudopotential on the energy eigenvalues $E_n$. We can derive its value directly and rigorously (except of course for the usual premise that we must subtract out half of the interelectronic repulsive energy so as not to double count) from the eigenvalue equation. We multiply through by $\phi_n(r)$ and integrate:

$$E_n\,(\phi_n|\phi_n) = (\phi_n|T + V_n|\phi_n) + \sum_{m\neq n}(\phi_n|V_m|\phi_n) - \sum_{m\neq n}(\phi_n|\phi_m)(\phi_m|V_m|\phi_n)$$

so the repulsive force is just the mean value of the pseudopotential: $(\delta E_n) = (\phi_n|V_m|\phi_n) - S_{nm}V_{mn}$. It is interesting to understand why this is indeed repulsive. In particular, this depends very much on the nature of $\phi_n$, and it is rather a subtle argument which convinces one that the attractive effect of the mean potential $(\phi_n|V_m|\phi_n)$ is overcompensated by the overlap term. Where $\phi_n$ is relatively a very smooth, flat wave function, in fact like a plane wave, the net interaction is very small, as we know in the case of ordinary pseudopotentials: the effect of orthogonalization to the core wave functions is almost negligible, and it is only the outer, mean part of the potential which affects the energies. Why don't we get then near compensation here? Physically, one can draw a very simple picture which makes this clear. It is a fact that eigenfunctions like $\phi_m$ are more extended in space than the potentials $V_m$ which they satisfy. Their shape looks like fig. 2. Now when we calculate the overlap matrix element $\int\phi_m\phi_n$, most of its value will come from the large volume where $V_m$ is negligible, while the contribution $V_{nm}$ will be concentrated in the overlap region. A way of estimating this is to point out that $\phi_n^2(r)\,V_m(r)$ will be proportional to $\phi_n^2(R_m)\cdot V_m$ while $S_{nm}V_{mn}$ will be essentially

$$\phi_n^2(R_m)\,V_m \times \frac{\text{volume in which } \phi_n\phi_m \text{ is roughly const.}}{\text{volume in which } V_m \text{ is const.}}$$

In general, this volume ratio is quite large if $\phi_n$ and $\phi_m$ are similar wave functions, and this accounts nicely for the normal repulsive forces between molecules. On the other hand, if $\phi_n$ is a weakly bound wave function (like a lone pair orbital on O²⁻) and $\phi_m$ is tightly bound (like a σ-bond between C and H) the repulsion is quite weak, and the compensation of the two terms nearly complete. This, I am convinced, is the true story behind the mysterious H-bond: the absence of the conventional overlap repulsion effect, rather than the presence of a peculiar kind of bond. This then brings us back to

Fig. 2. Repulsion due to overlap of orbitals which extend outside the region of their respective potentials.


Gregory Wannier; among all the many problems we discussed together back in those days at Bell Labs, the interesting questions he first called my attention to, perhaps the H-bond remains the one most nearly unsolved.
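The overlap-repulsion estimate $(\delta E_n) = (\phi_n|V_m|\phi_n) - S_{nm}V_{mn}$ discussed above can be tried out on a toy model. The sketch below is only an illustration under loud assumptions: one dimension, Gaussian shapes for both orbitals, and a narrow Gaussian well for $V_m$ (none of these forms or numbers come from the text, and the Gaussians are not self-consistent solutions of the wave equation). It shows the mechanism described: the orbital is wider than the potential, so the overlap term outweighs the direct attraction.

```python
import math


def gaussian_orbital(x, center, width):
    """Normalised 1D Gaussian orbital (assumed shape, for illustration only)."""
    return (math.pi * width ** 2) ** -0.25 * math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def well(x, center, depth, width):
    """Attractive Gaussian well standing in for the potential V_m."""
    return -depth * math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def integrate(f, lo=-10.0, hi=12.0, n=22000):
    """Midpoint rule on a uniform grid."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


def repulsion_terms(d, orbital_width=1.0, well_width=0.3, depth=1.0):
    """Return (phi_n|V_m|phi_n), S_nm, V_mn and delta E_n for separation d."""
    phi_n = lambda x: gaussian_orbital(x, 0.0, orbital_width)
    phi_m = lambda x: gaussian_orbital(x, d, orbital_width)
    v_m = lambda x: well(x, d, depth, well_width)
    attraction = integrate(lambda x: phi_n(x) * v_m(x) * phi_n(x))
    s_nm = integrate(lambda x: phi_n(x) * phi_m(x))
    v_mn = integrate(lambda x: phi_m(x) * v_m(x) * phi_n(x))
    return attraction, s_nm, v_mn, attraction - s_nm * v_mn


if __name__ == "__main__":
    attraction, s_nm, v_mn, delta = repulsion_terms(2.0)
    print(f"(n|V_m|n) = {attraction:+.5f}")
    print(f"S_nm V_mn = {s_nm * v_mn:+.5f}")
    print(f"delta E_n = {delta:+.5f}")
```

With these parameters the direct term is negative (attractive) but smaller in magnitude than $S_{nm}V_{mn}$, so the net shift is positive. Changing `orbital_width` for only one of the two orbitals would be one way to explore the weakly-bound versus tightly-bound case mentioned in connection with the H-bond, though that extension is not attempted here.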

References

[1] G.H. Wannier, Phys. Rev. 52 (1937) 191.
[2] The best review giving many results on Wannier functions is E.I. Blount, in: Solid State Physics, Advances in Research and Applications, eds. F. Seitz, D. Turnbull, and H. Ehrenreich (Academic Press, NY) 13 (1962) 305.
[3] W. Kohn, Phys. Rev. 115 (1959) 809.
[4] See ref. [2].
[5] J. Lennard-Jones, Proc. Roy. Soc. Lond. A198, No. 1 (1949) 14.
[6] W.H. Adams Jr., J. Chem. Phys. 34 (1961) 89; 37 (1962) 2009; Chem. Phys. Lett. 11 (1971) 71, 441.
[7] T.L. Gilbert, Molecular Orbitals, a Tribute to R.S. Mulliken, eds. P.O. Löwdin and B. Pullman (Academic Press, New York) p. 405.
[8] P.W. Anderson, Phys. Rev. Lett. 21 (1968) 13.
[9] P.W. Anderson, Phys. Rev. 181 (1969) 25.
[10] D.W. Bullett, in: Solid State Physics, Advances in Research and Applications, eds. H. Ehrenreich, F. Seitz, and D. Turnbull (Academic Press, NY) 35 (1980) 129.
[11] D.W. Bullett, sequence of papers: J. Phys. C 8 (1975) 2695, 2707, 3108; Sol. St. Comm. 17 (1975) 843; Phil. Mag. (8) 32 (1975) 1063; J. Phys. C, many papers 1976-83.
[12] D.W. Bullett, J. Phys. C: Solid State Phys. 15 (1982) 415.

Spin Glass Hamiltonians: A Bridge between Biology, Statistical Mechanics and Computer Science Emerging Syntheses in Science, ed. D. Pines, Santa Fe Institute (Addison-Wesley, 1984), pp. 17-20

My contribution to SFI's founding Workshop. The evolutionary biology line has been much exploited by Kaufmann, Weisbuch, Fontana and others associated with SFI.

P. W. ANDERSON Joseph Henry Laboratories of Physics, Princeton University, Princeton, NJ 08544

Spin Glass Hamiltonians: A Bridge Between Biology, Statistical Mechanics and Computer Science

A remarkable number of fields of science have recently felt the impact of a development in statistical mechanics which began about a decade ago¹ in response to some strange observations on a variety of magnetic alloys of little or no technical importance but of long-recognized scientific interest.² These fields are:

1. Statistical mechanics itself, both equilibrium and non-equilibrium;
2. Computer science, both special algorithms and general theory of complexity;
3. Evolutionary biology;
4. Neuroscience, especially brain modelling;
5. Finally, there are speculations about possible applications to protein structure and function and even to the immune system.

What do these fields have in common? The answer is that in each case the behavior of a system is controlled by a random function of a very large number of variables, i.e. a function in a space of which the dimensionality is one of the large, "thermodynamic limit" variables: $D \to \infty$. The first such function of which the properties came to be understood was the model Hamiltonian

$$H = \sum_{i,j}^{N} J_{ij}\,S_i S_j \tag{1}$$

Emerging Syntheses in Science, 1987

($J_{ij}$ is a random variable, $P(J_{ij}) \propto \exp[-J_{ij}^2/2(\Delta J)^2]$; $S_i$ a spin variable attached to site $i$) introduced¹ for the spin glass problem. This Hamiltonian has the property of "frustration", named by G. Toulouse³ after a remark of mine, which roughly speaking indicates the presence of a wide variety of conflicting goals. A general definition suitable for a limited class of applications has been proposed:⁴ imagine that the "sites" $i$ on which the state variables reside constitute the nodes of a graph representing the interactions between them (simply a line for every $J_{ij}$ in the case of (1), for instance). Let us make a cut through this graph, which will have a certain area $A$ ($\propto N^{d-1}$ in case the graph is in a metric space of dimension $d$). Set each of the two halves in a minimum of its own $H$, normalized so that $H \propto N$. Then, reunite the halves and note the change in energy $\Delta H$. If the fluctuations in $\Delta H$ are of order $\sqrt{A}$, $H$ is "frustrated"; if they can be of order $A$ (as in (1), they will be if the $J$ are all of the same sign) it is "unfrustrated." The dependence on $\sqrt{A}$ means that when the interactions within a block of the system are relatively satisfied, those with the outside world are random in sign; hence, we cannot satisfy all interactions simultaneously. A decade of experience with the spin glass case has demonstrated a number of surprising properties of such functions as $H$. As the Hamiltonian of a statistical mechanical system, for most dimensionalities it has a sharp phase transition in the $N \to \infty$ limit.¹ At this transition it becomes non-ergodic in that different regions of phase space become irretrievably separated by energy barriers which appear to be of order $N^p$, where $p$ is a power less than unity.⁵ As the temperature is lowered these regions proliferate, exhibiting an ultrametric multifurcation.⁶ It is suspected that the number of such regions at or near the minimum (ground state) of $H$ has no entropy (not of order $e^N$), but may be exponential in some power of $N$.
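Frustration in a Hamiltonian of the form (1) can be made concrete with a very small example. The sketch below is an illustration under stated assumptions: Ising spins $S_i = \pm 1$, the sum taken over distinct pairs $i < j$, the sign convention $H = +\sum J_{ij}S_iS_j$, and Gaussian couplings of unit width; the function names are hypothetical. It finds the ground state by exhaustive search, which is feasible only for a handful of spins, and counts the bonds that remain unsatisfied.

```python
import itertools
import random


def random_couplings(n, rng):
    """Gaussian J_ij of unit width for every pair i < j (assumed distribution)."""
    return {(i, j): rng.gauss(0.0, 1.0) for i in range(n) for j in range(i + 1, n)}


def energy(spins, couplings):
    """H = sum over pairs of J_ij S_i S_j (sign convention assumed here)."""
    return sum(J * spins[i] * spins[j] for (i, j), J in couplings.items())


def ground_state(n, couplings):
    """Exhaustive search over all 2^n spin configurations."""
    best = min(itertools.product((-1, 1), repeat=n),
               key=lambda s: energy(s, couplings))
    return best, energy(best, couplings)


def unsatisfied_bonds(spins, couplings):
    """With H = +sum J S S a bond is satisfied when J_ij S_i S_j < 0."""
    return sum(1 for (i, j), J in couplings.items() if J * spins[i] * spins[j] > 0)


if __name__ == "__main__":
    # Three spins with equal positive couplings: no configuration satisfies all bonds.
    triangle = {(0, 1): 1.0, (0, 2): 1.0, (1, 2): 1.0}
    state, e0 = ground_state(3, triangle)
    print("triangle:", state, e0, unsatisfied_bonds(state, triangle))

    rng = random.Random(0)
    couplings = random_couplings(10, rng)
    state, e0 = ground_state(10, couplings)
    bound = -sum(abs(J) for J in couplings.values())
    print("random sample:", e0, "vs. unfrustrated bound", bound,
          "unsatisfied:", unsatisfied_bonds(state, couplings))
```

The gap between the ground-state energy and the bound $-\sum|J_{ij}|$ is a direct measure of the conflicting goals: in an unfrustrated system every bond could be satisfied at once and the two numbers would coincide.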
Many unusual properties of the response functions, and some strange statistical mechanical and hysteretic behaviors, have been explored at length. Recent work has generalized the Hamiltonian (1) and also shown that even first-order phase transitions may occur for some models. In computer science, there are a number of classic optimization problems which have been studied both as objects for heuristic algorithms and as examples for complexity theory. These include the spin glass itself (sometimes under other names), the graph partition problem (which can be transformed into a spin glass), graph coloring (close to a Potts model spin glass), the Chinese postman (in some cases equivalent to a spin glass) and the famous travelling salesman (design a tour through $N$ cities, given distances $d_{ij}$, of the minimum length $L = \sum d_{ij}$). As I indicated, several of these are spin glasses (there even exists a very inefficient transformation due to Hopfield⁷, TS ⇔ spin glass) and all are well-known NP-complete, i.e. hard, problems, of which it is speculated that no algorithm will solve the general case faster than $O(e^N)$. On the level of heuristic algorithms, Kirkpatrick⁸ has suggested that the procedure of annealing using a Metropolis-Teller Monte Carlo statistical mechanics algorithm may be more efficient for some of these problems than the conventional heuristics. In any case, the knowledge that a "freezing" phase transition exists and

that, for values of the function below freezing, one may be stuck forever in an unfavorable region of phase space, is of great importance to the understanding of the structure of such problems. To my knowledge, the computer science community only knew of freezing as a bit of folklore, and has not yet absorbed its fundamental importance to the whole area, which includes great swathes of problem-solving and AI. Incidentally, a workshop at BTL came up with the limited but interesting conclusions that (a) simulated annealing works; (b) sometimes, not always, it beats previously known heuristics.⁹ Equally important should be the knowledge that there is a general theory of average properties of such problems, not limited to the mathematicians' type of worry over worst cases, but able to make statements which are overwhelmingly probable. For instance, we also know analytically the actual minimum energy to order $N$ for "almost all" cases of several kinds of spin glass. A student (Fu) and I have an excellent analytic estimate for the partition problem on a random graph, etc. We also can hope to achieve a real connection between algorithmic solution and non-equilibrium statistical mechanics: after all, the dynamic orbits of a system are, in some sense, the collection of all paths toward minimum energy, and, hence, of all algorithms of a certain type (D. Stein is working on this). In evolutionary biology, we can consider the fitness (the "adaptive landscape") as a function of genome to be just the kind of random function we have been talking about. The genome is a one-dimensional set of sites $i$ with 4-valued spins (bases) attached to the sites, and the interaction between the different sites in a gene is surely a very complicated random affair. In the work of Stein, Rokhsar and myself,¹⁰ we have applied this analogy to the prebiotic problem, showing that it helps in giving stability and diversity to the random outcomes of a model for the initial start-up process.
Here we see the randomness as due to the tertiary folding of the RNA molecule itself. G. Weisbuch has used a spin glass-like model for the evolutionary landscape11 to suggest a description of speciation and of the sudden changes in species known as "punctuated equilibrium." Most population biology focuses on the near neighborhood of a particular species and does not discuss the implications of the existence of a wide variety of metastable fixed points not far from a given point in the "landscape."

In neuroscience, Hopfield12 has used the spin-glass type of function, along with some assumed hardware and algorithms, to produce a simple model of associative memory and possibly other brain functions. His algorithm is a simple spin-glass anneal to the nearest local minimum or pseudo-"ground state." His hardware modifies the $J_{ij}$'s appropriately according to past history of the $S_i$'s, in such a way that past configurations $S = \{S_i\}$ are made into local minima. Thus, the configuration $\{S_i\}$ can be "remembered" and recalled by an imperfect specification of some of its information.

Finally, for our speculations for the future. One of these concerns biologically active, large molecules such as proteins. Hans Frauenfelder and his collaborators have shown that certain proteins, such as myoglobin and hemoglobin, may exist in a large number of metastable conformational substates about a certain tertiary structure.13 At low temperatures, such a protein will be effectively frozen into one of its many possible conformations, which in turn affects its kinetics of recombination with CO following flash photolysis. X-ray and Mössbauer studies offer further evidence that gradual freezing of the protein into one of its conformational ground states does occur. Stein has proposed a spin glass Hamiltonian to describe the distribution of conformational energies of these proteins about a fixed tertiary structure as a first step toward making the analogy between proteins and spin glasses (or possibly glasses) explicit. In any case, this field presents another motivation for the detailed study of complicated random functions and optimization problems connected with them.

Yet another such area is the problem of the immune system and its ability to respond effectively to such a wide variety of essentially random signals with a mechanism which itself seems almost random in structure.
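As an illustration of the associative-memory model described above, the following is a minimal sketch in Python, not a reproduction of Hopfield's own implementation. It assumes the couplings are built by a Hebb-type rule, $J_{ij} = \frac{1}{N}\sum_\mu \xi^\mu_i \xi^\mu_j$ with $J_{ii} = 0$, and that recall is a zero-temperature descent in which each spin is aligned with its local field until nothing changes. The function names (`hebbian_couplings`, `energy`, `relax`), the system size of 64 spins, the three stored patterns, and the eight flipped spins in the cue are all illustrative choices made here.

```python
import random


def hebbian_couplings(patterns):
    """Build symmetric couplings J[i][j] from stored +/-1 patterns (assumed Hebb rule)."""
    n = len(patterns[0])
    J = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-coupling
                    J[i][j] += p[i] * p[j] / n
    return J


def energy(J, s):
    """Spin-glass type energy E = -1/2 * sum_ij J_ij S_i S_j."""
    n = len(s)
    return -0.5 * sum(J[i][j] * s[i] * s[j] for i in range(n) for j in range(n))


def relax(J, s, rng, max_sweeps=100):
    """Zero-temperature descent: align each spin with its local field until stable."""
    s = list(s)
    n = len(s)
    for _ in range(max_sweeps):
        changed = False
        order = list(range(n))
        rng.shuffle(order)  # asynchronous updates in random order
        for i in order:
            h = sum(J[i][j] * s[j] for j in range(n))  # local field on spin i
            new = 1 if h >= 0 else -1
            if new != s[i]:
                s[i] = new
                changed = True
        if not changed:  # reached a local minimum of the energy
            break
    return s


rng = random.Random(0)
n = 64
patterns = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(3)]
J = hebbian_couplings(patterns)

# Imperfect specification: flip 8 of the 64 spins of the first stored pattern.
cue = list(patterns[0])
for i in rng.sample(range(n), 8):
    cue[i] = -cue[i]

recalled = relax(J, cue, rng)
overlap = sum(a * b for a, b in zip(recalled, patterns[0])) / n
print("energy of cue:     ", energy(J, cue))
print("energy of recalled:", energy(J, recalled))
print("overlap with stored pattern:", overlap)
```

Because the couplings are symmetric with zero diagonal, each single-spin update can only lower the energy or leave it unchanged, so the descent ends in a local minimum. Whether that minimum coincides with the stored pattern depends on how many patterns are stored relative to the number of spins, which is exactly the kind of "average property" question the statistical theory is meant to answer.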

REFERENCES

1. S. F. Edwards and P. W. Anderson, J. Phys. F 5, 965 (1975).
2. Original observation: Kittel Group, Berkeley, e.g. Owen, W. Browne, W. D. Knight, C. Kittel, PR 102, 1501-7 (1956); first theoretical attempts, W. Marshall, PR 118, 1519-23 (1960); M. W. Klein and R. Brout, PR 132, 2412-26 (1964); Paul Beck, e.g. "Micto Magnetism," J. Less-Common Metals 28, 193-9 (1972) and J. S. Kouvel, e.g. J. App. Phys. 31, 1425-1475 (1960); E. C. Hirshoff, O. O. Symko, and J. C. Wheatley, "Sharpness of Transition," JLT Phys. 5, 155 (1971); B. T. Matthias et al. on superconducting alloys, e.g. B. T. Matthias, H. Suhl, E. Corenzwit, PRL 1, 92, 449 (1958); V. Canella, J. A. Mydosh and J. I. Budnick, J.A.P. 42, 1688-90 (1971).
3. G. Toulouse, Comm. in Physics 2, 115 (1977).
4. P. W. Anderson, J. Less-Common Metals 62, 291 (1978).
5. N. D. Mackenzie and A. P. Young, PRL 49, 301 (1982); H. Sompolinsky, PRL 47, 935 (1981).
6. M. Mezard, G. Parisi, N. Sourlas, G. Toulouse, M. Virasoro, PRL 52, 1156 (1984).
7. J. J. Hopfield and D. Tank, Preprint.
8. S. Kirkpatrick, C. D. Gelatt and M. P. Vecchi, Science 220, 671-80 (1983).
9. S. Johnson, private communication.
10. P. W. Anderson, PNAS 80, 3386-90 (1983). D. L. Stein and P. W. Anderson, PNAS 81, 1751-3 (1984). D. Rokhsar, P. W. Anderson and D. L. Stein, Phil. Trans. Roy. Soc., to be published.
11. G. Weisbuch, C. R. Acad. Sci. III 298 (14), 375-378 (1984).
12. J. J. Hopfield, PNAS 79, 2554-2558; also Ref. 7.
13. H. Frauenfelder, Helv. Phys. Acta (1984) in press; in Structure, etc.


Measurement in Quantum Theory and the Problem of Complex Systems

Talk given at the "Niels Bohr Centenary Symposium", Oct. 1985 (Copenhagen), The Lesson of Quantum Theory, ed. J. de Boer, E. Dal and O. Ulfbeck (Elsevier Science Publishers, 1986), pp. 23-34

This is the most precise (and concise) statement I can make of my attitude towards the fundamentals of quantum theory. It says nothing which does not seem self-evident to me, yet Leggett disagreed strongly, although he now seems to have changed his mind. I had no idea that the experiment of causing two separately cooled Bose liquids to interfere, mentioned at the end of the paper, would be carried out so soon, as it recently was on two Bose gases. I believe this paper was the first to suggest it, and the outcome was exactly as predicted here.


The Lesson of Quantum Theory, edited by J. de Boer, E. Dal and O. Ulfbeck © Elsevier Science Publishers B.V., 1986

Measurement in Quantum Theory and the Problem of Complex Systems

Philip W. Anderson
Princeton University, Princeton, New Jersey, USA

Contents

1. An approach to the Bohr-Einstein debate
2. Canonical Stern-Gerlach experiment
3. Phase variable of superfluid helium. Broken symmetry
4. Conclusions
References
Discussion

1. An approach to the Bohr-Einstein debate

Every physicist who has studied the quantum theory is familiar with the great debate on the uncertainty principle between Bohr and Einstein which began at the 1927 Solvay Conference but which then simmered on for many years. Perhaps the commonest view is that Bohr simply won hands down; in Einstein's somewhat mistranslated phrase, God does "play dice with the world" and the outcome of a quantum measurement is truly uncertain. In a way, this view corresponds to the truth, or at least to the pragmatic truth that now, after 60 years, there is no doubt in any quantum physicist's mind that the whole quantum theory, including the original quantum prescriptions of measurement theory, is correct in describing the results of experiment. Why, then, re-open the question?

It may be slightly provocative, but surely in a spirit Bohr himself would have enjoyed, to suggest that many of those who have thought beyond the textbook level have come to the conclusion that the really correct answer to which one was right is: neither! It is not true that quantum mechanics is a purely probabilistic theory, any more so than statistical mechanics, for instance. That it is, seems to have been Bohr's point of view, in so far as we can follow his statements about complementarity as a general philosophical principle. In fact, generations of philosophers have been misled, probably against Bohr's intent, to give a deep meaning to this "uncertainty". On the other hand, so far as I can see, one cannot avoid the probabilistic results of measurements on atomic scale systems, as Einstein wished. Einstein's reasons are clear enough: he believed firmly that general relativity was the
appropriate model for future theories, with its fundamental emphasis on the geometry of space-time, and it disturbed him deeply to have that geometry indeterminate and fluctuating—in particular, how can an indeterminate geometry properly maintain causality? The difficulties of quantizing general relativity vindicate Einstein's misgivings to some extent. Whether or not Einstein would have felt comfortable with the present liberties being taken with the structure of space-time is hard to know; one would have thought he would have felt a proprietary interest in four dimensions. It is true that we are beginning to return to his geometrical view and that at some ultimate scale everything may yet change. But really that is a bit of a diversion. I want to talk here about simple non-relativistic quantum theory, although I believe everything I say is equally valid in relativistic field theory at any scale short of the Planck mass, e.g. during most of the events which took place during the "big bang" including the symmetry-breaking phase transitions. What I consider to be the correct approach to the Einstein-Bohr debate began very much with one of my own personal idols, Fritz London. Interestingly enough, London trained originally as a philosopher, and one sees, in following his career, a certain philosophical theme: how can we deal with the quantum theory at the level of macroscopic objects? He is most justly famous for his ideas on the two macroscopic quantum phenomena which most severely test quantum mechanics in this regime, superfluidity and superconductivity; but I refer here to a seminal paper in the theory of measurement by London and Bauer (1939), who I believe were the first to take the point of view I advocate here, that the central problem of measurement theory is not the quantum mechanics of atoms, which is simple and easy, but the fact that macroscopic everyday objects are very difficult indeed for the quantum theory to deal with properly. 
To mis-quote another famous Dane: "The fault, dear Horatio, is not in our atoms but in ourselves." The sticking point is twofold. The first problem is that the quantum theory is in principle not probabilistic but deterministic. The equation for the time variation is a simple deterministic one:

$$ i\hbar \frac{\partial \Psi}{\partial t} = \mathcal{H}\Psi \qquad (1) $$

where $\mathcal{H}$ is a given (in principle) function of a maximal set of arguments of $\Psi$ at the present time or, in relativistic theory, at a space-like surface. Of course, there is a mathematical equivalence between eq. (1) and a sum over classical trajectories, which resembles a stochastic process, but every attempt to replace quantum mechanics with a truly stochastic theory ends in failure. Thus in principle, given an initial $\Psi(t = 0)$, we know precisely what $\Psi$ is. What is more, if quantum mechanics is a complete theory, $\Psi(t)$ should be a complete specification of the physics including measurements, though not necessarily vice versa. What London points out to be the second prong of the problem is that the
