ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS VOLUME 54
CONTRIBUTORS TO THIS VOLUME
P. J. Baum, A. Bratenahl, Lawrence E. Cram, A. T. Georges, Paul H. Holloway, P. Lambropoulos, P. R. Thornton
Advances in Electronics and Electron Physics
EDITED BY L. MARTON AND C. MARTON
Smithsonian Institution, Washington, D.C.
EDITORIAL BOARD
T. E. Allibone, H. B. G. Casimir, W. G. Dow, A. O. C. Nier, E. R. Piore, M. Ponte, A. Rose, L. P. Smith, F. K. Willenbrock
VOLUME 54
1980
ACADEMIC PRESS A Subsidiary of Harcourt Brace Jovanovich, Publishers
New York
London Toronto Sydney San Francisco
COPYRIGHT © 1980, BY ACADEMIC PRESS, INC. ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.
ACADEMIC PRESS, INC. 111 Fifth Avenue, New York, New York 10003
United Kingdom Edition published by ACADEMIC PRESS, INC. (LONDON) LTD. 24/28 Oval Road, London NW1 7DX
LIBRARY OF CONGRESS CATALOG CARD NUMBER: 49-7504
ISBN 0-12-014654-1
PRINTED IN THE UNITED STATES OF AMERICA
80 81 82 83
9 8 7 6 5 4 3 2 1
CONTENTS

CONTRIBUTORS TO VOLUME 54
FOREWORD

Magnetic Reconnection Experiments
P. J. BAUM AND A. BRATENAHL
I. Prolog
II. Introduction
III. Historical Perspective Prior to 1970
IV. Reconnection Theory
V. Reconnection Experiments
VI. Discussion and Conclusions
Appendix I. A Simple Example of an X Point
Appendix II. Reconnection Jargon
Appendix III. Impulsive Flux Transfer and Circuit Transients
References

Electron Physics in Device Microfabrication. II. Electron Resists, X-Ray Lithography, and Electron Beam Lithography Update
P. R. THORNTON
I. Introduction
II. Interactions between a Focused Electron Beam and a Resist-Covered Wafer
III. X-Ray Lithography
IV. Recent Work in Electron Beam Lithography
V. The Relative Roles of X-Ray and Electron Beam Lithography Systems with High Throughput
References

Solar Physics
LAWRENCE E. CRAM
I. Introduction
II. The Solar Interior
III. The Quiet Solar Atmosphere
IV. Solar Activity
References

Aspects of Resonant Multiphoton Processes
A. T. GEORGES AND P. LAMBROPOULOS
I. Introduction
II. Formal Theory of Multiphoton Processes
III. The Quantum Theory of Resonant Two-Photon Processes
IV. The Effect of Nonresonant States
V. Higher-Order Processes
VI. Semiclassical Approaches
VII. Multiple Resonances
VIII. Field Statistics and Bandwidth Effects
IX. Experimental Investigations of Resonant Multiphoton Processes
References

Fundamentals and Applications of Auger Electron Spectroscopy
PAUL H. HOLLOWAY
I. Introduction
II. Fundamentals
III. Experimental Approach
IV. Quantitative AES
V. Sample Damage
VI. Applications
VII. Summary
References

AUTHOR INDEX
SUBJECT INDEX
CONTRIBUTORS TO VOLUME 54
Numbers in parentheses indicate the pages on which the authors’ contributions begin.

P. J. BAUM, Institute of Geophysics and Planetary Physics, University of California, Riverside, California 92521 (1)
A. BRATENAHL, Institute of Geophysics and Planetary Physics, University of California, Riverside, California 92521 (1)
LAWRENCE E. CRAM, Sacramento Peak Observatory, Sunspot, New Mexico 88349 (141)
A. T. GEORGES, Physics Department, University of Toronto, Toronto, Ontario, Canada (191)
PAUL H. HOLLOWAY, Department of Materials Science and Engineering, University of Florida, Gainesville, Florida 32611 (241)
P. LAMBROPOULOS, Physics Department, University of Southern California, Los Angeles, California 90007 (191)
P. R. THORNTON, GCA Corporation, Burlington, Massachusetts 01803 (69)
FOREWORD

The articles in this volume characterize the broad range of subjects that fall into the category of electron physics. In addition, electromagnetic phenomena are reviewed in P. J. Baum and A. Bratenahl’s contribution on magnetic reconnection experiments. Lawrence E. Cram’s review of solar physics, as pure physics, stands in contrast to the down-to-earth industry-oriented article on microfabrication by P. R. Thornton. Midway between these contributions are two articles that deal with both pure and applied physics, the first by A. T. Georges and P. Lambropoulos on multiphoton processes and the second by Paul H. Holloway on Auger spectroscopy. We trust our readers will find this volume to be a valuable survey of five vital areas in current electron physics research and thank our authors for their splendid presentations. As is our custom, we present a list of articles to appear in future volumes of Advances in Electronics and Electron Physics.
Critical Reviews:
A Review of Application of Superconductivity: W. B. Fowler
Sonar: F. N. Spiess
Electron-Beam-Controlled Lasers: C. A. Cason
Amorphous Semiconductors: H. Scher and G. Pfister
Design Automation of Digital Systems. I and II: W. G. Magnuson and Robert J. Smith

Spin Effects in Electron-Atom Collision Processes: H. Kleinpoppen
Review of Hydromagnetic Shocks and Waves: A. Jaumotte & Hirsch
Seeing with Sound: A. F. Brown
Large Molecules in Space: M. and G. Winnewisser
Recent Advances and Basic Studies of Photoemitters: H. Timan
Josephson Effect Electronics: M. Nisenoff
Present Stage of High Voltage Electron Microscopy: B. Jouffrey
Noise Fluctuations in Semiconductor Laser and LED Light Sources: H. Melchior
X-Ray Laser Research: C. A. Cason and M. Scully
The Impact of Integrated Electronics in Medicine: J. D. Meindl
Electron Storage Rings: D. Trines
Radiation Damage in Semiconductors: N. D. Wilsey and J. W. Corbett
Solid-State Imaging Devices: E. H. Snow
Spectroscopy of Electrons from High Energy Atomic Collisions: D. Berenyi
Solid Surfaces Analysis: M. H. Higatsberger
Surface Analysis Using Charged Particle Beams: F. P. Viehbock and F. Rüdenauer
Sputtering: G. H. Wehner
Photovoltaic Effect: R. H. Bube
Electron Irradiation Effect in MOS Systems: J. N. Churchill, F. E. Homstrom, and T. W. Collins
Light Valve Technology: J. Grinberg
High Power Lasers: V. N. Smiley
Visualization of Single Heavy Atoms with the Electron Microscope: J. S. Wall
Spin Polarized Low Energy Electron Scattering: D. T. Pierce and R. J. Celotta
Defect Centers in III-V Semiconductors: J. Schneider and V. Kaufmann
Atomic Frequency Standards: C. Audoin
Reliability: H. Wilde
Microwave Imaging of Subsurface Features: A. P. Anderson
Electron Scattering and Nuclear Structure: G. A. Peterson
Electrical Structure of the Middle Atmosphere: L. C. Hale
Microwave Superconducting Electronics: R. Adde
Biomedical Engineering Using Microwaves. II: M. Gautherie and A. Priou
Computer Microscopy: E. M. Glasser
Collisional Detachment of Negative Ions: R. L. Champion
International Landing Systems for Aircraft: H. W. Redlien and R. J. Kelly
Impact of Ion Implantation on Very Large Scale Integration: H. Ryssel
Ultrasensitive Detection: K. H. Purser
Physics and Techniques of Magnetic Bubble Devices: M. H. Kryder
Radioastronomy in Millimeter Wavelengths: E. J. Blum
Energy Losses in Electron Microscopy: B. Jouffrey
Long-Life High-Current-Density Cathodes: R. T. Longo
Interactions of Measurement Principles: W. G. Wolber
Low Energy Atomic Beam Spectroscopy: E. M. Horl and E. Semerad
History of Photoelectricity: W. E. Spicer
Fiber Optic Communications: G. Siegel
Electron Microscopy of Thin Films: M. P. Shaw

Supplementary Volumes:
Applied Charged Particle Optics: A. Septier
Microwave Field Effect Transistors: J. Frey

Volume 55:
Cyclotron Resonance Devices: R. S. Symons and H. R. Jory
Microwave Systems for Industrial Measurements: W. Schilz and B. Schiek
Photodetachment and Photodissociation of Ions: T. M. Miller
Photodiodes for Optical Communication: J. Müller
Heavy Doping Effects in Silicon: R. P. Mertens, R. J. Van Overstraeten, and H. J. De Man
As in the past, we have enjoyed the friendly cooperation and advice of many friends and colleagues. Our heartfelt thanks go to them, since without their help it would have been almost impossible to issue a volume such as the present one.

L. MARTON
C. MARTON
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 54
Magnetic Reconnection Experiments

P. J. BAUM AND A. BRATENAHL
Institute of Geophysics and Planetary Physics University of California Riverside, California
I. Prolog
II. Introduction
III. Historical Perspective Prior to 1970
  A. Dungey’s Paradox
  B. Sweet’s Paradox
  C. Process Rates
IV. Reconnection Theory
  A. Analytic Approaches
  B. Numerical Approaches
V. Reconnection Experiments
  A. Applied Research Experiments Exhibiting Reconnection
  B. Testing Reconnection Theory in the Laboratory
VI. Discussion and Conclusions
Appendix I. A Simple Example of an X Point
Appendix II. Reconnection Jargon
Appendix III. Impulsive Flux Transfer and Circuit Transients
References
I. PROLOG

This review is intended to give the general reader an overview of the present status of laboratory magnetic “reconnection” experiments while also providing the specialist with a unified critical picture in some detail. It is necessary to give some attention to the theory of the subject as well as its developmental history, not just to clarify definitions and terminology, but more especially to explain the purpose and objectives of laboratory “reconnection” experiments, and perhaps show how results to date may influence the future development of the theory. We present many researchers’ work, but it can be noticed that our own is accorded more space. That happens partly because we understand this to be the usual custom in this type of review and partly because we are most familiar with our own work. “Reconnection” theory began in 1953, and although laboratory experiments specifically designed to test the theory did not begin for another 10 years, this
review covers the period 1953-1979, with emphasis on the late 1960s and the 1970s. Referencing ends in mid-1979. The word “reconnection” appears here in quotation marks because much of the literature on the subject, especially the earlier literature, has treated the process as a “moving together” of oppositely directed field lines, leading to a new configuration through their “breaking” and “rejoining new partners” and, on occasion, even to their mutual “annihilation.” This moving-field-line concept, by aiding visualization, has provided the basis for many ideas concerning “reconnection,” but its nonphysical nature can lead to misleading conclusions (Alfvén, 1976), and block the way to the use of more powerful methodologies. We therefore restrict our attention to measurable physical quantities and concepts derivable from them. Our use here of the traditional term “reconnection” is merely to identify the general subject matter and is not intended to imply any connotation of “moving field lines” that “do” anything. The reader, however, will find the quotation marks deleted from now on.
II. INTRODUCTION
In this section we are concerned with what is meant by magnetic-field reconnection in a general sense and why there is interest in its study. The magnetic vector field B is a local quantity, but the field possesses also a spatial structure expressed by its field lines, which are its integral characteristics (Morozov and Solov’ev, 1966). Both the local field vector and its spatial structure are uniquely determined by a second vector field, the current density and its spatial structure, although the inverse is not true. In general, the field line structure may be analyzed in terms of its topological elements, which may include: one or more separatrix surface distributions of field lines; separator lines where a separatrix appears to intersect itself; and null points of various kinds where the field vanishes. The general subject may be called magnetomorphology. The separatrix partitions the magnetic flux into cells, each distinguished by the unique linkage of its field lines with respect to the currents. It is obvious that any change whatever in the currents, including the introduction of new currents, will result in changes in the allocation of magnetic flux between the cells, including the possible development of new cells. Faraday’s law requires that any change of flux is accompanied by an inductive (rotational) electric field, and the Faraday electric field along a separator where three cells meet measures the rate of flux loss (gain) of two of the cells, and a corresponding gain (loss) of the third cell. Such flux changes among cells constitute reconnection in its broadest electrodynamic sense, and it will be
appreciated that its topological basis requires an appropriate system-wide definition. The definition of “system” in this sense requires careful consideration in order that the electrodynamic and topological aspects of the reconnection problem can be properly expressed. We shall shortly return to this point. However, little interest would be generated in the problem of reconnection on the basis of the electrodynamic and topological aspects alone and in the absence of a plasma medium. It is, of course, the rich variety of plasma dynamic effects associated with reconnection that commands interest in the whole subject. Plasma dynamics enters the problem in several distinct ways. First, as the medium of conveyance of electromagnetic energy throughout the system from sources to sinks. In this way plasma can act as a partner with changes in the sources, thus contributing to the cause for which reconnection is the response or effect. Second, the plasma within an inner portion of the system may be in a higher potential energy (pressure) state than that outside it, being confined in equilibrium by a particular topological structure of the magnetic field defined by a combination of internal plasma currents and fixed external currents. The plasma may then find a means to escape this confinement through a rearrangement of the internal currents and corresponding changes in the topological structure of the magnetic field through reconnection (formation of magnetic islands through the tearing mode instability). Third, and of greatest interest here, the plasma can interfere strongly with the detailed process of reconnection itself, making it necessary for the expenditure of electromagnetic energy to compress, accelerate, and otherwise energize any plasma that gets in its way (Bratenahl et al., 1979). 
In this interference process, compressed-plasma sheets and currents are built up along and in the neighborhood of separator lines, and this buildup of new structures constitutes the temporary storage of potential energy. Under appropriate conditions, and with significant amounts of magnetic and plasma energy thus stored, instabilities can develop, releasing this energy impulsively. This release mechanism or impulsive flux transfer event (IFTE) then offers itself as a prime candidate to explain solar flares and magnetospheric substorms (Russell and McPherron, 1973). Reconnection is also an essential ingredient in a self-excited dynamo that can maintain a magnetic field against ohmic losses. Thus, it turns out that interest in the problem of reconnection is multidisciplinary: not only is it an important issue in cosmic plasma physics, but it also presents one of the more important plasma containment problems that must be overcome in the practical achievement of controlled fusion as an energy source. The simplest manifestation of reconnection arises in the interpenetration of the magnetic fields of two independent current systems. By “independent”
we mean that these current systems are sufficiently immune to back-reactions from the reconnection system defined by their fields that cause-and-effect chains can be isolated, so that the reconnection problem is well posed and determinate. Examples of two-current systems are easy to visualize (Figs. 1 and 2). In each case, the separatrix has been accented by a heavy line, and its point of self-intersection marks the location of the separator. The separatrix defines three flux cells: the two cells whose field lines link one or the other of the two currents, called parent cells, and the cell whose field lines link both, called the daughter cell (Bratenahl and Baum, 1976a). Stenzel and Gekelman (1979a) refer to these as the private and the public flux regions, but we prefer to emphasize the cellular structure defined by the separatrix and its separator. In general, the separator connects a pair of magnetic null points of semidivergence type (McDonald, 1954), but in cases of degenerate axial or translational symmetry, the separator will be a locus of x-type neutral points. A simple example of this latter type is discussed in Appendix I. (Most toroidal fusion devices, such as the tokamak, involve at least three independent current systems, and the topology is more complicated.) Figure 1a represents the impingement of solar-wind-driven southward interplanetary field on the earth’s dipole field. Figure 1b illustrates the interaction of the fields of two conducting rods carrying parallel currents. Figure 1c depicts the field of four sunspots, assumed to represent the erupted portions of two subphotospheric flux bundles defined by solenoidal current systems. In Fig. 2 we see other ways of generating what we call the characteristic three-cell topology of two current systems. Also demonstrated is the dependence of the allocation of flux among the three cells upon the angle between a uniform field and the axis of a dipole field.

FIG. 1. Three expressions of the three-celled field topology of two primary current sources A and B. (a) Magnetosphere: source A in earth, B in interplanetary space. (b) The double inverse pinch device (DIPD): sources A and B are conductors on which externally driven currents increase with time. (c) Field of two bipolar sunspot groups. Current sources A and B are schematically represented by subphotospheric solenoids. [From Bratenahl and Baum (1976a).]

FIG. 2. Other three-celled field topologies. (a) Bipolar sunspot interacting with horizontal field. (b) Bipolar sunspot interacting with vertical field. (c) Field at border of plage regions of opposite polarity. (d)-(f) A dipole immersed in a uniform field showing how the flux content of the three cells changes with field orientation. [From Bratenahl and Baum (1976a).]

Although at the outset of theoretical work on the reconnection problem Dungey (1958b) and also Sweet (1958a,b) clearly recognized its system-wide topological aspects, these were quickly put aside in favor of investigating the local plasma dynamics in the neighborhood of magnetic null points, particularly x-type neutral points. This concentration on local effects with insufficient attention paid to distant causes and distant effects was, perhaps, the very natural consequence of pursuing a new line of theoretical investigations on a purely deductive basis without a parallel interactive effort in the laboratory. In addition to the unfortunate introduction of the notion of moving magnetic field lines, which excludes the physically valid and more powerful method of superposition, an orthodoxy soon developed that expressed itself by saying that because of the huge difference of scale, obtainable plasma regimes, and “wall effects,” laboratory reconnection experiments can bear little or no relevance to the problem of reconnection on the cosmic scale. This viewpoint has some merit if the experimental objective is to produce a scale model of cosmic processes, but it quite misses the mark if the objective is directed at testing the assumptions and approximations in the corpus of the theory. The essential point is this: the study of reconnection in the laboratory forces upon the investigator an awareness of the system-wide aspects of the problem, much as an electrical engineer must consider the functioning of a whole system, and the cause-effect chains within it due to the interactive couplings between its component parts or subsystems. For instance, although the detailed nature of an instability that might develop may be quite different in the laboratory and cosmic contexts, nevertheless, the ultimate cause leading inevitably to some kind of instability and the ultimate consequence of that instability may, in fact, be very similar. This can be of great assistance in learning how to pose the right questions, fundamental questions that can lead to a proper definition of the nature of the problem. Moreover, an adequate theory should be equally capable of explaining reconnection phenomena whether it be in the laboratory or in space. We conclude from the above that in order to pose the electrodynamic and topological aspects of the reconnection problem as a determinate problem, the system under consideration must include the entire domain of the flux cells that engage in the exchanges and transfers of flux. In the discipline of fusion energy research, this has become the modus operandi for the obvious reason that laboratory experiment is the raison d’être of theory, so that theory and laboratory experiments have, of necessity, become closely integrated. In cosmic physics, on the contrary, the “orthodoxy” referred to above has interfered with such an integration.
The result at the present time is that reconnection theory in the cosmic physics context has largely ignored laboratory evidence and has concerned itself almost exclusively with the so-called restricted problems (Vasyliunas, 1975): the plasma dynamics in a neighborhood of x-type neutral points, neutral line, or neutral sheet, a neighborhood that has been excised out of the three-cell system topology with the assignment of an arbitrarily but conveniently chosen boundary. Theoretical work on the restricted problem has been mostly confined to steady plasma flows despite the fact that a principal objective, the understanding of flares and substorms, involves impulsive phenomena. Moreover, Cowley (1975) seems to have demonstrated that the steady restricted problem is not well posed. Inductive (rotational) electric fields are not considered, nor could they be introduced in a self-consistent way since this would require keeping track of the changes in flux content of the various cells, and these are not defined in the excised system. On the other hand, such time-dependent studies of the restricted problem as have been made do not seem
to lead to steady solutions (Sweet, 1969). This presents one of several paradoxes that have arisen, and it is the resolution of such paradoxes that provides a strong motivation for laboratory reconnection experiments designed to test theory. More will be said of these paradoxes in what follows. This introduction would not be complete without an attempt at a formal definition of reconnection. This is not as easy as it might seem because of the wide variety of situations in which it can arise. Within the context of the restricted problem, Vasyliunas (1975) defines reconnection as (1) the plasma dynamic process in which there is a plasma flow across a separatrix. In the same context, reconnection might alternatively be defined as (2) the plasma dynamic process in which there is an electric field along a separator. These local definitions address complementary features concerning just one aspect of the problem. The local region behaves like a nonlinear circuit element, and its response in any particular situation depends on the overall system structure and what is taking place throughout. In general, the electric field is the sum of rotational and irrotational contributions:

$$\mathbf{E} = -\frac{\partial \mathbf{A}}{\partial t} - \nabla\phi$$
These two component fields are coupled to the plasma dynamics in completely different ways. Under certain circumstances they can be separately measured in the laboratory through integral measurement techniques. In fact the first, relating to Faraday’s law governing changes in the flux and its distribution among the cells, can redistribute space charges, even producing double layers, whereas the second, deriving from these space charges, local or distant, can be severely modified by the first. The restricted problem cannot address these issues. Accordingly, a system-wide definition has been proposed (Bratenahl and Baum, 1976a, amended): (3) Reconnection is the transfer of flux from parent to daughter cells or vice versa, accompanied by the compression, acceleration, and energization of any plasma that gets in its way, and this work is performed at the expense of the electromagnetic field. This is closely related to Sweet’s original definition (Sweet, 1958a): (4) Reconnection is the interpenetration of two flux tubes that differ in the connectivity of their field lines. Definitions (3) and (4) are related also to a definition within the fusion discipline: (5) Reconnection is a change in the magnetic topology involving the development of a new separatrix structure defining one or more magnetic islands enclosing additional magnetic axes in a system initially containing just one such axis. (A magnetic axis is a field line in a toroidal geometry that closes on itself after a finite number of turns around the toroidal direction.) The experiments to be discussed herein are mainly those relating to reconnection theory within the discipline of cosmic plasma physics. However, there have been outstanding important instances of transfers of new concepts
from the fusion discipline. One example is the tearing-mode instability of the sheet pinch, and its experimental evidence must be included. We shall see some of the effects of this interweaving of the two disciplines in Section III.
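The three-cell topology of two current systems described in this section can be illustrated with a short numerical sketch. It is not taken from the review: it assumes the idealized vacuum field of two infinitely long, parallel straight wires (a crude stand-in for the two rods of Fig. 1b), builds the field by superposition, and uses the flux function A_z to decide which cell a point belongs to. The geometry, the current values, and all function names are hypothetical choices made only for illustration.

```python
import math

MU0 = 4.0e-7 * math.pi  # permeability of free space (H/m)


def line_current_field(x, y, x0, y0, current):
    """(Bx, By) at (x, y) from an infinite straight wire along z through (x0, y0)."""
    dx, dy = x - x0, y - y0
    r2 = dx * dx + dy * dy
    k = MU0 * current / (2.0 * math.pi * r2)
    return (-k * dy, k * dx)


def flux_function(x, y, x0, y0, current):
    """A_z of the same wire, up to a constant; field lines are curves of constant A_z."""
    r = math.hypot(x - x0, y - y0)
    return -MU0 * current / (2.0 * math.pi) * math.log(r)


def total_field(x, y, wires):
    """Superpose the fields of all wires; each wire is a tuple (x0, y0, current)."""
    bx = sum(line_current_field(x, y, *w)[0] for w in wires)
    by = sum(line_current_field(x, y, *w)[1] for w in wires)
    return bx, by


def total_flux(x, y, wires):
    return sum(flux_function(x, y, *w) for w in wires)


# Hypothetical geometry: two rods 0.1 m apart, each carrying 10 kA along +z.
WIRES = [(-0.05, 0.0, 1.0e4), (0.05, 0.0, 1.0e4)]

# By symmetry the x-type null (the separator, seen end-on) sits midway between the rods.
NULL_FIELD = total_field(0.0, 0.0, WIRES)
A_SEPARATRIX = total_flux(0.0, 0.0, WIRES)  # the separatrix is the contour through the null


def flux_cell(x, y):
    """Classify a point by comparing its flux function with the separatrix value."""
    if total_flux(x, y, WIRES) < A_SEPARATRIX:
        return "daughter"  # field lines here encircle both currents
    return "parent A" if x < 0.0 else "parent B"  # field lines encircle one current


if __name__ == "__main__":
    print("field at midpoint:", NULL_FIELD)
    for point in [(-0.05, 0.01), (0.05, -0.01), (0.0, 0.2)]:
        print(point, "->", flux_cell(*point))
```

Because the field is assembled by plain superposition, a change in either current changes the separatrix value and therefore the allocation of flux among the cells, which is the bookkeeping that the system-wide definition of reconnection relies on.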
III. HISTORICAL PERSPECTIVE PRIOR TO 1970

Reconnection had its roots in the early attempts to explain solar flares. Thus Giovanelli (1946, 1947, 1948, 1949) associated flares with electrical discharges at x-type neutral points. The dynamics of this x-point process was first considered by Dungey (1953, 1958a,b). Many others have followed his lead. A partial list of researchers who have worked in this field appears in Appendix II, which lists some of the technical jargon of the subject along with the earliest referenced use that we have been able to discover. Dungey (1953) immediately noticed that once a current is started along the separator, there is a remarkable tendency for the x-point structure to collapse into a neutral current sheet, like a pair of scissors, accompanied by a spontaneous increase in the current. He interpreted this to be an instability, and thus we have Dungey’s Paradox and Sweet’s Paradox.
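The scissor-like closing of an x point that carries current can be made concrete with a small sketch. It is an illustration under stated assumptions, not the construction used in Appendix I: the field is taken as B = (B0 y/a, B0 alpha^2 x/a), a common textbook parameterization in which alpha = 1 is the current-free x point, the uniform current density along the separator is j_z = B0 (alpha^2 - 1)/(mu0 a), and the separatrix branches are the lines y = ±alpha x. The names `b0`, `a`, and `alpha` are hypothetical parameters.

```python
import math

MU0 = 4.0e-7 * math.pi  # permeability of free space (H/m)


def x_point_field(x, y, b0, a, alpha):
    """Linear x-point field B = (b0*y/a, b0*alpha**2*x/a); it vanishes at the origin."""
    return (b0 * y / a, b0 * alpha ** 2 * x / a)


def current_density(b0, a, alpha):
    """Uniform j_z from Ampere's law: mu0*j_z = dBy/dx - dBx/dy."""
    return b0 * (alpha ** 2 - 1.0) / (MU0 * a)


def separatrix_opening_deg(alpha):
    """Angle (degrees) between the branches y = +alpha*x and y = -alpha*x,
    measured across the y axis. It is 90 degrees for the current-free case."""
    return 180.0 - 2.0 * math.degrees(math.atan(alpha))


if __name__ == "__main__":
    b0, a = 0.1, 0.05  # hypothetical field scale (T) and length scale (m)
    for alpha in (1.0, 2.0, 5.0, 20.0):
        print(
            f"alpha={alpha:5.1f}  j_z={current_density(b0, a, alpha):.3e} A/m^2  "
            f"opening={separatrix_opening_deg(alpha):6.2f} deg"
        )
```

In this model, larger current corresponds to a smaller opening angle, so the separatrix branches fold toward the y axis and the configuration tends toward a sheet. The sketch says nothing about what drives the current, which is the point at issue in the paradoxes discussed next.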
A. Dungey’s Paradox

The Lorentz force of a current at an x point distorts the field in such a way that the current is increased, and with the increased current, distortion is increased still further, in violation of Lenz’s law. (Dungey believed that this obvious violation of Lenz’s law could be discounted by saying that Lenz’s law only applies to rigid conductors.)

(1) Resolution. Lenz’s law cannot be applied to an open system as Dungey did. The entire current circuit must be taken into account in order to apply Poynting’s theorem. Dungey’s instability interpretation was generally believed to be correct until Imshennik and Syrovatskii (1967), using Poynting’s theorem, showed it to be a cumulation or storage of electromagnetic energy from external sources. In other words, the current increase is not spontaneous but is related to an influx of electromagnetic energy from sources outside the system under consideration.

(2) Comment. Consideration of the whole current circuit is almost never done by workers in this field, and numerous errors have resulted therefrom. Appendix I shows the relation between Dungey’s increasing current and external EMFs to drive it. The collapse of the x point led Sweet and almost everyone else since to believe that the end result was the formation of a single current sheet, which experiment now clearly shows is not necessarily
the case. Dungey never mentioned a single sheet, however. Instead, he noted the current density would tend to increase without limit and then made the correct suggestion that an intervening instability would prevent this from happening.

B. Sweet’s Paradox

Interpreting Dungey’s collapsing x as a tendency to form single-current sheets, Sweet (1958a) was led to conclude that pushing two bipolar sunspot groups together would result in a flattening of their convex field systems against each other as “motor tyres when loaded.”

(1) Resolution. Experimentally it has been demonstrated that pushing convex fields together can actually increase their convexity toward each other rather than the converse, flattening (Bratenahl and Yeates, 1970; Bratenahl, 1972).

(2) Comment. Since Sweet’s publication in 1958, but long before this laboratory demonstration, the picture of oppositely directed flat fields diffusing into a neutral sheet “of length 2L and 2l” and annihilating each other has become a sacrosanct principle dominating nearly all thought on the subject. It was to become known as Sweet’s mechanism in Parker’s classic monograph (Parker, 1963). Parker came close to denying out of sheer frustration that magnetic fields have anything to do with flares, because it seemed to him impossible to get Sweet’s field annihilation process to go fast enough.

Sweet (1958a,b) and Parker (1957, 1963, 1973a) considered a two-dimensional planar geometry with a neutral sheet of width $2L$ and thickness $2l$ sandwiched between antiparallel fields. The problem was considered time-stationary and the sheet was treated as a boundary layer. Application of the Bernoulli equation then showed that the plasma exits the edges of the sheet at the Alfven speed $v_A$ corresponding to the inflow field and density conditions. The inflowing plasma carries the “frozen in” field lines to be annihilated up to the boundary at the velocity $u$, or Alfven Mach number $M_A = u/v_A < 1$. For stationary conditions, the inflow velocity $u$, which is also the flux transport velocity, must match the magnetic diffusion velocity $(\mu_0 \sigma l)^{-1}$, thus determining the sheet thickness $2l$. Conservation of mass then yields the inflow velocity

$$u = v_A (\mu_0 \sigma v_A L)^{-1/2}$$

which may be conveniently expressed in nondimensional form (Parker, 1963)

$$M_A = u/v_A = R_m^{-1/2}, \qquad R_m = \mu_0 \sigma v_A L,$$

where $R_m$ is historically called the magnetic Reynolds number, whereas it is in fact the Lundquist number. This, then, is said to be the magnetic field annihilation rate for Sweet’s mechanism. It turns out to be far too slow to account for the energy release rate in flares. Petschek (1964) then introduced a weak normal component of the field in the downstream portions of Sweet’s boundary layer (Fig. 3). These portions then become pairs of slow-mode shock waves that separate slightly from each other, between which the outflowing plasma now streams at a local Alfven Mach number considerably greater than 1. The effect of this is to reduce the effective width $2L^*$ of the diffusion region to a value of the general order $2l$. The annihilation rate then becomes
which has been claimed to be fast enough to explain flares (see Section III,C). Furthermore, the presence of the normal component in the exit flows restored the original concept of reconnection, which had been forgotten in Sweet’s mechanism. For a while it almost seemed as if the flare problem had been solved, yet some disturbing questions remained. The Petschek mechanism

FIG. 3. The classical Petschek mode. Field lines are represented by light solid lines with arrows and the separatrix is drawn as dashed lines. The slow shocks lie along the dark solid curves. L is the width of the diffusion region. Heavy arrows indicate flow directions. [From Baum and Bratenahl (1977) adapted from Petschek (1964).]

is like a powerful blowtorch, burning steadily, fueled by a supply of electromagnetic energy from outside but operating at the rate $M_A$ solely by virtue of a prodigious store of magnetic potential energy represented by the currents in its stationary shocks and diffusion region. In contrast, flares are catastrophic impulsive events, like the failure of a dam, sending out large-amplitude disturbances in various directions. There was some speculation about a suitable trigger for Petschek’s model but little concern seems to have been given to the store of magnetic energy represented by its current system, even less to the security of this stored energy against gross instability. We see in Section III,C that this lack of attention given to the problem of the stored energy and its security against possible instability is a prime example of the ineffectiveness of carrying on a theoretical investigation purely deductively, with no laboratory experiments to provide guidance. The use of more and better observations has its proper place in the scheme of things, but is a poor substitute for the kinds of provocative questions that can arise in the interaction between theory and experiment dedicated to theory.

We have already indicated that reconnection also has roots in the fusion energy research problem of confinement of hot plasmas by magnetic fields. The earliest contribution is a paper entitled “Finite Resistivity Instabilities of a Sheet Pinch” (Furth et al., 1963) giving details of an instability in which the field line topology is rapidly changed from that of a simple sheet pinch sandwiched between antiparallel fields into a row of ordinary pinches with a separator (x-type neutral line) between each. The term “tearing mode” was given this instability, signifying the tearing apart of the sheet current into filaments. Since the tearing mode leads to an enhancement of magnetic energy dissipation, it was immediately incorporated into a flare model (Jaggi, 1964).
This model does draw directly upon the energy stored in the sheet, but again something is missing. How does the sheet acquire sufficient energy without going unstable prematurely? The restricted problem approaches employed in these models simply lack sufficient information to establish cause-effect relationships. The Petschek model deals with the conversion of electromagnetic energy flowing in at a prescribed rate from outside the system but ignores the available energy already stored in its internal current system. On the other hand, the Jaggi model, while dealing with the current sheet itself, ignores the inflow of electromagnetic energy from outside, which is responsible for building it up. It would be most helpful if the local problem could be separated from its system-wide environment and solved independently. Recently developed evidence indicates that this may not be possible (Cowley, 1975). We return to this point in Section VI.
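
As a numerical illustration of the Sweet scaling discussed earlier in this subsection, the following Python sketch evaluates the Lundquist number, the Alfven Mach number, and the sheet half-thickness. It assumes SI units and reads the scaling as $R_m = \mu_0 \sigma v_A L$ and $M_A = R_m^{-1/2}$; the function names and the sample values of conductivity, Alfven speed, and half-width are hypothetical, chosen only to show the orders of magnitude and not taken from the text.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in H/m (classical value)


def sweet_rate(sigma, v_alfven, half_width):
    """Return (R_m, M_A, u) for Sweet's mechanism.

    Assumed reading of the scaling:
      R_m = mu0 * sigma * v_A * L   (Lundquist number)
      M_A = R_m ** -0.5             (nondimensional inflow rate)
      u   = M_A * v_A               (inflow, i.e. flux transport, velocity)
    """
    r_m = MU0 * sigma * v_alfven * half_width
    m_a = r_m ** -0.5
    return r_m, m_a, m_a * v_alfven


def sheet_half_thickness(sigma, inflow_speed):
    """Half-thickness l fixed by matching inflow to diffusion: u = 1/(mu0*sigma*l)."""
    return 1.0 / (MU0 * sigma * inflow_speed)


# Hypothetical illustrative values (not from the text):
sigma = 1.0e6        # conductivity, S/m
v_a = 1.0e6          # Alfven speed, m/s
half_width = 1.0e7   # sheet half-width L, m

r_m, m_a, u = sweet_rate(sigma, v_a, half_width)
l = sheet_half_thickness(sigma, u)

print(f"R_m = {r_m:.3e}, M_A = {m_a:.3e}, u = {u:.3e} m/s, l = {l:.3e} m")
```

With these assumed inputs the Mach number comes out many orders of magnitude below unity, which is the sense in which Sweet's rate is described as far too slow. The two helper functions also reproduce mass conservation, $uL = v_A l$, as a consistency check on the reading of the formulas.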
C. Process Rates
If the large and rapid energy releases in solar flares and magnetospheric substorms are to be explained in terms of magnetic reconnection, then an adequate theory should provide a suitable measure of the energy release rate $Q$. Three totally different process rates appear in the theoretical literature. Only one of these can be related directly to $Q$.
(1) The Alfven Mach number $M_A$ is used, as we have already seen, as a nondimensional measure of the flux transport velocity in curl-free regions, and when evaluated immediately upstream of the diffusion region, it is widely used as a measure of the reconnection rate, consistent with definition (2), Section II, and valid within the limitations of the restricted problem analysis of steady flow. It has proved useful in the comparison of different diffusion mechanisms such as those of Sweet and Petschek, as well as parametric studies within a given theory. However, beyond providing a very rough estimate, $M_A$ is not particularly helpful in establishing the claim that Petschek’s mechanism is fast enough to explain flares. Neither is it particularly useful in the analysis of laboratory experiments. The use of $M_A$ as a measure of the reconnection rate is an inheritance of the assumed validity of the notion of moving field lines and frozen-in flow, but it is not dependent on that assumption (Sonnerup, 1979). We believe, however, that the product of the velocity and the field, consistent with definition (2), Section II, would be more useful as a rate measure than the ratio $M_A$.

(2) A process rate frequently appearing in the literature, even further removed from the energy release rate, is the growth rate of the tearing-mode instability. This is the rate of development of the perturbation field in the small-amplitude linear regime corresponding to the wavenumber mode of most rapid development. Thus far, the tearing-mode instability has been most fully developed for a neutral current sheet of infinite width in which the perturbation field develops a periodic structure of alternating o- and x-type neutral points. The theory has been well confirmed in laboratory experiments with cylindrical neutral sheets (Section V,A). In order to apply the tearing mode to the reconnection problems encountered in cosmic plasma physics, however, a theory needs to be developed for tearing modes in current sheets of finite width, containing a separator line and crossed by a normal component of the field ab initio. This form of tearing mode is more aptly called current sheet rupture (Syrovatskii, 1975).

(3) The flux transfer rate $S$ is most helpful in the analysis of reconnection in systems-in-the-large. It gives full expression to the electrodynamic and topological aspects of reconnection in terms of the system’s separatrix-separator structure in accord with definition (3), Section II, and is also
consistent with definition (2). The basis of the flux transfer rate is the line integral of the Faraday electric field, taken around one of the flux cells on a path along the separator and passing through the neutral points. The actual direct measurement is quite feasible in certain laboratory experiments (Section V,B,5), but can only be inferred indirectly in solar and space observations. Important general conclusions can be drawn from the flux transfer rate determination and certain other related measurements of the electric field. This fact underscores once again the importance of investigating reconnection mechanisms in the laboratory as an adjunct to theoretical development.

From the resolution of the electric field into its rotational and irrotational components, Eq. (1), it is clear that $S$ depends only on the former so that in the system-wide analysis a distinction can be made between time-dependent and time-stationary processes. In the restricted problem analysis this distinction cannot be made unambiguously. In cosmic plasma physics, only a small subset of configurations admit time-stationary electrodynamic processes; the only example given here that can be time-stationary is shown in Fig. 1a. Even with these exceptions, time-stationary processes are strongly militated against both by the plasma dynamics and by naturally occurring fluctuations in the boundary conditions. On the other hand, the irrotational electric field can also strongly influence the local flow dynamics in the neighborhood of the separator, and in the case, for instance, of Fig. 1a, it can also couple the external interplanetary convection electric field to the internal magnetospheric convection electric field. The term “magnetoelectric coupling” has been suggested for this latter effect (Bratenahl and Baum, 1977).
In the laboratory, it has been found possible, through integral measurement techniques not at all feasible in solar and space observations, to make a local determination of both components of Eq. (1) (J. Nickel, personal communication, 1979; Beeler, 1979; Stenzel and Gekelman, 1979b). These methods show great promise in the analysis of current-carrier runaway processes and the possible development of electric double layers. On the system-wide scale, the concept of the flux transfer rate $S$ and its determination in the laboratory leads, through an equivalent-circuit analysis, directly to a flux equation linking that rate with the rate of release of stored energy associated with the reconnection current system, and with the externally applied EMF (Bratenahl and Baum, 1976a,b; Baum et al., 1978) (see Appendix III). In the flux equation,

$$\frac{d\phi}{dt} + \gamma\phi = V, \qquad (2)$$

$\phi$ is the flux to be reconnected, $\gamma$ the ratio of the effective resistance $R$ to the self-inductance $L$ associated with the separator current system including
Petschek shocks if present, and $V$ the externally applied EMF tending to provoke reconnection. This EMF represents the changes in the geometry and/or strength of the primary current sources of the field, the cause for which reconnection is the effect. The flux transfer rate is $S = -\gamma\phi$. The magnetic energy release (conversion) rate is $Q = I^2R = \gamma\phi^2/L = S^2/\gamma L$. The lower limit of the magnetic energy storage is $U_m = \phi^2/2L = S^2/2\gamma^2L$, with an additional contribution depending on the mutual inductance between the separator current system and the primary currents. If a constant $V$ is switched on at $t = 0$ and $\gamma$ remains constant, Eq. (2) yields

$$S = -\gamma\phi = -V(1 - e^{-\gamma t}),$$

so that asymptotically in time

$$S \to -V = S_\infty, \qquad Q \to V^2/\gamma L \propto \sigma S_\infty^2, \qquad U_m \to V^2/2L\gamma^2 \propto L\sigma^2 S_\infty^2. \qquad (3)$$

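
The step response just described can be checked numerically. The Python sketch below assumes that the flux equation has the first-order form $d\phi/dt + \gamma\phi = V$, which is inferred here from the stated solution $S = -V(1 - e^{-\gamma t})$ rather than quoted from the original. It compares a Runge-Kutta integration with the closed form and evaluates the asymptotic values of $S$, $Q$, and $U_m$. The function names and the numerical values of $V$, $\gamma$, and $L$ are hypothetical.

```python
import math


def flux_solution(V, gamma, L, t):
    """Closed-form step response of dphi/dt + gamma*phi = V with phi(0) = 0.

    Returns (phi, S, Q, U_m) using the relations in the text:
      S = -gamma*phi, Q = gamma*phi**2/L, U_m = phi**2/(2*L).
    """
    phi = (V / gamma) * (1.0 - math.exp(-gamma * t))
    S = -gamma * phi
    Q = gamma * phi ** 2 / L
    U_m = phi ** 2 / (2.0 * L)
    return phi, S, Q, U_m


def integrate_flux(V, gamma, t_end, steps=2000):
    """Fourth-order Runge-Kutta integration of dphi/dt = V - gamma*phi from phi(0) = 0."""
    dt = t_end / steps
    phi = 0.0

    def rhs(p):
        return V - gamma * p

    for _ in range(steps):
        k1 = rhs(phi)
        k2 = rhs(phi + 0.5 * dt * k1)
        k3 = rhs(phi + 0.5 * dt * k2)
        k4 = rhs(phi + dt * k3)
        phi += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return phi


# Hypothetical circuit values (not from the text): EMF in V, gamma in 1/s, L in H.
V, gamma, L = 100.0, 1.0e5, 1.0e-7

t_check = 10.0 / gamma
phi_num = integrate_flux(V, gamma, t_check)
phi_exact = flux_solution(V, gamma, L, t_check)[0]

t_late = 50.0 / gamma  # many e-folding times, effectively the asymptotic state
phi_late, S_late, Q_late, U_late = flux_solution(V, gamma, L, t_late)

print(f"phi numeric = {phi_num:.6e}, phi exact = {phi_exact:.6e}")
print(f"S -> {S_late:.3e} (compare -V), Q -> {Q_late:.3e}, U_m -> {U_late:.3e}")
```

Because $\gamma = R/L$, lowering the resistance at fixed $V$ and $L$ raises both the asymptotic release rate and, more steeply, the stored energy, which is the scaling with conductivity noted in the text.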
We note that $S$ approaches the vacuum rate $S_\infty = -V$ (as though there were no plasma present), the energy release rate is proportional to the conductivity, and the stored energy is proportional to the product of the inductance and the square of the conductivity. This last is particularly significant. It implies that the capacity of the energy storage reservoir (equality of filling and emptying rates) increases with the inductance, with the square of the conductivity, and with the square of the external EMF. Of even greater significance to the flare and substorm applications is the obvious fact that the integrity of the energy storage reservoir depends either on the stability of the current system geometry ($L$) or the stability of the conduction mode ($\sigma$), raising the possibility that failure of either can bring about failure of the other, thus compounding the catastrophic effect. The solution of Eq. (2) with $\gamma = \gamma_0 e^{rt}$ switched on at some later time $t_1 > 0$ is found to agree with laboratory observations of impulsive flux transfer events (IFTE) (Section V,B,5 and Appendix III). This implies that the restricted problem analysis with stationary or quasi-stationary flows merely describes the process of filling the reservoir (inflow exceeding outflow). The explosive event of interest lies therefore entirely outside the scope of the restricted problem analysis, and a new direction for the theoretical effort is urgently called for. This illustrates precisely what we had in mind on p. 11 when we stressed the importance and relevance of laboratory reconnection experiments as a guide to the theoretical effort.

IV. RECONNECTION THEORY

Theoretical approaches to the problem of reconnection include the tearing-mode instability, shock-wave models such as Petschek’s, and various
similarity solutions and self-similar collapse solutions. Numerical approaches include computer solutions by Ugai and Tsuda, Sato, and Brushlinskii et al. These are briefly covered in the next sections.

A. Analytic Approaches

1. Tearing Mode

In its earliest form (Furth et al., 1963), the tearing mode assumes that a current sheet is established between antiparallel fields and that the current-carrying plasma is compressed in a sheet pinch in static equilibrium. Plasma external to the pinch is ignored. If no instability were to intervene, one would expect a slow diffusion of the antiparallel fields into the sheet where they would annihilate each other. The initial configuration is shown in the top portion of Fig. 4 (adapted from Van Hoven, 1979). The current sheet is located along the dashed line with the current flowing out of the page (z direction). It was clear from theory that the tearing mode would rearrange the current sheet long before field annihilation could be effective. In the tearing mode, the sheet is “torn” into a periodic ribbon or current filament structure. The

FIG. 4. The tearing-mode configuration with initial antiparallel fields at the top. Current is concentrated along the center dashed line. When the current is diverted (“torn”) as shown by the arrow on the middle panel, x points and o points are established periodically along the sheet. The o point region is clearly a sink for energy. The x point regions are also sinks but control the Poynting flux to the o points. Upstream magnetic energy is converted to downstream kinetic energy and heat. [Adapted from Van Hoven (1979).]

points of current minima become x-type neutral points; the points of current maxima lie in “magnetic islands” with o-type neutral points at their centers. Altogether, there is an alternating string of x-o-x-o-type neutral points along the previous location of the current sheet. One complete island is shown in the bottom portion of Fig. 4 and here the open arrows indicate both the flow directions (from J x B) and the Poynting flux (from E x B/$\mu_0$). The result of the tearing mode is to convert upstream magnetic energy into downstream kinetic energy and Joule heat. Thus, the island is labeled a Joule heating sink, while in the “source-sink” region a portion of the energy is converted to kinetic energy. The true source regions lie outside the region considered. The entire current sheet region is actually a sink since $\mathbf{j} \cdot \mathbf{E} > 0$ along its entire length. The growth time of the collisional or resistive tearing mode is $\tau \approx (\tau_R \tau_A)^{1/2}$, where $\tau_R$ is the resistive time and $\tau_A$ the Alfven time. More complex versions of the tearing mode are treated by Biskamp and Schindler (1971), Galeev and Zelenyi (1976), Waddell et al. (1976), White et al. (1976), Drake and Lee (1977), Satya and Schmidt (1978, 1979), Bateman (1978), and Van Hoven (1979).
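
The growth-time estimate quoted above, read here as the geometric mean $\tau \approx (\tau_R \tau_A)^{1/2}$, can be put into numbers with a short Python sketch. The definitions $\tau_R = \mu_0 \sigma a^2$ and $\tau_A = a/v_A$ for a sheet of half-thickness $a$ are the customary ones and are assumptions here, since the text does not spell them out; the function names and sample laboratory-like values are likewise hypothetical.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in H/m (classical value)


def resistive_time(sigma, a):
    """Resistive diffusion time across a sheet of half-thickness a (assumed form)."""
    return MU0 * sigma * a ** 2


def alfven_time(a, v_alfven):
    """Alfven transit time across a sheet of half-thickness a (assumed form)."""
    return a / v_alfven


def tearing_growth_time(tau_r, tau_a):
    """Geometric-mean estimate of the resistive tearing growth time."""
    return math.sqrt(tau_r * tau_a)


# Hypothetical laboratory-like values (not from the text):
sigma = 1.0e4   # conductivity, S/m
a = 1.0e-2      # sheet half-thickness, m
v_a = 1.0e5     # Alfven speed, m/s

tau_r = resistive_time(sigma, a)
tau_a = alfven_time(a, v_a)
tau = tearing_growth_time(tau_r, tau_a)

print(f"tau_R = {tau_r:.3e} s, tau_A = {tau_a:.3e} s, tau = {tau:.3e} s")
```

Whenever $\tau_R > \tau_A$, the estimate falls between the two time scales, which expresses why tearing is expected to rearrange the sheet well before simple resistive annihilation becomes effective.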

2. Wave-Assisted Diffusion Modes

Petschek (1964) introduced the concept that reconnection could proceed in a steady flow supporting two slow-mode shock waves that propagate upstream at the same rate at which plasma flows downstream. This mode is shown in Fig. 3, where the separatrix is shown as dashed lines and the solid arrows indicate plasma-flow directions. Plasma is ejected between the slow shock waves (solid lines) that lie downstream of the separatrix. Levy et al. (1964) adapted the Petschek mechanism to the magnetosphere. The Petschek mode and subsequent modifications appear in Fig. 5. Figure 5a shows the first quadrant of the Petschek solution with the addition of a flow streamline shown dashed after Vasyliunas (1975). The slow mode shock locus lies along the line OA. Sonnerup (1970) also obtained a stationary flow solution shown in Fig. 5b. In Sonnerup’s solution three uniform flow regions are pieced together with two wave structures OA and OB. OA is the slow shock and OB, drawn as a discontinuity, is actually a slow-mode expansion fan, which seems to require external physical corners at B for its initiation. The current along OA is out of the page, whereas along OB it is into the page. Therefore, the sign of $\mathbf{J} \cdot \mathbf{E}$ indicates that OB is a generator, decelerating plasma as it passes but increasing the field. OA is a sink, accelerating plasma that crosses it but decreasing the tangential component of B. There is one case corresponding to the highest Alfven Mach number where there is no net energization since the magnitudes of the reversed currents along OA and OB are equal. Figure 5c shows the similarity solution derived

FIG. 5. Typical magnetic field lines (solid) and streamlines (dashed) in the first quadrant for various two-dimensional reconnection theories: (a) Petschek’s mechanism with one slow shock; (b) Sonnerup’s model with a second discontinuity (OB); (c) Yeh and Axford’s model; (d) Priest and Soward. OB is an Alfven line here with no discontinuity. [Adapted from Priest and Soward (1976).]

by Yeh and Axford (1970), which includes Sonnerup’s solution as a special case. Finally, in Fig. 5d we see the variant of Petschek’s model obtained by Soward and Priest (1977; Priest and Soward, 1976). Here there is only one discontinuity, OA. The shock OA is now curved rather than straight and shows the surprising property that the field strength increases as one approaches the neutral point O from above. Extension of the Yeh-Axford type analysis to three dimensions has been carried out by Rosenau (1977, 1979), who finds shocks unnecessary. Extension of theory to collisionless plasma was done by Coroniti and Eviatar (1977). Kaw (1976) finds a link between Petschek’s mechanism and tearing theory.

The models of the previous subsection assumed time stationarity, whereas
in this section explicit time dependences are found. The first solution of this type is that of Chapman and Kendall (1963; see also Uberoi, 1963; Chapman and Kendall, 1966), in which an incompressible plasma column in a hyperbolic magnetic field collapses like a scissors closing to form a neutral-current sheet of infinite width in finite time. Imshennik and Syrovatskii (1967) found similar behavior in the compressible case. Syrovatskii (1968) pointed out that the boundary conditions corresponding to the final collapse to a sheet were strictly unobtainable, “the singularity as q → 0 in the considered self-similar solution can be obtained only as a result of an unlimited increase of the potential of the external field, i.e., at infinitely large external currents and electric fields.” Forbes and Speiser (1979) continue the analysis, observing that in order to achieve infinite current density “what is required is very high conductivity.” In fact, the conductivity must be infinite and again infinite current density is not obtained.

3. IFTE and Sheet Rupture

In this section we consider the transient reconnection processes called impulsive flux transfer events (IFTE) (Bratenahl and Baum, 1976a,b; Baum et al., 1978), and sheet rupture (Syrovatskii, 1975; Somov and Syrovatskii, 1975; Kirii et al., 1977; Bulanov et al., 1977). These two processes are related to the tearing process but with some differences. Recall that the FKR (Furth et al., 1963) tearing theory treats the field of the current sheet as well as a field along the current sheet. There is assumed to be no field component perpendicular to these components as would be established by finite external sources parallel to the current layer. The current sheet tears into periodically spaced filaments treatable by Fourier analysis. Sheet rupture deals with one or two tears of the current sheet and can accommodate background field components due, e.g., to a quadrupole field. The electric field arising during sheet rupture appears to be effective in accelerating particles. The theory of sheet rupture is more difficult than simple tearing so that analytic results are presently incomplete. In the case of IFTE (Baum and Bratenahl, 1975; Baum, 1978a,b), it is assumed that current flows along the separator of the background field, e.g., the potential field of Fig. 1b. The current distribution need not have the form of a single-current sheet; it may be distributed also along shock waves. The tearing or diversion of this induced-current system serves the same function as sheet rupture. A difference between simple tearing and IFTE and sheet rupture is the preexistence of an x-type neutral point in the background field B, which forms a fixed preferred location for rupture since the J x B force and the E x B Poynting flux are channeled into two downstream regions on opposite sides of the x point. In simple tearing, the x point and thus the
points of deflection of the J x B force and the E x B Poynting flux are determined at the time of tearing. Again the analytic theory is incomplete because of the added complexity of the background field. Smith (1977) points out that oscillatory solutions can exist.
B. Numerical Approaches
While analytic approaches are conceptually and sometimes quantitatively helpful, the numerical computer approach seems even more useful. Provided sufficient attention is paid to initial and boundary conditions, quite complex problems become soluble. There remain some purely technical problems such as numerical diffusion, the tendency for finite difference schemes to become unstable, and, e.g., the propensity of plasmas to assume anomalous transport coefficients. Therefore, some comparison with experimental evidence is desirable. The reader is referred to the references listed in the following papers for earlier numerical work.

1. Ugai and Tsuda

In a series of papers (Ugai and Tsuda, 1977, 1979a,b; Tsuda and Ugai, 1977) Ugai and Tsuda have solved a two-dimensional MHD problem assuming a current sheet in static equilibrium as the initial condition. Related work has been done by Amano (1977) and Amano and Tsuda (1977). At time t = 0 the resistivity is specified to rise locally by a factor of 100, which reduces the current and produces a neutral point at the origin. As the tearing proceeds, current layers resembling Petschek’s (1964) develop downstream. Immediately after t = 0, the power dissipation rate

$$P = I^2 R_{\mathrm{eff}} = I^2 (R + dL/dt)$$

must jump to approximately 100 from its initial value of 1. Thereafter, the neutral point current drops from 1 to approximately 0.2 (power from 100.0 to 4.0) and starts to rise until at $t \approx 15$ units it is found that it has recovered to nearly one again. Therefore, since the resistivity is specified to remain constant (having made a transition from a lower value at t = 0), they have produced two “flares,” one at $t \gtrsim 0$, the second at $t \approx 15$. If the solutions could be carried forward in time, it may be discovered that the solution is an aperiodic relaxation oscillation analogous to the periodic relaxation oscillation described by Baum et al. (1978). In their case, the recovery of the effective resistance $R_{\mathrm{eff}}$ must involve the inductive term $dL/dt$ rather than the resistance $R$. In order for the recovery to occur, current must have been propagated from the sources to the neutral point.
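
The bookkeeping behind these normalized figures can be laid out in a few lines of Python. This is only an arithmetic sketch of $P = I^2 R_{\mathrm{eff}}$ in the normalized units used above; it takes the minimum current as roughly 0.2, the value consistent with the quoted power of 4, and the function name and stage labels are hypothetical.

```python
import math


def dissipated_power(current, r_eff):
    """Normalized dissipation P = I**2 * R_eff."""
    return current ** 2 * r_eff


# Normalized values read from the discussion above (current, effective resistance).
stages = {
    "before t = 0": (1.0, 1.0),            # initial state, P = 1
    "just after t = 0": (1.0, 100.0),      # resistivity raised a hundredfold
    "current minimum": (0.2, 100.0),       # current has dropped to about 0.2
    "recovery near t = 15": (1.0, 100.0),  # current back to nearly one
}

powers = {name: dissipated_power(i, r) for name, (i, r) in stages.items()}

for name, p in powers.items():
    print(f"{name}: P = {p:.1f}")
```

The second rise in dissipation, with the resistivity held fixed, comes entirely from the recovery of the current, which is why the text attributes it to current propagated in from the sources.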
Ugai and Tsuda (1979a,b) conclude that the reconnection rate is controlled by local conditions rather than boundary conditions. One must bear in mind that by “reconnection rate” they refer to the upstream Alfven Mach number $M_A$ (they define also an “intrinsic reconnection rate,” which is the maximum electric field ever developed at the neutral point). In their case, the Mach number is determined by whatever upstream conditions specify the current density J and therefore J x B. By requiring the resistivity to increase by a factor of 100, the current density must therefore decrease also. Since the cause of the resistivity transition is not specified, the reason for the increased Mach number is also not specified. It is apparent, however, that in a more complete analysis, the resistivity transition could be the result of a local threshold for current instability and, therefore, the enhanced reconnection could be said to be due to “local” or at least “regional” effects that control the accumulated energy derived from Poynting flux from the sources. At the same time, there would be no induced currents were it not for the effects of distant boundary conditions. Therefore, the time average reconnection rate (electric field or Mach number) should depend on the distant boundary conditions. Instantaneous or rapid impulsive processes are predominantly regionally controlled, whereas their energy build-up is controlled by global sources in conjunction with regional effects such as stagnation points for Poynting flux arising from symmetry.
2. Sato and Hayashi

These authors (Hayashi and Sato, 1978; Sato and Hayashi, 1979; Sato, 1979) have also studied the behavior of a current sheet in an MHD computer analysis. They consider the plasma resistivity to be zero everywhere initially. As the solution proceeds, currents build up and when j exceeds a critical threshold, the anomalous resistivity is assumed to occur locally. These authors observe the development of slow shocks downstream from the separatrix as in Fig. 6. Sato (1979) has examined the Rankine-Hugoniot shock conditions, finding them well satisfied across the current layers. He also finds from the plasma flow velocity that the shocks exceed the slow-mode phase velocity, concluding that the current ridges are indeed slow-mode shocks. He finds also that removal of the free expansion downstream boundary condition slows the speed of the downstream plasma jets. A solution has not been obtained in the long-time limit because of numerical instabilities.

3. Brushlinskii et al.

Brushlinskii et al. (1977, 1978) have also studied the numerical behavior of a two-dimensional MHD reconnection model (see also Gerlakh and

FIG. 6. Isometric presentation of the time development of electric current as computed by Sato (1979). Note the formation of pairs of slow-mode shocks. [From Sato (1979).]

Syrovatskii, 1976). They allow the plasma conductivity to make a transition to anomalous resistivity when the current density j exceeds a threshold level. They also include the effects of radiation cooling. They find pairs of slow shocks attached to the ends of the current sheet. The asymptotic location of the shocks is most strongly dependent on plasma conductivity. The shock locus moves farther downstream as the conductivity increases. The current sheet also thins and lengthens as the conductivity increases.

V. RECONNECTION EXPERIMENTS

Twelve types of reconnection experiments are represented schematically in Fig. 7, both in early and late temporal stages [with the exception of (h)

FIG. 7. Schematic drawings of 12 reconnection experiments at early and late times, as interpreted by the present authors. (a) The triax tubular pinch device of Anderson and Kunkel (1969). [From Baum and Bratenahl (1977).] (b) The θ-pinch experiment of Irby et al. (1979). (c) The θ-pinch experiment of Bodin (1963). [From Baum and Bratenahl (1977).] (d) The m = 2 tearing mode in tokamak, e.g., Bateman (1978). (e) The UCR flat plate device of Baum and Bratenahl (1976). (f) The pulsed quadrupole of Frank (1976) modified from Baum and Bratenahl (1977). (g) The double inverse pinch device of Bratenahl and Baum (1976b). (h) The steady state (DC) quadrupole experiment of Overskei and Politzer (1976). (i) The implosive multipole of Cowan and Freeman (1973). (j) The triple inverse pinch of Baum et al. (1976). (k) An annular pinch experiment by Alidières et al. (1968). [From Baum and Bratenahl (1977).] (l) The pancake pinch of Dailey (1972). [From Baum and Bratenahl (1977).]

which is time stationary]. Each will be discussed in the subsequent sections. Experiments (a)-(d), (i), and (k) are all fusion-research-related in which reconnection in the form of the tearing mode is an unintended and highly troublesome feature. Experiment (l) represents an effort in space propulsion research in which reconnection is explored. These applied research experiments are covered in Section V,A. All the other experiments, (e)-(h), and (j) with (e) representing three variants, were undertaken for the sole purpose of putting reconnection theory to the test in the laboratory. These are covered in Section V,B, in a sequence that reflects increasing levels of our familiarity with them. Our own experiment (g) is therefore presented last and in greater detail.

A. Applied Research Experiments Exhibiting Reconnection

1. Triax

The tubular pinch or triax device shown in Fig. 7a was developed more or less concurrently with the Furth-Killeen-Rosenbluth (1963) theory of the resistive tearing-mode instability. As shown in Fig. 8, an inverse pinch is launched outward simultaneously with the inward development of an ordinary Z or axial pinch. Upon collision at an intermediate radius, the two cylindrical current sheets unite into one, sandwiched between oppositely directed magnetic fields, so that the surface of initial contact becomes a true magnetic neutral sheet. Under moderate-pinch current conditions, the tubular pinch undergoes a few damped oscillations about its equilibrium position, but with sufficiently high current, the sheet goes unstable, breaking up or tearing into a set of equally spaced filaments (Fig. 9). This filamentation is obvious in the 25 μm argon and 125 μm helium sequences; however, it was not discovered for some time that the higher-pressure argon sequences also tore because high luminosity near the electrodes masked the effect photographically. Therefore, tearing seems to have been discovered theoretically (Furth et al., 1963) shortly before it was discovered experimentally (Anderson, 1964; Anderson and Kunkel, 1969). Fisher (1960) and Kunkel (1960) theoretically treated cylindrically symmetric properties of triax. Subsequently a numerical code (Killeen, 1964) was developed for the triax configuration leading to a satisfactory fit between theory and experiment, but with some remaining uncertainties for reasons listed by Anderson and Kunkel; in particular, the magnetic Reynolds number in the sheet was highly uncertain. No completely satisfactory theory seems to have been produced to explain the neutrons associated with a secondary instability. Presumably it was associated with a current diversion away from the filaments themselves, being described only as a “fast short-wavelength instability” (Anderson and Kunkel, 1969).
FIG. 8. Top and side schematics of the triax device of Anderson and Kunkel. Plasma occupies the hatched area and is compressed by the opposing Lorentz forces of an inverse pinch and an ordinary pinch. [Adapted from Anderson and Kunkel (1969).]
2. Theta-Pinch Experiments

Theta (θ)-pinch experiments are presented in Fig. 7b,c. The former is the fast θ pinch at the University of Maryland (Irby et al., 1979) and the latter is the θ-pinch experiment of Bodin (1963). The experiment of Irby et al. (1979) is distinguished from most θ-pinch experiments by its detailed diagnostics. The magnetic flux surfaces have been measured as a function of time (Fig. 10) in one quadrant. Figure 10 shows that what these authors call "forced reconnection" has produced one x point toward the right end of the
FIG. 9. Top-view framing camera photographs of the triax tubular pinch discharge in argon and helium. Sequences: 100 μ argon, 25 μ argon, and 125 μ helium; frame times 0.3, 1.0, 1.7, 2.4, 3.1, and 3.8 μsec. These represent the first evidence for the tearing-mode instability. [From Anderson and Kunkel (1969).]
machine. (Notice the compressed scale of the z axis for ease of presentation.) All lines above the separatrix (dashed line) link around the outside wall of the device, while those below the separatrix link through the plasma. In Fig. 10b we see that several other magnetic islands are produced, mainly through “spontaneous reconnection” or tearing, and that these islands later coalesce, reducing the total number of islands. Irby et al. (1979) attribute the rapid growth rate of tearing in their device to their large ion gyroradius. As shown in Fig. 7c, the long θ pinch described by Bodin (1963) tore into a number of cylindrical rings. These rings later coalesced to form one O line circling the axis of the device. Similar behavior was observed by Benford et al. (1968) and subsequently also in a pinch experiment by Altynsev and Krasov (1975). This latter experiment, however, demonstrated tearing in the collisionless regime.

3. Tokamaks
Still another experiment that unfortunately demonstrates tearing is the tokamak (Fig. 7d) and here we see the schematic formation of an m = 2
FIG. 10. First-quadrant view of measured flux surfaces in a fast θ pinch. Field lines are shown at 0.40, 0.52, 0.60, and 0.76 μsec with a constant flux spacing of 2500 G cm². Horizontal axis: z (cm). [From Irby et al. (1979).]
toroidal resistive tearing mode. It is fairly widely held that the disruptive instability involves the growth of an m = 2 tearing mode (e.g., Morton, 1976) that becomes so wide that it interacts with the limiter. Both a small and a large disruptive instability are shown in the streak photographs of Fig. 11, where we are viewing the minor radius of a torus. It is seen that the plasma luminosity expands to the wall, later thinning. At the large disruptive instability there is an abrupt negative spike in the diamagnetic voltage loop signal (inverted) and an increase in the oscillations of the poloidal magnetic field. The large disruptive instability not only terminates confinement, but can also rupture the vacuum vessel through relativistic particle bombardment.

4. Annular Pinch
The annular pinch of Alidieres et al. (1968) (Fig. 7k) exhibited tearing at just one point, and only after a considerable period of quiescent development. The delay led the authors to conclude that something quite different from the tearing mode was at work. In the light of more recent developments,
FIG. 11. A small disruptive event at t ≈ 12.1 msec and a large disruptive event at t ≈ 13.0 msec in the ATC tokamak. Negative voltage loop signal (top), framing camera filmstrips of the edge of the torus in equatorial tangential view (middle), and poloidal magnetic field probe signal (bottom). Time axis: 12.0-14.0 ms. [Adapted from Jacobsen (1975).]
however, we might suggest the occurrence of a normal tearing mode following a delay caused by temporary stabilization by a low-density highly conducting plasma external to the pinch (see Sections V,B,4 and V,B,5).

5. Other Experiments
We conclude this presentation of experiments in applied research that exhibit reconnection with a brief account of four more. Figure 7i shows a multipole experiment performed by Cowan and Freeman (1973) in which flux is compressed and forced to reconnect by conventional explosive means. Figure 7j is the triple inverse pinch experiment reported by Baum et al. (1976). The innermost flux cell formed an intense ordinary pinch that ultimately became unstable to kink and sausage instabilities and abruptly disappeared entirely. The pancake pinch (Dailey, 1972) shown in Fig. 7l is similar to the preceding experiment, but with interchanged coordinates. Finally, we might mention that Zukakishvili et al. (1978) reported observing reconnection phenomena in the collapse of a planar z pinch.
B. Testing Reconnection Theory in the Laboratory

As noted in Section II, reconnection theory applied to cosmic plasma physics was allowed to develop for 10 years without the support of testing and guidance through laboratory experiments. Moreover, it was a widely held belief (the “orthodoxy”), still influential today, that no such laboratory effort could be considered relevant in this discipline. Nevertheless, a pioneering effort was initiated in 1963 at the Jet Propulsion Laboratory, Pasadena, California, by one of the present authors (A. Bratenahl). In due course, this effort spread to other laboratories in the U.S., Japan, and the Soviet Union.

1. UCLA Large Flat-Plate Device
The most recent addition to this effort, and also the most elaborate to date, is the experiment of Stenzel and Gekelman (1979a), schematically illustrated by the right-hand diagram of Fig. 7e. A quadrupole field is produced along the axis of the cylindrical vacuum chamber by passing parallel current along a pair of flat conductors. Added to this field is an axial field produced by external coils. A low-pressure collisionless argon plasma is then produced with an axially applied electric field and with the aid of a large oxide-coated cathode at one end. The results at an early time (20 μsec) are presented in Fig. 12a,b, showing, respectively, the magnetic field vectors and the contours of constant |B|. A single x-type neutral point is located at (X, Z) ≈ (4, −1). At 60 μsec (Fig. 12c,d) the neutral point has shifted to (X, Z) ≈ (−4, −2). It is apparent from Fig. 12b that considerable current has built up just upstream of the neutral point. Stenzel and Gekelman (1979) refer to this current as a “neutral layer with Petschek slow shocks. . .” although the shocks do not appear to be separated from one another as in Petschek’s model (Figs. 3 and 5a). They also observe that the induced electric field and the electrostatic electric field components are each relatively large separately, but add vectorially to produce a small total electric field. In more recent observations, Gekelman and Stenzel (1979) observe “forced tearing,” a concept that recognizes that reconnecting systems can be regarded as the superposition of two or more current systems (Baum and Bratenahl, 1977), which may have different time variation (Bratenahl and Baum, 1977). The
FIG. 12. Vector magnetic field B (a, c) and field amplitude (|B|) contours (b, d) in the UCLA large flat electrode plasma device. Horizontal axes: x (cm). [From Stenzel and Gekelman (1979a).]
combination of these effects is used to explain the motion of neutral points and the appearance or disappearance of more neutral points (Gekelman and Stenzel, 1979; Roederer, 1977). Impulsive events, however, have not yet appeared in this experiment.

2. Other Flat-Plate Devices

Ohyabu et al. (1972, 1974) and Ohyabu (1974) had previously performed a similar experiment in the collisional regime and without the axial magnetic field, in which the plates of Fig. 7e are replaced by four rods. They observed a sudden anomalous increase in plasma resistivity at the neutral point and a rise in temperature up to several keV. The magnetic topology is somewhat uncertain. Related experiments with an application to fusion injectors have been performed by Okamura et al. (1975) and Okamura (1978).

The flat-plate experiment of Baum and Bratenahl (1976), Fig. 7e, produces elliptical inverse pinches that are launched from two plates covered with insulating ceramic. The temporal behavior of the flux surfaces is shown in Fig. 13 at four times; the persistence of one neutral point is seen, as well as the transfer of magnetic flux to the daughter cell outside the separatrix. Figures 14 and 15 present unpublished data of P. Baum, A. Bratenahl, and G. Crockett. Figure 14 shows the isocurrent contours, demonstrating a fine structure in the current sheet resembling two pairs of slow shocks. One quadrant of these data also appears as the bottom panel of Fig. 15d, where
FIG. 13. Measured field line maps in the UCR flat-plate device at t = 3.5, 4.0, 5.0, and 6.0 μsec. Identical flux surfaces (field lines) have the same letter designation. All lines are equally spaced in flux per unit length except for line a and the separatrix (dashed). The line calibrations are ( × Wb/m). [From Baum and Bratenahl (1976).]
(a) 6.25    (b) 20.8    (c) 41.7    (d) 62.5
(e) 83.3    (f) 104.2   (g) 125.0   (h) 145.8
(i) 166.7   (j) 187.5   (k) 208.3   (l) 229.2
(m) 250.0   (n) 270.8   (o) 291.7   (p) 312.5
finer contours are presented. Figure 15 shows the current density j and the mean surface current J along the current sheet (obtained by integrating the current density across the sheet). At both 3.5 and 5.0 μsec the origin is the locus of maximum current density. However, by 5.0 μsec the surface current maximum has moved from the origin to a point approximately 2.5 cm downstream. Correspondingly, the current density contours (Fig. 15b,d) show a
FIG. 14. Measured isocurrent contours at 5.0 μsec in the UCR flat-plate device. The first quadrant is shown at double this line density in Fig. 15d. Shock separation is evident. [From P. J. Baum, A. Bratenahl, and G. Crockett, unpublished.]
“forced” tearing that is not sufficiently strong to appear in Fig. 14 and is therefore “hidden” (Bratenahl and Baum, 1977). This device is very similar in design and operation to the DIPD (Section V,B,5).
3. MIT DC Quadrupole

Figure 7 represents the quadrupole experiment of Overskei and Politzer (1976), which is run in a DC fashion. Since the quadrupole magnetic field and the applied electric field are time stationary, the result is a stationary flow pattern that probably consists of cross-field flows as depicted, although these are not discussed by these authors. Plasma is continuously supplied at the rods but is probably supplied at a slow rate that provides a low density; thus the system is constantly in a state of ion-acoustic plasma turbulence and anomalous resistivity. This experiment provided an excellent means to study the phenomena of anomalous resistivity near neutral points. It also showed evidence for the onset of the interchange instability at the flux surface of zero curvature, which is located downstream from the separatrix (Overskei,
FIG. 15. Measured currents and current contours at 3.5 μsec (a, b) and 5.0 μsec (c, d) in the UCR flat-plate device. Each panel presents data only in the first quadrant of the device. j(x, 0) is current density along the center of the plasma sheet; J(x) is surface current integrated across the plasma sheet. Horizontal axis: x (cm). [From P. J. Baum, A. Bratenahl, and G. Crockett, unpublished.]
1976). The fluctuations associated with the interchange instability were clearly distinguished from the ion-acoustic turbulence near the neutral point. It seems clear from this that the interchange instability (Parker, 1973b) is not needed for "fast" reconnection.

4. The TS-3 Experiment
The TS-3 experiment (Syrovatskii et al., 1973; Bogdanov et al., 1975; Frank, 1976; Dreiden et al., 1977, 1978; Kirii et al., 1979) was initiated to test the concepts of dynamic dissipation of magnetic fields (Syrovatskii,
FIG. 16. System of conductors producing a magnetic field with a null line (a straight quadrupole): (a) Current flow; (b) vacuum chamber in the quadrupole magnetic field, i.e., cross section perpendicular to the null line. The rectangles are the conductors; the solid curves show the magnetic lines of force. [From Frank (1976).]
1966) and sheet rupture (Syrovatskii, 1975, 1976, 1977a). In simplest terms, TS-3 develops an axial dynamic pinch in a fixed quadrupole background field. The quadrupole field is first established by a system of external conductors arranged in a baseball seam pattern (Fig. 16a,b). Plasma is injected and a fast pinch discharge is then initiated along the axis. Due to the presence of the quadrupole field, the cylindrical pinch quickly deforms into a plasma sheet (electron density N_e ≈ 1-5 × 10¹⁵ cm⁻³), shown in Fig. 17 as derived from laser holography. The current distribution in the sheet, determined by a Rogowski loop, is represented by its isodensity contours in Fig. 18b. It is significant that no evidence could be found for the presence of Petschek-type shocks, and Frank concluded that they are not produced in this experimental arrangement (we comment further on this in Section VI). Dreiden et al. (1977) succeeded in producing current sheets with a width-to-thickness ratio Δx/Δy ≈ 12 before tearing or sheet rupture finally occurred. Simple tearing theory would predict that Δx/Δy ≥ 2π should be sufficient for sheet rupture. Syrovatskii (1975), however, had predicted that the sheet should be stabilized against tearing by the presence of a high-conductivity plasma “coat” external to the sheet and should therefore require the larger value of Δx/Δy, as observed.
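The comparison between the simple tearing threshold and the observed aspect ratio can be made concrete with a short calculation. The following is a minimal Python sketch, not taken from the cited work; the function name is hypothetical, and treating the threshold as a sharp cutoff at 2π is an illustrative assumption.

```python
import math

# Simple tearing theory, as quoted in the text, suggests that a sheet of
# width dx and thickness dy can rupture once dx/dy reaches about 2*pi.
# Treating this as a sharp cutoff is an assumption made for illustration.
SIMPLE_TEARING_THRESHOLD = 2.0 * math.pi


def exceeds_simple_threshold(aspect_ratio: float) -> bool:
    """Return True if the aspect ratio dx/dy meets the simple criterion."""
    return aspect_ratio >= SIMPLE_TEARING_THRESHOLD


# Aspect ratio reported by Dreiden et al. (1977) before rupture occurred.
observed_ratio = 12.0
margin = observed_ratio / SIMPLE_TEARING_THRESHOLD

print(f"simple threshold dx/dy = {SIMPLE_TEARING_THRESHOLD:.2f}")
print(f"observed ratio / threshold = {margin:.2f}")
print("exceeds simple threshold:", exceeds_simple_threshold(observed_ratio))
```

The observed sheet survived to roughly twice the simple threshold, which is the discrepancy that the stabilizing plasma “coat” is invoked to explain.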
FIG. 17. Electron density distributions at four sequential times during the formation of the current sheet: (a) t = 0.40, (b) 0.46, (c) 0.52, (d) 0.63 μsec. The initial magnetic field gradient is h, = 0.9 kOe/cm. The plasma is produced in argon at a pressure of 2 x 10- * Torr; E, = 250 V/cm. [From Dreiden et al. (1978).]
FIG. 18. Profile of the current density in the plane perpendicular to the null line, j_z(x, y), at t = 0.5 μsec. h, = 920 Oe/cm; 0 discharge in helium; p o = lo-’ Torr; n > 2 × 10¹⁴ cm⁻³; E, = 235 V/cm. Horizontal axis: X (mm). [From Frank (1976).]
In the measurements reported by Kirii et al. (1977) a rapid local increase in the magnetic field component normal to the sheet is observed. This increase lasts 0.1-0.2 μsec and corresponds to the passage of current concentrations from (usually) the center to the edges of the sheet. These authors identify this as the form of tearing-mode instability called sheet rupture. Rupture was obtained by lowering the upstream plasma density so that the stabilizing effect of the plasma coat was weakened.

5. Double Inverse Pinch Device
The double inverse pinch device (DIPD) was developed by A. Bratenahl in 1963 to test the elements of reconnection theory. The DIPD is still in operation at the University of California, Riverside. The initial objective was to see whether, by pushing two convex field systems together, the two would indeed flatten against each other to form a neutral sheet as in Sweet’s mechanism (cf. Section II,B), or whether, on the contrary, initial reconnection would proceed fast enough to lead directly to the three-cell topology. The scheme adopted (Figs. 1b, 7g, and 19) was the most elementary
FIG. 19. Double inverse pinch-device chamber: (a) Side; (b) top view; (c) equipotential lines of the z component of the magnetic vector potential (curl-free magnetic field lines). The dark line is the separatrix that divides the flux into three regions. Region 3 is accessible from regions 1 and 2 by reconnection at the origin. Straight arrows represent the current flow; curved arrows the magnetic flux; stippled area the luminosity; energy source, two 150-μF capacitor banks, 20 kV max. [From Baum et al. (1973a).]
two-dimensional field configuration containing an x point (Dungey, 1958b). By employing two inverse pinch devices mounted side by side, Dungey’s configuration could be produced in a way that satisfied several requirements believed to be important:

(1) The dynamic process could be studied system-wide, since no field lines defining the three-cell topology could intercept chamber walls;
(2) Poynting flux would enter the system only at the two current sources for the field;
(3) Energy could exit the system only as thermal flux to the end electrodes, as radiation, and perhaps as high-energy particles;
(4) The principal working plasma would be derived from a unique property of the inverse pinch (to be explained below);
(5) Altogether, then, reconnection should proceed in a manner described as “hands off,” and as free as possible from effects of its in vitro environment.
It is perhaps particularly significant that, thus far, no other reconnection experiment designed to test theory satisfies condition (2) except the flat-plate device (Fig. 7e) of Baum and Bratenahl (1976) (described above), and few others satisfy (1).

The inverse pinch (Anderson et al., 1958; Vlases, 1967) is a cylindrical current-carrying plasma sheet that is magnetically driven radially outward from a central insulation-covered rod that delivers current to the sheet from an electrode fixed to its end. The rod passes through a second electrode that serves to collect the current so that it can be returned to the large capacitor bank from which it came. The arrangement acts like a short-circuited stub of coaxial transmission line whose outer conductor is free to expand under magnetic forces and sweep up the plasma in its way, snowplow fashion. If the current increases linearly with time, the sheet velocity is constant, and it is convenient to utilize the first quarter-cycle of an oscillatory discharge.

The DIPD (Bratenahl and Yeates, 1970) (Fig. 21) is simply two inverse pinch-rod assemblies mounted side by side in a common glass-walled cylindrical chamber, each rod being supplied by its own 150-μF capacitor bank. The chamber is 30 cm in diameter and 10 cm high. The rod separation is 10 cm. In typical operation, a 6-kV capacitor bank discharge is initiated into preionized argon at 165 mTorr. Traveling at 1.8 cm μsec⁻¹, the pinch sheets require 2.4 μsec to reach the center of the chamber, where they collide, merging together to form an expanding oval. Interest is confined to the 9.6-μsec test time interval that begins at the moment of collision, when the rod currents are 20 kA each, and ends at peak current (95 kA), 12 μsec after discharge initiation. The system of interest is the doubly connected pillbox-shaped volume exterior to the rods and interior to the expanding oval current sheet.
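The timing figures quoted above for typical DIPD operation can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the sheet speed is assumed constant over the whole 2.4 μsec, and the interpretation of the leftover distance is an assumption that the text does not state.

```python
# Values quoted in the text for typical DIPD operation.
SHEET_SPEED_CM_PER_USEC = 1.8    # pinch sheet speed
TIME_TO_COLLISION_USEC = 2.4     # discharge initiation to sheet collision
TEST_INTERVAL_USEC = 9.6         # collision to peak current
ROD_SEPARATION_CM = 10.0         # distance between the two rods


def distance_travelled(speed_cm_per_usec: float, time_usec: float) -> float:
    """Distance covered by a sheet moving at constant speed."""
    return speed_cm_per_usec * time_usec


def time_of_peak_current(t_collision_usec: float, test_interval_usec: float) -> float:
    """Time after discharge initiation at which the test interval ends."""
    return t_collision_usec + test_interval_usec


travel_cm = distance_travelled(SHEET_SPEED_CM_PER_USEC, TIME_TO_COLLISION_USEC)
peak_usec = time_of_peak_current(TIME_TO_COLLISION_USEC, TEST_INTERVAL_USEC)

# Assumption: whatever remains of the 5 cm half-separation is taken up by
# the rod radius, its insulation, and any start-up delay (not stated in the text).
remainder_cm = ROD_SEPARATION_CM / 2.0 - travel_cm

print(f"sheet travel before collision: {travel_cm:.2f} cm")
print(f"end of test interval: {peak_usec:.1f} usec after initiation")
print(f"half-separation not covered by travel: {remainder_cm:.2f} cm")
```

The sum of the collision time and the test interval reproduces the quoted 12 μsec to peak current, so the three timing figures are mutually consistent.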
The rods we call “sources” and the oval sheet structure simply “the outer return path” that carries the “return” or “closing current.” If the inverse pinch process were to proceed in practice as theoretically idealized, the test volume would be devoid of conducting plasma. Fortunately (for obscure but very real reasons) such is not the case. It has long been known that in the region exterior to an ordinary dynamic pinch or interior to an inverse pinch, there is always found a low-density, highly conducting plasma. The exploitation of this simple fact renders the DIPD particularly effective in the study of reconnection. It is significant that this same “external plasma” is also responsible for stabilizing the TS-3 pinch against premature sheet rupture. In the collision of the two inverse pinch sheets, the line of first contact becomes permanently established as the separator in the magnetic field and the three-cell topology (Figs. 1b and 19c) develops immediately. The merged
sheets, now forming the moving outer boundary of the daughter cell, carry most of the rod return current, but a significant portion of this return current remains behind, forming the separator current system.

It is necessary at this point to explain the diagnostic technique for determining the vector potential A. With reference to Fig. 20, a long, thin glass probe containing a coil with the form dl₁-dl₄ (dot-dash line in the figure), inserted from outside and extending to any point in the field, can be used to measure the flux per unit height between that point and the outside closing currents. The induced probe voltage (calibrated) yields ∂A/∂t at that point. The signal, electronically integrated, yields A. The vector is in the z direction. Note that by our convention all measurements taken on the separator (as shown) carry the subscript x; thus A_x signifies A at the x point. A map of contours of constant A is congruent with a field line map. By Faraday’s law, the EMF developed across the rod insulation measures the rate of gain of the magnetic flux passing between the rod and the outer boundary. The resistive component of the voltage drop developed across the moving outer boundary measures the rate of loss of this flux. It turns out, however, due to the high conductivity and relatively low current density in
FIG. 20. Selected flux tubes in the DIPD. Two rods supply current, which is returned principally along the outer cylindrical oval. The dark figure-eight tube is the separatrix that divides the flux into three regions or cells. Cells 1 and 2 are “parent” cells as they contain the sources (rods). Cell 3 is the “daughter” cell. Flux is transferred from the parent cells to the daughter cell during field line reconnection. The line along which cells 1 and 2 touch (the line of x points) is the separator for the system. The dashed-line segments dl₁-dl₄ define a flux probe loop with which the flux transfer rate φ̇₃ is measured. [From Bratenahl and Baum (1976b).]
this outer boundary, that this resistive component is only a few percent of the rod EMF, and so we may neglect this small loss rate to obtain E ≈ −∂A/∂t, which, along with other parameters, is presented in Fig. 21 (unpublished work of A. Bratenahl), showing typical oscilloscope traces of rod current (I_s), time derivative of current (İ_s), source voltage (E), flux content of the daughter cell (A_x), current density along the separator (j_x), and a less common behavior for the inductive electric field (bottom trace of Ȧ_x). The time scale is 2 μsec per division. The small insert at the upper right shows
FIG. 21. Typical DIPD parameters as a function of time (2 μsec per division). I_s is source current; İ_s = ∂I_s/∂t; V_s is source voltage; A_x is cell 3 flux per unit length (vector potential); Ȧ_x = ∂A_x/∂t; Ȧ_x, untypical behavior; j_x. Insert between first and second trace shows detailed behavior of I_s and V_s (5× magnification). [From A. Bratenahl, unpublished.]
detail of the behavior of I_s and V_s during the IFTE, which occurs during the upward spike in the Ȧ_x trace. It is seen that the current form is sinusoidal and only the first quarter cycle is used for data gathering. Upon collision of the two inverse pinch sheets, the A_x trace shows a monotonic increase of flux in cell 3 during the first quarter cycle, with a rather abrupt jump 0.4 μsec before crossing the vertical centerline of the photograph. The rate of rise of that bump appears on the Ȧ_x trace. Correspondingly, it is seen that the current density is nearly proportional to the upper Ȧ_x trace up until the IFTE, at which time Ȧ_x increases and j_x decreases. Occasionally, as already noted, the Ȧ_x traces are different, for example, showing multiple peaks as in the lower trace of Ȧ_x.

Figure 22, taken from unpublished data of A. Bratenahl, presents top-view Kerr cell photos of the central part of the DIPD plasma. The two rods are at the extreme left and right of each photo, and it is apparent at early times (e.g., 3.9 μsec) that the rods form the central axis of expanding cylindrical plasma sheets. Slightly later, the luminosity pattern flattens between the rods, corresponding to the establishment of sheet currents. It is seen at later times (≈ 8.2 μsec) that the luminosity decreases at the origin (center of the photo, x point) and that this decrease moves outward along the sheets. At late times (t ≳ 9 μsec) the separatrix is distinctly but faintly illuminated. On close inspection of the photos (6.4-6.9 μsec), luminosity upstream of the luminous plasma sheet seems to correspond to the establishment of a slow shock pattern. (That pattern is more apparent in Fig. 26.)

Having established the essential operational behavior of the DIPD, it is of interest to study the symmetry properties of flux transfer in the DIPD. As shown in Fig. 23a, taken from unpublished work of A. Bratenahl, magnetic flux probes were placed in four positions designated 1, 2, 3, 4.
Probes 1 and 4 consist of loops cut by flux between a rod and the neutral point (parent flux). Probe 3 measures flux from one rod out past the outer boundary and probe 2 measures flux from the neutral point to the outer boundary (daughter flux). These probes detect rates of change of flux dφ_i/dt, which are electronically integrated to produce the φ_i(t) curves that appear in Fig. 23b,c (sweep speed 5 μsec/div). The top trace of Fig. 23b shows φ_1(t) and the bottom trace shows φ_4(t). The top trace of Fig. 23c presents φ_1(t) − φ_3(t), while the bottom trace is φ_2(t). From Fig. 23b,c we see that φ_1(t) ≈ −φ_4(t) and φ_1(t) − φ_3(t) ≈ −φ_2(t), showing, first of all, that the neutral point remains fixed in space, and second, that the IFTE in this case consisted of two pulses. This figure provides a simple demonstration that the flux gained by the daughter cell is equal to that lost by either parent alone, and that an equal amount of flux disappears, i.e., is annihilated. In this “mixing” process, two parent field lines join to form one daughter line. If the flux transfer process
FIG. 22. DIPD top-view Kerr cell photographs. Times are 3.9-12.3 μsec. The neutral point is at the center of each photo. [From A. Bratenahl, unpublished.]
were reversed (daughter to parents), which may be called “splitting,” one daughter line would become two parent lines and flux would have been generated as in a dynamo.

FIG. 23. Flux symmetry measurements in the DIPD. (a) Positions where four flux probes are located relative to the potential field. (b) Fluxes vs. time (5 μsec/div). The upper trace is the flux measured by probe 1 (φ_1); the bottom trace is φ_4. (c) Fluxes vs. time (5 μsec/div). Upper trace, φ_1 − φ_3; bottom trace, φ_2. [From A. Bratenahl, unpublished.]

Besides introducing flux probes at a few locations, as done for Fig. 23, it is possible to cover the plane with flux-measuring probes to determine the flux surfaces or field lines (Fig. 24). For this figure, from Bratenahl and Yeates (1970), local B(r, t) measurements were integrated in space to provide flux measurements. The resulting flux surfaces are shown in Fig. 24 superimposed on Kerr cell photos and contours of constant current. The line spacing indicates a quantity of stored magnetic energy at t ≈ 7.0 μsec, which has disappeared (into kinetic and thermal energy) at late times. It is also apparent that pushing convex fields together does not result in their being flattened; on the contrary, their convexity actually increases (Sweet's paradox, Section III,B). This stored energy appears in the lower panels of Fig. 25 as the separator current system distributed in the form of two pairs of back-to-back slow-mode shocks. The central region close to the separator is a hyperbolic pinch. Measurements of the field components B, and B, approximately tangential and normal to the shocks, respectively, appear in the top panels of Fig. 25. Here we see the switchoff of B, as one crosses the shock passing downstream (from 40° toward 90°). Computer shock profiles (Figs. 6 and
FIG. 24. Superposition of DIPD Kerr cell photographs, field line maps, and contours of constant current density. Note relation of luminosity changes and current contours to the separatrix. Field lines are labeled by value of A in Wb/m. At 9.0 μsec, negative print enhances contrast. [From Bratenahl and Yeates (1970).]
FIG. 25. Magnetic field components near the DIPD separatrix (top). Contours of constant current density obtained from curl B (bottom). Current contours are 0.5, 1, 2, 3, and 4 × 10³ A/cm². Highest current is at the x point. [From Bratenahl and Yeates (1970).]
FIG. 26. Contours of constant current density j. Left: experimental DIPD data. Right: computational result by Fukao and Tsuda (1973) as modified by Bratenahl and Baum (1976a). Horizontal scale: 0-5 cm.
26) correspond closely to those measured in the DIPD (Fig. 27). For example, Fig. 26 shows the measured shock profiles of the DIPD compared with those computed by Fukao and Tsuda (1973). Some evidence for slow shocks is also seen in the schlieren photos (Fig. 27), which indicate electron density gradients by light and dark shading. For example, at 6.2 and 6.6 μsec, the four shock ridges are apparent along the plasma sheet. A compressed tongue of inverse pinch plasma has been ejected ahead of the shocks, so that the shocks do not attach to the ends of the actual plasma sheet. They, of course, attach to the ends of the hyperbolic pinch in the central region. At t ≈ 7.8 μsec, the shock-wave-assisted diffusion mode is terminated in a violent episode we have termed the impulsive flux transfer event (IFTE). The following observations characterize the IFTE:
(1) In the neighborhood of the neutral line, anomalous resistivity grows exponentially through a factor of 100 in 0.7 μsec.
(2) This results in a cutoff of hyperbolic pinch current and the diversion of this current into
(3) a system of large-amplitude hydromagnetic waves that propagate downstream as blast waves and upstream as fast-mode rarefaction waves.
(4) As the current diminishes in the pinch and is convected away in the waves, the voltage drop along the neutral line rises by a factor of 4. This sharp voltage pulse is due to the inductive effects of the rapidly changing geometry of the current system. At this point the voltage along the separator exceeds the input voltage to the device, demonstrating the flarelike release of stored flux and energy.
(5) By Faraday’s law, this voltage is the reconnection or flux transfer rate; the enhancement of this rate is the impulsive flux transfer event (Bratenahl and Yeates, 1970; Baum et al., 1973b).
(6) Corresponding to the voltage pulse, the electric field E is elevated by a factor of 4 throughout the region swept out by the waves.
(7) In this same region, the magnetic field B drops by a factor ≥ 2 in the inflow sectors and increases by the same amount in the outflow sectors. Hence, the wave system bounds a region of enhanced E × B convection.
(8) For example, mass motion in the outflow sectors is at the local Alfvén speed, as evidenced by observed Doppler shifts of spectral lines (Baum and Bratenahl, 1974a).
(9) An X-ray burst (Baum et al., 1973a) coming from the anode at the point of intersection with the neutral line gives clear evidence of runaway electrons in the pinch region. We have inferred a DIPD power-law runaway electron spectrum from X-ray measurements (Baum et al., 1973c).
(10) The spectrum of plasma turbulence, as evidenced by plasma waves (double electrostatic probe) extending up to 500 MHz (ω_pi), has the characteristic form of ion-acoustic turbulence (Baum and Bratenahl, 1974b).
(11) Immediately preceding the IFTE, the electron drift velocity is seen fast approaching the electron thermal speed.

FIG. 27. Sequence of DIPD schlieren photographs at 4.5, 5.3, 5.6, 6.2, 6.6, 7.0, and 7.5 μsec. The time given refers to time after the main bank discharge. The neutral point is at the center of each photo. [From A. Bratenahl, unpublished.]
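Observation (1) fixes the growth time scale of the anomalous resistivity once pure exponential growth is assumed. The following minimal Python sketch works this out; the function names are hypothetical and the assumption of a single constant e-folding time over the whole 0.7 μsec is an idealization, not a statement from the source.

```python
import math


def e_folding_time(growth_factor: float, duration: float) -> float:
    """E-folding time for exponential growth by growth_factor over duration."""
    return duration / math.log(growth_factor)


def growth_after(elapsed: float, tau: float) -> float:
    """Multiplicative growth after time elapsed, given e-folding time tau."""
    return math.exp(elapsed / tau)


# Values quoted in observation (1): a factor of 100 in 0.7 usec.
tau_usec = e_folding_time(100.0, 0.7)
halfway_factor = growth_after(0.35, tau_usec)

print(f"e-folding time: {tau_usec:.3f} usec")
print(f"growth factor after 0.35 usec: {halfway_factor:.1f}")
```

Under this assumption the resistivity e-folds roughly every 0.15 μsec, so it has already grown tenfold halfway through the 0.7 μsec interval.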
FIG. 28. Conditions at the DIPD x point characterized by oscilloscope traces of $j_x$ and $\dot{A}_x/j_x = \eta$ for a single shot; $N_e$ and $V_d/V_{th}$ calculated for several times prior to current disruption (7.8 μsec). $j_x$, current density; $\eta$, resistivity; $N_e$, electron density along the neutral line or separator; $V_d/V_{th}$, drift speed normalized to thermal speed. [From Bratenahl and Yeates (1970).]
Many of these observations are illustrated by Figs. 28-32. Figure 28 shows parameters at the neutral point, for example, electron density $N_e$, current density $j_x$, drift velocity $V_d$, and resistivity $\eta$. As estimated by a Hall probe, $N_e$ is seen decreasing prior to IFTE and $V_d$ is rising. The current density decreases at IFTE, while the resistivity abruptly rises. Similar behavior is seen in later data (Fig. 29), where temperatures ($T_i$, $T_e$) and radiation levels are presented as well as the preceding parameters. In these experiments, some parameters have changed quantitatively, but little qualitatively. The spectroscopic measurements during IFTE (e.g., $T_i$ and $T_e$) are somewhat uncertain, both because the assumed two-dimensional behavior may fail and because the spectroscopic integration time exceeded the time for plasma changes during IFTE.
FIG. 29. Time variation of DIPD parameters at the neutral point. The time scale for each graph runs from 3.5 to 7.5 μsec: (a) ion temperature ($T_i$) and electron temperature ($T_e$); (b) current density ($J_x$) and electric field ($\dot{A}_x$); (c) electron density ($N_e$), electron thermal velocity ($V_{th}$), and drift velocity ($V_d$); (d) resistivity in milliohm-meters; (e) temperature and velocity ratios; (f) time variation of an ionized argon line, an ionized helium line, and the kilovolt X-ray signal. [From Baum et al. (1973c).]
FIG. 30. (a) Simultaneous traces at 2 μsec/div of the DIPD neutral point electric field ($E_x$) or flux transfer rate and the current density ($J_x$). The impulsive flux transfer event is the sharp upward spike in $E_x$ during the time segment ΔT. (b) Simultaneous traces at 2 μsec/div of $E_x$ and an X-ray signal from runaway electrons. [From Bratenahl and Baum (1976b).]
The changes during IFTE are illustrated in Fig. 30, where we see the electric field and current density along the neutral line as well as the X-ray signal from runaway electrons striking the chamber bottom plate. Runaway ions escape through the top screen at IFTE but do not show an X-ray signature. During the IFTE spike in the electric field, a double electrostatic probe positioned along the neutral line yields the power spectrum presented as the solid curves of Fig. 31. Panel (a) is plotted on a linear scale, panel (b) on a logarithmic scale. For comparison, the dashed line shows the form of the spectrum theoretically predicted by Kadomtsev for ion-acoustic waves. While the agreement is by no means perfect, it seems reasonable to conclude that ion-acoustic waves are excited during IFTE.

FIG. 31. DIPD results from a double probe located along the neutral line. The solid lines are the measured spectral density of potential fluctuations between probe tips. The dashed lines have the shape of Kadomtsev’s theoretical ion-acoustic spectrum. The ion-plasma frequency ($f_{pi}$) is indicated at 574 MHz. No signal is seen at frequencies above the ion-plasma frequency. (a) Linear scale; (b) logarithmic scale. [From Baum and Bratenahl (1974b).]

It is of interest to examine the energy balance in the DIPD, and to do so we examine the various contributions to the total power

$$P = I_s V_s = \dot{U}_m + Q_p + Q_x$$

where $\dot{U}_m$ is the rate of change of magnetic energy, $Q_p$ the power associated with the inverse pinches, and $Q_x$ the power associated with the x point region. $\dot{U}_m$ and $Q_p$ are estimated by Bratenahl and Yeates (1970) and $P$ is known, so that $Q_x$ can be estimated. The results appear in Fig. 32. It is interesting to note that the DIPD calls for an extra surge of power from the external sources at IFTE, as evidenced by the second peak in the $P$ curve.

FIG. 32. DIPD power conversion. Input power $P$ and rate of change of magnetic energy $\dot{U}_m$ use the right-hand scale; inverse pinch power $Q_p$ and collision layer power $Q_x$ use the left-hand scale. The horizontal axis is time in μsec. [From Bratenahl and Yeates (1970).]

At IFTE, the power dissipated near the x point surges up by more than 16 MW to a peak of $2.7 \times 10^7$ J/sec (27 MW). From

$$\int \mathbf{j} \cdot \mathbf{E}\, dV = 2.7 \times 10^7\ \text{J/sec}$$

or

$$jEV \approx 2.7 \times 10^7\ \text{J/sec}$$

we find that $V \approx 27$ cm$^3$ is the characteristic volume over which the neutral point power is stored. This power apparently is converted predominantly into kinetic energy; however, the velocity measurements are not sufficiently accurate to demonstrate this conclusively.
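The volume estimate above is simple arithmetic on $jEV \approx P$, and it can be turned around to see what mean power density it implies. The sketch below is only an illustration: the function names are hypothetical, and the only numbers taken from the text are the 27 MW peak power and the 27 cm$^3$ volume. The local values of $j$ and $E$ at IFTE are not restated here, so none are assumed.

```python
# Peak x-point power at IFTE and the characteristic volume, as quoted in the text.
P_X_PEAK_W = 2.7e7      # 27 MW
VOLUME_CM3 = 27.0       # characteristic volume, cm^3


def implied_power_density(power_w, volume_cm3):
    """Mean j*E in W/m^3 implied by j*E*V = P (hypothetical helper)."""
    volume_m3 = volume_cm3 * 1.0e-6   # 1 cm^3 = 1e-6 m^3
    return power_w / volume_m3


def characteristic_volume_cm3(power_w, j_a_per_m2, e_v_per_m):
    """Invert j*E*V = P for V, returned in cm^3 (hypothetical helper)."""
    return power_w / (j_a_per_m2 * e_v_per_m) * 1.0e6


density = implied_power_density(P_X_PEAK_W, VOLUME_CM3)
print(f"implied mean j*E = {density:.2e} W/m^3")
```

Any measured pair of $j$ and $E$ whose product matches this density would return the quoted volume from `characteristic_volume_cm3`.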
VI. DISCUSSION AND CONCLUSIONS

The concept of magnetic field reconnection has been advancing along two parallel pathways. Where one leads through the expanse of cosmic plasma physics, the other winds through the technological complexities of fusion energy research. The methodologies employed in these two research efforts differ markedly, for reasons relating both to tradition and circumstance. The first, which is the older effort, remains today highly speculative, having relied almost exclusively on pure deductive reasoning. The second, out of necessity and practicality, contains a strong element of empiricism, theory integrated with experiment. There have been occasional exchanges of ideas between them, and increasing such exchanges would clearly be in the interest of both. For example, in the fusion effort, reconnection thus far has been recognized in only one of its many manifestations, the tearing mode, a process possessing potential for destructive impact on containment, which poses a constant threat to the ultimate success of the program. However, the possibility should not be overlooked that some other form of reconnection familiar in cosmic plasma physics might be exploited beneficially to energize the contained plasma.

On the other hand, cosmic plasma physics would do well to adopt some of the proven methodologies in the fusion discipline. We have in mind here especially three items: integration of reconnection theory with appropriately designed laboratory experiments; analysis of reconnection on a system-wide basis; inclusion of three-dimensional effects. We can now state that one of the purposes of this review is an undisguised advocacy of the first two of these items. No less important, the third has been omitted simply because so little is yet known about it.

We have seen in reconnection experiments designed to test theory that there is no lack of evidence for flarelike impulsive activity, and the proper analysis of this requires consideration of the structure and behavior of the system as a whole. For example, consider Eq. (2) and its interpretation in terms of filling and emptying rates of the reservoir of magnetic flux to be reconnected, or equivalently, the stored magnetic energy to be converted. This suggests in a very natural way questions concerning the stability of the reservoir itself, and the consequences of an instability. This perception of things, readily deduced from experiment, could just as readily have been deduced theoretically, but the significant fact is that this has not happened. The reason is twofold: (1) the restricted problem analysis, working with insufficient information, lacks the necessary scope and power; (2) theory alone, with no experimental guidance, faces far too many possibilities ab initio. The usual procedure in this situation is to investigate a subset of problems chosen on the basis of tractability or solvability. The resulting solution is then very likely to be incomplete. In other words, the guidance of experiment can be very important in the initial steps of problem definition
so that the appropriate theoretical approach may be selected. It may then happen that the properly defined problem is actually simpler to solve. The above discussion suggests that between natural physical processes and appropriately designed laboratory experiments, there is a commonality that transcends the enormous differences in scale, plasma regimes, and other laboratory-peculiar effects, and must therefore relate to more fundamental electrodynamic and topological considerations.

A basic question is frequently asked: Is reconnection controlled locally or by distant boundary conditions? Experiment provides a clear answer: reconnection is indeed controlled locally at each instant, but the prevailing local conditions are determined by the previous history of the system-at-large in response to conditions prescribed on the distant boundaries, including the previous history of those conditions. In a long-time average sense, reconnection is determined by time-averaged conditions on the distant boundaries. This complication leads to considerable subtlety in the problem. For example, Petschek shocks were reported in early DIPD experiments (Bratenahl and Yeates, 1970) but they were conspicuously absent in TS-3 (Frank, 1976). They show up in most computer experiments (Fukao and Tsuda, 1973; Brushlinskii et al., 1978; Ugai and Tsuda, 1977; Sato, 1979) but not all (Gerlakh and Syrovatskii, 1976). Complicating matters still further, they have failed to materialize in a recently constructed DIPD (Beeler, 1979). The question of when to expect Petschek shocks and when not to may have no simple answer. It does seem likely, however, that the answer can be developed only through analysis of the system-at-large as an initial-boundary value problem. The reason is that the shock discontinuities partition the flow field into regions differing in the role played by the fast, slow, and intermediate wave speeds.
In the DIPD, the separator current system develops out of the collision of two outwardly expanding current sheets. One might reasonably expect from this the development of contours of constant current density in the form of a butterfly. In TS-3 on the other hand, the separator current system develops from the contraction and elliptical deformation of an ordinary pinch in a constant hyperbolic background field roughly following the theory of self-similar collapse (Section IV,A,2,b). One might expect, in this case, the contours of constant current density to maintain their elliptical form. Obviously this is a subject worthy of further study both in the laboratory and in computer simulations. Another fruitful area of study is the electric field. In some devices (Sections V,B,1 and V,B,5) it has recently been found that the inductive electric field is nearly cancelled by the electrostatic field. It would be valuable to understand in detail how this happens and to be able to predict how clumps of space charge move, perhaps producing electric double layers.
A very important unsolved problem is the true nature of the instability leading to IFTE. For a number of years, the occurrence of IFTE in the DIPD was attributed to a transition from normal (Spitzer) resistivity to anomalous resistivity (ion-acoustic noise has been observed; Section V,B,5). On the other hand, sheet rupture (tearing mode) may be the primary instability with anomalous resistivity a secondary effect. The recent (unpublished) electric field data, separating $-\partial \mathbf{A}/\partial t$ from $-\nabla\phi$ (Beeler, 1979), may indicate development of an electric double layer at IFTE. Much more work will be required to untangle the complex causal chain involved here. Finally, it is essential to expand reconnection studies to three dimensions rather than the degenerate two-dimensional studies now dominant. Important new effects will probably be discovered that bear, for example, on understanding the behavior of magnetoplasmas in the earth’s neighborhood and in solar flares.

We conclude with a brief summary of experimental reconnection results:

(1) Significant quantities of magnetic energy are stored as induced reconnection plasma currents.
(2) The detailed forms of the stored current can correspond to slow shocks à la Petschek, current sheets, or “neutral”-current sheets.
(3) The stored currents can become unstable in rapid impulsive events (IFTE, sheet rupture). The conduction mode can also go unstable (anomalous resistivity). It is not clear yet which is more fundamental or whether they can even be separated.
(4) The Alfvén Mach number $M_A$ of the flow determines neither the amount nor the “speed” of energy release during reconnection.
(5) The interchange instability can occur near flux tubes of zero curvature, but these seem to be sufficiently remote from the neutral point, even downstream from it, that its occurrence does not speed reconnection.
(6) The instantaneous reconnection rate is locally or regionally controlled. The time-averaged reconnection rate is controlled by distant boundary conditions, asymptotically approaching the boundary rate. The instantaneous rate can exceed the average rate. The detailed description of the amount of excess is system dependent.
(7) Both the inductive and electrostatic electric fields are important during impulsive reconnection. Double layers could result from IFTE; an adequate search has not yet been conducted.

ACKNOWLEDGMENTS

Portions of this research were supported at the Cal Tech Jet Propulsion Laboratory under a NASA contract and at UCR by the Cal Tech President’s Fund, by the Air Force Office of Scientific Research, and by the National Science Foundation. We appreciate their support and the opportunity to discuss these experiments and related concepts with a number of individuals including W. Gekelman, C.-G. Falthammar, T. Sato, B. Sonnerup, T. Yeh, W. I. Axford, A. M. Soward, E. R. Priest, W. Heikkila, H. Alfvén, C. M. Yeates, J. C. Nickel, R. Beeler, A. G. Frank, D. Overskei, M. Cowan, W. B. Kunkel, and R. Stenzel. We are especially indebted to the late Professor Sergei Syrovatskii, whose dedication to the study of reconnection has provided us with a very valuable scientific legacy. This chapter is dedicated to him.
APPENDIX I. A SIMPLE EXAMPLE OF AN X POINT
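Consider the case of two filamentary conductors located at $y = 0$, $x = \pm a$, each carrying a parallel current $I$. Then the magnetic vector potential $\mathbf{A}$ may be written

$$A_x = A_y = 0, \qquad A_z = \frac{\mu_0 I}{2\pi} \ln\left[(x - a)^2 + y^2\right]^{1/2} + \frac{\mu_0 I}{2\pi} \ln\left[(x + a)^2 + y^2\right]^{1/2} + \text{const} \tag{A1}$$

where all constants will be dropped. Now

$$A_z = \frac{\mu_0 I}{4\pi} \ln\left[1 + \frac{2(y^2 - x^2)}{a^2} + \frac{x^4 + 2x^2y^2 + y^4}{a^4}\right] \tag{A2}$$

and for $x^4/a^4 \ll 1$, $y^4/a^4 \ll 1$, $x^2y^2/a^4 \ll 1$, i.e., near the origin,

$$A_z \approx \frac{\mu_0 I}{4\pi} \ln\left[1 + \frac{2(y^2 - x^2)}{a^2}\right] \tag{A3}$$

and if $y^2/a^2 \ll 1$, $x^2/a^2 \ll 1$,

$$A_z \approx \frac{\mu_0 I}{2\pi a^2}\,(y^2 - x^2) \equiv A_0\,(y^2 - x^2) \tag{A4}$$

to within a constant. From $\mathbf{B} = \nabla \times \mathbf{A}$ we can obtain the magnetic field from Eq. (A4):

$$B_x = 2A_0 y, \qquad B_y = 2A_0 x, \qquad B_z = 0 \tag{A5}$$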
$\mathbf{B}$ equals zero at the origin $x = y = 0$ and the origin may therefore be termed a neutral point or null point. In fact, there is a neutral line along the z axis. The electric current is given by

$$\mathbf{j} = \mu_0^{-1}\, \nabla \times \nabla \times \mathbf{A} = \mu_0^{-1}\, \nabla \times \mathbf{B} \tag{A6}$$

and we find $\mathbf{j} = 0$ for Eqs. (A6) and (A4) or (A5). In two-dimensional cases of planar symmetry, field lines are given by lines of $A = \text{const}$, and therefore a plot of Eq. (A2) set equal to a constant will generate the ovals of Cassini shown in Fig. 1b. Near the origin, Eq. (A4) generates a family of hyperbolas shown in the topmost panel of Fig. 33. We may therefore call this configuration a hyperbolic neutral point or an x point, from the apparent crossing of field lines ($y = \pm x$) at the origin. Consider next a distributed current producing the vector potential

$$A_z = A_1\,(x^2 + y^2) \tag{A7}$$

Equation (A7) describes circular field lines (Fig. 33, middle panel) and the origin may be termed an o-type neutral point or an elliptical neutral point.
FIG. 33. Schematic representation of the fact that the superposition of a hyperbolic field plus an elliptical field (e.g., field inside a current-carrying wire) gives rise to a deformed hyperbolic field (bottom). Solid lines are field lines; dashed lines are streamlines.
Adding the two fields (A4) and (A7), we find

$$A = A_0\,(1 + \varepsilon)\left\{ y^2 - \left[\frac{1 - \varepsilon}{1 + \varepsilon}\right] x^2 \right\} \tag{A8}$$

where $\varepsilon = A_1/A_0$. Again the lines of (A8) are hyperbolic, but now are deformed, with asymptotes at $y = \pm[(1 - \varepsilon)/(1 + \varepsilon)]^{1/2}\, x$, as depicted in the bottommost part of Fig. 33. Only the o field contributes to the current near the origin and its magnitude is $|j_z| = 4A_1/\mu_0 = 4\varepsilon A_0/\mu_0$. In a resistive medium, $E_z = \eta j_z$ and the Poynting flux $\mathbf{S} = (\mathbf{E} \times \mathbf{B})/\mu_0$ will be everywhere orthogonal to the magnetic lines of force, as depicted by the dashed hyperbolas of Fig. 33. The lines of force $\mathbf{F} = \mathbf{j} \times \mathbf{B}$ will be congruent with the Poynting flux lines and indicate the general flow direction in a pressureless, resistive medium. The effect of the hyperbolic field is to redirect the Poynting flux and force vectors from their usual self-pinch and inverse-pinch characters into a hyperbolic pinch. The vectors $\mathbf{S}$ and $\mathbf{F}$ would point radially inward (or outward) in the presence of an o point and in the absence of an x point. The inclusion of the x point diverts Poynting flux and plasma past the origin. Strictly speaking, the neutral or null character of Eq. (A4) can be removed by adding an arbitrarily small $B_z$ component. Since the Poynting flux and force trajectories are hardly altered by this absence of a true neutral point, it appears that the existence of a separatrix and separator are more fundamental than neutral points. These are defined in Section II.
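The near-origin approximation used in this appendix can be checked numerically. The following is a minimal sketch, assuming the sign convention of Eq. (A1) as written and the coefficient $A_0 = \mu_0 I/(2\pi a^2)$ that follows from expanding it. The current, the wire half-separation, and all function names are hypothetical illustration values, not DIPD parameters. The code differentiates the two-wire potential by central differences and compares the result with Eq. (A5); it also evaluates the asymptote slope of Eq. (A8).

```python
import math

MU0 = 4.0e-7 * math.pi  # vacuum permeability, H/m


def a_two_wires(x, y, current, a):
    """A_z of Eq. (A1) for wires at (+a, 0) and (-a, 0); constant chosen so A_z(0, 0) = 0."""
    pref = MU0 * current / (4.0 * math.pi)
    r1_sq = (x - a) ** 2 + y ** 2
    r2_sq = (x + a) ** 2 + y ** 2
    return pref * (math.log(r1_sq / a ** 2) + math.log(r2_sq / a ** 2))


def a0_coefficient(current, a):
    """A_0 = mu0 I / (2 pi a^2), the coefficient appearing in Eq. (A4)."""
    return MU0 * current / (2.0 * math.pi * a ** 2)


def b_from_a(a_func, x, y, h=1.0e-6):
    """B = curl(A_z z_hat): B_x = dA_z/dy, B_y = -dA_z/dx, by central differences."""
    b_x = (a_func(x, y + h) - a_func(x, y - h)) / (2.0 * h)
    b_y = -(a_func(x + h, y) - a_func(x - h, y)) / (2.0 * h)
    return b_x, b_y


def a_deformed(x, y, a0, eps):
    """Eq. (A8): x-type field A_0 (y^2 - x^2) plus o-type field eps A_0 (x^2 + y^2)."""
    return a0 * (1.0 + eps) * (y ** 2 - (1.0 - eps) / (1.0 + eps) * x ** 2)


def asymptote_slope(eps):
    """Slope of the asymptotes y = +/- slope * x of Eq. (A8), valid for |eps| < 1."""
    return math.sqrt((1.0 - eps) / (1.0 + eps))


# Hypothetical illustration values (not taken from the text).
CURRENT = 1.0e4    # A
HALF_SEP = 0.1     # m
A0 = a0_coefficient(CURRENT, HALF_SEP)

x, y = 2.0e-3, 1.0e-3   # a point close to the origin (r << a)
b_x, b_y = b_from_a(lambda u, v: a_two_wires(u, v, CURRENT, HALF_SEP), x, y)
print("two-wire B_x, B_y:", b_x, b_y)
print("Eq. (A5) B_x, B_y:", 2.0 * A0 * y, 2.0 * A0 * x)
print("asymptote slope for eps = 0.3:", asymptote_slope(0.3))
```

The two printed field pairs should agree to within terms of order $(r/a)^2$, which is the accuracy of the expansion leading to Eq. (A4).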
APPENDIX II. RECONNECTION JARGON

Terminology | Reference
Period and oscillations of lines of force; isorotation of lines of force (with plasma) | Alfvén (1943)
Magnetic diffusion | Elsasser (1946)
Effect of magnetic viscosity on lines of force; wrapping of lines | Elsasser (1950)
Lines of force broken and rejoined | Dungey (1953)
Cancellation and coalescence of magnetic flux loops | Parker (1955a), Elsasser (1955)
Slippage of lines of force; diffusion of field | Elsasser (1955)
Merging of magnetic bundles (of parallel field) | Parker (1955b)
Reconnection | Dungey (1956, publ. in 1958a)
Reconnection, merging, diffusion, and severing of lines of force | Parker and Krook (1956)
Colliding fields | Sweet (1956a, publ. in 1958a)
Interpenetration of flux systems | Sweet (1956b, publ. in 1958b)
Transfer flux | Sweet (1956b, publ. in 1958b)
Fields leak together and neutralize | Cowling (1957)
Pushing and pulling of fields; stretching of lines of force | Parker (1958)
Cut lines of force | Piddington (1959)
Annihilation of field | Gold and Hoyle (1960)
Tearing of a sheet pinch; tearing-mode instability | Furth et al. (1963)
Recombination of lines | Atkinson (1965)
Dynamic dissipation | Syrovatskii (1966)
Destruction of magnetic surfaces; Brownian motion of flux lines | Rosenbluth et al. (1966)
Magnetic flux transfer | Bratenahl and Yeates (1970)
Cutting and reconnecting lines | Sonnerup (1970)
Field cutting | Coroniti and Kennel (1972)
Antidynamo process | Vainshtein (1974)
Relinking of lines | Cowling (1975)
Sheet rupture | Syrovatskii (1975)
Magnetic isolation; mixing and splitting of islands | Grad et al. (1975)
Impulsive flux transfer event | Bratenahl and Baum (1976a)
Reclosing process | Syrovatskii (1977b)
Intruding flux systems | Baum and Bratenahl (1977)
Magnetoelectric coupling | Bratenahl and Baum (1977)
Explosive tearing-mode reconnection | Galeev et al. (1978)

APPENDIX III. IMPULSIVE FLUX TRANSFER AND CIRCUIT TRANSIENTS
It has been shown (Bratenahl and Baum, 1976a; Baum et al., 1978) that IFTE can be modeled by a circuit like that of Fig. 34, in which an impulsive increase in the effective resistance $R_x^*$ ($\equiv R_x + dL/dt$) transfers flux from parent cells 1 and 2 to daughter cell 3. Shortly, we also examine the case where the daughter’s effective resistance impulsively decreases. A solution to the circuit of Fig. 34 appears as Fig. 35, taken from Bratenahl and Baum (1976a). Here $R_3$ has been set to zero throughout. $R_x$ is constant up to regime a (bottom panel of Fig. 35), at which point $R_x$ decays exponentially, rises exponentially during regime b, and then decays exponentially again during regime c. The theoretical solution is shown as solid dots in the upper parts of Fig. 35. The continuous curve is data from the DIPD. The voltage rise and local current decay are modeled accurately. It was further shown by Baum et al. (1978) that in the interruption of field-aligned currents (e.g., Alfvén and Carlquist, 1967), the same circuit equation is produced as in the diversion of currents orthogonal to the field (IFTE), and these two field-intermixing processes are fundamentally similar. That model was applied to solar flares to reproduce the flare’s release of $10^{32}$ ergs in 100 sec.

FIG. 34. An equivalent electrical circuit, which provides an analog for the DIPD. There are three current loops (which are reduced to two if $L_1 = L_2$) that model the three flux cells of the DIPD. A variable resistor $R_x$ modeling the critical line and adjacent region governs the resistive diffusion of flux from the parent cells (1, 2) into the daughter (3). $R_3$ will be allowed to vary also, with results shown in Fig. 36. It must be recognized that $R_3$ is governed by the boundary conditions. The model leads to a good understanding of flux transfer or dynamic reconnection. [Adapted from Bratenahl and Baum (1976b).]

FIG. 35. Continuous curves: measured electric field ($E_x$), current density ($j_x$), and resistivity ($\eta_x$) along the separator of the DIPD. Solid circles (for the period ΔT): the best-fit voltage ($V_x$) and current ($i_x$) from the circuit analog (Fig. 34; $R_3 = 0$, $R_x$: exponential variations) calculations of Bratenahl and Baum (1976b). There are three regimes (a, b, c) during ΔT corresponding to different resistivity forms. The impulsive flux transfer event occurs between the dashed lines. The horizontal axis is time in μsec. [From Bratenahl and Baum (1976b).]

FIG. 36. Solutions to the circuit of Fig. 34 are presented during the impulsive phase where $R_x$ (or $R_3$) exponentially increases (or decreases). The parameters are chosen for a geomagnetic substorm: $L_1 = L_2 = L_3 = 50$ H, $V = 10^5$ V. In panels (a) and (b) $R_3 = 0.1\ \Omega$, $R_x = 0.1\ \Omega$ up to the discontinuity, at which $R_x = 0.1\, e^{\gamma t}\ \Omega$. In panels (c) and (d) $R_x = 0.1\ \Omega$, $R_3 = 0.1\ \Omega$ up to the discontinuity, at which $R_3 = 0.1\, e^{-\gamma t}\ \Omega$. In panels (a) and (c) $1/\gamma = 200$ sec; in panels (b) and (d) $1/\gamma = 50$ sec. The horizontal axis of each panel is time in sec. We see that impulsive substorm-like behavior is possible only for the cases where the effective resistance increases rapidly in time.

In previously unpublished work shown in Fig. 36 we compare the effect of increasing the neutral point effective resistance $R_x$ with decreasing the daughter effective resistance $R_3$. Here the parameters are chosen representative of a geomagnetic substorm (e.g., Bratenahl and Baum, 1976b; Bostrom, 1972) so that $R_x$ represents the cross-tail effective resistance and $R_3$ the polar cap or auroral electrojet effective resistance. Here we have chosen the initial values $L_1 = L_2$, $L_1 + L_2 = 100$ H, $L_3 = 50$ H, $R_x = 0.1\ \Omega$, $R_3 = 0.1\ \Omega$, $V = 10^5$ V. In each part of Fig. 36 both $R_x$ and $R_3$ are constant at $0.1\ \Omega$ until, at $t = t_0$, the current reaches $(1 - e^{-1}) I_0$, where $I_0 = V/(R_x + R_3) = 5 \times 10^5$ A is its saturation value. Thereafter, in Fig. 36a,b $R_x' = R_x e^{\gamma(t - t_0)}$ and in Fig. 36c,d $R_3' = R_3 e^{-\gamma(t - t_0)}$. For Fig. 36a,c $\gamma^{-1} = 200$ sec and for Fig. 36b,d $\gamma^{-1} = 50$ sec. Each part of the figure has been normalized to the saturation value that would be achieved at $t \to \infty$ for $R_x = R_3 = 0.1\ \Omega$ throughout. Now it can be seen that Fig. 36a,b produces impulsive rises in voltage and power ($V_x$, $P_x$), whereas Fig. 36c,d does not. Therefore, shorting the auroral electrojet is far different from opening the cross-tail current circuit. Only the increase in cross-tail effective resistance $R_x^* = R_x + dL/dt$ results in impulsive substorm-like transients. To obtain unnormalized variables, multiply $R_x$ and $R_3$ by $0.1\ \Omega$, $I_x$ and $I_3$ by $5 \times 10^5$ A, $V_x$ and $V_3$ by $5 \times 10^4$ V, and $P_x$ and $P_3$ by $2.5 \times 10^{10}$ W. The solutions were obtained using a high-order Runge-Kutta algorithm on a Cromemco, Inc., microcomputer.
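The normalization factors quoted for the substorm case follow directly from the source voltage and the initial resistances, and the two impulsive resistance laws are plain exponentials switched on at $t_0$. The sketch below only reproduces that bookkeeping; it does not solve the circuit of Fig. 34, whose full equations are not restated here. Function and variable names are hypothetical.

```python
import math

# Values quoted in the text for the geomagnetic substorm case.
V_SOURCE = 1.0e5   # V
R_X0 = 0.1         # ohm, initial cross-tail effective resistance
R_30 = 0.1         # ohm, initial polar cap / electrojet effective resistance


def saturation_current(v, r_x, r_3):
    """I_0 = V / (R_x + R_3), the saturation current used for normalization."""
    return v / (r_x + r_3)


def r_x_impulsive(t, t0, gamma, r0=R_X0):
    """R_x held at r0 until t0, then growing as r0 * exp(gamma * (t - t0))."""
    return r0 if t < t0 else r0 * math.exp(gamma * (t - t0))


def r_3_impulsive(t, t0, gamma, r0=R_30):
    """R_3 held at r0 until t0, then decaying as r0 * exp(-gamma * (t - t0))."""
    return r0 if t < t0 else r0 * math.exp(-gamma * (t - t0))


I0 = saturation_current(V_SOURCE, R_X0, R_30)   # current unit, A
V_UNIT = I0 * R_X0                              # voltage unit, V
P_UNIT = I0 * V_UNIT                            # power unit, W
print(f"I0 = {I0:.3g} A, voltage unit = {V_UNIT:.3g} V, power unit = {P_UNIT:.3g} W")
```

With $\gamma^{-1} = 50$ sec, for instance, `r_x_impulsive` has grown by a factor of $e$ fifty seconds after $t_0$, while `r_3_impulsive` has fallen by the same factor.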
REFERENCES

Alfvén, H. (1943). On sunspots and the solar cycle. Ark. Math., Astron. Fys. 29A, 1-17.
Alfvén, H. (1976). On frozen-in field lines and field-line reconnection. J. Geophys. Res. 81, 4019.
Alfvén, H., and Carlquist, P. (1967). Currents in the solar atmosphere and a theory of solar flares. Sol. Phys. 1, 220.
Alidieres, M., Aymar, R., Jourdan, P., Koechlin, F., and Samain, A. (1968). Experimental study of a current sheet. Plasma Phys. 10, 841-850.
Altynsev, A. T., and Krasov, V. I. (1975). Connection of magnetic lines of force in a neutral layer. Sov. Phys.-Tech. Phys. (Engl. Transl.) 19, 1639-1640.
Amano, K. (1977). “Computer Experiment on the Reconnection of Magnetic Field Lines,” Preprint. Dept. Electr. Eng., Hokkaido University, Sapporo, Japan.
Amano, K., and Tsuda, T. (1977). Reconnection of magnetic field lines by clouds-in-cells plasma model. J. Geomagn. Geoelectr. 29, 9-27.
Anderson, O. A. (1964). “Sheet Pinch Studies,” Sect. IV.4 of UCRL-12028 (LBL Report), pp. 72-77. Lawrence Berkeley Lab., Univ. of Calif., Berkeley.
Anderson, O. A., and Kunkel, W. B. (1969). Tubular pinch and tearing instability. Phys. Fluids 12, 2099-2108.
Anderson, O. A., Furth, H. P., Stone, J. M., and Wright, R. E. (1958). Inverse pinch effect. Phys. Fluids 1, 489.
Atkinson, G. (1965). A theory of polar substorms. J. Geophys. Res. 71, 5157-5164.
Bateman, G. (1978). “MHD Instabilities.” MIT Press, Cambridge, Massachusetts.
Baum, P. J. (1978a). “Laboratory Experiments on Magnetic Reconnection,” Invited review UCR/IGPP-78/18. AAS Special Session on Solar Flares, Madison, Wisconsin.
Baum, P. J. (1978b). Magnetic reconnection experiments. Bull. Am. Phys. Soc. [2] 23, 60.
Baum, P. J., and Bratenahl, A. (1974a). Mass motion and heating in a magnetic neutral point system. J. Plasma Phys. 11, 93-98.
Baum, P. J., and Bratenahl, A. (1974b). Spectrum of turbulence at a magnetic neutral point. Phys. Fluids 17, 1232-1235.
Baum, P. J., and Bratenahl, A. (1975). Effect of obstacles on the rate of reconnection of magnetic field lines. Planet. Space Sci. 23, 813-816.
Baum, P. J., and Bratenahl, A. (1976). Laboratory solar flare experiments. Sol. Phys. 47, 331-344.
Baum, P. J., and Bratenahl, A. (1977). On reconnection experiments and their interpretation. J. Plasma Phys. 18, 257-272.
Baum, P. J., Bratenahl, A., and White, R. S. (1973a). X-ray and electron spectra from the double inverse pinch device. Phys. Fluids 16, 226-230.
Baum, P. J., Bratenahl, A., and White, R. S. (1973b). Experimental study of the reconnection process. Radio Sci. 8, 917-920.
Baum, P. J., Bratenahl, A., Kao, M., and White, R. S. (1973c). Plasma instability at an X-type magnetic neutral point. Phys. Fluids 16, 1501-1504.
Baum, P. J., Bratenahl, A., and Cowan, M. (1976). Dynamics of a triple inverse pinch. J. Plasma Phys. 15, 249.
Baum, P. J., Bratenahl, A., and Kamin, G. (1978). Current interruption and impulsive flux transfer solar flare models. Ap. J. 226, 286.
Beeler, R. (1979). Ph.D. Dissertation, Univ. Calif., Riverside. Univ. Microfilms.
Benford, J., Lovberg, R. H., and Niblett, G. B. F. (1968). Resistive instabilities in a low-energy theta pinch. Phys. Fluids 11, 218-221.
Biskamp, D., and Schindler, K. (1971). Instability of two-dimensional collisionless plasmas with neutral points. Plasma Phys. 13, 1013.
Bodin, H. A. B. (1963). Observations of resistive instabilities of a theta pinch. Nucl. Fusion 3, 215-217.
Bogdanov, S. Yu., Tokarevskaya, N. P., Frank, A. G., and Khodzhaev, A. Z. (1975). Two-dimensional plasma motion near a neutral current layer. Sov. J. Plasma Phys. (Engl. Transl.) 1, 71-77.
Bostrom, R. (1972). Magnetosphere-ionosphere coupling. In “Critical Problems of Magnetospheric Physics” (E. R. Dyer, ed.), Proc. Joint COSPAR/IAGA/URSI Symposium, Madrid, 11-13 May, 1972. IUCSTP Secretariat, c/o National Academy of Sciences, Washington, D.C.
Bratenahl, A. (1972). “Experimental Studies of the Reconnection Process,” UCR/IGPP-72.4. Univ. of Calif., Riverside.
Bratenahl, A., and Baum, P. J. (1976a). Impulsive flux transfer events and solar flares. Geophys. J. R. Astron. Soc. 46, 259-293.
Bratenahl, A., and Baum, P. J. (1976b). On flares, substorms, and the theory of impulsive flux transfer events. Sol. Phys. 47, 345-360.
Bratenahl, A., and Baum, P. J. (1977). Hidden reconnection. Trans. Am. Geophys. Union 58, 1208.
Bratenahl, A., and Yeates, C. M. (1970). Experimental study of magnetic flux transfer at the hyperbolic neutral point. Phys. Fluids 13, 2696-2709.
Bratenahl, A., Baum, P. J., and Adams, W. M. (1979). A two-level solar dynamo based on solar activity, convection, and differential rotation. In “Solar and Interplanetary Dynamics,” IAU Symp. No. 91. Reidel Publ., Dordrecht, Netherlands (in press).
Brushlinskii, K. V., Zaborov, A. M., and Syrovatskii, S. I. (1977). “Numerical Simulation of a Plasma Flow Near the Magnetic Zero Line,” Preprint No. 61. Acad. Sci. USSR, Inst. Appl. Math., Moscow.
Brushlinskii, K. V., Zaborov, A. M., and Syrovatskii, S. I. (1978). “Resistivity and Pressure Dependence of the Current Sheet Characteristics,” Preprint No. 86. Acad. Sci. USSR, Inst. Appl. Math., Moscow.
Bulanov, S. V., Sasorov, N. V., and Syrovatskii, S. I. (1977). Effect of external plasma on the decay of a neutral current sheet. JETP Lett. (Engl. Transl.) 26, 565-568.
Chapman, S., and Kendall, P. C. (1963). Liquid instability and energy transformation near a magnetic neutral line; a soluble non-linear hydrodynamic problem. Proc. R. Soc. London, Ser. A 271, 435.
Chapman, S., and Kendall, P. C. (1966). Comment on some exact solutions of magnetohydrodynamics. Phys. Fluids 9, 2306.
Coroniti, F. V., and Eviatar, A. (1977). Magnetic field reconnection in collisionless plasma. Ap. J., Suppl. Ser. 33, 189-210.
Coroniti, F. V., and Kennel, C. F. (1972). Changes in magnetospheric configuration during the substorm growth phase. J. Geophys. Res. 77, 3361-3370.
Cowan, M., and Freeman, J. R. (1973). Explosively driven deuterium arcs as an energy source. J. Appl. Phys. 44, 1595-1602.
Cowley, S. W. H. (1975). Magnetic field-line reconnection in a highly-conducting-incompressible fluid: Properties of the diffusion region. J. Plasma Phys. 14, 475.
Cowling, T. G. (1957). “Magnetohydrodynamics.” Wiley (Interscience), New York.
Cowling, T. G. (1975). Sunspots and the solar cycle. Nature (London) 255, 189-190.
Dailey, C. L. (1972). “Magnetic Field Annihilation of Impulsive Current Sheets,” TRW Interim Sci. Prog. Rep. for AFOSR Contract. TRW Corp., El Segundo, Calif.
Drake, J. F., and Lee, Y. C. (1977). Kinetic theory of tearing instab ... 1341-1353.
Dreiden, G. V., Kirii, N. P., Markov, V. S., Mirzabekov, A. M., Ostrovskaya, G. V., Frank, A. G., Khodzhaev, A. Z., and Shedova, E. N. (1977). Holographic-interference study of the plasma density distribution in a current sheet. Sov. J. Plasma Phys. (Engl. Transl.) 3, 26-31.
Dreiden, G. V., Markov, V. S., Ostrovskaya, G. V., Ostrovskii, Yu. I., Petrov, M. V., Fillipov, V. N., Frank, A. G., Khodzhaev, A. Z., and Shedova, E. N. (1978). Hologram motion-picture study of a current sheet. Sov. J. Plasma Phys. (Engl. Transl.) 4, 6-1.
Dungey, J. W. (1953). Conditions for the occurrence of electrical discharges in astrophysical systems. Philos. Mag. [7] 44, 725-738.
Dungey, J. W. (1958a). The neutral point discharge theory of solar flares. A reply to Cowling’s criticism. In “Electrical Phenomena in Cosmical Physics” (B. Lehnert, ed.), IAU Symp. No. 6, pp. 135-139. Cambridge Univ. Press, London and New York.
Dungey, J. W. (1958b). “Cosmical Electrodynamics.” Cambridge Univ. Press, London and New York.
Elsasser, W. M. (1946). Induction effects in terrestrial magnetism. Part 1. Theory. Phys. Rev. 69, 106-116.
Elsasser, W. M. (1950). The earth’s interior and geomagnetism. Rev. Mod. Phys. 22, 1-35.
Elsasser, W. M. (1955). Hydromagnetism. I. A review. Am. J. Phys. 23, 590-609.
Fisher, S. (1960). “Some Calculations on the TRIAX Pinch Device,” UCRL-9344 (LBL Report). Lawrence Berkeley Lab., Univ. of Calif., Berkeley.
Forbes, T. G., and Speiser, T. W. (1979). Temporal evolution of magnetic reconnection in the vicinity of a magnetic neutral line. J. Plasma Phys. 21, 107-125.
Frank, A. G. (1976). Experimental study of the conditions for the appearance of a neutral current sheet in a plasma: Some characteristics of the sheet. Proc. P. N. Lebedev Phys. Inst. [Acad. Sci. USSR] (Engl. Transl.) 74, 107-160.
Fukao, S., and Tsuda, T. (1973). Reconnection of magnetic lines of force: Evolution in incompressible MHD fluids. Planet. Space Sci. 21, 1151.
Furth, H. P., Killeen, J., and Rosenbluth, M. N. (1963). Finite-resistivity instabilities of a sheet pinch. Phys. Fluids 6, 459-484.
Galeev, A. A., and Zelenyi, L. M. (1976). Tearing instability in plasma configurations. Sov. Phys.-JETP (Engl. Transl.) 43, 1113-1123.
Galeev, A. A., Coroniti, F. V., and Ashour-Abdalla, M. (1978). Explosive tearing mode reconnection in the magnetospheric tail. Geophys. Res. Lett. 5, 707-710.
Gekelman, W., and Stenzel, R. L. (1979). “Forced Tearing and Reconnection of Magnetic Fields in Plasmas,” UCLA PPG 417. Univ. of Calif., Los Angeles.
Gerlakh, N. I., and Syrovatskii, S. I. (1976). Numerical integration of the MHD equations near a magnetic null line. Proc. P. N. Lebedev Phys. Inst. [Acad. Sci. USSR] (Engl. Transl.) 74, 74.
Giovanelli, R. G. (1946). A theory of chromospheric flares. Nature (London) 158, 81.
Giovanelli, R. G. (1947). Magnetic and electric phenomena in the sun’s atmosphere associated with sunspots. Mon. Not. R. Astron. Soc. 107, 338.
Giovanelli, R. G. (1948). Chromospheric flares. Mon. Not. R. Astron. Soc. 108, 163.
Giovanelli, R. G. (1949). Electron energies resulting from an electric field in a highly ionized gas. Philos. Mag. [7] 40, 206.
Gold, T., and Hoyle, F. (1960). On the origin of solar flares. Mon. Not. R. Astron. Soc. 120, 89.
Grad, H., Hu, P. N., and Stevens, D. C. (1975). Adiabatic evolution of plasma equilibrium. Proc. Natl. Acad. Sci. U.S.A. 72, 3789-3793.
Hayashi, T., and Sato, T. (1978). Magnetic reconnection: acceleration, heating, and shock formation. J. Geophys. Res. 83, 217.
Imshennik, V. S., and Syrovatskii, S. I. (1967). Two-dimensional flow of an ideally conducting gas in the vicinity of the zero line of a magnetic field. Sov. Phys.-JETP (Engl. Transl.) 25, 656-664.
Irby, J. H., Drake, J. F., and Griem, H. R. (1979). Observation and interpretation of magnetic field line reconnection and tearing in a theta pinch. Phys. Rev. Lett. 42, 228-231.
Jacobsen, R. A. (1975). High speed photographic studies of the equilibrium and stability of the ATC Tokamak. Plasma Phys. 17, 547-554.
Jaggi, R. K. (1964). “A Mechanism for the Dissipation of the Magnetic Field in Solar Flares,” NASA SP-50. US Govt. Printing Office, Washington, D.C.
Kaw, P. (1976). “Some Nonlinear Effects in Tearing Mode Instability,” MATT-1264. Princeton Plasma Phys. Lab., Princeton University, Princeton, New Jersey.
Killeen, J. (1964). “Resistive Instability Calculations for the TRIAX Experiment,” Sect. V.4 in UCRL-12028 (LBL Report), pp. 85-87. Lawrence Berkeley Lab., Univ. of Calif., Berkeley.
Kirii, N. P., Markov, V. S., Frank, A. G., and Khodzhaev, A. Z. (1977). Rapid change in the structure of the magnetic field of a current sheet. Sov. J. Plasma Phys. (Engl. Transl.) 3, 303-306.
Kirii, N. P., Markov, V. S., Syrovatskii, S. I., Frank, A. G., and Khodzhaev, A. Z. (1979). Laboratory study of the structure and dynamics of a pinching current sheet. Proc. P. N. Lebedev Inst. 110, 121 (in Russian).
Kunkel, W. B. (1960). “A Simple Analysis of the Tubular Pinch Discharge,” UCRL-9311 (LBL Report), pp. 3-15. Lawrence Berkeley Lab., Univ. of Calif., Berkeley.
Levy, R. H., Petschek, H. E., and Siscoe, G. L. (1964). Aerodynamic aspects of the magnetospheric flow. AIAA J. 2, 2065-2076.
McDonald, K. L. (1954). Topology of steady current magnetic fields. Am. J. Phys. 22, 586.
MAGNETIC RECONNECTION EXPERIMENTS
65
Morozov, A. I., and Solov’ev, L. S. (1966). The structure of magnetic fields. Rev. Plasma Phys. 2, 1 Morton, A. H. (1976). Disruptive instability mode structure in the LT-3 Tokamak. Nucl. Fusion 16,571-577. Ohyabu, N. (1974). Energy dissipation in a magnetic neutral point. Ph.D. Thesis, Inst. Space Aeron. Sci., University of Tokyo. Ohyabu, N., and Kawashima, N. (1972). Neutral point discharge experiment. J . Phys. Soc. Jpn. 33,496-501. Ohyabu, N., Okamura, S., and Kawashima, N. (1974). Strong ion heating in a magnetic neutral point discharge. Phys. Fluids 17,2009-201 3. Okamura, S. (1978). Magnetic neutral point discharge and its application to high temperature plasma injector. Ph.D. Thesis, University of Tokyo. Okamura, S., Okamura, N., and Kawashima, N. (1975). A hot-plasma injector using a divertor based on the magnetic-neutral-point discharge. Nucl. Fusion 15, 207-212. Overskei, D. (1976). Plasma behavior in the vicinity of a neutral magnetic field line. Ph.D. Dissertation, Massachusetts Institute of Technology, Cambridge. Overskei, D., and Politzer, P. (1976). Plasma turbulence in the vicinity of a magnetic neutral line. Phys. Fluids 19,683. Parker, E. N. (1955a). Hydromagnetic dynamo models. Ap. J . 122,293-314. Parker, E. N. (1955b). The formation of sunspots from the solar toroidal field. Ap. J . 121, 491 -507. Parker, E. N. (1957). Sweet’s mechanism for merging magnetic fields in conducting fluids. .I Geophys. , Res. 62, 509. Parker, E. N. (1958). Interaction of the solar wind with the geomagnetic field. Phys. Fluids 1, 171-189. Parker, E. N. (1963). The solar flare phenomenon and the theory of reconnection and annihilation of magnetic fields. Ap. J . Suppl. Ser. 8, 177. Parker, E. N. (1973a). Comments on the reconnexion rate of magnetic fields. J . Plasma Phys. 9,49. Parker, E. N. (1973b). The reconnection rate of magnetic fields. Ap. J . 180, 247. Parker, E. N., and Krook, M. (1956). Diffusion and severing of magnetic lines of force. A p . J. 124, 214. 
Petschek, H. E. (1964). “Magnetic Field Annihilation,” NASA SP-50, p. 425. US Govt. Printing office, Washington. D.C. Piddington, J. H. (1959). The transmission of geomagnetic disturbances through the atmosphere and interplanetary space Geophys. J. R . Astron. SOC.2, 173-189. Podgorny, A. I., and Syrovatskii, S. 1. (1978). “Numerical Simulation of Long Time Development and Rupture of Current Sheet.” P. N. Lebedev Phys. Inst. Prepr. No. 2. Acad. Sci. USSR, Moscow. Priest, E. R., and Soward, A. M. (1976). “On Fast Magnetic Field Reconnection,” Basic Mechanisms of Solar Activity,” IAU, pp. 353-366. Reidel, Holland. Roederer, J. G. (1977). Global problems in magnetospheric plasma physics and prospects for their solution. Space Sci. Rev. 21, 23. Rosenau, P. (1977). A class of 3-D flows with neutral points. Bull. Am. Phys. SOC.[2] 22, 1206. Rosenau, P. (1979). Three-dimensional flow with neutral points. Phys. Fluids 22, 849. Rosenbluth, M. N., Sagdeev, R. Z . , Taylor, J. B., and Zaslavski, G . M. (1966). Destruction of magnetic surfaces by magnetic field irregularities. Nucl. Fusion 6, 297-300. Russell, C . T., and McPherron, R. L. (1973). The magnetotail and substorms. Space Sci. Rev. 15,205-266. Sato. T. (1979). Strong plasma acceleration by slow shocks resulting from magnetic reconnection. J . Geophys. Res. 84, 7171.
66
P . J . BAUM AND A. BRATENAHL
Sato, T., and Hayashi, T. (1979). Externally driven magnetic reconnection and a powerful magnetic energy converter. Phys. Fluids 22, 1189-1202. Satya, Y., and Schmidt, G. (1978). “An Introduction to Tearing Modes,” Research report. Stevens Institute of Technology, Hoboken, New Jersey. Satya, Y . , and Schmidt, G. (1979). Interaction of tearing modes. Phvs. Fluids 22, 587-588. Smith, D. F. (1977). Current instability in reconnecting current sheets. J. Geophys. Res. 82, 704-708. Somov, B. V., and Syrovatskii, S. I. (1975). Electric and magnetic fields arising from the rupture of a neutral current sheet. Bull. Acad. Sci. U S S R , Phys. Ser. 39, 109-111. Sonnerup, B. U. 0. (1970). Magnetic-field reconnection in a highly conducting incompressible fluid. J. Plasma Phys. 4, 161-174. Sonnerup, B. U. 0. (1979). Magnetic field reconnection. In “Solar System Plasma Physics” (C. F. Kennel, L. J., Lanzerotti, and E. N. Parker, eds.), Vol. 3, p. 45. North-Holland Publ., Amsterdam. Soward, A . M., and Priest, E. R. (1977). Fast magnetic field line reconnection. Philos. Trans. R. Soc. London 284, 369. Stenzel, R. L., and Gekelman, W. (1979a). Experiments on magnetic-field-line reconnection. Phys. Rev. Lett. 42, 1055-1057. Stenzel, R. L., and Gekelman, W. (1979b). “Electrostatic and Induced Electric Fields in a Reconnection Experiment,” UCLA Report PPG 414. Univ. of Calif., Los Angeles. Sweet, P. A. (1958a). The neutral point theory of solar flares. In “Electromagnetic Phenomena in Cosmical Physics” (B. Lehnert, ed.), IAU Symp. No. 6, pp. 123-134. Cambridge Univ. Press, London and New York. Sweet, P. A. (1958b). The production of high energy particles in solar flares. Nuouo Cimento, Suppl. 8, 188-196. . Sweet, P. A. (1969). Mechanisms of solar flares. Ann. Rev. Astron. Astrophys. 1, Syrovatskii, S. I. (1966). Dynamic dissipation of a magnetic field and particle acceleration. Sou. Astron.-AJ. (Engl. Transl.) 10,270-280. Syrovatskii, S. I. (1968). 
Magnetohydrodynamic cumulation near a zero field line. Soc. Phys.JETP (Engl. Transl.) 21,763-766. Syrovatskii, S. I. (1975). Charged-particle acceleration in processes of the solar-flare type. Bull. Acad. Sci. U S S R , Phys. Ser. 39,96-108. Syrovatskii, S. I. (1976). Neutral current sheets in plasmas in space and in the laboratory. P w c . P. N . Lebedec Phys. Inst. [Acad. Sci. U S S R ](Engl. Transl.) 1 4 , 1-10. Syrovatskii, S. I. (1977a). “The Current Sheet Formation and Rupture Near the Magnetic Zero Line,” P. N. Lebedev Phys. Inst. Prepr. No. 2. Acad. Sci. USSR, Moscow. Syrovatskii, S. I . (1977b). Singular magnetic field lines in a plasma. Sou. Phys. Lebedec Insr. Rep. (Engl. Transl.) 5, 9-12. Syrovatskii, S. I . (1979). General analysis of the problem of plasma flow in the neighborhood of a hyperbolic magnetic neutral line. Proc. P. N . LebedeL;. Phys. Ins?. [Acad.Sci. USSR] (Engl. Trans/.) 110, 5. Syrovatskii, S . I., Frank, A. G., and Khodzhaev, A. Z . (1973). Current distribution near the null line of a magnetic field and turbulent plasma resistance. Sou. Phys.-Tech. Phys. (Engl. Transl.) 18, 580-586. Tsuda, T., and Ugai, M. (1977). Magnetic field-line reconnexion by localized enhancement of resistivity. Part 2. Quasi-steady process. J . Plasma Phys. 18, 451-471. Uberoi, M. S. (1963). Some exact solutions of magnetohydrodynamics. Phys. Fluids 6, 1379. Ugai, M., and Tsuda, T. (1977). Magnetic field line reconnextion by localized enhancement of resistivity. Part I. Evolution in a compressible MHD fluid. J . Plasma Phys. 17, 337-356.
MAGNETIC RECONNECTION EXPERIMENTS
67
Ugai, M ., and Tsuda, T. (1979a). Magnetic field-line reconnection by localized enhancement of resistivity. Part 3. Controlling factors. J . Plasma Phys. (in press). Ugai, M., and Tsuda, T. (1979b). Magnetic field-line reconnection by localized enhancement of resistivity. Part 4. Dependence on the magnitude of resistivity. J . Plasma Phys. (in press). Vainshtein, S. I. (1974). Antidynamo-a possible mechanism of phenomena occurring in neutral layers of a magnetic field. Sou. Phys.-JETP (Engl. Transl.) 38, 270-275. Van Hoven, G. (1979). The energetics of resistive magnetic tearing. A p . J . 232, 572. Vasyliunas, V. M. (1975). Theoretical models of magnetic field line merging. 1. Rev. Geophys. Space Phys. 13, 303. Vlases, G. C. (1967). Shock and current layer structure in an electromagnetic shock tube. Phy.s. Fluids 10, 2351. Waddell. B. V., Rosenbluth, M. N., Monticello, D. A,, and White, R. B. (1976). Nonlinear growth of the m = 1 tearing mode. Nucl. Fusion Lett. 16, 528. White, R. B., Monticello, D. A,, Rosenbluth, M. N., and Waddell, B. V. (1976). Nonlinear tearing modes in Tokamaks. Proc. Berchtesgaden Con$ Plasma Phys. 1, 569. Yeh, T., and Axford, W. I. (1970). On the reconnection of magnetic field lines in conducting fluids. J . Plasma Phys. 4,207-229. Zukakishvili, G. G., Kavartskhava, I. F., and Zukakishvili, L. M. (1978). Plasma behavior near the neutral line between parallel currents. Sou. J . Plasma Phys. (Engl. Transl.) 4, 405-410.
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 54
Electron Physics in Device Microfabrication. II
Electron Resists, X-Ray Lithography, and Electron Beam Lithography Update

P. R. THORNTON
GCA Corporation
Burlington, Massachusetts
I. Introduction
   A. General
   B. The Microlithographic Problem
II. Interactions between a Focused Electron Beam and a Resist-Covered Wafer
   A. General Introduction
   B. Interaction between an Electron Beam and a Resist-Coated Substrate: Qualitative Description
   C. Practical Aspects of Resist Behavior
   D. Interaction between an Electron Beam and a Resist-Coated Substrate: Quantitative Description of Broad Beam Case
   E. The Proximity Effect
III. X-Ray Lithography
   A. Overall Description
   B. The Ideal Resist Case
   C. X-Ray Source Development
   D. Masks for X-Ray Lithography
   E. Magnitudes for a Fast-Throughput X-Ray System
   F. Advanced X-Ray Source Development
IV. Recent Work in Electron Beam Lithography
   A. …
   B. …cal Computation
   C. Electron Beam Lithography Based on the Use of a Field Emitter Cathode
   D. Shaped-Beam Lithography Systems
   E. Electron Projection Methods
   F. Electron-Optical Components
V. The Relative Roles of X-Ray and Electron Beam Lithography Systems with High Throughput
References
I. INTRODUCTION

A. General
The drive toward the microminiaturization of electronic devices persists strongly today, and in the near future we shall require microlithography
patterns incorporating detail and associated complexity beyond the limits that optical techniques can provide (see, for example, Keyes, 1979). Much current work is exploiting techniques based on the use of electron beams and X rays to provide the necessary resolution. A brief summary of the various available approaches was given in a previous chapter in Volume 48 of this series (Thornton, 1979), and the central role of a scanning electron beam system for both direct writing and pattern generation was stressed. A comprehensive treatment was given of the challenges, limitations, and possibilities involved in the development of a fast-scanning electron beam system for direct writing in a production environment. In contrast, the treatment of X-ray lithography was, of necessity, more superficial. Nor were the detailed phenomena associated with electron resist development given the attention that the subject merits. The purpose of this review chapter is to remove these shortcomings and, at the same time, to provide an updated synopsis of the progress made in the development of fast-scanning systems.

In Section II emphasis is placed on the central role that the resist plays in deciding which approach becomes the more economically viable. In this section the ideal “family” of resists required for lithographic work is discussed, the necessary detailed specification of a resist is outlined, and an illustrative theoretical treatment of resist behavior is sketched. This emphasis on resist behavior is justified on the grounds that in this work area there is the most scope for establishing the exact role to be played by fast, versatile scanning systems with the implicit capability of making very localized corrections for distortion.

Section III examines in detail the possibilities and complexities involved in X-ray lithography. The basic ideas underlying the approach are outlined and, subsequently, each component is examined in detail. Where possible, quantitative data pertinent to the optimization of the method are given, the critical elements are identified, and experimental results are discussed.

In Section IV a summary is given of the recent progress made in electron beam lithography. The topics covered include improvements in computational methods contributing both to electron-optical components and to the understanding of the Coulombic interaction problem, recent developments in electron-optical components, progress made with conventional projection systems and with cathode projection systems, and finally recent results published in the area of scanning-beam systems using both thermal and field electron emitters.

At this point, it is convenient to outline the common problem that all microlithographic approaches face, i.e., the quick and reliable reproduction of patterns with increasing resolution over larger and larger chip sizes.
B. The Microlithographic Problem

The preciseness of the required pattern and its repeatability are the essential features that have to be considered, coupled with throughput, cost, and the realization that the very act of device processing will distort the previously developed structure both in the plane of the wafer and by warpage of the wafer surface itself. To provide this pattern, we can use parallel approaches in which all elements are exposed in one flash; we can exploit a purely series approach in which each element (or part of an element) is exposed sequentially; or we can adopt a compromise in which a selected subfield of a given number of elements is exposed in parallel and the whole subfield sequentially repeated over the required area by a process of “step and repeat.” These various approaches are applicable to optical techniques, X-ray methods, and electron beam systems and are summarized schematically in Table I.

TABLE I
POSSIBLE CONCEPTUAL APPROACHES TO THE MICROLITHOGRAPHIC PROBLEM FOR OPTICAL, X-RAY, AND ELECTRON EXCITATION

Nature of exposure method: Parallel <-> Step and repeat <-> Series

Optical
    Parallel: single exposure of whole wafer through mask
    Step and repeat: reticle system using step and repeat to cover whole wafer

X-ray
    Parallel: single exposure of whole wafer through mask
    Step and repeat: reticle system using step and repeat to cover whole wafer

Electron beam (entries ordered from the parallel end toward the series end)
    Electron projection systems exposing whole wafer or large group of chips
    Electron projection systems exposing small group of chips or single chip
    Scanning system using an aperture plate
    Scanning system using single variable-sized aperture with sharpened distribution
    Scanning system using small Gaussian-profiled, round spot

In general, series approaches pose the problem of obtaining the required speed but have the inherent ability to correct for distortions in a versatile manner appropriate to the particular application. Purely parallel approaches have the required speed but have difficulty in providing the required degree of pattern precision and self-correctability. Compromise step and repeat processes sacrifice speed/throughput in order to obtain the required precision. Table II shows in somewhat oversimplified form the various trade-offs that can be envisaged in the important case where we wish to write patterns with 0.5-1 µm detail over 5 × 5 mm chips on a 7.5 × 7.5 cm wafer with ±0.1 µm repeatability.

TABLE II
SCHEMATIC REPRESENTATION OF THE TRADE-OFF BETWEEN SPEED (THROUGHPUT) AND PATTERN PRECISION IN GOING FROM PARALLEL EXPOSURE, THROUGH STEP AND REPEAT, TO SERIES EXPOSURE FOR THE PARTICULAR CASE OUTLINED IN THE TEXT

Exposure method and specifications

(1) Parallel, single “shot” of whole wafer
    Number of “elements”: one
    Repeatability: ±0.1 µm in 7.5 cm, or ±1 part in 7.5 × 10^5
    Dead time due to stage movement: one stage movement
    Correctability to match wafer distortion: only partially, over scale of complete wafer

(2) Step and repeat with reticle of four 5 × 5 mm chips
    Number of “elements”: ≈40
    Repeatability: ±0.1 µm in 10 mm, or ±1 part in 10^5
    Dead time due to stage movement: 40 movement times
    Correctability to match wafer distortion: only partially, on scale of 1 cm

(3) Step and repeat with single chip
    Number of “elements”: ≈160
    Repeatability: ±0.1 µm in 5 mm, or ±2 parts in 10^5
    Dead time due to stage movement: 160 movements
    Correctability to match wafer distortion: only partially, on scale of 5 mm

(4) Scanning approach using variable-sized shaped beam
    Number of “elements”: ≈10^4 to 5 × 10^…/chip
    Repeatability: ±0.1 µm in 5 mm, or ±2 parts in 10^5
    Dead time due to stage movement: 160 movements
    Correctability: ability to interpolate correction within the 5 × 5 mm scan field on the scale of 200 µm

(5) Scanning approach using small Gaussian spot and 2 × 2 mm field
    Number of “elements”: >10^6/chip
    Repeatability: ±2 parts in 10^5
    Dead time due to stage movement: ~1000 movements
    Correctability: ability to interpolate correction within the 5 × 5 mm scan field on the scale of ~200 µm
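As a check on the magnitudes quoted for this case, the repeatability ratios and the number of stage movements can be reproduced with elementary arithmetic. The short Python sketch below is illustrative only: the function names are hypothetical, and the count of exposure fields assumes square fields tiling the full 7.5 × 7.5 cm square, which is a simplification not stated in the text and gives only an upper bound.

```python
import math


def one_part_in(tolerance_um, field_mm):
    """Return N such that a tolerance of `tolerance_um` over `field_mm` is 1 part in N."""
    return field_mm * 1000.0 / tolerance_um  # convert the field size from mm to µm


def fields_upper_bound(wafer_side_mm, field_side_mm):
    """Upper bound on whole square fields fitting on a square wafer area (assumed tiling)."""
    return math.floor(wafer_side_mm / field_side_mm) ** 2


# Cases taken from the text: 0.1 µm repeatability on a 7.5 cm wafer.
for label, field_mm in [("whole wafer", 75.0),
                        ("four-chip reticle", 10.0),
                        ("single 5 x 5 mm chip", 5.0),
                        ("2 x 2 mm scan field", 2.0)]:
    ratio = one_part_in(0.1, field_mm)
    count = fields_upper_bound(75.0, field_mm)
    print(f"{label}: 1 part in {ratio:.3g}, at most {count} fields")
```

For the whole wafer, the four-chip reticle, and the single chip, the ratios come out as 1 part in 7.5 × 10^5, 1 part in 10^5, and 1 part in 5 × 10^4 (i.e., 2 parts in 10^5), in agreement with Table II. The tiling bounds for the reticle, the single chip, and the 2 × 2 mm field (49, 225, and 1369) lie above the approximately 40, 160, and 1000 stage movements quoted in the table; the text does not state how those figures were obtained.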
II. INTERACTIONS BETWEEN A FOCUSED ELECTRON BEAM AND A RESIST-COVERED WAFER

A. General Introduction

This topic is an extensive work area involving expertise in several fields where analysis and modeling are complex and definitive experimentation is difficult and painstaking. In such a situation, the problem of selection of content is magnified and some clear statement of intent is needed to avoid disappointment. This section is intended to provide workers and students who have a knowledge of modern microelectronic circuits, but no detailed knowledge of beam and beam-interaction behavior, with a basis for understanding how complex effects specific to the interaction between focused electron beams and solid materials complicate, aid, and extend the development of such circuits. It attempts an overall synthesis in perspective rather than a detailed critique. There is one notable omission: the subject of irradiation damage is not discussed. The sole reason is space limitation. A natural starting point is a qualitative description of the interactions between electrons and a resist-coated wafer.
B. Interaction between an Electron Beam and a Resist-Coated Substrate: Qualitative Description

Over the range of electron energies (say 5-50 keV) that are of use in microlithography, the main mechanism whereby primary electrons lose energy is the excitation and ionization of the host atoms of the target material. These inelastic collisions involve interactions between the incident electron and the bound electrons of the bombarded material. Over and above the energy change that occurs, considerable scattering is implicit in this type of collision. In general, each such collision produces a “small-angle” scattering of the incident electron. By way of contrast, when an incident electron (a “primary” electron) impinges closely on an atomic nucleus, very large-angle scattering can result from an essentially elastic “orbiting” type of collision. The relative proportion of these two types of interaction is a function of atomic number. As the atomic number increases,
the number of bound electrons and the extent of the electron cloud increase, so that small-angle scattering becomes increasingly important.

These mechanisms lead to the effects known as “backscattering” and “secondary emission.” That is to say, electrons are lost from the material either by the backscattering or “reflection” of the fast primary electrons or by the emission of slow secondary electrons created as a result of the ionization of matrix atoms. In general terms, at normal incidence the backscattered electrons will have energies approaching the primary beam energy and the secondary electrons will have energies below 50 eV. The net magnitude of these losses from the energy available for absorption in the target depends on many factors, including the atomic number of the target, external electromagnetic field distribution, surface finish, surface layer configuration, and surface inclination. This topic is well covered in the literature (see, for example, Knoll and Kazan, 1952; Nosker, 1969; Cosslett and Thomas, 1964a,b,c). From the viewpoint of microlithography we are fortunate that we are, in the main, concerned with smooth surfaces placed normal to the beam, so that the loss is constant across the wafer and can be measured in a suitable calibration. There is one exception to this simple situation: in the later layers of complex structures, localized surface inclination and, to some extent, differences in atomic number play a role and complicate the behavior.

After interacting with the surface layers, the primary electrons “random walk” down into the bulk of the resist layer, losing energy and being randomly scattered through small angles, with occasional large-angle scattering back toward the surface. At each elementary layer in the target, the electrons freed from the host atoms undergo a similar process of energy loss and scattering, causing additional bond breaking and atom excitation.
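The random-walk picture described above can be made concrete with a toy Monte Carlo sketch in Python. It is not the method of any work cited here and is not physically calibrated. It assumes a two-dimensional geometry, a constant energy loss per unit path length, Gaussian small-angle deflections at every step, and a fixed probability of a large-angle event that redirects the electron at random. The function name and all parameter values are hypothetical placeholders chosen only to show the qualitative behavior.

```python
import math
import random


def toy_electron_walk(e0_kev=20.0, step_um=0.05, loss_kev_per_um=5.0,
                      small_angle_rad=0.2, p_large=0.02, e_stop_kev=0.05,
                      rng=None):
    """Follow one primary electron; return (path, backscattered).

    x is the lateral position and z the depth below the surface (both in µm).
    theta is the direction of travel measured from the inward surface normal.
    All defaults are illustrative assumptions, not measured values.
    """
    rng = rng or random.Random()
    x, z, theta, energy = 0.0, 0.0, 0.0, e0_kev
    path = [(x, z)]
    while energy > e_stop_kev:
        if rng.random() < p_large:
            # Rare large-angle (essentially elastic) event: new random direction.
            theta = rng.uniform(-math.pi, math.pi)
        else:
            # Frequent small-angle deflection from an inelastic collision.
            theta += rng.gauss(0.0, small_angle_rad)
        x += step_um * math.sin(theta)
        z += step_um * math.cos(theta)
        energy -= loss_kev_per_um * step_um  # assumed constant loss rate
        path.append((x, z))
        if z < 0.0:
            return path, True  # electron has left through the surface
    return path, False


if __name__ == "__main__":
    rng = random.Random(1)
    runs = [toy_electron_walk(rng=rng) for _ in range(2000)]
    fraction_back = sum(1 for _, back in runs if back) / len(runs)
    deepest = max(max(z for _, z in path) for path, _ in runs)
    print(f"backscattered fraction (toy model): {fraction_back:.2f}")
    print(f"deepest penetration (toy model): {deepest:.2f} µm")
```

Raising `e0_kev` in this sketch lengthens every trajectory and so deepens and broadens the region in which energy is deposited, which is the qualitative trend the text describes; the numerical values it prints have no quantitative significance.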
Figure 1 shows the nature of the resultant region in which the total energy is expended in a case of particular importance in microlithography, where the resist thickness is such that beam energy is lost both in the resist layer itself and in the underlying substrate. As the beam energy is increased, the electrons statistically penetrate further into the target (see Fig. 1).

A convenient measure of the depth of penetration of the primary electrons is the “range” of the electrons. Various range values have been calculated with differences in the detailed modeling, but in general the basis of the calculation is to derive an expression for the rate of energy loss per unit length of trajectory and for the “expected” or mean energy of the primary electrons as a function of depth into the target. The range is that depth at which the mean energy falls to an appropriate ionization energy. A self-consistent scaling based on a suitably defined electron range helps in understanding the processes involved and aids brevity and clarity of presentation. Figure 1 shows how the behavior changes as the electron range is increased from just greater than the resist thickness to considerably greater. Although somewhat idealized, these figures indicate a basic difficulty associated with all forms of electron beam lithography using a refined beam. With a finely focused beam, the region over which the incident energy is available for excitation of the resist is determined by electron scattering within the resist itself and within the underlying substrate. This fact is the physical basis of the “proximity effect” (Ozdemer et al., 1973; Chang et al., 1974) in which, at separations of the order of 1 µm or less, the exposure of one region of micron size can “presensitize” adjacent areas, leading to overexposure and poor conformity if the effect is not considered. This problem is discussed in Section II,E.

FIG. 1. The region in which electron beam energy is lost in penetration through a resist layer into the substrate: (a)-(c) the effect of increasing beam voltage, say from 10 to 25 kV for a resist layer ~1 µm thick.

So far we have briefly discussed the excitation of the target by the beam. The processes by which this excitation is exploited in resist modification have to be considered. In metal, semi-insulator, and semiconductor materials, this energy results in free carrier generation, phonon production, X-ray emission, UV and optical photon generation, and Auger processes at the dose levels we consider here. In polymer materials, the processes essential to photolithography are superimposed on these mechanisms. Polymer materials can exist as long-chain molecules with systematic repeat structures in basically three forms: long, straight individual molecules of high molecular weight; individual molecules that contain a branch point; and intercoupled, or “cross-linked,” structures in which the structure approaches one giant
molecule. The chemical bonding is mainly by shared electrons. The process of electron expulsion from such a molecule leaves a “dangling bond” or a “radical ion.” Under an appropriate level of excitation, the bond breaking can lead to a fragmentation of the existing molecules (a process of “scission”). While scission is occurring, the radical ions are seeking to rejoin and repair the fragmentation. The two processes are in competition and, depending on the structure, one process predominates. If the net effect is a fragmentation, the molecular weight decreases in the irradiated areas and such areas become increasingly soluble in “developer” solvents. Such resists are termed “positive” resists. On the other hand, if the bond “mending” is the predominant process, particularly if cross-linking occurs, then the molecular weight of the irradiated material increases, its solubility in solvents decreases, and the irradiated areas remain after development, i.e., a “negative” resist results.

Resist design therefore consists, in grossly oversimplified form, of establishing the necessary mean molecular weight with sufficient cross-linking and with the right fraction of radiation-sensitive “sites” to give the required properties. Modern electron beam lithography requires both positive and negative resists to be available with sufficient sensitivity and contrast (see Section II,C,2).

At this stage, we can sum up the essential steps in analyzing this problem. No significant loss of generality occurs by considering the particular case of a negative resist. The main steps can be listed as:
(1) Consider a uniform beam (allow for loss by backscattering, etc.) and determine the rate of loss of energy of the primary beam as a function of depth right down to the full range of the electrons.
(2) At each elementary layer, model the released energy in terms of secondary electron formation and model the distribution and scatter of this secondary electron energy as a function of depth both in the resist and in the substrate.
(3) Sum the results of (1) and (2) to obtain the total release of energy at each level in the resist down to the interface between the resist and the substrate.
(4) Make, of necessity, ad hoc but physically reasonable assumptions as to the number of ionizations (or scissions) resulting from a given loss of primary beam energy as a function of depth and hence determine the number of ionizations as a function of depth.
(5) Determine as rigorously as possible the number of cross-links occurring per ionization. Hence, determine the rate of cross-linking as a function of time at each depth in the resist.
(6) Establish a reasonable estimate of the density of cross-links required to just inhibit solution in a developer that completely removes the unirradiated material, i.e., determine the criterion for the onset of “gelation.”
(7) Determine the incident dose required at the surface of the resist to bring about gelation at the interface between the resist and the substrate. This dose is the dose required to leave a residual film equal in thickness to the initial layer.
(8) Extend the analysis to establish the properties of layers that have received a dose less than that required to leave a residual film equal to the initial thickness; determine the residual thickness as a function of dose and so determine the contrast properties.

From a quantitative viewpoint, the problem represents a considerable challenge. Considerable insight into the progress made can be obtained by examining an analysis by Heidenreich and co-workers (1973). This approach examines and predicts the properties of negative resists, in particular, on a model that considers only the loss of energy from a uniform beam. This model is outlined in Section II,D, but first, to put the topic in perspective, we should outline some practical aspects of resist technology when applied to device fabrication.

C. Practical Aspects of Resist Behavior

1. General
Here we are concerned with practical work from three interrelated viewpoints. First, we have to define the needs of modern lithography with regard to resist behavior in terms of sensitivity and contrast, i.e., those properties pertinent to the lithographic process itself. Second, we have to consider the properties in toto that a good resist should have. Finally, we should be cognizant of the technological steps needed in resist applications. These steps are well cataloged in the literature and are not discussed here.

2. The Lithography Specification of Resist Behavior
With regard to the needs of modern lithography, the first point to stress is that there is a range of behavior dictated by the needs of commercial competition and by the wide range of devices to be made. In general, large-volume fabrication of devices that are not at the outermost edge of technology requires speed of fabrication, high tolerance to day-to-day variations in process, and high yield. Critical devices of increased complexity that are used as the basis of instrumentation need a somewhat smaller throughput and a higher performance. In addition, there are specialized or custom devices in which the volume is small, the performance critical, and special properties come to the fore. In this context, we have critical devices in satellite applications, military operations, and nuclear engineering where radiation properties are of concern. The range of complexity will increase rather than decrease in the immediate future. Bearing in mind that masks have to be exposed over widely varying fractions of their total area, it is apparent that both positive and negative resists are required.

Calculation suggests that if the application of scanning electron beam systems to direct writing is not to be resist limited, then the resist sensitivity should be of order 5 × 10^[exponent illegible] C/cm². Workers (see Thompson, 1974) have argued that the limiting sensitivity available is [value illegible] C/cm², the governing factor being the susceptibility of high-sensitivity electron beam resists to thermally induced changes. Thus the required sensitivity is available, in principle, but in practice trade-offs of sensitivity to achieve other needed properties have to be made.

Of particular relevance here is the question of contrast. In Fig. 2, definitions are given of the dose required to initiate gelation, $D_g^i$, and that needed to give residual films equal in thickness to the initial film, $D_g^0$. We can, in principle, define the contrast $\gamma$ as

$$\gamma = [\log_{10}(D_g^0/D_g^i)]^{-1} \qquad (1)$$

In practice, it is found that satisfactory behavior is obtained if $\gamma > 1$, or, in terms of Fig. 2, less than a decade should exist between $D_g^i$ and $D_g^0$. Thus the specification of a resist in terms of sensitivity alone is inadequate; contrast behavior has to be given due emphasis. The contrast properties mainly reveal themselves through the resultant edge resolution or “sharpness” of the structures resulting from the application of the resist. In general, the higher the contrast, the “sharper” the edge of the resultant structure until effects due to the processing, etching, etc., become apparent.

FIG. 2. Definition of the sensitivity and contrast ($\gamma$) of a positive resist. (Abscissa: electron dose, C/cm².)

In summary then, the needs are for a family of both positive and negative resists with sensitivities and contrasts ranging upward from of order 5 × 10^[exponent illegible] C/cm² and $\gamma > 1$ for general work at the 1 μm to submicron level and of order $10^{-6}$ C/cm² with high $\gamma$ for more precise work at ultrahigh resolution.
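As a numerical illustration of Eq. (1), the following minimal Python sketch evaluates the contrast from the two gel doses. The function names and the dose values are hypothetical, chosen only for illustration, and are not taken from any resist data.

```python
import math


def contrast(dose_onset, dose_full):
    """Contrast gamma = 1 / log10(dose_full / dose_onset), as in Eq. (1).

    dose_onset: dose at which gelation starts (C/cm^2).
    dose_full:  dose giving a residual film equal to the initial thickness.
    """
    if not 0.0 < dose_onset < dose_full:
        raise ValueError("expected 0 < dose_onset < dose_full")
    return 1.0 / math.log10(dose_full / dose_onset)


def decades_between(dose_onset, dose_full):
    """Number of decades separating the two doses on the log-dose axis."""
    return math.log10(dose_full / dose_onset)


# Hypothetical doses: a factor of 4 between onset and full thickness.
gamma = contrast(1.0e-6, 4.0e-6)
print(f"gamma = {gamma:.2f}")
```

With these assumed doses the separation is about 0.6 of a decade, so the criterion $\gamma > 1$ is met; two full decades between the doses would give $\gamma = 0.5$.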
3. The Fabrication Aspects of Resist Behavior
Lithography is not the only process used in modern device fabrication and the resist has to comply with the processing needs implicit in fabrication and with straightforward but important factors arising from economic considerations. In list form, we can quickly summarize the additional properties that resists should have to find extensive application.
Resistance to etchants. This need is fundamental as failure here precludes accurate replication of the resist pattern in the underlying structure. The resists have to withstand action by strong acids and bases over a limited range of temperatures at and above ambient. In addition, resists have to withstand exposure to ion milling (Gloersen, 1976), sputter etching (Cantagrel, 1975), and plasma etching (see, for example, Minkewicz and Chapman, 1979). It is likely that an increasing application of the dry etch processes will occur in the next few years because of the inherent cleanliness and versatility of such methods.

Freedom from defects. A resist is of little use unless it is significantly free from defects. Essentially, this statement means that it must be free from microscopic faults or “pinholes,” which lead to regions of unwanted exposure to etchants. One major reason why polymeric materials find ready application is the uniformity of thickness and smooth surface associated with such materials. Here, we are concerned with defect levels of 1 cm⁻² or less.

Adhesion and mechanical stability. Any failure of adhesion between the resist and the underlying substrate can result in the unwanted exposure of supposedly resist-protected areas to etchant action. Depending on degree, the result can be poor edge resolution or the complete removal of required structure. Here we are concerned with a straightforward problem of adhesion between substrate and resist during etching. An additional complexity can result during the development process itself. A partially exposed film of negative resist, for example, will absorb the developing solution and will swell to thicknesses greater than that of the initial layer. The development action then consists of washing away the unexposed molecules within the
layer with a subsequent shrinking of the layer. The resist has to achieve a net mechanical stability throughout this process with its implicit heat cycles and uneven strain distribution. In the later stage of device fabrication, the complexity of the surface structure exacerbates the problem.
Tolerance to process changes. In commercial applications with an existing surfeit of complexities there is little room for a resist that will not perform satisfactorily when subjected to the small day-to-day variances in process routine that are inevitable. Any factor increasing the number of critical (i.e., time consuming) processes is unwanted.

Batch reproducibility. Again the straightforward pressure for simplicity and ease of application tends to preclude a resist with variable behavior from batch to batch. Hand in hand with this variability comes the need to make an accurate and extensive determination of the sensitivity curve for each batch rather than a few checkpoints to confirm agreement with an existing “master curve.”

Shelf life. Stability under storage conditions of both the solution and the spun films is essential. Various workers indicate that the shelf life should exceed 3-6 months.

No adverse interaction with exposure system. In the particular case of direct exposure by a scanning electron beam system, it is undesirable if the film properties cause electrical charge-up of the resist surface under bombardment, leading to unwanted beam shifts and other departures from the norm. Ideally, we require that a resist can be used without the inclusion of additional process steps. For example, the charge-up problem outlined above can, if necessary, be eliminated by the use of a conducting layer deposited on the resist. It is better that the required property be built into the resist by making it electrically conducting. Finally, it can be noted that some negative resists need a “postexposure” room temperature “anneal” to complete the exposure process. Resists avoiding this need are to be preferred.
Adequate range of thickness available. Device fabrication requires that the layer thickness used be near optimum for the application considered. As a result, it is required that resists be able to function adequately with film thickness extending from ~0.4 μm to, say, ~2.0 μm. The use of thick resist films in high-resolution work and in the later stages of LSI device fabrication is becoming increasingly important.
D. Interaction between an Electron Beam and a Resist-Coated Substrate. Quantitative Description of Broad Beam Case

1. General

Here, heavy reliance is placed on work by Heidenreich et al. (1973), which indicates the scope of the challenge facing this analysis. This work is of value in establishing the basic ideas and as a basis on which to discuss the proximity effect (see Section II,E). The sequence of factors to be considered is that listed at the end of Section II,B. The natural starting point is the manner in which the incident electrons lose energy.
2. Rate of Loss of Primary Electron Energy

Figure 3 outlines the model. An extended and uniform beam of intensity $J(0)$ A/cm² is incident on a resist-covered substrate. The resist is of thickness z = i and it is assumed that there is no backscattering from the substrate. In seeking to establish a means of scaling for the energy loss as a function of distance into the resist, these workers use an analysis due to Everhart and Hoff (1971). These latter authors have stressed that in many physical processes, including the present application, the quantity that is of most direct relevance is the rate of loss of beam energy as a function of depth rather than an estimate of the fractional transmission of a given film thickness for electrons of a given energy. On this basis, they argue that the work of Gruen (1957) examining the rate of energy loss of electrons in air forms a suitable foundation for the present analysis. The essential points are that, for the range of electron energies relevant here (say 5-25 kV), a single relationship can be used to relate the range to the particle energy, and that the rate of loss of energy curve is invariant if the dimensions are scaled to the range and if the energy is suitably scaled. In more analytical terms, if $E(z)$ is the rate of loss of electron energy with depth into the film, then

$$E(z) = e\,\frac{dV}{dz} = \frac{e}{R_G}\,\frac{dV}{d(z/R_G)} = \frac{eV_b}{R_G}\,\frac{d(V/V_b)}{df} \qquad (2)$$

where $f = z/R_G$, with $R_G$ the Gruen electron range. The argument is further strengthened by physical measurements made on solid materials (Everhart et al., 1966). The actual experimental technique used was to examine the electron-hole pair generation in the oxide layer of an MOS structure as a function of beam energy. Good experimental agreement was obtained using

$$R_G = \mathrm{const}\,(V_b)^{1.75} \qquad (3)$$

provided the depth dose curve is represented as a cubic polynomial in $f$. The results are of particular relevance to silicon technology as the structure studied is an Al-SiO₂-Si structure. The results should be valid for a range of atomic number $10 < Z < 15$. It is also found experimentally that $d(V/V_b)/df$ can be represented by $\Lambda(f)$, where

$$\Lambda(f) = A + Bf + Cf^2 + Df^3 \qquad (4)$$

with $A$ through $D$ as empirically determined parameters. These equations can now be applied to the case of a uniform beam of intensity $J(0)$ at the surface. If the beam is turned on for a time $\tau$, then the dose delivered at a depth $z$ is $D(z)$, given by

$$D(z) = |dE/dt|\,\tau = [J(0)/e](dV/dz)\,\tau \qquad (5)$$

FIG. 3. Energy loss processes in the resist/substrate combination: (a) parameters defining the uniform beam properties and the resist thickness; (b) standardized energy loss curve; (c) the importance of the resist thickness compared to the range, i.e., $f = 1$.
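The chain from Eq. (2) to Eq. (5) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of the workers cited: the range prefactor, its division by the density, and the four cubic coefficients are placeholder values of the kind quoted in the literature for this model and are not given in the present text, and all function names are hypothetical.

```python
ELECTRON_CHARGE = 1.602176634e-19  # C

# Assumed cubic coefficients A..D of Eq. (4); placeholder values of the kind
# quoted in the literature, not given in this text.
CUBIC_COEFFS = (0.60, 6.21, -12.40, 5.69)


def gruen_range_um(beam_kv, density_g_cm3, prefactor=0.046):
    """Eq. (3): R_G = const * V_b**1.75, here in micrometres.

    The prefactor (and its division by density) is an assumed scaling.
    """
    return prefactor / density_g_cm3 * beam_kv ** 1.75


def depth_dose_shape(f, coeffs=CUBIC_COEFFS):
    """Eq. (4): Lambda(f) = A + B f + C f^2 + D f^3 for 0 <= f <= 1, else 0."""
    if f < 0.0 or f > 1.0:
        return 0.0
    a, b, c, d = coeffs
    return max(0.0, a + b * f + c * f ** 2 + d * f ** 3)


def energy_loss_ev_per_cm(z_um, beam_kv, density_g_cm3):
    """Eq. (2): dV/dz = (V_b / R_G) * Lambda(z / R_G), per incident electron."""
    r_g = gruen_range_um(beam_kv, density_g_cm3)
    volts_per_um = beam_kv * 1.0e3 / r_g * depth_dose_shape(z_um / r_g)
    return volts_per_um * 1.0e4  # 1 um = 1e-4 cm


def absorbed_energy_ev_per_cm3(z_um, beam_kv, density_g_cm3, j0_a_cm2, tau_s):
    """Eq. (5): D(z) = [J(0)/e] * (dV/dz) * tau, in eV per cm^3."""
    electrons_per_cm2 = j0_a_cm2 * tau_s / ELECTRON_CHARGE
    return electrons_per_cm2 * energy_loss_ev_per_cm(z_um, beam_kv, density_g_cm3)


# Hypothetical case: 10 kV beam, unit-density resist, 1 uC/cm^2 incident charge.
for depth_um in (0.0, 0.5, 1.0, 2.0):
    dose = absorbed_energy_ev_per_cm3(depth_um, 10.0, 1.0, 1.0e-6, 1.0)
    print(f"z = {depth_um:.1f} um: D = {dose:.3e} eV/cm^3")
```

Working in the scaled depth $f$ keeps the shape function independent of beam voltage, which is the invariance argued for above; only `gruen_range_um` changes when the voltage or the material changes.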
3. Excitation of Photochemical Events and Competing Processes

The energy described in Eq. (5) is available for the excitation of the resist material. In general, two types of events can occur, those contributing to the required chemical and physical changes in the resist and those which do not. This division of energy between “productive” and “nonproductive” events can be described in the simple but important case where the energy is utilized only to create secondary electrons and cross-linking is occurring with no competing mechanism. Consider the first step as being the production of internal secondaries. If $n_i$ is the density of such particles,

$$dn_i/dt = (1/I_m)(dE/dt) = (1/I_m)[J(0)/e](dV/dz) \qquad (6)$$

where $I_m$ is the mean ionization energy. We are also concerned with the occurrence of cross-linking. If the concentration of such links is $n_c$ and if $\alpha$ cross-links occur per secondary ionization, then

$$dn_c/dt = \alpha\,dn_i/dt = (\alpha/I_m)[J(0)/e](dV/dz) \qquad (7)$$

$\alpha$ is, in fact, an implicit function of $n_c$. If we define $S_0$ as the number of potential cross-linking sites available at the onset ($t = 0$) and $\alpha_0$ as the corresponding value of $\alpha$, we can show that

$$n_c = S_0(1 - e^{-pt}) \qquad (8)$$

with $p = (\alpha_0/I_m)[J(0)/S_0 e]\,|dV/dz|$.
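Equation (8) follows from Eq. (7) if the yield is taken to fall linearly as sites are used up, $\alpha = \alpha_0(1 - n_c/S_0)$, which gives $dn_c/dt = p(S_0 - n_c)$. That linear form is an assumption made here to reproduce Eq. (8); the text states only that $\alpha$ depends implicitly on $n_c$. The Python sketch below compares the closed form with a direct numerical integration, using hypothetical function names and parameter values.

```python
import math


def crosslink_density(t, s0, p):
    """Closed form of Eq. (8): n_c(t) = S_0 * (1 - exp(-p t))."""
    return s0 * (1.0 - math.exp(-p * t))


def crosslink_density_numeric(t_end, s0, p, steps=10_000):
    """Forward-Euler integration of dn_c/dt = p * (S_0 - n_c).

    This rate law follows from Eq. (7) under the assumption
    alpha = alpha_0 * (1 - n_c / S_0).
    """
    dt = t_end / steps
    n_c = 0.0
    for _ in range(steps):
        n_c += p * (s0 - n_c) * dt
    return n_c


def time_to_reach(n_target, s0, p):
    """Exposure time at which n_c reaches n_target (inverts Eq. (8))."""
    if not 0.0 <= n_target < s0:
        raise ValueError("n_target must lie in [0, S_0)")
    return -math.log(1.0 - n_target / s0) / p


# Hypothetical values: 1e20 sites per cm^3 and p = 0.5 per second.
s0, p = 1.0e20, 0.5
print(crosslink_density(4.0, s0, p), crosslink_density_numeric(4.0, s0, p))
```

Because $p$ is proportional to $|dV/dz|$, the same functions can be evaluated layer by layer with a depth-dependent $p$ to follow the cross-link density through the film.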
4. Incorporation of Resist Parameters
In practice, workers in this field introduce two further parameters. The first of these parameters is the “G value,” defined as the number of events occurring per 100 eV of available energy. For example, for secondary ionization $G = G_i = 100/I_m$ and for cross-linking $G = G_c$ ($= 100\alpha/I_m$ in the present context). The second of these parameters is the “gel dose” and is a measure
of the number of cross-links $n_g$ required to just render the negative resist insoluble. From the device fabrication viewpoint, the analysis gives guidance as to how to couple the film thickness, the beam voltage, etc., so as to optimize fabrication techniques. This optimization has to take account of the variance of dose with distance into the resist layer, of its absolute value compared to the value required to give $n_c = n_g$ at critical locations such as the substrate interface, and of the distribution of cross-link density throughout the film. Figure 4c illustrates schematically some of the situations that can occur. In principle, this approach can be further extended to include competing processes with alternative G values, to include questions of resist contrast [Broyde (1969) and Heidenreich et al. (1979)], and to incorporate the effect of the diffusion of secondary electrons. Further understanding of the details of the polymer interaction with the beam can be obtained from original papers. In this context Thompson (1974), Ouano (1978), Roberts (1976, 1978), Heidenreich et al. (1975), and Thompson et al. (1973, 1977, 1979) provide considerable insight.
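The bookkeeping implied by the G value and the gel dose can be illustrated with a short Python sketch. It works in the linear regime only, i.e., it ignores the depletion of cross-linking sites; the function names and all numerical values are hypothetical.

```python
ELECTRON_CHARGE = 1.602176634e-19  # C


def g_value(events_per_ionization, mean_ionization_energy_ev):
    """Events per 100 eV absorbed: G = 100 * alpha / I_m (alpha = 1 gives G_i)."""
    return 100.0 * events_per_ionization / mean_ionization_energy_ev


def events_per_cm3(g, absorbed_ev_per_cm3):
    """Event density produced by a given absorbed energy density."""
    return g * absorbed_ev_per_cm3 / 100.0


def gel_energy_density_ev_per_cm3(n_gel_per_cm3, g_crosslink):
    """Absorbed energy density needed to reach n_c = n_g, ignoring saturation."""
    return 100.0 * n_gel_per_cm3 / g_crosslink


def incident_gel_dose_c_per_cm2(n_gel_per_cm3, g_crosslink, loss_ev_per_cm):
    """Incident charge per unit area giving n_g at a depth where each electron
    deposits loss_ev_per_cm (a value supplied by a depth-dose model)."""
    energy = gel_energy_density_ev_per_cm3(n_gel_per_cm3, g_crosslink)
    return energy / loss_ev_per_cm * ELECTRON_CHARGE


# Hypothetical resist: I_m = 25 eV, one cross-link per ionization,
# n_g = 4e19 cm^-3, and 2e7 eV/cm deposited per electron at the interface.
g_c = g_value(1.0, 25.0)
print(g_c, incident_gel_dose_c_per_cm2(4.0e19, g_c, 2.0e7))
```

Evaluating the last function at the resist/substrate interface gives the incident dose needed to leave a film of the full initial thickness, which is the quantity sought in the optimization described above.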
FIG. 4. Definition of beam interaction parameters: (a) scattering angles, scattering length, and electron energy as a function of trajectory length; (b) relation between trajectory increment and depth increment; and (c) angular ring structure used for summation. (See text.)
From the viewpoint of an overall survey, we have now to consider the major factor neglected to date: the finite extent of the electron beam used in practice and the role played by backscattering both within the resist and from the substrate under these conditions (see Heidenreich et al., 1975).

E. The Proximity Effect
1. General

The previous discussion considers the continuous-beam approximation and therefore neglects effects arising from the use of a focused beam. In addition, the implicit assumption is made that the energy released at a given depth is, in effect, used at that depth to induce photochemical events. In reality, the secondary electrons created in the energy loss process have a high energy and can “diffuse” significant distances before losing energy. Thus, in effect, the tacit assumption has been made that the energy loss from a given layer is effectively balanced by scattering into that layer. To calculate the role of diffusion in the focused-beam situation, recourse is made to Monte Carlo calculations. The initial work in this area was completed by Hawryluk et al. (1974) and resulted largely in numerical data. In more recent years, the elegance of the earlier theoretical work has been continued and coupled with a necessary degree of practicality (as examples, see Chang, 1975; Kyser and Murata, 1974). This approach has incorporated an ability to approximate the results of complex calculations by simple analytic functions that are, to a large extent, capable of physical interpretation. Considerable progress has been made toward solving the programming problems associated with applying proximity corrections in real time (see, for example, Parikh, 1979a,b,c).

In the sections that follow, the bulk of the discussion is concerned with the electron energy loss as a function of depth. Recent authors in this work area (Parikh and Kyser, 1978a,b, 1979, for example) have been careful to stress that direct application of this work to lithography assumes, in fact, that “the cumulative effect of electron scattering energy deposition (as a function of depth), molecular fragmentation, and the development process can be approximated by the energy deposition alone.” The validity of this assumption can only be judged from the results obtained.

2. The Application of Monte Carlo Techniques to the Proximity Problem
The model due to Kyser and Murata (1974) is defined in Fig. 4. All electrons are considered to land normally on the surface. It is imagined that the first collision occurs at the surface and results in the angular displacements
$\theta_0$, $\phi_0$ defined in the figure. This and all subsequent collisions are regarded as being elastic. Between collisions, the electron travels in a straight trajectory for a distance $\lambda(E)$. As indicated, $\lambda$ is a function of the electron energy at that point in the trajectory. During the path length $\lambda(E)$ the electron loses energy continuously. To calculate $\lambda(E)$, application is made of the classic Rutherford scattering formula, which on integration gives the total scattering cross section $R(E)$. Then, if $n$ is the number of atoms per unit volume, the number of collisions $\gamma$ made in a path length $\Delta S$ is given by

$$\gamma = nR(E)\,\Delta S = (N_A\rho/A)R(E)\,\Delta S \qquad (9a)$$

or

$$\gamma = \Delta S/\lambda(E) \qquad (9b)$$

where $N_A$ is Avogadro's number, $\rho$ the density, and $A$ the atomic weight. If we define $\lambda$ as being the value of $\Delta S$ at which $\gamma = 1$, we obtain the definition of $\lambda(E)$ implied in Eq. (9). The continuous energy loss per trajectory length is given by

$$-dE/dS = 2\pi e^4 n (Z/E)\ln(2E/I) \qquad (10a)$$

and, in particular,

$$-(dE/dS)_i = 2\pi e^4 n (Z/E_i)\ln(2E_i/I) \qquad (10b)$$

where $Z$ is the atomic number and $I$ an appropriate ionization energy. The expression given in Eq. (10b) multiplied by $\lambda_i$ gives the energy lost between the $i$th and the $(i + 1)$th collisions, while Fig. 4b indicates the energy lost in a layer $\Delta z$ at $z$. This process of alternating elastic scattering and energy loss is continued until the energy falls below some reasonable threshold energy (~50 eV). The results of individual calculations can be integrated to give the effect of a total beam by assuming a beam profile and by performing sufficient trajectory plots (~5 × 10⁴). No loss of generality occurs by assuming a delta function source, as other more realistic distributions can be developed from this source if the need arises (see, for example, Kyser and Viswanathan, 1975).

The method of summation adopted is shown in Fig. 4c. An annular ring with the dimensions shown receives a total energy $\Delta E(z, r)$. $\Delta E$ is compounded from $\Delta E_f$, which is the energy received from electrons forward-scattered into the elemental ring, and from $\Delta E_b$, the corresponding quantity obtained from electrons backscattered into the ring. Computer plots of $\Delta E$, $\Delta E_b$, and $\Delta E_f$ can be obtained as functions of $r$ for various values of $z$ and fitted to analytic expressions in the form of suitably scaled Gaussian distributions, i.e., we can write

$$\Delta E(r) = \Delta E_f(r) + \Delta E_b(r) \qquad (11a)$$

for fixed $z$ in the form

$$\Delta E(r) = C_f\exp[-(r/\beta_f)^2] + C_b\exp[-(r/\beta_b)^2] \qquad (11b)$$

where $\beta_f$ and $\beta_b$ are the characteristic widths of the forward and backscattered contributions and where $C_f$ and $C_b$ are the experimentally determined weighting factors. A physical interpretation of $\beta_f$ and $\beta_b$ is possible, but the role of $C_f$ and $C_b$ is less clear. Some of the obscurity can be removed by expressing $C_f$ and $C_b$ in terms of $I_b$ and $I_f$, which are the total contributions integrated over all $r$ for a given depth from back- and forward-scattered electrons, respectively:

$$I_b/I_f = C_b\beta_b^2/C_f\beta_f^2 = \eta_E \qquad (12)$$

and one scaling factor ($= C_f$) remains in $\Delta E(r)$, which can now be written as

$$\Delta E(r) = C_f\{\exp[-(r/\beta_f)^2] + (\eta_E\beta_f^2/\beta_b^2)\exp[-(r/\beta_b)^2]\} \qquad (13)$$

In a later section, we outline the success of these estimates when compared with experiment. It is convenient at present to outline the nature of the computing problem associated with the application of this estimate to the lithographic problem.

3. Computation of Proximity Corrections
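Since the corrections discussed in this section rest on the double-Gaussian form of Eqs. (11b)-(13), it may help to see that form evaluated numerically. The Python sketch below builds the function of Eq. (13) and checks, by summing over annular rings, that the ratio of the integrated backscattered and forward contributions reproduces $\eta_E$ as required by Eq. (12). The function names and the parameter values ($\beta_f$, $\beta_b$, $\eta_E$) are hypothetical and are not fitted to any resist or substrate.

```python
import math


def delta_e(r, c_f, beta_f, beta_b, eta_e):
    """Eq. (13): forward plus backscattered Gaussian terms at radius r."""
    forward = math.exp(-(r / beta_f) ** 2)
    back = (eta_e * beta_f ** 2 / beta_b ** 2) * math.exp(-(r / beta_b) ** 2)
    return c_f * (forward + back)


def back_weight(c_f, beta_f, beta_b, eta_e):
    """C_b implied by Eq. (12): C_b = eta_E * C_f * beta_f^2 / beta_b^2."""
    return eta_e * c_f * beta_f ** 2 / beta_b ** 2


def ring_integral(c, beta, r_max, steps=100_000):
    """Midpoint-rule integral of c * exp(-(r/beta)^2) over annuli 2*pi*r*dr."""
    dr = r_max / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * dr
        total += c * math.exp(-(r / beta) ** 2) * 2.0 * math.pi * r * dr
    return total


# Hypothetical parameters (lengths in micrometres).
c_f, beta_f, beta_b, eta_e = 1.0, 0.1, 2.0, 0.75
i_f = ring_integral(c_f, beta_f, r_max=20.0)
i_b = ring_integral(back_weight(c_f, beta_f, beta_b, eta_e), beta_b, r_max=20.0)
print(f"I_b / I_f = {i_b / i_f:.4f}")
```

The check works because each Gaussian integrates to $\pi C\beta^2$ over the plane, so the ratio of the two integrals is $C_b\beta_b^2/C_f\beta_f^2$ whatever the individual widths.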
Figure 5 represents a statement of the proximity problem. In Fig. 5a, we represent an isolated T structure that has to be fabricated. The continuous lines represent the required structure, and the dotted line represents the actual structure exposed in the resist if no corrections are made. Three features of this dotted structure convey the essence of the problem. Along the two lines (1) the dose received is high because these regions receive energy while all the adjacent regions are being exposed. No distortion of shape is involved because no pattern edge is involved. At the reentrant corner (2), a pattern distortion occurs because energy is received from three quadrants only and the resultant contour is forced outward toward the fourth quadrant. At the points (3), “proximity” energy is received from only one quadrant and the contour is forced in toward that quadrant. Figure 5b completes the picture by adding an adjacent structure. The exposure of this nearby structure contributes energy to the T, which effectively pulls the structure at the point (4) outward toward the new feature. To sum up, an exposure point receives energy from three sources: (1) the direct exposure purposefully applied at that point, (2) the energy derived during the exposure of adjacent parts of the structure feature in which the point is located, and (3) the energy delivered during exposure of adjacent structures.

FIG. 5. (a) The nature of the proximity effect. (b) Localized distortions within a given figure.

Methods by which the necessary correction can be made are best exemplified by a fully automated technique developed by IBM workers (Parikh and Kyser, 1978; Parikh, 1979a,b,c). The practical details are not fully available, but the general approach is clearly defined and we can indicate the general trend of the details in terms of Fig. 6. Here, we are interested in determining the correct exposure to be delivered to all parts of the central T figure. The basis of the calculation is to determine the dose that a given spot within the T receives from the two proximity contributions and hence to calculate the dose to be delivered directly to give the required dose $D_0$, with the added constraint that $D_0$ is to be constant over the whole exposure field for all points on all device elements. Assuming a constant current density during the exposure, the calculated dose translates to a required exposure time for the particular location.
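The calculation just described (direct dose equals the required dose $D_0$ less the two proximity contributions, with $D_0$ held constant everywhere) can be sketched as a small self-consistent iteration. The Python example below is a toy illustration only and is not the IBM procedure: it replaces finite exposed areas by point samples, uses a double-Gaussian kernel with hypothetical parameters, and solves for the exposure times by Gauss-Seidel sweeps, which is one convenient choice among several.

```python
import math


def kernel(r, beta_f=0.1, beta_b=2.0, eta_e=0.7):
    """Energy per unit exposure time deposited at separation r (arbitrary units).

    Double-Gaussian form with C_f = 1; the parameter values are hypothetical.
    """
    forward = math.exp(-(r / beta_f) ** 2)
    back = eta_e * (beta_f / beta_b) ** 2 * math.exp(-(r / beta_b) ** 2)
    return forward + back


def received_dose(points, times, zone_radius):
    """Total energy at each point: direct term plus proximity terms in the zone."""
    doses = []
    for xi, yi in points:
        total = 0.0
        for (xj, yj), t_j in zip(points, times):
            r = math.hypot(xi - xj, yi - yj)
            if r <= zone_radius:
                total += kernel(r) * t_j
        doses.append(total)
    return doses


def corrected_times(points, d0, zone_radius, sweeps=200, tol=1e-12):
    """Gauss-Seidel sweeps: direct exposure = (D0 - proximity energy) / self term."""
    self_term = kernel(0.0)
    times = [d0 / self_term] * len(points)
    for _ in range(sweeps):
        largest_change = 0.0
        for i, (xi, yi) in enumerate(points):
            proximity = 0.0
            for j, (xj, yj) in enumerate(points):
                r = math.hypot(xi - xj, yi - yj)
                if j != i and r <= zone_radius:
                    proximity += kernel(r) * times[j]
            new_time = (d0 - proximity) / self_term
            largest_change = max(largest_change, abs(new_time - times[i]))
            times[i] = new_time
        if largest_change < tol:
            break
    return times


# Hypothetical T-shaped set of exposure points on a 0.5 um pitch.
t_shape = [(-1.0, 0.0), (-0.5, 0.0), (0.0, 0.0), (0.5, 0.0), (1.0, 0.0),
           (0.0, -0.5), (0.0, -1.0), (0.0, -1.5)]
times = corrected_times(t_shape, d0=1.0, zone_radius=6.0)
for point, t in zip(t_shape, times):
    print(point, f"{t:.5f}")
```

In this toy pattern the point at the junction of the T has the most neighbours and therefore receives the shortest direct exposure, while the free ends receive the longest, which mirrors the inward and outward contour shifts described for the uncorrected structure.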
FIG. 6. A schematic of the type of computation needed for proximity correction. A “zone of interaction” is defined in which device elements contribute significantly to the local element considered. The scale of the contribution depends on $r^2$.
The basic steps in the calculation are:
(1) Determine a “zone of interaction” within which the exposed structures will contribute to the exposure of the element considered. Neglect all elements outside this zone. The size of the zone is determined by the characteristic widths discussed in Section II,E,2.
(2) Sort the pattern to determine those elements within the zone.
(3) After the proximity corrections are complete, there will be a required distribution of exposure times with a mean value $\bar{\tau}$. Estimate a value of $\bar{\tau}$ and assume that all elements within the zone are exposed at this value.
(4) Since the “proximity energy per second” received by a given spot during exposure of an adjacent point B is a function only of the separation between these points, it is quantitatively equivalent to the dose received by B during the applied exposure of the first point. This is the “reciprocity condition” indicated by Chang (1975). This result enables all the computations to have a common origin centered on some preselected point of high symmetry in the required structure element, the T structure of Fig. 6, for example.
(5) Exploit the reciprocity condition in a straightforward numerical analysis to determine how economically, commensurate with the needed accuracy, the required integration can be replaced by a summation using finite areas.
(6) Repeat this process over all elements and reiterate, if necessary, to self-consistency, taking account now of the distribution of exposure times needed to make the correction.

Alternative approaches to proximity correction have been suggested. Among these is the idea of incorporating dimensional changes in the pattern to correct for the proximity effect (see Sewell, 1978; Ralph and Sewell, 1978). The underlying idea is illustrated in Fig. 7 for the case of two closely lying extended areas. In Fig. 7a, we have illustrated the required pattern and the actual pattern obtained when no corrections are applied. In Fig. 7b, a corrected pattern indicated by solid lines would give an actual pattern (dotted lines) closely resembling the required pattern in Fig. 7a. This approach has not been fully tested. It can be used in a relatively simple form to correct relatively gross features such as the mean pattern separation in Fig. 7a. To correct for more localized effects, such as the corners in Fig. 7, requires further elaboration and a fully automated computer-aided design facility if the method is to be applied to patterns of any complexity at submicron resolution. This approach removes, in principle, the need to transmit proximity data in real time at the expense of additional sophistication and restriction in the definition of the precomputed pattern data. The trade-off point may well vary with the nature of the proposed application.

FIG. 7. Proximity correction by reshaping of input structures. The solid lines represent the input shapes; dashed lines indicate the resultant structures. Parts (a) and (b) show interacting structures before and after reshaping, whereas (c) and (d) indicate the corresponding effects within a single structure.

Questions relating to the relative effectiveness of various computational processes for determining the proximity corrections have been discussed (Parikh, 1979a,b,c). The remaining central question in this work area is being attacked, that is, the question of data compression. A possible major limitation on cost effectiveness is the impact that the proximity effect has on our ability to compress the data defining the input pattern. In the absence of the proximity effect, the pattern elements are essentially noninteracting and, as a consequence, the writing strategy can be established as a trade-off between throughput, data storage needs, and data transfer rates in real time, with abbreviated data format being applicable if pattern repeats occur. This use of pattern repeats (data compression) is impacted by three features: (1) the use of a dual-deflector system, (2) the question of “patching” the writing of subfields, and (3) the interacting feature of the pattern resulting from the proximity effect. Under these conditions, the definition of a repeat pattern is more complex as the question of grouping of elements has to be considered. Grobman and Studwell (1979) have investigated the problem. Insight into the very real progress made in this area can be seen by comparing the results with experiment.

4. Experimental Methods
Experiments in this area are concentrated on very directly applicable
studies made of resist layers after exposure and development. In general, two techniques are used. One method is to expose and develop the necessary structures, usually in the form of very long strips, fracture the structure at right angles to the length of the strip, and use the high resolution and good depth of focus inherent in the scanning electron microscope to examine the resultant structure in detail (Hatzakis, 1971; see also Haller et al., 1979; Zeitler and Hieke, 1979). A very direct comparison can be made with theoretical estimates such as those outlined in Section II,E,2. In this case, the computer is used to graphically display contours of equal energy deposition so that a one-to-one correspondence between the developed structure and the computer printout can be made. Such studies have been exploited in processing studies (Neureuther et al., 1978), in comparison with theory (Neureuther et al., 1979a,b), and in studies of the proximity effect concerning the role of the substrate in particular (Greeneich, 1971). An elegant extension of this technique exploits the selective etchability of Si-SiO₂ structures to buffered amine (Finne and Klein, 1967) or KOH-based etchants (Greenwood, 1969) to investigate very directly the role of backscattering in the substrate (Jones and Hatzakis, 1978). The directness of the comparison of the observed exposure characteristics with and without the substrate scattering can be seen from Fig. 8. A second experimental technique particularly pertinent to proximity studies also allows a very direct comparison between theory and experiment. Again, the computer is used to provide contours of equal layer thickness before and after correction for proximity exposure. These “maps” can be directly compared with exposures made with and without proximity corrections (Jones and Hatzakis, 1978). A further refinement of the latter technique exploits the computer in its best role of reducing the repetitive labor involved in this type of work.
By use of suitable algorithms, it is possible to write patterns that provide the data essential to correctly determine the exposure conditions for a given batch of resist under a range of proximity correction conditions (Grobman and Speth, 1978). By exposing linear “wedges,” it is possible to establish (1) the required development time as a function of effective dose, (2) the scattering parameters required to compute the necessary proximity corrections in a given situation, and (3) a monitoring facility for pattern development. The underlying idea is to expose a region in which the dose increases linearly along the length and to use measurements and the optical properties of the film to determine the development time as a function of effective dose. The idea is extended to provide what is in effect a self-calibrating means of determining the proximity parameters by simultaneously exposing a continuous large-area strip and a series of isolated lines of varying widths and comparing
[Figure 8 labels: exposure areas; primary beam; resist; Si3N4; silicon.]
FIG.8. Use of selectively etched windows to provide a very direct comparison between situations in which proximity effect is present and in which it is substantially eliminated: (a) general structure used; (b) schematic of the different scattering behavior in the two cases.
the results under conditions of identical development. Using this technique, measurements previously requiring ten or more hours can be made in one hour and the computer gives best-fit values to the required parameters.
5. Comparison with Experiment

On the proximity problem, we have fairly extensive data applicable to the use of PMMA in a wide range of situations. Parikh and Kyser (1979) have made a comparison between theory and experiment using data from relatively diverse sources. The consensus of the results can be summed up in three points:

(1) The backscattering component can be estimated to greater accuracy than the forward-scattering component. In approximate terms (present author’s estimates), a fair measure of the accuracy of $\beta_b$ is ≲30%, while for $\beta_f$, a factor of 2 appears to be the accuracy obtainable, although, in the latter case, the lack of precision may be due to an artificial constraint imposed during computation.
(2) From a practical viewpoint, the uncertainty in $\beta_f$ is not a significant limitation, as the forward-scattering component is of the order of the beam diameter used in practical lithography and so contributes at a low level compared to $\beta_b$.
(3) From a scientific viewpoint, some details are obscure, but the uncertainties do not prevent the method being usefully applied down to the ½-µm feature level.

Parikh (1979c), in the third of three papers, gives micrographs comparing corrected and uncorrected patterns in PMMA down to the µm element level. The improvement is significant and for many practical purposes adequate. Youngman and Wittels (1978) have compared two approaches to a difficult application centered on the fabrication of complex shapes for bubble memories. One approach is very similar in principle to the self-consistent approach established by Parikh. The second method involves using a CAD facility to fit an equidose contour to the required (rather complex) shape.
The basis of this method is to define the required figure in terms of turning points along the outer edge, use these turning points to determine the block structure with which to write the pattern, and finally “optimize” the process to give an equidose contour that closely approximates the required structure. The results quoted were obtained with structures greater than 1 µm in size and were made using PMMA. Trade-offs exist between the storage needed, exposure time, etc., and no clear demarcation exists as to the conditions under which this method is to be preferred. Kyser and Ting (1979) have attacked the question of how the need to
make proximity correction can be minimized. In particular, they have investigated how the ability to vary the beam voltage can be exploited in this context. Two main points were established. For device fabrication concerned with structures and separations greater than 1 µm, and if thin resist layers (½-1 µm) can be used, it is better to work at 10 kV rather than 20 kV if it is required to fabricate without proximity correction (compare Section IV,D). These authors estimate that, in the case of 3-µm thick resist, the limit at which proximity correction becomes imperative is ≥1 µm at 10 kV. The corresponding figure at 20 kV is 4 µm. In the submicron range of device fabrication, it is likely that some form of correction will generally be needed, as it is unlikely that any manipulation of the experimentally available variables will remove the need in a general sense for practical application. However, the authors make one suggestion, and that is to use thin resist (½-1 µm) and a high beam voltage (≥50 kV). Under these conditions, the proximity effect gives a small, diffuse, but near-uniform background to the directly applied exposure. This situation is more tractable because, in essence, it reduces the nonlinearity of the system. Just how practical this approach can be is a subject of some conjecture because the use of high voltage impacts the deflection sensitivity and involves questions of throughput and stability no matter what type of deflector is used. An extension of this technique has recently been proposed by Stephani and Kratschmer (1979). These workers integrated this approach with the use of two separated resist layers to obtain further reduction of the significance of proximity effects. Hatzakis (1979) has suggested a similar multiresist layer approach with specific developers to enhance the effective sensitivity of the resist layer without loss of contrast properties.
Chung and Tai (1978) have extended the scope of Monte Carlo calculations by applying them to a negative resist. The same degree of practical usefulness was obtained. A 20% agreement between theory and experiment was observed. In detail, this work is of interest because a different energy loss mechanism due to Landau (1944) between elastic collisions was used and the results compared with the “continuous slowing down approximation” used previously. A marginal improvement resulted, indicating that this aspect of the model is not a major uncertainty. An interesting alternative to the energy loss calculation is based on an approximate solution of the Boltzmann equation by Fermi (1940). Using this approximation, Chung and Tai found a 10% agreement between theory and experiment in examining the forward scattering (the “spreading”) of the beam. The experimental technique involved the use of windows to give a direct estimate of the scattering as a function of depth in the same manner developed by Jones and Hatzakis (1978) to establish proximity corrections. Further work has been published by Aizaki (1979) and by Murata et al. (1971). The latter authors investigate a further energy loss mechanism, the
Spencer-Fano model (Rossi and Greisen, 1941). In addition, Kato et al. (1978) have investigated the role proximity effects play in the fabrication of high-resolution metal film masks. These authors present data indicative of the need to consider proximity effects even with thin film thicknesses (~800 Å). In the area where the computer is applied to reduce the load of repetitive work and to maintain process, we have seen significant help rendered by the use of an exposure “wedge” (Grobman and Speth, 1978).
6. Summary and Comment

In relation to the proximity effect problem, it is clear that the application of Monte Carlo computations followed by an analytic approximation in terms of Gaussian functions represents a very practical way of making corrections down to the ½-µm resolution level. From a purist viewpoint, matters of detail are still obscure and the agreement between experiment and theory still leaves room for improvement, but these details leave the practicality unimpaired. Effort is currently being directed toward a reduction of the need to make proximity corrections in any general sense in practical environments where electron beam lithography will find its main application (<1 µm feature and spacing). The approach centers, in the main, on the issue of multiresist layers.

Two major reservations can be outlined. The bulk of the work in this area has involved PMMA, which is an insensitive resist of excellent contrast properties. We have yet to establish the magnitude of the problem with a resist that is, say, some 50 times more sensitive and possibly with a significantly reduced contrast. We also have to confirm our ability to solve this problem with somewhat reduced computer capability. The data manipulation aspect of the proximity problem has been attacked with good success, but the significant work has been done, in the main, on computers of considerable capability, and we have little information in open scientific reports on the trade-offs between storage, preprocessing, and real-time computation, and their effect on throughput. In less specialized environments, these problems will have to be tackled on smaller, cost-effective machines of less capability. It cannot be said that this problem has been solved, but it is a tractable problem appropriate to computer-aided design.
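The analytic approximation in terms of Gaussian functions mentioned above can be illustrated with a short sketch. It assumes the commonly used two-Gaussian form, with a narrow forward-scattering term and a broad backscattering term; the function names, the weighting ratio `eta`, and the numerical values are illustrative assumptions and are not taken from the papers cited in this section.

```python
import math

def proximity_psf(r_um, beta_f_um, beta_b_um, eta):
    """Normalized two-Gaussian energy density at radius r (units: 1/um^2).

    beta_f_um and beta_b_um are the forward- and backscatter widths;
    eta is the assumed ratio of backscattered to forward-scattered energy.
    """
    forward = math.exp(-(r_um / beta_f_um) ** 2) / beta_f_um ** 2
    back = eta * math.exp(-(r_um / beta_b_um) ** 2) / beta_b_um ** 2
    return (forward + back) / (math.pi * (1.0 + eta))

def enclosed_fraction(radius_um, beta_f_um, beta_b_um, eta):
    """Fraction of the total deposited energy inside a disc of given radius.

    Closed form of the integral of 2*pi*r*proximity_psf(r) from 0 to radius.
    """
    f = 1.0 - math.exp(-(radius_um / beta_f_um) ** 2)
    b = 1.0 - math.exp(-(radius_um / beta_b_um) ** 2)
    return (f + eta * b) / (1.0 + eta)

if __name__ == "__main__":
    # Hypothetical parameters, chosen only to show the narrow/broad split.
    beta_f, beta_b, eta = 0.1, 2.0, 0.7
    for radius in (0.1, 0.5, 2.0, 5.0):
        frac = enclosed_fraction(radius, beta_f, beta_b, eta)
        print(f"radius {radius:4.1f} um: enclosed fraction {frac:.3f}")
```

With parameters of this kind, most of the forward-scattered energy falls within a few tenths of a micrometer, while the backscattered part spreads over several micrometers, which is why neighboring features contribute to each other's exposure.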
III. X-RAY LITHOGRAPHY

A. Overall Description
In its initial form, X-ray lithography was discussed as a replacement for optical projection lithography, i.e., an X-ray flood beam exposes a whole
wafer in a single pass. The basic components of such a system are shown in oversimplified form in Fig. 9. The excitation is obtained from an X-ray source consisting typically of a vacuum chamber housing an electron gun, directing and possibly focusing an electron beam onto a suitably chosen anode. Both “characteristic” and “white” X-ray quanta are emitted externally from the target over a very large solid angle (~2π srad). A given fraction of these quanta penetrate a thin window, enter the exposure chamber, and fall on the mask/wafer combination. The thin window between the X-ray generation chamber and the exposure chamber performs several functions: (1) it provides the necessary pressure “standoff” between the hard vacuum of the X-ray source and the inert gas ambient of the exposure chamber; (2) it filters out “unwanted” background X radiation from the source, which could otherwise degrade the exposure; and (3) it contributes to the thermal stability of the system by shielding the mask and the wafer from infrared radiation from the hot regions of the X-ray source. The exposure chamber contains an inert, low-molecular-weight gas of relatively high thermal conductivity to help reduce thermal instabilities in the mask/wafer combination. The wafer itself is coated with a suitable resist
FIG. 9. A schematic illustration of the major components of an X-ray lithography system. [Labels in the original figure include: pumping port; electron gun; exposure chamber; wafer table; gas inlet.]
and located on an alignment stage that can move the wafer relative to the mask. The mask is therefore held independently of the wafer but has to be located very close to it (Section III,D). A factor that has to be considered in both the wafer and the mask support is thermal stability, and as much “heat sinking” as possible has to be provided. The mask itself is a critical item, as we see in more detail in Section III,D. It has to establish the necessary “contrast” on the beam by imposing regions of alternate transmission and opacity. It has to perform to this specification with resolution detail at the ½-µm level or better without severe loss of X-ray quanta in the regions of high transmission and maintain an alignment ability and thermal stability of better than 0.1 µm. The final major component is the alignment system, which is shown very schematically in Fig. 9. In practice, this alignment system has to be a fully automated computer-controlled system similar to that currently available on optical and electron beam systems.

As an introduction to the basic design challenge facing this approach, we begin by noting that X-ray generation is a very inefficient process. An order-of-magnitude figure for the efficiency measured in terms of external X-ray quanta per incident electron per steradian is $10^{-5}$-$10^{-4}$. This inefficiency, coupled with the absence of an ability to focus X rays in the context of the present application, suggests that a wide-area source close to the mask/wafer combination be used. Three factors contradict this argument: (1) The closeness of a hot source to the mask/wafer combination would lead to thermal instabilities. (2) The available resolution would be degraded if a large-area source is used. Figure 10 shows the problem of a close large-area source. A well-defined mask edge changes to a “penumbra” of finite width on the target.
(3) In the single-exposure approach to X-ray lithography, the area exposed is considerable when 3- or 4-in.-diam wafers are used. The extent of the target can lead to two further factors that have to be considered. If the source is too close, then there is a systematic variation of exposure rate from the center of the wafer outward. There is also a geometrical distortion that can occur between the pattern built into the mask and that which is exposed on the target (see Fig. 10). When the total problem is considered, the need becomes finalized into an intense source of the order of 1 mm in cross section located at least 25 cm from the target. This specification may become relaxed somewhat if an X-ray “step and repeat” technique is exploited. In the realm of practical application, this step-and-repeat type of X-ray lithography will be forced upon us because of the impact of the alignment/distortion problem at resolution levels approaching 0.5 µm. It is conceivable that an X-ray step and repeat involving an exposure area of the order of 1 in.² or less will represent a realistic compromise.
FIG. 10. Factors affecting the source-to-target distance in an X-ray lithography system: the penumbra effect and the geometrical distortion as a function of angle and mask-to-substrate separation. [Labels in the original figure: source; mask; wafer; dose; final thickness.]
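The two geometrical effects shown in Fig. 10 can be estimated from similar triangles. The sketch below is an illustration under stated assumptions: the formulas are the usual first-order estimates and are not written out in the text, the function names are hypothetical, and the 10-µm mask-to-wafer gap is an assumed value. The 1-mm source size and 25-cm distance are the figures quoted above.

```python
def penumbra_um(source_diameter_mm, gap_um, source_distance_mm):
    """Width of the blurred edge cast on the wafer by a source of finite size.

    A source of diameter a at distance D from the mask, with a mask-to-wafer
    gap s, smears a sharp mask edge over roughly s * a / D.
    """
    return gap_um * source_diameter_mm / source_distance_mm

def runout_um(radius_mm, gap_um, source_distance_mm):
    """Outward shift of a feature at a given radius from the optical axis.

    Rays reaching radius r are inclined by about r / D, so the image of a mask
    feature is displaced by roughly s * r / D across the gap.
    """
    return gap_um * radius_mm / source_distance_mm

if __name__ == "__main__":
    gap = 10.0          # assumed mask-to-wafer gap, um
    distance = 250.0    # source-to-mask distance from the text, mm
    print("penumbra (um):", penumbra_um(1.0, gap, distance))
    # Edge of a 3-in.-diameter wafer, radius about 38.1 mm.
    print("runout at wafer edge (um):", runout_um(38.1, gap, distance))
```

Under these assumptions the penumbra stays well below 0.1 µm, while the runout at the edge of a full wafer exceeds 1 µm. That difference is consistent with the remark above that step-and-repeat exposure over a smaller field eases the alignment and distortion problem.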
Further insight into the total system problem associated with X-ray lithography can only be obtained by a more detailed examination of each component in turn. However, all the factors are to some extent interrelated and all hinge to some extent on the properties of the X-ray resists available and on the fundamental limitations of X-ray exposure. A natural starting point is the nature of the ideal-resist problem.

B. The Ideal-Resist Case
This case is of importance as it clearly defines the trade-offs that have to be established. In particular, it relates the mask contrast, the resist sensitivity, and contrast to the resolution required in the final exposed pattern. The approach adopted is the classical treatment of a quantum-limited resolution situation with specific details applicable to lithography. The treatment adopted has been adapted from that given by Spiller and Feder (1977) and resembles that given by Kelly et al. (1974) for electron beam memory applications. Consider a uniform energy density I of X-ray quanta with energy hv
incident on a thin resist layer with a linear absorption coefficient $\alpha$ [$= \alpha(h\nu)$]. If the resist has a thickness $t$ and we consider the behavior of a square resolution element of side $d$, then the mean number $\bar{N}$ of quanta absorbed is given by

$$\bar{N} = (I/h\nu)\,(1 - e^{-\alpha t})\,d^2 \tag{14a}$$

or

$$\bar{N} \approx (I\alpha/h\nu)\,t\,d^2 \tag{14b}$$
in the limit where $\alpha t \ll 1$. Because of the random and independent nature of the X-ray generation and absorption, different resist elements will, in fact, receive inputs statistically distributed about the mean value $\bar{N}$. To obtain quantitative data about the salient processes, we can ask under what conditions the above statistical fluctuations lead to a defect level that is unacceptable. Here, we can use a chessboard model, i.e., a pattern of alternate black and white squares of side $d$ has to be exposed by flux passage through an identical mask. In this mask, the transmission of the “light” or transmitting regions is $T_l$ and of the “dark” or absorbing regions is $T_d$. The predicted results are illustrated in Fig. 11. The flux patterns transmitted through the light and dark regions of the mask will have distributions centered on the values $N_l$ and $N_d$. As we try to optimize the system in terms of throughput, a situation will develop where the two distributions overlap, and a finite possibility exists that errors will occur because dark areas will be mistakenly identified as light areas and vice versa. To establish a quantitative criterion, we can define an exposure $N_0$ ($N_d \le N_0 \le N_l$) such that all elements with $N > N_0$ will be exposed and all elements with $N \le N_0$ will not be exposed. We can appeal to experiment to define $N_0$ more closely. The device fabricator can, as a result of experimentation, establish conditions that lead to the best performance with acceptable edge resolution and a minimum defect rate. In other words, the characterization of the system is equivalent to choosing $N_0$ so that the defect level is minimized. The probability of a defect is shown in Fig. 11b,c as the sum of the two shaded regions, with $N_0$ chosen to minimize the sum of $P_l$ and $P_d$. If the analysis is carried through assuming an arbitrary but realistic acceptable defect level, then we can compute the dependence of defect density on mask and resist properties.
We can also obtain an analytical statement of the total design problem. One method of specifying a low defect density is to specify a minimum separation between the two mean values of the distributions shown in Fig. 11a. Within the range where it is allowable to replace a Poisson distribution by the Gaussian distribution, we can specify that the separation must be at least $n$ times the sum of the standard deviations of the two distributions. A common choice is the “three-sigma” point, i.e., $n = 3$. A further factor can be entered by
FIG. 11. The shot noise limitation problem: (a) definition of parameters; (b) and (c) error regions according to adopted criterion. (See text.)
noting that at high resolution, the required resist thickness is small so that $t \approx d$. On this basis, we can obtain

$$\bar{N} = N_0 = \frac{n^2}{[1 - (T_d/T_l)^{1/2}]^2} \tag{15a}$$

$$C = \frac{h\nu}{\alpha}\,\frac{\bar{N}}{d^3}\,\frac{1}{T_l} \tag{15b}$$

$$\text{flux} = \frac{h\nu}{\alpha}\,\frac{n^2}{d^3}\,\frac{1}{T_l}\,[1 - (T_d/T_l)^{1/2}]^{-2}\,\frac{1}{\tau} \tag{15c}$$

where $C$ is the required exposure dose. Equation (15) is largely self-explanatory. In particular, Eq. (15c) is a statement of the X-ray source design problem as it illustrates in the ideal case how the required flux depends on
(1) the required defect level through $n$, (2) the required resolution through $d$, (3) the mask properties mainly through $T_d/T_l$, and (4) the system throughput through the exposure time $\tau$. What is less apparent is the fact that there is a limit placed on the sensitivity of the resist that can be used at high resolution. We can restate what has gone before. If we seek to use a resist of higher and higher sensitivity (smaller $C$) at higher and higher resolution (smaller $d$), we reach a situation where the expected number of quanta falls below the value of $N_0$ given above. As a result, the defect rate will increase. The net effect is a lower limit on the resist sensitivity that can be used. This limit, which is a function of the resolution and required defect level, can be expressed in quantitative terms by

$$C > \frac{n^2}{T_l\,[1 - (T_d/T_l)^{1/2}]^2\,(\alpha/h\nu)\,d^3} \tag{16}$$
Equation (16) represents an oversimplification of the statistical model for X-ray exposure of resist, as it ignores the random nature of the excitation, cross-linking, and competitive processes in the photochemistry of the resist itself, and it neglects the statistics involved in backscattering. We shall apply these criteria in Section III,E. For the present, it is convenient to examine the question of X-ray source design.
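The shot-noise argument of this section can be turned into a small numerical sketch. It assumes the reading of Eqs. (15a) and (16) in which the minimum number of absorbed quanta per resolution element is n squared divided by the square of one minus the square root of the dark-to-light transmission ratio, and in which the resist thickness equals the element side d. The function and parameter names, and the numerical values in the example, are illustrative assumptions rather than data from the text.

```python
import math

def min_quanta(n_sigma, t_dark, t_light):
    """Minimum mean number of quanta absorbed per element under a light region.

    Follows from requiring the light and dark means to differ by n_sigma times
    the sum of their Poisson standard deviations.
    """
    ratio = t_dark / t_light
    return n_sigma ** 2 / (1.0 - math.sqrt(ratio)) ** 2

def min_dose_j_per_um2(n_sigma, t_dark, t_light, alpha_per_um, photon_energy_j, d_um):
    """Lower limit on the dose incident on the mask, in J/um^2.

    Uses the thin-resist limit with thickness equal to the element side d,
    so that the absorbed fraction is about alpha * d.
    """
    n_bar = min_quanta(n_sigma, t_dark, t_light)
    return n_bar * photon_energy_j / (t_light * alpha_per_um * d_um ** 3)

if __name__ == "__main__":
    KEV = 1.602176634e-16  # joules per keV
    # Hypothetical example: three-sigma criterion, 1.5-keV quanta,
    # resist absorption 0.1 per um, mask transmissions 0.8 (light), 0.1 (dark).
    for d in (1.0, 0.5, 0.25):
        dose = min_dose_j_per_um2(3, 0.1, 0.8, 0.1, 1.5 * KEV, d)
        print(f"d = {d:4.2f} um: minimum dose {dose * 1e8 * 1e3:.3f} mJ/cm^2")
```

The cubic dependence on d is the main point: halving the element side raises the minimum usable dose by a factor of eight, which is the quantitative form of the statement that a very sensitive resist cannot be used at very high resolution.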
C. X-Ray Source Development

1. General Methods of Approach

There are several alternative methods of obtaining the required X-ray flux, with varying degrees of throughput, ease of operation, and flexibility being balanced by cost, availability, and limitations in use. If we seek to develop an X-ray source that imposes no significant limitations on the system or on its throughput and is completely flexible in its applicability and in the use of resists, we have to develop a source that is intense by most standards and differs significantly from sources available for many other applications. In general, the approach adopted here is to use a rotating anode to maintain the required stability under the very intense excitations used. Several alternative approaches have been postulated and experimented with. These approaches obtain high intensity by pulsed operations, in one case by laser excitation (Nagel et al., 1978, 1979), another by use of a pulsed electron gun (Nagel et al., 1978, 1979), and finally by the use of plasma discharge (McCorkle et al., 1979). If we are prepared to accept limitations on the versatility of the X-ray source, the design problem is greatly reduced. In essence, the argument here is that we may be able to use a sensitive resist with a lower flux and still maintain an adequate throughput. With this limitation, we can use a conventional sealed-off X-ray tube specially modified for lithography applications. A final approach employs a very different source. Inherent in the operation of synchrotrons is the presence of an intense background of high-energy X-ray flux. This flux is readily available at windows that are not normally competed for and has a good degree of collimation. Elegant research and development studies have been made using such sources (Spiller et al., 1976; Fay and Trotel, 1976; Lindau and Winick, 1978; Aritome et al., 1978).

Before considering problems specific to the use of rotating anodes, we outline some general factors affecting source design. The question of electron gun design for X-ray lithography presents no significant problems outside the sphere of good engineering. In general, X-ray lithography requires sources operating in the range of 5-20 kV. In the lower and middle voltage range, modified evaporator unit guns employing ring cathodes have been optimized for this application (Wardly et al., 1977). In the higher voltage range, Pierce guns are used and traditional tungsten hairpin guns of the type used in electron microscopy are adapted by widening the angular emission range from the gun and increasing the filament size.

2. Rotating-Anode X-Ray Sources
Here the approach is to overcome melting problems arising from the substantial heating effects that occur during the operation of intense X-ray sources by rotating new material into the path of the electron beam. Typically a disk of some 10-15 cm in diameter, ~1 cm in thickness, is rotated at several thousand rpm with the beam impinging on the center line of the cylindrical surface. The details of the problems that can arise have been reviewed by Yoshimatsu and Kozaki (1977) but can be typified by the question of anode stability and by the problem of the rotary vacuum seal. The base material used for the anode target is often OFHC copper to take advantage of the high heat conductance associated with this material. To obtain the required wavelength of X ray, the copper can be coated with approximately 100 µm of appropriate target material such as chromium, molybdenum, or aluminum. The limit to performance is not so much the melting of the target but a surface roughness that develops during operation and that can reduce the effective output by 50% by absorption in the surface layer. The central strip that is under bombardment heats up, the flow strength decreases, and under the combined effect of temperature and centrifugal force, plastic flow occurs well below the melting point by a process resembling viscous creep. The resulting surface roughness can be of the order of 50 µm. The problem has been tackled by using the substitutional alloys of copper,
which maintain a fair fraction of the conductance of the pure copper while giving a greatly increased yield strength. In approximate terms, the surface roughness is reduced by a factor of about 10 by this approach. Anodes made from molybdenum are far less troublesome from this aspect, but suffer from solubility in water (see below).

Three types of vacuum seals have been used to obtain the necessary drive onto the anode together with the sealing needed for the cooling water. Traditionally, mechanical seals have been used. Such seals need special oil cooling and suffer from wear problems involved in stopping and starting the rotation. Magnetic-fluid seals are relative newcomers to this application. The present position appears to be that such seals are self-limiting in that the heat developed causes a loss of viscosity. A few thousand rpm appears to be the upper limit. The use of a heat pipe for cooling a rotating anode has been advocated (McCoy and Sullivan, 1978; Wittry et al., 1979). Until a more complete assessment of magnetic-fluid seals is available, it appears that oil seals give the better performance. Performance figures have been given by Yoshimatsu and Kozaki (1977), who indicate that “lifetimes” of 2000 hr between maintenance can be obtained for seals rotating at angular speeds approaching $10^4$ rpm. Oil seals do compound the maintenance problem of the source itself. The combination of oil vapor, backscattered electrons, wide-angle X-ray emission, and regions of high temperature aids in the formation of contaminant and insulating films that can lead to gun, target, and vacuum instabilities. Some indication of the care that has to be taken and, hence, some indication of the ancillary expense associated with X-ray lithography can be obtained from an insight into the cooling-water problem.
The system, whether recirculating or open ended, has to meet two criteria. In practice, the allowable temperature is approximately 10-15°C, implying flow rates of ~6 liters/min for a 12-kV X-ray machine and 20 liters/min for a 30-kV machine. Water purity is also important: distilled water with <1 ppm chlorine has been advocated to avoid target corrosion. We need now to examine the implications that arise if we reduce the specification on the X-ray source and accept the resulting limitations on performance. In Section III,C,3, we examine the case where a more conventional X-ray source of the sealed-off type of tube is used.
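The link between an allowable temperature and a flow rate can be sketched with a simple heat balance. This is an illustration only: it assumes that the quoted 10-15°C is an allowable temperature rise of the cooling water, that essentially all of the beam power ends up as heat in the water, and it uses an assumed dissipated power, since the text quotes machine voltages rather than powers. The function name is hypothetical.

```python
WATER_SPECIFIC_HEAT = 4186.0   # J/(kg K), liquid water near room temperature
WATER_DENSITY = 1.0            # kg per liter

def required_flow_l_per_min(power_w, delta_t_k):
    """Water flow needed to carry away power_w with a temperature rise delta_t_k.

    Heat balance: power = mass flow * specific heat * temperature rise.
    """
    kg_per_s = power_w / (WATER_SPECIFIC_HEAT * delta_t_k)
    return kg_per_s / WATER_DENSITY * 60.0

if __name__ == "__main__":
    # Assumed dissipated powers; not figures from the text.
    for power_kw in (5.0, 15.0):
        for rise in (10.0, 15.0):
            flow = required_flow_l_per_min(power_kw * 1e3, rise)
            print(f"{power_kw:4.1f} kW, rise {rise:4.1f} K: {flow:5.1f} liters/min")
```

Flow rates of a few liters per minute up to a few tens of liters per minute emerge for kilowatt-level anode loads, which is the order of magnitude quoted above.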
3. Application of Conventional X-Ray Source and the Selection of X-Ray Energy

If we are prepared to accept the limitations that arise when we use commercially available, sealed-off X-ray sources to avoid difficulties inherent in intense source operation, the question arises as to how we optimize system performance. The most significant variable is the choice of X-ray energy. The choice has to be made on the basis that, at the energy selected, the mask materials available must be able to produce the necessary contrast at the wafer without significant loss of X-ray quanta in the transparent regions. The mask materials have to maintain thermal and mechanical stability with dimensions appropriate to the resolution required on the target. In addition, the X-ray quanta must have good absorption in the resist used. The design problem is therefore an iterative one in which X-ray energy is chosen as an optimization of the following variables:

(1) the variation with energy of the efficiency of the X-ray generation process, in particular, the ability to generate a high flux of characteristic X rays;
(2) the dependence of absorption coefficient on energy of both the absorbing and transmitting regions of the mask;
(3) the ability of the resist to absorb the incident quanta;
(4) the properties of the window between source and exposure chamber, which are a function of the X-ray energy used, so that the window represents another variable in this context.
Some of the complexities of the design become apparent in Fig. 12. In
FIG. 12. X-ray absorption data in the wavelength range pertinent to X-ray lithography. [Horizontal axis in the original figure: wavelength (Å).]
particular, Fig. 12 illustrates the dependence of absorption coefficient on X-ray energy. The salient features of this figure are:
(1) The applicable range of X-ray wavelengths is from 5 to 50 Å, corresponding to energies of 2.5-0.25 keV, respectively.
(2) The upper part of the figure shows the curve for gold, which typifies heavy X-ray-absorbing materials. Two curves are shown, the lower curve representing the more recent data. The lower band corresponds to X-ray transmitting materials. Many of these materials are organic and all are of low atomic number. There is one aspect of the figure that should be stressed, and that relates to the relative absorption coefficients of the “absorbers” and of the “transmitters.” In contrast to the optical case, where absorption coefficients can vary by six orders of magnitude at a given wavelength, the corresponding factors differ only by a factor of approximately 50. In other words, there are no totally absorbing or totally transmitting materials for X rays. Bearing in mind that mechanical and thermal stability have to be carefully maintained, differences in layer thickness may further decrease the effective difference in absorbing capability.
(3) Included in Fig. 12 are the specific curves for four materials. All four show the characteristic increase with increase in λ, with abrupt decreases at those energies beyond which the incident electrons cannot excite a characteristic transition for the particular target. Gold is shown as a typical X-ray absorber. The second material shown is beryllium, which is often used as a window material. As the operative wavelength increases from 5 to 50 Å, the absorption coefficient increases by a factor approaching $10^2$, implying a significantly increased loss at longer wavelengths or a significant reduction in thickness that may or may not be practical. This increased loss is, to some extent, counterbalanced by an increased absorption in the resist (see Fig. 12 for the data pertinent to PMMA).
This increase in absorption corresponds to an increase in sensitivity. Finally, we show the absorption data for silicon and Mylar, which are often used as mask substrates (i.e., for the transmitting regions). In general terms, the use of longer wavelengths gives the best resolution and contrast but implies the use of thin mask substrates, often precluding the use of a nonvacuum exposure chamber (see Spiller and Feder, 1977). For structures ≥1 µm, a shorter wavelength can be used with the inherent conveniences of thicker masks and a nonvacuum system. The question of the efficiency of X-ray generation has been considered by Greeneich (1971, 1975), who gives an expression outlining the total design problem. This expression, in a notation applicable here, gives E, the X-ray energy incident on the resist, as
where $I_b V_b$ (= $W$) is the power (watts) incident on the anode, $\eta$ the quantum efficiency of the X-ray process expressed in quanta/incident electron/srad, $E_0$ (= $eV_b$) the electron beam energy, $D$ the anode-to-wafer separation, $\tau$ the exposure time, and $h\nu$ the quantum energy. The exponential term represents the total absorption loss in the mask, the window, and the gas ambient. Applying the work of Green and Cosslett (1968; Green, 1963a,b), we can write $\eta$ in the form

$$\eta = N(Z)\,f(E_c)\,(E_0 - E_c)^{1.63} \tag{18}$$
Here $N(Z)$ is a function of the target (anode) atomic number, $f$ represents the reabsorption loss, $E_c$ is the energy of the characteristic emission, and $E_0$ is the beam energy. Using a careful set of experimental measurements due to Dick et al. (1973), Greeneich determines that, with optimization of the efficiency by suitable choice of $E_0$, we have

$(\eta_k/E_0)_{\max} = 5.164 \times 10^{-7} + 4.836 \times 10^{-5} E_c^{-1/2}$    (19a)

$(\eta_l/E_0)_{\max} = 2.80 \times 10^{-?} + 9.02 \times 10^{-6} E_c$    (19b)

for the k and l lines, respectively, with $E_c$ measured in keV. If we require $E_c$ to be between 5 and 25 keV, we have

$1.02 \times 10^{-5} \le (\eta_k/E_0)_{\max} \le 2.21 \times 10^{-5}$    (20a)

$1.11 \times 10^{-5} \le (\eta_l/E_0)_{\max} \le 1.31 \times 10^{-5}$    (20b)

The efficiency of quanta creation is a slowly varying function of X-ray wavelength and hence plays only a minor role in the optimization process. The anode-to-wafer separation and absorption in the essential components are all important. The implication is clear: meticulous and exacting attention to detail has to be applied to the critical area of mask fabrication. Of particular significance is the way in which the required fineness of the structure impacts the stability/thermal problem.
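As a quick numerical check on the k-line limits quoted above, the following sketch evaluates the fitted expression at the two ends of the 5-25 keV range. It assumes that the k-line fit reads 5.164 × 10^-7 + 4.836 × 10^-5 E^(-1/2) with the energy in keV; the function name is illustrative only.

```python
import math

def eta_k_over_e0_max(energy_kev):
    """Optimized k-line efficiency per unit beam energy (assumed reading of the fit)."""
    return 5.164e-7 + 4.836e-5 / math.sqrt(energy_kev)

low = eta_k_over_e0_max(25.0)   # smallest value, at the high-energy end of the range
high = eta_k_over_e0_max(5.0)   # largest value, at the low-energy end of the range
print(f"{low:.3e} <= (eta_k/E0)max <= {high:.3e}")
```

Under that assumption both ends round to the 1.02 and 2.21 (× 10^-5) limits given for the k line.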
D. Masks for X-Ray Lithography
1. General and Fabrication
A mask for high-resolution X-ray lithography in a commercial environment has to meet a specification that includes the following properties:
(1) It has to impose minimal constraint on the system throughput by restriction of quanta flux into the resist layer. In other words, the appropriate products of absorption coefficient and thickness have to be minimized in the mask.
(2) It has to provide the necessary contrast in the X-ray flux delivered to the resist layer.
(3) The mask structure has to be thermally and mechanically stable and impose no limitations on the throughput because of “waiting” or “acclimatization” times required to bring the structure to equilibrium.
(4) A rapid automated alignment system has to be developed that occupies no usable “real estate” on the wafer and imposes no significant limitations on the throughput.
(5) The materials and structure used have to be radiation resistant in the sense that loss of masks due to radiation degradation or due to dimensional change should not impose an inconvenience either in terms of system downtime or labor or material cost.
(6) There is a linearity and repeatability requirement that depends on the details of the particular application. We return to this point later.
This list features the salient points; secondary objectives can be considered, including the very real probability that at the 0.5 μm resolution level, it may not be possible to use a gas to provide thermal stability. We can now indicate how recent work has sought to meet this overall specification for an X-ray mask. Figure 13 shows a schematic X-ray mask. The overall mechanical strength and ease of handling properties are obtained from a well-defined ring (often ceramic) onto which the mask substrate is bonded in a stretched condition to maintain stability against thermal change. In recent years, the mask substrates used have been a variety of transparent polymer films (see, for example, Spiller and Feder, 1977). These films have, to some extent, replaced the structured silicon layers used in initial studies (Smith and Bernacki, 1975) because the greater optical transparency allows a freer use of optical alignment techniques. Such techniques have replaced the earlier X-ray detection alignment systems, which were limited in performance by noise problems. The necessary absorption regions are almost invariably obtained by the use of a gold layer, the required pattern being formed in photoresist either optically in the low-resolution regime or by electron beam exposure at the submicron level. Etching techniques used include both wet and dry etchants. Photoelasticity is used to assess the inherent stress in the films. Typical dimensions are shown in Fig. 13. Alignment techniques using interference methods based on the use of a simple Fresnel zone plate are outlined next.

FIG. 13. A cross section of an X-ray lithography mask showing the important dimensions. (Labels: Au absorbing layers, 2500 Å thickness; stretched polymer film.)

2. X-Ray Mask Techniques: Alignment
Here, because of space reasons, we limit the discussion to recent work directed toward the alignment of submicron structures in an automated mode. The discussion centers on work by Fay et al. (1979), as this work is both up to date and illustrative of earlier work on which it is partially based. The basic alignment system is a one-dimensional Fresnel zone plate illuminated by a helium-neon laser. Figure 14 illustrates the basic mark, its focusing action, and the schematic signal plotted as a function of misalignment between the center of the zone plate and the center of a suitably oriented mark on the underlying wafer. One difficulty with this simple approach is the presence of the high background signal in nonaligned positions. This high background has its origin in the zero-order reflection. Additional discrimination can be obtained by breaking the wafer alignment mark up into a series of regular segments giving, in effect, a diffraction grating (see Fig. 14b). A photodiode can be suitably located to receive, for example, the first-order diffracted pattern. This approach not only gives the required low background signal, but is sensitive to changes in the z direction separation between wafer and mask, and it can therefore be used to “trim” this separation and its uniformity. In this context we can make use of the equivalence of an angular shift of the beam and an appropriate mechanical movement to impose a phase-locked signal on the output, so that a suitably chosen set of alignment markers, input lasers, optical scanners, and output photodiodes can generate a good-quality error signal, which can be used as the basis for an automated system. Some pertinent details can be noted:
(1) The associated semiconductor technology required to make alignments of good quality is available, is compatible with IC needs, and is not prohibitive in time or cost.
(2) A triangular array of three alignment marks provides the ability to align and level the wafer relative to the mask.
(3) The optical components can be located outside the X-ray flux and so can be used to continually monitor alignment during long exposures should the need arise.
(4) Piezoelectric movements can be used to perform the necessary adjustments.
(5) The studies are incomplete, so that a definite statement as to how quick the system may be cannot be given. From the curves given and bearing in mind the need to cover three dimensions and possibly to reiterate, the present author estimates the time required for an automated system based on this approach may well be between 3 and 10 sec, as both coarse and fine adjustments will have to be made.
(6) The alignment achieved is ±500 Å with realistic structures.
(7) The authors claim the approach to have advantages over the use of Moiré fringes (King and Berry, 1972) because of contrast limitations arising with the latter technique and to be superior to systems involving the alignment of identical gratings on the two components (Flanders and Smith, 1978). In this technique, actual replication of the grating may degrade the required symmetry of the detected interference patterns.

FIG. 14. One-dimensional Fresnel zone alignment marks adopted by Fay et al. (1979): (a) cross section of the zone; (b) section through the length of the zone showing the fragmentation of the wafer alignment marks; and (c) schematic signal as a function of misalignment ΔX.
In the previous sections, an attempt has been made to describe all the critical elements of an X-ray lithography system. It remains to consider the implications when all these components have to be established concurrently in a practical system. This topic is examined next.

E. Magnitudes for a Fast-Throughput X-Ray System
One method of establishing the full implication of what is involved in developing a fast, high-resolution, cost-effective X-ray lithography system is to design a thought experiment in which we seek to describe a total system capable of exposing 20 3-in. wafers in 1 hr. The wafers need to be accurately aligned to previous patterns. Reasonable premises on which to base such a system can be outlined as follows:

1. X-Ray Source and Source-to-Wafer Separation
Assume the use of a 20-kW rotating-anode X-ray source. Take account of the need for low service and downtime by operating the system at 10 kW. We can apply Eq. (17) to determine the X-ray power available per unit area. In that equation, we take the power $I_b V_b$ as 10 kW. The optimum value of $\eta/E_0$ is taken, i.e., this parameter is ~$2 \times 10^{-5}$ quanta/electron/srad/kV. To avoid penumbra effects, distortion, and uneven exposure, the source-to-wafer separation has to be significant. A commercially available rotating-anode system operates at a distance of 32 cm. Under these conditions, Eq. (17) gives the energy incident on the wafer surface and available for absorption in the resist layer as

$E = 10^4 \times 2 \times 10^{-5}\,(32)^{-2} \exp[-(\alpha_m t_m + \alpha_w t_w)]\,\tau h\nu = 1.95 \times 10^{-4} \exp[-(\alpha_m t_m + \alpha_w t_w)]\,\tau h\nu$ J/cm²    (21)

In this equation, we have neglected the small loss due to absorption in any thermalizing gas used; $h\nu$ is the quantum energy and varies from 0.310 keV for λ = 40 Å to 2.48 keV for λ = 5 Å. In order to obtain an order of magnitude estimate here, we take $h\nu$ = 1 keV, which errs on the generous side for high-resolution applications. On that basis Eq. (21) becomes

$E = 1.95 \times 10^{-4} \exp[-(\alpha_m t_m + \alpha_w t_w)]\,\tau$ J/cm²    (22)
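To make the arithmetic behind Eqs. (21) and (22) explicit, here is a minimal sketch of the incident-energy estimate. It assumes the form used in Eq. (21), i.e., power × (η/E_0) × D^-2 × absorption factor × exposure time × quantum energy, with the quantum energy in keV so that the result comes out in J/cm². The function and parameter names are illustrative, not taken from the cited work.

```python
import math

def incident_energy(power_w, eta_per_kv, distance_cm, exposure_s,
                    quantum_kev=1.0, absorption_exponent=0.0):
    """X-ray energy per unit area at the wafer, in J/cm^2.

    absorption_exponent is the summed (absorption coefficient x thickness)
    of mask and window; the default 0.0 neglects those losses.
    """
    geometric = power_w * eta_per_kv / distance_cm ** 2
    return geometric * math.exp(-absorption_exponent) * exposure_s * quantum_kev

# 10 kW on the anode, 2e-5 quanta/electron/srad/kV, 32 cm separation, 1 keV quanta
per_second = incident_energy(1.0e4, 2.0e-5, 32.0, exposure_s=1.0)
print(f"{per_second:.3e} J/cm^2 per second of exposure")
```

Any mask or window loss enters only through the exponential factor, so a nonzero `absorption_exponent` scales the result down without changing the rest of the estimate.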
2. Mask Size/Exposure Area

In the high-resolution regime with detail and separation of the order 0.25-0.5 μm, the mechanical accuracy/repeatability/alignability has to be <500 Å (= $5 \times 10^{-6}$ cm). If the mask size is 2.5 cm, this limit represents a fractional accuracy of $2 \times 10^{-6}$. This figure must surely be regarded as the upper limit to the repeatability/accuracy obtainable on a day-to-day basis. For the present, we use this estimate of mask size, 2.5 cm, which is appropriate to the LSI situation in which two successive high-resolution layers have to be located with this degree of accuracy. With this size of mask, nine exposures will be needed for a 3-in. wafer. A 20-wafer/hr system therefore implies a total of 180 exposures/hr, so that the total time (resist exposure plus overhead) per exposure is 20 sec. Of the allowed 20 sec, a significant fraction may well be used as overhead (see below). Thus if we allow the full 20 sec for actual resist exposure, we are underestimating the required sensitivity of the resist. If also we neglect the absorption of quanta in any window used and neglect the loss in the mask substrate, then again we underestimate the required sensitivity. Within these limitations

$E = 4 \times 10^{-3} \exp[-(\alpha_m t_m + \alpha_w t_w)] \approx 4 \times 10^{-3}$ J/cm²    (23)
This equation gives an estimate of the energy available for absorption in the resist; a fraction $F$,

$F = 1 - \exp(-\alpha t) \approx \alpha t$ for $\alpha t \ll 1$    (24)
will be absorbed in the resist. Assume for the present a total absorption. Thus the energy/unit area absorbed in the resist is $4 \times 10^{-3}$ J/cm². In these units, the sensitivity of PMMA, for example, is 1-1.5 J/cm². The implication is that to obtain a fast system, the resist required has to be at least 200-300
times more sensitive than PMMA. A more realistic estimate is given below, leading to further increases in sensitivity.

3. Estimate of Overhead Time per Resist Exposure

The main factors contributing to the overhead times are:
(1) The alignment time per resist exposure. The requirement has to be repeated nine times per wafer.
(2) Stage movement time per resist exposure. Again one movement has to be made per exposure, i.e., nine stage movements are required per wafer.
(3) Time for wafer loading and transport to exposure station.

Here we have to face two possibilities: one in which we can use a beryllium window and so operate in a gaseous ambient, and the other in which the use of a window is not possible. A leading group in this work area has expressed the view that the use of a window in long-wavelength, high-resolution applications is not possible (see Spiller and Feder, 1977). In this case, we have to include in the loading and transport time a vacuum pumpout time and a possible equilibrating time. The first of these quantities is self-explanatory. Depending on the mode of loading (cassette or single wafer), there is a pumpdown time that is, in terms of an order of magnitude, say ~5 sec/wafer or ~0.5 sec per resist exposure. The equilibrating time is a less well-defined concept but seeks to express the fact that a wafer has to be introduced into the system, located close to the mask, “aligned” in three dimensions, and then subjected for 20 sec to an X-ray flux and a simultaneous flood of infrared radiation. If the shutter is opened after alignment, the question arises as to whether the mask/wafer combination will move due to thermal effects during the exposure or not. The problem has to be analyzed in depth to provide a meaningful answer. One way of avoiding the difficulty is to subject the mask/wafer combination to an equivalent heating before the alignment, allow it to equilibrate to the required level, and then make the alignment and the exposure, switching the equilibrating infrared source off at the onset of the exposure. No mention is made in the literature of such intangibles, but to the present reviewer the problem appears very real bearing in mind the thinness of the membrane and the variation of time constants throughout the mask. The alternative approach using a beryllium window results in significant X-ray attenuation. At X-ray wavelengths approaching 50 Å, the attenuation coefficient of beryllium is of the order of 2 μm⁻¹, implying an attenuation of $e^{-2}$ (~0.135) per μm of thickness. Practical systems use 12 μm foils that are water cooled, an approach that removes possible uncertainties arising from thermal instabilities at the expense of an increasingly demanding specification for the resist sensitivity. This increase in sensitivity has to be maintained in a
resist having the required contrast properties (see below). It is possible that the use of a thin collodion window may represent a compromise solution here, giving some of the required properties with little attenuation.

4. Complete X-Ray Resist Properties

In Section III,B we established a basis for determining the absolute limit of sensitivity in terms of the shot noise limitation. It is usually assumed that 50 X-ray quanta per resolution element gives a suitably low noise figure. With PMMA, approximately $6 \times 10^5$ quanta are needed to expose a 0.25-μm element, implying from this viewpoint that a resist some 12,000 times more sensitive than PMMA could be used. Thus we are unlikely to be limited by shot-noise-created defects. However, we should note that, even in this ideal case, the sensitivity available does not allow the use of a practical (i.e., proven) beryllium window if we wish to achieve our specified throughput. Where a further uncertainty does exist is in relation to the resolution/contrast properties of the required resist. One of the exciting features of X-ray lithography is the absence of proximity effects as we understand them in electron beam lithography. There are, however, secondary effects and the role of these in highly sensitive resists has to be considered. Practical work in various laboratories (Maldonado and Bernacki, 1975; Hundt and Tischer, 1977) has shown that the secondary effects are minimal in PMMA. Careful experimentation has shown that the additional exposure due to secondary electrons, Auger electrons, and photoelectrons extends for a distance of only 400 Å in this resist of high contrast and is unlikely to be of first-order importance at a resolution level of 0.25-0.5 μm. However, we require a resist that is approaching $10^3$ times more sensitive, and so some growth of this secondary-exposure region may be expected.
This growth will result, not from an increased scale of the distribution of secondary effects, but from an increased ability of a more sensitive resist to detect the dose due to a small density of such effects. In the preceding section, we estimated the required resist sensitivity under favorable conditions in which attenuation in the mask was neglected, 100% absorption in the resist assumed, and the question of overhead time ignored. We suggested in Section III,D,B that some 5-10 sec may be required for movement, alignment, load/unload, etc. This time represents 25-50% of the total time available for resist exposure plus overhead. The inclusion of this factor increases the required sensitivity to 250-450 times that of PMMA. Treatment of the mask loss and the absorption loss involves detailed examination of a range of mask configurations and of resist properties. In a critique of this kind, it is better to assume minimal values, say a 90% transmission in the light areas of the mask and an 80% absorption in the resist. These approximate estimates lead to a required resist sensitivity 350-600 times that of PMMA. We can sum up the content of this section by saying that a fast X-ray lithography giving well-aligned 0.5 μm structure and spacing at a throughput of 20 3-in. wafers/hr implies:
(1) a resist 350-600 times more sensitive than PMMA with adequate contrast properties used under conditions of effective absorption;
(2) a rotating-anode source with X-ray energy optimally selected and operating at 10 kW;
(3) highly controlled mask technology, which imposes no significant attenuation;
(4) a solution of intangible problems associated with “windowing”; and
(5) an alignment system that is fast (≤ 10 sec), accurate (±500 Å), and automated, and that solves the alignment problem in an extreme limit, a fractional accuracy/repeatability of $2 \times 10^{-6}$.
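The chain of estimates summarized above can be retraced with a few lines of arithmetic. The sketch below is one reading that reproduces the quoted figures: it treats the 25-50% overhead as a proportional increase in the required sensitivity and then divides by the assumed 90% mask transmission and 80% resist absorption. All names are illustrative, and the rounding to 350-600 is the text's.

```python
def seconds_per_exposure(wafers_per_hour, exposures_per_wafer):
    """Time slot (exposure plus overhead) available for each mask exposure."""
    return 3600.0 / (wafers_per_hour * exposures_per_wafer)

def fractional_accuracy(tolerance_cm, mask_size_cm):
    """Alignment tolerance expressed as a fraction of the mask size."""
    return tolerance_cm / mask_size_cm

def required_gain(base_gain, overhead_fraction, mask_transmission, resist_absorption):
    """Resist sensitivity relative to PMMA after allowing for overhead and losses."""
    return base_gain * (1.0 + overhead_fraction) / (mask_transmission * resist_absorption)

slot = seconds_per_exposure(20, 9)             # 20 wafers/hr, nine exposures per wafer
accuracy = fractional_accuracy(5.0e-6, 2.5)    # 500 A (5e-6 cm) over a 2.5 cm mask
low = required_gain(200.0, 0.25, 0.90, 0.80)   # favorable end of the range
high = required_gain(300.0, 0.50, 0.90, 0.80)  # unfavorable end of the range
shot_noise_margin = 6.0e5 / 50.0               # PMMA quanta per element / quanta needed
print(slot, accuracy, round(low), round(high), shot_noise_margin)
```

Dividing by (1 - overhead fraction) instead of multiplying by (1 + overhead fraction) would be a stricter treatment of the overhead and would give a somewhat larger requirement.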
This modeling is directed toward a most immediate solution of this approach. It is now necessary to examine what further possibilities, of a more long-term nature, can be suggested to ease one or more of these problems. Particularly pertinent are recent innovations in source technology.

F. Advanced X-Ray Source Development

One approach suggested by Nelson and Ruoff (1978) obtains carbon K radiation by the use of a type IIb diamond bonded to a stationary anode and operated at an input power of 6.5 kW. The source size is approximately 1 x 1 mm. The authors indicate that an exposure time of 100 min for PMMA at a source-to-wafer separation of 25.4 cm is achievable. Alternatively, a resist some 300-600 times more sensitive than PMMA would be exposed in 20-10 sec. The wavelength (λ = 44.7 Å) is well matched for high-resolution application and for the use of organic film masks. The authors suggest the use of a collodion window (transmission at λ = 44.7 Å is 72%) to aid with the thermalization problem. This source simplifies the source technology, eases source replacements, and eliminates the vibration problems associated with rotation rates of the order of 5000 rpm. A more radical approach has been pioneered by McCorkle et al. (1979), which again uses carbon radiation. The source configuration is shown schematically in Fig. 15. The essential component is a discharge capillary (graphite electrodes on a polyethylene tube) used to generate a high-density plasma.
FIG. 15. Small-area, plasma discharge X-ray source developed by McCorkle et al. (1979). (Labels: negative high voltage; discharge capillary; cathode.)
Coincident with the plasma discharge, an intense but short-lived electron pulse is injected through the plasma (5 kA at 30 kV for 100 nsec, or 15 J of electrical energy). The X rays are generated in a 60 nsec flash and exposures are made at 22 cm from the capillary. The effective source size is 200 μm, penumbra effects are absent, the electronics are relatively simple, and the system in its present, single-shot, small-area version is inexpensive. According to the published work, the present application is in the biological field for high-resolution X-ray microscopy studies, although its application to X-ray lithography is clearly outlined. In this latter application, the source has to be scaled up to provide a large-area source that is, in effect, a parallel or near-parallel X-ray beam. The question of duty cycle has to be considered. Details are not available in the abbreviated published notes, but the implicit possibility of a switched array of discharge tubes and either associated electron sources or a scanned electron beam presents an attractive possibility. An approximate measure of the potentiality can be stated as follows. Such a capillary working at a duty cycle of 1 in 2500 can expose a resist that is 50 times more sensitive than PMMA in 10 sec. The realization of a wide-area source of this kind would alleviate the problems both of source and resist to some extent, leaving the alignment problem as the ultimate test.
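Two of the figures quoted in this section follow from simple scaling, sketched below. The sketch assumes that exposure time is inversely proportional to resist sensitivity at fixed source output and geometry; the function names are illustrative.

```python
def pulse_energy_joules(current_a, voltage_v, duration_s):
    """Electrical energy delivered by a rectangular current pulse."""
    return current_a * voltage_v * duration_s

def scaled_exposure_seconds(pmma_exposure_s, sensitivity_gain):
    """Exposure time for a resist sensitivity_gain times more sensitive than PMMA."""
    return pmma_exposure_s / sensitivity_gain

energy = pulse_energy_joules(5.0e3, 30.0e3, 100.0e-9)   # 5 kA at 30 kV for 100 nsec
slow = scaled_exposure_seconds(100 * 60.0, 300.0)       # 100 min for PMMA, 300x resist
fast = scaled_exposure_seconds(100 * 60.0, 600.0)       # same source, 600x resist
print(energy, slow, fast)
```

The first value corresponds to the 15 J quoted for the capillary discharge pulse, and the other two to the 20-10 sec range quoted for the diamond-anode source.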
IV. RECENT WORK IN ELECTRON BEAM LITHOGRAPHY

A. General

Over and above the resist work and the attack on the proximity problem, recent work has seen substantial progress made in the area of electron beam lithography. This progress has taken the form of a steady reduction-to-practice advance coupled with increasing applied research directed to solving specific problems in this application of electron physics. In the region of applied science, the problem of combining the aberrations calculated on the basis of geometrical optics for cases not maintaining a center of symmetry has been addressed in a generalized manner by the application of wave optics and computer-derived distributions (Kern, 1979). In addition, the critical problem of Coulombic interactions has received the attention the problem merits. In regard to improvements in overall performance of individual components, several laboratories have become aware of possibilities inherent in the use of an octopole deflector, and an elegant application of channel plate detectors to the fiducial mark problem that is suitable for a production environment has been described (Penberth and Wallman, 1979). In the area of total systems work, a continued and determined drive has been made in shaped-beam lithography, while further progress made in electron image projectors has been outlined. Work has been reported on the field-emitter-based systems (Wolfe, 1979). Questions relating to system diagnostics have not been addressed as strongly as the topic merits and the question of how to make rapid and reliable measurements of differences of the order of ±500 Å in distances of the order of 1 cm has received only limited attention.

B. Electron Optical Computation

1. Summation of Aberration Terms
In the first of these two articles, stress was placed on the role computer-aided design has played in this application (see, for example, Munro, 1973). At the same time two current limitations of the technique were indicated, one associated with the summation of the aberration terms and the second concerned with the fact that the computation is based on a model in which electrons are treated, in effect, as noninteracting entities. Thus the computations are applicable to a low-current model. Here we consider recent work addressing the first problem. The difficulty does not arise in the solution of Laplace’s equation, in the calculation of electron trajectories, or in the calculations of the aberration
integrals themselves. Well-established techniques can be used and present no limitations in the case of shaped-beam lithography. Where a problem does arise is in the way in which the aberration integrals are compounded to give the total aberration under preselected conditions of beam half-angle, scan area, chromatic spread, etc. In essence, each aberration is calculated individually and the total effect under given aperture conditions is then estimated by a suitable summation. In general, the square root of the sum of the squares of the individual contributions has been regarded as the appropriate method. The validity of this approach depends on the nature of the illumination used and on the symmetry of the system. When the effective source has a small physical shape with circular symmetry and the energy distribution within the source is Gaussian, then the above procedure is valid for those aberration contributions that leave the centrosymmetry undisturbed (Harte, 1973). With the introduction of an extended source and the inclusion of off-axis aberration the validity is open to question. In the case where the sources used are linear or rectangular in shape and the energy distribution is a clipped-off Gaussian distribution appropriate to shaped-beam lithography, the square law summation overestimates the aberration (Mauer, 1978). The approach developed by this worker uses a weighting appropriate to systems with centrosymmetry. Kern (1979), by an application of wave-optical calculations, has shown that in important cases relating to shaped-beam lithography, the situation is far from centrosymmetric and that the Gaussian summation approach misestimates the aberrations by a significant factor, up to a factor of five in one illustrated case. Therefore, Kern argues that a full optimization of a shaped-beam system using extended sources must consider the noncentrosymmetric cases.
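The quadrature rule discussed above is easy to state in code. The sketch below compares it with a plain linear sum, which bounds it from above; the individual contributions are made-up illustrative numbers, not values from any of the cited systems.

```python
import math

def quadrature_sum(contributions):
    """Square root of the sum of the squares of the individual aberrations."""
    return math.sqrt(sum(c * c for c in contributions))

def linear_sum(contributions):
    """Plain sum of magnitudes, an upper bound on the quadrature sum."""
    return sum(abs(c) for c in contributions)

# Hypothetical aberration disc diameters in micrometres (illustrative only)
example = [0.03, 0.04, 0.12]
print(quadrature_sum(example), linear_sum(example))
```

Both rules reduce the set of contributions to a single number and carry no information about the shape of the resulting spot, which is where the weighting question for extended, non-Gaussian sources enters.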
To comply with this generalization, two approaches can be considered: a full wave-optical calculation; and a correctly chosen “combination” approach in which the parameters determined by geometrical optics are used as input to a wave-optical calculation in the final part of the trajectory. The wave-optical formalism gives the resultant intensity at the image plane in a natural manner without arbitrariness of assumption. The above method is applicable to point source studies where diffraction at the final aperture can play a significant role. Where we deal with extended sources, the diffraction limit becomes less significant and the calculations can be completed using geometrical optics. However, the necessary weighting has to be obtained by assuming that the intensity is proportional to the local density of trajectory intersections with the image plane. Such a density can be obtained by an extensive repetition of the geometrical aberration computation over the range of parameters representative of the beam. Kern (1979) has implemented this technique for an examination of the trade-offs required in terms of aberration performance if normal landing is imposed on the beam. The approach is to introduce an additional degree of freedom by using a predeflector in the manner suggested by several groups (Crewe and Parker, 1976; Ohiwa et al., 1971). Figure 16 shows representative results taken from Kern (1979). The figures quoted are for a three-deflector system with a beam half-angle of $5 \times 10^{-?}$ rad, a 5 x 5 mm scan area, and the system had been optimized to give minimum total aberration after dynamic corrections had been applied. These estimates do not include Coulombic interaction effects.

FIG. 16. Incorporation of the telecentricity requirement. The trade-off between total aberration and landing angle (abscissa: landing direction, magnitude in radians, $10^{-5}$ to $10^{-2}$). After Kern (1979). System analyzed was an optimized double yoke within a projector lens and a predeflector.

2. The Role of Electron Interactions
In summarizing the position here, we can define three regimes of modeling that are pertinent to electron beam technology. In the low-current limit appropriate to high-resolution transmission and scanning electron microscopy, the current is such that no significant electron interactions occur in a generalized sense and a noninteracting model is applicable. In instrumentation involving less resolution but higher current such as cathode ray tubes,
there is extensive interaction and the appropriate modeling involves investigating electron motion in the neighborhood of a smoothed field derived from the space charge of all other electrons in the beam. In electron beam lithography, we work between these limits both in resolution and required current. Under these conditions, the discreteness of electrons and their individual interactions play an aberrating role. There are also residual effects that can be traced to the onset of a small space charge field. This division of the required modeling should not be quoted out of context. Its main value is “to set the scene” and obviously each regime flows naturally into the other, the demarcation being useful mainly for clarification. This subject is being actively researched. The underlying points at issue are concerned with the details of interaction and how to predict the role of such Coulombic interactions in a given situation and how to do definitive experiments uncomplicated by indirect interpretation. Of particular relevance are the following questions:
(1) It is generally accepted that there is a significant Coulombic interaction at the electron gun under conditions of high brightness. There is clear evidence that interactions play a role along the beam length itself. Where does the major contribution come from: Is it over short lengths associated with the crossover or does a significant contribution arise along the extended regions in which the beam is unfocused?
(2) Is it possible to establish a simple physically interpretable analysis of the problem that we can incorporate into the system design without prohibitive computations and obscuring of the design concepts? We can argue by analogy here. What we need is a process similar to that used in tackling the proximity effect problem. In that case, the complex computations are approximated in the final stages and the necessary corrections can be applied in terms of parameters that can be physically interpreted to a large degree. Here we need the same degree of simplification to allow for physical insight.
(3) We cannot claim that the designs of electron beam lithography systems are completely optimized until we have a detailed physical understanding of these interactions and their role in creating aberrations. How well do we understand these interactions, and how well can we relate computations to experimental results?

In this area, studies that make a direct comparison between experiment and computation in simple situations are of particular importance. One such study (Groves et al., 1979) utilized a specially constructed short column consisting of a field emitter gun, a single magnetic lens (f = 4.39 cm; U = 6.14 cm; V = 15.36 cm), a conventional deflector, scanned aperture, and a Faraday cup assembly for diagnosing the focused spot quality. The beam
half-angle at the target was 3 mrad and the beam currents varied between 1 and 20 μA. The source size was ~50 Å and beam voltage was 20 kV. A straightforward Monte Carlo program was developed to describe the Coulombic interaction. The attractive feature here is that the computation can be done on a desk calculator and the results converge after relatively few samplings, a group of 75 electrons iterated over 40 steps between source and target. The initial “seed” or input into the program is obtained by the use of random number generators to determine the transverse energy of each electron within the scale of a preset energy distribution. The starting position of each input electron is obtained by randomly assigning each electron a number between 0 and t, where t is the time taken to travel a fraction 1/N of the total beam length L. Each electron is imagined to start from a point source and travel a time t in the absence of Coulombic interactions with the components of momentum set by the first procedure. Subsequently the location of each electron is updated as it passes down the column after each increment of path length L/N. The acceleration experienced by each electron is the sum of that derived from the Coulombic interactions of all other electrons within the associated group. The interaction used includes both electrostatic and electromagnetic contributions, and an idealized thin lens field was included to provide the necessary systematic radial velocity components. A technique for quickening convergence is sketched and a method of locating the crossover is automatically included. This technique enables the integrated distribution of electron position to be located at the crossover in a manner directly comparable with experiment. No attempt is made to determine the major contribution to Coulombic spot growth, i.e., energy broadening, trajectory rotation, etc.
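The seeding and stepping procedure described above can be illustrated with a deliberately simplified sketch. It is not the program of Groves et al.: it assumes nonrelativistic motion, purely electrostatic pairwise forces, no lens field, a uniform distribution of transverse energy up to 1 eV, and a drift length equal to U + V = 21.5 cm. All function names are hypothetical.

```python
import math
import random

E_CHARGE = 1.602176634e-19   # electron charge, C
E_MASS = 9.1093837015e-31    # electron mass, kg
COULOMB_K = 8.9875517923e9   # Coulomb constant, N m^2 / C^2
PAIR_PREFACTOR = COULOMB_K * E_CHARGE ** 2 / E_MASS   # m^3 / s^2

def make_group(n, beam_ev, max_transverse_ev, window_s, rng):
    """Seed n electrons leaving a point source at random times within window_s."""
    vz = math.sqrt(2.0 * beam_ev * E_CHARGE / E_MASS)
    pos, vel = [], []
    for _ in range(n):
        # Assumed: transverse energy drawn uniformly, random azimuth
        vt = math.sqrt(2.0 * rng.uniform(0.0, max_transverse_ev) * E_CHARGE / E_MASS)
        phi = rng.uniform(0.0, 2.0 * math.pi)
        v = [vt * math.cos(phi), vt * math.sin(phi), vz]
        t0 = rng.uniform(0.0, window_s)      # interaction-free head start
        vel.append(v)
        pos.append([c * t0 for c in v])
    return pos, vel

def accelerations(pos):
    """Pairwise electrostatic repulsion; each pair adds equal and opposite terms."""
    acc = [[0.0, 0.0, 0.0] for _ in pos]
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            d = [pos[i][k] - pos[j][k] for k in range(3)]
            r2 = d[0] * d[0] + d[1] * d[1] + d[2] * d[2]
            if r2 == 0.0:
                continue                      # coincident points: direction undefined
            scale = PAIR_PREFACTOR / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += scale * d[k]
                acc[j][k] -= scale * d[k]
    return acc

def propagate(pos, vel, total_s, n_steps, interacting=True):
    """Step the group forward; velocities are updated first, then positions."""
    pos = [p[:] for p in pos]
    vel = [v[:] for v in vel]
    dt = total_s / n_steps
    for _ in range(n_steps):
        if interacting:
            acc = accelerations(pos)
            for v, a in zip(vel, acc):
                for k in range(3):
                    v[k] += a[k] * dt
        for p, v in zip(pos, vel):
            for k in range(3):
                p[k] += v[k] * dt
    return pos, vel

def rms_radius(pos):
    """Root-mean-square transverse distance from the axis."""
    return math.sqrt(sum(p[0] ** 2 + p[1] ** 2 for p in pos) / len(pos))

rng = random.Random(1)
beam_length_m, n_steps = 0.215, 40            # assumed drift length; 40 steps
vz = math.sqrt(2.0 * 20.0e3 * E_CHARGE / E_MASS)
transit_s = beam_length_m / vz
start_pos, start_vel = make_group(75, 20.0e3, 1.0, transit_s / n_steps, rng)
free_pos, _ = propagate(start_pos, start_vel, transit_s, n_steps, interacting=False)
coul_pos, _ = propagate(start_pos, start_vel, transit_s, n_steps, interacting=True)
print(rms_radius(free_pos), rms_radius(coul_pos))
```

Running the same seeded group with and without the pairwise term isolates the effect of the discrete interactions from the purely ballistic spread. A faithful reproduction would also need the lens field and the magnetic part of the interaction mentioned in the text.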
All aberrations are collectively summed, no assumption is made as to the Gaussian or non-Gaussian nature of the distribution, and a consistent definition of spot size is used throughout. Two points can be stressed about this approach: the absence of preconception and the similarity with the method adopted by Kern (1979) to provide a realistic description of the electron beam distribution. In both cases no attempt is made to “force” the solution to fit distributions applicable to previous experience. The calculations are made to give a simple total summation directly comparable with experiment. An immediate possibility exists of combining the two approaches, whereby the trajectories and weighting developed from low-current modeling are used as input into the discrete charge model. In terms of actual results, these workers observe an onset of broadening at a brightness of ~4 × 10⁷ A/cm²/srad with the conditions quoted. The broadening increases approximately linearly with spot current, and approximately linearly with 1/α provided α lies in the range 1 ≤ α ≤ 8
mrad. The initial energy spread assumed was 1 eV and good agreement with experiment was obtained (better than ±10%). The above approach is designed to provide answers to specific problems facing workers developing electron beam lithography systems. In maintaining relative simplicity, details necessary to a total understanding are obscured. Other workers have adopted a complementary, more analytical approach, particularly in regard to the question of energy broadening. If we limit the discussion to the development of a comprehensive, quantitative theory, we can refer to the developments made by Loeffler and co-workers (Loeffler, 1969; Loeffler and Hudgin, 1971) and by Zimmermann (1968, 1969, 1970). The progress made up to 1970 is summarized cogently and economically in an excellent review article in this series by Zimmermann (1970). More recently many of the remaining details have been clarified by Knauer (1979a). This recent work has applied the results of plasma theory to the question of momentum transfer between electrons to clarify pertinent details, to provide results directly comparable with previous experimental data, and to investigate the relative performance of aperture-imaging (shaped-beam) and high-brightness (TF-emitter-based) systems (Knauer, 1979b). Highlights of the recent work are:
(1) The approach represents a generalization of the work of Loeffler and Zimmermann: Zimmermann considered the role played by thermal momentum, while Loeffler stressed the role played by transverse momentum resulting from applied fields (focus action). Knauer treats both effects concurrently to give estimates of the “composite Boersch effect” in three cases of interest: a parallel beam, a crossover, and a diverging beam from a point source.
(2) A detailed discussion is given of the nature of the scattering mechanism prevailing in electron beams of this type.
Knauer argues that the multiple small-angle scattering model adopted by Loeffler and Zimmermann is applicable here and rejects a model developed by Crewe (1978) that is apparently equivalent to the assumption of a single-collision model.
(3) Direct comparisons are made with data from field emitter sources (Bell and Swanson, 1979) and with data obtained from a total “shaped-beam” lithography system (Pfeiffer, 1971). The agreement between theory and experiment is within 10-15%. Bearing in mind the complexity of analysis and the skill required experimentally, this agreement should be regarded as acceptable.
(4) Various points of detail are noteworthy. A minor difficulty in Loeffler’s model of an apparently nonconvergent integral is avoided. The dependence of significant parameters (column length, beam half-angle,
current density) is compared to experiment and to previous analysis. Questions of detailed difference are examined in terms of available choices of integration limits and in the interpretation of effective screening lengths.
The remaining topics in this area, the trajectory perturbation and the role of space charge, have received a detailed examination. In terms of published work of direct applicability to the lithographic application, Stickel and Pfeiffer (1978a) have stressed that both discrete charge and space charge effects are required to interpret the results obtained at high current density. Crewe (1978) has given an approximate treatment, indicating an order-of-magnitude spot growth.
$$\Delta d \sim \frac{2LI}{\alpha V^{3/2}} \times 10^{[\ldots]}\ \text{cm}$$
where a current I at beam voltage V travels a total column length L; α is the beam half-angle at the target plane; Δd ~ 1-5 μm for reasonable values of the parameters with I ~ 10 μA.
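The practical content of the estimate attributed to Crewe lies in its scaling, which can be explored without the numerical prefactor. The sketch below assumes the dependence Δd ∝ LI/(αV^(3/2)), reading the voltage exponent as 3/2; the function name and the ratio-style parameters are hypothetical and serve only to compare two operating points.

```python
def relative_spot_growth(length_ratio=1.0, current_ratio=1.0,
                         half_angle_ratio=1.0, voltage_ratio=1.0):
    """Ratio of spot growth between two operating points.

    Assumes delta_d is proportional to L * I / (alpha * V**1.5);
    each argument is (new value) / (reference value).
    """
    return length_ratio * current_ratio / (half_angle_ratio * voltage_ratio ** 1.5)


# Halving the column length at fixed current, half-angle, and voltage.
print(relative_spot_growth(length_ratio=0.5))

# Doubling the beam voltage with everything else fixed.
print(relative_spot_growth(voltage_ratio=2.0))
```

Under this assumed scaling, shortening the column helps linearly, while raising the beam voltage helps faster than linearly.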
C. Electron Beam Lithography Based on the Use of a Field Emitter Cathode
Recent published work in this area has concentrated on the use of a thermally aided field emitter based on the use of a (100)-oriented tungsten emitter covered with an optimum surface of a zirconium oxygen mist (Wolfe, 1979; Tuggle et al., 1979; Groves et al., 1979). In published form, the work has been concerned with a system previously described by Wolfe (1975). A development version of the system is shown schematically in Fig. 17. Its simplicity is appealing, the minimization of Coulombic effects by avoidance of crossovers and reduction of column length is apparent, and the results obtained are encouraging:
(1) Electron-optical performance: Beam currents on the order of 100-400 nA can be obtained in an image spot on the order of ~0.2 μm. The working distance can be as large as 10 cm and the energy spread values are quoted from 1 to 3 eV.
(2) Cathode lifetime: Wolfe (1979) gives data suggestive of a cathode lifetime of 5000 hr under realistic operating conditions at a vacuum level of 10- * Torr. An increase in the robustness of the cathode has been obtained by using cathodes of relatively large diameter emitter tip (Tuggle et al., 1979; Wolfe, 1979). Wolfe (1979) has used an electronic current limiter and, by careful attention to the reduction of stray capacitance, has limited the energy content of arcs to give high survival probability for the cathode tip. It would appear that the zirconated TF emitter is to be preferred over the alternative, the tungsten-oxygen built-up emitter (see Swanson and Bell, 1973), on grounds of stability and ease of operation.
FIG. 17. Electron microprobe based on the use of a TF emitter (after Wolfe, 1975). [Labeled components: TF emitter, suppressor, grid, internally conductor-coated glass tube, second lens, wafer.]
(3) Noise properties: Over a wide range of operating conditions (5 x to Torr, cathode temperature ~1600-1900 K, emitter voltage 5-10 kV) the noise is proportional to the probe current. Wolfe (1979) quotes a value of 0.9% peak-to-peak noise over a 5-kHz bandwidth for spot currents in the range 10-100 nA.
(4) The long working distance between final lens and the target eases the deflection problem and presents no difficulty in the location of fiducial mark detectors.
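To put the electron-optical figures in item (1) on a common footing with current densities quoted elsewhere, the probe current can be converted to an average current density over the spot. The sketch below assumes a uniformly filled circular spot whose diameter is the quoted ~0.2 μm, which is an idealization; the function name is hypothetical.

```python
import math


def spot_current_density(current_a, diameter_cm):
    """Average current density (A/cm^2) for a uniformly filled circular spot."""
    area_cm2 = math.pi * (diameter_cm / 2.0) ** 2
    return current_a / area_cm2


SPOT_DIAMETER_CM = 0.2e-4   # 0.2 um expressed in cm

for current_na in (100, 400):
    j = spot_current_density(current_na * 1e-9, SPOT_DIAMETER_CM)
    print(f"{current_na} nA in a 0.2-um spot: about {j:.0f} A/cm^2")
```

Under that assumption the quoted range of 100-400 nA corresponds to a few hundred to somewhat over a thousand A/cm².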
It should be stressed that considerable effort has still to be applied to fully develop this approach. Among the unresolved questions that remain are:
(1) It is likely that such a system will have to use relatively sensitive resists with reduced contrast properties. In these conditions, it will be necessary to blank on a crossover to avoid stray exposure during blanking (see Lin and Beauchamp, 1973). This additional requirement means the incorporation of an intermediate crossover with some loss of performance in terms of current density into a small spot.
(2) The work reported to date has been concerned with undeflected spots. The incorporation of fast, stable deflector systems to fully exploit the potentiality of the high brightness has yet to be achieved.
(3) The simplicity inherent in Fig. 17 is gained at the expense of considerable complexity in the electronic circuitry required for the system. The deflector drive system referred to above is one such challenge. In general terms, the problems are typical of those facing system engineers seeking ultra-high-speed operation: the ultimate in stability and repeatability.
(4) The incorporation of proximity corrections may prove to be simpler with a high-brightness system than with shaped-beam systems because of the reduced “scale” at which the correction can be made with a refined beam.
D. Shaped-Beam Lithography Systems
1. General
This approach represents an optimization of electron beam technology to the lithographic application within the limitations imposed by the use of a thermal cathode of limited brightness. Originally pioneered by IBM (see, for a recent review, Pfeiffer, 1979), it is being exploited in several variations by different groups in the United States, Japan, and the Soviet Union. A reasonably comprehensive discussion of the physical basis of this approach is available from the original papers and, in more condensed form, in the previous article in this series (Thornton, 1979). Of more immediate interest are recent reports showing the substantial practical application of the technique and its continuing development. For the present, we limit the discussion to considerations of high-current-density systems. In the following sections, we outline, perhaps too briefly, the available data. It should be remembered that the work being described here was performed in a competitive environment where the availability of data is dependent upon commercial factors. Under these conditions, errors of interpretation can arise. Such errors that do arise are solely the responsibility of the reviewer.
2. Improved Electron Optics
Stickel and Pfeiffer (1978b) have stressed that the major limitations to electron optic performance lie in the projector/deflector system, not in the first four lenses of the system or in the gun brightness. In addition, it has been shown that, provided careful attention is paid to the optimization of the low-current aberrations, the limiting factor becomes the Coulombic interactions. Evidence is produced that two contributions play a role here: a space charge effect and a contribution from the individual interactions. The space charge effect can be substantially eliminated by refocusing from the focus condition implied by the low-current modeling. The discrete charge effect is the limiting factor. There are three manifestations of this form of aberration: an energy spread and two distortions of the trajectory (one radial and one azimuthal). To minimize these effects, the path length is reduced as much as possible. In the initial shaped-beam system, a single main deflector was used in the bore of the projector lens, the geometry being chosen to eliminate the radial chromatic aberration by opposing the energy dispersion relationships of the projector lens and of the deflector system. In the second-generation systems, the projector lens has been reduced in length and a double-deflection yoke chosen in position and orientation to minimize the total chromatic aberrations. These modifications result in the edge resolution achievable being increased by a factor of two and the scan field being increased by 30%. In the dual-yoke system, the two yokes are rotated by 90° relative to each other. The orientation is not critical. The sensitivity is such that a decrease in orientation from 90 to 70° decreases the edge resolution by less than and, at the same time, increases the deflector system sensitivity by 20%.
The authors indicate the difficulty in making definitive measurements that allow direct comparison with computed estimates of individual aberrations and adopt an averaging procedure to determine the total aberration present in a manner reminiscent of the way in which residual asymmetries are removed by experimental techniques in Hall effect measurements. Idesawa et al. (1979) have confirmed the role of space charge. These workers obtained an aberration of the edge slope on those edges derived from the first aperture at spot currents as low as 0.5 μA, which could be partially improved by refocusing. The solution adopted was to introduce a third aperture to remove the aberrated edges. Pfeiffer and Langner (1978) have extended the shaped-beam concept to the projection of complete characters. Included here is the ability to introduce a limited range of device elements with "non-Manhattan" geometry. This extension incorporated an ability to counter image movements introduced by the use of large mask fields. The technique used is to include a dynamic correction that can be applied to the variable-aperture deflector. Recent computational studies
(Kern, 1979; see Section IV,D,2) indicate design studies have been completed to incorporate telecentricity should the need arise.
3. Electronic and Computer Aspects
The electronic design problems determined the nature of the total deflector system. If a single-deflector system (electromagnetic) is used, the required bandwidth of 15 MHz can only be obtained with a noise level equivalent to a positional uncertainty of 1 μm. In a dual-deflector system, an electromagnetic driver that gives a total scan of up to 8 mm at a bandwidth of 200 kHz is integrated with a fast (15 MHz) electrostatic “incremental” deflector of limited scan (~25 μm). A 15-MHz system can then be realized with a placement noise of 0.1 μm (Woodard et al., 1979). Under these conditions, the time required for stepping, blanking, and providing offsets is 42 nsec. Using PMMA with a sensitivity of 10 μC/cm² and a current density of 25 A/cm², the resist exposure time is ~400 nsec. Thus the speed of the deflector system is not a major limitation as used at present. The accuracy requirements of the deflector system are met by a calibration procedure that obtains correction data at 30 × 30 sites over the scan area. Prior to calibration, the deflection error is 8 μm. After correction, this number is reduced to 0.2 μm. The positional noise error is 0.1 μm, the total pattern placement error is 0.25 μm, and the register error is 0.30 μm. In terms of overlay accuracy, the predicted value is 0.5 μm and the experimentally observed value is 0.60 μm (3σ value for an 8 × 8 mm field). These values can be compared with the 0.7 μm value required for 2.0 μm structures. It is stressed by Woodard et al. (1979) that questions of long-term stability have to be carefully considered, and it has been shown that a system of sequenced servo systems can be devised to provide the necessary monitoring and correction.
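The exposure-time figure quoted above follows from dividing the resist sensitivity (charge per unit area) by the current density. The short sketch below reproduces that arithmetic and compares it with the 42-nsec stepping overhead; the function and variable names are hypothetical.

```python
def exposure_time_s(sensitivity_c_per_cm2, current_density_a_per_cm2):
    """Time to deliver the required dose: t = S / J."""
    return sensitivity_c_per_cm2 / current_density_a_per_cm2


t_expose = exposure_time_s(10e-6, 25.0)   # PMMA at 10 uC/cm^2, beam at 25 A/cm^2
t_overhead = 42e-9                        # stepping, blanking, and offsets

print(f"exposure time per spot: {t_expose * 1e9:.0f} ns")
print(f"overhead share of each cycle: {t_overhead / (t_expose + t_overhead):.1%}")
```

With these numbers the dwell time per spot is roughly ten times the deflection overhead, which is the sense in which the deflector speed is not the limiting factor.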
A gun brightness servo is used to adjust the cathode temperature to bring the beam current to the reference value. A coupled pair of alignment servos (the first and third alignment servos) act to center the beam on the beam-limiting aperture by maximizing the current. Double-alignment systems are used to avoid excess virtual source movement during alignment. A further alignment servo (the second alignment servo) minimizes the current intercepted by the second square aperture. Further sequencing and iteration procedures ensure a convergence to the fully aligned condition.
4. General Properties
With a shaped-beam system and a variable spot size, an extra degree of freedom exists when the question of alignment and fiducial marker detection is examined because the optimum spot size appropriate to the particular application can be selected. It was determined by Weber and Moore (1979)
that, with a system incorporating a backscatter detector and a layout of four fiducial marks, the optimum width in the scan direction is a function of resist thickness. With thin resist layers ~0.7 μm in thickness, the optimum width is 1 μm. For layers ~1.75 μm in thickness, the corresponding value is 2 μm. Stickel (1978) has also given data pertinent to the fiducial mark array used in these systems. It appears, although it is not specifically stated (Woodard et al., 1979), that blanking is achieved at or very near to a crossover, thus eliminating image movement during blanking. Although there may be some variability in specifications in the various systems constructed, Weber and Moore (1979) have outlined data pertaining to two design problems of significance: (1) the trade-off between deflection area and available current density, and (2) optimizing the maximum size spot compared to the minimum. In regard to the first problem, the relevant numbers can be summed up by giving the available current density as a function of spot area and field size under conditions such that the edge resolution is kept constant at 0.3 μm. In the small scan-area case (2.5 × 2.5 mm), the current density falls from 125 to 25 A/cm² as the spot area is increased from a small Gaussian probe (≤1 × 1 μm) to a 25-μm² spot. For a large-area scan (10 × 10 mm) the available current density is invariant at ~4 A/cm² over the same range of spot area. Many applications are currently being fulfilled with an operating point specified as a maximum spot of 4 × 4 μm, a 5 × 5 mm scan field, and a current density of 25 A/cm². The results obtained by seeking an optimization (in terms of throughput) of the maximum spot size compared to the minimum size element required indicate a relatively broad minimum over the range 2-3. This result is valid for patterns varying in pattern density over a wide range (1.3-66%).
The minimum appears to arise from the fact that the total exposure time increases at small spots because of the number needed. At high values of this parameter, the exposure time again increases because of the relative infrequency with which such spots are used and possibly because of the reduced current density available. The minimum ratio will move toward a lower value as full proximity corrections are incorporated. Weber and Moore (1979) have given details pertaining to throughput on the latest shaped-beam lithography systems introduced into somewhat specialized commercial operation. With systems giving 1.25-μm minimum detail, the delivered throughput is ten wafers of 2.25-in. diam (six wafers of 3.25-in. diam) with an alignment capability of 0.3 μm (3σ point). These results were obtained using a resist of 10 μC/cm² (PMMA) and with a system uptime of 80%. Micrographs published by Giuffree et al. (1979) indicate an almost undetectable error in placement when a long line is integrated from a sequence of individual exposures. The studies described were made under
conditions far more stringent than those met with in practice. Other magnitudes pertaining to alignment accuracy have been given above. Other pertinent references here include Davis et al. (1977), Dorran et al. (1975), and Yourke and Weber (1976).
E. Electron Projection Methods
Further work on both types of electron beam projection methods has been reported within the last year. Frosen et al. (1979) have extended their previous development of a “microprojector” to a consideration of the practical problems involved in a high-resolution system of this kind. The system described is concerned with the fabrication of 1-μm elements with 8 × 8 mm image fields. The exposure time for a field is 0.7 sec using a resist of sensitivity equal to 2.5 x C/cm2. In the development laboratory, an alignment accuracy over the whole field of better than 0.1 μm has been demonstrated with exposures taken 48 hr apart. No statistical data pertaining to this figure are available. The mask problem has been approached by the use of a grid of fine support wires (Politycki et al., 1978), which can be eliminated from the pattern by suitable overexposure. The fabrication of isolated patterns is done by the use of multibeam masks, the only drawback here being a slight degradation in the edge resolution of the pattern established in the resist. Nevertheless the edge accuracy is ~0.2 μm in a 1-μm structure. Effects due to backscattering are reduced by a suitable choice of beam voltage (compare Section II,D). The use of a low beam voltage, i.e., 10 kV, allows the fabrication of 1-μm structures without problems from the proximity effect. Without due care, a residual charge-up problem can occur, which results in a symmetrical distortion (a “growth”) of the pattern. This effect can be avoided by the use of thin resist layers and by the use of more sensitive resists (i.e., sensitivity 2.5 x C/cm2). This charge-up problem can be avoided at the low beam voltages required to reduce proximity effects. No statistics are available on the system throughput, the overhead, or alignment times. Two papers (Ward, 1979; Bril and Snijders, 1979) have described the upgrade and application of the electron image projector.
Ward (1979) describes a system capable of exposing 4-in. diam wafers in a single pass. Such an approach allows the use of highly insensitive resists such as PMMA. The alignment system has been improved by replacing the earlier X-ray detectors with scintillator-photomultiplier combinations to enhance the signal-to-noise ratio. The system is now capable of resolving to within 3% of the pitch of the alignment markers with an integration time of 1 sec. The alignment procedure operates in X and Y and also provides a small range of magnification control to allow for the major, linear component of any in-plane wafer distortion that may arise. The total alignment time is a few seconds. Out-of-plane wafer distortion is reduced by the use of an electrostatic chuck and the residual distortion is a small fraction of a micron. Alignment accuracies of less than 0.1 μm have been achieved with gold alignment markers in the laboratory. Bril and Snijders (1979) used an earlier model to fabricate bubble memory structures with 1-μm minimum detail on 2-in. diam wafers. The total cycle time was 3 min/wafer. A 10% variation in exposure time represents a 0.1-μm deviation from a 1.4-μm structure studied in detail. A small proximity correction (~10%) was applied to certain chevron elements, presumably by preadjustment of the element size in the mask. Gap width accuracy for adjacent elements is ±0.1 μm for 90% of the chips. The implication from Ward (1979) is that these layers can register over substantial regions of the wafers to between 0.1 and 0.2 μm and to within 0.3 μm over most of the wafer.
F. Electron-Optical Components
1. Advanced Deflection Systems
Under the pressure of high resolution, large chip size, and high throughput, the question of beam deflection has been examined in some depth. If we take the development of shaped-beam lithography as indicative of the way in which the problem has been tackled, we can enumerate these main steps:
(1) Initial work was concerned with the application of magnetic deflector systems because substantial expertise had been gained with such deflectors in scanning electron microscopy and, in a sense, electron beam lithography grew out of scanning microscopy. Magnetic deflectors were preferred as their aberration properties were basically superior to the four-plate electrostatic system.
(2) The computation of the aberration properties as a function of geometry and location was undertaken systematically and independently of the practical problems that arise in relation to the fabrication, operation, and calibration of such systems.
(3) With the realization of the role played by Coulombic effects and the need to shorten the column length, the use of combined focus and deflector systems was proposed, studied, and optimized.
(4) Recently, the analyses have been generalized to include the possibility of incorporating a normally landing beam and to include the trade-offs in aberration properties that have to be made to achieve a large measure of telecentricity.
This approach has led to the successful completion of feasibility systems and to the introduction of practical shaped-beam systems working in a
commercial environment. Recently, work has been concerned with questions of increasing throughput by the use of fast deflection systems. Here the limitation arises because electromagnetic systems are difficult to operate at high rates because the necessary drivers have to deliver a high current and a high-current slew rate to provide the necessary deflection and have to operate at a substantial voltage to overcome the resultant back emf. High stability has to be maintained and an eddy current problem overcome (see Pfeiffer, 1971). In recent years, several groups have realized the possibilities inherent in the use of an octopole deflector system. This system is an eight-plate electrostatic deflector unit that combines some of the better properties of both electrostatic and electromagnetic deflectors. The argument applied is that, if we could improve the aberration properties of the electrostatic deflector to give a useful performance, the drive problem would be substantially reduced as a more conventional voltage drive with no significant current loading is required and the eddy current problem is eliminated. The method of improving the aberration performance was patented by Salinger and Beach (1949) and in more recent years has been applied by Kelly et al. (1974) and others (Wang, 1971) to the investigation of electron beam memories. The underlying idea is that the aberration integrals result largely from terms measuring the derivative of the deflection field. If, therefore, we have a means of varying the uniformity of the field, the possibility exists of improving the aberration performance and of trading off one aberration at the expense of another. The necessary degree of flexibility is outlined in Fig. 18. A fourfold deflector system is further separated into eight plates. The electrical bias can be obtained, for example, by the application of two, independent, double-ended voltage supplies to a suitably designed resistor bridge. Using the system shown in Fig. 
18, we obtain voltages ±V_x, ±aV_y onto the x plates and ±V_y, ±aV_x onto the y plates, where a = R/(R + r). Thus, by a suitable choice of r/R, the voltage distribution and hence the field distribution can be controlled over a wide range. Figure 19 (Kelly, 1977) illustrates the measure of control that can be obtained. Implicit in this figure is the fact that the deflection field can be made sensibly constant over a wide region centered on the axis. One of the more ambitious applications of this technique is described by Soma (1979), in which an array of five such octopoles is used to reduce the aberrations to ~0.2 μm over a 5 × 5 mm deflection field with a reasonable beam half-angle and landing angle. It should be stressed that this has been done without the need for dynamic corrections. The work is limited to a very complete computational study based on an analytical and adjustable approximation to both the focus and the deflector fields and does not consider high-current effects. The author is careful to stress that a judgment between
FIG. 18. Principle of an octopole deflector system: (a) schematic circuit; (b) voltage distribution with one set of voltages applied; and (c) voltage distribution with both sets of voltage applied. [After Kelly (1977).]
this approach and a more conventional method using fewer deflectors and dynamic corrections has to be made after further study of the latter technique, of the sensitivity of each method to mechanical tolerance, and of the electrical drive requirements of each approach. The same laboratory (Idesawa et al., 1979) has incorporated a single octopole deflector system into a shaped-beam system. In this case, the electrodes are accurately located cylinders of 3-mm diam, the internal bore is 5.4 mm, the active length is 3 cm, the total structure is 41 mm in length, and the authors claim high mechanical precision. Finally it should be noted that a very complete review of recent progress in the design of deflector systems has been given by Ritz (1979) in Vol. 49 of this series.
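The resistor-bridge biasing of the octopole described earlier in this section can be illustrated numerically. The sketch below assumes that the eight plates are centered at 22.5°, 67.5°, ..., 337.5° around the axis and that each plate carries the sum of one full deflection voltage and the fraction a of the other; the plate ordering, the function names, and this reading of the voltage assignment are assumptions made for illustration and are not taken from Kelly (1977). Only the divider ratio a = R/(R + r) comes from the text.

```python
import math


def bridge_ratio(R, r):
    """Divider ratio a = R / (R + r) quoted in the text."""
    return R / (R + r)


def octopole_plate_voltages(vx, vy, a):
    """Voltages on eight plates centered at 22.5, 67.5, ..., 337.5 degrees.

    Assumed layout: plates nearest the x axis carry +/-vx combined with
    +/-a*vy, and plates nearest the y axis carry +/-vy combined with +/-a*vx.
    """
    return [vx + a * vy, a * vx + vy, -a * vx + vy, -vx + a * vy,
            -vx - a * vy, -a * vx - vy, a * vx - vy, vx - a * vy]


# With a = tan(22.5 deg) the plate voltages are proportional to the potential
# of a uniform deflection field sampled at the plate centers.
a_uniform = math.tan(math.pi / 8)
r_over_R = 1.0 / a_uniform - 1.0          # resistor ratio that yields this a
vx, vy = 10.0, 3.0                        # illustrative deflection voltages
plates = octopole_plate_voltages(vx, vy, a_uniform)

for k, v in enumerate(plates):
    theta = math.radians(22.5 + 45.0 * k)
    ideal = (vx * math.cos(theta) + vy * math.sin(theta)) / math.cos(math.pi / 8)
    print(f"plate {k}: bridge {v:+.4f} V, uniform-field sample {ideal:+.4f} V")
```

Other values of r/R move the plate voltages away from this uniform-field sampling, which is the adjustable degree of freedom that allows one aberration to be traded against another.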
FIG. 19. Deflection field computations as a function of radial position for a selection of values of the parameter a. Axis label: transverse distance normalized to deflection radius. [After Kelly (1977).]
2. Fiducial Mark Detection
Penberth and Wallman (1979) have described a compact detector system with wide applicability because of simplicity and ease of location. The detector uses a channel plate electron multiplier 25 mm in diameter and with an 11-mm central hole. The total structure of the detector collector system is 6 mm thick. A single collector plate is used in the rear of the channel multiplier. The noise performance is superior to that obtained with silicon surface barrier detectors. The system is less complex than the traditional
scintillator-photomultiplier combination and does not require shielding from stray light. The collection angle is from 130 to 150° and the corresponding solid angle is 1.3 srad. The bias applied is on the order of 1 kV; the line scan time is on the order of 6 msec. Applications using realistic fiducial marks (1-μm-deep etch pit in silicon under 0.3 μm of resist) use currents in the nanoampere range. Overlay accuracy of 0.16 μm (80% of the observations in the range) has been achieved in a manual mode. The authors indicate that further improvements can be incorporated by segmentation of the collector plate and by the use of optimizing algorithms in an automated system.
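The relation between the quoted collection angles and the quoted solid angle can be checked with the standard expression for an annular cone, Ω = 2π|cos θ₁ − cos θ₂|. The sketch below assumes that the 130° and 150° figures are polar angles measured from the beam direction and that the detector collects over the full azimuth; both points are assumptions, as is the function name.

```python
import math


def annular_solid_angle_sr(theta_inner_deg, theta_outer_deg):
    """Solid angle between two polar angles for a full-azimuth annular detector."""
    t1 = math.radians(theta_inner_deg)
    t2 = math.radians(theta_outer_deg)
    return 2.0 * math.pi * abs(math.cos(t1) - math.cos(t2))


omega = annular_solid_angle_sr(130.0, 150.0)
print(f"solid angle for 130-150 degrees: {omega:.2f} srad")
```

Under these assumptions the result is about 1.4 srad, within roughly 10% of the 1.3 srad quoted for the detector.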
V. THE RELATIVE ROLES OF X-RAY AND ELECTRON BEAM LITHOGRAPHY SYSTEMS WITH HIGH THROUGHPUT
One subject of extreme current interest is the relative roles that electron beam and X-ray lithography will play in commercial applications. Strongly entrenched and partisan views have been expressed with fervor and with flair but with considerable subjectivity and on an incomplete basis in which significant factors have been omitted or overstressed. The basis for making this judgment has changed in the last year as the result of extensions in optical lithography. Largely as a result of the introduction of direct writing techniques, designers of both commercial electron beam and X-ray systems aimed at high throughput have now to design for 0.5 μm and below, with immediate needs centered on the range 0.5-0.25 μm. This stipulation arises because the industry will not adopt either X-ray or electron beam lithography for applications that can be fulfilled by optical techniques. This limitation of the role of X-ray and electron beam to resolutions of 0.5 μm and better for general industrial application imposes different design problems on the two techniques. Here it is useful to refer to Table I and to stress that the competition is between series and parallel methods of exposure, not specifically between X-ray and electron beam. In common with other methods derived from purely parallel exposure techniques, X-ray lithography has to introduce a step-and-repeat technique to meet the alignment problem. However, first it has to make an advance in source technology enabling a resist some 10-50 times more sensitive than PMMA to be used at the 0.5 μm and better resolution level. There is a possibility that additional complexities and reduction in throughput may result from the window problem, i.e., extrapolation from the 2-μm regime to the <0.5-μm regime is not straightforward. Once these problems are resolved, the outstanding problem from the viewpoint of X-ray lithography is still alignment.
An exposure area of 2.5 × 2.5 cm represents the upper limit over which the required alignment accuracy can be obtained automatically and rapidly and with minimal use of “real estate.” The exposure area may well drop further to 10 × 10 mm, thereby imposing further stipulations on the source and/or resist problem if a competitive throughput is to be maintained. From the viewpoint of scanning electron beam lithography, the problem is to exploit the flexibility of optimized scanning techniques to fulfill the promise of good throughput implicit in recent work using shaped-beam lithography by working with more sensitive resists (both positive and negative with sensitivities ~5 × 10⁻⁷ C/cm²). This throughput has to be obtained at the 0.5-μm and better resolution level. A solution to the proximity effect has to be found by a combination of computer correction aided by corrective techniques in resist technology. In addition, X-ray lithography has to be compared with alternative methods based on the use of projection techniques or on step-and-repeat variants of such techniques. Here we refer to microprojector techniques either by transmission masking or by the projection of a cathode image. Each approach involves the same trade-off of loss of throughput because of an increasing degree of step and repeat to solve the alignment problem. The electron beam methods are aided by the easy deflectability of the beam. The masks used in the transmission projector are no more fragile than those required for X-ray lithography and do not suffer from effects associated with irradiation-induced changes or from thermal instabilities. The cathode image projector has the flexibility to incorporate a step-and-repeat technique at the 0.5-μm resolution level should the need arise. This analysis depends in part on the stated assumption that widespread industrial application of either electron beam and/or X-ray lithography will not occur in areas where optical techniques suffice.
In the foregoing the assumption was made that it is going to be resolution that determines the changeover point (at approximately the ½-μm level). An alternative viewpoint has been expressed by workers with considerable practical experience. Such workers argue that it may not be resolution that determines the level at which the crossover occurs but that it could well be the total alignment or "overlay" problem. Here we mean by overlay the ability, after all sources of errors have been factored in, to write a succession of patterns on top of each other so that required points are in good coincidence to a high degree of precision with error rates corresponding to the 3σ point (21 in 10⁴). On this basis the crossover point occurs somewhere between 1½ μm and 1 μm. This is a valid point and merits careful consideration. The difference in viewpoint is largely one of emphasis rather than of fundamental difference. We can make this difference apparent by making a few relevant comments:
(1) In the immediate future there is no doubt that a significant increase in packing density can be obtained by improving the overlay capability.
Such an improvement can be obtained without any further development in terms of device processing.
(2) While accepting that this improvement in packing density can be obtained, it is not too clear how it can best be obtained, bearing in mind both immediate and future needs. Part of the uncertainty arises from the very real success that optical lithography has had in relation to alignment and the very real possibility that exists for future improvement. In numerical terms it has proved possible with optical lithography to align to 0.75 μm at the 3σ level using wafer alignment. This figure includes all sources of error, including transfer between different lithography systems. It should be compared with 0.40 μm (3σ) obtained in a practical environment using electron beam lithography with chip-to-chip alignment (chip size: 7 × 7 mm). If the optical alignment technique is applied to each exposure area rather than to the total wafer, significant improvement will result, and the increased packing density referred to previously could be achievable by optical means.
(3) The point made above in no way precludes the fact that ultimately the electron beam system is capable of a higher overlay performance because of speed, versatility, and compatibility with computer techniques; but the point still stands that there is considerable, immediate extensibility in optical alignment technology, moving the range of applicability of optical lithography down below the 1½- to 1-μm level toward the ½-μm resolution limit.
(4) If the emphasis is changed from the immediate future and we ask where we should place the design aim for both electron beam and X-ray lithography in order to avoid early obsolescence, then we should consider the ¼- to ½-μm level advocated in this text. The argument that this capability is not needed has little or no validity, as previous experience has shown that when a capability exists the industry will use it.
(5) The role of X-ray lithography has to be examined against this background. Whether or not there is an immediate application at the 1½- to 2-μm level has to be judged against the available optical techniques and their immediate extensibility. The long-term application has to be assessed against the existence of compatible optical and electron beam lithography systems with interchangeable output achieving high throughput, versatility, and resolution/overlay properties tailored to meet a wide range of needs.
ACKNOWLEDGMENTS

The author would like to thank fellow workers for help and support: Gary Garrettson, who commented on the text; E. J. Ritz and W. Knauer, who kindly supplied preprints prior to publication; and Kevin Cogan, who made the drawings.
136
P . R . THORNTON
Aizaki. N. (1979). Proc. S>,mp. Electron, Ion, Photon Beam Technol., ISth, 1979. Aritome, H., Nishimura. T., Kotani, H., Matsui, S., Nakagawa. O., and Namba, S . (1978). J . Vac. Sci. Technol. 15, No. 3, 992-994. Bell. A. E., and Swanson, L. W. (1979). Phvs. Rev. B 19,3353. Bell, A. E., and Swanson, L. W. (1980). Phj,s. Reu. B (in press). Bril. T . W., and Snijder, J. T. (1979). Proc. S j w p . Electron, Ion, Photon Beum Technol.. ISth, 1979.
Broyde. B. (1969). J . Electrochem. Soc. 116, No. 9, 1241 1245. Cantagrel, M. (1975). IEEE Trans. Electron Det‘ices ed-22, No. 7. 483-.486. Chang, T . H. P. (1975). J . Vac. Sci. Technol. 12, 1271-1275. Chang, T. H. P., Wilson, A. D., Speth, A., and Kern, A. (1974). Elecrron Ion Beum Srr. Technol., Int. Conf., 61h, 1974 p. 580. Chung, M. S. C., and Tai. K L. (1978). Proc. Int. Con/: Electron Ion Beam Sci. Technol , Xrh. 1978 Vol. 78-5, pp. 242-255. Cosslett, V. E., and Thomas, R . N. (1964a). Br. J . Appl. Phys. IS, 235. Cosslett, V. E., and Thomas. R . N . (1964b). Br. J . Appl. Phvs. 15, 883. Cosslett. V. E.. and Thomas, R. N . (1964~).Br. J . Appl. Phj,s. 15, 1283. Crewe. A. V. (1978). Oprik 52, 337. Crewe. A. V., and Parker, N . W. (1976). Opt& 46, 183. Davis. D. E., Moore, R . D., Williams, M . C.. and Woodard, 0. C. (1977). IBM J . Re\ D e r . 21,498. Dick, C . E., Lucas. A. C., Motz, J. M., Placious, R. C., and Sparrow, J. H. (1973) J . Appl. Phi.s. 44,No. 2, 815-826. Doran, S.. Perkins, M.. and Stickel, W. (1975). Proc. Sj,mp. Electron. Ion, Photon Beam Twhnol., 13th. I975 p. 1174. Everhart. T. E., Gonzales, A. J., Hoff, P . H . . and MacDonald, N. C. (1966). Electron ,Microst. I , 201. Everhart, T . E., and Hoff, P. H. (1971). J . Appl. Phjs. 42, N o . 13. 5837-5846. Fay. B . . and Trotel, J . (1976). Appl. Phj,s. Lett. 29, N o . 6, 370-372. Fay. B.. Trotel, J., and Frichet, A. (1979). Proc. S i m p . Electron. Ion, Photon Beum Trchnol., l i t h , 1979. Ferrni. E. (1940). Phys. Rer . 57, 485. Finne. R. M., and Klein, D. L. (1967). J . Electrochem. Soc. 114, 965. Flanders. D . C.. and Smith, H. I . (1978). J . Vuc. Scr. Technol. IS, No. 3. 995-997. Frosen. J.. Lischke. B., and Anger, K (1979). Proc. Microcirc. Eng. Conf. Aachen I979 pp. 50-58. Giuffre. G. J.. Marquis. J. F , Pfeiffer, H. C.. and Stickel, W (1979). Proc. S~mip.E l w t r o n , Ion. Photon Beam Technol.. 15th. 1979. Gloersen, P. (1976). Solid State Technol. pp. 68-73. Green, M. (1963a). 
I n ”X-Ray Optics and X-Ray Microanalysis” ( H . H. Pattee. Jr.. V E. Coselett, and A. Engstroem, eds.). pp. 185- 192. Academic Press. New York. Green. M. (3963b). In “X-Ray Optics and X-Ray Microanalysis” (H. H. Pattee, Jr.. V. E. Cosslett, and A . Engstroem, eds.), pp. 361-377. Academic Press, New York Green, M., and Cosslett, V. E. (1968). Br. J . Appl. Phys. [2] I , 425-436. Greeneich, J. S. (1971). Doctoral Thesis, University of California. Berkeley. Greeneiih, J. S. (1975). IEEE Ti-uns. Electron Dec. ed-22, No. 7. 434-439. Greenwood. J . C. (1969). J . Elecrrochrrn. Soc. 116, 1325.
ELECTRON PHYSICS IN DEVICE MICROFABRICATION. I1
137
Grobman, W. D., and Studwell. T. W. (1979). Proc. Symp. Electron. Ion. Photon Beam Technol., ISth, 1979. Grobman, W. D., and Speth, A. J. (1978). Proc. Int. Con/:, Eleciron Ion Bean? Sci. Technol.. 8th. I978 Vol. 78-5, pp. 279-284. Groves, T., Hammond, D. L., and Kuo, H. (1979). Proc. Simp. Electron, Ion, Photon Beam Technol., ISih, 1979. Gruen, A. E. (1957). Z. Naturforsch., Teil A 12.89. Haller, I . , Feder, R., Hatzakis, M., and Spiller, E. (1979). J . Electrochem. Soc. 126, No. 1, 154- 161. Harte, K. J . (1973). J . Vac. Sci. Technol. 10, 1098-1 101. Hatzakis. M. (1971). Appl. Phys. Lett. 18, No. I . 7-10. Hatzakis. M. (1979). Proc. Microcirc. Eng. Conf., 1979. Hawryluk, R. J., Hawryluk, A. M., and Smith, H. I . (1974). J . Appl. Ph>,s.45, No. 6,2550-2566. Heidenreich, R. D., Thompson, L. F., Feit. E. D., and Melliar-Smith, C . M. (1973). J . Appl. Phys. 44,4039. Heidenreich, R. D., Ballantyne, J. P., and Thompson, L. F. (1975). J . Voc. Sci. Technol. 12, No. 6 , 1284-1288. Hundt. t.,and Tischer, P. (1977). Proc. Symp. Electron, Ion, Phoron Beam Technol., 14th. 1977 pp. 1009-1011. Idesawa. M., Goto, E., Soma, T., and Sasaki, T. (1978). Proc. Synip. Electron. Ion, Photon Beam Technol., ISth, 1979. Jones. F., and Hatzakis, M. (1978). Proc. Int. Con/: Electron Ion Beam Sci. Technol., 8th. 1978 Vol. 78-5, pp. 256- 264. Kato, T., Yahara, T., Nakata, H., Murata, K.. and Nagami, K. (1978). J . Vuc. Sci. Techno/. 15,934-937. Kelly, J . I, 1977). Adr. Electron. Electron Phys. 43, 43- 138. Kelly. J . , Moore, J . S . , and Thornton, P. R. (1974). Proc. IEEE Natl. Aerosp. Electron. Con/. 1974 p. 55. Kern. D. P. (1979). Proc. Sj,mp. Electron, Ion, Photon Beam Technol.. ISth, 1979. Keyes, R. W. (1979). IEEE Trans. Electron. Dec. ed-26, No. 4, 271-279. King, M. C., and Berry, D. H. (1972). Appl. Opt. 11, 2455. Knauer, W. (1979a). Proc. Srmp. Electron. Ion, Photon Beam Technol., 15th. 1979. Knauer. W. (1979b). Optik (to be published). Knoll, M . . and Kazan, B. (1952). 
“Storage Tubes.” Wiley, New York. Kyser, D. F., and Murata. K. (1974). Proc. Int. Conf: Electron Ion Beam Sc,. Technol.. 6th. 1974. Kyser, D. F., and Ting, C . H. (1979). Proc. S i m p . Electron, Ion, Photon Beam Technol., 15th. 1979. Kyser. D F., and Viswanathan, N . S . (1975). J . Vuc. Sci. Technol. 12, 1305-1308. Landau. L. (1944). J . Phys. (Moscow)8 , 201. Lin, L. H., and Beauchamp, H. L. (1973). J . Vac. Sc,i. Technol. 10, 987-990. Lindau. I., and Winick, H. (1978). J . Vac. Sci. Technol. 15, No. 3,977-983. Lischke, B. (1978). Optik 50, 315-328. Loeffler. K. H. (1969). Anyew. PhJ3s. 127, 145. Loeffler, K. H., and Hudgin, R . M . (1971). Electron Microsc., Proc. Int. Congr., 7th. 1970 pp. 67-68. Maldonado, J . R., and Bernacki, S. E. (1975). 1. Vsc. Sci. Technol. 12, No. 6 , 1321- 1323. Mauer. J . (1978). J . Vac. Sci.Technol. 15, 853. McCorkle, R., Angilello, J . , Coleman, G.. Feder. R., and La Placa, S . J . (1979). Science 205, 401-402.
I38
P. R. T H O R N T O N
McCoy, J. H., and Sullivan. P. A. (1978). Proc. Inr. C'onf. Electron Ion Beum Sci. Technol., 8th, I 9 7 8 Minkiewicz, V. J., and Chapman, B. N. (1979). Appl. Phy.5. Lett. 34, 192-193. Munro, E. (1973). In "Image Processing and Computer-Aided Design in Electron Optlcs" (P. W. Hawkes, ed.), pp. 284--323. Academic Press, New York. Murata, K., Matsukawa, T . , and Shimizu, R. (1971). Jpn. J . Appl. Phys. 10, No. 6 , 678-686. Nagel, D. J., Whitlock, R. R., Greig, J. R.. and Pechacek. R. E. (1978). S P I E 135, 46- 53. Nagel, D. J., McMahon, J. M., Whitlock, R. R., Greig, J. R., Pechacek, R. E., and Peckerar, M . C. (1979). J . Appl. Phys. Jpn. Nelson, D. A., and Ruoff. A. L. (1978). J . Appl. Phys. 45, 5365. Neureuther, A. R., Kyser, D. F., Murata, K., and Ting, C . H. (1978). Proc. Int. Con/. Electron Ion Beam Sci. Technol., 8th, 1978 Vol. 78-5. Neureuther. A. R., Kyser, D. F., and Ting, C. H. (1979a). IEEE Trans. Electron Device5 4 - 2 6 , 686-693. Neureuther, A. R., Liu. C. Y., and Ting, C. H. (1979b). Proc. Si mp. Electron, Ion, Photon Beum Technol., 1Sth. 1970. Nosker, R. W. (1969). J . Appl. Phjs. 40, No. 4. 1872-1882. Ohiwa. H., Goto, E., and Ono, A. (1971). Ekctron. Commun. fpn. 54-8,44. Ouano. A. C. (1978). P o l j ~ Eng. . Sci. 18, No. 4. 306-313. Ozedemir, F. S., Perkins. W. E., Yim, R., and Wolf. E. D. (1978). J Cuc. Sci. Technol. 10, 1008. Parikh. M. (1979a). J . Appl. Phrs. 50,4371-4377. Parikh. M. (1979b). J . Appl. Phj.s. 50, 4378-4382. Parikh. M. ( 1 9 7 9 ~ )J. . Appl. Phys. 50,4383-4387. Parikh. M.. and Kyser. D. F. (1978a). Proc. Inr. C'onf. Electron Ion Beum Sci. Technol.. Xih. /Y78 Vol. 78-5, pp 371 381. Parikh. M . , and Kyser. D. F. (1978b). Proc. Int. C o n f . Electron Ion Beum Sci. Techno/.. Nrh. IY7X.
Parikh. M., and Kyser. D. F. (1979). J . Appl. Phys. 50, 1140- 1 I 1 1 Penberth. M. J., and Wallman. B. A. (1979). Proc. Symp. Elecrron. Ion. Photon Beum Technol.. I .iih. 1979. Pfeiffer. H. C. (1971). Proc. S\,mp. Electron. Ion. Luser Beurn Technol.. ISth, I971 p. 239. Pfeiffer, H. C. (1977). Proc. Sj,mp. Electron, Ion. Photon Beurn Technol., 14th, 1977. Pfeiffer. H. C. (1979). IEEE Trans Electron. Devices ed-26, No. 4. 663-674. Pfeiffer. H C., and Langner, G. 0. (1978). Proc. Int. C'nnf. Electron. Ion Beum Scr. Techno/.. Hill. IY7H.
Politycki. A . . and Meyer. D. (1978). Srmicns for.sc/i. € n t ~ ~ , ; [ , ~ / i 7. ~ ~Nio~. /I .~ 38. ~~,r. Ralph. H. 1.. and Sewell. H . (1978). Proc. In,. C'onl. Elcc,tron /on Bwn7 .%i. Ti,chno/.. 8rh. iY7X. R117. E F.. J r . (1979). Ark-. Electron. Eleciron Phrr. 49, 299 357 Roberts. E. D. (1976). Vucuum 26, N o . 10.459-467. Roberts. E D. (1978). Proc. Microcirc. Eny. C'onf.. l Y 7 H (to be published). Rossi. B.. and Greisen. K. (1941). Rec. Mod. Ph!,.s. 13, 240 308. Salinger. H . W . S . . and Beach. H. W . (1949). U.S. Patent 2.471.727. Seuell, H . (1978). J . Vuc. Sci. Technol. 15, No. 3. 927 930 Smith. H . I., and Bernacki, S . E. (1975). J . Ciic. Scl. Technol. 12. No. 6, 1321 1323 Soma. T. (1979). Optrk. 53 (4). 281 2x4. Spiller. E , and Feder, R. (1977). "X-Ray Optics." pp. 35 92. Springer-Verlag. Berlin and New York.
Spiller. E . Eastman. D. E.. Feder, R.. Grobman. W. D.. Gudat. W.. and Topalian. J . (1976). J . Appl. Phy.s. 41, N o 12. 5450-5458. Spivack. M. A. (1970). Appl. P h ~ xLett. 29. N o . 6. 370 372.
ELECTRON PHYSICS IN DEVICE MICROFABRICATION. I1
139
Stephani, D.. and Kratschmer, E. (1979). Proc. Microcirc. Eng. Conf., 1979. Stickel, H'. (1978). Proc. Symp. Electron, Ion, Photon Beam Technol., 14th, 1978 p. 901. Stickel, W., and Pfeiffer. H . C. (1978a). Proc. Int. Conf. Electron Ion Beam Sci. Technol.. 8th. 1978 p. 149. Stickel, W.,and Pfeiffer, H . C. (1978b). Proc. Int. Conf. Electron Ion Beum Scr. Technol.. 8th I978 p. 32. Swanson, L. W., and Bell, A . E. (1973). Adc. Electron. Electron PhJ.5. 32, 193. Takashi, S. (1979). Optik 53, No. 4, 281-284. Thompson, L. F. (1974). Solid State Technol. 17, 27. Thompson. L. F., and Doerries, E. M. (1977). J . Electrochem. Soc. 126, No. 10, 1699-1702. Thompson, L. F., Feit, E. D.. Melliar-Smith, C. M., and Heidenreich, R. D. (1973). J . Appl. Phys. 4 , 4 0 4 8 - - 4 0 5 1. Thompson. L. F., Yau, L., and Doerries, E. M. (1979). J . Electrochem. Soc. 126, No. 10, 17041708. Thornton, P. R. (1979). A d r . Electron. Electron P h w . 48, 271 380. Tuggle. D., Swanson, L. W.. and Orloff, J . (1979). Proc. SJ.mp. Elvctron. Ion, Photon Beam Technol.. 15th, 1979. Wang, C. C . T. (1971). IEEE Truns. Electron. Dc~icesec-18, No. 4. 258-274. Ward, R. (1979). Proc. Symp. Electron, Ion. Phoron Beam Technol., ISth. 1979. Wardey, G . A,, Munro, E., and Scott, R. W . (1977). lnt. Con/: Microlithoyr., pp. 217-220. Weber. E. V., and Moore, R. D. (1979). Proc Sj,mp. Electron. Ion. Photon Beam Technol.. 15th. 1979. Whipps. P. W. (!979). Proc. Microcirc. Eny. Con/.. 1979. Wittels. N . D., and Youngman. C. I . (1978). Proc. Int. C'onf. Electron lun Beum Sci. Technol., 8th. I978 Vol. 78-5, pp. 361-370. Wittry. D. B., Messenger. R . S., Rao-Sahib, T. S . . Jones, A. B., and Reekstin, J . R. (1979). Proc. Sj,mp. Electron. Ion. Photon Beum Technol., I5th, 1979. Wolfe, J . E. (1975). J . Vac. Sci.Technol. 12, 1169. Wolfe. J . E. (1979). Proc. Si,mp. Electron, Ion. Phoron Beam Technol.. 15th. 1979. Woodard. 0. C . , Ho, C . T.. Michial, M. S., Muir, A. W., and Williams, M. C. (1979). Proc. Symp. 
Electron, Ion, Photon Beum Technol., I S t h , 1979, Yoshimatsu, M., and Kozaki, S. (1977). "X-Ray Optics," pp 10-33. Springer-Verlag. Berlin and New York. Youngman, C. I . , and Wittels. N. D. (1978). SPIE 135, Yourke. H. S., and Weber. E. V. (1976). Technol. Dig., l E D M p. 431. Zeiller, H . U . , and Hieke, E. K. (1979). J . EIectrorhmi. Soc. 126, No. 8, 1430-1432. Zimmermann. B. (1968). Ph.D. Thesis, University of Karlsruhe. Zimmermann. B. (1969). Rec. S v n p Electron, Ion. Laser Bemi Techno/., 10th. 1969. Zirnmermann. B. (1970). Adtr Elecrron. Elecrron PhL... 29, 257.
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 54
Solar Physics

LAWRENCE E. CRAM
Sacramento Peak Observatory*
Sunspot, New Mexico
I. Introduction
II. The Solar Interior
    A. The Solar Core: Missing Neutrinos
    B. The Solar Envelope
III. The Quiet Solar Atmosphere
    A. The Photosphere
    B. The Chromosphere
    C. The Transition Region and Corona
    D. The Solar Wind
IV. Solar Activity
    A. Origins of Solar Activity
    B. Slowly Varying Solar Activity
    C. Explosive Solar Activity
References

I. INTRODUCTION
Solar physics is a vast and intricate subject, devoted to the study of a unique star: the Sun. The uniqueness of the Sun and the importance of solar physics rest on two facts: the Sun is the nearest star and hence the first stepping-stone in the study of the universe, and the Sun is the primary energy source of the Earth and thus the ultimate support for life. A discussion of these two facets of solar physics provides a perspective on current research activities in the subject.

Parker (1978) has argued that "solar physics is the mother of astrophysics." While the more exotic directions of modern astrophysics lead toward the study of pulsars, quasars, black holes, and the primeval universe, the hard scientific foundations of the subject almost always rest on basic concepts and abstractions that evolved from studies of the Sun. These include the essential concepts underlying the theory of the structure and evolution of stars, the theory of stellar atmospheres, and the increasingly important subject of cosmical electrodynamics and plasma physics. The Sun, by virtue of its proximity to Earth, is an astrophysical laboratory wherein processes that occur throughout the universe can be studied in detail. As this review emphasizes, we are far from having a satisfactory understanding of most of these processes, a situation that maintains solar physics as a central part of astrophysics.

* Sacramento Peak Observatory is operated by the Association of Universities for Research in Astronomy, Inc., under contract AST-78-17292 with the National Science Foundation.
Copyright © 1980 by Academic Press, Inc. All rights of reproduction in any form reserved. ISBN 0-12-014654-1

The leadership of solar physics within the general field of astrophysics can be illustrated by comparing research frontiers in general stellar astrophysics with those in solar physics. For example, in the investigation of stellar structure and evolution, a major thrust of stellar studies is toward the "accurate" computation of evolutionary tracks using the assumptions underlying a standard solar model (see Section II,A). On the other hand, solar astrophysicists struggle to explain the mysterious absence of solar neutrinos, and the paleoclimatological evidence suggesting that life could not have evolved had the Sun behaved in the past as implied by the standard model. The solar envelope, currently being probed by a variety of new techniques, also seems to be quite different from the standard model. For example, the convection zone appears to be considerably deeper than predicted by standard mixing-length theory. However, mixing-length theory is a cornerstone of contemporary studies of stellar structure and evolution, and important astrophysical problems, such as presupernova evolution and determination of stellar ages, depend critically on its applicability. The problem of the relation between convection, rotation, and magnetic fields is a study in its infancy, even in solar physics; however, this relation is of vital interest in general stellar astrophysics, both for interior mixing and for its consequences on atmospheric structure.

The interpretation of stellar spectra and the study of stellar atmospheres are branches of astrophysics wherein the leadership of solar physics is even more pronounced.
The view of many stellar astronomers has been that these are mature fields, hardly at the frontiers of astrophysics (e.g., Mihalas, 1974). This view has been jolted by space observations of stars, which show that, contrary to the conventional picture of a stellar atmosphere consisting of only a quiescent photosphere, stellar atmospheres universally exhibit structures analogous to the solar chromosphere, corona, and wind. Just as stellar astrophysicists scramble to acquaint themselves with recent advances in solar atmospheric physics, solar physicists are witnessing a revolution in the conceptual basis of the theory of the solar atmosphere, promoted in large part by space observations of the Sun. And as this revolution occurs, solar physicists introduce fundamental concepts of plasma physics and the theory of collective phenomena into astrophysics, thereby laying the foundations for studies of less-accessible (but possibly "more interesting") cosmic objects. Solar physics has been, and will continue to be, the branch of astrophysics where the majority of fundamental concepts are first conceived and tested.
Another major facet of solar physics is the study of the role of the Sun within the solar system, particularly in the context of solar-terrestrial relations. Electromagnetic radiation, particles, and magnetic fields are emitted from the Sun, and travel through the interplanetary medium to interact with the Earth. The most spectacular manifestations of this interaction take place in the magnetosphere and ionosphere, as evidenced by aurorae and geomagnetic disturbances. The effects of these interactions on manned space flight, galactic cosmic-ray fluxes, and shortwave radio communications are practical reasons that encourage research in these aspects of solar physics. However, the most important aspects of solar-terrestrial relations may lie in subtle influences on the chemistry of the stratosphere and the energy balance of the troposphere. The possible influences of solar emissions on the weather, climate, and atmospheric chemistry are hotly debated topics. No one would disagree that an accurate measurement of the mean spectral irradiance of the Sun is required for any quantitative climatic or chemical studies of the Earth's atmosphere and that existing measurements are inadequate (White, 1977), but the more intriguing question is whether there is a relationship between fluctuations in the solar output and fluctuations in the Earth's weather and climate. Pittock (1978) has critically reviewed the subject of short-term (less than, or of the order of, the duration of the solar cycle) Sun-weather relations, and concluded that apart from an apparent weak connection between solar wind sector structure and an atmospheric vorticity index, there is little evidence for a significant influence of solar variability on the tropopause. Eddy (1978) agrees with this assessment, but raises the possibility of solar-induced climate changes on longer time scales, of the order of centuries or millennia.
The historical and prehistorical evidence for such variability is subtle, hard to interpret, and, as Eddy points out, difficult to confirm by precise contemporary measurements over a short time base. Much careful theoretical and observational work will have to be done on both solar variability and the physics of the Earth's atmosphere before we have an adequate understanding of the range of solar-terrestrial relationships.

It will be clear from this brief account of solar physics in the broader context of astrophysics that this chapter cannot cover all aspects of solar physics. In keeping with the guidelines of this series, I have thus discussed those general areas where the most interesting advances have been made in recent years, and where the most interesting and challenging problems have been identified. There are a number of readable general monographs on the subject of solar physics, including older ones by Kuiper (1953), Menzel (1959), and Zirin (1966), and a more recent volume by Gibson (1973). The specialist journal Solar Physics is devoted to papers in the general field of solar physics, while other general astrophysical journals also regularly carry
papers in the field. The reader who wishes to dig deeper into any of the matters raised here should find an appropriate entry point in the list of references. A recently published glossary of solar physics (Bruzek and Durrant, 1977) should smooth any problems encountered with the jargon of the subject.
II. THE SOLAR INTERIOR
A. The Solar Core: Missing Neutrinos

The principles underlying the theory of the internal structure of the Sun and most other stars are easily enumerated (Schwarzschild, 1958; Chiu, 1968). At each point in the star, it is assumed that there is (1) hydrostatic equilibrium between gravitational force and pressure gradient; (2) energy transport by convection or radiation, decided by a test for convective stability; (3) energy generation by thermonuclear reactions; and (4) a known set of constitutive relations defining the energy generation rate, the opacity, and the equation of state. Because nuclear reactions change element abundances, some assumption about the initial composition and mixing history of the interior must also be made. Generally the initial mass and homogeneous chemical composition are specified, and the evolution of the internal composition is predicted by assuming that complete mixing occurs in convectively unstable zones and that no mixing occurs in the radiative regions. With these assumptions one may construct a "standard solar model" such as that given in Table I. There is no major disparity between standard solar models computed by different people using different methods, or different nuclear cross sections or opacities (Bahcall and Sears, 1972). Until quite recently it was thought that the fundamental astrophysical problem of the structure and evolution of the stars was essentially explained in terms of this model (with certain modifications during stages of rapid stellar evolution). A particular success of the model is its ability to account for the structure of the Hertzsprung-Russell diagram (Schwarzschild, 1958). However, an attempt to directly observe the neutrinos emitted by certain nuclear reactions thought to occur in the Sun has shown that the neutrino flux is significantly smaller than predicted by standard models.
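The explanation of this discrepancy has not yet been found, and its implications for the theory of stellar structure and evolution may be calamitous. In a standard solar model the main energy source is the p-p (proton-proton) reaction, whose dominant branch is the sequence

$$ {}^{1}\mathrm{H} + {}^{1}\mathrm{H} \rightarrow {}^{2}\mathrm{H} + e^{+} + \nu \tag{a} $$
$$ {}^{2}\mathrm{H} + {}^{1}\mathrm{H} \rightarrow {}^{3}\mathrm{He} + \gamma \tag{b} $$
$$ {}^{3}\mathrm{He} + {}^{3}\mathrm{He} \rightarrow {}^{4}\mathrm{He} + 2\,{}^{1}\mathrm{H} \tag{c} $$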
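As a rough illustration of the energetics of reactions (a)-(c), the sketch below estimates the energy released per helium nucleus and the total neutrino flux at Earth that this implies. It is not part of the original text: the atomic masses, solar luminosity, and astronomical unit are standard reference values assumed here, the function names are hypothetical, and the small fraction of the energy carried off by the neutrinos themselves is ignored.

```python
import math

# Assumed reference values (not taken from the text).
M_H1_U = 1.00782503      # atomic mass of 1H, in atomic mass units
M_HE4_U = 4.00260325     # atomic mass of 4He, in atomic mass units
U_TO_MEV = 931.494       # energy equivalent of one atomic mass unit, MeV
MEV_TO_J = 1.602177e-13  # joules per MeV
L_SUN_W = 3.828e26       # nominal solar luminosity, watts
AU_CM = 1.495979e13      # astronomical unit, cm


def energy_per_helium_mev():
    """Energy released when four 1H atoms are converted to one 4He atom."""
    return (4.0 * M_H1_U - M_HE4_U) * U_TO_MEV


def neutrino_flux_at_earth():
    """Approximate total neutrino flux at Earth, in cm^-2 s^-1.

    Reaction (a) must occur twice for each 4He nucleus formed through
    (b) and (c), so two neutrinos are emitted per helium nucleus.
    """
    helium_per_second = L_SUN_W / (energy_per_helium_mev() * MEV_TO_J)
    neutrinos_per_second = 2.0 * helium_per_second
    return neutrinos_per_second / (4.0 * math.pi * AU_CM ** 2)


if __name__ == "__main__":
    print(f"Energy per helium nucleus: {energy_per_helium_mev():.2f} MeV")
    print(f"Neutrino flux at Earth: {neutrino_flux_at_earth():.2e} cm^-2 s^-1")
```

Under these assumptions the energy release is a little under 27 MeV per helium nucleus and the flux at Earth is of order 6 × 10¹⁰ neutrinos cm⁻² s⁻¹. This counts neutrinos of all energies and says nothing about how many of them a given detector can register.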
TABLE I
SOLAR INTERIOR MODELS

A. Model of the Solar Core (Sears, 1964)

M/M⊙(a)   R/R⊙(b)   L/L⊙(c)   log T (K)   log ρ (g cm⁻³)   log P (dynes cm⁻²)   X(d)
0.0       0.00      0.00      7.20        2.20             17.40                0.36
0.2       0.14      0.79      7.05        1.77             16.94                0.65
0.4       0.22      0.97      6.95        1.49             16.58                0.69
0.6       0.29      1.00      6.85        1.18             16.18                0.71
0.8       0.38      1.00      6.71        0.70             15.54                0.71
0.9       0.46      1.00      6.59        0.26             14.99                0.71

B. Model of the Solar Envelope (Spruit, 1974)

M/M⊙(a)   R/R⊙(b)   log T (K)   log ρ (g cm⁻³)   log P (dynes cm⁻²)
0.978     0.75      6.27        -0.70            13.63
0.988     0.80      6.14        -0.95            13.31
0.994     0.85      5.98        -1.19            12.91
0.998     0.90      5.78        -1.49            12.41
1.000     0.95      5.41        -2.05            11.51
1.000     1.00      3.81        -6.51             5.10

(a) Fraction of mass interior to tabulated level.
(b) Fraction of radius interior to tabulated level.
(c) Fraction of total luminosity at tabulated level.
(d) Mass fraction of hydrogen.
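The equation of state listed among the constitutive relations can be checked against the core model of Table I in a simple, hedged way. The sketch below assumes a fully ionized ideal gas with negligible radiation pressure and an assumed heavy-element mass fraction Z = 0.02 (the table does not give Z). It compares the mean molecular weight implied by the tabulated log ρ, log T, and log P with the value implied by the tabulated hydrogen fraction X. The column assignment is as read from the table here, and all function names are illustrative.

```python
# Core rows as read from Table I, part A: (log rho, log T, log P, X), cgs units.
CORE_ROWS = [
    (2.20, 7.20, 17.40, 0.36),
    (1.77, 7.05, 16.94, 0.65),
    (1.49, 6.95, 16.58, 0.69),
    (1.18, 6.85, 16.18, 0.71),
    (0.70, 6.71, 15.54, 0.71),
    (0.26, 6.59, 14.99, 0.71),
]

GAS_CONSTANT = 8.314462e7  # erg mol^-1 K^-1
ASSUMED_Z = 0.02           # heavy-element mass fraction: an assumption


def mu_from_state(log_rho, log_t, log_p):
    """Mean molecular weight from the ideal gas law P = rho * R * T / mu."""
    return GAS_CONSTANT * 10.0 ** (log_rho + log_t - log_p)


def mu_from_composition(x, z=ASSUMED_Z):
    """Mean molecular weight of a fully ionized gas with helium Y = 1 - X - Z."""
    y = 1.0 - x - z
    return 1.0 / (2.0 * x + 0.75 * y + 0.5 * z)


def compare_rows(rows=CORE_ROWS):
    """Return (mu from state variables, mu from composition) for each row."""
    return [
        (mu_from_state(log_rho, log_t, log_p), mu_from_composition(x))
        for log_rho, log_t, log_p, x in rows
    ]


if __name__ == "__main__":
    for mu_state, mu_comp in compare_rows():
        print(f"mu(state) = {mu_state:.3f}   mu(composition) = {mu_comp:.3f}")
```

The two estimates agree to within a few percent for every row, which is about what the two-decimal rounding of the tabulated logarithms allows. The agreement is only a consistency check under the stated assumptions, not a reconstruction of how the model was computed.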
Neutrinos produced in (a) have energies less than 0.42 MeV. An alternative reaction path involves instead of (c) the sequence
$$ {}^{3}\mathrm{He} + {}^{4}\mathrm{He} \rightarrow {}^{7}\mathrm{Be} + \gamma \tag{d} $$
$$ {}^{7}\mathrm{Be} + {}^{1}\mathrm{H} \rightarrow {}^{8}\mathrm{B} + \gamma \tag{e} $$
$$ {}^{8}\mathrm{B} \rightarrow {}^{8}\mathrm{Be}^{*} + e^{+} + \nu \tag{f} $$
$$ {}^{8}\mathrm{Be}^{*} \rightarrow 2\,{}^{4}\mathrm{He} \tag{g} $$

The β-decay reaction (f) produces neutrinos with energies as high as 14.06 MeV. Other chains of the p-p reaction also produce neutrinos, but with much lower maximum energies.

The experiment used in attempts to detect solar neutrinos has been described by Davis and Evans (1978) and by Bahcall and Davis (1976). The basic reaction produced by solar neutrinos in Davis' experiment is

$$ \nu_{\mathrm{solar}} + {}^{37}\mathrm{Cl} \rightarrow e^{-} + {}^{37}\mathrm{A} \tag{h} $$

a reaction with a neutrino threshold energy of 0.814 MeV. ³⁷Cl is a stable isotope used in the form of C₂Cl₄, a dry-cleaning fluid. The neutrino detector is a 3.9 × 10⁵ liter vessel of this liquid, located in a rock cavity 1600 m underground to avoid as much as possible the effects of cosmic rays. Any inert ³⁷A (argon-37) produced in the tank is removed by periodic purges with helium, and then extracted by adsorption. The number of atoms of ³⁷A recovered from the tank is determined by detecting the 2.8 keV electrons produced in the Auger decay of ³⁷A.

The success of Davis' experiment depends on heroic efforts. In a typical 30-day run only a few dozen ³⁷A atoms are detected, and many of these are definitely not due to solar neutrinos. Recovery efficiency, extraneous sources, and many other complications of the experiment have been meticulously studied by Davis and his co-workers (Davis and Evans, 1978). At the present time, there has been no definite detection of solar neutrinos, and the experiment has placed an upper limit (1σ) on the solar neutrino flux of 1.7 SNU (1 solar neutrino unit = 10⁻³⁶ captures/sec/³⁷Cl atom). Standard solar models predict a neutrino flux and spectrum that would produce about 6 SNU in Davis' experiment. Values as small as 4 SNU can be countenanced within the framework of standard solar models, because of uncertainties in cross sections, chemical composition, and other parameters (Bahcall and Davis, 1976). The observed neutrino flux is clearly much smaller than theoretical predictions based on the standard solar model.

A number of proposals, none of them very satisfactory, have been made to try to explain this discrepancy. They range from the bizarre (a central solar black hole; variation of the gravitational constant), through the extreme (very strong magnetic fields in the solar core; unforeseen neutrino instability), to the plausible (erroneous cross sections; periodic mixing of the solar core). Rood (1977) has summarized 19 different proposals for the resolution of this problem, and concluded that no really satisfactory explanation exists. If the only problem were this apparent discrepancy based on an admittedly difficult experiment, then one might not be very disturbed.
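To give a feeling for the capture rates behind these SNU values, the sketch below estimates the number of ³⁷Cl atoms in the tank and the corresponding captures per day. Only the tank volume and the SNU values come from the text. The density and molar mass of C₂Cl₄ and the natural abundance of ³⁷Cl are standard values assumed here, the function names are hypothetical, and recovery efficiency, backgrounds, and the decay of ³⁷A during a run are all ignored.

```python
AVOGADRO = 6.02214e23            # atoms per mole
SNU = 1.0e-36                    # captures per second per 37Cl atom
SECONDS_PER_DAY = 86400.0

TANK_VOLUME_LITERS = 3.9e5       # from the text
C2CL4_DENSITY_G_CM3 = 1.62       # assumed liquid density
C2CL4_MOLAR_MASS_G = 165.83      # assumed molar mass
CL_ATOMS_PER_MOLECULE = 4
CL37_ABUNDANCE = 0.2423          # assumed natural isotopic fraction of 37Cl


def cl37_atoms():
    """Approximate number of 37Cl target atoms in the tank."""
    mass_g = TANK_VOLUME_LITERS * 1000.0 * C2CL4_DENSITY_G_CM3
    molecules = mass_g / C2CL4_MOLAR_MASS_G * AVOGADRO
    return molecules * CL_ATOMS_PER_MOLECULE * CL37_ABUNDANCE


def captures_per_day(rate_in_snu):
    """Expected neutrino captures per day in the whole tank."""
    return rate_in_snu * SNU * cl37_atoms() * SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"37Cl atoms in tank: {cl37_atoms():.2e}")
    for rate in (1.7, 4.0, 6.0):
        print(f"{rate:.1f} SNU -> {captures_per_day(rate):.2f} captures per day")
```

Under these assumptions the tank holds roughly 2 × 10³⁰ atoms of ³⁷Cl, so even the predicted 6 SNU corresponds to only about one capture per day in the whole vessel. That scale is what makes the extraction and counting so demanding.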
But there are other strong indications that the Sun does not behave as predicted by the standard model. In particular, the evidence of paleoclimatological studies combined with modeling of the Earth's climatic response to solar luminosity variations suggests that the predicted 15% increase of solar luminosity during the main sequence evolution of a standard solar model cannot have occurred. If it had, the Earth would have frozen, and life would not have developed. Moreover, the Earth undergoes major ice ages every 2.5 × 10⁸ years or so, and these may be related to solar luminosity variations (Dilke and Gough, 1973). On the shorter time scale, there is some evidence for solar-related modulation of the Earth's climate and upper atmosphere (Eddy, 1978). There is even evidence that the Sun may be currently shrinking at the amazing rate of 0.1% per century (Eddy and Boornazian, 1979). None of these problems is by itself of great moment: models of the Earth's climate are poorly developed,
and the interpretation of historical and prehistorical data is fraught with difficulties. However, the combination of these problems with the neutrino deficit should make us concerned that all is not well with the standard solar model. In a thought-provoking, iconoclastic review of the implications of the solar neutrino experiment and these related problems, Roxburgh (1976) suggested that the solar interior may be quite different from the standard model, and that the differences would necessitate significant revisions in our views regarding the structure of the universe. Most stellar astronomers would prefer to have these problems resolved without calamitous changes to the foundations of their science, and they may indeed be solved with relatively minor consequences. A most important contribution to the investigation of the solar core would be the implementation of other neutrino experiments (Davis and Evans, 1978), which could detect the low-energy neutrinos produced by the main-line p-p reaction. These experiments are costly and difficult, but apparently feasible. In any case, as Roxburgh wrote, “the dream world of the stellar theoretician has now ended.”

B. The Solar Envelope
1. Convection
By considering the forces acting on a small displaced volume element in a gravitationally and thermally stratified gas, Schwarzschild (1905, see Meadows, 1970) derived the classical condition for convective instability, which may be expressed as the inequality

$$\nabla_R > \nabla_A \qquad (1)$$

Here $\nabla_A$ is the adiabatic temperature gradient, an intensive property of the gas determined by a generalized equation of state (Chiu, 1968, Section 3.2). The radiative temperature gradient $\nabla_R$ is the gradient that would occur if radiation were to carry all of the energy flux; $\nabla_R$ clearly depends on the opacity and the local flux of radiation. A test for convective stability applied to a standard solar model constructed under the joint assumptions of radiative and hydrostatic equilibrium shows immediately that a significant portion of the model is convectively unstable. We infer that the interior of the Sun, and other stars like the Sun, contains a zone in which convective flows occur. Moreover, we may estimate the Rayleigh number for this flow, using

$$R = \frac{g \alpha \, \Delta T \, d^{3}}{\kappa \nu} \qquad (2)$$
where $g$ is the gravitational acceleration, $\alpha$ the expansion coefficient, $\Delta T$ the temperature difference across the zone depth $d$, $\kappa$ the radiative diffusivity, and $\nu$ the kinematic viscosity. Reasonable solar values imply $R > 10^{12}$ (Spiegel, 1967), and we infer that the convection must be highly turbulent. This turbulent convection appears to be the primary origin of many of the dynamical phenomena observed in the atmosphere of the Sun. Convection will alter the structure of an assumed radiative equilibrium model of the solar envelope, because heat transport by convection is quite a different process from radiative energy transport. A central unsolved problem of modern astrophysics is to provide an adequate theoretical description of both convection itself and the consequences of convection, such as the interaction with pulsation and rotation, and the production of structured magnetic fields. IAU Colloquium 36 (Spiegel and Zahn, 1977) provides an up-to-date and comprehensive review of theories of solar and stellar convection. The most commonly used descriptions of convective energy transport in stellar envelope theory are phenomenological models based on mixing-length theory (Unno, 1967; Gough, 1977). These suppose that convection transfers heat in “eddies” that form as a result of convective instability, travel a certain distance $l$ (called the mixing length), and then dissolve into the mean convective field. Consideration of the dynamics of such an eddy shows that the velocity $v$ of the moving elements is approximately
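$$v \approx l \, (g \alpha \, \Delta\nabla T)^{1/2} \qquad (3)$$

where $\Delta\nabla T = \nabla - \nabla_A$ is the difference between the true and the adiabatic temperature gradients. The convective flux is approximately

$$F_c = C_p \, \rho \, l \, v \, \Delta\nabla T \qquad (4)$$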
where $C_p$ is the specific heat and $\rho$ the density. Equations (3) and (4) give an expression connecting $F_c$ and the actual temperature gradient $\nabla$, with $l$ as a free parameter. Usually, the ratio $l/H$ ($H = RT/g$ is the pressure scale height) is taken to be a constant, of the order of 1 or 2. There is no rigorous justification of this procedure, and in fact some numerical simulations (Graham, 1977) suggest that the eddy sizes may be much larger. A number of refinements to the mixing-length theory have been proposed, and alternative nonlocal convection theories are also available. However, all usable models are phenomenological, for the simple reason that turbulent convection is a complex phenomenon that has not yet been described by a satisfactory mathematical model (Liepmann, 1979). The heat content of eddies in the solar convection zone is large, and a very small temperature gradient excess $\Delta\nabla T$ suffices to transport all of the solar luminous flux. Convection is thus extremely efficient, and within the
convectively unstable regions of the solar envelope the actual temperature gradient is very close to the adiabatic gradient. It might then be concluded that the details of convection theory are unimportant, since a satisfactory model could be constructed by assuming an adiabatic temperature gradient in the convectively unstable layers. However, this is not the case. Although the convective envelopes of solar models are indeed adiabatically stratified, the particular adiabat is determined by the structure of the thin surface layer where convection gives way to the radiating photosphere. The assumptions of mixing-length theory are least likely to be valid in this layer: for example, lateral radiative energy exchange between eddies will be important. The problem can be circumvented by “calibrating” the mixing-length theory, and then using the value of $l/H$ that gives the same adiabat. Some calibrations may be based on theoretical studies that follow the evolution of a 1 $M_\odot$ [one solar mass ($M_\odot$) = $1.99 \times 10^{33}$ g] object from the zero age main sequence to the present solar luminosity and radius (e.g., Gough and Weiss, 1976). Values of $l/H$ derived in this way can be used to construct solar models and models of other stars. A solar envelope model constructed by this conventional approach is shown in Table I. Calibrations of mixing-length theory are dependent upon assumptions regarding the solar core and the solar photosphere. We have seen that the physics of the solar core may be poorly understood, so that the calibrations are correspondingly uncertain. There is some evidence that the solar convection zone is deeper than predicted by standard models. For example, by considering the observed eigenfrequencies of solar oscillations (discussed later), Rhodes et al. (1977) concluded that the base of the convection zone lies between $0.75R_\odot$ and $0.62R_\odot$ [one solar radius ($R_\odot$) = $6.96 \times 10^{10}$ cm], while standard mixing-length theory gives the limits $0.86R_\odot$ to $0.71R_\odot$ (Gough and Weiss, 1976).
By totally different arguments related to the absence of the so-called solar polar vortex (a region of rapid prograde rotation at the solar poles), Gilman (1979) suggested that the base could be as deep as $0.60R_\odot$, although his model is so idealized that its relevance in this context must be questioned. The observed presence of a small amount of lithium in the photosphere, combined with an apparently normal abundance of beryllium, also provides information on the depth of the solar convection zone (Vauclair et al., 1978). Lithium is destroyed by thermonuclear reactions at $\sim 3 \times 10^{6}$ K, so that the convection zone should not mix envelope material to temperatures in excess of this value. In standard models this temperature is reached above a depth of $0.6R_\odot$ unless $l/H > 4$, an unreasonably large value. However, lithium is significantly underabundant in the Sun relative to young stars, and it may be that solar envelope material has at some time been circulated through temperatures $\gtrsim 3 \times 10^{6}$ K, but not as great as $4 \times 10^{6}$ K, since beryllium would then be destroyed. Standard models of solar evolution never do this. Some ingenious explanations of these properties of the solar envelope have been devised (e.g., Dicke, 1972), but the problem is complex and we are far from a satisfactory solution. The growing awareness of the importance of the radiative-convective interior interface as a promising site for dynamo action makes the problem even more complex and more important. Spiegel (1967, 1971, 1972) has reviewed the problems of convection theory and discussed various routes along which theoretical advances might be made. Some of these are being followed, as discussed in Spiegel and Zahn (1977). Direct observations of the atmospheric consequences of convection (granulation, supergranulation, giant cells) are also being made and can provide guidance to theoretical studies, but Gough (1977) has correctly asserted that “the prospects of an imminent supersession of mixing-length theory by a theory that is demonstrably more reliable for describing stellar convection zones is bleak.” A number of fundamental ideas concerning the structure and evolution of stars are based on inadequate theories of convection, and observational and theoretical solar studies will be very important in attempts to improve this circumstance.

2. Rotation and Large-Scale Circulation
The large-scale motions of the surface of the Sun may be observed either by spectroscopic studies of the Doppler shift or by following the proper motions of tracers such as sunspots. Such studies show that the rotation of the Sun varies with latitude and with depth, and that there is a complex, poorly understood, nonrotational circulation superimposed on this regular pattern. Studies of the rotation and large-scale circulation are important because these flows are presumably responsible for the origin of solar magnetic fields, and their evolution and large-scale structure. These subjects are discussed in Section IV. The mean rate of rotation of the solar photosphere determined by spectroscopic studies shows a marked dependence on latitude. A long-term average result given by Howard and Harvey (1970) is

$$\omega = 13.76 - 1.74 \sin^{2}\phi - 2.19 \sin^{4}\phi \quad \text{deg/day (sidereal)} \qquad (5)$$

where $\phi$ is the solar latitude.
The spectroscopic rotation rate appears to vary with time. The amplitude is about 5% on time scales of days or weeks (Howard and Harvey, 1970), while the variation in the yearly mean is about 1%. There is a tendency for the yearly mean rotation rate to change systematically with the sunspot cycle, in the sense that the rotation rate is higher at solar minimum (Howard, 1978). The sense of this correlation is the same as that found by Eddy et al. (1976) for apparent rapid rotation during the Maunder minimum.
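The latitude dependence in Eq. (5) is easy to evaluate numerically. The following is a minimal sketch for illustration only: the function names `spectroscopic_rate` and `rotation_period_days` are hypothetical, and the coefficients are simply those quoted above from Howard and Harvey (1970).

```python
import math

# Coefficients of Eq. (5), in degrees per day (sidereal), as quoted in the text.
A, B, C = 13.76, 1.74, 2.19

def spectroscopic_rate(latitude_deg):
    """Sidereal rotation rate (deg/day) at a given solar latitude, from Eq. (5)."""
    s2 = math.sin(math.radians(latitude_deg)) ** 2
    return A - B * s2 - C * s2 ** 2

def rotation_period_days(latitude_deg):
    """Sidereal rotation period (days) implied by the rate above."""
    return 360.0 / spectroscopic_rate(latitude_deg)

# Tabulate the profile from equator to pole.
for lat in (0, 30, 60, 90):
    print(f"{lat:2d} deg: {spectroscopic_rate(lat):6.2f} deg/day, "
          f"period {rotation_period_days(lat):5.1f} days")
```

With these coefficients the sidereal period is roughly 26 days at the equator and lengthens steadily toward the poles.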
Observations of the motions of tracers reveal a different pattern of rotation. Sunspots, for example, have a rotation rate of about 14.4 deg/day at the equator, about 5% faster than the spectroscopic rotation rate. A proposal that the difference between sunspot and spectroscopic rotation rates was not real but due to contamination by scattered light in the latter has been discounted by Foukal (1980). He showed that the spectroscopic rotation rate of the plasma inside sunspots is in fact 5% greater than that of the nonmagnetized external plasma. Just like sunspots, most of the other long-lived structures on the solar surface are related to magnetic fields, and most of these discrete structures rotate in the same way as sunspots (see Howard, 1978). However, large-scale patterns in the solar magnetic fields (as opposed to elements) appear to have almost rigid-body rotation. This effect is particularly evident in coronal holes (Wagner, 1975). Figure 1 compares three different determinations of the solar differential rotation profile. The depth dependence of solar rotation is of great importance to theoretical studies. The faster rotation of sunspots has often been taken as an indication that they are rooted beneath the visible layers, where the angular rotation rate is higher (e.g., Foukal, 1972). This view finds support in studies of rotational splitting of solar p-mode eigenfrequencies: those normal modes with deeply penetrating eigenfunctions have greater splitting (Deubner et al., 1979). Quantitatively, this study found that the rotation rate 10-15 Mm below the surface is about 4% faster than the spectroscopic surface rate. There has been some discussion of the possibility of a rapidly rotating solar core, occupying about 50% of the radius, or (Table I) about 90% of the mass (Dicke, 1970). Recent precise determinations of the solar oblateness (Hill et al., 1974) remove some of the strongest evidence in favor of this proposal,
[Figure 1: rotation rate plotted against latitude.]
FIG. 1. The latitude dependence of solar differential rotation. (A) Rotation rate of solar plasma determined spectroscopically via the Doppler effect (Howard and Harvey, 1970); (B) rotation rate of coronal holes determined by temporal autocorrelation of coronal brightness fields (Wagner, 1975); (C) rotation rate of sunspots determined from proper motions (Newton and Nunn, 1951). Notice both the strong differential rotation and the large differences in rotation rates at the equator.
although Dicke and others do claim to see evidence of a regular (12.2 day?) modulation of surface phenomena that could be ascribed to such an internal “clock.” More observations are required to settle the question. Although differential rotation is the dominant large-scale flow pattern of the solar surface, there is some evidence of other kinds of flow. A particularly important flow from the theoretical point of view is meridional circulation. Measurements of pole-equator flows are difficult and somewhat contradictory, although the most reliable determinations imply a flow from equator to pole of about 20 m sec$^{-1}$ (Duval, 1979). Other large-scale flow patterns also seem to exist. For example, a plausible explanation of the observed 5% modulation of the spectroscopic rotation rate on intervals of days or weeks is that there are horizontally coherent flows of several tens of m sec$^{-1}$, on scales of the order of a solar radius. Other evidence for such flows exists (cf. Durney, 1976), but a number of instrumental problems and properties of the solar atmosphere make it very hard to measure surface flow patterns with the desired accuracy (Howard, 1978). The theoretical study of large-scale flows in the Sun is an extremely difficult problem, which is unsolved at present. As discussed by Durney (1976), models for solar differential rotation fall into two classes, depending on whether the basic cause of differential rotation is thought to be (a) the interaction between rotation and a large-scale ordered pattern of convection (e.g., Gilman, 1976), or (b) the interaction between rotation and turbulent convection (e.g., Durney and Spruit, 1979). Gilman (1976) has constructed a number of refined models of the first kind based on the Boussinesq (unstratified) approximation and with an unrealistically small Rayleigh number. The models predict solar differential rotation and a pattern of meridional circulation, but agreement with observation is poor.
Until recently, models of the second kind have been obliged to represent the effect of rotation on turbulent convection by an arbitrary latitudinal variation of the turbulent thermal diffusivity. This variation may be calibrated by using the observed differential rotation, but then there are few remaining tests of the theory. Durney and Spruit (1979) have recently improved this situation by providing a theory for the anisotropic turbulent viscosity and conductivity in a rotating star, with cell size as the single undetermined variable. The absence of a measurable temperature difference between the pole and equator does provide a very strong constraint for either type of model, but it does not let us distinguish between them. It is easy to show that rotation will influence turbulent convection in solar conditions, as required by the second class of model, but the most affected scales are so large that they approach the scale of global circulation. As with the problem of solar convection per se, the main difficulty confronting theories of solar differential rotation is the fact that we must devise methods for dealing with
turbulence; and this problem has been attacked by noted hydrodynamicists for more than a century, with very little effect.
3. Pulsations
The Sun is a rotating, compressible, self-gravitating body with an inhomogeneous magnetic field. Each of these properties provides a potential restoring force, and the Sun can therefore support a tremendous diversity of wave motions. Many of these motions occur in localized regions of the solar atmosphere; their properties are not related to the internal structure of the Sun, and they are not discussed here. On the other hand, it has been found that large areas of the Sun oscillate coherently, indicating the existence of global solar pulsations or solar body waves. Studies of these waves promise to reveal properties of the solar interior in just the same way that terrestrial seismic studies have been used to infer the internal structure of the Earth. Although this section of the article deals with the solar envelope, observations of solar pulsations are of course made in the atmosphere. Some further properties of the pulsations, more related to solar atmospheric physics, are discussed later. The best-observed global solar pulsations are the so-called five-minute oscillations. These were discovered in the early 1960s and studied in great detail through the subsequent decade (Stein and Leibacher, 1974). Throughout this period it was generally thought that the oscillation consisted of randomly distributed, independent cells, with a diameter of about 5-10 Mm and a lifetime of 5-10 periods. Although some solar physicists recognized some problems with this “local” picture of the oscillation, it was not changed until Deubner, prompted by a suggestion from the theorist Ulrich (1970), succeeded in resolving the oscillation into a multitude of normal mode pulsations of the entire Sun. This discovery is one of the most important advances made in solar physics, so we shall describe the empirical approach at some length. Figure 2 shows a bidimensional power spectrum of solar oscillations obtained by Deubner et al. (1979).
FIG. 2. Bidimensional power spectrum of solar oscillations obtained at Sacramento Peak Observatory by Deubner et al. (1979). The contours correspond to quadratically increasing power levels, beginning at 2.8% of maximum. The dashed lines are the loci of theoretical eigenfrequencies with constant radial quantum number $n$. The weak ridge at the bottom of the set corresponds to the fundamental oscillation mode, with very high order surface harmonic. Solar physicists call such a plot a $(k, \omega)$ diagram.

The raw data for this power spectrum consisted of a long time series of observations of the line-of-sight photospheric velocity field made over a large area at the center of the Sun. The size of this area was in fact 140 x 720 Mm (recall that 1 $R_\odot$ = 696 Mm), and the velocity fields were measured in a raster covering this area with a resolution of 1.5 x 1.3 Mm. The entire raster was completed in 100 sec, and scans were made continuously for 7.1 hr. The velocity amplitude data were averaged in the north-south direction (over 140 Mm), so that the measurements refer only to that component of the wave field propagating in the east-west direction. The resulting two-dimensional array $v(x, t)$ of the velocity amplitude $v$ at position $x$ (along the equator) and time $t$ was Fourier transformed in both dimensions:

$$\tilde{v}(k, \omega) = \int dx \int dt \, \exp[i(kx - \omega t)] \, v(x, t) \qquad (6)$$
where $k$ and $\omega$ are the spatial wavenumber and temporal frequency, respectively. From this transform, the power spectral density $P$ can be evaluated:

$$P(k, \omega) = \tilde{v} \, \tilde{v}^{*} \qquad (7)$$
Figure 2 is a plot of this function in the $(k, \omega)$ plane. Solar physicists call such a plot a $(k, \omega)$ diagram. The $(k, \omega)$ diagram of Deubner et al. exhibits a number of ridges that correspond to the loci of eigenfrequencies of normal mode pulsations of the
Sun. The precise correspondence is described below, but we can establish some properties of the modes by taking a representative point at, say, $k = 0.25$ Mm$^{-1}$, $\omega = 0.021$ sec$^{-1}$. The corresponding period $P$ and horizontal wavelength $\Lambda$ of this mode are $P = 2\pi/\omega = 300$ sec, $\Lambda = 2\pi/k = 25$ Mm. The horizontal phase velocity $V_{ph} = \omega/k = 83$ km sec$^{-1}$; such a wave carries phase information around the solar circumference in a little under 15 hr. Previous studies of the five-minute oscillation did not resolve this normal mode structure, and detected only the beat patterns among the multitude of modes. Deubner and his co-workers resolved the modes because of the long duration and large spatial extent of their data set: their resolution in the $(k, \omega)$ plane is $\Delta k = 2\pi/720$ Mm $= 0.009$ Mm$^{-1}$, $\Delta\omega = 2\pi/100 = 0.06$ sec$^{-1}$, values much smaller than ever attained before. An observation of this kind is exceedingly demanding of telescope pointing accuracy (1-2 arc sec in 7 hr) and of data-handling capacity ($1.3 \times 10^{7}$ measurements of line-of-sight velocity in 7 hr). The irregular structure near the $k = 0$ axis in Fig. 2 and the termination of the modes near the borders of the diagram are thought to be mainly due to instrumental guiding errors and lack of sensitivity. New instruments and detection methods are producing as yet unpublished $(k, \omega)$ diagrams of improved quality. Figure 2 shows that the power spectral density of the solar photospheric oscillation falls off at low frequency. There is no power above the noise level for $\omega \lesssim 0.01$ sec$^{-1}$, corresponding to periods $P > 10$ minutes. It is of great interest to know whether longer period pulsations exist, since these correspond to normal modes whose eigenfunctions penetrate most deeply into the solar interior. There is some evidence that detectable oscillations with periods $P \gtrsim 10$ min do exist, although the interpretation of the measurements is hotly discussed.
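The figures quoted for the representative point and for the size of the data set can be reproduced with a few lines of arithmetic. This is a minimal sketch using only numbers stated in the text; the function name `mode_properties` and the variable names are hypothetical.

```python
import math

R_SUN_MM = 696.0   # solar radius in Mm, as quoted in the text

def mode_properties(k, w):
    """Period (sec), horizontal wavelength (Mm) and phase speed (km/sec)."""
    period = 2 * math.pi / w
    wavelength = 2 * math.pi / k
    v_phase = 1.0e3 * w / k          # Mm/sec converted to km/sec
    return period, wavelength, v_phase

period, wavelength, v_phase = mode_properties(k=0.25, w=0.021)

# Time for the phase to travel once around the solar circumference, in hours.
transit_hr = 2 * math.pi * R_SUN_MM / (v_phase / 1.0e3) / 3600.0

# Wavenumber resolution set by the 720 Mm east-west extent of the raster.
delta_k = 2 * math.pi / 720.0

# Size of the data set: raster points per scan times the number of 100 sec scans.
points_per_scan = (720.0 / 1.5) * (140.0 / 1.3)
n_scans = 7.1 * 3600.0 / 100.0
n_measurements = points_per_scan * n_scans

print(f"P = {period:.0f} sec, wavelength = {wavelength:.1f} Mm, "
      f"V_ph = {v_phase:.0f} km/sec")
print(f"transit time = {transit_hr:.1f} hr, delta_k = {delta_k:.4f} Mm^-1")
print(f"measurements = {n_measurements:.2e}")
```

Evaluated without intermediate rounding, the phase speed comes out nearer 84 km sec$^{-1}$; the quoted 83 km sec$^{-1}$ follows from the rounded values of 300 sec and 25 Mm.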
The most impressive evidence for the existence of such oscillations is contained in data obtained at the Santa Catalina Laboratory for Experimental Relativity by Astrometry (SCLERA). At this observatory, a large, homogeneous, and very accurate set of observations of the position of the solar limb was obtained in connection with a program to measure the oblateness of the Sun. Power spectrum analysis of these data showed a number of peaks in the period range 10-60 min, and these peaks may be ascribed to a superposition of solar oscillations (Hill and Stebbins, 1975). Statistical tests and phase stability studies seem to support this interpretation, but many other attempts to detect these oscillations have been uniformly unsuccessful. Hill (1978) has argued that none of the other methods has the sensitivity and/or the spatial filtering properties of the SCLERA experiment, so that the disparity is not a problem. But Hill's discussion appeals to a rather novel approach to the theoretical interpretation of the observed limb position data (related to the outer boundary conditions; see p. 157), in order to
account for the failure to detect velocity fields associated with the fluctuating limb brightness distribution. I believe that the question of the detection of long-period solar oscillations is currently completely open. It is quite probable that the modes are excited, but whether they have been detected or not is a moot point. Further observational work with extremely sensitive instruments is required; the potential for seismic probes of the deep solar interior is so great as to justify a major observational program. There have been some reports of a 2 hr 40 min oscillation on the Sun (e.g., Severny et al., 1976; Scherrer et al., 1979). The oscillation is phase coherent between the Crimean Astrophysical Observatory and the Stanford Solar Observatory, and the Crimean observations are phase coherent over at least four years. Because the observed period is very close to 1/9 of a day, a nonsolar (instrumental) origin of the oscillation is possible. However, very careful work implies that the oscillation is probably of solar origin. A solar oscillation of this period, if it were a normal mode, would be a fairly high order gravity mode in a standard solar model. The Sun possesses a spectrum of such gravity modes, and it is an intriguing problem to ask why only one of these modes should be so prominently excited, and moreover why it should be observed at the solar surface where the eigenfunctions of g modes tend to zero. Even longer periods have been identified in power spectral analyses of solar properties such as sunspot numbers and “oblateness” (e.g., Dicke, 1976). These periodicities, if statistically significant and of solar origin, may be due to a rapidly rotating solar core that has a vestigial influence on various properties of the solar surface.
It is clearly difficult to detect any of the long-period ($P \gtrsim 15$ min) oscillations of the Sun, but the potential value of such measurements for understanding the solar interior is enormous, and every effort should be made to overcome the obstacles. Theoretical studies of solar oscillations can be divided conveniently into two aspects: (1) the prediction of solar normal modes, and (2) the investigation of excitation and damping mechanisms. The second aspect is most difficult, since it involves nonadiabatic and, in the general case, nonlinear models. The first aspect is easier, because the normal modes can be quite accurately described by adiabatic models, so that the resulting wave equation is easily solved. A classical discussion of the theory of stellar oscillations is provided by Ledoux and Walraven (1957); the application of this theory to the Sun has been made by several authors, including Ando and Osaki (1975), whose work is described here. The equations describing nonadiabatic, nonlinear oscillations of the solar envelope and atmosphere are, in the notation of Ando and Osaki (1975), continuity:
$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{v}) = 0 \qquad (8)$$

motion:

$$\frac{d\mathbf{v}}{dt} = -\frac{1}{\rho} \nabla p - \mathbf{g} \qquad (9)$$

energy:

[Equation (10) is not recoverable from this copy.]

radiative flux:

$$\mathbf{F} = -\frac{4\pi}{3 \kappa \rho} \nabla J \qquad (11)$$

radiative transfer:

[Equation (12) is not recoverable from this copy.]

The Eddington approximation of radiation transfer theory underlies Eqs. (11) and (12). Equations (8)-(12) are essentially impossible to solve in their nonlinear forms, so that the first step in studies of solar oscillations is to linearize the gas-dynamic equations about a mean state. What is the appropriate mean state? We have seen that the outer part of the solar interior is a dynamical structure as a result of convective instability, and we discuss below how the outer parts of the solar atmosphere are also highly dynamical. There will certainly be a nonlinear interaction between dynamical oscillations and the other dynamical processes, but to ease the problem it is usual to suppose that the mean state is just the standard solar model with a semiempirical atmospheric model superimposed. This approach is acceptable for finding normal modes, provided oscillations do not perturb the atmospheric structure (cf. Christy, 1962), but it is definitely suspect in studies of the excitation and damping of modes. Equations (8)-(12) may be linearized about this mean state; in spherical coordinates $(r, \theta, \phi)$ these linearized equations admit solutions for the dependent variables of the form
$$f(r, \theta, \phi, t) = f_l(r) \, Y_l^m(\theta, \phi) \, e^{i\sigma t} \qquad (13)$$

In this expression $f_l(r)$ is the (generally complex) radial eigenfunction, $Y_l^m = \cos m\phi \, P_l^m(\cos\theta)$ is a surface harmonic of the first kind, and the complex frequency $\sigma = \sigma_R + i\sigma_I$ provides the growth (or decay) rate $\eta = -\sigma_I/\sigma_R$. An eigenvalue problem may be formulated from the linearized equations by prescribing boundary conditions. These are usually conditions of regularity at the center and a condition of boundedness at the surface. Hill (1978) and his co-workers have shown that their limb oscillation observations require that this condition of boundedness be relaxed in a linear analysis, but their approach is unorthodox and has been criticized. In the present context, it is sufficient to consider the bounded solutions. Solution of the eigenvalue problem yields a spectrum of eigenfrequencies whose corresponding eigenfunction structure may be used to classify the modes. The classification is based on the spherical harmonic index $l$ (related
to the number of node lines on the surface) and the number $n$ of nodes along the radius vector in the radial eigenfunction. The mode with $n = 0$ is known as the fundamental mode. For $n > 0$, two kinds of modes exist for each value of $(n, l)$; one of these, known as a gravity or g mode, has an eigenfunction peaking deep within the solar model, while the other, known as a pressure or p mode, has an eigenfunction peaking at the surface. All g modes have a longer period than the fundamental with the same $l$; all p modes have a shorter period. Figure 3 shows the location of the various kinds of eigenmodes in the $(k, \omega)$ plane. The ridges in the observed solar $(k, \omega)$ diagram (Fig. 2) are identified with p mode oscillations with relatively small values of $n$ and large values of $l$. In fact, the lowest observed ridge, with only a few islands of power, is the fundamental mode. Modes with $n$ up to 10 or 12 can be detected, and presumably higher orders will be found when small values of $k$ can be resolved. The value of $l$ can be estimated from the relations $k = 2\pi/\Lambda$ and $l \simeq 2\pi R_\odot/\Lambda$. For a point with $k = 0.25$ Mm$^{-1}$, we find $l \lesssim 300$, so that the observed modes are very high-order nonradial oscillations. It is clear why ridges of power, rather than discrete spectral points, are seen. The wavenumber separation between modes with $l = 300$ and $l = 301$, say, is only $\Delta k = 1/R_\odot \simeq 1.4 \times 10^{-3}$ Mm$^{-1}$, compared with an observational resolution (discussed earlier) of $\Delta k \approx 9 \times 10^{-3}$ Mm$^{-1}$. Thus each resolution element in the $(k, \omega)$ plane contains about six different $l$ modes.

[Figure 3: theoretical eigenfrequency curves; axes are period (min) and frequency against the order $l$ of the surface harmonic.]
FIG. 3. Theoretical eigenfrequencies of the Sun, computed by Ando and Osaki (1975; $l \geq 10$) and Iben and Mahaffy (1976; $l \leq 6$). The ordinate $l$ is the order of the surface harmonic of the eigenmodes, while the curves are labeled with the radial index $n$, which measures the number of nodes in the radial eigenfunction. The mode $n = 0$ is the fundamental mode. To the right of the fundamental are p modes, while a single g mode ($n = 1$) is shown on the left. The irregularities in the mode loci between $n = 6$ and $n = 10$ are due to differences in the theoretical models, and are a guide to the reliability of theoretical predictions of low-$l$ normal modes.

Why are the modes excited? Ando and Osaki (1975) have shown that high-order nonradial modes may be excited by the kappa mechanism, in exactly the same way that the fundamental radial mode of Cepheid variables is excited (Cox, 1967). The mechanism operates in the layers just below the photosphere where hydrogen is significantly, but not entirely, ionized. In this region the opacity increases when the temperature increases, so that radiative energy is added to the hotter parts of the wave field. Energy is thus taken out of the radiation field and added to the dynamical oscillation, which consequently grows in amplitude. The high-order modes are excited because their eigenfunctions are strongly peaked in just those layers where the kappa mechanism operates. The amplitude of the oscillation is presumably limited by a nonlinear effect, but no theoretical studies of this problem have yet been undertaken. It is important to note that the fundamental mode is excited, even though theoretical studies always predict that it should be stable. Possibly this mode is excited by convective “noise,” in just the same way that a resonant oscillator can be set in motion by any arbitrary forcing function (Goldreich and Keeley, 1977). This would also provide an explanation of the presence of long-period p modes of the kind discussed by Hill (1978). However, the long lifetime of the 5-min oscillations (Deubner et al., 1979) argues against this irregular excitation mechanism for the short-period modes. The eigenfunctions of solar nonradial oscillations extend deep into the solar envelope, and studies of the surface properties of these modes can thus be used to probe the interior.
We discuss two interesting results of this kind of “solar seismology.” First, the observed positions of the ridges in the $(k, \omega)$ diagram can be used to estimate the depth of the convection zone. As noted above, convection is very efficient within the solar convection zone, and the interior is nearly adiabatic. But the total entropy of the convection zone (and thus its depth) depends on the slight difference between the adiabatic and actual temperature gradients, which in turn depends sensitively on the convective transfer model, the opacity, the chemical composition, and the equation of state. Ulrich and Rhodes (1977) have computed the eigenfrequencies of a number of envelope models, and Rhodes et al. (1977) and Deubner et al. (1979) have compared their predictions with observations (the dashed lines in Fig. 2 correspond to a standard solar model with $l/H = 2$). Two important conclusions that emerge from this work are (1) so-called low-$Z$ solar models (in which the heavy element abundance is smaller than in standard models), introduced in an attempt to account for the neutrino deficit, are not compatible with inferred constraints on the depth of the convection zone; and (2) the base of the convection zone should be deeper than $0.75R_\odot$, and preferably at $0.5R_\odot$,
LAWRENCE E. CRAM
although this latter value would encounter severe problems with lithium depletion.

A second result of solar seismology is the determination of the depth dependence of the solar angular velocity. In a nonrotating model, the eigenfunctions are degenerate with respect to the index $m$ of the surface harmonic in Eq. (13), but solar rotation lifts this degeneracy in exactly the same way that an imposed magnetic field lifts the degeneracy of angular quantum numbers of atoms to produce Zeeman splitting. The splitting induced by rotation is greatest between modes with $l = m$ and $l = -m$; the frequency difference for these modes is

$$\Delta\omega_{n,l} = 2l\,\Omega(r_{n,l}) \tag{14}$$
where $n$ and $l$ are the modal indices and $\Omega(r_{n,l})$ is the angular velocity at an “effective depth” $r_{n,l}$. This depth varies with $n$ and $l$ because the eigenfunctions penetrate to greater or lesser depths, depending on $n$ and $l$. Values of $r_{n,l}$ are given by Ulrich et al. (1979). This theory has been applied to observed $p$-mode spectra by Deubner et al. (1979) to show that a layer about 13 Mm below the solar surface rotates about 80 m sec$^{-1}$ faster than the surface. This value agrees quite well with the observed faster rotation of sunspots, which may be anchored at about this depth (Foukal, 1977). Accurate observations of the splitting of solar $p$ modes, especially those with deeply penetrating eigenfunctions, will provide a wealth of useful data on the rotation and large-scale circulation of the solar interior.
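As a rough numerical illustration of the rotational splitting discussed above, the following sketch converts the quoted velocity excess (80 m sec$^{-1}$ at a depth of about 13 Mm) into an angular-velocity excess and then into an additional frequency splitting. It assumes the simple relation in which the $m = +l$ and $m = -l$ modes are separated by $2l\Omega$, with Coriolis corrections neglected. The solar radius of 696 Mm and the harmonic degree $l = 200$ are illustrative assumptions, not values taken from the text.

```python
import math


def rotational_splitting(l, omega_rot):
    """Frequency difference (rad/s) between the m = +l and m = -l modes.

    Assumes each mode is shifted by m * omega_rot (pure advection by
    rotation; Coriolis corrections are neglected in this sketch).
    """
    return 2 * l * omega_rot


R_SUN_MM = 696.0   # assumed solar radius in Mm (not given in the text)
DEPTH_MM = 13.0    # depth of the faster-rotating layer, from the text
DV_M_PER_S = 80.0  # rotation-speed excess over the surface, from the text

# Angular-velocity excess of the layer relative to the surface.
radius_m = (R_SUN_MM - DEPTH_MM) * 1.0e6
delta_omega_rot = DV_M_PER_S / radius_m  # rad/s

# Extra splitting for a hypothetical mode of degree l = 200.
l = 200
extra_split = rotational_splitting(l, delta_omega_rot)  # rad/s
extra_split_hz = extra_split / (2.0 * math.pi)          # Hz

print(f"angular-velocity excess: {delta_omega_rot:.3e} rad/s")
print(f"extra splitting for l = {l}: {extra_split_hz * 1e6:.2f} microhertz")
```

Under these assumptions the extra splitting is of order several microhertz, which gives a feel for the frequency resolution needed to detect depth-dependent rotation.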
III. THE QUIET SOLAR ATMOSPHERE
A. The Photosphere
The solar photosphere is the deepest observable layer of the solar atmosphere. It may be directly observed in spectral lines and continuum radiation in the approximate wavelength range 0.17-20 μm. The quiet photosphere is highly structured by granulation, supergranulation, oscillations, and small-scale magnetic fields, and the detailed study of the dynamical origin of these structures has elucidated some of the basic mechanisms of solar atmospheric physics. These aspects of the photosphere are discussed below, but first we discuss one-dimensional models of the photosphere, since the main link between solar and stellar physics has been this kind of model. We use the discussion to introduce some of the basic ideas of radiation transfer theory, which naturally is of great importance in relating observed properties of the Sun to the underlying physical conditions producing them.

The basic data used to construct one-dimensional photospheric models
are spatially unresolved observations of the wavelength and center-to-limb variation of the specific intensity. At a fixed heliocentric position, the wavelength dependence gives information on height variations because the opacity varies with wavelength. Similarly, center-to-limb scans at a fixed wavelength give height resolution because unit optical depth is reached higher along inclined lines of sight. Provided the opacity is known, the two methods are redundant and can be used as a consistency check.

The accuracy of photospheric models naturally depends on the accuracy of observational data. An extensive compilation of solar intensity and flux measurements has been published by Vernazza et al. (1976; hereafter VAL), and it is sobering to note that: (1) order-of-magnitude uncertainties occur in far-UV intensities, (2) visible-region absolute intensities are uncertain by 5% (rms), and (3) far-IR brightness temperatures are uncertain by about 500 K (rms). Accurate observations of the spectral distribution of solar intensity and irradiance are important not only for solar and stellar physics, but also for modeling the Earth’s atmosphere, and proposals to carry out accurately calibrated observations of this kind from space vehicles receive wide support.

Most studies of the solar atmosphere are based on an application of the radiation transfer equation, which in a one-dimensional, plane-parallel model can be written
$$\mu\,\frac{dI_\lambda(\mu)}{dh} = -\kappa_\lambda I_\lambda(\mu) + \varepsilon_\lambda = -\kappa_\lambda\left[I_\lambda(\mu) - S_\lambda\right] \tag{15}$$
where $h$ is a physical height coordinate, $I_\lambda(\mu)$ the specific intensity at wavelength $\lambda$ propagating in a direction $\theta = \cos^{-1}\mu$ to the vertical, $\kappa_\lambda$ and $\varepsilon_\lambda$, respectively, the monochromatic absorption and emission coefficients, and $S_\lambda$ the source function. The optical depth may be defined as
$$d\tau_\lambda = -\kappa_\lambda\,dh \tag{16}$$
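whereupon (15) may be formally solved for the emergent intensity

$$I_\lambda(\mu) = \int_0^\infty S_\lambda(\tau_\lambda)\,e^{-\tau_\lambda/\mu}\,\frac{d\tau_\lambda}{\mu} \tag{17}$$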
The “inversion problem” of radiation transfer theory is to use (17) to deduce $S_\lambda(\tau_\lambda)$ from observations of $I_\lambda(\mu)$. Inversion problems have been discussed in detail by Jefferies (1968); a paper by Cram (1978a) shows the limitations to the amount of information that can be extracted by inversion. The simplest solution to the inversion problem is the Eddington-Barbier approximation: if $S_\lambda(\tau_\lambda)$ is a slowly varying function, then

$$I_\lambda(\mu) \approx S_\lambda(\tau_\lambda/\mu = 1) \tag{18}$$
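The Eddington-Barbier approximation can be checked numerically. The sketch below assumes the standard plane-parallel formal solution, in which the emergent intensity is the integral of $S(\tau)\,e^{-\tau/\mu}/\mu$ over optical depth, and evaluates it with a simple midpoint rule. The two source functions and their coefficients are hypothetical test cases chosen for illustration: for a source function linear in $\tau$ the approximation is exact, while a quadratic term introduces an error that grows as $\mu^2$.

```python
import math


def emergent_intensity(source, mu, t_max=40.0, n=40_000):
    """Emergent intensity for direction cosine mu.

    Evaluates the integral of S(tau) * exp(-tau/mu) / mu over tau by
    substituting t = tau / mu and applying the midpoint rule on [0, t_max].
    """
    dt = t_max / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        total += source(mu * t) * math.exp(-t)
    return total * dt


def linear_source(tau, a=1.0, b=1.5):
    """Hypothetical source function varying linearly with optical depth."""
    return a + b * tau


def quadratic_source(tau, a=1.0, b=1.5, c=0.5):
    """Hypothetical source function with a quadratic term added."""
    return a + b * tau + c * tau ** 2


for mu in (1.0, 0.5, 0.2):
    exact_lin = emergent_intensity(linear_source, mu)
    exact_quad = emergent_intensity(quadratic_source, mu)
    # Eddington-Barbier estimate: source function at tau = mu.
    print(
        f"mu={mu:.1f}  linear: I={exact_lin:.5f}, S(mu)={linear_source(mu):.5f}"
        f"  quadratic: I={exact_quad:.5f}, S(mu)={quadratic_source(mu):.5f}"
    )
```

The comparison shows why the approximation is described as valid for a slowly varying source function: curvature in $S(\tau)$ is exactly what it fails to capture.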
Applications of inversion methods to observations of $I_\lambda(\mu)$ provide a large
amount of data on $S_\lambda(\tau)$ and $\kappa_\lambda(\tau)$. These are “intermediate parameters” (Thomas, 1965) from which we want to infer state variables such as the electron temperature $T_e(h)$ and density $N_e(h)$. The question of how this inference should be made has been the heart of a long controversy regarding the validity, or otherwise, of local thermodynamic equilibrium (LTE) in stellar atmospheres. The assumption of LTE was originally made on the grounds of expediency (e.g., Chandrasekhar, 1960, Chapter XI). However, as emphasized by Thomas (1965), a self-consistent approach to the problem of connecting the radiation field and the atmospheric structure should begin with the microphysics of the interaction between radiation and matter. This microphysics may be represented by a set of rate equations describing in detail the rates at which various atomic states are populated and depopulated. These rate equations contain values of the atmospheric radiation fields, because radiative excitation and emission are dominant processes. On the other hand, the radiation field depends on atomic level populations via the absorption and emission coefficients in the transfer equation. Thus, a consistent derivation of $(T_e, N_e)$ from intermediate parameters should involve the simultaneous solution of the rate equations and the transfer equations. Methods for solving this problem have been developed, and the non-LTE problem is now well understood (Mihalas, 1978). It is found that LTE is a valid approximation in the deep photosphere, while significant departures occur in the upper photosphere, and only non-LTE methods can be used in the chromosphere and above (Thomas and Athay, 1961, pp. 183-187).

A refined “semiempirical” photospheric model is that due to VAL (the VAL model M). Figure 4 compares this model with the most recent theoretical model of Kurucz (1979). The semiempirical model represents a best fit to the observed solar spectrum in the wavelength range 0.125-500 μm.
In the deepest layers where at 0.5 μm the optical depth $\tau_5 > 1$, the critical transition from convective to radiative transfer occurs. In these layers the model is based mainly on data in the 1-2.5 μm range where observations are uncertain and inconsistent (VAL, p. 31). Thus, the extremely important physics of the convective-radiative layer cannot be studied from this model. Similarly, the structure of the critical layers between the upper photosphere and low chromosphere is derived from far-UV data ($\lambda = 1500$ Å), where calibrations are poor, line blanketing is heavy, and non-LTE effects are important, and from far-IR data ($\lambda \approx 200$ μm), where absolute radiometry is very unreliable.

FIG. 4. Comparison of a recent semiempirical model of the photosphere based mainly on absolute radiometry of the solar continuum (Vernazza et al., 1976) and a refined theoretical model based on the joint assumptions of radiative/convective equilibrium and hydrostatic equilibrium (Kurucz, 1979). The models agree well below $\tau_5 = 0.01$, but the disagreement in the outermost layers is of critical importance in attempts to estimate the nonradiative energy balance of the solar atmosphere.

The theoretical model shown in Fig. 4 represents the state-of-the-art in computing models of stellar atmospheres under the classical conditions. It was constructed under the joint assumptions of convective-radiative equilibrium and hydrostatic equilibrium. The condition of radiative equilibrium is described by

$$0 = \frac{dF}{dh} = -4\pi\int_0^\infty \kappa_\lambda\,(J_\lambda - S_\lambda)\,d\lambda \tag{19}$$
where $J_\lambda = \frac{1}{2}\int_{-1}^{1} I_\lambda(\mu)\,d\mu$ is the mean intensity and $F$ the wavelength-integrated flux. The condition of hydrostatic equilibrium is
$$dP = -\rho g\,dh \tag{20}$$

where $P = P_G + P_T$ is the sum of the gas pressure and any turbulent pressure included in the model. The function $\kappa_\lambda$ appearing in (19) is very irregular because of the multitude of spectral lines. This fact produces the line-blanketing problem (Mihalas, 1978), which Kurucz (1979) solved by applying a statistical method based on opacity distribution functions. Convective transport is treated by applying mixing-length theory [Eq. (4)] in convectively unstable regions [Eq. (1)]. Kurucz’s (1979) model is strictly LTE. Figure 4 shows satisfactory agreement between the semiempirical and theoretical models in the layers $\tau_5 \gtrsim 0.01$. In particular, the layers near $\tau_5 \approx 1$ where convection gives way to radiative transport agree well. This
change in dominant energy transfer mode is apparently well treated in the theoretical model. The differences between the two models near the temperature minimum highlight one of the fundamental problems of contemporary solar physics: What are the energy requirements needed to nonradiatively heat the chromosphere and corona, and how are these requirements met? The power required to maintain a temperature excess $\Delta T$ (over the radiative equilibrium value $T_0$) in a layer of optical thickness $\Delta\tau$ is approximately (Cram, 1978b)

$$\Delta F = 16\sigma T_0^3\,\Delta T\,\Delta\tau\,b_{H^-} \tag{21}$$
where $\sigma$ is the Stefan-Boltzmann constant and $b_{H^-}$ ($\approx 1$) is the non-LTE departure coefficient for the H$^-$ ion. As shown in Fig. 4, the empirical model is cooler than the theoretical model in the upper photosphere, and we might infer that mechanical energy must be extracted from this region. This is contrary to the view that the upper photosphere is mechanically heated (Ulmschneider and Kalkofen, 1973). The paradox is due to the fact that both the empirical and the theoretical models are uncertain by $\pm 250$ K in the upper photosphere. Consequently, we do not know even the sign of the energy requirements for atmospheric heating in the temperature minimum region (Cram, 1977).

The photosphere is highly structured. The structuring may produce important effects in attempts to produce mean models (Wilson and Williams, 1970), but of more direct interest is the study of the dynamical origins of the various structural features. Let us discuss, in turn, the observed properties and probable physical origins of granulation, supergranulation, oscillations, and small-scale magnetic fields.

Granulation consists of a field of rising and falling gas elements, with a positive temperature excess in the rising elements. Physical conditions in granulation, as a function of height and time, can be inferred by studies of continuum intensity fluctuations and of the shifts and shapes of highly resolved spectral lines. Figure 5 illustrates granular morphology, while Fig. 6 shows how granules and other structures produce Doppler shifts of spectral lines. A detailed account of granulation is available in the monograph by Bray and Loughhead (1967); more modern observational parameters are summarized by Wittmann (1979).
Granulation appears to be due to convective overshoot (Moore, 1967): the vertical momentum of eddies in the convection zone below $\tau_5 \approx 1$ carries them up into the convectively stable layers, where they decelerate under the influence of radiation losses and momentum conversion into waves and “turbulence.” Theoretical models for these processes have been constructed by Nelson and Musman (1977) and others. Critical tests of the models include their ability to reproduce the observed granular brightness contrasts, and the observed decay with height.
FIG. 5. Time series images of photospheric granulation obtained with a 32-cm balloon-borne telescope (Mehltretter, 1978). The elapsed time (in minutes) is shown beneath each frame. The arrows indicate characteristic decay patterns: with circles, merging of granules; without circles, dissolution of granules. Fragmentation, the predominant decay mode, can be seen in many cases.
FIG. 6. A highly resolved spectrum of the center of the Ca II K line, the strongest line in the visible solar spectrum. The self-reversed emission core of the line is resolved into a multitude of emission structures, each produced by localized fluctuations in temperature, density, and velocity in the chromosphere. It is difficult to see the relationship between the wealth of fine structure and plane-parallel solar (and stellar) chromospheric models based on unresolved observations of this line. Several photospheric absorption lines can be seen in the K line wing: the weakenings and wiggles of these lines contain information on granulation and oscillations in the photosphere.
The latter test is very sensitive, but unfortunately there is some controversy regarding the observational height dependence, and in any case most models parametrize the vertical momentum destruction via a turbulent parameter that can be arbitrarily adjusted to fit the observational constraints (Keil, 1980).
Supergranulation is a larger and more persistent flow field that is best seen as a pattern in the horizontal photospheric velocity field. The observational characteristics of supergranulation have been summarized by Beckers and Canfield (1975). Supergranulation is outlined by the photospheric network, a region where small magnetic elements are preferentially clustered into a reticulate pattern, presumably by the supergranular flow itself (Weiss, 1966). The upward extension of this network plays a dominant role in structuring the chromosphere and transition zone, and so in this context supergranulation is a phenomenon of primary importance. It is thought that supergranulation represents a preferred scale for convective eddies, possibly associated with an instability due to the second ionization of helium (Simon and Weiss, 1967). The phenomenon is poorly understood, which is unfortunate in view of its important effect on the structuring of solar magnetic fields (Weiss, 1966; Foukal, 1977).

Oscillations of various periods and wavelengths pervade the photosphere. Reviews of the observed properties of the oscillations are available in Stein and Leibacher (1974) and in Beckers and Canfield (1975). The 5-min oscillations are the atmospheric manifestation of global $p$ modes. As expected on theoretical grounds, these waves behave as evanescent modes in the photosphere: their vertical phase velocity is very high, their amplitude grows with an appropriate scale height, and the velocity and temperature fluctuations are in quadrature. There is some observational evidence for the existence of a field of short-period ($P \lesssim 200$ sec), propagating acoustic waves throughout the photosphere (Deubner, 1976). Such a field of waves may be responsible for chromospheric heating via shock wave formation (see Section III, B), and it may also contribute to the mystifying phenomenon of anomalous (i.e., nonthermal) broadening of solar photospheric absorption lines.
The first evidence for the existence of small-scale intense magnetic fields in the quiet sun consisted of spatially unresolved observations that detected mean field strengths of a few tens of gauss over areas of 5-10 Mm$^2$. Studies of the Zeeman profiles emitted from these areas later showed that this low mean field was due to a relatively small area covered by fields whose strength was of the order of 1 kG (Stenflo, 1976). Dunn and Zirker (1973) and Simon and Zirker (1974) showed that these fields are associated with extremely small ($\lesssim 200$ km) bright photospheric structures dubbed "filigree." Figure 7 shows a region quite densely packed with filigree; the morphological relation between the normal granulation and the bright, magnetically related filigree elements in the dark lanes is readily visible. The detailed relationships between magnetic fields, brightness structures, and an apparent strong
FIG. 7. The photosphere near a developing active region, photographed with the 76-cm vacuum tower telescope of Sacramento Peak Observatory. The upper left of the image shows fairly normal granulation, while the lower right contains a region of abnormal granulation peppered with tiny filigree elements. The elongated structures between the sunspot and the pores are produced by loops of magnetic flux that are floating up through the photosphere in this region.
downdraft in the magnetic elements have not been clarified because the small size of the structures frustrates observational studies.

The small-scale, intense fields form the photospheric network, which is the root of much of the structure of the upper parts of the quiet solar atmosphere. It is thus of great interest to understand the physical origin of this structure. One of the major difficulties presented by these fields is the large field strength, which is about five times the value that could be maintained by equipartition of convective and magnetic energy. Parker (1976) has reviewed a number of mechanisms proposed to explain the kilogauss fields. The most satisfactory model appears to be one in which the magnetic element is
maintained in magnetostatic equilibrium with the external photospheric gas pressure, with much of the gas removed from the element as a result of cooling beneath the photosphere (Spruit, 1976). The cause of the cooling is unclear, as are the detailed physical processes responsible for the downdrafts, brightness structures, and temporal evolution of the elements. The study of the physics of these small flux elements is one of the most challenging current problems in solar physics: theoretical investigations are hampered by the need to include a number of poorly understood processes simultaneously to derive a holistic model, while observational investigations are extremely difficult because of the minute scale of the structures. High-spatial-resolution observations from space promise to provide a much clearer picture of these magnetic structures.
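The contrast between the equipartition field and the field allowed by magnetostatic pressure balance can be made concrete with a short calculation in cgs units. The sketch below compares the field whose energy density $B^2/8\pi$ matches the convective kinetic energy density $\rho v^2/2$ with the field that balances an external gas pressure. The density, convective speed, and gas pressure used here are illustrative order-of-magnitude assumptions for the photosphere; they are not taken from the text.

```python
import math


def equipartition_field(rho, v):
    """Field (gauss) with B^2 / (8 pi) equal to rho * v^2 / 2 (cgs units)."""
    return math.sqrt(4.0 * math.pi * rho) * v


def pressure_balance_field(p_ext, p_int=0.0):
    """Field (gauss) satisfying p_int + B^2 / (8 pi) = p_ext (cgs units)."""
    return math.sqrt(8.0 * math.pi * (p_ext - p_int))


# Illustrative photospheric values (assumptions, not from the text).
RHO = 3.0e-7    # mass density, g cm^-3
V = 1.0e5       # convective speed, cm s^-1 (1 km/s)
P_EXT = 1.2e5   # external gas pressure, dyn cm^-2

b_eq = equipartition_field(RHO, V)
b_max = pressure_balance_field(P_EXT)                    # fully evacuated element
b_partial = pressure_balance_field(P_EXT, 0.6 * P_EXT)   # partially evacuated

print(f"equipartition field:            {b_eq:7.0f} G")
print(f"pressure balance (evacuated):   {b_max:7.0f} G")
print(f"pressure balance (60% gas left): {b_partial:6.0f} G")
```

With these assumed values the equipartition field is a few hundred gauss, while pressure balance permits kilogauss fields once a substantial fraction of the internal gas is removed, in line with the factor of about five quoted above.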
B. The Chromosphere

The electron temperature distribution $T_e(h)$ in the solar atmosphere passes through a minimum about 500 km above $\tau_5 = 1$, and increases outward through the chromosphere, the transition region, and the low corona (see Fig. 8). We may refer to these regions as the outer solar atmosphere. The elevated temperature of the outer solar atmosphere is direct evidence of nonradiative heating. One of the most important, and frustrating, problems of contemporary solar physics is to identify the physical processes
FIG. 8. The run of temperature with height in plane-parallel models of the photosphere, chromosphere, transition region, and inner corona. The photospheric model is due to VAL, the chromospheric model to Gingerich et al. (1971), and the transition zone and coronal model to Dupree (1972). Other atmospheric models would differ markedly from these, particularly above 1000 km. Moreover, as discussed in the text, the concept of a plane-parallel model loses much of its relevance in the chromosphere and above. [Plot axes: log $T_e$ versus height (km); plot annotation: log $T_e$ = 6 at 4500 km.]
responsible for the heating. The canonical description of outer atmospheric heating involves three phases:
(1) Generation of a flux of mechanical energy by dynamical processes beneath the photosphere;
(2) propagation of these fluxes through the photosphere, with modifications induced by stratification effects and radiative energy exchange; and
(3) dissipation of the ordered kinetic energy flux into the disordered thermal field by nonlinear processes such as shock formation.

Different models involve different detailed processes in each phase, but until recently there was widespread agreement concerning the general validity of this framework. As we shall see, however, there is increasing evidence that the general picture may be wrong.

The chromosphere is more prominently structured than the photosphere, but much chromospheric research still considers a spherically symmetric, plane-parallel model. It has been argued that the apparently satisfactory agreement between models based on quite different diagnostics is evidence for the approximate validity of a spherically symmetric model. However, this agreement is, at least in some cases, due to the introduction of ad hoc features such as temperature plateaux, UV opacity enhancement factors, spicular absorption, or turbulent pressure. Moreover, a plane-parallel model can never provide an adequate description of the fine structure itself, and it is the physics of the fine structure that tells us most about detailed processes. Let us first survey the plane-parallel models and then turn to a discussion of the fine structure.

Spherically symmetric chromospheric models may be derived from several different kinds of data. For example, observations of line and continuum emission strengths as a function of physical height can be made as the Moon’s limb covers the chromosphere during a total solar eclipse. Extensive work on data of this kind during the 1950s is summarized by Thomas and Athay (1961).
This work revolutionized our view of the solar atmosphere, first by demonstrating the importance of non-LTE processes, and second by showing that the chromosphere begins only a few hundred kilometers above $\tau_5 = 1$, and merges into the corona only 1000-2000 km above that height. In many respects, eclipse models are still the most self-consistent and reliable chromospheric models (Praderie and Thomas, 1976), but they are not able to provide accurate information in the lowest parts of the chromosphere, and they do involve gross averaging through chromospheric fine structure projecting above the limb.

The chromosphere is visible in continuum radiation in the far UV ($\lambda \lesssim 1600$ Å) and in the far IR (100 μm $\lesssim \lambda \lesssim$ 10 cm). Models can be constructed from observations of the wavelength and center-to-limb variation of the radiation fields in these spectral regions, just as photospheric models are derived from visible radiation. Gingerich et al. (1971) and Vernazza et al. (1973) have described such models. The main problems with this approach include uncertainties in absolute radiometry in both UV and IR regions, and non-LTE and line-blanketing effects in the far UV. Furthermore, the observations cannot be directly placed on a physical height scale, and considerable uncertainty attends the application of hydrostatic equilibrium (possibly with “turbulent pressure” included) in the upper chromosphere (cf. Vernazza et al., 1973; Praderie and Thomas, 1976). As Praderie and Thomas point out, models based on continuum data appear to be approaching the eclipse models, as the former become more refined.

A third observational basis for chromospheric modeling is provided by spectral lines, either strong lines in the visible (Balmer lines, Ca II resonance lines and IR triplet, Na D, Mg b, etc.) or emission lines in the far-UV spectrum. A summary of line formation theory is provided by Mihalas (1978), while Athay (1976) reviews its application to chromospheric modeling. Despite the advances made in non-LTE theory, line formation in chromospheric conditions is still poorly understood. The fundamental problem appears to be due to the existence of inhomogeneities in the chromosphere. Line profiles are particularly sensitive to velocity gradients of a few km sec$^{-1}$ in the chromosphere, and there is an extremely nonlinear relation between a line profile shape and the structure of the underlying atmosphere. It appears that with increasing complexity, line profile analyses can be satisfactorily linked with eclipse and continuum models, but as emphasized by Vernazza et al. (1973), it would be dangerous to rely on models derived from line formation theory alone. An interesting example of this problem is provided by studies of the formation of the Ca II resonance lines (K, λ3933; H, λ3969).
These lines, the strongest in the visible solar spectrum, possess self-reversed emission cores that are formed in the chromosphere by a non-LTE mechanism proposed by Jefferies and Thomas (1960). They are of particular importance not only because they are a valuable diagnostic of solar chromospheric conditions, but also because they have been used to provide models of the chromospheres of many other late-type stars (e.g., Linsky, 1977). A comparison of the models discussed by Linsky and Avrett (1970) and Ayres and Linsky (1976) shows how changes in line formation theory can lead to extremely large changes in derived chromospheric models. Moreover, as shown in Fig. 6, there is a tremendous diversity of fine structure in the Ca II resonance line emission cores, most of it related to violent dynamical processes. The viability of a spherically symmetric representation of such a line is clearly open to question.

One of the basic applications of plane-parallel chromospheric models is in the derivation of the departures from radiative equilibrium responsible
for chromospheric heating. A quantitative measure of these departures is the net radiative flux divergence

$$\frac{dF}{dh} = -4\pi\int_0^\infty \kappa_\lambda\,(J_\lambda - S_\lambda)\,d\lambda \tag{22}$$
In a steady-state atmosphere, this quantity must be balanced by the mean rate of energy deposition:

$$E_m = \frac{dF}{dh} \tag{23}$$
Knowledge of $E_m(h)$ (provided, for example, by a theory of chromospheric heating) allows one to construct a theoretical model chromosphere, while conversely the value of $E_m(h)$ can be estimated if an empirical chromospheric model is available. Both kinds of studies of $E_m(h)$ have been pursued, but not with a satisfactory degree of internal self-consistency. For example, Ulmschneider and Kalkofen (1977) have derived $E_m(h)$ from a theory of generation of acoustic waves in the turbulent convection zone, with radiative damping of the waves in the photosphere, and subsequent dissipation through shock formation in the chromosphere. With such physics in $E_m(h)$, Ulmschneider and Kalkofen construct models of $T_e(h)$ that can be compared with empirical results. As emphasized by Cram (1977), all phases of this approach are subject to considerable uncertainty, and a number of demonstrably inappropriate simplifications make the comparison with observations a pointless exercise. On the other hand, Athay (1976) and others have sought to derive $E_m(h)$ from empirical models. This can be done directly via Eq. (22), but this approach has never been used. Rather, $E_m(h)$ is derived from estimates of the difference between an empirical model and a fictitious radiative equilibrium model. Cram (1977) has argued that this procedure is presently unable to reliably determine even the sign of the heating function in the upper photosphere and low chromosphere.

In summary, the reliability and consistency of plane-parallel chromospheric models derived from eclipse and continuum data are fairly satisfactory, although uncertainties remain as a result of radiometric problems. Models derived from line formation studies are less reliable. Attempts to model chromospheric heating, or to derive chromospheric heating from empirical models, are quite unsatisfactory, and the whole conceptual framework supporting these kinds of studies should be reexamined.

The chromosphere is extremely inhomogeneous.
A discussion of quiet chromospheric structures can be naturally divided between the network and the cell interior regions. The network component is an extension of the photospheric network, whose general reticulate structure is related to supergranulation, and whose observed physical properties are apparently intimately connected with the presence of intense, small-scale magnetic fields. With increasing height in the chromosphere the network component occupies a relatively greater fraction of the solar surface. This effect is directly related to the flaring of the magnetic field as the external gas pressure decreases. Just as in the photosphere, the most striking property of the low chromospheric network is the localized heating produced in and near the magnetic elements. These heated regions are the bases of agglomerations of spicules. As reviewed by Beckers (1972), spicules are jets of material that emanate from and fall back into the chromospheric network regions. When observed in H$\alpha$, spicules are about 1 Mm in diameter and 10 Mm long; they live for several minutes, and their interiors have $T_e \approx 8000$ K, $N_e \approx$ 10' cm$^{-3}$. Theoretical models for this violent phenomenon have been discussed by Beckers (1972) and Athay (1976); the models are incomplete and currently cannot provide a detailed description of the consequences of these repeated excursions of chromospheric material into the layers normally occupied by the transition zone and inner corona.

The chromospheric regions lying between the network (i.e., the cell interiors) are also highly structured. In contrast to the relatively rigid magnetic fields and spicular jets of the network, however, the structure of the cell interiors is dominated by waves with periods of the order of 3 min and amplitudes of several km sec$^{-1}$ (Cram et al., 1977). These waves, or yet-unobserved shorter period waves, could be responsible for nonradiative heating, but there is no direct evidence for this. On the other hand, it appears that the magnetic field in network regions is intimately associated with nonradiative heating.
We can thus ask whether or not magnetic fields are primarily responsible for nonradiative heating: if this is the case, the classical picture of heating of the outer solar atmosphere described at the beginning of this section would be inappropriate. The answer to this question is at the moment unknown, but as our views of transition region and coronal heating mechanisms change with new theoretical and observational results, solar physicists are considering more carefully the proposition that all nonradiative heating of the outer atmosphere may be related to electromagnetic fields and plasma processes.

C. The Transition Region and Corona

The observational and theoretical study of those regions of the outer solar atmosphere above the chromosphere (say, $T_e \gtrsim 20{,}000$ K) is one of the most rapidly evolving aspects of solar physics. Instrumental improvements are allowing solar physicists to observe the fine structure of these regions at X-ray, EUV, and radio wavelengths, and the picture emerging
from these observations is revolutionizing theoretical ideas concerning the physical processes responsible for coronal heating and related phenomena.

A fundamental assumption of older diagnostic and theoretical studies of the transition region and inner corona was that spatially unresolved observations could be satisfactorily explained by a mean model. The data used to construct such models were of two kinds: observed net fluxes in strong EUV emission lines, and centimeter radio emission. Models based on EUV data have been derived by Dupree (1972; see Fig. 8) and others. An important conclusion drawn from these models was that the temperature gradient in the regime $10^5\ \mathrm{K} \lesssim T_e \lesssim 10^6$ K was compatible with a model involving a constant thermal conduction flux in these layers. If this were true, the physics of the transition region would be very simple. Radio observations of the spectral energy distribution at disk center lent some support to such a model, but the center-to-limb variation of centimeter radio fluxes has never been satisfactorily explained by a mean model. The plane-parallel model and its consequence of constant conductive flux have been the basis for many inferences regarding the structure of the solar transition zone and the transition zones of other stars (e.g., Kuperus and Chiuderi, 1976).

Our view of the transition region and inner corona has been fundamentally changed with the recognition that this region is exceedingly inhomogeneous, and that the inhomogeneities are rapidly evolving dynamical structures. The transition region and inner corona is not a quasi-uniform region with a superimposed "weak" structuring. Rather, this part of the solar atmosphere is nothing but an agglomeration of dynamical structures. Furthermore, an inhomogeneous magnetic field strongly influences the structural pattern,
and most of the physical mechanisms responsible for the dynamical structuring are probably ultimately due to plasma processes occurring in the magnetic fields. This revised view of these regions of the outer solar atmosphere has been well described by Vaiana and Rosner (1978), Feldman et al. (1979), and Chiuderi (1979).

Ground-based observers should not have been surprised by this new recognition of the dominant role played by inhomogeneity. Observations of the chromosphere made in H$\alpha$ or other strong lines, which naturally provide images of the degree of rugosity of the low side of the transition region, reveal a constantly changing surface with vertical irregularities of thousands of kilometers. Given this irregular and fluctuating lower boundary, it is clear why observations should reveal a complex mixture of hot and cold material at each height in the outer solar atmosphere.

An important semiempirical study of the inhomogeneous structure of the transition region is that of Feldman et al. (1979). These authors used observations of EUV line intensities at the solar limb to infer the height dependence of the relative emissivity of transition zone material, and disk center observations of the absolute intensities of network structures to find the filling factor as a function of height. This study showed that if the emitting regions in network regions are assumed to be thin columns (i.e., spicular structures), then less than 1% of the network volume in the height range 1.5-3.0 Mm is occupied by transition region material ($3.5 \times 10^4 \lesssim T_e \lesssim 2.2 \times 10^5$ K). Furthermore, if there is any plane-parallel homogeneous (cell interior) component, its thickness is constrained by observations of the intensity of C III lines to be about 1 km for the temperature range $3 \times 10^4 \lesssim T_e \lesssim$ 10’ K! An interpretation of this model might be that transition zone material occurs in significant quantities only around spicules in the network. However, there is some evidence for a diffuse component of the network, and this may be responsible for transition zone emission. In any case, the observations imply that there is essentially no transition zone material over cell interiors: material at coronal temperatures must be in intimate contact with chromospheric material in these places.

Although not as extreme as this empirical picture, some inhomogeneous models of the transition zone have been constructed (e.g., Gabriel, 1976). In these models, a magnetic field is introduced to provide the basic structural form, and the distribution of electron temperature $T_e$ and density $N_e$ is then found by imposing the conditions of magnetostatic equilibrium and energy balance between radiation and thermal conduction. Preferential heating of the convergence points of the magnetic field (“network”) occurs because thermal conduction is much more efficient parallel to the field. Such models can be adjusted by including an ad hoc mechanical heating function to reproduce the observed intensities and areas of network structures, but they do not work well over cell interiors, nor do they account for the height distribution of transition zone material seen in limb observations.
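As a worked illustration of why the older mean model made the physics look "very simple", the constant-conductive-flux condition mentioned above can be integrated in closed form. The sketch below assumes a Spitzer-type conductivity $\kappa = \kappa_0 T_e^{5/2}$ along the field; this form, the coefficient $\kappa_0$, the height variable $h$, and the reference height $h_0$ are assumptions introduced here for illustration and are not stated in the text.

```latex
% Assumption: conductive flux with Spitzer-type conductivity kappa_0 T_e^{5/2}
F_c = -\kappa_0\, T_e^{5/2}\, \frac{dT_e}{dh} = \text{const}
\quad\Longrightarrow\quad
T_e^{7/2}(h) = T_e^{7/2}(h_0) + \frac{7\,|F_c|}{2\,\kappa_0}\,(h - h_0)
```

Under these assumptions the whole temperature run is fixed by a single number, the downward flux $|F_c|$, which is why departures from this profile are a sensitive indicator of the inhomogeneous and dynamical effects described above.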
The essential component missing from these models is the existence of dynamic structures in the transition zone. A model should therefore include a consistent treatment of time-dependent mass, momentum, and energy transfer. The discussion by Pneuman and Kopp (1977) highlights the complete change in ideas of transition zone structure that will result from advances in these kinds of studies. Moreover, steep temperature gradients and rapid mass transfer must lead to the kinds of nonequilibrium effects discussed by Raymond and Dupree (1978). The simple physics of the older conduction-dominated transition zone model is unfortunately only a small part of the story. A great deal of work will have to be done before we understand the full range of transition zone physics.

The network and cell interior structure of the transition zone, as described above, is not prominent in the corona itself (say $T_e \gtrsim$ 10‘ K, see Withbroe, 1976). Rather, on all currently resolved scales ($\gtrsim 10$ Mm) the corona is found to consist of loops, radiating in X rays, and the regions between loops where
FIG. 9. Comparison between a photospheric magnetogram (top) and a soft X-ray image (bottom), published by Golub et al. (1977). Panel labels: 1520-1620 KPNO; 21 August 1973, 1622 UT. The white and black parts of the magnetogram are regions of opposite polarity. Where opposite polarities lie close to each other, there are often bright points in the X-ray image. The X rays are emitted by loops of magnetic field in the corona; the footprints of the loop appear as neighboring opposite polarities in the magnetogram. Note that there is little coronal X-ray emission outside the quiet-Sun loops and the active regions to the left of the field of view.
the X-ray flux is much weaker (Vaiana and Rosner, 1978). As shown in Fig. 9, these X-ray-emitting loops are intimately connected with looplike structure in the coronal magnetic fields. This connection occurs on all scales (McIntosh et al., 1976); large areas with no large-scale loops form the X-ray voids known as coronal holes (Zirker, 1977). All of these structures constantly evolve with time; small X-ray bright points decay in a few hours, while large coronal holes vary over intervals of weeks. As Vaiana and Rosner (1978) have noted, there are no well-defined classes of coronal features in either the temporal or spatial domains. Coronal structures cover quasi-continuously the entire range from flares to coronal holes.

This picture of the corona, based almost entirely on the new views provided by imaging X-ray telescopes, is radically different from the old picture of the corona as a radially symmetric structure with only weak inhomogeneities such as streamers. The changed picture is naturally leading to a complete revision of the theory of the corona. Older models for coronal structure and energy balance (see Athay, 1976, for summary) were based on the principles outlined in Section III,B: generation of a flux of mechanical energy in the convection zone, propagation of waves through the photosphere and chromosphere, and dissipation by shock formation in the corona. This model has become rather unattractive, partly because of difficulties in the development of a quantitative theory, and partly because space observations have placed stringent constraints on any wave energy flux into the corona (Athay and White, 1979).
The obvious importance of magnetic fields in structuring the temperature and density distributions in the corona has led to the development of alternative models in which the corona consists of magnetically confined plasma loops, stabilized in essentially magnetostatic structures that are rooted on slowly evolving photospheric footpoints, and heated by dissipation of magnetic energy in anomalous plasma processes (Rosner et al., 1978a,b). It is clear that the corona is the seat of a great variety of interesting physical phenomena, which will be understood only after a great deal more observational and theoretical work.
D. The Solar Wind
Material is constantly streaming from the Sun, past the orbit of the Earth, and out to the interstellar medium. Man-made satellites located outside the Earth’s magnetosphere provide a wealth of data from in situ measurements, most from near the Earth, but some from extremely distant regions of the solar system. Recent observations have shown that winds (or mass loss) appear to be a very common property of stars (Conti, 1978). While it is probable that stellar winds are not produced solely by the detailed processes
responsible for the solar wind, studies of the solar wind clearly are a useful prelude to investigations of the less well observed stellar analogs. As with most other aspects of solar physics, new observational and theoretical ideas are revolutionizing our picture of the solar wind, and a satisfactory description of all of the basic physical processes now known to be important is not yet available.

The history of early studies of the solar wind has been reviewed by Parker (1963) and Hundhausen (1972). The orientation of comet tails, and the evidence of a solar influence on the geomagnetic field, hinted at the existence of an interplanetary medium. Chapman developed a theory for such a medium based on hydrostatic equilibrium and a conduction-dominated energy equation. Parker realized that Chapman’s model predicted a boundary pressure that could not be matched to the low pressure of the interstellar medium, and concluded that the interplanetary medium must be hydrodynamically expanding. Parker’s (1963) mathematical model was based on the equations describing the conservation of mass and momentum in a spherically symmetric steady flow:

$$\frac{1}{r^2}\frac{d}{dr}\left(\rho r^2 u\right) = 0$$

$$\rho u \frac{du}{dr} = -\frac{2kT}{m}\frac{d\rho}{dr} - \frac{G M_\odot \rho}{r^2}$$

where $\rho$ is the density, $u$ the radial velocity, $T$ the temperature, $G$ the gravitational constant, and $M_\odot$ the solar mass. These equations may be combined to produce a version of the “solar-wind equation,”

$$\frac{1}{u}\frac{du}{dr}\left(u^2 - q^2\right) = \frac{2q^2}{r} - \frac{G M_\odot}{r^2} \qquad (26)$$

where $q = (2kT/m)^{1/2}$ is the one-dimensional proton thermal velocity of the plasma. It was known that coronal temperatures close to the Sun were $10^6$ K, and that thermal conductivity would maintain high temperatures far out in the wind. Given these conditions as a basis for an approximate energy equation, Parker showed that the relevant solution of (26) involves subsonic expansion velocities inside a “critical point”

$$r_c = G M_\odot / 2q^2 \qquad (27)$$
and supersonic expansion outside this point. Satellite observations revealed a supersonic wind at the Earth, confirming Parker’s model.

Refinements of Parker’s model included the introduction of an interplanetary magnetic field, the adoption of a multifluid description of the solar-wind plasma, and the use of a detailed energy equation to permit a self-consistent determination of $T_e$ in the wind. Hundhausen (1972) has summarized these refinements to the theory of the solar wind. The aim of these studies was generally to explain the observed properties of a hypothetical “mean” solar wind, although from the very beginning of space observations the extreme variability of the wind was known.

Our view of the solar wind has been revolutionized by the recognition that most, if not all, of the solar wind emanates from regions in the solar corona where the large-scale magnetic field configuration involves field lines that open into the interplanetary medium. In particular, the observation that the high-speed streams and associated sector structure observed at Earth originate in coronal holes (Hundhausen, 1977) has necessitated a major revision in the inner boundary conditions of solar-wind modeling. Because the streams originate in coronal holes, an areal reduction factor must be introduced in projecting stream properties back to the Sun, and the energy, momentum, and mass flux requirements are significantly increased relative to those deduced in mean models. For example, it has often been argued that the solar wind represents a negligible energy sink for the corona, but it is now believed that, on the contrary, the coronal holes may be cool primarily as a result of solar-wind energy losses.

As Hollweg (1978) has emphasized, solar-wind theorists face their greatest difficulties in explaining the high-speed streams; moreover, these streams may represent the “normal” mode of solar mass loss, not previously recognized as such merely because of Earth’s location near the solar equatorial plane. The properties of high-speed streams at the Earth that should be explained by a model include a velocity of $\approx 700$ km sec$^{-1}$, proton temperature $T_p = 2 \times 10^5$, electron temperature $T_e = 1 \times 10^5$ K, and electron density $N_e \approx 5$ cm$^{-3}$. The classical solar-wind theory cannot explain such a high velocity, or the relatively small difference between $T_p$ and $T_e$, and a number of major modifications have been proposed. As reviewed by Hollweg (1978) and Zirker (1977), these include modifications of the theory of electron heat conduction and proton-electron energy exchange to include collective plasma processes, the extension of the heated region to several solar radii, the introduction of expanding geometry to mimic the coronal hole configuration, and the addition of wave pressure as a vehicle for direct momentum transfer to the wind near the Sun. These kinds of modifications lead to a much better fit to observations, but raise a large number of unsolved problems regarding both the primary “cause” of the solar wind and the detailed description of the microprocesses whose effects have usually been treated in a phenomenological way.
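To make the scale of the classical model concrete, the following is a minimal numerical sketch of the isothermal solution discussed earlier in this section. It evaluates the thermal speed $q$, the critical point of Eq. (27), and the transonic wind speed at a chosen distance. The constant-temperature assumption, the use of the proton mass for $m$, the numerical constants, and all function names are assumptions made for this illustration; they are not taken from the text.

```python
import math

# Physical constants in SI units (standard values, not taken from the text).
K_B = 1.380649e-23      # Boltzmann constant, J/K
M_P = 1.67262192e-27    # proton mass, kg (assumed value of m)
GM_SUN = 1.32712440e20  # G * M_sun, m^3/s^2
R_SUN = 6.957e8         # solar radius, m
AU = 1.495978707e11     # astronomical unit, m


def thermal_speed(temperature_k):
    """q = (2kT/m)^(1/2) in m/s for an isothermal wind at temperature_k."""
    return math.sqrt(2.0 * K_B * temperature_k / M_P)


def critical_radius(temperature_k):
    """Critical point r_c = G M_sun / (2 q^2), in metres."""
    return GM_SUN / (2.0 * thermal_speed(temperature_k) ** 2)


def transonic_speed(r, temperature_k):
    """Wind speed (m/s) at radius r on the solution through the critical point.

    Integrating the isothermal wind equation with u = q at r = r_c gives
        x - ln(x) = 4 ln(r/r_c) + 4 r_c/r - 3,   where x = (u/q)^2.
    The root is found by bisection: x < 1 inside r_c, x > 1 outside.
    """
    q = thermal_speed(temperature_k)
    r_c = critical_radius(temperature_k)
    rhs = 4.0 * math.log(r / r_c) + 4.0 * r_c / r - 3.0
    supersonic = r >= r_c
    lo, hi = (1.0, 1.0e4) if supersonic else (1.0e-300, 1.0)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        too_large = (mid - math.log(mid) - rhs) > 0.0
        # x - ln(x) increases with x for x > 1 and decreases for x < 1.
        if too_large == supersonic:
            hi = mid
        else:
            lo = mid
    return q * math.sqrt(0.5 * (lo + hi))


if __name__ == "__main__":
    T = 1.0e6  # coronal temperature quoted in the text, K
    print("q   =", thermal_speed(T) / 1e3, "km/s")
    print("r_c =", critical_radius(T) / R_SUN, "solar radii")
    print("u(1 AU) =", transonic_speed(AU, T) / 1e3, "km/s")
```

With these assumed inputs the critical point falls at roughly six solar radii and the speed at 1 AU comes out somewhat below 500 km sec$^{-1}$, which illustrates why the $\approx 700$ km sec$^{-1}$ high-speed streams are difficult for the classical theory.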
IV. SOLAR ACTIVITY

A. Origins of Solar Activity
“Solar activity” is a general term used to describe the characteristic response of the solar atmosphere to the presence of a relatively compact region of unusually strong magnetic fields. Manifestations of solar activity include slowly evolving phenomena such as sunspots and coronal loops, and rapidly evolving phenomena such as flares and coronal transients. The driving force of solar activity is magnetic fields, whose atmospheric structure is determined by magnetohydrodynamic processes occurring within the solar interior. These processes not only produce solar magnetic fields by dynamo action, but also control the characteristic large-scale patterns of active region development and produce the small-scale atmospheric magnetic configurations whose energy drives such violent phenomena as flares.

Although solar active regions each exhibit unique evolutionary and structural patterns, there are a certain number of characteristic properties shared by all. These properties are of particular interest to theories of the origins of solar activity, since they presumably represent observable consequences of the interior processes essential for the production of activity. These systematic properties of solar activity include: (1) the 22 year cycle of the degree of activity, coupled with a reversal of the solar magnetic field every 11 years; (2) the characteristic pattern of emergence of solar-active regions reflected in the “butterfly” diagram; (3) the tendency for active regions to form a pattern of compact leader polarity and dispersed follower polarity with the associated motions required to produce this pattern; and (4) the historical evidence for gross irregularities in the solar cycle, such as the Maunder minimum (Eddy, 1978). General properties of solar activity are reviewed in IAU Symposium 71 (Bumba and Kleczek, 1976).
Theoretical models that seek to explain these properties are generally based on dynamo action, although Piddington (1976) and a few other solar physicists have steadfastly adopted the view that dynamo theory is fundamentally irrelevant. As summarized by Weiss (1971), the essence of most dynamo theories is the following sequence: (1) a predominantly poloidal field at the minimum of the solar cycle, with, say, positive polarity at the north pole; (2) generation of a toroidal field from this poloidal field by differential rotation; (3) expulsion of flux from the solar interior under the influence of buoyancy forces and convective motions, with a Coriolis-induced helical motion of the convection leading to the generation of reverse polarity poloidal field; (4) dispersal of the flux by eddy diffusion, with a residual reversed-polarity poloidal field concentrated toward the poles; and (5) repetition of this sequence to give the full 22-year cycle. It should be clear that large-scale flows in the solar envelope, such as differential rotation and convection, are of central importance in dynamo action of this kind.

Quantitative dynamo models are generally based on a version of the dynamo equation, which is essentially the following:
$$\partial \mathbf{B}/\partial t = \operatorname{curl}(\mathbf{v} \times \mathbf{B} + \alpha \mathbf{B}) - \eta \operatorname{curl} \operatorname{curl} \mathbf{B} \qquad (28)$$

Here $\mathbf{B}$ and $\mathbf{v}$ are, respectively, the magnetic and velocity vector fields, $\alpha$ a coefficient representing the generation of an electric field by nonaxisymmetric (cyclonic) convection, and $\eta$ the turbulent magnetic diffusivity. Although $\alpha$ and $\eta$ represent effects whose action can be understood in phenomenological terms, and whose approximate form can be derived by a stochastic analysis of the magnetogasdynamic equations (Krause and Rädler, 1971), their numerical scalar values as used in quantitative dynamo models can be derived only from attempts to fit models to observation.

In a kinematic dynamo model the value of $\mathbf{v}$ is specified a priori. An often-used example is a law of depth-dependent differential rotation of the form

$$\mathbf{v} = (v_r, v_\theta, v_\phi) = [0, 0, r\,\Omega(r,\theta) \sin\theta]$$

Here $\Omega(r,\theta)$ is the angular velocity at depth $r$ and latitude $\theta$. Studies of such models with specified $\alpha(r,\theta)$ and $\Omega(r,\theta)$ show that if $\alpha > 0$ in the northern hemisphere, and $\alpha < 0$ in the southern hemisphere (as expected for Coriolis forces), and if $\partial\Omega/\partial r < 0$ (the angular velocity increases inward), then cyclic solar dynamo models can be constructed. By suitably adjusting the numerical values of $\alpha$, $\eta$, and $\Omega$, such observable properties as the period of the cycle, the latitude limits and shape of the butterfly diagram, and the evolution of solar polar fields can be satisfactorily explained. Unfortunately the parameters $\alpha$, $\Omega$, and $\eta$ cannot be reliably determined from theory, so that we are in the uncomfortable position of not having an independent verification of the values derived by this fitting procedure. The prospects for measuring $\Omega(r,\theta)$ deep in the convection zone directly from observations of p-mode rotational splitting (Section II,B,3) are particularly exciting for dynamo theorists.

It is an extremely difficult problem to develop a dynamical dynamo model, in which the velocity and magnetic fields are determined self-consistently. An essential first step to the solution of this problem is the provision of a satisfactory model of solar differential rotation sans magnetic fields. As described in Section II,B,2, such theories are not yet available, although the problem is currently the subject of intensive research. Dynamical dynamo models would be of particular value in exploring the systematics of stellar activity cycles (Wilson, 1978).
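The way the fitted parameters control whether a kinematic model is cyclic can be illustrated with a much simpler system than the spherical models referred to above. The sketch below assumes a local, one-dimensional Cartesian "dynamo wave" reduction of Eq. (28), with a poloidal potential $A$ and toroidal field $B$ obeying $\partial A/\partial t = \alpha B + \eta\,\partial^2 A/\partial x^2$ and $\partial B/\partial t = G\,\partial A/\partial x + \eta\,\partial^2 B/\partial x^2$, where $G$ stands in for the rotational shear. This reduction, the symbol $G$, the wavenumber $k$, and the function names are assumptions for illustration only. For solutions proportional to $\exp(\sigma t + ikx)$ the two equations give $(\sigma + \eta k^2)^2 = i k \alpha G$.

```python
import cmath


def dynamo_number(alpha, shear, eta, k):
    """Dimensionless dynamo number D = alpha * shear / (eta^2 * k^3)."""
    return alpha * shear / (eta ** 2 * k ** 3)


def dynamo_wave_rate(alpha, shear, eta, k):
    """Complex rate sigma of the faster-growing mode exp(sigma*t + i*k*x).

    Solves (sigma + eta*k^2)^2 = i*k*alpha*shear. cmath.sqrt returns the
    root with non-negative real part, which is the branch that can grow.
    The real part is the growth rate, the imaginary part the wave frequency.
    """
    return -eta * k ** 2 + cmath.sqrt(1j * k * alpha * shear)


if __name__ == "__main__":
    for alpha in (1.0, 2.0, 4.0, -4.0):
        sigma = dynamo_wave_rate(alpha, shear=1.0, eta=1.0, k=1.0)
        print(alpha, dynamo_number(alpha, 1.0, 1.0, 1.0), sigma)
```

In this toy model the wave grows only when $|D| > 2$, it is always oscillatory, and the sign of $\alpha G$ sets the sign of the frequency and hence the direction in which the wave travels. This is the local analog of the sign conditions on $\alpha$ and $\partial\Omega/\partial r$ quoted above, not a substitute for the full spherical calculation.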
An assumption of most quantitative solar dynamo models is that the magnetic field can be represented by a mean state with weak fluctuations superimposed (Krause and Rädler, 1971). However, the magnetic fields observed at the solar surface are characterized by an enormous degree of inhomogeneity, from the small flux elements of the quiet sun to the large fields of sunspots. Such inhomogeneity (or “intermittency,” in the jargon of hydrodynamicists) is probably a natural consequence of the tendency for convective eddies to expel flux when the conductivity is very high (Weiss, 1966). The solar interior is thus probably threaded by a tangle of connected, intense magnetic flux “ropes,” a picture markedly different from that of a mean field with mild “turbulent” perturbations as required by conventional theory. The difference between these pictures leads to important differences in theoretical modeling. For example, intense flux tubes in magnetostatic equilibrium are almost empty, and are consequently subject to very strong buoyancy forces. Unless modifications are introduced into a model, it is difficult to avoid the complete expulsion of flux from the solar envelope in a period of a few months. It appears that a major advance in the theory of the origin of solar magnetic fields will come with the introduction of what might be called a “flux-rope dynamo.”

B. Slowly Varying Solar Activity

Active regions exhibit several characteristic patterns of spatial and temporal evolution, although there are also great individual differences between active regions. The phases of active region evolution are basically: (1) the process of flux emergence and initial atmospheric response, (2) the adjustment of active region structure to rearrangements of the subphotospheric field and/or metastable atmospheric field configurations, and (3) the dissolution of active regions.

An active region begins with the appearance of a localized bipolar magnetic field in the photosphere.
The amount of flux penetrating the photosphere increases steadily, and the chromosphere and corona are violently disturbed. Magnetic flux in the photosphere is initially associated with “pores,” small, dark structures with a field strength $B \approx 2.5$ kG and a flux 2 x 10’’ $\lesssim \Phi \lesssim$ 5 x 10’’ Mx. Over a period of several hours, some of these pores coalesce to form a sunspot, which develops a penumbra if it becomes large enough. Figure 7 shows an active region in the earliest stage of development. There are many pores, and the major spot has not yet developed a penumbra. The granulation field is markedly disturbed in the region where flux is emerging. The field strength in a sunspot may be as large as 4 kG, and the flux as
large as 5 x Mx. During the time that flux is emerging, the orientation of the field slowly changes until the "correct" polarities of leader (L) and follower (F) spots are obtained; at the same time, the L and F polarity regions begin to separate in longitude.

During this phase of flux emergence, the chromosphere and corona are intensely heated. A characteristic pattern of chromospheric loops known as an arch filament system is the signature of a new active region: the arches presumably represent loops of magnetic field rising through the chromosphere. Coronal observations in X rays also reveal a system of loops associated with a developing active region. Within a few hours of creation, these loops begin to connect to neighboring active regions. Detailed descriptions of the development of active regions can be found in standard texts, in the IAU Symposia 43 (Howard, 1971) and 71 (Bumba and Kleczek, 1976), and in Zwaan (1978).

The developing phases of an active region can be understood (in general terms) by assuming that the magnetic flux that forms the active region originally lies beneath the photosphere in the form of a "frayed rope" of relatively concentrated flux (Vrabec, 1974; Piddington, 1976). Strands of the rope emerge to produce pores and arch filaments, and as the flux continues to rise through magnetic buoyancy the body of the flux rope emerges to produce sunspots and large coronal arches. The chromospheric and coronal heating during flux emergence may be related to the dissipation of currents produced by the motions of the emerging flux. This description is of course only phenomenological; detailed models have been proposed, but the great variety of phenomena associated with flux emergence have so far precluded the development of a satisfactory quantitative description.

The mature phases of active region evolution are characterized by relatively slow overall development, punctuated by violent processes such as flares.
One of the most intriguing problems of active region studies is the question of the structure and stability of sunspots (Parker, 1979). Because sunspots are cool and relatively static structures, it seems clear that the intense field strengths are maintained by transverse magnetostatic equilibrium. Crudely, this may be described by the balance of external gas pressure and internal gas and magnetic pressure:

$$P_{\mathrm{in}} + B^2/8\pi = P_{\mathrm{out}} \qquad (30)$$
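As a rough numerical illustration of Eq. (30), the sketch below evaluates the magnetic pressure $B^2/8\pi$ in cgs units for the field strengths quoted in this section, and inverts the balance for the field that a given pressure deficit can confine. The function names are hypothetical, and any gas pressures passed in are inputs to be supplied by the reader; none are taken from the text.

```python
import math


def magnetic_pressure(b_gauss):
    """Magnetic pressure B^2 / (8 pi) in dyn cm^-2, for B in gauss (cgs)."""
    return b_gauss ** 2 / (8.0 * math.pi)


def balancing_field(p_out, p_in):
    """Field strength (gauss) satisfying P_in + B^2/(8 pi) = P_out.

    Both pressures are in dyn cm^-2; p_in must not exceed p_out.
    """
    if p_in > p_out:
        raise ValueError("internal pressure exceeds external pressure")
    return math.sqrt(8.0 * math.pi * (p_out - p_in))


if __name__ == "__main__":
    for b in (2500.0, 4000.0):  # pore and large-sunspot fields from the text
        print(b, "G ->", magnetic_pressure(b), "dyn cm^-2")
```

A 4 kG field corresponds to a magnetic pressure of about $6.4 \times 10^5$ dyn cm$^{-2}$, which is the size of the gas-pressure deficit that must exist between the spot interior and its surroundings at the same geometrical depth.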
Vertical hydrostatic equilibrium, combined with the reduced scale height in the cool spot, implies that $P_{\mathrm{in}} < P_{\mathrm{out}}$; the difference is supported by magnetic pressure. The density and temperature inside the spot are reduced, so that the photosphere inside the spot is located about 700 km deeper than the normal photosphere. This is known as the Wilson depression; it can be seen directly by looking at sunspots near the solar limb. While a magnetostatic sunspot model is found to satisfactorily explain
the observed photospheric structure, there are a number of unresolved problems. Two of the most important are: Why are sunspots cool? Why are sunspots stable? It has long been thought that sunspots are cool because the magnetic field prevents normal convective flows. But Parker (1974) argued that inhibition of convection will lead to instability; he proposed that spots are cool because they emit an intense flux of Alfvén waves. The existence of such a wave flux was earlier postulated to account for the fact that sunspots do require some energy to maintain umbral radiation losses, which cannot be provided by radiative energy transfer. While there are firm observational constraints on the upward flux of mechanical energy in sunspot umbrae, there are also sound theoretical arguments suggesting that the strong density gradient in the sunspot atmosphere will reflect Alfvén waves downward. Thus the whole subject of the energy balance of sunspots is an unsolved problem.

Similarly, very little is understood about sunspot stability. Parker’s work has revealed a number of potential instabilities, of both the thermal equilibrium and the structural equilibrium. For example, the flared field of sunspots is potentially unstable to the fluting instability, although gravitational forces may stabilize this configuration. In addition, a cool sunspot will block heat flow at its base, building up the temperature gradient and hence producing a potential thermal instability. As Parker has emphasized, it is a sobering thought that such an apparently simple structure as a sunspot is so poorly understood.

Chromospheric and coronal heating is a characteristic property of active regions. Chromospheric heating is manifested in plages or faculae, bright regions apparently composed of a conglomeration of individual magnetic flux elements very like the elements that form the filigree and network of the quiet sun.
The physical processes responsible for heating in the presence of these flux elements are unknown: possibly they are related to the dissipation of magnetohydrodynamic waves, or to the dissipation of electrical currents. The problem of the energy balance of coronal loops in active regions is similarly unsolved. A lot of theoretical work is being devoted to the investigation of “anomalous” plasma processes as the origin of coronal loop heating (e.g., Rosner et al., 1978a). These studies represent a major departure from classical ideas concerning nonradiative heating in the solar atmosphere and are of great general interest insofar as they relate to laboratory studies of plasma processes. Alfvén (1975) has provided some fascinating comments on this changed view of this aspect of solar physics.

Active regions decay by erosion of magnetic flux. The simple picture of this process involves the gradual entrainment of the flux at the borders of the active region structures into the turbulent convective field (supergranulation and granulation) of the quiet sun. Because the ohmic dissipation time for solar fields is very long ( % 1 year), it has been suggested that the magnetic
field of the entire quiet Sun originates in active regions, being eventually dispersed in a "random walk" by convection. However, the appeal of this picture has been somewhat quenched by the recent discovery that flux emerges in ephemeral active regions (Fig. 9) at a rate comparable to the rate of emergence in the large active regions of the active zones. This flux emerges rather uniformly over the whole solar surface, and moreover the rate of emergence in ephemeral active regions appears to increase at sunspot minimum to such a level that the net rate of flux emergence is roughly constant throughout the solar cycle (Golub et al., 1977). This discovery may have far-reaching consequences in such diverse aspects of solar physics as dynamo theory and the theory of coronal heating.

C. Explosive Solar Activity

Although rapid and violent processes can be seen at times in the quiet sun (e.g., filament disruption), the most spectacular events are associated with solar activity. These events range from small features such as Ellerman bombs to major flares that produce intense disruptions of the chromosphere, corona, and interplanetary medium. It is probable that essentially all of these "explosive" aspects of solar activity are ultimately due to the conversion of energy stored in magnetic and electric fields into kinetic energy (thermal and nonthermal) of particles. Theoreticians have devised a host of conversion mechanisms, and dialog between solar physicists and laboratory plasma physicists is continually enriching the study of physical processes in violent solar phenomena. In this section we discuss only solar flares, since they are the most spectacular form of explosive solar activity, and as such are the best studied.

Solar flares can be observed in all parts of the electromagnetic spectrum, from long radio waves to gamma rays. Moreover, the matter ejected in the more violent solar flares can be studied directly by satellites in the interplanetary medium.
A comprehensive description of observations of solar flares is provided in Svestka's (1976) monograph. Let us summarize the appearance of solar flares in various parts of the spectrum, and theoretical ideas regarding the origin of the flare radiation.

The X-ray spectrum of flares can be conveniently divided into soft ($\lambda > 1$ Å) and hard ($\lambda < 1$ Å, $h\nu \gtrsim 10$ keV) components. The soft component consists of a line spectrum and a continuum component. The line spectrum contains transitions such as the He-like $1s^2$-$1s2p$ transition of Fe XXV at 1.85 Å, He-like transitions of S XV at 5 Å, the Lyman-$\alpha$ transition of Mg XII at 8.4 Å, and numerous other transitions of very highly excited ionic species. The continuum spectrum is mainly due to bremsstrahlung and free-bound recombination. In most flares the soft X-ray spectrum evolves
smoothly, rising to a maximum in a few minutes, and decaying in an interval of about one-half hour. The application of X-ray spectroscopic diagnostics to observed soft X-ray spectra shows that a typical flare source has a temperature $T_e \approx 2 \times 10^7$ K, an electron density $N_e = 3 \times$ 10” cm$^{-3}$, and an emitting volume of $(10^9\ \mathrm{cm})^3$, although there is strong evidence for the existence of a wide range of physical conditions in flare X-ray sources. Weak microwave (3 mm-20 cm) radiations correlated with soft X-ray flare emission are thought to be produced by bremsstrahlung under similar physical conditions.

Some flares emit a hard X-ray spectrum consisting of a “power law” continuum and on occasions a gamma-ray emission line spectrum. The origin of the continuum is unclear; it may be due to emission from inhomogeneous plasma with temperatures ranging up to $10^8$ K, or it may be due to bremsstrahlung, synchrotron radiation, or Compton scattering from directed beams of high-energy electrons. Experiments during the next solar maximum should decide the question. The gamma-ray lines are produced by nuclear reactions. In particular, a strong line at 2.23 MeV produced by neutron capture on protons to produce deuterons has been observed by Chupp et al. (1973). The energies of nuclear particles participating in these reactions must be in excess of 30 MeV; mechanisms for accelerating heavy particles to this energy (such as the Fermi mechanism; see Smith, 1974) have been discussed, but it is clear that the whole problem of these extremely energetic flare processes requires more data and more work on the fundamental aspects of high-energy plasma processes.

Most of the flare radiation emitted in the spectral interval between soft X rays and microwaves is thermally excited line and continuum radiation emitted from plasma with temperatures in the range $6 \times 10^3$-$10^7$ K. The radiation is emitted from the low corona, transition zone, and chromosphere.
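The soft/hard division of the flare X-ray spectrum quoted above is stated both as a wavelength (1 Å) and as a photon energy (about 10 keV). A minimal sketch of the conversion $h\nu = hc/\lambda$ shows how the two statements relate and where the quoted emission lines fall. The constant $hc \approx 12.398$ keV Å is a standard value, and the function names and the simple 1 Å classification rule are assumptions for illustration.

```python
HC_KEV_ANGSTROM = 12.398419843  # h*c in keV * Angstrom (standard value)


def photon_energy_kev(wavelength_angstrom):
    """Photon energy in keV for a wavelength given in Angstrom."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


def xray_band(wavelength_angstrom):
    """Classify as 'soft' (longer than 1 Angstrom) or 'hard' (otherwise)."""
    return "soft" if wavelength_angstrom > 1.0 else "hard"


if __name__ == "__main__":
    # Wavelengths (Angstrom) of the lines named in the text.
    lines = {"Fe XXV": 1.85, "S XV": 5.0, "Mg XII Lyman-alpha": 8.4}
    for name, wavelength in lines.items():
        print(name, xray_band(wavelength), photon_energy_kev(wavelength), "keV")
    print("1 Angstrom boundary:", photon_energy_kev(1.0), "keV")
```

The 1 Å boundary corresponds to about 12.4 keV, consistent with the approximate 10 keV threshold given for the hard component, and all three named lines lie in the soft band at a few keV.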
A comparison between flare models and quiescent active region models (e.g., Muchado et al., 1978) shows that the downward displacement of the transition zone in a flare can explain many of the observed characteristics of these thermal emissions. A vast amount of data exists on the appearance and evolution of flares in Hα (e.g., Zirin and Tanaka, 1973), but the mechanism that leads to Hα emission in flares is poorly understood. For example, it is not known whether chromospheric and upper photospheric heating in flares is due to electron or proton beams, X rays, or radiative conductivity. The study of radio emission from solar flares has yielded information on both the nonthermal processes occurring at the site of the flare, and on the blast wave that moves out through the corona following a large flare (Rosenberg, 1976). Type II radio bursts probably originate from plasma waves excited by nonthermal electrons accelerated in an MHD shock traveling away from a flare. Type III radio bursts are also thought to result from
plasma waves, generated by “beams” of electrons. The major problem facing this mechanism is that an electron beam should be very rapidly damped by collective plasma effects, but type III burst electron beams are observed even at Earth. Svestka (1976, pp. 211-213) has summarized proposed solutions to this problem, but none is very appealing. Type IV bursts, which are usually associated with large flares, occur in a great variety of forms. The general source of type IV emission is probably gyrosynchrotron radiation produced by an electron beam interacting with a magnetic field: changes in beam properties and/or magnetic field configurations will produce the diverse kinds of type IV emission. Microwave emission (3-10 cm) associated with type IV bursts is thought to be due to nonthermal electrons with energies of several tens of keV: these electrons are also responsible for hard X rays. As summarized by Svestka (1976, p. 178), there is a major discrepancy between the source properties inferred from hard X rays and from microwaves. Smith (1974) has given a detailed account of the various ways in which particles and fields can be excited in solar flares: his account is certainly incomplete, but nevertheless clearly illustrates the complexity of plasma processes in flares. A long-standing and fundamental problem in the theory of solar flares has been the mechanism for conversion of magnetic energy into kinetic (thermal and nonthermal) energy of particles. The magnetic energy itself can be slowly stored in a nonpotential configuration in the chromosphere and corona. This energy is ultimately provided by subphotospheric convective motions that move the subphotospheric magnetic fields; these motions generate currents and nonpotential fields in the low-β chromosphere and corona. The nonpotential configuration can be metastable in the sense that even though it can explosively relax to a configuration of lower energy, it requires a finite disturbance (“trigger”) to begin.
Solar flares result from energy released during the transition phase following the triggering of a metastable configuration. The energy release presumably involves the dissipation of current systems in the solar atmosphere, but the classical resistivity of coronal material is very low. Thus, the rapid deposition of energy in a flare requires an anomalously high resistivity, such as that provided by plasma turbulence (Rosner et al., 1978a). Such plasma turbulence can be driven by sufficiently large induced currents, and can both accelerate particles (the Fermi mechanism) and randomize microscopic energy distributions to produce a hot, thermalized plasma. An interesting solar flare model that incorporates these various processes has been described by Spicer (1977); a reader of Spicer's work will be struck by the cross flow of ideas between theoretical flare studies and problems at the forefront of terrestrial plasma physics research.
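The statement that classical resistivity is far too low for a flare can be made semi-quantitative with a rough estimate. The sketch below is an order-of-magnitude illustration only and is not taken from the chapter. It assumes the commonly tabulated Spitzer approximation η ≈ 1.03 × 10⁻² Z lnΛ T^(-3/2) Ω cm (T in eV), a Coulomb logarithm of 20, a coronal temperature of 2 × 10⁶ K, and a length scale of 10⁹ cm (the source size quoted earlier); all of these inputs and the function names are assumptions chosen for illustration.

```python
import math

# Illustrative sketch (assumed inputs, not from the source text):
# classical resistive diffusion time of a coronal current system.

K_B_EV_PER_K = 8.617e-5      # Boltzmann constant in eV/K
MU_0 = 4.0e-7 * math.pi      # vacuum permeability in H/m


def spitzer_resistivity_ohm_m(temperature_k, coulomb_log=20.0, charge_z=1.0):
    """Classical (Spitzer) transverse resistivity in ohm*m.

    Uses the tabulated approximation
    eta ~ 1.03e-2 * Z * lnLambda * T_eV**-1.5 ohm*cm.
    """
    t_ev = K_B_EV_PER_K * temperature_k
    eta_ohm_cm = 1.03e-2 * charge_z * coulomb_log * t_ev ** -1.5
    return eta_ohm_cm * 1.0e-2  # convert ohm*cm to ohm*m


def resistive_diffusion_time_s(length_m, resistivity_ohm_m):
    """Diffusion time tau = mu_0 * L**2 / eta for a structure of size L."""
    return MU_0 * length_m ** 2 / resistivity_ohm_m


if __name__ == "__main__":
    eta = spitzer_resistivity_ohm_m(2.0e6)         # assumed coronal temperature
    tau = resistive_diffusion_time_s(1.0e7, eta)   # 1e9 cm expressed in metres
    flare_time_s = 100.0                           # assumed: "a few minutes"
    print(f"eta ~ {eta:.2e} ohm m, tau ~ {tau:.2e} s")
    print(f"tau / flare time ~ {tau / flare_time_s:.1e}")
```

With these assumed inputs the classical diffusion time exceeds the observed rise time by many orders of magnitude, which is the gap that an anomalous resistivity (or a much smaller dissipation scale) has to close.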
ACKNOWLEDGMENTS

Drs. F.-L. Deubner, Golub, J. P. Mehltretter, and the Editors of Astronomy and Astrophysics and Solar Physics graciously gave me permission to use copyright material. Dr. B. Durney made many valuable comments in reviewing the chapter. Special thanks are due to Ms. Christy Ott for her assistance in preparing the manuscript.
REFERENCES

Alfven, H. (1975). In “Role of Magnetic Fields in Physics and Astrophysics” (V. Canuto, ed.), p. 179. N. Y. Acad. Sci., New York.
Ando, H., and Osaki, Y. (1975). Publ. Astron. Soc. Jpn. 27, 581.
Athay, R. G. (1976). “The Solar Chromosphere and Corona: Quiet Sun.” Reidel Publ., Dordrecht, Netherlands.
Athay, R. G., and White, O. R. (1979). Astrophys. J. 226, 1135.
Ayres, T. R., and Linsky, J. L. (1976). Astrophys. J. 205, 874.
Bahcall, J. N., and Davis, R. (1976). Science 191, 264.
Bahcall, J. N., and Sears, R. L. (1972). Annu. Rev. Astron. Astrophys. 10, 25.
Beckers, J. M. (1972). Annu. Rev. Astron. Astrophys. 10, 73.
Beckers, J. M., and Canfield, R. C. (1975). “Motions in the Solar Atmosphere,” Tech. Rep. No. AFCRL-TR-0592, Air Force Cambridge Res. Lab., Hanscom Field, Massachusetts.
Bonnet, R. M., and Delache, P., eds. (1976). “The Energy Balance and Hydrodynamics of the Solar Chromosphere and Corona,” IAU Coll. No. 36. Bussac, Clermont-Ferrand.
Bray, R. J., and Loughhead, R. E. (1967). “The Solar Granulation.” Chapman & Hall, London.
Bruzek, A., and Durrant, C. J., eds. (1977). “Illustrated Glossary for Solar and Solar-Terrestrial Physics.” Reidel Publ., Dordrecht, Netherlands.
Bumba, V., and Kleczek, J., eds. (1976). “Basic Mechanisms of Solar Activity,” IAU Symp. No. 71. Reidel Publ., Dordrecht, Netherlands.
Chandrasekhar, S. (1960). “Radiative Transfer.” Dover, New York.
Chiu, H.-Y. (1968). “Stellar Physics,” Vol. 1. Ginn (Blaisdell), Boston, Massachusetts.
Chiuderi, C. (1979). “Small Scale Motions on the Sun,” p. 105. Mitt. Kiepenheuer Institut fur Sonnenphysik, No. 179, Freiburg. (See also Wittman, 1979.)
Christy, R. F. (1962). Astrophys. J. 136, 887.
Chupp, E. L., et al. (1973). Nature (London) 241, 333.
Conti, P. (1978). Annu. Rev. Astron. Astrophys. 16, 371.
Cox, J. P. (1967). See Thomas (1967, p. 3).
Cram, L. E. (1977). Astron. Astrophys. 59, 151.
Cram, L. E. (1978a). J. Quant. Spectrosc. & Radiat. Transfer 20, 305.
Cram, L. E. (1978b). Astron. Astrophys. 67, 301.
Cram, L. E., Beckers, J. M., and Brown, D. R. (1977). Astron. Astrophys. 57, 211.
Davis, R., and Evans, J. C. (1978). See Eddy (1978, p. 35).
Deubner, F.-L. (1976). Astron. Astrophys. 51, 189.
Deubner, F.-L., Ulrich, R. K., and Rhodes, E. J. (1979). Astron. Astrophys. 72, 177.
Dicke, R. H. (1970). Annu. Rev. Astron. Astrophys. 8, 297.
Dicke, R. H. (1972). Astrophys. J. 171, 311.
Dicke, R. H. (1976). Sol. Phys. 47, 475.
Dilke, F. W. W., and Gough, D. O. (1973). Nature (London) 240, 262.
Dunn, R. B., and Zirker, J. B. (1973). Sol. Phys. 33, 281.
Dupree, A. K. (1972). Astrophys. J. 178, 527.
Durney, B. R. (1976). See Bumba and Kleczek (1976, p. 243).
Durney, B. R., and Spruit, H. C. (1979). Astrophys. J. 234, 1067.
Duval, T. L. (1979). Sol. Phys. 63, 3.
Eddy, J. A., ed. (1978). “The New Solar Physics.” Westview, Boulder, Colorado.
Eddy, J. A., and Boornazian, A. A. (1979). Bull. Am. Astron. Soc. 11, 437.
Eddy, J. A., Gilman, P. A., and Trotter, D. E. (1976). Sol. Phys. 46, 3.
Feldman, U., Doschek, G. A., and Mariska, J. T. (1979). Astrophys. J. 229, 369.
Foukal, P. (1972). Astrophys. J. 173, 439.
Foukal, P. (1977). Astrophys. J. 218, 539.
Foukal, P. (1979). Astrophys. J. 234, 716.
Gabriel, A. H. (1976). Philos. Trans. R. Soc. London, Ser. A 281, 339.
Gibson, E. G. (1973). “The Quiet Sun,” NASA SP-303. US Govt. Printing Office, Washington, D.C.
Gilman, P. A. (1976). See Bumba and Kleczek (1976, p. 207).
Gilman, P. A. (1979). Astrophys. J. 231, 284.
Gingerich, O., Noyes, R. W., and Kalkofen, W. (1971). Sol. Phys. 18, 345.
Goldreich, P., and Keeley, D. A. (1977). Astrophys. J. 212, 243.
Golub, L., Krieger, A. S., Harvey, J. W., and Vaiana, G. S. (1977). Sol. Phys. 53, 111.
Gough, D. O. (1977). See Spiegel and Zahn (1977, pp. 15 and 349).
Gough, D. O., and Weiss, N. O. (1976). Mon. Not. R. Astron. Soc. 176, 589.
Graham, E. (1977). See Spiegel and Zahn (1977, p. 151).
Hill, H. A. (1978). See Eddy (1978, p. 135).
Hill, H. A., and Stebbins, R. T. (1975). Ann. N. Y. Acad. Sci. 262, 472.
Hill, H. A., Brown, T., and Stebbins, R. T. (1974). Phys. Rev. Lett. 33, 1497.
Hollweg, J. V. (1978). Rev. Geophys. Space Phys. 16, 689.
Howard, R., ed. (1971). “Solar Magnetic Fields,” IAU Symp. No. 43. Reidel Publ., Dordrecht, Netherlands.
Howard, R. (1978). Rev. Geophys. Space Phys. 16, 721.
Howard, R., and Harvey, J. W. (1970). Sol. Phys. 12, 23.
Hundhausen, A. J. (1972). “Coronal Expansion and Solar Wind.” Springer-Verlag, Berlin and New York.
Hundhausen, A. J. (1977). See Zirker (1977, p. 225).
Iben, I., and Mahaffy, J. (1976). In “Solar and Stellar Pulsation” (A. Cox and R. Dupree, eds.), LA-6544-C, p. 25. Los Alamos Sci. Lab., Los Alamos, New Mexico.
Jefferies, J. T. (1968). “Spectral Line Formation.” Ginn (Blaisdell), Boston, Massachusetts.
Jefferies, J. T., and Thomas, R. N. (1960). Astrophys. J. 131, 695.
Keil, S. L. (1980). Astrophys. J. (in press).
Krause, F., and Radler, K.-H. (1971). See Howard (1971, p. 770).
Kuiper, G. P., ed. (1953). “The Sun.” Univ. of Chicago Press, Chicago, Illinois.
Kuperus, M., and Chiuderi, C. (1976). See Bonnet and Delache (1976, p. 223).
Kurucz, R. L. (1979). Astrophys. J., Suppl. Ser. 40, 1.
Ledoux, P., and Walraven, T. (1958). In “Handbuch der Physik” (S. Flügge, ed.), Vol. 51, p. 353. Springer-Verlag, Berlin and New York.
Liepmann, H. W. (1979). Am. Sci. 67, 221.
Linsky, J. L. (1977). See White (1977, p. 475).
Linsky, J. L., and Avrett, E. H. (1970). Publ. Astron. Soc. Pac. 82, 169.
McIntosh, P. S., Krieger, A. S., Nolte, J. T., and Vaiana, G. S. (1976). Sol. Phys. 49, 57.
Meadows, A. J. (1970). “Early Solar Physics.” Pergamon, Oxford.
Mehltretter, J. P. (1978). Astron. Astrophys. 62, 311.
Menzel, D. H. (1959). “Our Sun.” Harvard Univ. Press, Cambridge, Massachusetts.
Mihalas, D. (1974). Astron. J. 79, 1111.
Mihalas, D. (1978). “Stellar Atmospheres,” 2nd ed. Freeman, San Francisco, California.
Moore, D. W. (1967). See Thomas (1967, p. 405).
Muchado, M. E., Emslie, A. G., and Brown, J. B. (1978). Sol. Phys. 58, 363.
Nelson, G. D., and Musman, S. A. (1977). Astrophys. J. 214, 912.
Newton, H. W., and Nunn, M. L. (1951). Mon. Not. R. Astron. Soc. 111, 413.
Parker, E. N. (1963). “Interplanetary Dynamical Processes.” Wiley (Interscience), New York.
Parker, E. N. (1974). Sol. Phys. 36, 249.
Parker, E. N. (1976). Astrophys. J. 204, 259.
Parker, E. N. (1978). See Eddy (1978, p. 1).
Parker, E. N. (1979). Astrophys. J. 230, 905.
Piddington, J. H. (1976). See Bumba and Kleczek (1976, p. 389).
Pittock, A. B. (1978). Rev. Geophys. Space Phys. 16, 400.
Pneuman, G. W., and Kopp, R. A. (1977). Astron. Astrophys. 55, 305.
Praderie, F., and Thomas, R. N. (1976). Sol. Phys. 50, 333.
Raymond, J. C., and Dupree, A. K. (1978). Astrophys. J. 222, 379.
Rhodes, E. J., Ulrich, R. K., and Simon, G. W. (1977). Astrophys. J. 218, 901.
Rood, R. T. (1977). Mem. Soc. Astron. Ital. 48, 357.
Rosenberg, H. (1976). Philos. Trans. R. Soc. London, Ser. A 281, 461.
Rosner, R., Tucker, W. H., and Vaiana, G. S. (1978a). Astrophys. J. 220, 643.
Rosner, R., Golub, L., Coppi, B., and Vaiana, G. S. (1978b). Astrophys. J. 222, 317.
Roxburgh, I. W. (1976). See Bumba and Kleczek (1976, p. 453).
Scherrer, P. H., et al. (1979). Nature (London) 277, 635.
Schwarzschild, M. (1958). “Structure and Evolution of the Stars.” Princeton Univ. Press, Princeton, New Jersey.
Sears, R. L. (1964). Astrophys. J. 140, 477.
Severny, A. B., Kotov, V. A., and Tsap, T. T. (1976). Nature (London) 259, 87.
Simon, G. W., and Weiss, N. O. (1967). Z. Astrophys. 69, 435.
Simon, G. W., and Zirker, J. B. (1974). Sol. Phys. 35, 331.
Smith, D. F. (1974). In “Coronal Disturbances” (G. Newkirk, ed.), IAU Symp. No. 57, p. 253. Reidel Publ., Dordrecht, Netherlands.
Spicer, D. S. (1977). Sol. Phys. 53, 305.
Spiegel, E. A. (1967). See Thomas (1967, p. 347).
Spiegel, E. A. (1971). Annu. Rev. Astron. Astrophys. 9, 323.
Spiegel, E. A. (1972). Annu. Rev. Astron. Astrophys. 10, 261.
Spiegel, E. A., and Zahn, J.-P. (1977). “Problems in Stellar Convection,” IAU Coll. No. 38. Springer-Verlag, Berlin and New York.
Spruit, H. C. (1974). Sol. Phys. 34, 277.
Spruit, H. C. (1976). Sol. Phys. 50, 269.
Stein, R. A., and Leibacher, J. W. (1974). Annu. Rev. Astron. Astrophys. 12, 407.
Stenflo, J. O. (1976). See Bonnet and Delache (1976, p. 143).
Svestka, Z. (1976). “Solar Flares.” Reidel Publ., Dordrecht, Netherlands.
Thomas, R. N. (1965). “Some Aspects of Nonequilibrium Thermodynamics in the Presence of a Radiation Field.” Univ. of Colorado Press, Boulder.
Thomas, R. N., ed. (1967). “Aerodynamic Phenomena in Stellar Atmospheres,” IAU Symp. No. 28. Academic Press, New York.
Thomas, R. N., and Athay, R. G. (1961). “Physics of the Solar Chromosphere.” Wiley (Interscience), New York.
Ulmschneider, P., and Kalkofen, W. (1973). Sol. Phys. 28, 3.
Ulmschneider, P., and Kalkofen, W. (1977). Astron. Astrophys. 57, 199.
Ulrich, R. K. (1970). Astrophys. J. 162, 993.
Ulrich, R. K., and Rhodes, E. J. (1977). Astrophys. J. 218, 521.
Ulrich, R. K., Rhodes, E. J., and Deubner, F.-L. (1979). Astrophys. J. 227, 638.
Unno, W. (1967). Publ. Astron. Soc. Jpn. 19, 140.
Vauclair, S., Vauclair, G., Schatzmann, E., and Michaud, G. (1978). Astrophys. J. 223, 567.
Vernazza, J. E., Avrett, E. H., and Loeser, R. (1973). Astrophys. J. 184, 605.
Vernazza, J. E., Avrett, E. H., and Loeser, R. (1976). Astrophys. J., Suppl. Ser. 30, 1.
Vaiana, G. S., and Rosner, R. (1978). Annu. Rev. Astron. Astrophys. 16, 393.
Vrabec, D. (1974). In “Chromospheric Fine Structure” (R. G. Athay, ed.), IAU Symp. No. 56, p. 201. Reidel, Dordrecht, Netherlands.
Wagner, W. J. (1975). Astrophys. J. Lett. 198, L141.
Weiss, N. O. (1966). Proc. R. Soc. London, Ser. A 293, 301.
Weiss, N. O. (1971). See Howard (1971, p. 757).
White, O. R., ed. (1977). “The Solar Output and its Variation.” Colorado Assoc. Univ. Press, Boulder.
Wilson, O. C. (1978). Astrophys. J. 226, 379.
Wilson, P. R., and Williams, N. V. (1970). Sol. Phys. 26, 30.
Withbroe, G. L. (1976). See Bonnet and Delache (1976, p. 263).
Wittman, A. (1979). “Small-Scale Motions on the Sun,” p. 29. Mitt. Kiepenheuer Institut fur Sonnenphysik, No. 179, Freiburg.
Zirin, H. (1966). “The Solar Atmosphere.” Ginn (Blaisdell), Boston, Massachusetts.
Zirin, H., and Tanaka, K. (1973). Sol. Phys. 32, 173.
Zirker, J. B., ed. (1977). “Coronal Holes and High Speed Wind Streams.” Colorado Assoc. Univ. Press, Boulder.
Zwaan, C. K. (1979). Sol. Phys. 60, 213.
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 54
Aspects of Resonant Multiphoton Processes*

A. T. GEORGES
Physics Department, University of Toronto, Toronto, Ontario, Canada

AND

P. LAMBROPOULOS
Physics Department, University of Southern California, Los Angeles, California
I. Introduction
II. Formal Theory of Multiphoton Processes
III. The Quantum Theory of Resonant Two-Photon Processes
IV. The Effect of Nonresonant States
V. Higher-Order Processes
VI. Semiclassical Approaches
VII. Multiple Resonances
VIII. Field Statistics and Bandwidth Effects
IX. Experimental Investigations of Resonant Multiphoton Processes
References
I. INTRODUCTION

A general resonant multiphoton process is depicted schematically in Fig. 1. A bound atomic or molecular system in an initial state $|g\rangle$, under the influence of an electromagnetic field containing n photons, undergoes a transition to a final state $|f\rangle$ by absorbing N photons. If the frequency of the photons is such that various combinations of numbers of photons add up to the energy differences between bound states of the atom, the N-photon absorption is referred to as resonant. Each of these intermediate resonant transitions can in general be a multiphoton transition itself. As such it requires for its description the methods of nonresonant multiphoton transitions that have been reviewed elsewhere (Bakos, 1974; Delone, 1975; Lambropoulos, 1976; Eberly and Lambropoulos, 1978). A brief summary of relevant material is also presented in Section II of this article, the central theme of which is the phenomenon of resonance and related effects.

FIG. 1. Schematic representation of a multiphoton process.

* Work supported by National Science Foundation Grant No. PHY78-23812.

In the context of multiphoton processes, the phenomenon of resonance becomes much more complex than its counterpart in the usual weak-field spectroscopy. This difference stems from three main reasons: (a) The resonant transition can be a multiphoton process. (b) The field can be sufficiently strong for the transition to saturate or approach saturation; and a saturated multiphoton transition exhibits features substantially different from those of a single-photon transition. (c) The resonance is but a link in the chain of an overall N-photon process whose dynamics thus become an extremely complicated phenomenon. The first and obvious consequence of an intermediate resonance is that it enhances the overall N-photon process, an effect that has been exploited since the early days of multiphoton processes. In recent years, it has become a very versatile tool in schemes of isotope separation (Solarz et al., 1976; Letokhov, 1978), two-photon spectroscopy, the detection of single atoms via resonant two-photon ionization (Hurst et al., 1975, 1977a,b), harmonic generation (Ward and Smith, 1975; Wang and Davis, 1975; Wallace and Zdasiuk, 1976), and in a number of other processes. Two-photon absorption is itself a resonant process, albeit in a different sense, that has a vast number of applications in high-resolution laser spectroscopy. For a guided tour on
these subjects the reader is referred to the article by Bloembergen and Levenson (1976). As a result of their relevance to many contexts, resonant multiphoton processes have become a rather popular topic in the literature of the last four years or so. It has thus become apparent that there are certain basic aspects pertaining to most resonant multiphoton processes independent of the particular atomic system or scheme under investigation. For example, two-photon-resonant three-photon ionization of a single atom such as sodium, and a similar process of dissociation in a polyatomic molecule exhibit features that can be obtained from remarkably similar models. That is not meant to imply that the complicated process of selective dissociation of something like SF$_6$ does not pose its own problems that defy the models of atomic processes. It is for processes of relatively lower order (than those found in laser dissociation of SF$_6$) that similarities seem to exist in atomic and molecular processes (see, for instance, Parker et al., 1978). Some of the questions that one expects to have answered through the use of models of resonant N-photon processes are: How do the resonances affect the dependence of the overall process on laser intensity? How does the observed signal (be it ionization, fluorescence, or dissociation) develop in time? How do the relaxation properties of the resonant states affect the signal in time as well as in its dependence on laser intensity? What is the effect of the stochastic properties of the radiation? How does the process depend on the frequency of the radiation? One can go on with a number of other aspects about which information is sought. Our purpose here is to present a review of the formulation of such models and their use in the interpretation of experiments. We have attempted to show the interconnections between various theoretical approaches, which, although apparently different, are essentially identical.
It has been our intention to avoid the repetition of material that has been discussed in other reviews, but one can be only partially successful in this effort if the article is to be reasonably self-contained. We hope that the material included herein combined with the references will give the reader a useful overview of the subject. Limitations of space, however, have necessitated the selection of certain topics as well as constraints on the amount of detail devoted to each topic. As an inevitable consequence, and with much regret, we have not been able to discuss adequately worthwhile work by a number of authors. Judging from the trend we have seen in the Soviet literature, it is quite possible that we have left out work quite parallel to the one discussed here, which, however, has not appeared yet in the translated literature due to the usual time lag. In addition to reviewing the progress on the subject, we did attempt to write an article of some pedagogical value for the novice in the field. Our success, or lack of it, will have to be judged by the reader.
II. FORMAL THEORY OF MULTIPHOTON PROCESSES

As long as the incident photon flux is smaller than a characteristic value, which depends on the atom and the photon frequency, an expansion in terms of the interaction $V$ coupling the atom to the field will generally be useful. This does not mean that only the lowest-order nonvanishing term will always suffice. Often a partial summation of infinitely many terms must be performed as is the case in resonant processes, but still we are in the perturbation theory regime in the sense that the expansion in terms of $V$ is meaningful. The basic theory can be cast either in a fully quantum-mechanical form with the radiation field represented by its creation and annihilation operators, or in a semiclassical form with the atom treated quantum mechanically while the field is represented by a time-dependent classical amplitude. Both descriptions are discussed here since both have been employed in the literature. Depending on the specifics of the problem, one may be preferable to the other. Even within each of these two descriptions there are two ways of formulating the problem: in terms of the equations of motion of the amplitudes of the wave function (Schrodinger equation) or in terms of the density matrix. Variants of these have also been employed and are discussed later. In the fully quantized version, the total Hamiltonian of the system “atom plus field” is written as

$$H = H^A + H^R + V \equiv H^0 + V \tag{1}$$
where $H^A$ is the Hamiltonian of the free atom, $H^R$ the Hamiltonian of the free field, and $V$ the interaction between the two. All Hamiltonians are here assumed divided by $\hbar$ and thus all energy is denoted by $\omega$ (rad/sec). Atomic states are denoted by lower case Latin letters with $|g\rangle$ and $|f\rangle$ reserved for the initial and final atomic states, respectively. The Hamiltonian of the radiation field is written as

$$H^R = \sum_{k\lambda} \omega_k\, a^\dagger_{k\lambda} a_{k\lambda} \tag{2}$$

where $k$ is the wave vector, $\lambda$ the polarization index, and $\omega_k = ck$ the frequency of the $(k\lambda)$th photon mode; $a^\dagger_{k\lambda}$ and $a_{k\lambda}$ are the usual creation and annihilation operators (Messiah, 1965; Sakurai, 1967) with $c$ the speed of light. The eigenstates of $H^R$ are of the form $|\ldots n(k_1\lambda_1)\, n(k_2\lambda_2) \ldots\rangle$ (which is also abbreviated as $|\{n(k\lambda)\}\rangle$) with $n(k\lambda)$ the number of photons occupying the $(k\lambda)$th mode. These modes are here taken to be those of a box of linear dimensions $L$ with periodic boundary conditions and are therefore discrete. An actual light source has a continuous, even if extremely narrow, spectrum. The transition from the discrete to the continuum is accomplished by letting $L \to \infty$ at the appropriate point in the calculation and replacing
the summation over $k$ by integration according to

$$\sum_k \to \left(\frac{L}{2\pi}\right)^3 \int d^3k = \left(\frac{L}{2\pi c}\right)^3 \int d\Omega_k \int \omega_k^2\, d\omega_k \tag{3}$$

where $\Omega_k$ is the direction of propagation of the $k$ photon. The number of photons per mode is related to the photon flux $I(\omega_k)$ through the equation (Heitler, 1954),

where $I(\omega_k)$ is expressed in number of photons/cm²/sec/unit bandwidth. There are two values of the polarization index $\lambda$ for each $k$ unless the light is polarized, in which case only one $\lambda$ need be considered. A summation over photon modes must therefore be in general of the form $\sum_{k\lambda}$. To compress notation, we shall hereafter omit $\lambda$ with the understanding that it is included in $k$. Since we will be dealing mostly with polarized light (because that is how most experiments are performed) we need not be concerned with $\lambda$, unless explicitly stated otherwise. The eigenstates of the unperturbed Hamiltonian $H^0$ can be written as products of the form $|A\rangle = |a\rangle|\ldots n(k) \ldots\rangle$ and will be denoted by capital letters $|A\rangle$, $|B\rangle$, $|C\rangle$, ..., with $|I\rangle$ and $|F\rangle$ reserved for the initial and final states of the system “atom plus field.” Thus we have, for example, $H^0|A\rangle = \omega_A|A\rangle$, where $\omega_A = \omega_a + \sum_k n(k)\,\omega_k$. At time $t = 0$ the two parts of the system are uncoupled, the atom usually, but not necessarily, being in its ground state. The initial state $|I\rangle$ then is

$$\Psi(t=0) \equiv |I\rangle = |g\rangle|\ldots n(k) \ldots\rangle \tag{5}$$
where, in principle, all photon modes are occupied, the actual occupation numbers being determined by the laser spectrum. At a later time $t$ the wavefunction is given by

$$\Psi(t) = e^{-iHt}|I\rangle = U(t)|I\rangle \tag{6}$$
This equation defines the time evolution operator $U(t)$, which in this case can be written as $e^{-iHt}$ because $H$ is time independent. This is one of the advantages of the fully quantized formalism. Of course $U(t)$ exists in general, except that time ordering must be observed if written as an exponential with a time-dependent Hamiltonian $H(t)$ (Sakurai, 1967). Alternatively, we do not have to write $U(t)$ as an exponential but can proceed by considering the integral equation satisfied by $U(t)$ as pointed out in more detail later. In any case, the probability that the system is in state $|F\rangle$ at time $t$ is given by

$$|\langle F|\Psi(t)\rangle|^2 = |\langle F|U(t)|I\rangle|^2 \tag{7}$$
which shows that in calculating transition probabilities we are interested in calculating matrix elements of $U(t)$ in the representation that diagonalizes $H^0$. The fashion in which the interaction is switched on is not irrelevant. The sudden switching-on implied in the above formalism will be a valid theoretical description only as long as the overwhelming amount of the observed signal is produced after the transients accompanying any switching-on and -off have died out. This is usually the case in most experimental situations with lasers of duration more than a few picoseconds. The conditions, however, may be different with very powerful subpicosecond pulses. For a discussion of the possible effects of pulse shape and duration, the reader is referred to the recent paper by Theodosiou et al. (1979). Proceeding now with the fully quantum version, and noting that since $U(t)$ is a simple exponential operator its Laplace transform can be taken, after a change of variable we obtain the resolvent operator $G(z)$ defined by

$$G(z) = (z - H)^{-1} \tag{8}$$

where $z$ is a complex variable (Goldberger and Watson, 1964; Messiah, 1965). In terms of $G(z)$, $U(t)$ is given by the inversion integral

$$U(t) = (2\pi i)^{-1} \int_C e^{-izt}\, G(z)\, dz \tag{9}$$
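The relation between the resolvent and the time evolution operator can be checked numerically on a small example. The sketch below is illustrative only and is not part of the original article. It assumes a hypothetical two-state Hamiltonian (energies in rad/sec, $\hbar = 1$), and it assumes that the contour $C$ is a closed counterclockwise circle enclosing all eigenvalues of $H$, which the text above does not specify. Under those assumptions the inversion integral reproduces $U(t) = e^{-iHt}$, and $|\langle F|U(t)|I\rangle|^2$ gives the transition probability.

```python
import numpy as np

# Illustrative sketch with a hypothetical 2x2 Hamiltonian (hbar = 1).


def evolution_operator(h, t):
    """U(t) = exp(-i H t) for a Hermitian matrix H, via eigendecomposition."""
    energies, vecs = np.linalg.eigh(h)
    # Scale each eigenvector column by its phase, then transform back.
    return (vecs * np.exp(-1j * energies * t)) @ vecs.conj().T


def resolvent(h, z):
    """G(z) = (z - H)^(-1) for complex z away from the spectrum of H."""
    return np.linalg.inv(z * np.eye(h.shape[0]) - h)


def evolution_from_resolvent(h, t, n_points=400):
    """(2 pi i)^(-1) times the contour integral of exp(-i z t) G(z) dz.

    Assumption: the contour is a counterclockwise circle that encloses
    every eigenvalue of H; it is discretized with the trapezoidal rule.
    """
    energies = np.linalg.eigvalsh(h)
    center = 0.5 * (energies.min() + energies.max())
    radius = 0.5 * (energies.max() - energies.min()) + 1.0
    total = np.zeros(h.shape, dtype=complex)
    for theta in 2.0 * np.pi * np.arange(n_points) / n_points:
        z = center + radius * np.exp(1j * theta)
        dz = 1j * radius * np.exp(1j * theta) * (2.0 * np.pi / n_points)
        total += np.exp(-1j * z * t) * resolvent(h, z) * dz
    return total / (2j * np.pi)


if __name__ == "__main__":
    coupling, detuning, t = 0.3, 0.5, 2.0           # hypothetical values
    h = np.array([[0.0, coupling], [coupling, detuning]])
    u_exact = evolution_operator(h, t)
    u_contour = evolution_from_resolvent(h, t)
    print("max difference:", np.abs(u_exact - u_contour).max())
    print("transition probability:", abs(u_exact[1, 0]) ** 2)
```

For this two-state example the transition probability can also be compared with the closed-form result $[4v^2/(\Delta^2 + 4v^2)]\sin^2(\sqrt{\Delta^2 + 4v^2}\,t/2)$, where $v$ is the coupling and $\Delta$ the detuning.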
It can be shown quite generally (Goldberger and Watson, 1964; Messiah, 1965) that for positive values of $t$ (which is what interests us here), we can replace $z$ by $x + i\eta$ in Eq. (8), carry out the integration over the real variable $x$ from $-\infty$ to $+\infty$, and then take the limit for $\eta \to +0$. Using this property, and denoting $G(x + i\eta)$ by $G(x^+)$ for short, we can express the matrix elements of $U(t)$ as

$$U_{FI}(t) = -(2\pi i)^{-1} \int_{-\infty}^{+\infty} e^{-ixt}\, G_{FI}(x^+)\, dx \tag{10}$$

thus reducing the problem to the calculation of matrix elements of $G$, which does not involve the time variable. The matrix elements of $G$ are not known since the eigenvalues of $H$ are not generally known, but $G$ satisfies the equation

$$G = G^0 + G^0 V G \tag{11a}$$

where

$$G^0 = (z - H^0)^{-1} \tag{11b}$$
whose matrix elements obviously are $G^0_{AB} = (z - \omega_A)^{-1}\,\delta_{AB}$ with $\delta_{AB}$ the Kronecker delta. Matrix elements are hereafter assumed to be in the representation that diagonalizes $H^0$, and the matrix elements of $V$ are of course known. Iterating Eq. (11a), we obtain the series expansion

$$G = G^0 \left[ 1 + \sum_{n=1}^{\infty} (V G^0)^n \right] \tag{12}$$
which can be used to express an arbitrary matrix element of $G$ as an infinite series of products of matrix elements of $G^0$ and $V$. An Nth-order process (N-photon process) is represented, to lowest order, by the Nth term of the series. For example, a three-photon transition from $|I\rangle$ to $|F\rangle$ is represented by the matrix element

$$G^{(3)}_{FI} = \langle F|G^0 V G^0 V G^0 V G^0|I\rangle \tag{13}$$
which can also be written as

$$G^{(3)}_{FI} = \sum_{A,B} \frac{\langle F|V|B\rangle \langle B|V|A\rangle \langle A|V|I\rangle}{(z - \omega_F)(z - \omega_B)(z - \omega_A)(z - \omega_I)} \tag{14}$$

where we have used the fact that $G^0$ has only diagonal matrix elements and the summations above (usually referred to as summations over intermediate states) extend, in principle, over all eigenstates of $H^0$, which means over all possible products of atomic and photon states. In practice, the particular form of $V$ and the concomitant selection rules limit these summations to particular subsets of product states. Using a well-known procedure whose details can be found elsewhere (Lambropoulos, 1976), a transition probability per unit time $W_{FI}$ is obtained from Eq. (14) or more generally from Eq. (12). For the general case of an N-photon process it reads

$$W^{(N)}_{FI} = 2\pi \left| \sum_{A_{N-1}} \cdots \sum_{A_1} \frac{V_{F A_{N-1}} \cdots V_{A_1 I}}{(\omega_{A_{N-1}} - \omega_I) \cdots (\omega_{A_1} - \omega_I)} \right|^2 \delta(\omega_F - \omega_I) \tag{15}$$

where the delta function implies energy conservation between the initial and final state. This type of transition probability per unit time expressed in terms of a time-independent quantity has been derived and used extensively since the beginning of the study of multiphoton processes (Bebb and Gold, 1966). It involves an (N - 1)-fold summation of intermediate states, each of those summations being over complete sets. It is valid for nonresonant multiphoton processes, that is, as long as none of the denominators in Eq. (15) vanish. For resonant processes this expression is inadequate and a more elaborate set of
equations must be developed as discussed in detail subsequently, but even for resonant processes of order higher than 2, expressions involving multiple summations over complete sets of states are usually necessary. Thus calculations of such expressions are standard features of multiphoton processes. Methods for the calculation of these multiple summations have been reviewed most recently by Lambropoulos (1976). The interaction $V$ can be written either in terms of the vector potential or in terms of the electric and magnetic field vectors (Power and Zienau, 1959; Power, 1978). In the latter form, $V$ is expanded in terms of multipoles of the field vectors. The majority of the phenomena in multiphoton processes at optical frequencies result from electric dipole contributions although quadrupole transitions have played a role in some experiments (M. Lambropoulos et al., 1975; Zimmermann et al., 1974; Sayer et al., 1971; Hertel and Ross, 1969). In the dipole approximation, the interaction is written as

$$V = -e\,\mathbf{r} \cdot \mathbf{E}(0) \tag{16}$$
where e is the electronic charge, r the position operator of the electron undergoing the transition, and b(0) the electric field evaluated at the origin of the system of coordinates-the position of the nucleus. A one-electron model of the atom is implied above and shall be adhered to throughout. In multielectron atoms r must be replaced by a more complicated operator but for our purposes the one-electron model is quite adequate. We do not elaborate on the reasons for expressing V in terms of the multipole expansion of the field vectors instead of the vector potential. The interested reader is referred to the recent discussion by Power (1978). In any case, these questions are not central to our subject matter. When the radiation field is quantized, I/ is written as (Sakurai, 1967; Heitler, 1954) k
where $\boldsymbol{\epsilon}_k$ is the polarization vector of the k mode. The polarization vectors may be real, as in linearly polarized light, or complex as in circularly or elliptically polarized light. In the fully quantized formalism, V does not exhibit an explicit time dependence as we are working in the Schrödinger picture. Using the above expression for V in Eq. (15) and performing the sum over all final states reached from the same initial state, the transition probability per unit time for a nonresonant N-photon bound-bound transition from an initial atomic state $|g\rangle$ to a final atomic state $|f\rangle$ can be written as

$$W^{(N)}_{fg} = I^{N-1} I(\omega)\,2\pi(2\pi\alpha)^{N}\omega^{N} \left| \sum_{a_{N-1}} \cdots \sum_{a_1} \frac{\langle f|r^{(\lambda)}|a_{N-1}\rangle \cdots \langle a_1|r^{(\lambda)}|g\rangle}{[\omega_{a_{N-1}} - \omega_g - (N-1)\omega] \cdots [\omega_{a_1} - \omega_g - \omega]} \right|^{2} \tag{18}$$
where $\omega$ is the photon frequency, $I(\omega)$ the number of photons/cm²/sec/unit frequency, I the total photon flux, and $r^{(\lambda)}$ the projection of $\mathbf{r}$ on the polarization vector. The matrix elements and multiple summations now involve atomic states only, while $\alpha$ is the fine-structure constant. The energies of all atomic states have in this equation been assumed to be infinitely sharp and from energy conservation we have $\omega_f - \omega_g = N\omega$. The linewidth of the intermediate states must be taken into account when resonant processes are involved, a point to which we shall return shortly. Also the assumption of a monochromatic photon beam will have to be reexamined subsequently. The detailed derivation of Eq. (18) and the discussion of related matters have been given elsewhere (Lambropoulos, 1976). In this chapter we only need to refer to the structure of this expression and not its detailed calculation. In the semiclassical formalism and in the dipole approximation, the Hamiltonian is written as
$$H = H_A + V(t) = H_A - e\,\mathbf{r}\cdot\mathbf{E}(t) \tag{19}$$
where now $\mathbf{E}(t)$ is the classical electric field vector and is, of course, time dependent, with $H_A$ the atomic Hamiltonian. It is more convenient to write a monochromatic field of frequency $\omega$ in the form $\mathbf{E}(t) = \boldsymbol{\epsilon}\,(\mathcal{E}e^{i\omega t} + \mathcal{E}^{*}e^{-i\omega t})$ with $\boldsymbol{\epsilon}$ the polarization vector. In that case V(t) is written as
$$V(t) = -e\,(\mathbf{r}\cdot\boldsymbol{\epsilon})\,(\mathcal{E}e^{i\omega t} + \mathcal{E}^{*}e^{-i\omega t}) \tag{20}$$
and the notation $\mu = e(\mathbf{r}\cdot\boldsymbol{\epsilon})$ is often used for the projection of the electric dipole moment operator on $\boldsymbol{\epsilon}$. If the field is not monochromatic, the amplitudes $\mathcal{E}$ and $\mathcal{E}^{*}$ undergo stochastic variations in time that are slow compared to $1/\omega$. Continuing for the moment with the monochromatic case, we note that since the Hamiltonian $H_A + V(t)$ is time dependent, the time-evolution operator U(t) cannot be written as a simple exponential. If this is done, time ordering must be observed. One can, however, consider the Schrödinger equation in the form
$$\frac{\partial}{\partial t}U(t) = -iH(t)U(t) \tag{21}$$
which can also be written in the integral form
where
The integral equation (22a) is most easily obtained by first transforming to the interaction picture (Messiah, 1965). Perturbation theory can now be done
by iterating Eq. (22a). If we take V(t) as given by Eq. (20), the results obtained are identical with those of the fully quantized version for nonresonant processes. Some differences do arise in considering resonant processes and special care is then required, as we shall see. We can therefore discuss multiphoton processes in a fully quantized or in a semiclassical formalism. Even in the fully quantized version, we do not have to employ the resolvent operator. The integral equation for U(t) can be iterated in both formalisms (Bebb and Gold, 1966). We have chosen to cast the quantized version in terms of G(z) because it lends itself to somewhat better bookkeeping and also because it has been employed in many recent papers on resonant multiphoton processes.
III. THE QUANTUM THEORY OF RESONANT TWO-PHOTON PROCESSES

The derivation that led to Eq. (18) is valid under the assumption that none of the denominators in that equation vanish. A factor in the denominator will vanish if an integral number of photons matches the energy difference between the initial and some intermediate atomic state. It is then said that there is a resonance with an intermediate (atomic) state and the whole process is referred to as a resonant multiphoton process. In principle, there can be N - 1 intermediate resonances for an N-photon process. This would occur if each absorbed (or emitted) photon matched the energy difference between two atomic states. For higher-order processes from the ground state of atoms, this is very unlikely. For processes of order up to N = 15 or even 20, and with a single laser, usually one or two resonances are apt to occur. Atomic levels are not equally spaced and moreover the energy spacing between the ground and the first excited state is a considerable portion of the ionization energy. There are, however, cases such as multiphoton processes even with a single laser in molecules or from highly excited states of atoms (so that the intermediate steps occur in the dense part of the atomic spectrum) in which many resonances may occur. On the other hand, there are experiments in which more than one independently tunable laser is used. Then we can have at least as many resonances as there are lasers. Such experiments have been performed in connection with laser isotope separation schemes (Solarz et al., 1976). A formalism applicable to cases with one or more intermediate resonances is therefore necessary. Heuristically, we can argue that the energies of the intermediate atomic states appearing in the denominator must include an imaginary part, if for no other reason than that they are excited states and as such have a finite lifetime, which means a nonzero width. This reasoning is qualitatively correct.
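The resonance condition stated above can be made concrete with a small sketch. The Python fragment below is illustrative only: the function name `find_resonances`, the tolerance argument, and the level energies used in the usage note are assumptions introduced here and are not quantities taken from the text. It lists the intermediate levels for which an integral number m of photons (1 ≤ m ≤ N - 1) nearly matches the energy difference from the initial state.

```python
def find_resonances(level_energies, ground_energy, photon_energy, n_photons, tol):
    """Return (level_index, m) pairs with |m*omega - (E_level - E_g)| <= tol.

    All inputs are hypothetical values in one common energy unit.
    m runs over the N - 1 possible intermediate steps of an N-photon process.
    """
    hits = []
    for index, energy in enumerate(level_energies):
        for m in range(1, n_photons):
            # Detuning of the m-photon step from this level.
            detuning = m * photon_energy - (energy - ground_energy)
            if abs(detuning) <= tol:
                hits.append((index, m))
    return hits
```

With the made-up levels `[1.45, 2.7, 3.0]`, a ground energy of `0.0`, a photon energy of `1.0`, N = 4, and a tolerance of `0.05`, only the third level is flagged, as a three-photon resonance: the call returns `[(2, 3)]`.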
In fact, if the applied fields are sufficiently weak and monochromatic,
such widths are determined by the spontaneous decay of the excited intermediate state. We could then use the already derived expressions for $W^{(N)}$ modified by the inclusion of the natural widths. This, however, is useful only in a limited number of cases since most multiphoton processes require high intensities for their observation. Under such conditions not only are the widths of the excited states intensity dependent, but the mere introduction of widths in the denominators does not necessarily allow the use of a transition probability per unit time. As indicated subsequently, the process may not in that case proceed by a simple time-independent rate. Moreover, what causes a finite width generally introduces a shift as well, that is, a modification of the real part of the energy. As usual, shift and width are related through a dispersion relation (Goldberger and Watson, 1964). In this case it is an intensity-dependent shift. This procedure of accounting for resonance effects is a generalization of the usual procedure employed in resonant Raman scattering or resonance fluorescence (Heitler, 1954; Sakurai, 1967) but strictly speaking it is valid only in the weak-field limit. It has been employed in some older work (Keldysh, 1964; Voronov, 1967; Kotova and Terent'ev, 1967; Ritus, 1967; Lambropoulos, 1967; Bonch-Bruevich et al., 1968; Oleinik, 1967, 1968; Davydkin et al., 1971; Kovarski and Perel'man, 1971, 1972; Chang and Stehle, 1973; Chang, 1974) on multiphoton processes. In these papers, the shifts and widths are intensity-dependent and under the appropriate experimental conditions may be useful even for relatively strong fields. These results have been obtained through various approaches but ultimately all lead to some sort of a shift-width function for the resonant intermediate states. Formally we can account for resonance effects by developing the expression for $G_{FI}(z)$ more systematically.
Returning to the basic equations (11), let us consider the matrix elements $G_{FI}$ and $G_{II}$. We obtain
In this form, the equations make it manifestly evident that the matrix elements of G (infinitely many) are coupled with each other. In order to illustrate the method, we consider first the simpler case of two-photon ionization. The initial system state is $|I\rangle = |g\rangle|n\rangle$, while the final state in this case is $|F\rangle = |f\rangle|n-2\rangle$. Let now $|a\rangle$ be an atomic state whose energy $\omega_a$ is approximately equal to $\omega_g + \omega$. The absorption of one photon excites the atom to state $|a\rangle$, which means $|a\rangle$ is a resonant intermediate state. The resonance does not have to be exact and part of the problem is to determine how the whole process depends on the detuning

$$\Delta \equiv \omega - (\omega_a - \omega_g) \equiv \omega - \omega_{ag} \tag{24}$$
from exact resonance. The term near resonance is also used when $\Delta \neq 0$. We must then consider the near-resonant system state $|A\rangle = |a\rangle|n-1\rangle$ on an equal footing with $|I\rangle$ and $|F\rangle$, which means we must consider $G_{AI}$. For reasons that will become clear later, we also consider all other states $|b\rangle$ of the atom that are of the same parity as $|a\rangle$ but nonresonant with the particular photon frequency. It is here understood that the selection rules allow the single-photon transitions $|g\rangle \to |a\rangle, |b\rangle$ and $|a\rangle, |b\rangle \to |f\rangle$ but not $|g\rangle \to |f\rangle$. Therefore, $V_{FI} = 0$ while $V_{AI}$, $V_{BI}$, $V_{FA}$, and $V_{FB}$ are all nonvanishing, where $|B\rangle = |b\rangle|n-1\rangle$. Considering now the equation for $G_{II}$, separating out terms that contain $G_{AI}$ in Eqs. (23), and changing the notation from $|M\rangle$ to $|B\rangle$ (since $\{|B\rangle\}$ denotes a specific set of states that couple to $|I\rangle$ and $|F\rangle$) we obtain

$$(z - \omega_I)G_{II} = 1 + V_{IA}G_{AI} + \sum_b V_{IB}G_{BI} \tag{25a}$$
For later use we also write here the equation for $G_{BI}$ obtained by taking the BI matrix element of Eq. (10). It is

$$(z - \omega_B)G_{BI} = V_{BI}G_{II} + \sum_F V_{BF}G_{FI} \tag{26}$$
If the transitions $|a\rangle \to |f\rangle$ and $|b\rangle \to |f\rangle$ are in the continuum, as is the case in ionization, the summations over F in Eqs. (25) and (26) indicate sums over final states. If on the other hand we are concerned with a two-photon bound-bound transition, the sum over F reduces to a single term representing the bound final state. These equations describe near-resonant two-photon ionization including the possible effect of the nonresonant states $|b\rangle$. These we leave aside for the moment and proceed as if they did not exist. The resulting simpler set of equations reads

$$(z - \omega_I)G_{II} = 1 + V_{IA}G_{AI} \tag{27a}$$

$$(z - \omega_A)G_{AI} = V_{AI}G_{II} + \sum_F V_{AF}G_{FI} \tag{27b}$$

$$(z - \omega_F)G_{FI} = V_{FA}G_{AI} \tag{27c}$$
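Solving Eq. (27c) for $G_{FI}$ and substituting into Eq. (27b) we obtain

$$\Bigl[z - \omega_A - \sum_F \frac{|V_{FA}|^2}{z - \omega_F}\Bigr]G_{AI} = V_{AI}G_{II} \tag{28}$$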
Continuing with the assumption that $|F\rangle$ is in a continuum, we note that in the term $|V_{FA}|^2/(z - \omega_F)$ we may replace z by $\omega_I$ because, owing to the presence of a smooth continuum, this term is a slowly varying function of z. And it is only in the vicinity of $\omega_I$ that its value is significant. Recalling that z stands for $x + i\eta$ with $\eta \to +0$, the sum over F gives rise to a real and an imaginary part, $S_a$ and $-i\Gamma_a$, respectively. They represent a shift and the width of state $|a\rangle$ due to its coupling to the continuum. The shift is usually negligible in the case of two-photon ionization, because the single-photon transition $|g\rangle \to |a\rangle$ will saturate before the intensity can become sufficiently large for the shift to become significant. The width simply represents the rate of the single-photon ionization of state $|a\rangle$ and can ultimately be expressed as
where $\sigma_a$ is the usual photoionization cross section of $|a\rangle$ and I the photon flux. The equations now reduce to the system
whose solutions are
The denominators above represent a polynomial of second degree whose roots are
where $\Delta$ is the detuning from resonance as defined in Eq. (24). Combining Eq. (31b) with Eq. (27c) and writing the polynomial of the denominators as $(z - z_+)(z - z_-)$, we can write the solution for $G_{FI}$ as

$$G_{FI} = \frac{V_{FA}V_{AI}}{(z - \omega_F)(z - z_+)(z - z_-)} \tag{33}$$
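from which the inversion integral can be readily calculated, with the result

$$U_{FI}(t) = V_{FA}V_{AI}\left[\frac{\exp(-i\omega_F t)}{(\omega_F - z_+)(\omega_F - z_-)} + \frac{\exp(-iz_+ t)}{(z_+ - \omega_F)(z_+ - z_-)} + \frac{\exp(-iz_- t)}{(z_- - \omega_F)(z_- - z_+)}\right] \tag{34}$$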
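After some straightforward algebraic manipulation this can be written as

$$U_{FI}(t) = \exp(-i\omega_F t)\,\frac{V_{FA}V_{AI}}{[(\Delta + i\Gamma_a)^2 + 4|V_{AI}|^2]^{1/2}}\left\{\frac{1 - \exp[i(\omega_F - z_+)t]}{\omega_F - z_+} - \frac{1 - \exp[i(\omega_F - z_-)t]}{\omega_F - z_-}\right\} \tag{35}$$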
where we have used the relation

$$z_+ - z_- = [(\Delta + i\Gamma_a)^2 + 4|V_{AI}|^2]^{1/2} \tag{36}$$
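As a numerical check of the relation for $z_+ - z_-$, the roots can be evaluated directly. The sketch below is a minimal illustration and rests on an assumption made here: that the reduced pair of equations has the form $(z - \omega_I)G_{II} = 1 + V_{IA}G_{AI}$ and $(z - \omega_A + i\Gamma_a)G_{AI} = V_{AI}G_{II}$, with the shift $S_a$ neglected. The function name `dressed_roots` and all numerical values are hypothetical.

```python
import cmath

def dressed_roots(omega_i, omega_a, gamma_a, v_ai):
    """Roots z+ and z- of (z - omega_I)(z - omega_A + i*Gamma_a) - |V_AI|^2 = 0.

    Assumed model: two coupled system states, the upper one with an
    amplitude decay rate gamma_a into the continuum (shift neglected).
    """
    delta = omega_i - omega_a                      # detuning, omega - omega_ag
    mean = 0.5 * (omega_i + omega_a - 1j * gamma_a)
    # z+ - z- = [(Delta + i*Gamma_a)^2 + 4|V_AI|^2]^(1/2), principal branch.
    split = cmath.sqrt((delta + 1j * gamma_a) ** 2 + 4 * abs(v_ai) ** 2)
    return mean + 0.5 * split, mean - 0.5 * split
```

For large positive detuning the principal branch of `cmath.sqrt` makes the first root approach `omega_i` and the second approach `omega_a - 1j * gamma_a`; for negative detuning the two labels are interchanged by the branch choice, so the labelling should be treated as a convention of this sketch.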
To recognize the connection between Eq. (35) and the transition rate for nonresonant processes, note that in the limit of large detuning, that is, $\Delta^2 \gg \Gamma_a^2, |V_{AI}|^2$, we have $z_+ \to \omega_I$ and $z_- \to \omega_A$. In that case, $\omega_F - z_+ \to \omega_F - \omega_I \approx 0$ because the final state must be such as to conserve energy from the initial state; otherwise the transition cannot take place. On the contrary, $\omega_F - z_-$ is much different from zero, being in fact on the order of $\Delta$. Thus if one were to calculate the transition probability per unit time through the equation $W = \lim_{t\to\infty}(1/t)|U_{FI}(t)|^2$, the result would be $2\pi(|V_{FA}V_{AI}|^2/\Delta^2)\,\delta(\omega_F - \omega_I)$. The delta function arose from the energy-conserving first term inside the braces of Eq. (35). The second term (not conserving energy) averages to zero for large t and does not contribute to the transition probability. We have then recaptured the nonresonant result as a special case of Eq. (35). Strictly speaking, we have recaptured only the contribution of state $|a\rangle$ to the nonresonant rate. This is because we left aside all other nonresonant intermediate states $|b\rangle$. Their effect is discussed in Section IV. Returning to the general expression of Eq. (35), we note that if $\Delta$ is comparable to $2|V_{AI}|$ the quantity $|U_{FI}(t)|^2$ cannot always be reduced to a transition probability per unit time. Instead, we must calculate its full time dependence, which is expressed as a superposition of interfering exponentials. The total yield in an experiment is then given by $|U_{FI}(T)|^2$, where T is the interaction time determined either by the laser duration or the time the atom spends in the light beam, whichever is smaller. For a nonresonant process with a time-independent transition rate W, the quantity $|U_{FI}(T)|^2$ is equal to $1 - e^{-WT}$. For a resonant (or near-resonant) process, the general trend of $|U_{FI}(T)|^2$ as a function of T will be similar (especially for long times) but will
differ in its details. The most important difference consists in the oscillations of frequency $2|V_{AI}|$ that $|U_{FI}(T)|^2$ exhibits as a function of T. It does tend to unity for large T but not monotonically as in nonresonant processes. It is modulated by the so-called Rabi oscillations as shown in Fig. 2. It should be evident by now that the quantity $2|V_{AI}|$ is the Rabi frequency of two atomic levels coupled by a resonant AC field (see, for instance, Allen and Eberly, 1975). It represents the frequency with which the atom undergoes transitions between the two levels. Using Eq. (17) for V, we can express $2|V_{AI}|$ in terms of more familiar (and more directly related to experiment) quantities in the form $2(2\pi\alpha I\omega|\langle a|\boldsymbol{\epsilon}\cdot\mathbf{r}|g\rangle|^2)^{1/2}$, where I is the total photon flux and $\alpha$ the fine-structure constant. All photons are in the above expressions assumed to be in a single mode. The physical picture emerging from these results as illustrated in Fig. 2 is that of an atom undergoing (Rabi) transitions between $|g\rangle$ and $|a\rangle$ while making transitions into the continuum (ionization). If $\Gamma_a \ll 2|V_{AI}|$, it undergoes many (Rabi) oscillations before making a transition into the continuum. In the opposite limit $\Gamma_a \gg 2|V_{AI}|$, it leaks into the continuum as soon as it reaches state $|a\rangle$. Thus in this limit atoms are ionized as fast as they are pumped to state $|a\rangle$. In these two limiting cases, it is the slower (the bottleneck) of the two rates $|g\rangle \to |a\rangle$ and $|a\rangle \to |f\rangle$ that determines the rate of the overall process
FIG. 2. Plot of the three-photon ionization probability in cesium as a function of time for a rectangular ruby laser pulse of intensity: (a) 4 × 10" W/cm²; (b) 5.5 × 10" W/cm².
(ionization). In intermediate regimes, the picture is not as simple since there is interference between the two transitions. Note that, since $2|V_{AI}| \propto I^{1/2}$ while $\Gamma_a \propto I$, ionization changes with intensity faster than the Rabi frequency. Thus the process may move from one extreme to the other as the light intensity increases. Typically the dipole atomic matrix element for the bound-bound transition is larger than for the bound-free. As a result, the Rabi frequency for small intensities will generally be larger than the ionization rate of the resonant state in two-photon ionization. Exceptions to that rule exist as, for instance, when the states $|g\rangle$ and $|a\rangle$ are connected through a quadrupole transition. The situation is also typically reversed in higher-order multiphoton processes (see Section V). The roots $z_\pm$ of the denominator of Eqs. (31) are often referred to as the energies of the dressed atom (Cohen-Tannoudji, 1967; Cohen-Tannoudji and Haroche, 1969a,b). If $\omega \approx \omega_a - \omega_g$, the states $|I\rangle = |g\rangle|n\rangle$ and $|A\rangle = |a\rangle|n-1\rangle$ of the system "atom plus field" are nearly degenerate, which is by definition the case in resonant two-photon ionization. The presence of the interaction $V_{AI}$ removes this degeneracy and $z_\pm$ are the new energies split by $(\Delta^2 + 4|V_{AI}|^2)^{1/2}$, where for the sake of simplicity we have ignored $\Gamma_a$ in this discussion. For zero detuning $\Delta$, the splitting is equal to the Rabi frequency $2|V_{AI}|$. This is yet another interpretation of the roots $z_\pm$ (see, for example, Bonch-Bruevich et al., 1968; Khodovoi and Bonch-Bruevich, 1968). This problem of resonant two-photon ionization has been discussed in a number of papers (Beers and Armstrong, 1975; Kazakov et al., 1976; Fedorov, 1976; Armstrong et al., 1975; Feneuille and Armstrong, 1975; Choi and Payne, 1977; Ackerhalt, 1978) from various viewpoints.
In all of these papers the field has been assumed to be monochromatic and the problem has been cast in terms of a two-level system with an additional term representing the transition (decay) of the upper level into the continuum. Much more recent work has dealt with a more general model, which includes the effect of the bandwidth of the field, a question discussed in Section VIII.
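The monochromatic two-level picture with decay into the continuum can be explored numerically. The following sketch is illustrative and relies on assumptions made here: the reduced equations are taken as $(z - \omega_I)G_{II} = 1 + V_{IA}G_{AI}$ and $(z - \omega_A + i\Gamma_a)G_{AI} = V_{AI}G_{II}$, energies are measured from $\omega_I$, and the ionization probability is evaluated as $1 - |U_{II}(t)|^2 - |U_{AI}(t)|^2$. The function name and all parameter values are hypothetical.

```python
import cmath

def ionization_probability(delta, gamma_a, v_ai, t):
    """1 - |U_II(t)|^2 - |U_AI(t)|^2 for the assumed two-level model.

    Energies are measured from omega_I, so omega_I = 0 and omega_A = -delta.
    gamma_a enters as (z - omega_A + i*gamma_a), i.e. as an amplitude decay
    rate. The degenerate case z+ == z- is not handled in this sketch.
    """
    omega_a = -delta
    mean = 0.5 * (omega_a - 1j * gamma_a)
    split = cmath.sqrt((delta + 1j * gamma_a) ** 2 + 4 * abs(v_ai) ** 2)
    z_plus, z_minus = mean + 0.5 * split, mean - 0.5 * split
    e_plus = cmath.exp(-1j * z_plus * t)
    e_minus = cmath.exp(-1j * z_minus * t)
    # Sum of the residues of G_II(z) and G_AI(z) at the two poles.
    u_ii = ((z_plus - omega_a + 1j * gamma_a) * e_plus
            - (z_minus - omega_a + 1j * gamma_a) * e_minus) / split
    u_ai = v_ai * (e_plus - e_minus) / split
    return 1.0 - abs(u_ii) ** 2 - abs(u_ai) ** 2
```

Evaluating the function on a grid of times for `gamma_a` much smaller than `2 * v_ai` shows the ionization probability rising toward unity with a modulation at the Rabi frequency, while for `gamma_a` much larger than `2 * v_ai` the modulation disappears.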
IV. THE EFFECT OF NONRESONANT STATES

If one were interested in the behavior of resonant two-photon ionization in a photon frequency range only very close to resonance, the atomic nonresonant states $|b\rangle$ included in Eqs. (25) would not matter, but often it is of interest to know the complete frequency dependence (lineshape) of the ionization probability around the resonance. This includes the region of interference between resonances where the well-known minima appear
(Bebb, 1966, 1967). Although these minima basically are nonresonant effects, they do affect the lineshape of each resonance and for the sake of completeness should be given some attention even in a chapter on resonant processes. For this we return to Eqs. (25) and (26). We solve (26) for $G_{BI}$ and substitute into Eqs. (25a) and (25c). In the process we encounter the quantity $\sum_b V_{FB}V_{BI}/(z - \omega_B)$, which represents two-photon transitions from $|I\rangle$ to $|F\rangle$ via nonresonant states $|B\rangle$. If we are interested only in photon frequencies fairly close to resonance, that is, frequencies for which $\omega_I \approx \omega_A$, we can replace $z - \omega_B$ by $\omega_I - \omega_B$ if it can be assumed that the sum over b is a slowly varying function of z in the region under consideration. Noting that $\omega_I - \omega_B = \omega - (\omega_b - \omega_g)$ we base this replacement on the assumption that none of the states $|b\rangle$ is near-resonant. Consequently $z - \omega_B$ varies very little for photon frequencies that satisfy $\omega_I \approx \omega_A$. Thus we introduce the z-independent quantity
where the superscript (2) indicates that it is of second order in V. Interchanging I and F we obtain $\tilde{V}^{(2)}_{IF}$. After these substitutions, we solve Eq. (25c) for $G_{FI}$ and substitute into Eqs. (25a) and (25b). As in Section III, we replace $z - \omega_F$ in the denominators by $\omega_I - \omega_F$ or $\omega_A - \omega_F$ since $\omega_I \approx \omega_A$. We are thus left with two equations:
where we have introduced
From these definitions and Eq. (37), it is evident that $Q_{IA} = Q_{AI}^{*}$. The terms involving the summations in Eqs. (38) represent shift-width functions of the energies $\omega_I$ and $\omega_A$. For $\omega_I$ the shift-width comes from transitions (real or virtual) via the nonresonant channels provided by the states $|b\rangle$. For $\omega_A$ the main contribution comes from the transitions into the continuum and is identical to the contribution we encountered in Eq. (23). For both $\omega_I$ and $\omega_A$, the contributions involving the summations over F contain a real as well
as an imaginary part, that is, a shift and a width. Introducing the modified energies
we can write Eqs. (38) in the form
In this derivation, we have neglected the term $\sum_b\sum_{F'} V_{FB}V_{BF'}/(z - \omega_B)\,G_{F'I}$, which would give self-energy modifications of $\omega_F$ as well as other virtual transitions between different states in the continuum. These are small and essentially unobservable effects for the process under consideration. With this approximation in mind, we can express $G_{FI}$ as

$$G_{FI} = (z - \omega_F)^{-1}\Bigl[V_{FA}G_{AI} + \sum_b \frac{V_{FB}V_{BI}}{z - \omega_B}\,G_{II}\Bigr] \tag{42}$$
which is the expression we have used in the derivation. To calculate ionization through Eq. (42) we need expressions for $G_{AI}$ and $G_{II}$, which are obtained by solving Eqs. (41) with the result

$$G_{II} = \frac{z - \tilde{\omega}_A}{(z - \tilde{z}_+)(z - \tilde{z}_-)} \tag{43a}$$

$$G_{AI} = \frac{V_{AI} + Q_{AI}}{(z - \tilde{z}_+)(z - \tilde{z}_-)} \tag{43b}$$
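Here $\tilde{z}_\pm$ are the roots of the determinant of the coefficients of Eqs. (41) and are given by

$$\tilde{z}_\pm = \tfrac{1}{2}(\tilde{\omega}_I + \tilde{\omega}_A) \pm \tfrac{1}{2}\bigl[(\tilde{\omega}_I - \tilde{\omega}_A)^2 + 4|V_{AI} + Q_{AI}|^2\bigr]^{1/2} \tag{44}$$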
Using these expressions we obtain

$$G_{FI} = \frac{V_{FA}(V_{AI} + Q_{AI}) + (z - \tilde{\omega}_A)\sum_b V_{FB}V_{BI}/(z - \omega_B)}{(z - \omega_F)(z - \tilde{z}_+)(z - \tilde{z}_-)} \tag{45}$$
This expression is similar to that of Eq. (33), becoming identical to it if we set $Q_{AI} = 0$; for consistency, we must at the same time neglect the sum over b in the numerator since $Q_{AI}$ itself involves exactly that sum. As is now evident from the numerator of Eq. (45) as compared to Eq. (33), the nonresonant states $|b\rangle$ contribute the amplitude $V_{FA}Q_{AI}$, which interferes with the amplitude $V_{FA}V_{AI}$ of the near-resonant state. In addition, $Q_{AI}$ modifies the roots $z_\pm$
(now denoted by $\tilde{z}_\pm$), which indicates that the nonresonant states affect the process in a nontrivial way. The second term in the numerator of Eq. (45) is also due to the nonresonant states. It can be neglected, however, as long as z can be replaced by $\tilde{\omega}_A$, which implies photon frequencies very near the resonance. As has been pointed out by Dixit and Lambropoulos (1979), however, the assumption of a frequency-independent background of nonresonant states is not always justified, especially when one is interested in the details of the lineshape around the resonance. Beers and Armstrong (1975) have derived detailed expressions based on this model and have explored various limits in which we can obtain simplified expressions or time-independent transition rates. A similar analysis has been presented in a somewhat different formalism by Fedorov (1976) and Kazakov et al. (1976). Our presentation here is very similar to that of Beers and Armstrong (1975). Our $Q_{AI}$ is equivalent to their effective Hamiltonian, representing the effect of the nonresonant states. The total ionization over time T can be calculated either through the quantity $\int d\omega_F\,|U_{FI}(T)|^2$ or through the quantity $1 - |U_{II}(T)|^2 - |U_{AI}(T)|^2$. To the extent that unitarity is satisfied (which is equivalent to saying that all other nonresonant states can be neglected) we must have the relation

$$\int d\omega_F\,|U_{FI}(T)|^2 = 1 - |U_{II}(T)|^2 - |U_{AI}(T)|^2 \tag{46}$$
which simply implies that at any time T the atoms that are neither in state $|g\rangle$ nor in state $|a\rangle$ must be in the continuum. The validity of this relation depends on how far from resonance the nonresonant states are. In principle their populations should also be included.

V. HIGHER-ORDER PROCESSES
In multiphoton processes of order higher than two, the situation is more complicated due to the larger number of possible resonances. Not only can there be multiple resonances, but even a single resonance can occur in more than one way. In five-photon ionization, for instance, one could have, at least in principle, a single-photon resonance between the initial and an excited state and then four-photon ionization from there; or a four-photon resonance between the initial and some higher excited state and then single-photon ionization from there. Two further obvious intermediate cases with two- and three-photon resonances can also occur. Resonance with one intermediate state can therefore occur in four different ways for a five-photon process. If we were to allow for more than one resonance, many more ways would become possible. Obviously this variety increases enormously with the order of the process. Despite this variety there are certain basic features
common to all processes, irrespective of order, that enable one to formulate the general aspects of the problem in a unified way. Of course there are differences in detail among processes of various orders. In attempting to sort out the basic aspects shared by such processes, we must first note that the order of the process of the resonant step is not very important. It is the number of resonances that is more important. Thus the formulation of five-photon ionization with a three-photon resonance between the initial and some excited intermediate state with two-photon ionization from there is very similar to two-photon ionization via some intermediate state. To illustrate this point, let us consider in some detail two-photon-resonant (TPR) three-photon ionization depicted schematically in Fig. 3. The resonant states are now $|I\rangle = |g\rangle|n\rangle$ and $|B\rangle = |b\rangle|n-2\rangle$ while the final state is $|F\rangle = |f\rangle|n-3\rangle$. Again we begin with Eq. (11a) and obtain equations for $G_{II}$, $G_{BI}$, and $G_{FI}$. We must, however, perform higher iterations because, by assumption, the states $|g\rangle$ and $|b\rangle$ are not connected with a single-photon transition, i.e., $V_{BI} = 0$. The equations thus obtained are

$$(z - \omega_I)G_{II} = 1 + \sum_C V_{IC}G_{CI} \tag{47a}$$

$$(z - \omega_C)G_{CI} = V_{CI}G_{II} + V_{CB}G_{BI} \tag{47b}$$
The states $|C\rangle = |c\rangle|n-1\rangle$ represent virtual intermediate states via which the two-photon transition $|I\rangle \to |B\rangle$ takes place. Since these states are
FIG. 3. Schematic representation of two-photon-resonant three-photon ionization.
nonresonant, $G_{CI}$ must be eliminated by solving Eq. (47b) for $G_{CI}$ and substituting into the other equations, which now read
The quantity $\sum_C V_{BC}V_{CI}/(z - \omega_C)$ evaluated at $z = \omega_I$ is the usual second-order two-photon transition matrix element coupling the states $|I\rangle$ and $|B\rangle$. We shall denote it by the symbol
It can be evaluated at $z = \omega_I \simeq \omega_B$ because none of the states $|C\rangle$ is near-resonant. The structure of these equations is easily seen to be very similar to that of Eqs. (27) if $\tilde{V}^{(2)}_{BI}$ is identified with $V_{AI}$. $G_{FI}$ can be eliminated exactly as in the case of Eqs. (27), leading to an ionization width $\Gamma_b$ for $|B\rangle$. Two new quantities that require some attention have, however, appeared in Eqs. (48). They are the sums over C inside the parentheses of Eqs. (48a) and (48b). The first evaluated at $z = \omega_I$ and the second at $\omega_B$ (although here the distinction is inconsequential since $\omega_I \simeq \omega_B$) represent the shifts of the atomic states $|g\rangle$ and $|b\rangle$ due to the same laser field that causes the transitions. Often referred to as AC Stark shifts, they are nonresonant and depend linearly on the total photon flux. Such shifts exist, in principle, in the two-photon resonant ionization as well, but in that case the coupling $V_{AI}$ between the initial and the resonant state is of first order. As such $|V_{AI}|^2$ is of the same order as the shifts and dominates the saturation of the transition long before the shifts can play a role, but in the present case the situation is reversed because $|\tilde{V}^{(2)}_{BI}|^2$ is of fourth order while the shifts are of second order, thus dominating the saturation behavior of the process. Defining the symbols

$$S_g = \sum_C \frac{|V_{CI}|^2}{\omega_I - \omega_C} \tag{50a}$$

$$S_b = \sum_C \frac{|V_{BC}|^2}{\omega_B - \omega_C} \tag{50b}$$
and eliminating $G_{FI}$ we are left with the equations

$$(z - \omega_I - S_g)G_{II} = 1 + \tilde{V}^{(2)}_{IB}G_{BI} \tag{51a}$$

$$(z - \omega_B - S_b + i\Gamma_b)G_{BI} = \tilde{V}^{(2)}_{BI}G_{II} \tag{51b}$$
which demonstrate that the problem has indeed been reduced to two equations very similar to Eqs. (27). From here on, we can proceed as in the case of the two-photon process by finding the roots $z_\pm$ and obtaining expressions for either $\int |U_{FI}(t)|^2\,d\omega_F$ or $1 - |U_{II}(t)|^2 - |U_{BI}(t)|^2$. The behavior of the process as a function of photon flux is, however, expected to be somewhat different because of the presence of the shifts. Their main effect is to change the energy difference between $|g\rangle$ and $|b\rangle$, by shifting each of them (usually) toward or away from each other, thus bringing them closer to or away from resonance, depending on the algebraic sign of the initial detuning $\Delta$ and the relative signs of the shifts. The new feature here is that not only the frequency of the light but also its intensity determines how close to resonance the process is. As discussed later, this is in addition influenced by the coherence (stochastic) properties of the radiation. The separation of the states $|C\rangle$ from the states $|F\rangle$ in the above derivation is somewhat artificial and was done in order to simplify the exposition. In fact, these states are of the same parity (being of parity opposite to that of $|I\rangle$ and $|B\rangle$) and could be treated as one set. A subset of it would contribute to $S_g$ while the whole set would contribute to $S_b$ and $-i\Gamma_b$ as the real and imaginary parts of the same quantity. To clarify this remark by an example, we can think of $|g\rangle$ as an S state and of $|b\rangle$ as a D state. Then only P states contribute to $S_g$ while both P and F states contribute to $S_b - i\Gamma_b$. If on the other hand, both $|b\rangle$ and $|g\rangle$ are S states, then only P states contribute to $S_g$ as well as $S_b - i\Gamma_b$. A more rigorous derivation, along somewhat different lines, can be found in Georges et al. (1977), where the shifts and widths are obtained as the real and imaginary parts of the polarizabilities of states $|g\rangle$ and $|b\rangle$.

Having demonstrated the formal equivalence between two-photon resonant ionization and TPR three-photon ionization, we can readily write equations for a more general case. Consider N-photon ionization with an M-photon intermediate resonance (Fig. 4), to be referred to hereafter as
FIG. 4. Schematic representation of M-photon-resonant N-photon ionization.
"M-photon-resonant (MPR) N-photon ionization." We must now derive equations for $G_{II}$, $G_{AI}$, and $G^{(N)}_{FI}$, where $|I\rangle = |g\rangle|n\rangle$, $|A\rangle = |a\rangle|n-M\rangle$, and $|F\rangle = |f\rangle|n-N\rangle$. The superscript N has been added to $G^{(N)}_{FI}$ to remind us that we are dealing with a process of overall order N. It is, of course, understood that there are no intermediate resonances between $|g\rangle$ and $|a\rangle$ or between $|a\rangle$ and $|f\rangle$. In analogy with the three-photon case, it is now evident that $|I\rangle$ and $|A\rangle$ are coupled by an effective matrix element $\tilde{V}^{(M)}_{AI}$ of order M, whose calculation involves M - 1 infinite summations. The ionization width is now of order N - M and its calculation involves N - M - 1 summations. In the terminology of Section II, $\tilde{V}^{(M)}_{AI}$ and $\Gamma_a$ represent an M-photon bound-bound transition and (N - M)-photon ionization, respectively. These are nonresonant processes requiring for their calculation the formalism and techniques mentioned in Section II and reviewed in detail elsewhere (Lambropoulos, 1976). The derivation of the equations for the present case involves higher-order iterations of Eq. (10). The procedure is a straightforward, albeit lengthy, generalization of the three-photon case. The result is easily anticipated on the basis of Eqs. (51) and can be written as

$$(z - \omega_I - S_g)G_{II} = 1 + \tilde{V}^{(M)}_{IA}G_{AI} \tag{52a}$$

$$(z - \omega_A - S_a + i\Gamma_a)G_{AI} = \tilde{V}^{(M)}_{AI}G_{II} \tag{52b}$$
with the additional equation
where $\tilde{V}^{(N-M)}_{FA}$ is the effective matrix element representing the (N - M)-photon ionization of $|a\rangle$ and also is the quantity in terms of which $\Gamma_a$ is expressed. The solution of Eqs. (52) again leads to two poles $z_\pm$ and the final expression for $G^{(N)}_{FI}$ reads
where the roots
Z~
are the solutions of the equation
The expression for U&:)(t)will again be formally similar to Eq. (35) but the dependence on the photon flux will be drastically different, because not only is the numerator of Eq. (54)of higher order in V but also the poles z + contain terms of higher order. In the limit of weak fields, one of the roots approaches w and the other w A .Equation (54) is then reduced to the nonresonant result of Section 11. The detuning from resonance is now given by
The shifts $S_g$ and $S_a$ are again expressed as in Eqs. (50) except that now different sets of states $|c\rangle$ contribute to $S_g$ and $S_a$, although in special cases they may happen to be the same set. As in the previous special case of three-photon ionization, $S_a$ and $\Gamma_a$ can be expressed as the real and imaginary part of the polarizability of state $|a\rangle$. If one calculates the probability of ionization either through $\int d\omega_F\,|U_{FI}(t)|^2$ or through $1 - |U_{II}(t)|^2 - |U_{AI}(t)|^2$ for a given t and as a function of light intensity (photon flux), for weak intensities the process is found to be proportional to the Nth power of the intensity, as predicted by the transition probability per unit time of perturbation theory. As the intensity becomes stronger, significant deviations from that behavior begin to appear and for very strong intensities the intensity dependence will generally be completely different. Some of its features are discussed later.

In the interest of simplicity of derivation, we have thus far neglected an important mode of decay of the resonant excited atomic states, i.e., spontaneous decay. In resonant two-photon ionization, the excited state $|a\rangle$ decays spontaneously back to the initial state $|g\rangle$, while in higher-order processes the resonant state decays spontaneously via single-photon emission to some other state. From there the atom eventually returns to the ground state via a cascade of further single-photon spontaneous transitions. The spontaneous decay out of the upper resonant state can be easily taken into account by simply adding its spontaneous width to the field-induced ionization width. Thus from here on, the width $\Gamma_a$ appearing in equations such as Eqs. (52) will be understood as the sum $\Gamma_a = \Gamma_a^{\mathrm{ION}} + \Gamma_a^{0}$ of the ionization and spontaneous width of the upper state. $\Gamma_a^{0}$ is a constant characteristic of the atomic state and independent of the field intensity, while $\Gamma_a^{\mathrm{ION}}$ depends on the photon flux (as its (N - M)th power). The above substitution is not as phenomenological as it may appear here.
In fact, it can be proven quite rigorously. For a review of the techniques by which this can be accomplished in a two-level system the interested reader is referred to the books by Agarwal (1974) and Allen and Eberly (1975). A proof related directly to the content of this article can be found in an article by Lambropoulos (1974). Obviously, in addition to depopulating the upper resonant state, the spontaneous decay eventually repopulates the ground state. If the time of interaction between atom and field is short compared to the time required for the atom to return to $|g\rangle$ (either directly or via a cascade), the repopulation can be ignored and the equations derived above are quite adequate for the description of the process. Otherwise, the repopulation must be accounted for, which requires a somewhat different formulation of the problem in terms of the density matrix. Equations such as those derived above on the basis of $G(z)$, i.e., using the Schrödinger equation, can account for the decay out of the upper state but not for the decay into $|g\rangle$. The question of population buildup often is of
particular concern in higher-order processes where the upper resonant state decays first to another lower excited state. If the lifetime of that state is shorter than that of the upper state, there exists a possibility of population inversion and therefore lasing, which may interfere with other ongoing processes. Examples of situations in which this may be of concern can be found in the papers by Ward and Smith (1975), Leung et al. (1974), and Georges et al. (1977). In that case, the process can not be described by a two-level system with ionization; additional equations accounting for the states populated via spontaneous emission must also be included. In the discussion of this section, we left out the effect of the background due to the nonresonant states. It can readily be included by an obvious and straightforward generalization of the procedure outlined in Section IV.
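The intensity dependence described in this section can be illustrated numerically. What follows is only a minimal sketch under stated assumptions, not the calculation of the article: the two resonant amplitudes are integrated in the rotating frame for a constant field, with a Rabi coupling `omega` assumed proportional to $I^{M/2}$, an ionization width `gamma_ion` assumed proportional to $I^{N-M}$, and a constant spontaneous width `gamma0`. The widths are treated here as full widths (the amplitude decays at half their sum), the shifts are omitted, and the scale factors `a` and `b`, the units, and the function names are hypothetical.

```python
import math


def ionization_probability(intensity, t_end=10.0, dt=0.005,
                           M=2, N=3, a=1.0, b=1.0, gamma0=1.0, delta=0.0):
    """Ionization yield of a two-level model with a decaying upper state.

    Assumed scalings (illustrative): omega ~ I**(M/2), gamma_ion ~ I**(N-M).
    The yield is accumulated as the time integral of gamma_ion * |c_a|^2.
    """
    omega = a * intensity ** (M / 2.0)      # effective M-photon Rabi frequency
    gamma_ion = b * intensity ** (N - M)    # (N-M)-photon ionization width
    gamma = gamma_ion + gamma0              # total width of the upper state

    def rhs(cg, ca):
        # Rotating-frame amplitude equations (rotating-wave approximation).
        dcg = -0.5j * omega * ca
        dca = -0.5j * omega * cg + (1j * delta - 0.5 * gamma) * ca
        return dcg, dca

    cg, ca = 1.0 + 0j, 0j
    p_ion = 0.0
    for _ in range(int(round(t_end / dt))):
        pop_before = abs(ca) ** 2
        # Classical fourth-order Runge-Kutta step.
        k1 = rhs(cg, ca)
        k2 = rhs(cg + 0.5 * dt * k1[0], ca + 0.5 * dt * k1[1])
        k3 = rhs(cg + 0.5 * dt * k2[0], ca + 0.5 * dt * k2[1])
        k4 = rhs(cg + dt * k3[0], ca + dt * k3[1])
        cg += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        ca += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        # Trapezoidal accumulation of the ionized fraction.
        p_ion += 0.5 * dt * gamma_ion * (pop_before + abs(ca) ** 2)
    return p_ion


def local_slope(intensity, **kwargs):
    """Logarithmic slope d(log P)/d(log I), estimated from I and 2I."""
    p1 = ionization_probability(intensity, **kwargs)
    p2 = ionization_probability(2.0 * intensity, **kwargs)
    return math.log(p2 / p1) / math.log(2.0)
```

With the default M = 2 and N = 3, `local_slope(1e-3)` comes out close to N, while at intensities where the transition saturates (for example `local_slope(10.0)`) the slope falls well below N.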
VI. SEMICLASSICAL APPROACHES

As indicated in Section II, the problem can be formulated in terms of the Hamiltonian $H = H_A + V(t) = H_A + \mu E(t)$, where only the atom is treated quantum mechanically. The atomic wave function $|\psi(t)\rangle$ is now governed by the equation

$$i\,\frac{\partial}{\partial t}|\psi(t)\rangle = [H_A + V(t)]\,|\psi(t)\rangle \tag{57}$$

whose solution can be written as

$$|\psi(t)\rangle = U(t)|\psi(0)\rangle = U(t)|g\rangle \tag{58}$$

where, consistently with our previous notation, we have denoted by $|g\rangle$ the initial atomic state. If we denote by $|c\rangle$ an arbitrary atomic state, from Eq. (58) we obtain $\langle c|\psi(t)\rangle = \langle c|U(t)|g\rangle = U_{cg}(t)$. Using this relation we write $|\psi(t)\rangle$ in the form

$$|\psi(t)\rangle = \sum_c U_{cg}(t)\,|c\rangle \tag{59}$$

where the time-dependent coefficients $U_{cg}(t)$ are the familiar coefficients of time-dependent perturbation theory. Substituting this form into Eq. (57) we obtain a set of infinitely many complex differential equations for the coefficients. For a resonant multiphoton process we must eliminate all but two of the coefficients, retaining only those describing the two resonant states. The continuum must also be eliminated, its effect being replaced by an ionization width. Through the same elimination procedure we obtain the shifts of the resonant states. If the frequency of the radiation is $\omega$, substitution of the expression $E(t) = \mathcal{E}e^{i\omega t} + \mathcal{E}^* e^{-i\omega t}$ into the differential equations gives
rise to terms with a resonant time dependence of the form $\exp[\pm i(\omega - \omega_{ag})t]$ as well as to terms with an antiresonant time dependence of the form $\exp[\pm i(\omega + \omega_{ag})t]$, where $\omega_{ag} = \omega_a - \omega_g$ and $\omega_a$ denotes the energy of the upper resonant state. The antiresonant terms are neglected and the resulting equations correspond to the so-called rotating-wave approximation, which is valid as long as the detuning $\Delta = \omega - \omega_{ag}$ is much smaller than $\omega \approx \omega_{ag}$, i.e., under conditions of resonance. These resonant and antiresonant terms arise more naturally if $U(t)$ is written in the interaction picture, which in effect means that $U_{cg}(t) = u_{cg}(t)\exp(-i\omega_c t)$, where the fast oscillating time dependence of $U_{cg}(t)$ is separated from the slowly varying part $u_{cg}(t)$, which satisfies a somewhat modified differential equation. The resulting equations for the case of resonant two-photon ionization are

$$\frac{d}{dt}u_{gg}(t) = i\,u_{ag}(t)\,\mu_{ga}\mathcal{E}\exp[i(\omega - \omega_{ag})t] \tag{60a}$$

$$\frac{d}{dt}u_{ag}(t) = -\Gamma_a u_{ag}(t) + i\,u_{gg}(t)\,\mu_{ag}\mathcal{E}^*\exp[-i(\omega - \omega_{ag})t] \tag{60b}$$
The notation here is as defined in Sections II and III. The initial conditions are $u_{gg}(0) = 1$ and $u_{ag}(0) = 0$. The width $\Gamma_a$ contains both ionization and spontaneous decay. Resonant two-photon ionization has been formulated in this approach in papers by Fedorov (1976) and Kazakov et al. (1976). In view of the derivations given in the previous sections, it should be evident now that Eqs. (60) can be easily generalized to the case of M-photon-resonant N-photon ionization. If we denote, as in Section V, by $|a\rangle$ the upper resonant state, the resulting equations are
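$$\frac{d}{dt}u_{gg}(t) = -iS_g\,u_{gg}(t) + i\,u_{ag}(t)\,\tilde{\mu}_{ga}^{(M)}\mathcal{E}^M\exp[i(M\omega - \omega_{ag})t] \tag{61a}$$

$$\frac{d}{dt}u_{ag}(t) = -i(S_a - i\Gamma_a)\,u_{ag}(t) + i\,u_{gg}(t)\,\tilde{\mu}_{ag}^{(M)}(\mathcal{E}^*)^M\exp[-i(M\omega - \omega_{ag})t] \tag{61b}$$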
where the notation parallels that of Eq. (52), with $\tilde{\mu}_{ag}^{(M)}$ an effective matrix element coupling the states $|a\rangle$ and $|g\rangle$ through an M-photon transition. Except for allowing the use of a classical electromagnetic field, these equations are essentially identical to those of the resolvent operator. In fact, the equivalence becomes obvious if we take the Laplace transform of Eqs. (61) and then make a change from the real Laplace variable s to $z = -is$. However, the advantage of being able to use a classical field is by no means trivial when additional aspects such as pulse propagation or finite bandwidth are of
interest (see Section VIII). As with the resolvent operator, the difficulty with accounting for the repopulation of the lower state is still inherent in the formalism, as it would be in any formulation in terms of the amplitudes of the Schrödinger equation, and this naturally leads us to the next topic, i.e., the formulation of the problem in terms of the density matrix.

Although we have chosen to present the density matrix treatment in a semiclassical context, this is not an inherent feature of the density matrix. It can be cast in the fully quantum version as well (see, for example, Lambropoulos, 1967). Our choice has been influenced by the fact that the vast majority of papers using the density matrix in the treatment of multiphoton processes have adhered to the semiclassical approach. The density matrix of interest here is, of course, that of the atom with matrix elements $\rho_{ab}(t)$, where $|a\rangle$, $|b\rangle$, . . . are eigenstates of the atomic Hamiltonian $H_A$. It obeys the equation of motion
$$\frac{\partial}{\partial t}\rho(t) = -i[H_A + V(t), \rho(t)] \tag{62}$$
where the right side is the usual commutator. If $|g\rangle$ and $|a\rangle$ are the resonant states, we shall be interested in obtaining differential equations for $\rho_{gg}(t)$, $\rho_{aa}(t)$, and $\rho_{ga}(t) = \rho_{ag}^*(t)$. All other matrix elements must be eliminated and their effect replaced by appropriate constants. A very general procedure to this end, which can only be outlined here, begins by expanding $\rho_{ab}$ in terms of harmonics of the incident field,

$$\rho_{ab}(t) = \sigma_{ab}(t) + \sum_{n>0}\left[\sigma_{ab}^{(n)}(t)\,e^{in\omega t} + \sigma_{ba}^{(n)*}(t)\,e^{-in\omega t}\right] \tag{63}$$
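where $\sigma_{ab}(t)$, $\sigma_{ab}^{(n)}(t)$, and $\sigma_{ba}^{(n)*}(t)$ are complex amplitudes slowly varying in time, i.e., much more slowly than $e^{in\omega t}$. The quantities $\sigma_{ab}^{(n)}$ and $\sigma_{ba}^{(n)}$ are not complex conjugates of each other, although $\rho_{ab}$ and $\rho_{ba}$ are. This expression is substituted into Eq. (62) written for each matrix element of interest, i.e.,

$$\frac{\partial}{\partial t}\rho_{ab} = -i(\omega_a - \omega_b)\rho_{ab} - iE(t)\sum_c\left[\mu_{ac}\rho_{cb} - \rho_{ac}\mu_{cb}\right] \tag{64}$$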
and only terms of the appropriate order are finally retained. The rotating-wave approximation is again employed and provides the basic criterion for the retention of terms. Suppose we are interested in MPR N-photon ionization. For $\rho_{ga}(t)$ only the terms $\sigma_{ga}^{(M)}(t)$ and $\sigma_{ag}^{(M)*}(t)$ will be retained since it is an M-photon process that connects $|g\rangle$ and $|a\rangle$. By considering particular groups of matrix elements, one accounts for the shifts, the ionization width, etc. For example, to account for the shift $S_g$, the matrix elements $\rho_{gc}(t)$ must
be considered; in particular the terms $\sigma_{gc}^{(1)}$ and $\sigma_{cg}^{(1)*}$ will contribute to $S_g$. The set of states $|c\rangle$ corresponds to all nonresonant atomic states connected to $|g\rangle$ through a single-photon transition. The general aspects of this procedure have been discussed and applied to specific problems by Khronopoulo (1964), Butylkin et al. (1971), Elgin and New (1976), Elgin et al. (1976), and Georges et al. (1976, 1977). The interested reader will find a fairly detailed derivation, with many of the steps that are omitted here, in the paper by Georges et al. (1977). That paper as well as a later paper by Georges and Lambropoulos (1977) deal with TPR three-photon processes. On the basis of their results we can, without further derivation, write the equations for the case of MPR N-photon ionization:
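$$\left\{\frac{d}{dt} + i\left[M\omega - \omega_{ag} - (S_a - S_g)\right] + \tfrac{1}{2}\left(\Gamma_a^{\mathrm{ION}} + \Gamma_a^{0}\right)\right\}\sigma_{ga}^{(M)}(t) = i\left[\sigma_{aa}(t) - \sigma_{gg}(t)\right]\tilde{\mu}_{ga}^{(M)}\mathcal{E}^M \tag{65a}$$

$$\frac{d}{dt}\sigma_{gg}(t) = \Gamma_a^{0}\,\sigma_{aa}(t) + 2\,\mathrm{Im}\left[\tilde{\mu}_{ga}^{(M)*}\sigma_{ga}^{(M)}(t)(\mathcal{E}^*)^M\right] \tag{65b}$$

$$\left(\frac{d}{dt} + \Gamma_a^{\mathrm{ION}} + \Gamma_a^{0}\right)\sigma_{aa}(t) = -2\,\mathrm{Im}\left[\tilde{\mu}_{ga}^{(M)*}\sigma_{ga}^{(M)}(t)(\mathcal{E}^*)^M\right] \tag{65c}$$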
where the symbols correspond to the notation used in the previous sections. The ionization width here represents nonresonant (N - M)-photon ionization. Two differences between these equations and the corresponding equations for the amplitudes discussed earlier deserve some attention at this point. First, the shifts of the resonant states $|g\rangle$ and $|a\rangle$ occur in the form of the difference $S_a - S_g$, which explicitly demonstrates that it is the relative shift of the two states that matters. Second, the first term in the right side of Eq. (65b) shows that the repopulation of $|g\rangle$ by spontaneous decay is accounted for, whereas in contrast Eq. (65c) shows how both $\Gamma_a^{\mathrm{ION}}$ and $\Gamma_a^{0}$ depopulate $|a\rangle$. Obviously we have gained the additional information in the equations at the expense of having to deal now with three instead of two equations. Note that $\rho_{gg} = \sigma_{gg}$ and $\rho_{aa} = \sigma_{aa}$, and as a result the ionization can be calculated either as $1 - \sigma_{gg}(t) - \sigma_{aa}(t)$ or as $\int_0^t \Gamma_a^{\mathrm{ION}}\sigma_{aa}(t')\,dt'$. These equations can in principle be solved by Laplace transform with the initial conditions $\sigma_{gg}(0) = 1$ and $\sigma_{aa}(0) = 0$. Analytic solutions are in general of little value since the time dependence must be expressed in terms of exponentials involving the roots of an algebraic equation of third degree. Although possible in principle, such solutions do not yield significant information upon mere inspection. There are, however, special cases corresponding to particular combinations of values of the parameters in which analytic solutions can be fairly useful.
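Since closed-form solutions are of limited use, the three coupled equations are usually integrated numerically. The following is a minimal sketch, assuming a constant field, a real Rabi frequency `omega_r`, a detuning `delta` that already contains the relative shift, and direct spontaneous decay from the upper to the lower resonant state. It uses one common sign convention for the off-diagonal element, which may differ from that of Eqs. (65), and all function and variable names are hypothetical. It returns the ionization computed in both of the ways mentioned above.

```python
def evolve_density_matrix(omega_r, delta, gamma_ion, gamma0, t_end, dt=0.01):
    """Integrate two-level density-matrix equations with ionization
    and spontaneous repopulation of the lower state (RK4, constant field).

    State tuple: (s_gg, s_aa, s_ga) with s_ga the slowly varying coherence.
    Returns (s_gg, s_aa, ionization from trace, ionization from integral).
    """
    gamma = gamma_ion + gamma0  # total decay rate of the upper population

    def rhs(y):
        s_gg, s_aa, s_ga = y
        pump = omega_r * s_ga.imag
        return (
            gamma0 * s_aa - pump,                 # repopulation plus driving
            pump - gamma * s_aa,                  # loss by ionization and decay
            -(1j * delta + 0.5 * gamma) * s_ga    # coherence damps at gamma/2
            + 0.5j * omega_r * (s_gg - s_aa),
        )

    def rk4_step(y, h):
        k1 = rhs(y)
        k2 = rhs(tuple(v + 0.5 * h * k for v, k in zip(y, k1)))
        k3 = rhs(tuple(v + 0.5 * h * k for v, k in zip(y, k2)))
        k4 = rhs(tuple(v + h * k for v, k in zip(y, k3)))
        return tuple(v + h * (a + 2 * b + 2 * c + d) / 6.0
                     for v, a, b, c, d in zip(y, k1, k2, k3, k4))

    y = (1.0, 0.0, 0j)          # atom initially in the lower state
    ion_integral = 0.0
    for _ in range(int(round(t_end / dt))):
        y_next = rk4_step(y, dt)
        # Trapezoidal rule for the integral of gamma_ion * s_aa.
        ion_integral += 0.5 * dt * gamma_ion * (y[1] + y_next[1])
        y = y_next
    s_gg, s_aa, _ = y
    return s_gg, s_aa, 1.0 - s_gg - s_aa, ion_integral
```

For example, `evolve_density_matrix(1.0, 0.0, 0.5, 1.0, 60.0)` gives two ionization values that agree to within the integration error, and because the spontaneous decay returns population to the lower state, nearly all of the population is eventually ionized.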
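It will facilitate our further discussion if we recast our equations in a somewhat notationally simpler form. Let us denote by $|1\rangle$ the lower and by $|2\rangle$ the upper resonant atomic states with respective energies $\omega_1$ and $\omega_2$. Thus in Eqs. (65) we replace g by 1 and a by 2. Let us further introduce the symbol $\Omega_R$ for the Rabi frequency of the bound-bound transition $|1\rangle \leftrightarrow |2\rangle$, i.e.,

$$\Omega_R = 2\tilde{\mu}_{ga}^{(M)}\mathcal{E}^M = 2\tilde{\mu}_{12}^{(M)}\mathcal{E}^M \tag{66}$$

denote $S_a - S_g$ by $S_{21}$, and the detuning $M\omega - \omega_{21}$ by $\Delta$. We now write the equations for the density matrix as

$$\left[\frac{d}{dt} + i(\Delta - S_{21}) + \tfrac{1}{2}\left(\Gamma_2^{\mathrm{ION}} + \Gamma_2^{0}\right)\right]\sigma_{12}^{(M)}(t) = \tfrac{1}{2}\,i\left[\sigma_{22}(t) - \sigma_{11}(t)\right]\Omega_R \tag{67a}$$

$$\frac{d}{dt}\sigma_{11}(t) = \Gamma_2^{0}\,\sigma_{22}(t) + \mathrm{Im}\left[\Omega_R^*\,\sigma_{12}^{(M)}(t)\right] \tag{67b}$$

$$\left(\frac{d}{dt} + \Gamma_2^{\mathrm{ION}} + \Gamma_2^{0}\right)\sigma_{22}(t) = -\mathrm{Im}\left[\Omega_R^*\,\sigma_{12}^{(M)}(t)\right] \tag{67c}$$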
The dependence on the light intensity comes through the parameters $S_{21}$, $\Gamma_2^{\mathrm{ION}}$, and $\Omega_R$. The first depends on the intensity linearly, the second is proportional to $I^{N-M}$, while $\Omega_R$ is proportional to the Mth power of the field strength $\mathcal{E}$. Depending on the order N of the overall process, the order M of the resonance, and the intensity, $\Omega_R$ may be larger or smaller than $\Gamma_2^{\mathrm{ION}}$, and the ratio of their magnitudes may change from smaller to larger than one, or vice versa, as the intensity changes. This, of course, changes the development of the process in time. It will be noticed that in Eqs. (67) the damping (relaxation) constant for $\sigma_{12}(t)$ is one-half the damping constant for $\sigma_{22}(t)$. This will be true as long as there are no collisions and the damping is caused by radiation. In a more general situation, however, these two matrix elements may relax with arbitrary relaxation constants usually denoted by $1/T_2$ and $1/T_1$, with $T_1$ associated with the relaxation of the diagonal matrix element $\sigma_{22}$ (see, for example, Allen and Eberly, 1975). It is an additional advantage of the density matrix formalism that it allows the treatment of this more general case.
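The competition between the two intensity-dependent parameters can be made concrete with a few lines of arithmetic. This is a sketch only, assuming the power laws stated above with hypothetical proportionality constants `a` and `b` in arbitrary units: since the field strength scales as the square root of the intensity, the Rabi frequency goes as $I^{M/2}$ and the ionization width as $I^{N-M}$, so their ratio goes as $I^{M/2-(N-M)}$.

```python
def rabi_frequency(intensity, M, a=1.0):
    """Rabi frequency ~ (field strength)**M, i.e. intensity**(M/2)."""
    return a * intensity ** (M / 2.0)


def ionization_width(intensity, M, N, b=1.0):
    """(N-M)-photon ionization width ~ intensity**(N-M)."""
    return b * intensity ** (N - M)


def ratio_exponent(M, N):
    """Exponent p in (Rabi frequency)/(ionization width) ~ intensity**p."""
    return M / 2.0 - (N - M)


def crossover_intensity(M, N, a=1.0, b=1.0):
    """Intensity at which the two rates are equal, or None if their
    ratio does not depend on the intensity (p == 0)."""
    p = ratio_exponent(M, N)
    if p == 0:
        return None
    return (b / a) ** (1.0 / p)
```

For resonant two-photon ionization (M = 1, N = 2) the exponent is -1/2, so the Rabi frequency dominates at low intensity and the ionization width at high intensity. For M = 2, N = 3 both quantities are linear in the intensity and their ratio is fixed by the atomic parameters alone.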
VII. MULTIPLE RESONANCES

Our discussion has thus far been limited to the case of one resonance in a multiphoton transition of arbitrary order. As we have seen, the problem can be reduced to a two-level system with the appropriate damping constants and as such can be treated with any of the many methods developed for that
problem, sometimes lending itself to analytic solutions. The addition of even one more resonance makes the problem much more complicated and, except in very special cases, renders analytic solutions impossible. Nevertheless the problem is of current interest in connection with multiphoton processes in molecules (Cantrell et al., 1978; Letokhov, 1978) as well as some schemes for isotope separation in atoms (Solarz et al., 1978).

Three-level systems, i.e., two successive resonances, have received considerable attention and the literature on the subject constitutes a field in itself. This, of course, is due to the fact that the three-level model is related to a number of processes of wide interest and applicability, such as resonant Raman scattering and double optical resonance (DOR). In fact for a very special case, namely an atom with three levels none of which decays, Sargent and Horowitz (1976) have given an exact solution in the rotating-wave approximation. They have also shown how to extend their solution to the case in which all three levels decay with the same damping constant. We do not dwell here upon the details of the behavior of three-level systems but refer the interested reader to the reviews by Beterov and Chebotayev (1974) and Chebotayev (1976). Later, however, we return briefly to DOR under strong fields. Although in studies of three-level systems it is assumed that the successive resonances are connected via single-photon transitions, it is rather straightforward to generalize the results to the case of multiphoton transitions of arbitrary order between successive resonances. The dipole matrix elements connecting the resonant levels are simply replaced by effective matrix elements as indicated in the previous sections. However, if in that case the fields become sufficiently strong, it should be kept in mind that the AC Stark shifts must be included and that their influence on the behavior of the system may be significant.
Except for rather general formal results, it is extremely difficult, if possible at all, to obtain even qualitative expressions for multilevel systems with an arbitrary number of levels K. Thus recent investigations have relied mostly on numerical calculations with specific numbers of levels. Such calculations have been formulated in the semiclassical approach and the rotating-wave approximation. The equations of motion for the amplitudes of the Schrödinger equation can be written very easily. If the wavefunction for the K-level system is expanded as

$$|\psi(t)\rangle = \sum_a C_a(t)\exp(-i\omega_a t)\,|a\rangle \tag{68}$$
where the summation runs over the K states of the atomic system, substitution into the Schrödinger equation with the Hamiltonian $H_A + V(t)$ leads to the equations

$$\frac{d}{dt}C_a(t) = -\frac{i}{\hbar}\sum_b V_{ab}^{(0)}(t)\,C_b(t) \tag{69}$$

where

$$V_{ab}^{(0)}(t) = \langle a|V(t)|b\rangle\exp(i\omega_{ab}t), \qquad \omega_{ab} = \omega_a - \omega_b \tag{70}$$
The coefficients $C_a$ are equivalent to the quantities $U_{ag}(t)$ of the previous section. If the energies of the K levels are not equidistant, the field $E(t)$ must contain K - 1 frequencies, each of them nearly matching the energy difference between two adjacent levels. Thus the electric field $E(t)$ must have the form

$$E(t) = \sum_{m=1}^{K-1}\left[\mathcal{E}_m\exp(i\omega_m t) + \mathcal{E}_m^*\exp(-i\omega_m t)\right] \tag{71}$$
which if substituted into the set of Eqs. (69) will lead to exponentials of the form $\exp[\pm i(\omega_m \pm \omega_{ab})t]$. The rotating-wave approximation requires that only terms containing differences of the form $\omega_m - \omega_{ab}$ for which $\omega_m \approx \omega_{ab}$ be kept. For each successive pair of levels only one such near resonance is assumed. If we introduce a sequence of detunings $\Delta_{ab} = \omega_{ab} - \omega_m$ (where $\omega_a > \omega_b$) and a transformed interaction matrix

$$D_{ab} = V_{ab}^{(0)}(t)\exp(-i\Delta_{ab}t) \tag{72}$$
the explicit time dependence can be eliminated by introducing the ansatz $C_a(t) = \exp(-i\lambda_a t)\,c_a(t)$. Then we obtain a set of equations for $c_a$ that contain no explicit time dependence if certain conditions are imposed on the $\lambda_a$ terms. The problem is thus reduced to the evaluation of the eigenvalues of a Hermitian matrix. Approaches along these lines have been presented by Einwohner et al. (1976) and Eberly et al. (1977). Einwohner et al. (1976) have also discussed the application of graph theory to the solution of these problems. Eberly et al. (1977) have written the equations of motion in terms of transition amplitudes $T_{ab}(t)$ to find the system in state $|a\rangle$ at time t if it was in state $|b\rangle$ at t = 0. Their equations, Eq. (73), carry the initial condition $T_{ab}(0) = \delta_{ab}$. The parameters $\Omega_a$ are Rabi frequencies between the levels a and a + 1, while $\Delta_a$ represents a cumulative detuning for the ath transition. It is understood as before that a and b assume the values 1 to K. The set of the $K^2$ complex quantities $T_{ab}$ provides all the information contained in the Schrödinger equation. The behavior of the K-level system is determined by the detunings and the Rabi frequencies. If one had K - 1 lasers with independently tunable frequencies, the parameters $\Delta_a$ and $\Omega_a$ could be chosen to have any desired values because $\Omega_a$ is proportional to the amplitude of the ath laser. On the other hand, a multiphoton transition up the vibrational ladder of a molecule with one laser (Letokhov,
1978) does not have that freedom because there are well-defined relations between the successive transition matrix elements and detunings. The information usually sought about a K-level system is the population of the various levels at time t if it is known that at t = 0 the system was in the ground state. We have seen in the previous sections that a two-level system exhibits oscillations of the populations between the two resonant states. One of the natural questions therefore is whether similar behavior exists in a K-level system. More generally we are interested in whatever regularities exist in the temporal behavior of the system. Eberly et al. (1977) have presented numerical results for two special cases (equal Rabi frequencies, and the harmonic Rabi case, i.e., Rabi frequencies increasing as the square root of the level index) for a variety of values of K up to K = 15. In a follow-up paper (Bialynicka-Birula et al., 1977) they have also given analytical solutions for particular cases. It appears that for K = 4 or larger, the periodicities found in two- and three-level systems do not exist. A persistent and apparently general feature of those calculations seems to be a surge of the population of the Kth level as compared to the lower levels. Some periodicity has also been found under special conditions. One consistent quasi-periodicity has, however, been found in the recurrence of substantial population of the initial level. A sample of the results of Eberly et al. (1977) is shown in Fig. 5. These model calculations have dealt with K-level systems without loss. As a result, population cannot leak out of the system. The predicted behavior should be applicable to real systems as long as ionization and/or dissociation are not too strong. This can be expressed as the condition $\Gamma_a t \ll 1$, where t is the interaction time and $\Gamma_a$ the largest intensity-dependent decay width of the levels of the system.
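The reduction of the lossless K-level ladder to a Hermitian eigenvalue problem can be sketched in a few lines. This is an illustration under assumed conventions, not the code of the cited papers: in the rotating frame the matrix carries the cumulative detunings on its diagonal and half the Rabi frequency of each adjacent pair on its off-diagonals (so that a resonant two-level system oscillates as $\sin^2(\Omega t/2)$), and the units are such that $\hbar = 1$. The function names are hypothetical.

```python
import numpy as np


def rwa_hamiltonian(rabi, detunings):
    """K x K rotating-frame matrix for a ladder of K levels.

    rabi      : K-1 Rabi frequencies between adjacent levels
    detunings : K cumulative detunings (first entry normally 0)
    """
    rabi = np.asarray(rabi, dtype=float)
    detunings = np.asarray(detunings, dtype=float)
    h = np.diag(detunings).astype(complex)
    h += np.diag(0.5 * rabi, 1) + np.diag(0.5 * rabi, -1)
    return h


def populations(rabi, detunings, times):
    """Level populations at the given times, starting in the lowest level."""
    h = rwa_hamiltonian(rabi, detunings)
    energies, vectors = np.linalg.eigh(h)        # Hermitian eigenproblem
    c0 = np.zeros(h.shape[0], dtype=complex)
    c0[0] = 1.0
    overlaps = vectors.conj().T @ c0             # expand initial state
    times = np.atleast_1d(np.asarray(times, dtype=float))
    phases = np.exp(-1j * np.outer(times, energies))
    amplitudes = (phases * overlaps) @ vectors.T  # back to the level basis
    return np.abs(amplitudes) ** 2               # shape (len(times), K)
```

The equal-Rabi case corresponds to `populations(np.ones(K - 1), np.zeros(K), t)` and the harmonic case to `populations(np.sqrt(np.arange(1, K)), np.zeros(K), t)`. For K = 2 and K = 3 on exact resonance the upper-level populations reduce to $\sin^2(\Omega t/2)$ and $\sin^4(\Omega t/2\sqrt{2})$, which provides a convenient check.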
A related problem examined from a somewhat different angle is the behavior of an anharmonic oscillator under multiphoton excitation. This corresponds to absorption up a vibrational ladder including the effects of anharmonicity, which causes the higher levels to be progressively more detuned, a question touched upon by Eberly et al. (1977). A calculation by Walker and Preston (1977) has addressed the question of whether a completely classical description, in which not only the field but also the anharmonic oscillator is described classically, is appropriate. Their results seem to suggest that a classical treatment predicts the average behavior fairly successfully but that multiphoton resonant effects can not be accounted for. The question of knowing a priori when resonant effects are important is of course at the heart of the matter. The currently popular and certainly reasonable wisdom seems to be that as long as other couplings to a large number of degrees of freedom exist, resonant effects are averaged out and the classical calculation should be just as good. At this point, however, there does not exist sufficient quantitative comparison between theory and experiment to provide reliable criteria for the validity of the various approaches.
FIG. 5. Level populations of an N-level system (for N = 3, 4, 7, 15) in resonant N-photon excitation (all levels exactly resonant) with all Rabi frequencies assumed equal. The populations are plotted vs. $\Omega t$ with $\Omega$ the common Rabi frequency and t the time. (From Eberly et al., 1977.)
The dynamics of a K-level system can also be formulated more generally in terms of the density matrix. The mathematical complexity then escalates significantly because we have to deal with $\tfrac{1}{2}K(K + 1)$ differential equations of complex-valued functions. If under certain conditions the set of equations can be reduced, considerable simplification ensues and the density matrix can be quite useful. From processes with weak fields it is known that often the off-diagonal matrix elements can be eliminated through a series of approximations, thus reducing the problem to a set of K equations for the diagonal matrix elements (the populations of the levels) only. These are also referred to as kinetic or rate equations (Parker et al., 1978). Under strong fields, the possibility of Rabi oscillations may necessitate the inclusion of the off-diagonal matrix elements. Rabi oscillations imply some coherence in the excitation. If the diagonal matrix elements are to be sufficient for the description of the process even in strong fields, there must be something that destroys the coherence. As we see in some detail later, a large laser bandwidth is usually sufficient to justify the use of kinetic equations.

The question of when rate equations are justified under strong fields has been addressed recently in a number of papers (Ackerhalt and Eberly, 1976; Ackerhalt and Shore, 1977; Ackerhalt, 1978; Parker et al., 1978). In general, it is expected that rate equations will be a good approximation when there are damping mechanisms that prevent coherent effects such as Rabi oscillations. The radiation bandwidth can act as one such mechanism while coupling to a dense set of states or a continuum can be another. Ackerhalt and Eberly (1976), for instance, have discussed a model and a set of conditions in which the rate equations become valid in a multilevel system by arranging the intensities so that each transition between two levels is stronger than the preceding one. Their particular model consisted of a four-level system with ionization of the top level. The rate of ionization had to be larger than the Rabi frequency of the last bound-bound transition and hence larger than all other Rabi frequencies. Clearly, it is ionization that provides the damping in that model since their particular choice of Rabi frequencies makes ionization the dominant rate. If rate equations are applicable, life is certainly much easier for the theorist, but this does not mean that ionization or dissociation proceeds at the fastest possible rate. In fact, the highest efficiency occurs when some coherence exists in the process (Garrison and Wong, 1976; Ackerhalt and Shore, 1977), as suggested at least by studies on systems of relatively low dimensionality. It appears that a principle well known in electrical circuit theory, i.e., impedance matching, is also valid in multiphoton transitions to a continuum.
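The role of damping in validating rate equations can be seen already in the simplest case. The sketch below is a toy model chosen for illustration, not the four-level model of Ackerhalt and Eberly: a resonantly driven two-level system whose upper level ionizes at a rate `gamma` (taken as a full width). When `gamma` greatly exceeds the Rabi frequency `omega`, adiabatic elimination of the upper amplitude gives a ground-state population decaying as $\exp(-\Omega^2 t/\Gamma)$, which is what a rate equation would predict. All names are hypothetical.

```python
import math


def coherent_populations(omega, gamma, t_end, dt=0.002):
    """Integrate c_g' = -i(omega/2) c_a, c_a' = -i(omega/2) c_g - (gamma/2) c_a
    with RK4 and return the populations (|c_g|^2, |c_a|^2) at t_end."""

    def rhs(cg, ca):
        return -0.5j * omega * ca, -0.5j * omega * cg - 0.5 * gamma * ca

    cg, ca = 1.0 + 0j, 0j
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(cg, ca)
        k2 = rhs(cg + 0.5 * dt * k1[0], ca + 0.5 * dt * k1[1])
        k3 = rhs(cg + 0.5 * dt * k2[0], ca + 0.5 * dt * k2[1])
        k4 = rhs(cg + dt * k3[0], ca + dt * k3[1])
        cg += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        ca += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return abs(cg) ** 2, abs(ca) ** 2


def rate_equation_ground_population(omega, gamma, t):
    """Ground-state population from the rate-equation limit gamma >> omega."""
    return math.exp(-omega ** 2 * t / gamma)
```

With `omega = 1` and `gamma = 20` the coherent and rate-equation results agree to within a percent or so at `t = 10`. Scanning `gamma` at fixed `omega` in this toy model also shows that the surviving population after a fixed time is smallest when `gamma` is comparable to `omega` (near `gamma = 2 * omega` here), which is at least suggestive of the impedance-matching remark above.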
VIII. FIELD STATISTICS AND BANDWIDTH EFFECTS

A multiphoton transition, resonant or otherwise, is an inherently nonlinear process and as such depends not simply on the intensity of the field but also on its coherence properties. For a multiphoton transition to be completed, more than one photon must be absorbed within a very short time, depending on the response of the atom. As a result, the atom “sees” the fluctuations in the “arrival” of the photons and responds not only to the average number of photons per unit time but also to the way this number fluctuates. These stochastic fluctuations of the amplitude and/or the phase of the electromagnetic field reflect what is referred to as the coherence or correlation or statistical or stochastic properties of the field. If a classical field undergoes amplitude fluctuations, then there will also be intensity fluctuations, above those of the Poisson distribution found even in a pure coherent state. If the classical field has a constant amplitude and undergoes
only phase fluctuations, there will be no intensity fluctuations but the field acquires a finite (nonzero) bandwidth. In general, there are both amplitude and phase fluctuations. The bandwidth of the light source plays no role in a nonresonant multiphoton process, because by definition nonresonant means that the bandwidth is much smaller than the smallest detuning. In that case, it is only the intensity fluctuations that are seen by the atom. The study of the effects of intensity fluctuations (or photon statistics) on nonresonant multiphoton processes began as early as 1966. The related theory was formulated in terms of a single mode of the radiation field and the resulting phenomena were also referred to as photon statistics or photon correlation effects. Treatments in which the single-mode assumption was not employed have also been published (Mollow, 1968; Agarwal, 1970). The subject has most recently been reviewed by Lambropoulos (1976) and references to the earlier literature can be found in that article. The fundamental result of those studies has been that the rate of nonresonant N-photon ionization with chaotic light is larger by a factor of N! than with purely coherent light.

Qualitatively speaking, the bandwidth begins to play a role as soon as it becomes comparable to the detuning from an intermediate atomic state, i.e., as soon as an atomic state becomes near-resonant. Since the atomic state generally has a width and a lineshape of its own, due to its finite spontaneous lifetime, it can be said that the laser bandwidth becomes important when it begins to overlap with the width of the state. Viewed in the time domain, it implies that the time scale of the field fluctuations is comparable to or faster than the lifetime of the near-resonant state. If the laser fluctuations are much slower than the atomic lifetime, the atom does not see the fluctuations of the field.
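Returning briefly to the factor of N! quoted above for nonresonant ionization, its origin can be checked with a short Monte Carlo estimate. This is a sketch under a standard classical assumption that is not spelled out in the text: the intensity of single-mode chaotic light is taken to follow an exponential distribution, and the N-photon rate is taken to be proportional to the average of $I^N$. For a field of fixed intensity the same ratio is 1. The function name and sample size are arbitrary choices.

```python
import math
import random


def chaotic_enhancement(order, samples=200_000, seed=0):
    """Monte Carlo estimate of <I^N> / <I>^N for exponentially
    distributed intensity with unit mean (assumed chaotic-light model)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        total += rng.expovariate(1.0) ** order   # mean intensity is 1
    return total / samples


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, chaotic_enhancement(n), math.factorial(n))
```

The estimates approach N! as the number of samples grows; higher orders converge more slowly because they are dominated by rare intensity spikes, which is the same reason the enhancement exists in the first place.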
This qualitative picture is valid if the field undergoes phase fluctuations only or its intensity is weak, in the sense that the resonant bound-bound transition is not saturated. If the intensity is strong and in addition the field undergoes amplitude fluctuations, the picture is no longer as simple, because strong intensity in the sense of saturation implies a nonlinear dependence of the transition on the field, which as pointed out earlier leads to dependence on intensity fluctuations and in general on all the higher-order correlation functions, and hence the complete statistics of the field. Note that the bandwidth is determined by the first-order correlation function alone. Thus a quantitative understanding of the phenomenon requires a more complete mathematical formulation. The theory of the effects of field fluctuations on saturated transitions has received much attention in recent years (Apanasevich et al., 1968; Zusman and Burshtein, 1972; Przhibelskii and Khodovoi, 1972; Przhibelskii, 1973, 1977; Oseledchik, 1976; Agarwal, 1976, 1978, 1979; Carmichael and Walls, 1976; Eberly, 1976, 1979; de Meijere and Eberly, 1978; McClean and Swain, 1977, 1979; Kimble and Mandel, 1977; Zoller and Ehlotzky, 1977; Zoller,
1977, 1978, 1979a,b; Avan and Cohen-Tannoudji, 1977; Elyutin, 1977; Georges and Lambropoulos, 1978; Georges et al., 1979). The most recent interest stems mainly from new experimental results on resonance fluorescence (Walther, 1978; Wu et al., 1975; Ezekiel and Wu, 1978), double optical resonance (Whitley and Stroud, 1976; Wong et al., 1977; Moody and Lambropoulos, 1977; Hogan et al., 1978), and multiphoton transitions (Agostini et al., 1978; Marx et al., 1978). Mainly as a result of advances in tunable dye lasers, experiments have advanced to the point where the effect of field correlations can now be seen in studies of multiphoton processes.

To discuss some of the elements of the theory of these effects, we must return to the initial equations. Let us consider the problem in the density matrix formalism as given by Eqs. (67) for the special case of a two-photon process, in which case we must take M = 1 and can for simplicity omit that superscript from our equations. For the sake of further generality, which as we shall see is important in the present context, we allow $\sigma_{12}(t)$ and $\sigma_{22}(t)$ to have different relaxation constants in the absence of the field. Thus the field-free relaxation constant $\tfrac{1}{2}\Gamma_2^{0}$ in Eq. (67a) and the constant $\Gamma_2^{0}$ in Eqs. (67b) and (67c) are replaced by two independent constants. Our equations now become
$\Gamma_{\rm ion}(t)$, $S_{21}(t)$, and the Rabi frequency $\Omega_R(t)$ are shown as time dependent to reflect the fact that the field is now written as $E(t) = \mathcal{E}(t)e^{i\omega t} + \mathcal{E}^*(t)e^{-i\omega t}$, where the time variation of $\mathcal{E}(t)$ is assumed to be stochastic and much slower than that of $e^{i\omega t}$. The above equations are now stochastic differential equations because the dependent variables $\sigma_{ij}$ are coupled to the stochastically fluctuating quantity $\mathcal{E}(t)$. As usual in stochastic processes, the nature of these fluctuations can be described by a probability distribution reflecting the distribution of the values that $\mathcal{E}(t)$ takes as a result of its fluctuations around some mean value. The quantities of interest are now $\langle\sigma_{ij}(t)\rangle$, where the angular brackets indicate averages over the probability distribution appropriate to the stochastic nature of $\mathcal{E}(t)$. If we take the averages of Eqs. (74), we encounter averages of products of the form $\langle S_{21}(t)\sigma_{21}(t)\rangle$ and $\langle\Omega_R^*(t)\sigma_{12}(t)\rangle$, etc., which cannot in general be separated into products of averages because the field and atomic quantities inside the brackets
fluctuate in a correlated fashion. In other words, the two quantities cannot be decorrelated in the equations except in special cases. One such special case is obtained when the field is weak. The quantities $\sigma_{ij}(t)$ then do not vary significantly with time and can therefore be decorrelated from the field variables. Another special case corresponds to a field that undergoes phase fluctuations only. The amplitude $\mathcal{E}(t)$ is then written as $\mathcal{E}_0 e^{i\phi(t)}$, with the phase $\phi(t)$ being a stochastic function of time. In a model very commonly used to describe the fluctuations of $\mathcal{E}(t)$, the phase is assumed to represent a Wiener-Levy process; the model is usually referred to as the phase diffusion model (PDM). It corresponds to a CW laser operating well above threshold with a well-stabilized amplitude and a randomly fluctuating phase (Haken, 1969). It can be shown quite generally that within the PDM the decorrelation can be performed rigorously. Proofs can be found in one form or another in most of the papers quoted earlier; for a selected and representative collection the reader is referred to the papers by Fox (1972), Wodkiewicz (1979a), and Agarwal (1977, 1979). To perform the decorrelation, one solves Eq. (74a) formally, obtaining an expression for $\sigma_{12}(t)$ that is then multiplied by $\Omega_R^*(t)$. Having now an expression for the product of the two quantities, we calculate the average $\langle\Omega_R^*(t)\sigma_{12}(t)\rangle$, which is needed in the other two equations. It is at this point that the decorrelation is performed in the right side of the integral equation for $\langle\Omega_R^*(t)\sigma_{12}(t)\rangle$. This equation can now be reconverted to a differential equation, which, together with the other two equations (also averaged), now reads
The key result exhibited in these equations is contained in Eq. (75a), where the laser bandwidth $\gamma_L$ is seen to appear added to the off-diagonal relaxation constant $\Gamma^{0}_{12}$. Otherwise, all other quantities appearing in Eqs. (74) have now been replaced by their stochastic averages. The laser bandwidth $\gamma_L$ enters through the first-order correlation function of the field,

$$\langle \mathcal{E}^*(t_1)\,\mathcal{E}(t_2)\rangle = \mathcal{E}_0^2 \exp\left(-\tfrac{1}{2}\gamma_L |t_1 - t_2|\right) \qquad (76)$$

The quantity $\bar{\Omega}_R$ appearing earlier, referred to as the average Rabi frequency, is given by $\bar{\Omega}_R = 2\hbar^{-1}\mu_{12}\mathcal{E}_0$ and represents the root-mean-square value of the stochastic process $\Omega_R(t)$. It will also be noticed that the other field-dependent quantities $S_{21}$ and $\Gamma_{\rm ion}$ have been replaced by their averages, i.e., they can be calculated with a constant field of value $\mathcal{E}_0$.

To solve Eqs. (75) we can eliminate $\langle\Omega_R^*(t)\sigma_{12}(t)\rangle$ by solving Eq. (75a) formally (a step we had to go through in obtaining the averaged equations) and substituting into Eqs. (75b) and (75c), thus obtaining a set of integrodifferential equations for $\langle\sigma_{11}(t)\rangle$ and $\langle\sigma_{22}(t)\rangle$. These can be solved by Laplace transform, but only in special cases are the resulting expressions simple enough to be useful by inspection. The nature of the solution is similar to that of the monochromatic case, except that now the laser bandwidth $\gamma_L$ also influences whether Rabi oscillations are significant and how fast they are damped. In general the solutions must be obtained numerically.

That the only effect of the laser bandwidth is to increase the off-diagonal relaxation constant is due to the particular model adopted for the laser. The off-diagonal element $\sigma_{12}(t)$ reflects the coherence of the process, as opposed to the populations of the states. Since it is only the phase of the field that fluctuates, while the amplitude is constant, one expects that it will affect only the coherence of the process and not the populations. The fact that mathematically $\gamma_L$ enters so simply has to do with the particular form of the correlation function of Eq. (76). Its form leads to a Lorentzian lineshape, which in addition to this simplicity can also introduce unrealistic effects, to be touched upon later. Although we have here cast the formalism in terms of the density matrix, these effects can be and have been formulated in terms of any of the other formalisms discussed earlier.
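The two statements made above about the PDM, that a diffusing phase gives the exponential correlation function of Eq. (76) and that the bandwidth acts only as extra damping of the coherence, can be illustrated with a minimal numerical sketch. It is not the authors' calculation. The function names and parameters are hypothetical. The first function assumes Gaussian phase increments of variance $\gamma_L\,dt$. The second assumes a closed, resonantly driven two-level atom without ionization or shifts, with the bandwidth represented only through the total coherence decay rate `gamma_coh`; the coefficient with which $\gamma_L$ enters Eq. (75a) is deliberately left open.

```python
import math
import random


def pdm_correlation(gamma_l, lag, n_paths=20000, n_steps=10, seed=0):
    """Monte Carlo estimate of <exp(i[phi(t + lag) - phi(t)])> for a diffusing phase.

    Assumption: phase increments over a time dt are Gaussian with variance
    gamma_l * dt, the choice that yields exp(-gamma_l * |lag| / 2).
    """
    rng = random.Random(seed)
    sigma = math.sqrt(gamma_l * lag / n_steps)  # std of one phase step
    total = 0.0
    for _ in range(n_paths):
        dphi = sum(rng.gauss(0.0, sigma) for _ in range(n_steps))
        total += math.cos(dphi)  # the sine part averages to zero by symmetry
    return total / n_paths


def excited_population(omega, gamma_pop, gamma_coh, t_end, n_steps=4000):
    """RK4 solution for rho22(t) of a resonantly driven, closed two-level atom.

    omega     : Rabi frequency (constant, as in the averaged equations)
    gamma_pop : decay rate of the upper-state population
    gamma_coh : total decay rate of the coherence (atomic part plus an
                assumed bandwidth contribution)
    """
    def rhs(rho22, p):
        # p = Im(rho12); rho11 = 1 - rho22 because ionization is ignored here
        return (omega * p - gamma_pop * rho22,
                -0.5 * omega * (2.0 * rho22 - 1.0) - gamma_coh * p)

    dt = t_end / n_steps
    rho22, p = 0.0, 0.0
    history = [rho22]
    for _ in range(n_steps):
        k1 = rhs(rho22, p)
        k2 = rhs(rho22 + 0.5 * dt * k1[0], p + 0.5 * dt * k1[1])
        k3 = rhs(rho22 + 0.5 * dt * k2[0], p + 0.5 * dt * k2[1])
        k4 = rhs(rho22 + dt * k3[0], p + dt * k3[1])
        rho22 += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        p += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        history.append(rho22)
    return history
```

With `gamma_coh = 0` the population oscillates fully between the two states. Once `gamma_coh` exceeds twice the Rabi frequency, the oscillations disappear and the population rises smoothly toward one half, which is the regime in which a single rate can describe the process.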
The PDM has in the last three years or so been employed in a number of papers dealing with bandwidth effects in resonance fluorescence (Agarwal, 1976; Eberly, 1976; Kimbel and Mandel, 1977; Avan and Cohen-Tannoudji, 1977; Zoller and Ehlotzky, 1977; Zoller, 1977), in resonant two- and three-photon ionization (de Meijere and Eberly, 1978; Agostini et al., 1978), and in double optical resonance (Hogan et al., 1978; Georges and Lambropoulos, 1978). In particular, de Meijere and Eberly (1978) have devoted considerable attention to the question of the existence of a rate equation for two-photon ionization. They have shown that the laser bandwidth plays a significant role in smoothing out the Rabi oscillations, thus enabling one to write a time-independent rate equation for a wide range of parameters. We have already seen that the ionization width plays a similar role. If the experiment allows for sufficient interaction time, both of these damping mechanisms act to smooth out Rabi oscillations. In further work, Eberly and O'Neil (1979) have presented extensive numerical calculations covering a large part of the parameter space of two-photon resonant ionization, with the single rate found to be valid over a large part of that space. It must, of course, always be kept in mind that the interaction time
plays here an important role. If, for example, the laser intensity is such that all of the ionization takes place within one or two Rabi periods, the single rate is not necessarily correct. This is apt to occur when neither the laser bandwidth nor the ionization width is sufficiently large compared to the Rabi frequency.

The effect of phase fluctuations on higher-order processes with a single resonance ($M$-photon-resonant $N$-photon ionization) can be easily studied along similar lines, starting from Eqs. (67). The averaging and decorrelation are performed in a manner that parallels the development above. The equations obtained are essentially identical to Eqs. (75), except for the term containing the laser bandwidth $\gamma_L$, which now occurs multiplied by $M$ because the resonant bound-bound transition is of $M$th order. Consequently the bandwidth enters through an $M$th-order correlation function. The average Rabi frequency is again given by Eq. (66), with $\mathcal{E}^{M}$ replaced by $\mathcal{E}_0^{M}$. An application of this to two-photon-resonant three-photon ionization, where a derivation can also be found, has been given by Agostini et al. (1978), while Eberly (1978) has studied a similar problem in a two-photon bound-bound transition $4s \to 4D$ of sodium, with the fluorescence from 4D being the observed quantity (Marx et al., 1978).

The situation changes drastically if the field undergoes amplitude fluctuations. The bandwidth and the other stochastic features of the field no longer enter the formalism in a simple fashion. Now not only the phase but also the magnitude of the amplitude fluctuates, and as a result not only the relaxation of $\sigma_{12}(t)$ but also that of $\sigma_{22}(t)$ is affected. Amplitude fluctuations cause fluctuations of the Rabi frequency itself, as a glance at Eq. (66) reveals, which means that the rate of induced transitions between $|1\rangle$ and $|2\rangle$ undergoes fluctuations. This affects the way population transfers between the two states and consequently the effective relaxation between $|2\rangle$ and $|1\rangle$.
It is no longer true that the net effect is the addition of field-related constants to the relaxation constants $\Gamma^{0}_{12}$ and $\Gamma^{0}_{22}$. In fact no substitution rule of this type exists. From the mathematical standpoint, the implication is that the decorrelation between field and atomic variables is no longer valid. It would be valid only in the weak-field limit. If the decorrelation is nevertheless performed as an approximation, the theory essentially becomes a weak-field theory that does not allow the study of saturation phenomena. The approximation becomes better when the laser bandwidth is large compared to the averaged Rabi frequency, but even then care must be taken to include the effect of intensity fluctuations, which is an inevitable consequence of amplitude fluctuations. To show explicitly the difference between this case and the PDM, we consider here the example of two-photon-resonant three-photon ionization by a chaotic field. The density matrix equations, as derived by Agostini et al. (1978), but in the notation of our Eqs. (66) and (67), can be
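written as

[Eqs. (77): only a fragment of one right-hand-side term survives, $2\,\mathrm{Im}\{\ldots\langle\sigma_{12}(t)\,[\mathcal{E}^*(t)]^2\rangle\}$, labeled (77a).]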
In deriving this set of equations, the decorrelation approximation has been made, but higher-order corrections that allow for the effect of intensity fluctuations have been included. This effect is manifested in the factor of 3 multiplying $\langle S_{21}\rangle$ and $\langle\Gamma_{\rm ion}\rangle$ in Eq. (77a) and the factor of 2 multiplying the right-hand side of Eq. (77a). The enhancement of the shift by a factor of 3 is due to the fact that the shift, although itself linear in the intensity, enters the process in a nonlinear fashion. Imagine the equations solved for ionization as given by $1 - \langle\sigma_{11}(t)\rangle - \langle\sigma_{22}(t)\rangle$. Obviously the shift $S_{12}(t)$ will occur as part of this complicated expression, which when averaged over the field fluctuations does not allow the shift to be factored out and averaged separately. Physically, the shift involves the virtual absorption and emission of real (laser) photons, and since the intensity fluctuates during the interaction time, the net effect will be a shift whose magnitude depends on the order of the process and of the resonance, as well as on the stochastic properties of the field. The factor of 3 obtained here has to do with the properties of the chaotic field and the particular approximations in the treatment of Agostini et al. (1978). The importance of this enhancement of the shift in a multiphoton resonant transition has been recognized relatively recently, although the idea had been implicit in papers by Kovarskii and Perel’man (1975) and Kovarskii et al. (1976).

The factor of 3 in front of $\langle\Gamma_{\rm ion}\rangle$ is part of the enhancement by 3! that is expected of a three-photon nonresonant process in a chaotic field. Here it is broken up into two factors: the 3 and the 2 that appears on the right-hand side of Eq. (77a). The physical interpretation of this factorization rests on the realization that in the limit of very large laser bandwidth there is no correlation between the two-photon excitation of $|2\rangle$ and its subsequent ionization.
Then the only enhancement from intensity fluctuations is the factor of 2 in the bound-bound two-photon transition $|1\rangle \to |2\rangle$. Very large laser bandwidth here means $\gamma_L \gg \Gamma^{0}_{12},\ \langle\Gamma_{\rm ion}\rangle,\ \langle S_{12}\rangle,\ \bar{\Omega}_R$.
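The enhancement factors of 2 and 3! quoted above are the intensity moments of a chaotic field, $\langle I^N\rangle = N!\,\langle I\rangle^N$. As a minimal sketch, assuming that the field amplitude at a single time is a complex Gaussian variable with independent real and imaginary parts (the function name and sample size are illustrative only), the moments can be checked by direct sampling:

```python
import random


def intensity_moment_ratio(order, n_samples=200000, seed=1):
    """Estimate <I^order> / <I>^order for a chaotic (complex Gaussian) field."""
    rng = random.Random(seed)
    sum_i = 0.0
    sum_i_pow = 0.0
    for _ in range(n_samples):
        re = rng.gauss(0.0, 1.0)      # real part of the field amplitude
        im = rng.gauss(0.0, 1.0)      # imaginary part of the field amplitude
        intensity = re * re + im * im  # exponentially distributed intensity
        sum_i += intensity
        sum_i_pow += intensity ** order
    mean_i = sum_i / n_samples
    return (sum_i_pow / n_samples) / mean_i ** order
```

For `order = 2` and `order = 3` the estimate comes out close to 2 and 6, respectively. For a phase-diffusion field the intensity is constant and the same ratio is exactly 1, which is why no such factors appear in the PDM equations.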
It should be evident now that in the general case of $M$-photon-resonant $N$-photon ionization in a chaotic field there will be analogous enhancement factors for the shift and the ionization width. Without these enhancement factors, even if the field bandwidth has been included, the model cannot represent a field with amplitude fluctuations, even in the weak-field limit. It must be underscored, however, that the above procedure, allowing for the enhancements as it does, still represents an approximation valid for large field bandwidths.

Complete solutions of the problem of amplitude fluctuations in resonant transitions have been obtained very recently by Zoller (1979b) and Georges and Lambropoulos (1979). In his approach, Zoller has employed a Fokker-Planck technique whose details can be found in papers referred to earlier (Zoller, 1979a,b). In brief, the chaotic field is assumed to be Markovian and is represented by its Fokker-Planck equation. Zoller has shown that if one is interested in certain one-time atomic-field averages, the stochastic density matrix equations can be reduced to an infinite set of differential equations for these averages. Under particular conditions, solutions of these equations can be obtained in terms of continued fractions.

In a different approach, Georges and Lambropoulos (1979) have proceeded with the density matrix formalism discussed above. The chaotic field has been written as a complex Gaussian stochastic process described by the infinite sequence of its field correlation functions. Such a process is not necessarily Markovian but has been assumed Markovian, with first-order correlation function as given by Eq. (76). Note that the correlation functions of the chaotic field obey well-known relations (Glauber, 1963a,b). In attempting to calculate averages such as $\langle n(t)\rangle = \langle\sigma_{22}(t)\rangle - \langle\sigma_{11}(t)\rangle$, for instance, one encounters correlations of the form $\langle\mathcal{E}^*(t_1)\mathcal{E}(t_2)\,n(t_2)\rangle$, which for a chaotic field cannot be decorrelated.
Using the correlation functions of the field and the integral equation for $\langle n(t)\rangle$, a series expansion in diagrammatic form has been obtained. Again under particular conditions, a solution in terms of a continued fraction has been obtained. These approaches have been applied to resonance fluorescence and double optical resonance (Zoller, 1979b; Georges et al., 1979; Georges and Lambropoulos, 1979).

One of the key results of this work is that the chaotic field is less effective than a purely coherent field in saturating a bound-bound transition. This is true even if it is a higher-order transition, in which case one might have expected the chaotic field to be more effective because of the enhancement by the factor of $N!$. The point is that this enhancement occurs for low intensities. As saturation sets in, the process becomes highly nonlinear, higher-order correlation functions become important, and the advantage of $N!$ is lost quickly. Of course, in the limit of large intensity, all fields lead to saturation, but the chaotic field does so more slowly than the coherent field. How slowly depends on the bandwidth of the field. This behavior is illustrated in
FIG. 6. Saturation of a two-level system under a strong stochastic field. $R$ is the ratio $\langle\sigma_{22}(\infty)\rangle_{\rm CH}/\langle\sigma_{22}(\infty)\rangle_{\rm PD}$, where CH denotes the chaotic and PD the phase-diffusion field. The plot is vs $\Omega/\Gamma_0$, with $\Omega$ the Rabi frequency and $\Gamma_0$ the spontaneous decay width of the upper state. Curves 1-6 correspond to field bandwidths $\gamma_L = 0$, $0.2\Gamma_0$, $0.5\Gamma_0$, $\Gamma_0$, $2\Gamma_0$, and $5\Gamma_0$, respectively. (From Georges et al., 1979.)
Fig. 6 from the paper by Georges et al. (1979). The results of that figure also show how inaccurate the decorrelation approximation would be for a chaotic field. The analogous behavior for a two-photon resonance can be found in the paper by Georges and Lambropoulos (1979).

The above difference in behavior between chaotic and coherent fields has a number of consequences for resonance fluorescence, double optical resonance, and resonant multiphoton processes in general. Some of these have been explored, and details can be found in the papers by Zoller (1979b), Georges et al. (1979), Georges and Lambropoulos (1979), and Zoller and Lambropoulos (1979). Most tunable dye lasers used in multiphoton experiments exhibit amplitude fluctuations to some degree. At this time, however, comparisons with theoretical models are scant and rather qualitative. Presumably in the near future this will change. The present status of experimental information is reviewed in Section IX.

As mentioned earlier, a Lorentzian lineshape is inherent in all of the above approaches. A realistic laser line, however, is not expected to be Lorentzian. A few linewidths away from its center, its wings will fall off much faster than those of a Lorentzian. As a consequence, the use of a Lorentzian mathematical model can lead to unphysical results, especially when detunings large compared to the laser spectral width are involved. The problem has been discussed quantitatively by Zoller and Lambropoulos (1979) and by Dixit and Lambropoulos (1979).
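The statement that a chaotic field saturates a bound-bound transition less effectively than a coherent one can be made concrete in the simplest limit. The sketch below is not a reproduction of Fig. 6 or of the continued-fraction solutions cited above. It assumes a zero-bandwidth ("static") chaotic field, so that the steady-state population of a two-level atom, taken here as $\tfrac{1}{2}s/(1+s)$ with $s$ a saturation parameter proportional to the intensity, is simply averaged over an exponential intensity distribution. The function names and the saturation parameter are illustrative assumptions.

```python
import math


def saturated_population(s):
    """Steady-state upper-level population for saturation parameter s (assumed form)."""
    return 0.5 * s / (1.0 + s)


def chaotic_average(s_mean, x_max=60.0, n=60000):
    """Average of saturated_population over an exponential intensity distribution.

    The intensity is x * s_mean with x distributed as exp(-x); the integral is
    evaluated with Simpson's rule on [0, x_max] (n must be even).
    """
    h = x_max / n
    total = 0.0
    for k in range(n + 1):
        x = k * h
        weight = 1.0 if k in (0, n) else (4.0 if k % 2 else 2.0)
        total += weight * math.exp(-x) * saturated_population(s_mean * x)
    return total * h / 3.0


def saturation_ratio(s_mean):
    """Chaotic-to-coherent ratio of the saturated population at equal mean intensity."""
    return chaotic_average(s_mean) / saturated_population(s_mean)
```

In this limit the ratio tends to 1 at low intensity (a one-photon transition has no $N!$ advantage), drops below 1 as saturation sets in, and returns slowly toward 1 at very high intensity, in qualitative agreement with the behavior described in the text.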
IX. EXPERIMENTAL INVESTIGATIONS OF RESONANT MULTIPHOTON PROCESSES

Experimentally, the presence of a resonant intermediate state in an $N$-photon process is manifested in three main ways and combinations thereof:

(a) The dependence of the total yield on the intensity of the laser is not proportional to the $N$th power of the laser intensity, but may in general exhibit a much more complicated intensity dependence. If this intensity dependence happens to be proportional to a certain apparent power of the intensity, it is denoted by $N_{\rm app}$ and is referred to as the apparent index of nonlinearity.

(b) The total yield, if measured as a function of laser frequency, will generally exhibit a resonance structure, i.e., a peak at the position of the resonant state, or at a shifted position if the circumstances are such that a Stark shift is significant.

(c) The apparent index of nonlinearity will usually exhibit a dispersive structure as the laser frequency is tuned around the apparent position of the resonance. This behavior, of course, is a combination of effects (a) and (b), but it constitutes a rather sensitive probe of the resonance structure and has often been given separate attention in experiments.

Historically, the investigation of the intensity dependence was the first method employed in the study of resonant effects. This was necessitated by the lack of tunability of high-power lasers ten years ago. Exploiting accidental resonances of existing laser frequencies (mainly the ruby and the Nd-glass laser) with particular atomic systems, studies were conducted of two-photon absorption and multiphoton ionization. Those early experiments have been discussed in previous reviews by Bakos (1974) and by Lambropoulos (1976), and need not be discussed again. We only point out that some of the experiments that had initially shown departures from the expected $I^N$ intensity dependence were later repeated under more controlled experimental conditions and showed no such departures.
It was eventually realized that the culprit was the expansion of the interaction volume with increasing intensity. This presents a problem even for a nonresonant process because it depends nonlinearly on the spatial distribution of the field strength or the intensity in the interaction volume. If this distribution were uniform, there would be no complication. The distribution is not uniform, however, and as the laser power increases the outer regions of the distribution begin contributing to the process, thus altering its saturation behavior. The details of this effect can be found, for example, in the paper of Chin and Isenor (1970), while its
influence on a resonant process has been discussed by Agostini et al. (1978). It must be emphasized here, however, that the intensity dependence of any multiphoton process can be very misleading if allowance is not made for the possible effect of the spatial intensity distribution, which of course presupposes knowledge of that distribution.

Experimental investigation of the frequency dependence of resonant processes is a more recent development that has received much impetus from the tunability provided by dye lasers. A few early investigations of this type that had employed the small thermal tunability of the ruby laser were by necessity of very limited scope and have been reviewed elsewhere (Lambropoulos, 1976). Here we focus our attention on more recent experiments from which somewhat detailed information has been obtained. One of these studies (Morellec et al., 1976) has its origins in experiments that date back to the early 1970s, reported by Held et al. (1972a,b). It involves the study of three-photon-resonant four-photon ionization of atomic cesium with a tunable neodymium laser. Over the range of tunability, a three-photon resonance with the 6F state of cesium occurs. Initially the experiments were performed with relatively long (of the order of 3.5 x sec) pulses. In a more recent paper (Lompre et al., 1978) the measurements have been extended to the regime of ultrashort pulse durations, with results reported at 1.5 nsec, 50 psec, and 15 psec. In all of these experiments the resonance as a function of frequency has been seen quite clearly over a range of 1 or 2 cm$^{-1}$. The overall width contains the unresolved hyperfine structure of the ground state and the fine structure of the 6F state. The most recent data have, moreover, shown the broadening and distortion of the resonance profile due to saturation effects. An asymmetry that develops in the profile is not clearly understood.
It is, however, quite likely that processes involving Cs$_2$ molecules play a significant role. These recent data have also addressed the question of the effect of pulse duration on the resonance profile, which has been studied theoretically by Crance and Feneuille (1977) and Crance (1978). Although the comparison with these theories cannot be considered conclusive, the experiments have shown the resonance to be clearly visible down to times as short as 10- sec.

Another interesting aspect of these experiments is the study of the AC Stark shift due to the laser. It is found to vary linearly with the laser intensity, in accordance with calculations by Gontier and Trahin (1978). The shift clearly will play a role in the dependence of the apparent index of nonlinearity on frequency. This quantity, defined as $\partial \log W/\partial \log I$, was measured in these experiments, with the most recent data reported in the paper by Morellec et al. (1976). It shows a dispersive behavior around the resonance, and its theoretical interpretation has attracted the interest of a number of authors (Chang and Stehle, 1973; Chang, 1974; Gontier and Trahin, 1979; Eberly, 1979; Petite et al., 1979). The fits that have been obtained are similar to the most recent one given by Eberly (1979). One side of the curve fits the data well while the other does not. Obviously which side fits well depends on the parameters chosen, including the shift and intensity. The most self-consistent fit, in the sense that all atomic parameters have been calculated within the same model by the same authors, is that of Gontier and Trahin (1979). In any case, there seems to exist an anomaly around that resonance. It is conceivable that, although unresolved, the fine and hyperfine structure plays a role, since the various components may shift differently. The laser bandwidth in these experiments was of the order of 1-1.5 cm$^{-1}$, and as a result it dominated the width of the resonance.

Experiments in which the effect of the laser bandwidth was examined have been reported by Agostini et al. (1978) in two-photon-resonant three-photon ionization of sodium and by Marx et al. (1978) in two-photon excitation, also of sodium. In both cases the bandwidth affected the two-photon transition. The experiment of Marx et al. (1978) has been analyzed by Eberly (1978) on the basis of the phase diffusion model. His calculation of the dependence of the process on the laser intensity and bandwidth is in good agreement with the available experimental data. In the paper of Agostini et al. (1978), the analysis is based on the decorrelation approximation including the enhancement due to amplitude fluctuations, which surely were present in the experiment. The effects of laser bandwidth, saturation, and interaction volume expansion have been included in the model. The atomic parameters used in the analysis were calculated on the basis of quantum defect theory. Despite the otherwise good agreement between theory and experiment, an overall disagreement by a factor of four in the intensity necessary to fit the data remains a mystery. Its solution would be easy if an error could be attributed to the intensity measurement.
However, no substantial evidence in that direction exists, and the mystery will remain until further interplay between theory and experiment clarifies it.

Along somewhat different lines, consistency between experiment and theory has been obtained in the study of AC Stark splitting in doubly resonant three-photon ionization, i.e., double optical resonance detected by ionization (Hogan et al., 1978; Georges et al., 1979). The main effect in that case was that of laser bandwidth and amplitude fluctuations on the asymmetry of peaks. The agreement is only qualitative, however; it will presumably improve in the near future as more detailed data become available. The doublet structure of double optical resonance and the triplet structure of resonance fluorescence have been shown to be quite sensitive to the stochastic properties of the strong field, and related experiments are likely to prove quite valuable in the investigation of these properties.

Most recently, Bjorklund et al. (1978) have observed two-photon-resonant three-photon ionization in atomic hydrogen, where two different lasers, one of
fixed wavelength 266 nm and the other tunable around 224 nm, were used to achieve the two-photon resonance $1s \to 2s$. The radiation at 266 nm, being stronger, also caused the ionization. It is noteworthy that saturation of the two-photon transition was achieved, since this is a notoriously weak two-photon transition and moreover requires intense UV radiation. The calculated dependence on laser intensity, as reported in the paper of Bjorklund et al. (1978), is in good agreement with the experimental data. In a subsequent paper, Ausschnitt et al. (1978) have discussed the use of this process in the detection of hydrogen in plasmas.

In the experiments discussed above, one of the objectives was the study of some aspect of the behavior of a resonant multiphoton process. With the exception of two-photon spectroscopy in its various forms, there do not seem to exist at this time experiments on resonant multiphoton processes of order higher than two for which there is a complete theoretical interpretation. Although some aspects seem to fit theoretical models well, others present significant discrepancies. One would expect this situation to change in the near future, however, as experiments under more controlled conditions become available. There are nevertheless many experiments designed for specific purposes that have given reasonable agreement with those aspects of the theory that they were intended to test, usually the intensity dependence of the process. Noteworthy among such examples are experiments in diatomic or even polyatomic molecules (Johnson, 1976; Berg et al., 1978a,b; Bray and Hochstrasser, 1976; Zakheim and Johnson, 1978; Williamson et al., 1978; Williamson and Compton, 1979), which have also tested light polarization effects not discussed in this review since they have been reviewed elsewhere (Lambropoulos, 1976; Parker et al., 1978).
In such molecular multiphoton processes, there almost always exist couplings to radiationless transitions in dense manifolds of levels, which provide strong damping for the resonant levels, thus eliminating most coherent effects. Consequently, the interplay between coherent excitation and the stochastic properties of light is more apt to be seen in atomic transitions. For the same reason, AC Stark shifts are not expected to be easily detectable in molecules.
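The apparent index of nonlinearity $\partial \log W/\partial \log I$ used throughout this section, and its distortion by the expansion of the interaction volume, can be illustrated with a short numerical sketch. The model is a deliberately crude assumption and not that of any experiment cited here: a saturating $N$-photon ionization probability $1 - \exp(-I^N)$ in dimensionless units, integrated over a two-dimensional Gaussian transverse profile $I(r) = I_0 e^{-r^2}$. All function names are hypothetical.

```python
import math


def local_yield(intensity, order):
    """Ionization probability at one point for a saturating N-photon process.

    Assumed form 1 - exp(-I^N), with rate constant and pulse length set to one.
    """
    return -math.expm1(-intensity ** order)


def volume_yield(peak_intensity, order, r_max=6.0, n=6000):
    """Yield integrated over an assumed Gaussian profile I(r) = I0 * exp(-r^2)."""
    h = r_max / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h  # midpoint rule in the radial coordinate
        local = local_yield(peak_intensity * math.exp(-r * r), order)
        total += 2.0 * math.pi * r * local * h
    return total


def apparent_index(yield_fn, intensity, rel_step=1e-3):
    """Central-difference estimate of d(log W) / d(log I)."""
    up = yield_fn(intensity * (1.0 + rel_step))
    down = yield_fn(intensity * (1.0 - rel_step))
    d_log_i = math.log(1.0 + rel_step) - math.log(1.0 - rel_step)
    return (math.log(up) - math.log(down)) / d_log_i
```

For $N = 4$, `apparent_index(lambda i: volume_yield(i, 4), 0.1)` is close to 4, as expected well below saturation. At `3.0`, a uniformly illuminated volume gives an index of essentially zero, whereas the Gaussian profile still gives a value near 0.8 because the outer regions keep contributing. This is the sense in which an unknown spatial distribution can make the measured intensity dependence misleading.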
ACKNOWLEDGMENT

The authors gratefully acknowledge many discussions with Dr. P. Zoller.
REFERENCES

Ackerhalt, J. R. (1978). Phys. Rev. A 17, 293.
Ackerhalt, J. R., and Eberly, J. H. (1976). Phys. Rev. A 14, 1705.
Ackerhalt, J. R., and Shore, B. W. (1977). Phys. Rev. A 16, 277.
Agarwal, G. S. (1970). Phys. Rev. A 1, 1445.
Agarwal, G. S. (1974). “Quantum Optics.” Springer-Verlag, Berlin and New York.
Agarwal, G. S. (1976). Phys. Rev. Lett. 37, 1383.
Agarwal, G. S. (1977). Phys. Rev. A 15, 814.
Agarwal, G. S. (1978). Phys. Rev. A 18, 1490.
Agarwal, G. S. (1979). Z. Phys. B 33, 111.
Agostini, P., Georges, A. T., Wheatley, S. E., Lambropoulos, P., and Levenson, M. D. (1978). J. Phys. B 11, 1733.
Allen, L., and Eberly, J. H. (1975). “Optical Resonance and Two-Level Atoms.” Wiley, New York.
Apanasevich, P. A., Zhovna, G. I., and Khapalyuk, A. P. (1968). J. Appl. Spectrosc. 8, 14.
Armstrong, L., Beers, B. L., and Feneuille, S. (1975). Phys. Rev. A 12, 1903.
Ausschnitt, C. P., Bjorklund, G. C., and Freeman, R. R. (1978). Appl. Phys. Lett. 33, 851.
Avan, P., and Cohen-Tannoudji, C. (1977). J. Phys. B 10, 155.
Bakos, J. S. (1974). Adv. Electron. Electron Phys. 36, 57.
Bebb, H. B. (1966). Phys. Rev. 149, 25.
Bebb, H. B. (1967). Phys. Rev. 153, 23.
Bebb, H. B., and Gold, A. (1966). Phys. Rev. 143, 1.
Beers, B. L., and Armstrong, L. (1975). Phys. Rev. A 12, 2447.
Berg, J. O., Parker, D. H., and El-Sayed, M. A. (1978a). J. Chem. Phys. 68, 673.
Berg, J. O., Parker, D. H., and El-Sayed, M. A. (1978b). Chem. Phys. Lett. 56, 411.
Beterov, I. M., and Chebotayev, V. P. (1974). Prog. Quantum Electron. 3, Part I.
Bialynicka-Birula, Z., Bialynicki-Birula, I., Eberly, J. H., and Shore, B. W. (1977). Phys. Rev. A 16, 2048.
Bjorklund, G. C., Ausschnitt, C. P., Freeman, R. R., and Storz, R. H. (1978). Appl. Phys. Lett. 33, 54.
Bloembergen, N., and Levenson, M. D. (1976). In “High-Resolution Laser Spectroscopy” (K. Shimoda, ed.), p. 318. Springer-Verlag, Berlin and New York.
Bonch-Bruevich, A. M., Kostin, H. N., and Khodovoi, V. A. (1968). Sov. Phys.-Usp. (Engl. Transl.) 10, 637.
Bray, R. C., and Hochstrasser, R. M. (1976). Mol. Phys. 31, 1199.
Butylkin, V. S., Kaplan, A. E., and Khronopoulo, Yu. G. (1971). Sov. Phys.-JETP (Engl. Transl.) 32, 501.
Cantrell, C. D., Galbraith, H. W., and Ackerhalt, J. R. (1978). In “Multiphoton Processes” (J. H. Eberly and P. Lambropoulos, eds.), p. 307. Wiley, New York.
Carmichael, H. J., and Walls, D. F. (1976). J. Phys. B 9, 1199.
Chang, C. S. (1974). Phys. Rev. A 9, 1769.
Chang, C. S., and Stehle, P. (1973). Phys. Rev. Lett. 30, 1283.
Chebotayev, V. P. (1976). In “High-Resolution Laser Spectroscopy” (K. Shimoda, ed.), p. 140. Springer-Verlag, Berlin and New York.
Chin, S. L., and Isenor, N. R. (1970). Can. J. Phys. 48, 1445.
Choi, C. W., and Payne, M. G. (1977). Two-photon ionization. Report ORNL/TM-5754. Oak Ridge Natl. Lab., Oak Ridge, Tennessee.
Cohen-Tannoudji, C. (1967). Cargese Lect. Phys. 2.
Cohen-Tannoudji, C., and Haroche, S. (1969a). J. Phys. (Paris) 30, 125.
Cohen-Tannoudji, C., and Haroche, S. (1969b). J. Phys. (Paris) 30, 153.
Crance, M. (1978). J. Phys. B 11, 1931.
Crance, M., and Feneuille, S. (1977). Phys. Rev. A 16, 1587.
Davydkin, V. A., Zon, B. A., Manakov, N. L., and Rapoport, L. P. (1971). Sov. Phys.-JETP (Engl. Transl.) 33, 70.
Delone, N. B. (1975). Sov. Phys.-Usp. (Engl. Transl.) 18, 169.
de Meijere, J. L. F., and Eberly, J. H. (1978). Phys. Rev. A 17, 1416.
Dixit, S. N., and Lambropoulos, P. (1979). Phys. Rev. A 19, 1576.
Dixit, S. N., Zoller, P., and Lambropoulos, P. (1980). Phys. Rev. A 21, 1289.
Eberly, J. H. (1976). Phys. Rev. Lett. 37, 1387.
Eberly, J. H. (1978). J. Phys. B 11, L611.
Eberly, J. H. (1979). Phys. Rev. Lett. 42, 1049.
Eberly, J. H., and Lambropoulos, P. (eds.) (1978). “Multiphoton Processes.” Wiley, New York.
Eberly, J. H., and O'Neil, S. V. (1979). Phys. Rev. A 19, 1161.
Eberly, J. H., Shore, B. W., Bialynicka-Birula, Z., and Bialynicki-Birula, I. (1977). Phys. Rev. A 16, 2038.
Einwohner, T. H., Wong, J., and Garrison, J. C. (1976). Phys. Rev. A 14, 1452.
Elgin, J. N., and New, G. H. C. (1976). Opt. Commun. 16, 242.
Elgin, J. N., New, G. H. C., and Orkney, K. E. (1976). Opt. Commun. 18, 250.
Elyutin, P. V. (1977). Opt. Spectrosc. 43, 318.
Ezekiel, S., and Wu, F. Y. (1978). In “Multiphoton Processes” (J. H. Eberly and P. Lambropoulos, eds.), p. 145. Wiley, New York.
Fedorov, M. V. (1976). P. N. Lebedev Physical Institute Preprint No. 144.
Feneuille, S., and Armstrong, L. (1975). J. Phys. (Paris) 36, L235.
Fox, R. F. (1972). J. Math. Phys. 13, 1196.
Garrison, J. C., and Wong, J. (1976). Lawrence Livermore Lab. [Rep.] OPG 76-1.
Georges, A. T., and Lambropoulos, P. (1977). Phys. Rev. A 15, 727.
Georges, A. T., and Lambropoulos, P. (1978). Phys. Rev. A 18, 587.
Georges, A. T., and Lambropoulos, P. (1979). Phys. Rev. A 20, 991.
Georges, A. T., Lambropoulos, P., and Marburger, J. H. (1976). Opt. Commun. 18, 509.
Georges, A. T., Lambropoulos, P., and Marburger, J. H. (1977). Phys. Rev. A 15, 300.
Georges, A. T., Lambropoulos, P., and Zoller, P. (1979). Phys. Rev. Lett. 42, 1609.
Glauber, R. (1963a). Phys. Rev. 130, 2529.
Glauber, R. (1963b). Phys. Rev. 131, 2766.
Goldberger, M. L., and Watson, K. M. (1964). “Collision Theory.” Wiley, New York.
Gontier, Y., and Trahin, M. (1978). J. Phys. B 11, L131.
Gontier, Y., and Trahin, M. (1979). Phys. Rev. A 19, 264.
Haken, H. (1969). In “Handbuch der Physik” (S. Flügge, ed.), Vol. 25, Part 2c, p. 1. Springer-Verlag, Berlin and New York.
Heitler, W. (1954). “The Quantum Theory of Radiation.” Oxford Univ. Press (Clarendon), London and New York.
Held, B., Mainfray, G., and Morellec, J. (1972a). Phys. Lett. A 39, 57.
Held, B., Mainfray, G., and Morellec, J. (1972b). Phys. Rev. Lett. 28, 130.
Hertel, I. V., and Ross, K. J. (1969). J. Phys. B 2, 484.
Hogan, P. B., Smith, S. J., Georges, A. T., and Lambropoulos, P. (1978). Phys. Rev. Lett. 41, 229.
Hurst, G. S., Payne, M. G., Nayfeh, M. H., Judish, J. P., and Wagner, E. B. (1975). Phys. Rev. Lett. 35, 82.
Hurst, G. S., Nayfeh, M. H., and Young, J. P. (1977a). Appl. Phys. Lett. 30, 229.
Hurst, G. S., Nayfeh, M. H., and Young, J. P. (1977b). Phys. Rev. A 15, 2283.
Johnson, P. M. (1976). J. Chem. Phys. 64, 4143.
Kazakov, A. E., Makarov, V. P., and Fedorov, M. V. (1976). Sov. Phys.-JETP (Engl. Transl.) 43, 20.
Keldysh, L. V. (1964). Sov. Phys.-JETP (Engl. Transl.) 20, 1307.
Khodovoi, V. A., and Bonch-Bruevich, A. M. (1968). Sov. Phys.-Usp. (Engl. Transl.) 10, 637.
Khronopoulo, Yu. G. (1964). Izv. Vyssh. Uchebn. Zaved. Radiofiz. 7, 674 (in Russ.).
Kimbel, H. J., and Mandel, L. (1977). Phys. Rev. A 15, 689.
ASPECTS OF RESONANT MULTIPHOTON PROCESSES
239
Kotova, L. P., and Terent’ev, M. V. (1967). Sou. Phys.-JETP(Eng1. Transl.) 25,481. Kovarski, V. A,, and Perel’man. N. F. (1971). Sou. Phys-JETP (Engl. Transl.) 33,274. Kovarski, V. A,, and Perel’man, N. F. (1972). Sot;. Phys.-JETP (Engl. Transl.) 34, 738. Kovarski, V. A,, and Perel’man, N. F. (1975). Sou. Phys.-JETP (Engl. Transl.) 41,226. Kovarski, V. A,, Perel’man, N. F., and Todirashku, S. S . (1976). Sou. J . Quantum Electron. (Engl. Transl.)6,980. Lambropoulos, M., Moody, S. E., Smith, S. J., and Lineberger, W. C. (1975). Phys. Rev. Lett. 35, 159. Lambropoulos, P. (1967). Phys. Reu. 164, 84. Lambropoulos, P. (1974). Phys. Reu. A 9, 1992. Lambropoulos, P. (1976). Ado. A t . Mol. Phys. 12, 87. Letokhov, V. S. (1978). In “Multiphoton Processes”(J. H. Eberly and P. Lambropoulos, eds.), p. 331. Wiley, New York. Leung, K. M., Ward, J. F., and Orr, B . J. (1974). Phys. Rev. A 9, 2440. Lompre, L. A., Mainfray, G., Manus, C., and Thebouilt, J. (1978). J . Phys. 39, 610. McClean, W. A,, and Swain, S . (1977). J . Phys. B 10, L143. McClean, W. A., and Swain, S . (1979). J. Phys. B 19,723. Marx, B., Simons, J., and Allen, L. (1978). J . Phys. B 11, L611. Messiah, A. (1965). “Quantum Mechanics,” Vol. 2. Wiley, New York. Mollow, B. R. (1968). Phys. Reg. 175, 1555. Moody, S. E., and Lambropoulos, M. (1977). Phys. Rec. A 15, 1497. Morellec, J., Normund, D., and Petite, G . (1976). Phys. Rev. A 14, 300. Oleinik, V. P. (1967). Sou. Phys.-JETP (Engl.Transl.) 25, 697. Oleinik, V. P. (1968). Sou. Phys.-JETP (Engl. Transl.) 26, 1132. Oseledchik, Yu. S. (1976). J . Appl. Spectrosc. 25, 1036. Parker, D. H., Berg, J. O., and El-Sayed, M. (1978). Adu. Laser Chem., p. 182. Petite, G., Morellec, J., and Normand, D. (1979). 40, 115. Power, E. A. (1978). In “Multiphoton Processes” (J. H. Eberly and P. Lambropoulos, eds.), p. 11. Wiley, New York. Power, E. A,, and Zienau, S. (1959). Trans. R . Soc. London, Ser. A 251,427. Przhibelskii, S. G. (1973). Opt. Spectrosc. 35,415. Przhibelskii, S. 
G. (1977). Opt. Spectrosc. 42, 8. Przhibelskii, S . G., and Khodovoi, V. A. (1972). Opt. Spectrosc. 32, 125. Ritus, V. I. (1967). Sou. Phys.-JETP (Engl. Transl.)24, 1041. Sakurai, J. J. (1967). “Advanced Quantum Mechanics.” Addison-Wesley, Reading, Massachusetts. Sargent, M., 111, and Horowitz, P. (1976). Phvs. Rev. A 13, 1962. Sayer, B., Wang, R., Jeannet, J . C., and Sassi, M. (1971). J . Phys. B 4, L20. Solarz, R. W., Paisner, J . A,, and Carlson, L. R. (1976). Bull. Am. Phys. Soc. 22 [2], 736. Solarz, R. W., Paisner, J. A., and Worden, E. F. (1978). In “Multiphoton Processes” (J. H . Eberly and P. Lambropoulos, eds.), p. 267. Wiley, New York. Theodosiou, C. E., Armstrong, L., Crance, M., and Feneuille, S . (1979). Phys. Rev. A 19, 766. Voronov, G. S . (1967). Sou. P ~ . Y s . - J E T P(Engl. Trunsl.) 24, 1009. Walker, R. B., and Preston, R. K. (1977). J. Chem. Phy5.5,2017. Wallace, S . C., and Zdasiuk, G. (1976). Appl. Phys. Lett. 15,449. Walther, H. (1978). In “Multiphoton Processes” (5. H. Eberly and P. Lambropoulos, eds.), p. 129. Wiley, New York. Wang, C . C., and Davis, L. I. (1975). Phys. Reu. Lett. 35, 650. Ward, J. F., and Smith, A . V . (1975). Phys. Rev. Lett. 35,653. Whitley, R. M., and Stroud, C . R. (1976). Phys. Reu. A 14, 1498. Williamson, A. D., and Compton, R. N. (1979). J . Chem. Phys. 69, 851. Williamson, A. D., Compton, R. N., and Eland, J. H. D. (1978). J . Chem. Phys. 68, 2143.
240
A. T. GEORGES AND P . LAMBROPOULOS
Wodkiewicz, K. (1979a). J . Math. Phys. 20,45. Wodkiewicz, K . (1979b). Phys. Rev. A 19, 1686. Wong, J., Garrison, J. C., and Einwohner, T. H. (1976). Phys. Reo. A 13, 674. Wong, J., Garrison, J. C., and Einwohner, T. H. (1977). Phys. Rev. A 16, 213. Wu, F. Y., Grove, R. E., and Ezekiel, S. (1975. Phys. Rev. Lett. 35, 1426. Zakheim, D., and Johnson, P. M. (1978). J . Chem. Phys. 68,3644. Zimmermann, P., Ducas, T. W., Littman, M. G . , and Kleppner, D. (1974). Opt. Commun. 12, 198. Phys. . B 10, L321. Zoller, P. (1977). .I Zoller, P. (1978). J . Phys. B 11,805. Zoller, P . (1979a). Phys. Rev. A 19, 1151. Zoller, P. (1979b). Phys. Reo. A 20, 1019. Zoller, P., and Ehlotzky, F. (1977). J . Phys. B 10, 3023. Zoller, P., and Lambropoulos, P.(1979). J . Phys. B 12, L547. Zusman, L. D., and Burshtein, A. I. (1972). Sou. Phys.-JETP (Eng!. Trunsl.) 34,520.
Fundamentals and Applications of Auger Electron Spectroscopy

PAUL H. HOLLOWAY
Department of Materials Science and Engineering
University of Florida
Gainesville, Florida
I. Introduction
II. Fundamentals
    A. Basic Principles
    B. Characteristics of Auger Electron Spectroscopy
    C. Auger Line Shapes and Intensity
III. Experimental Approach
    A. Vacuum Requirement
    B. Energy Analyzers
    C. Computerization
IV. Quantitative AES
    A. Intensity Calibration
    B. Sample Homogeneity
V. Sample Damage
VI. Applications
    A. Fundamental Interface Studies
    B. Materials Science
    C. Catalysts
    D. Electronics
VII. Summary
References
I. INTRODUCTION

In one sense Auger electron spectroscopy (AES) is something of a novelty, but in another sense it is not novel at all. It is novel because in only 12 years it has become firmly established as an indispensable technique, used both in the most fundamental studies of solid surfaces and in the most applied manufacturing problems (e.g., quality control). Normally such a process is expected to require 20 or more years. Yet AES is not at all novel, because the fundamental mechanism of Auger emission was discovered in 1925 by Pierre Auger (1). Between 1925 and 1967, the Auger process was studied in free atoms (2-4), yet free atoms are seldom used in producing a tool or product for society. Therefore, the Auger effect was relegated to obscurity in the scientific world. Even though the basis for applying AES to solids was discovered 55 years ago, the technology to take advantage of the phenomenon has existed for only about 12 years (5).

Significant steps in the development of AES are shown in Table I (1-11). It is obvious that tremendous progress was made in the four years following the development by Harris of electronic differentiation for background suppression (5).

TABLE I
SIGNIFICANT EVENTS IN THE DEVELOPMENT OF AUGER ELECTRON SPECTROSCOPY (AES)

Date            Event
1600            Development of vacuum pump
1925            Discovery of Auger phenomenon (1)
1950s to date   Studies of the Auger effect in free atoms (2-4)
1953            Concept of surface analysis by Auger electrons (6)
1950-1960s      Development of ultrahigh vacuum techniques
1967            Development of electronic differential background suppression (5)
1968            Adaptation of LEED instruments to record Auger electron derivative spectra (7, 8)
1969            Adaptation of the cylindrical mirror analyzer to AES (9)
1969            Complementary use of ion sputtering with AES to obtain depth profiles (10)
1970            Scanning AES (11)

It should not be concluded from Table I that progress in AES ceased after 1970. On the contrary, progress in applying AES to technologically important problems has been phenomenal; it is now widely used in the metals, electronic, chemical, and numerous other industries. It is used both for research (basic and applied) and for process and product development. At the same time, the understanding of the Auger process has improved, along with the instrumentation and our knowledge of the limitations of the technique. Therefore, although several reviews of AES exist already, it seems appropriate to summarize again the state of the art in this exciting field.
II. FUNDAMENTALS

A. Basic Principles

1. Auger Emission and Notation
There are a number of reasons for the rapid expansion of surface science over the last 15 years, but primary among these has been the use of electron beams for studying solid surfaces. A large number of phenomena occur when electrons strike a solid, including emission of ions, neutrals, photons, and electrons. In the present instance the electrons are of prime concern, and the energy distribution of secondary electrons is shown in Fig. 1. This distribution is dominated by a peak in the N(E) curve at $E_p$, representing elastically scattered primary electrons (used in low-energy electron diffraction, LEED), and a peak at ≤50 eV representing true secondary electrons. The region between these two peaks has a low intensity and slope, yet with sufficient amplification small peaks can be detected in the N(E) curve. These correspond to Auger electron emission. Note that the Auger peaks are more readily identified in the dN/dE curve shown in Fig. 1b. This discovery, reported by Harris (5), caused rapid growth in the use of AES. As discussed below, the present trend seems to be a return to using the N(E) curve for AES, with background suppression by computer techniques.

FIG. 1. (a) Number of electrons N(E) vs. energy E resulting from bombardment of a solid with primary electrons of energy $E_p$. Region III, true secondary electrons; region II, rediffused primary and Auger electrons; region I, elastically scattered electrons. (b) The derivative of the N(E) vs. E curve, showing the background suppression accomplished by differentiation.
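The background suppression obtained by differentiation can be illustrated numerically. The following is only a minimal sketch, not the electronic method of Harris (5): it assumes a toy N(E) built from a linear background plus one weak Gaussian peak (the function names, peak position, height, and width are all hypothetical) and differentiates it by central differences.

```python
import math

def n_of_e(energy_ev, peak_ev=500.0, peak_height=0.02, peak_width=3.0):
    """Toy N(E): smooth sloping background plus one weak Gaussian 'Auger' peak.
    All parameters are hypothetical and chosen only for illustration."""
    background = 1.0 + 0.001 * energy_ev
    peak = peak_height * math.exp(-0.5 * ((energy_ev - peak_ev) / peak_width) ** 2)
    return background + peak

def derivative(values, step):
    """Central-difference estimate of dN/dE on a uniform energy grid."""
    return [(values[i + 1] - values[i - 1]) / (2.0 * step)
            for i in range(1, len(values) - 1)]

step = 0.5
energies = [400.0 + step * i for i in range(401)]    # 400 to 600 eV
counts = [n_of_e(e) for e in energies]
dn_de = derivative(counts, step)
inner_energies = energies[1:-1]                      # grid on which dN/dE is defined

background_slope = 0.001                             # slope of the toy background
peak_fraction = 0.02 / n_of_e(500.0)                 # peak height relative to N(E)
peak_to_peak = max(dn_de) - min(dn_de)               # excursion in dN/dE
e_max = inner_energies[dn_de.index(max(dn_de))]      # positive lobe (low-energy side)
e_min = inner_energies[dn_de.index(min(dn_de))]      # negative lobe (high-energy side)

print("peak as a fraction of N(E):", round(peak_fraction, 4))
print("peak-to-peak dN/dE over background slope:", round(peak_to_peak / background_slope, 2))
print("derivative extrema at", e_max, "and", e_min, "eV")
```

In this toy spectrum the peak is only about 1% of N(E), yet its peak-to-peak excursion in dN/dE is several times the constant background slope, which is the sense in which differentiation suppresses the background.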
The emission of Auger electrons results from the radiationless rearrangement of electrons in an excited atom. As shown in Fig. 2, an incident electron, photon, or even an ion may create a core hole by removal of a core electron (the "ionizing electron" of Fig. 2). If a photon caused ionization, the ejected photoelectron may be used for surface analysis by photoelectron spectroscopy (X-ray, XPS; ultraviolet, UPS), since the electron-photon interaction is a single-interaction event. However, ion-electron or electron-electron interactions are normally multiple-interaction events; therefore, the energy of the ionizing electron may occupy a range of states, and these interactions are not often used for analysis. The ionized atom in Fig. 2 will approach the ground state by filling the core hole with an electron from an upper level. The energy difference causes emission of a photon or an Auger electron; the sum of the probability of emission of a photon and of an Auger electron is unity. For energy differences below about 2000 eV, the probability of Auger emission is near unity. As a result, the light elements (atomic number < 14) de-excite almost exclusively by Auger emission, but Auger electrons are emitted by all elements, since electronic holes in the outer core levels can have de-excitation energies of less than 2000 eV.

FIG. 2. Energy level diagram of the electronic states in a solid and a schematic illustration of the Auger process.

The simplest convention to designate Auger transitions employs the electron level nomenclature developed by X-ray spectroscopists. The principal quantum levels are designated by K, L, M, N, and O, and spin-orbit splitting of subshells is designated by $M_1$, $M_2$, $M_3$, etc. In the Auger process illustrated in Fig. 2, the ionization occurred in the K level, de-excitation occurred from the $L_1$ level, and the Auger electron was emitted from the $L_{2,3}$ level. Therefore, the Auger transition is designated $KL_1L_{2,3}$. Auger emission from the overlapping levels of the valence band is often designated as KLV, KVV, etc.

The simple spectroscopic notation discussed above is normally adequate and is almost universally applied. However, it is not totally adequate for describing all observed Auger transitions, and more complicated designation schemes have been discussed (12,13). For example, when spin interactions dominate over Coulomb or exchange interactions, pure j-j coupling leads to six terms for a KLL transition: $KL_1L_1$, $KL_1L_2$, $KL_1L_3$, $KL_2L_2$, $KL_2L_3$, $KL_3L_3$. However, if pure L-S coupling is considered, ten terms are found:
$2s^0 2p^6$: $^1S_0$
$2s^1 2p^5$: $^1P_1$, $^3P_0$, $^3P_1$, $^3P_2$
$2s^2 2p^4$: $^1S_0$, $^3P_0$, $^3P_1$, $^3P_2$, $^1D_2$
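The two counts quoted above (six terms in pure j-j coupling, ten in pure L-S coupling) can be checked by enumeration. The sketch below is illustrative only: the L-S terms for each two-hole final configuration are taken from the list in the text and entered as hypothetical (S, L) pairs, and each term is then expanded into its J levels.

```python
from itertools import combinations_with_replacement

def jj_terms(subshells=("L1", "L2", "L3")):
    """KLL designations in pure j-j coupling: unordered pairs of L subshells."""
    return ["K" + a + b for a, b in combinations_with_replacement(subshells, 2)]

# L-S terms per final configuration, written as (S, L) pairs.
# These follow the list given in the text; they are not derived here.
LS_TERMS = {
    "2s0 2p6": [(0, 0)],                    # 1S
    "2s1 2p5": [(0, 1), (1, 1)],            # 1P, 3P
    "2s2 2p4": [(0, 0), (1, 1), (0, 2)],    # 1S, 3P, 1D
}

def ls_levels():
    """Expand every (S, L) term into levels with J = |L - S| ... L + S."""
    letters = "SPD"
    levels = []
    for config, terms in LS_TERMS.items():
        for s, l in terms:
            for j in range(abs(l - s), l + s + 1):
                levels.append((config, f"{2 * s + 1}{letters[l]}{j}"))
    return levels

print(jj_terms())
for config, level in ls_levels():
    print(config, level)
```

Only integer spins occur for two holes, so integer arithmetic on S, L, and J is sufficient in this sketch.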
The origin of these terms is shown in Fig. 3, as well as how they combine to give the peaks predicted by pure j-j coupling (4). As discussed by Chang (12), the Auger notation for L-S coupling is necessary with high energy resolution, such as that achieved in studies of Auger emission from gases (2-4,14-16). In the study of solids, however, the Auger peak is considerably broader than for free atoms (17,18), so the X-ray nomenclature has been adequate. As studies of Auger peak shapes from solids become more sophisticated (19), the use of L-S coupling nomenclature will become necessary.

2. Auger Electron Energies
Transition probabilities, rates, and energies can be calculated from first principles (4,20). However, self-consistent-field (SCF) values for one- and two-hole defect states must be used, and relativistic effects may have to be included for inner-level transitions (21). This can be done for free atoms, but inclusion of these defect states in calculating Auger energies from solids is very difficult and only infrequently attempted. Even for free atoms, the accuracy of first-principles calculations is not as good as that of semiempirical or empirical techniques; therefore, first-principles energies are seldom used.

FIG. 3. Relative line positions in the KLL Auger group vs. atomic number, showing the final state configuration and Auger peaks resulting from L-S coupling. The manner in which these L-S coupling Auger transitions combine to yield the common peaks denoted by j-j coupling schemes is illustrated. [From Siegbahn et al. (22). Reprinted with permission, Almqvist and Wiksell Boktryckeri A.B., Uppsala.]

Empirical methods of determining Auger energies use the single-ionization energies determined from X rays or photoemission (22). For binding energies designated by $E_i$, the energy released by an electron from the $L_2$ level dropping to the K level is given by $E_K - E_{L_2}$, but as discussed above, Auger electron emission leaves the ion in a doubly ionized state. Therefore, single-ionization energies give imprecise values for the Auger electron energies. As a result, Auger energies have been calculated by

$$E_{KL_2L_3} = E_K(Z) - E_{L_2}(Z) - E_{L_3}(Z + \Delta) - \phi_A \tag{1}$$

$$\phantom{E_{KL_2L_3}} = E_K(Z) - E_{L_2}(Z) - E_{L_3}(Z) - \Delta[E_{L_3}(Z + 1) - E_{L_3}(Z)] - \phi_A \tag{2}$$

where $\phi_A$ is the analyzer work function, Z the atomic number of the element causing emission, and $\Delta$ is set equal to unity as a first approximation. Comparison of calculated and experimental Auger energies shows that $\Delta$ varies over the range $0.69 \le \Delta \le 1.5$ (12,23). The deviation of $\Delta$ from unity is somewhat systematic, and Haynes (23) has suggested the following values:

$$\Delta = 0.69 \quad \text{for } Z \ge 71 \text{ and } LM_{4,5}M_{4,5} \text{ transitions}$$

$$\Delta = 0.69 + 0.85(71 - Z)/35 \quad \text{for } 36 \le Z \le 71$$

$$\Delta = 1 \quad \text{for } LMN \text{ transitions}$$
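As a rough illustration of how Eq. (2) and the empirical $\Delta$ values might be applied, the sketch below evaluates a generic WXY transition. It assumes the reading of Haynes' values given above; the function names are hypothetical, and the binding energies in the usage lines are placeholder numbers, not tabulated data for any element.

```python
def haynes_delta(z, transition="LMM"):
    """Empirical Delta following the values attributed to Haynes (23) in the text.
    The transition labels and the treatment of Z < 36 are assumptions of this sketch."""
    if transition == "LMN":
        return 1.0
    if z >= 71:
        return 0.69
    if 36 <= z < 71:
        return 0.69 + 0.85 * (71 - z) / 35.0
    raise ValueError("no value is quoted in the text for Z < 36")

def auger_energy_eq2(e_w, e_x, e_y, e_y_next, delta=1.0, phi_a=0.0):
    """Eq. (2) for a generic WXY transition.
    e_w, e_x, e_y : single-ionization energies of element Z (eV)
    e_y_next      : energy of level Y in element Z + 1 (eV)
    delta         : empirical interpolation parameter
    phi_a         : analyzer work function (eV)"""
    return e_w - e_x - e_y - delta * (e_y_next - e_y) - phi_a

# Placeholder binding energies (eV), chosen only to exercise the formula.
E_W, E_X, E_Y, E_Y_NEXT, PHI_A = 1840.0, 100.0, 99.0, 135.0, 4.5

for delta in (0.69, 1.0, 1.5):
    print(delta, auger_energy_eq2(E_W, E_X, E_Y, E_Y_NEXT, delta, PHI_A))
print("Delta at Z = 50:", round(haynes_delta(50), 3))
```

With these placeholder levels the predicted energy moves by 36 eV per unit change in $\Delta$, which shows directly how sensitive Eq. (2) is to the choice of $\Delta$.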
In the present example, this is complicated by the fact that $KL_2L_3$ and $KL_3L_2$ have quantum-mechanically identical final states, and thus their energy should be the same. Therefore, Chung and Jenkins (24) have suggested the expression

$$E_{KL_2L_3} = E_K(Z) - \tfrac{1}{2}[E_{L_2}(Z) + E_{L_2}(Z + 1)] - \tfrac{1}{2}[E_{L_3}(Z) + E_{L_3}(Z + 1)] - \phi_A \tag{3}$$
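The symmetry argument behind Eq. (3) can be made concrete with a short calculation. This is a sketch under stated assumptions: Eq. (1) is evaluated with $\Delta = 1$, the function names are hypothetical, and the level energies are placeholders rather than measured values.

```python
def eq1_energy(e_w, e_x, e_y_next, phi_a):
    """Eq. (1) with Delta = 1: level X taken from element Z, level Y from Z + 1."""
    return e_w - e_x - e_y_next - phi_a

def eq3_energy(e_w, e_x, e_x_next, e_y, e_y_next, phi_a):
    """Eq. (3): both final-state levels averaged over elements Z and Z + 1."""
    return e_w - 0.5 * (e_x + e_x_next) - 0.5 * (e_y + e_y_next) - phi_a

# Placeholder energies (eV) for illustration only.
E_K = 1840.0
E_L2, E_L2_NEXT = 100.0, 140.0
E_L3, E_L3_NEXT = 99.0, 135.0
PHI_A = 4.5

kl2l3 = eq1_energy(E_K, E_L2, E_L3_NEXT, PHI_A)   # labelled K L2 L3
kl3l2 = eq1_energy(E_K, E_L3, E_L2_NEXT, PHI_A)   # labelled K L3 L2
symmetric = eq3_energy(E_K, E_L2, E_L2_NEXT, E_L3, E_L3_NEXT, PHI_A)
swapped = eq3_energy(E_K, E_L3, E_L3_NEXT, E_L2, E_L2_NEXT, PHI_A)

print("Eq. (1):", kl2l3, kl3l2)
print("Eq. (3):", symmetric, swapped)
```

Eq. (1) gives two different energies depending on the order of the labels, whereas Eq. (3) is unchanged on exchanging the two final-state levels and, for $\Delta = 1$, equals the mean of the two Eq. (1) results.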
Finally, Coad and Riviere (25) have suggested the following expression for valence transitions from metals:

$$E_{KL_2V} = E_K(Z) - \tfrac{1}{2}[E_{L_2}(Z) + E_{L_2}(Z + 1)] - E_V - \phi_A \tag{4}$$
Since the difference in binding energy from element Z to element Z + 1 can be quite large, the uncertainty in $\Delta$ can lead to very significant errors. Therefore, a third semiempirical approach has been developed, in which single-ionization energy levels are used, but hole-hole couplings in the final state are calculated. Thus the energy of the Auger electron is given by

$$E_{KL_2L_3} = E_K(Z) - E_{L_2}(Z) - E_{L_3}(Z) - R(L_2,L_3) - \phi_A \tag{6}$$
where $E_i(Z)$ are single-ionization energies, $\phi_A$ the analyzer work function, and $R$ the hole-hole interaction energy. The value of $R$ depends upon whether the Auger process is occurring in a free atom or in a solid. In a free atom there are two terms (26):

$$R(L_2,L_3) = F(L_2,L_3) - P_a(L_2,L_3) \tag{7}$$

where $F$ is the recombination energy calculated by assuming frozen orbitals and an appropriate coupling scheme, and $P_a$ is a "polarization" term resulting from relaxation of the atomic orbitals. Shirley (27) has calculated the $P_a$ term, which he calls the "static atomic relaxation energy," using the intermediate coupling scheme of Asaad and Burhop (28) and the equivalent core method of Jolly and Hendrickson (29). The $P_a$ term has the effect of lowering the electron binding energy, thereby increasing the Auger energies, and the magnitude is from 10 eV to a few tens of electron volts for outer and deeper core levels, respectively.

In order to calculate the Auger energy for solids, both hole-hole interaction and many-body effects must be considered. In this case, Eq. (6) becomes (30,31):

$$E_{KL_2L_3} = E_K(Z) - E_{L_2}(Z) - E_{L_3}(Z) - R(L_2,L_3) + P_{ea}(L_2,L_3) - \phi_A \tag{8}$$
where $R$ is equivalent to that of Eq. (7), $E_i$ are the single-ionization energies for solids, which therefore contain a polarization term for a single-hole final state (32), and $P_{ea}$ is the polarization that results from electrons on surrounding atoms relaxing toward the two localized holes (33) (called the static extraatomic relaxation energy). For outer electron levels, $R$ may be determined from optical data, and therefore $P_{ea}$ can be directly determined; the value of $P_{ea}$ for sodium has been found to depend upon the matrix, but it is positive and generally from about 2 to 10 eV (34). To obtain $P_a$ and $P_{ea}$ in other cases, Shirley (27) has made the approximation that the static relaxation energies are equal to twice the relaxation energies resulting from a single-hole final state (called the dynamic relaxation energy). This gave good agreement with experimental data. Because this is a questionable approximation, Shirley et al. (31) have also calculated $P_{ea}$ with an excitonic model, and predicted the Auger energy with reasonable accuracy. Kim et al. (35) have used a modified exciton model to calculate $P_{ea}$ and to get better agreement between theory and experiment. Hoogwijs et al. (36) and Watson et al. (37) have used SCF hole state calculations, and Laramore and Camp (38) have used a plasmon model to calculate $P_{ea}$. Most of these approaches give reasonable results. For example, Hoogwijs et al. (36) have calculated the energy of the zinc $L_3M_{4,5}M_{4,5}$ transition to be 992.7 eV, compared to an experimental value of 992.3 eV. To demonstrate the extent of many-body polarization effects upon the above energy for zinc, Hoogwijs et al. (36) also calculated the free-atom energy to be 973.7 eV, which compares well with the experimental value of 973.3 eV. Thus the many-body effects are significant; the Auger energies from solids are higher than those from gases.

Finally, note that Eq. (8) has a work function correction factor $\phi_A$ associated with the analyzer and not the sample. This results from the fact that the electron does work against the sample work function during emission from the solid, but gains (or loses) the difference between the sample and analyzer work functions upon entering the analyzer. Therefore, the sample work function does not appear in Eq. (8). This may lead to some confusion, however, since all of the single-ionization energies are referenced to the Fermi energy of the sample, while the energies of Auger electrons are referenced to the Fermi energy of the analyzer or to the vacuum (39). Therefore, care must be exercised in comparing experimental data to theory.
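The bookkeeping of Eqs. (6)-(8) and the size of the solid-state shift quoted for zinc can be sketched as follows. The function names are hypothetical, and Eq. (8) is taken here with $P_{ea}$ entering with a positive sign, an assumption consistent with the statement that Auger energies from solids are higher than those from gases. The zinc energies are the values quoted in the text.

```python
def hole_hole_energy(f, p_a):
    """Eq. (7): frozen-orbital term F reduced by the atomic relaxation term P_a."""
    return f - p_a

def auger_energy_free_atom(e_k, e_l2, e_l3, f, p_a, phi_a):
    """Eq. (6) with R taken from Eq. (7); all energies in eV."""
    return e_k - e_l2 - e_l3 - hole_hole_energy(f, p_a) - phi_a

def auger_energy_solid(e_k, e_l2, e_l3, f, p_a, p_ea, phi_a):
    """Eq. (8), assuming +P_ea; the binding energies should be those of the solid."""
    return e_k - e_l2 - e_l3 - hole_hole_energy(f, p_a) + p_ea - phi_a

# Zinc L3M4,5M4,5 energies quoted in the text (eV).
SOLID_CALC, SOLID_EXP = 992.7, 992.3
ATOM_CALC, ATOM_EXP = 973.7, 973.3

shift_calc = SOLID_CALC - ATOM_CALC   # solid minus free atom, calculated
shift_exp = SOLID_EXP - ATOM_EXP      # solid minus free atom, measured
print("calculated shift (eV):", round(shift_calc, 1))
print("experimental shift (eV):", round(shift_exp, 1))
```

The solid-minus-atom shift is not simply $P_{ea}$, because the single-ionization energies in Eq. (8) also differ from their free-atom values; the difference above is the net effect of both contributions.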
B. Characteristics of Auger Electron Spectroscopy
1. Elemental Sensitivity, Resolution, and Detection Limits
As discussed earlier, Auger emission is a de-excitation process that can be initiated in a number of different ways. However, by convention AES refers to spectroscopy performed with electron excitation. When Auger emission is initiated with X rays, it is often referred to as XAES. Both high-energy (40) and low-energy ions (41) have been used to cause Auger emission. It is not surprising that 1-MeV ions cause Auger emission, but Haas et al. (41) have shown that 1-keV argon ions will also cause emission. Momentum exchange cannot cause ionization at these low energies; therefore, the emission must result from overlapping wavefunctions between the ion and atoms in the solid. While this is an interesting phenomenon, it has not been pursued except as a technique to align the ion beam and electron beam for accurate sputter profiling (42).

As is evident from Fig. 2, Auger emission involves three electrons. As a result, emission can occur for elements with an atomic number of three or greater. Even lithium (Z = 3) is a special case, since it contains two K-shell electrons and one L-shell electron, whereas Auger emission requires one deep core and two upper-level electrons. As a result, gas-phase lithium will not de-excite by Auger emission. Solid-phase lithium will Auger de-excite, since the second upper-level electron can be emitted from the valence band. Based upon similar reasoning, it should be possible to detect helium or hydrogen implanted in a solid. However, their Auger energies would be very low and they have not been detected.

Since Auger emission dominates over photon emission for energies below 2000 eV, all elements with Z ≥ 3 will emit Auger electrons. Because the process is not governed by the dipole transition function, its probability for outer shells is relatively constant with Z. This is shown in Fig. 4 (42a), where the peak-to-peak heights of dominant Auger peaks are plotted vs. atomic number. By using the KLL transition for Z < 14, the LMM transition for Z < 40, and the MNN transition for Z > 40, the relative sensitivity of AES to the elements varies by a factor of ≤20. The ability to resolve one element from another with AES is very good. Even if the Auger peaks from two or more elements overlap, there is normally more than one peak and the peak shape is unique to an element; therefore, both elements can be detected by curve resolution. The detection limits for AES are typically 0.1 at.%, although Thomas and Morabito (43) have reported detection limits of ~0.01 at.% for boron and phosphorus in silicon.

FIG. 4. The relative intensity of Auger peaks vs. atomic number. Note that by using Auger KLL transitions, then LMM transitions, etc., the sensitivity of Auger electron spectroscopy varies by a factor of ≤20. [From Davis et al. (42a). Reprinted with permission, Perkin-Elmer Corp., Eden Prairie, Minnesota.]
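The choice of transition series described above amounts to a simple lookup by atomic number. The sketch below is illustrative only: it follows the KLL, LMM, MNN sequence of the Fig. 4 caption, and the handling of the boundary cases (Z = 14 assigned to LMM, Z = 40 assigned to MNN) is an assumption, since the text does not specify them.

```python
def dominant_auger_series(z):
    """Return the transition series suggested in the text for atomic number z.
    Boundary handling at Z = 14 and Z = 40 is an assumption of this sketch."""
    if z < 3:
        raise ValueError("Auger emission requires three electrons (Z >= 3)")
    if z < 14:
        return "KLL"
    if z < 40:
        return "LMM"
    return "MNN"

# A few elements by atomic number: Li, O, Si, Ni, Ag, W.
for z in (3, 8, 14, 28, 47, 74):
    print(z, dominant_auger_series(z))
```

In practice the most intense peak for a given element is read from measured reference spectra such as those summarized in Fig. 4; a rule of this kind only indicates which series to look for first.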
2. Spatial Resolution in the Plane of the Surface

Because electron beams are normally used for AES, spatial resolution in the plane of the surface can be better than with any of the other surface analysis techniques (44). Commercial AES instrumentation is now available with a minimum beam size of 50 nm. There are two basic configurations for these high-resolution instruments. In one case, the finely focused electron beam is generated in an electron-optical column similar to those used in electron microscopes (11,45,46). This has the advantage that the well-focused, stable electron beams are produced by commercial equipment in widespread use. The disadvantages of this arrangement are: (1) the electron-optical columns are normally not high vacuum; (2) the electron energy analyzer must be placed in a configuration such that the sample is examined in one position by scanning electron microscopy (SEM), but must be moved to a second position for AES; (3) an angle of about 90° is included between the electron beam and the analyzer axis, which leads to shadowing for rough surfaces and to distortion in sample mapping; (4) ion sputtering is often difficult or impossible.
While these problems do not prevent AES in this configuration, they make the analysis difficult. As a result, the combination of an electron analyzer with a scanning electron microscope has not been very popular.

The second basic configuration for high-spatial-resolution AES has used an electron beam parallel to the axis of a cylindrical mirror analyzer. The disadvantage is that the beam is not focused nearly as well as from an electron-optical column, but the other objections listed above do not exist. Independent of the approach, Auger instrumentation with finely focused beams can be used to study inhomogeneities in the plane of the surface. A schematic illustration of a scanning Auger electron spectrometer is shown in Fig. 5, and some typical maps of Auger peaks are shown in Fig. 6 (47). While scanning AES clearly offers some tremendous advantages for analysis, it should be realized that the AES detection limits are a function of the total current striking the sample (12,48). This is demonstrated in Fig. 7, where the detection limit is plotted vs. the primary current (48). This detection limit-current relation was calculated assuming an Auger yield of $10^{-4}$, a transmission of 0.1, an amplifier bandwidth of 1 Hz, an excitation energy of 10 keV, and an analyzer window greater than the Auger peak width. Under these conditions, the detection limit for AES is 0.1 at.% at a current density of 3 × 10^(…) A/μm² and a 10-μm beam diameter, but it increases to 10 at.% for the same current density and a 0.1-μm beam diameter. Since the total current available in a beam with a 50-nm diameter is limited, the detection limits for AES at this resolution will be higher than the 0.1 at.% quoted earlier. This may be improved somewhat by using phase-sensitive detection methods, as discussed by Mogami (47).

FIG. 5. A schematic diagram of the various options normally available in modern scanning Auger electron spectrometers.

FIG. 6. Typical data from scanning Auger analysis of a Nimonic PE 16 metallurgical sample: (a) the secondary electron image showing the line analysis position; (b) Auger line analysis of nickel and titanium; (c) titanium LMM Auger intensity image; (d) nickel LMM Auger intensity image; (e) the electron energy spectrum during point analysis of an inclusion; (f) point analysis of the matrix. [From Mogami (47). Reprinted with permission, Elsevier Sequoia S.A., Lausanne.]

FIG. 7. The detection limit for AES vs. the primary current $I_p$ (or beam diameter) used for excitation. Note that for electron beams of ≤10⁻⁸ A, or ≤100 μm diam, the detection limits rise steeply. The parameter is the Auger yield. [From Werner (48). Reprinted with permission, the American Society for Testing and Materials.]

Finally, with respect to high-spatial-resolution AES, it should also be realized that even if the primary beam has a diameter (or FWHM) of 50 nm, it will not necessarily resolve 50-nm features on the surface with Auger electrons. In fact, data by Janssen and Venables (49), El Gomati and Prutton (50), and Shimizu et al. (51) show that backscattered electrons are important for high-atomic-number substrates and for edge resolution.

3. Depth Resolution
While 50-nm resolution in the surface plane is good, the depth resolution of AES is even better at 1 nm. This depth resolution results from the very large inelastic scattering cross section for elections in solids, and is the reason for AES being a surface analysis technique. There are many ways to define the detection depth or depth resolution of AES, but the most appropriate method seems to be the use of the inelastic mean free path for electrons in solids, i. The value of 2 can be determined by measuring the increase or decay of the Auger signals from a deposited film or a substrate, when the amount of deposited material is measured by another technique. Such a
-
1
200
6 00
400
(a)
800
I000
I200
TIME (SECONDS)
i -
200
ib)
400
600
800
loco
I200
I400
TIME (SECONDS f
FIG.8. (a) The growth of the titanium (418-eV) Auger peak-to-peak height as a function o f time of titanium deposition. The points are experimental data and the circles are calculated times for the formation of a monolayer. (b) The decay of the tungsten (169 eV) Auger peak-topeak height during deposition of titanium. [From Armstrong ( 5 2 ) . Reprinted with permiscion. North-Holland Publ. Co.. Amsterdam.]
situation is shown in Fig. 8 for titanium on tungsten (52). Note that the titanium Auger signal increased while the tungsten Auger signal decreased exponentially with time (which is equivalent to thickness). The thickness at which the signal increased to $1 - e^{-1}$ or decreased to $e^{-1}$ of its maximum value is used to define $\lambda$ for an electron of that energy in a particular matrix. From the data in Fig. 8, $\lambda$(169 eV, Ti) = 0.74 nm and $\lambda$(418 eV, Ti) = 0.91 nm, where an interlayer spacing of 0.24 nm was assumed for titanium. Therefore the intensity of the Auger signal $I_d$ of a deposited film may be written (53-56)

$$I_d = I_d^{0}\left[1 - \exp(-x_d/\lambda_d)\right] \tag{9}$$

and the substrate signal $I_s$ may be written

$$I_s = I_s^{0}\exp(-x_d/\lambda_s) \tag{10}$$

where $I_d^{0}$ and $I_s^{0}$ are the Auger peak heights from a thick deposited film and a clean substrate, $x_d$ the film thickness, and $\lambda_d$ and $\lambda_s$ the inelastic mean free paths in the deposited film of Auger electrons from the film and substrate, respectively. Even though the curves in Fig. 8 can be described on average by continuous exponential behavior, there are obviously breaks in the slope. These result from the fact that the titanium film grew in a layer-by-layer fashion, and the inelastic scattering changed sharply at the filling of each successive layer. This provides another convenient way to measure $\lambda$ (52,55). The slope changes have also been used to calibrate the Auger signal (55).

There has been some dispute over the last few years as to the correct range, energy dependence, and Z dependence for $\lambda$. While some points still remain to be settled, it is clear that the early value of $\lambda \approx 10$ nm at 1000 eV for organic materials (57) is too high. Instead, $\lambda$ has a minimum at about 50 eV and increases at energies above and below this value. For metals, the minimum in $\lambda$ is ~0.5 nm and this increases to 2 nm at 1000 eV, as shown in Fig. 9 (57). While the inelastic mean free paths for inorganic and organic compounds tend to be higher than for metals (57), they are only about 50% larger (Fig. 9). The compilation of $\lambda$ vs. $E$ by Seah and Dench (57) (shown in Fig. 9) indicates that the values of the inelastic mean free path generally group about the so-called “universal curve,” but that there are systematic deviations from this curve consistent with the theoretical predictions of Penn (58,59). However, the least-squares fit to the data compiled by Seah and Dench (57) also indicates that $\lambda \propto E^{1/2}$ for $E > 150$ eV, while Penn’s theory (58,59) predicts an $E^{0.77}$ dependence. Thus the questions of the Z and energy dependence are not settled.
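As a minimal sketch of how Eqs. (9) and (10) are used in practice, the Python fragment below evaluates both expressions and recovers the substrate inelastic mean free path from a decay curve by a straight-line fit to the logarithm of the substrate signal versus film thickness. The function names, the normalisation of the thick-film and clean-substrate signals to 1, and the synthetic thickness grid are assumptions made for illustration; they are not taken from Armstrong (52), and the fit ignores the layer-by-layer slope breaks discussed above.

```python
import math


def film_signal(x_d, lam_d, i_d0=1.0):
    """Eq. (9): overlayer Auger signal for a film of thickness x_d.

    x_d and lam_d must be in the same length units (e.g. nm).
    i_d0 is the signal from a thick film (assumed normalised to 1).
    """
    return i_d0 * (1.0 - math.exp(-x_d / lam_d))


def substrate_signal(x_d, lam_s, i_s0=1.0):
    """Eq. (10): substrate Auger signal attenuated by a film of thickness x_d.

    i_s0 is the clean-substrate signal (assumed normalised to 1).
    """
    return i_s0 * math.exp(-x_d / lam_s)


def imfp_from_decay(thicknesses, signals):
    """Estimate the mean free path from substrate decay data.

    Fits ln(I_s) = ln(I_s0) - x_d / lam_s by ordinary least squares and
    returns lam_s = -1 / slope. All signals must be positive.
    """
    logs = [math.log(s) for s in signals]
    n = len(thicknesses)
    mean_x = sum(thicknesses) / n
    mean_y = sum(logs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(thicknesses, logs))
    sxx = sum((x - mean_x) ** 2 for x in thicknesses)
    return -sxx / sxy


# Illustrative use: synthetic decay sampled at whole layers of 0.24 nm,
# generated with the 0.74 nm value quoted in the text.
layers = [0.24 * k for k in range(6)]
decay = [substrate_signal(x, 0.74) for x in layers]
estimated_lam = imfp_from_decay(layers, decay)
```

With real data the fit would be applied to measured peak-to-peak heights, and the film thickness would have to come from an independent measurement, as the text notes.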
There are still a number of areas to be investigated, including the theory of electron attenuation (58-60) and geometric and analyzer effects (53,61). One problem has been the limited range of electron energies
[Figure 9: panels (a) through (f); horizontal axis, energy (eV).]
FIG. 9. Compilation for elements of $\lambda$ measurements (a) in nanometers, (b) in monolayers, (c) in mg m$^{-2}$, as a function of energy above the Fermi level. In (a), the full curve is the empirical least-squares fit over the complete energy range. The dotted curves are for Penn's relation (58) with (i) his mean values for a and b, (ii) values of a and b to give the least-squares fit to the data above 150 eV. (d) Compilation for inorganic compounds of $\lambda$ measurements in nanometers, (e) compilation for organic compounds of $\lambda$ measurements in mg m$^{-2}$, (f) compilation for adsorbed gases of $\lambda$ measurements in nanometers. [From Seah and Dench (57). Reprinted with permission, Heyden and Son, Ltd., Philadelphia.]
one can measure, once the overlayer film is in place. The use of synchrotron radiation promises to alleviate this problem and provide better data on the energy dependence of $\lambda$ (61).

4. Depth Profiling
Although the short inelastic mean free path causes AES to be very sensitive to the surface composition of a solid, it also complicates the analysis over depths greater than a few nanometers. The composition over depths of 10 nm or greater is often desired; therefore, depth-profiling techniques have been developed. The most common technique is to bombard the surface with inert-gas ions (sputter) and measure the composition as the surface is moved through the bulk of the sample (62). Auger data may be recorded simultaneously with or sequentially after sputtering by aligning the small electron beam within the uniform current density region of the ion beam (42,62). Taylor et al. (63) have demonstrated that depth profiling can be accomplished by measuring along the sputter crater wall, and angle lapping (at shallow angles of about 1°) has also been used for depth profiling. Since depth profiling is almost exclusively accomplished with ion beams, the effects of sputtering on AES have been studied a number of times (64). The effects that can result from sputtering of solids include surface roughening (64,66), knock-in (64,67), and preferred sputtering (64-67). Shimizu (66) and Coburn et al. (65) have shown that the topography of the surface will change upon sputtering with ion beams. Wehner (64) has discussed the formation of cones on the surface due to sputtering. This induced surface roughness can change the Auger signal strength (68), cause degradation of interfacial resolution, and result in misinterpretation of concentration vs. depth curves. Knock-in effects also degrade interfacial resolution (69,70). During bombardment of a solid with ions, some material will be removed by sputtering, while other atoms will be scattered in a forward direction and driven deeper into the solid. The severity of the knock-in effect depends upon the mass and energy of the sputtering ion, as shown by data in Fig. 10 (69). Besides the roughening and knock-in effects discussed above, the resolution of depth profiling can also be influenced by the electron inelastic mean free path, the basic sputtering process, a “zone of mixing,” the crystal orientation, preferred sputtering, and instrumental factors. Many of these effects have been discussed by Wehner (64) and need not be further discussed here. As pointed out by Hofman (71) the effect of each of these factors can be accounted for using the general error propagation law
[Figure 10: horizontal axis, ion energy (keV); legend: Xe, Ar, Ne.]
FIG. 10. Average width of the Si-SiO2 interface as measured during sputter profiling with various ions and ion energies. The error bars represent standard deviations. Note that the width increases with increasing ion energy due to knock-in effects. [From Schwarz and Helms (69). Reprinted with permission, American Institute of Physics, New York.]
where $\Delta Z_j$ is the interfacial width contributed by each of the above factors. Using gold films on nickel, Mathieu et al. (72) have found that the electron inelastic mean free path and knock-in effects limit the resolution in very thin films. In thicker films, sputter-induced roughness dominated, and the resolution was empirically correlated with the square root of the product of ion beam energy and film thickness. In this and other studies, the definition of interfacial resolution is somewhat arbitrary, but the best definition seems to be the width over which an Auger signal characteristic of the interface decreases (increases) from 84 to 16% (16 to 84%) of its steady-state value. This represents $\pm 1\sigma$ values if the interface can be represented by a normal error curve (71). This definition is relatively widely accepted, although the percentage values used in the definition vary. Ho and Lewis (73) have developed deconvolution techniques to remove the analysis-induced breadth of the interface.

Two of the effects reported above, the “zone of mixing” and preferred sputtering, deserve further comment. The sputtering process can cause compositional changes in a solid over the range of the ion (67,73-79). This results from preferred sputtering and sputter-enhanced diffusion (80). Preferred sputtering is probably the most serious problem for quantitative AES. When a multicomponent sample is bombarded with ions, the surface composition will change. For a binary alloy, Ho et al. (67) have shown that the surface composition $X^{s}$ is related to the bulk composition $X^{b}$ by the relationship
where $S_i$ is the sputter yield (atoms/ion) and $N_i^{0}$ is the atom density of the ith component. Therefore, if the sputter yield of one constituent is lower, that constituent will be enriched on the surface during sputtering. Holloway (81) has shown that preferred sputtering occurs in the Cr-Au system, Braun and Farber have observed it in the Ag-Au system ( 7 9 , and Slusser and Winograd (82) report that it occurs in Pd-Ag alloys. Liau et al. (74) measured preferential sputtering in a number of systems and found the heavy element enriched on the surface. They proposed an “escape depth” effect to explain the data. Olson and Wehner (83) have measured the composition vs. ejection angle of sputtered binary alloys and found that the composition is a function of the ejection angle. For example, the material ejected normal to the surface of a Cu-Ni alloy was rich in nickel. Shimizu et al. (84) have shown that copper is enriched on the surface of Cu-Ni alloys by argon sputtering. It is possible therefore that the same mechanism may be responsible for both phenomena.

The preferred sputtering effect is even more dramatic in oxides (76-78), and therefore has been studied in more detail. An example of preferred sputtering in Ta2O5 by argon ions is shown in Fig. 11 (77). X-ray photoelectron data are shown in Fig. 11 for tantalum 4d and oxygen 1s electrons from clean and sputtered Ta2O5 and sputter-cleaned tantalum. It is clear that the tantalum 4d photoelectron peak is composed of both oxide and metal signals after sputtering with either 5 or 0.5 keV argon ions. Holloway and Nelson (77) showed that the preferred sputtering of oxygen (leaving metal on the surface) increased as the ion energy decreased. Even though preferred sputtering has been studied several times, its origin is uncertain.
Kelly (76) has considered four aspects of sputtering: (1) direct energy transfer from the incident particles to the surface atoms, (2) energy transfer from the substrate to the surface atoms, (3) variations in the surface binding energies, and (4) vapor pressure effects. He concluded that vapor pressure and surface energy effects lead to preferred sputtering in oxides, while direct energy transfer and surface binding energy effects cause preferred sputtering in alloys. However, the theory is not sufficiently developed to predict the extent or often even the sign of preferred sputtering. Therefore, much research is still required. The preferred sputtering effect is further complicated by the fact that a “zone of mixing” exists after sputtering (67). This is a zone equal to the range of the ions over which the composition may be changed. Liau et al. (74) have attempted to explain the zone of mixing by an “escape depth” effect, while Winters and Coburn (79) and Ho (80) explain it based upon preferred sputtering at the surface and diffusion within the zone defined by the ion range. Again, the theory of the effect is not well established, but it must be considered, for example, in attempts to measure preferred sputtering using Auger electrons with short and long inelastic mean free paths.
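The qualitative rule stated earlier, that the constituent with the lower sputter yield becomes enriched on the surface, can be illustrated with a short Python sketch. It assumes the simplest steady-state mass balance for a binary alloy, in which the sputtered flux must have the bulk composition, so that the surface ratio is the bulk ratio multiplied by the inverse ratio of the yields. This is an assumption made here for illustration only: it omits the atom-density terms mentioned in the text, it is not presented as the relation of Ho et al. (67), and the function name and arguments are hypothetical.

```python
def steady_state_surface_fraction(x_bulk_a, yield_a, yield_b):
    """Surface atom fraction of component A in a binary alloy A-B.

    Assumed (simplified) steady-state balance, with no density terms:
        Xs_A / Xs_B = (S_B / S_A) * (Xb_A / Xb_B)
    x_bulk_a : bulk atom fraction of A, strictly between 0 and 1
    yield_a, yield_b : sputter yields (atoms/ion) of A and B, both > 0
    """
    if not 0.0 < x_bulk_a < 1.0:
        raise ValueError("x_bulk_a must lie strictly between 0 and 1")
    if yield_a <= 0.0 or yield_b <= 0.0:
        raise ValueError("sputter yields must be positive")
    ratio = (yield_b / yield_a) * (x_bulk_a / (1.0 - x_bulk_a))
    return ratio / (1.0 + ratio)


# Illustrative use with made-up yields: A sputters half as readily as B,
# so A is enriched at the surface relative to its bulk fraction of 0.5.
enriched = steady_state_surface_fraction(0.5, 1.0, 2.0)
```

As the text stresses, real systems also show zone-of-mixing and diffusion effects, so a balance of this kind should be read as a first estimate at most.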
[Figure 11: Ta 4d and O 1s panels; traces: before sputtering, 5 keV Ar+, 0.5 keV Ar+, sputter-cleaned Ta; horizontal axis, binding energy (eV).]
FIG. 11. Tantalum 4d and oxygen 1s X-ray photoelectron peaks from Ta2O5 sputtered with argon ions. Note the shift in the tantalum 4d peaks after sputtering, indicating the surface is a mixture of $\mathrm{Ta^{5+}}$ and $\mathrm{Ta^{0}}$ species, i.e., that sputtering has reduced the surface tantalum to a metallic state. [From Holloway and Nelson (77). Reprinted with permission, American Institute of Physics, New York.]
The phenomenon of preferred sputtering is of critical importance since sputtering is used to produce most standard surfaces. In other words, standards are prepared and characterized by the electron microprobe, etc. They are then placed in an Auger spectrometer, but their surface is contaminated and unsuitable for quantitative measurements. In order to serve as a useful standard, a surface must be produced with a known composition. With sputtering this is impossible because of preferential removal of one constituent over another. The best compromise is to measure the composition of a sputtered surface both on the standard and unknown. This is often but not always adequate. Therefore, sputtering remains as one of the least controlled areas of Auger analysis.
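The two quantitative ideas used earlier in this section, the 84%-16% definition of interfacial width and the combination of several broadening contributions through the error propagation law, can be sketched in Python as follows. The sketch assumes that independent contributions add in quadrature, which is the usual form of that law; the source equation is not reproduced here, and the function names and the linear interpolation between data points are choices made for illustration.

```python
import math


def total_interface_width(contributions):
    """Combine independent interface-width contributions in quadrature.

    Assumes the contributions (e.g. mean free path, knock-in, roughness)
    are independent; all values must share the same length units.
    """
    return math.sqrt(sum(w * w for w in contributions))


def _crossing(depths, signal, level):
    """Depth at which the signal first crosses `level` (linear interpolation)."""
    for i in range(len(depths) - 1):
        s0, s1 = signal[i], signal[i + 1]
        if s0 != s1 and (s0 - level) * (s1 - level) <= 0.0:
            frac = (level - s0) / (s1 - s0)
            return depths[i] + frac * (depths[i + 1] - depths[i])
    raise ValueError("signal never crosses the requested level")


def width_84_16(depths, signal):
    """Interface width between the 84% and 16% points of a profile.

    `signal` must already be normalised to its steady-state value (0 to 1)
    and be monotonic across the interface.
    """
    return abs(_crossing(depths, signal, 0.16) - _crossing(depths, signal, 0.84))
```

For an interface described by a normal error curve of standard deviation sigma, `width_84_16` returns approximately 2 sigma, which is the ±1σ interpretation given in the text.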
C. Auger Line Shapes and Intensity

If the Auger transition examined is a core-core-core transition, its shape might be expected to be simple. This is generally not the case, and multiplet splitting has already been discussed in Section II,A,2. Beyond multiplet splitting, there are a number of other effects that cause the Auger transition to have a characteristic shape. These effects include plasmon losses, doubly ionized initial states, shake-up excitations, Coster-Kronig transitions, crossover transitions, inelastic scattering, lifetimes, and outer-shell electronic configurations (chemical states). All of these effects combine to influence the intensity of the Auger peak and determine its characteristic shape. With sophisticated peak shape analysis, AES is capable of much more than simple surface elemental analysis (85).

1. Plasma Losses, Doubly Ionized Initial States, and Inelastic Scattering

Because of the characteristic collective oscillation of electrons in a solid, plasmons can be created when the Auger electron interacts with the solid. These interactions create low-energy satellites below the main Auger peak (86,87). Often both surface and bulk plasmon loss peaks can be detected. In addition, in some materials there is often a small satellite peak slightly higher in kinetic energy than the main Auger peak. This is shown in Fig. 12 for silicon (88), where the main $L_{2,3}VV$ transition occurs at 92 eV, but a satellite is observed at 107 eV. The initial postulate was that the silicon 107-eV peak and similar peaks for other metals were caused by energy gained upon absorption of a plasmon by the Auger electron (89-91). However, Rowe and Christman (88) have shown that the threshold energy for detecting the 107-eV peak is about twice that necessary for exciting the $L_{2,3}VV$ Auger transition (Fig. 13). Thus the satellite is caused by Auger emission from a doubly ionized initial state.
In fact, it is unclear why such satellites are not observed more often, since a core-core-core Auger transition leaves the atom in a doubly ionized core-hole configuration. Thus the initial state for further decay toward the ground state by Auger emission should often be doubly ionized. According to the calculations of Penn (58,59), inelastic scattering of Auger electrons in solids largely occurs as a result of bulk or surface plasmon generation. There is also controversy at the present time as to when the plasmon generation takes place: upon creation of core holes (intrinsic plasmons) or during transport through the solid (extrinsic plasmons). The portion of the loss spectrum attributable to intrinsic plasmons has largely been discussed for X-ray photoemission (92-94), but if these results are
[Figure 12: Si spectrum; horizontal axis, energy (eV), 20 to 120; annotations: $\hbar\omega_p = 17$ eV, 15 eV.]
FIG. 12. Silicon Auger electron spectrum showing the satellite ~15 eV above the main $L_{2,3}VV$ structure at 92 eV, and the change in background shown by the decreasing derivative for $E \leq 50$ eV. [From Rowe and Christman (88). Reprinted with permission, Pergamon Press, Ltd.]
transferable to AES, anywhere from 10 to 50% of the loss spectrum is caused by intrinsic plasmon generation. In fact, this may be even higher in AES because three holes are created in Auger emission, while only one is created during photoemission. Regardless of the loss mechanism, it is obvious that the onset of an Auger transition causes an increase in the background on the low-kinetic-energy side of the peak. This is shown by the level change in the dN/dE vs. E curve in Fig. 12. (Such features are called shake-off features in XPS.) If the true shape of the Auger peak is to be examined, the distortion due to inelastic scattering must be taken into account. Changes in the background caused by higher-energy Auger peaks and backscattered secondary electrons must also be subtracted. Although the proper technique for correcting the background is still controversial, Houston (95) and Madden and Schreiner (96) have shown that it may be accounted for by expanding the background in a Taylor series about a point well ahead of the Auger threshold using least-squares smoothed
[Figure 13: horizontal axis, energy (eV).]
FIG. 13. Auger yield for the $L_{2,3}VV$ transition ($I_{92}$) and for the satellite ($I_{107}$) of silicon vs. incident electron energy. The threshold for the 107-eV satellite is approximately twice (±30 eV) the $L_{2,3}$ binding energy of 100 eV, consistent with the satellite resulting from a doubly ionized initial state. [From Rowe and Christman (88). Reprinted with permission, Pergamon Press, Ltd.]
derivatives. The inelastic scattering may then be approximately removed by deconvoluting the loss spectrum from the Auger peak (95,97,98). Unfortunately, the loss function can only be approximated since the inelastic scattering of Auger electrons by solids is not completely understood. The function has been approximated both by the near-elastic spectrum of electrons backscattered by the solid, and by the spectrum just below (in kinetic energy) an X-ray photoelectron core peak. The first approach is illustrated in Fig. 14 for the carbon Auger peak from a condensed layer of methyl alcohol (19). It is obvious from the above discussion on plasmon losses that these spectra can only be approximate loss functions. However, they do result in deconvoluted Auger peaks, which may be used to discuss the distribution of electrons over the outer electron levels (19,95,98) (see Section II,C,6).

2. Ionization and Shake-up Features
[Figure 14: panels (a) through (c); horizontal axis, electron energy (eV), 180 to 260.]
FIG. 14. Deconvolution of energy loss features from multilayer data. (a) The experimental C(KVV) spectrum of CH3OH after background subtraction. (b) The solid curve is the near-elastic backscatter spectrum for 250-eV electrons, and the dotted curve is the carbon 1s photoelectron peak with its loss features. The latter is shifted down in energy and scaled to the height of the backscatter spectrum. (c) The dotted curve results from deconvoluting (b) from (a), and the solid curve is the result of least-squares smoothing. [From Rye et al. (19). Reprinted with permission, North-Holland Publ. Co., Amsterdam.]

When electrons are used to stimulate Auger emission, the electron-electron interaction that causes ionization can be a multiple-interaction event. Therefore, the kinetic energy of the primary electron after ionization can vary from zero to $E_p - E_b$, where $E_p$ is the primary energy and $E_b$ the binding energy of the ionized level. The energy distribution of the primary electrons after ionization is determined by the transition probability of the core electron to the unoccupied states above the Fermi level. If a high density of unoccupied states exists near the Fermi level, this may cause a peak in the energy distribution of backscattered electrons at an energy $E_p - E_b$. This peak can be used for surface analysis (99). In principle, this phenomenon can also be used to probe the density of empty states above $E_F$. However, the interpretation is not always straightforward (100). Peaks in the electron energy spectrum that result from these characteristic ionization events are
easily separated from Auger peaks because their energy varies with (and the Auger energy is independent of) the primary electron energy. There are often peaks near the main Auger peak that are independent of $E_p$ and cannot be explained by plasmon generation. Klemper and Shephard (86) classify them as inter- and intraband transitions. Borrowing from XPS, these satellites may be more popularly termed shake-up features (101). The low-energy satellites result from loss by the Auger electron of small, finite amounts of energy to excitation of valence and outer-shell electrons to unoccupied states near the Fermi level. Both shake-up and shake-off (transitions to the continuum above the Fermi level) have been used in photoelectron spectroscopy to study molecular orbitals and band structures (102), but they have seldom been used in this manner in AES.
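The separation criterion described above, that ionization-loss peaks follow the primary energy while Auger peaks and their satellites stay at fixed kinetic energy, can be sketched as a simple comparison of peak lists taken at two primary energies. The Python fragment below is only an illustration: the function name, the matching tolerance, and the example peak positions are assumptions, and it presumes the two primary energies differ by much more than the tolerance.

```python
def classify_peaks(peaks_ep1, peaks_ep2, ep1, ep2, tol=1.0):
    """Label peaks measured at primary energy ep1 using a second spectrum at ep2.

    peaks_ep1, peaks_ep2 : kinetic energies (eV) of peaks in each spectrum
    ep1, ep2             : the two primary energies (eV)
    tol                  : matching tolerance (eV), an assumed value

    Returns a dict mapping each peak in peaks_ep1 to
      "fixed"          - same kinetic energy in both spectra (Auger-like),
      "tracks primary" - shifted by ep2 - ep1 (ionization-loss-like),
      "unassigned"     - no counterpart found.
    """
    shift = ep2 - ep1
    labels = {}
    for energy in peaks_ep1:
        if any(abs(energy - other) <= tol for other in peaks_ep2):
            labels[energy] = "fixed"
        elif any(abs(energy + shift - other) <= tol for other in peaks_ep2):
            labels[energy] = "tracks primary"
        else:
            labels[energy] = "unassigned"
    return labels


# Illustrative use with made-up peak lists: a 92 eV Auger peak and a loss
# peak sitting 100 eV below the primary energy in both spectra.
labels = classify_peaks([92.0, 900.0], [92.3, 1100.0], ep1=1000.0, ep2=1200.0)
```

Note that a "fixed" label does not distinguish a main Auger peak from the shake-up satellites mentioned in the text, since both are independent of the primary energy.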
3. Lifetime Broadening and Coster-Kronig Transitions

In general, Auger transitions are broader than corresponding core-level photoelectron lines, and this has been attributed to lifetime effects (12). In fact, Chang has calculated lifetimes of sec using the observed linewidths and the uncertainty principle; this is about the time required for an electron to make a single orbit (12). It is not certain, however, that this is a valid lifetime since Chang used the breadth of Auger transitions from solids (which are considerably larger than for free atoms) and the origin of the solid-state broadening effects is unknown (see Section II,C,6). It is clear, however, that Coster-Kronig transitions (e.g., $L_1L_2X$, where $X \neq L$, or super-Coster-Kronig: $L_1L_2L_3$) influence the intensity and width of some Auger peaks. For example, the $L_1VV$ transition from silicon is normally not observed (42a) because the $L_1$ holes decay to the $L_{2,3}$ level via an $L_1L_{2,3}V$ Coster-Kronig transition (103). If the electron spectrum near 120 eV from silicon is carefully measured, a very weak $L_1VV$ transition can be detected. It is very wide because the Coster-Kronig transition occurs so rapidly (4). Another example of Coster-Kronig effects is in the Auger spectra of the first-row transition metals (104). The $L_2$ peaks for these metals are very weak because of Coster-Kronig transfer of the $L_2$ hole to the $L_3$ level. However, for copper and zinc, the ratio of the $L_2$ to $L_3$ peaks is as expected from population because their energy separation slows down the rate of the Coster-Kronig transition (4). At times, the influence of Coster-Kronig transitions can be determined by measuring the Auger yield vs. primary energy. This is shown in Fig. 15 for silicon (103). In this case, the silicon $L_{2,3}VV$ transition showed an enhanced yield when the threshold for exciting the $L_1$ level was crossed. However, similar data for the gold $N_{6,7}VV$ transition did not show an enhanced yield upon crossing the $N_{4,5}$ thresholds.
By using glancing-incidence excitation,
[Figure 15: horizontal axis, $E_p$ (eV); labels: Si 91 eV, $L_{2,3}$.]
FIG. 15. Auger electron peak height vs. primary energy for the (a) $N_{6,7}VV$ and (b) $L_{2,3}VV$ Auger transitions of gold and silicon, respectively. Both Auger transitions are influenced by Coster-Kronig transitions, but a slope change is observed for silicon upon crossing the critical potential for the Coster-Kronig transition ($L_1$ level) while no corresponding slope change is observed for gold ($N_4$ and $N_5$ levels). [From Holloway (103). Reprinted with permission, Pergamon Press, Ltd.]
Holloway (103) was able to show that $N_{4,5}N_{6,7}V$ Coster-Kronig transitions were occurring and influencing the $N_{6,7}VV$ Auger transition intensity.

4. Interatomic Transitions
Interatomic, or crossover, transitions involve levels of neighboring atoms (105). They occur when the core hole in one atom is filled from the upper levels of neighboring atoms, for example, in alkali halides (106) and MgO (107,108). Their rates have been calculated to be slow (109). Salmeron and Baro (110), though, have claimed that the KVV transitions for adsorbates on nickel or copper can be considered as interatomic transitions
since the valence electrons originate from the metal d band. This would affect the use of AES as a valence band spectroscopy. In addition, Knotek and Feibelman (111) have shown that such interatomic transitions can lead to ionic desorption from the surface.
5. Angular-Dependent Auger Emission

Electron emission from homogeneous, polycrystalline samples has been found to be isotropic (112). However, geometric (113,114), diffraction (115), and transition matrix element (116,117) effects can result in nonisotropic emission from single crystals. These effects can be very dramatic, as shown in Fig. 16 (116). The data by Noonan et al. (116) for single-crystal (100) copper show that the Auger emission was nearly zero when measured normal to the crystal surface. The measured signal increased to a maximum at about 15° from the normal, and then decreased toward zero again at angles further from the surface normal. The observation of near-zero emission normal to the (100) surface contrasted sharply with the data for the (110) surface. For this face, emission normal to the surface was a maximum. The authors could not explain the variations in Auger signal intensity vs. angle with either diffraction or reflection effects. They suggested that a nonisotropic source for emission of Auger electrons is necessary to explain the data.

6. Chemical State Effects

As discussed in Section II,A,2, the energy of Auger electrons from solids is different from that of the same element in a gas. There is also another difference: the Auger peaks from solids are wider in energy than those from a gas (17). This has been termed the “solid-state broadening” effect. Fox et al. (118) have measured the Auger spectra of gaseous and solid zinc and argon. They report peak FWHM for the gas of 0.5 and 0.4 eV and FWHM for solids of 1.0 and 3.6 eV for zinc and argon, respectively. However, not all Auger peaks are broader in the solid state. Data by Rye et al. (19) in Fig. 17 show that the oxygen KVV peak from methyl alcohol broadens from about 5 eV in the gas to about 15 eV in a condensed solid. However, the carbon KVV peaks, shown in Fig. 17, from the gas and condensed solid have almost identical peak widths. Rye et al. (19) postulate that the difference in broadening results from differences in bonding. Solid methyl alcohol (CH3OH) is formed by hydrogen bonding between oxygen atoms in the molecule. The carbon is shielded by three hydrogen bonds and one oxygen bond and does not participate in solid-state bonding. This suggests that the broadening effect results from a coupling to a lattice, i.e., may result from phonon generation. Mathews (119) has shown that the Auger process can create lattice vibrations and that broadening results, at least for ionic solids. Citrin et al. (32) have also
[Figure 16: four panels; horizontal axis, $\theta$, polar angle (deg), 0 to 80; aperture width indicated.]
FIG. 16. Angular dependence of the $M_{2,3}VV$, 62-eV Auger transition from Cu(100) as a function of polar angle $\theta$ for four different values of the azimuthal angle $\phi$ and normally incident primary beam. (a) $\phi = 0°$, [011] direction; (b) $\phi = 16°$; (c) $\phi = 30°$; (d) $\phi = 45°$, [001] direction. [From Noonan et al. (116). Reprinted with permission, American Institute of Physics, New York.]
[Figure 17: C(KVV) spectra; legend: gas phase (solid curves), multilayer (dotted curves).]
FIG. 17. C(KVV) spectra for (a) CH3OH and (b) (CH3)2O. The solid curves in each case are the gas phase spectra and the dotted curves are the corresponding multilayer data after deconvolution and smoothing. The multilayer spectra have been shifted to lower energy by 3.3 eV for (a) and 3.4 eV for (b). [From Rye et al. (19). Reprinted with permission, North-Holland Publ. Co., Amsterdam.]
used plasmon generation to explain broadened XPS spectra. However, further research is necessary to show whether plasmon generation causes solid-state broadening. Other effects have been discussed; for example, interatomic transitions could lead to shorter final-state lifetimes and therefore broader transitions. There is also the possibility of nonuniform charging in the sample because of analysis; this might lead to broadening in insulators, but probably does not affect metals.

From the discussion above it is clear that both the Auger energy and peak shape change drastically for chemical states as different as a gas and a solid. Energy shifts and peak shape changes are collectively called “chemical shifts” in AES and have been used on an empirical basis since shortly after AES became widely used (120). The characteristic shape of the carbon peak for graphite, hydrocarbon, and carbide is perhaps the most widely used case of AES chemical state characterization (121), but others are well known (12,121,122).
The energy shifts associated with changes in the chemical state result from both shifts in the core-level binding energies and changes in the relaxation energies discussed above. Because of relaxation, the energy shifts observed with XPS do not equal those observed for AES (123,124). In fact, the nonequality of chemical shifts for XPS and AES has led Wagner to propose the use of the energy difference between the Auger and photoelectron peaks (which he calls the Auger parameter) to characterize the chemical state (125). The second part of the AES chemical shift (peak shape changes) results both from changes in the core level and in the valence band (or outer electron levels). These shifts are important not only for chemical state determination, but also because they affect the intensity of the Auger transition. Taylor (126) and Hall et al. (127) have described the relationship between the breadth of an Auger peak and the peak-to-peak height commonly measured from dN/dE curves. If the peak width changes, the factor relating concentration to peak-to-peak height (or to peak intensity in N(E) curves) also changes, leading to error in determining the composition. It has been claimed that the peak width was independent of chemical state for core-core-core Auger transitions (128), but data by Gallon and Nuttall (17) show that CVV transitions are affected. Gallon and Nuttall (17) have found that the cadmium $M_{4,5}N_{4,5}N_{4,5}$ core transition broadens from 1.03 eV in metallic cadmium to 2.3 eV in CdS. Similarly, Fox et al. (118) have shown that the zinc $L_{2,3}M_{4,5}M_{4,5}$ core transition from zinc broadens from 1.0 eV for metal to 3.1 eV for ZnO. It is apparent, therefore, that changes in chemical state can influence such Auger transitions. This may not always be recognized, however, since most Auger spectrometers are operated under conditions such that instrumental broadening often determines the width of Auger peaks.
Under these conditions, an Auger peak width of 1 eV is seldom observed and breadth changes of 1 or 2 eV would not significantly change the measured peak shape. Therefore, the claim that core Auger transitions are insensitive to chemical state may be valid. Auger transitions involving outer electron levels are very sensitive to chemical state. This is not surprising since the outer electron levels are involved in bonding that forms solids or molecules. The effects upon the carbon and oxygen Auger spectra of forming different molecules and forming molecular solids have been shown in Figs. 17 and 18. Figure 18 shows that the Auger peak shape changes for gaseous H2O, CH3OH, and (CH3)2O because of the different molecular groups bonded to the oxygen atom (19). However, the carbon atoms in both CH3OH and (CH3)2O are bonded to three hydrogen and one oxygen; therefore, the carbon spectrum is the same from either molecule. This illustrates a very important point about AES: it measures the local chemical environment, and not the average environment. Stated
another way, the valence band measured by XPS or UPS (ultraviolet photoelectron spectroscopy) is the average distribution of electrons associated with, for example, both carbon and oxygen. However, in AES the initial core hole is localized on either a carbon or an oxygen atom and therefore samples the outer electron levels (valence band or molecular orbitals) associated with that atom. As a result, we see the effects of bonding in the molecular solid upon the oxygen atom (Fig. 18) but not the carbon atom (Figs. 17, 19). Although AES can, in principle, be used for valence level spectroscopy of solids, initial results were not very encouraging (129). It now appears that AES can be used successfully to measure the valence states of s and p band elements (98,130,131), but in some narrow d band elements the Auger transitions from the valence band have an atomiclike character. The existence of the two categories appears to result from final-state effects. For narrow d band elements, the hole-hole interaction energy may be more than twice the band width (132-134). Under these conditions, the two holes are forced
[Figure 18: O(KVV) spectra, gas phase and multilayer; horizontal axis, electron energy (eV), 430 to 550.]
FIG. 18. O(KVV) spectra for (a) H2O, (b) CH3OH, and (c) (CH3)2O. The solid curves are the gas phase spectra and the dotted curves are the corresponding multilayer results after deconvolution and smoothing. The multilayer spectra have been shifted to lower energy by 3.2 eV for (a), 4.9 eV for (b), and 1.3 eV for (c). [From Rye et al. (19). Reprinted with permission, North-Holland Publ. Co., Amsterdam.]
[Figure 19: horizontal axis, electron energy (eV), 95 to 125.]
FIG. 19. Decomposition of a Cu(100)-M1VV spectrum into a symmetric low-energy atomiclike peak (dashed curve) and a broad high-energy signal (solid curve). Also plotted (dotted curve) is the self-fold of the copper theoretical DOS after being broadened by convolution with a Lorentzian function of 2.09-eV FWHM. The symbol above the theoretical DOS label indicates that the curve plotted in this figure is the self-fold of the theoretical DOS curve. [From Madden et al. (136). Reprinted with permission, American Physical Society.]
to remain on the atom on which they were created, and this yields an atomiclike spectrum (132,133,135). The atomiclike Auger spectrum from copper is shown in Fig. 19 (136). Even though a spectrum of the valence band states is not observed for all elements, the agreement between theory and experiment can be impressive. Silicon and some of its compounds have been the most widely studied. The experimental results for the silicon L2,3VV transition (137,138) from SiO2 are compared to theoretical calculations in Fig. 20. The calculated line shape was obtained from a self-convolution of calculated Auger transitions, and the experimental data have been corrected for both background and inelastic scattering. The agreement between experiment and theory is good, and will improve as excitation and transport phenomena are accounted for more completely and accurately ( 1 %+I41 ). For example, the calculated line shapes clearly demonstrate that Auger emission emphasizes valence states with a p character at the expense of those with an s character (13% 141). Clearly, considerable progress is being made in understanding the effects of chemical state upon the Auger electron energy and peak shape. This will have a large impact upon AES. For example, chemical shifts will be placed on a scientific basis, rather than simply being an empirical art. The impact upon quantitative AES has already occurred, as discussed below, and the ability of AES to sample the local chemical environment may prove to be extremely valuable in a number of fields, such as catalysis and corrosion.
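The self-convolution (self-fold) used for the calculated valence-band line shapes can be sketched as follows. This is only an illustration of the operation itself: the rectangular model band is hypothetical, and transition matrix elements, lifetime broadening, and hole-hole interaction are all ignored.

```python
def self_fold(dos):
    """Discrete self-convolution: out[n] = sum over k of dos[k] * dos[n - k]."""
    out = [0.0] * (2 * len(dos) - 1)
    for i, a in enumerate(dos):
        for j, b in enumerate(dos):
            out[i + j] += a * b
    return out

# Hypothetical rectangular valence band sampled at five equally spaced energies.
band = [1.0, 1.0, 1.0, 1.0, 1.0]
line_shape = self_fold(band)
print(line_shape)
```

For this flat model band the fold is a triangle spanning twice the band width, which is why a band-like CVV line is expected to be roughly twice as wide as the valence band from which it derives.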
[Figure 20: horizontal axis, electron energy (eV), 40 to 110.]
FIG. 20. (a) A comparison of the experimental (solid line) silicon L2,3VV Auger line shape with the calculated line shape obtained without (dashed line) and with (dotted line) 3d contributions. (b) The major Auger transitions comprising the silicon L2,3VV calculated line shape (dashed lines). The peak labels identify the locations of the final state holes for the major transitions in each peak. The Auger transitions involving the silicon 3d orbitals are identified by dotted lines. (c) The difference between the experimental and calculated line shape without (dashed line) and with (dotted line) silicon 3d contributions. The intensity centered around 52 eV may be identified as an L2,3V-V(3t2, 4a1)4t2 Auger satellite, and the peak at 69 eV as an L2,3V-V(4t2, 5a1)4t2 satellite. [From Ramaker and Murday (138). Reprinted with permission, American Institute of Physics, New York.]
III. EXPERIMENTAL APPROACH
A. Vacuum Requirement
AES is performed in a vacuum simply because it is necessary to accelerate electrons from the source to the target, and then to energy analyze the secondary electrons before they strike a gas molecule. Vacuums of 10- Torr are adequate to provide an electron mean-free path of 1 m, which is sufficient for analysis. However, vacuums of 10^-5 Torr are always used in order to extend the electron gun filament lifetime. Because AES started in surface science, there is a tendency to associate the technique with ultrahigh vacuum. It is not necessary to perform all analyses at 10^-10 Torr. For nonreactive samples (oxides, oxidized metals, gold, etc.) much higher pressures are perfectly adequate. However, to analyze clean surfaces of reactive metals, reactive-gas partial pressures of 10^-10 Torr or less are necessary. Rather than reduce the partial pressure of reactive gases, a “clean” surface can sometimes be maintained by sputtering while analyzing. However, if the reactive gas partial pressure is sufficiently high, a steady-state surface concentration of adsorbed gas will exist when the sputtering rate equals the adsorption rate. As discussed by Holloway and Stein (142), the adsorption rate of gas j, $A_j$, may be written
$$A_j = \frac{\delta_j P_j}{(2\pi m_j k T_g)^{1/2}} \tag{12}$$
where $\delta_j$ is the sticking coefficient, $P_j$ the partial pressure, $m_j$ the molecular mass, $k$ Boltzmann's constant, and $T_g$ the temperature of the gas. The sputter removal rate of the jth adsorbed species, $S_j$, may be written
$$S_j = Y_j i^+ \sigma_j / \sigma_{0j} \tag{13}$$
where $Y_j$ is the sputter yield (atoms per ion), $i^+$ the ion current, $\sigma_j$ the density of adsorbed j atoms, and $\sigma_{0j}$ the density of sites on the surface available for adsorption. For a steady-state condition $A_j = S_j$; therefore, Eqs. (12) and (13) may be rearranged to yield
$$\sigma_j = \frac{\sigma_{0j}\,\delta_j P_j}{Y_j i^+ (2\pi m_j k T_g)^{1/2}} \tag{14}$$
For a constant adsorption rate (constant $P_j$) the steady-state concentration of j decreases as $i^+$ increases. If $Y_j$ and $\delta_j$ are independent of $\sigma_j$, then Eq. (14) can be written
$$\sigma_j = K / i^+ \tag{15}$$
where K is a constant. This is observed in some cases, such as oxygen adsorbing onto silicon or Si3N4, as shown in Fig. 21 (142).
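The balance between adsorption and sputter removal can be checked numerically with a minimal sketch. It assumes SI units, treats $i^+$ as an ion flux (ions per unit area per second, that is, current density divided by the electron charge), and uses hypothetical values for the sticking coefficient, sputter yield, and site density. None of the numbers are taken from Holloway and Stein (142).

```python
import math

K_BOLTZMANN = 1.380649e-23   # J/K
PA_PER_TORR = 133.322        # Pa per Torr
KG_PER_AMU = 1.66054e-27     # kg per atomic mass unit

def adsorption_rate(sticking, pressure_torr, mass_amu, gas_temp_k):
    """Eq. (12): molecules adsorbed per m^2 per second."""
    pressure_pa = pressure_torr * PA_PER_TORR
    mass_kg = mass_amu * KG_PER_AMU
    return sticking * pressure_pa / math.sqrt(
        2.0 * math.pi * mass_kg * K_BOLTZMANN * gas_temp_k)

def steady_state_density(sticking, pressure_torr, mass_amu, gas_temp_k,
                         sputter_yield, ion_flux, site_density):
    """Eq. (14): adsorbate density at which adsorption balances sputtering.

    ion_flux is in ions m^-2 s^-1 (assumption: i+ expressed as a flux).
    The linear model is only meaningful while the result stays well below
    site_density.
    """
    rate = adsorption_rate(sticking, pressure_torr, mass_amu, gas_temp_k)
    return site_density * rate / (sputter_yield * ion_flux)

# Hypothetical case: O2 at 1e-8 Torr and 300 K, sticking coefficient 0.1,
# sputter yield 1 atom/ion, 1e19 sites/m^2.
slow = steady_state_density(0.1, 1e-8, 32.0, 300.0, 1.0, 1e17, 1e19)
fast = steady_state_density(0.1, 1e-8, 32.0, 300.0, 1.0, 2e17, 1e19)
print(slow / 1e19, fast / 1e19)   # fractional coverages
```

Doubling the ion flux halves the steady-state coverage, which is the $K/i^+$ behavior of Eq. (15).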
FIG. 21. Oxygen Auger peak-to-peak height vs. the inverse of ion bombardment current for silicon or Si3N4 exposed to oxygen-containing residual gas. As the rate of sputtering increases, the steady-state amount of adsorbed oxygen decreases in agreement with Eq. (15). [From Holloway and Stein (142). Reprinted with permission, The Electrochemical Society, Inc.]
B. Energy Analyzers
1. Cylindrical-Mirror Energy Analyzers
Most AES today is performed with a cylindrical-mirror analyzer (CMA). This analyzer has been known for some time (143), but Palmberg et al. (9) first used it for AES. Its large acceptance angle, good transmission, and double-focusing features (143) quickly established it as the leading commercially available spectrometer. A common experimental configuration for AES is shown schematically in Fig. 22. The CMA is a bandpass analyzer; that is, it establishes both an upper and a lower bound on the energy of electrons that pass through it. Electron energy analysis is accomplished by a voltage between the inner and outer cylinders, which reverses the radial momentum of the electrons. An electron multiplier detector serves to enhance the signal. A coaxial electron gun is normally used with the CMA. Although Palmberg has shown an enhancement of the Auger signal when glancing incidence electron beams are used for excitation (122), a coaxial electron beam is preferred since it tends to reduce the effects of surface roughness on the Auger signal (68) and for other reasons discussed in Section II,B,2. In addition, the coaxial gun gives a truer representation of the surface composition in the scanning mode. A typical schematic for the scanning Auger spectrometer was shown in Fig. 5.
[Figure 22: schematic; labeled components include X-Y recorder, electron multiplier, magnetic shield, and sputter ion gun.]
FIG. 22. Schematic of the experimental arrangement for using a cylindrical mirror analyzer (CMA) for Auger spectroscopy. The derivative spectrum is taken directly using the lock-in amplifier as described in the text. For the N(E) spectra, the analyzer is used as a bandpass filter
While the CMA is very popular for AES, it is not without faults. Sickafus and Holloway (144) have shown that the intensity and energy of Auger peaks measured with a single-pass analyzer are very sensitive to the sample-to-analyzer distance. The peak height could be changed more than 10% and the measured energy shifted more than 10 eV by a position change of 0.5 mm (144). Some of these problems may be avoided by using two sequential CMAs (the so-called double-pass analyzer) (145). When retarding-field analysis is combined with the double-pass CMA, very high energy resolution can be obtained (145). The CMA is also very sensitive to electromagnetic fields (146). This may be one reason that the recent round-robin experiments for AES (147) and XPS (148) measurements of gold and copper showed such large variations from one instrument to the next. The poor correlation of data from one laboratory to another clearly indicates that some instrumental factors are very poorly controlled.
2. Retarding-Field and Other Energy Analyzers
AES was first used extensively by surface scientists working with LEED equipment. Electron energy analysis was accomplished in this apparatus as illustrated schematically in Fig. 23. Normally LEED was performed with four transparent, hemispherical grids and a fluorescent screen, although three-grid instruments were also used. The sample and the grid closest to the sample were normally grounded to maintain a field-free region. The middle grid in the three-grid system, or the middle two grids in the four-grid system, were tied to a retarding potential so that only electrons with an energy greater
[Figure 23: schematic; labeled components include sinusoidal generator, phase-sensitive detector, dropping resistor, variable DC retard voltage, and X-Y recorder.]
FIG. 23. Schematic of the experimental arrangement for using a retarding-field analyzer (RFA) for Auger spectroscopy. The N(E) and dN/dE spectra are obtained by detecting the first or second harmonic, respectively, of the AC signal with the lock-in amplifier.
than that potential could pass through it. The last grid was grounded and a small potential was placed on the collector, which was normally the fluorescent screen. As a result, the analyzer operated as a high-pass filter, and the collected current represented $\int N(E)\,dE$ vs. E. The collection angle is large in this analyzer, but because of the large collected current so is the shot noise (12). As a result, the signal-to-noise characteristics of this analyzer were not as good as those of the CMA. A number of other analyzers have been applied to AES. Electrostatic high- and low-pass analyzers have been combined into a bandpass analyzer (149). Electrostatic spherical sector (150,151), 127° electrostatic sector (9), and magnetic analyzers (22) have seen only limited use. Faraday cage systems have been used in a few instances (116).
3. N(E) vs. dN/dE Curves
AES grew rapidly in popularity when it was shown that Auger peaks could easily be distinguished in the dN/dE spectra with an RFA. If a sinusoidal voltage is superimposed on the analyzer (pass or retarding) voltage of either the CMA or RFA, a sinusoidal component of current will result. Expansion of this signal in a Taylor series can be used to show that the first harmonic component B of the signal from a CMA is given by (126)
$$B = k\,\frac{dN(E)}{dE} + \frac{k^3}{8}\,\frac{d^3N(E)}{dE^3} + \frac{k^5}{192}\,\frac{d^5N(E)}{dE^5} + \cdots \tag{16}$$
where 2k is the peak-to-peak amplitude of the modulation voltage. [A similar expression can be written for the second harmonic from an RFA that contains a dN/dE term (126).] Since the derivative Auger spectrum is normally taken electronically by measuring the amplitude of the first harmonic, the second term in Eq. (16) must be small compared to the first in order to minimize error. Taylor (126) and Strausser (152) have shown that the ratio R of the second to the first term, for a Gaussian-shaped peak, is given by
$$R = 1.4\,k^2 / (\mathrm{FWHM})^2$$
where FWHM is the full width at half-maximum of the peak. Thus for a peak-to-peak modulation voltage equal to the FWHM (k = FWHM/2, so R = 1.4/4), the Auger peak-to-peak height is 35% in error. As reported above, Auger peaks can be very narrow, and errors due to large modulation voltages are very common. These and other complications are discussed by Houston and Park (153). This electronic differentiation scheme was introduced by Harris (5) and caused a rapid increase in the use of AES. However, differentiation only improves the signal-to-background ratio; it generally decreases the signal-to-noise ratio. The background is suppressed because it is slowly
and smoothly varying in the region of interest. Therefore, the derivative is near zero except where Auger peaks cause strong local changes in the slope of the N(E) curve. Degradation of the signal-to-noise ratio follows from the same argument: the noise also varies rapidly, causing small but sharp local variations in N(E). As discussed by Strausser (152), it is now entirely feasible, and perhaps desirable, to record the N(E) spectra directly. By using double-precision storage of digital data in a computer, the background can be suppressed by subtraction, and the improved signal-to-noise ratio of the N(E) curve can be utilized. This also reduces errors introduced during integration to obtain peak areas for quantification of AES data.
C. Computerization
It is obvious from the previous discussion that there is ample room to use digital computers in AES. If a multicomponent sample with overlapping peaks is being analyzed, the data could be peak resolved. Lookup tables could be stored for elemental analysis of unknowns. Similar tables could be used for determining chemical states or work function corrections. Background correction and deconvolution of inelastic scattering are impossible without the computer. Besides data reduction, the computer can control the spectrometer. It can repeatably set the primary energy, accept programming to sequentially analyze two or more spots, align the sample, and control the generation of sputter profiles. The possibilities are numerous and exciting. The capability to record and interpret data is increased many times over. That is the good news; there is also some bad news. If the computer controls the spectrometer, the spectrometer will not work when the computer is down. A mechanism for manual override is therefore essential.
The tendency is to record every possible data point, but this encumbers the analyst with huge stores of data that may actually lengthen the time for analysis rather than shorten it. The operator may become too remote from the data processing to judge its validity in all cases. Automation does not automatically translate into high accuracy. New methodologies will be necessary to check the instrument performance repeatedly on a frequent and routine basis. Brignell and Young (154) have recently discussed the impact of computer-aided measurements upon scientific experiments. Not unexpectedly, they conclude that computer-aided measurements can be very important in areas such as data logging, signal processing, experiment control, data interpretation, and modeling. However, drift, noise, and stability are still a problem, and they may be more severe since the operator generally is not monitoring the experiment so closely when the “computer is in charge.”
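The modulation error discussed in connection with Eq. (16) is the kind of check a computer makes routine. The sketch below simulates first-harmonic (lock-in) detection of a unit-height Gaussian N(E) and compares the resulting peak-to-peak height with that of the true derivative. The Gaussian line shape, the function names, and the grid sizes are assumptions made for illustration; no noise or instrument response is included.

```python
import math

SIGMA_PER_FWHM = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def gaussian(e, fwhm):
    """Unit-height Gaussian N(E) centred at E = 0."""
    sigma = fwhm * SIGMA_PER_FWHM
    return math.exp(-0.5 * (e / sigma) ** 2)

def first_harmonic(e, k, fwhm, steps=200):
    """Amplitude of the sin(wt) component of N(e + k*sin(wt)) over one cycle."""
    total = 0.0
    for n in range(steps):
        theta = 2.0 * math.pi * n / steps
        total += gaussian(e + k * math.sin(theta), fwhm) * math.sin(theta)
    return 2.0 * total / steps

def pph_loss(k, fwhm, points=601):
    """Fractional loss of peak-to-peak height relative to the ideal k*dN/dE."""
    energies = [(-3.0 + 6.0 * i / (points - 1)) * fwhm for i in range(points)]
    signal = [first_harmonic(e, k, fwhm) / k for e in energies]
    # True derivative of a unit-height Gaussian has PPH = 2*exp(-1/2)/sigma.
    ideal_pph = 2.0 * math.exp(-0.5) / (fwhm * SIGMA_PER_FWHM)
    return 1.0 - (max(signal) - min(signal)) / ideal_pph

def taylor_ratio(k, fwhm):
    """Leading-term estimate R = 1.4 k^2 / FWHM^2 quoted in the text."""
    return 1.4 * k ** 2 / fwhm ** 2

print(taylor_ratio(0.5, 1.0))   # peak-to-peak modulation equal to the FWHM
print(pph_loss(0.5, 1.0))       # simulated loss for the same modulation
print(pph_loss(0.05, 1.0))      # small modulation: loss should be negligible
```

In this sketch the simulated loss should come out somewhat below the 35% leading-term estimate, because the higher-order terms of Eq. (16) partly offset the cubic term and the derivative extrema shift outward; the estimate and the simulation converge as the modulation amplitude is reduced.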
IV. QUANTITATIVE AES
While qualitative analysis by AES is sufficient for the vast majority of applications, semiquantitative and quantitative analyses are often desirable. It is relatively easy to quantify Auger data to an accuracy of a factor of four, which shall be called semiquantitative analysis for our purposes. In quantitative analysis, accuracy errors < 20% (percentage departure from the true composition) and precision errors < 5% (deviation upon repeated measurement of the composition) are necessary. There are still problems in achieving these limits on every sample, but with simple systems they can be achieved. Careful attention must be given to experimental procedures, however, as discussed below.
A. Intensity Calibration
The Auger current $I_i(UVW)$ from the UVW Auger transition of the ith element is given by (12,42a,81,155-160)
$$I_i(UVW) = I_p\,R\,P(UVW)\,T\,\phi_i(E_p,E_c)\,C_i(E_p,E_c)\,N(X_i^s)\,\lambda_i(X_i^s)\,r_i(E_p,X_i^s)\,X_i^s \tag{17}$$
where $I_p$ is the primary electron current, R a surface roughness factor, P(UVW) the UVW Auger transition probability, T the instrument response function, $\phi_i$ the ionization cross section, which is a function of the primary beam energy $E_p$ and the critical energy for U-level ionization $E_c$, $C_i$ represents Auger and Coster-Kronig contributions to electron holes in the U level, N the atom density, $\lambda_i$ the electron escape depth, $r_i$ the electron backscatter coefficient, and $X_i^s$ the atom fraction of the ith element in the volume detected at the surface. There can be error in evaluating any term of Eq. (17), which will lead to a loss in accuracy or precision, and this is especially true for $I_i(UVW)$. As discussed in Section II,C, the true Auger peak shape (and therefore Auger current) is distorted by changes in background, by the instrumental response function, and by inelastic scattering. Procedures for correcting these effects were discussed in Section II,C. Therefore, the true current can be obtained by integrating under the Auger peak, but as discussed in Section III, Auger data are often taken as dN/dE rather than N(E). The area under the Auger peak can still be obtained by first integrating the dN/dE data to recover N(E), and then correcting for background, resolution, and loss features. This procedure is time consuming and requires a digital computer; therefore, it is more common to attempt quantification of the derivative data. Taylor (126) and Strausser (152) have shown that the distance between the maximum and minimum derivatives (peak-to-peak height or PPH) is
proportional to the area of a peak (if the electronic modulation voltage is sufficiently small, as discussed earlier). Thus $I_i(UVW)$ of Eq. (17) should be replaced with $K_i\,\mathrm{PPH}(UVW)$, where $K_i$ is the proportionality factor relating Auger current to the peak-to-peak height (PPH). The PPH may be used to quantify Auger data as long as $K_i$ is independent of composition. Changes in peak shape were discussed in Section II,C. With current spectrometers, constant peak shape (constant $K_i$) is normally a good approximation for CCC (where C represents core) Auger transitions. However, CCV or CVV (where V represents valence) transitions should be avoided (12,42a,160,162). Once the Auger current is determined, the other factors in Eq. (17) must be known or controlled. The primary current and surface roughness may vary from sample to sample even if composition is constant; these variables must be controlled by the analyst. The next four variables ($P$, $T$, $\phi_i$, $C_i$) are generally considered to be insensitive to the matrix, while the last three factors ($N$, $\lambda_i$, $r_i$) are matrix sensitive and vary with composition. The procedure to establish a relationship between $I_i(UVW)$ and $X_i^s$ must take these variations into account, and three approaches have been attempted. The first-principles approach (156,162) attempts to calculate the last seven parameters of Eq. (17). Substantial progress has been achieved, but significant error can still result from this approach (155,162). A second approach is to prepare a series of “exact standards” with all of the elements of the unknown present. Several standards must be prepared so that the compositions bracket that of the unknown. Then for the jth standard, Eq. (17) may be written
$$A^s_{i,j}(X^s_i) = I_{i,j}(UVW) / X^s_{i,j} \tag{18}$$
where the sensitivity factors so determined can be used to evaluate the concentration of the unknowns. Equation (18) is written with the implicit assumption that the primary current and surface roughness factor are constant between unknowns and standards; the remaining factors in Eq. (17) are incorporated into the sensitivity factor. The use of “exact” standards does not necessarily mean accurate quantification of the Auger data. Holloway (155) has discussed errors in measuring $I_p$, and pointed out that the common method of positively biasing the substrate holder to collect “all” of the current can lead to errors of +20% because of collection of stray currents. The roughness factor R has been investigated by very few people. Baird et al. (163) studied the influence of regular sinusoidal or triangular roughness upon X-ray photoelectron data, while Holloway (68) measured the influence of statistically distributed roughness upon Auger electron data. Holloway (68,155) points out that the effects of surface roughness can be minimized by using normal incidence of a primary electron beam that is coaxial with the axis of the analyzer, and by using large reduced energies ($E_p/E_c$ = reduced energy > 8). At large reduced energies, the effects of roughness can be partially compensated by taking ratios of peak intensities, but ratioing can lead to errors if the reduced energy is too low. Experimental data suggest that Auger peak intensities can vary by as much as a factor of two because of surface roughness (68). Even if the primary current and surface roughness are constant between the “exact” standards and unknowns, it is very difficult to produce a surface with a known composition, i.e., produce a standard surface. Even though the bulk composition of the “exact” standards may be known (e.g., from electron microprobe, atomic absorption, or other techniques), the surface will generally have a different composition because of evaporation, condensation, oxidation, adsorption, etc. As a result, the surface must be treated prior to analysis to produce a “known” composition. At the present time, there is no “best” method for producing a standard surface, although sputtering is used much more often than any other technique. However, as discussed in Section II,B,4, sputtering can cause roughening of the surface and lead to changes in composition because of preferred sputtering (64,67,163a). As long as both the standards and unknown are sputtered, this may produce a suitable standard surface where the surface composition is related to the bulk composition by a constant. Then we may write
$$X^s_{i,j} = B_{i,j}\,X^b_{i,j} \tag{19}$$
and Eq. (18) can be rewritten
$$A^b_{i,j}(X^b_i) = I_{i,j}(UVW) / X^b_{i,j} \tag{20}$$
where
$$A^b_{i,j}(X^b_i) = A^s_{i,j}(X^s_i)\,B_{i,j} \tag{21}$$
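The bookkeeping behind the surface and bulk sensitivity factors, and the use of bracketing standards, can be sketched as follows. All numbers are hypothetical, the function names are illustrative, and the linear interpolation between two standards is a simplifying assumption rather than a procedure taken from the cited work.

```python
def sensitivity_factor(intensity, atom_fraction):
    """Eqs. (18) and (20): measured Auger intensity divided by a known atom fraction."""
    return intensity / atom_fraction

def quantify_by_bracketing(intensity, low_std, high_std):
    """Interpolate composition between two bracketing standards.

    Each standard is a tuple (measured intensity, known bulk atom fraction).
    Linear interpolation is an assumption of this sketch.
    """
    (i_lo, x_lo), (i_hi, x_hi) = low_std, high_std
    return x_lo + (intensity - i_lo) * (x_hi - x_lo) / (i_hi - i_lo)

# Hypothetical sputtered standard: bulk fraction 0.20, surface depleted by B = 0.80.
bulk_fraction = 0.20
b_factor = 0.80
surface_fraction = b_factor * bulk_fraction                    # Eq. (19)
intensity = 80.0                                               # arbitrary units
a_surface = sensitivity_factor(intensity, surface_fraction)    # Eq. (18)
a_bulk = sensitivity_factor(intensity, bulk_fraction)          # Eq. (20)

# Unknown measured under the same conditions, bracketed by two standards.
unknown = quantify_by_bracketing(105.0, (80.0, 0.20), (130.0, 0.30))
print(a_surface, a_bulk, unknown)
```

The bulk-referenced factor equals the surface-referenced factor multiplied by B, so a sputtered standard can be used directly with its known bulk composition provided the unknown is prepared in the same way.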
Even though Eq. (20) is valid, the experimental variables must be held constant to apply it to a number of samples. Equation (19) was discussed specifically for sputtering of the surfaces, but similar expressions can be written for other methods of preparing standard surfaces. For example, the bulk composition of the sample can sometimes be exposed at the surface by scribing (75,81). However, the energy deposited and defects generated by the severe plastic deformation may cause the surface to deviate from the bulk composition. Again Eq. (19) must be used, where $B_{i,j}$ then would relate to the scribing process. Other techniques for producing standard surfaces are in situ codeposition (problems with accommodation coefficients), in situ fracturing (surface roughness and segregation effects), heating (surface segregation), and monolayer films detected by radioactivity, low-energy electron diffraction, ellipsometry, or quartz crystal microbalance (variation of $\lambda$ and r with thickness). It should be obvious from the above that even though AES can be
calibrated and quantified, the measurement of “exact” standards must be performed with all variables under control. Only with such caution can the measurements be made with sufficient accuracy and precision. The proportionality constants in Eqs. (18) and (20) should result in good accuracy; however, if the number of elements is large or the concentration ranges broad, the necessary number of “exact” standards can be very large. It is desirable to use simple standards for quantification, e.g., elements. This has often been performed using inverse sensitivity factors (42a,157) or relative handbook sensitivity factors (42a). Neither of these approaches considers the error introduced by the matrix sensitivity of N, $\lambda$, and r in Eq. (17). Furthermore, the use of handbook data increases the probability of significant error caused by differences in the system response function (i.e., changes in the measured PPH caused by differences in energy resolution, modulation voltage, etc.). Both Holloway (81) and Hall et al. (158,159) have investigated techniques to account for matrix sensitivity. Holloway (81) assumed a linear variation in N, $\lambda$, and r with composition and showed that for a binary system.
where the relative sensitivity factor is given by
;.,
and $I_i^0$ is the intensity of pure i, $N_i^0$ is the atom density of pure i, the two $\lambda_1$ terms are the electron escape depths of Auger electrons from element 1 in pure 2 or pure 1, and the two $r_1$ terms are the backscatter factors for electrons from element 1 in pure 2 or pure 1, respectively. To write Eqs. (22) and (23) it must be assumed that the linear variation with composition for $\lambda_1$ divided by $\lambda_2$, multiplied by the variation of $r_1$ divided by $r_2$, is equal to unity; that is, these variations are offsetting and cancel (81,155). Literature data for r and $\lambda$ are consistent with this conclusion (81), as are experimental results for binary alloys (81,158,159,164). The important conclusion from Eq. (23) is that the relative sensitivity factor for quantitative analysis by elemental standards is not simply the intensity ratio of Auger peaks, but this ratio modified by the ratio of atom density, escape depth, and backscatter factor. The ratio of escape depths can be determined from Penn’s data (165,166) and the backscatter ratio from data by Smith and Gallon (167). These corrections amounted to about 10% for Cr-Au binary alloys, but did improve the ability to analyze the surface composition as shown in Fig. 24 (85). Hall and Morabito (158) have recently taken backscatter data reported by Reuther (168) for electrons of 15-keV energy, taken escape depth data reported by Penn (165,166), and calculated tables of
[Figure 24: scribed surfaces; uncorrected and corrected data for the Cr 527 eV peak with the Au 70 eV and Au 2024 eV peaks; concentration axis in atom %.]
FIG. 24. Surface concentration of chromium on scribed Cr-Au alloys. The left-hand data were calculated using the gold 70 eV Auger peak while the right-hand data were from the gold 2024 eV peak. The open circles were calculated from pure element peak heights while the closed circles were calculated from Eqs. (22) and (23). [From Holloway (81). Reprinted with permission, North-Holland Publ. Co., Amsterdam.]
correction factors to apply to the intensity ratio of Eq. (23). These factors may be useful, but their accuracy must be questioned since the backscatter coefficients predicted by extrapolation of Reuther's data do not agree well with the experimental data of Smith and Gallon (167). In addition to evaluating the ratios of Eq. (23) with literature data, Hall et al. (159) have used sputtering through diffuse interfaces between thin films of pure materials to experimentally evaluate $P_{rel}$. In their case, $P_{rel}$ contained a term related to preferred sputtering, but they found $P_{rel}$ to be independent of composition (but normally different from the elemental intensity ratio) for many cases. Pons et al. (160) have also developed a method whereby data were quantified without measurement of separate standards. In their procedure, the sensitivity factors were determined by an iterative technique. The method accounts for variations in escape depth but not in the backscatter factor. It may be a very good technique for semiquantitative analysis.
B. Sample Homogeneity
The procedures discussed in the previous section have all implicitly
assumed that the sample is homogeneous in the sampled volume. This is generally not true, and the course of action to follow then depends upon the morphology of elements in the sample. For example, if the specimen is not homogeneous in the plane of the sample, high spatial resolution is necessary. Most Auger spectrometers now have a scanning capability, and resolutions of 500 Å are state of the art (see Section II,B). For composition variation over dimensions less than the resolution of any particular spectrometer, the spectroscopist has two choices. First, he can state the limits of his resolution and report the average composition of the sample. Second, he can attempt to evaluate uniformity (or lack thereof) with experimental techniques with higher spatial resolution (e.g., scanning electron microscopy, transmission electron microscopy). Composition variation normal to the surface is also common. If the composition varies over distances that are greater than the escape depth, sputter profiling can be used. For variations over distances equal to the escape depth, intensity analysis of the Auger electron peaks may be more appropriate (169-171). For example, Holloway measured the thickness of Cr2O3 layers (< 10 nm thick) on gold to within ±10% (171). The uniformity of the films can be investigated by using Auger electrons with differing escape depths (energies) (12,169,171). For films thinner than the escape depth, the distribution can sometimes be measured by varying the primary beam incidence angle (170) or the takeoff angle of analyzed Auger electrons (163). Variation of the incidence or detection angles may be complicated by surface roughness (71,164), primary beam attenuation (170), and nonisotropic emission of Auger electrons (116).
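Intensity analysis of a thin overlayer can be illustrated with the simplest attenuation model, in which a uniform film of thickness d reduces the substrate signal by exp(-d / (λ cos θ)). This is a common textbook simplification and is not claimed to be the analysis used in refs. (169-171); the function name, the intensities, and the escape depths below are hypothetical.

```python
import math

def overlayer_thickness(i_clean, i_covered, escape_depth, emission_angle_deg=0.0):
    """Thickness d from I = I0 * exp(-d / (escape_depth * cos(angle))).

    emission_angle_deg is measured from the surface normal; thickness is
    returned in the same units as escape_depth.
    """
    path_factor = math.cos(math.radians(emission_angle_deg))
    return escape_depth * path_factor * math.log(i_clean / i_covered)

# Hypothetical substrate peaks at two energies with different escape depths (nm).
d_low_energy = overlayer_thickness(100.0, 100.0 * math.exp(-3.0), 1.0)
d_high_energy = overlayer_thickness(100.0, 100.0 * math.exp(-1.0), 3.0)
print(d_low_energy, d_high_energy)
```

If peaks with different escape depths give the same thickness, as in this constructed example, the film is consistent with a uniform layer; a systematic disagreement between the two estimates would suggest islands or a graded interface.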
V. SAMPLE DAMAGE
The beam that initiates emission of Auger electrons can also change the sample being analyzed. For example, ions cause sputtering while X rays or electrons can cause a number of effects including desorption and charging. Because of high power density and large interaction cross sections, primary electron beams can be especially damaging. A beam can cause adsorption of residual gases (172,173), oxidation (174), electron-stimulated desorption of surface species (111,175,176), migration of mobile species (177-180), heating of the sample (179,181,182), sample charging (44,177-179), and molecular cracking (183-186). Coad et al. (172) have shown that electron beams can cause cracking of residual gas and deposition of carbon onto silicon. Ranke and Jacobi (174) have shown that GaAs oxidized more rapidly when an electron beam was striking the surface. Margoninski studied the desorption of oxygen from surfaces by electrons, and Madey and Yates (176) have reviewed the electron-stimulated desorption process. Knotek et al. (111,187) have recently shown that desorption from ionic solids can result from interatomic Auger transitions, independent of whether the process is initiated by an electron or photon. However, desorption by electrons is a much more severe problem because of the higher cross sections, and in fact the phenomenon may itself be used as a surface analysis technique (188).
Beam-enhanced diffusion of mobile species has been studied exclusively in insulators. Chou et al. (178) have shown that chlorine will migrate in SiO2 during irradiation with an electron beam. Pantano et al. (177) and Ohuchi et al. (179) observed similar effects for alkali elements in glass or ceramics, and Pantano et al. (177) postulate that the phenomenon resulted from electron-beam-enhanced mobility and driving force. During irradiation with electrons, an electric field will exist in the solid because of electron trapping, and the field will cause positive ions to leave the surface region, i.e., the driving force for diffusion is modified by the electric field. The mobility can also be higher during analysis since the power density of the primary beam can be large and result in local heating (181). Therefore, the migration can be analyzed as a diffusion process where both the mobility and driving force are larger because of the primary electron beam. Similar effects may be expected from ion beams although the power density (and therefore mobility) is less, and White et al. (180) have shown that mobile ions will migrate during ion bombardment. Therefore, even though X-ray excitation (for example, in X-ray photoelectron spectroscopy, ESCA) may not result in an electric field sufficient to cause diffusion, diffusion may still occur during sputter profiling. Not only does electron trapping lead to field-enhanced diffusion, but the fields may also reach the dielectric breakdown strength of the solid and cause “charging noise” in the secondary electron energy spectrum.
This is especially true for derivative data since the time-dependent breakdown causes large lock-in amplifier signals. The dielectric breakdown strength is not always exceeded, however. A leakage current may develop and stabilize a negative charge state before breakdown occurs. In addition, if the primary energy and incidence angle are correct, the secondary electron emission coefficient may be positive and true secondary electrons will return to the surface to maintain a steady-state charge accumulation. Conductive masks, inert gas in the chamber, and ion bombardment can be used at times to minimize charging.

Finally, molecular cracking by an electron beam can be a serious problem for both organic and inorganic compounds. It is not surprising that cracking occurs with organic species; however, the doses at which it is observed are very low. Holloway et al. (183) report detectable damage for methyl alcohol and methyl ether at a dose of 5 × 10⁻⁴ C/cm². This is equivalent to a 1 µA,
FUNDAMENTALS OF AUGER ELECTRON SPECTROSCOPY
0.5-mm-diam beam striking the sample for 1 sec. Inorganic species may also be dissociated. Thomas (184) first reported that electrons caused dissociation of SiO₂. This has been verified by a number of authors. Johannessen et al. (185) have shown that a current density of A/cm² at 4.5 keV is necessary to avoid SiO₂ decomposition during Auger analysis. Molecular cracking apparently occurs due to collisional excitation of bonding electrons to nonbonding orbitals; this causes disintegration of the molecule.
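The equivalence quoted above between an electron dose and a 1 µA, 0.5-mm-diameter beam held on the sample for 1 sec can be checked with a short calculation. The sketch below is illustrative only: it assumes a uniform current distribution over a circular spot, and the function name is hypothetical rather than taken from the cited work.

```python
import math


def electron_dose_c_per_cm2(current_a, beam_diameter_cm, exposure_s):
    """Charge per unit area (C/cm^2) delivered by a beam.

    Assumption: the current is spread uniformly over a circular spot.
    """
    spot_area_cm2 = math.pi * (beam_diameter_cm / 2.0) ** 2
    return current_a * exposure_s / spot_area_cm2


# 1 microampere, 0.5 mm (0.05 cm) diameter, 1 second exposure
dose = electron_dose_c_per_cm2(current_a=1e-6, beam_diameter_cm=0.05, exposure_s=1.0)
print(f"dose = {dose:.2e} C/cm^2")  # prints: dose = 5.09e-04 C/cm^2
```

Under these assumptions the same function can be used to estimate how long a given beam may dwell on a sensitive sample before a chosen damage threshold is reached, by solving for the exposure time instead of the dose.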
VI. APPLICATIONS

Over the past ten years, the technique of Auger electron spectroscopy has moved from being exclusively a research technique to being a technique used in production lines for quality control. As a result, the spectrum of applications is sufficiently broad to itself be the subject of review articles (122,189). Only a brief sampling will be used here to illustrate the variety of applications.

A. Fundamental Interface Studies
The initial widespread use of AES was to determine the cleanliness of surfaces being studied by low-energy electron diffraction (LEED) (13). Determination of contamination on surfaces remains a primary application of AES in current research programs. However, it has gone beyond this stage. For example, it has been used to quantify the extent and conditions under which the surface and grain boundary compositions differ from the bulk (190-192). As a result of the difference in bonding at the surface and in the bulk for different atoms in a multicomponent system (e.g., a binary alloy system), one or more elements may be enriched on the surface after a heat treatment (190). Similar arguments can be used to suggest that the composition near a grain boundary should be different (191,192). In fact, AES has shown directly and unequivocally that such enrichment does exist, is being used to develop theories of such segregation, and has been used to correlate segregation with mechanical properties and catalytic reactions as discussed below.

Another area now developing in surface science is that of surface phase diagrams. It is now apparent that for appropriate combinations of temperature and surface coverage, adsorbed species will be randomly or periodically arranged upon the surface (193). This is shown in Fig. 25 for oxygen on tungsten (193,194). This figure shows that for coverage θ below 0.3 and temperatures above 450 K, oxygen adsorbed upon tungsten (110) is disordered. However, for 0.3 < θ < 0.5, two phases exist upon the surface: a random
FIG. 25. Possible phase diagram for W(110) covered with adsorbed oxygen. [From Lagally et al. (193). Reprinted with permission, North-Holland Publ. Co., Amsterdam.]
phase and an ordered p(2 x 1) phase. A single p(2 x 1)-ordered phase exists at θ = 0.5, but for θ > 0.5, the system may disorder (T > 700 K) or form p(2 x 2) and p(1 x 1) phases. Surface phase diagrams are expected to be significant in predicting the thermodynamics of surfaces and therefore the possible reactions at surfaces.

AES has been used extensively in studies of the interaction of electrons, ions, photons, and neutrons with the surface. This has been amply demonstrated in the discussion above. However, it is interesting to point out that it has even been used to study liquid-phase electrochemical reactions (195,196). Of course, the spectroscopy was performed in vacuum before and after exposing the surface of electrodes to liquids. Felter and Hubbard (196) have shown that exposure of platinum to iodine can alter its electrochemical behavior in sulfuric acid.

B. Materials Science

AES has been extensively applied to studies of mechanical properties of materials. Among other effects, it has been used to study low-temperature embrittlement of steels, hydrogen-induced cracking, stress corrosion
cracking, creep, and machining (189). It has been used very successfully to show that the grain boundary composition can be different from the bulk, and this can affect brittle fracture behavior, grain growth, and creep (191,192). These experimental data have led to progress in the theoretical predictions of grain boundary segregation in multicomponent systems (197). The impact of these studies upon technology is very apparent; in some instances new specifications are being written to limit the concentrations of group VA impurities. In another instance, the effects of impurity concentrations upon the brittle-ductile transition temperature have been predicted, and this temperature has been lowered by adding elements rather than lowering impurities (191).

Another class of mechanical behavior studied with AES is that of friction, wear, and adhesion. The potential of AES in these areas is very large since they all are directly affected by the surface composition. Buckley (198) has shown that atomically clean surfaces adhere to one another, while adsorption of as little as a single layer of gas is sufficient to reduce the coefficient of adhesion for some materials by about an order of magnitude. Pepper (199) has shown that polymeric material may transfer during contact with a metal rider, and thereafter serve as a lubricant to minimize wear. Finally, Holloway (200) has shown that as little as 1 nm of Cr₂O₃ will prevent formation of strong bonds during thermocompression gold-gold bonding in hybrid microcircuits. It is obvious that the technologies of bonding, lubrication, and wear will be directly affected by surface-analytical techniques.

The same can be said of the area of corrosion and oxidation; a significant number of publications already demonstrate that AES is very valuable for studies of metals as well as glasses and ceramics. For example, the mechanism of dissolution and passivation of steels and other materials has been studied using AES (201-203).
AES has been used in oxidation studies also; Holloway and Hudson (204) studied the initiation of the oxidation of nickel and showed that a passivating film was rapidly formed at low temperatures. Magnani and Holloway (205) studied the oxidation of a U-Nb alloy at moderate temperatures and correlated the results with stress corrosion cracking data. They postulated that stresses caused by the oxide formation were resulting in crack propagation at lower stresses in this alloy. For glasses and ceramics, AES in conjunction with argon sputtering has been used to study the corrosion rates of simulated nuclear waste glass (206). The data showed that selective leaching of the glass constituents occurred at short times, but network dissolution was rate controlling at long times. In another study, Auger data showed that certain glasses (termed bioglass) can form very strong bonds to living bones by selective leaching of sodium from the surface, formation of a calcium phosphate layer, and incorporation of organic matter into the calcium phosphate layer (207).
C. Catalysts

AES has been applied to all areas of catalysis: to studies of the surface composition of catalysts, to studies of adsorption and reaction on the catalyst surface, and to studies of catalyst deactivation and poisoning. As discussed in Section VI,A, AES has shown that the surface composition may be different from the bulk composition when a catalyst is heated. This is especially true for bimetallic catalysts used for petroleum refining (208). In some instances the surface composition correlates directly with catalytic activity. For example, the activity for hydrogenation of benzene over platinum decreases sharply when small amounts of palladium are added. However, Ponec (209) reports that palladium segregates preferentially to the platinum surface, and the hydrogenation activity actually decreases linearly with increases in the surface concentration of palladium. However, composition studies are not limited to metallic catalysts. Goldobin and Savchenko (210) have used AES to investigate the surface of oxide catalysts.

Adsorption of gases onto surfaces has been often studied with AES, but there is an increasing number of studies of the reactions for two or more gases upon these surfaces. The two most popular substrate materials are nickel and platinum. Madey et al. (211) have studied the methanation reaction over single-crystal nickel. They report turnover numbers that agree very well with data from polycrystalline and Al₂O₃-supported nickel samples. This agreement demonstrates remarkable progress toward understanding catalysts from both a surface science and “real” catalyst point of view. Madey et al. (211) also have observed a carbidic phase on the nickel surface after the methanation reaction has occurred, and they suggest this phase may represent a necessary intermediate step in the formation of methane from carbon monoxide. In a similar fashion, Matsushima et al.
(212) have shown by Auger peak shape analysis that two species of oxygen may be chemisorbed upon platinum. One of these species is very reactive toward oxidation of carbon monoxide, while the second chemical state does not react at all.

While the basic mechanism of catalysis is important, the maintenance of high catalytic activity over long periods of time is also important to the economics of a process. Therefore, AES has been used to study deactivation and poisoning of catalysts. The effects of sulfur on catalyzed reactions have been studied for a number of materials, especially nickel and platinum (189). In more applied studies, Williams and Baron (213) showed that lead accumulated upon the surface of platinum or palladium automobile exhaust catalyst and caused loss of activity. Bhasin showed that lead also degraded the activity of copper catalyst used in the reaction of methyl chloride with silicon (214). He also showed that iron deposited on the surface of palladium catalysts would poison the hydrogenation of diolefins (215). Holloway and
Nelson (216) used the good spatial resolution of AES to investigate the poisoning of coal liquefaction catalysts.

D. Electronics

Of all the areas of application, AES is most heavily used in the field of electronics. AES is important to all phases of this industry, from research to production. Using device processing as a means for discussion, AES has been used for research and process development in the areas of substrate and substrate processing, deposited films, patterning, interconnections, and compatibility. Holloway has recently reviewed these areas (217) and Holloway and McGuire have compiled an extensive literature survey of applications in electronics (218). Therefore, only brief illustrations will be given.

Yang et al. (219) have investigated various techniques for cleaning silicon prior to metallization, etc. They conclude that plasma cleaning is the most effective method to remove carbon impurities. Several investigators have studied the distributions of dopants in SiO₂ films grown on silicon (220-222). With respect to deposited films, Holloway and Stein (142) have studied the incorporation of oxygen into CVD-deposited silicon nitride. They were able to correlate the oxygen concentration with the index of refraction of the silicon nitride. Andrews and Morabito (223) used AES to show that metallic impurities from the substrate holder were being incorporated into sputter-deposited IC metallization films. This led to high rejection rates for the ICs, but coating the substrate holders eliminated the problem. In bonding, Bushmire and Holloway (224) used AES to demonstrate the sensitivity of various bonding techniques to contamination. Using thin films of photoresist residues, they showed that the order of decreasing sensitivity was compliant beam lead (most sensitive to contamination), thermocompression fine gold wire, wobble beam lead, thermosonic fine gold wire, thermocompression lead frame, and ultrasonic fine aluminum wire bonding.
Finally, in compatibility, Holloway (217) used AES to show that fluorine from an activator for an epoxy sealant can cause the formation of very thick layers of SiO₂ in a sealed microcircuit. It is evident from even this brief list that AES has proven to be extremely valuable to the electronics industry.
VII. SUMMARY
The characteristics and attributes of Auger electron spectroscopy described above do not need to be summarized here. Rather, some thought needs to be given to the future directions of AES. In the author’s
opinion, the technique will become as widespread and common as scanning electron microscopy. The technologies to which it is applied will also continue to increase simply because it can save money through failure analysis, product development, and applied research. As a result the equipment will continue to improve, although the present trend is to make the equipment more complex and expensive. There is a real need to market commercial spectrometers at a price that smaller companies can afford. The real progress in the fundamentals of Auger electron spectroscopy will be in the areas associated with quantification and chemical-state analysis. For quantitative analysis, better understanding of the escape depth, backscatter factor, and Auger transition probability is necessary. Optimum procedures for a multiphase sample must be developed. In the area of chemical-state analysis, the local nature of the Auger process makes it very attractive for studies of molecular orbitals and valence bands. Further work is necessary in determining the true peak shape and extraction of the density of states from that shape.
REFERENCES

1. P. Auger, J. Phys. Radium [6] 6, 205 (1925).
2. E. H. S. Burhop, “The Auger Effect and Other Radiationless Transitions.” Cambridge Univ. Press, London and New York, 1952.
3. I. Bergstrom and C. Nordling, “Alpha, Beta, and Gamma Ray Spectroscopy,” Vol. 2, p. 1523. North-Holland Publ., Amsterdam, 1965.
4. E. J. McGuire, in “Atomic Inner-Shell Processes VI” (B. Crasemann, ed.), Chapter 7. Academic Press, New York, 1975.
5. L. A. Harris, J. Appl. Phys. 39, 1419 (1968).
6. J. J. Lander, Phys. Rev. 91, 1382 (1953).
7. R. E. Weber and W. T. Peria, J. Appl. Phys. 38, 4355 (1967).
8. L. N. Tharp and E. J. Scheibner, J. Appl. Phys. 38, 3320 (1967).
9. P. W. Palmberg, G. K. Bohn, and J. C. Tracy, Appl. Phys. Lett. 15, 254 (1969).
10. P. W. Palmberg and H. L. Marcus, ASM Trans. Q. 62, 1016 (1969).
11. N. C. MacDonald and J. R. Waldrop, Appl. Phys. Lett. 16, 76 (1970); 19, 315 (1971).
12. C. C. Chang, in “Characterization of Solid Surfaces” (P. F. Kane and G. B. Larrabee, eds.), Chapter 20. Plenum, New York, 1974.
13. L. Fiermans and J. Vennik, Adv. Electron. Electron Phys. 43, 139 (1977).
14. K. Siegbahn, C. Nordling, A. Fahlman, R. Nordberg, K. Hamrin, J. Hedman, G. Johansson, T. Bergmark, S. E. Karlsson, I. Lindgren, and B. Lindgren, Nova Acta Regiae Soc. Sci. Ups. 20, 234 (1967).
15. W. E. Moddeman, T. A. Carlson, M. O. Krause, B. P. Pullen, W. E. Bull, and G. K. Schweitzer, J. Chem. Phys. 55, 2317 (1971).
16. J. M. White, R. R. Rye, and J. E. Houston, Chem. Phys. Lett. 46, 146 (1977).
17. T. E. Gallon and J. D. Nuttall, Surf. Sci. 53, 698 (1975).
18. P. H. Citrin, P. Eisenberger, and D. G. Hamann, Phys. Rev. Lett. 33, 965 (1974).
19. R. R. Rye, T. E. Madey, J. E. Houston, and P. H. Holloway, J. Chem. Phys. 69, 1504 (1978).
20. F. P. Larkin, in “Atomic Inner-Shell Processes VI” (B. Crasemann, ed.), p. . Academic Press, New York, 1975.
21. J. A. Bearden and A. F. Burr, Rev. Mod. Phys. 39, 125 (1967).
22. K. Siegbahn, C. Nordling, A. Fahlman, et al., “ESCA: Atomic, Molecular and Solid State Structure Studied by Means of Electron Spectroscopy.” Almqvist & Wiksell, Stockholm, 1967.
23. S. K. Haynes, Proc. Int. Conf. Inn. Shell Ioniz. Phenom. Future Appl., 1972 p. 559 (1973).
24. M. F. Chung and L. H. Jenkins, Surf. Sci. 22, 479 (1970).
25. J. P. Coad and J. C. Reviere, Surf. Sci. 25, 609 (1971).
26. L. Hedin and A. Johansson, J. Phys. B 2, 1336 (1969).
27. D. A. Shirley, Phys. Rev. A 7, 1520 (1973).
28. W. N. Asaad and E. H. S. Burhop, Proc. R. Soc. London, Ser. B 72, 369 (1958).
29. W. L. Jolly and D. N. Hendrickson, J. Am. Chem. Soc. 92, 1863 (1970).
30. D. A. Shirley, Chem. Phys. Lett. 16, 220 (1972).
31. S. P. Kowalczyk, R. A. Pollak, F. R. McFeeley, L. Ley, and D. A. Shirley, Phys. Rev. B 8, 2387 (1973); 9, 381 (1974); 11, 600 (1975).
32. P. H. Citrin and D. G. Hamann, Chem. Phys. Lett. 22, 301 (1973).
33. J. A. D. Mathew, Surf. Sci. 40, 451 (1973).
34. T. E. Gallon, “Electron and Ion Spectroscopies of Solids.” NATO Adv. Stud. Inst., Gent, 1977.
35. K. S. Kim, W. W. Gaarenstroom, and N. Winograd, Phys. Rev. B 14, 2281 (1976).
36. R. Hoogwijs, L. Fiermans, and J. Vennik, J. Chem. Phys. Lett. 38, 192 (1976).
37. R. E. Watson, M. L. Perlman, and J. F. Herbst, Phys. Rev. B 13, 2358 (1976).
38. G. E. Laramore and W. J. Camp, Phys. Rev. B 9, 3270 (1974).
39. C. D. Wagner and J. A. Taylor, Surf. Interfacial Anal. 1, 73 (1979).
40. R. G. Musket and W. Bauer, Appl. Phys. Lett. 20, 455 (1972).
41. T. W. Haas, R. W. Springer, M. P. Hooker, and J. T. Grant, Phys. Lett. A 47, 317 (1974).
42. R. W. Springer, T. W. Haas, J. T. Grant, and M. P. Hooker, Rev. Sci. Instrum. 45, 1113 (1974).
42a. L. E. Davis, N. C. MacDonald, P. W. Palmberg, G. E. Riach, and R. E. Weber, “Handbook of Auger Electron Spectroscopy,” 2nd ed. Physical Electronics Ind., Inc., Eden Prairie, Minnesota, 1976.
43. J. H. Thomas and J. M. Morabito, Surf. Sci. 41, 629 (1974).
44. P. H. Holloway and G. E. McGuire, Thin Solid Films 53, 3 (1978).
45. D. J. Pocker and T. W. Haas, J. Vac. Sci. Technol. 12, 370 (1975).
46. A. Christou, Scanning Electron Microsc. 8, 149 (1975).
47. A. Mogami, Thin Solid Films 57, 127 (1979).
48. H. W. Werner, in “Applied Surface Analysis” (T. L. Barr and L. E. Davis, eds.), ASTM STP 699. Am. Soc. Test. Mater., Philadelphia, Pennsylvania, 1979.
49. A. P. Janssen and J. A. Venables, Surf. Sci. 77, 351 (1978).
50. M. M. El Gomati and M. Prutton, Surf. Sci. 72, 485 (1978).
51. R. Shimizu, T. E. Everhart, N. C. MacDonald, and C. T. Hovland, Appl. Phys. Lett. 33, 549 (1978).
52. R. A. Armstrong, Surf. Sci. 50, 615 (1975).
53. M. P. Seah, Surf. Sci. 32, 703 (1972).
54. P. H. Holloway, J. Vac. Sci. Technol. 12, 1418 (1975).
55. T. E. Gallon, Surf. Sci. 17, 486 (1969).
56. D. C. Jackson, T. E. Gallon, and A. Chambers, Surf. Sci. 36, 381 (1973).
57. M. P. Seah and W. A. Dench, Surf. Interfacial Anal. 1, 1 (1979).
58. D. R. Penn, J. Electron Spectrosc. 9, 29 (1976).
59. D. R. Penn, Phys. Rev. B 13, 5248 (1976).
60. H. J. Fitting, H. Glaefeke, and H. Wild, Surf. Sci. 75, 267 (1978).
61. D. Norman and D. P. Woodruff, Surf. Sci. 75, 179 (1978).
62. D. M. Holloway, J. Vac. Sci. Technol. 12, 392 (1975).
63. N. J. Taylor, J. S. Johannessen, and W. E. Spicer, Appl. Phys. Lett. 29, 497 (1976).
64. G. K. Wehner, in “Methods of Surface Analysis” (A. W. Czanderna, ed.), Chapter 1. Am. Elsevier, New York, 1975.
65. J. W. Coburn and E. Kay, Crit. Rev. Solid State Sci. 4, 561 (1974).
66. R. Shimizu, Jpn. J. Appl. Phys. 13, 228 (1974).
67. P. S. Ho, J. E. Lewis, H. S. Wildman, and J. K. Howard, Surf. Sci. 57, 393 (1976).
68. P. H. Holloway, J. Electron Spectrosc. 7, 215 (1975).
69. S. A. Schwartz and C. R. Helms, J. Vac. Sci. Technol. 16, 781 (1979).
70. J. S. Johannessen, W. E. Spicer, and Y. E. Strausser, J. Appl. Phys. 47, 3028 (1976).
71. S. Hofman, Appl. Phys. 13, 205 (1977).
72. H. J. Mathieu, D. E. McClure, and D. Landolt, Thin Solid Films 38, 281 (1976).
73. P. S. Ho and J. E. Lewis, Surf. Sci. 55, 335 (1976).
74. Z. L. Liau, W. L. Brown, R. Homer, and J. M. Poate, Appl. Phys. Lett. 30, 626 (1977).
75. P. Braun and V. W. Farber, Vak. Tech. 23, 239 (1974).
76. R. Kelly, Nucl. Instrum. Methods 149, 553 (1978).
77. P. H. Holloway and G. C. Nelson, J. Vac. Sci. Technol. 16, 793 (1979).
78. K. S. Kim, W. E. Baitinger, J. W. Amy, and N. Winograd, J. Electron Spectrosc. 5, 351 (1974).
79. H. F. Winters and J. W. Coburn, Appl. Phys. Lett. 28, 176 (1976).
80. P. S. Ho, Surf. Sci. 72, 253 (1978).
81. P. H. Holloway, Surf. Sci. 66, 479 (1977).
82. G. J. Slusser and N. Winograd, Surf. Sci. 84, 211 (1979).
83. R. R. Olson and G. K. Wehner, J. Vac. Sci. Technol. 14, 319 (1977).
84. H. Shimizu, M. Ono, and K. Makayama, Surf. Sci. 36, 817 (1973).
85. R. R. Rye, J. E. Houston, D. R. Jennison, T. E. Madey, and P. H. Holloway, Ind. Eng. Chem., Prod. Res. Dev. 18, 2 (1979).
86. O. Klemper and J. G. P. Shephard, Adv. Phys. 12, 355 (1963).
87. W. M. Mularie and T. W. Rusch, Surf. Sci. 19, 469 (1970).
88. J. E. Rowe and S. B. Christman, Solid State Commun. 13, 315 (1973).
89. L. H. Jenkins and M. F. Chung, Surf. Sci. 26, 151 (1971).
90. L. H. Jenkins, D. M. Zehner, and M. F. Chung, Surf. Sci. 38, 257 (1973).
91. M. Suleman and E. B. Pattinson, J. Phys. F 1, L21 (1971).
92. B. I. Lundquist, Phys. Kondens. Mater. 9, 236 (1969).
93. W. J. Pardee, G. D. Mahan, D. E. Eastman, R. A. Pollak, L. Ley, F. R. McFeeley, S. P. Kowalczyk, and D. A. Shirley, Phys. Rev. B 11, 3614 (1975).
94. D. R. Penn, Phys. Rev. Lett. 38, 1429 (1977).
95. J. E. Houston, J. Vac. Sci. Technol. 12, 255 (1975).
96. H. H. Madden and D. E. Schreiner, Sandia Laboratories Report SAND 76-0283 (1976), available from the authors or National Technical Information Service, Springfield, Virginia.
97. H. H. Madden and J. E. Houston, J. Appl. Phys. 47, 3071 (1976).
98. J. E. Houston, G. E. Moore, and M. G. Lagally, Solid State Commun. 21, 879 (1977).
99. R. L. Gerlach, J. Vac. Sci. Technol. 8, 599 (1971).
100. L. Fiermans and J. Vennik, Surf. Sci. 38, 257 (1973).
101. D. Berenyi, Adv. Electron. Electron Phys. 42, 76 (1976).
102. T. Novakov and R. Prins, in “Comparative Electron Spectroscopy” (D. A. Shirley, ed.), p. 821. North-Holland Publ., Amsterdam, 1972.
103. P. H. Holloway, Solid State Commun. 19, 729 (1976).
104. T. W. Haas, J. T. Grant, and G. J. Dooley, Phys. Rev. B 1, 1445 (1970).
105. P. H. Citrin, J. Electron Spectrosc. 5, 273 (1974).
106. P. H. Citrin, Phys. Rev. Lett. 31, 1164 (1973).
107. P. J. Bassett, T. E. Gallon, M. Prutton, and J. A. D. Mathew, Surf. Sci. 33, 213 (1972).
108. A. P. Janssen, R. C. Schoonmaker, A. Chambers, and M. Prutton, Surf. Sci. 45, 45 (1975).
109. J. A. D. Mathew and Y. Komninos, Surf. Sci. 53, 716 (1975).
110. M. Salmeron and A. M. Baro, Surf. Sci. 49, 356 (1975).
111. P. Feibelman and M. L. Knotek, Phys. Rev. B 18, 6531 (1978).
112. R. T. Poole, R. C. G. Leckey, J. G. Kinkis, and L. Liesegang, J. Electron Spectrosc. 1, 371 (1972-1973).
113. L. A. Harris, Surf. Sci. 15, 77 (1969).
114. L. A. Harris, Surf. Sci. 17, 448 (1969).
115. T. W. Rusch and W. P. Ellis, Appl. Phys. Lett. 26, 44 (1975).
116. J. R. Noonan, D. M. Zehner, and L. H. Jenkins, J. Vac. Sci. Technol. 13, 183 (1976).
117. L. McDonnell, D. P. Woodruff, and B. W. Holland, Surf. Sci. 51, 249 (1975).
118. J. H. Fox, J. D. Nuttall, and T. E. Gallon, Surf. Sci. 63, 390 (1977).
119. J. A. D. Mathew, Surf. Sci. 20, 183 (1970).
120. T. W. Haas and J. T. Grant, Appl. Phys. Lett. 16, 172 (1970).
121. T. W. Haas, J. T. Grant, and G. J. Dooley, J. Appl. Phys. 43, 1853 (1972).
122. A. Joshi, L. E. Davis, and P. W. Palmberg, in “Methods of Surface Analysis” (A. W. Czanderna, ed.), Chapter 5. Am. Elsevier, New York, 1975.
123. C. D. Wagner and P. Biloen, Surf. Sci. 35, 82 (1973).
124. P. H. Holloway and J. B. Hudson, J. Vac. Sci. Technol. 12, 647 (1975).
125. C. D. Wagner, Anal. Chem. 44, 1050 (1972).
126. N. J. Taylor, Rev. Sci. Instrum. 40, 792 (1969).
127. P. M. Hall, J. M. Morabito, and D. K. Conley, Surf. Sci. 62, 1 (1977).
128. P. Holloway, in “Scanning Electron Microscopy/1978” (O. Johari, ed.), Vol. I, p. 361. SEM Inc., AMF O’Hare, Illinois, 1978.
129. G. F. Amelio and E. J. Scheibner, Surf. Sci. 11, 242 (1968).
130. J. E. Houston, J. Vac. Sci. Technol. 12, 255 (1975).
131. H. H. Madden and J. E. Houston, J. Vac. Sci. Technol. 14, 412 (1977); Solid State Commun. 21, 681 (1977).
132. M. Cini, Solid State Commun. 20, 605 (1976).
133. M. Cini, Phys. Rev. B 17, 2788 (1978).
134. G. A. Sawatsky, Phys. Rev. Lett. 39, 504 (1977).
135. E. Antonides, E. C. Janse, and G. A. Sawatsky, Phys. Rev. B 15, 1669 (1977).
136. H. H. Madden, D. M. Zehner, and J. R. Noonan, Phys. Rev. 17, 3074 (1978).
137. D. E. Ramaker, J. S. Murday, N. H. Turner, G. Moore, M. G. Lagally, and J. E. Houston, Phys. Rev. B 19, 5375 (1979).
138. D. E. Ramaker and J. S. Murday, J. Vac. Sci. Technol. 16, 510 (1979).
139. P. J. Feibelman, E. J. McGuire, and K. C. Pandy, Phys. Rev. B 15, 2202 (1977).
140. P. J. Feibelman and E. J. McGuire, Phys. Rev. B 17, 690 (1978).
141. D. R. Jennison, Phys. Rev. B 18, 6996 (1978).
142. P. H. Holloway and H. J. Stein, J. Electrochem. Soc. 123, 723 (1976).
143. H. Hapner, J. A. Simpson, and C. E. Kuyatt, Rev. Sci. Instrum. 39, 33 (1964).
144. E. N. Sickafus and D. M. Holloway, Surf. Sci. 51, 131 (1975).
145. P. W. Palmberg, J. Electron Spectrosc. 5, 691 (1974).
146. P. H. Holloway and D. M. Holloway, Surf. Sci. 66, 635 (1979).
147. C. J. Powell, N. E. Erickson, and T. E. Madey, to be published.
148. C. J. Powell, N. E. Erickson, and T. E. Madey, J. Electron Spectrosc. 17, 361 (1979).
149. D. A. Huchital and J. D. Rigden, Appl. Phys. Lett. 16, 348 (1970).
150. J. A. Simpson, Rev. Sci. Instrum. 35, 1968 (1964).
151. A. L. Hughes and V. Rojansky, Phys. Rev. 34, 284 (1929).
152. Y. E. Strausser, in “Applied Surface Analysis” (T. L. Barr and L. E. Davis, eds.), ASTM STP 699. Am. Soc. Test. Mater., Philadelphia, Pennsylvania, 1980.
153. J. E. Houston and R. L. Park, Rev. Sci. Instrum. 43, 1437 (1972).
154. J. E. Brignell and R. Young, J. Phys. E 12, 455 (1979).
155. P. H. Holloway, in “Scanning Electron Microscopy/1978” (O. M. Johari, ed.), Vol. I, p. 361. SEM Inc., AMF, O’Hare, Illinois, 1978.
156. C. J. Powell, Appl. Surf. Sci. 4, 492 (1980).
157. P. W. Palmberg, J. Vac. Sci. Technol. 13, 214 (1976).
158. P. M. Hall and J. M. Morabito, Surf. Sci. 83, 391 (1979).
159. P. M. Hall, J. M. Morabito, and D. K. Conley, Surf. Sci. 62, 1 (1977).
160. F. Pons, J. Le Hericy, and J. P. Langeron, Surf. Sci. 69, 547 and 595 (1977).
161. J. T. Grant and M. P. Hooker, Surf. Sci. 55, 741 (1976).
162. C. J. Powell, in “Quantitative Surface Analysis” (N. S. McIntyre, ed.), ASTM STP 643, p. 5. Am. Soc. Test. Mater., Philadelphia, Pennsylvania, 1978.
163. R. J. Baird, C. S. Fadley, S. K. Kawamoto, M. Mehta, R. Alvarez, and J. A. Silve, Anal. Chem. 48, 843 (1976).
163a. J. McHugh, Radiat. Eff. 21, 209 (1974).
164. R. Bowman, L. H. Toneman, and A. A. Holscher, Vacuum 23, 163 (1973).
165. D. R. Penn, J. Electron Spectrosc. 9, 29 (1976).
166. D. R. Penn, Phys. Rev. B 13, 5248 (1976).
167. D. M. Smith and T. E. Gallon, J. Phys. D 7, 151 (1974).
168. W. Reuther, Proc. Int. Conf. X-Ray Opt. Microanal., 6th, 1971 p. 121 (1972).
169. C. C. Chang, Surf. Sci. 69, 385 (1977).
170. J. J. Vrakking and F. Meyer, Surf. Sci. 35, 34 (1973).
171. P. H. Holloway, J. Vac. Sci. Technol. 12, 1418 (1975).
172. J. P. Coad, H. D. Bishop, and J. C. Reviere, Surf. Sci. 21, 253 (1970).
173. R. E. Kirby and D. Lichtman, Surf. Sci. 41, 371 (1970).
174. W. Ranke and K. Jacobi, Surf. Sci. 47, 525 (1975).
175. Y. Margoninski, Phys. Lett. A 54, 291 (1975).
176. T. E. Madey and J. Yates, J. Vac. Sci. Technol. 8, 525 (1971).
177. C. G. Pantano, Jr., D. B. Dove, and G. Y. Onoda, Jr., J. Vac. Sci. Technol. 13, 414 (1976).
178. N. J. Chou, C. M. Osborn, Y. J. VanderMeulen, and R. Hammer, Appl. Phys. Lett. 22, 380 (1973).
179. F. Ohuchi, D. E. Clark, and L. L. Hench, J. Am. Ceram. Soc. 62, 500 (1979).
180. C. W. White, D. L. Sims, and N. H. Tolk, in “Characterization of Solid Surfaces” (P. F. Kane and G. B. Larrabee, eds.), Chapter 23. Plenum, New York, 1974.
181. A. W. Mullendore, G. C. Nelson, and P. H. Holloway, Proc.-Adv. Tech. Failure Anal. Symp., 1977 p. 236 (1977).
182. J. J. Vrakking and F. Meyer, Appl. Phys. Lett. 18, 226 (1971).
183. P. H. Holloway, T. E. Madey, C. T. Campbell, R. R. Rye, and J. E. Houston, Surf. Sci. 88, 121 (1979).
184. S. Thomas, J. Appl. Phys. 45, 161 (1974).
185. J. S. Johannessen, W. E. Spicer, and Y. E. Strausser, J. Appl. Phys. 47, 3028 (1976).
186. J. H. Martinez and J. B. Hudson, J. Vac. Sci. Technol. 10, 35 (1973).
187. M. L. Knotek, V. O. Jones, and V. Rehn, Phys. Rev. Lett. 43, 300 (1979).
188. E. Bauer, J. Electron Spectrosc. 15, 119 (1979).
189. G. E. McGuire and P. H. Holloway, in “Electron Spectroscopy; Theory, Techniques, and Applications” (C. R. Brundle and A. D. Baker, eds.). Academic Press, New York, 1980.
190. J. J. Burton, C. R. Helms, and R. S. Polizzotti, J. Vac. Sci. Technol. 13, 204 (1976); J. Chem. Phys. 65, 1089 (1976).
191. M. P. Seah, Surf. Sci. 53, 168 (1975); 80, 8 (1978).
192. C. J. McMahon, J. Vac. Sci. Technol. 15, 450 (1978).
193. M. G. Lagally, G. C. Wang, and T. M. Lu, J. Chem. Phys. 69, 479 (1972).
194. M. K. Debe and D. A. King, Surf. Sci. 81, 193 (1979).
195. W. F. O’Grady, M. Y. C. Wood, P. L. Hagans, and E. Yeager, J. Vac. Sci. Technol. 14, 365 (1977).
196. T. E. Felter and A. T. Hubbard, J. Electroanal. Chem. Interfacial Electrochem. 100, 473 (1979).
197. M. Guttman, Surf. Sci. 53, 213 (1975); Metall. Trans. 8A, 1383 (1977).
198. D. H. Buckley, J. Vac. Sci. Technol. 13, 88 (1976).
199. S. V. Pepper, J. Appl. Phys. 45, 2947 (1974).
200. P. H. Holloway, Gold Bull. 3, 99 (1979).
201. Ya. M. Kolotyrkin, Microsc. Spectrosc. Electron. 2, 121 (1977).
202. M. da Cunha Belo, B. Ronhot, F. Pons, J. Le Hericy, and J. P. Langeron, J. Electrochem. Soc. 124, 1317 (1977).
203. G. Bouyssoux, M. Romand, H. D. Polaschegg, and J. T. Calow, J. Electron Spectrosc. 11, 185 (1977).
204. P. H. Holloway and J. B. Hudson, Surf. Sci. 43, 123 and 141 (1974).
205. N. J. Magnani and P. H. Holloway, Corrosion 34, 7 (1978).
206. D. E. Clark, E. Lue Yen-Bower, and L. L. Hench, in “Ceramics and Nuclear Waste Management” (J. Mendel, ed.), p. 256. Batelle-Pacific Northwest, Richland, Washington, 1979.
207. C. G. Pantano, Jr., A. E. Clark, Jr., and L. L. Hench, J. Am. Ceram. Soc. 57, 412 (1974).
208. J. H. Sinfelt, Catal. Rev. 9, 147 (1974).
209. V. Ponec, Surf. Sci. 80, 352 (1979).
210. A. N. Goldobin and V. I. Savchenko, Kinet. Katal. 15, 1363 (1974).
211. T. E. Madey, D. W. Goodman, and R. D. Kelly, J. Vac. Sci. Technol. 16, 433 (1979).
212. T. Matsushima, D. B. Almy, and J. M. White, Surf. Sci. 67, 89 (1977).
213. F. L. Williams and K. Baron, J. Catal. 40, 108 (1975).
214. M. M. Bhasin, J. Catal. 34, 356 (1974).
215. M. M. Bhasin, J. Catal. 38, 218 (1975).
216. P. H. Holloway and G. C. Nelson, Prepr. Div. Pet. Chem., Am. Chem. Soc. 22(4), 1352 (1977).
217. P. H. Holloway, in “Applied Surface Analysis” (T. L. Barr and L. E. Davis, eds.), ASTM STP 699. Am. Soc. Test. Mater., Philadelphia, Pennsylvania, 1980.
218. P. H. Holloway and G. E. McGuire, Appl. Surf. Sci. 4, 410 (1980).
219. M. G. Yang, K. M. Koliwad, and G. E. McGuire, J. Electrochem. Soc. 122, 675 (1975).
220. G. Moore, H. Guckel, and M. G. Lagally, J. Vac. Sci. Technol. 14, 70 (1977).
221. J. W. Colby and L. E. Katz, J. Electrochem. Soc. 123, 409 (1976).
222. N. J. Chou, Y. J. VanderMeulen, R. Hammer, and J. Cahill, Appl. Phys. Lett. 24, 200 (1974).
223. J. M. Andrews and J. M. Morabito, Thin Solid Films 37, 357 (1976).
224. D. W. Bushmire and P. H. Holloway, in “Proceedings of the 1975 International Microelectronic Symposium,” p. 402. Int. Soc. Hybrid Microcirc., Montgomery, Alabama, 1976.
Author Index

Numbers in parentheses are reference numbers and indicate that an author’s work is referred to although his name is not cited in the text. Numbers in italics show the page on which the complete reference is listed.

A
Ackerhalt, J. R., 206, 220, 224, 236, 237
Adams, W. M., 3, 62
Agarwal, G. S., 214, 225, 221, 228, 237
Agostini, P., 226, 228, 229, 230, 234, 235, 237
Aizaki, N., 94, 136
Alfven, H., 2, 57, 59, 183, 61, 187
Alidieres, M., 22, 26, 61
Allen, L., 205, 214, 219, 226, 229, 235, 237, 239
Almy, D. B., 290, 297
Altynsev, A. T., 25, 61
Alvarez, R., 281, 285, 296
Amano, K., 19, 61
Amelio, G. F., 211, 295
Amy, J. W., 258, 259, 294
Anderson, O. A., 22, 23, 24, 25, 31, 61
Ando, H., 156, 158, 159, 187
Andrews, J. M., 291, 298
Anger, K., 128, 136
Angilello, J., 101, 114, 115, 137
Antonides, E., 212, 295
Apanasevich, P. A., 225, 237
Applebaum, D. C., 137
Aritome, H., 102, 136
Armstrong, L., 196, 206, 209, 237, 238, 239
Armstrong, R. A., 254, 255, 293
Asaad, W. N., 247, 293
Ashour-Abdalla, M., 58, 64
Athay, R. G., 162, 169, 170, 171, 172, 176, 187, 189
Atkinson, G., 58, 62
Auger, P., 241, 242, 292
Ausschnitt, C. P., 235, 236, 237
Avan, P., 226, 228, 237
Avrett, E. H., 161, 163, 170, 188, 190
Axford, W. I., 17, 67
Aymar, R., 22, 26, 61
Ayres, T. R., 170, 187

B
Bahcall, J. N., 144, 145, 146, 187
Baird, R. J., 281, 285, 296
Baitinger, W. E., 258, 259, 294
Bakos, J. S., 191, 233, 237
Ballantyne, J. P., 84, 85, 137
Baro, A. M., 266, 295
Baron, K., 290, 297
Bassett, P. J., 266, 295
Bateman, G., 16, 22, 62
Bauer, E., 286, 297
Bauer, W., 249, 293
Baum, P. J., 3, 4, 5, 7, 10, 13, 18, 19, 22, 27, 28, 29, 30, 31, 36, 37, 38, 45, 46, 48, 49, 50, 58, 59, 61, 62
Beach, H. W., 130, 138
Bearden, J. A., 245, 293
Beauchamp, H. L., 124, 137
Bebb, H. B., 197, 200, 207, 237
Beckers, J. M., 166, 172, 187
Beeler, R., 13, 53, 54, 62
Beers, B. L., 206, 209, 237
Bell, A. E., 121, 122, 136, 139
Benford, J., 25, 62
Berenyi, D., 265, 294
Berg, J. O., 193, 223, 224, 236, 237, 239
Bergmark, T., 245, 292
Bergstrom, I., 241, 242, 245, 292
Bernacki, S. E., 107, 113, 137, 138
Berry, D. H., 110, 137
Beterov, I. M., 220, 237
Bhasin, M. M., 290(214, 215), 297
Bialynicka-Birula, Z., 221, 222, 223, 237, 238
Bialynicki-Birula, I., 221, 222, 223, 237, 238
Biloen, P., 270, 295
Bishop, H. D., 285, 296
Biskamp, D., 16, 62
Bjorklund, G. C., 235, 236, 237
Bloembergen, N., 193,237 Bodin, H. A. B., 22, 24, 25, 62 Bogdanov, S., Yu., 32,62 Bohn, G. K., 242, 275,292 Bonch-Bruevich, A. M., 201,206, 237, 238 Bonnet, R. M., I87 Boornazian, A. A,, 146, 188 Bostrom, R., 61,62 Bouyssoux, G., 289,297 Bowman, R., 283,296 Bratenahl, A., 3, 4, 5, 7, 9, 10, 13, 18, 19, 22, 27, 28, 29, 30, 31, 36, 37, 38, 41, 43, 44, 45,46,47,48,49,50, 51, 53, 58, 59,61,62 Braun, P., 258,259, 282, 294 Bray, R. C., 236,237 Bray, R. J., 164, 187 Brignell, J. E., 279, 296 Bril, T. W., 128, 129, 136 Brown, D. R., 172, 187 Brown, J. B., 185, 189 Brown, T., 15 1, 188 Brown, W. L., 258, 259, 294 Broyde, B., 84, 136 Brushlinskii, K. V., 20, 53, 63 Bruzek, A., 144, 187 Buckley, D. H., 289, 297 Bulanov, S. V., 18, 63 Bull, W. E., 245, 292 Bumba, V., 179, 182, I87 Burhop, E. H. S., 241(2), 242(2), 245(2), 247 (28), 292, 293 Burr, A. F., 245, 293 Burshtein, A. I., 225, 240 Burton, J. J., 287, 297 Bushmire, D. W., 29 1, 298 Butylkin, V. S., 218, 237 C
Cahil, J . , 291, 298 Calow, J. T., 289, 297 Camp, W. J., 248,293 Campbell, C . T., 285,286,296 Canfield, R. C., 166, 187 Cantagrel, M., 79, 136 Cantrell, C. D., 220, 237 Carlquist, P., 59, 61 Carlson, L. R., 192.200,239
Carlson, T. A,, 245, 292 Carmichael, H. J.. 225, 237 Chambers, A., 255(56), 266(108), 293, 295 Chandrasekhar, S., 162, 187 Chang, C. C., 245(12), 246(12), 251(12), 265 (12), 269(12), 278(12), 280(12), 285(12, 169), 292, 296 Chang, C. S., 201,234,237 Chang, T . H. P., 75, 85, 89, 136 Chapman, B. N., 79, 137 Chapman, S., 18,63 Chebotayev, V. P., 220,237 Chin, S. L., 233, 237 Chiu, H.-Y., 144, 147, 187 Chiuderi, C., 173, 187, 188 Choi, C. W., 206,237 Chou, N. J., 285(178), 286(178), 291(222), 296,298 Christman, S. B., 261, 262, 263, 294 Christou, A,, 250, 293 Christy, R. F., 157, 187 Chung, M. F . , 247(24), 257(65), 261(65, 93), 293,294 Chung, M. S. C., 94, 136 Chupp, E. L., 185, 187 Cini, M., 271(132, 133), 295 Citrin, P. H., 245(18), 248(32), 266(105, 106), 267(18), 292, 293, 295 Clark, A. E., Jr., 289, 297 Clark, D. E., 285(179), 286(179), 289(206), 296,297 Coad, J. P., 247(25), 285(172), 293, 296 Coburn, J. W., 257(65), 258(79), 259(79), 282 (65), 294 Cohen-Tannoudji, C., 206,226, 228, 237 Colby, J. W., 291, 298 Coleman, G., 101, 114, 115, 137 Compton, R. N., 236, 240 Conley, D. K., 270(127), 280(159), 283(159). 284( 159), 295, 296 Conti, P., 176, 187 Coppi, B., 176, 189 Coroniti, F. V . , 17, 58, 63, 64 Cosslett, V. E., 74, 106, 136 Cowan, M., 22, 27, 62, 63 Cowley, S. W. H., 6, 11, 63 Cowling, T. G., 57, 58, 63 Cox, J. P., 159, 187
Cram, L. E., 161, 164, 171, 172, 187
Crance, M., 196, 234, 237, 239
Crewe, A. V., 118, 121, 122, 136

D
da Cunha Belo, M., 289, 297
Dailey, C. L., 22, 28, 63
Davis, D. E., 128, 136
Davis, L. E., 249(42a), 250(42a), 265(42a), 269(122), 280(42a), 281(42a), 283(42a), 287(122), 293, 295
Davis, L. I., 192, 239
Davis, R., 145, 146, 147, 187
Davydkin, V. A., 201, 237
Debe, M. K., 297
Delache, P., 187
Delone, N. B., 191, 238
de Meijere, J. L. F., 225, 228, 238
Dench, W. A., 255, 256, 293
Deubner, F.-L., 151, 153, 154, 159, 160, 166, 187, 190
Dick, C. E., 106, 136
Dicke, R. H., 150, 151, 156, 187
Dilke, F. F. W., 146, 187
Dixit, S. N., 209, 232, 238
Doerries, E. M., 84, 139
Dooley, G. J., 265(104), 269(121), 295
Doran, S., 128, 136
Doschek, G. A., 173, 188
Dove, D. B., 285, 286, 296
Drake, J. F., 16, 22, 24, 25, 26, 63, 64
Dreidan, G. V., 32, 33, 34, 63
Ducas, T. W., 198, 240
Dungey, J. W., 5, 8, 36, 57, 63
Dunn, R. B., 166, 188
Dupree, A. K., 168, 173, 174, 188, 189
Durney, B. R., 152, 188
Durrant, C. J., 144, 187
Duval, T. L., 152, 188

E
Eastman, D. E., 102, 257(93), 261(93), 138, 294
Eberly, J. H., 192, 205, 214, 219, 221, 222, 223, 224, 225, 228, 229, 234, 235, 236, 237, 238
Eddy, J. A., 143, 146, 150, 179, 188
Ehlotzky, F., 225, 228, 240
Einwohner, T. H., 221, 226, 238, 240
Eisenberger, P., 245, 267, 292
Eland, J. H. D., 236, 239
Elgin, J. N., 218, 238
El Gomati, M. M., 253, 293
Ellis, W. P., 267, 295
Elsasser, W. M., 57, 63
El-Sayed, M. A., 193, 223, 224, 236, 237, 239
Elyutin, P. V., 226, 238
Emslie, A. G., 185, 189
Erickson, N. E., 276(147, 148), 296
Evans, J. C., 145, 146, 147, 187
Everhart, T. E., 81, 253(51), 136, 293
Eviatar, A., 17, 63
Ezekiel, S., 226, 238, 240

F
Fadley, C. S., 281, 285, 296
Fahlman, A., 245(14, 22), 246(22), 278(22), 292, 293
Farber, V. W., 258, 259, 282, 294
Fay, B., 102, 108, 109, 136
Feder, R., 91, 98, 101, 102, 105, 107, 112, 114, 115, 137, 138
Fedorov, M. V., 206, 209, 216, 238
Feibelman, P. J., 267, 272(139, 140), 285, 286, 295
Feit, E. D., 77, 81, 84, 137, 139
Feldman, U., 173, 188
Felter, T. E., 288, 297
Feneuville, S., 196, 206, 234, 237, 238, 239
Fermi, E., 94, 136
Fiermans, L., 245(13), 248(36), 264(100), 287(13), 292, 293, 294
Fillipov, V. N., 32, 34, 63
Finne, R. M., 91, 136
Fisher, S., 23, 63
Fitting, H. J., 255, 294
Flanders, D. C., 110, 136
Forbes, T. G., 18, 63
Foukal, P., 151, 160, 166, 188
Fox, J. H., 267, 270, 295
Fox, R. F., 227, 238
Frank, A. G., 18, 22, 32, 33, 34, 35, 53, 62, 63, 64, 66
Freeman, J. R., 22, 27, 63
Freeman, R. R., 235, 236, 237
Frosen, J., 128, 136
Furth, H. P., 11, 15, 18, 23, 37, 58, 61, 64
Fukao, S., 45, 53, 64

G
Gaarenstroom, W. M., 248, 293
Gabriel, A. H., 174, 188
Galbraith, H. W., 220, 237
Galeev, A. A., 16, 58, 64
Gallon, T. E., 245(17), 248(34), 255(55, 56), 266(107), 267(17, 118), 270(17, 118), 283(167), 284(167), 292, 293, 295, 296
Garrison, J. C., 221, 224, 226, 238, 240
Gekelman, W., 4, 13, 28, 29, 64, 66
Georges, A. T., 212, 215, 218, 226, 228, 229, 230, 231, 232, 234, 235, 237, 238
Gerlach, R. L., 264, 294
Gerlakh, N. I., 20, 53, 64
Gibson, E. G., 143, 188
Gilman, P. A., 149, 150, 152, 188
Gingerich, O., 168, 170, 188
Giovanelli, R. G., 8, 64
Giuffre, G. J., 127, 136
Glaefeke, H., 255, 294
Glauber, R., 229, 231, 238
Gloersen, P., 79, 136
Gold, A., 197, 200, 237
Gold, T., 57, 64
Goldberger, M. L., 196, 201, 238
Goldobin, A. N., 290, 297
Goldreich, P., 159, 188
Golub, L., 175, 176, 184, 188, 189
Gontier, Y., 234, 235, 238
Gonzales, A. J., 81, 136
Goodman, D. W., 290, 297
Goto, E., 118, 125, 137, 138
Gough, D. O., 146, 148, 149, 150, 187, 188
Grad, H., 58, 64
Graham, E., 148, 188
Grant, J. T., 249(41, 42), 257(42), 265(104), 269(120, 121), 281(161), 293, 295, 296
Green, M., 106, 136
Greeneich, J. S., 91, 105, 136
Greenwood, J. C., 91, 136
Greig, J. R., 101, 138
Greisen, K., 95, 138
Griem, H. R., 22, 24, 25, 26, 64
Grobman, W. D., 90, 91, 95, 102, 136, 138
Grove, R. E., 226, 240
Groves, T., 119, 137
Gruen, A. E., 81, 137
Guckel, H., 291, 297
Gudat, W., 102, 138
Guttman, M., 289, 297
H
Haas, T. W., 249(41, 42), 250(45), 257(42), 265(104), 269(120, 121), 293, 295 Hagans, P. L., 288. 297 Haken, H., 227, 238 Hall, P. M., 270(127), 280(158, 159), 283(158, 1591, 284( 159). 295, 296 Haller, I., 91, 137 Hamann, D. G., 245, 248, 267,292, 293 Hammer, R., 285(178), 286(178), 291(222), 296,298 Hammond, D. L., 119, 136 Hamrin, K., 245, 292 Hapner, H., 275,295 Haroche, S., 206,237 Harris, L. A., 242(5), 243, 267(113, 114), 278 ( S ) , 292, 295 Harte, K. J., 117: 137 Harvey, J. W., 150, 151, 175, 184, I88 Hatzakis, M., 91, 94, 137 Hawryluk, A . M., 85, 137 Hawryluk, R. J., 85, 137 Hayashi, T., 20, 64, 66 Haynes, S. K., 246, 293 Hedin, L., 247, 293 Hedman, J., 245, 292 Heidenreich, R. D., 77, 81, 84, 85, 137, 139 Heitler, W . , 195, 198, 201, 238 Held, B., 234, 238 Helms, C. R.,257(69), 258(69), 287( 190), 294, 297 Hench, L. L., 285(179), 286(179), 289(206, 207), 296, 297 Hendrickson, D. N., 247,293 Herbst, J. F., 248, 293 Hertel, I. V., 198, 238 Hieke, E. K., 91, 139 Hill, H. A., 151, 155, 157, 159, 188 Ho, C. T., 126, 127, 139 Ho, P. S.. 258(67, 73, SO), 259(67, 80). 282 (67), 294 Hochstrasser, R. M., 236, 237
Hoff, P. H., 81, 136 Hofman, S., 257,258,294 Hogan, P. B., 226,228, 235.238 Holland, B. W., 267, 295 Holloway, D. M., 257(62), 275(144), 276(144, 146), 294,295, 296 Holloway, P. H., 245(19), 250(44), 255(54), 257(68), 258(77), 259(77, 81), 260(77), 261(85), 263(19), 264(19), 265(103), 266 (103), 267(19), 269(19), 270(19, 124), 271 (191, 274(142), 275(68, 1421, 276(146), 280(81, 155), 281(68, 155), 282(68, 81), 283(81, 155),284(81),285(44,68,171,181, 183), 286(181, 183), 287(189), 289(189, 200, 204, 205), 290(189, 216), 291(142, 216, 217, 218, 224), 292, 293, 294, 295, 296,297,298 Hollweg, J. V., 178, 188 Holscher, A. A,, 283, 296 Homer, R., 258, 259,294 Hoogwijs, R., 248,293 Hooker, M. P., 249(41, 42), 257(42), 281 (161), 293, 296 Horowitz, P., 220,239 Houston, J. E., 245(16, 19). 261(85), 262(95), 263(19, 95, 97, 98), 264(19), 267(19), 269 (19), 270(19), 271(19, 98, 130), 272(137), 278(153), 285(183), 286(183), 292, 294, 295, 296 Hovland, C. T., 253, 293 Howard, J. K., 258(67), 259(67), 282(67), 294 Howard, R., 150, 151, 152, 182, 188 Hoyle, F., 57, 64 Hu, P. N., 58,64 Hubbard, A. T., 288,297 Huchital, D. A., 278, 296 Hudgin, R. M., 121, 137 Hudson, 1. B., 270(124), 285(186), 289(204), 295,297 Hughes, A. L., 296 Hundhausen, A. J., 177, 178,188 Hundt, E., 113, 137 Hurst, G. S., 192,238 I
Iben, I., 158, 188
Idesawa, M., 125, 131, 137
Imshennik, V. S., 18, 64
Irby, J. H., 22, 24, 25, 26, 64
Isenor, N. R., 233, 237
J Jackson, D. C., 255,293 Jacobi, K., 285,296 Jacobsen, R. A., 27,64 Jaggi, R. K., 11, 64 Janse, E. C., 272,295 Janssen, A. P., 253(49), 266(108), 293,295 Jeannet, J. C., 198, 239 Jefferies, J. T., 161, 170, 188 Jenkins, L. H., 247(24), 257(90), 261(89, 90), 267(116), 268(116), 278(116), 285(116), 293,294, 295 Jennison, D. R., 261(85), 272(141), 294,295 Johannessen, J. S., 257(63), 285(185), 287 (185), 294,297 Johansson, A., 247,293 Johansson, G., 245,292 Johnson, P. M., 236,238,240 Jolly, W. L., 247,293 Jones, A. B., 103, 139 Jones, F., 91, 94, 137 Jones, V. O., 286,297 Joshi, A,, 269, 287, 295 Jourdan, P., 22, 26, 61 Judish, J. P., 192, 238 K
Kalkofen, W., 164,168, 170, 171,188, 190 Kamin, G., 13, 18, 19, 58, 59,62 Kao, M., 46,62 Kaplan, A. E., 218, 237 Karlsson, S. E., 245, 292 Kato, T., 95, 137 Katz, L. E., 291, 298 Kavartskhava, 1. F., 28, 67 Kaw, P., 17,64 Kawamoto, S. K., 281, 285,296 Kawashima, N., 29,65 Kay, E., 257(65), 282(65), 294 Kazakov, A. E., 206, 209,216,238 Kazan, B., 74, 137 Keeley, D. A , , 159, 188 Keil, S . L., 165, 188 Keldysh, L. V., 201, 238 Kelly, J., 98, 130, 131, 132, 137 Kelly, R., 258, 259, 294 Kelly, R.D., 290, 297 Kendall, P. C., 18,63 Kennel, C. F., 58, 63
Kern, A,, 75, 136 Kern, D. P., 116, 117, 118, 120, 126, 137 Keyes, R. W., 70, 137 Khapalyuk, A. P., 225, 237 Khodovoi, V. A,, 201,206,225,237,238,239 Khodzhaev, A. Z., 18,32,33,34,35,62,63,64, 66 Khronopoulo, Yu. G., 218,237,238 Killeen, J., 11, 15, 18, 23, 58, 64 Kim, K. S., 248(35), 258(78), 259(78), 293,294 Kimbel, H. J., 225, 228, 238 King, D. A., 297 King, M. C., 110, 137 Kinkis, J. G., 267, 295 Kirby, R.E., 285,296 Kirii, N. P., 18, 32, 33, 35, 63, 64 Kleczek, J., 179, 182, 187 Klein, D. L., 91, 136 Klemper, O., 261, 265,294 Kleppner, D., 198,240 Knauer, W., 121, 137 Knoll, M., 74, 137 Knotek, M. L., 267(111), 285(111), 286(111, 187), 295, 297 Koechlin, F., 22, 26, 61 Koliwad, K. M., 291,297 Kolotyrkin, Ya. M., 289, 297 Komninos, Y., 266,295 Kopp, R. A., 174,189 Kostin, H. N., 201, 206, 237 Kotani, H., 102, I36 Kotov, V. A,, 156, 189 KOtOvd, L. P.,201,239 Kovarski, V. A,, 201, 230, 239 Kowalczyk, S. P., 248(31), 257(93), 261(93), 293,294 Kozaki, S., 102, 103, 139 Krasov, V. I., 25, 61 Kratschmer, E., 94, 138 Krause, F., 180, 181, 188 Krause, M. 0..245, 292 Krieger, A. S . , 175, 176, 184, 188 Krook, M., 57, 65 Kuiper, G. P., 143, 188 Kunkel, W. B., 22, 23, 24, 25, 61, 64 Kuo, H., 119, 137 Kuperus, M., 173, 188 Kurucz, R. L., 162, 163, 188
Kuyatt, C. E., 275, 295 Kyser, D. F., 85, 86, 88, 91, 93, 137, 138
L Lagally, M. G., 263(98), 271(98), 272(137), 287(193), 288(193), 291(220), 294, 295, 297 Lambropoulos, M., 198,239 Lambropoulos, P. 192,197,198,199,201,209, 212,213,215,216,218,225,226,228,229, 230,231, 232,233,234,235,236,237, 238, 239,240 Landau, L., 94,137 Lander, J. J., 242,292 Landolt, D., 258(72), 294 Langeron, J. P., 280(160), 281(160), 284(160), 289(202), 296, 297 Langner, G. O., 125, 138 LaPlaca,S. J., 101, 114, 115,137 Laramore, G. E., 248, 293 Larkin, F. P., 245, 293 Leckey, R. C. G., 267,295 Ledoux, P., 156,188 Lee, Y. C., 16,63 Le Hericy, J., 280(160), 281(160), 284(160), 289(202), 296, 297 Leibacher, J. W., 153, 166, 189 Letokhov, V. S., 192, 220, 221,239 Leung, K. M., 215,239 Levenson, M. D., 193,226,228,229, 230,234, 235,237 Levy, R. H., 16,64 Lewis, J. E., 258(67, 73), 259(67), 282(67), 294 Ley, L., 248(31), 257(93), 261(93), 293, 294 Liau, Z. L., 258,259,294 Lichtman, D., 285, 296 Liepmann, H. W., 148, 188 Liesegang, L., 267, 295 Lin, L. H., 124, 137 Lindau, I., 102, 137 Lindgren, B., 245, 292 Lindgren, I., 245, 292 Lineberger, W. C., 198, 239 Linsky, J. L., 170, 187, 188 Lischke, B., 128, 136, 137 Littman, M. G., 198, 240 Liu, C. Y., 91, 138
Loeffler, K. H., 121, 137
Loesner, R., 161, 163, 170, 190
Lompre, L. A., 234, 239
Loughhead, R. E., 164, 187
Lovberg, R. H., 25, 62
Lu, T. M., 287, 288, 297
Lucas, A. C., 106, 136
Lue Yen-Bower, E., 289, 297
Lundquist, B. I., 261, 294
M McClean, W. A,, 225,239 McClure, D. E., 258(72), 294 McCorkle, R., 101, 114, 115, I37 McCoy, J. H., 103,137 McDonald, K. L., 4,64 MacDonald, N. C., 81, 136, 242(11), 249,250 (Il), 253(51), 265, 280,281, 283,292,293 McDonnell, L., 267, 295 McFeeley, F. R., 248(31), 257(93), 261(93), 293,294 McGuire, E. J., 241(4), 242(4), 245(4), 265(4), 272(139, 140), 292,295 McGuire, G. E., 250(44), 285(44), 287(189), 289(189), 290(189), 291(218, 219), 293, 297 McHugh, J., 282,296 McIntosh, P. S., 176,188 McMahon, C. J., 287,289,297 McMahon, J. M., 101,138 McPherron, R.L., 3,65 Madden, H. H., 262(96), 263(97), 271(131), 272( 136), 294,295 Madey. T. E., 245(19), 261(85), 263(19), 264 (191, 267(19), 269(19), 270(19), 271(19), 276(147, 148), 285(176, 183), 286(183), 290(21 l), 292, 294, 296, 297 Magnani, N. J., 289,297 Mahaffy, J., 158,188 Mahan, G. D., 257,261,294 Mainfray, G., 234, 238, 239 Makarov, V. P., 206, 209, 216, 238 Makayama, K., 259,294 Maldonado, J. R., 113, 137 Mandel, L., 225, 228,238 Manus, C., 234,239
Marburger, J. H., 212,215,218,238 Marcus, H. L., 242,292 Margoninski, Y., 285, 296 Markka, J. T., 173, 188 Markov, V. S., 18, 32, 33, 34, 35,63,64 Marquis, J. F., 127, 136 Martinez, J. H., 285, 297 Marx, B., 226, 229,235,239 Mathew, J. A. D., 248(33), 266(107, 109), 267 (119), 293,295 Mathieu, H. J., 258,294 Matsui, S., 102, 136 Matsukawa, T., 94, 138 Matsushima, T., 290, 297 Mauer, J., 117, 137 Meadows, A. J., 147, 188 Mehltretter, J. P., 165, 189 Mehta, M., 281, 285, 296 Melliar-Smith, C. M., 77, 81, 84, 137, 139 Menzel, D. H., 143, 189 Messenger, R. S., 103, 139 Messiah, A., 194, 196, 199, 239 Meyer, D., 128, 138 Meyer, F., 285(170, 182), 296 Michaud, G., 149,190 Michial, M. S., 126, 127, 139 Mihalas, D., 142, 162, 163, 170, 189 Minkiewicz, V. J., 79, 137 Mirzabekov, A. M., 32, 33, 63 Moddeman, W. E., 245,292 Mogami, A., 251,252,253,293 Mollow, B. R., 225, 239 Monahan, K., 286,297 Monticello, D. A,, 16, 67 Moody, S. E., 198,226, 239 Moore, D. W., 164, 189 Moore, G. E., 263, 271, 272(137), 291(220), 294, 295,297 Moore, J. S., 137 Moore, R.D., 126, 127, 128, 136, 139 Morabito, J. M., 250(43), 270(127), 280(158, 159), 283(158, 159), 284(159), 291(223), 293,295,296,298 Morellec, J., 234, 238, 239 Morozov, A. I., 2,65 Morton, A. H., 26,65 Motz, J. M., 106, 136 Muchado, M. E., 185, 189
Muir, A. W., 126, 127, 139 Mularie, W. M., 261,294 Mullendore, A. W., 285, 286,296 Munro, E., 102, 116, 137,139 Murata, K., 85, 91, 94, 95, 138 Murday, J. S., 272(137, 138), 273(138), 295 Musket, R. G . , 249,293 Musman, S. A., 164,189 N Nagei, D. J., 101, 138 Nakagawa, O., 102, 136 Nakata, H., 95, 137 Namba, S., 102, 136 Nayfeh, M. H., 192,238 Nelson, D. A., 114, 138 Nelson, G. C., 258(77), 259(77), 260(77), 285 (181), 286(181), 291(216), 294,296,297 Nelson, G. D., 164, 189 Neureuther, A. R., 91, 138 New, G. H. C., 218,238 Newton, H. W., 151, 189 Niblett, G. B. F., 25, 62 Nishimura, T., 102, 136 Nolte, J. T., 176, 188 Noonan, J. R.,267(116), 268(116), 272(136), 278(116), 285(116), 295 Nordberg, R., 245, 292 Nordling, C., 241(3), 242(3), 245(3, 14, 22), 246(22), 278(22), 292, 293 Norman, D., 255,257,294 Normand, D., 234,239 Nosker, R. W., 74, 138 Novakov, T., 265,295 Noyes, R. W., 168, 170, 188 Nunn, M. L., 151, 189 Nuttall, J. D., 245, 267, 270, 2Y2, 295
O
O’Grady, W. F., 288, 297
Ohiwa, H., 118, 138
Ohuchi, F., 285, 286, 296
Ohyabu, N., 29, 65
Okamura, N., 29, 65
Okamura, S., 28, 29, 65
Oleinik, V. P., 201, 239
Olson, R. R., 259, 294
O’Neil, S. V., 228, 238
Ono, A., 118, 138
Ono, M., 259, 294
Onoda, G. Y., Jr., 285, 286, 296
Orkney, K. E., 218, 238
Orloff, J., 122, 139
Orr, B. J., 215, 239
Osaki, Y., 156, 158, 159, 187
Osborn, C. M., 285, 286, 296
Oseledchik, Yu. S., 225, 239
Ostrovskaya, G. V., 32, 33, 34, 63
Ostrovskii, Yu. I., 32, 34, 63
Ouano, A. C., 84, 138
Overskei, D., 22, 31, 65
Ozedimir, F. S., 75, 138

P
Paisner, J. A., 192, 200, 220, 239
Palmberg, P. W., 242(9, 10), 249(42a), 250(42a), 265(42a), 269(122), 275(9), 276(145), 280(42a, 157), 281(42a), 283(42a, 157), 287(122), 292, 293, 295, 296
Pandy, K. C., 272, 295
Pantano, C. G., Jr., 285, 286, 289, 296, 297
Pardee, W. J., 257, 261, 294
Parikh, M., 85, 88, 90, 93, 138
Park, R. L., 278, 296
Parker, D. H., 193, 223, 224, 236, 237, 239
Parker, E. N., 9, 32, 57, 141, 167, 177, 182, 183, 65, 189
Parker, N. W., 118, 136
Pattinson, E. B., 251, 261, 294
Payne, M. G., 192, 206, 237, 238
Pechacek, R. E., 101, 138
Peckerar, M. C., 101, 138
Penberth, M. J., 116, 132, 138
Penn, D. R., 255(58, 59), 257(94), 261(58, 59, 94), 283(165, 166), 294, 296
Pepper, S. V., 289, 297
Perel’man, N. F., 201, 230, 239
Peria, W. T., 242, 292
Perkins, M., 128, 136
Perkins, W. E., 75, 138
Perlman, M. L., 248, 293
Petite, G., 234, 239
Petrov, M. V., 32, 34, 63
Petschek, H. E., 10, 16, 17, 19, 64, 65
Pfeiffer, H. C., 121, 122, 124, 125, 127, 130, 136, 138, 139
Piddington, J. H., 57, 179, 182, 65, 189
Pittock, A. B., 143, 189
Placious, R. C., 106, 136
Pneuman, G. W., 174, 189
Poate, J. M., 258, 259, 294
Pocker, D. J., 250, 293
Podgorny, A. I., 65
Polaschegg, H. D., 289, 297
Politycki, A., 128, 138
Politzer, P., 22, 31, 65
Polizzotti, R. S., 287, 297
Pollak, R. A., 248(31), 257(93), 261(93), 293, 294
Ponec, V., 290, 297
Pons, F., 280(160), 281(160), 284(160), 289(202), 296, 297
Poole, R. T., 267, 295
Powell, C. J., 276(147, 148), 280(156), 281(156, 162), 296
Power, E. A., 198, 239
Praderie, F., 169, 170, 189
Preston, R. K., 222, 239
Priest, E. R., 17, 65, 66
Prins, R., 265, 295
Prutton, M., 253(50), 266(107, 109), 293, 295
Przhibelskii, S. G., 225, 239
Publen, B. P., 245, 292
R
Radler, K.-H., 180, 181, 188
Ralph, H. I., 89, 138
Ramaker, D. E., 272(137, 138), 273(138), 295
Ranke, W., 285, 296
Rao-Sahib, T. S., 103, 139
Raymond, J. C., 174, 189
Reekstin, J. R., 103, 139
Rehn, V., 286, 297
Reuther, W., 283, 296
Reviere, J. C., 247(25), 285(172), 293, 296
Rhodes, E. J., 149, 151, 153, 154, 159, 160, 187, 189, 190
Riach, G. E., 249, 250, 265, 280, 281, 283, 293
Rigden, J. D., 278, 296
Ritus, V. I., 201, 239
Ritz, E. J., 131, 138
Roberts, E. D., 84, 138
Roederer, J. G., 29, 65
Rojansky, V., 296
Romand, M., 289, 297
Ronhot, B., 289, 297
Rood, R. T., 146, 189
Rosenau, P., 17, 65
Rosenberg, H., 185, 189
Rosenbluth, M. N., 11, 15, 16, 18, 23, 58, 64, 65, 67
Rosner, R., 173, 176, 183, 186, 189, 190
Ross, K. J., 198, 238
Rossi, B., 95, 138
Rowe, J. E., 261, 262, 263, 294
Roxburgh, I. W., 147, 189
Ruoff, A. L., 114, 138
Rusch, T. W., 261(87), 267(115), 294, 295
Russell, C. T., 3, 65
Rye, R. R., 245(16, 19), 259(85), 261(85), 263(19), 264(19), 267(19), 269(19), 270(19), 271(19), 285(183), 286(183), 292, 294, 296

S
Sagdeev, R. Z., 58, 65
Sakurai, J. J., 194, 195, 198, 201, 239
Salinger, H. W. S., 130, 138
Salmeron, M., 266, 295
Samain, A., 22, 26, 61
Sargent, M., III, 220, 239
Sasaki, T., 125, 137
Sasorov, N. V., 18, 63
Sassi, M., 198, 239
Sato, T., 20, 21, 53, 64, 65, 66
Satya, Y., 16, 66
Savchenko, V. I., 290, 297
Sawatsky, G. A., 271, 272, 295
Sayer, B., 198, 239
Schatzmann, E., 149, 190
Scheibner, E. J., 242(8), 271(129), 292, 295
Scherrer, P. H., 156, 189
Schindler, K., 16, 62
Schmidt, G., 16, 66
Schoonmaker, R. C., 266, 295
Schreiner, D. E., 262, 294
Schwarz, S. A., 257, 258, 294
Schwarzschild, M., 144, 147, 189
Schweitzer, G. K., 245, 292
Scott, R. W., 102,139 Seah, M. P., 255(53, 57), 256(57), 287(191), 289(191), 293, 297 Sears, R. L., 144, 145, 187, 189 Severny, A. B., 156, 189 Sewell, H., 89, I38 Shedova, E. N., 32, 33, 34,63 Shephard, J. G. P., 261,265,294 Shimizu, H., 259,294 Shimizu, R., 94, 138, 253(51), 257(66), 282 (66), 137 293, 294 Shirley, D. A., 247, 248(30, 31), 257(93), 261 (93), 293, 294 Shore, B. W., 221,222, 224,237,238 Sickafus, E. N., 275, 276,295 Siegbahn, K., 245(14, 22), 246(22), 278(22), 292,293 Silve, J. A,, 281, 285, 296 Simon, G. W., 149, 159, 166, 189 Simons, J., 226, 229, 235, 239 Simpson, J. A., 275(143), 278(150), 295, 296 Sims, D. L., 285,286,296 Sinfelt, J. H., 290, 297 Siscoe, G. L., 16,64 Slusser, G. J., 259,294 Smith, A. V., 192, 215, 239 Smith, D. F., 19, 185, 186,66, 189 Smith, D. M., 283,284,296 Smith, H. I., 85, 107, 110, 136, 137, 138 Smith, S. J., 198, 226, 228, 235, 238, 239 Snijder, J. T., 128, 129, 136 Solarz, R. W., 192, 200, 220, 239 Solov’ev, L. S., 2,65 Soma, T., 125, 130, 137, 138 Somov, B. V., 18,66 Sonnerup, B. U. O., 12, 16, 17, 58,66 Soward, A. M., 17,65, 66 Sparrow, J. H., 106, 136 Speiser, T. W., 18, 63 Speth, A., 75, 136 Speth, A. J., 91, 95, 136 Spicer, D. S., 186, 189 Spicer, W. E., 257(63, 70), 285(185), 287(185), 294,297 Spiegel, E. A,, 148, 150, 189 Spiller, E., 91, 98, 102, 105, 107, 112, 137, 138 Spivack, M. A, , 138 Springer, R. W., 249, 257(42), 293
Spruit, H. C., 145, 152, 168, 188, 189 Stebbins, R. T., 151, 153,188 Stehle, P., 201, 234, 237 Stein, H. J., 274, 275, 291, 295 Stein, R. A., 153, 166, 189 Stenflo, J. O., 166, 189 Stenzel, R. L., 4, 13, 28, 29, 64,66 Stephani, D., 94, 138 Stevens, D. C., 58,64 Stickel, W., 122, 125, 127, 128, 136, 138, 139 Stone, J. M., 37, 61 Storz, R. H., 235,236,237 Strausser, Y. E., 257(70), 278(152), 279(152), 280(152), 285(185), 287(185), 294, 296, 29 7 Stroud, C. R., 226, 239 Studwell, T. W., 90, 136 Suleman, M., 257, 261, 294 Sullivan, P. A., 103, 137 Svestka, Z., 184, 186, 189 Swain, S., 225,239 Swanson, L. W., 121, 122, 136,139 Sweet, P. A,, 5, 7, 9, 57, 66 Syrovatskii, S. I., 8, 12, 18, 20, 22, 32, 33, 53, 58,63, 64,66
T
Tai, K. L., 94, 136 Takashi, S., 139 Tanaka, K., 185,190 Taylor, J. A , , 248, 293 Taylor, J. B., 58, 65 Taylor, N. J., 257(63), 270(126), 278(126), 280 (126), 294, 295 Terent’ev, M. V., 201, 239 Tharp, L. N., 242,292 Thebouilt, J., 234, 239 Theodosiou, C. E., 196.239 Thomas, J. H., 250,293 Thomas, R. N.,74,162,169,170,136,188,189 Thomas, S., 285, 287,297 Thompson, L. F., 77, 78, 81, 84, 85, 137, 139 Thornton, P. R., 70, 124, 137, 139 Ting, C. H., 91, 93, 137, 138 Tischer, P., 113, 137 Todirashku, S. S., 230, 239 Tokarevskaya, N. P., 32,62
Tolk, N. H., 285, 286, 296
Toneman, L. H., 283, 296
Topalian, J., 102, 136, 138
Tracy, J. C., 242, 275, 292
Trahin, M., 234, 235, 238
Trotter, D. E., 150, 188
Tsap, T. T., 156, 189
Tsuda, T., 19, 20, 45, 53, 61, 64, 66, 67
Tucker, W. H., 176, 183, 186, 189
Tuggle, D., 122, 139
Turner, N. H., 272, 295

U
Uberoi, M. S., 18, 66
Ugai, M., 19, 20, 53, 66, 67
Ulmschneider, P., 164, 171, 190
Ulrich, R. K., 149, 151, 153, 154, 159, 160, 187, 189, 190
Unno, W., 148, 190

V
Vaiana, G. S., 173, 175, 176, 183, 184, 186, 188, 189, 190
Vainshtein, S. I., 58, 67
VanderMeulen, Y. J., 285(178), 286(178), 291(222), 296, 298
Van Hoven, G., 15, 16, 67
Vasyliunas, V. M., 6, 7, 16, 67
Vauclair, G., 149, 190
Vauclair, S., 149, 190
Venables, J. A., 253, 293
Vennik, J., 245(13), 248(36), 264(100), 287(13), 292, 293, 294
Vernazza, J. E., 161, 163, 170, 190
Viswanathan, N. S., 86, 137
Vlases, G. C., 37, 67
Voronov, G. S., 201, 239
Vrabec, D., 182, 190
Vrakking, J. J., 285(170, 182), 296

W
Waddell, B. V., 16, 67
Wagner, C. D., 248(39), 270(123, 125), 293, 295
Wagner, E. B., 192,238 Wagner, W. J., 151, 190 Waldrop, J. R., 242, 250, 292 Walker, R. B., 222,239 Wallace, S. C., 192,239 Wallman, B. A., 116, 132, I38 Walls, D. F., 225, 237 Walraven, T., 156, 188 Walther, H., 226, 239 Wang, C. C., 192,239 Wang, C. C. T., 130,139 Wang, G. C., 287,288,297 Wang, R., 198,239 Ward, J. F., 192,215,239 Ward, R., 128, 129, I39 Wardey, G. A., 102, 139 Watson, K. M., 196, 201, 238 Watson, R. E., 248,293 Weber, E. V., 126, 127, 128, 139 Weber, R. E., 242(7), 249(42a), 250(42a), 265 (42a), 280(42a), 281(42a), 283(42a), 292, 293 Wehner, G. K., 257(64), 259(83), 282(64), 294 Weiss, N. O., 149, 166, 179, 181,188,189,190 Werner, H. W., 251, 253,293 Wheatley, S. E., 226, 228, 229, 230, 234, 235, 237 Whipps, P.W., 139 White, C. W., 285, 286, 296 White, J. M., 245(16), 290(212), 292, 297 White, 0. R., 143, 176, 187, 190 White, R. B., 16, 67 White, R. S., 36, 46, 48, 62 Whitley, R. M., 226, 239 Whitlock, R. R., 101,138 Wild, H., 255,294 Wildman, H. S., 258(67), 259(67), 282(67), 294 Williams, F. L., 290, 297 Williams, M. C., 126, 127, 128, 136, I39 Williams, N. V., 164, I90 Williamson, A. D., 236, 240 Wilson, A. D., 75, 136 Wilson, 0. C., 180,190 Wilson, P. R., 164, 190 Winick, H., 102, 137 Winograd, N., 248(35), 258(78), 259(78, 82), 293,294 Winters, H. F., 258, 259,294
Withbroe, G. L., 174, 190
Wittels, N. D., 93, 139
Wittman, A., 164, 190
Wittry, D. B., 103, 139
Wodkiewicz, K., 227, 239
Wolf, E. D., 75, 138
Wolfe, J. E., 116, 122, 123, 139
Wong, J., 221, 224, 226, 238, 240
Wood, M. Y. C., 288, 297
Woodard, O. C., 126, 127, 128, 136, 139
Woodruff, D. P., 255(61), 257(61), 267(117), 294, 295
Worden, E. F., 220, 239
Wright, R. E., 37, 61
Wu, F. Y., 226, 238, 240

Y
Yahara, T., 95, 137
Yang, M. G., 291, 297
Yates, J., 285, 296
Yau, L., 84, 139
Yeager, E., 288, 297
Yeates, C. M., 9, 37, 41, 43, 44, 46, 47, 49, 51, 53, 58, 62
Yeh, T., 17, 67
Yim, R., 75, 138
Yoshimatsu, M., 102, 103, 139
Young, J. P., 192, 238
Young, R., 279, 296
Youngman, C. I., 93, 139
Yourke, H. S., 128, 139

Z
Zaborov, A. M., 20, 53, 63
Zahn, J.-P., 148, 150, 189
Zakheim, D., 236, 240
Zaslavski, G. M., 58, 65
Zdasiuk, G., 192, 239
Zehner, D. M., 257(90), 261(90), 267(116), 268(116), 272(136), 278(116), 285(116), 294, 295
Zeitler, H. U., 91, 139
Zelenyi, L. M., 16, 64
Zhovna, G. I., 225, 237
Zienau, S., 198, 239
Zimmermann, B., 121, 139
Zimmermann, P., 198, 240
Zirin, H., 143, 185, 190
Zirker, J. B., 166, 176, 178, 188, 189, 190
Zoller, P., 225, 226, 228, 231, 232, 235, 238, 240
Zukakishvili, G. G., 28, 67
Zukakishvili, L. M., 28, 67
Zusman, L. D., 225, 240
Zwaan, C. K., 182, 190
Subject Index
A
Aberrations, in electron beam lithography, 116-120
    shaped-beam system, 125
AC Stark shift, 211, 220, 234-236
AES, see Auger electron spectroscopy
Alfven Mach number, in magnetic reconnection experiments, 10, 12, 16, 20, 54
Alfven wave, from sunspot, 183
Amplitude fluctuations in multiphoton transitions, 224-227, 229-230, 232
Anharmonic oscillator, multiphoton excitation, 222
Annular pinch, in magnetic reconnection experiments, 22, 26-27
Arch filament system, of chromospheric loops, 182
Argon, in Auger spectroscopy, 251, 267
Argon sputtering, 259-260
Astrophysics, 141-142; see also Solar physics
Atomic number
    and Auger emission, 251-252
    and scattering effects, in beam-target collisions, 73-74
Auger electron emission, 242-245, 251-252
    angular-dependent, 267
Auger electron energies, 245-248
Auger electron spectroscopy, 241-298
    applications, 287-291
    basic principles, 242-248
    characteristics of, 251-260
    computerization, 279
    experimental approach, 274-279
    instrumentation, 250-251
        energy analyzers, 275-279
    line shapes and intensity, 261-273
    notation, 244-245
    quantitative analysis, 280-284
    sample damage, 285-287
Auger parameter, 270
Auger transition, 261-273

B
Backscattering, in beam-target interactions, 74
    proximity effect, 86-87, 91, 93
Bandpass analyzer, 278
Bandwidth effects, in multiphoton transitions, 224-232
Beam-target interactions, 73-95
    broad beam case, 81-85
    primary electron energy loss, 81-83
    proximity effect, 85-95
Beam voltage, and proximity effect, in beam-target interactions, 94
Beryllium
    solar, 149-150
    window, for x-ray lithography, 112-113
    x-ray absorption data, 104-105
Binary alloy, analysis by Auger spectroscopy, 258-259
Binding energies, and Auger energy calculations, 245-247
Bipolar sunspot, 4-5, 9

C
Cadmium, Auger core transitions, 270
Carbon, Auger spectrum, 267-270
Carbon radiation, as x-ray source, 114-115
Catalysis studies, with Auger spectroscopy, 290-291
Cesium, resonant multiphoton processes, 234
Chemical shift, in Auger electron spectroscopy, 269-270
Chromosphere, solar, 168-172
    solar activity, 181-186
Climate, solar influence on, 143, 146
CMA, see Cylindrical-mirror analyzer
Coaxial electron gun, for Auger spectroscopy, 275-276
Collodion window, in x-ray lithography, 113-114
Computerization, in Auger electron spectroscopy, 279 Contrast of electron beam resist, 78-79 Convection, solar, 147-150, 162-164 Copper angular-dependent Auger emission, 267 anode target, for x-ray lithography, 102103 Auger spectrum, 272 Coster-Kronig effects, 265 Core-core-core Auger transition, 261, 270 Corona, solar, 172-176, 178 holes, 176, 178 loops, 182-183 rotation, 151 solar activity, 18 1 - 186 Cosmic processes, see also Solar physics and magnetic reconnection, 6-7, 12-13, 52 Coster-Kronig transition, 265-266, 280 Coulombic interactions, in electron beam lithography, 119-120 shaped-beam system, 125 Cross-linking, in beam-target interactions, 76, 83 Cylindrical-mirror analyzer, for Auger spectroscopy, 251, 275-276, 278
D Damping, in multiphoton processes, 219-220, 224 Decay, of resonant excited atomic states, 214-215, 220 Deflector system, in electron beam lithography, 125-126, 129-133 Density matrix treatment, in multiphoton processes, 217-219, 223, 229-231 Depth profiling, for Auger spectroscopy, 257-260 Depth resolution, of Auger spectroscopy, 253-257 Detection limit-current relation, in Auger spectroscopy, 251 Detuning from resonance, 202-204, 206, 213, 221 DIPD, see Double inverse pinch device Dipole approximation, in multiphoton processes, 198-199 Dipole atomic matrix element, for bound-bound transition, 206
dN/dE spectrum, 278 DOR, see Double optical resonance Double inverse pinch device, 4, 22, 35-51, 53 Double optical resonance, 220, 226, 228, 231-232, 235 Double-pass analyzer, 276 Doubly ionized initial state, Auger emission from, 261 Dual-deflector system, in electron beam lithography, 126 Dual-yoke system, in electron beam lithography, 125 Dungey’s paradox, 8-9, 36 Dynamical dynamo model, of solar activity, 180 Dynamic relaxation energy, 248 Dynamo theory, of solar activity, 179-181 E Eclipse, solar, 169 Elastic collision, 73 proximity problem, 86 Electromagnetic field, fluctuations in multiphoton transitions, 224-232 Electromotive force in double inverse pinch device, 38-39 and magnetic reconnection, 13-14 Electron beam energy loss, in microlithography, 73-76, 81-83, 85-87 Electron beam lithography, 116-133 aberration terms, summation of, 116-118 beam projection methods, 128-129 electron interactions, 118-122 electron-optical components, 129-133 electron-optical computation, 116-122 field emitter cathode, 122-124 with high throughput, 133-135 possible approaches to, 71 proximity effect, 75 resist design, 76 shaped-beam system, 124-128 Electron beam projection, 128-129 Electron beam scanning, resist design for, 78, 80 Electron energy analyzer, for Auger spectroscopy, 250-251, 275-279 Electron gun for Auger spectroscopy, 275-276
Coulombic interactions, 119 for x-ray lithography, 102 Electronics industry, use of Auger spectroscopy in, 291 Electron inelastic mean free path, 253-259 Electron interactions, in electron beam lithography, 118-122 Electron-optical column, for Auger spectroscopy, 250 aberration terms, 116-118 shaped-beam lithography system, 125-126 Electrostatic deflector, 130 Elemental sensitivity to Auger emission, 251-252 Excitation in beam-target interactions, 73-76, 83 in x-ray lithography, 96
F Fiducial mark detection, 131-133 Field emitter cathode, in electron beam lithography, 122-124 Field statistics, in multiphoton transitions, 224-232 Filigree, of solar photosphere, 166-167 Five-minute solar oscillations, 153, 155, 159 Flat-plate device, in magnetic reconnection experiments, 22, 28-32 Flux equation, 13 Flux transfer rate, in magnetic reconnection, 12-14 Focused electron beam, and resist-covered wafer, interactions between, 73-95 Forced tearing, in magnetic reconnection experiments, 28, 31 Fusion energy research, and magnetic reconnection, 3, 6, 11, 23, 52 G
Gas Auger electron energy from, 267 Auger spectrum, 267, 269-271 Gelation, of resist, in beam-target interactions, 77 Gel dose, of resist, 83-84 Gold Coster-Kronig effects, 265-266 x-ray absorption data, 104-105
Granulation, photospheric, 164-165, 167 G value, of resist, 83-84 Gyrosynchrotron radiation, from solar flares, 186
H Hertzsprung-Russell diagram, 144 High-energy ion, and Auger emission, 251 Higher-order resonant multiphoton processes, 209-215 High-spatial-resolution Auger electron spectroscopy, 250-253 Hydrogen, resonant multiphoton processes, 235-236
I Ice Age, 146 IFTE, see Impulsive flux transfer event Impedance matching, in multiphoton transitions, 224 Impulsive flux transfer event, in magnetic reconnection, 3, 14, 18-19, 58-61 double inverse pinch device, 40, 45-51, 54 Inelastic collision, 73 Inelastic scattering, of Auger electrons, 261-263 Inelastic scattering cross section, for electrons in solids, 253, 255 Inorganic compounds, inelastic mean free paths, 255-256 Intensity fluctuations, in multiphoton transitions, 224-225, 229-230 Interatomic transition, in Auger spectroscopy, 267 Interfacial resolution, in Auger spectroscopy, 257-258 Interface studies, with Auger spectroscopy, 287-288 Inverse pinch, 37; see also Double inverse pinch device Ionization in Auger electron spectroscopy, 263-265 in beam-target collisions, 73-74, 76, 83 in multilevel systems, 224 Ion sputtering, and Auger spectroscopy, 250, 257-260, 282 Inverse problem, of radiation transfer theory, 161
J Jaggi solar flare model, 11 j-j coupling, 245-246 Joule heating sink, 16
K Kerr cell, 40-41, 43 Kinematic dynamo model, of solar activity, 180 Kinetic equations, for multiple resonance, 223-224, 228 K-level system, multiple resonance, 220-224 KLL transition, 245-247
L Large-angle scattering, elastic collision, 73 Laser, field statistics and bandwidth effects, in multiphoton processes, 225-236 Lattice, and Auger process, 267 Lenz’s law, 8 Lifetime broadening, in Auger spectroscopy, 265 Lithium Auger emission, 250 solar, 149 Local thermodynamic equilibrium, of stellar atmosphere, 162 Lorentz force, 8 Low-energy ion, and Auger emission, 251 l-s coupling, 245-246 LTE, see Local thermodynamic equilibrium Lundquist number, 9 M Magnetic axis, 7 Magnetic deflector system, 129 Magnetic field, 2-4 field line structure, 2, 4-5 and solar activity, 179-184 of solar chromosphere, 172 of solar corona, 176, 178 of solar photosphere, 166-167 of solar transition region, 174 Magnetic-fluid seal, for x-ray lithography, 103
Magnetic flux, 2, 6-7 in solar activity, 181-184 two-current system, 4
Magnetic flux transfer, see also Impulsive flux transfer event in double inverse pinch device, 38-41 Magnetic null point, 2, 4-5 Magnetic reconnection, 1-67 definition, 7 experiments, 22-51 annular pinch, 26-27 DC quadrupole, 31-32 double inverse pinch device, 35-51 flat-plate devices, 28-31 theta-pinch experiments, 24-25 tokamaks, 25-26 triax, 23 TS-3 experiment, 32-35 historical perspective, 8-14 impulsive flux transfer, 58-61 process rates, 12-14 reconnection jargon, 57-58 theory, 14-22 impulsive flux transfer events, 18-19 numerical approaches, 19-22 sheet rupture, 18-19 tearing mode, 15-16 wave-assisted diffusion mode, 16-18 x point, example of, 55-57 Magnetic Reynolds number, 9 Magnetoelectric coupling, 13 Magnetohydrodynamics, and reconnection experiments, 19-20 Magnetomorphology, 2 Magnetosphere, 4 Magnetospheric substorm, 3, 12 Many-body polarization effects, 248 Mask for transmission projector, 134 for x-ray lithography, 96-97, 106-110, 114 alignment, 108-110 size and exposure area, 111-112 Materials science, use of Auger spectroscopy in, 288-289 Metals, inelastic mean free paths, 255 Methyl alcohol, in Auger process, 267, 269-271 Microlithography, 69-139; see also Electron beam lithography; X-ray lithography beam interactions with resist-covered wafers, 73-95 possible approaches to, 71
proximity effect, 85-95 resist design and behavior, 76-80 Microprojector, 128, 134 Microwave emission, from solar flares, 186 Mixing-length theory, in astrophysics, 142, 148-150 Molecular multiphoton processes, 236 Molybdenum anode target, for x-ray lithography, 103 Monte Carlo technique, application to proximity problem in microlithography, 85-87, 94-95 Multiphoton processes, see Nonresonant multiphoton processes; Resonant multiphoton processes Multipole experiment, in magnetic reconnection, 22, 27 Mylar, x-ray absorption data, 104-105
N Near resonance, 202 Negative resist, 76-77 design specifications for lithography, 78-79 development process, 79 postexposure temperature, 80 and proximity problem, 94 N(E) spectrum, 278-279 Neutrino, solar, 144-147 Nickel, Auger analysis, 252 Nonresonant multiphoton processes, 206-209 intensity fluctuations, 225 transition probability, 197-198, 204 O
Octopole deflector system, 130-131 OFHC copper anode target, for x-ray lithography, 102 Oil seal, for x-ray lithography, 103 Optical lithography, 71, 134-135 Organic compounds, inelastic mean free paths, 255-256 Outer solar atmosphere, 168-176 Overlay problem, in microlithography, 134-135 Oxide, preferred sputtering effect, from Auger spectroscopy, 259 Oxygen
Auger spectrum, 267, 270 preferred sputtering effects, in Auger spectroscopy, 259-260 surface phase diagram, 287-288 P
Pancake pinch, in magnetic reconnection experiments, 22, 28 PDM, see Phase diffusion model Petschek shock, 53 Petschek’s model, in magnetic reconnection, 10-12, 16-17 Phase diffusion model, in multiphoton processes, 227-228 Phase fluctuations, in multiphoton transitions, 224-229 Photochemical events, excitation of, in beam-resist interactions, 83 Photoelectron spectroscopy, 244 Photon statistics (photon correlation effects), 225 Photosphere, solar, 149, 160-168 rotation, 150 solar activity, 181-183 Pierce electron gun, 102 Plasma containment, 3, 11 Plasma physics and magnetic reconnection, 3, 5-7, 12-13, 28-52 and solar flares, 186 Plasmon generation, 261-263, 269 PMMA resist, 113-114, 126 and proximity effect, 93, 95 sensitivity, 111-112 x-ray absorption data, 104-105 Polymer materials, for electron beam resists, 75-76, 79, 84 Population inversion, in higher-order resonant processes, 215 Positive resist, 76 design specifications for lithography, 78-79 Poynting’s theorem, 8, 36 Preferred sputtering, in Auger spectroscopy, 258-260 Projector, in electron beam lithography, 125 Proton-proton reaction, solar, 144-145 Proximity effect, in beam-target interactions, 75, 85-95
Q Quadrupole experiment, in magnetic reconnection, 22, 31-35 Quadrupole transition, 206 Quantitative analysis, by Auger spectroscopy, 280-285 Quantum theory of multiphoton processes, 194-200 resonant two-photon processes, 200-206
R Rabi frequency, 205-206, 221-222, 224, 227-229 Rabi oscillation, 205, 223, 228 Radiation transfer study, of solar atmosphere, 160-164 Radio emission, from solar flares, 185-186 Range, of electrons, in beam-target interactions, 74 Rate equations, for multiple resonance, 223-224, 228 Reconnection, magnetic, see Magnetic reconnection Resist, electron beam, 73-95 behavior and design specifications for lithography, 76-80 primary electron energy loss, 81-83 for shaped-beam lithography system, 126-127 for x-ray lithography, 96-101, 111-114 Resonance fluorescence, 226, 228, 231-232, 235 Resonant intermediate state, 200-201 Resonant multiphoton processes, 191-240 bandwidth effects, 224-232 experimental investigations, 233-236 field statistics, 224-232 higher-order processes, 209-215 multiple resonances, 219-224 nonresonant states, effect of, 206-209 quantum theory of two-photon processes, 200-206 semiclassical approaches to, 215-219 theory, 194-200 Retarding-field analyzer, 277-278 RFA, see Retarding-field analyzer
Rotating-anode x-ray sources, for x-ray lithography, 102-103, 110, 114 Rotating-wave approximation, in multiple resonance, 220-221
S Saturated transition, effect of field fluctuations on, 225-226 Scanning Auger electron spectrometer, 251 Scanning electron microscope, for Auger electron spectroscopy, 250-251 Scattering, in beam-target interactions, 73-74 electron beam lithography, 121 proximity effect, 85-87, 91-94 Scission process, in beam-target interactions, 76 Secondary emission, in beam-target collisions, 74 SEM, see Scanning electron microscope Semiclassical form of multiphoton process theory, 194, 199-200, 215-219 Semiquantitative analysis, by Auger spectrometry, 280 Separator, of magnetic field, 2 in two-current system, 4 Separatrix, of magnetic field, 2, 7 in two-current system, 4 Shake-up feature, in Auger electron spectroscopy, 265 Shaped-beam lithography, 121, 124-128 deflection systems, 129-131 electronic and computer aspects, 126 with high throughput, 134 improved electron optics, 125-126 Sheet rupture, in magnetic reconnection, 18, 33, 35 Shot noise limitation problem, in x-ray lithography, 100, 113 Silicon Auger electron spectrum, 261-262, 273 Coster-Kronig effects, 265-266 sputter profile, 258 valence spectrum, in Auger process, 272 x-ray absorption data, 104-105 Single-deflector system, in electron beam lithography, 126 Single-ionization energies, in calculation of Auger energies, 245-248
Single-photon ionization, 203 Single-photon transition, 202-203 Slow-mode shock, 20-21 Small-angle scattering electron beam lithography, 121 in inelastic collisions, 73 Sodium, resonant multiphoton processes, 235 Solar activity, 179-186 explosive, 184-186 origins of, 179-181 slowly varying, 181-184 Solar atmosphere, 160-178 chromosphere, 168-172 corona, 172-176 photosphere, 160-168 temperature, 168 transition region, 172-174 Solar core, 144-147, 149 model of, 145 rotation, 151-152, 156 Solar (k, ω) diagram, 154, 158 Solar envelope, 147-160 model of, 145 Solar flare, 3, 8-12, 184-186 x-ray spectrum, 184-185 Solar meridional circulation, 152 Solar physics, 141-190 quiet atmosphere, 160-178 solar activity, 179-186 solar interior, 144-160 Solar polar vortex, 149 Solar pulsation, 153-160 Solar rotation, 150-153 Solar seismology, 159-160 Solar spectrum, 165, 170, 173-174 Solar wind, 176-178 Solid analysis by Auger spectroscopy, 242-244, 253, 257-260 Auger electron energy from, 267 Auger spectrum, 267, 269 inelastic scattering of Auger electrons in, 261 Solid-state broadening effect, of Auger electron energy, 267, 269 Sonnerup model, in magnetic reconnection experiments, 16-17 Space charge effect, in electron beam lithography, 125
Spectroscopic rotation rate, solar, 150-152 Spicule, of chromosphere, 172 Spontaneous decay, of resonant excited states, 214 Static atomic relaxation energy, 247-248 Static extra-atomic relaxation energy, 248 Stellar atmosphere, local thermodynamic equilibrium, 162 Stellar envelope, convection energy transport in, 148, 150 Stellar physics, 141-142 Stellar wind, 176-177 Step and repeat technique, in microlithography, 72, 97 Sun, see Solar physics Sunspot, 4-5, 9, 181-183 rotation, 150-151 Supergranulation, photospheric, 166 Surface analysis by Auger spectroscopy, 242-244, 250-251, 253, 264, 274, 281-285, 287-289 effect of ion sputtering on, 257-260 Surface phase diagram, 287-288 Sweet’s paradox, 9-10, 42
T Tantalum, preferred sputtering effects in Auger spectroscopy, 259-260 Tearing mode, in magnetic reconnection theory, 11-12, 15-16, 18-19 annular pinch, 26-27 theta-pinch experiments, 25 tokamak, 25-26 triax tubular pinch device, 23 TS-3 experiment, 33, 35 Theta-pinch experiment, 22, 24-26 Three-cell topology, of two-current system, 4-5 in double inverse pinch device, 36-37 Three-level resonant systems, 220 Three-photon ionization, 205 Three-photon transition, matrix element, 197 Titanium, Auger analysis, 252, 254-255 Tokamak, in magnetic reconnection experiment, 4, 22, 25-26 Transition metals, Coster-Kronig effects, 265 Transition probability, 197-198, 204
Transition region, of solar atmosphere, 172-174 solar activity, 185 Triax tubular pinch device, in magnetic reconnection experiments, 22-25 Triple inverse pinch, in magnetic reconnection experiments, 22, 27-28 TS-3 experiment, in magnetic reconnection, 32-35, 53 Tunable dye laser, 226, 232, 234 Tungsten, Auger peak heights, 254-255 Tungsten hairpin electron gun, for x-ray lithography, 102 Two-current system, magnetic reconnection in, 3-4 Two-photon bound-bound transition, 202, 229 Two-photon processes, 200-206, 216, 228 Two-photon-resonant three-photon ionization, 210-212, 218, 229-230 Two-photon transition, via nonresonant states, 207 V
Vacuum requirement, for Auger spectroscopy, 274 Vacuum seal, for x-ray lithography, 103 Valence band spectroscopy, 267, 271-272
W Wave-optical calculations, in electron beam lithography, 117
Weather, solar influence on, 143 Wilson depression, in photosphere, 182
X X-ray initiation of Auger emission, 251 X-ray lithography, 95-115 cooling-water problem, 103 with high throughput, 133-135 ideal-resist case, 98-101 magnitudes for fast-throughput system, 110-114 masks, 106-112 overhead time per resist exposure, 112-113 possible approaches to, 71 resist properties, 113-114 step and repeat technique, 97 x-ray absorption, 104-105 x-ray sources, 101-106, 110-111, 114-115
Y Yeh-Axford model, in magnetic reconnection experiments, 17 Z
Zinc Auger core transitions, 270 Auger spectrum, 267 Coster-Kronig effects, 265 many-body polarization effects, 248 Zone of mixing, in Auger spectroscopy, 258-259