ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS VOLUME XIV
Advances in
Electronics and Electron Physics
EDITED BY L. MARTON National Bureau of Standards, Washington, D.C.
Assistant Editor CLAIRE MARTON
EDITORIAL BOARD T. E. Allibone, W. B. Nottingham, H. B. G. Casimir, E. R. Piore, L. T. DeVore, M. Ponte, W. G. Dow, A. Rose, A. O. C. Nier, L. P. Smith
VOLUME XIV
1961
ACADEMIC PRESS
New York and London
COPYRIGHT © 1961, BY ACADEMIC PRESS INC. ALL RIGHTS RESERVED
NO PART OF THIS BOOK MAY BE REPRODUCED IN ANY FORM
BY PHOTOSTAT, MICROFILM, OR ANY OTHER MEANS,
WITHOUT WRITTEN PERMISSION FROM THE PUBLISHERS.
ACADEMIC PRESS INC. 111 FIFTH AVENUE
NEW YORK 3, N. Y.
United Kingdom Edition Published by ACADEMIC PRESS INC. (LONDON) LTD. 17 OLD QUEEN STREET, LONDON, S.W. 1
Library of Congress Catalog Card Number 49-7504
PRINTED IN THE UNITED STATES OF AMERICA
CONTRIBUTORS TO VOLUME XIV
C. G. B. GARRETT, Bell Telephone Laboratories, Inc., Murray Hill, New Jersey
P. GÖRLICH, Institute for Optics and Spectroscopy, German Academy of Sciences, Berlin, and Friedrich Schiller University, Jena, Germany
SEYMOUR GOLDBERG, Edgerton, Germeshausen and Grier, Inc., Boston, Massachusetts
HERBERT LASHINSKY, Columbia Radiation Laboratory, Physics Department, Columbia University, New York, New York
T. MORENO, Varian Associates, Palo Alto, California
JEROME ROTHSTEIN, Edgerton, Germeshausen and Grier, Inc., Boston, Massachusetts
ALBERT SEPTIER, Laboratoire d'Electronique et de Radioélectricité, Université de Paris, Fontenay-aux-Roses, Seine, France
PREFACE
In more rapid succession than usual this XIVth volume of Advances in Electronics and Electron Physics has followed the previous one. It so happened that the material available for Volume XIII was much more than we could accommodate in a single issue. We decided, therefore, to split Volume XIII in two, and the present volume is an outgrowth of the original one. This means ultimately that we will have two volumes issued this year: the present one, and our regular next volume XV later in the year. I hope they will be as well received as their predecessors.
In the Preface to Volume XIII, I invited the readers of Advances to send in personal comments to me. For that purpose, I gave a listing of the items which were planned for the next few volumes. This listing has changed slightly since that time. We have published some of the titles listed there, and therefore I am including a listing again with a repeated invitation to write me.
The Distribution of Ionization in the Upper Atmosphere
Masers
Millimeter Waves
Atomic Frequency Standards
The Autodyne Detector as Applied to Paramagnetic Resonance
Relaxation in Diluted Paramagnetic Salts at Very Low Temperatures
Ultrahigh Vacuum Techniques
Scattering in the Upper Atmosphere
Millimicrosecond Techniques
Airglow
Thermionic Conversion
Electroluminescence
Capacitance of P-N Junctions
Electron Phenomena on the Semiconductor Surface
Thermoelectric Phenomena
Atomic Collisions
Cathode Sputtering
Radioastronomy
Fluorescence
Electronics in Oceanography
Light Optical Masers
Photo-Electronic Image Devices
The above tabulation is tentative, of course, and it may change slightly as time goes by. At any event, it should enable those who are willing to make suggestions for further subjects to be informed, particularly if they are willing, in addition, to go to the trouble of consulting the cumulative index in Volume X and the separate indices of the volumes which have appeared since then.
L. MARTON Washington, D. C. February, 1961
CONTENTS
PREFACE
CONTRIBUTORS TO VOLUME XIV

The Electron as a Chemical Entity
C. G. B. GARRETT
I. Introduction
II. Theoretical Section
III. Experimental Section
References

Problems of Photoconductivity
P. GÖRLICH
I. Introductory Considerations on Photoconductivity
II. Photoconduction in the Base Lattice and Tail Absorption Regions
III. Theoretical Problems in Photoconductivity
IV. Dislocations
V. Negative Photoconduction
VI. Surface Conditions
VII. Ohmic and Unidirectional Contacts, p-n Junctions
VIII. Photoelectromagnetic Effects
IX. Application of Photoconductors
X. Conclusion
List of Symbols
References

Strong-Focusing Lenses
ALBERT SEPTIER
I. Theoretical Properties to First Order
II. Aberrations
III. Practical Realization of Lenses and Measurement of Fields
IV. Experimental Study of the Optical Properties
References

Hydrogen Thyratrons
SEYMOUR GOLDBERG AND JEROME ROTHSTEIN
I. Introduction
II. Progress in Hydrogen Thyratron Construction and Techniques
III. Operation of Hydrogen Thyratrons
IV. Conclusion
References

Cerenkov Radiation at Microwave Frequencies
HERBERT LASHINSKY
I. Introduction
II. General Theory of the Cerenkov Effect
III. Theory of the Cerenkov Effect at Microwave Frequencies
IV. Design of Cerenkov Microwave Devices
V. Conclusion
References

High-Power Axial-Beam Tubes
T. MORENO
I. Introduction
II. Problems Common to High-Power Klystrons and Traveling-Wave Tubes
III. Progress in High-Power Klystron Design
IV. Progress in High-Power Traveling-Wave Tube Design
References

AUTHOR INDEX
SUBJECT INDEX
The Electron as a Chemical Entity
C. G. B. GARRETT
Bell Telephone Laboratories, Inc., Murray Hill, New Jersey

I. Introduction
II. Theoretical Section
A. The Thermodynamics of Systems Containing Charged Components
B. Statistical Mechanics
III. Experimental Section
A. Methods
B. Experimental Results
References
I. INTRODUCTION
The title of this paper is perhaps misleadingly all-embracing. We are going to concern ourselves with the properties of solid crystalline phases in which there exist conduction electrons and holes, the activities of which are dependent on the presence of imperfections and impurities in the crystal. We shall also be concerned with the equilibrium between such a solid phase and one or more surrounding phases, themselves solid, liquid, or gaseous, which by their presence determine the activities of impurity materials in the first phase. In short, we shall confine our attention to equilibrium states of assemblies in which at least one of the phases is a semiconductor.
The history of this subject goes back to the work of Wagner and Schottky in the 1930's. At the present time it is not clear whether we should regard it as a part of physics or a part of chemistry; certainly some of the materials to which the concepts have been applied are far removed, in the degree of understanding which we possess at the present time, from the elemental semiconductors germanium and silicon. In the literature of the past 30 years one finds a considerable body of information which could be included within the rather hazy title of the present paper; so considerable, in fact, that we shall necessarily have to be quite selective in our choice of topics.
What we plan to do is this. We shall start, in the next section, by considering the thermodynamics of systems containing constituents, some of which carry an electric charge, a subject not very adequately treated in most thermodynamics textbooks. The object of this section is to see what quantities can be defined in a rigorous thermodynamic sense, and to write down conditions for internal equilibrium and for equilibrium between phases. In Sec. II.B we turn to some statistical mechanical results, particularly those corresponding to “ideal dilute” solid solutions of imperfections and impurities in a semiconductor crystal; the principal objects of this section are the writing down of Fermi-Dirac distribution functions for describing the distribution of electrons among the various electron states, and the relating of the concentrations of the imperfections and impurities with which those states are associated to the absolute activities of the various substances in the thermodynamic system. Having laid the theoretical groundwork, we proceed in Sec. III to consider experimental techniques and results; in Sec. III.A, we discuss the available experimental tools and idealized experiments, and in III.B the experimental results themselves. Here, too, we shall be very selective; rather than try to make an exhaustive survey, we shall pick one or two systems for which the quality of the experimental information is outstanding, and discuss them in detail.
This is perhaps the right place to list the things that we shall not discuss in this paper. We shall omit any consideration of systems in which the only charged components are ions, such that free electrons or holes do not exist as distinguishable chemical entities; that is to say, systems in which we cannot separately control the activities of electrons or holes. Thus we shall not be talking about electrolytes or ionic conductors, neither shall we treat such things as the growth of oxide films, even though electronic conduction effects can under special circumstances be descried in these last.
We shall omit anything having specifically to do with the surface region, mainly because this subject is in a rather unsatisfactory state at the present time; there exists, for example, a very elegant treatment, due to Gibbs, of the thermodynamics of the surface region, but remarkably little use can be made of it in practice, especially where one of the bulk phases is solid. In addition, we shall restrict ourselves entirely to equilibrium states, in spite of the fact that some of the more interesting phenomena for which it is profitable to consider the electron as a chemical entity in the classical sense are precisely those for which the system is not in equilibrium: reactions occurring in or on the surface of a semiconductor crystal. For a general discussion of such processes, the reader is referred to a recent book of K. Hauffe (1).
II. THEORETICAL SECTION
A. The Thermodynamics of Systems Containing Charged Components
1. Definition of Electrochemical Potentials. The chemical potential of a neutral constituent of a phase is defined by the equation (2):
$$\mu_i = \left(\frac{\partial G}{\partial n_i}\right)_{T,P,n_{j \neq i}} \tag{1}$$
We ask now how to write down a similar equation for a constituent, the elementary particles of which carry an electrical charge. Consider the phase to be in the form of a sphere of radius $R$, and imagine that the surrounding space is occupied by some other phase, itself surrounded by a conducting Faraday cage, the linear dimensions of which are great in comparison with $R$. We make no requirements as to the properties of the second phase, except that it be a chemically homogeneous electrical insulator, and that the system as a whole be in thermodynamic equilibrium. We now wish to define, as far as is possible, the electrostatic potential difference $\Psi$ between the inner phase and the Faraday cage. If the inner phase were identical in chemical composition to the Faraday cage, we could attach a metallic wire to each, and regard the difference between the Fermi levels of the electrons in the two wires (which is what a voltmeter will indicate) as the difference in electrostatic potential between the two phases. If they are not of the same chemical composition, we cannot do this. Let us, however, imagine that we have some device (such as a vibrating reed) for measuring an electric field at any point in the outer phase, and that we determine in this way the potential difference between a point just outside the inner phase, and a point just inside the Faraday cage. This quantity we shall define as the electrostatic potential of the inner phase with respect to the Faraday cage. Since this definition will lead to a value of the defined quantity that depends, for example, on the work function of material forming the Faraday cage, we must regard it as arbitrary to the extent of an additive constant. Atomistically, we would prefer to relate this definition to the mean electrostatic potential (suitably averaged over a distance of the order of atomic dimensions) within the inner phase. Operationally, this cannot be done.
However, it is reasonable to suppose that the difference between the electrostatic potential just inside and just outside the surface of the inner phase will not be affected so long as the bulk composition and the composition of the surface dipole are themselves unchanged. We now show that, if $R$ is made large enough, the electrostatic potential of the inner phase can be changed by an arbitrarily large amount without upsetting the composition of the bulk and surface regions enough to change significantly the potential difference across the surface. For, the capacity of the inner phase is proportional to $R$, the surface area to $R^2$, and the volume to $R^3$; therefore, by making $R$ sufficiently large, the changes in charge per unit surface area and in charge per unit volume that are required to produce a given change in the electrostatic potential of the inner phase may be made as small as desired. We now turn to the question of the work required to transfer some differentially small number $dn_i$ of charged particles from the inner phase
to the Faraday cage. First, there will be a contribution depending only on the bulk chemical composition of the inner phase. This quantity we shall expect to be linear in $dn_i$. In addition, there will be a quantity that is proportional to the difference between the electrostatic potentials at a point just outside the inner phase and at a point just inside the Faraday cage; that is, to the electrostatic potential of the inner phase, as defined above. This quantity will be given by $\Psi q_i\,dn_i$, where $q_i$ is the number of electronic charges on each particle, and the energy is measured (as will be all energies in this paper) in electron volts. This second quantity, therefore, is also linear in $dn_i$. In addition, there will be a contribution to the work that depends on the provenance of the charged particles: the work will be somewhat different if we take them all from a region close to the surface, for example, from what it would be if we removed them uniformly from the volume. This term, however, will be proportional to $(dn_i)^2$, since the inhomogeneities in field that are set up within the inner phase by the removal of the $dn_i$ charged particles are linear in $dn_i$ (from Poisson's equation), while the electrostatic energy density associated with the electric field so set up is proportional to the square of the field at each point. Since we are interested only in the first order changes in the thermodynamic quantities, it will therefore not be necessary to specify in what way the charged particles are to be removed from the inner phase. On the basis of the foregoing discussion, we may write the following pair of equations:
$$\mu_i = \mu_i^0 + q_i \Psi. \tag{3}$$
Equation (2) is the definition of the quantity $\mu_i$, which is called the electrochemical potential of the component $i$. Equation (3) shows how this quantity varies when the electrostatic potential of the phase in question (in the sense defined above) is varied. The quantity $\mu_i^0$ appearing in Eq. (3) is often called the “chemical potential” of the charged component, as representing that part of $\mu_i$ that depends only on the “chemical” properties of the component in the phase in question; this is somewhat misleading, since it is impossible to distinguish the “chemical” and “electrical” properties of a charged particle. It is better to regard $\mu_i^0$ as being merely the value that $\mu_i$ assumes when $\Psi$ (defined with respect to some particular reference material) is zero.
2. Thermodynamic Relations and Conditions for Equilibrium. Once the electrochemical potentials of the charged components have been defined, the thermodynamics of a system containing such components can be set
up in the usual way. The equation giving the change in the Gibbs free energy of a phase is (3):
$$dG = -S\,dT + V\,dP + \sum_i \mu_i\,dn_i, \tag{4}$$
where $S$ is the entropy of the phase, $V$ is its volume, the $n_i$ are the numbers of particles of the various constituents in the phase, $T$ is the temperature, and $P$ the pressure. If now certain of the constituents can react together, then, for the phase to be in a state of internal equilibrium, not all of the differentials shown in the above equation can be independent. For the Gibbs free energy must be a minimum with respect to a small displacement, at constant temperature and pressure, of each chemical equilibrium from the existing state of the system. Each chemical equilibrium may be written in the general form:
$$\sum_i z_i n_i = 0, \tag{5}$$
where the $z_i$ are integers, so that, for internal equilibrium to exist:
$$\sum_i z_i \mu_i = 0. \tag{6}$$
Each such equation may then be used to eliminate one of the $dn_i$'s from the right-hand side of Eq. (4). When this process has been finished, the number of $dn_i$'s remaining is called the number of components of the system; the choice of which particular constituents we call components is of course arbitrary. The number of degrees of freedom of the phase is then $(c + 2)$, where $c$ is the number of components, or $(c + 1)$ if we exclude from consideration the extent of the phase. Where charged components are involved, we must also look at the question of the electrostatic field. Unless there is space-charge neutrality at every point, differences in potential will exist, and the concentrations of the various components will vary from point to point. Let us therefore define the concentration $\gamma_i$ of the $i$th component at each point by means of the equation:
$$\gamma_i = \lim_{V \to 0} (n_i/V) \tag{7}$$
and write, making use of Poisson’s equation:
where $\epsilon$ is the dielectric constant and $\epsilon_0$ the permittivity of free space. Thus, if arbitrary assignments are made of the $(c + 1)$ thermodynamic variables, and the equation of state is known, the electrostatic potential distribution is completely determined. In actual fact, of course, a very small departure from space-charge neutrality is sufficient to set up enormous potential differences in the phase. Unless, therefore, we are interested in such cases it will be sufficient to write:
$$\sum_i q_i \gamma_i = 0, \tag{9}$$
which will hold everywhere except in the immediate vicinity of the surface. Equation (9) sets one more subsidiary condition on the system, and may itself be used to eliminate one more of the $dn_i$'s from the right-hand side of Eq. (4). Thus, with the condition of space-charge neutrality, the number of degrees of freedom is $c$ instead of $(c + 1)$. This result is not, of course, in disagreement with what one would find by applying the phase rule without considering the possible existence of charged constituents at all; the number of components is now one higher, so that the number of degrees of freedom is the same.
3. Equilibrium between Phases. Let us consider the conditions for equilibrium between a number of phases, each one having the property that within it there is substantial space-charge neutrality. The ordinary conditions for equilibrium between any two phases are:
$$P' = P'', \qquad T' = T'', \qquad \mu_i' = \mu_i'', \tag{10}$$
where the superscripts label the phases. For $\mathcal{P}$ phases there are $(\mathcal{P} - 1)(c + 2)$ such relations. For each phase there are $c$ degrees of freedom, as concluded in the preceding section; but in addition we can arbitrarily fix the electrostatic potential of each phase, as explained in Sec. 1. To take account of this, let us pick some charged component (say that corresponding to $i = 1$), and allow the uniformity of its electrochemical potential throughout the system to be achieved merely by flow of surface charge. There are $\mathcal{P}$ electrostatic potentials, and only one condition to be satisfied: to wit, that, during the flow of charge, the total charge of the system be conserved. We thus have $(\mathcal{P} - 1)$ variables at our disposal, which may be used, if we wish, to eliminate all of the $(\mathcal{P} - 1)$ equations $\mu_1' = \mu_1''$. There then remain $(\mathcal{P} - 1)(c + 1)$ conditions on the equilibrium of the heterogeneous system. Since, however, there would be $\mathcal{P}c$ degrees of freedom if all of the phases were independent, we arrive at the phase rule for systems involving charged components, giving the number of degrees of freedom $\mathcal{F}$:
$$\mathcal{F} = c - \mathcal{P} + 1. \tag{11}$$
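The counting argument that leads to Eq. (11) can be checked mechanically. The following is a minimal sketch in Python; the function name and argument names are illustrative choices and do not appear in the original text. It counts the independent variables and the equilibrium conditions exactly as in the paragraph above and returns their difference.

```python
def degrees_of_freedom(components: int, phases: int) -> int:
    """Degrees of freedom of a heterogeneous system with charged components.

    Follows the counting in the text; the result equals c - P + 1 (Eq. 11).
    The names `components` (c) and `phases` (P) are illustrative.
    """
    if components < 1 or phases < 1:
        raise ValueError("need at least one component and one phase")

    # With space-charge neutrality imposed, each phase has c degrees of freedom.
    independent_variables = phases * components

    # Equality of pressure, temperature and the c electrochemical potentials
    # between phases gives (P - 1)(c + 2) relations, Eq. (10).
    raw_conditions = (phases - 1) * (components + 2)

    # The (P - 1) freely adjustable electrostatic potentials remove the
    # (P - 1) relations for the chosen charged component (i = 1).
    conditions = raw_conditions - (phases - 1)

    return independent_variables - conditions


if __name__ == "__main__":
    # Three components in two phases.
    print(degrees_of_freedom(3, 2))  # prints 2
```

For any admissible pair the result agrees with $c - \mathcal{P} + 1$, which is simply the algebraic simplification of $\mathcal{P}c - (\mathcal{P} - 1)(c + 1)$.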
By way of illustration, let us consider the equilibrium existing between a crystal of germanium containing arsenic and the vapor phase. The number of components is 3: we may pick, for example, germanium, arsenic, and electrons, and the concentrations of the other constituents present (holes, arsenic ions in the vapor phase, etc.) are thereby determined. From Eq. (11), the number of degrees of freedom is 2. We may fix, for example, the temperature and the partial pressure of the arsenic, and the system is then completely defined. Of course, we could have got this result from the ordinary phase rule, without thinking of charged constituents at all; again we have increased the number of components by 1 over what we would write in the usual way. But our present formulation allows us to make some further conclusions, which would not be obvious from the conventional treatment. For example, it is clear from Eq. (11) that it would be perfectly permissible to pick the temperature and the electron concentration as independent variables, instead of the temperature and the partial pressure of the arsenic.
One more thermodynamic result must be mentioned before we close this section. To the extent that we may regard impurities present in a solid semiconducting phase as ideal dilute solutions, it is possible to write:
$$\mu_i = \mu_{i0} + kT \ln(\gamma_i/\gamma_{i0}) \tag{12}$$
for uncharged constituents, and
$$\mu_i^0 = \mu_{i0}^0 + kT \ln(\gamma_i/\gamma_{i0}) \tag{13}$$
for charged ones. The assumption, however, that the ideal dilute solution approximation is a good one has to be investigated in detail, by statistical mechanical arguments, in each case.
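As a numerical illustration of the ideal dilute form, the sketch below evaluates the logarithmic concentration term and then adds the electrostatic term $q_i\Psi$ of Eq. (3). It is a minimal sketch under stated assumptions: the function names, the reference values, and the sample concentrations are hypothetical, and energies are taken in electron volts as in the text.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K (CODATA value)


def chemical_part(mu_ref: float, gamma: float, gamma_ref: float,
                  temperature: float) -> float:
    """Ideal dilute form: mu_ref + kT ln(gamma / gamma_ref), in eV.

    `mu_ref` and `gamma_ref` are hypothetical reference values chosen by
    the caller; only the ratio gamma / gamma_ref matters.
    """
    if gamma <= 0.0 or gamma_ref <= 0.0:
        raise ValueError("concentrations must be positive")
    return mu_ref + K_B_EV * temperature * math.log(gamma / gamma_ref)


def electrochemical_potential(mu_ref: float, gamma: float, gamma_ref: float,
                              temperature: float, charge_number: int,
                              psi: float) -> float:
    """Eq. (3) with the ideal dilute form for the chemical part.

    `charge_number` is the number of electronic charges per particle (q_i)
    and `psi` the electrostatic potential of the phase in volts, so that
    q_i * psi is already an energy in eV.
    """
    return chemical_part(mu_ref, gamma, gamma_ref, temperature) \
        + charge_number * psi


if __name__ == "__main__":
    # Shift produced by a tenfold increase in concentration at 300 K.
    shift = (chemical_part(0.0, 1e17, 1e16, 300.0)
             - chemical_part(0.0, 1e16, 1e16, 300.0))
    print(f"kT ln 10 at 300 K = {shift:.4f} eV")
```

The point of the example is only that, within this approximation, each decade of concentration changes the potential by the same amount, $kT \ln 10$, whatever the species.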
B. Statistical Mechanics
1. General Remarks. Pure thermodynamics says the following. Once we know the equation of state:
we are in a position, by mathematical manipulation, to derive all the thermodynamic properties of the system. Here mathematics ends and physics must begin. In principle, if we can calculate the energies of all possible configurations of an $N$-body system, for all values of $N$, we can set up the grand partition function and use it to derive the equations of state. Of course in general this cannot be done. Of the things that can be done, we shall select for discussion only those that are of particular interest in connection with topics to be discussed later in this paper. These are: (1) the calculation of the distribution of electrons among the allowed one-electron
states of a solid, and among the private energy levels of isolated centers (impurities and imperfections) present in the solid; (2) the calculation of the relation valid in the absence of interaction effects between the concentration of some electronically active constituent present in a solid and its absolute activity; (3) the treatment of interaction effects. We discuss these topics in the ensuing sections.
2. The Distribution of Electrons among States of a Nearly Perfect Crystal. We shall discuss this topic quite briefly, since it is treated in a number of standard texts (4). The allowed energy levels of a crystal fall into bands which may or may not overlap. In a semiconductor, there exists a gap between the highest band that, at the absolute zero, is completely full, and the band lying next above, which is completely empty. At a nonzero temperature, some small number of electrons will be excited into the empty (conduction) band, and some small number of holes will be left behind in the full (valence) band; if there are no electrically active impurities or imperfections in the semiconductor, the density of electrons will equal the density of holes; otherwise this may not be the case. The distribution of electrons and holes among the allowed levels at some temperature $T$ will then be given by the expressions obtained from Fermi-Dirac statistics. This way of describing the crystal will be good so long as the number of holes and electrons is not so large that the one-electron band approximation breaks down. The Fermi-Dirac distribution function may be written:
$$f_i = g_i / [1 + \exp\{-(E_F - E_i)/kT\}], \tag{15}$$
where $E_i$ is the energy level of the $i$th state, $g_i$ is its statistical weight, and $E_F$ is the energy corresponding to the Fermi level. This quantity, as in any electronic system, is equal (apart from a change of sign and the usual arbitrary additive constant) to the electrochemical potential for electrons, as defined in Eq. (2). In the case that the Fermi level lies substantially below the bottom of the conduction band and substantially above the top of the valence band, Eq. (15) leads to the expressions:
$$n = N_c \exp\{-(E_c - E_F)/kT\}, \tag{16}$$
$$p = N_v \exp\{-(E_F - E_v)/kT\}, \tag{17}$$
where $E_c$ and $E_v$ stand for the energy values for the conduction and valence band edges, and $N_c$ and $N_v$ are the “effective densities of states” near the edges of the two bands, quantities that vary only slowly with the temperature and depend on the principal values of the effective mass tensor. Where there are impurities or imperfections in the crystal, one proceeds as follows. Consider one singly-ionizable donor atom in an otherwise perfect semiconductor crystal. It will then be possible to distinguish bound states, in which the extra electron is well localized in the vicinity of the
donor atom, and free states, in which the electron can be well described by free-running wave functions not spatially associated with the donor at all. (In both cases, of course, one can proceed by forming linear combinations of conduction band Bloch functions; the difference lies in the presence or absence of spatial correlation with the donor atom.) We now determine how much the energy of the whole crystal changes when an electron is excited from the lowest of the bound states to the lowest of the free-running states; more precisely, the change in Gibbs function of the crystal when such an excitation is done at constant temperature and pressure. This quantity is called the ionization energy of the donor; we indicate it in a one-electron band scheme by drawing a line parallel to the conduction band edge and separated from it by the ionization energy. Where now we have a certain density of such donor atoms present in a crystal, not so high that interaction effects occur, the over-all distribution function for the occupancy of the bound states factors into the distribution functions for the individual donor atoms, which may themselves be written down by applying to the private energy levels of one donor atom the conclusion of Fermi-Dirac statistics. The condition for equilibrium is then that the Fermi level for the donor system shall coincide with that of the host crystal. The simplest case is that the donor be singly-ionizable, that the ground state be nondegenerate, except for the twofold degeneracy associated with the electron spin, that no higher bound states exist, and that the ionization energy of the donor be independent of temperature. This case is handled in many textbooks and leads to the result (5):
$$f = 1/[1 + \tfrac{1}{2} \exp\{-(E_F - E_D)/kT\}],$$
or
$$N_D^+/N_D = \tfrac{1}{2} \exp\{-(E_F - E_D)/kT\}, \tag{18}$$
where $f$ is the electronic occupancy factor (the occupancy of the bound state), $E_D$ is the “value of the donor level,” that is, the energy value corresponding to the above-mentioned line in the one-electron scheme, and $N_D^+$ and $N_D$ are the densities of ionized and neutral donors. The expression for a singly-ionizable acceptor satisfying the same conditions is analogous. One other case that is of some practical importance is that of a doubly-ionizable donor. Since this case is not discussed in most texts, it is perhaps worth taking the space to do so here. Let $N_D$ stand for the density of un-ionized donors, $N_D^+$ for the density of those donors that have lost one electron, and $N_D^{++}$ for the density of those that have lost two electrons. The singly-ionized state has a statistical weight of two, because of the two choices of spin; the others have a weight of one. Thus:
$$N_D^+/N_D = 2 \exp\{-(E_F - E_1^*)/kT\}, \tag{19}$$
$$N_D^{++}/N_D^+ = \tfrac{1}{2} \exp\{-(E_F - E_2^*)/kT\}, \tag{20}$$
where $E_1^*$ and $E_2^*$ are the energies corresponding to the first and second ionizations; that is to say, $(E_c - E_1^*)$ and $(E_c - E_2^*)$ are the ionization energies that would be observed optically (provided that there is no Franck-Condon shift). Thus:
a result which can also be obtained by going back to first principles, and considering the number of configurations of the system with given numbers of electrons arranged in a specified way. It is worth noting at this point that, if the ionization energy varies with the temperature, the temperature coefficient will also appear in the statistical expressions. For simple group V donors or group III acceptors in germanium or silicon, the ionization energy is given to a first approximation by the “hydrogen-like” model, and is in general independent of temperature; but there is no reason why this should always be the case.
3. The Chemical Potential of an Electrically Active Constituent. Equation (12) describes the dependence of the chemical potential of some neutral impurity on the concentration, on the basis of the “ideal dilute solution” model. If the impurity is electrically active, one must use the Fermi-Dirac expressions, as described in the preceding section, to find the ratio of ionized to un-ionized densities; the density of un-ionized centers will then be given by Eq. (12). The validity of Eq. (12) for solutions in the solid state has to be explored for each type of center to which we wish to apply those equations. So long as the concentration of centers is low, so that interactions between the centers are small and the disturbance of the host lattice that is produced by the introduction of a center is independent of the existence of other centers, we would expect the chemical potential to depend primarily on the entropy of mixing. The derivation of Eq. (12) on this basis, first given many years ago by Schottky and Wagner (6), is as follows. If we have $N$ sites at which the center can appear, and $N'$ centers, the number of distinguishable arrangements is $N!/[(N - N')!\,N'!]$. If each introduction is associated with an energy $E$, the change in the energy of the system associated with $N'$ introductions is $EN'$.
With this information, one sets up the partition function and uses it to evaluate the Helmholtz free energy, making use of Stirling’s approximation in the usual way. On differentiation with respect to N‘, Eq. (12) then falls out. Not much physics has gone into this calculation, so the result will not be conspicuously dependent on the physical properties of the center; really the calculation is nothing more than the one usually given for a mixture of two perfect gases, except that the number of sites is finite.
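The counting argument can be checked numerically. The following is a minimal sketch rather than the original derivation: it assumes the configurational free energy $F(N') = EN' - kT \ln\{N!/[(N-N')!\,N'!]\}$, takes the exact chemical potential as the difference $F(N'+1) - F(N')$, and compares it with the Stirling result $E + kT\ln[N'/(N-N')]$, which reduces to $E + kT\ln(N'/N)$ when $N' \ll N$. Since Eq. (12) is not reproduced in this section, its identification with that dilute form is an assumption; the function names and the numerical values are illustrative only.

```python
import math

K_B_EV = 8.617e-5  # Boltzmann's constant in eV per kelvin


def free_energy(n_centers, n_sites, energy_ev, temperature_k):
    """Helmholtz free energy F = E*N' - kT*ln(W) for N' centers on N sites (eV)."""
    # ln W for W = N! / [(N - N')! N'!], evaluated exactly through log-gamma
    ln_w = (math.lgamma(n_sites + 1)
            - math.lgamma(n_sites - n_centers + 1)
            - math.lgamma(n_centers + 1))
    return energy_ev * n_centers - K_B_EV * temperature_k * ln_w


def mu_exact(n_centers, n_sites, energy_ev, temperature_k):
    """Chemical potential as the free-energy cost of adding one more center."""
    return (free_energy(n_centers + 1, n_sites, energy_ev, temperature_k)
            - free_energy(n_centers, n_sites, energy_ev, temperature_k))


def mu_stirling(n_centers, n_sites, energy_ev, temperature_k):
    """Result of differentiating F after applying Stirling's approximation."""
    return energy_ev + K_B_EV * temperature_k * math.log(
        n_centers / (n_sites - n_centers))


def mu_dilute(n_centers, n_sites, energy_ev, temperature_k):
    """Ideal-dilute limit (N' << N): mu = E + kT ln(N'/N)."""
    return energy_ev + K_B_EV * temperature_k * math.log(n_centers / n_sites)


# Illustrative values (assumed): 10^6 sites, 10^3 centers, E = 1 eV, T = 1000 K
args = (1000, 10**6, 1.0, 1000.0)
print("exact    :", mu_exact(*args))
print("Stirling :", mu_stirling(*args))
print("dilute   :", mu_dilute(*args))
```

The three values agree closely at this dilution; the only physics that enters is the energy $E$ and the number of ways of placing the centers.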
THE ELECTRON AS A CHEMICAL ENTITY
The validity of the assumptions behind the Schottky-Wagner treatment has been investigated by Reiss (7) in one particular case: that of a simple group V donor in an elemental group IV semiconductor. Reiss considers in detail the consequences of adding a group V atom to a lattice of such a semiconductor. These are: (1) new valence states, similar to, but not identical with, those that would be created by adding instead a host crystal atom, are brought into existence; (2) new states corresponding to the conduction band states are added; of these, we only need consider the lowest, which is the donor level; (3) the vibrational partition function is perturbed in a way peculiar to the particular donor added; (4) the heat of formation of the crystal is changed; (5) the number of extra electrons added to the crystal is one more than would be the case if we had added a host atom. The chemical potential of the donor atom is next related to the value the Helmholtz free energy would have if every atom were at its exact lattice site, on the assumption that the coupling between the electron and phonon systems is not too strong. The change in this quantity produced by the addition of new energy states and additional electrons is then written down, and shown to lead directly to Eq. (12), provided that the density of donors is not too high. The logarithmic term arises again, of course, from an entropy of mixing; the point of the calculation lies in the fact that, when we take properly into account the way in which the center adds new electronic states to the crystal, we get the same answer as we would if we regarded the center simply as a red ball added to a lattice of blue balls. Longini and Greene (8) have also discussed this problem, and Brebrick (9) has extended the argument to anion and cation vacancies. Other imperfections, such as dislocations, do not seem yet to have been discussed.

4. Mass-Action Laws and Their Limitations.
Consider the case of a semiconductor crystal containing only donors. We can, if we wish, regard the distribution of electrons between donor and conduction band states as being described by the chemical equilibrium:

$$D \rightleftharpoons D^+ + n \qquad (22)$$

for which the condition of equilibrium is

$$\lambda_D = \lambda_{D^+}\,\lambda_n \qquad (23)$$

where the $\lambda$'s are the absolute activities. Now, if ideal dilute solution considerations apply to all three entities:

$$N_D/(N_{D^+}\,n) = \text{const.} \qquad (24)$$
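The constancy asserted in Eq. (24) rests on Boltzmann (nondegenerate) statistics for the conduction electrons. The sketch below illustrates this under stated assumptions: a single donor level with spin degeneracy 2, a parabolic band, and an electron density written as $n = N_c F_{1/2}(\eta)$, where $\eta = (E_F - E_c)/kT$ and $F_{1/2}$ is the Fermi-Dirac integral normalized so that $F_{1/2}(\eta) \to e^{\eta}$ for $\eta \ll 0$. The helper names and the donor depth of $0.4\,kT$ are illustrative and do not come from the text.

```python
import math


def fermi_dirac_half(eta, n_intervals=4000):
    """F_1/2(eta) = (2/sqrt(pi)) * integral_0^inf sqrt(x) / (1 + exp(x - eta)) dx.

    The substitution x = t**2 removes the square-root singularity at x = 0;
    the integral is then evaluated by Simpson's rule on a finite interval
    (n_intervals must be even).
    """
    t_max = math.sqrt(max(eta, 0.0) + 40.0)  # integrand is negligible beyond this
    h = t_max / n_intervals

    def integrand(t):
        return 2.0 * t * t / (1.0 + math.exp(t * t - eta))

    total = integrand(0.0) + integrand(t_max)
    for i in range(1, n_intervals):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return (2.0 / math.sqrt(math.pi)) * total * h / 3.0


def mass_action_quotient(eta, donor_depth_over_kt):
    """N_c * N_D / (N_D+ * n) for a donor lying donor_depth (units of kT) below E_c."""
    # Occupancy ratio of a spin-degenerate donor: N_D/N_D+ = 2 exp[(E_F - E_D)/kT]
    neutral_over_ionized = 2.0 * math.exp(eta + donor_depth_over_kt)
    n_over_nc = fermi_dirac_half(eta)
    return neutral_over_ionized / n_over_nc


def boltzmann_limit(donor_depth_over_kt):
    """Value of the same quotient when n = N_c exp(eta), i.e. nondegenerate."""
    return 2.0 * math.exp(donor_depth_over_kt)


for eta in (-10.0, -4.0, 0.0, 2.0):
    ratio = mass_action_quotient(eta, 0.4) / boltzmann_limit(0.4)
    print(f"eta = {eta:5.1f}   quotient / Boltzmann value = {ratio:.4f}")
```

The ratio stays close to unity while the Fermi level lies several $kT$ below the band edge and rises above unity as $\eta$ approaches and passes zero, which is the sense in which the mass-action "constant" ceases to be constant.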
C. G . B. GARRETT
On the basis of the discussion of the preceding sections, however, we can offer a more rigorous derivation of Eq. (24), which brings out one important point (7). Using Eqs. (16) and (18), we get:

$$\frac{N_D}{N_{D^+}\,n} = \frac{2}{N_c}\,e^{(E_c - E_1)/kT} \qquad (25)$$
so long as the Fermi level lies sufficiently far beneath the conduction band edge for Eq. (16) to be valid. Thus the mass-action relation holds only so long as the conduction band statistics are nondegenerate.

5. Interaction Effects. The “ideal dilute solution” approximation will be expected to break down at high concentrations. An understanding of the behavior of the system under these circumstances is an N-body problem, and so far not much progress has been made, in most cases, in tackling it. From experiments on germanium, we know that, at high donor or acceptor concentration, an “impurity band” is formed, so that it is no longer proper to talk of the private energy levels of an individual impurity. One interaction effect that is not necessarily associated with high concentrations is that of ion pairing. Where an ionized donor and an ionized acceptor are both present in the same crystal, where at least one of them is free to move, and where the temperature is not so high that $kT$ is large in comparison with the Coulomb energy at the distance of closest approach, association can occur. Reiss has discussed this problem by using the techniques of the Bjerrum-Fuoss theory of ion-pairing in solution (10).

III. EXPERIMENTAL SECTION
A. Methods

1. Experimental Techniques. The techniques for the study of chemical equilibria involving conduction electrons and holes are the same as those used in the study of ordinary chemical equilibria: the bringing of a system into a state of equilibrium under specified thermodynamic conditions, followed by the determination of the composition of the equilibrium state. There are, however, two complications. First, it is necessary to have methods for measuring the concentrations of the electrons and holes themselves; second, one must constantly bear in mind that for many of the systems under discussion, the concentrations of the impurities in the range of greatest interest (or at any rate of greatest theoretical tractability) are quite exceptionally low by the standards of ordinary analytical chemistry. A measurement of the conductivity does not suffice to determine the concentration of holes or of electrons, unless the carrier mobility is already known. To determine both the carrier concentration and the mobility, one must make some other measurement as well. The best is the Hall effect,
which will usually be satisfactory so long as the mobility is of the order of 1 cm$^2$ volt$^{-1}$ sec$^{-1}$ or greater. Where the Hall effect is too small to detect, one can use the thermoelectric properties, in particular the Seebeck coefficient; in high-mobility materials, however, this may be complicated by phonon-drag effects. Where both holes and electrons are present, it is usually necessary to make measurements over a range of temperature, and consider rather more systematically the predictions of the statistical theory (see Sec. II.B). It goes without saying that measurements of the sort discussed in this paragraph are really reliable only when carried out on single crystals. The difficulties arising from the low impurity concentrations which it is often necessary to determine can be solved in various ways. Tracer techniques are sometimes useful. More usually one avoids the whole difficulty in the following way. By adding a known amount of some impurity to a crystal, under conditions such that the other thermodynamic variables are well defined, one can determine the electrical properties of the system when a known density of impurity atoms is present; then, in a separate experiment, one can determine what absolute activity of this impurity substance is required to reproduce the same effects. Where the number of independent thermodynamic variables is at all large, this procedure is clearly very time-consuming, and has been done only in comparatively few cases. There are, however, numerous experiments having to do, for example, with simple departures from stoichiometry, in which the dependence of the electrical properties of the crystal on the pressure of one of the atomic constituents in the gaseous state has been explored.

2. Idealized Experiments.
In order to make a complete study of some system, it would be necessary to determine the compositions of equilibrium configurations (including the concentrations of holes and electrons) for all sets of values of the independent thermodynamic variables. From such a study, one could obtain, as a function of temperature and pressure, values for the various mass-action equilibrium constants, and could then set about interpreting such of those constants as involve electrons or holes in terms of the one-electron band scheme for the semiconductor. Measurements at high electron or hole concentrations would then reveal departures from the mass-action laws, from which one could in principle learn something about interaction effects. A practical difficulty in doing this is the length of time required to reach equilibrium. In most semiconducting materials, diffusion of substitutional impurities tends to be a slow process at temperatures considerably below the melting point, especially where diffusion occurs by a vacancy mechanism; where the impurity is present on interstitial sites, on the other hand, diffusion usually can proceed much faster. Examples of the latter are
interstitial zinc in zinc oxide and lithium in germanium or silicon. To study chemical equilibria involving both electrons or holes and heavy particles, therefore, one must either restrict oneself to interstitial impurities or carry out the experiments at quite high temperatures. The disadvantage of performing the experiments at high temperatures is that, all too often, the intrinsic carrier concentration is then so high that it is quite insensitive to the presence of the impurities which are the object of the investigation. Under these circumstances, the electrons can greatly affect the equilibrium concentration of the impurity, but the impurity can hardly affect the electrical properties of the crystal. If, however, we quench the crystal, taking care to do so in a time so short that the distribution of heavy particles is not appreciably disturbed, we achieve a state of frozen equilibrium, in which the distribution of heavy particles is that corresponding to equilibrium at the high temperature, while the distribution of electrons and holes among the electronic states is determined by the Fermi-Dirac statistics appropriate to the final temperature. If, at this final temperature, the intrinsic carrier concentration is low in comparison with that of the various impurities present, we shall have in the electrical properties at that temperature an extremely sensitive tool for studying the equilibrium composition at the initial temperature. Good examples of this technique are to be found in the study of zinc oxide (Sec. III.B.3). In setting up idealized experiments, it is necessary to consider what impurities might be of interest and what other imperfections may occur in the crystal. In general, all single imperfections can be classified into vacancies, interstitials, and substitutionals; in addition we can have associated groups of two or more such imperfections, and, of course, any of these things can be in any one of several states of electronic charge.
We shall denote interstitials by the symbol $I$, vacancies by $V$, with a subscript indicating the atom that has been inserted or removed; a substitutional will be indicated by the atomic symbol for the atom that has been introduced, with, where necessary, a subscript showing the atom that it has replaced. A vacancy is regarded as uncharged if it was formed by removing a neutral atom, regardless of the local electronic configuration that may subsequently prevail; if, however, an electron is at a later stage removed from the vicinity of the vacancy into one of the conduction band states of the crystal, the vacancy will be regarded as having a charge of +1 electronic unit.
B. Experimental Results

1. General Survey. The distinction between quality and quantity is nowhere more evident than in the literature having to do with equilibria involving holes and electrons in the solid state. There are, unfortunately, rather a large number of semiconductors, and there is a vast mass of experimental information to be found in the journals, most of it obtained with polycrystalline samples of not particularly impressive purity. For the reader who wishes comprehensive information, we suggest a study of the above-cited book by K. Hauffe (1). Usually, the sum of the information on some particular semiconductor consists of a measurement of the conductivity (sometimes also the thermoelectric Seebeck coefficient, occasionally also the Hall effect) of a polycrystalline sample, as a function of the pressure of some gas surrounding the sample. If the semiconductor is a compound, and the gas one of the elements in that compound, the “excess” of the element (whether due to interstitial atoms of that element or vacancies of another) should then depend in a known way on the pressure of the gas, depending on the stoichiometric composition of the crystal and on the number of atoms in each molecule of the gas. If, then, the results can be made to yield an estimate of the electron or hole density (either by combining two of the possible electrical measurements, or simply by assuming that the mobility is a constant), and if one makes some assumptions as to the number of ionizations which the interstitial or vacancy center can suffer, and if, finally, one makes the assumption that the ionization is more or less complete at the temperature of the experiment, one is in a position to check the experimental pressure dependence. The number of “ifs” in the above sentence may suggest to the reader that this procedure is not in most cases likely to be a particularly convincing one, unless there is independent corroborative evidence. To discuss in detail all of the known semiconductors, where most of the information is only of the quality indicated above, would be an unrewarding task. What we shall do instead is to pick one or two cases in which sufficient information is available to furnish a fairly complete account of the role of intrinsic imperfections and impurities.
An obvious pair of choices would be germanium and silicon, which are still the best understood semiconductors; these we shall discuss in the next section. However, if we restrict our attention to elemental semiconductors, we shall not be able to illustrate the role of electrons and holes in determining departures from stoichiometry and vice versa. Probably the next best understood semiconductors are the III-V compounds, but significant departures from stoichiometry in these are rare. Turning to the II-VI and IV-VI compounds, we would probably find the most complete examples of chemical equilibria involving electrons and holes if we talked about lead sulfide and cadmium sulfide. It happens, however, that several excellent reviews on these materials have recently appeared (11). Cuprous oxide has been discussed by Bloem (12), who has attempted to piece together the most reliable experimental information in such a way as to give a consistent picture of the various equilibria involving “intrinsic” imperfections and impurities. The most profitable thing for us
to do seems to be to take zinc oxide (another compound semiconductor for which good information is now available) and attempt to do for it what Bloem has done for cuprous oxide. This will be the subject of the last section of the paper.

2. Germanium and Silicon. We begin by considering separate equilibria and follow this up with a mention of one or two interesting cases where the individual mass-action relations interact. The first mass-action relation to attract our attention is:

$$np = N_c N_v \exp\{-(E_c - E_v)/kT\} \qquad (26)$$
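As a numerical illustration of Eq. (26), the sketch below evaluates the intrinsic concentration $n_i = (N_c N_v)^{1/2} \exp[-(E_c - E_v)/2kT]$ and then, for a fully ionized donor density $N_D$, solves the neutrality condition $n = p + N_D$ together with $np = n_i^2$. The parameter values are round numbers of the order appropriate to germanium near room temperature; they are assumptions chosen for illustration, not data from the text, and the function names are hypothetical.

```python
import math

K_B_EV = 8.617e-5  # Boltzmann's constant in eV per kelvin


def intrinsic_concentration(n_c, n_v, gap_ev, temperature_k):
    """n_i from Eq. (26) with n = p: n_i = sqrt(N_c N_v) exp(-E_gap / 2kT)."""
    return math.sqrt(n_c * n_v) * math.exp(-gap_ev / (2.0 * K_B_EV * temperature_k))


def carriers_with_donors(n_i, n_donor):
    """Solve n = p + N_D and n*p = n_i**2 for fully ionized donors."""
    n = 0.5 * n_donor + math.sqrt(0.25 * n_donor**2 + n_i**2)
    p = n_i**2 / n
    return n, p


# Assumed, order-of-magnitude inputs (cm^-3, eV, K)
N_C, N_V, GAP_EV, T_K = 1.0e19, 5.0e18, 0.66, 300.0

n_i = intrinsic_concentration(N_C, N_V, GAP_EV, T_K)
print(f"n_i = {n_i:.3e} cm^-3")
for n_donor in (0.0, 1.0e13, 1.0e16):
    n, p = carriers_with_donors(n_i, n_donor)
    print(f"N_D = {n_donor:.1e}: n = {n:.3e}, p = {p:.3e}, n*p/n_i^2 = {n * p / n_i**2:.6f}")
```

The product $np$ is unchanged by the doping, while the individual concentrations shift; the second case, with $N_D$ comparable to $n_i$, is the regime in which both carriers contribute appreciably to the measured quantities.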
which follows from Eqs. (16) and (17). This is perhaps one of the best established mass-action relations in solid state chemistry. Its validity has been confirmed by conductivity and Hall effect measurements on samples of germanium and silicon of various impurity concentrations (13). In the temperature range just below that at which the sample becomes intrinsic, the concentrations of holes and electrons are comparable, so that both measured quantities are appreciably dependent on the minority as well as the majority carrier concentration. Under conditions where one carrier concentration greatly exceeds the other, one can point to evidence from diode and transistor measurements; the minority carrier concentration can, for example, be determined by measuring the saturation current of a diode and at the same time determining the minority carrier lifetime. The temperature dependence of the quantity on the right-hand side of Eq. (26), both as explicitly indicated and through the temperature dependence of the quantities $N_c$ and $N_v$, is used to determine the quantity $(E_c - E_v)$; the value obtained is consistent with optical measurements of the bandgap, for transitions in which, through the cooperation of a phonon, the electron goes from the valence band maximum to the conduction band minimum. The pressure dependence of the right-hand side of Eq. (26) has been measured, and related to the deformation potentials (13). The departures from Eq. (26) occurring at high electron or hole concentrations have also been studied; the concentration at which departures occur is itself a function of the temperature, so that one speaks of a “degeneracy temperature” for a sample of given impurity concentration. In practice the phenomenon of degeneracy usually occurs under conditions where impurity band effects, which we shall discuss below, have already set in. The mass-action relations for singly ionizable donors and acceptors [Eq.
(25) and its analog] in germanium and silicon have also been subjected to exhaustive experimental study. The results are described by quoting the experimentally obtained donor and acceptor ionization energies, which are to be found tabulated in various places in the literature (13). Just as for the intrinsic hole-electron equilibrium, however, the experiments represent a quantitative check on Eq. (25) only when $f$ is substantially different from either 0 or 1, that is, when the Fermi level is close to the donor or acceptor level. Where the donor or acceptor level lies a considerable way away from the band edges, however (as is true, for example, for most of the multiply-ionizable donors and acceptors), it is possible to fix the Fermi level by means of some other impurity, present to excess, and to study both the equilibrium occupancy and the kinetics of filling of the deep state.

At high donor or acceptor concentrations, the mass-action laws break down, because of the appearance of the impurity band phenomenon. As the donor or acceptor electron wave functions begin to overlap, it ceases to be apt to speak of private energy levels for the individual impurity atoms. It is known experimentally that the individual donor or acceptor levels broaden into a band, which eventually merges with the conduction or valence band as the case may be. The analysis of the experimental data is then complicated by the fact that conduction can occur within the impurity band as well as in the conduction or valence band (14).

The other class of mass-action relations which has been the object of experimental study in germanium and silicon consists of that describing the equilibrium between some donor or acceptor substance in the crystal and in some external phase. Unfortunately, experiments on equilibrium between a germanium or silicon crystal and a gas phase are difficult to carry out, and there is nothing particularly illuminating on this subject in the literature. Experiments on equilibrium with a liquid phase have been done, but these suffer from the drawback that the thermodynamics of the liquid phase is often far from ideal. Thurmond and Logan (15) have studied the distribution of copper between a crystal of germanium and a liquid phase consisting predominantly of lead.
The concentration of neutral (un-ionized) copper in the germanium should be related to its chemical potential by Eq. (12); if the copper were present in the liquid as an “ideal dilute” solution, a similar relation should prevail there; and, if the position of the Fermi level in the germanium is substantially independent of copper concentration at some fixed temperature, one would expect to find that the over-all distribution coefficient of copper (in whatever state of ionization) between the crystal and the liquid phase would be independent of concentration, depending only on the temperature. At low concentrations, this is indeed what Thurmond and Logan found; at high concentrations, departures were observed. Attempts have been made to account for these by talking of departures from ideality in the liquid phase (16), either by introducing the concept of “regular” solutions or in some further, more or less empirical, way; but it is clear that we would be in a better position if we had information on the equilibrium with the vapor phase.
We turn now to systems where there are several mass-action relations that are not mutually independent. It must be emphasized that a study of these systems does not tell one anything new; the observed effects, interesting though they are, can all be quantitatively predicted from the individual mass-action laws. The phenomena are in fact exactly equivalent to the “common ion” effects in solution chemistry. If we define the “solubility” of some impurity in the crystal as the total atomic content, including atoms in all possible states of ionization, for a specified activity of the substance in the system, it is clear that that quantity will depend on the position of the Fermi level in the crystal, which will depend both on the activity of the impurity in question and on the presence of intrinsic imperfections and other impurities in the crystal. The simplest case is that in which the impurity in question is the only electrically active center present. If now the activity of the impurity is so low that the density of ionized atoms at equilibrium is small in comparison with the intrinsic carrier concentration, the position of the Fermi level will be nearly independent of the impurity activity, and the over-all distribution coefficient of the impurity between the crystal and an external gaseous phase will then be independent of concentration also. At higher activities, where the concentration of ionized impurities is sufficiently high to change the position of the Fermi level from that holding for intrinsic material, the solubility should begin to vary with activity in an anomalous way, and should be given by considering together: the hole-electron mass-action relation [Eq. (26)]; the mass-action relation relating the concentrations of ionized impurity atoms, un-ionized impurity atoms, and either holes or electrons [Eq. (25)]; and the electrical neutrality condition [Eq. (9)].
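For the simplest case of a single, singly ionizable donor, the three relations just listed can be combined in closed form. The sketch below is a minimal illustration under stated assumptions: Boltzmann statistics, an ionization equilibrium written as $N_{D^+}\,n = K N_D$ with the un-ionized concentration $N_D$ proportional to the external activity, $np = n_i^2$, and neutrality $n = p + N_{D^+}$. These give $n^2 = n_i^2 + K N_D$. The constant $K$, the numerical values, and the function name are hypothetical.

```python
import math


def donor_solubility(n_neutral, k_ion, n_i):
    """Return (n, p, ionized, total) for a single donor species.

    n_neutral : un-ionized donor concentration, taken proportional to activity
    k_ion     : assumed ionization constant K in N_D+ * n = K * N_D
    n_i       : intrinsic carrier concentration at the temperature considered
    """
    # Neutrality n = p + N_D+ with p = n_i^2/n and N_D+ = K*N_D/n
    n = math.sqrt(n_i**2 + k_ion * n_neutral)
    p = n_i**2 / n
    ionized = k_ion * n_neutral / n
    return n, p, ionized, n_neutral + ionized


N_I = 1.0e18    # assumed intrinsic concentration at the diffusion temperature, cm^-3
K_ION = 1.0e20  # assumed ionization constant, cm^-3

for n_neutral in (1.0e10, 1.0e14, 1.0e18, 1.0e20):
    n, p, ionized, total = donor_solubility(n_neutral, K_ION, N_I)
    print(f"N_D = {n_neutral:.1e}: N_D+ = {ionized:.3e}, N_D+/N_D = {ionized / n_neutral:.3e}")
```

At low activity the ratio $N_{D^+}/N_D$ tends to the constant $K/n_i$, so the distribution coefficient is independent of concentration; at high activity $N_{D^+}$ approaches $(K N_D)^{1/2}$, which is the anomalous variation referred to above.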
This dependence of the solubility on the Fermi level was originally pointed out by Reiss (17). One might hope to look for this effect in the above-mentioned experiments of Thurmond and Logan, but it appears on looking into the numbers that, under the conditions of the experiment, the copper concentrations were never high enough to make the crystal depart significantly from its intrinsic condition. One “common ion” experiment that has been done is that in which the presence of an acceptor substance in the crystal was shown to influence the solubility of a donor material. Reiss and Fuller studied the effect of the presence of boron on the solubility of lithium (a convenient substance on account of its high mobility) in silicon (18), and also the effect of gallium on the solubility of lithium in germanium (19). Both systems showed the common ion effect; in addition, for lithium in germanium, departures from the predictions of the simple theory were observed at high impurity concentrations and low temperatures. These could be explained by ion-pairing. Following Reiss and Fuller, we consider these experiments on the following basis: (1) the density of acceptor atoms in each particular experiment is regarded as fixed; (2) the activity of the lithium is prescribed by the conditions of the experiment, to wit, by holding the crystal in contact with a molten phase saturated with lithium; (3) both the lithium and the acceptor species are supposed to be almost completely ionized. The dependence of the equilibrium lithium content on the acceptor concentration can then be written down from Eqs. (9) and (25), and is given by the following expression:
where $N_{Li^+}$ stands for the concentration of ionized lithium atoms (i.e., substantially the total concentration of lithium in the crystal), $N^0_{Li^+}$ stands for the same quantity in the absence of acceptor impurities, and
Experiments of this sort also have a bearing on certain nonequilibrium processes, such as, for example, the diffusion of one impurity in the presence of another (20).

3. Zinc Oxide. (a) The fundamental chemical equilibria: no impurities. Zinc oxide is an example of a semiconductor in which departures from stoichiometry are possible. Let us consider the equilibrium between a sample of zinc oxide and its vapor at some temperature. Following the argument of Sec. II.A.3, we can assign values independently to the temperature, say, and to the partial pressure of zinc in the gas phase, and the composition of both phases is then completely determined. When the vapor is stoichiometric, the total pressure in the gas phase is called the vapor pressure, whether or not dissociation of ZnO in the gas occurs to a substantial extent. The composition of the crystal under these circumstances will in general be slightly nonstoichiometric, while, on the other hand, for equilibrium with a crystal that is perfectly stoichiometric, the vapor has to depart very substantially from stoichiometry. For a crystal of zinc oxide that is free from chemical impurities, the following imperfections have to be considered:

(1) Interstitial zinc, $I_{Zn}$
(2) Interstitial oxygen, $I_O$
(3) Zinc vacancies, $V_{Zn}$
(4) Oxygen vacancies, $V_O$

Of these, (1) and (4) are expected to be donors, and (3) an acceptor. Interstitial oxygen can probably be ruled out, on the grounds of the size of the oxygen ion; certainly no one has seriously considered that it exists in significant concentrations at the temperatures at which ZnO has been
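studied. The remaining imperfections must be in equilibrium with the gas phase according to the following equations:

$$\mathrm{Zn\ (gas)} \rightleftharpoons I_{Zn} \qquad (27)$$

$$\mathrm{Zn\ (gas)} + V_{Zn} \rightleftharpoons \mathrm{N.O.} \qquad (28)$$

$$\tfrac{1}{2}\mathrm{O_2\ (gas)} + V_O \rightleftharpoons \mathrm{N.O.} \qquad (29)$$

where the symbol N.O. means “normally occupied.” For the gas itself, we have the equilibrium

$$\mathrm{Zn\ (gas)} + \tfrac{1}{2}\mathrm{O_2\ (gas)} \rightleftharpoons \mathrm{ZnO\ (gas)} \rightleftharpoons \mathrm{ZnO\ (crystal)} \qquad (30)$$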
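From the law of mass-action we arrive at the equations:

$$P_{Zn}\,P_{O_2}^{1/2} = K_1 \qquad (31)$$

$$N_{I_{Zn}} = K_2\,P_{Zn} \qquad (32)$$

$$N_{V_{Zn}}\,P_{Zn} = K_3 \qquad (33)$$

$$N_{V_O}\,P_{O_2}^{1/2} = K_4 \qquad (34)$$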
where $K_1$, $K_2$, $K_3$, and $K_4$ are functions of the temperature [usually expressible, to a good enough approximation, in the form $A \exp(-E/kT)$], and $N_{I_{Zn}}$, $N_{V_{Zn}}$, and $N_{V_O}$ stand for the concentrations of the un-ionized constituent appearing in the subscript. These four equations are sufficient completely to determine the composition of both phases with respect to their neutral constituents, provided, of course, that the concentrations of the imperfections are sufficiently small for Eq. (13) to apply, and that ion-pairing can be neglected. Turning now to the ionization equilibria in the solid, it is necessary to commit oneself on the question of the number of ionizations which each center can suffer. We suppose, on the basis of the experimental evidence, that interstitial zinc is a singly-ionizable donor, while oxygen vacancies are doubly-ionizable donors and zinc vacancies doubly-ionizable acceptors. From Eqs. (18) and (21):
$$\frac{N_{I_{Zn}}}{N_{I_{Zn}^{+}}} = 2e^{-[E_1(I_{Zn}) - E_F]/kT} = (2n/N_c)\,e^{[E_c - E_1(I_{Zn})]/kT} \qquad (35)$$

$$\frac{N_{V_{Zn}}}{N_{V_{Zn}^{--}}} = e^{[E_1(V_{Zn}) + E_2(V_{Zn}) - 2E_F]/kT} = (p/N_v)^2\,e^{[E_1(V_{Zn}) + E_2(V_{Zn}) - 2E_v]/kT} \qquad (36)$$

$$\frac{N_{V_O}}{N_{V_O^{++}}} = e^{-[E_1(V_O) + E_2(V_O) - 2E_F]/kT} = (n/N_c)^2\,e^{[2E_c - E_1(V_O) - E_2(V_O)]/kT} \qquad (37)$$

$$np = N_c N_v\,e^{-(E_c - E_v)/kT} \qquad (38)$$
Here $E_1$ and $E_2$ refer to the first and second ionization levels for the species appearing in the parentheses, $E_F$ is the Fermi energy, $E_c$ and $E_v$ the energies at the edges of the conduction and valence bands respectively, $N_c$ and $N_v$ the “effective densities of states” in the conduction and valence bands, and $n$ and $p$ are the electron and hole concentrations. On the basis of Eqs. (31) through (38), we could construct composition diagrams for pure ZnO, if we had values for all of the parameters appearing therein. Let us see how far we can go.

(b) Estimation of the equilibrium constants. Let us begin with the semiconductor quantities appearing in Eqs. (35) through (38). The optical band-gap in zinc oxide is known (21) to be 3.3 ev; from the absence of any indication of indirect transitions, we may conclude that the electrical bandgap cannot be much smaller. We therefore set:

$$E_c - E_v = 3.3\ \text{ev}. \qquad (39)$$
From Hutson’s measurements (22):

$$N_c = 7.8 \times 10^{19}\ \text{cm}^{-3} \qquad (40)$$

(taking a mean value for the region of temperature around 1000°C);

$$E_c - E_1(I_{Zn}) = 0.051\ \text{ev}. \qquad (41)$$
The quantity $N_v$ is unknown; we do not in fact need to know it except when we come to calculate hole densities, since, in the only other equation in which it occurs [Eq. (36)], it may be eliminated by using Eq. (38). In the absence of information, we shall simply assume:

$$N_v = N_c = 7.8 \times 10^{19}\ \text{cm}^{-3}. \qquad (42)$$
The only other semiconductor quantities required are the ionization energies for the oxygen vacancies and the zinc vacancies. Here we shall proceed as follows. Just as the “hydrogen-like” model works fairly well for singly-ionizable centers in a semiconductor such as germanium, so the “helium-like” model works (less well, it is true) for a doubly-ionizable center. We therefore take $[E_c - E_1(I_{Zn})]$ from Eq. (41), and multiply it by the ratio of the sum of the first and second ionization energies of helium to the ionization energy of hydrogen. In this way we get:

$$[2E_c - E_1(V_O) - E_2(V_O)] = [E_1(V_{Zn}) + E_2(V_{Zn}) - 2E_v] = 0.35\ \text{ev}, \qquad (43)$$
which will probably be correct as to order of magnitude. We now proceed to the “chemical” equilibrium constants, K1 through K4.
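The “helium-like” scaling can be written out as a short calculation. The sketch below is only an illustration of the procedure described: it uses present-day free-atom ionization energies, which are an assumption here since the text does not state the values it adopted, and the function name is hypothetical.

```python
# Free-atom ionization energies in eV (assumed modern values, not from the text)
HYDROGEN_IONIZATION_EV = 13.598
HELIUM_FIRST_IONIZATION_EV = 24.587
HELIUM_SECOND_IONIZATION_EV = 54.418


def helium_like_estimate(single_ionization_ev):
    """Scale a hydrogen-like ionization energy to the sum E1 + E2 of a
    doubly-ionizable center, using the ratio (He I + He II) / H."""
    ratio = ((HELIUM_FIRST_IONIZATION_EV + HELIUM_SECOND_IONIZATION_EV)
             / HYDROGEN_IONIZATION_EV)
    return single_ionization_ev * ratio


# Donor depth of interstitial zinc from Eq. (41)
estimate_ev = helium_like_estimate(0.051)
print(f"helium-like estimate of the summed ionization energy: {estimate_ev:.3f} ev")
```

With these free-atom values the scaling factor is about 5.8, giving roughly 0.30 ev, which is of the same order of magnitude as the 0.35 ev adopted in Eq. (43).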
$K_1$. The best experimental data are those of Lander (23), based on effusion measurements. He finds:

$$K_1 = 2.0 \times 10^{11}\,e^{-5.05/kT}\ \text{atm}^{3/2}. \qquad (44)$$
Now, at temperatures and pressures of interest to us: (1) ZnO is almost entirely dissociated in the vapor phase; (2) zinc vapor is monatomic; and (3) gaseous oxygen is diatomic. Thus the pressure in the system will be, to a good approximation, just the sum of $P_{Zn}$ and $P_{O_2}$.

$K_2$. Here we take Thomas’s (24) data. In Thomas’s experiment the electron concentration was measured as a function of temperature, in an experiment where: (1) the electron concentration was substantially equal to $N_{I_{Zn}^{+}}$; (2) $N_{I_{Zn}^{+}}$ was substantially greater than $N_{I_{Zn}}$; (3) $N_{I_{Zn}}$ was fixed by equilibrium with a gaseous phase consisting predominantly of zinc vapor in equilibrium with liquid zinc. He found:

$$n = 2.95 \times 10^{20}\,e^{-0.65/kT}\ \text{cm}^{-3}, \qquad (45)$$

where $k$ is the value of Boltzmann’s constant in electron volts deg$^{-1}$ ($8.63 \times 10^{-5}$). From standard tables (25), the vapor pressure of zinc is given by:

$$P_{Zn} = 2.14 \times 10^{5}\,e^{-1.24/kT}\ \text{atm}. \qquad (46)$$
To use Eq. (35) to deduce $N_{I_{Zn}}$, we need the values for $N_c$ and $[E_c - E_1(I_{Zn})]$ [Eqs. (40) and (41)]. Setting $n = N_{I_{Zn}^{+}}$ in Eq. (35) and making use of Eqs. (45) and (46) we get:

$$K_2 = 5.4 \times 10^{15}\,e^{0.02/kT}\ \text{cm}^{-3}\ \text{atm}^{-1}. \qquad (47)$$
It will be noted that this corresponds to a “distribution coefficient” [$N_{Zn}$(gas)/$N_{I_{Zn}}$] of about $10^3$ at 1000°C. Thomas has related this fact to the difference between the entropy of interstitial zinc and zinc atoms in the vapor.

$K_3$. No direct experimental measurements of the concentration of zinc vacancies are available. Indirect information, however, has been obtained by Thomas during the course of his studies (26) of equilibrium concentrations and diffusion rates for indium. Indium enters the lattice substitutionally, and its diffusion rate is low; Thomas argues that diffusion occurs by a vacancy mechanism, so that the activation energy for the diffusion constant for indium will be equal to the activation energy for $V_{Zn}^{--}$. Making use of estimates of the vibration frequency and jump distance, he then proceeds to calculate the actual concentration of (ionized) zinc vacancies at one temperature under the conditions of his experiment. His estimates may be described by the equation:

$$N_{V_{Zn}^{--}} = 1.2 \times 10^{27}\,e^{-3.16/kT}\ \text{cm}^{-3}. \qquad (48)$$
The conditions were such that the electron concentration varied with temperature according to the law:

$n = 1.74 \times 10^{24}\, e^{-1.5/kT}\ \mathrm{cm^{-3}}$, (49)

while the oxygen pressure remained constant at 1 atm. We now have all the information we need. Taking $(E_c - E_v)$ from Eq. (39), $N_c$ from Eq. (40), $[E_1(V_{Zn}) + E_2(V_{Zn}) - 2E_v]$ from Eq. (43), making use of Eqs. (38) and (44), and substituting in Eq. (36), we get:

$K_3 = 4.5 \times 10^{29}\, e^{-11.31/kT}\ \mathrm{cm^{-3}\ atm}$. (50)
$K_4$. There are no measurements available on which to base a calculation of $K_4$, precisely because no unambiguous evidence for the existence of oxygen vacancies has been offered. We note, from Eqs. (31), (33), and (34), that the ratio $(N_{V_O}/N_{I_{Zn}})$ should be independent of zinc pressure at a given temperature; and if, as certain rough calculations suggest (27), the heat of formation of an oxygen vacancy is greater than that of a zinc interstitial, we shall expect $(N_{V_O}/N_{I_{Zn}})$ to be an exponentially increasing function of temperature. Thus, even if $(N_{V_O}/N_{I_{Zn}}) \ll 1$ at the temperatures at which Thomas studied the equilibrium involving interstitial zinc, oxygen vacancies might become important at higher temperatures. Note too that, since oxygen vacancies are supposed to be double donors, while interstitial zinc has not been found to ionize more than once, the ratio $(N_{V_O^{++}}/N_{V_O})$ must vary as $(1/n^2)$ [see Eq. (36)], while $(N_{I_{Zn}^{+}}/N_{I_{Zn}})$ varies only as $(1/n)$; hence the total concentration of oxygen vacancies (including all states of ionization) will not vary with zinc pressure in the same way as the total concentration of interstitial zinc.

One way in which one might hope to calculate the ratio $(N_{V_O}/N_{I_{Zn}})$ runs as follows. By combining Eqs. (32) and (33) we may calculate $(N_{I_{Zn}} N_{V_{Zn}})$; it is clearly $K_2 K_3$. But $N_{V_O^{++}}$ and $N_{V_{Zn}^{--}}$ should be related by the equilibrium constant for the creation of Schottky defects. One may attempt to calculate this quantity, as Bloem has done for Cu$_2$O (28), by insisting that it should be of the form $10^{?}\, e^{-E^*/kT}\ \mathrm{cm^{-6}}$, with $E^*$ chosen in such a way as to give some reasonable concentration of Schottky defects (say 0.1%) at the melting point. If we try this with ZnO, we arrive at the conclusion that $(N_{V_O}/N_{I_{Zn}})$ must be of the order of $10^{6}$ at 1000°C.

So Thomas's donor centers would have to be oxygen vacancies rather than zinc interstitials, but then $(N_{V_O}/N_{I_{Zn}})$ would have to be $10^{-6}$ at 1000°C, so clearly the whole argument is absurd. The product in question cannot possibly be as high as this simple argument would lead one to believe. As additional evidence against the argument it seems fair to point out: (1) that it fails grossly for germanium also, where recent measurements (29) of the equilibrium concentration of vacancies at the melting point suggest a value around
[…] atom fractions instead of $10^{-3}$; (2) that the activation energy demanded by the argument (the heat of formation of a Schottky defect) is unreasonably small (2.95 ev). We must therefore agree for the moment to ignore oxygen vacancies. The best that can be said is this: in Thomas's experiments on interstitial zinc (24), a good straight line was obtained on plotting log $\sigma$ against $(1/T)$, whereas, if oxygen vacancies had begun to be important at and above some temperature in the range covered (440°C to 730°C), one would have expected a fairly abrupt change of slope at that point.

(c) Chemical impurities. If we include in the two-phase system some new constituent X, which can be present in both phases but does not react chemically with any of the species already present, the number of components and the number of degrees of freedom both go up by one; we can, for example, fix quite arbitrarily (within certain limits) the partial pressure of the new constituent in the vapor phase. This statement remains true even if a reaction can occur (for example, between X and oxygen) so long as no new phase appears. If, however, the pressure of X is increased far enough, another phase, consisting either of X or of some compound involving X, appears. The number of degrees of freedom is now reduced again to two; the composition of all three phases is determined as soon as, for example, the temperature and one partial pressure are fixed.

Atoms of the new constituent can enter the ZnO lattice either substitutionally, interstitially, or both at the same time. (The question of whether identifiable molecules or radicals involving X are formed in the lattice is immaterial and does not affect the argument.) In the former case we write:

$X\,(\mathrm{gas}) + V_{Zn} \rightleftharpoons X_{Zn}$, or $X\,(\mathrm{gas}) + V_{O} \rightleftharpoons X_{O}$, (51)

and in the second:

$X\,(\mathrm{gas}) \rightleftharpoons I_{X}$. (52)

The mass-action relations are now:

[…] or […] (53)

and:

[…] (54)

Once the impurity atom is in the crystal, it can proceed to ionize. Elements whose atoms have so far been shown to enter the lattice interstitially lie on the left-hand side of the periodic chart and, as is reasonable,
are all donors. Substitutional impurities should be donors if they are from group III and replace Zn, or from group VII and replace O; they should be acceptors if they are from group I and replace Zn, or from group V and replace O. In all cases, the relation between the concentrations of impurity atoms in their various states of charge can be written down in ways analogous to Eqs. (35), (36), and (37). The parameters, as with the imperfections proper to the pure crystal, are the ionization energies, which can be determined in each case by making conductivity and Hall measurements over the appropriate temperature range. Where experimental evidence is available, it appears that the ionization energy for a singly-ionizable donor agrees quite well with the "hydrogen-like" model.

There remains for consideration whichever one of the constants $K_5'$, $K_5''$, and $K_5'''$ is applicable to the particular impurity. Clearly it would be very desirable to have some way of predicting the magnitude of this quantity, since it is the one remaining unknown in determining the electrical properties of the crystal. The constant $K_5$ will be expected to depend on (1) the ionic radius, and (2) the energy involved in the formation of the bonds, if the impurity atom is substitutional and the bonding to some extent covalent. For the case of germanium, where the bonding is entirely covalent, Trumbore (30) has shown that the distribution coefficient at the melting point of an impurity between the liquid phase and the crystal (a quantity related, it is true, to $K_5$ only very indirectly, since the solutions are far from ideal) is a monotonic function of the ionic tetrahedral radius for atoms of each particular group in the periodic chart, and that, for groups III, IV, and V, the dependence on ionic size is the more important consideration. We shall therefore treat only ionic size in what follows.
Table I lists ionic sizes for a number of elements from groups I, III, V, or VII in the state of charge in which they would enter substitutionally. For comparison, the ionic radius of Zn++ in ZnO is 0.78 Å; that of O-- is 1.24 Å, and the radius of the largest interstitial site is equal to that of the oxygen ion.

TABLE I. IONIC RADII IN ANGSTROM UNITS

Group I       Group III       Group V        Group VII
Li+  0.60     B+++   0.20     N---   1.71    F-   1.36
Na+  0.95     Al+++  0.50     P---   2.12    Cl-  1.81
K+   1.33     Sc+++  0.81
Cu+  0.96     Ga+++  0.53     As---  2.22    Br-  1.95
Rb+  1.48     Y+++   0.93
Ag+  1.26     In+++  0.81     Sb---  2.45    I-   2.16
Cs+  1.69     La+++  1.15
Au+  1.37     Tl+++  0.95
In considering Table I, it is not quite fair to limit the ionic size to exactly that of the site which the atom is to occupy; Thurmond and Trumbore have shown that an excess of some 30% still permits substitution in the germanium lattice, with a distribution coefficient depressed, however, by three or four orders of magnitude. Let us then tentatively pick 1.0 Å for the maximum ionic radius for an atom substituting for Zn, and 1.6 Å for the maximum ionic radius for an interstitial atom or one substituting for oxygen. We thus find the following possibilities:

Substitutional for Zn: Li, Na, Cu; all from group III except La. (Trivalent ions of other elements in the first transition series and of the rare earth elements also satisfy the size requirement.)

Substitutional for O: Only F.
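The size screen just described amounts to a simple threshold test on the radii of Table I. The following sketch reproduces the two lists above; the dictionary layout and function name are hypothetical, the boron entry is read from a damaged table cell, and treating the limits as inclusive is an assumption.

```python
# Ionic radii in angstroms, transcribed from Table I (cations: groups I and III;
# anions: groups V and VII). The boron value is read from a damaged table entry.
TABLE_I = {
    "I": {"Li": 0.60, "Na": 0.95, "K": 1.33, "Cu": 0.96,
          "Rb": 1.48, "Ag": 1.26, "Cs": 1.69, "Au": 1.37},
    "III": {"B": 0.20, "Al": 0.50, "Sc": 0.81, "Ga": 0.53,
            "Y": 0.93, "In": 0.81, "La": 1.15, "Tl": 0.95},
    "V": {"N": 1.71, "P": 2.12, "As": 2.22, "Sb": 2.45},
    "VII": {"F": 1.36, "Cl": 1.81, "Br": 1.95, "I": 2.16},
}

MAX_RADIUS_ZN_SITE = 1.0  # tentative limit for an atom substituting for Zn
MAX_RADIUS_O_SITE = 1.6   # tentative limit for an atom substituting for O


def candidates(groups, max_radius):
    """Return the elements of the given groups whose radius does not exceed max_radius."""
    return sorted(
        element
        for group in groups
        for element, radius in TABLE_I[group].items()
        if radius <= max_radius
    )


substitutional_for_zn = candidates(["I", "III"], MAX_RADIUS_ZN_SITE)
substitutional_for_o = candidates(["V", "VII"], MAX_RADIUS_O_SITE)

if __name__ == "__main__":
    print("Substitutional for Zn:", substitutional_for_zn)
    print("Substitutional for O: ", substitutional_for_o)
```

The screen is deliberately crude: it encodes only the 30% size allowance and says nothing about bonding or Coulomb energy.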
Clearly size cannot be the only consideration: $K_5$ ought to be smaller, ceteris paribus, for a given atom on an interstitial than on a substitutional site, because of the Coulomb energy; but it seems at present impossible to turn this consideration into a reliable quantitative statement.

The equations in this and the preceding section fall short, by just one equation, of sufficiency in determining completely the composition of the crystal phase. The one equation we need to complete the calculation is, of course, the condition of over-all electrical neutrality:

[…] (55)
With this remark we turn to the construction of some representative composition diagrams.

(d) Construction and interpretation of composition diagrams: pure crystals. We here assemble the fundamental equations. From (31) and (44):

$P_{Zn} P_{O_2}^{1/2} = 2.0 \times 10^{11}\, e^{-?.??/kT}\ \mathrm{atm^{3/2}}$. (56)

From (32) and (47):

$N_{I_{Zn}}/P_{Zn} = 5.4 \times 10^{15}\, e^{0.02/kT}\ \mathrm{cm^{-3}\ atm^{-1}}$. (57)

From (33) and (50):

$N_{V_{Zn}} P_{Zn} = 4.5 \times 10^{29}\, e^{-11.31/kT}\ \mathrm{cm^{-3}\ atm}$. (58)

From (38), (39), (40), and (42):

$np = 6.0 \times 10^{39}\, e^{-3.30/kT}\ \mathrm{cm^{-6}}$. (59)
From (35) and (41):

[…] (60)

From (37) and (43), using (59) above:

[…] (61)

FIG. 1. Composition diagram for zinc oxide, T = 750°C. [Abscissae: log$_{10}$ ($P_{O_2}$, mm) and log$_{10}$ ($P_{Zn}$, mm).]

Using these equations, together with the neutrality condition [Eq. (55)], Figs. 1, 2, and 3 have been constructed, showing the concentrations of the various constituents as a function of zinc or oxygen pressure at three representative temperatures. In constructing these pictures, the usual convenient simplification (31) to the neutrality condition has been made: under each set of conditions, one ignores all but the largest of the negative constituent concentrations, and all but the largest of the positive constituent concentrations. (The true curves would show curvature in the regions close to the intersections of the straight lines, instead of abrupt discontinuities in slope.)
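The effect of this simplification can be seen in the smallest possible case. The sketch below is an illustration under stated assumptions, not a reconstruction of the figures: it takes a crystal with a fixed concentration of fully ionized donors, so that neutrality reads n = p + N_D with np = K_i, and compares the exact electron concentration with the value obtained by keeping only the largest positive species. The function names and the numerical value of K_i are hypothetical.

```python
import math


def exact_n(donors, k_i):
    """Solve n = p + donors together with n * p = k_i for n."""
    return 0.5 * (donors + math.sqrt(donors ** 2 + 4.0 * k_i))


def simplified_n(donors, k_i):
    """Keep only the largest positive species: n = donors or n = p = sqrt(k_i)."""
    return max(donors, math.sqrt(k_i))


if __name__ == "__main__":
    k_i = 1.0e32  # hypothetical n*p product, cm^-6
    for donors in (1.0e13, 1.0e15, 1.0e16, 1.0e17, 1.0e19):
        ratio = exact_n(donors, k_i) / simplified_n(donors, k_i)
        print(f"N_D = {donors:.0e} cm^-3   exact/simplified = {ratio:.4f}")
```

In this two-species case the two results agree closely far from the crossover and differ most where the donor concentration equals the intrinsic concentration, which is where the straight-line segments of such a diagram meet.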
It will be seen from the figures that, at all three temperatures, the stoichiometry point occurs at quite large oxygen pressure. In fact, the point at which $N_{V_{Zn}^{--}} = \tfrac{1}{2} N_{I_{Zn}^{+}}$ occurs, as may be deduced from Eqs. (56) through (61), at an oxygen pressure of $3 \times 10^{7}\, e^{-0.03/kT}$ atm, so that the compensation point is substantially independent of temperature. Note, however, that at 750°C the crystal is already intrinsic (in the sense that $n = p$) at $P_{O_2}$ = 1 mm; only on quenching from this temperature would the crystal become substantially n-type.

It is worth noting that the scheme represented by Figs. 1 through 3 is consistent with some recent measurements of the conductivity of single crystals of zinc oxide in oxygen at high temperatures made by Pohl (32). For an oxygen pressure $P_{O_2}$ = 1 atm, the equilibrium electron concentration should be given by:

$n = N_{I_{Zn}^{+}} = 2.8 \times 10^{23}\, e^{-2.4?/kT}\ \mathrm{cm^{-3}}$; (62)
while Pohl's measurements, corrected to $P_{O_2}$ = 1 atm, are fitted by the equation:

$n = 5 \times 10^{23}\, e^{-2.3/kT}\ \mathrm{cm^{-3}}$ (taking $\mu_n$ = 100 cm$^2$ volt$^{-1}$ sec$^{-1}$), (63)

which agrees within a factor of 3 or so with Eq. (62) in the middle of the temperature range studied by Pohl (1300-1700°C).

FIG. 3. Composition diagram for zinc oxide, T = 1250°C. [Abscissa: log$_{10}$ ($P_{Zn}$, mm).]

Furthermore, it now becomes clear why experiments carried out on single crystals in an oxygen atmosphere at lower temperatures have always failed to show any equilibrium pressure dependence. At 1000°C and $P_{O_2}$ = 1 atm, the electron concentration would be only $4 \times 10^{?}\ \mathrm{cm^{-3}}$, and the crystal would already be intrinsic (see Fig. 2), so that, unless chemical impurities were reduced below 1 part in $10^{9}$ (that is to say, unless the conductivity had been reduced to […] mhos cm$^{-1}$ or so), the effect of changes in the oxygen pressure would not be noticed.

Pohl (32) has also studied the conductivity at high temperatures in the
presence of zinc vapor. Here certain anomalous results were found: (1) The crystals tended to become colored. (2) The equilibrium times were very much longer than those reported by Thomas in his experiment on interstitial zinc. (3) Heating to a high temperature in zinc vapor gave rise to irreversible changes in the properties of the crystal, which affected the subsequent diffusion properties at much lower temperatures. About this one can say only two things at the moment. First, the equilibrium conductivities do more or less join on to Thomas's measurements with regard both to absolute magnitude and to temperature dependence; it therefore seems unlikely that the donor center in question was different from that identified as interstitial zinc by Thomas. Second, if we believe that Eqs. (56) through (61) still apply, we must conclude that the total concentration of interstitial zinc in Pohl's crystals must have been very high (perhaps 0.1% at the highest temperature), so high as to raise doubts as to whether permanent structural changes (the incorporation of dislocations, etc.) might not have occurred.

In concluding this section, it is worth pointing out that this discussion casts further doubt on the validity of the work on polycrystalline samples. It was originally (1933) reported by von Baumbach and Wagner (33) that already at 550°C and 650°C the conductivity of sintered zinc oxide samples varies with oxygen pressure according to the law $\sigma \propto P_{O_2}^{-1/4}$, which, of course, is what one expects according to the interstitial zinc model. The oxygen pressures were generally in the range $10^{-?}$ to […] mm, and the "conductivities" were in the region of […] or […] mhos cm$^{-1}$, varying with temperature according to the law: $\sigma \propto \exp(-0.71/kT)$.
The usual interpretation of these results is that the electron concentration is determined by the equilibrium involving interstitial zinc, and that the activation energy for conduction (0.71 ev) has something to do with the ionization energy of the donors. From Eqs. (56) through (61), however, we would conclude that, at a pressure $P_{O_2}$ = […] mm, zinc oxide would already be intrinsic at these temperatures, and that its conductivity (taking $\mu_n$ = 100 cm$^2$ volt$^{-1}$ sec$^{-1}$) should be of the order of […] mhos cm$^{-1}$ at 650°C, and less than $10^{-?}$ mhos cm$^{-1}$ at 550°C, independent of $P_{O_2}$. So the results of the experiments on polycrystalline samples (which, by the way, have often been confirmed since) are definitely not explicable on the basis of the bulk properties of zinc oxide, as deduced from the single crystal work. There remain two possibilities: (1) chemical impurities (which might or might not suppress the oxygen-pressure dependence, depending on how the experiment was done); (2) surface effects. The great rapidity with which equilibrium can be reached suggests the latter.
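For reference, the pressure law that the interstitial zinc model predicts in the extrinsic range can be obtained in a few lines from the relations of subsection (d). The steps below are a sketch: $K_1$ denotes the right-hand side of Eq. (56), and the ionization equilibrium is written with a constant $K_d(T)$ in an assumed form, since Eq. (35) itself is not reproduced here.

```latex
\begin{align*}
P_{Zn}\,P_{O_2}^{1/2} &= K_1(T)
  && \text{[Eq. (56)]} \\
N_{I_{Zn}} &= K_2(T)\,P_{Zn} = K_1 K_2\,P_{O_2}^{-1/2}
  && \text{[Eq. (57)]} \\
n\,N_{I_{Zn}^{+}} &= K_d(T)\,N_{I_{Zn}}
  && \text{(assumed form of the ionization equilibrium)} \\
n = N_{I_{Zn}^{+}} \;\Rightarrow\; n^{2} &= K_d K_1 K_2\,P_{O_2}^{-1/2}
  \;\Rightarrow\; n \propto P_{O_2}^{-1/4} \\
\sigma = n e \mu_n &\propto P_{O_2}^{-1/4}
\end{align*}
```

The argument requires that the electrons come predominantly from singly ionized interstitial zinc; once the crystal is intrinsic, $n$ is fixed by $np$ alone and the pressure dependence disappears, which is the point made above.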
(e) Construction and interpretation of composition diagrams: crystal with impurities. In Sec. III.B.3.c we arrived at some qualitative conclusions, based on reasoning about the ionic size, as to which impurities would enter the zinc oxide lattice, either interstitially or substitutionally. By and large, these predictions are well borne out by experiment.

Group I. Lithium and Na are known to enter substitutionally (23), and behave as acceptors, lowering the electron concentration in the crystal; Li also enters interstitially (23) (and perhaps the same is true for Na), behaving then as a donor; and, at high total lithium content, there is evidence that the concentrations of interstitial and substitutional Li are nearly equal. Copper enters substitutionally (34), behaving as an acceptor; possibly it can enter interstitially as well. An attempt to incorporate Ag was unsuccessful (35). All of this is as expected from Sec. III.B.3.c; we may therefore predict with some confidence that Au will not enter to any significant extent either.

Group III. Of the elements that are expected to enter, In is well established from single crystal work (26); information on Al is available only from measurements on polycrystalline samples (34), but it, like In, appears, as one would expect, to behave as a donor. No information about the other group III elements has been published, but there seems no reason not to expect that they will behave in the same way.

Groups V and VII. None of these are expected to enter except possibly F. Experiments so far have failed to show that any do.

Other elements. Chromium is expected to enter and to behave as a donor, and, according to the work on polycrystalline samples (34), does. Attempts that have been made so far to incorporate rare earth metals have been unsuccessful (36), contrary to what one would expect on the basis of ionic size. Hydrogen is well established as an interstitial donor; this is to be expected, on account of its small size. Cadmium, Pt, and Ni all lead to an increase in conductivity (34), but no quantitative analysis has been done; presumably Cd, at least, enters interstitially.

To attempt to construct composition diagrams for all of these impurities under all possible conditions would be pointless. We shall therefore confine ourselves to the following simpler questions: (1) What should the composition diagrams look like if we could maintain a constant (total) concentration of some acceptor species A in the crystal? (2) What should they look like if the activity of (neutral) A in the crystal is held fixed by means of the equilibrium: $2A + \tfrac{1}{2}O_2 \rightleftharpoons A_2O$? Figures 4 and 5 have been constructed to illustrate the answers to these questions. In Fig. 4, the composition diagram is shown at 1000°C for the case that the total concentration of A is held fixed at $10^{16}\ \mathrm{cm^{-3}}$. From this it may be seen that, to obtain material that is substantially p-type on
quenching from the temperature of preparation, an oxygen pressure of some 300 mm should suffice. This then should be a perfectly feasible experiment.

FIG. 4. Composition diagram for zinc oxide containing $10^{16}$ acceptor atoms per cubic centimeter, at T = 1000°C. [Abscissa: log$_{10}$ ($P_{Zn}$, mm).]

In Fig. 5, however, we see what happens when the activity of A is fixed by the above-mentioned oxidation equilibrium, supposing (1) that solid A$_2$O is present; and (2) that the constants are chosen in such a way as to make the over-all concentration of A in the crystal $10^{16}\ \mathrm{cm^{-3}}$ at $P_{Zn}$ = 1 atm. Figure 5 shows that, under these circumstances, the presence of the acceptor in the system is only significant at high zinc pressures, at which the electron concentration is depressed below the value it would have in the absence of A; at high oxygen pressures, where previously the crystal went p-type, the activity of A is so low that the electrical properties are substantially those of the pure crystal. Of course, depending on the value of $K_5$, and of the dissociation constant for A$_2$O, the cross-over pressure may be substantially different from that shown. In order to increase the likelihood of reaching the p-type region, one
should thus choose an element such that: (1) the ionic size is not too large (so that substitution will be favored) and not too small (so that formation of interstitials is discouraged) and (2) the vapor pressure of the oxide is large.

FIG. 5. Composition diagram for zinc oxide containing acceptor atoms, such that their density is $10^{16}$ per cubic centimeter at a zinc pressure of 1 mm, and that the density of neutral acceptors at other zinc pressures is given by the equilibrium: $2A + \tfrac{1}{2}O_2 \rightleftharpoons A_2O$; T = 1000°C.

Lithium appears to be rather unfavorable on both counts: its ionic size is small, so that it can without difficulty be inserted interstitially; and the heat of formation of Li$_2$O is high (6.2 ev: the Gibbs free energy of formation is unknown). Sodium would appear to be a more likely candidate, but it too can probably enter interstitially. The heat of formation (37) of Na$_2$O is 4.3 ev, and the Gibbs free energy 3.9 ev. From these and sundry other bits of thermodynamic information, one deduces that

$P_{Na}^{2} P_{O_2}^{1/2} = 4 \times 10^{17}\, e^{-?.?0/kT}\ \mathrm{atm^{5/2}}$, (64)
so that, comparing this with Eq. (56), one finds that the partial pressure of zinc and the partial pressure of sodium should each be of the order of $3 \times 10^{?}$ mm at 1000°C in the presence of 1 atm of oxygen, solid Na$_2$O and solid ZnO both being present.* Since sodium should enter vacant zinc sites at least as readily as zinc, one would imagine that it should be possible to incorporate arbitrarily large amounts of sodium under these conditions. Nothing will avail, however, if occupancy of interstitial sites by sodium is favored; for, as Lander has shown, this is likely to lead to a situation in which, however many acceptor atoms are incorporated, the sample remains intrinsic, because the additional acceptor atoms simply go equally to interstitial and substitutional sites.
* This is probably wrong; the stable oxide of sodium at high temperatures is not Na$_2$O but Na$_2$O$_2$.

REFERENCES

1. Hauffe, K., "Reaktionen in und an Festen Stoffen." Springer, Berlin, 1955.
2. Guggenheim, E., "Thermodynamics," 3rd ed., p. 23. North-Holland, Amsterdam, 1957.
3. Guggenheim, E., "Thermodynamics," 3rd ed., p. 24. North-Holland, Amsterdam, 1957.
4. See, for example, Shockley, W., "Electrons and Holes in Semiconductors." Van Nostrand, New York, 1950.
5. See, for example, Hannay, N. B., in "Semiconductors" (N. B. Hannay, ed.), p. 24. Reinhold, New York, 1959.
6. Wagner, C. and Schottky, W., Z. physik. Chem. (Leipzig) B11, 163 (1930).
7. Reiss, H., J. Chem. Phys. 21, 1209 (1953).
8. Longini, R. L. and Greene, R. F., Phys. Rev. 102, 992 (1956).
9. Brebrick, R. F., Phys. and Chem. Solids 4, 190 (1958).
10. Reiss, H., Fuller, C. S. and Morin, F. J., Bell System Tech. J. 35, 535 (1956).
11. Thomas, D. G., in "Semiconductors" (N. B. Hannay, ed.), Chapter 7. Reinhold, New York, 1959; Bloem, J., Philips Research Rept. 11, 273 (1956); Kröger, F. A., Vink, J. H. and Volger, J., ibid. 10, 39 (1955).
12. Bloem, J., Philips Research Rept. 13, 167 (1958).
13. Geballe, T. H., in "Semiconductors" (N. B. Hannay, ed.), Chapter 8. Reinhold, New York, 1959.
14. Geballe, T. H., in "Semiconductors" (N. B. Hannay, ed.), p. 361. Reinhold, New York, 1959.
15. Thurmond, C. D. and Logan, R. A., J. Phys. Chem. 60, 591 (1956).
16. Thurmond, C. D., in "Semiconductors" (N. B. Hannay, ed.), p. 148. Reinhold, New York, 1959.
17. See discussion by Thurmond, C. D., in "Semiconductors" (N. B. Hannay, ed.), p. 154. Reinhold, New York, 1959.
18. Reiss, H. and Fuller, C. S., J. Metals 8, 276 (1956).
19. Reiss, H. and Fuller, C. S., Phys. and Chem. Solids 4, 58 (1958).
20. Reiss, H., in "Semiconductors" (N. B. Hannay, ed.), p. 250. Reinhold, New York, 1959.
21. Thomas, D. G., in "Semiconductors" (N. B. Hannay, ed.), p. 301. Reinhold, New York, 1959.
22. Hutson, A. R., Phys. Rev. 108, 222 (1957); Phys. and Chem. Solids 8, 467 (1959).
23. Lander, J. J., Phys. and Chem. Solids, in press.
24. Thomas, D. G., Phys. and Chem. Solids 3, 229 (1957).
25. "International Critical Tables," Vol. III, p. 205. McGraw-Hill, New York, 1928.
26. Thomas, D. G., Phys. and Chem. Solids 9, 31 (1959).
27. Moore, W. J., J. Electrochem. Soc. 100, 302 (1953).
28. Bloem, J., Philips Research Rept. 13, 167 (1958).
29. Tweet, A. G., Bull. Am. Phys. Soc. [2] 4, 146 (1959).
30. Trumbore, F., Bell System Tech. J. 39, 205 (1960).
31. Brouwer, G., Philips Research Rept. 9, 366 (1954).
32. Pohl, R. W., Z. Physik 155, 120 (1959).
33. von Baumbach, H. H. and Wagner, C., Z. physik. Chem. (Leipzig) B22, 199 (1933).
34. Heiland, G., Mollwo, E., and Stöckmann, F., Solid State Phys. 8, 191 (1959).
35. Bogner, G. and Mollwo, E., Phys. and Chem. Solids 6, 136 (1958).
36. Sadowski, E. S., Private communication (1959).
37. Selected Values of Chemical Thermodynamic Properties, Natl. Bur. Standards (U.S.) Circ. 500, 447 (1952).
Problems of Photoconductivity

P. GÖRLICH

Institute for Optics and Spectroscopy, German Academy of Sciences, Berlin, and Friedrich Schiller University, Jena, Germany

I. Introductory Considerations on Photoconductivity
II. Photoconduction in the Base Lattice and Tail Absorption Regions
III. Theoretical Problems in Photoconductivity
   A. Lifetime: Theoretical Considerations
   B. Saturated and Unsaturated Photocurrents
   C. Advantages of the Concept of Lifetime
   D. Reaction Kinetic Models
   E. Steady State Conditions
   F. Rise and Decay Processes
   G. Photocurrent Dependence […]
   H. Demarcation Levels
   I. Wave Vectors and Crystal Momenta
IV. Dislocations
V. Negative Photoconduction
VI. Surface Conditions
VII. Ohmic and Unidirectional […]
   A. Unidirectional and Isotropic Contacts
   B. pn-Junctions
   C. Photo-emf in Boundary Layers
VIII. Photoelectromagnetic Effects
IX. Application of Photoconductors
   A. Tabular Survey of Photoconductors
   B. Frequency Dependence and Amplification […]
   C. Statistical Fluctuations in Photoconductors
X. Conclusion
List of Symbols
References
I. INTRODUCTORY CONSIDERATIONS ON PHOTOCONDUCTIVITY

Except for some special cases which are not sufficiently clear, semiconductors with completely filled valence bands, empty conduction bands, and no ionization of the defect states would not exhibit any conductivity without electron excitation. Of the four possibilities of energy input for electron excitation, we will treat in this paper the thermal and optical excitations, excluding excitation through particle bombardment and electrical excitation, which is essentially the problem of electrical breakdown. Thermal excitation interests us only insofar as a semiconductor at a given temperature exhibits a dark conductivity; that is to say, even prior to photon injection, the semiconductor exhibits the so-called dark current. The electron concentration $n$ in the conduction band can be calculated by

$n = 2\left(\dfrac{2\pi m k T}{h^{2}}\right)^{3/2} \exp\left(-\dfrac{E_L - E_i}{kT}\right)$, (1)

where $E_i$ is the energy of the Fermi level, $m$ the effective mass of the electron, and $E_L$ the energy of the lowest level in the conduction band. If we assume the validity of Ohm's law in a homogeneous solid, the relation between the current density $i$ and field strength $E$ is given by
$i = \sigma E = n e \mu_n E$, (2)

where $\sigma$ is the conductivity of the $n$ electrons with mobility $\mu_n$. If, in addition, $p$ holes with mobility $\mu_p$ participate in the conduction process, then Eq. (2) is changed to

$i = \sigma E = (n e \mu_n + p e \mu_p) E$. (3)

One can increase the current density by an amount $\Delta i$ by injecting photons into the lattice or tail absorption regions. If the carrier concentrations increase by $\Delta n$ and $\Delta p$, the increase in current density is given by

$\Delta i = (\Delta n\, e \mu_n + \Delta p\, e \mu_p) E$, (4)

where it is assumed that neither the mobilities nor the field strength are changed. In general, the observed changes of the mobilities are of no or, at best, of little consequence. On the other hand, the second assumption requires a homogeneous excitation of the photoconductors as well as a homogeneous distribution of the charge carriers during their migration in the electric field. The latter assumption is not always fulfilled at high field strengths (insulators), so that deviations from Ohm's law are exhibited and saturation of the photocurrents is observed. Equation (4) describes positive photoconductivity, assuming an increase of the carrier concentration, that is, an increase of electrical conductivity under illumination. Under certain conditions [for example, bombardment of germanium with fast electrons (1)] one can bring about a negative photoconductivity, that is, a decrease of electrical conductivity under illumination.
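A short numerical sketch may help fix the orders of magnitude involved in Eqs. (2) through (4). It is an illustration only: the function names and all sample values are hypothetical, and the electron and hole contributions are taken to add, as in Eq. (3), with mobilities and field unchanged by the illumination.

```python
E_CHARGE = 1.602e-19  # elementary charge in coulombs


def conductivity(n, p, mu_n, mu_p):
    """Eq. (3): conductivity in mho/cm for carrier densities in cm^-3
    and mobilities in cm^2 volt^-1 sec^-1."""
    return E_CHARGE * (n * mu_n + p * mu_p)


def photocurrent_increment(delta_n, delta_p, mu_n, mu_p, field):
    """Increase in current density (A/cm^2) for added carrier densities
    delta_n and delta_p at a fixed field (volt/cm) and fixed mobilities."""
    return conductivity(delta_n, delta_p, mu_n, mu_p) * field


if __name__ == "__main__":
    # Hypothetical sample: values chosen only to illustrate the arithmetic.
    mu_n, mu_p = 100.0, 10.0   # cm^2 volt^-1 sec^-1
    field = 10.0               # volt/cm
    dark = conductivity(1.0e10, 0.0, mu_n, mu_p) * field
    extra = photocurrent_increment(1.0e12, 1.0e12, mu_n, mu_p, field)
    print(f"dark current density : {dark:.3e} A/cm^2")
    print(f"photocurrent increase: {extra:.3e} A/cm^2")
```

The increment is linear in the field only as long as the assumptions stated after Eq. (4) hold; the saturation observed at high fields is outside the scope of this sketch.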
II. PHOTOCONDUCTION IN THE BASE LATTICE AND TAIL ABSORPTION REGIONS

Although one should be able to excite electrons from the valence band to higher levels through photon injection throughout the complete base lattice absorption region, it is well known that not all semiconductors (and insulators) exhibit an inner photoelectric effect or, at best, exhibit it only at the band edge (except for some possible effects in the tail absorption region). There is a very good and interesting explanation for this phenomenon: either the photoelectric excitation is vanishingly small compared with the thermal excitation, or the recombination rate of the photoelectrically formed charge carriers is extremely large (often brought about through recombination processes in the surfaces of the photoconductors, see Sec. VI). Also, excitons (2) may be formed. The observed fine structure of the yield of the inner photoelectric effect of cadmium sulfide near the band edge shows in an impressive manner the possibility of exciton formation (3). The investigations of photoconductivity in cadmium sulfide appear to be particularly suited to clarify extensively the exciton states. One then comes to the problem of whether, and in what manner, the structure of the yield curve, which appears more distinctly at low temperatures and under application of polarized light, is related to the structure of the optical absorption. Apparently a relation exists (4), and the question arises whether the structural form of the absorption may be understood in terms of a base lattice effect or in terms of an effect of the structure of the real lattice. The experiments indicate that the real structure of the crystals exerts a decisive effect (5). Furthermore, the structural form of the absorption was already recognized earlier in the case of evaporated photoconductive CdSe layers (6).

It has long been recognized that no photoconductivity is observed in the region of the base lattice absorption of uncolored alkali halides. One has recently been led to the conclusion that the absorption of suitable radiation leads to exciton formation. The investigation of inner photoconductivity in the tail absorption region opens the possibility of relating certain absorption regions to known lattice defects. The excitation of electrons takes place from the forbidden levels lying between the valence and the conduction band to the conduction band, or from levels in the valence band to the levels in the forbidden zone (the latter case being defect (hole) conduction). If it is possible to establish these correspondences in a clear-cut manner, preferentially in crystals with heteropolar binding (e.g. the colored alkali halides), it appears possible to establish methods to relate the lattice defects in homopolar crystals to their washed-out structureless tail absorption. For example, additional monochromatic irradiation in the region of the absorption edge
causes a change in the photoconductivity of weakly excited photoconducting cadmium sulfide with increasing wavelength (7). We may expect clarification of the above-mentioned exciton mechanism at least in the cases of cadmium sulfide and zinc sulfide if we irradiate with light whose wavelength is longer than that corresponding to the absorption edge (8). If we take these experiments to be an indication of the existence of exciton excitations in cadmium sulfide and zinc sulfide, then they lead to consideration of energy transport. That is, we take the diffusion parameter of the excitons to be dependent upon the real structure, the diffusion constant to lie between $10^{3}$ and $10^{4}$ cm²/sec, and the lifetime to be of the order of … to $10^{-6}$ sec. However, there exists experimental evidence contrary to the concept of exciton excitation (9) and work in this direction is in progress. The lifetime of the excitons increases with decreasing temperature. Based upon analyses of the changes in line widths, we are able to take $3 \times 10^{…}$ sec as the lifetime for excitons in silicon and $1.5 \times 10^{-1…}$ sec for those in germanium for temperatures below 100°K (10).
III. THEORETICAL PROBLEMS IN PHOTOCONDUCTIVITY
Note that the theoretical treatment of the inner photoelectric effect requires as its basis that the process inverse to excitation of electrons by light be represented by recombination, which leads to a lowering of the concentration of charge carriers. A steady state between excitation and recombination is obtained if we maintain a steady illumination of the photoconductors. An interruption of the illumination causes a decay of the additionally excited charge carriers through recombination. The decay process is observed until one again reaches thermal equilibrium. Recombinations are, in principle, possible (1) from conduction band to valence band, (2) from conduction band to a defect state, (3) from a defect state to the valence band, (4) from a defect state to a defect state, and (5) from an exciton state to the ground state. One must distinguish between a radiative and a nonradiative recombination. Radiative recombination is called luminescence. In general, only a small part of the excited electrons recombine with emission of light quanta. In order to be able to describe the conduction processes in a photoconductor, one must determine experimentally the chemical nature of the lattice defects and the influence of other parameters in addition to the spacings of the bands and the positions of the defect states and their excitations. The usual experimental methods used for this purpose in semiconductor physics, which are, of course, in part based on the well-known laws of photoconduction, are supplemented with the bombardment experiments that were recently related to the changes in photoconductivity
brought about by bombardment. For example, this latter method was used to bring about a decision regarding the absorption processes of injected electron hole pairs in germanium and silicon through charge carriers produced photoelectrically (11). The bombardment of germanium with fast electrons brings about a change in photoconduction which sheds some light on the properties of the defect states newly formed through electron bombardment (12). These measurements on the changes of photoconduction brought about by electron bombardment, which manifest themselves as conduction up to λ = 6 μ (cf. Fig. 1), suggest additions to the known
FIG. 1. Spectral distribution of photoconduction for germanium, plotted against wavelength (μ) and photon energy (ev). Curve a: before electron irradiation; curve b: after long irradiation. The specimen was still n-conducting. Curve c: after transition. The radiation intensity was the same for all wavelengths (approximately $10^{-6}$ watts/cm²).
facts regarding the defect state properties derived from measurements on the Hall effect and conduction processes. For instance, measurements on the time dependence of the rise and decay of the photoconduction permit us to conclude whether the defect states act as trapping or recombination centers (the state of the surface must not be neglected, cf. Sec. VI). On the other hand, germanium doped with gold or with gold antimonide has a long wavelength limit at about 6 μ (in particular, germanium doped with gold exhibits a sensitivity up to 9.5 μ if prepared in a special way) (12). This brings up the interesting question of whether defect states brought about through doping can act in a manner equivalent to those brought about through electron bombardment. Work on photoconductivity in doped germanium is in full progress. The latest investigations show that
germanium doped with zinc with an impurity concentration < $3 \times 10^{16}$ zinc atoms/cm³ is photoconducting up to wavelength $\lambda_0 = 40$ μ. However, in the bombardment experiments one must consider how much the state of the surface is changed by the bombardment. The electron bombardment seems to influence strongly the state of the surface in the case of cadmium sulfide. Investigations of the change in photoconductivity of nondoped cadmium sulfide through neutron bombardment should be just as interesting (13). Neutron bombardment causes a small part of the cadmium to change into a radioactive isotope which then decays to a stable indium isotope. In this way cadmium sulfide is activated with indium. The photoconductivity is increased by a factor of 10 to 20 after irradiation with a flux of about $10^{16}$ thermal neutrons/cm², and the spectral distribution is changed through the appearance of a new maximum at 620 mμ. Especially significant are the possible changes in the photoconductive process brought about through the Frenkel defects produced by the bombardment of fast neutrons.
A. Lifetime: Theoretical Considerations
The concept of the lifetime $\tau$ of the free charge carrier enables us to obtain an over-all view of the theoretical situation. The rates of production of the free charge carriers per unit volume and per unit time are $g_n$ and $g_p$. Then we can define the lifetimes $\tau_n$ and $\tau_p$ by
As in Eq. (4), $\Delta n$ and $\Delta p$ are the densities of the electrons and holes, respectively, which exist in the stationary state during bombardment. If the photoconductor does not have any defect states, then in the case of band-band excitation or recombination $g_n$ is equal to $g_p$ and $\Delta n$ is equal to $\Delta p$. Therefore, $\tau_n$ is equal to $\tau_p$. On the other hand, in the case of a perturbed photoconductor $\Delta n$ is not equal to $\Delta p$ and $\tau_n$ is not equal to $\tau_p$. Usually, either $\Delta n$ is very much less than $\Delta p$ or $\Delta p$ is very much less than $\Delta n$. We define a time $\tau_0$ by
which gives the time dependence of the photocurrent after the irradiation has been concluded. Thus there exists a relationship between $\tau_0$ and $\tau$ of the type given by Eqs. 7 (in the case $\Delta p \ll \Delta n$, $\tau = \tau_n$; in the case $\Delta n \ll \Delta p$, $\tau = \tau_p$).
This is so because there always exists a steady state condition between a group of defect states $S$ and the free charge carriers in the bands. A loss of free charge carriers through recombination is compensated by charge carriers from the defect states. The quantity $\tau_0$, which can be determined by the decay of the photocurrents after irradiation, is not in itself sufficient without further considerations to permit us to determine the lifetimes of the charge carriers $\tau_n$ and $\tau_p$ (14). As we will underscore in Sec. VI, the conditions of the surfaces of semiconductors can more or less influence the photoconductive processes. The form of the decay curve is modified if fast recombinations in the surface are possible. The volume lifetime $\tau_0$ is defined in Eq. (6) for a photoconductor without considering the state of the surface. An effective lifetime $\tau_e$ for a photoconductor where the state of the surface has been considered can be expressed in the form
$$\frac{1}{\tau_e} = \frac{1}{\tau_0} + \frac{X}{\tau_0}, \qquad (8)$$
where X is a correction function which can be calculated approximately (15).
B. Saturated and Unsaturated Photocurrents
Equations (4) and (5) help us to understand the saturated photocurrents in insulators at high field strengths and also the amplification factors in photoconductors with large photoconductivity (16). In general, one speaks of a conventional semiconductor if the forbidden zone $E_L - E_V < 1.5$ ev. If $E_L - E_V > 1.5$ ev, the materials are called high-energy gap semiconductors or insulators. We define $BD$ as the cross section of the current, $L$ as the distance between the electrodes, and $T_n$ and $T_p$ as the times that the charge carriers take to travel the distance $L$. We employ the simplification that $g_n = g_p = g$, so the total photocurrent $\Delta I$ is represented by
$$\Delta I = g\,BDL\,e\,(x_n + x_p)/L, \qquad (9b)$$
where $x_n$ and $x_p$ are the displacements of the free charge carriers in the direction of the field during their lifetimes. The mean distances $x$ and the times $T$ are related by
For small field strengths $x$ remains much smaller than $L$ and $\tau$ much smaller than $T$. Equations (9) are equivalent to Eq. (4). For field strengths
in which $x_n + x_p \geq L$, the photocurrent increases if we exclude insulators. The photocurrent in insulators does not increase but saturates. Photoconductors with large conductivities behave in a different manner. The motion of the primary charge carriers in opposite directions would cause the formation of a space charge if secondary charge carriers were not introduced from the electrodes as soon as charge carriers left the semiconductor (secondary photoeffect). The ratio $\tau/T$ in Eq. (9a) becomes > 1. Therefore, we obtain an amplification factor $F$ which cannot become arbitrarily large, for there exist phenomena which occur at high field strengths that limit the photocurrent. From a theoretical point of view we may expect certain types of saturated photocurrents in certain semiconductors, which may be described as being space charge limited (17). The injection of the secondary electrons from the cathode acts, as already mentioned, to compensate the space charge. The compensation of the space charge requires, of course, a finite time, the dielectric relaxation time,
$$\tau_R = \varepsilon\varepsilon_0/\sigma, \qquad (10)$$
where $\varepsilon$ is the dielectric constant, $\sigma$ is the conductivity, and $\varepsilon_0$ is equal to $8.86 \times 10^{-14}$ A sec/V cm. As soon as the time of passage $T$ becomes less than the relaxation time $\tau_R$, the mechanism of injection of secondary electrons becomes ineffective. The result is a “space charge limited photocurrent.” The search for semiconductors which exhibit that type of saturation condition (yield > 1, saturation currents greater than that in an insulator by a factor $\tau/\tau_R$) should lead us to the clarification of the unknown behavior in semiconductors which investigations in hexagonal selenium already lead us to surmise. Those dielectric relaxation phenomena with a time constant $\tau_B$ are completely determined by the $\tau_R$ given in Eq. (10). Substances with small conductivity and with a time constant $\tau_R$ of the order of magnitude of seconds (for instance, cadmium sulfide) are especially suited for the investigation of conductivity inhomogeneities. Germanium with a conductivity of $10^{-…}$ ohm⁻¹ cm⁻¹ exhibits a $\tau_R$ of the order of … sec.
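As a rough numerical illustration of Eq. (10), the following Python sketch evaluates the dielectric relaxation time and tests the condition that the time of passage must not fall below it. The function names, the dielectric constants, and the conductivities are illustrative assumptions and are not values taken from the text; only the value of the permittivity of free space follows the usage above.

```python
EPS0 = 8.86e-14  # permittivity of free space in A sec / (V cm), as quoted with Eq. (10)


def dielectric_relaxation_time(eps_r, sigma):
    """Return tau_R = eps_r * EPS0 / sigma in seconds (Eq. 10).

    eps_r : dielectric constant of the material (dimensionless)
    sigma : conductivity in ohm^-1 cm^-1
    """
    if sigma <= 0:
        raise ValueError("conductivity must be positive")
    return eps_r * EPS0 / sigma


def injection_effective(transit_time, eps_r, sigma):
    """True while the time of passage T is not shorter than tau_R,
    i.e. while secondary electrons can still compensate the space charge."""
    return transit_time >= dielectric_relaxation_time(eps_r, sigma)


# Hypothetical materials (assumed numbers, for illustration only):
tau_low_sigma = dielectric_relaxation_time(eps_r=10.0, sigma=1e-12)
tau_high_sigma = dielectric_relaxation_time(eps_r=16.0, sigma=1e-2)
print(f"low conductivity:  tau_R = {tau_low_sigma:.3g} sec")
print(f"high conductivity: tau_R = {tau_high_sigma:.3g} sec")
```

With these assumed inputs the poorly conducting material has a relaxation time near one second, while the well conducting one relaxes many orders of magnitude faster, which is the contrast drawn in the text between cadmium sulfide and germanium.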
C. Advantages of the Concept of Lifetime
The lifetime $\tau$ comprises a number of functions of the free charge carriers, with various parameters. The considerations of the laws of photoconduction with the aid of $\tau$ should not be considered as complete. The improvement of these considerations with the concept of $\tau$ should give a better over-all view of different processes in certain photoconductors. For instance, the photoelectric yields or quantum yields (and also their maximum values) can be expressed in a simple fashion with the aid of the lifetime $\tau$ (18).
where $G$ is the gain and $V$ the applied difference of potential. The time of passage $T$ is given by
Combining Eqs. (11), (12), and (7) leads to extensive statements regarding the performance of photoconductors, in particular, their limitations by injection of space charge. We are also able to find methods of determining the lifetime $\tau$ itself (19). In the experimental setup shown in Fig. 2, a photocurrent flows
FIG.2. Measurement of field effect in ZnO powder.
through an insulator (photoconducting insulator zinc oxide powder, n-type) between the ohmic contacts. By combining Eqs. (4) and (5), the current is found to be proportional to $\tau_n$ (in the case $\tau_p \ll \tau_n$). If we apply an electric field perpendicular to the surface of the photoconductor by means of a unidirectional contact, the electrons are driven towards ground and no new electrons are supplied from the unidirectional electrode. In addition, free electrons are collected by the anode and are lost from the photocurrent. We denote the transit time of the electrons, while in the applied field between the unidirectional electrode and anode, by $T_R$, so that the lifetime $\tau$ of the electrons is given by
$$\frac{1}{\tau} = \frac{1}{\tau_n} + \frac{1}{T_R}. \qquad (13)$$
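The way in which Eq. (13) links a measured change of the photocurrent to the lifetime can be sketched in a few lines of Python. The sketch assumes, as stated above, that the photocurrent is proportional to the electron lifetime; the function names and the numerical values are hypothetical, and the transit time $T_R$ is simply taken as a given input.

```python
def combined_lifetime(tau_n, transit_time):
    """Eq. (13): 1/tau = 1/tau_n + 1/T_R, returned as tau (same units as inputs)."""
    if tau_n <= 0 or transit_time <= 0:
        raise ValueError("times must be positive")
    return tau_n * transit_time / (tau_n + transit_time)


def relative_photocurrent(tau_n, transit_time):
    """Ratio of the photocurrent with field to that without field,
    assuming the photocurrent is proportional to the lifetime."""
    return combined_lifetime(tau_n, transit_time) / tau_n


def tau_n_from_ratio(ratio, transit_time):
    """Invert ratio = T_R / (tau_n + T_R) to recover tau_n."""
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must lie strictly between 0 and 1")
    return transit_time * (1.0 - ratio) / ratio


# Hypothetical numbers: equal lifetime and transit time halve the photocurrent.
ratio = relative_photocurrent(tau_n=1e-3, transit_time=1e-3)
recovered = tau_n_from_ratio(ratio, transit_time=1e-3)
print(ratio, recovered)
```

The inversion shows why a knowledge of the transit time, and hence of the mobility and the geometry, is sufficient to obtain the lifetime from the field effect modulation without knowing the excitation rate.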
We must realize that Eq. (13) is not strictly valid but only a good approximation because the loss of the free electrons in the applied field is not a relaxation process. Let $d$ be the distance between the unidirectional electrode and anode so we can express $T_R$ with the aid of Eq. (12),
and so obtain the relative change of the photocurrent
+ P*U, __
7%
Therefore, the lifetime $\tau_n$ can be determined from a knowledge of the mobility $\mu_n$, from the amplitude and time constant of the field effect modulation of the photocurrents, without information on the gain $g_n$ and the incident intensity. In addition, the trap concentration may also be calculated.
D. Reaction Kinetic Models
An exact theory of photoconduction and luminescent phenomena can only be built up on the basis of a quantum mechanical or quantum electrodynamical defect state theory whereby all significant excitation and radiative as well as nonradiative recombinations are treated. Most of the theories formulated up until now may be considered phenomenological. This is so
FIG.3. Excitation in tail absorption region.
regardless of whether these theories are based on the band model of conductivity or on the law of mass action, according to which dissociation and association steady state processes of charge carriers in defect states, defect state-band transitions, etc., are expressed in terms of “electronic reaction equations.” The problems of the theoretical treatment already appear in the simplest case, i.e., where two types of defect terms $S$, trapping term $H$ and activator term $A$, are present (see Fig. 3). Of course, this simple energy level scheme is valid only in a few real cases. In those cases the reaction
kinetic differential equation system for photoconduction in the tail region might be formulated in the form
$$\dot{n} = s_1 - s_2 - s_3 + s_4, \qquad \dot{n}_H = s_3 - s_4. \qquad (16)$$
The excitation $s_1$ occurs with the frequency $a$ per second per cubic centimeter. In this way we obtain per unit volume $n$ free electrons which go over to the conduction band and $A^+$ ionized activators. The free electrons so formed can recombine with the ionized activators with a probability $\alpha$ (process $s_2$) or may be trapped by the empty trapping states $H$ with the probability $\beta$ (process $s_3$). The trapped electrons (concentration $n_H$) may be thermally excited to the conduction band with a probability $\gamma$. We then obtain for the individual processes $s_1$ through $s_4$ the following equations.
$$s_1 = a, \qquad s_2 = \alpha n A^+, \qquad s_3 = \beta n (H - n_H), \qquad s_4 = \gamma n_H. \qquad (17)$$
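A minimal numerical sketch of this model may help to fix ideas. It integrates the balance of the processes $s_1$ through $s_4$ with an explicit Euler step, first under constant excitation and then after the excitation is switched off. Two points are assumptions of the sketch and not statements of the text: the balance for the free electrons is taken as $s_1 - s_2 - s_3 + s_4$, and the concentration of ionized activators is fixed by neutrality, $A^+ = n + n_H$. All parameter values are arbitrary and dimensionless.

```python
def rates(n, n_h, a, alpha, beta, gamma, H):
    """Return (dn/dt, dn_H/dt) built from the processes s1..s4 of Eq. (17)."""
    a_plus = n + n_h               # assumed neutrality: ionized activators A+
    s1 = a                         # optical excitation of activators
    s2 = alpha * n * a_plus        # recombination with ionized activators
    s3 = beta * n * (H - n_h)      # capture by empty trapping states
    s4 = gamma * n_h               # thermal release from traps
    return s1 - s2 - s3 + s4, s3 - s4


def integrate(a, alpha, beta, gamma, H, dt=1e-3, steps=50_000, n=0.0, n_h=0.0):
    """Explicit Euler integration; returns the final pair (n, n_H)."""
    for _ in range(steps):
        dn, dnh = rates(n, n_h, a, alpha, beta, gamma, H)
        n += dt * dn
        n_h += dt * dnh
    return n, n_h


# Arbitrary illustrative parameters (assumptions, not data from the text).
params = dict(alpha=1.0, beta=1.0, gamma=0.1, H=1.0)

# Rise: constant excitation until the stationary state is reached.
n_ss, nh_ss = integrate(a=1.0, **params)

# Decay: excitation switched off, starting from the stationary state.
n_dark, nh_dark = integrate(a=0.0, n=n_ss, n_h=nh_ss, **params)

print(f"stationary: n = {n_ss:.4f}, n_H = {nh_ss:.4f}")
print(f"after decay: n = {n_dark:.4f}, n_H = {nh_dark:.4f}")
```

In the stationary state the time derivatives vanish, so that $s_1 = s_2$ and $s_3 = s_4$; the decay run illustrates how the slow thermal release from the traps feeds the conduction band after the free electrons themselves have largely recombined.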
In a similar manner we may obtain a system of differential equations for photoconduction in the region of the base lattice absorption in which electrons and holes are excited from the valence band (see Fig. 4). We must
FIG. 4. Excitation in lattice absorption region.
consider the processes $s_1$ (excitation in the valence band), $s_2$ (recombination with holes in the valence band), $s_3$ (capture in a trapping state), $s_4$ (thermal excitation and transition to the conduction band), $s_5$, and $s_6$ (transition to the valence band through an activator term as a recombination center). The solution of such a system of differential equations yields the quantities $n$ and $p$. In the case of large field strengths, the theoretical considerations become complicated. Under the influence of an electric field, the bands are curved and the electron may then penetrate into the forbidden zone. The probability of finding electrons in the zones near the band edge is proportional to the field strength (Franz-Keldysh effect). Photoconduction may then appear at lower excitation energies. Figure 5 depicts a concrete example of a suggested model based on
FIG.5. Suggested model for Ag-activated CdS.
experimental data (20). In silver-activated cadmium sulfide we find a level 1 ev above the normal valence band of cadmium sulfide. On the other hand, silver activation brings forth an additional 0.4-ev level below the conduction band. The photoconduction and luminescence measurements require transitions of the type illustrated in the model in Fig. 5: $s_1$ represents the excitation as a band-band transition; $s_2$, an infrared transition of captured electrons in the 3-μ region; $s_3$, a luminescent recombination of captured electrons with free holes; $s_4$, a capture of electrons from the conduction band; $s_5$, capture of holes from the valence band; and $s_6$, the liberation of captured holes through excitation in the 1-μ region. This model represents in a satisfactory manner the experimental findings in regard to the photoconduction in the base lattice region and in the infrared region as well as the luminescent phenomena. It can be treated by a system of differential equations according to the above-mentioned method.
In their most general form, the systems of differential equations for the calculation of charge carrier concentrations are of the type
In the above, $a_k$ represents the excitation through photon injection, $b_k$ the possible loss in the $k$-term group, and $S_j$ and $S_k$ the term densities. The summation indices run through the above-considered term groups; $n_j$ and $n_k$ represent the carrier concentrations in the $j$- and $k$-term groups respectively. The transition probabilities between these groups are given by $\gamma_{jk}$ and $\delta_{jk}$. One recognizes in the example (Fig. 5) that setting up such differential equations is justified in principle. One starts with simplifications which limit the validity of the model. Besides this, the large number of unknown parameters is in most cases not uniquely and exactly determinable. This requires that we introduce a further simplified reaction kinetic model. One ascribes to every lattice defect in the forbidden zone only a single term and assumes that every trap can accept only a single electron and every activator level only a single hole. This excludes, therefore, bivalent lattice imperfections. These, however, appear in germanium with an impurity of the order of $10^{15}$ atoms/cm³ of transition metal at liquid air temperatures. These form centers with double negative charge (21). One also assumes that holes are not bound by unoccupied trapping levels and electrons are not bound by unexcited activator levels. Further, no direct transitions of charge carriers between lattice defects are allowed. In general, it is postulated that the lattice defects are statistically distributed, independent of one another, and that this distribution is stable during the photoconduction process; i.e., that there is no diffusion of lattice defects. It should be especially noted that the equations do not take into account the influence of the surface. One knows, however, from a multitude of experiments, that it is just these properties of the surface which are important in the photoconduction processes, so much so that in certain cases they completely determine these processes.
E. Steady State Conditions
In order to utilize the many experimental results one often considers only the stationary processes; that is, the stationary state which sets in after excitation by a constant radiation intensity. This causes the elimination of the derivatives with respect to time in the equations. In order to calculate the charge carrier concentrations, for instance, the concentration of conduction electrons, it is sufficient then to solve second, third, or higher
order equations (22). One can, in principle, determine the number and type of defect term groups from the dependence of the photocurrent upon the incident radiation intensity. However, one should not underestimate the difficulties of this approach in many cases.
F. Rise and Decay Processes
If one investigates experimentally the nonstationary states as well as the stationary states (that is, the rise and decay processes in photoconduction), one should be able to utilize the curves of the solutions of the reaction kinetic system of differential equations to compare with the experimental results. However, here we experience difficulties. One must consider that in many cases we cannot obtain analytic representations of the solutions. This is in addition to the simplifications in the models mentioned above that we utilized in order to derive the system of differential equations and their limits of validity for applications. In these cases one is naturally forced to employ various sorts of approximation procedures (23). However, one is able to obtain a more precise knowledge of, for instance, the positions of the traps and their concentrations through the investigations of the fine structure of, for instance, the rise curves (24). This leads to a more exact knowledge of the parameters of a model. It is found, for instance, in the case of cadmium sulfide that the time dependence of the rise of the photoconduction is very strongly dependent upon the intensity of the incident light. One can see this in the plateaus and inflection points of the rise curves. These can be explained if one assumes that distinct groups of traps are filled with electrons successively; and after each filling of a group of traps, the photocurrent again rises. In such, and comparable, ways one is at least able to obtain certain ideas regarding the reaction kinetic parameters.
G. Photocurrent Dependence upon the Radiation Intensity
For all photoconductors (those with small as well as with large dark currents $i_0$), we obtain experimentally a power law dependence of the stationary photocurrent $\Delta i$ upon the radiation intensity $I_B$, which can be understood in terms of reaction kinetic considerations.
One tries to find the proper reaction kinetic model to fit the measured dependence of the photocurrent upon the intensity of radiation in order to recognize the particular photoconduction process. One can readily show in an example how the simplest reaction kinetic model leads to a power law relation between the photocurrent and the radiation intensity. The current increase at the beginning of irradiation, $(di/dt)_A$, is proportional to the radiation intensity $I_B$. Thus,
$$(di/dt)_A \sim I_B. \qquad (19a)$$
One assumes that the primary process produces pairs, that is, electrons and holes. The production rate of free charge carriers is naturally proportional to $I_B$:
$$(di/dt)_A \sim (dn/dt)_A \sim I_B. \qquad (19b)$$
If one simply takes as a basis a bimolecular recombination of free electrons and holes, one then obtains a recombination velocity proportional to the product of the concentrations of both kinds of charge carriers. If $n_0$ is the concentration of free electrons in the case of dark field photoconduction, then the concentration of electrons in the bright field case is $n_0 + \Delta n$. The concentration of the holes must be $\Delta n$ because the material is neutral. At the end of the irradiation of the photoconductor, the result is therefore
$$-(dn/dt)_E \sim (n_0 + \Delta n)\,\Delta n \sim -(di/dt)_E. \qquad (20)$$
For the stationary case it is necessary that
$$(di/dt)_A = -(di/dt)_E.$$
Because $i_0$ is proportional to $n_0$ and $\Delta i$ is proportional to $\Delta n$,
$$I_B \sim i_0\,\Delta i + (\Delta i)^2.$$
For the special case (a) that $\Delta i$ is much greater than $i_0$,
$$\Delta i \sim I_B^{1/2}; \qquad (22a)$$
for the case (b) that $\Delta i$ is much smaller than $i_0$,
$$\Delta i \sim I_B/i_0. \qquad (22b)$$
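The two limiting power laws can be checked numerically. The sketch below solves the quadratic relation between intensity and photocurrent for its positive root and estimates the local exponent of the power law. The proportionality constant c, the function names, and the numerical values are assumptions introduced only for this illustration.

```python
import math


def stationary_photocurrent(intensity, dark_current, c=1.0):
    """Positive root of c*I_B = i0*di + di**2, written so that no
    cancellation occurs when the photocurrent is much smaller than i0."""
    if intensity < 0 or dark_current < 0:
        raise ValueError("intensity and dark current must be non-negative")
    if intensity == 0:
        return 0.0
    root = math.sqrt(dark_current ** 2 + 4.0 * c * intensity)
    return 2.0 * c * intensity / (dark_current + root)


def local_exponent(intensity, dark_current, c=1.0, step=1.01):
    """Estimate d ln(di) / d ln(I_B) from a small multiplicative step."""
    low = stationary_photocurrent(intensity, dark_current, c)
    high = stationary_photocurrent(intensity * step, dark_current, c)
    return math.log(high / low) / math.log(step)


# Case (a): photocurrent much larger than dark current, exponent near 1/2.
print(local_exponent(intensity=1e6, dark_current=1e-3))
# Case (b): photocurrent much smaller than dark current, exponent near 1.
print(local_exponent(intensity=1e-6, dark_current=1e3))
```

Between the two limits the exponent passes continuously from 1 to 1/2, so that a measured exponent between these values does not by itself indicate a more complicated recombination scheme.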
H. Demarcation Levels
A special type of stationary consideration has been brought about advantageously through the introduction of the so-called demarcation levels (14). A demarcation level is an energy level from which transitions to higher as well as to lower levels are equally probable. The introduction of the demarcation levels is based upon the experimental fact that not all levels in the forbidden zone act as recombination centers. This is, for instance, the case for the levels in the neighborhood of the band edge. If we assume that a photoconductor contains an arbitrary number of defect states, then among the processes between excitation and recombination one must naturally be the slowest. This slowest process determines the speed of the total process, and, with respect to this process, all other processes also under irradiation are in approximate equilibrium. Equilibrium must be assumed for two groups of defect states. First, for defect states which are in occupation equilibrium with the free electrons in the
conduction bands, and, second, for those defect states which are in occupation equilibrium with the holes in the valence band. The equilibrium conditions under the assumption of a Fermi energy $E_n$ for electrons and $E_p$ for holes are
$$n = N_L \exp\left(-\frac{E_L - E_n}{kT}\right), \qquad p = N_V \exp\left(-\frac{E_p - E_V}{kT}\right), \qquad (23)$$
where $N_L$ is the effective term density for the conduction band and $N_V$ is that for the valence band. Both stationary Fermi levels (quasi-Fermi levels) are determined through the concentration of charge carriers in the conduction and valence bands. The photoconductor with a band spacing $E_L - E_V$ becomes, under irradiation, a semiconductor with a band spacing reduced by the amount $E_n - E_p$. The group of defect states which determine the course of the total process lies between the demarcation levels $D_n$ and $D_p$ (see Fig. 6). All defect states $S$ belong to the first group if the process $s_2$
FIG.6. Demarcation and quasi-Fermi levels in an irradiated photoconductor.
(transition of the electron in $S$ to the conduction band by a thermal excitation) is faster than the process $s_3$ (recombination in the valence band with a hole). Both processes are equally probable from the demarcation level $D_n$. The probabilities for the processes $s_2$ and $s_3$ are each half as large as the probability for the process $s_1$. We obtain an expression for the position of the demarcation level, for instance, the position of $D_n$, by utilizing the kinetic Ansatz. It is
$$D_n = E_n - kT \ln(2S^-/S). \qquad (24)$$
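The equilibrium conditions of Eq. (23) can be inverted to give the distance of a quasi-Fermi level from its band edge for a given carrier concentration. The following Python sketch does this; the effective term density, the carrier concentrations, and the temperature are assumed values chosen only for illustration, and the function names are hypothetical.

```python
import math

K_B = 8.617e-5  # Boltzmann constant in ev per degree K


def quasi_fermi_distance(term_density, concentration, temperature):
    """Distance (in ev) between a band edge and its quasi-Fermi level,
    from Eq. (23): E_L - E_n = kT ln(N_L / n), and likewise for holes."""
    if term_density <= 0 or concentration <= 0 or temperature <= 0:
        raise ValueError("all arguments must be positive")
    return K_B * temperature * math.log(term_density / concentration)


def carrier_concentration(term_density, distance, temperature):
    """Eq. (23) in its direct form: n = N exp(-distance / kT)."""
    return term_density * math.exp(-distance / (K_B * temperature))


# Assumed values: N_L = 1e19 cm^-3, T = 300 K, two levels of irradiation.
weak = quasi_fermi_distance(1e19, 1e13, 300.0)
strong = quasi_fermi_distance(1e19, 1e16, 300.0)
print(f"weak irradiation:   E_L - E_n = {weak:.3f} ev")
print(f"strong irradiation: E_L - E_n = {strong:.3f} ev")
```

The larger carrier concentration gives the smaller distance, which corresponds to the statement that the stationary Fermi levels approach the band edges under increasing irradiation intensity.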
With the aid of the concept of the demarcation levels, we can understand very well certain processes in photoconductors. This can be accomplished without more exact calculations which would otherwise be necessary if we were to base them upon a kinetic model. The stationary Fermi levels and, therefore, also the demarcation levels approach the band edges under increasing irradiation intensity. In this manner, the number of recombination possibilities increases. The photocurrent exhibits superlinearity if the defect states get into the recombination zone thereby slowing down the recombination process. Indeed, we are able to observe superlinear photocurrents (25), that is, photocurrents which increase more rapidly than the radiation intensity. Above all, the dependence of the position of the stationary Fermi levels (and, therefore, the demarcation levels by which it is possible to include every type of recombination center) upon the temperature allows, in principle, every type of variation of the dependence of the photocurrents upon the illumination intensity.
I. Wave Vectors and Crystal Momenta
It is known that the electronic levels in the allowed energy bands of a crystal can be determined by the wave vector $\mathbf{k}$ of the electron waves. The energy can be expressed through the components of the wave vector $k_x$, $k_y$, and $k_z$. The wavelength of the electron wave is given by
$$\lambda_n = 2\pi/|\mathbf{k}|. \qquad (25)$$
The product $\hbar\mathbf{k}$ (with the dimensions of a momentum) is called the crystal momentum,
$$\mathbf{M} = \hbar\mathbf{k}. \qquad (26)$$
The energy can be considered to be a function of this vector $\mathbf{M}$. When in thermal equilibrium, most of the electrons of the conduction band occupy energy levels which correspond to the values of $\mathbf{k}$ in the neighborhood of the energy minimum while the holes in the valence band occupy, in a similar manner, levels near the energy maximum. The most clear-cut situation is shown in Fig. 7. This is the case where the bottom of the conduction band and the top of the valence band are at $M = 0$. The energy of an electron in the conduction band may be interpreted upon the basis of the motion of the electron in a crystal as kinetic energy. The energy $E_L$, given by $E_L = E_{Lk}(M)$, is greater than or equal to 0 and is equal to 0 for $M = 0$. In a similar fashion, the energy of an electron in the valence band is $E_V = -\Delta E - E_{Vk}(M)$, where $E_{Vk}(M)$ is greater than or equal to 0 and is equal to 0 when $M = 0$. An electron in the valence band with a crystal momentum $\mathbf{M}$ may
FIG. 7. Simplest case: minimum of the conduction band and maximum of the valence band for $M = 0$ (energy of the conduction and valence bands plotted against $M$).
be raised to a state in the conduction band with a crystal momentum $\mathbf{M}^*$ by absorbing a quantum with sufficient energy. This is expressed by
$$h\nu = \Delta E + E_{Lk}(M^*) + E_{Vk}(M). \qquad (27a)$$
From this we can calculate the smallest as well as the largest energy of the incident radiation with frequency $\nu$ necessary for such a transition. The smallest is given by
$$h\nu = \Delta E. \qquad (27b)$$
In order that the transition have a large probability, the condition
$$\mathbf{M}^* = \mathbf{M} + M_{ph}\,\mathbf{i} \qquad (28)$$
must be fulfilled. This is based upon the quantum mechanical expression for the transition matrix element. In Eq. (28), $M_{ph}$ is equal to $h/\lambda_{ph}$ for the absorbed photon with wavelength $\lambda_{ph}$ which is moving in the direction of the unit vector $\mathbf{i}$. In the transition, the sum of the crystal momentum and the photon momentum is a constant. In the case where $M_{ph}$ is much less than $|\mathbf{M}|$ or $|\mathbf{M}^*|$ we can assume $|\mathbf{M}^*|$ is equal to $|\mathbf{M}|$. As a consequence of this we have a vertical or direct transition which is shown, for example, in Fig. 7 by an arrow. This means that the transition takes place with constant $\mathbf{M}$. The probability for a direct transition is approximately independent of $\mathbf{M}$ and the probability for absorption depends only upon the number of available states if the magnitudes of $\mathbf{M}$ are not too large, if the transition is an allowed one, and if the electron hole interaction is negligible. The number of states in the conduction band for energies between $E$ and $E + dE$ is of the form $CE^{1/2}\,dE$ where $C$ is a constant. The number of possible
transitions between $\nu$ and $\nu + d\nu$ is, therefore, of the same form. The absorption coefficient can, therefore, be expressed by
$$\alpha = c_1 (h\nu - \Delta E)^{1/2}, \qquad (29a)$$
where $h\nu$ is greater than $\Delta E$; if $h\nu$ is less than or equal to $\Delta E$, $\alpha = 0$. In the range where $h\nu - \Delta E$ is small, Eq. (29a) is not valid; for in this case, $|\mathbf{M}|$ is also small. The electron and hole move separately and slowly and are subject to a strong Coulomb attraction which strongly influences the shape of the absorption spectrum. If the transition is forbidden for $M = 0$, the transition probability vanishes there. For small deviations of $M$ from 0 the transition probability is proportional to $|\mathbf{M}|^2$ and, therefore, proportional to $h\nu - \Delta E$. Thus,
$$\alpha = c_2 (h\nu - \Delta E)^{3/2}, \qquad (29b)$$
if $h\nu$ is greater than $\Delta E$. If $h\nu$ is less than or equal to $\Delta E$, $\alpha = 0$. If we consider the possibility that the Coulomb interaction between electron and hole can lead to the formation of an exciton state, we obtain for the absorption of a quantum, whose energy is not sufficient to cause a direct transition, the formation of an exciton such that
$$h\nu = \Delta E - E_{exc}, \qquad (30)$$
where $E_{exc}$ is the binding energy of the exciton. The direct transitions therefore cause a hydrogen-like line absorption spectrum which extends to the long wavelength side of the edge of the base lattice as already mentioned (2, 3, 6). The simple situation described in Fig. 7 is valid only for a few semiconductors (for instance InSb). In most semiconductors the minimum of the conduction band and the maximum of the valence band do not have the same value of the crystal momentum $\mathbf{M}$. One must note that the range of values for $M$ is not arbitrarily large because it can be shown that the energy is a periodic function of the crystal momentum. The maximum value of $M$ is of the order of $h/2d_G$ where $d_G$ is the lattice constant; therefore, the smallest electron wavelength is of the order of $2d_G$. Figure 8 is a representation of the energy band structure for germanium, based upon experimental information. One sees clearly that the direct transition with $M = 0$, which is depicted in Fig. 8 as $s_1$, is not the one which requires the smallest amount of energy; the one with the smallest energy requirement is shown as $s_2$. The indirect transition requires a large change of $M$ so that the transition is not possible by the absorption of only a single photon. This latter transition can only take place if a photon is absorbed at the same time as a phonon is either absorbed or emitted. Let
FIG. 8. Energy band structure of germanium in the (111) direction (energy E versus crystal momentum M).
be the momentum at the minimum in the conduction band and M_G be the momentum of the phonon which is necessary for the conservation law; then we have
$$M_L = \pm M_G, \tag{31}$$
where the sign depends upon whether the phonon is absorbed or emitted. The energy of the phonon in a crystal can be expressed, in analogy with the electron, as a function of its wave vector k'; the momentum of the phonon can likewise be given analogously to Eq. (26):
$$M_G = \hbar k'. \tag{32}$$
Let E_G be the energy of a phonon with momentum M_G = M_L; then the minimum frequency for the transition s2 (see Fig. 8) is given by
$$h\nu = \Delta E \pm E_G. \tag{33}$$
Again, the sign depends upon emission or absorption of phonons. This simplified representation by no means exhausts the phonon problem. The problem becomes more complicated because of the different types of vibrations possible in a crystal, each with its own kind of phonons. Finally we must consider that indirect exciton transitions are possible. Whereas the direct exciton transitions result in a line absorption, as mentioned earlier, this is not true for the indirect transitions, for which we obtain a continuous absorption spectrum. The lowest absorption
frequency which results in the formation of excitons through absorption or emission of phonons is given by
$$h\nu = \Delta E \pm E_G - E_{exc}. \tag{34}$$
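The threshold conditions of Eqs. (33) and (34), together with the direct exciton condition hν = ΔE − E_exc, lend themselves to a small numerical illustration. The following Python sketch is only an aid to reading the formulas: the function name and the sample energies are hypothetical and are not taken from the text, and the sign convention assumes that an absorbed phonon lowers the required photon energy whereas an emitted phonon raises it.

```python
def transition_thresholds(delta_e, e_phonon, e_exciton):
    """Minimum photon energies, in the same unit as the inputs (e.g. ev).

    delta_e   -- band separation (Delta E) relevant to the transition
    e_phonon  -- phonon energy E_G
    e_exciton -- exciton binding energy E_exc
    """
    return {
        # direct exciton line: h*nu = Delta E - E_exc
        "direct_exciton": delta_e - e_exciton,
        # Eq. (33): h*nu = Delta E -/+ E_G
        "indirect_phonon_absorbed": delta_e - e_phonon,
        "indirect_phonon_emitted": delta_e + e_phonon,
        # Eq. (34): h*nu = Delta E -/+ E_G - E_exc
        "indirect_exciton_phonon_absorbed": delta_e - e_phonon - e_exciton,
        "indirect_exciton_phonon_emitted": delta_e + e_phonon - e_exciton,
    }


# Hypothetical sample values in ev, chosen only for illustration.
thresholds = transition_thresholds(delta_e=0.70, e_phonon=0.03, e_exciton=0.005)
for name, energy in thresholds.items():
    print(f"{name}: {energy:.3f} ev")
```

The lowest threshold in this sketch is always the indirect exciton transition with phonon absorption, since both E_G and E_exc are subtracted from ΔE.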
IV. DISLOCATIONS
Dislocations are becoming more important in the field of photoconductivity. Step (edge) dislocations, as for instance in germanium, act as electron traps (26), which are able to decrease the lifetime τ of the liberated photoelectrons. Similar conditions may be obtained in cadmium sulfide (27). Figure 9 shows the lattice configuration of a step dislocation in cadmium
FIG. 9. Structure of a step dislocation in CdS; the drawing is labeled (110).
sulfide which clarifies the action of a dislocation line as an electron trap. The data shown in Fig. 10 are a direct proof of the influence of the dislocations on photoconductivity. The photocurrent is plotted as a function of the distance from the grain boundary. The material was illuminated by scanning with a movable slit. One may expect similar influences upon the photoconductivity in other semiconductors. When, for instance, aluminum migrates in silicon, it does so preferentially along the dislocation lines. Consequently the Al concentration depends upon the perpendicular distance from the dislocation line. Upon suitable doping, the dislocation line is enclosed by a tubular pn-junction (28). Such treatment cannot but influence the photoconduction process. One may expect that complete planes of step dislocations, which exist where there is twinning, should show exceptional photoconduction effects, because in such lattice defects the activation energy may take on exceptionally small values. We would like to mention investigations on the increase of noise and change in photoconduction in evaporated lead sulfide layers which were mechanically treated. It is suggested that lead sulfide is a suitable substance for investigations on dislocations (29).
FIG. 10. Photocurrent in CdS as a function of distance (mm) from the grain boundary.
V. NEGATIVE PHOTOCONDUCTION
In spite of the great interest shown in the normal inner photoelectric effect, as manifested by the large number of investigations of this phenomenon, there are still a large number of unsolved problems. This holds to an even larger extent for the negative photoelectric effect, in spite of the fact that both effects have been known for a long time. We consider negative photoconduction to apply not only to those effects which are characterized by a decrease of the conductivity under irradiation to a level below the dark conductivity (for instance, selenium, germanium (1), and Ag2S (30), which under certain conditions of preparation can give positive or negative effects) but also to such cases where the photoconductivity first exhibits a rise and, after having reached a maximum, decreases to a lower level which may in some cases be below the dark conductivity. According to this, the appearance of fatigue in photoconductors, photoelements, and photocells, where the changes in conductivity are exhibited in this or in similar manners, may be considered to be a negative photoeffect. Fatigue has been the subject of investigations for many years because of the great importance of constant photo devices in industry. Photocells based on photoemission are also mentioned here because all compound photocathodes exhibit semiconductor characteristics, and therefore what applies to photoconduction (positive or negative)
should also apply to this type of photoeffect (31). The conditions under which negative photoconduction occurs are very complicated and are difficult to understand from a common point of view. This is so because, both in natural crystals and in artificially produced photoconductors, negative and positive effects are reversible depending upon the wavelength of the incident light, the temperature region, and the applied voltage. Examples of materials in which these phenomena occur are MoS2, Sb2S3, single crystal Cu2O, single crystal CdS (32), pressed powder of ZnO, and doped Ge (33).
FIG.11. Explanation of negative photoconduction.
There is no dearth of kinetic models to explain the negative photoeffect (34). One model (35) is based essentially upon an exciton mechanism. Another model, which was extensively discussed because of its clarity, has the advantage of being readily checked by experiment and may lead to new considerations regarding doubly-charged traps (1, 36). The abnormal negative photoconduction is usually excited by light on the long wavelength side of the absorption edge, if we exclude all the negative photoeffects which can be explained through a change of the adsorption equilibrium or through a change of the stoichiometric proportions. In this way we exclude a large part of the fatigue effects; therefore, we can assume (see Fig. 11) that the excitation process s1 consists of the liberation of an electron from the valence band and its transition to a trap (or the liberation of a hole from a trap and its transition to the valence band). The processes s2 and s3 are recombination processes of a type where an electron which already exists in the conduction band in the nonirradiated condition combines with a hole in the valence band at a recombination center. This must occur
at the recombination center because the direct transition of the conduction electron to the valence band is fairly improbable. If the recombination processes s2 and s3 are faster than the thermal transition of the trapped electrons to the conduction band, then the stationary electron concentration is smaller than in the dark. One must therefore observe a decrease in the conductivity if the free minority carriers are split off from a defect state by photon injection. This holds when the defect state has a recombination coefficient for free majority carriers which is orders of magnitude smaller than that of other defect states acting as recombination centers. If the very small recombination coefficients are interpreted as being a result of repulsive Coulomb forces, then one has to assume that the defect state from which the minority carriers are split off is able to trap two electrons (n-type conductor) or to yield two electrons (p-type). This holds, of course, in the nonirradiated condition. Such circumstances seldom occur and, therefore, one must conclude that the negative photoeffect can be observed less often than the positive photoeffect. If the small recombination coefficients are a consequence of a small transition probability, then one may dispense with the assumption of doubly-charged defect states.
VI. SURFACE CONDITIONS
It has already been noted twice in this report that the photoconduction processes are strongly or completely influenced by the properties of the surface of a semiconductor. Figure 12 illustrates an example of this (37). It depicts the spectral distribution of the photocurrent of cadmium sulfide as a function of the humidity. One recognizes that both the shape of the spectral distribution and the magnitude of the photocurrent change noticeably as a function of the humidity. With increasing humidity the photocurrent decreases in the region of the characteristic absorption at small wavelengths. From this, one must conclude that the moisture adsorbed on the surface forms traps. The traps clearly exhibit a large binding energy for the electrons, so that the electrons can easily recombine with the holes. It is clear that the part of the photocurrent which flows in the neighborhood of the surface is substantially decreased, because the moisture is adsorbed only on the surface and so the traps are formed only on the surface. Since short-wavelength light in the region of self-absorption is strongly absorbed near the surface, whereas light of longer wavelength is absorbed less in the surface region but much more within the volume of the crystal, it is understandable that the part of the photocurrent in the region of self-absorption decreases strongly. It was probably in germanium that the first scheme to explain the surface recombination was set up (38). In Fig. 13 we have a simple example of the usual current type of consideration. There we have shown the
FIG. 12. Dependence of the spectral distribution of CdS crystal upon the moisture in the surrounding atmosphere (abscissa: wavelength, 3000 to 6000 Å). Curve 1, 10 mm Hg; curve 2, average vacuum; curve 3, high vacuum.
FIG.13. Energy level diagram of a surface of Ge.
equilibrium condition for the surface and semiconductor. As usual, E_L and E_V are the boundaries of the forbidden zone and E_F is the position of the Fermi level; E_Fi is the value of E_F in an intrinsic sample. The presence of the surface states within the forbidden zone permits space charges near the surface and so the formation of a potential barrier. Let n_i be the intrinsic (inversion) density of the carriers in an intrinsic sample; then we can as usual state
$$n = n_i \exp(e\varphi/kT), \qquad p = n_i \exp(-e\varphi/kT), \tag{35}$$
where eφ = E_F − E_Fi. In Fig. 13, φ_i is the value in the bulk semiconductor and φ_ob the value at the surface of the semiconductor. The impurity density within the bulk semiconductor determines φ_i, whereas φ_ob depends upon the barrier height and is given by
$$e\varphi_{ob} = E_F - E_{Fi} + eV_{ob}. \tag{36}$$
Investigations on the surface states in germanium have revealed surface terms with different reaction times (39). One is able to differentiate between slow and fast surface terms. The number of the fast terms is probably considerably smaller than that of the slow ones. The fast terms are probably continuously distributed around the band center, and some others of greater width are found at a distance of the order of 0.1 ev from the band center. The scheme represented in Fig. 13 is an extremely simplified case. The influence of the surface states in zinc oxide single crystals upon the photoconduction was investigated in some detail (40). One can obtain a very large surface conductivity in zinc oxide crystals by allowing atomic hydrogen to activate the surface of the crystal. This brings about an enriched layer of n-type conductivity. One can reduce the surface conductivity again by adsorption of oxygen or through heating. One can determine the lifetime of the excited electrons by measuring the surface photoconduction. The lifetimes of these electrons lie between 0.5 sec and 7 × … sec as the irradiation intensity is increased by four orders of magnitude. If the surface conductivity is changed by four orders of magnitude, the lifetime remains constant. We can conclude therefrom that the lifetime is independent of the amount of gas adsorbed on the surface. Figure 14 illustrates a scheme of the surface of the zinc oxide crystal with an enriched layer and with surface states. This scheme is sufficient to explain the surface processes. One assumes a continuum of electron-trapping states near the lower edge of the valence band. One is then able to get information regarding the density of the fast terms. By changing the surface terms and also the field (change of carrier density due to transverse electric fields), the edge of the conduction band moves toward El at the surface.
Under absorption of light, the quasi-Fermi level for electrons moves toward the conduction band and the quasi-Fermi level for holes moves toward the valence band with increasing irradiation intensity. Lead sulfide is a particularly interesting but obviously a more complicated photoconductor. Chemically deposited or evaporated polycrystalline lead sulfide films on insulated substrates exhibit, as is well known,
large photosensitivity in the visible region and in the near infrared region. PbSe and PbTe films exhibit similar behavior but with greater infrared sensitivity. The polycrystalline PbS films, with thicknesses from 0.1 µ to 1 µ, have a large surface-to-volume ratio. In addition, the minority carriers can easily diffuse towards the surface, so that one may expect the surface conditions to exert a great influence upon the photoconduction processes. The surface conditions are determined not only by the outer
FIG. 14. Energy scheme of the surface of a ZnO crystal in thermal equilibrium.
surface of the films but also by the intercrystalline grain boundaries. Experiments indicate that the crystallites are surrounded by intercrystalline oxide barriers and that the crystallites are essentially of the p-type. This should not exclude the possibility of the existence of small n-type regions within the crystallites. The appearance of photo-emf's in micro regions confirms the assumption that p-type regions can surround n-type regions (41). The conductivity can be expressed through
$$\sigma = e\,p\,\mu_p^{*}, \tag{37}$$
and the effective mobility of the holes is limited by the intercrystalline barriers:
$$\mu_p^{*} = \mu_p \exp(-E_B/kT), \tag{38}$$
where E_B is the height of the intercrystalline barriers. The energy schemes for the intercrystalline barriers and for the outer surface barrier can be illustrated on the basis of Fig. 13 (42). Investigations with the aid of the field effect permit us to state that the surfaces of the crystallites have "fast" states and the outer surfaces have "slow" states. The various results for the different semiconductors can in principle be understood in terms of the surface states discussed. In summary, one should
note that there are still a large number of open questions regarding the complicated photoconduction behavior at the surfaces of semiconductors. It seems to be generally valid that the photocurrents from pure surface photoconduction obey laws of the same type as those for photocurrents in a homogeneous volume (43). One is therefore not able to conclude from the measured dependence of the photocurrents upon the radiation intensity whether one is dealing with a photocurrent from within the volume or from a surface.
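The barrier-limited conductivity of polycrystalline films, Eqs. (37) and (38), is easily evaluated. The Python sketch below is a minimal illustration, assuming a hole density p in cm^-3, a mobility in cm²/(v sec), and a barrier height E_B in ev; the function names and the sample values are hypothetical.

```python
import math

K_B_EV_PER_K = 8.617333262e-5        # Boltzmann constant, ev per kelvin
ELEMENTARY_CHARGE = 1.602176634e-19  # coulomb


def effective_mobility(mu_p, barrier_ev, temperature_k):
    """Eq. (38): mu_p* = mu_p exp(-E_B / kT)."""
    return mu_p * math.exp(-barrier_ev / (K_B_EV_PER_K * temperature_k))


def conductivity(hole_density, mu_eff):
    """Eq. (37): sigma = e p mu_p*, in (ohm cm)^-1 for the units above."""
    return ELEMENTARY_CHARGE * hole_density * mu_eff


# Hypothetical film: p = 1e17 cm^-3, mu_p = 100 cm^2/(v sec), E_B = 0.1 ev.
for temperature in (200.0, 300.0):
    mu_eff = effective_mobility(100.0, 0.1, temperature)
    print(temperature, mu_eff, conductivity(1e17, mu_eff))
```

For a fixed barrier height the effective mobility, and with it the conductivity, rises steeply with temperature, which is the characteristic signature of transport limited by intercrystalline barriers.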
VII. OHMIC AND UNIDIRECTIONAL CONTACTS, pn-JUNCTIONS
In principle, we can explain the strong and sometimes extreme influence on the photoconduction process, as shown by experiment, by the boundary layers on the electrodes and the pn-junctions (as in the previous section, the boundary layers have surface states). We need to note, however, that in surface state photoconduction one generally assumes a recombination of the produced charge carriers in the surface states.
A. Unidirectional and Isotropic Contacts
One knows that in metal-semiconductor boundaries, the contacts generally exhibit rectifying properties. An electron which goes from the metal to the semiconductor must overcome a work function which may be approximately the difference between the metal-vacuum and semiconductor-vacuum work functions. At the contact, the work function causes a concentration n_R in the semiconductor which depends only upon temperature and is independent of any change in the electron concentration by photon injection into the interior of the semiconductor. Within the boundary layer of the semiconductor which is in contact with the metal, there is a concentration gradient of charge carriers and so a space charge. If the thickness of the boundary layer is small compared to the mean free path of the electrons, the rectification process may be described by the diode theory, in which the boundary layer is idealized as a vacuum. From theory we obtain
$$i_{sp} = A\left\{1 - \exp\left(-\frac{eV}{kT}\right)\right\},$$
where i_sp is the current in the blocking direction, i_D is the current in the forward direction, V is the applied voltage, and A is the saturation current, which depends upon the thermal velocity of the electrons and n_R. For boundary layers which are large compared to the mean free path of the electrons, it is necessary
to take into account many collisions of the electrons with defect states, phonons, etc. These considerations lead to a diffusion theory in which i_sp and i_D are given by
where E_R is the magnitude of the boundary layer field strength. If a photoconductor has a unidirectional contact, one should expect a photocurrent-voltage characteristic which is similar to the dark characteristic of a semiconductor with the same unidirectional contact. The experiments verify these expectations (44). In order that the semiconductor or the photoconductor exhibit rectifier characteristics, it must be bounded by at least two boundary layers, of which one must be a unidirectional contact (i.e., a depleted layer). The depleted layer is a layer in which the charge carrier concentration is less than that within the interior of the semiconductor. An enriched boundary layer is one in which the charge carrier concentration is greater than that in the interior of the semiconductor and, in contrast to the previous case, it is a necessary condition for an isotropic contact. The investigations of photoconductors with negligible dark current, that is, insulators (insulating ZnO), have gained importance for the study of the influence of rectifying contacts on the photoconduction processes (45). Rectifying contacts for insulating ZnO may be produced in various ways; for instance, by negative ions from a corona discharge in air, by negative ions of an electrolyte, and by a p-type semiconductor. It is difficult to obtain ohmic contacts for ZnO. Therefore "quasi-ohmic" contacts (Hg contact) are regarded as being sufficient substitutes. Experiments show that rectifying contacts on ZnO lead to saturated photocurrents, whereas ohmic contacts result in secondary photoconduction. Ohmic contacts for CdS can be produced by melting Ga or In onto the crystals or by evaporating these substances. Silver, gold, and graphite also result in ohmic contacts with CdS if the crystal surface is subjected to a cleaning procedure (electron and ion bombardment) prior to connecting the electrodes.
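The saturation of the blocking-direction current in the diode theory can be made concrete with a few numbers. The Python sketch below assumes the usual diode-theory form i_sp = A[1 − exp(−eV/kT)], with V the magnitude of the applied voltage in volts; the symbol V, the function name, and the sample values are assumptions introduced for illustration and are not taken from the text.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant, ev per kelvin


def blocking_current(saturation_current, voltage, temperature_k=300.0):
    """Diode-theory current in the blocking direction.

    Assumed form: i_sp = A * (1 - exp(-e*V / kT)), with V >= 0 in volts.
    """
    thermal_voltage = K_B_EV_PER_K * temperature_k  # kT/e in volts
    return saturation_current * (1.0 - math.exp(-voltage / thermal_voltage))


# Hypothetical saturation current A = 1e-6 amp.
for v in (0.0, 0.025, 0.1, 0.5):
    print(v, blocking_current(1e-6, v))
```

Under this assumption the current rises from zero and approaches the saturation value A once V exceeds a few multiples of kT/e, which is about 0.026 v at 300°K.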
B. pn-Junctions
It is known that one is able to produce isotropic transitions within a semiconductor by doping it in such a manner that one part with an excess of charge carriers is bounded by another part with a lack of charge carriers
(pn-junction). The electron and hole concentrations in a pn-junction are illustrated in Fig. 15 for the stationary case. When a field is applied in the forward direction, electrons and holes are concentrated in the pn-junction (decrease of the total resistance of the semiconductor). If the field is applied
p-conduction
n-concentration
' pn- junction '
n-conduction-
x
FIG.15. Electron and hole concentrations in pn-junction.
in the blocking direction, then both types of charge carriers are taken away from the junction (increase of total resistance of semiconductors). The influence of pn-junctions on the photoconductive processes can be expected especially in those cases where an apparently homogeneous material becomes inhomogeneous through diffusion processes causing pn-junctions (see Sec. IV, Dislocations).
C. Photo-emf in Boundary Layers
In addition to the photoconductivity changes of photo-semiconductors through irradiation [which are described by Eq. (4)], there may appear photo-emf's in pn-junctions or in a metal-photoconductor junction where there is a depleted layer. For example, Fig. 16 illustrates schematically the formation of a photo-emf through a depleted layer in a metal-defect-semiconductor boundary (for example, Se or Cu2O). In the unidirectional layer there is a depletion of holes. If process s (formation of electron-hole pairs) is caused by irradiation, then the electrons are driven by the action of the boundary potentials into the metal and the holes are driven into the semiconductor before a recombination is possible. The unidirectional electrode is negatively charged by the electrons with respect to the photoconductor. Similar conditions exist for pair production through irradiation in a pn-junction (46). The space charge field at a pn-junction is directed so that the electrons drift into the n-region and holes into the p-region. If there is pair production in the immediate neighborhood of the space charge region (that is, in the field-free space) and if no recombination takes place during the time
FIG. 16. Appearance of a photo-emf at a metal-defect-semiconductor boundary (labels: process s; depletion boundary layer, i.e., unidirectional layer).
of diffusion of the carriers to the space charge zone, then these carriers participate in the formation of the photo-emf. Photoconduction processes and the formation of photo-emf's are the basic processes which enable us to understand the technically important photodiodes and phototransistors.
VIII. PHOTOELECTROMAGNETIC EFFECTS
In addition to the production of photo-emf's in a semiconductor which has been described in Sec. VII, one can cause pair production through photon injection in the surface of a semiconductor. By separating the diffusing charge carriers through the application of a magnetic field as they move into the semiconductor, one causes another photo-emf. This effect is known as the photoelectromagnetic, photomagnetoelectric, or photogalvanomagnetic effect. It was first observed in Cu2O at liquid air temperatures and under 11,000 gauss, where the measured potential was up to 20 v (47). This effect was later also observed in Ge, PbS, InSb, and Si, and not only at low temperatures (48). The possible arrangements in which the effect was originally measured are illustrated in Figs. 17a and 17b. One utilizes electrodes in the yz-plane when the magnetic field is applied in the z-direction and the radiation is incident on the xz-plane, as illustrated in Fig. 17a; or one may place probes in the positions shown in Fig. 17b under the same illumination and magnetic field direction as in Fig. 17a. Between the electrodes, or between the corresponding probes in the various positions, there exists a photo-emf or, at short circuit, a photocurrent. In the case of the photoelectromagnetic effect, the forces which separate the charges are the Lorentz forces in a magnetic field instead of the Coulomb forces in an electric field.
FIG. 17. Photoelectromagnetic effect. (a) Arrangement with electrodes; (b) another possible setup, with probes.
The theoretical treatment of the photoelectromagnetic effect indicates that with the aid of this effect there is the possibility of a direct determination of the velocity of the surface recombination and the lifetime of the free charge carriers. Figure 18 illustrates schematically the basis of the theoretical treatment. The applied magnetic field causes circular currents which enable us to differentiate between a transverse and a longitudinal photoelectromagnetic effect (49). The longitudinal effect in the irradiated parts may be
FIG. 18. Schematic representation of the longitudinal and transverse photoelectromagnetic effects in a semi-infinite plate. V_t, potential difference of the transverse effect; V_l, potential difference of the longitudinal effect; I, total current; E, magnetic field strength; v_1, v_2, v_3, surface recombination velocities of the irradiated (index 1), not irradiated (index 2), and side (index 3) faces of the plate.
classified as being either linear or quadratic in the magnetic field. Under illumination and without an applied magnetic field one obtains the Dember effect (Dember emf) (50). There occurs across the plate an electric field which is proportional to the concentration gradient and is, therefore, largest in the region near the illuminated surface. The system of differential equations on which the kinetic processes of the photoelectromagnetic effect are based contains the Dember field strength in addition to the surface recombination rates, which may be assumed constant. In order to neglect the changes in resistance in the magnetic field,
one must assume first of all small Hall angles. The theory can be generalized to arbitrary Hall angles in order to interpret theoretically the results in InSb and InAs. Under certain approximations one obtains for the short circuit current Δi_s, Δi_s
=
~ T B M ~ B ~~LD d1
+ pm2Bnt21 +
TnTOh
1
d1
+
(41) pn2Bm2
if one neglects the contribution of the holes because of their small mobility. The quantities entering Eq. (41) include the surface recombination rate, the magnetic induction B_m, and the diffusion length L_D (the square root of the product of the lifetime and the diffusion constant). The maximum photocurrent caused by the photoelectromagnetic effect corresponds to a quantum yield of 1, as in the case of the photoelements. From this, one may expect that the technical utilization of the photoelectromagnetic effect is of interest only in special cases, for example for application in the infrared region (51). This is illustrated in Fig. 19, where A is a single crystal plate of InSb (2 mm × 1 mm
FIG. 19. Schematic arrangement of photoelectromagnetic cell.
× 0.1 mm), B refers to a pair of electrodes, and the semiconductor is located in C, a magnet with a field of about 10,000 gauss. The radiation is incident on the front side of the plate (direction of the arrow in Fig. 19). The spectral distribution is shown in Fig. 20.
FIG. 20. Spectral sensitivity of indium antimonide photoelectromagnetic cell (abscissa: wavelength in microns).
The photoelectromagnetic effect is an important method utilized in determining the surface recombination rate in the inner photoelectric effect. It is also possible to have a combination of both effects. By utilizing different wavelengths of light, corresponding to different absorption coefficients, and applying a field, a superposition of both effects is then obtainable. The magnetic unidirectional layers in such cases may lead in addition to complicated conditions (52).
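The diffusion length that enters the theory of the photoelectromagnetic effect is defined in the text as the square root of the product of the lifetime and the diffusion constant. A minimal Python sketch of this definition follows; the function name, the units, and the sample values are hypothetical and serve only as an illustration.

```python
import math


def diffusion_length(diffusion_constant, lifetime):
    """L_D = sqrt(D * tau); with D in cm^2/sec and tau in sec, L_D is in cm."""
    if diffusion_constant < 0 or lifetime < 0:
        raise ValueError("diffusion constant and lifetime must be non-negative")
    return math.sqrt(diffusion_constant * lifetime)


# Hypothetical sample: D = 100 cm^2/sec and tau = 1e-6 sec.
print(diffusion_length(100.0, 1e-6))
```

Because L_D varies only as the square root of the lifetime, a measured diffusion length fixes the lifetime rather sensitively once the diffusion constant is known.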
IX. APPLICATION OF PHOTOCONDUCTORS
A. Tabular Survey of Photoconductors
Solid state physics must in general determine the real nature of the recombination centers, the trapping states, and the imperfection terms. (Trapping centers for holes in CdS and CdSb can be, for example, cation vacancies.) Furthermore, the question of trapping cross sections and a number of other questions must still be clarified. These and similar problems need the methods of photoelectric investigation to help towards their solution and to clarify the general laws of solid state physics, as already illustrated in the previous sections. One utilizes for such purposes model substances, that is, materials for which it is experimentally convenient to combine electrical, in particular photoelectric, investigations with optical investigations, in particular of optical absorption (for instance, CdS, colored and uncolored alkali halides, as well as alkaline earth halides). The investigations of the model substances naturally lead to an extensive specialized literature. Publications of the results of investigations of the technically important semiconductors (for instance, Ge, Si, PbS), which naturally belong to the model substances, are far too numerous to survey meaningfully. Therefore, for the special details we refer to certain communications (53) which are listed in the references. Here, we can only attempt to note the main properties of photoconductors in tabular form. The refractive index of photoconducting elements is given in the fifth column of Table I. According to older empirical considerations (Gudden 1928), a pure substance would exhibit photoconduction only if the refractive index were greater than two. More recent considerations (Moss 1950) suggest that there exists a relationship between the refractive index n_B and the long wavelength limit λ_0 of the form n_B⁴/λ_0 = constant.
As shown in Table I, the requirement that n_B > 2 for photoconducting elements is indeed fulfilled. A calculation shows that the relationship between the refractive index and the long wavelength limit is, taking into account the experimental difficulties, only approximately valid. The application of this relationship to binary semiconductors is useful only for making estimates.
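The approximate character of the relation between refractive index and long wavelength limit can be seen from a short calculation. The Python sketch below evaluates the fourth power of the refractive index divided by the long wavelength limit for three elements; the pairing of refractive index and long wavelength limit (in µ) is one reading of the values listed in Table I and should be treated as an assumption, as should the function name.

```python
def moss_ratio(refractive_index, long_wavelength_limit_um):
    """Return n_B**4 / lambda_0, with lambda_0 in microns."""
    return refractive_index ** 4 / long_wavelength_limit_um


# (n_B, lambda_0 in microns); assumed pairings, see the note above.
samples = {
    "C (diamond)": (2.4, 0.234),
    "Si": (3.43, 1.08),
    "Ge": (4.0, 1.7),
}

ratios = {name: moss_ratio(n, lam) for name, (n, lam) in samples.items()}
for name, ratio in ratios.items():
    print(f"{name}: n^4/lambda_0 = {ratio:.0f} per micron")

spread = max(ratios.values()) / min(ratios.values())
print(f"largest/smallest ratio = {spread:.2f}")
```

With these inputs the three ratios agree only to within about 20 percent, in keeping with the statement that the relationship is approximately valid.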
Table II shows a summary of photoconducting sulfides, selenides, and tellurides. Table II contains only the binary compounds. It is necessary to note also the properties of the ternary sulfides, selenides, and tellurides (54) which are known to be photoconducting. These compounds are Tl2Se-Sb2Se3, CdTe-ZnTe, Bi2S3-Sb2S3, Sb2Se3-As2Se3, Tl2S-Sb2S3, CdSe-In2Se3, and Tl2Se-As2Se3. They are of Groups A^III B^V and A^II B^VI which are able to form solid solutions. They are characterized by the fact that the width of the forbidden zone varies monotonically with composition from one binary compound to the other, in some cases linearly with composition (for instance CdTe-ZnTe). The variation of the spectral distribution with composition is shown in Fig. 21 for the case of two good photoconductors (CdS and CdSe). In this case we deal with the solid solution CdS-CdSe.
FIG. 21. Spectral distribution of CdS-CdSe for various compositions (abscissa: wavelength, 0.5 to 0.9 µ).
The ternary mixed crystal HgTe-CdTe seems likely to be photoconducting up to ~10 µ, as appears to be experimentally established. The furthest limit towards the infrared can be expected to be about 120 µ in the case of ternary compounds with impurities. Another group of ternary compounds, of the type A^V B^VI C^VII, appears also to be photoconducting (55). Depending upon the band separations of these compounds, the long wavelength limits are in the visible and in the infrared regions. Investigations have already been carried out on the combinations of Sb and Bi (Group V) with S, Se, and Te (Group VI) and Cl, Br, and I (Group VII), where contacts were made with metallic Ga. With increasing temperature up to 100°C, one is able to bring about an increase in sensitivity. In addition to the photoconductors given in the tables, one can list as photoconductors the alkali halides (which do not exhibit the photoeffect in the base lattice region, but on the other hand exhibit F-, 11-, and V1-center
TABLE I. Atomic number in Group merit periodic system
Allotropic modification of photoelectric element
Refractive index (extrapolated for
~
B
3
111
C
6
IV
Si
14
IV
3.43
Ge
32
IV
P
15 33 16 34
V V VI VI
As
S
Se
Te
52
VI
I
53
VII
Diamond, cubic zinc blende structure
Red Grey Crystalline Amorphous monoclinic (red) hexagonal (met.)
Activation Long
(k)*
limit
(PI
photoconduction (ev)
3.5
2.2-2.6
1.1
2.4
-0.241
0.234
5.3
1.48
1.08
1.15
4.0
2.2
1.7
0.73
2.6 3.35 2.0
-1.2 2 .O-2.4
0.85 1.03 0.5 0.5 1 and 0.8 and
1.3‘3 1.2 2.4 5-1.9 2.5
4.3
3.3
0.37
1 .0
0.96
1.30
2.45 4.8
-3
X J ~ energy from
0.96-1.27
* λ½ is defined as the wavelength at which the sensitivity is one half of its maximum value in the spectral distribution.
photoconduction) and the silver halides, the iodides of thallium and mercury, and a large group of inorganic phosphors, regarding which certain statements may be made about the radiative excitation mechanisms as a basis of their photoconducting properties. Phosphors with bimolecular radiative excitation mechanisms (so-called recombination radiators) exhibit “good” photoconduction. On the other hand, phosphors with monomolecular radiative
PROBLEMS OF PHOTOCONDUCTIVITY
PHOTOCONDUCTING ELEMENTS
Method of investigation
Evaporated layer, decomposition layer of boron hydride, bulk crystal
Type of conduction
Insulator at irradiation
Decomposition layer of silicon tetrachloride, bulk
p and n according to activation
Single crystals
p and n according to activation (Group III or Group V, respectively)
P
crystal
Distilled layer
Remarks
Technical application
P
n
Layer Evaporated layer Sublimated layer Evaporated layer, pressed layer,
P
Additional irradiation with red light increases the sensitivity in the uv. “Semiconducting” diamond (p-type) shows photoconduction up to ~1.15 μ. n-type at liquid helium temperature, photoconducting up to 38 μ. Au- and Zn-doped n- and p-Ge photoconducting up to ~6 μ. Displacement of long wavelength limit to longer wavelengths at lower temperatures.
Photoresistor, photoelement (solar battery) photodiode Photoresistor, photodiode, phototransistor
Photoresistor, photoelement (xerography) Suitable for photoelectromagnetic purposes
Thin plate, through melting
excitation mechanisms (so-called configuration radiators) exhibit “poor” photoconduction.
B. Frequency Dependence and Amplification Factors of Photoconductors
The frequency dependence of the photocurrent under sinusoidal excitation may be characterized by a relaxation process and so is given by
TABLE II. PHOTOCONDUCTING SULFIDES, SELENIDES AND TELLURIDES
Long wavelength limit λ0 (μ) at irradiation in the region of Binary compound Cu2Te ZnS ZnSe ZnTe
Atomic no. of the metallic component 29 30
33 42
Lattice absorption
1.5 0.38 0.5
0.63 -0.75 1.1
CdSe CdTe
2.2 1.3 1.4 -3.5
Technical application
Width of forbidden zone (ev)
Remarks
n-type. Single crystals, polycrystalline specimens. Influence of specific contaminations (Cu+, Ag+, Al3+, Se3+, Ga3+, Cl-, Br-, etc.)
Model substances for luminescent and photoconduction processes Semiconductor image tube
1.3 0.55
2.0 2.6 1.35 1.8 3.0 0.9
0 .9 1.2
1.5 1.6
47
48
Lattice defect absorption (longest wavelength observed)
0.6
Photoelement Model substance, photoresistor Photoresistor Photoresistor
0.9 Evaporated layers, single crystals, and polycrystalline specimens. Structured absorption, even with CdSe. Doped with Cu, Ag, Cl. Surface terms caused by influence of oxygen. CdTe becomes p-conducting by doping with Cu or Ag.
InSe
49
Excess of metallic component leads to higher infrared sensitivity. Evaporated layers. Weak influence of Cu, Sb and Cd. Probably Sb2O3 (cubic crystallites) on surface, also metallic Sb, strong influence of the phase of the surface on the photoconduction. n-type.
1.8
.o
1
2.2 Semiconductor image tube
0.9 2.6
HgS
80
0.63
HgSe HgTe
Tl2S Tl2Te PbS PbSe PbTe
Bi2S3
81 82
83
1.o 1. g 2.8 5.0 3.6
1.6
5.0
E m
0.4
0
Photoresistor
Photoresistor
Bi2Se3 Bi2Te3
m
F
1.2 3.9 1.8 2.6
r
0.41 0.26 0.32
Evaporated layers, chemically deposited layers, single crystals. Also used at lower temperatures (large displacement of the long wavelength limit to longer wavelengths). n- and p-type. In polycrystalline layers, unidirectional layer effects, probably also in PbSe and PbTe. Oxygen influence. Evaporated layers. At cooling, displacement of the long wavelength limit to longer wavelengths. Strong oxygen influence on sensitivity.
0 ~3
80
3
2
$
-l 01
TABLE III. PHOTOCONDUCTING OXIDES
Long wavelength limit λ0 (μ) at irradiation in the region of
Compound
MgO
TiO2
Atomic no. of the metal
Lattice absorption
Lattice defect absorption
12
0.8
1.5
22
Technical application
Conduction type
Remarks
p
At uv irradiation (0.31 μ) and neutron bombardment, λ0 displaced to ~1.8 μ. New long wavelength max. appears. Gudden criterium not fulfilled, for n∞ = 1.74.
0.48
Cu2O
29
0.63
1.5
ZnO
30
0.48
0.55
n
In2O3
49 52 56
1.8 1.6
p
0.33
TeO2 BaO
0.55 0.53
Photoelement
p
Rutile. Special level diagram, for band-band transition does not correspond with optical absorption edge. Single crystals and polycrystalline specimens. Surface effects. Sintered specimens, thin layers, single crystals. Photoconduction depends strongly on atmosphere (oxidizing or reducing). Evaporated layers. Investigation of photoconduction to gain better understanding of thermal emission. BaO can change color in Ba-vapor analogously to the alkali halides. Anodic deposited layers.
Long wavelength limit (μ) at irradiation in the region of
Compound
Lattice absorption; lattice defect absorption
Mg3Sb2
1.5
ZIlJ?*
1.1 2.2 2.1
GaSb InP InAs InSb
Conduction type
7.85
3.5
2.8 1.6
Remarks
Measured only at low temperatures (85°K and lower). Evaporated layers are photosensitive only under special conditions. Small single crystals.
5.3
Mg2Sn
ZnSb Cd3As2 GaAs
Technical application
May be suitable for solar batteries
3.0
Up to now n-type
0.8
P
Photoresistor, photodiode, photoelectromagnetic cell
n
Very high carrier mobility. Low effective electron mass. Strong temperature dependence of photoconduction.
8
0.22
Photosensitivity is higher in polycrystalline specimens than in single crystals
4.8
7.7
Width of the forbidden zone (ev)
0.33 0.17
%
where ΔB is the amplitude of the alternating radiation, ω is the circular frequency, and Δi(ω) is the change in photocurrent as a function of frequency. In addition, Δi(ω) is proportional to the applied potential difference; τ0 is the time constant of the photocurrent defined in Eq. (6). According to the proportionality relationship (42), Δi(ω) should be essentially frequency independent for all radiation frequencies ω << 1/τ0. For higher frequencies, Δi(ω) decreases as 1/ω. The requirement of a slight frequency dependence implies a small time constant; τ0 is of the order of … sec for CdS and of the order of … sec for CdSe and PbS.
The amplification factor F for photoconductors with large conductivity may be defined by
$$F = \frac{\Delta I}{e\,g\,B\,D\,L}. \tag{43}$$
On the basis of Eqs. (9) and (7), we obtain an amplification factor proportional to τ0. Photoconductors may be characterized by a quality index based on the way in which both the frequency dependence and the amplification factor are related to the time constant. This index is set proportional to F and to 1/τ0.
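As a rough numerical illustration of the frequency behavior described above, the following sketch assumes that the relaxation process has a single time constant τ0, so that the relative response is 1/sqrt(1 + ω²τ0²). This functional form is an assumption, chosen only because it reproduces the two limits stated in the text (frequency independence for ω << 1/τ0 and a 1/ω decrease at high frequencies); the function name and the sample time constant are hypothetical.

```python
import math


def relative_response(omega, tau0):
    """Relative photocurrent amplitude for a single-time-constant relaxation.

    Assumed form (not taken from the text): 1 / sqrt(1 + (omega * tau0)**2).
    """
    return 1.0 / math.sqrt(1.0 + (omega * tau0) ** 2)


if __name__ == "__main__":
    tau0 = 1.0e-3  # hypothetical time constant in seconds
    for factor in (0.01, 0.1, 1.0, 10.0, 100.0):
        omega = factor / tau0
        print(f"omega*tau0 = {factor:6.2f}  response = {relative_response(omega, tau0):.4f}")
```

Under this assumption the response is down by a factor 1/sqrt(2) at ω = 1/τ0, which is one way of seeing why a weak frequency dependence calls for a small time constant.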
C. Statistical Fluctuations in Photoconductors
Estimates of the statistical fluctuation phenomena in photoconductors, which are brought about by various causes, serve not only to gain information on the lower limit of detection of photoresistors; a precise investigation of the dependence of the fluctuation phenomena on the semiconducting properties and the electrode configurations is also well suited to contribute to the clarification of the conduction mechanism of the semiconductor itself (56). One should regard the photoresistor and the radiation source as one system. In order to analyze it, it is necessary to eliminate the individual sources of noise. One should take into account, first of all, the fluctuations of the radiation incident on the photoconductor. The mean square fluctuation per frequency interval of the emitted light quanta (number n_ph) from the surface O_b of the radiator is given by
which may be derived from the fluctuations of the radiation of a black body at a temperature T and volume V in the spectral region ν1 to ν2. Fluctuations in black-body radiation are given by
In considering the fluctuation phenomena in the photoconductor itself, we have statistical laws which are valid for the collision processes and for the excitation and recombination phenomena. The charge carrier concentrations and the velocities of the charge carriers fluctuate whether or not the photoconductor is irradiated. For simplicity let us consider an n-type semiconductor. If we introduce the electron velocity v instead of the electron mobility μ in Eq. (2), we obtain the current fluctuation, which is given by
$$\delta i = e\,(n\,\delta v + v\,\delta n). \tag{46}$$
The first term of Eq. (46) expresses the Nyquist noise and is given by
$$\overline{\delta i_{N}^{2}} = \frac{4kT}{R}\,df, \tag{47}$$
where R is the resistance of the semiconductor and df is the bandwidth. One must note that at extremely low temperatures deviations from Eq. (47) may occur. The second term of Eq. (46) can be calculated if we assume that every charge carrier that is present in a given band causes a current pulse during the time τ. According to Eqs. (2) and (3), this current pulse is proportional to the field strength E and the mobility μ. One may call this term pure semiconductor noise. Considering the fact that τ is proportional to τ0, as given in Eq. (7), we obtain for the second term
therefore, the average fluctuation increases as &. If we assume that the lifetime τ is statistically distributed, then Eq. (48) is transformed to
Another cause for noise, brought about by the metal contact (electrodes), may be called contact or electrode noise. One must also add a boundary layer noise if there are no ohmic contacts and only unidirectional contacts. One may calculate the current fluctuations brought about by the contact noise from
The function F(f) in most cases is proportional to 1/f.
Finally, photoconductors which exist in polycrystalline layers or as sintered and pressed layers exhibit an additional noise which may be called crystal noise. The traversal of charge carriers in such layers is not unhindered. There are difficulties in expressing the crystal noise in terms of a formula, but it might be expressed in a form similar to Eq. (50). The 1/f dependence also plays a large role in the noise caused by the surface conditions. The clarification of these is presently under investigation (57).
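To give a feeling for the magnitude of the Nyquist term of Eq. (47), the short sketch below evaluates the rms noise current sqrt(4kT df/R). The resistance, temperature, and bandwidth in the example call are hypothetical values chosen only for illustration; the function name is likewise not taken from the text.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def nyquist_current_rms(resistance_ohm, temperature_k, bandwidth_hz):
    """RMS Nyquist noise current sqrt(4 k T df / R), in amperes."""
    return math.sqrt(
        4.0 * K_BOLTZMANN * temperature_k * bandwidth_hz / resistance_ohm
    )


if __name__ == "__main__":
    # Hypothetical photoresistor: 1 megohm at 300 K, observed in a 1 Hz bandwidth.
    print(nyquist_current_rms(1.0e6, 300.0, 1.0))
```

Because the mean square current varies as 1/R, a fourfold increase in resistance halves the rms noise current at a given temperature and bandwidth.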
X. CONCLUSION
From the large number of investigations in the field of photoconduction, we have endeavored to sift out the most important and most definite laws. We have tried to supplement these with the most recent results and so to outline a survey of the present status of the problems in the field of photoconduction. In a review article of this sort we are not able to deal with details. The choice of subject material is, therefore, somewhat arbitrary. As an investigative method in solid state physics, photoconduction is equivalent in value to, for instance, the temperature and conductivity measurements or the magnetic measurements which are necessary to understand the solid state. One may expect future investigations of photoconducting processes in liquid-solid boundary regions and in organic compounds. The methods of investigation of the interior and surfaces of these systems and substances will be at least equivalent and probably superior to those available at present.
List of Symbols
A        Activator concentration (A+)
ΔB       Amplitude of alternating radiation
B        Width of the electrodes
B        Magnetic induction
C        Constant
D        Thickness of layer
D_index  Demarcation level
F        Amplification factor
G        Gain
H        Concentration of trapping states
I_B      Intensity of illumination
k        Wave vector
L        Separation of electrodes
L_D      Diffusion length
M        Crystal momentum
Surface Resistance Black-body radiation Concentration of defect states (S-, S+) Transit time Potential Unidirectional potential Rectification potential Volume Excitation Loss Constant Separation distance Lattice constant Elementary charge Production rate (gn, gp) Dark current Wave vector Effective electron mass Concentration of electrons Inversion density Concentration of holes Velocity of surface concentration Excitation and recombination processes Velocity of charge carriers Displacements (xn, xp) Potential Absorption coefficient Dielectric constant Mobility (μn, μp) Frequency of the light Conductivity Lifetime (τn, τp) Decay time (time constant) Space charge relaxation time Circular frequency of irradiation
REFERENCES
1. Stöckmann, F., Z. Physik 143, 348 (1955).
2. Haken, H., in “Semiconductor Problems” (W. Schottky, ed.), p. 1. Vieweg, Braunschweig, 1958.
3. Gross, E. F., Kapljanski, A. A., and Novikov, B. V., Doklady Akad. Nauk S.S.S.R. 110, 761 (1956); J. Tech. Phys. (U.S.S.R.) 26, 697, 913 (1956); Broser, I. and Broser-Warminsky, R., Z. Elektrochem. 61, 209 (1957).
4. Gross, E. F. and Rasbirin, B. S., J. Tech. Phys. (U.S.S.R.) 27, 1398 (1957).
5. Gross, E. F. and Rasbirin, B. S., J. Tech. Phys. (U.S.S.R.) 28, 237 (1958); Boer, K. W. and Gutjahr, H., Z. Physik 162, 203 (1958).
6. Görlich, P. and Heyne, I., Optik 4, 206 (1948).
7. Cf. e.g. Bube, R. H., Phys. Rev. 99, 1105 (1955).
8. Balkanski, M. and Waldron, R. D., Phys. Rev. 112, 123 (1958).
9. Broser, I. and Broser-Warminsky, R., Phys. and Chem. Solids 8, 177 (1959).
10. Macfarlane, G. G., McLean, T. P., Quarrington, J. E. and Roberts, V., Phys. and Chem. Solids 8, 388 (1959).
11. Kessler, F. R., Phys. and Chem. Solids 8, 275 (1959).
12. Stöckmann, F., Klontz, E. E., MacKay, J., Fan, H. Y. and Lark-Horovitz, K., Z. Physik 163, 331 (1958); Cronemeyer, D. C., J. Appl. Phys. 29, 1730 (1958); Lasser, M. E., Cholet, P. and Wurst, E. C., J. Opt. Soc. Am. 48, 468 (1958); Beyen, W., Bratt, P., Davis, H., Johnson, L., Levinstein, H. and MacRae, A., ibid. 49, 686 (1959).
13. Kohler, V. and Hauser, O., Lecture, Ann. Meeting Phys. Soc., Leipzig (1959).
14. Rose, A., Proc. I.R.E. 43, 1850 (1955); Phys. Rev. 97, 322 (1955).
15. Sim, A. C., J. Electronics and Control 6, 251 (1958).
16. Pohl, R. W. and Stöckmann, F., Ann. Physik [6] 6, 89 (1949); Stöckmann, F., Z. Physik 147, 544 (1957).
17. Cf. Stöckmann, F., Z. Physik 138, 404 (1954); Polke, M., Storch, G. and Stöckmann, F., ibid. 164, 51 (1959).
18. Cf. Rose, A., Helv. Phys. Acta 30, 242 (1957); Görlich, P., Krohs, A. and Lang, W., Arch. tech. Messen 394-1 (1958); Rose, A. and Lampert, M. A., Phys. Rev. 113, 1227 (1959); Redington, R. W., ibid. 116, 894 (1959).
19. Cf. e.g. Ruppel, W., Z. Physik 162, 235 (1958).
20. Lambe, J., Phys. Rev. 98, 985 (1955).
21. Newman, R., Woodbury, H. H. and Tyler, W. W., Phys. Rev. 102, 613 (1956).
22. Cf. e.g. Schön, M., in “Semiconductor Problems” (W. Schottky, ed.), Vol. 4, p. 306. Vieweg, Braunschweig, 1958; Rittner, E. S., in “Photoconductivity Conference” (R. G. Breckenridge, B. R. Russell, and E. E. Hahn, eds.), p. 215. Wiley, New York, 1956; Görlich, P., Krohs, A., and Lang, W., Arch. tech. Messen 394-1 (1958); Müser, H., Z. Naturforsch. 7a, 729 (1952).
23. Cf. e.g. Boer, K. W. and Vogel, H., Z. physik. Chem. (Frankfurt) [N.S.] 1, 1, 17 (1956); Schön, M., in “Semiconductor Problems” (W. Schottky, ed.), pp. 319, 333, 345. Vieweg, Braunschweig, 1958.
24. Boer, K. W. and Wantosch, H., Ann. Physik [6] 2, 406 (1959).
25. Cf. e.g. Bube, R. H., in “Photoconductivity Conference” (R. G. Breckenridge, B. R. Russell, and E. E. Hahn, eds.), p. 575. Wiley, New York, 1956; Tolstoi, N. A. and Sokolow, V. A., Optika i Spektroskopiya 3, 495 (1957).
26. Read, W. T., Phil. Mag. [7] 46, 775 (1954); Mataré, H. F., J. Appl. Phys. 30, 581 (1959).
27. Votava, E., Z. Naturforsch. 13a, 542 (1958).
28. Bullough, R., Newman, R. C. and Wakefield, J., Lecture, Intern. Convention on Transistors, London, May 21-27, 1959.
29. Cf. Görlich, P., Optik 8, 512 (1951); Jenaer Jahrb. p. 237 (1951).
30. Miseljuk, E. and Martens, E., Izvest. Akad. Nauk S.S.S.R., Ser. Fiz. 14, 115 (1952).
31. Cf. Summary, Görlich, P., Advances in Electronics and Electron Phys. 11, (1959).
32. Boer, K. W., Borchardt, E. and Borchardt, W., Z. physik. Chem. (Leipzig) 203, 145 (1954).
33. Tyler, W. W., Newman, R. and Woodbury, H. H., Phys. Rev. 97, 669 (1955); 98, 461 (1955).
34. Wolkenstein, F., “Conductivity of Semiconductors,” p. 340. Moscow-Leningrad, 1947; Juse, W. and Rywkin, S., J. Exptl. Theoret. Phys. U.S.S.R. 20, 152 (1950).
35. Borissov, M. and Kanev, St., Z. physik. Chem. (Leipzig) 206, 56 (1955).
36. Rose, A., RCA Rev. 12, 401 (1951).
37. Bube, R. H., J. Chem. Phys. 21, 1409 (1953).
38. Shockley, W. and Read, W. T., Phys. Rev. 87, 835 (1952); Brattain, W. H. and Bardeen, J., Bell System Tech. J. 32, 1 (1953); Kingston, R. H., J. Appl. Phys. 27, 101 (1956).
39. Cf. Brown, W. L., Brattain, W. H., Garrett, C. G. B. and Montgomery, H. C., in “Semiconductor Surface Physics” p. 111. 1956; Bardeen, J., Intern. Conf. on Semiconductors and Phosphors, Garmisch-Partenkirchen p. 81 (1956); Wang, S. and Wallis, G., Phys. Rev. 107, 947 (1957).
40. Heiland, G., Phys. and Chem. Solids 6, 155 (1958).
41. Sosnowski, L., Soole, B. W. and Starkiewicz, J., Nature 160, 471 (1947); Görlich, P. and Krohs, A., Jenaer Jahrb. p. 155 (1952).
42. Petritz, R. L., Lummis, F. L., Sorrows, H. E. and Woods, J. F., in “Semiconductor Surface Physics” p. 229 (1956).
43. Stöckmann, F., Z. Physik 146, 407 (1956).
44. Batt, E. and Stöckmann, F., Z. Naturforsch. 13a, 352 (1958).
45. Gerritsen, H. J., Ruppel, W. and Rose, A., Helv. Phys. Acta 30, 235, 495, 504 (1957).
46. Wiesner, R., in “Semiconductor Problems” (W. Schottky, ed.), Vol. 3, p. 59, 1956.
47. Kikoin, I. K. and Noskow, M. M., Physik. Z. Sowjetunion 6, 586 (1934).
48. Aigrain, P. and Bulliard, H., Compt. rend. acad. sci. 236, 672 (1953); Bulliard, H., Ann. phys. 9, 52 (1954); Moss, T. S., Pincherle, L. and Woodward, A. M., Proc. Phys. Soc. B66, 743 (1953); Moss, T. S., ibid. p. 993; Kurnick, S. W., Strauss, A. J. and Zitter, R. N., Phys. Rev. 94, 1791 (1954); Bulliard, H., ibid. p. 1564.
49. van Roosbroeck, W., Phys. Rev. 101, 1713 (1956); Guro, G. M., J. Tech. Phys. (U.S.S.R.) 28, 1036 (1958); Walton, A. K. and Moss, T. S., Proc. Phys. Soc. 73, 399 (1959); Cardona, M. and Paul, W., Phys. and Chem. Solids 7, 127 (1958).
50. Dember, H., Physik. Z. 32, 554, 856 (1931); 33, 207 (1932); Frenkel, J., Nature 132, 312 (1933); Physik. Z. Sowjetunion 8, 185 (1935).
51. Hilsum, C. and Ross, I. M., Nature 179, 146 (1957).
52. Welker, H., Z. Naturforsch. 6a, 184 (1951); Madelung, O., Tewordt, L. and Welker, H., ibid. 10a, 476 (1955).
53. Smith, R. A., Advances in Phys. 2, 321 (1953); Schultz, M. L. and Morton, G. A., Proc. I.R.E. 43, 1819 (1955); Bube, R. H., ibid. p. 1836; Moss, T. S., ibid. p. 1869; Kluge, W., Z. angew. Phys. 7, 302 (1955); Garlick, G. F. J., in “Handbuch der Physik” (S. Flügge, ed.), Vol. XIX, p. 316 (1956); Görlich, P., Krohs, A. and Lang, W., Arch. tech. Messen 394-1 (1958); Wright, D. A., Brit. J. Appl. Phys. 9, 205 (1958).
54. Gorjunowa, N. A. and Kolomijez, B. T., Radiotechnika and Elektronika (U.S.S.R.) 1, 1155 (1956); Chansewarow, R., Rywkin, S. M. and Agejewa, I. N., J. Tech. Phys. (U.S.S.R.) 28, 480 (1958); Kolomijez, B. T. and Malkowa, A. A., ibid. p. 1662; Lawson, W. D., Nielsen, S., Putley, E. H. and Young, A. S., Phys. and Chem. Solids 9, 325 (1959).
55. Nitsche, R., Festkörpertagung, Balatonfüred, Hungary, 1959, to be published.
56. Cf. e.g. Görlich, P., Jenaer Jahrb. p. 229 (1951); Optik 8, 512 (1951) (additional references); for complete treatment cf. Jones, R. C., Advances in Electronics 6, 1 (1953); Roberts, D. H. and Wilson, B. L. H., Brit. J. Appl. Phys. 9, 291 (1958).
57. McWhorter, A. L., in “Semiconductor Surface Physics” p. 207 (1956); Kingston, R. H., J. Appl. Phys. 27, 114 (1956).
Strong-Focusing Lenses*
ALBERT SEPTIER
Laboratoire d'Electronique et de Radioélectricité, Université de Paris, Fontenay-aux-Roses, Seine, France

I. Theoretical Properties
   A. Potential and Fields
   B. Equations of Motion
   C. Optical Elements of a Single Lens
   D. Optical Elements of a Doublet
   E. Optical Elements of a Triplet
   F. Combination of Two Distinct Doublets
   G. Sequence of a Very Large Number of Identical Lenses
   H. Helix Quadrupole Lenses
   I. Electric Lenses Excited at High Frequencies
   J. Focusing of Polarized Atoms and Molecules
II. Aberrations
   A. Aberrations of a System with Two Planes of Symmetry
   B. Calculations of the Trajectories to the Third Order
   C. Trial Correction for the Aperture Aberrations
   D. Chromatic Aberration and "Mass" Aberration
III. Practical Realization of Lenses and Measurement of Fields
   A. Practical Realization of Magnetic Lenses
   B. Practical Realization of Electrostatic Lenses
   C. Magnetic Measurements
   D. Distribution of the Transverse Field
   E. Longitudinal Field B_z
   F. Equivalent Length L
IV. Experimental Study of the Optical Properties
   A. Introduction
   B. Methods of Study
   C. First-Order Results
   D. Measurement of Aberrations of Magnetic Doublets
We shall denote by the name of quadrupole lens, or strong-focusing lens, a system consisting of four electrodes (or poles) in the form of cylinders, with generatrices parallel to an axis Oz, and possessing four planes of symmetry through this axis. They form a transverse electric field E, or magnetic field H, whose intensity (E or H) at a point is proportional to the distance from the axis Oz to this point; the radial gradient of the field (∂E/∂r or ∂H/∂r) then has a constant value throughout all the useful volume.
* Translated by H. A. Fowler.
In a general study of cylindrical systems Melkich (1) indicated in 1944 the theoretical interest of these distributions of field with constant gradient in the domain of Gaussian optics (paraxial rays and very small angular apertures). But their first use as a means of focusing dates only from 1952, at which time Christofilos (2) on the one hand, and Courant, Livingston, and Snyder (3) on the other, proposed their use to guide rapid particles in a new circular accelerator, the strong-focusing synchrotron, and conceived of the construction of quadrupole lenses to focus beams of rapid ions (3) or to overcome the forces of repulsion which act on the beam in the interior of a linear ion accelerator (4). Since this date, numerous theoretical or experimental studies have been devoted to these lenses, and their field of application has been considerably extended.
Owing to the transverse nature of their active field, these lenses have an advantage over classical lenses, where the central component of the field is longitudinal (parallel to Oz) and where the focusing results only from a differential effect; for the same energy of incident particles, their convergence is much stronger. However, they have an inconvenient feature: they are strongly astigmatic, being convergent in only one radial direction, and divergent in the radial direction perpendicular to that one. One may nonetheless easily obtain a convergent action in all planes by grouping on the same axis two crossed lenses; in what follows we shall denote this combination by the name of “doublet.” The spherical convergence of the doublet then becomes a differential effect, but contrary to the case of circular lenses, it increases with the length of the lenses. One finally obtains strong-focusing doublets, if they are formed of sufficiently long elements.
In Sec. I we shall study the theoretical properties of these lenses, the distribution of their field, and their optical properties, and more especially the elementary or grouped lenses. Section II will be devoted to the calculation of the third-order aberrations of these lenses. Section III will review the practical realization of these lenses and the distributions of field which may be obtained in practice. Finally, Sec. IV will be a review of the experimental results obtained, which show that the calculations carried out in the first section are valid to a good approximation. The methods of study will be described.
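The statement that two crossed lenses give a convergent action in all planes can be checked with a small numerical sketch. It assumes the usual linear (paraxial) hard-edge model, in which the transverse motion obeys x'' = -β²x in the convergent plane and y'' = +β²y in the divergent plane over the length of each lens; the value of β, the lens length, and the gap are hypothetical, and the function names are illustrative only.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def focusing(beta, length):
    """Transfer matrix of a hard-edge lens in its convergent plane."""
    t = beta * length
    return ((math.cos(t), math.sin(t) / beta), (-beta * math.sin(t), math.cos(t)))


def defocusing(beta, length):
    """Transfer matrix of the same lens in its divergent plane."""
    t = beta * length
    return ((math.cosh(t), math.sinh(t) / beta), (beta * math.sinh(t), math.cosh(t)))


def drift(gap):
    """Transfer matrix of a field-free space of length gap."""
    return ((1.0, gap), (0.0, 1.0))


def doublet(beta, length, gap):
    """Transfer matrices of a crossed doublet in its two principal planes."""
    f, d, o = focusing(beta, length), defocusing(beta, length), drift(gap)
    plane_cd = matmul(d, matmul(o, f))  # convergent lens first, then divergent
    plane_dc = matmul(f, matmul(o, d))  # divergent lens first, then convergent
    return plane_cd, plane_dc


if __name__ == "__main__":
    for length in (0.2, 0.4):
        cd, dc = doublet(beta=1.0, length=length, gap=0.1)
        # A negative lower-left element means that an incident parallel ray
        # leaves the system inclined towards the axis (net convergence).
        print(length, cd[1][0], dc[1][0])
```

In this model the lower-left element is negative in both planes, and its magnitude grows when the lenses are made longer, in line with the remark that the convergence of the doublet increases with the length of the lenses.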
I. THEORETICAL PROPERTIES TO FIRST ORDER
A. Potential and Fields
1. General Expression for the Potential in a System of Quadrupole Symmetry. Let us consider a system of four equipotential electrodes¹ maintained respectively at the potentials ±φ1 (see Fig. 1) and having:
4 planes of mechanical symmetry: zOX and zOY, zOx and zOy;
2 planes of excitation symmetry: zOx and zOy.
Their length is first of all supposed great enough that the end effects are negligible; we then suppose that the potential and the field depend only on two coordinates.
¹ We shall denote as “electrodes” both electrodes of electrostatic lenses and pole pieces of magnetic lenses.
FIG. 1. Schematic transverse section of a quadrupole lens (N: north pole; S: south pole).
The scalar potential φ(X,Y) obeys Laplace's equation ∇²φ = 0, and if we suppose (5)
$$W(s) = u(X,Y) + i\,\phi(X,Y) \quad\text{and}\quad s = X + iY,$$
it may be expressed in the form of an infinite series:
$$W(s) = \sum_{n=1}^{\infty} \frac{h_n}{n}\, s^n .$$
As a result of the different symmetries in X and Y, the expression W(s) has the form:
$$W(s) = \frac{h_2}{2}\, s^2 + \frac{h_6}{6}\, s^6 + \cdots,$$
from which the expression for the equipotential curves, φ(X,Y) = Im(W(s)):
$$\phi(X,Y) = h_2 XY + h_6 XY \left( X^4 - \frac{10}{3} X^2 Y^2 + Y^4 \right) + \cdots$$
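The two-term expansion of φ(X,Y) given above can be checked numerically against the imaginary part of a complex potential. The sketch below assumes the normalization W(s) = (h2/2)s² + (h6/6)s⁶, which is an assumption chosen because it reproduces the coefficients h2 and h6 of the real expansion; the numerical values of h2 and h6 are arbitrary.

```python
def phi_from_series(X, Y, h2, h6):
    """Im W(s) for the assumed W(s) = (h2/2) s**2 + (h6/6) s**6, with s = X + iY."""
    s = complex(X, Y)
    return ((h2 / 2.0) * s**2 + (h6 / 6.0) * s**6).imag


def phi_expanded(X, Y, h2, h6):
    """The same potential written out in real form."""
    return h2 * X * Y + h6 * X * Y * (X**4 - (10.0 / 3.0) * X**2 * Y**2 + Y**4)


if __name__ == "__main__":
    for point in ((0.3, 0.7), (-1.2, 0.4), (0.5, 0.5)):
        a = phi_from_series(*point, h2=1.0, h6=0.3)
        b = phi_expanded(*point, h2=1.0, h6=0.3)
        print(point, a, b)
```

Both forms vanish on the axes OX and OY and change sign when X or Y is reversed, which is the antisymmetry expected of the potential between electrodes of alternating polarity.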
In the system with axes Ox and Oy, we will have φ(x,y) = Re(W(s)), that is
$$\phi(x,y) = \frac{1}{2}\, h_2 (x^2 - y^2) + \frac{1}{6}\, h_6 \left[ x^6 - 15 x^2 y^2 (x^2 - y^2) - y^6 \right] + \cdots$$
or, taking as units the potential φ1 and the radius a of the inscribed circle tangent to the electrodes (6) (a = semiaperture of the system),
$$\frac{\phi(x,y)}{\phi_1} = K_2\, \frac{x^2 - y^2}{a^2} + K_6\, \frac{x^6 - 15 x^2 y^2 (x^2 - y^2) - y^6}{a^6} + \cdots$$
The coefficients K_n are constants which depend only on the form of the electrodes. The components of field B_X and B_Y are given by
$$B_X = -\mu_0\, \frac{\partial \phi}{\partial X} \quad\text{and}\quad B_Y = -\mu_0\, \frac{\partial \phi}{\partial Y}.$$
To have a field with constant radial gradient, it is necessary that the terms K_{2n} with n > 1 be 0, that is to say,
$$\frac{\phi(x,y)}{\phi_1} = K_2\, \frac{x^2 - y^2}{a^2}.$$
Such a distribution is obtained by making the electrodes coincide with the equipotential φ1 of equation
$$x^2 - y^2 = a^2,$$
which has as a result that K2 = 1. The ideal distribution will be furnished by four electrodes whose cross section is in the form of an equilateral hyperbola with asymptotic directions OX and OY. But it is impossible to realize electrodes of infinite length along these axes, and higher-order terms appear in the expression for φ(X,Y) owing to the cutting off of the ends at a finite distance in X and in Y. The calculation of these terms is possible in an approximate manner in some simple cases: plane electrodes (Fig. 2a) or portions of circles arranged on a circular cylinder and separated by a gap 2γ (Fig. 2b). The first terms K2 and K6 of the development of φ(X,Y)/φ1 are then given in the following tabulation (7).
Electrode shape                   K2                  K6
Plane electrodes                  1.037               0.009
Circular concave electrodes       1.273 sin 2γ/2γ     0.042 sin 6γ/6γ
Infinite hyperbolic electrodes    1                   0
For circular electrodes one may cancel the term K6 by taking 2γ = π/6; one then obtains a very extended useful zone with constant gradient in the neighborhood of the axis. In any case, the abrupt change from hyperbolas to planes modifies the ideal distribution relatively little; from this insensitivity one sees that it is possible to construct circular electrodes which are nearly hyperbolas and which are mechanically simpler to realize.
FIG. 2. Lenses with plane (a) or circular concave (b) electrodes.
We shall see in the experimental part that for circular electrodes close to hyperbolas, the coefficients K6 and K10 are at most of the order of 1 or 2, and that it is also possible to set K6 = 0. Finally, in any quadrupole lens, if one considers the fundamental term of the field and if one denotes by K(0) the radial gradient at the center of the lens (z = 0), we may express B_X and B_Y at a point M by the relations
$$B_X = -K(0)\, r \sin\theta, \qquad B_Y = -K(0)\, r \cos\theta,$$
r and θ being the cylindrical coordinates of M, and the origin of θ being taken on OX. On OX, B_X is zero and B_Y has the value
$$B_Y = -K(0)\, X.$$
On OY, B_Y is 0, and on Ox and Oy the field is directed along the axes, for
$$|B_X| = |B_Y|.$$
The existence of these elements of symmetry in the fields allows us to see the simple nature of the important families of trajectories: for a particle of positive charge moving in the plane zOX of a magnetic lens (or zOx of an electrostatic lens) the effective force is directed towards Oz, the trajectory remains planar, and converges towards Oz. In the planes zOY (or zOy) the trajectory also remains planar, but the force is defocusing. In all the other cases, the force causes a rotation of the trajectory, which then becomes a curved path.
FIG. 3. Appearance of the real function k(z) along parallels to Oz.
In the following text, the plane zOX will be called the “convergent” plane, and the plane zOY the “divergent” plane.
2. Lens of Finite Length. The distribution of field described above is valid only far from the ends of the electrodes (located at z = ±l/2), and in particular at the center of the lens (located at z = 0), if the mechanical length l is such that l >> a. In the neighborhood of the ends (in z) the potential and field again become functions of three variables; this is the region of “leakage” fields. However, the planes of symmetry through the axis still exist and it is still possible (6) to express φ(x,y,z) in the form: φ(x,y,z)
___
z2 -
24
= f f -
(61
U‘
Y2+B,-
-
x6 - y6
Y4+r,a
with v2(6(x,y,z) = 0.
We have then
+
xty”x2 - y ” + . U6
...
and finally,
In the center of the lens k2(z) = K2 and k6(z) = K6, and we find again the simple expression given earlier. If we consider only the fundamental term, the function k2(z), which is equal to 1 at the center of the lens, gives for each plane of cross section, defined by its abscissa z, the value of the transverse gradient, taking the value of the gradient at the center (z = 0) as unit. We shall henceforth denote the coefficient k2(z) by the simpler notation k(z) and by the term “characteristic function” of the lens. The calculation of k(z) can be carried out in a rigorous fashion in the case of circular concave electrodes (6). It has only one feature of interest: it shows that k(z) is a rapidly decreasing function of |z| for |z| > l/2 and has the form shown in Fig. 3. k(z) = 1 in the central zone, of approximate length 2z0 ≈ l − 2a.
for
/z( >
(a
1
+
3. Value of the Transverse Gradient. Finally, in order to establish the optical properties of the lenses to the first order, we consider that we are dealing with a distribution of field such that

$$ \phi(x,y,z) = \phi_1\,k(z)\,\frac{x^2 - y^2}{a^2}, \qquad \phi(X,Y,z) = \phi_1\,k(z)\,\frac{2XY}{a^2}. $$
In the magnetic case we use the second of these expressions, from which we deduce:

$$ B_X = -\frac{2\phi_1}{a^2}\,Y\,k(z), \qquad B_Y = -\frac{2\phi_1}{a^2}\,X\,k(z). $$
If we suppose that the surface of the poles is practically a magnetic equipotential (experience shows that this is always the case, saturation always commencing at a point in the yoke) and that the consumption of ampere turns is negligible in the iron composing the magnetic circuit, we may write:

$$ \phi_1 = \mu_0\,nI $$
where nI represents the number of ampere turns for one of the poles, from which

$$ K(z) = \frac{2\mu_0\,nI}{a^2}\,k(z), $$

that is, for z = 0,

$$ K(0) = \frac{2\mu_0\,nI}{a^2}. $$

In the electric case, we have

$$ K(0) = \frac{2\phi_1}{a^2}. $$

The real gradient at each point is given by the function

$$ K(z) = k(z)\,K(0). $$
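The gradient relations above lend themselves to a quick numerical check. The following is a minimal sketch, assuming SI units and assuming that the relation K(0) = 2φ_1/a² applies to the magnetic case with φ_1 = μ_0 nI, as the text indicates; the function names are illustrative only and do not come from the original.

```python
import math

MU_0 = 4.0e-7 * math.pi  # permeability of free space in T*m/A (classical value)


def magnetic_gradient(ampere_turns_per_pole, aperture_radius):
    """Central gradient K(0) = 2*mu_0*nI/a**2 in T/m (iron reluctance neglected)."""
    phi_1 = MU_0 * ampere_turns_per_pole  # scalar potential of a pole
    return 2.0 * phi_1 / aperture_radius ** 2


def electric_gradient(electrode_potential, aperture_radius):
    """Central gradient K(0) = 2*phi_1/a**2 in V/m**2."""
    return 2.0 * electrode_potential / aperture_radius ** 2


def real_gradient(k_of_z, central_gradient):
    """Gradient K(z) = k(z)*K(0) at a plane where the characteristic function is k(z)."""
    return k_of_z * central_gradient


if __name__ == "__main__":
    # Hypothetical lens: 5000 ampere turns per pole, aperture radius a = 5 cm.
    print(magnetic_gradient(5000.0, 0.05))  # gradient in T/m
```

The numbers in the usage line are invented for illustration; only the two formulas are taken from the text.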
4. Equivalent Lengths. a. Equivalent length of the field L_B. As in any system of corpuscular optics with a transverse field (magnetic deflector, cylindrical condenser, etc.) the action of the lens can be characterized by the integral

$$ A = \int_{-\infty}^{+\infty} B_r\,dz $$

taken along the parallels to Oz, B_r being the amplitude of the transverse component of field at a distance r from Oz. Following these parallels, the transverse field is representable by a function B_r(z) analogous to k(z), if one considers the fundamental term in the expression for the field:

$$ B_r(z) = r\,K(0)\,k(z) = B_r(0)\,k(z). $$
The function B_r(z) is interesting because it is easy to determine experimentally. A theoretical lens, equivalent to the actual lens, may be defined by assuming that the transverse field B_r(0) (value of B_r at z = 0 in the real lens) operates only along a length L such that the integral A is the same for the two lenses. This amounts mathematically to approximating the function B_r(z) by a rectangle of length L and of height B_r(0):
$$ L = \frac{1}{B_r(0)} \int_{-\infty}^{+\infty} B_r(z)\,dz. $$
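Since B_r(z) is obtained experimentally as a table of samples, the equivalent length can be estimated by numerical integration. The sketch below uses the trapezoidal rule; the sampled profile is a hypothetical piecewise-linear shape chosen only so that the result is easy to verify by hand, and it is not a measured curve from the text.

```python
def equivalent_length(z, b):
    """Equivalent length L = (1/B(0)) * integral of B(z) dz, by the trapezoidal rule.

    z: increasing abscissae along a parallel to Oz; b: field samples at those points.
    B(0) is taken as the sample whose abscissa is nearest to z = 0.
    """
    if len(z) != len(b) or len(z) < 2:
        raise ValueError("z and b must have the same length (at least 2)")
    integral = sum(0.5 * (b[i] + b[i + 1]) * (z[i + 1] - z[i])
                   for i in range(len(z) - 1))
    center = min(range(len(z)), key=lambda i: abs(z[i]))
    return integral / b[center]


def sample_profile(z):
    """Hypothetical k(z): 1 for |z| <= 0.1 m, falling linearly to 0 at |z| = 0.2 m."""
    a = abs(z)
    if a <= 0.1:
        return 1.0
    if a >= 0.2:
        return 0.0
    return (0.2 - a) / 0.1


if __name__ == "__main__":
    z = [i / 100 for i in range(-30, 31)]
    b = [0.8 * sample_profile(v) for v in z]  # central field of 0.8 T, arbitrary
    print(equivalent_length(z, b))  # close to 0.3 m for this profile
```

For this profile the flat part contributes 0.2 m and each linear flank 0.05 m, so the equivalent rectangle is 0.3 m long whatever the central field.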
To complete this analysis one may distinguish between the action of the different components B_X and B_Y of B_r, and define an equivalent length for each component (8):

$$ L_X(X,Y) = \frac{1}{B_X(0,X,Y)} \int_{-\infty}^{+\infty} B_X(z,X,Y)\,dz, \qquad L_Y(X,Y) = \frac{1}{B_Y(0,X,Y)} \int_{-\infty}^{+\infty} B_Y(z,X,Y)\,dz, $$

B_X(0,X,Y) and B_Y(0,X,Y) being the actual values of B_X and B_Y at z = 0, at the point M(X,Y).
For a pure quadrupole magnetic field at z = 0 (of constant gradient K(0,0)) one may express B_X and B_Y in the form:

$$ B_X = -K(0,0)\,Y, \qquad B_Y = -K(0,0)\,X $$

and the expressions for L_X and L_Y are:

$$
\begin{aligned}
L_X(X,Y) &= L_0 + A\,(5X^4 - 10X^2Y^2 + Y^4) + B\,(9X^8 - 84X^6Y^2 + 126X^4Y^4 - 36X^2Y^6 + Y^8) + \cdots \\
L_Y(X,Y) &= L_0 + A\,(X^4 - 10X^2Y^2 + 5Y^4) + B\,(X^8 - 36X^6Y^2 + 126X^4Y^4 - 84X^2Y^6 + 9Y^8) + \cdots
\end{aligned}
$$
where L_0 is the equivalent length at X = Y = 0. We see therefore that L will not be a constant in all the useful volume unless the coefficients A, B, ... are zero, which may be realized by the adoption of a special form for the ends of the pole pieces. Moreover, at Y = 0 the equivalent length is given by L_Y = L_0 + AX⁴ + BX⁸ + ..., and at X = 0 by an equivalent expression. On Ox, where X = Y, we have:

$$ L_X = L_Y = L_0 - 4AX^4 + 16BX^8 + \cdots $$

that is:

$$ L_x = L_0 - Ax^4 + Bx^8 + \cdots \qquad \text{for } x = X\sqrt{2}. $$

The variation of L as a function of the distance from the axis is not the same along OX and Ox, unless A = 0. On the contrary, if B may be neglected, and if A is positive, L will increase along OX and OY, and decrease on Ox and Oy as we go away from Oz. In the electrostatic case, for an ideal distribution with constant gradient at z = 0, the expressions for L_x(x,y) and L_y(x,y) are obtained from the components E_x and E_y, which are such that

$$ E_x = -K(0,0)\,x, \qquad E_y = K(0,0)\,y. $$
The expressions for L_x and L_y are identical with those for L_Y and L_X, respectively, in which we must replace X by x and Y by y. If the quadrupole symmetry does not continue out to the ends (for example it may be convenient to reduce the number of exciting coils to two instead of four, or perhaps to adopt a rectangular yoke) supplementary terms must be added to the preceding ones in the expression of L. In what follows the equivalent length used will be the relative equivalent length for a radial field B and we shall denote it by L_B(r):

$$ L_B(r) = L_B(0) + \Delta L_B(r). $$

b. Equivalent length for the gradient L_G. We may also define an equivalent length for the function K(z) expressing the gradient along parallels to Oz. Most often, the transverse gradient varies slightly as one goes away from the axis and K(z) may be written in the form K(z,r). We then have:

$$ L_G(r) = \frac{1}{K(0,r)} \int_{-\infty}^{+\infty} K(z,r)\,dz. $$
There exists a simple relation between L_B(r) and L_G(r) (9). Since K(z,r) is the derivative of B(z,r) with respect to r, we have in fact

$$ L_G(r) = L_B(r) + \frac{B(0,r)}{K(0,r)}\,\frac{dL_B(r)}{dr}. $$

For a field with a perfectly constant gradient in the central zone, B(0,r)/K(0,r) = r, whence we have

$$ L_G(r) = L_B(r) + r\,\frac{dL_B(r)}{dr}. $$

If L_B(r) is a decreasing function of the distance r, L_G(r) will decrease even more rapidly. c. Real equivalent length and theoretical equivalent length. We have defined the equivalent length starting from the graph of experimental values of the field, or of the gradient, at z = 0: B(0,r) and K(0,r). But one may obtain a better idea of the convergence at different points in the lens by calculating L_B and L_G starting from the theoretical values
of B(0,r) and K(0,r) at the point considered, the values being linked by the relations

$$ B(0,r) = r\,K(0,0) \qquad \text{and} \qquad K(0,r) = K(0,0). $$

These laws characterize ideal lenses and, starting from the measured value of K(0,0), allow us to obtain B(0,r) and K(0,r). The values thus obtained for L_B(r) and L_G(r) will not follow the same laws of variation with r as the entirely experimental values considered earlier; the two sets may usefully be compared.

5. Magnetic Lenses without Poles. a. Lenses without iron (10a,b). It is possible to obtain distributions of field with a constant gradient, to a very good approximation, with sheets of current of constant density, arranged around the sides of a square (Fig. 4).

FIG. 4. Cross section of a lens with four sheets of current.
Let I be the total current circulating in one of the sheets, of width 2a and of infinite length; in two of the opposing sheets the current circulates in the positive sense along the axis Oz, and in the other two, in the negative sense. The resulting field possesses quadrupolar symmetry, and has for components at P, with the notations of Fig. 4:
These expressions simplify in the planes of symmetry. If we set X/a = u, for example, we have
on OX
Bx
=
0
+t on Ox
Bx
=
+
1
a n h - 1 ~ 2 2uu2
By
POI Bx = - [tan-' u
21ra
+ tanh-l
u]
that is, in the neighborhood of the axis Oz onOX on Ox The radial gradient is rigorously given by the following expressions : on OX on
Ox
and the graph of the field is identical to that of a quadrupole lens with iron, the ends of whose poles are at A_1, A_2, A_3, and A_4, with a number of ampere turns per pole equal to I/2. The gradient of such a lens would be:
For the same number of ampere turns, the corresponding lens with iron would give a larger gradient: K(0)/K_1(0) = π/2. However, it is possible to obtain strong gradients by increasing the number of conductors making up the sheets of current, by adding layers which are also arranged in a square. If we calculate the values of B_X and B_Y as far as X, x = a, we notice that the difference between the real field and the ideal field is of the order of 6%, and that the law of variation of the field is given to about 0.5% as far as r = 0.7a by the approximate law with 2 terms:
It is possible to improve this distribution considerably by either of the following two methods. ( 1 ) By using sheets of current of width 2h shorter than the side 2a of the square; in which case we will have in practice on O X : By=---
'''[
2rh
tan-'
+2hX h2 -
.2
Near the axis if we require that t
2'01
By
=
- aa(1
+ P) ' u [ l -
=
X?
+ tanh-'
a2
+2hX h2 + X2
h/a and u = X / a , we obtain:
- 24 - 5 u4 5(1 t2)4 9 - 84t2+ 126t4 - 36t6 9(1 t 2 ) 8
lot2
+
+
+
the term in u4will disappear if h
=
1.
I?
+ t S U 8- . . . .
0.726a
the term in (X/a)⁸ will then have the value 0.7 × 10⁻²(X/a)⁸. (2) By using thick coils of rectangular cross section (10b) of width 2h (shorter than the side, 2a) and thickness 2b, where h/a = 0.92 and b/a = 0.5. The radial gradient is then practically constant from X = 0 to X = 0.9a.

b. Lenses with iron. (1) Current sheets on a circle (11). Let there be a sheet of current of linear density j(ψ) (Fig. 5) circulating along a circular cylinder of axis Oz and radius R. The local intensity I(ψ_1,ψ_2) is given by the relation

$$ I(\psi_1,\psi_2) = \int_{\psi_1}^{\psi_2} j(\psi)\,R\,d\psi. $$

FIG. 5. Sheet of current on a cylinder.
The permeability μ_2 of the exterior medium (r > R) is supposed to be infinite; the examination of boundary conditions between the medium 1 (μ_1 = μ_0) and the medium 2 shows that the components H_r and H_ψ of H must be 0 in the iron. The discontinuity of the tangential component of field, expressed by

$$ (H_\psi)_1 - (H_\psi)_2 = -\frac{I(\psi)}{R\,d\psi} = -j(\psi), $$

requires then that (H_ψ)_1 = −j(ψ) if the positive sense of I is the same as that of Oz. To create a known distribution of field in the space r < R, it is sufficient to impose on j(ψ) the law of variation obtained for H_ψ. A purely quadrupolar field (with a constant gradient) is expressed in polar coordinates by the relations:

$$ B_r = \mu_0 H_r = -K(0)\,r \sin 2\psi, \qquad B_\psi = \mu_0 H_\psi = -K(0)\,r \cos 2\psi, $$

choosing ψ = 0 on OX; K(0) denotes the radial gradient of the induction. It therefore suffices to create a sheet of current of density

$$ j(\psi) = \frac{K(0)\,R}{\mu_0} \cos 2\psi. $$
Since the magnetic fields are 0 in the exterior medium, a sufficient condition for shielding the lens is to surround it with a layer of a medium with a very large permeability. But it is not very practical to realize strong densities of current with a complicated angular law of variation; the continuous sheet must evidently be replaced by separate wires in which the intensities vary nearly according to the same law.

(2) Current sheets on a square or a rectangle (16). If we consider a sheet of current of constant density and of a definite width, pressed against a medium of infinite permeability (Fig. 6), the condition

$$ H_Y = -j(Y) $$

means that the scalar magnetic potential varies linearly with Y along AB.

FIG. 6. Surface current at the edge of a medium of infinite permeability.
In a lens with constant gradient, the scalar potential φ varies linearly on the straight lines X = X_0 or Y = Y_0 (Fig. 7):

$$ \phi(AD) = K(0)\,X_0\,Y, \qquad \phi(AB) = -K(0)\,Y_0\,X, $$

if we suppose that there is a north pole at A and a south pole at B, the points ABCD being taken on the limit of the hyperbolic pole pieces. The graph of the field on the inside of the rectangle ABCD will not be modified if we place sheets of current of density

$$ j_1 = K X_0 \ \text{on } AD \text{ and } BC, \qquad j_2 = -K Y_0 \ \text{on } AB \text{ and } CD, $$

and if we enclose this rectangle in plates of steel of very high permeability.

FIG. 7. Straight lines for the linear variation of the scalar potential in a classical quadrupole lens.

But, contrary to the case of current sheets without iron, if the number of ampere turns is increased by superimposing layers, which are consequently moved away from the iron, the tangential field no longer remains constant; moreover, higher order terms grow and perturb the linearity of the field. Lenses of this type have been constructed by Hand and Panofsky (12); they are formed of 4 flat coils of appreciable thickness placed inside a soft steel frame of rectangular cross section. The total number of ampere turns is the same on the two sides of the rectangle (2nI). The construction of the windings is more delicate than in the case of the classical quadrupolar iron lenses. For very thin coils, in the neighborhood of O, the radial gradient will be the same as that of a lens whose poles are tangent to the vertices of the rectangle, for the same number of ampere turns nI relative to a single pole. The real distribution of field with thick coils may be calculated using the method of electric images of the coils in the different walls. The radial gradient is given by (12, 13):

$$ K(0) = \mu_0\,j\,\frac{e}{e + b}, $$

where j is the current density, the internal space having sides 2b and 2c (see Fig. 8), with coils of thickness e or d such that ec = bd.

FIG. 8. Section of a quadrupole lens without poles.

We shall see later that the interest of these lenses consists in the fact that in the divergent plane OY one obtains a greater useful space than in the convergent plane OX; we may thus use longer lenses without having the divergent trajectories lost against the walls of the vacuum chamber.
B. Equations of Motion

1. General Expression. Let us go back to the system of coordinates which was defined earlier (Fig. 1), where the axes Ox and OX are obtained one from the other by a rotation (OX, Ox) = +π/4 in a left-handed coordinate system. The fundamental equation

$$ \frac{d(m\mathbf{v})}{dt} = e\mathbf{E} \quad \text{(electric case)}, \qquad \frac{d(m\mathbf{v})}{dt} = -e\,(\mathbf{v} \times \mathbf{B}) \quad \text{(magnetic case)} $$

(where e is the value of the charge of the incident particle, m its mass, and v its velocity), which may be rewritten:

$$ \frac{d}{dt}\left[\frac{m_0\mathbf{v}}{\sqrt{1 - v^2/c^2}}\right] = e\mathbf{E} \quad \text{or} \quad -e\,(\mathbf{v} \times \mathbf{B}), $$

leads to simple equations having the same form in the two cases, if E is referred to the axes Ox and Oy, and B to the axes OX and OY. In the magnetic case, eliminating the time by the relation v = ds/dt, that is,

$$ \frac{d}{dt} = v\,\frac{d}{ds}, $$

where s is the arc length measured along the trajectory, we may write

$$ ds = dz \left[\left(\frac{dX}{dz}\right)^2 + \left(\frac{dY}{dz}\right)^2 + 1\right]^{1/2}. $$

To a first approximation, when the trajectory is only slightly tilted with respect to Oz, ds = dz, and the longitudinal velocity v_z practically coincides with v and remains practically constant. Furthermore,

$$ \frac{m_0 v}{\sqrt{1 - v^2/c^2}} = p = \left[2\,e\,m_0\,\phi_0\,(1 + \varepsilon\phi_0)\right]^{1/2} $$

with

$$ \varepsilon = \frac{|e|}{2 m_0 c^2} > 0, $$

φ_0 being the accelerating voltage for the incident particles. Finally (14),
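To a first-order approximation,

$$ ds = dz, \qquad B_X = -K(z)\,Y, \qquad B_Y = -K(z)\,X. $$

Whence, denoting by X″ and Y″ the second derivatives of X and Y with respect to z,

$$ X'' + \left[\frac{e}{2 m_0 \phi_0 (1 + \varepsilon\phi_0)}\right]^{1/2} K(z)\,X = 0, \qquad Y'' - \left[\frac{e}{2 m_0 \phi_0 (1 + \varepsilon\phi_0)}\right]^{1/2} K(z)\,Y = 0 $$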
[for a negative particle, the (+) and (−) would be interchanged]. These equations may be written in another form, if the particle is characterized by its "rigidity" (Bρ) = p/e (circular trajectory of radius of curvature ρ in a homogeneous field B, for a particle of linear momentum p):

$$ X'' + \frac{K(z)}{(B\rho)}\,X = 0, \qquad Y'' - \frac{K(z)}{(B\rho)}\,Y = 0. $$

In the case where positive ions are used the factor εφ_0 is most often negligible; for protons of 10 and 100 MeV we would have εφ_0 = 5 × 10⁻³ and 5 × 10⁻², respectively. For protons of energy φ_0 less than 100 MeV, the equations of motion can then be simplified:
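$$ X'' + \left(\frac{e}{2 m_0 \phi_0}\right)^{1/2} K(z)\,X = 0, \qquad Y'' - \left(\frac{e}{2 m_0 \phi_0}\right)^{1/2} K(z)\,Y = 0 $$

or

$$ X'' + \beta^2(z)\,X = 0, \qquad Y'' - \beta^2(z)\,Y = 0 $$

with

$$ \beta^2(z) = \left(\frac{e}{2 m_0 \phi_0}\right)^{1/2} K(z), $$

that is, β²(z) = β² k(z), and

$$ \beta^2 = \left(\frac{e}{2 m_0 \phi_0}\right)^{1/2} K(0) $$

at the center of the lens (z = 0); β is called the "characteristic factor" of the lens for a given incident particle. With electrons, on the other hand, the corrected potential must be retained in the expression for β; if φ_0 = 1 MV, we already have εφ_0 ≈ 1.

In the case of an electrostatic lens, we arrive in the same way at equations of the same form in the relativistic case, with K(z) = (2φ_1/a²) k(z), and at

$$ x'' + \frac{1}{a^2}\,\frac{\phi_1}{\phi_0}\,k(z)\,x = 0, \qquad y'' - \frac{1}{a^2}\,\frac{\phi_1}{\phi_0}\,k(z)\,y = 0 $$

for a nonrelativistic particle. The characteristic factor may then be written

$$ \beta^2 = \frac{1}{a^2}\,\frac{\phi_1}{\phi_0} \qquad \text{for } z = 0. $$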
In both cases we end up with a noncoupled system of equations to describe the properties to first order:

$$ X'' + \beta^2(z)\,X = 0, \qquad Y'' - \beta^2(z)\,Y = 0 $$

with X standing for X or x, and Y for Y or y.
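The orders of magnitude quoted for εφ_0, and the link between gradient, rigidity, and characteristic factor in the magnetic case, can be checked numerically. The sketch below assumes SI units and rounded present-day values of the physical constants, which are not taken from the text; the function names are illustrative.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
C_LIGHT = 299792458.0        # speed of light, m/s
M_PROTON = 1.67262192e-27    # proton rest mass, kg
M_ELECTRON = 9.1093837e-31   # electron rest mass, kg


def relativistic_factor(phi0_volts, mass_kg):
    """Dimensionless product epsilon*phi_0, with epsilon = e/(2*m0*c**2)."""
    return E_CHARGE * phi0_volts / (2.0 * mass_kg * C_LIGHT ** 2)


def magnetic_rigidity(phi0_volts, mass_kg):
    """Rigidity (B rho) = p/e in T*m, with p = [2*e*m0*phi0*(1 + eps*phi0)]**0.5."""
    eps_phi0 = relativistic_factor(phi0_volts, mass_kg)
    return math.sqrt(2.0 * mass_kg * phi0_volts * (1.0 + eps_phi0) / E_CHARGE)


def characteristic_factor(gradient, phi0_volts, mass_kg):
    """beta in 1/m at the lens center, from beta**2 = K(0)/(B rho)."""
    return math.sqrt(gradient / magnetic_rigidity(phi0_volts, mass_kg))


if __name__ == "__main__":
    print(relativistic_factor(10e6, M_PROTON))    # about 5e-3
    print(relativistic_factor(100e6, M_PROTON))   # about 5e-2
    print(relativistic_factor(1e6, M_ELECTRON))   # close to 1
    # Hypothetical lens with K(0) = 5 T/m acting on 10 MeV protons:
    print(characteristic_factor(5.0, 10e6, M_PROTON))
```

The 5 T/m gradient in the last line is an arbitrary illustrative value.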
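2. Integration of Equations for the Rectangular Model. The term β(z) is a nonanalytic function of z, and the preceding equations can only be integrated step by step. But, if k(z) is replaced by the equivalent rectangle of length L and of height 1 for the entire width of the lens, we have:

$$ \beta(z) = \beta(0) = \text{constant}. $$

From this we obtain the equations of motion which follow, taking the initial conditions at the entrance to the lens (X_0, X′_0, Y_0, Y′_0) and placing the origin of the axis Oz at this point:

$$ X = X_0 \cos\beta z + \frac{X'_0}{\beta} \sin\beta z $$

in the convergent plane OX, and

$$ Y = Y_0 \cosh\beta z + \frac{Y'_0}{\beta} \sinh\beta z $$

in the divergent plane OY. At the exit (z = L):

$$
\begin{aligned}
X_1 &= X_0 \cos\beta L + \frac{X'_0}{\beta}\sin\beta L, & X'_1 &= -\beta X_0 \sin\beta L + X'_0 \cos\beta L,\\
Y_1 &= Y_0 \cosh\beta L + \frac{Y'_0}{\beta}\sinh\beta L, & Y'_1 &= \beta Y_0 \sinh\beta L + Y'_0 \cosh\beta L.
\end{aligned}
$$

From this we obtain the transfer matrices:

$$ \begin{pmatrix} X_1 \\ X'_1 \end{pmatrix} = \begin{pmatrix} A & B \\ C & A \end{pmatrix} \begin{pmatrix} X_0 \\ X'_0 \end{pmatrix}, \qquad A = \cos\beta L, \quad B = \frac{\sin\beta L}{\beta}, \quad C = -\beta \sin\beta L, $$

$$ \begin{pmatrix} Y_1 \\ Y'_1 \end{pmatrix} = \begin{pmatrix} D & E \\ F & D \end{pmatrix} \begin{pmatrix} Y_0 \\ Y'_0 \end{pmatrix}, \qquad D = \cosh\beta L, \quad E = \frac{\sinh\beta L}{\beta}, \quad F = \beta \sinh\beta L. $$

We have the relations

$$ A^2 - BC = 1, \qquad D^2 - EF = 1. $$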
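The rectangular-model matrices are easy to manipulate numerically. The sketch below builds them with the standard library only and transports a ray (position, slope) through a lens; the helper names are hypothetical and the numerical values are arbitrary.

```python
import math


def convergent_matrix(beta, length):
    """Transfer matrix in the convergent plane: ((A, B), (C, A))."""
    x = beta * length
    return ((math.cos(x), math.sin(x) / beta),
            (-beta * math.sin(x), math.cos(x)))


def divergent_matrix(beta, length):
    """Transfer matrix in the divergent plane: ((D, E), (F, D))."""
    x = beta * length
    return ((math.cosh(x), math.sinh(x) / beta),
            (beta * math.sinh(x), math.cosh(x)))


def transport(matrix, position, slope):
    """Apply a 2x2 transfer matrix to the ray (position, slope)."""
    (m11, m12), (m21, m22) = matrix
    return m11 * position + m12 * slope, m21 * position + m22 * slope


def determinant(matrix):
    (m11, m12), (m21, m22) = matrix
    return m11 * m22 - m12 * m21


if __name__ == "__main__":
    lens = convergent_matrix(2.0, 0.3)       # beta = 2 per meter, L = 0.3 m
    print(determinant(lens))                 # equals 1 up to rounding
    print(transport(lens, 0.01, 0.0))        # parallel ray entering at X0 = 1 cm
```

The unit determinant is the numerical counterpart of the relations A² − BC = 1 and D² − EF = 1.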
3. Integration of Equations for the Bell-shaped Model (15, 16). The equations may also be integrated by replacing the experimentally obtained real function k(z) by the bell-shaped model represented in Fig. 3:

$$
\begin{aligned}
\text{Region I:} \quad & k(z) = \frac{1}{\left[1 + \left(\dfrac{z + z_0}{b}\right)^2\right]^2}, & -\infty < z < -z_0,\\
\text{Region II:} \quad & k(z) = 1, & -z_0 < z < +z_0,\\
\text{Region III:} \quad & k(z) = \frac{1}{\left[1 + \left(\dfrac{z - z_0}{b}\right)^2\right]^2}, & +z_0 < z < +\infty.
\end{aligned}
$$

b is the half-width at midheight of the half-bell-shaped curves thus defined. The quantities z_0 and b are determined from K(z) by keeping the areas of the experimental curve and of the theoretical one equal, and by making the two curves coincide as closely as possible in the region where the slope dk(z)/dz is steepest. In Region I, for example, the equation in X is written:

$$ \frac{d^2X}{dz^2} + \frac{\beta^2}{\left[1 + \left(\dfrac{z + z_0}{b}\right)^2\right]^2}\,X = 0. $$
This may be integrated by setting (17):

$$ z + z_0 = b \cot\psi \qquad \text{and} \qquad u = X \sin\psi, \qquad \text{with } \pi > \psi > \pi/2. $$

The equation becomes

$$ \frac{d^2u}{d\psi^2} + (1 + b^2\beta^2)\,u = 0. $$

In the same manner we find, with v = Y sin ψ,

$$ \frac{d^2v}{d\psi^2} + (1 - b^2\beta^2)\,v = 0. $$

The solutions have the form:

$$ X = \frac{1}{\sin\psi}\,(A \cos\omega_1\psi + B \sin\omega_1\psi), \qquad Y = \frac{1}{\sin\psi}\,(C \cos\omega_2\psi + D \sin\omega_2\psi) $$

with

$$ \omega_1^2 = 1 + b^2\beta^2, \qquad \omega_2^2 = 1 - b^2\beta^2. $$

The constants A, B, C, D are expressed as functions of X_0, Y_0, X′_0 and Y′_0 at the entrance to the lens, this being situated not at z = −∞ but at a point z = z_1 where the function k(z) has taken a negligible value (less than 10⁻², for example). We then integrate in the central region, and finally in the third region, matching the coordinates and the slopes of the trajectories at z = −z_0 and also at z = +z_0.
C. Optical Elements of a Single Lens

1. Definitions. We consider a lens of length L having its center at z = 0 (Fig. 9), and an incident trajectory parallel to Oz, first of all in the plane zOX, then in the plane zOY (X′_0 = 0, Y′_0 = 0).

FIG. 9. Cardinal elements of a lens: (a) convergent plane, (b) divergent plane.
The focus is defined by the intersection with the axis Oz of the direction of the emergent ray (for strong lenses it may be situated within the lens, but does not coincide with the real intersection of the trajectory with the axis), and the position of the principal image plane by the intersection of the direction of the emergent ray with that of the incident ray. Let u_X and u_Y be the abscissae of F′_X and F′_Y with reference to the exit face, and f_X and f_Y the algebraic values of the focal distances. We have the relations:

$$ f_X = -\frac{X_0}{X'_1}, \qquad u_X = -\frac{X_1}{X'_1}, \qquad f_Y = -\frac{Y_0}{Y'_1}, \qquad u_Y = -\frac{Y_1}{Y'_1}, $$
denoting by X_1, X′_1 and Y_1, Y′_1 the ordinates and the slopes of the rays at the exit from the lens in the zOX and zOY planes, respectively.

2. Rectangle Model. We obtain easily (Figs. 9a and b):

$$ f_X = \frac{1}{\beta \sin\beta L}, \qquad f_Y = -\frac{1}{\beta \sinh\beta L}. $$

The distance v_X, v_Y of the principal image plane with respect to the exit face is given by:

$$ |v_X| = \frac{1 - \cos\beta L}{\beta \sin\beta L} = \frac{1}{\beta} \tan\frac{\beta L}{2}, \qquad |v_Y| = \frac{\cosh\beta L - 1}{\beta \sinh\beta L} = \frac{1}{\beta} \tanh\frac{\beta L}{2}, $$

and its abscissa with respect to the center of the lens by

$$ \Delta_X = \frac{L}{2} - |v_X|, \qquad \Delta_Y = \frac{L}{2} - |v_Y|. $$

The object foci and principal planes are the mirror images of the image foci and principal planes with respect to the center O. Let us note here that in the magnetic case they cannot be defined by considering a ray parallel to Oz coming from the right, for in this case OX will become the divergent plane and OY the convergent plane. The object elements can also be defined from those incident beams whose corresponding emergent beams are parallel to the axis.
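If the excitation of the lens is sufficiently weak, we may write

$$ \sinh\beta L \simeq \sin\beta L \simeq \beta L, \qquad \cosh\beta L \simeq 1 + \frac{(\beta L)^2}{2}, \qquad \cos\beta L \simeq 1 - \frac{(\beta L)^2}{2}, $$

and we obtain the approximation of "weak" lenses

$$ f_X = -f_Y = \frac{1}{\beta^2 L}, \qquad \Delta_X = \Delta_Y = 0. $$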
The two focal distances are equal and opposite, and the principal planes are at the center of the lens. When βL is greater than 0.2, the ratio ρ = f_X/f_Y increases very rapidly:

βL =   0.2    0.5    1      1.5
ρ  =   -1     1.1    1.4    2.1
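The tabulated ratios can be reproduced from the rectangular-model focal distances f_X = 1/(β sin βL) and f_Y = −1/(β sinh βL), whose ratio has the magnitude sinh βL / sin βL. The short sketch below evaluates that magnitude and compares the exact focal distance with the weak-lens value 1/(β²L); the function names are illustrative.

```python
import math


def focal_lengths(beta, length):
    """Rectangular-model focal distances (f_X, f_Y) of a single lens."""
    x = beta * length
    return 1.0 / (beta * math.sin(x)), -1.0 / (beta * math.sinh(x))


def focal_ratio_magnitude(beta_l):
    """Magnitude of f_X / f_Y, equal to sinh(beta L) / sin(beta L)."""
    return math.sinh(beta_l) / math.sin(beta_l)


def weak_lens_focal_length(beta, length):
    """Weak-lens approximation f = 1 / (beta**2 * L)."""
    return 1.0 / (beta ** 2 * length)


if __name__ == "__main__":
    for beta_l in (0.2, 0.5, 1.0, 1.5):
        print(beta_l, round(focal_ratio_magnitude(beta_l), 2))
```

Rounded to one decimal place, the computed magnitudes agree with the values in the table.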
The lens becomes rapidly more divergent in OY than convergent in OX. Moreover, when βL increases, the principal planes move away from O, crossing each other. In fact, we have, for the image plane,

$$ v_X > L/2, \qquad v_Y < L/2. $$

The difference remains negligible so long as βL < 0.5, but we will already have:

$$ \Delta_X = -0.33\,\frac{L}{2} = -\Delta_Y \qquad \text{for } \beta L = \pi/2. $$
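When βL increases still more, the principal planes of the plane OX oscillate, passing periodically to infinity, while they tend in toward the faces of the lens in the plane OY in a monotonic manner. The thin-lens approximation will therefore not be valid except for βL ≤ 0.5 or 0.6. The curves of variation of the different cardinal elements as a function of βL are represented in Fig. 10. This representation of the lens with the use of foci and principal planes allows the transfer matrices to be expressed in a different form, the lens becoming equivalent to two drift spaces of equal length, v_X or v_Y, and a thin lens causing a refraction P or Q (variation of the slope). We set:

$$ v_X = M = \frac{1}{\beta} \tan\frac{\beta L}{2}, \qquad v_Y = N = \frac{1}{\beta} \tanh\frac{\beta L}{2}, \qquad P = -\beta \sin\beta L, \qquad Q = \beta \sinh\beta L; $$

we have then:

$$ \begin{pmatrix} A & B \\ C & A \end{pmatrix} = \begin{pmatrix} 1 & M \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ P & 1 \end{pmatrix} \begin{pmatrix} 1 & M \\ 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} D & E \\ F & D \end{pmatrix} = \begin{pmatrix} 1 & N \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ Q & 1 \end{pmatrix} \begin{pmatrix} 1 & N \\ 0 & 1 \end{pmatrix}. $$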
FIG. 10. Variation of the convergence C = 1/f, and of the position of the principal image plane, as a function of βL in the two planes zOX and zOY of a single lens.
3. Bell-shaped Model. The expressions are much more complicated, for it is necessary to bring the trajectories into agreement in the three successive domains : 1
- = sin 2Pz0
sin2 w l r / 2
Pfx
+
cos2 wlr/2
cos 2PZO
1 - w12
1
- = -sinh 2Pz0
af y
- cos 2pzo cos w17r
~ F ’ Y= 20
+fi/[ -P
sinh 2Pz0
1
1
-
b sin w2r
w)2a
1
4. Comparison Between the Two Models. If we study the convergence of a given lens, representing the characteristic function k(z) by the bell-shaped model, with the conservation of the area of k(z) as the single condition, it is possible to obtain different values of the pair (z_0, b) in the equation for the bell-shaped curves. The value b = 0 corresponds to the rectangular model, and z_0 = 0 to a curve with a single bell, with no intermediate zone. The study of the cardinal elements furnished by the preceding formulas shows that, between these two extreme cases, the convergence and the position of the foci do not vary by more than about 1.5% (16). It is therefore pointless to become involved in the complicated formulas of the bell-shaped model for the calculation of an optical system. The rectangular model will give satisfactory and sufficiently accurate results, provided that the equivalent length L is introduced in the formulas, and not the mechanical length l. This equivalent length is obtained experimentally. We shall see later that the empirical relation L ≈ l + 1.1a is often enough to calculate L to a good approximation.
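D. Optical Elements of a Doublet

1. Definitions. A strong-focusing doublet consists of two lenses Q_1 and Q_2 with the same axis Oz, of lengths L_1 and L_2, separated by a distance D, and such that the convergent plane of the first corresponds to the divergent plane of the second. A north (or positive) pole of the first lens is face to face with a south (or negative) pole of the second. The doublet will be called symmetrical if the two lenses are identical (β_1 = β_2, L_1 = L_2). The origin will be taken in the entrance plane of Q_1. The plane XOz will be denoted the "convergent-divergent" plane of the doublet (plane C-D), the positive particles being supposed to follow the positive sense of Oz. The plane YOz will be the "divergent-convergent" plane (plane D-C). The initial conditions are defined by X_0, Y_0, X′_0, and Y′_0 at z = 0, and the foci and principal planes of the arrangement may be calculated from the final conditions in the exit plane of Q_2 (X_3, Y_3, X′_3, Y′_3), these being determined by reference to the formulas of the rectangular model.

2. Transfer Matrix of the Doublet (18).

Plane C-D:

$$ \begin{pmatrix} X_3 \\ X'_3 \end{pmatrix} = \begin{pmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{pmatrix} \begin{pmatrix} X_0 \\ X'_0 \end{pmatrix}, \qquad \begin{pmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{pmatrix} = \begin{pmatrix} \cosh\beta_2L_2 & \dfrac{\sinh\beta_2L_2}{\beta_2} \\ \beta_2\sinh\beta_2L_2 & \cosh\beta_2L_2 \end{pmatrix} \begin{pmatrix} 1 & D \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \cos\beta_1L_1 & \dfrac{\sin\beta_1L_1}{\beta_1} \\ -\beta_1\sin\beta_1L_1 & \cos\beta_1L_1 \end{pmatrix} $$

Plane D-C:

$$ \begin{pmatrix} Y_3 \\ Y'_3 \end{pmatrix} = \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix} \begin{pmatrix} Y_0 \\ Y'_0 \end{pmatrix}, \qquad \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix} = \begin{pmatrix} \cos\beta_2L_2 & \dfrac{\sin\beta_2L_2}{\beta_2} \\ -\beta_2\sin\beta_2L_2 & \cos\beta_2L_2 \end{pmatrix} \begin{pmatrix} 1 & D \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \cosh\beta_1L_1 & \dfrac{\sinh\beta_1L_1}{\beta_1} \\ \beta_1\sinh\beta_1L_1 & \cosh\beta_1L_1 \end{pmatrix} $$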
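This leads to the following, if we set β_1L_1 = x_1 and β_2L_2 = x_2:

$$
\begin{aligned}
H_{11} &= \cos x_1 \cosh x_2 - \beta_1 D \sin x_1 \cosh x_2 - \frac{\beta_1}{\beta_2} \sin x_1 \sinh x_2 \\
H_{12} &= D \cos x_1 \cosh x_2 + \frac{1}{\beta_1} \sin x_1 \cosh x_2 + \frac{1}{\beta_2} \cos x_1 \sinh x_2 \\
H_{21} &= \beta_2 \cos x_1 \sinh x_2 - \beta_1 \sin x_1 \cosh x_2 - \beta_1\beta_2 D \sin x_1 \sinh x_2 \\
H_{22} &= \cos x_1 \cosh x_2 + \beta_2 D \cos x_1 \sinh x_2 + \frac{\beta_2}{\beta_1} \sin x_1 \sinh x_2 \\[4pt]
V_{11} &= \cosh x_1 \cos x_2 + \beta_1 D \sinh x_1 \cos x_2 + \frac{\beta_1}{\beta_2} \sinh x_1 \sin x_2 \\
V_{12} &= D \cosh x_1 \cos x_2 + \frac{1}{\beta_1} \sinh x_1 \cos x_2 + \frac{1}{\beta_2} \cosh x_1 \sin x_2 \\
V_{21} &= -\beta_2 \cosh x_1 \sin x_2 + \beta_1 \sinh x_1 \cos x_2 - \beta_1\beta_2 D \sinh x_1 \sin x_2 \\
V_{22} &= \cosh x_1 \cos x_2 - \beta_2 D \cosh x_1 \sin x_2 - \frac{\beta_2}{\beta_1} \sinh x_1 \sin x_2
\end{aligned}
$$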
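The matrix elements listed above can be cross-checked by multiplying the rectangular-model matrices of the two lenses and of the intermediate drift space. The sketch below does this with plain tuples; the helper names and the numerical values are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))


def lens(beta, length, convergent):
    """Rectangular-model matrix of a single lens in its convergent or divergent plane."""
    x = beta * length
    if convergent:
        return ((math.cos(x), math.sin(x) / beta), (-beta * math.sin(x), math.cos(x)))
    return ((math.cosh(x), math.sinh(x) / beta), (beta * math.sinh(x), math.cosh(x)))


def drift(distance):
    return ((1.0, distance), (0.0, 1.0))


def doublet_matrices(beta1, l1, beta2, l2, d):
    """Return (H, V): the matrices of the planes C-D and D-C of the doublet."""
    h = matmul(lens(beta2, l2, False), matmul(drift(d), lens(beta1, l1, True)))
    v = matmul(lens(beta2, l2, True), matmul(drift(d), lens(beta1, l1, False)))
    return h, v


def h21_closed_form(beta1, l1, beta2, l2, d):
    """Closed-form element H21 as written in the text."""
    x1, x2 = beta1 * l1, beta2 * l2
    return (beta2 * math.cos(x1) * math.sinh(x2)
            - beta1 * math.sin(x1) * math.cosh(x2)
            - beta1 * beta2 * d * math.sin(x1) * math.sinh(x2))


if __name__ == "__main__":
    h, v = doublet_matrices(2.0, 0.3, 1.5, 0.4, 0.2)
    print(h[1][0], h21_closed_form(2.0, 0.3, 1.5, 0.4, 0.2))  # the two values agree
```

For a symmetrical doublet the same product shows numerically that V11 = H22, V22 = H11, V12 = H12, and V21 = H21.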
The terms of the matrix become considerably simplified when the doublet is symmetrical.

3. Cardinal Elements. Let H′_X F′_X be the image focal distance in the plane C-D; H′_Y F′_Y, that of the plane D-C; S′F′_X, the position of the image focus of the plane C-D, with reference to the exit face of Q_2; and S′F′_Y, the position of the image focus of the plane D-C. We imagine an incident ray parallel to Oz (X′_0 = Y′_0 = 0), from which we have the following:

$$ F_X = H'_XF'_X = -\frac{X_0}{X'_3} = -\frac{1}{H_{21}}, \qquad S'F'_X = -\frac{X_3}{X'_3} = H_{11}F_X, \qquad S'H'_X = (H_{11} - 1)\,F_X. $$

In the same way

$$ F_Y = H'_YF'_Y = -\frac{1}{V_{21}}, \qquad S'F'_Y = V_{11}F_Y, \qquad S'H'_Y = (V_{11} - 1)\,F_Y. $$
In order to define the lens completely, we need the elements of the object space. The formulas are identical with the preceding ones; in the plane zOX, it is necessary to take the preceding formulas for the plane zOY, replacing in them β_1 and L_1 by β_2 and L_2 and conversely, and changing sign. An analogous treatment in zOY will give the object elements of the plane D-C. The positions of F_X, F_Y, H_X, and H_Y will then be measured from the entrance plane of Q_1. We obtain, for the focal distances:

$$
\begin{aligned}
H'_XF'_X &= (\beta_1 \sin x_1 \cosh x_2 - \beta_2 \cos x_1 \sinh x_2 + \beta_1\beta_2 D \sin x_1 \sinh x_2)^{-1} \\
H'_YF'_Y &= (-\beta_1 \sinh x_1 \cos x_2 + \beta_2 \cosh x_1 \sin x_2 + \beta_1\beta_2 D \sinh x_1 \sin x_2)^{-1}
\end{aligned}
$$

expressions which reduce to:

$$
\begin{aligned}
H'_XF'_X = H'_YF'_Y &= [\beta\,(\sin\beta L \cosh\beta L - \cos\beta L \sinh\beta L + \beta D \sin\beta L \sinh\beta L)]^{-1} \\
&= [\beta \sin\beta L \sinh\beta L\,(\beta D + \operatorname{ctnh}\beta L - \operatorname{ctn}\beta L)]^{-1}
\end{aligned}
$$

in the case of a symmetrical doublet. In this simple case we will have:

$$ S'F'_X = -SF_Y, \qquad S'F'_Y = -SF_X \qquad \text{and} \qquad H'_XF'_X = H_XF_X. $$

The cardinal elements are in general placed as shown in Fig. 11.

FIG. 11. Position of the cardinal elements of a doublet.
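The focal-distance formulas of the doublet can be evaluated directly. The sketch below implements the general expressions and the symmetrical reduction, so that the equality of the two focal distances for a symmetrical doublet can be observed numerically; the function names and the sample values of β, L, and D are illustrative assumptions.

```python
import math


def doublet_focal_lengths(beta1, l1, beta2, l2, d):
    """Image focal distances (plane C-D, plane D-C) of a doublet, rectangular model."""
    x1, x2 = beta1 * l1, beta2 * l2
    inv_fx = (beta1 * math.sin(x1) * math.cosh(x2)
              - beta2 * math.cos(x1) * math.sinh(x2)
              + beta1 * beta2 * d * math.sin(x1) * math.sinh(x2))
    inv_fy = (-beta1 * math.sinh(x1) * math.cos(x2)
              + beta2 * math.cosh(x1) * math.sin(x2)
              + beta1 * beta2 * d * math.sinh(x1) * math.sin(x2))
    return 1.0 / inv_fx, 1.0 / inv_fy


def symmetric_focal_length(beta, length, d):
    """Reduced form [beta sin(bL) sinh(bL) (beta D + coth(bL) - cot(bL))]**-1."""
    x = beta * length
    bracket = beta * d + 1.0 / math.tanh(x) - 1.0 / math.tan(x)
    return 1.0 / (beta * math.sin(x) * math.sinh(x) * bracket)


if __name__ == "__main__":
    fx, fy = doublet_focal_lengths(2.22, 0.3, 2.22, 0.3, 0.12)
    print(fx, fy, symmetric_focal_length(2.22, 0.3, 0.12))  # three equal values
```

With unequal lenses the two planes give different focal distances, which is the astigmatism discussed in the next subsection.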
4. Object-Image Correspondence, Conjugate Points. a. Magnifications. A doublet, like a single lens, is generally an astigmatic system. For an object point P situated on the axis it gives two pseudo-images having the form of two segments of straight lines, normal to the axis and situated in the planes zOX and zOY respectively; we shall call these "focals." The focal formed at the intersection of Oz and the trajectories of the plane C-D will be called "the first focal"; it is situated in the plane zOY; the focal of the plane D-C will be the "second focal." If the source is a disk of radius r_0, the "images" will be ellipses of minor axes a_0 and b_0, respectively. The linear magnifications G_X and G_Y are different:

$$ G_X = a_0/r_0 \neq G_Y = b_0/r_0. $$

Let l_0 be the distance of the object point from the entrance to Q_1, l_i the distance of the first focal, and λ_i that of the second. We have G_X = X′_0/X′_3 with X′_0 = X_0/l_0, and G_Y = Y′_0/Y′_3 with Y′_0 = Y_0/l_0; from which:

$$ G_X = \frac{1}{H_{22} + l_0H_{21}} \qquad \text{and} \qquad G_Y = \frac{1}{V_{22} + l_0V_{21}}. $$

If there exist two different object points in zOX and zOY, situated respectively at l_0 and λ_0 from Q_1, we will have:

$$ G_X = \frac{1}{H_{22} + l_0H_{21}} \qquad \text{and} \qquad G_Y = \frac{1}{V_{22} + \lambda_0V_{21}}. $$
b. Pseudo-stigmatic operation. Denoting by l_i the distance of the image corresponding to the object in the plane C-D, and by λ_i the distance of the image in the plane D-C, we have the following homographic relations between l_i, λ_i, l_0, and λ_0:

Plane C-D:

$$ l_i = -\frac{H_{12} + l_0H_{11}}{H_{22} + l_0H_{21}} $$

Plane D-C:

$$ \lambda_i = -\frac{V_{12} + \lambda_0V_{11}}{V_{22} + \lambda_0V_{21}} $$
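The homographic relations apply to any 2×2 transfer matrix taken between an entrance and an exit plane, so they can be sketched independently of the doublet itself. The example below computes the image distance and checks it by tracing a ray from the object point; the function names are hypothetical, and a thin lens and a drift space are used only as simple test systems.

```python
import math


def image_distance(matrix, object_distance):
    """Image distance from the exit plane: -(M12 + l0*M11) / (M22 + l0*M21)."""
    (m11, m12), (m21, m22) = matrix
    denominator = m22 + object_distance * m21
    if denominator == 0.0:
        return math.inf  # emergent rays parallel to the axis: image at infinity
    return -(m12 + object_distance * m11) / denominator


def ray_height_after(matrix, object_distance, slope, distance_after_exit):
    """Height of a ray leaving an on-axis object point, at a plane behind the exit."""
    (m11, m12), (m21, m22) = matrix
    x_in = object_distance * slope
    x_out = m11 * x_in + m12 * slope
    slope_out = m21 * x_in + m22 * slope
    return x_out + distance_after_exit * slope_out


if __name__ == "__main__":
    thin_lens = ((1.0, 0.0), (-1.0, 1.0))   # thin lens of focal length 1 m
    l_i = image_distance(thin_lens, 2.0)    # object at twice the focal length
    print(l_i, ray_height_after(thin_lens, 2.0, 0.01, l_i))
```

For a doublet, the same function would be called once with the matrix H and the distance l_0, and once with the matrix V and the distance λ_0.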
It is possible to look for the special conditions of operation which give from a single point source (l_0 = λ_0) two images situated at the same point (l_i = λ_i). But this is only a case of pseudo-stigmatism, for the magnifications G_X and G_Y are different, and if we allow the excitations thus determined for Q_1 and Q_2 to be fixed, the operation is valid only for a unique pair of conjugate object-image points; if the object point is moved on Oz, the two images again separate.
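One may not have a rigorously stigmatic system for all pairs of conjugate points except under very strong excitations, βL ~ π (see below). The calculations can be simply carried out with a symmetric doublet, for in addition to the conditions l_0 = λ_0 and l_i = λ_i we have here l_0 = l_i. It is enough to write:

$$ l_0 = -\frac{H_{12} + l_0H_{11}}{H_{22} + l_0H_{21}} $$

with the following particular relationships:

$$ V_{11} = H_{22}, \qquad V_{12} = H_{12}, \qquad V_{21} = H_{21}, \qquad V_{22} = H_{11} \qquad \text{and} \qquad H_{11}H_{22} - 1 = H_{12}H_{21}. $$

From this we obtain:

$$ H_{21}\,l_0^2 + (H_{11} + H_{22})\,l_0 + H_{12} = 0. $$

If we set

$$ S = \beta l_0, \qquad J^2 = (\beta F)^2, \qquad V_X = \beta u_X \qquad \text{and} \qquad V_Y = \beta u_Y, $$

we find the formulas given by Sternheimer (19a):

$$ S = -\frac{1}{2}\,(V_X + V_Y) + \frac{1}{2}\left[(V_X - V_Y)^2 + 4J^2\right]^{1/2}. $$

J, V_X and V_Y can be expressed as functions of the two quantities cot βL and coth βL:

$$ J^2 = \frac{(1 + \cot^2\beta L)(\coth^2\beta L - 1)}{(\beta D - \cot\beta L + \coth\beta L)^2}, $$

$$ V_X = \beta u_X = -\cot\beta L - \frac{1 + \cot^2\beta L}{\beta D - \cot\beta L + \coth\beta L}, \qquad V_Y = \beta u_Y = \coth\beta L - \frac{\coth^2\beta L - 1}{\beta D - \cot\beta L + \coth\beta L}. $$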
Sternheimer (19) has used these formulas to calculate l_0 as a function of L, taking β = 2.22 m⁻¹ [K(0) = 6.66 x web/m3] for 3 values of D/L (0.1, 0.4, and 0.8). In this particular case, symmetry leads us to the following (see Fig. 11):

$$ X'_0 = -Y'_3, \qquad Y'_0 = -X'_3. $$

From this we deduce the relation G_X = (G_Y)⁻¹. This may be shown from the general expressions

$$ G_XG_Y = [(H_{22} + l_0H_{21})(H_{11} + l_0H_{21})]^{-1}. $$

Using the equation in l_0 and the relation H_11H_22 − 1 = H_12H_21, we find indeed that G_XG_Y = 1.

c. Graphs of operation. In the most common practical applications of strong-focusing lenses, the user must design a system which focuses (either on a target, or on the entrance slit of a spectrograph) particles which originate in an accelerator. The incident beam rarely has a simple structure: in general l_0 ≠ λ_0. In the first case we wish l_i = λ_i, and in the second, l_i ≠ λ_i; we place the first focal on the entrance slit, thus increasing the luminosity. In either case the images will be real: l_i, λ_i > 0. Very often loading conditions determine a supplementary value: the distance between the exit of the accelerator and the target, that is to say, indirectly, the distances

$$ l_0 + l_i + \Lambda \qquad \text{and} \qquad \lambda_0 + \lambda_i + \Lambda, $$

denoting by Λ the length of the doublet. The calculations are always very long with the general formulas. A few examples may be found in the articles by Bronca and Gendreau (18), in an application of the first type, and by Enge (20) and Rosenblatt (20a), which give very general graphs allowing the doublet to be determined in widely varied cases of convergence, supposing nonetheless that the two lenses have an identical length (L_1 = L_2 = L); in the article of Enge, the calculations have been carried out for two values of the distance D, D = L and D = 0, looking especially for the conditions of stigmatic operation. (It seems to be possible to realize D = 0 in practice because L is greater than the mechanical length. But if the lenses are brought too close together, the distributions of field in the intermediate zone become distorted, rendering the lenses asymmetric, and reacting in an unforeseeable fashion on the value of L.) In the recent article of Rosenblatt a graphical method is described, by means of which it is possible to design a quadrupole doublet when the locations of object and image are given. Let us note finally that the entire calculation can be carried out in an entirely different manner, considering the doublet as a grouping, in each plane, of two thick lenses having unequal focal distances and different principal planes; by means of the classical formulas we look for the value of the focal distance and the position of the principal planes of the ensemble.²

5. Thin-lens Approximation. In all the useful cases where the distances l_0, λ_0, l_i, and λ_i are much greater than L, it is always possible to carry out a simplified calculation, using the thin-lens approximation, with weak convergence. This is particularly the case with particles of very high energy (protons of from 100 to 1000 MeV, for example), for it is impossible to increase indefinitely the excitation of the lenses, and hence the convergence. The doublet then appears as a combination of two thin lenses Q_1 and Q_2, separated by a distance d:

$$ d = D + \frac{L_1 + L_2}{2}. $$
The two focal distances of each lens are equal, but opposite, in the directions $OX$ and $OY$. For $Q_1$,

$$f_{1X} = -f_{1Y} = f_1,$$

for $Q_2$,

$$-f_{2X} = +f_{2Y} = f_2;$$

the principal planes of each lens coincide at their centers $O_1$ and $O_2$. We have then for the combination:

$$\frac{1}{F_X} = \frac{1}{f_1} - \frac{1}{f_2} + \frac{d}{f_1 f_2}, \qquad \frac{1}{F_Y} = -\frac{1}{f_1} + \frac{1}{f_2} + \frac{d}{f_1 f_2}.$$
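The two thin-lens expressions above can be checked numerically. The following is a minimal sketch, not part of the original article: the function names and the numerical values of $f_1$, $f_2$, and $d$ are illustrative assumptions. It evaluates the closed forms and compares them with the product of thin-lens and drift matrices.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def thin_lens(f):
    """Thin lens of focal distance f (f < 0 for a divergent lens)."""
    return [[1.0, 0.0], [-1.0 / f, 1.0]]

def drift(d):
    """Field-free space of length d."""
    return [[1.0, d], [0.0, 1.0]]

def doublet_focal_distances(f1, f2, d):
    """Return (F_X, F_Y) from the closed-form thin-lens expressions."""
    inv_fx = 1.0 / f1 - 1.0 / f2 + d / (f1 * f2)
    inv_fy = -1.0 / f1 + 1.0 / f2 + d / (f1 * f2)
    return 1.0 / inv_fx, 1.0 / inv_fy

def focal_distance_from_matrices(fa, fb, d):
    """Same quantity from the product lens(fb) * drift(d) * lens(fa)."""
    m = matmul(thin_lens(fb), matmul(drift(d), thin_lens(fa)))
    return -1.0 / m[1][0]

# Illustrative values (assumed), all in meters.
f1, f2, d = 2.0, 3.0, 0.5
fx, fy = doublet_focal_distances(f1, f2, d)
print("F_X =", fx, "matrix check:", focal_distance_from_matrices(f1, -f2, d))
print("F_Y =", fy, "matrix check:", focal_distance_from_matrices(-f1, f2, d))
print("equal focal distances f = 2:", doublet_focal_distances(2.0, 2.0, 0.5))
```

With unequal focal distances the doublet converges in one plane and may diverge in the other; with $f_1 = f_2$ both planes give the same positive focal distance.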
For a symmetrical doublet ($f_1 = f_2 = f$), these expressions reduce to

$$\frac{1}{F_X} = \frac{1}{F_Y} = \frac{d}{f^2},$$

with $d = D + L$. Although $Q_1$ and $Q_2$ have in each plane equal and opposite convergences, the combination remains convergent, because of the separation $d$ of the two lenses. The position of the principal planes would be calculated in the same way. The accuracy of the calculation is increased if the real value of the focal distances (in $\sin \beta L$ and $\sinh \beta L$) is carried into the equations while keeping the thin-lens hypothesis (principal planes coinciding), but the approximation remains valid only if $\beta L$ remains small [$(\beta L)^2 < 0.5$, for example]. Such calculations have been carried out by Gendreau (21), in the case $l_o = \lambda_o = \infty$ and $l_i = \infty$ (which corresponds to an incident cylindrical beam parallel to $Oz$, concentrated on the slit of a magnetic-prism analyzer), by Bromley and Bruner (22) for focusing a cyclotron beam at a point ($l_i = \lambda_i$ with $l_o \neq \lambda_o$), and by Levy (23) to establish a universal table allowing the problem of pseudo-stigmatic focusing ($l_o = \lambda_o$, $l_i = \lambda_i$) to be solved in an approximate fashion, with a particular separation $D$.

² See in this respect Sternheimer (19b).

6. System of Revolution (24, 25). A combination of quadrupole lenses may be equivalent to a system of revolution if it satisfies two conditions: (1) equal focal distances, and (2) identical position of the foci in the two planes $zOX$ and $zOY$. The principal planes will then also be identical. Equality of the focal distances imposes the symmetry of the combination; for a doublet, $\beta_1 = \beta_2$, $L_1 = L_2$. The second condition can be satisfied in practice with a symmetrical doublet; it is sufficient to write that the linear magnifications are equal, $G_X = G_Y$, which gives the result $H_{11} = H_{22}$, thus finally:
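$$\beta D = \frac{-2}{\cot \beta L + \coth \beta L}, \qquad \text{with } \beta D \geq 0.$$

This relation is satisfied for $\beta L = \alpha_n$ with

$$\alpha_0 + n\pi < \alpha_n < (n + 1)\pi \qquad \text{and} \qquad \alpha_0 \simeq \frac{3\pi}{4}.$$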
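We have then:

$$F = F'_X = F'_Y = -\frac{1}{\beta}\, \frac{\sinh \beta L \cos \beta L + \sin \beta L \cosh \beta L}{\sinh^2 \beta L - \sin^2 \beta L},$$

and

$$S'F'_X = S'F'_Y = -\frac{1}{\beta}\, \frac{\sin \beta L \cos \beta L + \sinh \beta L \cosh \beta L}{\sinh^2 \beta L - \sin^2 \beta L} < 0.$$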
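These relations can be verified from the thick-lens transfer matrices of the rectangular model. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, $\beta = 1$ and $\beta L = 2.8$ are arbitrary test values, and the focal distance and focus position are read from the matrix as $F = -1/M_{21}$ and $S'F' = -M_{11}/M_{21}$ (distance measured from the exit face).

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def conv(beta, length):
    """Convergent plane of a quadrupole lens (rectangular model)."""
    x = beta * length
    return [[math.cos(x), math.sin(x) / beta],
            [-beta * math.sin(x), math.cos(x)]]

def div(beta, length):
    """Divergent plane of a quadrupole lens (rectangular model)."""
    x = beta * length
    return [[math.cosh(x), math.sinh(x) / beta],
            [beta * math.sinh(x), math.cosh(x)]]

def drift(length):
    return [[1.0, length], [0.0, 1.0]]

def doublet(beta, length, sep):
    """Entrance-face to exit-face matrices (M_X, M_Y) of a symmetrical doublet."""
    mx = matmul(div(beta, length), matmul(drift(sep), conv(beta, length)))
    my = matmul(conv(beta, length), matmul(drift(sep), div(beta, length)))
    return mx, my

def revolution_separation(beta, length):
    """Separation D satisfying beta*D = -2 / (cot(beta L) + coth(beta L))."""
    x = beta * length
    return -2.0 / (beta * (1.0 / math.tan(x) + 1.0 / math.tanh(x)))

def focal_elements(m):
    """Return (F, S'F'): focal distance and image-focus abscissa from the exit face."""
    return -1.0 / m[1][0], -m[0][0] / m[1][0]

beta, length = 1.0, 2.8          # assumed test values, 3*pi/4 < beta*L < pi
sep = revolution_separation(beta, length)
mx, my = doublet(beta, length, sep)
print("D =", sep)
print("plane zOX: F, S'F' =", focal_elements(mx))
print("plane zOY: F, S'F' =", focal_elements(my))
print("F/L at beta*L = pi, D = 0:", focal_elements(doublet(1.0, math.pi, 0.0)[0])[0] / math.pi)
```

Both planes should give the same positive focal distance and the same negative focus abscissa, that is, foci immersed in the lens.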
For $\cot \beta L + \coth \beta L = 0$, that is $\beta L \simeq 3\pi/4$, we have $\beta D = \infty$, and for $\beta L = \pi$, $\beta D = 0$. In this interval, the convergence is extremely strong: $F/L$ reaches a maximum of 0.03 for $D = 0$. But throughout the interval the
object and image foci remain immersed in the lens. As these foci are defined from the asymptotic directions of the emergent rays, and since, contrary to the case of classical magnetic lenses, there do not exist real immersed foci which coincide in the two planes, the system cannot be used with real objects in order to obtain real images. Its use in corpuscular optics will be reduced to the role of a projection lens, preceded by a classical round objective lens which furnishes a real image in the object focal plane of the projector. Figure 12 shows the variations of the focal distance and of the position of the foci in the two planes $OX$ and $OY$ when $\beta L$ varies,
FIG. 12. Variation of $F$ and of the position of the foci, as a function of $\beta L$, in a symmetrical doublet which is strongly excited; the foci are superimposed for $\beta L = \pi$ when $D = 0$.
for a separation $D$ which is equal to 0; and Fig. 13 shows the shape of the trajectories and the position of the principal elements when the system is one of revolution.

7. Clarity of a Doublet. Let $A$ be a point source placed on $Oz$ in front of a doublet and emitting particles in all directions (Fig. 14). The brightness of the doublet can be defined from the number of particles transmitted to a target placed at $T$; it is expressed by the useful solid angle $\Delta\Omega$. In the plane $zOX$, the maximum useful aperture $\theta_X$ is limited by the ray which is tangent to the throat circle in the first convergent lens, while in the plane $zOY$ the outermost trajectory strikes the throat circle in the second lens;
FIG. 13. Doublet equivalent to a system of revolution: appearance of trajectories and cardinal elements.
FIG. 14. Clarity of a doublet for a point source situated on the axis.
if the two lenses have the same parameter $a$, we have $\theta_Y < \theta_X$. The calculation of the angles $\theta_X$ and $\theta_Y$ has been carried out by Sternheimer (19) for a symmetrical doublet in pseudo-stigmatic operation. For $\beta \simeq 2.22$ meter$^{-1}$, $L \simeq 3$ meters, $l_o = 2.25$ meters, and $D = 1.2$ meter, with particles of momentum $p = 10$ Bev/c, he obtains $\theta_{X\max} = 5.4°$ and $\theta_{Y\max} = 1.1°$, with $a \simeq 0.152$ meter; that is to say, 75% of the maximum brightness (which one would have with $D = 0$ and $L \simeq 5.22$ meters), a solid angle $\Delta\Omega = 1.8 \times$ … steradians
and a magnification $G_X = 5 \simeq 1/G_Y$. We see here the interest in using lenses with rectangular, pole-free apertures; the plane $zOX$ would be parallel to the large side $b$ of the first one and to the small side $c$ of the second, allowing a greater extension of the beam in $zOY$. The extreme trajectory then touches the side of the aperture of the $Q_1$ lens in $zOX$, and that of $Q_2$ in $zOY$. Sternheimer (26) gives the following results for two lenses of this type having the same gradient, the same length, and the same separation $D$ as the classical lenses described above: $\theta_X = 7°36'$ and $\theta_Y = 1°48'$, that is $\Delta\Omega = 3.29 \times$ … steradians, for $c = 0.113$ meter and $b = 0.207$ meter, instead of $a = 0.152$ meter. The improvement in brightness, which is proportional to $\Delta\Omega$, is about 1.7.
E. Optical Elements of a Triplet

We shall denote as a triplet a sequence of three alternating lenses. The plane $zOX$ will be the initially convergent plane, therefore C-D-C, and the plane $zOY$ will be the plane D-C-D. In a symmetrical triplet the two outer lenses are identical, have the same excitation, and are at the same distance from the central lens.

1. Importance of the Symmetrical Triplet. Quadrupole lenses allow us to effect the necessary “adaptations” between two active sections of an accelerator [a linear accelerator and a synchrotron, for example (27)], by modifying the dimensions of the beam and the slope of the trajectories in the planes $zOX$ and $zOY$. The simplest arrangement allowing all adaptations is formed of two strong-focusing convergent groups separated by a drift space, the variable parameters being the excitations of the lenses. The use of doublets has been shown to be possible; but the control in one plane, $zOX$ for example, by operating on $\beta_1$ and $\beta_2$, entails a considerable variation in the other plane. In fact the equivalent doublet lens, different in each plane, has not only a varying convergence but also a varying position on $Oz$, for the position of the principal planes varies rapidly with the excitation. For a symmetrical triplet, on the contrary, the principal planes are also symmetric with respect to $O$ in each plane. Their distance varies only with the convergence, and if this is weak, one may consider that in each plane the triplet is equivalent to a thin lens of variable convergence, fixed in position at $O$, which greatly simplifies the double adaptation which is sought.

2. Cardinal Elements. We denote by $L_1$ and $L_2$ the lengths of the lenses (Fig. 15), by $D$ their separation, by $\beta_1$ and $\beta_2$ their excitations, and by
$x_1$ and $x_2$ the factors $\beta_1 L_1$ and $\beta_2 L_2$. The emergent rays are defined by $X_6$, $X'_6$, $Y_6$, and $Y'_6$, with

$$A = \cos x, \qquad B = \frac{1}{\beta} \sin x, \qquad C = -\beta \sin x,$$

$$D = \cosh x, \qquad E = \frac{1}{\beta} \sinh x, \qquad F = \beta \sinh x.$$

FIG. 15. Symmetrical triplet.
From this we easily obtain the expressions for the image cardinal elements, and we define the object elements by symmetry with respect to $O$. But the formulas which are obtained are hardly manageable. A simple method, proposed by Blewett (28), allows the elements of any triplet to be determined rapidly by making two focal images $B_X$ and $B_Y$ correspond to two focal sources $A_X$ and $A_Y$; the method is easily applied to the symmetrical case and to pseudo-stigmatic operation. Let us consider a doublet $Q_1Q_2$ placed at a distance $p_X$ from $A_X$ and $p_Y$ from $A_Y$. We look for the relation between $\beta_1$, $L_1$ and $\beta_2$, $L_2$ for which the rays come out parallel to $Oz$ in both planes. If we set $C_1 = \cot x_1$, $C_2 = \cot x_2$, $K_1 = \coth x_1$, $K_2 = \coth x_2$, $D$ is calculated in $zOX$, then in $zOY$, from the exit conditions

$$X' = 0 \qquad \text{and} \qquad Y' = 0,$$
and we set the two values which are found equal to each other. We obtain the condition:
We then have:
If we place after $Q_1Q_2$ a second doublet $Q_3Q_4$, operated under the same conditions but reversed, the rays parallel to $Oz$ will converge in $B_X$ and $B_Y$. Different pairs of values $x_3$ and $x_4$ are possible, and the separation $D'$ between the two doublets may be chosen arbitrarily. In particular, $D' = 0$ may be achieved, and if moreover one makes $\beta_3 = \beta_2$, $Q_3$ and $Q_2$ may be joined together into a single lens of length $(L_2 + L_3)$, and we thus obtain a symmetrical triplet. In the case of pseudo-stigmatic operation, $p_X = p_Y = p$; the second doublet is then calculated to give a single image at $q_X = q_Y = q$. But this operation is valid only for two points $A$ and $B$; if the object point is moved, the image “point” separates into two different focals. For a symmetrical triplet with $L_1 = 15.4$ cm, $L_2 = 12.5$ cm, $D = 10$ cm, $x_2 = 0.502$, and $x_1 = 0.618$, which has as consequence that $\beta_1 = \beta_2$ and that $p = q = 1$ meter, we obtain the displacements in Table I for the two image
TABLE I

p (meters)    q_X     q_Y
0.70          1.41    1.53
0.80          1.25    1.30
0.90          1.11    1.13
1.00          1.00    1.00
1.10          0.91    0.92
1.20          0.83    0.85
1.30          0.77    0.80
focals $q_X$ and $q_Y$ when $p$ varies from 0.7 to 1.3 meters (28). The triplet as a whole has a length of 75.8 cm. The operation here is really stigmatic, for in the two planes the linear magnification is equal to unity.
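The behavior summarized in Table I can be explored with the thick-lens matrices of the rectangular model. The sketch below is only an illustration: the helper names are hypothetical, the central lens is taken to have length $2L_2$ (consistent with the total length of 75.8 cm), and distances are measured from the entrance and exit faces. With the rounded parameters quoted in the text it should give image distances within a few hundredths of a meter of the tabulated ones.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def conv(beta, length):
    x = beta * length
    return [[math.cos(x), math.sin(x) / beta],
            [-beta * math.sin(x), math.cos(x)]]

def div(beta, length):
    x = beta * length
    return [[math.cosh(x), math.sinh(x) / beta],
            [beta * math.sinh(x), math.cosh(x)]]

def drift(length):
    return [[1.0, length], [0.0, 1.0]]

X1, X2 = 0.618, 0.502            # beta*L: outer lens, half of the central lens
L1, L2, D = 0.154, 0.125, 0.10   # meters; central lens length is 2*L2 (assumed)
B1, B2 = X1 / L1, X2 / L2        # excitations in meter^-1

def triplet(outer, central):
    """Entrance-face to exit-face matrix: outer, drift, central, drift, outer."""
    m = outer(B1, L1)
    for element in (drift(D), central(B2, 2.0 * L2), drift(D), outer(B1, L1)):
        m = matmul(element, m)
    return m

def image_distance(m, p):
    """Image distance q behind the exit face for an object p before the entrance face."""
    return -(m[0][0] * p + m[0][1]) / (m[1][0] * p + m[1][1])

MX = triplet(conv, div)   # plane zOX: C-D-C
MY = triplet(div, conv)   # plane zOY: D-C-D

for p in (0.70, 0.80, 0.90, 1.00, 1.10, 1.20, 1.30):
    print(f"p = {p:.2f}  q_X = {image_distance(MX, p):.2f}  q_Y = {image_distance(MY, p):.2f}")
```

The image condition used here is that the upper right element of drift($q$) × triplet × drift($p$) vanishes, so that all rays leaving the object point meet again at $q$.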
We see that the astigmatism remains relatively weak for a variation $0.8 < p < 1.3$ meters. One may in the same way, with this symmetrical triplet, preserve the stigmatic operation while making the gradient vary ($\beta_1 \neq \beta_2$). One obtains, for example, the values given in Table II.
TABLE II

p = q (meters)    Gradient (gauss/cm)
                  Q_1     Q_2
0.6               268     243
0.8               227     217
1                 200     200
1.2               180     186
1.4               165     174
1.6               154     166
The tables given by Blewett allow any triplet to be determined by trial. In a more recent and very interesting publication by the same author (29) one may find the cardinal elements of different triplets, as well as those of several doublets, calculated by using the formulas furnished by the rectangular model.

3. Thin-lens Approximation. When the triplet has to function as a weak lens, it may be considered as being formed of three weak lenses. In the symmetrical case we will have $f_1$, $f_2$, $f_1$ in the direction $OX$ and $\bar{f}_1$, $\bar{f}_2$, $\bar{f}_1$ in the direction $OY$, with a separation $d$ between the lenses. The focal distance $F_X$ of the whole lens is given by the relation
and we have a similar formula for $F_Y$. The separation of the principal planes does not exceed $d$ when $F_X$ and $F_Y$ vary between wide limits. If $F_X$ and $F_Y$ are of the order of 10 to 20 $d$, we see that we may consider the combination as a thin lens localized at $O$. Such triplets are used to adapt the beam at the entrance to the synchrotrons in Geneva (30), Brookhaven (31), and Hamburg (YO).

F. Combination of Two Distinct Doublets

1. Symmetrical Case. This corresponds to the case of the symmetrical triplet in which the central lens is divided into two parts separated by a drift space of length $A$, where the trajectories are parallel to $Oz$. This
mounting has been studied and built by Schneider (32). The variation of $D$, linked to that of $x_1 = \beta_1 L_1$ and $x_2 = \beta_2 L_2$, allows the stigmatic operation and the unit magnification to be maintained. The optical ensemble formed in this manner is equivalent to a system of revolution, but only for a particular pair of conjugate planes.

2. Investigation of the System of Revolution. If $D$ denotes the separation of the lenses of each doublet, and $A$ the drift space between the two doublets, we may hope by varying all these parameters to obtain a combination which is equivalent to a lens of revolution and which is such that the foci are outside the outer lenses, thus opening up the possibility of constructing a strong-focusing objective for an electron or ion microscope. The general formulas are extremely complicated, and a calculation has been carried out by the author on an IBM 650 electronic computer, for the simplest case $D = 0$, to determine the function $x_1 = f(x_2)$ yielding a system of revolution, and the values of $A$, $F$, and $z_F$ which result from it (33). In the zones where the system is physically realizable ($A > 0$) the result is negative: the foci remain immersed in the outer lenses. There remains the hope that changing the parameter $D$ will allow us to reach the intended objective.
G. Sequence of a Very Large Number of Identical Lenses

1. Simple Alternating-Gradient Focusing. A sequence of lenses may be used to channel beams of particles of very high energy over great distances, between the exit of an accelerator and the laboratory where the beam is to be used, for example. One might think of placing at the middle of the path a single doublet or a convergent triplet operating in a stigmatic fashion, but the incident beam is always divergent, and in order to avoid a significant loss of particles it would be necessary in this case to give the lenses and the vacuum chamber unreasonably large apertures. The beam encounters lenses which are alternately convergent and divergent; its spatial oscillations may grow and become amplified as it progresses along $Oz$. We must find the limiting condition of stability of the combination, that is to say the value of the factor $x = \beta L$ for which the conditions of entering the $(n + 1)$th period are identical to those of the $n$th. In the direction $OX$ a period in the simplest case is composed, on a length $p$, of: ½ convergent lens + 1 divergent lens + ½ convergent lens.
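The transfer matrix of this symmetrical combination,

$$\| T_X \| = \begin{Vmatrix} \cos\frac{x}{2} & \frac{1}{\beta}\sin\frac{x}{2} \\ -\beta \sin\frac{x}{2} & \cos\frac{x}{2} \end{Vmatrix} \begin{Vmatrix} \cosh x & \frac{1}{\beta}\sinh x \\ \beta \sinh x & \cosh x \end{Vmatrix} \begin{Vmatrix} \cos\frac{x}{2} & \frac{1}{\beta}\sin\frac{x}{2} \\ -\beta \sin\frac{x}{2} & \cos\frac{x}{2} \end{Vmatrix},$$

with $T_{11} = T_{22}$, may be written in the form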
It was first shown by Courant et al. (3) that the combination will be convergent if the absolute value of the trace of $T_X$ does not exceed 2,

$$|2T_{11}| \leq 2,$$

which means physically that

$$-1 \leq \cos \mu \leq 1,$$

and this appears here as

$$-1 \leq \cos \beta L \cosh \beta L \leq 1,$$

that is

$$0 < \beta L \leq 1.873.$$
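The limiting value of $\beta L$ can be recovered numerically. The sketch below is an illustration with hypothetical function names: it builds the period matrix (half convergent lens, divergent lens, half convergent lens) with $\beta = 1$, checks that $T_{11} = T_{22} = \cos x \cosh x$, and solves $\cos x \cosh x = -1$ by bisection. The root comes out near 1.875, close to the value 1.873 quoted above.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def conv(x):
    """Convergent lens of phase x = beta*L, with beta = 1."""
    return [[math.cos(x), math.sin(x)], [-math.sin(x), math.cos(x)]]

def div(x):
    """Divergent lens of phase x = beta*L, with beta = 1."""
    return [[math.cosh(x), math.sinh(x)], [math.sinh(x), math.cosh(x)]]

def period_matrix(x):
    """Half convergent lens, full divergent lens, half convergent lens."""
    return matmul(conv(x / 2.0), matmul(div(x), conv(x / 2.0)))

def stability_limit(lo=1.0, hi=2.5, tol=1e-10):
    """Bisection for the root of cos(x)*cosh(x) + 1 = 0 between lo and hi."""
    def f(x):
        return math.cos(x) * math.cosh(x) + 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

x = 1.2   # arbitrary test value inside the stable range
t = period_matrix(x)
print("T11 =", t[0][0], " cos(x)cosh(x) =", math.cos(x) * math.cosh(x))
print("limit of beta*L:", stability_limit())
```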
In the direction $OY$, we will have the same condition. The expression for the stability condition is more complicated when there exist drift spaces $D$ between the lenses, but it still results from the condition

$$|\mathrm{Trace}\,(T)| < 2.$$
If the lenses are weak enough, they may be replaced by thin lenses of strength $\pm\beta^2 L = \pm\alpha$, separated by a distance $\Lambda = D + L$. Setting $\eta = \alpha\Lambda$, and choosing the origin arbitrarily right after a convergent lens in the plane $zOX$, we find (6):
Setting
we should have $0 < \eta < 2$.
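A short numerical illustration of this stability range follows. It is a sketch under an explicit assumption: $\eta$ is taken as the product of the thin-lens strength and the lens spacing, and one period is modeled as drift, divergent lens, drift, convergent lens, starting just after a convergent lens. The function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def period_matrix(strength, spacing):
    """One period in zOX: drift, divergent thin lens, drift, convergent thin lens."""
    drift = [[1.0, spacing], [0.0, 1.0]]
    divergent = [[1.0, 0.0], [strength, 1.0]]
    convergent = [[1.0, 0.0], [-strength, 1.0]]
    m = drift
    for element in (divergent, drift, convergent):
        m = matmul(element, m)
    return m

def cos_mu(eta):
    """Half the trace of the period matrix, with eta = strength * spacing."""
    m = period_matrix(eta, 1.0)
    return 0.5 * (m[0][0] + m[1][1])

def is_stable(eta):
    return abs(cos_mu(eta)) < 1.0

def periods_per_oscillation(eta):
    """Number of structure periods in one full oscillation of the trajectory."""
    c = max(-1.0, min(1.0, cos_mu(eta)))   # guard against rounding outside [-1, 1]
    return 2.0 * math.pi / math.acos(c)

for eta in (0.5, 1.24, math.sqrt(2.0), 2.0, 2.5):
    print(f"eta = {eta:.3f}  cos(mu) = {cos_mu(eta):+.4f}  stable: {is_stable(eta)}")
print("periods per oscillation at eta = sqrt(2):", periods_per_oscillation(math.sqrt(2.0)))
```

In this model the half-trace is $1 - \eta^2/2$, which stays between $-1$ and $+1$ exactly for $0 < \eta < 2$.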
For $\eta = 0$ the focusing is zero, and for $\eta \geq 2$ there is over-focusing. If we arrange on the axis a vacuum enclosure of radius $a$, a particle will pass through the entire ensemble if $|X| < a$ and $|Y| < a$. After having traversed $n$ periods, the coordinates $X_0$ and $X'_0$ are transformed into $X_n$ and $X'_n$ by the matrix

$$\| A \| = \| T \|^n.$$

We will have $X_n = 0$ if
which determines $\sin n\mu$. The trajectory will have a certain period $\Gamma$ in the system, equal to twice the distance separating two successive points where $X_n = 0$. This trajectory is in reality a broken line, but we obtain a good approximation by supposing that $n$ is a continuous variable; the real trajectory will be enclosed in the interior of the “smooth trajectory,”³ since the maximum amplitude of the broken line is always obtained in the direction $OX$ which is initially convergent. We have then for one period
The smallest period of the trajectory is obtained for $\eta = 2$, that is, $\Delta n = 2$, and is equal to $\Gamma = 4\Lambda$, that is to say twice the period of the structure. For $\eta = \sqrt{2}$ we find that $\Delta n = 4$, that is $\Gamma = 8\Lambda$.

³ In the publications devoted to the theory of the strong-focusing synchrotron, one may find an explanation of fairly simple mathematical techniques which allow the equations of the smooth trajectory to be obtained directly.

The maximum separation $X_{\max}$ is calculated by writing

$$\frac{dX_n}{dn} = 0.$$

This expression leads, by writing $X_{\max} = a$, to the limiting values of $X_0$ and $X'_0$ “accepted” by the system (that is to say, finally, to the brightness of the combination) in an origin plane situated at the exit of a convergent lens. If $X_0 = 0$ (point source on the axis) we will find, for example:
which is at a maximum for $\eta = 1.24$:

$$X'_{0\max} = 0.305\, a/\Lambda.$$
4 5 meters, and for a = 0.1 meter, we find A = 1.73 meter ,B2L= __ 2 1”. and X’Omax If X’O= 0 (beam parallel to the axis), we will have under the same conditions For a
=
-
$$X_{0\max} = 0.44\,a.$$

In the same way the admittance in the plane $zOY$ will be determined. In the origin plane:

$$Y'_{0\max} = 0.62\, a/\Lambda \qquad \text{and} \qquad Y_{0\max} = 0.44\,a.$$

In a coordinate plane $X_0/a$, $X'_0\Lambda/a$, the curve for the equation $X_{\max} = a$ is an ellipse whose area is a maximum for $\eta = 1.24$ and gives the maximum admittance for the system:
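The acceptance figures quoted above can be cross-checked with a different technique from the smooth-trajectory approximation of the text: the Courant-Snyder amplitude function of the thin-lens period. The sketch below rests on assumptions that are not in the original: $\eta$ is the product of lens strength and spacing, the beam envelope is largest at the convergent lens, and the function names are hypothetical. Under those assumptions it should return values close to the quoted coefficients (about 0.30, 0.44, 0.62, and 0.44).

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def period(strength, spacing):
    """Drift, thin lens (+strength), drift, thin lens (-strength).
    A positive strength is divergent, so the origin lies just after a convergent lens."""
    drift = [[1.0, spacing], [0.0, 1.0]]
    m = drift
    for element in ([[1.0, 0.0], [strength, 1.0]], drift, [[1.0, 0.0], [-strength, 1.0]]):
        m = matmul(element, m)
    return m

def twiss(m):
    """Amplitude function and its slope parameter (beta_cs, alpha_cs) at the origin."""
    cos_mu = 0.5 * (m[0][0] + m[1][1])
    sin_mu = math.copysign(math.sqrt(1.0 - cos_mu ** 2), m[0][1])
    return m[0][1] / sin_mu, (m[0][0] - m[1][1]) / (2.0 * sin_mu)

def admittance(eta, spacing=1.0):
    """Return (X'0max*spacing/a, X0max/a, Y'0max*spacing/a, Y0max/a)."""
    bx, ax = twiss(period(eta / spacing, spacing))    # zOX: origin after a convergent lens
    by, ay = twiss(period(-eta / spacing, spacing))   # zOY: same plane follows a divergent lens
    b_max = bx                                        # largest amplitude function in the period
    return (spacing / b_max,
            1.0 / math.sqrt(1.0 + ax ** 2),
            spacing / math.sqrt(b_max * by),
            math.sqrt(by / (b_max * (1.0 + ay ** 2))))

print(admittance(1.24))
```

Scanning `admittance(eta)[0]` over $0 < \eta < 2$ shows the maximum of the angular acceptance near $\eta = 1.24$.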
The complete study from which these calculations are taken was carried out by Citron and Øverås (34) for an arrangement channeling $\mu$ mesons.

2. Alternating-Gradient Focusing in Linear Ion Accelerators. In a linear ion accelerator of the classical type, the particles circulate in the neighborhood of the axis of a resonant cavity, and traverse successively drift tubes and accelerating gaps. Between two successive gaps, the time of flight is equal to a period of the hf oscillation (in general of frequency $2 \times 10^8$ c/s).
The accelerating hf field $E_z$, parallel to $Oz$, is accompanied by a radial defocusing component $E_r$ at the entrance to the tube which follows the gap. Each gap therefore behaves like an accelerating space on which a divergent lens is superposed. The simplest method for compensating this divergence is to suppress $E_r$ by equipping the entrance to the tubes with a metal grid, but the loss of particles is then considerable. This inconvenience is eliminated by compensating the natural spreading with a sequence of strong lenses. Several authors (5, 6, 35, 36-39) have studied the problems which are faced in this type of focusing. We shall say a few words on the two simple methods which may be used for these studies.

a. The progressive-wave method (4, 6). It is possible to describe the longitudinal motion of a particle by considering it as being attached to the fundamental space mode of the field $E_z$ [obtained by development in a series from the function $E_z = f(z)$ at a given instant]. Given the equation for $E_z$ for a given accelerating structure, one may derive from it $E_r$ and the equation for the radial motion under the action of the defocusing force; everything happens in this case as if this force were distributed all along the path. Since $E_r$ is proportional to the distance $r$ from the axis, we calculate the gradient $g(z) = \ldots$, representing by … the law of acceleration for the particles, by $\varphi_s$ the phase of the stable particle, and by $\omega$ the pulsation $2\pi f$ corresponding to the frequency $f$ of the hf oscillation; $e$ and $M$ are respectively the charge and mass of the particle. We suppose at first that we are considering a structure whose period does not vary; in practice this length varies since there is acceleration, but it varies slowly, and it is adequate to repeat the calculation for different values of the period, and therefore for different velocities, in order to determine in practice the characteristics of the lenses at different points. We suppose that the period is composed of: a convergent lens of length $L$, a divergent lens of length $L$, and a gap of length $e$. The period $p$ is known from the velocity of the ion, and the ratio $e/2L$ is arbitrarily chosen ($e/2L = 3$, for example).
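The equations of motion are then of the type:

$$\frac{d^2X}{dz^2} + (\beta^2 - \gamma^2) X = 0 \qquad \text{for } 0 < z < L,$$

$$\frac{d^2X}{dz^2} - (\beta^2 + \gamma^2) X = 0 \qquad \text{for } L < z < 2L,$$

$$\frac{d^2X}{dz^2} - \gamma^2 X = 0 \qquad \text{for } 2L < z < 2L + e,$$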
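with $\gamma(z) = \ldots$ in nonrelativistic mechanics; $\beta$ is the characteristic factor of the lenses, which are supposed to be identical. From $z = 0$ to $z = p$, we take the mean value of the accelerating potential in this interval. We then obtain (6), setting $\beta(z) = $ constant in the lenses, and $\beta^2 + \gamma^2 = \alpha^2$, $\beta^2 - \gamma^2 = \delta^2$:

$$\| T \| = \begin{Vmatrix} \cosh \gamma e & \frac{1}{\gamma}\sinh \gamma e \\ \gamma \sinh \gamma e & \cosh \gamma e \end{Vmatrix} \begin{Vmatrix} \cosh \alpha L & \frac{1}{\alpha}\sinh \alpha L \\ \alpha \sinh \alpha L & \cosh \alpha L \end{Vmatrix} \begin{Vmatrix} \cos \delta L & \frac{1}{\delta}\sin \delta L \\ -\delta \sin \delta L & \cos \delta L \end{Vmatrix}.$$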
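The trace of this product is easy to evaluate numerically for trial values of the parameters. The sketch below is a hedged illustration: the function names are hypothetical, the order of the factors (convergent lens traversed first, then divergent lens, then gap) follows the description of the period given above, and the numerical values are arbitrary. As a consistency check, when the gap defocusing $\gamma$ and the gap length $e$ tend to zero the half-trace should tend to $\cos \beta L \cosh \beta L$.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def focusing(k, length):
    """Transfer matrix for X'' + k^2 X = 0 over the given length."""
    return [[math.cos(k * length), math.sin(k * length) / k],
            [-k * math.sin(k * length), math.cos(k * length)]]

def defocusing(k, length):
    """Transfer matrix for X'' - k^2 X = 0 over the given length."""
    return [[math.cosh(k * length), math.sinh(k * length) / k],
            [k * math.sinh(k * length), math.cosh(k * length)]]

def period_half_trace(beta, gamma, L, e):
    """Half-trace of gap * divergent lens * convergent lens (needs 0 < gamma < beta)."""
    delta = math.sqrt(beta ** 2 - gamma ** 2)
    alpha = math.sqrt(beta ** 2 + gamma ** 2)
    t = matmul(defocusing(gamma, e),
               matmul(defocusing(alpha, L), focusing(delta, L)))
    return 0.5 * (t[0][0] + t[1][1])

def is_stable(beta, gamma, L, e):
    return abs(period_half_trace(beta, gamma, L, e)) <= 1.0

# Arbitrary trial values (assumed): beta and gamma in meter^-1, L and e in meters.
for beta in (8.0, 10.0, 14.0, 20.0):
    print(beta, period_half_trace(beta, 3.0, 0.1, 0.05), is_stable(beta, 3.0, 0.1, 0.05))
```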
Supposing that the values $\gamma e$, $\delta L$, and $\alpha L$ remain small, we may expand the terms which make up the trace of $\| T \|$, and we obtain the condition of stability in the form

where $A$, $B$, and $C$ are simple polynomials in $L$ and $e$. We may then choose $\beta$, that is to say the voltage in the electrostatic case or the current $I$ in the magnetic case, with the condition that trace $\| T \| = 2$ furnishes the weakest admissible excitations. The calculation of the amplitude $u$ of the transverse oscillations of the beam which are caused by the focusing system shows that the “undulation” $\varepsilon$, defined by the ratio of the amplitude $u$ to the mean distance $r_m$ of the trajectory from the axis $Oz$, varies as $(\beta L)^2$. There is therefore some interest in choosing the shortest possible lenses, for an undulation figure fixed in advance. For a proton accelerator of 1 to 7 MeV, of length 3 meters, with $e/p = 0.25$, $f = 200$ Mc/s, $\nu = 8$ (that is to say an accelerator of the Alvarez type, with constant field), and electrostatic lenses such that $a = 1.5$ cm, one would find for example

at the entrance: $\phi_1 = 29$ kv and $\varepsilon = 36\%$;
at the exit: $\phi_1 = 20$ kv and $\varepsilon = 26\%$.
This method of calculation would be applicable to all the progressive-wave accelerators.

b. Localized perturbation method (36). One may suppose that the defocusing action of a gap is equivalent to that of a thin divergent lens localized in the middle of this gap, and that each tube contains one or more lenses. In the simple case where a tube contains a single lens (and it is difficult to place two of them in the first tubes, whose length is less than 10 cm around 4 MeV) we have the following periodicity (Fig. 16),

FIG. 16. Periodicity in a linear ion accelerator, with each gap equivalent to a thin divergent lens, and a single lens per tube.
having as limits the principal image plane $P'_n$ of $Q_n$ and the principal object plane $P_{n+1}$ of $Q_{n+1}$. This length is greater than $p$, which separates the centers of the two convergent lenses, unless $\beta L \ll 1$; $s$ denotes the path length of the gap. If $\gamma$ is the convergence of the defocusing lenses, $M$ the abscissa of $P'_n$ with reference to the exit face of $Q_n$, or of $P_{n+1}$ with respect to the entrance face of $Q_{n+1}$, $P$ the convergence of $Q_n$ and of $Q_{n+1}$, and $Q$ that of $Q'_n$, we have, with:

$$\Lambda = M + s, \qquad \Lambda' = N + s.$$

We set $\cos \alpha = \frac{1}{2}(T_{11} + T_{22})$, and the condition for stability is written:

$$\cos \alpha = \frac{1}{2}\left[-\beta \sin \beta L\, (\Lambda + \Lambda' + \gamma\Lambda\Lambda') + 2(1 + \gamma\Lambda)\right] \times \left[\beta \sinh \beta L\, (\Lambda + \Lambda' + \gamma\Lambda\Lambda') + 2(1 + \gamma\Lambda')\right] - 1,$$

where $-1 \leq \cos \alpha \leq 1$. Fig. 17 gives the values of $\beta$ as a function of $\gamma$ for two values of $L/p$, for $\cos \alpha = 1$ and 0.96. The value of $\gamma$ is expressed in terms of the defocusing force
where $E_0 = E_z/\sin \omega t$ and $\psi$ represents the phase of the stable particle. In an accelerator $p$ is the length traversed in one period, and therefore $p \propto v$ (velocity of the particle); $\gamma$ will therefore decrease when the particle velocity increases. The convergence of the strong lenses should therefore be at a maximum at the beginning of an accelerator, which is the most unfavorable zone since the tubes are shortest there.
FIG. 17. Value of $\beta$ as a function of the divergence $\gamma$ of the gaps [after L. C. Teng, Rev. Sci. Instr. 26, 264 (1954)] for different lengths $L/p$; curves are shown for $\cos \alpha = 1$ and $\cos \alpha = 0.96$.
Consideration of the amplitude of the oscillations shows that there is an advantage in using the shortest possible lenses, but it is then necessary to increase the gradient. In the first part of accelerators (0.5-10 MeV, for example) the tubes are very short (several centimeters); the radius $a$ is of the order of 1 cm. For $\beta L = 1.6$ and $L = 5 \times 10^{-2}$ meter, with $a = 10^{-2}$ meter and protons of 0.5 MeV, it would be necessary to apply $\phi_1 = \pm 50$ kv, which is impossible because of the too-great risk of breakdown, or else to create a magnetic field gradient of
a value which is not realizable in practice with a continuous electrical supply because of the small amount of space available inside the tubes for windings and substantial magnetic yokes. As a result, from 0.5 to 10 MeV, one may use, as at CERN in Geneva, focusing by means of pulsed lenses (40). Above 10 MeV, the necessary gradients (of the order of 2000 gauss/cm at 10 MeV, and 5-800 gauss/cm at 50 MeV) allow a continuous supply, which is simpler. For a high-current accelerator (40-100 ma peak at CERN) it is necessary to take account of the defocusing from space charge, at least at the beginning; and the correction of this effect then leads to increasing the
excitations of the lenses. We may calculate an order of magnitude of $\beta L$ by adding to the system calculated earlier some localized divergent lenses, either in the gaps or in the tubes, having an effect equivalent to the space charge. A calculation of this nature has been carried out by Wilcox (41) for the determination of the optical elements of a Cockcroft-Walton accelerator for strong currents; but the gap between two lenses acts here as a static acceleration space, and therefore like a convergent lens, contrary to the case of the hf accelerator.

3. The Electrostatic Energy Filter (42). Another method for treating the problem of $N$ identical lenses following each other consists of writing the equations of motion in a distribution of potential of the form

$$\phi = \phi_1\, \frac{x^2 - y^2}{a^2}\, k(z),$$

where $k(z)$ is a periodic function of $z$, of period $\Lambda = 2L$. For lenses placed close together, we have, for all practical purposes,

$$k(z) = \cos \frac{\pi z}{L}.$$
We then obtain:

with $v = dz/dt$, $\tau = \pi v t/2L = \pi z/2L$, and $v = \sqrt{2e\phi_0/M}$, if we adhere to the nonrelativistic approximation. We obtain a Mathieu equation of the type

in which $a_x = a_y = 0$ and

The trajectory will not be stable except for certain values of $q$:

$$0 < q < 0.92, \qquad 7.5 < q < 7.52, \qquad \text{etc.}$$

In the first possible domain, for a value $\phi_1$ of the excitation, the system will only transmit energies $\phi_0$ greater than a certain minimum value
The arrangement then constitutes a high-pass filter for the energies. The higher domains cannot be used, because they necessitate too high a value of $\beta L$. The arrangement may be transformed into a band-pass filter; all we need do is add to the distribution $\phi(x, y, z)$ a continuous voltage $(+\phi_G)$ in one of the lenses of the period and $(-\phi_G)$ in the other one. We then have:

Here,

The regions of stability are then different in the two planes; and for a trajectory to be stable, it is necessary that the domains overlap, for a given value of $a_x$, and therefore of $\phi_G$. If $a_x = 0.2$, one would have, for example, stability for $0.6 < q < 0.75$. The width of the possible band decreases more and more as $a_x$ increases, and falls to 0 for $q_1 = 0.706$ and $a_x \simeq q/3$. Other domains exist for greater values of $a$ and $q$. If we replace the function

$$k(z) = \cos \frac{\pi z}{L}$$

by a rectangular function $\tau(z)$ of height $\pm k(0)$, we will arrive at an analogous result, but the limits of the regions of stability are distinctly different. We obtain for example $q_{\max} = 0.57$ in place of 0.75 when $a \neq 0$; this difference is normal, for the two functions $k(z)$ and $\tau(z)$ are not equivalent, the convergence being much stronger in the second case:
If we consider a system of lenses whose central gradient is $k(0)$, the real solution will be contained between the two above results, since the real function $K(z)$ is intermediate between $k(z)$ and $\tau(z)$. We may show that the matrix method, using the rectangular approximation, would lead to

$$0 < \beta L < 1.873, \qquad \text{that is} \qquad 0 < q < 2.24.$$
If we represent $k(z)$ by a sinusoidal function of height 1, this amounts to diminishing the height of the rectangles in a ratio $2/\pi$, that is to say reducing $q$ in a ratio $4/\pi^2$; one will then have

$$0 < q < 0.91,$$

that is to say, a result very close to that which the Mathieu equation furnishes with $a = 0$.
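The edge of the first stable domain of the Mathieu equation can be located numerically. The sketch below is an illustration only: it assumes the standard form $d^2x/d\tau^2 + (a - 2q\cos 2\tau)\,x = 0$, integrates two independent solutions over one period of the coefficient ($\tau$ from 0 to $\pi$) with a classical Runge-Kutta scheme, and uses the criterion that the motion is stable while the absolute value of the trace of the one-period matrix stays below 2. The function names and step counts are assumptions.

```python
import math

def monodromy_trace(a, q, steps=1000):
    """Trace of the one-period transfer matrix of x'' + (a - 2 q cos 2 tau) x = 0."""
    h = math.pi / steps

    def deriv(tau, x, v):
        return v, -(a - 2.0 * q * math.cos(2.0 * tau)) * x

    trace = 0.0
    # Solution 1 starts from (x, v) = (1, 0); solution 2 from (0, 1).
    for x0, v0, pick in ((1.0, 0.0, 0), (0.0, 1.0, 1)):
        x, v, tau = x0, v0, 0.0
        for _ in range(steps):
            k1x, k1v = deriv(tau, x, v)
            k2x, k2v = deriv(tau + h / 2, x + h / 2 * k1x, v + h / 2 * k1v)
            k3x, k3v = deriv(tau + h / 2, x + h / 2 * k2x, v + h / 2 * k2v)
            k4x, k4v = deriv(tau + h, x + h * k3x, v + h * k3v)
            x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
            v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
            tau += h
        trace += (x, v)[pick]
    return trace

def first_stability_limit(a=0.0, lo=0.5, hi=1.2, tol=1e-5):
    """Bisection for the value of q at which the trace falls to -2."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if monodromy_trace(a, mid) > -2.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print("upper limit of the first stable domain for a = 0:", first_stability_limit())
```

The bracketing interval 0.5 to 1.2 is an assumption suited to $a = 0$; other values of $a$ would need their own bracket.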
H. Helical Quadrupole Lenses

1. Description. Let us consider a very long quadrupole lens of axis $Oz$, on which the gradient is $K(0)$, and let us suppose that it is possible to give the pole pieces a twist such that the ends of the poles lie on a helix of axis $Oz$ and of step $\Lambda$, such that $\Lambda \gg a$. The distribution of field in $B_r$ and $B_\theta$ persists, and a component $B_z$ appears. Nonetheless this is very small and may be neglected. The fixed lines $OX$ and $OY$ no longer exist; their radial direction turns with the helix. One would obtain an equivalent system with $n$ very short lenses whose axes $OX$ and $OY$ would turn through an angle of $2\pi/n$ in passing from the $n$th to the $(n + 1)$st lens. We would have $\Lambda = nL$.

2. Equations of Motion (43). Supposing that $B_z = 0$, we have simply

$$\frac{d^2 r}{dt^2} - r\left(\frac{d\theta}{dt}\right)^2 = -\frac{e}{m}\, v B_\theta, \qquad r\frac{d^2\theta}{dt^2} + 2\frac{dr}{dt}\frac{d\theta}{dt} = \frac{e}{m}\, v B_r, \qquad z = vt.$$

By introducing

we obtain

$$\ddot{r} - r\dot{\theta}^2 = A r \cos 2(\omega t + \theta),$$

$$r\ddot{\theta} + 2\dot{r}\dot{\theta} = -A r \sin 2(\omega t + \theta),$$

that is, in a rotating system

$$x = r \cos(\omega t + \theta), \qquad y = r \sin(\omega t + \theta),$$

$$\ddot{x} + 2\omega\dot{y} - (\omega^2 + A)\,x = 0,$$

$$\ddot{y} - 2\omega\dot{x} - (\omega^2 - A)\,y = 0.$$
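The solution is written

$$x = \omega_1 C_1 \sin \omega_1 t + \omega_1 C_2 \cos \omega_1 t + \omega C_3 \sin \omega_2 t + \omega C_4 \cos \omega_2 t,$$

$$y = -\omega C_1 \cos \omega_1 t + \omega C_2 \sin \omega_1 t - \omega_2 C_3 \cos \omega_2 t + \omega_2 C_4 \sin \omega_2 t,$$

with $\omega_1 = (\omega^2 - A)^{1/2}$ and $\omega_2 = (\omega^2 + A)^{1/2}$. The system will be convergent if

$$\left(\frac{2\pi}{\Lambda}\right)^2 v \geq \frac{e}{m} K(0).$$

Here again we obtain an energy filter of the high-pass type. The study of the equations in the fixed system

$$X = x \cos \omega t + y \sin \omega t, \qquad Y = -x \sin \omega t + y \cos \omega t$$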
shows that it is possible to obtain the image of a point source situated on the axis and immersed in the system, at a distance $p$, with unit magnification; it is therefore possible to channel particles over a considerable distance, the trajectories reproducing themselves with a periodicity $p$ along $Oz$. If $T$ denotes the time of flight $T = p/v$, $\omega_1$ and $\omega_2$ must satisfy the condition

$$\cos \omega_1 T = \cos \omega_2 T = \pm 1,$$

that is, for example,

$$\omega_1 = 0 \qquad \text{and} \qquad \omega_2 = \frac{2\pi}{T} = \frac{2\pi v}{p},$$

with

$$A = \frac{e}{m}\, v K(0) \qquad \text{and} \qquad \omega = \frac{2\pi v}{\Lambda}.$$
The characteristics of the lens and its optical properties may be determined, the constants $C_1$, $C_2$, $C_3$, $C_4$ being obtained from the initial conditions at $t = 0$, at the initial object point. The theoretical application of this system to a channel for $\mu$ mesons has been studied by Morpurgo (44); the brightness of the system would be, at equal length, clearly superior to that of a sequence of conventional crossed lenses.
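As a check on the rotating-frame equations of this section, one can substitute a four-constant trial solution with frequencies $\omega_1 = (\omega^2 - A)^{1/2}$ and $\omega_2 = (\omega^2 + A)^{1/2}$ and evaluate the residuals. The sketch below does this with analytic derivatives. The function names, the numerical values of $\omega$ and $A$, and the constants $C_1 \ldots C_4$ are arbitrary assumptions; the arrangement of sines and cosines is the one that makes both residuals vanish.

```python
import math

def rotating_frame_state(t, omega, A, C):
    """x, y and their first two time derivatives (valid for omega**2 > A)."""
    w1 = math.sqrt(omega ** 2 - A)
    w2 = math.sqrt(omega ** 2 + A)
    c1, c2, c3, c4 = C
    s1, k1 = math.sin(w1 * t), math.cos(w1 * t)
    s2, k2 = math.sin(w2 * t), math.cos(w2 * t)
    x = w1 * (c1 * s1 + c2 * k1) + omega * (c3 * s2 + c4 * k2)
    y = omega * (-c1 * k1 + c2 * s1) + w2 * (-c3 * k2 + c4 * s2)
    xd = w1 ** 2 * (c1 * k1 - c2 * s1) + omega * w2 * (c3 * k2 - c4 * s2)
    yd = omega * w1 * (c1 * s1 + c2 * k1) + w2 ** 2 * (c3 * s2 + c4 * k2)
    xdd = -w1 ** 3 * (c1 * s1 + c2 * k1) - omega * w2 ** 2 * (c3 * s2 + c4 * k2)
    ydd = -omega * w1 ** 2 * (-c1 * k1 + c2 * s1) - w2 ** 3 * (-c3 * k2 + c4 * s2)
    return x, y, xd, yd, xdd, ydd

def residuals(t, omega, A, C):
    """Left-hand sides of the two rotating-frame equations of motion."""
    x, y, xd, yd, xdd, ydd = rotating_frame_state(t, omega, A, C)
    r1 = xdd + 2.0 * omega * yd - (omega ** 2 + A) * x
    r2 = ydd - 2.0 * omega * xd - (omega ** 2 - A) * y
    return r1, r2

def is_convergent(omega, A):
    """Bounded motion requires a real omega_1, that is omega**2 >= A."""
    return omega ** 2 >= A

omega, A = 3.0, 4.0                 # assumed values with omega**2 > A
C = (0.3, -0.2, 0.5, 0.1)           # arbitrary integration constants
for t in (0.0, 0.7, 2.3):
    print(t, residuals(t, omega, A, C))
print("convergent:", is_convergent(omega, A), is_convergent(1.0, 4.0))
```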
I. Electric Quadrupole Lenses Excited at High Frequency

This application was noted for the first time by Paul and his collaborators (45, 46).

1. Equations of Motion. If a sufficiently slow particle moves in a quadrupole lens excited by a voltage of frequency $f$, such that the distance covered in one period is small compared to the length of the lens, this particle will cross regions which are alternately convergent and divergent, but with a time-varying gradient. Let the distribution of voltage, independent of $z$, be as follows:

$$\phi(x, y) = \frac{\phi_1}{a^2}\,(y^2 - x^2) \cos \omega t,$$

with $\omega = 2\pi f$. The equations of motion will be written:

$$m\ddot{x} - \frac{2e\phi_1}{a^2}\, x \cos \omega t = 0, \qquad m\ddot{y} + \frac{2e\phi_1}{a^2}\, y \cos \omega t = 0.$$

These equations are still of the form

$$\frac{d^2 x}{d\tau^2} + (a_x - 2q \cos 2\tau)\, x = 0,$$

if $a_x = 0$, $q = 4e\phi_1/m a^2 \omega^2$, and $2\tau = \omega t$. The solution is of the form

$$x = A e^{\mu\tau} \sum_{n=-\infty}^{+\infty} C_n e^{in\tau} + B e^{-\mu\tau} \sum_{n=-\infty}^{+\infty} C_n e^{-in\tau}.$$
If $\mu$ is real or complex, the amplitude of $x$ grows without bound. If on the contrary $\mu$ is purely imaginary, the solution will be finite for all values of $\tau$; the trajectory will be stable and may pass through the lens. The first stable domains of $\mu$ will correspond to:

$$0 < q < 0.92 \qquad \text{and} \qquad 7.5 < q < 7.52,$$

but here $q$ is independent of the velocity of the incident ions. The only interesting parameter then is the mass $m$; only the masses larger than a value $m_0$ given by $q = 0.92$ will pass through the lens, the first region constituting a high-pass mass filter. The second domain will supply a band-pass filter. For “unstable” masses, the amplitude of the oscillations of the trajectories will grow exponentially, and the resolving power of the combination will be
better in proportion to this increase, that is to say, the greater the number of oscillations. For a fixed length L of the electrodes, we must therefore use slow ions (v/f << L). 2. Mass Filter. The use of the second domain leads to elevated values of $\phi_1$, which is an inconvenient feature. A band filter may be realized at low voltage by remaining in the first domain and superimposing on the hf voltage a steady quadrupole voltage $\pm\phi_G$.
we demand
The ratio $u = a/q = \pm 2\phi_G/\phi_1$ depends only on the ratio of the applied voltages. In the first domain u has an upper limit $u_{max} = 2\phi_G/\phi_1 = 0.333$; the domain of stable operation is then infinitely narrow and corresponds to q = 0.706.
To each mass will correspond a ratio u, or a frequency f, for which the value of q will lie within the chosen band, in the neighborhood of $u_{max}$. For a constant field the resolving power for masses may be calculated:
$$\frac{m}{\Delta m} = \frac{0.75}{1 - (u/u_{max})}.$$
In sweeping through the values of the applied voltage, or those of f, if one is dealing with a source of slow ions containing several masses, one will see appearing successively on the detector the different component masses; thus one obtains a mass spectrograph. The equations of motion are much more complex in the case of intense beams, where space charge plays an important role. For weak currents one may effectively demonstrate the accuracy of the simple theory (46) with ions of rubidium. Choosing f = 2.55 Mc/s, $\phi_1$ = 1000 volts, a = 1 cm, L = 50 cm, the rubidium positive ions of masses 85 and 87 may be separated with a maximum resolving power for u = 0.328. The same authors have been able, by rendering the electrical supply more complex, to transform the arrangement into an isotope separator (47) furnishing a current of several milliamperes.
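The stability criterion used in this section (bounded or unbounded solutions of the equation in τ) can be checked numerically. The following is only a minimal sketch, not the method of the cited authors: it integrates the equation $d^2x/d\tau^2 + (a_x - 2q\cos 2\tau)x = 0$ over one period π with a fixed-step Runge-Kutta scheme and applies the standard Floquet test (the motion is bounded when the trace of the one-period transfer matrix is smaller than 2 in absolute value). The function names `mathieu_trace` and `is_stable` and the step count are illustrative assumptions.

```python
import numpy as np

def mathieu_trace(a, q, steps=4000):
    """Trace of the one-period transfer matrix of x'' + (a - 2 q cos 2*tau) x = 0."""
    def deriv(tau, y):
        # y has shape (2, 2): row 0 holds x, row 1 holds x' for two basis solutions
        return np.array([y[1], -(a - 2.0 * q * np.cos(2.0 * tau)) * y[0]])

    h = np.pi / steps          # the coefficient cos(2*tau) has period pi
    y = np.eye(2)              # basis solutions (x, x') = (1, 0) and (0, 1)
    tau = 0.0
    for _ in range(steps):     # classical fourth-order Runge-Kutta step
        k1 = deriv(tau, y)
        k2 = deriv(tau + h / 2, y + h / 2 * k1)
        k3 = deriv(tau + h / 2, y + h / 2 * k2)
        k4 = deriv(tau + h, y + h * k3)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        tau += h
    return y[0, 0] + y[1, 1]

def is_stable(a, q):
    """Floquet criterion: bounded motion when |trace| < 2."""
    return abs(mathieu_trace(a, q)) < 2.0

if __name__ == "__main__":
    for q in (0.5, 0.9, 1.0):
        print(q, is_stable(0.0, q))
```

With $a_x = 0$ this sketch places the upper limit of the first stable domain slightly above q = 0.9, close to the value quoted in the text; the same function can be called with a nonzero `a` to explore the band filter, remembering that the y motion corresponds to the signs of a and q reversed.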
J. Focusing of Polarized Atoms and Molecules
Vauthier was the first to show that it was possible to use lenses with an inhomogeneous magnetic field for the focusing of magnetically polarized atoms (48), that is to say, atoms endowed with a magnetic moment. The optics of neutral atoms and molecules, polarized naturally or artificially by the Zeeman or Stark effects, has made great progress in the last few years, and numerous arrangements using quadrupole lenses have been employed. We can imagine, for example, the application of magnetic lenses to the focusing of atoms of hydrogen, and electric lenses for that of dipolar molecules. 1. Magnetic Focusing of Hydrogen Atoms. Hydrogen atoms, placed in a magnetic field, distribute themselves into four energy states corresponding to the different spin states. In a very strong field the spins of the proton and of the electron, each of value 1/2, are decoupled and orient themselves separately. If the fields are very weak, the magnetic moments remain coupled, and we have a triplet of spin 1, or a singlet of spin 0. The atom with spin 1 may be oriented in three different ways. The various components are indicated on the well-known Breit-Rabi diagram. When placed in a magnetic field an atom undergoes a force:
proportional to the slope of the curves in the diagram. In strong fields this force is the same on curves 1 and 2, on the one hand, and on curves 3 and 4 on the other hand; a parallel beam of atoms separates itself only into two parts. The two beams would not be polarized, since the spins of the nuclei are in opposite directions for the two respective components of each. Denoting by B, the field on the circle of radius a, in a quadrupole lens, we have:
For components 1 and 2, the forces $F_1$ and $F_2$ are convergent and are expressed by
$\mu_B$ denoting the Bohr magneton, and $B_0 \approx 0.05$ weber/meter². On components 3 and 4, divergent forces are effective, equal respectively to $-F_1$ and $-F_2$. We see that the force is not proportional to the distance from the axis;
the quadrupole structure does not really perform the function of a lens (it would be necessary, for that, that the force be proportional to r). But this structure exerts a force possessing a symmetry of revolution with respect to Oz. It is possible to determine the trajectories of components 1 and 2 and, without producing a real image, to separate these components. The arrangement studied by Keller (49) is composed of such a lens which is very long, and in which the variation of the radius a is adiabatic, of the form $a = a_0 (z/z_0)^{3/4}$. The injection is carried out on a circle of radius $r_0 < a$, and a diaphragm tangent to the envelope of the trajectories of the first component allows the second component to be eliminated, the third and fourth eliminating themselves by loss against the walls. In the same manner atoms possessing a magnetic moment $\mu_m$ proportional to the field H (50) may be focused; they are then subject to a restoring force directed toward the axis
$$F_r = -\mu_m\,\mathrm{grad}\,H = -A_1 H = -A_2 r$$
and the quadrupole lens plays the role of a lens of revolution. If $\mu_m$ were constant, it would be necessary to use a lens with field gradient linear in r, that is a hexapolar lens (51). 2. Electrostatic Focusing of Dipolar Molecules (52). Certain diatomic molecules have a permanent dipole moment; this is the case for potassium bromide. If I is its moment of inertia, μ its dipole moment, J and M the orbital and “magnetic” quantum numbers (denoting respectively the quantization of the magnitude of the angular momentum, and of its projection on the axis of the molecule), the additional energies acquired by the molecule in the electric field E may be written as follows for weak fields:
J(J where A
=
J ( J + 1) - 3M2 + 1)(2J + 1 ) ( 2 J - 3)
h / 2 ~( h is the Planck constant), in setting
A
=
tL2/2f ff
Awe, = - p2E2. 2A
We define an effective dipole moment
peff
by the relation
For weak fields, $\mu_{eff}$ is a linear function of E. In an electrostatic quadrupole lens, where the electrodes are kept at potentials $\pm\phi_1$ and are tangent to the throat circle of radius a:
$$E = -(2\phi_1/a^2)\,r.$$
We have
The force acting on the molecule is a restoring force proportional to r. The lens is equivalent, for molecules, to a lens of revolution around Oz. The equation of motion is written as follows, since $F = -\mu_{eff}\,\mathrm{grad}\,E$:
A particle of velocity v which crosses the axis Oz at t = 0 and z = 0 will cross the axis again at time $t = \pi/\omega$, at a distance z = L, such that
$$L = v(\pi/\omega).$$
If the object-image distance L is fixed, $\phi_1$ may be calculated:
But $mv^2 = 2kT$ (k the Boltzmann constant, T the absolute temperature). For L = 90 cm, a = 0.5 cm, T = 1000 °K, we will have, with molecules of potassium bromide, in the 1-0 state, for which μ = 10.4 debye units, and A = 1.60. ergs:
$$2\phi_1 = 1.45\ \text{kv}.$$
The field $E_{max}$ on the electrodes is of the order of 2.9 kv/cm and the gradient is of the order of 5.8 kv/cm². A spectrograph which allows molecules of KBr having different values of J and M to be separated has been constructed on this principle (52). Similarly, molecules of ammonia, NH3, with an axis of symmetry, which in an electric field E acquire a dipole moment proportional to E, may be focused (53), for the purpose of injecting the molecules (situated on a high level of energy) into a resonant cavity tuned to the frequency $f_0$ = 23 870 Mc/s, which corresponds to the inversion frequency of the NH3 molecules, in order to obtain an oscillator of the “Maser” type. In this case, for a = 1 cm, L = 33 cm, we would find that $\phi_1 \approx 15\,000$ volts.
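The field and gradient quoted for the potassium bromide example follow directly from the linear field law $E = (2\phi_1/a^2)\,r$ evaluated on the throat circle r = a. The short check below is a sketch under that reading; the function name `quadrupole_field` is an illustrative assumption, and the inputs are the values 2φ1 = 1.45 kv and a = 0.5 cm given in the text.

```python
def quadrupole_field(two_phi1_kv, a_cm):
    """Return (field at r = a in kv/cm, field gradient in kv/cm^2)
    for a quadrupole with electrode potentials +/- phi1 and throat radius a."""
    gradient = two_phi1_kv / a_cm ** 2   # dE/dr = 2*phi1 / a^2
    e_max = gradient * a_cm              # E(r = a) = 2*phi1 / a
    return e_max, gradient

e_max, gradient = quadrupole_field(1.45, 0.5)
print(f"E_max = {e_max:.2f} kv/cm, gradient = {gradient:.2f} kv/cm^2")
```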
II. ABERRATIONS
A. Aberrations of a System with Two Planes of Symmetry
1. Number and General Form of the Aberration Terms. In a quadrupole lens there cannot exist second-order aberration, that is to say, aberration whose development contains second-degree terms in $X_0$, $Y_0$, $X'_0$, $Y'_0$. In fact, the potential $\phi(r,\theta)$ is such that
$$\phi(r,\theta+\pi) = \phi(r,\theta)$$
and the change of $X_0$ to $-X_0$ and of $Y_0$ to $-Y_0$ automatically brings about an identical change for the image plane or for the exit plane of the system:
$$X_i \to -X_i, \qquad Y_i \to -Y_i.$$
This can only take place if the expression for the distance between the point of arrival of the supposedly perfect trajectory (to the first order) and that of the real trajectory (perturbed by the aberrations) contains only odd powers of $X_0$, $Y_0$, etc. Here we shall examine only the principal terms, which are called “third-order.” Their general expression for a point source situated on the axis is given, as a result of symmetry with reference to the plane containing the plane trajectories, by an expression of the type (54)
$$\Delta X = \varepsilon = X_0\,(a_1X_0^2 + a_2Y_0^2 + a_3X_0'^2 + a_4Y_0'^2 + a_5Y_0Y_0') + X_0'\,(a_6X_0^2 + a_7Y_0^2 + a_8X_0'^2 + a_9Y_0'^2 + a_{10}Y_0Y_0')$$
and similar expressions for ΔY, ΔX', and ΔY'. But these terms can be expressed more simply for an incident beam parallel to the axis. Their global expression is then of the form:
$$\Delta X = a_1X_0^3 + a_2X_0Y_0^2, \qquad \Delta Y = b_1Y_0^3 + b_2Y_0X_0^2.$$
The aberrations are of several sorts, as in classical optics: (1) Aperture aberrations, identical at all points of the Gaussian image plane, even on the axis. An “object” point situated on the axis will give for example (calling the coordinates of the trajectory in the aperture plane of the system $R_A$ and θ) an aberration spot of equation
$$\Delta X = a_1R_A^3\cos^3\theta + a_2R_A^3\sin^2\theta\cos\theta, \qquad \Delta Y = b_1R_A^3\sin^3\theta + b_2R_A^3\cos^2\theta\sin\theta$$
surrounding the theoretical image point.
In a system of revolution there exists only a single aberration of this type: spherical aberration; here, there is a combination of three aberrations. (2) Different distortions, which appear as a displacement of the real image point with respect to the Gaussian image point, and which cause a variation of magnification in the Gaussian image plane. The global effect may be put in the form
$$\Delta X_i = s_1X_0^3 + s_2X_0Y_0^2, \qquad \Delta Y_i = s_3Y_0^3 + s_4Y_0X_0^2,$$
$X_0$ and $Y_0$ being the coordinates of the object point in the object plane. (3) Finally other aberrations like astigmatism, curvature of field, and coma. All these terms are described by Burfoot (55); we shall study more especially the aperture aberrations, which are much the most important. 2. Aperture Aberrations. Of these there exist three sorts: a. Pure spherical aberration. From the constant $\Gamma_s$ we would have
$$\Delta X = \Gamma_sR_A^3(\cos^3\theta + \sin^2\theta\cos\theta), \qquad \Delta Y = \Gamma_sR_A^3(\sin^3\theta + \cos^2\theta\sin\theta).$$
For an object point emitting rays reaching the system around a crown of radius $R_A$ and such that $0 < \theta < 2\pi$, the aberration spot surrounding the ideal image point will have a radius $r = (\Delta X^2 + \Delta Y^2)^{1/2}$. The incident trajectories which are inside the cone formed by the preceding trajectories will arrive in the interior of the circle which is thus defined (Fig. 18a).
FIG. 18. Aperture aberrations of a system with two planes of symmetry: (a) spherical aberration; (b) star; (c) rosette.
b. Pure star aberration. We will then have:
$$\Delta X = \Gamma_eR_A^3\cos^3\theta, \qquad \Delta Y = -\Gamma_eR_A^3\sin^3\theta.$$
The aberration spot will be a sort of star with four branches (Fig. 18b).
c. Pure rosette aberration. This will have the form
$$\Delta X = \Gamma_rR_A^3(\cos^3\theta - \sin^2\theta\cos\theta), \qquad \Delta Y = \Gamma_rR_A^3(\sin^3\theta - \cos^2\theta\sin\theta)$$
and the spot will have the indicated form (Fig. 18c). d. Global aberration. In general the three aberrations are present and one will obtain expressions
$$\Delta X = R_A^3[\Gamma_1\cos^3\theta + \Gamma_2\sin^2\theta\cos\theta], \qquad \Delta Y = R_A^3[\Gamma_3\sin^3\theta + \Gamma_4\cos^2\theta\sin\theta]$$
with
$$\Gamma_1 = \Gamma_s + \Gamma_e + \Gamma_r, \qquad \Gamma_2 = \Gamma_s - \Gamma_r = \Gamma_4, \qquad \Gamma_3 = \Gamma_s - \Gamma_e + \Gamma_r.$$
If $\Gamma_2 \neq \Gamma_4$, other aberrations are present. 3. Distortions. In classical electron optics (microscopy, for example) distortion is the fundamental fault of lenses which operate with a very small aperture (projection lenses, for example), for which the aperture aberration is negligible, contrary to the case of objective lenses, which use a very large aperture.
FIG. 19. Different types of distortions: (a) pincushion ($D_1 > 0$) and barrel ($D_1 < 0$); (b) inverted pincushion ($D_2 > 0$) and inverted barrel ($D_2 < 0$); (c) hammock distortion type I ($D_3 > 0$) and type II ($D_4 > 0$). If $D_3$ and $D_4$ are negative, these figures rotate 90°.
In addition to the classical distortions of “pincushion” and “barrel” types, which lead to the distortions indicated on Fig. 19a and are expressed by equations such as
$$\Delta X_i = D_1(X_0^3 + X_0Y_0^2), \qquad \Delta Y_i = D_1(Y_0^3 + X_0^2Y_0),$$
there exist other types here. (1) The preceding distortions “inverted” (Fig. 19b)
(2) Asymmetric “hammock” distortions, that is,
$$\Delta X_i = D_3(X_0^3 - X_0Y_0^2), \qquad \Delta Y_i = D_3(-Y_0^3 + X_0^2Y_0) \qquad \text{(Fig. 19c)}$$
or
$$\Delta X_i = D_4(X_0^3 + X_0Y_0^2), \qquad \Delta Y_i = D_4(-Y_0^3 - X_0^2Y_0).$$
In general, we obtain
$$\Delta X_i = \Gamma_5X_0^3 + \Gamma_6X_0Y_0^2, \qquad \Delta Y_i = \Gamma_7Y_0^3 + \Gamma_8X_0^2Y_0$$
and one may, by knowing $\Gamma_5$, $\Gamma_6$, $\Gamma_7$, and $\Gamma_8$, calculate $D_1$, $D_2$, $D_3$, and $D_4$. All these coefficients may be determined theoretically by solving the equations of motion in the “third order” approximation.
B. Calculation of the Trajectories of the Third Order (56, 57)
1. General Equations. a. Magnetic case. We limit the expression for the scalar potential $\phi(X,Y,z)$ to the first two terms, which are the most important:
$$\phi(X,Y,z) = \frac{2\phi_1}{a^2}\,k(z)\,XY - \frac{2\phi_1}{a^2}\,\frac{k''(z)}{12}\,XY(X^2 + Y^2) + \ldots$$
The different components of the field are then given by
$$\mathbf{B} = -\mu_0\,\mathrm{grad}\,\phi(X,Y,z),$$
which may be written:
$$B_X = -K(z)\,Y + \tfrac{1}{12}K''(z)\,Y(3X^2 + Y^2),$$
$$B_Y = -K(z)\,X + \tfrac{1}{12}K''(z)\,X(X^2 + 3Y^2),$$
$$B_z = -K'(z)\,XY$$
with
$$K(z) = \frac{2\mu_0\phi_1}{a^2}\,k(z);$$
k(z) is the characteristic function, which is equal to unity at z = 0. The calculations may be carried out in the nonrelativistic approximation, and one may go from this case to the relativistic case by replacing the accelerating voltage $\phi_0$ by
$$\phi_0^* = \phi_0\left[1 + (e\phi_0/2m_0c^2)\right]$$
in the magnetic case, and by
$$\phi_0^* = \phi_0\,\frac{1 + (e\phi_0/2m_0c^2)}{1 + (e\phi_0/m_0c^2)}$$
in the electric case.
Starting from the general equation
$$\frac{d}{dt}(m\mathbf{v}) = -e\,\mathbf{v}\times\mathbf{B}$$
and eliminating the time by the method seen earlier, we obtain:
where s is the unit vector tangent to the trajectory and $\phi = \phi_0$ = constant. We project onto OX and OY and arrive at the following equations:
$$X'' + \beta^2kX = -\beta^2\left[kX\,\frac{3X'^2 + Y'^2}{2} - kX'Y'Y - k'XYY' - \frac{k''}{12}\,X(X^2 + 3Y^2)\right],$$
$$Y'' - \beta^2kY = \beta^2\left[kY\,\frac{3Y'^2 + X'^2}{2} - kX'Y'X - k'XYX' - \frac{k''}{12}\,Y(3X^2 + Y^2)\right].$$
These equations group together several sorts of aberrations. (1) Those which are due to the variation of longitudinal velocity, as a result of the slope of the trajectory (terms in $X'^2$ and $Y'^2$); we shall call these “velocity-inclination” aberrations. (2) Those which are due to the existence of the field $B_z$ (terms in k'). (3) Finally aberrations due to the appearance of perturbing terms in the leakage fields (terms with k''). They suppose implicitly that in the central zone, where k(z) = 1, the field is purely quadrupolar; we have neglected the sixth-degree term in the development of $\phi(X,Y,z)$. The aberrations that we shall be able to calculate will then be those of lenses with ideal hyperbolic pole pieces.
b. Electric case. The starting equation is now:
In the axis system Oxyz, the plane zOx is the convergent one (for positive particles) if the expressions of the field components are:
$$E_x = -\frac{2\phi_1}{a^2}\,k(z)\,x + \frac{2\phi_1}{a^2}\,\frac{k''(z)}{6}\,x^3,$$
$$E_y = \frac{2\phi_1}{a^2}\,k(z)\,y - \frac{2\phi_1}{a^2}\,\frac{k''(z)}{6}\,y^3,$$
$$E_z = -\frac{2\phi_1}{a^2}\,k'(z)\,\frac{x^2 - y^2}{2}.$$
We must keep track here, in the calculation of velocity, of the local potential:
$$\Phi(x,y,z) = \phi_0\ \text{(acceleration)} + \phi(x,y,z)\ \text{(lens)}.$$
For ions, $\phi_0$ must be taken negative. We obtain:
and, after some calculation, we obtain the projected movement on Ox and Oy:
The terms are slightly different from those of the magnetic case and there appears a supplementary term which proceeds from the variations of the global velocity v. A better approximation would be obtained by bringing in terms of the sixth and tenth orders in the development of $\phi(x,y,z)$, terms which cause the radial variations of gradient which one always observes in real lenses. These terms being negligible as far as a distance from the axis equal to a/2, we may suppose that the above equations will give to a good approximation the values of the aberration terms. But for very broad beams, the agreement between theory and experiment can be rather poor, the
decrease of the gradient toward the edges playing a role which is difficult to foresee. 2. Calculation of the Aberrations; the Perturbation Equations in the Magnetic Case. In order to calculate the aberrations from the general equations, we may use the classical method from electron optics and set:
$$X_1(z) = X(z) + \varepsilon(z), \qquad Y_1(z) = Y(z) + \eta(z),$$
$$X'_1(z) = X'(z) + \varepsilon'(z), \qquad Y'_1(z) = Y'(z) + \eta'(z).$$
The terms ε, ε', η, and η' always constitute a weak perturbation; in calculating it in a plane after the exit of the system, one may derive from it the value of the aberrations in the image plane. The functions X(z) and Y(z) will satisfy the first-order equations:
$$X''(z) + \beta^2k(z)X(z) = 0, \qquad Y''(z) - \beta^2k(z)Y(z) = 0.$$
We then obtain the necessary expressions by calculating the terms ε and η, neglecting the product εη and the powers of ε and η greater than the first:
$$\varepsilon'' + \beta^2k\varepsilon = -\beta^2\left[kX\,\frac{3X'^2 + Y'^2}{2} - kX'Y'Y - k'XYY' - \frac{k''}{12}\,X(X^2 + 3Y^2)\right],$$
$$\eta'' - \beta^2k\eta = \beta^2\left[kY\,\frac{3Y'^2 + X'^2}{2} - kX'Y'X - k'XYX' - \frac{k''}{12}\,Y(3X^2 + Y^2)\right].$$
The right-hand sides are known when one has determined the solutions X(z) and Y(z) to first order; they are such functions of z that the equations can be integrated only step by step on electronic machines. The function k(z) can be represented by the bell-shaped model, which allows the terms k'(z) and k''(z) to be introduced in the form of analytic functions of z. We obtain equations of similar form for the electrostatic case. Before giving the results obtained by the integration of these general equations, we shall describe the different attempts made at calculating approximately the aberration terms from the simplified equations. 3. Approximate Calculations from the Rectangular Model. a. Aberrations due to velocity terms. Reisman (24) has been able to integrate directly the following simplified system
with k = constant = 1.
The right-hand sides may be calculated by using the solutions X, Y and their derivatives to the first order; one may then integrate and obtain the solutions at the exit of the first lens $Q_1$ of the doublet. From this one draws the initial conditions at the entrance to $Q_2$ after a drift path of length D; then one again integrates the equations in the second lens $Q_2$. After some tedious calculations, Reisman obtains the expressions for $X_3$, $X'_3$, $Y_3$, and $Y'_3$ in the very particular case where the doublet is equivalent to a round lens, that is to say when
$$\beta D = -2/(\cot\beta L + \coth\beta L).$$
The trajectories oscillate in the lenses and the terms $X'^2$ and $Y'^2$ are no longer negligible. From these values, one may obtain the displacement ΔX in the image focal plane (Fig. 20), which is here immersed in the system.
FIG. 20. Aberration in the focal plane of a doublet of revolution.
Let us denote by $X_1$, $X'_1$ the elements for the first order, by $X_3$ and $X'_3$ the elements for the third order. We have
If we consider an incident ray parallel to the axis Oz ($X'_0 = Y'_0 = 0$) and $X_0 = R_0\cos\theta$ in the plane C-D, $Y_0 = R_0\sin\theta$ in the plane D-C, we obtain:
$$\Delta X = R_0^3[\Gamma_1\cos^3\theta + \Gamma_2\sin^2\theta\cos\theta] = \Gamma_1X_0^3 + \Gamma_2X_0Y_0^2,$$
$$\Delta Y = R_0^3[\Gamma_3\sin^3\theta + \Gamma_2\cos^2\theta\sin\theta] = \Gamma_3Y_0^3 + \Gamma_2Y_0X_0^2.$$
For a purely spherical aberration, one would have: $\Gamma_1 = \Gamma_2 = \Gamma_3$. For a pure star aberration: $\Gamma_1 = -\Gamma_3$ and $\Gamma_2 = 0$. For a pure rosette aberration: $\Gamma_1 = \Gamma_3 = -\Gamma_2$. In a particular case (L = 7.46 cm, a = 1 cm, βL = 3.6178, d = 5 cm, and f = 0.177 cm) one finds for example
$$\Gamma_3 = -918, \qquad \Gamma_2 = 66.2, \qquad \Gamma_1 = 189.$$
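The three pure cases listed above suggest that any set of coefficients with a common cross term can be split back into spherical, star, and rosette parts by inverting the relations given earlier for the global aberration (first coefficient = spherical + star + rosette, cross coefficient = spherical − rosette, third coefficient = spherical − star + rosette). The sketch below does this for the numerical case just quoted; the function name is illustrative, and reading the unlabeled value 189 as the first coefficient is an assumption about the garbled original.

```python
def split_aperture_coefficients(g1, g2, g3):
    """Invert  g1 = gs + ge + gr,  g2 = gs - gr,  g3 = gs - ge + gr
    and return (gs, ge, gr): spherical, star, and rosette parts."""
    g_star = (g1 - g3) / 2.0
    half_sum = (g1 + g3) / 2.0          # equals gs + gr
    g_spherical = (half_sum + g2) / 2.0
    g_rosette = (half_sum - g2) / 2.0
    return g_spherical, g_star, g_rosette

# Values quoted for the particular doublet (189 taken as the first coefficient: an assumption)
gs, ge, gr = split_aperture_coefficients(189.0, 66.2, -918.0)
print(f"spherical = {gs:.2f}, star = {ge:.2f}, rosette = {gr:.2f}")
```

Under these assumptions the star part is the largest of the three in magnitude, which fits the strong asymmetry between the first and third coefficients.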
One may obtain by the same method the other aberration terms, particularly the distortion. We have in practice, in the case of the doublet of revolution:
$\Delta M_x$ denoting the variation of the magnification along OX, M the magnification itself, and $X_i$ representing the coordinate in image space. Reducing to the object space, one would have
$$\Delta X_0 = \frac{X'_3 - X'_1}{X'_1}\,X_0 \quad\text{with}\quad X_0 = R_0\cos\theta,$$
$$\Delta Y_0 = \frac{Y'_3 - Y'_1}{Y'_1}\,Y_0 \quad\text{with}\quad Y_0 = R_0\sin\theta,$$
and in the particular case discussed earlier:
$$\Delta X_0 = 17.7X_0^3 + 34Y_0^2X_0, \qquad \Delta Y_0 = -20.6Y_0^3 + 12.1X_0^2Y_0.$$
From this we have the coefficients $\Gamma_5 = 17.7$, $\Gamma_6 = 34$, $\Gamma_7 = -20.6$, $\Gamma_8 = 12.1$ cm⁻². Let us give for comparison the aberration constants of a very good round projection lens for an electron microscope, also studied by Reisman. Spherical aberration:
$$\frac{\Delta R_i}{M} = C_sf\alpha^3, \qquad C_s = 2.40 \ \text{with}\ f = 2\ \text{mm}.$$
Distortion, defined by:
$$\frac{\Delta R_i}{M} = S_1R_0^3, \qquad S_1 = 5.6\ \text{to}\ 7.5$$
($R_0$ denotes the distance from the axis to the object point.) By the constants $\Gamma_1$, $\Gamma_2$, and $\Gamma_3$ obtained above with a beam parallel to the axis and constants similar to the $C_s$ defined here, we would have the relation:
$$\Delta X = \alpha^3f[C_{1s}\cos^3\theta + C_{2s}\sin^2\theta\cos\theta], \qquad \Delta Y = \alpha^3f[C_{3s}\sin^3\theta + C_{2s}\cos^2\theta\sin\theta];$$
we then have, for θ = 0: $C_{1s}f = \Gamma_1f^3$ (since $\alpha = R_0/f$), that is $C_{1s} \approx 5$. In the same way $C_{2s} \approx 1.9$, $C_{3s} \approx 26.5$. For distortion, we shall compare $S_1$ directly to $\Gamma_5$, $\Gamma_6$, $\Gamma_7$, and $\Gamma_8$. In conclusion, the aberration terms calculated for a doublet equivalent to a round lens are of an order of magnitude comparable to those of a normal projection lens. But they do not take into account the leakage fields, and $B_z$ in particular. b. Estimate of aberration due to $B_z$. Bernard and Grivet (58) have proposed a method of calculation which keeps track of $B_z$, using the rectangular model, but neglecting the terms due to the slope of the trajectories; application has been made to a relatively weak symmetric doublet ($X'^2$ and $Y'^2$ being negligible). We integrate the simplified equations:
$$X'' + \beta^2kX = \beta^2XYY'k'(z), \qquad Y'' - \beta^2kY = -\beta^2XYX'k'(z).$$
The term in k'(z) is presented in the form of infinitely narrow impulses, but having a finite area
$$\int_{-\alpha}^{+\alpha}\frac{dk}{dz}\,dz = \Big[k(z)\Big]_{-\alpha}^{+\alpha} = 1,$$
denoting by ±α an infinitely small displacement on one or the other side of the discontinuities in k(z). The computation has been carried out for a hollow divergent beam whose maximum divergence is $(dr/dz)_0 = r'_0$ and whose distance to the axis in the entrance plane $P_0$ of $Q_1$ is equal to $r_0$. We then set:
$$X_0 = r_0\cos\theta, \quad Y_0 = r_0\sin\theta, \quad X'_0 = r'_0\cos\theta, \quad Y'_0 = r'_0\sin\theta.$$
When one passes through the plane $P_0$, the slope undergoes a variation caused by the leakage field, and symbolized by the term in dk/dz. We then have, in a plane $P_1$ situated immediately at the entrance of $Q_1$,
$$\varepsilon'_1 = X'_1 - X'_0 = \beta^2X_0Y_0Y'_0\int\frac{dk}{dz}\,dz = +\beta^2X_0Y_0Y'_0,$$
$$\eta'_1 = Y'_1 - Y'_0 = -\beta^2X_0Y_0X'_0\int\frac{dk}{dz}\,dz = -\beta^2X_0Y_0X'_0,$$
where $X_0$, $Y_0$, $X'_0$, $Y'_0$ denote the solutions in the first order; as a consequence:
$$X'_1 = r'_0\cos\theta + \beta^2r_0^2r'_0\sin^2\theta\cos\theta, \qquad Y'_1 = r'_0\sin\theta - \beta^2r_0^2r'_0\sin\theta\cos^2\theta,$$
$$X_1 = X_0, \qquad Y_1 = Y_0,$$
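The impulse approximation for the leakage field amounts to a change of slope at fixed position when the ray crosses the entrance plane. A minimal sketch of that single step follows; it assumes the slope changes $+\beta^2X_0Y_0Y'_0$ and $-\beta^2X_0Y_0X'_0$ for the two transverse directions, and the function name `fringe_kick` and the sample divergence are hypothetical choices made only for illustration.

```python
import math

def fringe_kick(r0, r0_prime, theta, beta):
    """Slopes just inside the entrance plane for a ray of a hollow beam.

    r0       : distance from the axis in the entrance plane (m)
    r0_prime : slope dr/dz of the ray in the entrance plane
    theta    : azimuth of the ray (rad)
    beta     : first-order focusing constant (1/m)
    """
    x0, y0 = r0 * math.cos(theta), r0 * math.sin(theta)
    xp0, yp0 = r0_prime * math.cos(theta), r0_prime * math.sin(theta)
    xp1 = xp0 + beta ** 2 * x0 * y0 * yp0   # slope change in X from the impulse
    yp1 = yp0 - beta ** 2 * x0 * y0 * xp0   # slope change in Y, opposite sign
    return xp1, yp1

# Illustrative values only: r0 = 3 cm, a divergence of 1e-3, beta = 2.015 per meter
xp1, yp1 = fringe_kick(0.03, 1e-3, 0.6, 2.015)
print(xp1, yp1)
```

The positions are left unchanged by this step, in keeping with the thin-lens character of the entrance plane.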
that is
The entrance and exit planes are therefore equivalent to thin lenses, divergent in $P_0$ and $P_6$, convergent at $P_3$ and $P_4$ in the plane zOX, and inversely in the plane zOY. The calculation is carried out from $P_0$ to $P_7$, the exit plane of $Q_2$. The emergent beam rests on two perpendicular focals contained in the planes of symmetry. The focal contained in plane zOX is situated at a distance
and has for its length dy=2
(X 7 - X 1 , y l
-
y7 7>
Symmetrical formulas allow the characteristics of the focal contained in the other plane to be obtained. If one is situated in the front plane $P_X$ containing the first focal (contained in zOY when zOX is the plane C-D), and if one considers all the incident rays defined by $X_0 = r_0\cos\theta$, $Y_0 = r_0\sin\theta$, we no longer obtain a focal line but a curve having equations
$$X = C\cos\theta\sin^2\theta, \qquad Y = A\sin\theta + B\sin\theta\cos^2\theta,$$
where A, B, C are coefficients depending on $r_0$, $r'_0$, β, L, and D. It is difficult to draw a useful general expression from these, and the calculations have been carried out for particular cases. Figure 21 shows the general form of the focal spots and the correspondence between the rays which bound the beam in the entrance plane and on the “focal” spots. The rays labeled 2, 8, 6, and 12 are those which pass furthest from the axis OX in the second focal; they are defined by
$$\tan\theta = \pm\sqrt{2}/2, \quad\text{that is}\quad \theta \approx 35°.$$
FIG. 21. Forms of the aberration spots (perturbed focals); the same figures denote corresponding rays. (Panel labels: plane D-C, plane C-D, 1st focal line, 2nd focal line.)
In the same way the aberration is at a maximum in the first focal for
$$\tan\theta = \pm\sqrt{2}, \quad\text{that is}\quad \theta \approx 55°.$$
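The two angles quoted for the focals are those at which the angular factors cos θ sin²θ and sin θ cos²θ reach their largest values in the first quadrant. The following sketch locates both maxima by a simple grid search; associating the second factor with the second focal is an assumption made by symmetry with the curve given for the first focal.

```python
import numpy as np

# Fine grid over the first quadrant (0 to 90 degrees)
theta = np.linspace(0.0, np.pi / 2.0, 90001)

# Angular factor of the aberration term in the first focal
theta_first = theta[np.argmax(np.cos(theta) * np.sin(theta) ** 2)]
# Symmetric factor, assumed here to govern the second focal
theta_second = theta[np.argmax(np.sin(theta) * np.cos(theta) ** 2)]

print(f"first focal : theta = {np.degrees(theta_first):.2f} deg, tan = {np.tan(theta_first):.4f}")
print(f"second focal: theta = {np.degrees(theta_second):.2f} deg, tan = {np.tan(theta_second):.4f}")
```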
As one moves along Oz, away from the lenses, one finds successively the aberration focals in the order indicated in Fig. 22, from (a) to (d).
FIG. 22. Cross section of the beam in the neighborhood of the focals, for an incident beam of radius $r_0$.
Let us give several numerical results, in the case where L = 19.4 cm, D = 30.6 cm, β = 2.015 m⁻¹, for α = 0, α = and α = 2.5 × rad. If α = 0, for the first focal, situated at 2.18 meters from $Q_2$:
$$X = -19.3\,r_0^3\cos\theta\sin^2\theta, \qquad Y = 0.79\,r_0\sin\theta - 10.2\,r_0^3\sin\theta\cos^2\theta,$$
with $r_0$, X, and Y in meters.
There is therefore only a single aberration term, the first term A sin θ being due to the first order. The width $2\delta_x$ of the spot (Fig. 22c) is about $2\delta_x = 15.3\,r_0^3$, that is 0.412 mm for $r_0$ = 3 cm.
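The quoted spot width can be reproduced directly from the cubic law, with all lengths in meters as stated in the text. The sketch below also forms the ratio of the half-width to the beam radius, which the text goes on to use as an “aberration figure”; the variable names are illustrative only.

```python
r0 = 0.03                        # radius of the incident hollow beam, in meters
width_m = 15.3 * r0 ** 3         # full width of the spot from the cubic law
width_mm = width_m * 1e3
ratio = (width_m / 2.0) / r0     # half-width divided by the beam radius

print(f"spot width = {width_mm:.3f} mm")
print(f"half-width / r0 = {ratio:.2e}")
```

The width comes out at about 0.413 mm, which agrees with the rounded figure of 0.412 mm given above.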
In the same way we will have, for the second focal, situated at 5.14 m:
$$2\delta_y = 17\,r_0^3 \approx 2\delta_x.$$
We may define an “aberration figure” $\tau = \delta/r_0$ characterizing the maximum width of the focal spot of Fig. 22c, constituted by pure aberration. We then have:
$$\tau \approx 7 \times 10^{-3}.$$
If α ≠ 0 we find
for α = , $2\delta_x \approx 2\delta_y$ = 0.51 mm,
for α = 2.5 × , $2\delta_x \approx 2\delta_y$ = 0.62 mm.
We may use this method with the bell-shaped model for k(z), but attempts in this direction lead to inextricable calculations. It is therefore preferable to go on to the integration of the more complete equations seen above. 4. Integration of the More Complete Equations (57). a. Methods. The integration of the perturbation equations giving ε and η, keeping track at the same time of the slope terms for the trajectory and of the existence of a leakage field $B_z$, has been carried out with different electronic machines by the team of Grivet, Septier, and collaborators, for the symmetrical doublet already considered in the preceding paragraph, but with a bell-shaped model for k(z). The curve k(z) has been measured experimentally for each lens. The separation D is large enough for the field distributions relative to $Q_1$ and $Q_2$ to be considered as independent (see the later chapter relative to measurements). Each curve k(z) has then been represented as well as possible by plateaus of length $2z_0 = 7.4 \times 10^{-2}$ meters terminated by half-curves of bell shape having the form
with $b = 7.6 \times 10^{-2}$ meters.
Let us recall that β = 2.015 meter⁻¹. We consider that the lens thus designed begins at 25 cm before its geometric center, the function k(z) being practically 0 at this point. The separation of the centers of the lenses is 50 cm; the total length of the system is 1 meter and may be divided into six successive domains of integration, of respective lengths 21.3, 7.4, 21.3, 21.3, 7.4, and 21.3 cm. From the initial conditions in the entrance plane $P_0$ the machine first integrates the equations to the first order, in the six domains, each time
bringing into agreement the trajectories at the boundaries; it thus arrives at the functions X(z), Y(z), etc., which then permit it to calculate the functions P(z) and Q(z), which constitute the second members of the perturbation equations. The integration has been carried out by two different machines: (1) An analog machine (OME, of the Société Française d'Electronique et d'Automatisme) which can give directly the general expressions for the aberration terms from the initial conditions, when β and k(z) are fixed numerically. (2) An arithmetic machine (IBM 704 of the Société IBM-France), in which one must introduce directly the initial conditions in numerical form, and begin the calculation over again when these change. The calculation with this last machine is very rapid (about 3 min for the complete system, with an integration step of 0.5 mm, that is 2000 steps for the total length of the doublet) and certainly much more exact than the preceding, but the results, which are less general, do not allow one to obtain rapidly the form of the caustic of the emergent beam. b. General form of the beam in the neighborhood of the focal lines. We shall use here the results obtained from the analog machine, noting first that a partial integration carried out without keeping track of the terms in K' and K'', and leading to what may be called “the aberrations of slope, and of velocity,” shows that these terms are, in the particular case of the weakly excited doublet studied here, totally negligible with reference to the terms of global aberration obtained with the complete equations (only a few percent), which confirms the validity of the integration carried out with the equivalent rectangular model. Each perturbation is obtained in the form of 10 terms. In the case of a beam coming from a point source, the slopes $X'_0$ and $Y'_0$ are proportional to $X_0$ and $Y_0$. All the terms are proportional to the third power of the distance from the axis $R_0$.
These coefficients reduce to two if the incident beam is parallel to the axis, and in this case the emergent rays are described, in the exit plane, by expressions of the form
$$X_s = a_1X_0 - b_1X_0^3 - c_1X_0Y_0^2, \qquad X'_s = -d_1X_0 - e_1X_0^3 - f_1X_0Y_0^2.$$
The calculations show that the coefficients a, b, c, d, e, f are almost all positive. The Gaussian image planes are defined by
$$z_1 = -X_s/X'_s = a_1/d_1, \qquad z_2 = -Y_s/Y'_s = a_2/d_2.$$
The perturbations bring the trajectory back toward the axis, and the slope is increased.
We obtain for the parallel beam, with the dimensions in meters:
$$X_s = 0.560431\,X_0 - 280 \times 10^{-6}X_0^3 - 240 \times 10^{-6}X_0Y_0^2,$$
$$X'_s = -2.536 \times 10^{-2}X_0 - 4 \times 10^{-6}X_0^3 - 5.5 \times 10^{-6}X_0Y_0^2,$$
$$Y_s = 1.336856\,Y_0 - 410 \times 10^{-6}Y_0^3 - 620 \times 10^{-6}Y_0X_0^2,$$
$$Y'_s = -2.536 \times 10^{-2}Y_0 - 7 \times 10^{-6}Y_0^3 - 9 \times 10^{-6}Y_0X_0^2.$$
The Gaussian planes are situated at $z_1$ = 2.21 meters and $z_2$ = 5.27 meters, while for $r_0$ = 3 cm the pseudo-focals will be at $z'_1$ = 2.17 meters and $z'_2$ = 5.13 meters. The aberration spot, for $r_0$ = 3 cm, will be in the Gaussian plane a flattened pseudo-ellipse, and will pass progressively through the forms represented in Fig. 22 from (d) to (a) in front of the Gaussian image plane as one approaches the lenses. We will have here:
$$2\delta_x = 0.06\ \text{mm}, \quad \tau_x \approx 10^{-3}; \qquad 2\delta_y = 0.26\ \text{mm}, \quad \tau_y = 4.3 \times 10^{-3}.$$
For a slightly divergent beam we would have:
$$2\delta_x = 0.12\ \text{mm}, \quad \tau_x = 2 \times 10^{-3}; \qquad 2\delta_y = 0.29\ \text{mm}, \quad \tau_y = 5 \times 10^{-3}$$
for an aperture of α = and
$$2\delta_x = 0.21\ \text{mm}, \quad \tau_x = 3.5 \times 10^{-3}; \qquad 2\delta_y = 0.33\ \text{mm}, \quad \tau_y = 5.5 \times 10^{-3}$$
for α = 2 × The aberration figure seems to grow linearly with the aperture. If we now consider an incident beam composed of parallels to Oz and of diameter $2r_0$ = 6 cm, each of the crowns of radius $r_i < r_0$ will give spots which are analogous but strung along Oz. The cross section through zOX of the beam in the neighborhood of the first focal shows its perturbed structure and gives the trace of the caustic, whose point is in the Gaussian plane $P_G$. After calculating different sections of the beam in the neighborhood of $P_G$, one notices that in $P_G$ the outer rays always correspond to the incident exterior surface of radius $r_0$. One also notices that there exists, before $P_G$, a zone of maximum narrowing $P_I$ (analogous to the zone of minimum confusion in classical optical instruments). This area of maximum narrowing is limited at the extremities merely by the outer incident rays; in the neighborhood of the minor axis of the spot, it is rays corresponding to interior incident rays ($r_i < r_0$) which form its outer envelope.
FIG. 23. Transverse section of the beam and the plane of optimum convergence at point $P_I$.
The mean thickness $2\varepsilon$ of this figure is clearly greater than the thickness $2\delta$ of the aberration spot: $\varepsilon_x \approx 2\delta_x$, $\varepsilon_y \approx 2.5\,\delta_y$.
c. Coefficients given by the calculating machine. The machine gives the global perturbation in the exit plane $P_5$. The calculation has been carried out for six incident rays parallel to the axis, situated at 3 cm from the axis in one of the quadrants and such that $\theta = 0$, $18^\circ$, $\tan^{-1}(\sqrt{2}/2)$, $\tan^{-1}\sqrt{2}$, $72^\circ$, and $90^\circ$. The expressions which are furnished may be put in the form

$$X = A_0X_0 + B_0X_0^3 + C_0X_0Y_0^2.$$
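We arrive at the following expressions:

$$
\begin{aligned}
X_5 &= 0.56045\,X_0 - 757\times10^{-6}X_0^3 - 335\times10^{-6}X_0Y_0^2\\
X'_5 &= -0.02536\,X_0 - 13\times10^{-6}X_0^3 - 6\times10^{-6}X_0Y_0^2\\
Y_5 &= 1.33685\,Y_0 - 96.5\times10^{-6}Y_0^3 - 153\times10^{-6}Y_0X_0^2\\
Y'_5 &= -0.02536\,Y_0 - 2.7\times10^{-6}Y_0^3 - 2.9\times10^{-6}Y_0X_0^2,
\end{aligned}
$$

which leads to $\tau_x = 5\times10^{-5}$ and $\tau_y = 10^{-3}$.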
The results are of the same order of magnitude as the preceding ones, but inverted; no explanation has been found for this disagreement. In any case, the calculation confirms that if one observes, in the neighborhood of one focal, the section of the beam corresponding to a hollow beam of radius $r_0$, the cross sections do have the forms represented in Fig. 22, and appear in the order (a) to (d) as one moves away from the lenses. Let us finally note that the order of magnitude of the aberration figures is the same as that given by the approximate calculation using the rectangular model.
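To make the six-ray calculation concrete, the sketch below evaluates the machine-fitted position expressions for the six azimuths listed above. Only the two position polynomials are used, with coefficients read from the text; the function name and the interpretation of the inputs in cm are assumptions made for illustration.

```python
import math

# Illustrative only: evaluate the machine-fitted position expressions for the
# six incident rays. Coefficients are read from the expressions above; the
# function name and the use of cm for X0, Y0 are assumptions.

R0 = 3.0  # radius of the hollow incident beam, cm
AZIMUTHS = [0.0, math.radians(18), math.atan(math.sqrt(2) / 2),
            math.atan(math.sqrt(2)), math.radians(72), math.radians(90)]

def exit_position(x0, y0):
    """Exit-plane coordinates (X5, Y5) for an incident ray at (X0, Y0)."""
    x5 = 0.56045 * x0 - 757e-6 * x0 ** 3 - 335e-6 * x0 * y0 ** 2
    y5 = 1.33685 * y0 - 96.5e-6 * y0 ** 3 - 153e-6 * y0 * x0 ** 2
    return x5, y5

rays = []
for theta in AZIMUTHS:
    x0, y0 = R0 * math.cos(theta), R0 * math.sin(theta)
    rays.append((math.degrees(theta),) + exit_position(x0, y0))

for angle, x5, y5 in rays:
    print(f"theta = {angle:6.2f} deg  X5 = {x5:8.5f}  Y5 = {y5:8.5f}")
```

The two middle azimuths, $\tan^{-1}(\sqrt{2}/2)$ and $\tan^{-1}\sqrt{2}$, are complementary (about 35.3 and 54.7 degrees), so the six rays are placed symmetrically about the bisector of the quadrant.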
C. Trial Correction for the Aperture Aberrations
1. Principle. In classical electron optics, when one uses systems of revolution, different authors have shown that it is possible to correct the spherical aberration of electrostatic lenses by following the lens to be corrected with electrostatic lenses of octopolar symmetry (59-64); the perturbing force being convergent and proportional to the third power of r, the octopolar lenses used create opposing forces in $r^3$ in two perpendicular planes and in $-r^3$ in the bisecting planes. If the beam is arranged to be very astigmatic, and if the correcting lenses are suitably placed in the planes of the focals, a global defocusing action may be obtained which practically compensates the spherical aberration. The action of these correctors is studied by adding to the expression for the electrostatic potential $\phi(r,z)$ a complementary term in $r^4$.
We introduce the resulting function
into the expression for the trajectories, and we show that it is possible to eliminate the effect of the perturbing term in $r^4$ which is the origin of the force in $r^3$. An analogous correction can be carried out with magnetic lenses. It should be possible, by utilizing a similar method here, to eliminate or at least to lessen the aperture aberrations.

2. Possible Solutions (Magnetic Case). In the leakage field, the field $B_z$, which is responsible for a large part of the aberrations, is zero in the planes zOX and zOY, and a maximum in the planes zOx and zOy, where its amplitude varies as $x^2$ or $y^2$. It has a global convergent action in a lens, which is for a given trajectory proportional to $r^3$, and although its distribution is of the form

$$B_z(r,\theta) = A(z)\,r^2\sin 2\theta$$

the convergence has an octopolar symmetry; it is zero along OX and OY, and a maximum and of the same sign along Ox and Oy. One may therefore think of correcting this parasitic action by making the lens itself less convergent along Ox and Oy, and more convergent along OX and OY; for example by superimposing on the ideal distribution of field in the center of the lens an octopolar distribution of the form

$$\phi(z,r,\theta) = \phi(z)\,r^4\sin 4\theta$$

or at least by following the quadrupole lens or the doublet with an octopole lens.

The first simple solution (and we shall see in the chapter devoted to experiment that it is practically possible to correct the aperture aberrations in this way) consists of reintroducing into the general development for the potential, beyond the fundamental, higher terms of the 6th and 10th orders, such that the radial gradient will be a decreasing function along Ox and Oy, and increasing along OX and OY, thus modifying the shape of the poles. Nonetheless, if the initial distribution was purely quadrupolar, the correction could not be complete, for the action of $B_z$ varies as $r^3$, while that of the higher order terms will vary as $r^5$ or $r^9$, and according to a law in $\sin 6\theta$ and $\sin 10\theta$, respectively.

Another solution consists of introducing into $\phi(r,\theta)$ the terms of the 4th order which are lacking, but it is then necessary to suppress the quadrupole symmetry (one modifies it only in opposite gaps, on OX, for example). The calculation of the perturbations created in this way to correct the aberrations has been carried out by Reisman (24). This author considers only the aberration terms in the equations due to the slope of the velocity, which are important in the doublet of revolution symmetry which he studies, and which is composed of long lenses (importance of the leakage fields reduced) which are very strongly excited; he hopes to compensate them, as well as those which are due to the terms in $r^5$ and $r^9$, always present. He describes $B_X$ and $B_Y$ in the form
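$$
\begin{aligned}
B_X &= -K\left[\,Y + [\ldots]\,\beta^2 m\,(3X^2Y - Y^3)\right]\\
B_Y &= -K\left[\,X + [\ldots]\,\beta^2 m\,(X^3 - 3Y^2X)\right]
\end{aligned}
$$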
with $m = m_1$ in $Q_1$ and $m = m_2$ in $Q_2$. After integration of the equations by the method described above, he arrives at the following expressions for the spherical aberration, for an incident beam of radius $r_0$ parallel to the axis (all dimensions in cm):

$$
\begin{aligned}
\Delta X &= (189 + 0.01m_1 + 193m_2)\,r_0^3\cos^3\theta + (66.2 + 0.5m_1 - 596.6m_2)\,r_0^3\cos\theta\sin^2\theta,\\
\Delta Y &= (-918 - 50.1m_1 + 2870m_2)\,r_0^3\sin^3\theta + (66.2 + 9.31m_1 - 75.1m_2)\,r_0^3\sin\theta\cos^2\theta.
\end{aligned}
$$

In the same way we would have for the distortion:

$$
\begin{aligned}
\Delta X_0 &= (17.7 + 0.139m_1 + 18.7m_2)\,X_0^3 + (33.6 - 13.6m_1 - 237m_2)\,Y_0^2X_0,\\
\Delta Y_0 &= (-20.6 - 438m_1 + 79.2m_2)\,Y_0^3 + (12.1 + 0.034m_1 - 35.6m_2)\,X_0^2Y_0.
\end{aligned}
$$
It is therefore impossible to cancel $\Delta X$ and $\Delta Y$ completely by a judicious choice of $m_1$ and $m_2$; one can only hope to reduce them. Moreover the expressions which are obtained show that, in equal operation, the influence of the correctors on the first lens is practically negligible; a correction on the second lens of the doublet should be sufficient in practice.
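The impossibility of a complete cancellation can be seen numerically: the spherical-aberration expressions contain four coefficients, each linear in $m_1$ and $m_2$, so two free parameters can null at most two of them. The sketch below, which uses the numerical coefficients as read from the expressions above and an illustrative solving strategy (not Reisman's own procedure), cancels the two coefficients of $\Delta X$ and then evaluates what remains in $\Delta Y$.

```python
# Sketch: two corrector strengths (m1, m2) cannot null all four third-order
# coefficients. Coefficient values are read from the expressions above;
# the solving strategy and all names are illustrative assumptions.

# Each coefficient has the form c0 + c1*m1 + c2*m2.
COEFFS = {
    "dX_cos3":     (189.0, 0.01, 193.0),
    "dX_cos_sin2": (66.2, 0.5, -596.6),
    "dY_sin3":     (-918.0, -50.1, 2870.0),
    "dY_sin_cos2": (66.2, 9.31, -75.1),
}

def evaluate(name, m1, m2):
    """Value of one aberration coefficient for given corrector strengths."""
    c0, c1, c2 = COEFFS[name]
    return c0 + c1 * m1 + c2 * m2

def null_pair(name_a, name_b):
    """Solve the 2x2 linear system cancelling two chosen coefficients (Cramer's rule)."""
    a0, a1, a2 = COEFFS[name_a]
    b0, b1, b2 = COEFFS[name_b]
    det = a1 * b2 - a2 * b1
    m1 = (-a0 * b2 + a2 * b0) / det
    m2 = (-a1 * b0 + a0 * b1) / det
    return m1, m2

m1, m2 = null_pair("dX_cos3", "dX_cos_sin2")
residuals = {name: evaluate(name, m1, m2) for name in COEFFS}

for name, value in residuals.items():
    print(f"{name:12s} {value:12.3f}")
```

With the two $\Delta X$ coefficients set to zero, the two $\Delta Y$ coefficients remain far from zero, which illustrates why only a reduction, and not a cancellation, can be hoped for.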
D. Chromatic Aberration and “Mass” Aberration (57)
1. Chromatic Aberration. The first order trajectories obtained with particles accelerated through a voltage $\phi_0$ are more or less perturbed when this voltage varies slightly, by an amount $\Delta\phi_0$ such that $\Delta\phi_0/\phi_0 \ll 1$.
For a slightly less rapid particle, the lens will be more convergent (or divergent), and vice versa. If one sends into the lens, or into a more complicated system, a beam which is not monoenergetic, the focals will be enlarged, and one may define a constant of chromatic aberration from the width of the beam in the Gaussian plane, with $\alpha_X = r_0/f_X$ and $\alpha_Y = r_0/f_Y$. One may also define an “aberration figure,” $r_0$ being the radius of the useful aperture of $Q_1$. Since this aberration is of the first order, it is possible to calculate it rapidly from the first order equations. If $\Delta X_3$, $\Delta Y_3$, $\Delta X'_3$, and $\Delta Y'_3$ denote the displacements in the exit plane of $Q_2$ between a trajectory corresponding to $\phi_0$ and a trajectory corresponding to $\phi_0 + \Delta\phi_0$, one will find in the Gaussian plane, situated at a distance $-X_3/X'_3$ from this exit plane (see Fig. 24),

$$\Delta X = \Delta X_3 - \frac{X_3}{X'_3}\,\Delta X'_3$$
and likewise in zOY.

FIG. 24. Chromatic aberration.

Taking the perturbation $\Delta\phi_0$ to be very weak, we obtain $\Delta X_3$ and $\Delta X'_3$ by direct differentiation of the general equations giving $X_3$ and $X'_3$, by writing

$$\frac{dX_3}{d\phi_0} = \frac{dX_3}{d\beta}\,\frac{d\beta}{d\phi_0}$$

with (magnetic case)

$$\frac{d\beta}{\beta} = -\frac{1}{4}\,\frac{d\phi_0}{\phi_0}$$

or (electric case)

$$\frac{d\beta}{\beta} = -\frac{1}{2}\,\frac{d\phi_0}{\phi_0},$$

that is,
and similarly for $X'_3$. For a symmetrical doublet with $\Delta\phi_0/\phi_0 = \pm 2\times10^{-3}$ (the case of the linear CERN accelerator, corresponding to a proton energy spectrum of ±100 kev at 50 Mev), $\beta = 2$ meter$^{-1}$, $\beta L = 0.4$, and $D = 0.3$ meter, when the incident beam is parallel to the axis ($r_0 = 3$ cm) one finds, for $f \approx 3.6$ meters, $(C_c) = 0.6$, that is, $(\tau_c)_X \approx 1.2\,\times$ […]. In comparing this with the aperture aberration calculated above, we see that the chromatic aberration is less. This would not be the case with a beam issuing directly from a cyclotron, where the spectrum of energies is broader (66).
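The two logarithmic derivatives used in the differentiation above can be checked with a short numerical sketch. It assumes the non-relativistic scaling $\beta \propto \phi_0^{-1/4}$ for a magnetic lens and $\beta \propto \phi_0^{-1/2}$ for an electric lens, which is what the relations $d\beta/\beta = -\tfrac14\,d\phi_0/\phi_0$ and $d\beta/\beta = -\tfrac12\,d\phi_0/\phi_0$ imply; the function name is illustrative, and at 50 Mev the non-relativistic form is only approximate.

```python
# Sketch of the scaling used when differentiating with respect to phi0.
# Assumption (non-relativistic): beta varies as phi0**(-1/4) in a magnetic
# lens and as phi0**(-1/2) in an electric lens.

def relative_beta_change(relative_voltage_change, exponent):
    """Exact relative change of beta for beta proportional to phi0**exponent."""
    return (1.0 + relative_voltage_change) ** exponent - 1.0

spread = 100e3 / 50e6                      # +100 kev at 50 Mev
magnetic = relative_beta_change(spread, -0.25)
electric = relative_beta_change(spread, -0.5)

print(f"energy spread        : {spread:.1e}")
print(f"d(beta)/beta magnetic: {magnetic:.3e}")
print(f"d(beta)/beta electric: {electric:.3e}")
```

For this small spread the exact changes agree with the linearized values $-\tfrac14$ and $-\tfrac12$ times the spread to within a fraction of a percent, which is why the first-order treatment is adequate here.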
2. “Mass” Aberration. In the case of magnetic lenses, the method of calculation used here suggests an immediate analogy between the influence of a displacement $\Delta\phi_0$ and that of a displacement $\Delta M$ of the mass of the incident particle. For a given particle $M_0$ is constant, but there often exist in a given incident beam parasitic ions of mass $M_i \neq M_0$ which have a given displacement

$$\Delta M_i = M_i - M_0.$$
We have the relation
Knowing the constant of chromatic aberration, one may therefore calculate, for ions of neighboring masses, the spreading of the image produced by the parasitic ions when the system is adjusted as well as possible for the principal ion. In systems for mass separation of high-energy particles, the separation achieved depends on the chromatic and “mass” aberration of the magnetic lenses which may be used to focus the beam of particles. Courant and Marshall (65a) give quantitative relations for the mass separation and momentum dispersion in a sequence of N identical periods, each period being composed of a classical velocity filter (with crossed electrostatic and magnetic fields) and several quadrupole lenses.
III. PRACTICAL REALIZATION OF LENSES AND MEASUREMENT OF FIELDS
A. Practical Realization of Magnetic Lenses
1. Form of the Pole Pieces. We have seen that the correction of the third-order aberrations demands the realization of a complex and as yet rather poorly known distribution of scalar potential $\phi(X,Y,z)$ in the useful space ($r < a$). For applications where this correction is not thought to be useful, and for weak and long lenses (in which the trajectories are slightly inclined and the relative importance of the leakage fields is very much reduced), one tries to realize, with electrodes of finite width, a distribution as purely quadrupolar as possible for $\phi(X,Y,z)$, by eliminating as much as possible the higher terms.

a. Hyperbolic electrodes. A first solution, adopted by numerous experimenters, consists of furnishing the lenses with hyperbolic electrodes of the greatest possible width E. This solution is expensive, for the machining of pieces having such a profile is extremely delicate. Moreover, the study of fields has shown that the amplitude of the perturbing terms in $\phi(X,Y,z)$ varies very little with the form of the electrodes; it appears advantageous to give up the hyperbolas for pieces of an approximate but simpler shape, circles for example.

b. Circular electrodes. Its very simple mechanical realization makes this solution attractive, and this form has been the object of a thorough experimental study by Dayton, Shoemaker, and Mozley (66). These authors have measured (see below) the harmonics of the radial field, the expression for which is developed in the form:
$$B_r = -\frac{\partial\phi}{\partial r} = -h_2 r\sin 2\theta - h_6 r^5\sin 6\theta + \cdots$$

(which gives the coefficients $h_2$, $h_6$, $h_{10}$ for the potential $\phi(r,\theta)$) for circular electrodes of radius $R_1$ (Fig. 25) and of width $E/a = 2$, with $a = 2.54$ cm.
FIG. 25. Form of the poles with circular cross section.
The experiment gives the results in Table III.

TABLE III

R1/a     Zone             h6/h2 (x 10^-4)    h10/h2 (x 10^-4)
1.25     central zone       202 ± 3            -124 ± 5
1.25     leakage field      -21 ± 8             -81 ± 8
1.125    central zone       -51 ± 2            -125 ± 5
1.125    leakage field     -206 ± 8             -78 ± 8

Experiment shows that it
is possible to cancel $h_6$ in the interior by choosing $R_1/a \approx 1.15$. For each value of E, there thus exists a value of $R_1/a$ canceling $h_6$. When $h_6 \neq 0$, the curves giving the radial gradient are different in the two directions OX and Ox; but in any case, for the values of X or x near a, the gradient is less than the gradient on the axis, for $h_{10}$ is negative. For $h_6 = 0$, the two curves are practically identical, and the difference $[K(r) - K(0)]/K(0)$ on the gradient reaches 10 or 12% for $r = a$. The circle of radius $R_1/a = 1.125$ corresponds in practice to the osculating circle of the theoretical hyperbola (H), while the other circles, tangent to (H) at its vertex, cross it again at two points. For $E/a = 2.5$ and $R_1/a = 1.15$, the measurements (57) show that the higher terms still exist, the relative decrease of the gradient reaching 8% on OX and 4% on Ox, at $r = a$. Measurements carried out with hyperbolic poles (67) such that $E/a = 2$ give $h_6/h_2 \approx 50\,\times$ […] in the central zone ($z = 0$).

c. Composite electrodes. In lenses which demand very strong excitations, the width E must be reduced to leave room for a larger number of conductors (or to avoid breakdowns, in the electrostatic case). Thus lenses have been studied in which $E/a = 1.1$ with $R/a = 1.1$ (57). In order to try to compensate for this narrowing, the profile is composed of a portion of a circle followed by portions of planes exterior to the hyperbola (H) (Fig. 26). The fall-off of the gradient attains 5% on OX and 20% on Ox, the variations being shown in Fig. 27.
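The two numerical statements made for circular poles (cancellation of $h_6$ near $R_1/a \approx 1.15$, and a gradient change of 10 to 12% at $r = a$ when $h_6 = 0$) can be reproduced with a short sketch. It assumes the harmonic expansion $B_r = -\sum_n h_n r^{n-1}\sin n\theta$ with r expressed in units of a, reads the central-zone values of $h_6/h_2$ as 202 and -51 (in units of $10^{-4}$) for $R_1/a = 1.25$ and 1.125, and interpolates linearly between them; the interpolation and the function names are assumptions made here.

```python
import math

# Sketch based on the harmonic expansion of the radial field,
#   B_r = -sum_n h_n r**(n-1) sin(n*theta),   r in units of a.
# The h6/h2 values are central-zone ratios as read from Table III
# (units of 1e-4); linear interpolation is an assumption made here.

def radius_cancelling_h6(r_low, h6_low, r_high, h6_high):
    """Linear interpolation of R1/a at which h6 changes sign."""
    return r_low + (r_high - r_low) * (0.0 - h6_low) / (h6_high - h6_low)

def relative_gradient_change(r, theta, ratios):
    """[K(r) - K(0)] / K(0) along azimuth theta; ratios maps n -> h_n / h_2."""
    return sum((n - 1) * ratio * r ** (n - 2) * math.sin(n * theta) / math.sin(2 * theta)
               for n, ratio in ratios.items())

r_zero = radius_cancelling_h6(1.125, -51.0, 1.25, 202.0)
change = relative_gradient_change(1.0, math.pi / 4, {6: 0.0, 10: -1.25e-2})

print(f"R1/a cancelling h6 (interpolated): {r_zero:.3f}")
print(f"relative gradient change at r = a: {change:.4f}")
```

With $h_6 = 0$ and $h_{10}/h_2 \approx -1.25\times10^{-2}$ the change at $r = a$ is about $-11\%$, in line with the 10 to 12% quoted above, and the interpolated radius falls close to 1.15.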
FIG. 26. Composite profile: circle (C) and plane (P), for narrow poles.
The poles of the quadrupole lenses arranged for the synchrotron at CERN, and studied by Van der Meer (68), have an analogous profile, in order to try to maintain the equivalent length $L_g$ constant in the entire space.

FIG. 27. Gradient along OX and Ox obtained with the poles of Fig. 26.

2. The Magnetic Circuit. a. Description. The poles fix the layout of the field; all the calculations given above are valid only if the pole surfaces are effectively equipotentials for the function $\phi$. Moreover the relation
is valid only if the sum of the ampere turns outside the gap is negligible. To achieve strong gradients without disturbing the intended distribution of field, it is therefore necessary to have a well constructed magnetic circuit. A quadrupolar symmetry of the entire lens is necessary, at least in a large zone surrounding the throat circle, to avoid asymmetry in the leakage fields. Therefore one preferably chooses a magnetic yoke of square or circular form, on which are fixed the pole masses supporting the poles. Square lenses seem to be easier to construct and assemble with precision. The poles are often separate from the pole masses; the machining of poles with a circular cross section is thus made easy.

FIG. 28. Distribution of flux and field lines.

b. Distribution of flux in the circuit. Let us consider a portion of the circuit (Fig. 28). The flux coming out of the pole $N_1$ will distribute itself equally between the two poles $S_1$ and $S_2$. But this flux is composed of three parts: (1) the useful flux $\Psi_P$ coming out of the pole itself, (2) the leakage flux $\Psi_E$ crossing the lateral gaps and penetrating the sides of the pole piece, and (3) the leakage flux $\Psi_S$ from the ends, penetrating at the end (and responsible for the component $B_z$). The total flux $\Psi_T$ crossing a plane $P_h$ situated at a distance h from the entrance is given by

$$\Psi_T = \Psi_P + \Psi_E + \Psi_S;$$

$\Psi_E$ and $\Psi_S$ are functions of h, and local saturation must be avoided; the induction in the magnetic material should therefore be less than the saturation induction $B_s$. This is of the order of 20,000 to 23,000 gauss for very soft rolled steel, 23,000 gauss for Armco iron, and 24,000 to 25,000 gauss for special cobalt steels. But the difficulties of machining and the net price decide against these latter two, for the gain in $B_s$ is relatively too weak, and moreover, it is difficult to obtain them in sizeable and homogeneous pieces. In order to avoid saturation at high excitations, it is necessary to use a pole piece M of cross section which increases regularly between the pole and the yoke; but one thus increases the term $\Psi_E$, and the gain is not proportional to the increase of surface. Moreover one reduces the cross section of the windings, which makes it necessary to use high current densities and to cool the conductors. Finally one may reduce $\Psi_P$ by diminishing the width of the poles, but this disturbs the distribution of field. A compromise therefore has to be established in each particular case. The cross section of the exterior yoke should be larger than half of that of the pole pieces. An experimental study by Hubbard (69), carried out on pole pieces of different shapes, leads to approximate empirical expressions (which seem a little optimistic) and shows that saturation will always occur between the pole and the yoke; if the cross section $S(h)$ of M remains constant as one goes away from P (Fig. 28), the induction $B(h) = \Psi_T/S(h)$ is equal to $B_s$ at the junction of M and of the yoke ($h = 0$). If on the contrary $S(h)$ increases very rapidly, $B_s$ is reached in the middle of M as a result of the rapid increase of $\Psi_E$, but the cross section of the conductors then becomes too small. The induction $B_a$ at $r = a$ will be even stronger when one can place the saturated zone nearer the summit of the pole pieces. For certain lenses operating far away from saturation, one may on the contrary reduce the cross section of the pole masses and render them cylindrical; the nonsaturated poles remain equipotential surfaces, and the fabrication of the windings is facilitated thereby (70). The curves giving $\Psi_T$ between the vertex of the poles and the yoke (57, 69) show that the flux grows very rapidly starting at the pole and that at a distance from the pole of the order of a, one always has:

$$2 < B/B_a < 2.5 \text{ to } 3.$$
For a very soft steel, one then has $B_s \approx 23{,}000$ gauss. It is therefore, in practice, impossible to exceed a maximum induction $B_a$ of 11,000 to 12,000 gauss, even with very strong excitations. If one stays far away from saturation, that is to say in the linear zone of increase of field with current, the upper limit to $B_a$ will be of the order of 7000 to 8000 gauss. This limit fixes that of the gradient, $K = B_a/a$.
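The arithmetic behind these limits is simple enough to sketch. The code below assumes an ideal quadrupole field, for which the gradient is $K = B_a/a$, and takes $a = 4$ cm purely as an example aperture (the value of the experimental lenses described further on); the function names are illustrative.

```python
# Rough sketch of the saturation argument above. The relation K = B_a / a
# assumes an ideal quadrupole field (B proportional to r); the aperture
# a = 4 cm is only an example value.

def max_pole_tip_induction(b_saturation, flux_ratio):
    """Largest induction at r = a when the iron reaches saturation (gauss)."""
    return b_saturation / flux_ratio

def gradient(b_pole_tip, aperture_radius_cm):
    """Field gradient in gauss/cm for an ideal quadrupole."""
    return b_pole_tip / aperture_radius_cm

b_a_limit = max_pole_tip_induction(23000.0, 2.0)     # most favorable ratio B/B_a = 2
k_linear = [gradient(b, 4.0) for b in (7000.0, 8000.0)]

print(f"B_a at saturation : {b_a_limit:.0f} gauss")
print(f"K in linear regime: {k_linear[0]:.0f} to {k_linear[1]:.0f} gauss/cm")
```

The most favorable ratio $B/B_a = 2$ gives 11,500 gauss at the pole tip, which is the middle of the 11,000 to 12,000 gauss range quoted above.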
In conclusion it is wise to choose an extra soft steel which satisfies the following criteria: (1) $B_s$ is as strong as possible: the pole masses and the poles can be made of Armco iron. (2) Very weak remanent field. (3) Small total price: for the yoke one may use rolled steel, to avoid increasing a little the cross sections (forged steel or even very homogeneous cast iron for big lenses). For lenses operating in a pulsed arrangement, the circuit must necessarily be laminated and formed by an assembly of plates of 1 or 2 mm thickness, insulated from each other.

3. Excitation Coils. In the leakage field the magnetic field belonging to the conductors which form the return lines of the winding is superimposed on the field of the lens. The precise distribution of field requires the same symmetry for the windings as for the magnetic circuit. One finds only a few quantitative estimates, in existing publications, of the perturbing effect created by a current supply with two windings; such lenses are nonetheless often constructed, for it is simpler to construct two windings of rectangular cross section filling all the open space between two adjacent poles and the square yoke than four windings with trapezoidal cross section. Nonetheless one moves the conductors as far as possible from the useful volume, by raising them up on the framework (IS).
Measurements show that the presence of the windings causes a “point effect” which is very marked at the ends of the poles (see Figure 40); this effect appears only if the pole has a winding (72). Depending on the density of current put through them, the windings will be of plain copper wire or in the form of a tube through which a current of water runs, as for classical electromagnets.
FIG. 29. Field-current characteristic obtained along OX at X = a; Curve 1: $Q_1$ (a = 4 cm), pole pieces of constant cross section; Curve 2: $Q_3$ (a = 4 cm), trapezoidal pole pieces.

FIG. 30. Cross section of the pole pieces of $Q_1$ and $Q_3$.
4. Field-Current Characteristics. After the calculation of a lens and its practical realization, a good means of checking consists in measuring the field-current characteristic at a point situated near a pole, in the central plane ($z = 0$). As nI increases, one should obtain a curve analogous to the magnetization curve of a very soft steel, with a very weak remanent field. Figure 29 shows such a characteristic curve for a weak lens with straight pole masses of width E = 10 cm and length $l_T$ = 10 cm, having poles of radius $R_1/a = 1.15$, of thickness e = 4 cm, with a = 4 cm. The square frame has a thickness of 6 cm (see Fig. 30a). The normal point of operation is situated in the region where nI is between 2500 and 5000 ampere turns, that is to say far from saturation. Figure 29 also shows curve (2) for a lens $Q_3$ having the same value of $L_T = l_T + e$, but whose whole mass has been given a special profile (57) (Fig. 30b and Fig. 26), the factor a remaining constant. We see the improvement which is obtained at strong excitations; the saturated zone is displaced from the yoke toward the middle of the piece. Figure 31a is a photograph of the lens $Q_1$ which was used for these trials, and Figure 31b is a photograph of a lens with a round frame utilized by Lynch and Zaffarano (72).
B. Practical Realization of Electrostatic Lenses
1. Limitation at High Gradients. Electrostatic lenses present numerous theoretical advantages over magnetic lenses: greater ease of construction, greater lightness, and current sources which are weaker and easier to stabilize; but also two disadvantages: (1) they must be placed in the vacuum chamber, which makes it necessary to enlarge it in order to accommodate insulators of sufficient length and to install high voltage leads, and does not allow any displacement, and (2), more seriously, the maximum value of voltage which can be applied safely to the electrodes is limited by breakdown phenomena. The maximum difference of potential which may be applied between two neighboring electrodes depends very little on their form and on their nature if the surfaces are of a sufficient radius of curvature, but is a complex function of distance, pressure (73), and also of the ionization of residual gas, which is caused by emanations from the insulators and rubber gaskets and by the beam of particles to be focused. In a very good vacuum, one may say that the value of voltage is limited by the breakdowns along the insulators separating the electrodes from ground. Considerable progress has been made recently (74). One must note here that in the domain of corpuscular optics, tolerances are very strict. One must not cross the threshold of voltage which causes pre-breakdowns. Moreover, the lenses are often mounted in sequence, or in the interior of an accelerator which is difficult to take apart and which must operate without disturbance during long periods; it is necessary to have a sufficient margin of safety. Many publications (65, 75-80) relative to lenses intended for the focusing of protons of some million electron volts set the maximum voltages at the order of 30 to 40 kv (that is, $\Delta\phi_1 = 2\phi_1 \approx 60$ to 80 kv between adjacent electrodes separated by a minimum of 2 cm).
FIG. 31. Two types of apparatus: (a) lens with square circuit [P. Grivet and A. Septier, CERN 68-26 (1958)]; (b) round lenses [P. J. Lynch and P. J. Zaffarano, I.S.C. 927 (1957)].
Attempts which were made to arrange such lenses in the drift tubes of a linear ion accelerator (81) between 4 and 32 Mev were abandoned as a result of the frequent breakdowns, the necessary maximum gradients being of the order of 60 kv/cm.

2. Some Realizations. Several electrostatic lenses have been realized by the author to study the aberrations of electrostatic doublets; Fig. 32 shows one of these lenses, the electrodes of which are exactly similar to those of the magnetic lens of Fig. 31a. The maximum voltage applied does not exceed $2\phi_1 = 20$ kv. Figure 33 shows the lenses used by Vivargent (65) for a beam of protons of 4 Mev, with $2\phi_1 = 80$ kv. The electrodes are realized here in stainless steel with hyperbolic pole pieces; the insulators are made of Teflon or of plexiglass for the first, and of porcelain for the second. Two voltage leads are enough for a symmetric doublet, the electrodes of the same polarity being tied together in vacuum; but it is necessary to move the connecting cables far away from the useful volume in order to avoid distortions of field. In the case of the lens with circular electrodes, a simplified construction consists of using cylindrical tubes or rods (75, 80) of suitable radius ($R_1/a \approx 1.15$).

FIG. 32. Electrostatic experimental lens, with the same electrodes as those of the magnetic lens of Fig. 31a.
FIG. 33. Electrostatic lens for use at high voltage [M. Vivargent, Thesis, Paris (1958)].
It is possible to tie two of the electrodes to the frame of the instrument and to apply the total voltage $2\phi_1$ to the other pair of electrodes. This artifice allows the high-voltage source to be taken from the general accelerating voltage; the ratio $\phi_1/\phi_0$ will then be constant, whatever the fluctuations of $\phi_0$, and the optical properties will remain unchanged. The particles will undergo a slight slowing down on entering and an acceleration at the exit, from which they suffer a weak supplementary convergence (and perhaps also supplementary aberrations).
C. Magnetic Measurements
1. Quantities to be Measured. Calculation does not permit exact knowledge of the distribution of field in a real lens, and especially in the leakage fields. Moreover the electrolytic tank or the resistance network can only give it in a transverse section, or in a plane of symmetry such as zOx. Measurements on electric lenses are nonexistent, but it is possible here to make a magnetic model, since the pole surfaces remain equipotential whatever the excitation. The interesting quantities are the following:
(1) The transverse field, that is to say, the components $B_X$ and $B_Y$ in the plane $z = 0$, as well as in the leakage fields, as a function of X and Y.
(2) The longitudinal distributions $B(z)$ of these components along parallels to Oz.
(3) The longitudinal field $B_z$ in the leakage fields.
(4) The radial gradient $K(r)$ and its variations in different directions as a function of the distance from the axis.
(5) The characteristic function $k(z)$ along the parallels to Oz.
(6) The distortions of the scalar potential, that is to say the coefficients of the higher terms in the series development of $\phi(X,Y,z)$.
(7) Finally, from the preceding values $B(z)$ and $k(z)$, the equivalent lengths of the field and of the gradient, and their variations across the gap.

2. Apparatus Used. a. Measurements of field. As a consequence of the strong gradients which are used, measurements by nuclear or paramagnetic resonance are not very convenient. It is possible to use either (1) gaussmeters of bismuth wire (82), or (2) Hall-effect gaussmeters, or (3) short or long coils whose movements in the field give an induced emf (58, 72), or (4) coils tied to a sensitive flux meter (66). The first mentioned give only a poor precision (2% at best, in relatively strong fields).

The turning coils which we have used (57) are short coils with a great number of turns and high impedance, carried on a rotating axis parallel to Oz and normal to their own axis; they give a signal v proportional to the transverse component $B_t$ of the field. Driven by a synchronous motor, the fundamental signal at 25 cps is of the order of 1 mv/gauss; the use of brushes composed of a mixture of carbon and silver, sliding on silver rings, and of a low-pass filter eliminating the parasitic components of the signal, allows the parasitic background noise to be reduced to some tens of microvolts. The rotating shaft of bakelite passes through a rigid brass tube and through graphited bearings, which reduces its vibrations as much as possible. The amplified voltages are read on a vacuum tube voltmeter, which is calibrated beforehand. The precision of the arrangement is better than 1%.
Another type of rotating coil is formed of several turns of rectangular cross section (5 × 700 mm) wound in slots cut in a mandrel of plexiglass. If the ends of the turns are placed at points where $B = 0$, with the long side parallel to Oz, the signal furnished is proportional to the integral

$$A(r) = \int B(r,z)\,dz$$

and therefore to the equivalent length of the field $L_B(r)$. By displacing the coil in the gap, the variations of $A(r)$ give information about those of $L_B(r)$. All of these rotating coils give a signal of the form:

$$v = \sum_{n=1} a_n\sin n\omega t$$

where $\omega$ is the angular velocity of rotation. The first harmonic above the fundamental is of the form:

$$v_2 = a_2\sin 2\omega t$$

and the coefficient $a_2$ is proportional to the transverse gradient. If the signals from two identical coils mounted on the same shaft parallel to Oz are set in opposition, the resulting signal is proportional to $v_2$, and allows one to follow the transverse distribution of the gradient. This signal is very useful for carrying out the alignment of the axis Oz of the lens with the direction of displacement of the measuring instrument: on the axis Oz, $v_1 = 0$ and $v_2 \neq 0$. $v_2$ is observed on an oscilloscope screen; it is a sinusoid of frequency 50 cps; a misalignment of 1 or 2 × 10$^{-2}$ mm gives a sufficient signal $v_1$ to distort the curve. After alignment, a low-pass filter stops $v_2$ at all points; v is then practically 0 on the axis.

The measurement of $B_z$ is more delicate. A long solenoid, parallel to Oz, is used, vibrating parallel to Oz. If the end A is at a point where $B_z = (B_z)_A$ and the end A' is at a point where the field is 0, and if the frequency of vibration is $f_0$, we have the relation (83):

$$v = (n\pi a^2\omega_0)\,(B_z)_A\,z_0\cos\omega_0 t + (\text{higher terms})$$

with $\omega_0 = 2\pi f_0$; n = number of turns per unit length; a = radius of the coil; $z_0$ = the amplitude of the movement. A coil of 300-mm length with close-wound turns, wound on a glass tube of 5-mm diameter, vibrates at a frequency $f_0 = 50$ cps, thanks to an electrodynamic motor similar to that of a loud speaker. The tube is suspended by springs which form an oscillating system without damping on the axis of a rigid tube, provided with centering screws. The alignment of the assembly is quite delicate, but permits measurements with a precision of 1 to 3% of the amplitude of $B_z$. The sensitivity reaches 0.3 mv/gauss.
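To give a feeling for the magnitudes in the vibrating-solenoid relation, the sketch below evaluates the amplitude of its leading term, $n\pi a^2\omega_0 (B_z)_A z_0$, in SI units. The coil radius and frequency are those quoted above; the winding density and the vibration amplitude are not given in the text and are hypothetical values chosen only to exercise the formula, so the result should not be compared with the quoted sensitivity.

```python
import math

# Sketch of the leading term of the vibrating-solenoid signal,
#   v = n * pi * a**2 * omega0 * (B_z)_A * z0 * cos(omega0 * t).
# SI units are used here; the winding density and the amplitude z0 below
# are hypothetical values chosen only to exercise the formula.

def signal_amplitude(turns_per_m, coil_radius_m, frequency_hz, b_z_tesla, z0_m):
    """Peak emf (volts) of the fundamental term."""
    omega0 = 2.0 * math.pi * frequency_hz
    return turns_per_m * math.pi * coil_radius_m ** 2 * omega0 * b_z_tesla * z0_m

GAUSS = 1e-4  # one gauss expressed in tesla

amplitude = signal_amplitude(
    turns_per_m=20000.0,      # hypothetical winding density
    coil_radius_m=2.5e-3,     # glass tube of 5-mm diameter
    frequency_hz=50.0,        # f0 = 50 cps
    b_z_tesla=1.0 * GAUSS,    # signal per gauss of (B_z)_A
    z0_m=1.0e-3,              # hypothetical vibration amplitude
)

print(f"amplitude per gauss: {amplitude * 1e6:.1f} microvolt")
```

Because the signal is linear in n, $a^2$, $f_0$ and $z_0$, the same function shows directly how the sensitivity scales when any one of these is changed.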
b. The transverse gradient. A direct measurement of the transverse gradient

$$K(r) = \frac{\partial B_Y}{\partial X} = \frac{\partial B_X}{\partial Y} = \frac{\partial B_r}{\partial r}$$

is possible with a short coil carried on a plexiglass rod parallel to Oz, vibrating in a radial direction. If the axis of the coil is parallel to OY, and the displacement $\xi$ parallel to OX (Fig. 34a), with an amplitude $x_0$,

$$\xi = x_0\cos\omega t,$$

and if $x_0$ is sufficiently small, the signal $V_0$ is written

$$V_0 = k\,x_0\,K(r)\cos\omega t.$$

The plexiglass rod, which has its other end fixed (Fig. 34b), carries near the fixed point two disks of mild steel; two little electromagnets fed by currents of frequency f, which may be adjusted, cause them to vibrate. When f corresponds to one of the resonant frequencies of the rod, the motion takes on a very large amplitude $x_0$. We have thus used the harmonic of order 2 ($f \approx 85$ cps); for $2x_0 = 2.5$ mm, the sensitivity reaches $10^{[\ldots]}$ volt/gauss/cm. The amplitude $x_0$ can be maintained constant by stabilizing the exciting current and using a strong negative feedback furnished by the signal itself (57).

FIG. 34. Measurement of the radial gradient: (a) short coil vibrating along OX; (b) mounting scheme: T, plexiglass rod; $A_1$, $A_2$, exciting coils; D, soft iron discs; B, measuring coil.

c. Distortions of the scalar potential. The measurements of gradient in the directions OX and Ox give qualitative information on the lack of symmetry. But one can obtain more exact values by two methods: magnetic measurements or electrolytic tank.
First method: The potential $\phi(r,\theta)$ is developed in series, and so is the radial field $B_r = -\partial\phi(r,\theta)/\partial r$, in the form

$$\phi(r,\theta) = \sum_{n=1}^{\infty} h_n r^n \sin n\theta$$

$$B_r(r,\theta) = -\sum_{n=1}^{\infty} h_n r^{n-1} \sin n\theta.$$

The coefficients $h_n$ are the same in the two cases; the origin of the angles $\theta$ is taken on OX. On a circle of radius $r_2$, a coil $B_2$ of small dimensions, with its axis directed along Or, is displaced while it is connected to a very sensitive and stable flux meter (for example a photoelectric cell flux meter, tied to a counter, such as that of Sauzade (84), having a maximum sensitivity of 10 maxwell/volt). If part of the signal is opposed by that of a second coil $B_1$ placed on a circle of smaller radius $r_1$, with the same azimuth $\theta$, the signal which is due to the fundamental term

$$(B_r)_2 = -h_2 r \sin 2\theta$$

may be eliminated. The curve obtained from the remaining signal is traced as a function of $\theta$, and a harmonic analysis furnishes the value of the coefficients $h_6$ and $h_{10}$. In general the higher order terms can be neglected. The opposition of $B_1$ to $B_2$ reduces the amplitude of the harmonic of order n in the ratio

$$\varepsilon = 1 - \left(\frac{r_1}{r_2}\right)^{n-2}.$$

A correction is therefore necessary. Moreover, the fundamental term is obtained from $B_2$ alone. Figure 35 reproduces a curve obtained (66) with a lens having circular pole pieces of radius $R_1/a = 1.25$, from which we derive
=
2.02
x
lo-'
h,o/hz = -1.24 X lo-'
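The correction for the bucking coil can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the coil radii and the coefficients $h_n$ are hypothetical values (not those of the lens measured here), the inner-coil signal is assumed to be scaled by $r_2/r_1$ so that the $n = 2$ term cancels exactly, and the function names are ours. It builds the bucked signal on a circle, extracts the sine harmonics, and divides by the ratio $1 - (r_1/r_2)^{n-2}$ to recover the assumed coefficients.

```python
import math

def bucking_ratio(n, r1, r2):
    """Fraction of the order-n harmonic left after the inner coil (radius r1)
    has been scaled to cancel the n = 2 term seen by the outer coil (radius r2)."""
    return 1.0 - (r1 / r2) ** (n - 2)

def harmonic_amplitude(samples, n):
    """Sine-series amplitude of order n from samples equally spaced over 0..2*pi."""
    m = len(samples)
    return 2.0 / m * sum(s * math.sin(n * 2.0 * math.pi * k / m)
                         for k, s in enumerate(samples))

# Hypothetical values, chosen only for illustration.
R1, R2 = 2.0, 3.0                          # coil radii (cm)
H = {2: 1.0, 6: 2.0e-3, 10: -1.0e-3}       # assumed coefficients h_n

def b_r(r, theta):
    """Radial field in the form B_r = -sum h_n r^(n-1) sin(n*theta)."""
    return -sum(hn * r ** (n - 1) * math.sin(n * theta) for n, hn in H.items())

M = 360
thetas = [2.0 * math.pi * k / M for k in range(M)]
# Outer coil minus scaled inner coil: the fundamental (n = 2) drops out.
bucked = [b_r(R2, t) - (R2 / R1) * b_r(R1, t) for t in thetas]

residual_fundamental = harmonic_amplitude(bucked, 2)
recovered = {
    n: -harmonic_amplitude(bucked, n) / (R2 ** (n - 1) * bucking_ratio(n, R1, R2))
    for n in (6, 10)
}
```

With these assumed radii the harmonic of order 6 is attenuated to about 80% of its value, so neglecting the correction would bias $h_6$ noticeably; the closer $r_1$ is to $r_2$, the larger the correction becomes.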
Second method: Let us consider the cross section of one pole (Fig. 36), of which the portion CD reproduces the theoretical hyperbola (H). In the throat circle
FIG. 35. Curve giving the difference between the real field and a purely quadrupolar field on a circle of radius $r_2$, as a function of the azimuth $\theta$ (in degrees) [after I. E. Dayton, E. Shoemaker, and R. F. Mozley, Rev. Sci. Instr. 25, 485 (1954)]: (1) the solid curve represents the higher-order terms, obtained with two coils bucking; (2) the dashed curve represents the fundamental term, obtained with single outer coil. Fluxmeter readings in volts.
the purely quadrupolar distribution would be obtained if the equipotentials were hyperbolas such as (H) and (H'); this distribution may be calculated; one would then note, for example, the theoretical distribution $V_1(X,Y)$ on the contour ABCDEFO, where AB and EF are chosen arbitrarily. A map from the electrolytic tank will give the real distribution to first
FIG. 36. Determination of the error function $(V_2 - V_1)$: (H), (H'), theoretical hyperbolas; (C), real equipotentials.
order on the contour ABCDEFO, $V_2(X,Y)$. Along this contour, the difference

$$V_2(X,Y) - V_1(X,Y)$$

will cause distortions in the potential. We have $V_2 - V_1 = 0$ on OX and OY, and on CD. Therefore, one realizes in the tank the contour ABCDEF in an approximate fashion with separate electrodes, maintained at voltages chosen according to the values of the function $(V_2 - V_1)$; one may then trace directly the map of the perturbations with the full sensitivity of the apparatus, instead of trying to detect the weak variations of the fundamental term. This method has been used by Van der Meer (68) for establishing the profile of the CERN synchrotron lenses. The differential method copied from the preceding magnetic measurements may also be used in the electrolytic tank, with two probes mounted in opposition.

d. Determination of the equivalent lengths. The curves giving $B_X$, $B_Y$, or $B_x$ and $B_y$ as a function of z in the planes OY, OX, Ox, and Oy, or the curves K(z), are traced on a very large scale, and then integrated with a mechanical integrator. The precision is of the order of 5 x and the final precision on L depends only on the precision of the measurements of field.
D. Distribution of the Transverse Field (85)

We shall now give the results obtained for the lens Q1 of Fig. 31a, with circular pole pieces of radius $R_1/a = 1.15$, with a = 4 cm, E = 10 cm.
The mechanical length is l = 15 cm; each of the four coils is composed of 627 turns of wire of cross section 2 × 4 mm. Normal operation is expected to be between I = 2 and I = 6 amp.

1. Central Zone (z = 0). Figure 37 shows the variation of $B_X$ along OY when I varies; the curves are practically straight lines as far as Y = a, and then the field passes through a maximum at Y = 6 cm (minimum gap). Saturation begins to appear for I > 8 amp [see curve B = f(I) of Fig. 29]. Curve I = 0 gives us information about the remanent field. This has a quadrupolar symmetry, with a gradient of 2 or 3 gauss/cm. The distribution of field is the same along Ox and OX up to a distance from the axis equal to 2a/3. A very small difference then appears; measurements of gradient confirm this; we obtain the curves of Fig. 38, which represent the variations of $\partial B_x/\partial x$ along Ox and of $\partial B_Y/\partial X$ along OX, and which may be compared with those of Fig. 27. The gradient measured in Q1 is about 100 gauss/cm/amp, for I < 8 amp, in very good agreement with the theoretical value. A study carried out at very high intensity on a saturated lens (57) shows
FIG. 37. Variation of $B_X$ along OY in Q1 for different intensities. Abscissa: Y (cm).
FIG. 38. Gradient K(r)/K(0) along Ox and OX in Q1. Abscissa: X, x (cm).
FIG. 39. Function $B_Y(z)$ in zOX of Q1. Abscissa: z (cm).
that the topography of the field remains the same as at very weak excitations; the optical properties will remain unchanged to the third order (aberrations).

2. Leakage Fields (86). One may find, for a given value of I, curves identical to those of Fig. 37 but taking as a parameter the distance z from the origin. As one approaches the leakage field the slope of the curve decreases rapidly, but one notes immediately the dissymmetry existing between the planes Ox and OX. For a fixed value of z, the gradient varies little along Ox; a perturbation makes it grow rapidly for x > a/2, as one approaches the extremities of the pole pieces (near z = 75 mm); a “point effect,” described above, appears, due to the presence of the coils, but much stronger than the simple field of the coils; the field may reach twice the value existing on the poles at z = 0. Rounding or cutting off these angles changes practically nothing. This point effect is even more visible on the curves B(z). Figure 39 gives the distributions of $B_Y$ in OX along the parallels to Oz; we obtain
curves with a central plateau, here about 7 cm long for a = 4 cm and l = 15 cm, with a continuous decrease at the extremities. On the other hand, in Oy (Fig. 40), the curves present maxima which are higher than the plateau for y > a/2.
FIG. 40. Function $B_y(z)$ in zOy of Q1; point effect near the iron. Abscissa: z (cm).
This effect is also found along the curves giving K(z) in the planes Ox and Oy, but it also exists, in an attenuated form, in the planes OX and OY.

3. The Influence of Neighboring Lenses. We have only a few values for the deformations which B(z) or K(z) undergo when several lenses are placed on the same axis with only a small separation D. It is easy to study this influence by replacing, for measuring purposes, the neighboring lenses of lens Q1 by soft steel plates perpendicular to Oz; the image of Q1 in the magnetic material provokes the same variation of field as a lens Q2 situated at twice the distance. Experiment shows that if the separation between Q1 and Q2 is greater than 10a, the distributions are not disturbed; if the gap is less than 5a, the perturbation becomes serious, for this distance is clearly less than the length of the leakage fields.
E. Longitudinal Field $B_z$

1. Distribution of $B_z$ along Parallels to Oz. The curves giving $B_z$ along parallels to Oz have been established. Figure 41 corresponds to these distributions in the plane Ox of Q1, for a = 4 cm. In the limit (x = 4 cm) the value of $B_z$ on the angle of the pole piece is greater than $B_x$. Therefore this
FIG. 41. Distribution of $B_z(z)$ in the plane zOX, for Q1 (a = 4 cm, K(0) = 400 Gs/cm, y = 0). Abscissa: z (cm).
component will not have a negligible influence on the outermost trajectories. $B_z$ is zero in the central zone of the lens and passes through a maximum at $z_m$, a little outside the lens ($z_m$ tends towards l/2 as x tends toward a). The variations of $B_z$ as a function of y have been determined for different values of z; $B_z$ varies as $y^2$ in the neighborhood of the axis.

2. Value of the Integral $\int B_z\,dz$. One of the approximate methods of calculating the aberrations uses the rectangular model and introduces the influence of $B_z$ under the form of perturbations such that

$$\mathcal{J} = \int B_z\,dz = K(0)\,XY$$

localized at the imaginary extremities of the lens. We have integrated the preceding curves and compared the values thus obtained to the theoretical values; there is excellent agreement at all points of the lens. Moreover, for two different lenses, if K(0) is the same, $\mathcal{J}$ should take identical values. We have verified this also for the two lenses Q1 and Q2 with $a_1 = 4$ cm and $a_2 = 6$ cm; in the plane Ox we have indeed a single curve (86), which is a parabola in $x^2$.
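The relation between the integral of $B_z$ and K(0)XY may be checked on a simple model. The sketch below is illustrative only: it assumes a lowest-order potential $\phi = -K(0)\,g(z)\,XY$ with a smooth, hypothetical fringe profile g(z) (a hyperbolic-tangent step, not the measured profile of Q1), neglects the higher-order terms in the z derivatives of g, and integrates $B_z = -\partial\phi/\partial z$ from the center of the lens outward. All names and numerical values are ours.

```python
import math

def b_z(X, Y, z, k0, half_length, width):
    """Longitudinal field B_z = -d(phi)/dz for phi = -k0 * g(z) * X * Y,
    with the assumed profile g(z) = 0.5 * (1 - tanh((z - half_length) / width))."""
    dg_dz = -0.5 / (width * math.cosh((z - half_length) / width) ** 2)
    return k0 * dg_dz * X * Y

def integral_b_z(X, Y, k0, half_length, width, z_max=60.0, steps=6000):
    """Trapezoidal integral of B_z from the lens center (z = 0) to z_max."""
    dz = z_max / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * b_z(X, Y, i * dz, k0, half_length, width)
    return total * dz

# Hypothetical case: K(0) = 400 gauss/cm, half-length 7.5 cm, fringe width 2 cm.
J = integral_b_z(X=2.0, Y=2.0, k0=400.0, half_length=7.5, width=2.0)
J_quarter = integral_b_z(X=1.0, Y=1.0, k0=400.0, half_length=7.5, width=2.0)
```

In this model the magnitude of the integral depends only on K(0) and on the product XY, not on the fringe width, which is the property used when the effect of $B_z$ is concentrated at the ends of the rectangular model. The sign depends on the orientation chosen for the potential.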
F. Equivalent Length L

1. Absolute Value of L along the Axis. We have noted right away that the excitations necessary for a given convergence were less than the excitations calculated using the rectangular model of length l (mechanical length); in the formulas we must consider that

$$L > l.$$

From the measurements of field reported above and from the comparison between theory and experiment (24, 57), one may derive an approximate law giving L as a function of l; if the lens is sufficiently long for a central plateau to exist, the excess $L - l$ depends only on a, and the ratio $(L - l)/a$ decreases as a increases.
For a = 1 cm:  $L \approx l + 1.14a$
For a = 4 cm:  $L \approx l + 1.1a$
For a = 6 cm:  $L \approx l + 0.95a$
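For apertures between the three quoted values, a rough estimate of L may be obtained by interpolation. The sketch below is not a law given in the text: the piecewise-linear interpolation of the coefficient $(L - l)/a$ is our assumption, and the function names are hypothetical.

```python
# (a in cm, coefficient c) pairs taken from the approximate law L = l + c * a.
MEASURED = [(1.0, 1.14), (4.0, 1.10), (6.0, 0.95)]

def excess_coefficient(a):
    """Piecewise-linear interpolation (an assumption) of c = (L - l) / a."""
    if not MEASURED[0][0] <= a <= MEASURED[-1][0]:
        raise ValueError("aperture outside the range covered by the measurements")
    for (a0, c0), (a1, c1) in zip(MEASURED, MEASURED[1:]):
        if a0 <= a <= a1:
            return c0 + (c1 - c0) * (a - a0) / (a1 - a0)

def equivalent_length(l, a):
    """Equivalent length L (cm) for mechanical length l (cm) and aperture radius a (cm)."""
    return l + excess_coefficient(a) * a

L_q1 = equivalent_length(15.0, 4.0)   # lens with l = 15 cm, a = 4 cm
```

For l = 15 cm and a = 4 cm this gives L of about 19.4 cm. The function refuses apertures outside 1 to 6 cm rather than extrapolating, since nothing in the measurements supports the law beyond that range.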
Therefore one may trace in an approximate fashion the curve giving $(L - l)/a$ as a function of a. Moreover it is verified that $L_G = L_B$ along the axis, and that L is independent of the intensity of excitation.

2. Variations with Distance from the Axis.

a. Variation of $L_B$. We must distinguish immediately the two possible values of $L_B$; $L_B$ may be obtained by dividing the area of B(z) by the real value of B(0) on the central plateau ($L_1$), or also by the value which B(0) should have if the field were purely quadrupolar (a gradient rigorously constant with distance):

$$B(0,r)_{\mathrm{theor}} = K(0,0)\,r.$$

The second value ($L_2$) will show better the real convergence of the lens if one keeps track of its variation in the equations of motion. In the lens Q1 described already, with a = 4 cm, we obtain the curves of Fig. 42 by using the second definition. The decrease of $L_2$ with the distance from the axis reaches 2.5% at the throat circle and is practically the same in all directions. In a lens where a = 6 cm, we can notice a difference between the variations of $L_1$ on Ox and OX, but $L_2$ would give practically a coincident curve, with $\Delta L_2/L_2 = -3\%$ at r = a. This variation of L with r appears physically as a decrease in convergence as one goes away from the axis, from which there results a supplementary aberration term, partly opposed to the term due to $B_z$ (increased convergence near the edges, near the planes Ox, Oy). This law remains valid for lenses with two coils (71) despite the additional dissymmetry between the planes Ox and Oy.
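The two definitions of $L_B$ differ only in the field value used to normalize the area under B(z). The sketch below illustrates this on a synthetic trapezoidal profile; the profile, the plateau field of 790 gauss, and the function names are hypothetical and serve only to show the computation.

```python
def trapezoid(zs, values):
    """Area under a sampled curve by the trapezoidal rule."""
    return sum(0.5 * (values[i] + values[i - 1]) * (zs[i] - zs[i - 1])
               for i in range(1, len(zs)))

def equivalent_lengths(zs, b_profile, r, k00):
    """Return (L1, L2) for a profile B(z) sampled at distance r from the axis.
    L1 divides the area by the measured plateau field B(0);
    L2 divides it by the ideal quadrupole value K(0,0) * r."""
    area = trapezoid(zs, b_profile)
    center = min(range(len(zs)), key=lambda i: abs(zs[i]))
    return area / b_profile[center], area / (k00 * r)

def synthetic_profile(z, b0=790.0, plateau=7.0, end=12.0):
    """Hypothetical B(z): flat up to |z| = plateau, linear fall to zero at |z| = end."""
    az = abs(z)
    if az <= plateau:
        return b0
    if az >= end:
        return 0.0
    return b0 * (end - az) / (end - plateau)

zs = [-15.0 + 0.5 * i for i in range(61)]
profile = [synthetic_profile(z) for z in zs]
L1, L2 = equivalent_lengths(zs, profile, r=2.0, k00=400.0)
```

In this example the plateau field is slightly below the ideal value K(0,0)r = 800 gauss, so $L_2$ comes out smaller than $L_1$ in the same proportion, which is how a local weakening of the gradient appears as a reduced equivalent length.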
FIG. 42. Variation of L in the gap, in Q1 (a = 4 cm); $L_B$ or $L_G$. Abscissa: r, X (cm).
b. Variation of $L_G$. The decrease is more rapid than that of $L_B$, in agreement with the theoretical formula (see Sec. I); it reaches 5% for r = a. In the lens where a = 6 cm, we again find the dissymmetry already noted for $L_1$, but increased here.

3. Correction of the Variations of $L_B$ (87). If $L_B$ is constant, $L_G$ will be also; it is therefore sufficient to correct $L_B$. This variation may be canceled in all planes of symmetry by equipping the ends of the pole pieces with small masses of soft steel, which locally increase the mechanical length. The shape and position of these masses are determined by trial and error, and the correction is checked with the assistance of the long coil described above. One may note that a correction carried out only along OX or OY increases the variation in the planes Ox and Oy. For Q1, the masses are cylinders of height 15 mm and of diameter respectively 10 mm (on Ox and Oy) and 25 mm (near OX and OY), for a = 4 cm. The variations of $L_B$ along OX may be equally well compensated by increasing the gradient in this direction; it is enough to reduce the gap
between two adjacent poles slightly (68), but one thereby increases the decrease along OX.
IV. EXPERIMENTAL STUDY OF THE OPTICAL PROPERTIES
A. Introduction

Many partial results are available which give the optical characteristics of lenses or of various combinations of lenses (2 to 4 lenses), obtained with the beam of particles to be focused; that is to say, with a beam whose characteristics are fixed once and for all by the accelerating system, and for fixed object-image distances. General publications exploring the properties of a system in a large region of convergence are much less numerous. Several describe experiments carried out with a beam of accelerated particles, which allows only a study to first order (77, 88). Two methods may be used for a general and thorough study, with different degrees of accuracy: (1) the study of trajectories by means of a conducting wire (“hodoscope” or “floating wire”), which gives the first-order properties with good accuracy, and (2) the study of trajectories by the ordinary methods of corpuscular optics, but with a beam having easily variable characteristics. This second method is particularly well adapted to the study of aberrations.
B. Methods of Study

1. The “Hodoscope” or floating wire method (magnetic lenses). The principle and first realization are described in an article by Loeb (89). This method rests on the analogy which exists between the trajectory of a particle and the equilibrium form of a conducting wire which is traversed by a current i and placed in the same magnetic field. From the equations of motion of the particle we derive

$$d\mathbf{p} + e\,(d\mathbf{s} \times \mathbf{B}) = 0,$$

denoting by $\mathbf{p}$ the momentum of the particle and by $d\mathbf{s}$ an elementary vector tangent to the trajectory. For the wire, if $\mathbf{T}$ is the tension of the wire, and i the current which runs through it:

$$d\mathbf{T} + i\,(d\mathbf{s} \times \mathbf{B}) = 0,$$

neglecting the other forces which are able to act on the wire (weight and rigidity) and denoting by $d\mathbf{T}$ the increase of tension. The vectors $\mathbf{p}$ and $\mathbf{T}$ are parallel to $d\mathbf{s}$, and one may write

$$\mathbf{p} = p\,\frac{d\mathbf{s}}{ds} \quad \text{and} \quad \mathbf{T} = T\,\frac{d\mathbf{s}}{ds},$$

from which, writing $\mathbf{S} = d\mathbf{s}/ds$,

$$d\mathbf{T} = T\,d\mathbf{S} + \mathbf{S}\,dT.$$

As $(d\mathbf{s} \times \mathbf{B})$ is perpendicular to $\mathbf{S}$, $d\mathbf{T}$ is also perpendicular to it, from which it follows that $dT = 0$, since $d\mathbf{s}/ds$ is parallel to $\mathbf{S}$; only the first term, denoting a vector normal to $\mathbf{S}$, persists. In the same way $dp = 0$. The momentum p for a particle, and the tension T for the wire, remain constant in a magnetic field. One then has the equations

$$p\,d\mathbf{S} + e\,(d\mathbf{s} \times \mathbf{B}) = 0, \qquad T\,d\mathbf{S} + i\,(d\mathbf{s} \times \mathbf{B}) = 0.$$
The solutions of the equations are exactly identical in space if p/e = T/i, and if the initial conditions are the same. The ratio $p/e = (B\rho)$ is the “rigidity” of the particle. It is easily expressed as a function of the accelerating voltage $\phi_0$ of the particles:

$$(B\rho)^2 = \frac{2 m_0 \phi_0^*}{e},$$

with $\phi_0^* = \phi_0[1 + (e\phi_0/2m_0c^2)]$ as in all magnetic systems. For nonrelativistic particles $(B\rho)^2 = 2m_0\phi_0/e$; e denotes here the absolute value of the charge. It is therefore necessary to realize the condition

$$T = (B\rho)\,i.$$
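The condition relating the tension to the rigidity is easily evaluated numerically. The sketch below is a minimal illustration in mks units; the constants are standard values, and the function names are ours. It computes the rigidity from the kinetic energy with the relativistic expression $pc = \sqrt{E_k(E_k + 2m_0c^2)}$, which is equivalent to using the corrected voltage $\phi_0^*$, and then the tension required for a given wire current.

```python
import math

C = 299_792_458.0           # speed of light (m/s)
PROTON_REST_MEV = 938.272   # proton rest energy (MeV)
G0 = 9.80665                # standard gravity (m/s^2)

def rigidity_tesla_m(kinetic_mev, rest_mev=PROTON_REST_MEV, charge_number=1):
    """Magnetic rigidity (B rho) = p / e in T*m for a given kinetic energy."""
    pc_mev = math.sqrt(kinetic_mev * (kinetic_mev + 2.0 * rest_mev))
    return pc_mev * 1.0e6 / (charge_number * C)

def wire_tension_newton(kinetic_mev, current_a, rest_mev=PROTON_REST_MEV):
    """Tension T = (B rho) * i that makes the wire follow the particle trajectory."""
    return rigidity_tesla_m(kinetic_mev, rest_mev) * current_a

def gram_weight(tension_n):
    """Mass in grams whose weight equals the given tension."""
    return tension_n / G0 * 1000.0

b_rho = rigidity_tesla_m(5.0)            # protons of 5 MeV
tension = wire_tension_newton(5.0, 1.0)  # wire current of 1 amp
```

For protons of 5 MeV and a current of 1 amp this gives a rigidity slightly above 0.32 mks and a weight of a little over 30 grams; at this energy the relativistic correction changes the result by well under 1%.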
For example, with $(B\rho) = 0.3$ mks (for protons of 5 MeV) and i = 1 amp, one finds that T = 0.3 newton, that is, about a 30-gm weight. Since the tension is constant all along the wire, it is sufficient to apply this tension at one point, the end point of the trajectory for example, by means of a stretched spring or a weight T. There exist several possible causes of error, that is to say, several perturbing forces acting on the wire:
(1) The force exerted by the image of the wire which exists in the magnetic material, and therefore in the four poles.
(2) The force due to the rigidity of the wire. These first two terms are negligible for a well annealed wire, and if T is sufficiently large.
(3) Finally, the weight of the wire. This last is not negligible, but its action may be eliminated if one simply looks for the planar trajectories by arranging the plane of the trajectories horizontally.

Moreover, one should suppress as well as possible the friction of the part which allows the wire to be stretched horizontally by a weight. Finally, for given initial conditions, the equilibrium of the wire in the magnetic field may be stable or unstable. A detailed study of these questions may be found in the article of Carlile (90), and moreover in that of Citron and collaborators (91), who have used this method at CERN to study the trajectories of protons and of mesons in the magnet of the 600 Mev synchrocyclotron.

The practical method is then the following: with the lenses arranged with their plane zOX (or zOY) horizontal, the intersection with the axis is fixed by attaching the wire at this point. The pulley may be displaced in the horizontal plane. For a tension T and a current i which are fixed, one looks for the value I of the current in the lenses which gives the initial conditions ($X_0$ and $X'_0$) which are desired at the entrance to the system. The fixed point may be the object, or the image. Figure 31 shows the experimental arrangement of Lynch and Zaffarano (72). With a particularly careful mounting, a precision of can be obtained in the tracing of the trajectories (72, 92a,b), and should allow the aberrations in the planes zOX and zOY to be measured. Unfortunately it is difficult to obtain curved trajectories with precision and to explore the region between the planes zOX and zOY, and therefore to measure the aberrations which are due to $B_z$.

2. Corpuscular Optical Bench (93) for Large Lenses. The magnetic measurements have shown that, even at saturation, the topography of the field remains unchanged. For a study of aberration, it is therefore unnecessary to make the lenses operate with their normal excitation. It is sufficient merely to work in the linear zone of the characteristic B = f(I), and with values of B such that the remanent field may be negligible (this may have a perturbed symmetry). For electrostatic lenses there will not be any lower limit on excitation. Therefore lenses may be explored with beams of relatively weak energy (50-100 kev, for example); and in order to obtain sufficient intensities of excitation in the magnetic case, one may use heavy alkali ions, which we have now been able to produce in abundance in very
simple sources. This method is equally valid for all the optical systems used in corpuscular optics, deflectors for example. We have constructed along this principle a special optical bench, which we shall describe briefly, and which allows beams of large dimensions to be obtained. This is shown schematically in Fig. 43.
FIG. 43. Sketch of an ion-optical bench [A. Septier, Compt. rend. acad. sci. 246, 1406 (1957)]: S, source; E1, E2, accelerating system; P, pumps; D, deflector for the rotating field; L0, electrostatic lens; Q1, Q2, lens to be studied; E, fluorescent screen.
a. The ion source. Since the work of Couchet (94), we know how to produce alkali ions from synthetic alumino-silicates, having the formula Al2O3 · 2SiO2 · M2O, by simple heating, with an emissivity of 1 to 2 ma/cm². A sphere, 1-2 mm in diameter, fixed at the end of a tungsten or platinum wire which is bent in a V shape, may thus yield 20-100 μa of ions. These ions have a velocity spectrum of some tenths of an electron volt, and the sources give only ions which are singly charged. Table IV gives the melting temperature $T_f$ of the different alkali
TABLE IV

M+              Li+      Na+      K+       Rb+      Cs+
T_f (°C)        1450     1100     1000     2000     2200
Atomic mass     7        23       39       85       133
p(M+)/p(H+)     0.084    0.151    0.197    0.293    0.362
alumino-silicates. Emission begins around 1000°C. For Li, the normal temperature of operation is between 1200 and 1300°C, and the lifetime can be as large as some tens of hours. We have also shown in the table the atomic mass of the most abundant isotope, as well as the ratio of the momenta of the ions of 50 kev to those of protons of 50 Mev. We see that with heavy ions of 50 kev one may simulate protons of high energy. We have used beams of Li ions of 50 to 100 kev to study the lenses whose field curves are given above.

b. Accelerating system. Constructed with a horizontal axis, it is composed of:
(1) A classical triode gun with Wehnelt cylinder and anode fixed; the filament is movable in the three perpendicular directions, its motion being
controlled by insulating rods. The anode is equipped with two diaphragms of 1 mm diameter.
(2) Two accelerating lenses E1 and E2, separated by three porcelain insulators forming the vacuum enclosure. The arrangement allows the voltage $\phi_0$ to be raised to 200 kv; the variation of the voltage on E2 allows the ion beam to be focused from 1 to 2 mm diameter in a crossover of 0.5 mm situated at about 50 cm from the exit of the accelerating tube. The gun assembly rests on an insulating column, and a metal box encloses the heater current supply and some of the measuring apparatus.

c. Formation of probe beams. The ion beam passes through an electrostatic deflector D with a revolving field, which allows a very thin hollow ion beam to be created, of half-aperture which may be varied from 0 to 5°. The deflector D is formed of six parallel cylindrical electrodes of length 50 mm, tangent to a throat circle of 10 mm diameter; it is supplied by six voltages, π/3 out of phase, of variable amplitude, obtained from the alternating 50 cycle 3 phase current. Careful filtering is necessary to obtain a perfectly circular sweep. The hollow beam appears to issue from a real object point situated in the center of D. It passes through a weak electrostatic lens L0, of large dimensions, situated at 1 meter from D, which allows it to be transformed into a cylindrical beam parallel to the axis, of radius $R_0$. A grid of pitch 5 mm, placed at the exit of L0, gives a shadow on the final fluorescent screen, and thus allows the aperture of the departing beam to be known. $R_0$ may be varied from 0 to 4 cm; one may study lenses with their full aperture.

d. System to be studied. This is composed of a doublet, formed of two identical magnetic lenses Q1 and Q2, mounted on jacks; Q2 was placed on a milling machine table allowing easy alignment.

e. The fluorescent screen.
The vacuum enclosure, formed of tubes of 8 cm diameter, ends in a fluorescent screen which may occupy three positions: 52, 148, and 228 cm from the center of Q2. The observations are made from the rear face of the screen. This screen, which must resist the ion bombardment, is formed of a layer of fluorescent substance without a binder, deposited on a support of conducting glass. The best results, from the point of view of lifetime and luminosity, have been obtained with willemite (zinc orthosilicate). Each screen allows convenient observation for a period of 15 to 20 min. It is photographed externally. By observing the figures on a screen which is movable in the vacuum, one may follow the deformation of the emerging beam. But it may also be followed by causing the excitation of the lenses to vary, with the screen remaining fixed.

3. Electron-Optical Benches. We have spoken of a possible application of quadrupole lenses in electron or ion microscopy. An electron-optical
bench of a more classical type and dimensions has been constructed by Reisman (24), and an electrostatic electron microscope has been modified by us (25) in order to verify the properties of a doublet equivalent to a system of revolution, magnetic in the first case, electrostatic in the second. Reisman’s optical bench is composed of an electron gun, a magnetic condenser, an object mount, an objective lens, and the lenses to be studied. The object mount can be furnished with a diaphragm of 12.5 μ diameter situated at 48 cm from the crossover of the gun (diameter 40 μ). It allows an extremely fine beam to be obtained for probing the lenses; if, after careful alignment, the diaphragm and the gun are displaced laterally in a plane corresponding to the convergent plane of a quadrupole, the emergent ray crosses the axis at the focus of the lens; one can easily determine the principal elements in this manner. Another electron-optical bench with slow electrons has allowed us to verify the properties of helical quadrupole lenses (95) on an iron-free model; it is composed of a graduated fluorescent screen which may be moved in the vacuum along the entire length of the lens and photographed from the outside.
C. First-Order Results

1. Some Particular Examples. Some studies which have been carried out on a doublet and a triplet, using the accelerated beam directly, give practical information which is directly useful, within limits of operation which are sufficiently large. Let us give some examples: Shull, MacFarland, and Bretscher (see 88) have shown a simple method for determining the position of the object point in the incident beam (this point is often different in the two planes OX and OY). A lens Q1 is placed at the exit of the accelerator and the excitation current I is varied. The lens gives a single real focal at a distance q from its center; this is observed on a screen, here composed of a quartz plate, which can be displaced along Oz. For each value of I, a value of q can be found, and the curve 1/q = f(I) is plotted. The lens being weak, this curve is a straight line which cuts the axis (1/q) of the graph at a point P. The ordinate OP of this point is such that OP = 1/p, the inverse of the object distance. Moreover, the ordinate of the curve, measured from the new origin P, gives the convergence 1/f of Q1. Here $p_x = p_y = 235$ cm.
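The graphical construction amounts to a straight-line fit of 1/q against I. The sketch below illustrates it under explicit assumptions: the thin-lens relation 1/p + 1/q = 1/f with a convergence proportional to the current, 1/f = kI, and synthetic readings generated from hypothetical values of p and k (they are not data from the paper cited). The function names are ours.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def object_distance(currents, image_distances):
    """Estimate the object distance p and the convergence per ampere k from
    (I, q) pairs, assuming 1/q = k * I - 1/p (thin lens, 1/f = k * I)."""
    slope, intercept = fit_line(currents, [1.0 / q for q in image_distances])
    return -1.0 / intercept, slope

# Synthetic readings: p = 235 cm, k = 0.004 per cm per amp (hypothetical).
P_TRUE, K_TRUE = 235.0, 0.004
currents = [2.0, 2.5, 3.0, 3.5, 4.0]
image_distances = [1.0 / (K_TRUE * i - 1.0 / P_TRUE) for i in currents]
p_est, k_est = object_distance(currents, image_distances)
```

With this sign convention the intercept of the line at I = 0 has magnitude 1/p, and the slope gives the convergence per unit current, so that both the object position and the lens calibration come from the same set of readings.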
The system finally used is a doublet with two identical lenses Q1 and Q2, 30 cm apart, and the authors seek stigmatic operation. It is therefore adequate to trace successively the curve $I_2 = f(I_1)$ such that
the first focal is formed on the screen at the desired distance q = 176 cm from Q2, and then the curve $I_2 = g(I_1)$ which keeps the second focal at the same distance q. $I_1$ and $I_2$ are the intensities in Q1 and Q2, respectively. The point where both curves cross each other gives the values of $I_1$ and $I_2$ corresponding to stigmatic operation (3.3 amp and 3 amp, respectively). Figure 44 gives $I_1$ and $I_2$ under these conditions, for different values of q.
FIG. 44. Pseudo-stigmatic operation of a doublet [after F. G. Shull, C. E. McFarland, and M. M. Bretscher, Rev. Sci. Instr. 25, 364 (1954)]: excitation for different positions q of the image (p = 2.35 m). Abscissa: I (amp).
With an entrance diaphragm for Q1 of 2.5 × 1.5 cm, the image is approximately round, with a diameter of 6 mm at q = 176 cm, and 13 mm at 411 cm. Measurements carried out with a Faraday cage show that one may hope for an increase of intensity in the beam by a factor of about 8, thanks to the doublet.

Hubbard and Kelly (77) have studied a symmetric electrostatic triplet, where the central lens Q2 has a length $L_2$ which is double that of the others, Q1 and Q3 ($L_2 \approx 30$ cm), in order to focus protons of 750 kev. The object points are different in zOx and zOy and are situated respectively at distances $p_x \approx 7.6$ cm and $p_y \approx 34$ cm from the entrance of Q1. The geometrical parameters are the following:
$$a = 3.8\ \text{cm}, \quad l_1/a = l_3/a = 4, \quad l_2/a = 8, \quad D/a = 1.$$
The voltage $\phi_1$ of Q1 and Q3 is adjustable to 34 kv; stigmatic operation is then obtained for a voltage $\phi_2$ of ±25.5 kv on Q2, at a distance q = 50 cm. Figure 45 gives the voltages to be applied in order to obtain this operation at different distances q, and the corresponding values of q.

2. Verification of the Theoretical Formulas. More general studies have been carried out with the methods presented in the preceding paragraphs, and their results compared with those furnished by the theoretical formulas.
a. Single lens. By means of our ion optical bench we have measured the values of β ($\beta_{\mathrm{exp}}$) which correspond to the focusing of an object point A at a real focal B placed on the screen, for different values of p and q (object and image distances).
FIG. 45. Pseudo-stigmatic operation of an electrostatic triplet [after E. L. Hubbard and E. L. Kelly, Rev. Sci. Instr. 25, 737 (1954)]: Curve (1): $\phi_2 = f(\phi_1)$; Curve (2): image distance q as a function of $\phi_1$. Abscissa: $\phi_1$ (kv).
The intensities I have been measured with an accuracy of . We have then calculated the theoretical values of β which correspond, knowing L, p, and q; first with the rectangular model ($\beta_r$), then with the bell-shaped model ($\beta_c$). Table V gives the results for two different lenses

TABLE V
BeXp m-l &I
Or
6.5 6.3
8. &XP
&’I
Br
8.
3.62 3.55 3.59
3.76 3.66 3.75 2.92 2.85 2.88
3.02 2.95 2.96 2.23 2.21 2.22
.
2.36 2.29 2.30 2.17 2.13 2.13
2.27 2.21 2.23
(Q1: a = 4 cm, $L_1$ = 19.5 cm; and Q′: a = 6 cm, $L_2$ = 20.5 cm). The experimental values are all larger than the calculated values, and the difference increases as the convergence (or β) increases; the bell-shaped model gives values which are closer to experiment, but the small value of the difference (3% at a maximum) shows that the rectangular model gives the cardinal elements with sufficient accuracy, without the necessity of resorting to the bell-shaped model. Conversely, if one calculates the equivalent length of the lens from $\beta_{\mathrm{exp}}$, one finds a value L′ which is slightly smaller than the value given by the magnetic measurements: L′ = 18.5 cm, that is, $(L' - L)/L \approx -3\%$.
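The calculation of β from L, p, and q in the rectangular model may be sketched as follows. This is an illustration under assumptions which are ours: the lens is treated as a hard-edge element with the usual converging-plane matrix (cos βL, sin βL/β; −β sin βL, cos βL), and p and q are measured from the entrance and exit faces of that element. The imaging condition is that the (1,2) element of the product drift(q) · lens · drift(p) vanishes; β is found by bisection.

```python
import math

def imaging_residual(beta, length, p, q):
    """(1,2) element of drift(q) * lens * drift(p) in the converging plane;
    it vanishes when the object point is imaged at distance q."""
    c, s = math.cos(beta * length), math.sin(beta * length)
    return c * (p + q) + s / beta - p * q * beta * s

def beta_rectangular(length, p, q, tol=1e-12):
    """Solve the imaging condition for beta in (0, pi / length) by bisection.
    beta is returned in the inverse of the length unit used for L, p, q."""
    lo, hi = 1e-9, math.pi / length   # residual is positive at lo, negative at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if imaging_residual(mid, length, p, q) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def image_distance(beta, length, p):
    """Image distance q for a given beta (inverse of the relation solved above)."""
    c, s = math.cos(beta * length), math.sin(beta * length)
    return -(c * p + s / beta) / (c - p * beta * s)

# Round trip with hypothetical values: L = 19.5 cm, p = 164 cm, beta = 0.036 per cm.
q_example = image_distance(0.036, 19.5, 164.0)
beta_back = beta_rectangular(19.5, 164.0, q_example)
```

Inverting the same relation for L at a measured β is what yields a length such as L′; a bell-shaped model would replace the constant β by a profile along z and integrate the trajectory numerically.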
Reisman determines directly the focal distance in the case of magnetic and weak electrostatic lenses; the incident beam, parallel to the axis, crosses the axis at the focus. The fluorescent screen is fixed, and the lens Q1 to be studied is movable along Oz; a second thin lens Q2 has a fixed position on the axis. When Q2 is excited, it deflects the ray issuing from Q1, but not if this ray traverses it at its optical center. For a given excitation of Q1, it is therefore sufficient to displace this lens along Oz until the trace of the beam is motionless on the screen when the excitation of Q2 is varied. The separation between Q1 and Q2 is then equal to the focal distance of Q1. The curves giving the convergence of Q1 as a function of the current I in it are straight lines, which practically coincide with the theoretical straight lines
so long as the equivalent length L is used.

b. Study of a doublet.

(1) Weakly convergent doublet (96). When one wishes to focus the beam issuing from an object point A at a point B, the general calculation of the equations of the doublet is long and tedious. We have set up for the optical bench charts of operation of a doublet formed of two identical lenses Q1 and Q2 (a = 4 cm, L = 19.5 cm, D = 30.5 cm) for two object distances: p infinite and p = 164 cm. Figure 46 shows one of these charts for p = 164 cm and gives the curves $(\beta_1 L)^2 = f(\beta_2 L)^2$ such that the first focal (or the second focal) is at distances q = 17, 52, 148, and 228 cm from Q2. When two curves corresponding to the same value of q cross each other, one obtains pseudo-stigmatic operation. The straight line $\beta_1 L = \beta_2 L$ corresponds to a doublet where the lenses are excited in series. Here again, if the equivalent length is calculated from the experimental values, one obtains a value L″ which is different from that given by the magnetic measurements:
I,”
=
117.5 cni, that
i+
( I d f ’ - L ) / L - 5(,{
The rectangular model, associated with L, leads therefore to giving too large a value of convergence to quadrupole lenses, and the effect is increased as the number of lenses increases. Using the hodoscope method, Lynch and Zaffarano (72) have established curves giving the position of the focus in the two planes, as a function of I. Comparing this with the theoretical results, they have deduced the equivalent length of the lenses and the conditions of pseudo-stigmatism. The lenses studied had for their characteristics a = 3.75 cm, L = 8.5 cm, D = 30 cm, and were fed current in series. The wire which was used to
determine the trajectories had a diameter of 0.2 mm and carried a current i = 5 amp; it simulated electrons of a few Mev. (2) Doublet of revolution. The experimental verification of the properties of a doublet equivalent to a lens of revolution has been carried out by Reisman for magnetic lenses (24) and by us for electrostatic lenses (95).
FIG. 46. Chart of the operation of a doublet for p = 164 cm and for different values of q. (Curve label: 2nd focal line.)
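A chart of this kind can be sketched numerically with the rectangular model. The following Python fragment is a minimal sketch, not the procedure actually used for Fig. 46: it assumes that each lens is a hard-edged quadrupole of strength β (in cm⁻¹) over the equivalent length L, it takes D as the field-free gap between the two lenses (an assumption, since the text does not say how D is measured), and all function names are hypothetical. It returns the distance q from the exit of Q2 at which a ray leaving an axial object point at distance p recrosses the axis, in the plane where Q1 is either convergent or divergent.

```python
import math

def quad(beta, length, converging):
    """2x2 transfer matrix (x, x') of a hard-edged quadrupole in one plane."""
    phi = beta * length
    if converging:
        return [[math.cos(phi), math.sin(phi) / beta],
                [-beta * math.sin(phi), math.cos(phi)]]
    return [[math.cosh(phi), math.sinh(phi) / beta],
            [beta * math.sinh(phi), math.cosh(phi)]]

def drift(d):
    """Transfer matrix of a field-free space of length d."""
    return [[1.0, d], [0.0, 1.0]]

def mul(a, b):
    """Matrix product a*b (b acts first on the ray)."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def doublet(beta1, beta2, length, gap, q1_converging):
    """Q1, then the gap, then Q2 with the opposite action in this plane."""
    m = quad(beta1, length, q1_converging)
    m = mul(drift(gap), m)
    return mul(quad(beta2, length, not q1_converging), m)

def image_distance(beta1, beta2, length, gap, p, q1_converging):
    """Distance from the exit of Q2 to the axis crossing (negative: virtual).

    Raises ZeroDivisionError if the emergent ray is parallel to the axis.
    """
    m = mul(doublet(beta1, beta2, length, gap, q1_converging), drift(p))
    # A ray leaving the axial object point, (x, x') = (0, 1), exits as
    # (m[0][1], m[1][1]) and reaches the axis after -m[0][1] / m[1][1].
    return -m[0][1] / m[1][1]

# Lenses excited in series (beta1 = beta2), with the dimensions quoted above.
L_CM, D_CM, P_CM = 19.5, 30.5, 164.0
for beta in (0.02, 0.03, 0.04):
    q_cd = image_distance(beta, beta, L_CM, D_CM, P_CM, True)
    q_dc = image_distance(beta, beta, L_CM, D_CM, P_CM, False)
    print(f"beta*L = {beta * L_CM:.3f}  q(C-D) = {q_cd:9.1f} cm  q(D-C) = {q_dc:9.1f} cm")
```

Scanning β1 and β2 independently and looking for pairs that give the same q in both planes would locate the pseudo-stigmatic points of such a chart.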
Figure 47 is a photograph of molybdenum oxide obtained with an electrostatic projection lens (a = 4 mm, L = 37.5 mm, D = 0, $\beta L = \pi$). The images obtained have a quality comparable to those of a classical projector lens. Reisman has carried out measurements of the focal distances and the magnifications in the two planes zOX and zOY in the neighborhood of operation as a lens of revolution, and has compared the results with the theoretical formulas. Here also the agreement is satisfactory, but for a given separation D between the lenses, the experimental values of I leading to a system of revolution are greater than the values of I calculated with the rectangular model in which L is introduced. The difference is about 4 or 5%. The agreement is less good if the lenses are very close together.
Experiment shows that one can in practice obtain extremely short focal lengths, f = 2 mm with a = 2 cm, L = 7.5 cm, and nI = 30 ampere-turns only, while under the same conditions a classical projector lens would demand 300 ampere-turns with a gap of only 2 mm. The doublet is therefore very attractive from the point of view of the luminosity of the image, the diameter of the useful incident beam being much greater.
FIG. 47. Photograph of molybdenum oxide at high magnification, obtained with a strong-focusing projector lens and a classical round objective lens.
c. System of four lenses. Cork and Zajec (82) have determined, using the wire method, the operating current of an arrangement of four lenses fed in series (a = 2.54 cm, l/a = 4, E/a = 2.5, D/a = 3), by simulating an incident beam parallel to the axis and looking for a unique image point situated at a distance q/a = 1.5 from the exit of the system. The wire used has a diameter of approximately 4 × 10^[…] cm; with a weight T of 5.5 grams, i is adjusted to simulate protons of 8.6 Mev. The intensity
I in the lenses is then about 12.5 amp. The method reveals a difference of convergence between the zone near the axis and the edges: for I fixed, parallelism between the wire and the axis is obtained for greater wire currents i in the outer zone; for example, one has i ≈ 130 ma near the axis and i ≈ 140 ma for rays reaching 20 mm from the axis. The properties have then been verified with a proton beam of 460 kev; I is found to be 3 amp, in good agreement with the intensity given by the wire method (2.9 amp). But the agreement with the results furnished by the formulas of the rectangular model is much less good; the difference observed is no doubt due to the interactions between the lenses, the distance D being very small.
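The wire-method figures can be cross-checked with the usual floating-wire relation, in which a wire under tension T carrying a current i follows the orbit of a particle of magnetic rigidity Bρ = T/i. The sketch below is only an illustration under stated assumptions: the tension is taken as 5.5 grams-force and the simulated proton energy as 8.6 Mev (values assumed here for the example), and the lenses are assumed unsaturated, so that the lens current giving the same optics scales in proportion to Bρ. The function names are hypothetical.

```python
import math

C_LIGHT = 299_792_458.0   # speed of light, m/s
G0 = 9.80665              # standard gravity, m/s^2
PROTON_MEV = 938.272      # proton rest energy, MeV

def rigidity_tm(kinetic_mev, rest_mev=PROTON_MEV):
    """Magnetic rigidity B*rho (tesla-metre) of a singly charged particle."""
    pc_mev = math.sqrt(kinetic_mev ** 2 + 2.0 * kinetic_mev * rest_mev)
    return pc_mev * 1.0e6 / C_LIGHT

def wire_current_amp(tension_gram_force, kinetic_mev):
    """Wire current for which the wire simulates the given proton energy."""
    tension_newton = tension_gram_force * 1.0e-3 * G0
    return tension_newton / rigidity_tm(kinetic_mev)

def scaled_lens_current(current_amp, from_mev, to_mev):
    """Lens current giving the same optics at another energy (no saturation)."""
    return current_amp * rigidity_tm(to_mev) / rigidity_tm(from_mev)

# Assumed inputs: 5.5 g tension, 8.6 Mev simulated protons, 12.5 amp in the lenses.
i_wire = wire_current_amp(5.5, 8.6)
i_lens_460kev = scaled_lens_current(12.5, 8.6, 0.46)
print(f"wire current: {i_wire * 1e3:.0f} ma")
print(f"lens current scaled to 460 kev: {i_lens_460kev:.2f} amp")
```

Under these assumptions the wire current comes out near the 130 ma quoted for the axial zone, and the scaled lens current near the 2.9 amp attributed to the wire method.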
D. Measurement of Aberrations (57, 97) of Magnetic Doublets
The great size of the lenses to be studied and the good resolution of the ion-optical bench, in combination with the use of hollow beams, have allowed us to expose and to measure the aberrations of magnetic quadrupoles in their normal operation as weakly convergent lenses, and then to correct the third-order aperture aberrations. The measurements give information about the real global aberrations, since the calculations are only approximate and cannot keep track of all the terms. The doublet is formed of the two lenses Q1 and Q2 seen above (or of two similar lenses Q′1 and Q′2, where a = 6 cm).
1. Form of the Observed Beams. We use hollow incident beams parallel to the axis, of circular cross section and of variable diameter $2R_0$. At the exit of the lenses, the beam remains hollow and has an elliptical cross section perturbed by the aberrations. Instead of the infinitely thin focal lines of an ideal system, one observes on the screen complex figures similar to those indicated in the theoretical part (Fig. 22) and in good agreement with the results of the various numerical calculations; the figures appear on the fixed screen in the order (a) to (d) as the convergence increases (as I increases). Figure (b) corresponds to the best concentration. Figures (a) and (d) lie respectively inside and outside the theoretical ellipses. If the system is fixed at a given value of I, one therefore encounters the figures from (a) to (d) as one moves away from the lenses. The qualitative study of the progress of the rays in the doublet shows that one obtains these perturbed cross sections in the indicated order by supposing that the lenses are less convergent (or divergent) on the edges than in the center. If, on the contrary, this convergence were to increase
with the distance, one would obtain the same figures, but arranged in the opposite order [from (d) to (a) toward positive z]. Figure 48 represents the real cross sections of an emergent beam corresponding to a hollow incident beam of radius $R_0$ = 25 mm in the neighborhood of the first focal, at 52 cm from Q2; the focal distance is then 130 cm. The trace of the cross section of the incident beam (above), where the shadow of two wires of the grid is visible, gives the scale of the photograph. One goes from (a) to (d) by a variation $\Delta I/I \approx 4\%$. From the width $2\delta$ of the pseudo-focal, which represents the aberration figure of the system, one may define the figure of global aberration
$\tau = \delta / R_0$.
FIG. 48. Cross section of a hollow beam near the first focal; f ≈ 1.3 m. I increases from (a) to (d).
When a full beam is focused, one still finds these perturbed forms of the focals (98), by superposition of the elementary aberration figures.
2. Magnitude of Global Aberration. The width $2\delta$ has been measured for each value of the focal distance on a large number of photographs corresponding to different diameters of the incident beam, either directly when the resolution was sufficient (Fig. 48), or from Figs. 22(a) or 22(d) framing the aberration spot, by comparison with the theoretical ellipse having the same axes. Results of the measurements are given in Fig. 49 for f = 17, 75, and 130 cm in the case of lenses Q1 and Q2 (a = 4 cm). The dotted curve is constructed for Q′1 and Q′2 (a = 6 cm) with f = 130 cm. The change from a = 4 cm to a = 6 cm lowers $\tau$ by a factor of about 2 for the large values of $R_0$. On the contrary, $\tau$ is about the same as far as $R_0$ = 1.5 cm, a distance for which $B_z$ is still negligible and where $L_B \approx L_0$ on the axis. The aberration figure varies practically as the convergence,
$\tau \propto 1/f$,
which allows the approximate aberration figure to be predicted for 3.5 < f < 4.5 meters (region of normal operation). One would have $\tau \approx$ 5 to 6 × 10^[…] with a = 6 cm.
FIG. 49. Aberration figure as a function of $R_0$ for incident hollow beams parallel to the axis and for the different values of the focal distance f (L = 19.5 cm). (Abscissa: $R_0$ in cm.)
The effects of the edges seem to be preponderant; there will therefore be an advantage in utilizing pole pieces such that a = 6 cm for a beam of 6 to 7 cm diameter and, to decrease $\tau$ still further, in using longer lenses with a more extended central plateau.
We have seen that theoretically $\tau$ is of the form
$\tau = A R_0^2 + B R_0^4$.
The experimental curves of Fig. 49 are effectively of second degree as far as $R_0 \le a/3$; for a = 4 cm, the term $B R_0^4$ rapidly becomes preponderant. One may calculate an aberration constant, similar to that of systems of revolution, from the relation […], which gives
$\tau = C_s \alpha^2 + C_s' \alpha^4$.
$\alpha$ is the slope of an emergent ray corresponding to an incident one parallel to the axis at the distance $R_0$, and then crossing the axis at the focus of the lens. For a = 4 cm the following table gives the order of magnitude of $C_s$ and $C_s'$:

f (cm)    C_s    C_s'
17        3      170
75        14     2.3 × 10^4
130       25     10^5
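As a numerical illustration, the tabulated constants can be turned back into aberration figures. The sketch below assumes the form τ = C_s α² + C_s′ α⁴ with the paraxial slope α ≈ R0/f, and uses the table values as read here (f = 17, 75, 130 cm; C_s = 3, 14, 25; C_s′ = 170, 2.3 × 10⁴, 10⁵). Both the form and the readings should be treated as assumptions, and the function name is hypothetical.

```python
# f (cm) -> (C_s, C_s'), values as read from the table for a = 4 cm (assumed).
ABERRATION_TABLE = {
    17.0: (3.0, 170.0),
    75.0: (14.0, 2.3e4),
    130.0: (25.0, 1.0e5),
}

def aberration_figure(r0_cm, f_cm, cs, cs_prime):
    """tau = C_s*alpha^2 + C_s'*alpha^4, with alpha approximated by R0/f."""
    alpha = r0_cm / f_cm
    return cs * alpha ** 2 + cs_prime * alpha ** 4

for f_cm, (cs, cs_prime) in ABERRATION_TABLE.items():
    tau = aberration_figure(1.0, f_cm, cs, cs_prime)
    print(f"f = {f_cm:5.0f} cm   C_s/f = {cs / f_cm:.3f}   "
          f"tau(R0 = 1 cm) = {tau:.2e}   tau*f = {tau * f_cm:.3f}")
```

With these readings the ratio C_s/f is nearly constant, so the second-degree term scales as 1/f at fixed R0, in line with the statement that the aberration figure varies practically as the convergence.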
For small values of f (f < L), these values are comparable to those of good magnetic lenses of revolution. The change from a = 4 cm to a = 6 cm keeps $C_s$ constant but lowers $C_s'$ to 2 × 10^4 for f = 130 cm; that is, a gain of 5 on the second term.
3. Influence of the Correction of the Variations of the Equivalent Length. When the pole pieces of Q1 and Q2 are equipped with masses of soft steel, which cancel the variations of $L_B$, the aberration figure diminishes in practice. One practically recovers the values of lenses with large gaps (a = 6 cm), but the global aperture aberration keeps the same sign; there is simply a decrease in the aberration, and the aberrations due to $B_z$ persist. The different causes contributing to global aberration in uncorrected lenses therefore operate in the same sense, and one may hope, by slightly changing the distribution of the magnetic field, to cancel the aberration figure completely for a given value of convergence; one must try from the beginning to realize a correction such that $L_B$ increases slightly as one goes away from the axis (at least along OX and OY), in agreement with the theoretical predictions (introduction of 6th order terms in the development of the scalar potential).
4. Attempt at Total Correction (99). We have carried out this correction along the axes OX and OY by slipping rods of soft steel of length l and of small diameter (of the order of 8 mm) along them, tangent to the pole pieces and located on a circumference of radius R. If the rods are placed against the vacuum chamber (R = 44 mm), the aberrations are considerably increased, but the sign of the aberration is changed and the evolution of the beam in the neighborhood of the focals is reversed; Figs. 22 (d), (c), (b), and (a) are encountered in that order; the beam is "over-corrected" (see Fig. 50). There therefore exists a position of the rods, defined by $R = R_c$, for which one will have a correction (total or not, since the aberration terms which are introduced do not necessarily have the same form as the pre-existing terms). For f = 130 cm, an effective correction is obtained with $R_c$ = 59 mm; under these conditions, for f = 17 cm, the beam is slightly overcorrected, but $\tau$ is then very small (less than […] for $R_0$ = 1.5 cm). The correction is easily noticed: the cross sections of the beam are symmetric on both sides of the focal, and perfectly elliptical. The adjustment of the rods is very sensitive; a change $\Delta R_c$ = 0.5 mm has a visible effect on the form of the cross sections observed on the screen. Figure 50 shows in (a) the normal beam ($R = \infty$), in (b) the overcorrected beam ($R < R_c$), and in (c) two cross sections of the corrected beam ($R = R_c$). In the case of lenses Q′1 and Q′2 (a = 6 cm) the correction by rods is easier, the aberrations being weaker. With the rods at $R_c$ = 75 mm, there is a correction for a domain of convergence going from f = 2.5 meters to f = 0.75 meter; for f = 17 cm, a slight overcorrection exists. In any case, a correction carried out for the large values of $R_0$ is valid for all values of $R_0$; therefore a full beam will also be corrected, at least to the limits of the resolving power of the optical bench.
5. Structure of the Magnetic Field after Correction (57). The measurements of field carried out on a corrected lens (Q1 with a = 4 cm and $R_c$ = 5.9 cm, or with a = 6 cm and $R_c$ = 7.5 cm) show that the transverse gradient increases in the directions OX and OY and decreases along Ox and Oy, in a practically symmetrical fashion in the central useful zone. The corresponding variations of equivalent length in Q1 are given in Fig. 51. We denote by $L_x$ and $L_X$ the values calculated with the true values of B(0,r) and by $L_B$ the values obtained with the theoretical values of B(0,r); the variations of convergence which correspond to $L_B$ are symmetrical with reference to $L/L_0 = 1$ (Fig. 51). For X = 2a/3, the relative variation of the gradient $\Delta K/K_0$ is about 10%, but the variation of L is only 2%.
FIG. 50. Correction of the aberrations: (a) undercorrected beam, $R = \infty$; (b) overcorrected beam, R = a; (c) corrected beam, $R = R_c$. In the three cases we pass from (1) to (3) with an increase of I.
FIG. 51. Appearance of $L_B(r)$ in the corrected lens Q′1. (Abscissa: X, x in cm.)
From the distribution of field, one may derive in an approximate fashion the profile of pole pieces which would play the same role as the correctors (57).
6. Influence of the Aperture Aberrations on the Shape of the Spot in Pseudo-Stigmatic Operation. With a nonsymmetric doublet ($\beta_1 \ne \beta_2$) the image of a circular source of radius $r_0$ is formed on the screen E. To first order, one would obtain an ellipse of axes $a = G_X r_0$ and $b = G_Y r_0$ with $b/a = Y'_3/X'_3$. In reality the aberrations distort and enlarge this image, and if the first perturbed focal has the appearance (a) of Fig. 22, the contraction along
FIG. 52. Formation of the image in pseudo-stigmatic operation (first focal, hollow beam). (Panel labels: $I_{Q1}$ increasing, $I_Q$ fixed.)
OY will lead to a spot of length $2\delta_x$. The width of the final spot along OY will be that of the other focal: $2\delta_y$. For lenses Q1 and Q2 the dimensions of the spot are twice as large as the predicted values. Figure 52 shows the mechanism of formation of the image when the excitations of Q1 and Q2 are successively changed.
7. Chromatic or Mass Aberration. These aberrations have been defined in the theoretical section. The measurement of the aberration figure $\tau_c$ and of the aberration constant $C_c$ has been carried out in two cases: the first focal and
then the second focal on the screen E at 52 cm from Q2, measuring the small axis $2\delta_c$ of the ellipse when $\Phi_0$ varies by an amount $\Delta\Phi_0$, for several values of $R_0$. One sees easily that $\delta_c$ is proportional to the factor $\Delta\Phi_0/\Phi_0$ for variations reaching 30%, and that for a single value of $R_0$, $\delta_c$ increases with the convergence. Also, the Li+ ion beam which is used contains two isotopes, Li6+ and Li7+. The first is present in sufficient quantity to be visible on the fluorescent screen (about 8%); when one obtains the focal of Li7+, the beam of Li6+ is overfocused and forms an ellipse of small axis $2\delta_M$, and vice versa. We have here $\Delta M/M \approx 16.6\%$ between the two isotopes. The measurement of $\tau_M$ gives $\tau_M = \tau_c$ for $\Delta\Phi_0/\Phi_0$ = 16.6%. Table VI gives the experimental
TABLE VI

f         C_c (exp.)   C_M (exp.)   C_c (theor.)
130 cm    0.60         0.74         0.63
52 cm     1.51         1.70         1.88
results, and compares them with the theoretical coefficients. We see that the calculation carried out by differentiation of the formulas leads to results which are sufficiently good in practice. A possible application of the mass aberration is the at least partial elimination of harmful ions (H2+ and H3+ in a source of protons) at the exit of an accelerator and at the entrance to the accelerating column; the column would then not be loaded uselessly. Figure 53 shows the ellipses of Li7+ and Li6+ obtained with beams of radius $R_0$ = 1.5 cm.
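The 16.6% figure, and the equivalence between a mass difference and a change of accelerating potential, can be checked with a few lines of arithmetic. The sketch assumes singly charged, nonrelativistic ions accelerated through the same potential Φ0, so that the magnetic rigidity is proportional to sqrt(M Φ0); the atomic masses are standard values, not taken from the text, and the variable names are hypothetical.

```python
import math

M_LI6_U = 6.015122   # atomic mass of lithium-6, in atomic mass units
M_LI7_U = 7.016003   # atomic mass of lithium-7, in atomic mass units

# Relative mass difference between the two isotopes, referred to Li6.
rel_mass_diff = (M_LI7_U - M_LI6_U) / M_LI6_U

# At equal charge and equal accelerating potential, B*rho ~ sqrt(M * Phi0).
rigidity_ratio = math.sqrt(M_LI7_U / M_LI6_U)

# Relative change of potential that produces the same change of rigidity
# for a single isotope: (1 + dPhi/Phi) = rigidity_ratio squared.
equivalent_potential_change = rigidity_ratio ** 2 - 1.0

print(f"Delta M / M          = {100 * rel_mass_diff:.1f} %")
print(f"rigidity ratio       = {rigidity_ratio:.4f}")
print(f"equivalent dPhi/Phi  = {100 * equivalent_potential_change:.1f} %")
```

In a purely magnetic lens the convergence depends only on the rigidity, which is why a mass difference of this size is expected to act like an equal relative change of Φ0.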
FIG. 53. Separation of Li7+ and Li6+ from a hollow beam of incident Li+ ions: (a) Li6+ focal, (b) Li7+ focal.
8. Aberrations of Poor Alignment (57). In one example (screen at q = 52 cm from Q2 and divergent incident beam with p = 164 cm) we have studied the influence of bad alignment of Q2 with reference to Q1 on the aberration figure. The results are as follows:
(1) Displacement Δx of Q2 along Ox: the focal f1 (whose axis is along OX) is displaced along Oy and becomes asymmetrical. The new aberration remains negligible by comparison with the aperture aberration if Δx remains less than 0.5 mm, which is sufficiently easy to realize. If the direction of current is reversed in the lenses, the new focal f2 has the same appearance.
(2) Displacement ΔX of Q2 along OX: the action is no longer symmetric, as a result of zOX being the plane C-D or D-C. The focal f1 (plane zOX C-D) goes in the same sense as Q2, while f2 (plane zOX D-C) goes in the opposite sense; f1 remains without supplementary aberration, while f2 is deformed. The effect is of the same order of magnitude as the preceding one.
(3) Rotation of Q2 around Oz: for a rotation of 1° the increase of the aberrations is about 50% only. An angular position accurate to approximately $10^{-1}$ mm will therefore be sufficiently accurate.
9. Comparison with Electrostatic Lenses. Some measurements have been made recently with the aid of our corpuscular optical bench, using electrons of energy 50 kev (100). The principal results are the following. If the electrodes are supplied by symmetrical voltages (case 1: $+\Phi_1$ and $-\Phi_1$), the aperture defects of a doublet have values equivalent to those of a magnetic one, but only for weak convergences; for short focal lengths, magnetic lenses are better than electrostatic lenses. The aberration figures (see Fig. 22) always appear on the fixed screen in the order (d) to (a) when the convergence increases; electrostatic and magnetic lenses thus have aberrations of opposite signs.
If now the voltages are not symmetrical, for example 0 and $+2\Phi_1$ (case 2), or 0 and $-2\Phi_1$ (case 3), the aberrations are very different from those of case 1; case 2, which corresponds to decelerating voltages for the incident particles, is always the most unfavorable and gives large and positive aberrations (with the same sign as case 1, in opposition to the magnetic case). On the contrary, case 3 (accelerating voltage for the particles) leads to small and negative aberrations. It will therefore be possible to correct the aberrations of the focal lines by introducing sufficient dissymmetry in the voltages supplying the two pairs of electrodes in each lens (101). This correction is also possible with the aid of an octopolar lens.
ACKNOWLEDGMENTS
The author wishes to thank Professor P. Grivet, Director of the Laboratoire d'Electronique et de Radioélectricité, who has encouraged him to publish this article, and who has participated in the final review of the manuscript. The research carried out at the Laboratoire which is mentioned in this article has for the most part been conducted under the Proton Synchrotron Division of the European Organization for Nuclear Research (CERN) at Geneva, and with its effective assistance. The author particularly wishes to thank Messrs. J. B. Adams,* Director of this Division, and P. Lapostolle,* Chief of the LINAC group, with whom he has had numerous fruitful discussions.
* Now General Director of the CERN, and Director of the Synchrocyclotron Division, respectively.

REFERENCES**
1. Melkich, A., Thesis, Berlin (1944); see Sitzber. Akad. Wiss. Wien, Math.-naturw. Kl. Abt. IIa, No. 9-10, 393-471 (1947).
2. Courant, E. D., Livingston, M. S., and Snyder, H. S., Phys. Rev. 91, 202 (1953) (letter).
3. Courant, E. D., Livingston, M. S., and Snyder, H. S., Phys. Rev. 88, 1190 (1952).
4. Blewett, J. P., Phys. Rev. 88, 1197 (1952).
5. Elmore, W. C., and Garrett, M. W., Rev. Sci. Instr. 26, 480 (1954).
6. Bernard, M. Y., Thesis, Paris (1953); Ann. phys. 9, 633 (1954).
7. Bernard, M. Y., Compt. rend. acad. sci. 236, 185 (1953).
8. Lapostolle, P., CERN (P.S.) Int. LINAC 59-5 (1959), unpublished.
9. Hine, M. G., CERN (P.S.) Int. MGNH Note 17 (1956), unpublished.
10a. Septier, A., and Chartier, C., Compt. rend. acad. sci. 246, 2056 (1959).
10b. Septier, A., CERN 60-6 (1960); J. phys. radium 21, 1A-15A (1960).
11. Hagedorn, R., CERN (P.S.) RH7 (1955), unpublished.
12. Hand, L. N., and Panofsky, W. K., H.E.P.L. 169 (1959); Rev. Sci. Instr. 30, 927-930 (1959).
13. Van der Meer, S., CERN (P.S.) Int. MM 59-8 (1959), unpublished.
14. Bruck, H., "Optique Corpusculaire" (C.D.U. ed.), Paris (1957).
15. Glaser, W., "Handbuch der Physik" (S. Flügge, ed.), Vol. 33, pp. 123-395. Springer, Berlin, 1956.
16. Bernard, M. Y., Compt. rend. acad. sci. 240, (1955).
17. Glaser, W., Z. Physik 117, 285 (1941).
18. Bronca, G., and Gendreau, G., C.E.A. (Saclay) Rept. No. 1081 (1959).
19a. Sternheimer, R. M., Brookhaven Natl. Lab. RMS-6 (1959).
19b. Sternheimer, R. M., Rev. Sci. Instr. 24, 573 (1953).
20. Enge, H. A., Rev. Sci. Instr. 30, 248 (1959).
20a. Rosenblatt, J., Nuclear Instr. & Methods 6, 152 (1959).
21. Gendreau, G., C.E.A. (Saclay) Note 29 (1953); see also Bruck, H., and Gendreau, G., Onde élec. 36, 1009 (1955).
22. Bromley, D. A., and Bruner, J. A., N.Y.O. 3823 (1954).
23. Levy, P., J. phys. radium 17, 60A (1956).
24. Reisman, E., Thesis, Cornell University (1957).
25. Septier, A., Compt. rend. acad. sci. 246, 1983 (1958).
26. Sternheimer, R. M., Brookhaven Natl. Lab. RMS-7 (1959).
27. Hereward, H. G., Johnsen, K., and Lapostolle, P., CERN Symposium pp. 179-191 (1956); see also Lapostolle, P., CERN (P.S.) PL5 (1955), unpublished.
28. Blewett, J. P., Brookhaven Natl. Lab. JPB-11 (1958).
29. Blewett, J. P., Brookhaven Natl. Lab. JPB-13 (1959).
30. CERN, Minutes of Staff Meeting No. 114, February 17, 1959.
31. Brown, H., Brookhaven Natl. Lab. HNB-1 (1957).
32. Schneider, H., Nuclear Instr. 1, 268 (1957).
33. Dhuicq, D., and Septier, A., Compt. rend. acad. sci. 249, 20[…]1 (1959).
34. Citron, A., and Øverås, H., CERN SC. 27.3.57.
35. Bell, M., and Walkinshaw, W., AERE Mem. T/M 112 (1954).
36. Teng, L. C., Rev. Sci. Instr. 26, 264 (1954).
37. Dällenbach, W., Z. Angew. Physik 7, 344 (1955).
38. Smith, L., and Gluckstern, R., Rev. Sci. Instr. 26, 220 (1955).
39. Vlasov, A. D., J. Nuclear Energy 6, 62 (1957).
40. Regenstreif, E., CERN 59.26, p. 131 (1959).
41. Wilcox, J. M., Univ. California Radiation Lab. Rept. UCRL-3184 (1955).
42. Taubert, T., Z. Naturforsch. 12a, 169 (1957).
43. Krienen, P., CERN SC-57-28 (1957).
44. Morpurgo, M., CERN SC. 141 bis (1957).
45. Paul, W., and Steinwedel, H., Z. Naturforsch. 8a, 448 (1953).
46. Paul, W., and Raether, M., Z. Physik 140, 262 (1955).
47. Paul, W., Reinhard, H. P., and von Zahn, U., Z. Physik 162, 143 (1958).
48. Vauthier, R., Compt. rend. acad. sci. 228, 1113 (1949).
49. Keller, R., CERN SC. 57-30 (1957).
50. Bennewitz, H. G., and Paul, W., Z. Physik 139, 489 (1954).
51. Friedburg, H., Z. Physik 130, 493 (1951).
52. Bennewitz, H. G., Paul, W., and Schlier, C., Z. Physik 141, 6 (1955).
53. Gordon, J. P., Zeiger, H. J., and Townes, C. H., Phys. Rev. 96, 282 L (1954).
54. Bernard, M. Y., and Hue, J., Compt. rend. acad. sci. 243, 1852 (1956).
55. Burfoot, J. C., Proc. Phys. Soc. B67, 323 (1954).
56. Bernard, M. Y., and Hue, J., Compt. rend. acad. sci. 244, 732 (1957).
57. Grivet, P., and Septier, A., CERN 58-25 (1958); Nuclear Instr. 6, 126 (1960); 6, 243 (1960).
58. Grivet, P., and Bernard, M. Y., CERN (P.S.) MB-6 (1955), unpublished.
59. Scherzer, O., Optik 2, 114 (1947).
60. Seeliger, R., Optik 6, 490 (1949); 8, 311 (1951); 10, 29 (1953).
61. Burfoot, J. C., Proc. Phys. Soc. B66, 775 (1953).
62. Archard, G. D., Proc. Phys. Soc. B68, 156 (1955).
63. Archard, G. D., Proc. Intern. Conf. on Electron Microscopy, London p. 97 (1954).
64. Möllenstedt, G., Optik 13, 209 (1956).
65. Vivargent, M., Thesis, Paris (1958).
65a. Courant, E. D., and Marshall, L., Rev. Sci. Instr. 31, 193 (1960).
66. Dayton, I. E., Shoemaker, E. C., and Mozley, R. F., Rev. Sci. Instr. 26, 485 (1954).
67. Matsuda, K., et al., INS J (Tokyo) 14, (1959).
68. Van der Meer, S., CERN (P.S.) MM31 (1957), unpublished.
69. Hubbard, E. L., Univ. California Radiation Lab. Engng. Note 4111-63 (1953).
70. Timm, U., DESY A. 2/47 (1959).
71. Chartier, G., Diplôme d'Etudes Supérieures, Paris (1959).
72. Lynch, P. J., and Zaffarano, D. J., I.S.C. 927 (1957).
73. Arnal, R., Thesis, Paris (1955); Ann. phys. 10, 830 (1955).
74. Gribi, M., Thürkauf, M., Villiger, W., and Wegmann, L., Optik 16, 65 (1959).
75. Bullock, M. L., Am. J. Phys. 13, 264 (1955).
76. Dushin, L. A., and Maslov, V. A., Soviet Phys. Tech. Phys. 3, 394 (1958).
77. Hubbard, E. L., and Kelly, E. L., Rev. Sci. Instr. 26, 737 (1954).
78. Sunan, C., Univ. California Radiation Lab. Rept. UCRL-2117 (1953).
79. Johnson, C. H., Judish, J. P., and Snyder, C. W., Rev. Sci. Instr. 28, 942 (1957).
80. Giese, C. F., Rev. Sci. Instr. 30, 260 (1959).
81. Alvarez, L. W., et al., Rev. Sci. Instr. 26, 111 (1955).
82. Cork, B., and Zajec, E., Univ. California Radiation Lab. Rept. UCRL-2182 (1953).
83. Gautier, P., J. phys. radium 16, 684 (1954).
84. Sauzade, M., Compt. rend. acad. sci. 246, 272 (1958).
85. Septier, A., Compt. rend. acad. sci. 243, 132 (1956).
86. Septier, A., Compt. rend. acad. sci. 243, 1026 (1956).
87. Septier, A., Compt. rend. acad. sci. 243, 1297 (1956).
88. Shull, F. G., McFarland, C. E., and Bretscher, M. M., Rev. Sci. Instr. 26, 304 (1954).
89. Ifiocl, J., Onde élec. 27, 27 (1947).
90. Carlile, R. N., H.E.P.L. 33 (1957).
91. Citron, A., Farley, F. J., Michaelis, E. L., and Øverås, H., CERN 59-8 (1959).
92a. Pinel, J., Ann. radioélec. compagn. franç. T.S.F. 14, 230 (1959); 16, 17 (1960).
92b. Real, M., Diplôme d'Etudes Supérieures, Paris (1958).
93. Septier, A., Compt. rend. acad. sci. 246, 1406 (1957).
94. Couchet, G., Thesis, Paris (1953); Ann. phys. 9, 731 (1954).
95. Morpurgo, M., and Septier, A., Compt. rend. acad. sci. 246, 2496 (1957).
96. Septier, A., Compt. rend. acad. sci. 246, 1835 (1958).
97. Septier, A., Compt. rend. acad. sci. 246, 1905 (1957).
98. Grivet, P., Septier, A., and Hue, J., CERN Symposium Part I, p. 192 (1956).
99. Septier, A., Compt. rend. acad. sci. 246, 2036 (1957).
100. Septier, A., and Van Acker, J., Compt. rend. acad. sci. 261, 346 (1960).
101. Septier, A., and Van Acker, J., Compt. rend. acad. sci. 261, 1750 (1960).

** The reports cited below may be found listed in Nuclear Science Abstracts. Prior to 1958, internal CERN reports were designated Int.; the various group reports were labeled LINAC, M.M., P.S., and SC. for linear accelerator, magnetic measurements, proton synchrotron, and synchrocyclotron, respectively. Often, the author's initials were also included in the report number.
Hydrogen Thyratrons
SEYMOUR GOLDBERG AND JEROME ROTHSTEIN
Edgerton, Germeshausen and Grier, Inc., Boston, Massachusetts

I. Introduction
II. Progress in Hydrogen Thyratron Construction and Techniques
   A. Introduction
   B. Hydrogen Reservoirs
   C. Ceramic Hydrogen Thyratrons
III. Operation of Hydrogen Thyratrons
   A. Modulator Circuit and General Sequence of Events over a Complete Pulse Cycle
   B. Triggering
   C. Commutation
   D. Steady State Conduction
   E. Deionization and Recovery
IV. Conclusion
References
I. INTRODUCTION
The hydrogen thyratron was developed to meet the wartime need for a high-voltage, high-power, fast-acting, jitter-free switch with a short recovery time. Advantages of hydrogen include low molecular weight (fast deionization, less cathode damage), relatively high breakdown strength at low pressure, and the existence of means to replenish gas which has disappeared during operation. Its major, and for a long time its sole, application was to radar modulators. After the war, a survey of hydrogen thyratron tube and circuit characteristics was written by Germeshausen (1), its "father." Wittenberg has also discussed wartime radar applications (2), though mostly with reference to devices other than hydrogen thyratrons. Both of these cover American wartime developments. British developments are briefly discussed by Knight and Hooker (3). Further discussion is given by Knight (4). An extensive survey of hydrogen thyratrons, including some French postwar work, has been written by Charles and Warnecke (5). A Russian survey has also appeared (6), but it seems to cover essentially the same material surveyed in the West. Progress in hydrogen thyratrons has included development of tubes
with much higher ratings (up to 40 kv anode voltage, 50 megawatts peak, 50 kw average), requiring reservoirs or replenishers to make up for hydrogen cleanup during operation; the introduction of ceramic envelopes; investigation of details of hydrogen thyratron operation such as breakdown of the grid-cathode space, commutation of the discharge to the anode, fall of anode potential and anode dissipation, characteristics of the conduction interval, problems of inverse conduction, deionization and recovery, and problems of cathode utilization and dissipation; and applications of these investigations to thyratron design. The following exposition will draw heavily on a three-volume research study of hydrogen thyratrons (7).¹
II. PROGRESS IN HYDROGEN THYRATRON CONSTRUCTION AND TECHNIQUES
A. Introduction

A picture of the wartime series of American thyratrons is given by Germeshausen in Reference 1, Fig. 8.39, p. 340, and the typical structure is illustrated diagrammatically in Fig. 8.36, p. 338. Figure 1 compares the physical sizes of wartime tubes with those of later developments, and Table I compares their characteristics. Figure 2 is a diagram of the later structure of glass tubes. All hydrogen thyratrons have the anode surrounded by a close-fitting shield (here the grid) to prevent long-path breakdown. The grid-cathode spacing is about five times the anode spacing in order to permit easy breakdown in the grid-cathode space. The anode seal is now re-entrant, so that the shortest path from the plasma outside the grid structure to the anode lead traverses a gap between glass and lead. This seal was developed by O. W. Marsh and one of the authors at Evans Signal Laboratory during the war, when hydrogen cleanup was a severe developmental problem, causing the anodes to heat up excessively. The resulting heating of the glass lowered its dielectric strength and caused dielectric failure even before cleanup of hydrogen reached the range of inoperability. Dielectric breakdown was never observed after the introduction of the re-entrant lead. While later developments made it possible to dispense with the re-entrant seal for low-power tubes, it has proved necessary to employ it in all glass-envelope tubes with ratings higher than those of the 5C22.

Hydrogen cleanup became increasingly serious as tubes were developed with higher ratings. While it ultimately proved possible to process low-power tubes with sufficient care to obtain reasonable life without a reservoir, the reservoir is mandatory for higher power tubes and increasingly frequent even for low-power tubes.

1 Explicit acknowledgment of the contributions of Prof. W. P. Allis and Mr. K. J. Germeshausen to our present understanding of hydrogen thyratrons should be made. The work was supported by the U. S. Army Signal Corps under contracts DA 36-039 SC-15372 (Vol. I), 36-039 SC-52589 (Vol. II) and DA 36-039 SC-70139 (Vol. III).

FIG. 1. Hydrogen thyratrons.

TABLE I. HYDROGEN THYRATRON CHARACTERISTICS

Column abbreviations: epy = peak forward voltage; Ebb = minimum supply voltage; ib = peak current; Ib = average current; Ip = RMS current; Pb = epy x ib x prr. The original table also lists the OA length and maximum diameter of each tube in inches; those entries are not legible in this copy.

Tube           epy    Min. Ebb   ib      Ib      Ip      Pb             Trigger   Max.(1) cathode   Peak(4) power   Average(4) power
               (kv)   (volt)     (amp)   (amp)   (amp)   (epy*ib*prr)   voltage   filament power    switched        switched
                                                                        (volt)    (watt)
1257(5)        33     3500       2000    2.6     60      20 x 10^9      1300      252               33 MW           43 kw
5948A(2)       25     5000       1000    1       30      9 x 10^9       700       208               12.5 MW         12.5 kw
5949A(2)       25     5000       500     0.5     -       6.25 x 10^9    550       139               6.25 MW         6.25 kw
5C22/HT-415    16     4500       325     0.200   -       3.2 x 10^9     200       73                2.6 MW          1.6 kw
4C35           8.0    2500       90      0.100   -       2 x 10^9       175       42.2              360 kw          400 w
3C45           3.0    800        35      0.045   -       0.3 x 10^9     175       15.7              52.5 kw         67.5 w
1258           1.0    300        20      0.050   -       0.1 x 10^9     175       12.6              10 kw           25 w
7322/1802(5)   25     4000       1000    1.5     40      20 x 10^9      550       135(3)            12.5 MW         18.5 kw

(1) At 6.3 volts. (2) Later versions of 1907 and 1754. (3) Includes reservoir. (4) Into matched load. (5) Tentative ratings.

A major advance in miniaturization and ruggedness of high-power thyratrons has resulted from the development of ceramic hydrogen thyratrons. Figure 3 shows a comparison grouping of glass high-power hydrogen thyratrons and the EGG Model 1802 ceramic tube. The ratings of the latter are given in Table I and are intermediate between those of the two largest tubes.

FIG. 2. Structure of the hydrogen thyratron. (Labeled parts: anode lead, Nonex glass, grid structure, cathode shielding, baffle, cathode.)
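The power columns of Table I can be cross-checked against the voltage and current ratings. The sketch below assumes that "into matched load" means the load sees half the peak forward voltage, so that peak power is (epy/2) x ib and average power is (epy/2) x Ib; the function name and the choice of three sample tubes are illustrative only.

```python
# Ratings transcribed from Table I: (tube, epy in kv, ib in amp, Ib in amp).
RATINGS = [
    ("5C22/HT-415", 16.0, 325.0, 0.200),
    ("4C35", 8.0, 90.0, 0.100),
    ("3C45", 3.0, 35.0, 0.045),
]


def switched_power(epy_kv, ib_amp, Ib_amp):
    """Return (peak, average) power in watts delivered to a matched load.

    Assumption: a matched load sees half the network voltage, epy / 2.
    """
    load_volts = 0.5 * epy_kv * 1e3
    return load_volts * ib_amp, load_volts * Ib_amp


for tube, epy, ib, Ib in RATINGS:
    peak, avg = switched_power(epy, ib, Ib)
    print(f"{tube:12s} peak {peak / 1e3:8.1f} kw   average {avg:7.1f} w")
```

Under this assumption the computed values agree with the tabulated peak and average powers for these tubes (2.6 MW and 1.6 kw, 360 kw and 400 w, 52.5 kw and 67.5 w).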
B. Hydrogen Reservoirs

Gas cleanup under the influence of an electrical discharge has been known at least since 1858 (8). There is an extensive literature on this subject and on the absorption and freeing of gases by and from metals or suitable compounds (9, 10, 11). Fortunately a number of metal-hydrogen systems exist with a characteristic equilibrium hydrogen pressure at each temperature (12). The first attempts, during the war, to use the Ba-H and
FIG. 3. Comparison grouping of high-power hydrogen thyratrons with EGG Model 1802 tube in foreground.

FIG. 4. Equilibrium pressure (microns) as a function of reservoir heater voltage for various loadings. Kuthe 5C22 reservoir; working volume = 130 cm3; active material 100 mg titanium.
the Ta-H systems were unsuccessful. A prediction on chemical and metallurgical grounds that Ti-H would work was borne out by preliminary results at Evans Signal Laboratory,² and Ti-H is at present the system most commonly used in commercial hydrogen thyratrons. Zr-H has also been used successfully. It is possible that the rare earth-hydrogen systems, particularly Ce-H and La-H, may be even better than the Ti-H system, but further development is necessary. Walsh and Shearman (13) have described the processing of a TiH2 reservoir for microwave diode attenuators. For a discussion of pressure stability and reservoir design, the reader is referred to ref. 14. We here simply present graphically the characteristics of a commercial 5C22 reservoir (measured at EGG, Inc.). Figure 4 shows, with loading as a parameter, the equilibrium pressure as a function of reservoir heater voltage. The unit of loading is the liter-millimeter (1 liter-millimeter is the amount of hydrogen necessary to fill a volume of 1 liter to a pressure of 1 mm of Hg). Figure 5 shows, for the same loadings, equilibrium reservoir pressure as a function of reservoir temperature. The reservoir presently used in the EGG 1802 ceramic tube contains 3.8 gm of titanium and is loaded with 16 liter-millimeters of hydrogen for a fill pressure of 375 µ. Assuming a low-pressure operating limit of 100 µ, it is possible for the hydrogen loading to decrease, through cleanup losses, to about 6 liter-millimeters. This represents a loss of about 120 tube volumes of gas. A new reservoir using lanthanum in place of the conventional titanium is under development. The advantage of La is a plateau in the equilibrium pressure-concentration relation in the pressure range at which thyratrons operate, at operating temperatures which are desirable for pressure stability, as shown in Fig. 6.
Half a gram of active material can accommodate a change in loading of 50 liter-millimeters before the pressure falls to the lower operable limit. Furthermore, the pressure plateau exists for about 40 liter-millimeters, representing about 480 tube fillings. The developmental problems are associated with complicated handling procedures resulting from the high chemical reactivity of La.

Attempts have been made to dispense with a separate reservoir heater by using heat generated at the cathode or anode. These have proved to be impractical because reservoir temperature depended too sensitively on the operating conditions. If the pressure gets too high, breakdown can occur at too low a forward or inverse voltage; if it goes too low, anode dissipation is excessive, as are jitter and delay time. While it has not been possible to make the reservoir temperature independent of operating conditions (practical considerations make it necessary to keep the reservoir in the same envelope as the rest of the structure), the dependence is sufficiently small to keep reservoir temperature within “physiological” limits.

² K. J. Germeshausen and M. L. Yeater did some early metal-hydrogen experiments at M.I.T. but made no thyratrons with reservoirs. J. E. Gorham, G. F. Rouse, G. C. Kretschmar, W. E. Harbaugh, O. W. Marsh, G. E. Reilly, I. Levin, H. Gerlicher, and R. T. K. Murray contributed to the wartime thyratron effort at Evans Signal Laboratory, where Ti-H reservoir development began at J. Rothstein's suggestion.

FIG. 5. Equilibrium reservoir pressure (microns) as a function of reservoir temperature for different loadings. Kuthe 5C22 reservoir; working volume = 133 cm3; active material 100 mg titanium.

FIG. 6. Pressure-concentration relation for titanium and lanthanum reservoirs. ●, titanium reservoir; ○, lanthanum reservoir.
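The reservoir bookkeeping quoted above can be checked with a few lines of arithmetic. The sketch below infers the size of one "tube volume" of gas from the titanium reservoir figures (a loss of 10 liter-millimeters described as about 120 tube volumes) and applies it to the 40 liter-millimeter lanthanum plateau. The inferred size of one filling is an assumption drawn from those two numbers, not a value stated in the text, and the names are illustrative.

```python
# Figures quoted in the text for the EGG 1802 titanium reservoir.
INITIAL_LOADING = 16.0       # liter-mm at the 375-micron fill pressure
MINIMUM_LOADING = 6.0        # liter-mm at the assumed 100-micron lower limit
QUOTED_TUBE_VOLUMES = 120.0  # tube volumes said to correspond to that loss

# Assumption: one tube filling is the loading loss divided by the quoted count.
LOADING_PER_FILLING = (INITIAL_LOADING - MINIMUM_LOADING) / QUOTED_TUBE_VOLUMES


def tube_fillings(loading_change):
    """Number of tube fillings represented by a loading change in liter-mm."""
    return loading_change / LOADING_PER_FILLING


print(f"one filling  ~ {LOADING_PER_FILLING:.4f} liter-mm")
print(f"Ti reservoir ~ {tube_fillings(10.0):.0f} fillings")
print(f"La plateau   ~ {tube_fillings(40.0):.0f} fillings")
```

With this inferred filling size, the 40 liter-millimeter plateau corresponds to 480 fillings, in agreement with the figure quoted for the lanthanum reservoir.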
C. Ceramic Hydrogen Thyratrons

Since their inception, hydrogen thyratrons have been characterized by what is essentially a plane parallel electrode configuration. Figure 7 shows how this structure has evolved to permit the use of cylindrical or vaned cathodes (in order to achieve high peak currents) while maintaining essentially the same geometry. Figure 8 shows a schematic of the EGG ceramic hydrogen thyratron, which preserves this geometrical feature. The all-metal and high-alumina ceramic construction makes it possible to process the ceramic tube much more rigorously than is permissible with a glass envelope. An operating envelope temperature of 400°C is permissible. The great strength of the ceramic plus the rigid anode and grid seals make the design extremely rugged. A baffle is employed at the grid to prevent deposition of evaporated cathode material on a region of the grid accessible to the anode field and to facilitate positive grid control without straight-through anode-to-cathode breakdown.

The anode and grid cups are of OFHC copper to facilitate electrode cooling. The anode itself is Mo and is brazed on the cup, which further aids anode dissipation by comprising part of the tube envelope. The metal-ceramic seals at EGG utilize the hydride process (other techniques, like Mo-Mn, can also be used). Silver and TiH2 bond the ceramic to nickel. In the case of the base disc, this nickel is a flanged ring on which the cathode assembly is rigidly mounted. In the case of the copper grid and anode cups, the nickel and copper are joined with BT solder. The reservoir is rigidly connected to the cathode flange.

FIG. 7. Plane parallel thyratron structures with different cathode arrangements (panels a to d; electrodes labeled anode, grid, and cathode).
The history of hydrogen thyratron development has been beset at many stages with oxide cathode problems. This subject is too large to discuss here in detail. The reader is referred to a recent survey giving full references to the literature (15). In general, hydrogen thyratron cathodes require high purity, careful handling, and rigorous processing. If cathode breakdown products get on the grid or envelope, they can lead to problems of forward or inverse breakdown.

The problem of full cathode utilization deserves separate mention. During the conduction interval, the plasma penetrates into the intervane space. As the plasma has a low but nonzero resistance to electron flow, there is a tendency to draw larger current densities from the vane tips. This sets an upper limit to the maximum useful vane depth for a given vane separation (see Sec. III.C). Other factors entering into proper cathode design are the temperature distribution within the cathode, warm-up time requirements, and cathode dissipation. Cathode dissipation due to ohmic heating of the oxide can be a serious problem.

FIG. 8. Hydrogen thyratron tube type 7322/1802. (Labeled parts include cathode baffle cover, heat shield, and filament pins.)

III. OPERATION OF HYDROGEN THYRATRONS
A. Modulator Circuit and General Sequence of Events over a Complete Pulse Cycle
A typical modulator circuit used with the hydrogen thyratron is shown in Fig. 9. It consists of:
1. A pulse-forming artificial transmission line characterized by a two-way transit time $\tau_p$ and an impedance $Z_0$.
2. An energy-absorbing load $R_L$.
3. The thyratron trigger system.
4. A charging system for restoring energy to the line during the interpulse interval.
5. The thyratron.

FIG. 9. Hydrogen thyratron pulse generating circuit. (Labeled components: charging choke, charging diode, pulse network.)
The circuit functions basically as a synchronized relaxation oscillator. When the circuit is triggered, the energy stored in the pulse line during the interpulse period is discharged through the load and the thyratron. Following the current pulse, the thyratron is extinguished by maintaining the anode at less than a critical reignition voltage for a period longer than the recovery time. A suitable voltage is maintained by proper choice of the parameters of the charging system and by having inverse voltage on the pulse line at the end of the current pulse. In Fig. 10 typical currents and voltages at the grid and anode in the hydrogen thyratron are shown over a complete pulse cycle to illustrate the sequence of events. The period is divided into four intervals, which may be
defined as triggering, commutation, steady state conduction, and recovery. The voltages and currents shown are typical for the 4C35 thyratron. The triggering interval is initiated by the application of the positive grid trigger pulse. Following a delay of the order of tenths of a microsecond from the application of voltage, grid current is observed to increase exponentially with time. After the grid current reaches a certain critical value, conduction is transferred to the anode region and the commutation interval
FIG. 10. Typical events in the pulsed operation of a hydrogen thyratron.
starts. The anode voltage then falls rapidly with a time constant of several to several tens of millimicroseconds. The cathode current rises at a rate determined by the modulator circuit time constant and the rate of fall of anode potential. The course of the grid voltage during the commutation interval is somewhat erratic, because of self and mutual inductive effects of the rapidly rising current in the cathode circuit (and also electronic processes within the tube). Under certain conditions the inductive effects can give rise to a voltage spike on the grid of the order of kilovolts. During the steady conduction interval which follows, the current is substantially constant and the anode voltage is at some low value equal to the steady state tube drop. At the end of this time the pulse network has delivered the charge it received and the current falls to zero.
Next, a negative or inverse voltage appears at the anode because of the normal mismatch in impedance between the load resistance and the pulse network. This starts the deionization or recovery interval, during which the plasma remaining in the tube as a result of the discharge decays to the tube walls by means of a diffusion process. The anode voltage slowly increases from its inverse value to the peak forward voltage. Oscillations are superimposed on this because of transients reflected back and forth along the pulse line. Because of losses in the transmission line, these normally decay to a negligibly small value after some tens of microseconds. After 360 µsec, corresponding to a repetition rate of about 2800 cps for the example shown, the anode voltage reaches its full peak forward voltage, called epy. If the grid is maintained negative during the deionization period, a current made up of positive ions will flow to it. Diffusion losses of the plasma cause this current to decay exponentially. The intervals described are considered in detail in Sec. III.B.

The general theoretical and experimental literature of electrical discharge in gases is tremendous: we cite here only the Handbuch volumes (16), which devote over a thousand pages to various aspects, giving several thousand references. Some specific references to the ionization process in thyratrons include theoretical studies by Wheatcroft et al. (17) and Mullin (18), a theoretical and experimental investigation by Silver (19), and experimental studies by Farrison (20), Webster (21), Birnbaum (22), Knoop and Kroebel (23), Woodford and Williams (24),³ Pakswer and Mayer (25), Appel and Funfer (26), and Olmstead and Roth (27). Deionization and recovery processes are treated by Wittenberg (28), Birnbaum (22), Hess (29), Knoop and Kroebel (23), Romanowitz and Dow (30), Malter and Johnson (31), Knoop (32), and Olmstead and Roth (27).
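Two numbers in the pulse-cycle description lend themselves to a quick check: the repetition rate implied by a 360 µsec period, and the sign of the voltage left on the pulse network when the load is slightly mismatched. The sketch below treats the network as an ideal lossless line of characteristic impedance z0 charged to epy and discharged into a resistive load rl; this idealization, the function names, and the 50-ohm and 45-ohm values are assumptions for illustration, not values from the text.

```python
def repetition_rate(period_sec):
    """Pulse repetition rate in cps for a given interpulse period."""
    return 1.0 / period_sec


def load_and_residual_voltage(epy, z0, rl):
    """Ideal lossless line charged to epy and switched onto a resistive load.

    Returns (voltage across the load during the pulse,
             voltage left on the line after the main pulse).
    The residual is negative (inverse anode voltage) when rl < z0.
    """
    v_load = epy * rl / (rl + z0)
    v_residual = epy * (rl - z0) / (rl + z0)
    return v_load, v_residual


print(f"{repetition_rate(360e-6):.0f} cps")

# Hypothetical example: 8 kv network, 50-ohm line, 45-ohm load.
v_load, v_residual = load_and_residual_voltage(8000.0, 50.0, 45.0)
print(f"load {v_load:.0f} volts, residual {v_residual:.0f} volts")
```

In this idealization a matched load receives exactly half the network voltage and leaves no residual, while a load somewhat below the line impedance leaves the small inverse voltage that the text relies on to start the recovery interval.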
B. Triggering

Conventional hydrogen thyratrons contain baffled grid structures which prevent the anode field from penetrating to the cathode region. The triggering process thus requires not only that the grid voltage pulse break down the grid-cathode space, but also that a plasma be established in the grid aperture region to supply the electrons needed to initiate the commutation process. Figure 11 shows the growth of grid current in a 4C35 as a function of grid voltage for an anode voltage of 3 kv. Anode voltage does not influence grid current until commutation occurs. The current rises nearly exponentially as $J_0 e^{t/T}$ (until limited by the external circuit) with a time constant $T$ determined by the applied grid voltage. Initially no appreciable ionization is present, so the initial current is determined by conditions analogous to the Child-Langmuir space charge relations as modified by elastic collisions and scattering of the electrons by the gas molecules. The relations applicable during this interval have been deduced by Allis and Goldberg (33)⁴ for infinite plane parallel electrodes. The equations relating grid potential to current density and spacing, where the grid is considered as the anode of an equivalent diode to which a voltage $V_A$ equal to the trigger voltage is applied, are

[Equation (1): not legible in this copy]

and

[Equation (2): not legible in this copy]

Here $\xi_A$ is the unscattered electron transit time to the grid plane multiplied by the collision frequency $\nu_c$, and $\nu_c = 5.9 \times 10^{9}\, p$ (p in mm). It is found that an accurate representation of the relation obtained by eliminating $\xi_A$ is given by

[Equation (3): not legible in this copy]

where
$J_A$ = grid current density (amp/meter^2),
$\mu = e/m\nu_c$ = electron mobility,
$V_A$ = applied grid potential,
$x_A$ = grid-cathode spacing (meters).

This relation holds satisfactorily for pressures between 100 and 1000 µ. Equation (3) may be compared to the Child-Langmuir equation relating grid current to spacing and voltage:

$$J_A = 2.33 \times 10^{-6}\, \frac{V_A^{3/2}}{x_A^{2}}. \qquad (4)$$

The grid current under scattering conditions is roughly a factor of 30 less than the Child-Langmuir current at a pressure of the order of 0.5 mm, a spacing of about 1 cm, and a voltage of 100 volts. The initial scattered current $J_A$ corresponds to $J_0$, which is the coefficient of the exponential growth of grid current:

$$J_g = J_0 e^{t/T}. \qquad (5)$$

³ This paper gives measurements on hydrogen thyratrons.

⁴ Theory developed by W. P. Allis and S. T. Martin.
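As a rough numerical illustration of the Child-Langmuir comparison made above, the sketch below evaluates the collisionless space-charge-limited current density for the quoted conditions (100 volts across about 1 cm) and then applies the factor of 30 that the text gives for the scattered case. The physical constants are standard values; the function name and the use of the factor of 30 as a simple divisor are assumptions for illustration.

```python
import math

EPS0 = 8.8541878128e-12             # permittivity of free space, farad/meter
E_CHARGE = 1.602176634e-19          # electron charge magnitude, coulomb
E_MASS = 9.1093837015e-31           # electron mass, kg

# Child-Langmuir perveance per unit area, (4 * eps0 / 9) * sqrt(2 e / m).
CL_COEFF = (4.0 * EPS0 / 9.0) * math.sqrt(2.0 * E_CHARGE / E_MASS)


def child_langmuir_density(volts, spacing_m):
    """Collisionless space-charge-limited current density in amp/meter^2."""
    return CL_COEFF * volts ** 1.5 / spacing_m ** 2


j_vacuum = child_langmuir_density(100.0, 0.01)
j_scattered = j_vacuum / 30.0       # factor quoted in the text, applied as-is

print(f"coefficient     {CL_COEFF:.3e}")
print(f"Child-Langmuir  {j_vacuum:.1f} amp/m^2")
print(f"with scattering {j_scattered:.2f} amp/m^2 (approximate)")
```

The coefficient comes out close to the familiar 2.33 x 10^-6 in MKS units, so the scattered initial current density at these conditions is of the order of 1 amp/meter^2.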
This initial electron current produces ionization by collision, and positive space charge builds up in the intervening space. The positive space charge increases the field near the cathode, increasing the electron current regeneratively. When sufficient positive charge accumulates near the grid, a zero field region develops which traps electrons, creating a plasma. With a zero field region near the grid, the cathode current increases still further, since the grid potential extends further towards the cathode. Finally, the plasma boundary originally near the grid extends all the way to the cathode, and the current is limited by the impedance of the external circuit. The potential variation during grid-cathode breakdown is shown in Fig. 12. It is independent of the motion of positive ions. The motion of the potential wave towards the cathode is entirely due to the space charge effect of ions. The quantitative relations involved during this transition period have been deduced by Martin and Allis (33). In this analysis current is assumed to rise exponentially with time, as observed. The time constant $T$ is a function of the applied grid voltage and the constants of the gas as follows:

$$\frac{1}{T} = 21.1 \times 10^{9}\, p\, V_i^{1/2} f(\gamma), \qquad (6)$$

where $\gamma = (V_A/V_i)^{1/2}$, $p$ = pressure (mm), $V_i$ = ionization potential = 16.2 volts, $V_A$ = grid voltage, and $f(\gamma)$ is a function of $\gamma$ alone.

[Expression for $f(\gamma)$: not legible in this copy]

FIG. 12. Successive steps in the breakdown of a gas diode containing a thermionic cathode showing the development and measurement of the plasma front.
These relations fit the experimentally observed time constants fairly well. In Fig. 13, $T$ (mµsec) is plotted as a function of $V_A$ for $p$ = 0.5 mm. This theory satisfactorily explains the grid-cathode breakdown between ideal plane parallel electrodes. The actual grid-cathode geometry in typical
FIG. 13. Rate of growth of grid current. Time constant versus grid voltage (horizontal axis: grid voltage, 0 to 200). Pressure 0.5 mm; $J_g = J_0 e^{t/T}$.
thyratrons, however, differs markedly from the ideal. In the 4C35, the cathode is a cylinder placed axially along the discharge path and almost surrounded by heat baffles except for an annular opening at the top to permit electron flow. The principal effects of these perturbations are, first, to focus the initial electron current flowing to the grid and, second, to cause
only the uppermost cathode regions to participate in initial grid-cathode breakdown, due to electrical shielding of the cathode by the heat shields. The initial current $I_0$ is the product of the grid current density $J_A$ of Eq. (3) and the effective cathode area. In the 4C35 the effective cathode area is considerably less than the total emitting surface. This assertion is supported by measurements of grid current rise as a function of cathode temperature: $I_0$ is profoundly affected, but $T$ is unchanged. This reflects the fact that more electrons are available at the upper regions of the cathode, either because of increased activation there or because of electron diffusion from the more dense space charge cloud surrounding the entire cathode. The focusing action is indicated by the shaded paths shown in Fig. 14.
FIG. 14. Equipotential lines in the grid-cathode space and in the space between grid baffle and anode of 4C35. Lines are identified in percentage of grid voltage (in grid space) and in percentage of anode voltage (in anode space).
Since electrons flow initially in this path, the plasma is expected to be located here initially. Because of the high degree of anode shielding, electrons must, in order to contribute to anode breakdown, diffuse to the annular opening leading to the anode. The diffusion time for electrons to cross the space between the initial plasma and the annular opening to the anode is calculated using ambipolar diffusion laws. These yield a diffusion time of 0.02 to 0.03 µsec for this length, which is approximately 3 mm.
Analysis of the data shown in Fig. 11 indicates that commutation occurs a fixed time after the grid current reaches a critical value, which for the 4C35 is 50 ma. This delay is approximately 0.04 µsec, agreeing fairly well with calculated ambipolar diffusion times. It has been suggested that motion of the electrons from the initial discharge path might also be affected by the electric fields present at the grid regions. This hypothesis is difficult to verify, since probe measurements of plasma potential are difficult to interpret under transient discharge conditions. Since commutation time depends on the initial location of the grid-cathode discharge, we might expect factors that influence the initial course of the electrons to be of importance in determining this time. Magnetic fields originating from the filament windings have been shown to have an effect on the commutation time and to introduce jitter at the filament voltage frequency. Magnetic fields of this nature would, of course, affect the initial path of electrons as well as the manner in which they diffuse. If an electric field mechanism transports electrons from the initial position of the discharge to the grid openings, the magnetic field would certainly affect the times involved here also.
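The triggering picture described in this section, exponential growth of grid current to a critical value followed by a fixed delay, can be turned into a small timing estimate. In the sketch below the critical current (50 ma) and the fixed delay (0.04 µsec) are the 4C35 values quoted in the text, while the initial current and time constant in the example are hypothetical placeholders chosen only to exercise the formula.

```python
import math

I_CRIT = 50e-3          # critical grid current for the 4C35, from the text (amp)
FIXED_DELAY = 0.04e-6   # delay after I_CRIT is reached, from the text (sec)


def time_to_critical(i0, time_constant, i_crit=I_CRIT):
    """Time for I(t) = i0 * exp(t / T) to grow from i0 to i_crit."""
    if not 0.0 < i0 < i_crit:
        raise ValueError("i0 must be positive and below the critical current")
    return time_constant * math.log(i_crit / i0)


def commutation_delay(i0, time_constant):
    """Growth time to the critical current plus the fixed delay."""
    return time_to_critical(i0, time_constant) + FIXED_DELAY


# Hypothetical operating point, not read from Fig. 11 or Fig. 13.
example_i0 = 1e-4       # amp
example_T = 20e-9       # sec
print(f"{commutation_delay(example_i0, example_T) * 1e9:.0f} nsec")
```

Because the growth time depends only logarithmically on the initial current, changes in cathode temperature that alter the initial current shift the delay far less than changes in grid voltage that alter the time constant.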
C. Commutation

Commutation begins after breakdown of the grid-cathode space and buildup of grid current to some critical value. Electrons appear at the grid openings, are accelerated by the anode field, and cause additional ionization. When there is a steady discharge in the grid-cathode space, small electron currents flow to the anode at low anode potentials. As the anode potential is raised, these currents increase until breakdown occurs. Considerable light comes from the anode and grid opening region when this precommutation current flows. The breakdown current is fairly definite, about 1 ma for the 4C35, and as high as 3 or 4 ma in tubes such as the 1907 and ceramic thyratrons similar to the 1802. Anode breakdown cannot be explained by a theory similar to that applicable to grid-cathode breakdown, because positive ions formed in the anode space are swept out so rapidly by the high field that insufficient positive ion space charge accumulates to result in a region of zero field. The important distinction between the baffled grids normally used in hydrogen thyratrons and the unbaffled grids requiring negative bias for anode holdoff is that the strong negative bias creates a potential minimum in the grid region, across which the current is regulated by the Boltzmann relation. Breakdown theory for such structures can be based on regenerative reduction of this potential well by ionization in the anode space. This occurs since ions created by electron current passing the potential minimum
raise the potential of the saddle point. This increases the electron current passing it, producing a regenerative effect culminating in breakdown.

1. The Anode as a Langmuir Probe. Careful measurements of the anode currents flowing in the 4C35 thyratron before breakdown occurs show that there is no potential well in the grid region at normal operating anode potentials, and that when the anode is at or above the potential of the
FIG. 15. Anode voltage as a function of anode current, with triggering current drawn to grid as a parameter.
grid-cathode plasma (about 25 volts), it collects all the random electron current from the plasma. The evidence is shown in Fig. 15, where dc anode current is plotted as a function of anode voltage, with triggering current to the grid as parameter. The response is that of a Langmuir probe,⁵ to which the electron current density $J_p$ is given by the Boltzmann relation:

$$J_p = J_r \exp\left[-e(V_p - V)/kT_-\right], \qquad (7)$$

where $V$ is probe potential, $V_p$ is plasma potential, and $J_r$ is the random electron current density. When the probe reaches or exceeds plasma potential, all of the random electron current is collected and probe current should saturate. In practice, an abrupt change of slope is observed, as shown in Fig. 15.

⁵ A survey of the theory and use of probes, with many references, is given by Loeb (34).

Further evidence that the anode behaves as a probe is obtained when the anode potential is well below $V_p$. If sufficient electron current is repelled, random positive ion current is collected, which makes the current in the external circuit reverse sign. When Fig. 15 is corrected by eliminating the positive ion current, a good Boltzmann line is obtained, giving an electron temperature of about 32,500°K (4 ev). The plasma potential is about 23.5 volts. Since the applied voltage was 20 volts, the grid was about 3.5 volts negative with respect to the plasma (neglecting contact potentials). Figure 16 shows measurements taken with a conventional probe near
FIG. 16. Probe measurements of grid-cathode discharge. (Horizontal axis: V, probe potential with respect to cathode, 0 to +50.)
the grid baffle which confirm the foregoing. An electron temperature of about 35,000°K is obtained, agreeing with the anode current measurements. The plasma potential, as indicated by the abrupt change in slope, is several volts above that of the grid. We now ask what conditions develop about the anode, as it becomes increasingly positive with respect to the plasma, which eventually lead to breakdown and anode control of the plasma density. Otherwise expressed, we ask what happens when a probe immersed in a plasma is made increasingly positive with respect to the plasma. Experimental results of a measurement of this type, shown in Fig. 16, may be explained as follows: At low positive voltages relative to the plasma, an electron sheath develops. This occurs because the mean energy of the ions is only of the order of tenths of a volt, so that all the ions are repelled from the probe region at low positive voltage. Since electrons enter the sheath from a region of zero field (the plasma), the Child-Langmuir space charge conditions apply and yield a sheath thickness $S$ given by

[Equation (8): not legible in this copy]
This sheath effectively increases the area from which the probe collects electron current, and results in the commonly observed gradual increase of collected current at low potentials over the plasma. As the probe potential is further raised, the probe current increases more rapidly than predicted by the simple increase in probe area given by the Child-Langmuir relation. Simultaneously, a glow appears above the probe. Under these conditions, thermal electrons entering the probe sheath from the plasma are accelerated through sufficient voltage to greatly increase excitation and ionization in the sheath. This modifies the potential distribution as shown in Fig. 17. Curve A is the potential under space charge limited conditions. The space charge is composed only of electrons in transit to the anode. Poisson's equation

$$\frac{d^2V}{dx^2} = +\frac{4\pi\rho_-}{\epsilon_0} \qquad (9)$$

shows that the curvature is positive as shown. The potential is given by the Child-Langmuir equation

$$V = 5.7 \times 10^{3}\, J^{2/3} x^{4/3}. \qquad (10)$$

When ionization and the accompanying accumulation of positive charge occur, Poisson's equation becomes

$$\frac{d^2V}{dx^2} = -\frac{4\pi}{\epsilon_0}\,(\rho_+ - \rho_-), \qquad (11)$$

and the curvature decreases. Curve B illustrates this decrease for positive charges introduced and maintained at $x_0$. The potential for $x$ less than $x_0$ is not altered initially, since space charge limited conditions implying zero field at the negative boundary were assumed and the charges up to $x_0$ are not changed. At $x_0$, a change in curvature must occur as shown; beyond $x_0$, the curvature must remain as before, since no new charges are introduced
FIG. 17. Potential diagrams between plasma and probe showing effects of ions at $x_0$. (Labels: applied potential, boundary.)
in this region. But now the potential at the anode does not equal the voltage applied. The space charge must then be adjusted to the new conditions, i.e., the curvature must increase to satisfy the boundary conditions. This can occur only if the negative space charge increases, which is produced only through an increase in the electron current across the space. The net effect of positive ion introduction is thus an increase in electron current for a space charge limited electron source. For a probe in a plasma, however, the current density available is fixed by the random plasma currents. In order to satisfy the boundary conditions, the sheath thickness must increase.

The case described above is illustrative and not physically realizable. However, under conditions of ion generation in the sheath region, there is at any point $x_0$ a steady contribution to the space charge by positive ions. A simple soluble case, close to actual probe conditions, occurs when all the ions are generated in a layer at the probe surface. Langmuir (35) found that with increasing ion current from the layer, the curvature steadily decreases near the probe until the field is zero at the surface and there is a space charge limited flow of positive ions from surface to plasma. This current density, $J_+$, is related to the electron current density to the probe, $J_-$, by

$$J_+/J_- = (m/M)^{1/2}, \qquad (12)$$
where $m$ and $M$ are electron and ion masses, respectively. If the plasma is a space charge limited emitter of electrons, the current collected is increased to 1.86 times that in the absence of positive ion flow. Since the plasma current density to the probe is fixed, the sheath thickness increases by the square root of 1.86, i.e., by 1.36. Langmuir termed this a "double sheath" since ions enter at one end (probe) and electrons at the other. For double space charge flow limits, the field is zero at both ends, the potential is symmetrical about the midpoint, and a second plasma forms at the probe surface since the field is zero at the probe. The extent of this plasma increases with increasing probe overvoltages. It will be shown later that the double sheath can also act as a stable boundary between two plasmas of densities in the ratio

$$N_1/N_2 = (T_{+2}/T_{-1})^{1/2} \tag{13}$$

where $T_{+2}$ is the positive ion temperature in the more positive plasma and $T_{-1}$ is the electron temperature in the more negative plasma.

With increasing collecting area of the probe, a point comes where it controls the plasma. This may be seen as follows. The random current density collected by the probe adds to the drift current, which requires an added number of electrons leaving the cathode and passing through the gas. These electrons undergo ionizing collisions, thereby increasing the random electron current density in the plasma. It was shown on the basis of the probe measurements that there is a fixed ratio between the drift current to the grid and the random electron current. Since the anode (probe) collects current from a boundary near the same plane as the grid, the current collected, as far as the plasma is concerned, may be considered simply as an addition to the grid current. Thus, we may write

$$I_D = I_{D0} + J_{r-} A_p \tag{14}$$
where $I_D$ = total drift current, $I_{D0}$ = initial drift current (grid current), $J_{r-}$ = random electron current density, and $A_p$ = effective collecting area of probe. But, since [Eq. (15)] and [Eq. (16)], then [Eq. (17)]. When the probe collecting area ($A_p$) approaches the ratio of drift current to random current density, the random current density approaches infinity. The external circuit will, of course, limit the current to some finite value
and will thus limit the effective collecting area of the probe. Under these conditions, the probe may be said to be in control of the plasma density. Thus, a breakdown criterion is established in terms of a critical area for the breakdown electrode, given by the ratio of drift to random plasma currents. All the events leading to probe control of the plasma participate in anode breakdown.

2. Anode Breakdown and Dissipation. The following behavior occurs in the grid-anode partition region as the anode potential is raised to the breakdown voltage, which, in a typical case for the triggering currents drawn, would be several kilovolts. First, at very low anode potentials over that of the plasma, ionization is observed in the anode region in collimated beams corresponding to openings in the grid mesh. Ionization is observed only in the outer portions of the mesh nearest the annular baffle opening. The ionization boundary starts at the anode surface and advances across the space to the mesh opening as the anode potential is raised. The anode current continues to increase slowly. At still higher potentials, the ionization boundary protrudes from the grid mesh openings into the baffle region. This boundary continues to expand with increasing anode voltages until it practically touches the grid baffle. At about the point it reaches the baffle, the boundary suddenly increases in size and connects through the annular baffle opening to the main body of the triggering plasma. The interconnecting space then fills with a dense plasma bordering on the initial triggering plasma in a well-defined boundary constituting a double sheath. As in the probe observations, ionization occurs in the anode space when anode potential exceeds that of the plasma by as much as the ionization potential, and increases the potential at all points between anode and grid.
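The probe-control criterion stated before the discussion of anode breakdown can be illustrated with a small numerical sketch. It assumes, as one reading of the "fixed ratio" between drift current and random electron current, that the random current density is proportional to the drift current with a constant `k`; the names `k`, `drift_current`, and `critical_area`, and all numerical values, are hypothetical and not taken from the text. Under that assumption Eq. (14) gives a drift current that grows without limit as the collecting area approaches 1/k, the ratio of drift current to random current density.

```python
def critical_area(k):
    """Collecting area at which the assumed feedback diverges.

    Assumption (not from the text): J_r = k * I_D, so the critical
    area equals I_D / J_r = 1 / k.
    """
    return 1.0 / k


def drift_current(i_d0, k, a_p):
    """Total drift current from Eq. (14) with J_r = k * I_D substituted.

    I_D = I_D0 + (k * I_D) * A_p   =>   I_D = I_D0 / (1 - k * A_p)
    """
    loop_gain = k * a_p
    if loop_gain >= 1.0:
        raise ValueError("collecting area is at or beyond the critical area")
    return i_d0 / (1.0 - loop_gain)


# Illustrative (hypothetical) numbers: 1 amp initial drift current and
# k = 0.5 per cm^2, so the critical area is 2 cm^2.
I_D0 = 1.0
K = 0.5

for fraction in (0.0, 0.5, 0.9, 0.99):
    a_p = fraction * critical_area(K)
    print(f"A_p = {a_p:5.2f} cm^2 -> I_D = {drift_current(I_D0, K, a_p):8.2f} amp")
```

In a real tube the external circuit caps the current long before the divergence, which is the sense in which the text says the circuit limits the effective collecting area.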
The ionization and excitation boundary that advances across the anode space and penetrates the grid mesh openings may be taken to be an equipotential near the ionization level. Its motion outward reflects the normal motion of an equipotential line that would occur, even in the absence of ionization, as the anode potential is raised. The position of this boundary is, however, profoundly influenced by the ionization. At constant anode potential, as the triggering plasma density is increased, the boundary protrudes further into the baffle region. This is a result of the increased number of ions generated by the higher anode currents, and reflects the extent to which the potentials are altered by ionization and the resultant positive space charge in the anode space. As this equipotential boundary expands, the area from which the anode collects plasma electron current increases correspondingly. The position of the plasma from this equipotential front may be calculated from the Child-Langmuir equation. For the order of magnitude of trigger plasma densities used, this distance is of the order of several millimeters.
As the collecting area increases, more electron current is collected by the anode, resulting in increased ion generation and positive space charge in the interconnecting space. This further increases the potentials in the anode-grid space, resulting in further increases of collecting area, and so on, until breakdown occurs. The net effect of the cumulative events occurring in the grid aperture region is to establish a conducting channel, terminated on the grid-cathode region by means of a double sheath, enabling the anode field to be impressed on the cathode. The field in the plasma increases somewhat at this time, leading to an increased rate of ionization in the grid-cathode space.

The rate at which ionization builds up determines the time it takes for the tube to switch from a nonconducting to fully conducting state. One would expect the gas density, along with the ionization frequency constant, to have a critical influence on this time. Experimental observations show that the anode potential falls exponentially over most of its range in the following manner:

$$e_b = e_{py} - A e^{t/\tau_a} \tag{18}$$
where $e_b$ = instantaneous anode potential, $e_{py}$ = peak forward voltage, $\tau_a$ = anode fall time constant, and $A$ = constant. Figure 18 shows the results of a typical measurement taken on a 4C35 over a range of values of $e_{py}$ from 2 to 10 kv. The plot shows that the time constant is nearly independent of $e_{py}$, varying only from 5.10 × 10⁻⁹ sec at the lowest value of $e_{py}$ to 3.43 × 10⁻⁹ sec at the highest value. The time constant is similarly independent of the circuit constants determining the rate of rise of current. It is strongly affected by tube pressure, as shown in Fig. 19. These data were taken on an Amperex 4C35 having a reservoir that enabled the pressure to be varied. At the mean operating pressure of the 4C35, which is about 450 μ, the time constant is about 6 × 10⁻⁹ sec; at 200 and 700 μ, the time constant is 28 × 10⁻⁹ sec and 3 × 10⁻⁹ sec, respectively. After an interval of several time constants, the anode fall approaches the steady state tube drop value and the exponential levels off. Anode dissipation then becomes negligible.

Since the course of the anode potential during the switching interval seems fixed only by the tube, the rise of current may be calculated from the equivalent circuit of Fig. 20. An inductance $L$, in series with an ideal transmission line, was chosen to simulate the characteristics of the lumped pulse forming network. Substituting the circuit parameters in Eq. (18) we have

$$e_b = e_{py} - A e^{t/\tau_a} = e_{py} - L\,\frac{di_b}{dt} - (R_L + Z_0)\, i_b \tag{19}$$
FIG. 18. Exponential character of the fall of anode potential and its relative independence of anode voltage.
FIG. 20. Equivalent circuit of thyratron, load, and pulse network during the commutation interval.
Solving for $i_b$, we get [Eq. (20)].

These equations apply only during the time the anode potential is decaying exponentially. When the anode potential falls to a value of several hundred volts, its course departs markedly from the exponential and the current is limited primarily by the external circuit. It is not easy to account for the fact that the time constant for the fall of anode potential is independent of the anode potential and circuit condition. However, we would expect the generation rate of ions in the grid-cathode space to control the rise of current. The rate of plasma generation is given by

$$dn/dt = n\nu_i \tag{21}$$

where $\nu_i$ is the ionization frequency and $n$, the plasma density, is the common value of electron and positive ion densities. Since the anode current is proportional to $n$, [Eq. (22)]. Under circuit conditions where $\tau_a < L/(R_L + Z_0)$,

$$\frac{1}{i_b}\frac{di_b}{dt} = \frac{1}{\tau_a} \tag{23}$$
Since $\tau_a$ is nearly independent of anode voltage, $\nu_i$ must be similarly independent. The rate of ion generation in a plasma subjected to a strong field depends on pressure, which implies the same for $\tau_a$. Measurements taken with a 4C35 having a reservoir (Fig. 19) show the time constant to vary inversely as the pressure squared.

Of considerable practical importance is the fact that, although the term "pressure" is used in the discussion of anode fall time, it is actually the gas density which controls the rate of ion generation. Thus, the temperature of the gas in the grid and anode regions affects anode fall time. Under conditions which result in red hot anodes, the fall time is observed to be longer for this reason. Since gas density varies inversely as the absolute temperature, we might expect density variation as high as 3 to 1 in different regions of tubes of this type.

Immediately prior to anode breakdown, the anode potential is $e_{py}$ and there is essentially no anode current. Following the buildup of the triggering current to the critical value necessary to initiate commutation, the
anode potential starts to fall and conduction through the tube commences. The small anode currents flowing during the time it takes the anode potential to fall from $e_{py}$ to the steady state tube drop value result in a spike of commutation, which, under typical operating conditions, forms the major part of the total anode dissipation. In Eq. (20), denoting $L/(R_L + Z_0)$, the circuit time constant, by $\tau_c$, we get [Eq. (24)].

Multiplying this by the anode potential and integrating over the commutation interval (the time it takes $e_{py} - A\exp(t/\tau_a)$ to equal zero), we obtain the energy dissipated at the anode:

$$W = e_{py}\, i_b\, \tau_a^2 / 2(\tau_a + \tau_c) \tag{25}$$

where $i_b$ is the peak anode current. The average power dissipated at the anode is this multiplied by the repetition rate:

$$P_o = e_{py}\, i_b\, prr\, \tau_a^2 / 2(\tau_a + \tau_c) \tag{26}$$
This is the familiar $P_b$ factor now used as a rating criterion multiplied by a factor containing the constants of the tube and circuit. In most cases the circuit time constant is long compared to that of the tube, and commutation dissipation varies as the square of the tube time constant and inversely as the circuit time constant. Since the tube time constant varies inversely as the pressure squared, anode dissipation will in most cases vary inversely with the fourth power of pressure. This shows that tube pressure controls anode dissipation, and accounts for the appearance of red anodes as nonreservoir tubes age and gas pressure declines. In reservoir tubes, the lower end of the reservoir range occurs when the pressure falls to the point where the commutation energy increases sufficiently to result in red anodes.

It has long been known that $P_b$ is a crucial rating factor whose significance may be seen in its effect on anode dissipation. The present discussion points out the importance of the tube constant $\tau_a$ and of the circuit constant $\tau_c$. The value of $\tau_a$ may not readily be estimated by observation of the current rise through the thyratron in a modulator circuit because the current rise is also a function of $\tau_c$. Figure 21 illustrates this effect. The current rise in a 1907 thyratron in a conventional modulator circuit is shown at various reservoir voltage settings corresponding to different tube pressures. Shown also are the circuit rise characteristics, measured with an essentially instantaneous mechanical switch replacing the thyratron. Note that even at the highest pressures ($E_R$ = 5.05 volts), the measured current rise through the thyratron does not indicate the true circuit time constant, while at low pressures the rise time is considerably different from the circuit characteristics. A reservoir voltage of 4.5 volts corresponds to
the point at which red anodes develop when the tube is operated at its rated values, while 5.05 volts is the maximum operable reservoir voltage. The effective anode time constants in these measurements were approximately 0.05 μsec at 4.5 volts and 0.03 μsec at 5.05 volts. The circuit time constant was about 0.04 μsec. One ordinarily assumes that if the current rise is slower, the anode dissipation is less. This is true only if the current rise is slowed by changing the circuit time constants. If the current rises more slowly because of drop in tube pressure at constant circuit constant, anode dissipation will increase. This indicates the importance of an independent measure of circuit time constant.

FIG. 21. Current rise for different reservoir voltages compared to modulator circuit rise characteristic.
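The commutation energy of Eq. (25) and its pressure dependence can be checked with a short numerical sketch. This is an illustration under stated assumptions, not the authors' derivation: the current during the fall is taken as the forced response only, i = i_b τ_a/(τ_a + τ_c) exp(t/τ_a), with the time origin placed at the instant the anode potential reaches zero and with the peak current taken as i_b = e_py/(R_L + Z_0). The function names, the 8 kv and 100 amp operating point, and the choice of circuit time constants are hypothetical. The 450 μ reference point and the inverse-square pressure law come from the text.

```python
import math


def commutation_energy(e_py, i_b, tau_a, tau_c):
    """Closed form of Eq. (25): W = e_py * i_b * tau_a^2 / (2 (tau_a + tau_c))."""
    return e_py * i_b * tau_a ** 2 / (2.0 * (tau_a + tau_c))


def commutation_energy_numeric(e_py, i_b, tau_a, tau_c, steps=20_000, span=40.0):
    """Midpoint-rule integral of e_b * i over the exponential fall.

    Assumptions (hypothetical, for illustration): time runs from
    -span * tau_a up to 0, where e_b = e_py * (1 - exp(t / tau_a)) reaches
    zero, and the current is the forced response of the series L-R circuit.
    """
    gain = tau_a / (tau_a + tau_c)
    t_start = -span * tau_a
    dt = -t_start / steps
    total = 0.0
    for n in range(steps):
        t = t_start + (n + 0.5) * dt
        g = math.exp(t / tau_a)
        total += e_py * (1.0 - g) * i_b * gain * g * dt
    return total


def tau_a_at_pressure(p_microns, tau_ref=6e-9, p_ref=450.0):
    """Anode fall time constant, assuming the inverse-square pressure law
    anchored at the 6e-9 sec value quoted for 450 microns."""
    return tau_ref * (p_ref / p_microns) ** 2


E_PY = 8.0e3      # volts, illustrative
I_B = 100.0       # amp, illustrative
TAU_C = 0.04e-6   # sec, illustrative circuit time constant

for pressure in (200.0, 450.0, 700.0):
    tau_a = tau_a_at_pressure(pressure)
    w = commutation_energy(E_PY, I_B, tau_a, TAU_C)
    print(f"p = {pressure:5.0f} microns  tau_a = {tau_a:.2e} sec  W = {w:.3e} joule")
```

When τ_c is much longer than τ_a, halving the pressure in this sketch raises W by nearly a factor of 16, which is the inverse fourth-power dependence described above.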
D. Steady State Conduction

Following breakdown, a second plasma, separated from the triggering plasma by a double sheath, forms in the grid baffle interconnecting space, establishing conduction throughout the tube as shown in Fig. 22. The important phenomena occurring during the steady state discharge period are:

1. Processes in and near the double sheath system.
2. Generation of ions and atomic hydrogen, excitation through the body of the plasma, and energy balance associated with these processes.
3. Supply of high discharge currents by the cathode.

FIG. 22. Double sheath bounding anode and cathode plasmas during steady discharge.

The double sheath boundary area satisfies Eq. (17) for plasma control. By comparing the space potential of the triggering plasma obtained by probe techniques to the anode voltage after breakdown, one finds that the potential across the double sheath is about 45 volts for the 4C35 and is independent of current or pressure. The total tube drop, exclusive of resistive voltage drops in the cathode, is approximately 70 volts. The potential drop across the double sheath is the major source of power input to the gas. Except for diffusion losses, all of the electrons drawn from the cathode plasma across the double sheath arrive at the anode. The higher density plasma in the grid-anode space is necessary to supply the currents to the anode through the constricted grid openings. The electron current density crossing the double sheath from the grid-cathode plasma to the anode plasma is related to the positive ion current density flowing from the anode plasma into the cathode plasma by

$$j_{-1}/j_{+2} = (M/m)^{1/2} \tag{27}$$
where $M$ and $m$ are ion and electron masses, respectively, and the subscripts 1 and 2 represent cathode and anode plasmas, respectively. Now,

$$j_{-1} = n_1 e\, \bar{c}_{-1}/4 \tag{28}$$

where $n_1$ is the electron density in the cathode plasma and $\bar{c}_{-1}$ is the mean velocity of cathode plasma electrons, and

$$j_{+2} = n_2 e\, \bar{c}_{+2}/4 \tag{29}$$

where $n_2$ is the ionic density in the anode plasma and $\bar{c}_{+2}$ is the mean velocity of anode plasma ions. Thus

$$j_{-1}/j_{+2} = n_1 \bar{c}_{-1}/n_2 \bar{c}_{+2}. \tag{30}$$
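But

$$\bar{c}_{-1}/\bar{c}_{+2} = (M T_{-1}/m T_{+2})^{1/2} \tag{31}$$

hence

$$n_1/n_2 = (T_{+2}/T_{-1})^{1/2}. \tag{32}$$

The positive ion temperature is close to gas temperature, about 1000°K. The electron temperature in the cathode plasma is approximately 35,000°K, whence

$$n_1/n_2 = (1000/35{,}000)^{1/2} = 1/6. \tag{33}$$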
The high intensity region of ionization near the grid apertures is thus accounted for by double sheath theory and consists of a second plasma approximately six times the density of the normal grid-cathode plasma. The effective surface area of this plasma bordering on the cathode space is equal to the ratio of the drift current to random electron current density in the cathode plasma. Further application of Langmuir's double sheath theory will permit calculation of this ratio, since the cathode sheath is actually double. The drift current is emitted from the cathode surface, at which, assuming space charge limited emission, the field is essentially zero. At the same time, positive ions flow to the cathode from the plasma, in which the field is near zero, thus satisfying the essential conditions for development of a double sheath. At the cathode

$$j_{+r}/j_D = (m/M)^{1/2} \tag{34}$$

where $j_D$ is the drift current density at the cathode and $j_{+r}$ is the random ion current in the plasma. Since the random ion and electron currents in the plasma are related by

$$j_{-r}/j_{+r} = (M T_-/m T_+)^{1/2} \tag{35}$$

we obtain

$$j_{-r}/j_D = (j_{+r}/j_D)(j_{-r}/j_{+r}) = (T_-/T_+)^{1/2}. \tag{36}$$

With a measured electron temperature of 35,000°K and an approximate ion temperature of 1000°K,

$$j_{-r}/j_D \approx 6. \tag{37}$$
Since the total drift current is $j_D$ multiplied by the cathode area $A_c$ and the random electron current is practically the same throughout the discharge volume (neglecting diffusion gradients), we may now calculate the
area $A_p$ to which a probe or the double sheath collecting anode current must attain to control the discharge. We have

$$A_p = i_D/j_{-r} \tag{38}$$

hence

$$A_p = j_D A_c/j_{-r} = A_c (T_+/T_-)^{1/2} \approx A_c/6. \tag{39}$$

If in any diode the anode area is greater than that given by Eq. (39), a retarding or ion sheath will appear at its surface; if its physical area is less than that necessary, a double sheath will appear and extend the anode area to the required level. The cathode area of Eq. (39) is only the part supplying current to the discharge, which may be appreciably less than the total cathode area. In such cases there would be considerable current density variation over the area, which would complicate application of (39) to an actual case. It is observed, however, that as the peak current through a thyratron is increased, the area of the double sheath expands. The effective cathode area can also be shown to increase with increasing current.

Double sheath theory permits calculation of the plasma densities. From Eq. (37) the ratio of random to drift current densities is 6 to 1. The drift current density is simply the cathode current divided by the effective cathode area. The plasma electron density is related to the random current by

$$j_r = n e \bar{c}/4 \tag{40}$$

whence

$$n = 4 j_r/e\bar{c} = 24\, i_D/A_c e \bar{c}. \tag{41}$$

From kinetic theory

$$\bar{c} = (8kT/\pi m)^{1/2} \tag{42}$$

so

$$n = 4.03 \times 10^{13}\, j_{-r}/T^{1/2} \tag{43}$$

or

$$n = 24.2 \times 10^{13}\, i_D/A_c T^{1/2}. \tag{44}$$

For a typical case in the 4C35, operating at its rated current of approximately 100 amp and having an effective cathode emitting area of approximately 10 cm² and an electron temperature of 35,000°K,

$$n = 1.3 \times 10^{13}/\text{cm}^3. \tag{45}$$

In the plasma beyond the double sheath the density is roughly six times this, or

$$n = 7.8 \times 10^{13}/\text{cm}^3. \tag{46}$$
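The chain from Eq. (36) to Eq. (46) can be retraced numerically. The sketch below is a hedged illustration: the function names are hypothetical, the physical constants are modern CODATA values rather than whatever the authors used, and the exact square root of the temperature ratio is kept instead of rounding it to 6, so the results are expected to agree with the quoted figures only to within a few percent.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
M_E = 9.1093837e-31     # electron mass, kg
Q_E = 1.602176634e-19   # electron charge, coulomb


def mean_speed_cm_per_s(temp_k, mass_kg=M_E):
    """Mean thermal speed from Eq. (42), converted from m/s to cm/s."""
    return 100.0 * math.sqrt(8.0 * K_B * temp_k / (math.pi * mass_kg))


def random_to_drift_ratio(t_electron_k, t_ion_k):
    """Eq. (36): j_-r / j_D = (T_- / T_+)^(1/2)."""
    return math.sqrt(t_electron_k / t_ion_k)


def control_area_cm2(a_cathode_cm2, t_electron_k, t_ion_k):
    """Eq. (39): A_p = A_c (T_+ / T_-)^(1/2)."""
    return a_cathode_cm2 / random_to_drift_ratio(t_electron_k, t_ion_k)


def cathode_plasma_density(i_d_amp, a_cathode_cm2, t_electron_k, t_ion_k):
    """Eqs. (40)-(41): n = 4 j_r / (e c_bar), in electrons per cm^3."""
    j_drift = i_d_amp / a_cathode_cm2                      # amp/cm^2
    j_random = random_to_drift_ratio(t_electron_k, t_ion_k) * j_drift
    return 4.0 * j_random / (Q_E * mean_speed_cm_per_s(t_electron_k))


# Operating point quoted in the text for the 4C35.
I_D, A_C, T_E, T_ION = 100.0, 10.0, 35_000.0, 1_000.0

n_cathode = cathode_plasma_density(I_D, A_C, T_E, T_ION)
n_anode = n_cathode * random_to_drift_ratio(T_E, T_ION)    # Eq. (32) inverted

print(f"random/drift ratio : {random_to_drift_ratio(T_E, T_ION):.2f}")
print(f"control area A_p   : {control_area_cm2(A_C, T_E, T_ION):.2f} cm^2")
print(f"cathode plasma n   : {n_cathode:.2e} per cm^3")
print(f"anode plasma n     : {n_anode:.2e} per cm^3")
```

The numerical coefficient of Eq. (43) can be recovered from the same constants as 4/(e c̄) evaluated at T = 1°K.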
There may be still higher densities within this second plasma, particularly at the aperture where the current is focused through a narrow channel. There might even be a second double sheath at this point which would result in a second sixfold increase in density, yielding

$$n = 4.8 \times 10^{14}/\text{cm}^3. \tag{47}$$
As this density is 1/10 of a perfect gas at 1/2 mm pressure and 1000°K temperature, it appears that a substantial portion of all the gas molecules are ionized, even at the relatively low current of 100 amp.

1. Energy Balance in the Steady Discharge. Under steady discharge conditions, power is delivered to the gas primarily by passage of the average electron current across the electron accelerating sheath at the cathode and across the double sheath at the grid apertures. Some of this power maintains the ion density needed to support the discharge, i.e., to supply the loss of ions by diffusion, and some heats the electrons from 1.1 ev (cathode temperature) to 4 ev (plasma temperature). Further energy goes into excitation and dissociation of the hydrogen, into radiation, and into kinetic energy of atoms, molecules, and positive ions.

2. Atomic Hydrogen Concentration in the Steady Discharge. The equilibrium concentration of atomic hydrogen is of considerable practical interest, as cleanup of molecular hydrogen does not normally occur and atomic hydrogen may react with the cathode. It is reached when the rate of loss of hydrogen atoms by diffusion equals the rate of generation (recombination is primarily at electrode and envelope surfaces). The rate of generation is determined primarily by plasma density and temperature, and as a first approximation may be assumed independent of the atomic hydrogen concentration. We have

$$(dN_H/dt)_{gen} = f(n_-, T_-) \tag{48}$$

where $N_H$ is atomic hydrogen density, $n_-$ is the electron density in the plasma, and $T_-$ is electron temperature. Hydrogen atoms are lost by diffusion:

$$(dN_H/dt)_{loss} = -N_H D/\Lambda^2 \tag{49}$$

where $D$, the interdiffusion coefficient of atomic in molecular hydrogen, is 20 × 10³ cm²/sec at T = 1000°K and P = 0.5 mm, and $\Lambda$ is the characteristic diffusion length ($h/\pi$ for parallel plates of separation $h$). The net rate of generation is therefore

$$dN_H/dt = f(n_-, T_-) - N_H D/\Lambda^2 = f - N_H/\tau \tag{50}$$

from which

$$N_H = f\tau\,(1 - e^{-t/\tau}). \tag{51}$$
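The buildup law for atomic hydrogen can be checked against a direct integration of the rate equation. The sketch below is only an illustration: the generation rate `F_RATE` is a hypothetical placeholder, the 0.3 cm spacing is the value the text adopts for its own estimate of the time constant, and the function names are not from the source. It compares the closed-form solution with a simple forward-Euler integration of dN_H/dt = f − N_H/τ.

```python
import math


def diffusion_time_constant(spacing_cm, d_cm2_per_s):
    """tau = Lambda^2 / D, with Lambda = h / pi for parallel plates."""
    diffusion_length = spacing_cm / math.pi
    return diffusion_length ** 2 / d_cm2_per_s


def atomic_density_closed_form(f, tau, t):
    """Closed-form buildup: N_H = f * tau * (1 - exp(-t / tau))."""
    return f * tau * (1.0 - math.exp(-t / tau))


def atomic_density_euler(f, tau, t, steps=20_000):
    """Forward-Euler integration of dN_H/dt = f - N_H / tau from N_H(0) = 0."""
    dt = t / steps
    n_h = 0.0
    for _ in range(steps):
        n_h += (f - n_h / tau) * dt
    return n_h


D_H = 20e3        # cm^2/sec, interdiffusion coefficient quoted in the text
SPACING = 0.3     # cm, electrode spacing used in the text's estimate
F_RATE = 1.0e21   # atoms per cm^3 per sec, hypothetical generation rate

tau = diffusion_time_constant(SPACING, D_H)
print(f"tau = {tau * 1e6:.2f} microsec")
for pulse_us in (0.25, 0.5, 1.0, 2.0):
    t = pulse_us * 1e-6
    fraction = atomic_density_closed_form(F_RATE, tau, t) / (F_RATE * tau)
    print(f"pulse {pulse_us:4.2f} microsec -> {fraction:.2f} of equilibrium")
```

The printed fractions show how far a pulse of a given length carries the atomic hydrogen concentration toward its equilibrium value fτ.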
For an average electrode spacing of 0.3 cm, the time constant is

$$\tau = (0.3/\pi)^2 / (20 \times 10^3) = 0.46\ \mu\text{sec}. \tag{52}$$
It thus appears that for pulse lengths up to 1 or 2 μsec, the atomic hydrogen concentration is not in equilibrium.

To determine the rate of generation of hydrogen atoms, it is necessary to take dissociative collisions into account. These have been studied by Smyth and Condon (36) and later measurements are given by Herzberg (37). Their research shows that hydrogen atoms are not generated by the direct process of collision of 4.47 ev electrons with the molecule, even though this is the dissociation energy. All electron collisions with the molecule below 8.8 ev energy are almost completely elastic. Collisions at or above 8.8 ev can dissociate the molecule and impart high kinetic energy to fragments. Assuming a probability comparable to that of ionization by electrons with energy greater than 15.5 ev, we can calculate the approximate rate of dissociation, as the rate of ion generation can be calculated from the equilibrium plasma densities and ambipolar diffusion losses. We compare the relative number of electrons having energies between 8.8 and 11.5 volts (first electronic excitation level) with those having energies in excess of 15.5 volts by assuming a Maxwellian energy distribution. The number of electrons having energies in excess of 8.8 volts is close to

$$n_{8.8} = n_0\, e^{-8.8e/kT_-} \tag{53}$$

while the number with energies greater than 11.5 volts is near

$$n_{11.5} = n_0\, e^{-11.5e/kT_-}. \tag{54}$$

Thus, the ratio of the number of electrons having energies between 8.8 and 11.5 volts to the number having energies in excess of 15.5 volts is

$$R = (n_{8.8} - n_{11.5})/n_{15.5} = e^{4.0e/kT_-}\,(e^{2.7e/kT_-} - 1). \tag{55}$$
For an electron temperature of 35,000°K, $R$ is 5.5, i.e., about 5.5 times more electrons are available for dissociation than for ionization. The equilibrium ion concentration $N_+$ is related to the rate of generation and loss by

$$N_+ = (dN_+/dt)_{gen}\,\Lambda^2/D_A \tag{56}$$

where $D_A$ is the ambipolar diffusion constant = 10 meter²/sec and $\Lambda^2 = (h/\pi)^2$ = 1 × 10⁻⁶ meters², where $h$, the mean electrode spacing, = 3 × 10⁻³ meters. At 100 amp peak current in a 4C35, $N_+$ is 1.3 × 10¹³/cm³ in the cathode region and about 7.8 × 10¹³/cm³ in the grid double sheath region. Assuming an average density of 5 × 10¹³/cm³,
$$dN_+/dt = N_+ D_A/\Lambda^2 = 5 \times 10^{13} \times 10/10^{-6} = 5 \times 10^{20}/\text{cm}^3/\text{sec}. \tag{57}$$

Now, since we have 5.5 times more electrons available for dissociation and each dissociation yields 2 atoms,

$$dN_H/dt = 2 \times 5.5 \times 50 \times 10^{19} = 5.5 \times 10^{21}/\text{cm}^3/\text{sec}. \tag{58}$$

From Eq. (51) the equilibrium concentration is

$$N_H = (dN_H/dt)_{gen}\,\Lambda^2/D_H \tag{59}$$

where $\Lambda^2/D_H$ is 0.46 × 10⁻⁶, whence

$$N_H = 5.5 \times 10^{21} \times 0.46 \times 10^{-6} = 3 \times 10^{15}/\text{cm}^3. \tag{60}$$
Since the original molecular gas density is 4.8 × 10¹⁵/cm³ at 0.5 mm, it appears that at 100 amp about two-thirds of the gas in the discharge region is atomic and one might expect that at higher currents nearly all the gas would be dissociated. These calculations are rough approximations because of the inaccurate estimate of the probability of dissociation collisions, departures from the Maxwellian velocity distributions at high energy, and inaccuracies associated with the calculation of plasma densities and diffusion losses. If nearly correct, they imply a large fraction of dissociated hydrogen. As an energy of 2.2 ev per atom goes into kinetic energy when dissociation occurs, the gas could assume a temperature of the order of 15,000°K. This would affect diffusion losses and plasma densities. As 1000°K was assumed earlier and temperature generally enters as a square root, a "hot atom" correction as high as a factor of four may be needed.

3. Cathode Utilization. The question of how well different regions of the oxide coated cathodes are utilized in thyratrons arises from the large areas employed, the remote location of certain areas, and the rather complex cathode geometries. Large emitting areas are needed because hydrogen thyratrons must switch high currents (up to 2000 amp in the larger tubes) during the steady discharge. Hull's techniques (38) are conventionally used to obtain cathodes with large emitting areas, small volumes, and small heater power requirements. They generally contain vanes and use baffles to minimize heater power and to reduce deposition of evaporated cathode material on the grid. The utilization of the more remote or shielded region of such cathodes is explained qualitatively on the basis of a high conductivity plasma that reaches these portions and provides an electron accelerating sheath to enable them to contribute to the discharge.
The basic problem is to find how a discharge propagates along a cathode surface distributed along the discharge path. The diode shown in Fig. 23 was constructed to study this. It contains three small cathodes of 1.3 cm² each, spaced axially along the discharge path and enclosed in a shield that simulates the spacings in 4C35 cathodes. The relative currents supplied by each cathode were measured under transient and steady state conditions, using viewing resistors as shown in Fig. 24. Figure 25 illustrates the results of a typical measurement for an applied voltage of 260 volts. It was found that the current appears first in time at the uppermost cathode and at progressively later times appears at the lower two cathodes, and that the current from the upper cathode exceeds that from the lower two.

FIG. 23. Schematic representation of the placement of cathodes in the cathode utilization experiment.

FIG. 24. Placement of viewing resistances for measurement of cathode currents.

FIG. 25. Voltage and current as a function of time for the three cathodes of Fig. 23 with cathode shield grounded.

FIG. 26. Voltage as a function of current with 1, 2, and 3 cathodes connected, respectively, showing that lower cathodes do not carry a proportionate share of the current. Curves: all cathodes connected; top and middle cathodes connected; top cathode only connected.

FIG. 27a. Experimental tube designed to make observations on the degree of cathode utilization as a function of spacing.

The propagation velocity of the discharge was measured and found to
be a function of anode voltage. It ranged from 2.5 to 12.5 cm/μsec, for a voltage range from 100 to 260 volts. This agrees with the view that breakdown proceeds by means of a traveling plasma boundary propagating towards the cathode at a velocity dependent on the applied voltage.

The relative currents supplied by each cathode in steady state are determined primarily by the voltage differences existing between them. A potential difference between the cathodes will arise since there is:

1. A field in the plasma necessary to support the drift current.
2. A sheath voltage on each cathode which varies from one to the other.
3. A resistive drop occurring within the cathode coating itself.
4. A voltage difference which arises from the different currents in the external viewing resistors required for instrumentation.

Figure 26 gives the experimental results, showing that the lower cathodes do not carry a proportionate share of the current.

A theory was developed for the maximum useful length $L$ of cathodes consisting of parallel plates with the discharge direction parallel to them (39). The result is

$$L = [2 i_b R_0 / W(E_p - dV_s/dz)]^{1/2} \tag{61}$$

where

$i_b$ = total anode current
$R_0$ = specific resistivity of oxide coating
$E_p$ = plasma gradient
$dV_s/dz$ = rate of change of voltage across cathode sheath with distance along cathode in the discharge direction
$W$ = distance between cathode plates.

Figures 27a and 27b show an experimental tube used to verify (61).

FIG. 27b. Schematic diagram of the arrangement of parts in the tube shown in Fig. 27a.
FIG. 28. Appearance of plasma penetrating the space between the two parallel cathodes of the tube shown in Fig. 27a. The line across the top of the cathode was ruled on the negative for measurement purposes.
Figure 28 shows the appearance of the plasma penetrating between the two planes. Length utilized was measured as a function of anode current at various plate separations. The quadratic dependence predicted by Eq. (61) was verified.

4. Cathode Dissipation. Dissipation in the cathode, which in hydrogen thyratrons is primarily from passage of the emitted electron current through the resistive cathode coating, heats the cathode and increases the rate of evaporation of cathode coating, thereby depleting the active emitting surface. Dissipation resulting from ion bombardment of the cathode is generally negligible, as the cathode sheath voltage drop is only about 20 volts compared to IR drops of the order of 100 volts in the cathode coating. In addition, the ion currents are only about 2% of the electron current. It is only where the cathode is deactivated and sheath voltages far in excess of 20 volts develop that ion bombardment becomes an important source of dissipation.
E. Deionization and Recovery

After passage of the current pulse, a fairly dense plasma (10¹³-10¹⁵ ions/cm³) remains which subsequently decays. Because of impedance mismatch between pulse forming network and load resistance and associated transient phenomena, there is normally a high negative voltage (possibly several kilovolts) on the anode after the pulse line has discharged in the forward direction. Thereafter the anode voltage grows more positive and can cause the tube to break down before trigger voltage is applied unless the plasma has decayed sufficiently by this time.

An important effect of this inverse voltage, besides delaying the time at which the anode becomes positive, is that it engenders anode dissipation because of bombardment by ions from the decaying plasma. Ion currents flow for 0.1 or 0.2 μsec. At inverse voltages of the order of 1 kv, the positive ion sheath developed at the anode extends all the way across the anode space to the grid and rapidly removes all the ions from this space. Because the plasma in the grid-cathode space is shielded from the inverse fields by the grid structure, it decays at a much slower rate not noticeably affected by the inverse voltage.

The time between the end of the current pulse and that at which positive voltage may be reapplied to the anode without causing breakdown is called the recovery time. It is strongly affected by negative bias on the grid. It is less affected by the peak discharge current, which determines plasma densities at the onset of the deionization interval. The magnitude of the reapplied anode voltage, if greater than 100 or 200 volts, has little effect on recovery time. The presence of inverse voltage has no effect on recovery time. Figure 29 is a typical plot of the recovery time of a 4C35 as a function
252
SEYMOUR GOLDBERG AND J E ROME ROTHSTEIN
I0 0, 80 60
-
k
-
IOOAMPS
40-
20W
c3
U IJIO 0
>
8-
u)
5 m
64-
e
2-
11
I
I
I
I
0
2
4
6
8
I 10
I I2
I 14
1 16
I
1
18
2
RECOVERY T I M E
FIG.29. Recovery time measured in a 4C35 as a function of bias voltage for different load currents. Reapplied voltage = 1000 volts; inverse voltage = 0 volts.
of negative grid bias for two discharge currents. Recovery time is a logarithmic function of negative bias. Small bias voltages (10-20 volts) are very effective in controlling the recovery time. Figure 30 shows the recovery time of a 4C35 as a function of bias and pressure, showing a strong dependence on pressure. The dominant factors determining recovery time are : 1. The decay of the density of the residual plasma in the discharge region. 2. The electron retarding sheath developed a t the grid when negative bias is employed. 3. Cumulative events in the grid-anode space leading to breakdown. 1. Decay of Plasma. Plasma density has been measured as a function of time, using the grid as a negative probe. The ion currents collected by the grid decay approximately exponentially with time, indicating that the plasma also decays approximately exponentially. Measurements of grid current drawn from a decaying plasma are sometimes difficult to interpret because of their magnitude (of the order of amperes for the first few microseconds for a 100-amp main discharge in
FIG. 30. 4C35 recovery time versus bias at various pressures (Amperex). Curve 1, $E_{reservoir}$ 4.5 volts = 266 μ; curve 2, $E_{reservoir}$ 5.5 volts = 407 μ; curve 3, $E_{reservoir}$ 6.5 volts = 598 μ; curve 4, $E_{reservoir}$ 7.5 volts = 770 μ. [Plot not reproduced; recoverable labels: 100 amp current pulse, bias in volts.]
4C35). Such large currents often overload the voltage source used to collect the ions so that little or no negative voltage appears at the grid during this time. As the grid potential approaches zero, plasma electrons may flow to the grid, causing the observed current to reach a limit given by

$$i_g = V_g/R_g = i_+ - \Delta i_- \tag{62}$$

where $V_g$ is the grid negative bias voltage, $R_g$ the bias source impedance, $i_+$ the random ion current to grid, and $\Delta i_-$ a fraction of the random electron current. This tends to make the current observed initially less than the total random ion current collected by the grid.

This also affects recovery time as a function of bias. The bias actually present at the grid at the instant of recovery, which is the important factor, is often considerably less than that actually applied because of the voltage drop in the bias source impedance caused by the ion currents. In Figs. 29 and 30, the grid voltage plotted is that actually present at the grid at the instant of recovery. In these measurements the bias was applied through a resistance of 100 ohms. The size of the resistor or the magnitude of the grid bias before recovery occurs does not affect the recovery time; that is, the bias may be omitted even up to the point recovery is desired and then may be applied as a pulse, without affecting the recovery time-bias relation.

Neither time constant nor amplitude of the exponential grid ion current is affected by the magnitude of the negative grid voltage when the predominant loss mechanism of the charged particles is diffusion to the collecting walls or electrodes. Making the grid more negative merely affects the potential in a thin sheath surrounding the grid, without affecting the fields in the plasma which control the diffusion process. The diffusion is ambipolar, i.e., both ions and electrons leave the plasma at equal rates. If ions and electrons were to leave at their free diffusion rates, electron loss would predominate, leaving a positive plasma which would restrain further free loss of electrons. A steady state then results wherein a small field is set up in the plasma to hold back the faster electrons so that losses of ions and electrons proceed at equal rates.
Ion loss by recombination of ions and electrons in the gas volume is highly improbable because of the rarity of collisions and because of the tendency, if collision occurs, of the electron to enter a hyperbolic orbit rather than radiating and entering a bound orbit (40). The rate of ion loss by recombination is

$$dN/dt = -\alpha N^2 \tag{63}$$

where $N$ is the plasma density and $\alpha$ the recombination coefficient. Estimates of $\alpha$ for hydrogen in the glow discharge are of the order of lo-''' (41). It may even be zero in very pure H$_2$ (42). This yields a rate of ion loss perhaps 1000 or more times smaller than that by ambipolar diffusion, assuming plasma densities of the order of $10^{13}$ to $10^{14}$ per cc. The rate of ion loss by ambipolar diffusion to the tube walls is given by

$$dN/dt = -N D_A/\Lambda^2 = -N/\tau. \tag{64}$$
The ambipolar diffusion constant $D_A$ is given by

$$D_A = (k/e)(T_+ + T_-)\mu_+, \tag{65}$$

where $k$ is Boltzmann's constant, $e$ the electronic charge, $T_+$ the positive ion temperature, $T_-$ the electron temperature, and $\mu_+$ the positive ion mobility. From microwave measurements of K. Persson at M.I.T.,

$$\mu_+ = 1.45\,T/P_0T_0 \quad \text{(mks units)} \tag{66}$$

where $P_0$ is pressure (mm) measured at $T_0$ and $T$ the gas temperature (°K). The ambipolar diffusion loss equation yields an exponential decay of plasma density

$$N = N_0 e^{-t/\tau}. \tag{67}$$
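As a rough numerical check of Eqs. (64), (65), and (67), the sketch below works backward from the late-time constant of 4.3 μsec, the 1.0 cm spacing, and the 1100°K temperatures quoted in the discussion of Fig. 31 to the ambipolar diffusion constant and ion mobility they imply. The plane-parallel relation Λ = d/π and all function names are assumptions made for illustration; they are not taken from the original measurements.

```python
import math

BOLTZMANN = 1.380649e-23           # J/K
ELECTRON_CHARGE = 1.602176634e-19  # C


def diffusion_length(spacing_m):
    """Assumed fundamental-mode diffusion length of a plane-parallel gap: Lambda = d / pi."""
    return spacing_m / math.pi


def implied_diffusion_constant(tau_s, spacing_m):
    """Invert Eq. (64), tau = Lambda**2 / D_A, for D_A in m^2/sec."""
    return diffusion_length(spacing_m) ** 2 / tau_s


def implied_ion_mobility(d_a, t_ion_k, t_electron_k):
    """Invert Eq. (65), D_A = (k/e)(T+ + T-) mu+, for mu+ in m^2/(volt sec)."""
    return d_a * ELECTRON_CHARGE / (BOLTZMANN * (t_ion_k + t_electron_k))


def plasma_density(n0, t_s, tau_s):
    """Eq. (67): N = N0 exp(-t / tau)."""
    return n0 * math.exp(-t_s / tau_s)


if __name__ == "__main__":
    tau = 4.3e-6                                   # sec, late-time constant from the text
    d_a = implied_diffusion_constant(tau, 1.0e-2)  # 1.0 cm spacing from the text
    mu = implied_ion_mobility(d_a, 1100.0, 1100.0)
    print(f"D_A = {d_a:.2f} m^2/sec")
    print(f"mu+ = {mu:.1f} m^2/(volt sec)")
    print(f"N/N0 after 10 usec = {plasma_density(1.0, 10e-6, tau):.3f}")
```

Working backward in this way avoids assuming a filling pressure, which Eq. (66) would require.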
The ion current collected by the grid is related to the plasma density by

$$i_+ = N e \bar{v}_+/4. \tag{68}$$

$i_+$ is thus proportional to the plasma density and peak current of the main discharge, which determines $N_0$.

There are several uncertainties in calculating the diffusion time constant, $\Lambda^2/D_A$. First, the plasma in the tube is diffused throughout a rather complex geometry with no one definite spacing or temperature. Second, the appropriate ion and electron temperatures are in doubt. It was shown, for example, that a high percentage of atomic hydrogen is probably formed with possible release of enough energy to raise the gas temperature to 13,000°K. Also, at the end of the discharge the electron temperature is quite high. The decay of electron temperature may depend on two mechanisms:

1. Rapid diffusion of hot electrons to the cathode and their replacement or cooling by cool electrons (1000°K) emitted by the cathode.
2. Elastic collisions of hot electrons with gas molecules.

Not enough information about plasma diffusion to a freely emitting surface is available for us to accept the first mechanism with assurance. Probe measurements indicate that the plasma potential drops rapidly to a value close to cathode potential and may actually be several volts below it during most of the deionization. The second mechanism is based on the electrons' losing a fraction $2m/M$ of their energy for each collision with a gas molecule. Since the mean free path of the electrons is known (0.4 mm), the frequency of collision may be calculated from their mean velocity. The electron energy then decays as

$$u/u_0 = (t/T + 1)^{-2}, \tag{69}$$
where $1/T = (2u_0/m)^{1/2}\,m/\lambda_e M$, $u_0$ is the initial electron energy, and $\lambda_e$ is the electron mean free path. For an initial electron energy of 4 ev (35,000°K), $T$ is about 1.2 μsec. A decaying exponential with time constant $2T$ and agreeing with Eq. (69) at $u/u_0$ equal to 1 and $u/u_0$ about 0.02 is a fair approximation to (69) at intermediate values also. The foregoing neglects inelastic collisions, which would absorb all the electron energy in one collision instead of the several thousand required for elastic collisions. One would thus expect all the electrons having more than 8.8 volts energy (first molecular level) to be cooled in a small fraction of a microsecond. The effective time constant is probably of the order of 1 μsec. We might then expect the plasma density to have two effective time
FIG. 31. Consolidation of deionization data showing the analysis of the deionization current into two exponentials.
constants: one associated with low gas and electron temperatures, and another associated with the short period of time the electrons and gas are cooling to equilibrium temperature. Figure 31, showing positive ion grid current as a function of time, appears to bear out this supposition. At late times, the time constant is 4.3 μsec, which is close to the value calculated for ion and electron temperatures of 1100°K and a spacing of 1.0 cm. At times earlier than several microseconds, the time constant is considerably less, as one would expect for higher ion and electron temperatures. The length of 1.0 cm is about double the spacing from cathode to cathode baffle. The double spacing is probably correct, since the cathode is slightly positive with respect to the
plasma and does not collect ions, whereas the grid is negative and repels electrons. These conditions tend to double the effective diffusion distances for each particle. The theory of ambipolar diffusion to the walls may thus account fairly well for the rate of disappearance of the plasma. Accordingly, shortening the diffusion distances would reduce deionization and recovery times.

2. Effect of Negative Bias. The application of a negative voltage to the grid during the recovery interval creates a positive ion sheath about the grid apertures, tending to repel electrons from the anode field region. We may say qualitatively that when this sheath extends entirely across the grid openings, it will prevent the electrons from reaching the anode and initiating breakdown. The positive ion sheath thickness (assuming H$_2$ ions) for the simple case of parallel electrodes is

$$s = 1.97 \times 10^{-4}\, V_g^{3/4}/j_+^{1/2} \tag{70}$$

where $V_g$ is the grid bias and $j_+$ the random ion current. Thus the sheath makes recovery more likely by expanding with a decrease in the plasma density, which determines $j_+$, or with an increase in the bias. A more exact expression for the electron current reaching the anode space is the Boltzmann relation

$$j_- = j_{r-}\, e^{-eV_M/kT_-}. \tag{71}$$
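The numerical coefficient of the sheath-thickness formula can be checked against the plane-parallel Child-Langmuir space-charge law, $j = (4\epsilon_0/9)(2e/M)^{1/2}V^{3/2}/s^2$, and the Boltzmann factor can be evaluated for the hot (35,000°K) and cool (1000°K) electron temperatures quoted in this section. The sketch below assumes mks units and an H2+ ion; the 20-volt bias, the ion current densities, and the 1-volt barrier are hypothetical inputs chosen only for illustration, not values from the text.

```python
import math

EPS0 = 8.8541878128e-12          # F/m
E_CHARGE = 1.602176634e-19       # C
BOLTZMANN = 1.380649e-23         # J/K
PROTON_MASS = 1.67262192e-27     # kg
ELECTRON_MASS = 9.1093837e-31    # kg
H2_ION_MASS = 2.0 * PROTON_MASS + ELECTRON_MASS  # H2+ ion, binding energy neglected


def child_langmuir_coefficient(ion_mass_kg):
    """Coefficient C in s = C * V**0.75 / sqrt(j), from the Child-Langmuir law."""
    return math.sqrt((4.0 * EPS0 / 9.0) * math.sqrt(2.0 * E_CHARGE / ion_mass_kg))


def sheath_thickness(bias_v, ion_current_density, coefficient=1.97e-4):
    """Sheath thickness in meters for bias in volts and j+ in amp/m^2 (form of Eq. 70)."""
    return coefficient * bias_v ** 0.75 / math.sqrt(ion_current_density)


def boltzmann_fraction(barrier_v, electron_temp_k):
    """Fraction of the random electron current that passes a retarding barrier (form of Eq. 71)."""
    return math.exp(-E_CHARGE * barrier_v / (BOLTZMANN * electron_temp_k))


if __name__ == "__main__":
    coeff = child_langmuir_coefficient(H2_ION_MASS)
    print(f"Child-Langmuir coefficient for H2+: {coeff:.3e}")
    # As the plasma decays, j+ falls and the sheath widens at fixed bias.
    for j_plus in (100.0, 10.0, 1.0):
        s_mm = 1e3 * sheath_thickness(20.0, j_plus)
        print(f"j+ = {j_plus:6.1f} amp/m^2 -> s = {s_mm:.2f} mm")
    for temp in (35000.0, 1000.0):
        print(f"T- = {temp:7.0f} K -> fraction passing 1-volt barrier = {boltzmann_fraction(1.0, temp):.2e}")
```

The computed coefficient agrees with the printed 1.97 × 10⁻⁴ to better than one per cent, which suggests the formula is the ordinary space-charge-limited sheath expression written in mks units.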
Here $V_M$ is the negative potential barrier in the ion sheath, over which an electron from the plasma must pass. This factor is a function of the grid bias and the extent of the grid sheath. Recovery is then a question of having sufficient negative bias or a sufficiently low plasma density to prevent the flow to the anode of the critical current that causes breakdown. Figure 32 illustrates, for the case of a simple grid consisting of a single aperture, the sheaths and potentials existing for conditions of recovery and no recovery.

The situation is similar to that involved in commutation, as described in Sec. III. There, a plasma was formed in the grid-cathode space by means of an auxiliary electrode, and the anode voltage and plasma densities required for breakdown were studied as a function of grid bias. The major differences between that study and the conditions in a thyratron during deionization are:

1. The electron temperature was near 35,000°K for commutation, while during recovery (after a few microseconds) it is near 1000°K.
2. The grid geometry in an actual thyratron is considerably more complex, primarily because of the grid baffle.

The Boltzmann relation shows that the effect of the hotter electrons is
simply to require a greater potential minimum in order to limit the anode current to a given value. The effect of the grid baffle is simply to alter the shielding factor of the grid or, in conventional triode terms, to increase the amplification factor.

3. Inverse Anode Dissipation. When anode conduction in the forward direction ceases, the high inverse voltage usually present at the anode will sweep ions from the decaying plasma to the anode. The result is a pulse of anode dissipation which can be large. Ions remaining in the grid-cathode
FIG. 32. Potentials and sheaths during recovery for single hole grid. A, recovery possible; B, no recovery. [Diagram not reproduced; labeled elements: plasma, positive ion sheath, grid, anode.]
space do not contribute materially to this source of dissipation because the highly shielding grid structure prevents appreciable diffusion of ions from the cathode space to the anode region. This is verified by measurements of inverse anode current immediately following the forward current pulse, which show a burst of current lasting about 0.1 μsec. The ion density in the cathode space decays with a time constant of several microseconds, and thus does not contribute to inverse dissipation. Figure 33 shows the observed inverse voltage and current in a 4C35 in a conventional modulator circuit. The oscillations present on the inverse voltage result from the nonideal nature of the artificial transmission line used in the modulator circuit. Measurements show:
FIG. 33. Measurement of spike inverse voltage and inverse current with a pulse network as normally used, that is, open circuited at the far end. [Oscillogram not reproduced; recoverable labels: anode current, anode voltage, time scale in tenths of a μsec, inverse energy 1800 μjoules.]
1. The peak amplitude of the inverse current varies linearly with the amplitude of the inverse voltage present during this current flow, which in turn is proportional to the ideal mismatch value.
2. The peak amplitude of the inverse current varies linearly with the amplitude of the main forward current pulse. (This linear variation is to be expected since the main forward current determines the ion density in the anode region.)
3. The inverse current is independent of tube pressure.

The dissipation resulting from the flow of inverse current is given by the integral of the product of the inverse current and the inverse voltage, multiplied by the repetition rate. It has been found experimentally that the integral sign may be replaced by a constant multiplier of this product. Thus,

$$P_{IN} = k\,e_{px}\,i_+\,p_{rr}. \tag{72}$$

Since inverse current varies linearly with inverse voltage and forward current, this may be written as
The constant $k_i$ has been empirically determined by integrating the inverse current-inverse voltage product in the 4C35 and 1907 thyratrons over a wide range of inverse conditions. The value of $k_i$ for the 4C35 lies between 0.65 × lo-'* and 1.2 ×. Average values that might reasonably be used for calculations are 0.7 × lo-'* for the 4C35 and 1.1 × lo-'* for the 1907. The larger value of $k_i$ for the 1907 is due to the large volume of the grid-anode region, which would contain a greater total number of ions.
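The step from Eq. (72) to a form involving $k_i$ can be sketched as follows. Here $e_{px}$, $i_+$, and $p_{rr}$ are read as inverse voltage, inverse current, and repetition rate; the proportionality constant $c$ and the symbol $i_b$ for the peak forward current are assumptions introduced for illustration. The result is one reading of the argument in the text, not a reproduction of the printed equation.

```latex
% Assumed notation: e_{px} inverse voltage, i_+ inverse current,
% i_b peak forward current, p_{rr} repetition rate, c a constant.
\begin{align*}
i_+ &= c\, e_{px}\, i_b
  && \text{(inverse current linear in inverse voltage and forward current)} \\
P_{IN} &= k\, e_{px}\, i_+\, p_{rr}
        = k c\, e_{px}^{2}\, i_b\, p_{rr}
        = k_i\, e_{px}^{2}\, i_b\, p_{rr},
  \qquad k_i = k c .
\end{align*}
```

On this reading the inverse dissipation rises as the square of the inverse voltage, so that any reduction of the inverse voltage present during the ion current burst is doubly effective.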
FIG. 34. Same as Fig. 33 except that the pulse network has been terminated at its far end as shown.
The amplitude of the inverse voltage effective during the flow of inverse current can be profoundly modified by the nature of the reflection occurring at the normally open-circuited end of the artificial pulse line. Series R-C networks placed across the open-circuited end have proved very effective in this respect. Figure 34 shows the inverse voltage and current and the resulting inverse energy for conditions identical to those of Fig. 33, except that a series R-C network, consisting of 417 μμf and 100 ohms, was shunted across the open end of the line. This reduced the inverse dissipation by more than a factor of 3.
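For a sense of scale, the time constant of the quoted termination can be compared with the roughly 0.1 μsec duration of the inverse current burst. The comparison below is only an illustrative calculation; the text does not state how the component values were chosen, and the function name is hypothetical.

```python
def rc_time_constant(resistance_ohm, capacitance_farad):
    """Time constant of a series R-C network in seconds."""
    return resistance_ohm * capacitance_farad


TERMINATION_R = 100.0     # ohms, from the text
TERMINATION_C = 417e-12   # farads (417 micromicrofarads), from the text
INVERSE_BURST = 0.1e-6    # seconds, approximate duration of the inverse current burst

if __name__ == "__main__":
    tau = rc_time_constant(TERMINATION_R, TERMINATION_C)
    print(f"RC = {tau * 1e9:.1f} nsec, or {tau / INVERSE_BURST:.2f} of the burst duration")
```

The network time constant comes out somewhat shorter than the burst itself, so the termination can act within the interval in which the ions are being swept to the anode.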
In addition to the dissipation effects, the ions cause appreciable sputtering at the anode surface. Small holes or craters, whose positions correspond to the apertures in the grid, appear in the anode. Figure 35 shows a micrograph of a single hole in the anode of a 1754 and indicates the very violent nature of the sputtering that occurs. Other than coating the glass walls of the tube and the grid aperture disc with sputtered molybdenum, however, no harmful effects are observed except when the sputtering is sufficient to drill completely through the anode. In this case, sputtered material condenses on the glass insulator forming part of the anode support structure and a reduction of the anode holdoff ability results.
IV. CONCLUSION
At the present time it can be said that a sufficiently complete engineering understanding of hydrogen thyratrons exists for large scale manufacture and development. The obscurities in the basic phenomena are those which have existed in physical electronics for decades. Though many of the details of breakdown, diffusion, recombination, dissociation, cleanup, sputtering, and the like are not yet fully understood in fundamental physical terms, the information available is generally either reasonably adequate for design or, as in the case of hydrogen cleanup, the penalties of ignorance can be ameliorated by an appropriate countermeasure, like a reservoir.

Several phenomena or characteristics need further attention as an aid to achieving higher ratings. Long pulse length effects on the cathode, the chemical interactions of molecular and atomic hydrogen with the oxide and their effects on thermionic emission and cathode depletion, as well as more detailed investigation of the cathode sheath structure with emphasis on utilization of laterally separated areas of the cathode (vanes) are typical areas of cathode research and development specifically applicable to hydrogen thyratrons. More work should be done on grid aperture design as related to ease of triggering, tube drop, quenching (i.e., ion "starvation" at the grid apertures limiting peak current and pulse duration), and recovery time. The effect of long pulses on the grid, structures with internal grid and anode cooling means, and variables affecting high voltage limitation of single stage grid-anode structures should all be studied and applied to high power level design.

The hydrogen thyratron is still supreme where a high-voltage, high-power, fast-acting, jitter-free switch with a short recovery time is needed. It does not appear likely that serious competitors will arise in the near future.
Semiconductor devices will probably replace low-power thyratrons, but there are serious difficulties to be overcome before they can be used at the ratings for which present hydrogen thyratrons can be designed. It is possible that many applications other than radar switching can
develop for hydrogen thyratrons. They can be used, for example, for induction heating (43) where over-all efficiency is of the order of 70%. The short deionization time permits pulse repetition rates of the order of 10,000 per second. The thyratron functions here like the gap in old-time spark gap transmitters. Many other industrial applications would probably develop if hydrogen thyratron cost were comparable to that of the ignitron. The chief advantage, other than cost, which the ignitron or other mercury-pool devices have over thyratrons is the tremendous average currents which these tubes can pass. In other respects the hydrogen thyratron is competitive or much superior.

For super-high-power switching, multiple grid structures (sometimes called graded anodes) appear to be necessary to distribute the high voltage in a manner avoiding both long path breakdown and field emission from the electrodes. It is possible that such tubes may ultimately find use in high voltage dc transmission as well as in super-high-power pulsed transmitters.
REFERENCES

1. Germeshausen, K. J., in "Pulse Generators" (G. N. Glasoe and J. V. Lebacqz, eds.), Vol. 5, pp. 335-354. Radiation Laboratory Series, McGraw-Hill, New York, 1948.
2. Wittenberg, H. H., RCA Rev. 10, 116-133 (1949).
3. Knight, H. and Hooker, O. N., B.T.H. Activities 29, 47-49 (1949).
4. Knight, H., Proc. Inst. Elec. Engrs. (London), Pt. III 96, 361 (1949).
5. Charles, D. and Warnecke, R. J., Ann. radioélec. 10, 256-302 (1955).
6. Veronchev, T. A., Pulse Thyratrons, "Sovietskoe Radio" Press, Moscow, 1957, 164 pp. (in Russian).
7. "Research Study on Hydrogen Thyratrons," Vol. I (1956) by S. T. Martin and S. Goldberg; Vol. II (1956) by S. Goldberg; and Vol. III (1957) by S. Goldberg and D. F. Riley. Edgerton, Germeshausen & Grier, Boston.
8. Plücker, J., Ann. Physik [2] 106, 67, 84 (1858).
9. Dushman, S., "Scientific Foundations of Vacuum Technique." Wiley, New York, 1949.
10. Knoll, M., "Materials and Processes of Electron Devices." Springer, Berlin, 1959.
11. Kohl, W., "Materials Technology for Electron Tubes." Reinhold, New York, 1951.
12. Smith, D. P., "Hydrogen in Metals." Univ. of Chicago Press, Chicago, 1948.
13. Walsh, D. and Shearman, P. M., J. Sci. Instr. 34, 161 (1957).
14. Goldberg, S., "Research Study on Hydrogen Thyratrons," Vol. II, Sect. V, Chapter 5. Edgerton, Germeshausen & Grier, Boston, 1956.
15. Nottingham, W. B., in "Handbuch der Physik," Vol. XXI, pp. 1-175. Springer, Berlin, 1956.
16. "Handbuch der Physik," Vols. XXI and XXII. Springer, Berlin, 1956.
17. Wheatcroft, E. L. E., Smith, R. B., and Metcalfe, J., Phil. Mag. [7] 26, 649 (1938).
18. Mullin, C. J., Phys. Rev. 70, 401 (1946).
19. Silver, M., Trans. IRE Professional Group on Electron Devices ED-1, 57 (1954).
20. Harrison, A. E., Trans. AIEE 69, 747 (1940).
21. Webster, E. W., J. Sci. Instr. 24, 299 (1947).
22. Birnbaum, M., Trans. AIEE 67, 209 (1949).
23. Knoop, E., and Kroebel, W., Z. angew. Physik 2, 281 (1950).
24. Woodford, J. B. and Williams, E. M., J. Appl. Phys. 23, 722 (1952).
25. Pakswer, S. and Mayer, R., J. Appl. Phys. 24, 501 (1953).
26. Appel, H. and Fünfer, E., Z. angew. Physik 8, 322 (1956).
27. Olmstead, J. A. and Roth, M., RCA Rev. 18, 272 (1957).
28. Wittenberg, H., Elec. Eng. 66, 843 (1946); 69, 823 (1950).
29. Hess, K. W., Philips Tech. Rev. 12, 178 (1950).
30. Romanowitz, H. A. and Dow, W. G., Trans. AIEE 69, Part I, 368 (1950).
31. Malter, L. and Johnson, E. O., RCA Rev. 11, 165 (1950).
32. Knoop, E., Z. angew. Physik 4, 386 (1952).
33. Martin, S. T. and Goldberg, S., "Research Study on Hydrogen Thyratrons," Vol. I, pp. 26-37. Edgerton, Germeshausen & Grier, Boston, 1956.
34. Loeb, L. B., "Basic Processes of Gaseous Electronics," pp. 329-373. Univ. of Calif. Press, Berkeley, 1955.
35. Langmuir, I., Phys. Rev. 33, 954 (1929).
36. Smyth, H. D. and Condon, E. U., Proc. Natl. Acad. Sci. U.S. 14, 871 (1928); Smyth, H. D., Revs. Modern Phys. 3, 347 (1931).
37. Herzberg, G., "Diatomic Molecules," 2nd ed. Van Nostrand, New York, 1950.
38. Hull, A. W., Trans. AIEE 47, 753 (1928).
39. Goldberg, S., "Research Study on Hydrogen Thyratrons," Vol. II, pp. 58-70. Edgerton, Germeshausen & Grier, Boston, 1956.
40. See Langmuir, I., Phys. Rev. 33, 521 (1929); or "Handbuch der Physik," Vol. XXI, p. 471. Springer, Berlin, 1956.
41. Mohler, F. L., J. Research Bur. Standards 19, 559 (1937).
42. Persson, K., Sixth Ann. Conf. on Gaseous Electronics, Washington (1953); Langmuir, I., Phys. Rev. 33, 511, footnote 8 (1929).
43. Van Der Horst, H. L., Electronics 32, 51 (1959).
Cerenkov Radiation at Microwave Frequencies

HERBERT LASHINSKY

Columbia Radiation Laboratory, Physics Department, Columbia University, New York, New York

I. Introduction
II. General Theory of the Cerenkov Effect
    A. Qualitative Description of the Cerenkov Effect
    B. Tamm Analysis of the Cerenkov Effect
    C. Cerenkov Radiation and Related Phenomena
III. Theory of the Cerenkov Effect at Microwave Frequencies
    A. Cerenkov Radiation from an Electron Moving Near a Dielectric
    B. Cerenkov Radiation from Bunched Electron Beams
    C. Effect of the Medium on Cerenkov Radiation
IV. Design of Cerenkov Microwave Devices
    A. Cerenkov Radiator and Conventional Microwave Devices
    B. Proposed Cerenkov Devices
    C. Design Factors
    D. Experimental Results
V. Conclusion
Acknowledgments
References
I. INTRODUCTION

Electromagnetic radiation is produced whenever a charged particle moves through a medium other than free space with a uniform velocity greater than the velocity of light in the medium. This phenomenon, which may be regarded as the electromagnetic analog of the acoustic shock wave, was first investigated in detail in 1934 by the Soviet physicist P. A. Cerenkov (1).

The history of this discovery and its theoretical interpretation are interesting (2). Cerenkov, who was a student of the Soviet physicist S. I. Vavilov, was studying the visible radiation emitted by solutions when bombarded by radioactive materials. The effect had, in fact, been observed earlier by Mallet (3), who had noted some of its general properties, but had not attempted to explain it. Cerenkov was apparently unaware of this earlier work. At first the radiation was attributed to a luminescence effect of some kind. However, it was found that the radiation was observed in nonfluorescent solutions such as pure water and that it was not affected by any of the factors which usually affect luminescence, e.g., temperature or quenching materials. Later it was discovered that a magnetic field had a marked effect on the directivity and polarization of the radiation; for this reason it was attributed to the action of electrons. The only mechanism known at that time by which electrons could radiate was bremsstrahlung, i.e., the radiation associated with the acceleration or deceleration of charged particles in the electric field of nuclei in a medium. However, even this hypothesis was found to be erroneous and, in a classic paper, Frank and Tamm showed that it is possible for a charged particle moving with uniform rectilinear motion in a medium to generate electromagnetic radiation so long as its velocity is greater than the phase velocity of electromagnetic waves in that medium.¹ Cerenkov, Frank, and Tamm received the Nobel Prize in physics in 1958 for this work on Cerenkov radiation.

The most fruitful application of Cerenkov radiation has been in nuclear physics, in which Cerenkov devices are used for accurate measurements of velocities of charged particles. In 1947 V. L. Ginzburg (4), another Soviet physicist, suggested that the Cerenkov effect could be used for the generation of radio waves in the microwave region. This suggestion was advanced as a possible means of solving one of the more pressing problems in present-day microwave technology, that of producing a tunable source of coherent radiation in the region between the infrared and microwave portions of the electromagnetic spectrum, the so-called ultramicrowave region (1-0.1 mm).

A source of this kind is desirable for many reasons. The physicist is interested in an ultramicrowave source for molecular and atomic spectroscopy and as a research tool for studying superconductivity, antiferromagnetism, and other physical phenomena which exhibit quantum transitions in this region of the spectrum.
From the technological point of view the availability of a coherent tunable ultramicrowave source would make possible a significant advance in the practical application of electromagnetic waves. This possibility derives from the high frequencies and quasi-optical properties which characterize millimeter and submillimeter waves. The high frequencies imply communications channels with enormous bandwidths, while the quasi-optical properties mean that it would be possible to produce electromagnetic beams of extremely high directivity such as would be useful in high-resolution radar or space communication. Moreover, submillimeter waves can propagate through ionized gases (plasmas) which are opaque to radiation at longer wavelengths. This property arises from the fact that the effective dielectric constant of a plasma becomes negative (thereby preventing propagation) at frequencies below the plasma frequency. Hence, propagation through a plasma requires that the frequency of the propagating wave be substantially higher than the plasma frequency. The ability of ultramicrowaves to penetrate plasmas is of great importance in "microwave diagnostics," in which the transmission, reflection, or refraction of microwaves is used as a means of indicating the density of plasmas such as those produced in research on controlled thermonuclear reactions.

Unfortunately, conventional microwave generation methods lose most of their effectiveness in the ultramicrowave region of the spectrum. Although devices such as the klystron and magnetron have made it possible to circumvent the transit-time limitations of ordinary vacuum tubes, it can be shown that there is an upper limit to the frequencies which can be produced with devices in which an electron beam interacts with a resonant structure. The problem is well known and has been discussed in the literature (5). In general, the limitation on a resonant-structure device arises because the dimensions of the structure decrease directly with wavelength. This situation implies more and more stringent requirements on the mechanical tolerances as the wavelength is reduced. In addition, the circuit losses increase as the square root of the frequency. Most important of all, the current density must increase as the cube of the frequency; hence, the heat dissipation per unit area increases as the frequency to the fifth power. Although no exact limit has been set for specific devices, it would appear that these limits are such that there is little hope of generating fundamental power directly (i.e., without harmonic generation) at wavelengths below 1 mm.

¹ It is interesting to note that a result similar to that obtained by Frank and Tamm had been obtained in 1904 by Sommerfeld. In this work, which preceded the theory of relativity, Sommerfeld considered the energy radiated by an electron moving in free space with a velocity greater than that of light (3a).
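Two of the quantitative points above lend themselves to a quick calculation: the scaling penalties for resonant-structure devices, and the plasma density at which a wave of given wavelength is cut off. The sketch assumes the scaling exponents exactly as stated in the text and the standard cold-plasma expression $\omega_p^2 = ne^2/\epsilon_0 m$, which is not given in the original; the 3 cm reference wavelength and all function names are arbitrary choices for illustration.

```python
import math

C_LIGHT = 2.99792458e8           # m/sec
EPS0 = 8.8541878128e-12          # F/m
E_CHARGE = 1.602176634e-19       # C
ELECTRON_MASS = 9.1093837e-31    # kg


def scaling_penalties(wavelength_ref_m, wavelength_m):
    """Relative changes for a resonant-structure device scaled to a shorter wavelength."""
    f_ratio = wavelength_ref_m / wavelength_m   # frequency ratio
    return {
        "dimensions": 1.0 / f_ratio,            # structure size scales with wavelength
        "circuit_loss": math.sqrt(f_ratio),     # losses go as the square root of frequency
        "current_density": f_ratio ** 3,        # current density goes as frequency cubed
        "heat_per_area": f_ratio ** 5,          # dissipation per unit area goes as f**5
    }


def cutoff_density(wavelength_m):
    """Electron density (per m^3) whose plasma frequency equals the wave frequency."""
    omega = 2.0 * math.pi * C_LIGHT / wavelength_m
    return EPS0 * ELECTRON_MASS * omega ** 2 / E_CHARGE ** 2


if __name__ == "__main__":
    for name, factor in scaling_penalties(3e-2, 1e-3).items():
        print(f"{name:16s} x {factor:.3g}")
    for lam in (1e-2, 1e-3, 1e-4):
        per_cc = cutoff_density(lam) * 1e-6
        print(f"wavelength {lam * 1e3:5.1f} mm -> cutoff density {per_cc:.2e} per cc")
```

Because the cutoff density rises as the square of the frequency, each tenfold reduction in wavelength extends the range of plasma densities that can be probed by a factor of one hundred.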
Another general class of conventional microwave devices (traveling-wave tubes, backward-wave oscillators, etc.), in which no resonant structure is used, suffers from an equivalent limitation, namely, the mechanical tolerances on the periodicity and alignment of the periodic structure. It is possible to alleviate the dimension problem by operating a resonance structure at higher modes or by using devices in which no structure is used, but in which the electrons are forced to move in periodic trajectories (5a). The general state of progress in this field has been reviewed by Pierce (6) and, more recently, by Kaufman (7). A comparison of these reviews, written between 1950 and 1959, shows that there are still a number of formidable difficulties which would have to be overcome before conventional microwave devices could be useful in the millimeter region. For these reasons, in recent years a good deal of effort has been directed toward the exploration and evaluation of more unconventional methods for generating radiation in the millimeter region. Several schemes which
are presently under investigation are based on the use of the Cerenkov effect and will be described below. In the present review we shall summarize the general theory of the Cerenkov effect and investigate the theoretical considerations and design factors which pertain to the generation of microwaves by Cerenkov radiation. For a more complete description of the Cerenkov effect and related phenomena the reader is referred to the book by Jelley (8), which also contains a comprehensive bibliography.
II. GENERAL THEORY OF THE CERENKOV EFFECT
A. Qualitative Description of the Cerenkov Effect

As indicated in the Introduction, Cerenkov radiation may be considered the electromagnetic analog of the acoustic shock wave which is produced when a projectile moves through a medium at a velocity which exceeds the velocity of wave propagation in the medium. Typical examples are the bow wave which is generated when a ship moves through water and the Mach wave characteristic of the passage of a supersonic projectile through air.

1. Spatial Relations. In the electromagnetic case we are concerned with a charged particle, say an electron, which moves through a refractive medium. In its motion the electron tends to polarize the medium in the immediate vicinity of its trajectory, giving rise to radiation centers along this trajectory. A typical situation is shown in Fig. 1. The circular arcs represent wave fronts which have propagated out from individual radiation centers. In the general case the radiation from all points along the trajectory is not coherent. However, if the electron velocity is greater than the phase velocity in the medium, there is one direction in which the radiation is coherent. This direction is defined by the perpendicular to the tangents to the circular arcs represented by the Huygens construction in Fig. 1. This tangent represents the resultant produced by the radiation centers along the trajectory. Thus, if the velocity of the electron is $v = \beta c$, where $c$ is the velocity of light in free space and $\beta$ is a dimensionless factor,² in the time $t$ in which the electron has traversed a distance $\beta ct$, the radiation at c1 has formed a circular wave front of radius $(c/n)t$, where $n$ is the refractive index of the medium. Similar considerations apply for the other radiation centers along the trajectory.
From the geometry of the figure we can obtain the relation which must be satisfied between the velocity of the electron, the index of refraction, and the angle $\theta$, which is the complement of the semivertex angle of the Cerenkov cone:

$$\cos\theta = \frac{1}{\beta n}. \tag{1}$$

² We use the notation $\beta = v/c$ throughout this paper ($0 < \beta < 1$).
This is the well-known Cerenkov condition. We see that if the angle is to be real, i.e., $\cos\theta \le 1$, for a given refractive index there is a threshold velocity $\beta_{\min} = 1/n$ which must be achieved if the Cerenkov radiation is to be excited.
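The Cerenkov condition of Eq. (1) and the associated threshold velocity can be evaluated directly. The following is a minimal sketch; the function names `threshold_beta` and `cerenkov_angle` are illustrative and do not come from the original text, and the medium is assumed to be nondispersive with a real index $n > 1$.

```python
import math
from typing import Optional


def threshold_beta(n: float) -> float:
    """Minimum beta = v/c for Cerenkov emission in a medium of index n."""
    if n <= 1.0:
        # For n <= 1 the threshold 1/n would require beta >= 1.
        raise ValueError("Cerenkov radiation requires n > 1")
    return 1.0 / n


def cerenkov_angle(beta: float, n: float) -> Optional[float]:
    """Angle theta in radians from cos(theta) = 1/(beta*n).

    Returns None when the particle is below threshold (cos(theta) > 1).
    """
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie in the open interval (0, 1)")
    cos_theta = 1.0 / (beta * n)
    if cos_theta > 1.0:
        return None
    return math.acos(cos_theta)


if __name__ == "__main__":
    n = 1.5  # illustrative refractive index
    print("threshold beta:", threshold_beta(n))
    for beta in (0.5, 0.7, 0.9):
        theta = cerenkov_angle(beta, n)
        label = "below threshold" if theta is None else f"{math.degrees(theta):.2f} deg"
        print(f"beta = {beta}: {label}")
```

Returning `None` below threshold, rather than raising, keeps the function convenient for scanning over a range of beam velocities.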
FIG.1. Geometrical construction to show Cerenkov cone produced by the motion of a charged particle through a dielectric.
2. Frequency Dependence. Although crude, this description serves to illustrate the important geometric features of the effect. It is also possible to derive the frequency characteristics on the basis of a simplified description given by Jelley (8). Consider Fig. 2. When the electron is at point $e_1$, because of the time required for propagation, the polarization vector $\mathbf{P}$ points toward $e_1'$. A short time later, when the electron has reached point $e_2$, the vector points at $e_2'$. Resolving $\mathbf{P}$ into radial and longitudinal components, we see that the radial components do not contribute because they are symmetrical and cancel at large distances. The axial component becomes the equivalent of two successive Dirac $\delta$-functions of opposite sign. Next consider the Fourier component of amplitude $a$ and frequency $\omega$ for which the corresponding period $T$ is much greater than the separation $\Delta t$ between the two $\delta$-functions. (See Fig. 2d.) The phase difference between the two components is $\Delta\phi_\omega = \omega\,\Delta t$. The resultant of these two components is
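$$A_\omega = a\sin\omega t - a\sin(\omega t + \Delta\phi_\omega) = a[\sin\omega t\,(1 - \cos\Delta\phi_\omega) - \cos\omega t\,\sin\Delta\phi_\omega]. \tag{2}$$

The coefficient $a$ is a constant because the Fourier transform of the $\delta$-function is constant over frequency. Since we have assumed that $\Delta t \ll T$, $\cos\Delta\phi_\omega \to 1$ and $\sin\Delta\phi_\omega \to \Delta\phi_\omega$. Thus Eq. (2) becomes

$$A_\omega \approx -a\,\Delta\phi_\omega\cos\omega t, \tag{3}$$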
FIG. 2. Diagram to illustrate the frequency dependence of the Cerenkov radiation produced by a single particle [J. V. Jelley, “Cerenkov Radiation and Its Applications.” Pergamon Press, London, 1958].
and the resultant intensity for this Fourier component is

$$W_\omega \sim a^2\omega^2\,\Delta t^2\cos^2\omega t. \tag{4}$$

Neglecting the oscillating factor and taking $\Delta t$ constant (i.e., neglecting dispersion), we find

$$W \propto \omega^2, \quad\text{whence}\quad \frac{dW}{d\omega} \propto \omega, \tag{5}$$

and the energy radiated per unit frequency interval is proportional to frequency. If the medium is dispersive, this relation holds at frequencies far from resonances in the medium. The same result is obtained below by means of a more rigorous analysis.
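The small-angle step in this argument can be checked numerically. The sketch below compares the exact peak amplitude of the two-pulse resultant, $2a\,|\sin(\omega\,\Delta t/2)|$, with the approximation $a\,\omega\,\Delta t$, and confirms that the intensity of a Fourier component scales as $\omega^2$ when $\omega\,\Delta t \ll 1$. The function names and the value of $\Delta t$ are illustrative assumptions, not taken from the original.

```python
import math


def two_pulse_amplitude(a: float, omega: float, dt: float) -> float:
    """Exact peak amplitude of a*sin(wt) - a*sin(wt + w*dt).

    The difference equals -2a*cos(wt + w*dt/2)*sin(w*dt/2), so its peak
    value is 2a*|sin(w*dt/2)|.
    """
    return 2.0 * a * abs(math.sin(0.5 * omega * dt))


def small_angle_amplitude(a: float, omega: float, dt: float) -> float:
    """Approximate peak amplitude a*w*dt, valid for w*dt << 1."""
    return a * omega * dt


if __name__ == "__main__":
    dt = 1.0e-3  # pulse separation in arbitrary time units (assumed)
    for omega in (1.0, 2.0, 4.0, 8.0):
        exact = two_pulse_amplitude(1.0, omega, dt)
        approx = small_angle_amplitude(1.0, omega, dt)
        # Intensity is the square of the amplitude; it grows as omega**2.
        print(f"omega={omega}: exact={exact:.6e} approx={approx:.6e} intensity={exact**2:.6e}")
```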
B. Tamm Analysis of the Cerenkov Effect

The classic analysis of the Cerenkov effect has been given by Frank and Tamm (9) and later by Tamm (10). The effect can be derived on the basis of classical electrodynamics. We consider an infinite uniform medium characterized by the dielectric constant $\epsilon$ and magnetic permeability $\mu$ and find the radiation produced when a particle of charge $e$ moves through this medium with uniform velocity $\mathbf{v}$. Maxwell's equations for the medium (neglecting dispersion) are:
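$$\nabla\times\mathbf{H} = \frac{4\pi}{c}\,\mathbf{j} + \frac{1}{c}\frac{\partial\mathbf{D}}{\partial t}, \qquad \nabla\times\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t}, \qquad \nabla\cdot\mathbf{D} = 4\pi\rho, \qquad \nabla\cdot\mathbf{B} = 0,$$

$$\mathbf{D} = \epsilon\mathbf{E}, \qquad \mathbf{B} = \mu\mathbf{H}. \tag{6}$$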
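We introduce the electromagnetic potentials $\Phi$ and $\mathbf{A}$, which are defined by the relations

$$\mathbf{E} = -\nabla\Phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{H} = \frac{1}{\mu}(\nabla\times\mathbf{A}), \tag{7}$$

and satisfy the Lorentz condition

$$\nabla\cdot\mathbf{A} + \frac{\epsilon\mu}{c}\frac{\partial\Phi}{\partial t} = 0.$$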
If the particle moves along the z-axis with constant velocity $v$, the charge density and current density are expressed in terms of the Dirac delta function:

$$\rho = e\,\delta(z - vt), \qquad \mathbf{j} = e\mathbf{v}\,\delta(z - vt).$$
From Eqs. (6), (7), and (8) we obtain the wave equations for the electromagnetic potentials:
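As a sketch of what these wave equations look like, substituting the potentials of Eq. (7) and the Lorentz condition into Maxwell's equations in Gaussian units gives the following pair. This is a reconstruction from the definitions above, so the arrangement of terms in the original may differ.

$$\nabla^2\mathbf{A} - \frac{\epsilon\mu}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = -\frac{4\pi\mu}{c}\,\mathbf{j}, \qquad \nabla^2\Phi - \frac{\epsilon\mu}{c^2}\frac{\partial^2\Phi}{\partial t^2} = -\frac{4\pi}{\epsilon}\,\rho.$$

The source term of the first equation carries $\mu$ while that of the second carries $1/\epsilon$, which is the asymmetry between the two constants.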
It should be noted that $\epsilon$ and $\mu$ do not appear symmetrically in these equations and that, because $\mathbf{v}$ is a constant,

$$\mathbf{A} = \epsilon\mu\,\frac{\mathbf{v}}{c}\,\Phi,$$

and it is not necessary to solve both equations. We now introduce the notation $n = \sqrt{\epsilon\mu}$ and, for simplicity, assume that $\mu = 1$. Expanding all the field variables in Fourier integrals (time dependence of the form $\exp i\omega t$) we arrive at the wave equation for the vector potential
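In cylindrical coordinates $r$, $\varphi$, and $z$ we expand the delta function by the relation

Substituting this expression in Eq. (11) and taking

$$A_z(\omega) = u(r)\,e^{-i\omega z/v}, \tag{12}$$

finally we obtain

$$\frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\frac{\partial u}{\partial r} + \frac{\omega^2}{v^2}(\beta^2 n^2 - 1)\,u = -\frac{e}{\pi c r}\,\delta(r). \tag{13}$$

Thus $u$ is a cylindrical function which is a solution of the Bessel equation. To find the condition which must be satisfied by $u$ at $r = 0$ we replace the right-hand side of Eq. (13) by $f$, where

$$f = -\frac{2e}{\pi c r_0^2} \ \text{ for } r < r_0; \qquad f = 0 \ \text{ for } r > r_0.$$

We then integrate this equation over the surface of the circle of radius $r_0$ to obtain

$$\lim_{r\to 0}\, r\,\frac{\partial u}{\partial r} = -\frac{e}{\pi c}. \tag{14}$$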
We are interested in the case in which the velocity of the particle is smaller than the propagation velocity in the medium and in the case in which it is greater. In the first case $\beta n < 1$ and

$$u = \frac{ie}{2c}\,H_0^{(1)}(i\sigma r), \tag{15}$$

where

$$\sigma^2 = \frac{\omega^2}{v^2}(1 - \beta^2 n^2)$$

and $H_0^{(1)}$ is a Hankel function of the first kind. The asymptotic value of $u$ is ($\sigma r \gg 1$)

$$u = \frac{e}{c\sqrt{2\pi\sigma r}}\,e^{-\sigma r}. \tag{16}$$

In the second case (the Cerenkov case) $\beta n > 1$ and

$$u = -\frac{ie}{2c}\,H_0^{(2)}(sr), \tag{17}$$

$$s^2 = \frac{\omega^2}{v^2}(\beta^2 n^2 - 1), \tag{17a}$$

and $H_0^{(2)}$ is a Hankel function of the second kind. For $sr \gg 1$ we obtain the asymptotic expression

$$u = -\frac{ie}{c\sqrt{2\pi s r}}\exp\left[-i\left(sr - \frac{\pi}{4}\right)\right]. \tag{18}$$
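Making use of the Cerenkov condition $\cos\theta = 1/\beta n$ and Eq. (17a), we transform the exponential in this expression to obtain

$$A_z(\omega)\,e^{i\omega t} = -\frac{ie}{c\sqrt{2\pi s r}}\exp\left\{i\omega\left[t - \frac{z\cos\theta + r\sin\theta}{c/n}\right] + \frac{i\pi}{4}\right\}, \tag{19}$$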
which represents a wave propagating to infinity at an angle $\theta$ with respect to the z-axis. Thus the nature of the solutions depends on the sign of the third term in Eq. (13). If this term is negative, the solutions represent damped waves which are attenuated exponentially (15). On the other hand, if this term is positive, the solutions represent propagating cylindrical waves (18). The critical Cerenkov velocity is the velocity at which the third term in the “wave equation” changes sign and propagation takes place.

In the propagation case there are three nonvanishing field vectors. These are obtained from Eqs. (7), (11), and (12):
The H-lines are circles with centers on the z-axis while the E-lines are straight lines which at any instant of time originate at the point occupied by the particle. The radiated energy is calculated by finding the radial component of the Poynting vector and integrating over the surface of a cylinder which encloses the path of the particle:

$$dW = -2\pi r\,dl\,\frac{c}{4\pi}\int_{-\infty}^{+\infty} E_z H_\varphi\,dt.$$
The radiated energy (per unit length of path) is then found from Eq. (20) (10):

$$\frac{dW}{dl} = \frac{e^2}{c^2}\int_{\beta n > 1}\left(1 - \frac{1}{\beta^2 n^2}\right)\omega\,d\omega, \tag{21}$$

where the integration is carried out only over regions for which $\beta n > 1$. Equation (21) is the fundamental relation for the Cerenkov radiation produced by a single charged particle. It should be noted that the energy radiated per unit frequency interval is proportional to frequency, as has been indicated in our descriptive analysis, and that the mass of the particle does not appear. Although Eq. (21) would seem to indicate an infinite radiation yield, there are actually two factors which impose high-frequency cutoffs. The first is the fact that real media are dispersive, so that there is a high-frequency limit at which the Cerenkov condition is no longer satisfied. In most materials of interest this limit falls in the ultraviolet. The second is the finite size of the electron, which imposes an upper limit on the frequency at which the coherence condition can be satisfied. This limit, however, is in the gamma-ray region.

C. Cerenkov Radiation and Related Phenomena
We are now in a position to compare Cerenkov radiation with bremsstrahlung, the radiation which arises when a charged particle interacts with the individual nuclei in a medium. There are fundamental differences between the two effects. First, as noted above, the mass of the particle has no effect on the Cerenkov radiation. On the other hand, bremsstrahlung is the radiation which is excited by virtue of the acceleration or deceleration of a charged particle in the electric field of a nucleus, and the particle mass is all-important; moreover, the effect is proportional to the square of the charge of the nucleus. Whereas the Cerenkov radiation involves a large number of weak interactions between the particle and an infinitely large number of atoms, i.e., the macroscopic properties of the medium, as characterized by the dielectric constant, bremsstrahlung implies a small number of “collisions” in which large fractions of the particle energy are radiated. Finally, bremsstrahlung is characterized by a uniform frequency characteristic, i.e., uniform energy radiated per frequency interval; in the Cerenkov case, however, the energy radiated per unit frequency interval is proportional to frequency, as shown in Eq. (21).

In closing this section it may be interesting to note that the problem of determining the Cerenkov radiation field is equivalent to that of finding the radiation from a linear array of fixed dipoles, located along the particle trajectory, which are excited in such a way that there is a progressive change of phase along the array. This analogy has been pointed out by Frank (11) and Lawson (12).
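As a numerical companion to Eq. (21), the sketch below evaluates the standard Frank-Tamm expression $dW/dl = (e^2/c^2)\int(1 - 1/\beta^2 n^2)\,\omega\,d\omega$ in Gaussian units for a frequency band in which $n$ is taken to be constant. The function name, the band limits, and the rounded constants are assumptions made for illustration; a dispersive medium would need $n(\omega)$ inside the integral.

```python
import math

E_CHARGE_ESU = 4.803e-10  # electron charge in statcoulombs (rounded)
C_CM_PER_S = 2.998e10     # velocity of light in cm/s (rounded)


def energy_per_unit_length(beta, n, omega_lo, omega_hi, charge=E_CHARGE_ESU):
    """Frank-Tamm yield dW/dl in erg/cm between omega_lo and omega_hi (rad/s).

    Assumes a constant refractive index n over the band, so the integral of
    omega*d(omega) reduces to (omega_hi**2 - omega_lo**2)/2.
    """
    if beta * n <= 1.0:
        return 0.0  # below the Cerenkov threshold nothing is radiated
    factor = 1.0 - 1.0 / (beta * n) ** 2
    return (charge / C_CM_PER_S) ** 2 * factor * 0.5 * (omega_hi ** 2 - omega_lo ** 2)


if __name__ == "__main__":
    lo = 2.0 * math.pi * 8.0e9    # 8 Gc/s, illustrative band edge
    hi = 2.0 * math.pi * 12.0e9   # 12 Gc/s, illustrative band edge
    print("single electron:", energy_per_unit_length(0.9, 1.5, lo, hi), "erg/cm")
    # A point bunch of N electrons radiating coherently acts as a charge N*e.
    n_electrons = 1.0e6
    bunch_charge = n_electrons * E_CHARGE_ESU
    print("coherent bunch:", energy_per_unit_length(0.9, 1.5, lo, hi, bunch_charge), "erg/cm")
```

The quadratic dependence on `charge` is what makes bunching attractive: the yield grows as the square of the number of electrons radiating in phase, while the particle mass does not enter at all.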
III. THEORY OF THE CERENKOV EFFECT AT MICROWAVE FREQUENCIES
A. Cerenkov Radiation from an Electron Moving Near a Dielectric

In the preceding section we have considered the Cerenkov radiation which is produced when a single electron moves through an infinite refractive medium. As far as practical microwave applications are concerned, this situation is not very realistic. The electron would soon lose all its energy in ionization of the medium because the ionization losses are about a thousand times greater than the losses due to Cerenkov radiation. It was pointed out by the Soviet physicist Mandel'shtam in 1940 that Cerenkov radiation can also be excited if an electron moves in close proximity to a dielectric. The distance between the dielectric and the electron must be small compared with the wavelength of interest. The electron can move over a plane dielectric slab or through a hole or channel cut into a dielectric. Cases in which a single electron moves in the proximity of a dielectric have been considered by Ginzburg (4), Ginzburg and Frank (13), Linhart (14), and Bogdankevich and Bolotovskii (15). The results obtained by Ginzburg and Frank for the case in which an electron moves along the axis of an evacuated cylindrical tunnel cut into an infinite dielectric are shown in Fig. 3. In this figure the radiated energy per unit length $dW/dl$ is plotted as a function of $r/\lambda$, where $r$ is the radius of the tunnel and $\lambda$ is the wavelength of interest. These curves apply for $n = 1.5$. Linhart has considered the case in which a single electron moves over a plane dielectric of infinite extent. It is shown that the electron interacts with totally reflected waves whose phase velocity is the same as the electron velocity; these waves extend beyond the dielectric boundary but fall off exponentially.
FIG. 3. Cerenkov radiation produced by a charge moving through a cylindrical tunnel in a dielectric. The radiation yield is shown as a function of the tunnel radius. Curve 1 applies for $\beta = 1$ and Curve 2 for $\beta = 0.94$ [V. L. Ginzburg and I. M. Frank, Doklady Akad. Nauk S.S.S.R. 66, 699 (1947)].

In all these cases, however, in which radiation from a single electron is considered, if one substitutes reasonable numerical values it turns out that the radiated energy is of the order of ergs/electron, a value which is obviously too small to be of practical interest. If, however, the electrons are bunched so that the dimensions of a bunch are small compared with the wavelength of interest, the electrons in a bunch radiate coherently. Two important advantages are obtained in this case. First, the effective “charge” is increased by a factor of $N$, where $N$ is the number of electrons in a bunch; this means a factor of $N^2$ in the radiated power, as can be seen from Eq. (21). Second, the continuous spectrum becomes a line spectrum, with the radiation concentrated at the bunching frequency and its harmonics. This can be seen from an examination of Fig. 4, which shows the radiation pattern of an electron beam which moves through a dielectric. We assume that the bunches are point charges. The wave pattern remains fixed with respect to the beam so that a typical wave front moves from A to B in the time that the associated bunch moves from A to C. Assume that the wave fronts contain frequency components $\omega_i$ which are radiated coherently. For coherence we require that $\Delta t\cdot\omega_i = 2N\pi$, $N = 1, 2, \ldots$, where $\Delta t$ is the time required for the wave front to move from A to B. Since the pattern is fixed with respect to the beam, however, this time corresponds to the time required for the bunch to move from A to C, that is, $2\pi/\omega$, where $\omega$ is the bunching frequency. Then, from the coherence requirement we have $2\pi\omega_i/\omega = 2N\pi$ and the frequencies radiated coherently are given by $\omega_i = N\omega$.

FIG. 4. Coherent Cerenkov radiation produced by a bunched electron beam.

B. Cerenkov Radiation from Bunched Electron Beams

The Cerenkov radiation produced by an extended electron beam has been computed by Danos (16). In this calculation it is assumed that the beam is bunched and moves as a rigid body. The problem is solved by finding the solution of the inhomogeneous equation for the electron beam in the absence of the dielectric and the general solution for the vacuum space and the dielectric; the fields are then matched at the boundary. Three cases have been treated: a ribbon beam passing over a plane dielectric, a ribbon beam moving between two plane dielectrics, and a cylindrical beam which moves through a cylindrical tunnel in a cylindrical dielectric. The first case is of greatest interest since it has been used in experimental arrangements.
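The coherence argument for a bunched beam (radiation concentrated at $\omega_i = N\omega$, with an $N^2$ gain from the electrons within a bunch) can be illustrated with a toy phasor sum. The sketch below is not the Danos calculation; it simply adds the contributions of a finite train of identical point bunches spaced by one bunching period. The function name and all parameter values are illustrative assumptions.

```python
import cmath
import math


def train_intensity(omega, bunch_omega, n_bunches, electrons_per_bunch=1):
    """Relative intensity at angular frequency omega from a train of point bunches.

    Each bunch contributes a phasor exp(i*omega*t_k) with t_k = k*T, where
    T = 2*pi/bunch_omega is the bunching period. Electrons inside a point
    bunch add in amplitude, so the intensity scales as electrons_per_bunch**2.
    """
    period = 2.0 * math.pi / bunch_omega
    amplitude = sum(cmath.exp(1j * omega * k * period) for k in range(n_bunches))
    return (electrons_per_bunch * abs(amplitude)) ** 2


if __name__ == "__main__":
    bunch_omega = 1.0  # bunching frequency in arbitrary units
    for ratio in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0):
        value = train_intensity(ratio * bunch_omega, bunch_omega, n_bunches=8)
        print(f"omega/omega_bunch = {ratio}: relative intensity = {value:.3f}")
```

At integer multiples of the bunching frequency all phasors are aligned and the intensity equals the square of the number of bunches; at the half-integer ratios shown, the phasors of this even-numbered train cancel in pairs.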
FIG. 5. Sheet beam moving over a plane dielectric.
The pertinent geometry is shown in Fig. 5. The charge density of the electron beam is given by

$$\rho = \rho_0\,\delta(x)[1 + \alpha\cos(kz - \omega t)],$$

where $\rho_0$ is the average charge density, $\alpha$ is the relative amplitude of the ac component of the charge density at frequency $\omega$, and $k = \omega/v$, where $v = \beta c$ is the velocity of the electron beam.
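In the vacuum region ($x > -d$) the electromagnetic potentials obey the inhomogeneous field equations

The solutions of these equations are

$$\Phi = (V/q)\cos(kz - \omega t)\,e^{-q|x|}, \tag{24}$$

$$\mathbf{A} = (V\beta/q)\cos(kz - \omega t)\,e^{-q|x|}\,\hat{\mathbf{z}}, \tag{25}$$

where $q^2 = k^2Q^2 = k^2(1 - \beta^2)$ and $V = 2\pi\alpha\rho_0$. For the homogeneous equations we obtain the solutions

$$\mathbf{A}_1 = (A_1/Q)\,e^{-qx}[Q\cos(kz - \omega t + \varphi)\,\hat{\mathbf{z}} - \sin(kz - \omega t + \varphi)\,\hat{\mathbf{x}}], \tag{26}$$

and

$$\Phi_1 = 0.$$

Finally, the expressions for the primary and reflected fields are

$$\mathbf{E}_0 = Ve^{-q|x|}[Q\sin(kz - \omega t)\,\hat{\mathbf{z}} + \operatorname{sign}x\,\cos(kz - \omega t)\,\hat{\mathbf{x}}], \tag{28}$$

$$\mathbf{H}_0 = \operatorname{sign}x\;V\beta\,e^{-q|x|}\cos(kz - \omega t)\,\hat{\mathbf{y}}, \tag{29}$$

$$\mathbf{E}_1 = -(A_1/Q)\,\kappa\,e^{-qx}[Q\sin(kz - \omega t + \varphi)\,\hat{\mathbf{z}} + \cos(kz - \omega t + \varphi)\,\hat{\mathbf{x}}], \tag{30}$$

$$\mathbf{H}_1 = -(A_1/Q)\,\kappa\beta\,e^{-qx}\cos(kz - \omega t + \varphi)\,\hat{\mathbf{y}}, \tag{31}$$

where $\kappa = \omega/c$, sign $x$ is defined by

$$\operatorname{sign}x = \begin{cases} 1 & \text{for } x > 0 \\ -1 & \text{for } x < 0 \end{cases}$$

and $A_1$, the amplitude factor, and $\varphi$, the phase shift, are determined by the boundary conditions. Here $\mathbf{E}_0$ and $\mathbf{H}_0$ are the primary fields generated by the motion of the electron beam in vacuum and $\mathbf{E}_1$ and $\mathbf{H}_1$ are the fields which arise from the reflection at the boundary. Inside the dielectric the vector potential obeys the equation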
For the case of no incoming wave, in which we are interested, the vector potential is assumed to be of the form

$$\mathbf{A}_D \propto \cos[kz - p(x + d) - \omega t + \varphi_2], \tag{33}$$

where $p^2 = k^2\gamma^2 = k^2(\epsilon\beta^2 - 1)$. The fields found from the electromagnetic potentials are

$$\mathbf{E}_D \propto \sin[kz - p(x + d) - \omega t + \varphi_2], \tag{34}$$

$$\mathbf{H}_D \propto \sin[kz - p(x + d) - \omega t + \varphi_2]. \tag{35}$$
Having expressions for all fields, we now determine the quantities of interest by satisfying the boundary conditions at the vacuum-dielectric interface ($x = -d$): that is,

$$E_{\tan,\,\mathrm{vac}} = E_{\tan,\,\mathrm{diel}} \quad\text{and}\quad H_{\tan,\,\mathrm{vac}} = H_{\tan,\,\mathrm{diel}}.$$
The radiated power is determined from the time average of the Poynting vector in the medium

It should be noted that the Poynting vector has two components. One of these, the z-component, describes an energy flux which remains with the electron beam and is not radiated away. The other component represents radiated energy which continues to propagate no matter what happens to the beam. This component of the Poynting vector is given by

$$\langle S_x\rangle = 2\pi c\,\alpha^2\rho_0^2\beta^2 e^{-2qd}\,\eta(\epsilon, \beta), \tag{37}$$

where the function $\eta$ is given by
It is to be noted that this calculation applies for an infinitesimally thin beam. Only the amplitude of the emitted field depends on the separation distance $d$; the phase $\varphi_2$ is independent of this distance. Hence to obtain the power for a beam of finite thickness the expression in Eq. (37) is integrated over the height of the beam.
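The integration over the beam height involves only the factor $e^{-2qd}$, so it can be done in closed form. The sketch below assumes, for illustration only, that the charge is distributed uniformly across the beam height and that the contributions of the individual layers are added as the preceding paragraph describes; the function names and numerical values are hypothetical.

```python
import math


def decay_constant(beta: float, wavelength: float) -> float:
    """q = k*sqrt(1 - beta**2), with k = omega/v = 2*pi/(beta*wavelength).

    wavelength is the free-space wavelength; q has the inverse units.
    """
    k = 2.0 * math.pi / (beta * wavelength)
    return k * math.sqrt(1.0 - beta ** 2)


def thickness_factor(q: float, d_near: float, height: float) -> float:
    """Integral of exp(-2*q*x) for x from d_near to d_near + height.

    d_near is the gap between the dielectric and the nearest edge of the
    beam; height is the beam thickness measured away from the dielectric.
    """
    return math.exp(-2.0 * q * d_near) * (1.0 - math.exp(-2.0 * q * height)) / (2.0 * q)


if __name__ == "__main__":
    q = decay_constant(beta=0.2, wavelength=0.1)  # 1 mm wavelength, lengths in cm
    print("q =", q, "per cm; 1/(2q) =", 1.0 / (2.0 * q), "cm")
    for height in (0.0005, 0.002, 0.01):
        print("height", height, "cm ->", thickness_factor(q, 0.001, height))
```

Once the beam height exceeds a few times $1/2q$ the factor saturates: layers farther from the dielectric contribute almost nothing, which is why the beam must pass very close to the surface at short wavelengths and low velocities.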
C. Effect of the Medium on Cerenkov Radiation

The above calculation applies to the case in which the medium is a pure dielectric. However, it is of interest to consider the Cerenkov effect in other media. In magnetic media, i.e., media in which $\mu \neq 1$ and $\epsilon \neq 1$, the dielectric permittivity and magnetic permeability do not appear symmetrically in the expression for radiated Cerenkov power. As pointed out by Nag and Sayied (17), this asymmetry stems from the fact that the dielectric permittivity and magnetic permeability appear differently in the Maxwell divergence equations. Thus, in deriving an expression for the Cerenkov power in a magnetic medium these authors, and Sitenko (18), obtain an expression of the form

while the Cerenkov condition becomes $\beta^2\mu\epsilon > 1$. Equation (39) is the equivalent of Eq. (21) for the pure dielectric. The relation indicates that the radiated power can be enhanced if the magnetic permeability is greater than unity. For this reason it is of interest to extend the calculations carried out by Danos to the case of a magnetic medium. In particular, it is of interest to consider ferrite materials. These materials have relatively low losses and have ferromagnetic resonances in the microwave region. By applying an external magnetic field it is possible to exercise some control over the relative values of the real and imaginary parts of the magnetic permeability.

1. Cerenkov Radiation in Ferrites. This problem has been treated by Lashinsky (19). The problem is solved in the same way as the pure dielectric case except that the magnetic permeability and the dielectric permittivity become complex quantities:

$$\epsilon^* = \epsilon' - i\epsilon''; \qquad \mu^* = \mu' - i\mu''.$$
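In the complex medium the fields are³

$$\mathbf{E}_m = \operatorname{Re}\{iB_m^*[\hat{\mathbf{x}}/\Gamma^* + \hat{\mathbf{z}}]\exp i(kz - u^*[x + d] - \omega t)\}, \tag{40}$$

$$\mathbf{H}_m = \operatorname{Re}\{iB_m^*[k/\Gamma^* + u^*]\,\hat{\mathbf{y}}\exp i(kz - u^*[x + d] - \omega t)\}, \tag{41}$$

where the complex propagation constant $u^*$ is given by

$$u^* = k\Gamma^* = k(r + is). \tag{42}$$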
³ Complex quantities are denoted by the asterisk *.

In terms of the physical parameters

$$r = \left\{\tfrac{1}{2}\left[(a^2 + b^2)^{1/2} + a\right]\right\}^{1/2},$$

where

$$a = \beta^2(\epsilon'\mu' - \epsilon''\mu'') - 1; \qquad b = \beta^2(\epsilon''\mu' + \epsilon'\mu'').$$

The amplitude of the reflected field in the vacuum region is
and the phase shift between the primary and reflected fields in this region is given by sin cp
=
+
[ ( O V - p4)>'
2p2052 2025t2(02 P 2
+ + p4)
04941>;'
(45)
In these expressions
and
where R = y2 f a2. As in the dielectric case, the propagated power is the normal component of s. However, this function is now somewhat more complicated
The radiated power is determined by $\eta$, which is a function of $\mu''/\mu'$. The dependence of $\eta$ on $\omega$ in the region of a ferromagnetic resonance is given in Fig. 6, which shows a typical dispersion curve for a ferrite material together with a sketch of the behavior of $\eta$ in the region of interest. The general features of this curve may be explained as follows. At frequencies higher than $\omega_0$, the ferromagnetic resonance frequency, $\mu'$ is negative so that electromagnetic radiation cannot be propagated in the medium. Let $\omega_1$ be the frequency at which $\mu'$ becomes large enough to satisfy the Cerenkov condition $\beta^2\mu'\epsilon' > 1$; then, between $\omega_0$ and $\omega_1$, although $\mu'$ is positive, the Cerenkov condition is not satisfied and no power is radiated. At $\omega_1$, Cerenkov radiation is excited. At some point slightly lower than $\omega_1$, say $\omega_2$, $\eta$ reaches a maximum, as is seen in Fig. 6, and maximum power is obtained. It should be emphasized that this is not the ferromagnetic resonance frequency. At still lower frequencies $\eta$ drops off to an asymptotic value as shown in Fig. 6.
FIG. 6. Dispersion curve of a typical ferrite and the function $\eta(\omega)$, which determines the Cerenkov power as a function of frequency [H. Lashinsky, J. Appl. Phys. 27, 631 (1956)].
We can now compare the power obtained with a ferrite and with a pure dielectric. We neglect attenuation effects in both media. Then, using the following reasonable values of the parameters:

for the dielectric: $\epsilon = 100$, $\mu = 1$;
for the ferrite: $\mu' = 10$, $\mu''/\mu' = 5.5$, $\epsilon' = 10$, $\epsilon'' = 0$,

and taking $\beta = 0.2$ for both cases, it is found that the power in the ferrite case is about two orders of magnitude greater than that of the dielectric (for the assumptions which have been introduced here). It is instructive to examine the physical basis for this improvement.
This may perhaps best be done by comparing the work which is done on the field by the electron beam in the two cases. In general this work is proportional to an expression of the type

$$\iint(\mathbf{E}\cdot\mathbf{j})\,dv\,dt = \iint(\mathbf{E}\cdot\mathbf{v})\rho\,dv\,dt,$$

where $\mathbf{E}$ is the field seen by the beam, $\rho$ is the charge density of the beam, and the integration extends over the volume occupied by the beam. In the case at hand the integrand is (19):

$$-A_1 c\,\alpha\rho_0\,e^{-qd}\sin(kz - \omega t + \pi/2)\sin(kz - \omega t + \varphi).$$

Thus, for a fixed geometry and a given current density the integrated value of the quantity in question depends on $A_1$, the amplitude of the reflected wave, and $\Gamma$, its phase shift with respect to the current density at a given point, where

$$\Gamma = \left(\frac{\pi}{2} - \varphi\right).$$

$\cos\Gamma$ is analogous to the power factor in electric-circuit theory. To obtain maximum radiated power, the system should be operated at “unity power factor,” that is, $\cos(\pi/2 - \varphi) = 1$. Computing this factor in the two cases under consideration, for the dielectric: $\cos(\pi/2 - \varphi) \approx 0.03$, and for the ferrite: $\cos(\pi/2 - \varphi) = 1.00$. The ratio of the amplitudes of the reflected wave in the two cases is

$A_{\mathrm{Ferrite}}/A_{\mathrm{Dielectric}}$

Combining the two factors, the ratio of the work done by the beam in the two cases is obtained:
Thus the power enhancement results from a better matching of the beam to the medium. The effect of attenuation has been neglected in the above analysis. If one considers attenuation, it turns out that a typical decay length in conventional ferrites is $0.005\lambda$; thus, for $\lambda = 1$ mm, a layer approximately 5 μ thick would be required. It is also necessary that power be coupled out of the medium in such a way that reflection is minimized. Hence, although the use of a ferrite does, in principle, offer the possibility of obtaining a better match between the beam and the medium, there are difficulties associated with attenuation effects and the problem of coupling energy out of the medium.

2. Cerenkov Radiation in Plasmas. In recent years there has been a great deal of interest in the Cerenkov radiation which is produced at microwave frequencies in ionized gases by the passage of beams of charged particles. In this case the role of the medium is played by the electron plasma. The effective dielectric constant of an electron plasma is

$$\epsilon(\omega) = 1 - \frac{\omega_0^2}{\omega^2}, \tag{50}$$

where $\omega$ is the frequency of the wave being propagated through the plasma and $\omega_0$ is the plasma frequency, defined by the relation

$$\omega_0^2 = \frac{4\pi N e^2}{m},$$
where $N$ is the electron density, $e$ is the charge of the electron, and $m$ is the mass of the electron. (Because of their greater mass, the ions are neglected.) Since $\epsilon(\omega)$ [Eq. (50)] is never greater than unity, we see that Cerenkov radiation cannot be excited in a plasma under ordinary conditions. However, as was pointed out by Veksler (20), if the plasma is in a magnetic field the situation is changed; the plasma now behaves like a birefringent medium and the refractive indices for the ordinary and extraordinary waves become rather complicated functions of the angle between the direction of propagation and the magnetic field, the frequency of the propagated wave, the plasma frequency, and the electron cyclotron frequency $\omega_c = eH/mc$. It now turns out that the effective dielectric constant can, in fact, be greater than unity and that Cerenkov radiation is possible. The case in which the electron moves in the direction of the magnetic field has been treated by Kolomenskii (21). It is found that radiation is possible in any direction, the criterion being:
where $\theta$ is the angle between the direction of the motion of the electron and the wave vector $\mathbf{k}$, $\lambda$ is the wavelength, and $\lambda_e$ is the distance traveled by the electron per period. Since the plasma acts like a birefringent medium there are two kinds of waves and two conditions for the Cerenkov radiation:

a) $\omega < \omega_0$ for ordinary waves
b) $\omega_0 < \omega < \sqrt{\omega_0^2 + \omega_c^2}$ for extraordinary waves.
These regions do not overlap. In terms of a critical magnetic field $H_1$ we find that at low fields ($H < H_1$) only ordinary waves are possible and at high fields ($H > H_1$) both ordinary and extraordinary waves are possible, where
and
The energy is given by the expression
(1*
w,l(w? [(WZ
- &"(l -
P4)
w,"(l
+
+ P")" + P'(3
/32w02]
+ 4P(w2 - ]1
- P"w0'I
d(1- P">"wc2
wo')
wdw.
It is of interest to note that the conditions for excitation of microwave Cerenkov radiation described above do obtain in the atmospheres of the earth, the sun, and other stars which have magnetic fields, and may be a possible source of astronomical microwave radiation (20, 21). These conditions also exist in the plasma of gas discharges used in thermonuclear research, and the investigation of Cerenkov radiation in this field is important in connection with plasma diagnostics (22). An "inverse" Cerenkov effect has also been proposed as a possible mechanism for the absorption of electromagnetic radiation by plasma electrons. In this case, when the thermal velocities of the electrons become greater than the Cerenkov velocity, energy is transferred from an incoming wave to the electrons (23). Veksler has also proposed the application of the "inverse" Cerenkov effect as a method of accelerating charged particles (20). In addition to the work cited above, the Cerenkov effect has been treated for a variety of other isotropic and anisotropic media and geometric configurations. This work is summarized in the book by Jelley (8), to which the interested reader may refer.
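As a numerical companion to this subsection, the sketch below evaluates the plasma frequency, the dielectric constant of Eq. (50), and the electron cyclotron frequency $\omega_c = eH/mc$ in Gaussian units, and confirms that the Cerenkov condition $\beta^2\epsilon > 1$ cannot be met in an unmagnetized plasma. It also returns the extraordinary-wave band $\omega_0 < \omega < \sqrt{\omega_0^2 + \omega_c^2}$ quoted above. The function names, the rounded constants, and the sample density and field are illustrative assumptions; the magnetized refractive indices themselves are not modeled.

```python
import math

E_CHARGE_ESU = 4.803e-10  # electron charge, statcoulombs (rounded)
E_MASS_G = 9.109e-28      # electron mass, grams (rounded)
C_CM_PER_S = 2.998e10     # velocity of light, cm/s (rounded)


def plasma_frequency(density_cm3: float) -> float:
    """omega_0 = sqrt(4*pi*N*e**2/m) in rad/s for electron density N (cm^-3)."""
    return math.sqrt(4.0 * math.pi * density_cm3 * E_CHARGE_ESU ** 2 / E_MASS_G)


def cyclotron_frequency(field_gauss: float) -> float:
    """omega_c = e*H/(m*c) in rad/s for a magnetic field H in gauss."""
    return E_CHARGE_ESU * field_gauss / (E_MASS_G * C_CM_PER_S)


def dielectric_constant(omega: float, omega_0: float) -> float:
    """Unmagnetized electron plasma: eps(omega) = 1 - omega_0**2/omega**2."""
    return 1.0 - (omega_0 / omega) ** 2


def cerenkov_possible(beta: float, omega: float, omega_0: float) -> bool:
    """Cerenkov condition beta**2 * eps > 1 in the unmagnetized plasma."""
    eps = dielectric_constant(omega, omega_0)
    return eps > 0.0 and beta ** 2 * eps > 1.0


def extraordinary_band(omega_0: float, omega_c: float):
    """Frequency limits (rad/s) of condition (b) for extraordinary waves."""
    return omega_0, math.sqrt(omega_0 ** 2 + omega_c ** 2)


if __name__ == "__main__":
    w0 = plasma_frequency(1.0e12)    # illustrative density, cm^-3
    wc = cyclotron_frequency(1.0e3)  # illustrative field, gauss
    print("plasma frequency:", w0, "rad/s")
    print("cyclotron frequency:", wc, "rad/s")
    print("extraordinary band:", extraordinary_band(w0, wc))
    print("Cerenkov without field at 2*w0:", cerenkov_possible(0.99, 2.0 * w0, w0))
```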
IV. DESIGN OF CERENKOV MICROWAVE DEVICES

A. Cerenkov Radiator and Conventional Microwave Devices

Before considering Cerenkov devices for the generation of microwave power, it is instructive to examine the Cerenkov mechanism in somewhat greater detail and to understand its relation to conventional microwave devices. Our description of the Cerenkov effect up to this point has been in terms of a field picture. However, the effect can also be described in terms of an interaction between an electron beam and a slow-wave structure or circuit, as in the analysis of microwave devices such as the traveling-wave tube (24).

1. Role of the Medium. We first examine the role played by the medium in the Cerenkov radiation process. By hypothesis the radiation field of the particle remains stationary with respect to the particle and the radiated power remains constant in time. That is to say, in our analysis we assume that the medium is infinite and that the process goes on for an infinite time, i.e., the effect is invariant under translation in time and space. This condition can be stated mathematically as follows:
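$$dW = e\mathbf{E}\cdot d\mathbf{z} = \text{constant},$$

where $\mathbf{E}(\mathbf{z}, t)$ is the radiation field. This field can always be expressed in terms of one or more plane waves:

$$\mathbf{E} = \mathbf{E}_0\,e^{i(\omega t - \mathbf{k}\cdot\mathbf{z})}.$$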
Now $dW$ can be constant only if $\mathbf{k}\cdot\mathbf{z} = \omega t$, where $\mathbf{k} = (\omega/v_\lambda)\hat{\mathbf{k}}$ and $\mathbf{z} = (v_e\hat{\mathbf{e}})t$. Here $v_e$ is the velocity of the particle. Hence $(v_e/v_\lambda)\,\hat{\mathbf{k}}\cdot\hat{\mathbf{e}} = 1$ or $v_e\cos\theta = v_\lambda$. In other words, the projection of the particle velocity in the direction in which the wave is propagated must equal the velocity of propagation. It will be apparent that this synchronism condition cannot be satisfied in free space because the velocity of propagation is the velocity of light and, in accordance with relativity theory, the particle cannot travel at this velocity. Hence the medium serves to slow down the wave just as the slow-wave structure in traveling-wave tubes. However, there are differences.

2. Cerenkov Radiation and the Traveling-Wave Tube. These differences can be understood if we consider the interaction between an electron beam and a nondispersive circuit, a dispersive circuit, and a continuous medium. First consider the interaction between an electron beam and a nondispersive slow-wave structure characterized by a phase velocity $v_{ph}$. Energy can be exchanged between the circuit and the beam only when the beam velocity is equal to the phase velocity of the circuit. In a dispersive circuit, on the other hand, the phase velocity varies with frequency and, in principle, there is some frequency at which the synchronism condition is satisfied. Thus there is a range of particle velocities for which an interaction can take place. In the case of a continuous medium, i.e., the Cerenkov case, the medium is nondispersive; however, there is still a wide range of particle velocities for which an interaction is possible because for any particle velocity greater than the propagation velocity in the medium the system acts to adjust to the synchronism condition, i.e., the Cerenkov angle is always such that the projection of the particle velocity in the direction of propagation is equal to the propagation velocity. Thus it is essentially the extra degree of freedom which results from the fact that the medium is two-dimensional⁴ that allows the interaction to take place even though the medium is nondispersive. This can be seen more clearly if one considers an artificial dielectric, which may be considered the limiting case of a slow-wave structure. If we start with a linear, nondispersive, artificial dielectric of infinite length, the system is subject to the same limitations as the nondispersive circuit. However, if we now build up the dielectric until it becomes infinite in the perpendicular direction, the system is governed by the usual Cerenkov condition. In effect, in going from a slow-wave structure to an infinite medium we are making a transition from a system which can support a limited number of propagation modes to one in which an infinite number of modes, characterized by a continuous velocity spectrum, can be propagated. Thus the synchronism condition can be satisfied for a whole range of beam velocities.

There is another difference between the Cerenkov effect in an infinite medium and the traveling-wave tube as far as microwave generation is concerned. In the Cerenkov case one implicitly assumes that the energy lost as radiation is a negligibly small part of the kinetic energy of the beam and that the beam velocity remains constant. Attention is concentrated on the radiation mechanism and the effects of the field on the beam are not considered. It is assumed that once the Cerenkov wave is radiated it is lost to the system. On the other hand, in the traveling-wave tube the continued interaction between the wave and the beam is all-important. In fact, it is precisely the effect of the wave on the ballistics of the beam which gives rise to the bunching of the beam and the resulting amplification. It should be emphasized, however, that the foregoing distinctions hold only as long as the medium in the Cerenkov system is an infinite one, i.e., only as long as the Cerenkov wave is actually radiated away from the system without further interaction with the beam.

B. Proposed Cerenkov Devices
I. Rejlection Schemes. It will be apparent from the foregoing discussion that configurations are possible in which the traveling-wave and Cerenkov mechanisms can be combined. For instance, in a scheme which has been proposed by Danos (Zzj), the Cerenkov wave is reflected from the bottom of a dielectric slab (total internaI reflection) and returns to interact with 4 By “two-dimensional” here we mean as viewed in the plane defined by the particle velocity vector and the propagation vector.
288
HERBERT LASHINSKY BUNCHED ELECTRON BEAM - -- - -- -\ -
T
P
DIELECTRIC MATERIAL
OUARTER WAVE MATCHING PLATE WAVEG U I DE
FIG. 7. Reflection arrangement for increasing the yield of Cerenkov radiation [M. Danos, Columbia Radiation Lab. Quarterly Report, April 1954, unpublished].
a bunched beam (Fig. 7). If the initial Cerenkov radiation intensity is $I_0^2$, after the wave is reflected back to interact with the beam the radiation intensity is

$$I^2 = I_0^2 + I_0'^{2} + 2 I_0 I_0' \cos\varphi$$

where $\varphi$ denotes the phase difference between the incoming (reflected) and outgoing (unenhanced) radiation and $I_0'^{2}$ is the intensity of the reflected wave. The phase angle $\varphi$ depends on the thickness of the slab, which can be chosen to optimize this parameter. If this process is repeated along the length of the dielectric slab, the final intensity is greater than $I_0^2$ by a factor $n^2$, where $n$ is the number of times the wave has been reflected. However, as in the case of the nondispersive slow-wave structure, for a given slab thickness there is now only one beam velocity for which the synchronism condition is satisfied.

Another arrangement which combines the properties of the Cerenkov radiator and a circuit device is shown in Fig. 8 (26). In this system the periodic slow-wave structure serves to bunch the electron beam, which is not initially bunched. The dielectric slab serves to couple energy out of the beam, functioning in the same way as the dielectric in the preceding example. The dielectric, however, is “tuned” to operate at a higher harmonic of the bunching frequency and does not affect the bunching operation of the periodic structure, which operates at the fundamental bunching frequency. The advantage claimed for this arrangement is that the progressive bunching effect of the periodic structure compensates for debunching due to space-charge fields.

In a dielectric system it is possible to increase efficiency by applying a longitudinal dc electric field. In this case the energy transferred from the beam to the circuit represents potential energy acquired by the beam in
the dc field; the kinetic energy of the beam can be kept constant, in contrast to the traveling-wave tube, in which the beam loses kinetic energy. On the other hand, a dielectric system is always characterized by a low circuit impedance because of the large fraction of the energy which is stored inside the dielectric.
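As a rough numerical illustration of the reflection scheme of Fig. 7, the following Python sketch evaluates the two-wave intensity relation quoted above and the build-up obtained when several contributions are added. It assumes, purely for illustration, lossless reflections of equal amplitude and a constant phase step between successive contributions; the function names are hypothetical and are not taken from Danos's report.

```python
import cmath
import math


def two_wave_intensity(i0, i0_reflected, phi):
    """Intensity of two superposed waves with amplitudes i0 and i0_reflected
    and phase difference phi: I^2 = I0^2 + I0'^2 + 2*I0*I0'*cos(phi)."""
    return i0**2 + i0_reflected**2 + 2.0 * i0 * i0_reflected * math.cos(phi)


def multipass_intensity(i0, n, phi=0.0):
    """Intensity after adding n equal-amplitude contributions, each shifted in
    phase by phi relative to the previous one (assumption: no reflection loss)."""
    total = sum(i0 * cmath.exp(1j * k * phi) for k in range(n))
    return abs(total) ** 2


if __name__ == "__main__":
    for n in (1, 2, 5, 10):
        in_phase = multipass_intensity(1.0, n, phi=0.0)
        detuned = multipass_intensity(1.0, n, phi=math.pi / 5)
        print(f"n={n:2d}  in phase: {in_phase:6.1f}  phase step pi/5: {detuned:6.2f}")
```

With zero phase step the intensity grows as $n^2$, which is the enhancement quoted in the text. A fixed phase error per pass removes most of the gain in this simple model, which is consistent with the remark that a given slab thickness suits only one beam velocity.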
FIG. 8. Arrangement in which a slow-wave structure is used to bunch the electron beam and a dielectric plate is used to couple out Cerenkov radiation [G. Mourier, Proc. Intern. Congr. on Ultra-Highfrequency Tubes, Paris, 1956 2, 132 (1956)]. Labeled elements: slow-wave structure, electron beam, dielectric.
2. Loaded Waveguides. A system in which a dielectric-loaded waveguide serves as the slow-wave structure has been investigated theoretically by Abele (27). This system (Fig. 9) consists of a cylindrical metal waveguide of radius $r_2$, inside of which there is a hollow dielectric cylinder (radii $r_1$ and $r_2$). The region $0 < r < r_1$ is a vacuum region through which a cylindrical electron beam passes. The metal is assumed to be an ideal conductor and the dielectric is assumed to be nondispersive and to have a refractive index $n$. We first consider the radiation produced by a single charge $e$ which moves with velocity $v = \beta c$ along the axis of the system, where $\beta n > 1$. Qualitatively it is apparent that the synchronism condition will be satisfied for all waveguide modes characterized by phase velocities equal to the velocity of the particle; thus, in this case the Cerenkov radiation excited by the single particle is emitted in a line spectrum. The actual distribution of energy over the modes is rather complicated. For the case in which $(1 - \beta^2)/(n^2\beta^2 - 1) \approx 1$, if $r_1/r_2 \ll 1$ the energy in the $h$th mode, $w_h$, is given by [$k = (\omega/v)(n^2\beta^2 - 1)^{1/2}$]

for $h \ll (1/\pi)(r_2/r_1)$. (53)
FIG. 9. Electron beam moving through cylindrical tunnel in a dielectric which fills a cylindrical waveguide. Labeled element: metal wall.
The frequency of the $h$th mode is

On the other hand, when $h \gg (1/\pi)(r_2/r_1)$, the energy distribution becomes

In the limiting case $r_1 \to 0$ (waveguide completely filled by the dielectric) the energy increases linearly with frequency for low values of $h$. The effect of the empty channel is to introduce a correction term which is essentially proportional to the square of the frequency. The power reaches a maximum when $h = (1/\pi)(r_2/r_1)$, and then falls off exponentially in accordance with Eq. (55). When $r_2 = r_1$ it is found that the linearly increasing part vanishes, leaving the exponential decay factor.

Abele has also considered the radiation of a bunched electron beam in a dielectric-loaded waveguide. The expression for the radiated power is rather complicated but indicates that the energy in the fundamental mode increases as the square of the interaction length; thus, undesired modes can be suppressed by using long interaction distances.

The Cerenkov radiation produced by particle beams in loaded waveguides has also been investigated by Akhiezer et al. (28) in connection with the stability of beams in linear accelerators. In this case the array of coupled resonator sections through which the beam passes constitutes a slow-wave structure and, since the beam velocity is equal to the phase velocity of the system, the Cerenkov radiation condition is satisfied. Moreover, as the particles become bunched the radiation from different bunches becomes coherent and the effect is enhanced. In addition, density fluctuations in the beam can be amplified by the traveling-wave amplification mechanism. However, it is found that both of these effects are negligible for currents less than the order of amperes, so that they can be disregarded in present-day machines.
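The condition $\beta n > 1$ used at the start of this section can be turned into a minimum beam voltage for a given dielectric. The Python sketch below does this with the usual relativistic energy relation and also evaluates the Cerenkov angle from $\cos\theta = 1/(\beta n)$. It assumes a nondispersive dielectric with $n = \sqrt{\varepsilon}$; the value $\varepsilon = 100$ and the 10 kv beam voltage are the illustrative figures used elsewhere in this review, and the function names are hypothetical.

```python
import math

ELECTRON_REST_ENERGY_EV = 510_998.95  # electron rest energy m*c^2 in eV


def beta_from_voltage(volts):
    """Velocity factor beta = v/c of an electron accelerated from rest
    through a potential difference of `volts` (relativistic)."""
    gamma = 1.0 + volts / ELECTRON_REST_ENERGY_EV
    return math.sqrt(1.0 - 1.0 / gamma**2)


def threshold_voltage(refractive_index):
    """Accelerating voltage at which beta * n reaches 1."""
    if refractive_index <= 1.0:
        raise ValueError("Cerenkov radiation requires n > 1")
    beta_min = 1.0 / refractive_index
    gamma_min = 1.0 / math.sqrt(1.0 - beta_min**2)
    return (gamma_min - 1.0) * ELECTRON_REST_ENERGY_EV


def cerenkov_angle_deg(beta, refractive_index):
    """Cerenkov angle from cos(theta) = 1/(beta*n); None at or below threshold."""
    product = beta * refractive_index
    if product <= 1.0:
        return None
    return math.degrees(math.acos(1.0 / product))


if __name__ == "__main__":
    n = math.sqrt(100.0)  # assumed: n = sqrt(epsilon) with epsilon = 100
    print(f"threshold voltage for n = {n:.0f}: {threshold_voltage(n):.0f} v")
    beta = beta_from_voltage(10e3)
    print(f"beta at 10 kv: {beta:.3f}")
    print(f"Cerenkov angle in the dielectric: {cerenkov_angle_deg(beta, n):.1f} deg")
```

Under these assumptions a dielectric with $n = 10$ reaches threshold at a few kilovolts, so a 10 kv beam is comfortably above it.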
FIG. 10. Cerenkov “two-cavity” klystron [J. G. Linhart, Proc. Intern. Congr. on Ultra-Highfrequency Tubes, Paris, 1956 2, 136 (1956)]. Labeled elements: focusing electrode, cathode, electron beam, dielectric.
3. Dielectric Resonators.⁵ Another class of devices, in which an electron beam is used to excite a dielectric resonator via the Cerenkov effect, has been considered by Coleman and his group (29) and Linhart (30). In the Coleman device a prebunched electron beam interacts with a coaxial dielectric cavity. The outer wall of the cavity is tapered so that it becomes cone-shaped. If the taper is such that the waves strike the boundary at the Brewster angle, the radiation escapes and the system behaves like an infinite medium. In the scheme suggested by Linhart (Fig. 10) the beam is bunched in the dielectric “buncher” and transfers energy to the “catcher,” so that this configuration is the analog of the two-cavity klystron.

⁵ Note added in proof. A recent report [P. D. Coleman and C. Enderby, J. Appl. Phys. 31, 1695 (1960)] describes the production of microwave radiation at frequencies up to 40 kMc by means of a “megavolt electronics” dielectric-cavity Cerenkov radiator similar to that described above in Sec. IV.B.3. The system shown in Fig. 12 has also been operated at K-band [H. Lashinsky, Columbia Radiation Lab. Quarterly Report, June 1960, unpublished].
C. Design Factors

As we have indicated, there are a number of beam-dielectric configurations in which the Cerenkov effect can be used for the generation of microwaves. However, upon examination it will be found that all these configurations are subject to the same general requirements. These requirements are fundamental and stem from the nature of the wave equation. In general, in any system in which waves propagate at velocities smaller than $c$ in the longitudinal direction the solutions to the wave equation are such that the fields fall off exponentially in the transverse direction. The exponential factor is a decreasing function of $d/\lambda$, where $d$ is the distance from the structure or medium in which the wave is propagated. Since it is necessary to obtain an interaction between the electron beam and the wave, in all these devices the fundamental problem is that of causing a high-density, bunched electron beam to pass within a characteristic decay distance of a medium or structure. In general the characteristic decay length is an increasing function of the velocity factor $\beta$. Thus, if one is willing to work at high beam velocities (high voltages) the requirements on the mechanical tolerances and alignment can be relaxed to some extent. This is one of the motivations for the “megavolt electronics” approach to the problem (29).

In one configuration which has been investigated experimentally (31), a flat ribbon electron beam passes over a plane dielectric slab. This system has the advantage that the “structure” is nothing more than a flat slab of dielectric material which does not require any accurate machining other than that the surface be made as flat as possible. In effect, in this kind of configuration the size tolerances are replaced by shape tolerances, which are always easier to meet.
For example, it is a routine matter to grind a dielectric surface to optical flatness whereas the accurate machining of a structure containing slots or vanes with typical dimensions of the order of 0.1 to 0.25 mm is a fairly difficult problem. Moreover, since there is no size restriction on the dielectric slab the heat dissipation problem is a great deal easier.

An estimate of the power that can be obtained from a system in which a ribbon electron beam passes over a flat dielectric can be obtained from Eq. (37). The normal component of the Poynting vector can be written

$$\langle S_n \rangle = R_0 (Ia)^2\, \varepsilon\, e^{-2qd}\, q(\varepsilon, \beta) \ \text{watts/cm}^2 \qquad (56)$$

where

$$R_0 = \frac{2\pi N^2 e^2}{c} \times 10^{-7} = 189 \ \text{ohms}.$$

Here $N$ is the number of electrons in a coulomb ($6.24 \times 10^{18}$), $R_0$ is the effective radiation resistance for this geometry, and $I$ is given in amperes per centimeter of beam in the transverse direction, i.e. the current divided by the beam width. Taking the values $\varepsilon = 100$, $\beta = 0.2$, $d = 3 \times 10^{-\ldots}$ cm, $\lambda = 1$ cm, and $I = 1$ ma/cm we find $\langle S_n \rangle = \ldots$ watts/cm². It should be remembered that the power radiated at harmonics of the bunching frequency is proportional to the product of the harmonic content of the beam and the efficiency of coupling between the beam and the dielectric at the harmonic frequency. It is apparent from Eq. (56) that the coupling efficiency is an exponentially decreasing function of frequency.
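Two ingredients of this estimate can be checked numerically: the radiation resistance $R_0$ and the exponential coupling factor. The Python sketch below evaluates $R_0 = (2\pi N^2 e^2/c) \times 10^{-7}$ in Gaussian units, taking $N$ as the reciprocal of the electronic charge in coulombs. For the decay constant it uses the standard evanescent-field expression for a wave of phase velocity $\beta c$, $q = (2\pi/\lambda)(1/\beta^2 - 1)^{1/2}$, which is assumed here to correspond to the $q$ of Eq. (56). The beam height $d = 0.01$ cm is an arbitrary illustrative value, not one taken from the text, and the function names are hypothetical.

```python
import math

# Gaussian (cgs) constants
ELECTRON_CHARGE_ESU = 4.80320e-10   # statcoulomb
SPEED_OF_LIGHT_CM_S = 2.99792e10    # cm/sec
ELECTRONS_PER_COULOMB = 6.24151e18  # N = 1 / (1.60218e-19 coulomb)


def radiation_resistance_ohms():
    """R0 = (2*pi*N^2*e^2 / c) * 1e-7, with e in esu and c in cm/sec."""
    n_e = ELECTRONS_PER_COULOMB * ELECTRON_CHARGE_ESU
    return 2.0 * math.pi * n_e**2 / SPEED_OF_LIGHT_CM_S * 1.0e-7


def decay_constant_per_cm(beta, wavelength_cm):
    """Transverse decay constant of a slow wave with phase velocity beta*c
    (assumed form): q = (2*pi/wavelength) * sqrt(1/beta^2 - 1)."""
    return 2.0 * math.pi / wavelength_cm * math.sqrt(1.0 / beta**2 - 1.0)


def coupling_factor(beta, wavelength_cm, distance_cm):
    """Exponential factor exp(-2*q*d) for a beam at distance d from the slab."""
    q = decay_constant_per_cm(beta, wavelength_cm)
    return math.exp(-2.0 * q * distance_cm)


if __name__ == "__main__":
    print(f"R0 = {radiation_resistance_ohms():.1f} ohms")
    q = decay_constant_per_cm(0.2, 1.0)
    print(f"beta = 0.2, wavelength = 1 cm: q = {q:.1f} per cm, 1/q = {1.0 / q:.4f} cm")
    d = 0.01  # illustrative beam height in cm (assumption)
    for wavelength in (1.0, 0.5, 0.25):
        print(f"wavelength {wavelength:4.2f} cm: exp(-2qd) = "
              f"{coupling_factor(0.2, wavelength, d):.3f}")
```

The computed $R_0$ is about 188 ohms, in agreement with the rounded value quoted above. In this model the decay length $1/q$ grows with $\beta$ and the coupling factor falls rapidly as the wavelength is shortened, in line with the qualitative statements in the text.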
D. Experimental Results

FIG. 11. Experimental arrangement used to detect Cerenkov radiation at microwave frequencies [H. Lashinsky, in “Symposium on Millimeter Waves” (J. Fox, ed.), p. 181. Interscience, New York, 1960]. Labeled elements: K-band input, Faraday cage, titanium dioxide, quarter-wave matching plate, focusing electrode, cavity, output to K-band detection system.

The first attempt to observe Cerenkov radiation at microwave frequencies reported in the literature (31) was made with the apparatus shown in Fig. 11. This was an exploratory experiment and no serious attempt
was made to optimize the current density. The K-band bunching cavity was driven by a low-power klystron which was square-wave modulated at 6 kc/sec to facilitate detection and amplification of the signal in an ac amplifier. In addition, the electron beam was “chopped” at 20 cps in order to make it possible to discriminate between the leakage power from the cavity and the Cerenkov radiation excited by the electron beam. The radiation was coupled out of the dielectric by means of the quarter-wave plate shown in the figure and was then coupled into a metal microwave horn. The experimental parameters of interest are as follows: beam voltage 10 kv; $\varepsilon \approx 100$; beam cross section, 4 mm wide by approximately 0.3 mm high; length of dielectric 1.9 cm; beam current 0.2 ma. A power of approximately $10^{-7}$ watts was observed at the fundamental frequency; this may be compared with the theoretical power of … watts computed on the basis of Eq. (56). In this experiment no search was made for harmonic power.

FIG. 12. Experimental arrangement being used in an attempt to generate Cerenkov radiation at millimeter wavelengths [H. Lashinsky, in “Symposium on Millimeter Waves” (J. Fox, ed.), p. 181. Interscience, New York, 1960]. Labeled elements: collimating tunnel (molybdenum), pole piece, waveguide.

The experimental arrangement being used in an experiment designed to investigate the possibility of generating harmonic power in the millimeter region is shown in Fig. 12 (32). The current density is about 5
amp/cm² at a beam voltage of 10 kv. The beam is confined by immersion flow in a magnetic field of 5500 gauss and a lens-cancellation system (33, 34, 35) is used to minimize beam perturbations. The filament is a tungsten ribbon which is heated by radio-frequency currents in order to avoid mechanical deformation in the magnetic field. The electron beam is 0.025 mm high and 5 mm wide and the total interaction current is about 5 ma. The structure itself consists of water-cooled copper blocks which contain the collimating slit, bunching cavity, and collector. These blocks are aligned by means of sapphire rods and spacers. The structure rests on a platform which can be rotated about three orthogonal axes by means of micrometer drives which operate through the vacuum chamber. With this arrangement it is possible to make the magnetic field and the electric field (defined by the electrode system) parallel to better than 0.001 radian. The bunching cavity is driven by a 10-watt CW K-band magnetron which can provide up to 1000 v across the bunching gap. This voltage is required in order to form bunches small enough to radiate coherently in the millimeter region. The dielectric terminates in a transition section through which the radiation is coupled into a metal waveguide. The electron beam is square-wave modulated at 500 cps so that a synchronous detection system can be used. Estimates made on the basis of the presently available current density and reduction in coupling efficiency at the higher harmonics indicate that this system could provide microwatt power in the millimeter region. It also has been reported recently that Russian workers have produced microwave Cerenkov radiation at 3 cm and are currently trying to produce radiation at millimeter wavelengths (2).
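The beam figures quoted for the two experiments can be cross-checked with a few lines of arithmetic. The Python sketch below computes the current per unit area and the current per unit beam width (the quantity $I$ entering Eq. (56)) from the quoted currents and cross sections. The class and variable names are hypothetical, and a uniform rectangular beam is assumed.

```python
from dataclasses import dataclass


@dataclass
class RibbonBeam:
    """Uniform rectangular (ribbon) beam: total current and cross section."""
    current_amp: float
    width_cm: float
    height_cm: float

    def area_density(self):
        """Current per unit cross-sectional area, amp/cm^2."""
        return self.current_amp / (self.width_cm * self.height_cm)

    def linear_density(self):
        """Current per unit beam width, amp/cm."""
        return self.current_amp / self.width_cm


# Values quoted in the text for the two experiments.
exploratory = RibbonBeam(current_amp=0.2e-3, width_cm=0.4, height_cm=0.03)
millimeter = RibbonBeam(current_amp=5e-3, width_cm=0.5, height_cm=0.0025)

if __name__ == "__main__":
    for name, beam in (("exploratory", exploratory), ("millimeter", millimeter)):
        print(f"{name:12s} J = {beam.area_density():7.3f} amp/cm^2, "
              f"I = {1e3 * beam.linear_density():5.2f} ma/cm")
```

For the millimeter-wave arrangement this gives 4 amp/cm², the same order as the “about 5 amp/cm²” quoted. The current per unit width is 20 times that of the exploratory experiment; since Eq. (56) is quadratic in $I$, this alone would correspond to a factor of about 400 in power density, other factors being equal.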
V. CONCLUSION

The limitations of conventional microwave devices have stimulated the exploration of more unconventional approaches to the problem of producing a tunable source of electromagnetic radiation in the ultramicrowave region of the spectrum. In this review we have summarized the theory of the Cerenkov effect as it applies to microwave generation and have indicated the relation between Cerenkov radiators and microwave devices in which periodic metal slow-wave structures are used. The fact that the “structure” in the Cerenkov radiator is a dielectric implies certain advantages and disadvantages as compared with systems in which the structure is metallic. In the Cerenkov case the retardation of the propagated wave is due to the inherent properties of the material so that it is not necessary to employ periodic structures, with their characteristic stringent mechanical tolerances and poor heat-dissipation properties. Moreover, a longitudinal dc electric field can be used to compensate for the kinetic energy lost by the beam to the propagating wave. On the other hand, in the Cerenkov case a large fraction of the field energy is propagated in the dielectric and is thus unavailable for interaction with the beam, so that the impedance of the system is inherently low. In principle a better match between the beam and the field can be obtained by the use of materials other than pure dielectrics. The fundamental problem of passing a high-density beam close to a structure is common to both dielectric and metal systems. Although the technical problems involved in the construction of a Cerenkov radiator are formidable, the possibility of producing an ultramicrowave source would seem to justify the exploration of this unconventional approach to the problem. There are also indications that microwave Cerenkov radiation will become important in the investigation of astronomical radio radiation and in “plasma diagnostics” in research on controlled fusion processes.
ACKNOWLEDGMENTS

The author is indebted to Prof. C. H. Townes for a critical reading of the manuscript and many valuable suggestions and to J. V. Jelley, M. Danos, G. Mourier, and J. G. Linhart for permission to reproduce figures from their publications. The author also wishes to acknowledge the support of this work by the Signal Corps, the Office of Naval Research, and the Air Research and Development Command.
REFERENCES

1. Cerenkov, P., Doklady Akad. Nauk S.S.S.R. 2, 451 (1934).
2. Tamm, I. E., Proc. Second Intern. Conf. on Peaceful Uses of Atomic Energy, Geneva, 1958 1, 408 (1959).
3. Mallet, L., Compt. rend. acad. sci. 188, 445 (1929).
3a. Sommerfeld, A., “Optics.” Academic Press, New York, 1954.
4. Ginzburg, V. L., Doklady Akad. Nauk S.S.S.R. 66, 253 (1947).
5. Elliot, R. S., J. Appl. Phys. 23, 812 (1952).
5a. Mote, H., Trans. IRE Professional Group on Antennas and Propagation AP-4, 374 (1956).
6. Pierce, J. R., Phys. Today 3, 24 (1950).
7. Kaufman, I., Proc. I.R.E. 47, 381 (1959).
8. Jelley, J. V., “Cerenkov Radiation and Its Applications.” Pergamon Press, London, 1958.
9. Frank, I. M., and Tamm, I., Doklady Akad. Nauk S.S.S.R. 14, 109 (1937).
10. Tamm, I. E., J. Phys. (U.S.S.R.) 7, 49 (1943).
11. Frank, I. M., Uspekhi Fiz. Nauk 30, 150 (1946).
12. Lawson, J. D., Phil. Mag. [7] 46, 748 (1954).
13. Ginzburg, V. L., and Frank, I. M., Doklady Akad. Nauk S.S.S.R. 66, 699 (1947).
14. Linhart, J. G., J. Appl. Phys. 26, 527 (1955).
15. Bogdankevich, L. S., and Bolotovskii, B. M., J. Exptl. Theoret. Phys. (U.S.S.R.) 32, 1421 (1957); Soviet Phys. JETP 6, 1157 (1957).
16. Danos, M., J. Appl. Phys. 26, 1 (1955).
17. Nag, B. D., and Sayied, A. M., Proc. Roy. Soc. 236, 544 (1956).
18. Sitenko, A. G., Doklady Akad. Nauk S.S.S.R. 98, 377 (1954).
19. Lashinsky, H., J. Appl. Phys. 27, 631 (1956).
20. Veksler, V. I., Proc. CERN Symposium, Geneva, 1956 1, 80 (1956).
21. Kolomenskii, I. I., Doklady Akad. Nauk S.S.S.R. 106, 982 (1956); Soviet Phys. Doklady 1, 133 (1956).
22. Drummond, J. E., Proc. Second Intern. Conf. on Peaceful Uses of Atomic Energy, Geneva, 1958 32, 378 (1959).
23. Sagdeyev, R. S., and Shafranov, V. D., Proc. Second Intern. Conf. on Peaceful Uses of Atomic Energy, Geneva, 1958 31, 118 (1959).
24. Pierce, J. R., J. Appl. Phys. 26, 627 (1955).
25. Danos, M., Columbia Radiation Lab. Quarterly Report, April 1954, unpublished.
26. Mourier, G., Proc. Intern. Congr. on Ultra-Highfrequency Tubes, Paris, 1956 2, 132 (1956).
27. Abele, M., Nuovo cimento [9] 9, Suppl. 3 (1952).
28. Akhiezer, A. I., Fainberg, Y. B., and Liubarskii, G. L., Proc. CERN Symposium, Geneva, 1956 1, 220 (1956).
29. Coleman, P. D., Progress Reports, Ultramicrowave Section, Elec. Eng. Research Lab., University of Illinois, March 1956.
30. Linhart, J. G., Proc. Intern. Congr. on Ultra-Highfrequency Tubes, Paris, 1956 2, 136 (1956).
31. Danos, M., Geschwind, S., Lashinsky, H., and Van Trier, A., Phys. Rev. 92, 828 (1953).
32. Lashinsky, H., in “Symposium on Millimeter Waves” (J. Fox, ed.), p. 181. Interscience, New York, 1960.
33. Dunn, D. A., and Luebke, W. R., Trans. IRE Professional Group on Electron Devices ED-4, 265 (1957).
34. King, P. G. R., Services Electronics Research Lab. Tech. J. 4, 9 (1954).
35. Pierce, J. R., Bell System Tech. J. 30, 825 (1951).
High-Power Axial-Beam Tubes T. MORENO Varian Associates, Palo Alto, California
I. Introduction
II. Problems Common to High-Power Klystrons and Traveling-Wave Tubes
    A. Cathode Materials
    B. Electron Beam Formation
    C. Output Window Design
III. Progress in High-Power Klyst...
    A. High Gain
    B. Broadbanding
    C. High Power
IV. Progress in High-Power Traveling-Wave Tube Design
    A. Comparison of Traveling-Wave Tubes and Klystrons
    B. Circuits for High-Power Traveling-Wave Tubes
References
I. INTRODUCTION

During the initial period of rapid development of microwave tubes which took place during World War II, the high-power transmitting tubes were almost exclusively magnetron oscillators. The klystron had been invented before World War II, and a small amount of effort was devoted to high-power klystrons during the war years, but these power tubes did not play a significant role in wartime equipment. Work continued on high-power klystrons following the war, and very significant results were achieved, particularly in the laboratories of the Sperry Gyroscope Company (1) and Stanford University (2). This led to widespread recognition of the importance of high-power klystrons as transmitting tubes. Very substantial effort continued to be devoted to these tubes during the past decade, and further important advances were achieved. Currently, high-power klystrons are an essential component in many modern microwave systems. During this same decade, a rapidly expanding effort was devoted to the development of high-power traveling-wave tubes. Although the state of the art is not so far advanced for high-power traveling-wave tubes, and their current use in systems is much less, it appears certain that traveling-wave tubes will be widely used in future systems.
There are many similarities between high-power klystrons and traveling-wave tubes, and in the problems encountered in their development. Problems of cathode design, beam formation and control, output window design, and collector design are similar for both tube types. They are used in a similar manner in systems, the most important differences for the system designer being the larger bandwidth that can generally be obtained with the traveling-wave tube, or the higher efficiency, power, and gain of the klystron. In the sections that follow, the problems that are common to these two tube types will be reviewed, and also the problems that are unique to one type or the other. Significant advances that have taken place during the past decade will be reviewed, as well as areas where future progress can be anticipated. An attempt will be made to assess the role that these two tube types will play in the future, where high power is required.

II. PROBLEMS COMMON TO HIGH-POWER KLYSTRONS AND TRAVELING-WAVE TUBES
A. Cathode Materials

1. Oxide Cathodes. Oxide cathodes of a conventional nature, formed with a base metal of nickel covered with a coating of mixed barium and strontium oxides, are used for continuous-wave tubes where the required emission density is not large. Where long life is a requirement, it is generally considered good practice to limit the emission density to a maximum value of about 0.3 amp/cm² of cathode area, although it has been found that with careful processing emission densities as high as 0.6 amp/cm² can be used with operational life of several hundred hours. For short pulse operation, advantage can be taken of the enhanced emission of oxide cathodes. With high-voltage pulse tubes, the nickel base is usually modified to form a rough, porous structure. This can be done, for example, by sintering layers of nickel powder onto the smooth base metal. When this rough surface is painted with the alkaline earth carbonates, the paint soaks into the pores, and a mechanical bond is formed which prevents loss of coating by peeling or flaking. In addition, a reservoir of oxides is provided to replace by diffusion material that is lost from the surface by evaporation or sputtering. It is generally considered good practice for pulse lengths of several microseconds to operate cathodes of this type at emission densities up to 5 amp/cm², and higher values have been used and have given satisfactory life. With electron gun designs of reasonable perveance and convergence, axial-beam tubes of many megawatts peak power output can be designed with the oxide cathode operating at a conservative current density. As the pulse length is increased beyond a few microseconds, a region is entered where little precise information is available. As the pulse length increases, the oxide cathode is increasingly subject to damage by positive ion bombardment or by arcing. The seriousness of both of these effects will vary over wide ranges depending strongly upon the details and quality of tube construction and processing. At some point one can no longer depend upon the enhanced emission that an oxide cathode can supply for short pulse lengths, but again, the limits are not accurately known.

2. Pure Tantalum. Pure tantalum metal has been used as the cathode material in a number of beam tubes that operate continuously at several kilovolts. For a time pure tantalum was favored because of its resistance to poisoning and resistance to damage by ion bombardment. To supply emission of 250 ma/cm², tantalum must be raised to a temperature of nearly 2300°K. This temperature is attained by using the tantalum cathode button as the anode of a simple diode, and heating it by bombardment of electrons emitted from a heated tungsten filament. The associated circuitry is more complicated than with an indirectly heated cathode. Further complication may be necessary in the form of control circuits to prevent runaway of the cathode temperature. As a result, the tantalum cathode has been widely used in power klystrons in service today, but newer tubes are being designed with other, more modern cathode materials that permit indirect heating.

3. Thoriated Tungsten. Thoriated tungsten buttons, machined from thoriated tungsten rods to give the customary spherically concave emitting surface, are also used. These give higher emission at lower temperature than tantalum; 1 amp/cm² at a temperature of 1900°K is typical, but they must still be heated by electron bombardment from a primary filament. Because the cathode button is thick, typically … in., it can be readily carburized to a depth of 0.010 in. or so, and gives exceedingly long life.

4. Matrix and Impregnated Cathodes.
There have been substantial advances in recent years in cathodes of the matrix or impregnated type. In one form of matrix cathode (3), nickel powder and alkaline earth carbonates, possibly with an activating agent such as carbon added, are mixed and formed into a cathode surface on top of a nickel base by pressing at a pressure of up to 50 tons/in.² The resulting surface appears nearly metallic but is activated like an oxide cathode, although a longer period of activation may be required. The resulting surface is more resistant to arcs than the conventional oxide cathode with a painted surface. Impregnated cathodes can also be constructed by forming a porous button and impregnating it with active materials. One satisfactory cathode of this type (4, 5) uses a porous tungsten button, impregnated with barium aluminate. The resulting cathode can be operated at a temperature of 1050°C, which will permit indirect heating, and give continuous emission of several amperes per square centimeter of cathode area for many thousand hours in typical high-power beam tubes. When high continuous power at high microwave frequencies is required, this type of cathode has become very popular. The available emission densities combined with a gun design of reasonable convergence permit continuous current density in the electron beam higher than 100 amp/cm², a figure that is very difficult to attain with any of the other cathode materials listed above.
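The way emission density, perveance, and convergence combine can be illustrated with a short calculation. The Python sketch below uses the usual space-charge-limited relation $I = K V^{3/2}$, where $K$ is the gun perveance, together with simple area ratios. All numerical values (microperveance 2, 250 kv, an area convergence of 40) are hypothetical examples chosen for illustration and are not taken from the text; only the emission densities of 5 amp/cm² and "several" amp/cm² echo the figures quoted above.

```python
def beam_current_amp(perveance, beam_volts):
    """Space-charge-limited gun current, I = K * V**1.5 (K is the perveance)."""
    return perveance * beam_volts ** 1.5


def cathode_area_cm2(current_amp, emission_density):
    """Cathode area needed to supply current_amp at emission_density (amp/cm^2)."""
    return current_amp / emission_density


def beam_current_density(emission_density, area_convergence):
    """Beam current density when the gun compresses the cathode area by the
    factor area_convergence (cathode area divided by beam area)."""
    return emission_density * area_convergence


if __name__ == "__main__":
    # Hypothetical pulsed tube: microperveance 2 at 250 kv, oxide cathode at 5 amp/cm^2.
    volts = 250e3
    current = beam_current_amp(2.0e-6, volts)
    print(f"beam current {current:.0f} amp, beam power {current * volts / 1e6:.1f} Mw")
    print(f"cathode area at 5 amp/cm^2: {cathode_area_cm2(current, 5.0):.0f} cm^2")

    # Hypothetical cw tube: impregnated cathode at 3 amp/cm^2, area convergence 40.
    print(f"cw beam density: {beam_current_density(3.0, 40.0):.0f} amp/cm^2")
```

Under these assumed values the pulsed gun delivers a beam power of tens of megawatts from a cathode of about 50 cm², and the continuous-wave example exceeds 100 amp/cm² in the beam, which is the order of magnitude mentioned in the text.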
B. Electron Beam Formation

1. Brillouin Focusing. In high-power axial-beam tubes, the electron optics of the cathode and anode are normally designed so that the stream of electrons converges from the concave spherical surface of the cathode and is focused through an aperture in the anode. Commonly, the power density in the beam when it passes through the anode aperture is so high that if any substantial portion of the beam is intercepted by the anode, the anode may melt. The electron beam that is formed by the stream of electrons passing through the aperture will diverge rapidly because of the mutual repulsion of the electrons. To collimate the beam as it continues beyond the anode through the microwave interaction structure, additional focusing means are required. An axial magnetic field, parallel to the electron beam, usually provides this additional focusing. The precision with which the electron beam is formed depends upon several factors: (1) the design of the electron gun, including the focus electrodes that are required in addition to the cathode button and anode; (2) the smoothness and uniformity of emission from the cathode surface; (3) the lens effects of the anode aperture; (4) the deflection effects associated with the entry of the beam into the focusing field; (5) the absolute value and uniformity of the magnetic focusing field. A dense, smooth, well-focused electron beam is an essential part of a high-power axial-beam tube, and a great deal of effort has therefore been devoted to study of the problems associated with electron beam formation and focusing. Some of this work has been summarized in earlier volumes of this series (6, 7). If it is proper to speak of “standard techniques” of electron gun design and beam focusing, the “Pierce gun” (8) and “Brillouin focusing” (6) are the standard techniques to which other design approaches and focusing techniques are compared.
If a Pierce gun is carefully designed, if anode aperture effects are properly calculated, and entry of the beam into the magnetic field is also done properly, a beam will be formed which is well collimated by a magnetic field that is reasonably close (within several percent) to the ideal Brillouin field. Careful experimental investigations have shown that the beam is not of uniform density, and that the “surface” of the beam is scalloped by an amount that depends upon the accuracy and precision of the gun design and construction. But for a well-designed gun and carefully adjusted magnetic field, these deviations from an idealized electron flow do not appear to perturb seriously the actual operation of a practical microwave tube. One technique that has been used to provide additional flexibility in design and construction is to place an additional short magnetic lens between the anode aperture and the main focusing field. Adjustment of this lens will permit beams of varying diameter to be produced from an electron gun of fixed dimensions. In addition, the lens can be used to compensate for aberrations in the electron gun, and for mechanical tolerances in the gun construction. But with any axial-beam tube employing “Brillouin flow,” there are some other, more basic practical problems that are difficult to circumvent. One of these problems is that in theory, there is only one value of magnetic field that is correct for a given voltage of operation of the tube. If the magnetic field is increased beyond this “correct” value, the beam will scallop. If these scallops become excessive, they will affect the operation of the tube. In a practical tube, there is of course a finite range over which the magnetic field can be varied without substantially affecting the tube performance, but this range is relatively narrow. Furthermore, the magnetic field must be adjusted as the voltage of the electron beam is increased; this may result in a complex turn-on procedure. Another difficulty frequently encountered with Brillouin-focused beams in continuous-wave tubes is that of ion noise. If there is beam scalloping with a smooth-walled drift tube around the beam, the space charge potential depression at the minima in the beam diameter will be greater than at the maxima. Positive ions that are formed by electron collisions with residual gas molecules in the tube will therefore be drawn toward and trapped by these minima.
The space charge depression in the center of the beam can be typically of the order of several hundred volts, and the ion-trapping action can be quite strong if there is serious scalloping. The electron beam can interact with the ion plasma and cause plasma oscillations. These oscillations of the ion plasma will affect the focusing of the electron beam and modulate the rf signal being amplified by the beam. A Brillouin-focused beam depends upon a balancing of forces between the outward space charge forces of the electrons and the inward centripetal forces resulting from the electrons rotating about the axis and thereby cutting the magnetic lines of force. The neutralization of space charge forces by the positive ions will upset the balance of forces that exists in the beam; the resulting instability of the beam can augment the ion noise effects. Another problem that must be given careful attention by the tube designer arises from the modulation of electron density and velocity along the beam by its interaction with the radio-frequency (rf) electromagnetic
T. MORENO
fields. The electrons in the beam are gathered together into bunches, which travel down the tube separated in time by one rf cycle. With Brillouin focusing, the primary effect is not a variation of electron density but rather a variation of beam diameter. In other words, the beam expands laterally as the electrons group together to form a bunch, and the beam diameter contracts between the bunches. If the tube designer does not compensate for this effect in the design of the tube, excellent transmission of the beam through the rf circuit may be obtained when the tube is operated without rf drive. But when the rf drive is introduced, the expansion in beam diameter at the rf bunches may result in greatly increased interception of the beam by the rf circuit.
Maximum coupling between the electron beam and the rf fields of the circuit results when the electron beam diameter approaches the inner diameter of the rf circuit. For maximum low-level gain, the beam diameter should therefore be nearly equal to the inner diameter of the circuit. For high-level operation, lateral expansion of the bunches will cause high interception of the beam current; this can result in a degradation of efficiency, or possible melting of the circuit. To protect the circuit and to maximize efficiency, the beam diameter near the output end of the rf interaction structure in the absence of modulation must be substantially less than the inner diameter of the structure. To accomplish this end, the beam can initially be designed with a reduced diameter, and a correspondingly higher magnetic field will be required to focus the beam. Or as an alternative, the magnetic field can be tapered, increasing from the input toward the output of the tube, thereby compressing the beam diameter and reducing beam interception near the output end of the rf circuit.
Where these steps have not been taken, it has been observed experimentally that with a traveling-wave tube adjusted for maximum low-level gain and beam transmission greater than !%yo, the beam transmission may drop to as low as 60% under high-level signal conditions. When the beam diameter is reduced to compensate for the beam spreading with rf modulation, the beam transmission under high-level signal conditions may remain above 95%, but the small-signal gain will be reduced.
2. Magnetically Confined Flow. Some of the design problems that are fundamental in a tube employing a Brillouin-focused beam can be circumvented by employing the technique of magnetically confined convergent focusing or “confined-flow.” With Brillouin focusing, the magnetic field is ideally zero at the surface of the cathode. With confined flow, the magnetic field is allowed to penetrate the cathode-anode region, and the surface of the cathode is threaded by the flux lines of the magnetic field. Electron beams can be focused with varying amounts of magnetic field at the surface of the cathode. Very successful results have been attained by
designing a Pierce gun to give convergent trajectories in the cathode-to-anode region, and then designing the magnetic field structure so that the lines of magnetic flux coincide approximately with the electron trajectories that would exist in the cathode-anode region in the absence of magnetic field. This is illustrated in Fig. 1. The magnetic field can be properly shaped
FIG. 1. Magnetic flux for confined flow. (Labels in figure: focus electrode, anode, beam boundary, magnetic flux lines.)
by using a solenoid focusing structure as shown in Fig. 2, and controlling the magnetic field in the cathode-anode region by adjusting the dimensions of the aperture in the iron pole piece. Alternatively, an additional coil may be added beyond the pole piece to adjust the magnetic field in the cathode region.
As has been pointed out by Dow (7), with confined-flow focusing the electrons in the beam do not have uniform velocity in the axial direction, as they do in theory with Brillouin focusing. The nature of the electron trajectories has been discussed by Dow. This deviation from what might be considered a theoretically more ideal situation does not appear to have any deleterious effects upon the actual operation of microwave tubes.
A confined-flow beam is typically operated with a magnetic field two to three times the field required to focus a Brillouin-focused beam of the same diameter. This wide disparity in required magnetic fields is reduced by some important factors. First, most tubes employing Brillouin focusing actually require a magnetic field somewhat higher than theoretical because of imperfections in the design and construction, although the very best designs will operate with a magnetic field not greatly in excess of theoretical values. Second, and more important, with a confined-flow beam radial spreading of the beam when rf drive is applied to the tube is greatly reduced. With a given inner diameter of structure, a confined-flow beam therefore can be initially designed with a larger diameter than a Brillouin-focused beam, and the required magnetic field will be correspondingly less. It is typical of tubes with confined-flow beams that there is only a small reduction in beam transmission when rf drive is applied sufficient to drive the tube to maximum power output. The larger initial diameter of the beam with confined-flow focusing will also improve the coupling between the beam and the circuit at low signal levels.
An important advantage of confined-flow focusing results from the reduction in beam scalloping. With continuous wave operation a marked reduction in ion-noise effects is observed. It is usually possible to adjust the magnetic field in a Brillouin-focused tube to minimize troublesome ion oscillations, but unless the beam perveance is low, and the operating voltage correspondingly high, the adjustment is typically very sensitive and critical for systems that are sensitive to ion noise. With well-designed confined-flow focusing, the adjustment of magnetic field to eliminate ion noise is very uncritical.
Other operating parameters, such as gain, efficiency, and body current interception, are also very insensitive to variations of magnetic field if confined-flow focusing is employed. The minimum limit of the operating range is reached when the magnetic field becomes too weak to confine the electron flow properly. The maximum limit is usually reached when saturation effects in the iron pole pieces of the magnetic circuit become sufficiently severe to distort the magnetic field. The range of usable magnetic field strengths between these limits can be quite large, as large as two-to-one or more. This greatly simplifies the problem of setting up a tube to proper adjustments, and the magnetic field need not be changed when the operating voltage is varied over rather wide limits.
C. Output Window Design
In all high-power microwave tubes, including axial-beam tubes, one of the more serious problems that must be faced is that of the output window. This dielectric structure, through which the output power must be coupled from inside the vacuum envelope to the output transmission line, presents
a variety of problems. Failures of this output window are usually catastrophic in nature and can result from a variety of causes. These include excessive heating from dielectric losses or electron bombardment, or dielectric failure resulting from exceeding the dielectric strength of the material. Design of a satisfactory window involves two classes of problems. The first class comprises problems of materials technology: choice of a suitable dielectric material and means of providing a vacuum seal from this dielectric to the metal structure of the tube. The second class of problems involves design of the window as an electromagnetic structure, to insure that the window presents the proper characteristics as a circuit element, and at the same time is protected from excessive damage by any of several mechanisms that may occur.
1. Materials Technology. Glass was at one time used almost exclusively for output windows for high-power tubes, and it is still used by some manufacturers. The technology of making metal-to-glass seals is well understood and offers no serious problem. The dielectric losses of hard glass are only moderately low, and glass windows can be heated to the melting point by dielectric heating and will then fail by softening and sucking in. The average-power handling capacity of glass windows can be greatly increased by forced air cooling, so that very substantial amounts of average power, in excess of 20 kw at a frequency of 3000 Mc, can be handled. This is still much less than can be passed by a high-alumina ceramic window without the added complication of forced air cooling. Also, a glass window cannot be baked at a temperature much over 450°C without softening. This limits the processing techniques that are possible and makes it more difficult to obtain the ultra-high vacuums that are highly desirable with high-power tubes.
Mica is also a material that has found application in output windows of high-power tubes.
When the maximum dimensions approach 2 in., the technique of obtaining a good vacuum seal between mica and metal becomes troublesome, but windows suitable for waveguides of smaller dimensions are readily fabricated. At frequencies of the order of 10,000 Mc, the sheet mica is typically 0.004 in. thick and is sealed to an alloy of matching expansion coefficient with a soft glass. Windows of this type have been extensively tested and without forced air cooling have successfully transmitted over 5 kw of average power at X-band. Tests have been conducted by placing mica in resonant cavities and feeding in power until the mica breaks down. These tests have led to the conclusion that the maximum power limit is not greatly in excess of 5 kw, although thinner sheets will have somewhat higher power ratings. A serious practical problem in the application of mica windows is keeping the sealing glass out of regions of high rf field strength. The glass is quite lossy, and an excess of glass on the
seal or a tiny drop that has splattered on the face of the window can lead to window failure at much lower power levels. The use of the glass-sealing technique also limits bakeout and processing temperatures to values slightly lower than are possible with conventional glass windows.
High-alumina ceramics have come into extensive use in recent years for high-power output windows. Experimental tests have shown rather conclusively that a properly designed ceramic window has a much higher average power capability than does either a glass or a mica window. The dielectric losses of high-alumina ceramic depend upon the purity of the alumina. Extremely high-purity alumina, over 99% Al2O3, has dielectric losses at X-band that are typically several times lower than those of ceramics that are 95-97% Al2O3. The exact values of loss depend upon the fluxing agents that are used in forming the ceramic. High-alumina ceramics can be bonded to metal by various metallizing processes, of which the well-known moly-manganese¹ and titanium hydride processes are the most popular. Ceramic windows and seals permit the tube to be baked at temperatures above 600°C, making possible an extremely clean tube with a very good vacuum. A variety of designs of output windows employing ceramics have been employed, and some of these will be described below, following a discussion of failure mechanisms other than dielectric heating.
Ceramic windows of less than optimum design have experimentally transmitted over 20 kw of continuous power without air cooling at a frequency of approximately 10,000 Mc. It has been demonstrated that with better ceramics and a better design of window, the dielectric heating can be further reduced by several times. Therefore, with the best ceramic window designs, dielectric heating of the output window is no longer a significant limitation to maximum power performance at the present state of the high-power microwave tube art.
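The way dielectric heating depends on loss tangent and thickness can be made concrete with a rough estimate. The sketch below is an illustration only: it treats the window as a thin, matched, low-loss slab traversed by a plane wave, for which the fraction of power absorbed in one pass is about 2 pi sqrt(eps_r) tan(delta) t / lambda0. Standing waves in a real waveguide window are ignored, and the permittivity, loss tangents and thickness are assumed example values, not data from the text.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def absorbed_fraction(thickness_m, freq_hz, eps_r, tan_delta):
    """Fraction of transmitted power absorbed in one pass (thin low-loss slab)."""
    wavelength0 = C0 / freq_hz
    return 2.0 * math.pi * math.sqrt(eps_r) * tan_delta * thickness_m / wavelength0


def window_heating_w(power_w, thickness_m, freq_hz, eps_r, tan_delta):
    """Heat (W) deposited in the window at a given average transmitted power."""
    return power_w * absorbed_fraction(thickness_m, freq_hz, eps_r, tan_delta)


# Assumed example: 1 mm alumina-like window (eps_r = 9) at 10,000 Mc, 20 kw average.
for label, tan_delta in (("lower-loss ceramic", 1e-4), ("lossier ceramic", 5e-4)):
    heat = window_heating_w(20e3, 1e-3, 10e9, 9.0, tan_delta)
    print(f"{label}: about {heat:.1f} W dissipated in the window")
```

Under these assumptions the heat deposited is proportional to the loss tangent, so a ceramic with several times lower loss runs correspondingly cooler at the same transmitted power.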
Synthetic sapphire, single crystal aluminum oxide, has also been used as a window material for high-power tubes. It has been found experimentally that the dielectric heating of sapphire is approximately the same as that of high-purity alumina ceramic. The sapphire is more difficult to handle, and more subject to damage by thermal shock, and seems to offer no significant advantage over the ceramic. Its optical transparency is sometimes useful in experimental tubes.
Quartz is a very promising material for windows because it combines very low dielectric losses with a very low expansion coefficient, and correspondingly high resistance to thermal shock. Its use has been limited because the problems of making a vacuum bond between metal and quartz have proven rather difficult, although successful seals have recently been made on a laboratory basis. It is also possible to connect the quartz to a metal envelope through a series of graded glass seals.
¹ Moly is an abbreviation for molybdenum.
2. Failure Mechanisms. There are several possible mechanisms of window failure that must be considered by the tube designer as possible limitations to high-power tube performance. Failure by dielectric heating has been mentioned in the preceding paragraphs, and it has been pointed out that with the best materials and the best windows, this is no longer a serious problem. Even with the best materials, the problem can be magnified greatly by the design of the window. Some high-power klystrons have been designed with the output window a ceramic cylinder placed inside the output cavity. Radio-frequency fields are much stronger inside the cavity than outside the cavity in the output transmission line, and output seals placed inside the resonant cavity will in general be subject to much greater dielectric heating than output windows located in the transmission line. It is also necessary to consider the possibilities of spurious resonances in the window structure for windows placed in the output line. It is highly desirable to avoid spurious resonances within the operating frequency range of the tube, if possible. If power is coupled into these spurious resonances, the dielectric heating of the output window can be enormously increased, resulting in window failure.
Excessive heating of the output window can be caused by electron bombardment, as well as by dielectric heating. For ceramic seals that are placed inside the output cavity of a klystron, close to the electron beam, substantial electron bombardment of the output seal can result from secondary electrons released by the beam striking the surfaces adjacent to the coupling gap. Ceramic seals placed in the output transmission line are readily shielded from bombardment by stray primary or secondary electrons from the electron beam. Heating by multipactor electrons can be a serious effect, however.
The troublesome multipactor effect is encountered when electrons released from one surface are accelerated by the rf field to strike another surface in one half rf cycle. If secondary electrons are released and are accelerated back to the initial surface in the next half cycle, there to release additional secondary electrons, an electron multiplication phenomenon can occur. This will result in a cloud of secondary electrons oscillating back and forth between surfaces in synchronism with the rf field. Energy will be extracted from the rf field by this multipactor discharge and will be delivered as heat energy to the surfaces involved. One of the two surfaces involved can be a surface of the dielectric output window, and excessive heating of the output window can result. For the threshold of multipactor to be reached, it is necessary that the strength of the rf field be sufficient to accelerate electrons from one surface to the other in a half cycle. The existence of multipactor will therefore depend upon the peak power being transmitted by the window. If the peak power is sufficient to cause a multipactor discharge, the resulting heating of the window will
depend upon the average power being transmitted. The threshold of possible damage to the window is therefore a function of both peak and average power being transmitted. It is also possible to excite a multipactor discharge at the output window by exciting a spurious resonance, thereby developing rf fields of sufficient intensity and of the proper configuration to cause multipactor. This is an additional reason to avoid spurious resonances in the output window structure.
At high peak power levels, electron bombardment of the window is frequently made visible by a fluorescence or glow of the dielectric material. Multipactor can be observed in this way, but there are evidently other mechanisms that can lead to electron bombardment whose nature is not clearly understood. This electron bombardment results in patterns of visible fluorescence and excessive heating of the window. The pattern of fluorescence can be shifted by applying a magnetic field to the window, but the bombardment cannot be readily eliminated by proper orientation of the magnetic field, whereas a multipactor discharge usually can be controlled by a magnetic field. This excessive, anomalous electron bombardment is frequently the most serious problem faced by the designer of a high-power window; neither the cause nor the cure is currently well understood.
Another failure mechanism of output windows is manifested by small holes or punctures drilled through the dielectric. These result from high-energy electrons striking the window and charging the surface until the dielectric strength of the material is exceeded. These high-energy electrons can be stray electrons from the beam, possibly scattered by elastic collisions with interior metal surfaces in the tube. With high-alumina ceramic, failures of this type are typically encountered at beam voltages substantially greater than 100 kilovolts.
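The half-cycle transit condition for multipactor described earlier in this section can be put in numbers with an idealized model. The sketch below is an assumption-laden illustration, not the author's analysis: it takes two parallel plane surfaces, a uniform rf field, nonrelativistic electrons emitted with zero velocity, and asks what peak rf voltage carries an electron across the gap in an odd number of half cycles. Solving the equation of motion under those assumptions gives V = (m/e)(omega d)^2 / (N pi cos(phi) + 2 sin(phi)), where phi is the rf phase at emission. The gap and frequency values are hypothetical.

```python
import math

ETA = 1.75882001076e11  # electron charge-to-mass ratio e/m, C/kg


def multipactor_gap_voltage(freq_hz, gap_m, order=1, phase_rad=0.0):
    """Peak rf gap voltage (V) for transit in `order` half cycles (order odd).

    Idealized: parallel plates, uniform field, zero emission velocity.
    """
    if order < 1 or order % 2 == 0:
        raise ValueError("order must be a positive odd integer")
    omega = 2.0 * math.pi * freq_hz
    denom = order * math.pi * math.cos(phase_rad) + 2.0 * math.sin(phase_rad)
    if denom <= 0.0:
        raise ValueError("no resonant solution at this emission phase")
    return (omega * gap_m) ** 2 / (ETA * denom)


# Hypothetical example: 1 cm spacing between window and wall at 3000 Mc.
for n in (1, 3, 5):
    kv = multipactor_gap_voltage(3e9, 1e-2, order=n) / 1e3
    print(f"order {n}: resonant peak voltage about {kv:.1f} kV")
```

In this model the resonant voltage scales as the square of frequency times spacing, which is consistent with the observation that the onset of multipactor is set by peak rather than average power.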
When very high rf voltages are developed in the waveguides, puncture failures can be observed even when the window is well shielded from the electron beam. Stray electrons in the evacuated waveguide, possibly having their origin in ionization by cosmic rays, can be accelerated by the rf field to energies sufficiently high to charge the window to the point of failure. These puncture failures are more commonly observed with dielectric materials having very low conductivity, such as the high-alumina ceramics. If the leakage resistance is lower, the charge will bleed off. It is possible to improve the resistance of ceramic windows to puncture failures by coating the window with a thin resistive film. Various techniques have been used, including the evaporation of thin metal films on the ceramic. Only a very slight amount of leakage is necessary to prevent charging of the surface. The resistance to puncture can also be increased by use of thicker material, thereby increasing its dielectric strength.
3. Typical Window Designs. Because most modern high-power microwave tubes use ceramic windows, this section will describe some ceramic window designs that have been used successfully on tubes of recent design.
a. Slanted disk. A flat disk of ceramic is a preferred form to use for an output window; it is simple to fabricate, and the dimensions are easy to control precisely. A waveguide of rectangular cross section is a form often used. The round ceramic disk and the rectangular waveguide are conveniently fitted to form an output window by canting the disk at an angle, as illustrated in Fig. 3. Various dimensions can be adjusted to minimize the
FIG. 3. Section through a ceramic disk window mounted slantwise across a rectangular waveguide.
reflection introduced by the window into the waveguide, including the diameter, thickness, and angle of inclination of the window, and the alignment of the axes of the waveguide on either side of the window. By cut and try, dimensions can be found to give the window a good match over a band of frequencies substantially greater than 10%. One difficulty that is encountered with this design is multipactor between the window and the waveguide wall. At frequencies in the region of 3000 Mc, this multipactor can be troublesome at a peak power level under 5 megawatts; the resultant heating will severely limit the average power capability of the window.
FIG. 4. Thick ceramic plug window in rectangular waveguide.
b. Thick plug (9). Another form of output window is the thick plug shown in Fig. 4. Where the thin disk depends for cooling principally upon convection by the air, the thick plug is cooled principally by conduction to the outside surface of the waveguide. The plug can be matched to the waveguide by steps or tapers at the end of the plug, or by adding metal matching structures, which can serve the dual purpose of shielding the window from stray electrons from the electron beam. This window is excessively bulky at 3000 Mc, but has successfully transmitted peak power over a megawatt at 10,000 Mc, at an average power of two kw.
c. Conical window. The conical window is an attractive form because it combines a thin wall and a large surface area with mechanical strength to withstand pressure. If heating by dielectric losses were the limit on performance, these characteristics would all be of value. A disadvantage of the conical window is that it is expensive to fabricate, and difficult to hold tolerances on circular symmetry and uniformity of wall thickness.
FIG. 5. Ceramic cone window mounted in circular waveguide, with transitions to rectangular waveguide.
To use a conical window in a rectangular waveguide, transitions must be employed, as shown in Fig. 5. Structures of this kind are troubled by spurious resonances, which can be excited easily by asymmetries or misalignments of the ceramic cone. It has been found necessary in some instances to introduce lossy structures such as metal films on dielectric plates to suppress these spurious resonances. When care is used in design and fabrication, these conical windows have been successfully used for power levels of several megawatts peak and several kilowatts average at frequencies from 3000 to 10,000 Mc.
FIG. 6. Transverse disk window mounted in a short section of circular waveguide with abrupt transitions to rectangular waveguide.
d. Transverse disk. The simple transverse disk shown in Fig. 6 has been very successfully used on a number of high-power tubes. The transition between rectangular and round waveguide is abrupt, but if the dimensions
are properly chosen and the window designed as a filter, the reflection will be very low over nearly the entire frequency band of the rectangular waveguide. At megawatt power levels, the canted disk and the cone described in previous sections are troubled by electron bombardment, as evidenced by a pinkish fluorescent glow which can be observed through the translucent ceramic and by heating of the window. The visible glow and the heating are reduced with the transverse disk design. Spurious resonances will be encountered in the window structure and can be excited by asymmetries or poor alignment of the waveguides. Unless considerable care is taken in the construction and alignment of the window assembly, the power handling capacity can be greatly reduced at the frequencies of the spurious resonances. This window design has successfully transmitted over 10 megawatts peak power at 3000 Mc, although at higher peak powers it is subject to puncture failures, as are other designs.
Windows of this design have been extensively tested for average power handling capacity at frequencies near 10,000 Mc, and the temperature rise of the windows has been measured as a function of the power transmitted. With optimum dimensions and the best materials, the measured temperature rise can be less than one degree centigrade per kilowatt of average power transmitted. Windows of ceramics substantially lossier than the best have successfully transmitted over 20 kw of average power without forced air cooling. The conclusion that can be drawn from these experiments is that with a transverse disk window of optimum dimensions and materials, heating by dielectric losses is no longer a significant limitation to the maximum power performance of microwave tubes, for the present state of the art.
Output windows still offer problems and can limit tube performance, but heating by electron bombardment, or multipactor, or puncture failures are more likely to be the phenomena that are troublesome. The limits set by these phenomena depend upon peak power as well as average power transmitted.
III. PROGRESS IN HIGH-POWER KLYSTRON DESIGN
A. High Gain
It has long been recognized that the gain of a klystron amplifier can be increased indefinitely by adding additional cavities along the beam. Early attempts to build high-gain amplifiers with more than three cavities were not successful because of regeneration and instability, principally from secondary electrons that traveled backwards through the drift tube from the output cavity toward the input. With the development of well-formed electron beams, there was low interception on the drift tube and cavities, and correspondingly reduced emission of secondary electrons. Also, with
high current density beams, the collector is much larger in diameter than the drift tube, because the beam must be allowed to expand to a lower power density before it strikes the surface of the collector. The collector then forms a natural trap for secondary electrons. As a result, it has been possible to design modern high-power klystrons with more than three cavities, and thereby greatly increase the gain that can be obtained from a single amplifier tube. A four-cavity tube will typically exhibit gain of the order of 50 to 60 db when the cavities are synchronously tuned, and with additional cavities even higher gain can be obtained. With six-cavity amplifiers, stable gain in excess of 110 db has been observed. This is a far higher value than is practical for most applications. With higher gain, the shot noise of the beam will be increasingly amplified, and the signal-to-noise ratio of the output will be reduced correspondingly (although compression effects at saturation will reduce amplitude modulation noise). Also, to obtain stability, extreme care must be taken to prevent rf leakage between transmission line connectors at the input and output. There is also a possibility of feedback of harmonics through the drift tubes, which are normally so small as to be cutoff waveguides for the operating frequency. Feedback of high frequencies through the drift tube has been troublesome for some low-frequency klystrons with closely spaced cavities.
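Two of the points above lend themselves to quick arithmetic: how little drive power a very high gain implies, and why a drift tube that is cut off at the operating frequency may still pass harmonics. The sketch below is illustrative only. It assumes an empty circular drift tube whose lowest mode is TE11 (cutoff at 1.8412 c / (2 pi a)), ignores the beam, and uses a hypothetical tube radius and output power that do not come from the text.

```python
import math

C0 = 299_792_458.0   # speed of light in vacuum, m/s
TE11_ROOT = 1.8412   # first root of J1'(x), sets TE11 cutoff in circular guide


def drive_power_w(output_power_w, gain_db):
    """Input power (W) needed for a given output power and gain in db."""
    return output_power_w / 10.0 ** (gain_db / 10.0)


def te11_cutoff_hz(radius_m):
    """Cutoff frequency (Hz) of the TE11 mode in an empty circular tube."""
    return TE11_ROOT * C0 / (2.0 * math.pi * radius_m)


def cutoff_attenuation_db_per_m(freq_hz, radius_m):
    """Attenuation (db/m) of the TE11 mode below cutoff; 0.0 if it propagates."""
    fc = te11_cutoff_hz(radius_m)
    if freq_hz >= fc:
        return 0.0
    alpha = (2.0 * math.pi * fc / C0) * math.sqrt(1.0 - (freq_hz / fc) ** 2)
    return 8.686 * alpha  # nepers to db


# Hypothetical example: 1 MW output, 1 cm drift tube radius, 3000 Mc operation.
for gain in (50.0, 60.0, 110.0):
    print(f"{gain:5.0f} db gain: drive power {drive_power_w(1e6, gain):.2e} W")
for harmonic in (1, 2, 3):
    f = harmonic * 3e9
    att = cutoff_attenuation_db_per_m(f, 1e-2)
    state = "propagates" if att == 0.0 else f"{att:.0f} db/m"
    print(f"harmonic {harmonic} ({f / 1e9:.0f} Gc): {state}")
```

With these example dimensions the fundamental and second harmonic are strongly attenuated in the drift tube while the third harmonic lies above cutoff, which shows how a harmonic feedback path can exist. At 110 db gain the required drive is so small that input-to-output leakage must be suppressed by more than the gain to keep the amplifier stable.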
B. Broadbanding
It has also been recognized in recent years that if additional cavities are included in a klystron amplifier, they can be stagger-tuned to increase the bandwidth in a manner somewhat analogous to an intermediate-frequency amplifier (10, 11, 12). Much emphasis has been placed on broadband klystrons in recent years, and very substantial improvements have resulted. The design of a broad-band klystron can be broken down into several parts that can be considered rather separately. The output cavity must have a bandwidth at least equal to the desired bandwidth of the complete klystron. All of the other cavities, including the input cavity, can be considered as a driver section. The function of this driver section is to receive an input signal over the entire operating frequency band, and amplify it so as to deliver the maximum possible driving current to the output cavity.
To consider first the output cavity, it must be recognized that the rf voltage developed across the output gap cannot greatly exceed the potential difference across which the beam was initially accelerated. If the peak rf voltage becomes too high, electrons in the beam will be reflected by the rf field in the gap of the output cavity. If the reflected electrons become substantial in number, the effective rf driving current is reduced. An
optimum output cavity is therefore designed with a gap impedance that matches the characteristics of the current supplied by the driver section. If the gap impedance is too high, and substantial numbers of electrons are turned around, the efficiency is reduced. Normally, the two parts of the klystron are considered separately. The output circuit is designed with the assumption that the driver section will supply constant rf current at all frequencies within the band. The driver section is then designed to supply this constant rf current to the output gap, insofar as possible. It can be shown (11) that if the output cavity is a simple resonant circuit designed to give maximum efficiency at band center, its fractional bandwidth will be given by
where Δf/f is the fractional bandwidth, V0 is the dc beam voltage, V is the peak rf voltage at the output gap, η is the efficiency, Rsh/Q the characteristic impedance of the output cavity, K the perveance, and P the output power. This then becomes a limiting bandwidth of the tube, to a first approximation. It has been pointed out (12) that this limiting bandwidth can be substantially increased if the output cavity is not a simple resonant circuit, but is designed as a filter to match the rf impedance of the beam at the gap to the impedance of the output transmission line. A simple form of such a filter is a double resonant circuit, or pair of coupled cavities, with the beam current driving the first cavity and the output transmission line coupled to the second cavity. Using a lumped constant equivalent circuit, with coupling coefficients properly chosen, the band of frequencies over which a given power level can be exceeded is doubled by this arrangement, as compared to a simple resonant cavity. More complex filter networks can further increase the bandwidth, but a theoretical limit of π times the bandwidth of a simple cavity is approached as the filter network is made arbitrarily complex.
The driver section of the tube, consisting of all cavities with the exception of the output cavity, is usually treated as a separate problem in the design of the tube. With small input signals, the driver section is approximately linear, and a linear analysis has been shown to be a good approximation to the small-signal behavior. When the cavities in the driver section are stagger-tuned for bandwidth, the analysis is similar to that of a stagger-tuned i.f. amplifier, but more complex. The additional complexity arises from the fact that the voltage at each cavity influences the current not just at the cavity immediately following, but at all the following cavities. It is as if each stage in the i.f. amplifier drove all the stages that followed.
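A bandwidth relation for the simple resonant output cavity can be sketched from the quantities defined above. The following is a reconstruction under stated assumptions and may differ by a numerical factor from the expression given in (11): the gap is taken to be loaded by a resistance R_L at resonance, the output power is P = V^2 / (2 R_L), the loaded Q is Q_L = R_L / (R_sh/Q), the fractional bandwidth is 1/Q_L, and the output power is P = \eta K V_0^{5/2}. The symbols R_L and Q_L are introduced here for illustration only.

```latex
% Assumptions (not from the source): R_L is the loaded gap resistance,
% P = V^2 / (2 R_L), Q_L = R_L / (R_{sh}/Q), and P = \eta K V_0^{5/2}.
\frac{\Delta f}{f} \approx \frac{1}{Q_L}
  = \frac{R_{sh}}{Q}\,\frac{2P}{V^{2}}
  = 2\,\eta\,K\,V_0^{1/2}\,\frac{R_{sh}}{Q}\left(\frac{V_0}{V}\right)^{2}
```

In this form the bandwidth grows with perveance, with the characteristic impedance of the cavity, and with the square root of the beam voltage, and it falls as the peak rf gap voltage is raised relative to the beam voltage.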
To carry out the analysis, the space charge wavelength along the beam in the drift tube must be known. The mathematical analysis is greatly simplified if the cavities are spaced apart a quarter space charge wavelength, but the analysis has limited practical value because a tube designed with this spacing will have seriously reduced efficiency. The more complex problems of drift spaces different from a quarter wavelength are readily attacked with an analog computer. The analog computer has proven to be a very useful tool in the analysis of stagger-tuned klystron amplifiers (13), and good agreement has been obtained between analog computer calculations and measured results with experimental tubes.
To maximize the bandwidth, the individual cavities in the driver section should have minimum capacitance and maximum inductance. The quarter space charge wavelength between cavities will give the maximum small-signal gain-bandwidth product for a given number of cavities in the driver section, but to obtain reasonable efficiency, the spacing between the final cavity and the preceding cavity must be reduced, at some sacrifice in the gain-bandwidth product. It may also be desirable, to flatten the frequency response, for the individual cavities in the driver section to have different Q’s. The Q’s of the individual cavities can be decreased by adding artificial loading or by lengthening the gap so as to increase the effective electronic loading by the beam.
Clearly, in designing a driver section for a broad-band klystron, a great many variables are available to the designer. These include the number of cavities, the inductance, capacitance and losses of each of the individual cavities, the electron transit time through the gaps of the individual cavities which changes the electronic loading, and the drift distances between the individual cavities.
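The remarks that each cavity voltage affects the current at all following cavities, and that quarter-wavelength spacing simplifies the analysis, can be illustrated with a toy small-signal model. It assumes, as in elementary space-charge-wave theory, that the rf current produced a distance z downstream of a modulating gap varies as sin(2πz/λ_q), where λ_q is the space charge wavelength. The function name and the uniform spacing are hypothetical, and gap coupling, cavity impedances, and phase factors are all omitted.

```python
import math

def coupling_weights(n_cavities, spacing_in_wavelengths):
    """Relative transfer weight from cavity k to each later cavity j.

    Toy model (an assumption, not the author's analysis): the rf current
    excited at cavity j by the gap voltage at cavity k is proportional to
    sin(2*pi*(z_j - z_k)/lambda_q) for uniformly spaced cavities.
    Returns a dict {(k, j): weight} for all k < j.
    """
    weights = {}
    for k in range(n_cavities):
        for j in range(k + 1, n_cavities):
            distance = (j - k) * spacing_in_wavelengths
            weights[(k, j)] = math.sin(2.0 * math.pi * distance)
    return weights

quarter = coupling_weights(4, 0.25)   # quarter space charge wavelength spacing
short = coupling_weights(4, 0.15)     # a shorter drift length, for comparison

for pair in sorted(quarter):
    print(pair, round(quarter[pair], 3), round(short[pair], 3))
```

With quarter-wavelength spacing the weight between cavities two apart vanishes in this model, so some of the feed-forward terms drop out; with any other spacing every pair of cavities contributes.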
Any chosen configuration within a wide range is amenable to reasonably accurate small-signal analysis, particularly with the aid of an analog computer. For a practical tube, the large-signal effects are important because they control the power output and efficiency over the operating frequency band. These large-signal effects have not yet been treated adequately by a mathematical analysis, and the tube designer must depend heavily upon results with experimental tubes for information on large-signal and saturation effects.
It is generally agreed, as a result of experiment and some theory, that for high efficiency the spacing between the output cavity and the preceding cavity should be substantially less than a quarter space charge wavelength. It also seems desirable for maximum efficiency that the drift distance between the final two cavities of the driver section should also be substantially less than a quarter space charge wavelength, although the evidence is less conclusive on this point. These cavities that immediately precede the output cavity should be tuned to a frequency substantially higher than the
operating frequency of the tube. The beneficial effect these final cavities of the driver section have on efficiency can be explained qualitatively as follows. The bunches in the electron beam have been fairly well formed before the beam reaches these cavities. If the cavities are tuned substantially higher than the operating frequency, the voltage developed across the gaps in these cavities will be 90 degrees out of phase with the rf current passing through the cavities. The effect of this rf voltage will be to slow down the electrons passing through the gap ahead of the bunches, and to accelerate the electrons passing through following the bunches. As a result, more electrons will be gathered into the bunches, and the bunches will be shortened or tightened. This will increase the rf current in the electron beam, and increase the efficiency of the tube.
It is apparent from Eq. (1) that high power, high efficiency, and high perveance all contribute to wide bandwidth. At one time it was feared that with a high-perveance solid beam, space charge debunching effects would seriously reduce the efficiency of a klystron amplifier. This may eventually prove to be the case, but solid-beam klystron amplifiers have been designed with perveance K = 2.5 × 10^-6 I/V^(3/2) with no significant decrease in efficiency compared to lower perveance designs. Efficiencies substantially higher than 40% have been measured with this high perveance, with confined convergent-flow electron beams. Half-power bandwidths greater than 12% have been achieved with high-power solid-beam klystrons.
Further improvement in bandwidth should result from the use of beams of still higher perveance. Hollow electron beams can be formed with perveance K = 10 or greater. Calculations show that for a megawatt klystron with a K = 10 beam, it is reasonable to expect bandwidths of 15% or greater (13).
It has not yet been demonstrated experimentally that this performance can be achieved together with reasonable efficiency, but if the experiment should prove successful, this will be a major advance in the klystron art.
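The role of perveance can be made concrete with the defining relation I_0 = K·V_0^(3/2) and the power balance P = η·I_0·V_0. The sketch below is illustrative arithmetic rather than a design procedure. The 1-megawatt, 40%-efficiency operating point and the function name are assumptions chosen for the example, and K is taken in units of 10^-6 amp/volt^(3/2).

```python
def beam_operating_point(output_power_w, efficiency, micropervs):
    """Return (beam voltage in volts, beam current in amperes).

    Uses I0 = K * V0**1.5 and P_out = efficiency * I0 * V0,
    with K supplied in units of 1e-6 A/V**1.5.
    """
    k = micropervs * 1e-6
    v0 = (output_power_w / (efficiency * k)) ** 0.4
    i0 = k * v0 ** 1.5
    return v0, i0

for k_micro in (2.5, 10.0):
    v0, i0 = beam_operating_point(1e6, 0.40, k_micro)
    print(f"K = {k_micro:4.1f}: V0 = {v0 / 1e3:6.1f} kv, I0 = {i0:6.1f} amp, "
          f"beam impedance = {v0 / i0:7.1f} ohms")
```

For the same output power, raising the perveance lowers both the beam voltage and the beam impedance V_0/I_0, which is the sense in which a high-perveance beam is easier to match over a wide band.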
6. High Power
There are several important fundamental reasons why axial-beam tubes, such as klystrons and traveling-wave tubes, are inherently capable of extremely high power at microwave frequencies. It is an expensive and time-consuming task to increase the maximum power capability of microwave tubes, but it appears certain that no fundamental limits have as yet been even closely approached. One important advantage of axial-beam tubes is the fact that the formation of the electron beam is in a region separate from the region where the beam interacts with the rf circuits in the tube. Cathode emission is not affected by rf fields, and the cathode is not subject to bombardment
by electrons that have been accelerated by the rf fields. It is possible to design electron guns with high convergence, so that high-density electron beams can be formed from large cathodes operating at low emission densities.
An even more important advantage of axial-beam tubes is that with careful beam design, only a very small percentage of the beam current is intercepted by the rf circuit, and only a very small fraction of the beam power must therefore be dissipated on the rf circuit. With carefully designed modern klystrons, substantially less than 1% of the beam power is dissipated as a result of current interception by the rf circuit and drift tubes. With a crossed-field device, the efficiency may be relatively high, 70% or higher, but the remaining beam power is nearly all dissipated on the rf circuit of the tube. The rf circuit is relatively limited in dimensions and difficult to cool adequately. With an axial-beam tube, the beam can be allowed to expand in diameter after passing through the rf circuit, and not be intercepted by the collector surface until it has spread to a point where the power density is conveniently low.
An experimental klystron amplifier designed by Louis Zitelli at Varian Associates has produced over 20 kw of continuous power at a frequency of approximately 10,000 Mc. The tube was carefully designed and constructed, but no heroic measures were necessary to provide cooling to the cavities, drift tubes, or collector. The cathode operated at an emission density of 3.5 amp/cm² of area, high but not excessive for impregnated cathodes. The electron gun was designed for an area convergence of only 36:1. The output window was a transverse ceramic disk, and was operated without air cooling. It is clear from the results obtained with this experimental tube that substantially higher power levels can be achieved without radical modification of the design techniques that were used.
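The gun figures quoted for the 20-kw tube imply the beam current density directly, if the cathode current is assumed to be conserved and uniformly distributed over a beam cross section smaller than the cathode by the area convergence. That assumption is an idealization, and the helper name is hypothetical.

```python
def beam_current_density(cathode_density_a_per_cm2, area_convergence):
    """Beam current density for a convergent gun, assuming all cathode
    current is delivered into a beam whose cross section is smaller than
    the cathode area by the factor area_convergence."""
    return cathode_density_a_per_cm2 * area_convergence

j_beam = beam_current_density(3.5, 36.0)   # quoted cathode loading and convergence
print(f"beam current density ~ {j_beam:.0f} amp/cm^2")
```

A higher area convergence would allow the same beam density with a more lightly loaded cathode, which is the point of the high-convergence guns mentioned in the text.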
Very large increases in power level can undoubtedly be achieved if more radical techniques of cooling are resorted to. If the results achieved with this experimental tube are scaled to lower frequencies where physically larger structures can be used, enormous amounts of power can be anticipated.
Some examples of modern high-power klystrons are shown in Figs. 7, 8, and 9. The klystron in Fig. 7 is a 2-kw c-w amplifier that operates at a frequency of approximately 8 kMc. It is a four-cavity amplifier, with power gain of approximately 50 db, and efficiency of about 30%. In operation, the tube is mounted in a yoke electromagnet which provides an axial magnetic field to focus the beam. The construction is typical of high-power tubes in this frequency range. The cavities are milled from a solid block of copper, in which water passages are also located to provide body cooling
and frequency stabilization. Metal-ceramic construction is used throughout. Waveguides provide rf connections at input and output.
The klystron in Fig. 8 is a typical example of a wide-band, high-power S-band amplifier. The tube illustrated delivers over 4 megawatts of power over a frequency band slightly greater than 5% of the center frequency of
FIG. 7. Two-kilowatt c-w klystron amplifier for 8 kMc (courtesy of Varian Associates).
2800 Mc. Efficiency at the 4-megawatt level is 35%; drive power required is 150 w. Peak efficiency is 48%. The output cavity is a double-tuned cavity, and the driver section consists of six additional cavities, stagger-tuned to provide the necessary bandwidth.
FIG. 8. Multicavity, broad-band klystron amplifier for 3 kMc (courtesy of Varian Associates).
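The figures quoted for the S-band tube fix its power gain and beam power at the 4-megawatt level. The short calculation below is plain arithmetic on those figures; the variable names are illustrative, and circuit losses are neglected in the estimate of unconverted beam power.

```python
import math

output_power_w = 4.0e6     # quoted output power
drive_power_w = 150.0      # quoted drive power
efficiency = 0.35          # quoted efficiency at the 4-megawatt level

gain_db = 10.0 * math.log10(output_power_w / drive_power_w)
beam_power_w = output_power_w / efficiency
unconverted_power_w = beam_power_w - output_power_w   # beam power not converted to rf

print(f"power gain ~ {gain_db:.1f} db")
print(f"beam power ~ {beam_power_w / 1e6:.1f} megawatts")
print(f"unconverted beam power ~ {unconverted_power_w / 1e6:.1f} megawatts")
```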
A lower frequency klystron is illustrated in Fig. 9. This four-cavity amplifier can be tuned over the frequency range 400-450 Mc. Peak power output is 1.25 megawatts, average power 75 kw, pulse length 2 millisec, efficiency approximately 40%. At this low frequency, the klystron becomes
very large; the tube illustrated is over 10 ft long. But because of this large physical size, the structure is able to dissipate a large amount of heat. In the interior of the tube, the power dissipation per unit area is much smaller than for the higher frequency klystron shown in Fig. 7. For this reason, the
FIG. 9. High-power uhf klystron amplifier (courtesy of Varian Associates).
klystron of Fig. 9 does not approach the ultimate power capabilities of klystrons in this relatively low frequency range. Average power a great deal higher can be generated by designing the tube with a larger collector and cathode, and with better cooling for the interior portions of the tube, such as drift tubes and tuners.
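The pulse figures quoted for the uhf klystron of Fig. 9 can be related through the simple duty-cycle relation, average power = peak power × duty, which assumes rectangular pulses. The repetition rate computed below follows from that assumption and is not stated in the text; the variable names are illustrative.

```python
peak_power_w = 1.25e6      # quoted peak power output
average_power_w = 75e3     # quoted average power
pulse_length_s = 2e-3      # quoted pulse length

duty = average_power_w / peak_power_w          # fraction of time the tube is pulsed on
repetition_rate_hz = duty / pulse_length_s     # pulses per second, rectangular pulses assumed
energy_per_pulse_j = peak_power_w * pulse_length_s

print(f"duty = {duty:.3f}, repetition rate = {repetition_rate_hz:.0f} pulses/sec, "
      f"energy per pulse = {energy_per_pulse_j:.0f} joules")
```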
IV. PROGRESS IN HIGH-POWER TRAVELING-WAVE TUBE DESIGN (14)
A. Comparison of Traveling-Wave Tubes and Klystrons
The advances that have been made in recent years in cathode technology, electron beam formation, and output window design are as important to traveling-wave tubes as to klystrons. To a large extent, cathodes, collectors, and windows are interchangeably used between the two tube types. Similar techniques of beam focusing are used. As a result, there is frequently more than a superficial external resemblance between high-power versions of the two tube types.
The important internal difference is, of course, the circuit that interacts with the electron beam. In the klystron, the beam interacts with the fields in a string of microwave cavities arranged along the beam. With no electron beam, there is negligible coupling between adjacent cavities. A signal injected into the input will not progress through the structure further than the input cavity. A signal reflected from the output load will not be transmitted in a reverse direction beyond the output cavity. The string of cavities forms a nonpropagating structure.
With a traveling-wave tube, a propagating structure is used. In the absence of a beam, a signal injected into the input will propagate along the structure toward the output; a signal reflected from the output load will proceed through the structure toward the input. With the beam on, the velocity of the electrons is approximately equal to the phase velocity of the circuit along which the beam travels, and there is continuous interaction between the electrons in the beam and the fields of the propagating circuit. The fact that a propagating circuit is used is responsible for the principal advantage that a traveling-wave tube has to offer, electronic bandwidth.
Characteristically, the most important feature of the traveling-wave tube amplifier from the point of view of the user of the tube is its bandwidth, the range of frequencies over which amplification can be obtained without mechanical adjustment or tuning of the tube. As discussed above, substantial advances have been made in recent years in the bandwidth of klystron amplifiers, but there is usually a limit above which the traveling-wave tube offers better performance. For high-power tubes, the bandwidth advantage of traveling-wave tubes is by no means as great as it is for low-power tubes. As the peak power level increases, the klystron bandwidth improves, but the traveling-wave tube must use circuits other than the simple helix, at a sacrifice in bandwidth. It is not possible to define a limiting bandwidth, below which the klystron is preferred and above which the traveling-wave tube offers better performance. At a frequency of approximately 3000 Mc, and a power level of a megawatt, the boundary is at about
5% bandwidth, for the current state of the art. As the peak power increases or the frequency decreases, the klystron has the advantage over somewhat wider bandwidths. Conversely, for lower powers or higher frequencies, the bandwidth capabilities of the klystron are reduced. In the future, as the state of the art advances, improvements in the performance of both tube types can be expected, and it is difficult to forecast what will happen to the bandwidth crossover.
For the wider bandwidth of the traveling-wave tubes, the user must usually pay a price in gain per unit length and efficiency. A general theorem has been developed by Muller (15) which states that if a nonpropagating structure is converted into a propagating one by increasing coupling between the resonant cavities, the gain-bandwidth product per unit length will increase. With practical tubes, however, the high-power traveling-wave tube usually has lower gain per unit length of the interaction structure than does the klystron, even the broad-band klystron.
The propagating characteristics of the interaction structure are also responsible for a class of problems that must be overcome by the designer of the traveling-wave tube. These are the problems of stability. A signal reflected from a mismatch at the output of the tube will be reflected back through the propagating circuit toward the input; the resulting regeneration can result in instability and oscillation. The reflection of energy can result from an imperfect match at the transition between the slow-wave structure and the output transmission line, or from the load that terminates the output transmission line. Because the tube will amplify over a broad band of frequencies, these matches must be good over the full frequency band. The higher the gain of the tube in the forward direction, the higher must be the attenuation in the reverse direction to insure stability.
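The stability requirement can be expressed as a round-trip (loop) gain condition. The sketch below assumes the usual feedback picture, which the text describes only qualitatively: a wave amplified in the forward direction is partly reflected at the output, attenuated on the return trip through the circuit, and partly reflected again at the input, and regeneration cannot build up if the loop gain in db is negative. The function name and all numerical values are hypothetical.

```python
def loop_gain_db(forward_gain_db, reverse_loss_db,
                 output_return_loss_db, input_return_loss_db):
    """Round-trip gain in db for a wave circulating once through the tube.

    All arguments are positive numbers in db. A negative result means the
    reflected signal returns weaker than it started, so no regenerative
    build-up occurs in this simple model.
    """
    return (forward_gain_db - reverse_loss_db
            - output_return_loss_db - input_return_loss_db)

# Hypothetical example: matches giving 10 db return loss at each end and
# 6 db of circuit loss on the return trip.
for gain in (13.0, 30.0):
    lg = loop_gain_db(gain, 6.0, 10.0, 10.0)
    status = "stable" if lg < 0.0 else "needs added attenuation"
    print(f"forward gain {gain:4.1f} db: loop gain {lg:+5.1f} db -> {status}")
```

In this picture, every added decibel of forward gain must be offset by a decibel of reverse attenuation or of improved match, which is why high-gain tubes require loss to be introduced deliberately.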
With relatively low-gain tubes, 10 to 13 db gain, reasonable stability can be obtained without introduction of additional attenuation into the slow-wave structure. With higher gain, additional attenuation (center loss) is normally required. For high-power, high-gain tubes, it has become customary to split the slow-wave circuit into two parts, roughly in the middle. This is shown schematically in Fig. 10. The input section is terminated in approximately
FIG. 10. High-power TWT, showing severed circuit for high gain with stability. (Schematic labels: input, beam, slow-wave circuit, slow-wave circuit terminations, output.)
its characteristic impedance at the point where the slow-wave circuit is severed. The output section of the slow-wave circuit is also terminated at the other side of the break. By this technique, the gain of the complete tube can be increased. If each of the two sections is stable, the entire tube will be stable. The gain can be further increased by lengthening the circuit, severing it at additional points, and terminating the slow-wave circuit in its characteristic impedance on both sides of each break. Tubes with as many as five severed sections have been constructed; other tubes with three severed sections have exhibited over 50 db of stable gain.
With high-power tubes, it is necessary that these terminations dissipate substantial amounts of power, and this may present a serious problem of design. Lossy ceramics that maintain their properties at high temperatures are a commonly used material. For very high-power tubes, it can become impractical to attempt to dissipate within the tube itself the power that must be absorbed by the terminations. In this event, the slow-wave structures can each be matched to transmission lines at the severing point. The power that must be dissipated can be carried out of the tube proper by these transmission lines, and dissipated in dummy loads that are external to the tube. This is illustrated in Fig. 11.
FIG. 11. High-power TWT, with severed circuits terminated external to the tube, for higher power dissipation. (Schematic labels: input, slow-wave circuit, slow-wave circuit terminations, output.)
The tube designer must also keep in mind the fact that the propagating circuits that are used are generally also capable of propagation in modes other than the desired mode, at frequencies that are usually outside of the desired passband of the tube. Spurious oscillations are possible if the velocity of the beam is such as to permit interaction with these other modes of propagation, and various means must be resorted to in order to prevent these oscillations. Distributed attenuation that is selective for the undesired modes may be necessary.
B. Circuits for High-Power Traveling-Wave Tubes
1. Helix-Derived Circuits. The helix is a favorite circuit for low-power traveling-wave tubes, because it combines very wide bandwidth with good
interaction impedance. As the power level is raised, and the operating voltage is increased, the pitch and diameter of the helix must also be increased. This becomes impractical at voltages much greater than 10 kv, because an increasingly greater percentage of the energy will be carried in the space harmonics, instead of in the desired fundamental mode of propagation. This will reduce the gain per unit length of the tube, and more seriously, enhance the possibility of oscillations resulting from interaction between the beam and the space harmonics of the helix.
It has been shown by Chodorow and Chu (16) that the situation at high-power levels can be greatly improved by using a cross-wound helix. For a large-diameter structure, the energy transmitted in the undesired space harmonics will be greatly reduced, and stability for the desired mode of operation will be correspondingly enhanced. The topological equivalent of a cross-wound helix is the ring-and-bar circuit, shown in Fig. 12.
FIG. 12. Ring-and-bar circuit for TWT’s.
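The need to enlarge the helix pitch as the voltage rises follows from the synchronism condition: the axial phase velocity of the circuit must stay close to the electron velocity, which grows with beam voltage. The sketch below computes the relativistic electron velocity for several voltages. The estimate of the helix pitch uses the common approximation that the wave follows the wire at the velocity of light, so that the axial phase velocity is c times the sine of the pitch angle; that approximation and the function names are assumptions made for this example.

```python
import math

ELECTRON_REST_ENERGY_EV = 510_998.95   # electron rest energy in electron volts

def electron_velocity_over_c(beam_voltage_v):
    """Relativistic electron velocity, as a fraction of c, after
    acceleration from rest through beam_voltage_v volts."""
    gamma = 1.0 + beam_voltage_v / ELECTRON_REST_ENERGY_EV
    return math.sqrt(1.0 - 1.0 / gamma ** 2)

def pitch_to_circumference(beam_voltage_v):
    """Approximate helix pitch/circumference ratio for synchronism,
    assuming sin(pitch angle) equals the electron velocity over c."""
    beta = electron_velocity_over_c(beam_voltage_v)
    return beta / math.sqrt(1.0 - beta ** 2)   # tangent of the pitch angle

for kv in (1, 10, 100):
    volts = kv * 1e3
    print(f"{kv:4d} kv: u/c = {electron_velocity_over_c(volts):.3f}, "
          f"pitch/circumference ~ {pitch_to_circumference(volts):.3f}")
```

The required pitch ratio rises steadily with voltage, so a helix that stays synchronous with a high-voltage beam is necessarily coarse, which is the regime in which the text notes that the space harmonics carry a growing share of the energy.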
For high-powered tubes, there is a problem in carrying away the heat that is developed in a ring-and-bar circuit by rf heating or interception of the electron beam. For relatively modest powers, this can be done by glazing the structure to dielectric support rods, of materials such as synthetic sapphire, and shrinking this into an outer metal envelope (Fig. 13). Heat will be conducted out from the structure through the support rods. At higher power levels, the ring-and-bar circuit can be supported by metal quarter-wave supports, of higher thermal conductivity. The effect of these supports upon the interaction impedance is shown in Fig. 14. It is seen that the impedance is increased, but the bandwidth is reduced. For very high peak powers, the circuit becomes relatively massive, and the cross section of the supporting stubs can become large enough to permit interior liquid-cooling channels. With the additional cooling provided by these channels, very large average powers can be handled. At relatively low frequencies, in the several hundred megacycle range, and at peak power levels in the megawatt range, the dimensions of the ring-and-bar circuit become large enough to permit the circuit to be fabricated from tubing, so that it can be directly liquid cooled. Under these conditions, it becomes theoretically possible to generate many tens of kilowatts of average power.
2. Coupled-Cavity Circuits. Another class of circuits that are suitable for high-power traveling-wave tubes is derived from a disk-loaded waveguide, or a chain of coupled cavities (17, 18, 19). A waveguide of circular,
FIG. 13. Ring-and-bar circuit, supported by glazed sapphire rods.
FIG. 14. Interaction impedance of ring-and-bar circuit, with and without quarter-wave metallic stub supports. (Plot of interaction impedance, scale 0 to 200, against frequency in kMc.)
cylindrical cross section has a phase velocity greater than the velocity of light, but the phase velocity can be reduced by loading disks, periodically spaced along the waveguide (Fig. 15). The disk-loaded waveguide becomes a chain of cavities, coupled together through the holes in the center of the loading disks. It is a propagating circuit, but has a relatively narrow
bandwidth or passband, and is not suitable for a wide-band amplifier. The passband can be substantially increased by providing additional magnetic coupling between adjacent cavities, of a polarity corresponding to negative mutual inductance coupling in the equivalent circuit. The lower cutoff frequency of the passband is thereby reduced. One means of accomplishing this result is the clover-leaf circuit, invented by Chodorow
FIG. 15. Disk-loaded circular waveguide is basic slow-wave structure.
(18). This is illustrated in Fig. 16. This circuit has impedance characteristics as shown in Fig. 17. The interaction impedance is relatively high, so relatively good efficiency and gain per unit length can be obtained. Considerable care must be taken to suppress spurious oscillations. A number of successful tubes have been constructed using this circuit. One of these that operates at a frequency of about 3000 Mc is shown in Fig. 18; this should
FIG. 16. Clover-leaf slow-wave circuit is modification of disk-loaded waveguide with increased bandwidth.
be compared with the klystron of Fig. 8. This traveling-wave tube has a half-power bandwidth greater than 10%, with peak power output of over 3 megawatts at an efficiency of over 40%. Average power output is limited to 6 kw by dissipation in the lossy ceramics which terminate the slow-wave structure where it is severed in the center of the tube. Higher average power could be developed if the terminations were external to the tube itself.
3. Folded-Line Circuits. A third class of suitable circuits are those derived from a folded waveguide. Two circuits which have been derived from a folded line are the “Dutch door” circuit of Fig. 19 and the “Hines” circuit of Fig. 20, named after its inventor. Successful high-power tubes have been constructed using both of these circuits. It is characteristic of these circuits that the fundamental mode of propagation is a backward
FIG. 17. Interaction impedance of clover-leaf circuit.
wave. They are normally operated in the first space harmonic, which is a forward wave. Bandwidth can be relatively large, approaching 25%, but the interaction impedance is relatively modest. The efficiency is not particularly high, typically in the 10-25% range, or slightly higher for reduced
FIG. 19. “Dutch-door” circuit has fundamental backward wave. (Drawing labels: cathode, copper septum, waveguide match; section A-A.)
FIG. 20. “Hines” circuit has good bandwidth, rather low interaction impedance.
bandwidths. Thermal dissipation characteristics can be relatively good, so that high average power can be developed. The Dutch-door circuit is well adapted to periodic magnetic focusing, by supporting the drift tubes on iron septums, but the average power capacity will be reduced by the
relatively poor thermal conductivity of the iron.
These circuits described above are by no means a complete list of the various circuits that can be used in high-power traveling-wave tubes. They have been chosen only as typical examples of the present state of development.
In summary, it can be stated that the development of high-power traveling-wave tubes is in a relatively early stage. There has not yet been an opportunity to test experimentally many of the obvious ideas for improving the performance and power capacity of these tubes. It is reasonable to expect that many important advances will be demonstrated in the next several years.
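The passband behavior of the coupled-cavity and folded-line circuits described above can be illustrated with the standard nearest-neighbor model of a chain of identical resonators. The dispersion relation used below, f = f_0·sqrt(1 - k·cos φ), with φ the phase shift per cavity and k a coupling coefficient, is a textbook idealization assumed for this sketch and is not a result taken from this chapter; the function names and numerical values are hypothetical. In this model the fractional passband is approximately |k|, and the sign of k decides whether the fundamental is a forward wave (frequency rising with φ) or a backward wave (frequency falling with φ).

```python
import math

def chain_frequency(f0, k, phase_shift_rad):
    """Frequency of a chain of identical coupled resonators at a given
    phase shift per cavity: f = f0 * sqrt(1 - k*cos(phi)).
    Idealized nearest-neighbor model, valid for |k| < 1."""
    return f0 * math.sqrt(1.0 - k * math.cos(phase_shift_rad))

def passband(f0, k):
    """Return (lower cutoff, upper cutoff, fractional bandwidth)."""
    edges = sorted((chain_frequency(f0, k, 0.0),
                    chain_frequency(f0, k, math.pi)))
    return edges[0], edges[1], (edges[1] - edges[0]) / f0

def fundamental_is_backward(k):
    """In this model the fundamental (0 < phi < pi) is a backward wave
    when frequency decreases with increasing phase shift, i.e. k < 0."""
    return k < 0.0

for k in (0.02, 0.20, -0.20):
    lower, upper, frac = passband(3000.0, k)   # frequencies in Mc, illustrative
    kind = "backward" if fundamental_is_backward(k) else "forward"
    print(f"k = {k:+.2f}: passband {lower:7.1f} to {upper:7.1f} Mc "
          f"({100 * frac:4.1f}%), fundamental {kind}")
```

Weak coupling, as through the small central holes of a disk-loaded guide, gives a narrow passband in this model, and stronger coupling between adjacent cavities widens it, in line with the qualitative description of the clover-leaf circuit.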
REFERENCES
1. Learned, V., and Veronda, C., Proc. I.R.E. 40, 465-469 (1952).
2. Chodorow, M., Ginzton, E. L., Nielsen, I. R., and Sonkin, S., Proc. I.R.E. 41, 1584-1601 (1953).
3. MacNair, D., Lynch, R. T., and Hannay, N. B., J. Appl. Phys. 24, 1335-1336 (1953).
4. Levi, R., J. Appl. Phys. 24, 233 (1953).
5. Levi, R., J. Appl. Phys. 26, 639 (1955).
6. Susskind, C., Advances in Electronics and Electron Phys. 8, 363-403 (1956).
7. Dow, W. G., Advances in Electronics and Electron Phys. 10, 2-70 (1958).
8. Pierce, J. R., “Theory and Design of Electron Beams.” D. Van Nostrand Co., New York, 1949.
9. Shaw, H. J., and Winslow, L. M., “A Broadband High-Power Vacuum Window for S-Band,” IRE Trans. on Microwave Theory and Tech. MTT-6, No. 3, 326-331 (1958).
10. Kreuchen, K. H., Auld, B. A., and Dixon, N. E., J. Electronics 2, 529 (1957).
11. Dodds, W. J., Moreno, T., and McBride, W. J., Jr., “Methods of Increasing Bandwidth of High-Power Microwave Amplifiers,” 1957 Wescon Convention Record, Pt. 3, pp. 101-110.
12. Beaver, W. L., Jepsen, R. L., and Walter, R. L., “Wide-Band Klystron Amplifiers,” 1957 Wescon Convention Record, Pt. 3, pp. 111-113.
13. Beaver, W., Caryotakis, G., Staprans, A., and Symons, R., “Wide Band High-Power Klystrons,” 1959 IRE Wescon Convention Record, Pt. 3, pp. 103-111.
14. Nalos, E. J., Microwave J. Part I, 2, No. 12, 3-38 (1959); Part II, 3, No. 1, 46-52 (1960). This reference has an excellent bibliography.
15. Muller, M. W., paper presented at the 1st International Valve Conference, Paris, 1956.
16. Chodorow, M., and Chu, E. L., J. Appl. Phys. 26, 33-43 (1955).
17. Chodorow, M., and Nalos, E. J., Proc. I.R.E. 44, No. 5, 649-659 (1956).
18. Chodorow, M., and Craig, R. A., Proc. I.R.E. 45, No. 8, 1106-1117 (1957).
19. Chodorow, M., Nalos, E. J., Otsuka, S. P., and Pantell, R. H., Trans. I.R.E. Professional Group on Electron Devices ED-6, No. 1, 48-53 (1959).
Author Index Numbers in parentheses are reference numbers and are included to assist in locating references when the authors’ names are not mentioned in the text. Numbers in italics refer to the page on which the reference is listed.
A Abele, M., 289, 236 Agejewa, I. N., 71(54), 83 Aigrain, P., 67(48), 83 Akhiezer, A. I., 291, 236 Alvarez, L. W., 169(81), 205 Appel, H., 221, 264 Archard, G. D., 156(62, 63), 204 Arnal, R., 167(73), 804 Auld, B. A,, 314(10), 329
Broser, I., 39(3), 40(9), 55(3), 81 Broser-Warminsky, R., 39(3), 40(9), 55 (3), 81 Brouwer, G., 27(31), 35 Brown, H., 122(31), 204 Brown, W. L., 62(39), 83 Briick, H., 101(14), 116(21), 209 Bruner, J. A., 116, 205 Bube, R. H., 40(7), 53(25), 60(37), 70(53), 82, 83
Bulliard, H., 67(48), 83 Bullock, M. L., 167(75), 170(75), 204 Bullough, R., 57(28), 82 Burfoot, J. C., 141, 156(61), 204
B Balkanski, M., 40(8), 82 Bardeen, J., 60(38), 62(39), 83 Batt, E., 65(44), 83 C Beaver, W. L., 314(12), 315(12), 316(13), Cardona, M., 68(49), 83 317(13), 329 Carlile, R. N., 185, 205 Bell, M., 127(35), 204 Caryotakis, G., 316(13), 317(13), 329 Bennewitz, H. G., 138(50, 52), 139(52), Cerenkov, P., 265, 2.96 204 Chansewarow, R., 71(54), 83 Bernard, M. Y., 88(6, 7), 90(6), 91(6), Charles, D., 207, 663 104(16), 109(16), 124(6), 127(6), 140 Chartier, C., 95(10a), ZU3 (54), 143(56), 149, 171(58), 203, 204 Chartier, G., 182(71), 204 Reyen, W., 41(12), 8.2 Chodorow, M., 299(2), 324(17, 18, 19), Birnbautn, M., 221, 263 326, 329 Blewett, J. P., 86(4), 120, 121(28), 122 Cholet, P., 41(12), 82 (29), 127(4), 203, 204 Chu, E. I,., 324, 829 Bloem, J., 15(11), 23,34,36 Citron, A,, 126, 185, 204,205 Boer, K. W., 39(5), 50(23, 24), 59(32), Coleman, P. D., 291, 292(29), 236 82, 83 Condon, E. U., 244, 264 Bogdankevich, L. S., 275,236 Cork, B., 171(82), 193, 205 Bogner, G., 31(35), 35 Courhet, G., 186, b05 Bolotovskii, B. M., 275,236 Courant, E. D., 86(2, 3), 124, 160, 203, Borchardt, E., 59(32), 83 204 Borchardt, W., 59(32), 83 Craig, R. A,, 324(18), 326(18), 329 Borissov, M., 59(35), 88 Cronemeyer, D. C., 41(12), 82 Bratt, P., 41(12), 82 Brattain, W. H., 60(38), 62(39), 83 D Brebricak, R. F., 1 1 , $4 Dallenbarh, W., 127(37), 204 Bretsrher, M. M., 183(88), 188, 206 Danos, M., 277, 287, 292(31), 293(31), Bromley, D. A., 116, 203 296, 297 Bronca, G., 109(18), 114, 165(18), 203 331
332
AUTHOR INDEX
Davis, H., 41(12), 82 Dayton, I. E., 161, 171(66), 174(66),204 Dember, H., 68, 83 Dhuicq, D., 123(33),204 Dison, N . E., 314(10), 529 Dodds, W. J., 314(11), 315(11), 329 Dow, W. G., 221,264, 302(7), 305,329 Drummond, J. E., 285(22), 2116 Dunn, D. A,, 294(33), 297 Dushin, L. A,, 167(76), 204 Dushman, S., 211(9), 263
E Elliot, R. S., 267(5), 296 Elmore, W. C., 87(5), 127(5), 203 Enge, H . A,, 114, 203
F Fainberg, Y. B., 291 (28), 296 Fan, H . Y., 41(12), 82 Farley, F. J., 185(91), 205 Frank, I. M., 271, 275,296 Frenkel, J., 83 Friedburg, H., 138(51), 204 Fuller, C . S., 12(10), 18, 34 Funfer, E., 221, 264
G
Gribi, M., 167(74), 204 Grivet, P., 143(57), 149, 152(57), 158(57), 162(57), 165(57), 167(57), 171(57, 58), 173(57), 194(57), 196(98), 198(57), 200(57), 202(57), 204, 205 Cross, E. F., 39(3, 4, 5), 55(3), 81, 82 Guggenheim, E., 2(2), 5(3), 34 Guro, G. M., 68(49), 85 Gutjahr, H., 39(5), 82
H Hagedorn, R., 97(11), 203 Haken, H., 39(2), 55(2), 8f Hand, L. N., 98(12), 100(12), 203 Hannay, N. B., 9(5), 34 Harmay, N. B., 301(3), 329 Harrison, A. E., 221,263 Hauffe, K., 2, 15, 34 Hauser, O., 42(13), 82 Heiland, G., 31(34), 55, 62(40), 83 Hcreward, H. G., 119(27), 205 Herzhcrg, G., 244, 264 Hess, K. W., 221, 264 Heyne, I., 39(6), 55(6), 82 Hilsum, C., 69(51), 83 Hine, M. G., 94(9), 203 Hooker, 0 . N., 207,263 Hubbard, E. L., 164, 165(69), 167(77), 183 (77), 189, 204, 205 Hue, J., 140(54), 143(56), 196(98), 204, 205 Hull, A. W., 245, Z64 Hutson, A. R., 21,35
Garlick, G. F. J., 70153), 83 Garrett, C. G. B., 62(39), 83 Garrett,, M . W., 87(5), 127(5),205 Gautier, P., 172(83), 205 Geballe, T. H., 16(13), 17(14), 34 Gendreau, G., 109(18), 114, 116, 165(18), 205 J Germeshausm, K . J., 207, 263 Jelley, J. V., 268,269,285, 296 Gerritsen, H. J., 65(45). 83 R. L., 314(12), 315(12), 329 Jepsen, Geschwind, S., 292(31), 293(31), 2.97 Johnsen, K., 119(27), 205 Giese, C . F., 167(8@),17@(80),205 Johnson, C. H., 167(79), 205 Ginzburg, V. L., 266,275,206 Johnson, E. O., 221, 264 Ginzton, E. L., 299(2), 3-39 Johnson, L., 41(12), 82 Glaser, W., 104(15), 105(17), 203 Jones, R. C., 78(56), 84 Gliickstern, R., 127(38). 204 Gorlich, P., 39(6), 44(18), 50(22), 55(6), Judish, J. P., 167(79), 205 58(29), 59(31), 631411, 70(53), 78(56), Juse, W., 59(34), 83 82, S J , 84 K Goldberg, S., 208(7), 214(14), 223. 224 Kanev, St., 59(35), 83 ( 3 3 ) , 249(39), 263, 264 Gordon, J. P., 139(53). -304 Knpljanski, A. A., 39(3), 55(3), 81 Gorjunowa, N. A., 71(54), 83 Kaufman, I., 267. 206 Grcene, R. F., 11, 34 Keller, R.>138.
333
AUTHOR INDEX
Kelly, E. L., 167(77), 183(77), 189, 205
I i c d r r , F. K., 41(11), 82
Kikoin, I. K., 67(47), 83
King, P. G. R., 294(34), 297
Kingston, R. H., 60(38), 80(57), 83, 84
Klontz, E. E., 41(12), 82
Kluge, W., 70(53), 83
Knight, H., 207, 263
Knoll, M., 211(10), 263
Knoop, E., 221, 264
Köhler, V., 42(13), 82
Kohl, W., 211(11), 263
Kolomenskii, I. I., 284, 285(21), 296
Kolomijez, B. T., 71(54), 83
Kreuchen, K. H., 314(10), 329
Krienen, F., 133(43), 204
Kroebel, W., 221, 264
Kröger, F. A., 15(11), 34
Krohs, A., 44(18), 50(22), 63(41), 70(53), 82, 83
Kurnick, S. W., 67(48), 83
L
Lambe, J., 48(20), 82
Lampert, M. A., 44(18), 82
Lander, J. J., 22, 31(23), 35
Lang, W., 44(18), 50(22), 70(53), 82, 83
Langmuir, I., 231, 254(40), 264
Lapostolle, P., 93(8), 119(27), 203
Lark-Horovitz, K., 41(12), 82
Lashinsky, H., 280, 283(19), 292(31), 293(31), 294(32), 296, 297
Lasser, M. E., 41(12), 82
Lawson, J. D., 275, 296
Lawson, W. D., 71(54), 83
Learned, V., 299(1), 329
Levi, R., 301(4, 5), 329
Levinstein, H., 41(12), 82
Levy, P., 116, 203
Linhart, J. G., 275, 291, 296, 297
Liubarskii, G. L., 291(28), 296
Livingston, M. S., 86(2, 3), 124(3), 203
Loeb, J., 183, 205
Loeb, L. B., 228, 264
Logan, R. A., 17, 34
Longini, R. L., 11, 34
Luebke, W. R., 294(33), 297
Lummis, F. L., 63(42), 83
Lynch, P. J., 166(72), 167, 171(72), 185(72), 191, 204
Lynch, R. T., 301(3), 329
M
McBride, W. J., Jr., 314(11), 315(11), 329
McFarland, C. E., 183(88), 188, 205
Macfarlane, G. G., 40(10), 82
MacKay, J., 41(12), 82
McLean, T. P., 40(10), 82
MacRae, A., 41(12), 82
MacNair, D., 301(3), 329
McWhorter, A. L., 80(57), 84
Madelung, O., 70(52), 83
Malkowa, A. A., 71(54), 83
Mallet, L., 265, 296
Malter, L., 221, 264
Marshall, L., 160, 204
Martens, E., 58(30), 82
Martin, S. T., 208(7), 223(33), 224, 263, 264
Maslov, V. A., 167(76), 204
Mataré, H. F., 57(26), 82
Matsuda, K., 162(67), 204
Mayer, R., 221, 264
Melkich, A., 86, 203
Metcalfe, J., 221(17), 263
Michaelis, E. L., 185(91), 205
Miseljuk, E., 58(30), 82
Möllenstedt, G., 156(64), 204
Mohler, F. L., 254(41), 264
Mollwo, E., 31(34, 35), 35
Montgomery, H. C., 62(39), 83
Moore, W. J., 23(27), 35
Moreno, T., 314(11), 315(11), 329
Morin, F. J., 12(10), 34
Morpurgo, M., 134, 188(95), 204, 205
Morton, G. A., 70(53), 83
Moss, T. S., 67(48), 68(49), 70(53), 83
Motz, H., 267(5a), 296
Mourier, G., 288(26), 296
Mozley, R. F., 161, 171(66), 174(66), 204
Müser, H., 50(22), 82
Müller, M. W., 322, 329
Mullin, C. J., 221, 263
N
Nag, B. D., 280, 296
Nalos, E. J., 321(14), 324(17, 19), 329
Newman, R., 49(21), 59(33), 82, 83
Newman, R. C., 57(28), 82
Nielsen, I. R., 299(2), 329
Nielsen, S., 71(54), 83
Nitsche, R., 71(55), 83
Noskow, M. M., 67(47), 83
Nottingham, W. B., 218(15), 263
Novikow, B. V., 39(3), 55(3), 81
Nunan, C., 167(78), 205
O
Øverås, H., 126, 185(91), 204, 205
Olmstead, J. A., 221, 264
Otsuka, S. P., 324(19), 329
P
Pakswer, S., 221, 264
Panofsky, W. K., 98(12), 100(12), 203
Pantell, R. H., 324(19), 329
Paul, W., 68(49), 83, 135(45, 46), 136(46, 47), 138(50, 52), 139(52), 204
Persson, K., 254(42), 264
Petritz, R. L., 63(42), 83
Pierce, J. R., 267, 286(24), 294(35), 296, 297, 302(8), 329
Pincherle, L., 67(48), 83
Pinel, J., 185(92a), 205
Plücker, J., 211(8), 263
Pohl, R. W., 28, 29, 35, 43(16), 82
Polke, M., 44(17), 82
Putley, E. H., 71(54), 83
Q
Quarrington, J. E., 40(10), 82
R
Raether, M., 135(46), 136(46), 204
Rasbirin, B. S., 39(4, 5), 82
Read, W. T., 57(26), 60(38), 82, 83
Real, M., 185(92b), 205
Redington, R. W., 44(18), 82
Regenstreif, E., 130(40), 204
Reinhard, H. P., 136(47), 204
Reisman, E., 116(24), 146, 157, 181(24), 188, 192(24), 203
Reiss, H., 11, 12(7), 18, 19(20), 34
Riley, D. F., 208(7), 263
Rittner, E. S., 50(22), 82
Roberts, D. H., 78(56), 84
Roberts, V., 40(10), 82
Romanowitz, H. A., 221, 264
Rose, A., 43(14), 44(18), 51(14), 59(36), 65(45), 82, 83
Rosenblatt, J., 114, 203
Ross, I. M., 69(51), 83
Roth, M., 221, 264
Ruppel, W., 45(19), 65(45), 82, 83
Rywkin, S. M., 59(34), 71(54), 83
S
Sagdeyev, R. S., 285(23), 296
Sauzade, M., 174, 205
Sayied, A. M., 280, 296
Scherzer, O., 156(59), 204
Schlier, C., 138(52), 139(52), 204
Schneider, H., 123, 204
Schön, M., 50(22, 23), 82
Schottky, W., 10, 34
Schulte, M. L., 70(53), 83
Seeliger, R., 156(60), 204
Septier, A., 95(10a, 10b), 97(10b), 116(25), 123(33), 143(57), 152(57), 158(57), 162(57), 165(57), 167(57), 171(57), 173(57), 176(57, 85), 178(86), 181(57, 86), 182(87), 185(93), 188(25, 95), 191(96), 192(25), 194(57, 97), 196(98), 198(57, 99), 200(57), 202(57, 100, 101), 203, 204, 205
Shafranov, V. D., 285(23), 296
Shaw, H. J., 311(9), 329
Shearman, P. M., 214, 263
Shockley, W., 8(4), 34, 60(38), 83
Shoemaker, E. C., 161, 171(66), 174(66), 204
Shull, F. G., 183(88), 188, 205
Silver, M., 221, 26
Sim, A. C., 43(15), 82
Sitenko, A. G., 280, 296
Smith, D. P., 211(12), 263
Smith, L., 127(38), 204
Smith, R. A., 70(53), 83
Smith, R. B., 221(17), 263
Smyth, H. D., 244, 264
Snyder, C. W., 167(79), 205
Snyder, H. S., 86(2, 3), 124(3), 203
Sokolow, V. A., 53(25), 82
Sommerfeld, A., 266, 296
Sonkin, S., 299(2), 329
Soole, B. W., 63(41), 83
Sorrows, H. E., 63(42), 83
Sosnowski, L., 63(41), 83
Staprans, A., 316(13), 317(13), 329
Starkiewicz, J., 63(41), 83
Steinwedel, H., 135(45), 204
Sternheimer, R. M., 114, 115, 118, 119, 203
Stöckmann, F., 31(34), 35, 38(1), 41(12), 43(16), 44(17), 58(1), 59(1), 64(43), 65(44), 81, 82, 83
Storch, G., 44(17), 82
Strauss, A. J., 67(48), 83
Susskind, C., 302(6), 329
Symons, R., 316(13), 317(13), 329
T
Tamm, I. E., 265(2), 271, 274(10), 295(2), 296
Taubert, T., 131(42), 204
Teng, L. C., 127(36), 129(36), 204
Tewordt, L., 70(52), 83
Thomas, D. G., 15(11), 21(21), 22, 24, 31(26), 34, 35
Thürkauf, M., 167(74), 204
Thurmond, C. D., 17(16), 18(17), 34
Timm, U., 122(70), 165(70), 204
Tolstoi, N. A., 53(25), 82
Townes, C. H., 139(53), 204
Trumbore, F., 25, 35
Tweet, A. G., 23(29), 35
Tyler, W. W., 49(21), 59(33), 82, 83
V
Van Acker, J., 202(100, 101), 205
Van Der Horst, H. L., 263(43), 264
Van der Meer, S., 100(13), 162, 176, 183(68), 203, 204
van Roosbroeck, W., 68(49), 83
Van Trier, A., 292(31), 293(31), 297
Vauthier, R., 137(48), 204
Veksler, V. I., 284, 285(20), 296
Veronchev, T. A., 207(6), 263
Veronda, C., 299(1), 329
Villiger, W., 167(74), 204
Vink, J. H., 15(11), 34
Vivargent, M., 159(65), 167(65), 169, 204
Vlasov, A. D., 127(39), 204
Vogel, H., 50(23), 82
Volger, J., 15(11), 34
von Baumbach, H. H., 30, 35
von Zahn, V., 204
Votava, E., 57(27), 82
W
Wagner, C., 10, 30, 34, 35
Wakefield, J., 57(28), 82
Waldron, R. D., 40(8), 82
Walkinshaw, W., 127(35), 204
Wallis, G., 62(39), 83
Walsh, D., 214, 263
Walter, R. L., 314(12), 315(12), 329
Walton, A. K., 68(49), 83
Wang, S., 62(39), 83
Wantosch, H., 50(24), 82
Warnecke, R. J., 207, 263
Webster, E. W., 221, 263
Wegmann, L., 167(74), 204
Welker, H., 70(52), 83
Wheatcroft, E. L. E., 221, 263
Wiesner, R., 66(46), 83
Wilcox, J. M., 131, 204
Williams, E. M., 221, 264
Wilson, B. L. H., 78(56), 84
Winslow, L. M., 311(9), 329
Wittenberg, H. H., 207, 221, 263, 264
Wolkenstein, F., 59(34), 83
Woodbury, H. H., 48(21), 59(33), 82, 83
Woodford, J. B., 221, 264
Woods, J. F., 63(42), 83
Woodward, A. M., 67(48), 83
Wright, D. A., 70(53), 83
Wurst, E. C., 41(12), 82
Y
Young, A. S., 71(54), 83
Z
Zaffarano, D. J., 166(72), 167, 171(72), 185(72), 191, 204
Zajec, E., 171(82), 193, 205
Zeiger, H. J., 139(53), 204
Zitter, R. N., 67(48), 83
Subject Index
A
Aberrations
  aperture, 140-142
  calculations from rectangular model, 146-152
  chromatic, 158-159, 200-201
  figure, 152, 158
  global, 195, 196-197
  lens, 140-160
  mass, 160, 200-201
  of poor alignment, 202
  due to velocity terms, 146-149
Absorption coefficient, quantum, 55
Accelerators
  high current, 130
  linear ion, 126-131
  proton, 128
Amplification factors, photoconduction, 78
Aperture aberrations, 140-142
  trial corrections, 156-158
Atomic constant, 18
Atoms
  focusing of polarized, 137-139
  force of magnetic field on, 137, 138
B
Bandwidth, limiting, 315
Bell-shaped model, 90
  equations of motion, 104-105
  optical elements, 108-109
Bessel equation, 272
Blackbody radiation fluctuation, 78-79
Boltzmann equation, 257
  for current density, 228
Boundary area, double sheath, 239ff
Boundary layer
  enriched, 65
  photo-emf in, 66-67
Breakdown
  anode, 227, 233-239
  gas diode, 224
  grid-cathode, 224
Bremsstrahlung, 266, 274-275
Brillouin focusing, 302-304
Broadbanding, 316-317
Broadband klystron, 316-317
C
Cadmium sulfide
  effect of humidity on spectral distribution, 60, 61
  model for Ag-activated, 48
  step dislocations in, 57
Cardinal elements, lens, 105
  doublet quadrupole, 110-111
  triplet quadrupole, 119-122
Carrier concentration, minority, 16
Cathode
  dissipation, 251
  materials, 300-302
  utilization, 245-251
Ceramic hydrogen thyratron, 216-219
Cerenkov
  effect
    bunched electron beam, 277-279
    electron moving near a dielectric, 275-277
    frequency dependence, 267-270
    “inverse,” 285
    at microwave frequencies, 275-285
    Tamm analysis, 271-274
    theory, 268-275
  microwave devices, 285-295
  radiation, effect of medium on, 280-285, 286
  “two-cavity” klystron, 291
  velocity, critical, 273
Child-Langmuir equation, 223, 230
Circuits
  clover-leaf slow-wave, 326, 327
  coupled-cavity, 324-326
  “Dutch door,” 326, 328
  folded line, 326-329
  helix-derived, 323-324
  modulator, 219-221
  pulse generator, 219-221
  ring-and-bar, 324, 325
Clarity of doublet lens, 117-119
Coils, magnet excitation, 165-166
“Common ion” effect, 18
Commutation
Composition diagrams, 26-34
Conduction, steady-state, 239-251
Conductivity, positive photocurrent, 38
Contacts
  isotropic, 64-65
  noise, 79
  ohmic, 64-67
  rectifying, 64-65
  unidirectional, 64-65
Crystals
  with impurities, composition diagram, 31-34
  momenta, 53-57
  nearly perfect, 8-10
  pure, composition diagram, 26-30
  stability of impurities in, 18
Current
  density, 38
  sheets, 97-100
D
Degenerate temperature, 16
Deionization, 251-262
Demarcation levels, 51-53
Dember effect, 68
Density
  conduction band electron, 38
  current, 38
    Boltzmann equation for, 228
  electron beam charge, 277
  grid current, 223
  plasma electron, 242
  of states, 21
    effective, 8
  of un-ionized centers, 10
Depleted layer, 65
Dielectric relaxation time, 44
Diffusion
  ambipolar constant, 255
  rate of ion loss by, 254
  time, electron, 226
Dipole moment, effective, 138
Dirac delta function, 271
Discharge, grid-cathode, 229, 230
Dislocations, 57-58
Dissipation
  anode, 233-239
  cathode, 251
  inverse anode, 258-262
Distortions, 142-143, 148
Donor ionization energy, 9, 25
E
Electrode
  forms, 38-39
  noise, 79
Electrodes
  circular, 161-162
  composite, 162
  hyperbolic, 160-161
Electron
  beam
    Cerenkov effect from, 277-279
    charge density, 277
    formation, 302-306
    scalloping, 302ff
  bombardment
    of germanium, 41-42
    heating, 309-310
  as a chemical entity, 1-35
    experimental, 12-35
    theoretical, 2-12
  concentration
    equilibrium, 28-29
    temperature variation, 23
  density
    conduction band, 38
    plasma, 242
  diffusion time, 226
  flow, magnetically confined, 304-306
  -optical benches, 187-188
  plasma
    effective dielectric constant, 284
  sheath thickness, 230ff
  transit time, 45
  wave vectors, 53-57
Energy
  balance in steady discharge, 213
  band structure of Ge, 55-56
  donor ionization, 9
  filter, electrostatic, 131-133
  Gibbs free, 5
  level diagram
    of germanium surface, 61
    of ZnO surface, 62, 63
  levels, semiconductor, 8
Equilibrium
  configurations, 13-14
  constants, chemical, 21ff
  electron concentration, 28-29
  ionization, 20
Equipotential curves, equations for, 87ff
Erosion, anode, 261
Excitation
  in lattice absorption region, 47
  in tail absorption region, 46
Exciton formation, 55
F
Fermi-Dirac distribution function, 8
Fermi levels, 52
Ferrites, Cerenkov radiation in, 280-284
Field
  -current characteristics of magnetic lenses, 166-167
  distribution, lens, 86-100
  equivalent length, 92-95
  leakage, 178-179
  longitudinal, distribution, 180-181
  magnetic, 171-172
  transverse, distribution, 176-179
Flux distribution in magnetic circuit, 163-165
Focal, 112
  perturbed, 151
Focusing
  alternating gradient, simple, 123-126
    in linear accelerators, 126-131
  Brillouin, 302-304
  of polarized atoms and molecules, 137-139
Force of magnetic field on an atom, 137, 138
Franz-Keldysch effect, 48
Free charge carriers
  lifetime, 42-46
  production rate, 51
Frequency
  dependence of Cerenkov effect, 269-270
  dependence of photoconductivity, 73, 78
  incident radiation transition, 54, 56
  plasma, 284
G
Gas clean-up, 211
Gaussmeter, rotating-coil, 171-172
Germanium, 16-19
  electron bombardment, 41-42
  energy band structure, 55-56
  spectral distribution of photoconductivity, 41
  surface, energy level diagram, 61
Gibbs free energy, 5
Global aberration, 142
Gradient, transverse, 173
Grid current, rate of growth, 225
H
Heating
  dielectric, 307%
  electron bombardment, 309-310
“Helium-like” model, 21
Hines circuit, 326, 328
Hodoscope, 1 s 1 8 5
Hole mobility, 63
Hydrogen
  atoms, magnetic focusing of, 137-138
  concentration, atomic, in steady discharge, 243-245
  thyratrons, 207-264
    operation, 21%262
    reservoirs, 211-216
I
Image plane, Gaussian, 153ff
Impurities, chemical, 24
Impurity concentration, 12ff
Interaction effects, 12
Ion
  generation in plasma, 237
  loss, 254
  noise, 303, 306
  -optical bench, 186
  source, 186
Ionic radii, 25
Ionization
  energy
    of donor, 9
    for singly ionized donor, 25
  equilibrium, 20
J
Junctions, p-n, 65-66
K
Klystron
  broadband, 314-317
  Cerenkov “two-cavity,” 291
  driver section, 315-316
  high gain, 313-314
  high power, 300-313
  progress in design, 313-320
L
Langmuir probe, anode as, 22%-233
Lanthanum hydrogen reservoir, 214
Lead sulfide surface conditions, 62-63
Lengths, equivalent, 176, 181-183
Lenses
  cardinal elements, 105
  characteristic function, 91, 103
  doublet, 86, 183-202
  of finite length, 90-91
  field distribution, 86-100
  magnetic
    field-current characteristics, 166-167
    strong focusing, 160-167
    without poles, 95-100
  projection, aberrations, 148
  quadrupole, 85-205
    double
      cardinal elements, 110-111
      clarity, 117-119
      combination of two, 122-123
      conjugate points, 111-115
      magnification, 111-112
      optical elements, 109-119
      pseudostigmatic operation, 111-114
      transfer matrix, 109-110
    electrostatic, 135-136, 167-170
    helical, 133-135
    potential in system of, 86-100
    single, optical elements, 105-109
    triple, optical elements, 119-122
  strong focusing, 85-205
    graphs of operation, 114-115
    magnetic, 160-167
    sequence of large number, 123-133
  thin-, approximation, 107, 115-116, 122
Lorentz condition, 271
Luminescence, 40
M
Magnet excitation coils, 165-166
Magnification, doublet quadrupole lenses, 111-112
Mass
  aberration, 160, 200-201
  -action relations, 11-12, 16, 17
  filter, 136
    resolving power for, 136
Mathieu equation, 131
Matrix cathode, 301
Maxwell’s equations, 271
Mechanics, statistical, 7-12
Microwave
  devices, Cerenkov, 285-295
  diagnostics, 267
  frequencies, Cerenkov radiation at, 265-297
  tube window design, 306-313
    failure mechanisms, 309-310
Mobility, hole, 63
Modulation, 128
Modulator circuit, 219-221
Molecules, focusing of polarized, 137-139
Motion, equations of, 100-105, 128
  bell-shaped model, 104-105
  electrostatic quadrupole lenses, 135
  helical quadrupole lenses, 133-134
N
Noise
  contact, 79
  crystal, 80
  electrode, 79
  ion, 303, 306
  Nyquist, 79
  semiconductor, 79
O
Occupancy factor, electronic, 9
Optical bench, corpuscular, 185-187
Optical elements
  bell-shaped model, 108-109
  doublet quadrupole lens, 109-119
  single quadrupole lens, 105-109
  triplet quadrupole lens, 119-122
Oscillator, synchronized relaxation, 219
Oxide cathodes, 300-301
Oxygen vacancies, 20, 23
P
Particle rigidity, 184
Periodicity in linear ion accelerator, 129
Perturbation
  global, 155
  method, localized, 129-131
Phase rule for systems of charged components, 6
Photoconductivity, 37-84
  amplification factors, 78
  applications, 70-80
  frequency dependence, 73, 78
  negative, 59-60
  non-stationary states, 50
  spectral distribution in Ge, 41
  statistical fluctuations, 78-80
  steady state condition, 49
  surface conditions, 60, 64
  theoretical, 40-57
Photocurrent
  dependence on radiation intensity, 50-51
  saturated, 43-44
  short circuit, 69
  space charge limited, 44
  superlinear, 53
  time dependence of, 42
  total, 43
  unsaturated, 43-44
Photoelectromagnetic effect, 67-70
Photo-emf in boundary layer, 66-67
Photogalvanomagnetic effect, 67-70
Pierce gun, 302
Plasma
  Cerenkov radiation in, 284-285
  decay, 252-257
  electron density, 242
  electron, effective dielectric constant, 284
  frequency, 284
  generation rate, 237
  ion generation in, 237
Poisson equation, 230
Pole pieces, form of, 160-162
Potential
  chemical, of electrically active constituent, 10-11
  electrochemical, 2-4
  electrostatic, 2, 4
  fall, anode, 234, 236
  scalar, distortions of, 173-176
  in system of quadrupole lenses, 86-100
Power dissipation at anode, 238
Probe beam formation, 187
Proton accelerator, 128
Pseudo-stigmatic operation
  doublet lens, 189
  electrostatic triplet, 190
  spot size, 210
Pulse cycle, hydrogen thyratron, 219-221
Pulse generating circuit, 219-221
Q
Quantum
  absorption coefficient, 55
  yield, 44
R
Radiators
  configuration, 73
  recombination, 72
Reaction kinetic models, 46-49
Recombination, 40
  radiators, 72
  rate of ion loss by, 254
Recovery time, 251-262
Resonators, dielectric, 291
Rosette aberration, 142
S
Semiconductor
  addition of group V atom to, 11
  energy levels, 8
  noise, 79
  phases, 1-35
Space charge
  limited photocurrent, 44
  neutrality, 5, 6
Spherical aberration, 141, 148, 157
Sputtering, anode, 262
Stability condition, 128, 129
Star aberration, 141
Stigmatism, 122
T
Tamm analysis of Cerenkov effect, 271-274
Tantalum cathodes, 301
Temperature
  degenerate, 16
  of thyratron envelope, 217
  variation of electron concentration, 23
Thermodynamics of systems containing charged components, 2-7
Thyratrons, hydrogen, 207-264
Time
  dependence of photocurrent, 42
  dielectric relaxation, 44
  recovery, 251-262
Titanium hydrogen reservoir, 214
Trajectories of third order, 143-155
Transfer matrix, doublet quadrupole lens, 109-110
Triggering, 221-227
Tubes
  ceramic, 308, 311-313
  high-power axial-beam, 299-329
  microwave, window design, 306-313
  traveling wave, 300-313
    and Cerenkov radiation, 286-287
    progress in design, 321-329
Tungsten cathodes, thoriated, 301
V
Velocity
  Cerenkov critical, 273
  -inclination aberration, 144
W
Wave
  equations
    for electromagnetic potentials, 271
    for vector potentials, 272
  guides, loaded, 289-291
  progressive, method, 127-129
  vectors, electron, 53-57
Window design
  ceramic tube, 308, 311-313
  microwave tubes, 306-313
Z
Zinc
  interstitial, 20
  oxide, 19-34
    composition diagram, 27ff
    equilibrium equation, 20ff
    field effect in, 45
    optical band gap, 21
    surface energy scheme, 62, 63
  vacancies, 20
    concentration, 22