Advances in COMPUTERS VOLUME 2
edited by
FRANZ L. ALT
National Bureau of Standards, Washington, D. C.
associate editors A. D. BOOTH
R. E. MEAGHER
VOLUME 2
Academic Press
New York London 1961
COPYRIGHT © 1961, BY ACADEMIC PRESS INC.
ALL RIGHTS RESERVED
NO PART OF THIS BOOK MAY BE REPRODUCED IN ANY FORM, BY PHOTOSTAT, MICROFILM, OR ANY OTHER MEANS, WITHOUT WRITTEN PERMISSION FROM THE PUBLISHERS, EXCEPT AS STATED IN THE FIRST FOOTNOTE ON PAGE 135
ACADEMIC PRESS INC.
111 FIFTH AVENUE
NEW YORK 3, NEW YORK

United Kingdom Edition
Published by
ACADEMIC PRESS INC. (LONDON) LTD.
17 OLD QUEEN STREET, LONDON, S.W. 1
Library of Congress Catalog Card Number 59-15761
PRINTED IN THE UNITED STATES OF AMERICA
Contributors to Volume 2
PHILIP J. DAVIS, National Bureau of Standards, Washington, D. C.
JIM DOUGLAS, JR., Department of Mathematics, Rice University, Houston, Texas
SAUL I. GASS, Federal Systems Division, International Business Machines, Washington, D. C.
ROBERT MCNAUGHTON, Moore School of Electrical Engineering, University of Pennsylvania, Philadelphia, Pennsylvania
PHILIP RABINOWITZ, Department of Applied Mathematics, Weizmann Institute of Science, Rehovoth, Israel
KENNETH R. SHOULDERS, Applied Physics Laboratory, Stanford Research Institute, Menlo Park, California
Preface
The success of the first volume of Advances in Computers confirms the expectation that there was a real need for publications of this kind. Indeed, this could have been inferred from the emergence, during the past few years, of similar series in a large number of fields of knowledge. What is it that distinguishes these series from the more traditional media of publication? It seems that they possess three outstanding characteristics, all interrelated and reinforcing each other. First, they cover a wide range of topics, so wide that hardly anyone can claim to be expert in all of them. One may conjecture that this is felt by many readers as a welcome antidote to the ever growing specialization of technical fields. Second, and necessitated by the first, there is the level of competence required of the reader. These are not “popular” articles by any means; they represent solid technical writing, yet, as we said in the Preface to Volume I, they are intended to be intelligible and interesting to specialists in fields other than the author’s own. And finally, there is the length of the individual contribution: shorter than a monograph but far longer than an article in a conventional technical journal; long enough to introduce a newcomer to the field and give him the background he needs, yet short enough to be read for the mere pleasure of exploration. These are our aims. If we have fallen short, we ask the reader’s indulgence, and invite him to contemplate the new editorial difficulties created by this type of publication: the need to find a group of authors willing to engage in the time-consuming business of expository writing, rather than pursuing their own special interests; to find just enough of them for a volume of manageable size; and to persuade them to time the completion of their manuscripts so as not to keep each other waiting. 
Even if there were universal agreement on what constitutes a balanced mixture of topics, a proper level of technical writing, and adequate expository treatment, one could hardly hope to accomplish more than a crude approximation to the ideal.

FRANZ L. ALT
October, 1961
Contents

CONTRIBUTORS TO VOLUME 2
PREFACE
CONTENTS OF VOLUME 1

A Survey of Numerical Methods for Parabolic Differential Equations
JIM DOUGLAS, JR.

1. Introduction
2. Preliminaries
3. Explicit Difference Equations
4. The Backward Difference Equation
5. The Crank-Nicolson Difference Equation
6. An Unconditionally Unstable Difference Equation
7. Higher Order Correct Difference Equations
8. Comparison of the Calculation Requirements
9. Several Space Variables
10. Alternating Direction Methods
11. Abstract Stability Analysis
12. The Energy Method
13. Stefan Problem
14. Parabolic Systems
15. Integro-Differential Equations
16. Extrapolation to the Limit
References

Advances in Orthonormalizing Computation
PHILIP J. DAVIS and PHILIP RABINOWITZ

PART I: THEORETICAL
1. Introduction
2. The Geometry of Least Squares
3. Inner Products Useful in Numerical Analysis
4. The Computation of Inner Products
5. Methods of Orthogonalization
6. Tables of Orthogonal Polynomials and Related Quantities
7. Least Square Approximation of Functions
8. Overdetermined Systems of Linear Equations
9. Least Square Methods for Ordinary Differential Equations
10. Linear Partial Differential Equations of Elliptic Type
11. Complete Systems of Particular Solutions
12. Error Bounds; Degree of Convergence
13. Collocation and Interpolatory Methods and Their Relation to Least Squares
14. Conformal Mapping
15. Quadratic Functionals Related to Boundary Value Problems
PART II: NUMERICAL
16. Orthogonalization Codes and Computations
17. Numerical Experiments in the Solution of Boundary Value Problems Using the Method of Orthonormalized Particular Solutions
18. Comments on the Numerical Experiments
19. The Art of Orthonormalization
20. Conclusions
References

Microelectronics Using Electron-Beam-Activated Machining Techniques
KENNETH R. SHOULDERS

1. Introduction
2. Research Plan
3. Microelectronic Component Considerations
4. Tunnel Effect Components
5. Accessory Components
6. Component Interconnection
7. Substrate Preparation
8. Material Deposition
9. Material Etching
10. Resist Production
11. Electron Optical System
12. High-Vacuum Apparatus
13. Electron Microscope Installation
14. Demonstration of Micromachining
15. Summary
References

Recent Developments in Linear Programming
SAUL I. GASS

1. Decomposition Algorithm
2. Integer Linear Programming
3. The Multiplex Method
4. Gradient Method of Feasible Directions
5. Linear Programming Applications
6. Summary of Progress in Related Fields
7. Linear Programming Computing Codes and Procedures
8. SCEMP
9. Linear Programming in Other Countries
References

The Theory of Automata, a Survey
ROBERT McNAUGHTON

1. Introduction
2. Finite Automata
3. Probabilistic Automata
4. Behavioral Descriptions
5. Various Concepts of Growing Automata
6. Operations by Finite and Growing Automata, Real-Time and General
7. Automation Recognition
8. Imitation of Life-Like Processes by Automata
References

AUTHOR INDEX
SUBJECT INDEX
Contents of Volume 1

General-Purpose Programming for Business Applications
CALVIN C. GOTLIEB

Numerical Weather Prediction
NORMAN A. PHILLIPS

The Present Status of Automatic Translation of Languages
YEHOSHUA BAR-HILLEL

Programming Computers to Play Games
ARTHUR L. SAMUEL

Machine Recognition of Spoken Words
RICHARD FATEHCHAND

Binary Arithmetic
GEORGE W. REITWIESNER
A Survey of Numerical Methods for Parabolic Differential Equations

JIM DOUGLAS, JR.
Rice University, Houston, Texas

1. Introduction
2. Preliminaries
3. Explicit Difference Equations
4. The Backward Difference Equation
5. The Crank-Nicolson Difference Equation
6. An Unconditionally Unstable Difference Equation
7. Higher Order Correct Difference Equations
8. Comparison of the Calculation Requirements
9. Several Space Variables
10. Alternating Direction Methods
11. Abstract Stability Analysis
12. The Energy Method
13. Stefan Problem
14. Parabolic Systems
15. Integro-Differential Equations
16. Extrapolation to the Limit
References
1. Introduction
The purpose of this survey is to introduce a theoretically minded, but not highly mathematically trained, scientist to finite difference methods for approximating the solutions of partial differential equations of parabolic type. Differential equations of this type describe diffusion processes of many kinds. In recent years much progress has been made in developing more efficient finite difference procedures and in methods for determining whether or not the numerical solutions are indeed good approximations to the solutions of the differential equations. Both of these advances will be discussed.

Clearly, the better methods should be presented for the use of the applied scientist. Since the variety of physical problems leading to parabolic equations that cannot be solved by classical methods is much larger than could be anticipated by the mathematician, and since many difference analogues of the differential system are either inefficient or divergent, it is necessary that the practitioner understand at least the fundamentals of techniques for demonstrating the convergence of a numerical solution to the solution of the corresponding differential system. In order to make clear the salient features of each method of proof to be presented, the proofs will usually be given for the heat equation,

$$u_t = u_{xx},$$

although the difference equations will be generalized to linear parabolic equations with variable coefficients and to nonlinear equations. At first the number of space variables will be limited to one.

To illustrate the process of obtaining better difference equations for a given problem, an orderly refinement of difference analogues will be derived. Also, to avoid technical complications, the detailed derivations will be presented only for the heat equation. The sequence of difference equations will progress from one for which the error is first order in the time increment and second order in the space increment and that is subject to restrictions between these increments to ones that are second order correct in the time increment, fourth order in the space increment, and independent of restrictions between the increments. The sequence will not be the same for one space variable and more than one space variable.

Before beginning the discussion of difference equations, a few preliminary definitions and facts about difference analogues of derivatives will be collected together. The new book of Forsythe and Wasow* contains a good introductory chapter on parabolic equations, although it appeared too late to be referenced in this survey.
2. Preliminaries
The solution of a difference analogue of a differential equation is defined only on a lattice of points in the region of interest. Let $\Delta x, \Delta y, \ldots, \Delta t$ be increments of the independent variables, and let $x_i = i\,\Delta x$, $y_j = j\,\Delta y$, $\ldots$, $t_n = n\,\Delta t$. Then, the lattice consists of the points $(x_i, y_j, \ldots, t_n)$. Let $f(x_i, y_j, \ldots, t_n) = f_{i,j,\ldots,n}$. The subscripts are not necessarily integral; i.e., $t_{n+1/2} = (n + \tfrac{1}{2})\,\Delta t$. The values of the space increments could vary with the indices ($x_{i+1} = x_i + \Delta x_i$, etc.), but for the sake of simplicity let us assume that they do not; however, $\Delta t$ is not assumed equal to $\Delta x$, although we shall frequently assume $\Delta x = \Delta y = \cdots$. Throughout the paper the symbols $u$ and $w$ will be used to denote the solution of a differential equation and the solution of a difference equation, respectively. Frequently, the symbol $z$ will be used to denote $u - w$.

* Forsythe, G. E., and W. R. Wasow, Finite-Difference Methods for Partial Differential Equations. Wiley, New York, 1960.

The Landau order notation will be used. Thus,

$$f(x) = O(g(x))$$

means that $|f(x)/g(x)|$ remains bounded as $x$ tends to some stated or obvious limit. Similarly,

$$f(x) = o(g(x))$$

means that $f(x)/g(x) \to 0$ as $x$ tends to its limit. In particular, $f(x) = o(1)$ implies that $\lim f(x) = 0$.

Let the function $f(x_1, x_2, \ldots, x_n)$ be defined on a closed region $R$. Then,

$$f \in C^{p_1, p_2, \ldots, p_n} \tag{2.5}$$

if $f$ has $p_i$ continuous derivatives with respect to $x_i$. If $f$ is a function of several variables,

$$f \in C^{\beta} \tag{2.6}$$

implies that all partial derivatives of $f$ of order not greater than $\beta$ are continuous. If $R$ is unbounded or not closed, we also assume these derivatives to be bounded. In most of our applications, the highest order derivatives need only be bounded; however, this relaxation of the hypotheses will not be discussed.

The following results are proved in any text on numerical analysis and most calculus texts. First,
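The $s_i$ arise from the use of the extended mean value theorem. Denote the centered first difference by

$$\Delta_x f_i = \frac{f_{i+1} - f_{i-1}}{2\,\Delta x}.$$

Also,

$$\frac{d^2 f(x_i)}{dx^2} = \frac{f_{i+1} - 2f_i + f_{i-1}}{(\Delta x)^2} + O((\Delta x)^2).$$

As the second difference will appear very frequently, let

$$\Delta_x^2 f_i = \frac{f_{i+1} - 2f_i + f_{i-1}}{(\Delta x)^2},$$

and

$$\frac{d^4 f(x_i)}{dx^4} = \Delta_x^4 f_i + O((\Delta x)^2), \qquad f \in C^6,$$

where

$$\Delta_x^4 f_i = \frac{f_{i+2} - 4f_{i+1} + 6f_i - 4f_{i-1} + f_{i-2}}{(\Delta x)^4}.$$

It follows from elementary trigonometric identities that

$$\Delta_x^2 \sin \pi p x = -\frac{4}{(\Delta x)^2}\,\sin^2\frac{\pi p\,\Delta x}{2}\,\sin \pi p x. \tag{2.15}$$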
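The difference formulas of this section are easy to check numerically. The following is a minimal sketch in Python (a choice made for this illustration, not part of the original survey). It evaluates the centered second and fourth differences of $\sin \pi p x$, compares the second difference with the trigonometric identity (2.15), and measures the error against the exact second derivative as $\Delta x$ is halved. The function names and the sample values $p = 1$, $x = 0.3$ are assumptions chosen only for the example.

```python
import math


def second_difference(f, x, dx):
    """Centered second difference (f(x+dx) - 2 f(x) + f(x-dx)) / dx**2."""
    return (f(x + dx) - 2.0 * f(x) + f(x - dx)) / dx**2


def fourth_difference(f, x, dx):
    """Centered fourth difference with weights 1, -4, 6, -4, 1 over dx**4."""
    return (f(x + 2 * dx) - 4.0 * f(x + dx) + 6.0 * f(x)
            - 4.0 * f(x - dx) + f(x - 2 * dx)) / dx**4


def identity_rhs(p, x, dx):
    """Right-hand side of (2.15) for f(x) = sin(pi p x)."""
    s = math.sin(math.pi * p * dx / 2.0)
    return -(4.0 / dx**2) * s * s * math.sin(math.pi * p * x)


def second_derivative_errors(p=1, x=0.3, steps=(0.1, 0.05, 0.025)):
    """Errors |second difference - exact second derivative| for sin(pi p x)."""
    f = lambda s: math.sin(math.pi * p * s)
    exact = -(math.pi * p) ** 2 * f(x)
    return [abs(second_difference(f, x, dx) - exact) for dx in steps]


if __name__ == "__main__":
    f = lambda s: math.sin(math.pi * s)
    # The two printed numbers should agree up to rounding.
    print(second_difference(f, 0.3, 0.1), identity_rhs(1, 0.3, 0.1))
    # Successive errors should shrink by a factor close to four.
    print(second_derivative_errors())
```

Halving $\Delta x$ should reduce each error by a factor close to four, which is what the $O((\Delta x)^2)$ term in the second derivative formula predicts.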
3. Explicit Difference Equations
Let us begin with the numerical treatment of parabolic differential equations by considering the boundary value problem for the heat equation in one space variable, since this problem is the easiest example to discuss for difference equations. Let

$$\begin{aligned}
u_t &= u_{xx}, & 0 < x < 1,\quad & 0 < t \le T,\\
u(x, 0) &= f(x), & 0 < x < 1,\quad &\\
u(0, t) &= g_0(t), & & 0 < t \le T,\\
u(1, t) &= g_1(t), & & 0 < t \le T.
\end{aligned} \tag{3.1}$$

Let $\Delta x = M^{-1}$ and $\Delta t = TN^{-1}$, where $M$ and $N$ are positive integers. Assume that a solution $u(x, t)$ exists for (3.1) and, moreover, that

$$u \in C^{4,2}. \tag{3.2}$$

Assume that an approximate solution $w$ is known on the lattice points up through time $t_n$; then a method must be specified to advance the solution to time $t_{n+1}$. Clearly, the values of $w_{i,n+1}$ for $x = 0$ and $x = 1$ should be those assigned to $u$:

$$w_{0,n+1} = g_0(t_{n+1}), \qquad w_{M,n+1} = g_1(t_{n+1}). \tag{3.3}$$
At a point $(x_i, t_n)$, $0 < i < M$, the differential equation will be replaced by a difference equation. The simplest replacement is to approximate the time derivative by a forward difference and the space derivative by the centered second difference at $(x_i, t_n)$. The resulting difference equation is

$$\frac{w_{i,n+1} - w_{in}}{\Delta t} = \Delta_x^2 w_{in}, \qquad i = 1, \ldots, M - 1. \tag{3.4}$$
Equation (3.4) can be solved for $w_{i,n+1}$:

$$w_{i,n+1} = w_{in} + \Delta t\,\Delta_x^2 w_{in}, \qquad i = 1, \ldots, M - 1. \tag{3.5}$$

As only five arithmetic operations are required to evaluate (3.5) for each choice of $i$, the approximate solution can easily be advanced a time step by (3.3) and (3.5). As the solution was prescribed for $t = 0$, the approximate solution can be obtained at all $(x_i, t_n)$, $i = 0, \ldots, M$, $n = 0, \ldots, N$. Equation (3.5) is frequently called the forward difference equation.

The question of the accuracy of the approximation arises immediately. There are several ways to study the relation between $w$ and $u$. The most obvious way is to take examples for which closed form solutions of both the differential and the difference problems can be obtained and to compare these solutions directly. This procedure has been used by several authors [1-4]; however, the method lacks the generality required to treat more complex linear and nonlinear problems. A particularly simple analysis [5-8] based on the concept of a maximum principle can be applied to analyzing the difference between $u$ and $w$, and this analysis will be presented. For most of the more refined difference methods to be discussed below, more general analytical methods are required, and these techniques will be discussed later; nevertheless, the maximum principle analysis has broad application.

Since $w$ is defined at a finite number of points, no differential equation can be found for $w$; thus, it is helpful to find a difference equation satisfied by

$$z_{in} = u_{in} - w_{in}. \tag{3.6}$$

As $u \in C^{4,2}$, (2.7) and (2.9) imply that

$$u_{i,n+1} = u_{in} + \Delta t\,\Delta_x^2 u_{in} + O((\Delta t)^2 + (\Delta x)^2 \Delta t), \qquad i = 1, \ldots, M - 1. \tag{3.7}$$
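Subtracting (3.5) from (3.7), we obtain

$$z_{i,n+1} = z_{in} + \Delta t\,\Delta_x^2 z_{in} + O((\Delta t)^2 + (\Delta x)^2 \Delta t), \qquad i = 1, \ldots, M - 1. \tag{3.8}$$

As $w$ agrees with $u$ initially and on the boundary,

$$z_{i0} = 0, \quad i = 0, \ldots, M; \qquad z_{0n} = z_{Mn} = 0, \quad n = 0, \ldots, N. \tag{3.9}$$

Let

$$r = \frac{\Delta t}{(\Delta x)^2}. \tag{3.10}$$

Then, (3.8) can be written in the form

$$z_{i,n+1} = r z_{i+1,n} + (1 - 2r) z_{in} + r z_{i-1,n} + O((\Delta t)^2 + (\Delta x)^2 \Delta t). \tag{3.11}$$

Note that the three coefficients on the right-hand side of (3.11) sum to one for any choice of $r$ and all are nonnegative if

$$0 < r \le \tfrac{1}{2}. \tag{3.12}$$

Assume (3.12) to hold; then each $z_{i,n+1}$ is a weighted average of values of $z$ at time $t_n$ plus the truncation term, so that, with $\|z_n\| = \max_i |z_{in}|$ and $B$ a constant,

$$\|z_{n+1}\| \le \|z_n\| + B((\Delta t)^2 + (\Delta x)^2 \Delta t).$$

As $\|z_0\| = 0$ and $n\,\Delta t \le T$, it follows that

$$\|z_n\| \le BT((\Delta x)^2 + \Delta t).$$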
THEOREM. Let $u \in C^{4,2}$, and let $w$ be defined by (3.3) and (3.5). If $0 < r \le \tfrac{1}{2}$, then

$$\max |u_{in} - w_{in}| \le B((\Delta x)^2 + \Delta t), \qquad 0 \le x_i \le 1, \quad 0 \le t_n \le T.$$

The value of $B$ depends on upper bounds for $u_{tt}$ and $u_{xxxx}$, as well as on $T$. Thus, the forward difference equation satisfies the most important property required of a difference analogue of a differential equation; i.e., at least under some conditions its solution converges to that of the differential equation as $\Delta x$ and $\Delta t$ tend to zero. It is clear that the condition (3.2) can be relaxed to

$$u \in C^{2,1}. \tag{3.17}$$
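The behavior described by the theorem can be observed directly. The sketch below, in Python, advances the forward difference equation (3.5) with boundary values (3.3). The function names, the test solution $u = e^{-\pi^2 t}\sin \pi x$, and the parabola $x(1 - x)$ used as rough initial data are assumptions made for this illustration; they do not come from the survey.

```python
import math


def forward_difference_heat(f, g0, g1, M, N, T):
    """Advance w_{i,n+1} = w_{in} + r (w_{i+1,n} - 2 w_{in} + w_{i-1,n}) for N steps.

    Returns the grid values at time T and the ratio r = dt / dx**2.
    """
    dx = 1.0 / M
    dt = T / N
    r = dt / dx**2
    w = [f(i * dx) for i in range(M + 1)]          # initial values
    for n in range(N):
        t_next = (n + 1) * dt
        new = [0.0] * (M + 1)
        new[0] = g0(t_next)                        # boundary values, as in (3.3)
        new[M] = g1(t_next)
        for i in range(1, M):                      # interior update, as in (3.5)
            new[i] = w[i] + r * (w[i + 1] - 2.0 * w[i] + w[i - 1])
        w = new
    return w, r


def max_error_sine(M, N, T=0.1):
    """Maximum error at time T for the solution u = exp(-pi^2 t) sin(pi x)."""
    zero = lambda t: 0.0
    w, r = forward_difference_heat(lambda x: math.sin(math.pi * x), zero, zero, M, N, T)
    decay = math.exp(-math.pi**2 * T)
    err = max(abs(w[i] - decay * math.sin(math.pi * i / M)) for i in range(M + 1))
    return err, r


def max_abs_parabola(M, N, T=0.1):
    """Largest |w| at time T for initial values x (1 - x) and zero boundary data."""
    zero = lambda t: 0.0
    w, r = forward_difference_heat(lambda x: x * (1.0 - x), zero, zero, M, N, T)
    return max(abs(v) for v in w), r


if __name__ == "__main__":
    print(max_error_sine(10, 20))     # r = 1/2
    print(max_error_sine(20, 80))     # r = 1/2, dx halved
    print(max_abs_parabola(20, 40))   # r = 1, outside the hypothesis of the theorem
```

With $r$ held at one-half, halving $\Delta x$ should cut the maximum error by roughly a factor of four, in line with the bound $B((\Delta x)^2 + \Delta t)$. With $r = 1$ the computed values grow to very large magnitudes, which illustrates why the restriction on $r$ appears in the theorem.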
Under the weaker hypothesis (3.17), (2.7), (2.9), and the argument above give

$$\|z_n\| = o(1) \tag{3.18}$$

as $\Delta x$ and $\Delta t$ tend to zero. The condition on $r$, however, cannot be removed; it can be relaxed trivially to

$$r \le \tfrac{1}{2} + O(\Delta t), \tag{3.19}$$

as will be seen presently. The necessity of (3.19) will be discussed later.

One of the most interesting features of the numerical treatment of partial differential equations is that many of the numerical methods and many of the proofs for linear equations with constant coefficients carry over directly to nonlinear equations. Consider the differential equation

$$u_t = \phi(x, t, u, u_x, u_{xx}), \qquad 0 < x < 1, \quad 0 < t \le T, \tag{3.20}$$

again subject to initial and boundary values. To assure that this problem is well posed (i.e., physically stable) assume that

$$\frac{\partial \phi}{\partial u_{xx}} \ge a > 0. \tag{3.21}$$
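Then a forward difference equation can be written as

$$w_{i,n+1} = w_{in} + \Delta t\,\phi(x_i, t_n, w_{in}, \Delta_x w_{in}, \Delta_x^2 w_{in}), \qquad i = 1, \ldots, M - 1. \tag{3.22}$$

Now, assume also that

$$\left|\frac{\partial \phi}{\partial u}\right|, \quad \left|\frac{\partial \phi}{\partial u_x}\right|, \quad \frac{\partial \phi}{\partial u_{xx}} \le \beta \tag{3.23}$$

in the region. Then, by the mean value theorem,

$$u_{i,n+1} = u_{in} + \Delta t\,\phi(x_i, t_n, u_{in}, \Delta_x u_{in}, \Delta_x^2 u_{in}) + O((\Delta t)^2 + (\Delta x)^2 \Delta t), \tag{3.24}$$

provided $u \in C^{4,2}$. Then, subtracting (3.22) from (3.24) and applying the mean value theorem again, we obtain

$$z_{i,n+1} = z_{in} + \Delta t\left[\frac{\partial \phi}{\partial u} z_{in} + \frac{\partial \phi}{\partial u_x} \Delta_x z_{in} + \frac{\partial \phi}{\partial u_{xx}} \Delta_x^2 z_{in}\right] + O((\Delta t)^2 + (\Delta x)^2 \Delta t), \tag{3.25}$$

where the partial derivatives appearing on the right-hand side are evaluated at a point between $(x_i, t_n, u_{in}, \Delta_x u_{in}, \Delta_x^2 u_{in})$ and $(x_i, t_n, w_{in}, \Delta_x w_{in}, \Delta_x^2 w_{in})$ as required by the mean value theorem. Equation (3.25) can be rearranged as

$$\begin{aligned}
z_{i,n+1} ={}& \left[r \frac{\partial \phi}{\partial u_{xx}} + \frac{\Delta t}{2\,\Delta x} \frac{\partial \phi}{\partial u_x}\right] z_{i+1,n} + \left[1 + \Delta t \frac{\partial \phi}{\partial u} - 2r \frac{\partial \phi}{\partial u_{xx}}\right] z_{in} \\
&+ \left[r \frac{\partial \phi}{\partial u_{xx}} - \frac{\Delta t}{2\,\Delta x} \frac{\partial \phi}{\partial u_x}\right] z_{i-1,n} + O((\Delta t)^2 + (\Delta x)^2 \Delta t).
\end{aligned} \tag{3.26}$$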
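The analysis for the heat equation is applicable here provided that we can choose $\Delta t$ and $\Delta x$ so that the coefficients are nonnegative. Now,

$$r \frac{\partial \phi}{\partial u_{xx}} \pm \frac{\Delta t}{2\,\Delta x} \frac{\partial \phi}{\partial u_x} \ge 0 \tag{3.27}$$

if

$$\Delta x \le \frac{2a}{\beta}. \tag{3.28}$$

Thus, (3.28) implies that the first and third coefficients are nonnegative. Also,

$$1 + \Delta t \frac{\partial \phi}{\partial u} - 2r \frac{\partial \phi}{\partial u_{xx}} \ge 0 \tag{3.29}$$

if

$$0 < r \le \frac{1 - \beta\,\Delta t}{2\beta}. \tag{3.30}$$

Thus, if (3.28) and (3.30) hold and $u \in C^{4,2}$,

$$\|z_{n+1}\| \le (1 + \beta\,\Delta t)\,\|z_n\| + B((\Delta t)^2 + (\Delta x)^2 \Delta t). \tag{3.31}$$

It can be shown easily that

$$\|z_n\| = O((\Delta x)^2 + \Delta t), \tag{3.32}$$

since

$$(1 + \beta\,\Delta t)^n \le e^{\beta T}. \tag{3.33}$$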
Again, the restriction that $u \in C^{4,2}$ can be relaxed. The analysis above has been for the boundary value problem. The discussion of more general boundary conditions will be deferred until the backward difference equation has been introduced.

The limitation (3.30) on the ratio of the time step to the square of the space increment can be extremely inconvenient, particularly if $\beta$ is large, since the number of time steps and the amount of computation required to complete a problem can be very large. Note that having large partial derivatives of $\phi$ in a small portion of the region enforces the use of small time steps over all the region, unless a rather complicated machine program is constructed. Much of the work to be described has been aimed at removing or altering (3.30).

Before passing to improved difference equations for (3.1), let us consider the pure initial value problem

$$\begin{aligned}
u_t &= u_{xx}, & -\infty < x < \infty, \quad & 0 < t \le T,\\
u(x, 0) &= f(x), & -\infty < x < \infty. \quad &
\end{aligned} \tag{3.34}$$

Let $u \in C^{4,2}$. Then, it is quite clear that the analysis of the forward difference equation is essentially unaltered; thus, convergence of the solution of the forward difference equation is assured if $r \le \tfrac{1}{2}$ and $u$ is sufficiently smooth. As most of the better difference equations that will be discussed for the bounded region will have little or no application to the initial value problem, let us describe improved difference approximations for (3.34) here. Let $u \in C^{6,3}$. Then,

$$u_{i,n+1} = u_{in} + \Delta t\,(u_{xx})_{in} + \tfrac{1}{2}(\Delta t)^2 (u_{xxxx})_{in} + O((\Delta t)^3). \tag{3.35}$$

Replacing $u_{xx}$ by a second difference and $u_{xxxx}$ by a fourth difference, we see that

$$u_{i,n+1} = u_{in} + \Delta t\,\Delta_x^2 u_{in} + \tfrac{1}{2}(\Delta t)^2 \Delta_x^4 u_{in} + O((\Delta t)^3 + (\Delta x)^2 \Delta t). \tag{3.36}$$

The difference equation obtained by deleting the local truncation error term $O((\Delta t)^3 + (\Delta x)^2 \Delta t)$ can be analyzed in precisely the same manner that the forward difference equation was treated. Again, it is necessary to restrict $r$ to be not greater than one-half, and it is easy to see that

$$\|z_n\| = O((\Delta x)^2 + (\Delta t)^2). \tag{3.37}$$
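The restriction to $r \le \tfrac{1}{2}$ for the equation obtained from (3.36) can be read off from its five weights. The sketch below, in Python, computes those weights and checks whether they are all nonnegative, which is the hypothesis the maximum principle argument needs. It also evaluates a Fourier growth factor as an independent check; that second check is a different technique from the one used in the text and is included only for comparison. All function names are assumptions of this illustration.

```python
import math


def stencil_weights(r):
    """Weights of w at i-2, i-1, i, i+1, i+2 when w is advanced by
    w + dt * D2 w + 0.5 * dt**2 * D4 w, with r = dt / dx**2."""
    a = 0.5 * r * r                      # contribution of the fourth difference
    side = r - 4.0 * a                   # neighbors i-1 and i+1
    center = 1.0 - 2.0 * r + 6.0 * a     # point i
    return [a, side, center, side, a]


def satisfies_maximum_principle(r, tol=1e-12):
    """True when every weight is nonnegative (the weights always sum to one)."""
    return min(stencil_weights(r)) >= -tol


def amplification_factor(r, theta):
    """Growth factor of the Fourier mode exp(i j theta) under the same update."""
    s = math.sin(theta / 2.0) ** 2
    return 1.0 - 4.0 * r * s + 8.0 * r * r * s * s


if __name__ == "__main__":
    for r in (0.25, 0.5, 0.6):
        print(r, stencil_weights(r), satisfies_maximum_principle(r),
              amplification_factor(r, math.pi))
```

The side weights equal $r - 2r^2$, so they stay nonnegative exactly when $r \le \tfrac{1}{2}$, and the growth factor of the most oscillatory mode exceeds one as soon as $r$ passes that value.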
Note that $(\Delta t)^2$ is dominated by $(\Delta x)^2$; thus, the error is of the same order as before. Consequently, no significant improvement was obtained. The trouble lies in that, while the local error in the time direction was improved, the local error in the space direction was not. As

$$u_{xx} = \Delta_x^2 u - \tfrac{1}{12}(\Delta x)^2 u_{xxxx} + O((\Delta x)^4), \tag{3.38}$$

the local space error can be improved by using the difference equation

$$w_{i,n+1} = w_{in} + \Delta t\,\Delta_x^2 w_{in} + \left[\tfrac{1}{2}(\Delta t)^2 - \tfrac{1}{12}(\Delta x)^2 \Delta t\right] \Delta_x^4 w_{in}. \tag{3.39}$$

In this case, it can be shown that

$$\|z_n\| = O((\Delta x)^4 + (\Delta x)^2 \Delta t + (\Delta t)^2) \tag{3.40}$$

if $r \le \tfrac{2}{3}$. The analysis above implies a lower limitation $r \ge \tfrac{1}{6}$ as well; however, a more refined argument shows that the restriction is unnecessary. Note that the error has now been reduced significantly and, fortuitously, the restriction on $r$ has been eased slightly. Unfortunately, (3.39) cannot be applied to the bounded region problem unless the solution can be extended across the boundaries by symmetry considerations, since otherwise the formula could not be evaluated at the grid points next to the boundary points.

The discussion of the initial value problem has so far been oriented toward the derivation of difference analogues; let us turn our attention more toward the convergence issue. Fritz John [9] has given a very thorough treatment of convergence for explicit difference analogues of the linear parabolic equation

$$u_t = a_0(x, t)\,u_{xx} + a_1(x, t)\,u_x + a_2(x, t)\,u + d(x, t), \tag{3.41}$$
where $a_0(x, t)$ is positive and bounded away from zero. Equation (3.41) was generalized to the slightly nonlinear equation where $a_2(x, t)u + d(x, t)$ is replaced by $d(x, t, u)$. Professor John's paper is without doubt the most outstanding contribution to the mathematical analysis of numerical methods for parabolic equations. His results are too extensive and many of his arguments too complex to present here; unfortunately, we shall have to be contented with a summary.

The general explicit difference equation for (3.41) is of the form

$$w_{i,n+1} = \sum_j c_{in}^j\, w_{i+j,n} + \Delta t\, d_{in}. \tag{3.42}$$

The coefficients $c_{in}^j$ can depend on $\Delta x$ and $\Delta t$ as well as the indicated $x$ and $t$. Let $T$ be held fixed throughout the discussion. Let us assume $c_{in}^j = 0$ unless $|j| \le m$, where $m$ is independent of $\Delta x$, $i$, and $n$. Equation (3.42) is said to be consistent if the equation goes over formally to (3.41) when $\Delta x$ and $\Delta t$ go to zero. It is easily seen that consistency is equivalent to (3.43).

For convenience, let $\Delta t$ be a function of $\Delta x$. Now, assume that the coefficients can be expanded in terms of $\Delta x$ as follows:

$$c_{in}^j = \alpha_{in}^j + \Delta x\,\beta_{in}^j + \tfrac{1}{2}(\Delta x)^2 \gamma_{in}^j(\Delta x), \tag{3.44}$$

where $\alpha_{in}^j$, $\beta_{in}^j$, and $\gamma_{in}^j(\Delta x)$ are uniformly bounded in the strip $0 \le t \le T$ and

$$\lim_{\Delta x \to 0} \gamma_{in}^j(\Delta x) = \gamma_{in}^j(0) \tag{3.45}$$

uniformly in the strip. Then, (3.43) is equivalent to the six relations (3.46).
The solution of the inhomogeneous equation (3.42) can be obtained by the Duhamel principle of superposition. Define the family of operators $L_{kn}$ acting on functions $g(x)$ as follows: (3.47) (3.48)

$$w_{in} = L_{0n}(f) + \Delta t \sum_{k=1}^{n-1} L_{kn}(d(x, t_k)), \tag{3.49}$$

where $u(x, 0) = f(x)$. The difference equation (3.42) is said to be stable if the operators $L_{kn}$ are uniformly bounded independently of $\Delta x$; i.e., if there exists a constant $Q$ such that

$$\|L_{kn}(g)\| \le Q\,\|g\|, \qquad 0 \le k \le n \le N. \tag{3.50}$$

To be more precise, this stability is stability with respect to the maximum norm; we shall treat stability with respect to other norms later. John's first important theorem is that stability and consistency imply convergence.
THEOREM. Let $u \in C^{2,1}$ be the solution of (3.41) subject to $u(x, 0) = f(x)$, and let $w$ be the solution of (3.42) subject to $w_{i0} = f_i$. If (3.42) is both consistent and stable, then $w$ converges uniformly to $u$ as $\Delta x$ tends to zero.

Since logically derived difference equations will usually be consistent, the convergence problem has been reduced to determining stability. He proves that a necessary condition for stability is that (3.51) for all real $\theta$ and all $(x_i, t_n)$ in the region ($i$ being the complex unit). The slightly stronger condition (3.52) for some $\delta > 0$ is sufficient for stability, provided that the quantities (3.53) exist and are uniformly bounded in the region for sufficiently small $\Delta x$. These results essentially provide a criterion to determine whether or not an explicit difference analogue of the initial value problem for (3.41) is convergent. Condition (3.52) is satisfied if

$$\alpha_{in}^j \ge 0, \qquad \alpha_{in}^0,\ \alpha_{in}^1 \ge \delta > 0, \qquad \sum_j \alpha_{in}^j = 1, \qquad \sum_j j\,\alpha_{in}^j = 0. \tag{3.54}$$

As these conditions hold for difference equation (3.22) for the linear differential equation (3.41) for sufficiently small $\Delta x$, a convergence theorem similar to the one proved earlier follows from the general result, provided (3.55) are bounded. A direct appeal to the definition shows that the conditions

$$c_{in}^j \ge 0, \qquad \sum_j c_{in}^j \le 1 + O(\Delta t) \tag{3.56}$$

also imply stability.

So far the existence and the required smoothness of the solution of the differential equation have been assumed. John bases his proofs of the results stated above in part on certain a priori bounds for the solution of the difference equation. The norm of the solution is shown to be less than a multiple of the norms of the data $f(x)$ and $d(x, t)$, the magnitude of the multiplier depending on the smoothness of the coefficients of the differential equation. These a priori bounds are natural starting points for proving existence and uniqueness of the solution of the differential problem, as well as the convergence of the difference solution to this solution. The arguments are clear but long. A typical result is the following.
THEOREM. If $a_0 \in C^{2,0}$, $a_1 \in C^{1,0}$, $a_2 \in C^{1,0}$, $d \in C^{1,0}$, $f \in C^2$ and $a_0$ is bounded away from zero, then there exists a unique solution $u \in C^{2,1}$ of the initial value problem for (3.41).

Generalized solutions exist under much less restrictive conditions. Fundamental solutions were also discussed. John [10] has also relaxed the requirement that the derivatives appearing in the discretization error terms be bounded as $x$ tends to infinity. Ritterman, a student of Professor John, has treated the boundary problem in a similar fashion, but his thesis [11] is as yet unpublished. The author has not seen it.

The convergence proofs given above all depend on considerable smoothness of the solution of the differential equation; with this smoothness it has been possible to establish the rate (i.e., exponents on $\Delta x$ and $\Delta t$) at which the solution of the difference equation converges to that of the differential equation. John, in discussing generalized solutions, obtained convergence with much weaker smoothness conditions and with those on the data rather than on the solution, but he sacrificed establishing the rate of convergence. Juncosa and Young [2, 3, 4] and Wasow [12] have studied the dependence of the rate on the smoothness of the initial condition $f(x)$ for the simplified version of (3.1) when $g_0(t) = g_1(t) = 0$, for several difference methods. As might be expected, the rate goes down as the restrictions on $f(x)$ are removed. In particular, the effect of isolated discontinuities has been studied. See the papers cited for details.

4. The Backward Difference Equation
The necessity of imposing a restriction on $r$ for the forward difference equation can be partially explained by considering the domains of dependence of the solutions of the difference and differential equations. For the bounded region the solution of the differential equation at the point $(\xi, \eta)$ depends on all the data for $t = 0$ and that portion on the boundary for which $t < \eta$. If an explicit difference equation is of the form

$$w_{i,n+1} = \sum_{j=-m}^{m} c_{in}^j\, w_{i+j,n}, \tag{4.1}$$

the solution at $(x_i, t_n)$ depends only on the data not above the lines

$$t = t_n \pm \frac{\Delta t}{m\,\Delta x}(x - x_i). \tag{4.2}$$

Thus, for the domain of dependence for the difference equation to converge to that of the differential equation it is necessary that $\Delta t = o(\Delta x)$ as $\Delta x$ tends to zero. Obviously, this does not imply bounded $r$ nor any particular choice of the bound on $r$; however, it does indicate a way around the inconvenience of having to limit the time step excessively. If we wish to choose $\Delta x$ and $\Delta t$ independently, the difference equation should be chosen so that its domain of dependence is independent of $\Delta x$ and $\Delta t$ and coincides with that of the differential equation. An easy way to accomplish this coincidence is to approximate part or all of $u_{xx}$ at time $t_{n+1}$. The backward difference equation results from replacing all of $u_{xx}$ at the advanced time; i.e.,

$$\Delta_x^2 w_{i,n+1} = \frac{w_{i,n+1} - w_{in}}{\Delta t} \tag{4.3}$$

corresponds to the heat equation. Now,

$$\Delta_x^2 u_{i,n+1} = \frac{u_{i,n+1} - u_{in}}{\Delta t} + O((\Delta x)^2 + \Delta t) \tag{4.4}$$

if $u \in C^{4,2}$. The crucial step in the convergence argument for the forward difference equation was the obtaining of (3.15). Since the second difference is nonpositive at a relative maximum, it is easy to see for the boundary value problem that (3.15) holds for the backward difference equation [7, 8] without any restriction on $r$. Consequently,

$$\|z_n\| = O((\Delta x)^2 + \Delta t) \tag{4.5}$$

as $\Delta x$ and $\Delta t$ tend to zero independently. Thus, the desire to be able to choose the increments independently has been realized; however, as the error is second order in $\Delta x$ and only first order in $\Delta t$, it can be shown [8] that the optimum choice between $\Delta x$ and $\Delta t$ is still constant $r$. The number of time steps required to reach the time $T$ may be reduced somewhat in comparison to the forward equation by using a value of $r$ larger than one-half.

Notice that three unknown values of $w$ appear in the difference equation written at $x = x_i$; hence, (4.3) defines $w_{i,n+1}$ implicitly, and a system of linear algebraic equations must be solved at each time step. Fortunately, this system is a special case of the following tridiagonal system:

$$\begin{aligned}
b_1 \xi_1 + c_1 \xi_2 &= \eta_1, \\
a_i \xi_{i-1} + b_i \xi_i + c_i \xi_{i+1} &= \eta_i, \qquad i = 2, \ldots, M - 1, \\
a_M \xi_{M-1} + b_M \xi_M &= \eta_M.
\end{aligned} \tag{4.6}$$
A normalized form of Gaussian elimination can be applied very efficiently to obtrainthe solution of (4.6). The algorithm is the following: P1 = bl, (4.7) ccl Pi = bi - a9 i = 2 , ..., M , Pi-1
-,
y1 = rll PI
i
= 2,.
., (4.9)
fM = Y M , fi =
yi
- &, Pi
i=M-1,
. . . , 1.
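The elimination (4.7)-(4.9) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `solve_tridiagonal` and the zero-based list layout are choices made here, not part of the original text, and no pivoting is attempted.

```python
def solve_tridiagonal(a, b, c, eta):
    """Solve a[i]*xi[i-1] + b[i]*xi[i] + c[i]*xi[i+1] = eta[i], i = 0..M-1.

    Indices are zero-based (an assumption of this sketch);
    a[0] and c[M-1] are never read.
    """
    m = len(b)
    beta = [0.0] * m
    gamma = [0.0] * m
    # Forward sweep: the recursions for beta and gamma.
    beta[0] = b[0]
    gamma[0] = eta[0] / beta[0]
    for i in range(1, m):
        beta[i] = b[i] - a[i] * c[i - 1] / beta[i - 1]
        gamma[i] = (eta[i] - a[i] * gamma[i - 1]) / beta[i]
    # Back substitution, starting from the last unknown.
    xi = [0.0] * m
    xi[m - 1] = gamma[m - 1]
    for i in range(m - 2, -1, -1):
        xi[i] = gamma[i] - c[i] * xi[i + 1] / beta[i]
    return xi
```

When the coefficients do not change between time steps, the list `beta` could be computed once and reused; the sketch recomputes it on every call for simplicity.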
If the coefficient matrix does not change from time step to time step, $\beta_i$ needs to be computed only once. In the case of the heat equation only four arithmetic operations per grid point are necessary to evaluate $w_{i,n+1}$ after $\beta_i$ is obtained. Note that this is one less than that required to evaluate the forward difference formula. It has been shown [13] that this method of solving the linear equations does not introduce large errors in the solution due to round-off errors in the calculation. Cuthill and Varga [14] have proposed a different algorithm which is slightly more efficient for self-adjoint equations with time independent coefficients; however, if the coefficients vary with time, the calculation required is greater than that required by the above algorithm. The round-off error results have been extended to the algorithm of Cuthill and Varga for the self-adjoint case by Blair [15].

Let us consider other boundary conditions than assigning the value of the solution. The conditions

(4.10)

or

$$a(t)\frac{\partial u}{\partial x} + b(t)u = g(t) \tag{4.11}$$
occur very frequently in physical problems. In (4.11) the sign of $a(t)b(t)$ must be such that the physical problem is stable. Rose [16] has discussed (4.11) in connection with a Crank-Nicolson type difference equation (to be discussed in Section 5) for a nonlinear parabolic equation of the form (3.20) under the assumption that $b(t)$ is bounded away from zero. If his argument is restricted to the backward difference equation, he shows by means of a sequence of comparisons that, if $u_x$ is replaced by a one-sided difference, the truncation error is $O(\Delta x + \Delta t)$ for sufficiently smooth $u$. Note that the rate of convergence has been reduced. The reduction results from low order correctness in the replacement of the boundary condition. Lotkin [17] has obtained related results for a more general problem. Similar sets of conclusions can be drawn for boundary condition (4.10) and for certain nonlinear generalizations of (4.10) and (4.11). A second order replacement for $u_x(0, t)$ is

(4.12)

Batten [18] has shown very recently that the use of (4.12) in either (4.10) or (4.11) leads to an over-all error of $O((\Delta x)^2 + \Delta t)$ for the backward difference equation. It is not necessary that $b(t) \neq 0$.

Consider next the backward difference equation for the nonlinear equation (3.20). If (3.21) holds, then the implicit relation can be solved for $u_{xx}$. Assume the differential equation to be in the form

$$u_{xx} = \psi(x, t, u, u_x, u_t), \tag{4.13}$$

where

(4.14)

The backward difference equation becomes

$$\Delta_x^2 w_{i,n+1} = \psi\bigl(x_i, t_{n+1}, w_{i,n+1}, \Delta_x w_{i,n+1}, (w_{i,n+1} - w_{in})/\Delta t\bigr). \tag{4.15}$$

The convergence of the solution of (4.15) to that of (4.13) for the boundary value problem is proved by a similar argument to the one for the heat equation [8, 19]. The error is $O((\Delta x)^2 + \Delta t)$, and no restrictions on $r$ arise. The algebraic problem may have become quite complicated, since the algebraic equations are no longer linear. However, an iterative method is easily devised. Let

$$w_{i,n+1}^{(0)} = w_{in}, \tag{4.16}$$

and determine successive approximations to $w_{i,n+1}$ by the linear, tridiagonal equations

$$\Delta_x^2 w_{i,n+1}^{(m+1)} - A w_{i,n+1}^{(m+1)} = -A w_{i,n+1}^{(m)} + \psi\bigl(x_i, t_{n+1}, w_{i,n+1}^{(m)}, \Delta_x w_{i,n+1}^{(m)}, (w_{i,n+1}^{(m)} - w_{in})/\Delta t\bigr), \tag{4.17}$$

where $A$ is a positive constant. The optimum choice of $A$ is
(4.18)

If (4.19) and $\Delta t$ is sufficiently small, then $w_{i,n+1}^{(m)}$ converges to the solution of (4.15). Note that (4.19) is needed only in order to demonstrate the usefulness of the iteration and is not needed for convergence of the solution of the difference equation to that of the differential equation.

If (4.13) can be written in the quasi-linear form

$$u_{xx} + a(x, t, u)u_x + b(x, t, u) = c(x, t, u)u_t, \tag{4.20}$$

then a modification of (4.15) may be obtained for which the algebraic problem is linear at each time step. Let

$$\Delta_x^2 w_{i,n+1} + a(x_i, t_{n+1}, w_{in})\Delta_x w_{i,n+1} + b(x_i, t_{n+1}, w_{in}) = c(x_i, t_{n+1}, w_{in})\frac{w_{i,n+1} - w_{in}}{\Delta t}. \tag{4.21}$$

As $w_{i,n+1}$ appears only linearly, the desired modification has been produced. The convergence proof remains valid [8]. A similar situation arises where the differential equation is in the self-adjoint form
(4.22)

It can be shown that

(4.23)

is a second order replacement of $(au_x)_x$ for sufficiently smooth $a$ and $u$. An obvious alteration of (4.21) using (4.23) gives a backward difference equation for (4.22) for which the algebraic equations are linear. The same convergence proof holds. Of course, (4.22) is a special case of (4.20), but it is frequently more convenient to leave it unaltered in form.

The backward difference equation can be applied readily to heat conduction problems in one space variable with adjoining regions of different properties. As an example, consider the problem

(4.24)

Assume that both the temperature and the heat flow are continuous at the transition. Then,

$$\lim_{x \uparrow x^*} u(x, t) = \lim_{x \downarrow x^*} u(x, t), \qquad \lim_{x \uparrow x^*} \beta_1 \frac{\partial u}{\partial x}(x, t) = \lim_{x \downarrow x^*} \beta_2 \frac{\partial u}{\partial x}(x, t). \tag{4.25}$$

In addition, assume that the temperature is specified initially and along the ends of the material. For convenience let $x^* = x_k = k\Delta x$. Then, the problem reduces to finding an analogue of (4.25), and a simple one is

$$\beta_1 \frac{w_{k,n+1} - w_{k-1,n+1}}{\Delta x} = \beta_2 \frac{w_{k+1,n+1} - w_{k,n+1}}{\Delta x}. \tag{4.26}$$
Notice that (4.26) is a tridiagonal linear equation; thus, the algebraic problem is changed only trivially. It can be shown that the difference system incorporating (4.26) is convergent with an error that is $O(\Delta x + \Delta t)$. Lotkin [17] has discussed generalizations and improvements of this procedure.

The time step has been assumed to be independent of the time in all the arguments given above. Frequently, the solution of a parabolic differential equation tends to smooth out as time progresses, and it would be desirable to take advantage of this knowledge to reduce the computational effort for such problems. Now, it is usually inconvenient to change $\Delta x$ in the midst of a calculation, but $\Delta t$ may be altered easily. Gallie and the author [20] have shown that $\Delta t$ may be increased following the relation

(4.27)

for the backward difference equation for the heat equation without a reduction in the rate of convergence, provided the derivatives appearing in the truncation error terms decay exponentially with time. The result also holds for certain analogues of the heat equation in several space variables [21], and the ideas of the proof extend to any of the other difference equations discussed in this paper for linear parabolic equations. The relation (4.27) has also been applied successfully many times to nonlinear equations by the author and his colleagues.
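As an illustration of this section, the following Python sketch marches the backward difference equation (4.3) for the heat equation with fixed boundary values, allowing a different time step at each level. It is a minimal sketch, not the author's program: the names `backward_step` and `march` are hypothetical, the boundary values are assumed constant in time, and the sequence of time steps is supplied by the caller rather than taken from relation (4.27).

```python
import math

def backward_step(w, r):
    """Advance one backward-difference step for u_t = u_xx.

    w holds grid values w[0..M]; w[0] and w[M] are boundary values,
    assumed here not to change with time.
    Interior equations: -r*w_new[i-1] + (1 + 2r)*w_new[i] - r*w_new[i+1] = w[i].
    """
    m = len(w) - 1
    n = m - 1                       # number of interior unknowns
    rhs = [w[i + 1] for i in range(n)]
    rhs[0] += r * w[0]              # known boundary values go to the right side
    rhs[n - 1] += r * w[m]
    beta = [0.0] * n
    gamma = [0.0] * n
    beta[0] = 1.0 + 2.0 * r
    gamma[0] = rhs[0] / beta[0]
    for i in range(1, n):           # forward elimination
        beta[i] = (1.0 + 2.0 * r) - r * r / beta[i - 1]
        gamma[i] = (rhs[i] + r * gamma[i - 1]) / beta[i]
    new = list(w)
    new[n] = gamma[n - 1]           # last interior point
    for i in range(n - 2, -1, -1):  # back substitution
        new[i + 1] = gamma[i] + r * new[i + 2] / beta[i]
    return new

def march(w0, dx, time_steps):
    """Apply backward_step once for every dt in time_steps."""
    w = list(w0)
    for dt in time_steps:
        w = backward_step(w, dt / dx ** 2)
    return w
```

Because no restriction on $r$ arises, `time_steps` may contain values far larger than $(\Delta x)^2/2$; the accuracy, of course, still depends on the size of each step.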
5. The Crank-Nicolson Difference Equation
The global truncation error for each of the difference equations treated so far is of the same order as the local error, provided the stability restrictions are satisfied. It would seem reasonable to hope that increasing the local accuracy would lead to the same increase in the global accuracy, and such is often the case. Let us begin by decreasing the local error in the time direction by deriving the Crank-Nicolson difference equation [22, 23]. The concepts involved are credited to von Neumann in both references.

It is frequently convenient to limit the number of time levels appearing in a difference analogue to two, although multilevel equations are both possible and useful. If this limitation is admitted, then it is natural to replace $u_t$ by $(w_{i,n+1} - w_{in})/\Delta t$. Now, this difference is first order correct at any point $(x_i, t)$ for $t_n \le t \le t_{n+1}$. At the particular choice $t = t_{n+1/2}$ it becomes centered and is second order correct if $u_{ttt}$ is bounded. In order to take advantage of this increase in accuracy it is necessary to replace $u_{xx}$ at $(x_i, t_{n+1/2})$. Assume that $u \in C^6$. This implies that $u_{xxxx} = u_{tt}$ and $u_{xxxxxx} = u_{ttt}$ are bounded. Then,

(5.1)

Rearranging,

$$\frac{u_{i,n+1} - u_{in}}{\Delta t} = \frac{1}{2}\Delta_x^2(u_{i,n+1} + u_{in}) + O((\Delta x)^2 + (\Delta t)^2). \tag{5.2}$$

Consequently, let us define the Crank-Nicolson equation to be

$$\frac{w_{i,n+1} - w_{in}}{\Delta t} = \frac{1}{2}\Delta_x^2(w_{i,n+1} + w_{in}). \tag{5.3}$$
Flatt [24] has shown that stability in the uniform sense does not hold for $r > R$, where $R$ depends on the length of the rod. For a rod of length one, R = 4 - W2. Another method of analysis must be introduced to treat the convergence of the solution of the Crank-Nicolson equation. This procedure will be based on a combination of Duhamel's principle and harmonic analysis. Let us consider the boundary value problem. The difference equation for the error is

(5.4)

for $i = 1, \dots, M-1$.

(5.5)

where

$$\text{(a)}\quad z_{in}^k = 0, \quad n \le k, \qquad \text{(b)}\quad z_{i,k+1}^k = \frac{\Delta t}{2}\Delta_x^2 z_{i,k+1}^k + \Delta t\, e_{ik}, \quad n = k+1. \tag{5.6}$$

It is easy to see by direct substitution that (5.5) and (5.6) provide the solution of (5.4); thus, the analysis of the error is reduced to estimating the solution of the homogeneous initial value problem, starting from the initial data $z_{k+1}^k$. First, note that (5.6b) can be rewritten in the form

(5.7)

which is the backward difference equation with $z_{ik}^k$ replaced by $e_{ik}\Delta t$ and $\Delta t$ by $\Delta t/2$. By the argument for the backward equation it follows that

$$\max_i |z_{i,k+1}^k| \le \Delta t \max_i |e_{ik}| = O((\Delta x)^2\Delta t + (\Delta t)^3). \tag{5.8}$$
Now, consider the auxiliary problem

$$\frac{v_{i,n+1} - v_{in}}{\Delta t} = \frac{1}{2}\Delta_x^2(v_{i,n+1} + v_{in}), \qquad v_{i0} \ \text{arbitrary}, \qquad v_{0,n+1} = v_{M,n+1} = 0. \tag{5.9}$$

Equation (5.9) is a linear homogeneous difference equation with constant coefficients subject to homogeneous boundary conditions. The analogous differential problem is the standard textbook example for the method of separation of variables; consequently, it seems natural for us to try to separate variables here by assuming a solution of the form

(5.10)

(5.11)

Thus, the following eigenvalue problem arises:

(5.12)

for $i = 1, \dots, M-1$, with $\phi_0 = \phi_M = 0$. It follows from (2.15) that the eigenfunctions are

$$\phi_i^{(p)} = \sin \pi p x_i, \qquad p = 1, \dots, M-1, \tag{5.13}$$

and that the corresponding eigenvalues are

$$\rho_p = \frac{1 - 2r\sin^2(\pi p \Delta x/2)}{1 + 2r\sin^2(\pi p \Delta x/2)}. \tag{5.14}$$
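The eigenvalue formula (5.14) is easy to check numerically. The sketch below, with hypothetical function names and a unit interval divided into `m` equal parts (assumptions of this example), evaluates $\rho_p$ and measures how well the pair $v_{in} = \sin \pi p x_i$, $v_{i,n+1} = \rho_p v_{in}$ satisfies the homogeneous Crank-Nicolson relation.

```python
import math

def cn_eigenvalue(p, r, m):
    """rho_p from (5.14) on the grid x_i = i/m, i = 0..m."""
    s = math.sin(math.pi * p / (2.0 * m)) ** 2   # sin^2(pi * p * dx / 2)
    return (1.0 - 2.0 * r * s) / (1.0 + 2.0 * r * s)

def cn_residual(p, r, m):
    """Largest residual of v_new - v_old = (r/2) * D(v_new + v_old),
    where D is the undivided second difference, v_old = sin(pi p x_i)
    and v_new = rho_p * v_old."""
    rho = cn_eigenvalue(p, r, m)
    phi = [math.sin(math.pi * p * i / m) for i in range(m + 1)]
    worst = 0.0
    for i in range(1, m):
        second = phi[i + 1] - 2.0 * phi[i] + phi[i - 1]
        res = (rho - 1.0) * phi[i] - 0.5 * r * (rho + 1.0) * second
        worst = max(worst, abs(res))
    return worst
```

For any positive `r` the computed `cn_eigenvalue` lies strictly between -1 and 1, which is the property the stability argument relies on.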
The natural topology for discussing eigenfunction expansions for differential equations is the $L_2$ topology, and such is also the case here. Define the inner product of two vectors defined on the points $x_i$, $i = 0, \dots, M$, as

$$(u, v) = \sum_{i=0}^{M} u_i v_i^* \Delta x, \tag{5.15}$$

where $v_i^*$ indicates the complex conjugate of $v_i$. Let

$$\|v\|_2 = (v, v)^{1/2} \tag{5.16}$$

denote the corresponding norm. The coefficient $\Delta x$ is introduced to maintain uniformity in estimates as $M$ tends to infinity. The standard facts of finite dimensional vector spaces will be assumed [25]. In particular, the set of eigenvectors of a symmetric operator form a basis for the space, and eigenvectors corresponding to distinct eigenvalues are orthogonal. Hence,
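$\sin \pi p x$ and $\sin \pi q x$ are orthogonal for $p \neq q$, provided that $p, q = 1, \dots, M-1$. Actually, it is easy to see [26] that

$$(\sin \pi p x, \sin \pi q x) = \tfrac{1}{2}\delta_{pq}, \qquad p, q = 1, \dots, M-1. \tag{5.17}$$

Let us apply these results to (5.9). Denote the vector $v_{ik}$ by $v_k$. Then,

$$v_0 = \sum_{p=1}^{M-1} c_p \phi^{(p)}, \tag{5.18}$$

where

$$c_p = 2(v_0, \phi^{(p)}). \tag{5.19}$$

As the solution of the recursion

$$\rho_{n+1} = \rho_p \rho_n, \qquad \rho_0 = c_p, \tag{5.20}$$

is

$$\rho_n = c_p \rho_p^n, \tag{5.21}$$

it follows that

$$v_n = \sum_{p=1}^{M-1} c_p \rho_p^n \phi^{(p)} \tag{5.22}$$

and

$$\|v_n\|_2^2 = \frac{1}{2}\sum_{p=1}^{M-1} |c_p|^2 \rho_p^{2n}. \tag{5.23}$$

As $|\rho_p| < 1$ for any choice of $r$,

$$\|v_n\|_2 \le \|v_0\|_2. \tag{5.24}$$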
A difference equation for which (5.24) holds for the homogeneous error equation is said to be stable in the $L_2$ norm. Note that this is merely a change of norm from stability in the uniform sense. The general relationship between stability and convergence will be discussed later. Let us show here that the above stability implies the convergence of the solution of the Crank-Nicolson equation. For $v_0 = z_{k+1}^k$, (5.24) implies that

$$\|z_n^k\|_2 \le \|z_{k+1}^k\|_2, \qquad n \ge k+1. \tag{5.25}$$

Relation (5.8) and the definition of the $L_2$ norm imply that

$$\|z_{k+1}^k\|_2 = O((\Delta x)^2\Delta t + (\Delta t)^3). \tag{5.26}$$

Thus,

$$\|z_n\|_2 \le \sum_{k=0}^{n-1} \|z_n^k\|_2 = n\,O((\Delta x)^2\Delta t + (\Delta t)^3) = O((\Delta x)^2 + (\Delta t)^2) \tag{5.27}$$
as $n\Delta t \le T$. Thus, as $\Delta x$ and $\Delta t$ tend to zero, the error tends to zero in the grid $L_2$ norm. Of course, this is not the usual integral $L_2$ topology; however, it can be shown [27] using interpolation to define the solution over the rectangle $(0 \le x \le 1,\ 0 \le t \le T)$ that the integral $L_2$ norm of the error is also $O((\Delta x)^2 + (\Delta t)^2)$. Note that we have preserved the local accuracy in the global error. The above argument is a special case of this method of the author [27]. It will be shown later that the optimal relation between $\Delta t$ and $\Delta x$ for the Crank-Nicolson equation is that the ratio $\Delta t/\Delta x$ be constant.

Consider the algebraic problem associated with the evaluation of the solution of the Crank-Nicolson equation. No essential difference between the linear equations in this case and in the backward difference equation case arises. Only the right-hand side is affected; consequently, the same elimination method may be applied. Indeed, a considerable increase in accuracy has been obtained for a very small increase in computing. The comparative aspect of the computing requirements will be discussed later.

Strang [28] has also treated the above convergence problem by different methods. His proof is based on matrix manipulations requiring explicit knowledge of the eigenfunctions of the difference operator. Although these eigenfunctions were used above, it will be seen in Section 11 that exhibiting them is not necessary for the general method used. As mentioned earlier, Juncosa and Young [2, 3, 4] and Wasow [12] have studied this convergence problem using the explicit solutions of both the difference and the differential equations.

The Crank-Nicolson method may be easily formulated for problems of a more general nature. For the differential equation (4.13) the difference equation becomes

$$\tfrac{1}{2}\Delta_x^2(w_{i,n+1} + w_{in}) = \psi\bigl(x_i, t_{n+1/2}, \tfrac{1}{2}(w_{i,n+1} + w_{in}), \tfrac{1}{2}\Delta_x(w_{i,n+1} + w_{in}), (w_{i,n+1} - w_{in})/\Delta t\bigr). \tag{5.28}$$

The convergence of the solution of (5.28) was first proved by Rose [16] under a restriction on $r$. More recently Lees has constructed a proof based on energy estimates [19, 29], and the restriction on $r$ has been removed. The algebraic problem associated with (5.28) is essentially the same as that for (4.15), and an iteration method analogous to (4.17) can be written down. For the somewhat simpler differential equation (4.20) the difference equation simplifies to

$$\tfrac{1}{2}\Delta_x^2(w_{i,n+1} + w_{in}) + \tfrac{1}{2}a\bigl(x_i, t_{n+1/2}, \tfrac{1}{2}(w_{i,n+1} + w_{in})\bigr)\Delta_x(w_{i,n+1} + w_{in}) + b\bigl(x_i, t_{n+1/2}, \tfrac{1}{2}(w_{i,n+1} + w_{in})\bigr) = c\bigl(x_i, t_{n+1/2}, \tfrac{1}{2}(w_{i,n+1} + w_{in})\bigr)\frac{w_{i,n+1} - w_{in}}{\Delta t}. \tag{5.29}$$
The solution of the algebraic problem may be obtained by iterating on $w_{i,n+1}$, using the old iterate in the evaluation of the coefficients. If an accurate estimate of the solution at time $t_{n+1/2}$ could be provided, it would be unnecessary to iterate at all. The author [30] proposed to use the forward difference equation to predict $w_{n+1/2}$ from $w_n$; however, the proof offered was incomplete. Perhaps a better method would be to use the backward difference equation (4.21) for a time step of $\Delta t/2$ to predict $w_{n+1/2}$, and then modify (5.29) to

$$\tfrac{1}{2}\Delta_x^2(w_{i,n+1} + w_{in}) + \tfrac{1}{2}a(x_i, t_{n+1/2}, w_{i,n+1/2})\Delta_x(w_{i,n+1} + w_{in}) + b(x_i, t_{n+1/2}, w_{i,n+1/2}) = c(x_i, t_{n+1/2}, w_{i,n+1/2})\frac{w_{i,n+1} - w_{in}}{\Delta t}. \tag{5.30}$$

A convergence proof has been constructed in case the differential equation is of the almost linear form

$$(a(x, t)u_x)_x + b(x, t, u) = c(x, t)u_t. \tag{5.31}$$

Another method for predicting $w_{n+1/2}$ would be to extrapolate from $w_n$ and $w_{n-1}$. Whether this can be done without iteration is not known.

So far, only the advantages of the Crank-Nicolson equation have been pointed out. A disadvantage of the Crank-Nicolson method with respect to the preceding methods is that greater smoothness is required of the solution of the differential equation to insure convergence. For some particular examples the solution of the backward difference equation may actually be better than that of the Crank-Nicolson equation; however, if the differential problem has a sufficiently smooth solution, it is to be expected that the Crank-Nicolson equation will produce superior accuracy.

Crank-Nicolson equations can, of course, be applied to problems for which slope conditions are specified at a boundary rather than the values of the solution; however, the manner of replacing the normal derivative is not analogous to that for $u_{xx}$ in the differential equation. The values of the solution at only the advanced time should appear; no averaging to obtain a timewise centered replacement should be attempted. To see this, consider the following example. Let

$$u(x, 0) = 0, \quad 0 < x < 1, \qquad u_x(0, t) = -1, \quad t > 0, \tag{5.32}$$

be part of the specification of the data. Naturally, the initial condition would go over to

$$w_{i,0} = 0, \qquad i = 0, \dots, M. \tag{5.33}$$
Then let the slope condition be replaced by

$$\tfrac{1}{2}\Delta_x(w_{0,n+1} + w_{0,n}) = -1, \qquad n \ge 0, \tag{5.34}$$

where $\Delta_x$ indicates the right hand first difference. Note that, as a result of (5.33),

$$\Delta_x w_{0,n} = (-1)^n - 1, \qquad n \ge 0; \tag{5.35}$$

thus, the imposed heat input rate oscillates in an undamped fashion about the desired rate. While this particular difficulty can be suppressed by adjusting $w_{0,0}$, the same type of oscillation creeps in as a result of truncation error if (5.34) is used, as the author has observed each time he has forgotten this simple example. It should be mentioned that the eigenfunctions used in the stability analysis change when the slope is specified. Frequently, cosines instead of sines appear.
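For the heat equation with assigned boundary values, one step of the Crank-Nicolson equation (5.3) can be sketched as follows. As the text observes, only the right-hand side differs from the backward difference case. The function name `crank_nicolson_step` and the assumption of boundary values that do not change with time are choices made for this illustration only.

```python
import math

def crank_nicolson_step(w, r):
    """One Crank-Nicolson step for u_t = u_xx with fixed boundary values.

    Interior equations, with h = r/2:
      -h*w_new[i-1] + (1 + r)*w_new[i] - h*w_new[i+1]
          = h*w[i-1] + (1 - r)*w[i] + h*w[i+1].
    """
    m = len(w) - 1
    n = m - 1                       # number of interior unknowns
    h = 0.5 * r
    rhs = [h * w[i - 1] + (1.0 - r) * w[i] + h * w[i + 1] for i in range(1, m)]
    rhs[0] += h * w[0]              # advanced-time boundary values (unchanged here)
    rhs[n - 1] += h * w[m]
    beta = [0.0] * n
    gamma = [0.0] * n
    beta[0] = 1.0 + r
    gamma[0] = rhs[0] / beta[0]
    for i in range(1, n):           # forward elimination
        beta[i] = (1.0 + r) - h * h / beta[i - 1]
        gamma[i] = (rhs[i] + h * gamma[i - 1]) / beta[i]
    new = list(w)
    new[n] = gamma[n - 1]
    for i in range(n - 2, -1, -1):  # back substitution
        new[i + 1] = gamma[i] + h * new[i + 2] / beta[i]
    return new
```

With smooth data such as $\sin \pi x$, a step of this kind with $r$ well above one-half remains accurate, in line with the second order time accuracy derived above.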
6. An Unconditionally Unstable Difference Equation
A seemingly simpler way to produce second order correctness in time than that of the Crank-Nicolson equation is to use a centered first difference for $u_t$, while replacing $u_{xx}$ by a difference at the middle time. For the heat equation [31] this difference equation becomes

$$\frac{w_{i,n+1} - w_{i,n-1}}{2\Delta t} = \Delta_x^2 w_{in}. \tag{6.1}$$

Note that $w_1$ must be obtained by some other relation, but let us forget that difficulty for a moment. The $L_2$ stability analysis method may again be applied, and the application of the Duhamel principle leads to the auxiliary initial value problem

$$\frac{v_{i,n+1} - v_{i,n-1}}{2\Delta t} = \Delta_x^2 v_{in}, \qquad v_{i,0} \ \text{and} \ v_{i,1} \ \text{arbitrary}, \qquad v_{0,n} = v_{M,n} = 0. \tag{6.2}$$

If a solution of the form

$$v_{in} = \rho_n \sin \pi p x_i \tag{6.3}$$

is assumed, then $\rho_n$ satisfies the second order ordinary difference equation

$$\rho_{n+1} + 8r\sin^2\frac{\pi p \Delta x}{2}\,\rho_n - \rho_{n-1} = 0. \tag{6.4}$$

Consequently,

$$\rho_n = a\theta_1^n + b\theta_2^n, \tag{6.5}$$

where $\theta_1$ and $\theta_2$ are the roots of the characteristic equation

$$\theta^2 + 8r\sin^2\frac{\pi p \Delta x}{2}\,\theta - 1 = 0. \tag{6.6}$$

Thus,

$$\theta_i = -4r\sin^2\frac{\pi p \Delta x}{2} + (-1)^i\left[1 + 16r^2\sin^4\frac{\pi p \Delta x}{2}\right]^{1/2}. \tag{6.7}$$

For $p = M - 1$, $\sin(\pi p \Delta x/2)$ is close to one, and

$$|\theta_1| > 1 + 4r. \tag{6.8}$$

If $\Delta x$ tends to zero as $\Delta t$ tends to zero, it is impossible that $r = O(\Delta t)$; thus, difference equation (6.1) is unstable in the $L_2$ sense for any choice of $r$ as $\Delta t$ and $\Delta x$ tend to zero. It will be shown later that this implies that the solution of the difference equation does not depend continuously on the data for the differential problem and that convergence cannot take place except under unrealistic restrictions on the data. Even then the round-off errors would grow unboundedly, and the computed numerical solution would bear little resemblance to the exact solution. This difference equation serves as a good example of the advantage of analysis over a purely experimental approach.
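The growth of the roots can be inspected directly. The sketch below evaluates the two roots of the characteristic equation for a given mode and reports the largest root magnitude over all modes; the function names and the choice of a unit interval with `m` subdivisions are assumptions of this illustration.

```python
import math

def leapfrog_roots(r, p, m):
    """Roots of theta^2 + 8*r*s*theta - 1 = 0, s = sin^2(pi*p*dx/2), dx = 1/m."""
    s = math.sin(math.pi * p / (2.0 * m)) ** 2
    disc = math.sqrt(1.0 + 16.0 * r * r * s * s)
    return -4.0 * r * s - disc, -4.0 * r * s + disc

def worst_growth(r, m):
    """Largest root magnitude over the modes p = 1..m-1."""
    return max(abs(leapfrog_roots(r, p, m)[0]) for p in range(1, m))
```

Since the product of the two roots is $-1$, one of them always exceeds one in magnitude, however small the positive value of `r` supplied.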
7. Higher Order Correct Difference Equations
It has been remarked that a difference equation involving the value of its solution at grid points more than one interval away spatially from the center term cannot, in general, be evaluated at the grid points next to the boundary; consequently, for boundary problems it is advantageous to restrict the difference equations to those leading to tridiagonal algebraic equations. While the Crank-Nicolson equation is a considerable improvement over the forward and backward equations, the author [32] has shown that it does not possess the highest order local accuracy that can be obtained even using the same six grid points. It is also possible to increase local accuracy by using more time levels; difference equations involving several time levels will be discussed later in this section.

Assume $u \in C^6$. The object for the moment is to derive a difference equation in which only the values of the solution at the six points appearing in the Crank-Nicolson equation occur and which is fourth order correct in the space coordinate and second order in time. As $(w_{i,n+1} - w_{in})/\Delta t$ is already second order correct in time at $(x_i, t_{n+1/2})$, it will be retained. The replacement of $u_{xx}$ may be facilitated by observing that

$$\frac{\partial^2 u_{i,n+1/2}}{\partial x^2} = \frac{1}{2}\Delta_x^2(u_{i,n+1} + u_{in}) - \frac{(\Delta x)^2}{12}\frac{\partial^4 u_{i,n+1/2}}{\partial x^4} + O((\Delta x)^4 + (\Delta t)^2), \tag{7.1}$$

since $u_{xxxx} = u_{xxt}$. Now, it can be shown that

$$\frac{\partial^3 u_{i,n+1/2}}{\partial x^2 \partial t} = \frac{1}{\Delta t}\Delta_x^2(u_{i,n+1} - u_{in}) + O((\Delta x)^2 + (\Delta t)^2). \tag{7.2}$$

Thus, if $r = \Delta t/(\Delta x)^2$ is held fixed and (7.2) substituted into (7.1), it follows that

$$\frac{\partial^2 u_{i,n+1/2}}{\partial x^2} = \frac{1}{2}\Delta_x^2\left[\left(1 - \frac{1}{6r}\right)u_{i,n+1} + \left(1 + \frac{1}{6r}\right)u_{in}\right] + O((\Delta x)^4). \tag{7.3}$$

Consequently, the difference equation

$$\frac{1}{2}\Delta_x^2\left[\left(1 - \frac{1}{6r}\right)w_{i,n+1} + \left(1 + \frac{1}{6r}\right)w_{in}\right] = \frac{w_{i,n+1} - w_{in}}{\Delta t} \tag{7.4}$$

is fourth order correct in space and second order in time for any fixed $r$. It has been shown [32] that the solution of (7.4) converges to the solution of the heat equation for the boundary value problem with an error that is $O((\Delta t)^2) = O((\Delta x)^4)$ in the uniform norm if $u$ is sufficiently smooth. A much simpler argument based on $L_2$ stability can be constructed to show that the $L_2$ error has this order for sufficiently smooth $u$.

Equation (7.4) can be extended to apply to more general parabolic equations. Consider first the linear parabolic equation

$$u_{xx} = a(x, t)u_t. \tag{7.5}$$

Again, we need to replace the $u_{xxxx}$ term in (7.1) through use of the differential equation. Now,

$$\frac{\partial^4 u}{\partial x^4} = \frac{\partial^2}{\partial x^2}\left(a(x, t)\frac{\partial u}{\partial t}\right)_{i,n+1/2} = \frac{1}{\Delta t}\Delta_x^2\bigl[a_{i,n+1/2}(u_{i,n+1} - u_{in})\bigr] + O((\Delta x)^2) \tag{7.6}$$

if $u$ is sufficiently smooth, $a \in C^4$, and $r$ is fixed. This leads to the difference equation

$$\frac{1}{2}\Delta_x^2\left[\left(1 - \frac{a_{i,n+1/2}}{6r}\right)w_{i,n+1} + \left(1 + \frac{a_{i,n+1/2}}{6r}\right)w_{in}\right] = a_{i,n+1/2}\frac{w_{i,n+1} - w_{in}}{\Delta t}, \tag{7.7}$$
which is locally accurate to terms that are $O((\Delta x)^4) = O((\Delta t)^2)$. This difference equation was introduced by the author in [30]; the convergence argument given there is incomplete, but a minor correction would show that the $L_2$ error is $O((\Delta x)^3) = O((\Delta t)^{3/2})$ for the case $a(x, t) = a(x)$. Note that a decrease in the global accuracy to less than the local accuracy is predicted; fortunately, Lees [29], using energy methods, has established that the global error for (7.7) is $O((\Delta x)^4)$ in the uniform norm for sufficiently smooth $a(x, t)$ and $u(x, t)$. The difference equation can clearly be generalized to $a = a(x, t, u)$ and to include lower order terms in the differential operator, with the resulting algebraic equations becoming nonlinear. At the moment a convergence argument has not been produced. A special case of (7.4) was considered by Crandall [33].

Let us consider introducing the values of the solution at more than two time levels into the difference equation. One motivation for studying such difference equations is the desire to treat nonlinear equations in a less complicated fashion than the methods proposed in the section on the Crank-Nicolson equation. As an example consider the almost linear equation

$$u_t = u_{xx} + f(u). \tag{7.8}$$

The standard Crank-Nicolson equation for (7.8) would be either

(7.9)

or

(7.10)

Both relations preserve second order correctness in time, but each generates a nonlinear algebraic system to be solved at each time step. If the replacement of the heat operator could be centered at the point $(x_i, t_n)$, then $f(u)$ could be evaluated at time $t_n$ from known values of the solution of the difference equation, and the algebraic equations would again be linear. Obviously, the practical advantage of such a difference equation would be large. A second and quite important motivation is that the method of obtaining higher order correctness using only two time levels does not generalize to several space variables.

Before attempting to obtain higher order accuracy, let us consider an equation with local accuracy being $O((\Delta x)^2 + (\Delta t)^2)$. The simplest multilevel difference equation for the heat equation is a three level formula resulting from replacing $u_{xx}$ by the average of the second differences at $t_{n-1}$, $t_n$, and $t_{n+1}$ and $u_t$ by a centered first difference:

$$\frac{w_{i,n+1} - w_{i,n-1}}{2\Delta t} = \frac{1}{3}\Delta_x^2(w_{i,n+1} + w_{in} + w_{i,n-1}). \tag{7.11}$$
It is easy to see that (7.11) is second order correct both in space and time. As (7.11) can be applied only for $n \ge 1$, it is necessary to obtain $w_1$ from $w_0$ by some other process. Before choosing a method for computing $w_1$, let us consider the truncation error for (7.11) and its sources. The $L_2$ stability method may be applied [34], and, after a somewhat longer argument than arises in the Crank-Nicolson case, it can be shown that

$$\|z_n\|_2 = O(\|z_1\|_2 + (\Delta x)^2 + (\Delta t)^2) \tag{7.12}$$

without restrictions on $\Delta x$ and $\Delta t$, provided $u \in C^4$. If the inherent accuracy of the method is to be preserved, then $w_1$ should be produced by a method such that $\|z_1\|_2 = O((\Delta x)^2 + (\Delta t)^2)$; even the forward difference equation produces the required accuracy for any $r$ for one time step. Notice that the algebraic equations are of the familiar tridiagonal form.

Let us turn our attention to obtaining higher order correctness. As

(7.13)

(7.14)

Thus, the difference equation

(7.15)

is fourth order correct in space and second order in time. Again, $w_1$ must be determined by another method. A somewhat tedious argument establishes the unrestricted $L_2$ stability of (7.15) for the boundary value problem and that

$$\|z_n\|_2 = O(\|z_1\|_2 + (\Delta x)^4 + (\Delta t)^2) \tag{7.16}$$

if $u$ is sufficiently smooth. This time we need $\|z_1\|_2 = O((\Delta x)^4 + (\Delta t)^2)$. It will be shown that constant $r$ is essentially the optimum choice of the relation between $\Delta x$ and $\Delta t$ for (7.15); if this choice is made, then predicting $w_1$ by the forward difference equation is adequate to obtain the desired error bound for $z_1$.

An alternate substitute for $u_{xxxx}$ is $u_{xxt}$. Then, the similar difference equation is

(7.17)
and this equation is also fourth order correct in space and second order in time for constant $r$. A somewhat similar analysis leads to the same conclusions as for (7.15).

Consider the generalization of the above difference equations for the differential equation

(7.18)

as an example. The term $u_{xxxx}$ can be replaced as follows:

(7.19)

Then, (7.17) would become

$$a_i\frac{w_{i,n+1} - w_{i,n-1}}{2\Delta t} = \frac{1}{3}\Delta_x^2(w_{i,n+1} + w_{in} + w_{i,n-1}) \;\cdots \tag{7.20}$$

For the nonlinear differential equation

$$\frac{\partial}{\partial x}\left(a(x)\frac{\partial u}{\partial x}\right) = b(x, t, u)\frac{\partial u}{\partial t} + c(x, t, u), \tag{7.21}$$

equation (7.11) becomes

$$\frac{1}{3}\Delta_x\bigl(a_i\Delta_x(w_{i,n+1} + w_{in} + w_{i,n-1})\bigr) = b(x_i, t_n, w_{in})\frac{w_{i,n+1} - w_{i,n-1}}{2\Delta t} + c(x_i, t_n, w_{in}). \tag{7.22}$$

Note that the algebraic equations are linear. Convergence proofs are lacking for any serious generalization of the heat equation; however, difference equations like (7.20) and (7.22) have been successfully used by the author on numerous occasions. The results were in good agreement with the results obtained by more tedious methods.

More than three time levels can be used. For instance, the difference equation

(7.23)

is a stable, second order analogue of the heat equation. The determination of stability for more than three levels begins to become a nuisance.
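The three level formula (7.11), in which $u_{xx}$ is replaced by the average of the second differences at three time levels and $u_t$ by a centered difference, can be sketched as follows for the heat equation with zero boundary values. The helper names are hypothetical, and the first level $w_1$ is produced here by a single forward difference step, which the text notes is adequate for one step.

```python
import math

def solve_constant_tridiagonal(off, diag, rhs):
    """Solve off*x[i-1] + diag*x[i] + off*x[i+1] = rhs[i] by elimination."""
    n = len(rhs)
    beta = [0.0] * n
    gamma = [0.0] * n
    beta[0] = diag
    gamma[0] = rhs[0] / diag
    for i in range(1, n):
        beta[i] = diag - off * off / beta[i - 1]
        gamma[i] = (rhs[i] - off * gamma[i - 1]) / beta[i]
    x = [0.0] * n
    x[n - 1] = gamma[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = gamma[i] - off * x[i + 1] / beta[i]
    return x

def second_difference(w):
    """Undivided second differences at interior points; zero at the two ends."""
    inner = [w[i + 1] - 2.0 * w[i] + w[i - 1] for i in range(1, len(w) - 1)]
    return [0.0] + inner + [0.0]

def three_level_march(w0, r, steps):
    """Return w at level `steps` for the three level formula, zero boundaries."""
    d0 = second_difference(w0)
    prev = list(w0)
    curr = [w0[i] + r * d0[i] for i in range(len(w0))]   # forward step for w1
    q = 2.0 * r / 3.0
    for _ in range(steps - 1):
        dp, dc = second_difference(prev), second_difference(curr)
        # w_new - q*D(w_new) = w_prev + q*(D(w_curr) + D(w_prev))
        rhs = [prev[i] + q * (dc[i] + dp[i]) for i in range(1, len(w0) - 1)]
        interior = solve_constant_tridiagonal(-q, 1.0 + 2.0 * q, rhs)
        prev, curr = curr, [0.0] + interior + [0.0]
    return curr
```

Each step again requires only one tridiagonal solve, as the text remarks.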
8. Comparison of the Calculation Requirements
A completely satisfactory estimate of the number of arithmetic operations required to produce a numerical solution of a preassigned accuracy is not possible on the basis of the preceding analysis, since the constants in the order estimates depend on values of the derivatives of the solutions of the differential equations and these values are obviously unknown in general. However, it is possible to derive heuristic asymptotic estimates of the calculation requirements as the allowable error tends to zero [8]. Assume that

$$\|z\| = O((\Delta t)^\alpha + (\Delta x)^\beta), \tag{8.1}$$

where $\alpha$ and $\beta$ are positive and the norm symbol indicates a vector norm. Assume also that the number of calculations per time step is $O((\Delta x)^{-1})$, as it is for all the linear difference equations and several of the nonlinear ones considered so far. Then, the total number of calculations up to a time $T$ is

$$c = O((\Delta x \Delta t)^{-1}). \tag{8.2}$$

Finally, assume that

$$\frac{\Delta t}{(\Delta x)^\gamma} = \text{constant} \tag{8.3}$$

as $\Delta x$ and $\Delta t$ tend to zero. The choice of $\gamma$ and the value of the constant should be consistent with stability requirements, if any. Then,

$$c = (\Delta x)^{-1-\gamma}, \tag{8.4}$$

where the multiplier on the right-hand side has been ignored. If the allowable error is $\epsilon$ and constants are again ignored,

$$(\Delta t)^\alpha + (\Delta x)^\beta = (\Delta x)^{\alpha\gamma} + (\Delta x)^\beta \le \epsilon. \tag{8.5}$$

If $\alpha\gamma < \beta$, then $(\Delta x)^{\alpha\gamma}$ dominates $(\Delta x)^\beta$ as $\Delta t$ and $\Delta x$ tend to zero, and (8.5) is asymptotically equivalent to

$$\Delta x = \epsilon^{1/\alpha\gamma}. \tag{8.6}$$

Thus,

$$c = \epsilon^{-(1+\gamma)/\alpha\gamma}, \qquad \alpha\gamma < \beta. \tag{8.7}$$

For $\beta < \alpha\gamma$, (8.5) is asymptotically equivalent to

$$\Delta x = \epsilon^{1/\beta}, \tag{8.8}$$

and

$$c = \epsilon^{-(1+\gamma)/\beta}, \qquad \beta < \alpha\gamma. \tag{8.9}$$
To minimize the number of calculations, the exponent

$$e = \begin{cases} (1+\gamma)/\alpha\gamma, & \alpha\gamma \le \beta, \\ (1+\gamma)/\beta, & \alpha\gamma \ge \beta, \end{cases} \tag{8.10}$$

should be minimized as a function of $\gamma$. It is easily seen that the optimum choice is

$$\gamma = \beta\alpha^{-1} \tag{8.11}$$

and that

$$e_{\min} = \frac{1}{\alpha} + \frac{1}{\beta}. \tag{8.12}$$

Moreover,

(8.13)

Thus, any increase in the global accuracy asymptotically reduces the total calculation if the order of the number of calculations per time step is not increased. Now assume that the solution of the differential equation is sufficiently smooth that the best estimates of the error obtained for each difference equation hold. Then, it follows that

$$e_{\min} = \begin{cases} \tfrac{3}{2}, & \text{forward and backward}, \\ 1, & \text{Crank-Nicolson and (7.11)}, \\ \tfrac{3}{4}, & \text{(7.4), (7.15), and (7.17)}, \end{cases} \tag{8.14}$$

and for their generalizations. Note that the higher order correct equations lead to a very large reduction in the amount of arithmetical work required to complete a problem if a small error is prescribed.
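The exponent bookkeeping of this section is simple enough to tabulate by machine. The sketch below uses exact rational arithmetic; the function names are illustrative, and the formula coded is the heuristic one described above, with all constants ignored.

```python
from fractions import Fraction

def work_exponent(alpha, beta, gamma):
    """Exponent e in c = eps**(-e) when dt = constant * dx**gamma.

    The error is taken as O(dt**alpha + dx**beta) and the work per
    time step as O(1/dx), as assumed in the text.
    """
    alpha, beta, gamma = Fraction(alpha), Fraction(beta), Fraction(gamma)
    return (1 + gamma) / min(alpha * gamma, beta)

def optimal_exponent(alpha, beta):
    """Return (gamma, e_min) for the optimum choice gamma = beta / alpha."""
    gamma = Fraction(beta) / Fraction(alpha)
    return gamma, work_exponent(alpha, beta, gamma)
```

For example, `optimal_exponent(1, 2)` corresponds to the forward and backward equations, `optimal_exponent(2, 2)` to the Crank-Nicolson equation, and `optimal_exponent(2, 4)` to the fourth order correct equations.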
9. Several Space Variables
The presently used finite difference methods for parabolic equations in several space variables can be separated into two categories: generalizations of the difference equations discussed above, and alternating-direction methods for which no single space-variable analogues can exist. The generalizations of the previous equations, along with certain questions that do not arise in the single space-variable problem, will be developed in this section, and the alternating-direction methods will be treated in the next section. The treatment of two space variables is typical, and the algebraic manipulations are somewhat simplified by restricting the discussion to this case. Consider first the boundary value problem for the heat equation

$$\begin{aligned} u_t &= u_{xx} + u_{yy}, & (x, y) \in R, \quad & 0 < t \le T, \\ u(x, y, 0) &= f(x, y), & (x, y) \in R, \quad & \\ u(x, y, t) &= g(x, y, t), & (x, y) \in \partial R, \quad & 0 < t \le T, \end{aligned} \tag{9.1}$$

where $R$ is a connected region in the plane and $\partial R$ its boundary. The region may be multiply connected. The forward difference equation easily generalizes to

$$w_{i,j,n+1} = w_{ijn} + \Delta t(\Delta_x^2 + \Delta_y^2)w_{ijn}, \tag{9.2}$$

provided that each spatially neighboring lattice point of $(x_i, y_j)$ is in either $R$ or $\partial R$. If $\partial R$ is made up entirely of segments of the lines $x = x_i$ and $y = y_j$ for a sequence of choices of $\Delta x$ and $\Delta y$ tending to zero, then the convergence analysis for the forward difference equation in one space variable clearly generalizes. A rectangular annulus and an L-shaped region can be examples of such regions. The stability restriction is

$$r \le \frac{1}{4} \tag{9.3}$$

if $\Delta x = \Delta y$ and

$$\Delta t\bigl[(\Delta x)^{-2} + (\Delta y)^{-2}\bigr] \le \frac{1}{2} \tag{9.4}$$

otherwise. With $p$ space variables and equal space-increments, the restriction becomes

$$r \le \frac{1}{2p}. \tag{9.5}$$
A much more interesting problem arises when the boundary $\partial R$ is curved or at least not so simple as above. Let $\partial R$ intersect the line $y = y_j$ at a point $(x^*, y_j)$ between $(x_i, y_j)$ and $(x_{i+1}, y_j)$, and assume $(x_i, y_j)$ interior to $R$ and $(x_{i+1}, y_j)$ exterior to $R$. There are several ways of handling the boundary values. Perhaps the simplest way would be to add the point $(x^*, y_j)$ to the lattice and assign the given boundary data there; also, it is feasible to transfer the data from $(x^*, y_j)$ to either $(x_i, y_j)$ or $(x_{i+1}, y_j)$. Let us consider both of these procedures.

If the point $(x^*, y_j)$ is added to the lattice, it is necessary to replace $u_{xx}$ at $(x_i, y_j)$ by an uncentered second difference. Let $x^* - x_i = \Delta x_1$ and $x_i - x_{i-1} = \Delta x_2$. Then,

$$\frac{\partial^2 u_{ijn}}{\partial x^2} = \frac{2}{\Delta x_1 + \Delta x_2}\left[\frac{u(x^*, y_j, t_n) - u_{ijn}}{\Delta x_1} - \frac{u_{ijn} - u_{i-1,j,n}}{\Delta x_2}\right] + O(\Delta x)\,u_{xxx}, \tag{9.6}$$
uzt+ being evaluated where required by the Taylor series with remainder expansion. Let
and define 612 and 6,2 analogously when the distance to any other neighboring lattice point is altered. Note that 6z2w,,ncoincides with AZ2wyn if Ax1 = Ax2 and that it is only first order correct if Ax1 # AX^. Then, (9.2) can be generalized to
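$$w_{i,j,n+1} = w_{ijn} + \Delta t\,(\delta_x^2 + \delta_y^2)\,w_{ijn}. \tag{9.8}$$
For stability in the uniform sense, the stability criterion becomes
$$1 - \frac{2\Delta t}{\Delta x_1 + \Delta x_2}\left[\frac{1}{\Delta x_1} + \frac{1}{\Delta x_2}\right] - \frac{2\Delta t}{\Delta y_1 + \Delta y_2}\left[\frac{1}{\Delta y_1} + \frac{1}{\Delta y_2}\right] \ge 0 \tag{9.9}$$
at each interior lattice point, where $\Delta x_1$ is the x-increment to the right at the point, etc. The restriction reduces to
$$\Delta t\left[(\Delta x_1 \Delta x_2)^{-1} + (\Delta y_1 \Delta y_2)^{-1}\right] \le \tfrac{1}{2}. \tag{9.10}$$
Since the local error is $O(\Delta x + \Delta y + \Delta t)$ if $u \in C^3$, satisfaction of (9.10) implies that
$$\|z_n\| = O(\Delta x + \Delta y + \Delta t). \tag{9.11}$$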
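The uncentered second difference and the resulting restriction on the time step are easy to check numerically. The sketch below is illustrative only; the function names are hypothetical, `dx_right` plays the role of $\Delta x_1$ and `dx_left` that of $\Delta x_2$, and the time-step bound is written in the product form of the stability restriction.

```python
def uncentered_second_difference(u_left, u_mid, u_right, dx_left, dx_right):
    """Second difference on unequal increments: exact for quadratics,
    only first order correct when dx_left != dx_right."""
    return (2.0 / (dx_left + dx_right)) * (
        (u_right - u_mid) / dx_right - (u_mid - u_left) / dx_left
    )

def max_time_step(dx1, dx2, dy1, dy2):
    """Largest Dt with Dt * [1/(dx1*dx2) + 1/(dy1*dy2)] <= 1/2
    at a single lattice point."""
    return 0.5 / (1.0 / (dx1 * dx2) + 1.0 / (dy1 * dy2))

def cubic_error(scale):
    """Error of the uncentered difference for u = x^3 at x = 1 (u_xx = 6),
    with increments 0.1*scale to the left and 0.03*scale to the right."""
    hl, hr = 0.1 * scale, 0.03 * scale
    approx = uncentered_second_difference(
        (1.0 - hl) ** 3, 1.0, (1.0 + hr) ** 3, hl, hr
    )
    return approx - 6.0
```

For a regular point with all increments equal to $h$, `max_time_step` returns $h^2/4$, in agreement with $r \le 1/4$; a single short increment next to the boundary reduces the admissible step roughly in proportion to that increment.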
Note that (9.10) is more stringent than (9.4), as $\Delta x_k \le \Delta x$ and $\Delta y_k \le \Delta y$. Consequently, if an interior grid point is quite close to the boundary (and, practically, this seems inevitable), the computing requirements could be increased severalfold over those if (9.4) could be used as the criterion.

Let us consider transferring the data. The choice of always transferring the data to the interior of the region, which in this example means assigning
$$w_{ijn} = g(x^*, y_j, t_n), \tag{9.12}$$
seems to have advantages over adding points outside $R$. There are fewer lattice points, which implies slightly less calculation, and the convergence proof would be confused somewhat if derivatives of the solution $u(x, y, t)$ at points outside the region appear in the analysis. The simplest general method of assigning a boundary value at an interior grid point $(x_i, y_j)$ that must be converted to a boundary point is to use the boundary value $g(x, y, t_n)$ at the nearest boundary point along either $x = x_i$ or $y = y_j$. Assume that $u \in C^1$. Then, by the mean value theorem the error in the solution at this converted point is first order in $\Delta x$ or $\Delta y$. Thus,
$$z_{ijn} = O(\Delta x + \Delta y), \qquad (x_i, y_j) \in \partial R^*, \tag{9.13}$$
where $\partial R^*$ is the effective boundary for the difference equation. By an argument essentially the same as before it can be shown that
$$\|z_n\| = O(\Delta x + \Delta y + \Delta t). \tag{9.14}$$
Since the space increments are unchanged about the remaining interior grid points, the stability criterion is (9.4). The method of transferring the data to the next interior grid points appears to be superior for the forward difference equation to the method of introducing irregular lattice points at the intersections of the boundary and the grid lines. The backward difference equation can also be generalized to
$$(\delta_x^2 + \delta_y^2)\,w_{i,j,n+1} = \frac{w_{i,j,n+1} - w_{ijn}}{\Delta t}. \tag{9.15}$$
It is easy to see that the maximum principle analysis extends to (9.15) for both methods of treating curved boundaries and that no stability restriction arises. Thus, the truncation error for the general region is $O(\Delta x + \Delta y + \Delta t)$ if $u \in C^3$ for either procedure. One of the advantages of the backward difference equation over the forward difference equation can be displayed from this error estimate. For convenience in the discussion let $\Delta x = \Delta y$. The time step may be taken to be $O(\Delta x)$ so that the time and distance truncation errors are in balance, whereas this balancing is not possible with the forward difference equation; consequently, many fewer time steps can be taken with the backward equation without affecting the order of the error.

A serious algebraic difficulty arises from the use of the backward difference equation. The linear equations are no longer tridiagonal and elimination methods require an excessive amount of calculation if the mesh size is small. Various iterative methods have been developed for treating this elliptic difference system. The two most popular methods are over-relaxation and alternating-direction methods.* The author's experience is that the alternating-direction methods are many times more efficient for large-scale problems. Actually, it is not necessary to solve the linear equations exactly in order to preserve the inherent accuracy of the method. The author has shown [35] that, if the number of alternating-direction iterations taken at each time step is $O((\log \Delta t)^2)$, the truncation error remains $O((\Delta x)^2 + \Delta t)$ for at least certain simple boundary conditions when $R$ is a rectangle. Experience indicates that this conclusion is applicable to more general problems. Since the number of calculations for one alternating-direction iteration is $O((\Delta x)^{-2})$, the number of calculations per time step is $O((\Delta x)^{-2}(\log \Delta t)^2)$, which is but a little more than for the forward difference equation.
Thus, if the number of iterations is chosen on the above basis in the curvilinear boundary problem, the reduction in the number of time steps will strongly overcome the extra effort per time step.
* These are discussed in a forthcoming volume of this series.
The backward difference equation can be adapted to treat the boundary slope problem
$$\frac{\partial u}{\partial \nu} = g(x, y, t), \qquad (x, y) \in \partial R, \quad 0 < t \le T, \tag{9.16}$$
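where $\partial u/\partial \nu$ is the outer normal derivative. Restrict the boundary $\partial R$ to segments of the lines $x = x_{i+1/2}$ and $y = y_{j+1/2}$. Let $\Delta_\nu w_{ijn}$ represent the outer normal difference at a point $(x_i, y_j)$; i.e., the difference of $w$ at the boundary point and $w$ at the nearest interior grid point divided by the appropriate space increment. The difference system
$$(\Delta_x^2 + \Delta_y^2)\,w_{i,j,n+1} = \frac{w_{i,j,n+1} - w_{ijn}}{\Delta t}, \qquad (x_i, y_j) \in R,$$
$$\Delta_\nu w_{i,j,n+1} = g(x_i, y_j, t_{n+1}), \qquad (x_i, y_j) \in \partial R, \tag{9.17}$$
$$w_{i,j,0} = f(x_i, y_j), \qquad (x_i, y_j) \in R,$$
can be shown [36] to converge if $u \in C^3$. The error is $O(\Delta x + \Delta t)$ in the uniform norm if $R$ is a rectangle and $O(\Delta x + \Delta t)$ in the $L_1$ norm if $R$ is the more general polygonal region. The $L_1$ norm is defined by
$$\|z_n\|_1 = \sum_{i,j} |z_{ijn}|\,\Delta x\,\Delta y. \tag{9.18}$$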
The appearance of the $L_1$ norm is somewhat unusual. The general curvilinear boundary presents greater difficulties; indeed, to the author's knowledge no method has been offered with a proof of convergence. There are several ways of obtaining a difference analogue of the differential system. The normal derivative can be replaced directly in the following fashion. Introduce the irregular boundary points discussed previously and extend the normal to the boundary into $R$ until it intersects a grid line. Then, use linear interpolation to define a value of the solution at time $t_{n+1}$ in terms of the values at the nearest spatial grid points, regular or irregular, on the grid line each side of the intersection. Finally, form a normal difference using the values of the solution at the boundary and at the intersection. The author has used this method with apparent success on several occasions; however, the convergence rate was not as rapid as desired. Other methods are based on using the conservation of heat for an irregularly shaped increment of the area in the neighborhood of the boundary. This is possible since (9.16) specifies the heat flux across the external boundary. These methods have also led to accepted results; note that any such procedure is essentially a network analogue for the system.

The Crank-Nicolson equation can also be extended to the two dimensional problem. It is
$$\frac{w_{i,j,n+1} - w_{ijn}}{\Delta t} = \tfrac{1}{2}(\Delta_x^2 + \Delta_y^2)(w_{i,j,n+1} + w_{ijn}). \tag{9.19}$$
Obviously, it is still locally second order correct in time and space. If $R$ is a rectangle, the $L_2$ stability analysis can be made in essentially the same fashion as before. For the boundary value problem on the unit square the eigenfunctions are $\sin \pi p x \, \sin \pi q y$, $p, q = 1, \ldots, M - 1$, and the $L_2$ norm is
$$\|z\|_2 = \Bigl[\sum_{i,j} z_{ij}^2 (\Delta x)^2\Bigr]^{1/2}. \tag{9.20}$$
Under the same conditions as before the truncation error is $O((\Delta x)^2 + (\Delta t)^2)$. Stability for polygonal regions with sides parallel to the axes follows from the known distribution of eigenvalues of the Laplace difference operator [6]; consequently, the convergence result holds for the boundary value problem in such a region if $u \in C^4$. In the curvilinear case the local error becomes $O(\Delta x + (\Delta t)^2)$ for either of the two methods of handling the boundary. It can be shown on the basis of the minimax principle [37] that stability holds when irregular boundary points are introduced, and the stability for the method of transferring data follows from the remarks above. Thus, the global error is $O(\Delta x + (\Delta t)^2)$ in the $L_2$ sense for sufficiently smooth $u$. The algebraic problem is again essentially the same as for the backward difference equation.

The high order correct equation (7.4) cannot be generalized to more than one space variable. It would be necessary to replace $u_{xxxx} + u_{yyyy}$ by use of the differential equation, but
$$\frac{\partial^4 u}{\partial x^4} + \frac{\partial^4 u}{\partial y^4} = \Bigl(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\Bigr)^2 u - 2\frac{\partial^4 u}{\partial x^2 \partial y^2} = \frac{\partial^2 u}{\partial t^2} - 2\frac{\partial^4 u}{\partial x^2 \partial y^2} \tag{9.21}$$
is the best that we can do. As the discretization of utt requires three time levels, the two level high order correct method is impossible. The three level formula (7.11) obviously generalizes. The analysis proceeds in the same manner for a rectangular region and then extends to more general regions essentially like that for the Crank-Nicolson equation. In order to extend (7.15) it is necessary to replace the right-hand side of (9.21) by differences. One such replacement leads to the difference equation [34]
- 12 [At2wijn - 4A22Au2~ijn].
(9.22)
The equation is fourth order correct in space and second order in time; a simple but uninteresting argument shows unconditional $L_2$ stability for the boundary value problem on a rectangle. Equation (7.17) does not generalize for the same reason that (7.4) does not. The extensions of the various difference equations listed above to linear
and nonlinear parabolic equations follow very much as for a single space variable. Since the algebraic equations that must be solved at each time step for the implicit difference equations have to be treated by iteration anyway, the advantage of noniterative equations such as discussed at the end of Section 5 tends to be minimized.

10. Alternating-Direction Methods
The use of implicit difference schemes is motivated by several desires. The primary one is to obtain unconditional stability. Better accuracy is also sought. Finally, from a practical point of view it is highly desirable that the generated algebraic systems be easily solvable. The implicit equations for a single variable satisfy these wishes reasonably well, but the implicit equations introduced so far for treating several space variable problems are lacking in the third category. The alternating-direction methods introduced by Peaceman, Rachford, and the author [21, 38, 39, 40] are intended to simplify the solution of the algebraic equations and to preserve unconditional stability and reasonable accuracy, thus fulfilling the requirements above.

Consider first the heat equation on the unit square. The solution of tridiagonal linear equations is arithmetically simple; if the algebraic equations are to be of this form, then only one of the space derivatives can be evaluated at the advanced time level. This restriction leads to the difference equation
$$\frac{w_{i,j,n+1} - w_{ijn}}{\Delta t} = \Delta_x^2 w_{i,j,n+1} + \Delta_y^2 w_{ijn}. \tag{10.1}$$
Let us consider the stability of (10.1) for the boundary value problem. As (10.1) is linear, the auxiliary problem would be the initial value problem for (10.1) subject to vanishing boundary values, and the eigenfunctions would be
$$v_{ijn} = \rho^n \sin \pi p x_i \, \sin \pi q y_j. \tag{10.2}$$
Thus,
$$\rho = \frac{1 - 4r \sin^2(\pi q \Delta x/2)}{1 + 4r \sin^2(\pi p \Delta x/2)}. \tag{10.3}$$
Now, if $r$ is large, the magnitude of the stability ratio can be made large by taking $q = M - 1$ and $p = 1$; hence, unlimited stability does not result from (10.1). Now, notice that the use of
$$\frac{w_{i,j,n+2} - w_{i,j,n+1}}{\Delta t} = \Delta_x^2 w_{i,j,n+1} + \Delta_y^2 w_{i,j,n+2} \tag{10.4}$$
would lead to a stability ratio in which the positions of $p$ and $q$ would be
interchanged. Consider taking one time step using (10.1) and then one using (10.4). As the eigenfunctions are the same for both relations, the stability ratio for the double step is
$$\rho = \frac{1 - 4r \sin^2(\pi p \Delta x/2)}{1 + 4r \sin^2(\pi p \Delta x/2)} \cdot \frac{1 - 4r \sin^2(\pi q \Delta x/2)}{1 + 4r \sin^2(\pi q \Delta x/2)}, \tag{10.5}$$
which is bounded in magnitude by one for any size time step. Thus, the effect of using two possibly unstable difference equations alternately is to produce a stable equation. Since we are interested in the solution only after the double step, let us alter $\Delta t$ to be the double step and introduce an intermediate value notation for the solution at the end of one time step. Then, the difference system becomes
$$\frac{w^*_{ij} - w_{ijn}}{\Delta t/2} = \Delta_x^2 w^*_{ij} + \Delta_y^2 w_{ijn},$$
$$\frac{w_{i,j,n+1} - w^*_{ij}}{\Delta t/2} = \Delta_x^2 w^*_{ij} + \Delta_y^2 w_{i,j,n+1}. \tag{10.6}$$
As the stability ratio for (10.6) is given by (10.5) with the fours replaced by twos, (10.6) is unconditionally $L_2$-stable. As each half of (10.6) represents tridiagonal systems of algebraic equations, the algebraic problem is very simple. Finally, we need to investigate the accuracy of the method. This is facilitated by eliminating the intermediate values. For $R$ a rectangle the elimination leads to
$$\frac{w_{i,j,n+1} - w_{ijn}}{\Delta t} = \tfrac{1}{2}(\Delta_x^2 + \Delta_y^2)(w_{i,j,n+1} + w_{ijn}) - \frac{\Delta t}{4}\,\Delta_x^2 \Delta_y^2\,(w_{i,j,n+1} - w_{ijn}). \tag{10.7}$$
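The two half-steps of the alternating-direction procedure, each implicit in one direction only, can be sketched as follows. This is a minimal illustration and not the author's program: it assumes the unit square, zero boundary values, $\Delta x = \Delta y$, and $r = \Delta t/(\Delta x)^2$ with $\Delta t$ the full (double) step; the function names are hypothetical.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Elimination for a tridiagonal system (no pivoting)."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for k in range(1, n):
        denom = diag[k] - lower[k] * c[k - 1]
        c[k] = upper[k] / denom
        d[k] = (rhs[k] - lower[k] * d[k - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for k in range(n - 2, -1, -1):
        x[k] = d[k] - c[k] * x[k + 1]
    return x

def adi_step(w, r):
    """One full alternating-direction step: x-implicit half step,
    then y-implicit half step.  Boundary values are zero."""
    m = len(w) - 1
    n = m - 1
    lower = [-0.5 * r] * n
    diag = [1.0 + r] * n
    upper = [-0.5 * r] * n
    star = [[0.0] * (m + 1) for _ in range(m + 1)]
    for j in range(1, m):  # implicit in x, explicit in y
        rhs = [w[i][j] + 0.5 * r * (w[i][j + 1] - 2.0 * w[i][j] + w[i][j - 1])
               for i in range(1, m)]
        line = solve_tridiagonal(lower, diag, upper, rhs)
        for i in range(1, m):
            star[i][j] = line[i - 1]
    new = [[0.0] * (m + 1) for _ in range(m + 1)]
    for i in range(1, m):  # implicit in y, explicit in x
        rhs = [star[i][j]
               + 0.5 * r * (star[i + 1][j] - 2.0 * star[i][j] + star[i - 1][j])
               for j in range(1, m)]
        line = solve_tridiagonal(lower, diag, upper, rhs)
        for j in range(1, m):
            new[i][j] = line[j - 1]
    return new

def l2_norm(w):
    return math.sqrt(sum(v * v for row in w for v in row))
```

Only tridiagonal systems are solved, one per grid line in each half-step, which is the practical point of the method. The value of $r$ may be taken far beyond the limit $1/4$ of the forward difference equation.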
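Note that (10.7) is a perturbation of the Crank-Nicolson equation. If $u \in C^6$, it follows that (10.7) is locally second order correct both in space and time; consequently, the stability implies that the global error is also. Extend the ideas above to three space variables:
$$\frac{w^* - w_n}{\Delta t/3} = \Delta_x^2 w^* + \Delta_y^2 w_n + \Delta_z^2 w_n,$$
$$\frac{w^{**} - w^*}{\Delta t/3} = \Delta_x^2 w^* + \Delta_y^2 w^{**} + \Delta_z^2 w^*, \tag{10.8}$$
$$\frac{w_{n+1} - w^{**}}{\Delta t/3} = \Delta_x^2 w^{**} + \Delta_y^2 w^{**} + \Delta_z^2 w_{n+1}.$$
Unfortunately, (10.8) is unstable for any useful value of $r$; in particular, $r > 3/2$ implies instability. The author has not bothered to determine the exact criterion.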
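The stability ratios discussed above can be tabulated directly. The sketch below is illustrative only. The single-step and double-step ratios follow (10.3) and (10.5). The three-variable ratio rests on an assumption: that each third of the step treats one direction implicitly and the other two explicitly. The function names are hypothetical.

```python
import math

def s2(k, m):
    """sin^2(pi * k * Dx / 2) with Dx = 1/m."""
    return math.sin(0.5 * math.pi * k / m) ** 2

def ratio_single(r, p, q, m):
    """Stability ratio (10.3): x-direction implicit, y-direction explicit."""
    return (1.0 - 4.0 * r * s2(q, m)) / (1.0 + 4.0 * r * s2(p, m))

def ratio_double(r, p, q, m):
    """Stability ratio (10.5) for one step of each kind in succession."""
    return ratio_single(r, p, q, m) * ratio_single(r, q, p, m)

def ratio_three_variable(r, p, q, s, m):
    """Ratio for three third-steps, each implicit in one direction only
    (assumed form of the three-variable analogue)."""
    a, b, c = (4.0 * s2(k, m) for k in (p, q, s))
    h = r / 3.0
    num = (1.0 - h * (b + c)) * (1.0 - h * (a + c)) * (1.0 - h * (a + b))
    den = (1.0 + h * a) * (1.0 + h * b) * (1.0 + h * c)
    return num / den
```

For the highest frequency in all three directions each factor of the three-variable ratio exceeds one in magnitude as soon as $r$ is somewhat larger than $3/2$, whereas the two-variable double step stays bounded by one for every $r$.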
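Recall that (10.6) and (10.7) are equivalent on a rectangle; actually, any alternating-direction difference system that leads to a perturbation of the Crank-Nicolson equation like (10.7) is equally satisfactory. With this in mind, let us attempt to set up an alternating-direction modification of the Crank-Nicolson method from the beginning [40]. First, evaluate the x-derivative at $t_{n+1/2}$ to obtain a first approximation $w^*_{n+1}$ at time $t_{n+1}$:
$$\tfrac{1}{2}\Delta_x^2 (w^*_{n+1} + w_n) + \Delta_y^2 w_n = \frac{w^*_{n+1} - w_n}{\Delta t}. \tag{10.9a}$$
Then, move the evaluation of the y-derivative ahead:
$$\tfrac{1}{2}\Delta_x^2 (w^*_{n+1} + w_n) + \tfrac{1}{2}\Delta_y^2 (w_{n+1} + w_n) = \frac{w_{n+1} - w_n}{\Delta t}. \tag{10.9b}$$
Note that (10.9b) is almost the Crank-Nicolson equation; if the intermediate solution $w^*_{n+1}$ is eliminated, then again (10.7) is satisfied for a rectangular region. Thus, the systems (10.6) and (10.9) are equivalent on a rectangle, although the intermediate values are different. Now, the generalization of (10.9) to three (or more) space variables is clear. Let
$$\tfrac{1}{2}\Delta_x^2 (w^*_{n+1} + w_n) + \Delta_y^2 w_n + \Delta_z^2 w_n = \frac{w^*_{n+1} - w_n}{\Delta t}, \tag{10.10a}$$
$$\tfrac{1}{2}\Delta_x^2 (w^*_{n+1} + w_n) + \tfrac{1}{2}\Delta_y^2 (w^{**}_{n+1} + w_n) + \Delta_z^2 w_n = \frac{w^{**}_{n+1} - w_n}{\Delta t}, \tag{10.10b}$$
$$\tfrac{1}{2}\Delta_x^2 (w^*_{n+1} + w_n) + \tfrac{1}{2}\Delta_y^2 (w^{**}_{n+1} + w_n) + \tfrac{1}{2}\Delta_z^2 (w_{n+1} + w_n) = \frac{w_{n+1} - w_n}{\Delta t}, \tag{10.10c}$$
where the space indices have been suppressed. It can be shown that (10.10) is locally second order correct in space and time for sufficiently smooth $u$. As it is unconditionally $L_2$-stable, it is convergent with an $L_2$-error that is $O((\Delta x)^2 + (\Delta t)^2)$. The linear equations are of the desired tridiagonal form. Again, the proofs apply to $R$ being a three dimensional interval. Equations (10.10) can be put into a more convenient form by subtracting (10.10a) from (10.10b) and (10.10b) from (10.10c):
$$\tfrac{1}{2}\Delta_y^2 (w^{**}_{n+1} - w_n) = \frac{w^{**}_{n+1} - w^*_{n+1}}{\Delta t}, \tag{10.10b'}$$
$$\tfrac{1}{2}\Delta_z^2 (w_{n+1} - w_n) = \frac{w_{n+1} - w^{**}_{n+1}}{\Delta t}. \tag{10.10c'}$$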
The system (10.10a), (10.10b'), and (10.10c') is perhaps a bit easier to treat on a computer. A predecessor for three space-variable problems was developed by Rachford and the author [21]. This method amounts to a modification of the backward difference equation instead of the Crank-Nicolson relation; consequently, (10.10) should be superior to the older method. An equivalent method has also been developed by Brian [41].

It has been shown by Birkhoff and Varga [42] that the argument based on the elimination of the intermediate solutions does not generalize to either difference equations with variable coefficients or nonrectangular regions. The reason for the failure of the analysis for nonrectangular regions is that (10.6) and (10.7), for instance, are no longer equivalent, since the operators $\Delta_x^2$ and $\Delta_y^2$ commute only on a rectangle. This observation points out an error in the proofs in [21, 43]. A similar lack of commutativity prevents the extension of the argument to variable coefficients. Although Birkhoff and Varga point out the meaning of their results, the paper has been misinterpreted. They do not show that the alternating-direction methods fail for such problems; indeed, the methods are very useful for the more complex problems. Recently, Lees [44] has obtained convergence proofs for alternating-direction methods by means of energy estimates for the nonlinear parabolic equation
(10.11) on an arbitrary domain. At the moment his results are based on fixed $r$. Both the generalization of (10.6) and those of the methods of Douglas and Rachford [21] for both two and three space variables have been treated. For the linear parabolic equation
$$\nabla \cdot (a(x, y) \nabla u) = u_t, \tag{10.12}$$
If $t$ appears in the coefficients, the evaluation should be at time $t_{n+1/2}$ to preserve second order correctness in time. The forms of other generalizations are evident. A loss in local accuracy will occur for nonlinear equations if the coefficients are evaluated using the solution at the old time level. The estimation of the coefficients at $t_{n+1/2}$ is as useful for the alternating-direction methods as for the Crank-Nicolson method for one space variable; however, no work along these lines has been done to the author's knowledge.
11. Abstract Stability Analysis
That stability and convergence are intimately related is apparent from the numerous difference equations treated so far. Let us turn to an abstract formulation of stability and its implications. Consider the linear parabolic equation
$$u_t = Au + b, \tag{11.1}$$
where $A$ is a linear elliptic operator. $A$ may have variable coefficients. Assume that
$$u(x, 0) = f(x), \qquad x \in R,$$
$$u(x, t) = g(x, t), \qquad x \in \partial R, \quad 0 < t \le T. \tag{11.2}$$
The dimensionality of $R$ is arbitrary. Replace the differential system by the difference equation
$$w_{n+1} = C_n w_n + q_n, \tag{11.3}$$
where $w_n$ is the vector representing the approximate solution at the grid points in $R$ at time $t_n$. Assume that the space increments are functions of $\Delta t$. Although (11.3) is explicit in appearance, it still represents the general two-level difference equation, since the multiplier of $w_{n+1}$ in an implicit equation must be invertible if the equation is to be practically useful. Let us assume that (11.3) is consistent; i.e., if
$$u_{n+1} = C_n u_n + q_n + e_n, \tag{11.4}$$
then
$$(\Delta t)^{-1} e_n \to 0 \tag{11.5}$$
for sufficiently differentiable solutions of (11.1) as the increments of the independent variables tend to zero. Again let
$$z_n = u_n - w_n. \tag{11.6}$$
Then,
$$z_{n+1} = C_n z_n + e_n, \qquad z_0 = 0. \tag{11.7}$$
Thus, the convergence analysis of (11.3) has been reduced to the estimation of the solution of (11.7). The natural norm to use for (11.7) may vary from time step to time step, as the simplest choice may depend on the operator Cn, which may
vary with $n$ since the coefficients of $A$ may depend on $t$. Let us introduce a sequence of norms,
$$\|z\|_n, \qquad n = 0, 1, 2, \ldots, \tag{11.8}$$
where each norm satisfies the usual axioms for vector norms. Introduce also the induced matrix norms:
$$\|C\|_n = \sup_{z \ne 0} \frac{\|Cz\|_n}{\|z\|_n}. \tag{11.9}$$
Since the choices of the vector norms above will usually be based on convenience rather than intrinsic interest, let us also denote the norm of interest (maximum or $L_2$, for instance) by unindexed bars. Then, it follows from (11.7) that
$$\|z_{n+1}\|_n \le \|C_n\|_n \|z_n\|_n + \|e_n\|_n. \tag{11.10}$$
In order to iterate this recursion relation it is necessary to be able to compare successive indexed norms. Assume that
$$\|z\|_{n+1} \le (1 + a\,\Delta t)\,\|z\|_n, \qquad n = 0, 1, 2, \ldots, \tag{11.11}$$
for all $z$. The motivation for (11.11) is twofold. It is essentially necessary that such a relation hold for the argument to carry through, and it holds for the examples to follow. Then,
$$\|z_{n+1}\|_{n+1} \le (1 + a\,\Delta t)\,\|C_n\|_n \|z_n\|_n + (1 + a\,\Delta t)\,\|e_n\|_n. \tag{11.12}$$
It follows that (11.13) where (1 1.14) j=n
Equation (11.3) will be defined to be stable with respect to the given sequence of norms provided
$$\|C_k\|_k \le 1 + b\,\Delta t, \qquad k = 0, 1, 2, \ldots, \tag{11.15}$$
as $\Delta t$ tends to zero. If (11.3) is stable, (11.13) implies that
(11.16)
If the solution of the differential system is sufficiently differentiable that (11.17) (11.18)
If more is known about the local truncation error, more can be said about the global error. In particular, if (11.19) then (11.20) Thus, consistency, (11.17) or (11.19), and stability of the difference equation imply at least convergence with respect to the sequence of norms. Finally, if
$$\|z\| \le c\,(\Delta t)^{\beta}\,\|z\|_n, \qquad n = 1, 2, \ldots, \tag{11.21}$$
for all $z$, then
$$\|z_n\| = O((\Delta t)^{\alpha + \beta}) \tag{11.22}$$
if (11.19) holds. The exponent $\beta$ will usually be nonpositive. If $\alpha + \beta > 0$, the convergence in the desired norm has been established. The above derivation [45] is a natural correction of the incomplete argument used by the author in 1958 [30], which was an attempted extension of earlier results of Lax and Richtmyer [46, 47] and of the author [27]. The multilevel difference equation can be treated by a similar argument [27].

As an example of the usefulness of the above results, let us consider the Crank-Nicolson difference equation for the boundary value problem for
$$\frac{\partial}{\partial x}\Bigl(a(x, t)\frac{\partial u}{\partial x}\Bigr) = b(x, t)\frac{\partial u}{\partial t}, \qquad a(x, t),\ b(x, t) \ge m > 0. \tag{11.23}$$
The difference equation is
The natural norm for obtaining the bound (11.15) is (11.25)
It follows [30 (properly interpreted), 45] from the Courant Minimax Principle that
$$\|C_k\|_k \le 1. \tag{11.26}$$
If $b(x, t)$ is continuously differentiable, then (11.11) follows readily. Moreover, these norms are uniformly comparable with the usual $L_2$ norm, and $\beta = 0$. Thus, for a sufficiently smooth solution of (11.23)
$$\|z_n\| = O((\Delta t)^2) = O((\Delta x)^2), \tag{11.27}$$
assuming $\Delta t/\Delta x$ to be held fixed. The argument can be extended to mildly nonlinear forms of (11.23) and to other difference equations, including higher order correct equations in some cases.
We see that stability and consistency are sufficient for convergence; Lax and Richtmyer [46, 47] have shown that the converse is essentially true, at least for properly posed pure initial value problems. They have demonstrated that if the solution of the difference equation converges to the solution of the differential equation for every initial condition in a reasonable class (i.e., dense in $L_2$), then the difference equation must be stable. There are examples [48] of convergent, but unstable, difference analogues of the heat equation subject to special initial conditions; however, the result of Lax and Richtmyer shows that this cannot be expected to occur generally. Even in the case of special data where the solution of an unstable difference equation is convergent, round-off error would grow sufficiently rapidly to invalidate the computed results. The proof that convergence implies stability is a Banach space argument based on the principle of uniform boundedness.

Stability results related to those presented above have also been derived by Strang [49]. His definition of stability is a bit weaker than that above; in order to obtain convergence he required additional smoothness of the solution of the differential equation. He also extended the proof of the Lax equivalence theorem [46, 47] of stability and convergence for consistent difference equations to boundary value problems.

12. The Energy Method
It has been noted in several places previously that Lees has used the energy method to obtain very general convergence theorems. The method is reasonably simple in concept, but it is rather complicated in detail. The method was originally applied to differential equations, and many of the complications that arise in the application to difference equations do not appear in the differential case. It is the intention here to outline the procedure in a heuristic fashion by switching back and forth between difference and differential equations as the argument progresses for a very simple case. Let us consider the backward difference equation for the boundary value problem for the heat equation. We have seen that the error satisfies the equation
$$\Delta_x^2 z_{i,n+1} = \frac{z_{i,n+1} - z_{in}}{\Delta t} + e_{in},$$
$$z_{i,0} = z_{0,n} = z_{M,n} = 0, \tag{12.1}$$
$$e_{in} = O((\Delta x)^2 + \Delta t),$$
provided the solution of the differential equation is sufficiently smooth. Thus, to establish convergence it is sufficient to obtain a bound for $z$ that
tends to zero as $\Delta x$ and $\Delta t$ tend to zero. Consider an analogous differential problem:
$$\eta_{xx} = \eta_t + e, \qquad 0 < x < 1, \quad 0 < t \le T,$$
$$\eta(x, 0) = \eta(0, t) = \eta(1, t) = 0. \tag{12.2}$$
Multiply the differential equation by $\eta_t$ and integrate over the region up to time $t$. It follows from the identity (12.3) and the boundary conditions that (12.4) (12.5) (12.6) (12.7)
In particular,
$$\int_0^1 \eta_x(x, t)^2\,dx \le \int_0^t\!\!\int_0^1 e^2\,dx\,dt', \qquad 0 \le t \le T. \tag{12.8}$$
(12.9) (12.10) (12.11)
Now, assume that a corresponding result holds for (12.1). Then,
$$\max_{i,n} |z_{in}| \le \Bigl\{\sum_{i,n} e_{in}^2\,\Delta x\,\Delta t\Bigr\}^{1/2} = O((\Delta x)^2 + \Delta t). \tag{12.12}$$
Lees has shown that the assumption is valid by demonstrating difference versions of (12.3)-(12.11). For more complex difference equations it is necessary to treat sums arising from multiplying the error equation by a linear combination of $z$ and its time difference. See Lees [19, 29, 50-53] and Douglas and Jones [54] for details and numerous applications of the method.
13. Stefan Problem
Certain physical processes involving phase changes lead to free boundary problems associated with parabolic differential equations. An example arises in the melting of ice. Let us consider a simple realization of this example when we start with a semi-infinite ($x \ge 0$) block of ice at the freezing point and heat the medium along the plane $x = 0$. Clearly, the heat absorbed by the medium melts part of the ice, and a boundary with position $x = x(t)$ is formed between water and ice. If we assume that only conduction takes place in the water, then the heat equation is satisfied in the water. Now, the heat flowing into the boundary must be used to provide the latent heat to melt the ice; thus, a relation between $u_x$ and $\dot{x}(t)$ must be satisfied. For simplicity let all the physical constants be replaced by one and assume that one unit of heat is furnished to the water per unit of time. Then the differential system becomes
$$u_t = u_{xx}, \qquad 0 < x < x(t), \quad t > 0,$$
$$u_x(0, t) = -1, \qquad t > 0,$$
$$u(x(t), t) = 0, \qquad t > 0, \tag{13.1}$$
$$\frac{dx}{dt} = -u_x(x, t), \qquad x = x(t), \quad t > 0,$$
$$x(0) = 0.$$
The last two conditions may be replaced by the equivalent condition
$$t = x(t) + \int_0^{x(t)} u(x, t)\,dx, \tag{13.2}$$
which amounts to a heat balance over the whole system rather than the local balance leading to the differential condition. Gallie and the author [55, 56] developed a simple finite difference method for treating the above problem based on variable time steps and the backward difference equation. The time steps are determined so that the free boundary moves one space interval each time step. Let
$$t_n = \sum_{k=0}^{n-1} \Delta t_k; \tag{13.3}$$
then the boundary at time $t_n$ will be at $x_n = n\,\Delta x$. Assume that the approximate solution $w_{in}$ is known at $t_n$; then let
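$$t_{n+1} = x_{n+1} + \sum_{i=0}^{n} w_{in}\,\Delta x \tag{13.4}$$
and let
$$\Delta t_n = t_{n+1} - t_n. \tag{13.5}$$
Then, solve the difference system
$$\Delta_x^2 w_{i,n+1} = \frac{w_{i,n+1} - w_{in}}{\Delta t_n}, \qquad i = 1, \ldots, n,$$
$$\frac{w_{1,n+1} - w_{0,n+1}}{\Delta x} = -1, \tag{13.6}$$
$$w_{n+1,n+1} = 0.$$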
Note that (13.4)-(13.6) advances the solution, both temperature and boundary, one time step. Alternately, (13.4) and (13.5) may be replaced by an analogue of the differential boundary condition:
(13.7) The system discussed by Douglas and Gallie [55] was slightly different from (13.4)-(13.6), but the argument given there may be modified trivially to show that the solution of either (13.4)-(13.6) or (13.6)-(13.7) converges to the solution of (13.1). The proof showed convergence up to $t = 1$; again, a minor change leads to convergence for all time. Finally, another modification shows that the proof holds when the differential equation is generalized to a nonlinear one.

A different approach would be to use a fixed time step and to allow the free boundary to intersect the line $t = t_n$ at a point that is not necessarily a regular grid point. An uncentered second difference and an irregular backward time difference have to be employed at the regular grid point nearest the boundary. See the work cited above [55] for details. Trench [57] treated the application of the forward difference equation to (13.1) using a fixed time step. His arguments parallel those of Douglas and Gallie [55]. Ehrlich [58] has discussed the use of the Crank-Nicolson equation for a generalization of (13.1). Although he does not treat the convergence problem, it appears that a proof similar to that of Douglas and Gallie [55] and of Trench [57] could be constructed at least for the ratio $\Delta t/(\Delta x)^2$ in the range where the maximum principle is satisfied by the Crank-Nicolson equation [24]. Rose [59] has proposed a method for (13.1) that is analogous to a procedure introduced by Lax [60] to compute shock wave propagation in first order, hyperbolic equations. He has not yet succeeded in demonstrating convergence for the method, but his numerical evidence is encouraging. In his approach, (13.1) is replaced by a first order system with temperature
and internal energy as dependent variables. An explicit difference system is employed. Very little has been done for Stefan problems involving more than one space variable. Rose notes that his method is essentially independent of dimensionality, but he has yet to try such a problem.
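The variable time step procedure described above for the melting problem can be sketched in a few lines. This is an illustration under stated assumptions, not the program of Douglas and Gallie: the time at which the boundary reaches $x_{n+1}$ is taken from the heat balance using the known temperatures at $t_n$, the backward difference equation is used in the water, and the slope condition $u_x(0, t) = -1$ is replaced by a one-sided difference, which is an assumption made here for simplicity. The function names are hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Elimination for a tridiagonal system (no pivoting)."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for k in range(1, n):
        denom = diag[k] - lower[k] * c[k - 1]
        c[k] = upper[k] / denom
        d[k] = (rhs[k] - lower[k] * d[k - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for k in range(n - 2, -1, -1):
        x[k] = d[k] - c[k] * x[k + 1]
    return x

def stefan_march(dx, steps):
    """Advance the free boundary one space interval per time step.
    Returns the times t_0, ..., t_steps and the final temperatures."""
    t = 0.0
    w = [0.0]          # temperatures at x_0..x_n; last entry is the boundary
    times = [0.0]
    for n in range(steps):
        # heat balance: time at which the boundary reaches x_{n+1}
        t_next = (n + 1) * dx + sum(w) * dx
        dt = t_next - t
        lam = dt / (dx * dx)
        size = n + 2   # unknowns w_0, ..., w_{n+1} at the new time
        lower = [0.0] * size
        diag = [1.0] * size
        upper = [0.0] * size
        rhs = [0.0] * size
        # one-sided analogue of u_x(0, t) = -1 (assumed form)
        upper[0] = -1.0
        rhs[0] = dx
        for i in range(1, n + 1):
            lower[i] = -lam
            diag[i] = 1.0 + 2.0 * lam
            upper[i] = -lam
            rhs[i] = w[i]
        # temperature vanishes at the free boundary x_{n+1}
        rhs[n + 1] = 0.0
        w = solve_tridiagonal(lower, diag, upper, rhs)
        t = t_next
        times.append(t)
    return times, w
```

Each step requires one tridiagonal solution. Because part of the heat supplied is stored in the water, the computed times exceed the boundary positions $x_n = n\,\Delta x$, as the heat balance requires.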
14. Parabolic Systems
The numerical solution of parabolic systems of second order equations has received a scant amount of attention, and at best, fragmentary results are known, even though perhaps the majority of physical problems leading to parabolic problems actually should be described in terms of a parabolic system in order to include all the physics known to pertain to the problem. We shall be limited to discussing two types of systems, a system with zero order connection and a degenerate system. A system with zero order connection is of the form
$$\frac{\partial^2 u^k}{\partial x^2} = f_k\Bigl(x, t, u^1, u^2, \ldots, u^m, \frac{\partial u^k}{\partial x}, \frac{\partial u^k}{\partial t}\Bigr), \qquad k = 1, \ldots, m. \tag{14.1}$$
The zero order terminology refers to the fact that the equations are tied together only by the appearance of $u^1, \ldots, u^{k-1}, u^{k+1}, \ldots, u^m$ and not their derivatives in the equation for $u^k$. With this weak relation it would appear that methods that work for a single equation of the same form should generalize in a straightforward fashion to this case. It is easy to see [61] that the backward difference equation does. Let
$$\Delta_x^2 w^k_{i,n+1} = f_k\Bigl(x_i, t_{n+1}, w^1_{in}, \ldots, w^m_{in}, \Delta_x w^k_{in}, \frac{w^k_{i,n+1} - w^k_{in}}{\Delta t}\Bigr), \qquad k = 1, \ldots, m. \tag{14.2}$$
Note that the problem of evaluating the solution of (14.2) breaks down into $m$ sets of algebraic equations of exactly the form discussed earlier. The convergence proof parallels that for a single equation without serious modification. Obviously, more space variables may occur.

The degenerate parabolic system
$$\nabla \cdot (a \nabla u) + \nabla \cdot (b \nabla v) = 0,$$
$$\nabla \cdot (b \nabla u) + \nabla \cdot (a \nabla v) = c\,v_t, \tag{14.3}$$
where $a$, $b$, and $c$ depend on $v$ and the space coordinates and
$$|b(x, y, v)| < a(x, y, v), \tag{14.4}$$
arises in the description of immiscible, two-phase flow of fluids in a petroleum reservoir. Peaceman, Rachford, and the author [62, 63] suggested the difference system
$$\Delta(a_n \Delta u_{n+1}) + \Delta(b_n \Delta v_{n+1}) = 0,$$
$$\Delta(b_n \Delta u_{n+1}) + \Delta(a_n \Delta v_{n+1}) = c_n \frac{v_{n+1} - v_n}{\Delta t}, \tag{14.5}$$
where
$$\Delta(a \Delta u) = \Delta_x(a \Delta_x u) + \Delta_y(a \Delta_y u) + \cdots,$$
$$a_n = a(x, y, \ldots, v_n), \text{ etc.} \tag{14.6}$$
It should be noted that the algebraic equations arising at each time step are rather formidable; an alternating-direction iteration technique was devised to evaluate the solution of (14.5). See Douglas et al. [62, 63] for details. A proof that the solution of (14.5) converges to that of (14.3) has been obtained [63] under a slight simplification of (14.3). The simplification amounts to dropping the gradient terms if (14.3) is differentiated out.
15. Integro-Differential Equations
Volterra [64] considered parabolic integro-differential equations that are special cases of
$$u_{xx} = F\Bigl(x, t, u, u_x, u_t, \int_0^t g(x, t, s, u(x, s), u_x(x, s))\,ds\Bigr), \tag{15.1}$$
where
$$F_{u_t} \ge m > 0 \tag{15.2}$$
and the derivatives of $F$ and $g$ are bounded. Let us consider solving (15.1) numerically when the solution is prescribed initially and along boundaries at $x = 0$ and $x = 1$. Perhaps the simplest method would be to generalize the backward difference equation as follows:
$$\Delta_x^2 w_{i,n+1} = F\Bigl(x_i, t_{n+1}, w_{in}, \Delta_x w_{in}, \frac{w_{i,n+1} - w_{in}}{\Delta t}, \sum_{k=0}^{n} g(x_i, t_{n+1}, t_k, w_{ik}, \Delta_x w_{ik})\,\Delta t\Bigr). \tag{15.3}$$
Jones and the author [54] have shown that, assuming the same smoothness of the solution of (15.1) as for the heat equation, the solution of (15.3) converges with an error that is uniformly $O((\Delta x)^2 + \Delta t)$. The method of proof is essentially Lees's energy approach. Note that the evaluation problem is the same as that for the purely differential equation case; in particular, if $u_t$ appears linearly in (15.1), the algebraic equations are linear. The Crank-Nicolson equation may also be generalized; however, a proof
[54] has been obtained only in the case that (15.1) has the self-adjoint form
a
ax (p(x, t )
2)
=
~ ( xt ),
$ + F (x, t, u, lfg(x, t, s, u(x,
8))
ds
The difference equation becomes
Note that the quadrature has been changed in order to preserve the second order accuracy in time. It is the author’s conjecture that any convergent difference analogue of the differential equation can be adapted to the integro-differential equation; moreover, it should be possible to maintain the same order of accuracy by a proper choice of the quadrature method.
16. Extrapolation to the Limit
It is well known [10, 65] that, if
$$\begin{aligned} y^{(1)}(x + \Delta x) &= y^{(1)}(x) + f(x, y^{(1)}(x))\,\Delta x, & x &= k\,\Delta x, & k &= 0, 1, \ldots, \\ y^{(2)}(x + \tfrac12\Delta x) &= y^{(2)}(x) + \tfrac12 f(x, y^{(2)}(x))\,\Delta x, & x &= \tfrac12 k\,\Delta x, & k &= 0, 1, \ldots, \end{aligned} \tag{16.1}$$
the linear combination
$$Y(x) = 2y^{(2)}(x) - y^{(1)}(x), \qquad x = k\,\Delta x, \quad k = 0, 1, \ldots, \tag{16.2}$$
is a second order approximation to the solution of
$$\frac{dy}{dx} = f(x, y), \tag{16.3}$$
even though $y^{(1)}(x)$ and $y^{(2)}(x)$ are only first order correct. This is an example of a technique known as Richardson’s extrapolation to the limit. Batten [66] has recently (indeed, as this paper was being typed) shown that this technique can be extended to difference analogues of parabolic equations. Let us consider the backward difference equation as an example. Then
for sufficiently smooth u. Thus,
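Now, let us consider the effects of the two error terms separately as follows. Let $v(x, t)$ be the solution of
$$v_{xx} = v_t - \tfrac12 u_{tt}, \qquad v(x, t) = 0 \ \text{on} \ \partial R. \tag{16.6}$$
Then,
$$v(x, t) = O(1), \tag{16.7}$$
and
$$\Delta_x^2 v_{i,n+1} = \frac{v_{i,n+1} - v_{in}}{\Delta t} - \frac12\,\frac{\partial^2 u_{i,n+1}}{\partial t^2} + O((\Delta x)^2 + \Delta t). \tag{16.8}$$
Let
$$\alpha_{in} = z_{in} - v_{in}\,\Delta t. \tag{16.9}$$
Then,
$$\Delta_x^2 \alpha_{i,n+1} = \frac{\alpha_{i,n+1} - \alpha_{in}}{\Delta t} + O((\Delta x)^2 + (\Delta t)^2), \qquad \alpha_{in} = 0 \ \text{on} \ \partial R. \tag{16.10}$$
Clearly,
$$\|\alpha_{in}\| = O((\Delta x)^2 + (\Delta t)^2), \tag{16.11}$$
and
$$z_{in} = v_{in}\,\Delta t + O((\Delta x)^2 + (\Delta t)^2). \tag{16.12}$$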
With this relation we are ready to improve the convergence rate. Let $w^{(1)}(x, t)$ denote the solution of the difference equation corresponding to $(\Delta x, \Delta t)$, and let $w^{(2)}(x, t)$ correspond to $(\Delta x, \Delta t/2)$. Then, let
$$W(x, t) = 2w^{(2)}(x, t) - w^{(1)}(x, t), \qquad x = i\,\Delta x, \quad t = n\,\Delta t. \tag{16.13}$$
Denote the corresponding discretization errors by $z^{(1)}(x, t)$, $z^{(2)}(x, t)$, and $Z(x, t)$. Then,
$$\begin{aligned} Z(x, t) &= 2z^{(2)}(x, t) - z^{(1)}(x, t) \\ &= 2\left[v(x, t)\,\frac{\Delta t}{2} + O((\Delta x)^2 + (\Delta t)^2)\right] - \left[v(x, t)\,\Delta t + O((\Delta x)^2 + (\Delta t)^2)\right] \\ &= O((\Delta x)^2 + (\Delta t)^2). \end{aligned} \tag{16.14}$$
Thus, the extrapolation leads again to an increase in the order of accuracy. Recall that the optimum choice between $\Delta t$ and $\Delta x$ for the backward difference equation is constant $r$. For the extrapolated case the optimum choice is
$$\frac{\Delta t}{\Delta x} = \text{constant}. \tag{16.15}$$
Consequently, the use of larger time steps implies that considerably less computing is required for the same accuracy in the solution. Note the above argument is unaltered for the linear differential equation
$$u_t = a(x)u_{xx} + b(x)u_x + c(x)u + d(x, t). \tag{16.16}$$
It is clear that the same ideas could be applied to other difference equations. Also, it is clear that we could choose to increase the spatial accuracy instead of the temporal accuracy. Indeed, by taking a linear combination of three solutions, the accuracy can be increased in both directions; more generally, a proper linear combination of several solutions should lead to the elimination of several leading error terms. Obviously, the ideas extend to several space variables.
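The gain promised by (16.13) and (16.14) can be checked on a small computation. The following Python sketch is illustrative only and is not taken from the references: it assumes the model problem $u_t = u_{xx}$ on $0 < x < 1$ with $u(x, 0) = \sin \pi x$ and zero boundary values, whose exact solution is $e^{-\pi^2 t}\sin \pi x$. The names `backward_difference` and `errors` are hypothetical.

```python
import math

def backward_difference(nx, nt, t_end):
    """Backward difference solution of u_t = u_xx on 0 < x < 1 with u = 0 at
    x = 0 and x = 1 and u(x, 0) = sin(pi x); returns interior values at t_end."""
    dx, dt = 1.0 / nx, t_end / nt
    r = dt / dx**2
    m = nx - 1
    w = [math.sin(math.pi * i * dx) for i in range(1, nx)]
    for _ in range(nt):
        # Thomas algorithm for -r w[i-1] + (1 + 2r) w[i] - r w[i+1] = old w[i].
        c, d = [0.0] * m, [0.0] * m
        c[0] = -r / (1 + 2 * r)
        d[0] = w[0] / (1 + 2 * r)
        for i in range(1, m):
            denom = (1 + 2 * r) + r * c[i - 1]
            c[i] = -r / denom
            d[i] = (w[i] + r * d[i - 1]) / denom
        w[m - 1] = d[m - 1]
        for i in range(m - 2, -1, -1):
            w[i] = d[i] - c[i] * w[i + 1]
    return w

def errors(nx, nt, t_end):
    """Maximum errors of w^(1) and of W = 2 w^(2) - w^(1) at time t_end."""
    w1 = backward_difference(nx, nt, t_end)        # steps (dx, dt)
    w2 = backward_difference(nx, 2 * nt, t_end)    # steps (dx, dt/2)
    big_w = [2 * b - a for a, b in zip(w1, w2)]
    decay = math.exp(-math.pi**2 * t_end)
    exact = [decay * math.sin(math.pi * i / nx) for i in range(1, nx)]
    def err(v):
        return max(abs(p - q) for p, q in zip(v, exact))
    return err(w1), err(big_w)

for nt in (10, 20, 40):
    print(nt, errors(200, nt, 0.1))
```

With a fine spatial mesh the first error in each pair should roughly halve, and the second roughly quarter, each time the number of time steps is doubled, which is the behaviour (16.14) predicts.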
References
1. Hildebrand, F. B., On the convergence of numerical solutions of the heat-flow equation. J. Math. Phys. 31, 35-41 (1952).
2. Juncosa, M. L., and D. M. Young, On the convergence of a solution of a difference equation to a solution of the equation of diffusion. Proc. Am. Math. Soc. 5, 168-174 (1954).
3. Juncosa, M. L., and D. M. Young, On the order of convergence of solutions of a difference equation to a solution of the diffusion equation. J. Soc. Ind. Appl. Math. 1, 111-135 (1953).
4. Juncosa, M. L., and D. M. Young, On the Crank-Nicolson procedure for parabolic partial differential equations. Proc. Cambridge Phil. Soc. 53, 448-461 (1957).
5. Collatz, L., The Numerical Treatment of Differential Equations. Springer-Verlag, Berlin, 1960.
6. Milne, W. E., Numerical Solution of Differential Equations. Wiley & Sons, New York, 1953.
7. Laasonen, P., Über eine Methode zur Lösung der Wärmeleitungsgleichung. Acta Math. 81, 309-317 (1949).
8. Douglas, J., On the numerical integration of quasi-linear parabolic equations. Pacific J. Math. 6, 35-42 (1956).
9. John, F., On integration of parabolic equations by difference methods. Communs. Pure & Appl. Math. 5, 155-211 (1952).
10. John, F., Advanced Numerical Analysis. New York University, New York, 1956.
11. Ritterman, Ph.D. Thesis, New York University, New York, 1955.
12. Wasow, W., On the accuracy of implicit difference approximations to the equation of heat flow. Math. Tables Aids Comput. 12, 43-55 (1958).
13. Douglas, J., The effect of round-off error in the numerical solution of the heat equation. J. Assoc. Computing Machinery 6, 48-58 (1959).
14. Cuthill, E. H., and R. S. Varga, A method of normalized block iteration. J. Assoc. Computing Machinery 6, 236-244 (1959).
15. Blair, P. M., M.A. Thesis, Rice University, Houston, Texas, 1960.
16. Rose, M. E., On the integration of nonlinear parabolic equations by implicit difference methods. Quart. Appl. Math. 14, 237-248 (1956).
17. Lotkin, M., The numerical integration of heat conduction equations. J. Math. Phys. 37, 178-187 (1958).
18. Batten, G. W., M.A. Thesis, Rice University, Houston, Texas, 1961.
19. Lees, M., Approximate solution of parabolic equations. J. Soc. Ind. Appl. Math. 7, 167-183 (1959).
20. Douglas, J., and T. M. Gallie, Variable time steps in the solution of the heat equation by a difference equation. Proc. Am. Math. Soc. 6, 787-793 (1955).
21. Douglas, J., and H. H. Rachford, On the numerical solution of heat conduction problems in two and three space variables. Trans. Am. Math. Soc. 82, 421-439 (1956).
22. Crank, J., and P. Nicolson, A practical method for numerical evaluation of solutions of partial differential equations of the heat-conduction type. Proc. Cambridge Phil. Soc. 43, 50-67 (1947).
23. O’Brien, G., M. Hyman, and S. Kaplan, A study of the numerical solution of partial differential equations. J. Math. Phys. 29, 223-251 (1951).
24. Flatt, H. P., Chain matrices and the Crank-Nicolson equation, to be published.
25. Halmos, P. R., Finite-Dimensional Vector Spaces. Van Nostrand Co., New York, 1958.
26. Zygmund, A., Trigonometric Series. Warsaw-Lwow, 1935.
27. Douglas, J., On the relation between stability and convergence in the numerical solution of linear parabolic and hyperbolic differential equations. J. Soc. Ind. Appl. Math. 4, 20-37 (1956).
28. Strang, W. G., On the order of convergence of the Crank-Nicolson procedure. J. Math. Phys. 38, 141-144 (1959).
29. Lees, M., A priori estimates for the solution of difference approximations to parabolic partial differential equations. Duke Math. J. 27, 297-312 (1960).
30. Douglas, J., The application of stability analysis in the numerical solution of quasi-linear parabolic differential equations. Trans. Am. Math. Soc. 89, 484-518 (1958).
31. Richardson, L. F., The approximate arithmetical solution by finite differences of physical problems involving differential equations with an application to the stresses in a masonry dam. Phil. Trans. Roy. Soc. A210, 307-357 (1910).
32. Douglas, J., The solution of the diffusion equation by a high order correct difference equation. J. Math. Phys. 35, 145-151 (1956).
33. Crandall, S. H., An optimum implicit recurrence formula for the heat conduction equation. Quart. Appl. Math. 13, 318-320 (1955).
34. Douglas, J., Multi-level difference equations for parabolic equations, to be published.
35. Douglas, J., On incomplete iteration for implicit parabolic difference equations. J. Soc. Ind. Appl. Math. 9, September 1961.
36. Douglas, J., A difference method for the Neumann problem for parabolic equations, to be published.
37. Courant, R., and D. Hilbert, Methods of Mathematical Physics, Vol. I. Interscience Publishers, New York, 1953.
38. Peaceman, D. W., and H. H. Rachford, The numerical solution of parabolic and elliptic differential equations. J. Soc. Ind. Appl. Math. 3, 28-41 (1955).
39. Douglas, J., On the numerical integration of $u_{xx} + u_{yy} = u_t$ by implicit methods. J. Soc. Ind. Appl. Math. 3, 42-65 (1955).
40. Douglas, J., Alternating direction methods for three space variables, to be published.
41. Brian, P. L. T., A finite difference method of high-order accuracy for the solution of three-dimensional transient heat conduction problems, to be published.
42. Birkhoff, G., and R. S. Varga, Implicit alternating direction methods. Trans. Am. Math. Soc. 92, 13-24 (1959).
43. Douglas, J., A note on the alternating direction implicit method for the numerical solution of heat flow problems. Proc. Am. Math. Soc. 8, 409-412 (1957).
44. Lees, M., Alternating direction and semi-explicit difference methods for parabolic partial differential equations, to be published.
45. Douglas, J., On stability for parabolic difference equations, to be published.
46. Lax, P. D., and R. D. Richtmyer, Survey of the stability of linear finite difference equations. Communs. Pure & Appl. Math. 9, 267-293 (1956).
47. Richtmyer, R. D., Difference Methods for Initial Value Problems. Interscience Publishers, New York, 1957.
48. Leutert, W., On the convergence of approximate solutions of the heat equation to the exact solution. Proc. Am. Math. Soc. 2, 433-439 (1951).
49. Strang, W. G., Ph.D. Thesis, University of California, Los Angeles, California, 1959.
50. Lees, M., Von Neumann difference approximations to hyperbolic equations. Pacific J. Math. 10, 213-222 (1960).
51. Lees, M., Energy inequalities for the solution of differential equations. Trans. Am. Math. Soc. 94, 58-73 (1960).
52. Lees, M., The Goursat problem. J. Soc. Ind. Appl. Math. 8, 518-530 (1960).
53. Lees, M., Solution of positive-definite hyperbolic systems by difference methods, to be published.
54. Douglas, J., and B. F. Jones, The numerical solution of parabolic and hyperbolic integro-differential equations, to be published.
55. Douglas, J., and T. M. Gallie, On the numerical integration of a parabolic differential equation subject to a moving boundary condition. Duke Math. J. 22, 557-572 (1955).
56. Douglas, J., A uniqueness theorem for the solution of a Stefan problem. Proc. Am. Math. Soc. 8, 402-408 (1957).
57. Trench, W., On an explicit method for the solution of a Stefan problem. J. Soc. Ind. Appl. Math. 7, 184-204 (1959).
58. Ehrlich, L. W., A numerical method of solving a heat flow problem with moving boundary. J. Assoc. Computing Machinery 5, 161-176 (1958).
59. Rose, M. E., A method of calculating solutions of parabolic equations with a free boundary. Math. Comput. 14, 249-256 (1960).
60. Lax, P. D., Weak solutions of nonlinear hyperbolic equations and their numerical computations. Communs. Pure & Appl. Math. 7, 159-193 (1954).
61. Douglas, J., On the numerical solution of parabolic systems with zero order connection, to be published.
62. Douglas, J., D. W. Peaceman, and H. H. Rachford, A method for calculating multi-dimensional displacement. Trans. Am. Inst. Mining, Met. and Petrol. Eng. 216, 297-308 (1959).
63. Douglas, J., A numerical method for a parabolic system. Numer. Math. 2, 91-98 (1960).
64. Volterra, V., Theory of Functionals and of Integral and Integro-differential Equations. Dover Publications, New York, 1959.
65. Henrici, P., Discrete Methods for Ordinary Differential Equations. Wiley, New York, to be published, late 1961.
66. Batten, G. W., Extrapolation methods for parabolic difference equations, to be published.
Advances in Orthonormalizing Computation*
PHILIP J. DAVIS and PHILIP RABINOWITZ**
National Bureau of Standards, Washington, D.C.
Part I: Theoretical
1. Introduction
2. The Geometry of Least Squares
3. Inner Products Useful in Numerical Analysis
4. The Computation of Inner Products
5. Methods of Orthogonalization
6. Tables of Orthogonal Polynomials and Related Quantities
7. Least Square Approximation of Functions
8. Overdetermined Systems of Linear Equations
9. Least Square Methods for Ordinary Differential Equations
10. Linear Partial Differential Equations of Elliptic Type
11. Complete Systems of Particular Solutions
12. Error Bounds; Degree of Convergence
13. Collocation and Interpolatory Methods and Their Relation to Least Squares
14. Conformal Mapping
15. Quadratic Functionals Related to Boundary Value Problems
Part II: Numerical
16. Orthogonalization Codes and Computations
17. Numerical Experiments in the Solution of Boundary Value Problems Using the Method of Orthonormalized Particular Solutions
18. Comments on the Numerical Experiments
19. The Art of Orthonormalization
20. Conclusions
References
* Sections 17 and 18 of this article were prepared at the National Bureau of Standards with the sponsorship of the Atomic Energy Commission. Reproduction for the purposes of the United States Government is permitted.
** Currently at Weizmann Institute, Rehovoth, Israel.
PART I: THEORETICAL
1. Introduction
This paper contains (1) a survey of least square approximation techniques in numerical analysis and (2) some recent numerical results in this field which were obtained at the National Bureau of Standards on an IBM 704 computer. The amount of theoretical material relating to least squares and orthogonal functions is vast, even that part which has some bearing on numerical analysis. Jointly and individually in previous reports (Davis and Rabinowitz [1.1, 1.2] and Davis [1.3]), we have attempted to gather and describe aspects of this theory which are of utility in numerical analysis and which display a spectrum of applications of the least square idea. This paper is an elaboration of our previous work and its scope is limited by our computational interests. Sections 2 and 3 contain an abstract mathematical foundation for the subsequent material. These sections may be omitted by readers who are less theoretically inclined.
2. The Geometry of Least Squares
An abstract vantage point from which it is convenient to survey the common features of least square processes, at least numerically, is furnished by the theory of inner product spaces. Central to this theory is the notion of the inner product $(x, y)$ of two elements of the linear space. This function of two vector variables is assumed to be additive, homogeneous, symmetric (or Hermitian symmetric in the complex case), and definite:
a. $(x_1 + x_2, x_3) = (x_1, x_3) + (x_2, x_3)$
b. $(x_1, x_2) = (x_2, x_1)$ or $(x_1, x_2) = \overline{(x_2, x_1)}$
c. $(\alpha x_1, x_2) = \alpha(x_1, x_2)$, $\alpha$ scalar
d. $(x, x) \ge 0$; $(x, x) = 0$ implies $x = 0$. (2.1)
Frequently semidefiniteness suffices. The inner product leads to a norm $\|x\| = (x, x)^{1/2}$, and the Schwarz inequality holds: $(x, y)^2 \le (x, x)(y, y)$. Orthogonality of $x$ and $y$ is given by $(x, y) = 0$. The Fourier series of an element $y$ with respect to an orthonormal system $x_i^*$ is given by
$$y \sim \sum_{i=1}^{\infty} (y, x_i^*)\,x_i^* \tag{2.2}$$
and its segments have the least square property: the problem $\min_{a_i} \|y - \sum_{i=1}^{n} a_i x_i^*\|$ is solved by $a_i = (y, x_i^*)$. The Bessel inequality $\sum_{i=1}^{\infty} (y, x_i^*)^2 \le \|y\|^2$ holds, with equality if the orthonormal system $x_i^*$ is closed. In the latter case, the Fourier series of an element $y$ converges in norm to the element:
$$\lim_{n \to \infty} \Big\| y - \sum_{i=1}^{n} (y, x_i^*)\,x_i^* \Big\| = 0. \tag{2.3}$$
If the system $x_i$ is linearly independent (but not necessarily orthogonal), we may solve the least square problem $\min_{a_i} \|y - \sum_{i=1}^{n} a_i x_i\|$ in one of several ways. We may set up and solve the normal equations which give the $a_i$ directly:
$$a_1(x_1, x_i) + a_2(x_2, x_i) + \cdots + a_n(x_n, x_i) = (y, x_i), \qquad i = 1, 2, \ldots, n. \tag{2.4}$$
The normal equations express the perpendicularity of the residual vector $y - \sum_{i=1}^{n} a_i x_i$ to the linear subspace spanned by the $x_i$'s. Alternatively, we may first orthogonalize the $x$'s and work through the Fourier expansion. The normal equations may, in many instances, have grave numerical difficulties attending their solution. The matrix of the system (2.4) is symmetric and positive definite, and its determinant
$$G = G(x_1, x_2, \ldots, x_n) = |(x_i, x_j)| \tag{2.5}$$
is known as the Gram determinant of the system $x_1, x_2, \ldots, x_n$. $\sqrt{G}$ is the $n$-dimensional volume of the parallelepiped whose edges are $x_1, x_2, \ldots, x_n$. If the elements $x_i$ have been normalized so that $\|x_i\| = 1$, then we have
$$0 \le G \le 1.$$
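The perpendicularity expressed by the normal equations (2.4) is easy to verify numerically. The sketch below is a minimal illustration under stated assumptions, not a recommended production method: the data ($e^t$ sampled at eleven points), the basis $1, t, t^2$, and the helper names `inner` and `solve` are all hypothetical, and the inner product is the ordinary unweighted sum of products.

```python
import math

def inner(u, v):
    """Ordinary inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

# Hypothetical data: fit y = exp(t) on eleven points by 1, t, t^2.
pts = [i / 10 for i in range(11)]
xs = [[t**k for t in pts] for k in range(3)]
y = [math.exp(t) for t in pts]

gram = [[inner(xj, xi) for xj in xs] for xi in xs]   # coefficients of (2.4)
rhs = [inner(y, xi) for xi in xs]
a = solve(gram, rhs)

residual = [yv - sum(a[k] * xs[k][m] for k in range(3)) for m, yv in enumerate(y)]
perp = [inner(residual, xi) for xi in xs]            # vanishes up to roundoff
print(a, perp)
```

For a basis this small the Gram matrix is harmless; with many powers of $t$ it becomes poorly conditioned, which is the difficulty with (2.4) noted above.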
3. Inner Products Useful in Numerical Analysis
A wide variety of inner product spaces appear in theoretical arguments. We mention here a few that have been utilized in numerical analysis.
(1) Inner products in finite dimensional vector spaces $E^n$. Let $x$ and $y$ be vectors having components $x_i$ and $y_i$, $i = 1, 2, \ldots, n$. Let $w_i$ be a fixed set of weights, $w_i > 0$. Then the expressions
$$(x, y) = \sum_{i=1}^{n} w_i x_i y_i \tag{3.1}$$
in the real case and
$$(x, y) = \sum_{i=1}^{n} w_i x_i \bar{y}_i \tag{3.2}$$
in the complex case are inner products. These may be considered to be the fundamental inner products, numerically speaking, and arise whenever discrete data intervene or when continuous data have been made discrete.
(2) The most general inner product in $E^n$ is given by
$$(x, y) = xPy' \tag{3.3}$$
in the real case and
$$(x, y) = xPy^*, \qquad y^* = \bar{y}', \tag{3.4}$$
in the complex case. The matrix $P$ is a positive definite symmetric or Hermitian matrix. These more general forms appear from time to time, but with considerably less frequency than (3.1) or (3.2).
(3) Integral inner products. Let $B$ be a fixed domain in the space of variables $x_1, x_2, \ldots, x_n$ and $w(x_1, x_2, \ldots, x_n)$ a fixed positive weighting function defined on $B$. For two functions $f$ and $g$ defined on $B$, and with proper integrability conditions, the expression
$$(f, g) = \int \cdots \int_B w(x_1, \ldots, x_n)\,f(x_1, \ldots, x_n)\,g(x_1, \ldots, x_n)\,dx_1 \cdots dx_n \tag{3.5}$$
is an inner product over an appropriate linear space. The integral may alternatively be extended over the boundary of $B$ or over combinations of various portions of the region and of its boundary. This case may be made to include the previous ones by means of Stieltjes’ integrals, but it is more convenient to exhibit the discrete and the continuous cases separately.
(4) Complex integral inner products. Let $B$ be a region of the complex plane and $C$ its boundary. Let $w(z)$ designate a fixed positive weighting function. For two functions $f(z)$ and $g(z)$ defined on $B$ (or on $C$), the expressions
$$(f, g) = \int_C w(z)\,f(z)\,\overline{g(z)}\,ds, \qquad ds^2 = dx^2 + dy^2, \tag{3.6}$$
$$(f, g) = \iint_B w(z)\,f(z)\,\overline{g(z)}\,dx\,dy \tag{3.6a}$$
are both inner products in an appropriate linear space of complex functions and appear in theories of conformal mapping. Inner products similar to these have been defined for analytic functions of several complex variables but we have as yet not seen any related computation.
(5) The Dirichlet inner product. This occurs in potential theory. In two dimensions it is
$$(f, g) = \iint_B \left(\frac{\partial f}{\partial x}\frac{\partial g}{\partial x} + \frac{\partial f}{\partial y}\frac{\partial g}{\partial y}\right) dx\,dy. \tag{3.7}$$
This inner product is only semidefinite. In the theory of partial differential equations of elliptic type
(3.7) is replaced by
In problems of mathematical physics, the norms arising from inner products are energy integrals whose Euler-Lagrange expressions are the differential equations to be solved. For an inner product appropriate to elasticity theory see Bergman [3.1, p. 129]. (6) Mixed inner products. Inner products occasionally contain a discrete as well as a continuous part. For instance, over the space $C^1[-1, 1]$ of functions which are differentiable on $[-1, 1]$ define
$$(f, g) = \int_{-1}^{1} f(x)g(x)\,dx + \sum_{i=1}^{n} f'(x_i)g'(x_i), \qquad -1 < x_i < 1. \tag{3.10}$$
4. The Computation of Inner Products
In the numerical solution of least square problems, the values of the inner products $(x_i, x_j)$, $i, j = 1, 2, \ldots, n$ must be obtained. In the discrete case, inner products are the sums of ordinary products. In the continuous case, inner products are integrals, and two procedures may be followed. If the region of integration is simple and the system of functions is elementary, then we may compute the inner products exactly (that is, up to roundoff) by means of explicit formulas. For complicated regions and functions, we have no choice other than to obtain these values by means of an appropriate rule of approximate quadrature. Even when the region and the functions to be integrated are elementary, as, say, in the case of a polygonal region and powers, the closed form expressions for the inner products may be so lengthy and cumbersome as to preclude their convenient and accurate computation and use. We may have here simple functions such as logarithms entering with concomitant errors of truncation. In such elementary situations one may prefer to evaluate inner products by wholly numerical means, and we enter a strong plea here for so doing. When rules for approximate integration are applied and integral inner products made discrete, they take the form (3.1)-(3.4). For one-dimensional real integrals, appropriate rules are numerous and so well known that we shall not reference them. For higher-dimensional integrals, the theory of approximate quadrature is currently in a state of development. The work of Stroud and Hammer at the University of Wisconsin represents the most concentrated frontal attack on this problem. We call attention to several papers in this area. A survey is given in Stroud [4.8]. For complex integration see Birkhoff and Young [4.1]. Reference [4.7] contains many additional citations. Despite this work, one must still approach the problem of integration in higher dimensions rather gingerly until a backlog of computational experience relevant to a specific problem has been built up.
It appears to be considerably easier numerically to perform one-dimensional integration rather than higher-dimensional integration, and in those problems, e.g., in potential theory or conformal mapping, where the numerical analyst has an option, he is probably best advised to select the lower dimension.
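To make concrete the remark that a quadrature rule turns an integral inner product into the weighted discrete form (3.1), here is a short Python sketch. It assumes the interval $[0, 1]$, unit weight function, and the composite Simpson rule; the choice of rule and the names `simpson_rule` and `discrete_inner` are illustrative assumptions, not something prescribed by the text.

```python
def simpson_rule(a, b, m):
    """Nodes and weights of the composite Simpson rule with 2m subintervals."""
    h = (b - a) / (2 * m)
    nodes = [a + k * h for k in range(2 * m + 1)]
    weights = [h / 3 * (1 if k in (0, 2 * m) else 4 if k % 2 else 2)
               for k in range(2 * m + 1)]
    return nodes, weights

def discrete_inner(f, g, nodes, weights):
    """Weighted sum of products, i.e. an inner product of the form (3.1)."""
    return sum(w * f(t) * g(t) for t, w in zip(nodes, weights))

nodes, weights = simpson_rule(0.0, 1.0, 10)
powers = [lambda t, k=k: t**k for k in range(3)]      # 1, t, t^2

# Approximate Gram matrix of 1, t, t^2; the exact entries are 1 / (i + j + 1).
gram = [[discrete_inner(fi, fj, nodes, weights) for fj in powers] for fi in powers]
for row in gram:
    print(row)
```

Because Simpson's rule integrates cubics exactly, every entry except $(t^2, t^2)$ agrees with its exact value up to roundoff; the last entry carries a small truncation error that shrinks as `m` grows.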
5. Methods of Orthogonalization
The basic problem is this: given a sequence $x_1, x_2, \ldots$ of elements of an inner product space, any finite number of which are linearly independent, we wish to obtain linear combinations of the $x_j$
$$x_i^* = \sum_{j=1}^{i} a_{ij}x_j, \qquad i = 1, 2, 3, \ldots \tag{5.1}$$
which are orthonormal:
$$(x_i^*, x_j^*) = \delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \ne j. \end{cases} \tag{5.2}$$
This can be effected by a recursive scheme which is generally called the Gram-Schmidt orthogonalization process. Set, recursively,
$$\begin{aligned} y_1 &= x_1, & x_1^* &= y_1/\|y_1\|, \\ y_2 &= x_2 - (x_2, x_1^*)x_1^*, & x_2^* &= y_2/\|y_2\|, \\ &\ \ \vdots \\ y_n &= x_n - \sum_{k=1}^{n-1} (x_n, x_k^*)x_k^*, & x_n^* &= y_n/\|y_n\|. \end{aligned} \tag{5.3}$$
It may be shown that (5.2) holds for elements $x_i^*$ determined in this way. The quantities $\|y_i\|$ are related to $G(x_1, x_2, \ldots, x_n)$ via the following identity
$$G(x_1, x_2, \ldots, x_n) = (\|y_1\|\,\|y_2\| \cdots \|y_n\|)^2. \tag{5.4}$$
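The recursion (5.3) and the identity (5.4) can be sketched in a few lines of Python. This is an illustration under stated assumptions: the three vectors are arbitrary test data, the inner product is the ordinary sum of products, and the names `gram_schmidt` and `det3` are hypothetical. The projection coefficients are formed from the original $x_n$, as in (5.3); the often-preferred "modified" variant would form them from the partially reduced vector instead.

```python
import math

def inner(u, v):
    """Ordinary inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(xs):
    """Return the orthonormal vectors x_i^* and the norms ||y_i|| of (5.3)."""
    stars, norms = [], []
    for x in xs:
        y = list(x)
        for s in stars:
            c = inner(x, s)                       # (x_n, x_k^*)
            y = [yi - c * si for yi, si in zip(y, s)]
        n = math.sqrt(inner(y, y))
        norms.append(n)
        stars.append([yi / n for yi in y])
    return stars, norms

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

xs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
stars, norms = gram_schmidt(xs)
G = det3([[inner(u, v) for v in xs] for u in xs])     # Gram determinant
print(G, (norms[0] * norms[1] * norms[2]) ** 2)       # the two sides of (5.4)
```

When the $x_i$ are nearly dependent, the later norms become tiny and the division by them amplifies roundoff.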
If $G(x_1, x_2, \ldots, x_n)$ approaches zero rapidly, as will be the case with a system $x_1, x_2, \ldots$ which is more and more nearly dependent, it follows from (5.4) that $\|y_n\|$ will become small. The normalizing step $y_n/\|y_n\|$ becomes numerically indeterminate and roundoff builds up. The most favorable numerical situation is that in which the elements $x_i$ are nearly orthogonal to start with. The orthonormal vectors $x_i^*$ may be expressed in the form of determinants (see [5.2]) but such an expression seems to be of little value theoretically and not at all numerically. For real orthogonal polynomials there is a three-term recurrence relationship from which the values can be computed successively. Let
$$(f, g) = \int_a^b f(x)g(x)\,d\alpha(x), \tag{5.5}$$
where $\alpha(x)$ has at least $M$ points of increase. Then we may orthogonalize $1, x, \ldots, x^{M-1}$ with respect to the inner product (5.5) and obtain certain polynomials. $P_n(x)$ will designate the orthogonal polynomials while $P_n^*(x)$ will be normal. A form convenient for iterative computation and expressed with inner products is
$$\begin{aligned} P_{-1}^* &= 0, \qquad P_0 = 1, \\ P_{n+1} &= xP_n^* - (xP_n^*, P_n^*)P_n^* - (P_n, P_n)^{1/2}P_{n-1}^*, \\ P_n^* &= P_n/(P_n, P_n)^{1/2}. \end{aligned} \tag{5.6}$$
Recurrence relations for the individual coefficients of the orthonormal polynomials may be derived from (5.6). They are needed whenever least square approximants are to be expressed as linear combinations of powers $1, x, x^2, \ldots$ and not merely as linear combinations of the orthonormal polynomials. Let
$$P_n^*(x) = a_{n0} + a_{n1}x + \cdots + a_{nn}x^n, \qquad n = 0, 1, \ldots. \tag{5.7}$$
Then,
$$a_{n+1,j} = \left[a_{n,j-1} - (xP_n^*, P_n^*)a_{n,j} - (P_n, P_n)^{1/2}a_{n-1,j}\right](P_{n+1}, P_{n+1})^{-1/2}, \qquad n = 0, 1, 2, \ldots, \quad j = 0, 1, \ldots, n + 1, \tag{5.8}$$
with
$$a_{00} = (P_0, P_0)^{-1/2}; \qquad a_{i,j} = 0, \ j > i; \qquad a_{i,j} = 0, \ i \text{ or } j < 0. \tag{5.9}$$
This recurrence relationship depends essentially upon the fact that $xP_n$ is a polynomial of degree $n + 1$ and that $(xP_n, P_{n-1}) = (P_n, xP_{n-1})$. If these facts are not available, then there is no recurrence relationship. Thus, for complex orthogonal polynomials in a single complex variable, and for real orthogonal polynomials in several variables, there is, with certain special exceptions, no such formula. The Gram-Schmidt orthogonalization or some modification of it is appropriate to these situations. The classical recurrence (5.6) has lately been advocated for the numerical construction of one variable, real, least square polynomial fits [5.6, 5.7]. The numerical performance of (5.6) is good, exceeding that of Gram-Schmidt. For orthogonal polynomials in several real variables some recurrence relationships with a variable number of terms have been found recently by Weisfeld [5.8]. He has had some computational experience with orthogonal polynomials in six and seven variables but of degree not higher than two, and in a communication to the authors, indicated that there appeared to be no advantage to their use as opposed to the Gram-Schmidt method. The authors computed the orthogonal polynomials over a point set consisting of two line segments via Gram-Schmidt and via recurrence. Though double precision was employed with the Gram-Schmidt method, this process broke down at $n = 16$. With single precision recurrence the results were still good at $n = 20$. (See Section 18, Case XVII.) For specific inner products and functions to be orthogonalized (principally polynomials) there are extensive representations in one and several variables. See Szegö [5.2] and Erdélyi [5.5] where summaries will be found. A complete bibliography on orthogonal polynomials up to 1938 has been compiled by Shohat et al. [5.3]. However, we cannot concern ourselves here with the “special function theory” of these polynomials.
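The recurrence (5.6) is simple to program when the inner product is a weighted sum over a finite point set, since each polynomial can then be carried as its list of values at the points. The sketch below makes exactly that assumption; the point set, the unit weights and the function name `orthonormal_polys` are hypothetical choices for illustration.

```python
import math

def orthonormal_polys(points, weights, count):
    """Values of P_0^*, ..., P_{count-1}^* at the points, built by (5.6).

    The inner product is the weighted sum over the point set, so count
    must not exceed the number of points."""
    def inner(u, v):
        return sum(w * a * b for w, a, b in zip(weights, u, v))

    prev_star = [0.0] * len(points)              # P_{-1}^* = 0
    p = [1.0] * len(points)                      # P_0 = 1
    stars = []
    for _ in range(count):
        norm = math.sqrt(inner(p, p))            # (P_n, P_n)^{1/2}
        star = [v / norm for v in p]             # P_n^*
        stars.append(star)
        x_star = [t * v for t, v in zip(points, star)]
        alpha = inner(x_star, star)              # (x P_n^*, P_n^*)
        p = [xs - alpha * s - norm * q           # P_{n+1} from (5.6)
             for xs, s, q in zip(x_star, star, prev_star)]
        prev_star = star
    return stars

pts = [-1.0 + 0.1 * k for k in range(21)]
stars = orthonormal_polys(pts, [1.0] * len(pts), 6)
print(sum(a * b for a, b in zip(stars[2], stars[5])))   # close to zero
```

Each new polynomial is corrected against only its two predecessors; orthogonality to the earlier ones follows from the symmetry $(xP_n, P_{n-1}) = (P_n, xP_{n-1})$ rather than from explicit subtraction.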
By way of recapitulation, least square approximations may be effected by
(a) Normal equations
(b) Fourier segments using the Gram-Schmidt method
(c) Fourier segments using three term recurrence relationships
(d) Fourier segments using multiterm recurrence relationships
(e) Precomputed auxiliary tables (see Section 6).
(a) and (b) are always available. (c) is available only in the real, one variable polynomial case. (d) is available in certain multivariable polynomial cases. (e) is available only in special cases, generally that of equidistant data. (a) frequently leads to poorly conditioned Gram matrices. (b) is numerically safer than (a). (c), when available, is superior to (b). (d) is unexplored. (e) is for hand computation.
6. Tables of Orthogonal Polynomials and Related Quantities
A different approach to orthogonalization is that of precomputed tables of orthogonal polynomials. It would be a slight to much good material, and it would be to succumb prematurely to mechanization, if we failed to mention some of the tabular material that is available. Its principal uses are likely to be in spot computations and computations performed away from large-scale computing facilities. The classical orthogonal polynomials have been tabulated by Russel [6.1], by the National Bureau of Standards [6.4], and by Karmazina [6.6]. Reference [6.4] contains a fine introduction to the Chebyshev polynomials of the first kind by C. Lanczos which discusses application to curve fitting, a particular brand of approximation known as “economization,” and solution of linear differential equations. Attention should also be called to [6.12] where polynomials orthogonal on a set of equispaced points have been tabulated so that least square approximations to equispaced data can be made by simple linear combinations. Hochstrasser [6.16] is a useful compendium giving both values and coefficients. Related quantities are the zeros of the orthogonal functions which appear in quadrature formulae of Gauss type. Let $\alpha(x)$ be a nondecreasing function defined on $[a, b]$ which possesses an infinity of points of increase and for which the moments $\int_a^b x^n\,d\alpha(x)$ exist. Let $p_n(x) = k_n x^n + \cdots$, $k_n > 0$ be the sequence of polynomials which are orthogonal on $[a, b]$ with respect to the inner product (5.5). Let $a < x_1 < x_2 < \cdots < x_n < b$ designate the $n$ zeros of $p_n(x)$ (which are known to be simple and interior to $[a, b]$). These are the abscissas, and the quantities
$$\lambda_j = -k_{n+1}\left[k_n\,p_n'(x_j)\,p_{n+1}(x_j)\right]^{-1}, \qquad j = 1, 2, \ldots, n, \tag{6.1}$$
are the weights in the quadrature formula
$$\int_a^b f(x)\,d\alpha(x) \approx \sum_{j=1}^{n} \lambda_j f(x_j) \tag{6.2}$$
of Gauss type. The $\lambda_j$ are positive, (6.2) is exact whenever $f(x)$ is a polynomial of degree $\le 2n - 1$, and if $-\infty < a < b < \infty$, we have
$$\lim_{n \to \infty} \sum_{j=1}^{n} \lambda_j f(x_j) = \int_a^b f(x)\,d\alpha(x) \tag{6.3}$$
whenever the Riemann-Stieltjes integral on the right exists. Side by side with this formula, associated with the names of Gauss and Jacobi, we have a modification wherein a fixed number of abscissas have been preassigned and the remaining abscissas and all the weights selected so that the resulting formula (6.2) is exact for polynomials of maximal degree. These are associated with the names of Lobatto, Radau, and Bouzitat. In these modified cases, the abscissas turn out to be the zeros of certain linear combinations of the orthogonal polynomials and their derivatives. Despite the very beautiful theory which these quadrature formulas possess, their use in numerical analysis was sporadic up to the advent of high-speed computing machinery. This was due to the fact that the numbers $x_j$ and $\lambda_j$ are generally irrational. Since automatic machines are indifferent to “rationals” or “irrationals,” the Gaussian rules have now become quite popular, and their advantage over other rules is for certain classes of integrands quite striking. The following abscissas and weights for quadrature rules of Gauss type are currently available. We write here $w(x)\,dx$ in place of $d\alpha(x)$, and reference principally those values computed by electronic calculators.

Type of rule    Range           Weight function    Abscissas preassigned    Range in n                      Accuracy    Reference
Gauss           [-1, 1]         1                  None                     selected to 96                  20D         [6.8, 6.9]
Gauss           [-1, 1]         1                  None                     2(1)64                          20D         [6.13]
Jacobi          [0, 1]          x^m                None                     m = 0(1)5, n = 1(1)8            12D         [6.11]
Lobatto         [-1, 1]         1                  -1, 1                    selected to 65                  19D         [6.15]
Radau           [-1, 1]         1                  -1                       2(1)5                           6D          [6.7]
Logarithmic     [0, 1]          log x              None                                                     6D          [6.7]
Laguerre        [0, ∞]          x^m e^(-x)         None                     m = 0(1)5, n selected to 32     17S         [6.14]
Hermite         [-∞, ∞]         e^(-x^2)           None                     1(1)20                          13S         [6.5]
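For readers without the tables at hand, the abscissas and weights of the simplest rule can be generated directly. The Python sketch below treats only the Gauss case on $[-1, 1]$ with unit weight, and rests on two assumptions that should be checked before reuse: that the $p_n$ in (6.1) are the Legendre polynomials normalized to unit norm, in which case (6.1) reduces to $\lambda_j = -2/[(n + 1)P_n'(x_j)P_{n+1}(x_j)]$ in terms of the standard Legendre polynomials $P_n$, and that Newton's method from the usual cosine starting values converges to the zeros. The function names are hypothetical.

```python
import math

def legendre(n, x):
    """Return P_n(x), P_{n+1}(x) and P_n'(x) for n >= 1 and |x| < 1."""
    p_prev, p = 1.0, x                               # P_0, P_1
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    p_next = ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    dp = n * (x * p - p_prev) / (x * x - 1.0)        # derivative identity
    return p, p_next, dp

def gauss_legendre(n):
    """Abscissas and weights of the n-point Gauss rule on [-1, 1]."""
    nodes, weights = [], []
    for j in range(1, n + 1):
        x = math.cos(math.pi * (j - 0.25) / (n + 0.5))   # starting value
        for _ in range(50):                              # Newton iteration
            p, _unused, dp = legendre(n, x)
            x -= p / dp
        _unused, p_next, dp = legendre(n, x)
        nodes.append(x)
        weights.append(-2.0 / ((n + 1) * dp * p_next))   # (6.1) for Legendre
    return nodes, weights

nodes, weights = gauss_legendre(5)
for k in range(0, 12, 2):
    approx = sum(w * x**k for x, w in zip(nodes, weights))
    print(k, approx, 2.0 / (k + 1))
```

The printed pairs should agree through degree $2n - 1 = 9$ and differ visibly at degree 10, which illustrates the exactness statement following (6.2).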
7. Least Square Approximation of Functions
Let $B$ designate a region in the space of $n$ real variables. $P$ is a variable point in $B$, $f(P)$ a function defined on $B$, $w(P)$ a positive weighting function and $f_1(P), \ldots, f_n(P)$ a given set of $n$ independent functions. The fundamental problem is to find a linear combination $\sum_{i=1}^{n} a_i f_i(P)$ such that
$$\int \cdots \int_B w(P)\Big[f(P) - \sum_{i=1}^{n} a_i f_i(P)\Big]^2 dx_1 \cdots dx_n = \text{minimum}. \tag{7.1}$$
In the discrete formulation of the problem, we want
$$\sum_{j=1}^{N} w_j\Big[f(P_j) - \sum_{i=1}^{n} a_i f_i(P_j)\Big]^2 = \text{minimum}, \qquad N \ge n. \tag{7.2}$$
Here we have selected N distinct points Pj in B a t which we do the approximating. Modifications are possible. Suppose that f and f i are differentiable functions. We might want to do our approximating in such a way that derivatives are brought into the picture. For instance, set up the problem
$$\sum_{j=1}^{N} w_j^{(1)}\Big[f(P_j) - \sum_{i=1}^{n} a_i f_i(P_j)\Big]^2 + \sum_{j=1}^{N} w_j^{(2)}\Big[f'(P_j) - \sum_{i=1}^{n} a_i f_i'(P_j)\Big]^2 = \text{minimum}. \tag{7.3}$$

Here, the prime designates certain total or partial derivatives. Depending upon the relative sizes of the weights $w_j^{(1)}$, $w_j^{(2)}$, we throw differing amounts of emphasis on the functional values or upon the derivative values. Since ordinary least square approximation frequently provides oscillatory approximants, the criterion (7.3) might prove useful in curbing some of this oscillation. Unfortunately there is not much data available for inspection from which to glean a feeling for this criterion. If $f_i^*(P)$ designates the functions $f_i(P)$ after they have been orthonormalized with respect to the appropriate inner product, then the least square approximation is given by the Fourier segment

$$\sum_{i=1}^{n} (f, f_i^*)\, f_i^* \tag{7.4}$$

and the measure of the minimum error that can be achieved is

$$\Big\| f - \sum_{i=1}^{n} (f, f_i^*)\, f_i^* \Big\|^2 = \|f\|^2 - \sum_{i=1}^{n} (f, f_i^*)^2. \tag{7.5}$$
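The passage from the discrete problem (7.2) to the Fourier segment (7.4) and the error identity (7.5) can be sketched in a few lines. The example below assumes NumPy; the data (21 equally spaced points, unit weights, f = exp, and the basis 1, x, x^2, x^3) are made up for illustration, and the modified Gram-Schmidt routine is one of several ways to orthonormalize, not necessarily the one used in the codes discussed in this article.

```python
import numpy as np

def orthonormalize(F, w):
    """Modified Gram-Schmidt on the columns of F under (u, v) = sum_j w_j u_j v_j."""
    Q = np.array(F, dtype=float)
    for i in range(Q.shape[1]):
        for k in range(i):
            Q[:, i] -= np.sum(w * Q[:, k] * Q[:, i]) * Q[:, k]
        Q[:, i] /= np.sqrt(np.sum(w * Q[:, i] ** 2))
    return Q

# Hypothetical data: N = 21 points, unit weights, f = exp, basis 1, x, x^2, x^3.
x = np.linspace(-1.0, 1.0, 21)
w = np.ones_like(x)
f = np.exp(x)
F = np.vander(x, 4, increasing=True)

Q = orthonormalize(F, w)
coeffs = Q.T @ (w * f)            # Fourier coefficients (f, f_i*)
approx = Q @ coeffs               # Fourier segment (7.4)

err_direct = np.sum(w * (f - approx) ** 2)               # left side of (7.5)
err_formula = np.sum(w * f ** 2) - np.sum(coeffs ** 2)   # right side of (7.5)
```

The two error values agree up to roundoff, which is a convenient internal check on an orthonormalizing code.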
Mixed problems of interpolation and approximation occur with sufficient frequency in practical work to warrant their being mentioned. In fitting a polynomial to data, it may be desired to pass the polynomial through one, or two, or a fixed number of points while approximating to the others in the sense of least squares. The general formulation in inner product spaces is as follows: find

$$\Big\| y - \sum_{i=1}^{n} a_i x_i \Big\| = \text{minimum} \tag{7.6}$$

subject to the $p$ side conditions

$$\Big( y_j, \sum_{i=1}^{n} a_i x_i \Big) = \beta_j, \qquad j = 1, 2, \ldots, p. \tag{7.7}$$
This problem can be immediately reduced to a conventional least square problem in $n - p$ variables. Assume that there is a solution

$$x_0 = a_1^0 x_1 + a_2^0 x_2 + \cdots + a_n^0 x_n \tag{7.8}$$

to the equations (7.7): $(y_j, x_0) = \beta_j$, $j = 1, 2, \ldots, p$. Consider now the linear subspace $S$ of the space spanned by $x_1, \ldots, x_n$ and which is defined by the homogeneous conditions

$$(y_j, x) = 0, \qquad j = 1, 2, \ldots, p. \tag{7.9}$$

$S$ is of dimension $n - p$ and here we can find an independent basis $w_1, w_2, \ldots, w_{n-p}$. Orthonormalize the $w$’s, giving $w_1^*, w_2^*, \ldots, w_{n-p}^*$. Then the solution to the above problem is given by

$$y \approx x_0 + \sum_{k=1}^{n-p} (y - x_0, w_k^*)\, w_k^*. \tag{7.10}$$
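The reduction (7.8)-(7.10) can be sketched numerically for vectors with the ordinary dot product as inner product. The code below assumes NumPy; the function name `constrained_fit`, the use of a singular value decomposition to obtain a basis of S, and the test data (a cubic forced through the first two of eight data points) are illustrative assumptions rather than the procedure of any particular code.

```python
import numpy as np

def constrained_fit(X, y, Y, beta):
    """Minimize ||y - X a|| subject to Y.T @ (X a) = beta, following (7.8)-(7.10)."""
    C = Y.T @ X                                    # side conditions acting on the coefficients
    a0 = np.linalg.lstsq(C, beta, rcond=None)[0]   # one particular solution, as in (7.8)
    x0 = X @ a0
    p = C.shape[0]
    _, _, Vt = np.linalg.svd(C)
    W = X @ Vt[p:].T                               # basis of the subspace S defined by (7.9)
    Wstar, _ = np.linalg.qr(W)                     # orthonormalized w's
    return x0 + Wstar @ (Wstar.T @ (y - x0))       # formula (7.10)

# Hypothetical data: a cubic that must pass through the first two data points.
t = np.linspace(0.0, 1.0, 8)
y = np.cos(3.0 * t)
X = np.vander(t, 4, increasing=True)               # columns 1, t, t^2, t^3
Y = np.eye(8)[:, :2]                               # (y_j, x) picks out the value at point j
fit = constrained_fit(X, y, Y, y[:2])
```

Here the side conditions are assumed independent, so that the matrix `C` has full row rank p.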
Geometrically speaking we are dealing with the problem of the shortest distance from a point to a subspace, and for an exposition from this point of view see Schreier and Sperner [2.1, pp. 140-151]. As a concrete example, let $N$ pieces of data $y_i$ be given at points $x_1, x_2, \ldots, x_N$ and suppose it is desired to find that polynomial of degree $n$ which passes through the points $x_1, x_2, \ldots, x_p$ and which best fits the data on the remaining points in the sense of least squares. Here we must have $0 < p < n \le N$. Let $q_p(x)$ be the unique polynomial of degree $p - 1$ which interpolates to $y_i$ at $x_i$, $i = 1, 2, \ldots, p$: $q_p(x_i) = y_i$, $i = 1, 2, \ldots, p$. Take the reduced data $y_i - q_p(x_i)$, $i = p + 1, p + 2, \ldots, N$ (which vanishes at $x_1, x_2, \ldots, x_p$) and fit it in the sense of least squares by linear combinations of $(x - x_1)(x - x_2) \cdots (x - x_p)\,x^j$, $j = 0, 1, 2, \ldots, n - p$, at the points $x_{p+1}, x_{p+2}, \ldots, x_N$. If this answer is designated by $Q(x)$, the final solution is given by $q_p(x) + Q(x)$.

The processes that we have been describing select an approximation from a linear family of approximants. It is of considerable importance to numerical analysis to be able to deal with nonlinear problems. This topic is currently under development, and the techniques of this paper are not applicable. It is a temptation, though, to “linearize” whenever possible, and we shall now describe some ways this can be done. Suppose that $f(x)$ is defined on an interval and we would like to approximate it by rational functions of the form $(a_1 x + a_2)/(x + a_3)$ so that

$$\Big\| f(x) - \frac{a_1 x + a_2}{x + a_3} \Big\| = \text{minimum}. \tag{7.11}$$

Multiply up by the denominator, discard the denominator, and consider, instead of the nonlinear problem (7.11), the linear problem

$$\| x f(x) + a_3 f(x) - a_1 x - a_2 \| = \text{minimum}. \tag{7.12}$$
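The linearized problem (7.12) is an ordinary least square problem for the three coefficients. A minimal sketch, assuming NumPy and a discrete set of sample points; the function name `linearized_rational_fit` and the test function, which is itself of the form sought so that the coefficients can be checked, are illustrative assumptions.

```python
import numpy as np

def linearized_rational_fit(x, f):
    """Fit f(x) ~ (a1*x + a2)/(x + a3) through the linear problem (7.12)."""
    # (7.12) approximates x*f by a1*x + a2 - a3*f.
    A = np.column_stack([x, np.ones_like(x), -f])
    a1, a2, a3 = np.linalg.lstsq(A, x * f, rcond=None)[0]
    return a1, a2, a3

x = np.linspace(0.0, 1.0, 11)
f = (2.0 * x + 1.0) / (x + 3.0)      # hypothetical test function of the form sought
a1, a2, a3 = linearized_rational_fit(x, f)
```

For a function that is not exactly of this form, the residual of (7.12) equals the residual of (7.11) multiplied by the denominator $x + a_3$, so the two problems weight the error differently across the interval.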
The problem (7.12) calls for an approximation of $x f(x)$ by a linear combination of $1$, $x$, and $f(x)$. Though we have occasionally used this device with profit, we know it can lead to unacceptable results and must therefore be employed with caution. The relationship between the general problems of which (7.11) and (7.12) are simple representatives stands in need of clarification.

There is an iterative method due to Gauss for the least square fit of nonlinear functions. The method employs a sequence of linear approximations where, at each stage of the iteration, a linear problem is solved. Some numerical observations on this method are given in Hartley’s report [7.14] and some general conclusions can be found in reference [7.13, pp. 16-20].

A second type of nonlinearity arises from nonquadratic norms. From the point of view of approximation theory, the least square criterion arising from a norm in an inner product space, $\|x\| = (x, x)^{1/2}$, is but one of an infinite number of criteria and norms that might be employed. Its popularity is due to the fact that it leads to a linear theory and closed form solutions, and on this account is of considerable simplicity and has been developed to the utmost. High-speed computing machines have created their own problems and brought with them their own resources. This fact, coupled with the development of linear programming, has served to make another norm popular. This is the so-called uniform or Chebyshev norm. In the case of continuous functions defined on an interval $[a, b]$ it is defined by

$$\|f\| = \max_{a \le x \le b} |f(x)|. \tag{7.13}$$

The resulting approximation problem for polynomials

$$\| f(x) - (a_0 + a_1 x + \cdots + a_n x^n) \| = \text{minimum} \tag{7.14}$$

is nonlinear. A number of iterative schemes have been proposed, convergence proved, codes written, and successful machine runs made. This material is beyond the scope of the present article, but the interested reader will find pertinent material in references [7.6-7.9, 7.12]. Approximation by rationals

$$\Big\| f(x) - \frac{a_0 + a_1 x + \cdots + a_n x^n}{b_0 + b_1 x + \cdots + b_m x^m} \Big\| = \text{minimum}; \qquad b_m = 1 \tag{7.15}$$

has also been considered, but the machine coding of this problem seems currently to be in a state of development. The solution to the problem (7.13)-(7.14) provides (theoretically) the approximation for which the maximum deviation is minimum. The maximum deviation for approximants found by the least square criteria will therefore be higher. For the smooth, conventional kind of functions, the maximum deviations for least square approximants seem to run around 1-5 times the least possible (Chebyshev) deviations. It would be a “rare” function for which this ratio would be as high as 10. Indeed, the least square approximant is usually a fine approximation to the Chebyshev approximant, and in performing iterative computations to determine the latter, Stiefel [7.9] suggests strongly that one begin with the former. These facts should be kept in mind and weighed against any increased difficulty of computation when uniform approximations are suggested.

Of all the uses of least squares or orthonormalizing codes, that of functional approximation occurs the most frequently. It is the “bread and butter” use. When experimental data are being treated, a choice must be made of the proper degree of the polynomial fit and, in addition, certain statistical parameters are often sought. Readers who would like an introduction to the statistical side of this field are referred to Snedecor [7.1], Kendall [7.2], and Box [7.13]. An excellent bibliography on correlation and regression theory up to 1957 is given in Deming [7.10].
8. Overdetermined Systems of Linear Equations
Given $n$ linear equations in $p$ unknowns,

$$\sum_{j=1}^{p} a_{ij} x_j = b_i, \qquad i = 1, 2, \ldots, n; \quad p < n. \tag{8.1}$$

The system (8.1) is, in general, overdetermined. It can be treated numerically by asking for that vector $(x_1, x_2, \ldots, x_p)$ for which

$$\sum_{i=1}^{n} \Big( b_i - \sum_{j=1}^{p} a_{ij} x_j \Big)^2 = \text{minimum}.$$

In other words, we are to approximate the vector $(b_1, \ldots, b_n)$ by linear combinations of the $p$ vectors $(a_{1j}, a_{2j}, \ldots, a_{nj})$, $j = 1, 2, \ldots, p$. The general results of Section 2 provide the answer, and little more need be said. Vector orthonormalizing codes can handle this problem without any modifications. The square norm, however, is not always appropriate to the physical problem leading to (8.1), and so the problem has been set up and algorithms proposed for other norms. The uniform norm

$$\| (b_1, b_2, \ldots, b_n) \| = \max_{1 \le i \le n} |b_i|$$

has received considerable attention. The $2q$th power norm, $\| (b_1, b_2, \ldots, b_n) \| = \big( \sum_{i=1}^{n} b_i^{2q} \big)^{1/2q}$, has received some, in virtue of the fact that as $q \to \infty$ it approaches the uniform norm in value. It would exceed the limitations we have set for this article to discuss these alternate norms, and we content ourselves with providing the interested reader with some selected references. In 1911 de la Vallée-Poussin [8.1] gave an algorithm for the solution of the problem under the uniform norm. This has proved unwieldy, and recent solutions have all been given with the requirements and potentialities of high-speed computing machines in mind.
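As a small illustration of the square norm treatment of (8.1), and of the remark that the 2qth power norm approaches the uniform norm, consider the sketch below. It assumes NumPy; the four equations in two unknowns are made-up data, and the QR factorization stands in for a vector orthonormalizing code.

```python
import numpy as np

# Made-up system: n = 4 equations, p = 2 unknowns.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([0.1, 0.9, 2.1, 2.9])

# Orthonormalize the columns of A (QR), then read off the least square solution.
Q, R = np.linalg.qr(A)
x = np.linalg.solve(R, Q.T @ b)
residual = b - A @ x

def power_norm(v, q):
    """The 2q-th power norm (sum_i v_i^(2q))^(1/(2q))."""
    return float(np.sum(v ** (2 * q)) ** (1.0 / (2 * q)))

uniform = float(np.max(np.abs(residual)))
norms = [power_norm(residual, q) for q in (1, 4, 16, 64)]
```

The values in `norms` decrease toward `uniform` as q grows; note that this only evaluates different norms of the least square residual and does not solve the minimization problem under those norms.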
9. Least Square Methods for Ordinary Differential Equations
It is desired to solve the differential equation

$$F(x, y, y', \ldots, y^{(n)}) = 0 \tag{9.1}$$

over $a \le x \le b$ subject to appropriate auxiliary conditions. Make the substitution

$$y = \phi(x, a_1, a_2, \ldots, a_p) \tag{9.2}$$

where $\phi$ is a convenient closed form expression which depends upon $p$ parameters $a_1, \ldots, a_p$. The function $\phi$ is assumed to satisfy the auxiliary conditions for all values of the parameters. Now, set up the integral

$$I = \int_a^b w(x)\, F^2(x, \phi, \phi', \ldots, \phi^{(n)})\, dx, \tag{9.3}$$

where $w(x) > 0$ is a convenient weighting function, and seek values of $a_1, \ldots, a_p$ which render

$$I = \text{minimum}. \tag{9.4}$$

In general, this is a nonlinear problem. In the special, but frequently occurring case of (9.1) linear, if we take for substitution an expression (9.5) that is linear in the parameters, the problem becomes linear and is therefore amenable to orthonormalizing techniques. This method seems to go back at least as far as Picone [9.1]. For some textbook presentations see Collatz [9.3, pp. 130-131, 184-185] and Fox [9.5]. The following modification can be made. Designate the auxiliary conditions to be satisfied by

$$A_i(y) = 0, \qquad i = 1, 2, \ldots, n. \tag{9.6}$$

Assuming no auxiliary conditions for $\phi$, solve the problem

$$\int_a^b w(x)\, F^2(x, \phi, \phi', \ldots, \phi^{(n)})\, dx + \sum_{i=1}^{n} w_i A_i^2(\phi) = \text{minimum}. \tag{9.7}$$
This approach puts the auxiliary conditions on the same footing, so to speak, with the differential equation itself, and will be of utility whenever the former are as troublesome as the latter. For additional methods which involve parameters, such as the Ritz, Galerkin, and collocation methods, see Collatz [9.3]. Studies of the degree of convergence of this method have been carried out for the two-point problem [9.1, 9.2, 9.4]. Despite estimates which indicate rapid convergence under certain circumstances, the method of least squares has not to our knowledge been exploited numerically; other well tested methods offer stiff competition [9.5], and are commonly employed. For an additional application of the least square method to two-point problems, see Kadner [9.6]. For applications to the solution of integral equations, see [9.7] and [9.8].
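To make the linear case concrete, here is a minimal sketch for a two-point problem chosen purely for illustration: y'' + y + x = 0 with y(0) = y(1) = 0, whose exact solution is sin x / sin 1 - x. The trial functions x^i (1 - x) satisfy the auxiliary conditions for all parameter values, the weighting function is taken as 1, and the integral is replaced by a Gauss rule; all of these choices, and the function names, are assumptions of the sketch. NumPy is assumed.

```python
import numpy as np

def solve_bvp_least_squares(p, m=20):
    """Least square solution of y'' + y + x = 0, y(0) = y(1) = 0, with p parameters."""
    t, wt = np.polynomial.legendre.leggauss(m)
    x, w = 0.5 * (t + 1.0), 0.5 * wt                 # Gauss rule mapped to [0, 1]
    cols = []
    for i in range(1, p + 1):
        phi = x**i * (1.0 - x)                       # satisfies both boundary conditions
        d2 = i * (i - 1) * x**(i - 2) - (i + 1) * i * x**(i - 1)   # second derivative of phi
        cols.append(d2 + phi)                        # differential operator applied to phi_i
    A = np.column_stack(cols) * np.sqrt(w)[:, None]
    rhs = -x * np.sqrt(w)
    return np.linalg.lstsq(A, rhs, rcond=None)[0]    # minimizes the quadrature form of the integral

def evaluate(a, x):
    return sum(ai * x**i * (1.0 - x) for i, ai in enumerate(a, start=1))

a = solve_bvp_least_squares(4)
xs = np.linspace(0.0, 1.0, 101)
exact = np.sin(xs) / np.sin(1.0) - xs
max_err = np.max(np.abs(evaluate(a, xs) - exact))
```

The result is a closed form polynomial approximation whose maximum error on the grid can be compared directly with the known solution.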
10. Linear Partial Differential Equations of Elliptic Type
When we come to least square methods for the solution of linear boundary value problems, we are dealing, perhaps in contrast to the problem of the previous section, with a method which has many fine numerical features. In Part II of this report, extensive experiments in this area will be presented and discussed. The prototype problem here is, of course, the Laplace equation

$$\Delta u = 0 \quad \text{in } B \tag{10.1}$$

under one of the following three boundary conditions

$$\begin{aligned}
&\text{(i)} \quad u(p) = g(p) && \text{Dirichlet problem,}\\
&\text{(ii)} \quad \frac{\partial u(p)}{\partial n} = g(p) && \text{Neumann's problem,}\\
&\text{(iii)} \quad a(p)\,u(p) + b(p)\,\frac{\partial u(p)}{\partial n} = g(p) && \text{Mixed, or Robin's problem.}
\end{aligned} \tag{10.2}$$

Here, $B$ designates a region, $p$ is a point on the boundary of $B$, $\partial B$, and $a$, $b$, $g$ are given functions on $\partial B$. The method of least squares proceeds by taking a linear combination

$$u^* = \sum_{i=1}^{n} a_i u_i \tag{10.3}$$
as an approximate solution. The $u_i$ are a set of independent, conveniently handled functions which are generally given in closed form. They are, moreover, selected so that either

(a) the $u_i$ satisfy the differential equation but not the boundary condition,
(b) $u_1$ satisfies the boundary conditions, $u_2, u_3, \ldots$ satisfy homogeneous boundary conditions, but the $u_i$ do not satisfy the differential equation,
(c) the $u_i$ satisfy neither the differential equation nor the boundary conditions. (10.4)

For the differential equation in two or a higher number of dimensions

$$L(u(P)) = 0, \qquad P \in B \tag{10.5}$$

under the boundary conditions

$$M(u(P)) = g(P), \qquad P \in \partial B \tag{10.6}$$

we may set up the following integrals corresponding to (a), (b), and (c)

$$\begin{aligned}
&\text{(a)} \quad \int_{\partial B} w\,[M(u^*) - g]^2\, ds,\\
&\text{(b)} \quad \int_{B} w\,[L(u^*)]^2\, dV,\\
&\text{(c)} \quad \int_{B} w_1\,[L(u^*)]^2\, dV + \int_{\partial B} w_2\,[M(u^*) - g]^2\, ds,
\end{aligned} \tag{10.7}$$
and seek the values of $a_1, a_2, \ldots, a_n$ in (10.3) which render these integral measures of discrepancy as small as possible. Here, $w$, $w_1$, $w_2$ are conveniently chosen weighting functions, and $dV$ and $ds$ are integration elements in $B$ and $\partial B$ respectively. When we have solved this problem, we have a closed form approximation to our solution which may be very convenient for subsequent numerical work. The bulk of the theoretical and numerical work has been devoted to method (a). For computational success we need easily calculated families of particular solutions of the differential equation $L(u) = 0$ which are closed in an appropriate metric. Such families are available in the case of the Laplace and the biharmonic equations in two and three dimensions. For sufficiently complicated differential equations, methods (b) and (c) are very appealing. For a numerical example of method (c) see Lieberstein [10.14].
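Method (a) for the Dirichlet problem of the Laplace equation can be sketched as follows. The example assumes NumPy, takes the unit square as the region, uses harmonic polynomials Re z^k and Im z^k (centered in the square, up to degree 8) as particular solutions, and replaces the boundary integral by an equally weighted sum over boundary points; the boundary data e^x cos y and all function names are hypothetical choices made so that the error can be measured.

```python
import numpy as np

def harmonic_basis(x, y, degree):
    """Columns Re z^k (k = 0..degree) and Im z^k (k = 1..degree), z centered in the square."""
    z = (x - 0.5) + 1j * (y - 0.5)
    cols = [np.real(z**k) for k in range(degree + 1)]
    cols += [np.imag(z**k) for k in range(1, degree + 1)]
    return np.column_stack(cols)

def square_boundary(m):
    """m points on each side of the unit square."""
    t = (np.arange(m) + 0.5) / m
    zero, one = np.zeros(m), np.ones(m)
    return np.concatenate([t, one, t, zero]), np.concatenate([zero, t, one, t])

def g(x, y):
    return np.exp(x) * np.cos(y)          # hypothetical boundary data, harmonic everywhere

xb, yb = square_boundary(40)
a = np.linalg.lstsq(harmonic_basis(xb, yb, 8), g(xb, yb), rcond=None)[0]

def u_star(x, y):
    return harmonic_basis(np.atleast_1d(x), np.atleast_1d(y), 8) @ a

xf, yf = square_boundary(500)             # finer boundary sample for the error check
boundary_err = np.max(np.abs(u_star(xf, yf) - g(xf, yf)))
xi, yi = np.meshgrid(np.linspace(0.1, 0.9, 9), np.linspace(0.1, 0.9, 9))
interior_err = np.max(np.abs(u_star(xi.ravel(), yi.ravel()) - g(xi.ravel(), yi.ravel())))
```

Because the difference between the approximation and the true solution is itself harmonic, the error observed in the interior does not exceed the error observed on the boundary.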
Orthogonalization methods for the solution of the potential problem go back surely as far as Zaremba [10.1, 10.2]. Zaremba makes use of the Dirichlet inner product (3.7) to effect his orthogonalizations. Orthogonalization methods obtained a strong theoretical boost from the work of Bergman whose interest in these matters extends back to 1921 [10.3]. In his papers and in his books [10.5, 10.9], the method of orthogonalized particular solutions is developed and becomes part of a larger theory related to the theory of kernel functions, Green’s, and Neumann’s function. This theory has an aesthetic appeal which is unrivaled in potential theory. Convergence theorems in the harmonic case have also been proved by Merriman [10.4]. For the convergence of the method of least squares in the polyharmonic case, see Zolin [10.13]. The Italian school has also been very active in this area. They have derived bounds, proved completeness and convergence theorems and have proposed numerical methods as a consequence. See the survey articles by Fichera [10.6] and Picone [10.10, 10.11] which contain many references to the Italian literature, and also Lowan [10.16]. For material which relates specifically to the biharmonic equation, see reference [10.7]. Considerable recent work has been done by this school with method (c) for problems of elliptic-parabolic-hyperbolic type. The method goes back certainly as far as the 1920’s when Picone advocated it. A detailed theoretical discussion and further references will be found in Fichera [10.15]. Lieberstein [10.14] gives several numerical examples and has a discussion of the numerical questions involved. Briefly, and with considerable loss in generality, the situation is as follows. Let $B$ be a region in the space of the real variables $p = (x_1, \ldots, x_n)$ and let $\partial B$ be its boundary. Let
$$L(u) = \sum_{i,j=1}^{n} a_{ij}\, u_{x_i x_j} + \sum_{i=1}^{n} b_i\, u_{x_i} + c\,u, \qquad a_{ij} = a_{ji}, \tag{10.8}$$

be a second order linear differential operator where $c < 0$ and where for any $p$ in $B$ the quadratic form with matrix $(a_{ij})$ is positive semidefinite. Under appropriate conditions which cannot be elaborated here, a subset of $\partial B$, $(\partial B)_D$, is determined by $L$ so that the elliptic-parabolic boundary value problem (10.9) is well posed. Now let $u_1, u_2, \ldots, u_n$ be a system of functions selected in advance, of appropriate continuity class, and for which $L(u_k)$ is bounded in $B$. Set up and solve the least squares problem for

$$u^* = a_1 u_1 + \cdots + a_n u_n \tag{10.10}$$

and accept $u^*$ as the approximate solution.
11. Complete Systems of Particular Solutions
The theoretical success of the method of least squares, outlined in the last section, depends upon finding a complete system of particular solutions of the partial differential equation. The numerical success depends upon finding a complete system that is readily computed. A complete system depends upon the differential equation, the domain over which it is to be solved and the class of boundary values allowed. It should be noted that there is considerable ambiguity in the literature as to the definition of the word “complete.” In the first place, it is related to “closed.” We call a system of elements in a normed linear space “closed” if linear combinations are dense in the space. We call a system of elements of an inner product space “complete” if identically zero Fourier coefficients can only come from the zero element. Some authors reverse this practice. In many situations, the two notions are identical (Riesz-Fischer Theorem) and the confusion causes no difficulty. Some authors call a set of polynomials in several variables complete if they span the class of all polynomials. Under certain conditions, this usage coincides with the above usages. In this section we shall not make explicit the fine points of completeness theory and shall use the word “complete” in the pragmatic sense of a set of solutions which is sufficiently numerous so that the least square process is demonstrably convergent to the proper solution in the interior of the region, and this to be true for a wide class of boundary conditions. Nor shall we elaborate the topological conditions that the region must satisfy.

For Laplace’s equation $\Delta u = 0$ in a simply connected two-dimensional region, the harmonic polynomials $\operatorname{Re}(z^n)$, $n = 0, 1, \ldots$, $\operatorname{Im}(z^n)$, $n = 1, 2, \ldots$, $z = x + iy$, form a complete system. If the region is multiply connected, this system is no longer complete but must be augmented. Let the region contain $p$ holes and let $z_1, z_2, \ldots, z_p$ be $p$ points selected so that $z_k$ is in the $k$th hole. If the functions $\operatorname{Re}\,(z - z_k)^{-n}$ and $\operatorname{Im}\,(z - z_k)^{-n}$, $n = 1, 2, \ldots$, $k = 1, 2, \ldots, p$, and $\log |z - z_k|$, $k = 1, 2, \ldots, p$, be adjoined to the above list, the resulting system is complete (cf. Nehari [11.7], p. 372). For Laplace’s equation in a simply connected three-dimensional region, the spherical harmonics are complete (cf. Kellogg [11.2]). For a proof see Vekua [11.9]. If the region is multiply connected and contains $p$ holes we must augment these functions. Using Maxwell’s transformation,
(11.1)

with $r^2 = (x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2$, and applying it to the spherical harmonics, we produce harmonic functions which are regular save at $(x_0, y_0, z_0)$. Select points $P_k = (x_k, y_k, z_k)$ in the $k$th hole and transform the spherical harmonics in this way. If this is done for $k = 1, 2, \ldots, p$ and the functions adjoined to the spherical harmonics, the augmented set will be complete. A detailed discussion of this problem will be found in Fichera [11.4]; see also Bergman and Schiffer [10.9], p. 202, and Gagua [11.19].

The problem of constructing complete and linearly independent sets of polynomial solutions is of considerable importance in computation and of interest in its own right. Such sets will be of utility in simply connected regions. A number of authors have exhibited such sets for the Laplace, the iterated Laplace, and the wave equations in $n$ dimensions. Reference is made to Miles and Williams [11.11, 11.15]. Additional references will be found there. We have made numerical use of Miles and Williams’ scheme for the three-dimensional Laplace equation. Their functions are expressed in terms of the $j$th iterate of the Laplacian.

For the biharmonic equation in two dimensions

$$\Delta\Delta u = \frac{\partial^4 u}{\partial x^4} + 2\,\frac{\partial^4 u}{\partial x^2\, \partial y^2} + \frac{\partial^4 u}{\partial y^4} = 0, \tag{11.3}$$

the functions $\operatorname{Re} z^n$, $\operatorname{Re} \bar{z} z^n$, $n = 0, 1, 2, \ldots$; $\operatorname{Im} z^n$, $\operatorname{Im} \bar{z} z^n$, $n = 1, 2, \ldots$ are known to form a complete system in a simply connected region. For three dimensions, see Fichera [11.3] and Bergman’s book [10.9], p. 229. The functions

form a set of solutions of the three-dimensional equation

$$\Delta u + \lambda^2 u = 0 \tag{11.5}$$
which is complete on smooth surfaces [11.10]. The generation of special solutions for general partial differential equations has been made the object of a number of extensive studies and theories. We mention three: those of Bergman, Vekua, and Bers. S. Bergman makes extensive use of integral operators to generate particular solutions. He develops integral operators which transform analytic functions into solutions of

$$u_{xx} + u_{yy} + a\,u_x + b\,u_y + c\,u = 0, \tag{11.6}$$

which transform analytic functions of two complex variables into harmonic functions of three real variables, and which transform harmonic functions of three variables into solutions of various differential equations in three independent variables. He then produces complete systems of solutions by transforming complete systems of analytic functions or of harmonic functions. For certain problems, particular solutions are required which have singularities of a prescribed type. Bergman has considered this point in great detail. For an up-to-date and complete exposition of these matters, see Bergman [11.17]. Applications to questions of hydrodynamics and some numerical work will be found in Krzywoblocki [11.18]. Vekua [11.5] has considered general elliptic partial differential equations with analytic coefficients. By setting $z = x + iy$, $z^* = x - iy$ where $x$ and $y$ are interpreted as complex variables, and by making use of a complex Riemann’s function, Vekua obtains complete systems of solutions of these differential equations. For an excellent English language exposition of Vekua’s methods see Henrici [11.13]. In Vekua [11.12] a system of two first order partial differential equations is studied and on page 73 a complete system of solutions is described. In his 1959 paper [11.16], Henrici derives further results of Vekua’s type. Explicit examples of complete systems of solutions are given for a variety of special forms of (11.5). These solutions are expressed in terms of hypergeometric and other “special” functions whose function theory is well known. Bers has constructed a theory of solutions parallel to that of the theory of analytic functions of a complex variable; see reference [11.6] for an extensive presentation.
12. Error Bounds; Degree of Convergence
One of the advantages that accrues from least square methods for solving elliptic differential equations is that simple error estimates become available once the computation has been performed. Harmonic functions satisfy the maximum principle: Let $B$ designate a closed bounded region of space and let $u$ be harmonic and nonconstant in $B$. Then $u$ attains its maximum (and minimum) values on the boundary of $B$ (see, e.g., Kellogg [11.2], p. 223). If $u$ now designates the theoretical solution to a boundary value problem and if $u^*$ is a combination of harmonic functions which is to approximate $u$, the difference $u - u^*$ will also be harmonic. If we set

$$\max_{P \in \partial B} |u(P) - u^*(P)| = \delta \tag{12.1}$$

then we must have

$$|u(P) - u^*(P)| \le \delta \tag{12.2}$$

throughout the interior of $B$ as well. To obtain a global estimate of error, we need only observe the maximum deviation between the proposed boundary values and the approximate values $u^*(P)$.

For general elliptic partial differential equations in $n$ variables, the following maximum principle is available (see, e.g., Collatz [12.12], p. 439, and [12.9]). Consider the differential equation (12.3). Let $B$ be a closed, bounded, simply connected region of the $x_1, \ldots, x_n$ space and bounded by an $(n - 1)$-dimensional hypersurface $\partial B$. The boundary $\partial B$ will be divided into two pieces designated by $\partial B_1$ and $\partial B_2$. Either of these may be empty. Assume that $a_{ij}$, $b_i$, $c$, $d$ are continuous functions of $x_1, \ldots, x_n$ in $B$, $\|a_{ij}\|$ is positive definite, and $c \ge 0$. (12.3) is to be solved under the boundary conditions (12.4), in which $\partial u / \partial \nu$ designates the derivative of $u$ in the direction of the conormal on $\partial B$. Suppose now that $u^*$ is a solution of (12.3) which approximates the boundary conditions (12.4). Let $\epsilon(P)$, $P \in \partial B$, be defined by
€ ( P )=
u*-u+-
‘3~E-3 aa
A
lu* - u
onaB1 (12.5) on dBz
Then we have

$$\min_{P \in \partial B} \epsilon(P) \le u^*(P) - u(P) \le \max_{P \in \partial B} \epsilon(P). \tag{12.6}$$

For higher order partial differential equations the subject of maximum principles is currently being developed. For biharmonic functions we have, e.g., some inequalities given by Miranda [12.6]. Let $u$ be biharmonic in $B$ and be continuous along with its first derivatives. Let

$$u = f, \qquad \frac{\partial u}{\partial n} = g \qquad \text{on } \partial B \tag{12.7}$$

where $f'$ ($= f'(s)$) and $g$ are continuous. Then there exist two constants $K_1$ and $K_2$ depending only on the region $B$ such that

$$|u(P)| \le K_1 \delta\, [\max |g| + \max |f'|] + (1 + K_2 \delta) \max |f|, \qquad P \in B. \tag{12.8}$$
In (12.8) $\delta$ is the distance from $P$ to $\partial B$ and the maximum is taken over $\partial B$. If $B$ is simply connected we can take $K_2 = 0$. We can employ (12.8) to estimate approximation errors in the interior of $B$.

In the case of the two-dimensional Laplace equation, a penetrating analysis of least square error has been carried out by Nehari [12.13]. This estimate is superior to the simple maximum principle and exhibits clearly the role of completeness in the selection of the family of approximating functions. Let $B$ be a convex domain and let $u(z)$ ($= u(x, y)$) be harmonic in $B$. Designate its values on $\partial B$ by $u(s)$. Let $u_1, u_2, \ldots, u_n$ be $n$ harmonic functions which are orthonormal in the sense that

$$\int_{\partial B} u_m(s)\, u_n(s)\, ds = \delta_{mn}. \tag{12.9}$$

If $a_m$ are the Fourier coefficients of $u$ with respect to $u_m$,

$$a_m = \int_{\partial B} u(s)\, u_m(s)\, ds, \tag{12.10}$$
then
Here $z_p$ designates a point on $\partial B$ and (12.11) is valid for $z \in B$. When the system $u_1, u_2, \ldots$ is complete in the space of harmonic functions $u(z)$ with $\int_{\partial B} u^2(s)\, ds < \infty$, the first bracket in the right-hand side approaches zero (Parseval’s equation) and (12.11) exhibits the pointwise convergence of the Fourier series $\sum_{m=1}^{\infty} a_m u_m(z)$ to the solution of the first boundary value problem. This inequality cannot be employed to make a priori estimates of approximation, but as soon as orthonormal functions $u_1, \ldots, u_n$ have been computed it leads at once to global estimates. Hochstrasser [12.14] has carried out some examples of estimates using (12.11). Similar inequalities can be given for more general elliptic equations, but the details are yet to be worked out. For additional material on maximum principles see Marcolongo [12.1], Humbert [12.3], Nicolescu [12.4], Fichera [12.8], Grünsch [12.10], Duffin and Nehari [12.15], and Agmon [12.16].

For use with the problem and method of (10.8)-(10.10) are certain very general maximum principles which have been worked out by Fichera [10.15]. These lead to useful a posteriori error estimates which tell how close a particular solution is once it has been computed. Thus, in the problem just mentioned, we have

(12.12)
The two quantities on the right hand side of (12.12) may be determined with considerable accuracy as part of the program which determines $u^*$. See Lieberstein [10.14] for numerical examples. Related to the work of Fichera, but by no means identical with it, is the work of the Maryland school. The object of their work is to obtain a priori bounds for functions of certain continuity classes in terms of various operator data. A typical example is

$$u^2(P) \le k_1(P) \int_{B} (\Delta u)^2\, dv + k_2(P) \int_{\partial B} u^2\, ds. \tag{12.13}$$
Here $k_1$ and $k_2$ are constants which depend upon $P$ and the geometry of $B$ but are independent of $u$. The best values of $k_1$ and $k_2$ are the solutions of certain eigenvalue problems, but upper bounds for them can readily be obtained from the geometry. Inequalities of this sort form a practical method of a posteriori error estimation when least squares are employed. For the details, the reader is referred to Payne and Weinberger [15.42, 15.47], and to Diaz [15.49]. Much theoretical work is currently underway by L. E. Payne, J. H. Bramble, and B. E. Hubbard [12.17-12.20].

The rate of convergence of numerical solutions of boundary value problems by method (a) of Section 10 is intimately related to the smoothness of the boundary and the continuity class of the prescribed boundary data. In the case of approximation by harmonic polynomials in two variables, fairly complete theoretical information is available. Typical results are as follows. The case of strongest convergence is when the prescribed boundary data comes from a harmonic function which continues harmonically across the boundary. Let $C$ be a Jordan curve in the $z$-plane and let $C_R$ designate the image of the circle $|w| = R > 1$ under the conformal map $z(w)$ of the exterior of $C$ onto the region $|w| > 1$ with $z(\infty) = \infty$. If $u(z)$ is harmonic in the interior of $C_R$, then there exist harmonic polynomials $p_n(z)$ of degree $n$ such that (see Walsh [12.2, 12.5])

$$\limsup_{n \to \infty} \Big[ \max_{z \in C} |u(z) - p_n(z)| \Big]^{1/n} \le \frac{1}{R}. \tag{12.14}$$
Hence, the larger the region of continuability, the larger $R$, and according to (12.14) the stronger the approximability of $u(z)$ by polynomials. For solutions which do not continue across the boundary we have the following results. Let $C$ be an analytic Jordan curve. Write $u(z) \in L(k, \alpha)$, $k \ge 0$, on $C$ if $u(z)$ is harmonic in the interior of $C$, is continuous in $\bar{C}$, i.e., in $C$ + its interior, and if $d^k u(z)/ds^k$ exists on $C$ and satisfies a Lipschitz condition of order $\alpha$, $0 < \alpha \le 1$. Here $s$ designates arc length on $C$. If $u(z) \in L(k, \alpha)$, then there exist harmonic polynomials $p_n(z)$ such that (see Walsh [12.7])

$$|u(z) - p_n(z)| \le M n^{-k-\alpha}, \qquad z \in \bar{C}. \tag{12.15}$$

That is, the more derivatives the boundary data possesses, the stronger the convergence. The asymptotic inequalities (12.14) and (12.15) pertain to the uniform norm, but similar results can be inferred for the square norm. For some existential questions related to mean approximation by harmonic polynomials see Shaginyan [12.11]. For harmonic functions of three or more variables or for general elliptic differential equations, corresponding results on degree of convergence do not appear to have been worked out. However, there is reasonable numerical evidence to indicate that similar theorems can be proved. In Part II of this paper will be found a number of numerical examples which exhibit clearly the relation between degree of approximation and the continuability or continuity class of the boundary data.

13. Collocation and Interpolatory Methods and Their Relation to Least Squares
The following is a simple and, on the surface, entirely plausible method using particular solutions. It is very old, but despite this fact, has been “rediscovered” every year or so. Take $n$ particular solutions $u_1, u_2, \ldots, u_n$ and $n$ points $P_1, P_2, \ldots, P_n$ on $\partial B$. Now determine constants $a_1, \ldots, a_n$ such that if we set

$$u^* = \sum_{i=1}^{n} a_i u_i \tag{13.1}$$

then

$$u^*(P_i) = f(P_i) = \text{given boundary data}, \qquad i = 1, 2, \ldots, n. \tag{13.2}$$
In other words, let us interpolate to the boundary data by means of a linear combination of particular solutions. This method is frequently called the collocation or point matching method. See Collatz [9.3] for a more general formulation. Collocation can be employed to match not only the boundary data but also the differential equation. Despite the simplicity of the method, there are a number of facts which should make one cautious about using it. In the first place, there is not much that is known about it theoretically. Under what circumstances, algebraic or geometric, is the solution of the linear system (13.2) possible? What can be said about the convergence of the method as $n \to \infty$? In the second place, the little numerical data that is available is either inconclusive or bad (see, e.g., Poritzky and Danforth [13.1]). We suspect that the method has been employed more widely than would appear from the published literature and the results accepted uncritically. The reader is referred to an article by J. H. Curtiss [13.3] which surveys the theoretical situation for the Laplace equation. On the basis of the simple cases of circles and ellipses that have been investigated successfully [13.2], and also arguing by analogy from the case of complex analytic interpolation, Curtiss comes to the following conclusion. For sufficiently regular boundaries there probably exist sequences of points $\{P_{nk}\}$, $k = 1, 2, \ldots, n$, $n = 1, 2, \ldots$ for which the interpolation process converges for a wide class of boundary data. Such sequences are probably obtainable as the exterior conformal image of equidistributed points on the unit circle. The selection of a proper sequence of points may be crucial theoretically and numerically, and if proper selection really depends upon the conformal map, then we are going around in circles. Conveniently selected sequences such as those of equidistant points may turn out, as in the case of interpolation in one real variable, to lead to divergent approximation processes. By way of contrast, the least square approach selects coefficients so that

$$\int_{\partial B} \Big( f(P) - \sum_{i=1}^{n} a_i u_i(P) \Big)^2 ds = \text{minimum} \tag{13.3}$$
and convergence as $n \to \infty$ holds under quite general conditions. In the actual orthonormalizing computations that have been carried out and which are described in Part II, we select m points on the boundary and m weights $w_j$, corresponding to a rule of integration, and select coefficients $a_i$ such that, up to roundoff error, (13.4)
It should be clear that m can be selected in such a way, m = m(n) and m >> n, that as $n \to \infty$, this "discrete" process is also a convergent one. In problems where the contours are composed of line segments or some such simple curves, and f is simple, our process (13.4) may actually coincide, up to roundoff, with (13.3). In most problems we select m = 2n or 3n. The case m = n (which also can be run via orthonormalizing codes) is the case of straight interpolation (13.2). It is felt that the value of 2 or 3 for m/n provides enough "slack" in the system to forestall any divergencies of interpolatory type. In Section 18, case XV, we report some numerical experience with collocation which shows such divergencies. Three boundary value problems were solved by collocation over an H domain. We took boundary
values which came from (1) an entire harmonic function, (2) a harmonic function possessing a singularity outside but near the boundary of the domain, (3) boundary values arising from the torsion problem ($\tfrac{1}{2}(x^2 + y^2)$). Results in (1) were comparable to the least square solution. Those in (2) were less accurate, while those in (3) were bad, with errors exceeding 100 times those of the least square solutions. These results can be explained by arguing by analogy to the case of equidistant interpolation to functions of one variable. If the interpolated function is regular in a sufficiently large portion of the complex z plane, the interpolation process is demonstrably convergent. If it is not sufficiently regular or is not analytic at all, the interpolation process may be divergent.
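The contrast between straight interpolation (m = n) and the least square process with m = 3n can be tried out on a small scale. The following Python sketch is only an illustration under assumptions of our own choosing, not a reproduction of the experiments of Section 18: the domain is the ellipse $x^2 + 4y^2 \le 1$, the particular solutions are 1, Re $z^k$, Im $z^k$, the boundary points are equally spaced in the parameter t, and all weights are taken equal. The helper names (`harmonic_basis`, `ellipse`, `fit`, `max_boundary_error`) are hypothetical.

```python
import numpy as np

def harmonic_basis(z, K):
    """Columns 1, Re z^k, Im z^k (k = 1..K) evaluated at the complex points z."""
    cols = [np.ones_like(z.real)]
    for k in range(1, K + 1):
        zk = z ** k
        cols.append(zk.real)
        cols.append(zk.imag)
    return np.column_stack(cols)

def ellipse(m):
    """m points on x^2 + 4y^2 = 1, equally spaced in the parameter t (an assumption)."""
    t = 2.0 * np.pi * np.arange(m) / m
    return np.cos(t) + 0.5j * np.sin(t)

def fit(f, K, ratio):
    """Fit boundary data f with n = 2K + 1 functions at m = ratio * n points.

    ratio = 1 gives collocation (a square system); ratio > 1 gives an
    equally weighted discrete least square fit.
    """
    n = 2 * K + 1
    z = ellipse(ratio * n)
    coef, *_ = np.linalg.lstsq(harmonic_basis(z, K), f(z), rcond=None)
    return coef

def max_boundary_error(f, coef, K, m_check=2000):
    """Largest deviation from f on a fine set of boundary points."""
    z = ellipse(m_check)
    return float(np.max(np.abs(harmonic_basis(z, K) @ coef - f(z))))

K = 7
cases = {
    "entire harmonic function": lambda z: (z ** 2).real,
    "singularity near boundary": lambda z: np.log(np.abs(z - 0.6j)),
}
for name, f in cases.items():
    for ratio in (1, 3):
        err = max_boundary_error(f, fit(f, K, ratio), K)
        print(f"{name:26s} m = {ratio}n  max boundary error = {err:.2e}")
```

Data belonging to the span of the particular solutions are reproduced by either choice of m; for the second function the two choices generally differ, and the printed errors show by how much in this particular configuration.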
14. Conformal Mapping
Conformal maps are obtainable from complex orthogonal functions (see Bergman [14.2], Bochner [14.3], Carleman [14.5], Szegö [14.1]). Szegö's work employs a line integral as inner product and the other authors stress the area integral. Let B be a simply connected domain of area S possessing a piecewise smooth boundary b of length l. Let $z_0$ be an interior point of B and let $w = m(z, z_0)$ designate the function which maps B conformally onto the circle $|w| < R$ and such that $m(z_0, z_0) = 0$ and $m'(z_0, z_0) = 1$. Let $p_n(z) = k_n z^n + \cdots$, $k_n > 0$, be a set of complex polynomials such that
$$\int_b p_m(z)\,\overline{p_n(z)}\,ds = \delta_{mn}, \qquad m, n = 0, 1, 2, \ldots \tag{14.1}$$
Then, (14.2) where (14.3)
The radius R of the circle onto which B is mapped is given by
$$R = \frac{1}{2\pi K(z_0, z_0)}. \tag{14.4}$$
The function $K(z_0, z)$ is sometimes known as Szegö's kernel function for the domain B. If, on the other hand, the polynomials $p_n(z)$ are orthogonal in the following sense: (14.5)
then, (14.6)
and (14.7)
where
$$K_B(z_0, z) = \sum_{n=0}^{\infty} p_n(z)\,\overline{p_n(z_0)}. \tag{14.8}$$
The function $K_B(z_0, z)$ is known as the Bergman kernel function for the domain B. For monographs and text books which expand and develop these formulas see Szegö [5.2], Bergman [3.1], Nehari [11.7], pp. 239-263, Behnke and Sommer [14.8], Chapter III, §12, Kantorovich and Krylov [10.8], pp. 381-389, and Walsh [14.10], pp. 111-151. For multiply-connected domains, see Bergman [3.1], Chapter VI, and Nehari [11.7], Chapter VII. For an application of complex orthogonal polynomials to a 2-dimensional Poisson equation, see Kliot-Dashinskiy [14.7]. Nehari [14.6] has given an estimate of the truncation error in the computation of the Szegö kernel function of a plane domain with an analytic boundary C. Let $u_\nu(z)$ be analytic functions in the domain which are complete and orthonormal over C. Then $K(z, \zeta) = \sum_{\nu=0}^{\infty} u_\nu(z)\,\overline{u_\nu(\zeta)}$. Nehari shows that
$$\Bigl| K(z, \zeta) - \sum_{\nu=0}^{n} u_\nu(z)\,\overline{u_\nu(\zeta)} \Bigr| \le s_n(z)\,s_n(\zeta) \tag{14.9}$$
where
a,(t) =
1%@
2T c z - t
ds.
The functions $s_n(z) \to 0$ as $n \to \infty$, and the estimate may be carried out after the orthogonal functions have been determined. A similar estimate for the area integral orthogonalization (corresponding to (14.5)) does not seem to be available. Exterior maps are also related to orthogonal polynomials. Let $z = m(w) = cw + c_0 + c_1/w + c_2/w^2 + \cdots$, $c > 0$, map the exterior of the unit circle $|w| = 1$ conformally onto the exterior of B. Let $w = \phi(z)$ be the inverse function (mapping the exterior of B onto the exterior of $|w| = 1$). Then, with the normalization (14.1) we have (14.11)
uniformly in the exterior of B. The leading coefficients $k_n$ possess the following asymptotic behavior:
$$\lim_{n \to \infty} \frac{k_{n+1}}{k_n} = \frac{1}{c}. \tag{14.12}$$
The constant c coincides with the transfinite diameter of B, a concept which was introduced by Fekete [14.4] and which has been given very simple and elegant geometrical definitions for very general point sets. This concept is closely related to that of the electrostatic capacity of a region, referred to repeatedly in the bibliography of Section 15. The following theorem of Fekete and Walsh [14.9] relates the leading coefficient of complex orthogonal polynomials with the transfinite diameter: Let E consist of a finite number of rectifiable Jordan arcs and let $p_n(z) = k_n z^n + \cdots$, $k_n > 0$, be complex orthonormal in the sense that (14.13)
Then
$$\lim_{n \to \infty} \Bigl(\frac{1}{k_n}\Bigr)^{1/n} = \text{transfinite diameter of } E. \tag{14.14}$$
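The limit (14.12) lends itself to a quick numerical check. The sketch below is an assumption-laden illustration rather than the routine mentioned in Section 16: it discretizes the boundary of an ellipse with semi-axes a and b, orthonormalizes the powers $1, z, z^2, \ldots$ in the discrete line-integral inner product by means of a QR factorization (used here in place of an explicit Gram-Schmidt loop), and compares $k_n/k_{n+1}$ with $(a + b)/2$, the known transfinite diameter of an ellipse. The function names are hypothetical.

```python
import numpy as np

def ellipse_boundary(a, b, M=400):
    """Points z_j on the ellipse and weights w_j approximating ds (periodic trapezoid rule)."""
    t = 2.0 * np.pi * np.arange(M) / M
    z = a * np.cos(t) + 1j * b * np.sin(t)
    w = np.hypot(a * np.sin(t), b * np.cos(t)) * (2.0 * np.pi / M)  # |z'(t)| dt
    return z, w

def leading_coefficients(z, w, degree):
    """Leading coefficients k_0..k_degree of the polynomials orthonormal with
    respect to the discrete inner product sum_j w_j p(z_j) conj(q(z_j))."""
    V = np.vander(z, degree + 1, increasing=True)   # columns 1, z, ..., z^degree
    _, R = np.linalg.qr(np.sqrt(w)[:, None] * V)    # sqrt(W) V = Q R
    return 1.0 / np.abs(np.diag(R))                 # p_n = k_n z^n + ..., k_n = 1/|R_nn|

z, w = ellipse_boundary(1.0, 0.5)
k = leading_coefficients(z, w, 12)
for n in range(1, 12):
    print(n, k[n] / k[n + 1], (1.0 / k[n]) ** (1.0 / n))
```

In this experiment the ratio $k_n/k_{n+1}$ settles down quickly, while the nth root in (14.14) approaches the same limit much more slowly.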
15. Quadratic Functionals Related to Boundary Value Problems
This is an area in which orthonormalizing techniques are potentially of great numerical value, but as the authors have had little computational experience, the topic will be passed over with a few general remarks plus an extensive bibliography. A typical problem here is the following. Let u(x, y) designate the solution of the Neumann problem
$$\Delta u = 0 \ \text{in } B, \qquad \frac{\partial u}{\partial n} = f \ \text{on } \partial B. \tag{15.1}$$
Compute the value of the Dirichlet integral (15.2)
In the particular case when $f = (d/ds)\,\tfrac{1}{2}(x^2 + y^2)$, D is intimately related to the torsional rigidity or the stiffness of the plane domain B. Much work has been done to provide upper and lower bounds for D. Two inequalities valid in inner product spaces are pertinent to this work. The first is the familiar Bessel inequality (15.3)
where $\{x_i\}$ is orthonormal and y is arbitrary. The second is a modification of this. Let y be arbitrary and z be selected so that (z, y) = (y, y). Let n linearly independent vectors $w_i$ be selected so that $(y, w_i) = 0$, i = 1, 2, ..., n. Then we have,
$$(y, y) \le (z, z) - \sum_{i=1}^{n} a_i (z, w_i) \tag{15.4}$$
where the constants $a_i$ solve the linear system of equations
$$\sum_{i=1}^{n} a_i (w_i, w_j) = (z, w_j), \qquad j = 1, 2, \ldots, n. \tag{15.5}$$
Lower bounds to (y, y) are provided by (15.3) while upper bounds are given by (15.5). Introduce the Dirichlet inner product (15.6)
Then
$$D = (u, u) \tag{15.7}$$
where u solves the problem (15.1). Now Green's identity leads to (15.8)
If in (15.3) we select y = u and $x_i = u_i$, where $u_i$ is any set of functions which are orthonormal with respect to (15.6), then we have (15.9)
Under appropriate completeness conditions (15.9) becomes an equality when n = ∞. If $z(x, y) = f + c_p$ on the separate contours of $\partial B$ ($c_p$ constant, p = 1, 2, ..., q), then (z, u) = (u, u). If $u_i = 0$ on $\partial B$, then $(u, u_i) = 0$. With z and $u_i$ thus selected we may insert y = u, z = z, $w_i = u_i$ in (15.4) and obtain upper estimates for D. These matters are explained fully by Diaz [15.30, 15.49]. Numerical examples have been worked out by Synge [15.45]. Though much effort has been spent on the idea of separate upper and lower bounds, and though it is of great theoretical interest, some of its devotees lose sight of one numerical fact. To obtain separate upper and lower bounds which are very good requires an increasing amount of computation, the accuracy of which cannot be given. The computed upper and lower bounds are therefore only approximate, and one might do well to confine one's numerical work to a single convergent scheme such as (15.9) with n = ∞.
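The two inequalities can be watched at work in the simplest possible setting. The sketch below is an illustration only, with ordinary vectors in 8-dimensional real space and the usual dot product standing in for the Dirichlet inner product; the data are random and every name in it is hypothetical. Here y is known, so that the bounds can be compared with (y, y); in the application only the inner products involving y would be available.

```python
import numpy as np

def lower_bound(y, X):
    """Bessel sum sum_i (y, x_i)^2 for the orthonormal columns x_i of X."""
    return float(np.sum((X.T @ y) ** 2))

def upper_bound(z, W):
    """Right-hand side of (15.4): (z, z) - sum_i a_i (z, w_i), with the a_i from (15.5)."""
    G = W.T @ W                     # Gram matrix (w_i, w_j)
    rhs = W.T @ z                   # right-hand sides (z, w_j)
    a = np.linalg.solve(G, rhs)
    return float(z @ z - a @ rhs)

rng = np.random.default_rng(0)
N, n = 8, 3
y = rng.standard_normal(N)
yy = float(y @ y)

def remove_y_component(v):
    """Project v onto the orthogonal complement of y."""
    return v - (v @ y) / yy * y

X, _ = np.linalg.qr(rng.standard_normal((N, n)))          # orthonormal x_i
z = y + remove_y_component(rng.standard_normal(N))        # guarantees (z, y) = (y, y)
W = np.column_stack([remove_y_component(rng.standard_normal(N)) for _ in range(n)])

print("lower bound:", lower_bound(y, X))
print("(y, y):     ", yy)
print("upper bound:", upper_bound(z, W))
```

Adding further vectors $x_i$ raises the lower bound and adding further vectors $w_i$ lowers the upper bound, which is the sense in which the two estimates close in on (y, y).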
PART II: NUMERICAL
16. Orthogonalization Codes and Computations
There are several papers in the literature describing routines for generating orthogonal polynomials by means of a 3-term recurrence relation. Barker [6.12] gives a fairly complete treatment of the subject. He describes a routine using equations (5.6) to (5.9) which was written for the NORC, a floating point computer, and which has the following properties. The input consists of N pairs of data-point coordinates $(x_i, y_i)$, N associated weights $w_i > 0$, and a parameter m, the degree of the least-square approximating polynomial $f_m(x)$ desired. The output consists of as much of the following as is desired:
(a) $P^*_\nu(x_i)$, i = 1, ..., N; ν = 0, 1, ..., m
(b) $(P^*_i, P^*_j) = \delta_{ij}$, i, j = 0, 1, ..., m. Orthonormality check.
(c) The coefficients $a_{j,\nu}$ of the polynomial expansion $P^*_\nu(x) = \sum_{j=0}^{\nu} a_{j,\nu} x^j$, j = 0, ..., ν; ν = 0, ..., m
(d) $f_m(x_i)$, i = 1, ..., N
    The Fourier coefficients $b_\nu$, ν = 0, 1, ..., m; $f_m(x) = \sum_{\nu=0}^{m} b_\nu P^*_\nu(x)$
    The residuals $e(x_i) = y_i - f_m(x_i)$, i = 1, ..., N
    Sums of squares of residuals $e_\nu^T w e_\nu = y^T w y - \sum_{j=0}^{\nu} b_j^2$, ν = 0, ..., m, where $e_\nu(x_i) = y_i - \sum_{j=0}^{\nu} b_j P^*_j(x_i)$, $e_\nu = \{e_\nu(x_i)\}$, $y = (y_i)$, w = diagonal matrix $[w_i]$
    Student's $t_\nu = b_\nu \sqrt{(N - \nu - 1)/(e_\nu^T w e_\nu)}$, ν = 0, 1, ..., m
    Coefficients $d_\nu$ of the polynomial expansion $f_m(x) = \sum_{\nu=0}^{m} d_\nu x^\nu$, ν = 0, 1, ..., m
    Check on the above coefficients by evaluating $f_m(x_i)$ from the polynomial expansion and comparing with results of (d)
    $P^{*\prime}_\nu(x_i)$, $P^{*\prime\prime}_\nu(x_i)$, i = 1, ..., N; ν = 0, 1, ..., m
    $\sum_{\nu=0}^{m} b_\nu P^{*\prime}_\nu(x_i)$, i = 1, ..., N
    $\sum_{\nu=0}^{m} b_\nu P^{*\prime\prime}_\nu(x_i)$, i = 1, ..., N
In addition, Barker gives 2 sets of tables for the case of equally spaced points. The first set gives the coefficients in the polynomial expansion of $P^*_{k,N}(x)$, the polynomial of degree k orthonormal over the set of N equally spaced points in the interval [-1, 1], for selected values of k and N as well as the values of these polynomials at these points. The second set gives coefficients which enable one to calculate the variance.
For a fixed-point computer, Ascher and Forsythe [5.7], working with SWAC, describe two routines which use floating vectors. The first routine generates a set of polynomials $P_k(x)$ orthogonal (not orthonormal) over a given set of N points $x_i$ using the recurrence equations
$$P_{k+1}(x) = \lambda_k (x - \alpha_{k+1}) P_k(x) - \beta_k P_{k-1}(x), \qquad P_0 = 1, \quad \beta_0 = 0,$$
$$\alpha_{k+1} = \frac{(xP_k, P_k)}{(P_k, P_k)}, \qquad \beta_k = \frac{\lambda_k (P_k, P_k)}{\lambda_{k-1}(P_{k-1}, P_{k-1})}, \qquad \lambda_k = 1.$$
It then computes the Fourier coefficients $b_k$ of a given set of values $y_i$, $b_k = (y, P_k)/(P_k, P_k)$. The $\alpha_k$, $\beta_k$, and $b_k$ are printed out together with $\sigma_k^2 = (N - k - 1)^{-1}(e_k, e_k)$. Knowledge of these quantities is sufficient to determine the polynomial approximation to y and hence Ascher and Forsythe do not concern themselves with finding the coefficients $d_\nu$ of the least square polynomial $f_m(x)$. Instead, their second routine computes $f_m(x_i)$ for a given set of values $x_i$, which may include the original $x_i$, given the quantities $\alpha_k$, $\beta_k$, and $b_k$. Clenshaw [16.1] is concerned with those computers which have a small amount of high speed storage. He has written a routine for the DEUCE which generates by recurrence the coefficients of $P_k(x)$ in a Chebyshev series. Here the $P_k(x)$ are normalized differently by taking $P_0 = \tfrac{1}{2}$ and $\lambda_k = 2$. If $P_k(x) = \tfrac{1}{2}p_{0,k} + p_{1,k}T_1(x) + \cdots + p_{k-1,k}T_{k-1}(x) + T_k(x)$, then $p_{j,k+1} = p_{j+1,k} + p_{|j-1|,k} - 2\alpha_{k+1}p_{j,k} - \beta_k p_{j,k-1}$. Thus only the $p_{j,k-1}$ and $p_{j,k}$ need be stored instead of the values of $P_{k-1}(x_i)$ and $P_k(x_i)$, which are generated when needed to compute $\alpha_k$ and $\beta_k$. The output consists of the coefficients of the least square approximation polynomials given as a Chebyshev series and the largest positive and numerically largest negative residuals. Efroymson [16.2] has written up a program for solving least square problems on a computer using normal equations. Given a set of vectors $x_k$ and a vector y, this program expands y as a linear combination of only those vectors $x_k$ on which y is really dependent. There are many other write-ups of least square routines on electronic computers of which we mention only that of Sciama [16.3]. The first paper in the literature on orthonormalizing a set of vectors on a computer was that of Davis and Rabinowitz [1.1]. Davis [1.3] brought the subject up to date.
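The two routines of Ascher and Forsythe can be imitated in a few lines. The following Python sketch is a present-day paraphrase under the normalization $\lambda_k = 1$, not a transcription of the SWAC code: the first function generates the recurrence quantities and Fourier coefficients from the data, and the second evaluates the fit at arbitrary abscissas from those quantities alone. The function names and the test data are hypothetical.

```python
import numpy as np

def forsythe_fit(x, y, w, m):
    """Orthogonal-polynomial least squares of degree m over points x with weights w.

    Returns alpha[k] = alpha_{k+1}, beta[k] = beta_k, the Fourier coefficients b_k,
    and the fitted values at the data points.
    """
    P_prev = np.zeros_like(x)          # P_{-1} = 0
    P = np.ones_like(x)                # P_0 = 1
    norm_prev = 1.0
    alpha, beta, b = [], [], []
    fitted = np.zeros_like(x)
    for k in range(m + 1):
        norm = np.sum(w * P * P)                           # (P_k, P_k)
        b.append(np.sum(w * y * P) / norm)                 # b_k = (y, P_k)/(P_k, P_k)
        fitted = fitted + b[-1] * P
        alpha.append(np.sum(w * x * P * P) / norm)         # alpha_{k+1}
        beta.append(norm / norm_prev if k > 0 else 0.0)    # beta_k, with beta_0 = 0
        P, P_prev = (x - alpha[-1]) * P - beta[-1] * P_prev, P
        norm_prev = norm
    return np.array(alpha), np.array(beta), np.array(b), fitted

def forsythe_eval(alpha, beta, b, xs):
    """Evaluate the fitted polynomial at points xs using only alpha, beta and b."""
    P_prev = np.zeros_like(xs)
    P = np.ones_like(xs)
    total = np.zeros_like(xs)
    for a_k, be_k, b_k in zip(alpha, beta, b):
        total = total + b_k * P
        P, P_prev = (xs - a_k) * P - be_k * P_prev, P
    return total

x = np.linspace(-1.0, 1.0, 21)
y = np.exp(x)
w = 1.0 + x ** 2
alpha, beta, b, fitted = forsythe_fit(x, y, w, 4)
print(np.max(np.abs(fitted - y)))
```

As in the text, the coefficients of the powers of x are never formed; the triple of arrays is sufficient to evaluate the approximation anywhere.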
The latest publication of the "National Bureau of Standards School" in this subject is a write-up by P. J. Walsh and E. V. Haynsworth dated November 10, 1959 of a routine Bs ZRTH for use on the IBM 704 submitted to SHARE. The features of this routine are as follows: (a) The generalized definition of inner product is used.
is a given vector and $W = (w_{ij})$ is a real, positive definite symmetric matrix of weights, generally diagonal and often the identity matrix.
(b) The Gram-Schmidt process in recursive form is used. Given a set of linearly independent vectors $f_1, f_2, \ldots, f_n$, the object of the Gram-Schmidt process is to produce a set of orthonormal vectors $\phi_i$ which are linear combinations of the vectors $f_i$,
$$\phi_1 = a_{11} f_1$$
$$\phi_2 = a_{21} f_1 + a_{22} f_2$$
$$\cdots$$
$$\phi_n = a_{n1} f_1 + a_{n2} f_2 + \cdots + a_{nn} f_n.$$
Then in terms of the orthonormal functions $\phi_1, \ldots, \phi_n$ a given vector y is approximated by $\sum_{k=1}^{n} (y, \phi_k)\phi_k = \sum_{j=1}^{n} d_j f_j$, where $d_j = \sum_{k=j}^{n} (y, \phi_k) a_{kj}$. The orthonormal vectors $\phi_i$ are computed as follows:
$$\phi_1 = f_1/D_1, \qquad D_1 = (f_1, f_1)^{1/2},$$
$$\phi_i = \Bigl(f_i - \sum_{k=1}^{i-1} (f_i, \phi_k)\phi_k\Bigr)\Big/ D_i, \qquad D_i = \Bigl\| f_i - \sum_{k=1}^{i-1} (f_i, \phi_k)\phi_k \Bigr\|, \qquad i = 2, \ldots, n.$$
(c) Matrix operations are used. The central and recurrent feature of the orthonormalization and expansion scheme is the construction of a vector of the form
$$g^* = g - (g, \phi_1)\phi_1 - \cdots - (g, \phi_k)\phi_k, \qquad k = 1, \ldots, n$$
where g is one of the vectors $f_2, \ldots, f_n, y_1, \ldots, y_p$.
We have
Hence, if we designate the N X k matrix $(\phi_1, \phi_2, \ldots, \phi_k)$ by $\Phi_k$, we have $g^* = g(I - W\Phi_k\Phi_k^T)$. In addition we need $(g^*, g^*) = (g^{*T})(W)(g^*)$. (d) A "straightening out" of the orthonormal vectors is used. Let us suppose we have a system of k vectors $\phi_1, \ldots, \phi_{k-1}, \bar\phi_k$ such that $(\phi_i, \phi_j) = \delta_{ij}$
(i, j = 1, ..., k - 1); $(\bar\phi_k, \bar\phi_k) = 1$ but $(\bar\phi_k, \phi_j) = \epsilon_j$, j = 1, 2, ..., k - 1. Then
$$\phi_k = \bar\phi_k - \sum_{j=1}^{k-1} (\bar\phi_k, \phi_j)\phi_j = \bar\phi_k - \sum_{j=1}^{k-1} \epsilon_j \phi_j$$
is much closer to orthogonality. (e) General augmented inputs are possible. With an input of the vectors $f_i$ and $y_j$, the following quantities can be computed:
1. The orthonormal vectors $\phi_i$
2. The residual vectors $e_{jk}$
3. The sums of the squares of the residuals $E_{jk}$
4. The standard deviation $\sigma_{jk} = ((e_{jk}, e_{jk})/(N - k))^{1/2}$
5. The normalized Gram determinant $G^* = (D_1 D_2 \cdots D_n)^2 / (\|f_1\|^2 \cdots \|f_n\|^2)$
6. The Fourier coefficients $(y_j, \phi_i)$.
To get the matrix $A = (a_{ij})$ and the coefficients $d_j$ and related quantities such as the covariance matrix $C = A^T A$ and the variance-covariance matrix $V = \sigma^2 C$, we augment each vector $f_i$ by the n-component vector whose jth component is $\delta_{ij}$, and each vector $y_j$ by the n-component zero vector. If we then apply the Gram-Schmidt algorithm to these augmented vectors with the provision that inner products are taken as before only over the N original components, then the vectors $\delta_{ij}$ are replaced by the columns of A and the zero vectors by the coefficients $-d_j$. This process can be generalized as follows: The input vectors $f_i$ form an n X N matrix $\Psi$ which we augment by an n X p matrix $T = (t_{ij})$. Then $\Psi A = \Phi$ where $\Phi W \Phi^T = I$, and at the same time $TA = E$. If $T = L\Psi$ for some p X N matrix L, then $E = L\Phi$. If p = N = n, $E = T\Psi^{-1}\Phi$, $\Psi^{-1} = T^{-1}EW\Phi^T$. In particular if T = I, W = I, then E = A and $\Psi^{-1} = A\Phi^T$. This generalized augmentation is useful in many cases where one approximates data by, say, a polynomial. What one is really interested in is some further quantity obtained from the polynomial by a linear process, perhaps an integral or derivative at specified points. If L designates a linear operator and f a function, this reflects the working rule: Approximation to L(f) = L(approximation to f). Now L(approximation to f) may be computed by augmenting each vector $f_i$ by the vector $L(f_i)$ and $y_j$ by a zero vector. Then this zero vector will be replaced by
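The augmentation device is easy to try on a small example. The sketch below is written under assumptions of our own (diagonal weights, a modified Gram-Schmidt loop, vectors stored as rows) and is not the SHARE routine; the function name and the test data are hypothetical. Each $f_i$ is extended by the ith unit vector and y by zeros, inner products are taken over the first N components only, and the extra components are carried along passively.

```python
import numpy as np

def augmented_gram_schmidt(F, y, w):
    """F: n x N array whose rows are the vectors f_i; y: length-N vector; w: N weights.

    Returns (Phi, A, residual, d) where Phi = A @ F has rows orthonormal in the
    weighted inner product and sum_j d_j f_j is the least square approximation to y.
    """
    n, N = F.shape

    def inner(u, v):
        # inner product over the N original components only
        return np.sum(w * u[:N] * v[:N])

    rows = np.hstack([F, np.eye(n)])            # f_i augmented by the i-th unit vector
    phis = []
    for v in rows:
        v = v.copy()
        for phi in phis:
            v -= inner(v, phi) * phi            # remove components along earlier phi
        phis.append(v / np.sqrt(inner(v, v)))   # normalize
    r = np.concatenate([y, np.zeros(n)])        # y augmented by the zero vector
    for phi in phis:
        r -= inner(r, phi) * phi
    Phi = np.array(phis)
    return Phi[:, :N], Phi[:, N:], r[:N], -r[N:]

x = np.linspace(-1.0, 1.0, 12)
F = np.vstack([np.ones_like(x), x, x ** 2])
y = np.exp(x)
w = 1.0 + 0.5 * x ** 2
Phi, A, residual, d = augmented_gram_schmidt(F, y, w)
print("coefficients d:", d)
```

In this arrangement the unit vectors end up holding the coefficients $a_{ij}$ that express each $\phi_i$ in terms of the $f_j$, and the zero vector attached to y ends up holding $-d_j$, in agreement with the description above.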
Further work at the National Bureau of Standards has consisted of the writing of routines to generate particular solutions of the harmonic and biharmonic equations, orthonormalizing them by Gram-Schmidt using a double-precision floating routine, and also routines to orthonormalize a set of complex vectors for the purpose of computing the exterior mapping function and transfinite diameter of planar domains. Finally, Rutishauser [16.4] has written an orthonormalization routine using ALGOL 58. In this routine he writes $\Psi = \Phi R_N$ and computes $\Phi$ and $R_N$, given $\Psi$. He then computes the Fourier coefficients $b_i$ for a given vector y. The coefficients $d_i$ in the expansion of y in terms of the original vectors $f_i$ are then computed by solving the system $R_N D = B$ by back substitution.
17. Numerical Experiments in the Solution of Boundary Value Problems Using the Method of Orthonormalized Particular Solutions
The present experiments were carried out on an IBM 704. For convenience, we shall elaborate the meaning of several terms which occur in the tabulation of the data.
Equation: The partial differential equation solved. That is, either $\Delta u = 0$ in 2 dimensions, $\Delta u = 0$ in 3 dimensions, or $\Delta\Delta u = 0$ in 2 dimensions.
Domain: The portion of space over which a solution is sought.
Points: $(x_i, y_i)$ or $(x_i, y_i, z_i)$, i = 1, ..., N. Selected points on the boundary of the domain. Sometimes the points and corresponding weights were chosen according to a high order quadrature rule on the boundary. At other times points were chosen to be equidistant in one or more coordinates along the boundary. In some instances the points were chosen so as to exhibit the characteristic features of the domain.
Weights: $w_i$: Weights were chosen either corresponding to a quadrature rule, were taken to be equal, or, in the 2 dimensional case, were computed by the formula
$$w_i = \tfrac{1}{2}\Bigl[\bigl((x_i - x_{i-1})^2 + (y_i - y_{i-1})^2\bigr)^{1/2} + \bigl((x_{i+1} - x_i)^2 + (y_{i+1} - y_i)^2\bigr)^{1/2}\Bigr],$$
that is, they are equal to the average of adjacent chords.
Special Functions: $u_1, \ldots, u_n$ are the particular solutions of the differential equation, selected from among a complete set.
Boundary Values: A function f defined on the boundary points $P_i$. We designate by max f and min f the maximum and minimum values of f on $P_i$. This information is included to indicate the range of the boundary values.
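The chord-average weights defined above are simple to compute. The following sketch assumes a single closed contour whose points are listed in order, so that the neighbors of the first and last points wrap around; it does not attempt to reproduce the special treatment of corner points (counted twice) used in some of the cases below. The function name is hypothetical.

```python
import numpy as np

def chord_weights(x, y):
    """Weights w_i equal to the average of the two chords adjacent to point i
    on a closed contour given by the ordered points (x_i, y_i)."""
    dx = np.diff(np.append(x, x[0]))
    dy = np.diff(np.append(y, y[0]))
    chord = np.hypot(dx, dy)                    # chord[i] = |P_{i+1} - P_i|
    return 0.5 * (chord + np.roll(chord, 1))    # np.roll gives |P_i - P_{i-1}|

# Example: 80 points on the ellipse x^2 + 4y^2 = 1, equally spaced in the parameter.
t = 2.0 * np.pi * np.arange(80) / 80
w = chord_weights(np.cos(t), 0.5 * np.sin(t))
print(w.sum())   # perimeter of the inscribed polygon
```

The weights sum to the perimeter of the inscribed polygon, so a weighted sum $\sum_i w_i g(P_i)$ plays the part of a boundary integral of g.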
Root Mean Square Boundary Value: $V_R$: This is defined by
and is the weighted average of the given boundary values. This is not the usual definition of the root mean square. It was used for ease of computation. A similar definition occurs below with $E_R$, and the relative error $E_R/V_R$ is the same as it would be with the conventional definition.
The Boundary Value Problem: Laplace Equation: This can be written in the following way. Let $a(P_i)$ and $b(P_i)$ be two functions defined on $P_i$. We wish to determine that linear combination $u^* = \sum_{i=1}^{n} a_i u_i$ for which $a(P_i)u^*(P_i) + b(P_i)\,\partial u^*(P_i)/\partial n$ fits f at $P_i$ in the sense of least squares. The Dirichlet problem is the case a = 1, b = 0. The Neumann problem is the case a = 0, b = 1. For general a, b we have the mixed or Robin's problem. Biharmonic Equation: No mixed boundary conditions were considered. We wish to determine that linear combination $u^* = \sum_{i=1}^{n} a_i u_i$ for which $u^*(P_i)$ matches an $f_1$ and $\partial u^*(P_i)/\partial n$ matches an $f_2$ at the points $P_i$, simultaneously, in the sense of least squares.
What is Minimized: Laplace Equation:
Biharmonic Equation:
Theoretical Value at Isolated Interior Points: u(P), available only when exact solution is known in simple closed form. In the present work, this occurs only when f has been computed from a simple, closed form solution given in advance.
Computed Value at Isolated Interior Point: $u^*(P)$, where P is the given point.
Maximum Error on Boundary: $E_{\max}$: Laplace Equation:
Biharmonic Equation:
Relative Root Mean Square Error: $E_R/V_R$: The average error relative to the average boundary value.
Exact Error at Isolated Interior Point: E(P): The absolute value of the difference between the theoretical value u(P) and the computed value $u^*(P)$ at the given point P.
Case I
Equation:               Δu = 0.
Domain:                 Cube, -1 ≤ x, y, z ≤ 1.
Points:                 N = 386, spaced uniformly at an interval of 0.25 in each coordinate.
Weights:                Equal, 1.
Special functions:      n = 25. a + b + c = 0(1)4. See Miles and Williams [11.11]. $\nabla^{2j}$ is the jth iterate of the Laplacian.
Boundary values:        (a) f = [(1.5 - x)² + (1 - y)² + (1.25 - z)²]^(-1/2). These are unweighted.
                        (b) f = 1 for z > 0, f = 0 for z ≤ 0.
                        (c) f = 1 for z = 1, f = 0 for z < 1.
Boundary value problem: Dirichlet problem.

Case I, Results
              Case (a)   Case (b)   Case (c)
Max f         1.789      1.0        1.0
Min f         0.2556     0.0        0.0
V_R           0.502      0.6772     0.4581
E_max         0.22       0.39       0.376
E_R           0.025      0.1804     0.136
E_R/V_R       0.05       0.266      0.297

Case I, Rapidity of Convergence, E_R/V_R
Number of special
functions taken (a)   Case (a)   Case (b)   Case (c)
 1                    0.43       0.736      0.889
 4                    0.24       0.332      0.636
 9                    0.136      0.324      0.439
16                    0.08       0.272      0.327
25                    0.05       0.266      0.297
(a) These linear combinations correspond to 0th, 1st, ..., 4th degree polynomials.

Case I(a), Error at Interior Points, 10 Special Functions
Point     0.9     0.6     0.3     0.0     -0.3    -0.6    -0.9
Error     0.178   0.019   0.024   0.000   0.001   0.012   0.057
Point 0.9 refers to (0.9, 0.9, 0.9), etc.
Case II
Equation:               Δu = 0.
Domain:                 Sphere: x² + y² + z² ≤ 1.
Points:                 N = 266, 24 equidistant points on each of 11 parallels of latitude plus 2 poles. Sphere is divided into 266 equal areas. Points are centers of these areas.
Weights:                Equal, 1.
Special functions:      n = 25. Same as Case I.
Boundary values:        (a) f = [(1.5 - x)² + (1 - y)² + (1.25 - z)²]^(-1/2)
                        (b) f = 0 for z ≥ 0, f = 1 for z < 0
                        (c) f = 0 for z ≥ 0, f = -z for z < 0
                        (d) f = 0 for z ≥ 0, f = z² for z < 0
                        (e) f = 0 for z ≥ 0, f = -z³ for z < 0
                        (f) f = 0 for z ≥ 0, f = z⁴ for z < 0
Boundary value problem: Dirichlet problem.

Case II, Results
            Case (a)   Case (b)   Case (c)   Case (d)   Case (e)   Case (f)
Max f       0.835      1.0        1.0        1.0        1.0        1.0
Min f       0.313      0.0        0.0        0.0        0.0        0.0
V_R         0.495      0.674      0.407      0.312      0.260      0.225
E_max       0.016      0.39       0.05       0.025      0.01       0.038
E_R         0.003      0.175      0.021      0.0086     0.0026     0.009
E_R/V_R     0.006      0.26       0.051      0.028      0.010      0.040

Case II, Rapidity of Convergence, E_R/V_R
Number of special
functions taken   Case (a)   Case (b)   Case (c)   Case (d)   Case (e)   Case (f)
 1                0.26       0.74       0.79       0.85       0.88       0.90
 4                0.093      0.37       0.36       0.50       0.59       0.66
 9                0.036      0.36       0.094      0.17       0.29       0.37
16                0.0145     0.27       0.094      0.098      0.084      0.16
25                0.006      0.26       0.051      0.028      0.010      0.040

Case II(a), Error at Interior Points
Point (a)      0.9    0.7    0.4    0.2    0.0    -0.2   -0.4   -0.7   -0.9
Error × 10³    1.67   0.51   0.01   0.03   0.03   0.02   0.04   0.48   1.63
(a) The point 0.9 refers to (0.9, 0, 0), etc.
Case III
Equation:               Δu = 0.
Domain:                 Elliptical Annulus, ¼ ≤ x² + 4y² ≤ 1.
Points:                 N = 80, 40 points on each bounding ellipse. x = -1(0.1)1 on outer ellipse, x = -0.5(0.05)0.5 on inner ellipse.
Weights:                Equal to average of adjacent chords.
Special functions:      n = 10: 1, Re z², Re z⁴, Re z⁶, log |z|, Re z⁻², Re z⁻⁴, ..., Re z⁻¹⁰.
Boundary values:        1 on outer ellipse, 0 on inner ellipse.
Boundary value problem: Dirichlet problem.

Case III, Results
Max f                     1.0
Min f                     0.0
V_R                       0.2456
E_max, outer boundary     0.0028
E_max, inner boundary     0.088
E_R                       0.00534
E_R/V_R                   0.02

Case III, Rapidity of Convergence
Special function
incorporated      E_R/V_R
1                 0.58
Re z²             0.56
Re z⁴             0.56
Re z⁶             0.56
log |z|           0.19
Re z⁻²            0.09
Re z⁻⁴            0.06
Re z⁻⁶            0.04
Re z⁻⁸            0.03
Re z⁻¹⁰           0.02
Case IV
Equation:               Δu = 0.
Domain:                 Triply-connected domain bounded by the circles A: x² + y² = 1, B: (x - ½)² + y² = 1/16, C: (x + ½)² + y² = 1/16.
Points:                 N = 80, 40 points on A, 20 on B, 20 on C. Distributed equally in angle, beginning on x-axis.
Weights:                Equal to average of adjacent chords.
Special functions:      n = 21: 1, Re z², Re z⁴, log |z - ½|, Re (z - ½)⁻ⁿ, log |z + ½|, Re (z + ½)⁻ⁿ, n = 1, 2, ..., 8.
Boundary values:        (a) f = 1 on A, f = 0 on B and C.
                        (b) f = 1 on C, f = 0 on A and B.
Boundary value problem: Dirichlet problem.

Case IV, Results
              Case (a)   Case (b)
Max f         1.0        1.0
Min f         0.0        0.0
V_R           0.7922     0.3955
E_max on A    0.01       0.008
E_max on B    0.0026     0.0008
E_max on C    0.0025     0.0016
E_R           0.00486    0.00266
E_R/V_R       0.0061     0.0067

Case IV, Rapidity of Convergence, E_R/V_R
Number of functions incorporated
(in the order listed above)      Case (a)   Case (b)
 1                               0.58       0.91
 2                               0.57       0.86
 4                               0.34       0.84
12                               0.34       0.835
13                               0.075      0.083
14                               0.0063     0.017
15                               0.0063     0.0067
21                               0.0061     0.0067
Case V
Equation:               Δu = 0.
Domain:                 (a) Square with square corners punched out. See Bergman [18.1].
                        (b) Domain (a) with corners rounded.
Points:                 N = 224. (a) Steps of 0.1 on each line segment, except for single step from 2.7 to 2.722. (b) 4 points equally spaced on outer quadrant; 3 points equally spaced on interior quadrant.
Weights:                Equal to average of adjacent chords.
Special functions:      n = 8: 1, Re z⁴, Re z⁸, ..., Re z²⁸.
Boundary values:        f = x² + y² (Torsion Problem).
Boundary value problem: Dirichlet problem.

Case V, Results
            Domain (a)   Domain (b)
Max f       11.41        10.88
Min f       7.41         7.41
V_R         2.81         2.73
E_max       0.97         0.48
E_R         0.069        0.0434
E_R/V_R     0.024        0.016

Case V, Rapidity of Convergence, E_R/V_R
Number of functions
incorporated          Domain (a)   Domain (b)
1                     0.134        0.126
2                     0.062        0.054
3                     0.055        0.044
4                     0.042        0.033
5                     0.032        0.025
6                     0.028        0.021
7                     0.026        0.018
8                     0.024        0.016
Case VI
Equation:               Δu = 0.
Domain:                 Ellipse x² + 4y² ≤ 1.
Points:                 N = 80, x = -1(0.05)1.
Weights:                Equal to average of adjacent chords.
Special functions:      n = 31: 1, Re zⁿ, Im zⁿ, n = 1, 2, ..., 15.
Boundary values:        (a) f = 1, y > 0; f = 0, y ≤ 0.
                        (b) f = 1, x > 0; f = 0, x ≤ 0.
                        (c) f = x, x > 0; f = 0, x ≤ 0.
                        (d) f = x², x > 0; f = 0, x ≤ 0.
                        (e) f = x³, x > 0; f = 0, x ≤ 0.
                        (f) f = x⁴, x > 0; f = 0, x ≤ 0.
                        (g) f = x⁵, x > 0; f = 0, x ≤ 0.
                        (h) f = exp (x² - xy + 2y²).
                        (i) f = log (x + 4 + (y - 1)²).
                        (j) f = log ((x + 1.5)² + y²).
Boundary value problem: Dirichlet problem.

Case VI, Results
Case   Max f   Min f    V_R     E_max        E_R          E_R/V_R
(a)    1.0     0.0      0.174   0.5          0.032        0.18
(b)    1.0     0.0      0.174   0.5          0.023        0.133
(c)    1.0     0.0      0.112   0.0197       9.7 × 10⁻⁴   0.0086
(d)    1.0     0.0      0.093   9.1 × 10⁻⁴   7.3 × 10⁻⁵   7.8 × 10⁻⁴
(e)    1.0     0.0      0.083   1.8 × 10⁻⁴   1.5 × 10⁻⁵   1.8 × 10⁻⁴
(f)    1.0     0.0      0.077   2.3 × 10⁻⁵   2.5 × 10⁻⁶   3.2 × 10⁻⁵
(g)    1.0     0.0      0.072   8.2 × 10⁻⁶   1.1 × 10⁻⁶   1.5 × 10⁻⁵
(h)    3.0     1.49     0.52    5.5 × 10⁻⁸   6.1 × 10⁻⁹   1.2 × 10⁻⁸
(i)    1.88    1.31     0.40    4.3 × 10⁻⁸   4.4 × 10⁻⁹   1.1 × 10⁻⁸
(j)    1.83    -1.39    0.29    1.3 × 10⁻⁵   1.1 × 10⁻⁶   3.8 × 10⁻⁶
Exact error at origin: Case (j), 4.3 × 10^…

Case VI, Rapidity of Convergence, E_R/V_R
Number of
functions*   Case (b)   Case (c)   Case (d)   Case (e)   Case (f)   Case (g)   Case (j)
 1           0.707      0.7833     0.8360     0.8064     0.8860     0.8997     0.8177
 2           0.337      0.3370     0.4735     0.5609     0.6199     0.6626     0.2016
 3           0.337      0.0793     0.1591     0.2639     0.3440     0.4063     0.0713
 4           0.249      0.0793     0.0251     0.0759     0.1429     0.2026     0.0295
 5           0.249      0.0390     0.0251     0.0091     0.0368     0.0766     0.0133
 6           0.207      0.0390     0.0092     0.0091     0.0036     0.0180     0.0063
 7           0.207      0.0245     0.0092     0.0027     0.0036     0.0015     0.0031
 8           0.182      0.0245     0.0014     0.0027     0.0009     0.0015     0.0015
16           0.133      0.0086     0.00078    0.00018    0.000032   0.000015   0.0000038
* Because of symmetry, only real parts of zⁿ contributed to solution. Function i is thus Re (z^(i-1)).
Case VII, Results
*o Q
Maxf Minf
0.522 0.0
V R
1.0 0.0 0.136
Em,,
0.48
Ell ER/VR
0.025 0.18
0.048 0.0020 0.042
0.048
0.272 0.0 0.022 0.0097 0.00058 0.026
0.142 0.0 0.0106 0.0034 0.00021 0.020
O.Oi4 0.0 0.00518 0.0013 0.00009 0.0168
Exact error at origin: Case (j), 2 X 10-8; Case (k), 2 X 10-7.
0.039 0.0 0.0026 0.0006 O.oooO1 0.0151
0.575 0.012 0.0565 0.024 0.0018 0.032
2.71 1.02 0.309 0.040
0.0030 0.0095
1.82 1.31 0.329 0.001 8.4 X 2.6 X
lo-'
1.41 -0.238 0.181 3 x 10-7 9.6 X 1O-O 5.3 X lo-'
1.99 0.02 0.24 2 x 106 1.6 X 10-7 6.8 X l o '
Case VII
Equation
AU = 0.
Domain :
"Bean" shaped design. See Davis a i d ltabiriowitz [1.2].
N = 84. Distributed on boundary with more points where curvature is greater. Weights: Equal to average of adjacent chords. Special functioiis: n = 31: 1, Re zn,Im zn, n = 1, 2, . . . , 15. Boundary values: (a) f = 1, z > 0 ; s = 0, z 5 0 (b) f = 2, z > 0 ; f = 0, z 5 0 (c) f = 22, z > 0 ; f = 0,z 2 0 (d) f = 2 " , > ~ O ; f = 0 , 5~0 (e) f = z4,z > 0 ; f = 0, z 2 0 (f) f = z5, 2 > 0 ; j = 0, z 2 0 (g) f = z? y2 (Torsion problem) (h) f = esp (2' - zy 2y2)
Points :
+
6) f
=
f
=
(j)
(k) f
+
+ y + (9 - I)') log ((x + 1.5)2+ V ) log (z
+
+ (y - l)?)
cos y log (z2 13oundary valuc problem : Dirichlet problem. = e"
Case Vlll
Equation: Domain : Points:
Au = 0. Lower Unit Semicirclc. N = 78. On z-axis: - 1(0.05)1. On circle: every 5'. Corner points are counted twice. Weights : Equal to average of adjacent chords. Special functions: n = 21 : 1, Re zn, Im zn, n = 1, 2, . . , , 10. Boundary valuo: (a) fi = 0,f2 = 1 (Flow t,hrough semicircular channel)
ADVANCES
(I))
IN Jl
( c ) $,
101
ORTHONORMALIZING COMPUTATION
=
+ (I - 2)' +
log (.rz
=
log
((J
$),f2
- _i_
=
!I)?),
1
+ .c'
=0
Boundary value problem: Mixed. a ( P ) = 1, b ( P ) = 0 on semicircle a(Z>) = 0, b(P) = 1 on s-axis Case VIII, Results
Case
Case (b)
(a)
Max fi on semicircle Max faon x-axis Min fi on semicircle Min ji on z-axis VR Emaxon semicircle Em,, on z-axis ER ER/ V R
0.0 1 .0 0.0 1.o
0.1601 0.0435 0.0234 0.00205 0.0128
Exact error at origin
Case -
0.693 -0.5 0.317 -1.0 0.1748 0.000% 0.00048 5.93 x 10-6 3.39 x 10-4 4.0 x 10-5
((!I 022 0.0 0.0015 0.0 0.0315 3.6 X 8.8 X 4.79 x 10-7 1.52 x 10-6 2.0 x 10-0
Case VIII, Rapidity of Convergence, E R / V R Number of special functions
Case (a)
Case (b)
1 3
1 .o 0.361 0.087 0.04:! O.WB 0.018 0.013 0.013
0.743 0.21 1 0.150 0.027 0.005 0.001 0.0003 0.000:3
4
8 12 10 20 21
Case (r)
0.467 0.113 0.041 0.006 0.00 1 0.0002 0.00004 0.000015
Case IX
Equation:
AU
Domain :
Compressor Blade. See Poritzky and Danforth [18.5].
=
0.
N = 75. Ji:ssmt)i,zlly rqu:tlly spncrd with rrq'cct to x. Equal to nvcragc of adjsceiit chords.
Poiiiis: Weights :
A : n = 15: 1 , Re z", Im z", n = 1, 2, . . . , 7. B : n = 10: log (z2 (y - 0.5)*), 1, Re z", Imz", n = 1 , 2 , . . . , 7 . (a) $(z? y2)(TorsionProblem) Boundary values : (b) f = 1, z > O ; f = 0, z 5 0 (c) f = 2, 2 > 0 ; f = 0 , x 5 0 Boundary value problem : Dirichlet problem.
Special functions:
+
+
Case IX, Results
Max j hlin f
vx
Em,,, n = 15 E,.,, n = 16 ER, n = 15 ER, n = 16 ERIVR,n = 15 ERIVR,n = IG
Case (a)
Case (b)
Case (4
1.446 0.02 0.1939 0.00516 0.00522 0.0004 0.0004 0.0021 0.0021
1 .o 0.0 0.2097 0.44 0.42 0.0401 0.0396 0.191 0.189
1.65 0.0 0.2059 0.OG7 0.081 0.0053 0.0045 0.026 0.022
Case IX(a), Rapidity of Convergence
ER
Emas
Special functions used 1% log, 1 log, 1, Re z, Im z, Re 9, Im z2 log, 1, . , Re 24, Im z4 All
. .
n = 15
n = 16
0.960 0.030 0.010 0.005
1.076 0.453 0.021 0.010 0.005
Case X
Equation:
AAu = 0.
Domain : Points :
Ellipse, x2
+ 4y2 5 1.
N
=
=
80, x
- 1(0.05)1.
n = 15
0.1367 0.0031 0.0012
0.0004
n
= 16
0.1717 0.0623 0.0023 0.0012 0.0004
wj’)= wj”. Equal to average of adjacent
Weights :
chords.
+
Special functions:
n = 38: 1, 5 , y, ‘x y’, lie z”, Im zn,Re 2 zn, Im z P,n = 2, . . . , 9, Re zlO, Im zIo.
Boundary values:
(a) fl = Itc (b) fl
=
(2 c z ) , f. -
’
0,
f2
a
Re an
(2 e z )
= 1
f.2 = exp (.I? - xy (c) fl = 0, (d) fl = exp ( x y - zy 2yz),f2 = 0 f? = 0 (e) fl = 2, f2 = x (f) fl = 0,
+
+ 2y’)
dU
Bouudary value problem: u = fl, - = f?. an Case X, Reeults Case (1%)
Case (d)
Cnse
Case (b)
(C)
Case (e)
Cnse (f)
-
2.718
- 0.388 5.137 0.0 3.57 x 1.42 x 1.33 x 7.29 x 2.0
x
10” 10-7
10-8 10-0
0 .o 0.0 1.0 1 .o 1.7 x 10-4 7 3 x 10-6 8. x 10-5 3. x 10-6
0.0 0.0
3.01 1.49 0.7 x 3.03 x 3. x 1. x
10-4 10-4 10-4 10-4
3.01 1.49 0.0 0.0 3.18 x io--a
2.35 x 10-4 1. x lo-: 1. x 10-4
1 .o
0.0 0.0
- 1.0 0.0 0.0 1.2 x 9.0 x 5.7 x 4.2 X
1 .O
- 1.0 10-8 10-5 10-4 10-i
4.5 1.7 2.1 1.0
x x
x x
10-4 10-4
10-4 10-4
10-s
Case X, Rapidity of Convergence
Number of       E_max, f1              E_max, f2              E_R, f1                E_R, f2           Norm of total discrepancy
special
functions   Case (a)   Case (f)    Case (a)   Case (f)    Case (a)   Case (f)    Case (a)   Case (f)    Case (a)   Case (f)
 1          2.11       0.0         5.44       1.0         0.84       0.0         1.69       0.58        4.85       1.42
 2          0.85       0.48        3.61       0.56        0.40       0.27        1.37       0.38        3.56       1.13
 4          0.38       0.48        1.34       0.56        0.28       0.27        0.33       0.38        1.12       1.13
 7          0.79(-1)   0.16        0.23       0.30        0.52(-1)   0.65(-1)    0.58(-1)   0.17        0.19       0.42
11          0.12(-1)   0.61(-1)    0.36(-1)   0.22(-1)    0.72(-2)   0.29(-1)    0.12(-1)   0.15(-1)    0.32(-1)   0.77(-1)
15          0.17(-2)   0.57(-1)    0.49(-2)   0.28(-1)    0.93(-3)   0.25(-1)    0.20(-2)   0.20(-1)    0.52(-2)   0.75(-1)
19          0.17(-3)   0.106(-1)   0.61(-3)   0.34(-2)    0.96(-4)   0.44(-2)    0.31(-3)   0.22(-2)    0.76(-3)   0.118(-1)
23          0.16(-4)   0.108(-1)   0.65(-4)   0.23(-2)    0.90(-5)   0.46(-2)    0.40(-4)   0.14(-2)    0.95(-4)   0.115(-1)
27          0.13(-5)   0.22(-2)    0.74(-5)   0.75(-3)    0.77(-6)   0.90(-3)    0.44(-5)   0.46(-3)    0.10(-4)   0.24(-2)
31          0.10(-6)   0.23(-2)    0.76(-6)   0.40(-3)    0.75(-7)   0.94(-3)    0.43(-6)   0.19(-3)    0.99(-6)   0.23(-2)
35          0.42(-7)   0.45(-3)    0.15(-6)   0.17(-3)    0.16(-7)   0.21(-3)    0.73(-7)   0.10(-3)    0.16(-6)   0.58(-3)

Case XI
Equation: $\Delta\Delta u = 0$.
Domain: Square $-\tfrac{1}{4} \le x, y \le \tfrac{1}{4}$.
Case (A)
  Points: N = 100. Gaussian abscissas of order 25 on each side.
  Weights: $w_j^{(1)} = w_j^{(2)}$. Gaussian weights.
  Special functions: n = 38, as in Case X.
Case (B)
  Points: N = 104. Equally spaced on sides at intervals of .02. Corner points taken twice.
  Weights: $w_j^{(1)} = w_j^{(2)}$. Equal to average of adjacent chords.
  Special functions: n = 18: 1, x, y, $x^2 + y^2$, Re $z^n$, Im $z^n$, Re $\bar{z} z^n$, Im $\bar{z} z^n$, n = 2, 3, 4, Re $z^5$, Im $z^5$.
Case (C)
  Points: N = 21. Equally spaced on sides at intervals of 0.1. Corner points taken twice.
  Weights: As in Case (B).
  Special functions: n = 18, as in Case (B).
Boundary values:
  (a) $f_1 = \mathrm{Re}\,(\bar{z} e^z)$, $f_2 = \frac{\partial}{\partial n}\,\mathrm{Re}\,(\bar{z} e^z)$
  (b) $f_1 = 0$, $f_2 = 1$
  (c) $f_1 = 0$, $f_2 = x$
  (d) $f_1 = x$, $f_2 = 0$
Boundary value problem: $u = f_1$, $\partial u/\partial n = f_2$.
Case R, a
A, a 0.30 -0.194 1.63 -0.614 4.5( -6) 2.0( -5) 3.4( -6) 8.6( -6)
(W)
a11 an = J2.
ZL = j 1 ,
Cnsv
a
Case c, a
0.3!) -0.195 1.63 -0.614 4.5( - 6 ) 2.0( -5) 3 . q -ti) 7.4( -6)
0.39 -0.103 1.63 -0.614 5.3(-6) 1.0(-5) 4.0( -6) 8.1(-6)
4.4(--6)
4.0( -6)
A, b
Case
Case B, b
0.0 0.0 1 .o 1.0 8.1(-3) 1.0( -3) 4.7(-3) 5.8( -4)
0.0 0.0 1.o 1.0 8.2( -3) 1.0( -3) 4.0( -3) 4.6( -4)
A refers to Cusc A after only 18 special functions were used. Case XI, Results Case c, b
0.0 0.0 1.0 1.o 8.0(-3) 1.2(-3) 5.1(-3) 6.1(-4)
Case
A, c 0.0 0.0 0.25 -0.25 1.0(- 2 ) 2.0(-2) 5.3(-3) 7.4(-3)
Case B, c 0.0 0.0 0.25 -0.25
1.0(-2) 2.0(-2) 4.5(-3) 6.1(-3)
Cnse
c, c 0.0 0.0 0.25 -0.25 8.4( -3) 1.2(-2) 5.0(-3) 8.3(-3)
Case A, d 0.25 -0.25 0.0 0.0 3.3( -2) 1.8(-2)
1.8(-?) 7.7(-3)
Case B, d 0.25 -0.25 0.0 0.0 3.3( -?) 1.8(-2) 1.9(-2) 6.7(-3)
Case
c, d 0.25 -0.25 0 .0 0.0 3.6( -2) 8.0(-3) 1.9(-2) 5.2(-3)
0.39 -0.194 1.63 -0.014 1.2 x 10-8 5.8 X 3.4 x 10-9 1.7 X 5.5
x
0.0 0.0 0.1 0.1 2.7 x 2.5 x 1.2 x 1.2 x
0.0 0.0 0.25 -0.25 6.8 X 1.2 x 3.0 x 3.5 x
10-2
10-3 10-2 10-3
-
10-10
0.25
--0.25 lOP 10-2 10-3 10-3
0.0 0.0 1.17 X lo-* 1.24 X lo-* 6.5 X 10-3 4.2 x 10-3
-
Case XI1
IGquation:
AAu = 0.
Domain: Poirit,s:
Lower Unit Semicircle.
Weights: Special fuiictions: Boundary values :
A : w;’)
N
=
78, as in Case VIII.
w;’); L3: n = 38, as in Case X. =
=
0.l~j”.
(a) f1 = 0, f2 = 1 (b) f i = 0, f2 = 2 (c) f i = 2,f2 = 0 (d) Uniform loading. f1 = (1 - x2)2 on x-axis fi = 0 on semicircle, f2 = 0 (e) Concentric loading. f l = (t x2) log 3 $(I 2% 1x1 I fi = (t 22) log3 Z(1 - 22), 121 2 fi = 0 on semicircle, fi = 0 (f) Circular loading.
+
+
I4 I 3,
-
+ + +
+ +
= 8(1 z’) (3 4x2) log 2, 121 2 fi = 0 on semicircle, f2 = 0.
fl
3 3
B,
Case XI, Rapidity of Convergence
-
Number of special functions
0
Case (b)
Case
Case
(C)
(4
1.4(-3) 5.8(-3) 1.7( -2) 6.7( -3) 1.2( -3)
1.2(-1) 3.3(-2) 7.4(-3) 4.4( -3) 3.5( -3)
v
4 13 19 29 35
8.2(-2) 8.1(-2) 7.2(-2) 2.8(-2) 2.T( -2)
5.8(-2) 1.8(-2) 1.0(-2) l.l(-2) 6.8(-3)
2.3(-1) 3.7(-2) 3.3(-2) 2.7(-2) 1.2( -2)
1.4(-3) l.O(-2) 3.3(-2) 1.6(-2) 2.5( - 3 )
2.5(-1) 7.7(-2) 2.0(-2) 1.2(-2) 1.2( - 2 )
7 . 7 ( - 2 ) 4.8(-2) 2.5(-3) 4.7(-2) 1.8(-2) 4.0( -2) 7.4(-3) 1.2(-2) 1.2( -2) 1.2( -2)
5.0( -2) 9.7(-3) 5.3( -3) 6.0(-3) 3.0( -3)
2.0(-1) 1.9(-2) 1.8( -2) 1.8(-2) 6.5( -3)
5.4(-2) 1.4(-2) 7.7( -3) 3.7( -3) 4.2( -3)
(g) Linear loading. fl
= -(z6
fi
=
- 2s3
+ z), on z-axis
0 on semicircle, fi
=.
0
aU
Boundary value problem: u = fi, - = fi. an Cusc
hiax f i
0.0 0.0 1.0 1.o 5.4(-2) 1.6(-2) 1.5(-2) 4.9(-3)
Min f l Max fj Min j 2 E m a x , /I
Emax, / r
ER,
/i
ER, I?
XII, RosiiltH
0.0 0.0 0.0 0.0 1 .0 1.o 1.0 -1.0 4.6(-2) 5.3(-2) 3.7(-2) 3.4(-2) 1.2(-2) 1.4(-2) 1.7(-2) 1.1(-2)
0.0 0.0 1.0 - 1.0 4.5(-2) 4.5(-2) 1.2(-2) 1.5(-2)
I.0
1 .0 -1.0 0.0 0.0 0.0 0.0 :J.7(-2) 2.4(-2) t.l(-2) 2.4(-2) '3.2(-3) 8.6(-3) 1.4(-3) 8.1(-3)
- 1.0
Case XII, Results Case A, d
Cme d
1 .0 0.0 0.0 0.0 7.3(-4) I .8( -3) 3.7( -4) 4.7(-4)
I .0 0.0 0.0 0.0 1.8(-4) 3.1(-3)
Caw A, c
CSK!
Case A, g
13, 0
Case B, g
~~
9.q-5)
0.2 0.0 0.0 0.0 2.q-3) 2 . q -3) 1.2( -3)
7.6(-1)
5.5(-4)
0.2 0.0 0.0 0.0 1.7( -3) 6.1 (-3) 8.0( -4) I .8(-3)
1.28 0.0 0.0 0.0 1.8( -3) 1.7(-3) 8.6( -4) 6 . q -4)
0.29 1.28 02!) -0.20 0.0 -0.29 0.0 0.0 0.0 0.0 0.0 0.0 1.5( -3) 1.7(-3) 1.3(-3) 2.7(-3) 1.4( -3) 3 4 -3) 6.9( -4) 8.2( -4) 5.7( -4) 1. I (-3) 6.1(-4) 1 . q-3)
Case XIII
Equation: $\Delta u = 0$.
Domain: Isosceles right triangle, vertices at (0.25, 0.25), (0.25, -0.25), (-0.25, -0.25).
Points: N = 75, 25 point Gaussian rule on each side.
Weights: Gaussian.
Special functions: n = 31, as in Case VI.
Boundary value problem: Mixed. $a(P)\,u + b(P)\,\partial u/\partial n = f$, $a(P) = 1$, $b(P) = 2$.
Case XIII, Results
Maxf Minj
0.56 0.0 V R 0.063 E,,, 4.3 x 10-7 ER 8.7 x 1 0 - 9 ERIVR 1.4 X lo-'
8.15 - 18.77 1.051 3.6 X l o - ' !).7 x 10-6 9.2 x 10-6
4.71; 7.67 - 18.77 - 18.77 0.665 1.515 5.1 x 1 ~ 5 5.1 X lOV3 1.5 x 2.4 x 10-4 2.2 x 10-8 1.G X 10-4
5.6'2
- 14.01
0.916 1.8 X 10-1 6.0 X lou6 6.5 X
Case XIV
Equation: $\Delta\Delta u = 0$.
Domain: Isosceles right triangle, vertices at (0.25, 0.25), (0.25, -0.25), (-0.25, -0.25).
Points: N = 75, 25 point Gaussian rule on each side.
Weights: Gaussian; $w_j^{(1)} = w_j^{(2)}$.
Special functions: n = 38, as in Case X.
Boundary values:
  (a) $f_1 = \mathrm{Re}\,(\bar{z} e^z)$, $f_2 = \frac{\partial}{\partial n}\,\mathrm{Re}\,(\bar{z} e^z)$
  (b) $f_1 = 0$, $f_2 = 1$
  (c) $f_1 = 0$, $f_2 = x$
  (d) $f_1 = x$, $f_2 = 0$
Boundary value problem: $u = f_1$, $\partial u/\partial n = f_2$.
Case XIV, Results
Case
Case
(4
(b)
Case (c)
0.0
0.39
0.0 1 .0 1.o 3.37 X lo-* 2.58 X lod2 1.08 X lowa 7.0 X
-0.14 1.63 -0.77 1.39 X 6.10 X 10-8 3.68 X lo-# 2.13 X 10-6
0.0 0.0 0.25 -0.25 8.77 x 10-3 7.75 X 2.96 X 1.26 X lo-'
Case
(4 0.25 -0.25 0.0 0.0 1.3 X 5.55 x 5.93 x 1.20 x
10-2
1010-3 10-3
Case XV
Equation: $\Delta u = 0$.
Domain: H-shape. Vertices at (0.7, 0.8), (0.7, -0.8), (0.3, -0.8), (0.3, 0), (-0.3, 0), (-0.3, -0.8), (-0.7, -0.8), (-0.7, 0.8), (-0.3, 0.8), (-0.3, 0.4), (0.3, 0.4), and (0.3, 0.8).
Points: N = 96, equidistributed in steps of 0.1 with overlap at every corner.
Weights: Average of adjacent chords.
Special functions: n = 31, as in Case VI.
Boundary values:
  (a) $f = \tfrac{1}{2}(x^2 + y^2)$ (Torsion problem)
  (b) $f = \log (x^2 + (y + 0.4)^2)$
  (c) $f = \log (x^2 + (y + 0.8)^2)$
  (d) $f = \log (x^2 + (y - 0.8)^2)$
  (e) f = 1, x > 0; f = 0, x ≤ 0
  (f) f = x, x > 0; f = 0, x ≤ 0
  (g) $f = x^2$, x > 0; f = 0, x ≤ 0
  (h) $f = x^3$, x > 0; f = 0, x ≤ 0
  (i) $f = x^4$, x > 0; f = 0, x ≤ 0
  (j) $f = x^5$, x > 0; f = 0, x ≤ 0
  (k) $f = \mathrm{Re}\, e^z = e^x \cos y$
Boundary value problem: Dirichlet problem.

Case XV, Results
Case       Max f    Min f    V_R      E_max      E_R         E_R/V_R
(a)        0.565    0.0      0.095    0.138      0.019       0.2
(b)        0.66     -2.41    0.324    0.786      0.092       0.284
(c)        1.12     -2.41    0.296    0.113      0.010       0.035
(d)        1.12     -2.41    0.309    0.117      0.011       0.036
(e)        1.0      0.0      0.207    0.47       0.05        0.242
(f)        0.7      0.0      0.110    0.16       0.0125      0.114
(g)        0.49     0.0      0.071    0.11       0.012       0.169
(h)        0.343    0.0      0.0485   0.10       0.0103      0.212
(i)        0.240    0.0      0.0336   0.079      0.0057      0.229
(j)        0.168    0.0      0.0227   0.057      0.0053      0.233
(k)        2.01     0.346    0.326    0.97(-9)   0.71(-10)   0.22(-9)

Case XVI
Equation: $\Delta\Delta u = 0$.
Domain: "H" Shape (as in Case XV).
Points: N = 96, equidistributed in steps of 0.1 with overlap at every corner.
Weights: $w_j^{(1)} = w_j^{(2)}$ = average of adjacent chords.
Special functions: n = 38, as in Case X.
Boundary values: (a), (b), (c), (d), as in Case XIV.
Boundary value problem: $u = f_1$, $\partial u/\partial n = f_2$.
Case XVI, Results
Max fi
Min f l Max fz Min j 2 Enlax, /I
Em.,, ER,
/$
Ji
Ex, In Exact error at (020)
2.14 -0.35 3.54 -2.00 7.!)6 x 10-8 1.63 x 10-7 2.83 X 6.43 x 10-8
0.0 0.0 1.0 1 .o 4.28 X 9.18 X 1.81 X 2.58 X
1.40 x 10-9
-
10-I
10-1 lo-‘ lo-’
0.0 0.0 0.7 -0.7 9.95 X 10-1 2.54 X lo-’ 5.68 X 7.38 X
0.7 -0.7 0.0 0.0 5.06 X 6.20 x 2.88 X 2.48 X
-
10-1 10-1
10-I lo-’
-
Case XVII
Problem: To compute orthonormal polynomials $a_n x^n + \dots$ over a point set and to compute its transfinite diameter.
Point set: Two collinear line segments S: $[-1, -\tfrac{1}{2}]$, $[\tfrac{1}{2}, 1]$.
Inner product: $(f, g) = \int_S f(x)\,g(x)\,dx$.
Points: N = 50, a 25 point Gaussian rule in each segment.
Weights: Gaussian.
Special functions: n = 21: 1, x, $x^2$, ..., $x^{20}$.

Case XVII, Convergence to Transfinite Diameter
 n     a_n            (a_n/a_{n+2})^{1/2}    a_n^{-1/n}
 0     1.000000       0.46616                -
 1     1.309307       0.45979                0.76376
 2     4.601790       0.43951                0.46616
 3     6.193236       0.44084                0.54454
 4     23.82221       0.43594                0.45264
 5     31.86801       0.43633                0.50041
 6     125.353        0.43467                0.44700
 7     167.3886       0.43481                0.48120
 8     663.4655       0.43408                0.44389
 9     885.3688       0.43414                0.47048
10     3521.162       0.43375                0.44191
11     4697.453       0.43379                0.46366
12     18715.48       0.43356                0.44054
13     24963.57       0.43358                0.45893
14     99565.14       0.43343                0.43953
15     132791.5       0.43344                0.45546
16     529991.0       0.43334                0.43877
17     706811.0       0.43335                0.45282
18     2822310.0      0.43328                0.43816
19     3763744.0      -                      0.45073
20     15033710.0     -                      0.43767
 ∞     (theoretical value)  0.4330127
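The first entries of this table can be checked independently. The following is a minimal sketch, not the authors' program: it assumes NumPy, builds the 50-point Gaussian rule described above, and generates the monic orthogonal polynomials by the three-term (Stieltjes) recurrence. The names `gauss_on` and `leading_coefficients` are illustrative. The closed form $\sqrt{1 - a^2}/2$ for the transfinite diameter of $[-1, -a] \cup [a, 1]$ is a standard result and gives the theoretical value quoted in the table for $a = \tfrac{1}{2}$.

```python
import numpy as np


def gauss_on(a, b, m):
    """m-point Gauss-Legendre nodes and weights mapped to [a, b]."""
    t, w = np.polynomial.legendre.leggauss(m)
    return 0.5 * (b - a) * t + 0.5 * (b + a), 0.5 * (b - a) * w


def leading_coefficients(x, w, nmax):
    """Leading coefficients a_0..a_nmax of the orthonormal polynomials.

    The monic polynomials obey pi_{k+1} = (x - alpha_k) pi_k - beta_k pi_{k-1},
    and the orthonormal polynomial of degree k has a_k = 1 / ||pi_k||.
    """
    p_prev = np.zeros_like(x)
    p = np.ones_like(x)
    norm2_prev = 1.0
    a = []
    for k in range(nmax + 1):
        norm2 = np.sum(w * p * p)
        a.append(1.0 / np.sqrt(norm2))
        alpha = np.sum(w * x * p * p) / norm2
        beta = norm2 / norm2_prev if k > 0 else 0.0
        p, p_prev = (x - alpha) * p - beta * p_prev, p
        norm2_prev = norm2
    return np.array(a)


# Point set S = [-1, -1/2] U [1/2, 1], 25 Gaussian points per segment (N = 50).
x1, w1 = gauss_on(-1.0, -0.5, 25)
x2, w2 = gauss_on(0.5, 1.0, 25)
x, w = np.concatenate([x1, x2]), np.concatenate([w1, w2])

a = leading_coefficients(x, w, 20)
ratio = np.sqrt(a[:-2] / a[2:])            # (a_n / a_{n+2})^(1/2), n = 0..18
theoretical = 0.5 * np.sqrt(1.0 - 0.5 ** 2)

print(a[:3])                    # compare with the first three table entries
print(ratio[-1], theoretical)   # compare with the last ratio and the limit
```

Because the 25-point rule integrates the squared polynomials exactly on each segment, the discrete inner product here should agree with the integral inner product up to rounding.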
Case XVIII
Problem: Same as in XVII.
Point set: Three collinear line segments R: $[-1, -\tfrac{1}{2}]$, $[-\tfrac{1}{4}, \tfrac{1}{4}]$, $[\tfrac{1}{2}, 1]$.
Inner product: Same as in XVII.
Points: N = 75, a 25 point Gaussian rule in each segment.
Weights: Gaussian.
Special functions: n = 21: 1, x, ..., $x^{20}$.

Case XVIII, Convergence to Transfinite Diameter
 n     a_n          (a_n/a_{n+2})^{1/2}   (a_n/a_{n+3})^{1/3}   (a_0/a_n)^{1/n}
 0     0.8164967    0.5650                0.5213                -
 1     1.297772     0.4745                0.5144                0.6292
 2     2.557682     0.5179                0.4812                0.5650
 3     5.764411     0.5012                0.5077                0.5213
 4     9.536194     0.4653                0.4949                0.5409
 5     22.94778     0.5400                0.4850                0.5132
 6     44.04354     0.4679                0.5086                0.5145
 7     78.68162     0.4848                0.4775                0.5207
 8     201.1679     0.5276                0.4990                0.5024
 9     334.6985     0.4546                0.5018                0.5125
10     722.6411     0.5224                0.4737                0.5073
11     1619.348     0.4880                0.5072                0.5015
12     2648.467     0.4618                0.4864                0.5098
13     6798.318     0.5435                0.4871                0.4994
14     12413.51     0.4594                0.5069                0.5027
15     23011.70     0.4913                0.4721                0.5050
16     58823.60     0.5186                0.5017                0.4971
17     95330.98     0.4423                0.4977                0.5034
18     218701.1     0.5317                -                     0.4994
19     465947.6     -                     -                     0.4978
20     773404.5     -                     -                     0.5025
 ∞     0.433 < theoretical value < 0.5

Case XIX
Problem: To compute the complex orthonormal polynomials corresponding to a given domain, $p_n(z) = a_n z^n + \dots$, to compute an approximation to the exterior mapping function of the domain from the ratio $p_{n+1}(z)/p_n(z)$, and to compute the transfinite diameter of the domain.
Domain: Same as in Case VII, the “bean.”
Points: N = 84, as in Case VII.
Weights: As in Case VII.
Special functions: n = 31: 1, z, ..., $z^{30}$.
Error in exterior mapping function: $\max_i \delta_{29,i} = 0.0774$, $\left( \frac{1}{N} \sum_{i=1}^{N} \delta_{29,i}^2 \right)^{1/2} = 0.0281$.

Case XIX, Convergence to Transfinite Diameter
 n    a_n/a_{n+1}       n    a_n/a_{n+1}
 0    0.48794905       16    0.50776548
 1    0.51270368       17    0.50732470
 2    0.50368673       18    0.50808456
 3    0.50441978       19    0.50815499
 4    0.50784869       20    0.50731565
 5    0.50711472       21    0.50757485
 6    0.50705196       22    0.50832095
 7    0.50776177       23    0.50760986
 8    0.50822067       24    0.50733903
 9    0.50764419       25    0.50784640
10    0.50733756       26    0.50783172
11    0.50798193       27    0.50708396
12    0.50821394       28    0.50761722
13    0.50749173       29    0.50827258
14    0.50772655        ∞    Transfinite diameter
15    0.50823471
Case XX
Problem: Same as XIX.
Domain: Square: Same as XI.
Inner product: $(f, g) = \sum_{i=1}^{N} w_i\, f(z_i)\, \overline{g(z_i)}$.
Points: N = 100, as in XI, Case (A).
Weights: As in XI, Case (A).
Special functions: n = 35: 1, z, ..., $z^{34}$.
Error in exterior mapping function: $\max_i \delta_{33,i} = 0.008$, $\left( \frac{1}{N} \sum_{i=1}^{N} \delta_{33,i}^2 \right)^{1/2} = 0.001$.

Case XX, Convergence to Transfinite Diameter
 n    a_n/a_{n+1}       n    a_n/a_{n+1}
 0    0.28867513       17    0.29501340
 1    0.29580399       18    0.29501968
 2    0.30304576       19    0.29501633
 3    0.28544961       20    0.29503205
 4    0.29562620       21    0.29503620
 5    0.29506199       22    0.29503996
 6    0.29451547       23    0.29503893
 7    0.29440303       24    0.29504709
 8    0.29490287       25    0.29504972
 9    0.29487829       26    0.29505212
10    0.29489984       27    0.29505188
11    0.29485711       28    0.29506660
12    0.29496020       29    0.29505836
13    0.29497102       30    0.29505997
14    0.29498217       31    0.29506002
15    0.29497104       32    0.29506298
16    0.29500655       33    0.29506420
Theoretical value: $0.295085 = 0.5\,[\Gamma(1/4)]^2 / (4\pi^{3/2})$
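The first ratios and the theoretical value can be reproduced with a short computation. This is a sketch under stated assumptions rather than the code used for the table: it assumes NumPy, the square $|x|, |y| \le \tfrac{1}{4}$ with a 25-point Gauss-Legendre rule on each side, a Hermitian discrete inner product, and plain modified Gram-Schmidt on the powers of z. The function names are illustrative.

```python
import math
import numpy as np

# 25-point Gauss-Legendre rule on each side of the square |x|, |y| <= 1/4 (N = 100).
t, wt = np.polynomial.legendre.leggauss(25)
s, ws = 0.25 * t, 0.25 * wt
z = np.concatenate([s - 0.25j, 0.25 + 1j * s, s + 0.25j, -0.25 + 1j * s])
w = np.tile(ws, 4)


def inner(f, g):
    """Discrete Hermitian inner product sum_i w_i f(z_i) conj(g(z_i))."""
    return np.sum(w * f * np.conj(g))


def leading_coefficients(nmax):
    """Modified Gram-Schmidt on 1, z, ..., z^nmax; returns a_0..a_nmax."""
    basis, a = [], []
    for k in range(nmax + 1):
        v = z ** k
        for q in basis:
            v = v - inner(v, q) * q
        norm = math.sqrt(inner(v, v).real)
        basis.append(v / norm)
        a.append(1.0 / norm)  # v stays monic in z, so p_k = v / norm has a_k = 1 / norm
    return np.array(a)


a = leading_coefficients(13)
ratios = a[:-1] / a[1:]                          # a_n / a_{n+1}, n = 0..12
theory = 0.5 * math.gamma(0.25) ** 2 / (4.0 * math.pi ** 1.5)

print(ratios[:2])   # compare with the n = 0 and n = 1 rows of the table
print(theory)       # compare with the theoretical value quoted above
```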
18. Comments on the Numerical Experiments
The tables giving rapidity of convergence may be used to estimate the number of special functions needed for prescribed accuracy.
Case I
Case (a) asks for a harmonic function which is regular in a larger region than the cube. But the singularity is close to the boundary. This yields considerably better results than the discontinuous problems (b) and (c). In discontinuous problems, E_max is of no great significance since we must inevitably pick up a discrepancy of the order of half the jump. Note that the error in the interior is considerably less than the error on the boundary. In case (a), the maximum error occurs at a vertex, which is the point nearest the singularity. Part of the error is due to proximity to the singularity and part to the discontinuities at edges and vertices.
Case II
For regular boundary values (a), the result on the sphere is better than on the cube by an order of magnitude. This comparison is somewhat unfair because the singularity is closer to the cube than to the sphere. For discontinuous boundary values (b) there is little difference in E_R/V_R between the sphere and the cube. The family of solutions (b)-(f) exhibits the degree of approximation possible for prescribed boundary values of increasing smoothness. Theoretical discussions of this point do not seem to be available in three dimensions. If $f \in C^n$ on the sphere, the approximating functions of degree n + 2 cause the greatest decrease in E_R/V_R, whereas functions of degree n + 3 contribute practically nothing. Case (f) is worse than case (e), where one might expect better results. But in (f), the best approximants would be of degree five, and these were not included.
Case III
Odd powers and imaginary parts of even powers were discarded in view of symmetry. E_max on the outer boundary is an order of magnitude better than on the inner boundary. The Dirichlet integral of this solution gives the capacity of the annulus (see, e.g., Diaz [15.30]).
Case IV
Only the functions 1, Re $z^2$, $\log |z \pm \tfrac{1}{2}|$, Re $(z \pm \tfrac{1}{2})^{-1}$, Re $(z \pm \tfrac{1}{2})^{-2}$ played a significant role in (a). In case (b) even Re $(z \pm \tfrac{1}{2})^{-2}$ contributed little. Convergence to zero of E_R/V_R is very slow. As might be expected, no significant difference in E_R/V_R between (a) and (b) turned up. The first treatment of this problem via least squares and punch card machines is in Reynolds [18.2].
Case V
This case was suggested to us by S. Bergman [18.1]. Note that the rounding of the corners reduces both E_max and E_R/V_R. The method apparently prefers smooth boundaries. Yet the error is relatively large in both cases (probably due to the nonconvexity of the domains) and we do not feel that the results are sufficiently sharp to reflect the differences in the true solutions of (a) and (b). In judging the table of rapidity of convergence bear in mind that we go up four degrees with each new function.
Case VI
This case is well worked over in the literature of numerical analysis. For polynomial boundary values prescribed on an ellipse the solution of the Dirichlet problem is a polynomial (see Vodicka [18.3]). Note the family of boundary values of increasing smoothness (b)-(g) and the resulting decrease in E_max and E_R/V_R. As in Case II, for $f \in C^n$ the approximating function of degree n + 2 causes the greatest decrease in E_R/V_R.
Case VII
Case VII(k) was first done in 1955 using SEAC and eleven special functions; see Davis and Rabinowitz [1.2]. The present computations lead to maximum errors which are $10^{-3}$ times the old errors. This reflects the increased speed and capacity of the present generation of computing machines and leads us to hope that similar progress will be made on more refractory problems over the next five years. We have used almost the same set of boundary functions in VI and VII. The results show that the errors are significantly smaller over the ellipse than over the “bean.” The differences become greater as the boundary function becomes smoother. However, when the solution is harmonically continuable across the boundary, this ceases to be true and VII(j) is better than VI(j). The singularity is closer to the ellipse than to the “bean.” The nonconvex boundary of the “bean” does not seem to play a role in (j), but it is apparently the cause of the difference in behavior of VI(b)-(i) vs. VII(a)-(f), (h), and (i). As in VI, approximation by harmonic polynomials is strongly sensitive to the continuity class of the boundary data and we would normally expect (g) to yield better results than (e)-(f). The actual results are something of a mystery. The errors are of the same order of magnitude as in V. Maximum errors for (a)-(i) occur in the vicinity of the innermost concavity. In (j) and (k) they occur at the boundary points nearest the singularity.
Case VIII
For (a) see Synge [2.2]. Note that the derivative discrepancies are of the same order as the functional values. Note the superiority of (c) to (b) and the relative locations of their singularities, which are at the same distance from the boundary. The maximum error in (a), (b), and (c) occurred at the corners.
Case IX
The addition of the logarithm doesn’t make much difference in the long run. Poritzky and Danforth [18.5] tried collocation on the torsion problem for this contour. Thirteen points and functions were employed and unsatisfactory results reported.
Case X
No low order continuity data was prescribed. Apart from (a), which was a “set-up,” the other cases yielded results roughly of the same order of magnitude. In (a), exact error at (0, 0) was an order of magnitude less than the average error. It is interesting that in (b)-(f) the derivative is approximated more accurately than the function. The table of rapidity of convergence shows that while the norm of the total discrepancy, naturally, decreases monotonically with the number of special functions, the separate errors E_max, f1; E_max, f2; E_R, f1; and E_R, f2 need not.
Case XI
For the entire biharmonic boundary value (B, a) the errors are of the same order as the ellipse X. For the other boundary value problems common to X and XI, the results on the ellipse are better by two orders of magnitude. We were interested in comparing the discrete inner product with the integral inner product, and to obtain the latter with a high degree of accuracy, Gaussian weights and abscissas were employed in XI (A). No observable improvement resulted. In (C), the number of equidistributed points taken was reduced by 80%. The number of special functions was 18, and the number of points was, essentially, 40. At these points, the errors are comparable to (A) and (B). When the explicit approximate solution was evaluated at the boundary points, the errors were comparable to the errors achieved in the interior. In other words, no divergencies of collocation type were observed (compare with XV).
Case XII
Cases (d)-(g) were suggested to us by J. Nowinski. In (a) the semicircle yields accuracy comparable to the square. In (b) and (c) the results are one order better than the square and an order worse than the ellipse. Note that (A) weights the values and those of derivatives equally (this was done elsewhere with the biharmonic equation) but (B) gives the values 10 times the weight of the derivative values. Accordingly we would expect that E_max, f1 and E_R, f1 in (B) should be better than in (A). Similarly, E_max, f2 and E_R, f2 should be worse. However, it is interesting to note that the values computed did not change significantly, and the improvement in $f_1$ was less than the worsening of $f_2$. In a case not reported here, where $w^{(2)} = 0.5\,w^{(1)}$, practically no change was observed. In the cases suggested by Nowinski, moments $M_x$, $M_y$, $M_r$, $M_\theta$ were computed and the match between the corresponding moments at the corners had a 5% error.
Case XIII
Case (a) has similarities to the torsion problem. The error is much less than for the torsion problem for the nonconvex regions studied in V, VII, and IX. Cases (b)-(e) have harmonically continuable solutions whose singularity in each case is at the same distance from the domain, but differences in accuracy show up and may be attributed to relative locations of the singularity.
Case XIV
This case is comparable in accuracy to the square, XI.
Case XV
Although this domain presents difficulties, very high accuracy was achieved for the entire harmonic solution (k). The case of torsion (a) was very bad, and none of the other cases (e)-(j) was much better. The continuable cases (c) and (d) are an order of magnitude better than (a). The poor showing of the continuable case (b) may be attributed to the position of its singularity in the interior of the convex hull of the H. Cases (a), (b), (c), and (k) were also solved by collocation with 31 points. For case (k) the resulting coefficients agreed fairly well with the corresponding theoretical coefficients. For cases (a)-(c), the resulting function, when evaluated at intermediate points on the boundary, exhibited errors several orders of magnitude greater than the already large errors achieved by least squares.
Case XVI
Comparing this case with XI, (a) is of the same order of magnitude, whereas (b), (c), (d) are worse by about two orders of magnitude. This bears out our previous experience that for entire harmonic or biharmonic functions, the shape of the domain is of little consequence.
Case XVII
This is an extension of a previous computation (see Davis [18.4]). This computation was carried out in two ways: (a) Gram-Schmidt, double precision floating orthonormalization of the powers, (b) single precision floating recurrence. In view of the symmetry of the point set, the coefficient of $x^{n-1}$ in the orthonormal polynomial of degree n should be zero. With method (a) and for n = 17, this theoretically zero coefficient was of the same order of magnitude as the leading coefficient. With method (b) and for n = 20, this coefficient was negligible compared with the leading coefficient, i.e., at the level of the full precision of the computation. Here is strong evidence for the superiority of recurrence vs. Gram-Schmidt, whenever applicable.
Case XVIII
We are not aware of any closed form expression for the answer to this problem. The computation was done by recurrence. The poor convergence (when compared with XVII) indicates the numerical difficulty associated with this type of problem. The last column should theoretically converge to the proper answer. This is true of the two previous columns, if convergent.
Case XIX
Cases XIX and XX are the only examples of complex orthogonalization carried out. For earlier computations, see Davis and Rabinowitz [1.2].
Case XX
The results are far superior to those of XIX. The convergence of $a_n/a_{n+1}$ is almost monotonic and is within 0.00002 of the theoretical value. The boundary errors in the mapping function are an order of magnitude better than in XIX.
19. The Art of Orthonormalization
As in many problems run on computers, great care must be taken to see that there are no mistakes in the input data. For this reason, in running a series of computations, it is advisable if possible to insert one problem whose answer is known. The orthonormalizing routine is very sensitive to errors in the data of the function to be expanded. For this reason it is important to examine the deviations at each individual data point. By inspecting these discrepancies, one can detect at a glance any errors in the input data. Usually the discrepancies at all points are of the same order of magnitude, so that when the discrepancy at a particular point is greater than the average by one or two orders of magnitude, it is advisable to recheck the input data for that point. Experience has shown that in such cases that piece of information has been incorrectly prepared.
Another important thing to remember when expanding a function by least squares is that the sum of the squares of the absolute deviations is minimized. This may result in large relative errors occurring at several data points. If this is undesirable, a possible way to remedy the situation is to assign as weights the inverse squares of the values of the function, i.e., $w_i = 1/F_i^2$.
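The effect of the weights $w_i = 1/F_i^2$ can be illustrated with a small sketch. It assumes NumPy and invented data (a cubic fitted to $e^{6x}$ on 41 points), chosen only because the values span several orders of magnitude; the helper `fit` is hypothetical and is not part of the orthonormalizing code discussed here.

```python
import numpy as np

# Hypothetical data spanning several orders of magnitude.
x = np.linspace(0.0, 1.0, 41)
F = np.exp(6.0 * x)
A = np.vander(x, 4)                 # cubic polynomial basis: x^3, x^2, x, 1


def fit(weights):
    """Weighted least squares: minimize sum_i w_i (F_i - (A c)_i)^2."""
    r = np.sqrt(weights)
    c, *_ = np.linalg.lstsq(A * r[:, None], F * r, rcond=None)
    return c


c_abs = fit(np.ones_like(F))        # unit weights: absolute deviations are minimized
c_rel = fit(1.0 / F ** 2)           # w_i = 1 / F_i^2: relative deviations are minimized

rel_abs = np.abs(A @ c_abs - F) / np.abs(F)
rel_rel = np.abs(A @ c_rel - F) / np.abs(F)
print("max relative error, unit weights: ", rel_abs.max())
print("max relative error, 1/F^2 weights:", rel_rel.max())
```

By construction the second fit has the smaller sum of squared relative deviations and the first the smaller sum of squared absolute deviations; which one is appropriate depends on how the expansion is to be used.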
The computations reported herein were almost all done in double-precision floating. For some of the scaling problems involved when working in fixed point see Davis and Rabinowitz [1.1]. For small problems involving up to 10-12 functions, single precision arithmetic (36 bits) should be adequate. However, because of roundoff, double precision is advisable for work with more than that number. With single precision words of greater length, the number of functions for which single precision is adequate naturally increases.
As has been mentioned several times, whenever one wants to fit data by a polynomial, the 3-term recurrence relation should be used rather than Gram-Schmidt. This is so for reasons of accuracy, but even if comparable accuracy could be achieved, 3-term recurrence is much quicker than Gram-Schmidt and requires much less storage. The time for the former is proportional to nN, for the latter, to $n^2 N$.
For solving the harmonic, biharmonic, and other linear elliptic equations, the relative advantages and disadvantages of orthogonal functions versus finite differences should be considered. The use of orthogonal functions has the advantage that it gives the answer as a closed expression, gives a good idea of the error committed, and can be used with any shape domain. On the other hand, the computations may take longer, and every new differential equation requires a new program to generate particular solutions.
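The operation counts quoted above for polynomial fitting (nN for the recurrence, $n^2 N$ for Gram-Schmidt) can be made concrete by counting inner products, each of which costs about N multiplications. The sketch below assumes NumPy, a hypothetical set of 41 equally weighted points on [-1, 1], and n = 8 polynomials; the class `CountingInnerProduct` and both routines are illustrative, not the authors' code.

```python
import numpy as np


class CountingInnerProduct:
    """Discrete inner product sum_i w_i f_i g_i that counts its own calls."""

    def __init__(self, w):
        self.w = w
        self.calls = 0

    def __call__(self, f, g):
        self.calls += 1
        return float(np.sum(self.w * f * g))


def gram_schmidt(x, ip, n):
    """Modified Gram-Schmidt applied to the powers 1, x, ..., x^(n-1)."""
    basis = []
    for k in range(n):
        v = x ** k
        for q in basis:                 # k projections for the k-th power
            v = v - ip(v, q) * q
        basis.append(v / np.sqrt(ip(v, v)))
    return np.array(basis)


def three_term(x, ip, n):
    """Same polynomials from pi_{k+1} = (x - alpha_k) pi_k - beta_k pi_{k-1}."""
    basis = []
    p_prev, p = np.zeros_like(x), np.ones_like(x)
    norm2_prev = 1.0
    for k in range(n):
        norm2 = ip(p, p)                # two inner products per step
        basis.append(p / np.sqrt(norm2))
        alpha = ip(x * p, p) / norm2
        beta = norm2 / norm2_prev if k > 0 else 0.0
        p, p_prev = (x - alpha) * p - beta * p_prev, p
        norm2_prev = norm2
    return np.array(basis)


x = np.linspace(-1.0, 1.0, 41)
w = np.full(x.size, 1.0 / x.size)
n = 8

gs_ip, rec_ip = CountingInnerProduct(w), CountingInnerProduct(w)
Q_gs = gram_schmidt(x, gs_ip, n)
Q_rec = three_term(x, rec_ip, n)

print("inner products, Gram-Schmidt:", gs_ip.calls)    # n (n + 1) / 2
print("inner products, recurrence:  ", rec_ip.calls)   # 2 n
print("largest difference:", np.max(np.abs(Q_gs - Q_rec)))
```

At this low degree the two methods agree closely; the accuracy advantage of the recurrence reported for Case XVII shows up only at higher degrees, which this sketch does not attempt to reproduce. The recurrence also needs only the two most recent polynomials in storage, whereas Gram-Schmidt keeps them all.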
20. Conclusions
The set of experiments reported in Section 17 indicates the accuracy that can be achieved with the least square method in a wide variety of situations. This method, without undue expenditure of machine time, yields answers of considerable accuracy in some problems and of quite acceptable accuracy in a great many problems. The interested reader, confronted with his own particular problem, can “extrapolate” from the cases given and obtain an idea of what the method is likely to do for him.
Theory indicates and computation bears out the sensitivity of the method to the boundary and to the boundary data prescribed. The most favorable condition is that of boundary values which come from entire harmonic functions or from harmonic functions which are regular in large portions of the plane. Here we may expect geometric convergence, and it seems to matter little whether the boundary itself is analytic or piecewise analytic. Boundary data leading to solutions which do not continue harmonically across the boundary, or boundary data which is of low continuity class, is a less favorable situation. Convergence is only arithmetic and observed errors are correspondingly greater. Here it seems to matter what the boundary is, with analytic boundaries responding more favorably than piecewise analytic ones and convex boundaries than nonconvex ones.
There is a progressive falling off in accomplishment as we move from simply-connected domains to multiply-connected domains and from two dimensions to three dimensions. The more complicated the geometry-boundary data combination, the more we must work to obtain a solution of given accuracy. Special devices, “tailor-made” for the configuration, might have to be called into play. This undoubtedly is the case with all numerical methods, and leads to the conclusion that for unusual geometrical configurations, the potential problem or the biharmonic problem cannot, even now, be regarded as an open and shut matter.
We regret that the return of one of the authors to his permanent post in September 1960 has prevented us from continuing these investigations. We would like to have looked into the equations $\Delta u = 0$ and $\Delta\Delta u = 0$, tackling more complicated boundary problems. Thus, we would like to have solved mixed problems for two-dimensional multiply-connected and three-dimensional simply- and multiply-connected domains. The biharmonic equation with its wide variety of boundary value problems still needs considerable investigation, and we would like to have solved problems over three-dimensional domains. In these harder problems, we feel that comparable accuracy can be achieved by taking 50-80 special functions. We would like to know something about the optimal ratio of n to N. When n = N we have, as has been pointed out, the case of straight collocation with the possibilities of serious divergencies. We would have liked (and this is of considerable importance) to have carried out comparisons of the present method with other methods currently employed. Though these things have not been possible, we feel we have laid a solid ground for anyone who wishes to carry this work further.
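The most favorable situation described above, boundary values taken from an entire harmonic function, can be illustrated by a small sketch of the least square method itself. It is not the program used for the experiments: it assumes NumPy, a square $|x|, |y| \le \tfrac{1}{4}$ with 80 equally spaced and equally weighted boundary points, the 15 special functions 1, Re $z^k$, Im $z^k$ (k = 1, ..., 7), and the boundary data $f = \mathrm{Re}\, e^z$. The function names and the interior test point are illustrative choices.

```python
import numpy as np


def square_boundary(m, half=0.25):
    """m equally spaced points per side of the square |x|, |y| <= half."""
    t = np.linspace(-half, half, m, endpoint=False)
    return np.concatenate([t - 1j * half, half + 1j * t,
                           -t + 1j * half, -half - 1j * t])


def harmonic_basis(z, degree):
    """Columns 1, Re z^k, Im z^k for k = 1..degree (harmonic polynomials)."""
    cols = [np.ones(z.shape)]
    for k in range(1, degree + 1):
        zk = z ** k
        cols += [zk.real, zk.imag]
    return np.column_stack(cols)


z = square_boundary(20)                       # N = 80 boundary points
f = np.exp(z.real) * np.cos(z.imag)           # boundary values of Re e^z
A = harmonic_basis(z, 7)                      # n = 15 special functions
c, *_ = np.linalg.lstsq(A, f, rcond=None)     # equal weights on all points

boundary_error = np.max(np.abs(A @ c - f))
z0 = np.array([0.1 + 0.05j])                  # an interior point
exact = np.exp(0.1) * np.cos(0.05)
interior_error = abs((harmonic_basis(z0, 7) @ c)[0] - exact)
print(boundary_error, interior_error)
```

Replacing the boundary data by a discontinuous function, such as f = 1 for x > 0 and f = 0 otherwise, is a simple way to observe the much slower convergence reported for the less favorable cases.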
References SECTION
1. Introduvtion
1.1 Davis P., and Rabinowitz, P., A multiple purpose orthonormalizing code and its uses. J. Assoc. Cornpuling Machinery I,183-191 (1954). 1.2 Davis, P., and Rabinowitz, P., Numerical experiments in potential theory twin3 orthonormal functions. J. Wash. Acad. Sci. 46, 12-17 (195G). 1.3 Davis, P., Orthonormalizing Codes in Numerical Analysis, Lectures presented a t NBS-NSF Training Program in Nunirrirul Analysis, 1957 (J. Todd, ed.), to he published. SECTION
2. The Geometry of Least Squares
2.1 Schreier, O., and Sperner, E., Modern rllgebra and Matrix Theory, c;hapt. 11. Chelsea Publishing Co., New York, 1951. 2.2 Synge, J. I,., The Hypercircle in Mathematical Physics, pp. 7-124. Cambridge IJniversity Press, London and New York. 1957. 2.3 Taylor, A. E., Introduction to Functional Analysi,~,p. 106. Wiley & Sons, New York, 1958. 2.4 Davis, 1’. J., Haynsworth, E., and MarcuE, M., Bounds for the P-condition number of m:ttrires with poeitive roots, J. Research,Natl. Bur. Standards 658, 13-14 (1961).
124 SECTION
PHILIP J. DAVIS AND PHILIP RABlNOWlTZ
3. Inncr Products Useful in Numoricd Analysis
3.1 Bergman, S., The Kcrncl Function and Conformal Mapping. Am. Math. SOC., New York, 1950. SECTION
4. The Computation of Inner Products
4.1 Birkhoff, G., and Young, D. M., Jr., Numerical quadrature of analytic and harmonic functions. J. Math. and Phys. 99,217-221 (1950). 4.2 Tyler, G. W.. Numerical integration of functions of several variables, Con. J. Math. 5, 393412 (1953). 4.3 Hammer, P. C., and Stroud, A. H., Numerical integration over simplexes. Math. Tables Aids Comput. 10, 137-139 (1956). 4.4 Hammer, P. C.,and Wymore, A. W., Numerical evaluation of multiple integrals I. Math. Tables Aids Comput. 11, 59-67 (1957). 4.5 Hammer, P. C., and Stroud, A. H., Numerical integration of multiple integrals 11, Math. Tables Aids Comput. 19, 272-280 (1958). 4.6 Albrecht and Collatz, L., Auswertung mehrdimensionaler Integrde. 2. angew. Math. Meeh. 38,l-15 (1958). 4.7 Stroud, A. H., A bibliography on approximate evaluation of integrals. Math. Tables Aids Comput. 15, 52-80 (1961). 4.8 Stroud, A. €I., Quadrature methods for functions of more than one variable. Ann. N . Y . Acad. Sn'. 86,776-791 (1960). SECTION
5. Methods of Orthogonalization
5.1 Courant, R., and Hilbert, D., Methoden der Mathematischen Physik, Vol. I, pp. 40-47. Berlin, 1931.
5.2 Szegö, G., Orthogonal Polynomials. Am. Math. Soc., New York, 1939.
5.3 Shohat, J. A., Hille, E., and Walsh, J. L., A bibliography on orthogonal polynomials. Bull. Natl. Research Council (U.S.) 103, (1940).
5.4 Peach, M. O., Simplified technique for constructing orthonormal functions. Bull. Am. Math. Soc. 50, 556-564 (1944).
5.5 Erdélyi, A., et al., Higher Transcendental Functions, Vol. 2, Chapters 10 and 12. McGraw-Hill Book Co., New York, 1953.
5.6 Forsythe, G. E., Generation and use of orthogonal polynomials for data fitting with a digital computer. J. Soc. Ind. Appl. Math. 5, 74-88 (1957).
5.7 Ascher, M., and Forsythe, G. E., SWAC experiments on the use of orthogonal polynomials for data fitting. J. Assoc. Computing Machinery 5, 9-21 (1958).
5.8 Weisfeld, M., Orthogonal polynomials in several variables. Numerische Math. 1, 38-40 (1959).
SECTION
6. Tables of Orthogonal Polynomials and Related Quantities
6.1 Russel, J. B., A table of Hermite functions. J. Math. and Phys. 12, 291-297 (1933). [$e^{-x^2/2} H_n(x)$: x = 0(.04)1(.1)4(.2)7(.5)8, n = 0(1)11, 5D.]
6.2 British Assoc. for the Advancement of Sci., Legendre Polynomials, Mathematical Tables, Vol. A. Cambridge University Press, London and New York, 1946. [$P_n(x)$: x = 0(.01)6, n = 1(1)12, 7-8D.]
6.3 Wiener, N., Extrapolation, Interpolation, and Smoothing of Stationary Time Series. Wiley & Sons, New York, 1949. [Laguerre Polynomials $L_n(x) = e^{x}\,(x^n e^{-x})^{(n)}$, n = 0(1)5, x = 0(.01).1(.1)18(.2)20(.5)21(1)26(2)30, 3-5D.]
6.4 National Bureau of Standards, Tables of Chebyshev Polynomials $S_n(x)$ and $C_n(x)$, Appl. Math. Ser. No. 9. U.S. Govt. Printing Office, Washington, D. C., 1952. [x = 0(.001)2, n = 2(1)12, 12D. Coefficients for n = 0(1)12.]
6.5 Salzer, H., Zucker, R., and Capuano, R., Tables of the zeros and weight factors of the first twenty Hermite polynomials. J. Research Natl. Bur. Standards 48, 111-116 (1952).
6.6 Karmazina, L. N., Tablitsy Polinomov Yakobi. Izdatelstvo Akad. Nauk S.S.S.R., Moscow, 1954. [Jacobi Polynomials: $G_n(p, q, x)$, $\int_0^1 x^{q-1}(1 - x)^{p-q} G_n G_m \, dx = 0$, $m \neq n$; $G_n(p, q, x) = x^n + \cdots$; x = 0(.01)1, q = .1(.1)1, p = 1.1(0.1)3, n = 1(1)5, 7D.]
6.7 Kopal, Z., Numerical Analysis. Wiley & Sons, New York, 1955.
6.8 Davis, P., and Rabinowitz, P., Abscissas and weights for Gaussian quadratures of high order. J. Research Natl. Bur. Standards 56, 35-37 (1956).
6.9 Davis, P., and Rabinowitz, P., Additional abscissas and weights for Gaussian quadratures of high order. J. Research Natl. Bur. Standards 60, 613-614 (1956).
6.10 Head, J., and Wilson, W., Laguerre functions: tables and properties. Proc. Inst. Elec. Engrs. Pt. C 103, 128-441 (1956).
6.11 Fishman, H., Numerical integration constants. Math. Tables Aids Comput. 11, 1-9 (1957).
6.12 Barker, J. E., Use of orthonormal polynomials in fitting curves and estimating their first and second derivatives, Naval Proving Ground Rept. No. 1553, Suppl. 1, Tables 1 and 2. Dahlgren, Virginia, 1958.
6.13 Gawlik, H. J., Zeros of Legendre polynomials of orders 2-61 and weight coefficients of Gauss quadrature formulae, A.R.D.E. Memo. (U) 77/58. Fort Halstead, Kent, 1958.
6.14 Rabinowitz, P., and Weiss, G., Tables of abscissas and weights for numerical evaluation of integrals of the form $\int_0^\infty e^{-x} x^n f(x)\,dx$. Math. Tables Aids Comput. 13, 285-294 (1959).
6.15 Rabinowitz, P., Abscissas and weights for Lobatto quadrature of high order. Math. of Computation 14, 47-52 (1960).
6.16 Hochstrasser, U. W., Orthogonal polynomials, in Handbook of Mathematical Functions, Chapt. 22. U.S. Govt. Printing Office, Washington, D. C., in press.
SECTION
7. Least Square Approximation of Functions
7.1 Snedecor, G. W., Statistical Methods. Iowa State College Press, Ames, Iowa, 1937.
7.2 Kendall, M. G., The Advanced Theory of Statistics. Griffin, London, 1946.
7.3 Spencer, R. C., and Parke, N. G., III, A Matrix Treatment of the Approximation of Power Series Using Orthogonal Polynomials, Including Application, Electronics Research Division Antenna Laboratory. Air Force Cambridge Research Center, Cambridge, Massachusetts, 1952.
7.4 Cvetkov, li., A new method of computation in the theory of least squares. Australian J. Appl. Sci. 6, 274-280 (1955).
7.5 Gale, L. A., A modified-equations method for the least squares solution of condition equations. Trans. Am. Geophys. Union 36, 779-791 (1955).
7.6 Shenitzer, A., Chebyshev approximation of a continuous function by a class of functions. J. Assoc. Computing Machinery 4, 30-35 (1957).
7.7 Maehly, H. J., First Interim Progress Report on Rational Approximation, Technical Report. Princeton University, Princeton, New Jersey, 1958.
7.8 Murnaghan, F. D., and Wrench, J. W., The Approximation of Differentiable Functions by Polynomials, David Taylor Model Basin Report No. 1175. 1958.
7.9 Stiefel, E. L., Numerical Methods of Tchebycheff approximation, in On Numerical Approximation (R. E. Langer, ed.). University of Wisconsin Press, Madison, Wisconsin, 1959.
7.10 Deming, L. S., Selected bibliography of statistical literature, 1930 to 1957: I. Correlation and regression theory. J. Research Natl. Bur. Standards 64B, 55-68 (1960).
7.11 Maehly, H., and Witzgall, C., Tschebyscheff-Approximationen in kleinen Intervallen I. Approximation durch Polynome. Numer. Math. 2, 142-150 (1960).
7.12 Veidinger, L., On the numerical determination of the best approximations in the Chebyshev sense. Numer. Math. 2, 99-105 (1960).
7.13 Box, G. E. P., Fitting Empirical Data, Mathematical Research Center Tech. Summary Report No. 151. University of Wisconsin, Madison, Wisconsin, 1960.
7.14 Hartley, H. O., The Modified Gauss-Newton Method for the Fitting of Non-Linear Regression Functions by Least Squares, Statistical Lab. Tech. Report No. 2.3. Iowa State University, Ames, Iowa, 1959.
SECTION
8. Overdetermined Systems of Linear Equations
8.1 de la Vallée-Poussin, C. J., Sur la méthode de l'approximation minimum. Soc. sci. Bruxelles, Ann. Sér. II, 35, 1-16 (1911). (English translation by H. E. Salzer available).
8.2 Goldstein, A. A., On the Method of Descent in Convex Domains and Its Applications to the Minimal Approximation of Overdetermined Systems of Linear Equations, Math. Preprint No. 1. Convair Astronautics, San Diego, Cal., 1956.
8.3 Goldstein, A. A., Levine, N., and Hershoff, J. B., On the "best" and "least Qth" approximation of an overdetermined system of linear equations. J. Assoc. Computing Machinery 4, 341-347 (1957).
8.4 Goldstein, A. A., and Cheney, W., A finite algorithm for the solution of consistent linear equations and inequalities and for the Tschebycheff approximation of inconsistent linear equations. Pacific J. Math. 8, 415-427 (1958).
8.5 Zukhovitskiy, S. I., An algorithm for the solution of the Chebyshev approximation problem in the case of a finite system of incompatible linear equations (Russian). Doklady Akad. Nauk S.S.S.R. 79, 561-564 (1951).
8.6 Remez, E., General Computational Methods for Chebyshev Approximation (Russian), Part II. Izdatelstvo Akad. Nauk Ukrainsk. S.S.R., Kiev, 1957.
8.7 Stiefel, E., Note on Jordan elimination, linear programming and Chebyshev approximation. Numer. Math. 2, 1-17 (1960).
SECTION
9. Least Square Methods for Ordinary Differential Equations
9.1 Picone, M., Sul metodo delle minime potenze ponderate e sul metodo di Ritz per il calcolo approssimato nei problemi della fisica-matematica. Rend. Circ. mat. Palermo 52, 225-253 (1928).
9.2 Faedo, S., Sul metodo di Ritz e su quelli fondati sul principio dei minimi quadrati per la risoluzione approssimata dei problemi della fisica matematica. Rend. Mat. e Appl. Univ. Roma Ist. nazl. alta mat. 6, 73-94 (1947).
9.3 Collatz, L., Numerische Behandlung von Differentialgleichungen. Springer, Berlin, 1951.
9.4 Faedo, S., Sulla maggiorazione dell'errore nei metodi di Ritz e dei minimi quadrati. Atti. Accad. nazl. Lincei. Rend. Classe sci. fis. mat. e nat. 14, 466-470 (1953).
9.5 Fox, L., Numerical Solution of Two-Point Boundary Problems in Ordinary Differential Equations. Oxford University Press, London and New York, 1957.
9.6 Kadner, H., Untersuchungen zur Kollokationsmethode. Z. angew. Math. Mech. 40, 99-113 (1960).
9.7 Hildebrand, F. B., and Crout, P. D., A Least Square Procedure for Solving Integral Equations by Polynomial Approximation. J. Math. and Phys. 20, 310-335 (1941).
9.8 Lonseth, A. T., Approximate Solutions of Fredholm-Type Integral Equations. Bull. Am. Math. Soc. 60, 415-430 (1954).
SECTION
10. Linear Partial Differential Equations of Elliptic Type
10.1 Zaremba, S., L'équation biharmonique et une classe remarquable de fonctions fondamentales harmoniques. Bull. intern. acad. sci. Cracovie, 147-196 (1907).
10.2 Zaremba, S., Sur le calcul numérique des fonctions demandées dans le problème de Dirichlet et le problème hydrodynamique. Bull. intern. acad. sci. Cracovie 1, 125-195 (1909).
10.3 Bergman, S., Über die Entwicklung der harmonischen Funktionen der Ebene und des Raumes nach Orthogonalfunktionen. Math. Ann. 86, 237-271 (1922).
10.4 Merriman, G. M., On the expansion of harmonic functions in terms of normal-orthogonal harmonic polynomials. Am. J. Math. 53, 589-596 (1931).
10.5 Bergman, S., The Kernel Function and Conformal Mapping. Am. Math. Soc., New York, 1950.
10.6 Fichera, G., Risultati concernenti la risoluzione delle equazioni funzionali lineari dovuti all'Istituto Nazionale per le Applicazioni del Calcolo. Atti. accad. nazl. Lincei Mem. Classe sci. fis. mat. e nat. Sez. [8], 3, 1-81 (1950).
10.7 Fichera, G., On some general integration methods employed in connection with linear differential equations. J. Math. and Phys. 29, 59-68 (1950).
10.8 Kantorovich, L. V., and Krylov, V. I., Approximate Methods of Higher Analysis. Interscience Publishers, New York, 1958 (1950 ed. translated by C. D. Benster).
10.9 Bergman, S., and Schiffer, M., Kernel Functions and Elliptic Differential Equations in Mathematical Physics. Academic Press, New York, 1953.
10.10 Picone, M., Exposition d'une méthode d'intégration numérique des systèmes d'équations linéaires aux dérivées partielles, mise en oeuvre à l'Institut National pour les Applications du Calcul. Résultats obtenus et résultats que l'on pourrait atteindre. Les machines à calculer et la pensée humaine. Colloq. intern. centre natl. recherche sci. (Paris) 37, 23-261 (1953).
10.11 Picone, M., On the mathematical work of the Italian Institute for the Application of Calculus during the first quarter century of its existence. Pubbl. Ist. nazl. Appl. del Calcolo, Roma, No. 362 (1953).
10.12 Sokolnikoff, I. S., Mathematical Theory of Elasticity, 2nd ed. McGraw-Hill Book Co., New York, 1956.
10.13 Zolin, A. F., An approximate solution of the polyharmonic problem. Doklady Akad. Nauk S.S.S.R. 122, 971-973 (1958).
10.14 Lieberstein, H. M., A Continuous Method in Numerical Analysis Applied to Examples From a New Class of Boundary Value Problems, Mathematical Research Center Tech. Summary Report No. 175. University of Wisconsin, Madison, Wisconsin, 1960.
10.15 Fichera, G., On a unified theory of boundary value problems for elliptic-parabolic equations of second order, in Boundary Problems in Differential Equations, Math. Research Center Symposium (R. E. Langer, ed.), pp. 97-120. University of Wisconsin Press, Madison, Wisconsin, 1960.
10.16 Lowan, A. N., On the Picone Treatment of Boundary Value Problems for Partial Differential Equations, Report 5310, University of California Radiation Laboratory, Livermore, California (1958).
SECTION
11. Complete Systems of Particular Solutions
11.1 Bergman, S., Zur Theorie der ein- und mehrwertigen harmonischen Funktionen des dreidimensionalen Raumes. Math. Z. 24, 641-669 (1925).
11.2 Kellogg, O. D., Foundation of Potential Theory. Frederick Ungar, New York, 1953. (Reprint, Dover Publications, New York).
11.3 Fichera, G., Teoremi di completezza connessi all'integrazione dell'equazione Δ4u = f. Giorn. mat. Battaglini 77, 184-199 (1947).
11.4 Fichera, G., Teoremi di completezza sulla frontiera di un dominio per taluni sistemi di funzioni. Ann. mat. pura e appl. 27, 1-28 (1948).
11.5 Vekua, I. N., Novye Metody resheniya ellipticheskikh uravneni. OGIZ, Moscow and Leningrad, 1948.
11.6 Bers, L., Theory of Pseudoanalytic Functions, Department of Mathematics, Notes. New York University, New York, 1951.
11.7 Nehari, Z., Conformal Mapping. McGraw-Hill, New York, 1952.
11.8 Zwieling, K., Grundlagen einer Theorie der biharmonischen Polynome. Verlag Technik, Berlin, 1952.
11.9 Vekua, I. N., On completeness of a system of harmonic polynomials in space. Doklady Akad. Nauk S.S.S.R. [N.S.] 90, 495-498 (1953).
11.10 Vekua, I. N., On completeness of a system of metaharmonic functions. Doklady Akad. Nauk S.S.S.R. [N.S.] 90, 715-718 (1953).
11.11 Miles, E. P., Jr., and Williams, E., A basic set of homogeneous harmonic polynomials in k variables. Proc. Am. Math. Soc. 6, 191-194 (1955).
11.12 Vekua, I. N., Systeme von Differentialgleichungen erster Ordnung vom elliptischen Typus und Randwertaufgaben (German translation). Math. Forschungsberichte, Berlin, 1956.
11.13 Henrici, P., A survey of I. N. Vekua's theory of elliptic partial differential equations with analytic coefficients. Z. angew. Math. u. Phys. 8, 169-202 (1957).
11.14 Horvath, J., Basic sets of polynomial solutions for partial differential equations. Proc. Am. Math. Soc. 9, 569-575 (1958).
11.15 Miles, E. P., Jr., and Williams, E., Basic sets of polynomials for the iterated Laplace and wave equations. Duke Math. J. 26, 35-40 (1959).
11.16 Henrici, P., Complete systems of solutions for a class of singular elliptic partial differential equations, in Math. Research Center Symposium (R. E. Langer, ed.). University of Wisconsin Press, Madison, Wisconsin, 1959.
11.17 Bergman, S., Integral Operators in the Theory of Linear Partial Differential Equations. Springer, Berlin, 1960.
11.18 Krzywoblocki, M. Z. v., Bergman's Linear Integral Operator Method in the Theory of Compressible Fluid Flow (see Appendix by P. Davis and P. Rabinowitz). Springer-Verlag, Vienna, 1960.
11.19 Gagua, M. B., On completeness of systems of harmonic functions. Soobshcheniya Akad. Nauk Gruzin. S.S.R. 19, 3-10 (1957).
SECTION
12. Error Bounds; Degree of Convergence
12.1 Marcolongo, R., Sulla funzione di Green di grado n per la sfera. Rend. circ. mat. Palermo 16, 230-235 (1902).
12.2 Walsh, J. L., On the degree of approximation to a harmonic function. Bull. Am. Math. Soc. 33, 591-598 (1927).
12.3 Humbert, P., Potentiels et Prépotentiels. Paris, 1936.
12.4 Nicolescu, M., Les Fonctions Polyharmoniques. Paris, 1936.
12.5 Walsh, J. L., Maximal convergence of sequences of harmonic polynomials. Annals of Math. 38, 321-354 (1937).
12.6 Miranda, C., Formule di maggiorazione e teorema di esistenza per le funzioni biarmoniche di due variabili. Giorn. mat. Battaglini 78, 97-118 (1948).
12.7 Walsh, J. L., Sewell, W. E., and Elliott, H. M., On the degree of polynomial approximation to harmonic and analytic functions. Trans. Am. Math. Soc. 67, 381-420 (1949).
12.8 Fichera, G., Sulla maggiorazione dell'errore di approssimazione nei procedimenti di integrazione numerica delle equazioni della fisica matematica. Rend. accad. sci. fis. e mat. Soc. reale Napoli [4] 17, Pubbl. Ist. nazl. appl. del Calcolo, Roma No. 289 (1950).
12.9 Collatz, L., Fehlerabschätzung bei der ersten Randwertaufgabe bei elliptischen Differentialgleichungen. Z. angew. Math. Mech. 32, 202-211 (1952).
12.10 Grünsch, H. J., Eine Fehlerabschätzung bei der dritten Randwertaufgabe der Potentialtheorie. Z. angew. Math. Mech. 32, 279-281 (1952).
12.11 Shaginyan, A. L., On approximation in the mean by harmonic polynomials (Russian). Doklady Akad. Nauk Armyan. S.S.R. 19, 97-103 (1954).
12.12 Collatz, L., Numerische und graphische Methoden, in Handbuch der Physik (S. Flügge, ed.), Vol. 2. Springer-Verlag, Berlin, 1955.
12.13 Nehari, Z., On the numerical solution of the Dirichlet Problem, in Proc. Conf. Differential Equations, pp. 157-178. University of Maryland, College Park, Maryland, 1956.
12.14 Hochstrasser, U., Numerical experiments in potential theory using the Nehari estimates. Math. Tables Aids Comput. 12, 26-33 (1958).
12.15 Duffin, R. J., and Nehari, Z., Note on Polyharmonic Functions, Tech. Report No. 32. Carnegie Inst. Technol., Pittsburgh, Pennsylvania, 1960.
12.16 Agmon, S., Maximum theorems for solutions of higher order elliptic equations. Bull. Am. Math. Soc. 66, 77-80 (1960).
12.17 Bramble, J. H., and Hubbard, B. E., A class of higher order integral identities with application to bounding techniques, in press.
12.18 Bramble, J. H., Hubbard, B. E., and Payne, L. E., Bounds for solutions of mixed boundary value problems for second order elliptic partial differential equations, in press.
12.19 Bramble, J. H., and Payne, L. E., New bounds for the deflection of elastic plates, in press.
12.20 Bramble, J. H., and Payne, L. E., Bounds for solutions of mixed boundary value problems for elastic plates, in press.
SECTION
13. Collocation and Interpolatory Methods
13.1 Poritsky, H., and Danforth, C. E., On the torsion problem, in Proc. 3rd U.S. Natl. Congr. of Applied Mechanics, pp. 431-441. New York, 1958.
13.2 Walsh, J. L., Solution of the Dirichlet problem for the ellipse by interpolating harmonic polynomials. J. Math. Mech. 9, 193-196 (1960).
13.3 Curtiss, J. H., Interpolation with harmonic and complex polynomials to boundary values. J. Math. Mech. 9, 167-192 (1960).
SECTION
14. Conformal Mapping
14.1 Szegö, G., Über orthogonale Polynome, die zu einer gegebenen Kurve der komplexen Ebene gehören. Math. Z. 9, 218-270 (1921).
14.2 Bergman, S., Über die Entwicklung der harmonischen Funktionen der Ebene und des Raumes nach Orthogonalfunktionen. Math. Ann. 86, 237-271 (1922).
14.3 Bochner, S., Über orthogonale Systeme analytischer Funktionen. Math. Z. 14, 180-207 (1922).
14.4 Fekete, M., Über die Verteilung der Wurzeln bei gewissen algebraischen Gleichungen mit ganzzahligen Koeffizienten. Math. Z. 17, 228-249 (1923).
14.5 Carleman, T., Über die Approximation analytischer Funktionen durch lineare Aggregate von vorgegebenen Potenzen. Arkiv Mat. Astron. Fys. 17 (1923).
14.6 Nehari, Z., On the numerical computation of mapping functions by orthogonalization. Proc. Natl. Acad. Sci. U.S. 37, 369-372 (1951).
14.7 Kliot-Dashinskiy, M. I., On a method of solution of a plane problem of potential theory (Russian). Inzhen.-Stroit. Inst. Sb. Nauch. Trudov (Leningrad) 17, 11-27 (1954).
14.8 Behnke, H., and Sommer, F., Theorie der analytischen Funktionen einer komplexen Veränderlichen. Springer, Berlin, 1955.
14.9 Fekete, M., and Walsh, J. L., On the asymptotic behavior of polynomials with extremal properties and of their zeros. J. Anal. Math. 4, 49-87 (1955).
14.10 Walsh, J. L., Interpolation and Approximation by Rational Functions in the Complex Domain, rev. ed. Am. Math. Soc., Providence, Rhode Island, 1956.
14.11 Davis, P., Numerical computation of the transfinite diameter of two collinear line segments. J. Research Natl. Bur. Standards 58, 155-156 (1957).
SECTION
15. Quadratic Functionals Related to Boundary Value Problems
15.1 Trefftz, E., Ein Gegenstück zum Ritzschen Verfahren, in Proc. 2nd Intern. Congr. for Applied Mechanics, 131-137. Zurich, 1926.
15.2 Friedrichs, K. O., Die Randwert und Eigenwert Probleme aus der Theorie der elastischen Platten (Anwendung der Direkten Methoden der Variationsrechnung). Math. Ann. 98, 205-247 (1928).
15.3 Trefftz, E., Konvergenz und Fehlerschätzung beim Ritzschen Verfahren. Math. Ann. 100, 503-521 (1928).
15.4 Friedrichs, K. O., Ein Verfahren der Variationsrechnung das Minimum eines Integrals als das Maximum eines anderen Ausdruckes darzustellen. Nachr. Ges. Wiss. Göttingen Math.-Physik. Kl. I, 13-20 (1929).
15.5 Weber, C., Eingrenzung von Verschiebungen mit Hilfe der Minimalsätze. Z. angew. Math. Mech. 22, 126-130 (1942).
15.6 Courant, R., Variational methods for the solution of problems of equilibrium and vibrations. Bull. Am. Math. Soc. 49, 1-23 (1943).
15.7 Diaz, J. B., and Weinstein, A., Schwarz' inequality and the methods of Rayleigh-Ritz and Trefftz. J. Math. and Phys. 26, 133-136 (1947).
15.8 Pólya, G., Estimating electrostatic capacity. Am. Math. Monthly 54, 201-206 (1947).
15.9 Prager, W., and Synge, J. L., Approximations in elasticity based on the concept of function space. Quart. Appl. Math. 5, 241-269 (1947).
15.10 Topolyanskiy, D. B., On bounds for Dirichlet's integral. Priklad. Mat. i Mekh. 11 (1947).
15.11 Diaz, J. B., and Greenberg, H. J., Upper and lower bounds for the solution of the first biharmonic boundary value problem. J. Math. and Phys. 27, 193-201 (1948).
15.12 Diaz, J. B., and Greenberg, H. J., Upper and lower bounds for the solution of the first boundary value problem of elasticity. Quart. Appl. Math. 6, 326-331 (1948).
15.13 Diaz, J. B., and Weinstein, A., The torsional rigidity and variational methods. Am. J. Math. 70, 107-116 (1948).
15.14 Greenberg, H. J., The determination of upper and lower bounds for the solutions of the Dirichlet problem. J. Math. and Phys. 27, 161-182 (1948).
15.15 Greenberg, H. J., and Prager, W., Direct determination of bending and twisting moments in thin elastic plates. Am. J. Math. 70, 749-763 (1948).
15.16 Weinstein, A., New Methods for the Estimation of the Torsional Rigidity, Proc. 3rd Symposium in Appl. Mathematics (Am. Math. Soc.), 1949, 141-161. McGraw-Hill Book Co., New York, 1950.
15.17 Fichera, G., Sulla maggiorazione dell'errore di approssimazione nei procedimenti di integrazione numerica delle equazioni della Fisica Matematica. Rend. accad. sci. fis. e mat. Soc. nazl. sci. Napoli [4], 17, 1-8 (1950).
15.18 Fichera, G., Risultati concernenti la risoluzione delle equazioni funzionali dovuti all'Istituto Nazionale per le applicazioni del Calcolo. Atti. accad. nazl. Lincei, Mem. Classe sci. fis. mat. e nat. [8] 3, 3-81 (1950).
15.19 Fichera, G., On some general integration methods employed in connection with linear differential equations. J. Math. and Phys. 29, 59-68 (1950).
15.20 Funk, P., and Berger, E., Eingrenzung für die grösste Durchbiegung einer gleichmässig belasteten eingespannten quadratischen Platte, in Federhofer-Girkmann Festschrift, pp. 199-204. F. Deuticke, Vienna, 1950.
15.21 Mykhlin, S. G., Direct Methods in Mathematical Physics. Moscow, 1950.
15.22 Picone, M., and Fichera, G., Neue funktionalanalytische Grundlagen für die Existenzprobleme und Lösungsmethoden von Systemen linearer partieller Differentialgleichungen. Monatsh. Math. 54, 188-209 (1950).
15.23 Pólya, G., and Weinstein, A., On the torsional rigidity of multiply connected cross-sections. Ann. Math. [2] 52, 154-163 (1950).
15.24 Diaz, J. B., Upper and lower bounds for quadratic functionals, in Proc. Symposium on Spectral Theory and Differential Problems (Oklahoma Agri. and Mech. College), pp. 279-289. Stillwater, Oklahoma, 1951.
15.25 Diaz, J. B., Upper and lower bounds for quadratic functionals. Collectanea Math. (Seminario mat. de Barcelona) 4, 1-50 (1951).
15.26 Pólya, G., and Szegö, G., Isoperimetric Inequalities in Mathematical Physics, Ann. Math. Studies No. 27. Princeton University Press, Princeton, New Jersey, 1951.
15.27 Reitan, D. K., and Higgins, T. J., Calculation of the electrical capacitance of a cube. J. Appl. Phys. 22, 223-226 (1951).
15.28 Bertolini, F., Sulla capacità di un condensatore sferico. Nuovo cimento 9, 852-851 (1952).
15.29 Cooperman, P., An extension of the method of Trefftz for finding local bounds on the solutions of boundary value problems, and on their derivatives. Quart. Appl. Math. 10, 359-373 (1952).
15.30 Diaz, J. B., On the estimation of torsional rigidity and other physical quantities, 259-263 in Proc. 1st U.S. Natl. Congr. of Applied Mechanics (Am. Soc. Mech. Engs.), 1952.
15.31 Gross, W., Sul calcolo della capacità elettrostatica di un conduttore. Atti. accad. nazl. Lincei Rend. Classe sci. fis. mat. e nat. [8] 12, 496-506 (1952).
15.32 Mykhlin, S. G., The Problem of the Minimum of a Quadratic Functional. Moscow, 1952.
15.33 Slobodyanskiy, M. G., Estimate of the error of the quantity sought for in the solution of linear problems by a variational method. Doklady Akad. Nauk S.S.S.R. [N.S.] 86, 243-246 (1952).
15.34 Daboni, L., Applicazione al caso del cubo di un metodo per eccesso e per difetto della capacità elettrostatica di un conduttore. Atti. accad. nazl. Lincei Rend. Classe sci. fis. mat. e nat. [8], 14, 461-466 (1953).
15.35 Kato, T., On some approximate methods concerning the operators T*T. Math. Ann. 126, 253-262 (1953).
15.36 McMahon, J., Lower bounds for the electrostatic capacity of a cube. Proc. Roy. Irish Acad. 55A, 133-167 (1953).
15.37 Washizu, K., Bounds for solutions of boundary value problems in elasticity. J. Math. and Phys. 32, 117-128 (1953).
15.38 Weinberger, H. F., Upper and lower bounds for torsional rigidity. J. Math. and Phys. 32, 54-62 (1953).
15.39 Fichera, G., Methods of Linear Functional Analysis in Mathematical Physics, Proc. Intern. Congr. of Mathematicians, Vol. III, pp. 216-228. Amsterdam, 1954.
15.40 Fichera, G., Formule di maggiorazione connesse ad una classe di transformazioni lineari. Ann. mat. pura e appl. [4], 36, 273-296 (1954).
15.41 Daboni, L., Capacità elettrostatica di un condensatore sferico con apertura circolare. Atti accad. sci. Torino Classe sci. fis. mat. e nat. 89, 1-10 (1954-55).
15.42 Payne, L., and Weinberger, H. F., New bounds in harmonic and biharmonic problems. J. Math. and Phys. 33, 291-307 (1955).
15.43 Nicolovius, R., Abschätzung der Lösung der ersten Platten-Randwertaufgabe nach der Methode von Maple-Synge. Z. angew. Math. Mech. 37, 344-349 (1957).
15.44 Nicolovius, R., Beiträge zur Diaz-Greenberg-Methode. Z. angew. Math. Mech. 37, 449-457 (1957).
15.45 Synge, J. L., The Hypercircle in Mathematical Physics. Cambridge University Press, London and New York, 1957; see also the book review in Bull. Am. Math. Soc. 65, 103-108 (1959).
15.46 Duffin, R. J., Distributed and Lumped Networks. Research Project: Mathematical Analysis of Electrical and Mechanical Systems, Tech. Report No. 37. Dept. of Math., Carnegie Inst. Technol., Pittsburgh, Pennsylvania, 1958.
15.47 Payne, L., and Weinberger, H. F., New bounds for solutions of second order elliptic partial differential equations. Pacific J. Math. 8, 551-573 (1958).
15.48 Forsythe, G. E., and Rosenbloom, P. C., Numerical Analysis and Partial Differential Equations (Vol. 5 of Surveys in Appl. Math.). Wiley & Sons, New York, 1958 (see, in particular, pp. 145-146).
15.49 Diaz, J. B., Upper and lower bounds for quadratic integrals, and at a point for solutions of linear boundary value problems, in Boundary Problems in Differential Equations, pp. 47-83. University of Wisconsin Press, Madison, Wisconsin, 1960.
15.50 Golomb, M., and Weinberger, H. F., Optimal approximation and error bounds, in On Numerical Approximation (R. E. Langer, ed.), 117-191. University of Wisconsin Press, Madison, Wisconsin, 1959.
SECTION
16. Orthogonalization Codes and Computations
16.1 Clenshaw, C. W., Curve fitting with a digital computer. Computer J. 2, 170-173 (1960).
16.2 Efroymson, M. A., Multiple regression analysis, in Mathematical Methods for Digital Computers (A. Ralston and H. S. Wilf, eds.), 191-203. Wiley & Sons, New York, 1960.
16.3 Sciama, A., Adaption de la méthode des moindres carrés aux machines à calculer électroniques. Chiffres 2, 157-180 (1959).
16.4 Rutishauser, H., Notes on Application of ALGOL (the proposed International Algebraic Language) to Numerical Analysis, in Lectures Presented at the Summer Conference on Frontier Research in Digital Computers, pp. 27-30. Computation Center, Consolidated University of North Carolina, Chapel Hill, North Carolina, 1959.
SECTION
18. Comments on the Numerical Experiments
18.1 Bergman, S., Punch-card machine methods applied to the solution of the torsion problem. Quart. Appl. Math. 5, 64-79 (1947).
18.2 Reynolds, R. R., The Dirichlet problem for multiply connected domains. J. Math. and Phys. 30, 11-22 (1951).
18.3 Vodička, V., Elementare Fälle des Dirichletschen Problems für elliptische Gebiete der Ebene. Z. angew. Math. u. Phys. 8, 309-313 (1957).
18.4 Davis, P., Numerical computation of the transfinite diameter of two collinear line segments. J. Research Natl. Bur. Standards 58, 155-156 (1957).
18.5 Poritsky, H., and Danforth, C. E., On the Torsion Problem, Proc. 3rd U.S. Natl. Congr. of Applied Mechanics, pp. 431-441. New York, 1958.
18.6 Bergman, S., and Herriot, J., Application of the Method of the Kernel Function for Solving Boundary Value Problems, Tech. Report No. 8. Dept. Appl. Math., Stanford University, Stanford, California, 1960.
The authors wish to thank H. F. Weinberger for the references to Section 15.
Microelectronics Using Electron-Beam-Activated Machining Techniques*
KENNETH R. SHOULDERS
Stanford Research Institute, Menlo Park, California
* This article was prepared with the sponsorship of the Information Systems Branch, Office of Naval Research, under contract Nonr-2887(00). Reproduction in whole or in part is permitted for any purpose of the United States Government.

I. Introduction
    A. General Concepts
    . . .
    Environment
    E. Heating Effects
    F. Cathode Formation and Properties
    G. Vacuum Encapsulation
    H. Component Lifetime
    I. Solid State Tunnel Effect Devices
V. Accessory Components
    A. Secondary Emission Devices
    B. Light Detectors
    C. Light Generators
    D. Micro Document Storage
    E. Electrostatic Relays
    F. Electromechanical Filters
VI. Component Interconnection
    A. Vacuum Tunnel Effect Amplifiers
    B. Memory Devices
    C. Electromechanical Components
    D. Steerable Electron Guide
    E. Plasma System
VII. Substrate Preparation
    A. Mechanical Forming
    B. Substrate Cleaning
    C. Substrate Smoothing
    D. Terminals
    E. Substrate Testing Methods
VIII. Material Deposition
    A. Thermal Evaporation
    B. Substrate Heater
    C. Reactive Deposition
    D. Single Crystal Growth
    E. Instrumentation Methods
IX. Material Etching
    A. Molecular Beam Etching
    B. Atomic Beam Etching
    C. Ion Beam Sputtering
    D. Depth Control
X. Resist Production
    A. Evaporated Resists
    B. Chemical Decomposition Resists
    C. Multilayer Resist Production Methods
    D. Compatibility with Electron Optical System
XI. Electron Optical System
    A. Micromachining Mode
    B. Scanning Electron Microscope
    C. Scanning X-Ray Fluorescence Probe
    D. Mirror Microscope
    E. Multiple-Cathode Field-Emission Microscope
    F. Pattern Generator
    G. Construction Details
XII. High-Vacuum Apparatus
    A. Requirements
    B. High-Vacuum Apparatus
    C. Ultrahigh-Vacuum System
    D. Integration of Apparatus
XIII. Electron Microscope Installation
XIV. Demonstration of Micromachining
    A. Substrate Preparation
    B. Film Deposition
    C. Resist Exposing
    D. Etching
    E. Discussion of Results
XV. Summary
    A. Microelectronic Component Considerations
    B. Tunnel Effect Components
    C. Accessory Components
    D. Component Interconnection
    E. Substrate Preparation
    F. Material Deposition
    G. Material Etching
    H. Resist Production
    I. Electron Optical System
    J. Vacuum Apparatus
References
I. Introduction
A. General Concepts
It has become increasingly apparent that electronic data processing machines have so far been applied to only a few relatively simple problems, compared to the vast array of complex problems of the future. It is conceivable that in only a few machine generations, we will be aided by complex machines designed to perform economically many of the functions of an intelligent technician. Machines of the kind that we postulate might well require over 10^11 electronically active components, a number comparable in size to that of highly developed biological systems.
The greatest single obstacle to fabricating machines complex enough to cope with future problems is our inability to organize matter effectively. Many pertinent basic electronic effects are now known and usable; more will undoubtedly become accessible. To use these effects for the construction of complex systems, processes permitting the manipulation of matter must proceed at video frequency rates with due regard to the specialized properties of electronic materials.
If we are to build machines to permit performance approaching, and perhaps exceeding, that of biological systems, why not attempt to use biological building methods and biological information-processing methods? After all, the processes do work, and they can do so in a garbage can without supervision. Nevertheless, it is suggested that there are other means within the scope of our developing technology that are more suitable to our present
and future needs. Let us examine a few of the major differences between biological organisms and our proposed electronic systems.
We would like to remove the severe environmental limitations of biological systems. Preferably, our proposed machines should be able to function in a vacuum, in corrosive atmospheres, in a furnace, in conditions of strong nuclear radiation, and under the forces induced by high acceleration or any other disorganizing conditions. Obviously, such machines should give superlative performance under what is presently considered “normal” environmental conditions, for example, in such an important matter as long-lived stability. Furthermore, such environmental immunity should permit efficient coupling to the new and more energetic energy sources that are becoming available. The rugged environment suggested above would eliminate most, if not all, biological processes. There are, however, electronic effects and materials that can survive and function in such an environment.
We would like to pass on, from one generation of machine to the next, an organization that has been acquired or “learned” by the previous generation. In the higher biological species, offspring must learn by personal experience much of the information accumulated by their predecessors. In man, the learning process may take a third of man’s useful life-span. It would appear advantageous for our projected electronic machine to have built in, initially, a considerable amount of previously acquired organization; it should also have the additional capability of reorganization and expansion. This imposes on the building process a necessary capability for passing detailed data on to the machine that is being built. Data in the form of instructions as to components, interconnections, and states appears to be needed; a previously organized machine would serve as a model.
From the foregoing considerations, important constraints have been imposed on the selection of building processes, usable electronic effects, and materials. The process must be such that all the fabrication data of a successful predecessor can be communicated to the successive generations of machines, in an economical length of time and with a minimum of human supervision. The fabrication processes themselves should be rapid enough to be able to take advantage of the very high rates of electronic communication now available. It appears mandatory to use parallel processing, that is, processing that would involve the fabrication and interconnection of many components simultaneously.
The ability to function in extremely severe environments demands the selection of refractory metals and dielectrics possessing the highest binding energies. The properties of these materials make normal mechanical or thermal methods of fabrication for complete machines too difficult to cope
with; subtle chemical reactions, which can replace “brute force” mechanical working or thermal burning, are required. Cleanliness, in extreme form, is imposed as a necessary condition for reliable long-term operation. Such cleanliness of process can be achieved at present only in ultrahigh vacuum in which material is transported in elemental or vapor form during fabrication.
What size should our proposed machine be? Let us assume that the desired 10^11 electronic devices are built layer by layer on two-dimensional surfaces; then the average size per device is determined principally by the resolution of the process that can selectively localize necessary materials. The process to be presented here has achieved a resolution of several hundred angstrom units. From this consideration, and others described later, the size of the ultimate machine appears to lie in the range between one and several cubic inches.
One may speculate on the implications of the portability of such a complex machine. It would appear to be possible for it to accompany a human, acting as a highly organized clerk, in order to aid him in solving complex problems with unmatched speed and precision at the site of the problem. Direct communication between man and machine, in a language understood by man, through optical, acoustical, or other means compatible with the human senses, would be essential to surmount the artificial barriers usually imposed.
Looking beyond the boundaries of this program, it becomes apparent that ultimately we would wish to divorce ourselves from the limitations imposed by the low binding energies of solid materials. We require organization of matter to effect order, permitting growth in our intelligence machine; at the same time, increased operating rates imply increased temperatures, with an ensuing tendency to disorganize.
To surmount the low binding energies of solids, it may be possible to construct plasma-like structures in which elementary particles are organized spatially and temporally in dynamic equilibrium, under the influence of applied fields. Several aspects of the present program can be extended to form a natural transition into this postulated class of machine.
In summary, we would like to build machines operating at electronic speeds, almost completely immune to the effects of a harsh environment, with components and versatility sufficient to attain a critical degree of organized complexity such that the machine itself may participate in the further development of ever more powerful machines.
B. Background of the Present Program
This report is a qualitative and semiquantitative discussion of the various problems that are being dealt with in an effort to explore the possibilities and limitations of a class of integrated microelectronic devices, systems, and
fabrication methods. It is hoped that this review will help others select a similar field of endeavor and that some of the devices and techniques discussed here will have direct application. It is recognized that this project is a very ambitious one and that the end goal will not be reached for some time. On the other hand, the author does not feel compelled to deviate from the course he has taken of pursuing the end goal, rather than directing effort toward interim devices and methods; the latter approach would surely lengthen the time required to attain the more desirable goal.
In the beginning of the writer’s work in this field, vacuum material manipulation methods were used to fabricate resistors, capacitors, inductors, vacuum tubes having secondary emission and field emission cathodes, photoconductors, and electrostatic relays. Masking methods were used with conventional thermal evaporation sources, and the resolution of these masks was carried to an extreme degree in order to investigate the properties of the microelectronic devices. No important deviations from the expected characteristics of the devices were found due to scaling down the size, but many material deposition limitations of masking and thermal evaporation methods resulted in poor over-all results for the components.
Various unsuccessful methods were tried in an effort to overcome the masking limitations. Among these were the application of ion beam techniques, xerographic methods, optically produced thermal effects, and optically produced chemical resists in vacuum. The use of electron beams became the most promising method of producing the desired localized images on a surface during the micromachining mode and also for the analysis of various effects produced on the surface.
Among the first methods tried, an electron beam was used to burn through a low-temperature masking material such as condensed carbon dioxide, but the energy density requirement was very high and breakdown of adjacent areas resulted from inadvertent electrical discharge. Later, oil films were polymerized on various surfaces in a fashion described by Carr [1] and the resulting composite was developed in vapors of substances such as mercury and zinc. Although interesting patterns could be produced, they had very limited electronic application. It was then discovered that silicone pump oils could be decomposed to silica by the action of an electron beam, and that this resist layer of silica could serve to protect an underlying film of material during a subsequent etching process. Eventually, this etching process was carried out in a vacuum chamber at reduced pressure and became compatible with the deposition and resist production operation.
At the present time the basic micromachining process is this: to deposit a material to be selectively removed later; to deposit a thin layer of resist-producing material and to expose this with electrons in some desired
pattern; then finally to etch the first layer of material with a molecular beam of an element that makes a volatile compound of the material being machined. Refinements in apparatus and techniques have produced resolutions in excess of 100 Å for the process, with very short exposure times at low current density for the electron beam.
Vacuum apparatus was developed for the processing of the chemicals that had been introduced into the system. This consisted essentially of providing very high bakeout temperatures for the apparatus and using materials that were not affected by the chemicals. A great deal of mechanical manipulation is used in the various phases of the process. To introduce motion it has been found advantageous to use differential pumping methods in which the ultrahigh-vacuum chamber is surrounded by a vacuum of mm Hg in which O-rings for the many entry ports can be used. The inner ultrahigh-vacuum chamber, on the other hand, which can be raised to temperatures of 900°C, does not require the use of such seals. In this way the manipulation can be introduced into the ultrahigh-vacuum chamber through small holes without undue leaks or contamination from organic materials.
The design of an electron optical system has been undertaken with the aim of providing the many functions of analysis and micromachining that are needed without impairing the ultrahigh-vacuum condition desirable throughout the processing of a system of electronic components.
Deposition equipment has been improved; it is nearing the point where knowing what to do with it is of more concern than further improvements. The chief effort in the deposition equipment design is in integrating it properly with the vacuum apparatus and other equipment so that it is not limiting their performance. The design of a mass spectrometer is being undertaken, and upon completion it will be incorporated in the process control of the deposition apparatus.
After working with and considering many components of the film type (such as cryotrons, magnetic devices, and semiconductors) for application to microelectronic systems, it appears that devices based upon the quantum mechanical tunneling of electrons into vacuum (field emission) possess many advantages. They seem relatively insensitive to temperature variations and ionizing radiation and they permit good communication with other similar components in the system as well as with optical input and output devices. The switching speeds seem reasonably high, and the devices lend themselves to fabrication methods that could economically produce large uniform arrays of interconnected components. These components are based on the phenomenon of field emission into vacuum, which has been under investigation for many years by competent people and has a firm scientific
basis. In our future discussion of the basic emission process the term “field emission” will usually be used, and the complete component will be referred to as a “tunnel effect component.”
At present the phase of integrating many isolated bits of thought and apparatus has been reached. The isolated bits seem to fit together in an interesting pattern; however, only by actually going through the motions of physical tests can the complete plan be verified or modified to accomplish the desired end. To accommodate unforeseen changes in any of the various aspects of the program, the maximum flexibility in equipment design has been sought. Flexibility tends to increase the initial effort and cost, but this should be the most economical course in the long run.
The author would like to acknowledge the invaluable technical and moral support of his many associates through the past years, as well as the financial support of Stanford Research Institute; the Information Systems Branch of the Office of Naval Research; the Electron Tubes Division, Techniques Branch, of the U. S. Army Signal Research and Development Laboratory; and the Solid State Branch of the Air Research and Development Command, Air Force Cambridge Research Center.
C. Objectives
The general objectives in this area of research are to develop the proper apparatus and techniques for electron-beam-activated micromachining and to apply these tools to the fabrication of integrated groups of electronic components. Ultimately, such a program, aided by logic and system studies, will lead to the fabrication of complete machines that could be used for real-world general pattern recognition and real-time language translation, and that should have problem-solving and decision-making capabilities which are not available in present machines.
The specific objectives for this program would be first to obtain the proper tools for electron-beam-activated micromachining, such as the electron optical system, ultrahigh-vacuum apparatus, and mass spectrometer controlled deposition and etching apparatus. Following the development of these tools, the techniques for micromachining would be developed to an advanced stage while being applied to the fabrication of small groups of selected electronic components. The components to be developed would be selected to integrate well with themselves, their construction process, and the microscopic size range in which they must function. These components, based largely upon the quantum mechanical tunneling of electrons into vacuum, would seem to suffice for high-speed computer switching and memory, provide low noise amplification and communication filtering, and
give optical coupling means in and out of the system. The specific objectives are listed as follows:
(1) Provide an electron optical system capable of analyzing the physical, chemical, and electrical properties of microscopic components so that they can be optimized without undue difficulty in spite of their reduced size. This electron optical system would consist of a mirror microscope, scanning electron microscope, and X-ray fluorescence probe. The mirror microscope would be capable of voltage measurements down to 0.2 volts and have a maximum resolution of approximately 500 Å. The scanning microscope would have a resolution limit of 100 Å. The X-ray fluorescence probe would have the ability to analyze 10^-13 grams of material to 1% accuracy.
(2) Integrate the electron optical system with mass spectrometer controlled deposition and etching apparatus in an ultrahigh-vacuum system capable of reaching 10^-10 mm Hg in one-half hour or less.
(3) Devise vacuum tunnel effect devices of micron size with switching times in the 10^-10 sec region which (a) operate at about 50 volts, (b) have high input impedance, (c) are insensitive to temperature effects up to 1000°C, (d) are insensitive to ionizing radiation effects up to the limits of the best known dielectric materials, and (e) have a useful lifetime of many hundreds of years.
(4) Devise light sensing devices of fractional micron size based upon photoelectron emission from metal-dielectric matrices under high electric field, which have a frequency response in the range of 100 Mc/sec, and which have environmental immunity similar to the tunnel effect devices.
(5) Devise cathodoluminescent light generating sources of fractional micron size based upon the stimulation of a refractory phosphor by field-emitted electrons.
(6) Devise electromechanical devices using microscopic diaphragms operated by electric fields in the 10^7 volt/cm region for low-voltage relays and communication filters.
(7) Devise modular groups of interconnected vacuum tunnel devices which take advantage of the large dynamic range of 10^10 in operating current to permit very low power dissipation in the quiescent operating states. These circuits would be specifically designed to eliminate low-value plate resistors in active memories in order that the power dissipation levels of both states of two-state devices are approximately 10^10 below the dissipation level during switching.
(8) Investigate the limitations of periodically focused, electrically steerable, electron beam guides to provide interconnections between active components which will be less lossy than submicron sized wires, and
to reduce power requirements and heating effects by conducting an electron bunch or pulse through relatively long paths, giving up energy only upon termination; making use of the interconnection flexibility such that a considerable amount of logic can be performed by interconnection control rather than by dependence on the logical properties of the active devices only.
(9) Develop self-formation methods for active components in which a chemical process, during formation, is controlled by a significant electrical property of the component in such a way that the component characteristics are automatically modified to conform to some previously specified range of operation.
(10) Develop a document storage system by using electron-beam-activated micromachining processes to store information in the form of 10^10 defined areas on a one-inch-square glass plate, each area representing ten optical intensity levels, to be read by fractional micron-sized light-sensing devices.
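As a rough check on objective (10), the figures quoted there imply a spacing between stored areas and a total information capacity. The sketch below is illustrative arithmetic only, not part of the program described: it assumes that the 10^10 areas lie on a square grid filling the whole one-inch plate, and the function name `storage_plate` is hypothetical.

```python
import math

INCH_IN_MICRONS = 25_400.0  # 1 inch = 25.4 mm


def storage_plate(areas=1e10, levels=10, side_um=INCH_IN_MICRONS):
    """Return (pitch in microns, capacity in bits) for a square plate.

    Assumption (not from the text): the areas form a square grid that
    covers the entire plate with no guard bands.
    """
    areas_per_side = math.sqrt(areas)    # grid points along one edge
    pitch_um = side_um / areas_per_side  # center-to-center spacing
    bits = areas * math.log2(levels)     # each area holds log2(levels) bits
    return pitch_um, bits


pitch_um, bits = storage_plate()
print(f"pitch per area: {pitch_um:.3f} micron")
print(f"capacity: {bits:.2e} bits")
```

Under these assumptions the spacing comes out at roughly a quarter of a micron, which is consistent with the requirement that the plate be read by fractional micron-sized light-sensing devices.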
D. Time Schedule for Various Objectives
The work outlined above will not contribute to the next generation of machines made from micromodule components. In the following generation of devices, employing solid state integrated techniques, there may be some slight interaction with electron-beam-activated machining methods such as the selective masking of semiconductor surfaces for control of diffusion; the various electron beam analysis tools may also contribute information about the microscopic behavior of diffusion processes and give data on surface effects. The third generation of machines would seem to be the one to which these techniques and devices would begin to contribute strongly.
A great deal of effort is needed to carry this program to the point of showing completed systems being fabricated in a few hundred hours, but one thing that should become evident with such an electronic construction process is that there would be a very short delay between engineering and manufacture of electronic systems made by these methods. No tooling up is needed for this electronic material handling process other than the construction of additional identical processing chambers and auxiliary control equipment.
II. Research Plan
The problems listed in the outline that follows represent most of the areas of investigation that were indicated at the beginning of this work. Initially, most of the scientific problems were worked on in a superficial
or qualitative fashion in order to explore economically the wide latitude of problems and efficiently converge on desired results. This exploration has removed the need for work in a great many areas and has shown the confines of the experimental work, which in turn allows the design and construction of integrated equipment without the fear of obsolescence. Future phases of the work will begin to fill in scientific gaps and generate new problems which cannot be seen in this first phase. The following paragraphs detail in summary form the general work plan:
A. General Component Considerations
1. Application of scaling laws to determine optimum size range for various electronic components for complex data processing systems.
2. Investigation of manufacturing feasibility for various one-micron-sized electronic components.
B. Tunnel Effect Component Design and Interconnection
1. Component Design
(a) Determination of area of application for solid state and high-vacuum devices.
(b) Determination of optimum impedance for vacuum tunnel effect devices and devising of a geometry that optimizes gain and bandwidth while minimizing current intercepted by control and screen grids.
2. Interconnections and Circuits
(a) Design of low-loss circuits having negligible quiescent power dissipation.
(b) Investigation of direct coupling, secondary emission coupling, and optical coupling methods to provide minimum constraints for logic operations.
(c) Investigation of periodically focused electron beam guides to be used as electrically steered communication paths in microelectronic systems.
(d) Investigation of electron beam parametric amplifiers for low-noise amplification in microelectronic systems.
3. Fabrication
(a) Determination of stability of metal-dielectric combinations under operating conditions involving high thermal shocks and intense electron bombardment.
(b) Investigation of encapsulation effectiveness using thin films made by reactive deposition methods.
(c) Investigation of self-formation methods for vacuum tunnel effect devices having multiple-tip cathodes.
4. Test Methods
(a) Use of field emission microscopy and field ion microscopy to determine physical, chemical, and electrical properties of materials used in tunnel effect devices.
(b) Development of electron mirror microscopy to show operating voltages and switching waveforms of microelectronic components.
C. Accessory Component Design
1. Investigation of high current density, low voltage secondary emission surfaces for tunnel effect component interconnection.
2. Investigation of photoelectron-emitting light detectors coupled with current amplifying secondary electron multipliers.
3. Investigation of light emission from field emitted electron stimulated phosphors.
4. Investigation of microdocument storage system using fractional wavelength optical reading methods.
5. Investigation of electrostatic electromechanical relays.
6. Investigation of electrostatically driven electromechanical filters.
D. Substrate Considerations
1. Materials Determination: determination of materials to be used and the optimum shape and size of the substrate.
2. Mechanical Preparation Methods
(a) Investigation of economical limits to grinding and polishing operations.
(b) Study of the effects of vacuum firing for normalizing polished surfaces.
(c) Investigation of porosity of raw samples.
3. Terminals
(a) Study of the termination of thin films to gross terminals and their encapsulation to prevent corrosion at high temperature.
(b) Investigation of sturdy strain-free lead wire connections to substrate.
4. Cleaning and Smoothing
(a) Optimization of ultrahigh-vacuum electron beam cleaning and smoothing methods for substrate preparation and intralayer smoothing.
(b) Application of test methods to indicate faulty surface.
E. Material Deposition
1. Thermal Evaporation: determination of methods for obtaining the maximum cleanliness of deposits.
2. Reactive Deposition
(a) Determination of smoothness, crystalline properties, and electrical properties of various metals, metal carbides, borides, oxides, nitrides, silicides, and sulfides.
(b) Investigation of diffusion characteristics between adjacent layers of material.
3. Crystal Growth
(a) Investigation of single crystal growth in presence of sweeping thermal gradient on substrate with and without reversible chemical reactions.
(b) Investigation of reduction of crystalline size by alloying and admixing materials so as to obtain uniform and stable polycrystalline materials.
4. Instrumentation
(a) Design and construction of X-ray fluorescence probe film thickness monitor.
(b) Design and construction of ion gauge evaporation rate controllers.
(c) Design and construction of rf mass spectrometer for evaporation rate monitors, etching control, and destructive analysis of film materials.
(d) Utilization of field emission and field ion microscopy to determine cleanliness of deposits, activation energy, and stabilities of materials.
F. Material Etching
1. Molecular Beam Etching
(a) Fundamental study of low pressure surface chemical reactions so as to allow prediction of proper etchants and temperatures.
(b) Study of surface migration of etchant with intention of controlling undercutting of resist and material redistribution.
(c) Investigation of multiple component etchants to increase vapor pressure of material being etched.
(d) Investigation of side reactions produced with multiple component etchants.
(e) Study of etching rate of various crystal faces and methods of causing uniform etching by selective adsorption of different materials.
(f) Investigation of depth control methods with and without chemical barriers.
2. Atomic Beam and Sputtering Etch Methods
(a) Determination of effect of molecular beam etch at a surface bombarded by a high current density electron beam which produces atoms and ions.
(b) Study of electrical property damage resulting from sputtering and atomic beams.
3. Instrumentation
(a) Design and construction of high temperature substrate heater that is chemically nonreactive.
(b) Design and construction of accurate temperature regulating devices to control substrate heat.
G. Resist Production
1. Classification: classification of electron sensitive reactions and application of the ones best suited to micromachining.
2. Efficiency Limitations
(a) Study of polymerization mechanism in simple reactions and electron multiplication process in multilayer complex reactions.
(b) Analysis of maximum attainable efficiency vs. resolution in optimum process.
(c) Study of electron absorption process in resist-producing materials.
(d) Study of efficiency as a function of temperature and electric field.
3. Resolution Limitations
(a) Investigation of electron scatter processes and X-ray production and absorption as fog-producing mechanisms.
(b) Study of material migration due to thermal and field enhanced mechanisms.
(c) Investigation of etch-back methods to reduce background fog effects.
4. Integration Investigations: development of optimum resist-production methods which integrate well with other system functions, such as noncontamination of lens and film materials and the adaptability of resist materials to vacuum handling methods.
H. Electron Optical System: determination of the system most suited to micromachining and analysis, which includes making a comparison between magnetic and electrostatic instruments, cylindrical vs. spherical lens elements, and a selection of optimum electron velocities for various modes of use.
1. Resolution and Sensitivity Limits
(a) Determination of resolution and contrast limit for scanning microscope.
(b) Determination of accuracy, sensitivity, and resolution limit for X-ray fluorescence probe.
(c) Study of voltage sensitivity, bandwidth, and resolution limit of mirror microscope.
(d) Determination of resolution limit for multiple field emission cathode imaging microscope.
2. Distortion Limitations
(a) Investigation of distortion effects which would limit uniformity of field.
(b) Determination of maximum number of resolvable bits per field.
(c) Study of mechanical and electrical instability which would limit reproducibility and registration.
(d) Study and suppression of contamination effects and stray charging.
3. Accessory Apparatus
(a) Design and construction of combined X-ray detector and electron multiplier.
(b) Design and construction of electrically operated substrate micromanipulator.
(c) Design and construction of pattern generator for micromachining.
(d) Assembly of monitor console to integrate various display functions.
I. Equipment Integration
1. Ultrahigh-Vacuum Apparatus
(a) Design and construction of rapid-access equipment capable of being baked at high temperature to remove chemical residue between deposition and encapsulation operations.
(b) Design and construction of contamination-free pumps for ultrahigh vacuum.
2. Valves: design and construction of large diameter, ultrahigh-vacuum valves for separation of various operations having volatile reaction products.
3. Manipulation: design and construction of multichannel manipulator having great operational flexibility.
III. Microelectronic Component Considerations
A. General Considerations
1. CONSTRUCTION PROCESS CONSIDERATIONS
Our over-all component size is constrained by the construction techniques to be within the limits of one-tenth of a micron and two microns. The lower limit is set by the resolution of the machining process. The upper limit is indirectly determined by the maximum thickness to which films can be deposited with desirable properties. Films thicker than a few microns can be obtained, but in this size range mechanical forces come into play which tend to cause severe strain at the film boundaries unless the thermal expansion coefficients of the various materials are matched. These strains
may not result in peeling of the films, but they enhance the diffusion of foreign material along boundaries.
The dimensional aspect ratio of active devices is predominantly controlled by consideration of the impedance level, both of the device itself and of the coupling means between similar components. For semiconductor and vacuum devices, this aspect ratio usually approaches unity. Thus, since the component thickness will approximate one micron, the width and length will also approach one micron, and the resolution of the machining process must considerably exceed this value so as to produce the detailed structure needed.
When the component size approaches the resolution of the construction process there is a severe problem in obtaining uniformity. At times it may be possible to couple a self-forming process with the principal construction process and effect an improvement in over-all resolution without complicating the primary process. An example of this in normal machine processes is the use of electropolishing to remove burrs caused by an inexact machine tool cutting process; the electropolishing does not necessarily affect the basic dimensions.
Some components seem inherently simple in geometry but require properties of materials which are either difficult to obtain or are unstable. Other components require simple material properties, but have complicated geometry. A transistor is an example of the first class and a transformer or saturable reactor is an example of the second class. An ideal component taxes neither the geometry nor the materials. One reason for the selection of vacuum tunnel effect devices is that they seem to fit both of these requirements better than any class of component examined thus far. A single metal and a single dielectric, chosen particularly for their stability, seem to be the only materials needed to give a large range of electronic effects.
The geometry is extremely simple because the device consists principally of the termination of the various wires coming into the active area, as shown in Fig. 1.
A choice is always possible between fabricating two-state or multilevel devices. The multilevel devices usually require a higher resolution construction process for any particular over-all size. The stability requirements for the materials may also be higher, but careful analysis of each case seems in order, because occasionally a two-state device design has the latent possibility of serving as a multilevel device with the same stability, and is therefore not being used in an optimum fashion. For the present, we will be concerned only with two-state devices.
2. SCALING OF ELECTRICAL PROPERTIES
This program is investigating the reduction of size of electronic components by about three orders of magnitude, namely, from one millimeter
to one micron. This reduction in size will bring about changes in the behavior of materials, components, and the manner in which the components are used.
(a) Transmission Line Loss. One of the more unfortunate effects in reducing the size of electronic systems is the increased loss experienced by transmission lines and tuned circuits. Assuming no change in resistivity, a transmission line made from a suitably stable material like molybdenum (3000 Å in diameter, one inch long) has 100,000 ohms dc resistance. Materials like copper or silver, having lower resistivity, are considered too unstable for this purpose. The 3000 Å diameter is the largest convenient wire size that could be used with a 10,000 Å, or one-micron-sized, component. Any calculation of line loss should consider the possibility of high temperature operation of line sections resulting from energy dissipation during component switching.
If conventional transmission lines are to be used for interconnecting microelectronic components, then the components should have a high input impedance to avoid excessive current in the line and the attendant loss. There is no rigorous relation between the input impedance of a component and its output impedance, but the two are not usually independent. Thus most semiconductor and magnetic devices operate, input and output, at relatively low impedance levels, while a few semiconductor and most vacuum devices operate at high impedance levels. With the high resistance transmission lines described above, the choice of high impedance level devices becomes mandatory if line loss is not to be excessive. For example, if one component is to couple with another located 10,000 component diameters away, approximately one inch, then the input impedance of the most distant component should be at least 100,000 ohms, considering only dc losses in the transmission line. Higher input impedances would be required if ac losses were included.
(b) Q of Circuits and Filtering.
The Q of tuned circuits is drastically reduced when the components are scaled down uniformly by a factor N. The resistance R varies as 1/N, the capacitance C and inductance L vary as N, the product LC varies as N^2, and the product RC is constant. The Q of a circuit can be defined as ωL/R. If the frequency ω remains the same, then the Q scales down as N^2. Thus if all dimensions for an inductor are reduced from 1 centimeter to 1 micron (a scale change of 10^-4), the Q of the inductor would be reduced by a factor of 10^8, for the same frequency. It will be difficult indeed to perform filtering operations in the conventional manner unless other energy storage mechanisms are used.
RC filters are not affected by scaling their size, but they require stable gain elements to achieve stable over-all performance. This stability may be achieved by degeneration of active elements up to the limit of stability
imposed by the temperature sensitivity of the resistance elements. In scaling down a resistor all dimensions are assumed to be reduced, but in fact present day resistors are already films, so that it would be difficult to scale the thickness. The net result is that microresistors will either have to have very low resistance values, and thereby couple poorly to the active elements, or they will have to be made from materials of high resistivity. An unfortunate circumstance of high-resistivity materials is that they are usually temperature sensitive to a very high degree, thus causing large changes in filter characteristics when used in an RC network. Similarly, reactances, capacitive and inductive, may be produced by active semiconductor devices, but these devices also exhibit considerable temperature instability.
The very fast thermal time constants of micro devices could temperature-modulate filters in the megacycle region, and if the filter is temperature sensitive a spurious signal results; in addition, a drift is caused by slow heating.
A useful filter for the micron size range can be devised, using bistable switching elements in a data processing unit which handles the incoming signals as digital data after appropriate time sharing and quantization circuits have sliced up the signal. These methods are not very good for handling small signals in the presence of large ones and may have to be coupled with drift-prone RC filters to improve the over-all results.
Mechanical and acoustical filters scale nicely into the micron size range for radio frequencies. The most important requirement for these devices is that they be mechanically uncoupled from the solid lattice that surrounds them. This problem is somewhat equivalent to the vacuum encapsulation problem, where a membrane must be suspended free of the lattice over most of its area.
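The scaling rules for tuned circuits stated at the start of (b) can be checked numerically. The following is a minimal sketch, not part of the original text: the starting component values and the function names `scale_rlc` and `q_factor` are hypothetical, and only the ratios between the scaled and unscaled circuit are meaningful.

```python
import math


def scale_rlc(r_ohm, l_henry, c_farad, n):
    """Apply the scaling rules quoted in the text for a linear scale factor n
    (n < 1 shrinks the circuit): R varies as 1/n, L and C vary as n."""
    return r_ohm / n, l_henry * n, c_farad * n


def q_factor(omega, l_henry, r_ohm):
    """Q of a tuned circuit, defined in the text as omega * L / R."""
    return omega * l_henry / r_ohm


# Hypothetical starting circuit; the absolute values are assumptions.
R0, L0, C0 = 1.0, 1.0e-6, 1.0e-9
OMEGA = 2.0 * math.pi * 1.0e6   # operating frequency held fixed
N = 1.0e-4                      # 1 centimeter reduced to 1 micron

R1, L1, C1 = scale_rlc(R0, L0, C0, N)

q_ratio = q_factor(OMEGA, L1, R1) / q_factor(OMEGA, L0, R0)   # varies as N^2
rc_ratio = (R1 * C1) / (R0 * C0)                              # constant
lc_ratio = (L1 * C1) / (L0 * C0)                              # varies as N^2

print(f"Q ratio  = {q_ratio:.3g}")
print(f"RC ratio = {rc_ratio:.3g}")
print(f"LC ratio = {lc_ratio:.3g}")
```

Under these assumptions the Q ratio equals N^2, which for N = 10^-4 is the factor of 10^8 reduction quoted above, while the RC product is unchanged.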
Effective driving and pickup transducers for these mechanical filter elements can be electric field operated, being quite effective when subjected to bias fields of about 10^7 volt/cm. Ultimately some of the molecular resonance and adsorption properties of materials may come into use as filters, but none have been investigated in any detail.
(c) Optimum Time Constant. On first examination it would appear that scaling down all dimensions of active components would decrease the RC time constant and therefore increase the system operating speed almost indefinitely. There is, however, a lower limit to the resistance set by efficient energy transfer requirements, i.e., the resistance should not be less than the maximum transmission line resistance between two interconnected devices. Furthermore, in most active devices the current rises faster than linearly with applied field, and the resistance scales down more rapidly than linearly. Present semiconductor tunnel diodes represent a component that exhibits
too low an impedance for a system composed of one-micron-sized devices. Although the vacuum tunnel tetrode, which is discussed in a later section, tends to be a high impedance device, it may be possible to reduce its impedance level to around the optimum of 100,000 ohms. Such a micron-sized device will have an interelectrode capacitance of approximately 10^-16 farads, yielding a time constant of about 10^-11 sec. This value appears optimum when based upon efficient energy transfer between components one inch apart. It is further to be noted that a one-inch transmission line, operating with a velocity of propagation approaching that of light, has a 10^-10 sec delay, a time compatible with the switching time of the above device.
(d) Heating Effects. The integration of a large number of components into a closely packed structure introduces thermal coupling between elements to a degree not experienced in large-scale systems. Thus, if a number of devices are to operate simultaneously, they must either operate at considerably reduced power density, be spaced sufficiently far apart, or the circuit design must be such that large numbers of adjacent devices will never be energized simultaneously.
Generally, high power density is required for high operating speed. Furthermore, two-state devices of low operating power density usually are quite temperature-sensitive. For example, the cryotron is very temperature sensitive and is also a very low power density switching device; ferroelectric and ferromagnetic devices, which operate at medium power density, are fairly temperature sensitive; the power density and temperature sensitivity of semiconductor devices depend somewhat on the width of the energy gap of the material used.
The vacuum tunnel effect device appears to be the only device which can satisfy most of the heating requirements. It operates at a very high power density and up to temperatures barely under those which destroy the refractory materials from which it is made.
Individual micron-sized devices, having a large surface to volume ratio, and made of refractory materials, such as molybdenum and aluminum oxide, can be projected to operate continuously at a power density of 10^8 watt/cm^2 [2, 3]. It appears possible to operate a vacuum tunnel device with a dynamic dissipation range of 10^10 between active switching and quiescent states. If the dissipation during switching is 10^-3 watts for a 10^-13 watt quiescent condition, then the quiescent power for 10^11 such devices would be 10^-2 watts, an entirely acceptable level. In comparison with the above, a conventional solid state tunnel diode, with a dynamic dissipation range of only 20 to 1, would involve a quiescent power dissipation of 10^-5 watts per device, and a total of 10^6 watts quiescent dissipation for 10^11 devices. Solid state tunnel devices made from only metals and dielectrics, as will
be discussed later, seem to lend themselves to low standby power but would not exhibit a negative resistance characteristic.
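The order-of-magnitude figures in (c) and (d) above follow from simple products, and the sketch below reproduces them. It is an illustration under stated assumptions rather than the author's calculation: the variable names are hypothetical, the propagation velocity is taken as exactly that of light in vacuum, and the device count of 10^11 is the system size used elsewhere in this chapter.

```python
# Optimum time constant, section (c): R * C for the micron-sized device.
R_OPTIMUM_OHM = 1.0e5        # optimum impedance level quoted in the text
C_DEVICE_FARAD = 1.0e-16     # interelectrode capacitance quoted in the text
time_constant_s = R_OPTIMUM_OHM * C_DEVICE_FARAD

# Delay of a one-inch line, assuming propagation at the speed of light.
INCH_IN_METERS = 0.0254
SPEED_OF_LIGHT_M_PER_S = 2.998e8
line_delay_s = INCH_IN_METERS / SPEED_OF_LIGHT_M_PER_S

# Heating budget, section (d): vacuum tunnel device.
N_DEVICES = 1.0e11           # assumed total component count of the machine
P_QUIESCENT_W = 1.0e-13      # quiescent dissipation per device
DYNAMIC_RANGE = 1.0e10       # ratio of switching to quiescent dissipation
p_switching_w = P_QUIESCENT_W * DYNAMIC_RANGE
p_standby_total_w = N_DEVICES * P_QUIESCENT_W

print(f"RC time constant      : {time_constant_s:.1e} sec")
print(f"one-inch line delay   : {line_delay_s:.1e} sec")
print(f"switching dissipation : {p_switching_w:.1e} watt per device")
print(f"total standby power   : {p_standby_total_w:.1e} watt")
```

The line delay comes out somewhat under 10^-10 sec, which is why the text can treat it as compatible with a switching time of the order of 10^-11 sec.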
3. SCALING OF MATERIAL PROPERTIES
For metallic films 3000 Å thick the bulk resistivity values may be used. Transmission lines having 3000 Å diameters may also exhibit bulk characteristics even though the surface area and electron scatter are very high.
Dielectric breakdown strength increases to 10^8 volt/cm for 3000 Å thick films of material that are properly prepared [4]. This is primarily because the thickness is near the value for the electron mean free path in the dielectric. An electron avalanche is not likely to occur under such conditions and breakdown is forestalled. High field electrostatic devices can use this increased breakdown strength to advantage.
As the size of our components is scaled down the surface-to-volume ratio rises and surface-tension forces play a profound role in determining the shapes of the objects we can make. Materials tend to minimize their surface energies and assume a spherical shape instead of remaining sharply pointed. One of the limits to the smallness of the tip size that can be used for field emission cathodes is the blunting caused by material migration and surface tension forces at high temperatures [5, 6]. Another limit that can frequently be seen is the way that a surface layer of material that is to be diffused into a semiconductor device tends to agglomerate and form into small droplets [7]. This is partly caused by impurities accidentally present at the interface, and partly by surface tension forces.
Special problems in crystallinity are encountered when dimensions are reduced to the few thousand angstrom unit region. Single crystal materials require an unusually high degree of perfection over a large area if pinholes, dislocations, or other anomalies, which could cause accidental diffusion of materials from one layer to another, are to be avoided.
Polycrystalline films are usually composed of arrays of single crystals with few grain boundaries parallel to the film surface but with many grain boundaries perpendicular to the surface, permitting easy diffusion paths in the thickness direction. Polycrystalline films having many crystals in series in the thickness direction can be formed by evaporation methods, but these are unstable and tend to recrystallize at elevated temperature to crystals having all sides approximately equal to the film thickness. By proper admixing of materials such crystal growth can be forestalled, but even with 300 Å crystal size, which is considered small, there would be only 10 crystals in a 3000 Å-thick film, and the possibility of having a direct diffusion path is large.
Certain electronic devices, like semiconductors, have detrimental surface recombination effects, which adversely affect the electrical properties of the device more strongly as its size is reduced and its surface-to-volume ratio rises. The surface under consideration includes not only the free surface but also internal boundaries which represent discontinuities in properties.
B. The Over-All System Specification
Many detailed considerations are needed to dictate the requirements of a complex data processing machine [7a]. At this early stage of the program the opinions formed about the details of the eventual machine are not likely to be firm conclusions; however, opinions are needed to guide the day-by-day activity. The tentative specifications for a complete machine are presented here in order to show why various endeavors have been undertaken.
1. NUMBER OF COMPONENTS
Based on a number of considerations a maximum system size of 10^11 components, contained within a one-cubic-inch volume, has been selected. The most important single factor in the selection of this size is the resolution of the machining process. Other factors that are strongly related to size are: electronic effects, interconnection problems, uniformity of manufacture, and stability of components. Furthermore, the economics of fabricating such a large system, with the methods to be described, have been considered and found to be reasonable.
2. THE SHAPE AND NUMBER OF LAYERS
A cube, one inch on a side, has been selected to house the data processing system because of the adaptability of this shape to the manufacturing process. Vacuum deposition methods lend themselves to the deposition of material on a single surface. Past experiments have shown the feasibility of depositing fifty layers of material 3000 Å thick. More layers could conceivably be deposited but the resulting unevenness of the top layers is difficult to cope with. When these layers of materials are used to make components, wiring, shields, and crossovers, approximately ten layers of components could be made on each substrate from the 50 deposited layers. The thinnest substrate that would seem reasonable to handle in the polishing and lead wire connection operations would be 0.01 inch thick.
The major dimensions of the substrate are believed to be optimum at about one inch square. Larger sizes are difficult to heat uniformly to the high temperatures needed for reactive deposition and are too flexible during exposure by the electron image. Smaller sizes are more desirable to work with but entail a reduction of the number of lead wires that can be attached
to the substrate as well as a reduction of the area available for optical communication to the machine. One hundred substrates would thus stack to a one-inch cube.
Approximately 10^8 one-micron-sized components can be accommodated on one layer of a one square inch substrate when a 25% packing factor for components is used. A cubic inch machine would thus be composed of 100 substrates having ten layers or 10^9 components per substrate, for a total of 10^11 components.
3. EXTERNAL COMMUNICATION
Electrical, optical, and mechanical methods would be available for coupling the microelectronic system to the surrounding world. Lead wires can be attached to each of the substrate edges and up to 80 wires per plate could be secured; however, the difficulty of attaching and accounting for these wires would tend to limit the number to a smaller value. A few hundred megacycles bandwidth per wire appears attainable. These wires could serve as intersubstrate coupling and coupling to external devices.
Optical coupling between the adjacent substrates could serve to connect up to 10^4 parallel data channels with over 10 Mc of bandwidth per channel. It is estimated that approximately 10% of the components would be used as electrooptical generating and receiving transducers. The light detecting device would be a simple photoemissive surface connected to a vacuum tunnel effect tetrode employed as an amplifier. If more sensitivity is needed, a secondary electron multiplier could be interposed between the two units without undue complexity. The light generator would be a refractory phosphor stimulated by electrons drawn from a field emission source and controlled by a grid structure in a triode geometry.
It is necessary to have a thin transparent substrate in order to allow many channels of communication between substrates without the use of a lens. The light emitted from a small fluorescent source diverges until it strikes the photoreceptor. If the substrate is 0.01 inch thick, then the light that falls on the adjacent substrate could be confined to within approximately 0.01 inch by moderate collimation at the source. A 0.01 inch resolution yields 10^4 separate areas on the substrate.
The optical channel or surface that is to communicate with the external surroundings of the machine could use a lens; this would allow over 10^6 optical channels in the input field.
The output of the machine to human observers or to remotely located machines would also contain 10^6 individual channels.
It would also be possible to obtain mechanical or acoustical intercommunication with machine circuits by providing thin hard diaphragms as the encapsulation that vary the spacing of field emission diodes, thus causing a current change. These transducers could be as small as one micron.
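The component and channel counts quoted in this specification can be reproduced from the stated geometry. The sketch below is illustrative only: the variable names are hypothetical, and the assumption that each one-micron component occupies one square micron of substrate area is an interpretation of the 25% packing factor, not a statement from the text.

```python
MICRONS_PER_INCH = 25_400
PACKING_FACTOR = 0.25            # fraction of substrate area holding components
COMPONENT_LAYERS = 10            # component layers per substrate
SUBSTRATE_THICKNESS_INCH = 0.01
OPTICAL_RESOLUTION_INCH = 0.01

# Components on one layer of a one-inch-square substrate,
# assuming one square micron per component.
per_layer_exact = PACKING_FACTOR * MICRONS_PER_INCH ** 2
per_layer_rounded = 1.0e8        # order of magnitude used in the text

# Substrates that stack into a one-inch cube.
substrates = round(1.0 / SUBSTRATE_THICKNESS_INCH)

per_substrate = per_layer_rounded * COMPONENT_LAYERS
total_components = per_substrate * substrates

# Separately resolvable optical areas on a one-inch-square substrate.
optical_channels = round(1.0 / OPTICAL_RESOLUTION_INCH) ** 2

print(f"components per layer (exact) : {per_layer_exact:.2e}")
print(f"substrates per cube          : {substrates}")
print(f"components per substrate     : {per_substrate:.0e}")
print(f"total components             : {total_components:.0e}")
print(f"optical channels per face    : {optical_channels}")
```

The exact per-layer figure is a little above 10^8, so the rounded totals in the text are slightly conservative.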
4. ENERGY SOURCE AND DISTRIBUTION
Under conditions of the maximum rate of processing data, up to 100 watts of power could be dissipated by this one-cubic-inch machine. The data processing rate at this power input would be approximately 10^16 bits per second. It is not expected that this rate would be used often, but, in the event it was reached, the prime energy source would be a 100 volt direct current supply drawing 1 amp. Auxiliary low current supplies for electron multipliers and the like could be provided from external sources or from internal converters.
The power distribution system would have to be an elaborate arrangement to prevent faulty sections from disturbing the entire machine. Fuse links built into the machine construction in the form of marginal wire diameters would serve as an irreversible protection device. Electrostatic relays could be employed as fault sensing devices and act as reusable circuit breakers.
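As a rough check on the energy figures above, the supply power and the energy available per bit can be computed directly. This is a sketch under the assumption that the peak processing rate is 10^16 bits per second; the energy per bit is a derived quantity that does not appear in the original text, and the variable names are hypothetical.

```python
SUPPLY_VOLTS = 100.0
SUPPLY_AMPS = 1.0
PEAK_BITS_PER_SEC = 1.0e16   # assumed peak data processing rate

# Power drawn from the prime supply at the maximum processing rate.
power_watts = SUPPLY_VOLTS * SUPPLY_AMPS

# Derived figure (not stated in the text): energy available per bit.
energy_per_bit_joule = power_watts / PEAK_BITS_PER_SEC

print(f"supply power   : {power_watts:.0f} watt")
print(f"energy per bit : {energy_per_bit_joule:.1e} joule")
```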
IV. Tunnel Effect Components
The quantum mechanical tunneling of electrons through an energy barrier under the action of a high electric field is the basis of many known effects. The source of electrons for an electrical avalanche leading to destructive breakdown may frequently be traced to field emitted or tunneling electrons. The Malter effect [8] is reported to be a process for obtaining very high secondary emission yield from a surface in vacuum; the effect is caused by a positive charge being deposited on the surface of a thin dielectric which covers a metal base and results in field-emitted electrons being drawn from the metal through the dielectric and into the vacuum system. Cold cathode tubes based on self-sustained secondary emission from MgO have been reported to use a Malter effect or field emission source for their emitted electrons [9]. Some dc electroluminescence has been reported to stem from field emission [10], and the Mott and Gurney explanation of corrosion processes in thin films invokes quantum mechanical tunneling as the source of the electrons needed to carry on a thin film corrosion process [11].
The utilization of quantum mechanical tunneling in tunnel diodes has recently become an active area of investigation for electronic components, and before this, investigators of high-vacuum field emission applied the phenomena to various microwave, tetrode, and cathode ray tube designs [12, 13, 13a].
It is our intention to use the principal and best understood form of field emission, namely, the emission from a metal into vacuum, as the basis for
an active electronic component in a microelectronic data processing system. In addition, infrared detectors and electroluminescence generators will be investigated that make use of field emission from and into semiconductors.
A type of solid state tunnel effect component has been investigated in which only a metal and a dielectric are used, the metal supplying the electrons and the dielectric giving the necessary energy gap. Experience thus far with this class of device shows an unfavorably low impedance, as well as other problems such as traps caused by poor crystal structure in the dielectric.
The greatest problem in applying field emission to devices is the instability of the final device due to impurities and material migration. Field emission can be a very high energy density effect, and when it is coupled with low activation energy materials, as is done in most applications attempted thus far, material decomposition and instability are the inevitable result. By using stable materials and very specific geometries to avoid electron collision with dielectric supports, a much higher stability could be expected.
A. Geometry
The most essential design criterion in a vacuum tunnel effect device is to provide a method of changing the field at the cathode in such a way that the electrons that are emitted under the action of the field are not intercepted by the control electrode (grid) that caused it. The use of the emitted electrons depends upon normal electron ballistics, and conventional tube design considerations are adequate.
Many geometries are possible that will achieve the desired results; two geometries that lend themselves to film construction techniques are shown in Figs. 1 and 2. These configurations could be constructed by successive deposition and machining operations. The only part of the process that is not obvious upon inspection is the method of supporting the conductor layer over the cavity in the lower dielectric, and the closing of the entire cavity with encapsulating material. These details will be discussed in Section IV, G. The multilayer geometry shown in Fig. 2 is more difficult because of the higher registration needed between layers.
1. THE CATHODE
The cathode preferably is made in the form of an array of small tips superimposed upon a larger cathode lead wire. Tips having a radius of approximately 100 Å would be desirable from the standpoint of lowering the operating voltage but would be difficult to fabricate and have a tendency toward instability due to surface tension forces. It would be desirable to fabricate an array of at least ten of these tips on the cathode area. Some of the pertinent fabrication processes will be considered in Section IV, F.
FIG. 1. Tunnel effect vacuum tetrode, single-layer type (top view and side view, showing cathode, control grid, screen grid, and anode).
FIG. 2. Tunnel effect vacuum triode, multilayer type (top view and side view).
The work function of the cathode material should be approximately 4.5 ev, the same as tungsten. If higher work functions are used the voltage requirements of the device are increased, as are heating effects. Low work functions reduce the slope of the emission current-grid voltage characteristic and thus lower the gain of the device. If the work function is carried to very low values, loosely bound monolayers of materials such as cesium are usually used and instability results. At very low work functions the devices also become slightly temperature sensitive. Graphs taken from "Advances in Electronics" [6, pp. 94, 96] and plotted in Figs. 3 and 4 show the dependence of cathode current density upon field, work function, and temperature.
FIG. 3. Field emission current density J as a function of applied surface electric field F for three values of work function in electron volts.
FIG. 4. Field emission current density J as a function of applied surface electric field F at four temperatures for a 4.5 electron volt work function.
The range of current densities of interest for 10^-10 cm^2 cathodes is from 5 × 10^6 amp/cm^2 during the conducting state to 5 × 10^-4 amp/cm^2 during the quiescent state. This is a dynamic current swing of 10^10, which corresponds to an actual current swing of from 10^-4 to 10^-14 amp and can be achieved by a change in field at the cathode of approximately five to one.
Young [14] has observed a velocity distribution of 0.15 ev for field emitted electrons, indicating that this is not a noisy source and that the energy spread is low enough to allow easy control of the emitted electrons.
2. GRIDS
A geometry must be selected for the control and screen grids that will allow their fields to affect the cathode without the electrons emitted from the cathode being intercepted by the grids. The control grid would normally operate positive with respect to the cathode potential and, by keeping the grid near the plane of the cathode, electron current to it can be reduced to very low values. The control grid may be run negative if it has a higher work function than the cathode, or is smoother than the cathode. Normally the grid would be formed smoother than the cathode so that negative operation is possible, but a negative grid is more difficult to incorporate in circuits because it cannot be connected directly to a preceding anode as a positive grid can.
At a field intensity of 10^7 volt/cm the mechanical force on the grids amounts to about 640 psi, and this force increases with the square of the applied field. The grid electrodes must be firmly supported only a short distance from their active regions to prevent distortion in the field and unwanted mechanical resonances due to electromechanical effects. These effects will be discussed later in the section on electromechanical filters.
The screen grid has the dual role of shielding the control grid from the anode field while acting as a bias field electrode. In an effort to lower the switching voltage of tunnel effect components it may be desirable to maintain the screen grid at a potential of 100 volts while the control grid and anode swing between 3 and 10 volts. The high field from the screen would serve as a bias field at the cathode while the control grid caused the necessary current change. The secondary electrons and reflected primary electrons at the plate must be effectively suppressed and prevented from returning to the screen grid if this method is to reduce over-all heating effects, including dissipation at the screen grid.
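The mechanical force on the grids quoted above for a field of 10^7 volt/cm can be estimated from the standard expression for the electrostatic stress on a conductor surface in vacuum, P = ε0·F^2/2. That expression is an assumption brought in here for illustration and is not stated in the text; the function name is hypothetical.

```python
EPSILON_0 = 8.8541878128e-12   # permittivity of free space, farad per meter
PASCALS_PER_PSI = 6894.757


def electrostatic_pressure_psi(field_volt_per_cm):
    """Outward stress on a conductor surface, P = eps0 * F^2 / 2, in psi."""
    field_volt_per_m = field_volt_per_cm * 100.0
    pressure_pa = 0.5 * EPSILON_0 * field_volt_per_m ** 2
    return pressure_pa / PASCALS_PER_PSI


p_at_1e7 = electrostatic_pressure_psi(1.0e7)
p_at_2e7 = electrostatic_pressure_psi(2.0e7)

print(f"stress at 1e7 volt/cm : {p_at_1e7:.0f} psi")
print(f"stress at 2e7 volt/cm : {p_at_2e7:.0f} psi")
print(f"ratio for doubled field: {p_at_2e7 / p_at_1e7:.1f}")
```

Under this assumption the stress at 10^7 volt/cm comes out a little over 640 psi, and doubling the field quadruples it, in line with the square-law dependence noted in the text.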
Secondary electron coupling effects, to be discussed later, would be ineffective at low voltages, and therefore low voltage operation would not be desirable if secondary emission effects were required.
3. THE ANODE
Efficient collection of the emitted electrons is the principal function of the anode, and heating effects caused by electron impact must be dissipated to the surrounding lattice. The anticipated current density at the anode is 10^4 amp/cm^2 and the average electron velocity may be anywhere from 3 volts to 100 volts, depending upon the final outcome of some of the techniques for producing and stabilizing cathodes.
The power density in a one-square-micron area could reach 10^6 watt/cm^2 in operation. As will be discussed later in Section IV, E on Heating Effects, this power density is lower by two orders of magnitude than the maximum allowed.
The high field strength used in tunnel effect components effectively suppresses space charge effects. It has been shown that current densities of 10^7 amp/cm^2 can be obtained in the region of the field emission cathodes with only a 25% space charge effect [15]. This low space charge would allow efficient collection of electrons at the anode for a current density of 10^4 amp/cm^2. By shaping the anode in the form of a shallow cup, a space charge cloud may be purposely established as a means for suppressing secondary electrons.
B. Time Constants
By considering the major energy storage mechanism, an order of magnitude calculation of time constant may be made for the vacuum tunnel tetrode. In this case, the interelectrode capacitance between screen grid and anode is of major importance. Roughly this capacitance is 10^-16 farads, assuming no increased capacitance due to interconnection wiring or charge carriers in the grid-plate region. The discharge resistance, based on a current of 10^-4 amp and a voltage swing of 50 volts between screen and anode, is 5 × 10^5 ohms. The RC time constant is thus 5 × 10^-11 sec.
The transit time of an electron between cathode and anode in this particular micron-sized device, operating at 50 volts, would be about 10^-13 sec, and no grid loading effects would be observed even at 100,000 Mc, thus reducing the need for complex geometries such as traveling wave tubes to achieve high frequency gain.
C. Tube Characteristics
Tube characteristics for vacuum tunnel effect devices have been obtained by both analytical and experimental methods; however, the analytical methods have been applied only to geometries that involve high voltage operation, in an effort to compare operation with the large-sized cathodes that have been thoroughly explored by previous investigators. In early experimental work by the author, diodes and triodes were fabricated by film techniques using high resolution masking and vacuum evaporation methods. The masks were guided into position by a micromanipulator using a point projection field emission microscope to view the results. The geometries employed had largely undetermined dimensions, but over-all operating characteristics could still be determined. Operating voltages in the range between 20 volts and 100 volts were readily obtained with currents as high as 100 µa for short times. The devices were formed
in a poor vacuum and not encapsulated, so that effects of contamination caused the destruction of the device at high current levels. This contamination causes a build-up of a carbonaceous material on the cathode and other structures by electron bombardment of the pump oils in the vacuum system. The build-up process produces a runaway emission and eventual destruction. Tube characteristics showing an amplification factor up to 100, a plate resistance of several hundred thousand ohms, and a transconductance of over 1000 µmhos per ma have been obtained in experiments operated in the higher current range for a short time before destruction occurred. Both negative grid and positive grid characteristics were obtained; the most favorable case of positive grid operation gave a grid current sufficiently low to measure a dc power gain of 10^4. Recent work at Stanford Research Institute [15a] on applied research in microminiature field emission tubes has partly confirmed the earlier findings by showing diode operation at voltages as low as 25 volts for 10 µa of current, yielding a power density of 10^4 watt/cm^2 on an anode of stainless steel which had not been outgassed. In other tests using molybdenum anodes a power density in excess of 10^7 watt/cm^2 has been obtained. A spacing of 2000 Å was used in this test, or essentially the same dimension anticipated for the final grid-to-cathode spacing. By using a 1000 mesh screen of copper as a grid and micromanipulating a 1000 Å radius tungsten field emitter into one of the mesh openings, triode characteristics have been obtained even though the anode was located more than one centimeter from the grid. The voltage amplification for the triode geometry used was over 5000 and the grid current was less than 10^-4 of the anode current when the grid was positive. A grid voltage of around 400 volts was necessary for this experiment because of the large hole size in the 1000 mesh screen.
Tentative results using 5000 Å hole diameters have shown good control grid action with 25 volts applied. A geometry similar to the triode shown in Fig. 2, but having a single emitter tip, has been chosen for analytical treatment. The dimensions chosen were: cathode radius 1000 Å, cathode-to-anode spacing 7750 Å, control grid hole diameter 5000 Å, spaced 7750 Å from the anode. The analysis considered grid emission and intercept currents. An amplification factor of 20, a transconductance of 50 µmho/ma, a plate resistance of 10^7 ohms, and a power gain of 20 were found for the geometry chosen. The grid voltage was 250 volts for an anode voltage of 1500. These operating characteristics are far from ideal, but are in line with the large tip radius used in the calculations. Complete data on this project can be found in [15a]. It is the aim of this program to investigate analytically and experimentally the properties of field emission tubes in an effort to optimize their geometry and determine the compatibility of certain metal and dielectric
combinations for their construction. The experimental apparatus used on this program is essentially a point projection electron microscope using the field emission tip as the source. Micromanipulators are provided that can position the emitter in 5000 Å diameter grid holes while under observation by the microscope at magnifications up to one million diameters. The micromanipulators have been calibrated to give about 40 Å of motion per dial division in any one of three mutually perpendicular axes so that variations in geometry can be investigated. It has been the aim of this program to use methods that are independent of electron beam micromachining so that early results can be obtained to help guide the later application of micromachining work. The features that are readily observed from this work to date are that the operating current range is exceedingly large, which will be a great asset in obtaining low quiescent power dissipation, and that the power gain and voltage gain are adequate.
D. Environment
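The quiescent-dissipation budget discussed in this section can be checked with a few multiplications. The sketch below is hedged: the component count, cathode area, and current density used are the values that reproduce the 0.5 amp and 50 watt totals stated in the text, and the function name is hypothetical.

```python
def quiescent_budget(n_devices, cathode_area_cm2, current_density,
                     operating_volts, system_limit_watts):
    """Return total quiescent current, quiescent power, and remaining power."""
    total_current = n_devices * cathode_area_cm2 * current_density  # amp
    quiescent_watts = total_current * operating_volts               # watt
    remaining_watts = system_limit_watts - quiescent_watts          # watt
    return total_current, quiescent_watts, remaining_watts

current, quiescent, remaining = quiescent_budget(
    n_devices=1.0e11,         # components in the array
    cathode_area_cm2=1.0e-10, # cathode area per device
    current_density=5.0e-2,   # amp/cm^2 at 1000 deg K (inferred from the totals)
    operating_volts=100.0,
    system_limit_watts=100.0, # postulated continuous dissipation limit
)
```

Half of the postulated 100 watt limit is consumed by quiescent current in this case, which is why the text expects the data processing rate to fall by about one half.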
It can be seen from Fig. 4 that the effect of temperature on a cathode having 4.5 electron volts work function is negligible for the conducting state until a temperature of 3000°K is reached. The most important effect of increased temperature for the entire array of 10^11 components is the increase in quiescent current for the various devices. If 10^11 devices each having 10^-10 cm^2 of cathode area are raised to a temperature of 1000°K, under conditions which correspond to a current density of 5 × 10^-2 amp/cm^2 per device and a field intensity of 2 × 10^7 volts per centimeter, then approximately 50 watts would be dissipated by the entire group of components (total current 0.5 amp, operating voltage 100 volts). Since the maximum continuous power dissipation for the system was postulated to be 100 watts, this leaves only 50 watts available for dissipation of all the components that are not in the quiescent state, thus reducing the data processing rate to less than one half the original rate. A temperature limit may also be imposed by electrical conduction within the aluminum oxide lattice used for the dielectric of the system. Tests have been made by the author, which are described in Sections IV, I, and VIII, C, on aluminum oxide at elevated temperatures by depositing the alumina onto the tips of field emission microscopes, observing the emission current that can be conducted through the lattice at various temperatures, and studying the ability of the surface to retain a charge. The tests indicate that properly deposited alumina is an adequate dielectric for our purposes to temperatures in the order of 1200°K. Mechanical effects caused by the thermal expansion of the construction materials introduce another source of variation with temperature. Heating the tunnel effect components would tend to reduce the field at the cathode
and reduce the current. This effect partly offsets the quiescent current increase from the cathode upon heating. Radiation damage to vacuum tunnel effect components would not seem to be a problem until the radiation destroyed the dielectric. Radiation damage study of field emission tips has been carried out by Mueller using alpha particles and observing the effect on the lattice by field ion microscopy [16]. In this way Mueller was able to see the individual dislocations caused by the radiation and to count the displaced atoms. Alpha particles completely penetrate the small diameter tips and recoil collisions can be seen and measured. The most obvious effect of radiation damage to the cathode is to produce sharp projections which emit more easily; in addition, the dislocations produced cause an increased migration of the cathode materials and chiseling [17]. No data are available on the breakdown of dielectric materials at high fields in the presence of ionizing radiation, but results directly applicable to tunnel effect components could be obtained by depositing dielectric material on a field emission tip and irradiating it while observing the results. By using inorganic materials, and having device properties independent of crystallography, it appears possible to operate vacuum tunnel effect devices at 10s/cm2 more neutron flux than semiconductor devices. This value was arrived at by comparing the effects of radiation on semiconductor devices and a mica capacitor, which is somewhat equivalent to a tunnel effect device insofar as radiation damage is concerned. The reduction in the size of components causes fewer ionizing events for any particular flux level, but the events would be more devastating because they involve a larger percentage, by volume, of the component.
E. Heating Effects
It has been stated that the steady state power density for a one micron sized device is as high as 10^8 watt/cm^2. The principal energy transfer mechanism is thermal conductivity; Holm [18] has given the analysis of the problem as it pertains to relay contact theory. Point projection X-ray microscopy [2] shows that a 20 kv electron beam at 50 µamps, for a total input power of one watt, may be continuously absorbed in a one micron spot size. The target for this beam is usually a thin film of metal such as aluminum or copper. In another X-ray source [3] a point of tungsten having 3X cm2 area was bombarded so as to give 10^8 watt/cm^2. The heat conductivity in the film and the needle shape is shown to be adequate to allow many hours of continuous operation. Germer [19] has calculated the power density of relay contacts just before closure, in which the principal conduction mechanism is field emission, and his results agree with the previous data. Evidently individual micron-sized devices could be expected
to operate at high power density as long as no other components were immediately adjacent, in their nonquiescent states. A complicating effect is that the thermal conductivity is not uniform, but depends upon the location of wires and shields, since most of the conductivity is contributed by the metal in the system. It is expected that although thermal conduction will be the principal mode of heat transfer for individual components, when the temperature rises due to switching activity in a cluster of components, radiative processes will contribute largely to heat transfer to the external environment. The heating effects are not simple steady-state effects, but rather pulses that will probably result in high thermal shock. No analytical or experimental data are available to show the effect of this thermal shock on the metal dielectric combination in the 10^-10 sec region. Experimental data obtained by the author in the 0.1 µsec region using evaporated alumina on the side of a tungsten field emission microscope tip did not show any deviation from the steady state heating effects. The techniques, dimensions, and materials used in these tests were the same as those intended for the final component. The methods described by Mueller [20] allow quantitative determination of the surface activation energies of alumina by steady state methods, which permits interpretation of dielectric damage and migration. Under the action of high temperature and high electric fields, electrolytic action can occur in dielectrics and must be avoided at all costs. Tests by the author on simple high temperature capacitors using reactively-deposited aluminum oxide with molybdenum electrodes have been carried out in the past and have shown that the dielectric is stable at 800°C and 10^7 volts/cm. These tests were not exhaustive and were only carried on for a few hours; the tests are described in Section VIII, C on Material Deposition.
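The X-ray microscopy figure cited earlier in this section (a 20 kv, 50 µamp beam absorbed in a one micron spot) can be turned into a power density with a few lines. This is a hedged sketch: treating the spot as a uniformly loaded disk one micron in diameter is an assumption, and the function name is hypothetical.

```python
import math

def spot_power_density(beam_kilovolts, beam_microamps, spot_diameter_microns):
    """Return beam power (watt) and power density (watt/cm^2) in a round spot."""
    power_watts = (beam_kilovolts * 1.0e3) * (beam_microamps * 1.0e-6)
    radius_cm = spot_diameter_microns * 1.0e-4 / 2.0  # 1 micron = 1e-4 cm
    area_cm2 = math.pi * radius_cm ** 2
    return power_watts, power_watts / area_cm2

beam_power, density = spot_power_density(20.0, 50.0, 1.0)
```

The beam power is one watt and the density is slightly above 10^8 watt/cm^2, in line with the steady state figure stated at the start of the section.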
The work being conducted at SRI on microminiature field emission tubes [15a] is adequately instrumented to conduct investigations into the various heating effects and will greatly assist the design of components and systems.
F. Cathode Formation and Properties
1. MULTIPLE TIP CATHODE FORMATION
Low voltage operation of tunnel effect devices depends upon having small cathode radii, but high current operation requires a large cathode area because the current density per tip is limited to about 10^7 amp/cm^2. Some geometry of small tips covering the 3000 Å cathode surface will produce the optimum operating conditions. Too many tips tightly bunched will reduce the field at the tips and require higher voltage, while too few tips and widely spaced ones will lower the maximum available current. In addition to placement, the tips will have an optimum length-to-diameter
ratio from the standpoint of fabrication ease. Field plots have been made to help determine the optimum placement of tips; from various other qualitative considerations the optimum tip size appears to be about 200 Å diameter, 400 Å long, and spaced about 1000 Å center to center. An array this size would have about 10 tips on the end of the 3000 Å cathode. It would not be economical to try to machine these tips on the surface of the cathode; therefore, they must be grown by the use of crystallographic and surface tension forces. Tests have been conducted by the author to confirm various opinions about these growth methods. It has been found that a film of cathode metal such as molybdenum or tungsten can be made to agglomerate into small isolated patches having roughly the shape desired. This is done by depositing the cathode metal on a multiple layer of, first, aluminum oxide, then aluminum, and then heating the composite until the aluminum evaporates. The film thickness of the layers is 50 Å for the aluminum oxide, 100 Å for the aluminum, and 200 Å for the molybdenum. One possible explanation for the growth of isolated islands of molybdenum is that the molybdenum is alloyed with the aluminum and then subsequently migrates on the aluminum oxide substrate film until surface tension forces gather the material into small clusters where the aluminum evaporates upon further heating. The tests were carried out on thin aluminum oxide films and the results were viewed directly in the electron microscope. No tests have been made on field emission microscope cathodes to verify the findings, or to check for foreign materials. It is to be expected that the aluminum oxide is included in the cathode structure; this may have distinct advantages, as will be discussed later in the section on cathode stability.
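The figure of about 10 tips can be checked by dividing the cathode end area by the area each tip occupies. The sketch below is only an estimate; whether the 3000 Å cathode end is round or square, and whether the tips sit on a square or hexagonal grid, are assumptions the text does not settle.

```python
import math

def tip_count_estimate(cathode_width, pitch, round_end=True, hexagonal=True):
    """Estimate tips per cathode as end area divided by area per tip.

    cathode_width and pitch share one length unit (angstroms here).
    """
    if round_end:
        end_area = math.pi * (cathode_width / 2.0) ** 2
    else:
        end_area = cathode_width ** 2
    # A hexagonal grid packs one tip per (sqrt(3)/2) * pitch^2 of surface.
    cell_area = pitch ** 2 * (math.sqrt(3.0) / 2.0 if hexagonal else 1.0)
    return end_area / cell_area

estimates = [
    tip_count_estimate(3000.0, 1000.0, round_end=r, hexagonal=h)
    for r in (True, False)
    for h in (True, False)
]
```

All four combinations fall between roughly 7 and 10.5 tips, consistent with the order-of-ten figure quoted.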
The development of a trustworthy method of forming multiple cathodes is a very worthwhile undertaking for vacuum tunnel effect components, and should be assisted by adequate instrumentation such as an electron lens having high resolution to image the multiple cathode array. The field emission point microscope merges all images into one if they originate from a relatively plane surface, and does not allow the investigation of individual tips. One of the features of the high resolution electron optical system to be described later is that it will allow a large plane of material, which has been processed so as to produce a multiple cathode array, to be viewed by imaging selected portions of the array and thus determine the electrical characteristics without removing the surface from the ultrahigh-vacuum system in which it is made. In addition, the features of scanning electron microscopy and X-ray fluorescence spectroscopy can be employed to help converge the techniques to produce the desired surface. If desired, this lens system can operate with a high temperature substrate, in the
presence of strong electrostatic fields, to help promote the growth of tips having high length-to-diameter ratios. One of the limits that can be seen for the formation process is the inability to heat the metal structures, for cleaning purposes, to a temperature beyond the point where the dielectric and substrate become mobile. This corresponds to approximately 1500°C for aluminum oxide. In our construction process it is possible to use chemical methods to remove certain material that is normally retained at very high temperatures. An example of this is the use of a hydrogen molecular beam to remove carbon from tungsten [21]. The uniformity of these formation methods may not be very high; however, when they are coupled with self-formation processes, which are discussed later, a high over-all uniformity may result.
2. CATHODE STABILITY
The geometry of the cathodes must remain unchanged if electrical stability is to be maintained in tunnel effect components. The cathodes may become dulled in time by surface migration and surface tension forces at elevated temperatures, or the tips may become sharper by surface migration in the presence of electrostatic fields [22]. Most of the work on the stability of field emission cathodes has been done with 1000 Å tip radii made by electrolytically etching wire stock, and then heating the wire in vacuum to 2800°C for cleaning. This cleaning operation requires very high temperatures and results in diffusion of material and blunting of tips, thus making the use of small radius tips difficult. Some experimenters have used 100 Å tip radii, but a great deal of care must be used in processing the tips. One of the advantages offered by film deposition methods for the construction of cathodes is that the tips are clean and gas-free upon completion, thus avoiding the need for high temperature processing. In addition, certain high vacuum chemical cleaning methods, such as the removal of oxides and carbides by hydrogen, can be used at lower temperatures [21]. The stability of small cathodes depends somewhat upon the geometry being used. Herring [23] has pointed out that if the field emission point is a slightly bulbous knob on a cylindrical column, and if the portion of the knob at its region of maximum perimeter contains a flat facet, no blunting is possible without an outward motion of this facet. Since an outward motion would require two-dimensional nucleation, it would be practically impossible under the weak motivation of surface tension; consequently, the shape would remain unchanged in time, even though it might be far from an equilibrium shape. In addition to geometrical stabilization methods, chemical barriers can be included in the system to stabilize migration.
The addition of alumina or thoria and certain metal carbides such as hafnium carbide to the tungsten or molybdenum cathodes may serve as
a diffusion barrier. It has been a long-standing practice in the vacuum tube industry to carbonize thoriated tungsten filaments to increase the stability of the surface. In actual operation these small tips would always be under the action of an electric field to prevent dulling whenever heating occurs. The possibility of the small tips becoming sharper during operation seems less likely than the sharpening of larger tips because crystallographic binding forces become stronger as the size is reduced. It has been noted that it is not possible to smooth a surface of anything except glassy materials beyond a roughness of a few hundred angstrom units by heating processes alone, because of the tendency of materials to crystallize. Mueller [24] has shown the application of field evaporation using exceedingly strong fields to smooth surfaces, but these fields exert more force than we use by a factor of approximately 100.
3. SELF-FORMATION
Self-formation methods are a class of supplementary process that follow the principal micromachining process and simultaneously adjust the electrical properties of all components in the topmost layer of components to a uniform characteristic. This method of producing uniform components depends upon the activation of a chemical process by a significant electrical property of the component in such a way that the component is modified to conform more closely to a desired characteristic. The formation chemicals must then be removed without altering the uniformity. The formation of an electrolytic capacitor in which the applied voltage controls the dielectric thickness is the most common example of self-formation methods applied to a passive component. To obtain uniform emission currents from each of a large number of tunnel effect components having a single applied voltage is a most desirable application of self-formation methods. Additional requirements for the process would be to reduce grid emission by removing sharp projections from the grid and to prevent grid current being intercepted from the cathode by removing high emission angle anomalies from the cathode of each device. For the formation of tunnel effect components it is proposed that an entire array of 10^8 devices be fabricated by the micromachining process, leaving off the final encapsulation layer, with all of the cathodes having been deliberately formed too sharp for use. A voltage would then be applied to all electrodes simultaneously in the presence of a molecular beam etchant at an elevated temperature. The voltage would then be raised slowly until the sharpest tips begin emitting. These emitted electrons would supply the necessary activation energy for the chemical process to modify the sharp
tips only. This process should result in a degenerative dulling action that will give a uniformly emitting array as the voltage is raised to the operating value. After operating voltage is reached the etchant is discontinued and the temperature is raised to drive off residues; finally the entire array is sealed at high temperature in ultrahigh vacuum. The variations of spacings, work functions, and tip geometry would all be corrected by a change in tip radius. Vacuum tunnel effect devices appear to be a class of electronic component very suitable for self-formation methods. The process requires a component that is not sensitive to temperature, because elevated temperatures are needed to volatilize the reaction products from the electronic surface. An intermittent process using alternately low temperature reacting and high temperature purging could accomplish the same end, however. The component should have a geometry that allows the easy removal or redistribution of the materials, and vacuum devices have the highest degree of accessibility to the surface being formed. A further requirement is that the device have operating characteristics after the cleanup phase similar to those it had during the formation phase. The work function is the only property affected in tunnel effect devices during formation, being higher during the etching than in the clean state. After cleanup, the work function returns to the clean metal value in a very uniform and predictable way. Cryogenic devices appear to be the least likely class of component to self-form because they operate at temperatures too low for chemical action. For magnetic devices, self-forming would be difficult but possible. Semiconductor devices would be the next most likely class. Vacuum devices appear to be the best of all.
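The degenerative dulling action proposed above can be illustrated with a toy model. Everything in it is an assumption made for illustration and not the author's analysis: the apex field is taken as F = V/(k r) with k = 5, a common field emission approximation; etching is assumed to blunt a tip exactly until its field falls back to a fixed emission field; and the 0.4 volt/Å (4 × 10^7 volt/cm) threshold and 100 volt operating value are arbitrary.

```python
def self_form(tip_radii, operating_volts, emission_field, k=5.0, steps=200):
    """Ramp the voltage and dull every tip whose apex field exceeds the threshold.

    tip_radii are in angstroms and emission_field is in volts per angstrom.
    """
    radii = list(tip_radii)
    for step in range(1, steps + 1):
        volts = operating_volts * step / steps
        # Smallest radius whose apex field V/(k*r) stays at the threshold.
        limiting_radius = volts / (k * emission_field)
        # Tips sharper than the limit emit, are etched, and grow to the limit.
        radii = [max(r, limiting_radius) for r in radii]
    return radii

initial = [20.0, 35.0, 50.0, 80.0]  # angstrom, hypothetical as-made tip radii
formed = self_form(initial, operating_volts=100.0, emission_field=0.4)
```

In this model every tip sharper than the limiting radius ends at the same radius, while a tip that starts blunter is left unchanged and under-emits, which is consistent with the proposal that all cathodes be deliberately formed too sharp before formation begins.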
The tests that have been carried out by the author to indicate to what extent the above notions are valid involve etching tests on a single tip field emission microscope and then cleaning of the tip to determine residual contamination. The formation tests were performed on small tips having an external applied potential of about 300 volts in order to prevent destructive sputtering of the tips by high energy ions. High velocity ions formed far away from the tip are known to strike the shank of the tip below the originating point of the ionizing electron because of the inability of the high velocity ion to follow the field lines. This effect has the undesirable tendency to sharpen the tip [25]. The forming agent (chlorine) was admitted to the system in the form of a gas at about mm Hg pressure and the molybdenum tip was held at a temperature of about 400°C, or just below the temperature of appreciable thermal etching. Upon raising the voltage, progressive dulling of the tip was evident by a decrease in current. The vacuum system was cleared of chlorine and the tip was flashed to a temperature sufficient for cleaning; the original emission pattern was then observed to be free of chlorides. Mueller has sharpened tungsten tips in oxygen [25] by simply heating them. This would seem to be the opposite effect to what can be accomplished with low heat and electrons. The mechanism for self-formation is not known, but the experimental conditions suggest some form of solid state electrolytic plating of cathode material from the chloride or salt formed at the cathode surface. This electroplating would necessarily use the emitted electron current to activate the process and possibly to maintain charge neutrality in the salt. Since a strong field exists across the salt, migration of material would follow the field lines and not be prone to diffuse laterally. Under these conditions it may be possible to build up the tips that are too sharp, and thereby reduce emission. The decomposition of a volatile metal-carrying compound such as molybdenum or tungsten chloride or tungsten carbonyl under the action of emitted electrons would cause a similar effect. An example of this is shown for the decomposition of tin chloride by Powell et al. [26], and the author has reduced small quantities of molybdenum chloride by electron bombardment. None of these reactions has been carried out on field emission tips; however, these will be investigated during the course of the project.
G. Vacuum Encapsulation
Vacuum encapsulation methods are similar to the surface smoothing methods that must be used between layers of components of either vacuum or solid state type. If an array of one-micron-cubed components and their associated transmission lines are deposited on a surface, this constitutes the production of a rough surface that must be smoothed before proceeding to the next layer. In present film methods using masks, this effect is not evident because the edges are so diffuse and the components so wide that a gradual thickness transition is produced. Doping methods of producing integrated wires that do not rise above the surface are not considered realistic in this size range because of the exceedingly high loss of the conductors. If the surface is not filled or smoothed before proceeding to the next layer, a "pinhole effect" is produced, dielectrics are thinned at the steep boundaries, and the possibility of dielectric breakdown is increased. One method of vacuum encapsulation resolves itself to the equivalent problem of drawing a taut film across the top of an open cavity in a vacuum chamber. To implement this in a way that does not interfere with the component operation, or introduce undesirable impurities, several steps are needed. These steps consist of the deposition of a low melting point material, the fusing of this material to give a smooth surface, the deposition
of a thin film of refractory material, the removal of the low melting material by evaporation, and finally the sintering of the refractory material to the base material in high vacuum. The first technique devised to test these methods was to stretch a thin collodion film, in air, across the rough surface of an evaporated aluminum oxide film, dusted with alumina particles, to be smoothed in much the same way that a phosphor is covered before evaporating an aluminizing layer. This film was then covered with a deposit of evaporated silicon monoxide or aluminum oxide, as is often done in electron microscope specimen preparation. Heating the sandwich to high temperatures drives out the collodion without lifting the dielectric film, and at sufficiently high temperatures in vacuum the silicon monoxide film can be sintered to the underlying film of aluminum oxide. The film sandwich is then tested in an electron microscope or an optical microscope, and moire effects or optical interference between the two surfaces near a bulge or anomaly indicate the separation of the surfaces to form a cavity. There is no way of testing the degree of vacuum sealed in the enclosures. Occasionally the top film ruptured and no covering was produced; this was assumed to be due to the strain, produced by very large particles of alumina, that were put on the surface to act as support for the original collodion film. An all-vacuum process was next tested by the author to find a substitute for the collodion film that had to be applied at atmospheric pressure. This test consisted of depositing a layer of myristic acid by evaporation, followed by quick heating to the melting point and then rapid freezing to produce a smooth surface formed by surface tension. This layer of material was covered, by evaporation, with an agglomerated layer of sodium chloride to produce isolated 100 Å cubes of material. A film of silicon monoxide, 50 Å thick, was then applied to cover the surface.
Upon gentle heating the myristic acid deposit was driven out from under the silicon monoxide, presumably by passing through the vents provided by the sodium chloride crystals. Upon heating to high temperature the sodium chloride was driven out and the covering film sintered to the base film. A thin evaporated layer of aluminum oxide coming from a low grazing angle to the surface was used to close the small holes produced by the sodium chloride without penetrating to the base layer, and then a thick layer was deposited normal to the surface to increase the strength. Electron microscope examination revealed that enclosures had been formed around any hole or anomaly on the surface that the original myristic acid film had covered, its surface tension being sufficient to bridge these cavities. These same surface tension methods coupled with evaporated deposits can be used to support a diaphragm in the center of a cavity, similar to those needed for producing mechanical effects like electrostatic relays or
mechanical filters. This must be done in a multilayer fashion by covering the top of one cavity with a film, constructing another cavity on the top of this by micromachining, and finally closing this cavity by another film. Similar encapsulation techniques have been investigated by Sugata et al. [27] for the observation of specimens in gas layers in an electron microscope. This technique seeks to encapsulate a specimen at atmospheric pressure between two thin films. This is the inverse of our problem, but there are some similar features. The encapsulation methods investigated thus far have sought to test the method of encapsulation but not necessarily to optimize the selection of materials. The materials selection may be guided by the following considerations:
(1) The most difficult problems arise in choosing the fill material which is to flow on the surface and fill in voids by surface tension action. This material must be able to wet the surface when liquid and freeze without crystallizing into a rough surface. The decomposition products of this material must be volatile and nonreactive with the electronic component's materials. The vapor pressure of the material at the melting point should be low enough to prevent rapid loss of material, but high enough (just below the melting point) to be removed by sublimation, since melting tends to distort the thin covering film. Organic materials tend to satisfy most of the requirements, but are prone to contaminate by production of metal carbides. Glasses of the sulfur-arsenic system are also interesting. A wide range of materials is available for selection and further tests are simple to conduct. Myristic acid and 1-octadecanol partially satisfy the requirements.
(2) A suitable hole-producing material is one that agglomerates on the surface, and can be removed by heating without reacting unduly with the component or encapsulating film.
Ionic materials and salts with a strong tendency toward crystallization are the best class. Ammonium chloride has shown favorable results for this part of the process.

(3) The thin film that is used as a supporting membrane must be able to be evaporated to the surface without agglomeration or surface mobility because it must accurately shadow the tiny hole-producing aggregates on the surface. This material must also be able to sinter to the base material after the removal of the fill material, and be able to recrystallize to a taut film without rupturing. A wide range of metals or dielectrics can fulfill these requirements, but, in general, they are low-vapor-pressure refractory materials that have chemical affinity for the substrate. Aluminum oxide is one of the best materials used thus far.

(4) The final encapsulating film can be almost any material that can be evaporated and is stable enough to use in the final system. This layer
can be backed up with a reactively deposited coating which is pinhole free and stable. Aluminum oxide is also a good material for this part of the process.

The lifetime of the components on a substrate will largely depend upon the effectiveness of encapsulation of the outer layers of material. It is expected that a multilayer coating will be employed and that reactive deposition methods will be used. Tests have been carried out in the past by the author to help determine the materials and methods best suited for encapsulation. In the absence of an adequate method for testing the degree of vacuum in a micron-sized enclosure, corrosion methods were substituted. When a thin metal surface is prepared in an ultrahigh vacuum and is exposed to various gases, such as hydrogen and oxygen, the resistivity changes by over two orders of magnitude if the film is thin enough to have very little bulk electrical conductivity. A monolayer of gas adsorption is all that is needed to change the surface scatter of the conduction electrons. The encapsulation tests to be described are based upon the possibility of a similar change in resistivity if a foreign gas permeates the covering layers.

A 100 Å thick film of molybdenum was deposited on a sapphire substrate that had been cleaned by heating to above 1200°C in vacuum. This film was covered with an aluminum oxide film approximately 2000 Å thick, followed by molybdenum and silica films each of the same thickness. The depositions were carried out at around 900°C, as described in Section VIII, C on Material Deposition. The films were judged stable at 800°C by prolonged heating in vacuum and measurement of the resistance of the first molybdenum film. The sandwich was removed from the vacuum system and tested at temperatures up to 600°C in various corrosive media such as molten salt baths, and gaseous atmospheres such as carbon tetrachloride, hydrogen, phosgene, and water vapor. Tests in air up to 800°C were also conducted.
None of the tests caused a properly encapsulated sample to change resistance. Many experiments failed, however, due to either improper choice of materials or processing methods. The greatest point of weakness in all systems tested was at the terminals, where dissimilar materials and gross joining techniques were used. These tests have not been correlated in any analytical way with the effectiveness of vacuum encapsulation, but it is believed that there is a high probability of producing good encapsulation by these methods; however, the only complete test will be the performance of a micron-sized vacuum device.

H. Component Lifetime
Present field emission cathodes operated at 10⁷ amp/cm² have lifetimes of 12,000 hours [28]. The deterioration mechanisms are inevitably traced
to gases being adsorbed on the tip and changing the work function or being ionized by the emitted electrons and then sputtering the cathode. These gaseous molecules come from adsorption sites on the container walls; they migrate through the container, or are driven out of impure construction materials by heating effects or electron bombardment. By reducing the quantity of these migratory materials the lifetime of tunnel effect devices could be extended in an almost linear fashion proportional to their reduction. By employing ultrahigh-vacuum deposition methods, the materials can be cleaned up by a factor of over 1000. The use of dielectric materials in close proximity to the electron beam will introduce the possibility of electron activated decomposition of the dielectric and attendant contamination, but tests thus far have not revealed any difficulty. “The Research on Microminiature Field Emission Tubes” [15a] has as one of its specific goals the investigation of the compatibility of dielectrics and field emission cathodes that are in close proximity.

One desirable effect in reducing the size of the envelope is the reduction of surface area and the adsorbed contaminants that are contained on it. If the cathode is considered a sink for contaminants and the surface of the envelope a source, then the improvement in lifetime is proportional to the reduction in envelope surface area; in our case this is a factor of about over large glass envelopes. Diffusion effects through the encapsulation material, outgassing of the construction materials, and decomposition would limit the lifetime long before this limit was reached.

A strong ion pumping action is expected in a one-micron-sized tunnel effect device because of the high electron current density throughout the volume of the device even in the quiescent state, where the current density is around amp/cm².
During the switching cycle or active state the temperature and the current density may reach high peak values that serve to desorb gas from active surfaces and drive foreign material back into the encapsulation and allow it to continue diffusing throughout the system. A small amount of diffusing material is inevitable, but it should be kept in the encapsulation material instead of being allowed to concentrate on the active surfaces.

Young [29] has recently shown that certain metals containing oxygen can be made to liberate oxygen under the action of electron bombardment without heating. Tungsten and molybdenum liberated gases readily, while titanium seemed less troublesome. This result can be predicted from the stability of the oxides. Gross field emission devices use anodes having this decomposition effect and would be expected to show shortened lifetime. Clean materials would solve this problem, and vacuum handling techniques produce the cleanest materials.

If the materials that compose a component were perfectly stable and no new material were added then the component would have an infinite lifetime. The binding energies of materials and the operating temperatures determine the stability of the component. Vacuum tunnel effect components need only a single type of metal and dielectric, and these materials can be chosen to have maximum binding energy without having to compromise an electrical property. These components are able to operate at high temperature, but if the lifetime of the component is the prime concern, then they may be operated at lower temperatures; in this case, however, the data processing rate of the entire machine will be reduced because fewer components can be permitted to switch per unit of time since dissipation heating is maximum during switching.

All active electronic components are apparently disturbed to a similar degree by foreign materials. The smaller components will show a higher sensitivity to a fixed number of foreign atoms simply because these atoms represent a larger percentage of the active region. If a component is considered to be a certain volume of material embedded in a finite lattice of encapsulating material, then some of the difficulties in stability will arise from the constant flux of foreign material diffusing through the volume of the component, affecting the active electronic surfaces. If no sinks for this foreign material are allowed to cause accumulation, then a vacuum device and a solid device would have equivalent stability. Micron-sized vacuum tunnel effect components appear to avoid establishment of impurity sinks because of the high electron current densities and the attendant ion pumping action of the device. Sinks for foreign material can be provided in a solid system by having material concentration gradients or internal surfaces. Both solid state devices and vacuum devices can be influenced to an equal extent by these traps, but the vacuum device does not have surfaces other than the principal electronically active ones to cause difficulty.
It is important to reduce thermal gradients and other strain producing mechanisms at the external surfaces of the system in order to reduce diffusion into the system by stress corrosion mechanisms. Similarly, external electric fields should be avoided in order to reduce electrolytic action.

With small area enclosures the amount of material that is evaporated from the various electrodes and distributed to the dielectric supports must be very small in order to prevent a continuous conducting film being formed on the surface. A coating 50 Å thick would cause great difficulty in operation; however, the electrical characteristics of the device would be impaired because of the loss of metal from the electrodes before the above condition was reached. Extremely high temperatures or high sputtering rates would be needed to move much metal to the dielectric surface, but the dielectric could be transported through surface migration to the metal electrodes at temperatures as low as 1300°C; this is another reason for including an initially small, and hopefully saturated, amount of dielectric
with the cathode metals, as discussed in the section on multiple cathode formation.

Mueller has recently shown [16] that the sputtering of field emission tips by gas atoms produced dislocations in the tips which caused an increase in surface migration, and therefore the chiseling action described by Dyke [22]. Individual atoms were observed in the field ion microscope and dislocations could be seen at the surface. Data on material migration effects both in the bulk and at the surface has been obtained by field emission microscopy. Surfaces have been formed that are atomically clean with accurately measured radii for even these submicroscopic surfaces. All of these direct observation methods are available to tunnel effect component designers who have the job of optimizing the electrical and mechanical stability problems. Such powerful tools help assure the reliability of tunnel effect components.

The smallest electronic component presently known is the field emission device. Even though the envelopes are large at present the cathodes have the same area of 10⁻¹⁰ cm² that we intend to use eventually. If it is assumed that the reliability of electronic components varies directly with the active area, for a given amount of contamination wandering into the active area, then the present large sized vacuum tubes and semiconductor devices would have to show about 10¹² hours of operation to be equivalent in stability to present field emission cathodes. Conventional electronic devices that are reduced in size by using the same relatively unclean techniques employed for the large species would be expected to have a very short lifetime, and microelectronic construction methods therefore require techniques involving a high order of cleanliness. With a degree of cleanliness that seems attainable with state-of-the-art methods, several hundred years of component stability would seem reasonable.

I. Solid State Tunnel Effect Devices
Solid state tunnel effect devices were investigated at least as early as 1935 when De Boer [30] made diodes by evaporating successive layers of metal and dielectric in such a way that the cathode was rough and yielded field emission electrons when a voltage was applied to the sandwich. The current in the reverse direction followed the field emission equation very well, but the current in the forward direction obeyed a V^(3/2) law, as if caused by space charge effects. This space charge was probably due to a large number of traps that set up a negative space charge. De Boer obtained about 1 ma of current at 0.05 volt and 25 ma at 0.15 volt using a potassium cathode, a calcium fluoride dielectric, and a silver anode.
In Russia, Volokobinskii [31, 32] has noted tunnel effects in solid systems composed of metals and dielectrics; Malter [8] effects* are thought to be based upon field emission electrons being drawn from a metal cathode through a pure dielectric layer and into a vacuum.

Methods of using solid state systems of metals and dielectrics for the construction of field emission triode switching components have been proposed by the author and summarized by Highleyman [33]. These methods involve the use of refractory metal cathode areas with raised anomalies to assist tunnel emission and to define the emitting area. Grid structures consist of either thin metal films with appropriate electron transmission and absorption characteristics, or thicker metal films with physical holes aligned with the cathode anomalies to prevent excessive grid currents. The anode is located adjacent to the grid and all electrodes are immersed in and separated by solid dielectric, such as alumina made by film deposition methods.

Recently, Mead [34] has shown an analysis and some experimental results for a pure metal and dielectric type of tunnel effect triode made by alternate layers of aluminum and oxidized aluminum. Mead's conclusion was that semiconductors are not necessary; that a triode tunnel device would replace a tunnel diode; and that the devices must be made by very refined processes. These remarks are substantially in agreement with our findings.

Some tests were conducted in 1956 by the author on field emission into solid dielectrics by employing a point projection field emission microscope with evaporated coatings of aluminum oxide on the tip of either tungsten or molybdenum. By using this method the active area of emission could be accurately defined and current density measurements made.
Films from a few monolayers thick to several thousand angstrom units thick were deposited on tips with radii from 200 Å to 1000 Å and the current-voltage plots were made both before and after the dielectric deposition. It was noted with some surprise that the field required for the appearance of emission was the same both before and after the deposition. The only real difference the addition of heavy dielectric layers made was to limit the current density to around 10³ amp/cm². At this point the tip exploded and ruined the experiment. Thin layers below 100 Å thickness did not cover the tip uniformly and heavy emission current could be drawn from around the crystals of alumina. Heavy coatings seemed to have an instability in the emission pattern that might be construed to be traps changing their space charge effect, although the same effect would have been caused by surface charges from electrons or ions. The emission stability was much
* See p. 182.
more temperature-dependent than for the metal-vacuum field emission case.

Although the solid state tunnel device could probably be developed into something useful, it has largely been replaced by the vacuum tunnel effect device for our purposes because of the following unfavorable characteristics: it has higher temperature sensitivity (probably due to trap effects); lower input impedance due to electron scatter in the lattice; and less device uniformity due to dependence on crystalline properties in the dielectric lattice. In addition, self-forming methods are difficult to apply. On the favorable side the solid state device can be made to operate at lower voltages than vacuum devices; it probably has a much wider power range and lower quiescent power than semiconductor devices; and it may have a lower surface migration of materials, as compared with the vacuum device. However, boundary diffusion may be troublesome.

V. Accessory Components

A. Secondary Emission Devices
There are two principal uses for secondary electron emission in the microelectronic system under consideration. Amplification of low level photocurrents can be greatly facilitated by electron multipliers, and grids of tunnel effect devices can be driven positive upon receipt of a burst of negative electrons if the secondary electron emission ratio is greater than one. A thin-film transmission-type electron multiplier similar to the one described by Sternglass [35] is represented in Fig. 5. For small signal
FIG. 5. Thin-film transmission-type multiplier phototube. (Labeled parts: photo surface, encapsulation.)
amplification, such as photocurrent multiplication, this type of device may have considerable application; however, for large voltage swings the thin diaphragms may resonate mechanically at frequencies within the pass-band. The spacings that would be employed in this device would be around 2000 Å and the film thickness would be about 200 Å. A maximum diameter of two microns would seem permissible if the films were taut.
The voltage between electrodes would be approximately 100 volts to obtain a yield of four for a molybdenum-alumina secondary emission surface operating at a field intensity of nearly 10⁷ volt/cm. Five stages of multiplication would give a current gain of 5000 with a transit time spread low enough to allow gain up to 10⁵ Mc. Each dynode in a large array of multipliers could be operated from a common power supply source.

For large signal applications a more sturdy design is called for. The design shown for the dynode in Fig. 6 is similar to the design of Weiss [36] and is used here as a secondary emission amplifier. When the dynode is struck by high velocity electrons, more electrons leave the electrode than arrive; thus the electrode is driven positive and clamped near anode potential. Skellett [37] describes various ways of using secondary emission to achieve circuit functions with a device similar to this. The purpose of this particular circuit is to drive the grid of the top triode, in the series-connected triodes, positive without having to suffer degeneration due to cathode follower action. This will be described in more detail in Section VI, on interconnections.

FIG. 6. Secondary emission amplifier coupled with a tunnel effect vacuum triode. (Labeled parts: anode, secondary emission coating, dynode-output, grid, cathode, ground.)

Secondary emission surfaces are normally associated with instability; however, close inspection reveals that the surfaces tested could be predicted to be unstable because they are made from thermally unstable materials such as alkali halides [38], or electrically unstable materials such as magnesium oxide [39]. Even materials such as aluminum oxide, which is in a stable class of oxides, frequently have unstable hydroxides included in them because they are produced by electrolytic processes. Aluminum oxide must be fired to over 900°C to produce a stable alpha phase crystal structure. Impurities must be reduced to the same level that is found in semiconductor materials if stability is to be achieved. Once stable and pure materials have been obtained, and encapsulation methods have been perfected to maintain this condition, secondary emission may be used for technical application. Any system clean enough to accommodate field
emission cathodes will be suitable for the use of secondary emission effects.

Pure metal secondary emission surfaces have yields up to around two secondary electrons per primary. The time delay from these surfaces is so short that it cannot be measured and the current density is limited only by space charge effects and heating of the material to the evaporation point. The temperature sensitivity of the effect is negligible and a normal yield may be obtained from a material even after it is melted [40].

Secondary emission is a fundamental process similar to field emission in some respects. One interpretation of the process states that a high local field is produced in a material by the ejection of a bound electron by a primary electron and that the high field produced by the emission of the electron can act to accelerate a conduction electron sufficiently to escape the surface. Because this process is extremely rapid (faster than the relaxation time of electrons in metals), local fields can exist momentarily. Secondary emission is a lossy process on an energy basis in that the total emitted electron energy rarely exceeds 1% of the primary energy.

When dielectrics are mixed with metals in various geometries, the properties of the surface can have many interesting aspects. The Malter effect [8] and the magnesium oxide self-sustained secondary emission cathode [9] are examples. In both of these systems a primary beam of electrons striking a complex metal-dielectric surface causes secondary yields of over 10,000. The cause of this high yield can be traced to field emission caused by a positive charge on a dielectric particle located near to the cathode surface. The positive charge is caused by secondary electron emission from the dielectric. The positive charge is not neutralized rapidly through dielectric leakage or through electron capture because it is placed out of the electron path in much the same way as the grid in a vacuum tunnel effect amplifier.
For smaller amounts of dielectric than are needed to produce the Malter effect the yield of the surface falls and smoothly approaches the value of pure metal surfaces.

The time constant for secondary emission is very long for the complex surfaces, going to infinity for the self-sustained secondary emission cathode. For the simple geometries used in the Malter effect the time constant is in the region of a few seconds for gains over 10,000, and falls to the submicrosecond region for yields of about 50. Only a small fraction of the surface is used for emission, yet the charging current for the effect is spread out over the entire surface. For optimum geometries employing specific cathode areas and adjacent grid structures similar to tunnel effect amplifiers and having specific charging paths, the current gain would be related to the number of electrons that escaped past the grid before one was caught; this would be in the vicinity of 10⁶. The time constant would be related to the grid-cathode capacity and the size of the charging beam current. It
would be around 10⁻¹⁰ sec for a one-micron-sized device and a 100 µamp beam current. This beam could be obtained from a nearby field emission cathode, and controlled by another grid.

In this light, secondary emission seems very closely related to field emission. Technologically, both processes perform best in the presence of high fields, are operated at similar voltages (approximately 50 volts), and must have clean processing techniques and ultrahigh vacuum.
B. Light Detectors
For the fast detection of low level light pulses the thin film transmission type of multiplier phototube shown in Fig. 5 could be employed. The operation of the electron multiplier is described in Section V, A, on Secondary Emission Devices. In this geometry the photocathode is contained on the optically transparent encapsulating window. For operation at high temperatures, stable materials would have to be used; this precludes the use of normal photocathodes. Metal-dielectric complexes such as molybdenum and alumina would be useful again, but they would be suitable only for the 4000 Å blue-light region. Using an electron multiplier having a gain of 5000, as described earlier, and a two-stage tunnel effect tetrode amplifier having a gain of 10⁴ with a 100 Mc bandwidth, the over-all gain would be sufficient to detect a light intensity of ft-c falling on a one micron area and to amplify the signal to the voltage level used for the switching components.

The principal uses for the light detector are to allow optical coupling of up to 10⁴ channels between substrates or modules containing 10⁹ components; to provide a method of reading microdocument storage data; to obtain optical information from outside the machine; and to provide coupling between internal electronic devices operating at widely different voltages.

Two different-sized optical detectors would be needed for these functions. A 0.2-micron-diameter detector would be used for the microdocument reading, and a one-micron-diameter detector would be used for the other operations in order to have detectors the same size as active components. Many one-micron devices could be arrayed in parallel when used as intermodule detectors. Having 10⁴ paths of information would require detectors with dimensions not exceeding 0.01 inch; in any practical case, the detectors would have dimensions of 0.004 inch by 0.004 inch, which would require about 100 one-micron-sized detectors operating in parallel.
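The channel geometry quoted above can be checked with a few lines of arithmetic. The following Python sketch assumes a one-inch-square substrate and a square grid of channels; both are assumptions made for illustration, as are the function names. It shows that 10⁴ channels on such a grid leave a pitch of 0.01 inch per channel, and that a 0.004-inch detector edge is roughly 100 microns long, so that on the order of 100 one-micron detectors fit along one edge.

```python
import math

MICRONS_PER_INCH = 25_400.0  # exact conversion: 1 inch = 25.4 mm


def channel_pitch_inches(channels: int, plate_side_inches: float = 1.0) -> float:
    """Centre-to-centre spacing for `channels` paths laid out on a square grid.

    Assumes a square plate (one inch on a side by default) and a channel
    count that is a perfect square, e.g. 10**4 -> 100 channels per side.
    """
    per_side = math.isqrt(channels)
    return plate_side_inches / per_side


def elements_along_edge(edge_inches: float, element_microns: float = 1.0) -> int:
    """Number of elements of the given size that fit along one detector edge."""
    return round(edge_inches * MICRONS_PER_INCH / element_microns)


pitch = channel_pitch_inches(10**4)          # spacing available to each channel
edge_count = elements_along_edge(0.004)      # one-micron detectors along a 0.004-inch edge
print(f"pitch = {pitch} inch, elements along edge = {edge_count}")
```

The sketch only reproduces the geometry; it makes no claim about how the text's figure of about 100 parallel detectors is distributed over the 0.004-inch-square area.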
The 10⁴ paths of information have been selected as a maximum because of the necessary substrate thickness of 0.01 inch, which places an upper limit on resolution due to light scattering in the absence of a lens. When data is fed from the outside world into the machine through a lens,
the input mosaic could contain about 10⁸ one-micron-sized detectors and would thus require the best lens available.

For the reading of microdocuments the 0.2-micron detectors will have to be followed by 0.2-micron-diameter amplifiers, or have the larger micron-sized amplifiers stacked several layers deep with connecting wires fanned out to each detector. If a single external illumination source is provided for microdocument reading, and a scanned output is desired, then one of the intermediate dynodes of the electron multiplier can be gated to provide the blanking of that particular channel without introducing spurious signals.

By appropriate wiring the microdocument optical detecting elements can be connected to light generators on a greatly enlarged viewing screen on the opposite side of the plate and thus provide the effect of optical magnification. A magnification in the order of 1000 could be achieved and still have 10⁴ bits in a field. The over-all effect would be the same as an image-intensifying microscope.

Taft and Apker [41] and Sokolskaya [42] have reported on a type of optical detector using field emission from a cadmium sulfide tip. The field can be raised at the tip until emission occurs even though not illuminated, but if the field is lowered to a point just below extinction, emission is not obtained until light is directed onto the tip. The maximum sensitivity is found in the peak of the characteristic absorption band of the material used. By the use of photoconducting materials in very high fields the photoelectric emission threshold is found to be extended to very long infrared wavelengths. To employ this effect in infrared detecting devices, the lattice must be kept cool in order to lower the noise figure; however, this apparently has no effect upon the emission mechanism.

C. Light Generators
Phosphors that have not been particularly optimized for efficient operation frequently have time constants in the 10⁻⁸ sec region when stimulated with electrons. This stimulation can be achieved by emitting from a metallic tip directly into the phosphor lattice or by first accelerating an electron and permitting it to enter the lattice. The mechanism for direct current electroluminescence has been described by Zalm [10] as being produced by field emission from irregular edges of phosphor crystals coated with a slightly conducting layer. Luminescence in aluminum oxide, which has been observed during anodization by Van Geel [6], is considered to be electroluminescence. To verify the possibility of light emission from aluminum oxide stimulated by field emitted electrons, the author coated a tungsten field emission tip with evaporated aluminum oxide and emission current was drawn through the coating. A faint blue light was observed from the region of the tip. This work was done in an exploratory manner
and no effort was made to measure the efficiency of the process nor the spectral distribution of the light. Elinson [44] has shown the luminescence of one form of aluminum oxide. Aluminum oxide is being stressed here because it represents one class of stable phosphors. Various phosphor materials, such as zinc sulfide and zinc oxide, have been made by film methods, but these do not seem stable enough for our requirements.

An alternate method of stimulating a phosphor is to accelerate electrons from a field emission tip into a phosphor coated with a very thin metallic layer so as to prevent surface charge. By using smooth film phosphors and small areas, a metallic coating much thinner than the usual aluminized layers can be obtained without an excessive voltage drop in the metallic layer. Low voltages and high current densities would be desirable so that compatibility with field emission sources could be achieved. One hundred volts at several thousand amperes per square centimeter is expected. The maximum power loading of the phosphor would be very high because of the good thermal contact with nearby electrodes.

Large arrays of parallel-operated one-micron-sized light generators would be needed to present output data and to couple between modules. Approximately 100 elements in an array would be needed. If the light source was scaled down to around 0.2 micron in diameter and placed near a light detector, but out of sight of it, then the light source could be used to illuminate an individual bit of microdocument data. The reflectance of the stored data would serve to couple the source and the detector. Diffraction effects would not be pronounced in this subwavelength range because large aperture angles are postulated.
The brightness from microscopic luminescence sources is not known in detail, but various emissions from silicon junctions, silicon carbide resistors, anodic films, and field emission sources indicate that although the intensity is exceedingly high the area is also very small, so as to limit the total light. By employing construction processes that produce a large number of active areas, the total light output could be raised to useful levels.

D. Microdocument Storage
Using a micromachining process having 250 Å resolution to record permanent data, 10¹² bits of information could be isolated on a one-square-inch area. The material stored on this plate could be read at a later date by a normal electron microscopy system, but this entails the use of cumbersome, expensive equipment. Optical reading methods using conventional light microscopy would be ineffective because of the diffraction limits of light, although dark-field methods would scatter light into the microscope from particles below the wavelength of light.

By using individual light detectors having cross sectional areas proportional to the particle size to be resolved, the data printed on the one-square-inch area could be read provided the data was held in close contact with the detectors. By reading at a resolution of 2500 Å and writing with the limit of 250 Å provided by the micromachining process, 100 levels of light as seen by the integrating light detector could be discerned, thus yielding close to 7 × 10¹⁰ bits of data on a one-square-inch plate. It is not expected that the full range of 100 light levels would be available to a designer because of fluctuations in illuminating intensity, instability, and nonuniformity in detectors, effects of variable spacing between surfaces causing interference effects, and lack of analogue circuitry in microelectronic systems. Ten levels would seem to be the maximum value attainable in practice, and even this would require special techniques. A maximum storage density of 3.3 × 10¹⁰ bits/in.² would result.

The illumination of the entire array could be accomplished by an external source of light flooding the area or by individual light sources located in the same area as the detector. This latter method would give a convenient way of locally exciting the data and allow simultaneous or sequential access to the data without having to gate the photo detectors.

Aside from the problems of forming the light generation and detecting devices, which are discussed in Sections V, B and V, C, the largest problems in this storage system would be in achieving and maintaining smooth and clean surfaces with uniform antireflection properties. Standard microscope slides are smooth enough for our purposes, and they are often flat to within 10 fringes of light. With adequate antireflection coatings on both the optical detector surface and the micromachined surface containing the stored information, the difference in spacing will primarily result in adjacent channel cross-talk.
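The storage figures quoted in this section can be reproduced with a short calculation. The Python sketch below is illustrative only: it assumes that each resolvable cell is a square whose side equals the stated resolution, and that a cell distinguishing N light levels stores log₂ N bits. Neither assumption is stated explicitly in the text, and the function names are hypothetical, but under them the results fall close to the quoted 10¹², 7 × 10¹⁰, and 3.3 × 10¹⁰ bits per square inch.

```python
import math

ANGSTROMS_PER_INCH = 2.54e8  # 1 inch = 2.54 cm = 2.54e8 angstroms


def cells_per_square_inch(cell_side_angstroms: float) -> float:
    """Number of square cells of the given side that tile one square inch."""
    per_side = ANGSTROMS_PER_INCH / cell_side_angstroms
    return per_side ** 2


def storage_bits(cell_side_angstroms: float, levels: int) -> float:
    """Bits per square inch, assuming each cell stores log2(levels) bits."""
    return cells_per_square_inch(cell_side_angstroms) * math.log2(levels)


binary_at_write_limit = storage_bits(250, 2)    # one bit per 250-angstrom cell
hundred_levels = storage_bits(2500, 100)        # read at 2500 angstroms, 100 light levels
ten_levels = storage_bits(2500, 10)             # read at 2500 angstroms, 10 light levels

for label, bits in [("250 A, binary", binary_at_write_limit),
                    ("2500 A, 100 levels", hundred_levels),
                    ("2500 A, 10 levels", ten_levels)]:
    print(f"{label}: {bits:.2e} bits per square inch")
```

The small excess of the ten-level result over the quoted 3.3 × 10¹⁰ comes from using the exact inch-to-angstrom conversion rather than a rounded 10¹⁰ cells.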
Without antireflection surfaces a waveguide action between the surfaces will carry the scattered light to great distances and increase the cross-talk. It is expected that the light detectors will be within a few thousand angstrom units of the outside surface of the machine. This distance is primarily determined by the encapsulation requirements and antireflection coatings. The recording medium would be a simple metal film properly covered with encapsulating and antireflection materials. The minimum spacing between the data and the detector would be about 2500 Å; the maximum spacing would be set by the flatness of the recording plate and would be around 10,000 Å in the absence of dust specks. The optical detector surface should be considered to be a delicate surface even if hard materials like alumina are used for covering layers. There is a relatively high probability of punching through the thin windows if a dust particle is pressed or rolled on the surface by the recording medium.
The two methods of using this data storage system would be to mechanically scan the data in front of a relatively small viewing area of 10⁶ elements, in which the data could be taken from the storage plate in both serial and parallel fashion, or to locate the plate in a fixed position and scan the data out in any series-parallel fashion from an array of 10¹⁰ detectors. The latter method would be preferred because there would be less abrading action to the surfaces and no mechanical scanning mechanism.
Recording methods using electron-beam-activated micromachining techniques are presently developed to a degree to permit attainment of 250 Å resolution on glass plates or thin transparent films of support material. Encapsulation and antireflection coating methods are also developed to a high degree. The data converters necessary to take printed information from a book or other source and convert it to a complicated electron image are not available at present; however, no basic research would be required to attain this. Electron optical systems are able to expose complete retinal fields of information containing over 10⁸ bits per field [44a], with a current density sufficient for activating the electron beam machining process in 1/100 sec. This 10¹⁰ bit/sec data recording rate cannot be matched by present scanning devices. A scanning rate of about 10⁴ pages/sec would be required. If 10⁴ pages/sec could be obtained by using microfilm methods to illuminate a metallic photocathode with a 4000 ft-c light source, then a 10 ma/cm² electron source would result; this would allow the micromachining process to proceed at one-tenth full speed. A preferred method would be to make the first copy as a metallic image on a thin substrate at reduced speed and then use this master in a transmission-type electron optical system, with further demagnification, to produce copies at full speed. Each square-inch plate would require up to 10⁴ exposures of the 10⁸ bit fields to produce the desired 10¹² bits of stored information.
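The recording-rate figures above follow from simple arithmetic, sketched here. This is an illustration only; the constant names are hypothetical, and the calculation counts beam-on time alone, ignoring the mechanical stepping between fields.

```python
BITS_PER_FIELD = 1e8      # bits exposed in one electron-optical field
EXPOSURE_TIME_S = 0.01    # 1/100 sec per field at full current density
BITS_PER_PLATE = 1e12     # target capacity of a one-square-inch plate

# Instantaneous recording rate while the beam is on.
recording_rate = BITS_PER_FIELD / EXPOSURE_TIME_S

# Number of fields needed to fill the plate, and the beam-on time they take.
exposures_per_plate = BITS_PER_PLATE / BITS_PER_FIELD
beam_on_time_s = exposures_per_plate * EXPOSURE_TIME_S

print(f"recording rate: {recording_rate:.1e} bit/sec")
print(f"exposures:      {exposures_per_plate:.0f}")
print(f"beam-on time:   {beam_on_time_s:.0f} sec")
```

The beam-on time of roughly 100 sec is a lower bound on the time to write one plate, since positioning the 10⁴ fields in front of the lens takes additional time.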
Using 0.1 amp/cm² current density in the electron beam, about two minutes would be required to expose the plate. This two minute recording time is in accord with the time needed to mechanically scan the 10⁴ areas of the recording plate in front of the electron lens. The details of the exposing time requirements and the number of bits per field will be discussed in Sections X and XI, on Resist Production and Electron Optics. In addition to the time for exposing, time for deposition of base metal, etching, and antireflection coating would be added to the total time. Since these are batch processes, where many plates can be handled at one time, they would not increase the total time by more than a factor of two; this would give a total time of about four minutes to store 10¹² bits of information. One-square-inch plates have been selected because of their ease of processing and their ease of manipulation in a dust-free environment. The
storage of a large number of these plates would seem economical even if special containers or cassettes had to be provided to keep them free of dust. Mechanical positioning devices accurate to 0.001 inch can be easily provided for such rigid plates, and this accuracy can position 10⁶ individual fields in front of the viewing area of the microelectronic system. Exact registration with some desired data cannot be expected with simple mechanical means; therefore, electronic data processing methods must be coupled with mechanical methods to locate the details needed once a field of data is presented to the machine.

E. Electrostatic Relays
Electrostatic relays employing a one-micron-diameter diaphragm, under small tension and a few hundred angstrom units thick, would have a fundamental resonance above the 10 Mc region. With spacings of 1000 Å between this conductive flexible diaphragm and a fixed plate, a field strength of 3 × 10⁶ volt/cm could be achieved with 30 volts. At a field strength of 3 × 10⁶ volt/cm the maximum stress would be 12,000 lbs/in.² and the diaphragm would be extended 1800 Å. A small area electrical contact and lead wire could be carried with the diaphragm and relay action obtained. More favorable geometries, such as unstretched diaphragms and cantilever beams, could be used to help lower the operating voltage and to provide memory action of the relay due to the adhesion forces of the contacts. Actuator plates on both sides of the diaphragm would be needed to restore the diaphragm to the alternate contact. By employing a separate electrically insulated contact on the relay, normal relay action could be obtained. The contact area of electrostatic relays of such simple geometry must be smaller than the actuating area, or they must be operated at lower voltage, to prevent the contact voltage from exerting a strong controlling action on the diaphragm and overriding the action of the input signal. Mechanical contacts made in high vacuum and never exposed to air would be free of the contamination problems found in ordinary relays; however, these contacts would have to be especially prepared to prevent sticking. Hard contacts, such as tungsten carbide, or cermets such as tungsten and alumina, may provide contacts with minimum sticking action. These materials are easy to handle by film techniques and would not increase the fabrication difficulty.
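The field strength quoted for the relay gap is the uniform-field estimate E = V/d. A minimal sketch of that one-line calculation follows, assuming an ideal parallel-plate gap with no fringing; the function name is hypothetical.

```python
ANGSTROM_IN_CM = 1e-8  # 1 angstrom = 1e-8 cm


def field_strength_v_per_cm(volts: float, gap_angstrom: float) -> float:
    """Uniform-field estimate E = V / d for a parallel-plate gap."""
    return volts / (gap_angstrom * ANGSTROM_IN_CM)


# 30 volts across the 1000 angstrom relay spacing.
relay_field = field_strength_v_per_cm(30.0, 1000.0)
print(f"{relay_field:.1e} volt/cm")
```

The result reproduces the 3 × 10⁶ volt/cm figure in the text. The stress and deflection values depend on the diaphragm mechanics and are not modeled here.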
Electrostatic relays would be useful in controlling very low level signals, in the distribution of power to various electronic devices, and in certain memory functions that could use mechanical devices as slow as 0.1 μsec to drive a large number of electronic devices. In addition, optical reflection effects based upon mechanical motion and optical interference would be
useful for output equipment, in which incident light on one surface of the substrate could be modified to produce various colors and intensities in a desired pattern called for by the electrical information in the machine.

F. Electromechanical Filters
If a circular diaphragm of molybdenum has a radius of one micron, is 2000 Å thick, and is spaced 3000 Å from an exciting electrode, then the fundamental resonant frequency is 670 Mc and the electromechanical coupling coefficient is 6 × […] for a field strength of 10⁷ volt/cm. If the diaphragm is reduced to 200 Å, the resonant frequency is 67 Mc and the coupling coefficient rises quickly to 0.6, which is considered to be very strong coupling. In order to lower the operating frequency of these small mechanical filters, the fundamental resonance of a reed supported at one end may be used. For a molybdenum reed two microns long and 200 Å thick, the resonant frequency is 4 Mc. The frequency dependence on temperature, which is caused by a change in the elastic modulus of the materials used, could be compensated by the use of bimetal elements for the exciting electrode, provided the coupling coefficient is high enough to allow the electrical capacity of the circuit to approach in value the effective capacity of the electromechanical element. The outstanding mechanical properties of thin fibers and diaphragms have long been noted. Part of this strength comes from the lack of the crystallographic slip that is found in bulk materials. This characteristic would seem to impart stability to devices made from thin films. If metallic creep in molybdenum films became evident, then part of the metal in each element could be replaced by aluminum oxide, which has similar elastic properties but greater hardness. Just enough metal is needed to make an electrically conductive vibrating member. These electrostatically driven electromechanical elements appear to provide very convenient accessory components for vacuum tunnel effect devices, since they could be made from the same materials by the same processes and operate at the same voltage and field strengths.
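The order of magnitude of the resonant frequencies quoted above can be estimated from standard thin-plate and beam theory. The sketch below is not the author's calculation: it assumes an unstressed, edge-clamped circular plate and an Euler-Bernoulli cantilever, and it uses handbook values for molybdenum (Young's modulus, density, Poisson's ratio) that do not appear in the text. All names are hypothetical.

```python
import math

# Assumed handbook values for bulk molybdenum (not given in the text).
E_MO = 329e9      # Young's modulus, Pa
RHO_MO = 10.2e3   # density, kg/m^3
NU_MO = 0.31      # Poisson's ratio

ANGSTROM = 1e-10  # m
MICRON = 1e-6     # m


def clamped_disk_frequency(radius_m: float, thickness_m: float) -> float:
    """Fundamental frequency (Hz) of an edge-clamped, unstressed circular plate."""
    flexural_rigidity = E_MO * thickness_m**3 / (12 * (1 - NU_MO**2))
    lambda_sq = 10.2158  # eigenvalue of the fundamental clamped-plate mode
    return (lambda_sq / (2 * math.pi * radius_m**2)
            * math.sqrt(flexural_rigidity / (RHO_MO * thickness_m)))


def cantilever_frequency(length_m: float, thickness_m: float) -> float:
    """Fundamental frequency (Hz) of a rectangular reed clamped at one end."""
    beta_l_sq = 1.8751**2  # first root of the cantilever frequency equation, squared
    return (beta_l_sq / (2 * math.pi * length_m**2)
            * thickness_m * math.sqrt(E_MO / (12 * RHO_MO)))


thick_disk = clamped_disk_frequency(1 * MICRON, 2000 * ANGSTROM)
thin_disk = clamped_disk_frequency(1 * MICRON, 200 * ANGSTROM)
reed = cantilever_frequency(2 * MICRON, 200 * ANGSTROM)

print(f"2000 A disk: {thick_disk / 1e6:.0f} Mc")
print(f" 200 A disk: {thin_disk / 1e6:.0f} Mc")
print(f" 200 A reed: {reed / 1e6:.1f} Mc")
```

In both formulas the frequency is proportional to thickness, which is why reducing the diaphragm from 2000 Å to 200 Å lowers the resonance tenfold. With the assumed constants the estimates land in the same range as the 670 Mc, 67 Mc and 4 Mc values in the text; exact agreement should not be expected, since thin-film elastic constants and the mounting conditions may differ from the bulk, ideally clamped case.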
A simple connection from the anode of a tunnel effect component to the driving or exciting element of the filter is adequate to introduce the signal, although this requires a load resistor of several megohms to operate in the hundred megacycle region. Direct electron beam excitation can be used without load resistors if secondary emission is employed. An electron beam that is deflected across an alternately low and high secondary electron emission area of the filter element or the driving element can cause the voltage on the electrode to assume any value between the full anode voltage and the cathode voltage. Experimental observations in the electron microscope have shown a similar effect. When a material like silica or alumina is produced in film
form on a screen wire, and intentionally broken by mechanical means, the breaks usually produce cantilever beam structures that protrude out into the screen opening and are attached to the screen edge. When this sample is observed in the electron microscope under nonuniform illumination conditions, the image of the screen is seen to be vibrating, and the amplitude can be controlled by the illumination shape and intensity. This forced oscillation is evidently caused by some method of alternately charging and discharging the film by secondary electron emission. Two types of oscillation are evident in the microscope image: one is a relaxation effect which seems to produce square waves, since two distinct images are observed; the other is essentially sinusoidal and is assumed to be an oscillation at the resonant frequency of the cantilever beam. An electron multiplier could be inserted in front of the phosphor screen of the electron microscope and the output fed to an oscilloscope to determine the waveform and frequency of the oscillation. The point projection electron microscope and micromanipulator used on the program for “Research on Microminiature Field Emission Tubes” [15a] is ideally suited for obtaining data on micron-sized electrically operated mechanical filters. Data has recently been obtained on the low frequency resonances (audio spectrum) of the various members by exciting the system with alternating current instead of direct current. Had rf been introduced, and the projected image of a mechanically resonant structure been displayed on the fluorescent screen, the bandwidth and other features of the structure could have been investigated. It is assumed that fabrication methods for large arrays of electromechanical components would alternately test and correct the structures until the assigned frequencies are obtained.
The test could be implemented by direct excitation from the tunnel effect components or from the electron microscope feature of the micromachining process.

VI. Component Interconnection

A. Vacuum Tunnel Effect Amplifiers
A tunnel effect amplifier may be considered as a normal vacuum tube having a positive grid that draws low current. This positive grid allows the direct connection of anode and grids, thus dispensing with various coupling networks and transformers. When direct coupling of triodes is used, and the control grid is driven more positive than the anode, grid current results. Tetrodes would not have an appreciable control grid current under the same conditions, but the screen grid would draw current unless used as a low voltage suppressor grid. In spite of direct coupling features most of
the circuitry in this microelectronics system should be arranged for operation on a pulse basis in order to prevent excessive energy dissipation and heating. Stable load resistors are difficult to fabricate in this size range because the high resistivity materials needed have poor temperature coefficients of resistivity. Dynamic resistors in the form of vacuum tunnel effect triodes connected as diodes could serve as load resistors. Their temperature characteristics would be very low and matched to those of the active elements. It is believed that by the use of self-forming processes it would be possible to tailor the diode characteristics of triodes to fit the active element characteristics. The steep voltage-current characteristics of the tunnel diode would make a very nonlinear resistor, but this would not seem to be excessively troublesome in switching circuits. For driving a transmission line or setting the voltage on a storage capacitor the series tube arrangement shown in Fig. 7 could be used. A pulse
FIG. 7. Series connected tunnel effect vacuum triodes.
into the upper tube grid would cause conduction and the output point between the two tubes would be driven to anode potential. Conduction of the lower tube sets the output value to cathode potential. The time constant for both of these operations depends upon the conduction current and the external capacity, and would be in the neighborhood of 10⁻¹⁰ sec for normal operation. With the series tube connection shown in Fig. 7, the top tube is acting as a cathode follower and voltage degeneration results when the input signal is derived relative to ground potential. To provide gain in this stage without the use of inverting amplifiers and load resistors, the secondary emission amplifier circuit shown in Fig. 6 would be employed. When electrons are emitted from the cathode by either raising the grid voltage or lowering the cathode voltage they are accelerated to the dynode; this causes more electrons to be emitted than are received and the dynode is clamped to the anode potential. There is no mechanism shown in Fig. 6 for discharging the grid. Figure 8 shows a method of discharging the grid
[Figure 8 panels: secondary emission amplifier; series connected triodes; negative resistance memory. Labeled electrodes: anode, dynode, grid, input, cathode.]
FIG. 8. Assembly of multilayer tunnel effect components to give an active memory with low quiescent power consumption.
when the lower series triode conducts. Part of the electrons from the lower cathode are bypassed to the grid through the anode of the lower tube, thus discharging the grid when needed, namely, upon conduction of the lower tube. Figure 9 shows a single metal layer type of component used for the same function as Fig. 8. Skellett [37] and Bruining [45] describe various methods of employing secondary electron emission to obtain circuit functions normally difficult to achieve without the use of secondary emission. It is expected that the current density of the secondary emission process is similar to the current density of the primary electron beam, thus yielding switching time constants in the same region as a vacuum tunnel effect device. Space charge effects can be neglected when high field intensities are used. A relatively low-noise negative grid, Class A amplifier could be formed from the various basic elements of the vacuum tunnel effect device by providing a space charge smoothed emission source that is followed by a normal negative grid triode. The emission source for the triode would be provided by a field emission cathode and anode. The anode would have a
FIG. 9. Assembly of single-layer tunnel effect components to give an active memory with low quiescent power consumption.
hole in it similar to a positive control grid, and some of the emitted electrons would pass through the hole into the retarding field region of the negative grid triode that followed. The space charge smoothing that resulted in the retarding and drift region would smooth the current fluctuations and reduce the effective noise temperature of the emission source below the temperature of the emitter. If a room temperature emitter were used, a useful noise improvement over conventional negative grid thermionic tubes could result.
B. Memory Devices
It seems practical to achieve memory effects by employing vacuum tunnel effect devices to compensate for the leakage current of storage capacitors. The distributed capacity of active elements can be employed as memories provided the active elements have a negative resistance comparable to the positive leakage resistance of the dielectric. The dynatron action of secondary emission devices can be used for a memory, as can a flip-flop action that operates at current values equivalent to dielectric leakage.
FIG. 10. Negative resistance memory device composed of tunnel effect vacuum triode constant current generator and secondary emission dynode. (Labeled electrodes: anode, dynode and memory electrode, cathode.)
The dynatron or negative resistance type of memory device is shown in Fig. 10. This is essentially a tetrode with a secondary emitting screen grid. The cathode is required to emit a very low electron current somewhat greater than the leakage currents of the memory electrode or screen grid. This low current is provided by the control grid being connected to a supply voltage through a resistance formed by the dielectric leakage resistance. When the control grid voltage tries to rise, the increased emission current is partly intercepted by the grid and any further rise is degenerated. The screen grid is clamped to either anode or cathode potential, depending upon where it has been previously set, by the negative resistance or dynatron action of secondary emission. The power consumed for this holding action
would be roughly 10 times the leakage dissipation of the dielectric; this would make a large array of such memory devices possible without excessive power consumption. The array would resemble a capacitor of equivalent area and thickness. If power were removed for a short time (up to several hours at room temperature) there would be no loss of stored data. Upon heating the entire array of devices the leakage current would increase as well as the conduction current from the active devices; however, they
[Figure 11 panels: series connected triodes; triode-diode. Labels: cross-over, anode, grid, cathode, output.]
FIG. 11. Flip-flop memory device composed of cross-connected tunnel effect vacuum triodes using active anode load resistors.
would increase at approximately the same rate and prevent loss of stored data. Figures 8 and 9 show a series tube setting circuit, negative resistance memory, and secondary emission amplifier combination. A flip-flop type of memory could be employed in the same fashion as the negative resistance memory but would not require secondary emission. Figure 11 shows an arrangement in which a series tube connection is supplemented with a diode-triode connection to obtain low-power flip-flop action. During the setting of the memory the series tubes forcefully discharge the storage capacity in approximately 10⁻¹⁰ sec. Following this setting cycle the currents fall to very low values and the effective plate load resistances are in the region of one-tenth of the dielectric resistance values. Vacuum tunnel effect devices would be very good for this type of operation because of their extremely wide range of current swing for a small voltage change on the grid. During the holding part of the memory cycle, the upper diode and triode of Fig. 11 act as load resistors. The lower triodes have their grid and plate leads cross-coupled in typical flip-flop fashion and regenerate any charge loss on the memory capacitor. It would be possible to have a symmetrical flip-flop in which the diode was substituted for a triode, and complementary inputs as well as outputs were provided. The flip-flop circuit can be connected to output amplifiers of the series tube type when isolation is desired. This type of memory circuit can supply virtually no energy to external circuits and would normally be followed by low-μ amplifiers that remain quiescent until power is drawn from them by the external load.

C. Electromechanical Components
Electromechanical components are designed to operate at the full anode potential of vacuum tunnel effect devices and may be coupled directly to them. When low level signals are to be processed, the various stages must be isolated by coupling capacitors in the usual manner to prevent drift of dc characteristics from saturating the amplifiers. Most filtering requirements could be satisfied by using many active elements and electromechanical components in series-parallel combination.
D. Steerable Electron Guide
A periodically focused electrostatic electron guide seems compatible with this microelectronic system. Such a guide would be used to replace some of the fixed wiring in the system so as to give greater system flexibility. This electron guide system has the potentiality of greatly decreasing the power required to perform a particular computation by using combinational logic; in this case the beam must thread its way through a complex maze of logic states before giving up its energy to an electrode. Such an electron guide system can also cross beams in the same plane without interference, thus reducing the wiring problem in a single plane. Ultimately, certain beam-beam interactions seem suitable for the processing of information without having to charge and discharge the energy storage structures in a data processing machine. This is because a superconducting-like action is available through persistent current loops that can be set up and controlled by other beams. An interesting form of periodically focused electrostatic guide has been
used by Cook et al. [46] to form high perveance beams for traveling wave tubes. The analysis of the stability criterion of such a “slalom” beam has been given by Cook [47]. His data indicates that a beam guide structure having a spatial period of several microns and operated at 100 volts could easily contain a beam with a current density of 10⁴ amp/cm², which is adequate for 10⁻¹⁰ sec switching time operation. The velocity of this beam is a little greater than one hundredth the velocity of light; thus propagation across a one-inch system would occur in about 5 × 10⁻⁹ sec. The form of guide most suited to our purposes would consist of a uniform array of positive electrodes contained between planes of conductors operating at cathode potential. An electron beam would be injected into the structure with a certain range of angles and velocities so as to obtain a stable orbit or path through the array. The injection would be done with field emission sources followed by a simple grid structure to reduce the emission angle to a useful value. These launching structures would be located throughout the array at intervals of a few microns, in the same fashion as the positive electrodes. In order to guide the beam, it is necessary to alter the potential on the positive electrodes, with the voltage swinging between anode potential and cathode potential. Beams having the same velocity but launched at different angles follow different paths and may be affected in various ways by the steering electrode. Beams launched at a high angle relative to the direction of travel are most easily turned by the steering electrode; beams launched at low angles and crossing very near to the positive electrodes are difficult to turn. Collection of the electrons must be accomplished by inserting an electrode in the electron path or by disturbing the electron path in the vicinity of the collection point so as to allow collection by the positive guide electrode.
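The beam velocity and transit time quoted for a 100 volt guide can be checked from the kinetic energy of the electron. The following sketch assumes a nonrelativistic electron traveling in a straight line at its full 100 volt energy; it ignores the longer slalom path and the periodic slowing of the beam, so it gives a lower bound on the transit time. The constants are rounded textbook values and the names are hypothetical.

```python
import math

ELECTRON_CHARGE = 1.602e-19  # coulomb
ELECTRON_MASS = 9.109e-31    # kg
LIGHT_SPEED = 2.998e8        # m/s
INCH = 0.0254                # m


def electron_speed(accelerating_volts: float) -> float:
    """Nonrelativistic speed (m/s) from e*V = (1/2)*m*v**2."""
    return math.sqrt(2 * ELECTRON_CHARGE * accelerating_volts / ELECTRON_MASS)


speed = electron_speed(100.0)
fraction_of_c = speed / LIGHT_SPEED
transit_time_s = INCH / speed  # straight-line crossing of a one-inch system

print(f"speed:        {speed:.2e} m/s")
print(f"fraction c:   {fraction_of_c:.3f}")
print(f"transit time: {transit_time_s:.1e} sec")
```

Under these assumptions the straight-line transit time comes out somewhat under 5 × 10⁻⁹ sec, which is consistent with the figure in the text once the weaving path of the slalom beam is allowed for.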
The collection points would have to be as numerous as the launching positions. In addition, it would seem reasonable to include several other functions, such as memory for the steering functions, and memory and gating elements for the logic functions. This entire group of components would form a low complexity module that has internal communication via wires and distant communication via electron guides. The entire array of modules would operate at a single potential of around 100 volts, even though the electron velocity periodically falls to lower values when turning. By using free electron beams isolated from the lattice and having a suitable interaction space, very low noise amplifiers similar to the parametric amplifiers of Crumly [48] and Udelson [48a] could be obtained, in spite of high lattice temperature. This electron beam parametric amplifier would have a decided advantage over solid state types employing tuned circuits because of the low Q of microminiature tuned circuits. The construction of free electron guides would primarily involve providing hollow structures between plane electrodes supported periodically by metallic electrodes. The construction process would thus omit wires and provide vacuum cavities in their stead. The understanding of the principle of operation or the application of steerable electron guides is far from complete; they are mentioned here only to show some of the potentialities of free electron systems.
E. Plasma System
It is the author’s opinion that all organization of the sort that is interesting to data processing does not usefully cease at a temperature equivalent to the melting point of matter. Conceptually, it may be feasible to consider a gaseous-like array of very small particles, such as ions, organized in quasistatic structures, through which slalom-focused electron beams thread their way. Some of these beams could serve as easily alterable system interconnections. Others, circulating in persistent current loops, would serve as memory elements; control of these beams and energy transfer would be accomplished by beam-beam interaction. Some preliminary steps [49] have been taken by showing the possibility of confining charged particles in quasistatic positions in an enclosed space by varying electric fields imposed externally. This electrodynamic suspension system could provide the means for the organized structure consisting of charged particles so disposed in space as to guide the slalom electron beams carrying information. One of the most interesting features of a plasma-like machine is the triviality of the microminiaturization problem. Due to the extremely elastic behavior of the entire system, the organization could be accomplished at convenient sizes and then scaled down by a change in some parameter such as the confinement field frequency. This type of machine would show a high degree of three dimensional communicating ability, very much unlike our present layered structures. The extreme plasticity of the machine’s structure would assist the organization of data for optimum machine usage. All useful electronic effects would be available to such an electronic organism. It appears possible to generate and receive information from the acoustical end of the spectrum up into the X-ray spectrum, and to process information at rates that would seem phenomenal by present standards.
VII. Substrate Preparation

The deposition of film-type microelectronic components requires a rigid supporting structure that is capable of being processed at high temperatures and able to have ruggedly bonded lead wires attached to it. We have elected to use refractory transparent dielectric substrates such as sapphire. These substrate plates are approximately 1 inch square and 0.01 inch thick. The present cost of one-inch sapphire windows, which represent a good starting point for our substrates, is ten dollars, when bought in small quantities. The cost of preparing the substrate for final use could make the raw material cost quite tolerable. A high thermal conductivity, rigid refractory material such as beryllia would be more suitable to our application, but as yet there is no transparent material available at reasonable cost.

A. Mechanical Forming

The over-all dimensions such as thickness, parallelism, squareness, and flatness can be obtained by ordinary commercial grinding techniques to dimensional tolerances substantially better than 0.001 inch. The maximum convenient size for a substrate of 0.010 inch thickness is about 1 square inch. The most difficult operation is the mechanical polishing of surfaces to the degree required for microelectronic components. A surface smoothness approaching the 300 Å resolution of the construction process would be desirable; however, it is difficult to obtain over the entire area of the substrate. Glass surfaces that have not been weathered have an average smoothness greater than 100 Å but these surfaces are usually produced by fire polishing. Fire polishing has the undesirable property of fusing various dust particles into the surface, where they become rough spots and chemical anomalies that can later damage the film components. Mechanical polishing of glasses frequently produces sleeks, pick-out cusps, and a host of mechanical difficulties that are a result of melting the surface by friction from the polishing medium. Ehrenberg [50] and Koehler [51] discuss some of these problems in relation to producing ultrasmooth surfaces for X-ray reflection.
Polishing of hard, high-temperature materials like sapphire usually produces errors in the form of minute fractures in the surface instead of the melting effects seen in glass. Sapphire can be polished by commercial techniques so that it shows few flaws greater than 1000 Å in depth, although some of the scratches produced extend for quite some distance. It does not seem worthwhile to attempt to remove all of the mechanical irregularities down to a smoothness of 300 Å by mechanical polishing alone; however, this process is a great asset in producing surfaces flat to less than one fringe of light and with a consistent smoothness of around 1000 Å, having very few anomalies protruding above the surface. As will be shown later, the
pits are less troublesome than the spikes when attempting to further smooth the surface.

B. Substrate Cleaning
There are two slightly related problems in substrate cleaning, namely, to remove the specks of dust and debris that adhere to the surface and to remove the material that is contained in the substrate just below the surface and distributed throughout it as gas. If these problems are not handled separately, then surface specks can be driven into the bulk by heating, greatly complicating their removal. The aim is to remove most of the dust particles, then heat the substrate to very high temperatures in vacuum to drive out the remaining material that can be removed at the firing temperature. A high-vacuum cleaning method, which seems adequate for removing surface dust and other anomalies that have not been fused into the surface, has been previously investigated by the author. This method was developed after it was found that, even if a surface could be cleaned by very laborious techniques in air, the surface usually became dirty again during the pumpdown of the vacuum system. A vacuum system used in everyday service is far from the clean environment visualized because there are many tiny metallic and dielectric particles as a result of previous depositions; these particles are moved around in the system under the influence of convection and electrical forces, some depositing on the substrate. These particles can be stabilized by welding them to the chamber. This is done by baking the vacuum system at high temperatures, while operating all of the high voltage electrical circuits, and depositing several thousand angstroms of material on the walls of the vacuum chamber. To remove foreign material from the surface down to a particle size of 300 Å, a method employing explosive reevaporation from the surface has been tested and found satisfactory. This method consists of evaporating a highly volatile and decomposable material like ammonium chloride onto the surface to a thickness of several thousand angstrom units.
This material is pulse-heated by an electron flood beam of about 100 amp/cm² at 20 kv between a tungsten filament and an anode ring of tungsten that is located near the substrate. After this treatment it has been found that the surface is free of foreign material that has not been fused into the surface by prior heating or polishing. An explanation that can be offered for the cleaning action is that the explosive vaporization of ammonium chloride carries off adhering particles by mechanical action after having discharged material adhering by Coulomb forces. The electron beam ionizes a great many of the gas molecules at the surface to assist in neutralizing charged
debris long enough to be removed from the surface. The ions formed do not have sufficient energy to sputter the surface to any measurable extent if a short current pulse is used because they are formed in the low voltage region close to the surface. After removing adhering particles from the surface, the substrate can be heated to around 1700°C in vacuum to remove many of the absorbed and adsorbed gases produced during the mechanical polishing of the surface.
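As a rough illustration of the pulse heating described above, the power density of the flood beam follows from the product of current density and accelerating voltage. The sketch below uses the beam figures quoted in the text (about 100 amp/cm² at 20 kv); the pulse length is a hypothetical value chosen only for the example, since the text says no more than that a short pulse is used.

```python
# Illustrative arithmetic only. The beam current density and voltage are the
# figures quoted in the text; the pulse length is a hypothetical assumption.

def beam_power_density(current_density_a_per_cm2: float, voltage_v: float) -> float:
    """Return the beam power density in W/cm^2 (power per area = J * V)."""
    return current_density_a_per_cm2 * voltage_v


def pulse_energy_density(power_density_w_per_cm2: float, pulse_s: float) -> float:
    """Return the energy per unit area in J/cm^2 for a rectangular pulse."""
    return power_density_w_per_cm2 * pulse_s


power = beam_power_density(100.0, 20_000.0)   # 100 amp/cm^2 at 20 kv
energy = pulse_energy_density(power, 1e-6)    # hypothetical 1-microsecond pulse

print(f"power density : {power:.3g} W/cm^2")
print(f"energy density: {energy:.3g} J/cm^2 per pulse")
```

The point of the sketch is only that the instantaneous power density is very large, so even a brief pulse delivers enough energy to vaporize a film a few thousand angstroms thick.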
FIG. 12. High-vacuum electron-bombardment-heated furnace.
The effectiveness of heating as a method of cleaning is discussed by Hagstrum [52]. A vacuum furnace suitable for this heating, as well as for ceramic firing and metallizing, is shown in Fig. 12. The entire assembly is installed in a vacuum bell jar and pumped down to the mm Hg region. The material to be fired can be loaded into the molybdenum box shown in the center of the water-cooled housing, and a small molybdenum door is fitted onto the box to prevent direct electron bombardment of the material being fired.
The anode box of molybdenum is heated by electron bombardment from the filament that is coiled around the box and suspended from the water-cooled housing. A temperature of 2000°C can be attained by using an accelerating voltage of 5 kv at 0.5 amp. The temperature is monitored by a tungsten-molybdenum thermocouple inserted in the anode. To avoid having to insulate the thermocouple meter, the anode is operated at ground potential and the water-cooled housing and the filament are operated at high negative voltage. To do this the water-cooling lines must be insulated by 10 feet of plastic tubing; the filament current is supplied by a 10 amp isolation transformer with a 115 volt output. This furnace can be put on automatic regulation by using the standard emission regulators described in Section VIII, A and using the thermocouple output to feed a Philbrick USA-3 amplifier.
After this heat treatment the surface is no longer smooth; it shows signs of thermal etching, has peculiar extrusions of material from inside the substrate, and has many unexplained steps and ridges. A subsequent mechanical polishing that is carried out with extremely fine diamond powder and water, using another sapphire plate as a lap, will remove most of the large anomalies. This process should be carried out with extreme care and only the minimum of material removed. When this substrate is once again cleaned by the above process in high vacuum and heated to around 1600°C the surface will be found to be smoother than 800 A, with only an occasional particle sticking out of the surface and a very few fractures deeper than 1000 A. It is an extremely tedious job to check by microscopy a surface that has any appreciable area, although methods will be described later for doing so.
C. Substrate Smoothing
A substrate smoothing method, which involves drawing a taut film across the substrate surface, and sintering it onto the surface by heating, has been previously investigated by the author. This process, which is essentially the same one used to form vacuum cavities, was discussed in Section IV, G on tunnel effect components. The process essentially amounts to depositing a material that can be easily liquefied, such as myristic acid; rapidly heating and cooling the material to produce a smooth surface; and evaporating a thin layer of material over the myristic acid to provide a refractory surface to receive a heavier deposit later. The myristic acid is then driven out and the thin film is sintered to the substrate. When completed, a heavy film is deposited over the thin film to provide mechanical strength. The entire process produces a surface that is smoother than 300 A; this smoothness is controlled by the crystallization of the smoothing material and the heavy film covering it.
If this process were optimized, the advantages of a smooth fire-polished glassy surface could be retained along with the advantages of stability and cleanliness of refractory materials such as sapphire. There has been no work done on determining the limits of smoothness attainable by vacuum-fire-polishing sapphire or similar crystalline materials with an intense electron beam; however, there is some evidence, from results on flame-polished samples in air, that smoothness could be obtained. The difficulty that might be expected would be the production of thermal etch pits and ridges. Even though a fire polishing process may be developed for the substrate proper, it will not serve to smooth between layers of components because of the damage to the components. If low temperature materials are used and retained between layers to allow fire polishing, the stability of the entire system would suffer and high bake-out temperatures for cleaning and outgassing could not be reached.
D. Terminals
Sapphire is a very good material to use for a substrate because stable, rugged terminals can be attached to it. Molybdenum metallizing can be applied to sapphire and then coated with nickel or gold-nickel alloy for the attachment of nickel connecting wires or other, more corrosion-resistant alloys. Sapphire has very high strength and an expansion coefficient near that of the metallized molybdenum coating. Satisfactory terminals have been produced in the past on sapphire by notching the edge of the substrate with a slot 0.02 inch wide and deep; coating this with a mixture of 90% -325 mesh molybdenum powder and 10% zirconium oxide; and firing at 1850°C in a vacuum furnace. This operation is carried out before the substrate goes into the final grinding operation, so that the terminals are ground flat along with the substrate. When all processing of components is completed on the substrate, nickel or gold-nickel alloy can be used to vacuum-braze the connecting wires onto the terminals at temperatures between 970°C and 1500°C, depending upon the brazing material selected. In addition to the connecting wire termination, the thin film wires of the components must be terminated with a heavy vacuum-deposited film of molybdenum that acts as a buffer between the fired-on molybdenum terminal and the thin film. If this is not done there is an occasional open-circuit produced at the junction after high temperature cycling, probably as a result of some corrosion process and ineffectiveness of encapsulation at the junction of the heavy molybdenum alloy terminal and the thin film. Terminals produced by these methods were used in the encapsulation tests described in Section VIII, C, where repeated cycles between low
temperatures and temperatures around 800°C were made in very corrosive environments. The terminal seemed to be the most inadequate link in the process and failure was caused by some corrosion mechanism working its way under the encapsulation in the region of the molybdenum terminal. By siliconizing the terminal and lead wire, a protective covering over the entire assembly could be obtained. Molybdenum wires treated in tetraethyl orthosilicate vapor at around 800°C can be operated for extended periods above 1200°C in air without difficulty [53]. The protective layer is apparently a thin coating of quartz, bonded to the molybdenum through a molybdenum disilicide phase.
E. Substrate Testing Methods
Two classes of checking methods seem in order: one, to help the investigator converge upon a solution to surface-smoothing and speck removal; the other, to simply test the surface before proceeding to the next layer of components. Electron microscopy can serve for the first method, and the deposition of a very thin film capacitor can serve for the second. If a surface is cleaned properly and coated with a release agent, such as carbon, by vacuum evaporation, then coated at an acute angle with a shadowing material like chromium, which is subsequently stripped from the surface for examination in an electron microscope [54], variations in the surface characteristics smaller than 50 A can be determined. The shape of the offending anomalies can help determine the source of trouble, and occasionally a particle is picked up by the replica technique and electron diffraction can help determine its origin. Replica techniques are useful when commercial transmission microscopes must be used and direct surface examination cannot be employed. Later, we will incorporate a scanning probe electron microscope in our equipment and be able to examine surfaces directly; however, this is a very tedious process when large areas must be examined. An automatic process could probably be developed, but as yet nothing has been done toward achieving this goal. The testing method would involve searching for any light or dark area that appeared while the surface is scanned electronically and moved mechanically. One test method that has been used in the past to determine the smoothness of large areas is the deposition of a capacitor having a thin dielectric. In this test, alternate layers of molybdenum and aluminum oxide are deposited to a thickness of around 100 A by reactive deposition methods, as described in Section VIII, C on material deposition.
The most sensible test for a surface is whether or not it is smooth enough to build microelectronic devices, and the thin film capacitor represents a useful test. Capacitors having about 100 A dielectric thickness and one square inch area have been produced by careful technique. This test method incidentally revealed
that evaporators frequently spatter out particles a few hundred angstrom units in diameter which puncture dielectrics and roughen the surface; therefore, reactive deposition processes were substituted. These capacitors, which are discussed in more detail in Section VIII, C, have shown up to 1000 mfd/in.², as judged by time constant measurements. A high series resistor was used to watch for the onset of tunnel emission, which must not become excessive.
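The time constant measurement mentioned above can be sketched as follows, assuming the capacitor discharges exponentially through a known resistance. The text does not describe the measuring circuit, so the two-sample method, the function names, and every numerical value below are hypothetical and serve only to show the arithmetic.

```python
import math

# A minimal sketch, assuming a simple RC discharge V(t) = V0 * exp(-t / (R * C)).
# All numbers are hypothetical; none of them comes from the text.

def time_constant(t1_s: float, v1: float, t2_s: float, v2: float) -> float:
    """Estimate the decay time constant from two voltage samples (v1 > v2 > 0)."""
    return (t2_s - t1_s) / math.log(v1 / v2)


def capacitance_farads(tau_s: float, resistance_ohm: float) -> float:
    """Capacitance implied by a time constant and a known resistance (C = tau / R)."""
    return tau_s / resistance_ohm


# Hypothetical reading: the voltage falls from 1.0 V to 1/e V in 20 seconds
# across a 10-megohm resistor.
tau = time_constant(0.0, 1.0, 20.0, 1.0 / math.e)
c = capacitance_farads(tau, 10e6)

print(f"time constant: {tau:.2f} s")
print(f"capacitance  : {c * 1e6:.2f} microfarads")
```

Dividing the result by the plate area gives a capacitance per unit area that can be compared from one substrate to the next.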
VIII. Material Deposition
Two deposition methods are used in this microelectronic construction process: thermal evaporation and reactive deposition. Both methods are carried out in a high vacuum system, and both need very similar equipment to perform deposition. Both require heaters for the source materials, monitors for determining evaporation rates, and substrate heaters. Thermal evaporation methods use a source that is much hotter than the substrate, whereas reactive deposition usually uses two sources that are cooler than the substrate, as well as a substrate that is hot enough to react the two materials or decompose them. As examples, molybdenum can be evaporated by having a source at approximately 2600°C and a substrate surface at 400°C; reactive deposition uses a molybdenum chloride source operating near 400°C, a hydrogen source at room temperature, and a substrate temperature as low as 900°C or as high as 1500°C. The reason for preferring reactive deposition is that by using a very hot substrate, clean, properly crystallized, and stable materials result. The substrate temperature can be much higher for reactive deposition than thermal evaporation because the reaction by-product evaporates from the surface, taking any excess energy with it. This permits the deposition of the desired material without the reevaporation experienced in thermal evaporation deposition. Deposition of compounds such as oxides, nitrides, carbides, borides, sulfides, and silicides can be done by reactive deposition without decomposition [55], whereas the deposition of many compounds is difficult or impossible with the thermal evaporation process. A very great advantage is obtained with reactive deposition in that pinhole-free and dense deposits are usually obtained.
This is because there is a high degree of surface mobility that prevents shadowing of any rough spots on the surface; there thus results a complete dielectric covering of surfaces and a strain-free crystalline structure that is not prone to break out tiny pieces, an effect frequently found in evaporated deposits. The high substrate temperature also immediately volatilizes any spattered pieces from the evaporator that arrive at the substrate. These
pieces would normally constitute specks on the surface of evaporated deposits and cause pinholes. Thermal evaporation methods will not be used for end-product materials in the final component phase of this work, but only for temporary deposits where the substrate must be kept cold, such as in the smoothing operation between layers; for the production of resists; and for the first film of the vacuum encapsulation process. Due to their simplicity, they will also be used in interim studies, where the finest films are not required.
A. Thermal Evaporation
Two types of material sources are used in this project. Low temperature, nonreactive materials are handled by radiation-heating crucibles, containing the various materials, with resistance wire heaters. The heater wires are usually molybdenum or tungsten. High temperature and reactive materials are heated by direct electron bombardment. For large samples and high evaporation rates, a water-cooled hearth is used and the evaporating material is bombarded with high energy electrons. Metals and dielectrics can be handled by this method provided the electron velocity is greater than the one-to-one point on the secondary electron emission characteristic, thus causing the dielectrics to charge positively. Small samples can be either directly bombarded and serve as their own support, or they can be contained in refractory metal baskets or ceramic crucibles. Bombarding of tungsten or molybdenum baskets is preferred to resistance heating because there is no burnout caused by alloying and thinning of the heater wire; in addition, there is no shorting problem between turns. Figure 13 shows a small multiple source evaporator for simple experiments; Fig. 14 is a drawing of the same evaporator. A mechanical manipulator moves the various sources and their shields under an electron-emitting filament. A switch makes contact with the appropriate anode and high positive voltage is supplied. The voltage ranges from 200 volts for materials like aluminum chloride, having a high vapor pressure, to 3000 volts for a material like tungsten. The emission source is a 0.010-inch-diameter tungsten filament, which normally runs at about 7 amps, to obtain emission currents of about 100 ma, the highest value used for this small evaporator. An auxiliary electrode, called the ion collector, is placed above the filament and operated at 30 volts negative with respect to the ground and filament.
This ion collector collects a fraction of the ions present, as a result of the bombardment of the evaporating material, and serves to indicate the evaporation rate. The ion current is amplified and fed into the emission controller, as shown in Fig. 15, to cause automatic regulation of the evaporation rate. The desired evaporation rate is selected on the range switch and
FIG. 13. Apparatus used for electron-beam-activated micromachining showing vacuum station, adapter spool, multiple-source electron beam heated evaporator, flood electron gun, shutter and substrate heater.
FIG. 14. Drawing of apparatus used for electron-beam-activated micromachining shown in Fig. 13.
FIG. 15. Schematic diagram of evaporation rate regulator used on multiple-source electron-beam-heated evaporator shown in Fig. 14.
can be repeated to about 30% of previous values. The variations come from changes in geometry and changes in the gauge sensitivity as a result of change in the bombarding current. For the evaporation of larger quantities of refractory material, the dual electron bombardment evaporator, shown in Figs. 16 and 17, is used. This evaporator uses a water-cooled anode operated at voltages between 1000
FIG. 16. Dual source, electron-beam-heated evaporator for 1/2-inch diameter samples of refractory and reactive materials employing a water-cooled hearth for material support, shown with upper shield removed.
and 7500 volts. The emission current is supplied by a 0.012-inch-diameter tungsten filament that can supply a maximum of 0.5 amp emission at approximately 10 amp heater current. For low voltage operation, a 0.01-inch-diameter tungsten grid wire structure having the same diameter as the filament structure (about 0.80 inch) is used to assist emission. This grid is held between 200 and 300 volts positive and serves to control the electron emission. An ion collector wire located above the top shield is run
at 30 volts negative and used to indicate the evaporation rate. The regulators can be connected between the ion collector and the control grid to regulate the evaporation rate. For evaporation of 0.50-inch-diameter samples, the input power is about 200 watts for nickel-iron alloys, 600 watts for molybdenum and aluminum oxide, and 1200 watts for tungsten and tantalum. The maximum evaporation rate is limited by the onset of a glow discharge caused by the presence of the high pressure molecular beam. For a 0.5-inch-diameter source the maximum deposition rate is about 2000 A/min at a distance of 10 in. During normal operation of this evaporator the melted and evaporating materials do not stick to the cooled anode, although occasionally materials like aluminum and silver do stick; an increased power input is then needed to continue evaporation because of greater heat transfer to the anode. A shield system is provided to prevent large quantities of evaporating
FIG. 17. Drawing of water-cooled electron-beam-heated evaporator shown in Fig. 16.
materials from adhering to the filament and changing its operating characteristics. Another shield between the filament and the substrate prevents most filament materials from reaching the substrate either from thermal evaporation or from reaction products like tungsten oxide, which is formed when aluminum oxide partly decomposes upon evaporation. The shield is not completely effective; occasionally, filament material reaches the high pressure evaporation zone and is scattered to the substrate. Another source of impurities is the sputtering of atoms from the construction materials near cathode potential, but this effect diminishes as a thin film of evaporating material is built up on the structures, allowing sputtering of only the desired material. Low temperature materials, particularly those used in reactive deposition and molecular beam etching, are evaporated from sources similar to the one shown in Fig. 18. A crucible of glass, alumina, or metal is centered in one of the heater coils shown and heated by radiation. The maximum temperature that can be reached conveniently by this method is about
FIG. 18. Dual evaporators and ion gauge evaporation rate monitors for low temperature materials.
900°C, but this is adequate for any materials used in reactive deposition or molecular beam etching. The evaporation rate from this source is monitored by a simple ion gauge located in a position to see only one of the sources. The ion gauge cross section is shown in Fig. 19. This gauge is operated at 300 volts anode voltage and 1 ma. The ion collector is operated 20 volts negative with respect to the cathode and ground. Under these conditions the gauge sensitivity for air is around 10 µamps of ion current per micron of pressure. The circuit shown in Fig. 20 is used to regulate the emission current, amplify the ion current, and regulate the evaporation rate. By careful attention to geometrical positioning, this type of evaporation rate controller has produced films that have optical densities consistent to 2%. The film thickness should have a corresponding consistency. The principal cause of variation is the movement of the gauge relative to the source. An inverse square law of evaporating pressure vs. distance is seen by the gauge and this markedly affects the film thickness obtained. Calibration of such a gauge and evaporator is normally carried out by evaporating a known weight charge of material to completion and observing the time required. Repeated runs of the same weight have shown that the results are repeatable to within 2% of any previous experiment and better than 1% of recent experiments.
FIG. 19. Cross sectional drawing of triode type ion gauge evaporation rate monitor.
FIG. 20. Schematic diagram of ion gauge emission controller and evaporation rate regulator.
The ion gauge shown in exploded view in Fig. 21 is a ceramic-metal type designed for service in the ultrahigh-vacuum system where high bakeout temperatures are used. This design will eventually become the ionizer for an rf mass spectrometer that will be used to monitor evaporation rate to
and from the substrate. This gauge uses a control grid to control the emission, an anode operating at around 300 volts to draw emission into the ionizing chamber, an ionizing chamber that can have its potential varied between 10 volts and 300 volts, and a small diameter ion collector wire operating up to -20 volts. Typical operating currents are: first anode, 2 ma; ion chamber, 4 ma. The ion chamber has a separate collector at the far end from the electron source that operates at about 1 ma. This electrode is used to indicate the ionizing current and is connected to the current regulating amplifier. The output of this amplifier connects to the control grid and maintains constant ionizing current. A 0.003-inch-diameter wire
FIG.21. Ceramic-metal type ion gauge evaporation rate monitor designed for high temperature service in ultrahigh vacuum.
is inserted into the ionizing chamber to give an indication of the gas or molecular beam pressure there. The sensitivity of this gauge operated under the above conditions is 20 µamp per micron of pressure for air. The gauge is constructed of high alumina and fired in vacuum at 1900°C. The metallizing is a 0.0005-inch-thick coating of -325 mesh molybdenum powder mixed with 10% zirconium oxide and fired at 1900°C. The various terminal posts and rods consist of molybdenum wire that has been cemented in by a 90% alumina and 10% zirconium oxide mix that is also fired in vacuum at 1900°C. The dimensional stability of these parts is excellent, even at 900°C bakeout temperatures. Both types of ion gauges may be used to monitor the evaporation from any type of source. When these gauges are used with high powered electron
bombardment evaporators the leads have to be shielded from stray electrons and ions; in addition, deflection plates are placed at their molecular beam entrance in order to prevent ions and electrons from entering directly and changing the gauge calibrations. These gauges must be operated in a vacuum of better than mm Hg to reduce the effects of background gas on the gauge reading; however, they have been operated at pressures as high as 10⁻⁴ mm Hg by using a second gauge to monitor the background gas pressure only and then subtracting the two gauge readings externally. A 3% balance over a pressure range of 100 to 1 has been obtained in the past.
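The gauge arithmetic described in this section can be illustrated with a short sketch. It takes the two sensitivities quoted for air, read here as 10 and 20 microamperes per micron, and shows the conversion from ion current to pressure, the two-gauge background subtraction, and the inverse square scaling with distance. The function names and the sample readings are hypothetical, the air sensitivity will not in general apply to a beam of another material, and treating the source as a point so that the deposition rate itself scales as the inverse square of distance is an assumption made only for the example.

```python
# A minimal sketch under stated assumptions; the readings below are invented.

TRIODE_SENSITIVITY_UA_PER_MICRON = 10.0    # triode gauge of Fig. 19, for air
CERAMIC_SENSITIVITY_UA_PER_MICRON = 20.0   # ceramic-metal gauge of Fig. 21, for air


def pressure_microns(ion_current_ua: float, sensitivity_ua_per_micron: float) -> float:
    """Convert an ion current (microamperes) to pressure (microns of mercury)."""
    return ion_current_ua / sensitivity_ua_per_micron


def beam_pressure(beam_gauge_microns: float, background_gauge_microns: float) -> float:
    """Subtract the background gauge reading from the beam gauge reading."""
    return beam_gauge_microns - background_gauge_microns


def rate_at_distance(rate_ref: float, distance_ref: float, distance: float) -> float:
    """Scale a rate by the inverse square of distance (point-source assumption)."""
    return rate_ref * (distance_ref / distance) ** 2


# Hypothetical readings: 5 microamperes on the beam gauge, 1 on the background gauge.
beam = pressure_microns(5.0, TRIODE_SENSITIVITY_UA_PER_MICRON)
background = pressure_microns(1.0, TRIODE_SENSITIVITY_UA_PER_MICRON)
net = beam_pressure(beam, background)

# The text quotes about 2000 A/min at 10 in.; doubling the distance quarters it.
rate_20_in = rate_at_distance(2000.0, 10.0, 20.0)

print(f"net beam pressure: {net:.2f} micron")
print(f"rate at 20 in.   : {rate_20_in:.0f} A/min")
```

The inverse square term also shows why small movements of the gauge relative to the source appear directly as errors in film thickness.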
B. Substrate Heater
A substrate heater that is particularly adapted to reactive deposition is represented in the sectional drawing of Fig. 22 and shown disassembled in
FIG. 22. Ceramic-metal electron bombardment substrate heater for 1500°C service.
Fig. 23. This heater is designed to raise 0.70-inch-square substrates to 1500°C without introducing reaction by-products or introducing impurities, as some metallic heaters do. The construction material is alumina with molybdenum coating on the inside of the cups. High temperature vacuum firing methods are used throughout (as in the ionization gauge described in Section VIII, A), thus preventing volatile impurities from spoiling the films being produced. Heating is accomplished by electron bombardment of the molybdenum coating on the inside of the anode cup. The emission source is a 0.010-inch-diameter tungsten filament wire that emits up to 0.10 amp. A circular piece of molybdenum is used in the center of the filament to give an annular electron beam on the anode to improve the heating uniformity. Regulation of the heat is obtained by a standard emission controller, as used on the ion gauge evaporation rate regulators. The regulators can be fed from a high gain amplifier, such as the Philbrick USA-3 that is used to amplify the output of a thermocouple. The thermocouple may be either placed near the
substrate and operated by radiation coupling or by evaporating molybdenum-tungsten thermocouples onto the substrate. Using 1000 volts at 50 ma the heater reaches 800°C; 3000 volts and 100 ma gives 1700°C. The maximum temperature is limited by the softening point of the aluminum oxide anode, which is about 1700°C. This type of heater, having no voltage gradient across the hot ceramic, can attain higher temperatures than the resistance wire type because of the absence of electrolysis in the ceramic due to the presence of an electric field. Broad
FIG. 23. Photograph of disassembled substrate heater shown in Fig. 22.
area metallic sheet heaters can reach high temperatures, but they usually have exposed hot metal parts that can cause undesirable chemical reactions during etching or deposition. The small substrate heater shown in Fig. 24 is used for carrying out experiments on electron microscope specimen screens. This heater is heated by a 0.015-inch-diameter tungsten filament and can reach about 1000°C at 16 amp. The principal construction material is alumina. The substrate heater is used in the demonstration of electron-beam-activated micromachining described in Section XIV. It is shown installed in the vacuum system in Figs. 13 and 14.
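The operating points quoted for the electron bombardment heaters can be put on a common footing by computing the bombardment power, which is the product of accelerating voltage and emission current. The sketch below uses only figures given in the text (the substrate heater at 1000 volts and 50 ma, and at 3000 volts and 100 ma, and the vacuum furnace of Fig. 12 at 5 kv and 0.5 amp); the table layout and names are illustrative, and filament heating power is ignored.

```python
# Illustrative arithmetic only: bombardment power P = V * I for the operating
# points quoted in the text. Filament heating power is not included.

OPERATING_POINTS = [
    # (description, volts, amperes, temperature reached in degrees C)
    ("substrate heater, low setting", 1000.0, 0.050, 800),
    ("substrate heater, high setting", 3000.0, 0.100, 1700),
    ("vacuum furnace of Fig. 12", 5000.0, 0.500, 2000),
]


def bombardment_power_w(volts: float, amperes: float) -> float:
    """Return the electron bombardment power in watts."""
    return volts * amperes


powers = [bombardment_power_w(v, i) for _, v, i, _ in OPERATING_POINTS]

for (name, _, _, temp_c), p in zip(OPERATING_POINTS, powers):
    print(f"{name:32s} {p:7.0f} W -> about {temp_c} deg C")
```

On these figures the substrate heater needs roughly six times the power to go from 800°C to 1700°C.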
FIG. 24. Small substrate heater (labeled parts: thermocouple hole, heater, retainer spring).
C. Reactive Deposition
Reactive deposition is characterized by low pressure molecular beam sources that are in line of sight of the substrate. The commercial process called vapor plating [55] is normally carried out at high pressures in a container; here there is a strong possibility of reaction with the walls and hence contamination of the specimen being deposited. Some work in vapor plating has been directed toward epitaxial growth of semiconductors and other electronic end-products [56], but the majority of the work is for the production of thin corrosion-proof coatings on high temperature devices like rocket nozzles and exhaust manifolds. The two large classes of reactive deposition are pyrolysis and combinational reactions. The first class is illustrated by the deposition of silica from the decomposition of tetraethylorthosilicate on a hot surface and the deposition of tungsten from the decomposition of tungsten hexacarbonyl. Most low temperature reactions involve the decomposition of organic compounds and the side reactions can contaminate the deposit more readily than in cases where inorganic sources are used. High temperature reactions of the same type are known as thermal decomposition reactions, and with these material can be deposited from the halides directly without undue fear of contamination. The deposition temperatures by this method are frequently too high to be useful, in that the substrate will not withstand the temperature. The second class of reactive depositions are produced by evaporating two materials to a hot surface simultaneously and reacting them. Molybdenum is produced by the reduction of molybdenum chloride with hydrogen or a reactive metal such as zinc, aluminum, or magnesium. In this reaction, the volatile halide is carried away from the hot substrate and deposits on
the wall of the equipment or is pumped away. Aluminum oxide is deposited by the reaction of aluminum chloride and water vapor or other oxidizers such as ammonium nitrate. The deposition temperature for this second class of reactions is conveniently in the range of the maximum allowed by our substrates. The highest temperature that the substrate can tolerate is desirable because of the tendency to form clean stable deposits at high temperatures.
1. MATERIAL SOURCES
The only end-product deposition method considered useful for this project is the oxidation and reduction of metal halides. Thus the principal problem in a source is obtaining pure metal halides. This may be done by filling small glass ampoules with material after vacuum distillation, and then breaking the ampoules after they have been transferred to the deposition chamber and are ready for use. These ampoules must be baked mildly in the deposition chamber before they are opened in order to drive off adsorbed gas, but they must not be put into the high temperature zone of the ultrahigh-vacuum system because it is baked out far above the softening point of glass. A vacuum lock is needed to make the transfer. This method is good for some special purposes but does not have the versatility of the method discussed below. Materials such as iodides, bromides, chlorides, and fluorides can be produced in situ by reaction of the metal that is to be transported to the substrate with one of the halogens. The metal is heated in a small-diameter tubing of alumina, quartz, or even the same metal, provided it is pure enough. A halide is heated in the lower portion of the tubing to its decomposition temperature and a low pressure gas results. Platinum chloride makes a good regenerative source for chlorine. This gas passes up the tube toward the hot metal and reacts with it to produce a volatile halide that goes to the substrate as a molecular beam. If the parent material is introduced as a clean vacuum-refined metal, and the halogen source is handled so as not to produce any volatile halides or metals, then a pure metal halide for deposition purposes results. All halogen sources should be line-of-sight baffled from the parent metal to be reacted and should contain a cold region for the purpose of stopping and condensing any of the decomposition products that are not halogens. A universal source of this type can be designed to fit practically any deposition apparatus. 
Alloys have been found particularly easy to handle by this method. The alloy is inserted in the heated tube and the halide passes over it to produce the correct ratio of halides corresponding to the alloy composition after a short time has elapsed to obtain equilibrium. The alloy sample is held at a temperature high enough to react all components and volatilize all of them.
If one component is slow to react, it soon becomes predominant on the surface, as a result of the removal of the other components, but the correct ratio is ultimately obtained. If there is bulk diffusion of the sample, then an improper alloy is obtained, because the reaction surface is constantly saturated with this mobile element. Whether or not the proper alloy is obtained at the substrate surface is largely a function of the ratio of the reaction efficiencies of the various components of the alloy at the substrate and not a function of the source material. A test was conducted by the author on a molybdenum-tungsten alloy containing 80% tungsten; a chemical analysis showed that this was approximately the content of the film obtained. The control of the evaporation rate from these sources is obtained by adjusting the temperature on the halogen generator. A feedback loop from the halogen heater to the ionization gauge monitor regulates the evaporation rate to any desired value. The heater on the metal source remains fixed. It is desirable to generate halogen gases and metal halides in situ in order to overcome the difficulties of introducing the pure gases into the ultrahigh-vacuum chamber through automatically regulated mechanical valves. Most metal chlorides are deliquescent, absorb moisture rapidly, and cannot be handled in air. This method of generating them in the vacuum chamber has been found to relieve the problem. Many sources for halogen and other gases are available. Oxygen is obtained from silver oxide, hydrogen from heating zirconium or uranium hydride, fluorine from cobalt fluoride III, chlorine from platinum chloride or molybdenum trichloride. All of these sources can produce very pure gas if properly handled. Compounds such as ammonium chloride may occasionally be used to replace halogen sources without danger of side reactions.
Metal chlorides such as lead and thallium chloride can also be used because there is very little possibility of the metal component remaining at the substrate as a compound or in the free state, due to the high temperatures used during deposition.
2. MOLYBDENUM AND ALUMINUM OXIDE DEPOSITION
Molybdenum and aluminum oxide have been successfully deposited by the author in various shapes and on various surfaces to test some of their properties. These materials were taken as representative of a class that would be used in tunnel effect devices, namely, a refractory metal and a dielectric. The materials were deposited on field emission microscope tips to determine cleanliness; they were also made into sandwiches to test encapsulation and to test pin-hole effects, dielectric breakdown strength, and electrolysis effects. Molybdenum was deposited on clean, smooth, sapphire substrates by reacting molybdenum chloride with hydrogen at 1200°C. The substrate was
heated by direct electron bombardment and was not uniform in temperature by more than 200°C. Electron bombardment also tends to enhance the deposition [26] but the amount was not determined for this case. The molybdenum chloride was obtained by direct chlorination of molybdenum with chlorine gas, as described earlier. The hydrogen was obtained by heating zirconium hydride to around 900°C. An excess of hydrogen, by a factor of two, was used over the theoretical amount that is needed to carry out the reaction. A molecular beam pressure of around mm Hg at the surface was computed. The deposition time ran nearly one-half a minute to produce a 200 Å-thick film, as judged by optical transmission, which indicates a low deposition efficiency. Separate tests were also run on thin film aluminum oxide substrates so that direct electron microscope examination could be undertaken. Because of crystal growth, these reactively deposited samples showed a rougher surface than is produced by evaporation; the roughness was approximately 50 Å or 25% of the crystal size. The crystal size was in the same order as the film thickness. For heavy deposits that must be smooth, alloying with materials to reduce the crystal size could be undertaken. Small crystal size is also desirable from the standpoint of micromachining, as will be discussed later in the section on molecular beam etching. Aluminum oxide was next deposited to a thickness of 200 Å by reacting aluminum chloride with water vapor on the 1200°C substrate previously coated with molybdenum. Aluminum chloride was obtained by direct chlorination of aluminum at around 400°C; water vapor was admitted to the system by a simple valve. One difficulty is encountered with this system: the aluminum chloride gauge becomes coated with aluminum oxide and produces erroneous readings. This was later avoided by using solid oxidizers that cannot enter the aluminum chloride gauge because of the shielding between the gauges.
Deposition temperatures as low as 300°C appear to produce aluminum oxide films, but previous experience with unstable forms of aluminum oxide dictated temperatures above 900°C, the recrystallization temperature where stable alpha alumina is formed. During this deposit, a water-cooled mask was used to prevent deposition on the molybdenum terminal of the final metal layer. This masking was not very well defined due to the high surface mobility and reevaporation from the substrate and mask. Electron microscope examination of the alumina film produced by this method did not show any of the dust-speck shadowing effects that can be found in thermally evaporated films. The film appeared continuous, dense, and about as rough as the molybdenum deposits. No examinations were undertaken on combination molybdenum and aluminum oxide films because it was believed that the high density of the molybdenum would
obscure the alumina detail, as in the case of evaporated films that have been tested. A second layer of 200 Å thick molybdenum was deposited through a water-cooled mask so that the lower electrode was not coated. This second layer was connected to a terminal which formed the capacitor between it and the lower electrode. A layer of aluminum oxide approximately 2000 Å thick was next deposited over the entire assembly, then a 2000 Å molybdenum layer, and finally a silica layer formed by decomposing tetraethylorthosilicate at 900°C. These three layers formed the encapsulation for the assembly. The tests for encapsulation effectiveness have been described in Section IV, G on tunnel effect component encapsulation.
3. ELECTRICAL TESTS
The electrical tests for the above assembly have been designed to show signs of instability in the materials. These tests amounted to heating the assembly by passing a current through the thin base metal and observing any change in the resistance with time. It was assumed that any migrating material, material diffusing through the encapsulation, electrolytic action, or recrystallization would result in a shift in resistance of the metallic film. When a test piece had been properly made, heating at 800°C for as long as 20 hours in various corrosive media did not show any change in resistance in excess of the 2% experimental accuracy. Tests for a period longer than 20 hours were not performed because the few samples that were produced were destroyed in the capacitor tests to be described. Poorly processed test pieces would show an almost instantaneous shift of characteristics upon heating. These faults were usually traced to an inadequate terminal encapsulation. Dielectric tests were performed to determine the maximum field strength attainable and to investigate any field enhanced migration that might be a result of foreign materials or decomposition of the dielectric. A resistivity of about 10¹⁶ ohm-cm was measured at room temperature; however, the actual value may have been higher because of the increased effective area of the rough surface. The only method of measuring the dielectric constant was to use the RC charging time of the capacitor and then use the area and thickness to compute the dielectric constant. The capacity was around 100 mfd/cm² for an estimated thickness of 200 Å. The calculated value of dielectric constant was abnormally high and it was concluded that the surface roughness obscured the real dielectric constant. The data on the dielectric constant was secondary to the results sought on stability. At 800°C the resistivity of the dielectric was about 10¹⁰ ohm-cm. This resistivity measurement could be repeated after many temperature cycles
between room temperature and 800°C. A few cycles were made between liquid nitrogen temperature and room temperature with no observed effect. By raising the voltage to about 20 volts and producing a field of 10⁷ volt/cm in the material, strong indications of tunnel emission could be observed. The voltage that produced this emission varied by as much as a factor of five between the several samples that were available. To find the point at which emission occurred, the voltage was raised very slowly, giving the charging current of the capacitor time to subside. Increments of only a few millivolts were used and the maximum current change was only a fraction of a milliampere because of the sudden onset of tunnel emission. When the emission point is found it can be distinguished from ion current in a qualitative way by changing the temperature a small amount. The tunnel emission is normally able to sustain a more or less constant current with small increases in temperature, whereas ionic conduction increases more rapidly. Tunnel emission that is affected in some way by traps would also be expected to show a strong temperature dependence, but the experience gained in these tests allowed at least a few examples to be seen where the conduction current was primarily due to tunnel emission currents. The test for dielectric stability that seems most significant is the stability of the leakage current when the field is just under the value for tunnel emission to occur. The voltage for tunnel emission is found to be lower as the temperature is increased, and a reduction of voltage by 20% at 800°C is considered to be normal for the samples used. If this voltage variation is taken into account, then the temperature may be cycled repeatedly with no apparent change in operating condition. When a normal capacitor or a poorly made film sample is left for some time at high field strength there is a gradual increase in current until an irreversible breakdown occurs.
This current increase is accompanied by an instability or noise, which is assumed to be some form of material migration that precedes and causes actual breakdown. The most certain test for a good metal-dielectric system is a stable tunnel emission current. In several examples this current has been several milliamperes before breakdown occurred. Occasionally the breakdown is violent enough to clear the area around the emission site and allow further testing without encapsulation. These tests were tried with evaporated dielectrics and anodically formed dielectrics; however, even though some results could be obtained near room temperature with anodically formed dielectrics, the high temperature results were completely negative. Evaporated materials yielded so few samples for testing, due to pinhole effects, that they were given up. Large area, 200 Å thick dielectrics are difficult to evaporate. The dielectric tests were attempting to show that the theoretical breakdown strength of a little over 10⁸ volt/cm could be achieved, but tunnel
emission currents limit the applied field to around 10⁷ volt/cm. If self-formation processes, described in Section IV, F, had been used to smooth the surface it might have been possible to reach higher values. Powers [4] has shown the breakdown strength for glasses such as aluminosilicate to be as great as 2 × 10⁸ volt/cm; very smooth glassy surfaces were available and a special low emission cathode of silver iodide was used.
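The figures quoted in these electrical tests can be cross-checked with the ideal parallel-plate relations E = V/d and C/A = ε0·εr/d. The following Python sketch is only an illustration under stated assumptions: it takes "mfd" to mean microfarads, treats the film as a smooth, uniform 200 Å layer, and uses function names that are invented here for the purpose.

```python
# Cross-check of the quoted field strength and dielectric constant.
# Assumptions (not from the text): "mfd" = microfarad, ideal smooth
# parallel-plate geometry, film thickness exactly 200 angstroms.

EPS0_F_PER_CM = 8.854e-14   # permittivity of free space, farad/cm
ANGSTROM_CM = 1.0e-8        # one angstrom expressed in cm


def field_v_per_cm(volts, thickness_angstrom):
    """Average field across a uniform film, E = V / d."""
    return volts / (thickness_angstrom * ANGSTROM_CM)


def relative_permittivity(cap_f_per_cm2, thickness_angstrom):
    """Dielectric constant implied by C/A = eps0 * eps_r / d."""
    return cap_f_per_cm2 * thickness_angstrom * ANGSTROM_CM / EPS0_F_PER_CM


field = field_v_per_cm(20.0, 200.0)            # about 20 volts across 200 A
eps_r = relative_permittivity(100e-6, 200.0)   # 100 mfd/cm2 at 200 A

print(f"field  = {field:.2e} volt/cm")
print(f"eps_r  = {eps_r:.0f}")
```

Under these assumptions the 20 volt figure reproduces the quoted field of about 10⁷ volt/cm, while the implied dielectric constant comes out in the thousands, far above the value of roughly 9 to 10 usually cited for bulk alumina. This is consistent with the statement that the calculated value was abnormally high and was attributed to surface roughness.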
4. FIELD EMISSION MICROSCOPE TESTS
In order to confirm the above limits and to test for purity of the deposition processes, a limited amount of work was carried out on field emission microscope tips, where many things can be directly seen and related to tunnel effect component work. Aluminum oxide was deposited on tungsten tips and measurements were made as described in Section IV, I on solid state tunnel effect components. The general conclusion directly confirmed the above finding, namely, that tunnel emission occurs at the same applied voltage with or without a dielectric layer of alumina, and that this emission current brings on the breakdown. The maximum current density that could be supported by the alumina on the tungsten tip was about 10³ amp/cm². Molybdenum was deposited by reactive deposition on tips of tungsten and molybdenum. The tip was also etched slightly with chlorine. When the tip was tested for cleanliness by viewing the emission pattern in an ultrahigh-vacuum system, no effect that could be attributed to impurities was noted. The tip had grown a predictable amount due to the deposition. The tip was never heated higher than 1500°C, in order to simulate the conditions that would exist in actual component construction on a sapphire substrate. The best method available to see the effects of interstitial impurities would be the use of field ion microscopy. Mueller [16] has shown single atom interstitial impurities of oxygen in platinum. This method suffers slightly for poorly bound impurities in that the high fields of 10⁸ volt/cm used in field ion microscopy tend to desorb the impurities. This is of little concern, however, since our heating process would also drive these materials off.
D. Single Crystal Growth
The apparatus for depositing single crystal films on amorphous or micropolycrystalline substrates is available with the various material sources and substrate heaters; however, adequate information is not available at the present time on how to use this apparatus to produce the desired result. Small area films of single crystal silicon have been grown inadvertently during an etching process in which a polycrystalline film of silicon was being removed from a silica substrate by chlorine. A decomposition of
the silicon chloride apparently resulted, and single crystal sheets about 0.03 × 0.005 inches were formed; these sheets had a thickness that varied from several hundred angstroms to several thousand angstroms in a stepwise fashion. The existence of the single crystal over the entire area of the deposit was confirmed by the selected area electron diffraction attachment of the electron microscope. The substrate was amorphous silica. Attempts to produce single crystal films of material on micropolycrystalline substrates would involve using reactive deposition methods near the equilibrium temperature where either deposition or etching could occur and then sweeping a thermal gradient across the substrate by electron bombardment so as to cause the growth to commence at one side and proceed across the surface. Epitaxial methods on single crystal substrates would not need a sweeping thermal gradient. At this point in the art of depositing films of material it is not clear whether micropolycrystalline films or single crystal films will be the most likely to produce the fewest troublesome imperfections across their thin dimension, imperfections that would cause migration, electrical breakdown, and etching difficulties. A single flaw between the surfaces in a single crystal film would be disastrous, whereas a micropolycrystalline film with several aligned crystallite boundaries in series could produce an equivalent effect. By admixing materials and using stratified structures the grain boundary diffusion could be retarded, but there is not enough experimental evidence to compare this case to the practical single crystal case. To the author's knowledge, all single crystal thin films that have been produced thus far have shown strong variations in the thickness dimension; however, it is too early to accept these indications conclusively.
E. Instrumentation Methods
Instrumentation is needed for the determination of film thickness as the deposit is being formed so that process control can be achieved. In addition, instruments are needed to monitor the chemicals being used in the deposition and etching process and to analyze the impurities contained in samples. The two instruments best suited to the above requirements are an X-ray fluorescence spectrometer and a mass spectrometer. X-ray fluorescence methods have been shown to be capable of analyzing grams of material to an accuracy of 1% in an area as small as one micron [57]. These methods are applicable to all elements, provided that absorption of the emitted X-rays by gas or detector windows is prevented. The vacuum chamber is an ideal operating environment for such equipment. X-ray methods are independent of the temperature of the sample being measured and of the crystal structure because nuclear processes are
involved. These properties make the process applicable to the determination of film thickness during deposition at elevated temperatures. Future equipment will incorporate a simple electron gun focused to a small spot on the substrate during deposition, and an appropriate X-ray detector such as an electron multiplier coupled with an analyzer crystal to be used as an X-ray filter. The electron multiplier will be followed with a counter, integrators, and analog control equipment for regulating the film thickness. By proper electron beam current monitoring and standardization techniques involving periodic deflection of the beam to a standard sample, a film thickness determination repeatable to 0.1% could be achieved. Thickness control is useful for both deposition and etching operations. The equipment must operate in the vicinity of the deposition apparatus, and one of the prime design requirements is the prevention of contamination of the electron gun and detector surfaces. By using molecular beams of solids instead of gaseous materials most of the contamination could be eliminated by simple shielding; however, the substrate being deposited upon emits a small quantity of material as a result of reaction inefficiencies and these materials tend to deposit on both the electron lens and the X-ray detector. Periodic testing of the surface through mechanical shutters when the molecular beams are shut off would offer one solution, but not an entirely desirable one. A quadrupole mass spectrometer incorporating some features described by Paul [58] and by Brubaker [59] will be constructed and used to monitor both evaporation rate from the sources to the substrate and reevaporation from the substrate. In addition, the spectrometer can be used for destructive analysis of the various deposited films by either etching them off or sputtering them off and looking for the quantity and type of material contained in the film.
This instrument will greatly aid in the understanding of the free radical type of surface chemistry that predominates in this low pressure range. The spectrometer consists of an ionization unit and an analyzer unit. The ionizer will resemble the one illustrated in Fig. 21 and described in Section VIII, A. The analyzer is simply four cylindrical conductors operating at correctly chosen amplitudes of dc and rf frequencies. In operation the rf is swept across a range of frequencies in synchronism with the horizontal axis of a monitor oscilloscope. The vertical axis of the oscilloscope is fed from the selected ions that traverse the analyzer unit and a spectrum of atomic mass units versus their quantity is observed. The unit is equally applicable for solids or gases, and is intended to replace the present ionization-gauge evaporation-rate monitors. A stability in the region of 1% is expected for this instrument and a sensitivity adequate to detect a fractional monolayer on the surface appears available.
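As a rough illustration of how a frequency sweep maps onto a mass scale, the Python sketch below uses the standard Mathieu-parameter relation for an ideal quadrupole mass filter, q = 4eV/(m·r0²·ω²), solved for the mass m of a singly charged ion. It is not a description of the instrument planned here: the rod-spacing radius, rf amplitude, and sweep frequencies are hypothetical values, V is assumed to be the zero-to-peak rf amplitude on each rod pair, and q ≈ 0.706 is taken as the operating point at the apex of the first stability region.

```python
import math

# Frequency-swept ideal quadrupole mass filter (illustrative sketch only).
# All numerical operating values below are hypothetical.

E_CHARGE_C = 1.602176634e-19    # elementary charge, coulomb
AMU_KG = 1.66053906660e-27      # atomic mass unit, kg
Q_APEX = 0.706                  # Mathieu q at the apex of the stability region


def transmitted_mass_amu(rf_amplitude_v, freq_hz, r0_m, q=Q_APEX):
    """Mass (amu) of the singly charged ion sitting at Mathieu parameter q.

    Uses q = 4 e V / (m r0^2 omega^2), rearranged for m.
    """
    omega = 2.0 * math.pi * freq_hz
    mass_kg = 4.0 * E_CHARGE_C * rf_amplitude_v / (q * r0_m ** 2 * omega ** 2)
    return mass_kg / AMU_KG


R0_M = 5.0e-3      # hypothetical field radius, 5 mm
V_RF = 500.0       # hypothetical rf amplitude, volts zero-to-peak

for freq in (1.0e6, 2.0e6, 4.0e6):   # hypothetical sweep points, Hz
    mass = transmitted_mass_amu(V_RF, freq, R0_M)
    print(f"{freq / 1e6:.0f} Mc/sec -> {mass:8.1f} amu")
```

The point of the sketch is the scaling: at fixed rf amplitude the transmitted mass falls as the inverse square of the frequency, so a linear frequency sweep on the oscilloscope's horizontal axis gives a nonlinear mass scale.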
IX. Material Etching
Since selective deposition cannot be carried out by reactive deposition methods to the high resolution required in our microelectronic work, we combine high-resolution selective-resist production with a gross etching process to give the desired micromachining action. The etching will be done in the same vacuum system used for deposition in order to maintain high purity and good control over the process. There are three somewhat similar methods that are compatible with our requirements: molecular beam etching, atomic beam etching, and etching by sputtering. Molecular beam etching is the most commonly used and will process 90% of the materials that must be etched. Atomic beam etching is used when the reactions at the surface must be produced at a low temperature; sputtering is used when the surface material must be removed without any heating.
A. Molecular Beam Etching
1. GENERAL METHOD
When a heated surface in vacuum has a molecular beam of material directed upon it there is a possibility of reaction and etching. The etching is produced by volatile species being formed and evaporating from the surface. The most readily volatile class of simple compounds are the halides formed by reacting a metal with halogens. An example of this is the etching of molybdenum with chlorine gas in which molybdenum chloride is formed and evaporated from the surface. A temperature in excess of 500°C is needed to supply the activation energy for the reaction and to volatilize the chloride. In general it will be found that the more active halogens like chlorine and fluorine react more easily but produce less volatile halides, while the less active iodine and bromine are more difficult to react but produce highly volatile compounds. This has a tendency to keep the minimum operating temperature constant for the different halogens. The choice among halogens is largely based upon other considerations, such as compatibility with the resist and the underlying layer that will be reached upon completion of etching, as well as the availability of the pure halogen.
Sources of molecular etchants can be divided into two classes: solids and gases. The gaseous sources are most easily obtained by admitting bottled gases into the vacuum system through a valve; however, this is a cumbersome method for controlling the gas pressure at the surface, and it is difficult to secure a clean source of gas. As mentioned in Section VIII, C on Material Deposition, halogens can be obtained by the thermal decomposition of various materials such as molybdenum trichloride, platinum chloride, and cobalt fluoride. These sources can be very clean if precautions are taken to prevent solid products such as metals and lower chlorides from being carried along with the gas. The pressure of the gas can be conveniently controlled by adjusting the heat on the source, and regulation can be achieved by ion gauge monitoring and feedback to the heater with the equipment shown in Figs. 18 and 20. One difficulty with gaseous etching sources is that they must be continuously pumped from the system to produce molecular beam conditions. Even with fast pumps, the number of collisions at the surface by molecules arriving at angles other than normal to the surface is high. This can produce an undercutting action of the resist. Materials that begin as solids from the sources and can be deposited as solids at room temperature are effectively pumping themselves at a very high rate and can produce true molecular beams. The arrival angle at the surface is always very near the normal angle and any undercutting of the resist would be caused by other effects.
An extremely wide range of compounds is available that will produce etching effects. The choice is largely dependent upon the side reactions produced from the compound. For hydrogen fluoride sources, materials like ammonium fluoride and ammonium bifluoride have been found desirable because their side products are gaseous and do not form any stable nitrides or hydrides with molybdenum, one of the principal materials under investigation here. These hydrogen fluoride sources are good for the removal of silica resist layers. Almost any solid compound of a metal and a halogen can be considered for the production of the halogen provided the compound reacts at the etching temperature used and that the side reactions produce volatile products.
Examples of materials that have been used include aluminum chloride, ammonium chloride, lead chloride, thallium chloride, potassium acid fluoride, potassium tantalum fluoride, ammonium fluoride, and ammonium bifluoride. By using solids and molecular beam techniques, reactions with the vessel can be prevented that could result in transport of foreign materials to the surface. The source material purity largely governs the final purity of the etched deposit. Solid sources also prevent unwanted etchants from entering the various instruments in the system, such as X-ray thickness monitor, ion gauges, and the electron lens.
Molecular beam etching can be produced at lower than normal temperatures by supplying the activation energy for the reaction by electron bombardment during the etching. Current densities in the order of 0.1 amp/cm² must be used to provide high reaction rates in the order of one hundred monolayers per second. The difficulty with this process is that good uniformity of the electron bombardment is hard to obtain, compared to that obtained by heating the surface uniformly. Even if reactions can be obtained on the surface the process will not always be good for etching because the reaction products are not volatile. Volatility can usually be increased by forming a higher halide or by forming a complex compound. Platinum chloride is an example of a low volatility compound that can be converted into a highly volatile compound by reacting with carbon monoxide to form the halide carbonyl. Other organic compounds, such as metal acetonates, can be formed that have high volatility. A risk is involved in using these organic reactions in that they may form unwanted stable metal carbides.
Metals that have been microetched with chlorine gas include tungsten, molybdenum, tantalum, nickel, iron, aluminum, and silicon. Silicon dioxide has been etched with ammonium bifluoride and potassium acid fluoride, and aluminum oxide has been etched with phosgene and potassium tantalum fluoride. There appears to be a gas or a compound available to etch any known material and fairly simple chemical selection rules can be used to determine the proper etchant and etching temperature.
2. PURITY TESTS
Molecular beam etching using chlorine gas has been carried out in the past by the author on a tungsten field emission microscope tip to determine whether or not any residue resulted when conditions simulating the final process were used. An emitter tip of approximately 1000 Å radius was exposed to a chlorine gas pressure of around mm Hg for two minutes at about 600°C in the absence of an electric field. When the vacuum chamber was returned to ultrahigh-vacuum conditions and the tip was heated momentarily to 1500°C the emission pattern observed resembled the original pattern. The conclusion was that the tungsten was not contaminated by chlorine under these conditions. On admitting chlorine to the system at a very low pressure, the work function would immediately rise, indicating that a clean condition had initially existed. Aluminum oxide has been deposited on a 1000 Å tungsten field emission tip by thermal evaporation to a thickness of around 400 Å and then heated to 1500°C for stabilization. A low current could be drawn through this dielectric layer without destroying it and some of the characteristic appearance described by Mueller [20] was observed during the early deposition phase. The tip was heated to around 400°C and etched with phosgene gas at a pressure of mm Hg for four minutes. When the vacuum system was returned to ultrahigh-vacuum condition and the tip was heated to 1500°C, the tungsten pattern reappeared, indicating that the aluminum oxide had been removed without altering the tungsten pattern or radius in any visible way. It was assumed that carbon formed from the phosgene decomposition would have resulted in an enhanced emission around the
011 and 112 planes and lowered emission on the 334 planes, as shown by Mueller [60]. This was not evident. Tests were carried out using aluminum chloride to etch tungsten tips in the accidental presence of oxygen or water vapor, which is normal for a mm Hg vacuum system, and the aluminum oxide pattern appeared. This could be explained by the reaction of the aluminum chloride and water vapor on the hot tip.
3. DEEP ETCHING
If 100% reaction efficiency could be obtained, the molecular beam etching process would produce straight sides on the etched samples and give very high ratios of depth to width. In practice, the surface migration of the etchant and compounds formed between the etchant and the film material set limits to the depth-to-width ratio. Materials have been etched that show ratios as high as 10 to 1. In this experiment the etched shape was a hole about 300 Å in diameter and 3000 Å deep in a film of evaporated molybdenum. Possibly deeper holes could have been etched; however, the film thickness was only 3000 Å. It is not known whether a pillar 300 Å in diameter and 3000 Å high could also be obtained by the same process. Etching is sometimes accompanied by a decomposition of the compounds formed after they have migrated some distance from the site of formation. In some extreme cases with silicon films being etched in chlorine, the silicon chloride decomposed so as to form a finger-like growth of single crystal silicon. The crystals were several thousandths of an inch long and were not uniform in thickness. This decomposition process could conceivably be controlled to nullify the effect of undercutting. With low-efficiency etching there is a preferential attack on various faces of certain crystals, as can be seen most readily with the field emission microscope. This effect would tend to produce an uneven etch on films composed of randomly oriented crystals, being most pronounced for films having large crystals extending through the thickness dimension of the deposit. By proper alloying of the materials, the crystal size could be minimized and the effect reduced. Another method of controlling the different etching rates at the various faces is to adsorb materials on the sensitive faces that retard the etching.
Optimization of any particular system of chemistry could be carried out on the field emission microscope and the process could then be applied without modification to the micromachining process. Conversely, by simultaneously depositing a nonmigratory catalyzing material, the etching process may proceed more rapidly in the thickness direction than through the lateral direction; the effect would thus give straighter sides to the etched shape.
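Two of the figures quoted in this section on molecular beam etching lend themselves to a quick arithmetic check: the 10 to 1 depth-to-width ratio of the etched hole, and the electron dose implied by the electron-assisted etching rates given under General Method (0.1 amp/cm² for about one hundred monolayers per second). The Python sketch below is illustrative only. In particular, the surface density of 10¹⁵ atoms/cm² per monolayer is an order-of-magnitude assumption introduced here, not a value from the text, and the function names are hypothetical.

```python
# Arithmetic check on two quoted etching figures (illustrative sketch).

ELEMENTARY_CHARGE_C = 1.602176634e-19
# Assumption, not from the text: roughly 1e15 atoms/cm2 in one monolayer.
ATOMS_PER_CM2_PER_MONOLAYER = 1.0e15


def depth_to_width(depth_angstrom, width_angstrom):
    """Aspect ratio of an etched feature."""
    return depth_angstrom / width_angstrom


def electrons_per_atom_removed(current_density_a_per_cm2, monolayers_per_s):
    """Electrons arriving per atom removed, for a given etch rate."""
    electron_flux = current_density_a_per_cm2 / ELEMENTARY_CHARGE_C
    atom_flux = monolayers_per_s * ATOMS_PER_CM2_PER_MONOLAYER
    return electron_flux / atom_flux


aspect = depth_to_width(3000.0, 300.0)          # hole 300 A wide, 3000 A deep
dose = electrons_per_atom_removed(0.1, 100.0)   # 0.1 amp/cm2, 100 monolayers/s

print(f"depth-to-width ratio     : {aspect:.0f} to 1")
print(f"electrons per atom etched: {dose:.1f}")
```

With the assumed monolayer density, the quoted current density corresponds to only a few electrons arriving per atom removed, which gives a feel for why current densities of this order are needed to sustain such etching rates.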
B. Atomic Beam Etching
Atomic beams of dissociated molecules have larger reaction energies than molecular species and can bring about reactions at much lower temperatures. Wise and King [61] have shown the room temperature reaction and etching of carbon, silicon, germanium, and tin with atomic hydrogen. The hydrides of these materials are volatile at room temperature so that etching can proceed without interference from a surface layer. Atomic beam etching can offer no improvement over molecular beam etching unless the products formed are volatile at the etching temperature. Reactions known as "Paneth reactions" [62] have long been used to identify methyl free radicals, which are an atomic species. These reactions involve passing a methyl radical vapor over lead at room temperature. The lead is reacted to form a methyl-lead compound which moves down the tube, decomposes, and plates a lead mirror on the walls of the apparatus. The fact that the species produced is unstable after reaction does not alter the utility of the etching process. In either molecular beam or atomic beam etching the active materials are kept away from other gases, and from the walls of our processing chamber, and thus have no opportunity to give up their energy or decompose. Atomic species are usually produced by an rf discharge in a low pressure gas. The simplest method of accomplishing this in our apparatus is to use an rf field across the end of a quartz tubing having a small hole in the end and a gas pressure inside the tubing in the range of 100 microns. The exit atomic beam pressure should not be over 20 microns so as to produce the necessary mm Hg pressure at the surface being etched. Simpler sources are available for some materials, such as hydrogen, which can be obtained by heating zirconium hydride. This molecular beam is then directed onto a tungsten plate operating at 1700°C, thereafter reevaporating to the surface to be etched with hydrogen atoms.
In a similar fashion chlorine and oxygen atoms can be formed by reaction with hot surfaces. For example, Langmuir [63] noted that tungsten deposits on the walls of an incandescent lamp were removed when the filament was heated to the dissociation temperature of chlorine, which was contained in the bulb at a low pressure. He attributed the removal of the dark tungsten deposit from the walls to the formation of tungsten chloride. In other experiments involving two tungsten filaments, one running at high temperatures and the other at low, the low temperature filament was etched away on the side facing the hot filament and the hot filament gained weight from the thermal decomposition of the chloride so produced. The use of atomic species can markedly increase etching efficiency and yield large depth-to-width ratios.
C. Ion Beam Sputtering
Material can be removed by physical or chemical sputtering [64] in a vacuum system operating at a pressure of around 100 microns. The sputtering gases can be either inert or made chemically active to enhance the removal rate. The principal disadvantage of sputtering is the possibility of destroying electrical circuits by dielectric breakdown. The charge accumulated on the surface of a sputtered material can easily produce fields greater than the breakdown gradient of the underlying dielectric. An additional disadvantage of sputtering at the high removal rates useful for the micromachining process is the tendency to alter the electrical properties of the material through damage to the crystal lattice, and by driving the sputtering atoms into the lattice. Most of these effects can be annealed out, as shown by Farnsworth [65]. The principal advantage of sputtering is the possibility of obtaining high ratios of depth to width of the etched sample, because of the high energy and high efficiency of material removal for the impinging atoms. Wehner [66] has given examples of the sputtering and etching of a silver surface, with fine abrasive particles imbedded in the surface, which left many long pillars of silver after etching. In a discussion with the author, Wehner stated that the etching angle was in the vicinity of one degree. The abrasive particles were probably aluminum oxide, which has a very low sputtering rate and makes a good resist.
D. Depth Control
Molecular beam etching tests on molybdenum have been performed by the author in which a 3000 A-thick film of molybdenum on an aluminum oxide film was etched in chlorine until a mean thickness of 300 A was reached. The variation in thickness across the sample surface was checked in an electron microscope by observing the electron scatter produced by the sample and found to be less than 100 A. Evidently polycrystalline samples such as evaporated molybdenum can be etched by time-temperature methods to a repeatable thickness and to a uniformity greater than 2%. To carry out uniform etching the temperature should be sufficiently higher than the minimum value required, so that variations in temperature across the surface of the sample will not cause greatly different reaction rates. The etching rate is then controlled by the rate of arrival of the molecular beam on the surface, which is in turn controlled by the halogen evaporation rate regulator to an accuracy of 1% in the optimum case. The over-all accuracy of depth control that could be expected, and has been shown experimentally for an open-ended type process, is in the order of 5% under controlled conditions. This value should be increased to
better than 1% for average operating conditions involving closed feedback loops using 3000 A thick films. This implies a permissible variation of less than 30 A from all sources. Closed loop processes involving an X-ray fluorescent probe to monitor the film thickness would greatly facilitate the control of etching depth. This method is described briefly in Section VIII, E on Material Deposition. The electron probe that excites the X-rays would bombard several areas on the substrate not associated with components in order to check for etching depth and uniformity. A mass spectrometer placed so as to observe the reaction products coming from the surface can detect the completion of etching for a particular material by showing the appearance of a new species of compound and the disappearance of the material being etched. There would be very little indication of the uniformity of the process over the surface except for the sudden disappearance of a particular compound. Perfectly uniform etching would produce a sudden change in the materials coming from the surface. When absolute control over etching penetration is required, the lower layers of materials can be made chemically resistant to the molecular beam etchant. A buffer layer of very thin material can be used between layers of films; when this material is reached the etching process stops until a very short etch is used to remove the buffer layer. In this fashion the control over depth could be regulated to within approximately one-tenth the thickness of the buffer layer, or around 10 A.
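The depth tolerances quoted above follow from simple proportions, and a short sketch can make the arithmetic explicit. The function names below are hypothetical and are introduced only for illustration; the inputs (3000 A film, 5% and 1% accuracy, 10 A control at one-tenth of the buffer thickness) are the figures given in the text.

```python
# Illustrative arithmetic only; the function names are hypothetical.

def thickness_tolerance(film_thickness_A: float, fractional_accuracy: float) -> float:
    """Permissible thickness variation (angstroms) for a given fractional accuracy."""
    return film_thickness_A * fractional_accuracy


def implied_buffer_thickness(depth_control_A: float, fraction_of_buffer: float = 0.1) -> float:
    """Buffer thickness implied when depth control is a fixed fraction of that thickness."""
    return depth_control_A / fraction_of_buffer


open_loop_A = thickness_tolerance(3000.0, 0.05)    # open-ended process, about 5%
closed_loop_A = thickness_tolerance(3000.0, 0.01)  # closed feedback loop, about 1%
buffer_A = implied_buffer_thickness(10.0)          # 10 A control at one-tenth of the buffer

print(f"open loop:   about {open_loop_A:.0f} A")
print(f"closed loop: about {closed_loop_A:.0f} A")
print(f"implied buffer layer thickness: about {buffer_A:.0f} A")
```

Read this way, the 10 A figure for buffer-layer control corresponds to a buffer layer roughly 100 A thick, a value the text implies but does not state directly.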
X. Resist Production
Selective etching of a material can be produced by first selectively coating the surface with a chemically resistant layer of material, and then etching away the unprotected areas. The thickness of this layer rarely exceeds 50 A and can be produced by a variety of methods. The simplest method is to evaporate a material like silica or alumina through a mask to the surface; however, this method suffers in both resolution and ease of manipulation for the large quantities of information that must be passed on to the surface. Optical systems could be used to expose light-sensitive resists but these methods also lack resolution and the number of bits of information per optical field is very limited at reduced size. The method that seems best suited to the micromachining process is the production of a resist layer by electron beam methods. The principal requirement for this method is that a chemical change be brought about by the action of electrons on a material without undue side reactions being
caused by light, X-rays, heat, stray low velocity secondary electrons, or backscattered electrons. Dense materials capable of absorbing the energy of an electron beam are needed so that the exposure time can be as short as the mechanical scanning time required to pass the substrate under the lens and produce many exposures or frames on one substrate. No lens or electrical scanning system is capable of producing the 10¹¹ bits of information needed on the surface without mechanical scanning. Additional requirements are that the resist-producing materials do not contaminate the electron lens and that the materials be deposited on the substrate by vacuum methods.
A. Evaporated Resists
Experimental work can make very good use of evaporated resist layers because their thickness, as well as their chemical composition, can be accurately determined. It has been found that silica and alumina can be used to protect a very wide range of materials. Nearly all metals that can be etched in chlorine can be protected by silica and the silica later removed by hydrogen fluoride. If there is danger of contamination from silica then alumina can be used and later removed by phosgene or tantalum potassium fluoride. Alumina can serve as a resist for etching silica, and silica is an effective resist for alumina, because phosgene does not attack silica. These evaporated resists, which are applied through masks, are not good for component construction requiring high resolution; however, they are very good for studying high-resolution etching processes. The most effective mask for etching studies is one that has a known size and shape and is located very near the surface. These conditions are satisfied by depositing polystyrene latex spheres on the surface to be evaporated upon. These spheres, which are used to calibrate the magnification of electron microscopes, come in a variety of sizes. They range from one micron in diameter down to 880 A, and are known in size to better than 1%. Deposition is accomplished by simple spraying of a solution from a nebulizer. When a resist is evaporated at an angle other than normal to the surface, the polystyrene spheres cast a shadow of known size and shape. Any etching process following this now has a standard resist for dimensions and chemical constitution. The resist has been shown to have a definition in excess of 100 A. If a standard mask such as a screen mesh located some distance from the surface is also used during the resist deposition then penumbra regions will exist where the resist thickness is changing between maximum and minimum thickness.
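The shadow cast by a latex sphere during oblique evaporation can be estimated from elementary geometry. The following is a minimal sketch, assuming a collimated vapor stream from a distant, point-like source and a sphere resting on a flat surface; under those assumptions the shadow is an ellipse whose minor axis equals the sphere diameter and whose major axis is the diameter divided by the cosine of the angle from the normal. The function name and the 45 degree example angle are hypothetical; only the 880 A sphere size comes from the text.

```python
import math


def sphere_shadow(diameter_A: float, angle_from_normal_deg: float):
    """Return (minor_axis, major_axis, center_offset) in angstroms.

    Assumes a parallel vapor stream and a sphere resting on a flat surface.
    center_offset is the distance from the contact point to the shadow center.
    """
    if not 0.0 <= angle_from_normal_deg < 90.0:
        raise ValueError("angle must be in the range [0, 90) degrees")
    theta = math.radians(angle_from_normal_deg)
    minor_axis = diameter_A                       # unchanged across the beam direction
    major_axis = diameter_A / math.cos(theta)     # stretched along the beam direction
    center_offset = 0.5 * diameter_A * math.tan(theta)
    return minor_axis, major_axis, center_offset


# Smallest sphere size quoted in the text, at an assumed 45 degree evaporation angle.
minor, major, offset = sphere_shadow(880.0, 45.0)
print(f"minor axis {minor:.0f} A, major axis {major:.0f} A, offset {offset:.0f} A")
```

A finite source size would add a penumbra around this ellipse, which the simple model above ignores.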
This variety of effects on one surface greatly accelerates the optimization of the etching
process for a particular material and shows the various effects that can be expected due to having a resist layer that is too thick or too thin.
B. Chemical Decomposition Resists
When electrons bombard certain materials there is a possibility of the material decomposing and producing a solid product that remains on the surface near the point of decomposition. Under certain conditions this material can be used for a resist either directly or through chemical conversion. Haefer [67] has shown the high resolution decomposition of hydrocarbons, silane, and borane by electrons to produce carbon, silicon, and boron. The same reactions have been investigated by the author with the intention of using the decomposition products as an etching resist. Silane was admitted to a vacuum system at a pressure of mm Hg and a thin film electron microscope specimen screen was bombarded with 1000-volt electrons at around 1 ma/cm² for several seconds. Upon examination in the electron microscope, high resolution deposits had been formed from the various surface structures, and they could not be etched away in chlorine. It was assumed that silicon had been produced by the electron decomposition of silane, and that this silicon had oxidized to silica during the transfer through air to the electron microscope. Since the silane had been in gas form, the electron flood gun was also coated with silica; this caused undesirable charging effects and instability of the beam. At the low pressure of mm Hg the only highly probable reaction site for the silane decomposition process was in the condensed phase of the adsorbed gas at the surface being bombarded. Material formed in the vapor phase would be scattered widely to all surfaces. Compounds containing oxygen can be decomposed by electron bombardment, thus circumventing the oxidation step of the previous method. Tetraethylorthosilicate is one compound that has shown reasonable results for the micromachining of molybdenum [68].
The decomposition of this compound probably goes through a free radical polymerization phase immediately after bombardment; however, this high polymer decomposes upon heating to etching temperature, leaving silica as the principal residue. The production of silica by this method has shown a resolution in excess of 100 A. When used as a resist for the etching of molybdenum the resolution is also in excess of 100 A. The process conditions used to produce this resist are to admit tetraethylorthosilicate into the vacuum system at a pressure of lo-* mm Hg and to bombard the substrate with 1000-volt electrons at 1 ma/cm² for about three seconds to produce a film of resist that is 20 A thick. After exposing, the silicate vapor is turned off and the substrate is heated to the etching temperature of about 600°C. The quantum yield of the process is over one, presumably because of the high gain of the polymerization mechanism. Since the silicate is in vapor form throughout the vacuum system and adsorbs to the lens parts, silica is produced and instability results, as in the case of silane. To prevent decomposable material from reaching the lens surfaces, a solid may be evaporated to the surface of the material to be etched. One material that has been found useful is triphenylsilanol. This is a solid at room temperature that can be evaporated to a surface to produce a thin film. The remaining material that does not go to the surface deposits on the walls of the vacuum unit without reaching the electron lens. When a film of triphenylsilanol is bombarded with an electron beam of 1 ma/cm² for approximately one second, part of the deposit is polymerized. Inspection of the deposit immediately after exposure does not reveal any change, but when the sample is heated to 200°C the unexposed material evaporates and the exposure pattern can be seen as a deposit of material. The deposit is largely silica, with traces of organic tars that can be driven out at high temperatures. This resist is the one that has been used in the demonstration of electron beam micromachining of silicon and molybdenum, described in Section XIV; it is very simple to use and it forms a protective coating on silicon for the transfer in air between the deposition chamber and the electron microscope used for exposing. A great many oxygen-containing organic compounds of silicon that have low enough vapor pressure and can be evaporated without decomposition can substitute for triphenylsilanol. In general, the higher molecular weight materials have the lowest vapor pressure and are the most desirable.
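The exposure conditions quoted above (1 ma/cm² for about three seconds for the silicate, and about one second for triphenylsilanol) can be converted into a charge dose and an electron count per resolution element. The sketch below is illustrative only: the function names are hypothetical, and the choice of a 100 A by 100 A square as the resolution element is an assumption made here for the example.

```python
ELECTRON_CHARGE_C = 1.602e-19  # elementary charge, coulombs
CM_PER_ANGSTROM = 1.0e-8


def dose_c_per_cm2(current_density_a_per_cm2: float, exposure_s: float) -> float:
    """Charge delivered per unit area (coulomb/cm^2)."""
    return current_density_a_per_cm2 * exposure_s


def electrons_per_element(dose: float, element_edge_A: float) -> float:
    """Number of electrons landing on a square element of the given edge length."""
    area_cm2 = (element_edge_A * CM_PER_ANGSTROM) ** 2
    return dose * area_cm2 / ELECTRON_CHARGE_C


silicate_dose = dose_c_per_cm2(1.0e-3, 3.0)  # tetraethylorthosilicate exposure
silanol_dose = dose_c_per_cm2(1.0e-3, 1.0)   # triphenylsilanol exposure

# Assumed 100 A x 100 A resolution element (hypothetical choice for illustration).
n_silicate = electrons_per_element(silicate_dose, 100.0)
n_silanol = electrons_per_element(silanol_dose, 100.0)

print(f"silicate: {silicate_dose:.1e} C/cm^2, about {n_silicate:.2e} electrons per element")
print(f"silanol:  {silanol_dose:.1e} C/cm^2, about {n_silanol:.2e} electrons per element")
```

On these assumptions each 100 A element receives on the order of ten thousand electrons, which gives a sense of scale for the quantum yield figures discussed in the text.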
A thin film of metal such as aluminum can be evaporated over the unexposed resist layer to help confine the vapors, prevent the conversion of the material by low velocity stray electrons, prevent surface charging, and help to increase the sensitivity of the reaction by adsorption of the primary beam; this has been shown by Sternglass [35] for transmission type electron multipliers. After exposure, the thin metal film can be removed by etching without disturbing the resist or underlying material. Most of the polymerization type reactions and a great many of the simple decomposition reactions caused by electrons can be classed as contamination-prone reactions, because they contain carbon or silicon, both of which form fairly stable compounds with metals used in electronic devices. Unless this contamination is strictly controlled, a source of device nonuniformity is present. Tests carried out by the author on field emission microscope tips did not reveal any sign of contamination, but judgment should be withheld until final components involving metal-dielectric combinations are tested.
C. Multilayer Resist Production Methods
In order to achieve the highest efficiency resist production, multiple layer processes may be employed in which a current gain is obtained in the layers. The highest yield of resist atoms possible, triggered by a single electron for a 300 A resolution, is in the order of 10⁶; in this case, every atom in the 300 A cube is converted by a chain process initiated by the single electron. The highest yield that has been obtained experimentally is around 10⁴. These high gain processes also have the smallest heating effects because of the low input energy required. A multiple layer process was tested by the author in which successive layers of 100 A-thick aluminum, 200 A-thick lead chloride, 300 A-thick lead sulfide, and 200 A-thick aluminum were deposited. The first aluminum layer was to be eventually converted into a resist; the lead chloride served to electrolytically corrode the first aluminum layer; the lead sulfide acted as an electron-sensitive photoconductor; and the final aluminum was an electrode. In operation, this sandwich had about three volts applied between the aluminum plates while a 3 kv electron beam penetrated the layers. A local conduction path was set up in the lead sulfide by the high velocity beam and the lead chloride was put under a strong field. The resulting electrolysis caused preferential chemical attack of the first aluminum layer, which was corroded into aluminum chloride. Upon heating the entire sandwich, the top three layers peeled off, leaving the selectively etched aluminum; this could be oxidized or fluorinated and used for a resist. This process had a quantum yield of about 10⁴. In spite of the high yield the process is not suited to resist production, primarily because there is a constant leakage or “dark current” in the lead sulfide; this causes a background fog that seems difficult to control.
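The upper bound on yield described above, in which every atom in a 300 A cube is converted by one electron, can be checked to order of magnitude with a rough atom count. The sketch below is not from the source: the 3 A mean atomic spacing is an assumed, typical solid-state value, and the function name is hypothetical.

```python
def atoms_in_cube(edge_A: float, atomic_spacing_A: float = 3.0) -> float:
    """Rough number of atoms in a cube, assuming one atom per spacing^3 of volume.

    atomic_spacing_A is an assumed typical value, not a figure from the text.
    """
    return (edge_A / atomic_spacing_A) ** 3


max_yield = atoms_in_cube(300.0)        # every atom in the 300 A cube converted
experimental_yield = 1.0e4              # best experimental yield quoted in the text
fraction_converted = experimental_yield / max_yield

print(f"atoms in a 300 A cube: about {max_yield:.1e}")
print(f"fraction converted at a yield of 1e4: about {fraction_converted:.0e}")
```

With this assumed spacing the cube holds roughly a million atoms, so an experimental yield near 10⁴ would correspond to converting about one percent of them per incident electron.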
The resolution of this process was not investigated, but it could not have been greater than 300 A because of the electron scatter produced in the thick layers of material. This process may be applicable to methods that expose the entire field at once, but our process of micromachining requires over 1000 separate exposures on the same substrate, which makes the leakage current problem more difficult by a factor of 1000. All reactions investigated thus far that are enhanced by an external field common to all areas show the above leakage effect to various degrees. There are some reactions involving tunnel emission that look promising, but no definitive experiment has been performed. These high field reactions have the ability to confine the reaction products to the area of initiation because of the migration of material along the straight field lines of the parallel plate geometry. Space charge dispersion effects would probably be negligible at the low current densities used. A simpler class of reactions that is potentially more useful to our
process is represented by anodic or corrosion reactions; here a film of metal such as aluminum is covered with a film of oxidizing material like molybdenum trioxide, manganese dioxide, or silver trioxide. Molybdenum trioxide is known to yield an oxygen atom upon being bombarded with electrons and converts from the yellow form to the blue oxide. Some of these decompositions have been shown by Fischer [69] and Camp [70]. The oxidation of the aluminum or other base material would proceed until the oxygen was depleted or until the field across the oxide fell to the point where tunnel electron emission [11] or bombardment-induced conductivity [71,72] could no longer be sustained. Brief tests by the author, using aluminum and molybdenum trioxide, gave indications of being useful for resist purposes, but have not been completely verified. It was found that the molybdenum oxide residue could be removed by simple etching in chlorine, apparently by forming an oxychloride. The difficulty experienced, which was not overcome at the time, was the removal of a thin layer of background fog produced by thermal oxidation of the aluminum. If this fog can be etched back without removing an excessive amount of the selectively oxidized portions of resist, this could be a clean and useful process. In the above process an overlay of molybdenum metal would be beneficial to cause electron multiplication, to prevent loss of oxygen, and to reduce surface charging effects.
D. Compatibility with Electron Optical System
In the type of electron optical system used, the sample will be immersed in the field of the lens instead of being shielded by an aperture, in order to overcome part of the spherical aberration caused by using apertures. This will require that the substrate be flat and that the surface of the resist being exposed not charge and distort the electrical field of the lens. A metallized covering over the resist will satisfy this requirement. In addition, the resist material should not outgas or emit materials that would contaminate the lens. The same metal covering also reduces this problem. The resist-producing materials usually require only tens of volts of electron velocity to activate them, but it is difficult to operate a high resolution lens at lower velocities than 5000 volts because of space charge problems and stray fields. Fortunately, a resist material can be exposed at any velocity above the minimum value, although at high velocities there is a compromise in resolution because of back-scattered electrons and increased heating effects. These effects have not yet occurred in practice. The optimum velocity is determined by the complete absorption of the beam energy in the resist material; however, this velocity is found to be very low for thin films and high resolution, being only about 300 volts for 100 A resolution. A compromise velocity would seem to be about 1500
volts, although it may have to be as high as 5000 volts to avoid stray fields. High velocities have the additional disadvantage of producing X-rays and backscattered electrons that can expose the resist and produce fog; for X-rays, however, this is a minor effect, because about one X-ray photon is produced for 10⁴ electrons and the absorption of the resist layer for X-rays is fairly low. Effective aperturing of the electron lens will have to be used to prevent exposure of more than one field at a time. This is necessary in order that reflected primary electrons and scattering from apertures further up the lens system will not accumulate to troublesome levels during the exposure of 1000 fields of information. Any tendency toward an arc-over should be detected and suppressed before it happens; otherwise the field being exposed by the lens will be ruined. Arc-over suppression is discussed in Section XI, A, on the Electron Optical System. Obtaining registration between adjacent fields and alternate layers will require observation of the resist surface by the mirror microscope or the scanning microscope feature of the electron optical system. The mirror microscope has essentially zero velocity electrons at the surface; the scanning probe can only be as low as about 100 volts. A metallic covering over the resist will prevent electron penetration and exposure of the resist within certain limits of electron velocity.
XI. Electron Optical System
The present electron optical system development that is being pursued will provide the following modes of operation with only external electrical switching being required to go from mode to mode:
(1) Exposure of a sensitized substrate by a scanning probe with an electron velocity between 500 volts and 30 kv at a current density of around 100 amp/cm², and resolution of 200 A, with provisions for conversion to a retinal image using a field emission source.
(2) Scanning electron microscopy with a resolution limit of around 200 A having 10⁶ bits per field and a scanning time variable between 60 fields per second and one field in one minute.
(3) Scanning X-ray fluorescence probe capable of detecting 10⁻¹⁴ grams of material and having a one micron resolution. The long wavelength detectability would extend through beryllium.
(4) Electron mirror microscopy capable of measuring voltage differences to 0.2 volt for a resolution of 500 A and having the capability of measuring time-varying voltages with greater than 1000 Mc bandwidth.
(5) Field emission and thermionic emission microscopy with a resolution of better than 500 A, capable of studying the uniformity of multiple field emission cathode arrays operating under ultrahigh-vacuum conditions.
When undertaking the design of such a versatile instrument it is necessary to compromise operating characteristics in some of the modes in order to gain performance in others. The resist-exposing mode has been optimized in the electron optical system described here and the X-ray feature has been deemphasized. Higher resolution transmission microscopy methods have been supplanted by front surface methods because the thickness of most of our samples is too great for transmission microscopy. In order to obtain the same quality image from one field to another, the substrate must be flat, thick, and rigid. These requirements prevent transmission microscopy even if very thin films are used for components. An electrostatic imaging system was selected over a magnetic system because of the smaller distortion and image rotation effects, the insensitivity to voltage variations, and the ability to retain higher dimensional stability through vacuum bakeout. A spherical electron optical system was chosen over a cylindrical system because of the ease of fabrication and the mechanical stability during bakeout. All outputs from the various modes of operation have been converted to electrical form because of the difficulty of providing for optical paths in the system and the instability of phosphors at the higher bakeout temperatures.
A. Micromachining Mode
The primary function of the electron optical system is to expose images on the substrate that can later be converted into machined shapes by appropriate etching.
It is desirable ultimately to use gross images for exposing the resist-producing material; however, for simple operations, involving only a few thousand components and their interconnections, scanning methods will be employed because they are simpler in that they do not require a pattern generator in the lens system. Simple external equipment can be obtained commercially for generating television-like signals to feed the deflectors of the electron optical system. Von Ardenne [44a] has shown that, by using sufficiently small deflection angles and highly corrected lenses, a scanning oscillograph can be made that has over 10⁹ bits of information per field. The largest number of bits per field that conventional television generators are capable of achieving is approximately 10⁶. Electron microscopes in general have over 10⁸ bits per field, so that any information that can be scanned into the system will be reproduced in the imaging system, provided that the aperture angles of the system remain somewhat less than one degree.
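The scanning figures quoted for this system (about 10⁶ bits per field, with scan times ranging from 60 fields per second to one field per minute) imply a spot rate and a dwell time per spot. The following is a minimal sketch of that arithmetic, assuming one bit corresponds to one resolvable spot; the function names are hypothetical.

```python
def spot_rate_per_s(bits_per_field: float, seconds_per_field: float) -> float:
    """Number of spots addressed per second for a given field time."""
    return bits_per_field / seconds_per_field


def dwell_time_s(bits_per_field: float, seconds_per_field: float) -> float:
    """Time the beam spends on each spot."""
    return seconds_per_field / bits_per_field


BITS_PER_FIELD = 1.0e6        # television-type scan, as quoted in the text
FAST_FIELD_S = 1.0 / 60.0     # 60 fields per second
SLOW_FIELD_S = 60.0           # one field in one minute

fast_rate = spot_rate_per_s(BITS_PER_FIELD, FAST_FIELD_S)
slow_rate = spot_rate_per_s(BITS_PER_FIELD, SLOW_FIELD_S)
fast_dwell = dwell_time_s(BITS_PER_FIELD, FAST_FIELD_S)
slow_dwell = dwell_time_s(BITS_PER_FIELD, SLOW_FIELD_S)

print(f"fast scan: {fast_rate:.1e} spots/s, dwell {fast_dwell * 1e9:.1f} ns")
print(f"slow scan: {slow_rate:.1e} spots/s, dwell {slow_dwell * 1e6:.0f} us")
```

The fast scan therefore calls for a video bandwidth of tens of megacycles, while the slow scan leaves tens of microseconds per spot.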
1. OVER-ALL DESCRIPTION
The electron optical system that is being constructed for this micromachining operation is shown in Fig. 25. Simple discs and cylinders of quartz are used to produce the functions of lens electrodes, deflectors, and spacers. These quartz pieces are metallized with both highly conducting molybdenum and high resistivity material to provide the necessary conductive surface to form an electron lens. The principal components of the system are an electron gun, condensing lens, deflectors, intermediate lens, objective lens, and the necessary apertures. In addition, there is an electron multiplier, a micromanipulator, and a stigmator built into the objective lens, as shown in Fig. 26. The group of parts shown in Fig. 25 and Fig. 26 is contained within a ceramic tubing that provides mechanical support and electrical contact to the various electrodes through the sockets provided, as shown in Fig. 27. The entire assembly is covered with a magnetic shield that has a removable top for the insertion and removal of substrates. A molten metal seal type of valve is provided on the top of the lens enclosure to allow sealing the lens from the vacuum system that surrounds it during certain bakeout operations that may contaminate the lens elements or the substrate that is in the enclosure. A socket actuating mechanism, shown at the bottom of the lens in Fig. 27, serves to disengage the contact springs during the bakeout cycle and for removal of the lens parts. All parts of the lens are made with sufficient dimensional accuracy so that no alignment is necessary after installation, although slight electrical adjustments are possible with the deflectors. In operation, the various lens components would be used as shown in Fig. 28(a). The crossover of the electron gun could be expected to be in the region of 0.005-inch diameter and would have to be demagnified 5000 times to reach 1 µin. diameter, or 250 A.
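The demagnification figure above is a unit conversion, and a short check may be useful. The sketch below simply converts the 0.005-inch crossover to angstroms and divides by the stated demagnification of 5000; the function name is hypothetical.

```python
ANGSTROM_PER_INCH = 2.54e8  # 1 inch = 2.54 cm = 2.54e8 angstroms


def demagnified_spot_A(crossover_inch: float, demagnification: float) -> float:
    """Final spot diameter in angstroms after demagnifying the gun crossover."""
    return crossover_inch * ANGSTROM_PER_INCH / demagnification


spot_A = demagnified_spot_A(0.005, 5000.0)
microinch_A = 1.0e-6 * ANGSTROM_PER_INCH  # one microinch expressed in angstroms

print(f"spot diameter: {spot_A:.0f} A")
print(f"one microinch: {microinch_A:.0f} A")
```

The result, about 254 A, agrees with the rounded figure of 250 A given in the text.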
In order to reduce aberrations, the objective lens must be operated at the highest demagnification and shortest focal length possible, while the intermediate and condenser lenses operate at progressively less demagnification. The operation and design of a three-lens electrostatic electron microscope similar to this one has been described by Bachman and Ramo [73]; the reader should refer to this reference for design details such as lens shape, aperture position and size, dimensional accuracy requirements, and stray field consideration. Each of the electron lenses is the three element unipotential type commonly found in electron microscopy. The most critical lens is the objective lens, shown in Fig. 26, which determines the over-all resolution of the system. This lens must be made with dimensional tolerances of about 20 µin. on each part in order to reduce the astigmatism to a value low enough to achieve 200 A resolution. The principal concern is to form very round
FIG. 25. Lens configuration for electron optical system.
FIG. 26. Objective lens assembly for electron optical system.
FIG. 27. Lens enclosure and support structure for electron optical system.
holes in the lens electrodes that are concentric with the adjacent electrodes so that the fields are uniform to a high degree. Provisions have been made for the incorporation of an electrostatic astigmatism corrector in the objective lens, as shown in Fig. 26. This stigmator would consist of six metallic stripes of molybdenum deposited on the surface of the first lens electrode followed by a high resistance coating of material to prevent charging by the electron beam. The electrodes would be connected through the socket contacts to an external voltage source that could correct for residual asymmetry in the electric fields of the lens. The objective lenses in normal electrostatic instruments are apertured
FIG. 28. Ray diagram for various modes of operation for the electron optical system.
on both sides so that the specimen or substrate can be located out of the field; however, Liebmann [74] has shown that the apertures cause an increase in spherical aberration that is very undesirable. Newberry [75] has immersed the specimen or target of an X-ray projection microscope in the field of the lens and avoided the exit aperture, thus giving an increased performance. This type of operation is possible with our application and will be used to help improve the lens performance. The entrance aperture is located as far as possible from the high field region of the lens in order to reduce the effect of the aperture. With this immersed type of operation the final lens electrodes, formed by the electron multiplier discs, are operated near the substrate potential, while the center lens electrode is operated
near cathode potential and is used to focus the electrons on the substrate by varying the voltage slightly.
2. OBJECTIVE LENS DESIGN
The excellence of the objective lens is determined by its focal length, which is in turn determined by the field strength of the lens. The largest aberration term that limits the performance of a lens is spherical aberration, which can be reduced by reducing the focal length. When the limit in focal length is attained the maximum aperture angle to the lens can be determined, knowing the resolution desired, and the efficiency of the lens is determined. The lens shown in Fig. 26 is designed for a minimum focal length of 4 mm, using the 0.040-inch-thick electrodes shown; however, provisions have been made to operate with conical shaped lens elements having closer spacings and higher fields to obtain a focal length as short as 2 mm, while still retaining the center electrode near cathode potential. All of the electrodes of the lens system are interchangeable and the hole size can be conveniently altered. A focal length of 4 mm is considered adequate for our purposes; at this focal length, a spot of 200 A diameter can be focused having a current density of about 100 amp/cm². Electrostatic lens systems have shown excellent results in the past in the hands of experienced operators, but all commercial attempts so far have been unsuccessful because of contamination problems. Our utilization of ultrahigh vacuum makes the electrostatic lens performance high because the field strength of the lens can be increased about one order of magnitude past the normally allowed values of 10⁵ volt/cm. In order to operate in the region of 10⁶ volt/cm, the surfaces of the lens exposed to the high field must be very smooth, have high work function, and be free of dielectric films that could cause Malter effect emission and the ensuing arc-over that this form of field emission produces.
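The link between focal length, resolution, and maximum aperture angle described above can be illustrated with the standard textbook estimate for the spherical aberration disc of least confusion, d = ½ C_s α³. This relation is not given in the source and is brought in here only as an illustration. The sketch further assumes, purely for the example, that the spherical aberration coefficient C_s equals the focal length; real electrostatic lenses may have a C_s several times larger. The function name and the parameter `cs_over_f` are hypothetical.

```python
import math

ANGSTROM_PER_MM = 1.0e7


def max_aperture_angle_rad(resolution_A: float, focal_length_mm: float,
                           cs_over_f: float = 1.0) -> float:
    """Aperture half-angle at which d = 0.5 * Cs * alpha**3 equals the resolution.

    cs_over_f is an assumed ratio of the spherical aberration coefficient
    to the focal length (hypothetical default of 1.0).
    """
    cs_A = cs_over_f * focal_length_mm * ANGSTROM_PER_MM
    return (2.0 * resolution_A / cs_A) ** (1.0 / 3.0)


alpha_4mm = max_aperture_angle_rad(200.0, 4.0)  # design focal length from the text
alpha_2mm = max_aperture_angle_rad(200.0, 2.0)  # shortest focal length mentioned

print(f"f = 4 mm: {alpha_4mm:.4f} rad ({math.degrees(alpha_4mm):.2f} deg)")
print(f"f = 2 mm: {alpha_2mm:.4f} rad ({math.degrees(alpha_2mm):.2f} deg)")
```

Under these assumptions the permitted aperture angle is on the order of one degree, and it grows only as the cube root of the reduction in focal length, which suggests why a larger C_s would push the usable angle below one degree.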
Tests have been carried out by the author on molybdenum-coated quartz lens elements in which the field was increased to 10^8 volt/cm without breakdown, provided the surfaces were cleaned by high temperature processing in vacuum. The ceramic base materials used in these lens configurations have lower thermal conductivity than metal electrodes and are more prone to be destroyed by an electrical discharge. The discharge removes a portion of the metallizing, requiring refinishing of the electrodes. Elaborate protection methods seem in order to protect both the lens elements and the substrate during exposing. These methods could consist of slowly raising the voltage while monitoring the current to the various electrodes in an effort to sense the onset of field emission, which invariably precedes the discharge. If field emission is detected, it can usually be removed by exposing the source to a very low chlorine pressure. This operation would have to be followed by
another bakeout. Considering that the above cycle would not have to be done more than once a day, or whenever the lens system was opened to air and contaminated by dust, this half-hour procedure may be worthwhile if ultimate performance is required.
3. PATTERN GENERATION
The deflection of the microprobe would be accomplished as shown in Fig. 28(b); the aperture is used as a fulcrum point in order to prevent a severe limitation of the field of view by the aperture. The deflectors are quartz cylinders that have four equally spaced metallized stripes on the inside with a high resistance uniform coating applied over-all. This type of deflection system produces strong coupling between the deflection plates and gives nonuniform deflecting fields, but these effects can be removed by fairly simple external circuitry. The deflection angles are normally less than one degree and linear deflection amplifiers are not difficult to secure. Only about 10^6 bits per field will be obtained with the scanning type of data input. The 100 amp/cm^2 probe would be distributed over 10^6 spots of 200 Å diameter, having a field size of 20 X 20 microns, for an average current density of 10^-4 amp/cm^2. An exposure time of about 10 sec would be required under these conditions. The deflection signals can be derived from a commercial high definition flying spot scanner apparatus, the video output feeding the grid of the electron gun. Ordinary drawings are adequate for the information source. Had a gross image of 10^6 bits of information been used, the exposure time would have been decreased by 10^6 or the current density requirement reduced by the same factor.
4. MICROMANIPULATOR
The micromanipulator shown in Fig. 26 is designed to move various portions of the substrate into the field of view of the lens by using an electrically operated, thermal, bimetal motor. The micromanipulator anode is made from a bimetal of molybdenum and nickel and is heated by electron bombardment from a triode gun assembly machined into the ceramic contact support structure. By operating four diametrically opposed thermal motors and one locking device, the substrate can be progressively stepped to any position allowed by the limits of the system. The smallest increment of motion would be about 100 Å and the longest continuous motion would be about 0.01 inch. Forty of the 0.01-inch steps would be needed to cross the 0.4-inch substrate shown in Fig. 26. After arriving at any chosen position, the substrate is prevented from further movement by the friction locking of the system. All bimetal motors are disengaged
from the locked substrate, so that their drift does not disturb the setting. The time constants of the thermal motors are sufficient to allow crossing from one side of the substrate to the other in approximately one minute. In addition to the micromanipulator, the substrate can be positioned quickly to within 0.01 inch of any chosen position by an external mechanical manipulator that is used to exchange substrates and remove them to the deposition and etching positions in the apparatus. The vacuum seal in the top of the electron optical system and the magnetic shield cap shown in Fig. 27 must be removed during the use of this external manipulator.
5. MAGNETIC SHIELDING
The electron velocity of an electrostatic lens system can be varied without affecting the focus or rotating the image, provided all voltages to the various lens elements are varied proportionally. The electron velocity that seems most likely to be used for resist exposing is in the 3000-volt region. Higher velocities result in unwanted heating and back-scattered electron effects, and lower velocities are subject to stray magnetic field disturbances. If the triple shielding shown in Fig. 27 is as effective as planned, and if there is no deterioration of the shielding with successive vacuum bakeouts, then the voltage could be reduced to the 500-volt region without destroying the 200 Å resolution or causing drift and distortion of the images by stray magnetic fields.
6. REGISTRATION
The problem of obtaining registration between adjacent fields and between adjacent layers can be divided into two problems; namely, locating the area that must be fitted, and then providing identical frame size and shape for the two fields that must be registered. The first problem can be ultimately handled by using precise optical interference locating methods. With these methods, a one-inch substrate can be positioned to within one-hundredth of a wavelength of light, or about 50 Å. For the present, however, simpler methods, giving poorer results, will be used by taking advantage of the electron microscope and X-ray feature of the electron lens. These methods will be discussed in a later section. The most difficult part of the registration problem is to obtain two images at different times that have the same distortion and size. A drift in external circuitry or the mechanical features of the lens will result in many diverse effects which are difficult to compensate. A commercial electron microscope has sufficient stability to expose an image showing a resolution of better than 8 Å in an exposure time of up to one minute without the image being blurred by drift of either external
electrical circuits or internal mechanics. The same image can be exposed at some later time on the same photographic plate and it will be found that the drift is usually less than 50 Å over a period of two days, provided the instrument is left in operating condition and had been properly stabilized before the first exposure. The beam current must also be turned off during this period or the contamination that results from the poor vacuum and the many rubber seals will contaminate the apertures severely and distort the image badly. When the instrument is shut down completely and then turned on at some later time and allowed to stabilize, the drift can be very great, in some cases amounting to over 5000 Å. The most difficult problem in stability is apparently caused by the many mechanical joints in the massive lens column. We have endeavored to design a sturdy mechanical system that is not massive by incorporating rigid quartz and ceramic pieces in an assembly that has high dimensional tolerances so as to prevent inadvertent motion of critical parts. In addition, the thermal expansion and material creep for these quartz lens elements is much lower than for metal elements. By using highly stable mechanical structures to prevent mechanical motion, and ultrahigh vacuum to prevent contamination and the attendant charging effects, and by being very careful to provide slightly conductive surfaces on all dielectrics exposed to the electron beam, it is hoped that we will achieve a long-time stability of better than 500 Å in both distortion and drift of field so that this part of the registration problem can be made approximately equal to the resolution of locating the field to be registered upon.
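The registration figures quoted in this section are easier to compare when expressed in units of the 200 Å spot diameter. The short sketch below does only that arithmetic. The 5000 Å wavelength is an assumption, chosen because one-hundredth of it gives the "about 50 Å" stated in the text; the quoted drift values are bounds, not measurements.

```python
WAVELENGTH_ANGSTROM = 5000.0  # assumed visible wavelength (not given in the text)
SPOT_ANGSTROM = 200.0         # spot diameter quoted for the objective lens

# One-hundredth of a wavelength, the optical positioning accuracy cited above.
positioning = WAVELENGTH_ANGSTROM / 100.0

# Bounds quoted in the text, in angstroms.
figures = {
    "optical positioning (1/100 wavelength)": positioning,
    "commercial microscope drift over two days (upper bound)": 50.0,
    "drift after complete shutdown (can exceed)": 5000.0,
    "long-time stability hoped for in this design": 500.0,
}


def in_spot_diameters(error_angstrom: float, spot: float = SPOT_ANGSTROM) -> float:
    """Express a positional error as a multiple of the spot diameter."""
    return error_angstrom / spot


for label, value in figures.items():
    print(f"{label}: {value:.0f} A = {in_spot_diameters(value):.2f} spot diameters")
```

On these numbers the hoped-for 500 Å stability corresponds to 2.5 spot diameters, while the drift after a complete shutdown can amount to 25 spot diameters.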
B. Scanning Electron Microscope
A scanning electron microscope can be made by using the lens configuration shown in Figs. 28(a) and 28(b) in conjunction with an external monitor and sweep generator similar to a television monitor. The output from the electron multiplier would feed the brightness axis of the monitor and the sweeps of the monitor and the microscope would be synchronized. Cosslett and Duncumb [76] and Everhart et al. [77] have shown their results using scanning microscope methods. One important conclusion is that not only geometrical properties of the surface appear, but that differences in materials can be distinguished, as can the electrical characteristics of devices. The specific samples shown by Everhart were back-biased p-n junctions of germanium and p-n-p junctions of gallium phosphide. A decided change in image contrast resulted by altering the potentials applied to the junctions: base regions of less than 0.2 micron could be seen and variations of this width were apparent for various positions along the junction, a difficult thing to determine by any other nondestructive method.
1. REGISTRATION
The principal use for the scanning electron microscope in the micromachining process would be to determine the location of the registration marks on the underlying layers of material that have been previously machined. These indicating marks will be covered by a smoothing layer, a deposited film to be machined, and a resist layer, but the pattern of the underlying film would not be completely obscured because of the sensitivity of the scanning microscope to slight surface angle changes such as those produced during a surface smoothing operation. A vacuum cavity has been seen to cause a slight depression at the surface, amounting to as much as 400 Å for a one micron diameter cavity. The scanning microscope would in effect see a relief image of the underlying layer and registration could be effected by electrical servo-ing of the beam without exposing the resist, provided low electron velocities were used that could not penetrate the overlying metal film. When registration was secured, the beam velocity could be increased for the exposure without greatly altering the registration.
2. SURFACE INSPECTION
The scanning electron microscope can serve as an inspection tool for the cleanliness and smoothness of the substrates. Slight imperfections on the surface would be seen as either dark or light areas in which the average secondary emission value was altered by the shape of the anomaly. External clipping circuits could be adjusted so as to indicate a fault on the surface as the surface was scanned both electrically and mechanically. When an asperity was found, the operator could determine the best action to be taken. The contrast-producing mechanism for this type of microscopy is partly due to having anisotropic electron collection and partly due to selecting the proper velocity of the secondary electrons emitted from the surface. To perform these operations, the electron multiplier input potential must be adjustable, and some method of determining the polar angle of emission of the electron from the surface must be provided.
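The clipping-circuit idea can be illustrated in software as a simple window comparison on the scanned signal: any sample brighter or darker than an adjustable window is reported as a possible fault. This is only a sketch of the principle; the function name, the sample values, and the window limits are hypothetical and do not come from the text.

```python
def find_faults(samples, low, high):
    """Return the indices of samples that fall outside the window [low, high]."""
    return [i for i, s in enumerate(samples) if s < low or s > high]


# Hypothetical secondary-emission signal along one scan line, normalized so
# that a clean surface reads close to 1.0.
scan_line = [1.00, 1.02, 0.98, 1.45, 1.01, 0.55, 0.99]

# The window limits play the role of the adjustable clipping levels.
faults = find_faults(scan_line, low=0.8, high=1.2)
print(faults)  # [3, 5]
```

Here one light anomaly (index 3) and one dark anomaly (index 5) are flagged; deciding what to do about them is left to the operator, as in the text.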
3. ELECTRON MULTIPLIER
The electron multiplier shown in the objective lens in Fig. 26 and in expanded form in Fig. 29 would seem to fulfill our need. There is a requirement for a completely symmetrical field around the input dynode of the multiplier because it forms part of the lens structure, but once the electrons have entered the region of the fourth dynode axially unsymmetrical fields can be considered. In order to provide for some degree of anisotropy for electron collection, four anodes will be provided and the one giving the
FIG. 29. Electron multiplier and X-ray detector.
greatest contrast will be selected for the presentation on the monitor. The symmetrical rings or dynodes operating at the same potential will tend to spoil the angular resolvability of the emitted electrons, but enough may remain for our purposes. The collection potential for the secondary electrons can be varied between several hundred volts positive with respect to the substrate target and several hundred volts more negative than the cathode potential. As shown by Everhart et al. [77], the contrast of various specimens depends upon which range of secondary electron velocities is selected for use on the viewing monitor, the high-energy elastic primaries giving a different contrast than the low-energy secondaries. The electron multiplier will be made from quartz lens discs that have been ground so as to produce concentric grooves with the shape shown in Fig. 29, although a geometry using only smooth parallel plates has been postulated and may prove satisfactory. These blanks are metallized with a high resistivity coating to provide the voltage divider action needed for the various dynodes. The high resistivity coating that has been tested for this purpose is a cermet of tantalum-molybdenum and aluminum oxide, applied by evaporation and then heated to above 1000°C in vacuum. The stability of this coating is good over a long period of time and is not particularly affected by repeated temperature cycles up to 900°C, the vacuum bakeout temperature. The temperature coefficient of resistivity of this material is high, but in our application this is not troublesome because of the constant operating temperature. The temperature in the neighborhood of the lens must be maintained nearly constant for the stable operation of the electron lens.
A secondary emission coating of silica, tantalum, and beryllium, as described by Mendenhall [78], seems suited to our application, since high temperature cycling tests by the author and repeated exposure to air have failed to show appreciable change in operating characteristics. These coatings would be evaporated onto the electron multiplier surface using the shadowing effect of the dynodes that were machined into the quartz discs. The secondary emission coating is thus broken up into concentric bands with resistance material connecting them. In order to provide a constant voltage drop between dynodes, the resistance material must be made thinner on the rings having the greatest diameter and area. By application of 100 volts per dynode the gain of the multiplier should be sufficient to detect single electrons in the input and produce a one-tenth volt output for a one-megacycle bandwidth.
C. Scanning X-Ray Fluorescence Probe
Cosslett and Duncumb [76] and Duncumb [79] have shown a scanning electron microscope and scanning X-ray fluorescence probe combined into
one instrument that allows the examination of a solid surface by microscopy, and the chemical analysis of the same surface by X-ray methods. The instrument we are constructing should serve a similar purpose, but the X-ray feature can be broken down into two modes of operation that are primarily a function of the X-ray detectors used. For high geometric resolution work that would not have high resolution for discriminating between chemical elements, a proportional pulse height detector would be used by biasing the electron multiplier to electron cutoff in the input and making it sensitive to X-rays. The pulse height output from the multiplier would give an indication as to the wavelength of the incoming X-ray photon and consequently the chemical element involved. For a more definite chemical analysis the substrate would have to be raised from the lens enclosure to a position that would allow the collection of X-rays on a crystal spectrometer and X-ray detector. This latter mode is not being planned for the immediate future although provisions have been made to incorporate it at a later date.
1. X-RAY DETECTOR
An electron multiplier can be made to serve as an X-ray or a high energy particle detector with some wavelength discriminating ability. Bay [80] has shown some of the limits of this method. One of the biggest problems is that monochromatic radiation does not produce uniform output pulses because the electron multiplier gain varies from pulse to pulse, resulting from the electrons taking different paths through the multiplier. For single electron or photon inputs, a statistical distribution of pulse heights occurs. By providing the proper type of detector surface on the first dynode of the multiplier it is hoped to achieve enough discrimination between adjacent atomic numbers to allow detection of 10% of one material in the other by averaging the various pulse heights over many counts. The quantum efficiency of tantalum multiplier surfaces has been shown by Allen [81] to be about one electron per 500 photons in the energy range between 0.2 Å and 1.0 Å, with the highest efficiency for the long wavelength photon. The limit of detectability for long wavelength photons is known to extend to the visible light region for electron multipliers, provided that no absorbing material is introduced between the source and the detector. Since we are concerned only with a nonabsorbing vacuum path, as shown in Fig. 29, the wavelength limit is essentially negligible. By using the X-ray absorption filtering action of various films of material on the detector surface and choosing the optimum electron velocity to excite the desired characteristic radiation, the multiplier type of detector may be adequate for routine chemical analysis of the materials at the surface, but the more desirable method would certainly be the use of a crystal spectrometer. When proper operation of an X-ray fluorescence probe is attained a quantitative analysis on 10^-13 grams of material is possible with an accuracy of 1% [57].
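The benefit of averaging pulse heights over many counts can be illustrated with the usual statistics of a mean: if individual pulses scatter independently with some relative spread, the mean of N pulses scatters by that spread divided by the square root of N. The sketch below applies this to the quoted efficiency of about one electron per 500 photons. The independence of the pulses, the 100% single-pulse spread, and the 1% target are assumptions made for illustration and are not figures from the text.

```python
import math


def relative_error_of_mean(relative_spread: float, counts: int) -> float:
    """Relative standard error of the mean of `counts` independent pulses."""
    return relative_spread / math.sqrt(counts)


def counts_for_error(relative_spread: float, target_error: float) -> int:
    """Smallest number of counts whose mean reaches the target relative error."""
    return math.ceil((relative_spread / target_error) ** 2)


def photons_needed(counts: int, photons_per_count: float = 500.0) -> float:
    """Incident photons required, at one detected electron per 500 photons."""
    return counts * photons_per_count


# Hypothetical case: single-pulse heights scatter by 100% of their mean and
# the averaged pulse height is wanted to within 1%.
n = counts_for_error(relative_spread=1.0, target_error=0.01)
print(n, photons_needed(n))
```

Under these assumptions on the order of ten thousand counts, and hence several million incident photons, would be needed; a narrower pulse-height distribution reduces the requirement quadratically.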
2. USE OF X-RAY PROBE
The principal use for the X-ray microprobe in the micromachining process, aside from the film thickness determinations, is in checking for registration with very deep layers of material. This can be done by first locating the general area with scanning electron microscopy and then raising the velocity to around 30 kv and penetrating the films in one selected spot that is expected to show the buried registration mark. This will expose the resist, but if an unused area is selected there will be no difficulty. The error signal developed by this scanning probe could be used to restandardize the top layer being exposed in the event that there has been a gradual shift of previous layers.
D. Mirror Microscope
A mirror microscope can be formed by using the lens configuration shown in Fig. 28(c). The substrate is illuminated with a flood electron beam from the electron gun. The substrate is held at a potential slightly negative with respect to the cathode and electrons are reflected from the surface, imaged through the objective and intermediate lens, and strike the storage screen at high velocity. The storage screen had previously been charged by a scanning electron beam that traversed the No. 1 deflector twice and was mirrored from the intermediate lens, as shown in Fig. 28(d). The scanning beam is synchronized with an external monitor scope and the signal currents derived from charging the screen are used to modulate the brightness axis. When the storage screen is fully charged by scanning, no output image results; however, the high velocity electrons mirrored from the substrate cause bombardment-induced conductivity in the screen dielectric, and the screen is selectively discharged and a pattern can be scanned out to the monitor. The process of bombardment-induced conductivity has been described by Pensak [71] and Ansbacher [72] and has been shown to be effective for aluminum oxide, the dielectric to be used on this storage screen. An underlying signal electrode of molybdenum will be used.
1. STATIC OBSERVATIONS
The performance of the mirror microscope for some applications has been described by Bartz et al. [82] and Mayer [83]. This instrument is particularly useful for front surface microscopy in which the electrons do not strike the surface. In particular, the instrument has been used to
measure resistivity, contact potential, voltage variations across a surface, and magnetic field distribution. Our principal use will be in the determination of the electrical characteristics of finished devices. If a single component is centered in the field of view of the microscope, the voltage on the various electrodes can be determined to an accuracy of about 0.2 volt by biasing the entire component near the cathode potential for the mirror microscope. The velocity distribution of the emitted electrons from the microscope cathode prevents more accurate determination of voltage without undue complexity.
2. DYNAMIC OBSERVATIONS
If the entire field of view is centered on a single electrode of the component under investigation, and this field is imaged to a spot on the storage screen while the No. 1 deflectors are fed with a sweep signal that is synchronized with the input signal to the component, a micro-oscilloscope would result which allows time varying voltage waveforms of the component to be shown as an intensity modulation on the external monitor. The bandwidth of this method would be limited by the brightness obtainable on the screen and the transit time of the electrons near the mirror surface. The brightness is governed by the current density at the surface and the gain of the bombardment-induced conductivity mechanism in the storage screen, while the transit time is governed by the field strength of the objective lens. A frequency response of over 1000 Mc seems possible with the electron optical design that is under construction; the quantum limit of detectability for the system would be about 10" cps for an imaging efficiency of 1% in the electron optical system. This limit could be overcome by using over 10^3 amp/cm^2 for the illuminating beam, but such a high current density beam could interfere with either the component being observed or adjacent components. A field intensity of 10^6 volt/cm could be used in the objective lens region, giving a transit time of about 10^-8 sec for electrons over the full 6 mm path without seriously disturbing the components. The transit time lag and spread during the final few microns of deceleration where the contrast is developed would be several orders of magnitude less. The ideal way of using the mirror microscope to study component operation would be to encapsulate the components, shield them with a metallic layer, and bring out lead wires to the surface for the mirror microscope to view without interference from either the strong field or the stray electrons.
The input signals to the various components would normally be fed in by connecting lead wires but it would be possible to use the electron probe to operate the components under some conditions. The most satisfactory method of testing would be to have an array of components interconnected
in some fashion similar to a divider, and rapidly pulse the input with an electron beam while observing the output of the divider at frequent intervals with the mirror microscope.
E. Multiple-Cathode Field-Emission Microscope
By using the mirror microscope connection but leaving the illuminating beam off, the electron optical system can become an emission microscope capable of observing the behavior of a large array of field emission cathodes. For this type of operation the objective lens electrode nearest the substrate would be operated a few hundred volts positive relative to the substrate, or just a few volts below the emission point, and the center electrode would be operated at a higher positive voltage in order to draw emission current from the substrate in an area that is centered over the lens opening, and to converge the emitted electrons into a focused beam. Thermally emitted electrons can also be imaged with this system provided that the substrate does not have to be heated above 900°C. The heating mechanism would be by electron bombardment of the substrate from the rear, and the resolution of the system would suffer from mechanical instabilities. The principal use of the emission microscope mode of operation would be to help optimize the multiple cathode formation processes and to allow direct inspection of the cathode surfaces after and during accelerated life testing cycles and therefore help optimize their performance. It is not expected that the detailed emission characteristics of each emitting area could be resolved, but rather that the location and intensity of each of the areas could be seen with a resolution of approximately 300 Å. The behavior of each of the emitting areas could be studied under ultrahigh vacuum conditions of approximately mm Hg. Slow temperature cycles and prolonged heating up to a maximum of 900°C could be done while observing the emitting surface, with some sacrifice in resolution due to the thermal instability of the microscope.
Using smaller substrates than the normal 0.4 X 0.4 inches, the temperature could be raised to the 1500°C maximum allowed by the insulating substrate material provided electrical connections were simple and did not result in a field across the dielectric. Emission from complex surfaces involving both emitter surfaces and grid structures could be imaged in the electron optical system without supplying the necessary fields from the external lens electrodes. Complete structures using cathodes, grids, and anodes could be viewed provided that the anode is a thin film of metal partly transparent to electrons in the multiple layer type of tunnel effect device, or that the encapsulating layer is left off on the single layer device and some small fraction of the total emission escapes from the various surfaces in the form of field emitted primary electrons, elastic primaries, or secondary electrons.
F. Pattern Generator
To carry out micromachining of complex patterns on a surface by electron beam techniques, a pattern generator is required. This generator can vary in complexity from a simple flying spot scanner reading a drawing, to a complete computer of large complexity connected to the formation chamber in such a way that it monitors the operation and adjusts the processes to converge the operations to some useful end. The simplest methods will suffice for some time, but more complicated methods will be mentioned to indicate ultimate limitations.
1. SCANNING SPOT
Using a single spot of electrons and deflecting it to various locations to expose the resist can be done by using commercial flying spot scanners having approximately 10^6 bits per field. For this requirement the field rate or the frame rate is not important provided the exposure time can be adjusted by some means. It would be desirable to have the frame rate just equal to the exposure time so that the pattern would be traced over once only. The exposure time for a scanning probe operating at 100 amp/cm^2 would be about 10 sec for the simplest type of resist having low sensitivity. This low scanning rate requires that the frequency response of the deflection and video amplifiers extend to dc, a requirement which is not usually met with commercial television equipment. Lower exposures would be accommodated by lowering the intensity of the electron beam when it was gated on. Only two levels of video voltage seem necessary for any pattern considered. Using the 10^6 bits of information provided in a single field of scan, approximately 1000 tunnel effect components could be constructed without including their interconnecting wires. Only about 250 components per field could be constructed if the wires were included. The simple scanning spot is a very poor method of conveying information to the surface of the substrate.
2. PATTERN SCANNING
The previous scanning spot method can be improved greatly if instead of scanning a simple spot, the spot contains fine detail. If an image of the tunnel effect device is formed by passing the electron beam through a mask and then this image is deflected to some desired location, an improvement of 10^3 results over the simple scanning spot method, if there are effectively 10^3 spots in the component structure. By applying analog deflection methods to the image of the field emission component and stepping this image from position to position, an array of components could be situated on the surface without interconnecting wires. By demagnifying the image
of the components, and consequently defocusing it, a scanning spot would be obtained for describing the location of the interconnecting wires. Because of the drift in analog deflection amplifiers, one cannot make use of the full resolving power of the electron optical system; a limit is reached, somewhere not too far past 10’ bits per field, in which amplifier drift and instability cannot be easily reduced, although there are methods employing a multiplicity of deflection plates and accurately clamped binary inputs that avoid this problem. A complete rctirinl field can be stepped from position to position by staircase sweep signals obtained from stable dc sources and mechanical contacts, provided the relatively slow sweep rates can be tolerated. To compensate for relatively slow accurate staircase sweeps, it is necessary to increase the number of bits per field in the electron beam pattern that is to be deflected. Mechanical commutation methods can generate up to 10^4 well defined steps of voltage per second and mechanical masks to define the beam pattern can have over 10^5 resolvable elements in a square inch. By using this iterative method of producing parallel images, a pattern rate of 10^9 bits per second could be generated. This system would be strongly constrained by the inflexibility of the mechanical apparatus and arbitrary patterns could not be generated; however, as many as 10^5 completely wired components could be secured by this method in each field of data presented to the lens.
3. FIELD EMISSION SOURCE
An ultimate pattern generator for the production of around 10^5 completely wired components per field by entirely electronic means could be made by employing simple scanning techniques and other interim methods to generate a mosaic of field emission cathodes, grids, and interconnecting wires that could be used as an electron source. Each emitting area would have to have a memory associated with it so that data could be serially stepped into the mosaic from the controlling computer and then used for processing in parallel fashion. The grid structures associated with the field emission sources could serve to regulate the emitted current, to converge the electrons to match the aperture angle of the electron optical system, and to gate the proper areas to obtain emission. If complete pattern flexibility were desired, there would have to be 10^8 individual areas for emission, and if the components were around one micron diameter the entire array would occupy one square inch of area. The position of the pattern generator in the electron optical system would be in the place now shown for the storage screen in Fig. 25, and the intermediate and objective lens would serve to demagnify the image formed. Various simplifying configurations for the emitting source could be worked out employing
combinations of simple accurate deflections and complex patterns or by building in constraints; however, no serious effort need be put on these problems for some time.
G. Construction Details
The construction problems on this electron optical system can be broken down into two major categories, namely, the fabrication of metallized discs and cylinders for the lens elements, and the fabrication of a mechanical support for the lens elements that also provides for electrical connection to the various electrodes. Ceramic materials have been selected for the principal construction materials because of their high dimensional stability at elevated temperatures and because of the ease of obtaining the forty insulated lead wire connections to the various electrodes. High alumina ceramic and fused quartz are the materials that will be used on the first lens design, with provisions being made to substitute sapphire for the fused quartz in the event that suitable dimensional stability is not obtained.
1. LENS ELEMENT CONSTRUCTION
The lens elements for a typical lens such as the objective lens are shown in Fig. 30. The discs and cylinders are made by grinding and polishing fused quartz blanks to the desired shapes by standard optical lens grinding techniques. After rough grinding, the blanks are fired in high vacuum to a temperature of around 1000°C to relieve strains and show flaws that are not easily seen in the raw material. The parts are rotated at low speed in a machine with an accurate spindle while a polishing tool generates the round and concentric surfaces. The final operation puts a chamfer of small radius on the edge of the holes. The dimensions of the parts are checked periodically by using optical interference techniques that are accurate to within 5 µin. or better. This measuring technique consists of viewing the surface with an optical microscope through an optically polished surface that is held very close to the surface being measured. Interference fringes result and they can be interpreted to show the roundness of the outside or the inside surface. If both surfaces are sufficiently round and concentric with the mounting shaft they can be considered round and concentric with each other. When the three principal lens electrodes are assembled in the lens sleeve the alignment and concentricities of their center holes must be accurate to within about 50 µin. This is obtained by controlling the inside diameter and roundness of the lens sleeve and the outside diameter of the lens electrodes to within 20 microinches. Matched parts can be selected for the objective lens while the intermediate lens and condenser lens can make use of the parts made to lower tolerances.
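The tolerances quoted above are easier to compare when expressed in common units. The following Python sketch converts the microinch figures to microns and works a worst-case stack-up for two electrodes seated in one sleeve. The stack-up model is an assumption made here for illustration only: each 20 µin figure is treated as a total tolerance band on a diameter, with a line-to-line fit at the tightest condition. The text states the tolerance values but not how they combine, and the function names are hypothetical.

```python
# Unit conversion and a hypothetical worst-case stack-up for the lens fit.
MICRONS_PER_MICROINCH = 0.0254  # 1 inch = 25.4 mm exactly


def microinch_to_micron(value_uin):
    """Convert a length in microinches to microns."""
    return value_uin * MICRONS_PER_MICROINCH


def worst_case_center_offset_uin(sleeve_id_band_uin, electrode_od_band_uin):
    """Largest radial shift of one electrode center from the sleeve axis.

    Assumption (not from the text): the tightest parts fit line to line,
    so the largest diametral clearance equals the sum of the two bands,
    and an electrode can shift sideways by half of that clearance.
    """
    max_diametral_clearance = sleeve_id_band_uin + electrode_od_band_uin
    return max_diametral_clearance / 2.0


def worst_case_misalignment_uin(sleeve_id_band_uin, electrode_od_band_uin):
    """Two electrodes pushed to opposite sides of the same sleeve."""
    return 2.0 * worst_case_center_offset_uin(
        sleeve_id_band_uin, electrode_od_band_uin
    )


if __name__ == "__main__":
    for uin in (5, 20, 40, 50):
        print(f"{uin} microinch = {microinch_to_micron(uin):.3f} micron")
    print("worst-case hole misalignment (microinch):",
          worst_case_misalignment_uin(20, 20))
```

Under these assumptions the worst-case misalignment between two center holes is 40 µin, which is inside the 50 µin alignment requirement; a different fit assumption would give a different margin.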
FIG. 30. Photograph of lens electrodes, spacers, sleeves, and supports.
All lens electrodes and spacers are made optically flat, top and bottom, to within 20 µin. and the two surfaces are parallel to within 40 µin., so that the parallelism of the electrodes is quite within the tolerance required. The discs are easy to make flat and parallel, but long cylinders used as spacers are a good deal more difficult. Finished discs have been fired in a vacuum furnace to 900°C to simulate vacuum bakeout conditions, without evidence of sag, creep, or decomposition. The tests were made by supporting an optical flat, having the same dimensions as a lens electrode, at three points while firing and then checking the optical interference pattern after firing. It was noted that the dimensions were within 20 microinches of the original shape after firing. The surface was examined for alterations to the surface polish, but a magnification of 400 failed to show any recrystallization or devitrification. A metallized sample also appeared to be unaffected by the firing. The metallizing used was molybdenum applied by vacuum evaporation.
The electron multiplier will be machined out of standard thickness lens electrodes by using a high speed grinding and polishing tool at an off-axis angle to produce the slight overhang that is shown in Fig. 29, an overhang that is needed for the self-shadowing of the evaporated secondary emission coating. These concentric grooves do not need to be very accurate, provided the first and second dynodes are concentric with the inside diameter of the electron multiplier electrodes. The large apertures in the system will be made from fused quartz discs with various sized holes in the center. The objective lens aperture will be made by bonding a molybdenum disc having a very small hole in it to a sapphire or an alumina aperture disc; it is difficult to bond molybdenum discs to fused quartz because of the large difference in thermal expansion coefficients.
The electron-bombardment-heated evaporator discussed in Section VIII, A on Material Deposition has been used to metallize quartz plates with molybdenum. It is desirable to deposit the molybdenum at a temperature of around 600°C after a heating to 900°C for cleaning purposes. The heating of the substrate was done by bombarding with electrons on the reverse side from the evaporator. A mask of alumina is used to delineate the deposition pattern.
In order to deposit an adequate quantity of material in the holes of the lens electrodes the disc is rotated at an angle of about 45° to the evaporator so as to deposit material on the straight sides. The deposition pattern is arranged to provide a contact electrode on the outside diameter of the electrode discs. These coatings of molybdenum are very hard and cannot be scratched by the tungsten wires that are used for contacts unless excessively high pressure is used. Preliminary tests on the welding and sticking characteristics of the close fitting parts after heating in high vacuum indicate that there will not be a great deal of difficulty in removing the parts from the assembly. In the event sticking does occur the silica surfaces will be coated with alumina by evaporation in order to reduce the affinity for the adjacent molybdenum film. Alumina has been successfully evaporated from the same type of evaporator that is used for the molybdenum evaporation.
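The reason for tilting and rotating the disc can be illustrated with a simple line-of-sight model. The sketch below assumes a collimated vapor beam, unit sticking probability, and no shadowing by the edge of the hole; these are assumptions made here for illustration and are not statements from the text. Under them, the deposit on any surface scales with the cosine of the angle between the beam and the surface normal, and the hole wall is coated only during the part of each rotation in which it faces the source.

```python
import math


def face_flux_fraction(tilt_deg):
    """Relative deposition rate on the flat face of the disc.

    tilt_deg is the assumed angle between the disc axis and the beam;
    a value of 1.0 corresponds to normal incidence.
    """
    return math.cos(math.radians(tilt_deg))


def wall_flux_fraction(tilt_deg, samples=3600):
    """Rotation-averaged relative deposition rate on the straight hole wall.

    A wall element at rotation angle phi sees the beam with
    cos(incidence) = sin(tilt) * cos(phi) while it faces the source,
    and receives nothing while it faces away.
    """
    tilt = math.radians(tilt_deg)
    total = 0.0
    for i in range(samples):
        phi = 2.0 * math.pi * (i + 0.5) / samples  # midpoint rule
        total += max(0.0, math.sin(tilt) * math.cos(phi))
    return total / samples


if __name__ == "__main__":
    for tilt in (0, 30, 45, 60):
        print(f"tilt {tilt:2d} deg: face {face_flux_fraction(tilt):.3f}, "
              f"wall {wall_flux_fraction(tilt):.3f}")
```

In this idealized model an untilted disc receives no material on the straight sides at all, while at 45° the wall receives sin(45°)/π, or roughly a quarter of the normal-incidence rate, at the cost of reducing the face rate to about 0.71.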
2. MECHANICAL SUPPORTS
A series of support tubes, shown in Figs. 26 and 27, are used to constrain the various lens elements and provide alignment without the need for adjustments. The inner support tube shown in Fig. 26 is made from high alumina and is honed to a dimensional tolerance of 0.0005 inch on the inside diameter and 0.0005 inch cantilever distortion over the entire length. This provides the principal guide for the various lens supports and spacers. Clearance holes are ultrasonically drilled in the inner support tube to allow contact springs to engage the contact pins or other electrodes of the lens parts. These holes, along with other intentionally introduced holes, provide the paths needed to pump out the interior of the lens without pulling the gas through the lens apertures. The outer support structure is made from a group of alumina cylinders that are metallized and brazed together to form a vacuum-tight envelope. Both ends of the enclosure are sealed with molten metal type valves, the top one being opened and closed during a bakeout cycle of the ultrahigh-vacuum apparatus. The lens must be closed whenever there is a possibility of corrosive residue from an etching or deposition cycle contaminating the lens. An external manipulator removes the cover and the substrate after the seal-breaking heater is operated. The various sockets for the lens assemblies and the deflectors are simple alumina cylinders with metallized lead wires carried through the walls; the inside surface of the cylinders is ground smooth. The contact springs shown in Fig. 26 float in the spring-retaining collars and serve to connect the lead wires with the electrodes when the lens assembly is pushed into the proper position by the socket-actuating mechanism shown in Fig. 27. Tungsten springs fired at 1350°C in high vacuum have been tested and found satisfactory for the limited travel required on the contact spring.
These springs must not be under compression during high temperature bakeout; this is accomplished by disengaging the socket mechanism. The entire inner support tube, contact spring retaining collars, and lens elements can be removed from the main outer support having the many lead wires by simply disengaging the socket mechanism and lifting the assembly out. The lens elements can be removed in a similar fashion from the inner support tube by pulling three retainer pins. This easy accessibility will be very convenient during the testing phase of the electron optical system. Low temperature metallizing processes will be used for the lead wires and the joining of the various ceramic parts in the support structure. Alloys of zirconium, nickel, and titanium will be used. A special vacuum furnace has been constructed capable of reaching 1300°C and this will be used for all brazing and metallizing operations.
XII. High-Vacuum Apparatus
A. Requirements
High-vacuum apparatus is required that is large enough to contain the deposition apparatus, etching apparatus, electron optical system, and substrate storage and manipulation apparatus. Since most of the proofs in this work are generated experimentally, the access time to the high-vacuum system must be short in order to allow rapid progress to be made, and several cycles a day must be provided for. The degree of vacuum that must be obtained for scientific results is in the region of 10⁻¹⁰ mm Hg, although many operations can be carried out in much poorer vacuum if variations in results can be tolerated in order to quickly survey a range of problems. Even though the pressure in the deposition and etching cycles rises to around 10⁻⁴ mm Hg the background pressure or partial pressure of contaminants must be kept in the mm Hg region to insure pure deposits. This low background pressure requirement, coupled with a high speed vacuum requirement, dictates high bakeout temperatures and a short time constant for the bakeout furnace. By introducing many high vapor pressure chemicals into the vacuum chamber we have further increased the need for periodic thermal purging of the residue to prevent contamination of subsequent operations, and in one configuration of apparatus the system would have to be baked out every five minutes. If any cold areas are left in the apparatus the chemical residue will deposit on this site and raise the background pressure during following operations. It has been determined that all troublesome materials can be driven from the system if a bakeout temperature of 900°C is used, and that vacuum locks can be placed between chambers when it is desirable to heat one without the danger of condensing material in the other. Fast thermal time constants have been provided by using very light construction materials and heaters contained in the vacuum system so that they do not have to use thermally massive insulation.
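To relate the pressures quoted in this section to modern units, the sketch below converts mm Hg to pascals and estimates the gas number density from the ideal gas law, n = P/(kT). The 300 K gas temperature and the function names are assumptions chosen for illustration; they are not taken from the text.

```python
PA_PER_MMHG = 133.322              # 1 mm Hg is approximately 133.322 Pa
BOLTZMANN_J_PER_K = 1.380649e-23   # Boltzmann constant (SI definition)


def mmhg_to_pa(pressure_mmhg):
    """Convert a pressure from mm Hg to pascals."""
    return pressure_mmhg * PA_PER_MMHG


def number_density_per_cm3(pressure_mmhg, temperature_k=300.0):
    """Ideal-gas number density n = P / (k T), in molecules per cm^3.

    temperature_k defaults to 300 K, an assumed room temperature.
    """
    per_m3 = mmhg_to_pa(pressure_mmhg) / (BOLTZMANN_J_PER_K * temperature_k)
    return per_m3 * 1e-6  # 1 m^3 = 1e6 cm^3


if __name__ == "__main__":
    for label, p in (("deposition and etching", 1e-4),
                     ("ultrahigh vacuum", 1e-10)):
        print(f"{label}: {p:.0e} mm Hg = {mmhg_to_pa(p):.3e} Pa, "
              f"n = {number_density_per_cm3(p):.2e} per cm^3")
```

At the assumed temperature, 10⁻¹⁰ mm Hg corresponds to roughly 3 × 10⁶ molecules per cubic centimeter, six orders of magnitude fewer than at the 10⁻⁴ mm Hg reached during deposition and etching.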
By using differential pumping methods the construction materials for the inner vacuum system can be kept as light as required for simple mechanical support instead of having to support an atmosphere of pressure. In addition, the many manipulator lead-ins to the system can be introduced more easily in regard to leaks in the ultrahigh-vacuum system. An additional advantage of differential pumping is that all of the parts that are heated are heated in a vacuum of mm Hg where corrosion and gas permeability of the container are very minor factors. The vacuum system and ultrahigh-vacuum attachment that is described here is a complex and expensive piece of apparatus, but this can be tolerated if it proves to be the foundation apparatus for a complete factory that can be used to economically construct high complexity electronic data processing equipment. There are two stages in the development of the vacuum equipment. The first phase, to be described here, is essentially a large single container or bell jar that houses all apparatus and has a single heater for the various bakeouts.
FIG. 31. High-vacuum system, adapter spool, and electronic regulator apparatus.
the bell jar, and this pressure is adequate to begin the ultrahigh-vacuum cycle.
1. Adapter Spool
Figure 32 shows the adapter spool installed on the baseplate of the high-vacuum system. Starting in the front left of the picture and proceeding counterclockwise around the ring, the various accessories are: water-cooling manifold, 200 amp low voltage lead-in, 30 kv 20 amp lead-in, 10 kv 200 amp water cooled lead-in, 2 kv 10 amp octal lead-in, optical port, two blank ports, 6 channel mechanical manipulator lead-in, gas inlet, gas inlet, optical port, 10 kv 10 amp octal lead-in, 2 kv 10 amp octal lead-in, ion gauge port, and 20 kv 10 amp lead-in. Almost any service can be entered into the vacuum system through the I :-inch holes provided in the adapter spool. All vacuum seals are made with O-ring seals for their convenience of operation. A large diameter O-ring seal is provided in the lower surface of the adapter spool to seal to the baseplate. The spool can be removed from the vacuum system without disturbing the experiment, and another experiment can be inserted in its place. As shown in Fig. 13 the experiments are mounted on an aluminum baseplate which can be removed from the adapter spool provided the various electrical and water cooling leads are removed. The mechanical manipulator is detached automatically upon raising the aluminum mounting plate.
FIG. 32. Adapter spool showing various accessories.
2. MECHANICAL MANIPULATION
The six-channel mechanical manipulator shown in Fig. 33 is considered to be well suited to the present requirements where great flexibility in operation is required without introducing excessive thermal delays, leaks into the ultrahigh-vacuum region, or mechanical difficulty due to operation at high temperatures in vacuum without lubrication. This manipulator consists of a system of capstans driven by a wheel that can be positioned under each capstan and rotated without causing adjacent channel interference. The capstans are grooved to receive 0.006 inch diameter molybdenum-tungsten alloy wires. These wires can be conducted to various points in the system by using aluminum oxide pulleys and wire guides. A simplified illustration of this is shown in Fig. 39. The position of the various channels is shown on a scale that is positioned in view of one of the optical ports. A motion dependability of 0.01 inch can be secured by this method. All wires are spring loaded with the springs being retained in the cool region of the high vacuum system below the water cooled aluminum plate. When manipulator leads enter the ultrahigh-vacuum system, they are taken through a ceramic manipulator seal, as shown in Fig. 33, having very close fitting holes to seal the wires. In addition, the gases accidentally introduced are ducted to the ion pump through the manipulator entry chamber. Over eighteen channels of mechanical manipulation are needed in a complete processing chamber to introduce new material, to operate vacuum locks, to move substrates to various locations, and to operate shutters. The compactness, low thermal mass, and flexibility of the wire type manipulators are very desirable. No maintenance problems have been encountered in the operation of properly installed wire-drive manipulators.
FIG. 33. Drawing of ion pump, inner vacuum chamber seal, and mechanical manipulation method. (Labeled parts: inner vacuum chamber, insulator, water cooled container and ion collector, manipulator wires, manipulator entry chamber, manipulator seal, electron shield, filament, filament insulator, inner vacuum chamber seal, drive wheel.)
C. Ultrahigh-Vacuum System
1. GENERAL
An ultrahigh-vacuum attachment to the high-vacuum system is being constructed along the lines of a smaller system previously designed and tested by the author. This attachment consists of three principal parts: namely, an inner vacuum chamber and seal, an ion pump, and a bakeout furnace. These parts are shown in Fig. 34. Figure 35 shows the base of the inner vacuum chamber and the ion pump assembled on the adapter spool. The bakeout furnace is shown in the bell jar in Fig. 36, with its electrical power inlet and water cooling lines being brought out one of the 6-inch diameter ports of the jar. The complete assembly of the ultrahigh-vacuum attachment is shown in the drawing of Fig. 37. The inner vacuum chamber is surrounded by a system of 0.005-inch-thick nickel radiation shields and a water cooled jacket. The inner chamber is heated by radiation from a 0.046-inch-diameter molybdenum wire that is wound on a nickel framework and supported on alumina insulators. This heater operates at 208 volts and 23 amps to heat the inner chamber and base plate to 900°C in about six minutes. The inner chamber is supported by four rods connected to the inner vacuum chamber lift. This lift is operated from outside the vacuum chamber through an O-ring seal and causes the inner chamber to move up and down about one-half inch and to make forceful contact with the inner vacuum chamber seal.
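The heater rating quoted above implies the following electrical power and energy per bakeout. This is a back-of-the-envelope sketch that assumes the full 208 volts and 23 amps are applied for the whole six minutes and treats the load as purely resistive; the text gives only the voltage, current, and heating time, and the function names are hypothetical.

```python
def electrical_power_w(volts, amps):
    """Power drawn by a resistive heater, in watts."""
    return volts * amps


def energy_j(power_w, minutes):
    """Energy delivered at constant power over the given time, in joules."""
    return power_w * minutes * 60.0


if __name__ == "__main__":
    power = electrical_power_w(208, 23)
    energy = energy_j(power, 6)
    print(f"heater power: {power / 1000.0:.2f} kW")
    print(f"energy for a six-minute heat-up: {energy / 1e6:.2f} MJ "
          f"({energy / 3.6e6:.3f} kWh)")
```

The result is a little under 5 kW and about 1.7 MJ per heat-up, which gives a sense of how light the heated structure must be to reach 900°C in six minutes.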
FIG. 34. Ultrahigh-vacuum attachments showing the bakeout furnace, inner vacuum chamber, and ultrahigh-vacuum base plate with ion pump attached.
The ion pump and manipulator entry chamber are shown assembled on the inner vacuum chamber base plate in Fig. 33 with the manipulator wires passing through the water cooled base plate, the radiation shields, and the two seals on the manipulator entry chamber.
2. SEALING
Figure 33 also shows the detail of the molten metal seal for the inner chamber and of the seal heater. Figure 38 shows further details of this seal heater assembly. The seal that has been tested more thoroughly than others is silver-copper eutectic using a nickel base plate 0.01 inch thick. Other seal materials such as gold-nickel eutectic have been tested against a 0.01 inch thick molybdenum metal base, and a Norton flame sprayed “Rokide” aluminum oxide trough on a nickel base. An electron emitting filament is supported by ceramic insulators attached to the electron shield and this electron source is used to raise the moat temperature above the surrounding temperature by electron bombardment. A water cooling coil on the base prevents heating of the base and provides mechanical strength. The principal design requirement of the entire seal assembly is that proper materials be chosen to prevent alloying of the parent metal parts. By using the silver-copper eutectic at temperatures not over 785°C the seals have been kept in molten state for several hours or the equivalent of many hundreds of opening and closing cycles. Tests on small seals of gold-nickel eutectic on molybdenum have given several hundred operations at 980°C. The vapor pressure of the most volatile component of the sealing alloy is less than mm Hg at the respective temperatures, but there is some loss of material through evaporation and capillary action along seams in the inner vacuum chamber. Material can be prevented from entering the experiment region through the vapor phase by inserting a cold MAc near the seal. The tests on flame sprayed ceramic coatings did not show any signs of deterioration during the several hours of testing time. These materials were not prone to wet to the gold-nickel eutectic until a small amount of zirconium hydride was added to the sealing alloy. Some peeling or cracking of the ceramic from the base could have gone undetected due to the thick metal overlay that resulted from the metallizing action. In the present design shown in Fig. 33, the seal cannot be broken by raising to 970°C while keeping the remainder of the experiment cold because of the warping caused by the differential expansion of the nickel moat and the cold base plate. A modification of the shape of the moat or a change to a low expansion material like molybdenum would be required. The mode of operation anticipated is to raise the entire inner vacuum chamber to the bakeout temperature around 900°C and then supply the additional heat needed to open the seal with the seal-breaking heater.
FIG. 35. Ultrahigh-vacuum base plate, ion pump, and radiation shields shown assembled on adapter spool and high-vacuum system.
FIG. 36. Ultrahigh-vacuum bakeout furnace shown assembled in outer bell jar.
This is not especially limiting because residue must be driven from the inside of the chamber to a cold finger before opening the chamber; otherwise the outer vacuum system will become contaminated with the residue. Seals as large as 9 inches in diameter can be opened and closed without particular regard for the warping problem apparent in the 17-inch diameter seal.
FIG. 37. Drawing of ultrahigh-vacuum attachment assembled in the high-vacuum system. (Labeled parts: outer vacuum chamber, hydraulic hoist, ion pump, manipulator wires, inner vacuum chamber, molybdenum heater, radiation shields, seal breaking heater, base plate, inner vacuum chamber lift, oil diffusion pump baffle and valve, manipulator capstan.)
3. ION PUMP
A detail of the ion pump is shown in Fig. 33. This is essentially a water-cooled can that has openings at the top and bottom for entry of gas. The can contains an electron emitting filament, an anode grid cage, and an electron bombardment type of evaporator for molybdenum. The molybdenum metal evaporator acts as a chemical “getter” for the system while the ion pumping action removes noble and other inert gases. The evaporation of molybdenum is carried out intermittently whenever a monolayer of gas is formed on the previous molybdenum layer. To evaporate the molybdenum it is only necessary to switch on the voltage of the water-cooled anode that holds the molybdenum sample. Three thousand volts at 0.3 amp is needed to evaporate the molybdenum, and a current regulator is used to obtain stable operation. When ion pumping, the voltages will be adjusted to give about 800 volts between the grid cage and the filament at an emission current of about 0.2 amp. The grid is composed of 0.006-inch-diameter molybdenum wires spaced 0.25 inch apart in order to provide a reasonable degree of electron transparency and allow a long mean path length for electrons. The ion collector is held about 100 volts negative with respect to the cathode and accelerates the positive ions formed in the grid region into it. A portion of the ions accelerated into the water-cooled ion collector are retained and a pumping action results. During ion pumping the water-cooled anode is held near cathode potential so as not to collect electrons. A small diameter wire is inserted into the center of the grid cage and operated at ground potential to serve as an ion collector for measuring the pressure in the ion pump.
FIG. 38. Underside view of the ultrahigh-vacuum base plate showing electron bombardment seal breaking heater and cooling coil.
When large quantities of known materials are to be pumped from the system a separate pump can be provided. For example, a plate of molybdenum heated to around 600°C by electron bombardment and held adjacent to a water-cooled plate of nickel has been used to pump large quantities of chlorine by converting the chlorine to molybdenum chloride and condensing it on the cold plate. The pressure cannot be pumped to lower than the vapor pressure of the metal chloride at the temperature of the condensing plate. By using this same method but substituting a zirconium-coated plate, carbon monoxide was rapidly pumped from a system without introducing contamination from liquid pump oils. No data is available yet on the pumping speed of the pump shown in Fig. 33, but previous experience with smaller pumps of similar design indicates that for active gases the pumping action is similar to a 3-inch-diameter infinite sink. Due to the differential pumping of the entire vacuum system, the inert gases are not present in any significant quantity, and it has been found that the high bakeout temperatures decompose organic materials and oils to their simpler components.
4. TYPICAL OPERATION
Based on previous experience with a smaller vacuum system of similar design, the author has found a strong dependence of the degree of vacuum obtained, and the speed of obtaining the vacuum, on the bakeout temperature and speed of heating and cooling. For example, it has been possible to evacuate a 9-inch-diameter by 12-inch-long vessel from atmospheric pressure to the 10⁻¹⁰ region in as short a time as 15 minutes. The time constant of the heater and of the cooling mechanism for the present design indicate that the same region of evacuation speed will be obtained. The vacuum cycle that might be considered typical is as follows: (1) roughing of bell jar with inner chamber open, 6 minutes; (2) diffusion pump cycle to 10⁻⁴ mm Hg, 1 minute; (3) heating of ultrahigh-vacuum chamber to 900°C with inner chamber open, 6 minutes; (4) closing of inner chamber, firing of ion pump evaporator, and cooling of inner chamber by forced nitrogen gas and then water, 6 minutes; (5) ion pumping at low temperature, 3 minutes. A total of 22 minutes to reach the 10⁻¹⁰ region would seem reasonably near the minimum time that could be expected for the vacuum equipment described here, which has an inner chamber size of 17 inches by 30 inches. As the complexity of the apparatus in the vacuum enclosure increased, the thermal time constant and the processing time would also increase. It would be reasonable to expect a time constant of 45 minutes per vacuum cycle for a moderately complex experiment. The cycle for opening the vacuum apparatus would be primarily a function of the thermal time constants of the molten metal seals. If independent seal breaking heaters were used, the opening time could be as short as four minutes; however, it would be in the region of 12 minutes if the entire system had to be heated and cooled to break the seal and drive out the residue from some previous experiment.
5. ULTRAHIGH-VACUUM ACCESSORIES
Because of the high temperature bakeout, a range of accessories is needed for the ultrahigh-vacuum attachment that is not compatible with the high vacuum accessories now commercially available. These accessories are vacuum valves, vacuum locks, electrical lead-ins, optical ports, and cooling-water inlets. The valve and vacuum lock requirements are met by combinations of the wire manipulators and the molten metal seals. A small seal of the type employed in valves and locks is shown in the drawing of Fig. 27, as used with the electron optical system. These valves are patterned after the molten metal seals used in the base of the inner vacuum chamber and are heated by electron bombardment. Electrical leads are fed into the system through high alumina ceramic terminals that have electrodes brazed into them. The entire electrode assembly containing 12 leads of 0.040 molybdenum wire in two linear rows of six each is brazed into the base by electron bombardment while in a 10⁻⁴ mm Hg vacuum. Large insulated lead-ins, such as the one shown on the ion pump in Fig. 33, are brazed in a similar fashion. Optical ports can be obtained by brazing ground and polished discs of sapphire in the baseplate for the inner chamber, but if adequate electrical instrumentation is used there is no need for optical ports. Cooling-water lines can be brazed in position by flanging one end of a pipe while leaving the other straight, as shown in Fig. 33. A small quantity of gold-nickel eutectic can be used many times over for this operation, and it will be found that removal of any part properly brazed in place is a simple matter of reheating in vacuum with an electron bombardment jig and letting the part fall out by gravity or by removing it with a manipulator.
One feature that is a by-product of the high temperature bakeout furnace is the ability to coat the inside of the vacuum apparatus with thin, chemically-resistant films of alumina or silica. This is done by sealing up the system and admitting a small quantity of aluminum chloride and water vapor, or tetraethylorthosilicate, while the temperature is being raised. All apparatus exposed to the chemicals, such as lead wires and connectors, are coated with insulation, and those devices that must be free of coatings should be removed from the system. For some purposes, this coating gives the same results as working in a glass vacuum system.
In addition to conventional ion gauge instrumentation of pressure in the system, it has been found useful to include a field emission cathode near the surface that is to be deposited upon. When this field emitter is used as a field emission microscope electron source, the effect is to produce an integrating vacuum gauge where spurious deposits on the tip can be identified by their emission pattern and by their activation energy. This integrating effect has been used to identify and locate bursts of solid materials that escape detection by ion gauge techniques, but nevertheless form contaminating deposits on the surface being coated. These bursts of solids frequently arise from momentary electrical discharges in the vacuum system which dislodge material from the walls by inadvertent electron and ion bombardment. Materials have also been removed from the walls by displacement reactions in which a short lived species of volatile material is produced that is difficult to see on an ion gauge. A mass spectrometer will be incorporated into the vacuum system in the future to help identify these materials and their sources, but their detection usually requires an integrating device because of their sporadic appearance.
D. Integration of Apparatus
A drawing of one layout for the various pieces of apparatus is shown in Fig. 39. This drawing shows a single deposition and etching chamber that is capable of being sealed off from the main ultrahigh-vacuum system to prevent contamination during the purging cycles between depositions. A valve or vacuum lock is provided in the top to allow the transport of a substrate between the deposition chamber and the electron lens. Storage space for several substrates is provided in the inner vacuum chamber so that various experiments can be carried out without opening the vacuum chamber. Vacuum locks are also provided on the lower side of the deposition chamber so that material not able to withstand the bakeout temperatures may be entered from the cool portions of the high-vacuum chamber. The electron lens assembly is capable of being sealed from the remainder of the vacuum chamber. The sealing cover and the magnetic shield cap shown in Fig. 27 must be manipulated out of the path of the substrate and substrate holder. Necessary storage space for these covers is provided in the inner vacuum chamber. The socket-engaging mechanism for the electron lens is operated through the manipulator located in the bottom of the electron lens enclosure. This manipulator must exert a large force or the socket and manipulator wires cannot be used. The manipulation methods used for the various operations are discussed in Section XII, B. It is necessary to consider manipulation methods that can be hinged at a common point, or quickly removed in order to allow rapid access to the various components of the system. For example, Fig. 39 shows the electron lens covered by the manipulators, but this is not troublesome when disassembling the electron lens enclosure because the manipulator assembly pivots immediately above the upper manipulator wire guides, and the entire assembly can be removed and replaced in a few seconds without disturbing the manipulator wires.
FIG. 39. Layout drawing of ultrahigh-vacuum system containing the electron optical system, deposition and etching chamber, vacuum locks, and mechanical manipulation. (Labeled parts: molybdenum heater, electron lens enclosure, radiation shields, support tubing, outer vacuum chamber, hydraulic hoist, inner vacuum chamber, deposition and etching chamber, deposition chamber valve, manipulator wires, manipulator wire pulleys, manipulator wire guides, manipulator seals, multichannel manipulators, ion pump, manipulator entry chamber, adapter spool, inner vacuum chamber seal, base plate.)
XIII. Electron Microscope Installation
A Hitachi HU-10 electron microscope is used in this work to serve as an analysis tool for the various processes being studied and, in this initial phase of the work, to serve as a high resolution demagnifying lens for the micromachining process. The following accessory items are also useful:

Reflected electron microscopy attachment
Electron spray gun charge neutralizer
Specimen heating and specimen cooling stage
X-ray point projection microscopy attachment
High resolution diffraction attachment

The specifications as stated by the manufacturer are:

Accelerating voltages: 100 kv, 75 kv, 50 kv
Magnification: 400 to 180,000 electronically
Guaranteed resolving power: 10 A
Resolving index (diffraction): selected area diffraction 5 × [exponent illegible]; high resolution diffraction 5 × [exponent illegible]
Illuminating system: double condenser lens with 5 square micron minimum area of illumination
Optical system: three stage magnetic lens system; four sets of pole pieces for the projector, changeable during operation

This instrument has passed resolution and stability tests made by obtaining pictures of the lattice of copper phthalocyanine at a magnification of 800,000 for three consecutive days. This demonstrates a resolution in the range of 8 A. The instrument was installed in a clean room facility, shown in Fig. 40, that also houses the vacuum deposition apparatus used in the micromachining program.

FIG. 40. Hitachi HU-10 electron microscope installation.

XIV. Demonstration of Micromachining
A demonstration of micromachining has been conducted using simple vacuum deposition equipment and a commercial electron microscope. The optimum properties of the process cannot be secured by using such makeshift methods, but the goal of demonstrating the machinability of materials by electron beam techniques is attainable to some small degree. Some applications for electron beam machining would conceivably need no more elaboration than is used in this demonstration. Briefly, this demonstration consisted of depositing molybdenum metal on a thin film of aluminum oxide which is supported on an electron microscope specimen screen, coating the molybdenum with triphenylsilanol, exposing in the electron microscope with a reduced pattern of a screen wire mesh, and etching the molybdenum in chlorine. The final result is shown by a photograph taken in the electron microscope.
A. Substrate Preparation
A copper screen of 200 mesh is covered with a collodion film made by spreading a 2% solution of cellulose nitrate in amyl acetate on water, as described in any standard text on electron microscope specimen screen preparation. The collodion is typically 200 A thick. This screen is then put in a vacuum system, such as the one shown in Fig. 31, and coated with a 200 A-thick film of aluminum oxide by evaporating from a tungsten basket heated by electron bombardment in the apparatus shown in Fig. 13 and described in Section VIII, A on Material Deposition. The copper screen and films are heated to around 800°C by the small ceramic substrate heater shown in Fig. 24 to drive off the collodion and stabilize the evaporated alumina. Higher temperatures would be desirable, but the copper screen is prone to evaporate and melt before the recrystallization of the alumina at 900°C; this is caused by the fact that the copper is hotter than the film of alumina, since they are not in blackbody equilibrium with the substrate heater.

B. Film Deposition
The molybdenum was evaporated onto the alumina surface with the apparatus shown in Fig. 13. The molybdenum source is simply a self-supporting rod of vacuum melted molybdenum that is heated by electron bombardment. The thickness of the molybdenum film is optimum for our purposes when between [lower bound illegible] and 200 A. Heavy deposits cannot be penetrated by the electron beam during later analysis for surface defects, such as spotty resists or piles of foreign material, that may have been produced. Thinner films fail to give adequate contrast in the micrograph due to electron scattering from the substrate. The temperature of the substrate during the deposition is not important but is usually in the region of 300°C to prevent oil films from being deposited simultaneously with the molybdenum. The vacuum was rarely better than 5 × 10⁻⁵ mm Hg and the oil pumps have been known to backstream and contaminate various samples. The rate of deposition was fairly slow for the molybdenum source that has been used because of the small diameter. A typical deposition required three minutes for a 150 A film of molybdenum. After deposition, the film is heated to around 700°C to stabilize and recrystallize the molybdenum. A micrograph of the film after processing is shown in Fig. 41, and a diffraction pattern in Fig. 42.

FIG. 41. Electron micrograph of 200 A-thick molybdenum film deposited on 200 A-thick aluminum oxide by thermal evaporation.

The film is cooled to room temperature and a coating of triphenylsilanol is applied by vacuum evaporation from a small glass crucible with the same apparatus used previously. The crucible is held in a molybdenum wire holder which receives the electron bombardment for heating. A film thickness of about 50 A is optimum for the resist layer. Thicker films tend to agglomerate and grow into feathery shaped patches. Thinner films can be used but the possibility of developing a hole in the resist is higher. Upon completion, the composite of films was removed from the vacuum system and inserted in the electron microscope for exposing.

C. Resist Exposing
The Hitachi HU-10 electron microscope pictured in Fig. 40 and described in Section XIII has been modified slightly by installing a removable 500 mesh screen above the objective lens in such a way that the screen can be demagnified by 200 times. The focal length of the objective lens is 0.5 mm during normal operation and the screen must be spaced about 100 mm above the principal plane to obtain the 200 to one demagnification. The specimen holder for the microscope was modified by extending it approximately one-half millimeter below the principal plane of the lens. This specimen holder was used for exposing only, and not for microscopy. Approximately 50 specimen holders come with the instrument so that modifying one is of little consequence.

FIG. 42. Electron diffraction pattern of molybdenum-alumina sample shown in Fig. 41.

The illumination system was stopped down by using a small aperture in the second condenser lens. This lens has three movable apertures that can be changed during the operation of the instrument without breaking the vacuum. The aperture size used for exposing resists was approximately 0.002 inch in diameter. The illuminating intensity of the instrument was reduced to the lowest possible value by increasing the grid bias on the electron gun with the switch provided for that purpose. The intensity was further reduced by reducing the filament temperature. The only method of controlling the aperture angle of the illuminating electron beam into the objective lens is by adjusting the second condenser lens current. The condenser lens thus determines the number of grid wires seen and the objective lens controls the size of the image, although strictly speaking there is only one magnification that is properly focused, namely 200. The current density under typical exposure conditions is about 10 ma/cm² and the beam accelerating voltage is 50 kv.

During exposure, the intermediate and projector lenses were used as a microscope to observe the position of the specimen screen. These lenses do not have enough resolution to observe the fine detail of the image used for exposing, nor is the brightness high enough to see any detail. At best the microscope was used to tell whether or not the exposure was being made on an open area of the specimen screen. The method that was most effective for determining correct focus was to cause the beam to converge to a point on the specimen screen, as indicated by the point projection microscopy produced, and then to increase the objective current a small but known amount. Another indication of focus can be observed in the diffraction pattern of the sample, whereby the minimum number of spots appears when the beam is at crossover. After proper alignment the exposures were made by moving the manipulator for the specimen screen in small increments, observing on the fluorescent screen that the exposures were being made on the open spaces of the specimen screen. The exposures had to be long enough to prevent blurring of the image during the time in moving from one sample to the next because beam blanking was not used.
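The exposure geometry quoted above can be checked with a rough thin-lens calculation. The sketch below is only an idealization: it treats the objective as a thin lens (an assumption, since a magnetic objective is a thick lens), the function names are hypothetical, and the one-second exposure used for the dose estimate is an assumed value. The focal length, screen spacing, mesh count, demagnification, and current density are the figures given in the text.

```python
# Thin-lens sketch of the resist-exposure geometry.
# Assumptions (not from the source): ideal thin-lens behavior, all function
# names, and the one-second exposure time used in the dose estimate.

ANGSTROM_PER_INCH = 2.54e8      # 1 inch = 2.54 cm = 2.54e8 angstroms
ELECTRON_CHARGE_C = 1.602e-19   # coulombs per electron


def demagnification(focal_length_mm, object_distance_mm):
    """Object-to-image size ratio for an ideal thin lens (object beyond the focus)."""
    return (object_distance_mm - focal_length_mm) / focal_length_mm


def image_distance_mm(focal_length_mm, object_distance_mm):
    """Distance of the focused image from the principal plane (thin-lens equation)."""
    return focal_length_mm * object_distance_mm / (object_distance_mm - focal_length_mm)


def image_pitch_angstrom(mesh_per_inch, demag):
    """Wire-to-wire spacing of the demagnified mesh image, in angstroms."""
    return ANGSTROM_PER_INCH / (mesh_per_inch * demag)


def electrons_per_square_angstrom(current_density_ma_cm2, exposure_s):
    """Electron dose delivered for a given current density and exposure time."""
    coulombs_per_cm2 = current_density_ma_cm2 * 1e-3 * exposure_s
    electrons_per_cm2 = coulombs_per_cm2 / ELECTRON_CHARGE_C
    return electrons_per_cm2 / 1e16   # 1 cm^2 = 1e16 square angstroms


if __name__ == "__main__":
    # Values from the text: f = 0.5 mm, screen about 100 mm above the lens.
    print(f"demagnification: {demagnification(0.5, 100.0):.0f} to one")
    print(f"image distance: {image_distance_mm(0.5, 100.0):.4f} mm")
    # 500 mesh screen reduced 200 times.
    print(f"image pitch: {image_pitch_angstrom(500, 200):.0f} A")
    # 10 ma/cm2 for an assumed one-second exposure.
    print(f"dose: {electrons_per_square_angstrom(10, 1.0):.2f} electrons per square A")
```

Under these assumptions the lens gives about 199 to one with the image roughly 0.5 mm below the principal plane, consistent with the stated 200 to one demagnification and the half-millimeter extension of the specimen holder. A 500 mesh screen reduced 200 times has a wire pitch of about 2540 A.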
A one to three second exposure was normal since the manipulator could be moved rather rapidly from one spot to the next.

D. Etching
After exposing, the sample was returned to the vacuum system and inserted in the small substrate heater. The chamber was evacuated to the region of 5 × 10[exponent illegible] mm Hg and the sample was heated to 600°C. A stream of chlorine was admitted to the system through a valve and a length of aluminum tubing. The tubing was terminated within 5 inches of the surface being etched; due to the pumping action of the vacuum system the pressure was higher at the surface than is indicated on a remote gauge. The chlorine pressure was regulated by hand to a value of 5 × 10[exponent illegible] mm Hg, as indicated by operating one of the evaporator positions as an ion gauge. The etching was completed in a few seconds; the chlorine pressure was reduced; and the substrate temperature lowered. Observation of the surface during etching revealed that the appearance changed from a reflecting metallic surface to a transparent dielectric surface. The sample was removed from the vacuum system and inserted in the electron microscope for viewing. Figures 43 and 44 show typical samples.

E. Discussion of Results
As indicated on Fig. 44, the image size is such that the screen wire spacing is 2500 A or 100,000 mesh per inch. The image has been distorted into a pincushion shape by the electron optical system so that the current density falls off approximately as the square of the distance from the center of the axis. This effect is beneficial for determining the condition of exposure and the effect of the etching on underexposed areas. As can be seen, the transition region between properly exposed areas and unexposed areas produces patchy etching caused by having the resist too thin. In the center region the transition is rapid enough to produce relatively sharp edges. The resolution shown in the micrograph is in the region of 300 A, with the principal error resulting from the raggedness caused by the background fog effect.

FIG. 43. Low magnification electron micrograph of micromachined molybdenum 200 A-thick on aluminum oxide substrate.

FIG. 44. High magnification electron micrograph of micromachined molybdenum 200 A-thick on aluminum oxide substrate.

The background fog, or patches of molybdenum about 300 A in diameter, has been caused by migration of unstable, recrystallizing aluminum oxide during the various heating cycles. This aluminum oxide forms a chemically resistant film around the small granules of molybdenum and prevents further etching by chlorine. This background fog can be removed by using a very short etch of phosgene gas to remove the thin film of alumina. Close inspection of Fig. 44 reveals that most of the molybdenum particles lie adjacent to a light area in the supporting substrate, which could be interpreted to mean that the alumina came from these areas. By using high temperature screens and stabilizing the alumina to alpha form, most of the background can be removed, although there is still a very small reaction between the molybdenum and the alumina. The dark areas in the molybdenum film are caused by selective electron scattering within the molybdenum crystallite. When these films are being observed in the electron microscope the areas alternate between light and dark, and under darkfield illumination they are even more pronounced. The thickness of the film is apparently uniform.

Figure 45 shows a specimen screen of silicon on silicon dioxide made by techniques comparable to the ones used for the micromachining of molybdenum. The specimen support film had been torn from the screen and rolled back to produce the dark ragged line adjacent to the light area. As can be seen, several images were exposed on the same screen opening. These samples were oxidized slightly in transferring between the electron microscope and the vacuum deposition chamber, which resulted in another type of background fog, namely, silicon dioxide on silicon.
The only completely effective remedy for this is to keep the silicon under vacuum at all times by incorporating the electron optical system with the deposition apparatus.

This demonstration has purposely been done with relatively crude equipment and techniques, with the exception of the electron microscope, so that it could be carried out without modification in any establishment outfitted with normal deposition apparatus and having access to an electron microscope. With only minor additions, almost any electron microscope with resolution better than 100 A can be used for exposing.

XV. Summary

A. Microelectronic Component Considerations
The fabrication of one-micron-sized components, incorporated into an electronic system having over-all dimensions of one inch on a side and containing 10¹¹ components per cubic inch, is presently limited by construction techniques. Interconnection between widely separated components is constrained by the high electrical resistance of transmission lines, which can have a dc resistance of 10⁵ ohms per linear inch for 3000 A diameter conductors. To keep potentially high transmission line loss to acceptable levels, active components of high impedance are required. Under these conditions, an optimum switching time constant for the active component lies in the 10⁻¹¹ sec region; shorter switching times than 10⁻¹¹ sec result in excessive interconnection loss at room temperature and above. Power densities of 10[exponent illegible] watt/cm² appear permissible for one-micron-sized components; these high power densities permit high data processing rates. For large arrays of components, a low quiescent power for each component is required; peak power density during operation should be as high as 10¹⁰ above the quiescent state. LC filters of one-micron size would be expected to have a Q as much as 10⁸ below the Q for conventional sized filters, thus making them relatively useless. RC filters would be prone to temperature drift when scaled down in size because of the necessity for scaling up resistivity and the attendant higher temperature coefficient of resistivity. Electrostatically operated electromechanical filters that appear well suited to the micron size range and construction processes are discussed. The resistivity of conductors is not affected in the size range under investigation, but in the submicron range the dielectric breakdown strength of insulators can be increased to 10⁸ volt/cm. Scaling down in size gives a high surface-to-volume ratio that causes materials to migrate under the action of surface tension forces, and in addition causes increased carrier recombination at the surfaces of semiconductors. The most difficult problems in material handling involve the production of uniform crystallinity in films.

FIG. 45. Electron micrograph of micromachined silicon 200 A-thick on silica 100 A-thick.

B. Tunnel Effect Components
Electronic components based upon the quantum mechanical tunneling of electrons from a metal into a high vacuum are proposed; with this technique, only stable metals and dielectrics are employed in various geometries to produce diodes, triodes, and tetrodes. With suitable design that makes use of the ballistic properties of the electrons in vacuum, control grids and screen grids of such devices can be operated at a positive potential without drawing appreciable currents. Operating voltages as low as 10 volts, with currents in the region of 100 µamp for 10⁴ amp/cm² current densities, appear possible without adverse space charge effects, owing to the high field intensities used. Normal grid voltage variations used in various circuits would produce a change in cathode current of about 10¹⁰, which is highly desirable for low power quiescent operation. The amplification factor for triodes can be in the vicinity of 100, with a transconductance of 1000 µmho/ma and a plate resistance of about 10⁵ ohms. With a power density of 10[exponent illegible] watt/cm², a switching time constant of 10⁻¹⁰ sec appears possible for a one-micron-sized device, with an indicated transit time lag of [value illegible] sec. High-frequency grid-loading would be negligible at 100,000 Mc because of the short transit time lag. An array of 10¹¹ vacuum tunnel effect devices contained in one cubic inch of space could operate from arbitrarily low temperatures up to around 800°C. At a maximum data processing rate of 10¹⁵ bits per second, the machine would self-heat to around 800°C. The insensitivity of the device to local crystallographic properties would indicate an immunity to radiation damage about 10[exponent illegible] times greater than present single-crystal semiconductor devices.

Methods of potentially forming stable, multiple-tip field emission cathodes having radii of 100 A are discussed; here, self-formation methods would be used to generate arrays of over 10⁸ active components per layer. The methods would have to form components with uniform characteristics by degenerative processes. Vacuum encapsulation methods are discussed that are applicable to large arrays of film components, and corrosion methods of testing the encapsulation are described which indicate a component lifetime of several hundred years. Solid-state tunnel-effect amplifiers are considered; however, high grid current, low impedance, temperature sensitivity, and device nonuniformity make them less desirable than vacuum devices.

C. Accessory Components
Secondary electron emission effects having high stability, high current density, negligible time delay, and temperature insensitivity are discussed for application to transmission-type electron multipliers and for coupling between tunnel effect components. Multiplier phototubes having 100 Mc bandwidth, negligible transit time spread, and a sensitivity adequate for detecting [value illegible] ft-c appear possible using film techniques. Such tubes would make photodetectors available with diameters ranging between 0.01 inch and 0.2 micron for application to the interconnection of large arrays of components, and for microdocument reading. A method for using electronic micromachining techniques to record data on glass plates, with subsequent electrooptical read-out, yields a document storage scheme with a data density of 10¹¹ bits per square inch and a readout rate of 10¹⁰ bits per second. Electrostatically operated mechanical relays, operating in vacuum at 30 volts and with frequencies up to 10 Mc, seem applicable to switching low level signals and for power distribution. Electromechanical filters composed of simple metal and dielectric diaphragms could serve as communication filters between 4 Mc and 600 Mc. Field strengths as high as 10⁷ volt/cm make possible electromechanical coupling coefficients up to 0.6, which would permit temperature compensation by electrical interaction with internal thermal bimetal capacitors.

D. Component Interconnection
In the absence of thermally stable fixed resistors, tunnel effect diodes will be employed for some resistor functions in an effort to match the temperature coefficients of active elements. Low dissipation circuits are discussed in which the wide current swing of tunnel effect components is utilized. These low dissipation circuits would be employed in an active memory in which the switching time may be 10⁻¹⁰ sec, with quiescent power of [value illegible] watt for either a negative resistance or a flip-flop type of circuit element. Low noise amplifiers are discussed in which a field emission cathode produces a virtual, space-charge-limited cathode for a negative grid tube. A method of using electrically steerable electron guides for the purpose of eliminating transmission lines from future systems and for increasing the logical freedom of a system is also discussed.

E. Substrate Preparation
Sapphire plates, 1 inch square and 0.01 inch thick, are considered to be an optimum substrate size. The plates are mechanically ground and polished; cleaned by high-vacuum methods using the explosive removal of a film of material from the surface; then smoothed in vacuum by depositing a film on a temporary fill material, volatilizing the fill material, and sintering the film into the substrate. Terminals are prepared by firing molybdenum metal electrodes onto the plate in a high-vacuum furnace.
F. Material Deposition
Electron bombardment and resistively-heated thermal evaporation sources have been built permitting automatic regulation of evaporating material to within 2% of a predetermined value by using ion gauge monitors and electronic regulators. A substrate heater is described that is heated by a 3 kv, 0.1 amp electron beam. The heater is completely enclosed by ceramic and is capable of achieving 1700°C operating temperatures for use with reactive deposition. Reactive deposition methods are discussed in which dual thermal evaporators are used to deposit stable, dense, and pinhole-free materials like molybdenum and aluminum oxide films. These films were tested in the form of an encapsulated capacitor having a 200-A-thick dielectric at temperatures of 800°C and field strengths of 10⁷ volt/cm to reveal impurities or imperfections. Solid state tunnel emission was observed to occur between electrodes. Tests for stability of aluminum oxide and molybdenum were carried out on the tips of a field emission microscope; the tests revealed no impurities caused by the deposition and etching process. Methods of potentially growing single crystal films of materials involving reactive deposition and a sweeping thermal gradient are discussed relative to deposition processes.

G. Material Etching
Using molecular beam techniques in a high-vacuum chamber, both solid and gaseous sources are employed to etch the surface of materials such as molybdenum and aluminum oxide, converting them to volatile compounds. Tests for the cleanliness of this process have been carried out on field emission microscope tips and no evidence of etchant residue was observed. The parameters necessary to etch holes with depth-to-diameter ratios of 10 : 1 in a molybdenum film are discussed. Atomic beam etching methods are discussed for application to low temperature etching of materials, such as the room temperature etching of silicon by atomic hydrogen. Methods of using ion beams to sputter material from a substrate have shown very straight sided etching with no undercutting. Depth control methods effective to 1% are considered for various etching methods.

H. Resist Production
Evaporated materials such as silica and alumina make effective etching resists when deposited through masks; however, the resolution is limited by the mask. Resists can be produced by electron bombardment of materials such as triphenylsilanol, which decomposes to form silica with a quantum yield of about one and a resolution of over 100 A for films 20 A thick. Multilayer methods of producing resists have shown a quantum yield of 10⁴ molecules per electron with a resolution of 300 A. Consideration is given to finding resist-producing processes that are compatible with vacuum processing and the electron optical system. One of the chief requirements is to maintain the surface unipotential during resist formation, thus preventing electron beam distortion.

I. Electron Optical System
The design of an electron optical system is described that is intended to become a micromachining electron source, a scanning electron microscope, a mirror microscope, and an X-ray fluorescence probe. An electrostatic lens system made from metalized ceramic parts is currently being built; the lens system is capable of being baked in ultrahigh vacuum to a temperature of 900°C without the need for mechanical realignment to obtain 200 A resolution. Problems in obtaining registration between adjacent fields are discussed, and a registration of 500 A is predicted. The scanning microscope may approach a resolution of 200 A, which is equivalent to the micromachining mode. Only 10⁶ bits per field would be expected for these two modes of operation because of limitations in the deflection system. An electron multiplier that is integrated with the lens should provide most of the necessary gain for the scanning microscope. The X-ray fluorescence mode of operation should ultimately give a 1% quantitative analysis on [value illegible] grams of material, with a resolution of one micron. Operation as a mirror microscope would give the ability to measure voltages down to 0.2 volt with a resolution of 500 A and make dynamic voltage measurements to 1000 Mc. By using the lens as an emission microscope, multiple cathode field emission arrays could be imaged with 300 A resolution and thus greatly assist in forming uniform arrays. The requirements for various pattern generators are discussed; the ultimate generator would be able to take full advantage of the lens by producing a 10[exponent illegible]-bit pattern every tenth of a second. The construction techniques for producing accurate lens elements are reviewed and some of the tests performed on the elements are described.

J. Vacuum Apparatus
A 24-inch diameter, metal, bell jar-type vacuum system is described which is pumped by a 30 cfm roughing pump and a 6-inch oil diffusion pump. An adapter spool having 18 entry ports for accessories is shown, and accessories such as cooling-water manifolds, electrical lead-ins, an optical port, and multiple channel manipulators are described. A demountable ultrahigh-vacuum attachment is described which, when completed, should be capable of attaining a vacuum of 1 × 10⁻¹⁰ mm Hg in 25 minutes. A 900°C bakeout temperature is employed in this differentially pumped unit with a molten metal seal separating the high-vacuum and the ultrahigh-vacuum regions, the ultrahigh vacuum being produced by ion pumping. A range of accessories for the ultrahigh-vacuum unit is discussed, and considerations for integration of the electron optical system and the deposition and etching apparatus are presented.
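Two of the order-of-magnitude figures quoted in Sections XV, A and XV, B of this summary can be cross-checked with elementary arithmetic. The sketch below is illustrative only: the function names are hypothetical, the conductor is assumed to be a uniform round wire obeying R = ρL/A, and the emitting area is assumed to carry a uniform current density. The input numbers (100 µamp at 10⁴ amp/cm², and 10⁵ ohms per linear inch for a 3000 A diameter conductor) are taken from the summary.

```python
import math

# Cross-checks of summary figures.
# Assumptions (not from the source): uniform current density over the emitter,
# a uniform round conductor with R = rho * L / A, and all function names.

ANGSTROM_CM = 1e-8   # one angstrom in centimeters
MICRON_CM = 1e-4     # one micron in centimeters
INCH_CM = 2.54       # one inch in centimeters


def emitting_area_cm2(current_amp, current_density_amp_cm2):
    """Area needed to carry a current at a given uniform current density."""
    return current_amp / current_density_amp_cm2


def implied_resistivity_ohm_cm(resistance_ohm_per_inch, diameter_angstrom):
    """Resistivity a round wire must have to show the given resistance per inch."""
    radius_cm = 0.5 * diameter_angstrom * ANGSTROM_CM
    area_cm2 = math.pi * radius_cm ** 2
    return resistance_ohm_per_inch * area_cm2 / INCH_CM


if __name__ == "__main__":
    area = emitting_area_cm2(100e-6, 1e4)
    side_microns = math.sqrt(area) / MICRON_CM
    print(f"emitting area: {area:.1e} cm2 (a square about {side_microns:.1f} micron on a side)")

    rho = implied_resistivity_ohm_cm(1e5, 3000)
    print(f"implied conductor resistivity: {rho:.2e} ohm-cm")
```

Under these assumptions, 100 µamp at 10⁴ amp/cm² corresponds to an area of 10⁻⁸ cm², that is, one micron on a side, which agrees with the one-micron device size discussed. The quoted line resistance corresponds to a resistivity of roughly 2.8 × 10⁻⁵ ohm-cm for the conductor material.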
References 1. Chrr, P. H., K\'cw method of recording elertrons. Rev. S c i . Instr. 1, 711 (1930). 2. Niaon, W. C'., The point projcction X-ray rnicrosc+opcas a point source for microhcain X-ray diffr:wtion, i n S - R a y Mirroscopy aiicl Microrcrdiogruph?! (V. E. Cosslet, A. Ehgstrom, and H. H. Pattee, cds.), p. 336. Academic Press, New York, 1957. 3. Itovirisky, 13. hf., Lutsau, V. G., and Avdeyenko, A. I., X-ray microprojector, in X - R a y Microscopy and Microradiography (V. E. Cosslet, A. Engstrom, and H. €1. Pattee, eds.), p. 269. Academic Press, New York, 1957. 4. Powers, D. A,, and Von Hippel, A., Progress reports show consistent breakdown strengths between 50 and 200 megavolts per centimeter for film thicknesses between 700 and 4000 :tiigstronrs using aluinino d i r a t c glass. Afassachusetts Inst. of Technol. Lab. for Insulation Research, Cambridge, Massachusetts. 5. Good, It. €I., Jr., and Mueller, E;. W., Field emission, in Randbuch der Physik ( S . Fldgge, ed.), Vol. 21, Part I, p. 202 Springer-Verlag, Berlin, 1956. 6. Dyke, R. P., and Dolan, W. W., Field emission. Advaiares in Electronics and Electron Phys. 8, 153 (1956). 7. Giraedigiier, 11. J , Some aspects of vac~i\iindeposition of metals in trnnsistor falxication, in Y'ratls. 6th Natl. S!jrrzpociutn otL Vacuum Technology, 1958, 11. 235. 7a. Shoulders, K. R., 011microelectronic coniponent~,interconnections, and system
290
KENNETH R. SHOULDERS
fabrication, in Proc. WesternJoint Computer Covg. pp. 251-258. San Francisco, 1960. 8. Malter, L., Thin film field emission. Phys. Rev. 50, 48 (1936). 9. Firth, B. G., Investigation in Utilization of Self-sustained Field Emission, Contract DA-36-039 SC-73051, DA Project No. 3-99-13-022, Signal Corps Project No. 112R. U.S. Army, 1957. 10. Zalm, P., Electroluminescence in zinc sulphide type phosphors. Philips Research Rep. 11,353, 417 (1956). 11. Mott, N. F., and Gurney, R. W., Electronic Processes in Ionic Crystals, 2nd Ed., p. 267. Oxford University Presa (Clarendon) London and New York. 12. Charbonnier, F. M., Brock, E. G., Sleeth, J. D., and Dyke, W. P., The application of field emission to millimeter-wave oscillators and amplifiers, Final Report, Contract DA-36-039 SC-72377, DA Project No. 3-99-13-022, Signal Corps Project No. 112B. U.S. Army, 1958. 13. Martin, E. E., Pitman, H. W., and Charbonnier, F. M., Research on Field Emission Cathodes, WADC Tech. Report No. 5IT97 (ASTIA Document No. AD-210565). Wright Air Development Center, Cincinnati, Ohio, 1959. 13s. Griffith, J. W., and Dolan, W. W., Field emission cathode ray tube development, WADC Tech. Report 5&8 (ASTIA Document No. AD-155723). Wright Air Development Center, Cincinnati, Ohio, 1958. 14. Young, R. D., and Mueller, E. W., Experimental determination of the total energy distribution of field emitted electrons, paper presented a t Field Emission Symposium. University of Chicago, Chicago, Illinois, 1958. 15. Dyke, W. P., and Dolan, W. W., Field emission. Advances in Electronics and Electron Phys. 8, 109 (1956). 15a. Hansen, W. W., Applied Research in Microminiature Field Emission Tubes, Contract DA-36-039 SC-84526, Task No. 3A99-13-001-01, Signal Corps. Research and Development Lab. Quart. Progress Report Nos. 1 4 . U.S. Army, 1960. 16. Mueller, E. W., Field ion microscopy of damage in tungsten by bombardment with alpha particles and cathode sputtering. Paper presented a t 20th Ann. Conf. 
on Physical Elcctronics. Mass. Inst. Technol., Cambridge, Massachusetts, 1960. 17. Dyke, W. P., and D o h , W. W., Field emission. Advances i n Electronics and Electroii Phys. 8, 158 (1956) 18. Holm, R., Electric Contacts Ziandbook, 3rd ed., p. 27. Springcr-Verlag, Berlin, 1958. 19. Boyle, W. S., Kisliuk, P., and Germer, L. H., Elcctricsl breakdown in high vacuuni. J . A p p l . Phys. 26, No. 6, 720 (1955). 20. Good, 11. H., and Mueller, E. W., Field emission, in Handbuch der Physik (S. Flugge, ed.), Vol. 21, Part I, p. 213. Springcr-Verlag, Berlin, 1956. 21. Good, R. H., and Mueller, E. W., Field emission, in Handbuch der Physik (S. Flugge, ed.), Vol. 21, Part I, pp. 212,214. Springer-Verlag, Berlin, 1956. 22. Dyke, W. P., and Dolan, W. W., Field emission. Advances in Electronics and Electron Phys. 8, 158 (1956). 23. Herring, C., The use of classical macroscopic concepts in surface energy problems, in Structure and Properties of Solid Surfaces. (It. Comer and C. S. Smith, eds.), p. 80. University of Chicago Press, Chicago, Illinois, 1953. 24. Good, R. H., and Mueller, E. W., Field emission, in Handbuch der Physik (S. Fliigge, ed.), Vol. 21, Part I, p. 218. Springer-Verlag, Berlin, 1956. 25. Good, R. IT., and Mueller, E. W., Field emission, in Handbuch der Physik (S. Flugge, ed.), Vol. 21, Part I, p. 202. Springer-Verlag, Berlin, 1956. 26. Powell, C. F., Campbell, I. E., and Gonser, B. W., Vapor Plaling, p. 37. Wiley & Sons, New York, 1955.
Recent Developments in Linear Programming
SAUL I. GASS
International Business Machines Corporation, Washington, D.C.
1. Decomposition Algorithm
2. Integer Linear Programming
3. The Multiplex Method
4. Gradient Method of Feasible Directions
5. Linear Programming Applications
   5.1 Critical-Path Planning and Scheduling
   5.2 Structural Design
   5.3 Other Applications
6. Summary of Progress in Related Fields
   6.1 Curve Fitting
   6.2 The Theory of Games
   6.3 Stochastic Linear Programming
   6.4 Nonlinear Programming
7. Linear Programming Computing Codes and Procedures
   7.1 Digital Computers
   7.2 Analog Computers
8. SCEMP
9. Linear Programming in Other Countries
References
Any attempt to describe "recent" advances in linear programming can, at best, expect to obtain an incomplete and biased picture. The spirit of this survey was to record and highlight recent theoretical and applied areas which would be of interest to the audience of this volume. A large number of persons and groups contributed to the survey. Many made available their reports and papers (some unpublished) to the author. Some were kind enough to conduct their own "internal" survey and contribute it for this broader survey. We wish to thank all who have participated. Special thanks are due to Thomas L. Saaty for his sections on the multiplex method and the gradient method of feasible directions.

We have grouped the material into a number of sections. Each section is self-contained, but we hope the reader will not lose sight of the fact that the total survey measures and acknowledges the broad and important impact linear programming has made throughout the world.
1. Decomposition Algorithm
Although it is theoretically possible to solve any given linear programming model, the analyst is quickly made aware of certain limitations which restrict his endeavors. Chief among these limitations is the problem of dimensionality. Almost all difficulties that arise in the development of a programming problem can be related to its size. This is certainly true for such restrictive items as the cost of data gathering, matrix preparation, computing costs, reasonableness of the linear model, etc. To the early workers in this field the problem of size was an apparent mountain range to be overcome. It was then recognized that even though the computing powers of the National Bureau of Standards SEAC and UNIVAC I were available, and more versatile and powerful computers were on the horizon, special computing techniques had to be developed to speed the solution process in order to obtain accurate solutions at a reasonable cost. Today we find ourselves with the mountain range still in front, but the foothills behind us. This ascent up the slope of dimensionality has been mainly due to the successful investigation of special systems. Here we mean that the mathematical structure of the model has certain helpful or peculiar features which enable us to either transform the problem into a simpler one of lower dimensions, or to reduce greatly the solution process by a clever variation of a standard computing procedure. The major example of this type of reduction and transformation is the simplex method adapted by Dantzig to handle the structure of the transportation problem, and the variety of problems that can be transformed into a transportation-type problem (e.g., the caterer problem). In line with this, A. J. Hoffman observes in a letter to the author that almost all linear programming problems known to have a quick solution, i.e., by inspection, can be transformed into a transportation problem.¹
¹ Hoffman is developing a general principle for solving some transportation problems by inspection. He notes that the basic idea behind this principle was first noted by Monge and that Hoffman developed the idea while studying some work of Frechet. Following convention, he has titled this procedure the French method.

The development of procedures for the solution of large-scale systems is reviewed in [42]. There Dantzig discusses techniques to reduce the computation time of large systems from two points of view, (1) by decreasing the number of iterations, and (2) by finding a compact form for the inverse of the basis. To cut down the number of iterations, he notes a number of variants of the simplex method which have been proposed to replace the usual Phase I of the simplex method. He includes Beale's method of leading variables, Orchard-Hays' composite simplex algorithm, Dantzig, Ford, and Fulkerson's primal-dual algorithm, and Markowitz's maximum decrease of objective form per decrease of infeasibility form. We will return to this particular topic in a later section. Proposals for finding a compact form for the inverse include Markowitz's sparse basis technique, Dantzig's block triangular basis, and the decomposition algorithm by Dantzig and Wolfe for angular systems and multistage systems of the staircase type. As much promising work has been done in this latter area, the rest of this section will be concerned with the decomposition algorithm for linear programming [48, 223].

For many problems the constraints consist of rather large subsets of equations which are related in that they refer to the same time period or same production facility, and these subsets are tied together by a small set of equations. These "tie-in equations" might represent total demand for a product. In problems of this sort we have, in a sense, a number of separate linear programming problems whose joint solution must satisfy a set of additional restrictions. If we boxed in the sets of constraints and corresponding part of the objective function, Fig. 1 would result.
FIG. 1. A block angular system.
Here we have partitioned the original problem of $Ax = b$, $x \ge 0$, $cx$ a minimum into the decomposed program of finding the vectors $x_j \ge 0$ ($j = 1, 2, \ldots, n$) such that

$$\sum_j A_j x_j = b \tag{1.1}$$

$$B_j x_j = b_j \tag{1.2}$$

$$\sum_j c_j x_j \ \text{is a minimum,} \tag{1.3}$$

where $A_j$ is an $m \times n_j$ matrix, $B_j$ is an $m_j \times n_j$ matrix, $c_j$ is an $n_j$-vector, $b$ is an $m$-vector, and $b_j$ is an $m_j$-vector; $x_j$ is a variable $n_j$-vector. As written above, this problem has $m + \sum m_j$ constraints and $\sum n_j$ variables.

In attempting to reduce the dimensionality of the above problem, Dantzig and Wolfe consider the following development. Assume that for each $j$, there is available the corresponding convex set $S_j$ of solutions to each subproblem $B_j x_j = b_j$, $x_j \ge 0$. Then the solution of the original problem could be thought of as the selection of a convex combination of solution points from $S_j$ for each $j$ so as to satisfy the tie-in restrictions $\sum A_j x_j = b$ and make $\sum c_j x_j$ a minimum. Although this bare idea of the decomposition appears fraught with its own difficulties, there are a few saving features which enable us to consider the optimization of an $m + n$ constraint problem subject to the solution of $n$ $m_j \times n_j$ subproblems instead of one large problem with $m + \sum m_j$ constraints.²

The new problem to be considered is called the extremal program and it arises in the following fashion. Let us consider for some $j$ a particular extreme point solution $x_{jk}$ of the convex set of solutions $S_j$ to the subproblem $x_j \ge 0$, $B_j x_j = b_j$. Define for each such extreme point $k$

$$P_{jk} = A_j x_{jk}, \qquad c_{jk} = c_j x_{jk}.$$

The extremal program is to find the numbers $s_{jk} \ge 0$ satisfying for all $(j, k)$

$$\sum_j \sum_k P_{jk} s_{jk} = b \tag{1.4}$$

$$\sum_k s_{jk} = 1 \qquad (j = 1, \ldots, n) \tag{1.5}$$

$$\sum_j \sum_k c_{jk} s_{jk} \ \text{is a minimum.} \tag{1.6}$$
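As a minimal sketch of how the columns of the extremal program are formed, the fragment below builds $P_{jk} = A_j x_{jk}$ and $c_{jk} = c_j x_{jk}$ for a toy block angular system and checks that a choice of weights $s_{jk}$ gives the same tie-in value and cost whether it is evaluated in the extremal variables or in the recovered $x_j$. The data (two blocks, their extreme points, one tie-in row) and the helper names are hypothetical and chosen only for illustration; the extreme points are simply listed, which is feasible only for very small subproblems.

```python
from fractions import Fraction as F

def matvec(A, x):
    """Matrix-vector product A x for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(c, x):
    """Inner product of two vectors."""
    return sum(ci * xi for ci, xi in zip(c, x))

# Hypothetical data (not from the text): n = 2 blocks, m = 1 tie-in row.
# S_1 = {u >= 0 : u1 + u2 = 4} and S_2 = {v >= 0 : v1 + v2 = 6},
# each described here by its list of extreme points.
blocks = [
    {"A": [[1, 0]], "c": [1, 3], "extreme_points": [[4, 0], [0, 4]]},
    {"A": [[1, 0]], "c": [2, 1], "extreme_points": [[6, 0], [0, 6]]},
]
b = [5]  # tie-in right-hand side: u1 + v1 = 5

def extremal_columns(block):
    """Return (P_jk, c_jk) = (A_j x_jk, c_j x_jk) for each extreme point."""
    return [(matvec(block["A"], x), dot(block["c"], x))
            for x in block["extreme_points"]]

def combine(block, weights):
    """x_j = sum_k s_jk x_jk, the point of S_j encoded by the weights."""
    size = len(block["extreme_points"][0])
    return [sum(w * x[i] for w, x in zip(weights, block["extreme_points"]))
            for i in range(size)]

# One feasible choice of weights s_jk; each block's weights sum to 1.
weights = [[F(1, 2), F(1, 2)], [F(1, 2), F(1, 2)]]

tie_in = [F(0)] * len(b)   # left-hand side of (1.4)
extremal_cost = F(0)       # objective (1.6)
original_cost = F(0)       # objective (1.3) at the recovered x_j
for block, s in zip(blocks, weights):
    assert sum(s) == 1 and all(w >= 0 for w in s)   # constraint (1.5)
    for (P, c_jk), w in zip(extremal_columns(block), s):
        tie_in = [t + w * p for t, p in zip(tie_in, P)]
        extremal_cost += w * c_jk
    original_cost += dot(block["c"], combine(block, s))

print(tie_in == b, extremal_cost, original_cost)
```

The sketch only evaluates one feasible point; it does not optimize. Its purpose is to show that (1.4) and (1.6) are the tie-in constraints and objective of the decomposed problem rewritten in terms of the weights.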
As Dantzig and Wolfe note, "The relation of the extremal problem to the original problem lies in the fact that any point of $S_j$, because it is bounded (assumed) and a convex polyhedral set, may be written as a convex combination of its extreme points, that is, as $\sum_k x_{jk} s_{jk}$, where the $s_{jk}$ satisfy (1.5); and the expressions (1.4) and (1.6) are just the expressions (1.1) and (1.3) of the decomposed problem rewritten in terms of the $s_{jk}$." As mentioned above, this transformation yields a problem with only $m + n$ constraints, the $m$ tie-in constraints (1.4), and the set of $n$ single constraints (1.5). However, the number of variables has been increased to the total of all extreme points of the convex polyhedral sets $S_j$, an extremely large number. The saving element of the decomposition principle is that we need only consider a small number of this total and we only need the explicit representation of those to be considered, and only when required.

² From the design of present computer procedures for solving linear programming problems, we note that the number of constraints is usually the restrictive element.

The decomposition algorithm embodies the basic elements of the pricing vector and basis transformation of the simplex method. We will outline this algorithm using the notation of Dantzig and Wolfe. We assume that we are given an initial basis of the extremal problem, which consists of columns of the form $(P_{jk}; 0, \ldots, 1, \ldots, 0)$, along with its corresponding pricing vector. This vector is written as $(\pi, \bar{\pi})$, where the $m$-vector $\pi$ is associated with the $m$ constraints (1.4) and $\bar{\pi}$ with the $n$ constraints (1.5); i.e., for the vectors in this basis we have $\pi P_{jk} + \bar{\pi}_j = c_{jk}$. In order to determine if we have developed an extreme solution to the original problem, we have to solve, for each $j$, the related subproblems of: minimizing $(c_j - \pi A_j)x_j$ subject to

$$B_j x_j = b_j, \qquad x_j \ge 0.$$

Let $\bar{x}_j$ be such a solution for each $j$ (i.e., an extreme point of $S_j$) and let $j_0$ be the one for which

$$\delta = (c_{j_0} - \pi A_{j_0})\bar{x}_{j_0} - \bar{\pi}_{j_0} = \min_j \left[(c_j - \pi A_j)\bar{x}_j - \bar{\pi}_j\right].$$

If $\delta \ge 0$, the algorithm terminates and the set of given $s_{jk}$ solves the extremal problem, and the vectors

$$x_j = \sum_k x_{jk} s_{jk}, \qquad j = 1, \ldots, n$$

solve the original problem. If $\delta < 0$, we form the new column and its associated cost for the extremal problem, i.e., $(A_{j_0}\bar{x}_{j_0}; 0, \ldots, 1, \ldots, 0)$ and $c_{j_0}\bar{x}_{j_0}$, and introduce this column into the basis as in the regular simplex method. The reader is referred to the original paper for discussions on initiating the process, variations of the process, and a numerical example.

The fact that any linear programming problem can be decomposed in a manner which best suits its special form is the basis of a unique application of the decomposition principle to the transportation problem. Here we cite the work of Williams [223]. This special adaptation is best applied to a transportation problem which has few origins and many destinations. (In particular, Williams is concerned with problems that have 20 origins and 3500 destinations.) The basic statement of the transportation problem is to find the set of $x_{ij} \ge 0$ such that
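$$\sum_j x_{ij} = a_i \qquad i = 1, 2, \ldots, m \tag{1.7}$$

$$\sum_i x_{ij} = b_j \qquad j = 1, 2, \ldots, n \tag{1.8}$$

$$\sum_i \sum_j c_{ij} x_{ij} \ \text{is a minimum.} \tag{1.9}$$

Here we have $n > m$ and $\sum a_i = \sum b_j = T$.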
It is apparent that the sets of equations (1.7) and (1.8) can be naturally divided in a manner similar to the decomposed problem such that (1.7) corresponds to (1.1) and (1.8) corresponds to (1.2). To develop the associated extremal problem let us define the following: Let $\{x_{ijk}\}$ be the set of all basic solutions to (1.8). Then every feasible solution to (1.8) can be written as

$$x_{ij} = \sum_k s_k x_{ijk}, \qquad \text{where } s_k \ge 0, \ \sum_k s_k = 1. \tag{1.10}$$

We note that any feasible solution to (1.7) can be given by (1.10) and the value of the objective function is given by $\sum_i \sum_j \sum_k c_{ij} s_k x_{ijk}$. We also note that in order that the $x_{ij}$ satisfy (1.7) as well as (1.8), we require that the $s_k$ be solutions of

$$\sum_j \sum_k s_k x_{ijk} = a_i.$$

We then define the $k$th supply plan by $P_{ik} = \sum_j x_{ijk}$ and its corresponding cost $c_k = \sum_i \sum_j c_{ij} x_{ijk}$. The extremal problem can now be stated as finding the numbers $s_k \ge 0$ which satisfy

$$\sum_k P_{ik} s_k = a_i \qquad i = 1, 2, \ldots, m \tag{1.11}$$

$$\sum_k c_k s_k \ \text{is a minimum.}$$
We do not need to write explicitly the condition that $\sum s_k = 1$. It is implied by (1.11) since $\sum a_i = \sum b_j$ and the $x_{ijk}$ satisfy (1.8). The above extremal problem, which enables us to solve the associated transportation problem, has only $m$ equations and a very large number of variables. This number is equal to the total number of basic solutions to (1.8). As noted in the general discussion of the decomposition principle, we do not need to determine the explicit representation of the $P_{ik}$ and only need to develop the basic solution that is to be introduced into the basis.

As in the general situation, we first need to start with a feasible solution to the extremal problem. We then optimize the related subproblems in order to improve this feasible solution or to determine if it is an optimal solution. For this particular decomposed problem, the subproblems reduce to the simple problem of successive optimization of (1.8) with appropriately modified costs. Williams noted that the optimization of (1.8) can be readily obtained by determining what he calls the natural distribution, i.e., the distribution where each customer is assigned to the source which has the cheapest rate. The modified costs used to determine the natural distribution are the given costs minus numbers which are called source potentials, equilibrant prices, or shadow prices. In the simplex notation, those are the elements $\pi_i$ of the pricing vector $\pi$.

To determine a first starting solution for the decomposition principle, let us note the explicit representation of (1.8). We have:

$$\begin{aligned}
x_{11} + x_{21} + \cdots + x_{m1} &= b_1 \\
x_{12} + x_{22} + \cdots + x_{m2} &= b_2 \\
&\ \,\vdots \\
x_{1n} + x_{2n} + \cdots + x_{mn} &= b_n
\end{aligned} \tag{1.8'}$$

We see that we can immediately determine the following basic feasible solutions to (1.8'):
$$\{x_{1j1} = b_j;\ x_{ij1} = 0 \text{ for } i \ne 1\},\quad \{x_{2j2} = b_j;\ x_{ij2} = 0 \text{ for } i \ne 2\},\quad \ldots,\quad \{x_{mjm} = b_j;\ x_{ijm} = 0 \text{ for } i \ne m\}.$$
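The $k$th solution represents the shipping of all of the resources from the $k$th source. Since $P_{ik} = \sum_j x_{ijk}$, we have for this first feasible solution that

$$P_{ik} = T \ \text{for } i = k, \qquad P_{ik} = 0 \ \text{for } i \ne k. \tag{1.12}$$

The explicit representation of (1.11) is then

$$\begin{aligned}
P_{11}s_1 + P_{12}s_2 + \cdots + P_{1m}s_m + \cdots &= a_1 \\
P_{21}s_1 + P_{22}s_2 + \cdots + P_{2m}s_m + \cdots &= a_2 \\
&\ \,\vdots \\
P_{m1}s_1 + P_{m2}s_2 + \cdots + P_{mm}s_m + \cdots &= a_m
\end{aligned} \tag{1.13}$$

By (1.12) this reduces, for the first $m$ plans, to

$$Ts_1 = a_1, \quad Ts_2 = a_2, \quad \ldots, \quad Ts_m = a_m.$$

A feasible solution to (1.13) is just

$$s_k = \frac{1}{T}a_k \quad (k = 1, 2, \ldots, m), \qquad s_k = 0 \quad (k > m).$$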
The inverse of the matrix associated with (1.13) is a diagonal matrix whose nonzero elements are all equal to $1/T$. The cost of each plan is given by $c_k$ and the pricing vector, $\pi = (\pi_1, \pi_2, \ldots, \pi_m)$, is equal to the product of the row vector $(c_1, c_2, \ldots, c_m)$ and the inverse matrix. To determine if the present feasible solution can be improved, we need to determine if any of the quantities

$$\Delta_k = \sum_i \pi_i P_{ik} - c_k \tag{1.14}$$

is greater than zero, i.e., for those supply plans $P_{ik}$ not in the basis, which ones maximize $\Delta_k > 0$. In terms of basic solutions of (1.8) we have

$$\Delta_k = \sum_i \sum_j (\pi_i - c_{ij}) x_{ijk}. \tag{1.15}$$

Hence we must select the basic solution to (1.8) for which $\sum_i \sum_j (c_{ij} - \pi_i) x_{ijk}$ is a minimum. And, as Williams points out, this is just the natural distribution under the costs $c_{ij} - \pi_i$ and is easily calculated. We need only calculate the associated supply plans and cost and introduce this basic solution in the basis as in the simplex method. The cost now is transformed to yield new $\pi_i$ and the process is repeated until all $\Delta_k \le 0$. The set of shadow prices $\pi_i$ for the final basis enables us to determine the optimum solution to the original problem. This is based on the fact that a constant can be added to or subtracted from each element in a row of the cost matrix without changing the problem (the final cost is modified, of course) and on the theorem: For the transportation problem there always exists a set of numbers $\pi_i$, one for each source, such that there is a natural distribution under the costs $c_{ij} - \pi_i$, and which is the desired solution to the original transportation problem.

This procedure has been coded with favorable results for the IBM 704 computer by personnel at the Socony Mobil Oil Company Electric Computer Center in New York City. Details of the code were not available at the time of this writing. In his paper, Williams goes on to discuss generalizations of the procedure such as the addition of linear constraints and generalizing the transportation problem.
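The pricing step of this adaptation can be sketched in a few lines of Python. The fragment below is only an illustration under stated assumptions: the cost matrix, supplies, and demands are hypothetical, the function names are ours, and only the first pricing step from the starting basis (each plan shipping everything from one source) is carried out, without the subsequent basis change.

```python
from fractions import Fraction as F

def natural_distribution(cost, prices, demand):
    """Assign each destination j wholly to the source i that minimizes
    cost[i][j] - prices[i]; ties go to the lowest-numbered source."""
    m, n = len(cost), len(demand)
    x = [[0] * n for _ in range(m)]
    for j in range(n):
        cheapest = min(range(m), key=lambda i: cost[i][j] - prices[i])
        x[cheapest][j] = demand[j]
    return x

def supply_plan(x):
    """P_i = sum_j x_ij, the amount each source ships under the plan."""
    return [sum(row) for row in x]

def plan_cost(cost, x):
    """c_k = sum_i sum_j c_ij x_ij for the given distribution."""
    return sum(c * v for crow, xrow in zip(cost, x) for c, v in zip(crow, xrow))

# Hypothetical data: m = 2 sources, n = 3 destinations.
cost = [[4, 6, 9],
        [5, 3, 7]]
supply = [5, 7]        # a_i
demand = [4, 3, 5]     # b_j
T = sum(demand)
assert sum(supply) == T

# Starting basis: plan k ships all demand from source k, so its cost is
# c_k = sum_j c_kj b_j, and the basis matrix is T times the identity.
start_costs = [sum(c * d for c, d in zip(row, demand)) for row in cost]
s = [F(a, T) for a in supply]                # s_k = a_k / T
prices = [F(ck, T) for ck in start_costs]    # pi = (c_1, ..., c_m) * (1/T) I

# Price out the natural distribution under the costs c_ij - pi_i.
x = natural_distribution(cost, prices, demand)
P = supply_plan(x)
delta = sum(p * q for p, q in zip(prices, P)) - plan_cost(cost, x)   # (1.14)

print(P, delta)
```

A positive `delta` indicates that the supply plan of the natural distribution would be brought into the basis; a complete implementation would then update the basis inverse and the prices and repeat until no plan prices out positively.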
2. Integer Linear Programming
Since its introduction as a tool of applied mathematics, the outstanding computational problem of linear programming has been that of finding the optimum integer solution to a linear program. The need for this type of computational procedure is emphasized by the great number of problems from the realm of combinatorial analysis and the areas of scheduling and production which have been formulated as linear programming problems. As we look through the general literature on linear programming we find many references to problems stated in terms of an integer linear program. Some of these were originally formulated a number of years ago, while others are of recent vintage. Many are stated in Dantzig [34, 40]. The list includes the fixed charge problem, Hirsch and Dantzig [98], Croes [31]; the traveling salesman problem, Dantzig, et al. [45, 46], formulated as a multiple-trip integer program by Tucker [201]; machine scheduling, Wagner [215], Manne [151]. Other appropriate discussions include Dantzig [35, 36], Gross [93, 94], and Wagner and Whitin [217].

The above selections indicate that there have been many important applications waiting for the right computational procedures. It now appears that recent work of Gomory [88, 89, 90] and Beale [10] has opened the way to practical solution for a number of such problems. Since, at this writing, more work has been done on applying the all-integer procedure of Gomory, we will only describe that algorithm.

We should note that the success of the transportation model with integer availabilities and demands is partially due to the fact that each basic solution corresponds to an extreme point which has integer or zero values for the coordinates. This is due to the triangular matrix of 0's and 1's associated with the basis. This is an obvious condition for this formulation and we might ask what similar condition is sufficient for the general linear programming model to have an integer solution. For this answer we need only to refer to the basic transformation of the simplex procedure, the elimination transformations. If $x_{ij}$ represents the element in the $i$th row and $j$th column of the simplex tableau and $x_{lk}$ is the pivot element, then the transformed elements are given by:

$$x'_{ij} = x_{ij} - \frac{x_{lj}\,x_{ik}}{x_{lk}} \quad \text{for } i \ne l, \qquad x'_{lj} = \frac{x_{lj}}{x_{lk}}. \tag{2.1}$$
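The effect of the pivot element on integrality is easy to see numerically. The following is a minimal sketch of the elimination transformation (2.1) in exact rational arithmetic; the small tableau and the function names are hypothetical and serve only to contrast a pivot on an element equal to 1 with a pivot on an element equal to 2.

```python
from fractions import Fraction as F

def pivot(tableau, l, k):
    """Apply the elimination transformation (2.1) with pivot element
    tableau[l][k]; returns a new tableau of Fractions."""
    p = F(tableau[l][k])
    if p == 0:
        raise ValueError("pivot element must be nonzero")
    result = []
    for i, row in enumerate(tableau):
        if i == l:
            # pivot row: x'_lj = x_lj / x_lk
            result.append([F(v) / p for v in row])
        else:
            # other rows: x'_ij = x_ij - x_lj * x_ik / x_lk
            factor = F(row[k]) / p
            result.append([F(v) - factor * t for v, t in zip(row, tableau[l])])
    return result

def all_integer(tableau):
    """True when every entry of the tableau is an integer."""
    return all(F(v).denominator == 1 for row in tableau for v in row)

# Hypothetical all-integer starting tableau.
tableau = [[2, 1, 4],
           [3, 5, 7]]

unit_pivot = pivot(tableau, 0, 1)    # pivot element is 1
other_pivot = pivot(tableau, 0, 0)   # pivot element is 2

print(all_integer(unit_pivot), all_integer(other_pivot))
```

With the unit pivot every fraction in (2.1) has denominator 1, so the tableau stays all-integer; with the pivot element 2 fractional entries appear.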
For the purpose of this discussion we will assume that the starting tableau is all integer. We see that a (very strong) sufficient condition for the successive solutions to be integer is to have each fraction in (2.1) reduce to an integer. One unusual way for this to happen is to have each pivot element equal to unity. This is, of course, what happens for the transportation problem. This property is equivalent to the selection of a basis whose associated matrix has a determinant equal to 1. As we shall see below, this is one factor in the computational scheme developed and applied by Gomory in what is termed an all-integer programming algorithm.

The geometry of the situation can be pictured by considering the solution to the following problem: maximize $3x_1 + x_2$ subject to

$$\begin{aligned}
x_1 + 2x_2 &\le 8 \\
3x_1 - 4x_2 &\le 12 \\
x_1 &\ge 0 \\
x_2 &\ge 0.
\end{aligned}$$
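For a problem this small, both the continuous optimum and the best integer point can be found by direct enumeration, which gives a concrete picture of the gap between them. The sketch below is ours, not part of any algorithm discussed here: it intersects the constraint lines in pairs to list the extreme points, and it scans a small grid for feasible integer points. Such brute force is practical only for tiny examples.

```python
from fractions import Fraction as F
from itertools import combinations

# Each constraint is (a1, a2, rhs), meaning a1*x1 + a2*x2 <= rhs.
# The last two rows encode x1 >= 0 and x2 >= 0.
constraints = [(1, 2, 8), (3, -4, 12), (-1, 0, 0), (0, -1, 0)]

def feasible(x1, x2):
    return all(a1 * x1 + a2 * x2 <= rhs for a1, a2, rhs in constraints)

def value(x1, x2):
    """Objective function 3*x1 + x2."""
    return 3 * x1 + x2

def extreme_points():
    """Feasible intersections of pairs of constraint lines (Cramer's rule)."""
    points = []
    for (a1, a2, r), (b1, b2, s) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel lines, no intersection
        x1 = F(r * b2 - a2 * s, det)
        x2 = F(a1 * s - r * b1, det)
        if feasible(x1, x2):
            points.append((x1, x2))
    return points

# Continuous optimum: the best extreme point of the convex set.
lp_best = max(extreme_points(), key=lambda p: value(*p))

# Integer optimum: scan a grid that covers the bounded feasible region.
integer_points = [(i, j) for i in range(9) for j in range(5) if feasible(i, j)]
int_best = max(integer_points, key=lambda p: value(*p))

print(lp_best, value(*lp_best), int_best, value(*int_best))
```

The best integer point is not an extreme point of the original convex set, which is why additional constraints are needed before the simplex method can locate it.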
The optimum solution is given by $x_1 = 28/5$ and $x_2 = 6/5$ with maximum value equal to 18. This problem is pictured in Fig. 2. The convex set of solutions is bounded by the heavy lines and, as we have indicated with heavy dots, this region also includes a number of integer points. The problem is to determine which of these points maximizes the objective function.

FIG. 2. Determination of point which maximizes the objective function.

Since the basic computational scheme to be employed is the simplex method we should review the geometry of this technique. The simplex method starts at an extreme point of the convex set of solutions. Each iteration, that is the selection of a new basis, determines a new extreme point which is a neighbor to the old point (has a boundary segment in common). This process continues and stops after a finite number of steps at the global maximum. We see that in order to have an optimum integer solution we must also have an associated extreme point which is integer valued. For the problem in Fig. 2, the simplex process would probably start at the extreme point (0, 0), move to (4, 0) and finally to (28/5, 6/5). It is impossible to indicate what integer point in the solution space optimizes the objective function.

The question that now arises considers the possibility of changing the convex set of solutions in such a manner as to make the appropriate feasible integer point an extreme point of the new convex region. For example, in Fig. 3 we have introduced two arbitrary constraints to the problem which do the trick. This technique of putting in cutting hyperplanes was used in the past for specific formulations. How to introduce these cuts systematically for any problem is the essence of Gomory's procedure.

FIG. 3. The technique of using cutting hyperplanes.

Gomory's initial approach required solving the problem by the basic simplex algorithm. If the optimum solution is not integer he introduces, one at a time, a set of linear inequalities which preserve the optimality state of the primal problem but not the feasibility conditions. One application of the dual algorithm yields an optimum answer. If this is integer-valued, the process stops. If not, a new constraint is introduced. The convergence of this early method was established, and the procedure reported, in Gomory [89]. It has been superseded by the more efficient procedure of Gomory [90a]; it is this algorithm which we shall discuss. We shall use the notation of the original paper.

Here we assume for starting conditions that the simplex tableau is all integer-valued and that we start with a feasible solution for the dual simplex algorithm. The problem can then be written as: maximize

$$z = a_{00} + a_{01}(-t_1) + a_{02}(-t_2) + \cdots + a_{0n}(-t_n) \tag{2.2}$$

subject to

$$x_i = a_{i0} + a_{i1}(-t_1) + a_{i2}(-t_2) + \cdots + a_{in}(-t_n), \qquad i = 1, \ldots, m \tag{2.3}$$

$$t_j = -1(-t_j), \qquad j = 1, \ldots, n \tag{2.4}$$

where the variables of the basis are denoted by $x_i$ and the variables not in the current basic solution are denoted by $t_j$. All variables are restricted to be nonnegative. The $a_{ij}$ are the elements of the simplex tableau, $a_{i0}$ is the value of the $i$th basic variable, $a_{00}$ is the value of the objective function, and $a_{0j}$ are the shadow prices for this basis. Since we are maximizing and assuming a dual feasible solution, all $a_{0j} \ge 0$. Not all $a_{i0}$ are nonnegative as we are assuming that all elements are integers, but we have not determined an optimum integer solution, i.e., a feasible solution to the primal problem has not been obtained.

What we now wish to do is to introduce new variables into the solution in a manner which will enable us to preserve the integer characteristic of the tableau and to move towards the optimum integral solution that is contained in the convex set defined by (2.2), (2.3) and (2.4). This will be accomplished by determining cutting hyperplanes defined as inequalities in terms of the nonbasic variables and a new slack vector. Introducing the new slack vector into the basis with a pivot element³ of $-1$ will preserve the integer tableau and move the solution point closer to the optimum integer point. A finite number of applications of these new constraints leads us to the optimum integer solution.

To determine the form of this cutting constraint, let us consider any equation (dropping the row notation $i$) from the set (2.3):
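$$x = a_0 + a_1(-t_1) + a_2(-t_2) + \cdots + a_n(-t_n)$$
or
$$0 = a_0 + a_1(-t_1) + a_2(-t_2) + \cdots + a_n(-t_n) + 1(-x). \tag{2.5}$$
We next rewrite each coefficient of (2.5) as a multiple of an integer and a remainder, i.e., in the form $b_j\lambda + r_j$, where $b_j$ is an integer, $r_j$ is a remainder, and $\lambda$ is an unspecified positive number to be determined. These elements can then be expressed as
$$a_j = b_j\lambda + r_j = [a_j/\lambda]\lambda + r_j, \quad 0 \le r_j < \lambda, \quad j = 0, 1, \ldots, n,$$
$$1 = [1/\lambda]\lambda + r, \quad 0 \le r < \lambda, \tag{2.6}$$
where square brackets indicate “integer part of.” If $a_j/\lambda < 0$, then $[a_j/\lambda] = b_j < 0$ such that $b_j\lambda + r_j = a_j$. Substituting (2.6) in (2.5) and gathering appropriate terms we have
$$\sum_{j=1}^{n} r_j t_j + r x = r_0 + \lambda\left\{[a_0/\lambda] + \sum_{j=1}^{n}[a_j/\lambda](-t_j) + [1/\lambda](-x)\right\}. \tag{2.7}$$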
We note that any nonnegative integer values of the variables $x$ and $t_j$ which satisfy the original Eq. (2.5) will also satisfy (2.7). Such a substitution will also make the left-hand side of (2.7) a nonnegative number.

³ We select −1 as a pivot element instead of +1 as we are using the dual algorithm to change solutions.
We must now center our attention on the quantity in the curly bracket, that is, the expression denoted by $s$,
$$s = [a_0/\lambda] + \sum_{j=1}^{n}[a_j/\lambda](-t_j) + [1/\lambda](-x). \tag{2.8}$$
For this expression we note that $s$ must not only be an integer, but must also be nonnegative. The first part is clear since all variables and the bracketed quantities are integers. The second part is established by noting that $r_0 < \lambda$: if $s$ were negative, then $s \le -1$ and the right-hand side of (2.7), $r_0 + \lambda s$, would be negative, which contradicts the statement that the left-hand side is nonnegative. This nonnegative integer $s$ is what Gomory introduces as a new variable of the problem. For the purpose at hand, the efficient development of a cutting constraint, we consider the situation where we restrict $\lambda > 1$. Then $[1/\lambda] = 0$, and the expression for $s$ becomes
$$s = [a_0/\lambda] + [a_1/\lambda](-t_1) + [a_2/\lambda](-t_2) + \cdots + [a_n/\lambda](-t_n)$$
or
$$s = b_0 + b_1(-t_1) + b_2(-t_2) + \cdots + b_n(-t_n). \tag{2.9}$$
For any $\lambda > 1$, Eq. (2.9) is a constraint which must be satisfied by any integer solution to the original linear programming problem. After a suitable selection of $\lambda$, Eq. (2.9) can be used as a cutting constraint and added to the original constraints (2.3), (2.4). By a discussion based on the rules for applying the dual simplex algorithm, Gomory shows that the requirement for choosing $\lambda$ is that it should produce a pivot of −1 and that the $\lambda$ used should be as small as possible. The rule suggested by Gomory for selecting $\lambda$ is as follows:
1. The pivot column $k$ is the one with the smallest $a_{0j}$, $j = 1, 2, \ldots, n$.
2. For each $a_{0j}$, find the largest integer $m_j$ such that $(1/m_j)a_{0j} \ge a_{0k}$.
3. Set $\lambda_j = -a_j/m_j$.
4. $\lambda_{\min} = \max_j \lambda_j$, $j = 1, 2, \ldots, n$.

With this decision rule for selecting $\lambda$ the integer algorithm is as follows: “Assume an all-integer starting matrix which has an explicit solution for the dual problem. We choose a row having a negative constant term $a_{i0}$ (if there are none the problem has been solved); for this row choose $\lambda_{\min}$ and the pivot column by the four steps given above. We locate a new row (2.9) using this $\lambda_{\min}$ and adjoin this to the bottom of problem 2.2, 2.3, 2.4. We now perform Gaussian elimination on the new row, i.e., we introduce $s$ as a new nonbasic variable, we drop the new row and then repeat the process. Because this pivot element is −1, the matrix remains in integers.”
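The four-step rule and the cut (2.9) can be illustrated with a short Python sketch. This is not Gomory's code: the function names and the small numerical row are hypothetical, and the sketch makes two assumptions that the summary above leaves implicit, namely that only columns with a negative entry $a_j$ in the chosen row are candidates for the pivot column, and that the smallest candidate $a_{0k}$ is strictly positive.

```python
from fractions import Fraction
from math import floor


def choose_lambda(obj_row, src_row):
    """Return (k, lam) for the chosen row, following the four-step rule.

    obj_row holds a_01..a_0n (nonnegative integers, dual feasible);
    src_row holds a_1..a_n of the chosen row (integers).
    Assumptions of this sketch: only columns with a_j < 0 are candidates,
    and the smallest candidate a_0k is strictly positive.
    """
    candidates = [j for j, a in enumerate(src_row) if a < 0]
    if not candidates:
        raise ValueError("no negative entry in the chosen row")
    k = min(candidates, key=lambda j: obj_row[j])        # step 1
    if obj_row[k] <= 0:
        raise ValueError("this sketch assumes a_0k > 0")
    lam = Fraction(0)
    for j in candidates:
        m_j = obj_row[j] // obj_row[k]                   # step 2
        lam = max(lam, Fraction(-src_row[j], m_j))       # steps 3 and 4
    return k, lam


def cut_row(a0, src_row, lam):
    """Coefficients b_0 and b_1..b_n of the cut (2.9): b_j = [a_j / lam]."""
    b0 = floor(Fraction(a0) / lam)
    return b0, [floor(Fraction(a) / lam) for a in src_row]


# Hypothetical row x = -5 + (-3)(-t1) + 2(-t2) + (-4)(-t3),
# with objective-row entries a_01, a_02, a_03 = 2, 1, 6.
k, lam = choose_lambda([2, 1, 6], [-3, 2, -4])
b0, b = cut_row(-5, [-3, 2, -4], lam)
print(k, lam, b0, b)   # the entry b[k] of the new row is the pivot, -1
```

Exact rational arithmetic is used so that the integer parts are not disturbed by rounding when $\lambda$ is a fraction.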
Gomory proves that the process is finite and in so doing develops alternative rules for the selection of a row with a negative constant term required by the algorithm. One of the open areas in this procedure is the development of a selection rule which will cause rapid convergence for all problems. We suggest that readers review the original work before venturing into the procedure. This is emphasized by the erratic behavior of some of the rules that have been tested. On programs coded for the IBM 704 at the IBM Research Center, Yorktown Heights, N.Y., four 15-variable, 15-inequality problems were tried which differed only in that the constant terms were increased by 2 in going from one problem to the next. The first three were solved in 17, 21, and 23 pivot steps. The fourth ran over 400 steps before the computation was stopped. Similar results occurred for a set of 32 × 32 problems. See the section on program write-ups for further details.

Another computer code was developed by Zemlin of Standard Oil of California for a 32K IBM 704. This is a FORTRAN code which can handle problems up to 70 × 200. The procedure is the one by Gomory described above. In a communication from Zemlin (May 1960), he states that “Our experience with the codes has been limited to reasonably small problems (at most 70 variables). We do not consider the code a production code in view of the extreme unpredictability of the algorithm itself. About half of the problems we have attempted have been solved quite satisfactorily and the other half have required an intolerably large number of iterations to converge. . . . There seems to be no reliable method of estimating in advance whether a given problem will behave properly.”

Zemlin’s remarks are a fitting conclusion to this section. They indicate that a possible breakthrough is near, and yet more interesting work is still ahead.
3. The Multiplex Method
We start with an explanation of the various steps involved in the multiplex method. Some reflection will show that the ideas are basically those used in gradient methods with an elaborate method of changing vectors (one or more at a time). We shall adhere to the notation and terms used by Frisch [73].
1. One starts out by reducing the inequalities to equalities by introducing slack variables, and instead of minimizing the objective function subject to the constraints, maximizes its negation.
2. Then one selects a set of basis variables whose number equals the number of unknowns of the initial problem given with the inequalities.
The basis variables may include slack variables. (The reader should note that here the term basis has a different meaning than the one used when describing the simplex method.)
3. The remaining variables are then expressed in terms of the basis variables using the equations of the first step. Note that there are exactly as many equations as there are nonbasis variables. This gives expressions of the form $x_j = b_{j0} + \sum_k b_{jk}x_k$, where $k$ ranges over the basis variables.
4. The expressions for the nonbasis variables are substituted in the objective function to replace the nonbasis variables. This yields an objective function expressed completely in terms of basis variables, which is now denoted by $f = \sum_k f_k x_k$, where the summation is extended over the subscripts of the basis variables.
5. For a starting point $x^0$, nonnegative values are assigned to the basis variables, from which initial nonnegative values for the nonbasis variables are obtained using the equations giving the relation among them. The objective function is also evaluated at this point.
6. Sometimes lower and upper bounds, i.e., $\underline{x}_j$ and $\bar{x}_j$, are given as constraints on the values of the variables. When not prescribed, as in the example given below, they are respectively fixed at zero and infinity.
7. Now, one evaluates the expression defining directions:
$$d_j = \sum_k f_k b_{jk}.$$
8. In this step one performs what is called breakout analysis, i.e., one computes for all the variables:
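$$\lambda_j = \begin{cases} \dfrac{\bar{x}_j - x_j^0}{d_j} & \text{if } d_j > 0, \\[1ex] \dfrac{x_j^0 - \underline{x}_j}{-d_j} & \text{if } d_j < 0. \end{cases}$$
9. Having computed a column of values of $\lambda_j$, one marks in a column beside it, called “at bounds,” that value of $\lambda_j$ which is minimum. If one denotes its subscript by $p$, then:
$$\min_j \lambda_j = \lambda_p.$$
10. The new point $x^1$ is obtained from the relation
$$x_j^1 = x_j^0 + \lambda_p d_j, \quad j \ne p; \qquad x_p^1 = \begin{cases} \bar{x}_p & \text{if } d_p > 0, \\ \underline{x}_p & \text{if } d_p < 0. \end{cases}$$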
This completes the first round.
11. The second round begins by forming an “operation set.” The
variable of this set has the same subscript as $\min_j \lambda_j$. Note that there may be more than one $\lambda_j$ which gives the minimum.
12. A new set of directions $d_j$ must now be computed. To do this, one performs the following “regression analysis.” Suppose $p, q, r$ are the subscripts of the operation set of variables and let
$$M_{ij} = \sum_k b_{ik}b_{jk}, \qquad M_{j0} = \sum_k f_k b_{jk};$$
it is desired to determine the $B$’s in
$$M_{p0} + B_p M_{pp} + B_q M_{pq} + \cdots = 0$$
$$M_{q0} + B_p M_{qp} + B_q M_{qq} + \cdots = 0$$
$$\cdots$$
the set of all equations pertinent to the subscripts of the “operation set.” Before computing the $d_j$ one tests the operation set variables at the value chosen (which from the definition must be at a lower or at an upper bound). A coefficient $B_k$ corresponding to a variable $x_k$ in the operation set is sign-correct if, when $x_k$ is at its lower bound, $B_k$ is nonnegative. If $x_k$ is at its upper bound, $B_k$ is sign-correct if it is nonpositive. If all coefficients corresponding to the operation set are sign-correct, one computes $d_j$ for round two defined by:
$$d_j = M_{j0} + B_p M_{pj} + B_q M_{qj} + \cdots$$
where the sum is taken to include all subscripts in the operation set. Note, for example, that for $j = p$, $d_p = 0$. If some variables have sign-incorrect coefficients they are ignored, and the analysis proceeds with moments computed for variables with sign-correct coefficients only (i.e., the set of simultaneous equations considered is now smaller by the number of variables ignored and the regression coefficients are determined from these). A direction is inadmissible if its corresponding variable is at its upper bound and its $d_j$ is positive, or its corresponding variable is at its lower bound and its $d_j$ is negative. One adjoins to the operation set at any stage (perhaps at the start) all those variables which are inadmissible. It is advisable to bring these variables in, one by one, each time testing for sign correctness. With new $d_j$’s computed, one repeats the breakout analysis, selecting a minimum $\lambda_j$ and each time computing the excess in the objective function $d_0^{\alpha}$, where $\alpha$ refers to the number of the round and the subscript designates that $d_j$ has been computed for the objective function. The analysis is finished when $d_0^{\alpha} = 0$ and several variables are at their bounds (their number equals the number of basis variables). All variables at their bounds must form an operation set whose coefficients are sign-correct.
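The optimal solution is given by the final “new point” for which the above is satisfied. Consider the linear programming problem:
$$x_1 + 4x_2 + 2x_3 \ge 5$$
$$3x_1 + x_2 + 2x_3 \ge 4$$
$$2x_1 + 9x_2 + x_3 = \min$$
$$x_i \ge 0 \quad (i = 1, 2, 3).$$
This gives:
$$x_4 = -5 + x_1 + 4x_2 + 2x_3$$
$$x_5 = -4 + 3x_1 + x_2 + 2x_3$$
$$f = -2x_1 - 9x_2 - x_3 = \max.$$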
Applying the foregoing ideas to our example one obtains the iterations which are elucidated in Table I. X denotes an operation set variable and “dot” denotes a sign-correct variable; W is used for a sign-incorrect variable; sk is used for a skipped “ignored” variable (none encountered in this example). The first column of the initial point comprises a feasible solution. One selects three variables $x_1$, $x_2$, $x_3$ and assigns to them values in the feasible region. The objective function involves $x_1$, $x_2$, and $x_3$, which have been chosen as the basis variables, and hence its coefficients $f_k$ remain unaltered. If, for example, $x_4$ had been used in the basis instead of $x_2$, the objective function would have in it $x_1$, $x_3$, and $x_4$, having substituted for $x_2$ its expression in terms of $x_1$, $x_3$, and $x_4$. Continuing with our problem we note that $p = 4$, and hence $d_j = M_{j0} + B_4 M_{4j}$. Now the moment equation gives $M_{40} + M_{44}B_4 = 0$; i.e., $-40 + 21B_4 = 0$ and consequently $B_4 = 1.904762$. Note that
$$M_{40} = f_1 b_{41} + f_2 b_{42} + f_3 b_{43} = -2 - 36 - 2 = -40,$$
$$M_{44} = b_{41}b_{14} + b_{42}b_{24} + b_{43}b_{34} = 1 + 16 + 4 = 21,$$
observing that $b_{jk} = b_{kj}$. In computing $\lambda_j$, note that $\lambda_4$ is not computed any more because its $d_j = 0$. The variable $x_4$ remains in the operation set in the second round. Also there is no particular need to compute those $\lambda_j$ which are very large. One has again the “At bounds” column marking $\min \lambda_j$. The new point $x_j^2$ is now computed and also the operation set marked.
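The first two rounds of the example can be retraced with a short Python sketch. It is only an illustration under stated assumptions: the names `dot`, `breakout`, `new_point`, `pt0`, and so on are hypothetical, the rows of `B` are the coefficients $b_{jk}$ read off the two constraints (with unit rows for the basis variables), all lower bounds are zero, and all upper bounds are infinite, so a variable with $d_j > 0$ never reaches a bound.

```python
from fractions import Fraction as Fr

# f = -2*x1 - 9*x2 - x3 is maximized; x4 and x5 are the slack variables.
f = [Fr(-2), Fr(-9), Fr(-1)]
# Row j holds (b_j1, b_j2, b_j3); the basis variables x1..x3 get unit rows.
B = {1: [1, 0, 0], 2: [0, 1, 0], 3: [0, 0, 1], 4: [1, 4, 2], 5: [3, 1, 2]}
pt0 = {1: Fr(3), 2: Fr(1, 2), 3: Fr(1, 2), 4: Fr(1), 5: Fr(13, 2)}


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def breakout(x, d):
    """Smallest step to a lower bound of zero, over variables with d_j < 0."""
    lam = {j: x[j] / -d[j] for j in x if d[j] < 0}
    p = min(lam, key=lam.get)
    return p, lam[p]


def new_point(x, d, p, lam_p):
    nxt = {j: x[j] + lam_p * d[j] for j in x}
    nxt[p] = Fr(0)          # the breakout variable sits exactly on its bound
    return nxt


# Round 1: d_j = sum_k f_k b_jk.
d1 = {j: dot(f, B[j]) for j in B}
p1, lam1 = breakout(pt0, d1)
pt1 = new_point(pt0, d1, p1, lam1)

# Round 2: operation set {p1}; solve M_p0 + M_pp * B_p = 0 for B_p.
B_p = -dot(f, B[p1]) / dot(B[p1], B[p1])
d2 = {j: dot(f, B[j]) + B_p * dot(B[p1], B[j]) for j in B}
p2, lam2 = breakout(pt1, d2)
pt2 = new_point(pt1, d2, p2, lam2)

print(p1, float(lam1), float(B_p), p2, float(lam2))
print({j: round(float(v), 6) for j, v in pt2.items()})
```

Rational arithmetic keeps $d_4$ exactly zero in the second round; with floating-point numbers a tiny nonzero $d_4$ could make the variable already at its bound win the breakout test with a step of zero.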
TABLE I. EXAMPLE OF MULTIPLEX METHOD

Lower bounds are 0 and upper bounds are ∞ for all variables.

Round 1
        Starting point x_j^0   d_j^1 = Σ f_k b_jk   Breakout analysis λ_j   At bounds
x_1     3.0                    -2                   1.5
x_2     0.5                    -9                   0.055556
x_3     0.5                    -1                   0.5
x_4     1.0                    -40                  0.025                   X
x_5     6.5                    -17                  0.382
f       -11

Round 2
        New point x_j^1        d_j^2                λ_j                     At bounds
x_1     2.950                  -0.095238            30.975031
x_2     0.275                  -1.380952            0.199138                X
x_3     0.475                  2.809524             Large
x_4     0                      0                                            X
x_5     6.075                  3.952382             Large
f       -8.850

Round 3
        New point x_j^2        d_j^3                λ_j                     At bounds
x_1     2.931034               -1.2                 2.442528                X
x_2     0                      0                                            X
x_3     1.034483               0.6                  Large
x_4     0                      0                                            X
x_5     6.862069               -2.4                 …
f       -6.9

New point x_j^3: x_1 = 0, x_2 = 0, x_3 = 2.5, x_4 = 0, x_5 = …; f = -2.5.

The moment equations are
$$M_{40} + M_{44}B_4 + M_{42}B_2 = 0$$
$$M_{20} + M_{24}B_4 + M_{22}B_2 = 0,$$
i.e.,
$$-40 + 21B_4 + 4B_2 = 0$$
$$-9 + 4B_4 + B_2 = 0,$$
so that $B_4 = 0.8$ and $B_2 = 5.8$. Note that, in the “At bounds” column, all three operation set variables are at their lower bounds (zero values) and are sign-correct. The number of basis variables is three and hence conditions for a solution are satisfied; also $d_0^3 = 0$. Therefore one has the desired solution given by the $x_j^3$.
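The regression step for a larger operation set amounts to solving a small linear system. The sketch below is a hypothetical helper, not Frisch's procedure as coded anywhere: it solves the moment equations by Gauss-Jordan elimination in exact arithmetic, assuming the system is nonsingular and that all bounds are the zero lower bounds of the example. It starts from the point $x^2$, carries out the third round, and then checks that every direction vanishes once $x_1$, $x_2$, and $x_4$ are all in the operation set.

```python
from fractions import Fraction as Fr

f = [Fr(-2), Fr(-9), Fr(-1)]
B = {1: [1, 0, 0], 2: [0, 1, 0], 3: [0, 0, 1], 4: [1, 4, 2], 5: [3, 1, 2]}


def dot(u, v):
    return sum(Fr(a) * b for a, b in zip(u, v))


def regression(op_set):
    """Solve M_i0 + sum_k B_k * M_ik = 0 for all i in op_set."""
    idx = list(op_set)
    n = len(idx)
    # Augmented matrix [M | -M_0]; assumed nonsingular in this sketch.
    A = [[dot(B[i], B[k]) for k in idx] + [-dot(f, B[i])] for i in idx]
    for c in range(n):
        r0 = next(r for r in range(c, n) if A[r][c] != 0)
        A[c], A[r0] = A[r0], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(n):
            if r != c:
                factor = A[r][c]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[c])]
    return {i: A[r][n] for r, i in enumerate(idx)}


def directions(coef):
    """d_j = M_j0 + sum over the operation set of B_k * M_kj."""
    return {j: dot(f, B[j]) + sum(b * dot(B[k], B[j]) for k, b in coef.items())
            for j in B}


# Round 3: operation set {4, 2}, starting from the point x^2.
pt2 = {1: Fr(85, 29), 2: Fr(0), 3: Fr(30, 29), 4: Fr(0), 5: Fr(199, 29)}
coef3 = regression([4, 2])
d3 = directions(coef3)
lam = {j: pt2[j] / -d3[j] for j in pt2 if d3[j] < 0}
p = min(lam, key=lam.get)
pt3 = {j: pt2[j] + lam[p] * d3[j] for j in pt2}

# Round 4: with {4, 2, 1} in the operation set every direction is zero.
coef4 = regression([4, 2, 1])
d4 = directions(coef4)
print({k: float(v) for k, v in coef3.items()}, p, float(lam[p]))
print({j: float(v) for j, v in pt3.items()}, all(v == 0 for v in d4.values()))
```

All three coefficients found in the last step are nonnegative, which matches the sign-correctness condition for variables at their lower bounds.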
4. Gradient Method of Feasible Directions
In this section we shall study recent developments in the problem of optimizing (here we confine our attention to maximization) a function of several variables $f(x)$ subject to the inequality constraints:
$$g_i(x) \le 0 \quad (i = 1, \ldots, m).$$
If, in addition, $x \ge 0$, i.e., $x_j \ge 0$ $(j = 1, \ldots, n)$, then one has the general programming problem defined by Kuhn and Tucker [131a]. However, for many of the purposes of our discussion, one need not be limited to the condition of nonnegativity of the variables. The main object is to develop an algorithm for solving the problem. Most algorithms are variations of a gradient method due to Cauchy. The gradient of the function $f(x)$ is defined by
$$\nabla f = \left(\frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n}\right).$$
It is well known that a necessary condition for $f$ to have a maximum at a boundary point $\bar{x} = (\bar{x}_1, \ldots, \bar{x}_n)$ of the region $R$, defined by the constraints, is that the gradient of $f$ be a nonnegative linear combination of the gradients of the constraints at $\bar{x}$. Thus,
$$\nabla f(\bar{x}) = \sum_{i=1}^{m} \lambda_i \nabla g_i(\bar{x}), \qquad \lambda_i \ge 0.$$
The latter condition is related to the Lagrange multiplier procedure for obtaining the optimum. If the nonnegativity condition on the variables is included, then on multiplication by −1 these conditions acquire the form in which the general constraints are given, and are therefore included with these constraints. One is always faced with the problem of whether an optimum is global or local. In studying solution procedures, to simplify matters, $f(x)$ is assumed to be convex or concave depending on the type of optimum required and the region is assumed convex in order to insure a global optimum and avoid the problems of a local optimum. The equation $f(x) = c$ defines a one-parameter family of surfaces. Each value of $c$ fixes one of the surfaces of the family. When $x = (x_1, x_2)$ is a two-variable vector, one has a set of contours in the plane. Thus, each feasible point must fall on a contour of $f(x)$ and the problem is to move to the contour on which the optimum is located. Since the gradient points in the direction of greatest increase, it is natural to follow the direction of the gradient. Most gradient procedures involve suggesting good, i.e., quick and converging directions, which when successively followed from
one feasible point on one member of the family to another feasible point, on the same member or on another member, lead to an optimum whether local or global. Note that if $\bar{x}$ is a global maximum, then $f(\bar{x}) \ge f(x)$ for any feasible $x$. We next study with Zoutendijk [227, 228] his method of feasible directions. We first consider the case of linear constraints:
$$\sum_{j=1}^{n} a_{ij}x_j \le b_i \quad (i = 1, \ldots, m).$$
In the matrix notation $Ax \le b$ let the matrix $A$ consist of $m$ column vectors $a_i$ where
$$a_i = (a_{i1}, \ldots, a_{in})^T.$$
Then each inequality may be written as
$$y_i(x) = b_i - a_i^T x \ge 0 \quad (i = 1, \ldots, m),$$
where $a_i^T$ is the transpose of $a_i$ and $x$ is regarded here as a column vector rather than a row vector as first considered above. Note that for a feasible point $\bar{x}$, $y_i(\bar{x}) \ge 0$ $(i = 1, \ldots, m)$. Those $i$ for which $\bar{x}$ lies on the boundary of the corresponding hyperplane, i.e., $y_i(\bar{x}) = 0$, are denoted by $(i_1, \ldots, i_k)$. We indicate a new feasible⁴ point by $(\bar{x} + \lambda s)$, where $s$ is a unit vector with initial point $\bar{x}$ and $0 \le \lambda \le \lambda'$ for some $\lambda' > 0$ (since the point is feasible, $\lambda' > 0$ exists). We have pointed out that the gradient locally points in the direction of maximum increase; it is therefore reasonable to determine that $s \equiv (s_1, \ldots, s_n)$ which makes the smallest angle with the gradient at $\bar{x}$ (the best feasible direction). Since we are working locally, we use the condition of footnote 4, with the fact that $s$ is a unit vector, to determine that $s$ which yields the best feasible direction. Thus, we must solve (we use the transpose of the gradient since all vectors are involved as column vectors)
$$\nabla f^T(\bar{x})\,s = \text{maximum}$$
subject to
$$a_i^T s \le 0, \quad i = i_1, \ldots, i_k,$$
$$s^T s = 1.$$

⁴ A necessary condition for the feasibility of $\bar{x} + \lambda s$ is that $a_i^T s \le 0$ $(i = i_1, \ldots, i_k)$. Thus, suppose that for some $i$, say $i_1$, $a_{i_1}^T s > 0$. Then since $y_{i_1}(\bar{x}) = b_{i_1} - a_{i_1}^T\bar{x} = 0$, we have $y_{i_1}(\bar{x} + \lambda s) = b_{i_1} - a_{i_1}^T(\bar{x} + \lambda s) = -a_{i_1}^T\lambda s < 0$. This is a contradiction since $y_i \ge 0$ holds at a feasible point.
Note that if the first constraint holds for no $i$, $s$ must be along the gradient. It is assumed that if the gradient vanishes at $\bar{x}$, then a new feasible starting point in the neighborhood of $\bar{x}$ is chosen at which the gradient does not vanish, otherwise $\bar{x}$ is a local maximum. Since it is desired to increase the objective function moving out of $\bar{x}$ along $s$, it is sufficient to find $s$ such that the scalar product to be maximized is positive. This is insured by replacing the last condition by $s^T s \le 1$. But this is a nonlinear constraint, which may be replaced by the linear normalization $-1 \le s_j \le 1$ $(j = 1, \ldots, n)$, which serves the same purpose. Once the maximizing $s$, i.e., $\bar{s}$, is determined, its value is substituted in the constraints and in the objective function. Now $\lambda'$ is determined from $y_i(\bar{x} + \lambda\bar{s}) \ge 0$. Thus, $\lambda' = \min_i \, (y_i(\bar{x})/a_i^T\bar{s})$ for those $i$ for which $a_i^T\bar{s} > 0$. Then one maximizes $f(\bar{x} + \lambda\bar{s})$ with respect to $\lambda$ subject to $0 \le \lambda \le \lambda'$. If we denote our successive feasible points by $x^k$ $(k = 0, 1, 2, \ldots)$ we have $x^{k+1} = x^k + \lambda^k s^k$. The reader is referred to ref. [227] for a convergence proof. If one has a mixture of linear and nonlinear constraints
$$g_i(x) \le b_i, \quad i = 1, \ldots, m_1,$$
$$a_i^T x \le b_i, \quad i = m_1 + 1, \ldots, m,$$
where $g_i(x)$ are convex for all $i$, then one uses the vector $q_i(x) \equiv (\partial g_i/\partial x_1, \ldots, \partial g_i/\partial x_n)$, which is the outward pointing normal of the hypersurface $g_i(x) = b_i$. Assume also that $a_i^T\bar{x} = b_i$ for $i = i_{k+1}, \ldots, i_p$. A vector $s$ with initial point $\bar{x}$ is now feasible if $q_i^T(\bar{x})s < 0$ for $i = i_1, \ldots, i_k$ (the set of subscripts of those nonlinear constraints for which $g_i(\bar{x}) = b_i$) and $a_i^T s \le 0$ for $i = i_{k+1}, \ldots, i_p$. To find $s$ one introduces a variable $u$ and then finds $s$ satisfying
$$q_i^T(\bar{x})s + u \le 0, \quad i = i_1, \ldots, i_k,$$
$$a_i^T s \le 0, \quad i = i_{k+1}, \ldots, i_p,$$
$$-(\nabla f)^T s + u \le 0,$$
$$s^T s \le 1 \quad \text{or} \quad -1 \le s \le 1,$$
$$u = \text{maximum}.$$
If $u > 0$, then $s$ is feasible (as can be seen from the first, second, and third of the constraints) and the objective function will increase in the direction of $s$. If the solution yields $u = 0$, then it follows that there exist nonnegative quantities $\lambda_0$ and $\lambda_i$ such that
$$\sum_{i=1}^{m} \lambda_i + \lambda_0 = 1, \qquad \lambda_0 \ge 0, \qquad \lambda_i \ge 0 \quad (i = 1, \ldots, m).$$
If $\lambda_0 > 0$, $\nabla f(\bar{x})$ is a nonnegative linear combination of the outward-pointing normals in $\bar{x}$ and hence $\bar{x}$ is a maximum. The treatment for $\lambda_0 = 0$ is excluded due to its infrequent occurrence in practice. As before one maximizes $f(\bar{x} + \lambda s)$ as a function of $\lambda$, etc.
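For the linearly constrained case, the feasibility test of footnote 4 and the step bound $\lambda' = \min_i \, y_i(\bar{x})/a_i^T\bar{s}$ can be sketched in Python as follows. The region, the point, the objective, and the function names are all hypothetical. The direction is taken as the sign vector of the gradient, which is what maximizing $\nabla f^T s$ under the normalization $-1 \le s_j \le 1$ gives when no constraint is active at $\bar{x}$; the general direction-finding problem needs a linear programming code and is not attempted here.

```python
from fractions import Fraction


def slacks(A, b, x):
    """y_i(x) = b_i - a_i^T x for every constraint a_i^T x <= b_i."""
    return [bi - sum(a * xj for a, xj in zip(row, x)) for row, bi in zip(A, b)]


def is_feasible_direction(A, b, x, s):
    """Footnote-4 test: a_i^T s <= 0 on every constraint active at x."""
    return all(
        sum(a * sj for a, sj in zip(row, s)) <= 0
        for row, y in zip(A, slacks(A, b, x))
        if y == 0
    )


def max_step(A, b, x, s):
    """Largest step keeping x + step*s feasible; None if unbounded."""
    best = None
    for row, y in zip(A, slacks(A, b, x)):
        a_s = sum(a * sj for a, sj in zip(row, s))
        if a_s > 0:
            ratio = Fraction(y) / a_s
            best = ratio if best is None else min(best, ratio)
    return best


# Hypothetical region: x1 + x2 <= 4, x1 <= 3, x1 >= 0, x2 >= 0.
A = [[1, 1], [1, 0], [-1, 0], [0, -1]]
b = [4, 3, 0, 0]
x_bar = [1, 1]
grad = [3, 1]                       # gradient of the assumed f(x) = 3*x1 + x2
s = [1 if g > 0 else -1 if g < 0 else 0 for g in grad]

ok = is_feasible_direction(A, b, x_bar, s)
step = max_step(A, b, x_bar, s)
x_next = [xj + step * sj for xj, sj in zip(x_bar, s)]
print(ok, step, x_next)
```

Because the assumed objective is linear and increases along `s`, the line search over $0 \le \lambda \le \lambda'$ ends at $\lambda'$ itself. At the new point the first constraint is active, so the same `s` is no longer feasible and a new direction must be found.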
5. Linear Programming Applications
One is continually amazed by the diverse origins of linear programming applications. This is certainly indicated by the large section on applications in the bibliography by Riley and Gass [179]. In this section we will not attempt to review all of the new applications which have appeared since the publication date of that bibliography, but it is felt that we should highlight a few unusual and possibly not so well known applications. The reader is referred to the references of this article for additional material on applications.

5.1 Critical-Path Planning and Scheduling
The application of linear programming to planning and scheduling by the critical-path method is described in Kelley and Walker [123, 124] and Kelley [120]. Fundamental to the critical-path method is the basic representation of a project. It is characteristic of many projects that all work must be performed in some well-defined order. For example, in construction work, forms must be built before concrete can be poured; in research and development work and product planning, specifications must be determined before drawings can be made; in advertising, artwork must be done before layouts can be made, etc. These relations of order can be shown graphically. Each job in the project is represented by an arrow which depicts (1) the existence of the job, and (2) the direction of time flow from the tail to the head of the arrow. The arrows then are interconnected to show graphically the sequence in which the jobs in the project must be performed. The result is a topological representation of a project. Several things should be noted. It is tacitly assumed that each job in a project is defined so that it is fully completed before any of its successors can begin. It is always possible to do this. The junctions where arrows meet are called events. These are points in time when certain jobs are completed and others begin. In particular there are two distinguished events, origin and terminus, respectively, with the property that origin precedes and terminus follows every event in the project. Associated with each event, as a label, is a nonnegative integer. It is
possible to label events in such a way that the event at the head of an arrow always has a larger label than the event at the tail. We assume that events are always labeled in this fashion. For a project $P$ of $n + 1$ events, the origin is given the label 0 and the terminus is given the label $n$. The event labels are used to designate jobs as follows: if an arrow connects event $i$ to event $j$, then the associated job is called job $(i, j)$. When the cost (labor, equipment, and materials) of a typical job varies with elapsed-time duration it usually approximates the form of the curve of Fig. 4. This represents what is usually called “direct” costs. Costs arising from administration, overheads, and distributives are not included.

FIG. 4. Typical job cost curve. (Horizontal axis: duration of job $(i, j)$, with $d_{ij}$ and $D_{ij}$ marked; vertical axis: cost, with the crash limit marked.)
Note that when the duration of job $(i, j)$ equals $D_{ij}$, the cost is a minimum. On the surface, this is a desirable point at which to operate. Certainly management would seldom elect to require the job to take longer than the optimal method time. We call $D_{ij}$ the normal duration for job $(i, j)$. However, exogenous conditions may require that a job be expedited. This may be done in a variety of ways. But in any case there is a limit to how fast a job may be performed. This lower bound is denoted by $d_{ij}$ in Fig. 4 and is called the crash duration for job $(i, j)$. It is thus reasonable to assume that the duration $y_{ij}$ of job $(i, j)$ satisfies
$$0 \le d_{ij} \le y_{ij} \le D_{ij}. \tag{5.1}$$
The cost of job $(i, j)$ is now approximated in a special way over the range defined by inequalities (5.1). The type of approximation used is dictated
by the mathematical technique involved in what follows. Thus, we must assume that the approximate cost function is a piecewise linear, nonincreasing, and convex function of $y_{ij}$. Usually, in practice, insufficient data are available to make more than a linear approximation. There are exceptions, of course. In the linear case we may write
$$\text{Cost of job } (i, j) = a_{ij}y_{ij} + b_{ij}, \tag{5.2}$$
where $a_{ij} \le 0$ and $b_{ij} \ge 0$. This is indicated by the dotted line in Fig. 4. On the basis of job cost functions just developed we can determine the (direct) cost of any particular schedule satisfying inequalities (5.1) by simply summing the individual job costs. That is,
$$\text{Project (direct) cost} = \sum_{(i,j)} (a_{ij}y_{ij} + b_{ij}). \tag{5.3}$$
It is clear that there are generally many ways in which job durations may be selected so that the earliest completion times of the resulting schedules are all equal. However, each schedule will yield a different value of (5.3), the project cost. Assuming that all conditions of the project are satisfied by these schedules, the one which costs the least would be selected for implementation. It is therefore desirable to have a means of selecting the least costly schedule for any given feasible earliest project completion time. Within the framework we have already constructed, such “optimal” schedules are obtained by solving the following linear program: Minimize (5.3) subject to (5.1) and
$$y_{ij} \le t_j - t_i, \tag{5.4}$$
and
$$t_0 = 0, \qquad t_n = \lambda. \tag{5.5}$$
Inequalities (5.4) express the fact that the duration of a job cannot exceed the time available for performing it. Equations (5.5) require the project to start at relative time 0 and be completed by relative time $\lambda$. Because of the form of the individual job cost functions, within the limits of most interest, $\lambda$ is also the earliest project completion time. A convenient tool for generating schedules for various values of $\lambda$ is the method of parametric linear programming with $\lambda$ as the parameter. Intuitively, this technique works as follows. Initially, we let $y_{ij} = D_{ij}$ for every job in the project. This is called the all-normal solution. We then assume that each job is started as early as possible. As a result we can compute $t_i^{(0)}$ for all events. In particular, the earliest project completion time for this schedule is $\lambda = t_n^{(0)}$. We now force a reduction in the project completion time by expediting certain of the critical jobs, those jobs that control project completion time. Not all critical jobs are expedited, but
only those that drive the project cost up at a minimum rate as the project completion time decreases. As the project completion time is reduced, more and more jobs become critical and thus different jobs may be expedited. This process is repeated until no further reduction in project completion time is possible. Mathematically speaking, the process utilizes a primal-dual algorithm. The restricted dual problem is a network flow problem involving both positive upper and lower bound capacity restrictions. A form of the Ford-Fulkerson network flow algorithm is used to solve it. The critical jobs that are expedited at each stage of the process correspond to a cut set in the graph of all critical jobs. This process produces a spectrum of schedules (characteristic solutions in the linear programming sense) each at minimum total (direct) cost for its particular duration. When the costs of these schedules are plotted versus their respective durations, we obtain a nonincreasing, piecewise linear, convex function.
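The starting point of the parametric procedure, the all-normal solution, is easy to sketch in Python. The small project below, its durations, and its cost coefficients are invented for illustration, and the function names are hypothetical. The sketch relies only on the labeling convention stated earlier (every arrow runs from a smaller to a larger event label); it computes the earliest event times, the critical jobs, and the project cost (5.3), and does not attempt the primal-dual expediting step.

```python
def earliest_times(n, durations):
    """Earliest event times t_0..t_n when every job starts as early as possible.

    durations maps a job (i, j), with i < j, to its duration y_ij.
    """
    t = [0] * (n + 1)
    for (i, j), y in sorted(durations.items()):   # arrows into i come before arrows out of i
        t[j] = max(t[j], t[i] + y)
    return t


def critical_jobs(n, durations):
    """Jobs with zero slack, i.e. those that control the completion time."""
    early = earliest_times(n, durations)
    late = [early[n]] * (n + 1)
    for (i, j), y in sorted(durations.items(), reverse=True):
        late[i] = min(late[i], late[j] - y)
    return [(i, j) for (i, j), y in sorted(durations.items())
            if late[j] - early[i] - y == 0]


def project_cost(durations, a, b):
    """Direct cost (5.3): sum of a_ij * y_ij + b_ij over all jobs."""
    return sum(a[job] * y + b[job] for job, y in durations.items())


# Hypothetical project with events 0..3; durations are the normal durations D_ij.
D = {(0, 1): 3, (0, 2): 2, (1, 2): 1, (1, 3): 4, (2, 3): 2}
a = {(0, 1): -2, (0, 2): -1, (1, 2): -3, (1, 3): -1, (2, 3): -2}   # a_ij <= 0
b = {(0, 1): 10, (0, 2): 5, (1, 2): 6, (1, 3): 9, (2, 3): 8}       # b_ij >= 0

t0 = earliest_times(3, D)
crit = critical_jobs(3, D)
cost = project_cost(D, a, b)
print(t0, crit, cost)
```

In this example only jobs (0, 1) and (1, 3) are critical, so these are the only candidates for expediting in the first reduction of the completion time.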
5.1.1 Computational Experience
The critical-path method has been programmed for the UNIVAC I, 1103A, and 1105. The limitations on the size of problems that available computer programs can handle are as follows: UNIVAC I, 739 jobs and 239 events; 1103A, 1023 jobs and 512 events; 1105, 3000 jobs and 1000 events. An IBM 650 computer program for critical-path planning and scheduling is available on a fee basis from Mauchly Associates, Inc., Ambler, Penna. Generally, computer usage represents only a small portion of the time it takes to carry through an application. Experience thus far shows that, depending on the nature of the project and the information available, it may take from a day to six weeks to carry a project analysis through from start to finish. At this point it is difficult to generalize. Computer time has run from 1 to 12 hours, depending on the application and the number of runs required.
5.2 Structural Design

In Heyman and Prager [97], it is shown that the automatic plastic design of structural frames can be treated by the method of linear programming. The authors point out, however, that since the number of variables increases very rapidly with the complexity of the frame, this basic formulation is computationally feasible only for simple frames. In their paper the authors propose an alternate method which greatly reduces the size of the problem. This special procedure has been coded for the IBM 650, Stone [195], and the IBM 704, Kalker [105]. These programs handle
the following basic problem: Given the centerline dimensions of a plane structure of $n$ bays and $m$ storeys, find the cross sections of the various members such that the material consumption is a minimum. In Lemke et al. [133], the authors pose a nonlinear problem in plastic limit analysis as an integral linear program. Arabi [5] considers a design problem which is associated with airplane control surfaces. The tail control surfaces of aircraft are equipped with several internal balance panels connected to the movable surfaces by linkages. For added safety, extra weight can be added, when feasible, to those panels so that if one should become ineffective during flight, enough weight remains to counterbalance the control surface. The problem is to determine the minimum weight which must be added to an airplane to prevent flutter of the control surfaces in the event of such a failure.

5.3 Other Applications
Manne [146] studied the planning problem faced by a machine shop required to produce many different items so as to meet a rigid delivery schedule, remain within capacity limitations, and at the same time minimize the use of premium-cost overtime labor. It differs from alternative approaches to this well-known problem by allowing for setup cost indivisibilities. As an approximation, the following linear programming model is suggested: Let an activity be defined as a sequence of the inputs required to satisfy the delivery requirements for a single item over time. The input coefficients for each such activity may then be constructed so as to allow for all setup costs incurred when the activity is operated at the level of unity or at zero. It is then shown that in any solution to this problem, all activity levels will turn out to be either unity or zero, except for those related to a group of items which, in number, must be equal to or less than the original number of capacity constraints. This result means that the linear programming solution should provide a good approximation whenever the number of items being manufactured is large in comparison with the number of capacity constraints. Fort [72] describes a linear programming model of the gaseous diffusion process for separating uranium isotopes. The model is intended primarily as a component of larger models involving interactions between the gaseous diffusion process, nuclear reactors, and other facilities of the nuclear materials industry. Such models may be useful in attacking such problems as the selection of the best composition of waste material from the gaseous diffusion process, the choice among alternative designs and modes of operation of nuclear reactors, the choice among alternative patterns of material flow among the various elements of the nuclear materials industry,
and the assessment of the cost or value of materials or electric power supplied to or yielded by the industry. The model depicted concerns an idealized version of the gaseous diffusion process in steady-state operation. Three types of relations are taken into account: (a) material balance within the plant; (b) the scale of plant required to generate given material flows; and (c) the irreversible nature of the gaseous diffusion process. The principal contribution of this paper is in suggesting the importance of the last consideration in certain applications, and in describing a way to handle it by linear programming. In Saaty and Webb [186] we have not only a discussion of the formulation and solution of a rather large problem in scheduling aircraft overhaul, but also a report on the application of the analysis of sensitivity to critical parameters. The authors note that this sensitivity analysis yielded information regarding the required accuracy of the parameters, and the desirability of their inclusion in the final version of the solution. Long-range planning is interpreted by means of these parameters, hence tests of new policy or planning can be based on the sensitivity analysis. As an example, the system has been found more sensitive to man-hour distributions than to the error distributions of the number of items allocated, contrary to existing expectation. Another problem of interest is the prediction method for determining releases to overhaul of aircraft engines. A mathematical model using renewal theory has been tested for use as a basic input to the distribution model.

6. Summary of Progress in Related Fields
6.1 Curve Fitting
Basic work in this area, which transforms problems of “best fit” into linear programming problems, is contained in Kelley [121], Charnes and Lemke [26], Wagner [214], and Wolfe [226]. Wagner treats the problem of “best fit” as applied to least absolute deviations and least maximum deviations. He shows that if the linear regression relation contains p parameters, minimizing the sum of the absolute vertical deviations from the regression line is equivalent to a p-equation linear programming model with bounded variables; and fitting by the Chebyshev criterion leads to a standard (p + 1)-equation linear programming model.
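To make the reduction concrete, one common way of writing the two fitting problems as linear programs is sketched below. The notation is an assumption introduced here and is not taken from Wagner's paper: observations $(x_i, y_i)$, $i = 1, \ldots, N$, where $x_i$ is a vector of $p$ regressors, a parameter vector $\beta$ that is unrestricted in sign, and auxiliary variables $u_i$, $v_i$, $t$, $w_i$, $a_i$, $b_i$. The dual of each problem has one equation per parameter, which is consistent with the p-equation and (p + 1)-equation counts quoted above.

```latex
% Least absolute deviations, primal form:
\min_{\beta,u,v} \sum_{i=1}^{N} (u_i + v_i)
\quad \text{s.t.} \quad x_i^{\top}\beta + u_i - v_i = y_i,\; u_i \ge 0,\; v_i \ge 0 .

% Its dual: p equations in the bounded variables w_i
\max_{w} \sum_{i=1}^{N} y_i w_i
\quad \text{s.t.} \quad \sum_{i=1}^{N} x_i w_i = 0,\; -1 \le w_i \le 1 .

% Chebyshev (least maximum deviation) fit, primal form, t unrestricted:
\min_{\beta,t} \; t
\quad \text{s.t.} \quad -t \le y_i - x_i^{\top}\beta \le t .

% Its dual: p + 1 equations in the nonnegative variables a_i, b_i
\max_{a,b} \sum_{i=1}^{N} y_i (a_i - b_i)
\quad \text{s.t.} \quad \sum_{i=1}^{N} x_i (a_i - b_i) = 0,\;
\sum_{i=1}^{N} (a_i + b_i) = 1,\; a_i \ge 0,\; b_i \ge 0 .
```

In both cases the number of equations depends only on the number of parameters, not on the number of observations, which is what makes the dual forms attractive for computation.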
6.2 The Theory of Games
The relationship between the problem of linear programming and the problem of the zero-sum two-person game is well known. Applications, computational techniques, and theoretical advances from either field
should be continually evaluated as to how they affect one another.
Dantzig [37] considers a two-move game with perfect information. This leads to the problem of finding a global minimum of a concave function over a convex domain. It is shown that the global minimum can be obtained by solving a linear programming system with the side conditions that at least one of certain pairs of variables vanish. This is equivalent to a linear programming problem with some integer valued variables.
In Dartmouth [49], Kemeny and Thompson review the essential elements of the fictitious play method for solving matrix games. The authors have incorporated certain computational features which tend to decrease the total computational time. The procedure produces very accurate answers and does not have any round-off error. The method has been coded for an IBM 704 and can solve problems with dimensions of m ≤ 200, n ≤ 200, and mn ≤ 4000. A 10 × 10 game was solved in 10,694 steps in 1 min with an accuracy of 0.00004 and a 17 × 18 game in 35 sec. A 20 × 20 problem deliberately made up to be as difficult as possible for the method took 15 min and had an accuracy of 0.0005.
6.3 Stochastic Linear Programming
An excellent presentation and survey of the work done in this area is given in Madansky [140]. The author reviews two types of stochastic problems, the “wait-and-see” and the “here-and-now.” It is the latter type of problem which is of interest and Madansky reviews the work done in this area. In addition, he discusses a wide variety of references. Other recent publications in this field are Madansky [141] and Wagner [209].
6.4 Nonlinear Programming
As it is planned to include a separate survey article on nonlinear programming in one of the future volumes in this series, we will only make a few notes and references to this broad and important area. A general survey paper which describes the various problem areas and methods for solution is one by Wolfe [226a]. There the author classifies the problems based on the combination of the linearity or nonlinearity of the objective function and constraint set. He also classifies available computing procedures as the walking (simplex method for linear programs), hopping (the simplex method adapted to minimizing a convex function subject to linear constraints), and creeping (gradient methods). In addition to the work of Frisch and Zoutendijk, which is described in this survey, we find variations of the gradient method in the work by Lemke [132], Rosen [181], and Fiacco et al. [66a]. In addition we have the differential gradient method of Arrow et al. [7]. The study by Witzgall [224] contrasts and discusses the procedures of Frisch, Lemke, and Rosen
as they apply to the linear case. Witzgall describes the close computational relationship between the procedures of Frisch and Rosen, and presents the Lemke Computational Tableau as a generalization of the Simplex Tableau. Computer codes for the Frisch, Zoutendijk, Rosen, and F.S.B. procedures have been developed. For example, Rosen notes that his method has been coded for a Burroughs 705 and IBM 704.
For the special case of minimizing a quadratic objective function subject to linear constraints, we cite the work of Wolfe [226], Beale [11], and Markowitz [152]. Wolfe’s procedure is an adaptation of the simplex method and has been coded for the IBM 704. The use of this procedure calls for a transformation into a much larger problem of the order of m + n equations and m + 3n variables. Wolfe notes that the largest problem solved with this program has 90 constraints and 192 variables and took 230 min and 359 iterations.
In addition to the above procedures, the cutting plane method of Kelley [121] and Cheney and Goldstein [29] has been applied to general convex programming problems. From Kelley’s work we cite the following brief description of the procedure.
The method uses the notion of cutting planes that has been introduced by Gomory and others in connection with techniques for solving integer linear programs. It is of interest to note that while Gomory’s method reduces to the Euclidean Algorithm when applied to a special case, Kelley’s method is formally analogous to Newton’s Method for finding roots.
We address ourselves to the problem of minimizing a linear form on a closed convex set R. It should be noted that this problem is equivalent to minimizing a convex function F(x) on a closed convex set R'. The former reduces to the latter since the linear form is also convex. Conversely, let R = {(x, y) | y ≥ F(x), x ∈ R'}. Then the latter problem has the form of minimizing the linear form y on R.
Assuming that the problem has a bounded minimum we imbed that part of R containing a minimum point in a compact polyhedral convex set S and minimize the form on S. This is a straightforward linear programming problem. We now cut off a portion of S containing the just determined minimum point by passing a hyperplane (cutting plane) between the minimum point and R, obtaining a new convex set S. The linear form is now minimized on S. This process of making cuts and minimizing the linear form on the resulting convex set is continued. In the limit we obtain a point in R for which the linear form is minimized. As just described, the convex programming problem reduces to solving a sequence (generally infinite) of linear programs. In practice, of course, this sequence is truncated after a finite number of steps at a point when the desired degree of approximation is obtained.
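The cutting plane procedure quoted above can be illustrated on the smallest possible case. The following is a minimal sketch, assuming a differentiable convex function of one variable on an interval [a, b], with tangent lines serving as the cutting planes. The function name `kelley_1d` and its parameters are hypothetical, and the small linear program at each step is solved by enumerating vertices, which is practical only in one dimension; it is not a reproduction of any code cited in the text.

```python
def kelley_1d(f, df, a, b, tol=1e-6, max_iter=60):
    """Minimize a convex, differentiable f on [a, b] by cutting planes."""
    cuts = []  # each cut (s, c) encodes the half-plane y >= s * x + c

    def add_cut(x):
        s = df(x)
        cuts.append((s, f(x) - s * x))  # tangent line of f at x

    def model(x):
        # Lowest y allowed at x by all cuts collected so far.
        return max(s * x + c for s, c in cuts)

    add_cut(a)
    add_cut(b)
    best_x, best_val = min(((a, f(a)), (b, f(b))), key=lambda t: t[1])
    for _ in range(max_iter):
        # Solve the small linear program "minimize y over the polygon S".
        # In one dimension its optimum lies at an end point of [a, b] or
        # where two cuts intersect, so enumerating those points suffices.
        candidates = [a, b]
        for i in range(len(cuts)):
            for j in range(i + 1, len(cuts)):
                (s1, c1), (s2, c2) = cuts[i], cuts[j]
                if s1 != s2:
                    x = (c2 - c1) / (s1 - s2)
                    if a <= x <= b:
                        candidates.append(x)
        x_lp = min(candidates, key=model)
        lower = model(x_lp)           # lower bound on the true minimum
        fx = f(x_lp)
        if fx < best_val:
            best_x, best_val = x_lp, fx
        if best_val - lower <= tol:   # truncate once the gap is small
            break
        add_cut(x_lp)                 # cut off the LP minimum point
    return best_x, best_val


if __name__ == "__main__":
    x_min, f_min = kelley_1d(lambda x: (x - 1.0) ** 2 + 2.0,
                             lambda x: 2.0 * (x - 1.0), -3.0, 4.0)
    print(x_min, f_min)
```

The gap between the best function value found and the linear programming minimum plays the role of the "desired degree of approximation" at which the sequence is truncated.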
7. Linear Programming Computing Codes and Procedures
If the importance of a computing algorithm can be measured by the number of different machines for which corresponding codes have been written, then the simplex algorithm and its variations rank high indeed. In fact, whenever a new computer is announced, one can usually find a linear programming code on the list of codes already developed or to be developed. It is also interesting to note that in their recent book, Mathematical Methods for Digital Computers, Ralston and Wilf [177] deemed it worthwhile to include a chapter on “The Solution of Linear Programming Problems” (Chapter 25, which was written by Dean N. Arden).
In gathering information for this section we attempted to be as inclusive and exhaustive as possible. It must be recognized that data of this sort does not remain up-to-date for any great length of time. Most of the information was supplied by the manufacturers or other appropriate sources such as computer user groups. No attempt was made to measure the efficiency or adequacy of the cited codes. The time estimates or comparisons given are taken from the program write-ups. Unless otherwise specified, the write-ups of the codes and card decks are available from the manufacturer or the associated computer users group. Also, unless otherwise specified, for the general linear programming problem, m is the number of equations and n the number of variables; for the transportation problem m is the number of origins and n the number of destinations.
7.1 Digital Computers
7.1.1 The IBM 704 SHARE Programs
Table II gives a list of available programs for the 704 which have been submitted to the SHARE Distribution Agency. Most of the codes are described in this section.

TABLE II

Identity   Date       Title
CE FLP     7-21-59    FORTRAN L.P.
CE SCRL    12-2?-59   SCROL system
IB ML1     10-12-59   Machine loading-generalized transportation
IB TFL     6-24-58    Transportation-Hungarian method
IB TFM     6-17-59    Drumless version of IB TFL
MI CNF1    9-14-58    Capacitated network flow
N$ LPS2    9-11-58    Drumless version of RS LPS1
NY TR1     12-27-57   Transportation-Hungarian method
RS LPS1    1-08-57    “The Rand code”
RS M1      3-?1-60    FORTRAN mathematical programming system
SC MAP     10-17-58   Input-output for SC MAP
SC MUSH    10-17-58   Small-problem linear programming
SC XPCD    7-09-59    Transportation
SM LPMP    7-03-59    Linear programming matrix structure print
PK IP01    7-01-60    Integer programming
PK IP02    7-01-60    Integer programming
PK IP03    7-01-60    Integer programming
RS BP1     10-01-60   RS LPS1 with upper bounds
RS QP2     1-01-61    The simplex method for quadratic programming

7.1.1.1 The Rand Code (RS LPS1)
This linear programming system uses the modified simplex procedure with the product form of inverse. It will handle as many as 255 equations and there is no restriction on the number of variables that may be used. The use of double precision floating point arithmetic in the internal calculations is adequate to solve problems involving as many as 255 equations because there are stringent, but not unreasonable, limitations on the input data. With the algorithms used, the computing time per iteration increases very little for an increase in the number of variables. A precise time estimate is impossible, but one typical problem with 50 equations and 100 variables was solved in 20 min.
The basic machine needed for this system is the IBM 704 with 4 logical drums, 4 tapes, and 4096 words of magnetic core storage. One additional tape may be used optionally for storing output results. One other tape may be used for input to the data assembly program which precedes the main code. Hence the input is from cards or tape and the output is the on-line printer or off-line printer. The card punch is used by the data assembly and also for punching restart information. A modification of this code, N$ LPS2, has been written which uses a 32K machine and no drums. The code allows for the use of a curtain which restricts the solution to a predetermined set of vectors. The curtain can, of course, be lifted during the later stages of the solution. Parametric linear programming of the constant terms is also available. A description of this code in French is given by Pigot [168].
7.1.1.2 Pigot French Code (PF PLI)
This code is a modification of RS LPS1 done by D. Pigot of Société des Pétroles Shell Berre. It includes such changes as being able to solve problems with up to 640 equations (with parametrization), and important reductions in computation time. A write-up of the code in French is available [168].
7.1.1.3 SCROL System (CE SCRL)
SCROL is a system using the original 704 L.P. codes developed at Rand, RS LPS1, as a basis but incorporating a whole new dimension of control for operational procedures. The system is designed to perform not only the composite simplex method including many variations but also to carry out automatically a number of the data handling, data editing, and peripheral functions formerly provided for with auxiliary routines or not at all. These are activated by card-programmed, interpretive macro-instructions. The whole repertoire of available operations is combined into one system which exists physically on a system tape. The basic version is for an 8K or 32K 704 with drums. It will not operate on a 4K machine. There exists another version to enable one to run on the 32K machine without drums. For small to medium-sized problems, transformations will never get to tape with 32K. The basic version can be easily added to for special applications. Conceptually the system consists of three parts: (1) the system control which includes some of the minor macro-operations, (2) the major macro-routines, and (3) the L.P. codes or subroutines. Other features include parametric linear programming of the constant terms and cost coefficients; the finding of alternate optimal solutions; the determination of the range of each cost which still maintains optimality; the handling of multiple objective functions and right-hand sides; “curtains,” which restrict the selection of a new basis to the vectors in front of the curtain, if possible; and partitions, which are absolute barriers in that vectors after the partition cannot enter the basis.
7.1.1.4 SC MUSH
MUSH is a subroutine for solving a medium-size linear program. The size restriction permits minimum transfer of data into and out of core and the use of single-precision arithmetic. This results in a routine whose speed compensates for its limited range. A modified simplex algorithm using single-precision floating-point arithmetic is used to determine the optimum solution. A starting identity basis is made up from available unit positive vectors in the input data with artificial vectors supplied as needed. These artificials are driven out of the basis by assigning a relatively high penalty to them in the cost functional. Each change in the basis matrix is recorded as a change in the inverse, which is maintained in core. Round-off errors, which accumulate in the
continuously updated inverse, are periodically reduced by a purification method due to Hotelling. Restrictions of the code are: (a) Two physical drums are required. (b) Input data must be in core at the time MUSH is called in. (c) A problem of at most 55 equations may be solved. Virtually any number of variables can be treated if the machine used has more than 4K core memory. In the 4K machine, the practical limit is about 128. If NMAX is the maximum number of variables for which the program is assembled, then the coefficients of the constraint matrix are limited to 2048 - NMAX nonzero elements stored in compressed form.
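The write-up quoted here does not spell out the purification step. A minimal sketch of one well-known refinement associated with Hotelling's name is given below, assuming that the basis matrix A and an approximate inverse X are both available: each pass replaces X by X(2I - AX), which squares the residual I - AX. Whether MUSH uses exactly this form is an assumption, and the function names are hypothetical.

```python
def matmul(a, b):
    """Product of two small dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def residual(a, x):
    """Largest absolute entry of I - A X, a measure of the inverse error."""
    n = len(a)
    ax = matmul(a, x)
    return max(abs((1.0 if i == j else 0.0) - ax[i][j])
               for i in range(n) for j in range(n))


def hotelling_step(a, x):
    """One purification pass: X <- X (2I - A X)."""
    n = len(a)
    ax = matmul(a, x)
    correction = [[(2.0 if i == j else 0.0) - ax[i][j] for j in range(n)]
                  for i in range(n)]
    return matmul(x, correction)


if __name__ == "__main__":
    a = [[4.0, 1.0], [2.0, 3.0]]
    x = [[0.31, -0.09], [-0.21, 0.41]]   # inverse contaminated by "round-off"
    for _ in range(3):
        print(residual(a, x))
        x = hotelling_step(a, x)
    print(residual(a, x))
```

The pass only helps when the starting residual is already small (norm below one), which is the situation after a moderate number of simplex iterations and explains why the correction is applied periodically rather than once at the end.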
7.1.1.5 FORTRAN Linear Programming (CE FLP)
The purpose of this program is to provide a flexible and easily modified code for solution of linear programming problems by the simplex method for installations with FORTRAN facilities. The program includes subroutines for: input, Phase I setup, arbitrary transformations, pricing for composite algorithm, ratio test for composite algorithm, and Gaussian elimination. Output is interspersed as needed. A checking subroutine is also included. Restrictions are: (a) Components required: minimal FORTRAN installation but 8K core assumed. (b) Other programs required: FORTRAN master tape to compile. (c) Data: maximum number of rows in tableau including all optimizing forms, max m = 51. This can be changed by recompiling, within memory limits. Total columns in restraint matrix, excluding right-hand side and any artificial slack vectors for Phase I, max n = 91. (Same comments as for m.) The entire tableau m × (n + 1) is kept in core at all times. All data are input and output as fixed point; input not to exceed 10 digits of which 5 are decimal fraction.
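The division of labor among the subroutines named above (pricing, ratio test, Gaussian elimination on a tableau held entirely in memory) can be sketched as follows. This is a minimal illustration and not the CE FLP code: it uses the plain primal simplex method rather than the composite algorithm, assumes a problem of the form maximize c·x subject to Ax ≤ b, x ≥ 0 with b ≥ 0 so that no Phase I is needed, and uses exact rational arithmetic. The function name `simplex_max` is hypothetical.

```python
from fractions import Fraction


def simplex_max(c, A, b):
    """Maximize c.x subject to A x <= b, x >= 0, assuming every b[i] >= 0."""
    m, n = len(A), len(c)
    # Dense tableau: constraint rows [A | I | b], then the objective row.
    T = [[Fraction(v) for v in A[i]]
         + [Fraction(int(i == k)) for k in range(m)]
         + [Fraction(b[i])] for i in range(m)]
    T.append([Fraction(-v) for v in c] + [Fraction(0)] * (m + 1))
    basis = [n + i for i in range(m)]          # slack variables start basic
    while True:
        # Pricing: first column with a negative reduced cost (Bland's rule).
        col = next((j for j in range(n + m) if T[m][j] < 0), None)
        if col is None:
            break                               # optimal
        # Ratio test: the row that limits the entering variable first.
        rows = [i for i in range(m) if T[i][col] > 0]
        if not rows:
            raise ValueError("objective is unbounded")
        row = min(rows, key=lambda i: (T[i][-1] / T[i][col], basis[i]))
        # Gaussian elimination on the pivot element.
        pivot = T[row][col]
        T[row] = [v / pivot for v in T[row]]
        for i in range(m + 1):
            if i != row and T[i][col] != 0:
                factor = T[i][col]
                T[i] = [vi - factor * vr for vi, vr in zip(T[i], T[row])]
        basis[row] = col
    x = [Fraction(0)] * n
    for i, j in enumerate(basis):
        if j < n:
            x[j] = T[i][-1]
    return x, T[m][-1]


if __name__ == "__main__":
    x, z = simplex_max([3, 5], [[1, 0], [0, 2], [3, 2]], [4, 12, 18])
    print([str(v) for v in x], z)   # prints ['2', '6'] 36
```

Keeping the whole tableau in memory, as the write-up describes, makes each of the three steps a simple loop over rows or columns; the price is the m × (n + 1) storage that limits the problem size.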
7.1.1.6 FORTRAN Mathematical Programming System (RS M1)
RS M1 (Rand Corporation, Mathematical Programming System One) is a system of 704 programs for the solution of linear programming problems. In its design, emphasis has been placed on producing a system of routines which could be readily modified in any of its parts to serve special needs. The basic system consists of twenty-two routines, of which three are concerned with input, four with output, six predominantly with control functions, and nine with computing. The system in its present form has been designed for a 32K drumless 704, using all equipment but the punch, and from one to four tape units. The program is self-contained, but requires the deletion of some alternate
subroutines before using. The data format is the same as that for the Rand linear programming code RS LPS1 excepting that decimal points must be used and certain control cards of LPS1 must be replaced by those appropriate to this system. The program consists of the FORTRAN II BSS loader followed by relocatable binaries. Most of the system parts were compiled with FORTRAN II. The system employs the revised simplex method, in which the simplex method basis-inverse is maintained in its explicit form. It can handle linear programming problems having up to 97 constraints, 299 (nonartificial) variables, and 2499 nonzero matrix entries. The parts of the computation sensitive to round-off error may be done in either single or double precision. Accuracy on small problems done in single precision has been close to the theoretical maximum.
Two linear programming problems were used extensively in testing the system. The first has 33 constraints, the second 64. The computing times given in Table III were found with the Rand Linear Programming System 1 (RS LPS1: all double precision, product form of the inverse, considerable use of tapes in computation), the Standard Oil Company of California linear programming code (SC MUSH: revised simplex method with explicit inverse, single precision, uses drum), and the single and double precision versions of the present Rand Mathematical Programming System 1 (RS M1). The difference in numbers of iterations done between LPS1 and the other routines arises from the fact that the latter search the problem for singletons (columns having just one positive entry in the constraint matrix) and form a starting basis from them. LPS1 does not automatically do this. If the heavy use LPS1 makes of tapes in its computation is kept in mind, and the fact that its pricing-out operation is done in double precision, which is not the case in M1, the relative times per iteration of LPS1 and M1 seem to be a convincing argument for the efficiency of the product form of the inverse, as used in LPS1.

TABLE III

Number of
equations     RS LPS1    SC MUSH    RS M1 (SP)    RS M1 (DP)

Times (sec), computation only
    33           190         26           16            73
    64          1300        220            -          1230
Number of iterations (after initial basis)
    33            69         33           35            35
    64           136        110            -           110
Seconds per iteration, computation only
    33          2.76       0.79         0.46          2.08
    64          9.56       2.00            -         11.18

A new program RS M3 is under construction which will handle up to 4000 equations and will use single precision for all computations except for a double-precision “tidying up phase.”
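As a check on how the last block of Table III relates to the first two, the seconds-per-iteration figures can be recomputed from the total times and iteration counts. The sketch below is only an arithmetic aid; the dictionary layout and the function name are hypothetical, and the numbers are those read from Table III, with the RS M1 (SP) entry for 64 equations left empty because the table gives none.

```python
# (seconds, iterations) as read from Table III; None marks the blank entry.
table3 = {
    "RS LPS1":    {33: (190, 69), 64: (1300, 136)},
    "SC MUSH":    {33: (26, 33),  64: (220, 110)},
    "RS M1 (SP)": {33: (16, 35),  64: None},
    "RS M1 (DP)": {33: (73, 35),  64: (1230, 110)},
}


def seconds_per_iteration(table):
    """Return {(code, equations): seconds / iterations} for filled entries."""
    rates = {}
    for code, rows in table.items():
        for equations, entry in rows.items():
            if entry is not None:
                seconds, iterations = entry
                rates[(code, equations)] = seconds / iterations
    return rates


rates = seconds_per_iteration(table3)
for (code, equations), rate in sorted(rates.items()):
    print(f"{code:11s} m={equations:2d}  {rate:6.2f} sec/iteration")
```

The recomputed ratios agree with the printed ones to within about 0.01 sec, the small differences presumably reflecting rounding of the total times in the source.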
7.1.1.7 704 Linear Programming System of Bonner and Moore
This code is available from Bonner & Moore Engineering Associates, Houston, Texas, and is a general purpose code with special consideration given to the needs of the petroleum industry. The Bonner & Moore 704 Linear Programming System includes, as well as the linear programming code itself, a complete set of subroutines and subprograms for data loading, assembly, pre-editing, output, and output analysis. Most of the system is written in FORTRAN II, with a few subroutines coded in SAP. Single-precision floating-point arithmetic is used throughout, and the main solution algorithm is a modification of the simplex algorithm. Direct, dual, and composite problems may be handled. Minimum computer requirements are an 8K 704, with four tapes and four logical drums. The system is written to take advantage of larger memories, and additional tape units may be used to advantage. As mentioned above, the solution algorithm uses single-precision floating-point arithmetic. Several measures have been taken to minimize the development of round-off error, however, and to detect and correct it when it appears. Some of the more important of these include significance analysis during the computation of new matrix and inverse elements, provision for automatic reinversion and matrix “clean-up” from time to time, and an improved path selection algorithm which reduces the number of iterations required for solution by 20% to 50% over that required by earlier algorithms. The program is limited to a maximum of 2047 variables, including slacks and artificials. There are no independent limits on the number of restrictions or variables, although practical considerations limit problem size to approximately 150 restrictions on an 8K machine. Experience to date indicates that problem solution time of the order of 20 to 40 min may be expected for 100 restriction problems on an 8K machine.
This time would apply where no use was made of such techniques as curtaining, priority control, or pre-editing. Judicious use of these techniques would materially shorten the above estimate, as would use of a 32K core memory. The system also includes provision for automatic parametrization of the objective function or the requirements vector, or both simultaneously. The user has a great deal of control over the output of the intermediate cases during a parametric operation. For example, he may tell the program to
suppress output automatically if the parametrization results in a long sequence of very small steps, waiting until the solution has moved a significant amount before taking more than token output. Provision has also been made for revisions to the right-hand side, objective function, or both simultaneously after solution has been obtained, followed by postoptimal re-entry into the solution mode and re-solution of the altered problem. Complete provision is also included for dump and restart procedures, in which all essential information from drums, tapes, and core may be preserved on a single tape. This information may then be recalled at a later time during the same run (after a user-written program has performed some auxiliary function, for example) or at some later date. The pre-edit option makes it possible to take advantage of prior knowledge of the probable structure of the solution, to decrease the number of iterations required, and cut down the actual size of the matrix being handled. The pre-edit routine will prepare a condensed matrix tape, with restrictions and column vectors pertaining to variables whose final status is definitely known in advance deleted from the matrix. The smaller condensed matrix will be solved, and then the solution automatically restated in terms of the original expanded matrix. Other techniques for taking advantage of advance knowledge of the probable solution structure are also available, such as curtaining, matrix tape ordering, and the designation of “priority” and “prohibited” variables. In addition to the conventional listings of basis activities and shadow prices (with associated ranges of optimality and feasibility), elaborate output analysis and display options have been included in the system.
These provide for several varieties of postsolution arithmetic on the final activities, including such things as the accumulation of weighted sums of final activities, the multiplication of such summations by prices or costs, accumulations of totals at several levels, the back computation of TEL levels in gasoline blending problems, and the display of results in a readily readable matrix format, with Hollerith descriptive information inserted.
7.1.1.8 The Transportation Problem (NY TR1)
This program is designed to solve the basic transportation problem. The machine requirements include a 4K core storage, card reader, printer, 1 drum, and 3 tapes. The data restrictions are (m is the number of origins and n is the number of destinations):

    m + n ≤ 850     for 4K,
    m + n ≤ 1850    for 8K,
    m + n ≤ 7500    for 32K.

For all cases n ≤ 800 and m(n + 1) + 25m(n + 1)/256 ≤ 900,000. Only high speed storage is used if mn + 4(m + n - 1) + 600 ≤ high speed storage available. The coefficients cannot exceed 10^10 - 1 in magnitude. The “stepping stone method” for solving the transportation problem is used. The major characteristics of the method are these: The stepping stone method is an iterative procedure. A distribution must be furnished at the start. A distribution is any allocation of shipments which satisfies all the requirements without regard to the total cost. At each iteration a new distribution is constructed with a smaller total cost than the previous. When an improvement is no longer possible the distribution is optimal with respect to the total cost and furnishes the required solution to the transportation problem. Code NY TR1 finds the initial distribution automatically. The demand for each destination is assigned to the origin whose corresponding unit transportation cost is smallest, provided the origin has sufficient supply available. Otherwise, the next cheapest origin is tried, and so on until the demand is met. The epsilon method is used to guarantee that a basic feasible solution is constructed. As an option, an initial distribution may be furnished as part of the input.
7.1.1.9 The Transportation Problem-Flow Method (IB TFL)
The transportation-flow program is designed to solve the transportation problem on the IBM 704 in accordance with the flow, or Hungarian, method as proposed by L. R. Ford, Jr. and D. R. Fulkerson. Particularly for large matrices this program affords substantial gains in time over NY TR1, which is based on the stepping stone method. The minimum machine requirements are: 4096 words of high speed storage, card reader, 1 drum (logical unit #1), 1 tape (logical unit #6), on-line printer or tape #10 for output. The problem restrictions are

    n ≤ 600, no restriction on m,
    618 + m(n + 1) + 2(m + n) ≤ high speed storage available.

The numerical data supplied must not contain numbers exceeding 10^10 - 1 in magnitude. Comparison tests between IB TFL and NY TR1 (stepping stone method) have given the results shown in Table IV for the main calculation (exclusive of input and output):

TABLE IV

Matrix (m × n)    NY TR1     IB TFL
130 × 30           2'13"      1'30"
160 × 30           4'56"      3'34"
190 × 30           7'05"      4'00"
220 × 30          11'58"      4'58"

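As a companion to the comparison above, the rule by which NY TR1 is said (Section 7.1.1.8) to find its initial distribution can be sketched in a few lines. This is one reading of that description, under the assumption that a destination takes whatever the cheapest origin still has and then moves on to the next cheapest; the epsilon device for degenerate cases is not included, and the function name and data layout are hypothetical.

```python
def initial_distribution(cost, supply, demand):
    """Greedy starting allocation: fill each destination from its cheapest
    origins first. cost[i][j] is the unit cost from origin i to destination j."""
    m, n = len(supply), len(demand)
    left = list(supply)                      # supply still unassigned
    ship = [[0] * n for _ in range(m)]
    for j in range(n):
        need = demand[j]
        for i in sorted(range(m), key=lambda i: cost[i][j]):
            if need == 0:
                break
            quantity = min(left[i], need)
            ship[i][j] += quantity
            left[i] -= quantity
            need -= quantity
        if need:
            raise ValueError("total supply is insufficient")
    return ship


if __name__ == "__main__":
    cost = [[4, 6, 8], [5, 3, 7]]
    print(initial_distribution(cost, [30, 40], [20, 25, 25]))
    # prints [[20, 0, 10], [0, 25, 15]]
```

The result satisfies all supplies and demands but is generally not optimal; it only provides the starting point from which the stepping stone iterations reduce the total cost.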
7.1.1.10 Transportation Code (SC XPCD)
This code solves the standard transportation problem using the algorithm presented by Munkres [161]. It is well-suited for tall distributions, i.e., when one dimension is much larger than the other. It requires only one tape, and assumes an 8K core storage and four drums. Infinite costs are indicated in a condensed notation and previous shadow values may be used to start problems. Restrictions on input data are:

    number of columns          ≤ 31
    number of rows             ≤ 1500
    largest cost coefficient   ≤ 131,071
    largest availability       ≤ 67,108,836
    largest demand             ≤ 33,554,431.

7.1.1.11 Transportation Algorithm and Code II
This code [67] uses a variation of the Hungarian method and was written in FORTRAN II. It is available from the author, Merrill M. Flood. Computational experience yielded the following computing times for sample problems. These problems were developed by a random number generating scheme which is a part of the program:

    Size         No. of problems    Average time (min)
    29 × 116            2                  2.47
    29 × 29             3                  0.53
    20 × 29            13                  0.25

One of the 29 × 116 problems took 3.03 min with Code II and 3.17 min with IB TFL. The code can handle problems up to 60 × 150.
7.1.1.12 Capacitated Network Flow Program (MI CNF1)
The program determines a flow pattern over a general network so that a linear cost function of the branch flows assumes its minimum value.
The branch flows are restricted to being nonnegative and less than or equal to the capacities of the branches, and flow into and out of the nodes is conserved. The network flow problem is a linear programming problem and can be solved by general linear programming techniques, but doing so complicates the solution procedure, since the network structure can be utilized to give a more efficient method. Many of the problems that have been solved by linear programming techniques can be formulated in terms of network flow, thereby utilizing this procedure. The program is complete with input and output routines so that information can be transferred into and out of the computer conveniently. The input is assumed to be in integer form and integer arithmetic is used throughout. The results therefore are exact and are given as integers in the output. The only restrictions on the problem due to computation limitations are that the cost value associated with each branch and the branch capacity values must be less than 262,142, and the number of branches and nodes in the network must satisfy the relation: five times the number of branches plus the number of nodes must be less than the magnetic core memory size minus 530. In terms of the network, the problem is defined as: minimize

$$\sum_{i,j=0}^{N+1} c_{ij} x_{ij} \quad \text{(cost function)}$$

subject to the constraints

$$\sum_{j=1}^{N} (x_{ij} - x_{ji}) = s_i \quad \text{(conservation of flow at the } i = 1, 2, \ldots, N \text{ nodes)}$$

$$0 \le x_{ij} \le M_{ij} \quad \text{(flow restrictions)}$$

where, for an arbitrary branch (i, j),

    $x_{ij}$ = flow through the branch directed from node i to node j
    $M_{ij}$ = capacity of branch
    $c_{ij}$ = cost per unit flow (assumed nonnegative)
    $s_i$ = outside flow supply at node i (may be positive or negative)
    $N$ = number of nodes in network excluding the source and sink nodes.

The computer program solves the above problem for any specified total flow into the network. The total flow is the sum of the flows in the branches corresponding to the positive $s_i$ terms. If the total flow is the sum of these $s_i$, then the maximum flow through the capacity restricted network may be specified. Unless some other part of the network restricts the flow to a smaller value, the total flow allocated will then be the sum of the positive $s_i$.
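The text does not say which algorithm MI CNF1 uses to exploit the network structure. One standard possibility, shown here purely as an illustration and not as a description of that program, is the method of successive shortest paths: flow is pushed, one cheapest path at a time, from a single source to a single sink until the specified total flow is reached. The sketch assumes integer capacities and nonnegative integer costs, so that all arithmetic stays exact as in the program described; the function name and the branch format `(i, j, capacity, cost)` are hypothetical.

```python
def min_cost_flow(num_nodes, branches, source, sink, total):
    """Send `total` units from source to sink at minimum cost.
    branches: list of (i, j, capacity, cost) with integer data.
    Returns (total_cost, flows), flows[k] being the flow in branches[k]."""
    to, cap, cost = [], [], []
    adj = [[] for _ in range(num_nodes)]
    for i, j, u, c in branches:
        # Edge 2k is the branch itself, edge 2k + 1 its residual reverse.
        adj[i].append(len(to)); to.append(j); cap.append(u); cost.append(c)
        adj[j].append(len(to)); to.append(i); cap.append(0); cost.append(-c)
    sent = total_cost = 0
    inf = float("inf")
    while sent < total:
        # Bellman-Ford shortest path in the residual network.
        dist = [inf] * num_nodes
        pred = [-1] * num_nodes
        dist[source] = 0
        for _ in range(num_nodes - 1):
            changed = False
            for v in range(num_nodes):
                if dist[v] == inf:
                    continue
                for e in adj[v]:
                    if cap[e] > 0 and dist[v] + cost[e] < dist[to[e]]:
                        dist[to[e]] = dist[v] + cost[e]
                        pred[to[e]] = e
                        changed = True
            if not changed:
                break
        if dist[sink] == inf:
            raise ValueError("requested flow exceeds network capacity")
        # Largest amount the path can carry, capped by what is still needed.
        push = total - sent
        v = sink
        while v != source:
            e = pred[v]
            push = min(push, cap[e])
            v = to[e ^ 1]                 # tail of edge e
        v = sink
        while v != source:
            e = pred[v]
            cap[e] -= push
            cap[e ^ 1] += push
            v = to[e ^ 1]
        sent += push
        total_cost += push * dist[sink]
    flows = [cap[2 * k + 1] for k in range(len(branches))]
    return total_cost, flows


if __name__ == "__main__":
    net = [(0, 1, 2, 1), (0, 2, 2, 2), (1, 2, 1, 1), (1, 3, 1, 3), (2, 3, 3, 1)]
    print(min_cost_flow(4, net, 0, 3, 3))
```

Several supply and demand nodes, as in the formulation above, can be reduced to this single-source form by joining each node with positive $s_i$ to the source and each node with negative $s_i$ to the sink through branches of zero cost and capacity $|s_i|$.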
7.1.1.13 The Machine Loading Problem (IB ML1)
The purpose is to solve any problem which can be cast into the following mathematical form: Subject to the linear restrictions

$$\sum_{j=1}^{n} a_{ij} x_{ij} \le M_i \quad (i = 1, 2, \ldots, m)$$

find values $x_{ij}$ which minimize the linear function $\sum \sum c_{ij} x_{ij}$. The program requires a 704 (without special devices) with 8K memory, 2 tapes, and 4 logical drums. If data input is to be from tape rather than cards, a third (input) tape unit is required. The admissible size of problems has the upper limit: number of rows plus number of columns not to exceed 318 (m + n ≤ 318). All of core storage is used. The program also works on a 32K machine for the same size problems. The method used for solution is a generalization of the “stepping stone method” as outlined in [61].
7.1.1.14 Multidimensional Distribution Problem
This program is designed to solve problems with the following formulation: Find a set of integers $x_{ijh}$ such that

$$\sum_{j=1} \sum_{h=1} x_{ijh} = \ldots, \qquad \sum_{i=1} \sum_{h=1} x_{ijh} = b_j$$

and

$$\sum_{i=1} \sum_{j=1} \sum_{h=1} c_{ijh} x_{ijh}$$

is a minimum. Three methods have been programmed: approximation method, method of optimal regions, and method of reduced matrices. The dimensions of the problems for each method are as follows:
Approximate solution: $k \le 20$; $\sum_{i=1}^{k} p_i \le 900$, where $p_i$ is the number of “shipping points” in the ith dimension; $p_1 p_2 \cdots p_k \le 60{,}000$. Computing examples: $k = 2$, $p_1 = 186$, $p_2 = 15$; time, 19.5 min. $k = 4$, $p_1 = p_2 = p_3 = p_4 = 3$; time, 12 sec.
Optimal regions: $k = 2$; $p_1 + p_2 \le 1200$; $p_2 \le 10$; $c_{ij} < 10^4$. This is for tall matrices. Uses 8K memory, 5 tapes, 2 drums. Computing example: 7 × 1152; time, 45 min.
Reduced matrices: $k \le 20$; $\sum_{i=1}^{k} p_i \le 900$; $p_1 p_2 \cdots p_k \le$ ?00,000; $c_{ijh} < 10^4$. Requires 8K memory, 6 tapes, and 1 drum. Time given approximately by $s^2/100$ min, where $s = \sum p_i$.
These codes are available from B. Galler, University of Michigan Computing Center.
7.1.1.15 Integer Programming (PK IP01, IP02, IP03)
Codes IP01, IP02, and IP03 are all designed to solve integer programming problems; that is, to solve linear programming problems with the additional restriction that the variables involved are integers. Codes IP01 and IP02 use only the method of Gomory [90a]; consequently, all coefficients remain integer throughout and the round-off error problem is avoided. Code IP03 uses the methods of Gomory [89, 90a] and introduces fractions into the course of the calculation, controlling round-off by various methods. Code IP01 uses a simple form of [90a] and each individual pivot step is faster than a step in the more complicated IP02; IP02, however, usually requires fewer steps than IP01 and can solve some problems that are intractable for IP01. IP03 is probably the most generally effective of the three, though IP02 is often better on degenerate problems. The performance of all the codes is quite unpredictable. Failures (i.e., endlessly long runs) have been recorded with as few as 10 variables, successes with as many as 90. Generally, the difficulty goes up with the number of variables, the size of the matrix coefficients, and the density (lack of sparseness) of the matrix. However, even these characteristics are not always reliable guides, as matrices that are almost identical have produced very different runs.
For any problem the maximum number of variables, n, for IP01 and IP02 is 35 for an 8K memory, 100 for 32K. The maximum number of constraints is 75 - n for the 8K machine, 200 - n for the 32K. All coefficients must be integers. IP03 requires a 32K core storage; three magnetic tape units and tape-to-printer equipment are required in addition to the “minimum 704.” For any problem, the number of variables, n, must not exceed 100, and for most problems the total number of objective functions and constraints must not exceed 200 - n.
All computations in the program are floating point. There is no round-off error unless some matrix element reaches a magnitude of the order of 10^8. This usually occurs only when many elements of the matrix have begun to oscillate violently between very large positive and negative integers. This condition indicates that the problem is probably intractable for the code and that it is unlikely that any solution will be reached in a reasonable amount of machine time.
For IP01, the time elapsed during one iterative step varies widely with the size of the problem, the numbers currently in the matrix, and other factors. A rough estimate is 75 pivots per minute for a problem of 20 variables and 20 constraints, 45 pivots per minute for a 30 by 30 problem. The number of pivots required for any particular problem is almost wholly unpredictable, although it tends to increase with the size of the matrix and decrease with the sparsity. For IP02, a rough estimate is 65 pivots per minute for a problem of 20 variables and 20 constraints, 40 pivots per minute for a 30 by 30 problem. For IP03, a rough estimate is 50 pivots per minute for a 30 by 30 matrix, 80 per minute for a 20 by 20.
7.1.2 The IBM 7090
7.1.2.1 LP/90
This code was developed by CEIR, Inc., Arlington, Virginia, for a number of petroleum and chemical organizations. At this writing, it was not generally available. The code is approximately 10 times faster than the comparable RS LPS1 code for the 704. It can handle 512 equations including the objective function and an indefinite number of variables. It has double precision floating-point arithmetic and has the parametric and other features of SCROL. It automatically determines if the problem can be solved using internal core storage. It allows for the naming of rows as well as columns, which enables the selection of a specified subset of rows and columns to be solved as a subproblem.
The system also includes a facility for updating the matrix and allows for the addition or deletion of a row or column and the change of an individual coefficient. It is a system with a built-in compiler which greatly facilitates the addition of a new routine or a change in an old routine.

7.1.3 IBM 650

7.1.3.1 Linear Programming-10.1.001

This program is in fixed point arithmetic. All numbers are 10 digits with the decimal point assumed between the 6th and 7th digit. It is designed to accommodate systems of $m$ rows and $n$ columns where $m \le 30$, $n \le 59$, and $m(n + 1) < 1400$. The values of $m$ and $n$ pertain to the system after the slack vectors and artificial vectors have been adjoined.
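The number format and size limits of code 10.1.001 can be illustrated with a short sketch. This is not the 650 routine itself: the helper names `to_fixed`, `fixed_mul`, and `fits` are hypothetical, and truncating products toward zero is an assumption, since the description does not say how the code rounds.

```python
SCALE = 10**4    # four digits follow the assumed decimal point
LIMIT = 10**10   # a ten-digit word holds magnitudes below 10**10


def to_fixed(x):
    """Encode x as a ten-digit integer word with four fractional digits."""
    word = round(x * SCALE)
    if abs(word) >= LIMIT:
        raise OverflowError("value needs more than ten digits")
    return word


def fixed_mul(a, b):
    """Multiply two words and rescale, dropping the extra low-order digits.

    Truncation toward zero is an assumption made for this sketch.
    """
    product = a * b
    word = abs(product) // SCALE
    if word >= LIMIT:
        raise OverflowError("product overflows the ten-digit word")
    return word if product >= 0 else -word


def fits(m, n):
    """Check the stated limits: m <= 30, n <= 59 and m(n + 1) < 1400."""
    return m <= 30 and n <= 59 and m * (n + 1) < 1400


print(to_fixed(12.5))                             # 125000
print(fixed_mul(to_fixed(1.5), to_fixed(2.25)))   # 33750, i.e. 3.3750
print(fits(30, 45), fits(30, 46))                 # True False
```

With six digits before the point and four after, a coefficient must stay below $10^6$ in magnitude, which is why some scaling of the data is normally needed before a fixed-point code of this kind is run.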
7.1.3.2 Linear Programming-10.1.002
This program has been written for processing linear programming problems having a maximum of 97 equations, not including the objective functions, and an unlimited number of variables. This program uses the method of "recursive generation of vectors for the modified simplex method" as described by Eisemann [60a]. 18-digit double precision floating point computation is used. The programmer has chosen a value of $\epsilon$ (illegible in this copy) for the check sum computation for a problem consisting of 33 equations and having a largest element of magnitude $4 \times 10^{\ldots}$ (exponent illegible). For problems of larger order and having the same magnitude of the largest element, it is felt that a different value of $\epsilon$ (also illegible) may be more suitable.

7.1.3.3 Linear Programming-10.1.009

This procedure will solve linear programming problems containing up to 40 equations and any number of variables. Fixed decimal arithmetic is used, with five digits on each side of the decimal point. Using the modified simplex method, column vectors representing properties of the different variables are brought into the calculation one at a time, multiplied by the B matrix to bring them up to date, and checked to see whether they should be brought into the basis or not. They are brought into the basis if the profit function is more negative than an arbitrary tolerance. Successive passes of the input cards for all the variables are made until, on an entire pass, no variable is brought into the basis. Then the tolerance is lowered and more passes are made in the same way until, with the smallest practical tolerance, a pass is made during which no variable is brought into the basis.
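The pass-and-tolerance scheme of code 10.1.009 can be sketched as follows. This is a minimal illustration under stated assumptions, not the 650 program: it minimises $c \cdot x$ subject to $Ax = b$, $x \ge 0$, starts from a slack basis whose columns form an identity matrix, keeps an explicit basis inverse, and uses exact fractions in place of fixed decimal arithmetic. The function name `solve_by_passes` and the list of tolerances are hypothetical, and no safeguard against cycling on degenerate problems is included.

```python
from fractions import Fraction as F


def solve_by_passes(A, b, c, basis, tolerances):
    """Minimise c.x subject to A x = b, x >= 0 by repeated passes.

    basis[i] is the column that is the i-th unit vector at the start.
    tolerances runs from coarse to fine, for example [1, 0].
    """
    m, n = len(A), len(A[0])
    Binv = [[F(int(i == k)) for k in range(m)] for i in range(m)]
    xB = [F(v) for v in b]
    basis = list(basis)
    for tol in tolerances:
        entered = True
        while entered:                      # repeat passes at this tolerance
            entered = False
            for j in range(n):              # one pass over all the columns
                if j in basis:
                    continue
                # Bring the column up to date: y = Binv * a_j.
                y = [sum(Binv[i][k] * A[k][j] for k in range(m))
                     for i in range(m)]
                d = c[j] - sum(c[basis[i]] * y[i] for i in range(m))
                if d >= -tol:
                    continue                # not attractive at this tolerance
                rows = [i for i in range(m) if y[i] > 0]
                if not rows:
                    raise ValueError("objective is unbounded")
                r = min(rows, key=lambda i: xB[i] / y[i])   # ratio test
                pivot = y[r]
                Binv[r] = [v / pivot for v in Binv[r]]
                xB[r] = xB[r] / pivot
                for i in range(m):
                    if i != r and y[i] != 0:
                        Binv[i] = [Binv[i][k] - y[i] * Binv[r][k]
                                   for k in range(m)]
                        xB[i] -= y[i] * xB[r]
                basis[r] = j
                entered = True
    x = [F(0)] * n
    for i, j in enumerate(basis):
        x[j] = xB[i]
    return x


# Maximise 3x + 2y with x + y <= 4, x + 3y <= 6, x <= 2 (slacks in columns 2-4).
A = [[1, 1, 1, 0, 0], [1, 3, 0, 1, 0], [1, 0, 0, 0, 1]]
x = solve_by_passes(A, [4, 6, 2], [-3, -2, 0, 0, 0], [2, 3, 4], [1, 0])
print(x[0], x[1])   # 2 4/3
```

A coarse first tolerance lets early passes skip columns that would give only a small improvement, which matters when every pass means re-reading the whole card deck.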
7.1.3.4 The Transportation Problem-10.1.003

This program has been designed to solve the standard transportation problem. It also is possible to obtain alternate optimal solutions. A problem is "small" if $m$, $n$ satisfy the following inequalities:

$2(m + n) + [\,\ldots\,] + [\,\ldots\,] + [\,\ldots\,] < 1200, \qquad n < 100.$

Similarly, a problem is characterized as being "big" if $m$, $n$ satisfy one or both of the inequalities:

$2(m + n) + [\,\ldots\,] + [\,\ldots\,] + [\,\ldots\,] > 1200, \qquad n > 100.$

(The three bracketed fractions, which involve $m$ and $m + n - 1$, are illegible in this copy.)
The brackets indicate that the fraction is rounded up. The present program will solve a transportation problem for which $m$, $n$ satisfy either the "small" or "big" inequalities. In the case of a "small" problem, the entire cost matrix $(c_{ij})$ is stored on the drum. Only a single row of the cost matrix is stored on the drum in a "big" problem and the cost matrix must be continually circulated through the read feed. It is therefore advantageous to process a given problem as a small problem whenever possible, as this eliminates the $c_{ij}$ card reading time inherent in the big problem. All input data are restricted to a maximum size of five digits. The iterative method employed is essentially the same as the "stepping stone" method. All operations are performed using fixed point arithmetic.

7.1.3.5 Linear Programming Code for the Augmented 650-10.1.006

The composite algorithm is used in somewhat the same manner as used in the RAND Corporation RS LPS1 704 program. All computations are performed in single precision floating point. The size of the problem which can be handled is restricted by the relationship $(m + 2)(n + 2) \le 1900$, where $m$ is the number of restrictions and $n$ is the number of independent variables excluding the original basis variables. The precise time required per iteration depends on whether the existing solution is feasible, and on the number of zeros in the matrix. As an approximate figure, a problem with $m = 17$ and $n = 40$ requires about 20 sec per iteration. The basic 650 with floating point, index registers, and immediate access storage is used.
7.1.3.6 Network Problems

This code is designed to handle networklike problems of a type more general than the transportation problem. It can solve such problems having a maximum of 96 nodes. It was written by C. E. Lemke and T. J. Powers of the Department of Mathematics, Rensselaer Polytechnic Institute.

7.1.3.7 A Linear Programming System

This code for the 650 has been especially designed to meet the requirements of the petroleum industry. It is probably the most widely used 650 linear programming code and is available on a fee basis from Bonner and Moore Engineering Associates, Houston, Texas. It is a group of inter-related programs, with the output of one program serving as the input for the next. All of the programs were developed for use with a minimum 650. The output procedures were designed for a 407, but could easily be adapted to a 402 or 405. The code uses a computational modification of the simplex method which permits omission of the identity matrix from computer memory. Running time is approximately 15 msec per element per iteration (based upon the number of elements in the full matrix, including the identity matrix). Limiting size is determined by the inequality

$(m + 2)(n - m + 3) < 1731,$
where $m$ and $n$ refer to the number of restrictions and variables, respectively, including the identity matrix. For example, for a system of 40 restrictions, up to 78 variables, including the 40 variables in the original basis, may be handled. Operating speed for a matrix of this size would be approximately 50 to 60 sec per iteration. Experience has shown that optimization is usually obtained in a number of iterations ranging from 1 to 2 times the number of restrictions, when using this calculation procedure. Provision is made for assigning objective function coefficients to the variables in the original basis, if they represent alternate dispositions or slack variables with real debits or credits attached. The output of this section of the program consists of a single card punched at the conclusion of each iteration, containing an iteration count, the current value of the objective function, and the identification of the variables entering and leaving the basis during the current iteration. There is no card reading or card handling of any sort during the process of optimization. When the optimum is reached, the program will load the answer punchout routine described below. Provision has been included for detecting and identifying an unbounded solution, and for resolving degeneracy.
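The size limit of this code, $(m + 2)(n - m + 3) < 1731$, and its 40-restriction example can be checked with a few lines of arithmetic. The function names below are hypothetical. The per-iteration estimate counts the matrix as $m \times n$ elements, which is an assumption: the description does not say exactly which rows and columns enter the 15 msec figure.

```python
def max_variables(m, limit=1731):
    """Largest n satisfying (m + 2) * (n - m + 3) < limit."""
    k = (limit - 1) // (m + 2)      # largest k with (m + 2) * k < limit
    return k + m - 3                # because k stands for n - m + 3


def seconds_per_iteration(m, n, msec_per_element=15):
    """Rough time per iteration, counting m * n elements (an assumption)."""
    return m * n * msec_per_element / 1000


print(max_variables(40))              # 78
print(seconds_per_iteration(40, 78))  # 46.8
```

The first result reproduces the stated figure of 78 variables for 40 restrictions. The second is a little below the quoted 50 to 60 sec, which suggests that the full-matrix element count includes a few extra rows or columns.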
Although the entire calculation is done in fixed decimal arithmetic, special provisions have been made to minimize rounding errors. Experience with this program has shown that these errors do not accumulate to significant levels within the number of iterations usually required to achieve optimization. Although some preliminary scaling of data is necessary, this rarely needs to go beyond a careful choice of units. The program is completely trapped against loss of leading digits through overflow conditions.

The answer program is read into the computer after optimization has been achieved, and controls the punchout of the final solution. This punchout consists of three parts. The first group of cards includes one card for each variable in the final basis. This card contains a four digit number identifying the variable (arbitrarily assigned by the user when preparing the original data), the final activity of that variable ($b_i$), the row number in the final matrix corresponding to the variable, and the range of values ($c_j$) which its objective function coefficient could take without destroying the optimality of the current basis. The second group of cards contains one card for each variable not in the final basis. The information on these cards includes the variable identification, the shadow price, the column number, and the range of nonzero activities which the variable may take without destroying the feasibility of the current basis. The third group of cards is optional. It is a reloadable punchout of all of the nonzero elements of the final solution matrix.

After completion of the punchout, the solution matrix remains undisturbed in memory. At this point, new objective function coefficients may be introduced through the use of an auxiliary "cost change" program. The computer will search out the row or column in which the variable in question is located, revise the objective function coefficient, determine whether optimality has been destroyed by the change (or changes), and reoptimize if necessary. In similar fashion, it is possible to specify a new requirements vector. The computer will multiply the new requirements vector by the inverse of the original basis, deriving a new solution vector in terms of the present basis variables. It will then test the new solution vector for infeasibilities (negative elements). If any infeasibilities exist, they will be removed by manipulation in the dual simplex algorithm. Associated with this code is a matrix generator which automatically develops the input matrix for a large class of blending problems.

7.1.3.8 Multidimensional Distribution Problem
The reader is referred to the similar section in the 704 code descriptions for a definition of the notation and terms which follow.

Approximation procedure

$k \le 7$, $1 \le p_1, p_2, \ldots, p_k \le 15$, $c_{ijh} \le 10^5$

For $k = 4$, $p_1 = p_2 = p_3 = p_4 = 3$, the code took 2 min and 35 sec to solve the problem.

Reduced matrices

$k \le 7$, $1 \le p_1, p_2, \ldots, p_k \le 15$, $c_{ijh} \le 10^4$, $s(s - k + 1) \le 235$, where $s = \sum_{i=1}^{k} p_i$.
7.1.3.9 Dual Simplex Algorithm

This program uses the dual simplex algorithm and is available from Mrs. Ann L. Hillier at the Applied Mathematics and Statistics Laboratory, Stanford University. Table V gives values for $m$ and $n$ based on the number of negative coefficients in the objective function (maximizing) and the number of equalities in the system.

TABLE V. Limits on $m$ and $n$ by number of negative coefficients ($f$) and number of equalities ($e$). The row and column layout is not recoverable from this copy. The legible column headings are $e = 0$, $0 < e < 18$, and $18 \le e \le 19$; the legible entries are "$m \le 20$, $n \le 60$", "$m \le 19$, $n \le 59$" (several times), and "not allowed".
The code punches out the following information when the optimal solution has been reached: the value of the objective function, the values of the variables, and the iteration count.
7.1.3.10 Bataafse Internationale Petroleum Maatschappij N.V. at The Hague

The following linear programming codes have been devised:

(a) A code using the straightforward algorithm for systems of less than 60 equations.
(b) A code using the straightforward algorithm for systems of less than 120 equations.
(c) A code using the explicit inverse algorithm for systems of less than 120 equations. Problems of, for example, 119 equations and 175 nonbasis variables have been solved in 8-12 hr.

Postoptimal routines for the codes mentioned under (a) and (b) are also available. The code mentioned under (c) is 20% to 30% faster than the code mentioned under (b).

7.1.4 IBM 705
7.1.4.1 Product Inverse Linear Programming-E2-OOb-O

This program will calculate optimum solutions for problems involving up to 99 linear constraints and 120 variables. In addition to handling the usual types of problem, the program contains a partitioning feature useful in solving block-angular (e.g., multigrade blending) problems. Multiple profit functions and/or multiple requirements vectors may be submitted for all problems. An inversion routine is available to start with some prior basis other than unit positive, if desired. Matrix element data are submitted to the program in eight digit fixed point form, with the decimal point located in the middle. Zeros are not entered. Up to 19 nonzero elements in each structural vector are permissible; 59 nonzeros in right-hand sides. Mathematical operations are carried out in thirteen digit fixed form (with the decimal point after the fourth place) to maintain digital accuracy and, at the same time, to permit a speed of calculation not possible with floating point numbers. Vectors are condensed; no zeros are carried.

As each vector to be brought into the solution basis is determined, the ratio of the total number of iterations to the number of nonunit vectors in the basis is calculated and compared with a programmed constant (currently set at 1.7). Whenever the ratio exceeds the constant, the basis is reinverted. This same check is also made (in this case against a constant of 1.3) each time it is found necessary to partition the product form of the inverse to store parts of it on the drum and tape. This device speeds up the solution of a problem and promotes digital accuracy. While the constants used in these checks should be set at a level dependent upon the type of problem being run, it was felt that their level would be more or less fixed for a given machine installation and should not be left to the discretion of problem planners.
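The reinversion rule just described reduces to a ratio test, sketched below. The function name and arguments are hypothetical. Treating a basis with no nonunit vectors as never needing reinversion is an assumption. Whether "total number of iterations" is counted from the start of the run or from the last reinversion is not stated in the description, and the sketch leaves that choice to the caller.

```python
def should_reinvert(iterations, nonunit_vectors, partitioning=False):
    """Return True when iterations / nonunit_vectors exceeds the constant.

    The constant is 1.7 in the ordinary check and 1.3 when the product
    form of the inverse is about to be partitioned onto drum and tape.
    Integer arithmetic avoids floating point comparisons at the boundary.
    """
    if nonunit_vectors == 0:
        return False                      # assumption: nothing to reinvert
    tenths = 13 if partitioning else 17   # the constant, in tenths
    return 10 * iterations > tenths * nonunit_vectors


print(should_reinvert(18, 10))                     # True  (1.8 > 1.7)
print(should_reinvert(17, 10))                     # False (1.7 is not exceeded)
print(should_reinvert(14, 10, partitioning=True))  # True  (1.4 > 1.3)
```

The point of the test is that each iteration appends a factor to the product form, so once the number of factors greatly exceeds the number of nonunit columns actually in the basis, rebuilding the inverse from those columns gives a shorter and more accurate representation.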
7.1.4.2 Linear Programming-E1-001-0

This code will solve linear programming problems containing up to 60 equations and any number of variables. Fixed decimal arithmetic is used, with five digits on each side of the decimal point. Using the modified simplex method, column vectors representing properties of the different variables are brought into the calculation one at a time, multiplied by the B matrix to bring them up to date, and checked to see whether they should be brought into the basis or not. They are brought into the basis if the profit function is more negative than an arbitrary tolerance. Successive passes of the input data for all the variables are made until, on an entire pass, no variable is brought into the basis. Then the tolerance is lowered and more passes are made in the same way until, with the smallest practical tolerance, a pass is made during which no variable is brought into the basis.

7.1.4.3 Transportation Problem

This program can accommodate matrices up to 2500 in size, i.e., $m + n \le 2500$, and $n \le 500$. Consequently the program can solve a matrix with 2000 destinations and 500 sources, and any combination up to 2498 destinations and 2 sources can be handled. It first selects a feasible solution by successive allocations to the minimum cost routes starting with the first row. The basic computational procedure is an adaptation of the standard simplex algorithm for the transportation problem.
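The starting rule mentioned for the 705 transportation code, successive allocations to the minimum cost routes starting with the first row, might look like the sketch below. It is an illustration only: the function name is hypothetical, rows are taken to be sources and columns destinations (the description does not say which is which), and total supply is assumed to equal total demand.

```python
def row_minimum_start(cost, supply, demand):
    """Build a feasible shipping plan row by row, cheapest routes first."""
    supply, demand = list(supply), list(demand)
    ship = [[0] * len(demand) for _ in supply]
    for i, row in enumerate(cost):
        # Visit the destinations of this row in order of increasing cost.
        for j in sorted(range(len(demand)), key=lambda col: row[col]):
            amount = min(supply[i], demand[j])
            ship[i][j] = amount
            supply[i] -= amount
            demand[j] -= amount
            if supply[i] == 0:
                break
    return ship


cost = [[4, 1, 3],
        [2, 5, 2]]
plan = row_minimum_start(cost, supply=[30, 40], demand=[20, 25, 25])
print(plan)   # [[0, 25, 5], [20, 0, 20]]
total = sum(c * q for crow, qrow in zip(cost, plan) for c, q in zip(crow, qrow))
print(total)  # 120
```

A plan built this way is feasible but usually not optimal; it only gives the simplex-type iterations a reasonable place to begin. It may also use fewer than $m + n - 1$ routes, in which case a real code has to add zero-level routes to complete the basis.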
7.1.5 IBM 7070

At the time of writing, codes for the IBM 7070 are in the process of being completed. Two are being written by A. E. Speckhard of IBM's Western Region. One code, which uses an internally stored simplex algorithm, will permit a matrix area of approximately 9000 entries for a 10K machine, and approximately 4000 entries in a 5K machine. It permits parametric programming on the objective function. A more versatile code which can operate on problems of the order $300 \times 1000$ is also being written. This code will use the product form of the revised simplex method. It will be written in FORTRAN II and will have both single and double precision options. The use of FORTRAN will enable individual users to modify such elements as selection algorithms, pricing algorithms, etc.

A third 7070 code is being prepared by the Koninklijke/Shell-Laboratorium, Amsterdam. It will use the revised product-form algorithm and call for 10,000 word core memory, at least 6 tapes, and 2 channels. The code will comprise a complete set of amendment routines, including parametric programming. The size of the problem is restricted by the formula $11m + 1.1n \le 7000$. The code will use single precision with a double precision reinversion.
7.1.6 IBM 1620

7.1.6.1 General Linear Programming Problem

The size of the problem which can be solved is restricted by the following relationship:

$(m + 2)(n - m + 3) \le$ approximately 1600.

A matrix with 30 restrictions and 70 total variables will require approximately 25 sec per iteration, including printing of the pertinent iteration data. The simplex algorithm is employed and arithmetic is in floating point mode. Approximately the first 4000 locations of storage are required for programs and tables. The remainder of storage is used for data and one working column.

7.1.6.2 Transportation Problem

The maximum matrix size is $mn \le 1600$. A $40 \times 40$ matrix will require approximately 20 sec per iteration. The program achieves solutions in accordance with the network flow, or Hungarian, method as adapted by Ford and Fulkerson. Fixed point arithmetic is used. Approximately 6400 locations for tables and instructions are required.

7.1.7 IBM RAMAC 305
This code will solve the general linear programming problem of dimensions up to $82 \times 97$. It uses the original simplex method and floating point arithmetic, and can be used for straight matrix inversion ($48 \times 48$).

7.1.8 REMINGTON RAND UNIVAC I-II

7.1.8.1 Bounded Variable Simplex

This code uses the standard simplex algorithm with the bounded variables technique. With no bounds the problem size is constrained by

$m \le 18$, $n \le 99$, $m < n$,
$(m + 2)(n + 1) \le 450$,

and with upper bounds the second restriction is

$(m + 2)(n + 1) + \ldots \le 450$ (the added term is illegible in this copy).
7.1.8.2 UNIVAC-Air Force Simplex System

The code uses the revised simplex method and can compute a "near optimum" solution if a bound on the sum of the variables is known. Dimensions are restricted by

$1 \le m \le 119$, $n = 998$
$120 \le m \le 179$, $n = 665$.
7.1.8.3 xpom-Transportation Problem

This procedure uses Dantzig's algorithm and an efficient scheme for using tapes and cores to cut down the number of tape passes. It has a number of options for finding the first feasible solution. Problems are limited by

$2 \le m \le 8$, $n \le 8999$
$9 \le m \le 18$, $n \le 6000$
$19 \le m \le 28$, $n \le 4000$.

A time estimate is given by $t = mn/300$ min. It can also determine alternate optimum routes. It is recommended for long transportation problems. For example, a $15 \times 488$ problem took 20 min. The program defines an $m \times n$ nucleus program which is contained in high speed memory and this subproblem is optimized. The remainder of the $m \times n$ problem is stored on tape by columns and is scanned for possible improvements a column at a time.

7.1.8.4 Critical Path Method

For description see Section 5.1.

7.1.9 REMINGTON RAND 1103, 1103A, 1105

Codes, but little information about them, were available for these computers. Recent plans call for a new program to be developed for the 1103A which will incorporate many of the recent advances in linear programming and solve problems of the order $1000 \times n$, where $n$ is unlimited. For codes for the critical path method see Section 5.1.

7.1.10 WHIRLWIND COMPUTER⁶
⁶ At Massachusetts Institute of Technology.

7.1.10.1 Transportation Algorithm

The Whirlwind is a parallel binary machine with a 16-bit word length and an instruction speed of about 40,000 operations per second. It had (at the time this program was coded) 2016 registers of fast-access magnetic core storage, 40,000 registers of auxiliary storage in two magnetic drums, and magnetic tapes (used principally as an output medium).

The size of the problem that can be solved by the program is limited by the size of basis table which would fit into the fast-access cores, and the amount of cost data that could be stored on one of the magnetic drums. These restrictions imposed three limits, only partly legible in this copy: a bound of 127 on a sum beginning with $m$, $n \le 401$, and a third quantity limited to approximately 10,000. Only the essential costs are stored. All costs that are not specified in the data are assumed to be infinite. This allows the solution of larger problems within the above restrictions. The largest problem that has been solved with the routine has been a warehouse allocation problem with 54 supplies, 343 demands, and 8200 noninfinite costs. Nine problems of this size were run on Whirlwind recently in an average time of 25 minutes each.

According to usual practice with the stepping stone method, the entire cost matrix is examined at each iteration and the position is selected that gives the greatest incremental cost, $\Delta C = U_i + V_j - C_{ij}$. A new element is placed at this position in the shipment matrix for the next iteration. When the logical search procedure of Dennis [52] is used on a high speed computer, the time required to carry out the operations of one iteration becomes short compared to the time required to search through the entire cost data in finding the new element to be put in the basis. Thus, the total time for solution might be cut down if the time spent searching the cost data per iteration were shortened at the expense of a larger number of iterations.

Two alternate methods of selecting the element for the next iteration were tried. In one of these, the cost matrix is scanned row by row and the first position for which $U_i + V_j - C_{ij}$ is positive is selected. After each iteration, the search is resumed where it was broken off. In the other method, a complete row of the cost matrix is examined and the position in this row with the greatest incremental cost $\Delta C = U_i + V_j - C_{ij}$ is chosen. For the next iteration the next row is examined in the same way. The two alternate methods are compared below with the usual rule for a problem
TABLE VI. COMPARISON OF METHODS FOR A 30 BY 260 PROBLEM

Method of selecting new element        Number of     Number of examinations   Time
                                       iterations    of cost matrix           (min)
First which would reduce total cost    2200          27                       25.0
Best in row of cost matrix             674           41                       9.6
Best position in cost matrix           508           508                      19.4
with 30 supplies, 260 demands, and about 8000 noninfinite costs. The method of searching one row at a time gives the best results.
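The three selection rules compared in Table VI can be written down compactly. The sketch below is an illustration, not the Whirlwind routine: the function names are hypothetical, the cost matrix is held as a dense list of lists, and unspecified (infinite) costs are represented by `float("inf")`, whereas the original stored only the essential costs on the drum. Each function returns the chosen cell, and the two alternate rules also return the point at which the next search should resume.

```python
INF = float("inf")


def gain(U, V, C, i, j):
    """Incremental cost U_i + V_j - C_ij (minus infinity if C_ij is infinite)."""
    return U[i] + V[j] - C[i][j]


def best_overall(U, V, C):
    """Usual rule: examine the whole cost matrix, take the greatest gain."""
    g, i, j = max((gain(U, V, C, i, j), i, j)
                  for i in range(len(U)) for j in range(len(V)))
    return (i, j) if g > 0 else None


def first_positive(U, V, C, start=0):
    """Scan row by row from `start`; take the first cell with positive gain."""
    m, n = len(U), len(V)
    for step in range(m * n):
        pos = (start + step) % (m * n)
        i, j = divmod(pos, n)
        if gain(U, V, C, i, j) > 0:
            return (i, j), pos + 1      # resume just after this cell
    return None, start                  # no candidate: current plan is optimal


def best_in_row(U, V, C, row=0):
    """Examine one row at a time; take the best cell of the first useful row."""
    m, n = len(U), len(V)
    for step in range(m):
        i = (row + step) % m
        g, j = max((gain(U, V, C, i, j), j) for j in range(n))
        if g > 0:
            return (i, j), (i + 1) % m  # next iteration starts on the next row
    return None, row


U, V = [0, 2], [3, 1, 4]
C = [[5, 1, 3],
     [4, INF, 2]]
print(best_overall(U, V, C))       # (1, 2)
print(first_positive(U, V, C))     # ((0, 2), 3)
print(best_in_row(U, V, C, row=0)) # ((0, 2), 1)
```

The trade-off in the table follows from how much of the cost data each rule reads per iteration: the full scan reads everything every time, the first-positive rule reads very little but accepts poor candidates, and the row rule sits in between.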
7.1.11 RCA 501
7.1.11.1 Transportation Problem
This program solves the transportation problem on the RCA 501 using a modification of the Dennis technique. It uses the Vogel method for getting the first feasible solution. It is particularly good for problems with sensitive cost matrices because use of the Vogel method enables the problem to be cut off before completion with a good (near-optimum) solution. The minimum machine requirements are:

1 module (16,384 characters) of high speed memory
1 tape station
1 on-line printer
1 paper tape reader.

Problem restrictions: $m + n \le 64$. Cost data must be $\le 10^4 - 1$. Rim requirements must be $\le 10^{\ldots} - 1$ (exponent illegible in this copy). Approximate timing is as follows:

$5 \times 5$: 20 sec
$10 \times 15$: 1 min 10 sec
$25 \times 30$: 3 min 10 sec.

7.1.12 BESK AND FACIT EDB COMPUTERS⁷
⁷ FACIT Electronic, Stockholm.

Solution of the general linear programming problem is accomplished by the original simplex method. The limitations are given by a formula (missing from this copy) in which $[z]$ is the least integer $\ge z$. The transportation code can handle problems with 32 origins and 256 destinations.

7.1.13 PHILCO S-2000

Three basic codes are completed or under development. The first, which has been completed, uses the composite algorithm. It can solve any problem whose tableau can fit in the internal high speed memory. An initial basis can be selected. For a 32K machine, the maximum problem size is approximately $m \times n < 31{,}000$, where $m$ = number of equations and $n$ = number of nonbasic variables. A $38 \times 41$ problem took a total of 6 sec, inclusive of input and output edits.

A code for solving the general problem which uses magnetic tapes was in the final stage of check-out at this writing. It uses the composite algorithm, can handle multiple and parametric right-hand sides and objective functions, and an initial basis can be selected. It uses fixed and floating point double precision mode. Arithmetic quantities are carried to 94 bits. For a memory capacity of $M$ locations, the most stringent requirement is $12m < M$. An 8K machine will be able to handle over 600 equations.

The third code, which has been completed, will handle transportation problems which satisfy the restriction $6m + 5n + mn < M$. The procedure used is the stepping-stone method.
7.1.14 ZEBRA COMPUTER⁸

The procedure for solving the general problem uses the product form of the inverse of the revised simplex procedure. It uses fixed point arithmetic and problem dimensions are constrained by $m \le 47$, $n \le 60$, and $N(m + 1) \le 1986$, where $N$ = number of iterations. A $47 \times 60$ problem yielded the approximate timing formula of $60 + 2N$ sec for the $N$th iteration.
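The ZEBRA limits can be turned into a rough upper bound on run time. The short calculation below assumes that the constraint $N(m + 1) \le 1986$ caps the number of iterations for a given $m$ and that the timing formula $60 + 2N$ sec holds for every iteration up to that cap. The function names are hypothetical.

```python
def max_iterations(m, capacity=1986):
    """Largest N with N * (m + 1) <= capacity."""
    return capacity // (m + 1)


def total_seconds(iterations):
    """Sum of 60 + 2N seconds over iterations N = 1, ..., iterations."""
    return sum(60 + 2 * N for N in range(1, iterations + 1))


N = max_iterations(47)
print(N)                      # 41
print(total_seconds(N))       # 4182
print(total_seconds(N) / 60)  # 69.7 (minutes)
```

For the largest problem, $m = 47$, at most 41 iterations fit, and running all of them would take about 70 min under these assumptions. The growth of the per-iteration time with $N$ reflects the product form, in which each iteration adds one more factor to be applied.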
7.1.15 FERRANTI MERCURY

7.1.15.1 Codes for Solving the General Linear Programming Problem

SIMMER

This code uses the contracted simplex algorithm. The bounds on the problem size are as given in Table VII.

TABLE VII (column headings missing in this copy; the third column equals the product of the first two)

 32    256     8,192
 64    192    12,288
 96    128    12,288
128     96    12,288

A $66 \times 59$ problem was solved in 9 min and an $89 \times 104$ problem in 20 min.

⁸ Standard Telephone and Cable Ltd.
SIMPAX

This code does not store zero elements. The space occupied by the matrix varies in the course of the computation, but at no time must it exceed 9856 words for standard drum. Size limits are $m + 2 \le 256$ and $n + 1 \le 256$. A problem with $m = 119$, $n = 119$ with initial density of 8% and maximum density of 24% took 25 min to solve. Extensions of this code have been developed by installations containing larger drums and magnetic tape.
MULTIPLEX

The multiplex method of Frisch [73], which is described in a preceding section, has been programmed for the Mercury computer by Dahl [32]. In Dahl [33], a comparison is made between the Simplex Mercury code and the multiplex code. From this study one cannot draw any major conclusions except that the multiplex method appears to be faster for a certain small class of problems.
TRANSMERC (Transportation Code)

This code uses the stepping stone method and the problem limits are $m \le 192$, $n \le 224$, $n \le 256 - m$, and a further bound of $32\,[360/m]$, where brackets indicate that the fraction is rounded up. Examples of problems are:

m     n      Total time (min)    Number of iterations
29    102    40                  216
31    87     27                  183
63    33     17                  (illegible)
These times include input of programmed data and output of results. Computing time is approximately 2/3 to 3/4 of the total.

The following report from the Computer Development Division of Shell International, London, describes their work and other programs for the Mercury Computer.

In the Computer Development Division, London, a small group of programmers are working on the mathematical and computational aspects of mathematical programming. The group uses a Ferranti 'Mercury' computer which has a fast store of 1024 words each of 40 bits, a drum store of about 32,000 words and a backing store of up to 8 magnetic tapes (6 installed at present) divided between two controls which are autonomous, i.e., transfers from either or both systems of magnetic tapes may be carried out while calculation is in progress. The computer uses floating-point binary arithmetic, a one-address code and has a speed of between 60 and 180 microseconds per instruction.

A. Linear Programming

(a) A program using the straightforward simplex algorithm was completed in 1959 for Mercury using only the fast store and the drum store. For typical sizes of problems and times of solutions see Table I.

(b) Using the same algorithm, but employing the tape facilities, another program was completed in early 1960. Due to the autonomous nature of the magnetic tapes, this program takes only between 1/2 and 2/3 of the time required by program (a) to solve the same problem and is capable of solving larger problems. (See Table II.)

Features in common to both programs are: compact storage of the matrix, i.e., only non-zero elements are stored, in (a) on the drums, in (b) on the tapes, while the row numbers are stored separately; the basis matrix is not stored explicitly; costs may be put on the slacks in the original basis set; very small elements are eliminated. This cut-out facility operates at different levels at the operator's discretion; the cost row is recalculated every 20 cycles and checked against the cost row transformed with the matrix. Extra checks of this type may be put in after any cycle; an optional intermediate output is given after each cycle showing number of artificial slacks remaining, number of negative elements remaining in the cost row and the most negative of these, variables entering and leaving the basis this cycle, current value of the valuation function and the size of the pivot element.

The following further facilities are provided in programme (b): At the operator's discretion the whole matrix and contents of the drum store may be "frozen" onto a fifth tape deck and "melted" again to re-enter the calculation when required. This procedure is carried out at regular intervals during a long run to obviate serious loss of time due to machine failure; devices for dealing with very small pivot elements have been included in the programme; the final output of the non-basis set includes the following ranges:
where a_ij is the element of the matrix in the ith row and jth column and b_i is the right hand side of the ith equation. In conjunction with the amendments routine described below, these ranges facilitate parametrization of one variable at a time;
a pre-inversion routine is about to be developed and will be used for solving problems for which a good solution is already known, with a consequent reduction of calculation time. It will also be used to eliminate rounding errors in problems in which an optimum has been reached where the results are possibly suspect;
amendment routines have been added, enabling alterations to the costs or to the right hand side of the equations to be made, or the name of any variable to be changed;
problems which have negative right hand sides which are not opposite artificial slacks may be solved.
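The compact storage listed among the common features (only non-zero elements kept, row numbers held separately, very small elements cut out) can be illustrated with a minimal Python sketch. The function names and the cut-off value are assumptions made for illustration; the source does not give the Mercury storage layout or the cut-off levels.

```python
from typing import List, Tuple

# A packed column: row numbers and the matching non-zero values,
# held in two parallel lists (hypothetical layout for illustration).
SparseColumn = Tuple[List[int], List[float]]


def pack_column(dense: List[float], cutoff: float = 1e-9) -> SparseColumn:
    """Keep only elements whose magnitude exceeds the cut-off level."""
    rows: List[int] = []
    values: List[float] = []
    for i, x in enumerate(dense):
        if abs(x) > cutoff:
            rows.append(i)
            values.append(x)
    return rows, values


def dot_with_dense(col: SparseColumn, vec: List[float]) -> float:
    """Inner product of a packed column with a dense vector."""
    rows, values = col
    return sum(v * vec[i] for i, v in zip(rows, values))


col = pack_column([0.0, 2.5, 0.0, 1e-12, -4.0])
print(col)
print(dot_with_dense(col, [1.0, 2.0, 3.0, 4.0, 5.0]))
```

Only the stored elements take part in the inner product, which is the reason such a layout pays off when the matrix is mostly zeros.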
TABLE I. L.P. PROBLEMS RUN USING PROGRAMME (a) (DRUMS ONLY)

                                                  Time (min)
Type          Size of problem   No. of cycles   Input   Calculation   Output
Type ‘A’      200 × 168         229             3       35            7
              200 × 168         238             3       62            7
              200 × 168         251             3       52            7
              200 × 168         222             3       43            7
Type ‘B’      103 × 181          85             2        7            5
              103 × 181         107             2       11            8
              103 × 181         105             2        8            5
Type ‘C’       84 × 127         167             2       29            9
               84 × 127         146             2       24            9
Type ‘D’      139 × 210         299             3       93            7
Type ‘E’(a)   534 × 371         426             6       62            23

(a) An exceptional problem containing many zero or unit elements (i.e., practically a transportation problem).
TABLE II. L.P. PROBLEMS RUN USING PROGRAMME (b) (MAGNETIC TAPES AND DRUMS)

Type          Size of problem   No. of cycles   Time (including input and output)
Type ‘F’      276 × 255         527             4 hr 14 min
              276 × 255         557             4 hr 54 min
              276 × 255         480             5 hr 5 min
Type ‘G’(a)   260 × 220         187             35 min
              265 × 270         249             32 min
Type ‘H’(a)   337 × 308         355             1 hr 11 min

(a) This type of problem was special in that about 100 to 120 of the equations were merely upper bounds restricting a variable.
A direct comparison of the programmes (a), (b), and an equivalent programme written by Pigot for the IBM 704 computer gave for a test problem of size 219 × 264:

Pigot coding (with IBM 704)   50 min
Programme (a) (Mercury)       48½ min
Programme (b) (Mercury)       25 min
The Pigot programme is a refinement of the RS Lrsl programme, which was written for the IBM 704 in Paris, and gave a considerable reduction in calculation time with many problems compared with the earlier method.
(c) A programme based on the revised product form algorithm is being developed and it is hoped that it will be able to solve much larger problems, although no estimate of the time required for calculation is yet available. The programme will accept up to 500 equations and will include full parametrization facilities.
B. Discrete Programming
Work has been started on a programme to solve discrete variable optimization problems, following the method described by Benders et al. [20]. It will be possible to attempt problems with up to 120 constraints and 120 variables but if the number of constraints approaches this figure the calculation time will probably be very long. This work is being undertaken to give a time comparison with the Gomory method which is also being coded.
C. Transportation Programming
The stepping-stone method has been coded for the Mercury computer without the use of the magnetic tapes. An iteration in the stepping-stone method consists of 3 main phases.
(1) Finding the shadow costs, u_i and v_j.
(2) Finding the most negative w_ij = c_ij - u_i - v_j, where the c_ij’s are the costs.
(3) Transformation.
In phase (2) the search for the negative w_ij involves examination of the m × n elements of the cost matrix but phases (1) and (3) refer only to the m + n - 1 elements of the solution. In the programme the elements are sorted each cycle into an order from which the shadow costs can then be found in two passes through the solution. The time for both phases and the sorting can then be made proportional to n only. A further modification has been made to decrease the time spent in phase (2); a list of a number of the most negative w_ij is formed and is used in succession in minor cycles consisting of transformation and sorting only, and phases (1) and (2) are called in only when necessary. The time for a minor cycle becomes proportional to m + n and for a major cycle to m × n; the total time is probably proportional to (m + n)^2 although there may be a small element proportional to nm(m + n). Theoretical limitations on size are:
(i) m < 128
(ii) n < 381
(iii) m × [n/32] ≤ 772
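Phases (1) and (2) of a stepping-stone iteration can be sketched in Python as follows. This is an illustrative sketch only: it finds the shadow costs by repeated propagation over the basic cells rather than by the sorted two-pass scheme of the Mercury programme, and the names `shadow_costs` and `most_negative` are hypothetical.

```python
from typing import List, Optional, Tuple


def shadow_costs(costs: List[List[float]],
                 basis: List[Tuple[int, int]],
                 m: int, n: int) -> Tuple[List[float], List[float]]:
    """Phase (1): solve u_i + v_j = c_ij on the m + n - 1 basic cells, with u_0 = 0."""
    u: List[Optional[float]] = [None] * m
    v: List[Optional[float]] = [None] * n
    u[0] = 0.0
    pending = list(basis)
    while pending:
        waiting = []
        for i, j in pending:
            if u[i] is not None and v[j] is None:
                v[j] = costs[i][j] - u[i]
            elif v[j] is not None and u[i] is None:
                u[i] = costs[i][j] - v[j]
            elif u[i] is None and v[j] is None:
                waiting.append((i, j))   # neither end known yet; retry later
        if len(waiting) == len(pending):
            raise ValueError("basic cells do not form a connected tree")
        pending = waiting
    return u, v


def most_negative(costs, u, v):
    """Phase (2): scan all m x n cells for the most negative w_ij = c_ij - u_i - v_j."""
    best = None
    for i, row in enumerate(costs):
        for j, c in enumerate(row):
            w = c - u[i] - v[j]
            if w < 0 and (best is None or w < best[0]):
                best = (w, i, j)
    return best   # None means the current solution is optimal


costs = [[4, 6, 1],
         [5, 3, 2]]
basis = [(0, 0), (0, 1), (1, 1), (1, 2)]   # m + n - 1 = 4 basic cells
u, v = shadow_costs(costs, basis, 2, 3)
print(u, v, most_negative(costs, u, v))
```

The scan in `most_negative` touches every cell of the cost matrix, which is why the text treats phase (2) as the expensive part and keeps a list of several negative w_ij for use in minor cycles.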
In practice the total calculation time works out to be approximately (m + n)^2/60 seconds.
D. Decomposition
(a) Decomposition methods have been studied and a programme to use them in conjunction with the simplex algorithm is under development.
(b) A programme has already been developed to solve a linear programming problem consisting of a number of transportation problems linked to one central simplex problem. The linking variables may consist of any or all of the availabilities and requirements of the transportation-problems. The method of solution consists of the following stages:-
I. A feasible solution of the simplex problem is found.
II. The transportation problems are solved for the values of the availabilities and requirements given in the simplex solution.
III. The resulting shadow costs from the transportation problems are fed as costs on the variables of the simplex problem.
IV. With these costs, a number of iterations are performed on the simplex problem. Stages II, III and IV are repeated until no further progress seems likely.
V. The complete linear programming matrix is assembled.
VI. Iterations are repeated to reach the optimum.
Preliminary and very approximate indications give a calculation time equal to the time required to solve the simplex problem, plus 50%, plus 3 seconds for each transportation variable. If T is the number of transportation problems of size m_t × n_t (t = 1, 2, ..., T) attached to a simplex problem of size m_0 × n_0, then the following conditions are the theoretical limitations on the size of problems possible to solve with this coding.
(i) m_0 + Σ_t (m_t + n_t - 1) ≤ 1023
(ii) n_0 ≤ 1023
(iii) n_0 + (total no. of allowed movements in the transportation problems) ≤ about 4000
(iv) m_t ≤ 64
(v) n_t ≤ 128
(vi) T ≤ 40
7.1.16 FERRANTI MARK I
The following report prepared by the programming group of the Koninklijke/Shell-Laboratorium in Amsterdam describes their work and a number of programs for the Mark I.
At the Koninklijke/Shell-Laboratorium in Amsterdam (KSLA) a small group is working on the mathematical and computational aspects of mathematical programming problems. The group forms part of the industrial mathematics department, which among other things operates a Ferranti Mark I* computer (fast memory about 800 words (20 bits), drum memory 32000 words; fixed point binary arithmetic; speed about 1 msec per instruction, one address code).
1. Linear Programming
(a) The straightforward simplex algorithm was coded for the Ferranti machine in 1956. A great number of linear programming problems have successfully been solved with this code, including problems with about 150 constraints and 200 non-basic variables (in about 4-8 hours). Since 1956 a number of amendment routines and post-optimal programs have been made.
(b) In 1959 a comparative study was made of several computational algorithms for the simplex method, viz., the straightforward algorithm, the explicit inverse algorithm, the product-form algorithm and the revised product-form algorithm. The last algorithm was developed at the same time: it will be mentioned in point 1(c). In this study the number of multiplication operations, the number of additional bookkeeping instructions, the quantity of numbers to be processed, the accuracy, the flexibility and the complexity of the codes were considered. Several assumptions were made about the fullness of the LP matrices, about the way variables in the original basis behave, etc. These assumptions were tested with the help of some practical large linear programming problems and by using a special program which counted the number of multiplications in the different operations of the product-form algorithm as coded for the IBM 704 by the Rand Corp. There appeared to be good agreement between the theoretical formulae and the experimental results. The conclusion of the study was that the revised product-form algorithm is the fastest for large problems and a large computer, the explicit inverse algorithm being the second best.
(c) The revised product-form algorithm already mentioned in point 1(b) has been developed. It is a major revision of the product-form algorithm and differs from it in the following respects:
(i) Instead of calculating the shadow prices anew in each iteration, only those are recalculated which actually change, i.e., the new shadow prices are found by calculating the pivot row elements and by transforming the old shadow prices with the help of these pivot row elements.
(ii) The main row calculation is performed in the same way as the calculation of the shadow prices in the product-form algorithm, hence consists of two parts. However, in the first part (backward transformation) those prices are already found which belong to variables that have once been in the basis. This reduces the second part considerably and makes it possible to remove columns from the original matrix which are or have been in the basis.
(iii) If a variable re-enters the basis the forward transformation (updating the main column) is started there where the variable last left the basis.
(iv) During a re-inversion pivot elements are chosen in such a way that a minimum number of non-zero elements is created in the new “eta vectors.” If a singular matrix is re-inverted there will be an automatic rejection of one or more columns.
More information on the comparative study of point 1(b) and the revised product-form algorithm can be found in Zoutendijk [228].
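The “eta vectors” and the forward transformation mentioned above can be illustrated with a minimal Python sketch of the ordinary product form of the inverse. It does not reproduce the revisions (i) to (iv) of the KSLA code, and the function names are hypothetical. Each pivot stores one eta vector together with its pivot row; a column is updated by applying the stored etas in the order in which they were created.

```python
from typing import List, Tuple

Eta = Tuple[int, List[float]]   # (pivot row r, eta vector)


def make_eta(column: List[float], r: int) -> Eta:
    """Build the eta vector from an updated column and its pivot row r."""
    pivot = column[r]
    if pivot == 0.0:
        raise ZeroDivisionError("zero pivot element")
    eta = [-a / pivot for a in column]
    eta[r] = 1.0 / pivot
    return r, eta


def apply_eta(e: Eta, x: List[float]) -> List[float]:
    """Multiply x by the elementary matrix that differs from I only in column r."""
    r, eta = e
    xr = x[r]
    y = [xi + ei * xr for xi, ei in zip(x, eta)]
    y[r] = eta[r] * xr
    return y


def ftran(eta_file: List[Eta], x: List[float]) -> List[float]:
    """Forward transformation: apply all etas, oldest first."""
    for e in eta_file:
        x = apply_eta(e, x)
    return x


# Invert B = [[2, 1], [1, 3]] column by column, starting from the identity.
eta_file: List[Eta] = []
eta_file.append(make_eta(ftran(eta_file, [2.0, 1.0]), 0))
eta_file.append(make_eta(ftran(eta_file, [1.0, 3.0]), 1))
print(ftran(eta_file, [3.0, 4.0]))   # solution of B x = (3, 4)
```

The inverse is never formed explicitly; only the list of etas grows, which is why the number of non-zero elements created in new eta vectors matters during re-inversion.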
2. Transportation Programming
The stepping-stone method has been coded for the Ferranti Mark I* computer. As a starting procedure a revision of Vogel’s Approximation Method is used in which that column is always selected which is not yet full and for which the product of the requirement and the difference between the smallest and second smallest cost figures is largest. Several other starting procedures have also been coded, e.g. the NW corner rule and the procedure of taking the columns in consecutive order and filling them in the best possible way. Comparison of computing times showed that the modified VAM method is about 30% faster than the best alternative.
3. Network Flow Problems
The flow method (primal-dual method) for solving network flow problems taking into account both cost figures and capacity limitations for the flow through the branches of the network has been coded for the Ferranti Mark I* computer. It can handle a maximum of 150 nodes and 1200 branches in the network, and is about 5 times faster than the linear programming code written for the same computer. Incorporated are:
(a) A method for finding a pair of initial solutions (one for the primal problem and one for the dual problem) which satisfy the complementary slackness relations. The method is in particular applicable if some cost figures are negative.
(b) A method for making post-optimality studies. If a network-flow problem is solved by the flow method, a new problem obtained by changing the total amount of flow, some capacities, or some cost figures, can be solved by starting from the pair of optimum solutions obtained for the original problem and its dual problem.
A publication of the details of these methods is being prepared.
4. Integer Programming Problems
The combinatorial method for solving integer linear programming problems in which all variables may assume the values 0 or 1, Benders et al. [20], has been coded for the Ferranti Mark I* computer. The program is written for problems involving about 40 inequalities and 40 variables. The calculating time is proportional to the number of constraints and depends exponentially on the number of variables involved. Experiments have shown that problems involving up to about 25 variables can be solved within a reasonable time (a few minutes to half an hour).
5. Non-Linear Programming
A complete set of procedures has been developed for the convex programming problem, i.e. the problem of minimizing a convex function in a convex region. These procedures are called methods of feasible directions and work along the following lines:
(a) The starting point must be feasible (if such a point is not available, a phase I procedure must be applied to find one).
(b) Each iteration consists of the determination of a new feasible point with a lower value for the objective function. It can be divided into two parts:
(i) The determination of a usable feasible direction (i.e. a direction along which it is possible to make a step without immediately leaving the feasible region, and along which the value of the objective function decreases).
(ii) The determination of the length of the step to be taken in that direction.
(c) The sequence of points obtained will be such that the values for the objective function converge to the minimum of the function restricted by the constraints.
The different methods of feasible directions differ in the way the directions are determined. These direction-finding problems are always linear programming problems which can be solved by means of the simplex method. The methods are finite in the linear and quadratic case and can also be used if the convexity assumptions are not fulfilled (although a local minimum may be found in that case).
The methods have been published in Zoutendijk [227, 228]. In [228] equivalence is shown between many existing methods for the linear, quadratic or convex programming problem and the methods of feasible directions.
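Steps (a) to (c) of a method of feasible directions can be sketched in Python for a deliberately simple case. The assumptions, none of which come from the source, are: the objective is a convex quadratic 0.5 x'Qx + c'x, the feasible region is a box with finite bounds lo ≤ x ≤ hi, and the direction is normalized by -1 ≤ d_i ≤ 1. With simple bounds the direction-finding linear program can be solved by inspection, so no simplex routine is needed here; no safeguards against zigzagging are included, and the function name is hypothetical.

```python
from typing import List


def feasible_directions(Q: List[List[float]], c: List[float],
                        lo: List[float], hi: List[float],
                        x: List[float],
                        max_iter: int = 100, tol: float = 1e-9) -> List[float]:
    """Minimize 0.5 x'Qx + c'x over lo <= x <= hi from a feasible start x."""
    n = len(x)
    for _ in range(max_iter):
        # Gradient of the quadratic objective at the current point.
        g = [sum(Q[i][j] * x[j] for j in range(n)) + c[i] for i in range(n)]
        # Direction finding: minimize g.d with -1 <= d_i <= 1, keeping d_i >= 0
        # at an active lower bound and d_i <= 0 at an active upper bound.
        d = []
        for i in range(n):
            if g[i] > tol and x[i] - lo[i] > tol:
                d.append(-1.0)
            elif g[i] < -tol and hi[i] - x[i] > tol:
                d.append(1.0)
            else:
                d.append(0.0)
        slope = sum(gi * di for gi, di in zip(g, d))
        if slope >= -tol:
            return x                      # no usable feasible direction remains
        # Step length: exact minimum along d, cut off at the nearest bound.
        t_max = min((x[i] - lo[i]) if d[i] < 0 else (hi[i] - x[i])
                    for i in range(n) if d[i] != 0.0)
        curv = sum(d[i] * Q[i][j] * d[j] for i in range(n) for j in range(n))
        t = t_max if curv <= 0 else min(t_max, -slope / curv)
        x = [xi + t * di for xi, di in zip(x, d)]
    return x


# Minimize (x - 2)^2 + (y - 1)^2 over 0 <= x <= 1, 0 <= y <= 3.
Q = [[2.0, 0.0], [0.0, 2.0]]
c = [-4.0, -2.0]
print(feasible_directions(Q, c, [0.0, 0.0], [1.0, 3.0], [0.0, 0.0]))
```

For general linear or convex constraints the direction-finding step becomes a genuine linear program, which is where the simplex method enters in the procedures described above.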
7.1.17 FERRANTI PEGASUS
7.1.17.1 Code for Solving the General Linear Programming Problem
SIMPFIX
The code uses the contracted simplex method and fixed point arithmetic, and the problem sizes for the models of the computer are as follows:

4096 Word Drum
              m + 2   n + 1
Simpfix 16    16      192
Simpfix 32    32      96
Simpfix 48    48      64

7168 Word Drum
Simpfix 7168: m ≤ 166, m + n ≤ 205, mn + 9n + m ≤ 5973

Examples of times for solution:
m     n     Total time (min)
11    180   20
35    62    20
LINPLEX
In this code the matrix is read from tape and the basis is inserted for each iteration. The problem size is m ≤ 13 and n unlimited. A sample problem for m = 13 and n = 3070 took 15 min per iteration.
7.1.17.2 Codes for the Transportation Problem
TRANSPEG
The method used is the inverse basis and the problem size is constrained by m ≤ 128, n ≤ 1792 (4096 word drums), n ≤ 4096 (7168 word drums); where a = number of sources, b = number of destinations, m = a + b and n = ab. Iteration time is approximately 7 + n/100 sec. A sample problem with a = 35, b = 93, m = 128, n = 3255 took 211 iterations with 139 min computing time and 152 min total time.
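Reading the TRANSPEG iteration-time estimate as 7 + n/100 seconds, the quoted sample problem can be checked with a few lines of Python. The function name is an assumption made for illustration; the figures are those given in the text.

```python
def transpeg_minutes(n: int, iterations: int) -> float:
    """Estimated computing time in minutes at 7 + n/100 seconds per iteration."""
    seconds_per_iteration = 7 + n / 100
    return iterations * seconds_per_iteration / 60


# Sample problem: a = 35 sources, b = 93 destinations, so n = a * b = 3255.
print(round(transpeg_minutes(35 * 93, 211), 1))   # 139.1
```

The estimate of about 139 minutes agrees with the reported computing time for the 211 iterations.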
TRANSTED
This code uses the stepping-stone method and the problem size is restricted by m ≤ 128 and n ≤ 4096 (either drum). A problem of size a = 31, b = 71, m = 102, n = 2201 took 139 iterations and 147 min computing time.
7.1.17.3 Codes for Discrete Programming
DISCUS
Uses the early technique of Gomory, which requires a constraint to be added to the matrix after each iteration. This is a new code and computing times were not available. Typical values for m and n are 7 × 184, 19 × 58, 27 × 27 for the 4096 word drum; and 7 × 714, 23 × 106, 39 × 40 for the 7168 word drum. In each case allowance was made for the maximum possible expansion of the matrix.
7.1.18 DEUCE COMPUTER⁹
For the general problem a code using the simplex method is available. The dimensions must satisfy (m + 3)(n + 2) ≤ 8156. A 70 × 140 problem was solved in 130 iterations and took 44 hr. A 60 × 46 problem in 50 iterations took 40 min. A dual simplex code is also available. A code is nearly complete which uses the product form of the inverse of the revised simplex method. Only nonzero elements are stored. A quadratic programming code using Wolfe’s short form algorithm was used to solve a 14 × 40 problem. The problem expanded to a 55 × 136 problem and took 14 iterations in the first stage and 50 in the second. Several variations of Gomory’s integer programming algorithm have been coded with “varying success.” A code for the transportation problem uses the Ford and Fulkerson method. The maximum matrix is defined by:
mn/a + 7(m + n) ≤ 6560
where a is the number of cost elements packed per 32-bit word. The largest problem solved was 106 × 160 and took 341 iterations and 10 hr. A new code using the stepping-stone method is under development.
⁹ English Electric Company, Ltd.
7.1.19 GAMMA E.T.
The following programs were developed by the programming group of the Shell Nederland Raffinaderij N.V. at Pernis for a BULL GAMMA A.E.T. having 48-word high-speed storage and 8192-word magnetic drum storage:
(a) A code using the explicit inverse algorithm for systems of up to about 80 equations.
(b) A code using the revised product form algorithm for systems of up to about 130 equations. Systems of 90 equations containing 130 variables, with an initial fullness of 12%, are regularly solved in 10-15 hr. Postoptimal routines are not available. Reinversion is used for all amendments.
In addition, a program for the general linear programming problem was written by the programming group of Electricité de France. This program calls for 16,000 words of drum storage and uses the product form of the inverse with single-precision arithmetic. The code can handle 255 equations and up to 765 variables. Special features include periodic reinversion, special handling of angular matrices, linear parametrization of the objective function and right-hand side, the determination of the dual variables, and use of artificial variables to find a first feasible solution. Examples of running times are given in Table VIII.

TABLE VIII

Size        Nonzero elements   Iterations   Time
17 × 34     93                 21           10 min
43 × 88     345                95           3½ hr
63 × 159    018                187          10 hr
227 × 296   3L"                -            about 150 hr(a)
89 × 474    10 nonzero         -            about 100 hr

(a) This included parametrization of the objective function.
A transportation code for the GAMMA E.T. has also been written by Electricité de France. Restrictions on the dimensions are m + n ≤ 256 and mn ≤ 5000. Approximate time per iteration is given in seconds by the formula ,1,[??l)L 3(m I). 12.
7.1.20 GAMMA 60
A program is being prepared by Electricité de France and is scheduled for completion in late 1960. It will use the product form of the inverse and have no limits on the dimensions of a problem. It will have the same special features as the GAMMA E.T. code but will use double precision in phases of the calculation. Other features are planned.
7.1.21 WEIZAC¹⁰
The only procedure programmed at this time is the procedure of Dennis [52] for solving the transportation problem.
¹⁰ The Weizmann Institute of Science, Rehovoth, Israel.
7.2 Analog Computers
Although not much publicity has been given to the solution of linear programming problems by analog devices, much work has been done in this area. The beauty of analog computation lies in the fact that one can obtain a continuous solution as a function of continuous changes in parameters of interest, and thus easily investigate sensitivity of the solution to a variety of coefficient changes. The same technique also allows for the solution of nonlinear problems. In particular, there are at least two commercially available analog computer systems for solving programming problems.
Staff members of the Princeton Computation Center of Electronic Associates Incorporated have developed two techniques which are particularly well adapted for analog computers, and admit with the same facility solution of nonlinear programming problems where the Kuhn and Tucker concavity conditions are satisfied; in cases where more than one optimum exists, one can still reach the true optimum by repeating with different initial conditions. One technique is based on a random search where a Gaussian noise generator is used for each variable. Equipment-wise this technique is the cheapest, but it requires a longer solution time [65, 162]. The other technique is based on a gradient method and admits almost instantaneous solution but requires somewhat more equipment. This technique has been well known for linear programming [3, 174, 175].
The Reeves Instrument Corporation are marketing the REAC LP 402-1 and LP 402-2 Linear Programming Computers. The basic analog method is that described by Pyne and by Ablow and Brigham. The following is a summary of the capabilities of the Linear Programming Computers:
LP 402-1
1. Solve linear programming problems of up to 15 variables.
2. Solve nonlinear programming problems of up to 15 variables, in conjunction with external function generation equipment.
3. Solve up to 15 simultaneous linear equations.
4. Evaluate determinants of up to 15 × 15 size or any of their minors. Invert any matrix of 15 × 15 or smaller.
5. Form the product of two or more matrices of up to 15 × 15 size.
LP 402-2
1. Solve linear programming problems of up to 20 variables.
2. Solve nonlinear programming problems of up to 20 variables, in conjunction with external function generation equipment.
3. Solve up to 20 simultaneous linear equations.
4. Evaluate determinants of up to 20 × 20 size or any of their minors. Invert any matrix of 20 × 20 or smaller.
5. Form the product of two or more matrices of up to 20 × 20 size.
All of the computer’s operations are performed with the equivalent of three to four figure accuracy. An iterative procedure can be employed which will double the accuracy of the solution. Nonlinearities can easily be generated by use of external electronic function generators and fed into the Linear Programming Computer through connections provided for this purpose.
Special purpose analog computers have also been developed. The Scott Paper Company have their ALPAC (Analog Linear Programming Algebraic Computer) and the Central Electricity Authority of England has developed analog devices for solving the transportation problem [196]. The design is due to E. R. F. W. Crossman. It consists of an array of pulley blocks connected by strings, each string corresponding to a source or destination. Any displacement of a string will be distributed among the movable pulley blocks around which it passes. Thus the quantity x_ij is represented by the vertical displacement of the pulley block in position ij and the requirement of a destination is represented by half the length of the string pulled out. Minimization makes use of the property that, for stable equilibrium, the potential energy of a mechanical system is at a minimum. Thus if one adds to each pulley block a weight representing the transport cost c_ij, then Σ_ij x_ij c_ij = minimum. The main requirements of the transportation problem are therefore satisfied. The nonnegativity condition can be readily imposed by means of physical stops, and route capacity and other limitations can also be incorporated by obvious physical means.
The application of this analog is subject to difficulties with friction and extension of the string, and this will, in practice, limit the size of the problem to which it can be applied. A prototype version which is capable of handling problems 4 × 3 in size has been built. A second machine is being constructed which incorporates a low-friction polythene in the bearings and pulleys, and braided Terylene for the strings, which are kept taut by graduated spring-loaded reels. This one has been christened TOM (Transportation Optimization Machine).
An excellent survey of mechanisms for linear programs along with “do-it-yourself” suggestions is given in Sinden [190].
8. SCEMP
An organized plan for experimentation in the computational side of linear programming has been long overdue. A recent proposal by members of the SHARE (IBM 704-9-90 Users Group) Linear Programming Committee was the basis of the establishment of SCEMP (Standardized Computational Experiment in Mathematical Programming) within the SHARE organization. The basic purpose of SCEMP is the determination of the relative efficiencies of the members of a certain class of computational procedures for solving linear programming problems. This determination is to be made from a study of the process of solution of a fixed set of test problems by a group of related computer routines employing these procedures. The computational procedures in question are those which may be described as variations of the simplex algorithm or of its computational representation. The intended scope of the study is best described through some of the questions it is hoped the results of SCEMP will throw light on, such as: How do the three principal computational forms of the simplex method (the standard form, the revised form with explicit inverse, and the revised form with product inverse) differ with respect to arithmetic performed and data handled? To what degree do devices for reducing the total number of iterations (such as mixed pricing, curtaining, and arbitrary transformations) do so? Is the composite, the dual simplex, or the parametric algorithm best for removing infeasibilities? What tolerances for ending phase or resetting small numbers to zero are safe to use? For problems of what size, and what condition, is double precision necessary?
In line with the above, we would like to cite information relating to two of the question areas. The first deals with tolerance limits and the second with the criterion for selecting a vector to enter the basis. From C. E. Miller of Standard Oil of California, we have the following with respect to the code in use at Standard's computing installation: "It is a device to guard against selecting too small a pivot, particularly in the presence of degeneracy.
We use two tolerances, δ_1 and δ_2 (0 < δ_1 < δ_2, both small numbers) in the θ calculation, after having selected the column to come into the basis. If a putative pivot position is positive but less than δ_1, it is declared to be zero, and thus excluded as a candidate for pivoting. If the minimum θ is attained on a row for which the pivot is less than δ_2 (we have already taken steps to ensure it to be larger than δ_1), then we disqualify the entire column and remove it from the problem until optimality. It, and any others similarly disqualified, are then brought in for pricing. We count the number of times this is done and in the rare event that a set of variables is disqualified repeatedly, we effectively set δ_2 = δ_1 and take our chances on small pivots. This has been an effective and, for our problems, a nearly necessary feature. Standard uses 2-" for δ_1 and 2^-12 for δ_2. This device has enabled Standard to skirt successfully numerical difficulties in highly degenerate models which had previously caused many headaches. It is believed to be a significant step forward in the successful use of single precision for large linear programming models. To the best of knowledge, credit is due to David Caplin of Royal Dutch Shell in London."
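The two-tolerance device described in the quotation can be sketched in Python as a ratio test. This is an illustrative reading of the description, not Standard's code: the function name and return values are hypothetical, the right-hand sides are assumed non-negative, and the smaller tolerance is set to 2^-20 purely as a placeholder because its value is not legible in the source.

```python
from typing import List, Optional, Tuple

DELTA_1 = 2.0 ** -20   # placeholder value (assumption); must satisfy DELTA_1 < DELTA_2
DELTA_2 = 2.0 ** -12   # value quoted in the text


def ratio_test(alpha: List[float], beta: List[float],
               delta1: float = DELTA_1,
               delta2: float = DELTA_2) -> Tuple[str, Optional[int]]:
    """Choose the pivot row for an entering column.

    alpha: updated entries of the entering column; beta: current right-hand sides.
    Returns ("pivot", row), ("disqualify", None) or ("unbounded", None).
    """
    best_row, best_theta = None, None
    for i, (a, b) in enumerate(zip(alpha, beta)):
        if a < delta1:
            continue   # non-positive, or positive but treated as zero
        theta = b / a
        if best_theta is None or theta < best_theta:
            best_row, best_theta = i, theta
    if best_row is None:
        return "unbounded", None
    if alpha[best_row] < delta2:
        return "disqualify", None   # set the whole column aside until optimality
    return "pivot", best_row


print(ratio_test([0.5, 1e-9, 2.0], [1.0, 0.0, 8.0]))
print(ratio_test([1e-4, 2.0], [0.0, 8.0]))
```

The fallback mentioned in the quotation, for a set of columns that is disqualified repeatedly, corresponds to calling the routine with `delta2` set equal to `delta1`.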
In Dickson and Frederick [57] a new criterion for the selection of a vector to enter the basis is given. The basic idea behind the scheme is to select a vector which gives the longest projection on the “cost axis,” i.e., the vector which makes the smallest angle with this axis. The criterion resolves into the following rule: Select the kth variable as P_k (the vector to enter the basis) if (z_k - c_k) < 0 and φ_k ≥ φ_j where
and a_ij^+ = 0 if a_ij ≤ 0, and a_ij^+ = a_ij if a_ij > 0. Here we are maximizing. The quantity φ_j is cos^2 θ_j, where θ_j is the angle between the jth vector and the cost axis for what the authors call the reduced problem. A 34 × 44 gasoline blending problem was solved using the standard maximizing criterion of selecting the P_k associated with min (z_j - c_j) < 0 and the new procedure. The former method took 64 iterations and the latter method took 42. A 62 × 70 refinery simulation problem took 85 iterations with the old procedure and 62 with the new. A rough estimate of the time cost of this technique is a 10% increase in iteration time. A 10% reduction in the number of iterations would certainly offset this increase. The usual reduction appears to be in the order of 30% to 70% of the number of iterations required by codes employing a most negative (z_j - c_j) criterion.
As the authors point out, and as SCEMP emphasizes, there is room for improvement and a need for experimentation in the computational aspects of linear programming.
9. Linear Programming in Other Countries
In the section on digital computer codes we reviewed the work being done in the Computer Development Division of Shell International, London; the Koninklijke/Shell-Laboratorium of Amsterdam; the work on the multiplex method of Frisch in Norway; and the codes for a number of British and French digital computers. In this section we would like to summarize the additional information on foreign activities in linear programming available to us at this time.
9.1 France
In France we find much work being done in the applied (as indicated by the available codes), theoretical, and educational areas of linear programming. In the latter area, for example, we note the courses of G. Th. Guilbaud of the Sorbonne and J. Abadie of Electricité de France. From the theoretical side we cite the following investigations: Peuchot of Esso Standard is studying three approaches to the problem of finding the optimum solution by advancing in the interior of the convex set of solutions. They are the procedure which uses the gradient of the objective function, the logarithmic potential method of Frisch, and a Monte Carlo approach for finding a direction of improvement. Parisot of IBM France has proposed a method which is based on the potential of Frisch. The principle of decomposition is described in French by Abadie [2] and has been applied to a transportation problem by the IBM France Service of New Scientific Studies. The development of a general program for the IBM 704 is being studied. Work in integer programming which has been done by Professor Fortet appears in the revue of the C.U.R.O. of Brussels. Coekelberghs has studied general transportation problems and their reduction to integer programming problems. His work is contained in a thesis at the University of Brussels. An article by Thionet which appeared in The Review of Applied Statistics discusses the application of linear programming to boring plans. Additional studies include those of Matthys and Ricard of The National Society of French Railroads, who studied the transportation problem with respect to the theory of graphs and the duality relationships; and Abadie and his work on the generalized transportation problem and methods for speeding up calculations using the simplex method. A short bibliography of French works includes Abadie [1, 2], Bessière [21], Bouzitat [24], Carpentier [25], Kaufmann [116], Lesourne [135], Maghout [142], Massé [155], Peuchot [167], Pigot [168], [169], and SNCF [193].
9.2 Italy
A great deal of intcrcst in linear programming has been developed in Italy by the Centro per la Ricerca Operativa in Milan under the directorship of Professor Francesco Brambilla. His book, La Progrummazione Matematica Nell’lmpresa (Mathematical Programming in the Firm), will be published in the near future. The Center has a GAMMA computer a t its disposal. Papers by Pozzi and Gazzano [170, 1711 describe an analog method for linear programming and a direct method for the solution of linear systems and matrix inversion particularly suitable for super automatic desk computers. Additional references are Longo, Ricossa, and
Giardina [138] and the methodological series of the Bollettino del Centro per la Ricerca Operativa.
9.3 U.S.S.R.
In the past few years it has become quite apparent that many of the major Soviet mathematicians and economists are increasing their efforts in applying mathematical procedures to economic planning. The reader is referred to articles in Business Week, June 13, 1959; New York Times, page 10, June 12, 1960; and to Foreign Affairs, January 1960. In the latter, Leontief describes the “State of Soviet Economic Science.” From the linear programming point of view it has also become apparent that a great deal of basic and important work has been done in the Soviet Union which predates many of the independent discoveries made by Dantzig and others outside of the Soviet bloc. In particular we refer to the articles by L. V. Kantorovich [106, 108], translations of which appeared in Management Science, October 1958, and July 1960, respectively. The second article, which was published in 1939, is most startling. In that article the author describes such applications as the assignment of items or tasks to machines in metal-working, in the plywood industry, and in earth moving; trimming problems of sheet metal, lumber, paper, etc.; oil refinery operations, allocation of fuels to different uses; allocation of land to crops; and of transportation equipment to freight flows. Kantorovich’s computational procedure, termed the method of resolving multipliers, is also given in the 1939 paper. Since that time, and especially in recent years, many Soviet publications have appeared which relate directly or indirectly to linear programming procedures. In particular we would like to cite two books. The first, by Kantorovich, was published in 1959 and is titled Economic Calculation for Better Utilization of Resources.
The chapter headings are “Distribution of Output Programs and Estimation of Production,” “Maximum Program Fulfillment Using Given Resources,” “Questions Connected with the Expansion of the Productive Base.” The two appendices include the “Mathematical Statement of The Problem of Optimum Planning” and “Computation Methods for Answering Problems of Optimum Planning.” The second book is Application of Mathematics in Economic Research, V. S. Nemchinov, editor, 1959. This volume includes “The Application of Mathematics in The Field of Economics” by Nemchinov, “Mathematical Methods of Organization and Planning of Production” and “Subsequent Development of Mathematical Methods and Possibilities of Their Use in Planning and Economics” by Kantorovich, “A Lecture on Linear Programming” by B. Kerko, “Methods of Calculation for Solving Problems in Linear Programming” by G. Sh. Rubinshtein, and a short bibliography. Other Soviet work is discussed in Isbell and
Marlow [100] and Gourary et al. [91]. The latter is a translation of a paper by Rubinshtein. The short bibliography of this translation lists Kantorovich [106-113], Rubinshtein [183-185], and Tolstoi [198]. The May-June 1960 issue of Operations Research notes that a series of lectures on “Problems of the Application of Mathematics in Economics” was given in 1959-1960 at the University of Leningrad. We should note that linear programming has also been used by other members of the Soviet bloc. In a report on “Mathematical Research in China in the Last Ten Years” which appeared in the December 1959 issue of the Notices of the American Mathematical Society the author credits operations research, especially linear programming, as being a contributing factor in the “great leap.” He notes that it has been applied everywhere in the country to the building up of the national economy with a great saving of national funds. A so-called graphical method has been developed and is widely employed in practice.
References
1. Abadie, J., Approvisionnement des centrales thermiques et généralisations du problème de transport. Rev. franç. recherche opérationnelle, 2ème trimestre (1958).
2. Abadie, J., Programmes linéaires: Le principe de décomposition de Dantzig et Wolfe. Rev. franç. recherche opérationnelle, 2ème trimestre (1960).
3. Ablow, C. M., and Brigham, G., An analog solution of programming problems. J. Operations Research Soc. Am. 3, No. 4, 388-394 (1955).
4. Akers, S. B., Jr., The Use of Wye-Delta Transformations in Network Simplification, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
5. Arabi, M., An Application of Linear Programming to Balancing Airplane Control Surfaces. Boeing Aircraft Co., Seattle, Washington, 1960.
6. Arrow, K. J., Karlin, S., and Scarf, H., Studies in the Mathematical Theory of Inventory and Production. Stanford Univ. Press, Stanford, California, 1958.
7. Arrow, K. J., Hurwicz, L., and Uzawa, H., Studies in Linear and Nonlinear Programming. Stanford Univ. Press, Stanford, California, 1958.
8. Arrow, K. J., and Johnson, S. M., A Feasibility Algorithm for One-way Substitution in Process Analysis, Notes on Linear Programming: Part XLIII, RM-1976 (ASTIA Document No. AD-144278). The Rand Corp., Santa Monica, California, 1957.
9. Baumol, W. J., Activity analysis in one lesson. Am. Econ. Rev. 48, No. 5 (1958).
10. Beale, E. M. L., A Method for Solving Linear Programming Problems When Some But Not All of The Variables Must Take Integral Values, Statist. Techniques Research Group Tech. Report No. 19. Princeton University, Princeton, New Jersey, 1958.
11. Beale, E. M. L., On minimizing a convex function subject to linear inequalities. J. Roy. Statist. Soc. B17 (1955).
12. Beale, E. M. L., On quadratic programming (O.N.R. Document NAVEXOS P-1278). Naval Research Logist. Quart. 6, No. 3 (1959).
13. Beckmann, M. J., On the division of labor in teams. Metroeconomica 8, No. 3, 163-168 (1956).
14. Beckmann, M. J., International and interpersonal division of labor. Weltwirtschaftl. Arch. 78, 67-73 (1957).
15. Beckmann, M. J., Variational programming (Abstr.). Econometrica 27, 269-270 (1959).
16. Beckmann, M. J., Lineare Planungsrechnung-Linear Programming. Fachverlag für Wirtschaftstheorie und Ökonometrie, Ludwigshafen, Germany, 1959.
17. Beckmann, M. J., Lineares Programmieren und neoklassische Theorie. Weltwirtschaftl. Arch. 84 (1960).
18. Beckmann, M. J., and Laderman, J., A bound on the use of inefficient indivisible units (Cowles Foundation Paper No. 109). Naval Research Logist. Quart. 3, No. 4, 245-252 (1956).
19. Bellman, R., Functional-equation Approaches to Various Classes of Linear Programming Problems, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
20. Benders, F., Jr., Catchpole, A. R., and Kuiken, C., Discrete Variables Optimization Problems. Koninklijke/Shell-Lab., Amsterdam, 1960.
21. Bessière, F., Applications de la dualité à un modèle de programmation à long terme. Rev. franç. recherche opérationnelle, 2ème trimestre (1959).
22. Bock, F., Efficient Algorithms for Finding Optimum Constrained Paths and Circuits in Networks, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
23. Boiteux, M., and d’Epenoux, F., Long Term Programming of Investments in the Electric Power Industry, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
24. Bouzitat, J., Théorie des Jeux et Programmation Linéaire (French translation of S. Vajda, ref. 204a).
25. Carpentier, J., A Method for Solving Linear Programming Problems in Which Cost Depends Non-linearly on a Parameter, Électricité de France, Direction des Études et Recherches Report HX l&JLC/AM.
25a. Charnes, A., and Cooper, W. W., Management Models and Industrial Application of Linear Programming. Wiley, New York, 1960.
26. Charnes, A., and Lemke, C. E., Minimization of nonlinear separable convex functionals. Naval Research Logist. Quart. 1 (1954).
27. Cheney, E. W., A Code for Convex Programming, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
28. Cheney, E. W., and Goldstein, A. A., On Convex Programming and Tchebycheff Approximations, I, Appl. Math. Ser. No. 17. The Rand Corp., Santa Monica, California, 1959.
29. Cheney, W., and Goldstein, A. A., Proximity Maps for Convex Sets, Mathematical Preprint Ser. No. 11-A. The Rand Corp., Santa Monica, California, 1959.
30. Cheney, L. K., Ullman, R. J., and Kawarantani, T. T., The Significance of Mathematical Programming in the Business World, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
31. Croes, G. A., A Method for Solving the Fixed-Charge Problem and its Applications, Proc. 5th World Petroleum Cong.
32. Dahl, O.-J., Linear Programming on the Mercury Computer: The Multiplex Method, Norwegian Defence Research Establishment Intern. Report F-375. Oslo, Norway, 1959.
33. Dahl, O.-J., A Comparison Between the Simplex and Multiplex Methods, Norwegian Defence Research Establishment Tech. Note F-22. Lillestrom, Norway, 1959.
34. Dantzig, G. B., Discrete-Variable Extremum Problems, Notes on Linear Programming: Part XXXV, RM-1832 (ASTIA Document No. AD-112411). The Rand Corp., Santa Monica, California, 1956.
35. Dantzig, G. B., On Integer and Partial Integer Linear Programming Problems, Rand Report P-1410. The Rand Corp., Santa Monica, California, 1958.
36. Dantzig, G. B., Solving Linear Programs in Integers, Notes on Linear Programming: Part XLVII, RM-2209 (ASTIA Document No. AD-156047). The Rand Corp., Santa Monica, California, 1958.
37. Dantzig, G. B., Solving Two-Move Games With Perfect Information, Rand Report P-1459. The Rand Corp., Santa Monica, California, 1958.
38. Dantzig, G. B., The Dual of a Transportation Problem is not a Transportation Problem, Rand Report P-1532. The Rand Corp., Santa Monica, California, 1958.
39. Dantzig, G. B., New Directions in Mathematical Programming, Rand Report P-1646. The Rand Corp., Santa Monica, California, 1959.
40. Dantzig, G. B., On the Significance of Solving Linear Programming Problems With Some Integer Variables, Rand Report P-1486. The Rand Corp., Santa Monica, California, 1959.
41. Dantzig, G. B., General Convex Objective Forms, Rand Report P-1664. The Rand Corp., Santa Monica, California, 1959.
42. Dantzig, G. B., On the Status of Multistage Linear Programming Problems, Rand Report P-1028. The Rand Corp., Santa Monica, California, 1959; also in Management Sci. 6, No. 1 (1959).
43. Dantzig, G. B., A Machine-Job Scheduling Model, Rand Report P-1502. The Rand Corp., Santa Monica, California, 1960; also in Management Sci. 6, No. 2 (1960).
44. Dantzig, G. B., On the Shortest Route Through a Network, Rand Report P-1345. The Rand Corp., Santa Monica, California, 1960; also in Management Sci. 6, No. 2 (1960).
45. Dantzig, G. B., Fulkerson, D. R., and Johnson, S. M., On a Linear Programming Combinatorial Approach to the Traveling Salesman Problem, Linear Programming and Extensions: Part XLIX, RM-2321 (ASTIA Document No. AD-212974). The Rand Corp., Santa Monica, California, 1959.
46. Dantzig, G. B., Fulkerson, D. R., and Johnson, S. M., On a Linear Programming, Combinatorial Approach to the Traveling Salesman Problem, Rand Report P-1281. The Rand Corp., Santa Monica, California, 1959; also in Operations Research 7, No. 1 (1959).
47. Dantzig, G. B., and Johnson, S. M., An Equivalent Linear Programming Problem, Rand Report P-1448. The Rand Corp., Santa Monica, California, 1958.
48. Dantzig, G. B., and Wolfe, P., A Decomposition Principle for Linear Programs, Rand Report P-1544. The Rand Corp., Santa Monica, California, 1959; also in Operations Research 8, No. 1 (1959).
49. Dartmouth Math. Project Progress Report No. 3. Dartmouth College, Hanover, New Hampshire, 1958.
50. Debreu, G., Theory of Value, An Axiomatic Analysis of Economic Equilibrium, Cowles Foundation Monograph 17. Wiley, New York, 1960.
51. DeLand, E. C., Continuous Programming Methods, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
52. Dennis, J. B., A high-speed computer technique for the transportation problem. J. Assoc. Computing Machinery 5, No. 2 (1958).
53. Dennis, J. B., Diode Networks and Network Flow Problems, I
72. Fort, D. M., The Separation of Uranium Isotopes by Gaseous Diffusion, Proc. 2nd Intern. Conf. on Operational Research, 1960.
73. Frisch, R., The Multiplex Method for Linear Programming, Memorandum Sosialokon. Inst. University of Oslo, Oslo, Norway, 1958.
74. Fulkerson, D. R., A Feasibility Criterion for Staircase Transportation Problems and an Application to a Scheduling Problem, Rand Report P-1188. The Rand Corp., Santa Monica, California, 1957.
75. Fulkerson, D. R., A Network-Flow Feasibility Theorem and Combinatorial Applications, Notes on Linear Programming: Part XLV, RM-2159 (ASTIA Document No. AD-156011). The Rand Corp., Santa Monica, California, 1958.
76. Fulkerson, D. R., Bounds on the Primal-Dual Computation for Transportation Problems, Notes on Linear Programming: Part XLVI, RM-2178 (ASTIA Document No. AD-156001). The Rand Corp., Santa Monica, California, 1958.
77. Fulkerson, D. R., Increasing the Capacity of a Network: The Parametric Budget Problem, Rand Report P-1401. The Rand Corp., Santa Monica, California, 1959; also in Management Sci. 5, No. 4 (1959).
78. Fulkerson, D. R., Zero-one Matrices with Zero Trace, Rand Report P-1618. The Rand Corp., Santa Monica, California, 1959.
79. Fulkerson, D. R., On the Equivalence of the Capacity-Constrained Transshipment Problem and The Hitchcock Problem, Notes on Linear Programming and Extensions: Part LIII. The Rand Corp., Santa Monica, California, 1960.
80. Fulkerson, D. R., and Gale, D., Comments on Solution of the Quota Problem By a Successive-Reduction Method, Rand Report P-1315. The Rand Corp., Santa Monica, California, 1958; also in Operations Research 6, No. 6 (1958).
81. Fulkerson, D. R., and Johnson, S. M., A Tactical Air Game, Rand Report P-1063. The Rand Corp., Santa Monica, California, 1957; also in Operations Research 5, No. 5 (1957).
82. Gale, D., Transient Flows in Networks, Notes on Linear Programming: Part XLIV, RM-2152 (ASTIA Document No. AD-150686). The Rand Corp., Santa Monica, California, 1958.
82a. Gale, D., The Theory of Linear Economic Models. McGraw-Hill, New York, 1960.
83. Galer, G. S., The use of computers for economic planning in the petroleum chemical industry. Brit. Computer J. pp. 145-149 (1959).
84. Galler, B. A., A Multi-Dimensional Distribution Problem. University of Michigan, Ann Arbor, 1957.
85. Gass, S. I., Linear Programming: Methods and Applications. McGraw-Hill, New York, 1958.
86. Gerstenhaber, M., A solution method for the transportation problem. J. Soc. Ind. Appl. Math. 6, No. 4 (1958).
86a. Glicksman, S., Johnson, L., and Eselson, L., Coding the transportation problem. Naval Research Logist. Quart. 7, No. 2 (1960).
87. Goldstein, A. A., Proximity Maps for Convex Sets, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
88. Gomory, R. E., Essentials of an algorithm for integer solutions to linear programs. Bull. Am. Math. Soc. 64 (1958).
89. Gomory, R. E., An Algorithm for Integer Solutions to Linear Programs, Princeton-IBM Math. Research Project Tech. Report No. 1. Princeton, New Jersey, 1958.
90. Gomory, R. E., An Algorithm for the Mixed Integer Problem, RM-2597. The Rand Corp., Santa Monica, California, 1960.
90a. Gomory, R. E., All-Integer Programming Algorithm, Report RC-189. IBM Research Center, Mohansic, N.Y., 1960.
91. Gourary, M. H., Isbell, J. R., and Marlow, W. H., A Generalization of the Problem Concerning the Extreme Intersection Point of an Axis with a Convex Polyhedron, Logistics Research Project (G. S. Rubinshtein, ed.). George Washington University, Washington, D.C., 1960.
92. Gross, O., The Bottleneck Assignment Problem: An Algorithm, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
93. Gross, O., Class of Discrete-Type Minimization Problems, Notes on Linear Programming: Part XXX, RM-1644. The Rand Corp., Santa Monica, California, 1956.
94. Gross, O., A Simple Linear Programming Problem Explicitly Solvable in Integers, Notes on Linear Programming: Part XXVIII, RM-1560. The Rand Corp., Santa Monica, California, 1955.
95. Hartley, H. O., Nonlinear Programming for Separable Objective Functions and Constraints, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
96. Heady, E. O., and Candler, W., Linear Programming Methods. Iowa State College Press, Ames, Iowa, 1958.
97. Heyman, J., and Prager, W., Automatic minimum weight design of steel frames. J. Franklin Inst. 266, No. 5 (1958).
98. Hirsch, W. M., and Dantzig, G. B., The Fixed Charge Problem, Notes on Linear Programming: Part XIX, RM-1383. The Rand Corp., Santa Monica, California, 1954.
99. Houthakker, H. S., The Capacity Method of Quadratic Programming, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
100. Isbell, J. R., and Marlow, W. H., On an Industrial Programming Problem of Kantorovitch, Logistics Research Project. George Washington University, Washington, D.C., 1960.
101. Jacobs, W. W., and Schell, E. D., The Maximum Number of Basic Solutions, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
102. Johnson, S. M., A Linear Diophantine Problem, Rand Report P-1115. The Rand Corp., Santa Monica, California, 1957.
103. Johnson, S. M., The Minimization of Functions of Free-energy Type, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
104. Johnson, S. M., Discussion: Sequencing n Jobs on Two Machines With Arbitrary Time Lags, Rand Report P-1526. The Rand Corp., Santa Monica, California, 1959; also in Management Sci. 5, No. 3 (1959).
105. Kalker, J. J., Automatic Minimum Weight Design of Steel Frames on the IBM 704 Computer, Report IBM 2038/3. Brown University, Providence, Rhode Island, 1958.
106. Kantorovitch, L., Mathematical Methods of Organizing and Planning Production (Russian). Leningrad University, 66 pp., 1939.
107. Kantorovitch, L. V., A new method of solving some classes of extremal problems. Comp. rend. acad. sci. U.R.S.S. 28, 211-214 (1940) (English).
108. Kantorovitch, L. V., On the translocation of masses. Comp. rend. acad. sci. U.R.S.S. 37, 199-201 (1942).
109. Kantorovitch, L. V., On a problem of Monge. Uspekhi Mat. Nauk (Russian) 3, 225-226 (1948).
110. Kantorovitch, L. V., The selection of sawing schedules that guarantee maximal output of sawmill production in a given proportional arrangement. Lesnaya Prom. (Russian) No. 7, 15-17; No. 8, 17-19 (1949).
111. Kantorovitch, L. V., Economic Calculation for Better Utilization of Resources. Akad. Nauk S.S.S.R., Moscow, 1959.
112. Kantorovitch, L. V., and Gavurin, M. K., The application of mathematical methods in problems of freight flow analysis, in Collection of Problems Concerned with Increasing the Effectiveness of Transports (Russian). Akad. Nauk S.S.S.R., Moscow, 1949.
113. Kantorovitch, L. V., and Zalgaller, V. A., Calculation of a Rational Apportionment of Industrial Materials (Russian), 197 pp. Leningrad, 1951.
114. Karlin, S., Mathematical Methods and Theory in Games, Programming and Economics, Vol. I: Matrix Games, Programming and Mathematical Economics; Vol. II: The Theory of Infinite Games. Addison-Wesley, Reading, Massachusetts, 1959.
115. Karreman, H., Programming the Supply of a Strategic Material, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
116. Kaufmann, A., Méthodes et Modèles de la Recherche Opérationnelle.
117. Kawaratani, T. K., Ullman, R. J., and Dantzig, G. B., Computing Tetraethyl Lead Requirements in the Linear Programming Format, Rand Report P-1545. The Rand Corp., Santa Monica, California, 1959.
118. Kelley, J. E., Jr., A computational approach to convex programming (Abstr.). Econometrica 27, 276 (1959).
119. Kelley, J. E., Jr., An application of linear programming to curve fitting. J. Soc. Ind. Appl. Math. 6, No. 1, 15-22 (1958).
120. Kelley, J. E., Jr., Extension of the Construction Scheduling Problem: A Computation Algorithm, UNIVAC Applications Research Center Report. Remington Rand UNIVAC, Philadelphia, Pennsylvania, 1958.
121. Kelley, J. E., Jr., The Cutting Plane Method for Solving Convex Programs. Mauchly Associates, Ambler, Pennsylvania, 1959.
122. Kelley, J. E., Jr., Parametric Programming and the Primal-Dual Algorithm. Operations Research 7, No. 3 (1959).
123. Kelley, J. E., Jr., and Walker, M. R., Critical-Path Planning and Scheduling, Proc. Eastern Joint Computer Conf., Boston, 1959.
124. Kelley, J. E., Jr., and Walker, M. R., Critical-Path Planning and Scheduling: An Introduction. Mauchly Associates, Ambler, Pennsylvania, 1959.
125. Koopmans, T. C., Water Storage Policy in a Simplified Hydroelectric System, Paper No. 115, Proc. 1st Intern. Conf. on Operational Research, London, 1957.
126. Koopmans, T. C., Three Essays on the State of Economic Science. McGraw-Hill, New York, 1957.
127. Koopmans, T. C., and Bausch, A. F., Selected Topics in Economics Involving Mathematical Reasoning, Cowles Foundation Paper No. 136, a reprint from SIAM Rev. 1, No. 2 (1959).
128. Krelle, W., and Künzi, H. P., Lineare Programmierung. Verlag Industrielle Organisation, Zürich, Switzerland, 1958.
129. Kron, G., Tearing, tensors and topological models. Am. Scientist 45, No. 5 (1957).
130. Kron, G., Diakoptics, Piecewise Solution of Large-scale Systems. General Engineering Lab., G.E. Corp., Schenectady, New York: Introduction, An introduction to universal engineering, 1957; Chapter I, Topology of piecewise analysis, Report No. 57GL330-1, 1957; Chapter II, Orthogonal networks, Report No. 57GL330-2, 1957; Chapter III, Piecewise solution of diffusion-type networks, 1957; Chapter IV, Topology of piecewise solution, Report No. 57GL330-4, 1957; Chapter V, Topological model of a transportation problem, Report No. 57GL330-5, 1958; Chapter VI, Piecewise optimization of linear programming, Report No. 57GL330-6, 1958.
131. Kron, G., A Very Simple Example of Piecewise Solution, General Engineering Lab. Report No. 58GL71. G.E. Corp., Schenectady, New York, 1958.
131a. Kuhn, H., and Tucker, A. W., Nonlinear Programming, Proc. 2nd Berkeley Symposium on Mathematical Statistics and Probability. University of California Press, Berkeley, California, 1950.
132. Lemke, C. E., The Constrained Gradient Method of Linear Programming, Math. Report No. 27. Rensselaer Polytech. Inst., Troy, New York, 1959.
133. Lemke, C. E., Charnes, A., and Sienkiewicz, O. C., Plastic Limit Analysis and Integral Linear Programs, Math. Report No. 21. Rensselaer Polytech. Inst., Troy, New York, 1959.
134. Leontief, W., The Problem of Stability in Dynamic Models, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
135. Lesourne, J., Technique économique et Gestion industrielle.
136. Leutert, W. W., Some Useful Linear Programming Techniques, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
137. Loeb, H. L., Algorithms for Chebycheff Approximations Using the Ratio of Linear Forms, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
138. Longo, Ricossa, and Giardina, La Programmazione Lineare.
139. Madansky, A., Bounds on the Expectation of a Convex Function of a Multivariate Random Variable, P-1418. The Rand Corp., Santa Monica, California, 1959; also in Ann. Math. Statist. 30, No. 3 (1959).
140. Madansky, A., Some Results and Problems in Stochastic Linear Programming, Rand Report P-1596. The Rand Corp., Santa Monica, California, 1959.
141. Madansky, A., Inequalities for Stochastic Linear Programming Problems, Rand Report P-1600. The Rand Corp., Santa Monica, California, 1960; also in Management Sci. 6, No. 2 (1960).
142. Maghout, K., Une Méthode pour la Résolution des Programmes linéaires, Programmes paramétriques. Académie des Sciences, Paris, 1960.
143. Makower, H., Activity Analysis and the Theory of Economic Equilibrium. Macmillan, New York, 1957.
144. Manne, A. S., A Note on the Modigliani-Hohn Production Smoothing Model, Cowles Foundation Paper No. 113, 1957.
145. Manne, A. S., A Target-Assignment Problem, Cowles Foundation Paper No. 120, 1958.
146. Manne, A. S., Programming of economic lot sizes. Management Sci. 4, No. 2 (1958).
147. Manne, A. S., Costs and Benefits in Mathematical Programming, Rand Report P-936. The Rand Corp., Santa Monica, California, 1956.
148. Manne, A. S., Allocating MATS Equipment with the Aid of Linear Programming, Rand Report RM-1612. The Rand Corp., Santa Monica, California, 1956.
149. Manne, A. S., A target assignment problem. Operations Research 6, No. 3 (1958).
150. Manne, A. S., Linear Programming and Sequential Decisions, Cowles Foundation Discussion Paper No. 62, 1959.
151. Manne, A. S., On the Job Shop Scheduling Problem, Cowles Commission for Research in Economics Contract No. 358 (01) NR047-066. Office of Naval Research, 1959.
152. Markowitz, H., The optimization of a quadratic function subject to linear constraints. Naval Research Logist. Quart. 3, No. 1 (1956).
153. Markowitz, H. M., Portfolio Selection. Wiley, New York, 1959.
154. Markowitz, H. M., An Example of Monte Carlo Programming, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
155. Masse, P., Le Choix des Investissements.
156. McGuire, C. B., Comparisons of Information Structures, Cowles Foundation Discussion Paper No. 71, 1959.
157. Merrill, R. P., Experience with the 704 Program for the Gradient Projection Method, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
158. Metzger, R. W., Elementary Mathematical Programming. Wiley, New York, 1958.
159. Minty, G. J., Monotone Networks, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
160. Motzkin, T. S., Ramifications of Optimization Theory, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
161. Munkres, J., Algorithms for the assignment and transportation problems. J. Soc. Ind. Appl. Math. 5, No. 1 (1957).
162. Munson, J. K., and Rubin, A. I., Optimization by random search on the analog computer. IRE Trans. on Electronic Computers EC8, No. 2 (1959).
163. Nemchinov, V. S. (Ed.), Application of Mathematics in Economic Research. Social-Economic Literature, Moscow, 1959.
164. Nichols, C. R., Linear-programming model for refinery simulation. Oil Gas J. (1959).
165. Orchard-Hays, W., “SCROL,” A Comprehensive Operating System for Linear Programming on the IBM 704. CEIR, Arlington, Virginia, 1960.
166. Orchard-Hays, W., Synopsis of Current and Planned Linear Programming Systems for the IBM 704, 709, and 7090 Computers, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
167. Peuchot, J., Recherches Concernant la Résolution des Problèmes de Programmation Linéaire. Académie des Sciences, Paris, 1960.
168. Pigot, D., Code de calcul pour Programmes linéaires. Département d’Efficience, Soc. Pétroles Shell Berne, 1959.
169. Pigot, D., L’Application de la Méthode Simplexe aux grands Programmes linéaires, Proc. 2nd Intern. Conf. on Operational Research, Aix-en-Provence, France, 1960.
170. Pozzi and Gazzano, Un metodo analogico per i problemi di lineari programmazione. Boll. centro ricerca operativa No. 5-6.
171. Pozzi and Gazzano, Un metodo diretto per la soluzione di sistemi lineari e l’inversione de matrici. Boll. centro ricerca operativa No. 7-8.
172. Prager, W., Linear Programming and Structural Design, Notes on Linear Programming: Part XLII, RM-2021 (ASTIA Document No. AD-150661). The Rand Corp., Santa Monica, California, 1957.
173. Pyle, L. D., The Generalized Inverse in Linear Programming, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
174. Pyne, I. B., Linear programming on an electronic analog computer. Trans. AIEE 75, 139-143 (1956).
175. Pyne, I. B., Linear Programming on an Electronic Analogue Computer (1956 AIEE Transactions Annual), Technical Article Reprint No. 110. Reeves Instrument Corp., New York, 1957.
176. Radner, R., The Application of Linear Programming to Team Decision Problems, Cowles Foundation Paper No. 128, 1959.
177. Ralston, A., and Wilf, H. S., Mathematical Methods for Digital Computers. Wiley, New York, 1960.
178. Reinfeld, N., and Vogel, W., Mathematical Programming. Prentice-Hall, Englewood Cliffs, New Jersey, 1958.
179. Riley, V., and Gass, S. I., Linear Programming and Associated Techniques-An Annotated Bibliography. Johns Hopkins Press, Baltimore, Maryland, 1958.
180. Rockafellar, R. T., The Abstract Algebra of Linear Programming, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
181. Rosen, J. B., The gradient projection method for nonlinear programming, I: Linear Constraints. SIAM Journal 8, No. 1 (1960).
182. Rosen, J. B., Extension of the Gradient Projection Method to Nonlinear Constraints, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
183. Rubinshtein, G. S., The Problem of the Extreme Intersection Point of an Axis with a Bounded Convex Polyhedron and Some of its Applications, Dissertation. Leningrad National A.I. Girtsen Institute, Leningrad, 1955.
184. Rubinshtein, G. S., The problem of the extreme intersection point of an axis with a polyhedron and its application to the investigation of a finite system of linear inequalities. Doklady Akad. Nauk S.S.S.R. (Russian) 100, 627-630 (1955).
185. Rubinshtein, G. S., The problem of the extreme intersection point of an axis with a polyhedron and some of its applications. Uspekhi Mat. Nauk (Russian) 10, 206-207 (1955).
186. Saaty, T. L., and Webb, K. W., Sensitivity and Renewals in Scheduling Aircraft Overhaul, Proc. 2nd Intern. Conf. on Operational Research. English Univ. Press, London, 1960.
187. Shapley, L. S., On Network Flow Functions, Notes on Linear Programming and Extensions: Part L, RM-2338 (ASTIA Document No. AD-214635). The Rand Corp., Santa Monica, California, 1959.
188. Shetty, C. M., Solving linear programming problems with variable parameters. J. Ind. Eng. 10, No. 6 (1959).
189. Shindle, W. E., and Kelley, J. E., Jr., A General Algorithm for the Transportation Problem, UNIVAC Applications Research Center Report. Remington Rand UNIVAC, Philadelphia, Pennsylvania, 1959.
190. Sinden, F. W., Mechanisms for linear programs. Operations Research 7, No. 6 (1959).
191. Smith, D. M., Techniques for Block-angular and Non-linear Programming Problems, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
192. Smith, L. W., Some Data-processing Problems Surrounding the Solution of Large Mathematical Programming Problems, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
193. SNCF, Procédures Manuelles de Résolution des Programmes Linéaires dits de Transport. France.
194. Stone, J. J., The Cross-Section Method, Rand Report P-1490. The Rand Corp., Santa Monica, California, 1958.
195. Stone, R. L., Automatic Minimum Weight Design of Steel Frames on the IBM 650 Computer, Brown University Report IBM 2038/4. Brown University, Providence, Rhode Island, 1958.
196. Stringer, J., and Haley, K. B., The Application of Linear Programming to a Large-Scale Transportation Problem, Proc. 1st Intern. Conf. on Operational Research. Operations Research Soc. of America, Baltimore, Maryland, 1957.
197. Talacko, J., Interval Solutions of Stochastic Linear Inequalities, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
198. Tolstoi, A., Methods of removing irrational shipments in planning. Sotsialist. Transport (Russian) 9, 28-51 (1939).
199. Tornqvist, L., Some new principles for solving linear programming problems. Bull. inst. intern. statist. (Stockholm) (1957).
200. Tucker, A. W., Abstract Structure of the Simplex Method, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
201. Tucker, A. W., An Integer Program for a Multiple-Trip Variant of the Traveling Salesman Problem. Princeton University, Princeton, New Jersey, 1960; J. Assoc. Computing Machinery 7, No. 4 (1960).
202. Tucker, A. W., On a Problem of E. F. Moore. Princeton University, Princeton, New Jersey, 1960.
203. Uzawa, H., Imputations and Programming, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
204. Vajda, S., Basic Solutions, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
204a. Vajda, S., An Introduction to Linear Programming and the Theory of Games. Wiley, New York, 1960.
205. Vajda, S., Readings in Linear Programming. Wiley, New York, 1958.
206. Vazsonyi, A., Scientific Programming in Business and Industry. Wiley, New York, 1958.
207. Vazsonyi, A., Mathematical Programming in Marketing, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
208. Votaw, D. F., Jr., Mathematical Programming and Personnel Assignment, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
209. Wagner, H. M., On the distribution of solutions in linear programming problems. J. Am. Statist. Assoc. 53, 161-163 (1958).
210. Wagner, H. M., The dual simplex algorithm for bounded variables. Naval Research Logist. Quart. 5, No. 3 (1958).
211. Wagner, H. M., The simplex method for beginners. Operations Research 6, No. 2 (1958).
212. Wagner, H. M., A practical guide to the dual theorem. Operations Research 6, No. 3 (1958).
213. Wagner, H. M., On a class of capacitated transportation problems. Management Sci. 5, No. 3 (1959).
214. Wagner, H. M., Linear programming techniques for regression analysis. J. Am. Statist. Assoc. 54, 206-212 (1959).
215. Wagner, H. M., An integer linear-programming model for machine scheduling. Naval Research Logist. Quart. 6, No. 2 (1959).
216. Wagner, H. M., A supplementary bibliography on linear programming. Operations Research 5, No. 4 (1957).
217. Wagner, H. M., and Whitin, T. M., Dynamic version of the economic lot size model. Management Sci. 5, No. 1 (1958).
218. Ward, L. E., Jr., The Problem of N Traveling Salesmen, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California, 1959.
RECENT DEVELOPMENTS IN LINEAR PROGRAMMING
377
219. Warga, J., Convex minimization problems I, Revision I, RAD Tech. Memo TM-59-20. AVCO Research and Advanced Development Division, Wilmington, Massachusetts, 1950. 220. Warga, J., Convex minimization problems 11, The transportation problem, Revision I, RAD Tech. Memo TM-59-21. AVCO Research and Advanced Development Division, M'ilinington, Massachusetts, 1959. 22 I. Warga, J., Convex minimization problems 111, Linear Programming, Revision I, RAD Tech. Memo TM-59-22. AVCO Research and Advanced Development Division, Wilmington, Massachusetts, 1959. 222. White, W. B., Johnson, S. M., and Dantzig, G. B., Chemical Equilibrium in Complex Mixtures, Rand Report P-1059. The Rand Corp., Santa Monica, California, 1957. 223. Williams, A. C., Transportation and Transportation-Like Problems, Electronic Computer Center Report ECC 60.2. Socony Mobd Oil Co , New York, 1960. 221. Witzgall, C., Gradient-Projection Methods for Linear Programming, PrincetonIBM Math. Research Project Tech. Report No. 2. Princeton, New Jersey, 1960. 225. Wolfe, P. Linear Programming and Recent Extensions, Rand Symposium on Mathematical Programming. The Rand Corp., Santa Monica, California. 226. Wolfe, P., The Simplex Method for Quadratic Programming, Rand Report P-1295. The Rand Corp., Santa Monica, California, 1959; also in Econometrica 97, No. 3, 382-398 (1 959). 226a. Wolfe, P., Computational Techniques for Nonlinear Programs, Princeton University Conf. on Linear Programming, 1957 (privately printed). 227. Zoutendijk, G., Maximizing a function in a convex region. J . Roy. Slalisl. SOC. BP1 (1959). 228. Zoutendijk, G., Methods of Peavible Directions, A Study in Linear and Noniinear Programming. Elsevier, Amsterdam, Netherlands, 1960.
The Theory of Automata, a Survey
ROBERT McNAUGHTON
Moore School of Electrical Engineering, University of Pennsylvania, Philadelphia, Pennsylvania
1. Introduction
2. Finite Automata
3. Probabilistic Automata
4. Behavioral Descriptions
5. Various Concepts of Growing Automata
6. Operations by Finite and Growing Automata, Real-Time and General
7. Automaton Recognition
8. Imitation of Life-Like Processes by Automata
References
1. Introduction
The development of the theory of automata is, from one point of view, a historical result of the technological development of the computer art and, from another, promises to be an integral part of that development. Nevertheless, the concept of an automaton is not to be identified with the concept of a computer, and the theory of automata is a theory that is ideally understood in its own terms. Its proper relationship to computer work is that of a pure science to an applied science.

In pace with the development of computer technology, certain parts of the theory of automata have grown so much that they cannot be adequately treated here. Thus, this survey will cover only that part of automaton theory that lies outside of certain subtheories, namely, switching theory, the theory of computability, artificial intelligence, and learning machines, as well as various topics having to do with the statistical aspects of machines. These subtheories are briefly described here with some helpful references to show how they fit into the theory as a whole, but problems that fall clearly within any of them are omitted. One consequence of this decision is that the work reported here is quite recent: the median date of
all the references is 1959; only 15% have dates earlier than 1956 (the date of the publication of Automata Studies by Shannon and McCarthy), and these are mostly borderline references.

The amount of attention given to any reference is no indication of its relative merit. The manner of presentation and the proportion of space given to the various research papers reflect sometimes a personal bias and sometimes irrelevant circumstances. As far as I know, however, no piece of work has been omitted if it is within the area of the survey, and if its author is clearly willing to have it publicized. (I have overcome the temptation to report on several interesting rumors of results that people have obtained.) I would judge that almost all published works within the area of the survey are listed in the bibliography (with a corresponding mention in the text), although no system has been employed that absolutely guarantees that everything published has been reported.¹ Any serious omissions are probably of foreign publications, such as Russian ones.

It is necessary to begin with some sort of definition of “automaton.” For the purposes of this section a rather vague definition will have to suffice. An automaton is a device of finite size at any time with certain parts specified as inputs and outputs, such that what happens at the outputs at any time is determined, or at least its probability distribution function is determined, by what has happened at the inputs. Much of the vagueness of this definition is inherent in the subject matter. Any more precise definition would be sure to exclude some things that some theorists have studied as automata. Most writers do not give a general definition of “automaton,” but merely define a term that designates the kind of automaton they are interested in, such as “finite automaton,” “Turing machine,” etc. General definitions that are found in the literature (for example, the one given on pp.
193-194 of Burks and Wang [10]) are as vague and casual as the one I have given. The reader who is made uncomfortable by vagueness is asked to read patiently through this introductory section, with the promise that concepts discussed in later sections will be more precise.

¹ Several people have read a first draft of this survey and have proposed useful corrections and additions. In particular, their suggestions account for almost half the present bibliography; without having had their help, I could not make even a limited claim of completeness. Needless to say, none of them can be held responsible for any omissions. Although many of them deserve individual footnotes of this length, I shall simply list them all in alphabetic order: F. L. Alt (the editor), A. W. Burks, A. J. Goldman, Z. S. Harris, H. Hiz, H. G. Herzberger, J. H. Holland, A. K. Joshi, L. Kanal, C. Y. Lee, S. Litwin, M. L. Minsky, E. F. Moore, J. Myhill, R. Narasimhan, D. B. Netherwood, A. G. Oettinger, H. E. Tompkins, H. Wang, and H. Yamada.

Another discomforting feature of this definition is its broadness: it designates as automata many things to which we would not ordinarily apply the term. Indeed it can be argued that any object at all that is used for a purpose is, by this definition, an automaton. However, it seems theoretically advantageous not to modify the definition to correct this strange consequence. Rather than make the concept any more specific, we simply acknowledge that some devices are fruitfully described as automata and some are not. Perhaps a way of characterizing those devices that are fruitfully regarded as automata is to say they are devices whose purpose is to transform information, or which at least involve information at either the input end or the output end. Thus computers, telephone switching networks, meters of all kinds, and servo-mechanisms are conveniently studied as applications of automaton theory, while power transformers, egg-beaters, and furnaces are not, even though they all have inputs and outputs. It is difficult to draw the line between devices in which information is involved and devices not involving information. But, in spite of the inherent vagueness in this type of characterization, it does give us at least a rough indication of where the fruitful applications lie. A further point is that automaton theory can be fruitfully applied only to devices that are relatively complicated in their mechanism between inputs and outputs. Thus there is no point in applying it to the study of a thermometer.

The theory of automata has a special subject matter, and not all the knowledge we have about devices classified as automata comes under this heading. The subject matter is abstract, having to do with logical and mathematical consequences of precisely formulated concepts; it excludes physical problems in the realization of automata. For example, solid state physics and computer engineering are in no sense part of automaton theory.

What happens at the inputs of an automaton is completely out of control of the automaton, except that it is usually sensitive through the inputs to certain events and not others.
For example, it may be sensitive to temperature but not to earthquakes; or it may be sensitive to earth tremors but completely insensitive, as a working device, to normal temperature differences. In the case of a computer the inputs are set at will by the human community, but within the limits of the input mechanism. However, since the theory of automata is an abstract theory, it does not concern itself with what in the environment an automaton is sensitive to. It assumes simply that an input is capable of taking on a value, at any time of its operation, that is within a certain range of values. Similarly, the theory stipulates that an output at any time takes on a value within its range of values, and is not concerned with what the automaton does to its environment through its outputs.

The term “input history” is used to denote the complete history of the values that the inputs have assumed, including information about when they have assumed them. In these terms we can rephrase part of the definition of
“automaton” and say that the value of the output at any time depends on the input history up to that time. In a similar way we can define “output history,” and observe that, since the value of the output depends on the input history, the output history up to a certain time depends on the input history up to that time.

A few distinctions are useful. If the value of the output of an automaton is determined absolutely, and not probabilistically, by the input history, then the automaton is deterministic. Otherwise it is probabilistic.

Although an automaton must have at any moment a finite size, it may grow in time. If so, it may grow to be arbitrarily large as time goes on indefinitely. An example of an automaton that grows to be arbitrarily large is a Turing machine (see Section 5) whose used portion of tape grows without bound; for, to be considered as an automaton, a Turing machine must include its tape. (Also, since the tape serves as input and output, the inputs and outputs of a Turing machine are not limited in size.) Automata are classified according to whether they can grow to be arbitrarily large or not. An automaton that grows but cannot become arbitrarily large turns out to be no more powerful in what it can do than a nongrowing automaton. The two interesting classifications then are potentially infinite growing automata (“growing automata” for short) and fixed automata.

If there is a continuum of values for each input and output (like our common conception of a needle on a dial) then the automaton is said to be continuous; if each input and output admits only a discrete class of values then the automaton is discrete. It is quite common to think of an analog computer as a continuous automaton and a digital computer as a discrete automaton. Mixed automata may exist, having both discrete inputs or outputs and continuous inputs or outputs; for the purposes of theory these are conveniently characterized as continuous automata.
A theoretical exposition of continuous automata by Nerode is found in Section 3 of Myhill et al. [84]. (Because I have never seen this unpublished report I cannot review it.) It is likely that no other research on the theory of continuous automata in the abstract has ever been reported. Hence, the remainder of this survey will deal exclusively with discrete automata, which is not meant to imply that the theory of continuous automata is not a possible interesting field for future research.

An automaton is either synchronous or nonsynchronous. A synchronous automaton is one whose input and output history can be described completely as occurring during certain discrete moments of time. (If a beginning time is assumed, as in Section 2, the times can be labeled time 0, time 1, time 2, etc.) A nonsynchronous automaton is one whose history cannot be described in this manner, because its action is understood only
in terms of the continuity of time. A synchronous computer is an obvious example of a synchronous automaton, and an asynchronous computer an example of a nonsynchronous automaton. (These distinctions, growing vs. fixed, discrete vs. continuous, synchronous vs. nonsynchronous, and deterministic vs. probabilistic, are discussed by Burks [8, 10].)

We must distinguish between the behavior and structure of an automaton. Its structure embodies the principles of its construction from parts. Its behavior concerns the phenomena that are observable when the automaton is enclosed in a black box, and the only exposed parts are the inputs and outputs.

The subtheory known as switching theory is a rather fundamental part of the theory of automata. In spite of this fact, I have decided not to include switching theory in this survey, because it would exceed by far the limitations on this paper to report on the many papers on switching theory that have appeared during the last ten years or so. However, in order to make clear to the reader just what I am omitting I shall have to delimit its contents.

A switching circuit can be defined as a circuit that can be profitably analyzed as consisting of a finite number of points each of which has at any one time one of a finite number of values. The values of the points usually depend on each other in certain ways, so that, if some of its points are designated as inputs and some as outputs, a switching circuit is a fixed discrete automaton. (Incidentally, according to this same point of view, a classical electrical circuit is a fixed continuous automaton. One could maintain that classical circuit theory in its abstract aspects is a branch of continuous automaton theory.) However, not everything about fixed discrete automata is part of switching theory. I propose that this theory embraces knowledge primarily concerning the structure of such devices.
Some of its problems have to do with the relationship between behavior and structure, but any subtle or intricate aspects of these problems are problems of structure and not behavior. Thus, it includes all problems having to do with minimization of components. In particular, the problem of finding a minimal normal formula, if it is not a problem of pure logic, is a problem of switching theory; in either case, it is not discussed in this survey. The reader is warned that much of the theory of sequential circuits is hereby omitted, even though these problems are often placed under the heading of automaton theory.

My proposed definition, then, is that switching theory is the study of the structure and the relationship between behavior and structure in fixed discrete automata. I think that this characterization is as accurate and informative as any other equally short characterization of the work that
has been done in the field. Problems of switching theory that have so far been studied are many and varied. The reader who is curious to get an idea of what they are can consult two recent texts: Caldwell [15] and Humphrey [53]. Further references can be found in two bibliographies: Netherwood [86] and Harvard Computation Laboratory [44].

There are some problems of the structure of fixed discrete automata that have the flavor of generality and abstractness that appeals to those who call themselves automaton theorists. For example, Hartmanis [43] gives a technique for determining whether an automaton can be partitioned into several machines, and, if so, how it can thus be decomposed. Although he is concerned with structure, one reads his paper with the feeling that the significance is mainly behavioral. Holland [48] is concerned with cycles in logical nets (in the sense of Burks and Wright [11]), which are nets made up of delay elements and truth-functional elements (such as “and” gates, “or” gates and “not” gates). He shows, for example, that nets having only one delay element in any cycle realize rather trivial functions. (A cycle is a closed path in the net that does not intersect itself.) The significance of Holland’s work involves structure, but the behavioral characterizations that he establishes for nets of a given cycle structure are of interest in themselves.

I do not think that the division between switching theory and automaton theory proper is one that can serve as a division of labor in research. For years to come, I suspect, anyone interested in one of these areas will inevitably find himself interested in the other. I have drawn the line between the two areas solely to make it possible to write a survey paper covering one of them. Even for such a limited purpose the act of drawing the line has its dangers; for there are many who classify as part of one area problems that I have classified as part of the other.
The theory of automata proper is left with two main areas. The first is the study of rather general properties, structural and behavioral, of growing automata. Most of these problems are not problems of efficiency of design or efficiency of construction, although there are some exceptions. (In contrast, much of the switching-theory literature is concerned with efficiency.) An existence theorem about automata satisfying certain conditions would be a typical example of work done in this first area. The second main area is the study of the various behavioral descriptions of automata of all varieties. An algorithm converting one kind of behavioral description into another is an example in this area.

The reader may be interested in a comparison of this survey with other papers with similar titles. Shannon’s survey [100] gives, for the most part, a more detailed discussion of some of the material in Section 8. In spite of its early date, it makes rather interesting reading. After a short discussion
of the human brain, computers, Turing machines, and logic machines, it presents a rather systematic study of machine game playing. It then goes into learning machines and self-reproducing machines. Holland [47] briefly reviews formal systems and logical calculi, recursive functions, Turing machines, stored-program computing machines, the theory of logical nets, information theory, and neurophysiology and psychology. The article [1] by Aizerman et al. is not a survey of automaton theory but is an original paper mostly in switching theory, according to the definition I have given above. For the most part it is rather abstract switching theory, in that its methods do not apply to just one kind of hardware realization but involve structural considerations that are quite general. The first part of this paper does have some material within the limits of this survey, and will be mentioned in a later section.
2. Finite Automata
In general, the output of an automaton reacts at any time not merely to the input at that time, but to the input history. Thus, in some way or other, such an automaton must have memory. An important theoretical concept in this connection is the concept of state: an automaton can be in various states at various times, and it is by going from one state to another in response to something that has happened at the inputs that it remembers what has happened. The total number of different states is an indication of the amount of memory that the automaton has. All the memory is explained in terms of the states, so that the value of each of the outputs, at any time, depends on just the values of the inputs at that time and the state; the only influence the past input history can have on the present value of the output is through its determination of the present state. (It is to be noted that, if this point of view is strictly maintained, a Turing machine has an infinite number of states. What is usually referred to by the word “state” in the Turing machine is really the state of part of it, not including the tape.)

Another important distinction is the one between finite-state automata and automata with infinitely many states. An example of the former is a switching circuit as described in Section 1. An example of the latter is any interesting growing automaton (see Section 5). In this context the number of states of an automaton is to be understood as the smallest number of states that can be used to explain its behavior. “State” is used in a behavioral and not a structural sense. Thus a device that can be analyzed structurally as having an infinite number of states is a finite-state device, if
all but a finite number of its states can be dispensed with in explaining its behavior.

In recapitulation, automata have been separated into fixed and growing, into discrete and continuous, into synchronous and nonsynchronous, into deterministic and probabilistic, and into finite-state and infinite-state. These five bifurcations together divide the class of automata into thirty-two possible subclasses, some of which have received attention in the literature and will be discussed in this survey. Some of these classes may be empty. For example, it seems that there are no fixed discrete synchronous deterministic infinite-state automata. But no one has proved this proposition, and it seems clear that it cannot be proved by abstract reasoning from the definitions without the aid of some extra assumptions.

The most elementary of the thirty-two classes is the class of fixed discrete synchronous deterministic finite-state automata, in the sense that the study of any of the other classes usually makes use of the fundamental properties of automata in this class. These automata are discussed so often that they have the simple name “finite automata.”
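The count of thirty-two is simply the number of ways of making five independent two-way choices, 2^5 = 32. As a minimal sketch (the names DISTINCTIONS and classes are illustrative only and do not come from the literature surveyed), the subclasses can be enumerated mechanically in Python:

```python
from itertools import product

# The five two-way distinctions drawn in the text, in the order given there.
DISTINCTIONS = [
    ("fixed", "growing"),
    ("discrete", "continuous"),
    ("synchronous", "nonsynchronous"),
    ("deterministic", "probabilistic"),
    ("finite-state", "infinite-state"),
]

# One class for every way of choosing one term from each distinction.
classes = [" ".join(choice) for choice in product(*DISTINCTIONS)]

print(len(classes))  # 32
print(classes[0])    # fixed discrete synchronous deterministic finite-state
```

The first class listed is the one called “finite automata” in the text. The enumeration says nothing about which of the thirty-two classes are empty.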
FIG. 1
Given a state and a value for each input of a finite automaton, at any moment of time, the value for each output and the state at the next moment of time are determined. The behavior of the device is characterized completely by a description of these transitions from state to state and output values, presented as a table or a graph. This characterization is quite useful, and is fundamental in the theory.

For example, consider the finite automaton characterized by the state graph of Fig. 1. This device has one input and one output each having a range of two values, 0 and 1. The two circles represent the two states and the arrows show the transitions. The “1/0” on the arrow leading from state I means that if the automaton is in state I and the input value is 1 then the output value is 0 and the state at the next moment of time is II. Note that, of necessity, there is exactly one arrow leaving each state with any given input value.

Many authors assume that an automaton has an initial state, which it assumes at the initial moment of time. The history of the device consists completely of what happened at the initial moment of time and at succeeding moments of time up through the present. The initial moment of time
is called “time 0”; the succeeding moments are, in order, time 1, time 2, time 3, etc. Since the automaton invariably starts in the initial state, the succession of states is determined from the histories of all the inputs. Since the output values at any time depend on the state and the input values, the output values at any time are determined from the input histories. Thus with the aid of the initial-state assumption, the state-graph characterization fits with the part of the definition of “automaton” given in Section 1 that says an output value at any time is determined from the histories of the inputs.

Some papers do not assume an initial state in their discussion of finite automata. (Examples are Moore [78] and other papers that have pursued Moore’s problems.) A finite automaton as conceived in that way would be a class of finite automata as conceived here, one for each different choice of an initial state. It is not difficult to discuss the results of those papers in these terms. The advantage of the definition of Section 1 (and the initial-state assumption) is that an automaton can be thought of as a device for transforming the information of the input histories into the information of the output histories. Convenient and precise behavioral descriptions are then possible. For example, if state I is the initial state of the device whose graph is Fig. 1, then there is another way of characterizing its behavior. That is to say that the output value is 1 at any time if and only if the input value is 1 at that time and there have been an even number of 1’s in the input history up to and including that time. No similar behavioral description would be available if there were no initial time and initial state.

An important concept in connection with state graphs is reduction. This concept is most conveniently described under the initial-state assumption, although it does not depend on it. Two state graphs are equivalent if every input sequence (i.e. input history) yields two identical output sequences (i.e. output histories) when applied to both graphs. Thus, for example, the state graph of Fig. 2 is equivalent to the state graph of Fig. 1 assuming that state I is the initial state in each case. A state graph is reduced if there is no equivalent state graph with fewer states. A fundamental theorem is that two reduced equivalent state graphs are isomorphic, which means that there is a one-to-one correspondence between the states of one and the states of the other such that initial states correspond to each other and transition arrows between states in one graph are the same, and are labeled the same, as the transition arrows between corresponding states in the other. (For further elucidation on state graphs see Huffman [50], Mealy [72], Moore [78], Rubinoff [98], and Caldwell [15].)

A state graph certain of whose transition arrows or output specifications are missing is called a “don’t-care state graph.” Such a graph might be drawn by someone who wanted to specify a finite automaton but in the case of certain combinations of state and input value did not care either about the next state or the output value or both. One’s interest in the concept of a don’t-care state graph depends to some extent on how likely it is that he will ever prescribe a finite automaton in this manner. In my opinion, when one wants to specify a finite automaton it is more likely that he will find one of the languages discussed in Section 4 more suitable than the language of state graphs. If this is so, then the concept of a don’t-care state graph is important, assuming don’t-care conditions as taken into account in those languages give rise to don’t-care state graphs.

Ginsburg [35-38] and Paull and Unger [89] seek to generalize the concepts and algorithms of fully specified state graphs to don’t-care state graphs. The same is attempted by C. Y. Lee [65, p. 1279ff.], although Lee’s approach differs from the others. The reader who has never studied any of these papers may begin to understand how difficult the problems are if he tries to formulate a notion for don’t-care state graphs that will serve the same purpose that the notion of equivalence serves for fully specified state graphs.
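For fully specified state graphs, the notions of this section can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not a reproduction of any algorithm in the papers cited. The table FIG1 is reconstructed from the description of Fig. 1 in the text; the arrows for input value 0 are an assumption inferred from the even-number-of-1’s description, with state I taken as initial. The table REDUNDANT is a hypothetical unreduced graph and is not the graph of Fig. 2. The function names are likewise chosen here for illustration. Equivalence is tested by exploring pairs of states reached by the same input sequence and checking that they never disagree on an output.

```python
from collections import deque
from itertools import product

# State graph as a table: (state, input value) -> (output value, next state).
# Assumption: the 0-input arrows are inferred from the parity description.
FIG1 = {
    ("I", 0): (0, "I"),
    ("I", 1): (0, "II"),
    ("II", 0): (0, "II"),
    ("II", 1): (1, "I"),
}

# Hypothetical unreduced graph: it counts 1's modulo 4, so A and C
# (and likewise B and D) are indistinguishable by any input sequence.
REDUNDANT = {
    ("A", 0): (0, "A"), ("A", 1): (0, "B"),
    ("B", 0): (0, "B"), ("B", 1): (1, "C"),
    ("C", 0): (0, "C"), ("C", 1): (0, "D"),
    ("D", 0): (0, "D"), ("D", 1): (1, "A"),
}

def run(graph, initial, inputs):
    """Return the output sequence produced by an input sequence."""
    state, outputs = initial, []
    for x in inputs:
        out, state = graph[(state, x)]
        outputs.append(out)
    return outputs

def parity_description(inputs):
    """Output 1 iff the current input is 1 and the number of 1's so far,
    including the current one, is even."""
    outputs, ones = [], 0
    for x in inputs:
        ones += x
        outputs.append(1 if x == 1 and ones % 2 == 0 else 0)
    return outputs

def equivalent(g1, init1, g2, init2, alphabet=(0, 1)):
    """True iff every input sequence yields identical output sequences."""
    seen = {(init1, init2)}
    queue = deque(seen)
    while queue:
        s1, s2 = queue.popleft()
        for x in alphabet:
            out1, n1 = g1[(s1, x)]
            out2, n2 = g2[(s2, x)]
            if out1 != out2:
                return False
            if (n1, n2) not in seen:
                seen.add((n1, n2))
                queue.append((n1, n2))
    return True

# Compare the graph with the behavioral description on all input
# sequences of length at most 8.
agrees = all(
    run(FIG1, "I", seq) == parity_description(seq)
    for n in range(9)
    for seq in product((0, 1), repeat=n)
)
print(agrees)
print(run(FIG1, "I", [1, 1, 0, 1, 1]))
print(equivalent(FIG1, "I", REDUNDANT, "A"))
```

Because only finitely many pairs of states exist, the search in equivalent always terminates. The four-state table is equivalent to the two-state one but has more states, so it is not reduced in the sense defined above.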
The concept of equivalence is applied to automata themselves as well as to state graphs. Two finite automata are equivalent if every input sequence (applied at the initial state) yields the same output sequence from one as from the other. Moore [78] considers the problem of devising experiments to tell whether machines are equivalent: an experiment is simply an input sequence. Because Moore does not stipulate an initial state, his concepts are more complicated than the corresponding concepts introduced here. Ginsburg [34] continues that part of Moore’s study having to do with how long experiments have to be, and Deuel [28] gives a more exacting analysis of some of Moore’s important concepts, and attempts to reconstruct in formal terms the theory behind experiments on machines.

Trakhtenbrot [106] is essentially a study of the quantity of memory in a finite automaton. If the automaton has n states then the amount of memory is $\log_2 n$. In these terms Trakhtenbrot characterizes the periodic behavior of the output given the periodic behavior of the input of an automaton
(with a given quantity of memory), and gives an independent proof of a theorem of Moore [78] on how short an experiment can be that distinguishes two inequivalent finite automata (with a given quantity of memory).

In Gill [33] an experiment of length $2^n + n - 1$ is developed for an automaton with a single binary input that will reveal everything there is to know about the automaton given that, for any time t, the output values at time t depend only on the input histories after time t - n. An automaton fulfilling this last condition is called by Gill a “finite-memory automaton.” Srinivasan and Narasimhan [105] show how to convert the specification of a finite-memory automaton (which is simply a specification of the output at time t given each input sequence of length n) into a state graph for the automaton.

In Burks and Wang [10] a finite automaton is structurally characterized as a net made up of a finite number of “and” gates, “or” gates, “not” gates, and delay elements connected together in a not too arbitrary manner. (The concept of a logical net goes back to Burks and Wright [11].) This concept of finite automaton is equivalent to the one defined here in the sense that every finite automaton is behaviorally equivalent to some net. However, a finite automaton need not be structurally similar to any net.

Given any two states $S_1$ and $S_2$ in the state graph, there may or may not be a path of arrows from $S_1$ to $S_2$. If there is not then it is impossible for the automaton ever to get into $S_2$ after it has once been in $S_1$. State graphs and states are classified according to this kind of consideration. If there is no path from the initial state to a state S, then S is an inadmissible state. An inadmissible state is a state that the automaton can never get into; hence it is regarded as a state of an automaton for structural reasons and not for behavioral reasons. It represents a conceivable configuration of the automaton that the automaton can never get into.
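Whether a state is inadmissible can be decided by a simple search along the arrows of the state graph. The following Python sketch is an illustration only: the table EXAMPLE is a hypothetical three-state graph, the function names are chosen here for convenience, and the table layout (state and input value mapped to output value and next state) is an assumed representation.

```python
from collections import deque

def admissible_states(graph, initial, alphabet=(0, 1)):
    """States that can be reached from the initial state by a path of arrows."""
    reached = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for x in alphabet:
            _, nxt = graph[(state, x)]
            if nxt not in reached:
                reached.add(nxt)
                queue.append(nxt)
    return reached

def inadmissible_states(graph, initial, alphabet=(0, 1)):
    """States of the graph that no input sequence can ever lead to."""
    all_states = {state for (state, _) in graph}
    return all_states - admissible_states(graph, initial, alphabet)

# Hypothetical graph: arrows leave state III, but none lead into it.
EXAMPLE = {
    ("I", 0): (0, "I"), ("I", 1): (0, "II"),
    ("II", 0): (0, "II"), ("II", 1): (1, "I"),
    ("III", 0): (0, "I"), ("III", 1): (1, "II"),
}

print(inadmissible_states(EXAMPLE, "I"))  # {'III'}
```

With state I as the initial state, III is inadmissible. Deleting it from the table leaves the behavior of the automaton unchanged, which is the sense in which such a state is structural rather than behavioral.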
(See Burks [8].)

If, for every two states $S_1$ and $S_2$ of a state graph, there is a path from $S_1$ to $S_2$ and a path from $S_2$ to $S_1$, then the state graph is strongly connected. An automaton that is described by a strongly connected state graph is fully reversible in the sense that whatever is done to it in the way of an input sequence can always be undone by subsequently applying another suitable input sequence. The state graph of Fig. 1 is strongly connected. Not all important finite automata have strongly connected state graphs. For example, the one-input one-output automaton, whose output value is 1 at time t if and only if there has been a 1 in the input sequence at some time preceding time t, is described by the state graph of Fig. 3, which is obviously not strongly connected.

One approach to state graphs has been given here out of several appearing in the literature. In some places the output is a function of the state only, instead of the state and the input. Also, by making some minor conceptual adjustments, state graphs can be applied to the asynchronous circuits of switching theory, which are nonsynchronous but are otherwise like finite automata. Cadden’s papers [13,14], both in the area of switching theory, show how three types of state graph arise in the study of switching circuits, and present algorithms for converting a graph of any one of these types into a graph of any other type that is functionally equivalent to it.

If we apply the term “information” to what goes in at the inputs and what comes out at the outputs of an automaton, we can think of it as a device for transforming information. (See Section 1.) In a deterministic automaton the input information must in a sense contain the output information, since the output information is determined given the input information. Thinking in these terms we are led to ask the question, for what automata does the output information also contain the input information? Huffman calls such automata “information-lossless”; he has made a study of finite automata and has advanced methods of testing these
FIG. 3
automata to determine whether they have this property [51,52]. Note that it is theoretically possible for a person to observe the outputs of an information-lossless automaton and then to tell exactly how the inputs were activated. In general, however, in order to tell what the input values are at a given time, he may have to wait for information about the output values many units of time later.

In a series of papers, Huzino studies some of the concepts discussed in this section from an advanced and quite abstract mathematical point of view. He calls attention [54] to some interesting kinds of machine, based on a state-graph analysis: his uniform machine is in effect a combinational machine, i.e., one whose output reflects the input at that time and is independent of the past history; his Latin-connected machine is like a counter, in that each input serves to rotate the machine a certain distance and direction around the cycle of states, but resembles a Latin square in that for any two states there is an input taking the machine from one state to the other; his reversible machine, if it goes from one state to another, can get back to the first state again. In his 1958 paper [55] Huzino reformulates the problem of reducibility. Later papers [56, 57, 58] study operators on state graphs, which are in
effect operators that map machines or pairs of machines into machines. In fact, the introduction and use of operators on state graphs seem to characterize Huzino’s general approach, the underlying motivation of which is often unclear.

Many writers on automata take an abstract set-theoretical point of view. As a typical example, they may define “finite automaton” as an ordered quintuple (S, I, U, f, g) where each of the five items is itself an abstract mathematical entity: S, I, and U are finite sets of abstract entities, called elements, f is a function that maps S × I into S, and g is a function that maps S × I into U. Thus, for every element s in S and i in I, f(s, i) is an element of S and g(s, i) is an element of U. It is to be noted that, according to the abstract approach, this definition is complete. A finite automaton is any ordered quintuple satisfying just these conditions. To satisfy the reader’s psychological need, and not to satisfy any theoretical need, the abstract automaton theorist will explain that S can be thought of as a set of states, I a set of input values and U a set of output values; f can be thought of as a transition function, which tells the next state given the input value and present state; and g the output function, which tells the output value given the input value and present state. Given the quintuple, a state graph is determined, and given a state graph of a one-input one-output automaton the quintuple is determined. (For automata with several inputs or several outputs, minor adjustments can be made.)

Perhaps the greatest justification of the abstract approach is that it leads to generalization to other interesting notions by liberalizing the abstract features of the definition. It can then be shown how certain concepts that are apparently unrelated to the concept of finite automaton are in fact related. Two papers that have achieved such results are those by Burks and Wright [12] and Ginsburg [40].
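The quintuple (S, I, U, f, g) translates almost directly into a small data structure. The sketch below is illustrative only: the class name, the method names, and the state labels "no1" and "seen1" are assumptions, and the example automaton is one plausible rendering of the machine whose output is 1 at time t if and only if a 1 has occurred in the input at some time preceding t. The strong-connectedness check follows the definition given in this section (a path from every state to every other state).

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, FrozenSet, Hashable, List

@dataclass(frozen=True)
class FiniteAutomaton:
    S: FrozenSet[Hashable]  # states
    I: FrozenSet[Hashable]  # input values
    U: FrozenSet[Hashable]  # output values
    f: Callable             # transition function, S x I -> S
    g: Callable             # output function,     S x I -> U

    def is_well_formed(self) -> bool:
        """Check that f maps S x I into S and g maps S x I into U."""
        return all(self.f(s, i) in self.S and self.g(s, i) in self.U
                   for s, i in product(self.S, self.I))

    def run(self, initial, inputs) -> List:
        """Return the output value produced at each time step."""
        state, outputs = initial, []
        for i in inputs:
            outputs.append(self.g(state, i))
            state = self.f(state, i)
        return outputs

    def reachable(self, start) -> set:
        """All states with a path of arrows from `start` (including start)."""
        seen, stack = {start}, [start]
        while stack:
            s = stack.pop()
            for i in self.I:
                t = self.f(s, i)
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        return seen

    def strongly_connected(self) -> bool:
        return all(self.reachable(s) == set(self.S) for s in self.S)

# Hypothetical encoding: output is 1 iff a 1 occurred strictly before time t.
SEEN_A_ONE = FiniteAutomaton(
    S=frozenset({"no1", "seen1"}),
    I=frozenset({0, 1}),
    U=frozenset({0, 1}),
    f=lambda s, i: "seen1" if s == "seen1" or i == 1 else "no1",
    g=lambda s, i: 1 if s == "seen1" else 0,
)

print(SEEN_A_ONE.run("no1", [0, 1, 0, 0]))  # → [0, 0, 1, 1]
print(SEEN_A_ONE.strongly_connected())      # → False
```

Once the automaton has entered "seen1" it can never return to "no1", which is why the check reports that the graph is not strongly connected.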
3. Probabilistic Automata
The physical universe, it is often said, is a statistical one and not a fully deterministic one, with regard to laws of cause and effect. If this is so then it would seem that deterministic automata are fictitious, and that all automata are probabilistic, although the relevant probabilities may be close to 1. It has to be explained, therefore, why a large part of automaton theory is concerned with deterministic devices, and why, indeed, the study of probabilistic automata has, in spite of some noble attempts, not proceeded far at all.
The area of application of automaton theory can be divided into man-made devices and natural devices (such as the animal nervous system). The application to man-made devices has so far been more significant than the application to natural devices. Since man-made devices are usually built to be as deterministic as possible, departures from determinism, inevitable as they are, are regarded as imperfections to be minimized. Cases of useful probabilistic features are relatively rare. For this reason, deterministic automata are studied as the ideal, even though actual machines may never come up to them perfectly.

There are cases in which artificial machines do use probabilistic features by design. An example would be any calculation on a digital computer using random numbers. Other examples are machines that are built (or programmed) to imitate natural machines, discussed in Section 8. To throw light on these and on natural machines (animal nervous systems, etc.) a probabilistic theory of automata would certainly be appropriate.

Some have felt that a more realistic switching theory would be probabilistic rather than deterministic. Von Neumann’s paper [110] is best regarded as part of switching theory rather than of the theory of automata proper, by the criteria of Section 1. However, the application of his work is so broad (to machines in general, to the nervous system, etc.) that this classification may seem strange even if technically correct according to my definitions. Von Neumann’s main problem is to show how to construct nets to control error out of elements with a certain error associated with them. This is also the problem in Moore and Shannon [81] and other more recent papers in switching theory.

An interesting question is whether there is anything that can be done by a probabilistic automaton but not by a deterministic automaton.
Of course, the answer is in the affirmative if the question is interpreted in its broadest sense: a probabilistic automaton can generate a random sequence and a deterministic automaton cannot, in the strictest sense of the word “random.” But this answer seems to be unsatisfactory. Producing a random sequence is a rather indefinite task. We would like to know if there is anything more definite that a probabilistic machine can do. Now the question seems a little unfair to probabilistic machines, since whatever they can do they must do probabilistically. The reader might think that it is impossible to formulate an interesting and precise question here.

Actually, in de Leeuw et al. [27], such a precise and interesting question is formulated. The formulation has to do with enumeration of a set of symbols printed by the output of a machine. For every symbol S that the output can logically print there will be a degree of probability r that S will eventually be printed by a given probabilistic machine. If, for some probabilistic machine, S is the set of all symbols
that are printed with at least the probability p, then S is p-enumerable. The precise formulation of the question, then, is as follows: are there any sets that are p-enumerable but not 1-enumerable? The question is answered in the article in the negative.

The precise question formulated by de Leeuw et al., it seems, does not exhaust the meaning of the original question. It would be interesting to formulate other precise versions of the question and attempt to answer them. As far as I know, no other formulations have been made. Indeed, one might say in general that, before the theory of probabilistic automata can proceed to develop, someone must formulate some interesting and precise problems.

Recently, G. F. Rose has given a probabilistic interpretation of a deterministic machine [96]. He considers a Turing machine (see Section 5) which, when given n numbers, $x_1, \ldots, x_n$, in order on its tape (for some fixed n), prints n other numbers, $y_1, \ldots, y_n$, in order on its tape. If, from every such n-tuple of x’s, there are at least m of the x’s such that $f(x_i) = y_i$, then the machine is said to compute f with probability m/n. The justification of this definition lies in the fact that, in general, there may be no way of knowing for which m of the n x’s the function has been correctly computed. Rose leaves open the crucial question, are there any functions that are computable with probability m/n, for any m and n, $m \le n$, that are not computable with probability 1/1 (which is the same as being computable in the sense discussed in Section 5)? As Myhill points out in correspondence, it will be interesting if it turns out that there are such functions for some probability m/n close to 1.
4. Behavioral Descriptions
In Kleene [64], a problem is introduced and solved that has turned out to be fundamental in the theory of automata. Consider the class of all finite automata with one binary input and one binary output, with values 0 and 1. Also let us stipulate that each automaton in the class has an initial state, the state at time 0. For any finite sequence S of 0’s and 1’s as an input history, an automaton will react either with an output 1 or an output 0. Every automaton breaks down the set of all finite sequences of 0’s and 1’s into two disjoint sets, those to which it reacts with an output 1 and those to which it reacts with an output 0; we say that the automaton represents that set of sequences to which it reacts with a 1. Kleene’s problem is, what sets of sequences of 0’s and 1’s are representable by finite automata? (The generalization of the problem that allows
input values other than 0 and 1 turns out to add no further difficulty.) Kleene’s answer is that a set is so representable if and only if it is “regular.” “Regular set” (or “regular event,” as he calls it) he defines by recursion, which I shall paraphrase after giving some preliminary definitions.

Let A and B be two sets of sequences. Then the concatenation set $A \cdot B$ of A and B is the set of all sequences which are made up of a member of A followed by a member of B. (Thus, for example, if 00111 is a member of A and 0010 is a member of B then 001110010 is a member of $A \cdot B$.) The iterate of A on B, $A^*B$, is the smallest set that contains every sequence of B, $A \cdot B$, $A \cdot A \cdot B$, $A \cdot A \cdot A \cdot B$, etc. In other words, $A^*B$ is the set of all sequences which are obtained from taking (for all $n \ge 0$) n sequences of A (possibly with repetitions) and a sequence of B and joining them together in that order.

“Regular set” is defined recursively as follows:
(1) Every finite set of sequences is regular, i.e., a set with only finitely many members. (Keep in mind that although we are dealing exclusively with finite sequences, the set of all finite sequences is infinite.)
(2) The concatenation set of two regular sets is regular.
(3) The iterate of a regular set on a regular set is regular.
(4) The union of two regular sets is regular.
(5) No set is regular unless its being so follows from (1), (2), (3), and (4).
(The reader who is not familiar with recursive definitions is referred to Church [21] or Copi [23].)

Kleene’s theorem is that a set of finite sequences is representable by a finite automaton if and only if it is regular. Kleene’s work was continued by Myhill [84].

The definition of regularity gives us a behavioral language known as the “regular-expression language” for finite automata with one binary output and one binary input. “Regular expression” is defined as follows: (1) “0” and “1” are regular expressions. If $\alpha$ and $\beta$ are regular expressions then (2) $\alpha \cdot \beta$, (3) $\alpha^*\beta$, and (4) $\alpha \cup \beta$ are regular expressions. (5) Nothing is a regular expression unless its being so follows from (1), (2), (3), and (4). An example of a regular expression is $((0 \cdot 0)^*1 \cup 1^*0) \cdot (0 \cdot 0)^*1$. Rather obviously, a regular expression is a name for a regular set.

The behavior of a one-input one-output automaton A is described by a regular expression $\alpha$ if the following is true: the output value of A at time t is 1 if and only if the input sequence from time 0 through time t inclusive is a sequence in the class named by the expression. For example, if an automaton’s behavior is described by $(10)^*1$, and if the input sequence is 101010... then the output sequence will be 101010.... For since 1 (the input sequence up to time 0) is a member of $(10)^*1$ the output is 1 at time 0; since 10 (the input sequence up to time 1) is not a member of $(10)^*1$ the output is 0 at time 1; since 101 (the input sequence up to time 2)
is a member of $(10)^*1$ the output is 1 at time 2; etc. Hence the output sequence is 101010.... (See Copi et al. [24].)

In order to use regular expressions to describe the behavior of automata other than the kind that has one binary input and one binary output, adjustments must be made. If the input is not binary or if there is more than one input, the digits 0 and 1 must be supplemented. It is convenient to use a separate digit or special symbol for each combination of input values. For example, if there are three binary inputs then there are eight combinations of input values which are symbolized as 0, 1, 2, 3, 4, 5, 6, 7; regular expressions used in describing this automaton may have these as digits. If there are several outputs, but these are binary, then a separate regular expression must be used for each output. If an output is not binary, and has p possible values, then the following plan can be used. Where x is the least integer such that $2^x \ge p$, the output can be coded as if it were x binary outputs, and x regular expressions are used. For example, if, for a given output U, p = 3, then x = 2; U can be thought of as two binary outputs $U_1$ and $U_2$, where, if the three values of U are 0, 1, 2, then 0 for U can be thought of as 0, 0 (0 for $U_1$, 0 for $U_2$), 1 as 0, 1, and 2 as 1, 0.

The regular-expression language can be liberalized in several ways. The first is to allow the star operator to stand alone. Thus $\alpha^*$ names the set that differs from the set named by $\alpha^*\alpha$ in that the set named by $\alpha^*$ also contains the empty sequence, i.e., the sequence of zero length. $\alpha \cap \beta$ names the set which is the intersection of the set named by $\alpha$ and the set named by $\beta$. And $\sim\alpha$ is a name for the complement (with respect to the set of all finite sequences of 0’s and 1’s) of the set named by $\alpha$.
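The worked example with (10)*1 and the coding of a p-valued output can both be checked mechanically. The sketch below uses Python's standard `re` module as a stand-in for the regular-expression language of this section; the function names are hypothetical, and Python's pattern syntax is of course not the notation of the sources cited.

```python
import re

def output_sequence(pattern, inputs):
    """Output at time t is 1 iff the inputs from time 0 through t match.

    `inputs` is a string of '0' and '1'; `pattern` uses Python's regex
    syntax, in which concatenation is juxtaposition and '*' is iteration.
    """
    compiled = re.compile(pattern)
    return "".join(
        "1" if compiled.fullmatch(inputs[: t + 1]) else "0"
        for t in range(len(inputs))
    )

def bits_needed(p):
    """Least integer x such that 2**x >= p."""
    x = 0
    while 2 ** x < p:
        x += 1
    return x

def encode_output(value, p):
    """Code one of p output values as a tuple of x binary outputs."""
    x = bits_needed(p)
    return tuple((value >> (x - 1 - k)) & 1 for k in range(x))

print(output_sequence(r"(10)*1", "101010"))      # → 101010
print(bits_needed(3))                            # → 2
print([encode_output(v, 3) for v in range(3)])   # → [(0, 0), (0, 1), (1, 0)]
```

The first line reproduces the output sequence derived step by step in the text, and the last reproduces the coding of a three-valued output as two binary outputs.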
It is readily proved that the intersection of two regular sets is regular, and the complement of a regular set is regular; these are consequences of Kleene’s theorem, using elementary facts about automata.

The regular-expression language so enlarged is discussed in McNaughton and Yamada [71], which also gives a synthesis algorithm. This language seems suitable to a design engineer’s thinking and for specifying precisely the kind of machine he wants. For example (borrowed from the cited article), suppose he wishes to specify that an automaton is to have an output 1 at time t if and only if either there have never been three consecutive 0’s in the input sequence to time t, or there have been three consecutive 1’s in the input sequence to time t since the last three consecutive 0’s. A regular expression can readily be written down specifying the automaton. Noting that $(0 \cup 1)^*$ denotes the set of all finite sequences (of 0’s and 1’s), and that $(0 \cup 1)^*000(0 \cup 1)^*$ denotes the set of all finite sequences having three consecutive 0’s somewhere, the desired regular expression is

$$\sim[(0 \cup 1)^*000(0 \cup 1)^*] \;\cup\; (0 \cup 1)^*111\,\sim[(0 \cup 1)^*000(0 \cup 1)^*].$$
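The design example can be tested by brute force: for every short input sequence, compare membership in the set named by the expression with a direct reading of the English specification. This is only a sketch under stated assumptions. Python's `re` module has no complement operator, so a negative lookahead is used here as a stand-in for "the complement of the set of sequences containing 000"; the function names are hypothetical.

```python
import re
from itertools import product

# Stand-in for the complement of (0 U 1)*000(0 U 1)*: every position that
# is consumed must not be the start of three consecutive 0's.
NO_000 = r"(?:(?!000)[01])*"

# Either no 000 anywhere, or anything, then 111, then a tail with no 000.
SPEC = re.compile(rf"{NO_000}|[01]*111{NO_000}")

def by_expression(seq):
    return SPEC.fullmatch(seq) is not None

def by_definition(seq):
    """Direct reading of the English specification."""
    last = seq.rfind("000")          # start of the last run of three 0's
    if last == -1:
        return True                  # never three consecutive 0's
    return "111" in seq[last + 3:]   # three 1's since the last three 0's

def agree_up_to(max_len):
    """Compare both readings on every binary sequence up to max_len."""
    return all(
        by_expression("".join(bits)) == by_definition("".join(bits))
        for n in range(max_len + 1)
        for bits in product("01", repeat=n)
    )

print(agree_up_to(10))  # → True
```

Agreement on all sequences up to a fixed length is evidence rather than proof, but it is a cheap check that an expression says what the designer intended.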
The state-graph language and the regular-expression language are two languages for describing the behavior of finite automata. Several languages have been proposed using symbolic logic. It will not be possible to present in detail the languages of symbolic logic which have appeared in the literature. A rough outline and references will have to suffice.

The reader is possibly familiar with symbolic logic in its use as a behavioral language for combinational automata. (If, in a given automaton, each output value at any time depends only on the input values at that time and is independent of the history of the device, then the device is known as a combinational automaton. In short, it is an automaton without memory.) For describing these automata, one uses that part of symbolic logic known as the propositional calculus, which contains propositional variables standing for simple sentences, connectives such as “and,” “or,” and “not,” but does not contain quantifiers “for all x” or “for some x.” The notation of Boolean algebra is used more often and is quite similar to the propositional calculus.

Automata with memory can be described by symbolic logic by introducing variables standing for instants of time; having introduced these variables it is natural to introduce quantifiers. One of the earliest suggestions to use variables and quantifiers in this way is in Patterson [88]. To illustrate, let “&” stand for “and,” “∨” for “or,” “~” for “not,” “≡” for “if and only if,” “(x)” for “for all x,” and “(∃x)” for “there exists an x.” Let x < y mean that time moment x precedes time moment y. Let $I_x$ mean that the input I is 1 at time x; $\sim I_x$ will therefore mean that the input I is 0 at time x. Let $U_t$ mean that output U is 1 at time t.
Then the behavior of the one-input one-output automaton discussed above (whose output is 1 at time t if and only if there have never been three consecutive 0’s in the input sequence to time t, or there have been three consecutive 1’s and no three consecutive 0’s since) is described by the following formula of symbolic logic:

$$U_t \equiv [\sim(\exists x)(x+1<t \;\&\; \sim I_x \;\&\; \sim I_{x+1} \;\&\; \sim I_{x+2}) \;\vee\; (\exists y)(y+1<t \;\&\; I_y \;\&\; I_{y+1} \;\&\; I_{y+2} \;\&\; \sim(\exists x)(y<x \;\&\; x+1<t \;\&\; \sim I_x \;\&\; \sim I_{x+1} \;\&\; \sim I_{x+2}))]$$

The reader wishing more exposition on the description of automata by means of symbolic logic with quantifiers is referred to Wang [115] and to Burks and Wang [10, pp. 293-296]. In Trakhtenbrot [107] and in McNaughton [70], behavioral formulas of the above type are investigated more deeply. Church’s paper on circuit synthesis [22] uses symbolic logic both behaviorally and structurally.
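A description with quantifiers over instants of time can be evaluated directly on a finite input history, since every quantifier is bounded by t. The sketch below encodes one reading of the behavior stated in words above (no three consecutive 0's up to time t, or three consecutive 1's with no three consecutive 0's after them); the function names and the exact placement of the bounds are assumptions made for illustration, with existential quantifiers rendered as `any`.

```python
def no_three_zeros_from(inputs, t, lo):
    """True iff there is no x with lo <= x, x + 1 < t, and
    inputs[x], inputs[x + 1], inputs[x + 2] all equal to 0."""
    return not any(
        not inputs[x] and not inputs[x + 1] and not inputs[x + 2]
        for x in range(lo, t - 1)    # x + 1 < t keeps x + 2 <= t
    )

def output_at(inputs, t):
    """Value of the output at time t, given the inputs at times 0..t."""
    # First disjunct: never three consecutive 0's up to time t.
    if no_three_zeros_from(inputs, t, 0):
        return True
    # Second disjunct: some y with 1's at y, y+1, y+2 (all <= t) and no
    # three consecutive 0's starting at any x > y.
    return any(
        inputs[y] and inputs[y + 1] and inputs[y + 2]
        and no_three_zeros_from(inputs, t, y + 1)
        for y in range(0, t - 1)
    )

history = [0, 0, 0, 1, 1, 1, 0]
print([int(output_at(history, t)) for t in range(len(history))])
# → [1, 1, 0, 0, 0, 1, 1]
```

The output drops to 0 when the third 0 arrives at time 2 and returns to 1 at time 5, when three consecutive 1's have been seen since.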
Another way of using symbolic logic to describe the behavior of automata is to let variables range over finite sequences of 0’s and 1’s. This technique has been quite successfully developed by Elgot [29] and Büchi [5].

Fitch [32] introduces a behavioral language for finite automata that has the advantage of a functional notation. The output sequence is described as a function of the input sequences. The fact that this behavioral language is developed in a system of combinatory logic is logically interesting, but is incidental to the theory of automata. An advantage of a functional notation is that it is related to the problem of Section 6. Another advantage, as Fitch points out, is that of a possible application to describe the behavior of continuous automata, using functions over sequences of real numbers rather than functions over sequences of integers.

All the behavioral languages discussed here are for finite automata. An investigation into possible methods of extending the languages discussed in this section to apply to growing automata would be worthwhile, but as far as I know no such investigation has been carried out. Specifically, suppose we consider the class of all growing automata with one binary input and one binary output (which will be discussed in Sections 5 and 6). What operations, in addition to concatenation, iteration, and union, can generate the sets of sequences of 0’s and 1’s that are representable by such automata? (Readers familiar with the theory of computability are cautioned to read Section 6 before they conclude that all computable sets are so represented.) The answer would provide us with an extension of the regular-expression language to describe the behavior of such automata. The interested reader can formulate for himself other specific questions concerning behavioral languages for the various aspects of growing automata discussed in Sections 5, 6, and 7.
5. Various Concepts of Growing Automata
Turing’s 1936 paper [108] and that of Post [91] were two independent treatments of almost the same abstract idea of a machine for computation. Although they were written long before the theory of automata became a going concern, they contained the first mathematically clear abstract concept of an automaton to appear in the literature. Turing’s paper, because it appeared slightly earlier and because it was more substantial, became better known, and the machine concept has been given his name.

The impact of these ideas was felt in the theory of computability (see Davis [26] or Kleene [63]). Computability by means of a Turing machine was one of several precise notions offered as an explication of the vague notion of effective calculability: a function is effectively calculable if there is some possible mechanical means or set of rules for calculating it. Other precise notions were recursiveness [62], and lambda convertibility [20]. The three precise notions turned out to be mathematically equivalent, which was generally accepted as evidence that any of the three notions was an adequate explication of the notion of effective calculability. This last proposition is known as Church’s thesis [19]. Specifically in relation to Turing machines, it means that any function that can be calculated at all can be calculated by a Turing machine.

The quest for an effectively calculable function is often described as a quest for an algorithm, in many mathematical theories. Indeed, the concept of an algorithm is more general than the concept of a function. For example, one speaks of algorithms for constructing a matrix or graph with a certain property. However, by a technique well known in the theory of computability, namely the technique of Gödel numbering, such algorithms can be expressed as effectively calculable functions. (This technique cannot be exposited here; see Davis [26, Chapter 4] or Kleene [63, Section 52].)

In almost every mathematical theory, there are certain things that can be done algorithmically and other things that cannot. For example, in automaton theory, there is the algorithm to reduce a finite state graph. (See Section 2.) But there is no algorithm for finding a minimal Turing machine computing the same function that a given Turing machine computes (which can be proved easily by anyone with an advanced knowledge of the theory of computability).

Another type of quest to which the theory of computability can be applied is the quest for a decision procedure. A decision procedure for an infinite set of questions, each having “yes” or “no” as an answer, is a procedure that, when applied to any of the questions, answers it.
If a set of questions has no decision procedure, it is said to be undecidable. A decision procedure can be thought of as an effectively calculable function whose value is always either 0 (for “yes”) or 1 (for “no”). In automaton theory there is a decision procedure for the equivalence of two finite state graphs, but there is no decision procedure for whether two Turing machines compute the same function.

The existence or nonexistence of other algorithms and decision procedures in the theory of automata is the special concern of Büchi, Elgot, and Wright [7], of Elgot and Büchi [30], and of Rabin and Scott [92]. Besides these special references, it should be noted that it is characteristic of switching theory and the theory of automata to present many algorithms and decision procedures. The theory of computability as such is not needed to establish an algorithm or decision procedure; but a critical application of that theory is needed to prove that one does not exist.
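To make the contrast concrete, here is one way a decision procedure for the equivalence of two finite state graphs might look. It is a minimal sketch, not the procedure of any paper cited here: it explores pairs of states reachable in lockstep from the two initial states and reports inequivalence as soon as some input yields different outputs. The tuple encoding of a machine and the three example machines are assumptions made for illustration.

```python
from collections import deque

def equivalent(machine_a, machine_b, inputs):
    """Decide whether two finite automata give the same output sequence
    for every input sequence.

    Each machine is a tuple (initial_state, next_state, output), where
    next_state and output are dicts keyed by (state, input_value).
    """
    init_a, next_a, out_a = machine_a
    init_b, next_b, out_b = machine_b
    seen = {(init_a, init_b)}
    queue = deque(seen)
    while queue:                       # terminates: finitely many pairs
        a, b = queue.popleft()
        for i in inputs:
            if out_a[a, i] != out_b[b, i]:
                return False           # a distinguishing experiment exists
            pair = (next_a[a, i], next_b[b, i])
            if pair not in seen:
                seen.add(pair)
                queue.append(pair)
    return True

def make_counter(modulus, output_rule):
    """Hypothetical helper: a counter of 1's modulo `modulus`."""
    nxt = {(s, i): (s + i) % modulus for s in range(modulus) for i in (0, 1)}
    out = {(s, i): output_rule(s, i) for s in range(modulus) for i in (0, 1)}
    return (0, nxt, out)

PARITY_2 = make_counter(2, lambda s, i: (s + i) % 2)   # two states
PARITY_4 = make_counter(4, lambda s, i: (s + i) % 2)   # redundant four states
DELAYED = make_counter(2, lambda s, i: s % 2)          # ignores current input

print(equivalent(PARITY_2, PARITY_4, (0, 1)))  # → True
print(equivalent(PARITY_2, DELAYED, (0, 1)))   # → False
```

The search always halts because only finitely many state pairs exist. No analogous exhaustive search is available for Turing machines, whose configurations are unbounded.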
Church’s thesis (that any function that can be calculated at all can be calculated by a Turing machine) is not provable like a mathematical theorem, since it does not embody wholly mathematical notions. But it is an important thesis for the theory of automata because it implies that, in an important sense, it is not possible to have a concept of an automaton that is more general than the concept of a Turing machine.

Although perhaps most readers are familiar with the concept of a Turing machine, the concept is so important in the theory of automata that it should be reviewed here. A Turing machine (machine as a whole) consists of a tape that can be extended indefinitely and is divided into successive squares, and a machine proper that is fixed in size and capable of reading and writing on the tape. The machine proper without the tape is a finite automaton, but the machine as a whole is a growing automaton since generally the tape will have to grow without bounds in at least one direction.

The machine proper has a finite number of states. At any time it must be in one of these states and scanning a square of the tape. Depending on what state it is in and what the symbol is on the scanned square, its action is determined. (Thus it is a deterministic automaton.) Its action consists of either doing nothing at all, or doing one or more of the following: erasing the scanned symbol, writing a new symbol, then moving to the left or right. At the next instant of time it may go into a new state. If it does nothing at all and if it remains in the same state then at that point the machine has come to a halt. The machine computes by being given a prepared tape and being set in motion from an initial state on a certain square of the tape; the result of the computation is read from the tape if and after the machine has come to a halt. Thus the tape is both input and output. Turing machines do not always halt.
The fact that there is no decision procedure to tell whether a Turing machine will come to a halt is a significant fact in the theory of computability. It is, however, outside the scope of this paper to explain this significance. (See Davis [26, p. 70].)

The universal Turing machine (also introduced in Turing [108]) is often discussed, since it resembles the general-purpose computers. The universal Turing machine is capable of imitating any Turing machine by a process that is essentially the same as programming a computer. One takes an arbitrary Turing machine, writes its rules of instructions in a certain manner on the left-most portion of the tape of the universal machine, whereupon the latter will imitate the former exactly by referring in each step of the former’s operation to the set of rules at the left-most portion of its tape. (Thus, the idea of the stored program, as well as the details of machine simulation, was presented seven years before work was begun on the ENIAC!) Davis [25] precisely defines the term “universal Turing machine.”
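The description of a Turing machine given above (a finite machine proper, a tape of squares, and an action determined by the current state and scanned symbol) can be sketched as a short simulator. Everything specific here is an assumption for illustration: the rule-table format, the convention that the machine halts when no rule applies (standing in for "does nothing and stays in the same state"), the step limit, and the binary-increment example.

```python
from collections import defaultdict

BLANK = " "

def run_turing_machine(rules, tape, state, head=0, max_steps=10_000):
    """Run a one-tape machine and return the non-blank part of the tape.

    `rules` maps (state, scanned_symbol) to
    (symbol_to_write, move, next_state), with move in {-1, 0, +1}.
    The machine is taken to have halted when no rule applies.
    """
    # Squares never written or visited are blank, so the tape is only
    # potentially infinite: it grows as the head moves outward.
    cells = defaultdict(lambda: BLANK, enumerate(tape))
    for _ in range(max_steps):
        key = (state, cells[head])
        if key not in rules:
            break                      # halted
        write, move, state = rules[key]
        cells[head] = write
        head += move
    else:
        # There is no general way to know whether a machine will halt,
        # so the sketch simply gives up after max_steps.
        raise RuntimeError("machine did not halt within max_steps")
    used = [pos for pos, sym in cells.items() if sym != BLANK]
    if not used:
        return ""
    return "".join(cells[pos] for pos in range(min(used), max(used) + 1))

# Hypothetical example: add 1 to a binary numeral, starting on its last digit.
INCREMENT = {
    ("carry", "1"): ("0", -1, "carry"),   # 1 + carry = 0, carry moves left
    ("carry", "0"): ("1", 0, "done"),     # absorb the carry and stop
    ("carry", BLANK): ("1", 0, "done"),   # numeral grows by one square
}

print(run_turing_machine(INCREMENT, "1011", "carry", head=3))  # → 1100
print(run_turing_machine(INCREMENT, "111", "carry", head=2))   # → 1000
```

In the second run the head moves onto a square to the left of the prepared tape, which shows the tape being extended as the computation requires.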
Turing machines are the theoretical counterparts to large-scale digital computers, and since the advent of the latter several rather serious attempts have been made to reexamine the foundations of the theory of the former.

Wang [114] puts forth the concept of a machine B which is like a large-scale digital computer in that it operates according to an internally stored program, but is like a Turing machine in that all its operations are along a tape that is divided into squares. The machine head can write a symbol on the tape, but cannot erase and, furthermore, marks only one kind of symbol. There are four types of instructions for the machine: shift the head right on the tape, shift it left, mark the square being scanned, and a conditional transfer. The last instruction type is like the “jump” instruction in a computer; the machine executes this instruction by jumping to a designated instruction if the scanned square is marked, and proceeding to the next instruction in turn if the scanned square is blank. Wang proves that all recursive functions are computable on the machine B, which shows that his machine, in a sense, can do anything a Turing machine can do, although it may not do it in the same way.

Wang’s machine B is interesting for two reasons. First, it gives a variant to the concept of a Turing machine that is similar in some respects to a modern large-scale digital computer. Second, it shows that there exists a universal Turing machine with one tape that prints but does not erase.

Wang critically examined his four types of instructions, and some other instruction types. One of the questions raised in this rather thorough discussion was whether the conditional transfer, “jump if blank,” which is just the reverse of the one he used, “jump if marked,” would have done just as well. This question has been answered in the affirmative by Shepherdson and Sturgis [104] and also, independently, by C. Y. Lee [66].
(I do not know which of the two parties has temporal priority.)

C. Y. Lee continues the investigation of Wang-type machines; the two references to his work seem to overlap in their coverage. Lee considers the various machines that result by restricting the types of instructions allowed. Among other things, he shows that if a machine program does not include a “shift left” instruction then it is equivalent, in a sense, to the operation of a finite automaton; and he shows, conversely, that the operation of every finite automaton is equivalent to such a program.

Recently, Minsky has shown how to construct a universal Turing machine with two tapes that neither writes nor erases on the tapes. This machine, described in Minsky [74], is able to do anything a Turing machine can do simply by moving on the two tapes. All the information (input, output, and intermediate information) is summarized in the positions of the machine on the two tapes.

An attempt was made in Moore [77] to simplify the universal Turing
machine. The result was a machine design that cuts down on the amount of physical equipment that would be necessary for realization.

An important result about Turing machines is that of Shannon [101] who found that for any Turing machine, one can obtain another Turing machine carrying out the same computation, which is capable of recognizing only two symbols on the tape but has an increased number of states in the machine proper; or one that has only two states in the machine proper but deals with a larger number of possible symbols. In Shannon’s reductions in both directions, the product of the number of states and the number of symbols is approximately a constant. Shannon ends his paper by calling attention to a problem: “to find the minimum state-symbol product for a universal Turing machine.” In Ikeno [59], a machine is presented with six symbols and ten states. In Minsky [75] Ikeno’s machine was simplified to six symbols and seven states. As far as I know, there is no improvement in the literature on Minsky’s product of 42, and there is no rational guess as to what the minimum product number is. (Recently, Minsky has announced a universal Turing machine with a state-symbol product of 36.)

Lately, some computer specialists have been paying attention to the theory of computability, including recursive functions. They have been exploiting a very natural connection between how a computer computes and how recursive functions are defined. However, since neither computer programming nor recursive function theory is within the announced scope of this survey, this topic will not be covered here. Suffice it to mention the single reference, McCarthy [68].

Essentially, a Turing machine results from a finite automaton by, first, endowing it with the power to read symbols, write symbols, and handle a tape, and, second, giving it an infinite tape.
(The tape need not be actually infinite, but only potentially infinite in that more tape can be spliced on whenever a computation requires it.) There are other ways of generalizing the concept of a finite automaton to obtain the concept of a growing automaton. One is to begin with the concept of a finite automaton as being composed of parts in a definite way. The generalization is obtained by allowing the device to add parts to itself according to certain definite rules, connecting them to certain specified parts that are already there. Essentially, this mode of generalization simply provides for rules of growth as well as for rules of interaction of the parts. Another generalization is to allow for an infinite array of parts connected together. Several conditions must be fulfilled. First, each part is connected to only finitely many other parts. Second, each of the parts has what is called an inactive state, such that if a part P is in the inactive state at a time t and if all the parts connected to P are inactive at t then P is in the inactive
state at time t + 1. Third, all but a finite number of parts are in the inactive state at time 0 (and hence the same is true at any time following). Fourth, there must be a rule governing the layout of the parts in the infinite array. This mode of generalization then turns out to be equivalent to the previous mode, in that a part that is not and has never been in the active state can be considered to be not yet there. It can be considered to be added to the array (according to a definite rule) when it first assumes an active state. Potentially infinite automata other than Turing machines (probably) appear first in a work by von Neumann [109]. A more detailed presentation by the same author will be appearing [113]. For discussion of this idea (and other ideas) of von Neumann’s, see Kemeny [61] and Shannon [102]. Another formulation, which is similar to von Neumann’s, appears in a paper by Burks [9]. A formulation that is closer to computer technology is that of Holland [49]. Holland’s device consists of modules each of which is like a digital computer.
6. Operations by Finite and Growing Automata, Real-Time and General
We have been discussing growing automata from the point of view of structure. It is time now to discuss the behavioral question: what can they do that finite automata cannot do? There are several answers to this question, depending on what precise formulation of the question we take. To begin with we might consider one-binary-input one-binary-output automata. Then any such automaton effects a transformation of the input sequence into the output sequence. We can ask, (1) what is the class of transformations that are realized by finite automata, and (2) what is the class of transformations realized by growing automata? In my opinion neither of these questions has been answered in a completely neat mathematical way. A similar opinion is expressed in the closing passages of Section 3, Part I, of Aizerman et al. [1]. In Nerode [85] there is a mathematical formulation of the class of transformations realized by a subset of the set of all finite automata, but Nerode does not extend his characterization to fit all finite automata. The concept of regularity explained in Section 4 does provide a kind of indirect answer to the first question. (In this connection see Buchi [4, 6].) A similar formulation is given in Medvedev [73]. Raney [93] and Bazilevsky [3] discuss sequential functions, i.e., functions that map sequences of 0’s and 1’s into sequences of 0’s and 1’s. For example, there is the delay function D, well known in the literature on switching theory, which can be defined as follows: for $S = s_0, s_1, s_2, \ldots$,
$DS$ is the sequence $t_0, t_1, t_2, \ldots$ where $t_0 = 0$ and, for each $i$, $t_{i+1} = s_i$. In these papers other sequential functions are studied as well. Raney’s paper offers a brief introduction to this matter. My attention was called to Bazilevsky’s article too late for me to review it in the detail that it merits. Fitch’s work [32], discussed in Section 4, should also be mentioned in connection with the first question because its language expresses functional relations between input and output sequences. The paper by Aizerman et al. [1], in Section 3, Part I, considers two problems about finite automata that seem to be important here. (1) Given an infinite input sequence and a corresponding state sequence, is there a finite automaton on which that input sequence will produce that state sequence? The difficulties here (which are discussed in detail in the article) stem from the fact that the two sequences are infinite. (2) Given an infinite input sequence and an infinite output sequence, is there a finite automaton on which that input sequence will produce that output sequence? This problem has all the difficulties of the first problem and more. This article offers no solution to either of these problems, but their mere formulation is enough of a contribution. The second question (what is the class of transformations realized by growing automata?) to my knowledge has not been answered at all. It is to be noted that not all computable transformations are realized by growing automata. “Computable transformation” can be defined as follows: let $i_0, i_1, i_2, \ldots$ be an input sequence; the output $U_n$ at time n is obtained by a computable transformation if there is a Turing machine that, for every n, computes $U_n$, given $i_0, \ldots, i_n$. Now, if a transformation is to be realized by a growing automaton, the output $U_n$ has to be available at time n.
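To make concrete the requirement that the output at time n be available at time n, the following minimal sketch treats a transformation as a step-by-step transducer and uses the delay function D defined earlier as the example. The function name `delay` and the choice of Python generators are illustrative assumptions, not notation from the papers cited.

```python
from typing import Iterable, Iterator


def delay(inputs: Iterable[int]) -> Iterator[int]:
    """Yield t_0, t_1, ... where t_0 = 0 and t_{i+1} = s_i.

    The transducer remembers exactly one bit, so it is a finite-state
    device, and each output is emitted at the same step at which the
    corresponding input arrives.
    """
    previous = 0  # remembered bit; starts at 0 so that t_0 = 0
    for s in inputs:
        yield previous  # output at time n uses only inputs before time n
        previous = s    # remember s_n for the next time step


if __name__ == "__main__":
    print(list(delay([1, 0, 1, 1, 0])))  # [0, 1, 0, 1, 1]
```

Because the output is produced inside the same loop iteration that consumes the input, no output ever lags behind its time step, which is the property demanded of a real-time operation.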
Indeed (as is proved in Yamada [119] and discussed more precisely below), the class of all transformations realized by growing automata is a proper subclass of the class of all computable transformations (subject to a proviso discussed below); transformations in the subclass must be computable within certain time limits set from the times that various pieces of the data are available. There is an analogy between such computations and the real-time computations carried on by certain computers: e.g., a computation used in the prediction of weather or tracking of a satellite, whose result must be available well before a certain time if the computation is to serve any purpose at all. For this reason, to an operation that is performed by an automaton so that the output comes out within a time limit, I shall apply the term “real-time.” There are many complications about real-time operations that do not exist for general operations. For example, for real-time operations a Turing machine with one tape is (possibly) not the most general type of growing automaton. It is an open question whether Turing machines with several
tapes can do real-time operations that Turing machines with a single tape cannot do. Insight gained from Yamada [118] (the first work to my knowledge to discuss real-time operations in the theory of automata) seems to favor an affirmative conjecture on this question. In both his works, Yamada stipulates that a device be capable of only a limited number of operations at any one time. In particular, each of his devices has only a fixed and finite number of tapes, and during any time at most k things can be done to any tape, where k is a constant positive integer for the device. On the other hand, Holland points out to me in correspondence that, in his iterative circuit computers [49], many programs may act simultaneously during any time interval. Apparently there is no upper bound on the number of such programs acting simultaneously. Hence, there may be real-time operations performable by iterative circuit computers but not performable by multitape Turing machines. As far as I know, this is an open question. The difference between real-time operations and general operations in finite automata seems to be of no theoretical consequence. For whatever can be done by a finite automaton as a general operation can be done by some other finite automaton in real time. In this section we shall confine our attention to three types of operation: a finite-state operation (i.e., an operation performed by a finite automaton), a real-time operation (i.e., an operation performed by a growing automaton in real time), and a general operation (i.e., an operation not subject to time restrictions performed by a growing automaton). Yamada’s 1961 paper [119] contains a proof that there exist general operations not performable in real time by any of his devices. From the method of his proof we can reasonably conclude that, most probably, there exist general operations that are not real-time operations.
(His proof uses the diagonal method; it can be used to show that, for any recursively enumerable (see Davis [26]) set of growing automata, there exist general operations that are not performable in real time by any device in that set. An important question is whether there is a recursively enumerable set of growing automata such that every growing automaton is behaviorally equivalent in the strict sense to some member in the set. (Two automata are behaviorally equivalent in the strict sense if the same input sequences fed simultaneously to the two automata yield the same output sequences also simultaneously.) The answer to this question may turn on a mathematically precise definition of “growing automaton,” but, if it is in the affirmative, then Yamada’s method of proof shows that there exist general operations that are not real-time operations.) From Yamada’s dissertation [118] it is apparent that there are real time
operations that are not finite-state operations. The perfect-square counter, a rather simple example of one of his devices, has a single binary input, a single binary output, and a single tape that is not regarded as part of the input or output: the output is 1 at any time if and only if the input is 1 at that time and the number of 1’s in the input history up to and including that time is a perfect square. It is well known that this operation, thus proved to be a real-time operation, is not a finite-state operation. Yamada [118] shows how to construct many real-time counters. An interesting example of a finite automaton is the full adder, namely the two-input one-output automaton which takes two arbitrary binary numbers in the order of the least significant digits first, fed in at the inputs, and produces the sum at the output in order of the least significant digits first. There is a finite automaton that performs this function, no matter how large the numbers. In fact, this automaton is nothing but the binary full adder familiar to computer engineers. It has two states, one for a carry of one, the other for a carry of zero. The reader who is familiar with finite-state devices will not find it difficult to prove that there is no finite automaton that will do the same thing when the order of input and output is reversed to most significant digits first. Slightly more difficult is the proof that no finite automaton will perform the task for multiplication even in the order of least significant digits first. The reader may object that a finite automaton can indeed multiply, citing the example of a digital computer which is a finite automaton. It is true that a digital computer with only its internal storage is a finite automaton. (Some machines having peripheral storage, e.g., tape, that can be expanded indefinitely, would have to be classified as growing automata.)
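The two-state full adder described above can be sketched as follows. This is a minimal illustration, assuming both numbers are supplied least significant digit first and padded with a leading zero so that the final carry has a time step in which to appear; the helper names `to_bits_lsb`, `from_bits_lsb`, and `serial_add` are hypothetical.

```python
def to_bits_lsb(value: int, width: int) -> list[int]:
    """Binary digits of value, least significant first, padded to width."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits_lsb(bits: list[int]) -> int:
    """Inverse of to_bits_lsb."""
    return sum(bit << i for i, bit in enumerate(bits))


def serial_add(a_bits: list[int], b_bits: list[int]) -> list[int]:
    """Two-state automaton: the only memory carried between steps is the carry."""
    carry = 0  # state 0 = carry of zero, state 1 = carry of one
    output = []
    for a, b in zip(a_bits, b_bits):
        total = a + b + carry
        output.append(total % 2)  # sum digit emitted at this time step
        carry = total // 2        # next state
    return output


if __name__ == "__main__":
    width = 5  # one more digit than either operand needs
    result = serial_add(to_bits_lsb(6, width), to_bits_lsb(7, width))
    print(from_bits_lsb(result))  # 13
```

The loop body never consults the length of the inputs or any earlier digit, which is why the same two-state device works for numbers of any size.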
The numbers that a machine can multiply are of limited size, limited by the size of the internal memory if not by the word-size of the machine. It would be impossible for such a machine to multiply two numbers each of which required more digits to represent than the sum total of all the numbers of digits available in the memory of the machine. Thus multiplication is not a finite-state operation. An open problem is whether the multiplication of arbitrarily large numbers is a real-time operation. More precisely, is there a two-input one-output automaton such that if two n-digit binary numbers are put on the inputs, least significant digits first, from time 0 through time n - 1 inclusive, the output will yield the product, least significant digits first, from time 0 through time 2n - 1 inclusive? If the answer is in the negative, meaning that the product cannot be available by time 2n - 1, then for what function f is the product computable by time f(n), for all n? There are similar questions
for operations other than multiplication. As far as I know, these problems have never been investigated to any significant extent. (D. Arden has recently communicated the construction of a real-time multiplier.) Although it is difficult to state in precise general terms what growing automata can do that finite automata cannot do, we get some insight if we restrict ourselves to inputless automata with one binary output. The only interesting thing about the behavior of such an automaton is what sequence it will produce. A finite inputless automaton will always produce an ultimately periodic sequence. This fact is easily proved. Let $S_0, S_1, S_2, \ldots$ be the states that the machine assumes in order. Since the machine is deterministic and inputless, if $S_i = S_j$ then $S_{i+1} = S_{j+1}$; i.e., if one state follows another at any time then it always follows it. Since the machine has only finitely many states, some state must be repeated in the sequence; i.e., for some $i$ and $j$, where $i \neq j$ (say $i < j$), $S_i = S_j$. It follows that, from time $j$ on, the machine will simply cycle periodically through the states $S_i, S_{i+1}, \ldots, S_{j-1}$. Since the output depends only on the state, it will also cycle periodically. Thus the only sequences that result from inputless finite-state operations are ultimately periodic sequences. An example of a sequence that is not ultimately periodic is 001011011101111011111. . . A Turing machine that prints out this sequence on its tape is given in Turing’s 1936 paper [108, p. 234]; hence this sequence results from a general operation. But it also results from a real-time operation; a device similar to the perfect-square counter mentioned above will produce it. This device has no input, but a single binary output on which the sequence comes out. It has a single tape that is not part of the input or output.
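The pigeonhole argument above can be illustrated with a short sketch. The first function finds the repeated state of an inputless finite automaton given as a next-state table; the second generates the non-periodic sequence 0, 01, 011, 0111, ... using an unbounded counter in place of the device's tape. Both function names and the dictionary encoding of the automaton are assumptions made for illustration.

```python
def find_cycle(next_state: dict, start) -> tuple[int, int]:
    """Return (i, period) where S_i is the first state to recur.

    The state sequence S_0, S_1, ... is S_0 = start and
    S_{t+1} = next_state[S_t]. With finitely many states some state
    must repeat, and from then on the sequence is periodic.
    """
    first_seen = {}
    state, time = start, 0
    while state not in first_seen:
        first_seen[state] = time
        state = next_state[state]
        time += 1
    i = first_seen[state]  # S_i == S_time with i < time
    return i, time - i


def growing_sequence(length: int) -> list[int]:
    """First `length` symbols of 0 01 011 0111 ..., which never becomes periodic."""
    output = []
    ones = 0  # unbounded counter: this is what a finite automaton lacks
    while len(output) < length:
        output.append(0)
        output.extend([1] * ones)
        ones += 1
    return output[:length]


if __name__ == "__main__":
    print(find_cycle({0: 1, 1: 2, 2: 3, 3: 1}, start=0))  # (1, 3)
    print("".join(map(str, growing_sequence(21))))  # 001011011101111011111
```

The counter `ones` grows without bound, so no fixed number of states can stand in for it; that is the informal reason the second sequence falls outside the finite-state case.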
Yamada’s proof [119] can (subject to a reservation similar to the one discussed parenthetically above) be used to prove that there are sequences resulting from general operations that do not result from real-time operations. But such a proof would be a sort of existence proof. It would be excessively difficult, if possible at all, for me to describe such a sequence here. Yamada, in both references, discusses counters which have one input rather than inputless sequence producers. But, mathematically, his problem is the same, since the only function of the input is to advance time; in other words, in his counters nothing at all happens when the input is 0. A sequence that results from a general operation is a computable sequence. (A sequence $a_0, a_1, \ldots$ is computable if there is a computable function $f$ such that, for all $n$, $a_n = f(n)$.) But the problem of characterizing (by some mathematically interesting necessary and sufficient condition) sequences that result from real-time operations is an unsolved one.
7. Automaton Recognition
Lately, much attention has been paid to the topic of recognition by machine. In general the question has been, of all the things that a human being can be trained to recognize, which ones can be recognized by machines? Letters of the alphabet, human faces, grammatical correctness of sentences, spoken words (see Fatehchand [31]), and many other things can be thought about in this connection. We get a precise mathematical problem for the theory of automata if we ask the following question: for a given set of finite sequences of 0’s and 1’s, is there an automaton that recognizes members of the set? Recognition here would simply mean that the automaton reacts in one way if the sequence is a member of the set, and in another distinguishable way if the sequence is not a member of the set. For the remainder of this section we shall assume that a finite sequence is presented to the automaton on a tape, and the automaton can move along this tape and examine it symbol by symbol. It finishes its examination and then halts in one of two states: the favorable state if the sequence is a member of the set, the unfavorable state if it is not. In that case, the automaton represents the set. This type of problem is discussed from many points of view (not all of which are discussed here) by Rabin and Scott [92]. There are several possibilities. First, the automaton may be a finite automaton that scans the sequence once from left to right, as studied in detail by Ginsburg [39]. In that case, a set of sequences is representable if and only if it is regular, in the sense explained in Section 4. This follows from the fact that when the tape is scanned by the device in this fashion, it is as if a binary input were being pulsed. Another possibility is that the automaton, still finite, is able to go back and forth over the tape but is still not able to make any marks on the tape. In this case, it turns out that no sets are representable other than the regular sets.
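The first possibility, a finite automaton that scans the tape once from left to right and halts in a favorable or unfavorable state, can be sketched as below. The example set (sequences containing an even number of 1's) is chosen here only for illustration; the table `EVEN_ONES`, the state names, and the function `represents` are hypothetical.

```python
def represents(transitions: dict, start, favorable: set, sequence) -> bool:
    """Scan the sequence once, left to right, and report the halting verdict."""
    state = start
    for symbol in sequence:
        state = transitions[(state, symbol)]  # one table lookup per symbol
    return state in favorable


# Two-state automaton for the regular set "even number of 1's".
EVEN_ONES = {
    ("even", 0): "even",
    ("even", 1): "odd",
    ("odd", 0): "odd",
    ("odd", 1): "even",
}

if __name__ == "__main__":
    print(represents(EVEN_ONES, "even", {"even"}, [1, 0, 1]))  # True
    print(represents(EVEN_ONES, "even", {"even"}, [1, 0, 0]))  # False
```

The scanner never moves backward and never writes, so its entire memory of the sequence is the current state.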
A proof of this fact appears in Shepherdson’s paper [103] although the result is credited to Rabin and Scott, who merely sketch the proof in theirs [92]. It turns out that the simplest finite automaton that represents a regular set by being able to scan back and forth is possibly simpler than the simplest automaton that represents it by scanning sequences in one sweep from left to right. (Simplicity here is measured by the number of states in the automaton.) But the surprising thing is that for every back-and-forth automaton the equivalent left-to-right automaton does exist, however much more elaborate it may have to be. A third possibility is that the machine, still a finite automaton and still
allowed to go back and forth, is allowed to erase and write other symbols on the tape that is presented to it. In this manner, the machine is able to represent sets of sequences that are not regular. We must now observe that, although the machine without the tape is a finite automaton, the system as a whole including the machine proper together with its tape is no longer a finite automaton, because it uses the tape not merely as an input mechanism but also as added memory. Before discussing the significance of this observation, let us consider the example of a machine of this kind representing the set of all palindromes made up from 0’s and 1’s. (A palindrome is a sequence that reads the same way backward as forward.) The machine simply scans the first symbol at the left, puts a check mark above it, goes all the way to the right of the tape and checks to see if the last symbol is the same. If it is not, it halts in an unfavorable state. If it is, it checks that symbol and goes back to the left-most unchecked symbol, places a check mark above it and goes to the right-most unchecked symbol and checks it if it is the same as the symbol just checked. It keeps going in this manner until all the symbols are checked, in which case it halts in a favorable state, or until one of the symbols on the right does not match the corresponding symbol on the left, in which case it halts in an unfavorable state. The reader who has worked with finite-state devices will be able to convince himself that there is a finite-state device capable of processing a tape in this manner. The details of the construction of similar devices can be found in Myhill’s report [82]. The set of palindrome sequences is not representable by a finite automaton that does not write or erase. An outline of this proof is as follows: if there is such a device that represents the set of palindromes then by the Rabin-Scott result there is a left-to-right finite automaton representing that set.
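The check-marking procedure for palindromes described above can be simulated directly. In this minimal sketch the list `marked` plays the role of the check marks written on the tape, and the machine proper remembers only one symbol at a time; the function name and the use of Python integers for the head position are assumptions of the simulation, not part of the machine.

```python
def palindrome_by_check_marks(sequence) -> bool:
    """Simulate the back-and-forth, write-and-erase palindrome tester."""
    tape = list(sequence)
    marked = [False] * len(tape)  # check marks placed above tape symbols

    while True:
        # Go to the left-most unchecked symbol.
        head = 0
        while head < len(tape) and marked[head]:
            head += 1
        if head == len(tape):
            return True  # every symbol is checked: favorable state

        remembered = tape[head]  # the only symbol held in the machine proper
        marked[head] = True

        # Go to the right-most unchecked symbol.
        head = len(tape) - 1
        while head >= 0 and marked[head]:
            head -= 1
        if head < 0:
            return True  # the symbol just checked was the middle one

        if tape[head] != remembered:
            return False  # mismatch: unfavorable state
        marked[head] = True


if __name__ == "__main__":
    print(palindrome_by_check_marks([0, 1, 1, 0]))  # True
    print(palindrome_by_check_marks([0, 1, 0]))     # True
    print(palindrome_by_check_marks([0, 1]))        # False
```

All the memory that grows with the input lives in `marked`, which is exactly as long as the tape; the control logic itself needs only a fixed number of states.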
In testing a sequence, this automaton must be capable of remembering everything that it has seen up until the half-way point of the sequence, in order to check the correctness of the symbols in the second half. (Another difficulty, how it knows when it is half way through, is ignored in this proof by contradiction.) But then the memory of the device would have to be arbitrarily large, since it must process sequences that are arbitrarily large. But a finite automaton has only a finite number of states, and hence has a limited memory. The third type of automaton, the back-and-forth write-and-erase type, is called by Myhill [82] a linear-bounded automaton. Although the machine proper of such a system has a constant amount of memory, the amount of memory in the tape varies with the problem. Since the tape is just long enough to contain the sequence to be tested, the amount of memory in the tape is proportional to the length of the problem that is given to it. The amount of memory in the whole system, therefore, is a linear function of
the length of the problem. Hence the term, “linear-bounded automaton.” Ritchie [95] examines an infinite hierarchy of classes of automata, the lowest two of which are the finite automata and the linear-bounded automata. His hierarchy does not embrace all the growing automata. His study of the classes of functions computed by each class in the hierarchy is a contribution to the theory of computability. Many recognition problems can be handled by linear-bounded automata but not by finite automata. There is a linear-bounded automaton to test whether a sequence has more 0’s than 1’s, whether the number of 1’s is a prime number, etc. Allowing recognizable symbols to be symbols other than 0 and 1, there is a linear-bounded automaton for recognizing whether a string of symbols is a well-formed formula of any of the usual types of systems of symbolic logic. Grammatical correctness of an English sentence is probably recognizable by a linear-bounded automaton, although the precise meaning of grammatical correctness in a natural language is apt to vary from time to time and from place to place. Indeed, I should like to be bold enough here to advance a hypothesis that any task of interest to structural linguistics is performable by a linear-bounded automaton. This hypothesis would imply that any such operation on sentences can be performed on a digital computer with an amount of memory that is a linear function of the length of the sentence examined. It is rather well known that none of the operations mentioned in this paragraph are performable by finite automata. Chomsky’s work [16, 17, 18] gives further insight into the important connection between structural linguistics and the theory of automata. Two papers dealing with computer programming of structural linguistics [46, 60] describe a programming procedure based on an analysis of sentences due to Harris [42].
One gains insight into some of the problems involved in programming the analysis of natural languages, complex and irregular as they are, by studying the corresponding procedure for simple and regular mechanical languages. Taken in this spirit, Oettinger [87] is interesting in this context. The reader is warned that references to these borderline areas are not complete. One application of programming structural linguistics on a digital computer is mechanical translation of languages, although there are aspects other than programming techniques and automaton theory involved in this field. A survey of mechanical translation has been made by Bar-Hillel [2]. At present there is no practical method of translating one natural language into another that is both fully mechanical and fluent. Nor is such a method just around the corner. (Bar-Hillel agrees, and presents a cogent argument for an even stronger statement: that ideal mechanical translation is impossible.) In my opinion, theoreticians, rather than paying
too much attention to the practical obstacles in the way of obtaining such a method, should direct their efforts toward a study of the problems on the borderline between structural linguistics and the theory of automata, and a study of the programming of more modest operations on natural languages. It is not possible to guarantee that a practical mechanical translation will ultimately result from such studies. But then it is always impossible to predict what the practical applications of basic research will be. A fourth type of automaton to examine sequences is a full-fledged Turing machine. Such a device would be just like a linear-bounded automaton except that it would have an unlimited supply of tape, beyond what is required to write down the sequence to be tested. A set of sequences is representable by a Turing machine if and only if it is recursive, or computable. No set of sequences is representable by any automaton unless it is recursive; here again a Turing machine is the most general type of automaton. A fifth type of automaton for sequence recognition is a real-time growing automaton. Such a device must decide about a sequence of length n in the n time units that it takes to examine the sequence. It may have as much memory as it can use in this interval of time, perhaps in the form of other tapes and reading heads (a finite number). To my knowledge, no one has investigated what sets are representable by real-time growing automata. The relationship between them and linear-bounded automata is of interest. It is not difficult to prove that every set representable by a real-time multitape Turing machine is representable by a linear-bounded automaton. For the amount of time the former has to examine a sequence is equal to the length of the sequence; since not more than a fixed number of operations can occur during any time interval (see Section 6), the memory it can use is at most a linear function of the length of the sequence.
(The exact proof requires more detailed considerations.) It is an open question whether every set representable in real time by an iterative circuit computer (of Holland [49]) is representable by a linear-bounded automaton. For in an iterative circuit computer (as Holland points out in correspondence) “many programs may act simultaneously transmitting results over nondelay paths; memory accessible is no longer a linear function of computation time.” It is not known whether every set representable by a linear-bounded automaton is representable by a real-time growing automaton. Specifically, it is an open question whether the set of palindromes is representable by a real-time growing automaton. It is not difficult to construct such a device for testing the set of palindromes with designated midpoints. Let a palindrome with designated midpoint be either a palindrome of even length 2n
with a cross between the nth and (n + 1)st symbol, or a palindrome with odd length, 2n - 1, with a cross over the nth symbol. The real-time growing automaton to represent these simply records the symbols as it sees them on an auxiliary tape; then, when it comes to the cross, it checks whether the symbols in the right side of the sequence are the same as the symbols it has recorded on the auxiliary tape, in reverse order. It should be mentioned that (as A. Joshi points out in correspondence) the recognition problem is often given as a don’t-care problem: one is given a set of sequences $S_1$ to which the machine should respond favorably, and a set $S_2$ to which the machine should respond unfavorably; one does not care how the machine reacts to sequences that are in neither $S_1$ nor $S_2$. To construct an automaton so specified, of any of the varieties discussed above, is a matter of finding a suitable enlargement of the set $S_1$ to be represented. This matter (to which I know of no reference) is probably related to the study of don’t-care state graphs discussed in Section 2, or at least in that aspect of the matter having to do with recognition by finite-state automata. I should like to close this section with just one more editorial comment. The theory of automata has been very much concerned with finite automata, growing automata, and the distinction between the two. Both for application to the computer field and for interesting development of the theory, it would seem to be interesting to investigate concepts of automata that are between these two extremes. As such the concepts of a linear-bounded automaton and a real-time growing automaton seem to be appropriate for further investigation. Also, in this connection, Chomsky’s 1959 paper [17] is interesting even though it is concerned exclusively with linguistic operations.
8. Imitation of life-like Processes by Automata
Recently, the study of computing machinery has stimulated a wave of speculation on an ancient question, whether man is himself a machine of sorts, and, if so, whether it is theoretically possible for man to build an artificial machine in his own image. Scientists generally agree that these questions are too broad to expect answers on, and tend to substitute for them more specific questions. Investigations have turned around attempts to get machine models of growth, reproduction, perception and recognition, rote learning, insightful learning, problem solving, and many other things that human beings are capable of. Some of these investigations will be reviewed in this section, but the reader is warned that the references are far from complete, and in some cases not even representative. Where possible, references will be made to other survey articles.
From the point of view of technology there is no point in building another human being; we already have them in plentiful supply. Technology will profit only if what we build exceeds human capabilities in one function or another. Thus the largest of the electronic “brains” are not nearly as versatile as the human brain, but what they can do they do much more quickly, and also more efficiently since they are not affected as much by distraction and fatigue. Nevertheless, along with the drive for improving technology, many scientists have become increasingly concerned with possibilities of machine construction and performance regardless of efficiency. Whether these efforts will produce any long range technological advance is a difficult question. To begin with, Moore [79] describes a model of an artificial plant that subsists in an environment from which it takes food and energy and carries on most of the vegetative functions. This model is interesting in that it is a plan for a very simple machine imitation of life. In a series of articles, L. S. Penrose discusses the problem of mechanical models of reproduction. (See Moore [80], a review of four of Penrose’s papers, and Penrose [90].) These articles are interesting and very entertaining in that they show that mechanical models for these biological processes can be quite simple. But the things that grow and reproduce are themselves too simple to be interesting: all they are designed to do is grow and reproduce. An interesting question is, how complicated need a device be that performs the function of a very simple finite automaton and is capable of reproducing itself? For example, is there a simple device capable of both binary addition and reproduction? As far as I know such questions have not been answered.
I suspect that, in spite of the simplicity of the models in the Penrose papers, these automata will have to be quite complex, compared both to Penrose’s models and to the corresponding finite automata that are not capable of reproducing. Judging from the title (“Possibilities of Favorable Mutations in Self-Reproducing Automata”), Myhill’s notes [83] seem to be a rather interesting contribution to the current thinking about reproducing machines. Another question concerns the possibility of building a machine with a capability that no known organism has: namely, to live forever. This problem (and, strictly speaking, also the problems of growth and reproduction) relates to switching theory, by my definition of Section 1, since it inevitably deals with parts and structure. Yet, since this problem is rather farfetched, abstract, and apparently impractical, it would not ordinarily be thought of in that way. The problem can be stated as follows: give the logical specifications of a part, and a machine composed of these parts which is capable of replacing
parts as they wear out. That is, the machine is capable of sensing when a part is worn out and reaching into a pile of parts to replace it. Assumed is a probabilistic life-expectancy function on each of the parts, beginning at the time when it first starts to function as part of the machine. For example, the probability of lasting at least n units of time might be $a^{-n}$. The problem is to design the machine in such a way that its life expectancy is greater than the life expectancy of any of its parts. (This problem was suggested to me orally by E. F. Moore.)

If a machine is always composed of not more than n parts, and if the probability of a part failing during a time interval is never less than a constant k, greater than 0, then the probability that such a machine will last forever is 0. For the probability that all the parts will fail at once during a time interval is at least the constant $1 - (1 - k)^n$. Hence, in an infinite amount of time the probability is zero that it never will happen that all the parts fail at once. Note that this argument (due to E. F. Moore) holds no matter how often and in what manner parts are replaced. The soundness of the assumptions rests on the fact that any part, no matter how new, has some probability of failure, which cannot be made to approach zero.

It follows that no individual biological organism can live forever. This truth is a well known empirical law, but it is interesting to have proved it by such a simple argument, whose only assumptions are that the organism reaches a maximum size and that each part has a probabilistic rate of decay. It is also interesting that no matter how well such an organism could replace worn tissue, it would still be mortal. The same argument applies to any collection of organisms, provided that there are never more than a fixed finite number of organisms in the collection. The consequence is that if the human population in the universe ever reaches a maximum then the human race cannot last forever.
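The step from a constant per-interval risk to certain eventual failure can be made explicit. The sketch below is only an illustration, not part of Moore's argument as stated: `c` stands for any positive lower bound on the probability of total failure in one time interval (the text supplies such a constant in terms of k and n), and the function names are hypothetical. If that bound holds in every interval regardless of what happened earlier, the probability of surviving T intervals is at most (1 - c)^T, which tends to 0 as T grows.

```python
def survival_upper_bound(c: float, intervals: int) -> float:
    """Upper bound on the chance of surviving `intervals` time intervals.

    Assumption (labelled, not from the source): in every interval, whatever
    happened before, total failure has probability at least c > 0.
    """
    if not 0.0 < c <= 1.0:
        raise ValueError("c must lie in (0, 1]")
    if intervals < 0:
        raise ValueError("intervals must be non-negative")
    return (1.0 - c) ** intervals


def intervals_until_below(c: float, eps: float) -> int:
    """Smallest number of intervals after which the bound drops below eps."""
    if not 0.0 < c <= 1.0:
        raise ValueError("c must lie in (0, 1]")
    if not 0.0 < eps <= 1.0:
        raise ValueError("eps must lie in (0, 1]")
    t, bound = 0, 1.0
    while bound >= eps:
        bound *= 1.0 - c  # one more interval survived with chance <= 1 - c
        t += 1
    return t


if __name__ == "__main__":
    # Even a tiny per-interval risk drives the survival bound toward zero.
    for t in (10, 1_000, 100_000):
        print(t, survival_upper_bound(0.001, t))
```

However small `c` is, `intervals_until_below` always terminates, which is the sense in which a machine of bounded size cannot last forever with positive probability.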
If a machine, made up of parts with a probabilistic rate of decay, is to have a nonzero probability of lasting forever then it must be a growing device with no vital center of a fixed size. Whether such a device is possible is an open question. An answer, in the form either of a detailed construction of such a device or of a proof that it is impossible, would be interesting, at least in theory.

Gorn [41] discusses a simple model of machine habit-forming. The model allows for random choices among alternative responses, the probabilities of the various choices being modified each time a choice is made. Hence the model is one of the formation of habit by repetitive action rather than by reinforcement.

Von Neumann was seriously interested for many years in the problem of precisely characterizing the similarities and differences between a
general-purpose electronic digital computer and a human brain. The reader who is interested in this topic should consult his book [111]. Other references are a 1951 work [109] and a lecture to be published [113].

Artificial intelligence is a topic that has excited wide interest, both in people who are out to probe metaphysical questions about intelligence and in people who are simply interested in improving machine capabilities. It is beyond the scope of this paper to survey the literature in this field. The reader desiring such a survey is referred to Reitman [94] or to Minsky [76]. Here I shall make one broad distinction that I think is appropriate.

It is possible to get the machine to do something that could be called intelligent by programming it to follow a routine that a human mind has devised. Many of the achievements that come under the heading of artificial intelligence are of this variety. For example, almost every routine for playing a game that has so far appeared simply has the machine check through the various moves and counter-moves that can be made on a given configuration of the game. The machine checks all the reasonably likely configurations that can result so many moves ahead from a move, evaluates them according to a point system, and then decides on a move based on these evaluations. (See Samuel [99].) It should be clear that programming such procedures on a machine, although it does give the machine the appearance of intelligence, simply gets the machine to execute a human plan.

Is there a way of getting the machine to figure out its own plan? More deserving of the term “artificial intelligence” would be a scheme whereby the machine could be instructed in the rules of, say, chess and then, by playing a number of games, could learn to improve its own game without further human instruction. Such a scheme for a game as complicated as chess is beyond the state of the computer art at present.
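The look-ahead routine described above (examine the moves and counter-moves so many moves ahead, score the resulting configurations by a point system, then choose) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure of Samuel [99]: the toy game (a pile of stones, each player removes 1, 2, or 3, and whoever takes the last stone wins), the point system, and all function names are hypothetical choices made for the example.

```python
from typing import List, Optional, Tuple


def legal_moves(stones: int) -> List[int]:
    """A player may remove 1, 2, or 3 stones, but not more than remain."""
    return [take for take in (1, 2, 3) if take <= stones]


def evaluate(stones: int, machine_to_move: bool) -> int:
    """Point system from the machine's side: +1 win, -1 loss, 0 undecided."""
    if stones > 0:
        return 0  # search was cut off before the game ended
    # No stones left: whoever just moved took the last stone and won.
    return -1 if machine_to_move else 1


def look_ahead(stones: int, depth: int,
               machine_to_move: bool = True) -> Tuple[int, Optional[int]]:
    """Depth-limited search over moves and counter-moves.

    Returns (score, move): the machine picks the highest-scoring move,
    and the opponent is assumed to pick the lowest-scoring one.
    """
    moves = legal_moves(stones)
    if depth == 0 or not moves:
        return evaluate(stones, machine_to_move), None
    best_score: Optional[int] = None
    best_move: Optional[int] = None
    for take in moves:
        score, _ = look_ahead(stones - take, depth - 1, not machine_to_move)
        better = (best_score is None
                  or (machine_to_move and score > best_score)
                  or (not machine_to_move and score < best_score))
        if better:
            best_score, best_move = score, take
    return best_score, best_move


if __name__ == "__main__":
    print(look_ahead(5, 5))  # full search from a pile of five stones
```

With a shallow `depth` the routine must fall back on the point system for unfinished positions; with a deep one it plays the toy game perfectly. Either way it only carries out a plan its author devised. A scheme by which the machine learns its own plan, as just noted, is not yet available.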
However, were it possible it would be of great significance; for, quite conceivably, a machine programmed in this fashion could come to learn more about the game of chess than any human being.

These are the two approaches to artificial intelligence: programming the machine to carry out a complicated procedure for doing something difficult, and programming the machine to learn and develop its own procedure. (A more elaborate classification is given by Shannon [100].) Any argument as to whether either of these is a way of getting the machine to exhibit intelligent behavior is a verbal argument, to be settled only upon a precise definition of “intelligent behavior.” Although the concept of intelligence is vague, attempts to get the machine to simulate intelligence can nevertheless be quite fruitful as developments of the computer art.

Besides the attempt to get a general-purpose digital computer to exhibit intelligent behavior by suitably programming it, various attempts have been made to construct special-purpose machines that can do intelligent
things. Two surveys of these machines, which lie outside the scope of this article, are by R. J. Lee [67] and Hawkins [45].

Another psychological process is that of perception, which may be described as the organization of reactions to stimuli into a concept. For several years a rather controversial device called the “perceptron” has been developed and discussed by Rosenblatt. He claims that his device, in learning to organize stimuli on a visual field, makes use of “sensory areas,” “association areas,” and “concept areas,” whose roles are roughly the same as the corresponding areas in the animal brain. (See Rosenblatt [97] for a summary of this work and further references to his work. Hawkins [45] gives a brief technical review of Rosenblatt’s work, in the context of a survey of “self-organizing systems.”) Rosenblatt’s work falls in the area of pattern recognition, which is well outside the scope of this survey. A survey of this area is given by Minsky [76, Section II].

Analogies between the nervous system and machines have occupied the serious attention of several scientists in the last fifteen years. Nerve nets have been studied by McCulloch and Pitts [69], by Kleene [64], and in more recent papers (e.g., Willis [117]).

The term “cybernetics” was introduced in 1948 by Wiener [116], who felt that the set of concepts used in the study of the animal nervous system, computing devices and similar systems should be brought together into one scientific theory. It might seem that cybernetics should be identical to the theory of automata. However, Wiener’s theory is based on probability and statistics, and especially the statistical theory of communication. The theory of automata, as should be clear from the earlier sections of this survey, is based on logic and discrete mathematics.
This difference reflects a difference in purpose, which is difficult to characterize, except by saying that cybernetics seeks to study the statistical aspects of certain devices, and automaton theory seeks to study the logical and algebraic aspects of the same devices or similar devices. (That part of von Neumann’s work which is discussed in this section is part of cybernetics, although von Neumann does not use the term. Speaking in this way, we would also say that von Neumann’s paper [110], described in Section 3, contains some rather specific results of cybernetics.) One could argue about which theory is more fruitful, but I think we must face the fact that neither is so fruitful as to force us to forsake the problems that are studied as part of the other. The domain of application will probably be different: the theory of automata will probably be useful for constructing new machines and new systems, whereas cybernetics will probably be most useful in the analysis of the animal nervous system and very large machines which are already built. There seems to be no reason, however, why these two theories should not coalesce to become one theory, except perhaps that there are few scientists who
have sufficient command over both logic and statistics to be a force for bringing these two theories together. This same fact explains why probabilistic automaton theory has not developed very far.

This concludes my Survey of the Theory of Automata. Like other articles in Advances in Computers, it is intended as a dated report on the area of research that it covers. Were it to be written a few years hence, it would be radically different in outline and in content. Indications are that more and more research in the theory of automata will be carried out in the years ahead. Inevitably, as some problems are solved, other problems will always come forward, giving the theory a continually changing aspect.

References
In the case of proceedings of meetings, the date of the meeting, and not of publication, is given. In all other references, the date of publication is given. Some 1961 and 1962 dates are just reasonable guesses.

1. Aizerman, M. A., Gusev, L. A., Rozonoer, L. I., Smirnova, I. M., Tal, A. A., Finite automata. Automation and Remote Control (translation of Avtomatika i Telemekhanika) 21, 156-163 and 248-254 (1960).
2. Bar-Hillel, Y., The present status of automatic translation of languages, in Advances in Computers (F. L. Alt, ed.), Vol. I, pp. 92-164. Academic Press, New York, 1960.
3. Bazilevsky, J., Questions in the theory of temporal logical functions, in Questions in the Theory of Logical Machines (translation of Voprosi Teorii Matematicheskikh Mashin, a volume of papers in Russian), 1958.
4. Büchi, J. R., Regular Canonical Systems and Finite Automata, Dept. of Philosophy, Tech. Report No. 03105, 2794-7-T. University of Michigan Research Inst., Ann Arbor, Michigan, 1959.
5. Büchi, J. R., Weak second-order arithmetic and finite automata. Z. math. Logik u. Grundlagen Math. 6, 66-92 (1960).
6. Büchi, J. R., Periodic Sets of Words and Finite Automata, Abstr. Intern. Congr. for Logic, Methodology and Phil. of Sci., p. 1. Stanford, California, 1960.
7. Büchi, J. R., Elgot, C. C., and Wright, J. B., The nonexistence of certain algorithms in finite automata theory, Abstract. Notices Am. Math. Soc. 5, 98 (1958).
8. Burks, A. W., The Logic of Fixed and Growing Automata, Proc. Intern. Symposium on the Theory of Switching, Part I, pp. 147-188. Harvard University, Cambridge, Massachusetts, 1957.
9. Burks, A. W., Computation, behavior and structure in fixed and growing automata, in Self-organizing Systems, Papers of the Interdisciplinary Conf. on Self-organizing Systems, Chicago, 1959 (M. C. Yovits, ed.), pp. 282-311. Pergamon Press, New York, 1960.
10. Burks, A. W., and Wang, H., The logic of automata. J. Assoc. Computing Machinery 4, 193-218 and 279-297 (1957).
11. Burks, A. W., and Wright, J. B., Theory of logical nets. Proc. IRE 41, 1357-1365 (1953).
12. Burks, A. W., and Wright, J. B., Sequence Generators and Digital Computers, 1961, unpublished.
13. Cadden, W. J., Sequential Circuit Theory, Dissertation. Princeton University, 1956.
14. Cadden, W. J., Equivalent sequential circuits. IRE Trans. on Circuit Theory CT-6, 30-34 (1959).
15. Caldwell, S., Switching Circuits and Logical Design. Wiley, New York, 1958.
16. Chomsky, N., Three models for the description of language. IRE Trans. on Information Theory IT-2, 113-124 (1956).
17. Chomsky, N., On certain formal properties of grammars. Inform. and Control 2, 137-167 (1959).
18. Chomsky, N., and Miller, G. A., Finite state languages. Inform. and Control 1, 92-112 (1958).
19. Church, A., A note on the Entscheidungsproblem. J. Symbolic Logic 1, 40-41 and 101-102 (1936).
20. Church, A., The Calculi of Lambda-Conversion, Ann. Math. Studies No. 6, Princeton University Press, Princeton, New Jersey, 1941.
21. Church, A., Introduction to Mathematical Logic. Princeton University Press, Princeton, New Jersey, 1956.
22. Church, A., Applications of Recursive Arithmetic to the Problem of Circuit Synthesis, Summaries of the Summer Inst. for Symbolic Logic, pp. 3-50. Cornell University, Ithaca, New York, 1957.
23. Copi, I. M., Symbolic Logic. Macmillan, New York, 1954.
24. Copi, I. M., Elgot, C., Wright, J. B., Realization of events by logical nets. J. Assoc. Computing Machinery 5, 181-196 (1958).
25. Davis, M. D., A note on universal Turing machines, in Automata Studies (C. E. Shannon and J. McCarthy, eds.), pp. 167-176. Princeton University Press, Princeton, New Jersey, 1956.
26. Davis, M. D., Computability and Unsolvability. McGraw-Hill, New York, 1958.
27. de Leeuw, K., Moore, E. F., Shannon, C. E., Shapiro, N., Computability by probabilistic machines, in Automata Studies (C. E. Shannon and J. McCarthy, eds.), pp. 183-212. Princeton University Press, Princeton, New Jersey, 1956.
28. Deuel, P. (Univ. of British Columbia), The Behavioral Theory of Automata, 1961, unpublished.
29. Elgot, C., Decision Problems of Finite Automata Design and Related Arithmetics, Logic of Computers Group, Tech. Report No. 2722, 2794, 27554-T. University of Michigan Research Inst., Ann Arbor, Michigan, 1959.
30. Elgot, C. C., and Büchi, J. R., Decision problems of weak second order arithmetics and finite automata. Notices Am. Math. Soc. 5, 834 (1958); 6, 48 (1959).
31. Fatehchand, R., Machine recognition of spoken words, in Advances in Computers (F. L. Alt, ed.), Vol. I, pp. 193-231. Academic Press, New York, 1960.
32. Fitch, F. B., Representation of sequential circuits in combinatory logic. Phil. of Sci. 25, 263-279 (1958).
33. Gill, A., Characterizing experiments for finite-memory binary automata. IRE Trans. on Electronic Computers EC-9, 469-471 (1960).
34. Ginsburg, S., On the length of the smallest uniform experiment which distinguishes the terminal states of a machine. J. Assoc. Computing Machinery 5, 266-280 (1958).
35. Ginsburg, S., On the reduction of superfluous states in a sequential machine. J. Assoc. Computing Machinery 6, 259-282 (1959).
36. Ginsburg, S., A technique for the reduction of a given machine to a minimal state machine. IRE Trans. on Electronic Computers EC-8, 346-355 (1959).
37. Ginsburg, S., Synthesis of minimal state machines. IRE Trans. on Electronic Computers EC-8, 441-449 (1959).
38. Ginsburg, S., Connective properties preserved in minimal state machines. J. Assoc. Computing Machinery 7, 311-325 (1960).
39. Ginsburg, S., Sets of tapes accepted by different types of automata. J. Assoc. Computing Machinery 8, 81-86 (1961).
40. Ginsburg, S., Examples of Abstract Machines, 1961, unpublished.
41. Gorn, S., On the mechanical simulation of habit-forming and learning. Inform. and Control 2, 226-259 (1959).
42. Harris, Z., String Analysis for Center and Adjuncts, Department of Linguistics, Language Analysis Paper No. 5. University of Pennsylvania, Philadelphia, Pennsylvania, 1960.
43. Hartmanis, J., Symbolic analysis of a decomposition of information processing machines. Inform. and Control 3, 154-178 (1960).
44. Harvard Computation Laboratory, Bibliography of Works in Switching Theory, 1960, unpublished.
45. Hawkins, J. K., Self-organizing systems: a review and commentary. Proc. IRE 49, 31-48 (1961).
46. Hiz, H., Steps Toward Grammatical Recognition, Proc. Intern. Conf. for Standards on a Common Language for Machine Searching and Translation. Cleveland, Ohio, 1959. To be published by Interscience Publishers, New York.
47. Holland, J., Survey of Automata Theory, Memorandum of Project Michigan. University of Michigan, Ann Arbor, Michigan, 1959.
48. Holland, J., Cycles in logical nets. J. Franklin Inst. 270, 202-226 (1960).
49. Holland, J., Iterative Circuit Computers, Proc. Western Joint Computer Conf., pp. 259-265. San Francisco, 1960.
50. Huffman, D. A., The synthesis of sequential switching circuits. J. Franklin Inst. 257, 161-190 and 275-303 (1954).
51. Huffman, D. A., Canonical Forms for Information-lossless Finite-state Automata, 1959 Intern. Symposium on Circuit and Information Theory; see IRE Trans. on Circuit Theory CT-6, 41-59 (1959).
52. Huffman, D. A., Notes on information-lossless finite-state automata. Nuovo cimento [10] 13, Suppl., 397-405 (1959).
53. Humphrey, W. S., Switching Circuits with Computer Applications. McGraw-Hill, New York, 1958.
54. Huzino, S., On some sequential machines and experiments. Mem. Fac. Sci. Kyushu Univ. A12, 136-158 (1958).
55. Huzino, S., Reduction theorems on sequential machines. Mem. Fac. Sci. Kyushu Univ. A12, 159-179 (1958).
56. Huzino, S., On the existence of Scheffer stroke class in the sequential machines. Mem. Fac. Sci. Kyushu Univ. A13, 53-68 (1959).
57. Huzino, S., Some properties of convolution machines and sigma composite machines. Mem. Fac. Sci. Kyushu Univ. A13, 69-83 (1959).
58. Huzino, S., On some sequential equations. Mem. Fac. Sci. Kyushu Univ. A14, 50-62 (1960).
59. Ikeno, N., An Example of a Universal Turing Machine (Japanese), read at the Natl. Convention of the Inst. Elec. Communications Engineers of Japan, 1958.
60. Joshi, A. K., Computation of Syntactic Structure, Proc. Intern. Conf. for Standards on a Common Language for Machine Searching and Translation. Cleveland, Ohio, 1959. To be published by Interscience Publishers, New York.
61. Kemeny, J. G., Man viewed as a machine. Sci. American 192, 58-67 (1955).
62. Kleene, S. C., General recursive functions of natural numbers. Math. Ann. 112, 727-742 (1936).
63. Kleene, S. C., Introduction to Metamathematics. Van Nostrand, Princeton, New Jersey, 1952.
64. Kleene, S. C., Representation of events in nerve nets and finite automata, in Automata Studies (C. E. Shannon and J. McCarthy, eds.), pp. 3-42. Princeton University Press, Princeton, New Jersey, 1956.
65. Lee, C. Y., Automata and finite automata. Bell System Tech. J. 39, 1267-1295 (1960).
66. Lee, C. Y., Categorizing automata by W-machine programs. J. Assoc. Computing Machinery 8, 384-399 (1961).
67. Lee, R. J., Learning Machines, Proc. Bionics Symposium, Wright Air Development Division. Cincinnati, Ohio, 1960.
68. McCarthy, J., Recursive functions of symbolic expressions and their computation by machine. Commun. Assoc. Computing Machinery 3, 184-195 (1960).
69. McCulloch, W. S., and Pitts, W., A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 5, 115-133 (1943).
70. McNaughton, R., Symbolic Logic for Automata, Wright Air Development Division Tech. Note No. 60-244. Cincinnati, Ohio, 1960.
71. McNaughton, R., and Yamada, H., Regular expressions and state graphs for automata, IRE Trans. on Electronic Computers EC-9, 39-57 (1960).
72. Mealy, G. H., A method for synthesizing sequential circuits, Bell System Tech. J. 34, 1045-1079 (1955).
73. Medvedev, I. T., On a Class of Events Representable in a Finite Automaton (English translation by J. Schorr-Kon), Group Report No. 34-73. M.I.T., Lincoln Lab. Lexington, Massachusetts, 1958.
74. Minsky, M., Recursive Unsolvability of Post's Problem of "Tag," Tech. Report No. 54 G-0023. M.I.T., Lincoln Lab. Lexington, Massachusetts, 1960.
75. Minsky, M., A 6-Symbol 7-State Universal Turing Machine, Tech. Report No. 54 G-0027. M.I.T., Lincoln Lab. Lexington, Massachusetts, 1960.
76. Minsky, M., Steps toward artificial intelligence. Proc. IRE 49, 8-30 (1961).
77. Moore, E. F., A Simplified Universal Turing Machine, Proc. Meeting Assoc. Computing Machinery, pp. 50-55. Toronto, Ontario, Canada, 1952.
78. Moore, E. F., Gedanken experiments on sequential machines, in Automata Studies (C. E. Shannon and J. McCarthy, eds.), pp. 129-153. Princeton University Press, Princeton, New Jersey, 1956.
79. Moore, E. F., Artificial living plants. Sci. American 195, 118-126 (1956).
80. Moore, E. F., Review of four papers by Penrose. IRE Trans. on Electronic Computers EC-8, 407-408 (1959).
81. Moore, E. F., and Shannon, C. E., Reliable circuits using less reliable relays. J. Franklin Inst. 262, 191-208 and 281-298 (1956).
82. Myhill, J., Linear Bounded Automata, Wright Air Development Division Tech. Note No. 60-165. Cincinnati, Ohio, 1960.
83. Myhill, J., Possibilities of Favorable Mutation in Self-Reproducing Automata, Lecture Notes, Summer Engineering Conf. University of Michigan, Ann Arbor, Michigan, 1960.
84. Myhill, J., Nerode, A., and Tennenbaum, S., Fundamental Concepts in the Theory of Systems, Wright Air Development Center Tech. Report No. 57-624. Cincinnati, Ohio, 1957.
85. Nerode, A., Linear automaton transformations. Proc. Am. Math. Soc. 9, 541-544 (1958).
86. Netherwood, D. B., Logical machine design: a selected bibliography. IRE Trans. on Electronic Computers EC-7, 155-178 (1958); EC-8, 365-380 (1959).
87. Oettinger, A. G., Automatic syntactic analysis and the pushdown store, Proc. Symposia in Appl. Math. (Am. Math. Soc.) 12 (1961).
88. Patterson, G. W., Logical Syntax and Transformation Rules, Proc. 2nd Symposium on Large-Scale Digital Calculating Machinery, pp. 125-133. Cambridge, Mass., 1949.
89. Paull, M. C., and Unger, S., Minimizing the number of states in incompletely specified sequential switching functions, IRE Trans. on Electronic Computers EC-8, 356-367 (1959).
90. Penrose, L. S., Developments in the theory of self-replication. New Biol. 31, 57-66 (1960). (Penguin Books, Harmondsworth, England.)
91. Post, E. L., Finite combinatory processes: formulation I. J. Symbolic Logic 1, 103-105 (1936).
92. Rabin, M. O., and Scott, D., Finite automata and their decision problems. IBM J. Research and Development 3, 114-125 (1959).
93. Raney, G. N., Sequential functions. J. Assoc. Computing Machinery 5, 177-180 (1958).
94. Reitman, W. R., Information-processing Languages and Heuristic Programs: A New Stage in the Bead Game, Proc. Bionics Symposium, Wright Air Development Division. Cincinnati, Ohio, 1960.
95. Ritchie, R. W., Classes of Recursive Functions of Predictable Complexity, Doctoral dissertation. Princeton University, Princeton, New Jersey, submitted 1960.
96. Rose, G. F., An extended notion of computability, Abstr. Intern. Congr. for Logic, Methodology and Phil. of Sci., p. 14. Stanford, California, 1960.
97. Rosenblatt, F., Perceptron simulation experiments. Proc. IRE 48, 301-309 (1960).
98. Rubinoff, M., Remarks on the Design of Sequential Circuits, Proc. Intern. Symposium on the Theory of Switching, Part II, pp. 241-280. Harvard University, Cambridge, Massachusetts, 1957.
99. Samuel, A. L., Programming computers to play games, in Advances in Computers (F. L. Alt, ed.), Vol. I, pp. 165-192. Academic Press, New York, 1960.
100. Shannon, C. E., Computers and automata. Proc. IRE 41, 1234-1241 (1953).
101. Shannon, C. E., A universal Turing machine with two internal states, in Automata Studies (C. E. Shannon and J. McCarthy, eds.), pp. 157-166. Princeton University Press, Princeton, New Jersey, 1956.
102. Shannon, C. E., von Neumann’s contribution to automata theory. Bull. Am. Math. Soc. 64, 123-129 (1958).
103. Shepherdson, J. C., The reduction of two-way automata to one-way automata, IBM J. Research and Development 3, 198-200 (1959).
104. Shepherdson, J. C., and Sturgis, H. E., The Computability of Partial Recursive Functions, Abstr. Intern. Congr. for Logic, Methodology and Phil. of Sci., p. 17. Stanford, California, 1960.
105. Srinivasan, C. V., and Narasimhan, R., On the synthesis of finite sequential machines. Proc. Indian Acad. Sci. 50, 6842 (1959).
106. Trakhtenbrot, B. A., On operators realizable in logical nets. Doklady Akad. Nauk S.S.S.R. 112, 1005-1007 (1957).
107. Trakhtenbrot, B. A., Synthesis of logic networks whose operators are described by
means of single-place predicate calculus. Doklady Akad. Nauk S.S.S.R. 118, 646-649 (1958).
108. Turing, A. M., On computable numbers with an application to the Entscheidungsproblem. Proc. London Math. Soc. [2] 42, 230-265 (1936); 43, 544-546 (1937).
109. von Neumann, J., The general and logical theory of automata, in Cerebral Mechanisms in Behavior: The Hixon Symposium (L. A. Jeffress, ed.), 1951; Reprinted in The World of Mathematics (J. R. Newman, ed.), Vol. 4, pp. 2070-2098. Simon & Schuster, New York, 1956.
110. von Neumann, J., Probabilistic logics and the synthesis of reliable organisms from unreliable components, in Automata Studies (C. E. Shannon and J. McCarthy, eds.), pp. 43-98. Princeton University Press, Princeton, New Jersey, 1956.
111. von Neumann, J., The Computer and the Brain, Yale University Press, New Haven, Connecticut, 1958.
112. von Neumann, J., Five Lectures on Automata Theory (A. W. Burks, ed.), to be published by the University of Illinois Press, Urbana, Illinois, 1962.
113. von Neumann, J., The Theory of Automata: Construction, Reproduction and Homogeneity (A. W. Burks, ed.), to be published by the University of Illinois Press, Urbana, Illinois, 1962.
114. Wang, H., A variant to Turing’s theory of computing machines, J. Assoc. Computing Machinery 4, 63-92 (1957).
115. Wang, H., Symbolic Representation of Calculating Machines, Summaries of the Summer Inst. for Symbolic Logic, pp. 181-188. Cornell University, Ithaca, New York, 1957.
116. Wiener, N., Cybernetics. Technology Press, Cambridge, Massachusetts, 1948.
117. Willis, D., Plastic Neurons as Memory Elements, Proc. Intern. Conf. on Information Processing, pp. 2W298. UNESCO, Paris, 1959.
118. Yamada, H., Counting by a Class of Growing Automata, Doctoral dissertation. University of Pennsylvania, Philadelphia, Pennsylvania, 1960.
119. Yamada, H., A mode of real time operations of a subclass of Turing machines and the existence of a subclass of recursive functions which are not real time computable, to appear in IRE Trans. on Electronic Computers EC-10 (1961).
Author Index Numbers in parentheses are reference numbers and are included to assist in locating references when the authors' names are not mentioned at the point of referenre in t,he text. Numbers in italics refer to the page on which the reference is listed.
A Abntlie, J., 304, :?66 Ablow, C. M.,300 (3), 36'0' Adler, R., 196 (48), 201 Agmon, S., 77, 129 Aigerman, M. A., 385, 402, 403, 416 Akers, S. B., Jr., 366' Albrecht, 124 Allen, J. S., 250, 295 Ansbacher, F., 235 (72), 251, 23.3 Apker, I,., 184, 101 Arabi, AT., 321, 566 Arrow, K. J., 323, 366 Ascher, M., 62 (5, 7), 85, 124 Avdeyenko, A. I , 154 (3), 160 (S), LS!!
B lhchman, C. H., 238, 2Y.I Bar-Hillel, Y., 409, 416 Barker, J. E., 63 (6.12), 85, 125 Bartz, Von G., 251, 9693 Batten, G. W., 15, 50, 65, L$ Baumol, W. J., 366 Bay, Z.,250, Baxilevsky, J., 402, 416' Beale, E. M. L., 303, 324, 366 Beckmann, pvl. J., 366, 367 Behnke, H., 82, 130 Bellman, R., 567 Benders, F., Jr., 353, 350, 36'7 Berger, E., 131 Bergman, S., 59, 71 (10.3, 10.5, 10.9), 74, 75, 81, 82, 95, 117, I S $ , I d ) , 128, ITYO, 133
Bers, L., 75 (ll.G), 128 Bertolini, F., 1.31 Bessierc, F., 364, 367
Birkhoff, G., 40, 64, 60, 124 Blackwell, D., 323 (66a), 360 Blair, P. M., 15, 63 Bochner, S., 81, 130 Bock, F., 367 Boiteux, M., 367, 369 Rouma, B. C., 184 (43), 291 Bouzitat, J., 364, 367 Box, G. E. P., 67 (7.13), 68, 126 Boyle, W. S., 166 (19), 290 Bramble, J. H., 78, 129 Brian, P. L. T., 40, 64 Hrigham, G., 360 (3), 366 Brock, E. G., 158 (12), 2990 Brubnker, W. M., 223, 292 Hruining, IT., 181 (38), 182 (40), 192, 291 Buck, D. A., 232 (68), 292 Hiichi, J. R., 397, 398, 402, 416, 417 Burger, R. M., 229 (65), 292 Burks, A. W., 380, 383, 384, 389, 391, 396, 402, 416
C Cadden, W. J., 390, 417 Cnldwell, S., 384, 388, 417 Clamp, M., 235, 9603 Campbell, I. E., 172 (26), 203 (53), 204 (55), 215 (55), 218 (26), 290,2996 Candler, W., 371 Capuano, R., 64 (6.5), 126 Carleman, T., 81, 130 Carpentier, J., 364, 367 Cam, P. H., 140, 289 Catchpole, A. R., 353 (20), 356 (20), 367 Charbonnier, F. M., 158 (12, 13), 290 Charnes, A., 321 (133), 322, 367, 37s Cheney, E. K., I N , 324, 367
423
424
AUTHOR INDEX
Chcncy, L. I<., 367 Chomtiky, N., 409, 411, 417 Church, A., 394, 396, 398 (20), 4f7 Clenshaw, C. W., 86, 133 Collatz, L., 5 (5), 52, 69, 70, 76, 79, 124, 126, 129
Cook, J. S., 196, 291 Cooper, W. W., 367 Cooperman, P., 131 Copi, I. M., 394, 395, 417 Cosslett, V. E., 246, 249, 293 Courant, R., 36 (37), 63, 12.4, 130 Crandall, S. H., 27, 63 Crank, J., 18 (22), 63 Croes, G. A., 303,367 Grout, P. D., 70 (9.7), 127 Crumly, C. B., 196, 991 CWtiSS, J. H., 80,130 Cuthill, E. H., 15, 63 Cvetkov, B., 125
17 (20), 18 (21), 22 (27), 23 (30), 25 (32), 26 (32), 27 (30), 28 (34), 30 (8),34 (35), 35 (36), 36 (34), 37 (21, 39,40), 39 (401, 40 (21,43), 43 (27, 30, 45), 46 (55, 56), 47 (55), 48 (61), 49 (54, 62, 63), 50 (54), 62, 63, 64
D a n , R. J., 77, 129, 132 Duncumb, P., 222 (57), 246, 249, 251 (57), 292, 293
Dyke, W. P., 155 (6), 158 (12), 161 (B), 163 (15), 166 (17), 169 (22), 175 (28), 178, 289, 290, 291
E Efroymson, M. A., 86, 133 Ehrenberg, W., 198, 235 (72), 251, (72), 292, 293
Ehrlich, L. W., 47, 64 Eisemann, K., 335 (61), 338, 369 Elgot, C. C . , 395 (24), 397, 398, 416, 417 Eliison, M. I., 185, 291 Elliott, H. M., 79 (12.7), 129 ErdBlyi, A., 62, 12.4 Ermolaeva, T. Z., 184 (42), 991 Eselson, L., 370 Everhart. T. E., 246, 249, 293
D Daboni, L., 132 Dahl, 0.-J., 350, 367 D’Amico, C., 200 (52), 2991 Danforth, C. E., 80, 101, 118, 129, 133 Dantzig, G. B., 296 (42), 297 (48), 303, F 323,368, 371, 372, 377 Davis, M. D., 397, 398, 399, 404, 417 Faedo, S., 70 (9.2, 9.4), 126 Davis, P., 56, 57, 64 (6.8, 6.9), 86, 100, Fagen, E., 215 (56), 292 118, 120, 121, 123, 126, 130, 133 Fan, K., 369 De Boer, J. H., 178, 291 Farnsworth, H. E., 229, 292 Debreu, G., 368 Fassberg, H. E., 369 DeLand, E. C., 368 Fatehchand, R., 407, 417 de la VallBe-Poussin, C. J., 69, 126 Favreau, R. R., 360 (65), 369 deleeuw, K., 392, 41 7 Fekete, M., 83, 130 Deming, L. S., 68, 126 Ferguson, R. O., 369 Dennis, J. B., 347, 359, 369 Fiacco, V., 323, 369 d’Epenoux, F., 367,369 Fichera, G., 72 (10.7), 73, 77, 127, 128, Deuel, P., 388, 41 7 129, 131, 132 De Vitry, A. F., 369 Firth, B. G., 158 (9), 182 (9), 2990 Diaz, J. B., 78, 84, 117, 130, 131, 132 Fischer, R. E., 235, 292 Dickson, J. C., 363, 360 Fishman, H., 64 (6.11), 125 Dimsdale, B., 369 Fitch, F. B., 397, 403, 417 Dolan, W. W., 155 (6), 158 (13a), 161 Flatt, €1. P., 19, 47 (24), 63 (6), 163 (15), 166 (17), 169 (22), 178 Flood, M. M., 333 (67), 369 (22), 289, 290 Ford, L. R., Jr., 369 Dorfman, R., 369 Forsythe, G. E., 57, 62 (5.6, 57), 85, 124, Dorn, W. S., 569 132 Douglas, J., 5 (8),14 (€9,1.5 (13), 16 (8), Fort, D. M., 321, 370
425
AUTHOR INDEX
Fox, L., 69, 70 (9.5), 127 Franks, R. G. E., 360 (65), 369 Frederick, F. P., 363, 369 Friedrichs, K. O., 130 Frisch, R., 309, 350, 370 Fulkerson, D. R., 303 (45, 46), .,6.~, 6 ~ , 370 Funk, P., 131
G Gagua, M. B., 74, 128 Gale, D., 370 Gale, L. A., 125 Galer, G. S., 370 Galler, B. A., 370 Gallie, T. M., 17, 46, 47 (55), 53, 54 Gass, S. I., 317, 370, 375 Gassano, 364, 374 Gavurin, M. K., 366 (112), 576 Gawlik, H. J., 64 (6.13), 125 T h q e, T. H., 229 (65), 292 Germer, L. H., 166, 290 Gerstenhaber, M., 370 Giardina, 364, 373 Gill, A., 389, 417 Ginsburg, S., 388, 391, 407, 417, 418 Glickman, S., 370 Gnaedinger, R. J., 155 (7), 289 Goldstein, A. A., 126, 324, 367, 370 Golomb, M., 132 Gomory, R. E., 303, 305, 336 (90a), 370 Gonser, B. W., 172 (26), 203 (53), 204 (55), 215 (55), 218 (26), 290, 292 Good, R. H., 155 (5), 167 (20), 169 (21), 170 (24), 171 (25), 172 (25), 226 (20), 227 (60), 289, 290, 292 Gorn, S., 413, 418 Gourary, M. H., 366, 371 Greenberg, H. J., 131 Griffith, J. W., 158 (13a), 290 Gross, O., 303, 371 Gross, W., 132 Grünsch, H. J., 77, 129 Gurney, R. W., 158 (11), 235 (11), 290 Gusev, L. A., 385 (1), 402 (1), 403 (1), 416
H Haefer, R., 232, 2% Hagstrum, H. D., 200, &9:? Haley, K. B., 3% Hall, C. E., 203 (54), 292 Halmos, P. R., 20 (25), 53 Hammer, P. C., 124 Hansen, W. W., 164 (15a), 167 (15a), 176 (15a), 190 (15a), 290 Harris, Z., 409, 418 Hartley, H. O., 67, 126, 372 Hartmanis, J., 384, 418 Hawkins, J. K., 415, 418 Haxby, B. V., 181 (39), 291 Haynsworth, E., 57 (2.4), 123 Head, J., 125 Heady, E. O., 371 Henrici, P., 50 (65), 54, 75, 128 Herring, C., 169, 290 Herriot, J., 133 Hershoff, J. B., 126 Heyman, J., 320, 371 Higgins, T. J., 131 Highleyman, W. H., 179, 291 Hilbert, D., 36 (37), 53, 124 Hildebrand, F. B., 5 (1), 52, 70 (9.7), 127 Hille, E., 62 (5.3), 124 Hirsch, W. M., 303, 371 Hiz, H., 409 (46), 418 Hochstrasser, U. W., 63, 77, 125, 129 Hofeditz, W., 228 (62), 292 Holland, J., 384, 385, 402, 404 (49), 410, 418 Holland, L., 229 (64), 292 Holm, R., 166, 290 Horvath, J., 128 Houthakker, H. S., 371 Hubbard, B. E., 78, 129 Huffman, D. A., 388, 390 (51, 52), 418 Humbert, P., 77, 129 Humphrey, W. S., 384, 418 Hurwicz, L., 323 (7), 366 Huzino, S., 390 (54, 56, 57, 58), 418 Hyman, M., 18 (23), 53
I Ikeno, N., 401, 418 Isbell, J. R., 366 (91), 371
J Jacobs, W. W., 371 John, F., 10, 12, 50 (10), 52 Johnson, L., 370 Johnson, S. M., 303 (45, 46), 366, 369, 370, 371, 377
Jones, B. F., 46, 49, 50 (54), 54 Joshi, A. K., 409 (60), 418 Juncosa, M. L., 5 (2, 3, 4), 13, 22, 52
K Kadner, H., 70, 127 Kalker, J. J., 320, 372 Kaneda, S., 174 (27), 291 Kantorovitch, L. V., 82, 127, 365, 366, 371, 372
Kaplan, S., 18 (23), 53 Karlin, S., 366, 372 Karmazina, L. N., 63, 125 Karreman, H., 372 Kato, T., 132 Kaufmann, A., 364, 372 Kawaratani, T. K., 372 Kelley, J. E., Jr., 317, 322, 324, 372 Kellogg, O. D., 73, 75, 128 Kemeny, J. G., 402, 419 Kendall, M. G., 68, 125 Kilimin, A. J., 184 (42), 291 King, A., 228, 292 Kisliuk, P., 166 (19), 290 Kleene, S. C., 393, 397, 398 (62), 415, 419 Kliot-Dashinsky, M. I., 82, 130 Koehler, W. F., 198, 292 Kompfner, R., 196 (46), 291 Koopmans, T. C., 372 Kopal, Z., 64 (6.7), 126 Krelle, W., 372 Kron, G., 372 Krylov, V. I., 82, 127 Krzywoblocki, M. Z. v., 75, 128 Kuhn, H., 314, 373 Kuiken, C., 353 (20), 356 (20), 367 Kunzi, H. P., 372
L Laasonen, P., 5 (7), 14 (7), 52 Laderman, J., 367 Langmuir, I., 228, 292 Langmuir, R. V., 197 (49), 292 Lax, P. D., 43, 44 (46), 47, 54 Lee, C. Y., 388, 400, 419 Lee, R. J., 415, 419 Lees, M., 16 (19), 22, 27, 40, 46, 53, 54 Lemke, C. E., 321, 322, 323, 367, 373 Leontief, W., 373 Lesourne, J., 364, 373
Leutert, W., 44 (48), 54, 373 Levine, N., 126 Lieberstein, H. M., 71, 72, 78, 127 Liebmann, G., 242, 293 Loeb, H. L., 373 Longo, 365, 373 Lonseth, A. D., 70 (9.8), 127 Lotkin, M., 15, 17, 53 Louisell, W. H., 196 (47), 291 Lourie, J. R., 335 (61), 369 Lowan, A. N., 72, 128 Lutsau, V. G., 154 (3), 166 (3), 289
M McCarthy, J., 401, 419 McCulloch, W. S., 415, 419 McGuire, C. B., 374 McMahon, J., 132 McNaughton, R., 395, 396, 419 Madansky, A., 323, 373 Maehly, H. J., 126, 126 Maghout, K., 364, 373 Makower, H., 373 Malter, L., 158, 179, 182, 290 Manne, A. S., 303, 321, 373 Marcolongo, R., 77, 129 Marcus, M., 57 (2.4), 123 Markowitz, H. M., 324, 374 Marlow, W. H., 366 (91), 371 Martin, E. E., 158 (13), 175 (28), 290, 292 Masse, P., 364, 374 Mayer, L., 251, 293 Mead, C. A., 179, 291 Mealy, G. H., 388, 419 Medvedev, I. T., 402, 419 Mendenhall, H. E., 249, 293 Merrill, R. P., 374 Merriman, G. M., 72, 127 Metzger, R. W., 374 Miles, E. P., Jr., 74, 91, 128 Miller, G. A., 409 (18), 417 Milne, W. E., 5 (6), 36 (6), 52 Minsky, M., 400, 401, 414, 415, 419 Minty, G. J., 374 Miranda, C., 76, 129 Moore, E. F., 387, 388, 389, 392 (27), 400, 412, 417, 419 Mott, N. F., 158 (11), 235 (11), 290 Motzkin, T. S., 374
Mueller, E. W., 155 (5), 161 (14), 166, 167, 169 (21), 170, 171 (25), 172 (25), 178, 221, 226, 227, 289, 290, 292 Munkres, J., 333, 374 Munson, J. K., 360 (162), 374 Murnaghan, F. D., 126 Myhill, J., 382, 394, 408, 412, 419 Mykhlin, S. G., 131, 132
N Narasimhan, R., 389, 420 Nehari, Z., 73, 77, 82, 128, 129, 130 Nerode, A., 382 (84), 394 (84), 402, 419, 420 Netherwood, D. B., 384, 420 Newberry, S. P., 242, 293 Nichols, C. R., 374 Nicolescu, M., 77, 129 Nicolovius, R., 132 Nicolson, P., 18 (22), 53 Nishitana, Y., 174 (27), 291 Nixon, W. C., 154 (2), 166 (2), 289
O Oatley, C. W., 246 (77), 249 (77), 293 O’Brien, G., 18 (23), 53 Oettinger, A. C., 409, 420 Orchard-Hays, W., 374
P Paneth, F., 228 (62), 292 Patterson, G. W., 396, 420 Paul, M. C., 388, 420 Paul, W., 223, 292 Payne, L. E., 78, 129, 132 Parke, N. G., III, 125 Peaceman, D. W., 37 (38), 49 (62), 53, 54 Peach, M. O., 124 Penchot, J., 364, 374 Penrose, L. S., 412, 420 Pensak, L., 235 (71), 251, 298 Picone, M., 69, 70 (9.1), 72, 126, 127, 131 Pigot, D., 326, 327 (168), 364, 374 Pistorius, C. A., 184 (43), 291 Pitman, H. W., 158 (13), 290 Pitts, W., 415, 419 Pólya, G., 130, 131 Poritsky, H., 80, 101, 118, 129, 133 Post, E. L., 397, 420 Powell, C. F., 172, 203 (53), 204 (55), 215 (55), 218 (26), 290, 292 Powers, D. A., 155 (4), 221, 289 Pozzi, 364, 374 Prager, W., 131, 320, 371, 374 Pyle, L. D., 374 Pyne, I. B., 360 (174, 175), 374
R Rabin, M. O., 398, 407, 420 Rabinowitz, P., 56, 64 (6.8, 6.9, 6.14, 6.15), 86, 100, 118, 121, 123, 126 Rachford, H. H., 18 (21), 37 (21, 38), 40 (21), 49 (62), 53, 54 Radner, R., 375 Ralston, A., 325, 376 Ramo, S., 238, 293 Raney, G. N., 402, 420 Reinfeld, N., 376 Reinhard, H. P., 223 (58), 292 Reitan, D. K., 131 Reitman, W. R., 414, 420 Remez, E., 128 Reynolds, R. R., 117, 133 Richardson, L. F., 24 (31), 53 Richtmyer, R. D., 43, 44 (46, 47), 54 Ricossa, 365, 373 Riley, V., 317, 375 Ritchie, R. W., 409, 420 Rockafellar, R. T., 375 Rose, G. F., 393, 420 Rose, M. E., 15, 22, 47, 53, 54 Rosen, J. B., 323, 375 Rosenblatt, F., 415, 420 Rosenbloom, P. C., 132 Rovinsky, B. M., 154 (3), 166 (3), 289 Rozonoer, L. I., 385 (1), 402 (1), 403 (1), 416 Rubin, A. I., 360 (162), 374 Rubinoff, M., 388, 420 Rubinshtein, G. S., 366, 375 Russel, J. B., 63, 124 Rutishauser, H., 89, 133
S Saaty, T. L., 322, 376 Salzer, H., 64 (6.5), 125 Samuel, A. L., 414, 420 Samuelson, P., 369 Sargent, L. F., 369
Scarf, H., 366 Schell, E. D., 371 Schiffer, M., 71 (10.9), 74, 127 Schilier, R. E., 229 (65), 292 Schossberger, F. V., 215 (56), 292 Schreier, O., 57, 66, 123 Sciama, A., 86, 133 Scott, D., 398, 407, 420 Sewell, W. E., 79 (12.7), 129 Shaginyan, A. L., 79, 129 Shannon, C. E., 384, 392 (27), 401, 402, 414, 417, 420 Shapiro, N., 392 (27), 417 Shapley, L. S., 376 Shelton, H., 197 (49), 292 Shenitzer, A., 126 Shepherd, W. G., 181 (39), 291 Shepherdson, J. C., 400, 407, 420 Shetty, C. M., 376 Shindle, W. E., 376 Shohat, J. A., 62, 124 Shoulders, K. R., 156 (7a), 232 (68), 289, 298
Sienkiewicz, O. C., 321 (133), 373 Sinden, F. W., 361, 376 Skellett, A. M., 181, 192, 291 Sleeth, J. D., 158 (12), 290 Slobodyanskiy, M. G., 132 Smirnova, I. M., 385 (1), 402 (1), 403 (1), 416
Smith, D. M., 376 Smith, K. C. A., 246 (77), 249 (77), 293 Smith, L. W., 376 Smith, N., 323 (66a), 369 Snedecor, G. W., 68, 126 Sokolnikoff, I. S., 127 Sokolskaya, J. L., 184, 292 Solow, R., 369 Sommer, F., 82, 130 Spencer, R. C., 125 Sperner, E., 57, 66, 123 Spriggs, S., 215 (56), 292 Srinivasan, C. V., 389, 420 Sternglass, E. J., 180, 233, 291 Stiefel, E. L., 68, 126 Stone, J. J., 375 Stone, R. L., 320, 375 Strang, W. T., 22, 44, 53, 54 Stringer, J., 376 Stroud, A. H., 60 (4.7), 124
Sturgis, H. E., 400, 420 Sugata, E., 174, 291 Summers, S. E., 242 (75), 293 Synge, J. L., 57, 84, 109, 118, 123, 131, 132 Szegö, G., 61 (5.2), 62, 81, 82, 124, 130, 131
T Taft, E., 184, 291 Tal, A. A., 385 (1), 402 (1), 403 (1), 416 Talacko, J., 376 Tateishi, M., 174 (27), 291 Taylor, A. E., 57, 123 Tennenbaum, S., 382 (84), 394 (84), 419 Ticulka, F., 215 (56), 292 Tolstoi, A., 366, 376 Tompkins, E., 215 (56), 292 Topolyanskiy, D. B., 131 Tornqvist, L., 376 Trakhtenbrot, B. A., 388, 396, 420 Trefftz, E., 130 Trench, W., 47, 54 Trolan, J. K., 175 (28), 291 Tucker, A. W., 303, 314, 373, 376 Turing, A. M., 397, 399, 406, 421
U Udelson, B. J., 196, 291 Ullman, R. J., 367, 372 Unger, S., 388, 420 Uzawa, H., 323 (7), 366, 376
V Vajda, S., 376 Van Geel, W. C., 184, 291 Varga, R. S., 15, 40, 53, 54 Vasil'ev, G. F., 185 (44), 291 Vazsonyi, A., 376 Veidinger, L., 126 Vekua, I. N., 73, 74 (11.10), 75, 128 Vodicka, V., 117, 133 Vogel, W., 376 Volokobmskii, M., 179, 2991 Volterra, V., 49, 54 von Ardenne, M., 187 (44a), 237, 291 Von Hippel, A., 155 (4), 221 (4), 289 von Neumann, J., 392, 402 (113), 414 (109, 111, 113), 415, 421 von Zahn, U., 223 (58), 292 Votaw, D. F., 376
W Wachtel, M. M., 180 (35), 233 (35), 291 Wagner, H. M., 303, 322, 323, 376 Walker, M. R., 317, 372 Walsh, J. L., 62 (5.3), 78, 79, 80 (13.2), 82, 83, 124, 129, 130 Wang, H., 380, 383 (10), 389, 396, 400, 416, 421
Ward, L. E., Jr., 376 Warga, J., 377 Wargo, P., 181 (39), 291 Washizu, K., 132 Wasow, W., 13, 22, 52 Webb, K. W., 322, 375 Weber, C., 130 Wehner, G. K., 229, 292 Weinberger, H. F., 78, 132 Weinstein, A., 130, 131 Weisfeld, M., 62, 124 Weiss, G., 64 (6.14), 125, 181, 2991 Weissenberg, G., 251 (82), 293 Wells, O. C., 246 (77), 249 (77), 293 White, W. B., 377 Whitin, T. M., 303, 376 Wiener, N., 124, 415, 421 Wilf, H. S., 325, 376 Williams, A. C., 297 (223), 299, 369, 377 Williams, E., 74, 91, 128 Willis, D., 415, 421 Wilson, W., 286 Wise, H., 228, 292
Wiskott, D., 251 (82), 293 Witzgall, C., 186, 323, 377 Wolfe, P., 297 (48), 322, 323, 324, 368, 377 Wrench, J. W., 125 Wright, J. B., 384, 389, 391, 395 (24), 398, 416, 417
Wuerker, R. F., 197 (49), 292 Wymore, A. W., 124
Y Yamada, H., 395, 403, 404, 405, 406, 421 Yocum, W. H., 196 (46, 47), 291 Yokoya, H., 174 (27), 292 Young, D. M., 5 (2, 3, 4), 13, 22, 52 Young, D. M., Jr., 60, 124 Young, J. R., 176, 292 Young, R. D., 161, 290 Young, W. M., 369
Z Zalgaller, V. A., 366 (113), 372 Zalm, P., 158 (10), 184, 290 Zaremba, S., 71, 127 Zhdan, A. G., 185 (44), 292 Zolin, A. F., 72, 127 Zoutendijk, G., 315, 316 (227), 355, 356 (228), 377 Zucker, R., 64 (6.5), 126 Zukhovitskiy, S. I., 126 Zwieliny, K., 128 Zygmund, A., 21 (26), 53
Subject Index A
Accuracy of approximation, 5, 27, 31 Adder, 405 Alternating direction methods, 31, 34, 37 ff, 49 Amplifiers, vacuum tunnel effect, 190 ff, 286 electron-beam, 196 Analog computers, 360 ff, 364 Approximation see Accuracy, Chebyshev, Least squares, Mixed problems, Nonlinear Artificial intelligence, 379, 414 Automata, 379 ff behavior of, 383, 384, 393 ff continuous, 382 deterministic, 382, 392, 393 discrete, 382 finite, 380, 385 ff, 402 ff finite-memory, 389 fixed, 382 growing, 382, 397 ff, 402 ff, 410 linear-bounded, 408 nonsynchronous, 382 operations of, 402 ff probabilistic, 382, 391 ff structure of, 383, 384 synchronous, 382
B Backward difference equation, 13 ff, 31, 34, 35, 44, 46, 48, 49, 50 Bessel inequality, 57, 83 Biharmonic equation, 71, 72, 74, 76, 89 ff, 102-112, 118-120, 122, 123 Block-triangular basis, 297 Boolean algebra, 396 Boundary value problems, 4, 14, 19, 26, 28, 31, 32, 36, 37, 43, 57, 75, 77, 78, 83 ff, 89 ff, 116 ff Brain, human (analogy with computers), 414
C Calculation requirements, for parabolic equations, 30 ff Caterer problem, 296 Cathodoluminescence, 143 Chebyshev polynomials, C. approximation, 63, 67, 68, 86 Cleaning methods (for micromachining), 199 ff Closed orthogonal systems, 57, 73 Collocation, 70, 79 ff, 123 Complete systems of particular solutions, 72, 73 ff Composite simplex algorithm, 297, 362 Computability, 379, 397, 398, 401, 409, 410 Computable sequences, 406 Computable transformations, 403 Computer codes for linear programming, 309, 320, 325 ff for orthogonalization, 85 ff Condition number, 57 Conformal mapping, 59, 60, 81 ff Consistent difference equations, 10 Convergence, 10, 15, 19, 22, 23, 25, 26, 37, 36, 40, 41 ff, 48, 72, 89 ff degree of, 75 ff Corrosion, 175 Coupling, 145, 157, 287 Covariance see Variance Crank-Nicolson difference equation, 18 ff, 25, 31, 35, 39, 43, 47, 49 Critical path, 317 ff, 346 Cryotron, 154 Curve fitting, 64 ff, 322 Cutting plane method, 324 Cybernetics, 415
D Decision procedures, 398 Decomposition algorithm, 206 ff, 353, 364 Delay function, 402
Deposition (of materials), 141, 202 ff, 222, 287, 289 by evaporation, 2o i U, 287 reactive, 204, 215 ff, 287 Dirichlet inner product, 59, 71, S t Dirichlet integral, 83 Dirichlet problem, 70, 89 ff, 116 ff Discrete programming see Linear programming, integer Document storage, 144, 146, 183, 185 ff, 286 Domain of dependence, 13 “Don’t care” state graphs, problems, 388, 410 Duhamel principle, 11, 19, 24 Dynatron, 193
E Economization, 63 Effective calculability, 397 Elasticity theory, 59 Electroluminescence, 184 Electrolytic action in dielectrics, 167 Electromechanical filters, 189 ff Electron-beam activation of micromachining, 135 ff Electron guide, 195 ff Electron lens, 225, 243, 245, 256 ff, 273, 289 Electron microscopy, 146, 168, 174, 190, 203, 229, 232, 236, 238, 245, 275, 276, 278, 281, 283; see also Mirror microscopy scanning, 246, 249, 288 Electron multiplier, 180, 183, 223, 238, 242, 246, 247 ff, 250, 258, 286, 288 Electron optics, 141, 143, 149, 187, 235, 236 ff, 272, 282, 283, 288, 289; see also Electron microscopy Electrostatic capacity, 83 Electrostatic relays, 188 ff Elliptic partial differential equations general, 75, 76, 79 linear, 59, 70 ff, 75, 77, 122 Elliptic-parabolic-hyperbolic (mixed) equations, 72 Encapsulation, 172 ff, 181, 187, 202, 205, 217, 219, 220, 252, 280 Energy integral, 59 Energy methods, 27, 44 ff
Equivalence of automata, 388 Errors see Accuracy, Convergence, Round-off Error bounds, 75 ff Etching, 140, 143, 148, 209, 218, 222, 224 ff, 287, 289 Euler-Lagrange expressions, 59 Evaporation see Deposition Existence of solutions, 12 Experiments see Numerical Explicit difference equations, 4 ff
F Feasible directions see Gradient method Ferroelectric devices, 154 Ferromagnetic devices, 154 Field emission, 141, 158 ff, 178, 184, 185, 255, 286, 287; see also Tunnel effect microscopy, 146, 147, 149, 171, 179, 217, 218, 221, 226, 227, 233, 237, 253, 287, 288 Field ion microscopy, 146, 147, 237 Finite-state operation (of automata), 404 Fluorescence probes, 147, 149, 230, 236, 249, 288 Fluorescence spectroscopy, 168, 222 FORTRAN, 328 ff Fourier series, segments, 56, 62, 77 French method, 296, 326
G Galerkin method, 70 Games see Theory of games Gaseous diffusion, 321 Gasoline blending (oil refinery) problems, 363, 365 Gradient method of feasible directions, 295, 314 ff Gram determinant, matrix, 57, 63, 88 Gram-Schmidt method of orthogonalization, 61, 62, 86, 89, 120, 123 Graph theory, 364 Green’s function, 72 Green’s identity, 84 Growth, imitation of, by automata, 411
H Habit-forming automata, 413 Harmonic analysis, 19
functions, 72, 79, 88, 122 polynomials, 73, 78, 79 Heat equation, 2, 4, 15, 17, 31, 37, 41, 46 Heating effects of tunnel diodes, 166 Higher-order correct difference equations, 25 ff Hungarian method, 332, 333 Hydrodynamics, application of orthonormalization to, 75 Hypergeometric function, 75
I Immortal automata, 412 Inadmissible state, 389 Information, 381, 390 Initial value problems, 8 Inner products, 56, 58 ff, 62 computation of, 59 ff, 86 Inner product spaces, 56, 58, 65 Instability see Stability Integer Programming see Linear Programming, integer Integral operators, 74 Integro-differential equations, 49 ff Interpolation, 79 ff; see also Mixed problems Ion collector, 205, 208, 210 Ion gauge, 210-213, 225 Ion pumping, 177, 269 ff, 289
K I
L Lambda convertibility, 398 Landau order notation, 3 Laplace equation, 70, 71, 73, 74, 77, 89 ff, 116 ff iterated, 74 Leading variables, method of, 297 Learning machines, 379, 411, 414 Least squares, 59, 72, 89 ff, 116 ff approximation of functions, 64 ff, 85 ff geometry of, 56 ff methods for ordinary differential equations, 69 ff for partial differential equations, 75 polynomials, 57
relation to collocation and interpolation, 79 ff Life-like processes, 411 ff Lifetime of automata see Immortal automata of microelectronic components, 175 ff, 286; see also Stability Light detector, 183 ff Light generator, 184 ff Linear equations see Overdetermined systems Linear independence, 57, 60, 74 Linear programming, 67, 295 ff applications of, 317 ff integer (discrete), 302 ff, 336 ff, 353, 356, 358, 364 parametric, 319 stochastic, 323 Logical nets, 389
M Machine (shop) loading, 321, 335 Magnetic shielding, 245; see also Ferromagnetic Mass spectrometry, 141, 143, 147, 211, 222, 223, 230, 273 Maximum decrease of objective form, 297 Maximum principles, 5, 34, 47, 75-77 Memory devices, 193 ff Meters, 381 Microdocuments see Document storage Microelectronics, 135 ff Micromachining, 135 ff Micromanipulator, 190, 244 ff Microscopy see Electron, Field emission, Field ion Mirror microscope, 251 ff, 288 Mixed boundary value problems (Robin's problems), 70, 90, 101, 109, 118, 119 Mixed partial differential equations see Elliptic-parabolic-hyperbolic Mixed problems of interpolation and approximation, 65 Monte Carlo method, 364 Multidimensional distribution problem, 335, 341 Multiplex method, 295, 309 ff, 350 Multiply connected regions, 73, 82, 93, 94, 117
N Nerve nets, 415 Nets, logical, 389 Network flow, 320, 333 ff, 340, 355 Networks see Switching Neumann function, 72 Neumann problems, 70, 90 Nonlinear approximation problems, 66, 69 Nonlinear differential equations, 2, 7, 15, 18, 22, 27, 29, 37, 40, 47 Nonlinear programming, 323 ff, 356, 360 Norm, 56, 59, (Chebyshev, nonquadratic, uniform) 67 Normal equations, 57 Numerical experiments, 89 ff, 116 ff
O Oil refinery problems see Gasoline blending Ordinary differential equations see Least squares Orthogonal polynomials, 61, 62, 82, 85 ff, 112-116, 120, 121 Tables of, 63 ff Orthogonality, 56 Orthogonalization, methods of, 60 ff codes, 85 ff Orthonormalization, 55 ff Overdetermined systems of linear equations, 68 ff Overrelaxation, 34
P Palindromes, 408, 410 Parabolic differential equations, 1 ff systems of, 48 ff Partial differential equations see Elliptic, Parabolic Particular solutions of differential equations, 74; see also Complete systems Pattern generation (by electron beams), 244, 254 ff, 289 Perception by automata, 411, 415 Perceptron, 415 Petroleum refinery problems see Gasoline blending Photoelectron emission, 143, 184 Planning and scheduling, 321, 322 Plasma system, 197 Polishing (in micromachining), 198 Polyharmonic functions, 72 Polynomials see Orthogonal Potential theory, 59, 60, 71 Primal-dual algorithm, 297 Propositional calculus, 396
Q Quadratic functionals, 57, 83 ff Quadratic objective function, 324, 358 Quadrature, approximate, 60 Gaussian, 63 Quasilinear equations, 16, 22
R Real-time growing automaton, 410 Real-time operation (of automata), 402 ff Recognition, 407 ff, 411 Recurrence relation for orthogonal polynomials, 61, 62, 122 Recursive functions, 398, 401, 410 Reduction of state graphs, 387 Refinery problems see Gasoline blending Regular events, expressions, sets, 394, 407 Representation (of sets by automata), 393, 407, 410 Reproduction, imitation of, by automata, 411, 412 Resists (to etching), production of, 230 ff, 288 Ritz method, 70 Robin’s problems see Mixed boundary value problems Round-off errors, 15, 57, 60, 61
S Scaling, of electrical properties, 151 ff of material properties, 155 ff SCEMP, 361 ff Scheduling see Planning Schwarz inequality, 58 SCROL, 327 Secondary emission devices, 180 ff, 191, 193, 205 Self-adjoint equations, 16 Self-formation, 144, 170 ff, 180, 191 Sequential functions, 402 Servomechanisms, 381 Several space variables, differential equations in, 31 ff SHARE, 325 ff, 362 Shop loading see Machine loading
Simplex method, 304, 325 ff, 362, 364 Single crystals, 221 ff Smoothing (of electronic components), 201 ff Solid state tunnel effect devices, 178 ff Sparse basis technique, 297 Spherical harmonics, 73 Stability abstract analysis of, 41 ff of difference equations, 11, 19, 21, 24, 28, 29, 32, 33, 34, 36, 37 of microelectronic components and circuits (see also Lifetime), 169 ff, 177, 179, 181, 189, 219, 220, 232, 233, 253, 255, 275, 286, 287 States, of automata, 385 graphs, 386, 396 initial, 386 Stefan problem, 46 ff Stieltjes integrals, 58 Stochastic programming see Linear programming Strongly connected state graphs, 389 Structural design, 320 ff Structural linguistics, 409 Suspension, electrodynamic, 197 Switching circuits, networks, theory, 379, 381, 383, 384, 392, 398, 412
T Theory of games, 322 ff Transfinite diameter, 83, 89, 112-116, 120, 121 Transportation problem, 296, 299, 303, 325 ff, 364 Traveling wave tube, 163, 196 Tridiagonal system of equations, 14, 16, 17, 25, 28, 37 Tube characteristics (of tunnel effect devices), 163 ff Tunnel effect devices (see also Field emission), 142, 143, 145, 154, 158 ff, 189, 193, 195, 217, 254, 285 ff Tunnel emission, 204, 220, 234, 287 Turing machine, 380, 385, 397-402, 410 universal, 399
U Ultrahigh vacuum, 265, 289 Undecidable sets of questions see Decision procedures Uniqueness of solutions, 12 Unstable differential equations, unconditionally, 24 ff
V Vacuum apparatus, 141, 260 ff, 289 Vacuum handling techniques, 176, 289 Variance-covariance matrix, 88
W Wang-type machines, 400 Warehouse allocation problems, 347 Wave equation, 7d