$$\frac{1}{T}\int_{-T/2}^{T/2} [f(\tau)]^2\,d\tau = \frac{a_0^2}{4} + \frac{1}{2}\sum_{k=1}^{\infty}\left(a_k^2 + b_k^2\right) = \sum_{k=-\infty}^{\infty}|c_k|^2 \qquad (4.11\text{-}7)$$
whenever the integral on the left exists (Parseval's Theorem; see also Sec. 15.2-4). Table D-1 of Appendix D lists the Fourier coefficients and the mean-square values (7) for a number of periodic functions. Refer to Sec. 20.6-6 for numerical harmonic analysis.
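Equation (4.11-7) lends itself to a direct numerical spot check. The sketch below is only an illustration (the sawtooth f(t) = t on (−½, ½), with its standard coefficients a_k = 0, b_k = (−1)^{k+1}/πk, is an assumed test signal, not part of the text):

```python
import numpy as np

# Sawtooth f(t) = t on (-1/2, 1/2), period T = 1:
# a_k = 0, b_k = (-1)^(k+1) / (pi*k), c_0 = 0.
# Parseval (4.11-7): (1/T) ∫ f² dτ = a0²/4 + ½ Σ (a_k² + b_k²).
k = np.arange(1, 200_001)
b_k = (-1.0) ** (k + 1) / (np.pi * k)
rhs = 0.5 * np.sum(b_k ** 2)          # a0 = a_k = 0 for the sawtooth
lhs = 1.0 / 12.0                      # (1/T) ∫_{-1/2}^{1/2} t² dt
assert abs(lhs - rhs) < 1e-6
print(lhs, rhs)
```

The truncated tail of the series is of order 1/(2π²N), so 2·10⁵ terms already agree with the mean square 1/12 to six decimals.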
(c) Functions Which Can Be Expressed as Fourier Integrals.
Let f(t) be a real function such that $\int_{-\infty}^{\infty} |f(\tau)|\,d\tau$ exists. Then
f(t) = (" c(v)e2«ivt dv = f°° = —)= / Ctoe*" da, with c(y) s / °° f(T)e~2^ dr = *V(t«)
(4.11-8)
C(o>) =77= ^ f(r)e~^dr ^-L=FF(iu) =-1= c(v) V27T V2tt
(CO = 27TJ>)
throughout every open interval where f(t) is of bounded variation, if one defines f(t) = ½[f(t − 0) + f(t + 0)] at each discontinuity. A function f(t) having the Fourier transform c(ν) will be called an inverse Fourier transform ℱ⁻¹[c(ν)] of c(ν); under the given conditions, Eq. (8) defines ℱ⁻¹[c(ν)] uniquely wherever f(t) is continuous. Equation (8) may be rewritten in different forms, e.g.,
/W ="IT JOL°° d(a Jf°°-co /to cos w(< ~ *) &
=V^ Jo lC(w)l cos ^ +ar& C(»)] <*w = 2 r° [A(v) cos 2tt^ + £(?) sin 2tt^] dp
(co = 2wv)
(4.11-9)
The real functions A(ν) and B(ν) in Eq. (9) are, respectively, the cosine transform of the even part (Sec. 4.2-2b) of f(t), and the sine transform of the odd part of f(t):
$$A(\nu) = \frac{c(\nu) + c(-\nu)}{2} = \mathscr{F}_c\!\left[\frac{f(t) + f(-t)}{2}\right] = \int_{-\infty}^{\infty} f(\tau)\cos 2\pi\nu\tau\,d\tau$$
$$B(\nu) = \frac{i[c(\nu) - c(-\nu)]}{2} = \mathscr{F}_s\!\left[\frac{f(t) - f(-t)}{2}\right] = \int_{-\infty}^{\infty} f(\tau)\sin 2\pi\nu\tau\,d\tau \qquad (\nu > 0)$$
$$c(\nu) = c^*(-\nu) = A(\nu) - iB(\nu) \qquad (4.11\text{-}10)$$
A(ν) = 0 if f(t) is odd, and B(ν) = 0 if f(t) is even (see also Sec. 4.11-3b). The Fourier-integral expansion describes f(t) as a "sum" of infinitesimal sinusoidal components with frequencies ν or circular frequencies ω = 2πν (ν > 0); the functions 2|c(ν)| and arg c(ν) respectively define the amplitudes and the phase angles of the sinusoidal components. Note that c(−ν) = c*(ν), C(−ω) = C*(ω), and that
$$\int_{-\infty}^{\infty} |f(\tau)|^2\,d\tau = \int_{-\infty}^{\infty} |c(\nu)|^2\,d\nu = \frac{1}{2\pi}\int_{-\infty}^{\infty} |F_F(i\omega)|^2\,d\omega = \int_{-\infty}^{\infty} |C(\omega)|^2\,d\omega \qquad (4.11\text{-}11)$$
whenever the integral on the left exists (Parseval's Theorem; see also Table 4.11-1).
(d) A more general type of function can be expressed as a sum of a function (8) and a set of periodic functions (6), so that both a "band spectrum" and a "line spectrum" exist (see also Sec. 18.10-9). The treatment of Fourier series and Fourier integrals can be formally unified through the introduction of generalized (integrated) Fourier transforms (Sec. 18.10-10).
(e) The Paley-Wiener Theorem. Given a positive real function Φ(ω), there exists a real function f(t) ≢ 0 such that $\int_{-\infty}^{\infty} |f(t)|^2\,dt$ exists, f(t) = 0 for either t > 0 or t < 0, and
$$|F_F(i\omega)|^2 \equiv \left|\int_{-\infty}^{\infty} f(t)\,e^{-i\omega t}\,dt\right|^2 \equiv \Phi(\omega)$$
if and only if Φ(−ω) ≡ Φ(ω) and both $\int_{-\infty}^{\infty} \Phi(\omega)\,d\omega$ and $\int_{-\infty}^{\infty} \dfrac{|\log\Phi(\omega)|}{1 + \omega^2}\,d\omega$ exist (Paley-Wiener Theorem).
4.11-5. Representation of Functions and Operations in Terms of Fourier Coefficients or Fourier Transforms (see also Secs. 4.10-1 and 8.3-1). (a) Uniqueness Theorem. A suitably integrable function f(t) uniquely defines its Fourier coefficients (2b) or its Fourier transform. Conversely, a complete set of Fourier coefficients or a Fourier transform uniquely defines the corresponding function f(t) almost everywhere (Sec. 4.6-14b) in the interval of expansion; in particular, f(t) is uniquely defined at each point of continuity in the interval of expansion. This uniqueness theorem holds even if the Fourier series or Fourier integral does not converge (see also Sec. 4.11-7).
Note that not every trigonometric series (not even every convergent trigonometric series) is a Fourier series, nor is every function c(ν) a Fourier transform (see also Sec. 4.11-2c).
(b) Operations with Fourier Series. Given f(t) with the Fourier coefficients a_k, b_k, c_k and g(t) with the Fourier coefficients a_k′, b_k′, c_k′, the function αf(t) + βg(t) has the Fourier coefficients αa_k + βa_k′, αb_k + βb_k′, αc_k + βc_k′ (term-by-term addition and multiplication by constants). Term-by-term integration of a Fourier series (2) over an interval (t₀, t) in the interval of expansion yields a series converging to $\int_{t_0}^{t} f(\tau)\,d\tau$. The theorem holds for all values of t₀ and t if f(t) is periodic with period T. Note that these theorems do not require convergence of the given Fourier series. Refer to Sec. 4.8-4c for differentiation of infinite series.
(c) Properties of Fourier Transforms. Table 4.11-1 lists the most important properties of Fourier transforms (see also Sec. 8.3-1). Note also
$$\mathscr{F}_c[f(t)\cos 2\pi\nu_0 t] = \tfrac{1}{2}[c_c(\nu + \nu_0) + c_c(\nu - \nu_0)]$$
$$\mathscr{F}_c[f(t)\sin 2\pi\nu_0 t] = \tfrac{1}{2}[c_s(\nu + \nu_0) - c_s(\nu - \nu_0)]$$
$$\mathscr{F}_s[f(t)\cos 2\pi\nu_0 t] = \tfrac{1}{2}[c_s(\nu + \nu_0) + c_s(\nu - \nu_0)]$$
$$\mathscr{F}_s[f(t)\sin 2\pi\nu_0 t] = \tfrac{1}{2}[c_c(\nu - \nu_0) - c_c(\nu + \nu_0)]$$
$$\mathscr{F}_c[f^{(2r)}(t)] = (-1)^r(2\pi\nu)^{2r}\,\mathscr{F}_c[f(t)] - 2\sum_{j=0}^{r-1}(-1)^j(2\pi\nu)^{2j} f^{(2r-2j-1)}(0+0)$$
$$\mathscr{F}_c[f^{(2r+1)}(t)] = (-1)^r(2\pi\nu)^{2r+1}\,\mathscr{F}_s[f(t)] - 2\sum_{j=0}^{r}(-1)^j(2\pi\nu)^{2j} f^{(2r-2j)}(0+0)$$
Table 4.11-1. Properties of Fourier Transforms (see also Sec. 4.11-3 and Table 8.3-1)
Let
$$\mathscr{F}[f(t)] \equiv \int_{-\infty}^{\infty} f(t)\,e^{-2\pi i\nu t}\,dt = c(\nu) \equiv F_F(i\omega) = \sqrt{2\pi}\,C(\omega) \qquad (\omega = 2\pi\nu)$$
$$f(t) = \int_{-\infty}^{\infty} c(\nu)\,e^{2\pi i\nu t}\,d\nu = \frac{1}{2\pi}\int_{-\infty}^{\infty} F_F(i\omega)\,e^{i\omega t}\,d\omega = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} C(\omega)\,e^{i\omega t}\,d\omega$$
and assume that the Fourier transforms in question exist.
(a) $\mathscr{F}[\alpha f_1(t) + \beta f_2(t)] \equiv \alpha\mathscr{F}[f_1(t)] + \beta\mathscr{F}[f_2(t)]$  (linearity)
$$\mathscr{F}[f^*(t)] \equiv c^*(-\nu) \equiv F_F^*(-i\omega)$$
$$\mathscr{F}[f(at)] = \frac{1}{a}\,c\!\left(\frac{\nu}{a}\right) \equiv \frac{1}{a}F_F\!\left(\frac{i\omega}{a}\right) \quad (a > 0) \qquad \text{(change of scale; similarity theorem)}$$
$$\mathscr{F}[f(t + \tau)] = e^{2\pi i\nu\tau}c(\nu) \equiv e^{i\omega\tau}F_F(i\omega) \qquad \text{(shift theorem)}$$
(b) Continuity Theorem. $\mathscr{F}[f(t, \alpha)] \to \mathscr{F}[f(t)]$ as α → a implies f(t, α) → f(t) wherever f(t) is continuous. Analogous theorems apply to Fourier cosine and sine transforms.
(c) Borel's Convolution Theorem. $\mathscr{F}[f_1(t)]\,\mathscr{F}[f_2(t)] = \mathscr{F}[f_1(t) * f_2(t)]$, where
$$f_1(t) * f_2(t) \equiv \int_{-\infty}^{\infty} f_1(\tau)f_2(t - \tau)\,d\tau = \int_{-\infty}^{\infty} f_1(t - \tau)f_2(\tau)\,d\tau$$
$$\mathscr{F}[f_1(t)f_2(t)] = \int_{-\infty}^{\infty} c_1(\lambda)c_2(\nu - \lambda)\,d\lambda = \int_{-\infty}^{\infty} c_1(\nu - \lambda)c_2(\lambda)\,d\lambda$$
$$= \int_{-\infty}^{\infty} F_{F1}(i\lambda)F_{F2}[i(\omega - \lambda)]\,\frac{d\lambda}{2\pi} = \int_{-\infty}^{\infty} F_{F1}[i(\omega - \lambda)]F_{F2}(i\lambda)\,\frac{d\lambda}{2\pi}$$
(d) Parseval's Theorem. If $\int_{-\infty}^{\infty} |f_1(t)|^2\,dt$ and $\int_{-\infty}^{\infty} |f_2(t)|^2\,dt$ exist, then
$$\int_{-\infty}^{\infty} \mathscr{F}^*[f_1(t)]\,\mathscr{F}[f_2(t)]\,d\nu = \int_{-\infty}^{\infty} f_1^*(t)f_2(t)\,dt$$
(e) Modulation Theorem.
$$\mathscr{F}[f(t)e^{i\omega_0 t}] \equiv F_F[i(\omega - \omega_0)] \equiv c(\nu - \nu_0)$$
$$\mathscr{F}[f(t)\cos\omega_0 t] \equiv \tfrac{1}{2}\{F_F[i(\omega - \omega_0)] + F_F[i(\omega + \omega_0)]\} \equiv \tfrac{1}{2}[c(\nu - \nu_0) + c(\nu + \nu_0)]$$
$$\mathscr{F}[f(t)\sin\omega_0 t] \equiv \tfrac{1}{2i}\{F_F[i(\omega - \omega_0)] - F_F[i(\omega + \omega_0)]\} \equiv \tfrac{1}{2i}[c(\nu - \nu_0) - c(\nu + \nu_0)]$$
(f) Differentiation Theorem.
$$\mathscr{F}[f^{(r)}(t)] = (2\pi i\nu)^r c(\nu) \equiv (i\omega)^r F_F(i\omega) \qquad (r = 1, 2, \ldots)$$
provided that f^{(r)}(t) exists for all t, and that all derivatives of lesser order vanish as |t| → ∞ (2πν = ω).
for r = 0, 1, 2, . . . , provided that the derivative on the left exists for 0 < t < ∞, and that all derivatives of lesser order vanish as t → ∞.
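The entries of Table 4.11-1 can be spot-checked by direct quadrature. The sketch below verifies the shift theorem for an assumed Gaussian test signal f(t) = e^{−t²} (chosen only because it decays fast enough for accurate numerical integration):

```python
import numpy as np

# Shift theorem (Table 4.11-1a): F[f(t + tau)] = e^{2*pi*i*nu*tau} c(nu).
t = np.linspace(-20.0, 20.0, 400_001)
nu, tau = 0.7, 1.3

def transform(g):
    # c(nu) = ∫ g(t) e^{-2*pi*i*nu*t} dt by the trapezoidal rule
    y = g * np.exp(-2j * np.pi * nu * t)
    return np.sum((y[:-1] + y[1:]) * np.diff(t)) / 2.0

lhs = transform(np.exp(-(t + tau) ** 2))             # transform of the shifted f
rhs = np.exp(2j * np.pi * nu * tau) * transform(np.exp(-t ** 2))
assert abs(lhs - rhs) < 1e-6
print(abs(lhs - rhs))
```

The same `transform` helper can be reused to check linearity or the modulation theorem at other values of ν.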
4.11-6. Dirichlet's and Fejér's Integrals (see also Sec. 4.11-7). The partial sums
$$s_0(t) = \frac{1}{2}a_0 \qquad s_n(t) = \frac{1}{2}a_0 + \sum_{k=1}^{n}\left(a_k\cos k\frac{2\pi t}{T} + b_k\sin k\frac{2\pi t}{T}\right) \qquad (n = 1, 2, \ldots) \qquad (4.11\text{-}12)$$
and the corresponding arithmetic means $\sigma_{n-1}(t) = \dfrac{1}{n}\displaystyle\sum_{k=0}^{n-1} s_k(t)$ (Sec. 4.8-6c) of a Fourier series (2) may be written
$$s_n(t) = \frac{1}{T}\int_{-T/2}^{T/2} f(t + \tau)\,\frac{\sin(2n + 1)\dfrac{\pi\tau}{T}}{\sin\dfrac{\pi\tau}{T}}\,d\tau = \frac{2}{T}\int_0^{T/2} \frac{f(t - \tau) + f(t + \tau)}{2}\,\frac{\sin(2n + 1)\dfrac{\pi\tau}{T}}{\sin\dfrac{\pi\tau}{T}}\,d\tau \qquad \text{(Dirichlet's integral)} \qquad (4.11\text{-}13)$$
$$\sigma_{n-1}(t) = \frac{2}{nT}\int_0^{T/2} \frac{f(t - \tau) + f(t + \tau)}{2}\left[\frac{\sin\dfrac{n\pi\tau}{T}}{\sin\dfrac{\pi\tau}{T}}\right]^2 d\tau \qquad \text{(Fejér's integral)} \qquad (4.11\text{-}14)$$
4.11-7. Summation by Arithmetic Means. (a) The partial sums of a Fourier series may not constitute useful approximations to f(t). This may be true, in particular, if the series diverges, or if the partial sums "overshoot" f(t) badly near a discontinuity of f(t) (nonuniform convergence near the discontinuity; Gibbs phenomenon). One may then resort to summation by arithmetic means (Sec. 4.8-6c). Every Fourier series (2) is summable by arithmetic means to the sum ½[f(t − 0) + f(t + 0)] for all t in (−T/2, T/2) where the latter function exists (Fejér's Theorem). The arithmetic means converge to f(t) almost everywhere in the interval of expansion; they converge uniformly to f(t) on every open subinterval of (−T/2, T/2) where f(t) is continuous.
(b) Similarly, if $\int_{-\infty}^{\infty} |f(\tau)|\,d\tau$ exists, and f(t) is continuous in every finite interval, then the arithmetic means
$$\frac{\lambda}{2\pi}\int_{-\infty}^{\infty} f(t + \tau)\left[\frac{\sin\dfrac{\lambda\tau}{2}}{\dfrac{\lambda\tau}{2}}\right]^2 d\tau \qquad (4.11\text{-}15)$$
converge uniformly to f(t) in every finite interval as λ → ∞. The uniform convergence extends over (−∞, ∞) if f(t) is uniformly continuous in (−∞, ∞).
4.11-8. Multiple Fourier Series and Integrals.
(a) Given an n-dimensional region of expansion defined by $a_j < t_j \le a_j + T_j$, j = 1, 2, . . . , n, the multiple Fourier series generated by a function $f(t_1, t_2, \ldots, t_n)$ such that
$$\int_{a_1}^{a_1+T_1}\int_{a_2}^{a_2+T_2}\cdots\int_{a_n}^{a_n+T_n} |f(\tau_1, \tau_2, \ldots, \tau_n)|\,d\tau_1\,d\tau_2\cdots d\tau_n$$
exists is defined as
$$\sum_{k_1=-\infty}^{\infty}\sum_{k_2=-\infty}^{\infty}\cdots\sum_{k_n=-\infty}^{\infty} c_{k_1k_2\cdots k_n}\exp\left[2\pi i\sum_{j=1}^{n}\frac{k_j t_j}{T_j}\right]$$
$$c_{k_1k_2\cdots k_n} = \frac{1}{T_1T_2\cdots T_n}\int_{a_1}^{a_1+T_1}\int_{a_2}^{a_2+T_2}\cdots\int_{a_n}^{a_n+T_n} f(\tau_1, \tau_2, \ldots, \tau_n)\exp\left[-2\pi i\sum_{j=1}^{n}\frac{k_j\tau_j}{T_j}\right] d\tau_1\,d\tau_2\cdots d\tau_n \qquad (4.11\text{-}16)$$
(b) The multiple Fourier integral generated by a function $f(t_1, t_2, \ldots, t_n)$ such that $\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} |f(\tau_1, \tau_2, \ldots, \tau_n)|\,d\tau_1\,d\tau_2\cdots d\tau_n$ exists is defined as
$$\frac{1}{(2\pi)^{n/2}}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} C(\omega_1, \omega_2, \ldots, \omega_n)\exp\left[i\sum_{j=1}^{n}\omega_j t_j\right] d\omega_1\,d\omega_2\cdots d\omega_n$$
$$C(\omega_1, \omega_2, \ldots, \omega_n) = \frac{1}{(2\pi)^{n/2}}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} f(\tau_1, \tau_2, \ldots, \tau_n)\exp\left[-i\sum_{j=1}^{n}\omega_j\tau_j\right] d\tau_1\,d\tau_2\cdots d\tau_n \qquad (4.11\text{-}17)$$
One may introduce $\nu_j = \omega_j/2\pi$ as in Eq. (4). For regions of expansion defined by $0 < t_j < \infty$ or $-\infty < t_j < 0$ one obtains multiple Fourier sine or cosine integrals by analogy to Sec. 4.11-3b.
(c) Given the region of expansion $-\infty < t_1 < \infty$, $a_2 < t_2 \le a_2 + T_2$, one may expand $f(t_1, t_2)$ as a Fourier integral with respect to $t_1$ and a Fourier series with respect to $t_2$:
$$f(t_1, t_2) = \frac{1}{\sqrt{2\pi}}\sum_{k=-\infty}^{\infty}\int_{-\infty}^{\infty} C_k(\omega_1)\exp\left[i\left(\omega_1 t_1 + \frac{2\pi k t_2}{T_2}\right)\right] d\omega_1$$
$$C_k(\omega_1) = \frac{1}{T_2\sqrt{2\pi}}\int_{-\infty}^{\infty}\int_{a_2}^{a_2+T_2} f(\tau_1, \tau_2)\exp\left[-i\left(\omega_1\tau_1 + \frac{2\pi k\tau_2}{T_2}\right)\right] d\tau_2\,d\tau_1 \qquad (4.11\text{-}18)$$
One may similarly mix Fourier integrals and Fourier series (and also Fourier sine and cosine integrals) in more than two dimensions.
(d) The exponentials in Eqs. (16), (17), and (18) can be expanded into sine and cosine terms through the use of Eq. (21.2-28). All the theorems of Secs. 4.11-2 through 4.11-7 can be generalized to apply to multiple Fourier series and integrals.
4.12. RELATED TOPICS, REFERENCES, AND BIBLIOGRAPHY
4.12-1. Related Topics. The following topics related to the study of functions, limits, and infinite series are treated in other chapters of this handbook:
Functions of a complex variable — Chap. 7
Divergence theorem, Stokes' theorem — Chap. 5
Metric spaces, convergence — Chap. 12
Orthogonal expansions — Chap. 15
Numerical approximations — Chap. 20
Special functions — Chap. 21
4.12-2. References and Bibliography (see also Secs. 8.7-2 and 12.9-2).
4.1. Apostol, T. M.: Mathematical Analysis, Addison-Wesley, Reading, Mass., 1957.
4.2. Bartle, R. G.: The Elements of Real Analysis, Wiley, New York, 1964.
4.3. Boas, R. P.: A Primer of Real Functions, Wiley, New York, 1960.
4.4. Buck, R. C.: Advanced Calculus, 2d ed., McGraw-Hill, New York, 1965.
4.5. Churchill, R. V.: Fourier Series and Boundary Value Problems, 2d ed., McGraw-Hill, New York, 1963.
4.6. Dieudonné, J.: Foundations of Modern Analysis, Academic, New York, 1960.
4.7. Eggleston, H. G.: Introduction to Elementary Real Analysis, Cambridge University Press, New York, 1962.
4.8. Fleming, W.: Functions of Several Variables, Addison-Wesley, Reading, Mass., 1966.
4.9. Gelbaum, B. R., and J. M. H. Olmsted: Counterexamples in Analysis, Holden-Day, San Francisco, 1964.
4.10. Goffman, C.: Calculus of Several Variables, Harper & Row, New York, 1965.
4.11. Goldberg, S.: Methods of Real Analysis, Blaisdell, New York, 1964.
4.12. Graves, L. M.: The Theory of Functions of Real Variables, 2d ed., McGraw-Hill, New York, 1956.
4.13. Knopp, K.: Theory and Application of Infinite Series, Blackie, Glasgow, 1951; also 5th ed. (in German), Springer, Berlin, 1964.
4.14. Natanson, I. P.: Theory of Functions of a Real Variable (2 vols.), Ungar, New York, 1955/9.
4.15. Papoulis, A.: The Fourier Integral and Its Applications, McGraw-Hill, New York, 1962.
4.16. Rankin, R. A.: Introduction to Mathematical Analysis, Pergamon Press, New York, 1962.
4.17. Rosser, J. B.: Asymptotic Formulas and Series, in E. F. Beckenbach, Modern Mathematics for the Engineer, 2d series, McGraw-Hill, New York, 1961.
4.18. Rudin, W.: Principles of Mathematical Analysis, 2d ed., McGraw-Hill, New York, 1964.
4.19. Wall, H. S.: Analytic Theory of Continued Fractions, Van Nostrand, Princeton, N.J., 1948.
4.20. Widder, D. V.: Advanced Calculus, 2d ed., Prentice-Hall, Englewood Cliffs, N.J., 1961.
CHAPTER 5
VECTOR ANALYSIS

5.1. Introduction
5.1-1. Euclidean Vectors
5.2. Vector Algebra
5.2-1. Vector Addition and Multiplication of Vectors by (Real) Scalars
5.2-2. Representation of Vectors in Terms of Base Vectors and Components
5.2-3. Rectangular Cartesian Components of a Vector
5.2-4. Vectors and Physical Dimensions
5.2-5. Absolute Value (Magnitude, Norm) of a Vector
5.2-6. Scalar Product (Dot Product, Inner Product) of Two Vectors
5.2-7. The Vector (Cross) Product
5.2-8. The Scalar Triple Product (Box Product)
5.2-9. Other Products Involving More Than Two Vectors
5.2-10. Representation of a Vector a as a Sum of Vectors Respectively along and Perpendicular to a Given Unit Vector u
5.2-11. Solution of Equations
5.3. Vector Calculus: Functions of a Scalar Parameter
5.3-1. Vector Functions and Limits
5.3-2. Differentiation
5.3-3. Integration and Ordinary Differential Equations
5.4. Scalar and Vector Fields
5.4-1. Introduction
5.4-2. Scalar Fields
5.4-3. Vector Fields
5.4-4. Vector Path Element and Arc Length
5.4-5. Line Integrals
5.4-6. Surface Integrals
5.4-7. Volume Integrals
5.5. Differential Operators
5.5-1. Gradient, Divergence, and Curl: Coordinate-free Definitions in Terms of Integrals
5.5-2. The Operator ∇
5.5-3. Absolute Differential, Intrinsic Derivative, and Directional Derivative
5.5-4. Higher-order Directional Derivatives
5.5-5. The Laplacian Operator
5.5-6. Repeated Operations
5.5-7. Operations on Special Functions
5.5-8. Functions of Two or More Position Vectors
5.6. Integral Theorems
5.6-1. The Divergence Theorem and Related Theorems
5.6-2. Stokes' Theorem and Related Theorems
5.6-3. Fields with Surface Discontinuities
5.7. Specification of a Vector Field in Terms of Its Curl and Divergence
5.7-1. Irrotational Vector Fields
5.7-2. Solenoidal Vector Fields
5.7-3. Specification of a Vector Point Function in Terms of Its Divergence and Curl
5.8. Related Topics, References, and Bibliography
5.8-1. Related Topics
5.8-2. References and Bibliography
5.1. INTRODUCTION
5.1-1. Euclidean Vectors. Each class of Euclidean vectors (e.g., displacements, velocities, forces, magnetic field strengths) permits the definition of operations known as vector addition (Sec. 5.2-1), multiplication of vectors by (real) scalars (Sec. 5.2-1), and scalar multiplication of vectors (Sec. 5.2-6). Each class of (Euclidean) vectors commonly encountered in geometry and physics is, moreover, intimately related to the two- or three-dimensional space of Euclidean geometry:
1. The vectors of each class permit a reciprocal one-to-one representation by translations (displacements, directed line segments) in the geometrical space. This representation preserves the results of vector addition, multiplication by scalars, and scalar multiplication of vectors (and thus magnitudes and relative directions of vectors; see also Secs. 12.1-6 and 14.2-1 to 14.2-7).
2. In most applications, vectors appear as functions of position in geometrical space, so that the vectors are associated with geometrical points (vector point functions, Sec. 5.4-1).
Vectors, such as velocities or forces, are usually first introduced in geometrical language as "quantities possessing magnitude and direction" or, somewhat more precisely, as quantities which can be represented by directed line segments subject to a "parallelogram law of addition." Such a geometrical approach, common to most elementary courses, is employed in Secs. 5.2-1 to 5.2-8 to introduce the principal vector operations. Refer to Sec. 12.4-1 and Chap. 14 for a discussion of vectors from a much more general point of view. Vector analysis is the study of vector (and scalar) functions. Each
vector may be specified by a set of numerical functions (vector components) in terms of a suitable reference system (Secs. 5.2-2, 5.2-3, 5.4-1, and 6.3-1).
Note: The description of a physical situation in terms of vector quantities should not be regarded as merely a kind of shorthand summarizing sets of component equations by single equations, but as an instance of a mathematical model (Sec. 12.1-1) whose essential "building blocks" are not restricted to numbers. Note also that a class of objects admitting a one-to-one reciprocal correspondence with a class of directed line segments is not necessarily a vector space unless it has the algebraic properties outlined above (EXAMPLES: finite rotations, directed metal rods).
5.2. VECTOR ALGEBRA
5.2-1. Vector Addition and Multiplication of Vectors by (Real) Scalars. The operation (vector addition) of forming the vector sum a + b of two Euclidean vectors a and b of a suitable class is a vector
corresponding to the geometrical addition of the corresponding displacements (parallelogram law). The product of a Euclidean vector a by a real number (scalar) α is a vector corresponding to a displacement α times as long as that corresponding to a, with a reversal in direction if α is negative. The null vector 0 of each class of vectors corresponds to a displacement of length zero, and a + 0 = a. With these geometrical definitions, vector addition and multiplication by scalars satisfy the relations
a + b = b + a    a + (b + c) = (a + b) + c = a + b + c
α(βa) = (αβ)a    (α + β)a = αa + βa    α(a + b) = αa + αb
(1)a = a    (−1)a = −a    (0)a = 0
a − a = 0    a + 0 = a     (5.2-1)
Refer to Secs. 14.2-1 to 14.2-4 for a more general discussion of vector algebra.
5.2-2. Representation of Vectors in Terms of Base Vectors and Components. m vectors a₁, a₂, . . . , a_m are linearly independent if and only if λ₁a₁ + λ₂a₂ + · · · + λ_m a_m = 0 implies λ₁ = λ₂ = · · · = λ_m = 0; otherwise the set of vectors is linearly dependent (see also Secs. 1.9-3 and 14.2-3). Every vector a of a three-dimensional vector space can be represented as a sum
a = α₁e₁ + α₂e₂ + α₃e₃     (5.2-2)
in terms of three linearly independent vectors e₁, e₂, e₃. The coefficients α₁, α₂, α₃ are the components (some authors refer to the component vectors α₁e₁, α₂e₂, α₃e₃ as components) of the vector a with respect to the
reference system defined by the base vectors e₁, e₂, e₃ (see also Sec. 14.2-4). Given a suitable reference system, the vectors a, b, . . . are thus represented by the respective ordered sets (α₁, α₂, α₃), (β₁, β₂, β₃), . . . of their components; note that a + b and αa are respectively represented by (α₁ + β₁, α₂ + β₂, α₃ + β₃) and (αα₁, αα₂, αα₃). Vector relations can, then, be expressed (represented) in terms of corresponding (sets of) relations between vector components. Vectors belonging to two-dimensional vector spaces (e.g., plane displacements) are similarly represented by sets of two components.
The system of base vectors chosen for the representation of vectors defined at a given point of geometrical space (Sec. 5.4-1) is usually simply related to the coordinate system used for the description of the geometrical space. Chapter 6 deals specifically with the representation of vector relations in terms of "local" base vectors directed along, and perpendicular to, the coordinate lines of curvilinear coordinate systems at each point; the magnitudes and directions of the local base vectors are, in general, different at different points. Transformation equations relating vector components associated with different reference systems are given in Table 6.3-1 and in Sec. 14.6-1.
5.2-3. Rectangular Cartesian Components of a Vector. Given a right-handed rectangular cartesian-coordinate system (Sec. 3.1-4) in the geometrical space, the unit vectors (Sec. 5.2-5) i, j, k respectively directed along the positive x axis, the positive y axis, and the positive z axis form a convenient system of base vectors at each point. The components a_x, a_y, a_z of a vector
a = a_x i + a_y j + a_z k     (5.2-3)
are the (right-handed) rectangular cartesian components of a. Note that
a_x = a · i    a_y = a · j    a_z = a · k
(Sec. 5.2-6) are direction numbers of the vector a; the components u · i, u · j, u · k of any unit vector u are its direction cosines (Sec. 3.1-8).
5.2-4. Vectors and Physical Dimensions. Euclidean vectors may also be multiplied by scalars which are not themselves real numbers but are suitably labeled by real numbers (quantities isomorphic with the field of real numbers, Sec. 12.1-6; thus one multiplies a constant velocity vector by a time interval to obtain a displacement vector). If a vector (2) or (3) is a physical quantity, one usually associates its physical dimension with the components rather than with the base vectors. The latter are then regarded as dimensionless and may be used to define a common scheme of reference systems (scheme of measurements, see also Sec. 16.1-4) for various classes of vectors having different physical dimensions (e.g., displacements, velocities, forces, etc.).
5.2-5. Absolute Value (Magnitude, Norm) of a Vector. The absolute value (magnitude, norm) |a| of a Euclidean vector a is a
scalar proportional to the length of the displacement corresponding to a (Sec. 5.1-1; see also Secs. 14.2-5 and 14.2-7 for an abstract definition). Absolute values of vectors satisfy the relations (1.1-3). A vector of magnitude 1 is a unit vector. The (mutually perpendicular) base vectors i, j, k (Sec. 5.2-3) are defined to be unit vectors, so that |a_x i| = |a_x|, |a_y j| = |a_y|, |a_z k| = |a_z|, and
$$|\mathbf{a}| = \sqrt{a_x^2 + a_y^2 + a_z^2} \qquad (5.2\text{-}4)$$
5.2-6. Scalar Product (Dot Product, Inner Product) of Two Vectors. The scalar product (dot product, inner product) a · b [alternative notation (ab)] of two Euclidean vectors a and b is the scalar
a · b = |a| |b| cos γ     (5.2-5)
where γ is the angle between a and b (see also Secs. 14.2-6 and 14.2-7 for an abstract definition). If a and b are physical quantities, the physical dimension of the scalar product a · b must be observed (see also Sec. 5.2-4). Table 5.2-1 summarizes the principal relations involving scalar products. Two nonzero vectors a and b are perpendicular to each other if and only if a · b = 0.
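The defining relation (5.2-5) and the cartesian component formula of Table 5.2-1 can be checked together on concrete vectors; the sketch below uses assumed example vectors with integer norms:

```python
import numpy as np

a = np.array([1.0, 2.0, 2.0])     # |a| = 3
b = np.array([3.0, 0.0, 4.0])     # |b| = 5
dot = a @ b                       # = 1*3 + 2*0 + 2*4 = 11
assert dot == 11.0
assert abs(dot) <= np.linalg.norm(a) * np.linalg.norm(b)   # |a·b| <= |a||b|
cos_gamma = dot / np.sqrt((a @ a) * (b @ b))               # cos γ = a·b/√(a²b²)
assert abs(cos_gamma - 11.0 / 15.0) < 1e-12
print(cos_gamma)
```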
Table 5.2-1. Relations Involving Scalar Products
(a) Basic Relations
a · b = b · a    a · (b + c) = a · b + a · c    (αa) · b = α(a · b)
a · a ≡ a² = |a|² ≥ 0    |a · b| ≤ |a| |b|    cos γ = (a · b)/√(a²b²)
(b) In terms of rectangular cartesian components (Sec. 5.2-3)
i · i = j · j = k · k = 1    i · j = j · k = k · i = 0
a · b = (a_x i + a_y j + a_z k) · (b_x i + b_y j + b_z k) = a_x b_x + a_y b_y + a_z b_z
a_x = a · i    a_y = a · j    a_z = a · k
5.2-7. The Vector (Cross) Product. The vector (cross) product a × b (alternative notation [ab]) of two vectors a and b is the vector of magnitude
|a × b| = |a| |b| sin γ     (5.2-6)
whose direction is perpendicular to both a and b and such that the axial motion of a right-handed screw turning a into b is in the direction of a × b. Two vectors are linearly dependent (Sec. 5.2-2) if and only if
their vector product is zero. Table 5.2-2 summarizes the principal relations involving vector products. Refer to Sec. 16.8-4 for a more general definition of the vector product and to Secs. 3.1-10 and 17.3-3c for the representation of plane areas as vectors.
Table 5.2-2. Relations Involving Vector (Cross) Products
(a) Basic Relations
a × b = −(b × a)    a × a = 0    a · (a × b) = b · (a × b) = 0
(αa) × b = α(a × b)    a × (b + c) = a × b + a × c
[(α + β)a] × b = (α + β)(a × b) = α(a × b) + β(a × b)
(b) In terms of any basis e₁, e₂, e₃, with a = α₁e₁ + α₂e₂ + α₃e₃ and b = β₁e₁ + β₂e₂ + β₃e₃:
$$\mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{e}_2\times\mathbf{e}_3 & \mathbf{e}_3\times\mathbf{e}_1 & \mathbf{e}_1\times\mathbf{e}_2 \\ \alpha_1 & \alpha_2 & \alpha_3 \\ \beta_1 & \beta_2 & \beta_3 \end{vmatrix}$$
(c) In terms of right-handed rectangular cartesian components
i × i = j × j = k × k = 0    i × j = k    j × k = i    k × i = j
$$\mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_x & a_y & a_z \\ b_x & b_y & b_z \end{vmatrix} = \mathbf{i}(a_yb_z - a_zb_y) + \mathbf{j}(a_zb_x - a_xb_z) + \mathbf{k}(a_xb_y - a_yb_x)$$
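The determinant expansion of Table 5.2-2c can be verified componentwise; the sketch below uses assumed example vectors:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
# Expansion of the determinant in Table 5.2-2c:
c = np.array([a[1] * b[2] - a[2] * b[1],
              a[2] * b[0] - a[0] * b[2],
              a[0] * b[1] - a[1] * b[0]])
assert np.allclose(np.cross(a, b), c)
assert np.isclose(a @ c, 0.0) and np.isclose(b @ c, 0.0)  # a·(a×b) = b·(a×b) = 0
print(c)        # [-3.  6. -3.]
```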
5.2-8. The Scalar Triple Product (Box Product).
$$\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = [\mathbf{abc}] = [\mathbf{bca}] = [\mathbf{cab}] = -[\mathbf{bac}] = -[\mathbf{cba}] = -[\mathbf{acb}] \qquad (5.2\text{-}7)$$
$$[\mathbf{abc}]^2 = [(\mathbf{a}\times\mathbf{b})(\mathbf{b}\times\mathbf{c})(\mathbf{c}\times\mathbf{a})] = a^2b^2c^2 - a^2(\mathbf{b}\cdot\mathbf{c})^2 - b^2(\mathbf{a}\cdot\mathbf{c})^2 - c^2(\mathbf{a}\cdot\mathbf{b})^2 + 2(\mathbf{a}\cdot\mathbf{b})(\mathbf{b}\cdot\mathbf{c})(\mathbf{a}\cdot\mathbf{c})$$
$$= \begin{vmatrix} \mathbf{a}\cdot\mathbf{a} & \mathbf{a}\cdot\mathbf{b} & \mathbf{a}\cdot\mathbf{c} \\ \mathbf{b}\cdot\mathbf{a} & \mathbf{b}\cdot\mathbf{b} & \mathbf{b}\cdot\mathbf{c} \\ \mathbf{c}\cdot\mathbf{a} & \mathbf{c}\cdot\mathbf{b} & \mathbf{c}\cdot\mathbf{c} \end{vmatrix} \qquad \text{(Gram's determinant, see also Sec. 14.2-6)} \qquad (5.2\text{-}8)$$
$$[\mathbf{abc}][\mathbf{def}] = \begin{vmatrix} \mathbf{a}\cdot\mathbf{d} & \mathbf{a}\cdot\mathbf{e} & \mathbf{a}\cdot\mathbf{f} \\ \mathbf{b}\cdot\mathbf{d} & \mathbf{b}\cdot\mathbf{e} & \mathbf{b}\cdot\mathbf{f} \\ \mathbf{c}\cdot\mathbf{d} & \mathbf{c}\cdot\mathbf{e} & \mathbf{c}\cdot\mathbf{f} \end{vmatrix} \qquad (5.2\text{-}9)$$
In terms of any basis e₁, e₂, e₃ (Sec. 5.2-2; see also Secs. 5.2-3 and 6.3-4)
$$[\mathbf{abc}] = \begin{vmatrix} \alpha_1 & \alpha_2 & \alpha_3 \\ \beta_1 & \beta_2 & \beta_3 \\ \gamma_1 & \gamma_2 & \gamma_3 \end{vmatrix} [\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3] \qquad (5.2\text{-}10)$$
In terms of right-handed rectangular cartesian components (Sec. 5.2-3)
$$[\mathbf{abc}] = \begin{vmatrix} a_x & a_y & a_z \\ b_x & b_y & b_z \\ c_x & c_y & c_z \end{vmatrix} \qquad (>0 \text{ if } \mathbf{a}, \mathbf{b}, \mathbf{c} \text{ are directed like right-handed cartesian axes}) \qquad (5.2\text{-}11)$$
5.2-9. Other Products Involving More Than Two Vectors.
$$\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) = (\mathbf{a}\cdot\mathbf{c})\mathbf{b} - (\mathbf{a}\cdot\mathbf{b})\mathbf{c} = \begin{vmatrix} \mathbf{b} & \mathbf{c} \\ \mathbf{a}\cdot\mathbf{b} & \mathbf{a}\cdot\mathbf{c} \end{vmatrix} \qquad \text{(vector triple product)} \qquad (5.2\text{-}12)$$
$$(\mathbf{a}\times\mathbf{b})\cdot(\mathbf{c}\times\mathbf{d}) = (\mathbf{a}\cdot\mathbf{c})(\mathbf{b}\cdot\mathbf{d}) - (\mathbf{a}\cdot\mathbf{d})(\mathbf{b}\cdot\mathbf{c}) = \begin{vmatrix} \mathbf{a}\cdot\mathbf{c} & \mathbf{a}\cdot\mathbf{d} \\ \mathbf{b}\cdot\mathbf{c} & \mathbf{b}\cdot\mathbf{d} \end{vmatrix} \qquad (5.2\text{-}13)$$
$$(\mathbf{a}\times\mathbf{b})^2 = a^2b^2 - (\mathbf{a}\cdot\mathbf{b})^2 \qquad (5.2\text{-}14)$$
$$(\mathbf{a}\times\mathbf{b})\times(\mathbf{c}\times\mathbf{d}) = [\mathbf{acd}]\mathbf{b} - [\mathbf{bcd}]\mathbf{a} = [\mathbf{abd}]\mathbf{c} - [\mathbf{abc}]\mathbf{d} \qquad (5.2\text{-}15)$$
5.2-10. Representation of a Vector a as a Sum of Vectors Respectively along and Perpendicular to a Given Unit Vector u.
$$\mathbf{a} = \mathbf{u}(\mathbf{u}\cdot\mathbf{a}) + \mathbf{u}\times(\mathbf{a}\times\mathbf{u}) \qquad (5.2\text{-}16)$$
5.2-11. Solution of Equations.
(a) $\mathbf{x}\times\mathbf{a} = \mathbf{b}$, $\mathbf{x}\cdot\mathbf{a} = p$ implies
$$\mathbf{x} = \frac{p\,\mathbf{a} + \mathbf{a}\times\mathbf{b}}{a^2} \qquad (5.2\text{-}17)$$
(b) $\mathbf{x}\cdot\mathbf{a} = p$, $\mathbf{x}\cdot\mathbf{b} = q$, $\mathbf{x}\cdot\mathbf{c} = r$ implies
$$\mathbf{x} = \frac{p(\mathbf{b}\times\mathbf{c}) + q(\mathbf{c}\times\mathbf{a}) + r(\mathbf{a}\times\mathbf{b})}{[\mathbf{abc}]} \qquad (5.2\text{-}18)$$
(c) $\mathbf{a}x + \mathbf{b}y + \mathbf{c}z + \mathbf{d} = 0$ implies
$$x = -\frac{[\mathbf{dbc}]}{[\mathbf{abc}]} \qquad y = -\frac{[\mathbf{dca}]}{[\mathbf{abc}]} \qquad z = -\frac{[\mathbf{dab}]}{[\mathbf{abc}]} \qquad (5.2\text{-}19)$$
(d) $(\mathbf{b}\times\mathbf{c})x + (\mathbf{c}\times\mathbf{a})y + (\mathbf{a}\times\mathbf{b})z + \mathbf{d} = 0$ implies
$$x = -\frac{\mathbf{d}\cdot\mathbf{a}}{[\mathbf{abc}]} \qquad y = -\frac{\mathbf{d}\cdot\mathbf{b}}{[\mathbf{abc}]} \qquad z = -\frac{\mathbf{d}\cdot\mathbf{c}}{[\mathbf{abc}]} \qquad (5.2\text{-}20)$$
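Formula (5.2-18) amounts to expanding x in the reciprocal basis b×c, c×a, a×b; the sketch below checks it on assumed example data:

```python
import numpy as np

a = np.array([1.0, 0.0, 0.0])
b = np.array([1.0, 1.0, 0.0])
c = np.array([0.0, 1.0, 2.0])
p, q, r = 2.0, -1.0, 3.0
box = a @ np.cross(b, c)                 # [abc] must be nonzero
x = (p * np.cross(b, c) + q * np.cross(c, a) + r * np.cross(a, b)) / box
assert np.isclose(x @ a, p)              # recovers the prescribed projections
assert np.isclose(x @ b, q)
assert np.isclose(x @ c, r)
print(x)
```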
5.3. VECTOR CALCULUS: FUNCTIONS OF A SCALAR PARAMETER
5.3-1. Vector Functions and Limits. A vector function v = v(t) of a scalar parameter t associates one (single-valued function) or more (multiple-valued function) "values" of the vector v with every value of the scalar parameter t (independent variable) for which v(t) is defined (see also Sec. 4.2-1). In terms of rectangular cartesian components,
$$\mathbf{v} = \mathbf{v}(t) = v_x(t)\mathbf{i} + v_y(t)\mathbf{j} + v_z(t)\mathbf{k} \qquad (5.3\text{-}1)$$
A vector function v(t) is bounded if |v(t)| is bounded. v(t) has the limit (see also Secs. 4.4-1 and 12.5-3) $\mathbf{v}_1 = \lim_{t\to t_1}\mathbf{v}(t)$ if and only if for every positive number ε there exists a number δ > 0 such that |t − t₁| < δ implies |v₁ − v(t)| < ε. If $\lim_{t\to t_1}\mathbf{v}(t)$ exists,
$$\lim_{t\to t_1}\mathbf{v}(t) = \mathbf{i}\lim_{t\to t_1}v_x(t) + \mathbf{j}\lim_{t\to t_1}v_y(t) + \mathbf{k}\lim_{t\to t_1}v_z(t) \qquad (5.3\text{-}2)$$
Formulas analogous to those of Sec. 4.4-2 (limits of sums, products, etc.) apply to vector sums, scalar products, and vector products. v(t) is continuous for t = t₁ if and only if $\lim_{t\to t_1}\mathbf{v}(t) = \mathbf{v}(t_1)$ (see also Secs. 4.4-6 and 12.5-3).
5.3-2. Differentiation. A vector function v(t) is differentiable for t = t₁ if and only if the derivative
$$\frac{d\mathbf{v}(t)}{dt} = \lim_{\Delta t\to 0}\frac{\mathbf{v}(t + \Delta t) - \mathbf{v}(t)}{\Delta t} \qquad (5.3\text{-}3)$$
exists and is unique for t = t₁ (see also Sec. 4.5-1). If the derivative d²v(t)/dt² of dv(t)/dt exists, it is called the second derivative of v(t), and so forth. Table 5.3-1 summarizes the principal differentiation rules.
Table 5.3-1. Differentiation of Vector Functions with Respect to a Scalar Parameter
(a) Basic Rules
$$\frac{d}{dt}[\mathbf{u}(t) \pm \mathbf{v}(t)] = \frac{d\mathbf{u}}{dt} \pm \frac{d\mathbf{v}}{dt} \qquad \frac{d}{dt}[\alpha\mathbf{v}(t)] = \alpha\frac{d\mathbf{v}}{dt} \quad (\alpha\ \text{constant})$$
$$\frac{d}{dt}[\mathbf{u}(t)\mathbf{v}(t)\mathbf{w}(t)] = \left[\frac{d\mathbf{u}}{dt}\,\mathbf{v}\,\mathbf{w}\right] + \left[\mathbf{u}\,\frac{d\mathbf{v}}{dt}\,\mathbf{w}\right] + \left[\mathbf{u}\,\mathbf{v}\,\frac{d\mathbf{w}}{dt}\right]$$
(b) In terms of rectangular cartesian components
$$\frac{d\mathbf{v}(t)}{dt} = \frac{dv_x(t)}{dt}\mathbf{i} + \frac{dv_y(t)}{dt}\mathbf{j} + \frac{dv_z(t)}{dt}\mathbf{k}$$
(c) If the base vectors e₁(t), e₂(t), e₃(t) are functions of t, and v(t) = α₁(t)e₁(t) + α₂(t)e₂(t) + α₃(t)e₃(t), then
$$\frac{d\mathbf{v}(t)}{dt} = \left[\frac{d\alpha_1}{dt}\mathbf{e}_1 + \frac{d\alpha_2}{dt}\mathbf{e}_2 + \frac{d\alpha_3}{dt}\mathbf{e}_3\right] + \left[\alpha_1\frac{d\mathbf{e}_1}{dt} + \alpha_2\frac{d\mathbf{e}_2}{dt} + \alpha_3\frac{d\mathbf{e}_3}{dt}\right]$$
Analogous rules apply to the partial derivatives ∂v/∂t₁ ≡ v_{t₁}, ∂v/∂t₂ ≡ v_{t₂}, . . . of a vector function v = v(t₁, t₂, . . .) of two or more scalar parameters t₁, t₂, . . .
Note: If u(t) is a unit vector (of constant magnitude but variable direction) and v(t) = v(t)u(t),
$$\frac{d\mathbf{u}}{dt} = \boldsymbol{\omega}\times\mathbf{u} \qquad\text{and}\qquad \frac{d\mathbf{v}}{dt} = \frac{dv}{dt}\mathbf{u} + v\frac{d\mathbf{u}}{dt} = \frac{dv}{dt}\mathbf{u} + \boldsymbol{\omega}\times\mathbf{v} \qquad (5.3\text{-}4)$$
ω is directed along the axis about which u(t) [and thus also v(t)] turns as t varies, so that a right-handed screw turning with u(t) would be propelled in the direction of ω. Its magnitude is equal to the angular rate of turn of u(t) [and thus also of v(t)] with respect to t (EXAMPLE: angular velocity vector in physics; see also Sec. 17.2-3). Equation (4) describes the separate contributions of changes in the magnitude and direction of v(t).
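The relation du/dt = ω × u can be checked with a finite difference; the rotating unit vector below (turning about the z axis at an assumed rate Ω = 2) is an illustrative example:

```python
import numpy as np

# u(t) rotates about the z axis at angular rate Omega, so ω = (0, 0, Omega).
Omega = 2.0
w = np.array([0.0, 0.0, Omega])

def u(t):
    return np.array([np.cos(Omega * t), np.sin(Omega * t), 0.0])

t0, h = 0.3, 1e-6
du_dt = (u(t0 + h) - u(t0 - h)) / (2.0 * h)     # central-difference derivative
assert np.allclose(du_dt, np.cross(w, u(t0)), atol=1e-8)   # du/dt = ω × u
print(du_dt)
```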
5.3-3. Integration and Ordinary Differential Equations. The indefinite integral $\mathbf{V}(t) = \int\mathbf{v}(t)\,dt$ of a suitable vector function v(t) is defined as the solution of the vector differential equation (see also Sec. 9.1-1)
$$\frac{d}{dt}\mathbf{V}(t) = \mathbf{v}(t) \qquad (5.3\text{-}5)$$
which may be replaced by a set of differential equations for the components of V(t). Other ordinary differential equations involving differentiation of vectors with respect to a scalar parameter are treated similarly. The definite integral
$$\int_a^b \mathbf{v}(t)\,dt = \lim_{\max|t_i - t_{i-1}|\to 0}\sum_{i=1}^{m}\mathbf{v}(\tau_i)(t_i - t_{i-1}) \qquad\text{with}\qquad a = t_0 < t_1 < \cdots < t_m = b,\quad t_{i-1} \le \tau_i \le t_i \qquad (5.3\text{-}6)$$
(see also Sec. 4.6-1) may be treated in terms of components:
$$\int_a^b \mathbf{v}(t)\,dt = \mathbf{i}\int_a^b v_x(t)\,dt + \mathbf{j}\int_a^b v_y(t)\,dt + \mathbf{k}\int_a^b v_z(t)\,dt \qquad (5.3\text{-}7)$$
5.4. SCALAR AND VECTOR FIELDS
5.4-1. Introduction. The remainder of this chapter deals specifically with scalar and vector functions of position in three-dimensional Euclidean space. Unless the contrary is stated, the scalar and vector functions of position are assumed to be single-valued, continuous, and suitably differentiable functions of the coordinates, and thus of the position vector r = xi + yj + zk. In Secs. 5.4-2 to 5.7-3 relations involving scalar and vector functions are stated
1. In coordinate-free (invariant) form, and
2. In terms of vector components along right-handed rectangular cartesian coordinate axes (Sec. 5.2-3), so that
$$\mathbf{F}(\mathbf{r}) \equiv \mathbf{F}(x, y, z) = \mathbf{i}F_x(x, y, z) + \mathbf{j}F_y(x, y, z) + \mathbf{k}F_z(x, y, z) \qquad (5.4\text{-}1)$$
(Throughout Chaps. 5 and 6, the subscripts in F_x, F_y, F_z, . . . do not indicate differentiation with respect to x, y, z, . . . ; in fact, no scalar function F(x, y, z) is introduced.)
The relations to be described are independent of the coordinate system used to specify position in space. The representation of vector relations in terms of vector components along, or perpendicular to, suitable curvilinear coordinate lines (and thus along different directions at different points) is treated in Chap. 6.
5.4-2. Scalar Fields. A scalar field is a scalar function of position (scalar point function) Φ(r) = Φ(x, y, z) together with its region of definition. The surfaces
$$\Phi(\mathbf{r}) = \Phi(x, y, z) = \text{constant} \qquad (5.4\text{-}2)$$
(Sec. 3.1-14) are called level surfaces of the field and permit its geometrical representation.
5.4-3. Vector Fields. A vector field is a vector function of position (vector point function) F(r) = F(x, y, z) together with its region of definition. The field lines (streamlines) of the vector field defined by F(r) have the direction of the field vector F(r) at each point (r) and are specified by the differential equations
$$d\mathbf{r}\times\mathbf{F}(\mathbf{r}) = 0 \qquad\text{or}\qquad dx : dy : dz = F_x : F_y : F_z \qquad (5.4\text{-}3)$$
A vector field may be represented geometrically by its field lines, with the relative density of the field lines at each point (r) proportional to the absolute value |F(r)| of the field vector.
5.4-4. Vector Path Element and Arc Length (see also Sec. 4.6-9). (a) The vector path element (vector element of distance) dr along a curve C described by
r = r(t)
or
x = x(t)
y = y(t)
z = z(t)
(5.4-4)
is defined at every regular point (r) = [x(t)t y(t), z(t)] as
dr-l^+J^ +kA-^l+jl +kJ)*-^* (5.4-5) dv is directed along the tangent to C at each regular point (see also Sec. 17.2-2).
(b) The arc length s on a rectifiable curve (4) (Sec. 4.6-9) is given by

s = ∫_C ds    with    ds = √(dx² + dy² + dz²)    (5.4-6)

at each regular point (r) ≡ [x(t), y(t), z(t)] of the curve. The sign of ds is assigned arbitrarily, e.g., so that ds/dt > 0.
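As a numerical illustration of Eqs. (5.4-5) and (5.4-6), the following minimal Python sketch (function names are illustrative, not from the handbook) approximates the arc length of a parametrically given curve by the midpoint rule, using central differences for dx/dt, dy/dt, dz/dt:

```python
import math

def arc_length(x, y, z, t0, t1, n=10000):
    """Approximate s = integral of sqrt(dx^2 + dy^2 + dz^2), Eq. (5.4-6)."""
    h = (t1 - t0) / n
    s = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * h      # midpoint of the k-th subinterval
        d = 1e-6                    # central-difference step for dx/dt etc.
        dx = (x(t + d) - x(t - d)) / (2 * d)
        dy = (y(t + d) - y(t - d)) / (2 * d)
        dz = (z(t + d) - z(t - d)) / (2 * d)
        s += math.sqrt(dx * dx + dy * dy + dz * dz) * h
    return s

# Circular helix r(t) = (cos t, sin t, t), 0 <= t <= 2*pi; here
# ds/dt = sqrt(2), so the exact length is 2*pi*sqrt(2).
s = arc_length(math.cos, math.sin, lambda t: t, 0.0, 2.0 * math.pi)
```

For the circular helix the computed value should agree with the exact length 2π√2 to several decimal places.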
5.4-5. Line Integrals (see also Sec. 4.6-10). Given a rectifiable arc C represented by Eq. (4), the scalar line integrals
∫_C Φ(r) ds = ∫_C Φ(x, y, z) ds = ∫ Φ[x(t), y(t), z(t)] (ds/dt) dt
∫_C dr · F(r) = ∫ (dr/dt) · F(t) dt = ∫_C [Fx(x, y, z) dx + Fy(x, y, z) dy + Fz(x, y, z) dz]    (5.4-7)

can be defined directly as limits of sums in the manner of Sec. 4.6-10; it is, however, more convenient to substitute the functions x(t), y(t), z(t), dx/dt, dy/dt, and dz/dt obtained from Eq. (4) into Eq. (7) and to integrate over t.
One similarly defines the vector line integrals
∫_C Φ(r) dr = i ∫_C Φ(x, y, z) dx + j ∫_C Φ(x, y, z) dy + k ∫_C Φ(x, y, z) dz
  = ∫ [iΦ(x, y, z) dx/dt + jΦ(x, y, z) dy/dt + kΦ(x, y, z) dz/dt] dt    (5.4-8)

and    ∫_C dr × F(r) = ∫ (dr/dt) × F(r) dt    (5.4-9)
Unless special conditions are satisfied (Sec. 5.7-1), the value of a scalar or vector line integral depends on the path of integration C. Refer to Secs. 6.2-3a and 6.4-3a for the use of curvilinear coordinates.
Note: It is often useful to introduce the arc length s as a new parameter into the expressions (7) to (9) by means of Eq. (6).
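A scalar line integral of the form (5.4-7) can be evaluated numerically straight from a parametrization. The following Python sketch (illustrative names, not from the handbook) integrates (dr/dt) · F(r(t)) by the midpoint rule:

```python
import math

def line_integral(F, r, t0, t1, n=20000):
    """Approximate the line integral of dr . F, Eq. (5.4-7)."""
    h = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * h
        d = 1e-6
        pm, pp = r(t - d), r(t + d)
        drdt = [(a - b) / (2 * d) for a, b in zip(pp, pm)]  # dr/dt
        f = F(r(t))
        total += sum(fi * di for fi, di in zip(f, drdt)) * h
    return total

# F = (-y, x, 0) around the unit circle: dr . F reduces to
# (sin^2 t + cos^2 t) dt, so the closed-path integral is 2*pi.
circle = lambda t: (math.cos(t), math.sin(t), 0.0)
val = line_integral(lambda p: (-p[1], p[0], 0.0), circle, 0.0, 2.0 * math.pi)
```

Note that this integral is path dependent in general (Sec. 5.7-1); the nonzero closed-path value 2π here shows that F = (−y, x, 0) is not irrotational.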
5.4-6. Surface Integrals (see also Secs. 4.6-12 and 17.3-3c). (a) At each regular point of a two-sided surface represented by r = r(u, v) (Sec. 3.1-14), it is possible to define a vector element of area

dA = (∂r/∂u × ∂r/∂v) du dv
  = [i(∂y/∂u ∂z/∂v − ∂z/∂u ∂y/∂v) + j(∂z/∂u ∂x/∂v − ∂x/∂u ∂z/∂v) + k(∂x/∂u ∂y/∂v − ∂y/∂u ∂x/∂v)] du dv    (5.4-10)

at each surface point (u, v). In the case of a closed surface, the sense and order of the surface coordinates u, v are customarily chosen so that the direction of dA (direction of the positive surface normal, Sec. 17.3-2) is outward from the bounded volume. The scalar element of area at the surface point (u, v) is defined as

dA = |∂r/∂u × ∂r/∂v| du dv = √a(u, v) du dv    (5.4-11)

The sign of dA may be arbitrarily assigned (see also Secs. 4.6-11, 4.6-12, 6.4-3b, and 17.3-3c).
In particular, for u = x, v = y, z = z(x, y),

dA = (−i ∂z/∂x − j ∂z/∂y + k) dx dy
dA = |dA| = √(1 + (∂z/∂x)² + (∂z/∂y)²) dx dy    (5.4-12)
(b) In the following it will be assumed that the area ∫_S |dA| (Sec. 4.6-11) of each surface region S under consideration exists; in this case Eq. (10) defines dA almost everywhere on S (Sec. 4.6-14b). The scalar surface integrals

∫_S |dA| Φ(r)    and    ∫_S dA · F(r)    (5.4-13)

and the vector surface integrals

∫_S dA Φ(r)    and    ∫_S dA × F(r)    (5.4-14)

of suitable field functions Φ(r) and F(r) may then be defined directly as limits of sums in the manner of Secs. 4.6-1, 4.6-10, and 5.3-3. One can, instead, employ Eq. (10) to express each surface integral as a double integral over the surface coordinates u and v (see also Sec. 6.4-3b). Note

∫_S dA · F(x, y, z) = ∬ Fx[x(y, z), y, z] dy dz + ∬ Fy[x, y(x, z), z] dx dz + ∬ Fz[x, y, z(x, y)] dx dy    (5.4-15)

In the first integral u = y, v = z are independent variables; in the second integral u = z, v = x, etc. [see also Eq. (12)]. Equation (15) must not be interpreted to imply
dA = i dy dz + j dx dz + k dx dy without such qualifications on the meaning of dx, dy, and dz.
5.4-7. Volume Integrals (see also Sec. 4.6-12). Given a simply connected region V of three-dimensional Euclidean space, the scalar volume integral

∫_V Φ(r) dV = ∭ Φ(x, y, z) dx dy dz    (5.4-16)
and the vector volume integral
∫_V F(r) dV = ∭ [iFx(x, y, z) + jFy(x, y, z) + kFz(x, y, z)] dx dy dz    (5.4-17)

may be defined as limits of sums in the manner of Sec. 4.6-1, or they may be expressed directly in terms of triple integrals over x, y, and z. Refer to Secs. 6.2-3b and 6.4-3c for the use of curvilinear coordinates.

5.5. DIFFERENTIAL OPERATORS
5.5-1. Gradient, Divergence, and Curl: Coordinate-free Definitions in Terms of Integrals. The gradient grad Φ(r) ≡ ∇Φ of a scalar point function Φ(r) ≡ Φ(x, y, z) is a vector point function defined at each point (r) = (x, y, z) where Φ(r) is suitably differentiable. In coordinate-free form,

grad Φ(r) ≡ ∇Φ ≡ lim (δ → 0) [∮_{S₁} dA Φ(ρ)] / [∫_{V₁} dV]    (5.5-1)

where V₁ is a region containing the point (r) and bounded by a closed surface S₁ such that the greatest distance between the point (r) and any point of S₁ is less than δ > 0 (see also Sec. 4.3-5c). Given a suitably differentiable vector point function F(r) = F(x, y, z), it is similarly possible to define a scalar point function, the divergence of F(r) at the point r,

div F(r) ≡ ∇ · F ≡ lim (δ → 0) [∮_{S₁} dA · F(ρ)] / [∫_{V₁} dV]    (5.5-2)

and a vector point function, the curl (rotational) of F(r) at the point r,

curl F(r) ≡ ∇ × F ≡ lim (δ → 0) [∮_{S₁} dA × F(ρ)] / [∫_{V₁} dV]    (5.5-3)
Note: At each point where the vector grad Φ = ∇Φ exists, it has the magnitude

|∇Φ| = √((∂Φ/∂x)² + (∂Φ/∂y)² + (∂Φ/∂z)²)    (5.5-4)

of, as well as the direction associated with, the greatest directional derivative dΦ/ds (Sec. 5.5-3c) at that point. ∇Φ defines a vector field whose field lines are specified by the differential equations

dr × (∇Φ) = 0    or    dx : dy : dz = ∂Φ/∂x : ∂Φ/∂y : ∂Φ/∂z    (5.5-5)

The gradient lines defined by Eq. (5) intersect the level surfaces (5.4-2) perpendicularly.
5.5-2. The Operator ∇. In terms of rectangular cartesian coordinates, the linear operator ∇ (del or nabla) is defined by

∇ ≡ i ∂/∂x + j ∂/∂y + k ∂/∂z    (5.5-6)

Its application to a scalar point function Φ(r) or a vector point function F(r) corresponds formally to a noncommutative multiplication operation with a vector having the rectangular cartesian "components" ∂/∂x, ∂/∂y, ∂/∂z; thus, in terms of right-handed rectangular cartesian coordinates x, y, z,

∇Φ(x, y, z) ≡ grad Φ(x, y, z) = i ∂Φ/∂x + j ∂Φ/∂y + k ∂Φ/∂z
∇ · F(x, y, z) ≡ div F(x, y, z) = ∂Fx/∂x + ∂Fy/∂y + ∂Fz/∂z
∇ × F(x, y, z) ≡ curl F(x, y, z)
  = i(∂Fz/∂y − ∂Fy/∂z) + j(∂Fx/∂z − ∂Fz/∂x) + k(∂Fy/∂x − ∂Fx/∂y)

          | i       j       k     |
  = det   | ∂/∂x    ∂/∂y    ∂/∂z  |    (5.5-7)
          | Fx      Fy      Fz    |

(G · ∇)F ≡ Gx ∂F/∂x + Gy ∂F/∂y + Gz ∂F/∂z = i(G · ∇Fx) + j(G · ∇Fy) + k(G · ∇Fz)
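The cartesian formulas (5.5-7) translate directly into finite differences. The following minimal Python sketch (all names illustrative) computes grad, div, and curl numerically and checks them on the rigid-rotation field F = (−y, x, 0), for which div F = 0 and curl F = (0, 0, 2):

```python
H = 1e-5  # central-difference step

def _partial(f, p, i):
    """Central-difference approximation of the i-th partial derivative."""
    q1, q2 = list(p), list(p)
    q1[i] += H
    q2[i] -= H
    return (f(q1) - f(q2)) / (2 * H)

def grad(phi, p):
    # grad Phi = (dPhi/dx, dPhi/dy, dPhi/dz)
    return [_partial(phi, p, i) for i in range(3)]

def div(F, p):
    # div F = dFx/dx + dFy/dy + dFz/dz
    return sum(_partial(lambda q, i=i: F(q)[i], p, i) for i in range(3))

def curl(F, p):
    # curl F = (dFz/dy - dFy/dz, dFx/dz - dFz/dx, dFy/dx - dFx/dy)
    d = lambda i, j: _partial(lambda q: F(q)[i], p, j)
    return [d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1)]

# Rigid-rotation field F = (-y, x, 0): div F = 0, curl F = (0, 0, 2)
p = [0.3, -0.7, 1.2]
F = lambda q: (-q[1], q[0], 0.0)
```

For the linear field used here the central differences are exact up to rounding, so the computed values match the analytic ones closely.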
Table 5.5-1 summarizes a number of rules for operations with the operator ∇.
Table 5.5-1. Rules for Operations Involving the Operator ∇

(a) Linearity

∇(Φ + Ψ) = ∇Φ + ∇Ψ        ∇(αΦ) = α∇Φ
∇ · (F + G) = ∇ · F + ∇ · G        ∇ · (αF) = α∇ · F
∇ × (F + G) = ∇ × F + ∇ × G        ∇ × (αF) = α∇ × F

(b) Operations on Products

∇(ΦΨ) = Ψ∇Φ + Φ∇Ψ
∇(F · G) = (F · ∇)G + (G · ∇)F + F × (∇ × G) + G × (∇ × F)
∇ · (ΦF) = Φ∇ · F + (∇Φ) · F
∇ · (F × G) = G · ∇ × F − F · ∇ × G
∇ × (ΦF) = Φ∇ × F + (∇Φ) × F
∇ × (F × G) = (G · ∇)F − (F · ∇)G + F(∇ · G) − G(∇ · F)
(G · ∇)(ΦF) = F(G · ∇Φ) + Φ(G · ∇)F
(G · ∇)F = ½[∇ × (F × G) + ∇(F · G) − F(∇ · G) + G(∇ · F) − F × (∇ × G) − G × (∇ × F)]
Note that vector equations involving ∇Φ, ∇ · F, and/or ∇ × F have a meaning independent of the coordinate system used. Refer to Chap. 6 and Sec. 16.10-7 for transformations expressing ∇Φ, ∇ · F, and ∇ × F in terms of different coordinate systems.
5.5-3. Absolute Differential, Intrinsic Derivative, and Directional Derivative. (a) The change (absolute differential) dΦ of a scalar point function Φ(r) associated with a change dr = i dx + j dy + k dz in position is (see also Sec. 4.5-3a)

dΦ = (∂Φ/∂x) dx + (∂Φ/∂y) dy + (∂Φ/∂z) dz = dr · grad Φ = (dr · ∇)Φ    (5.5-8)
(b) The intrinsic (absolute) derivative (see also Table 4.5-2a) dΦ/dt of Φ(r) along the curve r = r(t) is, at each point (r) of the curve, the rate of change of Φ(r) with respect to the parameter t as r varies as a function of t:

dΦ/dt = (dr/dt · ∇)Φ = (dx/dt)(∂Φ/∂x) + (dy/dt)(∂Φ/∂y) + (dz/dt)(∂Φ/∂z)
with    r = r(t)    or    x = x(t)    y = y(t)    z = z(t)    (5.5-9)

Note: If Φ depends explicitly on t [Φ = Φ(r, t)], then

dΦ/dt = ∂Φ/∂t + (dr/dt · ∇)Φ    (5.5-10)
(c) The directional derivative dΦ/ds of Φ(r) at the point (r) is the rate of change of Φ(r) with the distance s from the point (r) as a function of direction. The directional derivative of Φ(r) in the direction of the unit vector u = i cos αx + j cos αy + k cos αz defined by the direction cosines (Sec. 3.1-8a) cos αx, cos αy, cos αz is

dΦ/ds = cos αx (∂Φ/∂x) + cos αy (∂Φ/∂y) + cos αz (∂Φ/∂z) = (u · ∇)Φ    (5.5-11)

dΦ/ds is the intrinsic derivative of Φ(r) with respect to the path length s along a curve directed along u = dr/ds.

(d) The absolute differential, intrinsic derivative, and directional derivative of a vector point function F(r) are defined in a manner analogous to that for a scalar point function. Thus

dF = (dr · ∇)F = i(dr · ∇)Fx + j(dr · ∇)Fy + k(dr · ∇)Fz = i dFx + j dFy + k dFz
dF/dt = (dr/dt · ∇)F = i(dr/dt · ∇)Fx + j(dr/dt · ∇)Fy + k(dr/dt · ∇)Fz    (5.5-12)
dF/ds = (u · ∇)F = i(u · ∇)Fx + j(u · ∇)Fy + k(u · ∇)Fz = i dFx/ds + j dFy/ds + k dFz/ds
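The directional derivative (5.5-11) is easy to check numerically: (u · ∇)Φ is approximated by a symmetric difference of Φ along the unit vector u. A minimal Python sketch (helper name illustrative):

```python
def directional_derivative(phi, p, u, h=1e-5):
    """dPhi/ds = (u . grad)Phi, Eq. (5.5-11); u must be a unit vector."""
    qp = [pi + h * ui for pi, ui in zip(p, u)]
    qm = [pi - h * ui for pi, ui in zip(p, u)]
    return (phi(qp) - phi(qm)) / (2 * h)

# Phi = x^2 + y^2 + z^2 at p = (1, 2, 2); grad Phi = 2p, and along
# u = p/|p| with |p| = 3 the directional derivative is u . grad Phi = 6.
phi = lambda q: q[0] ** 2 + q[1] ** 2 + q[2] ** 2
p = [1.0, 2.0, 2.0]
u = [c / 3.0 for c in p]
dphi_ds = directional_derivative(phi, p, u)
```

Since u here points along grad Φ, the value 6 is also the magnitude |∇Φ|, illustrating the remark of Sec. 5.5-1 that the gradient direction gives the greatest directional derivative.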
5.5-4. Higher-order Directional Derivatives. Taylor Expansion. The nth-order directional derivative of Φ or F in the u direction is defined by

dⁿΦ/dsⁿ (r) = (u · ∇)ⁿΦ(r)    and    dⁿF/dsⁿ (r) = (u · ∇)ⁿF(r)    (5.5-13)

respectively. For suitably differentiable functions, one has, if the series in question converges (see also Sec. 4.10-5),

Φ(r + Δr) = Φ(r) + (Δr · ∇)Φ(r)]ᵣ + (1/2!)(Δr · ∇)²Φ(r)]ᵣ + · · ·
  = Φ(r) + Δr (dΦ/ds)]ᵣ + (1/2!)(Δr)² (d²Φ/ds²)]ᵣ + · · ·    (5.5-14a)

and

F(r + Δr) = F(r) + (Δr · ∇)F(r)]ᵣ + (1/2!)(Δr · ∇)²F(r)]ᵣ + · · ·
  = F(r) + Δr (dF/ds)]ᵣ + (1/2!)(Δr)² (d²F/ds²)]ᵣ + · · ·    (5.5-14b)

where each directional derivative is taken in the direction of Δr.
5.5-5. The Laplacian Operator. The Laplacian operator ∇² ≡ (∇ · ∇) (sometimes denoted by Δ), expressed in terms of rectangular cartesian coordinates by

∇² ≡ ∂²/∂x² + ∂²/∂y² + ∂²/∂z²    (5.5-15)

(see Chap. 6 and Sec. 16.10-7 for other representations), may be applied to both scalar and vector point functions by noncommutative scalar "multiplication," so that

∇²Φ = ∂²Φ/∂x² + ∂²Φ/∂y² + ∂²Φ/∂z²    (5.5-16)
∇²F = i∇²Fx + j∇²Fy + k∇²Fz    (5.5-17)

Note also

∇²(αΦ + βΨ) = α∇²Φ + β∇²Ψ    (5.5-18)
5.5-6. Repeated Operations. Note the following rules for repeated operations with the operator ∇:

div grad Φ = ∇ · (∇Φ) = ∇²Φ
grad div F = ∇(∇ · F) = ∇²F + ∇ × (∇ × F)
curl curl F = ∇ × (∇ × F) = ∇(∇ · F) − ∇²F    (5.5-19)
curl grad Φ = ∇ × (∇Φ) = 0
div curl F = ∇ · (∇ × F) = 0
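The identities curl grad Φ = 0 and div curl F = 0 of Eq. (5.5-19) can be spot-checked numerically. The Python sketch below (all names illustrative) forms ∇Φ by central differences and then differences it again to build curl grad Φ, which should vanish at an arbitrary point:

```python
import math

H = 1e-4  # step for the nested central differences

def d(f, p, i):
    """Central-difference partial derivative of f with respect to x_i."""
    q1, q2 = list(p), list(p)
    q1[i] += H
    q2[i] -= H
    return (f(q1) - f(q2)) / (2 * H)

phi = lambda q: q[0] * q[1] ** 2 + math.sin(q[2])
gphi = lambda q: [d(phi, q, i) for i in range(3)]   # grad Phi by differences
p = [0.4, -1.1, 0.8]

# curl grad Phi (Eq. 5.5-19): every component should vanish
cg = [d(lambda q: gphi(q)[2], p, 1) - d(lambda q: gphi(q)[1], p, 2),
      d(lambda q: gphi(q)[0], p, 2) - d(lambda q: gphi(q)[2], p, 0),
      d(lambda q: gphi(q)[1], p, 0) - d(lambda q: gphi(q)[0], p, 1)]
```

Because nested symmetric differences commute exactly, the result is zero up to floating-point rounding, mirroring the equality of mixed second partials that underlies curl grad Φ = 0.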
5.5-7. Operations on Special Functions. A number of results of differential operations on scalar and vector functions of the position vector r ≡ (x, y, z) are tabulated in Tables 5.5-2 and 5.5-3, respectively. Additional formulas may be derived with the aid of Table 5.5-1. Note also
[F(r) · ∇]r = F(r)    (5.5-20)
(a · ∇)r = a    ∇ × (a × r) = 2a    (5.5-21)

where a is a constant vector.
Table 5.5-2. Operations on Scalar Point Functions
(r ≡ |r|; a is a constant vector; n = 0, ±1, ±2, . . .)

    Φ         |  ∇Φ         |  ∇²Φ
    a · r     |  a          |  0
    rⁿ        |  nrⁿ⁻²r     |  n(n + 1)rⁿ⁻²
    logₑ r    |  r/r²       |  1/r²
5.5-8. Functions of Two or More Position Vectors. Many problems involve scalar or vector functions of two or more position vectors (functions depending on
Table 5.5-3. Operations on Vector Point Functions
(r ≡ |r|; a is a constant vector; n = 0, ±1, ±2, . . .)

    F         |  ∇ · F         |  ∇ × F         |  (G · ∇)F               |  ∇²F             |  ∇(∇ · F)
    a         |  0             |  0             |  0                      |  0               |  0
    r         |  3             |  0             |  G                      |  0               |  0
    a × r     |  0             |  2a            |  a × G                  |  0               |  0
    arⁿ       |  nrⁿ⁻²(r · a)  |  nrⁿ⁻²(r × a)  |  nrⁿ⁻²(r · G)a          |  n(n + 1)rⁿ⁻²a   |  nrⁿ⁻²a + n(n − 2)rⁿ⁻⁴(r · a)r
    rrⁿ       |  (n + 3)rⁿ     |  0             |  rⁿG + nrⁿ⁻²(r · G)r    |  n(n + 3)rⁿ⁻²r   |  n(n + 3)rⁿ⁻²r
    a logₑ r  |  r · a/r²      |  r × a/r²      |  (G · r)a/r²            |  a/r²            |  a/r² − 2(r · a)r/r⁴
the positions of two or more points). In the typical case of two position vectors, r ≡ (x, y, z) and ρ ≡ (ξ, η, ζ), say, functions like Φ(r, ρ) ≡ Φ(x, y, z; ξ, η, ζ) and F(r, ρ) ≡ F(x, y, z; ξ, η, ζ) may be operated on by two different ∇ operators, described in terms of right-handed rectangular cartesian "components" by

∇ ≡ i ∂/∂x + j ∂/∂y + k ∂/∂z    and    ∇ᵨ ≡ i ∂/∂ξ + j ∂/∂η + k ∂/∂ζ

Note in particular

∇Φ(r − ρ) = −∇ᵨΦ(r − ρ)    ∇ · F(r − ρ) = −∇ᵨ · F(r − ρ)    ∇ × F(r − ρ) = −∇ᵨ × F(r − ρ)    (5.5-22)
5.6. INTEGRAL THEOREMS
5.6-1. The Divergence Theorem and Related Theorems. (a) Table 5.6-1 summarizes a number of important theorems relating volume integrals over a region V to surface integrals over the boundary surface S of the region V. In the formulas of Table 5.6-1, volume integrals are taken over a bounded, simply connected open region V bounded by a (two-sided) regular closed surface S (Sec. 3.1-14). All functions are assumed to be single-valued throughout V and on S. The existence of the (proper or improper) volume integrals is assumed. All theorems hold for unbounded regions V as well as for bounded regions if the integrands of the surface integrals are O(1/r³) in absolute value as r → ∞ (Sec. 4.4-3). Refer to Chap. 6 and Sec. 17.3-3 for formulas expressing surface and volume elements in terms of curvilinear coordinate systems; see also Secs. 15.6-5 and 15.6-10 for applications.

(b) Normal-derivative Notation. The normal derivative of a scalar function Φ(r) at a regular point of the surface S is the directional derivative of Φ(r) in the direction of the positive normal (usually the outward normal, Sec. 17.3-2), and thus in the direction of the vector dA.
Table 5.6-1. Theorems Relating Volume Integrals and Surface Integrals (see also Sec. 5.6-1)

Sufficient conditions:* (a) Throughout V: for theorems 1 to 3, F(r) and Φ(r) differentiable with continuous partial derivatives; for the Green's-theorem formulas 4 to 7, Φ(r) (and, where it appears under ∇², Ψ(r)) twice differentiable with continuous second partial derivatives, the remaining functions differentiable with continuous partial derivatives. (b) On S: existence of the integrals is sufficient.

1. Divergence theorem (Gauss's integral theorem):
   ∫_V ∇ · F(r) dV = ∮_S dA · F(r)

2. Theorem of the rotational:
   ∫_V ∇ × F(r) dV = ∮_S dA × F(r)

3. Theorem of the gradient:
   ∫_V ∇Φ(r) dV = ∮_S dA Φ(r)

4. Green's theorem:
   ∫_V (∇Ψ · ∇Φ + Ψ∇²Φ) dV = ∮_S dA · (Ψ∇Φ) = ∮_S Ψ (∂Φ/∂n) dA

5. Green's theorem (symmetrical form):
   ∫_V (Ψ∇²Φ − Φ∇²Ψ) dV = ∮_S dA · (Ψ∇Φ − Φ∇Ψ) = ∮_S (Ψ ∂Φ/∂n − Φ ∂Ψ/∂n) dA

6. Special case Ψ = Φ:
   ∫_V |∇Φ|² dV + ∫_V Φ∇²Φ dV = ∮_S dA · (Φ∇Φ) = ∮_S Φ (∂Φ/∂n) dA

7. Special case Ψ = 1 (Gauss's Theorem):
   ∫_V ∇²Φ dV = ∮_S dA · ∇Φ = ∮_S (∂Φ/∂n) dA

* Less stringent conditions are discussed in Ref. 5.5.
The normal derivative is customarily denoted by ∂Φ/∂n, so that

(∂Φ/∂n) dA ≡ (dA · ∇)Φ
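As a numeric sanity check on theorem 1 of Table 5.6-1 (the divergence theorem), the following Python sketch (function name illustrative) evaluates the outward flux of F = r through the surface of the unit cube; since ∇ · r = 3, the flux must equal 3 times the volume:

```python
def flux_through_unit_cube(F, n=40):
    """Outward flux of F through the boundary of the cube [0,1]^3,
    midpoint rule on each of the six faces."""
    h = 1.0 / n
    total = 0.0
    for axis in range(3):
        for side, sign in ((1.0, 1.0), (0.0, -1.0)):
            for a in range(n):
                for b in range(n):
                    p = [0.0, 0.0, 0.0]
                    p[axis] = side                    # face at x_axis = side
                    p[(axis + 1) % 3] = (a + 0.5) * h
                    p[(axis + 2) % 3] = (b + 0.5) * h
                    total += sign * F(p)[axis] * h * h
    return total

# F = r = (x, y, z): div F = 3, so the divergence theorem gives
# surface flux = volume integral of 3 over the unit cube = 3.
flux = flux_through_unit_cube(lambda p: (p[0], p[1], p[2]))
```

Only the three faces with outward-pointing positive normals contribute here, each giving exactly 1, in agreement with the volume integral.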
5.6-2. Stokes' Theorem and Related Theorems. Given a vector function F(r) single-valued and differentiable with continuous partial derivatives throughout a finite region V containing a simply connected regular (two-sided) surface segment S bounded by a regular closed curve C,

∫_S dA · [∇ × F(r)] = ∮_C dr · F(r)    (Stokes' Theorem)    (5.6-1)

i.e., the line integral of F(r) around C equals the flux of ∇ × F through the surface bounded by C. Under the same conditions as above,

∫_S (dA × ∇) × F(r) = ∮_C dr × F(r)    (5.6-2)

and for a scalar point function Φ(r) single-valued and differentiable with continuous partial derivatives throughout V

∫_S dA × [∇Φ(r)] = ∮_C dr Φ(r)    (5.6-3)
Equations (1), (2), and (3) apply to unbounded regions V if the integrands of the line integrals on the right are O(1/r²) in absolute value as r → ∞ (Sec. 4.4-3).

5.6-3. Fields with Surface Discontinuities (see also Sec. 15.6-5b). Let the scalar field Φ(r) or the components of F(r) be continuously differentiable on either side of a regular surface element S but discontinuous on S, so that
Φ(r) = Φ₊(r)    or    F(r) = F₊(r)

on the positive side of S (Sec. 17.3-2), while

Φ(r) = Φ₋(r)    or    F(r) = F₋(r)

on the negative side of S. At each point (r) of S one defines the functions

N(r)[Φ₊(r) − Φ₋(r)]    (surface gradient)
N(r) · [F₊(r) − F₋(r)]    (surface divergence)    (5.6-4)
N(r) × [F₊(r) − F₋(r)]    (surface rotational)
where N(r) is the positive unit normal vector of S at the point (r) (Sec. 17.3-2). The definitions (4) permit one to extend the integral theorems of Table 5.6-1 to functions with surface discontinuities.

5.7. SPECIFICATION OF A VECTOR FIELD IN TERMS OF ITS CURL AND DIVERGENCE
5.7-1. Irrotational Vector Fields. A vector point function F(r) (as well as the field described by it) is called irrotational (lamellar)
throughout a region V if and only if, for every point of V,

∇ × F(r) = 0    or    ∂Fz/∂y = ∂Fy/∂z    ∂Fx/∂z = ∂Fz/∂x    ∂Fy/∂x = ∂Fx/∂y    (5.7-1)
This is true if and only if −F(r) is the gradient ∇Φ(r) of a scalar point function Φ(r) at every point of V [see also Eq. (5.5-19)]; in this case

dr · F(r) ≡ Fx(x, y, z) dx + Fy(x, y, z) dy + Fz(x, y, z) dz = −dr · ∇Φ(r) ≡ −dΦ    (5.7-2)

is an exact differential (Sec. 4.5-3a). Φ(r) is often called the scalar potential of the irrotational vector field. If V is simply connected (Sec. 4.3-6b), Φ(r) is a single-valued function uniquely determined by F(r) except for an additive constant, and the line integral

∫ₐʳ dρ · F(ρ) = −[Φ(r) − Φ(a)]    (5.7-3)

is independent of the path of integration C if the latter comprises only points of V; the line integral around any closed path C in V vanishes. If V is multiply connected, Φ(r) may be a multiple-valued function. As a special case,

∂Fy/∂x = ∂Fx/∂y    (5.7-4)

is a necessary and sufficient condition that the line integral

∫_C [Fx(x, y) dx + Fy(x, y) dy]

is independent of the path of integration, i.e., that the integrand is an exact differential.*
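Path independence (5.7-3) for an irrotational field can be demonstrated numerically. The sketch below (illustrative names, plain Python) integrates F = −∇Φ with Φ = −(x² + y² + z²) along two different paths from (0, 0, 0) to (1, 1, 1); both integrals should equal −[Φ(b) − Φ(a)] = 3:

```python
def line_integral(F, path, n=3000):
    """Integral of d(rho) . F(rho) along rho(t), 0 <= t <= 1 (midpoint rule)."""
    h = 1.0 / n
    s = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        d = 1e-7
        pm, pp = path(t - d), path(t + d)
        dr = [(a - b) / (2 * d) for a, b in zip(pp, pm)]
        s += sum(f * c for f, c in zip(F(path(t)), dr)) * h
    return s

# Irrotational field F = (2x, 2y, 2z) = -grad Phi, Phi = -(x^2 + y^2 + z^2)
F = lambda p: (2 * p[0], 2 * p[1], 2 * p[2])
straight = lambda t: (t, t, t)
def broken(t):          # (0,0,0) -> (1,0,0) -> (1,1,0) -> (1,1,1)
    if t < 1 / 3:
        return (3 * t, 0.0, 0.0)
    if t < 2 / 3:
        return (1.0, 3 * t - 1.0, 0.0)
    return (1.0, 1.0, 3 * t - 2.0)

I1 = line_integral(F, straight)
I2 = line_integral(F, broken)
```

The agreement of the two values illustrates that for an irrotational field only the endpoints, through the scalar potential, determine the integral.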
5.7-2. Solenoidal Vector Fields. A vector point function F(r) (as well as the field described by it) is called solenoidal throughout a region V if and only if, for every point of V,

∇ · F(r) = 0    or    ∂Fx/∂x + ∂Fy/∂y + ∂Fz/∂z = 0    (5.7-5)

This is true if and only if F(r) is the curl ∇ × A(r) of a vector point function A(r) [see also Eq. (5.5-19)], the vector potential of the vector field described by F(r).

* See footnote to Sec. 5.4-1.
5.7-3. Specification of a Vector Point Function in Terms of Its Divergence and Curl. (a) Let V be a finite open region of space, bounded by a regular surface S (Sec. 3.1-14) whose positive normal is uniquely defined and varies continuously at every surface point. If the divergence and curl of a vector point function F(r) are given at every point (r) of V, then F(r) may be expressed throughout V as the sum of an irrotational vector point function F₁(r) and a solenoidal vector point function F₂(r), with

F(r) = F₁(r) + F₂(r)    ∇ × F₁(r) = ∇ · F₂(r) = 0    (5.7-6)

(Helmholtz's Decomposition Theorem). F(r) is uniquely defined throughout V if, in addition, the normal component F(r) · dA/|dA| of F(r) is given at every surface point (Uniqueness Theorem). The problem of actually computing F(r) from these data involves the solution of partial differential equations subject to certain boundary conditions. The important special case in which ∇ · F(r) = ∇ × F(r) = 0 and thus F(r) = −∇Φ(r), ∇²Φ(r) = 0 throughout V forms the subject matter of potential theory as discussed in Secs. 15.6-1 to 15.6-10.
(b) If suitable functions Q(r) and I(r) with

∇ · F(r) = 4πQ(r)    and    ∇ × F(r) = 4πI(r)    (5.7-7)

are given for every point (r) of space, then Eq. (6) defines F₁(r) and F₂(r), and hence F(r), uniquely except for additive functions F₀(r) such that ∇²F₀(r) = 0. One has

F₁(r) = −∇Φ(r)    with    Φ(r) = ∫_{all space} [Q(ρ)/|r − ρ|] dV
F₂(r) = ∇ × A(r)    with    A(r) = ∫_{all space} [I(ρ)/|r − ρ|] dV    (5.7-8)

provided that the integrals on the right (scalar and vector potentials) exist (see also Sec. 15.6-5); the integration extends over all points (ρ).

5.8. RELATED TOPICS, REFERENCES, AND BIBLIOGRAPHY
5.8-1. Related Topics. The following topics related to the study of vector analysis are treated in other chapters of this handbook:

Algebra of more general vector spaces: Chaps. 12, 14
Applications of vector algebra to analytic geometry: Chap. 3
Applications of vector calculus to the geometry of curves and surfaces (differential geometry): Chap. 17
Two-dimensional fields: Chap. 10
Potential theory: Chap. 15
General transformation properties of vector point functions: Chap. 16
Functions of a complex variable: Chap. 7
5.8-2. References and Bibliography (see also Sec. 6.6-2).
5.1. Brand, L.: Vector and Tensor Analysis, Wiley, New York, 1947.
5.2. Brand, L.: Vector Analysis, Wiley, New York, 1957.
5.3. Dörrie, H.: Vektoren, Edwards, Ann Arbor, Mich., 1946.
5.4. Halmos, P. R.: Finite Dimensional Vector Spaces, Princeton University Press, Princeton, N.J., 1942.
5.5. Lagally, M.: Vorlesungen über Vektor-Rechnung, Edwards, Ann Arbor, Mich., 1947.
5.6. Lass, H.: Vector and Tensor Analysis, McGraw-Hill, New York, 1950.
5.7. McQuistan, R. B.: Scalar and Vector Fields, Wiley, New York, 1965.
5.8. Sokolnikoff, I. S.: Tensor Analysis, 2d ed., Wiley, New York, 1964.
5.9. Weatherburn, C. E.: Elementary Vector Analysis, Open Court, LaSalle, Ill., 1948.
5.10. Weatherburn, C. E.: Advanced Vector Analysis, Open Court, LaSalle, Ill., 1948.
CHAPTER 6

CURVILINEAR COORDINATE SYSTEMS

6.1. Introduction
    6.1-1. Introductory Remarks
6.2. Curvilinear Coordinate Systems
    6.2-1. Curvilinear Coordinates
    6.2-2. Coordinate Surfaces and Coordinate Lines
    6.2-3. Elements of Distance and Volume
6.3. Representation of Vectors in Terms of Components
    6.3-1. Vector Components and Local Base Vectors
    6.3-2. Representation of Vectors in Terms of Physical Components
    6.3-3. Representation of Vectors in Terms of Contravariant and Covariant Components
    6.3-4. Derivation of Vector Relations in Terms of Curvilinear Components
6.4. Orthogonal Coordinate Systems. Vector Relations in Terms of Orthogonal Components
    6.4-1. Orthogonal Coordinates
    6.4-2. Vector Relations
    6.4-3. Line Integrals, Surface Integrals, and Volume Integrals
6.5. Formulas Relating to Special Orthogonal Coordinate Systems
    6.5-1. Introduction
6.6. Related Topics, References, and Bibliography
    6.6-1. Related Topics
    6.6-2. References and Bibliography
6.1. INTRODUCTION
6.1-1. Chapter 6 deals with the description (representation) of scalar and vector functions of position (see also Secs. 5.4-1 to 5.7-3) in terms of curvilinear coordinates (Sec. 6.2-1). Vectors will be represented by
components along, or perpendicular to, the coordinate lines at each point (Secs. 6.3-1 to 6.3-3). The use of curvilinear coordinates simplifies many problems; one may, for instance, choose a curvilinear coordinate system such that a function under consideration is constant on a coordinate surface (Secs. 6.4-3 and 10.4-1c). In accordance with the requirements of many physical applications, Chap. 6 is mainly concerned with orthogonal coordinate systems (Secs. 6.4-1 to 6.5-1). The representation of vector relations in terms of nonorthogonal components is treated more elaborately in Chap. 16 in the context of tensor analysis.

6.2. CURVILINEAR COORDINATE SYSTEMS
6.2-1. Curvilinear Coordinates. A curvilinear coordinate system defined over a region V of three-dimensional Euclidean space labels each point (x, y, z) with an ordered set of three real numbers x¹, x², x³.*

Fig. 6.2-1. Illustrating coordinate line elements, coordinate surface elements, and volume element for a curvilinear coordinate system.
The curvilinear coordinates x¹, x², x³ of the point (x, y, z) ≡ (x¹, x², x³) are related to the right-handed rectangular cartesian coordinates x, y, z (Sec. 3.1-4) by three continuously differentiable transformation equations

x¹ = x¹(x, y, z)    x² = x²(x, y, z)    x³ = x³(x, y, z)    (6.2-1)

where the functions (1) are single-valued and ∂(x¹, x², x³)/∂(x, y, z) ≠ 0 throughout V (admissible transformation, see also Secs. 4.5-6 and 16.1-2).

* The indices 1, 2, 3 of x¹, x², x³ are superscripts, not exponents; see also Secs. 16.1-2 and 16.1-3.
The x¹, x², x³ coordinate system is cartesian (Sec. 3.1-2; not in general rectangular) if and only if the transformation equations (1) are linear equations.

6.2-2. Coordinate Surfaces and Coordinate Lines. The condition xⁱ = xⁱ(x, y, z) = constant defines a coordinate surface. Coordinate surfaces corresponding to different values of the same coordinate xⁱ do not intersect in V. Two coordinate surfaces corresponding to different coordinates xⁱ, xʲ intersect on the coordinate line corresponding to the third coordinate xᵏ. Each point (x, y, z) ≡ (x¹, x², x³) of V is represented as the point of intersection of three coordinate surfaces, or of three coordinate lines.
6.2-3. Elements of Distance and Volume (Fig. 6.2-1; see also Secs. 6.4-3 and 17.3-3). (a) In terms of curvilinear coordinates x¹, x², x³, the element of distance ds between two adjacent points (x, y, z) ≡ (x¹, x², x³) and (x + dx, y + dy, z + dz) ≡ (x¹ + dx¹, x² + dx², x³ + dx³) is given by the quadratic differential form

ds² = dx² + dy² + dz² = Σᵢ Σₖ g_ik(x¹, x², x³) dxⁱ dxᵏ
with    g_ik(x¹, x², x³) ≡ (∂x/∂xⁱ)(∂x/∂xᵏ) + (∂y/∂xⁱ)(∂y/∂xᵏ) + (∂z/∂xⁱ)(∂z/∂xᵏ) = g_ki(x¹, x², x³)    (i, k = 1, 2, 3)    (6.2-2)

The functions g_ik(x¹, x², x³) are the components of the metric tensor (Sec. 16.7-1).

(b) The six coordinate surfaces associated with the points (x¹, x², x³) and (x¹ + dx¹, x² + dx², x³ + dx³) bound the parallelepipedal volume element

dV = [∂(x, y, z)/∂(x¹, x², x³)] dx¹ dx² dx³ = ±√g dx¹ dx² dx³
with    g ≡ det [g_ik] = [∂(x, y, z)/∂(x¹, x², x³)]²    (6.2-3)

(see also Secs. 5.4-7, 6.4-3c, and 16.10-10). √g is taken to be positive if the directed increments dx¹, dx², dx³, which define the positive directions on the coordinate lines, form a right-handed system (Sec. 3.1-3). The curvilinear coordinate system is then right-handed throughout the region V; otherwise the coordinate system is left-handed throughout V (see also Sec. 16.7-1). Refer to Secs. 6.4-3, 17.3-3, and 17.4-2 for the description of vector path elements and surface elements in terms of curvilinear coordinates.
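Equation (6.2-2) can be evaluated numerically for a concrete coordinate system. The Python sketch below (function names illustrative) builds g_ik for spherical coordinates x¹ = r, x² = θ, x³ = φ by differencing the transformation to cartesian coordinates; for this orthogonal system the result should be diag(1, r², r² sin²θ):

```python
import math

def metric(trans, q, h=1e-6):
    """g_ik of Eq. (6.2-2) for the transformation (x1,x2,x3) -> (x,y,z)."""
    def basis(i):                     # column of partials d(x,y,z)/dx^i
        q1, q2 = list(q), list(q)
        q1[i] += h
        q2[i] -= h
        return [(a - b) / (2 * h) for a, b in zip(trans(q1), trans(q2))]
    e = [basis(i) for i in range(3)]
    return [[sum(e[i][m] * e[k][m] for m in range(3)) for k in range(3)]
            for i in range(3)]

# Spherical coordinates x1 = r, x2 = theta, x3 = phi (a standard example):
sph = lambda q: (q[0] * math.sin(q[1]) * math.cos(q[2]),
                 q[0] * math.sin(q[1]) * math.sin(q[2]),
                 q[0] * math.cos(q[1]))
g = metric(sph, [2.0, 0.6, 1.1])
# expected: g = diag(1, r^2, r^2 sin^2(theta)) with r = 2, theta = 0.6
```

The vanishing off-diagonal entries confirm that spherical coordinates are orthogonal in the sense of Sec. 6.4-1, and √g = r² sin θ reproduces the familiar spherical volume element.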
6.3. REPRESENTATION OF VECTORS IN TERMS OF COMPONENTS
6.3-1. Vector Components and Local Base Vectors. In Chap. 5, a vector function of position was described in terms of right-handed rectangular cartesian components by reference to local base vectors i, j, k; the magnitude and direction of each cartesian base vector are the same at every point (x, y, z) (Secs. 5.2-2 and 5.4-1). If the vector function F(r) is to be described in terms of curvilinear coordinates x¹, x², x³, it is useful to employ local base vectors directed along, or perpendicular to, the coordinate lines at each point (x¹, x², x³). Such base vectors are themselves vector functions of position. The following sections describe three different systems of local base vectors associated with each given curvilinear coordinate system.

6.3-2. Representation of Vectors in Terms of Physical Components. Given a curvilinear coordinate system defined as in Sec. 6.2-1, the unit vectors u₁(x¹, x², x³), u₂(x¹, x², x³), u₃(x¹, x², x³) respectively directed along the positive x¹, x², x³ coordinate lines (Sec. 6.2-3b) at each point (x¹, x², x³) ≡ (r) may be used as local base vectors (see also Sec. 16.8-3). Every vector F(r) = F(x¹, x², x³) can be uniquely represented in the form

F(r) = F(x¹, x², x³) = F̂₁u₁ + F̂₂u₂ + F̂₃u₃    (6.3-1)

at each point (r) ≡ (x¹, x², x³). Refer to Table 6.3-1 for the transformation equations relating the functions

F̂₁ = F̂₁(x¹, x², x³)    F̂₂ = F̂₂(x¹, x², x³)    F̂₃ = F̂₃(x¹, x², x³)    (6.3-2)

(physical components of F in the coordinate directions, see also Sec. 16.8-3) and the local base vectors u₁(x¹, x², x³), u₂(x¹, x², x³), u₃(x¹, x², x³) to their respective rectangular cartesian counterparts Fx, Fy, Fz and i, j, k. Note that the functions (1/√gᵢᵢ)(∂x/∂xⁱ), (1/√gᵢᵢ)(∂y/∂xⁱ), (1/√gᵢᵢ)(∂z/∂xⁱ) are the direction cosines of uᵢ (i.e., of the xⁱ-coordinate line) with respect to the x, y, z axes (see also Fig. 6.3-1).

6.3-3. Representation of Vectors in Terms of Contravariant and Covariant
Components. For any curvilinear coordinate system defined as in Sec. 6.2-1, it is possible to introduce the system of local base vectors

e₁(x¹, x², x³) ≡ +√g₁₁ u₁    e₂(x¹, x², x³) ≡ +√g₂₂ u₂    e₃(x¹, x², x³) ≡ +√g₃₃ u₃    (6.3-3)

directed along the coordinate lines, and the system of local base vectors e¹(x¹, x², x³), e²(x¹, x², x³), e³(x¹, x², x³) (6.3-4) directed perpendicularly to the coordinate surfaces. Each vector F(r) can then be represented in the respective forms

F(r) = F(x¹, x², x³) = F¹e₁ + F²e₂ + F³e₃ = F₁e¹ + F₂e² + F₃e³    (6.3-5)

at each point (r) ≡ (x¹, x², x³). The magnitudes as well as the directions of the local base vectors eᵢ and eⁱ change from point to point, unless x¹, x², x³ are cartesian coordinates (Sec. 16.6-1a).

Fig. 6.3-1. Representation of a vector F in terms of unit vectors u₁, u₂, u₃ and physical components F̂₁, F̂₂, F̂₃.

The contravariant components Fⁱ = Fⁱ(x¹, x², x³) = F · eⁱ and the covariant components Fᵢ = Fᵢ(x¹, x², x³) = F · eᵢ (i = 1, 2, 3) in the scheme of measurements (Sec. 16.1-4) associated with each system of coordinates x¹, x², x³, and the associated base vectors eᵢ and eⁱ may also be defined directly by the transformations relating them to the rectangular cartesian components Fx, Fy, Fz and base vectors i, j, k (Table 6.3-1). Note (see also Secs. 16.6-1 and 16.8-4)
e¹ = (e₂ × e₃)/[e₁e₂e₃]    e² = (e₃ × e₁)/[e₁e₂e₃]    e³ = (e₁ × e₂)/[e₁e₂e₃]    (6.3-6)
eᵢ · eᵏ = 1 if i = k,  0 if i ≠ k    (6.3-7)
[e₁e₂e₃] = [e¹e²e³]⁻¹ = [u₁u₂u₃] √(g₁₁g₂₂g₃₃) = ±√g    (6.3-8)
Table 6.3-1. Transformations Relating Base Vectors and Vector Components Associated with Different Local Reference Systems

(a) Relations between i, j, k and the Local Base Vectors uᵢ, eᵢ, eⁱ Associated with a Curvilinear Coordinate System

uᵢ = eᵢ/√gᵢᵢ
eᵢ = (∂x/∂xⁱ)i + (∂y/∂xⁱ)j + (∂z/∂xⁱ)k
eⁱ = (∂xⁱ/∂x)i + (∂xⁱ/∂y)j + (∂xⁱ/∂z)k

i = Σᵢ (∂xⁱ/∂x)√gᵢᵢ uᵢ = Σᵢ (∂xⁱ/∂x) eᵢ = Σᵢ (∂x/∂xⁱ) eⁱ
j = Σᵢ (∂xⁱ/∂y)√gᵢᵢ uᵢ = Σᵢ (∂xⁱ/∂y) eᵢ = Σᵢ (∂y/∂xⁱ) eⁱ
k = Σᵢ (∂xⁱ/∂z)√gᵢᵢ uᵢ = Σᵢ (∂xⁱ/∂z) eᵢ = Σᵢ (∂z/∂xⁱ) eⁱ

(b) Relations between Fx, Fy, Fz and the Physical Components F̂ᵢ, Contravariant Components Fⁱ, and Covariant Components Fᵢ Associated with a Curvilinear Coordinate System

Fx = Σᵢ (∂x/∂xⁱ)(F̂ᵢ/√gᵢᵢ) = Σᵢ (∂x/∂xⁱ) Fⁱ = Σᵢ (∂xⁱ/∂x) Fᵢ
Fy = Σᵢ (∂y/∂xⁱ)(F̂ᵢ/√gᵢᵢ) = Σᵢ (∂y/∂xⁱ) Fⁱ = Σᵢ (∂xⁱ/∂y) Fᵢ
Fz = Σᵢ (∂z/∂xⁱ)(F̂ᵢ/√gᵢᵢ) = Σᵢ (∂z/∂xⁱ) Fⁱ = Σᵢ (∂xⁱ/∂z) Fᵢ

Fⁱ = (∂xⁱ/∂x)Fx + (∂xⁱ/∂y)Fy + (∂xⁱ/∂z)Fz    Fᵢ = (∂x/∂xⁱ)Fx + (∂y/∂xⁱ)Fy + (∂z/∂xⁱ)Fz    F̂ᵢ = +√gᵢᵢ Fⁱ    (i = 1, 2, 3)

(c) Relations between Local Base Vectors and Vector Components Associated with Two Curvilinear Coordinate Systems (barred base vectors and vector components are associated with the x̄¹, x̄², x̄³ system; see also Sec. 16.2-1)

ēₖ = Σᵢ (∂xⁱ/∂x̄ᵏ) eᵢ    ēᵏ = Σᵢ (∂x̄ᵏ/∂xⁱ) eⁱ    ūₖ = ēₖ/√ḡₖₖ
F̄ᵏ = Σᵢ (∂x̄ᵏ/∂xⁱ) Fⁱ    F̄ₖ = Σᵢ (∂xⁱ/∂x̄ᵏ) Fᵢ    F̄̂ₖ = +√ḡₖₖ F̄ᵏ    (k = 1, 2, 3)
In the special case of a rectangular cartesian coordinate system, x¹ = x, x² = y, x³ = z, and

e₁ = e¹ = u₁ = i    e₂ = e² = u₂ = j    e₃ = e³ = u₃ = k    (6.3-9)

A main advantage of the contravariant and covariant representation of vectors is the relative simplicity of the transformation equations relating contravariant (or covariant) vector components associated with different coordinate systems (Table 6.3-1; see also Sec. 6.3-4).
6.3-4. Derivation of Vector Relations in Terms of Curvilinear Components. In principle, every vector relation given, as in Chap. 5, in terms of rectangular cartesian components can be expressed in terms of curvilinear components with the aid of Table 6.3-1. If the relations in question involve differentiation and/or integration, note that the base vectors associated with curvilinear coordinate systems are functions of position. Many practically important problems permit the use of orthogonal coordinate systems (Sec. 6.4-1). In this case, the formulas of Secs. 6.4-1 to 6.4-3 and Tables 6.4-1 to 6.5-11 yield comparatively simple expressions for many vector relations directly in terms of physical components. When these special methods do not apply, it is usually best to employ contravariant and covariant vector components rather than physical components; the formulation of vector analysis in terms of contravariant and covariant components is treated in detail in Chap. 16 as part of the more general subject of tensor analysis. In particular, one uses the formulas of Secs. 16.8-1 to 16.8-4 for computing scalar and vector products, and the relatively straightforward method of covariant differentiation (Secs. 16.10-1 to 16.10-8) yields expressions for differential invariants like ∇Φ, ∇ · F, and ∇ × F.

6.4. ORTHOGONAL COORDINATE SYSTEMS. VECTOR RELATIONS IN TERMS OF ORTHOGONAL COMPONENTS
6.4-1. Orthogonal Coordinates. An orthogonal coordinate system is a system of curvilinear coordinates x¹, x², x³ (Sec. 6.2-1) chosen so that the functions g_ik(x¹, x², x³) satisfy the relations

g_ik(x¹, x², x³) = 0    if i ≠ k    (6.4-1)

at each point (x¹, x², x³). The coordinate lines, and thus also the local base vectors u₁, u₂, u₃, of an orthogonal coordinate system are perpendicular to each other at each point; each coordinate line is perpendicular to all coordinate surfaces corresponding to constant values of the coordinate in question.
6.4-2. Vector Relations. (a) The formulas of Table 6.4-1 express the most important vector relations in terms of orthogonal coordinates and components. The appropriate functions g_ii = g_ii(x^1, x^2, x^3) for each specific orthogonal coordinate system are obtained from Eq. (6.2-2) or from Tables 6.5-1 to 6.5-11.
Table 6.4-1. Vector Formulas Expressed in Terms of Physical Components for Orthogonal Coordinate Systems (Sec. 6.4-1)
Plus and minus signs refer, respectively, to right-handed and left-handed orthogonal coordinate systems. The appropriate functions g_ii(x^1, x^2, x^3) = |e_i|^2 are obtained from Eq. (6.2-2) or from Tables 6.5-1 to 6.5-11.

(a) Scalar and Vector Products

    F · G = F_1 G_1 + F_2 G_2 + F_3 G_3

    F × G = ± | u_1  u_2  u_3 |
              | F_1  F_2  F_3 |
              | G_1  G_2  G_3 |

    [F G H] = ± | F_1  F_2  F_3 |
                | G_1  G_2  G_3 |
                | H_1  H_2  H_3 |

    |F| = + √(F_1^2 + F_2^2 + F_3^2)

(b) Differential Invariants (g ≡ g_11 g_22 g_33)

    ∇Φ = u_1 (1/√g_11) ∂Φ/∂x^1 + u_2 (1/√g_22) ∂Φ/∂x^2 + u_3 (1/√g_33) ∂Φ/∂x^3

    ∇ · F = (1/√g) [ ∂/∂x^1 ((√g/√g_11) F_1) + ∂/∂x^2 ((√g/√g_22) F_2) + ∂/∂x^3 ((√g/√g_33) F_3) ]

    ∇ × F = ± (1/√g) | √g_11 u_1   √g_22 u_2   √g_33 u_3 |
                     | ∂/∂x^1      ∂/∂x^2      ∂/∂x^3    |
                     | √g_11 F_1   √g_22 F_2   √g_33 F_3 |

    (F · ∇)Φ = (F_1/√g_11) ∂Φ/∂x^1 + (F_2/√g_22) ∂Φ/∂x^2 + (F_3/√g_33) ∂Φ/∂x^3

    ∇²Φ = (1/√g) [ ∂/∂x^1 ((√g/g_11) ∂Φ/∂x^1) + ∂/∂x^2 ((√g/g_22) ∂Φ/∂x^2) + ∂/∂x^3 ((√g/g_33) ∂Φ/∂x^3) ]

For the local base vectors u_i of an orthogonal coordinate system,

    u_i · u_k = { 1 if i = k              u_i × u_k = { 0 if i = k
                { 0 if i ≠ k                          { ± u_j,  i ≠ k ≠ j ≠ i

    [u_1 u_2 u_3] = ± 1        (6.4-2)

    e_i = + √g_ii u_i        F_i = + √g_ii F^i = F̄_i/√g_ii        (i = 1, 2, 3)        (6.4-3)

where F^i and F̄_i are, respectively, the contravariant and covariant components of F (Sec. 6.3-3), and

    [e_1 e_2 e_3] = [e^1 e^2 e^3]^(-1) = [u_1 u_2 u_3] √(g_11 g_22 g_33) = ± √(g_11 g_22 g_33) = ± √g        (6.4-4)
where the plus and minus signs refer, respectively, to right-handed and left-handed orthogonal coordinate systems.
Note: Expressions for ∇Φ, ∇ · F, and ∇ × F may be derived directly from the definitions of Sec. 5.5-1 with the aid of the volume element shown in Fig. 6.2-1.
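As a concrete check of the Laplacian entry of Table 6.4-1, the sketch below (our own illustration, not from the handbook) evaluates ∇²Φ = (1/√g) Σ_i ∂/∂x^i ((√g/g_ii) ∂Φ/∂x^i) in spherical coordinates for Φ = r^2 = x^2 + y^2 + z^2, whose Laplacian is 6 everywhere:

```python
import math

# Our own check (not from the handbook): the Table 6.4-1 Laplacian
#   lap(Phi) = (1/sqrt(g)) * sum_i d/dx^i ( sqrt(g)/g_ii * dPhi/dx^i )
# in spherical coordinates, applied to Phi = r^2 (= x^2 + y^2 + z^2).
def phi(r, th, ph):
    return r*r

def sqrt_g(r, th):
    return r*r*math.sin(th)          # sqrt(g) = r^2 sin(theta)

def g_ii(i, r, th):
    return (1.0, r*r, (r*math.sin(th))**2)[i]

H = 1e-4

def flux(i, r, th, ph):
    # sqrt(g)/g_ii * dPhi/dx^i at (r, th, ph)
    up = [r, th, ph]; up[i] += H
    dn = [r, th, ph]; dn[i] -= H
    d = (phi(*up) - phi(*dn))/(2*H)
    return sqrt_g(r, th)/g_ii(i, r, th)*d

def laplacian(r, th, ph):
    total = 0.0
    for i in range(3):
        up = [r, th, ph]; up[i] += H
        dn = [r, th, ph]; dn[i] -= H
        total += (flux(i, *up) - flux(i, *dn))/(2*H)
    return total/sqrt_g(r, th)

val = laplacian(1.5, 0.8, 0.3)
assert abs(val - 6.0) < 1e-3         # Laplacian of r^2 is 6 everywhere
print("numerical Laplacian of r^2:", round(val, 5))
```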
6.4-3. Line Integrals, Surface Integrals, and Volume Integrals (see also Fig. 6.2-1). (a) Given a rectifiable curve C (Sec. 5.4-4), the vector path element dr has the physical components ds_1 = √g_11 dx^1, ds_2 = √g_22 dx^2, ds_3 = √g_33 dx^3 in each curvilinear coordinate system (Sec. 6.2-1), i.e.,

    dr = (dr/dt) dt = i dx + j dy + k dz = u_1 ds_1 + u_2 ds_2 + u_3 ds_3
       = u_1 √g_11 dx^1 + u_2 √g_22 dx^2 + u_3 √g_33 dx^3        (6.4-6)

The appropriate expression (6) for dr must be substituted in each line integral defined in Sec. 5.4-5. Thus, for orthogonal coordinates x^1, x^2, x^3,

    ∫_C dr · F = ∫_C (F_1 ds_1 + F_2 ds_2 + F_3 ds_3)

Note that for every curvilinear coordinate system dr = e_1 dx^1 + e_2 dx^2 + e_3 dx^3 (see also Sec. 17.2-1); for orthogonal coordinates x^1, x^2, x^3,

    ∫_C dr · F = ∫_C (F̄_1 dx^1 + F̄_2 dx^2 + F̄_3 dx^3) = ∫_C (g_11 F^1 dx^1 + g_22 F^2 dx^2 + g_33 F^3 dx^3)

(b) The description of a surface S and of the vector surface element dA in terms of surface coordinates is discussed in Secs. 5.4-6 and 17.3-1.
In particular, for orthogonal coordinates x^1, x^2, x^3 chosen so that S is a portion of an x^k coordinate surface (Sec. 6.2-2), the space coordinates x^i and x^j are (orthogonal) surface coordinates on S, and

    dA = u_k √(g_ii g_jj) dx^i dx^j = u_k ds_i ds_j = e_k (√g/g_kk) dx^i dx^j
    (i ≠ j ≠ k ≠ i;  k = 1, 2, 3)        (6.4-8)
The relative simplicity of the expressions (5.4-13) and (5.4-14) for surface integrals resulting from the use of Eq. (8) is often the main reason for introducing a curvilinear coordinate system (see also Sec. 10.4-1c). The sign of the square root in Eq. (8) determines the direction of the positive normal (Sec. 17.3-2) and is taken to be positive for right-handed coordinate systems.
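For example (our own numerical sketch, assuming spherical coordinates), Eq. (8) applied to the coordinate sphere r = R gives dA = √(g_ϑϑ g_φφ) dϑ dφ = R^2 sin ϑ dϑ dφ, and summing this over a grid recovers the area 4πR^2:

```python
import math

# Our own sketch: surface area of the sphere r = R from the surface
# element of Eq. (6.4-8), dA = sqrt(g_theta_theta * g_phi_phi) dtheta dphi.
R = 2.0
nt = 400
dt = math.pi/nt
area = 0.0
for j in range(nt):
    th = (j + 0.5)*dt                      # midpoint rule in theta
    dA = math.sqrt((R*R)*(R*math.sin(th))**2)
    area += dA*dt*2*math.pi                # phi integral contributes 2*pi
exact = 4*math.pi*R*R
assert abs(area - exact)/exact < 1e-4
print("area:", round(area, 6), " exact 4*pi*R^2:", round(exact, 6))
```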
(c) The volume element dV appearing in the expressions (5.4-16) and (5.4-17) for volume integrals is given by Eq. (6.2-3), or

    dV = [∂(x, y, z)/∂(x^1, x^2, x^3)] dx^1 dx^2 dx^3 = ± √g dx^1 dx^2 dx^3
       = ± √(g_11 g_22 g_33) dx^1 dx^2 dx^3 = ± ds_1 ds_2 ds_3        (6.4-9)

√g is positive wherever the coordinate system is right-handed (Sec. 6.2-3b).

6.5. FORMULAS RELATING TO SPECIAL ORTHOGONAL COORDINATE SYSTEMS
6.5-1. Introduction. Tables 6.5-1 to 6.5-11 present formulas relating to a number of special orthogonal coordinate systems. In particular, the functions g_ii(x^1, x^2, x^3) = |e_i|^2 are tabulated for each coordinate system and may be substituted in the relations of Table 6.4-1 and of Secs. 6.4-3 and 16.10-1 to 16.10-8 to yield additional formulas.
Note: A given family of coordinate surfaces x^1 = x^1(x, y, z) = constant, x^2 = x^2(x, y, z) = constant, x^3 = x^3(x, y, z) = constant is necessarily common to all systems of coordinates x̄^1 = x̄^1(x^1), x̄^2 = x̄^2(x^2), x̄^3 = x̄^3(x^3). The particular systems described in Tables 6.5-1 to 6.5-11 are representative.
6.6. RELATED TOPICS, REFERENCES, AND BIBLIOGRAPHY
6.6-1. Related Topics. The following topics related to the study of vector relations in terms of curvilinear coordinates are treated in other chapters of this handbook:

    Solid analytic geometry ................. Chap. 3
    Elementary vector analysis .............. Chap. 5
    Coordinate transformations .............. Chaps. 3, 14
    Contravariant and covariant vectors ..... Chap. 16
    Transformation of base vectors .......... Chap. 16
    Differential invariants ................. Chap. 16
    Differential geometry ................... Chap. 17
    Potential theory ........................ Chap. 15
6.6-2. References and Bibliography (see also Sec. 5.8-2).
6.1. Courant, R., and D. Hilbert: Methods of Mathematical Physics, vol. I, Wiley, New York, 1953.
6.2. Kellogg, O. D.: Foundations of Potential Theory, Springer, Berlin, 1929.
6.3. MacMillan, W. D.: Theory of the Potential, McGraw-Hill, New York, 1930.
6.4. Madelung, E.: Die mathematischen Hilfsmittel des Physikers, 7th ed., Springer, Berlin, 1964.
6.5. Magnus, W., and F. Oberhettinger: Formulas and Theorems for the Functions of Mathematical Physics, Chelsea, New York, 1954; 3d ed., Springer, Berlin, 1966.
6.6. Margenau, H., and G. M. Murphy: The Mathematics of Physics and Chemistry, Van Nostrand, Princeton, N.J., 1943.
6.7. Stratton, J. A.: Electromagnetic Theory, Chap. I, McGraw-Hill, New York, 1941.
6.8. Whittaker, E. T., and G. N. Watson: A Course of Modern Analysis, Cambridge, New York, 1927.
Table 6.5-1. Vector Formulas in Terms of Spherical and (Circular) Cylindrical Coordinates (see also Figs. 2.1-2 and 3.1-1b)
The formulas for cylindrical coordinates also apply to polar coordinates in the xy plane (Sec. 2.1-8). x, y, z are right-handed rectangular cartesian coordinates.

Spherical (polar) coordinates r, ϑ, φ

Coordinate surfaces:
    x^2 + y^2 + z^2 = r^2 (spheres);  x^2 + y^2 − z^2 tan^2 ϑ = 0 (circular cones);  y = x tan φ (planes through z axis)

Transformation of coordinates:
    x = r sin ϑ cos φ        r = + √(x^2 + y^2 + z^2)
    y = r sin ϑ sin φ        ϑ = arccos (z/r)
    z = r cos ϑ              φ = arctan (y/x)

Transformation of coordinate differentials:
    dx = sin ϑ cos φ dr + r cos ϑ cos φ dϑ − r sin ϑ sin φ dφ
    dy = sin ϑ sin φ dr + r cos ϑ sin φ dϑ + r sin ϑ cos φ dφ
    dz = cos ϑ dr − r sin ϑ dϑ

Square of distance element:
    ds^2 = (dr)^2 = dx^2 + dy^2 + dz^2 = dr^2 + r^2 dϑ^2 + r^2 sin^2 ϑ dφ^2

    g_rr = 1        g_ϑϑ = r^2        g_φφ = r^2 sin^2 ϑ
    √g = ∂(x, y, z)/∂(r, ϑ, φ) = r^2 sin ϑ

Christoffel three-index symbols (Sec. 16.10-3; three-index symbols not shown are identically zero; Γ^i_jk is written here for the symbol with upper index i and lower indices j, k):
    Γ^r_ϑϑ = −r        Γ^r_φφ = −r sin^2 ϑ        Γ^ϑ_φφ = −sin ϑ cos ϑ
    Γ^ϑ_rϑ = Γ^φ_rφ = 1/r        Γ^φ_ϑφ = cot ϑ

Transformation of physical vector components:
    F_x = F_r sin ϑ cos φ + F_ϑ cos ϑ cos φ − F_φ sin φ
    F_y = F_r sin ϑ sin φ + F_ϑ cos ϑ sin φ + F_φ cos φ
    F_z = F_r cos ϑ − F_ϑ sin ϑ
    F_r = F_x sin ϑ cos φ + F_y sin ϑ sin φ + F_z cos ϑ
    F_ϑ = F_x cos ϑ cos φ + F_y cos ϑ sin φ − F_z sin ϑ
    F_φ = −F_x sin φ + F_y cos φ

Gradient in terms of physical components:
    ∇Φ = u_r ∂Φ/∂r + u_ϑ (1/r) ∂Φ/∂ϑ + u_φ (1/(r sin ϑ)) ∂Φ/∂φ

Divergence in terms of physical components:
    ∇ · F = (1/r^2) ∂(r^2 F_r)/∂r + (1/(r sin ϑ)) ∂(F_ϑ sin ϑ)/∂ϑ + (1/(r sin ϑ)) ∂F_φ/∂φ

Curl in terms of physical components:
    ∇ × F = u_r (1/(r sin ϑ)) [∂(F_φ sin ϑ)/∂ϑ − ∂F_ϑ/∂φ]
          + u_ϑ [(1/(r sin ϑ)) ∂F_r/∂φ − (1/r) ∂(r F_φ)/∂r]
          + u_φ (1/r) [∂(r F_ϑ)/∂r − ∂F_r/∂ϑ]

Laplacian of a scalar point function:*
    ∇²Φ = (1/r^2) ∂/∂r (r^2 ∂Φ/∂r) + (1/(r^2 sin ϑ)) ∂/∂ϑ (sin ϑ ∂Φ/∂ϑ) + (1/(r^2 sin^2 ϑ)) ∂²Φ/∂φ^2

(Circular) cylindrical coordinates r′, φ, z

Coordinate surfaces:
    x^2 + y^2 = r′^2 (right circular cylinders);  y = x tan φ (planes through z axis);  z = constant (planes parallel to xy plane)

Transformation of coordinates:
    x = r′ cos φ        r′ = + √(x^2 + y^2)
    y = r′ sin φ        φ = arctan (y/x)
    z = z

    dx = cos φ dr′ − r′ sin φ dφ        dy = sin φ dr′ + r′ cos φ dφ        dz = dz

    ds^2 = dr′^2 + r′^2 dφ^2 + dz^2
    g_r′r′ = 1        g_φφ = r′^2        g_zz = 1        √g = r′

Transformation of physical vector components:
    F_x = F_r′ cos φ − F_φ sin φ        F_y = F_r′ sin φ + F_φ cos φ        F_z = F_z

Gradient, divergence, curl, and Laplacian:
    ∇Φ = u_r′ ∂Φ/∂r′ + u_φ (1/r′) ∂Φ/∂φ + u_z ∂Φ/∂z
    ∇ · F = (1/r′) ∂(r′ F_r′)/∂r′ + (1/r′) ∂F_φ/∂φ + ∂F_z/∂z
    ∇ × F = u_r′ [(1/r′) ∂F_z/∂φ − ∂F_φ/∂z] + u_φ [∂F_r′/∂z − ∂F_z/∂r′] + u_z (1/r′) [∂(r′ F_φ)/∂r′ − ∂F_r′/∂φ]
    ∇²Φ = (1/r′) ∂/∂r′ (r′ ∂Φ/∂r′) + (1/r′^2) ∂²Φ/∂φ^2 + ∂²Φ/∂z^2

For comparison, in rectangular cartesian coordinates
    ∇Φ = i ∂Φ/∂x + j ∂Φ/∂y + k ∂Φ/∂z        ∇ · F = ∂F_x/∂x + ∂F_y/∂y + ∂F_z/∂z        ∇²Φ = ∂²Φ/∂x^2 + ∂²Φ/∂y^2 + ∂²Φ/∂z^2

* To find the Laplacian of a vector, use ∇²F = ∇(∇ · F) − ∇ × (∇ × F), or use Eq. (6.4-5) and Table 16.10-1.
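The physical-component transformation of Table 6.5-1 is an orthogonal (rotation) substitution, so converting spherical components to cartesian components and back must reproduce the original values. A sketch (our own, with arbitrary test data):

```python
import math

# Our own sketch of the physical-component transformation for spherical
# coordinates (Table 6.5-1); the test values below are arbitrary.
def sph_to_cart_components(Fr, Ft, Fp, th, ph):
    st, ct, sp, cp = math.sin(th), math.cos(th), math.sin(ph), math.cos(ph)
    Fx = Fr*st*cp + Ft*ct*cp - Fp*sp
    Fy = Fr*st*sp + Ft*ct*sp + Fp*cp
    Fz = Fr*ct - Ft*st
    return Fx, Fy, Fz

def cart_to_sph_components(Fx, Fy, Fz, th, ph):
    st, ct, sp, cp = math.sin(th), math.cos(th), math.sin(ph), math.cos(ph)
    Fr = Fx*st*cp + Fy*st*sp + Fz*ct
    Ft = Fx*ct*cp + Fy*ct*sp - Fz*st
    Fp = -Fx*sp + Fy*cp
    return Fr, Ft, Fp

th, ph = 0.9, 2.3
F = (1.0, -2.0, 0.5)
back = cart_to_sph_components(*sph_to_cart_components(*F, th, ph), th, ph)
assert all(abs(p - q) < 1e-12 for p, q in zip(F, back))
print("round trip ok:", tuple(round(q, 12) for q in back))
```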
Table 6.5-2. General Ellipsoidal Coordinates λ, μ, ν or u, v, w

(a) Coordinate Surfaces (solve each equation to obtain λ, μ, ν in terms of x, y, z)

    x^2/(a^2 + λ) + y^2/(b^2 + λ) + z^2/(c^2 + λ) = 1    (ellipsoids)
    x^2/(a^2 + μ) + y^2/(b^2 + μ) + z^2/(c^2 + μ) = 1    (hyperboloids of one sheet)
    x^2/(a^2 + ν) + y^2/(b^2 + ν) + z^2/(c^2 + ν) = 1    (hyperboloids of two sheets)
    (λ > −c^2 > μ > −b^2 > ν > −a^2)

(b) Transformation to Ellipsoidal Coordinates

    x^2 = (a^2 + λ)(a^2 + μ)(a^2 + ν)/[(a^2 − b^2)(a^2 − c^2)]
    y^2 = (b^2 + λ)(b^2 + μ)(b^2 + ν)/[(b^2 − a^2)(b^2 − c^2)]
    z^2 = (c^2 + λ)(c^2 + μ)(c^2 + ν)/[(c^2 − a^2)(c^2 − b^2)]

(c) Alternative System (see Ref. 6.4 for other alternative systems). Introduce u, v, w by

    λ = −(a^2 + b^2 + c^2)/3 + 4℘(u)    μ = −(a^2 + b^2 + c^2)/3 + 4℘(v)    ν = −(a^2 + b^2 + c^2)/3 + 4℘(w)

so that (Sec. 21.6-2)

    du = dλ/√f(λ)    dv = dμ/√f(μ)    dw = dν/√f(ν)    with f(t) = (a^2 + t)(b^2 + t)(c^2 + t)

(d) g_λλ = (λ − μ)(λ − ν)/[4f(λ)]    g_μμ = (μ − λ)(μ − ν)/[4f(μ)]    g_νν = (ν − λ)(ν − μ)/[4f(ν)]

    g_uu = 4[℘(u) − ℘(v)][℘(u) − ℘(w)]
    g_vv = 4[℘(v) − ℘(u)][℘(v) − ℘(w)]
    g_ww = 4[℘(w) − ℘(u)][℘(w) − ℘(v)]

(e) ∇²Φ = {4√f(λ)/[(λ − μ)(λ − ν)]} ∂/∂λ [√f(λ) ∂Φ/∂λ]
        + {4√f(μ)/[(μ − λ)(μ − ν)]} ∂/∂μ [√f(μ) ∂Φ/∂μ]
        + {4√f(ν)/[(ν − λ)(ν − μ)]} ∂/∂ν [√f(ν) ∂Φ/∂ν]

        = (1/4) { (∂²Φ/∂u^2)/([℘(u) − ℘(v)][℘(u) − ℘(w)])
                + (∂²Φ/∂v^2)/([℘(v) − ℘(u)][℘(v) − ℘(w)])
                + (∂²Φ/∂w^2)/([℘(w) − ℘(u)][℘(w) − ℘(v)]) }
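A quick numerical check of Table 6.5-2 (our own sketch; a^2, b^2, c^2, λ, μ, ν below are arbitrary test values satisfying λ > −c^2 > μ > −b^2 > ν > −a^2): the point (x, y, z) produced by the transformation of (b) must lie on all three confocal quadrics of (a):

```python
# Our own sketch: verify that the transformation of Table 6.5-2(b) puts
# (x, y, z) on the quadric x^2/(a^2+t) + y^2/(b^2+t) + z^2/(c^2+t) = 1
# for t = lambda, mu, nu.  All numeric values are arbitrary test data.
a2, b2, c2 = 9.0, 4.0, 1.0            # a^2 > b^2 > c^2
lam, mu, nu = 2.0, -2.5, -6.0         # lam > -c^2 > mu > -b^2 > nu > -a^2

x2 = (a2 + lam)*(a2 + mu)*(a2 + nu)/((a2 - b2)*(a2 - c2))
y2 = (b2 + lam)*(b2 + mu)*(b2 + nu)/((b2 - a2)*(b2 - c2))
z2 = (c2 + lam)*(c2 + mu)*(c2 + nu)/((c2 - a2)*(c2 - b2))
assert min(x2, y2, z2) > 0            # squared coordinates must be positive

for t in (lam, mu, nu):               # each coordinate surface through the point
    s = x2/(a2 + t) + y2/(b2 + t) + z2/(c2 + t)
    assert abs(s - 1.0) < 1e-12
print("point lies on all three confocal quadrics")
```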
Table 6.5-3. Prolate Spheroidal Coordinates σ, τ, φ or u, v, φ

(a) Coordinate Surfaces (solve each equation to obtain σ, τ, φ in terms of x, y, z)

    (x^2 + y^2)/[a^2(σ^2 − 1)] + z^2/(a^2 σ^2) = 1    (prolate spheroids, σ ≥ 1)
    z^2/(a^2 τ^2) − (x^2 + y^2)/[a^2(1 − τ^2)] = 1    (hyperboloids of revolution, −1 ≤ τ ≤ 1)
    y = x tan φ    (planes through z axis)

All spheroids and hyperboloids have common foci (0, 0, a), (0, 0, −a); 2aσ and 2aτ are, respectively, the sum and difference of the focal radii of the point (x, y, z).

(b) Transformation to Prolate Spheroidal Coordinates

    x^2 = a^2(σ^2 − 1)(1 − τ^2) cos^2 φ
    y^2 = a^2(σ^2 − 1)(1 − τ^2) sin^2 φ
    z = aστ

(c) Alternative System (removes ambiguities)

    σ = cosh u    τ = cos v

(d) g_σσ = a^2(σ^2 − τ^2)/(σ^2 − 1)    g_ττ = a^2(σ^2 − τ^2)/(1 − τ^2)    g_φφ = a^2(σ^2 − 1)(1 − τ^2)

    g_uu = g_vv = a^2(sinh^2 u + sin^2 v)    g_φφ = a^2 sinh^2 u sin^2 v

(e) ∇²Φ = {1/[a^2(σ^2 − τ^2)]} { ∂/∂σ [(σ^2 − 1) ∂Φ/∂σ] + ∂/∂τ [(1 − τ^2) ∂Φ/∂τ] }
        + {1/[a^2(σ^2 − 1)(1 − τ^2)]} ∂²Φ/∂φ^2
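The sketch below (our own, with arbitrary test values a, σ, τ) checks the prolate-spheroidal relations of Table 6.5-3: the transformed point lies on the spheroid of parameter σ and the hyperboloid of parameter τ, and the focal radii to (0, 0, ±a) have sum 2aσ and difference 2aτ:

```python
import math

# Our own sketch; a, sigma (sg), tau (ta) are arbitrary test values with
# sigma >= 1 and |tau| <= 1.
a, sg, ta = 2.0, 1.8, 0.4
xy2 = a*a*(sg*sg - 1)*(1 - ta*ta)          # x^2 + y^2 from Table 6.5-3(b)
z = a*sg*ta

# spheroid of parameter sigma and hyperboloid of parameter tau (Table 6.5-3a)
assert abs(xy2/(a*a*(sg*sg - 1)) + z*z/(a*a*sg*sg) - 1) < 1e-12
assert abs(z*z/(a*a*ta*ta) - xy2/(a*a*(1 - ta*ta)) - 1) < 1e-12

# common foci at (0, 0, +a) and (0, 0, -a)
rho = math.sqrt(xy2)
r1 = math.hypot(rho, z - a)
r2 = math.hypot(rho, z + a)
assert abs((r1 + r2) - 2*a*sg) < 1e-12
assert abs(abs(r2 - r1) - 2*a*abs(ta)) < 1e-12
print("confocal spheroid/hyperboloid relations verified")
```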
Table 6.5-4. Oblate Spheroidal Coordinates σ, τ, φ or u, v, φ

(a) Coordinate Surfaces (solve each equation to obtain σ, τ, φ in terms of x, y, z)

    (x^2 + y^2)/[a^2(1 + σ^2)] + z^2/(a^2 σ^2) = 1    (oblate spheroids)
    (x^2 + y^2)/(a^2 τ^2)? — see below; the second family is
    (x^2 + y^2)/[a^2(1 − τ^2)] − z^2/(a^2 τ^2) = 1 ... 

    (x^2 + y^2)/[a^2(1 + σ^2)] + z^2/(a^2 σ^2) = 1    (oblate spheroids)
    (x^2 + y^2)/[a^2(1 − τ^2)] − z^2/(a^2 τ^2) = 1    (hyperboloids of revolution)
    y = x tan φ    (planes through z axis)

(b) Transformation to Oblate Spheroidal Coordinates

    x^2 = a^2(1 + σ^2)(1 − τ^2) cos^2 φ
    y^2 = a^2(1 + σ^2)(1 − τ^2) sin^2 φ
    z = aστ

(c) Alternative System (removes ambiguities)

    σ = sinh u    τ = cos v

(d) g_σσ = a^2(σ^2 + τ^2)/(1 + σ^2)    g_ττ = a^2(σ^2 + τ^2)/(1 − τ^2)    g_φφ = a^2(1 + σ^2)(1 − τ^2)

    g_uu = g_vv = a^2(sinh^2 u + cos^2 v)    g_φφ = a^2 cosh^2 u sin^2 v

(e) ∇²Φ = {1/[a^2(σ^2 + τ^2)]} { ∂/∂σ [(1 + σ^2) ∂Φ/∂σ] + ∂/∂τ [(1 − τ^2) ∂Φ/∂τ] }
        + {1/[a^2(1 + σ^2)(1 − τ^2)]} ∂²Φ/∂φ^2
Fig. 6.5-1. An orthogonal system of confocal ellipses and hyperbolas with foci F, F′. Such curves define an elliptic coordinate system in their plane and will generate coordinate surfaces of
1. A prolate spheroidal coordinate system if the figure is rotated about the axis P′OP (Table 6.5-3)
2. An oblate spheroidal coordinate system if the figure is rotated about the axis Q′OQ (Table 6.5-4)
3. An elliptic cylindrical coordinate system if the figure is translated at right angles to the plane of the paper (Table 6.5-5)
Fig. 6.5-2. An orthogonal system of confocal parabolas with focus F. Such curves define a parabolic coordinate system in their plane and will generate coordinate surfaces of
1. A parabolic coordinate system if the figure is rotated about the axis P′OP (Table 6.5-8)
2. A parabolic cylindrical coordinate system if the figure is translated at right angles to the plane of the paper (Table 6.5-9)
Table 6.5-5. Elliptic Cylindrical Coordinates σ, τ, z or u, v, z (also used as confocal elliptic coordinates in the xy plane; see also Fig. 6.5-1)

(a) Coordinate Surfaces (solve each equation to obtain σ, τ in terms of x, y)

    x^2/(a^2 σ^2) + y^2/[a^2(σ^2 − 1)] = 1    (right elliptic cylinders, σ ≥ 1)
    x^2/(a^2 τ^2) − y^2/[a^2(1 − τ^2)] = 1    (right hyperbolic cylinders, −1 ≤ τ ≤ 1)
    z = z    (planes parallel to xy plane)

(b) Transformation to Elliptic Cylindrical Coordinates

    x = aστ    y^2 = a^2(σ^2 − 1)(1 − τ^2)    z = z

(c) Alternative System (removes ambiguities)

    σ = cosh u    τ = cos v

(d) g_σσ = a^2(σ^2 − τ^2)/(σ^2 − 1)    g_ττ = a^2(σ^2 − τ^2)/(1 − τ^2)    g_zz = 1

    g_uu = g_vv = a^2(sinh^2 u + sin^2 v)    g_zz = 1

(e) ∇²Φ = {1/[a^2(sinh^2 u + sin^2 v)]} (∂²Φ/∂u^2 + ∂²Φ/∂v^2) + ∂²Φ/∂z^2
Table 6.5-6. Conical Coordinates r, v, w

(a) Coordinate Surfaces (solve each equation to obtain r, v, w in terms of x, y, z)

    x^2 + y^2 + z^2 = r^2    (spheres)
    x^2/v^2 + y^2/(v^2 − b^2) + z^2/(v^2 − c^2) = 0    (cones)    [c^2 > v^2 > b^2 > w^2]
    x^2/w^2 + y^2/(w^2 − b^2) + z^2/(w^2 − c^2) = 0    (cones)

(b) Transformation to Conical Coordinates

    x = rvw/(bc)
    y^2 = (r^2/b^2)(v^2 − b^2)(w^2 − b^2)/(b^2 − c^2)
    z^2 = (r^2/c^2)(v^2 − c^2)(w^2 − c^2)/(c^2 − b^2)

(c) g_rr = 1
    g_vv = r^2(v^2 − w^2)/[(v^2 − b^2)(c^2 − v^2)]
    g_ww = r^2(v^2 − w^2)/[(w^2 − b^2)(w^2 − c^2)]
Table 6.5-7. Paraboloidal Coordinates λ, μ, ν

(a) Coordinate Surfaces (solve each equation to obtain λ, μ, ν in terms of x, y, z)

    x^2/(A − λ) + y^2/(B − λ) = 2z + λ    (elliptic paraboloids)
    x^2/(A − μ) + y^2/(B − μ) = 2z + μ    (hyperbolic paraboloids)
    x^2/(A − ν) + y^2/(B − ν) = 2z + ν    (elliptic paraboloids)
    (ν > A > μ > B > λ)

(b) Transformation to Paraboloidal Coordinates

    x^2 = (A − λ)(A − μ)(A − ν)/(B − A)
    y^2 = (B − λ)(B − μ)(B − ν)/(A − B)
    z = (1/2)(A + B − λ − μ − ν)

(c) g_λλ = (1/4)(μ − λ)(ν − λ)/[(A − λ)(B − λ)]
    g_μμ = (1/4)(ν − μ)(λ − μ)/[(A − μ)(B − μ)]
    g_νν = (1/4)(λ − ν)(μ − ν)/[(A − ν)(B − ν)]
Table 6.5-8. Parabolic Coordinates σ, τ, φ

(a) Coordinate Surfaces (solve each equation to obtain σ, τ, φ in terms of x, y, z)

    (x^2 + y^2)/σ^2 = 2z + σ^2
    (x^2 + y^2)/τ^2 = −2z + τ^2    (confocal paraboloids of revolution; foci at origin)
    y = x tan φ    (planes through z axis)

(b) Transformation to Parabolic Coordinates

    x = στ cos φ    y = στ sin φ    z = (1/2)(τ^2 − σ^2)

(c) g_σσ = g_ττ = σ^2 + τ^2    g_φφ = σ^2 τ^2

(d) ∇²Φ = {1/(σ^2 + τ^2)} [ (1/σ) ∂/∂σ (σ ∂Φ/∂σ) + (1/τ) ∂/∂τ (τ ∂Φ/∂τ) ] + (1/(σ^2 τ^2)) ∂²Φ/∂φ^2
Table 6.5-9. Parabolic Cylindrical Coordinates σ, τ, z

(a) Coordinate Surfaces

    x^2/σ^2 = 2y + σ^2,  x^2/τ^2 = −2y + τ^2    (confocal right parabolic cylinders)
    z = z    (planes parallel to xy plane)

(b) Transformation to Parabolic Cylindrical Coordinates

    x = στ    y = (1/2)(τ^2 − σ^2)    z = z

(c) g_σσ = g_ττ = σ^2 + τ^2    g_zz = 1

(d) ∇²Φ = {1/(σ^2 + τ^2)} (∂²Φ/∂σ^2 + ∂²Φ/∂τ^2) + ∂²Φ/∂z^2
Fig. 6.5-3. A family of circles through two poles A, B (OA = OB = a), and the family of circles orthogonal to those of the first family. Such curves define a bipolar coordinate system in their plane and will generate coordinate surfaces of
1. A bipolar coordinate system if the figure is translated at right angles to the plane of the paper (Table 6.5-10)
2. A toroidal coordinate system if the figure is rotated about the axis Q′OQ (Table 6.5-11)
Table 6.5-10. Bipolar Coordinates σ, τ, z (also used as bipolar coordinates in the xy plane; see also Fig. 6.5-3)

(a) Coordinate Surfaces

    x^2 + (y − a cot σ)^2 = a^2/sin^2 σ = a^2(cot^2 σ + 1)    (right circular cylinders)
    (x − a coth τ)^2 + y^2 = a^2/sinh^2 τ = a^2(coth^2 τ − 1)    (right circular cylinders)
    z = z    (planes parallel to xy plane)

For any point (x, y) in the xy plane, σ is the angle subtended at (x, y) by the two poles (−a, 0) and (a, 0); e^τ is the ratio of the polar radii to (x, y).

(b) Transformations

    x = a sinh τ/(cosh τ − cos σ)    y = a sin σ/(cosh τ − cos σ)    z = z
    τ = (1/2) log_e {[(x + a)^2 + y^2]/[(x − a)^2 + y^2]}

(c) g_σσ = g_ττ = a^2/(cosh τ − cos σ)^2    g_zz = 1

(d) ∇²Φ = {(cosh τ − cos σ)^2/a^2} (∂²Φ/∂σ^2 + ∂²Φ/∂τ^2) + ∂²Φ/∂z^2

Table 6.5-11. Toroidal Coordinates σ, τ, φ
(a) Coordinate Surfaces

    x^2 + y^2 + (z − a cot σ)^2 = a^2/sin^2 σ    (spheres)
    (√(x^2 + y^2) − a coth τ)^2 + z^2 = a^2/sinh^2 τ    (tores or anchor rings)
    y = x tan φ    (planes through z axis)

(b) Transformations

    x = a sinh τ cos φ/(cosh τ − cos σ)
    y = a sinh τ sin φ/(cosh τ − cos σ)
    z = a sin σ/(cosh τ − cos σ)
    τ = (1/2) log_e {[(√(x^2 + y^2) + a)^2 + z^2]/[(√(x^2 + y^2) − a)^2 + z^2]}
    φ = arctan (y/x)

(c) g_σσ = g_ττ = a^2/(cosh τ − cos σ)^2    g_φφ = a^2 sinh^2 τ/(cosh τ − cos σ)^2

(d) ∇²Φ = {(cosh τ − cos σ)^3/(a^2 sinh τ)} { ∂/∂τ [ (sinh τ/(cosh τ − cos σ)) ∂Φ/∂τ ]
        + ∂/∂σ [ (sinh τ/(cosh τ − cos σ)) ∂Φ/∂σ ]
        + (1/(sinh τ (cosh τ − cos σ))) ∂²Φ/∂φ^2 }
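Tables 6.5-10 and 6.5-11 share the same meridian-plane construction (Fig. 6.5-3). The sketch below (our own, with arbitrary test values a, σ, τ) verifies the bipolar transformation of Table 6.5-10(b) against the two circle families of (a) and against the logarithmic expression for τ:

```python
import math

# Our own sketch; a, sigma (sg), tau (ta) are arbitrary test values.
a, sg, ta = 1.5, 1.1, 0.8
d = math.cosh(ta) - math.cos(sg)
x = a*math.sinh(ta)/d
y = a*math.sin(sg)/d

# circle of the sigma family: x^2 + (y - a cot(sg))^2 = a^2/sin^2(sg)
assert abs(x*x + (y - a/math.tan(sg))**2 - (a/math.sin(sg))**2) < 1e-10

# circle of the tau family: (x - a coth(ta))^2 + y^2 = a^2/sinh^2(ta)
coth = math.cosh(ta)/math.sinh(ta)
assert abs((x - a*coth)**2 + y*y - (a/math.sinh(ta))**2) < 1e-10

# tau = (1/2) log of the squared ratio of the polar radii to (x, y)
ta_back = 0.5*math.log(((x + a)**2 + y*y)/((x - a)**2 + y*y))
assert abs(ta_back - ta) < 1e-10
print("bipolar transformation consistent")
```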
CHAPTER 7

FUNCTIONS OF A COMPLEX VARIABLE
7.1. Introduction
    7.1-1. Introductory Remarks
7.2. Functions of a Complex Variable. Regions of the Complex-number Plane
    7.2-1. Functions of a Complex Variable
    7.2-2. z Plane and w Plane. Neighborhoods
    7.2-3. Curves and Contours
    7.2-4. Boundaries and Regions
    7.2-5. Complex Contour Integrals
7.3. Analytic (Regular, Holomorphic) Functions
    7.3-1. Derivative of a Function
    7.3-2. The Cauchy-Riemann Equations
    7.3-3. Analytic Functions
    7.3-4. Properties of Analytic Functions
    7.3-5. The Maximum-modulus Theorem
7.4. Treatment of Multiple-valued Functions
    7.4-1. Branches
    7.4-2. Branch Points and Branch Cuts
    7.4-3. Riemann Surfaces
7.5. Integral Theorems and Series Expansions
    7.5-1. Integral Theorems
    7.5-2. Taylor-series Expansion
    7.5-3. Laurent-series Expansion
7.6. Zeros and Isolated Singularities
    7.6-1. Zeros
    7.6-2. Singularities
    7.6-3. Zeros and Singularities at Infinity
    7.6-4. Weierstrass's and Picard's Theorems
    7.6-5. Integral Functions
    7.6-6. Product Expansion of an Integral Function
    7.6-7. Meromorphic Functions
    7.6-8. Partial-fraction Expansion of Meromorphic Functions
    7.6-9. Zeros and Poles of Meromorphic Functions
7.7. Residues and Contour Integration
    7.7-1. Residues
    7.7-2. The Residue Theorem
    7.7-3. Evaluation of Definite Integrals
    7.7-4. Use of Residues for the Summation of Series
7.8. Analytic Continuation
    7.8-1. Analytic Continuation and Monogenic Analytic Functions
    7.8-2. Methods of Analytic Continuation
7.9. Conformal Mapping
    7.9-1. Conformal Mapping
    7.9-2. Bilinear Transformations
    7.9-3. The Transformation w = (1/2)(z + 1/z)
    7.9-4. The Schwarz-Christoffel Transformation
    7.9-5. Table of Transformations
7.10. Functions Mapping Specified Regions onto the Unit Circle
    7.10-1. Riemann's Mapping Theorem
7.11. Related Topics, References, and Bibliography
    7.11-1. Related Topics
    7.11-2. References and Bibliography
7.1. INTRODUCTION
7.1-1. The theory of analytic functions of a complex variable furnishes the scientist or engineer with many useful mathematical models. Many mathematical theorems are simplified if the real variables are considered as special values of complex variables. Complex variables are used to describe two-dimensional vectors in physics; analytic functions of a complex variable describe two-dimensional scalar and vector fields (Secs. 15.6-8 and 15.6-9). Finally, analytic functions of a complex variable represent conformal mappings of the points of a plane into another plane (Secs. 7.9-1 to 7.9-5).

7.2. FUNCTIONS OF A COMPLEX VARIABLE. REGIONS OF THE COMPLEX-NUMBER PLANE
7.2-1. Functions of a Complex Variable (see also Secs. 1.3-2 and 4.2-1 and Table 7.2-1; refer to Chap. 21 for additional examples). A complex function

    w = f(z) = u(x, y) + iv(x, y) = |w| e^(iΘ)    (z = x + iy = |z| e^(iφ))        (7.2-1)

associates one or more values of the complex dependent variable w with each value of the complex independent variable z in a given domain of definition.
Single-valued, multiple-valued, and bounded functions of a complex variable are defined as in Secs. 4.2-2a and 4.3-3b. Limits of complex functions and sequences and continuity of complex functions, as well as convergence, absolute convergence, and uniform convergence of complex infinite series and improper integrals, are defined as in Chap. 4; the theorems
Table 7.2-1. Real and Imaginary Parts, Zeros, and Singularities for a Number of Frequently Used Functions f(z) = u(x, y) + iv(x, y) of a Complex Variable z = x + iy (see also Secs. 1.3-2 and 21.2-9 to 21.2-11)
Note |f(z)| = √(u^2 + v^2), arg f(z) = arctan (v/u); throughout, k = 0, ±1, ±2, . . .

f(z) = z^2:  u = x^2 − y^2,  v = 2xy
    Zero (m = 2) at z = 0; pole (m = 2) at z = ∞
f(z) = 1/z:  u = x/(x^2 + y^2),  v = −y/(x^2 + y^2)
    Zero (m = 1) at z = ∞; pole (m = 1) at z = 0
f(z) = 1/z^2:  u = (x^2 − y^2)/(x^2 + y^2)^2,  v = −2xy/(x^2 + y^2)^2
    Zero (m = 2) at z = ∞; pole (m = 2) at z = 0
f(z) = 1/[z − (a + ib)] (a, b real):  u = (x − a)/[(x − a)^2 + (y − b)^2],  v = −(y − b)/[(x − a)^2 + (y − b)^2]
    Zero (m = 1) at z = ∞; pole (m = 1) at z = a + ib
f(z) = √z (branch corresponding to k = 0 only):  u = √{[√(x^2 + y^2) + x]/2},  v = √{[√(x^2 + y^2) − x]/2}
    Zero of order 1 at z = 0 (branch point); branch point (m = 1) at z = 0, branch point (m = 1) at z = ∞
f(z) = e^z:  u = e^x cos y,  v = e^x sin y
    No zeros; essential singularity at z = ∞
f(z) = log_e z:  u = (1/2) log_e (x^2 + y^2),  v = arctan (y/x) + 2kπ
    z = 1 for the branch k = 0; branch points of infinite order at z = 0 and z = ∞ (both are essential singularities)
f(z) = sin z:  u = sin x cosh y,  v = cos x sinh y
    Zeros (m = 1) at z = kπ; essential singularity at z = ∞
f(z) = cos z:  u = cos x cosh y,  v = −sin x sinh y
    Zeros (m = 1) at z = (k + 1/2)π; essential singularity at z = ∞
f(z) = sinh z:  u = sinh x cos y,  v = cosh x sin y
    Zeros (m = 1) at z = ikπ; essential singularity at z = ∞
f(z) = cosh z:  u = cosh x cos y,  v = sinh x sin y
    Zeros (m = 1) at z = i(k + 1/2)π; essential singularity at z = ∞
f(z) = tan z:  u = sin 2x/(cos 2x + cosh 2y),  v = sinh 2y/(cos 2x + cosh 2y)
    Zeros (m = 1) at z = kπ; poles (m = 1) at z = (k + 1/2)π; essential singularity at z = ∞
f(z) = tanh z:  u = sinh 2x/(cosh 2x + cos 2y),  v = sin 2y/(cosh 2x + cos 2y)
    Zeros (m = 1) at z = ikπ; poles (m = 1) at z = i(k + 1/2)π; essential singularity at z = ∞
of Chap. 4 apply to complex functions and variables unless a restriction to real quantities is specifically stated. In particular, every complex power series Σ_(k=0)^∞ a_k(z − a)^k has a real radius of convergence r_c (0 ≤ r_c ≤ ∞) such that the series converges uniformly and absolutely for |z − a| < r_c and diverges for |z − a| > r_c (Sec. 4.10-2a).

7.2-2. z Plane and w Plane. Neighborhoods. The Point at Infinity (see also Sec. 4.3-5). Values of the independent variable z = x + iy are associated with unique points (x, y) of an Argand plane (Sec. 1.3-2), the z plane. Values of w = u + iv are similarly associated with points (u, v) of a w plane.
An (open) δ-neighborhood of the point z = a in the finite portion of the plane is defined as the set of points z such that |z − a| < δ for some δ > 0.
The point at infinity (z = ∞) is defined as the point transformed into the origin by the transformation z → 1/z. A region containing the exterior of any circle is a neighborhood of the point z = ∞.

7.2-3. Curves and Contours (see also Secs. 2.1-9 and 3.1-13). A continuous (arc or segment of a) curve in the z plane is a set of points z = x + iy such that
    z = z(t)    or    x = x(t),  y = y(t)    (−∞ < t_1 ≤ t ≤ t_2 < ∞)        (7.2-2)

where x(t) and y(t) are continuous functions of the real parameter t. A (portion of a) continuous curve (2) is a simple curve (Jordan arc) if and only if it consists of a single branch without multiple points, so that the functions x(t) and y(t) are single-valued, and the set of equations

    x(τ_1) = x(τ_2)    y(τ_1) = y(τ_2)

has no distinct solutions τ_1, τ_2 in the closed interval [t_1, t_2]. A simple closed curve (closed Jordan curve) is a continuous curve consisting of a single branch without multiple points except for a common initial and terminal point. A simple curve or simple closed curve will be referred to as a (simple) contour if and only if it is rectifiable (Sec. 4.6-9).* The element of distance between suitable points z and z + dz on a contour (2) is ds = |dz| = √(dx^2 + dy^2).

7.2-4. Boundaries and Regions (see also Secs. 4.3-6, 7.9-1b, and 12.5-1). The geometry of the complex-number plane (including definitions of distances and angles)
* Some authors restrict the use of the term contour to regular curves (Sec. 3.1-13).
is identical with the geometry of the Euclidean plane of points (x, y) or vectors r = ix + jy for finite values of x and y, but the definition of z = ∞ (Sec. 7.2-2) introduces a topology different from that usually associated with plane geometry. The points z of the complex-number plane can be represented homeomorphically (Sec. 12.5-1) by corresponding points of a sphere with longitude arg z and colatitude 2 arccot (|z|/2) (stereographic projection), so that z = 0 and z = ∞ correspond to opposite poles.
The points of every simple closed curve C separate the plane into two singly connected open regions: every continuous curve containing a point of each of the two regions contains a point of their common boundary C (Jordan Separation Theorem). If C does not contain z = ∞, one of the two regions is bounded (i.e., situated entirely in the finite portion of the plane, where |z| is bounded) and the other is unbounded; if C contains z = ∞, both regions are unbounded. More generally, the boundary C of a given region or domain D may be a multiplicity of nonintersecting simple closed curves (multiply connected region, see also Sec. 4.3-6). In any case, the positive direction (positive sense) on the boundary curve is defined as that leaving the region D (interior of the boundary) on the left (counterclockwise for outside boundaries, see also Fig. 7.5-1). The (open) set of points on one side of a boundary curve C is an open region, and the (closed) set of points on one side and on the boundary is a closed region.
7.2-5. Complex Contour Integrals (see also Secs. 4.6-1 and 4.6-10). One defines

    ∫_C f(z) dz = lim_(max |z_i − z_(i−1)| → 0) Σ_(i=1)^m f(ζ_i)(z_i − z_(i−1))        (7.2-3a)

where the points z_0, ζ_1, z_1, ζ_2, z_2, . . . , ζ_m, z_m lie in this order on the contour C connecting z = a ≡ z_0 and z = b ≡ z_m. If the limit (3a) exists, then

    ∫_C f(z) dz = ∫_C [u(x, y) dx − v(x, y) dy] + i ∫_C [v(x, y) dx + u(x, y) dy]        (7.2-3b)

where the real line integrals are taken over the same path as the complex integral. The integration rules of Table 4.6-1 apply; in particular, reversal of the sense of integration on the contour C reverses the sign of the integral.
If f(z) is bounded in absolute value by M, and u(x, y) and v(x, y) are of bounded variation (Sec. 4.4-8b) on a contour C of finite length L, then the integral (3) exists, and

    |∫_C f(z) dz| ≤ ML        (7.2-4)
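The bound (7.2-4) can be seen numerically (our own sketch, not from the handbook): for f(z) = 1/z on the unit circle, M = 1 and L = 2π, while the integral is 2πi, so the bound is attained with equality:

```python
import cmath, math

# Our own sketch: f(z) = 1/z on the unit circle, M = 1, L = 2*pi.
n = 20000
total = 0.0 + 0.0j
for k in range(n):
    t0 = 2*math.pi*k/n
    t1 = 2*math.pi*(k + 1)/n
    z0, z1 = cmath.exp(1j*t0), cmath.exp(1j*t1)
    zm = cmath.exp(1j*(t0 + t1)/2)        # midpoint sample on the arc
    total += (1/zm)*(z1 - z0)

assert abs(total - 2j*math.pi) < 1e-4     # the integral is 2*pi*i
assert abs(total) <= 1.0*2*math.pi        # |integral| <= M*L
print("contour integral of 1/z:", total)
```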
If C contains z = ∞, or if f(z) is not bounded on C, the integral (3) can often be defined as an improper integral in the manner of Sec. 4.6-2.
7.3. ANALYTIC (REGULAR, HOLOMORPHIC) FUNCTIONS
7.3-1. Derivative of a Function (see also Sec. 4.5-1). A function w = f(z) is differentiable at the point z = a if and only if the limit

    dw/dz = f′(z) = lim_(Δz→0) [f(z + Δz) − f(z)]/Δz        (7.3-1)

[derivative of f(z) with respect to z] exists for z = a and is independent of the manner in which Δz approaches zero. A function may be differentiable at a point (e.g., |z|^2 at z = 0), on a curve, or throughout a region.

7.3-2. The Cauchy-Riemann Equations. f(z) = u(x, y) + iv(x, y) is differentiable at the point z = x + iy if and only if u(x, y) and v(x, y) are continuously differentiable throughout a neighborhood of z, and

    ∂u/∂x = ∂v/∂y    ∂u/∂y = −∂v/∂x    (Cauchy-Riemann equations)        (7.3-2)

at the point z, so that

    dw/dz = ∂u/∂x + i ∂v/∂x = ∂v/∂y − i ∂u/∂y        (7.3-3)
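A numerical illustration (ours, not from the handbook): for f(z) = e^z = e^x cos y + i e^x sin y, the Cauchy-Riemann equations (7.3-2) hold at every point, and Eq. (7.3-3) reproduces f′(z) = e^z:

```python
import math, cmath

# Our own illustration for f(z) = e^z: u = e^x cos y, v = e^x sin y.
u = lambda x, y: math.exp(x)*math.cos(y)
v = lambda x, y: math.exp(x)*math.sin(y)

def dx(f, x, y, h=1e-6):
    return (f(x + h, y) - f(x - h, y))/(2*h)

def dy(f, x, y, h=1e-6):
    return (f(x, y + h) - f(x, y - h))/(2*h)

x, y = 0.4, -1.2
assert abs(dx(u, x, y) - dy(v, x, y)) < 1e-8     # du/dx = dv/dy
assert abs(dy(u, x, y) + dx(v, x, y)) < 1e-8     # du/dy = -dv/dx

# Eq. (7.3-3): f'(z) = du/dx + i dv/dx, and indeed f'(z) = e^z
fp = complex(dx(u, x, y), dx(v, x, y))
assert abs(fp - cmath.exp(complex(x, y))) < 1e-8
print("Cauchy-Riemann equations verified for e^z at", (x, y))
```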
7.3-3. Analytic Functions. (a) A single-valued function f(z) shall be called analytic (regular, holomorphic)* at the point z = a if and only if f(z) is differentiable throughout a neighborhood of z = a. f(z) is analytic at z = a if and only if f(z) can be represented by a power series f(z) = Σ_(k=0)^∞ a_k(z − a)^k convergent throughout a neighborhood of z = a (alternative definition). Refer to Secs. 7.4-1 to 7.4-3 for an extension of the definition to multiple-valued functions.

(b) f(z) is analytic at infinity if and only if F(z̄) ≡ f(1/z̄) is analytic at z̄ = 0. One defines f′(∞) = lim_(z→∞) [−z^2 df/dz]. f(z) is analytic at infinity if and only if f(z) can be expressed as a convergent series of negative powers, f(z) = Σ_(k=0)^∞ b_k(z − a)^(−k), for sufficiently large values of |z| (see also Sec. 7.5-3).
7.3-4. Properties of Analytic Functions. Let f(z) be analytic throughout an open region D. Then, throughout D,
* The terms differentiable, analytic, regular, and holomorphic are used interchangeably by some authors.
1. The Cauchy-Riemann equations (2) are satisfied (the converse is true).
2. u(x, y) and v(x, y) are conjugate harmonic functions (Sec. 15.6-8).
3. All derivatives of f(z) with respect to z exist and are analytic (see also Sec. 7.5-1).
If the open region D is simply connected (this applies, in particular, to the exterior of a bounded simply connected region),
4. The integral ∫_a^z f(ζ) dζ is independent of the path of integration, provided that the path is a contour of finite length situated entirely in D; the integral is a single-valued analytic function of z, and its derivative is f(z) (see also Sec. 7.5-1).
5. The values of f(z) on a contour arc or a subregion in D define f(z) uniquely throughout D.
All ordinary differentiation and integration rules (Secs. 4.5-4 and 4.6-1) apply to analytic functions of a complex variable. If f(z) is analytic at z = a and f′(a) ≠ 0, then f(z) has an analytic inverse function (Sec. 4.2-2a) at z = a. If W = F(w) and w = f(z) are analytic, then W is an analytic function of z. If a sequence (or an infinite series, Sec. 4.8-1) of functions f_i(z) analytic throughout an open region D converges uniformly to the limit f(z) throughout D, then f(z) is analytic, and the sequence (or series) of the derivatives f′_i(z) converges uniformly to f′(z) throughout D. The sequence (or series) of contour integrals ∫_C f_i(z) dz over any contour C of finite length in D converges uniformly to ∫_C f(z) dz.
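Property 4 above (path independence in a simply connected region) can be illustrated numerically (our own sketch): integrating the entire function f(z) = e^z from a to b along two different polygonal contours gives the same value, e^b − e^a:

```python
import cmath

# Our own sketch: integrate f(z) = e^z from a = 0 to b = 1 + i along two
# different polygonal contours; both must give e^b - e^a.
f = cmath.exp
a, b = 0j, 1 + 1j

def integrate(vertices, n=2000):
    # midpoint-rule contour integral over a polygonal path
    total = 0j
    for z0, z1 in zip(vertices[:-1], vertices[1:]):
        for k in range(n):
            s0 = z0 + (z1 - z0)*k/n
            s1 = z0 + (z1 - z0)*(k + 1)/n
            total += f((s0 + s1)/2)*(s1 - s0)
    return total

path1 = [a, b]                 # straight segment
path2 = [a, 1 + 0j, b]         # along the real axis, then vertically up
exact = f(b) - f(a)
assert abs(integrate(path1) - exact) < 1e-5
assert abs(integrate(path2) - exact) < 1e-5
print("both contours give ~", exact)
```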
region D cannot have a maximum in the interior of D.
If \f(z)\ < M on
the boundary of D, then \f(z)\ < M throughout the interior of D unless f(z) is a constant (see also Sec. 15.6-4). 7.4. TREATMENT OF MULTIPLE-VALUED FUNCTIONS
7.4-1. Branches. One extends the theory of analytic functions to suitable multiple-valued functions by considering branches f_1(z), f_2(z), . . . of f(z), each defined as a single-valued continuous function throughout its region of definition. Each branch assumes one set of the function values of f(z) (see also Sec. 7.8-1).
7.4-2. Branch Points and Branch Cuts. (a) Given a number of branches of f(z) analytic throughout a neighborhood D of z = a, except possibly at z = a, the point z = a is a branch point involving the given branches* of f(z) if and only if f(z) passes from one of these branches to another as a variable point z in D describes a closed circuit about z = a. The order of the branch point is the number m of branches reached by f(z) before returning to the original or (m + 1)st branch as z describes successive closed circuits about z = a. If f(z) is defined at a branch point z = a, the function value f(a) is common to all branches "joining" at z = a (EXAMPLE: √z has a branch point of order 2 at z = 0). The point z = ∞ is a branch point of f(z) if and only if the origin is a branch point of f(1/z) (see also Sec. 7.6-3).
Given a function w = f(z) whose inverse function Φ(w) exists and is single-valued throughout a neighborhood of w = f(a), the point z = a ≠ ∞ is a branch point of order m of f(z) whenever Φ(w) has a zero of order m (Sec. 7.6-1) or a pole of order m + 2 (Sec. 7.6-2) at w = f(a) (EXAMPLES: w = √z and w = 1/√z, a = 0). Similarly, if Φ(w) exists and is single-valued throughout a neighborhood of w = f(∞), the point z = ∞ is a branch point of order m of f(z) if Φ(w) has a zero of order m + 2 or a pole of order m at w = f(∞) (EXAMPLES: w = √z and w = 1/√z).
(b) The individual single-valued branches of f(z) are defined in regions bounded by branch cuts, which are simple curves chosen so that no closed circuit around a branch point lies within the region of definition of a single branch. The choice of branches and branch cuts for a given function f(z) is not unique, but the branch points and the number of branches are uniquely defined (Fig. 7.4-1). All branches of a monogenic analytic function may be obtained by analytic continuation of successive elements (Sec. 7.8-1).

7.4-3. Riemann Surfaces. It is frequently useful to represent a multiple-valued function f(z) as a single-valued function defined on a Riemann surface which consists of a multiplicity of z planes or "sheets" corresponding to the branches of f(z) and joined along suitably chosen branch cuts. A circuit around a branch point on such a surface transfers the variable point z between two sheets corresponding to two branches of f(z). If both w = f(z) and its inverse are multiple-valued functions, then both the z plane and the w plane may be replaced by suitable Riemann surfaces; w = f(z) will now define a reciprocal one-to-one correspondence (mapping) between the points of the two Riemann surfaces, except at branch points.
Note: The Riemann surface for a monogenic analytic function (obtained by analytic continuation, Sec. 7.8-1) must be connected; thus the multiple-valued function f(z) = ±1, although analytic everywhere, is not a monogenic analytic function. Many theorems about single-valued analytic functions apply also to multiple-valued monogenic analytic functions defined on suitable Riemann surfaces without
* Note that f(z) may have other branches which do not join the given branches at z = a and which may or may not be analytic at z = a.
Fig. 7.4-1. Branch points and branch cuts in the z plane for some elementary functions: w = √z, w = √(1 − z²), w = √(z² + 1), w = log_e z = ∫₁^z dζ/ζ, w = arc sin z = ∫₀^z dζ/√(1 − ζ²), and w = arc tan z = ∫₀^z dζ/(1 + ζ²).
restriction to a single branch. The construction of Riemann surfaces for arbitrary functions may require considerable ingenuity (Refs. 7.8 and 7.15).
Throughout this handbook, statements about analytic functions refer to single-valued analytic functions or to single-valued branches of monogenic analytic functions unless specific reference to multiple-valued functions is made.

7.5. INTEGRAL THEOREMS AND SERIES EXPANSIONS
7.5-1. Integral Theorems. Let z be a point inside the boundary contour C of a region D throughout which f(z) is analytic, and let f(z) be analytic on C. Then

$$\oint_C f(\zeta)\,d\zeta = 0 \qquad \text{(Cauchy-Goursat integral theorem)} \tag{7.5-1}$$

$$f(z) = \frac{1}{2\pi i}\oint_C \frac{f(\zeta)}{\zeta - z}\,d\zeta \qquad \text{(Cauchy's integral formula)} \tag{7.5-2}$$

Figure 7.5-1 illustrates the application of the Cauchy-Goursat integral theorem (1) to multiply connected domains (see also Sec. 7.7-1). Equation (2) yields f(z) and its derivatives in terms of the boundary values of f(z). Specifically,

$$f'(z) = \frac{1}{2\pi i}\oint_C \frac{f(\zeta)}{(\zeta - z)^2}\,d\zeta \qquad \cdots \qquad f^{(n)}(z) = \frac{n!}{2\pi i}\oint_C \frac{f(\zeta)}{(\zeta - z)^{n+1}}\,d\zeta \tag{7.5-3}$$

Note that $\oint_C \frac{f(\zeta)}{\zeta - z}\,d\zeta$ is an analytic function of z even if f(z) is not analytic throughout D and only continuous on C; Eq. (1) holds if f(z) is also analytic in D. A continuous single-valued function f(z) is analytic throughout the bounded region D if Eq. (1) holds for every closed contour C in D enclosing only points of D (Morera's Theorem).
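Both statements are easy to check numerically, since the trapezoidal rule on a circle converges extremely fast for analytic integrands. A minimal sketch (names are ours; f = e^z is an arbitrary test function):

```python
import cmath

def contour_integral(f, center, radius, n=2000):
    """Approximate the contour integral of f over the positively
    oriented circle |zeta - center| = radius by the trapezoidal rule."""
    total = 0j
    for k in range(n):
        t = 2 * cmath.pi * k / n
        zeta = center + radius * cmath.exp(1j * t)
        dzeta = 1j * radius * cmath.exp(1j * t) * (2 * cmath.pi / n)
        total += f(zeta) * dzeta
    return total

z0 = 0.3 + 0.1j

# Eq. (7.5-1): the integral of an analytic function around C vanishes.
goursat = contour_integral(cmath.exp, 0, 1.0)

# Eq. (7.5-2): f(z0) recovered from the boundary values of f alone.
cauchy = contour_integral(lambda zeta: cmath.exp(zeta) / (zeta - z0),
                          0, 1.0) / (2j * cmath.pi)
```

Here `goursat` is numerically zero and `cauchy` agrees with e^{z0} to machine-level accuracy.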
7.5-2. Taylor-series Expansion (see also Sec. 4.10-4). (a) If f(z) is analytic inside and on the circle K of radius r about z = a, then there exists a unique and uniformly convergent series expansion in powers of (z − a)

$$f(z) = \sum_{k=0}^{\infty} a_k (z - a)^k \qquad (|z - a| < r,\; a \neq \infty) \tag{7.5-4}$$

with

$$a_k = \frac{f^{(k)}(a)}{k!} = \frac{1}{2\pi i}\oint_K \frac{f(\zeta)}{(\zeta - a)^{k+1}}\,d\zeta$$

The largest circle K_c: |z − a| = r_c all of whose interior points are inside the region where f(z) is analytic is the convergence circle of the power series (4); r_c is the radius of convergence (Sec. 4.10-2a). A number of useful power-series expansions are tabulated in Sec. E-7.
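The coefficient formula doubles as a practical recipe: sampling f on a circle and taking a discrete Fourier transform evaluates the contour integrals for all a_k at once. A sketch under our own naming, with f(z) = e^z about a = 0, where a_k must equal 1/k!:

```python
import cmath
import math

def taylor_coeffs(f, a, radius, kmax, n=1024):
    """a_k = f^(k)(a)/k! via the contour integral of Eq. (7.5-4): the
    trapezoidal sum over the circle |zeta - a| = radius is exactly a
    discrete Fourier transform of the samples of f on that circle."""
    samples = [f(a + radius * cmath.exp(2j * cmath.pi * j / n))
               for j in range(n)]
    coeffs = []
    for k in range(kmax + 1):
        s = sum(samples[j] * cmath.exp(-2j * cmath.pi * k * j / n)
                for j in range(n))
        coeffs.append(s / (n * radius ** k))
    return coeffs

coeffs = taylor_coeffs(cmath.exp, 0.0, 1.0, 5)   # expect 1, 1, 1/2, 1/6, ...
```

The only error is aliasing by coefficients a_{k+n}, a_{k+2n}, . . . , which is negligible here.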
Fig. 7.5-1. Application of Cauchy's integral theorem to a multiply connected region of the z plane (Secs. 7.2-4 and 7.5-1). A region D bounded by exterior contours C₁, C₂, . . . and interior contours C′₁, C′₂, . . . is made simply connected by cuts (shown in broken lines). The integrals over each cut cancel, and Cauchy's integral theorem becomes

$$\sum_i \oint_{C_i} f(\zeta)\,d\zeta - \sum_i \oint_{C_i'} f(\zeta)\,d\zeta = 0$$

where all integrals are taken in the positive (counterclockwise) direction. The same technique applies if D is not bounded.
(b) If M(r) is an upper bound of |f(z)| on K, then

$$|a_k| = \frac{1}{k!}\left|f^{(k)}(a)\right| \le \frac{M(r)}{r^k} \qquad \text{(Cauchy's inequality)} \tag{7.5-5}$$

(c) If Taylor's series (4) is terminated with the term $a_{n-1}(z - a)^{n-1}$, the remainder $R_n(z)$ is given by

$$R_n(z) = \frac{(z - a)^n}{2\pi i}\oint_K \frac{f(\zeta)\,d\zeta}{(\zeta - a)^n(\zeta - z)} \qquad |R_n(z)| \le M(r)\left(\frac{|z - a|}{r}\right)^n \frac{r}{r - |z - a|} \tag{7.5-6}$$
7.5-3. Laurent-series Expansion. (a) If f(z) is analytic throughout the annular region between and on the concentric circles K₁ and K₂ centered at z = a and of radii r₁ and r₂ < r₁, respectively, there exists a unique series expansion in terms of positive and negative powers of (z − a),

$$f(z) = \sum_{k=0}^{\infty} a_k (z - a)^k + \sum_{k=1}^{\infty} b_k (z - a)^{-k} \qquad (r_1 > |z - a| > r_2) \tag{7.5-7}$$

with

$$a_k = \frac{1}{2\pi i}\oint_{K_1} \frac{f(\zeta)\,d\zeta}{(\zeta - a)^{k+1}} \qquad b_k = \frac{1}{2\pi i}\oint_{K_2} f(\zeta)(\zeta - a)^{k-1}\,d\zeta$$

The first term of Eq. (7) is analytic and converges uniformly for |z − a| < r₁; the second term [the principal part of f(z)] is analytic and converges uniformly for |z − a| > r₂.
Note: The case a = ∞ is treated by using the transformation z′ = 1/z, which transforms z = ∞ into the origin.
(b) If the first term in Eq. (7) is terminated with $a_{n-1}(z - a)^{n-1}$, the remainder $R_n(z)$ is given by

$$R_n(z) = \frac{(z - a)^n}{2\pi i}\oint_{K_1} \frac{f(\zeta)\,d\zeta}{(\zeta - a)^n(\zeta - z)} \qquad |R_n(z)| \le M(r_1)\left(\frac{|z - a|}{r_1}\right)^n \frac{r_1}{r_1 - |z - a|} \tag{7.5-8}$$

If the second term in Eq. (7) is terminated with $b_{n-1}(z - a)^{-(n-1)}$, the remainder $R'_n(z)$ is given by

$$R'_n(z) = \frac{(z - a)^{-(n-1)}}{2\pi i}\oint_{K_2} \frac{(\zeta - a)^{n-1} f(\zeta)\,d\zeta}{z - \zeta} \tag{7.5-9}$$

M(r₁) and M(r₂) are upper bounds of |f(z)| on K₁ and K₂, respectively.

7.6. ZEROS AND ISOLATED SINGULARITIES
7.6-1. Zeros (see also Sec. 1.6-2). The points z for which f(z) = 0 are called the zeros of f(z) [roots of f(z) = 0]. A function f(z) analytic at z = a has a zero of order m, where m is a positive integer, at z = a if and only if the first m coefficients a₀, a₁, a₂, . . . , a_{m−1} in the Taylor-series expansion (7.5-4) of f(z) about z = a vanish, so that f(z)(z − a)^{−m} is analytic and different from zero at z = a.
The zeros of a function f(z) analytic throughout a region D are either isolated points [i.e., each has a neighborhood (Sec. 7.2-2) throughout which f(z) ≠ 0 except at the zero itself], or f(z) is equal to zero throughout D.
If f₁(z) and f₂(z) are analytic throughout a simply connected bounded open region D and on its boundary contour C, and if |f₂(z)| < |f₁(z)| ≠ 0 on C, then f₁(z) and f₁(z) + f₂(z) have the same number of zeros in the region D (Rouché's Theorem). Every polynomial of degree n has n zeros, counting multiplicities (Fundamental Theorem of Algebra; see also Sec. 1.6-3).

7.6-2. Singularities. A singular point or singularity of the function f(z) is any point where f(z) is not analytic. The point z = a is an isolated singularity of f(z) if and only if there exists a real number δ > 0 such that f(z) is analytic for 0 < |z − a| < δ but not at z = a. An isolated singularity of f(z) at z = a ≠ ∞ is
1. A removable singularity if and only if f(z) is finite throughout a neighborhood of z = a, except possibly at z = a itself; i.e., if and only if all coefficients b_k in the Laurent expansion (7.5-7) of f(z) about z = a vanish.
2. A pole of order m (m = 1, 2, . . .) if and only if (z − a)^m f(z) but not (z − a)^{m−1} f(z) is analytic at z = a; i.e., if and only if b_m ≠ 0 and b_{m+1} = b_{m+2} = · · · = 0 in the Laurent expansion (7.5-7) of f(z) about z = a; or if and only if 1/f(z) is analytic and has a zero of order m at z = a. In this case, lim_{z→a} |f(z)| = ∞ no matter how z approaches z = a.
3. An isolated essential singularity if and only if the Laurent expansion (7.5-7) of f(z) about z = a has an infinite number of terms involving negative powers of (z − a); |f(z)| becomes indefinitely large as z approaches the value z = a for some approach paths but not for others.*
4. A branch point if f(z) is a multiple-valued function, and z = a satisfies the conditions of Sec. 7.4-2.
If m branches of a function f(z) join at a branch point z = a, that branch point is considered a point of continuity, a zero, a removable singularity, a pole, or an essential singularity of the "complex" F(z) of the m branches in question if and only if the function

$$\Phi(\zeta) = F(\zeta^m + a) \tag{7.6-1}$$

has a point of continuity, a zero, a removable singularity, a pole, or an essential singularity, respectively, at ζ = 0; F(z) = f(z) in a neighborhood of z = a, except that F(z) can take only values of the m branches joining at z = a. Note that f(z) may have other branches at z = a which may or may not join at z = a and whose behavior at z = a may or may not be different from that of the m branches considered above. If the m branches joining at z = a are the only branches of f(z) at z = a, then F(z) = f(z) at z = a. The behavior of the different branches of a multiple-valued function f(z) at a point z = a which is not a branch point of f(z) must be considered independently for each branch.

* Sometimes the definition of an isolated essential singularity is extended to cover limit points of poles.
EXAMPLES: The function f(z) defined by f(z) = 0 for z = 0, f(z) = 1 for z ≠ 0 has a removable singularity at z = 0. (sin z)/z has a removable singularity at z = 0. 1/(z − 2)³ has a pole of order 3 at z = 2. e^{1/z} has an essential singularity at z = 0. √z has a branch point of order 1 at z = 0; this branch point is a zero of order 1 for all branches.
7.6-3. Zeros and Singularities at Infinity. f(z) is analytic at infinity if and only if f(1/z) is analytic at the origin. The point z = ∞ is a zero or a singularity of any of the types listed in Sec. 7.6-2 if f(1/z) behaves correspondingly at the origin. The behavior of f(z) at infinity may be investigated with the aid of the Laurent expansion of f(1/z) about the origin.
7.6-4. Weierstrass's and Picard's Theorems. Let f(z) be a single-valued function having an isolated essential singularity at z = a. Then

1. For any complex number w, every neighborhood of z = a contains a point z such that |w − f(z)| is arbitrarily small (Weierstrass's Theorem).
2. Every neighborhood of z = a contains an infinite set of points z such that f(z) = w for every complex number w with the possible exception of a single value of w (Picard's Theorem).
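For f(z) = e^{1/z}, Picard's statement can be exhibited explicitly: for any target w ≠ 0 (w = 0 is the one exceptional value) the equation e^{1/z} = w has the solutions z = 1/(log w + 2πik), which accumulate at the essential singularity z = 0. A short sketch (names ours):

```python
import cmath

def preimages(w, kmax):
    """Solutions z of exp(1/z) = w; they accumulate at z = 0 as k grows,
    so every neighborhood of the essential singularity contains points
    where the function takes the value w."""
    return [1 / (cmath.log(w) + 2j * cmath.pi * k) for k in range(1, kmax + 1)]

target = 2.5 - 1.0j
zs = preimages(target, 50)
worst_residual = max(abs(cmath.exp(1 / z) - target) for z in zs)
closest_to_zero = min(abs(z) for z in zs)   # shrinks roughly like 1/(2*pi*kmax)
```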
7.6-5. Integral Functions. An integral (entire) function f(z) is a function whose only singularity is an isolated singularity at z = ∞. If this singularity is a pole of order m, then f(z) must be a polynomial (integral rational function) of degree m. A function f(z) analytic for all values of z, including z = ∞, is a constant (Liouville's Theorem). An integral function f(z) which is not a constant assumes every value w, except possibly one, at at least one point z, and, if f(z) is not a polynomial, at infinitely many points z.
7.6-6. Product Expansion of an Integral Function. For every set of points z₀, z₁, z₂, . . . having no limit point (Sec. 4.3-6a) except possibly z = ∞, there exists an integral function f(z) whose only zeros are zeros of given orders m_k at the points z = z_k. Let z₀ = 0 and z_k ≠ 0 (k > 0); if there is no zero at z = 0, then m₀ = 0. Then f(z) can be represented in the form

$$f(z) = z^{m_0} e^{g(z)} \prod_{k=1}^{\infty}\left[\left(1 - \frac{z}{z_k}\right)\exp\left(\frac{z}{z_k} + \frac{1}{2}\left(\frac{z}{z_k}\right)^2 + \cdots + \frac{1}{r_k}\left(\frac{z}{z_k}\right)^{r_k}\right)\right]^{m_k} \tag{7.6-2}$$

where g(z) is an arbitrary integral function, and the r_k are (finite) integers chosen so as to make the product converge uniformly throughout every bounded region (Theorem of Weierstrass; see Sec. 21.2-13 for examples of product expansions).
7.6-7. Meromorphic Functions. f(z) is meromorphic throughout a region D if and only if its only singularities throughout D are poles. The number of such poles in any finite region D is necessarily finite. Many authors alternatively define a function as meromorphic if and only if its only singularities throughout the finite portion of the plane are poles. Every function meromorphic throughout the finite portion of the plane can be expressed as the quotient of two integral functions without common zeros, and thus as the quotient of two products of the type discussed in Sec. 7.6-6. A function meromorphic throughout the entire plane is a rational algebraic function expressible as the quotient of two polynomials (see also Sec. 4.2-2c).
7.6-8. Partial-fraction Expansion of Meromorphic Functions (see also Sec. 1.7-4). Let f(z) be any function meromorphic in the finite portion of the plane and having poles with given principal parts (Sec. 7.5-3) $\sum_{j=1}^{m_k} b_{kj}(z - z_k)^{-j}$ at the points z = z_k of a given finite or infinite set without limit points in the finite portion of the plane. Then it is possible to find polynomials p₁(z), p₂(z), . . . and an integral function g(z) such that

$$f(z) = \sum_k \left[\sum_{j=1}^{m_k} b_{kj}(z - z_k)^{-j} + p_k(z)\right] + g(z) \tag{7.6-3}$$

and the series converges uniformly in every bounded region where f(z) is analytic (Mittag-Leffler's Theorem).
7.6-9. Zeros and Poles of Meromorphic Functions. Let f(z) be meromorphic throughout the bounded region inside, and continuous on, a closed contour C on which f(z) ≠ 0. Let N be the number of zeros and P the number of poles of f(z) inside C, respectively, where a zero or pole of order m is counted m times. Then

$$\frac{1}{2\pi i}\oint_C \frac{f'(\zeta)}{f(\zeta)}\,d\zeta = N - P \tag{7.6-4}$$

For P = 0, Eq. (4) reduces to the principle of the argument

$$N = \frac{1}{2\pi}\,\Delta_C\vartheta \tag{7.6-5}$$

where Δ_C ϑ is the variation of the argument ϑ of f(z) around the contour C. Equation (5) means that w = f(z) maps a moving point z describing the contour C once onto a moving point w which encircles the w-plane origin N = 0, 1, 2, . . . times if f(z) has, respectively, 0, 1, 2, . . . zeros inside the contour C in the z plane. Equations (4) and (5) yield important criteria for locating the zeros and poles of f(z), such as the famous Nyquist criterion (Ref. 7.6).
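The principle of the argument is directly computable: accumulate the change of arg f(z) around the contour and divide by 2π. A sketch (names and test functions ours; the examples are rational functions with known zero and pole counts inside |z| = 1):

```python
import cmath

def winding_number(f, center=0.0, radius=1.0, n=4000):
    """N - P of Eq. (7.6-4): the total variation of arg f(z) / (2*pi) as
    z describes the circle |z - center| = radius once, accumulated step
    by step so that each phase increment stays in (-pi, pi]."""
    total = 0.0
    prev = f(center + radius)
    for k in range(1, n + 1):
        z = center + radius * cmath.exp(2j * cmath.pi * k / n)
        cur = f(z)
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2 * cmath.pi))

# z^3/(z - 3): three zeros, no poles inside |z| = 1  ->  N - P = 3.
n_minus_p_1 = winding_number(lambda z: z**3 / (z - 3))
# (z - 0.5)/z^2: one zero, one double pole inside  ->  N - P = -1.
n_minus_p_2 = winding_number(lambda z: (z - 0.5) / z**2)
```

This step-by-step phase accumulation is essentially how a Nyquist plot is read.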
7.7. RESIDUES AND CONTOUR INTEGRATION

7.7-1. Residues. Given a point z = a where f(z) is either analytic or has an isolated singularity, the residue Res_f (a) of f(z) at z = a is the coefficient of (z − a)^{−1} in the Laurent expansion (7.5-7), or

$$\operatorname{Res}_f(a) = b_1 = \frac{1}{2\pi i}\oint_C f(\zeta)\,d\zeta \tag{7.7-1a}$$

where C is any contour enclosing z = a but no singularities of f(z) other than z = a. The residue Res_f (∞) of f(z) at z = ∞ is defined as

$$\operatorname{Res}_f(\infty) = \frac{1}{2\pi i}\oint f(\zeta)\,d\zeta \tag{7.7-1b}$$

where the integral is taken in the negative sense around any contour enclosing all singularities of f(z) in the finite portion of the plane. Note that

$$\operatorname{Res}_f(\infty) = \lim_{z \to \infty}\,[-z f(z)] \tag{7.7-2}$$

if the limit exists.
If f(z) is either analytic or has a removable singularity at z = a ≠ ∞, then Res_f (a) = 0 [see also Eq. (7.5-1)]. If z = a ≠ ∞ is a pole of order m, then

$$\operatorname{Res}_f(a) = \frac{1}{(m-1)!}\lim_{z \to a}\frac{d^{m-1}}{dz^{m-1}}\left[(z - a)^m f(z)\right] \tag{7.7-3}$$

In particular, let z = a ≠ ∞ be a simple pole of f(z) = p(z)/q(z), where p(z) and q(z) are analytic at z = a, and p(a) ≠ 0. Then

$$\operatorname{Res}_f(a) = \frac{p(a)}{q'(a)} \tag{7.7-4}$$
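Equation (7.7-4) gives residues at simple poles without any integration. The sketch below (our names) computes the residue of e^z/(z² + 1) at z = i both ways: from p(a)/q′(a) and from the defining contour integral (7.7-1a):

```python
import cmath

def f(z):
    return cmath.exp(z) / (z**2 + 1)

# Eq. (7.7-4) with p(z) = e^z, q(z) = z^2 + 1, q'(z) = 2z, at the
# simple pole a = i:
res_formula = cmath.exp(1j) / (2 * 1j)

# Eq. (7.7-1a): (1/(2*pi*i)) times the integral of f over a small
# circle about z = i that excludes the other pole at z = -i.
n, r = 2000, 0.3
total = 0j
for k in range(n):
    zeta = 1j + r * cmath.exp(2j * cmath.pi * k / n)
    dzeta = 1j * r * cmath.exp(2j * cmath.pi * k / n) * (2 * cmath.pi / n)
    total += f(zeta) * dzeta
res_contour = total / (2j * cmath.pi)
```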
7.7-2. The Residue Theorem (see also Sec. 7.5-1). For every simple closed contour C enclosing at most a finite number of (necessarily isolated) singularities z₁, z₂, . . . , z_n of a single-valued function f(z) continuous on C,

$$\frac{1}{2\pi i}\oint_C f(\zeta)\,d\zeta = \sum_{k=1}^{n}\operatorname{Res}_f(z_k) \qquad \text{(residue theorem)} \tag{7.7-5}$$

One of the z_k may be the point at infinity. Note carefully that the contour C must not pass through any branch cut (see also Sec. 7.4-2).
7.7-3. Evaluation of Definite Integrals. (a) One can often evaluate a real definite integral $\int_{-\infty}^{\infty} f(x)\,dx$ as a portion of a complex contour integral.

(b) To evaluate certain integrals of the form $\int_{-\infty}^{\infty} f(x)\,dx$ one applies Eq. (5) to a contour C comprising the interval (−R, R) of the real axis and the arc S of the circle |z| = R in the upper half plane. The following lemmas often yield the integral over S as R → ∞:

1. $\lim_{R \to \infty}\int_S f(\zeta)\,d\zeta = 0$ whenever the integral exists for all finite values of R, and z f(z) tends uniformly to zero as |z| → ∞ with y > 0.

2. Jordan's Lemma: If F(z) is analytic in the upper half plane, except possibly for a finite number of poles, and tends uniformly to zero as |z| → ∞ with y > 0, then for every real number m > 0

$$\lim_{R \to \infty}\int_S F(\zeta)\,e^{im\zeta}\,d\zeta = 0$$

The contour-integration method may yield the Cauchy principal value (Sec. 4.6-2b) of $\int_{-\infty}^{\infty} f(x)\,dx$ even if the integral itself does not exist. Jordan's lemma is particularly useful for the computation of improper integrals of the form $\int_{-\infty}^{\infty} F(x)e^{imx}\,dx$ and, because of Eq. (21.2-28), integrals of the form $\int_{-\infty}^{\infty} F(x)\cos(mx)\,dx$ and $\int_{-\infty}^{\infty} F(x)\sin(mx)\,dx$ (inverse Laplace and Fourier transforms, Secs. 4.11-3 and 8.2-6).

(c) If S₁ is any semicircular arc of the circle |z − a| = ε about a simple pole z = a of f(z), then

$$\lim_{\varepsilon \to 0}\int_{S_1} f(\zeta)\,d\zeta = \pm\pi i \operatorname{Res}_f(a)$$

where the sign corresponds to the sense in which S₁ is described. This fact is used (1) to evaluate integrals over contours "indented" around simple poles and (2) for computing the Cauchy principal values of certain improper integrals.

(d) One may apply the residue theorem (5) to integrals of the type

$$\int_0^{2\pi} \Phi(\cos\varphi, \sin\varphi)\,d\varphi$$

where Φ is a rational function of cos φ and sin φ, with the aid of the transformation

$$z = e^{i\varphi} \qquad \cos\varphi = \frac{1}{2}\left(z + \frac{1}{z}\right) \qquad \sin\varphi = \frac{1}{2i}\left(z - \frac{1}{z}\right) \qquad d\varphi = \frac{dz}{iz}$$
(e) The Method of Steepest Descent (Saddle-point Method). For a given integral over a suitably chosen or suitably deformed contour C such that the integrand is small except for a pronounced
Fig. 7.7-1. Simple examples of contour-integral evaluation (Sec. 7.7-3; see Refs. 7.11 and 7.16 for more advanced problems).

Fig. 7.7-1a. Semicircular contour closing in the upper half plane about the simple pole z = i:

$$\int_{-\infty}^{\infty} \frac{dx}{1 + x^2} = \lim_{R \to \infty}\oint_C \frac{d\zeta}{1 + \zeta^2} - \lim_{R \to \infty}\int_S \frac{d\zeta}{1 + \zeta^2} = 2\pi i \operatorname{Res}\,(z = i) - 0 = \pi$$

using Lemma 1, Sec. 7.7-3b.

Fig. 7.7-1b. Semicircular contour indented around the simple pole at the origin:

$$\int_0^{\infty} \frac{\sin x}{x}\,dx = \frac{1}{2i}\left[0 - 0 - (-i\pi)\right] = \frac{\pi}{2}$$

using Jordan's Lemma and Sec. 7.7-3c.

Fig. 7.7-1c. Given 0 < a < 1, integration around a contour enclosing the branch cut along the positive real axis gives

$$\left(1 - e^{2\pi i(a-1)}\right)\int_0^{\infty} \frac{x^{a-1}}{1 + x}\,dx = 2\pi i \operatorname{Res}\,(z = -1) = 2\pi i\,e^{\pi i(a-1)}$$

since the integrals over S and S′ go to zero. Hence

$$\int_0^{\infty} \frac{x^{a-1}}{1 + x}\,dx = \frac{\pi}{\sin a\pi} \qquad (0 < a < 1)$$
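The keyhole-contour result above is easy to cross-check by direct quadrature. Substituting x = e^s turns the integral into one over the whole real line whose trapezoidal sum converges rapidly; the sketch below (names and parameter values ours) compares it with π/sin aπ:

```python
import math

def mellin_integral(a, s_max=60.0, h=0.05):
    """Integral of x^(a-1)/(1+x) over (0, inf), computed after the
    substitution x = e^s as the trapezoidal sum of e^(a*s)/(1 + e^s)
    over s in [-s_max, s_max]."""
    n = int(2 * s_max / h)
    total = 0.0
    for k in range(n + 1):
        s = -s_max + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.exp(a * s) / (1.0 + math.exp(s))
    return total * h

a = 0.3
numeric = mellin_integral(a)
by_residues = math.pi / math.sin(a * math.pi)   # the result of Fig. 7.7-1c
```

The exponential decay of the transformed integrand at both ends is what makes the plain trapezoidal rule adequate here.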
maximum at z = z₀ with f′(z₀) = 0, one may attempt an approximation of the form

$$\int_C e^{f(\zeta)}\,d\zeta \approx e^{f(z_0)}\sqrt{\frac{2\pi}{-f''(z_0)}}$$

where the branch of the square root is fixed by the direction of the path of steepest descent through z₀ (see also Ref. 6.4).

7.7-4. Use of Residues for the Summation of Series.
Given a contour C enclosing the points z = m, z = m + 1, z = m + 2, . . . , z = n, where m is an integer, let f(z) be analytic inside and on C, except possibly for a number of poles a₁, a₂, . . . , a_N, none of which coincide with z = m, z = m + 1, z = m + 2, . . . , z = n. Then

$$\sum_{k=m}^{n} f(k) = \frac{1}{2i}\oint_C f(\zeta)\cot \pi\zeta\,d\zeta - \sum_{j=1}^{N}\operatorname{Res}_c(a_j) \tag{7.7-7}$$

where Res_c (a_j) is the residue of πf(z) cot πz at z = a_j; and

$$\sum_{k=m}^{n} (-1)^k f(k) = \frac{1}{2i}\oint_C f(\zeta)\operatorname{cosec} \pi\zeta\,d\zeta - \sum_{j=1}^{N}\operatorname{Res}_s(a_j) \tag{7.7-8}$$
where Res_s (a_j) is the residue of πf(z) cosec πz at z = a_j. It is frequently possible to choose the contour C so that the integral on the right of Eq. (7) or (8) vanishes.

7.8. ANALYTIC CONTINUATION
7.8-1. Analytic Continuation and Monogenic Analytic Functions (see also Secs. 7.4-1 to 7.4-3). (a) Given a single-valued function f₁(z) defined and analytic throughout a region D₁, the function f₂(z) defined and analytic throughout a region D₂ is an analytic continuation of f₁(z) if and only if the intersection of D₁ and D₂ contains a simply connected open region D_c where f₁(z) and f₂(z) are identical.
The analytic continuation f₂(z) is uniquely defined by the values of f₁(z) in D_c (see also Sec. 7.3-3). Moreover, analytic continuations of f₁(z) satisfy every functional equation and, in particular, every differential equation satisfied by f₁(z) (principle of conservation of functional equations). One can thus use f₂(z) to extend the region of definition of f₁(z), and conversely; f₁(z) and f₂(z) are regarded as elements of a single analytic function f(z) defined throughout D₁ and D₂. f₁(z) and/or f₂(z) may be capable of further analytic continuation leading to additional elements of f(z).
(b) Multiple-valued Functions. Note that f₁(z) and f₂(z) are not necessarily identical throughout the entire intersection of their regions of definition. f₂(z) may have an analytic continuation f₃(z) defined on D₁ but not identical with f₁(z). Analytic continuation can yield elements
belonging to different branches of a multiple-valued analytic function f(z); the two values f(z₀) obtained by analytic continuation of f₁(z) along two different routes C, C′ are identical if C and C′ do not enclose a branch point of f(z).
(c) The possible analytic continuations of a given element constitute a monogenic analytic function f(z) defined, except at isolated singularities, throughout the plane or on a connected region with natural boundaries. Whereas the choice of the successive elements defining f(z) is not fixed, the principle of conservation of functional equations applies to all elements, and any one element uniquely defines all branches, isolated singularities, and natural boundaries of f(z).

7.8-2. Methods of Analytic Continuation.
(a) The standard method
of analytic continuation starts with a function f(z) defined by its power-series expansion (7.5-4) inside some circle |z − a| = r. For any point z = b inside the circle, f(b), f′(b), . . . are then known and yield a Taylor-series expansion about z = b. The new power series converges inside a circle |z − b| = r′ which intersects the first circle but may contain a region not inside the first circle. This process may be continued up to the natural boundaries of the function; each power series is an element of f(z).
Note: A function defined as a power series with a finite radius of convergence has at least one singularity on the circle of convergence.
(b) Two functions f₁(z) and f₂(z) defined and analytic throughout the respective open regions D₁ and D₂ separated by a contour arc C are analytic continuations of each other (elements of the same monogenic analytic function) if they are equal and uniformly continuous on C.
(c) The Principle of Reflection. Let f(z) be defined and analytic throughout a region D intersected by the real axis, and let f(z) be real for real z. Then, for every value of a in D, f(z) is defined and analytic at z = a*, and

$$f(a^*) = f^*(a) \tag{7.8-1}$$

More generally, let f₁(z) be defined and analytic throughout a region D₁ bounded in part by a straight-line segment S_z, where f₁(z) is continuous; and let w = f₁(z) map S_z onto a corresponding straight-line segment S_w in the w plane. Then the function w = f₂(z) mapping the reflection in S_z of every point z in D₁ into the reflection of w = f₁(z) in S_w is an analytic continuation of f₁(z).

7.9. CONFORMAL MAPPING
7.9-1. Conformal Mapping. (a) A function w = f(z) maps points of the z plane (or Riemann surface, Sec. 7.4-3) into corresponding points of the w plane (or Riemann surface). At every point z such that f(z) is analytic and f′(z) ≠ 0 the mapping w = f(z) is conformal; i.e., the angle between two curves through such a point is reproduced in magnitude and sense by the angle between the corresponding curves in the w plane. Infinitely small triangles around such points z are mapped onto similar infinitely small triangles in the w plane; each triangle side is "stretched" in the ratio |f′(z)| : 1 and rotated through the angle arg f′(z). The superficial magnification (local magnification of small areas) due to the mapping w = f(z) = u(x, y) + iv(x, y) is

$$|f'(z)|^2 = \frac{\partial(u, v)}{\partial(x, y)} = \frac{\partial u}{\partial x}\frac{\partial v}{\partial y} - \frac{\partial u}{\partial y}\frac{\partial v}{\partial x} \tag{7.9-1}$$

at every point z where the mapping is conformal. A conformal mapping transforms the lines x = constant, y = constant into families of mutually orthogonal trajectories in the w plane. Similarly, the lines u(x, y) = constant, v(x, y) = constant correspond to orthogonal trajectories in the z plane (see also Table 7.2-1).
A region of the z plane mapped onto the entire w plane by w = f(z) is called a fundamental region of the function f(z). Points where f′(z) = 0 are called critical points of the transformation w = f(z).* A mapping which preserves the magnitude but not necessarily the sense of the angle between two curves is called isogonal (example of an isogonal but not conformal mapping: w = z*).
(b) The mapping w = f(z) is conformal at infinity if and only if w = f(1/z) = F(z) maps the origin z = 0 conformally into the w plane. Two curves are said to intersect at an angle γ at z = ∞ if and only if the transformation z′ = 1/z results in two corresponding curves intersecting at an angle γ at z′ = 0. Similarly, w = f(z) maps the point z = a conformally into w = ∞ if and only if w′ = 1/f(z) maps z = a conformally into the origin w′ = 0 (see also Sec. 7.2-3).

7.9-2. Bilinear Transformations.
(a) A bilinear transformation (linear fractional transformation, Moebius transformation)

$$w = \frac{az + b}{cz + d} \qquad (ad - bc \neq 0) \tag{7.9-2}$$

establishes a reciprocal one-to-one correspondence between the points of the z plane and the points of the w plane. In particular, each of the two invariant points

$$z = \frac{1}{2c}\left[(a - d) \pm \sqrt{(a - d)^2 + 4bc}\right] \tag{7.9-3}$$

(which may or may not be distinct) is mapped onto itself. The mapping is conformal everywhere except at the point z = −d/c, which corresponds to w = ∞. Straight lines and circles in the z plane correspond to straight lines or circles in the w plane, and conversely; in this connection, every straight line is regarded as a circle of infinite radius through the point at infinity. For any bilinear transformation mapping the four points z₁, z₂, z₃, z, respectively, into w₁, w₂, w₃, w,

$$\frac{z_1 - z}{z_3 - z} : \frac{z_1 - z_2}{z_3 - z_2} = \frac{w_1 - w}{w_3 - w} : \frac{w_1 - w_2}{w_3 - w_2} \tag{7.9-4}$$

(invariance of the cross ratio or anharmonic ratio*). Equation (4) defines the unique bilinear transformation mapping three given points z₁, z₂, z₃, respectively, into three given points w₁, w₂, w₃. There exists a bilinear transformation which will transform a given circle or straight line in the z plane into a given circle or straight line in the w plane (see also Table 7.9-2 and Sec. 7.10-1).

* Some authors also refer to the (singular) points where 1/f′(z) = 0 as critical points of f(z).

SPECIAL CASES. The transformation

$$w = Az + B \tag{7.9-5}$$
where A and B are complex numbers, corresponds to a rotation through the angle arg A together with a stretching or contraction by a factor |A|, followed by a translation through the vector displacement B. The linear transformation (5) is the most general conformal mapping which preserves similarity of geometrical figures. The transformation

$$w = 1/z \tag{7.9-6}$$

represents a geometrical inversion of the point z with respect to the unit circle about the origin, followed by a reflection in the real axis. The transformation (6) maps

1. Straight lines through the origin into straight lines through the origin
2. Circles through the origin into straight lines which do not contain the origin, and conversely
3. Circles which do not pass through the origin into circles which do not pass through the origin
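The cross-ratio invariance (4) is easy to confirm numerically. A sketch (function names, the normalization of the cross ratio, and the sample transformation are ours):

```python
def moebius(a, b, c, d):
    """w = (a*z + b)/(c*z + d) with a*d - b*c != 0, Eq. (7.9-2)."""
    return lambda z: (a * z + b) / (c * z + d)

def cross_ratio(z1, z2, z3, z4):
    """One common normalization of the cross ratio of four points;
    any fixed normalization is invariant under Eq. (7.9-2)."""
    return ((z1 - z3) * (z2 - z4)) / ((z1 - z4) * (z2 - z3))

f = moebius(2, 1, 1, -3)          # a*d - b*c = -6 - 1 = -7 != 0
zs = [0.5 + 0.2j, -1.0 + 0j, 2j, 3.0 - 1.0j]
ws = [f(z) for z in zs]

before = cross_ratio(*zs)
after = cross_ratio(*ws)          # equal to `before` by invariance
```

Solving the cross-ratio equation for w, given three point pairs, is the practical way to construct the unique bilinear transformation of Eq. (7.9-4).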
(b) The bilinear transformations (2) constitute a group; inverses and products of bilinear transformations are bilinear transformations (Sec. 12.2-7). Every bilinear transformation (2) may be expressed as the result (product) of three successive simpler bilinear transformations (see also Sec. 7.9-2a):

$$z' = z + \frac{d}{c} \qquad \text{(translation)} \tag{7.9-7a}$$

$$z'' = \frac{1}{z'} \qquad \text{(inversion and reflection)} \tag{7.9-7b}$$

$$w = \frac{bc - ad}{c^2}\,z'' + \frac{a}{c} \qquad \text{(rotation and stretching, followed by translation)} \tag{7.9-7c}$$
* The cross ratio (4) is real (and admits a geometrical interpretation, Ref. 7.2) if and only if the points z₁, z₂, z₃, z (and hence also the points w₁, w₂, w₃, w) lie on a circle or straight line.
7.9-3. The Transformation w = ½(z + 1/z). The transformation

$$w = \frac{1}{2}\left(z + \frac{1}{z}\right) \tag{7.9-8a}$$

is equivalent to

$$\frac{w - 1}{w + 1} = \left(\frac{z - 1}{z + 1}\right)^2 \qquad \text{or} \qquad z = w + \sqrt{w^2 - 1} \tag{7.9-8b}$$

or

$$u = \frac{1}{2}\left(|z| + \frac{1}{|z|}\right)\cos\varphi \qquad v = \frac{1}{2}\left(|z| - \frac{1}{|z|}\right)\sin\varphi \tag{7.9-8c}$$

The transformation (8) is conformal except at the critical points z = 1 and z = −1. Both the exterior and the interior of the unit circle |z| = 1 are mapped onto the entire w plane with the exception of the straight-line segment (u = −1, u = 1), which corresponds to the unit circle |z| = 1 itself. Some important properties of this transformation are outlined in Table 7.9-1.
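These properties can be verified pointwise: the unit circle collapses onto the segment [−1, 1], and a circle |z| = r ≠ 1 goes onto an ellipse with semiaxes ½(r + 1/r) and ½(r − 1/r) and foci at ±1, as Eq. (8c) shows. A sketch (names ours):

```python
import cmath

def joukowski(z):
    """w = (z + 1/z)/2, Eq. (7.9-8a)."""
    return 0.5 * (z + 1 / z)

circle = [cmath.exp(2j * cmath.pi * k / 360) for k in range(360)]

# |z| = 1 maps onto the real segment [-1, 1]: w = cos(phi).
on_segment = [joukowski(z) for z in circle]
max_imag = max(abs(w.imag) for w in on_segment)
max_real = max(abs(w.real) for w in on_segment)

# |z| = r maps onto an ellipse with semiaxes A and B (A^2 - B^2 = 1,
# i.e., foci at +-1).
r = 2.0
A, B = 0.5 * (r + 1 / r), 0.5 * (r - 1 / r)
ellipse_error = max(
    abs((w.real / A) ** 2 + (w.imag / B) ** 2 - 1)
    for w in (joukowski(r * z) for z in circle)
)
```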
7.9-4. The Schwarz-Christoffel Transformation. The Schwarz-Christoffel transformation

$$w = A\int^z (\zeta - x_1)^{-\alpha_1/\pi}(\zeta - x_2)^{-\alpha_2/\pi}\cdots(\zeta - x_n)^{-\alpha_n/\pi}\,d\zeta + B \qquad \text{with } \sum_{j=1}^{n}\alpha_j = 2\pi \tag{7.9-9}$$

maps the upper half-plane y > 0 conformally onto the interior of a polygon in the w plane; the polygon boundary corresponds to the x axis, the vertices w₁, w₂, . . . , w_n correspond to different points x₁, x₂, . . . , x_n on the x axis, and the exterior angle of the polygon at the vertex w_j equals α_j (j = 1, 2, . . . , n). For any given polygon in the w plane, three of the x_j's can be chosen arbitrarily; the other x_j's and the parameters A and B are then uniquely determined.
If x_n is chosen to be infinitely large, Eq. (9) reduces to

$$w = A'\int^z (\zeta - x_1')^{-\alpha_1/\pi}(\zeta - x_2')^{-\alpha_2/\pi}\cdots(\zeta - x_{n-1}')^{-\alpha_{n-1}/\pi}\,d\zeta + B' \qquad \text{with } \sum_{j=1}^{n}\alpha_j = 2\pi \tag{7.9-10}$$

where A′ and B′ are constant parameters, and x′₁, x′₂, . . . , x′_{n−1} are new points on the x axis.
Table 7.9-1. Properties of the Transformation w = ½(z + 1/z) (see also Tables 7.9-2 and 7.10-1)

z = x + iy = |z|e^{iφ} (|z| > 0);  w = u + iv

- Circles about the origin, |z| = e^ψ = constant ≠ 1, map onto the ellipses u²/cosh²ψ + v²/sinh²ψ = 1 with foci at ±1 (the circles |z| = e^ψ and |z| = e^{−ψ} yield the same ellipse).
- Straight-line rays through the origin, φ = constant ≠ 0, map onto the hyperbolas u²/cos²φ − v²/sin²φ = 1 with foci at ±1.
- The unit semicircle |z| = 1, y > 0 maps onto the straight-line segment (+1, −1); as φ increases, u decreases.
- The unit semicircle |z| = 1, y < 0 maps onto the straight-line segment (−1, +1); as φ increases, u increases.
- Straight-line segments of the real axis y = 0 map onto straight-line segments of the real axis v = 0: the interval x = −∞ to x = −1 onto u = −∞ to u = −1; x = −1 to x = 0 onto u = −1 to u = −∞; x = 0 to x = +1 onto u = +∞ to u = +1; x = +1 to x = +∞ onto u = +1 to u = +∞.
- The lines v = constant are "streamlines" for flow around the unit circle with velocity ½ at z = ∞; the corresponding lines u = constant are lines of constant velocity potential.
The actual determination of the x_j or x′_j is quite complicated, except in certain "degenerate" cases where one or more of the angles α_j vanish (see also Ref. 7.4). Application of the Schwarz-Christoffel transformation to parallelograms and rectangles in the w plane yields elliptic functions z of w (Sec. 21.6-1). The transformations 26 to 30 in Table 7.9-2 are special cases of the Schwarz-Christoffel transformation.
7.9-5. Table of Transformations.
Table 7.9-2 illustrates a number
of special transformations of interest in various applications.
Table 7.9-2. Table of Transformations of Regions* (the region diagrams of the original table cannot be reproduced here; the mapping functions are listed)

Fig. 1. w = z².
Fig. 2. w = z².
Fig. 3. w = z².
Fig. 4. w = 1/z.
Fig. 5. w = 1/z.
Fig. 8. w = e^z.
Fig. 9. w = sin z.
Fig. 10. w = sin z.
Fig. 11. w = sin z; BCD is y = k, B′C′D′ is [u/cosh k]² + [v/sinh k]² = 1.
Fig. 12. w = (z − 1)/(z + 1).
Fig. 13. w = (i − z)/(i + z).
Fig. 14. w = (z − a)/(az − 1); a = [1 + x₁x₂ + √((1 − x₁²)(1 − x₂²))]/(x₁ + x₂), R₀ = [1 − x₁x₂ + √((1 − x₁²)(1 − x₂²))]/(x₁ − x₂) (a > 1 and R₀ > 1 when −1 < x₂ < x₁ < 1).
Fig. 15. w = (z − a)/(az − 1); a = [1 + x₁x₂ + √((x₁² − 1)(x₂² − 1))]/(x₁ + x₂), R₀ = [x₁x₂ − 1 − √((x₁² − 1)(x₂² − 1))]/(x₁ − x₂).
Fig. 16. w = z + 1/z.
Fig. 17. w = z + 1/z.
Fig. 18. w = z + 1/z; B′C′D′ is [u/(k + 1/k)]² + [v/(k − 1/k)]² = 1.
Fig. 19. w = log_e [(z − 1)/(z + 1)]; z = −coth(w/2).
Fig. 20. w = log_e [(z − 1)/(z + 1)]; ABC is x² + y² − 2y cot k = 1.
Fig. 21. w = log_e [(z + 1)/(z − 1)]; centers of the circles at z = coth c_n, radii csch c_n (n = 1, 2).
Fig. 22. w = k log_e [k/(1 − k)] + log_e 2(1 − k) + iπ − k log_e (z + 1) − (1 − k) log_e (z − 1); x₁ = 2k − 1.
Fig. 23. w = tan²(z/2) = (1 − cos z)/(1 + cos z).
Fig. 24. w = coth(z/2) = (e^z + 1)/(e^z − 1).
Fig. 25. w = log_e coth(z/2).
Fig. 27. w = 2(z + 1)^{1/2} + log_e [((z + 1)^{1/2} − 1)/((z + 1)^{1/2} + 1)].
Fig. 28. w = (i/k) log_e [(1 + ikt)/(1 − ikt)] + log_e [(1 + t)/(1 − t)], t = [(z − 1)/(z + k²)]^{1/2}.
Fig. 29. w = (1/π)[(z² − 1)^{1/2} + cosh⁻¹ z].
Fig. 30. w = cosh⁻¹[(2z − k − 1)/(k − 1)] − (1/k) cosh⁻¹[((k + 1)z − 2k)/((k − 1)z)].

* From R. V. Churchill, Introduction to Complex Variables and Applications, 2d ed., McGraw-Hill, New York, 1960.
Table 7.10-1. Table of Transformations Mapping a Specified Region D Conformally onto the Unit Circle (|w| < 1)
z = x + iy = |z|e^{iφ}, w = u + iv = |w|e^{iθ}

1. Upper half plane y > 0:  w = e^{iλ}(z − a)/(z − a*)  (λ real); z = a is transformed into the origin w = 0.
2. Right half plane x > 0:  w = e^{iλ}(z − a)/(z + a*)  (λ real).
3. Unit circle |z| < 1:  w = e^{iλ}(z − a)/(a*z − 1)  (λ real).
4. Strip of width π, 0 < x < π, −∞ < y < ∞:  special case of the Schwarz-Christoffel transformation (7.9-9).
5. Sector of unit circle |z| < 1, 0 < φ < nπ:  w = [(1 + z^{1/n})^2 − i(1 − z^{1/n})^2]/[(1 + z^{1/n})^2 + i(1 − z^{1/n})^2].
6. z plane cut from z = 0 to z = ∞ along the positive real axis:  w = (√z − i)/(√z + i).
7. Region outside and on the ellipse:  … (see also Sec. 7.9-3).
8. Region outside and on the parabola |z| cos^2 (φ/2) = 1:  …
9. Region inside and on the parabola |z| cos^2 (φ/2) = 1:  w = tan^2 [(π/4)√z].
10. Semicircle |z| < R, x > 0:  w = −i(z^2 + 2Rz − R^2)/(z^2 − 2Rz − R^2).
11. Regions bounded by straight lines (polygons):  combine a Schwarz-Christoffel transformation (Sec. 7.9-4) with transformation 1 above.
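Transformation 1 of the table above can be spot-checked with ordinary complex arithmetic. The sketch below (function name and sample points are mine) maps the upper half plane onto the unit circle and sends z = a to w = 0:

```python
import cmath

# Transformation 1 of Table 7.10-1: w = e^{i*lam}(z - a)/(z - conj(a))
# maps the upper half plane y > 0 onto |w| < 1 (lam real, Im a > 0);
# z = a is transformed into the origin w = 0.
def half_plane_to_disk(z, a=1j, lam=0.0):
    return cmath.exp(1j * lam) * (z - a) / (z - a.conjugate())

w_interior = half_plane_to_disk(0.5 + 2.0j)   # a point with y > 0
w_boundary = half_plane_to_disk(3.0 + 0.0j)   # a point of the real axis
w_center = half_plane_to_disk(1j)             # z = a itself maps to 0
```

Interior points land strictly inside the unit circle, and the real axis lands on |w| = 1, as the table asserts.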
7.10. FUNCTIONS MAPPING SPECIFIED REGIONS ONTO THE UNIT CIRCLE
7.10-1. Riemann's Mapping Theorem. For every simply connected open region D in the z plane, with the exception of the entire z plane and the entire z plane minus one point, there exists a conformal mapping w = f(z) which establishes a reciprocal one-to-one (biunique) correspondence between all points of D and all interior points of the unit circle |w| = 1. The analytic function f(z) is uniquely determined if the mapping of a point in D and a direction through that point is specified. If D is bounded by a regular curve (Sec. 3.1-13) C, then f(z) is continuous on C and establishes a reciprocal one-to-one correspondence between all points of C and all points on the unit circle |w| = 1.
Note: (1) The transformation thus specified defines the analytic function f(z); and (2) with the trivial exceptions noted above, it is possible to map every region D bounded by a simple contour conformally onto any other region D' bounded by a simple contour. The problem of mapping a region D conformally onto the unit circle is closely related to the solution of Dirichlet's boundary-value problem for the region D (Sec. 15.6-9). Frequently, a desired conformal mapping can be obtained through successive relatively simple transformations.
Table 7.10-1 lists a number of transformations mapping a given region D conformally onto the unit circle.

7.11. RELATED TOPICS, REFERENCES, AND BIBLIOGRAPHY

7.11-1. Related Topics. The following topics related to the study of functions of a complex variable are treated in other chapters of this handbook:

Complex numbers: Chap. 1
Roots of polynomials: Chap. 1
Functions, limits, differentiation, integration, infinite series: Chap. 4
Laplace transformation: Chap. 8
Two-dimensional potential theory, conjugate harmonic functions: Chap. 15
Special functions: Chap. 21
7.11-2. References and Bibliography.
7.1. Ahlfors, L. V.: Complex Analysis, 2d ed., McGraw-Hill, New York, 1966.
7.2. Bieberbach, L.: Einführung in die konforme Abbildung, De Gruyter, Berlin, 1927.
7.3. Carrier, G. F., et al.: Functions of a Complex Variable, McGraw-Hill, New York, 1966.
7.4. Churchill, R. V.: Complex Variables and Applications, 2d ed., McGraw-Hill, New York, 1960.
7.5. Copson, E. T.: Theory of Functions of a Complex Variable, Oxford, New York, 1960.
7.6. Cunningham, J.: Complex-variable Methods in Science and Technology, Van Nostrand, Princeton, N.J., 1965.
7.7. Hille, E.: Analytic Function Theory, 2 vols., Blaisdell, New York, 1959/62.
7.8. Hurwitz, A., and R. Courant: Allgemeine Funktionentheorie und elliptische Funktionen, Springer, Berlin, 1964.
7.9. Knopp, K.: Funktionentheorie, translated by F. Bagemihl, Dover, New York, 1947.
7.10. Kober, H.: Dictionary of Conformal Representations, Dover, New York, 1952.
7.11. McLachlan, N. W.: Complex Variable and Operational Calculus with Applications, Macmillan, New York, 1946.
7.12. Nehari, Z.: Conformal Mapping, McGraw-Hill, New York, 1952.
7.13. Nehari, Z.: Introduction to Complex Analysis, Allyn and Bacon, Boston, 1961.
7.14. Pennisi, L. L.: Elements of Complex Variables, Holt, New York, 1963.
7.15. Springer, G.: Introduction to Riemann Surfaces, Addison-Wesley, Reading, Mass., 1957.
7.16. Whittaker, E. T., and G. N. Watson: A Course in Modern Analysis, Cambridge, New York, 1958.
CHAPTER 8
THE LAPLACE TRANSFORMATION AND OTHER FUNCTIONAL TRANSFORMATIONS

8.1. Introduction
     8.1-1. Introductory Remarks
8.2. The Laplace Transformation
     8.2-1. Definition
     8.2-2. Absolute Convergence
     8.2-3. Extension of the Region of Definition
     8.2-4. Sufficient Conditions for the Existence of the Laplace Transform
     8.2-5. Inverse Laplace Transformation
     8.2-6. The Inversion Theorem
     8.2-7. Existence of the Inverse Laplace Transform
     8.2-8. Uniqueness of the Laplace Transform and Its Inverse
8.3. Correspondence between Operations on Object and Result Functions
     8.3-1. Table of Corresponding Operations
     8.3-2. Laplace Transforms of Periodic Object Functions and Amplitude-modulated Sinusoids
     8.3-3. Transform of a Product (Convolution Theorem)
     8.3-4. Limit Theorems
8.4. Tables of Laplace-transform Pairs and Computation of Inverse Laplace Transforms
     8.4-1. Tables of Laplace-transform Pairs
     8.4-2. Computation of Inverse Laplace Transforms
     8.4-3. Use of Contour Integration
     8.4-4. Inverse Laplace Transforms of Rational Algebraic Functions: Heaviside Expansion
     8.4-5. Inverse Laplace Transforms of Rational Algebraic Functions: Expansion in Partial Fractions
     8.4-6. Expansions in Series
     8.4-7. Expansions in Terms of Powers of t
     8.4-8. Expansions in Terms of Laguerre Polynomials of t
     8.4-9. Expansions in Asymptotic Series
8.5. "Formal" Laplace Transformation of Impulse-function Terms
     8.5-1. Impulse-function Transforms
8.6. Some Other Integral Transforms. Finite Fourier and Hankel Transforms
     8.6-1. Introduction
     8.6-2. The Two-sided (Bilateral) Laplace Transformation
     8.6-3. The Stieltjes-integral Form of the Laplace Transformation
     8.6-4. Hankel Transforms and Fourier-Bessel Transforms
8.7. Finite Integral Transforms, Generating Functions, and z Transforms
     8.7-1. Series as Functional Transforms
     8.7-2. Generating Functions
     8.7-3. z Transforms
8.8. Related Topics, References, and Bibliography
     8.8-1. Related Topics
     8.8-2. References and Bibliography
8.1. INTRODUCTION
8.1-1. The Laplace transformation (Sec. 8.2-1) associates a unique function F(s) of a complex variable s with each suitable function f(t) of a real variable t. This correspondence is essentially reciprocal one-to-one for most practical purposes (Sec. 8.2-8); corresponding pairs of functions f(t) and F(s) can often be found by reference to tables. The Laplace transformation is defined so that many relations between, and operations on, the functions f(t) correspond to simpler relations between, and operations on, the functions F(s) (Secs. 8.3-1 to 8.3-4). This applies particularly to the solution of differential and integral equations. It is, thus, often useful to transform a given problem involving functions f(t) into an equivalent problem expressed in terms of the associated Laplace transforms F(s) ("operational calculus" based on Laplace transformations or "transformation calculus," Secs. 9.3-7, 9.4-5, and 10.5-2).

8.2. THE LAPLACE TRANSFORMATION
8.2-1. Definition. The (one-sided) Laplace transformation

F(s) = £[f(t)] = ∫_0^∞ f(t)e^{−st} dt ≡ lim_{a→0, b→∞} ∫_a^b f(t)e^{−st} dt   (0 < a < b)   (8.2-1)

associates a unique result or image function F(s) of the complex variable s = σ + iω with every single-valued object or original function f(t) (t real) such that the improper integral (1) exists. F(s) is called the (one-sided) Laplace transform of f(t). The more explicit notation £[f(t); s] is also used.
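Definition (8.2-1) can be evaluated directly by numerical quadrature; the sketch below (a minimal check, with my own sample pair f(t) = e^{-2t}, F(s) = 1/(s + 2)) truncates the improper integral where the integrand is negligible:

```python
import math

# Direct numerical evaluation of definition (8.2-1) by the midpoint rule,
# truncating the improper integral at a finite upper limit.
def laplace_numeric(f, s, upper=40.0, n=100_000):
    h = upper / n
    return h * sum(f((k + 0.5) * h) * math.exp(-s * (k + 0.5) * h)
                   for k in range(n))

# sample object function (my choice): f(t) = e^{-2t}, so F(s) = 1/(s + 2)
F_num = laplace_numeric(lambda t: math.exp(-2.0 * t), s=1.0)
F_exact = 1.0 / (1.0 + 2.0)
```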
8.2-2. Absolute Convergence. The Laplace transform (1) exists for every s = σ + iω with σ > σ_a whenever the integral

∫_0^∞ |f(t)|e^{−σt} dt ≡ lim_{a→0, b→∞} ∫_a^b |f(t)|e^{−σt} dt   (0 < a < b)   (8.2-2)

exists for σ = σ_a; the greatest lower bound σ_a of the real numbers σ for which (2) exists is the abscissa of absolute convergence of £[f(t)].

8.2-3. Extension of the Region of Definition. The region of definition of the analytic function

F(s) = £[f(t)]   (σ > σ_a)   (8.2-3)

can usually be extended by analytic continuation (Secs. 7.8-1 and 7.8-2) so as to include the entire s plane with the exception of singular points (Sec. 7.6-2) situated to the left of the abscissa of absolute convergence. Such an extension of the region of definition is implied wherever necessary.

8.2-4. Sufficient Conditions for the Existence of the Laplace Transform. The Laplace transform £[f(t)] defined by Eq. (1) exists in the sense of absolute convergence (Sec. 8.2-2; see also Sec. 4.9-3)

1. For σ > 0 if ∫_0^∞ |f(t)| dt exists
2. For σ > σ_a if ∫_0^{t_1} |f(t)| dt exists for every finite t_1 > 0, and |f(t)|e^{−σ_a t} is uniformly bounded for t > t_0 > 0 [f(t) is of exponential order, or O(e^{σ_a t}) as t → ∞, Sec. 4.4-3]
3. For …
8.2-5. Inverse Laplace Transformation. The inverse Laplace transform £⁻¹[F(s)] of a (suitable) function F(s) of the complex variable s = σ + iω is a function f(t) whose Laplace transform (1) is F(s). Not every function F(s) has an inverse Laplace transform.

8.2-6. The Inversion Theorem. Given F(s) = £[f(t)],

f_1(t) ≡ (1/2πi) lim_{R→∞} ∫_{σ_1−iR}^{σ_1+iR} F(s)e^{st} ds = { ½[f(t − 0) + f(t + 0)]  for t > 0;  ½f(0 + 0)  for t = 0;  0  for t < 0 }   (σ_1 > σ_a)   (8.2-4a)

In particular, for every t > 0 where f(t) is continuous,

f_1(t) = f(t)   (8.2-4b)

The path of integration in Eq. (4) lies to the right of all singularities of F(s). The inversion integral f_1(t) reduces to (1/2πi) ∫_{σ_1−i∞}^{σ_1+i∞} F(s)e^{st} ds if the integral exists; otherwise, f_1(t) is a Cauchy principal value (Sec. 4.6-2b).

8.2-7. Existence of the Inverse Laplace Transform. Note carefully that the existence of the limit (4) for a given function F(s) does not in itself imply that F(s) has an inverse Laplace transform [EXAMPLE: F(s) = e^{s^2}]. The existence of £⁻¹[F(s)] should be checked [e.g., by Eq. (1)] for every application of the inversion theorem. The following theorems state sufficient (not necessary) conditions for the existence of £⁻¹[F(s)]:

1. If F(s) is analytic for σ > …
2. Given F(s) = …   (σ > …), then £⁻¹[F(s)] exists, and the corresponding Laplace transformation possesses an abscissa of absolute convergence.

8.2-8. Uniqueness of the Laplace Transform and Its Inverse. The Laplace transform (1) is unique for each function f(t) having such a transform. Conversely, two functions f_1(t) and f_2(t) possessing identical Laplace transforms are identical for all t > 0, except possibly on a set of measure zero (Sec. 4.6-14); f_1(t) = f_2(t) for all t > 0 where both functions are continuous (Lerch's Theorem). Thus, f(t) is uniquely defined by its Laplace transform for almost all t > 0 (Sec. 4.6-14b); a given function F(s) cannot have more than one inverse Laplace transform continuous for all t > 0.
Different discontinuous functions may have the same Laplace transform. In particular, the generalized unit step function (see also Sec. 21.9-1) defined by f(t) = 0 for t < 0, f(t) = 1 for t > 0 has the Laplace transform 1/s regardless of the value assigned to f(t) for t = 0.
8.3. CORRESPONDENCE BETWEEN OPERATIONS ON OBJECT AND RESULT FUNCTIONS

8.3-1. Table of Corresponding Operations. Table 8.3-1 lists a number of theorems each establishing a correspondence between an operation on a function f_1(t) and an operation on its Laplace transform F_1(s), and vice versa. These theorems are the basis of Laplace-transform techniques for the simplified representation of operations (operational calculus based on the use of Laplace transforms).

8.3-2. Laplace Transforms of Periodic Object Functions and Amplitude-modulated Sinusoids. (a) If f(t) is periodic with period T (Sec. 4.2-2b), and ∫_0^T |f(t)| dt exists, then

£[f(t)] = [1/(1 − e^{−sT})] ∫_0^T f(t)e^{−st} dt   (σ > 0)   (8.3-1)

Note: The integral on the right is an integral function (Sec. 7.6-5), so that £[f(t)] has no singularities for finite s except possibly for simple poles on the imaginary axis.

(b) If f(t + T/2) ≡ −f(t), and ∫_0^{T/2} |f(t)| dt exists, then

£[f(t)] = [1/(1 + e^{−sT/2})] ∫_0^{T/2} f(t)e^{−st} dt   (σ > 0)   (8.3-2)

(c) If f(t) satisfies the conditions of (b) and f_1(t) is the half-wave-rectified function equal to f(t) for 0 < t < T/2 and to 0 for T/2 < t < T (period T), then

£[f_1(t)] = [1/(1 − e^{−sT/2})] £[f(t)]   (σ > 0)   (8.3-3)

(d) If f(t) satisfies the conditions of (b) and f_2(t) = |f(t)| (full-wave rectification), then

£[f_2(t)] = coth (sT/4) £[f(t)]   (σ > 0)   (8.3-4)

(e) Transform of an Amplitude-modulated Sinusoid. If F(s) is the Laplace transform of f(t), then

£[f(t) sin ω_1t] = (1/2i)[F(s − iω_1) − F(s + iω_1)]
£[f(t) cos ω_1t] = ½[F(s − iω_1) + F(s + iω_1)]
(8.3-5)
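Equation (8.3-1) can be checked numerically; the sketch below uses as a sample (my choice) the square wave f = +1 on (0, T/2), −1 on (T/2, T), whose transform is the classical pair £[f] = (1/s) tanh (sT/4):

```python
import math

# Numerical check of Eq. (8.3-1): transform of a periodic function equals
# 1/(1 - e^{-sT}) times the transform integral over one period.
def lap_periodic(f, s, T, n=100_000):
    h = T / n                                  # midpoint rule over one period
    period_int = h * sum(f((k + 0.5) * h) * math.exp(-s * (k + 0.5) * h)
                         for k in range(n))
    return period_int / (1.0 - math.exp(-s * T))

T, s = 2.0, 1.3
square = lambda t: 1.0 if (t % T) < T / 2 else -1.0
lhs = lap_periodic(square, s, T)
rhs = math.tanh(s * T / 4.0) / s               # known closed form
```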
Table 8.3-1. Theorems Relating Corresponding Operations on Object and Result Functions

The following theorems are valid whenever the Laplace transforms F(s) = £[f(t)] in question exist in the sense of absolute convergence; limits are assumed to exist (see also Secs. 8.3-2 and 8.7-3).

1. Linearity (α, β constant):  αf_1(t) + βf_2(t)  →  αF_1(s) + βF_2(s)
2a. Differentiation of object function,* if f′(t) exists for all t > 0:  f′(t)  →  sF(s) − f(0 + 0)
2b. If f^{(r)}(t) exists for all t > 0:  f^{(r)}(t)  →  s^r F(s) − s^{r−1}f(0 + 0) − s^{r−2}f′(0 + 0) − · · · − f^{(r−1)}(0 + 0)   (r = 1, 2, . . .)
2c. If f(t) is bounded for t > 0, and f′(t) exists for t > 0 except for t = t_1, t_2, . . . where f(t) has unilateral limits:  f′(t)  →  sF(s) − f(0 + 0) − Σ_k e^{−st_k}[f(t_k + 0) − f(t_k − 0)]
3. Integration of object function:  ∫_0^t f(τ) dτ  →  F(s)/s
4. Change of scale (a > 0):  f(at)  →  (1/a)F(s/a)
5. Translation (shift) of object function (b > 0), if f(t) = 0 for t < 0:  f(t − b)  →  e^{−bs}F(s)
6. Convolution of object functions (see also Sec. 8.3-3):  f_1 * f_2 ≡ ∫_0^t f_1(t − τ)f_2(τ) dτ  →  F_1(s)F_2(s)
Table 8.3-1. Theorems Relating Corresponding Operations on Object and Result Functions (Continued)

7. Corresponding limits of object and result function (continuity theorem; α is independent of t and s):  lim_{α→a} f(t, α)  →  lim_{α→a} F(s, α)
8a. Differentiation with respect to a parameter α independent of t and s:  ∂f(t, α)/∂α  →  ∂F(s, α)/∂α
8b. Integration with respect to a parameter α independent of t and s:  ∫_{α_1}^{α_2} f(t, α) dα  →  ∫_{α_1}^{α_2} F(s, α) dα
9a. Differentiation of result function:  −tf(t)  →  F′(s)
9b.  (−t)^n f(t)  →  F^{(n)}(s)
10. Integration of result function (path of integration situated to the right of the abscissa of absolute convergence):  f(t)/t  →  ∫_s^∞ F(s) ds
11. Translation of result function:  e^{at}f(t)  →  F(s − a)

* The abscissa of absolute convergence for £[f^{(r)}(t)] is 0 or σ_a …
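Two entries of Table 8.3-1 (change of scale and translation) can be spot-checked numerically; the sample pair f(t) = t e^{-t}, F(s) = 1/(s + 1)^2 and the parameter values are my own:

```python
import math

# Midpoint-rule Laplace transform, truncated at a finite upper limit.
def lap(f, s, upper=60.0, n=120_000):
    h = upper / n
    return h * sum(f((k + 0.5) * h) * math.exp(-s * (k + 0.5) * h)
                   for k in range(n))

f = lambda t: t * math.exp(-t)                 # F(s) = 1/(s+1)^2
s, a, b = 1.5, 2.0, 0.7

scale_lhs = lap(lambda t: f(a * t), s)         # theorem 4: f(at)
scale_rhs = (1.0 / a) / (s / a + 1.0) ** 2     # (1/a) F(s/a)

shift_lhs = lap(lambda t: f(t - b) if t >= b else 0.0, s)   # theorem 5
shift_rhs = math.exp(-b * s) / (s + 1.0) ** 2               # e^{-bs} F(s)
```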
8.3-3. Transform of a Product (Convolution Theorem). Given

F_1(s) ≡ £[f_1(t)]   (σ > σ_1)     F_2(s) ≡ £[f_2(t)]   (σ > σ_2)

let ∫_0^∞ e^{−2σ_1t}[f_1(t)]^2 dt and ∫_0^∞ e^{−2σ_2t}[f_2(t)]^2 dt exist. Then

£[f_1(t)f_2(t)] ≡ (1/2πi) ∫_{σ_1−i∞}^{σ_1+i∞} F_1(z)F_2(s − z) dz   (σ > σ_1 + σ_2)

8.3-4. Limit Theorems. If F(s) is the Laplace transform of f(t) and £[f′(t)] exists, then

lim_{s→∞} sF(s) = f(0 + 0)   (8.3-8)

if the limit on the left exists. If, in addition to the first two conditions, sF(s) is analytic for σ ≥ 0, then

lim_{s→0} sF(s) = lim_{t→∞} f(t)   (8.3-9)

(see also Table 8.3-1, 2).
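The limit theorems can be illustrated with the sample pair (my choice) F(s) = 1/(s(s + 1)), i.e. f(t) = 1 − e^{-t}, for which f(0 + 0) = 0 and f(t) → 1:

```python
# Initial- and final-value theorems (8.3-8)/(8.3-9) on F(s) = 1/(s(s+1)).
F = lambda s: 1.0 / (s * (s + 1.0))

initial_value = 1e8 * F(1e8)     # sF(s) as s -> infinity: tends to f(0+0) = 0
final_value = 1e-8 * F(1e-8)     # sF(s) as s -> 0: tends to lim f(t) = 1
```

Note that (8.3-9) needs the analyticity condition: for f(t) = e^{t}, sF(s) = s/(s − 1) is not analytic for σ ≥ 0, and the theorem indeed fails.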
8.4. TABLES OF LAPLACE-TRANSFORM PAIRS
AND COMPUTATION OF INVERSE LAPLACE TRANSFORMS
8.4-1. Tables of Laplace-transform Pairs. A number of Laplace-transform pairs are tabulated in Appendix D for reference. In particular, Table D-6 lists a number of Laplace-transform pairs with rational algebraic result functions. Appendix D also shows how tables of Fourier-transform pairs may be used to obtain certain Laplace-transform pairs, and vice versa.
8.4-2. Computation of Inverse Laplace Transforms. Sections 8.4-3 to 8.4-9 describe various procedures for finding the object function f(t) corresponding to a result function F(s) obtained in the course of a problem solution (see also Secs. 9.3-7, 9.4-5, and 10.5-2).
Note: Unless the existence of £⁻¹[F(s)] is definitely known, any results obtained through the use of the inversion integral (8.2-4) should be checked by means of Eq. (8.2-1). Particular caution is recommended in connection with the use of series expansions for £⁻¹[F(s)]; seemingly straightforward asymptotic or even convergent expansions may not be valid unless F(s) and f(t) satisfy certain restrictive conditions.
A number of sufficient (but not necessary) conditions for the validity of series expansions of £⁻¹[F(s)] are presented in Secs. 8.4-6 to 8.4-9. In many problems, a function f(t) "suspected" to be £⁻¹[F(s)] can be tested, e.g., by resubstitution into a differential equation, so that heuristic methods for finding £⁻¹[F(s)] may be quite useful.
8.4-3. Use of Contour Integration. Values of the contour integral (8.2-4) may frequently be obtained with the aid of the residue theorem (Secs. 7.7-1 to 7.7-3) and Jordan's lemma (Sec. 7.7-3b). If F(s) is a multiple-valued function, the contours used must not cross any branch cuts of F(s) (Sec. 7.4-2; see Refs. 8.5 and 8.10).
8.4-4. Inverse Laplace Transforms of Rational Algebraic Functions: Heaviside Expansion. (a) If F(s) is a rational algebraic function expressible as the ratio of two polynomials in s,

F(s) = D_1(s)/D(s)   (8.4-1)

such that the degree of the polynomial D(s) is higher than that of the polynomial D_1(s), then £⁻¹[F(s)] equals the sum of the residues (Sec. 7.7-1) of F(s)e^{st} at all the singular points (poles) of F(s). To compute the inverse Laplace transform £⁻¹[F(s)], first find the roots s_k of D(s) = 0 [which determine the poles of F(s)] by one of the methods described in Secs. 1.8-1 to 1.8-6 or 20.2-2 and 20.2-3. Then

1. If D(s) ≡ a_0(s − s_1)(s − s_2) · · · (s − s_n) [all roots of D(s) = 0 are simple roots], then

£⁻¹[F(s)] = £⁻¹[D_1(s)/D(s)] = Σ_{k=1}^{n} [D_1(s_k)/D′(s_k)] e^{s_kt}   (t > 0)   (Heaviside expansion)

2. If D(s) ≡ a_0(s − s_1)^{m_1}(s − s_2)^{m_2} · · · (s − s_n)^{m_n}, then

£⁻¹[F(s)] = £⁻¹[D_1(s)/D(s)] = Σ_{k=1}^{n} Σ_{j=1}^{m_k} b_{kj} t^{m_k−j} e^{s_kt}   (t > 0)   (8.4-2)

with

b_{kj} = [1/((j − 1)!(m_k − j)!)] {d^{j−1}/ds^{j−1} [(s − s_k)^{m_k} D_1(s)/D(s)]}_{s=s_k}

(see also Table D-6).

(b) It is sometimes convenient to obtain an inverse Laplace transform involving a multiple root of D(s) = 0 directly by a limiting process leading to the coincidence of distinct simple roots (see also Table 8.3-1, 7). EXAMPLE:

£⁻¹[1/(s − a)^2] = lim_{b→a} £⁻¹[1/((s − a)(s − b))] = lim_{b→a} (e^{at} − e^{bt})/(a − b) = te^{at}

and, more generally, £⁻¹[1/(s − a)^r] = t^{r−1}e^{at}/(r − 1)!   (r = 1, 2, . . .)
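For simple roots, the Heaviside coefficients D_1(s_k)/D′(s_k) are trivial to compute; a minimal sketch (sample F(s) = (s + 3)/((s + 1)(s + 2)) is my choice):

```python
import math

# Heaviside expansion, case 1 (simple roots): the coefficient of e^{s_k t}
# is D1(s_k)/D'(s_k). Here F(s) = (s+3)/((s+1)(s+2)), D(s) = s^2 + 3s + 2.
D1 = lambda s: s + 3.0
Dp = lambda s: 2.0 * s + 3.0                   # D'(s)
roots = [-1.0, -2.0]

coeffs = [D1(sk) / Dp(sk) for sk in roots]     # evaluates to [2.0, -1.0]
f = lambda t: sum(c * math.exp(sk * t) for c, sk in zip(coeffs, roots))
# hence f(t) = 2 e^{-t} - e^{-2t}
```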
(c) Complex Roots. In Eq. (2), pairs of terms corresponding to complex-conjugate roots s = a ± iω_1 may be combined as follows (see also Sec. 8.4-5):

(A + iB)t^r e^{(a+iω_1)t} + (A − iB)t^r e^{(a−iω_1)t} = 2At^r e^{at} cos ω_1t − 2Bt^r e^{at} sin ω_1t
 = Rt^r e^{at} sin (ω_1t + α) = Rt^r e^{at} cos (ω_1t + α′)   (8.4-3)

with

R = 2√(A^2 + B^2)     α = −arctan (A/B)     α′ = arctan (B/A)

A, B, R, α, and α′ are real if the coefficients in D_1(s) and D(s) are real; £⁻¹[F(s)] is then a real function of t.

(d) If one of the roots s_k of D(s) = 0 should also be a root of D_1(s) = 0, then one or more terms of the expansion (2) will vanish. In general, such common roots of D(s) = 0 and D_1(s) = 0 can be "factored out" and canceled in Eq. (1).
8.4-5. Inverse Laplace Transforms of Rational Algebraic Functions: Expansion in Partial Fractions. Instead of applying the Heaviside expansion (2) directly, one may expand F(s) = D_1(s)/D(s) as a sum of partial fractions by one of the methods described in Sec. 1.7-4.
If D(s) and D_1(s) have no common zeros, each real root s_k = a of D(s) = 0 will give rise to m_k partial fractions of the form

b_1/(s − a) + b_2/(s − a)^2 + · · · + b_{m_k}/(s − a)^{m_k}

where m_k is the order of the root s_k = a. Each pair of complex-conjugate roots s_k = a ± iω_1 will give rise to m_k partial fractions of the form

c_1(s + d_1)/[(s − a)^2 + ω_1^2] + c_2(s + d_2)/[(s − a)^2 + ω_1^2]^2 + · · · + c_{m_k}(s + d_{m_k})/[(s − a)^2 + ω_1^2]^{m_k}

where m_k is the order of the roots s_k = a ± iω_1. £⁻¹[F(s)] is then obtained as the sum of the inverse Laplace transforms of such terms (Table D-6). The method of Sec. 8.4-4b may be useful.
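The partial-fraction coefficients can also be extracted numerically by the limiting processes behind Eq. (8.4-2); a sketch with my own sample F(s) = 1/((s + 1)(s + 2)^2):

```python
# F(s) = 1/((s+1)(s+2)^2) = b1/(s+1) + c2/(s+2) + c1/(s+2)^2,
# with exact coefficients b1 = 1, c2 = -1, c1 = -1.
F = lambda s: 1.0 / ((s + 1.0) * (s + 2.0) ** 2)
g = lambda s: (s + 2.0) ** 2 * F(s)            # equals 1/(s+1)

eps = 1e-5
b1 = eps * F(-1.0 + eps)                       # lim (s+1)F(s) as s -> -1
c1 = g(-2.0 + eps)                             # lim (s+2)^2 F(s) as s -> -2
c2 = (g(-2.0 + eps) - g(-2.0 - eps)) / (2 * eps)   # d/ds of (s+2)^2 F(s)
# Table D-6 then gives f(t) = e^{-t} - e^{-2t} - t e^{-2t}
```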
8.4-6. Expansions in Series. If the form of F(s) is complicated, or if F(s) is given only implicitly, e.g., as the solution of a differential equation in s (Sec. 9.4-5), it is sometimes possible to obtain £⁻¹[F(s)] by expanding F(s) into a convergent series and taking the inverse Laplace transform of the latter term by term. Such a procedure may also be useful for approximating £⁻¹[F(s)]. Frequently, series-expansion methods are justified by the following theorem.
Let

F_k(s) = £[f_k(t)]   (σ > σ_a; k = 0, 1, 2, . . .)

and let

Σ_{k=0}^∞ ∫_0^∞ |f_k(t)|e^{−σ_a t} dt

converge. Then the series Σ_{k=0}^∞ F_k(s) converges uniformly to a function F(s) for σ > σ_a; the series Σ_{k=0}^∞ f_k(t) converges absolutely to a function f(t) for almost all t (Sec. 4.6-14b), and

£[f(t)] = £[Σ_{k=0}^∞ f_k(t)] = Σ_{k=0}^∞ £[f_k(t)] = Σ_{k=0}^∞ F_k(s) = F(s)   (8.4-4)
8.4-7. Expansions in Terms of Powers of t. Series expansions of F(s) in descending powers of s,

F(s) = b_1/s + b_2/s^2 + · · · + b_k/s^k + · · ·   (|s| > r > 0)   (8.4-5)

may frequently be obtained as Laurent expansions (Sec. 7.5-3) about s = ∞ or, in the case of rational algebraic functions of the type specified in Sec. 8.4-4, simply by long division (Sec. 1.7-2). If the conditions of Sec. 8.4-6 are satisfied, then, for almost all t in (0, t_1),

f(t) = £⁻¹[F(s)] = b_1 + b_2t + (b_3/2!)t^2 + · · · + [b_k/(k − 1)!]t^{k−1} + · · ·   (t_1 > t > 0)   (8.4-6)

which may furnish (at least) useful approximations to £⁻¹[F(s)] for t < t_1. If

F(s) = Σ_{k=0}^∞ a_k/s^{k+α}   [Re (α) > 0]   (8.4-7)

converges, then

£⁻¹[F(s)] = t^{α−1} Σ_{k=0}^∞ [a_k/Γ(k + α)] t^k   (t > 0)   (8.4-8)

In particular, if the series on the left converges,
£⁻¹[(1/√s) Σ_{k=0}^∞ a_k s^{−k}] = [1/√(πt)] [a_0 + (2t/1)a_1 + ((2t)^2/(1 · 3))a_2 + ((2t)^3/(1 · 3 · 5))a_3 + · · ·]   (t > 0)   (8.4-9)

8.4-8. Expansions in Terms of Laguerre Polynomials of t. Every function

F(s) = £[f(t)]   (σ > σ_a)

analytic at s = ∞ may be expanded, for σ > σ_a, into an absolutely convergent series corresponding to a Taylor series convergent for |z| < 1. In particular, for σ_a = 0, F(s) may be expressed in the form

F(s) = [1/(s + ½)] Σ_{k=0}^∞ c_k [(s − ½)/(s + ½)]^k   (σ > 0)   (8.4-11)

with coefficients c_k given by the Taylor expansion of (s + ½)F(s) in powers of z = (s − ½)/(s + ½), and, if the conditions of Sec. 8.4-6 are satisfied, then, for almost all t > 0,

f(t) = £⁻¹[F(s)] = e^{−t/2} Σ_{k=0}^∞ c_k L_k(t)   (t > 0)   (8.4-12)

where the L_k(t) are the Laguerre polynomials defined in Sec. 21.7-1.
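The termwise inversions of Secs. 8.4-6 and 8.4-7 are easy to illustrate; from the descending expansion F(s) = 1/(s + 1) = 1/s − 1/s^2 + 1/s^3 − · · ·, Eq. (8.4-6) gives the series of e^{-t} (sample pair is my choice):

```python
import math

# Termwise inversion of F(s) = 1/(s+1): b_k = (-1)^{k-1}, so by (8.4-6)
# f(t) = 1 - t + t^2/2! - ... , the Taylor series of e^{-t}.
def f_series(t, n_terms=30):
    return sum((-1.0) ** k * t ** k / math.factorial(k)
               for k in range(n_terms))

val = f_series(0.8)    # should agree with e^{-0.8}
```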
8.4-9. Expansions in Asymptotic Series. Valid approximations to £⁻¹[F(s)] may often be obtained in terms of asymptotic series (Sec. 4.8-6b). The following type of asymptotic expansion is of importance in connection with the solution of certain partial differential equations (Sec. 10.5-2). If

F(s) = £[f(t)]   (σ > σ_a)

can be expanded into a convergent series of the form

F(s) = (1/s)(a_0 + a_1s^{1/2} + a_2s + a_3s^{3/2} + · · ·) = (1/s) Σ_{k=0}^∞ a_k s^{k/2}   (8.4-13)

in a neighborhood of s = 0, then f(t) may be represented by the asymptotic series

f(t) ~ a_0 + (1/π) Σ_{j=0}^∞ (−1)^j a_{2j+1} Γ(j + ½) t^{−j−½}   as t → ∞   (8.4-14)

provided that either of the following sets of conditions is satisfied (see also Sec. 8.5-1 and Ref. 8.5):

1. f(t) is differentiable for t > 0, continuous for t = 0, and £[f′(t)] exists; and there exists a (finite) real positive number K such that

(d/dt){t^j[f(t) − a_0 − S_{j−1}(t)]} > −Kt^{−…}   (t > 0; j = 1, 2, . . .)   (8.4-15)

where S_{j−1}(t) is the (j − 1)st partial sum of the series in Eq. (14).

2. f(t) is continuous for t > t_1 > 0, and σ_a < 0; F(s) is analytic for σ ≥ 0 except for s = 0. There exists c_1 > 0 such that for σ_1 > …

lim_{|ω|→∞} F(s) = 0 uniformly with respect to σ
lim_{|ω|→∞} F^{(k)}(iω) = 0   (k = 1, 2, . . .)
∫ |F^{(j)}(iω)| dω exists   (j = 2, 3, . . .)

and ∫ F(σ_1 + iω)e^{iωt} dω converges uniformly for t > t_1 > 0.

These rather restrictive conditions provide a rigorous basis for a number of asymptotic expansions of £⁻¹[Φ(√s)] originally derived with the aid of the old Heaviside operational calculus (Refs. 8.3 and 8.10).

8.5. "FORMAL" LAPLACE TRANSFORMATION OF IMPULSE-FUNCTION TERMS
8.5-1. Impulse-function Transforms. (a) In Eq. (8.4-13) and in similar series expansions, note that the inverse Laplace transforms of individual terms like a, as, as^2, . . . do not, strictly speaking, exist, since these functions do not tend to zero as s → ∞. In most applications, terms of this nature will appear under summation and integral signs in such a manner that the series or integral does have an inverse Laplace transform.

(b) If one applies the Laplace transformation (8.2-1) to the "definitions" of the impulse functions δ(t) and δ_+(t) and their "derivatives" (Secs. 9.2 and 21.9-6), one obtains the formal results

£[δ(t)] = ½     £[δ_+(t)] = 1
£[δ_+′(t)] = s     £[δ_+″(t)] = s^2     . . .
£[δ(t − a)] = £[δ_+(t − a)] = e^{−as}   (a > 0)
£[δ^{(k)}(t − a)] = £[δ_+^{(k)}(t − a)] = e^{−as}s^k   (a > 0; k = 1, 2, . . .)
(8.5-1)*
and, if f(t) is continuous for t = a > 0,

£[δ_+(t − a)f(t)] = e^{−as}f(a)   (8.5-2)
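One legitimate way to make sense of £[δ_+(t − a)] = e^{−as} is as a limit of transforms of narrow unit-area pulses; a minimal sketch (pulse shape and parameters are my own):

```python
import math

# Replace the impulse at t = a by a rectangular pulse of width w and unit
# area; its transform (closed form below) tends to e^{-as} as w -> 0.
def pulse_transform(a, s, w):
    return (math.exp(-s * a) - math.exp(-s * (a + w))) / (s * w)

a, s = 2.0, 1.5
approximations = [pulse_transform(a, s, w) for w in (1e-1, 1e-3, 1e-6)]
limit = math.exp(-s * a)                       # e^{-3}
```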
Eqs. (1) and (2) are useful in many applications, but one must not forget that such relations have no strict mathematical meaning. New results suggested by Eqs. (1) and (2) must always be verified by mathematically legitimate means (see also Secs. 8.2-7 and 21.9-2a).

8.6. SOME OTHER INTEGRAL TRANSFORMATIONS
* The asymmetrical impulse function δ_+(t) is more suitable for use in connection with the one-sided Laplace transformation than the symmetrical impulse function δ(t).

8.6-1. The Laplace transformation (8.2-1) is a functional transformation associating "points" F(s) in a result space with "points" f(t) in an object
space (see also Secs. 12.1-4 and 15.2-7). Table 8.6-1 and Secs. 8.6-2 to 8.6-4 introduce a number of other functional transformations (see also Sec. 4.11-5, Appendix D, and Refs. 8.8 and 8.10).
8.6-2. The Two-sided (Bilateral) Laplace Transformation. (a) The two-sided (bilateral) Laplace transformation

£_B[f(t)] = ∫_{−∞}^∞ f(t)e^{−st} dt = £[f(t); s] + £[f(−t); −s]   (8.6-1)

is an attractive generalization of the Laplace transformation applicable, like the Fourier transformation (Secs. 4.11-3 to 4.11-7), to problems where values of f(t) for t < 0 are of importance. £_B[f(t); s] converges absolutely if and only if both £[f(t); s] and £[f(−t); −s] converge absolutely, so that the region of absolute convergence, if any, will be a strip of the s plane determined by two abscissas of absolute convergence. Many properties of the two-sided Laplace transformation are simply derived from corresponding properties of the one-sided Laplace transformation by reference to Eq. (1) (see also Refs. 8.5 and 8.17). In particular, note

£_B[f(t)] = £[f(t)]   if f(t) = 0 for t < 0
£_B[f(t)] = £[f(t)] − c/s   if f(t) = c for t < 0
(8.6-2)

(b) The values of the inverse transform £_B⁻¹[F(s)] are not restricted to zero for t < 0, so that £_B⁻¹[F(s)] exists for a larger class of functions F(s) than does £⁻¹[F(s)]. Given

F(s) = £_B[f(t)]   (σ_{a1} < σ < σ_{a2})

then, for any value of t having a neighborhood where f(t) is of bounded variation,

f_1(t) = (1/2πi) lim_{R→∞} ∫_{σ_1−iR}^{σ_1+iR} F(s)e^{st} ds = ½[f(t − 0) + f(t + 0)]   (σ_{a1} < σ_1 < σ_{a2})
(inversion theorem)   (8.6-3)
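A quick numerical check of (8.6-1), using the sample function f(t) = e^{-|t|} (my choice), for which £[f(t); s] = 1/(s + 1), £[f(−t); −s] = 1/(1 − s), and hence £_B[f] = 2/(1 − s^2) on the strip −1 < σ < 1:

```python
import math

# Two-sided transform (8.6-1) of f(t) = e^{-|t|}, computed by quadrature;
# valid only inside the convergence strip -1 < sigma < 1.
def bilateral_numeric(s, upper=50.0, n=100_000):
    h = upper / n
    acc = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        acc += math.exp(-t) * math.exp(-s * t)   # contribution from t > 0
        acc += math.exp(-t) * math.exp(s * t)    # contribution from t < 0
    return acc * h

s = 0.3
val = bilateral_numeric(s)
exact = 2.0 / (1.0 - s * s)
```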
(c) Note also the convolution theorem

£_B[f_1(t)]£_B[f_2(t)] = £_B[f_1 * f_2]   with   f_1 * f_2 ≡ ∫_{−∞}^∞ f_1(τ)f_2(t − τ) dτ = f_2 * f_1   (8.6-4)

assuming absolute convergence (see also Tables 4.11-1 and 8.3-1).

8.6-3. The Stieltjes-integral Form of the Laplace Transformation. The Stieltjes-integral form of the Laplace transformation
£_S[Φ(t)] = ∫_0^∞ e^{−st} dΦ(t)   (8.6-5)
Table 8.6-1. Some Linear Integral Transformations Related to the Laplace Transformation
Each transform is denoted by F(s) (see also Secs. 4.11-3 to 4.11-5, 8.6-1 to 8.6-5, 10.5-1, 15.2-7, 15.3-1, and 18.3-8); the stated relations to the unilateral Laplace transformation hold whenever the transforms in question exist.

1. s-multiplied Laplace transform.  Definition: F(s) = s ∫_0^∞ f(t)e^{−st} dt.  Inversion: f(t) = (1/2πi) ∫_{σ_1−i∞}^{σ_1+i∞} [F(s)/s] e^{st} ds  (t > 0).  Relation: s£[f(t); s].
2. Bilateral Laplace transform.  Definition: F(s) = ∫_{−∞}^∞ f(t)e^{−st} dt.  Inversion: f(t) = (1/2πi) ∫_{σ_1−i∞}^{σ_1+i∞} F(s)e^{st} ds.  Relation: £_B[f(t); s] = £[f(t); s] + £[f(−t); −s].  Remarks: Sec. 8.6-2.
3. Fourier transform.  Definition: F(ω) = ∫_{−∞}^∞ f(t)e^{−iωt} dt.  Inversion: f(t) = (1/2π) ∫_{−∞}^∞ F(ω)e^{iωt} dω.  Relation: £_B[f(t); iω].  Remarks: Secs. 4.11-3 to 4.11-5.
4. Mellin transform.  Definition: F(s) = ∫_0^∞ f(t)t^{s−1} dt.  Inversion: f(t) = (1/2πi) ∫_{σ_1−i∞}^{σ_1+i∞} F(s)t^{−s} ds  (t > 0).  Relation: £_B[f(e^t); −s].
5. Hilbert transform (use principal value).  Definition: F(s) = (1/π) ∫_{−∞}^∞ [f(t)/(s − t)] dt.  Inversion: f(t) = −(1/π) ∫_{−∞}^∞ [F(s)/(t − s)] ds.  Remarks: f(t) and F(t) are quadrature functions, i.e.,

f(t) = (1/π) ∫_0^∞ [a(ω) cos ωt + b(ω) sin ωt] dω

implies

F(t) = (1/π) ∫_0^∞ [a(ω) sin ωt − b(ω) cos ωt] dω
permits a more general formulation of many theorems than the ordinary Laplace transformation (see also Sec. 4.6-17 and Refs. 8.5 and 8.18). The other functional transformations listed in Table 8.6-1 may be similarly written in terms of Stieltjes integrals.
Note that F(s) = e^{−as} can be represented in the form (5) without the use of impulse functions (Sec. 8.5-1). The s-multiplied Laplace transform (Table 8.6-1,1) is sometimes employed for similar reasons.
8.6-4. Hankel Transforms and Fourier-Bessel Transforms. (a) Definition and Inversion Theorem. The integral transform

f̄(s) ≡ ℋ_m[f(t)] ≡ ℋ_m[f(t); s] ≡ ∫_0^∞ f(t)tJ_m(st) dt   (Hankel transform of order m)   (8.6-6a)

where f(t) is a real function, and J_m(z) is the mth-order Bessel function (Sec. 21.8-1), exists in the sense of absolute convergence whenever ∫_0^∞ |f(t)|t dt exists. If, in addition, f(t) is of bounded variation in a neighborhood of t, one has the inversion formula

f_1(t) = ∫_0^∞ f̄(s)sJ_m(st) ds = ½[f(t − 0) + f(t + 0)]   (m ≥ −½)   (Hankel inversion theorem)   (8.6-6b)

which determines the inverse transform uniquely wherever it is continuous.

(b) Properties of Hankel Transforms. Note the following relations:

ℋ_m[f(at); s] = (1/a^2) ℋ_m[f(t); s/a]   (8.6-7)
ℋ_m[(1/t)f(t)] = (s/2m)(ℋ_{m−1}[f(t)] + ℋ_{m+1}[f(t)])   (8.6-8)
ℋ_m[f′(t)] = (s/2m)[(m − 1)ℋ_{m+1}[f(t)] − (m + 1)ℋ_{m−1}[f(t)]]   (8.6-9)
ℋ_m[f″(t) + (1/t)f′(t) − (m^2/t^2)f(t)] = −s^2 ℋ_m[f(t)]   (8.6-10)
∫_0^∞ sℋ_m[f(t)]ℋ_m[g(t)] ds = ∫_0^∞ tf(t)g(t) dt   (m ≥ −½)   (Parseval's theorem for Hankel transforms)   (8.6-11)

(c) Fourier-Bessel Transforms (see also Secs. 21.8-1 and 21.8-2). The following integral-transform pairs are related to the Hankel-transform pair (6):

f̄(s) = ∫_0^∞ f(t)tJ_m(st) dt
f_1(t) = ∫_0^∞ f̄(s)sJ_m(st) ds = ½[f(t − 0) + f(t + 0)]
(m = 0, 1, 2, . . .)   (8.6-12)

f̄(s) = ∫_0^∞ f(t)(st)^{−½}J_{m+½}(st)t^2 dt
f_1(t) = ∫_0^∞ f̄(s)(st)^{−½}J_{m+½}(st)s^2 ds = ½[f(t − 0) + f(t + 0)]
(m = 0, 1, 2, . . .)   (8.6-13)

Both integral-transform pairs are referred to as Fourier-Bessel transform pairs; for m = 0, Eq. (13) reduces to a Fourier sine-transform pair.
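The order-zero transform (8.6-6a) can be checked against a classical result, ℋ_0[e^{-t}](s) = 1/(1 + s^2)^{3/2}; the sketch below (sample pair and tolerances are mine) computes J_0 from its integral representation to stay self-contained:

```python
import math

# J_0(x) = (1/pi) * integral_0^pi cos(x sin u) du  (midpoint rule)
def j0(x, n=1000):
    h = math.pi / n
    return (h / math.pi) * sum(math.cos(x * math.sin((k + 0.5) * h))
                               for k in range(n))

# Order-zero Hankel transform (8.6-6a): integral_0^inf f(t) t J_0(st) dt
def hankel0(f, s, upper=30.0, n=3000):
    h = upper / n
    acc = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        acc += f(t) * t * j0(s * t)
    return acc * h

val = hankel0(lambda t: math.exp(-t), 1.0)
exact = (1.0 + 1.0) ** -1.5        # 1/(1+s^2)^{3/2} at s = 1
```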
8.7. FINITE INTEGRAL TRANSFORMS, GENERATING FUNCTIONS, AND z TRANSFORMS

8.7-1. Series as Functional Transforms. Finite Fourier and Hankel Transforms. A finite or convergent series

Φ(s) = Σ_{k=0}^∞ f_k ψ(s, k)   (8.7-1)

represents a functional transformation of the function (sequence, Sec. 4.2-1) f_k = f(k) defined on the discrete set of integers k = 0, 1, 2, . . . . Note that for suitable f_k and ψ(s, x) such a series can be written as an integral transform in terms of a Stieltjes integral (Sec. 4.6-17):

Φ(s) = ∫_0^∞ ψ(s, x) dφ(x)

where φ(x) is a step function with jumps f_k at x = k.

8.7-2. Generating Functions. The series

γ(s) = Σ_{k=0}^∞ f_k s^k   (8.7-2)

γ(s) is called a generating function for the sequence of coefficients f_0, f_1, f_2, . . . , while

γ_E(s) = Σ_{k=0}^∞ f_k s^k/k!   (8.7-3)
Table 8.7-1. Some Finite Integral Transforms (see also Secs. 10.4-2c and 15.4-12). x is a real variable, Φ′(x) ≡ dΦ/dx; each inversion series yields Φ(x) or ½[Φ(x − 0) + Φ(x + 0)].

(a)

Finite cosine transform (Fourier cosine series, Sec. 4.11-2b):
  f_k = ∫_0^a Φ(x) cos λ_kx dx,  λ_k = kπ/a  (k = 1, 2, . . .)
  Inversion: (1/a) ∫_0^a Φ(ξ) dξ + (2/a) Σ_{k=1}^∞ f_k cos λ_kx
  Derivative transformation: ∫_0^a Φ″(x) cos λ_kx dx = −λ_k^2 f_k − Φ′(0) + (−1)^k Φ′(a)

Finite sine transform (Fourier sine series, Sec. 4.11-2b):
  f_k = ∫_0^a Φ(x) sin λ_kx dx,  λ_k = kπ/a  (k = 1, 2, . . .)
  Inversion: (2/a) Σ_{k=1}^∞ f_k sin λ_kx
  Derivative transformation: ∫_0^a Φ″(x) sin λ_kx dx = −λ_k^2 f_k + λ_k[Φ(0) + (−1)^{k+1} Φ(a)]

Finite Hankel transform (Hankel series):
  f_k = ∫_0^a Φ(x)xJ_m(λ_kx) dx,  λ_1, λ_2, . . . the positive roots of J_m(aλ) = 0
  Inversion: (2/a^2) Σ_{k=1}^∞ f_k J_m(λ_kx)/[J_m′(λ_ka)]^2
  Derivative transformation: ∫_0^a [Φ″(x) + (1/x)Φ′(x) − (m^2/x^2)Φ(x)] xJ_m(λ_kx) dx = −λ_k^2 f_k − aλ_k J_m′(aλ_k)Φ(a)

Finite annular Hankel transform, with B_m(λx) = J_m(λx)N_m(aλ) − N_m(λx)J_m(aλ), a′ > a > 0:
  f_k = ∫_a^{a′} Φ(x)xB_m(λ_kx) dx,  λ_1, λ_2, . . . the positive roots of B_m(a′λ) = 0
  Inversion: Σ_{k=1}^∞ λ_k^2 J_m^2(a′λ_k) f_k B_m(λ_kx)/[J_m^2(aλ_k) − J_m^2(a′λ_k)] · …
  Derivative transformation: ∫_a^{a′} [Φ″ + (1/x)Φ′ − (m^2/x^2)Φ] xB_m(λ_kx) dx = −λ_k^2 f_k + …

(b) Finite Transforms Useful with Boundary Conditions of the Form Φ′(a) + bΦ(a) = 0  (b > 0)

Finite cosine transform:
  f_k = ∫_0^a Φ(x) cos λ_kx dx,  λ_1, λ_2, . . . the positive roots of λ tan aλ = b
  Inversion: 2 Σ_{k=1}^∞ (λ_k^2 + b^2) f_k cos λ_kx / [b + a(λ_k^2 + b^2)]
  Derivative transformation: ∫_0^a Φ″(x) cos λ_kx dx = −λ_k^2 f_k − Φ′(0) + [Φ′(a) + bΦ(a)] cos aλ_k

Finite sine transform:
  f_k = ∫_0^a Φ(x) sin λ_kx dx,  λ_1, λ_2, . . . the positive roots of λ cot aλ = −b
  Inversion: 2 Σ_{k=1}^∞ (λ_k^2 + b^2) f_k sin λ_kx / [b + a(λ_k^2 + b^2)]
  Derivative transformation: ∫_0^a Φ″(x) sin λ_kx dx = −λ_k^2 f_k + λ_kΦ(0) + [Φ′(a) + bΦ(a)] sin aλ_k

Finite Hankel transform:
  f_k = ∫_0^a Φ(x)xJ_m(λ_kx) dx,  λ_1, λ_2, . . . the positive roots of λJ_m′(aλ) + bJ_m(aλ) = 0
  Inversion: 2 Σ_{k=1}^∞ λ_k^2 f_k J_m(λ_kx) / {[(λ_k^2 + b^2)a^2 − m^2] J_m^2(aλ_k)}
  Derivative transformation: ∫_0^a [Φ″ + (1/x)Φ′ − (m^2/x^2)Φ] xJ_m(λ_kx) dx = −λ_k^2 f_k + [Φ′(a) + bΦ(a)] aJ_m(aλ_k)

In Tables 8.7-1a and b, note that J_0′(λ_ka) = −J_1(λ_ka).
8.7-3
THE LAPLACE TRANSFORMATION
240
is called an exponential generating function for the f_k. Applications and properties of generating functions are further discussed in Sec. 18.3-8 and Appendix C.
EXAMPLE: The Fibonacci numbers f₀, f₁, f₂, . . . are defined by f₀ = f₁ = 1 and

f_k = f_{k−1} + f_{k−2}   (k = 2, 3, . . .)   (8.7-4)

Their generating function is

γ(s) = 1/(1 − s − s²) = 1 + s + 2s² + 3s³ + 5s⁴ + 8s⁵ + · · ·   (8.7-5)
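The generating function (8.7-5) can be checked numerically. The sketch below expands 1/(1 − s − s²) as a power series by the generic series-inversion recurrence c₀ = 1/p₀, c_n = −(1/p₀) Σ_{j≥1} p_j c_{n−j}, and compares the coefficients with the Fibonacci recurrence (8.7-4); the helper name `series_inverse` is our own, not from the handbook.

```python
def series_inverse(p, n):
    """Power-series coefficients of 1/p(s) up to order n - 1.

    p: polynomial coefficients [p0, p1, ...], with p0 != 0.
    """
    c = [1.0 / p[0]]
    for k in range(1, n):
        acc = sum(p[j] * c[k - j] for j in range(1, min(k, len(p) - 1) + 1))
        c.append(-acc / p[0])
    return c

# gamma(s) = 1/(1 - s - s^2): denominator coefficients [1, -1, -1]
coeffs = series_inverse([1, -1, -1], 10)

# Independent check against the Fibonacci recurrence f0 = f1 = 1
fib = [1, 1]
while len(fib) < 10:
    fib.append(fib[-1] + fib[-2])

print([round(c) for c in coeffs])  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```

The same recurrence inverts any power series with nonzero constant term, which is how generating-function identities of this kind are conveniently verified.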
8.7-3. z Transforms. Definition and Inversion Integral. The z transform of a suitable sequence f₀, f₁, f₂, . . . is defined as

Z[f_k; z] ≡ f₀ + f₁/z + f₂/z² + · · · ≡ F_z(z)   (|z| > r_a)   (8.7-6)
where z is a complex variable, and the series converges absolutely outside of a circle of absolute convergence of radius r_a depending on the given sequence; analytic continuation in the manner of Sec. 8.2-3 can extend the definition. The corresponding inversion integral
f_k = (1/2πi) ∮_C F_z(z) z^{k−1} dz   (k = 0, 1, 2, . . .)   (8.7-7)
where C is a closed contour enclosing all singularities of F_z(z), gives the inverse transform Z⁻¹[F_z(z)] = f_k for suitable f_k (Ref. 8.11). Inversion can then utilize the residue calculus in the manner of Sec. 8.4-3, especially if F_z(z) is a rational function expandable in partial fractions. Inversion is even simpler if F_z(z) can be expanded directly in terms of powers of 1/z. Note that the inverse transform must be unique wherever the series (6) converges absolutely (Sec. 4.10-2c). Table 8.7-2 summarizes the most important properties of z transforms.
Their application to the solution of difference equations and sampled-data systems is treated in Sec. 20.4-6, where the relation of z transforms to jump-function Laplace transforms is also discussed. Table 20.4-1 lists a number of z-transform pairs. The z transform is related to the Mellin transform of Table 8.6-1.
Note the analogy between power series and Mellin transforms and between Dirichlet series Σ_{k=0}^∞ f_k e^{−λ_k s} and Laplace transforms.
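Inversion by direct expansion in powers of 1/z can be sketched numerically. The example below uses the standard transform pair F_z(z) = z/(z − 1)² for f_k = k (our choice of test pair); with w = 1/z the transform becomes w/(1 − 2w + w²), and the coefficient of w^k is f_k. The helper `series_div` is an illustrative name, not handbook notation.

```python
def series_div(num, den, n):
    """First n coefficients of num(w)/den(w) as a power series in w."""
    num = num + [0] * (n - len(num))
    c = []
    for k in range(n):
        acc = num[k] - sum(den[j] * c[k - j]
                           for j in range(1, min(k, len(den) - 1) + 1))
        c.append(acc / den[0])
    return c

# F_z(z) = z/(z - 1)^2.  With w = 1/z:  F = w/(1 - w)^2 = w/(1 - 2w + w^2),
# so the coefficient of w^k is the object sequence f_k.
f = series_div([0, 1], [1, -2, 1], 8)
print([round(x) for x in f])  # [0, 1, 2, 3, 4, 5, 6, 7] -> f_k = k
```

For rational F_z(z) this long-division expansion is the numerical counterpart of the partial-fraction/residue inversion mentioned above.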
8.8. RELATED TOPICS, REFERENCES, AND BIBLIOGRAPHY
8.8-1. Related Topics. The following topics related to the study of the Laplace transformation and other functional transformations are
Table 8.7-2. Corresponding Operations for z Transforms
The following theorems are valid whenever the z-transform series in question converge absolutely; limits are assumed to exist (see also Table 8.3-1), and f₋₁ = f₋₂ = · · · = g₋₁ = g₋₂ = · · · = 0.

1. Linearity (α, β constant): αf_k + βg_k ↔ αF_z(z) + βG_z(z)
2. Advanced object sequence:
   f_{k+1} ≡ Ef_k ↔ zF_z(z) − zf₀
   f_{k+r} ≡ E^r f_k ↔ z^r F_z(z) − Σ_{j=0}^{r−1} f_j z^{r−j}   (r = 1, 2, . . .)
3. Delayed object sequence: f_{k−1} ≡ E⁻¹f_k ↔ z⁻¹F_z(z)
4a. Forward difference: f_{k+1} − f_k ≡ Δf_k ↔ (z − 1)F_z(z) − zf₀
4b. Backward difference: f_k − f_{k−1} ≡ ∇f_k ↔ [(z − 1)/z]F_z(z)
5. Finite summation of object sequence: Σ_{i=0}^{k−1} f_i ↔ F_z(z)/(z − 1)
6. Convolution of object sequences: Σ_{i=0}^{k} f_i g_{k−i} ↔ F_z(z)G_z(z)
7. Corresponding limits (continuity theorem; α is independent of k and z): lim_{α→a} f_k(α) ↔ lim_{α→a} F_z(z, α)
8. Differentiation and integration with respect to a parameter α independent of k and z:
   ∂f_k(α)/∂α ↔ ∂F_z(z, α)/∂α
   ∫_{α₁}^{α₂} f_k(α) dα ↔ ∫_{α₁}^{α₂} F_z(z, α) dα
9. Initial and final values of object sequence:
   f₀ = lim_{z→∞} F_z(z)
   lim_{k→∞} f_k = lim_{z→1} (z − 1)F_z(z)*
10. Differentiation of result function: kf_k ↔ −z dF_z(z)/dz

* (z − 1)F_z(z) is assumed to be analytic for |z| > 1.
treated in other chapters of this handbook:

Expansion in partial fractions: Chap. 1
Limits, integration, and improper integrals: Chap. 4
Convergent and asymptotic series: Chap. 4
Functions of a complex variable, contour integration: Chap. 7
Transformations: Chaps. 12, 14, 15
Fourier transforms: Chap. 4
Applications of generating functions: Chap. 18, Appendix C
z transforms: Chap. 20
Tables of Fourier, Hankel, and Laplace-transform pairs: Appendix D

Applications of the Laplace transformation are discussed in other chapters of this handbook as follows:

Ordinary differential equations: Chaps. 9, 13
Partial differential equations: Chap. 10
Integral equations: Chap. 15
8.8-2. References and Bibliography.
8.1. Campbell, G. A., and R. M. Foster: Fourier Integrals for Practical Applications, Van Nostrand, Princeton, N.J., 1958.
8.2. Churchill, R. V.: Operational Mathematics, 2d ed., McGraw-Hill, New York, 1958.
8.3. Ditkin, V. A., and A. P. Prudnikov: Integral Transforms and Operational Calculus, Pergamon Press, New York, 1965.
8.4. Doetsch, G.: Anleitung zum praktischen Gebrauch der Laplace-Transformation, Oldenbourg, Munich, 1956.
8.5. Doetsch, G.: Handbuch der Laplace-Transformation, 3 vols., Birkhauser, Basel, 1950.
8.6. Doetsch, G., and D. Voelker: Die zweidimensionale Laplace-Transformation, Birkhauser, Basel, 1950.
8.7. Doetsch, G., H. Kniess, and D. Voelker: Tabellen zur Laplace-Transformation und Anleitung zum Gebrauch, Springer, Berlin, 1947.
8.8. Erdelyi, A., et al.: Tables of Integral Transforms (Bateman Project), 2 vols., McGraw-Hill, New York, 1954.
8.9. Jury, E. I.: Theory and Application of the z-Transform Method, Wiley, New York, 1964.
8.10. McLachlan, N. W.: Modern Operational Calculus, Macmillan, New York, 1948.
8.11. Miles, J. W.: Integral Transforms, in E. F. Beckenbach (ed.), Modern Mathematics for the Engineer, 2d series, McGraw-Hill, New York, 1961.
8.12. Nixon, F. E.: Handbook of Laplace Transformation, 2d ed., Prentice-Hall, Englewood Cliffs, N.J., 1965.
8.13. Papoulis, A.: The Fourier Integral and Its Applications, McGraw-Hill, New York, 1962.
8.14. Scott, E. J.: Transform Calculus, Harper, New York, 1955.
8.15. Sneddon, I. N.: Fourier Transforms, McGraw-Hill, New York, 1951.
8.16. Tranter, C. J.: Integral Transforms in Mathematical Physics, 2d ed., Wiley, New York, 1956.
8.17. Van der Pol, B., and H. Bremmer: Operational Calculus Based on the Two-sided Laplace Integral, Cambridge University Press, London, 1950.
8.18. Widder, D. V.: The Laplace Transform, Princeton University Press, Princeton, N.J., 1941.
CHAPTER 9
ORDINARY DIFFERENTIAL EQUATIONS
9.1. Introduction
  9.1-1. Survey
  9.1-2. Ordinary Differential Equations
  9.1-3. Systems of Differential Equations
  9.1-4. Existence and Desirable Properties of Solutions
  9.1-5. General Hints

9.2. First-order Equations
  9.2-1. Existence and Uniqueness of Solutions
  9.2-2. Geometrical Interpretation. Singular Integrals
  9.2-3. Transformation of Variables
  9.2-4. Solution of Special Types of First-order Equations
  9.2-5. General Methods of Solution
    (a) Picard's Method of Successive Approximations
    (b) Taylor-series Expansion

9.3. Linear Differential Equations
  9.3-1. Linear Differential Equations. Superposition Theorems
  9.3-2. Linear Independence and Fundamental Systems of Solutions
  9.3-3. Solution by Variation of Constants. Green's Functions
  9.3-4. Reduction of Two-point Boundary-value Problems to Initial-value Problems
  9.3-5. Complex-variable Theory of Linear Differential Equations. Taylor-series Solution and Effects of Singularities
  9.3-6. Solution of Homogeneous Equations by Series Expansion about a Regular Singular Point
  9.3-7. Integral-transform Methods
  9.3-8. Linear Second-order Equations
  9.3-9. Gauss's Hypergeometric Differential Equation and Riemann's Differential Equation
  9.3-10. Confluent Hypergeometric Functions
  9.3-11. Pochhammer's Notation

9.4. Linear Differential Equations with Constant Coefficients
  9.4-1. Homogeneous Linear Equations with Constant Coefficients
  9.4-2. Nonhomogeneous Equations. Normal Response, Steady-state Solution, and Transients
  9.4-3. Superposition Integrals and Weighting Functions
  9.4-4. Stability
  9.4-5. The Laplace-transform Method of Solution
  9.4-6. Periodic Forcing Functions and Solutions. The Phasor Method
    (a) Sinusoidal Forcing Functions and Solutions. Sinusoidal Steady-state Solutions
    (b) The Phasor Method
    (c) Rotating Phasors
    (d) More General Periodic Forcing Functions
  9.4-7. Transfer Functions and Frequency-response Functions
    (a) Transfer Functions
    (b) Frequency-response Functions
    (c) Relations between Transfer Functions or Frequency-response Functions and Weighting Functions
  9.4-8. Normal Coordinates and Normal-mode Oscillations
    (a) Free Oscillations
    (b) Forced Oscillations

9.5. Nonlinear Second-order Equations
  9.5-1. Introduction
  9.5-2. The Phase-plane Representation. Graphical Method of Solution
  9.5-3. Critical Points and Limit Cycles
    (a) Ordinary and Critical Phase-plane Points
    (b) Periodic Solutions and Limit Cycles
    (c) Poincare's Index and Bendixson's Theorems
  9.5-4. Poincare-Lyapounov Theory of Stability
  9.5-5. The Approximation Method of Krylov and Bogoliubov
    (a) The First Approximation
    (b) The Improved First Approximation
  9.5-6. Energy-integral Solution

9.6. Pfaffian Differential Equations
  9.6-1. Pfaffian Differential Equations
  9.6-2. The Integrable Case

9.7. Related Topics, References, and Bibliography
  9.7-1. Related Topics
  9.7-2. References and Bibliography
9.1. INTRODUCTION
9.1-1. Survey. Differential equations are used to express relations between changes in physical quantities and are thus of great importance in many applications. Sections 9.1-2 to 9.3-10 present a straightforward classical introduction to ordinary differential equations, including some complex-variable theory. Sections 9.4-1 to 9.4-8 introduce the linear differential equations with constant coefficients used in the analysis of vibrations, electric circuits, and control systems, with emphasis on solutions by Laplace-transform methods. Sections 9.5-1 to 9.5-6 deal with nonlinear second-order equations. Sections 9.6-1 and 9.6-2 introduce Pfaffian differential equations, although these are not ordinary differential equations.
Some naturally related material is treated in other chapters of this handbook, particularly in Chap. 8 and Secs. 13.6-1 to 13.6-7. Boundary-value problems, eigenvalue problems, and orthogonal-function expansions of solutions are discussed in Chap. 15, and a number of differential equations defining special functions are treated in Chap. 21.
The notation used in the various subdivisions of this chapter has been chosen so as to simplify reference to standard textbooks in different special fields. Thus the usually real variables in Secs. 9.2-1 to 9.2-5 are denoted by x, y = y(x); the frequently complex variables encountered in the general theory of linear ordinary differential equations (Secs. 9.3-1 to 9.3-10) are denoted by z, w = w(z). The variables in Secs. 9.4-1 to 9.5-6 usually represent physical time and various mechanical or electrical variables and are thus introduced as t, y_k = y_k(t).
9.1-2. Ordinary Differential Equations. An ordinary differential equation of order r is an equation

F[x, y(x), y′(x), y″(x), . . . , y^(r)(x)] = 0   (9.1-1)

to be satisfied by the function y = y(x) together with its derivatives y′(x), y″(x), . . . , y^(r)(x) with respect to a single independent variable x. To solve (integrate) a given differential equation (1) means to find functions (solutions, integrals) y(x) which satisfy Eq. (1) for all values of x in a specified bounded or unbounded interval (a, b). Note that solutions can be checked by resubstitution.
The complete primitive (complete integral, general solution) of an ordinary differential equation of order r has the form

y = y(x, C₁, C₂, . . . , C_r)   (9.1-2)

where C₁, C₂, . . . , C_r are r arbitrary constants (constants of integration, see also Sec. 4.6-4). Each particular choice of these r constants yields a particular integral (2) of the given differential equation. Typical problems require one to find the particular integral (2) subject to r initial conditions

y(x₀) = y₀   y′(x₀) = y₀′   y″(x₀) = y₀″   . . .   y^(r−1)(x₀) = y₀^(r−1)   (9.1-3)

which determine the r constants C₁, C₂, . . . , C_r. Alternatively, one may be given r boundary conditions on y(x) and its derivatives for x = a and x = b (see also Sec. 9.3-4).*
Many ordinary differential equations admit additional solutions known as singular integrals which are not included in the complete primitive (2) (see also Sec. 9.2-2b). A differential equation is homogeneous if and only if αy(x) is a solution for all α whenever y(x) is a solution (see also Secs. 9.1-5 and 9.3-4).

* Strictly speaking, initial and boundary conditions refer to unilateral derivatives (Sec. 4.5-1).
Given an r-parameter family of suitably differentiable functions (2), one can eliminate C₁, C₂, . . . , C_r from the r + 1 equations y^(j) = y^(j)(x, C₁, C₂, . . . , C_r) (j = 0, 1, 2, . . . , r) to obtain an rth-order differential equation describing the family.
Note: An ordinary differential equation is a special instance of a functional equation imposing conditions on the functional dependence y = y(x) for a set of values of x. OTHER EXAMPLES OF FUNCTIONAL EQUATIONS: y(x₁x₂) = y(x₁) + y(x₂) [logarithmic property, satisfied by y(x) = A log_a x], partial differential equations (Sec. 10.1-1), integral equations (Sec. 15.3-2), and difference equations (Sec. 20.4-3).
9.1-3. Systems of Differential Equations (see also Secs. 13.6-1 to 13.6-7). A system of ordinary differential equations

F_i(x; y₁, y₂, . . . ; y₁′, y₂′, . . .) = 0   (i = 1, 2, . . .)   (9.1-4)

involves a set of unknown functions y₁ = y₁(x), y₂ = y₂(x), . . . and their derivatives with respect to a single independent variable x. The order r_i of each differential equation (4) is that of the highest derivative occurring. In general, one will require n differential equations (4) to determine n unknown functions y_k(x); and the general solution y₁ = y₁(x), y₂ = y₂(x), . . . will involve a number of arbitrary constants equal to r = r₁ + r₂ + · · · + r_n.
The solution of a system (4) can be reduced to that of a single ordinary differential equation of order r through elimination of n − 1 variables y_k and their derivatives. More importantly, one can reduce every system (4) to an equivalent system of r first-order equations by introducing higher-order derivatives as new variables.
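The reduction to a first-order system can be sketched on a test problem of our own choosing: y″ + 2y′ + y = 0 with y(0) = 1, y′(0) = 0 becomes the system [y, v]′ = [v, −2v − y] with v = y′, whose exact solution is y = (1 + t)e^{−t}. The Runge-Kutta integrator here is illustrative, not handbook material.

```python
import math

# Reduction of a higher-order equation to a first-order system:
# y'' + 2y' + y = 0 becomes [y, v]' = [v, -2v - y] with v = y'.
# For y(0) = 1, y'(0) = 0 the exact solution is y = (1 + t) e^{-t}.

def deriv(state):
    y, v = state
    return [v, -2.0 * v - y]

def integrate(state, t_end, n=1000):
    """Classical fourth-order Runge-Kutta on the first-order system."""
    h = t_end / n
    for _ in range(n):
        k1 = deriv(state)
        k2 = deriv([state[i] + 0.5 * h * k1[i] for i in range(2)])
        k3 = deriv([state[i] + 0.5 * h * k2[i] for i in range(2)])
        k4 = deriv([state[i] + h * k3[i] for i in range(2)])
        state = [state[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
                 for i in range(2)]
    return state

y1, _ = integrate([1.0, 0.0], 1.0)
print(abs(y1 - 2 * math.exp(-1)))  # tiny RK4 truncation error
```

Every numerical ODE method in later sections operates on exactly this kind of first-order state vector.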
9.1-4. Existence and Desirable Properties of Solutions. A properly posed differential-equation problem requires an existence proof indicating the construction of a solution subject to the given type of initial or boundary conditions. The existence of physical phenomena described by a given differential equation may suggest but does not prove the existence of a solution; an existence proof checks the self-consistency of the mathematical model (see also Secs. 4.2-1b and 12.1-1; see Secs. 9.2-1 and 9.3-5 for examples of existence theorems). It is desirable to design mathematical models involving differential equations so that the solutions are continuous functions of numerical coefficients, initial conditions, etc., so as to avoid excessive errors in solutions due to small errors in numerical data (see also Sec. 9.2-1a).
9.1-5. General Hints. (a) Substitution of a Taylor series (Sec. 4.10-4) or other series expansion for y(x) in a given differential equation may yield equations for the unknown coefficients (see also Secs. 9.2-5b and 9.3-5). Many differential equations can be simplified through transformation of variables (Secs. 9.1-5b, 9.1-3, and 9.3-8c). Every differential equation or system of differential equations can be reduced to a system of first-order equations, so that the methods of Sec. 9.2-5 apply.
(b) The following special types of differential equations reduce easily to equations of lower order (see also Secs. 9.2-3 and 9.5-6):

F(x, y^(n), y^(n+1), . . . , y^(r)) = 0   (introduce ȳ = y^(n))
F(y, y′, y″, . . . , y^(r)) = 0   (introduce x̄ = y, ȳ = y′)

If a given differential equation F(x, y, y′, y″, . . . , y^(r)) = 0 is homogeneous in the arguments y, y′, y″, . . . , y^(r) (Sec. 4.5-5; this does not necessarily imply that the differential equation is homogeneous in the sense of Sec. 9.1-2), introduce ȳ = y′/y.

9.2. FIRST-ORDER EQUATIONS
9.2-1. Existence and Uniqueness of Solutions. (a) A given first-order differential equation expressible in the form

dy/dx = f(x, y)   (9.2-1)

has a solution y = y(x) through every "point" (x = x₀, y = y₀) with a neighborhood throughout which f(x, y) is continuous. More specifically, let D be a region of "points" (x, y), (x, η) where f(x, y) is single-valued, bounded, and continuous and

|f(x, y) − f(x, η)| ≤ M|y − η|   (Lipschitz condition)   (9.2-2)

for some real M independent of y and η. Then the given differential equation (1) has a unique solution y = y(x) through every point (x = x₀, y = y₀) of D, and y(x) is a continuous function of the given value y₀ = y(x₀). Each solution extends to the boundary of D.
The Lipschitz condition (2) is satisfied, in particular, whenever f(x, y) has a bounded and continuous derivative ∂f/∂y in D.
(b) (See also Sec. 9.1-3.) An analogous existence theorem applies to systems of first-order differential equations

dy_i/dx = f_i(x; y₁, y₂, . . . , y_n)   (i = 1, 2, . . . , n)   (9.2-3)

if the Lipschitz condition (2) is replaced by

|f_i(x; y₁, y₂, . . . , y_n) − f_i(x; η₁, η₂, . . . , η_n)| ≤ M Σ_{k=1}^n |y_k − η_k|   (9.2-4)
9.2-2. Geometrical Interpretation. Singular Integrals (see also Secs. 17.1-1 to 17.1-7). (a) If x, y are regarded as rectangular cartesian coordinates, a first-order differential equation

F(x, y, p) = 0   (p ≡ dy/dx)   (9.2-5)

describes a "field" of line elements (x, y, p) or elements of straight lines through (x, y) with slope p = dy/dx = f(x, y). Each line element is tangent to a curve of the one-parameter family of solutions

y = y(x, λ)   (9.2-6)

where λ is a constant of integration.
A plot of the field of tangent directions permits at least rough graphical determination of solutions; the general character of the family of solutions may be further discussed in the manner of Sec. 9.5-2. It may be helpful to know that the curves F(x, y, p₁) = 0 or f(x, y) = p₁ are isoclines where the solution curves have a specified fixed slope p₁. The curves ∂f/∂x + (∂f/∂y)f = 0 are loci of points of inflection (see also Sec. 9.5-2).
(b) Singular Integrals (see also Sec. 9.1-2). Let F(x, y, p) be twice continuously differentiable with respect to x and y, and let ∂F/∂y ≠ 0. Elimination of p from

F(x, y, p) = 0   ∂F(x, y, p)/∂p = 0   (9.2-7)

yields a curve or set of curves called the p discriminant of the given differential equation (locus of singular line elements). A curve defined by Eq. (7) is a singular integral of the given differential equation if ∂F/∂x + (∂F/∂y)p = 0 on this curve, unless both ∂F/∂x and ∂F/∂y vanish at a point of the curve. Geometrically, such singular integrals are frequently envelopes of the family of solution curves (6), and may thus be obtained from the complete primitive (6) in the manner of Sec. 17.1-7.
9.2-3. Transformation of Variables (see Sec. 9.2-4 for examples). (a) A suitable continuously differentiable transformation

x̄ = X(x, y)   ȳ = Y(x, y)   [∂(X, Y)/∂(x, y) ≠ 0]
p̄ ≡ dȳ/dx̄ = [∂Y/∂x + (∂Y/∂y)p] / [∂X/∂x + (∂X/∂y)p]   (p ≡ dy/dx)   (9.2-8)

will transform the given differential equation (1) or (5) into a new differential equation relating x̄ and ȳ. The new equation may be simpler, or a solution ȳ = ȳ(x̄) may be known. Once ȳ = ȳ(x̄) is found, y = y(x) is given implicitly, or by inverse transformation.
(b) Contact Transformations (see also Secs. 10.2-5, 10.2-7, and 11.6-8). A set of twice continuously differentiable transformation equations
x̄ = x̄(x, y, p)   ȳ = ȳ(x, y, p)   p̄ = p̄(x, y, p)   [∂(x̄, ȳ, p̄)/∂(x, y, p) ≠ 0]   (9.2-9a)

with the special property

dȳ − p̄ dx̄ ≡ g(x, y, p)(dy − p dx)   [g(x, y, p) ≠ 0]   (9.2-9b)

or

(∂x̄/∂p)[∂ȳ/∂x + p(∂ȳ/∂y)] = (∂ȳ/∂p)[∂x̄/∂x + p(∂x̄/∂y)]   (9.2-9c)

defines a contact transformation associating line elements (Sec. 9.2-2a) (x, y, p) and (x̄, ȳ, p̄) so that line elements forming regular arcs are mapped onto regular arcs, and contact of regular arcs is preserved. It is then legitimate to write p̄ = dȳ/dx̄, p = dy/dx and to use suitable contact transformations (9) to simplify differential equation (1) or (5). Once a solution ȳ = ȳ(x̄) of the transformed equation is known, y = y(x) is given implicitly or by inverse transformation.
In particular, g(x, y, p) ≡ 1 yields the easily reversible contact transformation

x̄ = p   ȳ = px − y   p̄ = x   (Legendre transformation)   (9.2-10)

which transforms a given differential equation (5) into

F(p̄, p̄x̄ − ȳ, x̄) = 0   (9.2-11)

Equation (11) may be a simpler differential equation or, indeed, an ordinary equation relating x̄ and ȳ.
9.2-4. Solution of Special Types of First-order Equations. (a) The following special types of first-order equations are relatively easy to solve.

1. The variables are separable: y′ = f₁(x)/f₂(y). Obtain the solution from ∫f₂(y) dy = ∫f₁(x) dx + C.
2. "Homogeneous" first-order equations:* y′ = f(y/x). Introduce ȳ = y/x to reduce to type 1.
3. Exact differential equations can be written in the form

P(x, y) dx + Q(x, y) dy = 0   (9.2-12)

with ∂P/∂y ≡ ∂Q/∂x (see also Sec. 5.7-1). Obtain the solution from

∫_{x₀}^{x} P(ξ, y) dξ + ∫_{y₀}^{y} Q(x₀, η) dη = C   (9.2-13)
If the expression on the left of Eq. (12) is not an exact differential (∂P/∂y ≠ ∂Q/∂x), one may be able to find an integrating factor μ = μ(x, y) such that multiplication of Eq. (12) by μ(x, y) yields an exact differential equation. The integrating factor μ(x, y) satisfies the partial differential equation

μ(∂P/∂y − ∂Q/∂x) = Q(∂μ/∂x) − P(∂μ/∂y)   (9.2-14)

4. The linear first-order equation y′ + a(x)y = f(x) (see also Secs. 9.3-1 and 9.3-3) admits the integrating factor

μ = μ(x) = exp ∫a(x) dx

The complete primitive is then

y = [1/μ(x)] [∫f(x)μ(x) dx + C]   (9.2-15)
Many first-order equations can be reduced to one of the above types by transformation of variables (Sec. 9.2-3). In particular,

y′ = f(αx + βy) reduces to type 1 if one introduces ȳ = αx + βy.

y′ = f[(α₁x + β₁y + γ₁)/(α₂x + β₂y + γ₂)] reduces to type 2 by a coordinate translation if α₁β₂ − α₂β₁ ≠ 0; otherwise introduce ȳ = α₂x + β₂y + γ₂ to separate the variables.

y′ = f₁(x)y + f₂(x)yⁿ (BERNOULLI'S DIFFERENTIAL EQUATION) reduces to a linear equation if one introduces ȳ = y^(1−n).

(b) Given a first-order equation of the form

y = h(x, y′)   or   x = h(y, y′)   (9.2-16)
it may be advantageous to differentiate both sides with respect to x. The resulting differential equation

y′ = ∂h/∂x + (∂h/∂y′)(dy′/dx)   or   1 = (∂h/∂y)y′ + (∂h/∂y′)(dy′/dx)   (9.2-17)

might be easy to solve for y′ = y′(x) or y′ = y′(y), respectively; substitution of this result into the given Eq. (16) yields the desired relation of x and y. If the solution of Eq. (17) takes the form u(x, y′) = 0 or u(y, y′) = 0, the desired relation of x and y is given in terms of a parameter p = y′.
EXAMPLES: Clairaut's differential equation y = y′x + f(y′) yields the complete primitive y = Cx + f(C) and the singular integral (in parametric representation) x = −f′(p), y = −pf′(p) + f(p). Lagrange's differential equation y = xf₁(p) + f₂(p) is solved in the same manner.
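The Clairaut solutions can be verified by resubstitution. The sketch below uses the hypothetical choice f(p) = p², for which the complete primitive is y = Cx + C² and the singular integral works out to y = −x²/4 (from x = −2p, y = −p²); both are checked against y = y′x + f(y′).

```python
# Clairaut equation y = y' x + f(y') with the hypothetical choice f(p) = p^2.
# Complete primitive: y = Cx + C^2.  Singular integral (envelope):
# x = -f'(p) = -2p, y = -p f'(p) + f(p) = -p^2, i.e. y = -x^2/4.

def f(p):
    return p * p

def clairaut_residual(y, dy, x):
    """y - (y' x + f(y')); zero when (y, y') solves the equation at x."""
    return y - (dy * x + f(dy))

# straight-line solutions y = Cx + C^2, with y' = C
for C in (-1.0, 0.5, 2.0):
    for x in (-1.0, 0.0, 3.0):
        assert abs(clairaut_residual(C * x + C * C, C, x)) < 1e-12

# singular integral y = -x^2/4, with y' = -x/2
for x in (-2.0, 0.3, 1.7):
    assert abs(clairaut_residual(-x * x / 4, -x / 2, x)) < 1e-12

print("Clairaut checks passed")
```

Note that the envelope solves the equation at every point without belonging to the one-parameter family, which is exactly what distinguishes a singular integral.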
(c) Riccati Equations. The differential equation

y′ = a(x)y² + b(x)y + c(x)   (GENERAL RICCATI EQUATION)   (9.2-18)

is sometimes simplified by the transformation y = 1/ȳ; alternatively, the substitution a(x)y = −ȳ′/ȳ leads to a homogeneous second-order equation for ȳ = ȳ(x):

ȳ″ − [a′(x)/a(x) + b(x)]ȳ′ + a(x)c(x)ȳ = 0   (9.2-19)

If a particular integral y₁(x) of Eq. (18) is known, the transformation y = y₁(x) + 1/ȳ yields a linear differential equation. If one knows two particular integrals y₁, y₂ or three particular integrals y₁, y₂, y₃, one has, respectively,

y = y₁ + (y₂ − y₁)/[1 + C exp ∫a(x)(y₂ − y₁) dx]
y = [y₁(y₂ − y₃) + Cy₂(y₁ − y₃)]/[y₂ − y₃ + C(y₁ − y₃)]   (9.2-20)

For any four particular integrals y₁, y₂, y₃, y₄, the double ratio (y₁ − y₂)(y₃ − y₄)/(y₁ − y₃)(y₂ − y₄) is constant.
The special Riccati equation

y′ + ay² = bx^m   (9.2-21)

can be reduced to type 1 if m = 4k/(1 − 2k) (k = 0, ±1, ±2, . . .). For k > 0, the transformation

x̄ = x^{m+3}   ȳ = −1/(x²y − x/a)

reduces Eq. (21) to a similar equation

dȳ/dx̄ + āȳ² = b̄x̄^{m̄}   (9.2-22a)

with

ā = −b/(m + 3)   b̄ = −a/(m + 3)   m̄ = −(m + 4)/(m + 3)   (9.2-22b)

The procedure is repeated until (after k steps) the right side of the differential equation is constant. Similarly, for k < 0, the transformation x = x̄^{−1/(m+1)}, ȳ = x^{2(m+1)}/y − (m + 1)x^{m+1}/b yields a differential equation of the form (22a) with

ā = −b/(m + 1)   b̄ = −a/(m + 1)   m̄ = −(3m + 4)/(m + 1)   (9.2-22c)

9.2-5. General Methods of Solution. (a) Picard's Method of Successive Approximations. To solve the differential equation y′ = f(x, y) for a given initial value y(x₀) = y₀, start with a trial solution y^[0](x) and compute successive approximations

y^[j+1](x) = y₀ + ∫_{x₀}^{x} f[x, y^[j](x)] dx   (j = 0, 1, 2, . . .)   (9.2-23)

to the desired solution y(x). The process converges subject to the conditions of Sec. 9.2-1. Picard's method is useful mainly if the integrals in Eq. (23) can be evaluated in closed form, although numerical integration can, in principle, be used. A completely analogous procedure applies to systems (3) of first-order differential equations.
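Picard's iteration (9.2-23) with numerical integration can be sketched on the test problem y′ = y, y(0) = 1 (exact solution e^x), a choice of ours: the iterates then approximate the partial sums of the exponential series. The grid representation and trapezoidal quadrature below are illustrative implementation details.

```python
import math

# Picard iteration (9.2-23) for the test problem y' = y, y(0) = 1,
# whose exact solution is e^x.  Iterates are stored on a uniform grid
# and the integral is taken by the cumulative trapezoidal rule.

def picard(f, x0, y0, x_end, n_grid=200, n_iter=25):
    h = (x_end - x0) / n_grid
    xs = [x0 + i * h for i in range(n_grid + 1)]
    ys = [y0] * (n_grid + 1)          # trial solution y[0](x) = y0
    for _ in range(n_iter):
        fs = [f(x, y) for x, y in zip(xs, ys)]
        new = [y0]
        for i in range(1, n_grid + 1):   # y0 + cumulative integral of f
            new.append(new[-1] + 0.5 * h * (fs[i - 1] + fs[i]))
        ys = new
    return xs, ys

xs, ys = picard(lambda x, y: y, 0.0, 1.0, 1.0)
print(abs(ys[-1] - math.e))  # small discretization error
```

After enough sweeps the remaining error is that of the quadrature rule, not of the iteration, illustrating why the method is most attractive when the integrals can be done in closed form.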
(b) Taylor-series Expansion (see also Sec. 4.10-4). If the given function f(x, y) is suitably differentiable, obtain the coefficients y^(m)(x₀)/m! of the Taylor series

y(x) = y(x₀) + y′(x₀)(x − x₀) + (1/2!)y″(x₀)(x − x₀)² + · · ·   (9.2-24)

by successive differentiations of the given differential equation: y′(x) = f(x, y), y″(x) = ∂f/∂x + (∂f/∂y)f(x, y), . . . , with x = x₀, y = y(x₀) = y₀.
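The Taylor coefficients can be generated mechanically for a test problem of our choosing, y′ = x + y with y(0) = 1 (exact solution y = 2e^x − x − 1): substituting y = Σ c_n xⁿ and matching powers of x gives the coefficient recurrence used below, which is equivalent to successive differentiation at x₀ = 0.

```python
import math

# Taylor-series solution of the test problem y' = x + y, y(0) = 1
# (exact solution y = 2 e^x - x - 1).  Substituting y = sum c_n x^n
# into the equation and matching powers of x gives the recurrence
#   (n + 1) c_{n+1} = c_n + (1 if n == 1 else 0),  c_0 = 1.

def taylor_coeffs(n_terms):
    c = [1.0]
    for n in range(n_terms - 1):
        c.append((c[n] + (1.0 if n == 1 else 0.0)) / (n + 1))
    return c

c = taylor_coeffs(20)
x = 0.5
approx = sum(cn * x**n for n, cn in enumerate(c))
exact = 2 * math.exp(x) - x - 1
print(abs(approx - exact))  # truncation error of the 20-term series
```

The first few coefficients 1, 1, 1, 1/3, 1/12, . . . are exactly y^(m)(0)/m! for this equation, so the recurrence and the differentiation scheme agree term by term.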
An analogous procedure applies to systems of first-order equations.

9.3. LINEAR DIFFERENTIAL EQUATIONS
9.3-1. Linear Differential Equations. Superposition Theorems (see also Secs. 10.4-2, 13.6-2, 13.6-3, 14.3-1, and 15.4-2). A linear ordinary differential equation of order r relating the real or complex variables z and w = w(z) has the form

Lw ≡ a₀(z) dʳw/dzʳ + a₁(z) dʳ⁻¹w/dzʳ⁻¹ + · · · + a_r(z)w = f(z)   (9.3-1)

where the a_k(z) and f(z) are real or complex functions of z. The general solution (Sec. 9.1-2) of a linear differential equation (1) can be expressed as the sum of any particular integral and the general solution of the homogeneous linear differential equation (Sec. 9.1-2)

Lw ≡ a₀(z) dʳw/dzʳ + a₁(z) dʳ⁻¹w/dzʳ⁻¹ + · · · + a_r(z)w = 0   (9.3-2)

For any given nonhomogeneous or "complete" linear differential equation (1), the homogeneous equation (2) is known as the complementary equation or reduced equation, and its general solution as the complementary function.
Let w₁(z) and w₂(z) be particular integrals of the linear differential equation (1) for the respective "forcing functions" f(z) ≡ f₁(z) and f(z) ≡ f₂(z). Then αw₁(z) + βw₂(z) is a particular integral for the forcing function f(z) ≡ αf₁(z) + βf₂(z) (Superposition Principle). In particular, every linear combination of solutions of a homogeneous linear differential equation (2) is also a solution.
The superposition theorems often represent some physical superposition principle. Mathematically, they permit one to construct solutions of Eq. (1) or (2) subject to given initial or boundary conditions by linear superposition.
Analogous theorems apply to systems of linear differential equations (see also Sec. 9.4-2).
9.3-2. Linear Independence and Fundamental Systems of Solutions (see also Secs. 1.9-3, 14.2-3, and 15.2-1a). (a) Let w₁(z), w₂(z), . . . , w_r(z) be r − 1 times continuously differentiable solutions of a homogeneous linear differential equation (2) with continuous coefficients in a domain D of values of z. The r solutions w_k(z) are linearly independent in D if and only if Σ_{k=1}^r λ_k w_k(z) ≡ 0 in D implies λ₁ = λ₂ = · · · = λ_r = 0 (Sec. 1.9-3). This is true if and only if the Wronskian determinant (Wronskian)

W[w₁, w₂, . . . , w_r] ≡
| w₁(z)         w₂(z)         · · ·   w_r(z)        |
| w₁′(z)        w₂′(z)        · · ·   w_r′(z)       |
| · · ·                                              |
| w₁^(r−1)(z)   w₂^(r−1)(z)   · · ·   w_r^(r−1)(z)  |   (9.3-3)

differs from zero throughout D. W = 0 for any z in D implies W ≡ 0 for all z in D.*
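As a small numerical illustration (our example, not the handbook's): w₁ = e^x and w₂ = e^{−x} solve w″ − w = 0, and their Wronskian (9.3-3) is the constant −2, nonzero everywhere, so they form a fundamental system.

```python
import math

# Wronskian (9.3-3) of w1 = e^x, w2 = e^{-x}, solutions of w'' - w = 0:
# W = w1 w2' - w2 w1' = -2, independent of x, so the pair is a
# fundamental system on the whole axis.

def wronskian_2x2(w1, dw1, w2, dw2, x):
    return w1(x) * dw2(x) - w2(x) * dw1(x)

w = wronskian_2x2(math.exp, math.exp,
                  lambda x: math.exp(-x), lambda x: -math.exp(-x), 0.7)
print(w)  # approximately -2
```

The constancy of W here reflects Abel's identity for an equation with no first-derivative term, consistent with the theorem that W either vanishes identically or never.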
(b) A homogeneous linear differential equation (2) of order r has at most r linearly independent solutions. r linearly independent solutions w₁(z), w₂(z), . . . , w_r(z) constitute a fundamental system of solutions whose linear combinations Σ_{k=1}^r α_k w_k(z) include all particular integrals of Eq. (2).
(c) Use of Known Solutions to Reduce the Order. If m < r linearly independent solutions w₁(z), w₂(z), . . . , w_m(z) of the homogeneous equation (2) are known, then the transformation w̄ = W[w₁, w₂, . . . , w_m, w] reduces the given differential equation to a homogeneous linear differential equation of order r − m for w̄.

9.3-3. Solution by Variation of Constants. Green's Functions. (a) Given r linearly independent solutions w₁(z), w₂(z), . . . , w_r(z) of the homogeneous linear differential equation (2), the general solution of the complete nonhomogeneous equation (1) is

w = C₁(z)w₁(z) + C₂(z)w₂(z) + · · · + C_r(z)w_r(z)   (9.3-4)

* Note that the theorem in this simple form does not apply to every set of r − 1 times continuously differentiable functions w_k(z); they must be solutions of a suitable differential equation (2).
with

Σ_{k=1}^r C_k′(z)w_k^(j)(z) = 0   (j = 0, 1, 2, . . . , r − 2)
Σ_{k=1}^r C_k′(z)w_k^(r−1)(z) = f(z)/a₀(z)   (9.3-5)

After solving the r simultaneous equations (5) for the r unknown derivatives C_k′(z), one obtains each C_k(z) = ∫C_k′(z) dz + K_k by a simple integration. In principle, this procedure reduces the solution of any linear ordinary differential equation to the solution of a homogeneous linear differential equation.
(b) Assuming real variables z = x and w = w(x) for simplicity, particular integrals of the complete differential equation (1) can often be written as

w = ∫_a^b G(x, ξ)f(ξ) dξ   (a < x < b)   (9.3-6)

where G(x, ξ) is known as the Green's function (sometimes called the weighting function, Sec. 9.4-3) yielding the specific particular integral in question. The complete integral of Eq. (1) is then

w = ∫_a^b G(x, ξ)f(ξ) dξ + Σ_{k=1}^r A_k w_k(x)   (9.3-7)

where the w_k(z) are r linearly independent solutions of Eq. (2), and the A_k are r constants of integration to be determined by suitable initial or boundary conditions.
Any given set of r linearly independent solutions w_k(x) of the complementary equation (2) permits one to construct a particular integral (6) with

G(x, ξ) = Σ_{k=1}^r C_k′(ξ)w_k(x)U(x − ξ)   (9.3-8)

where the C_k′(x) are obtained from Eq. (5), and U(x) is the unit-step function defined in Sec. 21.9-1.
For linear differential equations of order r = 2, the complete integral is given by Eq. (4) or (7) with

C₁′(x) = −[f(x)/a₀(x)] w₂(x)/[w₁(x)w₂′(x) − w₂(x)w₁′(x)]
C₂′(x) = [f(x)/a₀(x)] w₁(x)/[w₁(x)w₂′(x) − w₂(x)w₁′(x)]
G(x, ξ) = {[w₁(ξ)w₂(x) − w₂(ξ)w₁(x)] / [a₀(ξ)(w₁(ξ)w₂′(ξ) − w₂(ξ)w₁′(ξ))]} U(x − ξ)   (9.3-9)
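The r = 2 formulas can be checked on a concrete case of our choosing: for w″ + w = f(x) with the fundamental system w₁ = cos x, w₂ = sin x (a₀ = 1, Wronskian = 1), Eq. (9.3-9) gives G(x, ξ) = sin(x − ξ) for ξ < x, and for f ≡ 1 the particular integral (9.3-6) must reproduce 1 − cos x.

```python
import math

# Particular integral of w'' + w = f(x) via Eqs. (9.3-6)/(9.3-9), with the
# fundamental system w1 = cos x, w2 = sin x (Wronskian = 1, a0 = 1):
#   G(x, xi) = cos(xi) sin(x) - sin(xi) cos(x) = sin(x - xi)  (xi < x).
# For the test forcing f = 1 the exact particular integral is 1 - cos x.

def particular(x, f, n=2000):
    """Trapezoidal approximation of int_0^x sin(x - xi) f(xi) dxi."""
    h = x / n
    xs = [i * h for i in range(n + 1)]
    vals = [math.sin(x - xi) * f(xi) for xi in xs]
    return h * (0.5 * vals[0] + sum(vals[1:-1]) + 0.5 * vals[-1])

x = 1.3
w = particular(x, lambda xi: 1.0)
print(abs(w - (1 - math.cos(x))))  # quadrature error only
```

Because of the unit step U(x − ξ), only ξ < x contributes, which is why the integral above runs from 0 to x.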
(c) While the general solution (7) obtained with the aid of the particular Green's function (8) is only another way of writing Eq. (4), it is often possible to construct a Green's function G(x, ξ) such that the particular integral (6) satisfies the specific initial or boundary conditions of a given problem. Assuming boundary conditions linear and homogeneous in w(x) and its derivatives, the required Green's function G(x, ξ) must satisfy the given boundary conditions and

LG(x, ξ) = 0   (x ≠ ξ)   ∫_a^b LG(x, ξ) dξ = 1   (9.3-10)

for x in (a, b), with ∂^(r−2)G/∂x^(r−2) continuous in (a, b), and

[∂^(r−1)G/∂x^(r−1)]_{x=ξ−0}^{x=ξ+0} = 1/a₀(ξ)   (a < ξ < b)   (9.3-11)

The existence and properties of such Green's functions are discussed from a more general point of view in Sec. 15.5-1; see also Sec. 9.4-3. Table 9.3-1 lists the Green's functions for a number of boundary-value problems.

9.3-4. Reduction of Two-point Boundary-value Problems to Initial-value
Problems. The general theory of boundary-value problems and eigenvalue problems involving ordinary differential equations is treated in Secs. 15.4-1 to 15.5-2 (see also Secs. 9.3-3, 20.9-2, and 20.9-3). The following method is often useful in connection with numerical solution methods.
Given an rth-order linear differential equation Lw = f(z) with r suitable boundary conditions to be satisfied by w(z) and its derivatives for z = a, z = b, write the solution as

w = w₀(z) + Σ_{k=1}^r α_k w_k(z)   (9.3-12a)

where the w_k(z) are defined by the r + 1 initial-value problems

Lw₀(z) = f(z)   with   w₀^(j)(a) = 0   (j = 0, 1, 2, . . . , r − 1)
Lw_k(z) = 0   with   w_k^(j)(a) = 1 for j = k − 1, 0 for j ≠ k − 1   (k = 1, 2, . . . , r)   (9.3-12b)

Apply the r given boundary conditions to the general solution (12a) to obtain r simultaneous equations for the r unknown coefficients α_k.
Note: Given a nonlinear boundary-value problem like

d²w/dz² = f(z, w, w′)   with   w(a) = w_a   w(b) = w_b   (9.3-13)

one can often calculate w(b) for two or three trial values of the unknown initial value w′(a); the correct value of w′(a) is then approximated by interpolation.
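The superposition scheme (9.3-12) can be sketched on a linear test problem of our choosing: w″ = −w on (0, π/2) with w(0) = 0, w(π/2) = 1 (exact solution w = sin z). Since f ≡ 0 here, w₀ ≡ 0, and only the two homogeneous initial-value problems are integrated (by classical Runge-Kutta, an illustrative choice); the boundary conditions then fix α₁, α₂.

```python
import math

# Two-point boundary-value problem solved by superposition (9.3-12):
# w'' = -w on (0, pi/2) with w(0) = 0, w(pi/2) = 1 (exact: w = sin z).
# Each initial-value problem is the first-order system [w, w']' = [w', -w].

def rk4(z0, y0, z_end, n=400):
    def F(y):
        return [y[1], -y[0]]
    h = (z_end - z0) / n
    y = list(y0)
    for _ in range(n):
        k1 = F(y)
        k2 = F([y[i] + 0.5 * h * k1[i] for i in range(2)])
        k3 = F([y[i] + 0.5 * h * k2[i] for i in range(2)])
        k4 = F([y[i] + h * k3[i] for i in range(2)])
        y = [y[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
             for i in range(2)]
    return y

b = math.pi / 2
w1_b = rk4(0.0, [1.0, 0.0], b)[0]   # w1: w(0) = 1, w'(0) = 0
w2_b = rk4(0.0, [0.0, 1.0], b)[0]   # w2: w(0) = 0, w'(0) = 1
# Boundary conditions: w(0) = a1 = 0;  w(b) = a1 w1_b + a2 w2_b = 1
a1, a2 = 0.0, 1.0 / w2_b
w_mid = (a1 * rk4(0.0, [1.0, 0.0], b / 2)[0]
         + a2 * rk4(0.0, [0.0, 1.0], b / 2)[0])
print(abs(w_mid - math.sin(b / 2)))  # small integration error
```

For a linear equation this superposition solves the boundary-value problem in one pass; the interpolation-on-w′(a) device of the Note is only needed in the nonlinear case (9.3-13).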
9.3-5. Complex-variable Theory of Linear Differential Equations. Taylor-series Solution and Effects of Singularities. (a) A given
9.3-5
ORDINARY DIFFERENTIAL EQUATIONS
256
Table 9.3-1. Green's Functions for Linear Boundary-value Problems
Each boundary-value problem listed has the solution w(x) = ∫[a to b] G(x, ξ)f(ξ) dξ. Use G(x, ξ) to obtain solutions for other initial or boundary conditions from Eq. (9.3-7) (see also Secs. 9.3-3, 9.4-3, 10.4-2, 15.4-8, and 15.5-1). The table yields solutions for other intervals (a, b) with the aid of suitable coordinate transformations. In every case G(x, ξ) = G(ξ, x); each entry is given for x ≤ ξ.

No. | Differential equation | Interval (a, b) | Boundary conditions | G(x, ξ)

1a. d²w/dx² = f(x);  (0, 1);  w(0) = w(1) = 0;  G = x(ξ − 1)
1b. d²w/dx² = f(x);  (0, 1);  w(0) = 0, w'(1) = 0;  G = −x
1c. d²w/dx² = f(x);  (−1, 1);  w(−1) = w(1) = 0;  G = −½(x − ξ − xξ + 1)
1d. d²w/dx² = f(x);  (−1, 1);  w(−1) = w(1), w'(−1) = w'(1);  G = −¼(x − ξ)² − ½(x − ξ) − ⅙ *
1e. d²w/dx² = f(x);  (0, 1);  w(0) = −w(1), w'(0) = −w'(1);  G = −½(x − ξ) − ¼
2.  d²w/dx² − k²w = f(x);  (−∞, ∞);  w finite in (−∞, ∞);  G = −(1/2k)e^(−k|x−ξ|)
3a. d²w/dx² + k²w = f(x);  (0, 1);  w(0) = w(1) = 0;  G = −sin kx sin k(1 − ξ)/(k sin k)
3b. d²w/dx² + k²w = f(x);  (−1, 1);  w(−1) = w(1), w'(−1) = w'(1);  G = cos k(x − ξ + 1)/(2k sin k)
4a. d²w/dx² − k²w = f(x);  (0, 1);  w(0) = w(1) = 0;  G = −sinh kx sinh k(1 − ξ)/(k sinh k)
4b. d²w/dx² − k²w = f(x);  (−1, 1);  w(−1) = w(1), w'(−1) = w'(1);  G = −cosh k(x − ξ + 1)/(2k sinh k)
5.  (d/dx)(x dw/dx) − (m²/x)w = f(x) (inhomogeneous Bessel equation);  (0, 1);  w(0) finite, w(1) = 0;
    G = log_e ξ  (m = 0);  G = −(1/2m)[(x/ξ)^m − (xξ)^m]  (m = 1, 2, . . .)
6.  (d/dx)[(1 − x²) dw/dx] − [m²/(1 − x²)]w = f(x) (inhomogeneous Legendre equation);  (−1, 1);  w(−1) finite, w(1) finite;
    G = −log_e (1 + x)(1 − ξ) − log_e 2 + ½  (m = 0)*;  G = −(1/2m)[(1 + x)(1 − ξ)/((1 − x)(1 + ξ))]^(m/2)  (m = 1, 2, . . .)
7.  d⁴w/dx⁴ = f(x);  (0, 1);  w(0) = w'(0) = w(1) = w'(1) = 0;  G = −x²(ξ − 1)²(2xξ + x − 3ξ)/6

* This is a modified Green's function in the sense of Sec. 15.5-1b and does not satisfy Eq. (9.3-10).
linear differential equation (1) has an analytic solution w = w(z) at every regular point z where the functions a_k(z) and f(z) are analytic (see also Sec. 7.3-3). If these functions are single-valued, and if D is a simply connected region of regular points, a given set of values w(z₀), w'(z₀), . . . , w^(r−1)(z₀) for some point z₀ in D defines a unique solution w(z) in D. To obtain this solution in Taylor-series form (see also Sec. 4.10-4), substitute
w(z) = w(z₀) + w'(z₀)(z − z₀) + (1/2!)w''(z₀)(z − z₀)² + ⋯     (9.3-14)

and

f(z) = f(z₀) + f'(z₀)(z − z₀) + (1/2!)f''(z₀)(z − z₀)² + ⋯

into the given differential equation (1); comparison of coefficients will yield recurrence relations for the unknown coefficients (1/k!)w^(k)(z₀) (k = r, r + 1, . . .). The series (14) converges absolutely and uniformly within every circle |z − z₀| < R in D.
(b) Analytic continuation of any solution w(z) of a linear differential equation around singularities of one or more coefficients a_k(z) will, in general, yield different branches of a multiple-valued solution (see also Secs. 7.4-2, 7.6-2, and 7.8-1). In particular, one complete circuit around a singularity will transform a fundamental system of solutions w₁(z), w₂(z), . . . , w_r(z) of the homogeneous linear differential equation (2) into a new fundamental system w̃₁(z), w̃₂(z), . . . , w̃_r(z). The two fundamental systems are necessarily related by a nonsingular linear transformation
w̃_i(z) = Σ[k=1 to r] a_ik w_k(z)     (i = 1, 2, . . . , r)     (9.3-15)

The eigenvalues λ_k of the matrix [a_ik] (Sec. 13.4-2) are independent of the particular fundamental system w₁(z), w₂(z), . . . , w_r(z) in question.
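The coefficient-comparison recipe of (a) can be sketched numerically; the equation w'' + w = 0 with w(0) = 1, w'(0) = 0 is an illustrative choice (exact solution cos z):

```python
# Taylor-series solution (9.3-14) for the sample equation w'' + w = 0.
# Comparing coefficients of z^k in w'' = -w gives the recurrence
#   (k + 2)(k + 1) c_{k+2} = -c_k,   where  c_k = w^(k)(0) / k!.
from math import cos

def taylor_coeffs(n):
    c = [1.0, 0.0]                      # c_0 = w(0), c_1 = w'(0)
    for k in range(n - 1):
        c.append(-c[k] / ((k + 2) * (k + 1)))
    return c

c = taylor_coeffs(20)
z = 0.7
approx = sum(ck * z**k for k, ck in enumerate(c))
print(abs(approx - cos(z)) < 1e-12)    # partial sum matches cos(0.7)
```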
9.3-6. Solution of Homogeneous Equations by Series Expansion about a Regular Singular Point. (a) A singularity z = z₁ of one or more a_k(z) is an isolated singularity of the homogeneous linear differential equation (2) if and only if z₁ has a neighborhood containing no other singular point. An isolated singularity z = z₁ is a regular singular point of the homogeneous linear differential equation if and only if none of its solutions w(z) has an essential singularity at z₁; otherwise z = z₁ is an essential singularity of the given differential equation. z = z₁ is a regular singular point if and only if a_k(z)/a₀(z) has at worst a pole of order k at z = z₁ (k = 1, 2, . . . , r) (Fuchs's Theorem). In this case,
Eq. (2) can be rewritten as
(z − z₁)^r d^r w/dz^r + (z − z₁)^(r−1) p₁(z) d^(r−1)w/dz^(r−1) + (z − z₁)^(r−2) p₂(z) d^(r−2)w/dz^(r−2) + ⋯ + p_r(z)w = 0     (9.3-16)

where all p_k(z) are analytic in a neighborhood D₁ of z₁.
It follows that a given homogeneous linear differential equation (2) admits a solution of the form

w = (z − z₁)^μ Σ[k=0 to ∞] a_k(z − z₁)^k     (9.3-17)

whenever z = z₁ is a regular point or a regular singular point. The exponent μ must satisfy the rth-degree algebraic equation

μ(μ − 1) ⋯ (μ − r + 1) + μ(μ − 1) ⋯ (μ − r + 2)p₁(z₁) + ⋯ + μp_(r−1)(z₁) + p_r(z₁) = 0     (INDICIAL EQUATION)     (9.3-18)

The first coefficient a₀ may be chosen at will, and the other coefficients a_k are found successively from a set of recurrence relations obtained on substitution of the series (17) into Eq. (2) or (16). The series converges absolutely and uniformly within every circle |z − z₁| < R in D₁. Different roots μ = μ₁, μ₂, . . . , μ_r of the indicial equation (18) yield linearly independent solutions (17) of the given differential equation, unless two roots μ_k coincide or differ by an integer. In such cases, one may use the known solutions to reduce the order of the given differential equation in the manner of Sec. 9.3-2c or 9.3-7a, or use Frobenius's method (Ref. 9.14); see also Sec. 9.3-8a. The exponents μ_k are related to the eigenvalues λ_k obtained from Eq. (15) for one circuit about the regular singular point z₁ by λ_k = e^(2πiμ_k).
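As a sketch of the series (17) and the recurrence relations, one may take Bessel's equation z²w'' + zw' + (z² − ν²)w = 0 (a standard example; the function and tolerances below are my choices):

```python
# Frobenius series (9.3-17) at the regular singular point z = 0 of
# Bessel's equation  z^2 w'' + z w' + (z^2 - nu^2) w = 0.
# Here p1(z) = 1, p2(z) = z^2 - nu^2, so the indicial equation (9.3-18)
# reads mu(mu - 1) + mu - nu^2 = mu^2 - nu^2 = 0, with roots mu = ±nu.
# Substituting the series gives [(mu + k)^2 - nu^2] a_k = -a_{k-2}.
from scipy.special import j0

def frobenius_bessel(nu, z, terms=30):
    mu = nu                      # larger indicial root
    a = [1.0, 0.0]               # a_0 chosen at will, a_1 = 0
    for k in range(2, terms):
        a.append(-a[k - 2] / ((mu + k)**2 - nu**2))
    return sum(ak * z**(mu + k) for k, ak in enumerate(a))

# with nu = 0 and a_0 = 1 the series is the Bessel function J_0(z)
print(abs(frobenius_bessel(0, 1.5) - j0(1.5)) < 1e-12)
```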
(b) Regular Singular Points at Infinity (see also Secs. 7.2-1 and 7.6-3). z = ∞ is a regular singular point of Eq. (2) if and only if the transformation

z = 1/z̄     w(z) = w(1/z̄) ≡ w̄(z̄)     dw/dz = −z̄² dw̄/dz̄     d²w/dz² = z̄⁴ d²w̄/dz̄² + 2z̄³ dw̄/dz̄     . . .     (9.3-19)
yields a differential equation having a regular singular point at z̄ = 0. In this case, one may obtain solutions of the transformed equation in the manner of Sec. 9.3-6a.
(c) Generalization. If z = z₁ is not a regular singular point (e.g., if the functions p_k have poles at z = z₁) one may still write solutions similar to Eq. (17) by replacing each power series by a Laurent series (Sec. 7.5-3) admitting negative as well as positive powers of (z − z₁).
9.3-7. Integral-transform Methods.
The solution of a linear differential equation (1) with polynomial coefficients a_k(z) ≡ Σ_h a_hk z^h and given initial conditions w^(j)(0) = w₀^(j) is often simplified through the use of the unilateral Laplace transformation in the manner of Sec. 9.4-5. Apply the formula

ℒ[z^h w^(k)(z); s] = (−1)^h d^h/ds^h {s^k ℒ[w(z); s] − Σ[j=0 to k−1] s^(k−j−1) w^(j)(0 + 0)}     (9.3-20)

to obtain a new and possibly simpler differential equation for the Laplace transform ℒ[w(z); s] of the solution. Boundary-value problems can be transformed to initial-value problems by the method of Sec. 9.3-4. More general integral transformations (Table 8.6-1) may be similarly employed in various special cases (see also Sec. 10.5-1 and Ref. 9.14).
9.3-8. Linear Second-order Equations (see also Secs. 15.4-3c and 15.5-4). (a) The theory of Secs. 9.3-1 to 9.3-4 applies to linear second-order equations, so that one is mainly interested in the solution of homogeneous linear second-order equations

d²w/dz² + a₁(z) dw/dz + a₂(z)w = 0     (9.3-21)

Equation (21) is equivalent to
d/dz [p(z) dw/dz] + q(z)w = 0     p(z) = exp ∫ a₁(z) dz     q(z) = a₂(z)p(z)     (9.3-22)
If any solution w₁(z) of Eq. (21) or (22) is known, the complete primitive is

w = C₁w₁(z) + C₂w₁(z) ∫ dz/(p(z)[w₁(z)]²)     (9.3-23)
(b) Series Expansion of Solutions (see also Secs. 9.3-5, 9.3-6, 9.3-9, and 9.3-10). In the important special case of a second-order equation expressible in the form

(z − z₁)² d²w/dz² + (z − z₁)p₁(z) dw/dz + p₂(z)w = 0     (9.3-24)

where p₁(z) and p₂(z) are analytic at z = z₁, the indicial equation (18) reduces to

μ² + μ[p₁(z₁) − 1] + p₂(z₁) = 0     (9.3-25)

The indicial equation has two roots. The root μ = μ₁ having the larger real part yields a solution of the form

w = w₁(z) ≡ (z − z₁)^μ₁ Σ[k=0 to ∞] a_k(z − z₁)^k     (9.3-26)
The second root μ₂ yields a similar linearly independent solution, which is replaced by

w = Aw₁(z) log_e (z − z₁) + (z − z₁)^μ₂ Σ[k=0 to ∞] b_k(z − z₁)^k     (9.3-27)

if μ₁ and μ₂ are identical or differ by an integer. Substitute each solution (26) or (27) into the given differential equation (24) to obtain recurrence relations for the coefficients.
(c) Transformation of Variables. The following transformations may simplify a given differential equation (21) or reduce it to a differential equation with a known solution.
1. The substitution w = w̄ exp ∫ ζ(z) dz transforms Eq. (21) into

d²w̄/dz² + [a₁(z) + 2ζ(z)] dw̄/dz + [a₂(z) + a₁(z)ζ(z) + ζ²(z) + ζ'(z)]w̄ = 0     (9.3-28)

One attempts to choose ζ(z) so as to simplify the new equation; in particular, ζ = −a₁/2 removes the first-derivative term.
2. The substitution

w = Ae^(∫ v(z) dz)     (9.3-29)

transforms Eq. (21) into a first-order differential equation of the Riccati type (Sec. 9.2-4c).
(d) Existence and Zeros of Solutions for Real Arguments. Let x be a real variable. The homogeneous linear differential equation

Lw ≡ d²w/dx² + a₁(x) dw/dx + a₂(x)w = 0     (9.3-30)

has a solution w = w(x) in every interval [a, b] where a₁(x) and a₂(x) are real and continuous; the solution is uniquely determined by the values w(x₀), w'(x₀) for some x₀ in [a, b]. Unless w ≡ 0 in [a, b], w(x) has at most a finite number of zeros in any finite interval [a, b]; the zeros of any two linearly independent solutions alternate in [a, b].
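The zero-alternation property can be illustrated numerically; the equation w'' + w = 0 on [0, 10], with independent solutions sin x and cos x, is an arbitrary choice:

```python
# Numerical illustration of the alternating-zeros (Sturm separation)
# statement above for w'' + w = 0 on [0, 10].
import numpy as np

zeros_sin = np.arange(0, 4) * np.pi               # zeros of sin in [0, 10]
zeros_cos = np.pi / 2 + np.arange(0, 3) * np.pi   # zeros of cos in [0, 10]

# between consecutive zeros of sin there is exactly one zero of cos
counts = [int(np.sum((zeros_cos > lo) & (zeros_cos < hi)))
          for lo, hi in zip(zeros_sin[:-1], zeros_sin[1:])]
print(counts)   # [1, 1, 1]
```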
9.3-9. Gauss's Hypergeometric Differential Equation and Riemann's Differential Equation (see also Secs. 9.3-5 and 9.3-6). (a) The homogeneous linear differential equation

z(1 − z) d²w/dz² + [c − (a + b + 1)z] dw/dz − abw = 0     (HYPERGEOMETRIC DIFFERENTIAL EQUATION)     (9.3-31)

has regular singular points at z = ∞ (exponents μ = a, μ = b), z = 1 (exponents μ = 0, μ = c − a − b), and z = 0 (exponents μ = 0, μ = 1
− c), and no other singularities. The solutions of Eq. (31) include many elementary functions, as well as many of the special transcendental functions of Chap. 21, as special cases. Series expansion about z = z₁ = 0 yields solutions (hypergeometric functions) (17) for μ = 0 and μ = 1 − c, with

a_(k+1) = [(a + μ + k)(b + μ + k) / ((c + μ + k)(1 + μ + k))] a_k

For μ = 0 one obtains the special hypergeometric function

F(a, b; c; z) = 1 + (ab/c)z + [a(a + 1)b(b + 1)/(2! c(c + 1))]z² + ⋯     (HYPERGEOMETRIC SERIES)     (9.3-32)
The series converges uniformly and absolutely for |z| < 1; the convergence extends to the unit circle if Re (a + b − c) < 1, except for the point z = 1 if Re (a + b − c) > 0. The series reduces to a geometric series (Sec. 4.10-2) for a = 1, b = c, and to a Jacobi polynomial (Sec. 21.7-8) if a and/or b equals zero or any negative integer. The function (32) is undefined if one of the denominators c, c + 1, . . . equals zero and does not cancel out.
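Partial sums of the series (9.3-32) can be checked against scipy's built-in Gauss hypergeometric function; the parameter values are arbitrary sample choices:

```python
# Hypergeometric series (9.3-32) via the term-ratio recurrence
#   a_{k+1}/a_k = (a + k)(b + k) / ((c + k)(1 + k)) * z,
# checked against scipy.special.hyp2f1 for |z| < 1.
from scipy.special import hyp2f1

def F_series(a, b, c, z, terms=200):
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (a + k) * (b + k) / ((c + k) * (k + 1)) * z
    return total

a, b, c, z = 0.5, 1.5, 2.5, 0.3
print(abs(F_series(a, b, c, z) - hyp2f1(a, b, c, z)) < 1e-10)
```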
A second (linearly independent) solution of Eq. (31) may be obtained in the manner of Sec. 9.3-8b; in particular,

w = z^(1−c) F(a − c + 1, b − c + 1; 2 − c; z)     (9.3-33)
is a solution whenever c is not an integer. Note the following relations (|z| < 1):

F(a, b; c; z) = [Γ(c)/(Γ(a)Γ(c − a))] ∫₀¹ t^(a−1) (1 − t)^(c−a−1) (1 − tz)^(−b) dt     [Re (c) > Re (a) > 0]     (9.3-34)

dF(a, b; c; z)/dz = (ab/c) F(a + 1, b + 1; c + 1; z)     (9.3-35)

F(a, b; c; 1) = Γ(c)Γ(c − a − b)/[Γ(c − a)Γ(c − b)]     [Re (a + b − c) < 0]     (9.3-36)

The following formulas serve for analytic continuation of F(a, b; c; z) outside the unit circle:

F(a, b; c; z) = (1 − z)^(−a) F(a, c − b; c; z/(z − 1))
  = (1 − z)^(−b) F(b, c − a; c; z/(z − 1)) = (1 − z)^(c−a−b) F(c − a, c − b; c; z)     (9.3-37)

F(a, b; c; z) = [Γ(c)Γ(c − a − b)/(Γ(c − a)Γ(c − b))] F(a, b; a + b − c + 1; 1 − z)
  + [Γ(c)Γ(a + b − c)/(Γ(a)Γ(b))] (1 − z)^(c−a−b) F(c − a, c − b; c − a − b + 1; 1 − z)     (9.3-38)
F(a, b; c; z) = [Γ(c)Γ(b − a)/(Γ(b)Γ(c − a))] (−z)^(−a) F(a, 1 − c + a; 1 − b + a; 1/z)
  + [Γ(c)Γ(a − b)/(Γ(a)Γ(c − b))] (−z)^(−b) F(b, 1 − c + b; 1 − a + b; 1/z)     (9.3-39)

See Table 9.3-2 and Refs. 21.9 and 21.11 for additional formulas.
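The first transformation in (9.3-37) can be checked numerically; the parameter values below are an arbitrary sample point:

```python
# Numerical check of the Euler/Pfaff transformation in (9.3-37),
#   F(a, b; c; z) = (1 - z)^(-a) F(a, c - b; c; z/(z - 1)),
# using scipy's hyp2f1.
from scipy.special import hyp2f1

a, b, c, z = 0.3, 0.7, 1.9, 0.4
lhs = hyp2f1(a, b, c, z)
rhs = (1 - z)**(-a) * hyp2f1(a, c - b, c, z / (z - 1))
print(abs(lhs - rhs) < 1e-10)
```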
(b) Riemann's Differential Equation. Paperitz Notation. Equation (31) is a special case of the linear homogeneous differential equation

d²w/dz² + [(1 − α − α')/(z − z₁) + (1 − β − β')/(z − z₂) + (1 − γ − γ')/(z − z₃)] dw/dz
  + [αα'(z₁ − z₂)(z₁ − z₃)/(z − z₁) + ββ'(z₂ − z₃)(z₂ − z₁)/(z − z₂) + γγ'(z₃ − z₁)(z₃ − z₂)/(z − z₃)] w/[(z − z₁)(z − z₂)(z − z₃)] = 0
     (RIEMANN'S DIFFERENTIAL EQUATION)     (9.3-40)

whose only singularities are distinct regular singular points at z = z₁ (exponents α, α'), z = z₂ (exponents β, β'), and z = z₃ (exponents γ, γ'); note

α + α' + β + β' + γ + γ' = 1

A solution of Eq. (40), written in the so-called Paperitz notation, is

w = [(z − z₁)/(z − z₂)]^α [(z − z₃)/(z − z₂)]^γ F(α + β + γ, α + β' + γ; 1 + α − α'; (z − z₁)(z₃ − z₂)/[(z − z₂)(z₃ − z₁)])     (9.3-41)

which reduces to Eq. (32) for z₁ = 0, z₂ = ∞, z₃ = 1 and α = 0, β = a, γ = 0; α' = 1 − c, β' = b, γ' = c − a − b. See Table 9.3-2 and Ref. 21.11 for additional formulas.
9.3-10. Confluent Hypergeometric Functions. One can move the singularity z = 1 of the hypergeometric differential equation (31) to z = b by substituting z/b for z; the singularity at z = b will then approach the original singularity at z = ∞ as b → ∞ (confluence of singularities). One thus obtains the new differential equation

z d²w/dz² + (c − z) dw/dz − aw = 0     (KUMMER'S CONFLUENT HYPERGEOMETRIC DIFFERENTIAL EQUATION)     (9.3-42)

whose only singularities are a regular singular point at z = 0 and an essential singularity at z = ∞. Many special transcendental functions are solutions of Eq. (42) for special values of a and c (Chap. 21). Series expansion about z = z₁ = 0 yields solutions (confluent hypergeometric functions) (17) for μ = 0 and μ = 1 − c, with

a_(k+1) = [(a + μ + k) / ((c + μ + k)(1 + μ + k))] a_k
Table 9.3-2. Additional Formulas Relating to Hypergeometric Functions
1. Gauss's Recursion Formulas

cF(a, b − 1; c; z) − cF(a − 1, b; c; z) + (a − b)zF(a, b; c + 1; z) = 0
c(a − b)F(a, b; c; z) − a(c − b)F(a + 1, b; c + 1; z) + b(c − a)F(a, b + 1; c + 1; z) = 0
c(c + 1)F(a, b; c; z) − c(c + 1)F(a, b; c + 1; z) − abzF(a + 1, b + 1; c + 2; z) = 0
cF(a, b; c; z) − (c − a)F(a, b + 1; c + 1; z) − a(1 − z)F(a + 1, b + 1; c + 1; z) = 0
cF(a, b; c; z) + (b − c)F(a + 1, b; c + 1; z) − b(1 − z)F(a + 1, b + 1; c + 1; z) = 0
c(c − bz − a)F(a, b; c; z) − c(c − a)F(a − 1, b; c; z) + abz(1 − z)F(a + 1, b + 1; c + 1; z) = 0
c(c − az − b)F(a, b; c; z) − c(c − b)F(a, b − 1; c; z) + abz(1 − z)F(a + 1, b + 1; c + 1; z) = 0
cF(a, b; c; z) − cF(a, b + 1; c; z) + azF(a + 1, b + 1; c + 1; z) = 0
cF(a, b; c; z) − cF(a + 1, b; c; z) + bzF(a + 1, b + 1; c + 1; z) = 0
c[a − (c − b)z]F(a, b; c; z) − ac(1 − z)F(a + 1, b; c; z) + (c − a)(c − b)zF(a, b; c + 1; z) = 0
c[b − (c − a)z]F(a, b; c; z) − bc(1 − z)F(a, b + 1; c; z) + (c − a)(c − b)zF(a, b; c + 1; z) = 0
c(c + 1)F(a, b; c; z) − c(c + 1)F(a, b + 1; c + 1; z) + a(c − b)zF(a + 1, b + 1; c + 2; z) = 0
c(c + 1)F(a, b; c; z) − c(c + 1)F(a + 1, b; c + 1; z) + b(c − a)zF(a + 1, b + 1; c + 2; z) = 0
cF(a, b; c; z) − (c − b)F(a, b; c + 1; z) − bF(a, b + 1; c + 1; z) = 0
cF(a, b; c; z) − (c − a)F(a, b; c + 1; z) − aF(a + 1, b; c + 1; z) = 0

2. Miscellaneous Formulas

(1 + z)^(2a) F(a, a − b + ½; b + ½; z²) = F(a, b; 2b; 4z/(1 + z)²)

(1 + z)^(−2a) F(a, a + ½; c; 4z/(1 + z)²) = F(2a, 2a + 1 − c; c; z)

F(a, b; a + b + ½; sin² θ) = F(2a, 2b; a + b + ½; sin² (θ/2))
Table 9.3-2. Additional Formulas Relating to Hypergeometric Functions (Continued)
3. Some Elementary Functions Expressed in Terms of the Hypergeometric Function (see also Table 21.7-1)

(1 + x)^n = F(−n, 1; 1; −x)
log_e (1 + x) = xF(1, 1; 2; −x)
arcsin x = xF(½, ½; 3/2; x²)
arctan x = xF(½, 1; 3/2; −x²)
sin nx = n sin x F((1 + n)/2, (1 − n)/2; 3/2; sin² x)
cos nx = F(n/2, −n/2; ½; sin² x) = cosⁿ x F(−n/2, (1 − n)/2; ½; −tan² x)
K(k) = (π/2)F(½, ½; 1; k²)
E(k) = (π/2)F(−½, ½; 1; k²)
Table 9.3-3. Additional Formulas Relating to Confluent Hypergeometric Functions

F(a; 2a; 2z) = 2^(a−½) Γ(a + ½) e^z (ze^(iπ/2))^(½−a) J_(a−½)(ze^(iπ/2))
F(a; c; z) = e^z F(c − a; c; −z)
aF(a + 1; c + 1; z) = (a − c)F(a; c + 1; z) + cF(a; c; z)
aF(a + 1; c; z) = (z + 2a − c)F(a; c; z) + (c − a)F(a − 1; c; z)
d^n F(a; c; z)/dz^n = [(a)_n/(c)_n] F(a + n; c + n; z)     (n = 0, 1, 2, . . .)
F(a; c; z) = [Γ(c)/(Γ(a)Γ(c − a))] 2^(1−c) e^(z/2) ∫₋₁¹ e^(zt/2) (1 − t)^(c−a−1) (1 + t)^(a−1) dt     (0 < Re a < Re c)
[Γ(a + ν + 1)/Γ(ν + 1)] F(−a; ν + 1; z) = e^z z^(−ν/2) ∫₀^∞ e^(−t) t^(a+ν/2) J_ν(2√(zt)) dt     [Re (a + ν + 1) > 0; |arg z| < π/2]
For μ = 0 one obtains Kummer's confluent hypergeometric function

w = F(a; c; z) = 1 + (a/c)z + [a(a + 1)/(2! c(c + 1))]z² + ⋯     (|z| < ∞)     (CONFLUENT HYPERGEOMETRIC SERIES)     (9.3-43)
(see also Sec. 21.7-5). A second solution may be obtained in the manner of Sec. 9.3-8b; in particular, the confluent hypergeometric function of the second kind

Ψ(a; c; z) ≡ [Γ(1 − c)/Γ(a − c + 1)] F(a; c; z) + [Γ(c − 1)/Γ(a)] z^(1−c) F(a − c + 1; 2 − c; z)     (9.3-44)

is a solution whenever c is not an integer. Note the following relations:

F(a; c; z) = [Γ(c)/(Γ(a)Γ(c − a))] ∫₀¹ t^(a−1) (1 − t)^(c−a−1) e^(zt) dt     [Re (c) > Re (a) > 0]     (9.3-45)

dF(a; c; z)/dz = (a/c) F(a + 1; c + 1; z)     (9.3-46)

F(a; c; z) = e^z F(c − a; c; −z)     (9.3-47)
See Refs. 21.9 and 21.11 for additional formulas.
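The series (9.3-43) and Kummer's transformation (9.3-47) can be checked against scipy; the parameter values are arbitrary sample choices:

```python
# Confluent hypergeometric series (9.3-43) via its term-ratio recurrence
#   a_{k+1}/a_k = (a + k) / ((c + k)(1 + k)) * z,
# checked against scipy.special.hyp1f1, together with the Kummer
# transformation (9.3-47): F(a; c; z) = e^z F(c - a; c; -z).
import math
from scipy.special import hyp1f1

def M_series(a, c, z, terms=60):
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (a + k) / ((c + k) * (k + 1)) * z
    return total

a, c, z = 0.7, 1.9, 1.3
print(abs(M_series(a, c, z) - hyp1f1(a, c, z)) < 1e-10)
print(abs(hyp1f1(a, c, z) - math.exp(z) * hyp1f1(c - a, c, -z)) < 1e-10)
```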
9.3-11. Pochhammer's Notation. The infinite series (32) and (43) are special cases of

_mF_n(a₁, a₂, . . . , a_m; c₁, c₂, . . . , c_n; z) = Σ[k=0 to ∞] (a₁)_k(a₂)_k ⋯ (a_m)_k / [k! (c₁)_k(c₂)_k ⋯ (c_n)_k] z^k     (9.3-48)

where (x)_k ≡ x(x + 1) ⋯ (x + k − 1). In this notation the hypergeometric function (32) becomes ₂F₁(a, b; c; z), and the confluent hypergeometric function (43) is written as ₁F₁(a; c; z).

9.4. LINEAR DIFFERENTIAL EQUATIONS WITH CONSTANT COEFFICIENTS
9.4-1. Homogeneous Linear Equations with Constant Coefficients
(see also Sec. 9.3-1).
(a) The first-order differential equation

a₀ dy/dt + a₁y = 0     (a₀ ≠ 0)     (9.4-1)

has the solution

y = ce^(−(a₁/a₀)t)     [c = y(0)]     (9.4-2)
For a₀/a₁ > 0,

y(a₀/a₁) = (1/e)y(0) ≈ 0.37y(0)     y(4a₀/a₁) ≈ 0.02y(0)

a₀/a₁ is often referred to as the time constant.
(b) The second-order equation (a₀ ≠ 0)

a₀ d²y/dt² + a₁ dy/dt + a₂y = 0     (9.4-3)

has the solution

y = C₁e^(s₁t) + C₂e^(s₂t)     s₁,₂ = [−a₁ ± √(a₁² − 4a₀a₂)]/(2a₀)     (a₁² − 4a₀a₂ ≠ 0)     (9.4-4a)

y = (C₁ + C₂t)e^(−(a₁/2a₀)t)     (a₁² − 4a₀a₂ = 0)     (9.4-4b)
If a₀, a₁, and a₂ are real, s₁ and s₂ become complex for a₁² − 4a₀a₂ < 0; in this case, Eq. (4a) can be written as

y = e^(σt)(A cos ωt + B sin ωt) = Re^(σt) sin (ωt + α)     (9.4-4c)

where the quantities

σ = −a₁/(2a₀)     ω = √(4a₀a₂ − a₁²)/(2a₀)     (9.4-4d)

are respectively known as the damping constant and the natural (characteristic) circular frequency. The constants C₁, C₂, A, B, R, and α are chosen so as to match given initial or boundary conditions (see also Sec. 9.4-5a).
If ζ > 1, ζ = 1, or 0 < ζ < 1 one obtains, respectively, an overdamped solution (4a), a critically damped solution (4b), or an underdamped (oscillatory) solution (4c). In the latter case, the logarithmic decrement of the oscillation is 2πζ/√(1 − ζ²). Equation (3) is often written in the nondimensional form

(1/ω₁²) d²y/dt² + (2ζ/ω₁) dy/dt + y = 0     with     ω₁ = √(a₂/a₀)     ζ = a₁/(2√(a₀a₂))

ω₁ ≡ √(a₂/a₀) is called the undamped natural circular frequency; for weak damping (ζ² ≪ 1), ω ≈ ω₁ (see also Fig. 9.4-1).
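The classification by damping ratio can be sketched as follows; the coefficient values are arbitrary sample choices:

```python
# Classifying solutions of a0 y'' + a1 y' + a2 y = 0 (9.4-3) from the
# damping ratio zeta = a1 / (2 sqrt(a0 a2)) and the roots s1,2 of (9.4-4a).
import cmath, math

def classify(a0, a1, a2):
    zeta = a1 / (2 * math.sqrt(a0 * a2))
    disc = a1**2 - 4 * a0 * a2
    s1 = (-a1 + cmath.sqrt(disc)) / (2 * a0)
    s2 = (-a1 - cmath.sqrt(disc)) / (2 * a0)
    if zeta > 1:
        kind = 'overdamped'
    elif zeta == 1:
        kind = 'critically damped'
    else:
        kind = 'underdamped'
    return kind, s1, s2

kind, s1, s2 = classify(1.0, 2.0, 26.0)   # zeta = 1/sqrt(26) < 1
print(kind, s1)   # underdamped (-1+5j)
```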
Fig. 9.4-1. Solution of the second-order differential equation (3) in the nondimensional form of Sec. 9.4-1b for y(0) = 0, dy/dt]₀ = 1, plotted for several values of the damping ratio ζ (curves labeled from ζ = 0.2 to ζ = 1.2). Response is overdamped for ζ > 1, critically damped for ζ = 1, and underdamped for 0 < ζ < 1.
(c) To solve the rth-order differential equation

a₀ d^r y/dt^r + a₁ d^(r−1)y/dt^(r−1) + ⋯ + a_r y = 0     (a₀ ≠ 0)     (9.4-5)
find the roots of the rth-degree algebraic equation

a₀s^r + a₁s^(r−1) + ⋯ + a_r = 0     (CHARACTERISTIC EQUATION)     (9.4-6)

obtained, for example, on substitution of a trial solution y = e^(st). If the r roots s₁, s₂, . . . , s_r of the characteristic equation (6) are distinct, the given differential equation (5) has the general solution

y = C₁e^(s₁t) + C₂e^(s₂t) + ⋯ + C_r e^(s_r t)     (9.4-7a)

If a root s_k is of multiplicity m_k, replace the corresponding term in Eq. (7a) by

(C_k + C_k1 t + C_k2 t² + ⋯ + C_k,m_k−1 t^(m_k−1))e^(s_k t)     (9.4-7b)

The various terms of the solution (7) are known as normal modes of the given differential equation. The r constants C_k and C_kj must be chosen so as to match given initial or boundary conditions (see also Sec. 9.4-5a).
If the given differential equation (5) is real, complex roots of the characteristic equation appear as pairs of complex conjugates s = σ ± iω; the corresponding pairs of terms in the solution (7) combine to real terms

e^(σt)(A cos ωt + B sin ωt) = Re^(σt) sin (ωt + α)     (9.4-7c)

where A and B, or R and α, are new real constants of integration.
(d) Given a system of n homogeneous linear differential equations with constant coefficients

Σ[k=1 to n] φ_jk(d/dt) y_k = 0     (j = 1, 2, . . . , n)     (9.4-8)

where the φ_jk(d/dt) are polynomials in the operator d/dt, substitution of trial solutions y_k = A_k e^(st) yields nontrivial solutions whenever s is a root of the characteristic equation

D(s) ≡ det [φ_jk(s)] = 0     [CHARACTERISTIC EQUATION OF THE SYSTEM (8)]     (9.4-9)

The constants of integration must again be matched to the given initial or boundary conditions (see also Secs. 9.4-5b and 13.6-2).
9.4-2. Nonhomogeneous Equations. Normal Response, Steady-state Solution, and Transients (see also Sec. 9.3-1). (a) The superposition theorems and solution methods of Secs. 9.3-1 to 9.3-4 apply to
all linear ordinary differential equations. Thus the general solution of the nonhomogeneous differential equation

Ly ≡ a₀ d^r y/dt^r + a₁ d^(r−1)y/dt^(r−1) + ⋯ + a_r y = f(t)     (9.4-10)

can be expressed as the sum of the general solution (7) of the reduced equation (5) and any particular integral of Eq. (10).
If, as in many applications, f(t) = 0 for t < 0,* the particular integral y = y_N(t) of Eq. (10) with y_N = y'_N = y''_N = ⋯ = y_N^(r−1) = 0 for t ≤ 0 is called the normal response to the forcing function f(t). To solve Eq. (10) for t > 0 with given initial values for y, y', y'', . . . , y^(r−1), one adds the solution of the corresponding initial-value problem for Eq. (5) to the normal response y_N(t). In many applications (stable electric circuits, vibrations), all roots of the characteristic equation (9) have negative real parts, and the complementary function (7) dies out more or less rapidly (stable "transient solution"). In such cases, one is often mainly interested in a suitable nontransient particular integral y = y_SS(t), the "steady-state solution" due to the given forcing function f(t). In other cases, y_SS(t) is not uniquely defined by the given differential equation but depends on the initial conditions. The normal response y_N(t) may or may not include a transient term.
(b) In the same manner, each solution function y_k = y_k(t) of a system of linear differential equations with constant coefficients,

Σ[k=1 to n] φ_jk(d/dt) y_k = f_j(t)     (j = 1, 2, . . . , n)     (9.4-11)

can be expressed as the sum of the corresponding solution function of the complementary homogeneous system (8) and a particular solution function of the given system (11). The normal response of the system (11) to a set of forcing functions f_j(t) equal to zero for t < 0 is the particular solution such that all y_k vanish for t < 0 together with all derivatives which can be arbitrarily chosen (see also Sec. 13.6-2).
(c) If a forcing function contains a periodic term whose frequency equals that of an undamped sinusoidal term in the complementary function (7), then the differential equation or system may not have a finite solution (resonance; see also Secs. 9.4-5 and 15.4-12).
* This means one considers only forcing functions of the type f(t) ≡ f(t)U₊(t), where U₊(t) is the asymmetrical unit-step function defined in Sec. 21.9-1 (see also the footnote to Sec. 9.4-3).

9.4-3. Superposition Integrals and Weighting Functions (see also Secs. 9.3-3, 9.4-7c, and 15.5-1). (a) Physically Realizable Initial-value Problems.
Application of the Green's-function method of Sec. 9.3-3b to the differential equation (10) yields the normal-response solution (Sec. 9.4-2) as a weighted mean over past values of f(t) in the form

y = y_N(t) = ∫₀^t h₊(t − τ)f(τ) dτ = ∫₀^t h₊(ζ)f(t − ζ) dζ     (DUHAMEL'S SUPERPOSITION INTEGRAL, CONVOLUTION INTEGRAL)     (9.4-12)

if one assumes (1) f(t) = 0 for t < 0 (initial-value problem), and (2) h₊(t − τ) = 0 for t < τ, so that "future" values of f(t) cannot affect "earlier" values of y(t), and "instantaneous" effects are also ruled out (physically realizable systems). More general problems are considered in Sec. 9.4-3d.
If the derivative on the right exists,

y_N^(m)(t) = ∫₀^t [∂^m h₊(t − τ)/∂t^m] f(τ) dτ     (m = 0, 1, 2, . . . , r)     (9.4-13)
The weighting function h₊(t − τ) is the special Green's function defined by

Lh₊(t − τ) = 0   (t ≠ τ)     ∫ Lh₊(t − τ) dτ = 1   (t > τ)     (9.4-14a)

h₊(t − τ) = h'₊(t − τ) = ⋯ = h₊^(r−1)(t − τ) = 0     (t < τ)     (9.4-14b)

h₊(t − τ) is the normal response to an asymmetrical unit impulse δ₊(t − τ) (Sec. 21.9-6); Eq. (14a) may be rewritten as a "symbolic differential equation"

Lh₊(t − τ) = δ₊(t − τ)     (t > τ)     (9.4-14c)
Note that ∫₀^t h₊(t − τ) dτ is the normal response to the asymmetrical unit-step function U₊(t − τ) (Sec. 21.9-1). The "symbolic differential equation" (14c) is often easily solved for h₊(t − τ) by the Laplace-transform method of Sec. 9.4-5 (see also Secs. 8.5-1 and 9.4-7); alternatively, h₊(t) can be found as that solution of the homogeneous differential equation

Lh₊(t) = 0     (t > 0)     (9.4-14d)

which satisfies the initial conditions

h₊(0 + 0) = h'₊(0 + 0) = ⋯ = h₊^(r−2)(0 + 0) = 0     h₊^(r−1)(0 + 0) = 1/a₀     (9.4-14e)

EXAMPLE: For Ly ≡ a(dy/dt) + y, one has h₊(t) = (1/a)e^(−t/a)   (t > 0).
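The superposition integral (9.4-12) for the example above can be evaluated numerically; the unit-step input and grid spacing are my choices:

```python
# Duhamel superposition integral (9.4-12) for L y = a y' + y with
# weighting function h+(t) = (1/a) e^(-t/a).  The normal response to
# the sample input f(t) = U+(t) (unit step) should approach 1 - e^(-t/a);
# here it is computed by a discrete (Riemann-sum) convolution.
import numpy as np

a = 0.5
t = np.linspace(0.0, 3.0, 3001)
dt = t[1] - t[0]
h = (1.0 / a) * np.exp(-t / a)        # h+(t), t >= 0
f = np.ones_like(t)                   # unit-step forcing

y = np.convolve(h, f)[:t.size] * dt   # y(t) = ∫ h+(t - τ) f(τ) dτ
exact = 1.0 - np.exp(-t / a)
print(np.max(np.abs(y - exact)) < 1e-2)   # Riemann-sum accuracy
```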
(b) Under similar conditions, the normal-response solution of a system of linear differential equations (11) can be expressed in the form

y_k = Σ[j=1 to n] ∫₀^t (h₊)_kj(t − τ)f_j(τ) dτ     (k = 1, 2, . . . , n)     (9.4-15)

where (h₊)_kj(t − τ) is the kth solution function obtained on substitution of f_j(t) = δ₊(t − τ), f_i(t) ≡ 0 (i ≠ j), in Eq. (11). The weighting-function matrix [(h₊)_kj(t − τ)] is often called a state-transition matrix (Sec. 13.6-2).
(c) A superposition integral (12) also yields the normal response if the "input function" f(t) and the "output function" y(t) are related by a differential equation of the form
a₀ d^r y/dt^r + a₁ d^(r−1)y/dt^(r−1) + ⋯ + a_r y = b₀ d^p f/dt^p + b₁ d^(p−1)f/dt^(p−1) + ⋯ + b_p f     (9.4-16)

Such relations may result, in particular, if a system (11) is reduced to a single differential equation by elimination of all but one of the unknown functions. Any (h₊)_kj(t), and also h₊(t) if relations of the type (16) are considered, may contain delta-function-type singularities (modified Green's function, Sec. 15.5-1b). Thus, if h₊(t) = c₁δ₊(t − t₁) + c₂δ₊(t − t₂) + ⋯ + h₀(t), then Eq. (12) yields the normal response

y = y_N(t) = c₁f(t − t₁) + c₂f(t − t₂) + ⋯ + ∫₀^t h₀(t − τ)f(τ) dτ     (t > 0)
(d) More General Problems. "Symmetrical" vs. "Asymmetrical" Weighting Functions. To deal with forcing functions different from zero for t < 0, one may introduce the "symmetrical" weighting function h(t − τ) defined by

Lh(t − τ) = 0   (t ≠ τ)     ∫₋∞^∞ Lh(t − τ) dτ = 1     (9.4-17)

or Lh(t − τ) = δ(t − τ) (see also Sec. 21.9-2), with suitable initial or boundary conditions. The resulting solution

y = ∫₋∞^∞ h(t − τ)f(τ) dτ = ∫₋∞^∞ h(ζ)f(t − ζ) dζ     (9.4-18)

will, in particular, satisfy Eq. (10) for t > 0 with y(0 − 0) = y'(0 − 0) = ⋯ = y^(r−1)(0 − 0) = 0 if f(t) = f(t)U(t) (Sec. 21.9-1), and one adds the condition

h(t − τ) = h'(t − τ) = ⋯ = h^(r−2)(t − τ) = 0     (t < τ)     (9.4-19a)
h₊(t) and h(t), and the solutions (12) and (18), are easily confused. The "asymmetrical" weighting function h₊(t) is particularly convenient for use with the unilateral Laplace transforms employed by most engineers, while h(t) fits the context of Fourier analysis or bilateral Laplace transforms (see also Sec. 18.10-5). In the usual physical applications, Eq. (19a)
holds, since "future" values of f(t) cannot affect the solution; h₊(t) and h(t) are then identical wherever they are continuous. Frequently, forcing functions cannot even affect the solution instantaneously, so that h(t − τ) satisfies the stronger condition

h(t − τ) = h'(t − τ) = ⋯ = h^(r−1)(t − τ) = 0     (t < τ)     (9.4-19b)

and h(t) and h₊(t) are identical.
EXAMPLE: In a purely resistive electric circuit, the current y(t) and the voltage f(t) are related by y(t) = f(t)/R, so that h₊(t) = δ₊(t)/R and h(t) = δ(t)/R. But for Ly ≡ a(dy/dt) + y, h₊(t) ≡ h(t) ≡ (1/a)e^(−t/a).
9.4-4. Stability. A linear differential equation (10) or a system (11) will be called completely stable if and only if all roots of the corresponding characteristic equation (6) or (9) have negative real parts, so that effects of small changes in the initial conditions tend to zero with time
(refer to Sec. 13.6-5 for a more general discussion of stability). The nature of the roots may be investigated with the aid of Secs. 1.6-6 and 7.6-9 (stability criteria for electric circuits and control systems). A differential equation (10) is completely stable if and only if ∫₀^∞ |h₊(τ)| dτ [or, equivalently, ∫₋∞^∞ |h(τ)| dτ] exists; a similar condition for every weighting function of a system (11) is necessary and sufficient for complete stability of the system.
9.4-5. The Laplace-transform Method of Solution (see also Secs. 8.1-1, 8.4-1 to 8.4-5, 9.3-7, and 13.6-2). (a) To solve a linear differential equation (10) with given initial values y(0 + 0), y'(0 + 0), y''(0 + 0), . . . , y^(r−1)(0 + 0), apply the Laplace transformation (8.2-1) to both sides, and let ℒ[y(t)] = Y(s), ℒ[f(t)] = F(s). The resulting linear algebraic equation (subsidiary equation)
(a₀s^r + a₁s^(r−1) + ⋯ + a_r)Y(s) = F(s) + G(s)
G(s) ≡ y(0 + 0)(a₀s^(r−1) + a₁s^(r−2) + ⋯ + a_(r−1))
  + y'(0 + 0)(a₀s^(r−2) + a₁s^(r−3) + ⋯ + a_(r−2))
  + ⋯
  + y^(r−2)(0 + 0)(a₀s + a₁) + a₀y^(r−1)(0 + 0)     (9.4-20)

is easily solved to yield the Laplace transform of the desired solution y(t) in the form

Y(s) = F(s)/(a₀s^r + a₁s^(r−1) + ⋯ + a_r) + G(s)/(a₀s^r + a₁s^(r−1) + ⋯ + a_r)     (9.4-21)

Here the first term is the Laplace transform Y_N(s) of the normal response y_N(t) (Sec. 9.4-2a), and the second term represents the effects of nonzero initial values of y(t) and its derivatives. The solutions y_N(t) and y(t)
are found as inverse Laplace transforms by reference to tables (Appendix D), or by one of the methods of Secs. 8.4-2 to 8.4-9. In particular, each of the r terms in the partial-fraction expansion of G(s)/(a₀s^r + a₁s^(r−1) + ⋯ + a_r) (Sec. 8.4-5) yields a corresponding term of the force-free solution (7). This solution method applies without essential changes to differential equations of the type (16).
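The method of (a) can be sketched symbolically; the stable example y'' + 3y' + 2y = U₊(t) with zero initial values is an arbitrary choice:

```python
# Laplace-transform solution (9.4-5a) of
#   y'' + 3 y' + 2 y = U+(t),   y(0+) = y'(0+) = 0.
# The subsidiary equation (9.4-20) gives Y(s) = F(s)/(s^2 + 3 s + 2)
# with F(s) = 1/s; partial fractions yield 1/2 - e^(-t) + e^(-2t)/2.
import sympy as sp

t, s = sp.symbols('t s', positive=True)
Y = (1 / s) / (s**2 + 3 * s + 2)
y = sp.inverse_laplace_transform(Y, s, t)
y_exact = sp.Rational(1, 2) - sp.exp(-t) + sp.exp(-2 * t) / 2
print(sp.simplify(y - y_exact))   # 0
```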
(b) In the same manner, one applies the Laplace transformation to a system of linear differential equations (11) to obtain

Σ[k=1 to n] φ_jk(s)Y_k(s) = F_j(s) + G_j(s)     (j = 1, 2, . . . , n)     (9.4-22)

where the G_j(s) are polynomials in s whose coefficients are determined by the given initial values. Solution of the simultaneous linear equations (22) yields

Y_k(s) = Σ[j=1 to n] [A_jk(s)/D(s)] F_j(s) + Σ[j=1 to n] [A_jk(s)/D(s)] G_j(s)     (k = 1, 2, . . . , n)     (9.4-23)

where A_jk(s) is the cofactor of the element φ_jk(s) in the determinant D(s) of Eq. (9). The desired solutions y_k(t) are obtained from Eq. (23) by inverse Laplace transformation. In problems involving unstable differential equations (Sec. 9.4-4) and/or impulse-type forcing functions, the solutions may contain delta-function-type singularities (see also Secs. 8.5-1 and 21.9-6).
9.4-6. Periodic Forcing Functions and Solutions. The Phasor Method. (a) Sinusoidal Forcing Functions and Solutions. Sinusoidal Steady-state Solutions. Every system of linear differential equations (11) with sinusoidal forcing functions of equal frequency,

f_j(t) ≡ B_j sin (ωt + β_j)     (j = 1, 2, . . . , n)     (9.4-24a)

admits a unique particular solution of the form

y_k(t) ≡ A_k sin (ωt + α_k)     (k = 1, 2, . . . , n)     (9.4-24b)

In particular, if all roots of the characteristic equation (9) have negative real parts (stable systems, Sec. 9.4-4), the sinusoidal solution (24b) is the unique steady-state solution obtained after all transients have died out (Sec. 9.4-2).
(b) The Phasor Method. Given a system of linear differential equations (11) relating sinusoidal forcing functions and solutions (24), one introduces a reciprocal one-to-one representation of these sinusoids by corresponding complex numbers (vectors, phasors)

F_j = (B_j/√2) e^(iβ_j)   (j = 1, 2, . . . , n)     Y_k = (A_k/√2) e^(iα_k)   (k = 1, 2, . . . , n)     (9.4-25)

The absolute value of each phasor equals the root-mean-square value of the corresponding sinusoid, while the phasor argument defines the phase of the sinusoid. The phasors (25) are related by the (complex) linear algebraic equations (phasor equations)

Σ[k=1 to n] φ_jk(iω)Y_k = F_j     (j = 1, 2, . . . , n)     (9.4-26)

which correspond to Eq. (11) and may be solved for the unknown phasors:

Y_k = Σ[j=1 to n] [A_jk(iω)/D(iω)] F_j     (k = 1, 2, . . . , n)     (9.4-27)

(see also Sec. 9.4-5b). In the case of resonance (Sec. 9.4-2c), the expression (27) may not exist (may become "infinitely large").
(c) Rotating Phasors. A set of sinusoidal functions (24) satisfies any given system of linear differential equations (11) if and only if the same is true for the corresponding set of complex exponential functions (rotating phasors)
f_j(t) ≡ B_j e^(i(ωt+β_j)) ≡ √2 F_j e^(iωt)     (j = 1, 2, . . . , n)
y_k(t) ≡ A_k e^(i(ωt+α_k)) ≡ √2 Y_k e^(iωt)     (k = 1, 2, . . . , n)     (9.4-28)

which are often more convenient to handle than the real sinusoids (24).
(d) More General Periodic Forcing Functions (see also Secs. 4.11-4, 4.11-5, and 9.4-5c). Given a stable system (11) with more general periodic forcing functions expressible in the form

f(t) = ½a₀ + Σ_{h=1}^∞ (a_h cos hωt + b_h sin hωt)    (9.4-29)

one can apply the phasor method of Sec. 9.4-6a separately for each sinusoidal term and superimpose the resulting sinusoidal solutions to obtain the steady-state periodic solution. This procedure may be more convenient than the Laplace-transform method if only a few harmonics of the periodic solution are needed.
9.4-7. Transfer Functions and Frequency-response Functions. (a) Transfer Functions. The function

H(s) ≡ Y_N(s)/F(s) = 1/(a₀s^r + a₁s^{r-1} + . . . + a_r)    (9.4-30)
in Eq. (21) is known as a transfer function. The transfer function "represents" a linear operator (Sec. 15.2-7) which operates on the forcing function (input) to yield the normal response (output; see also Fig. 9.4-1).

Fig. 9.4-1. Transfer-function representation of linear differential equations with constant coefficients. If y_N(t) in turn serves as the forcing function for a second differential equation to produce the normal response z_N(t), the two transfer functions multiply, i.e., Z_N(s)/F(s) = H₁(s)H₂(s).
More generally, each function Ajk(s)/D(s) in Eq. (23) is the transfer function relating the normal-response "output" yk(t) of the system (11) to the "input" fj(t) when all other forcing functions vanish identically. The transfer functions Ajk(s)/D(s) together constitute the transfer matrix.
The transfer function corresponding to Eq. (16) is

H(s) ≡ (b₀s^p + b₁s^{p-1} + . . . + b_p)/(a₀s^r + a₁s^{r-1} + . . . + a_r)    (9.4-31)
(b) Frequency-response Functions (see also Sec. 9.4-6a). The frequency-response functions H(iω) and A_jk(iω)/D(iω) similarly relate the phasors representing sinusoidal forcing functions and steady-state solutions of given circular frequency ω. Specifically, the absolute value and the argument of a frequency-response function respectively relate the amplitudes and the phases of input and output sinusoid; thus for f(t) = B sin (ωt + β), y(t) = A sin (ωt + α),

|H(iω)| = A/B        arg H(iω) = α - β    (9.4-32)

If frequency-response functions are "cascaded" in the manner of Fig. 9.4-1, the amplitude responses |H(iω)| multiply, and the phase responses arg H(iω) add.
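The cascading rule can be checked directly in a few lines; the two first-order frequency-response functions below are arbitrary illustrative examples, not functions appearing in the text.

```python
import cmath
import math

# Two illustrative first-order transfer functions H1(s), H2(s)
def H1(s):
    return 1.0 / (s + 1.0)

def H2(s):
    return 3.0 / (2.0 * s + 5.0)

w = 2.0                  # circular frequency omega
s = 1j * w
Hc = H1(s) * H2(s)       # cascaded frequency-response function, as in Fig. 9.4-1

# amplitude responses multiply ...
assert abs(abs(Hc) - abs(H1(s)) * abs(H2(s))) < 1e-12

# ... and phase responses add (modulo 2*pi)
dphi = cmath.phase(Hc) - (cmath.phase(H1(s)) + cmath.phase(H2(s)))
assert min(abs(dphi), abs(abs(dphi) - 2 * math.pi)) < 1e-12
```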
(c) Relations between Transfer Functions or Frequency-response Functions and Weighting Functions (see also Secs. 4.11-4e, 9.4-3, and the convolution theorem of Table 8.3-1). The transfer function H(s) is the unilateral Laplace transform of the asymmetrical weighting function h₊(t), and the bilateral Laplace transform (Sec. 8.6-2) of the symmetrical weighting function h(t):
H(s) = ∫₀^∞ h₊(t)e^{-st} dt = ∫₋∞^∞ h(t)e^{-st} dt
(9.4-33)
Hence the frequency-response function H(ia) is related to the symmetrical weighting function h(t) by the Fourier transformation*
H(iω) = ∫₋∞^∞ h(t)e^{-iωt} dt
(9.4-34)
Equations (33) and (34) indicate the possibility of obtaining weighting functions as inverse Laplace or Fourier transforms of rational functions.
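As a numerical sketch of Eq. (9.4-33), the snippet below takes the simple transfer function H(s) = 1/(s + a) (an illustrative choice) with asymmetrical weighting function h₊(t) = e^{-at} and checks the unilateral Laplace integral by crude trapezoidal quadrature.

```python
import math

a = 0.8                            # illustrative pole location, a > 0

def h_plus(t):
    # asymmetrical weighting function for H(s) = 1/(s + a)
    return math.exp(-a * t) if t >= 0.0 else 0.0

def laplace(h, s, T=40.0, n=200000):
    # trapezoidal approximation of the unilateral Laplace integral (real s > -a)
    dt = T / n
    total = 0.5 * (h(0.0) + h(T) * math.exp(-s * T))
    for k in range(1, n):
        t = k * dt
        total += h(t) * math.exp(-s * t)
    return total * dt

s = 1.3
assert abs(laplace(h_plus, s) - 1.0 / (s + a)) < 1e-6
```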
9.4-8. Normal Coordinates and Normal-mode Oscillations. (a) Free Oscillations. Small oscillations of undamped mechanical or electrical systems are often described by a set of n linear second-order differential equations of the form
Σ_{k=1}^n (b_jk d²/dt² + a_jk) y_k = 0    (j = 1, 2, . . . , n)    (9.4-35)
where the matrices [a_jk] and [b_jk] are both symmetric, positive-definite (Sec. 13.5-2), and such that the resulting characteristic equation (9) has 2n distinct, nonzero, purely imaginary roots ±iω₁, ±iω₂, . . . , ±iωₙ. One introduces new unknown functions (normal coordinates) ȳ₁, ȳ₂, . . . , ȳₙ by a linear transformation

y_k = Σ_{h=1}^n t_kh ȳ_h    (k = 1, 2, . . . , n)    (9.4-36)
with coefficients t_kh chosen in the manner of Secs. 13.5-5 and 14.8-7 so as to diagonalize the matrices [a_jk] and [b_jk] simultaneously; the transformed system takes the simple form
d²ȳ_h/dt² + ω_h²ȳ_h = 0    (h = 1, 2, . . . , n)
(9.4-37)
The resulting free sinusoidal normal-mode oscillations
ȳ_h = A_h sin (ω_h t + α_h)
(h = 1, 2, . . . , n)
(9.4-38)
do not affect one another (are "uncoupled"). The normal coordinates (36) may have an intuitive physical interpretation. The problem is a generalized eigenvalue problem involving sets of n functions [y₁(t), y₂(t), . . . , yₙ(t)] as eigenvectors (see also Secs. 13.6-2a, 14.8-7, and 15.4-5).

* See the footnote to Sec. 4.11-2.
EXAMPLE: For a pair of similar coupled oscillators described by

d²y₁/dt² = -ω₀²y₁ - α²(y₁ - y₂)        d²y₂/dt² = -ω₀²y₂ - α²(y₂ - y₁)

the normal coordinates are simply ȳ₁ = y₁ + y₂, ȳ₂ = y₁ - y₂. Given y₁ = 1, y₂ = dy₁/dt = dy₂/dt = 0 for t = 0, the normal-mode equations

d²ȳ₁/dt² = -ω₀²ȳ₁        d²ȳ₂/dt² = -(ω₀² + 2α²)ȳ₂

yield

ȳ₁ = cos ω₀t        ȳ₂ = cos t√(ω₀² + 2α²)

and thus

y₁ = ½(ȳ₁ + ȳ₂) = cos [(√(ω₀² + 2α²) - ω₀)t/2] cos [(√(ω₀² + 2α²) + ω₀)t/2]

y₂ = ½(ȳ₁ - ȳ₂) = sin [(√(ω₀² + 2α²) - ω₀)t/2] sin [(√(ω₀² + 2α²) + ω₀)t/2]

For α² ≪ ω₀² (weak coupling), this solution describes the so-called beat phenomenon.
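The coupled-oscillator example can be verified by direct numerical integration; the values of ω₀ and α below are illustrative, chosen so that α² ≪ ω₀² (weak coupling).

```python
import math

w0, al = 2.0, 0.3          # illustrative parameters, al**2 << w0**2

def exact_y1(t):
    # product ("beat") form of y1 from the normal-mode solution
    wb = math.sqrt(w0 ** 2 + 2 * al ** 2)
    return math.cos((wb - w0) * t / 2) * math.cos((wb + w0) * t / 2)

def deriv(u):
    # first-order form of the coupled system (y1, y2, y1', y2')
    y1, y2, v1, v2 = u
    return (v1, v2,
            -w0 ** 2 * y1 - al ** 2 * (y1 - y2),
            -w0 ** 2 * y2 - al ** 2 * (y2 - y1))

# classical 4th-order Runge-Kutta, initial state y1 = 1, y2 = y1' = y2' = 0
u, h = (1.0, 0.0, 0.0, 0.0), 0.001
for _ in range(10000):                      # integrate to t = 10
    k1 = deriv(u)
    k2 = deriv(tuple(x + 0.5 * h * k for x, k in zip(u, k1)))
    k3 = deriv(tuple(x + 0.5 * h * k for x, k in zip(u, k2)))
    k4 = deriv(tuple(x + h * k for x, k in zip(u, k3)))
    u = tuple(x + h * (b1 + 2 * b2 + 2 * b3 + b4) / 6
              for x, b1, b2, b3, b4 in zip(u, k1, k2, k3, k4))

assert abs(u[0] - exact_y1(10.0)) < 1e-8
```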
(b) Forced Oscillations. The corresponding forced-oscillation problem

Σ_{k=1}^n (b_jk d²/dt² + a_jk) y_k = f_j(t)    (j = 1, 2, . . . , n)    (9.4-39)

can, in principle, be solved in the manner of Sec. 14.8-10 through normal-mode expansion of the forcing functions f_j(t). The Laplace-transform method of Sec. 9.4-5 is usually more convenient.
9.5. NONLINEAR SECOND-ORDER EQUATIONS
9.5-1. Introduction. Sections 9.5-2 to 9.5-5 introduce the general terminology and the most easily summarized approximation method of
the theory of nonlinear oscillations. References 9.15 to 9.17, 9.22, and 9.23 are recommended for further study; for better or for worse, many solution methods are closely tied to specific applications. The perturbation method of Sec. 10.2-7c is often used to simplify nonlinear problems
especially in celestial mechanics. See Secs. 20.7-4 and 20.7-5 for numerical methods of solution.
9.5-2. The Phase-plane Representation. Graphical Method of
Solution (see also Sec. 9.2-2). The second-order differential equation

d²y/dt² = f(t, y, dy/dt)    (9.5-1)

is equivalent to the system of first-order equations

dy/dt = ẏ        dẏ/dt = f(t, y, ẏ)    (9.5-2)
The general solution y = y(t), ẏ = ẏ(t) of Eq. (1) or (2) can be represented geometrically by a family of directed phase-trajectory curves in the yẏ plane or phase plane. The phase-plane representation is most useful if the given function f(t, y, dy/dt) does not involve the independent variable t explicitly (e.g., "free" oscillations). In this case, the system (2) is of the general form

dy/dt = P(y, ẏ) ≡ ẏ        dẏ/dt = Q(y, ẏ)    (9.5-3)
and the phase trajectories satisfy the first-order differential equation

dẏ/dy = Q(y, ẏ)/P(y, ẏ) = Q(y, ẏ)/ẏ    (9.5-4)

which specifies the slope of the solution curve through (y, ẏ) at that point. The resulting field of tangent directions ("phase-plane portrait" of the given differential equation) permits one to sketch ẏ(y) and hence y(t) for given initial values of y and ẏ; one may begin by drawing loci of constant slope dẏ/dy = m (isoclines, Fig. 9.5-1).
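The phase-plane construction is easy to mimic numerically: trace the trajectory of the system (9.5-3) through a given initial point and evaluate the slope field (9.5-4). The undamped oscillator below (Q = -y) is an illustrative choice whose phase trajectories are known circles.

```python
# dy/dt = P(y, ydot) = ydot, dydot/dt = Q(y, ydot); here Q = -y, so the
# phase trajectories are the circles y**2 + ydot**2 = const.
def P(y, ydot):
    return ydot

def Q(y, ydot):
    return -y

def isocline_slope(y, ydot):
    # slope field dydot/dy = Q/P of Eq. (9.5-4), defined where P != 0
    return Q(y, ydot) / P(y, ydot)

# trace the phase trajectory with Heun's method (improved Euler)
y, ydot, h = 1.0, 0.0, 0.0005
for _ in range(20000):                     # 10 time units
    py, qy = P(y, ydot), Q(y, ydot)
    y1, yd1 = y + h * py, ydot + h * qy
    y, ydot = (y + 0.5 * h * (py + P(y1, yd1)),
               ydot + 0.5 * h * (qy + Q(y1, yd1)))

# the traced trajectory stays on the circle of radius 1 through (1, 0)
assert abs(y ** 2 + ydot ** 2 - 1.0) < 1e-5

# the isocline of slope m = 0 is the locus Q = 0, here the line y = 0
assert isocline_slope(0.0, 0.7) == 0.0
```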
9.5-3. Critical Points and Limit Cycles (see also Sec. 9.5-4). (a) Ordinary and Critical Phase-plane Points. Given a differential equation (1) reducible to a system (3), a phase-plane point (y, ẏ) is an ordinary point if and only if P(y, ẏ) and Q(y, ẏ) are analytic and not both equal to zero; there exists a unique phase trajectory through each ordinary point.

Phase-plane points (y₀, ẏ₀) such that

dy/dt = P(y₀, ẏ₀) = 0        dẏ/dt = Q(y₀, ẏ₀) = 0    (9.5-5)

are critical points (or singular points) where the trajectory is not uniquely determined. Critical points are classified according to the nature of the phase trajectories in their neighborhood; Fig. 9.5-2 illustrates the most important types. Physically, critical points are equilibrium points admitting stable or unstable equilibrium solutions y = y₀ (Sec. 9.5-4).
(b) Periodic Solutions and Limit Cycles. Periodic solutions y = y(t) correspond to closed phase-trajectory curves, and vice versa. A closed phase trajectory C is called a limit cycle if each trajectory point has a neighborhood of ordinary points in which all phase trajectories spiral into C (stable limit cycle, see also Sec. 9.5-4) or out of C (unstable limit cycle), or into C on one side of C and out of C on the other side (half-stable limit cycle). For examples, see Secs. 9.5-4c and 9.5-5 (see also Fig. 9.5-3).
Fig. 9.5-1. Isoclines, tangent directions, and some solutions of the differential equation dẏ/dy = [μ(1 - y²)ẏ - y]/ẏ corresponding to Van der Pol's differential equation, with dy/dt = ẏ, μ = 1. Only the right half-plane is shown.
Fig. 9.5-2. Phase trajectories in the neighborhood of six types of critical points (Secs. 9.5-3 and 9.5-4): stable and unstable nodal points, stable and unstable focal points, a saddle point, and a vortex point.
Fig. 9.5-3. (a) A stable limit cycle enclosing an unstable critical point at the origin. "Soft" self-excitation of oscillations for arbitrarily small initial values of y and ẏ. (b) A stable limit cycle enclosing a stable critical point at the origin and an unstable limit cycle (shown in broken lines). "Hard" self-excitation of oscillations for initial values outside the unstable limit cycle.
(c) Poincaré's Index and Bendixson's Theorems. The possible existence of limit cycles (stable oscillations) is of interest in many applications. In addition to the analytical criteria of Sec. 9.5-4, the following theory is sometimes helpful. For any given phase-plane portrait, the index of a closed curve C containing only ordinary points is the number of revolutions of the solution tangent as the point (y, ẏ) makes one complete cycle around C. The index of an isolated critical point P is the index of any closed curve enclosing P and no other critical point. Then

1. The index of any closed curve C equals the sum of the indices of all the (isolated) critical points enclosed by C; if C encloses only ordinary points, its index is zero.

2. The index of a nodal point, focal point, or vortex point is 1; the index of a saddle point is -1 (see also Fig. 9.5-2).

3. The index of every closed phase trajectory is 1; hence a limit cycle must enclose at least one critical point other than a saddle point (Fig. 9.5-3).

No closed phase trajectories exist within any phase-plane domain where ∂P/∂y + ∂Q/∂ẏ is of one sign (Bendixson's First Theorem). A trajectory which remains in a bounded region of the phase plane and does not approach any critical point for 0 ≤ t < ∞ is either closed or approaches a closed trajectory asymptotically (Bendixson's Second Theorem).
9.5-4. Poincaré-Lyapunov Theory of Stability (see also Secs. 13.6-5 to 13.6-7). (a) The solution y_i = y_i1(t) (i = 1, 2, . . . , n) of a system of ordinary differential equations reduced to the form

dy_i/dt = f_i(t; y₁, y₂, . . . , yₙ)    (i = 1, 2, . . . , n)    (9.5-6)

is stable in the sense of Lyapunov if and only if each y_i(t) → y_i1(t) as y_i(t₀) → y_i1(t₀) (i = 1, 2, . . . , n), the convergence being uniform for t ≥ t₀ (Sec. 4.4-4); sufficiently small initial-condition changes then cannot cause large solution changes. A stable solution is asymptotically stable if and only if there exists a bound r(t₀) such that |y_i(t₀) - y_i1(t₀)| < r(t₀) for all i implies lim_{t→∞} |y_i(t) - y_i1(t)| = 0 for all i. An asymptotically stable solution is asymptotically stable in the large (completely stable) if and only if r(t₀) = ∞. This is, in particular, true for every solution of a completely stable system of linear differential equations with constant coefficients (all roots of the characteristic equation have negative real parts, Secs. 9.4-4 to 9.4-7).
(b) Stability of Equilibrium. For a system of the form

dy_i/dt = f_i(y₁, y₂, . . . , yₙ)    (i = 1, 2, . . . , n)    (9.5-7)

with suitably differentiable f_i, an equilibrium solution

y_i(t) = y_i1 = const        f_i(y_11, y_21, . . . , y_n1) = 0    (i = 1, 2, . . . , n)

is asymptotically stable whenever the linearized system

d(δy_i)/dt = Σ_{k=1}^n (∂f_i/∂y_k) δy_k    (i = 1, 2, . . . , n)    (9.5-8)

is completely stable, where the partial derivatives are computed for y₁ = y_11, y₂ = y_21, . . . , yₙ = y_n1. This is true whenever all roots s of the characteristic equation

det [∂f_i/∂y_k - s δ_ik] = 0    (9.5-9)
have negative real parts. The equilibrium is unstable if Eq. (9) has a root with positive real part; if the real parts of all roots are negative or zero, one requires a more detailed stability investigation (Sec. 13.6-7). In particular, for a second-order differential equation (1) reducible to a system (3), the characteristic equation is

s² - (∂P/∂y + ∂Q/∂ẏ)s + (∂P/∂y ∂Q/∂ẏ - ∂P/∂ẏ ∂Q/∂y) = 0    (9.5-10)

where all derivatives are computed at the equilibrium point (y₀, ẏ₀). The equilibrium point is
A stable or unstable nodal point if both roots s₁ and s₂ of Eq. (10) are real and negative or positive, respectively

A saddle point if s₁ and s₂ are real and of opposite sign

A stable or unstable focal point if s₁ and s₂ are complex conjugates with negative or positive real parts, respectively

A vortex point if s₁ and s₂ are purely imaginary

(see also Sec. 9.5-3a and Fig. 9.5-2). Examples of the six types of equilibrium points listed are most easily obtained from the linear differential equation (9.4-3) for different values of the coefficients (Sec. 9.4-1b).
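The classification just listed can be automated from the roots of the characteristic equation (10); the helper below assumes only the four partial derivatives of P and Q at the equilibrium point (degenerate zero roots are not handled in this sketch).

```python
import cmath

def classify(Py, Pydot, Qy, Qydot, eps=1e-12):
    # roots of s**2 - trace*s + det = 0, Eq. (9.5-10)
    trace = Py + Qydot
    det = Py * Qydot - Pydot * Qy
    s1 = (trace + cmath.sqrt(trace ** 2 - 4 * det)) / 2
    s2 = (trace - cmath.sqrt(trace ** 2 - 4 * det)) / 2
    if abs(s1.imag) > eps:                       # complex-conjugate roots
        if abs(s1.real) <= eps:
            return "vortex point"
        return ("stable" if s1.real < 0 else "unstable") + " focal point"
    r1, r2 = s1.real, s2.real
    if r1 * r2 < 0:
        return "saddle point"
    return ("stable" if r1 < 0 else "unstable") + " nodal point"

# For dy/dt = ydot, dydot/dt = -y - k*ydot: P = ydot, Q = -y - k*ydot,
# so P_y = 0, P_ydot = 1, Q_y = -1, Q_ydot = -k at the origin.
assert classify(0.0, 1.0, -1.0, -0.5) == "stable focal point"   # light damping
assert classify(0.0, 1.0, -1.0, 0.0) == "vortex point"          # no damping
assert classify(0.0, 1.0, 1.0, 0.0) == "saddle point"
assert classify(0.0, 1.0, -1.0, -3.0) == "stable nodal point"   # heavy damping
```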
(c) Stability of a Periodic Solution. The stability of a periodic solution y = y_p(t) ≢ 0, ẏ = ẏ_p(t) of Eq. (3) depends on that of the linearized system

d(δy)/dt = [∂P/∂y] δy + [∂P/∂ẏ] δẏ        d(δẏ)/dt = [∂Q/∂y] δy + [∂Q/∂ẏ] δẏ    (9.5-11)

(all derivatives computed for y = y_p(t), ẏ = ẏ_p(t)), which is satisfied by small variations (Sec. 11.4-1) δy, δẏ of the periodic solution. This is a system of linear differential equations with periodic coefficients having the same period T as the given solution; Eq. (11) admits two linearly independent solutions of the form

δy = h₁₁(t)        δẏ = h₂₁(t)        δy = e^{λt}h₁₂(t)        δẏ = e^{λt}h₂₂(t)    (9.5-12)

where the h_ik(t) are periodic functions. The periodic solution is stable if

λ = (1/T) ∫₀^T [∂P/∂y + ∂Q/∂ẏ]_{y_p(t), ẏ_p(t)} dt    (9.5-13)

is less than zero, and unstable if λ > 0; the case λ = 0 requires additional investigation.
See Sec. 13.6-6 for additional stability criteria.

EXAMPLE: The differential equation

d²y/dt² - μ(1 - y²) dy/dt + y = 0    (Van der Pol's equation)    (9.5-14)

yields a stable limit cycle in the neighborhood of the approximate periodic solution y = y_p(t) ≈ 2 cos t, ẏ = ẏ_p(t) ≈ -2 sin t, with λ = -μ[1 + O(μ)] (see also Sec. 9.5-5).
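For Van der Pol's equation the stability exponent λ of the periodic solution can be checked numerically: with P = ẏ and Q = -y + μ(1 - y²)ẏ, the period average of ∂P/∂y + ∂Q/∂ẏ along the approximate limit cycle y_p = 2 cos t should come out close to -μ. The averaging formula is used here as an assumption consistent with λ = -μ[1 + O(μ)] quoted above.

```python
import math

mu = 0.05                      # illustrative small parameter
T = 2 * math.pi                # period of the approximate limit cycle
n = 100000

# average of dP/dy + dQ/dydot = mu*(1 - y_p**2) over one period,
# with y_p(t) = 2*cos(t)
acc = 0.0
for k in range(n):
    t = T * k / n
    yp = 2 * math.cos(t)
    acc += mu * (1 - yp ** 2)
lam = acc / n

# mean of (1 - 4*cos(t)**2) over a period is -1, so lam = -mu
assert abs(lam + mu) < 1e-9
assert lam < 0                 # negative exponent: the limit cycle is stable
```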
9.5-5. The Approximation Method of Krylov and Bogolyubov. (a) The First Approximation. Equivalent Linearization. To solve a differential equation of the form

d²y/dt² + ω²y + μf(y, dy/dt) = 0    (9.5-15)

where ω is a given constant, and the last term is a small nonlinear perturbation, write

y = r(t) cos φ(t)    (9.5-16)

Assuming that errors of the order of μ² are negligible, the "amplitude" r(t) and the "total phase" φ(t) are obtained from

dr/dt = (μ/2πω) ∫₀^{2π} f(r cos λ, -rω sin λ) sin λ dλ ≡ -(r/2) α₁(r)

dφ/dt = ω + (μ/2πrω) ∫₀^{2π} f(r cos λ, -rω sin λ) cos λ dλ ≡ √α₂(r)    (9.5-17)
For a given initial value r(0) = r₀, the solution (9.4-4) of the equivalent linear differential equation

d²y/dt² + α₁(r₀) dy/dt + α₂(r₀) y = 0    (9.5-18)

approximates the solution of the given differential equation (15) with an error of the order of μ². For a periodic solution such as a limit cycle (Sec. 9.5-3b), the approximate amplitude r_L is obtained from α₁(r_L) = 0, and the circular frequency is approximated by √α₂(r_L). The limit cycle is stable if [dα₁/dr]_{r=r_L} > 0 and unstable if [dα₁/dr]_{r=r_L} < 0. For self-excitation from rest, one must have α₁(0) < 0.

The first approximation is of considerable interest in connection with periodic nonlinear oscillations. In such cases, the equivalent linear differential equation (18) yields the same energy storage and dissipation per cycle as the given nonlinear equation (15). The equivalent linear equation can therefore be used in many investigations of nonlinear resonance phenomena (Ref. 9.17).

(b) The Improved First Approximation. An improved first-order approximation is given by
y = r(t) cos φ(t) + ½α₀(r) + Σ_{k=2}^∞ [α_k(r) cos kφ(t) + β_k(r) sin kφ(t)]    (9.5-19)

where r(t) and φ(t) are given by Eq. (17), and

α_k(r) = μ/[πω²(k² - 1)] ∫₀^{2π} f(r cos λ, -rω sin λ) cos kλ dλ    (k = 0, 2, 3, . . .)

β_k(r) = μ/[πω²(k² - 1)] ∫₀^{2π} f(r cos λ, -rω sin λ) sin kλ dλ    (k = 2, 3, . . .)    (9.5-20)
EXAMPLE: In the case of Van der Pol's differential equation (14), Eq. (17) yields

dr/dt = (μr/2)(1 - r²/4)        r(t) = r₀e^{μt/2}/√[1 + ¼r₀²(e^{μt} - 1)]    (9.5-21)

There is a stable limit cycle for r = r_L = 2. The coefficients (20) all vanish except for β₃; the improved first approximation is

y = r(t) cos (t + φ₀) - [μr³(t)/32] sin 3(t + φ₀)    (9.5-22)

The Krylov-Bogolyubov approximation (19) is an improvement on an earlier method due to Van der Pol, who derived an approximation of the form y = a(t) cos ωt + b(t) sin ωt in an analogous manner.
The Krylov-Bogolyubov approximation method can be extended to apply in the case of a periodic forcing function on the right side of the differential equation (15) (nonlinear forced oscillations, subharmonic resonance, entrainment of frequency). For this and other methods, see Refs. 9.15 to 9.17.
9.5-6. Energy-integral Solution. Differential equations of the form

d²y/dt² = f(y)    (9.5-23)

which are of considerable interest in dynamics, can be reduced to first-order equations through multiplication by dy/dt and integration:

(dy/dt)² = 2 ∫ f(y) dy + C₁    (9.5-24)

t = ∫ dy/[±√(2 ∫ f(y) dy + C₁)] + C₂    (9.5-25)
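The first integral (9.5-24) states that (dy/dt)² - 2∫f(y) dy stays constant along any solution; this is checked below for the illustrative nonlinear case f(y) = -y³ by integrating y'' = f(y) with a Runge-Kutta scheme.

```python
def f(y):
    return -y ** 3

def F2(y):
    # 2 * integral of f(y) dy (additive constant dropped)
    return -y ** 4 / 2

def d(yy, vv):
    # first-order form of y'' = f(y)
    return vv, f(yy)

y, v, h = 1.0, 0.0, 0.0005
C1 = v ** 2 - F2(y)            # value of the first integral (9.5-24) at t = 0

for _ in range(20000):         # classical RK4, 10 time units
    k1y, k1v = d(y, v)
    k2y, k2v = d(y + 0.5 * h * k1y, v + 0.5 * h * k1v)
    k3y, k3v = d(y + 0.5 * h * k2y, v + 0.5 * h * k2v)
    k4y, k4v = d(y + h * k3y, v + h * k3v)
    y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
    v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6

# (dy/dt)**2 - 2*Integral(f) is conserved along the solution
assert abs((v ** 2 - F2(y)) - C1) < 1e-10
```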
9.6. PFAFFIAN DIFFERENTIAL EQUATIONS
9.6-1. Pfaffian Differential Equations (see also Secs. 3.1-16 and 17.2-2). A Pfaffian differential equation (first-order linear total differential equation)
P(x, y, z) dx + Q(x, y, z) dy + R(x, y, z) dz = 0
(9.6-1)
with continuously differentiable coefficients P, Q, R may be interpreted geometrically as a condition P · dr = 0 on the tangent vector dr ≡ (dx, dy, dz) of a solution curve (integral curve) described by two equations f(x, y, z) = 0, g(x, y, z, C) = 0, where C is a constant of integration. To find the integral curves lying on an arbitrary regular surface

f(x, y, z) = 0    (9.6-2)

solve the ordinary differential equation obtained by elimination of z and dz from Eq. (1) and df(x, y, z) = 0.
9.6-2. The Integrable Case (see also Sec. 9.2-4). The Pfaffian differential equation (1) is integrable if and only if there exists an integrating factor μ = μ(x, y, z) such that μ(P dx + Q dy + R dz) is an exact differential dφ(x, y, z); this is true if and only if

P(∂Q/∂z - ∂R/∂y) + Q(∂R/∂x - ∂P/∂z) + R(∂P/∂y - ∂Q/∂x) = 0    (9.6-3)

In this case, every curve on an integral surface

φ(x, y, z) = C    (9.6-4)

orthogonal to the family of curves described by

dx/P = dy/Q = dz/R    (9.6-5)

is a solution. It follows that the solutions found in the manner of Sec. 9.6-1 from a conveniently chosen family of surfaces (usually planes)

f(x, y, z) ≡ f(x, y, z; λ) = 0    (9.6-6)

lie on an integral surface (4) obtainable by elimination of λ from the resulting solution f(x, y, z; λ) = 0, g(x, y, z, C; λ) = 0 (Mayer's Method of Solution).
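The integrability condition (9.6-3) can be tested numerically with finite-difference partial derivatives. The Pfaffian equation yz dx + xz dy + xy dz = 0 below is an illustrative exact case (its left side equals d(xyz), integrating factor μ = 1), so the condition must vanish identically.

```python
def P(x, y, z): return y * z
def Q(x, y, z): return x * z
def R(x, y, z): return x * y

def integrability(x, y, z, h=1e-5):
    # left side of Eq. (9.6-3), with central-difference partial derivatives
    dQz = (Q(x, y, z + h) - Q(x, y, z - h)) / (2 * h)
    dRy = (R(x, y + h, z) - R(x, y - h, z)) / (2 * h)
    dRx = (R(x + h, y, z) - R(x - h, y, z)) / (2 * h)
    dPz = (P(x, y, z + h) - P(x, y, z - h)) / (2 * h)
    dPy = (P(x, y + h, z) - P(x, y - h, z)) / (2 * h)
    dQx = (Q(x + h, y, z) - Q(x - h, y, z)) / (2 * h)
    return (P(x, y, z) * (dQz - dRy) + Q(x, y, z) * (dRx - dPz)
            + R(x, y, z) * (dPy - dQx))

for pt in ((1.0, 2.0, 3.0), (0.5, -1.0, 2.0)):
    assert abs(integrability(*pt)) < 1e-8
```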
To find integral surfaces (4) by another method, hold z constant and obtain the solution of the ordinary differential equation P dx + Q dy = 0 in the form

u(x, y, z) - K = 0    (9.6-7)

Then

μ = (1/P)(∂u/∂x) = (1/Q)(∂u/∂y)    (9.6-8)

and the integral surfaces are obtained by solving the ordinary differential equation du + (μR - ∂u/∂z) dz = 0, in which μR - ∂u/∂z is expressed in terms of u and z with the aid of Eq. (7). Note that the function (8) is the required integrating factor μ(x, y, z). An important application is found in thermodynamics, where the reciprocal of the absolute temperature T is an integrating factor for the adiabatic condition δq = 0 of the form (1); δq/T is the (exact) differential of the entropy. See Ref. 10.23 for a discussion of total differential equations involving more than three variables.
9.7. RELATED TOPICS, REFERENCES, AND BIBLIOGRAPHY
9.7-1. Related Topics. The following topics related to the study of ordinary differential equations are treated in other chapters of this handbook:
The Laplace transformation ........................................ Chap. 8
Calculus of variations ............................................ Chap. 11
Matrix notation for systems of differential equations ............. Chap. 13
Boundary-value problems and eigenvalue problems ................... Chap. 15
Numerical solutions ............................................... Chap. 20
Special transcendental functions .................................. Chap. 21
9.7-2. References and Bibliography.

9.1. Agnew, R. P.: Differential Equations, 2d ed., McGraw-Hill, New York, 1960.
9.2. Andronow, A. A., and C. E. Chaikin: Theory of Oscillations, Princeton University Press, Princeton, N.J., 1949.
9.3. Bellman, R.: Stability Theory of Differential Equations, McGraw-Hill, New York, 1953.
9.4. Bieberbach, L.: Differentialgleichungen, 2d ed., Springer, Berlin, 1965.
9.5. Birkhoff, G., and G. Rota: Ordinary Differential Equations, Blaisdell, New York, 1962.
9.6. Coddington, E. A.: An Introduction to Ordinary Differential Equations, Prentice-Hall, Englewood Cliffs, N.J., 1961.
9.7. Coddington, E. A., and N. Levinson: Theory of Ordinary Differential Equations, McGraw-Hill, New York, 1955.
9.8. Ford, L. R.: Differential Equations, 2d ed., McGraw-Hill, New York, 1955.
9.9. Golomb, M., and M. E. Shanks: Elements of Ordinary Differential Equations, 2d ed., McGraw-Hill, New York, 1965.
9.10. Hale, J. K.: Oscillations in Nonlinear Systems, McGraw-Hill, New York, 1963.
9.11. Hartman, P.: Ordinary Differential Equations, Wiley, New York, 1964.
9.12. Hochstadt, H.: Differential Equations, Holt, New York, 1964.
9.13. Hurewicz, W.: Lectures on Ordinary Differential Equations, M.I.T., Cambridge, Mass., 1958.
9.14. Kamke, E.: Differentialgleichungen, Lösungsmethoden und Lösungen, vol. I, Chelsea, New York, 1948.
9.15. Krylov, N., and N. Bogolyubov: Nonlinear Oscillations, translated by S. Lefschetz, Princeton University Press, Princeton, N.J., 1943.
9.16. Lefschetz, S.: Differential Equations: Geometric Theory, 2d ed., Interscience, New York, 1963.
9.17. Minorski, N.: Nonlinear Oscillations, Van Nostrand, Princeton, N.J., 1962.
9.18. Petrovsky, I. G.: Lectures on Partial Differential Equations, Interscience, New York, 1955.
9.19. Pontryagin, L. S.: Ordinary Differential Equations, Addison-Wesley, Reading, Mass., 1962.
9.20. Saaty, T. L., and J. Bram: Nonlinear Mathematics, McGraw-Hill, New York, 1964.
9.21. Sansone, G., and R. Conti: Nonlinear Differential Equations, Macmillan, New York, 1964.
9.22. Stoker, J. J.: Nonlinear Vibrations, Interscience, New York, 1950.
9.23. Struble, R.: Nonlinear Differential Equations, McGraw-Hill, New York, 1961.
9.24. Tenenbaum, M., and H. Pollard: Ordinary Differential Equations, Harper and Row, New York, 1963.
9.25. Tricomi, F. G.: Differential Equations, Hafner, New York, 1961.
CHAPTER 10

PARTIAL DIFFERENTIAL EQUATIONS
10.1. Introduction and Survey
  10.1-1. Introduction
  10.1-2. Partial Differential Equations
  10.1-3. Solution of Partial Differential Equations: Separation of Variables

10.2. Partial Differential Equations of the First Order
  10.2-1. First-order Partial Differential Equations with Two Independent Variables. Geometrical Interpretation
  10.2-2. The Initial-value Problem
  10.2-3. Complete Integrals. Derivation of General Integrals, Particular Integrals, Singular Integrals, and Solutions of the Characteristic Equations
  10.2-4. Partial Differential Equations of the First Order with n Independent Variables
    (a) The Initial-value Problem
    (b) Complete Integrals and Solution of the Characteristic Equations
    (c) Singular Integrals
  10.2-5. Contact Transformations
  10.2-6. Canonical Equations and Canonical Transformations
    (a) Canonical Equations
    (b) Canonical Transformations
    (c) Poisson Brackets
  10.2-7. The Hamilton-Jacobi Equation. Solution of the Canonical Equations

10.3. Hyperbolic, Parabolic, and Elliptic Partial Differential Equations. Characteristics
  10.3-1. Quasilinear Partial Differential Equations of Order 2 with Two Independent Variables. Characteristics
  10.3-2. Solution of Hyperbolic Partial Differential Equations by the Method of Characteristics
  10.3-3. Transformation of Hyperbolic, Parabolic, and Elliptic Differential Equations to Canonical Form
  10.3-4. Typical Boundary-value Problems for Second-order Equations
    (a) Hyperbolic Differential Equations
    (b) Parabolic Differential Equations
    (c) Elliptic Differential Equations
  10.3-5. The One-dimensional Wave Equation
  10.3-6. The Riemann-Volterra Method for Linear Hyperbolic Equations
  10.3-7. Equations with Three or More Independent Variables

10.4. Linear Partial Differential Equations of Physics. Particular Solutions
  10.4-1. Physical Background and Survey
  10.4-2. Linear Boundary-value Problems
  10.4-3. Particular Solutions of Laplace's Differential Equation: Three-dimensional Case
    (a) Rectangular Cartesian Coordinates
    (b) Cylindrical Coordinates
    (c) Spherical Coordinates
  10.4-4. Particular Solutions for the Space Form of the Three-dimensional Wave Equation
    (a) Rectangular Cartesian Coordinates
    (b) Cylindrical Coordinates
    (c) Spherical Coordinates
  10.4-5. Particular Solutions for Two-dimensional Problems
  10.4-6. The Static Schrödinger Equation for Hydrogenlike Wave Functions
  10.4-7. Particular Solutions for the Diffusion Equation
  10.4-8. Particular Solutions for the Wave Equation. Sinusoidal Waves
  10.4-9. Solution of Boundary-value Problems by Orthogonal-series Expansions: Examples
    (a) Dirichlet Problem for a Sphere
    (b) Free Vibrations of an Elastic String
    (c) Free Oscillations of a Circular Membrane

10.5. Integral-transform Methods
  10.5-1. General Theory
  10.5-2. Laplace Transformation of the Time Variable
  10.5-3. Solution of Boundary-value Problems by Integral-transform Methods: Examples
    (a) One-dimensional Heat Conduction in a Wall with Fixed Boundary Temperatures: Use of Laplace Transforms
    (b) Heat Conduction into a Wall of Infinite Thickness: Use of Fourier Sine and Cosine Transforms
  10.5-4. Duhamel's Formulas

10.6. Related Topics, References, and Bibliography
  10.6-1. Related Topics
  10.6-2. References and Bibliography
10.1. INTRODUCTION AND SURVEY
10.1-1. Sections 10.2-1 to 10.2-7 deal with partial differential equations of the first order and their geometrical interpretation and include an outline of the Hamilton-Jacobi theory of canonical equations. Sections 10.3-1 to 10.3-3 introduce the characteristics and boundary-value problems of hyperbolic, parabolic, and elliptic second-order equations. Sections 10.4-1 to 10.5-4 present the solutions of the most important linear partial differential equations of physics (heat conduction, wave equation, etc.) from the heuristic point of view of an elementary course and outline the use of integral-transform methods. A more sophisticated theory of linear boundary-value problems and eigenvalue problems is described in Chap. 15.
10.1-2. Partial Differential Equations (see also Sec. 9.1-2). (a) A partial differential equation of order r is a functional equation of the form

F(x₁, x₂, . . . , xₙ; Φ; ∂Φ/∂x₁, ∂Φ/∂x₂, . . . ; ∂²Φ/∂x₁², ∂²Φ/∂x₁∂x₂, . . .) = 0    (10.1-1)
which involves at least one rth-order partial derivative of the unknown function Φ = Φ(x₁, x₂, . . . , xₙ) of two or more independent variables x₁, x₂, . . . , xₙ. A function Φ(x₁, x₂, . . . , xₙ) which satisfies the given partial differential equation on a specified region of "points" (x₁, x₂, . . . , xₙ) is called a solution or integral of the partial differential equation.

The general solution (general integral) of a given rth-order equation (1) will, in general, involve arbitrary functions. Substitution of specific functions yields particular integrals corresponding to given accessory conditions, e.g., given conditions on Φ(x₁, x₂, . . . , xₙ) and/or its derivatives on a curve, surface, etc., in the space of "points" (x₁, x₂, . . . , xₙ) (boundary conditions, initial conditions). Many partial differential equations admit additional solutions (singular integrals) which are not obtainable through substitution of specific functions for the arbitrary functions in the general integral (Sec. 10.2-1c).

(b) A partial differential equation is homogeneous if and only if every constant multiple αΦ of any solution Φ is a solution. A partial differential equation (1) is linear if and only if F is a linear function of Φ and its derivatives (see also Secs. 10.4-1 and 10.4-2).

(c) Systems of Partial Differential Equations. Compatibility Conditions. A system of partial differential equations

F_i(x₁, x₂, . . . , xₙ; Φ₁, Φ₂, . . . ; ∂Φ₁/∂x₁, . . .) = 0    (i = 1, 2, . . .)    (10.1-2)

involves a set of unknown functions Φ₁(x₁, x₂, . . . , xₙ), Φ₂(x₁, x₂, . . . , xₙ), . . . and their partial derivatives. One can reduce every partial differential equation or system of partial differential equations to a system of first-order equations by introducing suitable derivatives as new variables (see also Sec. 9.1-3).

A system of partial differential equations (2) may admit a solution Φ₁, Φ₂, . . . only if the given functions F_i and their derivatives satisfy a set of compatibility conditions (integrability conditions) which ensure that differentiations of two or more equations (2) yield compatible higher-order derivatives. To derive a compatibility condition, eliminate the Φ_i and their derivatives from a set of equations obtained by differentiation of the given partial differential equations (2).
EXAMPLE: Given ∂Φ/∂x₁ + f₁(x₁, x₂) = 0 and ∂Φ/∂x₂ + f₂(x₁, x₂) = 0, differentiation yields

∂²Φ/∂x₁∂x₂ = -∂f₁/∂x₂        ∂²Φ/∂x₂∂x₁ = -∂f₂/∂x₁

so that the given partial differential equations are compatible only if ∂f₁/∂x₂ = ∂f₂/∂x₁.

(d) Existence of Solutions. As with ordinary differential equations (Sec. 9.1-4), the actual existence and uniqueness of solutions for a given partial differential equation or system of partial differential equations require a proof in each case, even if all compatibility conditions are satisfied. See Refs. 10.5 and 10.18 for a number of existence theorems.
10.1-3. Solution of Partial Differential Equations: Separation of Variables (see also Secs. 10.4-2 to 10.4-9). In many important applications, an attempt to write solutions of the form

Φ = Φ(x₁, x₂, . . . , xₙ) ≡ φ(x₁)Ψ(x₂, x₃, . . . , xₙ)    (10.1-3)

permits one to rewrite a given partial differential equation (1) in the "separated" form

F₁(x₁; φ; dφ/dx₁, d²φ/dx₁², . . .) ≡ F₂(x₂, x₃, . . . , xₙ; Ψ; ∂Ψ/∂x₂, ∂Ψ/∂x₃, . . .)    (10.1-4a)

Then the unknown functions φ(x₁) and Ψ(x₂, x₃, . . . , xₙ) must satisfy

F₁(x₁; φ; dφ/dx₁, . . .) = C        F₂(x₂, x₃, . . . , xₙ; Ψ; ∂Ψ/∂x₂, . . .) = C    (10.1-4b)

where C is a constant of integration (separation constant) to be determined in accordance with suitably given boundary conditions or other accessory conditions. Note that the first of the equations (4b) is an ordinary differential equation for the unknown function φ(x₁); it is often possible to separate one or more of the remaining independent variables in the same manner.
10.2-1. First-order Partial Differential Equations with Two Independent Variables. Geometrical Interpretation (see also Secs. 9.2-2 and 17.3-11). (a) Given a first-order partial differential equation

F(x, y, z, p, q) = 0    (p ≡ ∂z/∂x, q ≡ ∂z/∂y; F_p² + F_q² ≠ 0)    (10.2-1)
for the unknown function z = z(x, y), let the given function F be single-valued and twice continuously differentiable, and consider x, y, z as rectangular cartesian coordinates. Then every solution z = z(x, y) of the partial differential equation (1) represents a surface whose normal has the direction numbers p, q, -1 at every surface point (x, y, z); the solution surface must touch a Monge cone of characteristic directions defined by

dx : dy : dz = F_p : F_q : (pF_p + qF_q)

at every point (x, y, z).

A set of values (x, y, z, p, q) is said to describe a planar element associating the direction numbers p, q, -1 (and thus the tangent plane of a hypothetical surface) with a point (x, y, z). A given partial differential equation (1) selects a "field" of planar elements (x, y, z, p, q) tangent to the Monge cones (Fig. 10.2-1).

Fig. 10.2-1. The initial strip and one characteristic strip of the solution surface. Note the Monge cone at P₀. (From Burington, R. S., and C. C. Torrance, Higher Mathematics, McGraw-Hill, New York, 1939.)

If F_p and F_q do not depend explicitly on p and q (quasilinear partial differential equation of the first order), then each Monge cone degenerates into a straight line (Monge axis).
(b) Strips and Characteristic Equations. A set of suitably differentiable functions

x = x(t)        y = y(t)        z = z(t)        p = p(t)        q = q(t)    (10.2-2)

represents the planar elements (points and tangent planes) along a strip of a regular surface if the functions (2) satisfy the strip condition

dz/dt = p dx/dt + q dy/dt

Given a first-order partial differential equation (1), every set of functions (2) which satisfies the ordinary differential equations
dx/dt = F_p    dy/dt = F_q    dz/dt = pF_p + qF_q
dp/dt = −(pF_z + F_x)    dq/dt = −(qF_z + F_y)

[characteristic equations associated with the partial differential equation (1)]    (10.2-3)
together with Eq. (1) is said to describe a characteristic strip. A characteristic strip touches a Monge cone at each point (x, y, z); the associated curve x = x(t), y = y(t), z = z(t) (characteristic curve, characteristic) lies on a solution surface and has a characteristic direction at each point. Solution surfaces can touch one another only along characteristics.
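The characteristic strips described above can be traced numerically. The sketch below is an added illustration, not part of the original text: it integrates the characteristic equations (10.2-3) with a hand-rolled Runge-Kutta step, assuming the two-dimensional eikonal equation F = p² + q² − 1 = 0 as a convenient test case. Since F is a first integral of the system, the computed strip must remain on the surface F = 0.

```python
# Characteristic equations (10.2-3) for F(x, y, z, p, q) = p**2 + q**2 - 1 = 0
# (two-dimensional eikonal equation, used here only as a test case):
#   dx/dt = F_p = 2p          dy/dt = F_q = 2q
#   dz/dt = p*F_p + q*F_q     dp/dt = -(p*F_z + F_x)    dq/dt = -(q*F_z + F_y)
def F(x, y, z, p, q):
    return p * p + q * q - 1.0

def rhs(state):
    x, y, z, p, q = state
    Fp, Fq, Fz, Fx, Fy = 2 * p, 2 * q, 0.0, 0.0, 0.0
    return (Fp, Fq, p * Fp + q * Fq, -(p * Fz + Fx), -(q * Fz + Fy))

def rk4_step(s, h):
    def shift(base, k, c):
        return tuple(b + c * ki for b, ki in zip(base, k))
    k1 = rhs(s)
    k2 = rhs(shift(s, k1, h / 2))
    k3 = rhs(shift(s, k2, h / 2))
    k4 = rhs(shift(s, k3, h))
    return tuple(b + h / 6 * (a + 2 * m + 2 * n + d)
                 for b, a, m, n, d in zip(s, k1, k2, k3, k4))

# initial planar element satisfying F = 0: the point (0, 0, 0) with p = 0.6, q = 0.8
state = (0.0, 0.0, 0.0, 0.6, 0.8)
for _ in range(100):                      # integrate to t = 1
    state = rk4_step(state, 0.01)

# F = 0 is a first integral, so the strip stays on F = 0; for this F the
# characteristics are straight rays x = 2pt, y = 2qt with dz/dt = 2.
assert abs(F(*state)) < 1e-12
assert abs(state[0] - 1.2) < 1e-9 and abs(state[1] - 1.6) < 1e-9
assert abs(state[2] - 2.0) < 1e-9
```

For this particular F the rays are straight lines because p and q are constant along each characteristic, which is what the final assertions exploit.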
(c) Singular Integrals (see also Secs. 9.2-2b, 10.1-2a, and 10.2-3c). Solutions z = z(x, y) of Eq. (1) derived by elimination of p and q from

F(x, y, z, p, q) = 0    ∂F(x, y, z, p, q)/∂p = 0    ∂F(x, y, z, p, q)/∂q = 0    (10.2-4)

are singular integrals; they do not satisfy the condition F_p² + F_q² ≠ 0 and are not contained in a general integral of the partial differential equation (1).
10.2-2. The Initial-value Problem. It is desired to find a solution z = z(x, y) of Eq. (1) subject to the initial conditions (Cauchy-type boundary conditions)

x = x_0(τ)    y = y_0(τ)    z = z_0(τ)    p = p_0(τ)    q = q_0(τ)    (10.2-5a)

with

F[x_0(τ), y_0(τ), z_0(τ), p_0(τ), q_0(τ)] = 0
dz_0/dτ = p_0 dx_0/dτ + q_0 dy_0/dτ    (10.2-5b)

which specify an "initial strip" of points and tangent planes of the solution surface along a regular curve C_0; the projection of C_0 onto the xy plane is to be a simple curve (Sec. 3.1-13). To solve this initial-value problem, find the solution

x = x(t, τ)    y = y(t, τ)    z = z(t, τ)    p = p(t, τ)    q = q(t, τ)    (10.2-6)
of the system of characteristic equations (3) subject to the initial conditions (5) for t = 0. The resulting functions (6) satisfy Eq. (1); find the solution z = z(x, y), or an implicit solution, by eliminating the parameters t and τ. The initial-value problem has a unique solution if the given initial conditions (5) imply

(dx_0/dτ) F_q − (dy_0/dτ) F_p ≠ 0    (10.2-7)

Otherwise the problem has a solution only if the given initial conditions (5) describe a characteristic strip; in this case there are infinitely many solutions.
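As a worked illustration of this initial-value procedure (added here; the profile g and all numerical settings are hypothetical choices of this sketch), the quasilinear equation q + zp = 0, i.e. ∂z/∂y + z ∂z/∂x = 0, can be solved from the initial strip y = 0, z = g(x) by integrating the characteristic system (10.2-3):

```python
# Quasilinear example: F = q + z*p = 0 (dz/dy + z*dz/dx = 0) with z = g(x) on y = 0.
# Characteristic equations (10.2-3):
#   dx/dt = F_p = z,  dy/dt = F_q = 1,  dz/dt = p*z + q,
#   dp/dt = -(p*F_z + F_x) = -p**2,  dq/dt = -(q*F_z + F_y) = -p*q
def g(tau):                               # hypothetical initial profile
    return 1.0 / (1.0 + tau * tau)

def dg(tau):
    return -2.0 * tau / (1.0 + tau * tau) ** 2

def rhs(s):
    x, y, z, p, q = s
    return (z, 1.0, p * z + q, -p * p, -p * q)

def rk4(s, h, steps):
    def shift(base, k, c):
        return tuple(b + c * ki for b, ki in zip(base, k))
    for _ in range(steps):
        k1 = rhs(s)
        k2 = rhs(shift(s, k1, h / 2))
        k3 = rhs(shift(s, k2, h / 2))
        k4 = rhs(shift(s, k3, h))
        s = tuple(b + h / 6 * (a + 2 * u + 2 * v + w)
                  for b, a, u, v, w in zip(s, k1, k2, k3, k4))
    return s

tau = 0.5
z0 = g(tau)
# initial strip (10.2-5): x0 = tau, y0 = 0, z0 = g(tau), p0 = g'(tau),
# and q0 = -z0*p0 so that F[x0, y0, z0, p0, q0] = 0
x, y, z, p, q = rk4((tau, 0.0, z0, dg(tau), -z0 * dg(tau)), 1e-3, 500)

# z is constant along each characteristic and x = tau + g(tau)*t, y = t,
# so the solution obeys the implicit relation z = g(x - z*y)
assert abs(z - z0) < 1e-10
assert abs(x - (tau + z0 * 0.5)) < 1e-9
assert abs(z - g(x - z * y)) < 1e-9
```

The final assertion checks the implicit solution z = g(x − zy), which is exactly what elimination of the parameters t and τ produces for this equation.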
Note: If the functions (5a) are left arbitrary subject to the conditions (5b) and (7), then the solution obtained above constitutes a general integral of the given partial differential equation (1).
10.2-3. Complete Integrals. Derivation of General Integrals, Particular Integrals, Singular Integrals, and Solutions of the Characteristic Equations. (a) A complete integral of the first-order partial differential equation (1) is a two-parameter family of solutions

z = Φ(x, y, λ, μ)    [(∂²Φ/∂λ ∂x)(∂²Φ/∂μ ∂y) − (∂²Φ/∂λ ∂y)(∂²Φ/∂μ ∂x) ≠ 0]    (10.2-8)

having single-valued derivatives

∂Φ/∂x = p(x, y, λ, μ) ≡ p    ∂Φ/∂y = q(x, y, λ, μ) ≡ q

and continuous second-order derivatives with respect to x, y, λ, μ, except possibly for ∂²Φ/∂λ² and ∂²Φ/∂μ²; a given set of values (x, y, z, p, q) satisfying Eq. (1) must define a unique set of parameters λ, μ. A complete integral (8) yields a general integral if one introduces an arbitrary function μ = μ(λ) and eliminates λ from

z − Φ[x, y, λ, μ(λ)] = 0    ∂Φ/∂λ + (∂Φ/∂μ)(dμ/dλ) = 0    (10.2-9)

(envelope of a one-parameter family of solutions, Sec. 17.3-11).
(b) Derivation of Particular Integrals. To obtain the particular integral corresponding to a given set of initial conditions (5), one must find the correct function μ(λ) to be substituted in the general integral derived in Sec. 10.2-3a. Obtain μ = μ(λ) by eliminating τ from

∂Φ(x, y, λ, μ)/∂x = p_0(τ)    ∂Φ(x, y, λ, μ)/∂y = q_0(τ)
x = x_0(τ)    y = y_0(τ)    (10.2-10)
(c) Derivation of Singular Integrals. Elimination of λ and μ from

z = Φ(x, y, λ, μ)    ∂Φ(x, y, λ, μ)/∂λ = 0    ∂Φ(x, y, λ, μ)/∂μ = 0    (10.2-11)

may yield a singular integral (envelope of a two-parameter family of solutions).
(d) Solution of the Characteristic Equations. Every complete integral (8) of a first-order partial differential equation (1) yields the complete solution of the system of ordinary differential equations (3). x = x(t), y = y(t) are obtained from

∂Φ(x, y, λ, μ)/∂λ = t    ∂Φ(x, y, λ, μ)/∂μ = β    (10.2-12)

where λ, μ, and β are arbitrary constants of integration; z = z(t), p = p(t), and q = q(t) are then obtained by substitution of x = x(t; λ, μ, β) and y = y(t; λ, μ, β) into

z = Φ(x, y, λ, μ)    p = ∂Φ(x, y, λ, μ)/∂x    q = ∂Φ(x, y, λ, μ)/∂y    (10.2-13)
(e) Special Cases. Table 10.2-1 lists complete integrals for a number of frequently encountered types of first-order partial differential equations and permits one to apply the methods of Sec. 10.2-3 to many problems. See also Ref. 10.2 for a general method of deriving complete integrals.

Table 10.2-1. Complete Integrals for Special Types of First-order Partial Differential Equations

No. | Type of differential equation | Complete integral z = Φ(x, y, λ, μ)
1  | x, y, z do not occur explicitly: F(p, q) = 0 | z = λx + λ′y + μ, with F(λ, λ′) = 0
2a | Only one of the variables x, y, z occurs explicitly: p = f(x, q) | z = ∫ f(x, λ) dx + λy + μ
2b | Only one of the variables x, y, z occurs explicitly: p = f(z, q) | x + λy = ∫ dz/g(z, λ) + μ*
3  | The variables are separable: F_1(x, p) = F_2(y, q) (= λ, Sec. 10.1-3), or p = f_1(x, λ), q = f_2(y, λ) | z = ∫ f_1(x, λ) dx + ∫ f_2(y, λ) dy + μ
4  | Generalized Clairaut equation (see also Sec. 9.2-4): z = px + qy + f(p, q) | z = λx + μy + f(λ, μ)
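Row 4 of Table 10.2-1 is easy to verify directly: substituting the complete integral into the generalized Clairaut equation reproduces it identically. A minimal spot check (with an arbitrary, hypothetical choice of f):

```python
# Row 4 of Table 10.2-1: the generalized Clairaut equation z = p*x + q*y + f(p, q)
# has the complete integral z = lam*x + mu*y + f(lam, mu); then p = lam, q = mu,
# and the equation is satisfied identically in x and y.
def f(a, b):                              # arbitrary (hypothetical) choice
    return a * a - 3.0 * a * b + 2.0 * b

def Phi(x, y, lam, mu):                   # complete integral
    return lam * x + mu * y + f(lam, mu)

for (x, y, lam, mu) in [(0.3, -1.2, 0.7, 2.0), (2.0, 0.5, -1.1, 0.4)]:
    z = Phi(x, y, lam, mu)
    p, q = lam, mu                        # dPhi/dx, dPhi/dy
    assert abs(z - (p * x + q * y + f(p, q))) < 1e-12
```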
10.2-4. Partial Differential Equations of the First Order with n Independent Variables. (a) The Initial-value Problem (see also Secs. 10.2-1 and 10.2-2). It is desired to find the solution

z = Φ(x_1, x_2, . . . , x_n)

of the first-order partial differential equation

F(x_1, x_2, . . . , x_n; z; p_1, p_2, . . . , p_n) = 0    (p_i ≡ ∂z/∂x_i; Σ_{k=1}^n F_{p_k}² ≠ 0)    (10.2-14)
* Where g(z, λ) is a solution of p = f(z, λp) (footnote to Table 10.2-1).
subject to the initial conditions

z = z_0(τ_1, τ_2, . . . , τ_{n−1})    x_i = x_{i0}(τ_1, τ_2, . . . , τ_{n−1})
p_i = p_{i0}(τ_1, τ_2, . . . , τ_{n−1})    (i = 1, 2, . . . , n)    (10.2-15a)

where x_i = x_{i0}(τ_1, τ_2, . . . , τ_{n−1}) (i = 1, 2, . . . , n) is to represent a hypersurface free of multiple points, and

F(x_{10}, x_{20}, . . . , x_{n0}; z_0; p_{10}, p_{20}, . . . , p_{n0}) = 0
∂z_0/∂τ_j = Σ_{k=1}^n p_{k0} ∂x_{k0}/∂τ_j    (j = 1, 2, . . . , n − 1)    (10.2-15b)

To obtain the correct relation between x_1, x_2, . . . , x_n, and z, solve the system of ordinary differential equations

dx_i/dt = F_{p_i}    dz/dt = Σ_{k=1}^n p_k F_{p_k}    dp_i/dt = −(F_{x_i} + p_i F_z)
(i = 1, 2, . . . , n)    (characteristic equations)    (10.2-16)
subject to the initial conditions (15) for t = 0, and eliminate the n parameters τ_1, τ_2, . . . , τ_{n−1}, and t. The initial-value problem has a unique solution if the given initial conditions (15) imply

    | ∂F/∂p_1            ∂F/∂p_2            . . .  ∂F/∂p_n            |
    | ∂x_{10}/∂τ_1       ∂x_{20}/∂τ_1       . . .  ∂x_{n0}/∂τ_1       |
    | . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . |  ≠ 0
    | ∂x_{10}/∂τ_{n−1}   ∂x_{20}/∂τ_{n−1}   . . .  ∂x_{n0}/∂τ_{n−1}   |

[i.e., ∂(x_1, x_2, . . . , x_n)/∂(t, τ_1, τ_2, . . . , τ_{n−1}) ≠ 0]    (10.2-17)
(b) Complete Integrals and Solution of the Characteristic Equations (see also Sec. 10.2-3). A complete integral of the first-order partial differential equation (14) is an n-parameter family of solutions

z = Φ(x_1, x_2, . . . , x_n; α_1, α_2, . . . , α_n)    (det [∂²Φ/∂x_i ∂α_k] ≠ 0)    (10.2-18)

having single-valued derivatives

∂Φ/∂x_i = p_i(x_1, x_2, . . . , x_n; α_1, α_2, . . . , α_n)

and continuous second-order derivatives ∂²Φ/∂x_i ∂x_k and ∂²Φ/∂x_i ∂α_k; a given set of values (x_1, x_2, . . . , x_n; z; p_1, p_2, . . . , p_n) satisfying Eq. (14) must define a unique set of parameters α_1, α_2, . . . , α_n. A complete integral (18) yields a general integral if one introduces n arbitrary
functions α_k = α_k(λ_1, λ_2, . . . , λ_{n−1}) (k = 1, 2, . . . , n) and eliminates the n − 1 parameters λ_j from the n equations

z = Φ[x_1, x_2, . . . , x_n; α_1(λ_1, λ_2, . . . , λ_{n−1}), α_2(λ_1, λ_2, . . . , λ_{n−1}), . . . , α_n(λ_1, λ_2, . . . , λ_{n−1})]    (10.2-19)

Σ_{k=1}^n (∂Φ/∂α_k)(∂α_k/∂λ_j) = 0    (j = 1, 2, . . . , n − 1)
Every complete integral (18) yields the complete solution of the system of ordinary differential equations (16). Obtain x_i = x_i(t) (i = 1, 2, . . . , n) from

∂Φ/∂α_k = β_k    (k = 1, 2, . . . , n − 1)    ∂Φ/∂α_n = t    (10.2-20)

where α_1, α_2, . . . , α_n, β_1, β_2, . . . , β_{n−1} are 2n − 1 arbitrary constants of integration; then find z = z(t) and p_i = p_i(t) (i = 1, 2, . . . , n) by substituting x_i = x_i(t; α_1, α_2, . . . , α_n; β_1, β_2, . . . , β_{n−1}) into

z = Φ(x_1, x_2, . . . , x_n; α_1, α_2, . . . , α_n)    p_i = ∂Φ/∂x_i    (i = 1, 2, . . . , n)
(c) Singular Integrals (see also Secs. 10.2-1c and 10.2-3c). Singular integrals of the partial differential equation (14) are solutions z = Φ(x_1, x_2, . . . , x_n) obtained by elimination of the p_i from the n + 1 equations

F(x_1, x_2, . . . , x_n; z; p_1, p_2, . . . , p_n) = 0    ∂F/∂p_i = 0    (i = 1, 2, . . . , n)    (10.2-21)

subject to ∂F/∂x_i + p_i ∂F/∂z = 0. A given complete integral (18) may yield a singular integral by elimination of α_1, α_2, . . . , α_n from the n + 1 equations

z − Φ(x_1, x_2, . . . , x_n; α_1, α_2, . . . , α_n) = 0    ∂Φ/∂α_k = 0    (k = 1, 2, . . . , n)    (10.2-22)
10.2-5. Contact Transformations (see also Sec. 9.2-3b). Some problems involving first-order partial differential equations can be simplified by a twice continuously differentiable transformation

x̄_i = x̄_i(x_1, x_2, . . . , x_n; z; p_1, p_2, . . . , p_n)
p̄_i = p̄_i(x_1, x_2, . . . , x_n; z; p_1, p_2, . . . , p_n)    (i = 1, 2, . . . , n)
z̄ = z̄(x_1, x_2, . . . , x_n; z; p_1, p_2, . . . , p_n)

with

∂(x̄_1, x̄_2, . . . , x̄_n; z̄; p̄_1, p̄_2, . . . , p̄_n)/∂(x_1, x_2, . . . , x_n; z; p_1, p_2, . . . , p_n) ≠ 0    (10.2-23)

chosen so that every complete differential dz = Σ_{k=1}^n p_k dx_k is transformed into a complete differential dz̄ = Σ_{k=1}^n p̄_k dx̄_k with p̄_i = ∂z̄/∂x̄_i (i = 1, 2, . . . , n), or

dz̄ − Σ_{k=1}^n p̄_k dx̄_k = σ(x_1, x_2, . . . , x_n; z; p_1, p_2, . . . , p_n) (dz − Σ_{k=1}^n p_k dx_k)
[σ(x_1, x_2, . . . , x_n; z; p_1, p_2, . . . , p_n) ≠ 0]    (10.2-24)

Such a transformation is called a contact transformation; a contact transformation necessarily preserves every strip condition and will thus preserve contact (osculation) of regular surface elements for n = 2 (see also Sec. 10.2-1). A contact transformation (23) transforms the given partial differential equation (14) into a new partial differential equation

F̄(x̄_1, x̄_2, . . . , x̄_n; z̄; p̄_1, p̄_2, . . . , p̄_n) ≡ F(x_1, x_2, . . . , x_n; z; p_1, p_2, . . . , p_n) = 0
(p̄_i ≡ ∂z̄/∂x̄_i, i = 1, 2, . . . , n)    (10.2-25)

with solutions z̄ = z̄(x̄_1, x̄_2, . . . , x̄_n). It may happen that the new equation (25) does not contain the p̄_i and is thus an ordinary equation.

EXAMPLE: n-dimensional Legendre transformation (see also Secs. 9.2-3b and 11.5-6).
x̄_i = p_i    p̄_i = x_i    (i = 1, 2, . . . , n)
z̄ = Σ_{k=1}^n p_k x_k − z    (10.2-26)
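The contact property of the Legendre transformation (10.2-26) can be checked numerically for n = 1: after transforming, the new slope ∂z̄/∂x̄ must equal p̄ = x. A sketch under an assumed test function z = x⁴/4 (all numbers are this example's own):

```python
# Legendre transformation (10.2-26) for n = 1, with the assumed test function
# z = x**4/4, so p = dz/dx = x**3.  Transformed element:
#   x_bar = p,  p_bar = x,  z_bar = p*x - z
def z(x):
    return x ** 4 / 4.0

def p(x):
    return x ** 3

def transformed(x):                       # (x_bar, p_bar, z_bar)
    return p(x), x, p(x) * x - z(x)

# contact property: dz_bar/dx_bar must equal p_bar = x (central difference)
x0, h = 1.3, 1e-4
xb1, _, zb1 = transformed(x0 + h)
xb2, _, zb2 = transformed(x0 - h)
slope = (zb1 - zb2) / (xb1 - xb2)
assert abs(slope - x0) < 1e-6

# the transformation is involutory: z = p_bar*x_bar - z_bar recovers z
xb, pb, zb = transformed(x0)
assert abs(pb * xb - zb - z(x0)) < 1e-12
```

The second assertion illustrates why the Legendre transformation is its own inverse: applying (10.2-26) to the transformed element reproduces the original one.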
10.2-6. Canonical Equations and Canonical Transformations. (a) Canonical Equations. For a first-order partial differential equation

G(x_1, x_2, . . . , x_n; p_1, p_2, . . . , p_n) = 0    (p_i ≡ ∂z/∂x_i, i = 1, 2, . . . , n)    (10.2-27)

which does not contain the dependent variable z explicitly, the characteristic equations (16) take the especially simple form

dz/dt = Σ_{k=1}^n p_k dx_k/dt    (10.2-28)

dx_i/dt = ∂G/∂p_i    dp_i/dt = −∂G/∂x_i    (i = 1, 2, . . . , n)    (canonical equations)    (10.2-29)
Note: The solution of every given first-order partial differential equation (14) can be reduced to the solution of a partial differential equation of the simpler form (27) with n + 1 independent variables x_1, x_2, . . . , x_n, z; every solution u = u(x_1, x_2, . . . , x_n; z) of the partial differential equation

F(x_1, x_2, . . . , x_n; z; −(∂u/∂x_1)/(∂u/∂z), −(∂u/∂x_2)/(∂u/∂z), . . . , −(∂u/∂x_n)/(∂u/∂z)) = 0    (10.2-30)

yields a corresponding solution z = z(x_1, x_2, . . . , x_n) of the given partial differential equation (14) such that u(x_1, x_2, . . . , x_n; z) = 0.
(b) Canonical Transformations (see also Sec. 11.6-8). A twice continuously differentiable transformation

x̄_i = x̄_i(x_1, x_2, . . . , x_n; p_1, p_2, . . . , p_n)
p̄_i = p̄_i(x_1, x_2, . . . , x_n; p_1, p_2, . . . , p_n)    (i = 1, 2, . . . , n)

with

Δ ≡ ∂(x̄_1, x̄_2, . . . , x̄_n; p̄_1, p̄_2, . . . , p̄_n)/∂(x_1, x_2, . . . , x_n; p_1, p_2, . . . , p_n) ≠ 0    (10.2-31)

is a canonical transformation if and only if it transforms the canonical equations (29) into a new set of canonical equations

dx̄_i/dt = ∂Ḡ(x̄_1, x̄_2, . . . , x̄_n; p̄_1, p̄_2, . . . , p̄_n)/∂p̄_i
dp̄_i/dt = −∂Ḡ(x̄_1, x̄_2, . . . , x̄_n; p̄_1, p̄_2, . . . , p̄_n)/∂x̄_i    (i = 1, 2, . . . , n)    (10.2-32)
for an arbitrary twice-differentiable function G(x_1, x_2, . . . , x_n; p_1, p_2, . . . , p_n). This is true if and only if

∂x̄_i/∂x_k = ∂p_k/∂p̄_i    ∂p̄_i/∂x_k = −∂p_k/∂x̄_i
∂x̄_i/∂p_k = −∂x_k/∂p̄_i    ∂p̄_i/∂p_k = ∂x_k/∂x̄_i    (i, k = 1, 2, . . . , n)    (10.2-33)

i.e., if and only if Σ_{k=1}^n (p_k dx_k − p̄_k dx̄_k) is the complete differential dΩ of a "generating function" Ω = Ω(x_1, x_2, . . . , x_n; p_1, p_2, . . . , p_n). Note that Eq. (33) implies Δ = 1. For every canonical transformation (31) the function Ḡ appearing in the transformed canonical equations (32) is simply given by

Ḡ(x̄_1, x̄_2, . . . , x̄_n; p̄_1, p̄_2, . . . , p̄_n) = G(x_1, x_2, . . . , x_n; p_1, p_2, . . . , p_n)
Given a first-order partial differential equation (27) with the canonical equations (29), the new canonical equations (32) are those associated with the partial differential equation

Ḡ(x̄_1, x̄_2, . . . , x̄_n; p̄_1, p̄_2, . . . , p̄_n) = 0    (p̄_i ≡ ∂z̄/∂x̄_i, i = 1, 2, . . . , n)    (10.2-34)

The solution z̄ = z̄(x̄_1, x̄_2, . . . , x̄_n) of the transformed partial differential equation (34) is related to the solution z = z(x_1, x_2, . . . , x_n) of the original equation (27) by

z̄ = z − Ω(x_1, x_2, . . . , x_n; p_1, p_2, . . . , p_n)    (10.2-35)

Equations (31) and (35) together constitute a contact transformation (Sec. 10.2-5).
A canonical transformation can be specified in terms of its generating function Ω(x_1, x_2, . . . , x_n; p_1, p_2, . . . , p_n); the latter is often given indirectly as a function of the x_i and x̄_i, or of the p_i and p̄_i. In particular, every twice-differentiable function Ω = Ψ(x_1, x_2, . . . , x_n; x̄_1, x̄_2, . . . , x̄_n) defines a canonical transformation (31) such that

Σ_{k=1}^n (p_k dx_k − p̄_k dx̄_k) = dΨ

or

p_i = ∂Ψ/∂x_i    p̄_i = −∂Ψ/∂x̄_i    (i = 1, 2, . . . , n)    (10.2-36)

The canonical transformations (31) constitute a group (see also Sec. 12.2-8).
(c) Poisson Brackets. Given any pair of twice continuously differentiable functions g(x_1, x_2, . . . , x_n; p_1, p_2, . . . , p_n), h(x_1, x_2, . . . , x_n; p_1, p_2, . . . , p_n), one defines the Poisson bracket

[g, h] ≡ Σ_{k=1}^n (∂g/∂x_k ∂h/∂p_k − ∂g/∂p_k ∂h/∂x_k)    (10.2-37)

so that

[h, g] = −[g, h]    [g, g] = 0    [g, constant] = 0    (10.2-38)
[g_1 g_2, h] = g_2[g_1, h] + g_1[g_2, h]    [g_1 + g_2, h] = [g_1, h] + [g_2, h]    (10.2-39)

[f, [g, h]] + [g, [h, f]] + [h, [f, g]] = 0    (Poisson's identity)    (10.2-40)

Note that [f, g] = 0, [f, h] = 0 implies [f, [g, h]] = 0. Given a transformation (31), let

ḡ(x̄_1, x̄_2, . . . , x̄_n; p̄_1, p̄_2, . . . , p̄_n) ≡ g(x_1, x_2, . . . , x_n; p_1, p_2, . . . , p_n)
h̄(x̄_1, x̄_2, . . . , x̄_n; p̄_1, p̄_2, . . . , p̄_n) ≡ h(x_1, x_2, . . . , x_n; p_1, p_2, . . . , p_n)

and

[ḡ, h̄] ≡ Σ_{k=1}^n (∂ḡ/∂x̄_k ∂h̄/∂p̄_k − ∂ḡ/∂p̄_k ∂h̄/∂x̄_k)    (10.2-41)
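The bracket identities above are straightforward to spot-check with finite-difference derivatives. The sketch below is illustrative only; it assumes the sign convention [g, h] = Σ(∂g/∂x_k ∂h/∂p_k − ∂g/∂p_k ∂h/∂x_k) used here, one degree of freedom, and arbitrary test functions. It also verifies the canonical bracket relation [x̄, p̄] = 1 for the elementary transformation x̄ = p, p̄ = −x:

```python
import math

# Finite-difference Poisson bracket for one degree of freedom, with the sign
# convention [g, h] = dg/dx*dh/dp - dg/dp*dh/dx assumed in (10.2-37):
H = 1e-5

def d(f, var, x, p):                      # central difference in x or p
    if var == 'x':
        return (f(x + H, p) - f(x - H, p)) / (2 * H)
    return (f(x, p + H) - f(x, p - H)) / (2 * H)

def bracket(g, h, x, p):
    return d(g, 'x', x, p) * d(h, 'p', x, p) - d(g, 'p', x, p) * d(h, 'x', x, p)

g = lambda x, p: x * x * p                # arbitrary test functions
h = lambda x, p: math.sin(x) + p
k = lambda x, p: x + 2 * p
x0, p0 = 0.7, -0.4

# antisymmetry (38) and the product rule (39)
assert abs(bracket(g, h, x0, p0) + bracket(h, g, x0, p0)) < 1e-9
gh = lambda x, p: g(x, p) * h(x, p)
assert abs(bracket(gh, k, x0, p0)
           - h(x0, p0) * bracket(g, k, x0, p0)
           - g(x0, p0) * bracket(h, k, x0, p0)) < 1e-6

# the elementary transformation x_bar = p, p_bar = -x (generating function
# Psi = x*x_bar in (10.2-36)) preserves the canonical bracket [x_bar, p_bar] = 1
xbar = lambda x, p: p
pbar = lambda x, p: -x
assert abs(bracket(xbar, pbar, x0, p0) - 1.0) < 1e-9
```

Note that the transformation in the last check follows from (10.2-36) with Ψ = x·x̄, since p = ∂Ψ/∂x = x̄ and p̄ = −∂Ψ/∂x̄ = −x.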
A given transformation (31) is a canonical transformation if and only if it preserves Poisson brackets, i.e., if and only if [ḡ, h̄] = [g, h] for all twice continuously differentiable functions g, h. Now let the variables x_i and p_i be functions of a parameter t such that a set of canonical equations (29) holds. Then

[x_i, x_k] = 0    [p_i, p_k] = 0    [x_i, p_k] = 0 if i ≠ k, 1 if i = k    (i, k = 1, 2, . . . , n)    (10.2-42)

and for every suitably differentiable function f(t; x_1, x_2, . . . , x_n; p_1, p_2, . . . , p_n)

df/dt = [f, G] + ∂f/∂t    (10.2-43)

In particular, ∂f/∂t = 0, [f, G] = 0 imply f = constant. Two functions g, h of the x_i(t) and p_i(t) are canonically conjugate if and only if [g, h] = 1; this is true whenever g and h satisfy a pair of canonical equations (e.g., x_i, p_i, and t, G). A given transformation (31) is a canonical transformation if and only if it preserves the relations (42), i.e., if and only if
[x̄_i, x̄_k] = 0    [p̄_i, p̄_k] = 0    [x̄_i, p̄_k] = 0 if i ≠ k, 1 if i = k    (i, k = 1, 2, . . . , n)    (10.2-44)

10.2-7. The Hamilton-Jacobi Equation. Solution of the Canonical Equations. (a) An important application of the theory of first-order partial differential equations is the solution of systems of ordinary differential equations which can be written as canonical equations associated with a partial differential equation of the special form

p + H(x; x_1, x_2, . . . , x_n; p_1, p_2, . . . , p_n) = 0
(p ≡ ∂z/∂x, p_i ≡ ∂z/∂x_i, i = 1, 2, . . . , n)    (Hamilton-Jacobi equation)    (10.2-45)

Note that n + 1 independent variables x and x_i are involved. Since Eq. (29) yields dx/dt = 1, one can write x ≡ t (assuming x = 0 for t = 0);
the 2n canonical equations (29) for the x_i and p_i become*

dx_i/dt = ∂H(t; x_1, x_2, . . . , x_n; p_1, p_2, . . . , p_n)/∂p_i
dp_i/dt = −∂H(t; x_1, x_2, . . . , x_n; p_1, p_2, . . . , p_n)/∂x_i    (i = 1, 2, . . . , n)    (10.2-46)

Systems of ordinary differential equations having the precise form (46) are of importance in the calculus of variations (Sec. 11.6-8) and in analytical dynamics and optics. If it is possible to find an n-parameter solution

z = Φ(t; x_1, x_2, . . . , x_n; α_1, α_2, . . . , α_n) + α_{n+1}    (10.2-47)

of the Hamilton-Jacobi equation (45) with det [∂²Φ/∂x_i ∂α_k] ≠ 0, then the solution x_i = x_i(t), p_i = p_i(t) (i = 1, 2, . . . , n) of the system of 2n ordinary differential equations (46) is given by

∂Φ(t; x_1, x_2, . . . , x_n; α_1, α_2, . . . , α_n)/∂α_i = β_i    (i = 1, 2, . . . , n)    (10.2-48)
where the α_k and β_i are 2n constants of integration. One first solves the n equations (48) for the x_i = x_i(t); the p_i = p_i(t) are obtained by substitution of x_i = x_i(t) into p_i = ∂Φ/∂x_i (i = 1, 2, . . . , n) (see also Sec. 10.2-4b).

(b) Use of Canonical Transformations (see also Sec. 10.2-6b). If a complete integral (47) solving the given equations (46) is not known, one may try to introduce a canonical transformation relating the 2n + 2 variables x ≡ t, p, x_i, p_i to 2n + 2 new variables x̄ ≡ t̄, p̄, x̄_i, p̄_i so that p̄ + H̄ = p + H, and

dx̄_i/dt = ∂H̄/∂p̄_i    dp̄_i/dt = −∂H̄/∂x̄_i    (i = 1, 2, . . . , n)    (10.2-49)

In this case,

Σ_{k=1}^n p_k dx_k − H dt − (Σ_{k=1}^n p̄_k dx̄_k − H̄ dt̄) = dΩ    (10.2-50)

must be a complete differential of a "generating function" Ω. In particular, let t̄ = t; then every twice continuously differentiable function Ω = Ψ(t; x_1, x_2, . . . , x_n; x̄_1, x̄_2, . . . , x̄_n) defines a canonical transformation such that

p_i = ∂Ψ/∂x_i    p̄_i = −∂Ψ/∂x̄_i    (i = 1, 2, . . . , n)
H̄ = H + ∂Ψ/∂t    (10.2-51)

* The remaining canonical equation is dp/dt = −∂H/∂t.
It may be possible to choose this transformation so that H̄ does not depend explicitly on the x̄_i (transformation to cyclic variables x̄_i).

(c) Perturbation Theory. Given the solution (47) of the Hamilton-Jacobi equation (45), the generating function

Ψ = Φ(t; x_1, x_2, . . . , x_n; x̄_1, x̄_2, . . . , x̄_n)

defines a canonical transformation (51) yielding constant transformed variables:

x̄_i = α_i    p̄_i = −β_i    (i = 1, 2, . . . , n)    (10.2-52)

As shown in Sec. 10.2-7a, the 2n equations (52) yield the solution x_i = x_i(t), p_i = p_i(t) of the canonical system (46).
Given such a solution of the "unperturbed" canonical system (46), one often desires to solve the canonical equations

dx_i/dt = ∂K/∂p_i    dp_i/dt = −∂K/∂x_i    (i = 1, 2, . . . , n)
with K = H(t; x_1, x_2, . . . , x_n; p_1, p_2, . . . , p_n) + εH_1(t; x_1, x_2, . . . , x_n; p_1, p_2, . . . , p_n)    (10.2-53)

where εH_1 is a small correction term (perturbation, e.g., the effect of a small disturbing field in celestial mechanics). Using the known solution (47) of the "unperturbed" Hamilton-Jacobi equation (45), introduce new variables x̄_i, p̄_i by a canonical transformation (51) with the generating function Ψ = Φ(t; x_1, x_2, . . . , x_n; x̄_1, x̄_2, . . . , x̄_n). Since Eq. (46) has been replaced by Eq. (53), the x̄_i and p̄_i are no longer constants but satisfy transformed canonical equations

dx̄_i/dt = ε ∂H̄_1/∂p̄_i    dp̄_i/dt = −ε ∂H̄_1/∂x̄_i    (i = 1, 2, . . . , n)    (10.2-54)

which may be easier to solve than the given system (53). If one writes

x̄_i = α_i + εX_i(t)    p̄_i = −β_i + εP_i(t)    (i = 1, 2, . . . , n)

the corrections εX_i(t), εP_i(t) to the constants (52) may yield corresponding corrections to the solutions of the unperturbed system (46) by an approximately linear transformation.
10.3. HYPERBOLIC, PARABOLIC, AND ELLIPTIC PARTIAL DIFFERENTIAL EQUATIONS. CHARACTERISTICS
10.3-1. Quasilinear Partial Differential Equations of Order 2 with Two Independent Variables. Characteristics. (a) A partial differential equation of order r is quasilinear if and only if it is linear in the rth-order derivatives of the unknown function Φ. Thus a real quasilinear second-order equation with two independent variables x, y has the form

a_11 ∂²Φ/∂x² + 2a_12 ∂²Φ/∂x ∂y + a_22 ∂²Φ/∂y² + B = 0    (10.3-1)
where a_11, a_12, a_22, and B are suitably differentiable real functions of x, y, Φ, ∂Φ/∂x, and ∂Φ/∂y.

(b) Characteristics. Given a real boundary curve C_0 described by

x = x(τ)    y = y(τ)    (10.3-2a)

a set of Cauchy-type boundary conditions (called initial conditions in Sec. 10.2-2) specifies boundary values*

Φ = Φ(τ)    ∂Φ/∂x = p(τ)    ∂Φ/∂y = q(τ)    (dΦ/dτ = p dx/dτ + q dy/dτ)    (10.3-2b)

A given set of suitably differentiable functions (2) uniquely defines the values of ∂²Φ/∂x² = u(τ), ∂²Φ/∂x ∂y = v(τ), ∂²Φ/∂y² = w(τ) (and also the values of higher-order derivatives of Φ) on the curve (2a) at every point P_0 where the functions (2a) do not satisfy the ordinary differential equation

a_11 (dy/dx)² − 2a_12 (dy/dx) + a_22 = 0
or    dy/dx = [a_12 ± √(a_12² − a_11 a_22)]/a_11    (10.3-4)
This is true because the derivatives u, v, w of Φ on C_0 must satisfy Eq. (1) and the "second-order strip conditions"

dp/dτ = u dx/dτ + v dy/dτ    dq/dτ = v dx/dτ + w dy/dτ    (10.3-5)

so that, for instance,

v = (a_11 dy dp + a_22 dx dq + B dx dy)/(a_11 dy² − 2a_12 dx dy + a_22 dx²)    (10.3-6)

Equation (4) holds at P_0 if C_0 is a segment of a characteristic base curve (often called a characteristic) y = y(x) satisfying Eq. (4), or if C_0 touches such a curve at P_0. Properly speaking, the characteristics associated with the given partial differential equation (1) are curves x = x(τ), y = y(τ), z = z(τ) on the solution surface z = Φ(x, y) such that y = y(x) satisfies Eq. (4). Since the expression (6) must be
* If one is given the boundary values of the normal derivative ∂Φ/∂n (Sec. 5.6-1), say

∂Φ/∂n = D(τ)    (10.3-3)

solve Eq. (3) together with dΦ/dτ = p dx/dτ + q dy/dτ to obtain p(τ) and q(τ).
finite on the solution surface, p = ∂Φ/∂x and q = ∂Φ/∂y must satisfy the ordinary differential equation

a_11 dy dp + a_22 dx dq + B dx dy = 0
or    dq/dp = −[a_12 ± √(a_12² − a_11 a_22)]/a_22 − (B/a_22)(dy/dp)    (10.3-7)

on every characteristic defined by Eq. (4), with corresponding plus and minus signs.

Note: The second-order derivatives of Φ may be discontinuous (though finite) on a characteristic, so that different solutions can be "patched together" along characteristics.
(c) Hyperbolic, Parabolic, and Elliptic Partial Differential Equations. The given partial differential equation (1) is

Hyperbolic if a_11 a_22 − a_12² < 0 in the region of points (x, y) under consideration, so that Eq. (4) describes two distinct families of real characteristic base curves

Parabolic if a_11 a_22 − a_12² = 0, so that there exists a single family of real characteristic base curves

Elliptic if a_11 a_22 − a_12² > 0, so that no real characteristics exist

10.3-2. Solution of Hyperbolic Partial Differential Equations by the Method of Characteristics (see also Sec. 10.3-4). In the hyperbolic case (a_11 a_22 − a_12² < 0), simultaneous solution of the four ordinary differential equations (4) and (7) yields p = ∂Φ/∂x and q = ∂Φ/∂y on the solution surface as functions of x and y, so that Φ = Φ(x, y) can be obtained by further integration. In many applications, ∂Φ/∂x and ∂Φ/∂y rather than Φ(x, y) are of paramount interest (velocity components); the method forms the basis for many analytical and numerical solution procedures in the theory of compressible flow.
Computations are considerably simplified in special cases. If B = 0, one has

(dq/dp)_+ = −(dx/dy)_−    (dq/dp)_− = −(dx/dy)_+    (10.3-8)

where the subscripts refer to the characteristics derived, respectively, with a plus sign and a minus sign in Eqs. (4) and (7). If, in addition, a_11, a_12, and a_22 depend only on ∂Φ/∂x, ∂Φ/∂y, one need only solve Eq. (7) to obtain the characteristics (e.g., two-dimensional steady supersonic flow). Again, if a_11, a_12, and a_22 depend only on x, y, one need only solve Eq. (4).
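The classification of Sec. 10.3-1c depends only on the sign of the discriminant a_11 a_22 − a_12², so it is trivial to mechanize. A minimal sketch (the helper name and the sample coefficients are this example's own):

```python
# Pointwise classification of a11*Phi_xx + 2*a12*Phi_xy + a22*Phi_yy + B = 0
# by the sign of the discriminant a11*a22 - a12**2 (Sec. 10.3-1c):
def classify(a11, a12, a22):
    disc = a11 * a22 - a12 ** 2
    if disc < 0:
        return "hyperbolic"
    if disc == 0:
        return "parabolic"
    return "elliptic"

# familiar constant-coefficient cases
assert classify(1.0, 0.0, -1.0) == "hyperbolic"   # wave equation
assert classify(1.0, 0.0, 0.0) == "parabolic"     # diffusion equation
assert classify(1.0, 0.0, 1.0) == "elliptic"      # Laplace equation

# Tricomi equation y*Phi_xx + Phi_yy = 0: the type changes with the sign of y
assert classify(-2.0, 0.0, 1.0) == "hyperbolic"   # y = -2
assert classify(3.0, 0.0, 1.0) == "elliptic"      # y = +3
```

The Tricomi case illustrates that for variable coefficients the type is a local property, which is why the discriminant is assumed not to change sign in the region under consideration (Sec. 10.3-3).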
10.3-3. Transformation of Hyperbolic, Parabolic, and Elliptic Differential Equations to Canonical Form. For convenience, let a_11, a_12, and a_22 be functions of x and y alone, so that the ordinary differential equation (4) separates into two linear first-order equations

dy/dx = λ_1(x, y)    with solutions h_1(x, y) = α_1    (10.3-9a)
dy/dx = λ_2(x, y)    with solutions h_2(x, y) = α_2    (10.3-9b)

where α_1, α_2 are arbitrary constants. Depending on the sign of the
function a_11 a_22 − a_12² in the region of points (x, y) under consideration,* three cases arise.
1. Hyperbolic Partial Differential Equation (a_11 a_22 − a_12² < 0). λ_1(x, y) and λ_2(x, y) are real and distinct. There exist two one-parameter families of real characteristics (9a) and (9b); a curve of each family passes through every point (x, y) under consideration. Introduce new coordinates

x̄ = h_1(x, y)    ȳ = h_2(x, y)    (10.3-10)

to transform the given partial differential equation (1) to the canonical form

∂²Φ/∂x̄ ∂ȳ = f_1(x̄, ȳ; Φ; ∂Φ/∂x̄, ∂Φ/∂ȳ)    (10.3-11)

The alternative coordinate system

x̃ = x̄ + ȳ = h_1(x, y) + h_2(x, y)    ỹ = x̄ − ȳ = h_1(x, y) − h_2(x, y)    (10.3-12)

yields the second canonical form

∂²Φ/∂x̃² − ∂²Φ/∂ỹ² = f_2(x̃, ỹ; Φ; ∂Φ/∂x̃, ∂Φ/∂ỹ)    (10.3-13)
2. Parabolic Partial Differential Equation (a_11 a_22 − a_12² = 0). λ_1(x, y) and λ_2(x, y) are real and identical. There exists a single one-parameter family of real characteristics (9); one characteristic passes through each point (x, y) under consideration. Introduce

x̄ = h_1(x, y)    ȳ = h_0(x, y)    (10.3-14)

where h_0(x, y) is an arbitrary suitably differentiable function such that ∂(x̄, ȳ)/∂(x, y) ≠ 0. Equation (1) is transformed to the canonical form

∂²Φ/∂ȳ² = f(x̄, ȳ; Φ; ∂Φ/∂x̄, ∂Φ/∂ȳ)    (10.3-15)
3. Elliptic Partial Differential Equation (a_11 a_22 − a_12² > 0). λ_1(x, y) and λ_2(x, y) and hence also h_1(x, y) and h_2(x, y) are complex conjugates; no real characteristics exist. Introduce

x̄ = h_1(x, y) + h_2(x, y)    ȳ = −i[h_1(x, y) − h_2(x, y)]    (10.3-16)

* In the usual applications, the discriminant a_11 a_22 − a_12² does not change sign in the region under consideration. Note also that the sign of a_11 a_22 − a_12² is invariant with respect to any real continuously differentiable coordinate transformation x̄ = x̄(x, y), ȳ = ȳ(x, y) with nonvanishing Jacobian.
to transform Eq. (1) to the canonical form

∂²Φ/∂x̄² + ∂²Φ/∂ȳ² = f(x̄, ȳ; Φ; ∂Φ/∂x̄, ∂Φ/∂ȳ)    (10.3-17)

These three types of partial differential equations differ significantly with respect to the types of boundary conditions yielding valid and unique integrals (Secs. 10.3-4 and 15.6-2).
10.3-4. Typical Boundary-value Problems for Second-order Equations. (a) Hyperbolic Differential Equations. The Cauchy initial-value problem of Sec. 10.3-1b requires one to solve the hyperbolic differential equation (1), given Φ, ∂Φ/∂x, and ∂Φ/∂y on an arc C_0 of a regular curve which is neither a characteristic nor touches any characteristic. Such a curve intersects each characteristic at most once; the given initial values determine Φ in a triangular region D_0 bounded by C_0 and a characteristic of each family (Fig. 10.3-1a).

Fig. 10.3-1. Boundary-value problems for hyperbolic differential equations.

More specifically, the value of Φ at
each point P of D_0 is determined by the values of Φ and its derivatives on the portion C_P of C_0 which is bounded by the characteristics through P.
A second type of boundary-value problem prescribes only a linear relation α(∂Φ/∂n) + βΦ = b(x, y) on an arc C_0 specified as above; in addition, Φ is given on a characteristic arc C_C through one end point of C_0 (Fig. 10.3-1b). A third type of boundary-value problem prescribes Φ on two intersecting characteristic arcs C_C and C′_C (Fig. 10.3-1c). Combinations of these three types of problems will indicate admissible boundary conditions for more complicated boundaries. Thus in Fig. 10.3-1d, Φ, ∂Φ/∂x, and ∂Φ/∂y may be given on C_0, but only one relation of the type α(∂Φ/∂n) + βΦ = b(x, y) can be prescribed on each of C′_0 and C″_0. The solutions in the various regions indicated in Fig. 10.3-1d are "patched" together along characteristics ("patching curves") so that Φ is continuous, while ∂Φ/∂x and ∂Φ/∂y may be discontinuous. Note that closed boundaries are not admissible.
EXAMPLE: Initial-value problems for the hyperbolic one-dimensional wave equation, Sec. 10.3-5.
(b) Parabolic Differential Equations. There exists one and only one family of characteristics. Although the Cauchy problem can again be solved for a suitable arc C_0, one is usually given Φ on a characteristic x̄(≡ t) = 0 and α(∂Φ/∂n) + βΦ on two curves which do not intersect or touch each other or any characteristic. Closed boundaries in the xy plane are not admissible.
EXAMPLE: An admissible boundary-value problem for the parabolic diffusion equation ∂²Φ/∂x² − (1/γ²) ∂Φ/∂t = 0 specifies Φ(x, 0) = Φ_0(x) on the characteristic t = 0 (initial conditions) and α(x, t) ∂Φ/∂x + β(x, t)Φ on the curves x = a and x = b (boundary conditions).
admissible. Typical problems specify a(x, y)(d$/dn) + fi(x, y)$ on a curve C enclosing a bounded or unbounded solution region ("true" boundary-value problems). 10.3-5. The One-dimensional Wave Equation (see also Sees. 10.3-4a, 10.4-8a, and 10.4-96). The hyperbolic differential equation
d2${x, t) _ \ d2$(x, t) = dx2
c2
(10.3-18)
dt2
(one-dimensional wave equation)
10.3-5
PARTIAL DIFFERENTIAL EQUATIONS
308
has the general solution
$(s, t) = *i(s - a) + $2(x + ct)
(10.3-19)
which represents a pair of arbitrarily-shaped waves respectively propa gated in the +x and —x directions with phase velocity c. The char acteristics x ± ct = constant are loci of constant phase (Fig. 10.3-2). Sections 10.3-5a, b, and c list solutions for three types of initial-value problems (e.g., waves in a string); in practice, the Fourier-expansion
Fig. 10.3-2. Characteristics for the one-dimensional wave equation ∂²Φ/∂x² − (1/c²) ∂²Φ/∂t² = 0.
method of Sec. 10.4-9b may be preferable, since it applies also to nonhomogeneous differential equations (forced vibrations).
(a) The initial conditions

Φ(x, 0) = Φ_0(x)    ∂Φ/∂t |_{t=0} = v_0(x)    (−∞ < x < ∞)    (10.3-20)

specify a true Cauchy initial-value problem (Secs. 10.3-1 and 10.3-4a; see also Figs. 10.3-1a and 10.3-2). The solution is

Φ(x, t) = ½[Φ_0(x − ct) + Φ_0(x + ct)] + (1/2c) ∫ from x−ct to x+ct of v_0(ξ) dξ    (d'Alembert's solution)    (10.3-21)

Disturbances initially restricted to any given interval a < x < b cannot affect the solution outside the expanding interval a − ct < x < b + ct (propagation with finite velocity c).
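d'Alembert's solution (10.3-21) can be validated numerically. The sketch below is an added illustration with a hypothetical Gaussian initial shape and v_0 = 0; it checks the wave-equation residual (10.3-18) and the initial conditions by finite differences:

```python
import math

# d'Alembert's solution (10.3-21) with v0 = 0:
#   Phi(x, t) = [Phi0(x - c*t) + Phi0(x + c*t)] / 2
c = 2.0

def Phi0(x):                              # hypothetical initial shape
    return math.exp(-x * x)

def Phi(x, t):
    return 0.5 * (Phi0(x - c * t) + Phi0(x + c * t))

# finite-difference residual of the wave equation (10.3-18)
h = 1e-3
for (xs, ts) in [(0.4, 0.2), (-1.0, 0.7)]:
    Phi_xx = (Phi(xs + h, ts) - 2 * Phi(xs, ts) + Phi(xs - h, ts)) / h ** 2
    Phi_tt = (Phi(xs, ts + h) - 2 * Phi(xs, ts) + Phi(xs, ts - h)) / h ** 2
    assert abs(Phi_xx - Phi_tt / c ** 2) < 1e-4

# initial conditions: Phi(x, 0) = Phi0(x) and dPhi/dt = 0 at t = 0
assert abs(Phi(0.4, 0.0) - Phi0(0.4)) < 1e-15
assert abs((Phi(0.4, 1e-6) - Phi(0.4, -1e-6)) / 2e-6) < 1e-9
```

The two half-amplitude copies of Φ_0 travel in opposite directions, so the solution at (x, t) depends only on initial data in the interval [x − ct, x + ct], in agreement with the remark on the finite propagation velocity above.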
(b) The initial conditions

Φ(x, 0) = Φ_0(x)    ∂Φ/∂t |_{t=0} = v_0(x)    (x > 0)    (10.3-22)

and the boundary condition

Φ(0, t) = 0    (t > 0)    (10.3-23)

specify a combination-type boundary-value problem (see also Sec. 10.3-4a and Figs. 10.3-1d and 10.3-2). The solution is

Φ(x, t) = ½[P(x − ct) + P(x + ct)] + (1/2c) ∫ from x−ct to x+ct of Q(ξ) dξ    (10.3-24)

P(x) = Φ_0(x) (x > 0), −Φ_0(−x) (x < 0)
Q(x) = v_0(x) (x > 0), −v_0(−x) (x < 0)    (10.3-25)
which corresponds to superposition of incoming and reflected waves.
(c) The initial conditions

Φ(x, 0) = Φ_0(x)    ∂Φ/∂t |_{t=0} = v_0(x)    (0 < x < L)    (10.3-26)

and the boundary conditions

Φ(0, t) = Φ(L, t) = 0    (t > 0)    (10.3-27)

define another combination-type problem (see also Figs. 10.3-1d and 10.3-2). The solution is given by Eq. (24) if one interprets P(x) and Q(x) as odd periodic functions with period 2L and respectively equal to Φ_0(x) and v_0(x) for 0 < x < L.

10.3-6. The Riemann-Volterra Method for Linear Hyperbolic
Equations (see also Secs. 10.3-4a and 15.5-1). It is desired to solve the Cauchy problem (Sec. 10.3-1) for the real hyperbolic differential equation

LΦ(x, y) ≡ ∂²Φ/∂x ∂y + a(x, y) ∂Φ/∂x + b(x, y) ∂Φ/∂y + c(x, y)Φ = f(x, y)    (10.3-28)

with Φ and ∂Φ/∂x, ∂Φ/∂y given on a boundary curve C_0 satisfying the conditions of Sec. 10.3-1. Referring to Fig. 10.3-1a, the solution at each point P with the coordinates x = ξ, y = η is given in terms of the "initial values" of Φ(x, y), ∂Φ/∂x, and ∂Φ/∂y on the boundary-curve segment C_P cut off by the characteristics through P; with A and B denoting the end points of C_P,*

Φ(ξ, η) = [G_R Φ]_A − ∫ over C_P of G_R Φ(a dy − b dx) + ∫ over C_P of (Φ ∂G_R/∂y dy + G_R ∂Φ/∂x dx) + ∬ over D_P of G_R f dx dy
= [G_R Φ]_B − ∫ over C_P of G_R Φ(a dy − b dx) − ∫ over C_P of (Φ ∂G_R/∂x dx + G_R ∂Φ/∂y dy) + ∬ over D_P of G_R f dx dy    (10.3-29)

* See also the footnote to Sec. 10.3-1b.
where the so-called Riemann-Green function G_R(x, y; ξ, η) is continuously differentiable on and inside the region D_P bounded by C_P and the characteristics through P and satisfies the conditions of the simpler boundary-value problem

∂²G_R/∂x ∂y − ∂(aG_R)/∂x − ∂(bG_R)/∂y + cG_R = 0    (x, y inside D_P)
G_R = exp ∫ from η to y of a(ξ, ỹ) dỹ    (x = ξ)
G_R = exp ∫ from ξ to x of b(x̃, η) dx̃    (y = η)    (10.3-30)

EXAMPLES: For a = b = c = 0, G_R ≡ 1; and for a = b = 0, c = constant, one has G_R = J_0[√(4c(x − ξ)(y − η))], where J_0(z) is the zero-order Bessel function of the first kind (Sec. 21.8-1). For many practical applications involving linear hyperbolic differential equations with constant coefficients, the integral-transform methods (Sec. 10.5-1) are preferable.
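For the second example above (a = b = 0, c = constant) the Riemann-Green function reduces to a rapidly convergent power series in u = c(x − ξ)(y − η), since J_0(√(4u)) = Σ_m (−u)^m/(m!)². The numerical sketch below (illustrative; the constants are arbitrary) checks the two defining properties in (10.3-30) for this case, namely the adjoint equation ∂²G_R/∂x ∂y + cG_R = 0 and G_R = 1 on the characteristics through (ξ, η):

```python
import math

# Riemann-Green function for a = b = 0, c = const:
#   G_R = J0(sqrt(4*c*(x - xi)*(y - eta))) = sum_m (-u)**m / (m!)**2,
#   with u = c*(x - xi)*(y - eta)
c, xi, eta = 0.8, 0.3, -0.5

def G(x, y, terms=30):
    u = c * (x - xi) * (y - eta)
    return sum((-u) ** m / math.factorial(m) ** 2 for m in range(terms))

# boundary values (10.3-30): G_R = 1 on the characteristics x = xi and y = eta
# (the exponentials there reduce to 1 because a = b = 0)
assert abs(G(xi, 1.7) - 1.0) < 1e-15
assert abs(G(2.2, eta) - 1.0) < 1e-15

# adjoint equation for a = b = 0: d2G/dxdy + c*G = 0 (mixed central difference)
x0, y0, h = 1.1, 0.4, 1e-4
Gxy = (G(x0 + h, y0 + h) - G(x0 + h, y0 - h)
       - G(x0 - h, y0 + h) + G(x0 - h, y0 - h)) / (4 * h * h)
assert abs(Gxy + c * G(x0, y0)) < 1e-6
```

Writing G_R = f(u) reduces the adjoint equation to u f″ + f′ + f = 0, and the series coefficients a_{m+1} = −a_m/(m + 1)² satisfy this recurrence term by term, which is what the finite-difference residual confirms numerically.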
10.3-7. Equations with Three or More Independent Variables. A real partial differential equation of the form

\sum_{i=1}^{n}\sum_{k=1}^{n} a_{ik}(x_1, x_2, \ldots, x_n)\frac{\partial^2\Phi}{\partial x_i\,\partial x_k} + B\left(x_1, \ldots, x_n; \Phi; \frac{\partial\Phi}{\partial x_1}, \ldots, \frac{\partial\Phi}{\partial x_n}\right) = 0   (10.3-31)

where \Phi = \Phi(x_1, x_2, \ldots, x_n), is an elliptic partial differential equation if and only if the matrix [a_{ik}] is positive definite (Sec. 13.5-2) throughout the region of interest.
In many problems involving nonelliptic partial differential equations, the unknown function \Phi depends on n "space coordinates" x_1, x_2, \ldots, x_n and a "time coordinate" t; the partial differential equation takes either of the forms

\sum_{i=1}^{n}\sum_{k=1}^{n} a_{ik}\frac{\partial^2\Phi}{\partial x_i\,\partial x_k} - \frac{\partial^2\Phi}{\partial t^2} + B = 0 \qquad \text{or} \qquad \sum_{i=1}^{n}\sum_{k=1}^{n} a_{ik}\frac{\partial^2\Phi}{\partial x_i\,\partial x_k} - \frac{\partial\Phi}{\partial t} + B = 0   (10.3-32)

where the matrix [a_{ik}] = [a_{ik}(x_1, x_2, \ldots, x_n; t)] is real and positive definite and B is a function of the x_i, t, \Phi, and the first-order derivatives. The two types of partial differential equations are respectively referred to as hyperbolic and parabolic differential equations. Characteristics of the more general partial differential equations (31) and (32) are surfaces or hypersurfaces on which Cauchy-type boundary conditions cannot
determine higher-order derivatives of the solution (Ref. 10.5). Elliptic partial differential equations have no real characteristics. The concept of characteristics has also been extended to certain partial differential equations of order higher than two, and to some systems of partial differential equations (Ref. 10.5).

10.4. LINEAR PARTIAL DIFFERENTIAL EQUATIONS OF PHYSICS. PARTICULAR SOLUTIONS
10.4-1. Physical Background and Survey. (a) Many problems of classical physics require one to find a solution \Phi(x, t) or \Phi(\mathbf r, t) of a linear partial differential equation on a given space interval or region V (Table 10.4-1). The unknown function \Phi and/or its derivatives must, in addition, satisfy given initial conditions for t = 0 and linear boundary conditions on the boundary S of V. Related problems arise in quantum mechanics.
Each partial differential equation listed in Table 10.4-1 is homogeneous (Sec. 10.1-2b) if f = 0. A given boundary condition is, again, homogeneous if and only if it holds for every multiple \alpha\Phi of any function which satisfies the condition. Inhomogeneities represent the action of external influences (forces, heat sources, electric charges or currents) on the physical system under consideration. Typically, an elliptic differential equation describes an equilibrium situation (steady-state heat flow, elastic deformation, electrostatic field). Parabolic and hyperbolic differential equations describe transients (free vibrations, return to equilibrium after a given initial disturbance) or, if there are time-dependent inhomogeneities ("forcing functions" in the differential equation or boundary conditions), such equations describe the propagation of disturbances (forced vibrations, radiation). One can relate each problem of the type discussed here to an approximating system of ordinary differential equations by replacing each space derivative by a difference coefficient in the manner of Sec. 20.8-3 (method of difference-differential equations). This method is not only useful for numerical calculations; the analogy to a discrete-variable problem of the type described in Secs. 9.4-1 to 9.4-8 may give interesting physical insight.
(b) Construction of Solutions by Superposition. The most important methods for the solution of linear differential equations are based on the fundamental superposition theorems stated explicitly in Secs. 9.3-1 and 15.4-2: one superimposes a judiciously chosen set of trial functions to construct solutions which match given forcing functions, given boundary conditions, and/or given initial conditions. Eigenfunction expansions (Secs. 10.4-2 and 15.4-12) and integral-transform methods (Secs. 9.3-7, 9.4-5, and 10.5-1 to 10.5-3) are systematic schemes for constructing such solutions. Green's-function methods (Secs. 9.3-3, 9.4-3, 15.5-1 to 15.5-4, 15.6-6, 15.6-9, and 15.6-10)
Table 10.4-1. The Most Important Linear Partial Differential Equations of Classical Physics

Hyperbolic (waves: strings, membranes, fluids, electromagnetic):
    one-dimensional:  \partial^2\Phi/\partial x^2 - (1/c^2)\,\partial^2\Phi/\partial t^2 = f(x, t)
    multidimensional:  \nabla^2\Phi - (1/c^2)\,\partial^2\Phi/\partial t^2 = f(\mathbf r, t)
    accessory conditions: boundary conditions; initial conditions on \Phi and \partial\Phi/\partial t

Hyperbolic (damped waves, transmission lines):
    one-dimensional:  \partial^2\Phi/\partial x^2 - a_0\,\partial^2\Phi/\partial t^2 - a_1\,\partial\Phi/\partial t - a_2\Phi = f(x, t)
    multidimensional:  \nabla^2\Phi - a_0\,\partial^2\Phi/\partial t^2 - a_1\,\partial\Phi/\partial t - a_2\Phi = f(\mathbf r, t)
    accessory conditions: boundary conditions; initial conditions on \Phi and \partial\Phi/\partial t

Parabolic (heat conduction, diffusion):
    one-dimensional:  \partial^2\Phi/\partial x^2 - (1/\gamma^2)\,\partial\Phi/\partial t = f(x, t)
    multidimensional:  \nabla^2\Phi - (1/\gamma^2)\,\partial\Phi/\partial t = f(\mathbf r, t)
    accessory conditions: boundary conditions; initial conditions on \Phi

Elliptic (static case):
    one-dimensional:  d^2\Phi/dx^2 = f(x)
    multidimensional:  \nabla^2\Phi = f(\mathbf r)
    accessory conditions: boundary conditions only

4th order (elastic vibrations):
    one-dimensional:  \partial^4\Phi/\partial x^4 + (1/a^2)\,\partial^2\Phi/\partial t^2 = f(x, t)
    multidimensional:  \nabla^2\nabla^2\Phi + (1/a^2)\,\partial^2\Phi/\partial t^2 = f(\mathbf r, t)
    accessory conditions: boundary conditions; initial conditions on \Phi and \partial\Phi/\partial t

4th order (static case):
    one-dimensional:  d^4\Phi/dx^4 = f(x)
    multidimensional:  \nabla^2\nabla^2\Phi = f(\mathbf r)
    accessory conditions: boundary conditions only
are superposition schemes which reduce the solution of suitable problems to that of problems with simpler forcing functions or boundary conditions. The general theory of linear boundary-value problems is treated in Chap. 15; Secs. 10.4-3 to 10.4-8 present useful particular solutions from the heuristic point of view of an elementary course.
(c) Choice of Coordinate System. The system of coordinates x^1, x^2 or x^1, x^2, x^3 used to specify the point (\mathbf r) is usually chosen so that (1) separation of variables is possible (Sec. 10.1-3) and/or (2) the given boundary S becomes a coordinate line or surface, or a pair of coordinate lines or surfaces.
10.4-2. Linear Boundary-value Problems (see also Secs. 15.4-1 and 15.4-2). (a) Let V be a given three- or two-dimensional region of points (\mathbf r), and let S be the boundary surface or boundary curve of V. One desires to solve the partial differential equation

L\Phi(\mathbf r) = f(\mathbf r) \qquad (\mathbf r \text{ in } V)   (10.4-1a)

subject to a set of boundary conditions

B_i\Phi(\mathbf r) = b_i(\mathbf r) \qquad (i = 1, 2, \ldots, N;\ \mathbf r \text{ in } S)   (10.4-1b)

where L\Phi and B_i\Phi are linear homogeneous functions of the unknown function \Phi(\mathbf r) and its derivatives. Every solution of this linear boundary-value problem can be written as the sum \Phi = \Phi_A + \Phi_B of solutions of the simpler boundary-value problems

L\Phi_A(\mathbf r) = 0 \qquad (\mathbf r \text{ in } V)   (10.4-2a)
B_i\Phi_A(\mathbf r) = b_i(\mathbf r) \qquad (i = 1, 2, \ldots, N;\ \mathbf r \text{ in } S)   (10.4-2b)

and

L\Phi_B(\mathbf r) = f(\mathbf r) \qquad (\mathbf r \text{ in } V)
B_i\Phi_B(\mathbf r) = 0 \qquad (i = 1, 2, \ldots, N;\ \mathbf r \text{ in } S)   (10.4-3)
Note that Eq. (2) involves a homogeneous differential equation, whereas Eq. (3) has homogeneous boundary conditions.

(b) Homogeneous Differential Equation with Nonhomogeneous Boundary Conditions. The particular solutions listed in Secs. 10.4-3 to 10.4-6 often permit one to expand \Phi_A(\mathbf r) as an infinite series or definite integral

\Phi_A(\mathbf r) = \sum_\mu \alpha_\mu\Phi_\mu(\mathbf r) \qquad \text{or} \qquad \Phi_A(\mathbf r) = \int \alpha(\mu)\Phi_\mu(\mathbf r)\,d\mu   (10.4-4)

over suitably chosen solutions \Phi_\mu(\mathbf r) of Eq. (2a); the coefficients \alpha_\mu or \alpha(\mu) are chosen so as to satisfy the boundary conditions (2b). Frequently the approximation functions \Phi_\mu(\mathbf r) are complete orthonormal sets (Sec. 15.2-4; e.g., Fourier series, Fourier integrals); one may then expand the
given functions b_i(\mathbf r) in the form (4) and obtain the unknown coefficients by comparison.
(c) Nonhomogeneous Differential Equation with Homogeneous Boundary Conditions: Eigenfunction Expansions (see also Secs. 15.4-5 to 15.4-12). For an important class of partial differential equations (1), the solution \Phi_B(\mathbf r) of Eq. (3) can be constructed by a similar superposition of solutions \psi(\mathbf r) of the related homogeneous differential equation

L\psi(\mathbf r) = \lambda\psi(\mathbf r) \qquad (\mathbf r \text{ in } V)   (10.4-5a)

for the different possible values of \lambda permitting \psi(\mathbf r) to satisfy the homogeneous boundary conditions

B_i\psi(\mathbf r) = 0 \qquad (i = 1, 2, \ldots, N;\ \mathbf r \text{ in } S)   (10.4-5b)

In general, such solutions exist only for specific values of the parameter \lambda (eigenvalues); the solutions \psi = \psi_\lambda(\mathbf r) corresponding to each eigenvalue are called eigenfunctions of the boundary-value problem (5). Sections 10.4-3 to 10.4-8 list particular solutions \psi(\mathbf r) for a number of partial differential equations of the form (5a). These functions may be superimposed to form solutions of the corresponding nonhomogeneous problem (3). Eigenfunctions corresponding to discrete sets of eigenvalues often form convenient orthonormal sets (Sec. 15.2-4) for expansion of forcing functions and solutions in the form

f(\mathbf r) = \sum_\lambda f_\lambda\psi_\lambda(\mathbf r), \qquad \Phi_B = \sum_\lambda \varphi_\lambda\psi_\lambda(\mathbf r)   (10.4-6a)

(refer also to Table 8.7-1), whereas continuous sets (continuous spectra) S_\lambda of eigenvalues \lambda yield integral-transform expansions

f(\mathbf r) = \int_{S_\lambda} F(\lambda)\psi_\lambda(\mathbf r)\,d\lambda, \qquad \Phi_B = \int_{S_\lambda} \varphi(\lambda)\psi_\lambda(\mathbf r)\,d\lambda   (10.4-6b)

Substitution of Eq. (6a) or (6b) into Eq. (3) yields the unknown coefficients \varphi_\lambda or \varphi(\lambda). Refer to Secs. 10.5-1 and 15.4-12 for the general theory of this solution method and its range of applicability. An important alternative method of solving Eq. (3), the so-called Green's-function method, is treated in Secs. 15.5-1 to 15.5-4.
(d) Problems Involving the Time Variable. In problems involving the time as well as the space coordinates, one desires to solve a linear differential equation

L\Phi(\mathbf r, t) = f(\mathbf r, t) \qquad (\mathbf r \text{ in } V,\ t > 0)   (10.4-7a)

subject to a set of linear initial conditions

A_j\Phi(\mathbf r, t) = a_j(\mathbf r) \qquad (j = 1, 2, \ldots, M;\ \mathbf r \text{ in } V;\ t = 0 + 0)   (10.4-7b)
as well as the linear boundary conditions

B_i\Phi(\mathbf r, t) = b_i(\mathbf r, t) \qquad (i = 1, 2, \ldots, N;\ \mathbf r \text{ in } S;\ t > 0)   (10.4-7c)
Since the initial conditions (7b) are simply boundary conditions on the "coordinate surface" t = 0, the methods of Secs. 10.4-2a and b apply (Sec. 10.4-9). The following procedures may simplify the treatment of initial conditions:

1. \Phi_B = \Phi_B(\mathbf r, t) can be further split into a sum of functions respectively satisfying homogeneous initial conditions and homogeneous boundary conditions.
2. Separation of variables (Secs. 10.1-3, 10.4-7b, and 10.4-8).
3. Laplace transformation of the time variable (Secs. 10.5-2 and 10.5-3a).
4. Duhamel's method (Sec. 10.5-4).
10.4-3. Particular Solutions of Laplace's Differential Equation: Three-dimensional Case (see also Table 6.5-1, Sec. 10.1-3, and Secs. 15.6-1 to 15.6-9; see Ref. 10.5 for solutions employing other coordinate systems). (a) Rectangular Cartesian Coordinates x, y, z.

\nabla^2\Phi = \frac{\partial^2\Phi}{\partial x^2} + \frac{\partial^2\Phi}{\partial y^2} + \frac{\partial^2\Phi}{\partial z^2} = 0   (10.4-8)

admits the particular solutions

\Phi_{k_1k_2k_3}(x, y, z) = e^{k_1x + k_2y + k_3z} \qquad (k_1^2 + k_2^2 + k_3^2 = 0)
\Phi_{000}(x, y, z) = (a + bx)(\alpha + \beta y)(A + Bz)   (10.4-9)

which combine into various products of real linear, exponential, trigonometric, and/or hyperbolic functions.
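As a numerical illustration (a sketch, not handbook text; the wave numbers a, b are arbitrary), one real combination of the solutions (10.4-9) is e^{k_3z}\cos ax\,\cos by with k_1 = ia, k_2 = ib, k_3 = \sqrt{a^2 + b^2}, so that k_1^2 + k_2^2 + k_3^2 = 0; a finite-difference Laplacian confirms that it is harmonic:

```python
import math

a, b = 1.3, 0.6
kz = math.hypot(a, b)            # k1 = i*a, k2 = i*b, k3 = kz: squares sum to zero

def phi(x, y, z):
    """Real combination of solutions (10.4-9): exp(kz*z) cos(a x) cos(b y)."""
    return math.exp(kz * z) * math.cos(a * x) * math.cos(b * y)

def laplacian(f, x, y, z, h=1e-4):
    d2 = lambda g: (g(h) - 2*g(0.0) + g(-h)) / (h*h)
    return (d2(lambda t: f(x+t, y, z)) + d2(lambda t: f(x, y+t, z))
            + d2(lambda t: f(x, y, z+t)))

res = abs(laplacian(phi, 0.4, -0.2, 0.1))   # should vanish to truncation error
```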
(b) Cylindrical Coordinates r', \varphi, z (x = r'\cos\varphi, y = r'\sin\varphi; see also Sec. 6.5-1). Then

\nabla^2\Phi = \frac{\partial^2\Phi}{\partial r'^2} + \frac{1}{r'}\frac{\partial\Phi}{\partial r'} + \frac{1}{r'^2}\frac{\partial^2\Phi}{\partial\varphi^2} + \frac{\partial^2\Phi}{\partial z^2} = 0   (10.4-10)

separates for \Phi = u(\varphi)v(z)w(r') into

\frac{d^2u}{d\varphi^2} + m^2u(\varphi) = 0   (10.4-11)

\frac{d^2v}{dz^2} + K^2v(z) = 0   (10.4-12)

\frac{d^2w}{dr'^2} + \frac{1}{r'}\frac{dw}{dr'} - \left(K^2 + \frac{m^2}{r'^2}\right)w(r') = 0   (10.4-13)
PARTIAL DIFFERENTIAL EQUATIONS
10.4-3
316
where uniqueness requires u(\varphi + 2\pi) = u(\varphi), i.e., m = 0, \pm 1, \pm 2, \ldots; K is an arbitrary separation constant to be determined by the boundary conditions. Equation (10) admits solutions of the form

\Phi_{\pm Km}(r', \varphi, z) = e^{\pm iKz}Z_m(iKr')e^{im\varphi}
\Phi_{0m}(r', \varphi, z) = (ar'^m + br'^{-m})(\alpha + \beta z)e^{im\varphi}   (10.4-14)

where Z_m(\zeta) is a cylinder function (Sec. 21.8-1); in particular, if a given problem requires \Phi to be analytic for r' = 0, then Z_m(\zeta) must be a Bessel function of the first kind (Sec. 21.8-1). Note that complex-conjugate solutions (14) combine into real particular solutions like

(a\cos Kz + b\sin Kz)[AZ_m(iKr') + A^*Z_m(-iKr')](\alpha\cos m\varphi + \beta\sin m\varphi)

for real K. Such solutions can be superimposed to form real Fourier series. Note that m = 0 in cases of axial symmetry.
(c) Spherical Coordinates r, \vartheta, \varphi (see also Sec. 6.5-1).

\nabla^2\Phi = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\Phi}{\partial r}\right) + \frac{1}{r^2\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\frac{\partial\Phi}{\partial\vartheta}\right) + \frac{1}{r^2\sin^2\vartheta}\frac{\partial^2\Phi}{\partial\varphi^2} = 0   (10.4-15)

separates for \Phi = u(\varphi)p(\vartheta)w(r) into

\frac{d^2u}{d\varphi^2} + m^2u(\varphi) = 0   (10.4-16)

(1 - \zeta^2)\frac{d^2p}{d\zeta^2} - 2\zeta\frac{dp}{d\zeta} + \left[j(j + 1) - \frac{m^2}{1 - \zeta^2}\right]p(\zeta) = 0 \qquad (\zeta = \cos\vartheta)   (10.4-17)

\frac{d^2w}{dr^2} + \frac{2}{r}\frac{dw}{dr} - \frac{j(j + 1)}{r^2}w(r) = 0   (10.4-18)

where regularity for \vartheta = 0, \vartheta = \pi and uniqueness require that m = 0, \pm 1, \pm 2, \ldots, \pm j and j = 0, 1, 2, \ldots. Equation (15) admits particular solutions of the form

\Phi_{jm}(r, \vartheta, \varphi) = (Ar^j + Br^{-j-1})P_j^m(\cos\vartheta)e^{im\varphi}   (10.4-19)

where the P_j^m(\zeta) are associated Legendre functions of the first kind of degree j
(Sec. 21.8-10). Combination of such solutions yields more general particular solutions

\Phi_j(r, \vartheta, \varphi) = (A_jr^j + B_jr^{-j-1})Y_j(\vartheta, \varphi)   (10.4-20)

Y_j(\vartheta, \varphi) = \sum_{m=0}^{j}(a_{jm}\cos m\varphi + b_{jm}\sin m\varphi)P_j^m(\cos\vartheta) \qquad (j = 0, 1, 2, \ldots)   (10.4-21)

The functions (21) satisfy Eq. (15) for r = constant and are called spherical surface harmonics of degree j (see also Sec. 21.8-12). There are 2j + 1 linearly independent spherical surface harmonics of degree j. For orthogonal-series expansion of solutions, note that the functions

\sqrt{\frac{2j+1}{2\pi}\frac{(j-m)!}{(j+m)!}}\,P_j^m(\cos\vartheta)\cos m\varphi, \qquad \sqrt{\frac{2j+1}{2\pi}\frac{(j-m)!}{(j+m)!}}\,P_j^m(\cos\vartheta)\sin m\varphi

(j = 0, 1, 2, \ldots; m = 1, 2, \ldots, j), or the sometimes more convenient functions

\sqrt{\frac{2j+1}{4\pi}\frac{(j-|m|)!}{(j+|m|)!}}\,P_j^{|m|}(\cos\vartheta)e^{im\varphi} \qquad (j = 0, 1, 2, \ldots;\ m = 0, \pm 1, \pm 2, \ldots, \pm j)

constitute orthonormal sets in the sense of Sec. 21.8-2. These functions are called tesseral spherical harmonics (sectorial spherical harmonics for m = j; see also Secs. 10.4-9a, 15.2-6, and 21.8-11). The orthonormal functions

\sqrt{\frac{2j+1}{4\pi}}\,P_j(\cos\vartheta)

are known as zonal spherical harmonics.

If one admits solutions with singularities for \vartheta = 0, \vartheta = \pi, one must add analogous solutions involving the associated Legendre functions of the second kind (Ref. 21.3).
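The normalization factors above can be checked against the Legendre orthogonality relation \int_{-1}^{1}P_jP_k\,d\zeta = [2/(2j + 1)]\delta_{jk}. The following sketch (plain Python, not handbook text; the degrees are illustrative) verifies this relation numerically and confirms the unit norm of a zonal harmonic over the sphere:

```python
import math

def legendre(j, x):
    """P_j(x) by the Bonnet recurrence (n+1)P_{n+1} = (2n+1)x P_n - n P_{n-1}."""
    p0, p1 = 1.0, x
    if j == 0:
        return p0
    for n in range(1, j):
        p0, p1 = p1, ((2*n + 1) * x * p1 - n * p0) / (n + 1)
    return p1

def inner(j, k, n=2000):
    """Composite-Simpson approximation of the integral of P_j P_k over (-1, 1)."""
    h = 2.0 / n
    s = 0.0
    for i in range(n + 1):
        x = -1.0 + i * h
        w = 1 if i in (0, n) else (4 if i % 2 else 2)
        s += w * legendre(j, x) * legendre(k, x)
    return s * h / 3.0

orth = inner(3, 5)                          # distinct degrees: should vanish
norm = inner(4, 4)                          # should equal 2/(2*4+1) = 2/9
# zonal harmonic sqrt((2j+1)/(4 pi)) P_j(cos theta): norm over the full sphere
zonal_norm = 2*math.pi * (2*4 + 1)/(4*math.pi) * norm   # should equal 1
```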
10.4-4. Particular Solutions for the Space Form of the Three-dimensional Wave Equation (see also Secs. 10.3-5 and 15.6-10). The differential equation

\nabla^2\Phi + k^2\Phi = 0 \qquad \text{(space form of the wave equation, Helmholtz's equation)}   (10.4-22)

is obtained, for example, by separation of the time variable t in the three-dimensional wave equation (44). The coefficient k^2 may be negative (k = i\kappa, space form of the Klein-Gordon equation). For suitably given homogeneous linear boundary conditions (e.g., \Phi = 0 on the boundary S of a bounded region V), Eq. (22) admits solutions
only for a corresponding discrete set of values of k^2 (eigenvalue problem, Sec. 15.4-5; see Sec. 10.4-9b for an example).

(a) Rectangular Cartesian Coordinates x, y, z. Equation (22) has the particular solutions

\Phi_{k_1k_2k_3}(x, y, z) = e^{i(k_1x + k_2y + k_3z)} \qquad (k_1^2 + k_2^2 + k_3^2 = k^2)
\Phi_{0k_2k_3}(x, y, z) = (a + bx)e^{i(k_2y + k_3z)} \qquad (k_2^2 + k_3^2 = k^2)
\Phi_{00k_3}(x, y, z) = (a + bx)(\alpha + \beta y)e^{ik_3z} \qquad (k_3^2 = k^2)   (10.4-23)

which may be combined into various products of real linear, exponential, trigonometric, and hyperbolic functions.
(b) Cylindrical Coordinates r', \varphi, z (see also Sec. 10.4-3b). Separation of variables \Phi = u(\varphi)v(z)w(r') yields Eqs. (11) and (12) and

\frac{d^2w}{dr'^2} + \frac{1}{r'}\frac{dw}{dr'} - \left[(K^2 - k^2) + \frac{m^2}{r'^2}\right]w(r') = 0   (10.4-24)

where uniqueness requires m = 0, \pm 1, \pm 2, \ldots, and K is an arbitrary separation constant to be determined by the boundary conditions. Equation (22) admits solutions of the form

\Phi_{\pm Km}(r', \varphi, z) = e^{\pm iKz}Z_m\left(\sqrt{k^2 - K^2}\,r'\right)e^{im\varphi}   (10.4-25)
(c) Spherical Coordinates r, \vartheta, \varphi (see also Sec. 10.4-3c). Separation of variables \Phi = u(\varphi)p(\vartheta)w(r) yields Eqs. (16) and (17) and

\frac{d^2w}{dr^2} + \frac{2}{r}\frac{dw}{dr} + \left[k^2 - \frac{j(j + 1)}{r^2}\right]w(r) = 0   (10.4-26)

where regularity for \vartheta = 0, \vartheta = \pi and uniqueness require that m = 0, \pm 1, \pm 2, \ldots, \pm j, and j = 0, 1, 2, \ldots. Equation (22) admits particular solutions of the form

\Phi_{jm}(r, \vartheta, \varphi) = \frac{1}{\sqrt{r}}Z_{j+\frac{1}{2}}(kr)P_j^m(\cos\vartheta)e^{im\varphi}
\Phi_j(r, \vartheta, \varphi) = \frac{1}{\sqrt{r}}Z_{j+\frac{1}{2}}(kr)Y_j(\vartheta, \varphi)   (10.4-27)

where Y_j(\vartheta, \varphi) is a spherical surface harmonic (21). In particular, if a given problem requires \Phi to be analytic for r = 0, the Z_{j+\frac{1}{2}}(kr)/\sqrt{r} are spherical Bessel functions of the first kind (Sec. 21.8-8).

10.4-5. Particular Solutions for Two-dimensional Problems (see also Secs. 15.6-7 and 15.6-10b). (a) Laplace's equation
\nabla^2\Phi = \frac{\partial^2\Phi}{\partial x^2} + \frac{\partial^2\Phi}{\partial y^2} = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial\Phi}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2\Phi}{\partial\varphi^2} = 0   (10.4-28)

admits the particular solutions

\Phi_K(x, y) = e^{iK(x \pm iy)}, \qquad \Phi_0(x, y) = (a + bx)(\alpha + \beta y)   (10.4-29)

\Phi_m(r, \varphi) = (ar^m + br^{-m})(\alpha\cos m\varphi + \beta\sin m\varphi) \qquad (m = 0, 1, 2, \ldots)
\Phi_0(r, \varphi) = A + B\log_e r   (10.4-30)

where K, like a, b, \alpha, \beta, A, B, is an arbitrary parameter to be determined by the boundary conditions.
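A quick numerical check (a sketch, not handbook text; m and the coefficients are arbitrary) that the polar solutions (10.4-30) satisfy Eq. (10.4-28):

```python
import math

m, a_c, b_c = 3, 1.0, 0.5

def phi(r, ph):
    """Solution of type (10.4-30): (a r^m + b r^-m) cos(m phi)."""
    return (a_c * r**m + b_c * r**-m) * math.cos(m * ph)

def polar_laplacian(f, r, ph, h=1e-4):
    """Finite-difference Laplacian in polar form: f_rr + f_r / r + f_pp / r^2."""
    d2r = (f(r+h, ph) - 2*f(r, ph) + f(r-h, ph)) / (h*h)
    dr  = (f(r+h, ph) - f(r-h, ph)) / (2*h)
    d2p = (f(r, ph+h) - 2*f(r, ph) + f(r, ph-h)) / (h*h)
    return d2r + dr/r + d2p/(r*r)

res = abs(polar_laplacian(phi, 1.2, 0.7))   # should vanish to truncation error
```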
(b) Space Form of the Wave Equation. The two-dimensional space form (22) of the wave equation admits the particular solutions

\Phi_{k_1k_2}(x, y) = e^{i(k_1x + k_2y)} \qquad (k_1^2 + k_2^2 = k^2)
\Phi_{0k}(x, y) = (a + bx)e^{iky}   (10.4-31)

\Phi_{km}(r, \varphi) = Z_m(kr)(\alpha\cos m\varphi + \beta\sin m\varphi) \qquad (m = 0, 1, 2, \ldots)   (10.4-32)
(c) Complex-conjugate solutions (29) and (31) combine into various products of real linear, exponential, trigonometric, and/or hyperbolic functions.
10.4-6. The Static Schrödinger Equation for Hydrogenlike Wave Functions. The three-dimensional partial differential equation

\nabla^2\Phi + \left(\frac{A}{r} - \lambda^2\right)\Phi = 0   (10.4-33)

admits particular solutions of the form

\Phi_{\lambda j}(r, \vartheta, \varphi) = r^je^{-\lambda r}L_{n+j}^{2j+1}(2\lambda r)Y_j(\vartheta, \varphi) \qquad (j = 0, 1, 2, \ldots;\ n = j + 1, j + 2, \ldots;\ A = 2n\lambda)   (10.4-34)

where the L_{n+j}^{2j+1} are associated Laguerre functions (Sec. 21.7-5) and Y_j(\vartheta, \varphi) is a spherical surface harmonic (21).
10.4-7. Particular Solutions for the Diffusion Equation (see also Secs. 10.3-4b, 10.4-9, 10.5-3, 10.5-4, and 15.5-3). (a) The one-dimensional diffusion equation

\frac{\partial^2\Phi}{\partial x^2} - \frac{1}{\gamma^2}\frac{\partial\Phi}{\partial t} = 0   (10.4-35)

admits the particular solutions

\Phi_k(x, t) = e^{\pm ikx}e^{-k^2\gamma^2t}
\Phi(x, t) = \frac{1}{\sqrt{t}}e^{-x^2/4\gamma^2t} \qquad (t > 0)
\Phi_0(x, t) = a + bx   (10.4-36)

where k is an arbitrary separation constant to be determined by the boundary conditions.
(b) The two- or three-dimensional diffusion equation

\nabla^2\Phi - \frac{1}{\gamma^2}\frac{\partial\Phi}{\partial t} = 0   (10.4-37)

admits the particular solutions

\Phi(\mathbf r, t) = \psi_k(\mathbf r)e^{-k^2\gamma^2t}   (10.4-38)

where \psi_k(\mathbf r) is any particular solution of the corresponding Helmholtz equation (22) (Secs. 10.4-4 and 10.4-5b); k is an arbitrary separation constant to be determined by the boundary conditions. Equation (37) also admits the particular solution

\Phi(\mathbf r, t) = \begin{cases} \dfrac{1}{t}e^{-r^2/4\gamma^2t} & \text{(two-dimensional case)} \\ \dfrac{1}{t^{3/2}}e^{-r^2/4\gamma^2t} & \text{(three-dimensional case)} \end{cases}   (10.4-39)
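One can verify numerically that the three-dimensional source solution in (10.4-39) satisfies Eq. (10.4-37). The sketch below (plain Python, not handbook text; γ, r, t values are illustrative) uses the radial form of the Laplacian for a spherically symmetric field, \nabla^2\Phi = \partial^2\Phi/\partial r^2 + (2/r)\,\partial\Phi/\partial r:

```python
import math

gamma = 0.8

def phi(r, t):
    """Source solution t**(-3/2) * exp(-r^2 / (4 gamma^2 t)) of Eq. (10.4-37)."""
    return t**-1.5 * math.exp(-r*r / (4 * gamma**2 * t))

r0, t0, h = 0.9, 0.5, 1e-4
d2r = (phi(r0+h, t0) - 2*phi(r0, t0) + phi(r0-h, t0)) / (h*h)
dr  = (phi(r0+h, t0) - phi(r0-h, t0)) / (2*h)
dt  = (phi(r0, t0+h) - phi(r0, t0-h)) / (2*h)

lap = d2r + 2.0/r0 * dr                     # radial Laplacian
residual = abs(lap - dt / gamma**2)         # should vanish to truncation error
```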
10.4-8. Particular Solutions for the Wave Equation. Sinusoidal Waves (see also Secs. 4.11-4b, 10.3-5, 10.4-9, 10.5-2, and 15.6-10). (a) The one-dimensional wave equation

\frac{\partial^2\Phi}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2\Phi}{\partial t^2} = 0   (10.4-40)

admits particular solutions of the form

\Phi(x, t) = e^{\pm ikx}e^{\pm i\omega t} \qquad (\omega = kc)   (10.4-41)

where k is an arbitrary constant to be determined by the boundary conditions. The functions (41) and the corresponding values of k^2 are usually the eigenfunctions and eigenvalues of an eigenvalue problem (Secs. 10.4-2 and 15.4-5). Solutions of the form (41) combine into real solutions
\Phi(x, t) = C\cos(\omega t + \gamma_1)\cos(kx + \gamma_2) \qquad \text{(sinusoidal standing waves)}
\Phi(x, t) = a\cos(\omega t \mp kx) + b\sin(\omega t \mp kx) = A\cos(\omega t \mp kx + \gamma) \qquad \text{(sinusoidal waves traveling in the } \pm x \text{ direction)}
\qquad (\omega = kc)   (10.4-42)

The circular frequency \omega, the frequency \nu = \omega/2\pi, the wave number k, the wavelength \lambda = 2\pi/k, and the phase velocity c of the sinusoidal waves are related by

\lambda\nu = \frac{\omega}{k} = c   (10.4-43)
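The relations (10.4-42) and (10.4-43) can be checked directly. The sketch below (plain Python, not handbook text; k and c are arbitrary) confirms that two oppositely traveling waves superpose into a standing wave and that \lambda\nu = c:

```python
import math

k, c = 2.0, 3.0
omega = k * c                            # relation (10.4-43): omega = k c

# cos(wt - kx) + cos(wt + kx) = 2 cos(wt) cos(kx) at a grid of sample points
pts = [(0.1*i, 0.07*j) for i in range(7) for j in range(7)]
max_err = max(abs(math.cos(omega*t - k*x) + math.cos(omega*t + k*x)
                  - 2*math.cos(omega*t)*math.cos(k*x))
              for x, t in pts)

wavelength = 2*math.pi / k
frequency = omega / (2*math.pi)
speed = wavelength * frequency           # should reproduce the phase velocity c
```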
Sinusoidal waves (42) may be superimposed to form Fourier series or Fourier integrals for more general waves.

(b) The two- or three-dimensional wave equation
\nabla^2\Phi - \frac{1}{c^2}\frac{\partial^2\Phi}{\partial t^2} = 0   (10.4-44)

admits particular solutions of the form

\Phi(\mathbf r, t) = \psi_k(\mathbf r)e^{\pm i\omega t} \qquad (\omega = kc)   (10.4-45)

where \psi_k(\mathbf r) is any particular solution of the corresponding Helmholtz equation (22) (Secs. 10.4-4 and 10.4-5b); k is an arbitrary separation constant to be determined by the boundary conditions. Solutions of the form (45) combine into solutions involving real trigonometric functions; in particular, note the following examples:
\Phi(x, y, z, t) = A\cos[\omega t \mp (k_1x + k_2y + k_3z) + \gamma] \qquad (k_1^2 + k_2^2 + k_3^2 = k^2;\ \omega = kc) \qquad \text{(sinusoidal plane waves, wave fronts normal to the direction given by } k_1, k_2, k_3\text{)}   (10.4-46)

\Phi(r', \varphi, z, t) = Z_m\left(\sqrt{k^2 - K^2}\,r'\right)(\alpha\cos m\varphi + \beta\sin m\varphi)\cos(\omega t \mp Kz + \gamma) \qquad (\omega = kc) \qquad \text{(sinusoidal cylinder waves propagated in the } \pm z \text{ direction)}   (10.4-47)

\Phi(r, \vartheta, \varphi, t) = \frac{1}{\sqrt{r}}Z_{j+\frac{1}{2}}(kr)Y_j(\vartheta, \varphi)\cos(\omega t + \gamma) \qquad \text{(sinusoidal spherical standing waves)}   (10.4-48)

\Phi(r, \vartheta, \varphi, t) = \frac{A}{r}\cos(\omega t - kr + \gamma) \qquad \text{(sinusoidal spherical waves diverging from the origin)}   (10.4-49)

\Phi(r, \vartheta, \varphi, t) = \frac{A}{r}\cos(\omega t + kr + \gamma) \qquad \text{(sinusoidal spherical waves converging on the origin)}   (10.4-50)

\Phi(r, \varphi, t) = Z_m(kr)(\alpha\cos m\varphi + \beta\sin m\varphi)\cos(\omega t + \gamma) \qquad \text{(two-dimensional sinusoidal circular standing waves)}   (10.4-51)
The cylindrical waves (47) are propagated in the \pm z direction with phase velocity c' = \omega/K = kc/K, which is seen to depend on \omega and K (dispersion). One defines the group velocity in the z direction as d\omega/dK = Kc/k.
(c) The Generalized One-dimensional Damped-wave Equation (Telegrapher's Equation). The transmission-line equation

\frac{\partial^2\Phi}{\partial x^2} - a_0\frac{\partial^2\Phi}{\partial t^2} - a_1\frac{\partial\Phi}{\partial t} - a_2\Phi = 0   (10.4-52)

admits particular solutions of the form

\Phi(x, t) = e^{\pm ikx}e^{st}   (10.4-53)

where s = s(k) is a root of

a_0s^2 + a_1s + (a_2 + k^2) = 0   (10.4-54)

Complex-conjugate solutions (53) combine into damped sinusoidal waves in the manner of Sec. 9.4-1b; again, double roots of Eq. (54) are treated as in Sec. 9.4-1b. Equation (52) includes Eqs. (35) and (40) as special cases. An analogous generalization applies to the multidimensional case.
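For illustration (a sketch, not handbook text; the positive constants are arbitrary), the roots of Eq. (10.4-54) can be computed explicitly; when a_1^2 < 4a_0(a_2 + k^2) they form a complex-conjugate pair with negative real part -a_1/2a_0, i.e., damped sinusoidal waves:

```python
import cmath

a0, a1, a2, k = 1.0, 0.4, 0.25, 3.0     # illustrative positive constants

# roots of  a0 s^2 + a1 s + (a2 + k^2) = 0   -- Eq. (10.4-54)
disc = cmath.sqrt(a1*a1 - 4*a0*(a2 + k*k))
s_plus  = (-a1 + disc) / (2*a0)
s_minus = (-a1 - disc) / (2*a0)

check = abs(a0*s_plus**2 + a1*s_plus + (a2 + k*k))   # residual of the quadratic
damping = (s_plus.real, s_minus.real)                # both negative: damping
```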
10.4-9. Solution of Boundary-value Problems by Orthogonal-series Expansions: Examples (see also Sec. 10.5-3). The following examples illustrate the use of the particular solutions listed in Secs. 10.4-3 to 10.4-8; see also Table 8.7-1.

(a) Dirichlet Problem for a Sphere (see also Secs. 10.4-3c, 15.6-2a, and 15.6-6c). One desires to find the function \Phi(\mathbf r) which satisfies Laplace's equation \nabla^2\Phi = 0 throughout a given sphere r < R and assumes suitably given boundary values \Phi(R, \vartheta, \varphi) = b(\vartheta, \varphi). One superimposes solutions (19) which are analytic inside the sphere:

\Phi = \Phi(r, \vartheta, \varphi) = \sum_{j=0}^{\infty}\left(\frac{r}{R}\right)^j\sum_{m=0}^{j}(\alpha_{jm}\cos m\varphi + \beta_{jm}\sin m\varphi)P_j^m(\cos\vartheta)

The unknown coefficients \alpha_{jm}, \beta_{jm} are obtained from the given boundary conditions

\Phi(R, \vartheta, \varphi) = b(\vartheta, \varphi) = \sum_{j=0}^{\infty}\sum_{m=0}^{j}(\alpha_{jm}\cos m\varphi + \beta_{jm}\sin m\varphi)P_j^m(\cos\vartheta)

with the aid of the orthogonality conditions of Sec. 21.8-12:

\alpha_{j0} = \frac{2j+1}{4\pi}\int_0^{2\pi}d\varphi\int_0^{\pi}b(\vartheta, \varphi)P_j(\cos\vartheta)\sin\vartheta\,d\vartheta

\alpha_{jm} = \frac{2j+1}{2\pi}\frac{(j-m)!}{(j+m)!}\int_0^{2\pi}d\varphi\int_0^{\pi}b(\vartheta, \varphi)P_j^m(\cos\vartheta)\cos m\varphi\sin\vartheta\,d\vartheta

\beta_{jm} = \frac{2j+1}{2\pi}\frac{(j-m)!}{(j+m)!}\int_0^{2\pi}d\varphi\int_0^{\pi}b(\vartheta, \varphi)P_j^m(\cos\vartheta)\sin m\varphi\sin\vartheta\,d\vartheta

(j = 0, 1, 2, \ldots;\ m = 1, 2, \ldots, j)   (10.4-55)
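The coefficient formulas can be exercised on a boundary function that is itself a surface harmonic: with b(\vartheta, \varphi) = P_2(\cos\vartheta), the formula \alpha_{j0} = [(2j+1)/4\pi]\int_0^{2\pi}d\varphi\int_0^\pi b\,P_j(\cos\vartheta)\sin\vartheta\,d\vartheta must return \alpha_{20} = 1 and zero for every other degree. A sketch in plain Python (not handbook text; m = 0 terms only):

```python
import math

def P2(z):                      # Legendre polynomial of degree 2
    return 0.5 * (3*z*z - 1)

def alpha_j0(b, Pj, coef, n=4000):
    """m = 0 coefficient formula; b depends only on theta here, so the
    phi integration contributes a factor 2*pi. Simpson rule in theta."""
    h = math.pi / n
    s = 0.0
    for i in range(n + 1):
        th = i * h
        w = 1 if i in (0, n) else (4 if i % 2 else 2)
        s += w * b(th) * Pj(math.cos(th)) * math.sin(th)
    return coef * 2*math.pi * s * h / 3.0

# boundary values b(theta) = P2(cos theta): expansion must give alpha_20 = 1
a20 = alpha_j0(lambda th: P2(math.cos(th)), P2, 5/(4*math.pi))

def P4(z):                      # Legendre polynomial of degree 4
    return (35*z**4 - 30*z*z + 3) / 8.0

# ...and the coefficient of any other degree must vanish, e.g. j = 4
a40 = alpha_j0(lambda th: P2(math.cos(th)), P4, 9/(4*math.pi))
```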
(b) Free Vibrations of an Elastic String (see also Sec. 10.3-5). The lateral displacement \Phi(x, t) of an elastic string satisfies the wave equation

\frac{\partial^2\Phi}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2\Phi}{\partial t^2} = 0

In addition, let

\Phi(x, 0) = \Phi_0(x), \qquad \left.\frac{\partial\Phi}{\partial t}\right|_{t=0} = v_0(x) \qquad (0 < x < L) \qquad \text{(initial conditions)}

\Phi(0, t) = \Phi_a(t), \qquad \Phi(L, t) = \Phi_b(t) \qquad \text{(boundary conditions)}

Consider the special case \Phi_a(t) \equiv \Phi_b(t) \equiv 0. The standing-wave solutions (42) satisfying these boundary conditions are of the form

\sin\frac{n\pi}{L}x\,(a\cos\omega t + b\sin\omega t) \qquad \left(k = \frac{n\pi}{L},\ \omega = kc = \frac{n\pi c}{L};\ n = 1, 2, \ldots\right)

One next attempts to write the solution as an infinite sum of these particular solutions, i.e.,

\Phi(x, t) = \sum_{n=1}^{\infty}\left(a_n\sin\frac{n\pi}{L}x\cos\frac{n\pi c}{L}t + b_n\sin\frac{n\pi}{L}x\sin\frac{n\pi c}{L}t\right)

This is seen to be a Fourier series (Sec. 4.11-4b) whose coefficients a_n, b_n fit the given initial conditions if

a_n = \frac{2}{L}\int_0^L\Phi_0(x)\sin\frac{n\pi}{L}x\,dx \qquad (n = 1, 2, \ldots)

b_n = \frac{2}{n\pi c}\int_0^L v_0(x)\sin\frac{n\pi}{L}x\,dx \qquad (n = 1, 2, \ldots)

The solution is seen to be the sum of harmonic standing waves (modes of vibration) "excited" by the given initial conditions.
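The series solution above is easy to evaluate numerically. The sketch below (plain Python, not handbook text; the triangular "pluck" initial displacement is an illustrative choice, with L = c = 1 and v_0 = 0, so all b_n = 0) computes the coefficients a_n by Simpson's rule and checks that the series reproduces \Phi_0(x) at t = 0 and respects the boundary conditions:

```python
import math

L, c, N = 1.0, 1.0, 60

def phi0(x):                       # triangular "pluck" as initial displacement
    return min(x, L - x)

def a_n(n, m=2000):                # a_n = (2/L) * integral of phi0 sin(n pi x/L)
    h = L / m
    s = 0.0
    for i in range(m + 1):
        x = i * h
        w = 1 if i in (0, m) else (4 if i % 2 else 2)
        s += w * phi0(x) * math.sin(n * math.pi * x / L)
    return (2.0 / L) * s * h / 3.0

coef = [a_n(n) for n in range(1, N + 1)]

def phi(x, t):                     # partial sum of the series solution
    return sum(a * math.sin(n*math.pi*x/L) * math.cos(n*math.pi*c*t/L)
               for n, a in zip(range(1, N + 1), coef))

err0 = abs(phi(0.3, 0.0) - phi0(0.3))   # series reproduces the initial shape
ends = (phi(0.0, 0.4), phi(L, 0.4))     # boundary values stay zero for all t
```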
(c) Free Oscillations of a Circular Membrane. The transverse displacement \Phi of a membrane clamped along its boundary circle r = 1 satisfies the wave equation

\nabla^2\Phi - \frac{1}{c^2}\frac{\partial^2\Phi}{\partial t^2} = 0 \qquad (r < 1,\ t > 0)

with

\Phi = 0 \quad (r = 1,\ t > 0), \qquad \Phi = \Phi_0(r, \varphi), \quad \partial\Phi/\partial t = v_0(r, \varphi) \quad \text{for } t = 0 \quad (r < 1)

One uses polar coordinates r, \varphi. Since \Phi is to be regular for r = 0, one attempts to superimpose solutions (51) involving only Bessel functions of the first kind, i.e., Z_m(kr) = J_m(kr) (Sec. 21.8-1). Such solutions represent characteristic oscillations satisfying the given boundary condition \Phi = 0 (r = 1) if

k = \frac{\omega}{c} = k_{mn} \qquad (m = 0, 1, 2, \ldots;\ n = 1, 2, \ldots)

where k_{mn} is the nth real positive zero of J_m(\zeta); the problem is an eigenvalue problem. The solution appears as a sum of characteristic oscillations

\Phi(r, \varphi, t) = \sum_{m=0}^{\infty}\sum_{n=1}^{\infty}J_m(k_{mn}r)(\alpha_{mn}\cos m\varphi + \beta_{mn}\sin m\varphi)(a_{mn}\cos ck_{mn}t + b_{mn}\sin ck_{mn}t)
where the unknown coefficients are determined as ordinary Fourier coefficients upon substitution of the given initial conditions for t = 0.

10.5. INTEGRAL-TRANSFORM METHODS
10.5-1. General Theory (see also Secs. 8.1-1, 8.6-1, 9.3-7, 9.4-5, 10.4-2c, and 15.4-12). It is desired to find the solution \Phi = \Phi(x_1, x_2, \ldots, x_n), subject to suitable boundary conditions, of the real linear partial differential equation

(L_1 + L_2)\Phi \equiv a_0(x_n)\frac{\partial^2\Phi}{\partial x_n^2} + a_1(x_n)\frac{\partial\Phi}{\partial x_n} + a_2(x_n)\Phi + L_2\Phi = f(x_1, x_2, \ldots, x_n)   (10.5-1)

where L_2 is a linear homogeneous differential operator whose derivatives and coefficients involve x_1, x_2, \ldots, x_{n-1} but not x_n. For simplicity, assume x_n to range over a fixed bounded or unbounded interval (a, b) for all x_1, x_2, \ldots, x_{n-1}, so that the boundary S of the solution region consists of two x_n-coordinate surfaces (see also Sec. 10.4-1c). One can often simplify Eq. (1) by applying a linear integral transformation

\bar\Phi(x_1, x_2, \ldots, x_{n-1}; s) \equiv \int_a^b K(x_n, s)\Phi(x_1, x_2, \ldots, x_n)\,dx_n   (10.5-2)

If the transformation kernel K(x_n, s) is chosen so that

L_1^\dagger K(x_n, s) \equiv \frac{\partial^2}{\partial x_n^2}(a_0K) - \frac{\partial}{\partial x_n}(a_1K) + a_2K = \lambda(s)K   (10.5-3)

then partial integration (generalized Green's formula, Sec. 15.4-3) yields a possibly simpler differential equation

[L_2 + \lambda(s)]\bar\Phi = \bar f(x_1, x_2, \ldots, x_{n-1}; s) - \Big[P(x_1, x_2, \ldots, x_n; s)\Big]_{x_n=a}^{x_n=b}

P(x_1, x_2, \ldots, x_n; s) \equiv a_0\left(K\frac{\partial\Phi}{\partial x_n} - \Phi\frac{\partial K}{\partial x_n}\right) + (a_1 - a_0')\Phi K   (10.5-4)

to be satisfied by the unknown integral transform \bar\Phi(x_1, x_2, \ldots, x_{n-1}; s). Note that the new differential equation introduces the boundary values of \Phi and \partial\Phi/\partial x_n given for x_n = a, x_n = b and variable x_1, x_2, \ldots, x_{n-1}. The integral-transform method assumes the existence of the relevant integrals (2) and a convergent inversion formula

\Phi(x_1, x_2, \ldots, x_n) = \int H(x_n, s)\bar\Phi(x_1, x_2, \ldots, x_{n-1}; s)\,ds   (10.5-5)

which usually involves complex integration (see also Sec. 15.3-7).
Table 8.6-1 lists a number of useful integral transformations and inversion formulas.
Note that constants of integration in solutions of the transformed differential equation (4) must be regarded as arbitrary functions of s; similarly, arbitrary functions of x_1, x_2, \ldots, x_{n-1} must also be arbitrary functions of s. The integral-transform method may be generalized to apply to differential equations involving higher-order derivatives of x_n; again, the integral transformation may operate on two variables x_n, x_{n-1} simultaneously (Ref. 8.6). If \delta-function terms in H(x_n, s) are admitted, Eq. (5) can yield series expansions for \Phi over finite intervals (a, b) (finite integral transforms, Sec. 8.7-1 and Ref. 10.21).
10.5-2. Laplace Transformation of the Time Variable (see also Secs. 8.2-1, 8.3-1, 9.3-7, and 9.4-5). Initial-value problems involving hyperbolic or parabolic differential equations are often simplified by the unilateral Laplace transformation

\bar\Phi(\mathbf r, s) = \int_0^\infty \Phi(\mathbf r, t)e^{-st}\,dt

which takes the differential equation

\nabla^2\Phi - a_0\frac{\partial^2\Phi}{\partial t^2} - a_1\frac{\partial\Phi}{\partial t} - a_2\Phi = f(\mathbf r, t)   (10.5-6)

into a new and possibly simpler differential equation

\nabla^2\bar\Phi - (a_0s^2 + a_1s + a_2)\bar\Phi = \bar f(\mathbf r, s) - \left[(a_0s + a_1)\Phi + a_0\frac{\partial\Phi}{\partial t}\right]_{t=0+0}   (10.5-7)

for the unknown integral transform \bar\Phi(\mathbf r, s) of \Phi(\mathbf r, t). The boundary conditions are similarly transformed; note that Eq. (7) includes the effect of given initial conditions.

10.5-3. Solution of Boundary-value Problems by Integral-transform Methods: Examples (see also Sec. 10.4-9). The following examples illustrate the simplest applications of integral transformations to the solution of boundary-value problems (see also Refs. 10.3, 10.18, and 10.19).
(a) One-dimensional Heat Conduction in a Wall with Fixed Boundary Temperatures: Use of Laplace Transforms. Duhamel's theorem (Sec. 10.5-4) reduces an important class of one-dimensional heat-conduction problems to the form

\frac{\partial^2\Phi(x, t)}{\partial x^2} - \frac{1}{\gamma^2}\frac{\partial\Phi(x, t)}{\partial t} = 0 \qquad (a < x < b,\ t > 0)

\Phi(x, 0 + 0) = \Phi_0(x) \qquad \text{(initial condition)}

\Phi(a, t) = \Phi_a = \text{constant}, \qquad \Phi(b, t) = \Phi_b = \text{constant} \qquad (t > 0) \qquad \text{(boundary conditions)}

The differential equation transforms into

\frac{d^2\bar\Phi(x, s)}{dx^2} - \frac{s}{\gamma^2}\bar\Phi(x, s) = -\frac{1}{\gamma^2}\Phi_0(x) \qquad (a < x < b)
If, in particular, \Phi_0(x) = \Phi_0 = constant, one has

\bar\Phi(x, s) = C_1(s)e^{x\sqrt{s}/\gamma} + C_2(s)e^{-x\sqrt{s}/\gamma} + \frac{\Phi_0}{s}

where the functions C_1(s) and C_2(s) must be chosen so as to match the transformed boundary conditions

\bar\Phi(a, s) = \frac{\Phi_a}{s}, \qquad \bar\Phi(b, s) = \frac{\Phi_b}{s}

\Phi(x, t) is obtained as an inverse Laplace transform by one of the methods of Secs. 8.4-2 to 8.4-9. In particular, for a = 0, \Phi_a = 0, \Phi_0 = 0, one has

\bar\Phi(x, s) = \Phi_b\,\frac{\sinh(x\sqrt{s}/\gamma)}{s\sinh(b\sqrt{s}/\gamma)}
Note that this problem can also be solved by the method of Sec. 10.4-9.

(b) Heat Conduction into a Wall of Infinite Thickness: Use of Fourier Sine and Cosine Transforms. The method of Sec. 10.5-3a still applies if a = 0, b = \infty, and one is given

\frac{\partial^2\Phi}{\partial x^2} - \frac{1}{\gamma^2}\frac{\partial\Phi}{\partial t} = 0 \qquad (x > 0;\ t > 0)

\Phi(x, 0 + 0) = \Phi_0(x) \qquad (x > 0) \qquad \text{(initial condition)}

\Phi(0, t) = \Phi_a(t); \qquad \int_0^\infty|\Phi(x, t)|^2\,dx \text{ exists} \qquad (t > 0) \qquad \text{(boundary conditions)}

One may, instead, apply the Fourier sine transformation

\bar\Phi(s, t) = \sqrt{\frac{2}{\pi}}\int_0^\infty\Phi(x, t)\sin sx\,dx

(Sec. 4.11-3) with the aid of Sec. 4.11-5c to obtain the transformed differential equation

\frac{1}{\gamma^2}\frac{\partial\bar\Phi}{\partial t} + s^2\bar\Phi(s, t) = \sqrt{\frac{2}{\pi}}\,s\Phi_a(t)

and the transformed initial condition \bar\Phi(s, 0 + 0) = \bar\Phi_0(s). The transformed problem has the solution

\bar\Phi(s, t) = \bar\Phi_0(s)e^{-\gamma^2s^2t} + \sqrt{\frac{2}{\pi}}\,\gamma^2s\int_0^t\Phi_a(\tau)e^{-\gamma^2s^2(t-\tau)}\,d\tau

and \Phi(x, t) = \sqrt{2/\pi}\int_0^\infty\bar\Phi(s, t)\sin sx\,ds. If, in particular, one is given \Phi_a(t) = constant = \Phi_a and \Phi_0(x) = 0, one has \bar\Phi_0(s) = 0, and

\bar\Phi(s, t) = \Phi_a\sqrt{\frac{2}{\pi}}\,\frac{1 - e^{-\gamma^2s^2t}}{s}

\Phi(x, t) = \Phi_a\left(1 - \operatorname{erf}\frac{x}{2\gamma\sqrt{t}}\right)
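The closed-form result \Phi(x, t) = \Phi_a[1 - \operatorname{erf}(x/2\gamma\sqrt{t})] can be compared with a direct finite-difference solution of the diffusion equation (a sketch, not handbook text; the grid, time step, and truncation of the wall at x = 4 are illustrative choices):

```python
import math

gamma, Phi_a, T = 1.0, 1.0, 0.05
nx, dx = 400, 0.01                 # grid on 0 <= x <= 4 (wall effectively infinite)
steps = 2000
dt = T / steps                     # gamma^2 dt / dx^2 = 0.25: stable explicit step

u = [0.0] * (nx + 1)               # initial condition Phi_0(x) = 0
u[0] = Phi_a                       # fixed boundary temperature at x = 0

for _ in range(steps):
    v = u[:]
    for i in range(1, nx):
        v[i] = u[i] + gamma**2 * dt / dx**2 * (u[i+1] - 2*u[i] + u[i-1])
    u = v

x = 0.2
exact = Phi_a * (1.0 - math.erf(x / (2 * gamma * math.sqrt(T))))
num = u[round(x / dx)]
err = abs(num - exact)
```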
If the given boundary condition specifies, instead,

\left.\frac{\partial\Phi}{\partial x}\right|_{x=0+0} = 0

(zero heat flux through the boundary), one uses the Fourier cosine transformation (Sec. 4.11-3).
10.5-4. Duhamel's Formulas (see also Sec. 9.4-3). (a) Let L be a homogeneous linear operator whose coefficients and derivatives do not involve the time variable t. Let \Phi(x, t) be the solution of the initial-value problem

L\Phi + A_0(x)\frac{\partial^2\Phi}{\partial t^2} + A_1(x)\frac{\partial\Phi}{\partial t} = 0 \qquad (0 < x < L,\ t > 0)

\Phi(x, 0 + 0) = 0, \qquad \left.\frac{\partial\Phi}{\partial t}\right|_{t=0+0} = 0

\alpha\frac{\partial\Phi}{\partial x} + \beta\Phi = b(t) \qquad (x = 0,\ t > 0)

with as many homogeneous linear boundary conditions for x = 0 and/or x = L as needed, and let \Psi(x, t) be the solution of the same problem for b(t) \equiv 1 (t > 0). Then

\Phi(x, t) = \Psi(x, t)b(0 + 0) + \int_0^t\Psi(x, t - \tau)b'(\tau)\,d\tau = \frac{\partial}{\partial t}\int_0^t\Psi(x, t - \tau)b(\tau)\,d\tau \qquad (0 < x < L,\ t > 0)
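The equivalence of the two forms of Duhamel's formula can be checked for scalar functions of t alone (a sketch, not handbook text; the step response \Psi and forcing b below are arbitrary illustrative choices with \Psi(0 + 0) = 0):

```python
import math

Psi = lambda t: 1.0 - math.exp(-t)     # illustrative step response, Psi(0+0) = 0
b   = lambda t: math.cos(t)            # boundary forcing b(t)
db  = lambda t: -math.sin(t)           # its derivative b'(t)

def simpson(f, a, t, n=400):
    h = (t - a) / n
    return h/3.0 * sum((1 if i in (0, n) else 4 if i % 2 else 2) * f(a + i*h)
                       for i in range(n + 1))

def conv(t):                           # integral of Psi(t - tau) b(tau) over (0, t)
    return simpson(lambda tau: Psi(t - tau) * b(tau), 0.0, t)

t0, h = 1.3, 1e-4
lhs = (conv(t0 + h) - conv(t0 - h)) / (2*h)          # d/dt of the convolution
rhs = Psi(t0)*b(0.0) + simpson(lambda tau: Psi(t0 - tau)*db(tau), 0.0, t0)
gap = abs(lhs - rhs)                   # the two forms agree
```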
(b) The solution \Phi(\mathbf r, t) of the generalized diffusion problem

\nabla\cdot[k(\mathbf r)\nabla]\Phi + a(\mathbf r)\Phi - \frac{1}{\gamma^2}\frac{\partial\Phi}{\partial t} = f(\mathbf r, t) \qquad (\mathbf r \text{ in } V,\ t > 0)

\alpha\frac{\partial\Phi}{\partial n} + \beta\Phi = b(\mathbf r, t) \qquad (\mathbf r \text{ in } S,\ t > 0)

\Phi(\mathbf r, 0) = \Phi_0(\mathbf r) \qquad (\mathbf r \text{ in } V)   (10.5-10)

with time-dependent forcing functions f(\mathbf r, t) and b(\mathbf r, t) is related to the solution \Psi(\mathbf r, t; \lambda) of the simpler problem

\nabla\cdot[k(\mathbf r)\nabla]\Psi + a(\mathbf r)\Psi - \frac{1}{\gamma^2}\frac{\partial\Psi}{\partial t} = f(\mathbf r, \lambda) \qquad (\mathbf r \text{ in } V,\ t > 0)

\alpha\frac{\partial\Psi}{\partial n} + \beta\Psi = b(\mathbf r, \lambda) \qquad (\mathbf r \text{ in } S,\ t > 0)

\Psi(\mathbf r, 0; \lambda) = \Phi_0(\mathbf r) \qquad (\mathbf r \text{ in } V)   (10.5-11)

where f(\mathbf r, \lambda) and b(\mathbf r, \lambda) depend on a fixed parameter \lambda rather than on the variable t. One obtains \Phi(\mathbf r, t) from

\Phi(\mathbf r, t) = \frac{\partial}{\partial t}\int_0^t\Psi(\mathbf r, t - \lambda; \lambda)\,d\lambda   (10.5-12)
10.6. RELATED TOPICS, REFERENCES, AND BIBLIOGRAPHY

10.6-1. Related Topics. The following topics related to the study of partial differential equations are treated in other chapters of this handbook:

Fourier series and Fourier integrals — Chap. 4
Curvilinear coordinates — Chap. 6
The Laplace transformation and other integral transformations — Chap. 8
Calculus of variations — Chap. 11
Boundary-value problems and eigenvalue problems — Chap. 15
Numerical solutions — Chap. 20
Special transcendental functions — Chap. 21
10.6-2. References and Bibliography (see also Sec. 15.7-2).

10.1. Bers, L., F. John, and M. Schechter: Partial Differential Equations, Wiley, New York, 1963.
10.2. Bieberbach, L.: Differentialgleichungen, 2d ed., Springer, Berlin, 1965.
10.3. Churchill, R. V.: Fourier Series and Boundary-value Problems, 2d ed., McGraw-Hill, New York, 1963.
10.4. ———: Operational Mathematics, 2d ed., McGraw-Hill, New York, 1958.
10.5. Courant, R., and D. Hilbert: Methoden der mathematischen Physik, 3 vols., Wiley, New York, 1953/67.
10.6. Duff, G. F., and D. Naylor: Differential Equations of Applied Mathematics, Wiley, New York, 1965.
10.7. Epstein, B.: Partial Differential Equations, McGraw-Hill, New York, 1962.
10.8. Feshbach, H., and P. Morse: Methods of Theoretical Physics, McGraw-Hill, New York, 1953.
10.9. Frank, P., and R. Von Mises: Die Differentialgleichungen der Mechanik und Physik, 2d ed., M. S. Rosenberg, New York, 1943.
10.10. Friedman, B.: Principles and Techniques of Applied Mathematics, Wiley, New York, 1956.
10.11. Garabedian, P. R.: Partial Differential Equations, Wiley, New York, 1964.
10.12. Hopf, L.: Introduction to the Partial Differential Equations of Physics, Dover, New York, 1948.
10.13. Kamke, E.: Differentialgleichungen, Lösungsmethoden und Lösungen, vol. II, Chelsea, New York, 1948.
10.14. Lebedev, N. N., et al.: Problems of Mathematical Physics, Prentice-Hall, Englewood Cliffs, N.J., 1965.
10.15. Petrovsky, I. G.: Lectures on Partial Differential Equations, Interscience, New York, 1955.
10.16. Sagan, H.: Boundary and Eigenvalue Problems in Mathematical Physics, Wiley, New York, 1961.
10.17. Shapiro, A. H.: The Dynamics and Thermodynamics of Compressible Fluid Flow, vol. I, Ronald, New York, 1953.
10.18. Sneddon, I. N.: Elements of Partial Differential Equations, McGraw-Hill, New York, 1957.
10.19. ———: Fourier Transforms, McGraw-Hill, New York, 1951.
10.20. Sommerfeld, A.: Partial Differential Equations in Physics, Academic Press, New York, 1949.
10.21. Tranter, C. J.: Integral Transforms in Mathematical Physics, 2d ed., Wiley, New York, 1956.
10.22. Tychonov, A. N., and A. A. Samarski: Partial Differential Equations in Mathematical Physics, Holden-Day, San Francisco, 1964.
10.23. Margenau, H., and G. M. Murphy: The Mathematics of Physics and Chemistry, 2d ed., Van Nostrand, Princeton, N.J., 1952.
CHAPTER 11

MAXIMA AND MINIMA AND OPTIMIZATION PROBLEMS

11.1. Introduction
11.1-1. Introductory Remarks

11.2. Maxima and Minima of Functions of One Real Variable
11.2-1. Relative Maxima and Minima
11.2-2. Conditions for the Existence of Interior Maxima and Minima

11.3. Maxima and Minima of Functions of Two or More Real Variables
11.3-1. Relative Maxima and Minima
11.3-2. Expansion of \Delta f
11.3-3. Conditions for the Existence of Interior Maxima and Minima
11.3-4. Extreme-value Problems with Constraints or Accessory Conditions. The Method of Lagrange Multipliers
11.3-5. Numerical Methods

11.4. Linear Programming, Games, and Related Topics
11.4-1. Linear-programming Problems (a) Problem Statement (b) The Linear-programming Problem in Standard Form. Feasible Solutions (e) Duality
11.4-2. The Simplex Method
11.4-3. Nonlinear Programming. The Kuhn-Tucker Theorem
11.4-4. Introduction to Finite Zero-sum Two-person Games (a) Games with Pure Strategies (b) Games with Mixed Strategies (c) Relation to Linear Programming

11.5. Calculus of Variations. Maxima and Minima of Definite Integrals
11.5-1. Variations
11.5-2. Maxima and Minima of Definite Integrals
11.5-3. Solution of Variation Problems

11.6. Extremals as Solutions of Differential Equations: Classical Theory
11.6-1. Necessary Conditions for the Existence of Maxima and Minima
11.6-2. Variation Problems with Constraints or Accessory Conditions. The Method of Lagrange Multipliers
11.6-3. Isoperimetric Problems
11.6-4. Solution of Variation Problems Involving Higher Derivatives in the Integrand
11.6-5. Variation Problems with Unknown Boundary Values and/or Unknown Integration Limits (a) Given Integration Limits, Unknown Boundary Values (b) Given Boundary Values, Unknown Integration Limit (c) General Transversality Conditions
11.6-6. The Problems of Bolza and Mayer
11.6-7. Extremals with Corners. Refraction, Reflection, and Boundary Extrema
11.6-8. Canonical Equations and Hamilton-Jacobi Equation
11.6-9. Variation Problems Involving Two or More Independent Variables

(b) Optimum-control Theory and Calculus of Variations (c) Initial-state and Terminal-state Manifolds (d) Continuity, Differentiability, and Independence Assumptions (e) Generalizations
11.8-2. Pontryagin's Maximum Principle (a) Adjoint Variables and Optimal Hamiltonian (b) The Boundary-value Problem
11.8-3. Examples (a) Zermelo's Navigation Problem (b) Simple Bang-bang Time-optimal Control (c) A Simple Minimal-time Orbit-transfer and Rendezvous Problem
11.8-4. Matrix Notation for Control
(c) A Simple Minimal-time Orbittransfer and Rendezvous Prob lem 11.8-4. Matrix Notation for Control
Problems 11.8-5.
Variables: Maxima and Min
ima of Multiple Integrals 1.6-10. Simple Sufficient Conditions for Maxima and Minima
11.1-1
Inequality Constraints on State Variables.
Corner Con
ditions 11.8-6.
The Dynamic-programming Approach
11.7. Solution of Variation Problems
by Direct Methods 11.7-1. Direct Methods
11.7-2. The Rayleigh-Ritz Method 11.7-3. Approximation of y(x) by Polygonal Functions
11.8. Optimal-control Problems and the Maximum Principle 11.8-1. Problem Statement
(a) State Equations, Controls, and Criterion
11.9. Stepwise-control Problems and
Dynamic Programming 11.9-1. Problem Statement
11.9-2. Bellman's Principle of Opti mally
11.10 Related Topics, References, and Bibliography 11.10-1. Related Topics 11.10-2. References and Bibliography
11.1. INTRODUCTION
11.1-1. A large class of problems can be stated as extreme-value problems: one desires to find parameter values or functions which maximize or minimize a quantity dependent upon them. In many engineering problems it is, for instance, desirable to maximize a measure of performance
or to minimize cost. Again, one can at least approximate the solutions of many problems by choosing unknown parameter values or functions so as to minimize errors in trial solutions; restatement of a problem as an extreme-value problem may then lead to powerful numerical approximation methods.
EXAMPLES: Solution of eigenvalue problems in vibration theory and quantum mechanics (Secs. 15.4-7 and 15.4-8b); Hamilton's and Jacobi's principles in dynamics.

11.2. MAXIMA AND MINIMA OF FUNCTIONS OF ONE REAL VARIABLE
11.2-1. Relative Maxima and Minima (see also Sec. 4.3-3). A real function f(x) defined for x = a has a (relative) maximum or a (relative) minimum f(a) for x = a if and only if there exists a positive real number δ such that, respectively,

    Δf ≡ f(a + Δx) − f(a) < 0    or    Δf ≡ f(a + Δx) − f(a) > 0

for all Δx = x − a such that f(a + Δx) exists and 0 < |Δx| < δ.* The relative maximum (minimum) is an interior maximum (interior minimum) or a boundary maximum (boundary minimum) if x = a is, respectively, an interior point or a boundary point of the domain of definition assigned to f(x) (Sec. 4.3-6a).†

11.2-2. Conditions for the Existence of Interior Maxima and Minima. (a) If f′(x) exists for x = a, then f(a) can be a (necessarily interior) maximum or minimum only if f(x) has a stationary value for x = a, i.e.,

    f′(a) = 0    (11.2-1)

(b) If f(x) has a second derivative f″(x) for x = a, then f(a) is

    a maximum if f′(a) = 0 and f″(a) < 0
    a minimum if f′(a) = 0 and f″(a) > 0

(c) More generally, if f(x) has n continuous derivatives f′(x), f″(x), . . . , f⁽ⁿ⁾(x) for x = a, and f′(a) = f″(a) = · · · = f⁽ⁿ⁻¹⁾(a) = 0, then f(a) is

    a maximum if n is even and f⁽ⁿ⁾(a) < 0
    a minimum if n is even and f⁽ⁿ⁾(a) > 0

* Δf is defined as the change of a given function f(x) resulting from a change Δx in the independent variable x; Δf is a function of a and Δx. Δf must not be confused with the variation δf introduced in Sec. 11.5-1.
† The problem statement must specify the domain of definition of f(x). Note that f₁(x) = x (−∞ < x < ∞) has no maximum, but f₂(x) = x (x ≤ 1) has a boundary maximum for x = 1.
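The tests of Sec. 11.2-2 can be sketched in code (our own illustration; the function name and tolerance are arbitrary): given successive derivative values f′(a), f″(a), . . . , locate the first nonvanishing one and classify by its order and sign.

```python
def classify_stationary_value(derivs, tol=1e-12):
    """Classify f at x = a from successive derivative values
    [f'(a), f''(a), ...] by locating the first nonvanishing derivative
    (Sec. 11.2-2): even order gives a maximum or minimum according to
    its sign; odd order (beyond the first) gives a point of inflection."""
    for n, d in enumerate(derivs, start=1):
        if abs(d) > tol:
            if n == 1:
                return "no stationary value"      # f'(a) != 0
            if n % 2 == 0:
                return "maximum" if d < 0 else "minimum"
            return "point of inflection"
    return "undetermined"

print(classify_stationary_value([0, 0, 0, 24]))  # x**4 at x = 0: minimum
print(classify_stationary_value([0, 0, 6]))      # x**3 at x = 0: point of inflection
print(classify_stationary_value([0, -2]))        # -x**2 at x = 0: maximum
```

This reproduces the examples of the text: x², x⁴, x⁶, . . . yield minima at x = 0, while x³, x⁵, x⁷, . . . yield points of inflection.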
If n is odd and f⁽ⁿ⁾(a) ≠ 0, then f(x) has neither a maximum nor a minimum for x = a, but a point of inflection (Sec. 17.1-5).
EXAMPLES: Each of the functions x², x⁴, x⁶, . . . has a minimum for x = 0. Each of the functions x³, x⁵, x⁷, . . . has a point of inflection for x = 0.

11.3. MAXIMA AND MINIMA OF FUNCTIONS OF TWO OR MORE REAL VARIABLES
11.3-1. Relative Maxima and Minima.* A real function f(x1, x2, . . . , xn) defined for x1 = a1, x2 = a2, . . . , xn = an has a (relative) maximum or a (relative) minimum f(a1, a2, . . . , an) for x1 = a1, x2 = a2, . . . , xn = an if and only if there exists a positive real number δ such that

    Δf ≡ f(a1 + Δx1, a2 + Δx2, . . . , an + Δxn) − f(a1, a2, . . . , an)    (11.3-1)

is, respectively, less than zero or greater than zero for all Δx1, Δx2, . . . , Δxn such that f(a1 + Δx1, a2 + Δx2, . . . , an + Δxn) exists and 0 < Δx1² + Δx2² + · · · + Δxn² < δ. The relative maximum (minimum) is an interior maximum (interior minimum) or a boundary maximum (boundary minimum) if the point (a1, a2, . . . , an) is, respectively, an interior point or a boundary point of the region of definition assigned to f(x1, x2, . . . , xn) (Sec. 4.3-6a).
11.3-2. Expansion of Δf. The quantity Δf defined by Eq. (1) is a function of a1, a2, . . . , an and Δx1, Δx2, . . . , Δxn. If f(x1, x2, . . . , xn) is suitably differentiable,

    Δf = Σi (∂f/∂xi) Δxi + ½ Σi Σk (∂²f/∂xi ∂xk) Δxi Δxk + · · ·    (11.3-2)

(sums over i, k = 1, 2, . . . , n; derivatives evaluated for x1 = a1, x2 = a2, . . . , xn = an; see Sec. 4.10-5). The terms of degree 1, 2, . . . in the Δxi in this expansion respectively constitute the first-order change (principal part of Δf) Δ¹f, the second-order change Δ²f, . . . of the function f(x1, x2, . . . , xn) for x1 = a1, x2 = a2, . . . , xn = an (see also Sec. 4.5-3b).
for the Existence of Interior Maxima and
Minima, (a) If f(xh x2, . . . , xn) is differentiablefor X\ = ah x2 = a2, . . . , xn = an, f(a\, a2, . . . , an) can be a (necessarily interior) maximum or minimum only if the first-order change A1/ vanishes, i.e., only if
*l dxi
=
0
*l dx2
= 0
* See footnotes to Sec. 11.2-1.
V dXn axn
=
0
(11.3-3)
for x1 = a1, x2 = a2, . . . , xn = an. f(x1, x2, . . . , xn) is then said to have a stationary value for x1 = a1, x2 = a2, . . . , xn = an.
(b) If f(x1, x2, . . . , xn) is twice differentiable and satisfies the necessary condition (3) for x1 = a1, x2 = a2, . . . , xn = an, then f(a1, a2, . . . , an) is a maximum if the (real symmetric) quadratic form

    Δ²f = ½ Σi Σk (∂²f/∂xi ∂xk) Δxi Δxk    (11.3-4)

(sums over i, k = 1, 2, . . . , n; derivatives evaluated for x1 = a1, x2 = a2, . . . , xn = an)
is negative definite (Sec. 13.5-2); and f(a1, a2, . . . , an) is a minimum if the quadratic form (4) is positive definite.
EXAMPLE: Find the maxima and minima of the function

    z = 3x³ − x + y³ − 3y² − 1

Here the necessary conditions

    ∂z/∂x = 9x² − 1 = 0    and    ∂z/∂y = 3y² − 6y = 0

are satisfied for x = ⅓, y = 0; x = −⅓, y = 0; x = ⅓, y = 2; x = −⅓, y = 2.
But ∂²z/∂x² = 18x, ∂²z/∂y² = 6y − 6, ∂²z/∂x ∂y = 0, and inspection of the characteristic equation

    (18x − μ)(6y − 6 − μ) = 0

shows that the only extreme values are

    a maximum (μ₁ = −6, μ₂ = −6), z = −7/9,    for x = −⅓ and y = 0
    a minimum (μ₁ = 6, μ₂ = 6), z = −47/9,    for x = ⅓ and y = 2
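The example can be verified in a few lines (our own sketch; since the mixed derivative ∂²z/∂x∂y vanishes, the Hessian is diagonal and its eigenvalues μ₁, μ₂ are just the diagonal entries 18x and 6y − 6):

```python
def gradient(x, y):
    # partial derivatives of z = 3x^3 - x + y^3 - 3y^2 - 1
    return (9*x**2 - 1, 3*y**2 - 6*y)

def classify(x, y):
    # Hessian is diag(18x, 6y - 6); its eigenvalues are the diagonal entries
    m1, m2 = 18*x, 6*y - 6
    if m1 < 0 and m2 < 0:
        return "maximum"
    if m1 > 0 and m2 > 0:
        return "minimum"
    return "indefinite (no extremum)"

for (x, y) in [(1/3, 0), (-1/3, 0), (1/3, 2), (-1/3, 2)]:
    z = 3*x**3 - x + y**3 - 3*y**2 - 1
    print((round(x, 3), y), classify(x, y), round(z, 4))
```

Only (−⅓, 0) and (⅓, 2) are extrema; the other two stationary points have an indefinite Hessian, in accordance with Sec. 11.3-3c.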
(c) If the first-order change Δ¹f of a suitably differentiable function f(x1, x2, . . . , xn) vanishes and Δ²f is semidefinite (Sec. 13.5-2), the nature of the stationary value depends on higher-order derivatives. f(x1, x2, . . . , xn) cannot have a maximum or minimum where Δ²f exists and is indefinite.
Given a set of values x1 = a1, x2 = a2, . . . , xn = an obtained from Eq. (3), one may investigate the nature of the quadratic form (4) by any of the methods listed in Sec. 13.5-6; or one may test for a maximum or minimum by actual computation of f(a1 + Δx1, a2 + Δx2, . . . , an + Δxn) for judiciously chosen combinations of increments Δx1, Δx2, . . . , Δxn.
11.3-4. Extreme-value Problems with Constraints or Accessory Conditions. The Method of Lagrange Multipliers. Maxima and minima of a real function f(x1, x2, . . . , xn) of n variables x1, x2, . . . , xn subject to suitably differentiable constraints or accessory conditions in the form of m < n equations

    φ1(x1, x2, . . . , xn) = 0        φ2(x1, x2, . . . , xn) = 0        · · ·        φm(x1, x2, . . . , xn) = 0    (11.3-5)

may, in principle, be found in the manner of Sec. 11.3-3 after m of the n variables x1, x2, . . . , xn have been eliminated with the aid of the relations (5). If it is impossible or impractical to eliminate m variables directly, one applies the following necessary condition for a maximum or minimum of f(x1, x2, . . . , xn) subject to the constraints (5):

    ∂Φ/∂x1 = 0        ∂Φ/∂x2 = 0        · · ·        ∂Φ/∂xn = 0    (11.3-6)

with

    Φ(x1, x2, . . . , xn) ≡ f(x1, x2, . . . , xn) + λ1φ1(x1, x2, . . . , xn) + λ2φ2(x1, x2, . . . , xn) + · · · + λmφm(x1, x2, . . . , xn)

The m parameters λj are called Lagrange multipliers. The n + m unknowns xi = ai and λj are obtained from the n + m equations (5) and (6). Note that Eq. (6) is a necessary condition for a stationary value of the function Φ(x1, x2, . . . , xn) if x1, x2, . . . , xn are independent variables.
EXAMPLE: Find the sides of the rectangle of maximum area inscribed in the circle

    x² + y² = r²
The rectangle area A may be expressed in the form A = 4xy, where x and y denote the half-sides. Then

    Φ(x, y) = 4xy + λ(x² + y² − r²)

The necessary condition for a maximum or minimum yields

    ∂Φ/∂x = 4y + 2λx = 0    and    ∂Φ/∂y = 4x + 2λy = 0

so that λ = −2, and x = y = r/√2 yields the desired maximum.
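A quick numerical check of this example (our own sketch, with r = 1): the two stationarity equations and the constraint are satisfied simultaneously by λ = −2, x = y = r/√2, giving the maximum area A = 4xy = 2r².

```python
import math

def residuals(x, y, lam, r):
    # gradient of Phi(x, y) = 4xy + lam*(x^2 + y^2 - r^2), plus the constraint
    return (4*y + 2*lam*x, 4*x + 2*lam*y, x*x + y*y - r*r)

r = 1.0
x = y = r / math.sqrt(2.0)
print(residuals(x, y, -2.0, r))   # all three residuals vanish (up to roundoff)
print(4*x*y)                      # maximum area A = 2*r**2
```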
11.3-5. Numerical Methods. If, as is often the case, the function to be maximized or minimized depends on many variables, or if explicit differentiation is difficult or impossible, maxima and minima must be found by systematic numerical trial-and-error schemes. The most important numerical methods for problems with and without constraints are outlined in Secs. 20.2-6 and 20.3-2.
11.4. LINEAR PROGRAMMING, GAMES, AND RELATED TOPICS
11.4-1. Linear-programming Problems. (a) Problem Statement. A linear-programming problem requires one to determine the values of r variables X1, X2, . . . , Xr which minimize a given linear objective function (criterion function)

    z = F(X1, X2, . . . , Xr) ≡ C1X1 + C2X2 + · · · + CrXr    (11.4-1a)

(or maximize −z) subject to n > r linear inequality constraints

    xi ≡ Ai1X1 + Ai2X2 + · · · + AirXr − Bi ≥ 0    (i = 1, 2, . . . , n)    (11.4-1b)
Figure 11.4-1a illustrates a simple example. In a typical application, the problem is to buy necessarily positive quantities X1, X2, . . . , Xr of r types of raw materials ("inputs") so as to minimize a given total cost (1a) while keeping the respective quantities

    qi = Ai1X1 + Ai2X2 + · · · + AirXr    (i = 1, 2, . . . , m)

of m "output" products at or above m specified levels B1, B2, . . . , Bm; in view of the r conditions Xk ≥ 0 (k = 1, 2, . . . , r), one has a total of n = r + m inequality constraints. A large number of applications, especially to management problems, are discussed in Refs. 11.2 to 11.10. Note that a given problem may or may not have a solution.

Fig. 11.4-1. Solution sets in (X1, X2) space (a), and in (x1, x2, x3) space (b), for the linear-programming problem

    z = 3X1 + 2X2 = min    with    X1 ≥ 0, X2 ≥ 0, 5X1 + X2 − 5 ≥ 0
or
    z = 3x1 + 2x2 = min    with    5x1 + x2 − x3 = 5, x1 ≥ 0, x2 ≥ 0, x3 ≥ 0

The minimum feasible solution is the vertex (1, 0). In this case, r = 2, n = 3, m = n − r = 1.
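Since a minimum feasible solution (when one exists) lies at a vertex of the solution polygon (see Sec. 11.4-1b), a problem as small as that of Fig. 11.4-1 can be solved by brute-force vertex enumeration. The following sketch (helper names are our own) intersects each pair of constraint-boundary lines and keeps the feasible intersections:

```python
import itertools

# constraints a1*X1 + a2*X2 >= b, stored as (a1, a2, b):
cons = [(1.0, 0.0, 0.0),   # X1 >= 0
        (0.0, 1.0, 0.0),   # X2 >= 0
        (5.0, 1.0, 5.0)]   # 5*X1 + X2 >= 5
c = (3.0, 2.0)             # minimize z = 3*X1 + 2*X2

def intersect(p, q):
    """Intersection of the boundary lines of two constraints, or None."""
    (a1, a2, b), (d1, d2, e) = p, q
    det = a1*d2 - a2*d1
    if abs(det) < 1e-12:
        return None
    return ((b*d2 - a2*e) / det, (a1*e - b*d1) / det)

feasible = lambda X: all(a1*X[0] + a2*X[1] >= b - 1e-9 for a1, a2, b in cons)
verts = [v for p, q in itertools.combinations(cons, 2)
         if (v := intersect(p, q)) is not None and feasible(v)]
best = min(verts, key=lambda X: c[0]*X[0] + c[1]*X[1])
print(best)   # the minimum feasible solution (1, 0) of Fig. 11.4-1, z = 3
```

This exhaustive search is only practical for tiny problems; the simplex method of Sec. 11.4-2 avoids visiting nonminimal vertices.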
(b) The Linear-programming Problem in Standard Form. Feasible Solutions. Expressed in terms of the n nonnegative slack variables (1b), the linear-programming problem (1) requires one to
minimize the linear objective function

    z = f(x1, x2, . . . , xn) ≡ c1x1 + c2x2 + · · · + cnxn    (11.4-2a)

subject to m = n − r < n linear constraints

    ai1x1 + ai2x2 + · · · + ainxn = bi ≥ 0    (i = 1, 2, . . . , m)    (11.4-2b)

and n linear inequality constraints

    xk ≥ 0    (k = 1, 2, . . . , n)    (11.4-2c)

(standard form of the linear-programming problem). A feasible solution (feasible program) for a linear-programming problem given in the standard form (2) is an ordered set ("point") (x1, x2, . . . , xn) which satisfies the constraints (2b) and (2c); a minimum feasible solution actually minimizes the given objective function (2a) and thus solves the problem. A basic feasible solution is a feasible solution with N ≤ m positive xk; a basic feasible solution is nondegenerate if and only if N = m.
Whenever a feasible program exists, there exists one with at least n − m of the xk equal to zero. If, in addition, the objective function z has a lower bound on the solution set, then there exists a minimum feasible solution with at least n − m of the xk equal to zero (Ref. 11.2). The set of all feasible solutions to a given linear-programming problem is a convex set in the n-dimensional solution space; i.e., every point (ξ1, ξ2, . . . , ξn) on a straight-line segment joining two feasible-solution points (x1, x2, . . . , xn) and (x′1, x′2, . . . , x′n), or

    ξk = αxk + (1 − α)x′k    (0 ≤ α ≤ 1; k = 1, 2, . . . , n)

(see also Sec. 3.1-7), is also a feasible solution. More specifically, the solution set is a convex polygon or polyhedron on the plane or hyperplane defined by the constraint equations (2b), with boundary lines, planes, or hyperplanes defined by the inequalities (2c), as shown in the simple example of Fig. 11.4-1b. If a finite minimum feasible solution exists, the linear-programming problem either has a unique minimum feasible solution at a vertex of the solution polygon or polyhedron, or the same minimal z is obtained on the entire convex set generated by two or more vertices (degenerate solution).
(c) Duality.
The minimization problem

    z = F(X1, X2, . . . , Xr) ≡ C1X1 + C2X2 + · · · + CrXr = min
    with    xi ≡ Ai1X1 + Ai2X2 + · · · + AirXr − Bi ≥ 0    (i = 1, 2, . . . , m)
            Xk ≥ 0    (k = 1, 2, . . . , r)
    (11.4-3)
and the corresponding maximization problem

    w = G(Y1, Y2, . . . , Ym) ≡ B1Y1 + B2Y2 + · · · + BmYm = max
    with    yk ≡ A1kY1 + A2kY2 + · · · + AmkYm − Ck ≤ 0    (k = 1, 2, . . . , r)
            −Yi ≤ 0    (i = 1, 2, . . . , m)
    (11.4-4)

are dual linear-programming problems. If either problem (3) or (4) has a finite optimum solution, then the same is true for the other problem, and

    min z = max w    (11.4-5)

If either problem has an unbounded solution, then the other problem has no feasible solution (Ref. 11.2).
The duality theorem permits replacement of a given linear-programming problem with one having fewer unknowns or fewer given inequalities and is also useful in certain numerical solution methods (Ref. 11.5).
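The duality theorem can be checked on the problem of Fig. 11.4-1 (a sketch with our own variable names; here r = 2 and m = 1, so the dual has a single variable Y1):

```python
# primal (11.4-3): z = 3*X1 + 2*X2 = min with 5*X1 + X2 - 5 >= 0, X1, X2 >= 0
# dual   (11.4-4): w = 5*Y1 = max with 5*Y1 - 3 <= 0, 1*Y1 - 2 <= 0, Y1 >= 0
C = (3.0, 2.0)
A = ((5.0, 1.0),)
B = (5.0,)

# primal: the minimum lies at a vertex of the feasible region; the two
# vertices are the intersections of 5*X1 + X2 = 5 with X2 = 0 and X1 = 0
vertices = [(1.0, 0.0), (0.0, 5.0)]
z_min = min(C[0]*X1 + C[1]*X2 for X1, X2 in vertices)

# dual (one variable): largest Y1 >= 0 with A[0][k]*Y1 <= C[k] for each k
Y1 = min(C[k] / A[0][k] for k in range(2))
w_max = B[0] * max(Y1, 0.0)

print(z_min, w_max)   # both equal 3, in accordance with Eq. (11.4-5)
```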
11.4-2. The Simplex Method. (a) Linear-programming problems can be solved by the numerical hill-climbing methods outlined in Sec. 20.2-6, but the simplex method (Refs. 11.2 to 11.6) takes advantage of the linearity of the expressions (1) and (2). Assuming that a solution of the linear-programming problem (2) exists, any solution-polygon vertex can be found through simultaneous solution of the m linear equations (2b) and n − m linear equations xk = 0. To avoid laborious investigation of an excessive number of nonminimal vertices, the simplex method provides a systematic computing scheme progressing from vertex to vertex in the direction of lower z values.
(b) To find a solution vertex, one selects m basic variables, say x1, x2, . . . , xm, which are to be greater than zero at the vertex. By systematic elimination of variables (Secs. 1.9-1 and 20.3-1), x1, x2, . . . , xm can be expressed in terms of the remaining unknowns xm+1, xm+2, . . . , xn; then Eqs. (2a) and (2b) can be rewritten in canonical form relative to the selected basic variables, viz.,

    x1 + (α1,m+1 xm+1 + · · · + α1n xn) = β1
    x2 + (α2,m+1 xm+1 + · · · + α2n xn) = β2
    · · · · · · · · · · · · · · · · · · · ·
    xm + (αm,m+1 xm+1 + · · · + αmn xn) = βm
    z = (γm+1 xm+1 + · · · + γn xn) + z0
    (11.4-6)
Now xm+1 = xm+2 = · · · = xn = 0 defines a basic feasible solution (Sec. 11.4-1b) if all βi in Eq. (6) are nonnegative. If, in addition, all γk are nonnegative, then

    xi = βi    (i = 1, 2, . . . , m)
    xi = 0     (i = m + 1, m + 2, . . . , n)
    (11.4-7)
is a minimum feasible solution with z = z0; this solution is unique if all γk are positive.
(c) Consider next the case that Eq. (6) yields a nondegenerate basic feasible solution (i.e., β1, β2, . . . , βm > 0, see also Sec. 11.4-1b) which is not optimal, i.e., at least one γk is negative. Let γK be the smallest (most negative) γk. To obtain a basic feasible solution with a smaller z value, xK is then introduced as a basic variable in the next iteration cycle; xK will replace that former basic variable xl which, referring to Eq. (6), reaches zero first as xK is increased from zero, i.e., l is associated with the smallest possible

    xK = βl/αlK    (11.4-8a)

Then an improved basic feasible solution is given by Eq. (8a) and

    xi = βi − αiK xK ≥ 0    (i = 1, 2, . . . , m; note xl = 0)
    xi = 0    (i = m + 1, m + 2, . . . , n if i ≠ K)
    z = z0 + γK xK = z0 + γK βl/αlK < z0
    (11.4-8b)

Now, introduction of

    xK = (1/αlK)(βl − xl − Σ′ αlk xk)    (11.4-9)

(where the sum Σ′ ranges over the nonbasic indices k = m + 1, m + 2, . . . , n with k ≠ K) into Eq. (6) produces the canonical form relative to the new basic variables. The complete simplex algorithm, which permits convenient tabulation (simplex tableau, Refs. 11.5 and 11.6) and also adapts readily to machine computation, is repeated and usually progresses to an optimal solution in a finite number of steps. If, in the course of the computation, the basic variables correspond to a degenerate basic feasible solution (i.e., if at least one of the basic variables turns out to be zero, see also Sec. 11.4-1b), further improvement of the objective function z could, in principle, stop, but this rarely if ever occurs in a practical problem (Ref. 11.5). One can extend the simplex method to degenerate cases, e.g., by introducing small changes in the given coefficients during the computation (perturbation method) (Ref. 11.15). Many practical refinements of the basic simplex method appear in the literature (Refs. 11.2 to 11.7).
Note: The simplex algorithm necessarily fails when all αiK are nonpositive; in this case, the objective function z has no lower bound on the solution set.
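The pivot selection of Eqs. (8a) and (8b) can be sketched as follows (our own code, 0-based indices, exact rational arithmetic; the sample tableau is the first canonical form of the worked example in this section):

```python
from fractions import Fraction as F

def pivot_select(alpha, beta, gamma):
    """One simplex decision on a canonical form (11.4-6):
    basic_i + sum_k alpha[i][k]*nonbasic_k = beta[i],
    z = sum_k gamma[k]*nonbasic_k + z0.
    Returns (K, l, xK): entering column K, leaving row l, and the new
    basic value xK of Eq. (8a); returns None if all gamma_k >= 0
    (the current basic feasible solution is already minimal)."""
    K = min(range(len(gamma)), key=lambda k: gamma[k])
    if gamma[K] >= 0:
        return None
    ratios = [(beta[i] / alpha[i][K], i)
              for i in range(len(beta)) if alpha[i][K] > 0]
    if not ratios:
        raise ValueError("all alpha_iK nonpositive: z unbounded below")
    xK, l = min(ratios)
    return K, l, xK

# first canonical form of the example (basics x1, x2; nonbasics x3, x4):
alpha = [[F(-1, 2), F(5, 2)],
         [F(-5, 2), F(23, 2)]]
beta  = [F(5), F(20)]
gamma = [F(1, 8), F(-7, 24)]      # from 24z = 3*x3 - 7*x4 + 30
print(pivot_select(alpha, beta, gamma))   # x4 enters, second row leaves, x4 = 40/23
```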
EXAMPLE (Fig. 11.4-2): The problem

    z = (1/4)X1 + (1/3)X2 = min
    with    x1 ≡ X1 ≥ 0
            x2 ≡ 5X1 + X2 − 5 ≥ 0
            x3 ≡ 2X1 + 5X2 − 10 ≥ 0
            x4 ≡ X2 ≥ 0

transforms to the standard form

    z = (1/4)x1 + (1/3)x4 = min
    with    5x1 − x2 + x4 = 5
            2x1 − x3 + 5x4 = 10
            x1, x2, x3, x4 ≥ 0

Fig. 11.4-2. Solution set in (X1, X2) space for the example of Sec. 11.4-2c.

Starting with x1, x2 as basic variables (first vertex from the right in Fig. 11.4-2), one obtains the canonical form

    x1 − (1/2)x3 + (5/2)x4 = 5 ≥ 0
    x2 − (5/2)x3 + (23/2)x4 = 20 ≥ 0
    24z = 3x3 − 7x4 + 30

The coefficient of x4 in the last expression is negative; x4 is, therefore, increased until x2 reaches zero (this takes place before x1 reaches zero and corresponds to the second vertex from the right in Fig. 11.4-2). Thus, K = 4, l = 2, and

    x4 = 40/23        x1 = 15/23        x2 = x3 = 0

The new canonical form relative to x1, x4 is

    x1 − (5/23)x2 + (1/23)x3 = 15/23 ≥ 0
    x4 + (2/23)x2 − (5/23)x3 = 40/23 ≥ 0
    12 × 23 z = 7x2 + 17x3 + 205

which corresponds to the unique minimum feasible solution with z = 205/(12 × 23) = 205/276.
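The result is easy to confirm with exact rational arithmetic (our own sketch): the optimum is the intersection of the binding constraint lines 5X1 + X2 = 5 and 2X1 + 5X2 = 10, found here by Cramer's rule.

```python
from fractions import Fraction as F

# Cramer's rule for 5*X1 + X2 = 5, 2*X1 + 5*X2 = 10
det = F(5)*F(5) - F(1)*F(2)            # 23
X1 = (F(5)*F(5) - F(1)*F(10)) / det    # 15/23
X2 = (F(5)*F(10) - F(5)*F(2)) / det    # 40/23
z = F(1, 4)*X1 + F(1, 3)*X2
print(X1, X2, z)                       # 15/23 40/23 205/276
```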
(d) Use of Artificial Variables to Start the Simplex Procedure. The simplex algorithm, as described, presupposes that a suitable choice of the initial basic variables x1, x2, . . . , xm has produced a basic feasible solution with β1, β2, . . . , βm > 0. Such a choice may not be obvious. The following procedure may be used to start the solution, or to decide whether a feasible solution exists.
With the given problem in the standard form (2), introduce m artificial variables xn+1, xn+2, . . . , xn+m and solve the "augmented" linear-programming problem minimizing

    z + w = z + xn+1 + xn+2 + · · · + xn+m    (11.4-10a)

subject to the constraints

    a11x1 + a12x2 + · · · + a1nxn + xn+1 = b1 ≥ 0
    a21x1 + a22x2 + · · · + a2nxn + xn+2 = b2 ≥ 0
    · · · · · · · · · · · · · · · · · · · · · ·
    am1x1 + am2x2 + · · · + amnxn + xn+m = bm ≥ 0
    xk ≥ 0    (k = 1, 2, . . . , n + m)
    (11.4-10b)

The m artificial variables necessarily define a basic feasible solution of the augmented problem; the original problem has no solution unless the minimum of the feasibility form w = xn+1 + xn+2 + · · · + xn+m is zero.
11.4-3. Nonlinear Programming. The Kuhn-Tucker Theorem. If the objective function and/or one or more inequalities in the linear-programming problem (1) is replaced by an expression nonlinear in the problem variables Xk, one has a nonlinear-programming problem. Such a problem results, for example, if the solution-set boundaries and/or the lines of constant z in Fig. 11.4-1a are replaced by nonlinear curves.
Nonlinear-programming problems are of considerable practical interest but are, with few exceptions (Ref. 11.11), accessible only to numerical solution methods (see also Sec. 20.2-6). Assuming suitable differentiability, a necessary (not sufficient) condition for a maximum of

    z = f(x1, x2, . . . , xn)

subject to m inequality constraints

    φi(x1, x2, . . . , xn) ≥ 0    (i = 1, 2, . . . , m)    (11.4-11)
is the existence of m + 1 nonnegative Lagrange multipliers λ0, λ1, λ2, . . . , λm (see also Sec. 11.3-4), not all equal to zero, such that

    λi ≥ 0    (i = 0, 1, 2, . . . , m)        λiφi(x1, x2, . . . , xn) = 0    (i = 1, 2, . . . , m)

    ∂/∂xk [λ0 f + λ1φ1 + λ2φ2 + · · · + λmφm] = 0    (k = 1, 2, . . . , n)    (11.4-12a)

(Fritz John Theorem). λ0 is positive and can be set equal to unity in the condition (12a) whenever there exists a set of n real numbers (y1, y2, . . . , yn) such that, for the values of x1, x2, . . . , xn in question,

    (∂φi/∂x1) y1 + (∂φi/∂x2) y2 + · · · + (∂φi/∂xn) yn > 0    for every i with φi(x1, x2, . . . , xn) = 0    (11.4-12b)

(Abadie's form of the Kuhn-Tucker Theorem, Ref. 11.1).
11.4-4. Introduction to Finite Zero-sum Two-person Games. (a) Games with Pure Strategies. A finite zero-sum two-person game is a model of a conflict situation characterized by a finite payoff matrix

    [ a11  a12  · · ·  a1m ]
    [ a21  a22  · · ·  a2m ]
    [ · · · · · · · · · ·  ]
    [ an1  an2  · · ·  anm ]
    (11.4-13)

where aik is the (positive or negative) payment due player A from player B if A chooses the ith of n pure strategies S1, S2, . . . , Sn available to him, and B chooses the kth of m pure strategies S′1, S′2, . . . , S′m possible for him. Neither player knows the other's choice. Note that the sum of the payoffs to both players is zero for each move (hence the name zero-sum game). The game is symmetrical if and only if m = n and aki = −aik for all i, k.
To win, the maximizing player A selects the ith row of the payoff matrix so as to maximize mink aik, while the minimizing player B attempts to minimize maxi aik. For every given payoff matrix (13),

    maxi mink aik ≤ mink maxi aik    (11.4-14)

If the two quantities in Eq. (14) are equal for a (not necessarily unique) pair i = I, k = K, the game is said to have the saddle point or solution I, K and the (necessarily unique) value aIK. The optimal strategies for such a game are unaffected if player A knows B's move beforehand, and vice versa.
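A sketch of the comparison (14) in code (the sample matrices are our own, not from the text; indices are 0-based):

```python
def saddle_point(A):
    """Return (I, K, value) if max_i min_k a_ik equals min_k max_i a_ik,
    i.e. if Eq. (11.4-14) holds with equality; otherwise return None."""
    I = max(range(len(A)), key=lambda i: min(A[i]))
    K = min(range(len(A[0])), key=lambda k: max(row[k] for row in A))
    maximin = min(A[I])
    minimax = max(row[K] for row in A)
    return (I, K, maximin) if maximin == minimax else None

print(saddle_point([[4, 2, 3], [1, 0, 2]]))  # (0, 1, 2): pure strategies suffice
print(saddle_point([[1, -1], [-1, 1]]))      # None: no saddle point
```

The second matrix (penny matching, treated below in the original text) has no saddle point, which is why mixed strategies are needed.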
(b) Games with Mixed Strategies. Mixed strategies are defined by probabilities (Sec. 18.2-2) p1, p2, . . . , pn assigned by player A to his n respective strategies S1, S2, . . . , Sn, and probabilities p′1, p′2, . . . , p′m assigned by player B to his strategies S′1, S′2, . . . , S′m. In a mixed-strategy game, A attempts to maximize the minimum of the expected value (Sec. 18.3-3)

    min over p′1, p′2, . . . , p′m of    Σi Σk aik pi p′k

through his choice of p1, p2, . . . , pn, while B tries to minimize

    max over p1, p2, . . . , pn of    Σi Σk aik pi p′k

But for every given payoff matrix (13),

    max over (p1, . . . , pn) of min over (p′1, . . . , p′m) of Σi Σk aik pi p′k
        = min over (p′1, . . . , p′m) of max over (p1, . . . , pn) of Σi Σk aik pi p′k
    (11.4-15)
(Minimax Theorem for finite zero-sum two-person games). The common value of the two quantities (15) is called the value v of the game. Every finite zero-sum two-person game has at least one solution defined by the optimal strategies p1, p2, . . . , pn; p′1, p′2, . . . , p′m. Multiple solutions are possible, but the value of the game is necessarily unique. A fair zero-sum two-person game has the value 0.
EXAMPLE: The familiar penny-matching game has the payoff matrix

    [  1  −1 ]
    [ −1   1 ]

where strategy 1 for each player is a bet on heads, and strategy 2 is a bet on tails. The game is symmetrical and fair and has no saddle point. The solution is p1 = p′1 = ½, p2 = p′2 = ½.
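For a 2×2 game without a saddle point, the optimal mixed strategies can be computed from the standard closed-form solution (quoted here from general game theory, not from the text above); applied to penny matching it reproduces p1 = p′1 = ½ and v = 0:

```python
def solve_2x2(a11, a12, a21, a22):
    """Optimal mixed strategies and value v for a 2x2 zero-sum game
    with no saddle point (standard closed form)."""
    D = a11 - a12 - a21 + a22
    p1 = (a22 - a21) / D            # player A plays strategy 1 with prob. p1
    q1 = (a22 - a12) / D            # player B plays strategy 1 with prob. q1
    v = (a11 * a22 - a12 * a21) / D
    return p1, q1, v

print(solve_2x2(1, -1, -1, 1))   # (0.5, 0.5, 0.0): penny matching is fair
```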
(c) Relation to Linear Programming. While a number of approximation methods yielding solutions to mixed-strategy games have been developed (Ref. 11.10), the most generally applicable approach relates the problem to linear programming.
The solution of a given matrix game is unchanged by addition of a positive constant a to each aik, so that it is not a restriction to consider only games with positive values v > 0. In this case, the optimal strategies p1, p2, . . . , pn and p′1, p′2, . . . , p′m and the value v for a finite zero-sum two-person game with the payoff matrix (13) are given by

    pi = vXi    (i = 1, 2, . . . , n)
    p′k = vYk    (k = 1, 2, . . . , m)
    v = 1/zmin = 1/wmax
    (11.4-16)

where zmin, wmax, and the Xi and Yk are determined by the solution of the dual linear-programming problems (Sec. 11.4-1)

    z = X1 + X2 + · · · + Xn = min
    with    a1kX1 + a2kX2 + · · · + ankXn ≥ 1    (k = 1, 2, . . . , m)
            Xi ≥ 0    (i = 1, 2, . . . , n)
    (11.4-17a)

    w = Y1 + Y2 + · · · + Ym = max
    with    ai1Y1 + ai2Y2 + · · · + aimYm ≤ 1    (i = 1, 2, . . . , n)
            Yk ≥ 0    (k = 1, 2, . . . , m)
    (11.4-17b)
11.5. CALCULUS OF VARIATIONS. MAXIMA AND MINIMA OF DEFINITE INTEGRALS
11.5-1. Variations. (a) A variation δy of a function y(x) of x is a function of x defined for each value of x as the difference δy = Y(x) − y(x) between a new function Y(x) and y(x). Each variation δy defines a change in the functional relationship of y and x and must not be confused with a change Δy in the value of a given function y(x) due to a change Δx in the independent variable (Sec. 11.2-1).
(b) Given a function F[y1(x), y2(x), . . . , yn(x); x], the variation δF corresponding to the respective variations δy1, δy2, . . . , δyn of the functions y1(x), y2(x), . . . , yn(x) is

    δF ≡ F(y1 + δy1, y2 + δy2, . . . , yn + δyn; x) − F(y1, y2, . . . , yn; x)    (11.5-1)

If y(x) and δy are differentiable, the variation δy′ of the derivative y′(x) due to the variation δy is

    δy′ ≡ δ(dy/dx) = (d/dx)(δy) = Y′(x) − y′(x)    (11.5-2)

More generally, given F[y1(x), y2(x), . . . , yn(x); y′1(x), y′2(x), . . . , y′n(x); x],

    δF ≡ F(y1 + δy1, y2 + δy2, . . . , yn + δyn; y′1 + δy′1, y′2 + δy′2, . . . , y′n + δy′n; x) − F(y1, y2, . . . , yn; y′1, y′2, . . . , y′n; x)    (11.5-3)
(c) If F is suitably differentiable, Eq. (3) can be expanded in the form

    δF = Σi (∂F/∂yi) δyi + Σi (∂F/∂y′i) δy′i
         + ½ Σi Σk [(∂²F/∂yi ∂yk) δyi δyk + 2(∂²F/∂yi ∂y′k) δyi δy′k + (∂²F/∂y′i ∂y′k) δy′i δy′k] + · · ·    (11.5-4)

(sums over i, k = 1, 2, . . . , n) for each value of x (Sec. 4.10-5). The terms of degree 1, 2, . . . in the δyi and δy′i in the Taylor expansion (4) respectively constitute the first-order variation δ¹F of F, the second-order variation δ²F of F, . . .
(d) The definitions and formulas of Secs. 11.5-1a, b, and c apply without change to functions y, yi, and F of two or more independent variables x1, x2, . . . .
11.5-2. Maxima and Minima of Definite Integrals. While the theory of ordinary maxima and minima (Secs. 11.2-1 to 11.3-5) is concerned with unknown values of independent variables x or xi corresponding to maxima and minima of given functions, it is the objective of the calculus of variations to find unknown functions y(x) or yi(x) which will maximize or minimize definite integrals like

    I = ∫_{x0}^{xF} F[y(x), y′(x), x] dx    (11.5-5a)
or
    I = ∫_{x0}^{xF} F[y1(x), y2(x), . . . , yn(x); y′1(x), y′2(x), . . . , y′n(x); x] dx    (11.5-5b)

for a specified function F. I is a functional (Sec. 12.1-4) determined by the function(s) y(x) or yi(x) together with the integration limits x0, xF and the boundary values y(x0), y(xF) or yi(x0), yi(xF).
A definite integral (5) has a strong maximum for given integration limits x0, xF and a given function y(x) or a given set of functions yi(x) if and only if there exists a positive real number ε such that the variation

    δI = δ∫_{x0}^{xF} F dx = ∫_{x0}^{xF} δF dx + ∫_{x0+δx0}^{x0} (F + δF) dx + ∫_{xF}^{xF+δxF} (F + δF) dx    (11.5-6)

is negative for all increments δx0, δxF and all variations δy or δy1, δy2, . . . , δyn whose absolute values are less than ε and not identically zero. A strong minimum of a definite integral (5) is similarly defined by δI > 0.
The integral I has a weak maximum (or a weak minimum) for given x0, xF and y(x) or y1(x), y2(x), . . . , yn(x) if and only if there exists a positive real number ε such that δI < 0 (or δI > 0, respectively) only for all increments δx0, δxF and all variations δy or δy1, δy2, . . . ,
δyn and δy′ or δy′1, δy′2, . . . , δy′n whose absolute values are less than ε and not identically zero. A strong maximum (or minimum) is necessarily also a weak maximum (or minimum).
If the region of definition for "admissible" functions y(x) or y1(x), y2(x), . . . , yn(x) is bounded through inequalities such as

    y(x) ≤ a        y1²(x) + y2²(x) ≤ b        f(y1, y2, . . . , yn) ≥ 0

a function y(x) or yi(x) maximizing or minimizing I can lie wholly in the interior of the region of definition (interior maximum or minimum), or wholly or partially on its boundary (boundary maximum or minimum, see also Sec. 11.6-6).
In most applications, the maximizing or minimizing functions y(x) or yi(x) need not be compared with all possible functions of x. In the following, the existence of the definite integrals in question is assumed wherever necessary, and it is understood that (1) maximizing or minimizing functions are to be chosen from the set of all functions having piecewise continuous first derivatives on the interval or region under consideration. In addition, it will be assumed that (2) each integrand F is twice continuously differentiable on the integration domain (see also Sec. 11.6-1c).

11.5-3. Solution of Variation Problems. Functions y(x) or yi(x) which maximize or minimize a given definite integral may be found (1) as solutions of differential equations which ensure that δ¹I = 0 (Secs. 11.6-1 to 11.6-7), or (2) by the "direct" methods described in Secs. 11.7-1 to 11.7-3.
It is not a trivial observation that a given problem of the type discussed here may not possess any solution. Every solution derived with the aid of the necessary (not sufficient) conditions of Secs. 11.6-1 to 11.6-7 must be tested for actual maximum or minimum properties. Some sufficient conditions for the existence of maxima or minima of definite integrals are discussed in Sec. 11.6-10.
11.6. EXTREMALS AS SOLUTIONS OF DIFFERENTIAL EQUATIONS: CLASSICAL THEORY
11.6-1. Necessary Conditions for the Existence of Maxima and
Minima,
(a) A necessary condition for the existence of either a maximum
or a minimum of the definite integral
I = ∫_{x_0}^{x_F} F[y(x), y'(x), x] dx        (11.6-1)

for fixed x_0, x_F is

δI = ∫_{x_0}^{x_F} [(∂F/∂y) δy + (∂F/∂y') δy'] dx = [(∂F/∂y') δy]_{x_0}^{x_F} − ∫_{x_0}^{x_F} [d/dx (∂F/∂y') − ∂F/∂y] δy dx = 0

for an arbitrary small variation δy. Hence every maximizing or minimizing function y(x) must satisfy the differential equation
d/dx (∂F/∂y') − ∂F/∂y = 0        (EULER-LAGRANGE EQUATION)        (11.6-2)

wherever the quantity on the left exists and is continuous (see also Sec. 11.6-1c). In addition, y(x) must either assume given boundary values y(x_0) and/or y(x_F), or y(x) must satisfy other conditions determining its boundary values (Sec. 11.6-5).
(b) Similarly, every set of n functions y_1(x), y_2(x), . . . , y_n(x) maximizing or minimizing the definite integral

I = ∫_{x_0}^{x_F} F[y_1(x), y_2(x), . . . , y_n(x); y'_1(x), y'_2(x), . . . , y'_n(x); x] dx        (11.6-3)

must satisfy the set of n differential equations

d/dx (∂F/∂y'_i) − ∂F/∂y_i = 0        (i = 1, 2, . . . , n)        (EULER-LAGRANGE EQUATIONS)        (11.6-4)
together with suitably given boundary conditions, wherever all the quantities on the left exist and are continuous (see also Sec. 11.6-1c).
(c) Functions y(x) or y_i(x) satisfying the Euler equation or equations associated with a given variation problem are called extremals for the problem in question. A further necessary (still not sufficient) condition for a maximum or minimum of I on a given extremal is that the matrix [∂²F/∂y'_i ∂y'_k] be, respectively, negative or positive semidefinite (Sec. 13.5-2) on the extremal (Legendre's Condition); this reduces to ∂²F/∂y'² ≤ 0 or ∂²F/∂y'² ≥ 0, respectively, in the one-dimensional case (see also Secs. 11.6-7 and 11.6-10).
In general, the necessary conditions of Secs. 11.6-1a and b apply only where the y(x) or y_i(x) are twice continuously differentiable, but this is not as restrictive a condition as it might seem. Let the integrand F be twice continuously differentiable for x_0 ≤ x ≤ x_F. Then all continuously differentiable functions y(x) or y_i(x) which actually maximize or minimize I for given x_0, x_F and given boundary values necessarily have continuous second-order derivatives for all x in (x_0, x_F) where the matrix [∂²F/∂y'_i ∂y'_k] is negative or positive definite (Sec. 13.5-2), or ∂²F/∂y'² ≠ 0 (Theorem of Du Bois-Reymond).
(d) The derivation of the conditions (2) and (4) from δI = 0 is based on the fundamental lemma of the calculus of variations: if f(x) is continuous on the bounded interval [x_0, x_F] and ∫_{x_0}^{x_F} f(x)g(x) dx = 0 for arbitrary g(x), then f(x) ≡ 0 on [x_0, x_F]; and on Du Bois-Reymond's Lemma: a continuous function f(x) is necessarily a constant in (x_0, x_F) if ∫_{x_0}^{x_F} f(x)g(x) dx = 0 for every g(x) such that ∫_{x_0}^{x} g(ξ) dξ is a continuous function in (x_0, x_F) and ∫_{x_0}^{x_F} g(x) dx = 0.
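As a quick numerical illustration (added here, not part of the original text), the Euler-Lagrange condition (2) can be checked directly for the integrand F = ½y'² − ½y², for which Eq. (2) reads y'' + y = 0, so y(x) = sin x is an extremal with y(0) = 0, y(π/2) = 1. The grid spacing and tolerance below are arbitrary choices:

```python
import numpy as np

# Extremal of I = ∫ (y'^2/2 - y^2/2) dx: the Euler-Lagrange equation is
# d/dx(dF/dy') - dF/dy = y'' + y = 0, satisfied by y = sin(x).
h = 1e-3
x = np.arange(0.0, np.pi / 2 + h, h)
y = np.sin(x)                       # candidate extremal

# discrete Euler-Lagrange residual y'' + y at the interior grid points
ypp = (y[2:] - 2.0 * y[1:-1] + y[:-2]) / h**2
residual = ypp + y[1:-1]
assert np.max(np.abs(residual)) < 1e-6
```

The residual is zero up to the O(h²) truncation error of the central difference, confirming that sin x satisfies Eq. (2).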
[Fig. 11.6-1. Extremals with corners:
Fig. 11.6-1a. Two of the infinite set of broken-line extremals minimizing I = ∫_0^2 (1 − y')²y'² dx with y(0) = 0, y(2) = 1. The Euler-Lagrange equation yields extremal segments y = ax + b, with a and b determined by the boundary conditions and a Weierstrass-Erdmann condition for each corner; it follows that a is either 0 or 1. Note that ∂²F/∂y'² = 12y'² − 12y' + 2 (with undetermined y') can vanish anywhere in (0, 2), but ∂²F/∂y'² > 0 on each minimizing extremal. Note also that no continuously differentiable extremal yields a smaller value of I (Refs. 11.19 and 11.22).
Fig. 11.6-1b. Refraction of extremals at a curve S(x, y) = 0.
Fig. 11.6-1c. Reflection of extremals from a boundary curve S(x, y) = 0.
Fig. 11.6-1d. An extremal lying partly on a boundary curve.]
EXAMPLES OF APPLICATIONS INVOLVING VARIATION PROBLEMS: geodesics in Riemann spaces (Sec. 17.4-3); derivation of Lagrange's equations from Hamilton's principle. See also Sec. 15.4-7.
EXAMPLE: Brachistochrone in Three Dimensions. Given a three-dimensional rectangular cartesian coordinate system with the positive x axis vertically downward, this classical problem requires the determination of the space curve y = y(x), z = z(x) which will minimize the time t taken by a particle sliding on the curve without friction under the action of gravity to get from the origin to some specified point [x_F > 0, y(x_F), z(x_F)]. Since by conservation of energy

gx = (1/2) [(dx/dt)² + (dy/dt)² + (dz/dt)²]
where g is the acceleration of gravity, the quantity to be minimized is

t √(2g) = ∫_0^{x_F} (1/√x) √(1 + (y')² + (z')²) dx

The Euler equations

d/dx [y' / (√x √(1 + (y')² + (z')²))] = 0        d/dx [z' / (√x √(1 + (y')² + (z')²))] = 0

yield

y' / (√x √(1 + (y')² + (z')²)) = c_1        z' / (√x √(1 + (y')² + (z')²)) = c_2

where c_1 and c_2 are constants. Since dz/dy = c_2/c_1, the curve must lie in a vertical plane, which will be the xy plane if the boundary conditions y(x_F) = y_F, z(x_F) = 0 are assumed. In this case, z' ≡ 0 and

y' / (√x √(1 + (y')²)) = c_1

so that

y = a arccos (1 − x/a) − √(2ax − x²) + k        (a = 1/(2c_1²))

The constant k must vanish, since y = 0 for x = 0; the constant a will depend on the values of x_F and y_F. The desired curve represents a cycloid (Sec. 2.6-2e) with its base along the y axis and its cusp at the origin.
Note: The problem of the brachistochrone, like the problem of Sec. 11.6-3, is an example of the minimization of a line integral (Sec. 4.6-10). Problems of this type can always be reduced to problems involving integration over a single independent variable.
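A hedged numerical check (added here, not in the original text): parametrizing the cycloid as x = a(1 − cos θ), y = a(θ − sin θ), the descent time is t = θ_F √(a/g), and it beats the straight chord to the same end point. The constants a, g and the grids below are arbitrary choices:

```python
import numpy as np

g = 9.81
a = 1.0                                  # cycloid parameter (arbitrary)
theta = np.linspace(1e-6, np.pi, 100001)
x = a * (1.0 - np.cos(theta))            # positive x axis points downward
y = a * (theta - np.sin(theta))

# time along the cycloid: dt = ds / v with speed v = sqrt(2 g x)
ds = np.hypot(np.gradient(x, theta), np.gradient(y, theta))
f = ds / np.sqrt(2.0 * g * x)
t_cycloid = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(theta))

# closed form on the cycloid: t = theta_F * sqrt(a / g)
assert abs(t_cycloid - np.pi * np.sqrt(a / g)) < 1e-3

# straight chord y = m x to the same end point (2a, pi a)
m = y[-1] / x[-1]
xs = np.linspace(1e-9, x[-1], 100001)
fl = np.sqrt(1.0 + m * m) / np.sqrt(2.0 * g * xs)
t_line = np.sum(0.5 * (fl[1:] + fl[:-1]) * np.diff(xs))
assert t_cycloid < t_line                # the cycloid is faster
```

The trapezoidal sums are crude near x = 0 (where the speed vanishes), but the comparison is robust: the cycloid time matches the closed form and is strictly smaller than the chord time.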
11.6-2. Variation Problems with Constraints or Accessory Conditions. The Method of Lagrange Multipliers. It is desired to find sets of functions y_1(x), y_2(x), . . . , y_n(x) which maximize or minimize the definite integral (3) while also subject to m suitably differentiable constraints or accessory conditions

φ_j(y_1, y_2, . . . , y_n; x) = 0        (j = 1, 2, . . . , m)        (11.6-5)

One obtains the desired functions y_1(x), y_2(x), . . . , y_n(x) as solutions of the set of differential equations (Euler equations)

d/dx (∂Φ/∂y'_i) − ∂Φ/∂y_i = 0        (i = 1, 2, . . . , n)        (11.6-6)

subject to the constraints (5), where

Φ ≡ F + Σ_{j=1}^{m} λ_j(x) φ_j        (11.6-7)
The unknown functions λ_j(x) are called Lagrange multipliers (see also Sec. 11.3-4). If the problem has a solution, the n + m functions y_i and λ_j can be determined from the n + m relations (6) and (5) together with the given boundary conditions. The differential equations (6) are necessary (but not sufficient) conditions for a maximum or minimum, provided that all the quantities on the left exist and are continuous. The method of Lagrange multipliers remains applicable in the case of suitably differentiable nonholonomic constraints of the form

φ_j(y_1, y_2, . . . , y_n; y'_1, y'_2, . . . , y'_n; x) = 0        (j = 1, 2, . . . , m)        (11.6-8)

11.6-3. Isoperimetric Problems. An isoperimetric problem requires
one to find sets of functions y_1(x), y_2(x), . . . , y_n(x) which will maximize or minimize a definite integral (3) subject to the accessory conditions

∫_{x_0}^{x_F} ψ_k(y_1, y_2, . . . , y_n; y'_1, y'_2, . . . , y'_n; x) dx = c_k        (k = 1, 2, . . . , m')        (11.6-9)

where the c_k are given constants. If the given functions ψ_k are suitably differentiable, the method of Lagrange multipliers applies; the unknown functions y_i(x) must satisfy a set of differential equations (6) subject to the constraints (9), where now

Φ ≡ F + Σ_{k=1}^{m'} μ_k ψ_k        (11.6-10)
The m' Lagrange multipliers μ_k are constants to be determined together with the n unknown functions y_i(x) from the m' + n relations (9) and (6) and the given boundary conditions.
EXAMPLE: The area of a closed plane curve x = x(t), y = y(t) can be written in the form

I = (1/2) ∫_0^{2π} (x dy/dt − y dx/dt) dt

if the parameter t is suitably chosen. To maximize I subject to the accessory condition

∫_0^{2π} [(dx/dt)² + (dy/dt)²] dt = 2πR²

where R is a constant different from zero, write

Φ = (1/2) (x dy/dt − y dx/dt) + μ [(dx/dt)² + (dy/dt)²]

The resulting Euler equations (6), viz.,

−dy/dt + 2μ d²x/dt² = 0        dx/dt + 2μ d²y/dt² = 0
and the given accessory condition are satisfied by

x = R cos t + x_0        y = −R sin t + y_0        with μ = 1/2

i.e., the desired curve is a circle of radius R (see also Sec. 11.7-1).
Note: To maximize or minimize the definite integral (3) subject to m accessory conditions (5) and m' accessory conditions (9), apply the conditions (6) with

Φ ≡ F + Σ_{j=1}^{m} λ_j(x) φ_j + Σ_{k=1}^{m'} μ_k ψ_k
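The circle found in the example above can be verified numerically. The following sketch (an illustration added here; R and the center offsets are arbitrary choices) checks the two Euler equations with μ = ½, the accessory condition, and the enclosed area πR²:

```python
import numpy as np

R, x0, y0, mu = 2.0, 0.5, -1.0, 0.5
t = np.linspace(0.0, 2.0 * np.pi, 40001)
x = R * np.cos(t) + x0
y = -R * np.sin(t) + y0

# derivatives by central differences
xd, yd = np.gradient(x, t), np.gradient(y, t)
xdd, ydd = np.gradient(xd, t), np.gradient(yd, t)

inner = slice(2, -2)                    # skip one-sided endpoint stencils
assert np.max(np.abs((-yd + 2.0 * mu * xdd)[inner])) < 1e-5
assert np.max(np.abs((xd + 2.0 * mu * ydd)[inner])) < 1e-5

# accessory condition and enclosed area (trapezoidal sums)
dt = t[1] - t[0]
speed2 = xd**2 + yd**2
constraint = np.sum(0.5 * (speed2[1:] + speed2[:-1])) * dt
w = x * yd - y * xd
area = 0.5 * np.sum(0.5 * (w[1:] + w[:-1])) * dt
assert abs(constraint - 2.0 * np.pi * R**2) < 1e-3
assert abs(abs(area) - np.pi * R**2) < 1e-3
```

Both Euler equations vanish along the circle, and the constrained "perimeter-type" integral and area take the expected values 2πR² and πR².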
11.6-4. Solution of Variation Problems Involving Higher-order Derivatives in the Integrand. To maximize or minimize definite integrals of the form

I = ∫_{x_0}^{x_F} F(y_1, y_2, . . . , y_n; y'_1, y'_2, . . . , y'_n; y''_1, y''_2, . . . , y''_n; . . . ; x) dx        (11.6-11)

one may introduce all derivatives of order higher than one as new dependent variables related to one another and to the y_i by constraints y''_i = dy'_i/dx, y'''_i = dy''_i/dx, . . . . The resulting necessary conditions for a maximum or minimum of the definite integral (11) take the form

∂F/∂y_i − d/dx (∂F/∂y'_i) + d²/dx² (∂F/∂y''_i) − d³/dx³ (∂F/∂y'''_i) + · · · = 0        (i = 1, 2, . . . , n)        (11.6-12)

provided that the integrand F is suitably differentiable, and all the quantities on the left exist and are continuous. In order to solve the problem completely, one must also specify the boundary values of all but the highest derivatives of each function y_i appearing in the integrand.
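For instance (an illustration added here, not from the original text), with F = ½(y'')², Eq. (11.6-12) reduces to d²/dx²(∂F/∂y'') = y'''' = 0, so every cubic is an extremal; a short numerical check:

```python
import numpy as np

# F = (y'')^2 / 2 depends on y'' only; Eq. (11.6-12) becomes y'''' = 0,
# so cubics are extremals of ∫ F dx.
h = 0.01
x = np.arange(0.0, 1.0 + h, h)
y = 1.0 + 2.0 * x - 3.0 * x**2 + 0.5 * x**3   # arbitrary cubic

y4 = y.copy()
for _ in range(4):                 # repeated central differences ~ y''''
    y4 = np.gradient(y4, x)
assert np.max(np.abs(y4[4:-4])) < 1e-6
```

The interior slice discards the points contaminated by the one-sided endpoint stencils; there the fourth derivative of the cubic vanishes to machine precision.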
11.6-5. Variation Problems with Unknown Boundary Values and/or Unknown Integration Limits. (a) Given Integration Limits, Unknown Boundary Values. To maximize or minimize the definite integral (3) when one or more of the boundary values y_i(x_0) and/or y_i(x_F) are not specified or constrained (Sec. 11.6-5c), replace each missing boundary condition by a corresponding natural boundary condition

∂F/∂y'_i = 0        (x = x_0 and/or x = x_F)        (11.6-13)
(b) Given Boundary Values, Unknown Integration Limit. If one of the integration limits, say x_F, is unknown but all boundary values y_i(x_F) are given, then the extremals y_i(x) must satisfy

Σ_{i=1}^{n} (∂F/∂y'_i) y'_i − F = 0        (x = x_F)        (11.6-14)
(c) General Transversality Conditions. Frequently, an integration limit, say x_F, and/or one or more of the corresponding boundary values y_i(x_F) will not be given explicitly, but the "point" [x_F; y_1(x_F), y_2(x_F), . . . , y_n(x_F)] is known to satisfy n_F given continuously differentiable relations

G_j[x_F; y_1(x_F), y_2(x_F), . . . , y_n(x_F)] = 0        (j = 1, 2, . . . , n_F ≤ n)        (11.6-15)

The integration limit x_F and the unknown functions y_i(x) which together maximize or minimize the integral (3) must then satisfy the Euler equations (4) and the boundary conditions (15) together with a set of transversality conditions expressed by

∂F/∂y'_i + Σ_{j=1}^{n_F} Λ_j ∂G_j/∂y_i = 0        (x = x_F)        (11.6-16a)

for each y_i(x_F) which is not given explicitly, and/or

[Σ_{i=1}^{n} (∂F/∂y'_i) y'_i − F] − Σ_{j=1}^{n_F} Λ_j ∂G_j/∂x = 0        (x = x_F)        (11.6-16b)

if x_F is not explicitly known. The Λ_j are n_F constant Lagrange multipliers to be determined, together with up to n + 1 unknowns y_i(x_F) and/or x_F, from the n + n_F + 1 relations (15) and (16). Note that Eqs. (13) and (14) are special cases of Eqs. (16a) and (16b).
(d) Analogous methods apply if the boundary conditions corresponding to x = x_0 are not completely specified.
EXAMPLE: To minimize a line integral
I = ∫ G[x(s), y(s), z(s)] ds = ∫ G[x(t), y(t), z(t)] √((dx/dt)² + (dy/dt)² + (dz/dt)²) dt        [G(x, y, z) > 0]

where (x_0, y_0, z_0) and (x_F, y_F, z_F) lie on two suitable (convex) nonintersecting regular surfaces S_0, S_F specified by

B_0(x_0, y_0, z_0) = 0        G_F(x_F, y_F, z_F) = 0

the solution extremal must satisfy the transversality conditions

dx/dt : dy/dt : dz/dt = ∂B_0/∂x_0 : ∂B_0/∂y_0 : ∂B_0/∂z_0        (t = t_0)
dx/dt : dy/dt : dz/dt = ∂G_F/∂x_F : ∂G_F/∂y_F : ∂G_F/∂z_F        (t = t_F)

i.e., it must intersect S_0 and S_F orthogonally. If, in particular, G(x, y, z) ≡ 1, then I is a distance between the surfaces, and the extremals are straight lines. The actual existence of a minimum depends on the specific nature (convexity) of the given surfaces.
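A small numerical sketch (added here; the spheres, their centers, and radii are arbitrary choices) illustrates the orthogonality: for G ≡ 1 and two spheres, the minimizing extremal is the segment of the line of centers, and its direction is parallel to both surface gradients:

```python
import numpy as np

# Two spheres B0 = |p - c0|^2 - r0^2 = 0 and GF = |p - cF|^2 - rF^2 = 0.
c0, r0 = np.array([0.0, 0.0, 0.0]), 1.0
cF, rF = np.array([5.0, 1.0, 2.0]), 2.0

u = (cF - c0) / np.linalg.norm(cF - c0)     # direction of the extremal
p0 = c0 + r0 * u                            # entry point on B0 = 0
pF = cF - rF * u                            # exit point on GF = 0

grad_B0 = 2.0 * (p0 - c0)                   # surface gradients
grad_GF = 2.0 * (pF - cF)

# transversality: dx/dt : dy/dt : dz/dt matches each gradient,
# i.e. u is parallel to both gradients (cross products vanish)
assert np.linalg.norm(np.cross(u, grad_B0)) < 1e-12
assert np.linalg.norm(np.cross(u, grad_GF)) < 1e-12
```

The vanishing cross products confirm that the straight-line extremal leaves and enters the two surfaces orthogonally, as the transversality conditions demand.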
11.6-6. The Problems of Bolza and Mayer. A functional of the form

J = ∫_{x_0}^{x_F} F[y_1(x), y_2(x), . . . , y_n(x); y'_1(x), y'_2(x), . . . , y'_n(x); x] dx + h[y_1(x_F), y_2(x_F), . . . , y_n(x_F); x_F]        (11.6-17)
to be maximized by a suitable choice of functions y_i(x) (Problem of Bolza), subject to suitably given boundary conditions of the form (15), can, for suitably differentiable h, be written as

J = ∫_{x_0}^{x_F} (F + h̄') dx        (11.6-18)

where h̄ = h̄(x) is an additional dependent variable satisfying the differential-equation constraint

h̄' = Σ_{k=1}^{n} (∂h/∂y_k) y'_k + ∂h/∂x        (x_0 < x < x_F)        (11.6-19)

and the boundary conditions

h̄(x_0) = 0        h̄(x_F) = h[y_1(x_F), y_2(x_F), . . . , y_n(x_F); x_F]        (11.6-20)

The y_i(x) must satisfy the usual Euler-Lagrange equations for x_0 < x < x_F. An analogous procedure is used if a function of the boundary values corresponding to x = x_0 is added to the given expression (17). If the y_i(x) are subject to given differential-equation constraints but F ≡ 0, so that only the boundary function h remains to be maximized or minimized, one has the Problem of Mayer, which will be considered in the context of optimum-control theory in Secs. 11.8-1 to 11.8-6.
11.6-7. Extremals with Corners. Refraction, Reflection, and Boundary Extrema. The functions y(x) or y_i(x) maximizing or minimizing a definite integral (1) or (3) were assumed to be only piecewise continuously differentiable (Sec. 11.5-2). They can have corners (discontinuities in first-order derivatives) for values of x such that ∂²F/∂y'² = 0, or where the matrix [∂²F/∂y'_i ∂y'_k] is only semidefinite for some y, y' or y_1, y_2, . . . , y_n; y'_1, y'_2, . . . , y'_n, or where F has a discontinuity (Secs. 11.6-1 and 13.5-2). Corners may, in particular, occur

1. For any x over an interval of x values, as illustrated by the example of Fig. 11.6-1a.
2. On a curve, surface, or hypersurface

S(x, y) = 0        or        S(x; y_1, y_2, . . . , y_n) = 0        (11.6-21)

crossed by the extremal curves ("refraction" of extremals, Fig. 11.6-1b).
3. On the boundary of a region from which the extremals are excluded by an inequality constraint

S(x, y) ≤ 0        or        S(x; y_1, y_2, . . . , y_n) ≤ 0        (11.6-22)

(Fig. 11.6-1c and d).
At every "free" corner [x_1, y(x_1)] or [x_1; y_1(x_1), y_2(x_1), . . . , y_n(x_1)] between extremals (Fig. 11.6-1a, but not b, c, d), the extremal segments must satisfy

F_{y'}(x_1 − 0) = F_{y'}(x_1 + 0)        or        F_{y'_i}(x_1 − 0) = F_{y'_i}(x_1 + 0)        (i = 1, 2, . . . , n)
[F_{y'} y' − F]_{x=x_1−0} = [F_{y'} y' − F]_{x=x_1+0}        or        [Σ_{k=1}^{n} F_{y'_k} y'_k − F]_{x=x_1−0} = [Σ_{k=1}^{n} F_{y'_k} y'_k − F]_{x=x_1+0}

(WEIERSTRASS-ERDMANN CORNER CONDITIONS)        (11.6-23)
For refraction (Fig. 11.6-1b) and reflection (Fig. 11.6-1c) of extremals, the curve, surface, or hypersurface defined by Eq. (21) or (22) acts like an intermediate boundary where each extremal must satisfy Eq. (21) together with the corner conditions

F_{y'_i}(x_1 + 0) − F_{y'_i}(x_1 − 0) = −Λ ∂S/∂y_i        (i = 1, 2, . . . , n)

[Σ_{k=1}^{n} F_{y'_k} y'_k − F]_{x=x_1+0} − [Σ_{k=1}^{n} F_{y'_k} y'_k − F]_{x=x_1−0} = Λ ∂S/∂x        (11.6-24)
where Λ is a constant Lagrange multiplier. In each case, the corner conditions supplement the given boundary conditions to determine the corner points. In the boundary-extremum case illustrated by Fig. 11.6-1d, extremals must enter and leave the boundary along a tangent to the boundary (see also Sec. 11.8-5).
Note: In the special case I = ∫_{t_0}^{t_F} G(x, y, z) ds with ds² = dx² + dy² + dz², the corner conditions yield the refraction and reflection laws of optics if the function G(x, y, z) is interpreted as the reciprocal of the propagation velocity (Ref. 11.21).
11.6-8. Canonical Equations and Hamilton-Jacobi Equation. (a) The n second-order Euler equations (4) associated with the variation problem of Sec. 11.6-1b are equivalent to the 2n first-order canonical equations
y'_i = ∂H/∂p_i        p'_i = −∂H/∂y_i        (i = 1, 2, . . . , n)        (11.6-25a)

in the 2n dependent variables y_i and p_i ≡ ∂F/∂y'_i, where

H(y_1, y_2, . . . , y_n; p_1, p_2, . . . , p_n; x) ≡ Σ_{i=1}^{n} p_i y'_i − F(y_1, y_2, . . . , y_n; y'_1, y'_2, . . . , y'_n; x)        (11.6-25b)

with

det [∂²F/∂y'_i ∂y'_k] ≠ 0

(contact transformation, see also Secs. 10.2-6 and 10.2-7). Note that the transversality conditions (13), (14), (16), and the corner conditions (23), (24) are greatly simplified if one introduces the p_i and H. In classical dynamics, the canonically conjugate or adjoint variables p_i are interpreted as generalized momenta, and the Hamiltonian function H has the dimensions of energy. Sections 11.8-1 to 11.8-6 restate the calculus of variations in terms of canonical equations in the context of optimum-control theory.
One can sometimes simplify the solution of the problem by introducing 2n + 1 new variables ȳ_i, p̄_i, and H̄(ȳ_1, ȳ_2, . . . , ȳ_n; p̄_1, p̄_2, . . . , p̄_n; x) through a suitable canonical transformation (Sec. 10.2-6) which yields new canonical equations

ȳ'_i = ∂H̄/∂p̄_i        p̄'_i = −∂H̄/∂ȳ_i        (i = 1, 2, . . . , n)
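As an added illustration of the canonical form (not from the original text), take F = ½my'² − ½ky² (a harmonic oscillator, with x playing the role of time); then p = ∂F/∂y' = my' and H = py' − F = p²/(2m) + ½ky², and integrating the canonical equations conserves H since F has no explicit x dependence. The constants and step size below are arbitrary choices:

```python
import numpy as np

# Canonical equations (11.6-25) for F = m*y'^2/2 - k*y^2/2:
# p = m*y',  H = p^2/(2m) + k*y^2/2,  y' = dH/dp = p/m,  p' = -dH/dy = -k*y.
m, k = 2.0, 8.0
omega = np.sqrt(k / m)

def rhs(state):
    y, p = state
    return np.array([p / m, -k * y])

state = np.array([1.0, 0.0])          # y(0) = 1, p(0) = 0
h, n = 1e-3, 5000
H0 = state[1]**2 / (2.0 * m) + k * state[0]**2 / 2.0
for _ in range(n):                    # classical fourth-order Runge-Kutta
    k1 = rhs(state)
    k2 = rhs(state + 0.5 * h * k1)
    k3 = rhs(state + 0.5 * h * k2)
    k4 = rhs(state + h * k3)
    state = state + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

x = n * h
H1 = state[1]**2 / (2.0 * m) + k * state[0]**2 / 2.0
assert abs(H1 - H0) < 1e-9            # H is conserved (no explicit x in F)
assert abs(state[0] - np.cos(omega * x)) < 1e-8
```

The integrated trajectory matches the analytic solution y = cos(ωx), and H stays constant to well within the integrator's truncation error.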
(b) Assuming that the minimum

min_{y_1(x), y_2(x), . . . , y_n(x)} I = min ∫_{x_0}^{x_F} F(y_1, y_2, . . . , y_n; y'_1, y'_2, . . . , y'_n; x) dx
    = S[x_0, y_1(x_0), y_2(x_0), . . . , y_n(x_0); x_F, y_1(x_F), y_2(x_F), . . . , y_n(x_F)]
    = S[x_0, y_1(x_0), y_2(x_0), . . . , y_n(x_0)] ≡ S(X, Y_1, Y_2, . . . , Y_n)

(for fixed x_F, y_i(x_F)) defines unique optimal extremals, the function S(X, Y_1, Y_2, . . . , Y_n) satisfies the partial differential equation

∂S/∂X + H(Y_1, Y_2, . . . , Y_n; ∂S/∂Y_1, ∂S/∂Y_2, . . . , ∂S/∂Y_n; X) = 0        (HAMILTON-JACOBI EQUATION)        (11.6-26)

The ordinary differential equations (25a) are the corresponding characteristic equations (see also Secs. 10.2-4 and 11.8-7).
11.6-9. Variation Problems Involving Two or More Independent Variables: Maxima and Minima of Multiple Integrals. One often wishes to determine a set of n functions y_1(x_1, x_2, . . . , x_m), y_2(x_1, x_2, . . . , x_m), . . . , y_n(x_1, x_2, . . . , x_m) of m independent variables x_1, x_2, . . . , x_m so as to maximize or minimize a multiple integral

I = ∫∫ · · · ∫_V F dx_1 dx_2 · · · dx_m        (11.6-27)

where the integrand F is a given function of the n + m + mn variables y_i, x_k, and y_{i,k} ≡ ∂y_i/∂x_k. The boundary S of the region of integration V and the boundary values of the functions y_i may or may not be given. The functions y_i are assumed to be chosen from the set of all functions having continuous second partial derivatives in V (see also Sec. 11.5-2). If the integrand F is suitably differentiable, every set of maximizing or minimizing functions y_1, y_2, . . . , y_n must satisfy the set of partial differential equations

Σ_{k=1}^{m} ∂/∂x_k (∂F/∂y_{i,k}) − ∂F/∂y_i = 0        (i = 1, 2, . . . , n)        (11.6-28)

together with suitable boundary conditions, wherever the quantities on the left exist and are continuous. The Lagrange-multiplier methods of Secs. 11.6-2 and 11.6-3 apply to variation problems involving accessory conditions.
If the boundary values of one or more of the unknown functions y_i are not given, one must use the condition δI = 0 to derive natural boundary conditions analogous to Eq. (13). Thus, in the case of two independent variables x_1, x_2, let the boundary S be a regular closed curve described in terms of its arc length s by x_1 = x_1(s), x_2 = x_2(s);
then the natural boundary conditions are

(∂F/∂y_{i,1}) dx_2/ds − (∂F/∂y_{i,2}) dx_1/ds = 0        (i = 1, 2, . . . , n)        (11.6-29)
The case of an unknown boundary is treated in Ref. 11.21.
EXAMPLE: The small displacements of a string of length L may be represented by a function y(x, t) of the coordinate x measured along the string and of the time t. The kinetic and potential energies of the entire string are, respectively,

T = (1/2) ∫_0^L m (∂y/∂t)² dx        U = (1/2) ∫_0^L Q (∂y/∂x)² dx

where m is the mass per unit length, and Q is the string tension. For a minimum of the integral

∫_{t_0}^{t_F} (T − U) dt = (1/2) ∫_{t_0}^{t_F} ∫_0^L (m y_t² − Q y_x²) dx dt

(Hamilton's principle, Ref. 11.22), one must have

m ∂²y/∂t² − Q ∂²y/∂x² = 0

which is the correct wave equation for the vibrating string.
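An added check (not in the original text): the standing wave y(x, t) = sin(πx/L) cos(ωt) with ω = (π/L)√(Q/m) satisfies the wave equation m y_tt = Q y_xx identically; the constants m, Q, L and the sample instant below are arbitrary choices:

```python
import numpy as np

# Standing-wave solution of m*y_tt - Q*y_xx = 0 with y(0,t) = y(L,t) = 0.
m, Q, L = 0.5, 2.0, 3.0
omega = (np.pi / L) * np.sqrt(Q / m)

x = np.linspace(0.0, L, 201)
t = 0.37                                 # arbitrary instant
shape = np.sin(np.pi * x / L) * np.cos(omega * t)
y_tt = -omega**2 * shape                 # exact second time derivative
y_xx = -(np.pi / L)**2 * shape           # exact second space derivative
assert np.max(np.abs(m * y_tt - Q * y_xx)) < 1e-12
```

The identity m ω² = Q (π/L)² makes the residual vanish to machine precision at every grid point.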
11.6-10. Simple Sufficient Conditions for Maxima and Minima. Consider a one-parameter family of extremals [solutions of the Euler-Lagrange equation (2)] y = y(x, μ) for the definite integral

I = ∫_{x_0}^{x_F} F(x, y, y') dx        (11.6-30)

with y(x_0, μ) = y_0, for a parameter range μ_1 < μ < μ_2 such that z(x, μ) ≡ ∂y(x, μ)/∂μ exists. The extremals corresponding to this parameter range all go through the point (x_0, y_0); they will not intersect again in (x_0, x_F) if z(x, μ) satisfies the differential equation

(d/dx F_{yy'} − F_{yy}) z + d/dx (F_{y'y'} z') = 0        (JACOBI'S CONDITION)        (11.6-31)

in (x_0, x_F) [extremal field centered at (x_0, y_0)]. The function

P(x, y) = y'(x, μ)

obtained through elimination of μ with the aid of y = y(x, μ) represents the slope of the extremal field at a point (x, y). An extremal y(x, μ) of the field satisfying Eqs. (2) and (31) maximizes the given integral (30) if the function

E(x, y, y', P) ≡ F(x, y, y') − F(x, y, P) − (y' − P) F_{y'}(x, y, P)        (WEIERSTRASS E FUNCTION)        (11.6-32)

is nonpositive for all points (x, y) sufficiently close to y(x, μ); this is, in particular, true if

F_{y'y'} ≤ 0        (LEGENDRE'S STRONG CONDITION)        (11.6-33)

along the field extremal in question. The maximum is a strong maximum (Sec. 11.5-2) if E ≤ 0 for all (x, y) sufficiently close to the extremal curve and arbitrary y'.
E ≤ 0 along the extremal for every finite y' (Weierstrass's necessary condition) is necessary for a strong maximum, but is not sufficient by itself. Conditions for a minimum of the integral (30) similarly correspond to E ≥ 0. The multidimensional case is treated in Refs. 11.16 and 11.17.
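For example (an illustration added here, not from the original text), with F = y'² the E function reduces to E = y'² − P² − (y' − P)·2P = (y' − P)² ≥ 0 for every comparison slope y', so each field extremal furnishes a strong minimum; a quick numerical confirmation:

```python
import numpy as np

# Weierstrass E function for F(x, y, y') = y'^2:
# E = F(y') - F(P) - (y' - P) * F_{y'}(P) = (y' - P)^2 >= 0.
rng = np.random.default_rng(0)
yp = rng.uniform(-10.0, 10.0, 1000)   # arbitrary comparison slopes y'
P = rng.uniform(-10.0, 10.0, 1000)    # field slopes

E = yp**2 - P**2 - (yp - P) * (2.0 * P)
assert np.all(E >= -1e-9)             # nonnegative up to rounding
assert np.max(np.abs(E - (yp - P)**2)) < 1e-9
```

Since E ≥ 0 for arbitrary y' (not merely slopes near P), the strong-minimum version of the condition is met.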
11.7. SOLUTION OF VARIATION PROBLEMS BY DIRECT METHODS
11.7-1. Direct Methods. Each so-called direct method for maximizing or minimizing a given definite integral

I = ∫_{x_0}^{x_F} F[y(x), y'(x), x] dx        (11.7-1)

attempts to approximate the desired function y(x) successively by a sequence of functions u_1(x), u_2(x), . . . selected so as to satisfy the boundary conditions imposed on y(x). Each function u_r(x) is taken to be a differentiable function of x and of r parameters a_{r1}, a_{r2}, . . . , a_{rr}. The latter are then chosen so as to maximize or minimize the function

I_r(a_{r1}, a_{r2}, . . . , a_{rr}) ≡ ∫_{x_0}^{x_F} F[u_r(x), u'_r(x), x] dx        (11.7-2)

with the aid of the relations

∂I_r/∂a_{ri} = 0        (i = 1, 2, . . . , r; r = 1, 2, . . .)        (11.7-3)
(Sec. 11.3-3a). Every tentative solution y(x) obtained in this manner as the limit of a sequence of approximating functions u_1(x), u_2(x), . . . still requires a proof that the definite integral (1) is actually a maximum or minimum. Analogous methods of solution apply to variation problems involving more than one unknown function (Sec. 11.6-1b) and/or more than one independent variable (Sec. 11.6-9). If accessory conditions are given (Secs. 11.6-2 and 11.6-3), they are made to apply to each approximating function; the approximating integrals may then be maximized or minimized by the Lagrange-multiplier method of Sec. 11.3-4 (see example). Direct methods can yield numerical approximations and/or exact solutions. Since, under the conditions listed in Sec. 11.5-2, the solutions of variation problems must satisfy differential equations expressing the condition δI = 0, every direct method for the solution of a variation problem is also an approximation method for solving differential equations.
EXAMPLE: To solve the isoperimetric problem given as an example in Sec. 11.6-3, let

u_r(t) ≡ (1/2) a_0 + Σ_{k=1}^{r} (a_k cos kt + b_k sin kt)
v_r(t) ≡ (1/2) α_0 + Σ_{k=1}^{r} (α_k cos kt + β_k sin kt)        (r = 1, 2, . . .)
so that

I_r = (1/2) ∫_0^{2π} (u_r dv_r/dt − v_r du_r/dt) dt = π Σ_{k=1}^{r} k (a_k β_k − b_k α_k)        (r = 1, 2, . . .)

To maximize I_r subject to the accessory condition

∫_0^{2π} [(du_r/dt)² + (dv_r/dt)²] dt = π Σ_{k=1}^{r} k² (a_k² + b_k² + α_k² + β_k²) = 2πR²

one applies the Lagrange-multiplier method of Sec. 11.3-4 and finds

β_k + 2λ_r k a_k = 0        a_k + 2λ_r k β_k = 0
−α_k + 2λ_r k b_k = 0        −b_k + 2λ_r k α_k = 0        (k = 1, 2, . . . , r)

for r = 1, 2, . . . . Starting with k = 1, one obtains λ_r = −1/2, β_1 = a_1, α_1 = −b_1; for k > 1, a_k = b_k = α_k = β_k = 0. In this example, the direct method yields the exact solution

x = (1/2) a_0 + a_1 cos t + b_1 sin t        y = (1/2) α_0 − b_1 cos t + a_1 sin t

with

a_1² + b_1² = R²
11.7-2. The Rayleigh-Ritz Method. This method attempts to expand the desired solution y(x) as a series in terms of a complete set of functions Ψ_j(x) (Sec. 15.2-4), so that the approximating functions u_r(x) = Σ_{j=0}^{r} α_{rj} Ψ_j(x) satisfy the given boundary conditions. As a rule, the Ψ_j(x) are orthogonal functions, so that the parameters α_{rj} = α_j are independent of r (Sec. 15.2-4), just as in the example of Sec. 11.7-1. The Rayleigh-Ritz method is useful for numerical solution of certain eigenvalue problems in vibration theory and quantum mechanics (see also Sec. 15.4-7).
11.7-3. Approximation of y(x) by Polygonal Functions. y(x) may be approximated by polygonal functions u_r(x), each defined, say, by its r values a_{r1} = u_r(x_0 + Δx), a_{r2} = u_r(x_0 + 2Δx), . . . , a_{rr} = u_r(x_0 + r Δx) = u_r(x_F − Δx). In this case, the conditions (3) yield a difference equation (Sec. 20.4-3) approximating the Euler equation (11.6-2) of the given variation problem.
11.8. CONTROL PROBLEMS AND THE MAXIMUM PRINCIPLE
11.8-1. Problem Statement. (a) State Equations, Controls, and Criterion. In control problems, the state of a dynamical (mechanical, electrical, chemical, etc.) system is represented by n state variables x_1(t), x_2(t), . . . , x_n(t) satisfying n first-order differential equations

dx_i/dt = f_i(x_1, x_2, . . . , x_n; u_1, u_2, . . . , u_r)        (i = 1, 2, . . . , n)        (STATE EQUATIONS)        (11.8-1)

Typical state variables are generalized coordinates and velocities in mechanics, electric currents and voltages, and concentrations of chemicals (Sec. 11.8-3); the independent variable t is usually the time. The problem is to determine the r control variables (controls) u_k = u_k(t) (k = 1, 2, . . . , r) as functions of t for t_0 ≤ t ≤ t_F so as to minimize a given criterion functional

x_0(t_F) = ∫_{t_0}^{t_F} f_0(x_1, x_2, . . . , x_n; u_1, u_2, . . . , u_r) dt        (11.8-2)

(e.g., cost, mean-square error, time to achieve a task, etc.) subject to inequality constraints

Q_j(u_1, u_2, . . . , u_r) ≤ 0        (j = 1, 2, . . . , N)        (11.8-3)

defining the closed domain U of admissible controls. Possible constraints on the admissible states (x_1, x_2, . . . , x_n) will be discussed in Secs. 11.8-1d and 11.8-5.
The optimal control functions u_k(t) will produce an optimal trajectory x_i = x_i(t) in the n-dimensional state space. Solution of such a control problem requires suitably given boundary conditions to determine initial and final values x_i(t_0), x_i(t_F); the initial and final times t_0, t_F may themselves be unknowns (Sec. 11.8-1c).
(b) Optimum-control Theory and the Calculus of Variations. The methods of Secs. 11.8-2 to 11.8-5 may be regarded as a somewhat
generalized calculus of variations applied to the important class of problems defined in Sec. 11.8-1. In the language of earlier sections, both the x_i(t) and the u_k(t) are unknown dependent variables* y_i(t). The state equations (1) are differential-equation constraints, and the variables p_i defined in Sec. 11.8-2 are the corresponding variable Lagrange multipliers. The adjoint equations and maximum principle introduced in Sec. 11.8-2 constitute necessary (not sufficient) conditions for optimal u_k, x_i and are essentially equivalent to the Euler equations (11.6-4) plus the condition E ≥ 0 (Sec. 11.6-10), where they apply. The maximum principle states the optimizing conditions in an elegant, convenient, and

* As a matter of fact, the entire problem of Secs. 11.6-1 to 11.6-7, i.e., maximization or minimization of integrals

I = ∫_{x_0}^{x_F} F[y_1(x), y_2(x), . . . , y_n(x); y'_1(x), y'_2(x), . . . , y'_n(x); x] dx

with suitable constraints and/or boundary conditions, is reformulated as an optimum-control problem if one substitutes

x = t        I = x_0(t_F)        x_0 = t_0        x_F = t_F        y_i(x) = x_i(t)    with    dx_i/dt = u_i(t) = y'_i(x)        (i = 1, 2, . . . , n)

In this special case, the theory of Sec. 11.8-2 leads to the classical canonical equations of Sec. 11.6-8; this approach may or may not simplify the given problem.
more general form, permitting a relatively straightforward treatment of systems with discontinuous control variables (Sec. 11.8-3).
(c) Initial-state and Terminal-state Manifolds. While the starting time t_0 and the initial values x_i(t_0) of the state variables are given in many control problems, a more general problem statement merely requires the initial state [x_1(t_0), x_2(t_0), . . . , x_n(t_0)] to lie on a given (n − n_0)-dimensional initial-state manifold (hypersurface, curve, or point in the state space) defined by

B_j[x_1(t_0), x_2(t_0), . . . , x_n(t_0)] = 0        (j = 1, 2, . . . , n_0 ≤ n)        (11.8-4a)

The terminal state [x_1(t_F), x_2(t_F), . . . , x_n(t_F)] is similarly constrained to lie on a given (n − n_F)-dimensional terminal-state manifold defined by

G_j[x_1(t_F), x_2(t_F), . . . , x_n(t_F)] = 0        (j = 1, 2, . . . , n_F ≤ n)        (11.8-4b)

In addition, one may have given inequality constraints defining allowable regions on each manifold (4). Note that a single inequality

G[x_1(t_F), x_2(t_F), . . . , x_n(t_F)] ≤ 0        (11.8-5)

confining, say, the terminal state to the interior of a given n-dimensional state-space region is essentially equivalent to the corresponding equality constraint, since each trajectory entering the terminal-state region must cross its boundary.
(d) Continuity, Differentiability, and Independence Assumptions (see also Sec. 11.5-2 and Ref. 11.25). Unless the contrary is specifically stated, it will be assumed in the following that, for all x_i, u_k, and t under consideration,

1. The given functions f_0, f_1, f_2, . . . , f_n are continuously differentiable with respect to the state variables x_i, and continuous in the control variables u_k.
2. The functions Q_j are continuously differentiable with nonzero gradients.
3. The n_0 functions B_j and the n_F functions G_j defining the initial- and terminal-state manifolds are continuously differentiable; functional independence is assured by the condition that the ranks (Sec. 13.2-7) of the matrices [∂B_j/∂x_i(t_0)] and [∂G_j/∂x_i(t_F)] are, respectively, equal to n_0 and n_F.

All admissible controls u_k(t) are to be chosen from the class of functions of bounded variation in (t_0, t_F) (hence, unilateral limits exist everywhere, Sec. 4.4-8), and piecewise continuously differentiable wherever they are continuous. The corresponding x_i(t) will be piecewise continuously differentiable.
Less restrictive assumptions are often possible, but the resulting theorem statements become cumbersome (Ref. 11.25).
(e) Generalizations (see also Secs. 11.8-4 and 11.8-5). The optimization methods introduced in Secs. 11.8-1 to 11.8-5 can be made to apply to much more general problems. In particular,

1. If one or more of the given functions f_i, Q_j depend explicitly on the time t, one introduces t as an additional state variable x_{n+1} satisfying

dx_{n+1}/dt = 1        x_{n+1}(t_0) = t_0        x_{n+1}(t_F) = t_F        (11.8-6)

This procedure applies also if a B_j depends explicitly on t_0, or a G_j on t_F.
2. Continuously differentiable equality constraints on the state variables,

φ_j(x_1, x_2, . . . , x_n) = 0        (j = 1, 2, . . . , m)        (11.8-7)

and isoperimetric conditions

∫_{t_0}^{t_F} ψ_j(x_1, x_2, . . . , x_n) dt = c_j        (j = 1, 2, . . . , m')        (11.8-8)

with piecewise continuously differentiable ψ_j can be handled by the Lagrange-multiplier method of Secs. 11.6-2 and 11.6-3. One replaces the function f_0 in the criterion functional (2) with

F_0 ≡ f_0 + Σ_{j=1}^{m} λ_j(t) φ_j + Σ_{k=1}^{m'} μ_k ψ_k        (11.8-9)

where the λ_j(t) and μ_k are, respectively, variable and constant Lagrange multipliers. Such constraints are not enforced by the control, but modify the definition of the given system. Control-enforced constraints on the x_i will be treated in Sec. 11.8-5.
3. Criterion functionals of the form

x_0(t_F) = ∫_{t_0}^{t_F} F(x_1, x_2, . . . , x_n; u_1, u_2, . . . , u_r) dt + h[x_1(t_F), x_2(t_F), . . . , x_n(t_F)]        (11.8-10)

reduce to the simpler form (2) if one introduces

f_0(x_1, x_2, . . . , x_n; u_1, u_2, . . . , u_r) = F + ḣ(t)

with

ḣ(t) ≡ d/dt h[x_1(t), x_2(t), . . . , x_n(t)] ≡ Σ_{i=1}^{n} (∂h/∂x_i) f_i        (11.8-11)

the resulting criterion differs from (10) only by the constant h[x_1(t_0), x_2(t_0), . . . , x_n(t_0)] (see also Sec. 11.6-6).
4. If the constraints (3) on the control variables u_k depend explicitly on the state variables x_i (including, possibly, t = x_{n+1}), one can usually eliminate this dependence by introducing new control variables.
EXAMPLE: A constraint (3) of the form

|u_k| ≤ q(x_1, x_2, . . . , x_n)
reduces to the form |v| ≤ 1 if one introduces the new control variable v defined by u_k = v q(x_1, x_2, . . . , x_n).
11.8-2. Pontryagin's Maximum Principle. (a) Adjoint Variables and Optimal Hamiltonian. It is convenient to treat the criterion functional (2) as the final value x_0(t_F) of an added state variable x_0(t) satisfying the state equation

dx_0/dt = f_0(x_1, x_2, . . . , x_n; u_1, u_2, . . . , u_r)        (11.8-12)

and the initial condition

x_0(t_0) = 0        (11.8-13)
Necessary conditions for optimal control are then defined in terms of Pontryagin's Maximum Principle: Define n + 1 adjoint variables po(t), Pi(t)y P2(t)j . . . , pn(t) as solutions of the n + 1 first-order differential equations n
f'=-2P*t (.--0,1,2,...,»;«.
(adjoint equations) p0(t) = constant < 0
with
(11.8-14) (11.8-15)
Then the optimal control minimizing the criterion functional (2) is given by that set of admissible control variables u_k = u_k(t) which maximizes the Hamiltonian function

$$H(x_1, x_2, \ldots, x_n;\ p_0, p_1, p_2, \ldots, p_n;\ u_1, u_2, \ldots, u_r) = \sum_{i=0}^{n} p_i f_i \tag{11.8-16}$$

for each t between t₀ and t_F; moreover,

$$M(x_1, x_2, \ldots, x_n;\ p_0, p_1, p_2, \ldots, p_n) \equiv \max_{(u_1, u_2, \ldots, u_r)\ \text{in}\ U} H(x_1, x_2, \ldots, x_n;\ p_0, p_1, p_2, \ldots, p_n;\ u_1, u_2, \ldots, u_r) = 0 \qquad (t_0 \le t \le t_F) \tag{11.8-17}$$
In addition, the optimal xᵢ(t) and u_k(t) must satisfy the given conditions (1) to (4), and the transversality conditions

$$p_i(t_0) + \sum_{j=1}^{n_0} \Lambda_j \left[\frac{\partial G_j}{\partial x_i}\right]_{t=t_0} = 0 \qquad (i = 1, 2, \ldots, n) \tag{11.8-18a}$$

$$p_i(t_F) + \sum_{j=1}^{n_F} \Lambda'_j \left[\frac{\partial G'_j}{\partial x_i}\right]_{t=t_F} = 0 \qquad (i = 1, 2, \ldots, n) \tag{11.8-18b}$$
corresponding to the given boundary conditions (4); the Λⱼ and Λ′ⱼ are unknown constants (see also Secs. 11.6-5, 11.6-8, and 11.8-2b). Given the assumptions listed in Sec. 11.8-1c, the adjoint variables pᵢ(t) will be piecewise continuously differentiable functions. They remain continuous even if one admits discontinuities of f₀ and/or f₁, f₂, …, fₙ on a hypersurface S given by g(x₁, x₂, …, xₙ) = 0, provided that g is continuously differentiable, and that the fᵢ possess unilateral derivatives with respect to x₁, x₂, …, xₙ on either side of S ("refraction" of optimal trajectories, see also Sec. 11.6-7). Under these conditions, the optimal Hamiltonian, too, remains continuous on S.

(b) The Boundary-value Problem. Pontryagin's maximum condition yields, in principle, relations expressing each control variable u_k in terms of the xᵢ and pᵢ, i.e.,

$$u_k = u_k(x_0, x_1, x_2, \ldots, x_n;\ p_0, p_1, p_2, \ldots, p_n) \qquad (k = 1, 2, \ldots, r) \tag{11.8-19}$$
These relations may be obtained through solution of an ordinary maximum-of-a-function problem [possibly with inequality constraints (3)] for each t. Once this has been done, the optimum-control problem reduces to the solution of the 2n + 2 first-order differential equations (1), (12), and (14), or

$$\frac{dx_i}{dt} = \frac{\partial H}{\partial p_i} \qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial x_i} \qquad (i = 0, 1, 2, \ldots, n) \tag{11.8-20}$$

subject to Eq. (17) and the boundary conditions. Since the adjoint equations (14) are homogeneous in the pᵢ, one can arbitrarily choose the constant in Eq. (15), so that

$$p_0(t) \equiv -1 \qquad (t_0 \le t \le t_F) \tag{11.8-21}$$
One now has exactly 2n + n₀ + n_F + 2 boundary conditions (4), (13), (18), and (21) to determine 2n + 2 unknown constants of integration, n₀ + n_F unknown multipliers Λⱼ, Λ′ⱼ, and the unknown time interval t_F − t₀. The missing boundary condition is obtained through substitution of t = t_F into Eq. (17). Unless either t₀ or t_F is given explicitly, one introduces the additional state variable xₙ₊₁ defined by

$$\frac{dx_{n+1}}{dt} = 1 \qquad x_{n+1}(t_0) = t_0$$

(see also Sec. 11.8-1e).
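As a concrete illustration of the reduced boundary-value problem (20), here is a sketch under assumed data (not an example from the handbook): for the double integrator dx₁/dt = x₂, dx₂/dt = u with criterion ∫₀ᵀ ½u² dt and fixed T, maximizing H = −½u² + p₁x₂ + p₂u gives u = p₂, and the adjoint equations make p₁ constant and p₂ linear in t, so the terminal conditions reduce to a 2 × 2 linear system for p₁(0), p₂(0):

```python
# Minimal sketch (assumed example, not from the handbook): the canonical
# equations (11.8-20) for dx1/dt = x2, dx2/dt = u, criterion integral of
# u**2/2 over [0, T].  Maximizing H = -u**2/2 + p1*x2 + p2*u gives u = p2;
# the adjoints satisfy dp1/dt = 0, dp2/dt = -p1, so p2(t) = p2(0) - p1(0)*t.
# With x(0) = (0, 0), x1(T) = 1, x2(T) = 0, integration of the state
# equations yields a 2x2 linear system for p1(0), p2(0).

def solve_adjoint_constants(T, x1F=1.0, x2F=0.0):
    # x2(T) = -p1(0)*T**2/2 + p2(0)*T        = x2F
    # x1(T) = -p1(0)*T**3/6 + p2(0)*T**2/2   = x1F
    a11, a12, b1 = -T**2 / 2, T, x2F
    a21, a22, b2 = -T**3 / 6, T**2 / 2, x1F
    det = a11 * a22 - a12 * a21
    p10 = (b1 * a22 - a12 * b2) / det
    p20 = (a11 * b2 - b1 * a21) / det
    return p10, p20

T = 2.0
p10, p20 = solve_adjoint_constants(T)

# Verify by integrating the state equations with u(t) = p2(t):
dt, x1, x2, t = 1e-4, 0.0, 0.0, 0.0
while t < T - 1e-12:
    u = p20 - p10 * t               # optimal control u = p2(t)
    x1 += x2 * dt + 0.5 * u * dt**2
    x2 += u * dt
    t += dt
assert abs(x1 - 1.0) < 1e-3 and abs(x2) < 1e-3
```

For this smooth example the boundary-value problem is linear and solvable in closed form; in general (e.g., with control constraints) the same structure is attacked numerically by shooting on the unknown initial adjoints.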
Note: Whenever a boundary condition (4), say Gⱼ = 0, permits explicit determination or elimination of a boundary value, say x₁(t_F), then p₁(t_F) is undetermined by the relations (4) and (18). In this case, the jth equation (4b) (and hence Λ′ⱼ) and the jth equation (18b) are simply omitted. If, on the other hand, the boundary value of a state variable, say x₁(t_F), is left "free" or undetermined by the terminal-state constraints (4b), then the jth equation (18b) yields the corresponding "natural boundary condition" p₁(t_F) = 0 (see also Sec. 11.6-5). Similar special cases apply to Eqs. (4a) and (18a).
(c) Since the maximum principle, as given, expresses only necessary (not sufficient) conditions for optimum control, the Pontryagin method may yield multiple candidates for the optimal solution, or no solution may exist. Actual solution of the two-point boundary-value problem usually requires numerical iteration methods (see also Secs. 20.9-2 and 20.9-3). In addition, actual derivation of the maximizing functions (19) may also require successive approximations (Refs. 11.23 and 11.24).

11.8-3. Examples. (a) Zermelo's Navigation Problem. A ship with rectangular cartesian coordinates x₁, x₂ runs at a given constant velocity V in a current with given local velocity components v₁(x₁, x₂), v₂(x₁, x₂) (v₁² + v₂² < V²). Given x₁(0) = 0, x₂(0) = 0, one desires to minimize the time

$$x_0(t_F) = t_F = \int_0^{t_F} dt \qquad (f_0 = 1)$$
required to reach a given (attainable) point (x₁F, x₂F) by choosing the angle u(t) between the instantaneous velocity relative to the water and the x₁ axis. The state equations are

$$\frac{dx_1}{dt} = v_1(x_1, x_2) + V\cos u \qquad \frac{dx_2}{dt} = v_2(x_1, x_2) + V\sin u \qquad \frac{dx_0}{dt} = 1 \qquad (f_0 = 1)$$
Hence

$$H = p_1(v_1 + V\cos u) + p_2(v_2 + V\sin u) - 1 \qquad \text{where } p_0 = -1$$

For a maximum of H,

$$\cos u = \frac{p_1}{\sqrt{p_1^2 + p_2^2}} \qquad \sin u = \frac{p_2}{\sqrt{p_1^2 + p_2^2}}$$
where p₁, p₂ must satisfy the adjoint equations

$$\frac{dp_1}{dt} = -\left(p_1\frac{\partial v_1}{\partial x_1} + p_2\frac{\partial v_2}{\partial x_1}\right) \qquad \frac{dp_2}{dt} = -\left(p_1\frac{\partial v_1}{\partial x_2} + p_2\frac{\partial v_2}{\partial x_2}\right)$$

and

$$M = \max_u H = p_1 v_1 + p_2 v_2 + V\sqrt{p_1^2 + p_2^2} - 1 = 0$$
If, in particular, v₁ and v₂ are constant, then so are p₁, p₂, and u; their values, together with t_F, must satisfy

$$x_1(0) = 0 \qquad x_2(0) = 0 \qquad x_1(t_F) = x_{1F} \qquad x_2(t_F) = x_{2F}$$
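For the constant-current case the two terminal conditions above determine u and t_F; a small Python sketch (assumed numbers, simple bisection, not from the handbook) solves them numerically:

```python
# Sketch: Zermelo's problem with a constant current (v1, v2) and ship
# speed V.  With v1, v2 constant the optimal heading u is constant and
#   x1F = (v1 + V*cos(u)) * tF,   x2F = (v2 + V*sin(u)) * tF.
# Eliminating tF gives one equation in u, solved here by bisection;
# the bracket below suits this particular data set.
import math

def zermelo_constant_current(v1, v2, V, x1F, x2F):
    def residual(u):
        # displacement must be proportional to the net velocity
        return x1F * (v2 + V * math.sin(u)) - x2F * (v1 + V * math.cos(u))
    lo, hi = -math.pi / 2, math.pi / 2
    for _ in range(200):                      # bisection
        mid = 0.5 * (lo + hi)
        if residual(lo) * residual(mid) <= 0:
            hi = mid
        else:
            lo = mid
    u = 0.5 * (lo + hi)
    tF = x1F / (v1 + V * math.cos(u))
    return u, tF

u, tF = zermelo_constant_current(v1=0.3, v2=0.0, V=1.0, x1F=1.0, x2F=1.0)
# The computed (u, tF) reproduce the target point:
assert abs((0.3 + math.cos(u)) * tF - 1.0) < 1e-9
assert abs(math.sin(u) * tF - 1.0) < 1e-9
```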
(b) Simple Bang-bang Time-optimal Control. Given x₁(0), x₂(0) = ẋ₁(0) and the state equations

$$\frac{dx_1}{dt} = x_2 \qquad \frac{dx_2}{dt} = u(t) \qquad \left(\text{i.e., } \frac{d^2x_1}{dt^2} = u(t)\right)$$

one desires to minimize the time

$$x_0(t_F) = t_F = \int_0^{t_F} dt \qquad (f_0 = 1)$$

required to reach the given terminal state x₁(t_F) = x₂(t_F) = ẋ₁(t_F) = 0 by choosing an optimal control u = u(t) such that |u(t)| ≤ 1. Maximization of the Hamiltonian

$$H = p_1 x_2 + p_2 u - 1$$

subject to |u| ≤ 1 requires

$$u = \operatorname{sign} p_2 = \begin{cases} 1 & (p_2 > 0) \\ -1 & (p_2 < 0) \end{cases}$$

with

$$\frac{dp_1}{dt} = 0 \qquad \frac{dp_2}{dt} = -p_1 = \text{constant}$$

so that p₁ = p₁(0), p₂ = p₂(0) − p₁(0)t.
Fig. 11.8-1. Phase-plane trajectories for simple bang-bang control of a simple optimal-control problem (Sec. 11.8-3b).
Each optimal phase-plane trajectory (a parabolic arc corresponding to u = +1 or u = −1) intersects the "switching curve" corresponding to p₂ = 0, and each trajectory continues to the origin on the latter (Fig. 11.8-1). Each specific trajectory depends on the unknown parameters p₁(0), p₂(0), which must be determined so that the given boundary conditions x₁(t_F) = x₂(t_F) = 0 are satisfied.

(c) A Simple Minimal-time Orbit-transfer and Rendezvous Problem. Referring to Fig. 11.8-2, the motion of a simple rocket-driven vehicle in a vertical plane, assuming a flat earth (constant acceleration of gravity, −g) and no air resistance, is given by the four state equations

$$\frac{dx}{dt} = \dot x \qquad \frac{d\dot x}{dt} = \frac{T\cos u}{m_0 + \dot m(t - t_0)} \qquad \frac{dy}{dt} = \dot y \qquad \frac{d\dot y}{dt} = \frac{T\sin u}{m_0 + \dot m(t - t_0)} - g$$

(Fig. 11.8-2). x = x₁, ẋ = x₂, y = x₃, and ẏ = x₄ are state variables; g, T (thrust), m₀ (vehicle mass at start), and ṁ < 0 (fuel-consumption rate, assumed constant) are
Fig. 11.8-2. Geometry for the orbit-transfer problem (Sec. 11.8-3c).
given constants. It is desired to transfer the vehicle from a given horizontal starting orbit defined by

$$x(t_0) = v_0 t_0 \qquad \dot x(t_0) = v_0 \qquad y(t_0) = y_0 \qquad \dot y(t_0) = 0$$

to the given horizontal orbit of a target, and to match its given position and constant velocity:

$$x(t_F) = v_F t_F + a \qquad \dot x(t_F) = v_F \qquad y(t_F) = y_F \qquad \dot y(t_F) = 0$$

This is to be accomplished by programming the control variable (attitude angle) u = u(t) so as to minimize the total fuel consumption, i.e., the duration t_F − t₀, or simply $\int_{t_0}^{t_F} dt$. Since the initial- and final-state manifolds depend explicitly on t₀ and t_F, one introduces x₅ = t with dx₅/dt = 1, x₅(t₀) = t₀, x₅(t_F) = t_F. Maximization of the Hamiltonian

$$H = -1 + p_1 x_2 + p_2\,\frac{T\cos u}{m_0 + \dot m(t - t_0)} + p_3 x_4 + p_4\left[\frac{T\sin u}{m_0 + \dot m(t - t_0)} - g\right] + p_5$$
requires ∂H/∂u = 0, or −p₂ sin u + p₄ cos u = 0, and

$$\cos u = \frac{p_2}{\sqrt{p_2^2 + p_4^2}} \qquad \sin u = \frac{p_4}{\sqrt{p_2^2 + p_4^2}}$$
The adjoint equations

$$\frac{dp_1}{dt} = \frac{dp_3}{dt} = \frac{dp_5}{dt} = 0 \qquad \frac{dp_2}{dt} = -p_1 \qquad \frac{dp_4}{dt} = -p_3$$
yield

$$p_1(t) = p_1(t_0) \qquad p_2(t) = -p_1(t_0)(t - t_0) + p_2(t_0) \qquad p_3(t) = p_3(t_0) \qquad p_4(t) = -p_3(t_0)(t - t_0) + p_4(t_0) \qquad p_5(t) = p_5(t_0)$$
Since the starting and terminal values of ẋ, y, and ẏ are fixed, the only transversality conditions (18) of interest are those resulting from

$$x(t_0) - v_0 t_0 = 0 \qquad x(t_F) - v_F t_F - a = 0$$

or

$$p_1(t_0) + \Lambda_j = 0 \qquad p_5(t_0) - \Lambda_j v_0 = 0 \qquad p_1(t_F) + \Lambda'_j = 0 \qquad p_5(t_F) - \Lambda'_j v_F = 0$$
With p₁ and p₅ constant, this implies p₁ ≡ p₅ ≡ 0. Furthermore, the maximal Hamiltonian is zero at t = t_F:

$$\frac{T}{m_0 + \dot m(t_F - t_0)}\sqrt{p_2^2(t_0) + p_4^2(t_0) + p_3^2(t_0)(t_F - t_0)^2 - 2p_3(t_0)p_4(t_0)(t_F - t_0)} + g\,p_3(t_0)(t_F - t_0) - g\,p_4(t_0) - 1 = 0$$

This, together with the eight given initial and terminal conditions, determines the nine quantities t₀, t_F, x(t₀), ẋ(t₀), y(t₀), ẏ(t₀), p₂(t₀), p₃(t₀), p₄(t₀), and thus the complete solution.
11.8-4. Matrix Notation for Control Problems. The control problem can be very conveniently expressed in the matrix notation of Sec. 13.6-1 (and also in the corresponding tensor notation, Sec. 14.7-7). One introduces

x = x(t) ≡ {x₀, x₁, x₂, …, xₙ}, an (n + 1) × 1 column matrix representing the state vector
u = u(t) ≡ {u₁, u₂, …, u_r}, an r × 1 column matrix representing the control vector
p = p(t) ≡ (p₀, p₁, p₂, …, pₙ), a 1 × (n + 1) row matrix representing the adjoint vector

(see also Sec. 14.5-2). The relations of Secs. 11.8-1 and 11.8-2 can now be restated compactly; note that $\frac{\partial f}{\partial x} \equiv \left[\frac{\partial f_i}{\partial x_k}\right]$ is an (n + 1) × (n + 1) matrix. In particular, one has
$$\frac{dx}{dt} = f(x, u) \qquad \text{(state equations)}$$

$$\frac{dp}{dt} = -p\,\frac{\partial f}{\partial x} = -\frac{\partial H}{\partial x} \qquad \text{(adjoint equations)} \tag{11.8-22}$$

$$H(p, x, u) = pf \qquad \text{(Hamiltonian function)}$$
The criterion functional (2) may be regarded as the matrix product (inner product, Sec. 14.7-1) x₀ = cx of the column matrix x and the 1 × (n + 1) row matrix c = (1, 0, 0, …, 0). Reference 11.25 treats the case of criterion functionals defined as cx with a more general row matrix c.
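A minimal Python sketch of this notation (the augmented time-optimal double integrator used below is an illustrative assumption, not the handbook's example):

```python
# Sketch of the matrix notation of Sec. 11.8-4: state column x, control u,
# adjoint row p, Hamiltonian H(p, x, u) = p f(x, u) as an inner product,
# and dp/dt = -p (df/dx).  The system is an illustrative example.

def f(x, u):
    # Augmented double integrator: x = [x0, x1, x2], scalar control u;
    # f0 = 1 (time-optimal criterion), f1 = x2, f2 = u.
    return [1.0, x[2], u]

def hamiltonian(p, x, u):
    """H = p f, the inner product of the adjoint row and f(x, u)."""
    return sum(pi * fi for pi, fi in zip(p, f(x, u)))

def adjoint_rhs(p):
    # For this f the only nonzero entry of df/dx is df1/dx2 = 1, so
    # dp/dt = -p (df/dx) gives dp0/dt = 0, dp1/dt = 0, dp2/dt = -p1.
    return [0.0, 0.0, -p[1]]

p = [-1.0, 0.5, 2.0]
x = [0.0, 1.0, -0.5]
assert hamiltonian(p, x, 1.0) == -1.0 + 0.5 * (-0.5) + 2.0 * 1.0
assert adjoint_rhs(p) == [0.0, 0.0, -0.5]
```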
11.8-5. Inequality Constraints on State Variables. Corner Conditions (see also Sec. 11.6-6). (a) If the domain X of admissible system states (x₀, x₁, x₂, …, xₙ)
is restricted by continuously differentiable inequality constraints

$$S_j(x_0, x_1, x_2, \ldots, x_n;\ u_1, u_2, \ldots, u_r) \le 0 \qquad (j = 1, 2, \ldots, M) \tag{11.8-23}$$
which are to be enforced by the control in the course of the optimal-trajectory generation (see also Sec. 11.8-1e), the theory of Sec. 11.8-2 applies without change to trajectories and portions of trajectories situated in the interior of the closed state region X defined by Eq. (23). For trajectories and portions of trajectories situated on the boundary of X, at least one of the constraints (23) becomes an equality. For simplicity,
consider a boundary region D_S with only a single equality, say

$$S(x_0, x_1, x_2, \ldots, x_n;\ u_1, u_2, \ldots, u_r) = 0 \tag{11.8-24}$$

In this boundary region, the Lagrange-multiplier method of Secs. 11.6-2 and 11.8-1e applies; i.e., each optimal trajectory satisfies Eq. (24) and the relations of Sec. 11.8-2a, with f₀ replaced by

$$F_0 = f_0 + \lambda(t)S \qquad [(x_0, x_1, x_2, \ldots, x_n) \text{ in } D_S] \tag{11.8-25}$$

where λ is a variable Lagrange multiplier. Alternatively (and equivalently), f₀ can be replaced on D_S by any one of the expressions

$$F_0 = f_0 + \mu(t)S^{(k)} \qquad [(x_0, x_1, x_2, \ldots, x_n) \text{ in } D_S;\ k = 1, 2, \ldots] \tag{11.8-26}$$

with

$$S^{(0)} \equiv S = 0 \qquad S^{(k)} \equiv \sum_{i=0}^{n} \frac{\partial S^{(k-1)}}{\partial x_i}\,f_i = 0 \qquad (k = 1, 2, \ldots) \tag{11.8-27}$$
where μ(t) is a variable Lagrange multiplier.

(b) If S^(K) is the first of the functions (27) which contains a control variable u_k explicitly, S ≤ 0 is termed a Kth-order inequality constraint. In this case, an optimal trajectory entering the boundary region D_S defined by Eq. (24) from the interior of X at the time t = t₁ must satisfy the corner conditions (jump conditions)

$$p_i(t_1 - 0) = p_i(t_1 + 0) + \sum_{k=0}^{K-2} \nu_k\left[\frac{\partial S^{(k)}}{\partial x_i}\right]_{t=t_1} + b\left[\frac{\partial S^{(K-1)}}{\partial x_i}\right]_{t=t_1} \qquad (i = 1, 2, \ldots, n)$$

$$H\big|_{t=t_1-0} = H\big|_{t=t_1+0} \tag{11.8-28}$$
where the ν_k are constant Lagrange multipliers, and b is an arbitrary constant. An optimal trajectory leaving the state-space boundary at the time t = t₂ for the first time after entering at t = t₁ must satisfy the corner conditions
$$p_i(t_2 - 0) = p_i(t_2 + 0) - b\left[\frac{\partial S^{(K-1)}}{\partial x_i}\right]_{t=t_2} \qquad (i = 1, 2, \ldots, n)$$

$$H\big|_{t=t_2-0} = H\big|_{t=t_2+0} \tag{11.8-29}$$
The arbitrary constant b occurring in both Eq. (28) and Eq. (29) is usually taken to be zero, so that the pᵢ(t) are continuous at the exit point. If an optimal trajectory is reflected by a boundary region D_S corresponding to a single constraint (24), one has t₁ = t₂, and the corner conditions (28) apply with b = 0. If the single constraint (24) is replaced by two or more such constraints, terms corresponding to each new constraint (with additional multipliers) must be added to each sum in Eqs. (25), (26), (28), and (29). Explicit time dependence of constraints is treated in the manner of Sec. 11.8-1e.
Note: See Ref. 11.25 for exceptions to the corner conditions (28), (29) in special situations.
11.8-6. The Dynamic-programming Approach (see also Sec. 11.9-2). If the optimum-control problem defined in Sec. 11.8-1 defines unique optimal extremals for fixed xᵢ(t_F) and variable xᵢ(t₀) = Xᵢ (i = 1, 2, …, n), then the criterion-function minimum

$$\min_{u_1(t), u_2(t), \ldots, u_r(t)} \int_{t_0}^{t_F} f_0(x_1, x_2, \ldots, x_n;\ u_1, u_2, \ldots, u_r)\,dt = S[x_1(t_0), x_2(t_0), \ldots, x_n(t_0);\ x_1(t_F), x_2(t_F), \ldots, x_n(t_F)] \equiv S(X_1, X_2, \ldots, X_n) \tag{11.8-30}$$

satisfies the first-order partial differential equation

$$M\left(X_1, X_2, \ldots, X_n;\ \frac{\partial S}{\partial X_1}, \frac{\partial S}{\partial X_2}, \ldots, \frac{\partial S}{\partial X_n}\right) = 0 \qquad \text{(Hamilton-Jacobi equation)} \tag{11.8-31}$$

where M is the optimal (maximized) Hamiltonian function, Eq. (17) (see also Sec. 11.6-8). The ordinary differential equations (20) are the corresponding characteristic equations; their solutions are thus readily determined if a complete integral of the partial differential equation (31) is known (Sec. 10.2-4).
The dynamic-programming approach "imbeds" a given optimum-control problem in an entire class of similar problems with different initial coordinates. The partial differential equation (31), which solves this entire set of problems, expresses the fact that every portion of an optimal trajectory optimizes the criterion functional for the corresponding initial and terminal points (Bellman's Principle of Optimality). It is possible to derive the entire optimum-control theory under very general assumptions from the principle of optimality. In general, numerical solution of Eq. (31) is difficult for n > 2 (see Refs. 11.13 to 11.15, which also discuss dynamic programming in a more general context).

11.9. STEPWISE-CONTROL PROBLEMS AND DYNAMIC PROGRAMMING
11.9-1. Problem Statement. An important class of optimization problems involving stepwise control (this includes control of continuous systems by digital computers) or stepwise optimal resource allocation can be formulated as follows: a system described by a set of state variables x ≡ (x₁, x₂, …, xₙ) proceeds through a sequence of states ${}^0x, {}^1x, {}^2x, \ldots$ such that each state change is given by the state equations (in
11.9-2
MAXIMA AND MINIMA AND OPTIMIZATION PROBLEMS
370
this case, difference equations; see also Sec. 20.4-3)

$${}^{k+1}x_i = f_i({}^k x_1, {}^k x_2, \ldots, {}^k x_n;\ {}^{k+1}u_1, {}^{k+1}u_2, \ldots, {}^{k+1}u_r) \quad (i = 1, 2, \ldots, n) \qquad \text{or} \qquad {}^{k+1}x = f({}^k x, {}^{k+1}u) \tag{11.9-1}$$

where the "control variable" ${}^{k+1}u \equiv ({}^{k+1}u_1, {}^{k+1}u_2, \ldots, {}^{k+1}u_r)$ defines the sequence of decisions (policy) changing the kth system state into the (k + 1)st state. Given the initial state ${}^0x$ (and, possibly, a set of suitable equality or inequality constraints on state and control variables), it is desired to find an optimal policy ${}^1u, {}^2u, \ldots, {}^Nu$ which will minimize a given criterion function

$${}^N x_0 = \sum_{k=0}^{N-1} f_0({}^k x, {}^{k+1}u) + h({}^N x) = {}^N x_0({}^0 x) \tag{11.9-2}$$

where N = 1, 2, … is the number of steps considered (dynamic programming). As in continuous optimal-control problems, the initial and final states may or may not be given by the problem; and it is, again, possible to generalize the problem statement in the manner of Sec. 11.8-1e.
11.9-2. Bellman's Principle of Optimality (see also Sec. 11.8-6). If ${}^1u, {}^2u, \ldots, {}^Nu$ is an optimal policy resulting in the state sequence ${}^0x, {}^1x, {}^2x, \ldots, {}^Nx$ for a given dynamic-programming problem with initial state ${}^0x$, then ${}^2u, {}^3u, \ldots, {}^Nu$ is an optimal policy for the same criterion function and final state ${}^Nx$ but initial state ${}^1x$. If one denotes $\min\, {}^N x_0(X)$ by ${}^N S(X)$, the optimality principle leads to the fundamental recurrence relation (partial difference equation, Sec. 20.4-3b)

$${}^N S(X) = \min_{{}^1u}\,\bigl\{f_0(X, {}^1u) + {}^{N-1}S[f(X, {}^1u)]\bigr\} \qquad (N = 2, 3, \ldots)$$

with

$${}^1 S(X) = \min_{{}^1u}\, f_0(X, {}^1u)$$

where minima are determined in accordance with any given constraints. Numerical solution of this functional equation for the unknown function ${}^N S(X)$ amounts to stepwise construction of a class of optimal policies for many initial states. The desired optimal policy is "imbedded" in this class. Solution usually requires digital computation; even so, solution of problems with more than two or three state variables xᵢ is practical only in special cases. References 11.12 to 11.18 describe many examples and approximation methods, and Ref. 11.20 discusses a stepwise-control analogy to the Maximum Principle.
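The recurrence can be sketched directly in Python for a small discrete problem; the dynamics f and stage cost f₀ below are illustrative assumptions, not an example from the handbook:

```python
# Sketch of the recurrence NS(X) = min_u { f0(X, u) + (N-1)S(f(X, u)) }
# for a small discrete stepwise-control problem: integer state X,
# control u in {0, 1, 2}, dynamics f(X, u) = max(X - u, 0), and
# stage cost f0(X, u) = X + u**2 (all illustrative choices).
from functools import lru_cache

CONTROLS = (0, 1, 2)

def f(X, u):
    """Illustrative dynamics: the control drains the state (floor at 0)."""
    return max(X - u, 0)

def f0(X, u):
    """Illustrative stage cost: holding cost X plus control effort u**2."""
    return X + u * u

@lru_cache(maxsize=None)
def S(N, X):
    """Minimum N-step cost from state X, via the optimality recurrence."""
    if N == 1:
        return min(f0(X, u) for u in CONTROLS)
    return min(f0(X, u) + S(N - 1, f(X, u)) for u in CONTROLS)

assert S(1, 5) == 5      # a single step: best control is u = 0
assert S(2, 5) == 10
```

Memoizing S over (N, X) is exactly the "stepwise construction of a class of optimal policies for many initial states" described above; the answer for any particular initial state is imbedded in the table.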
11.10. RELATED TOPICS, REFERENCES, AND BIBLIOGRAPHY
11.10-1. Related Topics.
The following topics related to the study of maxima and minima are treated in other chapters of this handbook:

Ordinary differential equations: Chap. 9
Partial differential equations, canonical transformations: Chap. 10
Eigenvalues of matrices and quadratic forms: Chap. 13
Sturm-Liouville problems: Chap. 15
Statistical decisions: Chap. 19
Numerical methods: Chap. 20
11.10-2. References and Bibliography (see also Sec. 20.2-6 for numerical optimization techniques). Linear and Nonlinear Programming, Theory of Games
11.1. Abadie, J.: On the Kuhn-Tucker Theorem, ORC 65-18, Operations Research Center, University of California, Berkeley, 1965.
11.2. Dantzig, G. B.: Linear Programming and Extensions, Princeton University Press, Princeton, N.J., 1963.
11.3. Dresher, M.: Games of Strategy: Theory and Applications, Prentice-Hall, Englewood Cliffs, N.J., 1961.
11.4. Gale, D.: The Theory of Linear Economic Models, McGraw-Hill, New York, 1960.
11.5. Gass, S. I.: Linear Programming, 2d ed., McGraw-Hill, New York, 1964.
11.6. Hadley, G.: Linear Programming, Addison-Wesley, Reading, Mass., 1962.
11.7. Karlin, S.: Mathematical Methods and Theory in Games, Programming, and Economics, 3d ed., Addison-Wesley, Reading, Mass., 1959.
11.8. Kuhn, H. W., and A. W. Tucker: Nonlinear Programming, in Proc. 2d Berkeley Symp. on Math. Stat. and Prob., vol. 5, University of California Press, Berkeley, 1952.
11.9. Luce, R. D., and H. Raiffa: Games and Decisions, Wiley, New York, 1957.
11.10. Vajda, S.: Theory of Games and Linear Programming, Wiley, New York, 1956.
11.11. Boot, J. C. G.: Quadratic Programming, North Holland Publishing Company, Amsterdam, 1964.
See also the articles by L. Rosenberg and E. M. Beale in M. Klerer and G. A. Korn: Digital Computer User's Handbook, McGraw-Hill, New York, 1967. Calculus of Variations, Optimum-control Theory, and Dynamic Programming
11.12. Aris, Rutherford: Discrete Dynamic Programming, Wiley, New York, 1963.
11.13. Bellman, R.: Dynamic Programming, Princeton University Press, Princeton, N.J., 1957.
11.14. Bellman, R., and S. E. Dreyfus: Applied Dynamic Programming, Princeton University Press, Princeton, N.J., 1962.
11.15. Bellman, R., and R. Kalaba: Dynamic Programming and Modern Control Theory, Academic, New York, 1965.
11.16. Bliss, G. A.: Calculus of Variations, Open Court, Chicago, 1925.
11.17. Bolza, O.: Vorlesungen über Variationsrechnung, Stechert, New York, 1931.
11.18. Dreyfus, S. E.: Dynamic Programming and the Calculus of Variations, Academic, New York, 1965.
11.19. Elsgolc, L. E.: Calculus of Variations, Pergamon/Addison-Wesley, Reading, Mass., 1962.
11.20. Fan, Liang-Tsen, and Chin-sen Wang: The Discrete Maximum Principle, Wiley, New York, 1964.
11.21. Fan, Liang-Tsen: The Continuous Maximum Principle, Wiley, New York, 1966.
11.22. Gelfand, I. M., and S. V. Fomin: Calculus of Variations, Prentice-Hall, Englewood Cliffs, N.J., 1963.
11.23. Leitmann, G.: Optimization Techniques, Academic, New York, 1962.
11.24. Merriam, C. W.: Optimization Theory and the Design of Feedback Control Systems, McGraw-Hill, New York, 1964.
11.25. Pontryagin, L. S., et al.: The Mathematical Theory of Optimal Processes, Wiley, New York, 1962.
11.26. Weinstock, R.: Calculus of Variations, McGraw-Hill, New York, 1952.
11.27. Athans, M., and P. L. Falb: Optimal Control, McGraw-Hill, New York, 1966.
CHAPTER 12

DEFINITION OF MATHEMATICAL MODELS: MODERN (ABSTRACT) ALGEBRA AND ABSTRACT SPACES
12.1. Introduction
    12.1-1. Mathematical Models
    12.1-2. Survey
    12.1-3. "Equality" and Equivalence Relations
    12.1-4. Transformations, Functions, Operations
    12.1-5. Invariance
    12.1-6. Representation of One Model by Another: Homomorphisms and Isomorphisms
12.2. Algebra of Models with a Single Defining Operation: Groups
    12.2-1. Definition and Basic Properties of a Group
    12.2-2. Subgroups
    12.2-3. Cyclic Groups. Order of a Group Element
    12.2-4. Products of Complexes. Cosets
    12.2-5. Conjugate Elements and Subgroups. Normal Subgroups. Factor Group
    12.2-6. Normal Series. Composition Series
    12.2-7. Center. Normalizers
    12.2-8. Groups of Transformations or Operators
    12.2-9. Homomorphisms and Isomorphisms of Groups. Representation of Groups
    12.2-10. Additive Groups. Residue Classes and Congruence
    12.2-11. Continuous and Mixed-continuous Groups
    12.2-12. Mean Values
12.3. Algebra of Models with Two Defining Operations: Rings, Fields, and Integral Domains
    12.3-1. Definitions and Basic Theorems
    12.3-2. Subrings and Subfields. Ideals
    12.3-3. Extensions
12.4. Models Involving More Than One Class of Mathematical Objects: Linear Vector Spaces and Linear Algebras
    12.4-1. Linear Vector Spaces
    12.4-2. Linear Algebras
12.5. Models Permitting the Definition of Limiting Processes: Topological Spaces
    12.5-1. Topological Spaces
    12.5-2. Metric Spaces
    12.5-3. Topology, Neighborhoods, and Convergence in a Metric Space
    12.5-4. Metric Spaces with Special Properties. Point-set Theory
    12.5-5. Examples: Spaces of Numerical Sequences and Functions
    12.5-6. Banach's Contraction-mapping Theorem and Successive Approximations
12.6. Order
    12.6-1. Partially Ordered Sets (a) Order (b) Lattices
    12.6-2. Simply Ordered Sets
    12.6-3. Ordered Fields
12.7. Combination of Models: Direct Products, Product Spaces, and Direct Sums
    12.7-1. Introduction. Cartesian Products
    12.7-2. Direct Products of Groups
    12.7-3. Direct Products of Real Vector Spaces
    12.7-4. Product Space
    12.7-5. Direct Sums
12.8. Boolean Algebras
    12.8-1. Boolean Algebras
    12.8-2. Boolean Functions. Reduction to Canonical Form
    12.8-3. The Inclusion Relation
    12.8-4. Algebra of Classes
    12.8-5. Isomorphisms of Boolean Algebras. Venn Diagrams
    12.8-6. Event Algebras and Symbolic Logic
    12.8-7. Representation of Boolean Functions by Truth Tables. Karnaugh Maps
    12.8-8. Complete Additivity and Measure Algebras
12.9. Related Topics, References, and Bibliography
    12.9-1. Related Topics
    12.9-2. References and Bibliography
12.1. INTRODUCTION
12.1-1. Mathematical Models. Physical processes are, generally speaking, described in terms of operations (observations, experiments) relating physical objects. The complexity of actual physical situations calls for simplified descriptions in terms of verbal, symbolic, and even physical models which "abstract" suitably chosen "essential" properties of physical objects and operations.

Mathematics in the most general sense deals with the definition and manipulation of symbolic models. A mathematical model involves a class of undefined (abstract, symbolic) mathematical objects, such as numbers or vectors, and relations between these objects. A mathematical relation is a hypothetical rule which associates two or more of the undefined objects (see also Secs. 12.1-3 and 12.8-1). Many relations are described in terms of mathematical operations associating one or more objects (operand, operands) with another object or set of objects (result). The abstract model, with its unspecified objects, relations, and operations, is defined by a self-consistent set of rules (defining postulates) which introduce the operations to be employed and state general relations between their results (descriptive definition of a mathematical
model in terms of its properties; see Secs. 12.2-1, 12.3-1, 12.4-1, 12.5-2, 12.6-1, 12.8-1, and 1.1-2 for examples). A constructive definition introduces a new mathematical model in terms of previously defined mathematical concepts (e.g., definition of matrix addition and multiplication in terms of numerical addition and multiplication, Sec. 13.2-2). The self-consistency of a descriptive definition must be demonstrated by construction or exhibition of an example satisfying the defining postulates (existence proof, see also Secs. 4.2-16 and 9.1-4). In addition, it is customary to test the defining postulates for mutual independence.

A mathematical model will reproduce suitably chosen features of a physical situation if it is possible to establish rules of correspondence relating specific physical objects and relationships to corresponding mathematical objects and relations. It may also be instructive and/or enjoyable to construct mathematical models which do not match any counterpart in the physical world. The most generally familiar mathematical models are the integer and real-number systems (Sec. 1.1-2) and Euclidean geometry; the defining properties of these models are more or less directly abstracted from physical experience (counting, ordering, comparing, measuring).

Objects and operations of more general mathematical models are often labeled with sets of real numbers, which may be related to the results of physical measurements. The resulting "representations" of mathematical models in terms of numerical operations are discussed specifically in Chaps. 14 and 16.

Questions as to the self-consistency, sufficiency, and applicability of the logical rules used to manipulate mathematical statements form the subject matter of metamathematics, which is by no means a finished structure (Ref. 12.16).
12.1-2. Survey. Modern (abstract) algebra* deals with mathematical models defined in terms of binary operations ("algebraic" operations, usually referred to as various types of "addition" and "multiplication") which associate pairs of mathematical objects (operands, or operator and operand) with corresponding results. Sections 12.2-1 through 12.4-2 introduce some of the most generally useful models of this kind, notably groups, rings, fields, vector spaces, and linear algebras; Boolean algebras are treated separately in Secs. 12.8-1 through 12.8-6. The important subject of linear transformations (linear operators) and their eigenvectors and eigenvalues is introduced in Chap. 14. The representation of vectors and operators in terms of numerical components and matrices is discussed in detail in Chaps. 14, 15, and 16.

Sections 12.5-1 through 12.6-3 serve as a brief introduction to mathematical models which permit the definition of limiting processes and

* The word algebra has three loosely related meanings: (1) a general subject, as used here (abstract algebra, elementary algebra); (2) the theory of algebraic operations used in connection with a specific model (matrix algebra, tensor algebra); (3) a type of mathematical model (a linear algebra, a Boolean algebra).
order. In particular, Secs. 12.5-2 through 12.5-4 deal with metric spaces. Sections 12.7-1 through 12.7-5 discuss simple schemes for combining mathematical models (direct products and direct sums).

12.1-3. "Equality" and Equivalence Relations. (a) The descriptive definition of each class of mathematical objects discussed in this chapter is understood to imply the existence of a rule stating whether or not two given mathematical objects a, b are "equal" (equivalent or indistinguishable in the context of the model, a = b); the rule must be such that

1. a = a (reflexivity of the equality relation)
2. a = b implies b = a (symmetry)
3. a = b, b = c implies a = c (transitivity)

Sections 13.2-2 and 16.3-1 show examples of the definition of equality in constructively defined models.

(b) More generally, any relation a ~ b between two objects a, b of a class C is called an equivalence relation if and only if it is reflexive (a ~ a), symmetric (a ~ b implies b ~ a), and transitive (a ~ b, b ~ c implies a ~ c). Every equivalence relation defines a partition of the class C, i.e., a subdivision of C into subclasses without common elements. Elements of the same subclass are equivalent with respect to the properties defining the equivalence relation.

EXAMPLES: Equality, identity of functions (Sec. 1.1-4), congruence and similarity of triangles, metric equality (Sec. 12.5-2), isomorphism (Sec. 12.1-6).
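A small Python sketch (illustrative, not from the handbook) of an equivalence relation and the partition it induces, using congruence modulo 3 on the integers:

```python
# Congruence mod n is an equivalence relation on the integers; the
# induced partition consists of the residue classes.  Illustrative sketch.

def equivalent(a, b, n=3):
    """a ~ b if and only if a and b are congruent modulo n."""
    return (a - b) % n == 0

# Reflexivity, symmetry, transitivity (spot checks):
assert equivalent(7, 7)
assert equivalent(2, 5) and equivalent(5, 2)
assert equivalent(1, 4) and equivalent(4, 7) and equivalent(1, 7)

# The induced partition of {0, ..., 8} into subclasses without common
# elements (the three residue classes mod 3):
classes = {}
for x in range(9):
    classes.setdefault(x % 3, []).append(x)
assert sorted(classes[0]) == [0, 3, 6]
assert sorted(classes[1]) == [1, 4, 7]
```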
12.1-4. Transformations, Functions, Operations (see also Secs. 4.2-1, 14.1-3, and 14.3-1). A set of rules x → x′ associating an object x′ of a class C′ with each object x of a class C is called a transformation (mapping) of C into C′; x′ is a function

x′ = x′(x) = f(x)   (12.1-1)

of the argument x, with domain (domain of definition) C and range C′. C and C′ may or may not be the same class of objects. The correspondence (1) can be regarded as an operation on the operand x producing the result (transform) x′. Given suitable definitions of equality in C and C′ (Sec. 12.1-3), an operation (transformation, function) (1) is well defined if and only if x = y implies x′ = y′. This property will be understood to apply unless the contrary is specifically stated.

A correspondence (1) is unique [f(x) is a single-valued function of x] if and only if a unique object x′ corresponds to each object x. The correspondence is a reciprocal one-to-one (biunique) correspondence of C and C′ if and only if the relation (1) maps C uniquely onto all of C′ and defines a unique inverse correspondence (inverse transformation). Many authors categorically define every mapping as unique (see also footnote to Sec. 4.2-2). The set of pairs (x, x′) is called the graph of the function f(x).
Each object x in Eq. (1) can be a set of objects x₁, x₂, …, so that it is possible to define functions x′ = f(x₁, x₂, …) of two or more arguments. A numerical (real or complex) function defined on a set of functions is called a functional.

12.1-5. Invariance (see also Secs. 12.2-8, 13.4-1, 14.4-5, 16.1-4, and 16.4-1). Given a transformation (1) of a class (space) C into itself, any function F(x, y, …) such that F[f(x), f(y), …] = F(x, y, …) for all x, y, … in C, and any relation φ(x, y, …) = A which implies φ[f(x), f(y), …] = A for all x, y, … in C, is called invariant with respect to the transformation (1).
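A quick numerical spot check of invariance (illustrative Python, not from the handbook): the function F(x, y) = x² + y² is invariant under plane rotations:

```python
# F(x, y) = x*x + y*y is invariant under the rotation
# (x, y) -> (x cos t - y sin t, x sin t + y cos t).  Illustrative sketch;
# the sample point and angle are arbitrary.
import math

def rotate(x, y, t):
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t))

def F(x, y):
    return x * x + y * y

x, y, t = 1.2, -0.7, 0.9
xr, yr = rotate(x, y, t)
assert abs(F(xr, yr) - F(x, y)) < 1e-12
```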
12.1-6. Representation of One Model by Another: Homomorphisms and Isomorphisms. Let M be a mathematical model (Sec. 12.1-1) involving objects a, b, … and operations O, P, … whose results O(a, b, …), P(a, b, …), … are elements of M.* A second model M′ is a homomorphic image of M with respect to the operation O(a, b, …) if and only if

1. There exists a unique correspondence a → a′, b → b′, … relating one of the elements a′, b′, … of M′ to each element of M.
2. It is possible to define an operation O′ on the elements of M′ so that O(a, b, …) → O′(a′, b′, …).

A correspondence with these properties is called a homomorphism of M to M′ and preserves all relations based on the operation in question; i.e., each such relation between elements a, b, … of M implies a corresponding relation between the elements a′, b′, … of M′. A homomorphism mapping the set of elements of M into itself (onto a subset of itself) is called an endomorphism.

An isomorphism is a homomorphism involving reciprocal one-to-one correspondence. Two models M and M′ so related are isomorphic with respect to the operation in question; in this case both M → M′ and M′ → M are homomorphisms. An isomorphism mapping the set of elements of M onto itself is an automorphism of M.

* The elements a, b, … need not belong to a single class of mathematical objects (e.g., there may be vectors and scalars, Sec. 12.4-1).
The concept of a homomorphism, and the related concepts of isomorphism and automorphism, are of the greatest practical importance, since they permit the representation of one model by another. One may, in particular, represent mathematical objects by sets of real numbers (analytical geometry, matrix and tensor representations). Note that isomorphism is an equivalence relation (Sec. 12.1-3b) between models: properties of an entire class of isomorphic models may be derived from, or discussed in terms of, the properties of any model of the class. Some writers require each homomorphism M → M′ to map M onto all of M′. In this case the homomorphism defines an isomorphism relating disjoint classes of elements of M to elements of M′.
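A concrete Python sketch (illustrative, not from the handbook): reduction modulo n is a homomorphism of the additive group of integers onto the additive group of residues mod n, i.e., O(a, b) = a + b maps to O′(a′, b′) = (a′ + b′) mod n:

```python
# The correspondence a -> a mod n maps the result of addition in M (the
# integers) to the result of addition in M' (residues mod n), so every
# additive relation in M is preserved in M'.  Illustrative sketch; n = 6
# is an arbitrary choice.
n = 6

def to_residue(a):
    return a % n

def O(a, b):             # operation in M
    return a + b

def O_prime(a_, b_):     # corresponding operation in M'
    return (a_ + b_) % n

for a in range(-10, 11):
    for b in range(-10, 11):
        assert to_residue(O(a, b)) == O_prime(to_residue(a), to_residue(b))
```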
12.2. ALGEBRA OF MODELS WITH A SINGLE DEFINING OPERATION: GROUPS
12.2-1. Definition and Basic Properties of a Group. (a) A class G of objects (elements) a, b, c, . . . is a group if and only if it is possible to define a binary operation (rule of combination) which associates an object (result) a ○ b with every pair of elements a, b of G so that

1. a ○ b is an element of G (closure under the defining operation)
2. a ○ (b ○ c) = (a ○ b) ○ c (associative law)
3. G contains a (left) identity (identity element) E such that, for each element a of G, E ○ a = a
4. For each element a of G, G contains a (left) inverse a⁻¹ such that a⁻¹ ○ a = E
Two elements a, b of a group commute if and only if a ○ b = b ○ a. If all elements a, b of a group G commute, the defining operation of G is commutative, and G is a commutative or Abelian group. A group G containing a finite number g of elements is of finite order g; otherwise G is an infinite group (of infinite order). In the latter case, G may or may not be a countable set (see Sec. 12.2-11 for a brief discussion of continuous groups).
(b) Every group G has a unique left identity and a unique right identity, and the two are identical (E ○ a = a ○ E = a). Each element a has a unique left inverse and a unique right inverse, and the two are identical (a⁻¹ ○ a = a ○ a⁻¹ = E). Hence

    c ○ a = c ○ b implies a = b
    a ○ c = b ○ c implies a = b    (cancellation laws)    (12.2-1)
G contains a unique solution x of every equation c ○ x = b or x ○ c = b; i.e., unique right and left "division" is possible.
(c) The operation defining a group is often referred to as (abstract) multiplication (note, however, Sec. 12.2-10); its result is then written as a product ab, and the inverse of a is written as a⁻¹. This convention is freely used in the following sections.

Multiple products aa, aaa, . . . are written as integral powers a², a³, . . . , with (a⁻¹)ⁿ = a⁻ⁿ, a⁰ = E, (a⁻¹)⁻¹ = a. Note

    aᵐaⁿ = aᵐ⁺ⁿ        (aᵐ)ⁿ = aᵐⁿ        (ab)⁻¹ = b⁻¹a⁻¹    (12.2-2)
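For a finite group the axioms of Sec. 12.2-1a are finitely many checks, so they can be verified by brute force. The sketch below (the helper name `is_group` is ours, not the handbook's) tests closure, associativity, a left identity, and left inverses for the integers under addition modulo n:

```python
from itertools import product

def is_group(elements, op):
    """Check the four group axioms of Sec. 12.2-1a by brute force."""
    elements = list(elements)
    # 1. Closure: a o b must lie in the set for every pair.
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # 2. Associativity: a o (b o c) == (a o b) o c.
    if any(op(a, op(b, c)) != op(op(a, b), c)
           for a, b, c in product(elements, repeat=3)):
        return False
    # 3. A left identity E with E o a == a for all a.
    identities = [e for e in elements if all(op(e, a) == a for a in elements)]
    if not identities:
        return False
    e = identities[0]
    # 4. A left inverse for each a: some x with x o a == E.
    return all(any(op(x, a) == e for x in elements) for a in elements)

n = 5
print(is_group(range(n), lambda a, b: (a + b) % n))       # True: Z5 under addition
print(is_group(range(1, n), lambda a, b: (a * b) % n))    # True: nonzero residues mod 5 under multiplication
```

Note that the nonzero residues form a multiplicative group here only because 5 is prime; modulo 4 the element 2 has no inverse and the test fails.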
12.2-2. Subgroups. A subset G₁ of a group G is a subgroup of G if and only if G₁ is a group with respect to the defining operation of G. This is true (1) if and only if G₁ contains all products and inverses of its elements, i.e., (2) if and only if G₁ contains the product ab⁻¹ for each pair of elements a, b in G₁. The intersection (Sec. 4.3-2a) of two subgroups of G is a subgroup of G. G and E are improper subgroups of G; all other subgroups of G are proper subgroups. If G is of finite order g (Sec. 12.2-1a), the order g₁ of every subgroup G₁ of G is a divisor of g (Lagrange's Theorem); g/g₁ is called the index of G₁ (with respect to G).

12.2-3. Cyclic Groups. Order of a Group Element. A cyclic group consists of the powers a⁰ = E, a, a², . . . of a single element a and is necessarily commutative. Every group of prime order and every subgroup of a cyclic group is cyclic. Each element a of any group G "generates" a cyclic subgroup of G, the period of a. The order of this subgroup is called the order of the element a and is equal to the least positive integer m such that aᵐ = E.
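For a finite group, the period and the order of an element can be computed by repeated composition until the identity recurs. A minimal sketch (the helper names `period` and `order` are ours), applied to the additive group of integers modulo 12:

```python
def period(a, op, identity):
    """Cyclic subgroup ("period") generated by a: E, a, a o a, ... (Sec. 12.2-3)."""
    powers, x = [identity], op(identity, a)
    while x != identity:
        powers.append(x)
        x = op(x, a)
    return powers

def order(a, op, identity):
    """Order of a: least positive m with a^m = E, i.e., the size of its period."""
    return len(period(a, op, identity))

add12 = lambda x, y: (x + y) % 12
print(period(3, add12, 0))   # [0, 3, 6, 9]: a cyclic subgroup of Z12
print(order(3, add12, 0))    # 4
print(order(5, add12, 0))    # 12: the element 5 generates all of Z12
```

Consistent with Lagrange's theorem (Sec. 12.2-2), each of these orders divides the group order 12.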
12.2-4. Products of Complexes. Cosets. (a) Every subset C₁ of a group G is called a complex. The product* C₁C₂ of two complexes C₁, C₂ of G is the set of all (different) products a₁a₂ of an element a₁ of C₁ and an element a₂ of C₂. C₁C₁ = C₁ if and only if C₁ is a subgroup of G. The product G₁G₂ of two subgroups of G is a subgroup of G if and only if G₁G₂ = G₂G₁.
(b) A left coset xG₁ of a subgroup G₁ of G is the set of all products xa₁ of a given element x of G and any element a₁ of G₁. A right coset G₁x is, similarly, the set of all products a₁x. A coset of G₁ is a subgroup of G if and only if x is an element of G₁; in this case xG₁ = G₁x = G₁. Two left cosets of G₁ are either identical or have no common element; the same is true for two right cosets. Every subgroup G₁ of G defines a partition (Sec. 12.1-3) of G into a finite or infinite number n₁ of left cosets, and a partition of G into n₁ right cosets; if G is of finite order g, n₁ equals the index g/g₁ of G₁ (Sec. 12.2-2). Two elements a, b of G belong to the same left coset of G₁ if G₁ contains a⁻¹b, and to the same right coset of G₁ if G₁ contains ba⁻¹.
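The coset partition is easy to exhibit for a small additive group. The following sketch (the function name `left_cosets` is ours) lists the distinct cosets of the subgroup {0, 3, 6, 9} in Z₁₂ and confirms that their number equals the index g/g₁:

```python
def left_cosets(G, G1, op):
    """Distinct left cosets xG1 (Sec. 12.2-4b); for an Abelian group these
    coincide with the right cosets."""
    cosets = []
    for x in G:
        coset = frozenset(op(x, a1) for a1 in G1)
        if coset not in cosets:
            cosets.append(coset)
    return cosets

G = range(12)                 # Z12 under addition mod 12
G1 = [0, 3, 6, 9]             # a subgroup of order g1 = 4
cosets = left_cosets(G, G1, lambda x, y: (x + y) % 12)
print(len(cosets))            # 3 = the index g/g1 = 12/4 (Lagrange's theorem)
print(sorted(sorted(c) for c in cosets))
# [[0, 3, 6, 9], [1, 4, 7, 10], [2, 5, 8, 11]]: a partition of G
```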
12.2-5. Conjugate Elements and Subgroups. Normal Subgroups. Factor Group. (a) Two elements x and x′ of a group G are conjugate if and only if they are related by a transformation (Sec. 12.1-4)

    x′ = a⁻¹xa    (or x = ax′a⁻¹)    (12.2-3)

* The product C₁C₂ of two complexes must not be confused with their intersection (logical product, Sec. 4.3-2a) C₁ ∩ C₂, or with a direct product (Sec. 12.7-2).
where a is an element of G. x′ is then called the transform of x under conjugation by a. Conjugation is an equivalence relation (Sec. 12.1-3b) and defines a partition of G into classes of conjugate elements.
(b) A transformation (3) transforms each subgroup G₁ of G into a conjugate subgroup G₁′ = a⁻¹G₁a. G₁ is mapped onto itself (G₁′ = G₁) for every a in G (1) if and only if G₁ commutes with every element a of G (aG₁ = G₁a), or (2) if and only if G₁ contains all conjugates of its elements. A subgroup distinguished by these (equivalent) properties is called a normal subgroup (normal divisor, invariant subgroup) of G.

Every subgroup of index 2 (Sec. 12.2-2) is a normal subgroup of G. A simple group contains no normal subgroup except itself and E.
(c) The left cosets (Sec. 12.2-4b) of a normal subgroup are identical with the corresponding right cosets and constitute a group with respect to the operation of multiplication defined in Sec. 12.2-4a; this group is the factor group (quotient group) G/G₁. If G is of finite order, the order of G/G₁ equals the index g/g₁ of G₁ (Sec. 12.2-2).

12.2-6. Normal Series. Composition Series. For any group G, a normal series is a (finite) sequence of subgroups G₀ = G, G₁, G₂, . . . , Gₘ = E such that each Gᵢ is a normal subgroup of Gᵢ₋₁. A normal series is a composition series of G if and only if each Gᵢ is a proper normal subgroup of Gᵢ₋₁ such that no further normal subgroups can be "interpolated" between Gᵢ₋₁ and Gᵢ, i.e., if and only if each composition factor Gᵢ₋₁/Gᵢ is a simple group (Sec. 12.2-5). Given any two composition series of the same group G, there exists a reciprocal one-to-one correspondence between their respective elements such that corresponding elements are isomorphic groups (Jordan-Hölder Theorem). G is a solvable group if and only if all its composition factors are cyclic groups (Sec. 12.2-3).
12.2-7. Center. Normalizers. (a) The set of all elements of G which commute with every element of G is a normal subgroup of G, the center or central of G.

(b) The set of all elements of G which commute with a given element a of G is a subgroup of G (normalizer of a), which contains the period of a (Sec. 12.2-3) as a normal subgroup. The set of elements of G which commute with (every element of) a given subgroup G₁ of G is a subgroup of G (normalizer of G₁), which contains G₁ as a normal subgroup. The number of different elements or subgroups of G conjugate (Sec. 12.2-5) to the element or subgroup associated with a normalizer equals the index (Sec. 12.2-2) of the normalizer.
12.2-8. Groups of Transformations or Operators (see also Secs. 12.1-4, 14.9-1 to 14.10-7, and 16.1-2). The set G of all reciprocal one-to-one (nonsingular) transformations x′ = f(x) of any class C onto itself is a group; the defining operation is the successive application of two transformations (multiplication of transformations or operators). Given any subgroup G₁ of G, two objects x, x′ related by a transformation x′ = f(x) of G₁ are equivalent under G₁. This relationship is an equivalence relation (Sec. 12.1-3), so that each subgroup G₁ defines a partition (classification) of C. Every
property invariant (Sec. 12.1-5) with respect to (all transformations of) G₁ is common to all objects x equivalent under G₁. A set of (usually numerical) functions F₁(x), F₂(x), . . . invariant with respect to G₁ is a complete set of invariants with respect to G₁ if and only if the set of function values uniquely defines the equivalence class of any given object x of C.

EXAMPLES OF TRANSFORMATION GROUPS: All n! permutations (see also Appendix C) of n elements (symmetric group on n elements); all n!/2 even permutations corresponding to even numbers of interchanges of n elements (alternating group on n elements). Every permutation of a countable set S of objects can be expressed as a product of cyclic permutations of subsets of S so that no two such cycles affect the same elements of S.
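The cycle decomposition asserted in the last sentence can be computed by following each element until its orbit closes. A sketch (the function name is ours), for permutations of {0, . . . , n − 1} given as image lists:

```python
def cycle_decomposition(perm):
    """Write a permutation of {0, ..., n-1}, given as a list with
    perm[i] = image of i, as a product of disjoint cycles."""
    seen, cycles = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        cycle, i = [], start
        while i not in seen:        # follow start -> perm[start] -> ... until the orbit closes
            seen.add(i)
            cycle.append(i)
            i = perm[i]
        if len(cycle) > 1:          # fixed points are conventionally suppressed
            cycles.append(tuple(cycle))
    return cycles

# the permutation 0->1, 1->2, 2->0, 3->4, 4->3, 5->5 of six elements:
print(cycle_decomposition([1, 2, 0, 4, 3, 5]))   # [(0, 1, 2), (3, 4)]
```

No two of the returned cycles affect the same element, in accordance with the theorem.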
12.2-9. Homomorphisms and Isomorphisms of Groups. Representation of Groups (see also Secs. 12.2-4, 12.2-5, and 14.9-1 to 14.10-7). (a) A homomorphism or isomorphism (with respect to the defining operation, Sec. 12.1-6) maps the elements a, b, . . . of a given group G onto the elements a′, b′, . . . of a new group G′ so that (ab)′ = a′b′. The identity of G is mapped onto the identity of G′, and inverses are mapped onto inverses.

For every homomorphism of G onto G′, the identity of G′ corresponds to a normal subgroup G_E of G, and each element of G′ corresponds to a coset of G_E; the factor group G/G_E is isomorphic with G′. G_E is called the kernel of the homomorphism. Every normal subgroup G₁ of G is the kernel of a homomorphism mapping G onto the factor group G/G₁. The class of all automorphisms of any group G is a group; the class of all transformations (3) of G (inner automorphisms of G) is a subgroup of the automorphism group. Conjugate subgroups are necessarily isomorphic.

(b) Every group G can be realized (represented) by a homomorphism relating the group elements to a group of transformations mapping some class of objects onto itself (Cayley's Theorem; see also Sec. 12.2-8). In particular, every group of finite order is isomorphic to a group of permutations (regular representation of a finite group, Sec. 14.9-1a). Refer to Secs. 14.9-1 to 14.10-7 for representations of groups in terms of linear transformations and matrices.
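The kernel G_E and the order count |G|/|G_E| = |G′| can be checked directly for the homomorphism n → n mod 4 mapping Z₁₂ onto Z₄ (the helper name `kernel` is ours):

```python
def kernel(G, Gp_identity, phi):
    """Elements of G mapped onto the identity of G' (the kernel G_E)."""
    return [a for a in G if phi(a) == Gp_identity]

G = list(range(12))           # Z12 under addition mod 12
phi = lambda n: n % 4         # a homomorphism of Z12 onto Z4
K = kernel(G, 0, phi)
print(K)                      # [0, 4, 8]: a (normal) subgroup of Z12
# the factor group G/K has as many elements as the image G':
print(len(G) // len(K))       # 4 = order of Z4
```

The homomorphism property ((a + b) mod 12 is mapped onto (a′ + b′) mod 4) holds for every pair a, b, as the verifier below confirms.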
12.2-10. Additive Groups. Residue Classes and Congruence. (a) The defining operation of a commutative group (Abelian group, Sec. 12.2-1a) is often referred to as (abstract) addition. Its result may be written as a sum a + b, and the identity and inverse are written, respectively, as 0 (zero or null element) and −a, so that a + (−b) = a − b. The group is then called an additive group (addition group, module), and expressions like a + a, a + a + a, −a − a, . . . may be written as 2a, 3a, −2a, . . . .

(b) Every subgroup of a commutative group is a normal subgroup. The cosets of a subgroup G₁ of an additive group G are called residue classes modulo G₁; two elements a, b of G belonging to the same residue class (i.e., G₁ contains a − b; see also Sec. 12.2-4b) are congruent modulo G₁ [a ≡ b(G₁)]. Congruence is an equivalence relation (Sec. 12.1-3; see also Sec. 12.2-4b). The factor group G/G₁ of an additive group G is the group of residue classes modulo G₁ and may be denoted by G modulo G₁.
Two real integers m, n are congruent modulo r [m ≡ n(r)], where r is a third real integer, if and only if m − n is a multiple of r, so that m/r and n/r have equal remainders.
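In code, this number-theoretic special case of congruence reduces to a single remainder test (the function name is ours):

```python
def congruent(m, n, r):
    """m = n (mod r): true iff m - n is a multiple of r."""
    return (m - n) % r == 0

print(congruent(17, 5, 12))   # True: 17 - 5 = 12
print(congruent(17, 6, 12))   # False
print(17 % 12 == 5 % 12)      # True: equal remainders, hence the same residue class
```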
12.2-11. Continuous and Mixed-continuous Groups. (a) The elements of a continuous group G can be labeled with corresponding sets of continuously variable parameters so that the parameters γ₁, γ₂, . . . of the product c = ab are continuously differentiable functions of the parameters α₁, α₂, . . . of a and the parameters β₁, β₂, . . . of b. The parameters may, for instance, be the matrix elements in a representation of G (EXAMPLE: the three-dimensional rotation group, Sec. 14.10-7).

(b) A mixed-continuous group G is the union of m separated (Sec. 12.5-1c) subsets "connected" in the sense of Sec. 12.2-11a (m = 1, 2, . . .). The connected subset G₁ containing the identity element is a normal subgroup of G with index m; the other connected subsets of G are the cosets associated with G₁ (see also Secs. 12.2-4 and 12.2-5; EXAMPLE: the reflection-rotation group in three-dimensional Euclidean space, see also Sec. 14.10-7).
12.2-12. Mean Values. The mean value of a numerical function F(a) defined on the elements a of a group G is a linear functional (Sec. 12.1-4) defined for a group G of finite order g as

    Mean{F(a)} = (1/g) Σ_{a in G} F(a)    (12.2-4)

More generally, for a suitable numerical function F(a) defined on a continuous or mixed-continuous group G,

    Mean{F(a)} = ∫_G F(a) dU(a) / ∫_G dU(a)    (12.2-5a)

where the "volume element" dU(a) is defined in terms of the parameters αᵢ, βᵢ, and γᵢ introduced in Sec. 12.2-11a.

The definition (5) applies also to countably infinite and finite groups if the integrals are interpreted as Stieltjes integrals (Sec. 4.6-17), and reduces to Eq. (4) for finite groups.

The definition of Mean{F(a)} implies

    Mean{F(a₀a)} = Mean{F(aa₀)} = Mean{F(a)} = Mean{F(a⁻¹)}    (12.2-6)

for every fixed element a₀ in G.
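For a finite group, Eq. (12.2-4) and the invariance property (12.2-6) can be checked numerically; the sketch below (the helper name `group_mean` is ours) uses the additive group Z₆ and an arbitrary numerical function F:

```python
def group_mean(G, F):
    """Mean{F(a)} = (1/g) * sum of F over the group elements (Eq. 12.2-4)."""
    G = list(G)
    return sum(F(a) for a in G) / len(G)

n = 6
G = range(n)                  # Z6 under addition mod n
F = lambda a: a * a           # any numerical function on G
m = group_mean(G, F)
a0 = 4
# Invariance (12.2-6): translating or inverting the argument permutes the
# group elements, so the mean is unchanged.
print(m == group_mean(G, lambda a: F((a0 + a) % n)))   # True
print(m == group_mean(G, lambda a: F((-a) % n)))       # True
```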
12.3. ALGEBRA OF MODELS WITH TWO DEFINING
OPERATIONS: RINGS, FIELDS, AND INTEGRAL DOMAINS
12.3-1. Definitions and Basic Theorems. (a) A class R of objects (elements) a, b, c, . . . is a ring if and only if it is possible to define two binary operations, usually denoted as (abstract) addition and multiplication, such that
1. R is a commutative group with respect to addition (additive group, Sec. 12.2-10); i.e., R is closed under addition, and
       a + b = b + a        a + (b + c) = (a + b) + c
       a + 0 = a            a + (−a) = a − a = 0
2. ab is an element of R (closure under multiplication)
3. a(bc) = (ab)c (associative law for multiplication)
4. a(b + c) = ab + ac, (b + c)a = ba + ca (distributive laws)

Note that a0 = 0a = 0 for every element a of R.
Two elements p and q of R such that pq = 0 are called left and right divisors of zero, respectively. For a ring without divisors of zero (other than zero itself), ab = 0 implies a = 0, or b = 0, or a = b = 0, and the cancellation laws (12.2-1) hold.

"Multiples" of ring elements like 2a, 3a, . . . (Sec. 12.2-10a) are not in general products of ring elements. Integral powers of ring elements are defined as in Sec. 12.2-1c.
(b) A ring with identity (unity) is a ring containing a multiplicative (left) identity E such that Ea = a for all a (see also Sec. 12.2-1a). E is necessarily a unique right identity as well as a unique left identity. A given element a of a ring with identity may or may not have a multiplicative (left) inverse a⁻¹; if a⁻¹ does exist, it is necessarily a unique right inverse as well as a unique left inverse (see also Sec. 12.2-1).
(c) A field* is a ring with identity which contains (1) at least one element different from zero and (2) a multiplicative inverse a⁻¹ for each element a ≠ 0. The nonzero elements of a field F constitute a group with respect to multiplication. Given any pair of elements b, c ≠ 0 of F, the equations cx = b and xc = b have solutions in F. These solutions are unique whenever b ≠ 0 (unique left and right division, see also Sec. 12.2-1b); they are unique even for b = 0 if F is a field without divisors of zero.
(d) A ring or field is commutative if and only if ab = ba for all a, b. A commutative field is sometimes simply called a field, as opposed to a skew or noncommutative field. A Galois field is a finite commutative field. An integral domain is a commutative ring with identity and without divisors of zero. Every finite integral domain is a Galois field. In every integral domain all nonzero elements of the additive group have the same order (Sec. 12.2-3); this order is referred to as the characteristic of the integral domain.

(e) Refer to Sec. 12.6-3 for a brief discussion of ordered fields.

* Some writers require every field to be an integral domain (Sec. 12.3-1d).
EXAMPLES OF FIELDS: rational numbers; real numbers; complex numbers. EXAMPLE OF A COMMUTATIVE RING WITH DIVISORS OF ZERO: continuous functions defined on a finite interval. EXAMPLES OF INTEGRAL DOMAINS: real integers; complex numbers with integral real and imaginary parts; polynomials. EXAMPLE OF A COMMUTATIVE RING WITHOUT IDENTITY: even integers.
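The contrast between a commutative ring with divisors of zero and a Galois field shows up already for residues modulo n: modulo 6 there are nonzero products equal to zero, while modulo the prime 5 every nonzero element is invertible. A brute-force sketch (the helper names are ours):

```python
def zero_divisors(n):
    """Nonzero pairs p, q with p*q = 0 in the ring of residues mod n."""
    return [(p, q) for p in range(1, n) for q in range(1, n) if (p * q) % n == 0]

def is_field(n):
    """Residues mod n form a (Galois) field iff every nonzero element
    has a multiplicative inverse (Sec. 12.3-1c)."""
    return all(any((a * x) % n == 1 for x in range(1, n)) for a in range(1, n))

print(zero_divisors(6))   # [(2, 3), (3, 2), (3, 4), (4, 3)]
print(is_field(6))        # False: a ring with divisors of zero
print(is_field(5))        # True: 5 is prime, so this is the Galois field of order 5
```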
12.3-2. Subrings and Subfields. Ideals. (a) A subset R₁ of a ring R is a subring of R if and only if it is a ring with respect to the defining operations of R. This is true if and only if R₁ contains a − b and ab for every pair of its elements a, b. Similarly, a subset F₁ of a field F is a subfield of F if and only if it is a subring of F comprising at least one nonzero element and contains ab⁻¹ for every pair of elements a, b (b ≠ 0) of F₁. The nonzero elements of F₁ constitute a subgroup of the multiplicative group of F.

(b) A subset I₁ of a ring R is an ideal in R if and only if

1. I₁ is a subgroup of R with respect to addition.
2. I₁ contains all products ab (left ideal), or all products ba (right ideal), or all products ab and ba (two-sided ideal), where a is any element of I₁, and b is any element of R.
12.3-3. Extensions. It is often possible to "imbed" commutative rings or fields as subrings or subfields in "larger" fields (quotient fields, algebraic extension fields, etc.) in a manner analogous to that outlined in Sec. 1.1-2 for the real integers (see also the example in Sec. 12.4-2). The theory of fields, including the so-called Galois theory, deals with the existence of such extensions (APPLICATIONS: investigation of the possibility of constructions with ruler and compass, of solving algebraic equations in terms of radicals, and of constructing Latin squares; Refs. 12.1 and 12.9).

12.4. MODELS INVOLVING MORE THAN ONE CLASS OF MATHEMATICAL OBJECTS: LINEAR VECTOR SPACES AND LINEAR ALGEBRAS
12.4-1. Linear Vector Spaces. Let R be a ring with (multiplicative) identity 1 (Sec. 12.3-1b); the elements α, β, . . . of R shall be referred to as scalars. A class V of objects (elements) a, b, c, . . . is a (linear) vector space over the ring R, and the elements of V are called vectors, if and only if it is possible to define two binary operations called vector addition and multiplication of vectors by scalars such that*

1. V is a commutative group with respect to vector addition: for every pair of elements a, b of V, V contains a vector sum a + b, and
       a + b = b + a        a + (b + c) = (a + b) + c
       a + 0 = a            a + (−a) = a − a = 0
   where 0 is the additive identity element (null vector) of V, and −a is the additive inverse of a (Sec. 12.2-10)

* Some authors require that V contain not only every sum of two vectors, but also every infinite sum a₁ + a₂ + · · · which converges in some specified sense; vector spaces defined in this manner are not in the realm of algebra proper (see also Sec. 14.2-1).
2. Given any vector a of V and any scalar α of R, V contains a vector αa, the product of the vector a by the scalar α (closure under multiplication by scalars)
3. (αβ)a = α(βa) (associative law for multiplication by scalars)
4. α(a + b) = αa + αb, (α + β)a = αa + βa (distributive laws)
5. 1a = a

Note that

    0a = 0        (−1)a = −a        (−α)a = −(αa)    (12.4-1)
Linear vector spaces are of fundamental importance in applied mathematics; they are treated in detail in Chap. 14 (see also Chaps. 5, 6, 15, 16, and 17).
12.4-2. Linear Algebras. Given a ring R of scalars with identity 1, a class L is a linear algebra (linear associative algebra, system of hypercomplex numbers) over the ring R if and only if it is possible to define three binary operations (addition and multiplication in L and multiplication of elements of L by scalars) such that

1. L is a ring
2. L is a linear vector space over the ring R of scalars

The order of a linear algebra is its dimension as a vector space (Sec. 14.2-4). A linear algebra is a division algebra if and only if it is a field (Sec. 12.3-1c). Refer to Sec. 14.9-7 for matrix representations of linear algebras.

An element A ≠ 0 of any linear algebra is idempotent if and only if A² = A, and nilpotent if and only if there exists a real integer m > 1 such that Aᵐ = 0. These definitions apply, in particular, to matrices (Sec. 13.2-2) and to linear operators (Sec. 14.3-6).
EXAMPLES (see also Secs. 13.2-2 and 14.4-2): (1) The field of complex numbers is a commutative division algebra of order 2 over the field of real numbers. (2) The field of quaternions a, b, . . . is the only extension (Sec. 12.3-3) of the complex-number field which constitutes a noncommutative division algebra over the field of real numbers. Every quaternion a can be represented in the form

    a = a₀ + ia₁ + ja₂ + ka₃ = (a₀ + ia₁) + (a₂ + ia₃)j    (12.4-2)

where a₀, a₁, a₂, a₃ are real numbers, and i, j, k are special quaternions (generators) satisfying the multiplication rules

    i² = j² = k² = −1
    ij = −ji = k        jk = −kj = i        ki = −ik = j    (12.4-3)
If one defines a* = a₀ − ia₁ − ja₂ − ka₃, then

    |a|² = a*a = aa* = a₀² + a₁² + a₂² + a₃²    (12.4-4)
    a⁻¹ = a*/|a|²    (12.4-5)
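The multiplication rules (12.4-3) and the conjugate, norm, and inverse formulas (12.4-4) and (12.4-5) translate directly into component arithmetic on 4-tuples (a₀, a₁, a₂, a₃); the helper names below are ours:

```python
def qmul(a, b):
    """Quaternion product, components as in Eq. (12.4-2), using
    i^2 = j^2 = k^2 = -1, ij = -ji = k, jk = -kj = i, ki = -ik = j."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qconj(a):
    """Conjugate a* = a0 - i a1 - j a2 - k a3."""
    a0, a1, a2, a3 = a
    return (a0, -a1, -a2, -a3)

def qnorm2(a):
    """|a|^2 = a0^2 + a1^2 + a2^2 + a3^2, Eq. (12.4-4)."""
    return sum(c * c for c in a)

def qinv(a):
    """a^-1 = a*/|a|^2, Eq. (12.4-5)."""
    n2 = qnorm2(a)
    return tuple(c / n2 for c in qconj(a))

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
print(qmul(i, j))        # (0, 0, 0, 1) = k
print(qmul(j, i))        # (0, 0, 0, -1) = -k: multiplication is not commutative
a = (1.0, 2.0, 3.0, 4.0)
print(qmul(a, qinv(a)))  # ~ (1.0, 0.0, 0.0, 0.0), the identity quaternion
```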
(see also Sec. 14.10-6).

12.5. MODELS PERMITTING THE DEFINITION OF LIMITING PROCESSES: TOPOLOGICAL SPACES
12.5-1. Topological Spaces (refer to Sec. 4.3-2 for elementary properties of point sets). (a) A class C of objects ("points") x is a topological space if and only if it can be expressed as the union of a family 𝒯 of point sets which contains

1. The intersection of every pair of its sets
2. The union of the sets in every subfamily

𝒯 is a topology for the space C, and the elements of 𝒯 are called open sets relative to the topology 𝒯. A family ℬ of open sets is a base for the topology 𝒯 if and only if every set of 𝒯 is the union of sets in ℬ.

The intersections of any subset C₁ of a topological space C with its open sets constitute a topology for C₁ (relative topology of the subspace C₁ of C, relativization of 𝒯 to C₁). A given space may admit more than one topology; every space C admits the indiscrete (trivial) topology comprising only C and the empty set, and the discrete topology comprising all subsets of C.
(b) For a given topology, a neighborhood of a point x in C is any point set in C which contains an open set comprising x. Given the definition of neighborhoods, one defines limit points, interior points, boundary points, and isolated points of sets in the manner of Sec. 4.3-6a; topological spaces are seen to abstract and generalize certain features of the real-number system.

In any topological space C, a point set is open if and only if it contains only interior points; a point set S in C is closed (1) if and only if S is the complement (with respect to C) of an open set, or (2) if and only if S contains all its limit points (alternative definitions). S is dense in (relative to) C if and only if every neighborhood in C contains a point of S. A topological space C is separable if and only if it contains a countable dense set; this is true whenever the topology of C has a countable base. A topological space is compact if and only if every infinite sequence of points in C has at least one limit point in C; this is true if and only if every family of open sets which covers (i.e., whose union exhausts) C has a finite subfamily which covers C (alternative definition). Every compact space is separable.

A point set S in a topological space C is compact (compact in C) if every infinite sequence of points in S has at least one limit point in C; S is compact in itself if every such sequence has a limit point in S.
Separable spaces are of special interest because their points can be labeled with countable sets of coordinates (Sec. 14.5-1). Compactness is a generalization of the concept of boundedness (Sec. 4.3-3) in finite-dimensional spaces, where the Bolzano-Weierstrass and Heine-Borel theorems apply (Sec. 12.5-4).
(c) The limit points and the boundary points of a set S respectively constitute its (first) derived set S′ and its boundary. The closed set S ∪ S′ is the closure of S. Two sets are separated if and only if neither intersects the closure of the other. A set is connected if and only if it cannot be expressed as the union of separated proper subsets (EXAMPLE: connected region in a Euclidean space, Sec. 4.3-6b).
(d) Continuity. Homeomorphisms. A correspondence (mapping, transformation, function, operation) x → x′ = f(x) relating points x′ of a topological space C′ to points x of a topological space C is continuous at the point x = a if and only if f(a) exists and every neighborhood of f(a) is the image of a neighborhood of a. f(x) is continuous if and only if it is continuous at all points of C, i.e., if and only if every open set in C′ is the image of an open set in C. A continuous reciprocal one-to-one correspondence having a continuous inverse is a homeomorphism or topological transformation; topological spaces so related are homeomorphic or topologically equivalent.

Topology is the study of relationships invariant with respect to homeomorphisms (topological invariants; see also Sec. 12.1-5). Topology has geometrical applications ("rubber-sheet geometry," study of multiply connected surfaces, etc.); more significantly, suitable topological spaces are models permitting the definition of converging limiting processes with the aid of the neighborhood concept (see also Secs. 4.3-5, 4.4-1, and 12.5-3). Definitions of specific topologies are often phrased directly as definitions of neighborhoods or convergence.

EXAMPLES: The definition of neighborhoods given in Sec. 4.3-5a establishes the "usual" topology of the real-number system and permits the introduction of limits, differentiation, integration, infinite series, etc. Similarly, Sec. 5.3-1 amounts to the definition of a topology for the space of Euclidean vectors (see also Secs. 12.5-3 and 14.2-7).
12.5-2. Metric Spaces. A class C_M of objects ("points") x, y, z, . . . is a metric space if and only if for each ordered pair of points x, y of C_M it is possible to define a real number d(x, y) (metric, distance function, "distance" between x and y) such that for all x, y, z in C_M

1. d(x, x) = 0
2. d(x, y) ≤ d(x, z) + d(y, z)    (triangle property)

This definition implies

    d(x, y) ≥ 0        d(y, x) = d(x, y)        for all x, y in C_M    (12.5-1)
x and y are metrically equal if and only if d(x, y) = 0; this does not necessarily imply x = y. Metric equality is an equivalence relation (Sec. 12.1-3b). Two metric spaces are isometric if and only if they are related by a distance-preserving reciprocal one-to-one correspondence (isometry). Properties common to all metric spaces isometric to a given metric space are metric invariants.

EXAMPLES: The (finite) real and complex numbers constitute metric spaces with the metric d(x, y) = |x − y|. More generally, every normed vector space (Sec. 14.2-5), and thus every unitary vector space, admits the metric d(x, y) = ‖x − y‖ (see also Secs. 14.2-7 and 15.2-2).
12.5-3. Topology, Neighborhoods, and Convergence in a Metric Space. (a) Given any point a of a metric space C_M, the set (region) of points x in C_M such that d(a, x) < δ is called an open ball of radius δ about a. The open balls of finite radii constitute a base for a topology in C_M (Sec. 12.5-1a); open sets are unions of open balls. A neighborhood of the point a in C_M is any set containing an open ball of finite radius about a. d(a, x) ≤ r defines a closed ball in C_M, and d(a, x) = r defines a sphere.
(b) A sequence of points x₀, x₁, x₂, . . . of a metric space C_M is said to converge (metrically) to the limit a in C_M if and only if d(xₙ, a) → 0 as n → ∞ (see also Sec. 14.2-7).

If a variable point x(ξ) of C_M is a function of the real variable ξ, x(ξ) converges (metrically) to the limit a in C_M as ξ → α if and only if d[x(ξ), a] → 0 as ξ → α. x(ξ) is continuous for ξ = α (Sec. 12.5-1d) if and only if x(α) exists and d[x(ξ), x(α)] → 0 as ξ → α. A function x′ = f(x) relating points x′ of a metric space C′_M to points x of a metric space C_M is continuous (Sec. 12.5-1d) at the point x = a if and only if f(a) exists and d[f(x), f(a)] → 0 as d[x, a] → 0.
12.5-4. Metric Spaces with Special Properties. Point-set Theory (see also Secs. 4.3-6 and 12.5-1). A metric space C_M is

Complete if and only if every sequence of points x₁, x₂, . . . of C_M such that lim_{m→∞, n→∞} d(x_m, x_n) = 0 (Cauchy sequence) converges to a point of C_M (see also Secs. 4.9-1a, 4.9-2a, 4.9-3a, 14.2-7, and 15.2-2).

Precompact if and only if, for every real ε > 0, there exists a finite set of balls of radius less than ε covering C_M.

Conditionally compact if and only if every infinite set of points in C_M contains a Cauchy sequence.

Boundedly compact or boundedly conditionally compact if and only if every closed ball in C_M is, respectively, compact or conditionally compact.

Every closed set in a metric space is a subspace with the same metric. A point set S in a complete metric space C_M is a complete metric subspace if and only if S is closed. Every set compact in itself (Sec. 12.5-1) is bounded. A metric space is compact if and
only if it is precompact and complete; every precompact metric space is separable. Every finite-dimensional unitary vector space (Euclidean vector space, Sec. 14.2-7) is separable, complete, and boundedly compact; in every Euclidean vector space

1. Every infinite point set contained in a closed ball K_a of finite radius has at least one limit point in K_a (Bolzano-Weierstrass Theorem).
2. Given any rule associating a ball K_x of finite radius with each point x of a closed set S contained in a ball of finite radius, there exists a finite set of these balls which covers (includes) all points of S (Heine-Borel Covering Theorem).

These theorems apply, in particular, to sets of real or complex numbers, and to point sets in two- or three-dimensional Euclidean spaces.
12.5-5. Examples: Spaces of Numerical Sequences and Functions. The concepts introduced in Secs. 12.5-2 to 12.5-4 offer a concise and suggestive terminology for many problems involving the approximation of an arbitrary element x of a suitable class C_M by a sequence of elements x₁, x₂, . . . of C_M. In such instances, the distance d(xₙ, x) measures the error of the approximation, or the degree to which a system characterized by x₁, x₂, . . . fails to meet a performance criterion. Table 12.5-1 lists a number of examples of topological spaces. Especially important applications involve the approximation of a function f(t) by a sequence of functions s₁(t), s₂(t), . . . , such as the partial sums of an infinite series (see also Secs. 4.7-3, 5.3-11, 13.2-1, 13.2-11, 14.2-7, and 15.2-2).
Table 12.5-1a. Some Spaces of Numerical Sequences x ≡ (ξ₁, ξ₂, . . .), y ≡ (η₁, η₂, . . .)

m: bounded real sequences; metric d(x, y) = sup_k |ξ_k − η_k|. Complete; not separable.
c: convergent real sequences; metric d(x, y) = sup_k |ξ_k − η_k|. Complete and separable.
l²: complex sequences such that Σ_{k=1}^∞ |ξ_k|² exists; metric d(x, y) = [Σ_{k=1}^∞ |ξ_k − η_k|²]^{1/2}. Complete and separable.
l^p (p = 1, 2, . . .): complex sequences such that Σ_{k=1}^∞ |ξ_k|^p exists; metric d(x, y) = [Σ_{k=1}^∞ |ξ_k − η_k|^p]^{1/p}. Complete and separable.

Table 12.5-1b. Some Spaces of Functions x(t), y(t) (see also Secs. 14.2-7, 15.2-2, and 18.10-9; the definitions are readily extended to functions of two or more variables)

F[0, 1]: real functions defined on [0, 1]; no metric exists; the topology is defined by pointwise convergence.
C[0, 1]: continuous real functions defined on [0, 1]; metric d(x, y) = sup_{0≤t≤1} |x(t) − y(t)| (uniform convergence). Complete; separable (Sec. 4.7-3).
L²(a, b): complex functions such that ∫_a^b |x(t)|² dt exists in the sense of Lebesgue (Sec. 4.6-15); metric d(x, y) = [∫_a^b |x(t) − y(t)|² dt]^{1/2} (convergence in mean). Complete (Sec. 15.2-2); separable even if (a, b) is not bounded.
L^p(a, b) (p = 1, 2, . . .): complex functions such that ∫_a^b |x(t)|^p dt exists in the sense of Lebesgue; metric d(x, y) = [∫_a^b |x(t) − y(t)|^p dt]^{1/p}. Complete; separable even if (a, b) is not bounded.
L*: complex functions such that lim_{T→∞} (1/T) ∫_{−T/2}^{T/2} |x(t)|² dt exists in the sense of Lebesgue; metric d(x, y) = [lim_{T→∞} (1/T) ∫_{−T/2}^{T/2} |x(t) − y(t)|² dt]^{1/2}. Complete; not separable.

12.5-6. Banach's Contraction-mapping Theorem and Successive Approximations. Let x′ = f(x) be any transformation mapping the
closed set S of a complete metric space C_M into itself so that

    d[f(x), f(y)] ≤ α d(x, y)    (12.5-2)

for all x, y in S, where α is a positive real number less than unity ("contraction mapping"). Then S contains a unique fixed point x_f of the mapping f such that

    f(x_f) = x_f    (12.5-3)

Moreover, the solution x_f of the equation (3) is the limit of every sequence of successive approximations

    x_{n+1} = f(x_n)    (n = 0, 1, 2, . . .)    (12.5-4)

as n → ∞, for an arbitrary starting point x₀ in S. The rate of convergence of the approximation sequence is given by

    d(x_n, x_f) ≤ [αⁿ/(1 − α)] d(x₁, x₀)    (n = 0, 1, 2, . . .)    (12.5-5)
The contraction-mapping principle furnishes a powerful method for establishing the convergence of a wide range of approximation techniques (Sees. 20.2-1, 20.2-6, and 20.3-5).
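To make the scheme (12.5-4) concrete, here is a small Python sketch (illustrative, not part of the handbook) that iterates f(t) = cos t, which maps [0, 1] into itself with |f′(t)| ≤ sin 1 < 1 and is therefore a contraction:

```python
import math

def fixed_point(f, x0, tol=1e-12, max_iter=1000):
    """Successive approximations x_{n+1} = f(x_n) for a contraction f.

    Convergence is guaranteed by the bound (12.5-5),
    d(x_n, x_f) <= alpha**n / (1 - alpha) * d(x_1, x_0), when alpha < 1.
    Returns the approximate fixed point and the iteration count.
    """
    x = x0
    for n in range(max_iter):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next, n + 1
        x = x_next
    return x, max_iter

# alpha = sin(1) ~ 0.8415 < 1, so the iteration converges from any x0 in [0, 1]
xf, iters = fixed_point(math.cos, 0.5)
```

The same driver applies unchanged to Newton-type iterations, which Secs. 20.2-1 and 20.2-6 treat as contraction mappings.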
12.6. ORDER
12.6-1. Partially Ordered Sets. (a) Order. A class (set) S of objects (elements) a, b, c, . . . is partially ordered if and only if it admits a transitive ordering relation (rule of precedence) a < b defined for some pairs of elements a, b, so that

    a < b, b < c   implies   a < c     (12.6-1)

a < b may or may not preclude b < a (antisymmetry). The symbol ≤ is used to indicate a reflexive ordering relation which satisfies the condition a ≤ a for all a.
A partially ordered set permits the definition of upper bounds, lower bounds, least upper bounds, greatest lower bounds, maxima, and/or minima of suitable subsets in the manner of Sec. 4.3-3.
A partially ordered set S is order-complete if and only if every nonempty subset having an upper bound has a least upper bound in S; this is true if and only if every nonempty subset having a lower bound has a greatest lower bound in S (alternative definitions).

(b) Lattices. A (nonempty) class of objects (elements) a, b, . . . is a lattice if and only if it admits a reflexive ordering relation such that every pair a, b of elements has a unique least upper bound sup (a, b) and a unique greatest lower bound inf (a, b). In every lattice one can define "sums" a + b ≡ sup (a, b) and "products" ab ≡ inf (a, b). Sums and products so defined have the properties 1, 2, 3, 5 listed in Sec. 12.8-1 for Boolean algebras.
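As an example (ours, not the handbook's), the positive integers ordered by divisibility form a lattice with sup (a, b) = lcm(a, b) and inf (a, b) = gcd(a, b); a short Python check of the lattice "sum" and "product" properties:

```python
from math import gcd

def lcm(a, b):
    # least upper bound in the divisibility lattice
    return a * b // gcd(a, b)

# idempotent and absorption properties (Secs. 12.6-1b, 12.8-1):
a, b = 12, 18
assert gcd(a, a) == a and lcm(a, a) == a      # a + a = aa = a
assert gcd(a, lcm(a, b)) == a                 # a(a + b) = a
assert lcm(a, gcd(a, b)) == a                 # a + ab = a
```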
12.6-2. Simply Ordered Sets. A class (set) S of elements a, b, c, . . . is a simply (linearly, totally, completely) ordered set (chain) if and only if it admits an ordering relation having the transitivity property (1) and such that*

    a < b or b < a for every pair of distinct elements a, b
    a < b, b < a   implies   a = b     (12.6-2)

The ordering relation may or may not be reflexive, but every simply ordered set admits the reflexive ordering relation defined by a ≤ b if a < b or a = b. A simply ordered set S is well ordered if and only if every nonempty subset of S has a minimum. Every countable set can be well ordered.

12.6-3. Ordered Fields (see also Secs. 1.1-5 and 12.3-1). (a) A commutative field is an ordered field if and only if each element can be uniquely classified as "positive" (>0), "negative" (<0), or zero (=0) in such a way that

    a > 0, b > 0   implies   a + b > 0, ab > 0     (12.6-3)
(b) Every order-complete (Sec. 12.6-1a) ordered field is isomorphic (with respect to addition and multiplication) with the field of real numbers. Every ordered integral domain whose positive elements are well ordered (Sec. 12.6-2) is isomorphic with the integral domain of real integers. The last theorem expresses the equivalence of binary numbers, decimal numbers, roman numerals, etc.

* Some authors replace the second condition (2) by "a < b precludes b < a."
12.7. COMBINATION OF MODELS: DIRECT PRODUCTS, PRODUCT SPACES, AND DIRECT SUMS

12.7-1. Introduction. Cartesian Products. Sections 12.7-1 to 12.7-5 deal with a class C of mathematical objects described in terms of two or more properties and represented as ordered sets {a1, a2, . . .} of objects a1, a2, . . . respectively taken from suitably defined classes C1, C2, . . . . The objects a1, a2, . . . may be regarded as properties or attributes of the new object {a1, a2, . . .}; one defines {a1, a2, . . .} = {b1, b2, . . .} if and only if a1 = b1, a2 = b2, . . . . The class C is called the cartesian product of the classes C1, C2, . . . . This method of combining mathematical objects into more complicated mathematical objects is associative:

    {a1, {a2, a3}} = {{a1, a2}, a3} = {a1, a2, a3}     (12.7-1)

It remains to relate operations in C to operations in C1, C2, . . . .
12.7-2. Direct Products of Groups (see also Sec. 12.2-1). The direct product G = G1 ⊗ G2 of two groups G1 and G2 having the respective elements a1, b1, . . . and a2, b2, . . . is the group comprising all ordered pairs {a1, a2}, with "multiplication" defined by

    {a1, a2}{b1, b2} = {a1b1, a2b2}     (12.7-2)

The order of G equals the product of the orders of G1 and G2. G has the identity E = {E1, E2}, where E1 and E2 are the respective identities of G1 and G2.
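The componentwise rule (12.7-2) is easy to realize for, say, the additive groups of integers mod 2 and mod 3 (an illustrative sketch, ours; here the direct product is in fact the cyclic group of order 6):

```python
from itertools import product

# elements of G1 = Z2 and G2 = Z3 (additive groups); G = G1 x G2
G = list(product(range(2), range(3)))

def op(g, h):
    # {a1, a2}{b1, b2} = {a1 b1, a2 b2}: the group operation acts componentwise
    return ((g[0] + h[0]) % 2, (g[1] + h[1]) % 3)

E = (0, 0)                    # identity E = {E1, E2}
assert len(G) == 2 * 3        # order of G = product of the orders
assert all(op(E, g) == g for g in G)
# associativity is inherited componentwise from G1 and G2:
assert all(op(op(a, b), c) == op(a, op(b, c))
           for a in G for b in G for c in G)
```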
12.7-3. Direct Products of Real Vector Spaces (see also Secs. 12.4-1, 14.2-1, and 14.2-4). The direct product V = V1 ⊗ V2 of two real vector spaces V1 and V2 having the respective elements a1, b1, . . . and a2, b2, . . . is the real vector space comprising all ordered pairs (outer products) {a1, a2} ≡ a1a2, with vector addition and multiplication of vectors by scalars defined so that

    a1(a2 + b2) = a1a2 + a1b2
    (a1 + b1)a2 = a1a2 + b1a2     (12.7-3)
    αa1a2 = (αa1)a2 = a1(αa2)

The linear dimension of the vector space V equals the product of the linear dimensions of V1 and V2. EXAMPLE: construction of tensors as outer products of vectors, Secs. 16.3-6, 16.6-1, and 16.9-1.
12.7-4. Product Space (see also Sec. 12.5-1a). The product space formed by two topological spaces C1, C2 is their cartesian product with the product topology (family of open sets in the product space) defined as the family of all cartesian products of S1 and S2, where S1 is an open set in C1, and S2 is an open set in C2.

12.7-5. Direct Sums. (a) The direct sum V′ = V1 ⊕ V2 of two vector spaces V1 and V2 having the respective elements a1, b1, . . . and a2, b2, . . . and admitting the same ring R of scalars is the vector space comprising all pairs [a1, a2] ≡ [a2, a1] with vector addition and multiplication of vectors by scalars α of R defined so that

    [a1, a2] + [b1, b2] = [a1 + b1, a2 + b2]     α[a1, a2] = [αa1, αa2]     (12.7-4)

The dimension of V′ equals the sum of the dimensions of V1 and V2. If V1 and V2 have no common elements, one may write [a1, a2] = a1 + a2, and V1 and V2 are subspaces of V′. Every linear vector space of dimension greater than one can be represented as a direct sum of nonintersecting subspaces.
(b) The direct sum R′ = R1 ⊕ R2 of two rings R1 and R2 having the respective elements a1, b1, . . . and a2, b2, . . . is the ring comprising all ordered pairs (often referred to as direct sums) [a1, a2] with addition and multiplication defined so that

    [a1, a2] + [b1, b2] = [a1 + b1, a2 + b2]
    [a1, a2][b1, b2] = [a1b1, a2b2]     (12.7-5)
(c) The direct sum of two linear algebras (Sec. 12.4-2) is the linear algebra of ordered pairs, with addition, multiplication, and multiplication by scalars defined as in Eqs. (4) and (5).

12.8. BOOLEAN ALGEBRAS
12.8-1. Boolean Algebras. A Boolean algebra is a class 𝔅 of objects A, B, C, . . . admitting two binary operations, denoted as (logical) addition and multiplication, with the following properties:

(a) For all A, B, C in 𝔅

1. 𝔅 contains A + B and AB     (closure)
2. A + B = B + A;  AB = BA     (commutative laws)
3. A + (B + C) = (A + B) + C;  A(BC) = (AB)C     (associative laws)
4. A(B + C) = AB + AC;  A + BC = (A + B)(A + C)     (distributive laws)
5. A + A = AA = A     (idempotent properties)
6. A + B = B if and only if AB = A     (consistency property)

(b) In addition,

7. 𝔅 contains elements I and 0 such that, for every A in 𝔅, A + 0 = A, AI = A (it follows that A0 = 0, A + I = I)
8. For every element A, 𝔅 contains an element A′ (complement of A, also written Ā or I − A) such that A + A′ = I, AA′ = 0

In every Boolean algebra,

    A(A + B) ≡ A + AB ≡ A     (laws of absorption)     (12.8-1)
    (A + B)′ ≡ A′B′     (AB)′ ≡ A′ + B′     (duality, or de Morgan's laws)     (12.8-2)
    (A′)′ ≡ A     I′ = 0     0′ = I     A + A′B = A + B     (12.8-3)
    AB + A′C + BC ≡ AB + A′C     (12.8-4)

If A + B = B, one may write A′B as B − A (complement of A with respect to B). Two or more objects A, B, C, . . . of a Boolean algebra are disjoint if and only if every product involving distinct elements of the set equals 0.

The symbols ∪ (cup) and ∩ (cap) used in Secs. 4.3-2a, 12.8-5b, and 18.2-1 to denote union and intersection of sets and events are frequently employed to denote logical addition and multiplication in any Boolean algebra, so that A ∪ B stands for A + B, and A ∩ B stands for AB.
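The algebra of classes (Sec. 12.8-4) realizes these axioms with union, intersection, and set complement, so the identities (12.8-1) to (12.8-4) can be checked exhaustively on a small universe. An illustrative Python sketch (ours, not the handbook's):

```python
from itertools import chain, combinations

I = frozenset({1, 2, 3})          # the element I (universe of the class algebra)
subsets = [frozenset(s) for s in chain.from_iterable(
    combinations(sorted(I), r) for r in range(len(I) + 1))]

def comp(A):
    # complement A' = I - A
    return I - A

for A in subsets:
    for B in subsets:
        assert (A | B == B) == (A & B == A)         # consistency property (6)
        assert A & (A | B) == A == A | (A & B)      # absorption (12.8-1)
        assert comp(A | B) == comp(A) & comp(B)     # de Morgan (12.8-2)
        assert A | (comp(A) & B) == A | B           # identity (12.8-3)
        for C in subsets:
            # consensus theorem (12.8-4)
            assert ((A & B) | (comp(A) & C) | (B & C)
                    == (A & B) | (comp(A) & C))
```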
12.8-2. Boolean Functions. Reduction to Canonical Form. (a) Given n Boolean variables X1, X2, . . . , Xn, each of which can equal any element of a given Boolean algebra, a Boolean function

    Y = F(X1, X2, . . . , Xn)

is an expression built up from X1, X2, . . . , Xn through addition, multiplication, and complementation. In every Boolean algebra, there exist exactly 2^(2ⁿ) different Boolean functions of n variables. The relations of Sec. 12.8-1 imply

    [F(X, Y, . . . ; +, ·)]′ = F(X′, Y′, . . . ; ·, +)     (12.8-5)
    F(X, X′, Y, . . .) ≡ XF(I, 0, Y, . . .) + X′F(0, I, Y, . . .)
                      ≡ [X + F(0, I, Y, . . .)][X′ + F(I, 0, Y, . . .)]     (12.8-6)
    XF(X, X′, Y, . . .) ≡ XF(I, 0, Y, . . .)
    X + F(X, X′, Y, . . .) ≡ X + F(0, I, Y, . . .)     (12.8-7)

(b) Every Boolean function is either identical to 0 or can be expressed as a unique sum of minimal polynomials (canonical minterms) Z1Z2 · · · Zn, where Zi is either Xi or Xi′ (canonical form of a Boolean function; see Fig. 12.8-1 for a geometrical illustration).

Fig. 12.8-1. A Venn diagram (Euler diagram). Diagrams of this type illustrate relations in an algebra of classes (Sec. 12.8-5). If the rectangle, circle, square, and triangle are respectively labeled I, X1, X2, X3, the diagram shows how a Boolean function of X1, X2, X3 can be represented as a union of minimal polynomials in X1, X2, X3 (Sec. 12.8-2). Note that there are 2³ = 8 different minimal polynomials.

A given Boolean function Y = F(X1, X2, . . . , Xn) may be reduced to canonical form as follows:
1. Use Eq. (2) to expand complements of sums and products.
2. Reduce F(X1, X2, . . . , Xn) to a sum of products with the aid of the first distributive law.
3. Simplify the resulting expression with the aid of the identities XiXi = Xi, XiXi′ = 0, and Eq. (4).
4. If a term f does not contain one of the variables, say Xi, rewrite f as fXi + fXi′.
In many applications, it may be advantageous to omit step 4 and to continue step 3 so as to simplify each term of the expansion as much as possible (see also Sec. 12.8-7).

In view of de Morgan's laws (2), every Boolean function not identically equal to I can also be expressed as a unique product of canonical maxterms Z1 + Z2 + · · · + Zn, where Zi is either Xi or Xi′. There are, altogether, 2ⁿ minterms and 2ⁿ maxterms. EXAMPLES:
    (AB + CD)′ = (A′ + B′)(C′ + D′) = A′C′ + A′D′ + B′C′ + B′D′
    [(A + B)(C + D)]′ = A′B′ + C′D′ = (A′ + C′)(A′ + D′)(B′ + C′)(B′ + D′)     (12.8-8)
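For two-valued variables (Sec. 12.8-6), the canonical sum of minterms can be produced mechanically by evaluating F over all 2ⁿ argument combinations and collecting the rows where F = 1. A short Python sketch (ours; the prime marks complementation, as above):

```python
from itertools import product

def minterm_expansion(F, names):
    """Canonical sum-of-minterms of a 0/1-valued Boolean function F.

    F takes len(names) arguments, each 0 or 1; returns a string in which
    a primed variable denotes its complement.
    """
    terms = []
    for args in product((0, 1), repeat=len(names)):
        if F(*args):
            terms.append("".join(x if bit else x + "'"
                                 for x, bit in zip(names, args)))
    return " + ".join(terms) if terms else "0"

# example: F(X, Y) = X + Y expands into the three minterms X'Y, XY', XY
expansion = minterm_expansion(lambda x, y: x | y, "XY")
```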
12.8-3. The Inclusion Relation (see also Sec. 12.6-1). (a) Either A + B = B or AB = A is equivalent to a reflexive partial-ordering relation A < B [or B > A; (logical) inclusion relation].

(b) In every Boolean algebra,

    A + B ≡ sup (A, B)     AB ≡ inf (A, B)     (12.8-9)

where the bounds are defined by the inclusion relation. Every Boolean algebra is a lattice (Sec. 12.6-1b).

(c) Given any element A of a Boolean algebra 𝔅, the elements XA < A of 𝔅 constitute a Boolean algebra in which A takes the place of I.
12.8-4. Algebra of Classes. The subsets (subclasses) A, B, . . . of any set (class) I constitute a Boolean algebra (algebra of classes) under the operations of logical addition (union), logical multiplication (intersection), and complementation defined in Sec. 4.3-2a. The empty set (or any set which contains no element of I) is denoted by 0. The relation < (Sec. 12.8-3) becomes the logical inclusion relation ⊂.

12.8-5. Isomorphisms of Boolean Algebras. Venn Diagrams. Two finite Boolean algebras are isomorphic (with respect to addition, multiplication, and complementation, Sec. 12.1-6) if and only if they have the same number of elements. Every Boolean algebra is isomorphic to an algebra of classes (Sec. 12.8-4); Venn diagrams (Euler diagrams) like that shown in Fig. 12.8-1 conveniently illustrate the properties of Boolean algebras in terms of an algebra of classes.
12.8-6. Event Algebras and Symbolic Logic. Event algebras serve as models for the compounding of events (suitably defined outcomes of idealized experiments; see Sec. 18.2-1 for additional discussion). If E1, E2, . . . represent such events, then

E1 ∪ E2 represents the event (proposition) E1 OR E2 (or both; inclusive OR)
E1 ∩ E2 represents the event (proposition) E1 AND E2
E′ represents the event (proposition) NOT E
I represents a certain event (union of all possible outcomes, Sec. 18.2-1)
0 represents an impossible event
In two-valued (Aristotelian) logic, an algebra of hypothetical events (logical propositions, assertions) E is related to a simpler Boolean algebra of truth values T[E], equal to either 1 (E is true) or 0 (E is false), by the homomorphism (Sec. 12.1-6)

    T[I] = 1     T[0] = 0
    T[E1 ∪ E2] = T[E1] + T[E2]     T[E1 ∩ E2] = T[E1]T[E2]
    T[E′] = (T[E])′     (12.8-10)

On the basis of these assumptions, a proposition E is either true or false (law of the excluded middle), and the truth value of any proposition E expressible as a Boolean function of ("logically related to") a set of events E1, E2, . . . is given by

    T[E] = T[F(E1, E2, . . .)] = F{T[E1], T[E2], . . .}     (12.8-11)

with

    0 + 0 = 0     0 + 1 = 1 + 1 = 1     0 · 0 = 0 · 1 = 0     1 · 1 = 1
    0′ = 1     1′ = 0     (12.8-12)
12.8-7. Representation of Boolean Functions by Truth Tables. Karnaugh Maps. Given a set of Boolean variables X1, X2, . . . , Xn, each capable of taking the values 0 or 1 (Sec. 12.8-6), each of the 2^(2ⁿ) Boolean functions

    Y = F(X1, X2, . . . , Xn)

is uniquely defined by the corresponding truth table (Table 12.8-1) listing the function values for all possible arguments. Table 12.8-1 also lists a common arrangement of the corresponding minterms (Sec. 12.8-2); each minterm is assigned the binary number determined by the arrangement of 0's and 1's in X, Y, Z. F is seen to equal the Boolean sum of the minterms corresponding to the function value 1 in the truth table.

Table 12.8-1.* Truth Table for
F = X′Y′Z + X′YZ + XY′Z + XYZ′ + XYZ = (X + Y + Z)(X + Y′ + Z)(X′ + Y + Z)

    X  Y  Z  F    Corresponding minterm
    0  0  0  0    X′Y′Z′ = m0
    0  0  1  1    X′Y′Z  = m1
    0  1  0  0    X′YZ′  = m2
    0  1  1  1    X′YZ   = m3
    1  0  0  0    XY′Z′  = m4
    1  0  1  1    XY′Z   = m5
    1  1  0  1    XYZ′   = m6
    1  1  1  1    XYZ    = m7

* Based on J. V. Wait: Symbolic Logic and Practical Applications, in M. Klerer and G. A. Korn, Digital Computer User's Handbook, McGraw-Hill, New York, 1967.
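As a cross-check on the function of Table 12.8-1 (an illustrative sketch, ours; the particular minterms are as we read them from the table), the sum-of-minterms and product-of-maxterms forms can be compared over all eight argument combinations:

```python
from itertools import product

# F as the sum of minterms m1, m3, m5, m6, m7 (function value 1 in the table)
def f_minterms(x, y, z):
    return int((not x and not y and z) or (not x and y and z) or
               (x and not y and z) or (x and y and not z) or (x and y and z))

# the same F as a product of maxterms (function value 0 at rows 000, 010, 100)
def f_maxterms(x, y, z):
    return int((x or y or z) and (x or not y or z) and (not x or y or z))

table = [(x, y, z, f_minterms(x, y, z))
         for x, y, z in product((0, 1), repeat=3)]
```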
Fig. 12.8-2. Karnaugh maps for two, three, four, and five Boolean variables (a) to (d), and a three-variable map for the function of Table 12.8-1 (e) (from Ref. 20.9).

A Karnaugh map is a Venn diagram (Sec. 12.8-5) providing for an orderly arrangement of squares corresponding to the 2ⁿ minterms generated by n variables (Fig. 12.8-2). Truth-table values for a given function F are entered into the appropriate squares; the given function F is then the union of all minterms labeled with a 1.
Fig. 12.8-3. Logic simplification with a Karnaugh map. (Based on Ref. 20.9.)

For functions of up to perhaps six variables, the Karnaugh map makes it convenient to recombine these minterms into unions and/or intersections so as to minimize, say,
the number of logical additions, multiplications, and/or complementations. This is useful for the economical design of digital-computer circuits (Fig. 12.8-3).

12.8-8. Complete Additivity and Measure Algebras (see also Secs. 18.2-1 and 18.2-2). Many algebras of classes and event algebras require one to extend the defining postulates to unions and intersections of infinitely many terms (this is, strictly speaking, outside the realm of algebra proper, Sec. 12.1-2). A Boolean algebra 𝔅 is completely additive if and only if every infinite sum A1 + A2 + · · · is uniquely defined as an element of 𝔅. A completely additive Boolean algebra 𝔅 is a measure algebra if and only if there exists a real numerical function (measure) M(A) defined for all elements A of 𝔅 so that

1. M(A) ≥ 0
2. M(0) = 0
3. M(A1 + A2 + · · ·) = M(A1) + M(A2) + · · · for every countable set of disjoint elements A1, A2, . . .
EXAMPLES OF MEASURES: cardinal number of a set (Sec. 4.3-2b), length, area, volume, Lebesgue and Stieltjes measures (Secs. 4.6-14, 4.6-17), truth value (Sec. 12.8-6), probability (Sec. 18.2-2).
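In the algebra of classes, the cardinal number (set size) is the simplest such measure; a quick illustrative check of the three defining properties on disjoint finite sets (sketch, ours):

```python
# counting measure M(A) = |A| on subsets of a finite universe:
# M(A) >= 0, M(0) = 0, and M is additive over disjoint elements
A1, A2, A3 = {1, 2}, {3}, {4, 5, 6}          # pairwise disjoint "events"
assert all(not (x & y) for x, y in [(A1, A2), (A1, A3), (A2, A3)])

M = len
union = A1 | A2 | A3
assert M(set()) == 0                          # property 2
assert M(union) == M(A1) + M(A2) + M(A3)      # property 3 (finite case)
```

Probability (Sec. 18.2-2) obeys the same three properties with M(I) normalized to 1.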
12.9. RELATED TOPICS, REFERENCES, AND BIBLIOGRAPHY
12.9-1. Related Topics. The following topics related to the study of modern algebra and abstract spaces are treated in other chapters of this handbook:
Theory of equations, determinants (Chap. 1)
Point sets (Chap. 4)
Matrices, quadratic and hermitian forms (Chap. 13)
Linear vector spaces, linear operators, eigenvalue problems, matrix representations (Chap. 14)
Function spaces, orthogonal expansions, eigenvalue problems involving differential and integral equations (Chap. 15)
Tensor algebra and analysis (Chap. 16)
12.9-2. References and Bibliography. Algebra (see also Secs. 1.10-2, 13.7-2, and 14.11-2)
12.1. Barnes, W. E.: Introduction to Abstract Algebra, Heath, Boston, 1963. 12.2. Birkhoff, G., and S. MacLane: A Survey of Modern Algebra, rev. ed., Macmillan, New York, 1965.
12.3. Herstein, I. N.: Topics in Algebra, Blaisdell, New York, 1964. 12.4. Johnson, R. E.: First Course in Abstract Algebra, Prentice-Hall, Englewood Cliffs, N.J., 1953.
12.5. McCoy, N. H.: Introduction to Modern Algebra, Allyn and Bacon, Boston, 1960.
12.6. Mostow, G. D., et al.: Fundamental Structures of Algebra, McGraw-Hill, New York, 1963.
12.7. Schreier, O., and E. Sperner: Introduction to Modern Algebra and Matrix Theory, Chelsea, New York, 1955.
12.8. van der Waerden, B. L.: Modern Algebra, rev. ed., 2 vols., Ungar, New York, 1950-1953; 6th German edition, Springer, Berlin, 1964.
Boolean Algebras and Logic
12.9. Church, A.: Introduction to Mathematical Logic, Princeton University Press, Princeton, N.J., 1956.
12.10. Copi, I. M.: Symbolic Logic, Macmillan, New York, 1954.
12.11. Flegg, H. G.: Boolean Algebra and Its Applications, Wiley, New York, 1964.
12.12. Chu, Y.: Digital Computer Design Fundamentals, McGraw-Hill, New York, 1962.
12.13. Hohn, F. E.: Applied Boolean Algebra, Macmillan, New York, 1960.
12.14. Rosenbloom, P. C.: The Elements of Mathematical Logic, Dover, New York, 1951.
12.15. Suppes, P. C.: Introduction to Logic, Van Nostrand, Princeton, N.J., 1958.
12.16. Tarski, A.: Introduction to Logic and to the Methodology of Deductive Sciences, 2d ed., Oxford, Fair Lawn, N.J., 1946.
12.17. Whitesitt, J. E.: Boolean Algebra and Its Applications, Addison-Wesley, Reading, Mass., 1961.

Switching Logic
12.18. Caldwell, S. H.: Switching Circuits and Logical Design, Wiley, New York, 1958. 12.19. Marcus, M. P.: Switching Circuits for Engineers, Prentice-Hall, Englewood Cliffs, N.J., 1962. 12.20. Miller, R. E.: Switching Theory, vol. 1: Combinatorial Circuits, Wiley, New York, 1965.
Topology (see also Secs. 4.12-2, 14.11-2, and 15.7-2)

12.21. Aleksandrov, P. S.: Combinatorial Topology, 3 vols., Graylock, New York, 1956.
12.22. Bushaw, D.: Elements of General Topology, Wiley, New York, 1963.
12.23. Hall, D. W., and G. L. Spencer: Elementary Topology, Wiley, New York, 1955.
12.24. Hocking, J., and G. Young: Topology, Addison-Wesley, Reading, Mass., 1961.
12.25. Kelley, John L.: General Topology, Van Nostrand, Princeton, N.J., 1955.
12.26. Lefschetz, S.: Introduction to Topology, Princeton University Press, Princeton, N.J., 1949.
12.27. Liusternik, L. A., and V. J. Sobolev: Elements of Functional Analysis, Ungar, New York, 1961.
12.28. Newman, M. H. A.: Topology of Plane Sets of Points, Cambridge, New York, 1951.
12.29. Pervin, W. J.: Foundations of General Topology, Academic Press, New York, 1964.
12.30. Pontryagin, Lev S.: Foundations of Combinatorial Topology, Graylock, Rochester, N.Y., 1952. 12.31. Wallace, A. H.: Introduction to Algebraic Topology, Pergamon Press, New York, 1957.
CHAPTER 13

MATRICES. QUADRATIC AND HERMITIAN FORMS

13.1. Introduction
    13.1-1. Introductory Remarks
13.2. Matrix Algebra and Matrix Calculus
    13.2-1. Rectangular Matrices. Norms
    13.2-2. Basic Operations
    13.2-3. Identities and Inverses
    13.2-4. Integral Powers of Square Matrices
    13.2-5. Matrices as Building Blocks of Mathematical Models
    13.2-6. Multiplication by Special Matrices. Permutation Matrices
    13.2-7. Rank, Trace, and Determinant of a Matrix
    13.2-8. Partitioning of Matrices
    13.2-9. Step Matrices. Direct Sums
    13.2-10. Direct Product (Outer Product) of Matrices
    13.2-11. Convergence and Differentiation
    13.2-12. Functions of Matrices
13.3. Matrices with Special Symmetry Properties
    13.3-1. Transpose and Hermitian Conjugate of a Matrix
    13.3-2. Matrices with Special Symmetry Properties
    13.3-3. Combination Rules
    13.3-4. Decomposition Theorems. Normal Matrices
13.4. Equivalent Matrices. Eigenvalues, Diagonalization, and Related Topics
    13.4-1. Equivalent and Similar Matrices
    13.4-2. Eigenvalues and Spectra of Square Matrices
    13.4-3. Transformation of a Square Matrix to Triangular Form. Algebraic Multiplicity of an Eigenvalue
    13.4-4. Diagonalization of Matrices
    13.4-5. Eigenvalues and Characteristic Equation of a Finite Matrix
    13.4-6. Eigenvalues of Step Matrices
    13.4-7. The Cayley-Hamilton Theorem and Related Topics
13.5. Quadratic and Hermitian Forms
    13.5-1. Bilinear Forms
    13.5-2. Quadratic Forms
    13.5-3. Hermitian Forms
    13.5-4. Transformation of Quadratic and Hermitian Forms. Transformation to Diagonal Form
    13.5-5. Simultaneous Diagonalization of Two Quadratic or Hermitian Forms
    13.5-6. Tests for Positive Definiteness, Nonnegativeness, etc.
13.6. Matrix Notation for Systems of Differential Equations (State Equations). Perturbations and Lyapunov Stability Theory
    13.6-1. Systems of Ordinary Differential Equations. Matrix Notation
    13.6-2. Linear Differential Equations with Constant Coefficients (Time-invariant Systems)
        (a) Homogeneous Systems. Normal-mode Solution
        (b) Nonhomogeneous Systems. The State-transition Matrix
        (c) Laplace-transform Solution
    13.6-3. Linear Systems with Variable Coefficients
    13.6-4. Perturbation Methods and Sensitivity Equations
    13.6-5. Stability of Solutions: Definitions
    13.6-6. Lyapunov Functions and Stability
    13.6-7. Applications and Examples
13.7. Related Topics, References, and Bibliography
    13.7-1. Related Topics
    13.7-2. References and Bibliography

13.1. INTRODUCTION
13.1-1. Matrices (Sec. 13.2-1) are the building blocks of an important class of mathematical models. Matrix techniques permit a simplified representation of various mathematical and physical operations in terms of numerical operations on matrix elements. Chapter 13 introduces matrix algebra and calculus (Sees. 13.2-1 to 13.4-7), quadratic and hermitian forms (Sees. 13.5-1 to 13.5-6), and the matrix/state-variable treatment of ordinary differential equations (Sees. 13.6-1 to 13.6-7). The general application of matrices to the representation of vectors, linear transformations (linear operators), and inner products is reserved for Chap. 14. Sections 13.5-1 to 13.5-6 similarly introduce quadratic and hermitian forms from the viewpoint of simple algebra; their real significance for the representation of scalar products is discussed in Sees. 14.7-1 and 14.7-2. 13.2. MATRIX ALGEBRA AND MATRIX CALCULUS
13.2-1. Rectangular Matrices. Norms. (a) An array

    A ≡ [aik] ≡
        [ a11  a12  . . .  a1n ]
        [ a21  a22  . . .  a2n ]
        [ . . . . . . . . . . ]
        [ am1  am2  . . .  amn ]     (13.2-1)

of "scalars" aik taken from a commutative field (Sec. 12.3-1) F is called a (rectangular) m × n matrix over the field F whenever one of the "matrix operations" defined in Sec. 13.2-2 is to be used. The elements aik are called matrix elements; the matrix element aik is situated in the
ith row and in the kth column of the matrix (1). m is the number of rows, and n is the number of columns. A matrix is finite if and only if it has a finite number of rows and a finite number of columns; otherwise the matrix is infinite.
A finite or infinite matrix (1) over the field of complex numbers is bounded if and only if it has a finite bound (norm) defined typically as

    ‖A‖ ≡ sup | Σ_(i=1)^m Σ_(k=1)^n aik ξi ηk |     ( Σ_(i=1)^m |ξi|² = Σ_(k=1)^n |ηk|² = 1 )     (13.2-2)
Table 13.2-1. Some Matrix Norms (n and/or m can be infinite; see also Secs. 12.5-5 and 14.4-1)

(a) Rectangular m × n Matrices A = [aik] (m, n ≥ 1)

    ‖A‖ ≡ sup | Σ_i Σ_k aik ξi ηk |     ( Σ_i |ξi|² = Σ_k |ηk|² = 1 )
    ‖A‖_I = sup_k Σ_i |aik|     (maximum absolute column sum)
    ‖A‖_II = sup_i Σ_k |aik|     (maximum absolute row sum)
    ‖A‖_p = ( Σ_i Σ_k |aik|ᵖ )^(1/p)     (p = 1, 2, . . .)
    ‖A‖_1 = Σ_i Σ_k |aik|     (taxicab norm)
    ‖A‖_E = ( Σ_i Σ_k |aik|² )^(1/2)     (euclidean or Frobenius norm)

(b) Column Matrices x = {ξk} or Row Matrices x = (ξ1, ξ2, . . . , ξn)

    ‖x‖_p = ( Σ_k |ξk|ᵖ )^(1/p)     (p = 1, 2, . . .)
    ‖x‖_1 = Σ_k |ξk|     (taxicab norm)
    ‖x‖_E = ( Σ_k |ξk|² )^(1/2)     (euclidean norm)
    ‖x‖_∞ = sup_k |ξk|

(c) Relations. Each norm satisfies the relations

    ‖A + B‖ ≤ ‖A‖ + ‖B‖     ‖αA‖ = |α| ‖A‖     ‖AB‖ ≤ ‖A‖ ‖B‖

In particular, ‖Ax‖_E ≤ ‖A‖ ‖x‖_E.
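The elementary norms of Table 13.2-1 reduce to simple loops over the matrix elements; a pure-Python sketch (illustrative only, using nested lists for matrices):

```python
def norm_I(A):
    # maximum absolute column sum, ||A||_I
    return max(sum(abs(row[k]) for row in A) for k in range(len(A[0])))

def norm_II(A):
    # maximum absolute row sum, ||A||_II
    return max(sum(abs(a) for a in row) for row in A)

def norm_E(A):
    # euclidean (Frobenius) norm, ||A||_E
    return sum(abs(a) ** 2 for row in A for a in row) ** 0.5

A = [[1, -2], [3, 4]]
B = [[0, 1], [1, 0]]
# column sums of |A| are 4 and 6; row sums are 3 and 7; ||A||_E = sqrt(30)
```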
(see also Table 13.2-1 and Sec. 14.4-1). A finite matrix over the field of complex numbers is bounded if and only if all its matrix elements are bounded. Throughout this handbook all matrices will be understood to be bounded matrices over the field of complex numbers, unless the contrary is specifically stated. A matrix A ≡ [aik] is real if and only if all matrix elements aik are real numbers.
(b) n × 1 matrices are column matrices, and 1 × n matrices are row matrices. The following notation will be used:

    x ≡ {ξ1, ξ2, . . . , ξn} ≡ {ξk}     (column matrix)
    x̃ ≡ [ξ1  ξ2  . . .  ξn] ≡ [ξk]     (row matrix)     (13.2-3)

(see also Secs. 13.3-1, 14.5-1, and 14.5-2).

(c) An n × n matrix is called a square matrix of order n. A square matrix A ≡ [aik] is

    Triangular (superdiagonal) if and only if i > k implies aik = 0
    Strictly triangular if and only if i ≥ k implies aik = 0
    Diagonal if and only if i ≠ k implies aik = 0
    Monomial if and only if each row and column has one and only one element different from zero
13.2-2. Basic Operations. Operations on matrices are defined in terms of operations on the matrix elements.

1. Two m × n matrices A ≡ [aik] and B ≡ [bik] are equal (A = B) if and only if aik = bik for all i, k (see also Sec. 12.1-3).
2. The sum of two m × n matrices A ≡ [aik] and B ≡ [bik] is the m × n matrix

    A + B ≡ [aik] + [bik] ≡ [aik + bik]

3. The product of the m × n matrix A ≡ [aik] by the scalar α is the m × n matrix

    αA ≡ α[aik] ≡ [αaik]

4. The product of the m × n matrix A ≡ [aij] and the n × r matrix B ≡ [bjk] is the m × r matrix

    AB ≡ [aij][bjk] ≡ [ Σ_(j=1)^n aij bjk ]
In every matrix product AB the number n of columns of A must match the number of rows of B (A and B must be conformable). If AB exists, BA exists if and only if the number of columns of B equals the number of rows of A; in general BA ≠ AB even when both products exist (see also Sec. 13.4-4b). Note

    A + B = B + A               A + (B + C) = (A + B) + C
    α(βA) = (αβ)A               α(AB) = (αA)B = A(αB)
    A(BC) = (AB)C               α(A + B) = αA + αB
    (α + β)A = αA + βA          A(B + C) = AB + AC
    (B + C)A = BA + CA     (13.2-4)

    ‖A + B‖ ≤ ‖A‖ + ‖B‖     ‖αA‖ = |α| ‖A‖     ‖AB‖ ≤ ‖A‖ ‖B‖     (13.2-5)
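The defining operations of Sec. 13.2-2 translate directly into code; a minimal pure-Python sketch (ours, illustrative, with matrices as nested lists) that also exhibits BA ≠ AB:

```python
def mat_add(A, B):
    # elementwise sum of two m x n matrices
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_mul(A, B):
    # (AB)_ik = sum_j a_ij b_jk; columns of A must match rows of B
    assert len(A[0]) == len(B), "A and B are not conformable"
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]        # a permutation matrix (Sec. 13.2-6)
AB, BA = mat_mul(A, B), mat_mul(B, A)   # AB swaps columns, BA swaps rows
```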
13.2-3. Identities and Inverses. Note the following definitions:

1. The m × n null matrix (additive identity) [0] is the m × n matrix all of whose elements are equal to zero. Then

    A + [0] = A     0A = [0]     [0]B = C[0] = [0]

where A is any m × n matrix, B is any matrix having n rows, and C is any matrix having m columns.

2. The additive inverse (negative) −A of the m × n matrix A ≡ [aik] is the m × n matrix

    −A ≡ (−1)A ≡ [−aik]

with A + (−A) = A − A = [0].

3. The identity matrix (unit matrix, multiplicative identity) I of order n is the n × n diagonal matrix with unit diagonal elements:

    I ≡ [δik]     ( δik = 1 if i = k;  δik = 0 if i ≠ k )

(see also Sec. 16.5-2). Then

    IB = B     CI = C

where B is any matrix having n rows, and C is any matrix having n columns; and for any n × n matrix A

    IA = AI = A
4. A (necessarily square) matrix A is nonsingular (regular) if and only if it has a (necessarily unique) bounded multiplicative inverse or reciprocal A⁻¹ defined by

    AA⁻¹ = A⁻¹A = I

Otherwise A is a singular matrix.

A finite n × n matrix A ≡ [aik] is nonsingular if and only if det (A) ≡ det [aik] ≠ 0; in this case A⁻¹ is the n × n matrix

    A⁻¹ ≡ [ A_ki / det [aik] ]

where A_ik is the cofactor of the element aik in the determinant det [aik] (see also Secs. 1.9-2 and 14.5-3).

Products and reciprocals of nonsingular matrices are nonsingular; if A and B are nonsingular, and α ≠ 0,

    (AB)⁻¹ = B⁻¹A⁻¹     (αA)⁻¹ = α⁻¹A⁻¹     (A⁻¹)⁻¹ = A     (13.2-6)

(see also Sec. 14.3-5). A given matrix A is nonsingular if it has a unique left or right inverse, or equal left and right inverses; the mere existence of one or more left and/or right inverses is not sufficient (see also Sec. 14.3-5). A given n × n matrix is nonsingular if and only if it can be partitioned (Sec. 13.2-8) into a linearly independent set of row or column matrices.
13.2-4. Integral Powers of Square Matrices. One defines A⁰ ≡ I, A¹ ≡ A, A² ≡ AA, A³ ≡ AAA, . . . and, if A is nonsingular, A⁻ᵖ ≡ (A⁻¹)ᵖ = (Aᵖ)⁻¹ (p = 1, 2, . . .).
The ordinary rules for operations with exponents apply (see also Sec. 14.3-6).

13.2-5. Matrices as Building Blocks of Mathematical Models. The definitions of Secs. 13.2-2 and 13.2-3 (constructive definitions, Sec. 12.1-1) imply the following results:

1. Given any pair of (finite) positive integers m, n, the class of all m × n matrices over a field F is an mn-dimensional vector space over F (Secs. 12.4-1 and 14.2-4). In particular, n-element column or row matrices form n-dimensional vector spaces (see also Sec. 14.5-2).
2. The class of all square matrices of a given (finite) order n over a field F is a linear algebra of order n² over F; singular matrices are zero divisors (Sec. 12.4-2).
3. The class of all nonsingular square matrices of a given (finite) order n over a field F constitutes a multiplicative group (Sec. 12.2-1) and, together with the n × n null matrix, a division algebra of order n² over the field F (Sec. 12.4-2).
Analogous theorems apply to bounded infinite matrices over the field of real or complex numbers.
13.2-6. Multiplication by Special Matrices. Permutation Matrices. Given any n × n matrix A,
1. Premultiplication of A by the matrix obtained on replacing the 1 in the ith row of the n × n identity matrix by a complex number α multiplies all elements in the ith row of A by α.
2. Premultiplication of A by the matrix obtained on replacing the nondiagonal element δik = 0 of the n × n identity matrix by 1 adds the kth row of A to the ith row of A.
3. Premultiplication of A by the permutation matrix formed through a permutation of the rows of the n × n identity matrix results in an identical permutation of the rows of A.
The third transposition relation (13.3-1) yields three analogous theorems describing operations on the columns of a matrix as results of postmultiplication by special matrices.
13.2-7. Rank, Trace, and Determinant of a Matrix (see also Sec. 14.3-2). The rank of a given matrix is the largest number r such that at least one rth-order determinant (Sec. 1.5-1) formed from the matrix by deleting rows and/or columns is different from zero. An m X n matrix A is nonsingular if and only if m = n = r, i.e., if and only if A is square and det (A) ≠ 0 (Sec. 13.2-3).
The trace (spur) of an n X n matrix A = [aik] is the sum

    Tr (A) = Σ (i = 1 to n) aii

of the diagonal terms. If n = ∞, this sum converges whenever A is bounded (Sec. 13.2-1a). For finite matrices A, B

    Tr (A + B) = Tr (A) + Tr (B)        Tr (aA) = a Tr (A)
    Tr (BA) = Tr (AB)                   Tr (AB − BA) = 0        (13.2-7)

    det (AB) = det (BA) = det (A) det (B)        (13.2-8)

The determinant det (A) of an infinite square matrix A = [aik] is defined as

    det (A) = lim (n → ∞) det [aik]        (i, k ≤ n)

if the limit exists. The theorems of Secs. 1.5-1 to 1.5-6 apply to determinants defined in this manner whenever both Σ (i ≠ k) aik and Π (i = 1 to ∞) aii converge.
13.2-8. Partitioning of Matrices. A matrix having more than one row and column may be partitioned into smaller rectangular submatrices by lines drawn between rows and/or columns. One can multiply two similarly partitioned n X n matrices A and B by entering their rectangular submatrices as elements in the ordinary matrix-product formula (Sec. 13.2-2); the product elements thus obtained are the submatrices of the n X n matrix AB. This theorem is helpful in numerical computations (Sec. 20.3-4).
13.2-9. Step Matrices. Direct Sums (see also Secs. 13.4-6, 14.8-2, and 14.9-2). A step matrix is a square matrix A which can be partitioned into a diagonal matrix (Sec. 13.2-1c) of square submatrices A1, A2, . . . , so that

        | A1   [0]  · · · |
    A = | [0]  A2   · · · |        (13.2-9)
        | · · · · · · · · |

A step matrix is often referred to as the direct sum A = A1 ⊕ A2 ⊕ · · · of the square matrices along its diagonal (see also Sec. 12.7-5). Note that A^p = A1^p ⊕ A2^p ⊕ · · · for p = 0, 1, 2, . . . (and also for p = −1, −2, . . . if A is nonsingular), and

    Tr (A) = Tr (A1) + Tr (A2) + · · ·
    det (A) = det (A1) det (A2) · · ·        (13.2-10)
13.2-10. Direct Product (Outer Product) of Matrices (see also Sec. 12.7-3). The direct product (outer product) A ⊗ B of the m X n matrix A = [aik] and the m' X n' matrix B = [bi'k'] is the mm' X nn' matrix

    A ⊗ B ≡ [cjh]        (cjh = aik bi'k')        (13.2-11)

where j enumerates the pairs (i, i') in the sequence (1, 1), (1, 2), . . . , (1, m'), (2, 1), (2, 2), . . . , (m, m'), and where h enumerates the pairs (k, k') in a similar manner.
Note that, for matrices of suitable orders,

    (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD)        (13.2-12)
    Tr (A ⊗ B) = Tr (A) Tr (B)        (13.2-13)
13.2-11. Convergence and Differentiation. (a) A sequence of matrices S0, S1, S2, . . . each having the same number of rows and the same number of columns is said to converge to a bounded matrix S if and only if every matrix element of Sn converges to the corresponding element of S as n → ∞, i.e., if and only if lim (n → ∞) ||S − Sn|| = 0. One similarly defines limits of matrix functions A = A(t) of a scalar parameter t (see also Sec. 12.5-3).
(b) If the matrix elements of a matrix A = [aik] are differentiable functions aik(t) of a scalar parameter t, one writes

    dA/dt = [daik/dt]
Partial differentiation and integration of matrices are defined in an analogous manner.
13.2-12. Functions of Matrices. Matrix polynomials and algebraic functions of matrices are defined in terms of the elementary matrix operations. The Cayley-Hamilton theorem (Sec. 13.4-7) reduces every convergent series Σ (k = 0 to ∞) ak A^k in powers of an n X n matrix A (analytic function of the matrix A) to an nth-degree polynomial in A.
13.3. MATRICES WITH SPECIAL SYMMETRY PROPERTIES
13.3-1. Transpose and Hermitian Conjugate of a Matrix (see also Secs. 14.4-3 and 14.4-6a). Given any m X n matrix A ≡ [aik] over the field of complex numbers,

The transpose (transposed matrix) of A is the n X m matrix Ã ≡ [aki].
The hermitian conjugate (adjoint, conjugate, associate matrix)* of A is the n X m matrix A† ≡ [a*ki].

Note the following relations:

    (A + B)~ = Ã + B̃        (aA)~ = aÃ        (Ã)~ = A
    [0]~ = [0]        Ĩ = I        (AB)~ = B̃Ã
    (A⁻¹)~ = (Ã)⁻¹        ||Ã|| = ||A||        (13.3-1)

    (A + B)† = A† + B†        (aA)† = a*A†        (A†)† = A
    [0]† = [0]        I† = I        (AB)† = B†A†
    (A⁻¹)† = (A†)⁻¹        ||A†|| = ||A||        (13.3-2)

A, Ã, and A† are necessarily of equal rank. For every square matrix A,

    Tr (Ã) = Tr (A)        det (Ã) = det (A)        (13.3-3)
    Tr (A†) = [Tr (A)]*        det (A†) = [det (A)]*        (13.3-4)
13.3-2. Matrices with Special Symmetry Properties (see also Secs. 14.4-4 to 14.4-6). A square matrix A = [aik] is

* a*ki is the complex conjugate of aki (Sec. 1.3-1). The terms adjoint, conjugate, and associate(d) are used with different meanings (see also Secs. 12.2-5, 14.4-3, 16.7-1, and 16.7-2); some authors refer to the matrix A⁻¹ det (A) (matrix of cofactors with transposed indices, adjugate of A) as the adjoint of A. The symbols used for the transpose Ã, the hermitian conjugate A†, and the complex conjugate A* ≡ [a*ik] ≡ (Ã)† of a given matrix A also vary; some authors denote the hermitian conjugate of A by A*.
Symmetric if and only if Ã = A, i.e., aik = aki
Skew-symmetric (antisymmetric) if and only if Ã = −A, i.e., aik = −aki
Hermitian (self-adjoint, self-conjugate) if and only if A† = A, i.e., aik = a*ki
Skew-hermitian (alternating) if and only if A† = −A, i.e., aik = −a*ki
Orthogonal if and only if ÃA = AÃ = I, i.e., Ã = A⁻¹
Unitary if and only if A†A = AA† = I, i.e., A† = A⁻¹

A hermitian matrix is symmetric, a skew-hermitian matrix is skew-symmetric,
and a unitary matrix is orthogonal if and only if all its matrix elements are real. The diagonal elements of hermitian, skew-hermitian, and skew-symmetric matrices are, respectively, real, pure imaginary, and equal to zero. The determinant of a hermitian matrix is real. The determinant of a skew-hermitian n X n matrix is real if n is even, and pure imaginary if n is odd. The determinant of a skew-symmetric matrix of odd order is equal to zero. The determinant of a unitary matrix has the absolute value 1, and the determinant of an orthogonal matrix equals either +1 or −1.
13.3-3. Combination Rules (see also Sec. 14.4-7). (a) If the matrix A is symmetric, the same is true for A^p (p = 0, 1, 2, . . .), A⁻¹, T̃AT, and aA. Given any nonsingular matrix T, T̃AT is symmetric if and only if A is symmetric; hence for any orthogonal matrix T, T⁻¹AT is symmetric if and only if the same is true for A. If A and B are symmetric, the same is true for A + B. The product AB of two symmetric matrices A and B is symmetric if and only if BA = AB.
(b) If the matrix A is hermitian, the same is true for A^p (p = 0, 1, 2, . . .), A⁻¹, and T†AT, and for aA if a is real. Given any nonsingular matrix T, T†AT is hermitian if and only if A is hermitian; hence for any unitary matrix T, T⁻¹AT is hermitian if and only if the same is true for A. If A and B are hermitian, the same is true for A + B. The product AB of two hermitian matrices A and B is hermitian if and only if BA = AB.
(c) If A is an orthogonal matrix, the same is true for A^p (p = 0, 1, 2, . . .), A⁻¹, Ã, and −A. If A and B are orthogonal, the same is true for AB. If A is a unitary matrix, the same is true for A^p (p = 0, 1, 2, . . .), A⁻¹, and A†, and for aA if |a| = 1. If A and B are unitary, the same is true for AB.
13.3-4. Decomposition Theorems. Normal Matrices (see also Secs. 13.4-4a and 14.4-8). (a) For every square matrix A over the field of complex numbers

1. ½(A + Ã) = S1 is a symmetric matrix, and ½(A − Ã) = S2 is skew-symmetric. A = S1 + S2 is the (unique) decomposition of the given matrix A into a symmetric part and a skew-symmetric part.
2. ½(A + A†) = H1 and (1/2i)(A − A†) = H2 are hermitian matrices. A = H1 + iH2 is the (unique) cartesian decomposition of the given matrix A into a hermitian part and a skew-hermitian part (comparable to the cartesian decomposition of complex numbers into real and imaginary parts, Sec. 1.3-1).
3. AA† is hermitian (and nonnegative, Sec. 13.5-3), and there exists a polar decomposition A = QU of A into a nonnegative hermitian factor Q and a unitary factor U. Q is uniquely defined by Q² = AA†, and U is uniquely defined if and only if A is nonsingular (compare this with Sec. 1.3-2).
(b) A square matrix A is a normal matrix if and only if A†A = AA†, or, equivalently, if and only if H2H1 = H1H2.

13.4. EQUIVALENT MATRICES.
EIGENVALUES,
DIAGONALIZATION, AND RELATED TOPICS
13.4-1. Equivalent and Similar Matrices (see also Secs. 12.2-5a, 13.5-4, 13.5-5, and 14.6-2). (a) Two rectangular matrices A and B are equivalent if and only if there exist two nonsingular matrices S, T such that A and B are related by the transformation

    B = SAT        (13.4-1)

Every matrix B equivalent to a given matrix A has the same number of rows and the same number of columns as A and can be obtained through successive application of the six operations defined in Sec. 13.2-6. Equivalent matrices are of equal rank; two m X n matrices of equal rank are necessarily equivalent. A and B are equivalent whenever B = QA or B = AQ, where Q is a nonsingular matrix.
(b) In particular, two square matrices A and Ā are similar (sometimes simply called equivalent) if and only if there exists a nonsingular matrix T (transformation matrix) such that A and Ā are related by the similarity transformation (collineatory transformation)

    Ā = T⁻¹AT        or        A = TĀT⁻¹        (13.4-2)

A, Ā, and T are necessarily square matrices of equal order.
Every similarity transformation (2) preserves the results of matrix addition, matrix multiplication, and multiplication by scalars (see also Sec. 12.1-6). Two similar matrices have the same rank, the same trace, and the same determinant (see also Sec. 13.4-2a).
(c) Two square matrices A and Ā related by a transformation

    Ā = T̃AT        (13.4-3)

where T is nonsingular, are congruent. Two square matrices A and Ā related by

    Ā = T†AT        (13.4-4)

where T is nonsingular, are conjunctive. In either case A, Ā, and T are necessarily square matrices of equal order.
(d) Matrix equivalence, similarity, congruence, and conjunctivity are equivalence relations; each defines a partition of the class of matrices under consideration (Sec. 12.1-3b). In most applications, two or more similar matrices constitute different representations of a linear transformation (linear operator, dyadic) A (Sec. 14.6-2). It is, then, of interest (1) to find similarity transformations yielding particularly simple representations of A (transformation of a matrix to diagonal or other "canonical" form) and (2) to find properties of matrices which are invariant with respect to similarity transformations and are thus common to each class of similar matrices (e.g., rank, trace, determinant, eigenvalues).
13.4-2. Eigenvalues and Spectra of Square Matrices (see also Sec. 14.8-3). (a) The eigenvalues (proper values, characteristic values, characteristic roots, latent roots) of a (finite or infinite) square matrix A = [aik] are those values of the scalar parameter λ for which the matrix A − λI is singular. The spectrum (eigenvalue spectrum) of the matrix A is the set of all its eigenvalues.
The eigenvalues of a square matrix A can be defined directly as the eigenvalues of a linear operator represented by A (Sec. 14.8-3); similar matrices have identical spectra. Note that the spectrum of an infinite matrix may or may not be a discrete set: e.g., every real number between −π and π is an eigenvalue of the infinite hermitian matrix [ajk] with ajk = i(−1)^(j−k)/(j − k) for j ≠ k and ajj = 0. Some authors restrict the term eigenvalue to values of λ in the discrete spectrum (see also Sec. 14.8-3d).
(b) Given a normal matrix A (A†A = AA†, Sec. 13.3-4b) with eigenvalues λ, A† has the eigenvalues λ*, H1 = ½(A + A†) has the eigenvalues Re (λ), and H2 = (1/2i)(A − A†) has the eigenvalues Im (λ) (see also Sec. 13.3-4a).
All eigenvalues of a given normal matrix are real if and only if the matrix is similar to a hermitian matrix (see also Sec. 14.8-4). In particular, all eigenvalues of hermitian and real symmetric matrices are real. All eigenvalues of a unitary matrix have absolute values equal to 1; in particular, real eigenvalues of real orthogonal matrices equal +1 or −1, and their
complex eigenvalues occur in pairs e^(±iφ). A square matrix is nonsingular if and only if all its eigenvalues are different from zero.
(c) Refer to Secs. 13.4-5a, 14.8-5, and 20.3-5 for the numerical calculation of eigenvalues, and to Sec. 14.8-9 for the calculation of bounds for the eigenvalues of a given matrix.
13.4-3. Transformation of a Square Matrix to Triangular Form. Algebraic Multiplicity of an Eigenvalue (see also Sec. 14.8-3e). (a) Given any square matrix A having a purely discrete eigenvalue spectrum, there exists a similarity transformation Ā = T⁻¹AT such that Ā is triangular (Sec. 13.2-1c). The diagonal elements of every triangular matrix similar to A are eigenvalues of A, and each eigenvalue λj of A occurs exactly m′j ≥ 1 times as a diagonal element; m′j is called the algebraic multiplicity of the eigenvalue λj.
Note: m′j is not necessarily equal to the degree of degeneracy mj defined in Sec. 14.8-3b. In the case of infinite matrices, one or more of the m′j may be infinite.
(b) If Tr (A) exists, and A has a purely discrete eigenvalue spectrum, Tr (A) equals the sum of all eigenvalues, each counted a number of times equal to its algebraic multiplicity. If det (A) exists (Sec. 13.2-7), it equals the similarly computed product of the eigenvalues (see also Sec. 13.4-5).
13.4-4. Diagonalization of Matrices (see also Sec. 14.8-5). (a) A square matrix A can be diagonalized by a similarity transformation (i.e., there exists a nonsingular transformation matrix T such that Ā = T⁻¹AT is diagonal, Sec. 13.2-1c) if and only if A has a purely discrete eigenvalue spectrum and is similar to a normal matrix (Sec. 13.3-4b). More specifically, a given matrix A having a purely discrete eigenvalue spectrum can be diagonalized by a similarity transformation with unitary transformation matrix T (or with a real orthogonal transformation matrix T if A is real) if and only if A is a normal matrix (A†A = AA†, Sec. 13.3-4b). In each case the diagonal elements of Ā are eigenvalues of A; every eigenvalue occurs a number of times equal to its algebraic multiplicity. Refer to Sec. 14.8-6 for a procedure yielding transformation matrices T with the desired properties.
SPECIAL CASES OF DIAGONALIZABLE MATRICES. Hermitian and unitary matrices (and thus real symmetric and real orthogonal matrices) are special instances of normal matrices. Every matrix having only discrete eigenvalues of algebraic multiplicity 1 is similar to a normal matrix.
(b) Two hermitian matrices A and B with purely discrete eigenvalue spectra can be diagonalized by the same similarity transformation (and, in particular, by the same similarity transformation with unitary transformation matrix T) if and only if BA = AB (see also Secs. 13.5-5 and 14.8-6e).
(c) Given any hermitian matrix A having a purely discrete eigenvalue spectrum, there exists a nonsingular matrix T such that Ā = T†AT is
diagonal; the diagonal elements of Ā are then real. In particular, there exists a nonsingular matrix T such that the diagonal elements of Ā take only the values +1, −1, and/or 0.
Given any real symmetric matrix A having a purely discrete eigenvalue spectrum, there exists a real nonsingular matrix T such that Ā = T̃AT is diagonal. In particular, there exists a real nonsingular matrix T such that the diagonal elements of Ā take only the values +1, −1, and/or 0. Matrices T with the desired properties are obtained from T = DU, where U is a unitary (or real orthogonal) matrix such that U⁻¹AU is diagonal, and D is a real diagonal matrix; U is found by the method of Sec. 14.8-6 (see also Sec. 13.5-4d).
13.4-5. Eigenvalues and Characteristic Equation of a Finite Matrix. (a) The eigenvalue spectrum of a finite n X n matrix A ≡ [aik] is identical with the set of roots λ of the nth-degree algebraic equation

                                  | a11 − λ   a12       · · ·   a1n     |
    FA(λ) ≡ det (A − λI)        = | a21       a22 − λ   · · ·   a2n     |
          = det [aik − λδik]      | · · · · · · · · · · · · · · · · ·   | = 0        (13.4-5)
                                  | an1       an2       · · ·   ann − λ |

    (CHARACTERISTIC EQUATION OR SECULAR EQUATION OF THE MATRIX A)

The multiplicity (order, Sec. 1.6-2) of each root λj equals its algebraic multiplicity m′j as an eigenvalue, so that m′1 + m′2 + · · · = n. Similar n X n matrices have identical characteristic equations; the coefficients in Eq. (5) are symmetric functions of the n roots λ1, λ2, . . . , λn (Sec. 1.6-4). In particular, the coefficient of λ^(n−1) and the constant term in Eq. (5) are, respectively,

    (−1)^(n−1)(λ1 + λ2 + · · · + λn) = (−1)^(n−1) Tr (A)
    λ1λ2 · · · λn = det (A)        (13.4-6)

The coefficient of λ^(n−r) equals (−1)^(n−r) times the sum of the r-rowed principal minors (Sec. 1.6-4) of det (A).
(b) (See also Sec. 14.8-3.) Given a finite square matrix A with eigenvalues λj, aA has the eigenvalues aλj, and A^p has the eigenvalues λj^p (p = 0, 1, 2, . . . ; p = 0, ±1, ±2, . . . if A is nonsingular). Every polynomial or analytic function f(A) (Sec. 13.2-12) has the eigenvalues f(λj). A matrix power series Σ (k = 0 to ∞) ak A^k converges (Sec. 13.2-11a) if and only if the power series Σ (k = 0 to ∞) ak λj^k converges for every eigenvalue λj of A. Given two finite square matrices A and B having the respective eigenvalues λj and μh, the eigenvalue spectrum of the direct product A ⊗ B (Sec. 13.2-10) is the set of products λjμh.
13.4-6. Eigenvalues of Step Matrices (Direct Sums, Sec. 13.2-9). The spectrum of a (finite or infinite) step matrix (direct sum) A = A1 ⊕ A2 ⊕ · · · is the union of the spectra of A1, A2, . . . ; algebraic multiplicities add. The contribution of each finite submatrix Ak may be obtained with the aid of its characteristic equation.
13.4-7. The Cayley-Hamilton Theorem and Related Topics. (a) Every finite square matrix A satisfies its own characteristic equation (Sec. 13.4-5a), i.e.,

    FA(A) = 0        (Cayley-Hamilton theorem)        (13.4-7)

(b) The Cayley-Hamilton theorem permits one to represent every integral power, and hence every analytic function, of a finite n X n matrix A as a linear function of any n distinct positive integral powers of A (see also Sec. 13.2-12). Specifically,

    f(A) ≡ (1/Δ) Σ (k = 1 to n) Δk A^(n−k)        (13.4-8)

where Δ is the Vandermonde determinant (Sec. 1.6-5) det [λj^(k−1)], and Δk is the determinant obtained on substitution of f(λ1), f(λ2), . . . , f(λn) for λ1^(n−k), λ2^(n−k), . . . , λn^(n−k) in Δ. If the eigenvalues λ1, λ2, . . . , λn of the matrix A are distinct, then Eq. (8) can be rewritten as

    f(A) = Σ (k = 1 to n) f(λk) Π (j ≠ k) [(A − λjI)/(λk − λj)]        (Sylvester's theorem)        (13.4-9)
13.5. QUADRATIC AND HERMITIAN FORMS
13.5-1. Bilinear Forms. A bilinear form in the 2n real or complex variables ξ1, ξ2, . . . , ξn, η1, η2, . . . , ηn is a homogeneous polynomial of the second degree (Sec. 1.4-3a)

    x̃Ay ≡ Σ (i = 1 to n) Σ (k = 1 to n) aik ξi ηk        (13.5-1)

13.5-2. Quadratic Forms. A (homogeneous) quadratic form in n real or complex variables* ξ1, ξ2, . . . , ξn is a polynomial

    x̃Ax ≡ Σ (i = 1 to n) Σ (k = 1 to n) aik ξi ξk ≡ x̃A1x        (13.5-2)
* The theory of Secs. 13.5-2 to 13.5-6 also applies to quadratic and hermitian forms in a countably infinite set of variables ξ1, ξ2, . . . (n = ∞), and to the corresponding infinite matrices, provided that the infinite sums Σ (i = 1 to ∞) |ξi|², Σ (i, k = 1 to ∞) |aik|², and Σ (i, k = 1 to ∞) |bik|² converge.
where A1 = ½(A + Ã) is the "symmetric part" (Sec. 13.3-4) of the matrix A = [aik]. The expression (2) vanishes identically if and only if A is skew-symmetric (aki = −aik, Sec. 13.3-2). A quadratic form (2) is symmetric if and only if the matrix A = [aik] is symmetric (aki = aik, Sec. 13.3-2), and real if and only if A (and thus every aik, Sec. 13.2-1a) is real. A real symmetric quadratic form (2), and also the corresponding real symmetric matrix A, is called positive definite, negative definite, nonnegative, or nonpositive if and only if, respectively, x̃Ax > 0, x̃Ax < 0, x̃Ax ≥ 0, or x̃Ax ≤ 0 for every set of real numbers ξ1, ξ2, . . . , ξn not all equal to zero. All other real symmetric quadratic forms are indefinite (i.e., the sign of x̃Ax depends on ξ1, ξ2, . . . , ξn) or identically zero.
A real symmetric quadratic form (2), and also the corresponding real symmetric matrix A, is positive semidefinite or negative semidefinite if and only if it is, respectively, nonnegative or nonpositive, and x̃Ax = 0 for some set of real numbers ξ1, ξ2, . . . , ξn not all equal to zero.
13.5-3. Hermitian Forms. A hermitian form in n real or complex variables* ξ1, ξ2, . . . , ξn is a polynomial

    x†Ax ≡ Σ (i = 1 to n) Σ (k = 1 to n) aik ξi* ξk        (13.5-3)

such that the matrix A = [aik] is hermitian (aik = a*ki, Sec. 13.3-2). A form (3) is real for every set of complex numbers ξ1, ξ2, . . . , ξn if and only if A is hermitian (see also Sec. 14.4-4). A hermitian form (3), and also the corresponding hermitian matrix A = [aik], is called positive definite, negative definite, nonnegative, or nonpositive if and only if, respectively, x†Ax > 0, x†Ax < 0, x†Ax ≥ 0, or x†Ax ≤ 0 for every set of complex numbers ξ1, ξ2, . . . , ξn not all equal to zero. All other hermitian forms (or hermitian matrices) are indefinite (i.e., the sign of x†Ax depends on ξ1, ξ2, . . . , ξn) or identically zero. A hermitian form (3), and also the corresponding hermitian matrix A, is positive semidefinite or negative semidefinite if and only if it is, respectively, nonnegative or nonpositive, and x†Ax = 0 for some set of complex numbers ξ1, ξ2, . . . , ξn not all equal to zero.
13.5-4. Transformation of Quadratic and Hermitian Forms. Transformation to Diagonal Form. (a) A linear substitution

    ξi = Σ (k = 1 to n) tik ξ̄k        (i = 1, 2, . . . , n)        or        x = Tx̄        (det [tik] ≠ 0)        (13.5-4)

* See footnote to Sec. 13.5-2.
(nonsingular homogeneous linear transformation of coordinates or vector components, "alias" point of view, Sec. 14.6-1) transforms every quadratic form (2) into a quadratic form in the new variables ξ̄1, ξ̄2, . . . , ξ̄n:

    x̃Ax = Σ (i = 1 to n) Σ (k = 1 to n) āik ξ̄i ξ̄k ≡ x̄̃Āx̄

with

    āik = Σ (j = 1 to n) Σ (h = 1 to n) ajh tji thk        (i, k = 1, 2, . . . , n)        or        Ā = T̃AT        (13.5-5)

Ā is symmetric if A is symmetric; Ā is real if A and T are real.
The linear substitution (4) transforms every hermitian form (3) into a new hermitian form:

    x†Ax = Σ (i = 1 to n) Σ (k = 1 to n) āik ξ̄i* ξ̄k ≡ x̄†Āx̄

with

    āik = Σ (j = 1 to n) Σ (h = 1 to n) ajh t*ji thk        (i, k = 1, 2, . . . , n)        or        Ā = T†AT        (13.5-6)
(b) For every given real symmetric quadratic form (2) there exist linear transformations (4) with real coefficients tik such that the transformed matrix Ā in Eq. (5) is diagonal (see also Sec. 13.4-4c), so that

    x̃Ax = x̄̃Āx̄ = Σ (i = 1 to n) āii ξ̄i²        (13.5-7)

Similarly, for every given hermitian form (3) there exist linear transformations (4) such that

    x†Ax = x̄†Āx̄ = Σ (i = 1 to n) āii |ξ̄i|²        (13.5-8)

The number r of nonzero coefficients āii in Eq. (7) or (8) is independent of the particular diagonalizing transformation used and equals the rank of the given matrix A; r is called the rank of the given quadratic or hermitian form. For any given real symmetric quadratic form (2), the difference between the respective numbers of positive and negative coefficients āii in Eq. (7) is independent of the particular diagonalizing transformation used (Jacobi-Sylvester Law of Inertia); this number is referred to as the signature of the given quadratic form.
(c) In particular, there exists a real orthogonal diagonalizing matrix T for every real symmetric quadratic form (2), and a unitary diagonalizing matrix T for every hermitian form (3) (see also Sec. 13.4-4). The resulting principal-axes transformation (transformation to normal coordinates ξ̄1, ξ̄2, . . . , ξ̄n; see also Sec. 9.4-8) yields the normal form of the given quadratic or hermitian form, viz.,

    x̃Ax = x̄̃Āx̄ = Σ (i = 1 to n) λi ξ̄i²        or        x†Ax = x̄†Āx̄ = Σ (i = 1 to n) λi |ξ̄i|²        (13.5-9)

where the set of real numbers λi is the eigenvalue spectrum of the given matrix A (Sec. 13.4-2).
(d) The additional transformation ξ̄i = ζi/√|λi| (λi ≠ 0; i = 1, 2, . . . , n) reduces the expressions (9) to their respective canonical forms

    x̃Ax = Σ (i = 1 to n) εi ζi²        or        x†Ax = Σ (i = 1 to n) εi |ζi|²        (13.5-10)

where each εi equals +1, −1, or 0 if the corresponding eigenvalue λi is positive, negative, or zero.
(e) The calculation of suitable diagonalizing transformation matrices is discussed in Sec. 14.8-6.
13.5-5. Simultaneous Diagonalization of Two Quadratic or Hermitian Forms (see also Secs. 13.4-4b and 14.8-7). Given two real symmetric quadratic forms x̃Ax, x̃Bx, where x̃Bx is positive definite, it is possible to find a real transformation (4) which diagonalizes x̃Ax and x̃Bx simultaneously. In particular, there exists a real transformation (4) to new coordinates ξ̄1, ξ̄2, . . . , ξ̄n such that

    x̃Ax = x̄̃Āx̄ = Σ (i = 1 to n) μi ξ̄i²        x̃Bx = x̄̃B̄x̄ = Σ (i = 1 to n) ξ̄i²        (13.5-11)

Similarly, given two hermitian forms x†Ax, x†Bx, where x†Bx is positive definite, there exists a transformation (4) to new coordinates ξ̄1, ξ̄2, . . . , ξ̄n such that

    x†Ax = x̄†Āx̄ = Σ (i = 1 to n) μi |ξ̄i|²        x†Bx = x̄†B̄x̄ = Σ (i = 1 to n) |ξ̄i|²        (13.5-12)

In either case, the set of real numbers μ1, μ2, . . . , μn is the eigenvalue spectrum of the matrix B⁻¹A, obtainable as the set of roots of the nth-degree algebraic equation

    det (A − μB) = det [aik − μbik] = 0        (13.5-13)
The desired transformation matrix T is obtained by the method of Sec. 14.8-7b, or from T = UT0, where T0 is the matrix reducing x̃Bx or x†Bx to canonical form (Sec. 13.5-4d), and U is a unitary matrix which diagonalizes x̃Ax or x†Ax (Sec. 13.5-4c).
Note: Two real symmetric quadratic forms x̃Ax, x̃Bx or two hermitian forms x†Ax, x†Bx can be diagonalized simultaneously by the same unitary transformation matrix T if and only if BA = AB (see also Secs. 13.4-4b and 14.8-6e).
13.5-6. Tests for Positive Definiteness, Nonnegativeness, etc. (a) A real symmetric quadratic form or hermitian form is positive definite, negative definite, nonnegative, nonpositive, indefinite, or identically zero (Secs. 13.5-2 and 13.5-3) if and only if the (necessarily real) eigenvalues λj of the matrix A = [aik] are, respectively, all positive, all negative, all nonnegative, all nonpositive, of different signs, or all equal to zero. A real symmetric quadratic form or hermitian form is positive semidefinite or negative semidefinite if and only if it is, respectively, nonnegative or nonpositive, and at least one eigenvalue λj of the matrix A = [aik] equals zero. Note that the λj are the roots of the characteristic equation (13.4-5); the signs of these roots can often be investigated by one of the methods of Sec. 1.6-6.
(b) A hermitian matrix A = [aik] (and the corresponding hermitian form or real symmetric quadratic form) is positive definite if and only if every one of the quantities

                 | a11  a12 |        | a11  a12  a13 |
    a11,    det  | a21  a22 | ,  det | a21  a22  a23 | ,    . . . ,    det [aik]
                                     | a31  a32  a33 |

is positive (Sylvester's Criterion).
(c) A hermitian matrix A (and the corresponding hermitian form or real symmetric quadratic form) is negative definite, nonpositive, or negative semidefinite if and only if −A is, respectively, positive definite, nonnegative, or positive semidefinite.
(d) A matrix A is a nonnegative hermitian matrix if and only if there exists a matrix B such that A = B†B. A real matrix A is a nonnegative symmetric matrix if and only if there exists a real matrix B such that A = B̃B. In either case, A is positive definite if B, and thus A, is nonsingular.
(e) If both A and B are positive definite or nonnegative, the same is true for AB. Every positive definite matrix A has a unique pair of square roots H, −H defined by H² = A; H is positive definite (see also Sec. 13.3-4).

13.6. MATRIX NOTATION FOR SYSTEMS OF DIFFERENTIAL EQUATIONS (STATE EQUATIONS). PERTURBATIONS AND LYAPUNOV STABILITY THEORY
13.6-1. Systems of Ordinary Differential Equations. Matrix Notation. As noted in Sec. 9.1-3, a general system of ordinary differential equations (9.1-4) reduces to the first-order form

    dyi/dt = fi(t; y1, y2, . . . , yn)        (i = 1, 2, . . . , n)        (13.6-1a)

if appropriate derivatives are introduced as new variables yi. The system (1a) is written as a single matrix differential equation

    dy        | y1(t) |     | f1(t; y1, y2, . . . , yn) |
    --  = d/dt| y2(t) |  =  | f2(t; y1, y2, . . . , yn) |  ≡ f(t, y)        (13.6-1b)
    dt        | · · · |     | · · · · · · · · · · · · · |
              | yn(t) |     | fn(t; y1, y2, . . . , yn) |
(see also Sec. 13.2-11), where y(t) and f(t, y) are n X 1 column matrices. If the fi are single-valued and continuous and satisfy a Lipschitz condition (9.2-4) over the domain of interest, then the solution y(t) of Eq. (1) is uniquely determined by the initial condition

           | y1(0) |     | y10 |
    y(0) = | y2(0) |  =  | y20 |  ≡ y0        (13.6-1c)
           | · · · |     | ··· |
           | yn(0) |     | yn0 |
The system (1) is called autonomous if and only if f does not depend explicitly on the independent variable t.
More than merely a notational convenience, the matrix notation will be seen to extend intuitive insight gained from studies of simple first-order differential equations to systems of first-order equations. Matrix operations needed for the solution of linear systems (Sec. 13.6-2) are, moreover, readily implemented with digital computers. In the most important applications, t represents physical time, and the yi(t) are state variables representing the state of a dynamical system; the system (1) is then called a system of state equations (see also Sec. 11.8-4 and Refs. 13.10 to 13.16).*
13.6-2. Linear Differential Equations with Constant Coefficients (Time-invariant Systems). (a) Homogeneous Systems. Normal-mode Solution. The solution of the homogeneous linear system

    dyi/dt = Σ (k = 1 to n) aik yk        yi(0) = yi0        (i = 1, 2, . . . , n)        (13.6-2a)

or

    dy/dt = Ay        y(0) = y0        (A = [aik])        (13.6-2b)
* Many engineering texts refer to the matrix y(t) as a state vector. It would be more correct to state that the matrix elements yi(t) (state variables) represent a state vector in a specific scheme of measurements (in the sense of tensor analysis, Chap. 16; see also Ref. 13.15).
with constant coefficients aik (see also Secs. 9.3-1 and 9.4-1d) is explicitly given by

    y(t) = e^At y0        (t ≥ 0)        (13.6-3)

where the matrix function e^At is the n X n matrix defined in accordance with Secs. 13.2-12 and 13.4-7. Expansion of e^At by Eq. (13.4-8) involves cumbersome matrix multiplications but, if the given matrix A has n distinct eigenvalues, expansion of e^At by Sylvester's theorem (13.4-9) yields the normal-mode expansion of Sec. 9.4-1.
One can often simplify the solution of a problem (2) by introducing n new dependent variables (state variables) ȳh by a nonsingular linear transformation

    yi = Σ (h = 1 to n) tih ȳh        (i = 1, 2, . . . , n)        or        y = Tȳ        (13.6-4)
such that the resulting transformed system

    dȳ/dt = Āȳ        ȳ(0) = ȳ0        with        Ā = T⁻¹AT        ȳ0 = T⁻¹y0        (13.6-5)

is simplified (see also Secs. 14.6-1 and 14.6-2). If, in particular, there exists a transformation (4) which diagonalizes the given system matrix A (Secs. 13.4-4 and 14.8-6), then the transformed variables ȳh are normal coordinates of the given linear system (see also Sec. 9.4-8): they satisfy the "uncoupled" differential equations

    dȳh/dt = λh ȳh        (h = 1, 2, . . . , n)        (13.6-6)

where λ1, λ2, . . . , λn are the eigenvalues of A. If A has n distinct eigenvalues, the solution of the original problem (2) is then given by Eq. (4) with

    ȳh = ȳh0 e^(λh t)        (h = 1, 2, . . . , n)        (13.6-7)

Complex-conjugate terms in a normal-mode solution (4), and also coincident and zero eigenvalues, can be treated in a manner analogous to Sec. 9.4-1. In the general case, one can use a transformation (4) producing a triangular matrix Ā (Sec. 13.4-3), so that the ȳi(t) can be derived one by one (Ref. 13.15).
(b) Nonhomogeneous Equations. The State-transition Matrix. The linear system

    dyi/dt = Σ (k = 1 to n) aik yk + fi(t)        yi(0) = yi0        (i = 1, 2, . . . , n)        (13.6-8a)

or

    dy/dt = Ay + f(t)        y(0) = y0        (A = [aik])        (13.6-8b)
where f(t) is an n X 1 column matrix, describes the response of a (time-invariant) linear system to the inputs fi(t). As in Secs. 9.3-1 and 9.4-2, the matrix solution y(t) is obtained by superposition of the homogeneous-system solution (3) and a particular integral (normal response) yN(t):

    y(t) = e^At y0 + yN(t)
    with
    yN(t) = ∫ (0 to t) h+(t − τ) f(τ) dτ = ∫ (0 to t) h+(ζ) f(t − ζ) dζ        (t ≥ 0)        (13.6-9)

The n X n state-transition matrix h+(t − τ) = [{h+(t − τ)}ik] for the initial-value problem (8) is a generalization of the one-dimensional weighting function h+(t − τ) of Sec. 9.4-3 and satisfies

    dh+(t)/dt = Ah+(t)        (t > 0)        h+(0) = I        (13.6-10)

so that

    h+(t) = e^At        (t ≥ 0)        (13.6-11)

h+(t) is the response to the set of (asymmetrical) unit impulses fi(t) = δ+(t) (i = 1, 2, . . . , n; see also Sec. 9.4-3d). Note that the solution (9) is precisely analogous to the solution of the one-dimensional problem dy/dt = ay + f(t), y(0) = y0.
(c) Laplace-transform Solution (see also Sec. 9.4-5). Element-by-element Laplace transformation of the given constant-coefficient matrix equation (8) produces

    sY(s) − y0 = AY(s) + F(s)
or
    Y(s) = (sI − A)⁻¹y0 + (sI − A)⁻¹F(s)        (13.6-12)

where Y(s), F(s) denote the respective Laplace transforms of y(t), f(t). The terms in Eq. (12) are the transforms of those in Eq. (9); inverse Laplace transformation of each element Yi(s) of Y(s) produces yi(t).

13.6-3. Linear Systems with Variable Coefficients (see also Secs. 9.2-4, 9.3-3, and 18.12-2). (a) The most general linear system (1) has the form
    dy/dt = A(t)y + f(t)        (13.6-13)

where A(t) ≡ [aik(t)] is an n × n matrix, and f(t) is an n × 1 column matrix (linear differential equations with variable coefficients and forcing terms). The solution can again be written as

    y(t) = w+(t, 0)y0 + ∫0t w+(t, λ)f(λ) dλ        (t ≥ 0)        (13.6-14)

where w+(t, λ) is the n × n state-transition matrix determined for t ≥ λ as the solution of

    ∂w+(t, λ)/∂t = A(t)w+(t, λ)        (t > λ)        w+(λ, λ) = I        (13.6-15)
or as the response to a set of (asymmetrical) unit impulses fi(t) = δ+(t − λ), where i = 1, 2, . . . , n (see also Sec. 9.4-3d). For constant-coefficient systems, w+(t, λ) = h+(t − λ).
(b) For any real or complex matrix A(t) with continuous elements, the solution of the homogeneous linear system

    dy/dt = A(t)y        (13.6-16)

is Y(t)y(0), where Y = Y(t) is an n × n matrix and the unique solution of the matrix differential equation

    dY/dt = A(t)Y        Y(0) = I        (13.6-17)

Y(t) is nonsingular; its columns constitute n linearly independent solutions of Eq. (16) (fundamental-solution matrix, see also Sec. 9.3-2). U(t) ≡ [Y⁻¹(t)]† is the unique solution of

    dU/dt = −A†(t)U        U(0) = I        (13.6-18)

Equations (17) and (18) are adjoint linear differential equations.* The state-transition matrix w+(t, λ) of Sec. 13.6-3a is given by

    w+(t, λ) ≡ Y(t)Y⁻¹(λ) = Y(t)U†(λ)        (t ≥ λ)        (13.6-19)

so that Eq. (14) corresponds to a matrix version of the variation-of-constants solution of Sec. 9.3-3.
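Equations (13.6-17) and (13.6-19) lend themselves to a direct numerical check. In the sketch below, the time-varying matrix A(t) is an arbitrary illustrative choice, and the integrator is a plain fourth-order Runge-Kutta routine:

```python
import numpy as np

def A(t):                       # illustrative A(t), not from the text
    return np.array([[0.0, 1.0],
                     [-1.0, -t]])

def rk4(F, X0, t0, t1, n=400):
    # fourth-order Runge-Kutta from t0 to t1 for dX/dt = F(t, X)
    X, t, h = X0.copy(), t0, (t1 - t0) / n
    for _ in range(n):
        k1 = F(t, X); k2 = F(t + h/2, X + h/2*k1)
        k3 = F(t + h/2, X + h/2*k2); k4 = F(t + h, X + h*k3)
        X, t = X + h/6*(k1 + 2*k2 + 2*k3 + k4), t + h
    return X

# fundamental-solution matrix: dY/dt = A(t) Y, Y(0) = I   (Eq. 13.6-17)
Y = lambda t: rk4(lambda s, M: A(s) @ M, np.eye(2), 0.0, t)

# state-transition matrix w+(t, lam) = Y(t) Y^{-1}(lam)   (Eq. 13.6-19)
w = lambda t, lam: Y(t) @ np.linalg.inv(Y(lam))

y0 = np.array([1.0, 0.0])
y_direct = rk4(lambda s, v: A(s) @ v, y0, 0.0, 2.0)
assert np.allclose(w(2.0, 0.0) @ y0, y_direct, atol=1e-8)
# composition property of the state-transition matrix
assert np.allclose(w(2.0, 0.0), w(2.0, 1.0) @ w(1.0, 0.0), atol=1e-8)
```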
13.6-4. Perturbation Methods and Sensitivity Equations. (a) Given a system of differential equations

    dy/dt = f(t, y; α)        y(0) = y0        (13.6-20)

which depends on a set (column matrix) α ≡ {α1, α2, . . . , αm} of m parameters αk, let y(1)(t) be the known solution for the parameter values given by α = α1 ≡ {α11, α12, . . . , α1m}. The perturbed solution y(1)(t) + δy(t) corresponding to the perturbed parameter matrix α = α1 + δα may be easier to find through solution of

    d(δy)/dt = f(t, y(1) + δy; α1 + δα) − f(t, y(1); α1)        δy(0) = 0        (13.6-21)

for the perturbation (variation, Sec. 11.5-1) δy than by direct solution of Eq. (20). Equation (21) is exact. For suitably differentiable f(t, y; α), one may, however, be able to neglect all but first-order terms in a Taylor-series expansion of Eq. (21) to find an approximation to δy (first-order perturbation) by solving the linear system

    d(δy)/dt = (∂f/∂y)δy + (∂f/∂α)δα        δy(0) = 0        (13.6-22)

* d/dt − A(t) and −d/dt − A†(t) are adjoint operators on n × 1 matrix functions u(t) such that ∫0∞ u†(t)u(t) dt exists and u(0) = 0, if one defines the inner product of two such functions u, v by (u, v) = ∫0∞ u†(t)v(t) dt (Sec. 14.4-3; see also Sec. 15.4-3).
where the elements of the n × n matrix ∂f/∂y ≡ [∂fi/∂yk] and the n × m matrix ∂f/∂α ≡ [∂fi/∂αk], both evaluated at y = y(1), will, in general, depend on the "nominal solution" y(1)(t) and hence on t. If the perturbations δyi are small compared with the |yi|, one may be able to neglect approximation and numerical errors in the computation of the δyi.

(b) The dependence of the solution y(t) on the parameters αk is often described by the sensitivity coefficients (parameter-influence coefficients) Zik ≡ ∂yi/∂αk, which form an n × m matrix Z ≡ ∂y/∂α ≡ [∂yi/∂αk] evaluated at y = y(1). For each given nominal solution y(t) = y(1)(t), the sensitivity coefficients are functions of t and satisfy the mn linear differential equations (sensitivity equations)

    dZ/dt = (∂f/∂y)Z + ∂f/∂α        (13.6-23)
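A scalar sketch of a sensitivity computation of this kind; the example system dy/dt = −αy is an assumed illustration with the known closed-form sensitivity z = ∂y/∂α = −t e^(−αt):

```python
import numpy as np

alpha = 0.5

def rhs(t, x):
    # x = [y, z]: the nominal equation and its sensitivity equation
    # dz/dt = (df/dy) z + df/dalpha = -alpha*z - y, integrated together
    y, z = x
    return np.array([-alpha * y, -alpha * z - y])

x, t, h = np.array([1.0, 0.0]), 0.0, 0.001   # y(0) = 1, z(0) = 0
for _ in range(2000):                        # RK4 up to t = 2
    k1 = rhs(t, x); k2 = rhs(t + h/2, x + h/2*k1)
    k3 = rhs(t + h/2, x + h/2*k2); k4 = rhs(t + h, x + h*k3)
    x, t = x + h/6*(k1 + 2*k2 + 2*k3 + k4), t + h

# closed form: y = e^{-alpha t}, z = dy/dalpha = -t e^{-alpha t}
assert np.isclose(x[0], np.exp(-1.0))
assert np.isclose(x[1], -2.0 * np.exp(-1.0))
```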
(c) The initial values yi(0) = yi0 may be treated as parameters in perturbation and sensitivity calculations. In this case, the initial conditions δy(0) = 0 in Eqs. (21) and (22) must be replaced by

    δy(0) = y(0) − y(1)(0) ≡ δy0        (13.6-24)

The appropriate initial conditions for the sensitivity equations (23), with αk = yk0, are

    Zik(0) = ∂yi0/∂yk0 = δik, i.e., Z(0) = I        (13.6-25)
13.6-5. Stability of Solutions: Definitions (see also Sec. 9.5-4). (a) Given a system

    dy/dt = f(t, y)        (t ≥ 0)        (13.6-26)

different types of stability of a solution y = y(1)(t) can be defined in terms of the effects of various parameter perturbations (Sec. 13.6-4). The following theory is concerned with stability in the sense of Lyapunov, which is determined by the effects of small changes δy(t0) ≡ y(t0) − y(1)(t0) in initial solution values on the resulting perturbations δy(t) ≡ y(t) − y(1)(t) for t > t0.
The solution y = y(1)(t) of the system (26) is

stable in the sense of Lyapunov if and only if for every real ε > 0 there exists a real Δ(ε, t0) > 0 such that ||δy(t0)|| < Δ(ε, t0) implies ||δy(t)|| < ε for all t > t0. Otherwise the solution is unstable.

asymptotically stable in a region D1(t0) of the "state space" of points y ≡ {y1, y2, . . . , yn} if and only if y(1)(t) is stable, and y(t0) in D1(t0) implies lim(t→∞) δy(t) = 0 (i.e., ||δy(t)|| → 0 as t → ∞, Sec. 13.2-11).

asymptotically stable in the large (completely stable, globally asymptotically stable) if and only if the entire state space is a region of asymptotic stability.

Note: In the above definitions, the norm ||δy|| of the n × 1 column matrix δy ≡ {δy1, δy2, . . . , δyn}, defined in accordance with Eq. (13.2-2) as

    ||δy|| = sup |ξ1 δy1 + ξ2 δy2 + · · · + ξn δyn|        (ξ1² + ξ2² + · · · + ξn² = 1)        (13.6-27a)

can be conveniently replaced by one of the alternate norms (Table 13.2-1)

    ||δy||2 = [(δy1)² + (δy2)² + · · · + (δyn)²]^(1/2)        (Euclidean norm)        (13.6-27b)
or
    ||δy||1 = |δy1| + |δy2| + · · · + |δyn|        (taxicab norm)        (13.6-27c)
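A small numerical illustration of these norms; note that the sup in (13.6-27a) is attained at ξ = δy/||δy||2, so that norm coincides with the Euclidean norm (27b) by the Cauchy-Schwarz inequality (the example vector is an assumption):

```python
import numpy as np

dy = np.array([3.0, -4.0, 12.0])          # an example perturbation

norm2 = np.sqrt(np.sum(dy**2))            # Euclidean norm, Eq. (13.6-27b)
norm1 = np.sum(np.abs(dy))                # taxicab norm,  Eq. (13.6-27c)

# estimate the sup norm (13.6-27a) over random unit vectors xi
rng = np.random.default_rng(0)
xi = rng.normal(size=(1000, 3))
xi /= np.linalg.norm(xi, axis=1, keepdims=True)
sup_estimate = np.abs(xi @ dy).max()

assert norm2 == 13.0 and norm1 == 19.0
assert sup_estimate <= norm2 + 1e-12          # Cauchy-Schwarz bound
assert np.isclose((dy / norm2) @ dy, norm2)   # bound attained at xi = dy/norm2
```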
Note that these definitions refer to stability of solutions, not of systems (see also Secs. 9.4-4 and 13.6-7). If a solution is stable in the sense of Lyapunov, sufficiently small changes in initial values cannot cause large solution changes at any time. For an asymptotically stable solution, the effects of finite initial-value changes, up to specified bounds, are nullified after sufficient time has elapsed. If the solution is asymptotically stable in the large, even arbitrarily large initial-value changes will have negligible long-term effects. Asymptotic stability is a requirement for practical control systems.

(b) An unstable solution y(1)(t) of Eq. (26) has a finite escape time T if and only if it becomes unbounded after a finite time t = T.
13.6-6. Lyapunov Functions and Stability. (a) Stability of Equilibrium for Autonomous Systems (see also Sec. 9.5-4b). An equilibrium solution y(t) ≡ y(1) (t ≥ 0) of the autonomous system

    dy/dt = f(y)        (t ≥ 0)        (13.6-28)

is defined by

    f(y(1)) = 0        (13.6-29)
It will suffice to consider equilibrium solutions y(t) ≡ y(1) = 0, since
other equilibrium "points" y = y(1) in state space can be translated to the origin by a simple coordinate transformation.

With reference to the solution y(t) ≡ 0 of a given system (28), a Lyapunov function is any real function V(y) such that V(0) = 0 and, throughout a neighborhood D of the "point" y = 0 in the "state space" of "points" y ≡ {y1, y2, . . . , yn}, V(y) is continuously differentiable and

    V[y(t)] > 0        dV[y(t)]/dt ≤ 0        (13.6-30)

for all y(t) ≠ 0 satisfying Eq. (28). The equilibrium solution y(t) ≡ 0 is stable in the sense of Lyapunov if (and only if, Ref. 13.11) there exists a corresponding Lyapunov function.
y(t) ≡ 0 is asymptotically stable

1. If there exists a Lyapunov function V(y) satisfying the stronger condition dV/dt < 0 for all solutions y(t) ≠ 0 of Eq. (28) in D (Lyapunov's Theorem on Asymptotic Stability).
2. If there exists a Lyapunov function V(y) with dV/dt not identically zero on any solution trajectory y = y(t) in D (Kalman-Bertram Theorem).

If the neighborhood D of the origin defining a Lyapunov function V(y) contains a bounded region D1 such that V(y) < V0, where V0 is any positive constant, then y(t) ≡ 0 is asymptotically stable in D1. If a Lyapunov function V(y) can be defined for the entire state space, and V(y) → ∞ as ||y(t)|| → ∞, then the solution y(t) ≡ 0 is asymptotically stable in the large (LaSalle's Theorem on Asymptotic Stability).

The equilibrium solution y(t) ≡ 0 of Eq. (28) is unstable if there exists a neighborhood D of y = 0, a region D1 in D, and a real function U(y) such that

1. U(y) is continuously differentiable, and U[y(t)] > 0, dU[y(t)]/dt > 0 for all solutions y(t) in D1, except that
2. U(y) = 0 at all boundary points (Sec. 4.3-6) of D1 in D.
3. y = 0 is a boundary point of D1.

(Cetaev's Instability Theorem).
(b) Nonautonomous Systems. Every solution y = y(1)(t) of the system

    dy/dt = f(t, y)        (t ≥ 0)        (13.6-31)

can be transformed to the equilibrium solution ȳ(t) ≡ 0 of a new (generally nonautonomous) system by the transformation ȳ(t) = y(t) − y(1)(t).
The equilibrium solution y(t) ≡ 0 of a given system (31) is asymptotically stable in the large if there exist a continuously differentiable real function V(t, y), two continuous nondecreasing real functions V1(||y||), V2(||y||), and a continuous real function V3(||y||) such that V(t, 0) = V1(0) = V2(0) = V3(0) = 0 and

    V2(||y||) ≥ V(t, y) ≥ V1(||y||) > 0        dV/dt ≤ −V3(||y||)        (y ≠ 0)
13.6-7. Applications and Examples (see also Sec. 9.5-4). (a) Applications such as control-system design motivate the search for Lyapunov functions establishing asymptotic stability in specified state-space regions, or in as large regions as possible ("direct method" of Lyapunov for stability investigations). Lyapunov functions for particular solutions are not unique, and practical search methods are more of an art than a science (Refs. 13.11 to 13.14).
(b) As noted in Sec. 9.5-4a, the equilibrium solution y(t) ≡ 0 of the linear homogeneous constant-coefficient system

    dy/dt = Ay        (13.6-32)

(Sec. 13.6-2a) is asymptotically stable in the large (completely stable) if and only if the system is completely stable in the sense of Sec. 9.4-4, i.e., if and only if all eigenvalues of the system matrix A have negative real parts. This is true if and only if, for an arbitrary positive definite symmetric matrix Q, there exists a positive definite symmetric matrix P such that

    AᵀP + PA = −Q        (13.6-33)

V(y) = yᵀPy is then a Lyapunov function for the equilibrium solution y(t) ≡ 0.

(c) Duffing's equation

    ÿ + aẏ + y + by³ = 0

describes the oscillations of a nonlinear spring. Introducing y = y1, ẏ = y2, one has the nonlinear first-order system

    dy1/dt = y2        dy2/dt = −y1 − by1³ − ay2
The theory of Sec. 13.6-6a indicates that

    V(y1, y2) ≡ ¼(by1⁴ + 2y1² + 2y2²)        with dV/dt = −ay2²

is a Lyapunov function for the equilibrium solution y1(t) = y2(t) ≡ 0 when a > 0, b > 0 ("hard spring"); this solution is asymptotically stable in the large. For a > 0, b < 0 ("soft spring"), the equilibrium solution y1(t) = y2(t) ≡ 0 is asymptotically stable, but not in the large (Fig. 13.6-1).
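The matrix criterion (13.6-33) of paragraph (b) can be sketched with a Kronecker-product solve (scipy.linalg.solve_continuous_lyapunov would serve equally well); the stable example matrix A and the choice Q = I are assumptions for the illustration:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])       # eigenvalues -1, -2: all real parts negative
Q = np.eye(2)                      # arbitrary positive definite symmetric Q

# vec(A^T P + P A) = (kron(A^T, I) + kron(I, A^T)) vec(P)
n = A.shape[0]
M = np.kron(A.T, np.eye(n)) + np.kron(np.eye(n), A.T)
P = np.linalg.solve(M, -Q.flatten()).reshape(n, n)

assert np.allclose(A.T @ P + P @ A, -Q)               # Eq. (13.6-33) holds
assert np.all(np.linalg.eigvalsh((P + P.T) / 2) > 0)  # P is positive definite
```

Here P works out to [[1.25, 0.25], [0.25, 0.25]], so V(y) = yᵀPy is a Lyapunov function establishing asymptotic stability in the large for this A.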
Fig. 13.6-1. Region of asymptotic stability for Duffing's equation with a = 1, b = −0.04. (Based on Ref. 13.11.)
13.7. RELATED TOPICS, REFERENCES, AND BIBLIOGRAPHY
13.7-1. Related Topics. The following topics related to the study of matrices, quadratic forms, and hermitian forms are treated in other chapters of this handbook:

Linear simultaneous equations: Chap. 1
Systems of ordinary differential equations: Chap. 9
Matrix notation for optimum-control problems: Chap. 11
Use of matrices for the representation of vectors, linear transformations (linear operators), scalar products, and group elements: Chap. 14
Eigenvectors and eigenvalues of linear operators: Chap. 14
Matrix techniques for difference equations: Chap. 20
Numerical techniques: Chap. 20
13.7-2. References and Bibliography (see also Secs. 12.9-2 and 14.11-2).

13.1. Aitken, A. C.: Determinants and Matrices, 8th ed., Interscience, New York, 1956.
13.2. Birkhoff, G., and S. MacLane: A Survey of Modern Algebra, rev. ed., Macmillan, New York, 1965.
13.3. Gantmakher, F. R.: The Theory of Matrices, Chelsea, New York, 1959.
13.4. ———: Applications of the Theory of Matrices, Interscience, New York, 1959.
13.5. Hohn, F. E.: Elementary Matrix Algebra, 2d ed., Macmillan, New York, 1964.
13.6. Nering, E. D.: Linear Algebra and Matrix Theory, Interscience, New York, 1963.
13.7. Shields, P. C.: Linear Algebra, Addison-Wesley, Reading, Mass., 1964.
13.8. Thrall, R. M., and L. Tornheim: Vector Spaces and Matrices, Wiley, New York, 1957.
13.9. Zurmuehl, R.: Matrizen, 2d ed., Springer, Berlin, 1964.

(See also the articles by G. Falk and H. Tietz in vol. II of the Handbuch der Physik, Springer, Berlin, 1955. For numerical techniques, see Secs. 20.3-3 to 20.3-5.)

Matrix Techniques for Systems of Differential Equations (see also Refs. 9.3 and 9.16 in Sec. 9.7-2)

13.10. DeRusso, P., et al.: State Variables for Engineers, Wiley, New York, 1965.
13.11. Geiss, G. R.: "The Analysis and Design of Nonlinear Control Systems via Lyapunov's Direct Method," RTD-TDR-63-4076, U.S. Air Force Flight Dynamics Laboratory, Wright-Patterson AFB, Ohio, 1964.
13.12. Hahn, W.: Theory and Application of Lyapunov's Direct Method, Prentice-Hall, Englewood Cliffs, N.J., 1963.
13.13. Krasovskii, N. N.: Stability of Motion, Stanford, Stanford, Calif., 1963.
13.14. Letov, A. M.: Stability of Nonlinear Control Systems, Princeton, Princeton, N.J., 1961.
13.15. Schultz, D. G., and J. L. Melsa: State Functions in Automatic Control, McGraw-Hill, New York, 1967.
13.16. Tomović, R.: Sensitivity Analysis of Dynamic Systems, McGraw-Hill, New York, 1964.
CHAPTER 14

LINEAR VECTOR SPACES AND LINEAR TRANSFORMATIONS (LINEAR OPERATORS). REPRESENTATION OF MATHEMATICAL MODELS IN TERMS OF MATRICES
14.1. Introduction. Reference Systems and Coordinate Transformations
    14.1-1. Introductory Remarks
    14.1-2. Numerical Description of Mathematical Models: Reference Systems
    14.1-3. Coordinate Transformations
    14.1-4. Invariance
    14.1-5. Schemes of Measurements

14.2. Linear Vector Spaces
    14.2-1. Defining Properties
    14.2-2. Linear Manifolds and Subspaces in V
    14.2-3. Linearly Independent Vectors and Linearly Dependent Vectors
    14.2-4. Dimension of a Linear Manifold or Vector Space. Bases and Reference Systems (Coordinate Systems)
    14.2-5. Normed Vector Spaces
    14.2-6. Unitary Vector Spaces
    14.2-7. Metric and Convergence in Normed Vector Spaces. Banach Spaces and Hilbert Spaces
    14.2-8. The Projection Theorem

14.3. Linear Transformations (Linear Operators)
    14.3-1. Linear Transformation of a Vector Space. Linear Operators
    14.3-2. Range, Null Space, and Rank of a Linear Transformation (Operator)
    14.3-3. Addition and Multiplication by Scalars. Null Transformation
    14.3-4. Product of Two Linear Transformations (Operators). Identity Transformation
    14.3-5. Nonsingular Linear Transformations (Operators). Inverse Transformations (Operators)
    14.3-6. Integral Powers of Operators

14.4. Linear Transformations of a Normed or Unitary Vector Space into Itself. Hermitian and Unitary Transformations (Operators)
    14.4-1. Bound of a Linear Transformation
    14.4-2. The Bounded Linear Transformations of a Normed Vector Space into Itself
    14.4-3. Hermitian Conjugate of a Linear Transformation (Operator)
    14.4-4. Hermitian Operators
    14.4-5. Unitary Transformations (Operators)
    14.4-6. Symmetric, Skew-symmetric, and Orthogonal Transformations of Real Unitary Vector Spaces
    14.4-7. Combination Rules
    14.4-8. Decomposition Theorems. Normal Operators
    14.4-9. Conjugate Vector Spaces. More General Definition of Conjugate Operators
    14.4-10. Infinitesimal Linear Transformations

14.5. Matrix Representation of Vectors and Linear Transformations (Operators)
    14.5-1. Transformation of Base Vectors and Vector Components: "Alibi" Point of View
    14.5-2. Matrix Representation of Vectors and Linear Transformations (Operators)
    14.5-3. Matrix Notation for Simultaneous Linear Equations
    14.5-4. Dyadic Representation of Linear Operators

14.6. Change of Reference System
    14.6-1. Transformation of Base Vectors and Vector Components: "Alias" Point of View
    14.6-2. Representation of a Linear Operator in Different Schemes of Measurements
    14.6-3. Consecutive Operations

14.7. Representation of Inner Products. Orthonormal Bases
    14.7-1. Representation of Inner Products
    14.7-2. Change of Reference System
    14.7-3. Orthogonal Vectors and Orthonormal Sets of Vectors
    14.7-4. Orthonormal Bases (Complete Orthonormal Sets)
    14.7-5. Matrices Corresponding to Hermitian-conjugate Operators
    14.7-6. Reciprocal Bases
    14.7-7. Comparison of Notations

14.8. Eigenvectors and Eigenvalues of Linear Operators
    14.8-1. Introduction
    14.8-2. Invariant Manifolds. Decomposable Linear Transformations (Linear Operators) and Matrices
    14.8-3. Eigenvectors, Eigenvalues, and Spectra
    14.8-4. Eigenvectors and Eigenvalues of Normal and Hermitian Operators
    14.8-5. Determination of Eigenvalues and Eigenvectors: Finite-dimensional Case
    14.8-6. Reduction and Diagonalization of Matrices. Principal-axes Transformations
    14.8-7. "Generalized" Eigenvalue Problems
    14.8-8. Eigenvalue Problems as Stationary-value Problems
    14.8-9. Bounds for the Eigenvalues of Linear Operators
    14.8-10. Nonhomogeneous Linear Vector Equations

14.9. Group Representations and Related Topics
    14.9-1. Group Representations
    14.9-2. Reduction of a Representation
    14.9-3. The Irreducible Representations of a Group
    14.9-4. The Character of a Representation
    14.9-5. Orthogonality Relations
    14.9-6. Direct Products of Representations
    14.9-7. Representations of Rings, Fields, and Linear Algebras

14.10. Mathematical Description of Rotations
    14.10-1. Rotations in Three-dimensional Euclidean Vector Space
    14.10-2. Angle of Rotation. Rotation Axis
    14.10-3. Euler Parameters and Gibbs Vector
    14.10-4. Representation of Vectors and Rotations by Spin Matrices and Quaternions. Cayley-Klein Parameters
    14.10-5. Rotations about the Coordinate Axes
    14.10-6. Euler Angles
    14.10-7. Infinitesimal Rotations, Continuous Rotation, and Angular Velocity
    14.10-8. The Three-dimensional Rotation Group and Its Representations

14.11. Related Topics, References, and Bibliography
    14.11-1. Related Topics
    14.11-2. References and Bibliography
14.1. INTRODUCTION. REFERENCE SYSTEMS AND COORDINATE TRANSFORMATIONS
14.1-1. Introductory Remarks. This chapter reviews the theory of linear vector spaces (see also Sec. 12.4-1) and linear transformations (linear operators). Vectors and linear operators represent physical objects and operations in many important applications.

Most practical problems require a description (representation) of mathematical models (Sec. 12.1-1) in terms of ordered sets of real or complex numbers. In particular, the concepts of homomorphism and isomorphism (Sec. 12.1-6) make it possible to "represent" many mathematical models by corresponding classes of matrices (Sec. 13.2-1; see also Sec. 13.2-5), so that abstract mathematical operations are made to correspond to numerical operations on matrix elements (EXAMPLES: matrix representations of quantum-mechanical operators and of electrical transducers). Sections 14.5-1 to 14.10-8 describe the use of matrices to represent vectors, linear operators, and group elements.

14.1-2. Numerical Description of Mathematical Models: Reference Systems (see also Secs. 2.1-2, 3.1-2, 5.2-2, 6.2-1, 12.1-1, and 16.1-2). A reference system (coordinate system) is a scheme of rules which describes (represents) each object (point) of a class (space, region of a space) C by a corresponding ordered set of (real or complex) numbers (components, coordinates) x1, x2, . . . . The number of coordinates required to define each point (x1, x2, . . .) is called the dimension of the space C (see also Sec. 14.2-4b). In many applications, coordinate values are related to physical measurements.
14.1-3. Coordinate Transformations (see also Secs. 2.1-5 to 2.1-8, 3.1-12, 6.2-1, and 16.1-2). A transformation of the coordinates x1, x2, . . . is a set of rules or relations associating each point (x1, x2, . . .) with a new set of coordinates. Coordinate transformations admit two interpretations:

1. "Alibi" or "active" point of view: the coordinate transformation

    x1′ = x1′(x1, x2, . . .)        x2′ = x2′(x1, x2, . . .)        · · ·        (14.1-1)

describes an operation (function, mapping, Sec. 12.1-4) associating a new mathematical object (point) (x1′, x2′, . . .) with each given point (x1, x2, . . .).

2. "Alias" or "passive" point of view: the coordinate transformation

    x̄1 = x̄1(x1, x2, . . .)        x̄2 = x̄2(x1, x2, . . .)        · · ·        (14.1-2)

introduces a new description (new representation, relabeling) of each point (x1, x2, . . .) in terms of the new coordinates x̄1, x̄2, . . . .

Coordinate transformations permit one to represent abstract mathematical relationships by numerical relations ("alibi" point of view) and to change reference systems ("alias" point of view). A change of reference system often simplifies a given problem (EXAMPLES: principal-axes transformations, Secs. 2.4-8, 3.5-7, 9.4-8, and 17.4-7; contact transformations, Secs. 10.2-5 and 11.5-6; angle and action variables in dynamics).
14.1-4. Invariance (see also Secs. 12.1-5 and 16.2-1, and refer to Secs. 16.1-4 and 16.4-1 for more detailed discussions). A function of the coordinates labeling an object or objects is invariant with respect to a given coordinate transformation (1) or (2) if the function value is unchanged on substitution of xi′(x1, x2, . . .) or x̄i(x1, x2, . . .) for each xi. A relation between coordinate values is invariant if it remains valid after similar substitutions.

Invariance with respect to an "alibi"-type coordinate transformation is interpreted in the manner of Sec. 12.1-5. Functions and relations invariant with respect to a suitable class of "alias"-type coordinate transformations may be regarded as functions of, and relations between, the actual objects (invariants) represented by different sets of coordinates in different reference systems. A complete set of invariants f1(x1, x2, . . .), f2(x1, x2, . . .), . . . uniquely specifies all properties of the object (x1, x2, . . .) which are invariant with respect to a given class (group) of coordinate transformations (see also Sec. 12.2-8).
14.1-5. Schemes of Measurements. The representation of a model involving two or more classes of objects will, in general, require a reference system for each class of objects; the resulting set of reference systems
will be called a scheme of measurements. A change of the scheme of measurements involves an "alias"-type coordinate transformation for each class of objects; one usually relates these transformations so as to ensure the invariance of important functions and/or relations (see also Secs. 14.6-2, 16.1-4, 16.2-1, and 16.4-1).

14.2. LINEAR VECTOR SPACES
14.2-1. Defining Properties. As already stated in Sec. 12.4-1, a linear vector space V of vectors a, b, c, . . . over the ring (with identity, Sec. 12.3-1b) R of scalars α, β, . . . admits vector addition and multiplication of vectors by scalars with the following properties:

1. V is a commutative group with respect to vector addition: for every pair of vectors a, b of V, V contains the vector sum a + b, with

    a + b = b + a        a + (b + c) = (a + b) + c

and V contains the additive inverse −a of each vector a and an additive identity (null vector) 0 so that

    a + 0 = a        a + (−a) = a − a = 0

2. V contains the product αa of every vector a of V by any scalar α of R, with

    (αβ)a = α(βa)        1a = a        α(a + b) = αa + αb        (α + β)a = αa + βa

where 1 is the multiplicative identity in R. Note that

    0a = 0        (−1)a = −a        (−α)a = −(αa)        (14.2-1)

Unless the contrary is specifically stated, all linear vector spaces considered in this handbook are understood to be real vector spaces or complex vector spaces, respectively defined as linear vector spaces over the field of real numbers and over the field of complex numbers.

In the case of vector spaces admitting a definition of infinite sums (Sec. 14.2-7b), many authors refer to a set of vectors with the above properties as a linear manifold and reserve the term vector space for a linear manifold which is closed, i.e., which contains all its limit points (Sec. 12.5-1b); the two terms are equivalent in the case of finite-dimensional manifolds (Sec. 14.2-4).
14.2-2. Linear Manifolds and Subspaces in V. A subset V1 of a linear vector space V is a linear manifold in V if and only if V1 is a linear manifold over the same ring of scalars as V; V1 will be called a subspace of V if it is a closed linear manifold in V (see also Sec. 14.2-1). A proper subspace of V is a subspace other than 0 or V itself.

Any given set of vectors e1, e2, . . . in V generates (spans, determines) a linear manifold comprising all linear combinations of e1, e2, . . . . EXAMPLES: straight lines and planes through the origin in three-dimensional Euclidean space.
14.2-3. Linearly Independent Vectors and Linearly Dependent Vectors (see also Secs. 1.9-3, 5.2-2, and 9.3-2). (a) A finite set of vectors a1, a2, . . . is linearly independent if and only if

    λ1a1 + λ2a2 + · · · = 0   implies   λ1 = λ2 = · · · = 0        (14.2-2)

Otherwise, the vectors a1, a2, . . . are linearly dependent, and at least one vector of the set, say ak, can be expressed as a linear combination ak = Σ(i≠k) μiai of the other vectors ai of the set. As a trivial special case, this is true whenever ak is a null vector.

(b) The definitions of Sec. 14.2-3a apply to infinite sets of vectors a1, a2, . . . if it is possible to assign a meaning to Eq. (2). In general, this will require the vector space to admit a definition of convergence in addition to the algebraic postulates of Sec. 14.2-1 (Secs. 12.5-3 and 14.2-7b).
14.2-4. Dimension of a Linear Manifold or Vector Space. Bases and Reference Systems (Coordinate Systems). (a) A (linear) basis in the linear manifold V is a set of linearly independent vectors e1, e2, . . . of V such that every vector a of V can be expressed as a linear form

    a = α1e1 + α2e2 + · · ·        (14.2-3)

in the base vectors ei. Every set of linearly independent vectors forms a basis for the linear manifold comprising all linear combinations of the given vectors.

(b) In a finite-dimensional linear manifold or vector space spanned by n base vectors

1. Every set of n linearly independent vectors is a basis.
2. No set of m < n vectors is a basis.
3. Every set of m > n vectors is necessarily linearly dependent.

The number n is called the (linear) dimension of the vector space. An infinite-dimensional vector space does not admit a finite basis.

(c) In every finite-dimensional real or complex vector space, the numbers α1, α2, . . . , αn are unique components or coordinates of the
vector a = α1e1 + α2e2 + · · · + αnen in a reference system (coordinate system) defined by the base vectors e1, e2, . . . , en. Note that a + b has the components αi + βi, and αa has the components ααi (i = 1, 2, . . . , n; see also Sec. 5.2-2).

(d) Two linear vector spaces V and V′ over the same ring of scalars α, β, . . . are isomorphic (Sec. 12.1-6) if and only if it is possible to relate their respective vectors a, b, . . . and a′, b′, . . . by a reciprocal one-to-one correspondence a ↔ a′, b ↔ b′, . . . such that a + b ↔ a′ + b′, αa ↔ αa′. In the case of finite-dimensional vector spaces, this is true if and only if V and V′ have the same linear dimension. In particular, every n-dimensional real or complex vector space is isomorphic with the space of n-rowed column matrices over the field of real or complex numbers, respectively (matrix representation, Sec. 14.5-2).
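A short numerical illustration of components in a (nonorthogonal) basis, using the column-matrix representation just mentioned; the basis and the vectors are arbitrary assumed examples:

```python
import numpy as np

E = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])     # columns are the base vectors e1, e2, e3
a = np.array([2.0, 3.0, 1.0])
b = np.array([1.0, 0.0, 1.0])

alpha = np.linalg.solve(E, a)       # unique components of a in this basis
beta = np.linalg.solve(E, b)

assert np.allclose(E @ alpha, a)
# components behave linearly, as stated in the text
assert np.allclose(np.linalg.solve(E, a + b), alpha + beta)
assert np.allclose(np.linalg.solve(E, 2.5 * a), 2.5 * alpha)
```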
14.2-5. Normed Vector Spaces. A real or complex vector space V is a normed vector space if and only if for every vector a of V there exists a real number ||a|| (norm, absolute value, magnitude of a) such that a = b implies ||a|| = ||b||, and, for all a, b in V,

    ||a|| > 0        (a ≠ 0)
    ||αa|| = |α| ||a||
    ||a + b|| ≤ ||a|| + ||b||        (Minkowski's inequality)        (14.2-4)

A unit vector is a vector of unit magnitude (see also Sec. 5.2-5). Note that ||−a|| = ||a||, ||0|| = 0 (see also Sec. 13.2-1).
that ||-a|| = ||a||, ||0|| = 0 (see also Sec. 13.2-1). 14.2-6. Unitary Vector Spaces, (a) A real or complex vector space Vu is a unitary (bermitian) vector space if and only if it is possible to define a binary operation (inner or scalar multiplication of vectors) associating a scalar (a, b), the (hermitian) inner, scalar, or dot product of a and b, with every pair a, b of vectors of Vu, where 1.
2. 3. 4.
(a, (a, (a, (a,
b) = (b, a) * (hermitian symmetry) b + c) = (a, b) + (a, c) (distributive law) ab) = a(a, b) (associative law)* a) > 0; (a, a) =0 implies a = 0 (positive definiteness)
It follows that in every unitary vector space
(b + c, a) = (b, a) + (c, a)
(aa, b) = a*(a, b)
|(a, b)|2 < (a, a)(b, b) (Cauchy-Schwarz inequality)
(14.2-5)
(14.2-6)
* Some authors use the alternative defining postulate (aa, b) = a(a, b) which amounts to an interchange of a and b in the definition of (a, b).
The Cauchy-Schwarz inequality (6) reduces to an equation if and only if a and b are linearly dependent (see also Secs. 1.3-2, 4.6-19, and 15.2-1c).

m vectors a1, a2, . . . , am of Vu are linearly independent if and only if the mth-order determinant det [(ai, ak)] is different from zero (Gram's determinant, see also Secs. 5.2-8 and 15.2-1a).
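The Gram-determinant test is easy to demonstrate numerically in the real case, where the inner products are ordinary dot products (the vectors below are an assumed example):

```python
import numpy as np

a1 = np.array([1.0, 0.0, 1.0])
a2 = np.array([0.0, 1.0, 1.0])
a3 = a1 + 2.0 * a2                        # deliberately linearly dependent

def gram_det(*vectors):
    # determinant of the matrix of inner products (a_i, a_k)
    V = np.array(vectors)
    return np.linalg.det(V @ V.T)

assert gram_det(a1, a2) > 1.0             # independent pair (det = 3)
assert abs(gram_det(a1, a2, a3)) < 1e-9   # dependent triple: det = 0
```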
(b) If a unitary vector space is a real vector space, all scalar products (a, b) are real, and scalar multiplication of vectors is commutative, so that

    (a, b) = (b, a)        (αa, b) = α(a, b)        (14.2-7)

Note: The inner-product spaces with indefinite metric used in relativity theory are real or complex vector spaces admitting the definition of an inner product (a, b) which satisfies conditions (1) to (3), but not condition (4), of Sec. 14.2-6a. The vectors of such a space may be classified as vectors of positive, negative, and zero square (a, a). One defines ||a|| = √|(a, a)|. See also Secs. 14.2-5 and 16.8-1.
14.2-7. Metric and Convergence in Normed Vector Spaces. Banach Spaces and Hilbert Spaces. (a) Every normed vector space (Sec. 14.2-5) is a metric space with the metric d(a, b) = ||a − b|| (Sec. 12.5-2) and permits the definition of neighborhoods and convergence in the manner of Sec. 12.5-3 (see also Sec. 5.3-1). In this sense, an infinite sum a0 + a1 + a2 + · · · converges to (equals) a vector

    s = lim(n→∞) Σ(k=0 to n) ak ≡ Σ(k=0 to ∞) ak

if and only if lim(n→∞) ||s − Σ(k=0 to n) ak|| = 0.

A normed vector space V is complete (Sec. 12.5-4) if and only if every sequence of vectors s0, s1, s2, . . . of V such that lim(n→∞, m→∞) ||sn − sm|| = 0 (Cauchy sequence, see also Sec. 4.9-1) converges to a vector s of V. Complete normed vector spaces are called Banach spaces. Every finite-dimensional normed vector space is complete.

(b) Every unitary vector space Vu permits one to introduce the norm (absolute value, magnitude) ||a|| of each vector a, the distance d(a, b) between two "points" a, b of Vu, and the angle γ between any two vectors a, b by means of the definitions

    ||a|| = √(a, a)        d(a, b) = ||a − b|| = √(a − b, a − b)        cos γ = (a, b)/(||a|| ||b||)        (14.2-8)

The functions ||a|| and d(a, b) defined by Eq. (8) satisfy all conditions of Secs. 14.2-5 and 12.5-2. If Vu is a real unitary vector space, the Cauchy-Schwarz inequality (6) insures that the angle γ is real for all a, b.

(c) Finite-dimensional real unitary vector spaces are called Euclidean vector spaces. They are separable, complete, and boundedly compact (Sec. 12.5-4) and serve as models for n-dimensional Euclidean geometries (see also Chaps. 2 and 3 and Secs. 5.2-6 and 17.4-6d). Complete infinite-dimensional unitary vector spaces are called Hilbert spaces.* The complete sequence and function spaces listed in Table 12.5-1 are all Hilbert spaces (and hence also Banach spaces).

Hilbert spaces preserve many of the properties of Euclidean spaces. In particular, every separable (Sec. 12.5-1b) real or complex Hilbert space is isomorphic and isometric to the space l² of respectively real or complex infinite sequences (ξ1, ξ2, . . .) such that ||(ξ1, ξ2, . . .)||² = |ξ1|² + |ξ2|² + · · · converges (Table 12.5-1a). Hence each vector of a separable Hilbert space can be labeled with a countable set of coordinates (or with a column or row matrix, Sec. 14.5-2). Every closed linear manifold in a Hilbert space is a complete subspace (see also Sec. 14.2-2) and is, thus, itself a Euclidean vector space or a Hilbert space.
14.2-8. The Projection Theorem. Given any vector x of a unitary vector space Vu and a complete subspace V1, there exists a unique vector y = xp of V1 which minimizes the distance ‖x − y‖ for all y in V1; moreover, xp is the unique vector y of V1 such that x − y is orthogonal to every vector x1 of V1, i.e.,

(x − xp, x1) = 0        (x1 in V1)        (14.2-9)

(see also Sec. 14.7-3b). The mapping x → xp is a bounded linear operation (Sec. 14.4-2) called the orthogonal projection of (the points of) Vu onto V1.
The projection theorem is of the greatest practical importance, for Eq. (9) defines the optimal approximation of a vector x by a vector y of the "simpler" class V1 if ‖x − y‖² measures the error of the approximation. EXAMPLES: projection of points onto planes in Euclidean geometry; orthogonal-function approximations (Secs. 15.2-3, 15.2-6, 20.6-2, and 20.6-3), mean-square regression (Sec. 18.4-6), Wiener filtering and prediction.
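The minimizing and orthogonality properties of Eq. (14.2-9) can be checked numerically. The following sketch is illustrative only (it is not from the handbook; it assumes NumPy, and names such as `B` and `xp` are ad hoc): it projects a vector of R⁴ onto the subspace spanned by two given vectors.

```python
import numpy as np

# Illustrative sketch of the projection theorem (Sec. 14.2-8) in R^4:
# project x onto the subspace V1 spanned by the columns of B.
rng = np.random.default_rng(0)
B = rng.normal(size=(4, 2))          # basis of the subspace V1
x = rng.normal(size=4)

# Least-squares solution of B c ~ x gives the projection xp = B c.
c, *_ = np.linalg.lstsq(B, x, rcond=None)
xp = B @ c

# x - xp is orthogonal to every basis vector of V1 (Eq. 14.2-9) ...
residual_components = B.T @ (x - xp)

# ... and xp minimizes ||x - y|| over y in V1: any perturbation inside
# the subspace increases the distance.
d_opt = np.linalg.norm(x - xp)
d_perturbed = np.linalg.norm(x - (xp + 0.1 * B[:, 0]))
```

Because the residual is orthogonal to the subspace, ‖x − xp − δ‖² = ‖x − xp‖² + ‖δ‖² for any δ in V1, which is why the perturbed distance is strictly larger.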
14.3. LINEAR TRANSFORMATIONS (LINEAR OPERATORS)
14.3-1. Linear Transformation of a Vector Space. Linear Operators. Given two linear vector spaces V and V' over the same field of scalars α, β, . . . , a (homogeneous) linear transformation of V into V' is a correspondence

x' = f(x) ≡ Ax        (14.3-1)

which relates vectors x' of V' to vectors x of V so as to preserve the "linear" operations of vector addition and multiplication of vectors by scalars:

f(x + y) ≡ f(x) + f(y)        f(αx) ≡ αf(x)        (14.3-2)

Each linear transformation can be written as a multiplication by a linear operator A (linear operation), with

A(x + y) = Ax + Ay        A(αx) = α(Ax)        (14.3-3)

f(x) = Ax + a' is called a linear vector function. The definition of each linear operator must include that of its domain of definition. In physics, the first relation (3) is often referred to as a superposition principle for a class of operations.
14.3-2. Range, Null Space, and Rank of a Linear Transformation (Operator). The range (Sec. 12.1-4) of a linear transformation A of V into V' is a linear manifold (Sec. 14.2-2) of V'. The null space of A is the manifold of V which A maps onto 0 (the set of vectors x with Ax = 0). The rank r and the nullity r' of a linear transformation A are the respective linear dimensions (Sec. 14.2-4b) of its range and null space. If V has the finite dimension n, then its range and null space are subspaces, and r + r' = n.

14.3-3. Addition and Multiplication by Scalars. Null Transformation. (a) Let A and B be linear transformations (operators) mapping a given domain of definition D in V into V'. One defines A ± B and αA as linear transformations of V into V' such that

(A ± B)x = Ax ± Bx        (αA)x = α(Ax)        (14.3-4)

for all vectors x in D.

(b) The null transformation O of V into V' is defined by Ox = 0 for all vectors x in V.
14.3-4. Product of Two Linear Transformations (Operators). Identity Transformation. (a) Let A be a linear transformation (operator) mapping V into V', and let B be a linear transformation mapping (the range of A in) V' into V''. The product BA is the linear transformation of V into V'' obtained by performing the transformations A and B successively (see also Sec. 12.2-8):

(BA)x = B(Ax)        (14.3-5)

(b) The identity transformation I of any vector space V transforms every vector x of V into itself:

Ix = x        IA ≡ AI = A        (14.3-6)
14.3-5. Nonsingular Linear Transformations (Operators). Inverse Transformations (Operators). A linear transformation (operator) A is nonsingular (regular) if and only if it defines a reciprocal one-to-one correspondence mapping all of V onto all of V' (V and V' are then necessarily isomorphic, Sec. 14.2-4d). A is nonsingular if and only if it has a unique inverse (inverse transformation, inverse operator) A⁻¹ mapping V' onto V so that x' = Ax implies x = A⁻¹x' and conversely, or

AA⁻¹ = A⁻¹A = I        (14.3-7)

Products and inverses of nonsingular transformations (operators) are nonsingular; if A and B are nonsingular, and α ≠ 0,

(AB)⁻¹ = B⁻¹A⁻¹        (αA)⁻¹ = α⁻¹A⁻¹        (A⁻¹)⁻¹ = A        (14.3-8)

Nonsingular linear transformations (operations) preserve linear independence of vectors and hence also the linear dimensions of transformed manifolds (Secs. 14.2-3 and 14.2-4). A linear operator A is nonsingular if it has a unique left or right inverse, or if it has equal left and right inverses; the mere existence of a left and/or right inverse is not sufficient.

A linear transformation (operator) A defined on a finite-dimensional vector space is nonsingular if and only if Ax = 0 implies x = 0, i.e., if and only if r = n, r' = 0 (Sec. 14.3-2).
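The finite-dimensional criteria above (Ax = 0 ⟹ x = 0, full rank, product rule for inverses) can be illustrated with matrices. The sketch below is not from the handbook; it assumes NumPy, and the example matrices are arbitrary.

```python
import numpy as np

# Sketch: on a finite-dimensional space, A is nonsingular iff Ax = 0
# implies x = 0, i.e. iff the matrix has full rank n (r = n, r' = 0).
A = np.array([[2.0, 1.0], [0.0, 3.0]])        # nonsingular
S = np.array([[1.0, 2.0], [2.0, 4.0]])        # singular: rank 1

rank_A = np.linalg.matrix_rank(A)
rank_S = np.linalg.matrix_rank(S)

# For the singular S there is a nonzero x with Sx = 0 (nullity r' = 1).
null_vec = np.array([2.0, -1.0])
Sx = S @ null_vec

# (AB)^-1 = B^-1 A^-1 for nonsingular factors (Eq. 14.3-8).
B = np.array([[1.0, 1.0], [0.0, 1.0]])
lhs = np.linalg.inv(A @ B)
rhs = np.linalg.inv(B) @ np.linalg.inv(A)
```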
14.3-6. Integral Powers of Operators. One defines A⁰ = I, A¹ = A, A² = AA, A³ = AAA, . . . , and, if A is nonsingular, A⁻ᵖ = (A⁻¹)ᵖ = (Aᵖ)⁻¹ (p = 1, 2, . . .). The ordinary rules for operations with exponents apply (see also Sec. 12.4-2).

14.4. LINEAR TRANSFORMATIONS OF A NORMED OR UNITARY VECTOR SPACE INTO ITSELF. HERMITIAN AND UNITARY TRANSFORMATIONS (OPERATORS)
14.4-1. Bound of a Linear Transformation (see also Sec. 13.2-1a). A linear transformation A of a normed vector space (Sec. 14.2-5) V into a normed vector space V' is bounded if and only if A has a finite bound (norm)

‖A‖ = sup_{x in V} ‖Ax‖ / ‖x‖        (14.4-1)

Note that

‖A‖ ≥ 0        ‖αA‖ = |α| ‖A‖        ‖A + B‖ ≤ ‖A‖ + ‖B‖
‖AB‖ ≤ ‖A‖ ‖B‖        ‖Ax‖ ≤ ‖A‖ ‖x‖        (14.4-2)

Every linear transformation (operator) defined throughout a finite-dimensional normed vector space is bounded.* If V and V' are unitary vector spaces (Sec. 14.2-6), then

‖A‖ = sup_{x ≠ 0} ‖Ax‖ / ‖x‖ = sup_{‖x‖ = 1} ‖Ax‖
    = sup_{‖x‖ = ‖y‖ = 1} |(x, Ay)| = sup_{x ≠ 0, y ≠ 0} |(x, Ay)| / (‖x‖ ‖y‖)        (14.4-3)

14.4-2. The Bounded Linear Transformations of a Normed Vector Space into Itself. (a) The bounded linear transformations (operators) A, B, . . . of a normed vector space V into itself constitute a linear algebra (Sec. 12.4-2). Within this algebra, the singular transformations are zero divisors (Sec. 12.3-1a); the nonsingular transformations constitute a multiplicative group and, together with the null transformation (Sec. 14.3-3b), form a division algebra (Sec. 12.4-2). If V has the finite dimension n, the transformation algebra is of order n².
The transformation algebra (operator algebra) is not in general commutative (see also Sec. 12.4-2). The operator AB − BA is called the commutator of A and B.

(b) Bounded linear operators defined on complete unitary vector spaces (either finite-dimensional or Hilbert spaces, Sec. 14.2-7c) permit the definition of convergent sequences and analytic functions of operators with the aid of the metric ‖A − B‖ (Sec. 12.5-3) in the manner of Secs. 13.2-11 and 13.2-12.
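The noncommutativity of operator multiplication can be seen already with 2 × 2 matrices. The sketch below is illustrative (not from the handbook; it assumes NumPy) and computes the commutator AB − BA for a standard pair of matrices that fail to commute.

```python
import numpy as np

# Sketch: operator multiplication is generally noncommutative; the
# commutator AB - BA measures the failure to commute.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])

commutator = A @ B - B @ A          # not the zero matrix
```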
14.4-3. Hermitian Conjugate of a Linear Transformation (Operator). Every bounded linear transformation† A of a complete unitary vector space Vu into itself has a unique hermitian conjugate (adjoint, conjugate, associate operator) A† defined by

(x, Ay) = (A†x, y)        for all x, y in Vu        (14.4-4)

* Many texts define homogeneous linear operators on any normed vector space as operators which satisfy Eq. (14.3-3) and are bounded (and hence continuous in the sense of Sec. 12.5-1c).
† See footnote to Sec. 14.4-1.

so that

(A + B)† = A† + B†        (αA)† = α*A†        (AB)† = B†A†
(A⁻¹)† = (A†)⁻¹        (A†)† = A        O† = O        I† = I
‖A†‖ = ‖A‖        ‖A†A‖ = ‖A‖²        (14.4-5)

(Ax, By) ≡ (x, A†By) = (B†Ax, y)        (14.4-6)

(see also Sec. 14.2-6a).
14.4-4. Hermitian Operators. A linear operator A mapping a complete unitary vector space Vu into itself is a hermitian operator (self-adjoint operator, self-conjugate operator) if and only if

A† = A        i.e.        (x, Ay) = (Ax, y)        for all x, y in Vu        (14.4-7)

If Vu is a complex complete unitary vector space, A is hermitian if and only if (x, Ax) is real for all x, or

(x, Ax) = (Ax, x)* = (Ax, x)        for all x in Vu        (14.4-8)

(alternative definition). A transformation (operator) A such that A† = −A is called skew-hermitian.

Hermitian operators are of great importance in applications which require (x, Ax) to be a real quantity (vibration theory, quantum mechanics). A hermitian operator A is, respectively, positive definite, negative definite, nonnegative, nonpositive, positive semidefinite, negative semidefinite, indefinite, or zero if the same is true for the inner product (hermitian form) (x, Ax) (see also Secs. 13.5-3 and 14.7-1). If A is nonnegative, there exists a hermitian operator Q such that Q†Q = QQ† = A; Q is uniquely defined if A is nonsingular.
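The hermitian "square root" Q of a nonnegative operator can be computed from its eigendecomposition, since a hermitian matrix has real eigenvalues and an orthonormal set of eigenvectors. The sketch below is illustrative only (not from the handbook; it assumes NumPy).

```python
import numpy as np

# Sketch: a nonnegative hermitian A has a hermitian square root Q,
# computed here from the eigendecomposition A = U diag(lam) U^dagger.
M = np.array([[1.0, 2.0], [0.0, 1.0]])
A = M.conj().T @ M                  # hermitian and nonnegative by construction

lam, U = np.linalg.eigh(A)          # real eigenvalues lam >= 0
Q = U @ np.diag(np.sqrt(lam)) @ U.conj().T

herm_err = np.linalg.norm(Q - Q.conj().T)   # Q is hermitian
sq_err = np.linalg.norm(Q @ Q - A)          # and QQ = A
```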
14.4-5. Unitary Transformations (Operators). A linear transformation A mapping a complete unitary vector space Vu into itself is unitary if and only if

A†A = AA† = I        i.e.        A† = A⁻¹        (14.4-9)

Every unitary transformation A is nonsingular and bounded, and ‖A‖ = 1. Every unitary transformation x' = Ax preserves scalar products:

(x', y') = (Ax, Ay) = (x, y)        for all x, y in Vu
‖x'‖ = ‖Ax‖ = ‖x‖        for all x in Vu        (14.4-10)

If Vu is finite-dimensional, each of the relations (10) implies that A is unitary.

Unitary transformations preserve the results of scalar multiplication of vectors as well as those of vector addition and multiplication by scalars, so that absolute values, distances, angles, orthogonality, and orthonormality (Secs. 14.2-7a and 14.7-3) are invariant (Sec. 12.1-5).
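The invariance properties (14.4-10) can be verified numerically for a concrete unitary matrix. The sketch below is illustrative only (not from the handbook; it assumes NumPy): a real rotation multiplied by a unit-modulus phase is unitary, and it leaves complex inner products and norms unchanged.

```python
import numpy as np

# Sketch: a unitary matrix (U^dagger U = I) preserves inner products
# and norms (Eq. 14.4-10).
theta = 0.3
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]]) * np.exp(1j * 0.7)

x = np.array([1.0 + 2.0j, -0.5j])
y = np.array([0.3, 1.0 - 1.0j])

ip_before = np.vdot(x, y)            # (x, y), conjugation in the first slot
ip_after = np.vdot(U @ x, U @ y)
norm_err = abs(np.linalg.norm(U @ x) - np.linalg.norm(x))
unitarity_err = np.linalg.norm(U.conj().T @ U - np.eye(2))
```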
14.4-6. Symmetric, Skew-symmetric, and Orthogonal Transformations of Real Unitary Vector Spaces. (a) The hermitian conjugate (Sec. 14.4-3) A† associated with a linear transformation A of a real complete unitary vector space VE into itself is often called the transpose (conjugate, associate operator) Ã of A and satisfies the relations

(x, Ãy) = (Ãy, x) = (Ax, y) = (y, Ax)        for all x, y in VE        (14.4-11)

Ã may be substituted for A† in all relations of Sec. 14.4-3 whenever the vector space in question is a real unitary vector space.

(b) A linear transformation A of a real complete unitary vector space VE is

symmetric if and only if Ã = A, i.e., (x, Ay) = (y, Ax) for all x, y in VE        (14.4-12)
skew-symmetric (antisymmetric) if and only if Ã = −A, i.e., (x, Ay) = −(y, Ax) for all x, y in VE        (14.4-13)
orthogonal if and only if ÃA = AÃ = I, i.e., Ã = A⁻¹        (14.4-14)

Orthogonal transformations defined on real unitary vector spaces are unitary, so that all theorems of Sec. 14.4-5 apply.
14.4-7. Combination Rules (see also Sec. 13.3-3). (a) If the operator A is hermitian, the same is true for Aᵖ (p = 0, 1, 2, . . .), A⁻¹, and T†AT, and for αA if α is real.

Given any nonsingular operator T, T†AT is hermitian if and only if A is hermitian; hence for any unitary operator T, T⁻¹AT is hermitian if and only if the same is true for A. In particular, if the vector space in question is a real complete unitary vector space, and A is symmetric, the same is true for Aᵖ (p = 0, 1, 2, . . .), A⁻¹, T̃AT, and αA. Given any nonsingular operator T, T̃AT is symmetric if and only if A is symmetric; and for any orthogonal operator T, T⁻¹AT is symmetric if and only if the same is true for A.

(b) If A and B are hermitian (or symmetric), the same is true for A + B. The product AB of two hermitian (or symmetric) operators A and B is hermitian (or symmetric) if and only if BA = AB (see also Sec. 13.4-4b).
(c) If A is a unitary transformation (operator), the same is true for Aᵖ (p = 0, 1, 2, . . .), A⁻¹, and A†, and for αA if |α| = 1. If A and B are unitary, the same is true for AB.

If A is an orthogonal transformation, the same is true for Aᵖ (p = 0, 1, 2, . . .), A⁻¹, Ã, and −A. If A and B are orthogonal, the same is true for AB.

14.4-8. Decomposition Theorems. Normal Operators (see also Secs. 13.3-4 and 14.8-4). (a) For every linear operator A mapping a complete unitary vector space into itself, ½(A + A†) = H1 and (1/2i)(A − A†) = H2 are hermitian operators. A = H1 + iH2 is the (unique) cartesian decomposition of the given operator A into a hermitian part and a skew-hermitian part (comparable to the cartesian decomposition of complex numbers into real and imaginary parts, Sec. 1.3-1).

If A is defined on a real complete unitary vector space, the cartesian decomposition reduces to the (unique) decomposition of A into the symmetric part ½(A + Ã) and the skew-symmetric part ½(A − Ã).

For every linear operator A mapping a complete unitary vector space into itself, A†A and AA† are hermitian and nonnegative, and there exists a polar decomposition A = QU of A into a nonnegative hermitian factor Q and a unitary factor U. Q is uniquely defined by Q² = AA†, and U is uniquely defined if and only if A is nonsingular (compare this with Sec. 1.3-2).

(b) A linear operator A mapping a complete unitary vector space Vu into itself is a normal operator if and only if A†A = AA† or, equivalently, if and only if H2H1 = H1H2. A bounded operator A is normal if and only if ‖Ax‖ = ‖A†x‖ for all x in Vu. Hermitian and unitary operators are normal.
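A polar decomposition A = QU can be obtained from the singular value decomposition. The sketch below is illustrative only (not from the handbook; it assumes NumPy): writing A = W diag(s) Vh gives the nonnegative hermitian factor Q = W diag(s) W† and the unitary factor U = W Vh.

```python
import numpy as np

# Sketch of the polar decomposition A = Q U (nonnegative hermitian Q,
# unitary U) via the SVD  A = W diag(s) Vh.
A = np.array([[1.0, 2.0], [3.0, 4.0]])

W, s, Vh = np.linalg.svd(A)
Q = W @ np.diag(s) @ W.conj().T      # satisfies Q^2 = A A^dagger
U = W @ Vh                           # unitary factor

recon_err = np.linalg.norm(Q @ U - A)
qsq_err = np.linalg.norm(Q @ Q - A @ A.conj().T)
unitary_err = np.linalg.norm(U.conj().T @ U - np.eye(2))
```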
14.4-9. Conjugate (Adjoint, Dual) Vector Spaces. More General Definition of Conjugate (Adjoint) Operators. (a) The bounded linear transformations* A of a normed vector space V into any complete normed vector space (Banach space, Sec. 14.2-7) constitute a complete normed vector space, with addition and multiplication by scalars defined by Eq. (14.3-4), and norm ‖A‖. In particular, the class of bounded, linear, and homogeneous scalar functions φ(x) defined on a normed vector space V constitute a complete normed vector space,† the conjugate (adjoint, dual) vector space V† associated with V. A bounded linear transformation x' = Ax

* See footnote to Sec. 14.4-1.
† Note carefully that the value of . . . In the context of Chap. 15 (Sec. 15.4-3), the numerical-valued function . . .

mapping V into another normed vector space V' relates vectors φ' of the conjugate space V'† to vectors φ = A†φ' of V† defined by

φ(x) = φ'(Ax)        for all x in V        (14.4-15a)
i.e.        φ = A†φ'        (14.4-15b)

A† is called the conjugate (adjoint) operator associated with A; one has

‖A†‖ = ‖A‖        (14.4-16)

Note that (A†)† is not in general identical with A in this context.
(b) Every bounded, linear, and homogeneous scalar function φ(x) defined on a complete unitary vector space Vu can be expressed in the form

φ(x) = (f, x)        (14.4-17)

where f is a vector of Vu. The correspondence between the vectors φ of the conjugate space and the corresponding vectors f of Vu is one-to-one; since

‖φ‖ = ‖f‖        (14.4-18)

the correspondence is also isometric (Secs. 12.5-2 and 14.2-7b). Hence, complete unitary vector spaces are self-conjugate, i.e., identical with their conjugate spaces except for isomorphism and isometry. The definition of operators conjugate to linear transformations of a complete unitary vector space into itself can then be reduced to the simple definition of hermitian-conjugate operators given in Sec. 14.4-3.

14.4-10. Infinitesimal Linear Transformations (see also Secs. 4.5-3 and 14.10-5). (a) An infinitesimal linear transformation (infinitesimal linear operator, infinitesimal dyadic) defined on a normed real or complex vector space has the form

A = I + εB        (14.4-19)

where B is bounded, and |ε|² is negligibly small compared to 1 (ε is usually a scalar differential).
(b) For infinitesimal linear transformations A = I + εB, A1 = I + ε1B1, A2 = I + ε2B2,

A1A2 = I + (ε1B1 + ε2B2) = A2A1        (14.4-20)

Infinitesimal linear transformations (operators) commute.

(c) An infinitesimal linear transformation A = I + εB defined on a complete unitary vector space is unitary if and only if εB is skew-hermitian. An infinitesimal linear transformation A = I + εB defined on a real complete unitary vector space is orthogonal if and only if εB is skew-symmetric.

14.5. MATRIX REPRESENTATION OF VECTORS AND LINEAR TRANSFORMATIONS (OPERATORS)
14.5-1. Transformation of Base Vectors and Vector Components: "Alibi" Point of View (see also Sec. 14.1-3). Consider a finite-dimensional* real or complex vector space Vn with a reference system defined by n base vectors e1, e2, . . . , en (Sec. 14.2-4). Each vector

x = ξ1e1 + ξ2e2 + ⋯ + ξnen = Σ_{k=1}^{n} ξk ek        (14.5-1)

is described by its components ξ1, ξ2, . . . , ξn. A linear transformation (operator) A mapping Vn into itself (Sec. 14.3-1) transforms each base vector ek into a corresponding vector

ek' = Aek = a_1k e1 + a_2k e2 + ⋯ + a_nk en = Σ_{i=1}^{n} a_ik ei        (k = 1, 2, . . . , n)        (14.5-2)

and each vector x of Vn into a corresponding vector x' of Vn:

x' = Ax = A Σ_{k=1}^{n} ξk ek = Σ_{k=1}^{n} ξk ek' = Σ_{i=1}^{n} ξi' ei        (14.5-3)

The components ξi' of the vector x' and the components ξk of the vector x, both referred to the e1, e2, . . . , en reference system, are related by the n linear homogeneous transformation equations

ξ1' = a_11 ξ1 + a_12 ξ2 + ⋯ + a_1n ξn
ξ2' = a_21 ξ1 + a_22 ξ2 + ⋯ + a_2n ξn
. . . . . . . . . . . . . . . . . . . . . . .
ξn' = a_n1 ξ1 + a_n2 ξ2 + ⋯ + a_nn ξn
        (transformation of vector components, "alibi" point of view)        (14.5-4)
* The theory of Secs. 14.5-1 to 14.7-7 applies also to certain infinite-dimensional vector spaces (Secs. 14.2-4 and 14.2-7b). Such spaces must permit the definition of countable bases (Sec. 14.2-4), such as orthonormal bases (Sec. 14.7-4), and of convergence (Sec. 14.2-7), so that sums like that in Eq. (1) become convergent infinite series. This is, in particular, true for every separable Hilbert space (Sec. 14.2-7c). Vector spaces which do not admit countable bases can be represented by suitable function spaces (Sec. 15.2-1).
14.5-2. Matrix Representation of Vectors and Linear Transformations (Operators). For each given reference basis e1, e2, . . . , en in Vn:

1. The vectors x = ξ1e1 + ξ2e2 + ⋯ + ξnen of Vn are represented on a reciprocal one-to-one basis by the column matrices {ξi} (Sec. 13.2-1b).

2. The linear transformations (operators) mapping Vn into itself are represented on a reciprocal one-to-one basis by the n × n matrices A = [a_ik] defined by Eq. (2) or (4).

The transition between vectors and operators and the corresponding matrices is an isomorphism (Sec. 12.1-6): sums and products involving scalars, vectors, and transformations correspond to analogous sums and products of matrices. Identities and inverses correspond; nonsingular and unbounded transformations (operators) correspond, respectively, to nonsingular and unbounded matrices, and conversely. In particular, the coordinate-free vector equation

x' = Ax        (14.5-5)

is represented in the e1, e2, . . . , en reference system by the matrix equation

[ξ1']   [a_11  a_12  . . .  a_1n] [ξ1]
[ξ2'] = [a_21  a_22  . . .  a_2n] [ξ2]        i.e.        x' = Ax        (14.5-6)
[ . ]   [ . . . . . . . . . . . ] [ . ]
[ξn']   [a_n1  a_n2  . . .  a_nn] [ξn]

which is equivalent to the n transformation equations (4); and the product of two linear transformations A and B is represented by the product of the corresponding matrices A and B (carefully note Sec. 14.6-3).

Note: Transformations of an n-dimensional vector space Vn into an m-dimensional vector space Vm may be similarly represented by m × n matrices. Transformations relating two real vector spaces can always be represented by real matrices.
relating two real vector spaces can always be represented by real matrices. 14.5-3. Matrix Notation for Simultaneous Linear Equations (see also Sees. 1.9-2 to 1.9-5).
A set of simultaneous linear equations
I
ancXk = h
(i = 1, 2,
m)
(14.5-7)
TRANSFORMATION OF BASE VECTORS
449
14.6-1
is equivalent to the matrix equation
Ax = b
flu
fli2
* • '
a\n
Xi
fl-_>l
&22
'
" '
a-2n
X'l
flmi
fl»»2
'
'
amn-
•
-
Xn
(14.5-8) -
The unknowns Xk may be regarded as components of an unknown vector such that the transformation (8) yields the vector represented by the 6,-. If, in particular, the matrix [aik] is nonsingular (Sec. 13.2-3), then the matrix equation (8) can be solved to yield the unique result x = A~lb
(14.5-9)
which is equivalent to Cramer's rule (1.9-4).
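For a nonsingular coefficient matrix, Eq. (14.5-9) can be evaluated directly. The sketch below is illustrative only (not from the handbook; it assumes NumPy): it solves a small system both by `np.linalg.solve` and by explicit inversion.

```python
import numpy as np

# Sketch: for a nonsingular coefficient matrix, Ax = b has the unique
# solution x = A^-1 b (Eq. 14.5-9).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])

x = np.linalg.solve(A, b)                # preferred, numerically stable
x_via_inverse = np.linalg.inv(A) @ b     # literal A^-1 b, same result

residual = np.linalg.norm(A @ x - b)
```

In practice `np.linalg.solve` (a factorization-based solver) is preferred over forming A⁻¹ explicitly, just as elimination is preferred over Cramer's rule for numerical work.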
14.5-4. Dyadic Representation of Linear Operators. A linear operator A defined on an n-dimensional vector space may also be expressed as a sum of n outer products of pairs of vectors (n dyads) in the manner of Sec. 16.9-1. The corresponding n × n matrix A can be similarly expressed as the sum of n outer products of pairs of row and column matrices (see also Sec. 13.2-10).

14.6. CHANGE OF REFERENCE SYSTEM
14.6-1. Transformation of Base Vectors and Vector Components: "Alias" Point of View (see also Sec. 14.1-3). (a) Given a reference basis e1, e2, . . . , en in the finite-dimensional* vector space Vn, m vectors

ak = Σ_{i=1}^{n} a_ik ei        (k = 1, 2, . . . , m)

are linearly independent (Sec. 14.2-3) if and only if the matrix [a_ik] is of rank m (see also Sec. 1.9-3). In particular, for every reference basis ē1, ē2, . . . , ēn in Vn

ēk = t_1k e1 + t_2k e2 + ⋯ + t_nk en = Σ_{i=1}^{n} t_ik ei        (k = 1, 2, . . . , n)
with det [t_ik] ≠ 0        (transformation of base vectors)        (14.6-1)

The matrix T = [t_ik] represents a (necessarily nonsingular) transformation T relating the old base vectors ei and the new base vectors ēk = Tek in the manner of Eq. (14.6-1).

(b) Now each vector x of Vn can be expressed in terms of vector components ξi referred to the ei system or in terms of vector components ξ̄k referred to the ēk system:

x = Σ_{k=1}^{n} ξk ek = Σ_{k=1}^{n} ξ̄k ēk        (14.6-2)

* See the footnote to Sec. 14.5-1.
The vector components ξi and ξ̄k of the same vector x are related by the n linear homogeneous transformation equations

ξ1 = t_11 ξ̄1 + t_12 ξ̄2 + ⋯ + t_1n ξ̄n
ξ2 = t_21 ξ̄1 + t_22 ξ̄2 + ⋯ + t_2n ξ̄n
. . . . . . . . . . . . . . . . . . . . . . .
ξn = t_n1 ξ̄1 + t_n2 ξ̄2 + ⋯ + t_nn ξ̄n

or in matrix form x = Tx̄
        (transformation of vector components, "alias" point of view)        (14.6-3)

The meaning of the transformation equations (3) must be carefully distinguished from that of the formally analogous relations (14.5-4) and (14.5-6). Note also the inverse relations, viz.,

ei = T⁻¹ēi = Σ_{k=1}^{n} (T_ik / det [t_ik]) ēk        (i = 1, 2, . . . , n)        (14.6-4)

ξ̄k = Σ_{i=1}^{n} (T_ik / det [t_ik]) ξi        or        x̄ = T⁻¹x        (14.6-5)

where T_ik is the cofactor of t_ik in the determinant det [t_ik] (Sec. 1.5-2).
14.6-2. Representation of a Linear Operator in Different Schemes of Measurements. (a) Consider a linear operator A represented by the matrix A in the scheme of measurements (Sec. 14.1-5) associated with the base vectors ei (Sec. 14.5-2) and by the matrix Ā in the ēk scheme of measurements, so that for every vector x of Vn

x' = Ax (operator equation)        x' = Ax (matrix equation, ei scheme)        x̄' = Āx̄ (ēk scheme)        (14.6-6)

Given the transformation matrix T relating the ei and ēk reference systems so that

x = Tx̄        x' = Tx̄'        (14.6-7)

(Sec. 14.6-1), the matrices A and Ā are related by the similarity transformation (Sec. 13.4-1)

Ā = T⁻¹AT        or        A = TĀT⁻¹        (14.6-8)

Conversely, every matrix Ā related to A by a similarity transformation (8) represents the same linear operator A in a scheme of measurements specified by the base vectors (1).

(b) All matrices (8) representing the same operator A have the same rank r; r equals the rank of A (Sec. 14.3-2). The trace and the determinant of the matrix A are also common to all matrices (8) and are referred to as the trace Tr (A) and the determinant det (A) of the operator A (see also Secs. 14.1-4 and 13.4-1b).
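The invariance of rank, trace, and determinant under a similarity transformation (14.6-8) is easy to check numerically. The sketch below is illustrative only (not from the handbook; it assumes NumPy, with arbitrary example matrices).

```python
import numpy as np

# Sketch: a change of basis x = T xbar turns the matrix A of an
# operator into Abar = T^-1 A T (Eq. 14.6-8); rank, trace, and
# determinant are shared by all such representations.
A = np.array([[2.0, 1.0], [0.0, 3.0]])
T = np.array([[1.0, 1.0], [1.0, 2.0]])   # nonsingular (det T = 1)

Abar = np.linalg.inv(T) @ A @ T

trace_diff = abs(np.trace(Abar) - np.trace(A))
det_diff = abs(np.linalg.det(Abar) - np.linalg.det(A))
rank_same = np.linalg.matrix_rank(Abar) == np.linalg.matrix_rank(A)
```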
(c) Matrix Transformation of Base Vectors. If one admits row and column matrices of vectors, Eqs. (1) and (6) can be respectively written as

[ē1 ē2 . . . ] = [e1 e2 . . . ]T        (14.6-9)
[e1' e2' . . . ] = [e1 e2 . . . ]A        (14.6-10)

with

x = [e1 e2 . . . ]x = [ē1 ē2 . . . ]x̄        x' = [e1 e2 . . . ]x' = [e1' e2' . . . ]x        (14.6-11)

(see also Sec. 16.6-2).
14.6-3. Consecutive Operations (see also Secs. 14.5-2 and 14.10-6). (a) Consider two consecutive linear operations A, B defined by the base-vector transformations

ek' = Aek = Σ_{i=1}^{n} a_ik ei        ek'' = Bek' = BAek        (k = 1, 2, . . . , n)        (14.6-12)

where A, B, and BA are, respectively, represented by the matrices A = [a_ik], B = [b_ik], and BA in the ei scheme of measurements, i.e.,

x' = Ax        x'' = Bx' = BAx        (14.6-13)

is represented by

x' = Ax        x'' = Bx' = BAx        x = (BA)⁻¹x''        (14.6-14)

where x' = {ξ1', ξ2', . . .} and x'' = {ξ1'', ξ2'', . . .} as well as x = {ξ1, ξ2, . . .} are columns of vector components measured in the ei reference system.

Note carefully that the operation B defined by Eqs. (12) to (14) is, in general, different from the operation defined by

ēk'' = Σ_{i=1}^{n} b_ik ei' = ABA⁻¹ek' = ABek        (k = 1, 2, . . . , n)        (14.6-15)

which corresponds to

x̄'' = ABx        (14.6-16)

since the matrix B = [b_ik] represents ABA⁻¹ and not B in the ek' scheme of measurements.
(b) The column matrices x = {ξ1, ξ2, . . . , ξn}, x̄ = {ξ̄1, ξ̄2, . . . , ξ̄n}, x̿ = {ξ̿1, ξ̿2, . . . , ξ̿n} representing the same vector

x = Σ_{i=1}^{n} ξi ei = Σ_{k=1}^{n} ξ̄k ek' = Σ_{k=1}^{n} ξ̿k ek''        (14.6-17)

are related by the alias-type transformations

x̄ = A⁻¹BA x̿        x = Ax̄ = BA x̿        x̿ = (BA)⁻¹x        (14.6-18)

Note again that, in general, x̄ ≠ B x̿.

14.7. REPRESENTATION OF INNER PRODUCTS. ORTHONORMAL BASES
14.7-1. Representation of Inner Products (see also Secs. 14.2-6, 14.7-6b, and 16.8-1). Given a finite-dimensional unitary vector space or separable Hilbert space (Sec. 14.2-7)* Vu, let the vectors a = Σ_{i=1}^{n} αi ei and b = Σ_{i=1}^{n} βi ei be represented by the respective column matrices a = {αi} and b = {βi} in the manner of Sec. 14.5-2 (see also Sec. 13.2-1b). Then

(a, b) = Σ_{i=1}^{n} Σ_{k=1}^{n} g_ik αi* βk = a†Gb        (14.7-1)
with        G = [g_ik]        g_ik = (ei, ek)        (i, k = 1, 2, . . . , n)

The matrix G = [g_ik] is necessarily hermitian (g_ik = g_ki*) and positive definite (Sec. 13.5-3); if Vu is a real unitary vector space, then G is real and symmetric. Equation (1) makes it possible to describe absolute values, angles, distances, convergence, etc., in Vu in terms of numerical vector components (see also Sec. 14.2-7). In particular, for every vector x = Σ_{i=1}^{n} ξi ei in Vu,

‖x‖² = Σ_{i=1}^{n} Σ_{k=1}^{n} g_ik ξi* ξk = x†Gx        (14.7-2)

* See the footnote to Sec. 14.5-1.
The hermitian form (Sec. 13.5-3) (2) is called the fundamental form of Vu in the scheme of measurements defined by the base vectors ei.
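The role of the Gram matrix G in Eq. (14.7-1) can be made concrete by embedding a non-orthonormal basis in an ambient orthonormal frame. The sketch below is illustrative only (not from the handbook; it assumes NumPy, and the frame construction via the matrix `E` is an ad hoc device).

```python
import numpy as np

# Sketch: with a non-orthonormal basis (columns of E in an ambient
# orthonormal frame), the inner product of component columns a, b is
# (a, b) = a^dagger G b with G = [(e_i, e_k)] (Eq. 14.7-1).
E = np.array([[1.0, 1.0], [0.0, 1.0]])   # base vectors e_1, e_2 as columns
G = E.conj().T @ E                        # Gram matrix g_ik = (e_i, e_k)

a = np.array([1.0, 2.0])                  # components alpha_i
b = np.array([3.0, -1.0])                 # components beta_k

ip_components = a.conj() @ G @ b          # a^dagger G b
ip_ambient = np.vdot(E @ a, E @ b)        # same vectors in the ambient frame
norm_sq = (a.conj() @ G @ a).real         # ||x||^2 = x^dagger G x (Eq. 14.7-2)
```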
14.7-2. Change of Reference System (see also Secs. 14.6-1 and 16.8-1). If one introduces a new set of base vectors ēk such that a = Tā, b = Tb̄ (Sec. 14.6-1), the invariance (Sec. 14.1-4) of

(a, b) = a†Gb = Σ_{i=1}^{n} Σ_{k=1}^{n} g_ik αi* βk = Σ_{i=1}^{n} Σ_{k=1}^{n} ḡ_ik ᾱi* β̄k = ā†Ḡb̄        (14.7-3)

implies

Ḡ = [ḡ_ik] ≡ [(ēi, ēk)] = T†GT        (14.7-4)
14.7-3. Orthogonal Vectors and Orthonormal Sets of Vectors. (a) Two vectors a, b of a unitary vector space Vu are (mutually) orthogonal if and only if (a, b) = 0 (γ = 90 deg, Sec. 14.2-7b). An orthonormal (normal orthogonal) set of vectors is a set of mutually orthogonal unit vectors u1, u2, . . . , so that

(ui, uk) = δ_ik = {1 if i = k; 0 if i ≠ k}        (i, k = 1, 2, . . .)        (14.7-5)

Every set of mutually orthogonal nonzero vectors (and, in particular, every orthonormal set) is linearly independent, so that the largest number of vectors in any such set (the orthogonal dimension of Vu) cannot exceed the linear dimension of Vu (see also Secs. 14.2-3 and 14.2-4b).

(b) Bessel's Inequality (see also Sec. 15.2-3b). Given a finite or infinite orthonormal set u1, u2, . . . and any vector a in Vu,

Σ_k |(uk, a)|² ≤ ‖a‖²        (Bessel's inequality)        (14.7-6)

The equal sign applies if and only if the vector a belongs to the linear manifold spanned by the orthonormal set (see also Sec. 14.7-4). Bessel's inequality is closely related to the projection theorem of Sec. 14.2-8 and is often used to prove the convergence of infinite series.
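Both cases of Bessel's inequality (14.7-6), strict inequality and equality, can be exhibited with a small orthonormal set. The sketch below is illustrative only (not from the handbook; it assumes NumPy).

```python
import numpy as np

# Sketch of Bessel's inequality (Eq. 14.7-6): for an orthonormal set
# {u1, u2} in R^3 the component energies never exceed ||a||^2, with
# equality only when a lies in the spanned manifold.
u1 = np.array([1.0, 0.0, 0.0])
u2 = np.array([0.0, 1.0, 0.0])
a = np.array([3.0, 4.0, 12.0])            # has a component outside span{u1, u2}

bessel_sum = np.vdot(u1, a)**2 + np.vdot(u2, a)**2
norm_sq = np.vdot(a, a)

a_in_span = np.array([3.0, 4.0, 0.0])     # equality case
bessel_sum_span = np.vdot(u1, a_in_span)**2 + np.vdot(u2, a_in_span)**2
```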
14.7-4. Orthonormal Bases (Complete Orthonormal Sets). (a) In a finite-dimensional unitary vector space of dimension n, every orthonormal set of n vectors is a basis (orthonormal basis). More generally, in every complete unitary vector space Vc (this includes both finite-dimensional unitary vector spaces and Hilbert spaces, Sec. 14.2-7c), an orthonormal set of vectors u1, u2, . . . constitutes an orthonormal basis (complete orthonormal set, complete orthonormal system) if and only if it satisfies the following conditions:

1. Every vector a of Vc can be expressed in the form a = â1u1 + â2u2 + ⋯ with âk = (uk, a) (k = 1, 2, . . .).

2. Given any vector a = â1u1 + â2u2 + ⋯,
   ‖a‖² = |â1|² + |â2|² + ⋯        (Parseval's identity)

3. Given any pair of vectors a = â1u1 + â2u2 + ⋯, b = b̂1u1 + b̂2u2 + ⋯,
   (a, b) = â1*b̂1 + â2*b̂2 + ⋯

4. The orthonormal set u1, u2, . . . is not contained in any other orthonormal set of Vc. If a is a vector of Vc, then (uk, a) = 0 (k = 1, 2, . . .) implies a = 0.

Each of these four conditions implies the three others. The relative simplicity of the above expressions for ‖a‖² and (a, b) makes orthonormal bases especially useful as reference systems. Note that the concept of a complete orthonormal set extends the definition of a basis (Sec. 14.2-4) to suitable infinite-dimensional vector spaces in the following sense: if a, a' are two vectors with identical components âk, then ‖a − a'‖ = 0 (Uniqueness Theorem).
(b) Construction of Orthonormal Sets of Vectors. Given any countable (finite or infinite) set of linearly independent vectors e1, e2, . . . of a complete unitary vector space, there exists an orthonormal set u1, u2, . . . spanning the same linear manifold. Such an orthonormal set may be constructed with the aid of the following recursion formulas (Gram-Schmidt orthogonalization process, see also Sec. 15.2-5):

v1 = e1        ui = vi / ‖vi‖
v_{i+1} = e_{i+1} − Σ_{k=1}^{i} (uk, e_{i+1}) uk        (i = 1, 2, . . .)        (14.7-7)
14.7-5. Matrices Corresponding to Hermitian-conjugate Operators (see also Secs. 14.4-3 to 14.4-6, 13.3-1, and 13.3-2). (a) Given a linear operator A represented by the matrix A, the hermitian conjugate A† is represented by G⁻¹A†G in the same scheme of measurements.*

(b) For an orthonormal reference system u1, u2, . . . , one has G = I (Sec. 14.7-4), and

A ≡ [a_ik] ≡ [(ui, Auk)]        (14.7-8)

so that hermitian-conjugate operators correspond to hermitian-conjugate matrices, and conversely. Thus hermitian, skew-hermitian, and unitary operators correspond to matrices of the same respective types, and conversely. In particular, symmetric, skew-symmetric, and orthogonal operators defined on real vector spaces correspond, respectively, to symmetric, skew-symmetric, and orthogonal matrices whenever an orthonormal reference system is used. More generally, unitary or orthogonal operators correspond to matrices of the same respective type whenever the reference base vectors are orthogonal (not necessarily orthonormal).

14.7-6. Reciprocal Bases. (a) For every basis e1, e2, . . . , en in a finite-dimensional vector space, there exists a uniquely corresponding reciprocal (dual) basis e¹, e², . . . , eⁿ defined by the symmetric relationship
(eⁱ, ek) = δ_k^i = {1 if i = k; 0 if i ≠ k}        (i, k = 1, 2, . . . , n)        (14.7-9a)

so that each eⁱ is perpendicular to all ek with k ≠ i, and

[(eⁱ, eᵏ)] = G⁻¹ = [(ei, ek)]⁻¹        (14.7-9b)

eᵏ = Σ_{i=1}^{n} (G⁻¹)_ik ei        (k = 1, 2, . . . , n)        (14.7-10)

(b) Vectors a, b, . . . represented in the ei reference system by column matrices a, b, . . . are represented in the eⁱ system by the column matrices Ga, Gb, . . . ,† and

(a, b) = a†Gb = a†(Gb) = (Ga)†b = (Ga)†G⁻¹(Gb)        (14.7-11)

In particular, (eⁱ, a) = αi. A linear operator A given by the matrix A in the ei scheme of measurements is represented by the matrix GAG⁻¹ in the eⁱ scheme of measurements.

(c) The reciprocal basis ē¹, ē², . . . , ēⁿ corresponding to a new set of base vectors

ēk = Σ_{i=1}^{n} t_ik ei        with        det [t_ik] ≠ 0

* Note that A† is represented by the matrix A† in the scheme of measurements corresponding to the reciprocal basis, Sec. 14.7-6.
† One can also represent the vectors a, b, . . . by the row matrices (Ga)†, (Gb)†, . . . or their transposes, corresponding to a representation by covariant vector components (Secs. 16.2-1 and 16.7-3).
14.7-7
VECTORS, TRANSFORMATIONS, AND MATRICES
456
is given by
=^ <"*
(14.7-12)
i=l
The base vectors e* and e* are said to transform contragrediently (see also Sec. 16.6-2).
Similarly
x = Tx
implies
Gx = T(Gx)
(14.7-13)
(d) Every orthonormal basis (Sec. 14.7-4a) is identical with its reciprocal basis (self-dual), so that e^i = e_i = u_i (i = 1, 2, . . . , n), and

G = I   Gx ≡ x   GAG^{-1} ≡ A   (14.7-14)
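The defining relations (14.7-9a) and (14.7-10) can be checked numerically. The sketch below, in plain Python, uses a hypothetical two-dimensional basis e_1 = (1, 0), e_2 = (1, 1); the Gram matrix G, its inverse, and the reciprocal basis follow exactly the formulas above.

```python
# Reciprocal (dual) basis in R^2, per Eqs. (14.7-9) to (14.7-10).
# The basis e_1 = (1, 0), e_2 = (1, 1) is a hypothetical example.

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

e = [(1.0, 0.0), (1.0, 1.0)]          # base vectors e_1, e_2

# Gram matrix G = [(e_i, e_k)] and its inverse [g^{ik}] (2 x 2 case)
G = [[dot(e[i], e[k]) for k in range(2)] for i in range(2)]
det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
Ginv = [[ G[1][1] / det, -G[0][1] / det],
        [-G[1][0] / det,  G[0][0] / det]]

# Reciprocal basis e^k = sum_i g^{ik} e_i   (14.7-10)
e_rec = [tuple(sum(Ginv[i][k] * e[i][c] for i in range(2)) for c in range(2))
         for k in range(2)]

# Check the defining relation (e^i, e_k) = delta_ik   (14.7-9a)
for i in range(2):
    for k in range(2):
        assert abs(dot(e_rec[i], e[k]) - (1.0 if i == k else 0.0)) < 1e-12
```

Here the reciprocal vectors come out as e^1 = (1, −1) and e^2 = (0, 1), each perpendicular to the other original base vector, as the text requires.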
14.7-7. Comparison of Notations. In order to permit easy reference to standard textbooks, subscripts only have been used throughout Chaps. 12 to 14 to label vector components and matrix elements. The improved dummy-index notation employed in tensor analysis and described in Sec. 16.1-3 uses superscripts as well as subscripts. Table 14.7-1 reviews the different notations used to describe vectors and linear
operators and may be used to translate one notation into the other (see also Sec. 16.2-1).
Table 14.7-1. Comparison of Different Notations Describing Scalars, Vectors, and Linear Operators

Coordinate-free (invariant) notation | Matrix representation | Components, conventional notation | Components, dummy-index notation
α (scalar) | α | α | α
(e_i, e_k) | [(e_i, e_k)] = G | g_{ik} | g_{ik}
(e^i, e^k) | [(e^i, e^k)] = G^{-1} | no special symbol | g^{ik}
a (vector) | a | a_i | a^i
a (vector) | Ga | no special symbol | a_i = g_{ik} a^k
y = Ax; A (linear operator, dyadic) | A | y_i = Σ_k a_{ik}x_k; a_{ik} | y^i = A^i_k x^k; A^i_k
y = Ax; A (linear operator, dyadic) | GA | no special symbols | A_{ik} = g_{ij} A^j_k
y = Ax; A (linear operator, dyadic) | AG^{-1} | no special symbols | A^{ik} = A^i_j g^{jk}
y = Ax; A (linear operator, dyadic) | GAG^{-1} | no special symbols | A_i^k = g_{ij} A^j_l g^{lk}
14.8. EIGENVECTORS AND EIGENVALUES OF LINEAR OPERATORS
14.8-1. Introduction. The study of eigenvectors and eigenvalues is of outstanding practical interest, because

1. Many relations involving a linear operator are radically simplified if its eigenvectors are introduced as reference base vectors (diagonalization of matrices, quadratic forms, solution of operator equations in spectral form; see also Sec. 15.1-1).
2. The eigenvalues of a linear operator specify important properties of the operator without reference to a particular coordinate system.
In many applications, eigenvectors and eigenvalues of linear operators have a direct geometrical or physical significance; they can usually be interpreted in terms of a maximum-minimum problem (Sec. 14.8-8). The most important applications involve hermitian operators, which have real eigenvalues (Secs. 14.8-4 and 14.8-10). 14.8-2. Invariant Manifolds. Decomposable Linear Transformations (Linear Operators) and Matrices. A manifold V_1 of a linear vector space V is invariant with respect to (reduces) a given linear transformation A of V into itself if and only if A transforms every vector x of V_1 into a vector Ax of V_1. Given a reference basis e_1, e_2, . . . , e_m, e_{m+1}, . . . in V such that e_1, e_2, . . . , e_m span V_1, A is represented by a matrix A which can be partitioned in the form
A = [ A_1   B  ]
    [ [0]   A_2 ]   (14.8-1)

where A_1 is the m × m matrix representing the linear transformation A_1 of V_1 "induced" by A. A_1 may or may not be capable of further reduction.
A linear transformation (linear operator) A of the vector space V into itself is decomposable (reducible, completely reducible*) if and only if V is the direct sum V = V_1 ⊕ V_2 ⊕ · · · (Sec. 12.7-5a) of two or more subspaces V_1, V_2, . . . , each invariant with respect to A. In this case, one writes A as the direct sum A = A_1 ⊕ A_2 ⊕ · · · of the linear transformations A_1, A_2, . . . respectively induced by A in V_1, V_2, . . . .
A square matrix A represents a decomposable operator A if and only if A is similar to a step matrix (direct sum of matrices A_1, A_2, . . . corresponding
* See the footnote to Sec. 14.9-2b.
to A_1, A_2, . . . ; see also Sec. 13.2-9). A matrix A with this property is also called decomposable (reducible, completely reducible*). 14.8-3. Eigenvectors, Eigenvalues, and Spectra (see also Sec. 13.4-2). (a) An eigenvector (proper vector, characteristic vector) of the linear transformation (linear operator) A defined on a linear vector space V is a vector y of V such that

Ay = λy   (y ≠ 0)   (14.8-2)

where λ is a suitably determined scalar called the eigenvalue (proper value, characteristic value) of A associated with the eigenvector y.
(b) If y is an eigenvector associated with the eigenvalue λ of A, the same is true for every vector αy ≠ 0. If y_1, y_2, . . . , y_s are eigenvectors associated with the eigenvalue λ of A, the same is true for every vector α_1y_1 + α_2y_2 + · · · + α_sy_s ≠ 0; these vectors span a linear manifold invariant with respect to A (Sec. 14.8-2; see also Sec. 14.8-4c). This theorem also applies to convergent infinite series of eigenvectors in Hilbert spaces. An eigenvalue λ associated with exactly m > 1 linearly independent eigenvectors is said to be m-fold degenerate; m is called the degree of degeneracy or geometrical multiplicity of the eigenvalue. Eigenvectors associated with different eigenvalues of a linear operator are linearly independent. A linear operator defined on an n-dimensional vector space has at most n distinct eigenvalues. Every eigenvalue of a nonsingular operator is different from zero.
(c) Given a linear operator A with eigenvalues λ, αA has the eigenvalues αλ, and A^p has the eigenvalues λ^p (p = 0, 1, 2, . . . ; p = 0, ±1, ±2, . . . if A is nonsingular). Every polynomial f(A) (Sec. 14.4-2b) has the eigenvalues f(λ) (see also Sec. 13.4-5b). All these functions of A have the same eigenvectors as A.
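The rule that f(A) has the eigenvalues f(λ), with the same eigenvectors, is easy to verify on a small example. The sketch below uses a hypothetical triangular 2 × 2 matrix (eigenvalues 2 and 3) and the polynomial f(A) = A² + A.

```python
# Eigenvalues of polynomials of an operator (Sec. 14.8-3c): if Ay = lambda y,
# then f(A)y = f(lambda)y. The 2 x 2 triangular matrix is a hypothetical example.

def matmul(X, Y):
    return [[sum(X[i][j] * Y[j][k] for j in range(len(Y)))
             for k in range(len(Y[0]))] for i in range(len(X))]

def matadd(X, Y):
    return [[X[i][k] + Y[i][k] for k in range(len(X[0]))] for i in range(len(X))]

A = [[2, 1],
     [0, 3]]                      # triangular: eigenvalues 2 and 3
fA = matadd(matmul(A, A), A)      # f(A) = A^2 + A

y = [1, 0]                        # eigenvector of A for lambda = 2: A y = 2 y
assert [A[i][0] * y[0] + A[i][1] * y[1] for i in range(2)] == [2, 0]

# f(A) keeps the same eigenvector, with eigenvalue f(2) = 2^2 + 2 = 6
assert [fA[i][0] * y[0] + fA[i][1] * y[1] for i in range(2)] == [6, 0]
```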
(d) The Spectrum of a Linear Operator. The spectrum of a linear operator A mapping a complete normed vector space (Banach space, Sec. 14.2-7a) V into itself is the set of all complex numbers (spectral values, eigenvalues†) λ such that the vector equation

Ax − λx = f

does not have a unique solution x = (A − λI)^{-1}f of finite norm for every vector f of finite norm (see also Sec. 14.8-10). More precisely stated, the operator A − λI does not have a unique bounded inverse (A − λI)^{-1}
* See the footnote to Sec. 14.9-2b.
† Some authors refer to all spectral values as eigenvalues; others restrict the use of this term to the discrete spectrum.
(resolvent operator).
The spectrum may be partitioned into

1. The discrete spectrum (point spectrum), defined by Eq. (2) with eigenvectors y ≠ 0 of finite norm ‖y‖
2. The continuous spectrum, where (A − λI)^{-1} is unbounded with domain dense in V
3. The residual spectrum, where (A − λI)^{-1} is unbounded with domain not dense in V

The spectrum of a linear operator A contains its approximate spectrum, defined as the set of all complex numbers λ such that there exists a sequence of unit vectors u_1, u_2, . . . with ‖(A − λI)u_n‖ < 1/n (n = 1, 2, . . .). The approximate spectrum contains both the discrete and the continuous spectrum (see also Sec. 14.8-4). The residual spectrum of A is contained in the discrete spectrum of A†.
(e) The spectrum of a linear operator A is identical with the spectrum of every matrix A representing A in the manner of Sec. 14.5-2. The algebraic multiplicity m′_j of any discrete eigenvalue λ_j of A is the algebraic multiplicity of the corresponding matrix eigenvalue (Sec. 13.4-3a). m′_j is greater than, or equal to, the geometrical multiplicity m_j of λ_j (see also Sec. 14.8-4c). For every linear operator having a purely discrete spectrum and a finite trace, Tr (A) equals the sum of all eigenvalues, each counted a number of times equal to its algebraic multiplicity. If det (A) exists, it equals the similarly computed product of the eigenvalues (see also Sec. 13.4-3b).
The characteristic equation F_A(λ) = 0 associated with a class of similar finite matrices (Sec. 13.4-5a) is the characteristic equation of the corresponding operator A and yields its eigenvalues together with their algebraic multiplicities. The Cayley-Hamilton theorem [F_A(A) = 0, Sec. 13.4-7a] and the theorems of Sec. 13.4-7b apply to linear operators defined on finite-dimensional vector spaces.
(f) If A, y, and z are bounded, then Ay = λy, A†z = μz implies either μ = λ* or (y, z) = 0 (see also Sec. 14.8-4). 14.8-4. Eigenvectors and Eigenvalues of Normal and Hermitian Operators (see also Secs. 13.4-2, 13.4-4a, 14.4-6, 14.8-8, 15.3-3b, 15.3-4, and 15.4-6). (a) If A is a normal operator (A†A = AA†, Sec. 14.4-8b), then A and A† have identical eigenvectors; corresponding eigenvalues of A and A† are complex conjugates. The spectrum of every normal operator is identical with its approximate spectrum; the residual spectrum is empty (see also Sec. 14.8-3d). For every normal operator with eigenvalues λ, the hermitian operators ½(A + A†) = H_1, (1/2i)(A − A†) = H_2, and A†A have the same eigenvectors as A and the respective eigenvalues Re(λ), Im(λ), and |λ|².
(b) All spectral values of any hermitian operator are real. The converse is not necessarily true; but every normal operator having a real spectrum is hermitian.
The spectrum of every bounded* hermitian operator A is a closed set of real numbers; its largest and smallest values equal sup_{‖x‖=1} (Ax, x) and inf_{‖x‖=1} (Ax, x).
(c) The following important special properties of normal operators apply, in particular, to hermitian operators:

1. Orthogonality of Eigenvectors. Eigenvectors corresponding to different eigenvalues of a normal operator are mutually orthogonal.
2. Completeness Property of Eigenvectors. Every bounded normal operator A defined on a complete unitary vector space V_U is completely reduced (Sec. 14.8-2) by a subspace spanned by a complete orthonormal set of eigenvectors (Sec. 14.7-4) and a subspace orthogonal to every eigenvector of A. If V_U is separable (in particular, if V_U is finite-dimensional), the orthonormal eigenvectors span V_U.

Every normal operator A defined on a complete and separable unitary vector space V_U is decomposable into a direct sum (Sec. 14.8-2) of normal operators A_1, A_2, . . . defined on corresponding subspaces V_1, V_2, . . . of V_U, so that each A_j has the single eigenvalue λ_j. In each subspace V_j there exists a complete orthonormal set of eigenvectors y_k^{(j)}. If the eigenvalue λ of a normal operator has the (finite) algebraic multiplicity m, then the degree of degeneracy of λ is also equal to m, and conversely.
(d) Spectral Representation. The following properties of normal operators apply, in particular, to hermitian operators. Given an orthonormal set of eigenvectors y_1, y_2, . . . of the bounded normal operator A and any vector x = ξ_1y_1 + ξ_2y_2 + · · · [ξ_k = (y_k, x), Sec. 14.7-4a], note

Ax = λ_1ξ_1y_1 + λ_2ξ_2y_2 + · · ·   (spectral representation of the operator A)
(x, Ax) = λ_1|ξ_1|² + λ_2|ξ_2|² + · · ·   (14.8-3)

where λ_k is the eigenvalue associated with y_k (see also Sec. 13.5-4b). See Refs. 14.10 and 14.11 for analogous properties of normal operators whose spectra are not discrete. If A is an operator of this type, the sums (3) must be replaced by Stieltjes integrals over the spectrum.
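Both sums in Eq. (14.8-3) can be checked on a small real symmetric (hence hermitian) operator. The 2 × 2 matrix, its eigenvalues 1 and 3, the orthonormal eigenvectors, and the vector x below are a hypothetical worked example.

```python
# Spectral representation (14.8-3) for a real symmetric operator.
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

A = [[2.0, 1.0],
     [1.0, 2.0]]
lams = [1.0, 3.0]                                   # eigenvalues of A
s = 1.0 / math.sqrt(2.0)
ys = [(s, -s), (s, s)]                              # orthonormal eigenvectors

x = (2.0, 0.0)
xis = [dot(y, x) for y in ys]                       # xi_k = (y_k, x)

# Ax = lambda_1 xi_1 y_1 + lambda_2 xi_2 y_2
Ax = [sum(A[i][k] * x[k] for k in range(2)) for i in range(2)]
Ax_spectral = [sum(l * xi * y[i] for l, xi, y in zip(lams, xis, ys))
               for i in range(2)]
assert all(abs(a - b) < 1e-12 for a, b in zip(Ax, Ax_spectral))

# (x, Ax) = lambda_1 |xi_1|^2 + lambda_2 |xi_2|^2
assert abs(dot(x, Ax) - sum(l * xi * xi for l, xi in zip(lams, xis))) < 1e-12
```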
14.8-5. Determination of Eigenvalues and Eigenvectors: Finite-dimensional Case (see also Secs. 13.4-5 and 13.5-5; refer to Sec. 20.3-5 for numerical methods). Given the n × n matrix A = [a_{ik}] representing a linear operator A in the scheme of measurements defined by the base vectors e_1, e_2, . . . , e_n, one determines the eigenvalues λ and the eigenvectors y = η_1e_1 + η_2e_2 + · · · + η_ne_n of A as follows:
* See footnote to Sec. 14.4-1.
1. Find the eigenvalues as roots of the nth-degree algebraic equation (characteristic equation)

det (A − λI) = | a_11 − λ   a_12       · · ·   a_1n     |
               | a_21       a_22 − λ   · · ·   a_2n     |  = 0
               | · · ·                                   |
               | a_n1       a_n2       · · ·   a_nn − λ |

2. Obtain the components η_1^{(j)}, η_2^{(j)}, . . . , η_n^{(j)} of m_j linearly independent eigenvectors

y^{(j)} = η_1^{(j)}e_1 + η_2^{(j)}e_2 + · · · + η_n^{(j)}e_n

associated with each distinct eigenvalue λ_j by solving the n simultaneous linear equations

(a_11 − λ_j)η_1^{(j)} + a_12η_2^{(j)} + · · · + a_1nη_n^{(j)} = 0
a_21η_1^{(j)} + (a_22 − λ_j)η_2^{(j)} + · · · + a_2nη_n^{(j)} = 0
· · ·
a_n1η_1^{(j)} + a_n2η_2^{(j)} + · · · + (a_nn − λ_j)η_n^{(j)} = 0

whose system matrix has the rank n − m_j (Sec. 1.9-5). The m_j column matrices y^{(j)} = {η_i^{(j)}} are called modal columns (or simply eigenvectors) of the given matrix A, associated with the eigenvalue λ_j. Each modal column may be multiplied by an arbitrary constant different from zero.
14.8-6. Reduction and Diagonalization of Matrices. Principal-axes Transformations (see also Sec. 14.8-2). (a) Every finite set of linearly independent eigenvectors y_1, y_2, . . . , y_s associated with the same eigenvalue of a linear operator A spans a subspace V_1 invariant with respect to A (Sec. 14.8-3b). If y_1, y_2, . . . , y_s are introduced as the first s base vectors, A will be represented by a matrix of the form

A = [ A_1   B  ]
    [ [0]   A_2 ]   (14.8-4)

(b) In particular, let A be a normal operator defined on a finite-dimensional vector space of dimension n. Then the procedure of Sec. 14.8-5 will yield exactly n linearly independent eigenvectors

y^{(j)} = η_1^{(j)}e_1 + η_2^{(j)}e_2 + · · · + η_n^{(j)}e_n
The n corresponding modal columns form a nonsingular modal matrix

T = [t_{ik}]   with   t_{ik} = η_i^{(j)}   (14.8-5)

where each distinct eigenvalue λ_j accounts for m_j adjacent columns; the n = m_1 + m_2 + · · · columns are labeled by successive values of k = 1, 2, . . . , n. The alias-type coordinate transformation

x = T x̄   (14.8-6)

(Sec. 14.6-1) introduces the n eigenvectors y^{(j)} of the normal operator A as a reference basis. The similarity transformation

Ā = T^{-1}AT   (14.8-7)

yields the matrix Ā representing A in the new reference system; Ā is a step matrix

Ā = [ Ā_1    [0]    · · · ]
    [ [0]    Ā_2    · · · ]   (14.8-8)
    [ · · ·  · · ·  · · · ]

where each submatrix Ā_j corresponds to a different eigenvalue λ_j of A and has exactly m_j rows and m_j columns (see also Sec. 14.8-4c).
(c) If the n eigenvectors defining the columns of the modal matrix (5) are mutually orthogonal, then the similarity transformation (7) yields a diagonal matrix Ā (diagonalization of the given matrix A, Sec. 13.4-4a). To obtain a transformation matrix T which diagonalizes a given matrix A representing a normal operator A, proceed as follows:
1. If all eigenvalues λ_j are nondegenerate (this is true whenever the characteristic equation has no multiple roots), every modal matrix (5) diagonalizes A.
2. If there are degenerate eigenvalues λ_j, orthogonalize each set of m_j eigenvectors y^{(j)} = η_1^{(j)}e_1 + η_2^{(j)}e_2 + · · · + η_n^{(j)}e_n by the Gram-Schmidt process (Sec. 14.7-4b). The n modal columns y^{(j)} = {η_i^{(j)}} representing the resulting m_1 + m_2 + · · · = n mutually orthogonal eigenvectors form the desired transformation matrix T.
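The similarity transformation (14.8-7) can be checked directly: with the modal columns of a symmetric matrix as columns of T, T^{-1}AT comes out diagonal with the eigenvalues on the diagonal. The 2 × 2 data below (eigenvalues 1 and 3, modal columns (1, −1) and (1, 1)) are a hypothetical example.

```python
# Diagonalization by the modal matrix, Eq. (14.8-7): T^{-1} A T = diag(1, 3).

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][j] * Y[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

A = [[2.0, 1.0],
     [1.0, 2.0]]
T = [[1.0, 1.0],                 # modal columns (1, -1) and (1, 1)
     [-1.0, 1.0]]                # for the eigenvalues 1 and 3

detT = T[0][0] * T[1][1] - T[0][1] * T[1][0]
Tinv = [[ T[1][1] / detT, -T[0][1] / detT],
        [-T[1][0] / detT,  T[0][0] / detT]]

Abar = matmul(Tinv, matmul(A, T))
assert abs(Abar[0][0] - 1.0) < 1e-12 and abs(Abar[1][1] - 3.0) < 1e-12
assert abs(Abar[0][1]) < 1e-12 and abs(Abar[1][0]) < 1e-12
```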
(d) In many applications, the original reference basis e_1, e_2, . . . , e_n is orthonormal (rectangular cartesian coordinates), so that

(x, Ax) = x†Ax = Σ_{i=1}^n Σ_{k=1}^n a_{ik} ξ_i^* ξ_k   (14.8-9)
(Sec. 14.7-4a), and the new reference basis is taken to be an orthonormal set of eigenvectors f^{(j)} (obtained, if necessary, with the aid of the Gram-Schmidt process). Then every modal matrix T formed from the f^{(j)} is a unitary matrix. A unitary coordinate transformation (6) introducing n orthonormal eigenvectors as base vectors is called a principal-axes transformation for the operator A (see also Secs. 2.4-7 and 3.5-6). A principal-axes transformation for a hermitian operator A reduces the corresponding hermitian form (9) to its normal form (13.5-9) (see also Sec. 13.5-4).
(e) Two hermitian operators A and B can be represented by diagonal matrices in the same scheme of measurements if and only if BA = AB (see also Secs. 13.4-4b and 13.5-5).
14.8-7. "Generalized" Eigenvalue Problems (see also Sees. 13.5-5 and 15.4-5). (a) Some applications require one to find "eigenvectors" y and "eigenvalues" m defined by a relation of the form
Ay = MBy
(y * 0)
(14.8-10)
where B is a nonsingular operator. The quantities y and n are necessarily the eigenvectors and eigenvalues of the operator B_1A; the problem reduces to that of Eq. (2) if B is the identity operator. If both A and B are hermitian, and B is positive definite (Sec. 14.4-4), then 1. All eigenvalues n are real. 2. One can introduce the new inner product
(a, h)B = (a, Bb)
(14.8-11)
(see also Sec. 14.2-6a). In terms of this new inner product, the operator B_1A becomes hermitian, and the orthogonality and com pleteness theorems of Sec. 14.8-4c apply. In particular, eigenvec tors y associated with different eigenvalues /x are mutually orthogonal relative to the scalar product (11) (see also Sec. 14.7-3a). (b) Consider a finite-dimensional unitary vector space and an orthon
normal reference system m, 112, . . . , un, so that (a, b) =
) atfik
(Sec. 14.7-4a). Let A and B be represented by hermitian matrices A = [aik], B ss [bik], where B is positive definite. Then the eigen values /Lt defined by Eq. (10) are identical with the roots of the nth-degree algebraic equation
det (A — pB) s= det [aik — /*&*] = 0 (characteristic equation for THE "GENERALIZED" EIGENVALUE PROBLEM) (14.8-12)
For each root μ_j of multiplicity m_j, there are exactly m_j linearly independent eigenvectors y^{(j)} = η_1^{(j)}u_1 + η_2^{(j)}u_2 + · · · + η_n^{(j)}u_n; the components η_i^{(j)} are obtained from the linear simultaneous equations

Σ_{k=1}^n (a_{ik} − μ_j b_{ik}) η_k^{(j)} = 0   (i = 1, 2, . . . , n)   (14.8-13)
Application of the Gram-Schmidt process (Sec. 14.7-4b) to the m_1 + m_2 + · · · = n eigenvectors y^{(j)} yields a complete orthonormal set relative to the new inner product

(a, b)_B = a†Bb = Σ_{i=1}^n Σ_{k=1}^n b_{ik} α_i^* β_k   (14.8-14)

If this orthonormal set of eigenvectors is introduced as a reference basis in the manner of Sec. 14.8-6c, the hermitian forms (x, Ax) = x†Ax and (x, Bx) = x†Bx take the form (13.5-12) (simultaneous diagonalization of two hermitian forms, Sec. 13.5-5).
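A minimal worked instance of Eqs. (14.8-12) and the B-orthogonality property: for the hypothetical symmetric matrices below (B positive definite), det(A − μB) = 2μ² − 6μ + 3 = 0, and the two eigenvectors turn out orthogonal in the inner product (a, b)_B.

```python
# "Generalized" eigenvalue problem Ay = mu B y (Sec. 14.8-7), 2 x 2 sketch.
import math

A = [[2.0, 1.0],
     [1.0, 2.0]]
B = [[2.0, 0.0],
     [0.0, 1.0]]

# det(A - mu B) = (2 - 2 mu)(2 - mu) - 1 = 2 mu^2 - 6 mu + 3 = 0   (14.8-12)
a, b, c = 2.0, -6.0, 3.0
mus = sorted([(-b - math.sqrt(b * b - 4 * a * c)) / (2 * a),
              (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)])

# Eigenvector for each root from the first row of (A - mu B) y = 0
ys = [(1.0, 2.0 * mu - 2.0) for mu in mus]
for mu, y in zip(mus, ys):
    for i in range(2):
        lhs = sum((A[i][k] - mu * B[i][k]) * y[k] for k in range(2))
        assert abs(lhs) < 1e-12

# Eigenvectors for distinct mu are orthogonal in the inner product (a, b)_B
yBy = sum(ys[0][i] * B[i][k] * ys[1][k] for i in range(2) for k in range(2))
assert abs(yBy) < 1e-12
```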
(c) Refer to Sec. 15.4-5 for a discussion of analogous "generalized" eigenvalue problems involving an infinite-dimensional vector space. 14.8-8. Eigenvalue Problems as Stationary-value Problems (see also Sec. 15.4-7). (a) Consider a hermitian operator A defined on a finite-dimensional unitary vector space V_U, and introduce an orthonormal reference basis* u_1, u_2, . . . , u_n, so that A is represented by a hermitian matrix A ≡ [a_{ik}]. The important problem of finding the eigenvectors y^{(j)} = η_1^{(j)}u_1 + η_2^{(j)}u_2 + · · · + η_n^{(j)}u_n and the corresponding eigenvalues λ_j of A is precisely equivalent to each of the following problems:

1. Find (the components η_i of) each vector y ≠ 0 such that
(y, Ay)/(y, y) = y†Ay/y†y = Σ_{i=1}^n Σ_{k=1}^n a_{ik} η_i^* η_k / Σ_{i=1}^n |η_i|²   (Rayleigh's quotient)   (14.8-15)

has a stationary value; y = y^{(j)} yields the stationary value λ_j.

2. Find (the components η_i of) each vector y such that
(y, Ay) = y†Ay = Σ_{i=1}^n Σ_{k=1}^n a_{ik} η_i^* η_k   (14.8-16)

has a stationary value subject to the constraint

(y, y) = y†y = Σ_{i=1}^n |η_i|² = 1   (14.8-17)

y = y^{(j)} yields the stationary value λ_j.
* The orthonormal basis is used for convenience in most applications; in the general case, it is only necessary to substitute (y, Ay) = y†GAy and (y, y) = y†Gy in Eqs. (15) to (17).
3. Find (the components η_i of) each vector y ≠ 0 such that (y, y) has a stationary value subject to the constraint (y, Ay) = 1. y = y^{(j)} yields the stationary value 1/λ_j.

Let the eigenvalues of A be arranged in increasing order, with an m-fold degenerate eigenvalue repeated m times, or λ_1 ≤ λ_2 ≤ · · · ≤ λ_n. The smallest eigenvalue λ_1 is the minimum value of Rayleigh's quotient (15) for an arbitrary vector y of V_U. The rth eigenvalue λ_r in the above sequence similarly is less than or equal to Rayleigh's quotient if y is an arbitrary nonzero vector orthogonal to all eigenvectors associated with λ_1, λ_2, . . . , λ_{r−1}; λ_r is the maximum of min (y, Ay)/(y, y), taken over vectors y orthogonal to an arbitrary (r − 1)-dimensional subspace V_r of V_U (Courant's Minimax Principle).
The last theorems may be restated for problems 2 and 3, and for maxima instead of minima; note that a minimum in problem 1 or 2 corresponds to a maximum in problem 3, and conversely. The inner product (y, Ay) usually has direct physical significance. Problem 3 associates the eigenvalues of A with the principal axes of a second-order hypersurface (see also Secs. 2.4-7 and 3.5-6).
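The extremal property of Rayleigh's quotient (14.8-15) is easy to observe numerically. For the hypothetical symmetric matrix below (eigenvalues 1 and 3), the quotient equals 1 at the eigenvector (1, −1) and stays between the smallest and largest eigenvalue in every other direction.

```python
# Rayleigh's quotient (14.8-15): extremal at eigenvectors, bounded by the
# smallest and largest eigenvalues. Hypothetical 2 x 2 symmetric matrix.
import math

A = [[2.0, 1.0],
     [1.0, 2.0]]

def rayleigh(y):
    yAy = sum(A[i][k] * y[i] * y[k] for i in range(2) for k in range(2))
    return yAy / sum(eta * eta for eta in y)

# At the eigenvector (1, -1) the quotient equals lambda_1 = 1
assert abs(rayleigh((1.0, -1.0)) - 1.0) < 1e-12

# Any other direction gives a value between lambda_1 = 1 and lambda_2 = 3
for t in range(1, 12):
    y = (math.cos(0.3 * t), math.sin(0.3 * t))
    assert 1.0 - 1e-12 <= rayleigh(y) <= 3.0 + 1e-12
```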
(b) Generalizations. The theory of Sec. 14.8-8a may be extended to apply to the "generalized" eigenvalue problem defined by Ay = μBy, where A is a hermitian operator, and B is hermitian and positive definite (Sec. 14.8-7). It is only necessary to replace (y, y) by (y, y)_B = (y, By) in each problem statement of Sec. 14.8-8a. In particular, Rayleigh's quotient (15) is replaced by

(y, Ay)/(y, By) = y†Ay/y†By = Σ_{i=1}^n Σ_{k=1}^n a_{ik} η_i^* η_k / Σ_{i=1}^n Σ_{k=1}^n b_{ik} η_i^* η_k   (Rayleigh's quotient for the "generalized" eigenvalue problem)   (14.8-18)

Analogous theorems apply to suitable operators defined on Hilbert spaces; here the inner products (y, Ay), (y, y), and (y, By) may be integrals rather than sums, so that the stationary-value problems of Sec. 14.8-8a become variation problems (Sec. 15.4-7).
14.8-9. Bounds for the Eigenvalues of Linear Operators (see also Sec. 15.4-10). The following theorems are often helpful for the estimation of eigenvalues.
(a) Every eigenvalue z = λ of a normal linear operator A represented by a finite n × n matrix A = [a_{ik}] is contained in the union of the n circles

|z − a_{ii}| ≤ Σ_{k≠i} |a_{ik}|   (i = 1, 2, . . . , n)   (Gerschgorin's Circle Theorem)   (14.8-19)
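A quick numerical check of the circle theorem, using a hypothetical symmetric tridiagonal Toeplitz matrix whose eigenvalues are known in closed form (4 + 2 cos(kπ/4), k = 1, 2, 3): every eigenvalue indeed lies in one of the row circles |z − a_ii| ≤ Σ_{k≠i}|a_ik|.

```python
# Gerschgorin's circle theorem (14.8-19) on a 3 x 3 symmetric matrix.
import math

A = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]

# Known eigenvalues of this tridiagonal Toeplitz matrix: 4 + 2 cos(k pi / 4)
eigs = [4.0 + 2.0 * math.cos(k * math.pi / 4.0) for k in (1, 2, 3)]

def in_some_circle(z):
    for i in range(3):
        radius = sum(abs(A[i][k]) for k in range(3) if k != i)
        if abs(z - A[i][i]) <= radius + 1e-12:
            return True
    return False

assert all(in_some_circle(lam) for lam in eigs)
```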
Re(λ) lies between the smallest and the largest eigenvalue of ½(A + A†) = H_1, and Im(λ) lies between the smallest and the largest eigenvalue of (1/2i)(A − A†) = H_2 (see also Sec. 13.3-4a).
(b) For hermitian matrices and operators

|λ|² ≤ Σ_i Σ_k |a_{ik}|²   |λ| ≤ ‖A‖   (14.8-20)
(c) Comparison Theorems. Let μ_1 ≤ μ_2 ≤ · · · ≤ μ_n be the sequence (including multiplicities) of the eigenvalues for a finite-dimensional eigenvalue problem (10), where A and B are hermitian, and B is positive definite. Then

1. Addition of a positive-definite hermitian operator to A cannot decrease any eigenvalue μ_r in the above sequence.
2. Addition of a positive-definite hermitian operator to B cannot increase any eigenvalue μ_r.
3. If a constraint restricts the vectors y to an (n − m)-dimensional subspace of V_U, then the n − m eigenvalues μ′_1 ≤ μ′_2 ≤ · · · ≤ μ′_{n−m} of the constrained problem satisfy the relations

μ_r ≤ μ′_r ≤ μ_{r+m}   (r = 1, 2, . . . , n − m)   (14.8-21)

The constraint usually takes the form of m independent linear equations relating the vector components η_i.
These theorems also apply to operators defined on Hilbert spaces if A and B are positive-definite hermitian operators yielding a discrete sequence μ_1 ≤ μ_2 ≤ · · · with finite multiplicities. 14.8-10. Nonhomogeneous Linear Vector Equations (see also Secs. 1.9-4, 15.3-7, and 15.4-12). (a) Given a bounded operator A, the vector equation

Ax − λx = f   (14.8-22)

has a unique solution x for every given vector f if and only if the given scalar λ is not contained in the spectrum of A (Sec. 14.8-3d). If λ equals an eigenvalue λ_1 of A in the sense of Eq. (2), then Eq. (22) has a solution only if the given vector f is orthogonal to every eigenvector of A† associated with the eigenvalue λ_1*. In the latter case, there is an infinite number of solutions; every sum of a particular solution and a linear combination of eigenvectors corresponding to the eigenvalue λ_1 is a solution.
(b) The important special case

Ax = f   (14.8-23)
where A is a bounded normal operator, admits a unique solution x for every given vector f if and only if Ax = 0 implies x = 0, i.e., if and only if A is nonsingular. If A is singular, Eq. (23) has a solution only if f is orthogonal to every eigenvector of A† associated with the eigenvalue zero.
(c) For a hermitian operator A = A† having an orthonormal set of eigenvectors y_k such that f = Σ_{k=1}^∞ (y_k, f)y_k, the solution of Eq. (22) is given by

x = Σ_{k=1}^∞ (y_k, f)/(λ_k − λ) y_k   (14.8-24)

where the λ_k are the (not necessarily distinct) eigenvalues corresponding to each y_k.

14.9. GROUP REPRESENTATIONS AND RELATED TOPICS
14.9-1. Group Representations. (a) Every group (Sec. 12.2-1) can be represented by a homomorphism (Sec. 12.1-6) relating the group elements to a group of nonsingular linear transformations of a vector space (representation space, carrier space), and thus to a group of nonsingular matrices (this is a form of Cayley's theorem stated in Sec. 12.2-9b). A representation of degree or dimension n of a group G in the field F is a group of n × n matrices A, B, . . . over F related to the elements a, b, . . . of G by a homomorphism A = A(a), B = B(b), . . . , so that ab = c implies A(a)B(b) = C(c) for all a, b in G (representation condition). n equals the linear dimension of the representation space. A representation is faithful (true) if and only if it is reciprocal one-to-one (and thus an isomorphism, Sec. 12.1-6). Every group admits a complex vector space as a representation space; i.e., every group has a representation in the field of complex numbers. Such a representation permits one to describe the defining operation of any group in terms of numerical additions and multiplications (see also Secs. 12.1-1 and 14.1-1). Most applications deal with groups of transformations (Sec. 12.2-8; for examples refer to Sec. 14.10-7).
Every group a_1, a_2, . . . , a_g of finite order g admits a faithful representation comprising the g linearly independent permutation matrices (Sec. 13.2-6) A_j = A(a_j) = [a_{ik}(a_j)] defined by

a_{ik}(a_j) = {1 if a_i^{-1}a_ja_k = E; 0 otherwise}   (i, k = 1, 2, . . . , g)   (14.9-1)

where E is the identity element of the given group (regular representation of the finite group). Every finite group is thus isomorphic to a group of permutations (see also Sec. 12.2-8).
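The regular representation (14.9-1) can be built explicitly for a small group. The sketch below uses the cyclic group of order 3 (elements 0, 1, 2 under addition mod 3) as a hypothetical example; each A(a_j) is the permutation matrix with a_{ik}(a_j) = 1 exactly when a_j a_k = a_i, and the representation condition A(a)A(b) = A(ab) holds.

```python
# Regular representation of the cyclic group of order 3, per Eq. (14.9-1).

g = 3

def rep(j):
    # Permutation matrix of the group element j: entry (i, k) is 1 iff j + k = i (mod 3)
    return [[1 if (j + k) % g == i else 0 for k in range(g)] for i in range(g)]

def matmul(X, Y):
    return [[sum(X[i][j] * Y[j][k] for j in range(g)) for k in range(g)]
            for i in range(g)]

# Representation condition: A(a) A(b) = A(ab) for all group elements
for a in range(g):
    for b in range(g):
        assert matmul(rep(a), rep(b)) == rep((a + b) % g)

# The identity element maps to the identity matrix
assert rep(0) == [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```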
(b) Two representations ℛ and ℛ̄ of a group G are similar or equivalent if and only if all pairs of matrices A(a) of ℛ and Ā(a) of ℛ̄ are related by the same similarity transformation (Sec. 13.4-1b) Ā = T^{-1}AT. In this case, the matrices A(a) and Ā(a) are said to describe one and the same linear transformation A(a) of a representation space common to ℛ and ℛ̄ (see also Sec. 14.6-2).
A representation ℛ is bounded, unitary, and/or orthogonal if and only if all its matrices have the corresponding properties. Every representation of a finite group and every unitary representation is bounded. For every bounded representation there exists an equivalent unitary representation.
(c) The rank of any representation ℛ is the greatest number of linearly independent matrices* in ℛ.
14.9-2. Reduction of a Representation. (a) A representation† ℛ of a group G is reducible if and only if the representation space V has a proper subspace V_1 invariant with respect to ℛ, i.e., with respect to every linear transformation of V described by a matrix of ℛ (Sec. 14.5-2). This is true if and only if there exists a similarity transformation Ā = T^{-1}AT which reduces the matrices A(a), B(b), . . . of ℛ simultaneously to corresponding matrices of the form

Ā(a) = [ A_1(a)   A′(a)  ]      B̄(b) = [ B_1(b)   B′(b)  ]
       [ [0]      A_2(a) ]             [ [0]      B_2(b) ]   . . .   (14.9-2)

where A_1(a), B_1(b), . . . are square matrices of equal order (alternative definition). The matrices A_1(a), B_1(b), . . . constitute a representation ℛ_1 of the given group G, with the representation space V_1. A representation which cannot be reduced in this manner is called irreducible.
(b) A representation ℛ will be called decomposable‡ if and only if its representation space is a direct sum V = V_1 ⊕ V_2 ⊕ · · · (Sec. 12.7-5a) of subspaces V_1, V_2, . . . invariant with respect to ℛ. This is true if and only if there exists a similarity transformation Ā = T^{-1}AT which reduces the matrices A(a), B(b), . . . of ℛ simultaneously to
* Linear independence of matrices is defined in the manner of Sec. 14.2-3, since matrices may be regarded as vectors.
† The definitions of this section also apply to any set of linear transformations of a vector space into itself, or to any corresponding set of matrices (not necessarily a group).
‡ The terms reducible, decomposable, and completely reducible are variously interchanged by different authors. These terms are, indeed, equivalent in the case of bounded matrices, transformations, and representations, and thus for all representations of finite groups (Sec. 14.9-2d).
corresponding step matrices

Ā(a) = [ A_1(a)   [0]     · · · ]      B̄(b) = [ B_1(b)   [0]     · · · ]
       [ [0]      A_2(a)  · · · ]             [ [0]      B_2(b)  · · · ]   . . .   (14.9-3)
       [ · · ·    · · ·   · · · ]             [ · · ·    · · ·   · · · ]

(direct sums of matrices, Sec. 13.2-9), where corresponding submatrices are of equal order. Each set of matrices A_i(a), B_i(b), . . . (i = 1, 2, . . .) constitutes a representation ℛ_i of G.
(c) A representation ℛ is completely reducible if and only if it is decomposable into irreducible representations (irreducible components) ℛ^{(1)}, ℛ^{(2)}, . . . .
(d) Conditions for Reducibility (see also Sec. 14.9-5b). Every bounded representation (and, in particular, every representation of a finite group) is either completely reducible or irreducible. A group G has a decomposable representation if and only if it is the direct product (Sec. 12.7-2) of simple groups (Sec. 12.2-5b).
A bounded representation ℛ is completely reducible if and only if there exists a matrix Q, not a multiple of I, which commutes with every matrix of ℛ. Irreducible representations of commutative (Abelian) groups are necessarily one-dimensional.
Whenever all corresponding matrices A, Ā of two irreducible representations ℛ and ℛ̄ are related by the same transformation QA = ĀQ, then either Q = [0] or Q is nonsingular, so that ℛ and ℛ̄ are equivalent (Schur's Lemma).
14.9-3. The Irreducible Representations of a Group. (a) The decomposition ℛ = ℛ^{(1)} ⊕ ℛ^{(2)} ⊕ · · · of a given completely reducible representation ℛ of a group into irreducible components is unique except for equivalence and for the relative order of terms. Every completely reducible representation is uniquely defined by its irreducible components (except for equivalence). If ℛ^{(j)} is one of exactly m_j mutually equivalent irreducible components of ℛ (j = 1, 2, . . .), one may write ℛ = m_1ℛ^{(1)} ⊕ m_2ℛ^{(2)} ⊕ · · ·
(b) For every group G of finite order g,

1. The number m of distinct nonequivalent irreducible representations is finite and equals the number of distinct classes of conjugate elements (Sec. 12.2-5a).
2. If n_j is the dimension of the jth irreducible representation, its rank equals n_j² (Burnside's Theorem); the rank of every representation ℛ of G equals the sum of the ranks n_j² of the distinct irreducible components in ℛ.
3. Each n_j is a divisor of g, and

n_1² + n_2² + · · · + n_m² = g   (14.9-4)

4. The regular representation of G (Sec. 14.9-1a) contains the jth irreducible representation of G exactly n_j times.
(c) The determination of the complete set of irreducible representations of a group G of operators is of particular interest as a key to the solution of certain eigenvalue problems. Given any hermitian operator H which commutes with every operator of a group G, there exists a reciprocal one-to-one correspondence between the distinct eigenvalues λ_i of H and the nonequivalent irreducible representations ℛ^{(j)} of G, and the degree of degeneracy of each λ_i equals the dimension of ℛ^{(j)} (classification of quantum-mechanical eigenvalues from symmetry considerations, Refs. 14.20 to 14.22).
14.9-4. The Character of a Representation. (a) The character of a representation ℛ is the function

χ(a) ≡ Tr [A(a)]   (14.9-5)

defined on the elements a of the group G represented by ℛ. Conjugate group elements (Sec. 12.2-5a) have equal character values. For every bounded representation,

χ(a^{-1}) = [χ(a)]*   (14.9-6)

Two completely reducible representations of the same group G are equivalent if and only if they have identical characters.
(b) The characters of irreducible representations are called simple or primitive characters, and the characters of reducible representations are known as composite characters. ℛ = ℛ_1 ⊕ ℛ_2 ⊕ · · · implies χ(a) = χ_1(a) + χ_2(a) + · · · , where χ(a), χ_1(a), χ_2(a), . . . are the respective characters of ℛ, ℛ_1, ℛ_2, . . . .
14.9-5. Orthogonality Relations (see also Sec. 14.9-6). (a) The primitive characters χ^{(1)}(a), χ^{(2)}(a), . . . respectively associated with the nonequivalent irreducible representations ℛ^{(1)}, ℛ^{(2)}, . . . of a finite group G satisfy the relations

Mean {[χ^{(j)}(a)]*χ^{(j′)}(a)} = (1/g) Σ_{a in G} [χ^{(j)}(a)]*χ^{(j′)}(a) = δ^{jj′} = {1 if j = j′; 0 if j ≠ j′}   (j, j′ = 1, 2, . . . , m)   (14.9-7)

For every completely reducible representation ℛ = m_1ℛ^{(1)} ⊕ m_2ℛ^{(2)} ⊕ · · · of G,

χ(a) = m_1χ^{(1)}(a) + m_2χ^{(2)}(a) + · · ·   (14.9-8)

m_j = Mean {[χ^{(j)}(a)]*χ(a)} = (1/g) Σ_{a in G} [χ^{(j)}(a)]*χ(a)   (14.9-9)

Mean {|χ(a)|²} = (1/g) Σ_{a in G} |χ(a)|² = m_1² + m_2² + · · · ≥ 1   (14.9-10)

Mean {|χ(a)|²} = 1 whenever ℛ is irreducible.
(b) Each of the m nonequivalent irreducible representations ℛ^{(j)} of a finite group G is equivalent to a corresponding unitary irreducible representation comprising the matrices [u_{ik}^{(j)}(a)] (see also Sec. 14.9-1b). The elements of these unitary matrices satisfy
the relations

    Mean {u_ik^(j)(a)* u_{i′k′}^(j′)(a)} = (1/g) Σ_{a in G} u_ik^(j)(a)* u_{i′k′}^(j′)(a) = (1/n_j) δ_j^{j′} δ_i^{i′} δ_k^{k′}
        (j, j′ = 1, 2, . . . , m;  i, k = 1, 2, . . . , n_j;  i′, k′ = 1, 2, . . . , n_{j′})    (14.9-11)
(c) The relations (7) to (11) apply to countably infinite, continuous, and mixed-continuous groups whenever the mean values (Sec. 12.2-12) in question exist. In this case the primitive characters χ^(j)(a) = χ^(j)[a(α_1, α_2, . . .)] constitute a complete orthogonal set of functions in the sense of Sec. 15.2-4.
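As a concrete illustration of the relations (14.9-4), (14.9-7), and (14.9-9), the following sketch evaluates them for the symmetric group S₃; the choice of group, the character table (which is standard), and the tolerances are ours, not the handbook's.

```python
# Numerical check of the character relations (14.9-4), (14.9-7), and (14.9-9)
# for the symmetric group S3 (order g = 6).  Class sizes: 1 (identity),
# 3 (transpositions), 2 (3-cycles).
g = 6
class_size = [1, 3, 2]
chars = {                      # primitive characters chi^(j), one entry per class
    "trivial":  [1,  1,  1],
    "sign":     [1, -1,  1],
    "standard": [2,  0, -1],
}

def mean(chi1, chi2):
    """Mean over the group of chi1(a)* chi2(a), as in Eq. (14.9-7)."""
    return sum(s * c1 * c2 for s, c1, c2 in zip(class_size, chi1, chi2)) / g

names = list(chars)
for j in names:                                  # orthonormality (14.9-7)
    for k in names:
        assert abs(mean(chars[j], chars[k]) - (j == k)) < 1e-12

dims = [chars[j][0] for j in names]              # n_j = chi^(j)(identity)
assert sum(n * n for n in dims) == g             # relation (14.9-4)

chi_reg = [g, 0, 0]                              # character of the regular representation
mult = {j: mean(chars[j], chi_reg) for j in names}
print(mult)   # -> {'trivial': 1.0, 'sign': 1.0, 'standard': 2.0}
```

Each multiplicity m_j computed from Eq. (14.9-9) equals the dimension n_j, as statement 4 above asserts for the regular representation.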
14.9-6. Direct Products of Representations. (a) Given an n_1-dimensional representation ℛ_1 and an n_2-dimensional representation ℛ_2 of the same group G, the n_1n_2 × n_1n_2 matrices each obtained as a direct product (Sec. 13.2-10) of a matrix of ℛ_1 by a matrix of ℛ_2 constitute a representation of G, the direct product (Kronecker product) ℛ_1 ⊗ ℛ_2 of the representations ℛ_1 and ℛ_2 (see also Sec. 12.7-2). Its representation space is the direct product of the representation spaces associated with ℛ_1 and ℛ_2 (Sec. 12.7-3). The character χ(a) of ℛ = ℛ_1 ⊗ ℛ_2 is the product of the respective characters χ_1(a) of ℛ_1 and χ_2(a) of ℛ_2:

    χ(a) = χ_1(a)χ_2(a)    (14.9-12)
If both ℛ_1 and ℛ_2 are bounded or unitary, the same is true for ℛ_1 ⊗ ℛ_2.
(b) The direct product ℛ^(1) ⊗ ℛ^(2) of two irreducible representations is irreducible if n_1 or n_2 equals 1; otherwise, ℛ^(1) ⊗ ℛ^(2) is completely reducible. One may use this last fact to derive new irreducible representations of G from given irreducible representations.
(c) The irreducible representations of the direct product G_1 ⊗ G_2 of two groups G_1 and G_2 (Sec. 12.7-2) are the direct products ℛ_1^(j) ⊗ ℛ_2^(j′) of the irreducible representations ℛ_1^(j) of G_1 and ℛ_2^(j′) of G_2.
14.9-7. Representations of Rings, Fields, and Linear Algebras (see also Secs. 12.3-1 and 12.4-2). Rings, fields, and linear algebras may also be represented by suitable classes of matrices or linear transformations. In particular, a linear algebra of order n² over a field F is isomorphic to an algebra of n × n matrices over F, provided that the given algebra has a multiplicative identity (regular representation of a linear algebra, see also Secs. 14.9-1a and 14.10-6).

14.10. MATHEMATICAL DESCRIPTION OF ROTATIONS
14.10-1. Rotations in Three-dimensional Euclidean Vector Space. (a) Every orthogonal linear transformation

    x′ = Ax    (ÃA = AÃ = I)    (14.10-1a)
of a three-dimensional Euclidean vector space (Sec. 14.2-7a) onto itself preserves absolute values of vectors and angles between vectors (Secs. 14.4-5 and 14.4-6). Such a transformation is a (proper) rotation if and only if det (A) = 1, i.e., if and only if the transformation also preserves the relative orientation of any three base vectors (and hence right- and left-handedness of axes, vector products, and triple scalar products). A transformation (1a) with det (A) = −1 is an improper rotation, or a rotation with reflection.
(b) Given any orthonormal basis (Sec. 14.7-4) u_1, u_2, u_3, let

    x = ξ_1u_1 + ξ_2u_2 + ξ_3u_3        x′ = ξ′_1u_1 + ξ′_2u_2 + ξ′_3u_3

Every transformation (1a) is represented by

    ξ′_1 = a_11ξ_1 + a_12ξ_2 + a_13ξ_3
    ξ′_2 = a_21ξ_1 + a_22ξ_2 + a_23ξ_3    (14.10-1b)
    ξ′_3 = a_31ξ_1 + a_32ξ_2 + a_33ξ_3

or in matrix form

    x′ = Ax    (14.10-1c)

where

    det [a_ik] = det (A) = 1    (14.10-2)

for proper rotations.
Since an orthonormal reference system is used, the real matrix A = [a_ik] describing each rotation is orthogonal (ÃA = AÃ = I, see also Sec. 14.7-5), i.e.,

    Σ_{j=1}^{3} a_ij a_kj = Σ_{j=1}^{3} a_ji a_jk = δ_ik    (i, k = 1, 2, 3)    (14.10-3)

and each coefficient a_ik equals the cofactor of a_ki in the determinant det [a_ik]. Three suitably given coefficients a_ik determine all 9. Geometrically, a_ik is the cosine of the angle between the base vector u_i and the rotated base vector u′_k = Au_k = Σ_{j=1}^{3} a_jk u_j (see also Sec. 14.5-1):

    a_ik = u_i · u′_k = u_i · (Au_k)    (i, k = 1, 2, 3)    (14.10-4)
14.10-2. Angle of Rotation. Rotation Axis. (a) A rotation (1) rotates the position vector x of each point in a three-dimensional Euclidean space through an angle of rotation δ about a directed rotation axis whose points are invariant. The rotation angle δ and the direction cosines c_1, c_2, c_3 of the positive rotation axis are given by

    cos δ = ½[Tr (A) − 1] = ½(a_11 + a_22 + a_33 − 1)
    c_1 = (a_32 − a_23)/(2 sin δ)    c_2 = (a_13 − a_31)/(2 sin δ)    c_3 = (a_21 − a_12)/(2 sin δ)    (14.10-5)
so that δ > 0 corresponds to a rotation in the sense of a right-handed screw propelled in the direction of the positive rotation axis. Either the sign of δ or the positive direction on the rotation axis may be arbitrarily assigned. The direction of the positive rotation axis is that of the eigenvector c_1u_1 + c_2u_2 + c_3u_3 corresponding to the eigenvalue +1 of A and is obtained by a principal-axes transformation of the matrix A (Sec. 14.8-6). The remaining eigenvalues of A are cos δ ± i sin δ = e^{±iδ}.
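The angle-axis relations (14.10-5) can be checked numerically; in the sketch below the test axis and angle are arbitrary choices of ours. The matrix is built from the Rodrigues form of Eq. (14.10-6), and δ and c_1, c_2, c_3 are then recovered from its elements.

```python
import math

def rot(axis, delta):
    """Rotation matrix from Eq. (14.10-6): cos d I + (1-cos d) cc~ + sin d [c x]."""
    norm = math.sqrt(sum(a * a for a in axis))
    c1, c2, c3 = (a / norm for a in axis)            # direction cosines
    cd, sd = math.cos(delta), math.sin(delta)
    cc = [c1, c2, c3]
    K = [[0, -c3, c2], [c3, 0, -c1], [-c2, c1, 0]]   # skew matrix of the axis
    return [[cd * (i == j) + (1 - cd) * cc[i] * cc[j] + sd * K[i][j]
             for j in range(3)] for i in range(3)]

A = rot((1.0, 2.0, 2.0), 0.7)
trace = A[0][0] + A[1][1] + A[2][2]
delta = math.acos(0.5 * (trace - 1.0))               # cos d = (Tr A - 1)/2
s = 2.0 * math.sin(delta)
c1 = (A[2][1] - A[1][2]) / s                         # Eq. (14.10-5)
c2 = (A[0][2] - A[2][0]) / s
c3 = (A[1][0] - A[0][1]) / s
assert abs(delta - 0.7) < 1e-12
assert all(abs(x - y / 3.0) < 1e-12 for x, y in zip((c1, c2, c3), (1.0, 2.0, 2.0)))
```

Note that the recovery fails for sin δ = 0 (δ = 0 or π), where the off-diagonal differences vanish; those cases need the symmetric part of A instead.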
(b) The transformation matrix A corresponding to a given rotation described by δ, c_1, c_2, c_3 is

    A ≡ [a_ik] = cos δ [1 0 0; 0 1 0; 0 0 1]
        + (1 − cos δ) [c_1² c_1c_2 c_1c_3; c_2c_1 c_2² c_2c_3; c_3c_1 c_3c_2 c_3²]
        + sin δ [0 −c_3 c_2; c_3 0 −c_1; −c_2 c_1 0]    (14.10-6)

14.10-3. Euler Parameters and Gibbs Vector. (a) The four Euler symmetrical parameters

    λ = c_1 sin (δ/2)    μ = c_2 sin (δ/2)    ν = c_3 sin (δ/2)    ρ = cos (δ/2)
        (λ² + μ² + ν² + ρ² = 1)    (14.10-7)
define the rotation uniquely, since Eq. (6) yields

    A = [λ² − μ² − ν² + ρ²    2(λμ − νρ)    2(νλ + μρ);
         2(λμ + νρ)    μ² − ν² − λ² + ρ²    2(μν − λρ);
         2(νλ − μρ)    2(μν + λρ)    ν² − λ² − μ² + ρ²]    (14.10-8)

λ, μ, ν, ρ and −λ, −μ, −ν, −ρ represent the same rotation.
(b) The Gibbs vector G = G_1u_1 + G_2u_2 + G_3u_3 with

    G_1 = c_1 tan (δ/2) = λ/ρ    G_2 = c_2 tan (δ/2) = μ/ρ    G_3 = c_3 tan (δ/2) = ν/ρ    (14.10-9)

also defines the rotation uniquely. The rotated vector x′ can be written as

    x′ = Ax = cos² (δ/2) [(1 − |G|²)x + 2(G · x)G + 2G × x]    (14.10-10)
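A quick numerical cross-check of Eqs. (7) to (10): the Euler-parameter matrix (8) and the Gibbs-vector formula (10) must rotate a vector identically. The axis, angle, and test vector below are arbitrary choices.

```python
import math

delta = 1.1
c = (2 / 7, 3 / 7, 6 / 7)                    # unit axis (2, 3, 6)/7
lam, mu, nu = (ci * math.sin(delta / 2) for ci in c)
rho = math.cos(delta / 2)
assert abs(lam**2 + mu**2 + nu**2 + rho**2 - 1) < 1e-12   # Eq. (14.10-7)

# Euler-parameter matrix, Eq. (14.10-8)
A = [[lam**2 - mu**2 - nu**2 + rho**2, 2*(lam*mu - nu*rho), 2*(nu*lam + mu*rho)],
     [2*(lam*mu + nu*rho), mu**2 - nu**2 - lam**2 + rho**2, 2*(mu*nu - lam*rho)],
     [2*(nu*lam - mu*rho), 2*(mu*nu + lam*rho), nu**2 - lam**2 - mu**2 + rho**2]]

x = (0.3, -1.0, 2.0)
Ax = tuple(sum(A[i][j] * x[j] for j in range(3)) for i in range(3))

# Gibbs vector G = c tan(delta/2), Eq. (14.10-9), and rotation formula (14.10-10)
G = tuple(ci * math.tan(delta / 2) for ci in c)
G2 = sum(gi * gi for gi in G)
Gx = sum(gi * xi for gi, xi in zip(G, x))
cross = (G[1]*x[2] - G[2]*x[1], G[2]*x[0] - G[0]*x[2], G[0]*x[1] - G[1]*x[0])
x2 = tuple(math.cos(delta / 2)**2 * ((1 - G2) * xi + 2 * Gx * gi + 2 * cr)
           for xi, gi, cr in zip(x, G, cross))
assert all(abs(a - b) < 1e-12 for a, b in zip(Ax, x2))
```

The factor cos²(δ/2) in Eq. (10) is just 1/(1 + |G|²), since |G| = tan(δ/2); the Gibbs form therefore breaks down only at δ = π, where tan(δ/2) diverges.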
14.10-4. Representation of Vectors and Rotations by Spin Matrices and Quaternions. Cayley-Klein Parameters. (a) Given an orthonormal basis u_1, u_2, u_3, every real vector x = ξ_1u_1 + ξ_2u_2 + ξ_3u_3 may be represented by a (in general complex) hermitian 2 × 2 matrix

    H ≡ [ξ_3    ξ_1 − iξ_2;  ξ_1 + iξ_2    −ξ_3] ≡ ξ_1S_1 + ξ_2S_2 + ξ_3S_3    (14.10-11)
where the hermitian Pauli spin matrices

    S_1 = [0 1; 1 0]    S_2 = [0 −i; i 0]    S_3 = [1 0; 0 −1]    (14.10-12)

correspond, respectively, to u_1, u_2, u_3. The correspondence (11) is an isomorphism preserving the results of vector addition and multiplication of vectors by (real) scalars.
For every rotation (1), the 2 × 2 matrix representing the rotated vector x′ = Ax = ξ′_1u_1 + ξ′_2u_2 + ξ′_3u_3 is

    H′ = ξ′_1S_1 + ξ′_2S_2 + ξ′_3S_3 = UHU⁻¹    (14.10-13)

where U is the (in general complex) unitary 2 × 2 matrix with determinant 1 (unimodular 2 × 2 matrix) defined by

    U = [a b; −b* a*]    with    a = ρ + iν    b = μ + iλ
        (|a|² + |b|² = λ² + μ² + ν² + ρ² = 1)    (14.10-14)

The complex numbers a, b determine the corresponding rotation uniquely; but a, b and −a, −b, and hence U and −U, represent the same rotation. Either a, b, −b*, a* or a*, b*, −b, a are referred to as the Cayley-Klein parameters of the rotation. Geometrically, the complex parameters a, b define the complex-plane transformation

    u′ = (au + b)/(−b*u + a*)    (|a|² + |b|² = 1)    (14.10-15)

(bilinear transformation, Sec. 7.9-2) relating the stereographic projection u of the point (ξ_1, ξ_2, ξ_3) on a sphere about the origin onto the complex u plane (Sec. 7.2-4) and the stereographic projection u′ of the rotated point (ξ′_1, ξ′_2, ξ′_3) (see also Ref. 14.13).
(b) The linear combinations of I, iS_1, iS_2, and iS_3 with real coefficients constitute a matrix representation of the quaternion algebra (Sec. 12.4-2), whose scalars correspond to real multiples of I, and whose generators correspond to iS_1, iS_2, iS_3, with

    S_1² = S_2² = S_3² = I
    S_1S_2 = −S_2S_1 = iS_3    S_2S_3 = −S_3S_2 = iS_1    S_3S_1 = −S_1S_3 = iS_2    (14.10-16)

Every complex 2 × 2 matrix can be expressed as such a linear combination; in particular,

    U = ρI + i(λS_1 + μS_2 + νS_3)
    U⁻¹ = ρI − i(λS_1 + μS_2 + νS_3)    (quaternion representation of rotations)    (14.10-17)

Again, both U and −U represent the same rotation.
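The spin-matrix correspondence can be exercised numerically. Sign conventions for U differ between texts (the footnote in Sec. 14.10-7 makes the same point); the sketch below uses U = cos(δ/2) I − i sin(δ/2)(c · S), a common choice for which U H U† reproduces the Rodrigues rotation of x. The axis, angle, and vector are arbitrary test values.

```python
import math

def mm(P, Q):  # 2x2 complex matrix product
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
assert mm(S1, S2) == [[1j, 0], [0, -1j]]      # S1 S2 = i S3, Eq. (14.10-16)

delta, c = 0.9, (2 / 3, -1 / 3, 2 / 3)        # unit rotation axis
rho = math.cos(delta / 2)
l, m, n = (ci * math.sin(delta / 2) for ci in c)
U = [[rho - 1j * n, -m - 1j * l], [m - 1j * l, rho + 1j * n]]   # unitary, det 1
Ud = [[U[0][0].conjugate(), U[1][0].conjugate()],
      [U[0][1].conjugate(), U[1][1].conjugate()]]               # U dagger = U^-1

x = (1.0, 2.0, -0.5)
H = [[x[2], x[0] - 1j * x[1]], [x[0] + 1j * x[1], -x[2]]]       # Eq. (14.10-11)
Hp = mm(mm(U, H), Ud)
xp = (Hp[1][0].real, Hp[1][0].imag, Hp[0][0].real)   # read x' back off H'

# compare with the Rodrigues matrix of Eq. (14.10-6)
cd, sd = math.cos(delta), math.sin(delta)
K = [[0, -c[2], c[1]], [c[2], 0, -c[0]], [-c[1], c[0], 0]]
A = [[cd * (i == j) + (1 - cd) * c[i] * c[j] + sd * K[i][j]
      for j in range(3)] for i in range(3)]
Ax = tuple(sum(A[i][j] * x[j] for j in range(3)) for i in range(3))
assert all(abs(a - b) < 1e-9 for a, b in zip(Ax, xp))
```

With the opposite sign in the exponent, U H U† yields the inverse rotation Ãx; this is exactly the alias/alibi ambiguity warned about in Sec. 14.10-6c.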
14.10-5. Rotations about the Coordinate Axes. The following transformation matrices represent right-handed rotations about the positive coordinate axes:

    A_1(ψ) ≡ [1 0 0; 0 cos ψ −sin ψ; 0 sin ψ cos ψ]    (rotation through an angle ψ about u_1)    (14.10-18a)
    A_2(ψ) ≡ [cos ψ 0 sin ψ; 0 1 0; −sin ψ 0 cos ψ]    (rotation through an angle ψ about u_2)    (14.10-18b)
    A_3(ψ) ≡ [cos ψ −sin ψ 0; sin ψ cos ψ 0; 0 0 1]    (rotation through an angle ψ about u_3)    (14.10-18c)

Note

    Ã_i(ψ) ≡ A_i⁻¹(ψ) ≡ A_i(−ψ)    (i = 1, 2, 3)    (14.10-19)

14.10-6. Euler Angles.
(a) Every matrix A = [a_ik] representing a proper rotation in three-dimensional Euclidean space can be variously expressed as a product of three matrices (18), and in particular as

    A ≡ A_3(α)A_2(β)A_3(γ)
      = [cos α −sin α 0; sin α cos α 0; 0 0 1][cos β 0 sin β; 0 1 0; −sin β 0 cos β][cos γ −sin γ 0; sin γ cos γ 0; 0 0 1]
      = [cos α cos β cos γ − sin α sin γ    −(cos α cos β sin γ + sin α cos γ)    cos α sin β;
         sin α cos β cos γ + cos α sin γ    −sin α cos β sin γ + cos α cos γ    sin α sin β;
         −sin β cos γ    sin β sin γ    cos β]
      = A_32(α, β, γ)    (14.10-20)
The three Euler angles α, β, γ define the rotation uniquely; except for multiples of 2π, they are uniquely determined by a given rotation, unless β = 0 ("gimbal lock," Sec. 14.10-6d).
A set of cartesian x′, y′, z′ axes (moving "body axes" of a rigid body) initially aligned with u_1, u_2, u_3 can be rotated into their final positions by three successive rotations (18) [Fig. 14.10-1; note the discussion of Sec. 14.6-3, which explains the apparently inverted order of the three matrices in Eq. (20)]:
1. Rotate about the z′ axis through the Euler angle α
2. Rotate about the y′ axis through the Euler angle β
3. Rotate about the z′ axis through the Euler angle γ
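Equation (20) can be verified by direct matrix multiplication; the sketch below also recovers the Euler angles from the matrix elements a_13, a_23, a_33, a_31, a_32 away from gimbal lock. The test angles are arbitrary.

```python
import math

def A3(t):
    return [[math.cos(t), -math.sin(t), 0], [math.sin(t), math.cos(t), 0], [0, 0, 1]]

def A2(t):
    return [[math.cos(t), 0, math.sin(t)], [0, 1, 0], [-math.sin(t), 0, math.cos(t)]]

def mm(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

a, b, g = 0.4, 1.2, -0.8
A = mm(A3(a), mm(A2(b), A3(g)))                 # A = A3(alpha) A2(beta) A3(gamma)
ca, sa, cb, sb, cg, sg = (math.cos(a), math.sin(a), math.cos(b),
                          math.sin(b), math.cos(g), math.sin(g))
A_explicit = [[ca*cb*cg - sa*sg, -(ca*cb*sg + sa*cg), ca*sb],     # Eq. (14.10-20)
              [sa*cb*cg + ca*sg, -sa*cb*sg + ca*cg,   sa*sb],
              [-sb*cg,           sb*sg,               cb]]
for i in range(3):
    for j in range(3):
        assert abs(A[i][j] - A_explicit[i][j]) < 1e-12

# Euler angles recovered from the matrix (valid while sin(beta) != 0):
assert abs(math.atan2(A[1][2], A[0][2]) - a) < 1e-12    # alpha from a23, a13
assert abs(math.acos(A[2][2]) - b) < 1e-12              # beta  from a33
assert abs(math.atan2(A[2][1], -A[2][0]) - g) < 1e-12   # gamma from a32, a31
```

At β = 0 the third column and third row degenerate and only the sum α + γ is determined, which is the gimbal-lock situation discussed below.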
Fig. 14.10-1. The Euler angles α, β, γ. The axis OL of the second rotation (through β) is often called the line of nodes. Note that α and β are the spherical polar coordinates of the rotated base vector u′_3 in the u_1, u_2, u_3 system.
Fig. 14.10-2. Body axes of an aircraft: u′_1 (roll axis), u′_2 (pitch axis), u′_3 (yaw axis).
The inverse rotation A⁻¹ (which turns x′ back into x) is represented by the matrix

    A⁻¹ = Ã ≡ A_3(−γ)A_2(−β)A_3(−α) ≡ A_32(−γ, −β, −α) ≡ A_32(π − γ, β, π − α)    (14.10-21)

There are six ways to express a rotation matrix (1) as a product

    A = A_i(ϑ_1)A_k(ϑ_2)A_i(ϑ_3) ≡ A_ik(ϑ_1, ϑ_2, ϑ_3)    (i, k = 1, 2, 3; i ≠ k)    (14.10-22)
of rotations about two different coordinate axes. Of the different resulting Euler-angle systems, another frequently used one is defined by

    A ≡ A_3(α′)A_1(β′)A_3(γ′) = A_31(α′, β′, γ′)    (14.10-23)

which is related to the system of Eq. (20) by

    α′ = α + π/2    β′ = β    γ′ = γ − π/2    (14.10-24)
(b) There are, furthermore, six ways to represent a rotation matrix A as a product

    A = A_i(ϑ_1)A_j(ϑ_2)A_k(ϑ_3) ≡ A_ijk(ϑ_1, ϑ_2, ϑ_3)    (i, j, k = 1, 2, 3; i ≠ j, i ≠ k, j ≠ k)    (14.10-25)

of rotations about three different coordinate axes. In particular,

    A = A_1(φ)A_2(ϑ)A_3(ψ)
      = [cos ϑ cos ψ    −cos ϑ sin ψ    sin ϑ;
         cos φ sin ψ + sin φ sin ϑ cos ψ    cos φ cos ψ − sin φ sin ϑ sin ψ    −sin φ cos ϑ;
         sin φ sin ψ − cos φ sin ϑ cos ψ    sin φ cos ψ + cos φ sin ϑ sin ψ    cos φ cos ϑ]    (14.10-26)

is frequently used to describe the attitude of an aircraft or space vehicle after successive roll (φ), pitch (ϑ), and yaw (ψ) rotations about body axes u′_1, u′_2, u′_3 respectively directed forward, to starboard, and toward the bottom of the craft (Fig. 14.10-2).
(c) The profusion of the 12 Euler-angle systems defined above is augmented by the fact that some authors replace one or more of the Euler angles by its negative, and that some of the literature involves left-handed coordinate systems.
In addition, the reader is warned to check whether a given Euler-angle transformation is originally defined as an operation ("alibi" interpretation, Sec. 14.5-1) or as a coordinate transformation ("alias" interpretation, Sec. 14.6-1), since it is possible to confuse A and A⁻¹ = Ã. Specifically, the component matrix x = {ξ_1, ξ_2, ξ_3} of a vector x in the fixed u_i system and the component matrix x̄ = {ξ̄_1, ξ̄_2, ξ̄_3} of the same vector in the rotated u′_i system are related by

    x′ = Ax    x̄ = A⁻¹x = Ãx    (14.10-27)

and the base-vector matrices (Sec. 14.6-2) transform according to

    [u′_1 u′_2 u′_3] = [u_1 u_2 u_3]A    or    [u′_1; u′_2; u′_3] = Ã[u_1; u_2; u_3]    (14.10-28)

Note the discussion of Sec. 14.6-3.
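A small sketch of the alias/alibi distinction of Eq. (14.10-27), using a rotation about u_3; the angle and test vector are arbitrary values of ours.

```python
import math

# The same matrix A either moves a vector (x' = A x, "alibi") or re-expresses
# a fixed vector in the rotated axes (xbar = A~ x, "alias"); A~ is the
# transpose (= inverse) of the orthogonal matrix A.
t = 0.6
A = [[math.cos(t), -math.sin(t), 0], [math.sin(t), math.cos(t), 0], [0, 0, 1]]
x = (2.0, -1.0, 3.0)
xp   = tuple(sum(A[i][j] * x[j] for j in range(3)) for i in range(3))   # alibi
xbar = tuple(sum(A[j][i] * x[j] for j in range(3)) for i in range(3))   # alias

# Rotating the vector and then reading it in the rotated axes returns the
# original components: the components of x' in the primed basis are just x.
back = tuple(sum(A[j][i] * xp[j] for j in range(3)) for i in range(3))
assert all(abs(a - b) < 1e-12 for a, b in zip(back, x))
assert abs(xbar[0] - (x[0] * math.cos(t) + x[1] * math.sin(t))) < 1e-12
```

Confusing xp and xbar amounts to replacing the rotation angle by its negative, which is the most common practical symptom of an alias/alibi mix-up.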
(d) The parameters c_1, c_2, c_3, δ and λ, μ, ν, ρ are readily expressed in terms of Euler angles with the aid of Eq. (5) and the Euler-angle matrix (20). Thus

    λ = c_1 sin (δ/2) = sin [(γ − α)/2] sin (β/2)
    μ = c_2 sin (δ/2) = cos [(α − γ)/2] sin (β/2)
    ν = c_3 sin (δ/2) = sin [(α + γ)/2] cos (β/2)
    ρ = cos (δ/2) = cos [(α + γ)/2] cos (β/2)
    a = ρ + iν = e^{i(α+γ)/2} cos (β/2)    b = μ + iλ = e^{i(γ−α)/2} sin (β/2)    (14.10-29)*

Note that addition of 2π to one of the Euler angles changes the sign of all the parameters (29) and leaves the rotation matrix A unchanged.
If ϑ_2 = 0 in Eq. (22) or ϑ_2 = π/2 in Eq. (25) (e.g., β = 0, β′ = 0, or ϑ = π/2), then the remaining two Euler angles are no longer uniquely defined by the given rotation ("gimbal lock" in attitude-reference systems). Euler angles are, therefore, employed only for the representation of rotations which are suitably restricted in range.
14.10-7. Infinitesimal Rotations, Continuous Rotation, and Angular Velocity (see also Secs. 5.3-2 and 14.4-9). (a) An infinitesimal three-dimensional rotation through a small angle dδ about a rotation axis with the direction cosines c_1, c_2, c_3 is described by the orthogonal infinitesimal transformation

    x′ = (I + Δ)x    (14.10-30)

so that Δ is skew-symmetric (Sec. 14.4-9). In the orthonormal u_1, u_2, u_3 reference frame, Δ is represented by the skew-symmetric matrix†

    Δ = [0 −c_3 c_2; c_3 0 −c_1; −c_2 c_1 0] dδ    (14.10-31)

obtained from Eq. (6) for δ = dδ → 0. It follows that

    Δx = (c × x) dδ    (14.10-32)
where c = c_1u_1 + c_2u_2 + c_3u_3 is the unit vector in the direction of the
* a and b are defined in Sec. 14.10-4. Note that several different definitions of the Cayley-Klein parameters are in common use.
† In general, if W is any skew-symmetric linear operator defined on a three-dimensional Euclidean vector space V_E and represented in an orthonormal reference frame u_1, u_2, u_3 by the skew-symmetric matrix

    W = [0 −w_3 w_2; w_3 0 −w_1; −w_2 w_1 0]

then, for every vector x of V_E,

    x′ = Wx = w × x

where w = w_1u_1 + w_2u_2 + w_3u_3 (see also Sec. 16.9-2).
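The footnote's identity can be confirmed with a few lines of arithmetic; the vectors below are arbitrary test values.

```python
# For the skew-symmetric matrix W built from w = (w1, w2, w3),
# W x equals the cross product w x x for every vector x.
w = (0.5, -2.0, 1.5)
W = [[0, -w[2], w[1]], [w[2], 0, -w[0]], [-w[1], w[0], 0]]
x = (3.0, 1.0, -4.0)
Wx = tuple(sum(W[i][j] * x[j] for j in range(3)) for i in range(3))
cross = (w[1]*x[2] - w[2]*x[1], w[2]*x[0] - w[0]*x[2], w[0]*x[1] - w[1]*x[0])
assert Wx == cross
```

This one-to-one correspondence between skew-symmetric operators and vectors is what allows the angular-velocity vector ω of the next paragraph to stand in for the operator Ω.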
positive rotation axis. This relation is independent of any reference frame (see also Sec. 14.10-3b).
(b) For a continuous three-dimensional rotation described by

    x′(t) = A(t)x

where x is constant, Eq. (32) yields

    dx′/dt = (dA/dt)x = ω(t) × x′(t) ≡ ω(t) × [A(t)x]    (14.10-33a)

The vector

    ω(t) ≡ ω_1(t)u_1 + ω_2(t)u_2 + ω_3(t)u_3 ≡ ω̄_1(t)u′_1(t) + ω̄_2(t)u′_2(t) + ω̄_3(t)u′_3(t)    (14.10-33b)

is directed along the instantaneous axis of rotation (axis of the rotation x′ → x′ + dx′), and

    dA/dt = Ω(t)A(t)    (14.10-34)

where Ω is the skew-symmetric operator represented in the u_1, u_2, u_3 and u′_1(t), u′_2(t), u′_3(t) reference systems by the respective matrices

    Ω(t) ≡ [0 −ω_3(t) ω_2(t); ω_3(t) 0 −ω_1(t); −ω_2(t) ω_1(t) 0]
    Ω̄(t) ≡ [0 −ω̄_3(t) ω̄_2(t); ω̄_3(t) 0 −ω̄_1(t); −ω̄_2(t) ω̄_1(t) 0]    (14.10-35)

With the aid of the relations (14.6-8) and A⁻¹ = Ã, it follows that

    dA/dt = ΩA = AΩ̄    Ω = (dA/dt)Ã    Ω̄ = Ã(dA/dt)    (14.10-36)

Substitution of Eq. (8), (20), or (26) into Eq. (36) yields relations between angular-velocity components and matrix elements (direction cosines), Euler parameters, or Euler angles. In particular,

    dA/dt = (d/dt)[a_ik]
          = [ω̄_3a_12 − ω̄_2a_13    ω̄_1a_13 − ω̄_3a_11    ω̄_2a_11 − ω̄_1a_12;
             ω̄_3a_22 − ω̄_2a_23    ω̄_1a_23 − ω̄_3a_21    ω̄_2a_21 − ω̄_1a_22;
             ω̄_3a_32 − ω̄_2a_33    ω̄_1a_33 − ω̄_3a_31    ω̄_2a_31 − ω̄_1a_32]    (14.10-37)
    ω_1(t) ≡ 2(λ̇ρ − λρ̇ − μ̇ν + μν̇) = −sin α (dβ/dt) + cos α sin β (dγ/dt)
    ω_2(t) ≡ 2(μ̇ρ − μρ̇ − ν̇λ + νλ̇) = cos α (dβ/dt) + sin α sin β (dγ/dt)
    ω_3(t) ≡ 2(ν̇ρ − νρ̇ − λ̇μ + λμ̇) = dα/dt + cos β (dγ/dt)
        [components of ω(t) in the fixed reference frame]    (14.10-38)

    ω̄_1(t) ≡ 2(λ̇ρ − λρ̇ + μ̇ν − μν̇) = −sin β cos γ (dα/dt) + sin γ (dβ/dt)
    ω̄_2(t) ≡ 2(μ̇ρ − μρ̇ + ν̇λ − νλ̇) = sin β sin γ (dα/dt) + cos γ (dβ/dt)
    ω̄_3(t) ≡ 2(ν̇ρ − νρ̇ + λ̇μ − λμ̇) = cos β (dα/dt) + dγ/dt
        [components of ω(t) in the rotating reference frame]    (14.10-39)

In terms of the roll, pitch, and yaw angles of Eq. (26),

    ω̄_1 = cos ϑ cos ψ (dφ/dt) + sin ψ (dϑ/dt)
    ω̄_2 = −cos ϑ sin ψ (dφ/dt) + cos ψ (dϑ/dt)
    ω̄_3 = sin ϑ (dφ/dt) + dψ/dt

and conversely

    dφ/dt = (ω̄_1 cos ψ − ω̄_2 sin ψ)/cos ϑ
    dϑ/dt = ω̄_1 sin ψ + ω̄_2 cos ψ
    dψ/dt = (ω̄_2 sin ψ − ω̄_1 cos ψ) tan ϑ + ω̄_3    (14.10-40)
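Equation (36) and the fixed-frame formulas of Eq. (38) can be checked numerically: differentiate A(t) = A_3(α(t))A_2(β(t))A_3(γ(t)) by central differences, form Ω = (dA/dt)Ã, and read off ω from the skew matrix. The constant angle rates below are arbitrary test values.

```python
import math

def A_euler(al, be, ga):
    def A3(t): return [[math.cos(t), -math.sin(t), 0],
                       [math.sin(t), math.cos(t), 0], [0, 0, 1]]
    def A2(t): return [[math.cos(t), 0, math.sin(t)], [0, 1, 0],
                       [-math.sin(t), 0, math.cos(t)]]
    def mm(P, Q): return [[sum(P[i][k] * Q[k][j] for k in range(3))
                           for j in range(3)] for i in range(3)]
    return mm(A3(al), mm(A2(be), A3(ga)))

ad, bd, gd = 0.3, -0.7, 1.1                  # constant Euler-angle rates
def A_t(t): return A_euler(ad * t, 1.0 + bd * t, gd * t)

t, h = 0.8, 1e-6
A, Ap, Am = A_t(t), A_t(t + h), A_t(t - h)
dA = [[(Ap[i][j] - Am[i][j]) / (2 * h) for j in range(3)] for i in range(3)]
# Omega = (dA/dt) A~ , Eq. (14.10-36); A~ is the transpose of A
Om = [[sum(dA[i][k] * A[j][k] for k in range(3)) for j in range(3)] for i in range(3)]
w = (Om[2][1], Om[0][2], Om[1][0])           # omega1, omega2, omega3, Eq. (14.10-35)

al, be = ad * t, 1.0 + bd * t
w_formula = (-bd * math.sin(al) + gd * math.cos(al) * math.sin(be),   # (14.10-38)
             bd * math.cos(al) + gd * math.sin(al) * math.sin(be),
             ad + gd * math.cos(be))
assert all(abs(x - y) < 1e-6 for x, y in zip(w, w_formula))
```

The same scaffolding with Ω̄ = Ã(dA/dt) verifies the rotating-frame formulas (14.10-39).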
14.10-8. The Three-dimensional Rotation Group and Its Representations (see also Secs. 12.2-1 to 12.2-12 and 14.9-1 to 14.9-6). (a) The orthogonal transformations (1) of a three-dimensional Euclidean vector space onto itself are necessarily bounded and nonsingular and constitute a group, the three-dimensional rotation-reflection group R_3′. The proper rotations [det (A) = 1] constitute a normal subgroup of R_3′, the three-dimensional rotation group R_3⁺. Neither R_3′ nor R_3⁺ is commutative.
Rotations involving the same absolute rotation angle |δ| belong to the same class of conjugate elements. The rotations about any fixed axis form a commutative subgroup of R_3⁺ (two-dimensional rotations). R_3′ is a mixed-continuous group, and R_3⁺ is a continuous group (normal subgroup of R_3′ with index 2). R_3′ is, in turn, a subgroup of the group of all nonsingular linear transformations of the Euclidean vector space onto itself (full linear group, FLG). Note that every transformation of FLG is the product of a proper or improper rotation and a nonnegative symmetric transformation (affine transformation, stretching or contraction; see also Secs. 14.4-8 and 13.3-4).
(b) The Irreducible Representations of R_3⁺. The 2 × 2 matrices (14) constitute an irreducible unitary two-to-one representation of R_3⁺ in the field of complex numbers; i.e., the three-dimensional rotation group R_3⁺ is represented by the group of unitary transformations with determinant 1 of a two-dimensional unitary vector space onto itself (two-dimensional unimodular unitary group, special unitary group, SUG).
More generally, the three-dimensional rotation group R_3⁺ has bounded irreducible representations of dimensions n = 2, 3, 4, . . . . The complete set of unitary irreducible representations is conveniently written as

    U_mq^(j)(α, β, γ) = Σ_{h=0}^{∞} (−1)^h [√((j + m)!(j − m)!(j + q)!(j − q)!) / ((j − m − h)!(j + q − h)!(h − q + m)!h!)] (a*)^{j−m−h} a^{j+q−h} (b*)^{h+m−q} b^h

        = Σ_{h=0}^{∞} (−1)^h [√((j + m)!(j − m)!(j + q)!(j − q)!) / ((j − m − h)!(j + q − h)!(h − q + m)!h!)] e^{i(mα+qγ)} [cos (β/2)]^{2j+q−m−2h} [sin (β/2)]^{2h+m−q}

    (j = ½, 1, 3/2, . . . ; m, q = −j, −j + 1, . . . , j − 1, j)    (14.10-41)*

Each sum has only a finite number of terms, since 1/N! = 0 for N < 0. The representation ℛ^(j) is true (one-to-one) for j = 1, 2, . . . and two-to-one for j = ½, 3/2, . . . (see also Sec. 14.10-4). The character (Sec. 14.9-4) of ℛ^(j) is

    χ^(j)(α, β, γ) ≡ Tr [U^(j)(α, β, γ)] = sin [(2j + 1)δ/2] / sin (δ/2)    (j = 0, ½, 1, 3/2, . . .)    (14.10-42)

where δ is the angle of rotation defined in Sec. 14.10-2. The peculiar indices j, m, and q employed to label the
* Some authors replace the factor (−1)^h by (−1)^{h−q+m}, corresponding to multiplication of each matrix by the same diagonal matrix. As written, Eq. (41) reduces to Eq. (14) for j = ½.
(c) Direct Products of Rotation Groups (see also Secs. 12.7-2 and 14.9-6). Direct products of three-dimensional rotation groups describe, for instance, the composite rotations of dynamical systems comprising two or more rotating bodies (atoms or nuclei in quantum mechanics). R_3⁺ ⊗ R_3⁺ has the irreducible representations ℛ^(j) ⊗ ℛ^(j′) (Sec. 14.9-6c); note

    ℛ^(j) ⊗ ℛ^(j′) = ℛ^(j+j′) ⊕ ℛ^(j+j′−1) ⊕ · · · ⊕ ℛ^(|j−j′|)    (Clebsch-Gordan equation)    (14.10-43)
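The Clebsch-Gordan equation (43) implies the character identity χ^(j)χ^(j′) = Σ_J χ^(J), J = |j − j′|, . . . , j + j′, which the following sketch checks numerically using the characters (42); the sample values of j, j′, and δ are arbitrary.

```python
import math

def chi(j, d):
    """Character of the irreducible representation R^(j), Eq. (14.10-42)."""
    return math.sin((j + 0.5) * d) / math.sin(d / 2)

for j1, j2 in [(0.5, 0.5), (1.0, 0.5), (1.0, 2.0), (1.5, 2.5)]:
    for d in (0.3, 1.0, 2.2):
        total, j = 0.0, abs(j1 - j2)
        while j <= j1 + j2 + 1e-9:          # J = |j1-j2|, ..., j1+j2 in steps of 1
            total += chi(j, d)
            j += 1.0
        assert abs(chi(j1, d) * chi(j2, d) - total) < 1e-9

# dimension count (the d -> 0 limit of the identity):
# sum of (2J+1) over the series equals (2 j1 + 1)(2 j2 + 1)
dims = lambda j1, j2: sum(2 * (abs(j1 - j2) + k) + 1
                          for k in range(int(2 * min(j1, j2)) + 1))
assert dims(1.0, 2.0) == 15                 # 3 * 5 = 3 + 5 + 7
```

This is the character-theoretic content of the familiar quantum-mechanical addition of angular momenta.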
14.11. RELATED TOPICS, REFERENCES, AND BIBLIOGRAPHY

14.11-1. Related Topics. The following topics related to the study of matrix methods are treated in other chapters of this handbook:

    Linear simultaneous equations .................................. Chap. 1
    Elementary vector algebra ...................................... Chap. 5
    Vector relations in terms of curvilinear coordinates ........... Chap. 6
    Groups, vector spaces, and linear operators .................... Chap. 12
    Diagonalization of matrices, eigenvalue problems, quadratic and hermitian forms ... Chap. 13
    Tensor algebra ................................................. Chap. 16
14.11-2. References and Bibliography (see also Sees. 12.9-2, 13.7-2, and 15.7-2).
14.1. Birkhoff, G., and S. MacLane: A Survey of Modern Algebra, 3d ed., Macmillan, New York, 1965.
14.2. Dunford, N., and J. T. Schwartz: Linear Operators, Interscience, New York, 1964.
14.3. Finkbeiner, D. T.: Introduction to Matrices and Linear Algebra, Freeman, San Francisco, 1960.
14.4. Halmos, P. R.: Finite-dimensional Vector Spaces, 2d ed., Princeton, Princeton, N.J., 1958.
14.5.
: Introduction to Hilbert Space and the Theory of Spectral Multiplicity, Chelsea, New York, 1957.
14.6. Nering, E. D.: Linear Algebra and Matrix Theory, Interscience, New York, 1963.
14.7. Shields, P. C.: Linear Algebra, Addison-Wesley, Reading, Mass., 1964.
14.8. Stoll, R. R.: Linear Algebra and Matrix Theory, McGraw-Hill, New York, 1952.
14.9. Thrall, R. M., and L. Tornheim: Vector Spaces and Matrices, Wiley, New York, 1957.
14.10. Von Neumann, J.: Die mathematischen Grundlagen der Quantenmechanik, Springer, Berlin, 1932.
14.11. Zaanen, C.: Linear Analysis, North Holland Publishing Company, Amsterdam, 1956.
(See also the articles by G. Falk and H. Tietz in Vol. II of the Handbuch der Physik, Springer, Berlin, 1955.)
Representation of Rotations
14.12. Goldstein, H.: Classical Mechanics, Addison-Wesley, Reading, Mass., 1953.
14.13. Synge, J. L.: Classical Dynamics, in Handbuch der Physik, Vol. III, Springer, Berlin, 1960.

Representation of Groups
14.14. Boerner, H.: Darstellungen von Gruppen mit Beruecksichtigung der Beduerfnisse der modernen Physik, Springer, Berlin, 1955.
14.15. Hall, M.: The Theory of Groups, Macmillan, New York, 1961.
14.16. Kurosh, A. G.: The Theory of Groups, 2 vols., 2d ed., Chelsea, New York, 1960.
14.17. Margenau, H., and G. M. Murphy: The Mathematics of Physics and Chemistry, 2d ed., Van Nostrand, Princeton, N.J., 1957.
14.18. Murnaghan, F. D.: The Unitary and Rotation Groups, Spartan, Washington, D.C., 1962.
14.19. van der Waerden, B. L.: Die gruppentheoretische Methode in der Quantenmechanik, Springer, Berlin, 1932.
14.20. Weyl, H.: The Theory of Groups and Quantum Mechanics, Dover, New York, 1931.
14.21. ———: The Classical Groups, Princeton, Princeton, N.J., 1939.
14.22. Wigner, E.: Gruppentheorie und ihre Anwendung auf Quantenmechanik der Atomspektren, Brunswick, 1951.
14.23. Zassenhaus, Hans J.: The Theory of Groups, 2d ed., Chelsea, New York, 1958.
CHAPTER 15

LINEAR INTEGRAL EQUATIONS, BOUNDARY-VALUE PROBLEMS, AND EIGENVALUE PROBLEMS

15.1. Introduction. Functional Analysis
    15.1-1. Introductory Remarks
    15.1-2. Notation

15.2. Functions as Vectors. Expansions in Terms of Orthogonal Functions
    15.2-1. Quadratically Integrable Functions as Vectors. Inner Product and Normalization
    15.2-2. Metric and Convergence in L². Convergence in Mean
    15.2-3. Orthogonal Functions and Orthonormal Sets of Functions
    15.2-4. Complete Orthonormal Sets of Functions
    15.2-5. Orthogonalization and Normalization of a Set of Functions
    15.2-6. Approximations and Series Expansions in Terms of Orthonormal Functions
    15.2-7. Linear Operations on Functions

15.3. Linear Integral Transformations and Linear Integral Equations
    15.3-1. Linear Integral Transformations
    15.3-2. Linear Integral Equations: Survey
    15.3-3. Homogeneous Fredholm Integral Equations of the Second Kind: Eigenfunctions and Eigenvalues
    15.3-4. Expansion Theorems
    15.3-5. Iterated Kernels
    15.3-6. Hermitian Integral Forms. The Eigenvalue Problem as a Variation Problem
    15.3-7. Nonhomogeneous Fredholm Equations of the Second Kind
        (a) Existence and Uniqueness of Solutions
        (b) Reduction to an Integral Equation with Hermitian Kernel
        (c) The Resolvent Kernel
    15.3-8. Solution of the Linear Integral Equation (16)
        (a) Solution by Successive Approximations. Neumann Series
        (b) The Schmidt-Hilbert Formula for the Resolvent Kernel
        (c) Fredholm's Formulas for the Resolvent Kernel
        (d) Singular Kernels
    15.3-9. Solution of Fredholm's Linear Integral Equation of the First Kind
    15.3-10. Volterra-type Integral Equations

15.4. Linear Boundary-value Problems and Eigenvalue Problems Involving Differential Equations
    15.4-1. Linear Boundary-value Problems: Problem Statement and Notation
    15.4-2. Complementary Homogeneous Differential Equation and Boundary Conditions for a Linear Boundary-value Problem. Superposition Theorems
    15.4-3. Hermitian-conjugate and Adjoint Boundary-value Problems. Hermitian Operators
    15.4-4. The Fredholm Alternative Theorem
    15.4-5. Eigenvalue Problems Involving Linear Differential Equations
    15.4-6. Eigenvalues and Eigenfunctions of Hermitian Eigenvalue Problems. Complete Orthonormal Sets of Eigenfunctions
    15.4-7. Hermitian Eigenvalue Problems as Variation Problems
    15.4-8. One-dimensional Eigenvalue Problems of the Sturm-Liouville Type
    15.4-9. Sturm-Liouville Problems Involving Second-order Partial Differential Equations
    15.4-10. Comparison Theorems
    15.4-11. Perturbation Methods for the Solution of Discrete Eigenvalue Problems
        (a) Nondegenerate Case
        (b) Degenerate Case
    15.4-12. Solution of Boundary-value Problems by Eigenfunction Expansions

15.5. Green's Functions. Relation of Boundary-value Problems and Eigenvalue Problems to Integral Equations
    15.5-1. Green's Functions for Boundary-value Problems with Homogeneous Boundary Conditions
    15.5-2. Relation of Boundary-value Problems and Eigenvalue Problems to Integral Equations. Green's Resolvent
    15.5-3. Application of the Green's-function Method to an Initial-value Problem: The Generalized Diffusion Equation
    15.5-4. The Green's-function Method for Nonhomogeneous Boundary Conditions

15.6. Potential Theory
    15.6-1. Introduction. Laplace's and Poisson's Differential Equations
    15.6-2. Three-dimensional Potential Theory: The Classical Boundary-value Problems
        (a) The Dirichlet Problem
        (b) The Neumann Problem
    15.6-3. Kelvin's Inversion Theorem
    15.6-4. Properties of Harmonic Functions
        (a) Mean-value and Maximum-modulus Theorems
        (b) Harnack's Convergence Theorems
    15.6-5. Solutions of Laplace's and Poisson's Equations as Potentials
        (a) Potentials of Point Charges, Dipoles, and Multipoles
        (b) Potentials of Charge Distributions. Jump Relations
        (c) Multipole Expansion and Gauss's Theorem
        (d) General Solutions of Laplace's and Poisson's Equations as Potentials
    15.6-6. Solution of Three-dimensional Boundary-value Problems by Green's Functions
        (a) Green's Function for the Entire Space
        (b) Infinite Plane Boundary with Dirichlet Conditions. Image Charges
        (c) Sphere with Dirichlet Conditions. Poisson's Integral Formulas
        (d) Existence of Green's Functions. An Existence Theorem
    15.6-7. Two-dimensional Potential Theory. Logarithmic Potentials
        (a) Green's Function for the Entire Plane
        (b) Half Plane with Dirichlet Conditions
        (c) Circle with Dirichlet Conditions. Poisson's Integral Formulas
        (d) Existence of Green's Functions and Conformal Mapping
    15.6-8. Two-dimensional Potential Theory: Conjugate Harmonic Functions
    15.6-9. Solution of Two-dimensional Boundary-value Problems. Green's Functions and Conformal Mapping
    15.6-10. Extension of the Theory to More General Differential Equations. Retarded and Advanced Potentials
        (a) Three-dimensional Case
        (b) Two-dimensional Case

15.7. Related Topics, References, and Bibliography
    15.7-1. Related Topics
    15.7-2. References and Bibliography

15.1. INTRODUCTION. FUNCTIONAL ANALYSIS
15.1-1. Functional analysis regards suitable classes of functions as "points" in topological spaces (Chap. 12) and, in particular, as multidimensional vectors admitting the definition of inner products (Sec. 15.2-1) and expansions in terms of orthogonal functions (base vectors, Sec. 15.3-4). The resulting elegant and powerful geometrical analogy relates a large class of operations, including linear integral transformations and differentiation, to the theory of linear transformations introduced in Chap. 14. The solution of linear ordinary and partial differential equations and of linear integral equations is thus found to be a more or less simple generalization of the solution of linear simultaneous equations; in particular, the solution may involve an eigenvalue problem. Sections 15.3-1 to 15.3-10 review linear integral equations; Secs. 15.4-1 to 15.4-12 introduce linear boundary-value problems and eigenvalue problems involving differential equations. The remainder of the chapter deals with various methods of solving linear boundary-value problems, viz.,

1. Eigenfunction expansions (Sec. 15.4-12); this method may be extended to include the various integral-transform methods (Sec. 10.5-1)
2. Green's functions (Secs. 15.5-1, 15.5-3, 15.6-6, and 15.6-9)
3. Reduction to integral equations (Sec. 15.3-2)
4. Variation methods (Sec. 15.4-7; see also Secs. 11.7-1 to 11.7-3)
In particular, Secs. 15.6-1 to 15.6-10 deal with boundary-value problems involving Laplace's and Poisson's differential equations (potential theory) and the space form of the wave equation. Although many practical problems will yield only to numerical solution (Sec. 20.9-4), the general and intuitively suggestive viewpoint of functional analysis offers far-reaching insight into the behavior of vibrating systems, atomic phenomena, etc.
15.1-2. Notation (see also Sec. 15.4-1). Throughout Secs. 15.2-1 to 15.5-4, Φ(x), f(x), F(x), . . . symbolize either functions of a single independent variable x or, for brevity, functions of a set of n independent variables x¹, x², . . . , xⁿ (see also Secs. 6.2-1 and 16.1-2). In the one-dimensional case, dx is a simple differential; in the multidimensional case, dx = dx¹ dx² . . . dxⁿ. An integral

    I = ∫_V f(ξ) dξ    (15.1-1)

stands either for a one-dimensional definite integral I = ∫_a^b f(ξ) dξ over a bounded or unbounded interval V = (a, b), or for an n-dimensional integral

    I = ∫∫ · · · ∫_V f(ξ¹, ξ², . . . , ξⁿ) dξ¹ dξ² · · · dξⁿ    (15.1-2)

over a region V in n-dimensional space. As a rule it is possible to introduce volume elements dV(ξ) = √|g(ξ)| dξ so that each integral (2) becomes a volume integral (Secs. 6.2-3, 15.4-1b, and 16.10-10).
15.2. FUNCTIONS AS VECTORS. EXPANSIONS IN TERMS OF ORTHOGONAL FUNCTIONS
15.2-1. Quadratically Integrable Functions as Vectors.
Inner
Product and Normalization.* (a) A real or complex function f(x) defined on the measurable set E of "points" (x) or (a;1, x2, xz) is
quadratically integrable on Eif and only if j |/(Q|* d£ exists in the sense of Lebesgue (Sec. 4.6-15). The class L2 [more accurately L2(V)] of all real or complex functions quadratically integrable on a given inter val or region V constitute, respectively, an infinite-dimensional real or complex unitary vector space (Sec. 14.2-6) if one regards the functions f(x), h(x), . . . as vectors and defines
The vector sum of f(x) and h(x) asf(x) + h(x) The product of f(x) by a scalar a as af(x) The inner product of f(x) and h(x) as * See Sec. 15.1-2 for notation.
15.2-2
LINEAR PROBLEMS
(/,*)^/F7tt)/*(©fc(0
488
(15.2-1)
where 7(3) is a given real nonnegative function (weighting function) quadratically integrable on V. A change of weighting function corre sponds to a change in the independent variable; in many applications, one has y(x) = 1, or y(%) d% is a volume element (Sec. 15.4-16). Linear independence of a set of functions (vectors) in L2 is defined in the manner of Sec. 1.9-3 (see also Sec. 14.2-3). A set of quadratically integrable functions fi(x), f2(x), . . . ,fm(x) are linearly independent if and only if Gram's determinant det [(fi,fk)] differs from zero (see also Sec. 14.2-6a).
(b) As in Sec. 14.2-7, the norm of a function (vector) f(x) in L2 is the quantity
11/11 - VUT) - [fy 7«)|/«)l2 ditf
(15.2-2)
A (necessarily quadratically integrable) function f(x) is normalizable
if and only if ||/|| exists and is different from zero. Multiplication of a normalizable function f(x) by 1/||/|| yields a function /(#)/||/|| of unit norm [normalization of f(x)]. (c) The inner product defined by Eq. (1) has all the properties listed in Sec. 14.2-6. In particular, iff(x), h(x), and the real nonnegative weight ing function y(x) are quadratically integrable on V
$|(f, h)|^2 = \left|\int_V \gamma f^*h\,d\xi\right|^2 \le \int_V \gamma|f|^2\,d\xi \int_V \gamma|h|^2\,d\xi = (f, f)(h, h)$   (Cauchy-Schwarz inequality)   (15.2-3)

and

$\|f + h\| = \left[\int_V \gamma|f + h|^2\,d\xi\right]^{1/2} \le \left[\int_V \gamma|f|^2\,d\xi\right]^{1/2} + \left[\int_V \gamma|h|^2\,d\xi\right]^{1/2} = \|f\| + \|h\|$   (Minkowski's inequality)   (15.2-4)
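Both inequalities are easy to check numerically. The sketch below (an illustration only, not from the original text) uses trapezoidal quadrature in place of the Lebesgue integral, with the weighting function $\gamma(x) = 1 + x$ and the test functions chosen arbitrarily:

```python
import numpy as np

# Numerical check of the Cauchy-Schwarz (15.2-3) and Minkowski (15.2-4)
# inequalities for the weighted inner product (f, h) of Eq. (15.2-1).
# gamma(x) = 1 + x and the functions f, h are arbitrary choices.
x = np.linspace(0.0, 1.0, 2001)
gamma = 1.0 + x
f = np.exp(1j * 3.0 * x)          # complex-valued test function
h = x**2 - 0.5

def inner(u, v):
    return np.trapz(gamma * np.conj(u) * v, x)   # (u, v), Eq. (15.2-1)

def norm(u):
    return np.sqrt(inner(u, u).real)             # ||u||, Eq. (15.2-2)

print(abs(inner(f, h))**2 <= inner(f, f).real * inner(h, h).real)
print(norm(f + h) <= norm(f) + norm(h) + 1e-12)
```

The quadrature grid and functions can be replaced freely; the inequalities hold for any quadratically integrable pair.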
15.2-2. Metric and Convergence in L². Convergence in Mean (see also Secs. 12.5-2 to 12.5-5, 14.2-7, and 18.6-3). (a) As a unitary vector space, $L^2$ admits the distance function (metric, Sec. 12.5-2)

$d(f, h) = \|f - h\| \equiv \left[\int_V \gamma(\xi)|f(\xi) - h(\xi)|^2\,d\xi\right]^{1/2}$   (15.2-5)
The root-mean-square difference (5) between f(x) and h(x) equals zero (metric equality) if and only if f(x) = h(x) for almost all x in V (Sec. 4.6-14b).
(b) Convergence in Mean.
The metric (5) leads to the following definition of metric convergence in $L^2$.
For a given interval or region V,
a sequence of quadratically integrable functions $s_0(x), s_1(x), s_2(x), \ldots$
converges in mean (with index 2) to the limit s(x) [$s_n(x) \to s(x)$ in mean as $n \to \infty$] if and only if

$d^2(s_n, s) \equiv \|s_n - s\|^2 \equiv \int_V \gamma(\xi)|s_n(\xi) - s(\xi)|^2\,d\xi \to 0$ as $n \to \infty$   (15.2-6)

In this case, the sequence defines its limit in mean $\underset{n\to\infty}{\operatorname{l.i.m.}}\, s_n(x) = s(x)$ uniquely almost everywhere in V.
Convergence in mean does not necessarily imply actual convergence of the sequence $s_0(x), s_1(x), s_2(x), \ldots$ at every point, nor does convergence of a sequence at all points of V imply convergence in mean. Equation (6) does imply convergence of a subsequence of $s_n(x)$ for almost every x in V.
In particular, an infinite series $a_0(x) + a_1(x) + a_2(x) + \cdots$ of quadratically integrable functions converges in mean to the limit s(x) if and only if $\sum_{k=0}^{n} a_k(x) \to s(x)$ in mean as $n \to \infty$. In this case, one writes

$a_0(x) + a_1(x) + a_2(x) + \cdots \overset{\text{in mean}}{=} s(x)$
(c) Completeness of L²: the Generalized Riesz-Fischer Theorem. The space $L^2$ associated with a given interval or region V is complete (Sec. 12.5-4a); i.e., every sequence of quadratically integrable functions $s_0(x), s_1(x), s_2(x), \ldots$ such that $\lim_{m,n\to\infty} \|s_m - s_n\| = 0$ (Cauchy sequence) converges in mean to a quadratically integrable function s(x) and defines s(x) uniquely for almost all x in V (Generalized Riesz-Fischer Theorem).
Note: The completeness property expressed by the generalized Riesz-Fischer theorem establishes $L^2$ as a Hilbert space (Sec. 14.2-7c), permitting the introduction of orthonormal bases with all the properties noted in Secs. 14.7-4 and 15.2-4. This is an important reason for the use of Lebesgue integration and convergence in mean.
(d) One defines $f(x, \alpha) \to F(x)$ in mean as $\alpha \to a$ if and only if $\lim_{\alpha\to a} \|f(x, \alpha) - F(x)\| = 0$.
15.2-3. Orthogonal Functions and Orthonormal Sets of Functions
(see also Sec. 14.7-3). (a) Two quadratically integrable functions f(x), h(x) are mutually orthogonal [orthogonal with respect to the real nonnegative weighting function $\gamma(x)$]* on V if and only if

$(f, h) \equiv \int_V \gamma(\xi)\,f^*(\xi)\,h(\xi)\,d\xi = 0$   (15.2-7)

A set of functions $u_1(x), u_2(x), \ldots$ is an orthonormal (normal orthogonal, normalized orthogonal) set if and only if

* Some authors call f(x) and h(x) mutually orthogonal only if also $\gamma(x) \equiv 1$ in V, so that $\int_V f^*h\,d\xi = 0$.
$(u_i, u_k) \equiv \int_V \gamma(\xi)\,u_i^*(\xi)\,u_k(\xi)\,d\xi = \delta_i^k = \begin{cases} 0 & (i \ne k) \\ 1 & (i = k) \end{cases}$   (i, k = 1, 2, . . .)   (15.2-8)
Every set of normalizable mutually orthogonal functions (and, in particular, every orthonormal set) is linearly independent.
(b) Bessel's Inequality. Given a finite or infinite orthonormal set $u_1(x), u_2(x), \ldots$ and any function f(x) quadratically integrable over V,

$\sum_k |(u_k, f)|^2 \le (f, f)$   (Bessel's inequality)   (15.2-9)

The equal sign applies if and only if f(x) belongs to the linear manifold spanned by $u_1(x), u_2(x), \ldots$ (see also Secs. 14.2-2, 14.7-3, and 15.2-4).
15.2-4. Complete Orthonormal Sets of Functions (Orthonormal Bases; see also Sec. 14.7-4). An orthonormal set of functions $u_1(x)$,
$u_2(x), \ldots$ in $L^2(V)$ is a complete orthonormal set (orthonormal basis) if and only if it satisfies the following conditions:

1. Every quadratically integrable function f(x) can be expanded in the form
$f(x) \overset{\text{in mean}}{=} f_1u_1(x) + f_2u_2(x) + \cdots$ with $f_k = (u_k, f)$   (k = 1, 2, . . .)

2. For every quadratically integrable function f(x) such that $f_1u_1(x) + f_2u_2(x) + \cdots \overset{\text{in mean}}{=} f(x)$,
$(f, f) = |f_1|^2 + |f_2|^2 + \cdots$   (Parseval's identity, completeness relation; see also Secs. 14.7-3b and 15.2-3b)

3. For every pair of quadratically integrable functions f(x), h(x) such that $f_1u_1(x) + f_2u_2(x) + \cdots \overset{\text{in mean}}{=} f(x)$ and $h_1u_1(x) + h_2u_2(x) + \cdots \overset{\text{in mean}}{=} h(x)$,
$(f, h) = f_1^*h_1 + f_2^*h_2 + \cdots$
4. The orthonormal set $u_1(x), u_2(x), \ldots$ is not contained in any other orthonormal set of $L^2(V)$; i.e., every quadratically integrable function f(x) orthogonal to every $u_k(x)$ equals zero almost everywhere in V.
Each of these four conditions implies the three others. Given an interval or region V with a complete orthonormal set of functions $u_1(x), u_2(x), \ldots$ and any set of complex numbers $f_1, f_2, \ldots$ such that $\sum_{k=1}^{\infty} |f_k|^2$ converges, there exists a quadratically integrable function f(x) such that $f_1u_1(x) + f_2u_2(x) + \cdots$ converges in mean to f(x) (Riesz-Fischer Theorem, see also Sec. 15.2-2c); the $f_k$ define f(x) uniquely almost everywhere in V, and in particular wherever f(x) is continuous in V (Uniqueness Theorem, see also Sec. 4.11-5).
15.2-5. Orthogonalization and Normalization of a Set of Functions (see also Sec. 14.7-4b). Given any countable (finite or infinite) set of linearly independent (Sec. 1.9-3) functions $v_1(x), v_2(x), \ldots$, the Gram-Schmidt process

$u_i(x) = \frac{\bar v_i(x)}{\sqrt{(\bar v_i, \bar v_i)}}$ with $\bar v_1(x) = v_1(x)$, $\bar v_{i+1}(x) = v_{i+1}(x) - \sum_{k=1}^{i} (u_k, v_{i+1})\,u_k(x)$   (i = 1, 2, . . .)   (15.2-10)

yields an orthonormal set $u_1(x), u_2(x), \ldots$ spanning the same linear manifold.
See also Sec. 21.7-1 for examples.
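As a numerical sketch of Eq. (15.2-10) (an illustration, with trapezoidal quadrature and the powers of x as an arbitrary starting set):

```python
import numpy as np

# Gram-Schmidt process of Eq. (15.2-10) applied to the linearly
# independent functions v_i(x) = x**i on (-1, 1) with gamma = 1.
# The resulting u_i are proportional to the Legendre polynomials.
x = np.linspace(-1.0, 1.0, 4001)

def inner(u, v):
    return np.trapz(np.conj(u) * v, x)          # inner product, gamma = 1

vs = [x**i for i in range(4)]
us = []
for v in vs:
    vbar = v - sum(inner(u, v) * u for u in us)  # subtract projections
    us.append(vbar / np.sqrt(inner(vbar, vbar).real))

gram = np.array([[inner(ui, uk) for uk in us] for ui in us])
print(np.allclose(gram, np.eye(4), atol=1e-5))   # (u_i, u_k) = delta_ik
```

Because the projections are subtracted using the same discrete inner product, the resulting Gram matrix is the identity to machine precision.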
15.2-6. Approximations and Series Expansions in Terms of Orthonormal Functions (see also Secs. 4.11-2c, 4.11-4b, 15.4-12, 20.6-2, 20.6-3, 20.9-9, and 21.8-12). Given a quadratically integrable function f(x) and an orthonormal set $u_1(x), u_2(x), \ldots$, an approximation to f(x) of the form

$s_n(x) = a_1u_1(x) + a_2u_2(x) + \cdots + a_nu_n(x)$   (n = 1, 2, . . .)   (15.2-11)

yields the least mean-square error $\int_V |s_n(x) - f(x)|^2\,dx$ if $a_k = (u_k, f)$. Note that the choice of the coefficients $a_k$ is independent of n. This property, together with the relative simplicity of the formulas listed in
Sec. 15.2-4a, establishes the great importance of series expansions
$f(x) \overset{\text{in mean}}{=} f_1u_1(x) + f_2u_2(x) + \cdots$ with $f_k = (u_k, f)$   (k = 1, 2, . . .)

in terms of suitable normalized orthogonal functions $u_1(x), u_2(x), \ldots$.
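The least-mean-square property of the coefficients $a_k = (u_k, f)$ can be seen numerically; in this sketch (arbitrary square-wave test function, normalized Fourier functions as the orthonormal set) perturbing any optimal coefficient increases the error:

```python
import numpy as np

# Least mean-square approximation (15.2-11): with an orthonormal set u_k,
# the optimal coefficients are a_k = (u_k, f), independently of n.
x = np.linspace(0.0, 2.0 * np.pi, 8001)
f = np.sign(np.sin(x))                       # arbitrary test function

def inner(u, v):
    return np.trapz(np.conj(u) * v, x)

basis = [np.ones_like(x) / np.sqrt(2.0 * np.pi)]
for k in range(1, 6):
    basis += [np.cos(k * x) / np.sqrt(np.pi), np.sin(k * x) / np.sqrt(np.pi)]

a = [inner(u, f) for u in basis]             # a_k = (u_k, f)
s = sum(ak * u for ak, u in zip(a, basis))
err_opt = inner(f - s, f - s).real           # minimal mean-square error

s_bad = s + 0.1 * basis[1]                   # perturb one coefficient
err_bad = inner(f - s_bad, f - s_bad).real
print(err_opt < err_bad)
```

The same comparison holds for any perturbation, since the residual f − s is orthogonal to every basis function.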
15.2-7. Linear Operations on Functions. Sections 8.2-1, 8.6-1 to 8.6-4, 15.3-1, 15.4-1, and 20.4-2 introduce various linear operations (Sec. 14.3-1) associating a function

$\bar\Phi(x) = \mathsf{L}\Phi(x)$   (15.2-12)

with a given function $\Phi(\xi)$ so that

$\mathsf{L}[\Phi_1(\xi) + \Phi_2(\xi)] = \mathsf{L}\Phi_1(\xi) + \mathsf{L}\Phi_2(\xi)$,  $\mathsf{L}[\alpha\Phi(\xi)] = \alpha\mathsf{L}\Phi(\xi)$   (15.2-13)

Two interpretations are possible:
1. Equation (12) describes an operation on the object function $\Phi(\xi)$ ("alibi" point of view).
2. $\Phi(\xi)$ and $\bar\Phi(x)$ may be regarded as different representations of the same abstract vector (like column matrices, Sec. 14.6-1b), and Eq. (12) describes a change of representation ("alias" point of view, used especially in quantum mechanics; see also Sec. 8.1-1).

15.3. LINEAR INTEGRAL TRANSFORMATIONS AND LINEAR
INTEGRAL EQUATIONS
15.3-1. Linear Integral Transformations.* (a) Sections 15.3-1 to 15.3-10 deal with linear integral transformations

$\mathsf{K}f(\xi) \equiv \int_V K(x, \xi)f(\xi)\,d\xi = F(x)$   (15.3-1)
relating a pair of functions $f(\xi)$ and F(x). The function $K(x, \xi)$ is called the kernel of the linear integral transformation. All integrals will be assumed to exist in the sense of Lebesgue (Sec. 4.6-15). The domains of the object function $f(\xi)$ and the result function F(x) in Eq. (1) are not necessarily identical (note, for example, the Laplace transformation, Sec. 8.2-1). Throughout Secs. 15.3-1b to 15.3-10 it is, however, assumed that x and ξ range over the same interval or region V. The "symbolic" integral transformation

$\int_V \delta(x, \xi)f(\xi)\,d\xi \equiv f(x)$

represents the identity transformation (see also Secs. 15.5-1 and 21.9-2). Linear integral transformations (1) may be interpreted from either the "alibi" or the "alias" point of view (Sec. 15.2-7); in the former case each kernel $K(x, \xi)$ represents a linear operator in the same sense as a matrix (Sec. 14.5-2; see also Sec. 15.3-1c).
(b) For a given kernel $K(x, \xi)$, $\tilde K(x, \xi) \equiv K(\xi, x)$ is called the transposed kernel, and $K^\dagger(x, \xi) \equiv K^*(\xi, x)$ is the adjoint (hermitian conjugate) kernel (see also Sec. 14.4-3). A given kernel $K(x, \xi)$ is

* Refer again to Sec. 15.1-2 for notation. See also Sec. 15.4-1b.
Symmetric if and only if $K(\xi, x) = K(x, \xi)$
Hermitian if and only if $K^*(\xi, x) = K(x, \xi)$
Normalizable if and only if $\int_V \int_V |K(x, \xi)|^2\,d\xi\,dx$ exists and is different from zero
Continuous in the mean over V if and only if $\lim_{\Delta x \to 0} \int_V |K(x + \Delta x, \xi) - K(x, \xi)|^2\,d\xi = 0$ (see also Sec. 12.5-1c)
Separable (degenerate) if and only if $K(x, \xi)$ can be expressed as a finite sum $K(x, \xi) = \sum_{i=1}^{m} f_i(x)h_i(\xi)$

A normalizable kernel represents a bounded operator (Sec. 14.4-1), so that F(x) is normalizable if $f(\xi)$ is normalizable (Sec. 15.2-1b). Separable kernels represent operators of finite rank (Sec. 14.3-2). If $K(x, \xi)$ is hermitian, normalizable, and continuous in the mean over V, then F(x) is continuous in V whenever $f(\xi)$ is quadratically integrable over V.
(c) Matrix Representation. Product of Two Integral Transformations. Given a normalizable kernel $K(x, \xi)$ and an orthonormal basis (Sec. 15.2-4) $u_1(x), u_2(x), \ldots$ in the space of functions f(x), let
$f(\xi) \overset{\text{in mean}}{=} \sum_{k=1}^{\infty} f_ku_k(\xi)$,  $F(x) \overset{\text{in mean}}{=} \sum_{k=1}^{\infty} F_ku_k(x)$,  $K(x, \xi) \overset{\text{in mean}}{=} \sum_{i=1}^{\infty}\sum_{k=1}^{\infty} k_{ik}u_i(x)u_k^*(\xi)$   (15.3-2)

$\left[k_{ik} = \int_V\int_V u_i^*(x)K(x, \xi)u_k(\xi)\,dx\,d\xi\right]$

Then Eq. (1) is equivalent to the matrix equation $\{F_i\} = [k_{ik}]\{f_k\}$. Note that the matrix product $[m_{ik}][k_{ik}]$ corresponds to the kernel $\int_V M(x, \eta)K(\eta, \xi)\,d\eta$ representing the product of two successive integral transformations (1) whose kernels $K(x, \xi)$, $M(x, \xi)$ correspond to $[k_{ik}]$, $[m_{ik}]$. If $K(x, \xi)$ is a separable kernel, then it is possible to choose the $u_k(x)$ so that the matrix $[k_{ik}]$ is finite.
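The correspondence between a kernel and its matrix can be checked by quadrature. In this sketch the separable kernel cos(x − ξ) and a truncated Fourier basis are arbitrary choices:

```python
import numpy as np

# Matrix representation (15.3-2) of F = K f: with matrix elements
# k_ik = double integral of u_i*(x) K(x, xi) u_k(xi), one has {F_i} = [k_ik]{f_k}.
x = np.linspace(0.0, 2.0 * np.pi, 801)

basis = [np.ones_like(x) / np.sqrt(2.0 * np.pi)]
for k in range(1, 4):
    basis += [np.cos(k * x) / np.sqrt(np.pi), np.sin(k * x) / np.sqrt(np.pi)]

K = np.cos(x[:, None] - x[None, :])          # K(x, xi) on the grid
f = np.sin(x) + 0.3 * np.cos(2 * x)          # arbitrary object function
F = np.trapz(K * f[None, :], x, axis=1)      # F(x) = int K(x, xi) f(xi) dxi

fk = np.array([np.trapz(u * f, x) for u in basis])
Fi = np.array([np.trapz(u * F, x) for u in basis])
kik = np.array([[np.trapz(np.trapz(u[:, None] * K * v[None, :], x, axis=1), x)
                 for v in basis] for u in basis])
print(np.allclose(Fi, kik @ fk, atol=1e-6))
```

Because the kernel here is separable with components inside the chosen basis, the truncated matrix reproduces the transform exactly, as the last paragraph above asserts.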
15.3-2. Linear Integral Equations: Survey. An integral equation is a functional equation (Sec. 9.1-2) involving integral transforms of the unknown function $\Phi(x)$ [if the functional equation also involves a derivative of $\Phi(x)$, one speaks of an integro-differential equation]. A given integral equation is homogeneous if and only if every multiple $\alpha\Phi(x)$ of any solution $\Phi(x)$ is a solution. Sections 15.3-2 to 15.3-10 deal with linear integral equations of the general form

$\beta(x)\Phi(x) - \lambda\int_V K(x, \xi)\Phi(\xi)\,d\xi = F(x)$   (15.3-3)
where $K(x, \xi)$, $\beta(x)$, and F(x) are given functions; the integration domain V may be given (Fredholm-type integral equations) or variable (e.g., Volterra-type integral equations, Sec. 15.3-10). Three important types of problems arise:

1. A linear integral equation of the first kind [$\beta(x) \equiv 0$, λ = 1, Sec. 15.3-9] requires one to find an unknown function $\Phi(x)$ with the given integral transform F(x). The corresponding operator equation $\mathsf{K}\Phi = F$ is analogous to a matrix equation $[k_{ik}]\{\Phi_k\} = \{F_i\}$.
2. A homogeneous linear integral equation of the second kind [$F(x) \equiv 0$, $\beta(x) \equiv 1$, λ unknown; Secs. 15.3-3 to 15.3-6] represents an eigenvalue problem. The corresponding operator equation $\lambda\mathsf{K}\Phi = \Phi$ is analogous to a matrix equation $\lambda[k_{ik}]\{\Phi_k\} = \{\Phi_i\}$.
3. A nonhomogeneous linear integral equation of the second kind [$\beta(x) \equiv 1$, λ given; Sec. 15.3-7] may be written as $\Phi - \lambda\mathsf{K}\Phi = F$ and represents a problem of the type discussed in Sec. 14.8-10.

If $\beta(x)$ is a real positive function throughout V, one can reduce the general linear integral equation (3) to a linear integral equation of the second kind with the aid of the transformation

$\bar\Phi(x) = \Phi(x)\sqrt{\beta(x)}$,  $\bar K(x, \xi) = \frac{K(x, \xi)}{\sqrt{\beta(x)\beta(\xi)}}$   (15.3-4)

15.3-3. Homogeneous Fredholm Integral Equations of the Second Kind: Eigenfunctions and Eigenvalues (see also Secs. 14.8-3 and 15.4-5). (a) A function $\psi = \psi(x)$ which is not identically zero in V and satisfies the homogeneous Fredholm integral equation of the second kind
$\lambda\mathsf{K}\psi(\xi) \equiv \lambda\int_V K(x, \xi)\psi(\xi)\,d\xi = \psi(x)$   (15.3-5)
for a suitably determined value of the parameter λ is called an eigenfunction (characteristic function) of the linear integral equation (5) or of the kernel $K(x, \xi)$. The corresponding value of λ is an eigenvalue of the integral equation.
If $\psi_1(x)$ and $\psi_2(x)$ are eigenfunctions corresponding to the same eigenvalue λ, the same is true for every linear combination $\alpha_1\psi_1(x) + \alpha_2\psi_2(x)$. The number m of linearly independent eigenfunctions corresponding to a given eigenvalue λ is its degree of degeneracy; if $K(x, \xi)$ is a normalizable kernel, each m is finite. Eigenfunctions associated with different eigenvalues of a linear integral equation (5) are linearly independent. The total number of linearly independent eigenfunctions is finite if and only if $K(x, \xi)$ is separable.
If the linear operator $\mathsf{K}$ described by the linear integral transformation (5) has a unique inverse $\mathsf{L} = \mathsf{K}^{-1}$, then the $\psi(x)$ and λ are eigenfunctions and eigenvalues of the
nonsingular linear operator $\mathsf{L}$ (see also Sec. 15.4-5), and all the eigenvalues λ are different from zero. In many applications $\mathsf{L}$ is a differential operator, and $K(x, \xi)$ is a Green's function (Sec. 15.5-1).
(b) Eigenfunctions and Eigenvalues for Hermitian Kernels (see also Secs. 14.8-4 and 15.4-6). If $K(x, \xi)$ is a hermitian kernel, then
1. All eigenvalues of the integral equation (5) are real.
2. Eigenfunctions corresponding to different eigenvalues are mutually orthogonal (with weighting function 1, Sec. 15.2-3).
If the kernel $K(x, \xi)$ is normalizable as well as hermitian, then
3. All eigenvalues are different from zero.
4. There exists at least one eigenvalue different from zero; the eigenvalues constitute a discrete set comprising at most a finite number of eigenvalues in any finite interval, and every eigenvalue has a finite degree of degeneracy.
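Eigenvalues of a hermitian kernel can be approximated by discretizing the integral in Eq. (15.3-5) on a quadrature grid (the Nyström method). A minimal sketch, using the kernel min(x, ξ) on (0, 1), whose exact eigenvalues in the convention of Eq. (15.3-5) are $\lambda_n = ((n - \tfrac12)\pi)^2$:

```python
import numpy as np

# Nystrom discretization of the eigenvalue problem (15.3-5) for the
# hermitian kernel K(x, xi) = min(x, xi) on (0, 1), midpoint rule.
n = 400
x = (np.arange(n) + 0.5) / n          # midpoint quadrature nodes
w = 1.0 / n                           # equal weights
K = np.minimum(x[:, None], x[None, :])

mu = np.linalg.eigvalsh(K * w)        # eigenvalues of the discretized operator
lam = np.sort(1.0 / mu[mu > 1e-8])    # eigenvalues lambda = 1/mu of Eq. (15.3-5)
print(lam[:3])                        # ~ (pi/2)^2, (3*pi/2)^2, (5*pi/2)^2
```

All computed eigenvalues are real and positive, as statements 1 and 3 above require for this (positive definite) hermitian kernel.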
15.3-4. Expansion Theorems. (a) Expansion Theorems for Hermitian Kernels. Given the definitions of inner products (Sec. 15.2-1) and convergence in mean (Sec. 15.2-2) with $\gamma(x) \equiv 1$, every normalizable hermitian kernel $K(x, \xi)$ admits an orthonormal set of eigenfunctions $\psi_1(x), \psi_2(x), \ldots$ such that, for every given function F(x) representable as the integral transform (1) of a function $f(\xi)$,

$F(x) \overset{\text{in mean}}{=} \sum_{k=1}^{\infty} F_k\psi_k(x)$   (x in V)
with $F_k = (\psi_k, F) = \int_V \psi_k^*(\xi)F(\xi)\,d\xi$   (k = 1, 2, . . .)   (15.3-6)
The series (6) converges absolutely and uniformly to F(x) for all x in V (1) if $f(\xi)$ is piecewise continuous in V, or (2) if $f(\xi)$ is quadratically integrable and $K(x, \xi)$ is continuous in the mean. A hermitian kernel $K(x, \xi)$ is complete if and only if every quadratically integrable function F(x) can be represented in the form (1) or (6), so that the eigenfunctions $\psi_k(x)$ constitute a complete orthonormal set (Sec. 15.2-4); this is true, for example, if $K(x, \xi)$ is definite (Sec. 15.3-6). Every normalizable hermitian kernel $K(x, \xi)$ can be expanded in the form

$K(x, \xi) \overset{\text{in mean}}{=} \sum_{k=1}^{\infty} \frac{1}{\lambda_k}\psi_k(x)\psi_k^*(\xi)$   (x, ξ in V)   (15.3-7)

The series converges uniformly to $K(x, \xi)$ for all x, ξ in V (1) if $K(x, \xi)$ is continuous in the mean, or (2) if V is a bounded interval or region such that $K(x, \xi)$ is continuous for all x, ξ in V, and either the positive or the negative eigenvalues of Eq. (5) are finite in number (Mercer's Theorem).
(b) Auxiliary Kernels and Expansion Theorems for Nonhermitian Kernels. For every normalizable kernel $K(x, \xi)$, the auxiliary hermitian kernels

$\int_V K^\dagger(x, \eta)K(\eta, \xi)\,d\eta$ and $\int_V K(x, \eta)K^\dagger(\eta, \xi)\,d\eta$

yield identical real eigenvalues $\mu_k^2$ and the respective orthonormal sets of eigenfunctions $v_k(x)$ and $w_k(x)$. Note

$\mu_k\int_V K(x, \xi)v_k(\xi)\,d\xi = w_k(x)$,  $\mu_k\int_V K^\dagger(x, \xi)w_k(\xi)\,d\xi = v_k(x)$   (15.3-8)

where $\mu_k > 0$.
Given any normalizable kernel $K(x, \xi)$, every function F(x) expressible as an integral transform (1) with kernel K or $K^\dagger$ can be expanded in the respective form

$F(x) \overset{\text{in mean}}{=} \sum_{k=1}^{\infty} b_kw_k(x)$ with $b_k = (w_k, F)$   (x in V)   (15.3-9a)

or

$F(x) \overset{\text{in mean}}{=} \sum_{k=1}^{\infty} a_kv_k(x)$ with $a_k = (v_k, F)$   (x in V)   (15.3-9b)

Each series converges absolutely and uniformly to F(x) for all x in V (1) if $f(\xi)$ is piecewise continuous in V, or (2) if $f(\xi)$ is quadratically integrable, and $K(x, \xi)$ is continuous in the mean; the weighting function is 1 (Sec. 15.2-1). For every normalizable kernel $K(x, \xi)$,

$K(x, \xi) \overset{\text{in mean}}{=} \sum_{k=1}^{\infty} \frac{1}{\mu_k}v_k^*(\xi)w_k(x)$   (x, ξ in V)   (15.3-10)

15.3-5. Iterated Kernels. The iterated kernels $K_p(x, \xi)$ defined by

$K_1(x, \xi) \equiv K(x, \xi)$,  $K_p(x, \xi) = \int_V K(x, \eta)K_{p-1}(\eta, \xi)\,d\eta$   (p = 2, 3, . . .)   (15.3-11)

represent the powers $\mathsf{K}^p$ of the linear operator $\mathsf{K}$ of Eq. (1). The linear integral equation

$\lambda\mathsf{K}^p\psi(\xi) \equiv \lambda\int_V K_p(x, \xi)\psi(\xi)\,d\xi = \psi(x)$   (15.3-12)

has the same eigenfunctions $\psi(x)$ as Eq. (5), with corresponding eigenvalues $\lambda^p$ (see also Sec. 14.8-3d). Conversely, every solution $\psi(x)$ of Eq. (12) solves Eq. (5). If $K(x, \xi)$ is hermitian and normalizable, then

$K_p(x, \xi) = \sum_{k=1}^{\infty} \frac{1}{\lambda_k^p}\psi_k(x)\psi_k^*(\xi)$   (p = 1, 2, . . . ; x, ξ in V)   (15.3-13)

and this series converges absolutely and uniformly in V.
15.3-6. Hermitian Integral Forms. The Eigenvalue Problem as a Variation Problem (see also Secs. 11.1-1, 13.5-2, 13.5-3, 14.8-8a, and 15.4-7). (a) Given a normalizable hermitian kernel $K(x, \xi)$, the (necessarily real) inner product

$(\Phi, \mathsf{K}\Phi) = \int_V\int_V \Phi^*(x)K(x, \xi)\Phi(\xi)\,dx\,d\xi$   (15.3-14)

is called a hermitian integral form or, in particular, a real symmetric quadratic integral form if $K(x, \xi)$ is real and symmetric.* The hermitian integral form (14), and also the hermitian kernel $K(x, \xi)$, is positive definite, negative definite, nonnegative, or nonpositive if and only if the expression (14) is, respectively, positive, negative, nonnegative, or nonpositive for every function $\Phi(x)$ not identically zero in V and such that the integral exists. The integral form (14) is positive definite or negative definite if and only if all eigenvalues of $K(x, \xi)$ are, respectively, positive or negative.
(b) The problem of finding the eigenfunctions $\psi(x)$ and the eigenvalues λ for a normalizable hermitian kernel may be expressed as a stationary-value problem in the manner of Sec. 14.8-8a, e.g.,

Find a quadratically integrable function $\Phi(x)$ such that the hermitian integral form (14) has a stationary value subject to the constraint $(\Phi, \Phi) \equiv \int_V |\Phi(\xi)|^2\,d\xi = 1$; $\Phi = \psi_k(x)$ yields the stationary value $1/\lambda_k$.

All the other theorems of Sec. 14.8-8a apply; it is only necessary to remember that the operator $\mathsf{K}$ represented by the given kernel $K(x, \xi)$ has the eigenfunctions $\psi_k(x)$ and the eigenvalues $1/\lambda_k$. It is, then, often possible to solve an integral equation of the form (5) exactly or approximately by the methods of the calculus of variations.
15.3-7. Nonhomogeneous Fredholm Equations of the Second Kind (see also Sec. 14.8-10). (a) Existence and Uniqueness of Solutions. Fredholm's linear integral equation of the second kind

$\Phi(x) - \lambda\int_V K(x, \xi)\Phi(\xi)\,d\xi = F(x)$   (15.3-16)

has the following "alternative property":

1. If the given parameter λ is not an eigenvalue of $K(x, \xi)$, then Eq. (16) has at most one solution $\Phi(x)$.
2. If λ equals an eigenvalue $\lambda_i$ of Eq. (5), then the "adjoint" homogeneous integral equation

$\lambda_i^*\int_V K^*(\xi, x)\chi(\xi)\,d\xi = \chi(x)$   (15.3-17)

has a solution $\chi(x)$ not identically zero in V, and the given integral equation (16) has a solution $\Phi(x)$ only if F(x) is orthogonal (with weighting function 1) to every solution $\chi(x)$ of Eq. (17). Note that Eqs. (17) and (5) are identical for hermitian kernels.

* Note that, in terms of the matrix elements defined in Sec. 15.3-1c,

$(\Phi, \mathsf{K}\Phi) = \sum_{i=1}^{\infty}\sum_{k=1}^{\infty} k_{ik}\Phi_i^*\Phi_k$   (15.3-15)
Solutions subject to the above conditions actually exist, in particular, whenever $K(x, \xi)$ is piecewise continuous and normalizable and F(x) is continuous and quadratically integrable over V. If a solution $\Phi(x)$ exists in case 2, then Eq. (16) has infinitely many solutions, since every sum of a particular solution and a linear combination of eigenfunctions $\psi(x)$ corresponding to $\lambda_i$ is a solution; in particular, there exists a unique solution $\Phi(x)$ orthogonal to all these eigenfunctions.
(b) Reduction to an Integral Equation with Hermitian Kernel.
If $K(x, \xi)$ is normalizable, every solution $\Phi(x)$ of Eq. (16) is a solution of the integral equation

$\Phi(x) - |\lambda|^2\int_V H(x, \xi)\Phi(\xi)\,d\xi = F(x) - \lambda^*\int_V K^*(\xi, x)F(\xi)\,d\xi$   (15.3-18a)

where $H(x, \xi)$ is the hermitian kernel defined by

$H(x, \xi) \equiv \frac{1}{\lambda^*}K(x, \xi) + \frac{1}{\lambda}K^*(\xi, x) - \int_V K^*(\eta, x)K(\eta, \xi)\,d\eta$   (15.3-18b)
(c) The Resolvent Kernel. A solution $\Phi(x)$ of the linear integral equation (16) is conveniently written in the form

$\Phi(x) = F(x) + \lambda\int_V \Gamma(x, \xi; \lambda)F(\xi)\,d\xi$   (15.3-19)

The function $\Gamma(x, \xi; \lambda)$ is called the resolvent kernel (sometimes known as the reciprocal kernel) for the integral equation (16); Eqs. (16) and (19) represent mutually inverse linear transformations. Whenever the resolvent kernel $\Gamma(x, \xi; \lambda)$ exists, it satisfies the integral equations

$\Gamma(x, \xi; \lambda) - \lambda\int_V K(x, \eta)\Gamma(\eta, \xi; \lambda)\,d\eta = K(x, \xi)$
$\Gamma(x, \xi; \lambda) - \lambda\int_V K(\eta, \xi)\Gamma(x, \eta; \lambda)\,d\eta = K(x, \xi)$   (15.3-20)
$\Gamma(x, \xi; \lambda) - \Gamma(x, \xi; \lambda') = (\lambda - \lambda')\int_V \Gamma(x, \eta; \lambda)\Gamma(\eta, \xi; \lambda')\,d\eta$

for arbitrary λ, λ' which are not eigenvalues of $K(x, \xi)$.
15.3-8. Solution of the Linear Integral Equation (16) (see also Sec. 20.8-5 for numerical methods). (a) Solution by Successive Approximations. Neumann Series. Starting with a trial solution

$\Phi^{(0)}(x) = F(x)$
one may attempt to compute successive approximations

$\Phi^{(j+1)}(x) = F(x) + \lambda\int_V K(x, \xi)\Phi^{(j)}(\xi)\,d\xi$   (j = 0, 1, 2, . . .)   (15.3-21)

to the desired solution $\Phi(x)$ of Eq. (16). The functions (21) may be regarded as partial sums of the infinite series

$F(x) + \lambda\int_V K(x, \xi)F(\xi)\,d\xi + \lambda^2\int_V K_2(x, \xi)F(\xi)\,d\xi + \cdots = (1 + \lambda\mathsf{K} + \lambda^2\mathsf{K}^2 + \cdots)F(x)$   (Neumann series)   (15.3-22)

If $K(x, \xi)$ is normalizable, and F(x) is quadratically integrable over V, then there exists a real number $r_c \ge \left[\int_V\int_V |K(x, \xi)|^2\,d\xi\,dx\right]^{-1/2}$ such that the Neumann series (22) converges in mean to a solution (19) for $|\lambda| < r_c$. If, in addition, $\int_V |K(x, \xi)|^2\,d\xi$ and $\int_V |K(\xi, x)|^2\,d\xi$ are uniformly bounded in V, then the power series (22) and the corresponding power series for the resolvent kernel

$\Gamma(x, \xi; \lambda) = K(x, \xi) + \lambda K_2(x, \xi) + \lambda^2K_3(x, \xi) + \cdots$   (15.3-23)

actually converge uniformly to the indicated limits for x, ξ in V and $|\lambda| < r_c$. The function (23) is then an analytic function of λ for $|\lambda| < r_c$ and can be continued analytically (Sec. 7.8-1) to yield a resolvent kernel for other suitable values of λ. The series (23), as well as the series (22), is known as a Neumann series.
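The iteration (15.3-21) can be sketched numerically. The separable kernel below is an arbitrary test case chosen so that the exact solution is available in closed form, and λ lies well inside the convergence region:

```python
import numpy as np

# Successive approximations (15.3-21) for Phi - lam*K*Phi = F with the
# kernel K(x, xi) = x*xi on (0, 1) and F(x) = x.  For this separable
# kernel the exact solution is Phi(x) = x / (1 - lam/3).
n = 801
x = np.linspace(0.0, 1.0, n)
K = x[:, None] * x[None, :]
F = x.copy()
lam = 0.5                                 # inside |lam| < r_c

phi = F.copy()
for _ in range(60):                       # Phi^(j+1) = F + lam * K Phi^(j)
    phi = F + lam * np.trapz(K * phi[None, :], x, axis=1)

phi_exact = x / (1.0 - lam / 3.0)
print(np.allclose(phi, phi_exact, atol=1e-5))
```

Each pass adds one more term of the Neumann series (15.3-22); for this kernel the effective contraction factor per pass is λ/3.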
If the normalizable kernel $K(x, \xi)$ is hermitian, then the radius of convergence (or mean convergence) $r_c$ is given by $r_c = |\lambda_1|$, where $\lambda_1$ is the eigenvalue of $K(x, \xi)$ with the smallest absolute value.
(b) The Schmidt-Hilbert Formula for the Resolvent Kernel (see also Sec. 14.8-10). For every normalizable and hermitian kernel $K(x, \xi)$ the solution of the linear integral equation (16) is given by Eq. (19) with the resolvent kernel

$\Gamma(x, \xi; \lambda) = K(x, \xi) + \lambda\sum_{k=1}^{\infty} \frac{\psi_k(x)\psi_k^*(\xi)}{\lambda_k(\lambda_k - \lambda)}$   (λ ≠ $\lambda_k$, k = 1, 2, . . .)   (Schmidt-Hilbert formula)   (15.3-24)

where the $\psi_k(x)$ are orthonormal eigenfunctions* of $K(x, \xi)$. The series converges uniformly for x, ξ in V and λ ≠ $\lambda_k$.

* If $\lambda_i$ is an m-fold degenerate eigenvalue (Sec. 15.3-3a), one writes $\lambda_{i+1} = \cdots = \lambda_{i+m-1} = \lambda_i$, so that the series (24) contains m terms involving the orthonormal eigenfunctions $\psi_i(x), \psi_{i+1}(x), \ldots, \psi_{i+m-1}(x)$ corresponding to $\lambda_i$.
$\Gamma(x, \xi; \lambda)$ is a meromorphic function of λ in the finite part of the λ plane; the residues of $\Gamma(x, \xi; \lambda)$ at its poles $\lambda = \lambda_k$ are simply related to the eigenfunctions $\psi_k(x)$ (Secs. 7.6-8 and 7.7-1).
Schmidt-Hilbert solution if λ equals an eigenvalue $\lambda_k$. If the given parameter λ equals an eigenvalue $\lambda_i$ of the given kernel $K(x, \xi)$, omit the term or terms involving $\lambda_i$ in the sum (24) (which is now no longer a resolvent kernel) and add an arbitrary eigenfunction $\psi(x)$ associated with $\lambda_i$ to the right side of Eq. (19). The resulting function $\Phi(x)$ will solve the given integral equation (16), subject to the conditions of case 2 in Sec. 15.3-7a.
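The Schmidt-Hilbert construction can be verified on a quadrature grid: build Γ from the discrete eigenpairs and check that it satisfies the defining relation (15.3-20). The kernel min(x, ξ) and the midpoint rule are arbitrary choices for this sketch:

```python
import numpy as np

# Schmidt-Hilbert formula (15.3-24): resolvent of a hermitian kernel
# assembled from eigenvalues/orthonormal eigenfunctions, checked against
# Gamma - lam * K Gamma = K, Eq. (15.3-20).  K(x, xi) = min(x, xi) on (0, 1).
n = 300
x = (np.arange(n) + 0.5) / n
w = 1.0 / n                           # midpoint quadrature weight
K = np.minimum(x[:, None], x[None, :])

mu, psi = np.linalg.eigh(K * w)       # discrete eigenpairs of the operator
psi = psi / np.sqrt(w)                # renormalize so sum psi^2 * w = 1
lam_k = 1.0 / mu                      # eigenvalues lambda_k of Eq. (15.3-5)

lam = 1.0                             # below the smallest lambda_k ~ (pi/2)^2
G = K + lam * (psi * (1.0 / (lam_k * (lam_k - lam)))) @ psi.T
resid = G - lam * (K * w) @ G - K     # discrete form of Eq. (15.3-20)
print(np.abs(resid).max() < 1e-6)
```

Algebraically the sum collapses to $\sum_k \psi_k(x)\psi_k^*(\xi)/(\lambda_k - \lambda)$, which is why the residual vanishes to rounding error.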
(c) Fredholm's Formulas for the Resolvent Kernel. If $K(x, \xi)$ is normalizable, the resolvent kernel $\Gamma(x, \xi; \lambda)$ can be expressed as the ratio of two integral functions of λ (see also Sec. 7.6-7), viz.,

$\Gamma(x, \xi; \lambda) = \frac{D(x, \xi; \lambda)}{D(\lambda)}$   (15.3-25)

where

$D(\lambda) = \sum_{k=0}^{\infty} \frac{(-1)^k}{k!}C_k\lambda^k$,  $D(x, \xi; \lambda) = \sum_{k=0}^{\infty} \frac{(-1)^k}{k!}D_k(x, \xi)\lambda^k$

with $C_0 = 1$, $D_0(x, \xi) = K(x, \xi)$ and, for k = 1, 2, . . . ,

$C_k = \int_V D_{k-1}(\xi, \xi)\,d\xi$,  $D_k(x, \xi) = C_kK(x, \xi) - k\int_V K(x, \eta)D_{k-1}(\eta, \xi)\,d\eta$   (15.3-26)

Both power series converge for all finite λ; the power series for $D(x, \xi; \lambda)$ converges uniformly in V. The poles of $\Gamma(x, \xi; \lambda)$ coincide with the zeros of $D(\lambda)$.
Note the analogy between the functions $D(x, \xi; \lambda)$, $D(\lambda)$ and the determinants used to find the solution $x_k$ of the analogous finite-dimensional problem $\sum_{k=1}^{n} (a_{ik} - \lambda\delta_i^k)x_k = b_i$ (i = 1, 2, . . . , n) by Cramer's rule (1.9-4).
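The recursion (15.3-26) is straightforward to run by quadrature. In this sketch (an arbitrary rank-one test kernel) the Fredholm determinant is known in closed form, $D(\lambda) = 1 - \lambda/3$, so its zero at λ = 3 checks the recursion:

```python
import numpy as np
from math import factorial

# Fredholm recursion (15.3-26) for D(lambda), applied to the rank-one
# kernel K(x, xi) = x*xi on (0, 1); analytically D(lambda) = 1 - lambda/3,
# whose only zero is the kernel's single eigenvalue lambda = 3.
n = 401
x = np.linspace(0.0, 1.0, n)
dx = x[1] - x[0]
w = np.full(n, dx); w[0] = w[-1] = dx / 2      # trapezoid weights
K = np.outer(x, x)

C, D = [1.0], [K.copy()]
for k in range(1, 5):
    C.append(np.sum(w * np.diag(D[-1])))       # C_k = int D_{k-1}(xi, xi) dxi
    D.append(C[-1] * K - k * (K @ (w[:, None] * D[-1])))

def D_lambda(lam):
    # D(lam) = sum over k of (-1)^k C_k lam^k / k!
    return sum((-1)**k * C[k] * lam**k / factorial(k) for k in range(len(C)))

print(abs(D_lambda(3.0)) < 1e-3)
```

For this kernel $D_1$ and all higher iterates vanish identically, so the series terminates after the linear term.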
(d) Singular Kernels. If the given kernel $K(x, \xi)$ becomes unbounded in a neighborhood of x = ξ while the iterated kernel $K_2(x, \xi)$ remains bounded, find the solution $\Phi(x)$ of Eq. (16) by solving the integral equation

$\Phi(x) - \lambda^2\int_V K_2(x, \xi)\Phi(\xi)\,d\xi = F(x) + \lambda\int_V K(x, \xi)F(\xi)\,d\xi$   (15.3-27)

Equation (27) is obtained by substitution of

$\Phi(x) = F(x) + \lambda\int_V K(x, \xi)\Phi(\xi)\,d\xi$

on the left of Eq. (16). This procedure may be repeated if $K_3(x, \xi), K_4(x, \xi), \ldots$ is the first iterated kernel which remains bounded.
15.3-9. Solution of Fredholm's Linear Integral Equation of the First Kind (see also Secs. 14.8-10b, 15.4-12, and 15.5-1). If the linear integral equation

$\int_V K(x, \xi)\Phi(\xi)\,d\xi = F(x)$   (15.3-28)

with a given normalizable kernel $K(x, \xi)$ has a solution $\Phi(x)$, the expansion theorem of Sec. 15.3-4b yields

$\Phi(x) \overset{\text{in mean}}{=} \sum_{k=1}^{\infty} \mu_k(w_k, F)v_k(x) + \varphi(x)$   (15.3-29)

where the $v_k(x)$, $w_k(x)$, and $\mu_k$ are defined as in Sec. 15.3-4b [$\varphi(x)$ is any solution of the homogeneous equation $\mathsf{K}\varphi = 0$], and

$\Phi(x) = \varphi(x) + \int_V K_{-1}(x, \xi)F(\xi)\,d\xi$ with $K_{-1}(x, \xi) = \sum_k \mu_kw_k^*(\xi)v_k(x)$   (15.3-30)

$K_{-1}(x, \xi)$ is called the reciprocal kernel associated with $K(x, \xi)$ and represents the linear operator $\mathsf{K}^{-1}$. For hermitian kernels $K(x, \xi)$ one has $v_k(x) = w_k(x)$, $\mu_k = |\lambda_k|$.
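A discrete analogue of the reciprocal kernel is the truncated pseudo-inverse built from the singular functions of the quadrature-discretized operator. The rank-2 kernel and test function below are arbitrary choices for this sketch:

```python
import numpy as np

# First-kind equation (15.3-28) solved via the singular functions of the
# kernel (discrete analogue of the reciprocal kernel, Eq. 15.3-30).
n = 801
x = np.linspace(0.0, np.pi, n)
dx = x[1] - x[0]
w = np.full(n, dx); w[0] = w[-1] = dx / 2     # trapezoid weights
K = np.outer(np.sin(x), np.sin(x)) + 0.5 * np.outer(np.cos(x), np.cos(x))

phi_true = np.sin(x) + np.cos(x)
F = K @ (w * phi_true)                  # F(x) = int K(x, xi) phi(xi) dxi

sw = np.sqrt(w)
B = sw[:, None] * K * sw[None, :]       # symmetrized discrete operator
U, sig, Vt = np.linalg.svd(B)
r = 2                                   # the separable kernel has rank 2
y = Vt[:r].T @ ((U[:, :r].T @ (sw * F)) / sig[:r])
phi = y / sw                            # recovered solution
print(np.allclose(phi, phi_true, atol=1e-6))
```

Retaining only the nonzero singular values reflects the fact that Eq. (15.3-28) determines Φ only up to the null space of K.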
15.3-10. Volterra-type Integral Equations. (a) Let x and ξ be one-dimensional real variables. Then

1. Volterra's integral equation of the second kind

$\Phi(x) - \lambda\int_0^x H(x, \xi)\Phi(\xi)\,d\xi = F(x)$   (15.3-31)

reduces to a Fredholm-type integral equation (16) over the interval $V \equiv (0, \infty)$ if one introduces the new kernel

$K(x, \xi) = H(x, \xi)U_+(x - \xi) = \begin{cases} H(x, \xi) & (\xi < x) \\ 0 & (\xi > x) \end{cases}$   (15.3-32)

2. Volterra's integral equation of the first kind

$\int_0^x H(x, \xi)\Phi(\xi)\,d\xi = F(x)$   (15.3-33)

reduces by differentiation to the form (31) with H(x, ξ) and F(x) replaced by

$-\frac{\partial H(x, \xi)/\partial x}{H(x, x)}$ and $\frac{dF(x)/dx}{H(x, x)}$   (15.3-34)
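Because the kernel (15.3-32) is triangular, a second-kind Volterra equation can be solved by marching forward node by node. A sketch with H ≡ 1, λ = 1, F ≡ 1 (arbitrary test case, whose exact solution is $e^x$):

```python
import numpy as np

# Volterra equation of the second kind (15.3-31), trapezoid-rule marching:
# the triangular kernel (15.3-32) lets phi be computed node by node.
# With H = 1, lambda = 1, F = 1 the exact solution is exp(x).
n = 2001
x = np.linspace(0.0, 2.0, n)
h = x[1] - x[0]
H = np.ones((n, n))                     # H(x, xi); only xi <= x is used
F = np.ones(n)

phi = np.empty(n)
phi[0] = F[0]
for i in range(1, n):
    # trapezoid estimate of int_0^{x_i} H(x_i, xi) phi(xi) dxi; phi[i]
    # appears in the integral with weight h/2 and is solved for directly
    s = h * (0.5 * H[i, 0] * phi[0] + np.dot(H[i, 1:i], phi[1:i]))
    phi[i] = (F[i] + s) / (1.0 - 0.5 * h * H[i, i])
print(np.max(np.abs(phi - np.exp(x))) < 1e-4)
```

No iteration is needed: unlike the Fredholm case, each unknown value depends only on already-computed ones.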
(b) The following example illustrates a method of dealing with a class of Volterra-type integral equations having unbounded kernels. To solve

$\int_a^x \frac{\Phi(\xi)}{(x - \xi)^\alpha}\,d\xi = F(x)$   (0 < α < 1)   (15.3-35)

multiply both sides by $(y - x)^{\alpha - 1}$ and integrate with respect to x from a to y; the resulting integral equation has a bounded kernel. It follows that

$\Phi(y) = \frac{\sin \alpha\pi}{\pi}\frac{d}{dy}\int_a^y \frac{F(x)}{(y - x)^{1-\alpha}}\,dx$

In the special case α = ½, Eq. (35) is known as Abel's integral equation.

15.4. LINEAR BOUNDARY-VALUE PROBLEMS AND EIGENVALUE
PROBLEMS INVOLVING DIFFERENTIAL EQUATIONS
15.4-1. Linear Boundary-value Problems: Problem Statement and Notation (see also Secs. 9.3-1, 10.3-4, 15.1-2, and 15.2-7). (a) A linear homogeneous function of a function $\Phi(x)$ and its derivatives will be written as the product $\mathsf{L}\Phi(x)$ of $\Phi(x)$ and a linear differential operator $\mathsf{L}$, such as $d/dx$ or $-\nabla^2 - q(x^1, x^2, x^3)$. One desires to find unknown functions $\Phi(x)$ which satisfy a linear differential equation

$\mathsf{L}\Phi(x) = f(x)$   (x in V)   (15.4-1a)

throughout a given open interval or region V of points (x), subject to a set of N linear boundary conditions

$\mathsf{B}_i\Phi(x) = b_i(x)$   (i = 1, 2, . . . , N; x in S)   (15.4-1b)

to be satisfied on the boundary S of V; each $\mathsf{B}_i\Phi(x)$ is to be a linear homogeneous function of $\Phi(x)$ and its derivatives.
(b) Notation. Volume Integrals and Inner Products (see also Secs. 4.6-12, 15.1-2, 15.2-1, and 16.10-10). In the one-dimensional case, x is a real variable, Eq. (1a) is an ordinary differential equation, and V is a bounded or unbounded open interval whose end points x = a, x = b constitute the boundary S.
In the n-dimensional case, x stands for a "point" $(x^1, x^2, \ldots, x^n)$ in the n-dimensional space, and Eq. (1a) is a partial differential equation. One assumes the possibility of introducing a volume element

$dV(x) \equiv \sqrt{|g(x^1, x^2, \ldots, x^n)|}\,dx^1\,dx^2 \cdots dx^n \equiv \sqrt{|g(x)|}\,dx$

in the manner of Secs. 6.2-3b and 16.10-10; $\sqrt{|g(x)|}$ is to be real and positive throughout V. Then $\int_V \varphi(\xi)\,dV(\xi)$ is an n-dimensional volume
integral over V, and one can define the inner product of two functions u(x), v(x) on V (Sec. 15.2-1) as

$(u, v) \equiv \int_V \gamma(\xi)u^*(\xi)v(\xi)\,d\xi \equiv \int_V u^*(\xi)v(\xi)\,dV(\xi)$   (15.4-2)

Note that $\gamma(x) = \sqrt{|g(x)|}$ depends on the coordinate system.
One may similarly assume the existence of an appropriate surface element dA(x) (x in S) on the boundary hypersurface S; $\int_S \varphi(\xi)\,dA(\xi)$ is, then, a surface integral in the n-dimensional space (see also Secs. 4.6-12 and 6.4-3b).
In particular, for n = 3, V is a bounded or unbounded open region of space with boundary surface S; S is to be a regular surface (Sec. 3.1-14). For n = 2, V is usually a plane region with (regular) boundary curve S.
15.4-2. Complementary Homogeneous Differential Equation and Boundary Conditions for a Linear Boundary-value Problem. Superposition Theorems (see also Secs. 9.3-1, 10.4-2, 14.3-1, and 15.2-7). With each linear boundary-value problem (1), one can associate a unique homogeneous complementary or reduced differential equation

$\mathsf{L}\Phi(x) = 0$   (x in V)   (15.4-3a)

and a unique ordered set of homogeneous "complementary boundary conditions"

$\mathsf{B}_i\Phi(x) = 0$   (i = 1, 2, . . . , N; x in S)   (15.4-3b)

The following important theorems may be applied in turn to relate solutions $\Phi(x)$ of a given linear boundary-value problem (1) to functions satisfying simpler relations of the form (3).
1. The solution of every linear boundary-value problem (1) can be reduced to that of a linear boundary-value problem involving the same operator $\mathsf{L}$ and the homogeneous boundary conditions (3b) (see also Sec. 15.5-4).*
2. The most general function satisfying a linear differential equation (1a) can be written as the sum of a particular solution of Eq. (1a) and the most general solution of the complementary equation (3a).

* Write $\Phi(x) = \bar\Phi(x) + v(x)$, where v(x) is a conveniently chosen function satisfying the given boundary conditions (1b). Then $\bar\Phi(x)$ is the solution of the linear boundary-value problem

$\mathsf{L}\bar\Phi(x) = f(x) - \mathsf{L}v(x)$   (x in V)
$\mathsf{B}_i\bar\Phi(x) = 0$   (i = 1, 2, . . . , N; x in S)   (15.4-4)

$\mathsf{L}v(x)$ is determined by the choice of v(x) but may contain δ-function terms; it is often possible to choose v(x) so that $\mathsf{L}v(x) \equiv 0$ (see also Sec. 10.4-2).
3. A homogeneous linear differential equation is satisfied by every linear combination of solutions.
4. Let $\Phi_1(x)$ and $\Phi_2(x)$ satisfy the respective linear differential equations $\mathsf{L}\Phi_1(x) = f_1(x)$ and $\mathsf{L}\Phi_2(x) = f_2(x)$, subject to identical homogeneous linear boundary conditions $\mathsf{B}_i\Phi(x) = 0$. Then $\alpha\Phi_1(x) + \beta\Phi_2(x)$ satisfies $\mathsf{L}\Phi(x) = \alpha f_1(x) + \beta f_2(x)$ subject to the same boundary conditions.
5. Let $\Phi_1(x)$ and $\Phi_2(x)$ satisfy the homogeneous linear differential equation $\mathsf{L}\Phi(x) = 0$ subject to the respective linear boundary conditions $\mathsf{B}_i\Phi_1(x) = b_{1i}(x)$ and $\mathsf{B}_i\Phi_2(x) = b_{2i}(x)$. Then $\alpha\Phi_1(x) + \beta\Phi_2(x)$ satisfies the given differential equation and the boundary conditions $\mathsf{B}_i\Phi(x) = \alpha b_{1i}(x) + \beta b_{2i}(x)$.
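The reduction to homogeneous boundary conditions can be sketched for a simple one-dimensional case (finite differences, arbitrary test data): subtract a function v(x) carrying the boundary values with $\mathsf{L}v \equiv 0$, solve the homogeneous-boundary problem, and add v back:

```python
import numpy as np

# Reduction to homogeneous boundary conditions (footnote to theorem 1):
# L = -d^2/dx^2 on (0, 1) with u(0) = 2, u(1) = 5.  The linear function
# v(x) = 2 + 3x satisfies Lv = 0 and carries the boundary values; the
# remainder ubar solves L ubar = f with ubar = 0 at both ends.
# f is chosen so that the exact solution is u = 2 + 3x + sin(pi x).
n = 200
x = np.linspace(0.0, 1.0, n + 1)
h = x[1] - x[0]
f = np.pi**2 * np.sin(np.pi * x)

v = 2.0 + 3.0 * x
# second-difference matrix for -d^2/dx^2 with zero boundary values
A = (2 * np.eye(n - 1) - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1)) / h**2
ubar = np.zeros(n + 1)
ubar[1:-1] = np.linalg.solve(A, f[1:-1])
u = ubar + v
print(np.max(np.abs(u - (2.0 + 3.0 * x + np.sin(np.pi * x)))) < 1e-3)
```

Here the convenient choice Lv = 0 means the right side of the reduced problem is unchanged, exactly as the footnote anticipates.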
15.4-3. Hermitian-conjugate and Adjoint Boundary-value Problems. Hermitian Operators (see also Secs. 14.4-3, 14.4-4, and 15.3-1b). (a) Given a homogeneous linear boundary-value problem (3) and the definition (2) of inner products, one defines the hermitian-conjugate boundary-value problem

$\mathsf{L}^\dagger\chi(x) = 0$   (x in V)   (15.4-5a)
$\mathsf{B}_i^\dagger\chi(x) = 0$   (i = 1, 2, . . . , N; x in S)   (15.4-5b)

by the condition

$(v, \mathsf{L}u) - (\mathsf{L}^\dagger v, u) = \int_V [v^*\mathsf{L}u - u(\mathsf{L}^\dagger v)^*]\,dV = 0$   (15.4-6)

where u = u(x), v = v(x) is any pair of suitably differentiable functions such that u(x) satisfies the given boundary conditions (3b), and v(x) satisfies the hermitian-conjugate boundary conditions (5b). u and v (or v*) may then be said to represent vectors in adjoint vector spaces (Sec. 14.4-9).
Two linear second-order differential operators $\mathsf{L}$ and $\mathsf{L}^\dagger$ will be called hermitian-conjugate operators if and only if the function $v^*\mathsf{L}u - u(\mathsf{L}^\dagger v)^*$ has the form of an n-dimensional divergence $\frac{1}{\sqrt{|g(x)|}}\sum_{k=1}^{n} \frac{\partial}{\partial x^k}\left(\sqrt{|g(x)|}\,P^k\right)$ (Table 16.10-1), where each $P^k$ can be a function of u, v, u*, v* and their first-order derivatives.* It is then possible to express the volume integral $\int_V [v^*\mathsf{L}u - u(\mathsf{L}^\dagger v)^*]\,dV$ as an integral over the boundary S; formulas of this type are known as generalized Green's formulas (see also Sec. 15.4-3c). Suitable sets of hermitian-conjugate boundary conditions $\mathsf{B}_iu = 0$, $\mathsf{B}_i^\dagger v = 0$ (x in S) are now defined by the fact that the boundary integral vanishes, so that Eq. (6) holds.
In the case of real functions and operators, hermitian-conjugate boundary-value problems, operators, and boundary conditions are usually known as mutually adjoint.

* The k's are superscripts, not exponents, as in Chaps. 6 and 16.
(b) Hermitian Operators. A differential operator L is hermitian (self-adjoint) if and only if
$(v, Lu) - (Lv, u) = \int_V [v^*Lu - u(Lv)^*]\,dV = 0$  (15.4-7)
for every pair of suitably differentiable functions u = u(x), v = v(x) which satisfy identical homogeneous boundary conditions defining a linear manifold of functions.
Hermitian-conjugate boundary-value problems with hermitian operators have identical boundary conditions and are thus identical (self-adjoint boundary-value problems).
(c) Special Cases. Real Sturm-Liouville Operators and Generalization of Green's Theorem (see also Secs. 5.6-1 and 15.5-4). In the one-dimensional case, Eqs. (3a) and (5a) are ordinary linear differential equations subject to given linear boundary conditions at the end points of an interval (a, b) = V. If inner products are defined by $(u, v) = \int_a^b u^*v\,dx$ (Sec. 15.2-1a), one has, for real second-order differential operators,
$Lu \equiv a_0(x)\frac{d^2u}{dx^2} + a_1(x)\frac{du}{dx} + a_2(x)u$
$L^\dagger v \equiv \frac{d^2}{dx^2}[a_0(x)v] - \frac{d}{dx}[a_1(x)v] + a_2(x)v$  (15.4-8)
$vLu - uL^\dagger v = \frac{d}{dx}P(x) \qquad P(x) = a_0(x)(vu' - uv') + [a_1(x) - a_0'(x)]uv$
P(x) is sometimes called the conjunct of u(x) and v(x) with respect to the operator L. The condition a₁(x) ≡ a₀′(x) makes L a self-adjoint Sturm-Liouville operator
$Lu \equiv \frac{d}{dx}\left[p(x)\frac{du}{dx}\right] + q(x)u$  (15.4-9)
(see also Sec. 15.4-8), and partial integration yields the generalized Green's formula
$\int_a^b (vLu - uLv)\,dx = \Big[p(x)(vu' - uv')\Big]_a^b$  (15.4-10)
In the three-dimensional case, define inner products by Eq. (2), and define the operator ∇² in the manner of Table 6.4-1 or 16.10-1. Then for any differentiable real function q = q(x¹, x², x³), the real differential operator
$L \equiv -\nabla^2 - q(x^1, x^2, x^3)$  (15.4-11)
is self-adjoint and satisfies the generalized Green's formula
$\int_V (vLu - uLv)\,dV = -\oint_S (v\nabla u - u\nabla v)\cdot d\mathbf{A}$  (15.4-12)
(see also Table 5.6-1). An analogous formula may be written for the two-dimensional case.
15.4-4. The Fredholm Alternative Theorem (see also Secs. 14.8-10, 15.3-7a, and 15.4-12). The linear boundary-value problem defined by the differential equation
$L\Phi(x) = f(x) \qquad (x \text{ in } V)$  (15.4-13a)
with homogeneous boundary conditions
$B_i\Phi(x) = 0 \qquad (i = 1, 2, \ldots, N;\ x \text{ in } S)$  (15.4-13b)
cannot have a unique solution Φ(x) if the hermitian-conjugate (adjoint) homogeneous boundary-value problem (5) has a solution χ(x) other than χ(x) ≡ 0. In this case, the given problem (13) can be solved only if f(x) is orthogonal to every χ(x), i.e., if Eq. (5) implies
$(\chi, f) = \int_V \chi^* f\,dV = 0$  (15.4-14)*
If this condition is satisfied, then the given problem (13) has either no solution or an infinite number of solutions. Note: In many applications, L is a hermitian operator, and the hermitian-conjugate problem (5) is identical with the given problem (13).
15.4-5. Eigenvalue Problems Involving Linear Differential Equations (see also Secs. 10.4-2c, 14.8-3, and 15.1-1). (a) For a given set of linear homogeneous boundary conditions, an eigenfunction (proper function, characteristic function) of the linear differential operator L is a solution ψ(x), not identically zero in V, of the differential equation
$L\psi(x) = \lambda\psi(x) \qquad (x \text{ in } V)$  (15.4-15)
where λ is a suitably determined constant called the eigenvalue (proper value, characteristic value) of L associated with the eigenfunction ψ(x).
(b) More general eigenvalue problems require one to find eigenfunctions ψ(x) ≢ 0 and eigenvalues λ satisfying a linear differential equation
$L\psi(x) = \lambda B(x)\psi(x) \qquad (x \text{ in } V)$  (15.4-16)
* In the one-dimensional case, dV is simply identical with dx.
subject to given linear homogeneous boundary conditions; B(x) is to be real and positive throughout V.
(c) If ψ(x) is an eigenfunction associated with the eigenvalue λ, the same is true for every function αψ(x) ≢ 0. If ψ₁(x), ψ₂(x), . . . , ψ_s(x) are eigenfunctions associated with λ, the same is true for every function α₁ψ₁ + α₂ψ₂ + · · · + α_sψ_s ≢ 0.
This theorem also applies to uniformly convergent infinite series of eigenfunctions.
Eigenfunctions associated with different eigenvalues are linearly independent (Secs. 1.9-3 and 9.3-2). An eigenvalue λ associated with exactly m > 1 linearly independent eigenfunctions is said to be m-fold degenerate; m is the degree of degeneracy.
(d) The Spectrum of a Linear Eigenvalue Problem. Continuous Spectra and Improper Eigenfunctions (see also Secs. 14.8-3d and 15.4-12). For a given set of homogeneous linear boundary conditions and B(x) > 0, the spectrum of a linear eigenvalue problem (16) is the set of all complex numbers λ such that the differential equation
$L\Phi(x) - \lambda B(x)\Phi(x) = f(x) \qquad (x \text{ in } V)$  (15.4-17)
with a given normalizable "forcing function" f(x) does not have a unique normalizable solution Φ(x) subject to the given boundary conditions. The spectrum may comprise continuous and residual spectra (Sec. 14.8-3d) as well as a discrete spectrum defined by Eq. (16) with normalizable eigenfunctions ψ(x). In the special case B(x) ≡ 1, one speaks of the spectrum of the differential operator L. If L is a hermitian operator and B(x) > 0, both the discrete and the continuous
spectrum of the eigenvalue problem (16) are included in the approximate spectrum (Sec. 14.8-3d). One can often obtain the approximate spectrum by approximating the given eigenvalue problem with a sequence of eigenvalue problems yielding purely discrete spectra. In the course of such a limit process, the discrete eigenfunctions are replaced by a set of functions labeled by the continuously variable spectral parameter λ; such functions are known as improper eigenfunctions and will satisfy Eq. (16).
EXAMPLE: The ordinary differential equation $\frac{d^2\Phi}{dx^2} = -\lambda\Phi(x)$ with V = (0, ∞) and the boundary conditions
Φ(0) = 0,  $\int_0^\infty |\Phi(x)|^2\,dx$ exists
yields the continuous spectrum 0 < λ < ∞. This spectrum is approximated by the discrete spectrum $\lambda = \left(\frac{k\pi}{a}\right)^2$ (k = 1, 2, . . .) of the eigenvalue problem
$\frac{d^2\Phi}{dx^2} = -\lambda\Phi(x)$ with V = (0, a), Φ(0) = Φ(a) = 0
as a → ∞. The eigenfunctions $\sin\frac{k\pi}{a}x$ (k = 1, 2, . . .) of the latter problem approximate the improper eigenfunctions $\sin(\sqrt{\lambda}\,x)$ as a → ∞ (transition from Fourier series to Fourier integral).
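The limit process in this example is easy to tabulate numerically. The following sketch (Python/NumPy; the helper name `eigenvalues` is ours) lists the discrete eigenvalues λₖ = (kπ/a)² for growing interval lengths a and shows the spacing between successive eigenvalues shrinking toward zero, i.e., toward the continuous spectrum:

```python
import numpy as np

# Discrete spectrum lambda_k = (k*pi/a)^2 of psi'' = -lambda*psi with
# psi(0) = psi(a) = 0, for increasing interval length a: as a grows,
# the eigenvalues crowd together and fill out the continuous spectrum
# 0 < lambda < infinity of the half-line problem.
def eigenvalues(a, kmax):
    k = np.arange(1, kmax + 1)
    return (k * np.pi / a) ** 2

for a in (1.0, 10.0, 100.0):
    lam = eigenvalues(a, 3)
    print(f"a = {a:6.1f}: lowest eigenvalues {lam}, spacing {lam[1] - lam[0]:.6g}")
```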
Similarly, the quantum-mechanical wave functions and energy levels of a particle in free space are approximated by those for particles confined to increasingly larger boxes.
15.4-6. Eigenvalues and Eigenfunctions of Hermitian Eigenvalue Problems. Complete Orthonormal Sets of Eigenfunctions (see also Secs. 14.8-4, 14.8-7, 15.2-4, 15.3-3, and 15.4-3b). (a) If the operator L in Eq. (15) is hermitian, then
1. All spectral values λ are real.
2. Normalized eigenfunctions ψᵢ, ψₖ corresponding to different eigenvalues are mutually orthogonal in the sense that
$(\psi_i, \psi_k) = \int_V \psi_i^*\psi_k\,dV = 0 \qquad (i \ne k)$
If L is a hermitian operator having a purely discrete spectrum, the following expansion theorem holds:
3. There exists an orthonormal set of eigenfunctions ψ₁(x), ψ₂(x), . . . permitting a series expansion
$f(x) = a_1\psi_1(x) + a_2\psi_2(x) + \cdots \;(\text{in mean}) \qquad \left(a_k = \int_V \psi_k^* f\,dV,\ k = 1, 2, \ldots\right)$  (15.4-18a)
of every quadratically integrable function f(x).
(b) The same theorems apply to the "generalized" eigenvalue problem (16), where L is hermitian and B(x) > 0, provided that one redefines orthogonality and normalization in terms of the new inner product*
$(u, v)_B = \int_V u^*vB\,dV$
Thus Eq. (18a) is replaced by the more general expansion
$f(x) = \sum_k a_k\psi_k(x) \;(\text{in mean}) \qquad \left(a_k = \int_V \psi_k^* f B\,dV,\ k = 1, 2, \ldots\right)$
with
$\int_V \psi_i^*\psi_k B\,dV = \delta_k^i = \begin{cases}1 & (i = k)\\ 0 & (i \ne k)\end{cases} \qquad (i, k = 1, 2, \ldots)$  (15.4-19a)
These relations apply to the simpler eigenvalue problem (15) as a special case.
*In the one-dimensional case, dV is, again, simply identical with dx.
(c) For a hermitian eigenvalue problem with a (necessarily real) continuous spectrum admitting improper eigenfunctions (Sec. 15.4-5d), there exists a set of improper eigenfunctions ψ(x, λ) such that
$\int_V \psi^*(\xi, \lambda)\psi(\xi, \lambda')B(\xi)\,dV(\xi) = \delta(\lambda - \lambda')$  (15.4-19b)
15.4-7. Hermitian Eigenvalue Problems as Variation Problems (see also Secs. 11.7-1 to 11.7-3, 14.8-8, 15.3-6b, 15.4-8b, 15.4-9b, and 15.4-10). (a) The eigenvalue problem (16) for a hermitian differential operator L with discrete eigenvalues λ₁, λ₂, . . . is equivalent to each of the following variation problems:
1. Find functions ψ(x) ≢ 0 in V which satisfy the given boundary conditions and reduce the variation (Secs. 11.5-1 and 11.5-2) of
$\frac{(\psi, L\psi)}{(\psi, B\psi)} = \frac{\int_V \psi^*L\psi\,dV}{\int_V |\psi|^2B\,dV}$  (Rayleigh's quotient)  (15.4-20)
to zero.
2. Find functions ψ(x) which satisfy the given boundary conditions and reduce the variation of
$(\psi, L\psi) = \int_V \psi^*L\psi\,dV$  (15.4-21)
to zero subject to the constraint
$(\psi, B\psi) = \int_V |\psi|^2B\,dV = 1$  (15.4-22)
In each case, ψ = ψₖ(x) yields the stationary value λₖ.
It is, then, possible to utilize the direct methods of the calculus of variations, notably the Rayleigh-Ritz method (Sec. 11.7-2), for the solution of eigenvalue problems involving ordinary or partial differential equations.
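A minimal Rayleigh-Ritz sketch (Python/NumPy; the polynomial trial functions are our choice, not prescribed by the text): for −ψ″ = λψ with ψ(0) = ψ(1) = 0, the stationary values of Rayleigh's quotient over a finite trial space reduce to a small matrix eigenvalue problem, and the smallest root approximates λ₁ = π² from above:

```python
import numpy as np

# Rayleigh-Ritz sketch for -psi'' = lambda*psi, psi(0) = psi(1) = 0,
# whose exact smallest eigenvalue is pi^2 = 9.8696...
# Trial functions phi_j(x) = x^j (1 - x), j = 1..n (our choice), satisfy
# the boundary conditions; stationary values of Rayleigh's quotient are
# the eigenvalues of the generalized matrix problem A c = lambda M c.
x = np.linspace(0.0, 1.0, 2001)

def phi(j):
    f = x ** j * (1 - x)                        # trial function
    df = j * x ** (j - 1) * (1 - x) - x ** j    # its derivative
    return f, df

n = 3
A = np.zeros((n, n))   # "stiffness" integrals of phi_i' phi_j'
M = np.zeros((n, n))   # "mass" integrals of phi_i phi_j
for i in range(1, n + 1):
    fi, dfi = phi(i)
    for j in range(1, n + 1):
        fj, dfj = phi(j)
        A[i - 1, j - 1] = np.trapz(dfi * dfj, x)
        M[i - 1, j - 1] = np.trapz(fi * fj, x)

ritz = np.sort(np.linalg.eigvals(np.linalg.solve(M, A)).real)
print("Rayleigh-Ritz estimate of lambda_1:", ritz[0])   # close to pi^2
```

By the minimum property of Sec. 15.4-7b, each Ritz estimate lies above the true eigenvalue (apart from quadrature error).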
(b) Given a hermitian operator L with a discrete spectrum which includes at most a finite number of negative eigenvalues (Secs. 15.4-8 and 15.4-9), let the eigenvalues be arranged in increasing order, with an m-fold degenerate eigenvalue repeated m times, so that λ₁ ≤ λ₂ ≤ · · · . The smallest eigenvalue λ₁ is the minimum value of Rayleigh's quotient (20) for an arbitrary "admissible" function ψ(x), i.e., for an arbitrary normalizable ψ(x) which satisfies the given boundary conditions and such that Rayleigh's quotient exists.
Similarly, the rth eigenvalue λᵣ in the above sequence will not exceed Rayleigh's quotient for any "admissible" function ψ(x) which satisfies the orthogonality conditions
$(\psi, B\psi_k) = \int_V \psi^*\psi_k B\,dV = 0$  (15.4-23)
for all eigenfunctions ψₖ associated with λ₁, λ₂, . . . , λ_{r−1} (Courant's Minimax Principle).
15.4-8. One-dimensional Eigenvalue Problems of the Sturm-Liouville Type. (a) Consider a one-dimensional real variable x, and let V be the bounded interval (a, b). Then the most general real hermitian second-order differential operator L has the form (9). The real differential equation
$L\psi \equiv -\left\{\frac{d}{dx}\left[p(x)\frac{d\psi}{dx}\right] + q(x)\psi\right\} \equiv -\left[p(x)\frac{d^2\psi}{dx^2} + p'(x)\frac{d\psi}{dx} + q(x)\psi\right] = \lambda B(x)\psi$
(homogeneous Sturm-Liouville differential equation)  (15.4-24a)
defines a self-adjoint eigenvalue problem if one adds either of the linear homogeneous boundary conditions
$B_1\psi \equiv \alpha_1\psi'(a) + \beta_1\psi(a) = 0 \qquad B_2\psi \equiv \alpha_2\psi'(b) + \beta_2\psi(b) = 0$  (15.4-24b)
or
$B_1\psi \equiv \psi(a) - \psi(b) = 0 \qquad B_2\psi \equiv \psi'(a) - \psi'(b) = 0$  (periodicity conditions)  (15.4-24c)
It will be assumed that p(x), q(x), and B(x) are differentiable in [a, b], that p(x) as well as B(x) is positive in [a, b], and that no eigenfunction corresponding to λ = 0 satisfies the given boundary conditions. These assumptions ensure that the eigenvalues λ are discrete and positive; they are nondegenerate for the boundary conditions (24b). If the eigenvalues are arranged in increasing order λ₁ < λ₂ < · · · , then λₙ is asymptotically proportional to n² as n → ∞ (Sec. 4.4-3). Various simplifying transformations and methods for solving homogeneous differential equations of the type (24) are treated in Secs. 9.3-4 to 9.3-10; see also Secs. 21.7-1 to 21.8-12. Eigenvalue problems of the Sturm-Liouville type are met, in particular, upon separation of variables in boundary-value problems involving linear partial differential equations (Secs. 10.4-3 to 10.4-9) and are of great importance in quantum mechanics.
Note that every differential equation of the form
$a_0(x)\frac{d^2\psi}{dx^2} + a_1(x)\frac{d\psi}{dx} + a_2(x)\psi = \lambda B(x)\psi$  (15.4-25)
can be reduced to the form (24a) through multiplication by
$\exp\int \frac{a_1(x) - a_0'(x)}{a_0(x)}\,dx$
(see also Sec. 9.3-8a). Hence the examples of Secs. 10.4-3 to 10.4-9 may be treated as Sturm-Liouville problems.
(b) The generalized Green's formula (10) applies; if one writes q(x) = q₁′(x) − q₂(x),
$\int_a^b uLu\,dx = \int_a^b (pu'^2 + 2q_1uu' + q_2u^2)\,dx - \Big[puu' + q_1u^2\Big]_a^b$
$\int_a^b vLu\,dx = \int_a^b [pu'v' + q_1(u'v + v'u) + q_2uv]\,dx - \Big[pvu' + q_1uv\Big]_a^b$  (15.4-26)
The integrals (26) have physical significance in many applications (see also Sec. 15.4-7).
(c) Generalizations and Related Problems. Related more general problems involve unbounded intervals V = (0, ∞) or V = (−∞, ∞) with boundary conditions which specify the asymptotic behavior of ψ(x) at infinity or simply demand normalizability. Again, one may admit singularities of p(x), q(x), and/or B(x) for x = a or x = b (Sec. 21.8-10). If the spectrum is purely discrete, λₙ is still asymptotically proportional to n² even if one admits singularities of p(x), q(x), or B(x) at the end points of the bounded interval (a, b).
Nonhomogeneous boundary-value problems involving the Sturm-Liouville operator L may be solved by the methods of Secs. 9.3-3, 9.3-4, 15.4-12, and 15.5-1.
15.4-9. Sturm-Liouville Problems Involving Second-order Partial Differential Equations (see also Sec. 15.4-3c). (a) Let x ≡ (x¹, x², . . . , xⁿ), and define inner products by Eq. (2) with dV = dx¹ dx² · · · dxⁿ.
Then the real partial differential equation
$L\psi \equiv -\sum_{k=1}^{n}\frac{\partial}{\partial x^k}\left(p\,\frac{\partial\psi}{\partial x^k}\right) - q\psi = \lambda B\psi$
(multidimensional homogeneous Sturm-Liouville equation)  (15.4-27)
with
p = p(x¹, x², . . . , xⁿ) > 0,  B = B(x¹, x², . . . , xⁿ) > 0,  q = q(x¹, x², . . . , xⁿ)
all differentiable on V and S, defines a self-adjoint eigenvalue problem for given homogeneous boundary conditions of the form $\alpha\,\partial\psi/\partial n + \beta\psi = 0$.
If the given region V is bounded, the resulting eigenvalue spectrum is discrete and includes at most a finite number of negative eigenvalues. If the eigenvalues are arranged in a sequence λ₁ ≤ λ₂ ≤ · · · , then λₙ → ∞ as n → ∞.
(b) In the three-dimensional case, the same theorem applies to the eigenvalues of the differential equation
$-\nabla^2\psi - q(x^1, x^2, x^3)\psi = \lambda B(x^1, x^2, x^3)\psi$
with boundary conditions of the form $\alpha\,\partial\psi/\partial n + \beta\psi = 0$, and B > 0 and q both differentiable on V and S.
The generalized Green's formula (12) applies, and
$\int_V uLu\,dV = \int_V [(\nabla u)^2 - qu^2]\,dV - \oint_S (u\nabla u)\cdot d\mathbf{A}$  (15.4-28)
The integral (28) has physical significance in many applications (see also Sec. 15.4-7 and Table 5.6-1).
15.4-10. Comparison Theorems (see also Secs. 5.6-1b, 14.8-9c, and 15.4-7). The following comparison theorems apply to ordinary or partial differential equations of the Sturm-Liouville types defined in Secs. 15.4-8 and 15.4-9. Given a differential equation (24a) or (27) and an interval or region V with boundary conditions (24b) or (24c) (β/α ≥ 0),
1. An increase in p, increase in q, and/or decrease in B will surely not decrease any eigenvalue λₖ; similarly, a decrease in p, decrease in q, and/or increase in B will not increase any eigenvalue λₖ.
2. A reduction in the size of the given interval or region V will never decrease any eigenvalue λₖ generated by the boundary conditions
ψ(a) = ψ(b) = 0  or  ψ(x¹, x², . . . , xⁿ) = 0 on S  (15.4-29a)
or any eigenvalue λₖ generated by the boundary conditions
ψ′(a) = ψ′(b) = 0  or  ∂ψ/∂n = 0 on S  (15.4-29b)
3. Each eigenvalue λₖ generated by the boundary conditions
$\alpha\psi'(a) + \beta\psi(a) = \alpha\psi'(b) + \beta\psi(b) = 0$  or  $\alpha\,\partial\psi/\partial n + \beta\psi = 0$ on S
with β/α ≥ 0 is a nondecreasing function of β/α; in particular, the Neumann condition (29b) will not yield larger eigenvalues than the Dirichlet condition (29a).
4. A modification of the eigenvalue problem imposing constraints (accessory conditions) on ψ will not decrease any eigenvalue λₖ.
The comparison theorems are of special interest in vibration theory (effects of mass, stiffness, and geometry on natural frequencies).
EXAMPLES (see also Sec. 10.4-9): ψ″(x) = −λψ(x) (vibrating string) and ∇²ψ(x, y, z) = −λψ(x, y, z) (vibrating membrane).
15.4-11. Perturbation Methods for the Solution of Discrete Eigenvalue Problems. Given the eigenvalues λᵢ and the orthonormal eigenfunctions ψᵢ of a hermitian eigenvalue problem defined by
$L\psi = \lambda\psi$  (15.4-30)
one desires to approximate the eigenvalues λ̄ and the eigenfunctions ψ̄ of the "perturbed" hermitian problem
$L\bar\psi + \epsilon L'\bar\psi = \bar\lambda\bar\psi$  (15.4-31)
with unchanged boundary conditions; εL′ψ̄ is to be a small perturbation term (ε ≪ 1).
(a) Nondegenerate Case. Corresponding to each nondegenerate eigenvalue λᵢ of the unperturbed problem, one has
$\bar\lambda_i \approx \lambda_i + \epsilon L'_{ii} \qquad \bar\psi_i \approx \psi_i + \epsilon\sum_{k \ne i}\frac{L'_{ki}}{\lambda_i - \lambda_k}\,\psi_k \qquad (i = 1, 2, \ldots)$  (15.4-32)
where
$L'_{ik} = \int_V \psi_i^* L'\psi_k\,dV \qquad (i, k = 1, 2, \ldots)$  (15.4-33)
(b) Degenerate Case. For each m-fold degenerate unperturbed eigenvalue λⱼ with the m eigenfunctions ψ₁, ψ₂, . . . , ψₘ, one may have up to m distinct eigenvalues λ̄ of the perturbed operator. The corresponding values of Δλ = (λ̄ − λⱼ)/ε are approximated by the roots of the mth-degree secular equation
$\begin{vmatrix} L'_{11} - \Delta\lambda & L'_{12} & \cdots & L'_{1m}\\ L'_{21} & L'_{22} - \Delta\lambda & \cdots & L'_{2m}\\ \vdots & \vdots & & \vdots\\ L'_{m1} & L'_{m2} & \cdots & L'_{mm} - \Delta\lambda \end{vmatrix} = 0$  (15.4-34)
and may or may not coincide; i.e., a perturbation may remove degeneracies. The eigenfunction or eigenfunctions corresponding to each value of λ̄ = λⱼ + εΔλ are approximated by $\bar\psi = \sum_{k=1}^{m} a_k\psi_k$, where the aₖ are obtained from
$\sum_{k=1}^{m} a_k(L'_{ik} - \delta_k^i\,\Delta\lambda) = 0 \qquad (i = 1, 2, \ldots, m)$  (15.4-35)
Note that no eigenfunctions other than the m eigenfunctions of the degenerate eigenvalue λⱼ affect the approximations to λ̄ and ψ̄ given for the degenerate case; the approximation given for ψ̄ is a "zero-order" approximation not proportional to ε. See Ref. 15.18 for higher-order effects.
(c) See Refs. 15.10 and 15.11 for a treatment of continuous spectra.
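The first-order formula (32) has an exact finite-dimensional analogue that is easy to test; in the following Python/NumPy sketch, random symmetric matrices stand in for L and L′ (an illustration of the principle, not the book's operators):

```python
import numpy as np

# Matrix analogue of the first-order perturbation formula (15.4-32):
# for hermitian L with nondegenerate eigenvalues lambda_i and
# orthonormal eigenvectors psi_i, the eigenvalues of L + eps*Lp are
# approximately lambda_i + eps * (psi_i^T Lp psi_i).
rng = np.random.default_rng(0)
n, eps = 6, 1e-3
A = rng.standard_normal((n, n)); L = (A + A.T) / 2     # "L"  (hermitian)
B = rng.standard_normal((n, n)); Lp = (B + B.T) / 2    # "L'" (perturbation)
lam, psi = np.linalg.eigh(L)          # columns of psi are eigenvectors

first_order = lam + eps * np.einsum('ki,kl,li->i', psi, Lp, psi)
exact = np.linalg.eigvalsh(L + eps * Lp)
print("max first-order error:", np.max(np.abs(first_order - exact)))  # O(eps^2)
```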
15.4-12. Solution of Boundary-value Problems by Eigenfunction Expansions (see also Secs. 10.4-2c, 14.8-10, 15.4-4, and 15.5-2; refer to Sec. 10.4-9 for examples). (a) A very important class of physical boundary-value problems (e.g., elastic vibrations, electromagnetic theory) involves a real linear ordinary or partial differential equation
$L\Phi(x) - \lambda B(x)\Phi(x) = f(x) \qquad (x \text{ in } V)$  (15.4-36)
subject to given homogeneous boundary conditions, where L is a hermitian operator (Sec. 15.4-3b), and B(x) > 0 in V. If f(x) ≡ 0 in V (no applied forces, currents, etc.), then Eq. (36) reduces to the complementary homogeneous differential equation
$L\psi(x) = \lambda B(x)\psi(x) \qquad (x \text{ in } V)$  (15.4-37)
which can be satisfied only by proper or improper eigenfunctions ψ(x) with spectral values λ. In the nonhomogeneous case (36) (e.g., forced oscillations), λ will be a given parameter.
Consider first that the eigenvalue problem (37) has a purely discrete spectrum of (not necessarily distinct) eigenvalues λ₁, λ₂, . . . associated with corresponding orthonormal eigenfunctions ψ₁(x), ψ₂(x), . . . (Sec. 15.4-6b). Assuming that the "forcing function" can be expanded in the form
$f(x) = B(x)\sum_{k=1}^{\infty} f_k\psi_k(x) \qquad \left(f_k = \int_V \psi_k^* f\,dV,\ k = 1, 2, \ldots\right)$  (15.4-38)
for almost all x in V, the desired solution of Eq. (36) is given by the "normal-mode expansion"
$\Phi(x) = \sum_{k=1}^{\infty}\frac{f_k}{\lambda_k - \lambda}\,\psi_k(x) \;(\text{in mean})$  (15.4-39)
The series (39) defines the solution Φ(x) uniquely at every point of continuity in V, provided that the given parameter λ does not equal an eigenvalue λₖ (resonance!). In the latter case, a solution exists only if f(x) is orthogonal to all eigenfunctions associated with λₖ, so that fₖ = 0. One has then an infinite number of solutions given by the series (39) plus any linear combination of eigenfunctions associated with λₖ.
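A discrete sketch of the normal-mode expansion (39) (Python/NumPy, with B ≡ 1 and a finite-difference operator of our own choosing):

```python
import numpy as np

# Discrete normal-mode solution of (L - lambda*I) Phi = f, using the
# eigenvalues lambda_k and orthonormal eigenvectors psi_k of a
# finite-difference L = -d^2/dx^2 on (0, pi), Phi(0) = Phi(pi) = 0.
N = 200
h = np.pi / (N + 1)
L = (np.diag(2.0 * np.ones(N))
     - np.diag(np.ones(N - 1), 1)
     - np.diag(np.ones(N - 1), -1)) / h ** 2
lam_k, psi = np.linalg.eigh(L)

x = np.linspace(h, np.pi - h, N)
f = np.sin(2 * x) + 0.3 * np.sin(5 * x)    # forcing function
lam = 2.5                                  # given parameter (not an eigenvalue)

f_k = psi.T @ f                            # coefficients f_k = (psi_k, f)
Phi = psi @ (f_k / (lam_k - lam))          # sum_k f_k/(lambda_k - lam) psi_k
residual = np.max(np.abs((L - lam * np.eye(N)) @ Phi - f))
print("max residual:", residual)
```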
(b) If the eigenvalue problem (37) has a purely continuous spectrum D_λ with improper eigenfunctions ψ(x, λ) having the orthonormality property (19b) (e.g., Sturm-Liouville operators with singularities in V, or V unbounded), one may attempt to expand the solution Φ(x) as a "generalized Fourier integral" over the (necessarily real) spectrum D_λ. One has then, for almost all x in V,
$\Phi(x) = \int_{D_\Lambda}\frac{f(\Lambda)}{\Lambda - \lambda}\,\psi(x, \Lambda)\,d\Lambda \qquad f(\Lambda) = \int_V \psi^*(x, \Lambda)f(x)\,dV$  (15.4-40)
provided that D_Λ does not contain λ (see also Sec. 10.5-1, integral-transform methods).
If the eigenvalue problem (37) has both a discrete and a continuous spectrum, the solution will contain terms of the form (39) as well as an integral (40); both types of terms can be combined into a Stieltjes integral over the spectrum.
(c) The solution methods of Secs. 15.4-12a and b may apply even if the given operator L is not hermitian, provided that valid orthonormal-eigenfunction expansions can be shown to exist.
15.5. GREEN'S FUNCTIONS. RELATION OF BOUNDARY-VALUE PROBLEMS AND EIGENVALUE PROBLEMS TO INTEGRAL EQUATIONS
15.5-1. Green's Functions for Boundary-value Problems with Homogeneous Boundary Conditions (see also Secs. 9.3-3, 9.4-3, and 15.4-4; for examples, see Table 9.3-1, Sec. 15.6-6, and Sec. 15.6-9). (a) A linear boundary-value problem
$L\Phi(x) = f(x) \qquad (x \text{ in } V)$  (15.5-1a)
$B_i\Phi(x) = 0 \qquad (i = 1, 2, \ldots, N;\ x \text{ in } S)$  (15.5-1b)
expresses the given function f(x) as the result of a linear operation on an unknown function Φ(x) subject to the given boundary conditions. If it is possible to write the corresponding inverse operation as a linear integral transformation (Sec. 15.3-1)
$\Phi(x) = \int_V G(x, \xi)f(\xi)\,dV(\xi)$  (15.5-2)*
for every suitable f(x), the kernel G(x, ξ) is called the Green's function for the given boundary-value problem (1). G(x, ξ) must satisfy the given homogeneous boundary conditions (1b) together with
$LG(x, \xi) = 0 \quad (x, \xi \text{ in } V;\ x \ne \xi) \qquad \int_V LG(x, \xi)\,dV(\xi) = 1 \quad (x \text{ in } V)$  (15.5-3a)
or
$LG(x, \xi) = \delta(x, \xi) \qquad (x, \xi \text{ in } V)$  (15.5-3b)
where δ(x, ξ) is the delta function for the specific coordinate system used (Sec. 21.9-7). Equation (2) describes the solution Φ(x) of the given boundary-value problem (1) as a superposition of elementary solutions G(x, ξ)f(ξ) having singularities at x = ξ. These elementary solutions can be interpreted as effects of impulse forces, point charges, etc., f(ξ)δ(x, ξ) at x = ξ (Secs. 9.4-3, 15.5-4, 15.6-6, and 15.6-9). Green's functions can often be found directly by formal integration of the "symbolic differential equation" (3b) subject to the given boundary conditions by the methods of Secs. 9.4-5, 10.5-1, or 15.4-12. See Table 9.3-3, Sec. 15.6-6, and Sec. 15.6-9 for examples of Green's functions.
(b) Modified Green's Functions (see also Sec. 15.4-4). A given boundary-value problem (1) will not admit a Green's function satisfying Eq. (3) if the hermitian-conjugate homogeneous problem
$L^\dagger\chi(x) = 0 \quad (x \text{ in } V) \qquad B_i^\dagger\chi(x) = 0 \quad (x \text{ in } S)$  (15.5-4)
possesses a solution χ(x) other than χ(x) ≡ 0 (Sec. 15.4-4). In this case, it may still be possible to find a modified Green's function G(x, ξ) permitting the expansion (2) for suitable functions f(x) orthogonal to every χ(x). G(x, ξ) must satisfy the condition
* In the one-dimensional case, dV(ξ) ≡ dξ.
$LG(x, \xi) = \delta(x, \xi) - \sum_{k}\chi_{0k}^*(\xi)\,\chi_{0k}(x)$  (15.5-5)
together with the given boundary conditions (1b), where the χ₀ₖ(x) are any complete orthonormal set spanning all solutions of the problem (4). Any resulting solution (2) cannot possibly be unique (Sec. 15.4-4), but there exists at most one modified Green's function which satisfies the added orthogonality conditions
$\int_V \chi_{0k}^*(\xi)\,G(x, \xi)\,dV(\xi) = 0 \qquad (k = 1, 2, \ldots)$  (15.5-6)
If L is a hermitian operator, then the χ₀ₖ are simply its eigenfunctions for λ = 0.
(c) Green's functions for hermitian-conjugate boundary-value problems (Sec. 15.4-3a) are hermitian-conjugate kernels (Sec. 15.3-1b). For every Green's function G(x, ξ) associated with a hermitian operator L (Sec. 15.4-3b), G*(ξ, x) ≡ G(x, ξ).
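As a worked illustration (assuming the classical one-dimensional example, which is not stated at this point in the text): for −Φ″ = f with Φ(0) = Φ(1) = 0, the Green's function is G(x, ξ) = min(x, ξ)[1 − max(x, ξ)], and the superposition formula (2) can be checked numerically:

```python
import numpy as np

# Superposition formula Phi(x) = integral G(x, xi) f(xi) d(xi) for the
# model problem L Phi = -Phi'' = f, Phi(0) = Phi(1) = 0, whose Green's
# function is G(x, xi) = min(x, xi) * (1 - max(x, xi)).
# Forcing f = pi^2 sin(pi x) gives the exact solution sin(pi x).
def G(x, xi):
    return np.minimum(x, xi) * (1.0 - np.maximum(x, xi))

xi = np.linspace(0.0, 1.0, 4001)
f = np.pi ** 2 * np.sin(np.pi * xi)
xq = xi[::400]                              # a few evaluation points
Phi = np.array([np.trapz(G(x, xi) * f, xi) for x in xq])
err = np.max(np.abs(Phi - np.sin(np.pi * xq)))
print("max error:", err)                    # quadrature error only
```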
15.5-2. Relation of Boundary-value Problems and Eigenvalue Problems to Integral Equations. Green's Resolvent. (a) If there exists a Green's function G(x, ξ) for the boundary-value problem (1), Eq. (2) implies that the more general boundary-value problem
$L\Phi(x) - \lambda B(x)\Phi(x) = f(x) \quad (x \text{ in } V) \qquad B_i\Phi(x) = 0 \quad (i = 1, 2, \ldots, N;\ x \text{ in } S)$  (15.5-7)
is equivalent to the linear integral equation
$\Phi(x) - \lambda\int_V K(x, \xi)\Phi(\xi)\,d\xi = F(x)$
$K(x, \xi) \equiv G(x, \xi)B(\xi)\sqrt{|g(\xi)|} \qquad F(x) = \int_V G(x, \xi)f(\xi)\,dV(\xi)$  (15.5-8)
Note that the integral equation takes the given boundary conditions into account. In the one-dimensional case, dV(ξ) ≡ dξ, and |g(ξ)| = 1. The integral-equation representation brings the theory of Secs. 15.3-1 to 15.3-9 and the numerical methods of Sec. 20.9-10 to bear on linear boundary-value problems and eigenvalue problems. In particular, one may employ the Neumann series (15.3-23) and analytic continuation to introduce a resolvent kernel (Sec. 15.3-7c) Γ(x, ξ; λ), so that the solution appears in the form
$\Phi(x) = F(x) + \lambda\int_V \Gamma(x, \xi; \lambda)F(\xi)\,d\xi$  (15.5-9)
Γ(x, ξ; λ) is called Green's resolvent; it is the kernel of a linear integral transformation representing the resolvent operator (L − λ)⁻¹ (Sec. 14.8-3d). Γ(x, ξ; λ) can often be obtained by the methods of Sec. 15.3-8; but note carefully that K(x, ξ) is not necessarily a normalizable kernel. The set of singularities λ of Γ(x, ξ; λ) is precisely the spectrum of the operator L (Sec. 15.4-5d), which may include a continuous spectrum.
Specifically, the poles of Γ(x, ξ; λ) correspond to discrete eigenvalues of L, while branch points indicate the presence of a continuous spectrum (Ref. 15.2).
(b) Problems Involving Hermitian Operators. Eigenfunction Expansion of Green's Function (see also Secs. 15.3-3, 15.3-4, 15.4-6, and 15.4-12). In the important special cases where L is a hermitian operator and B(x) is real and positive in V, one may introduce a new unknown function
$\bar\Phi(x) = \Phi(x)\sqrt{B(x)}\,\sqrt[4]{|g(x)|}$  (15.5-10)
to replace Eq. (8) by a linear integral equation with hermitian kernel:
$\bar\Phi(x) - \lambda\int_V \bar K(x, \xi)\bar\Phi(\xi)\,d\xi = \bar F(x)$
$\bar K(x, \xi) \equiv G(x, \xi)\sqrt{B(x)B(\xi)}\,\sqrt[4]{|g(x)|\,|g(\xi)|}$
$\bar F(x) = \sqrt{B(x)}\,\sqrt[4]{|g(x)|}\int_V G(x, \xi)f(\xi)\,dV(\xi)$  (15.5-11)
The hermitian eigenvalue problems
$L\psi(x) = \lambda B(x)\psi(x) \quad (x \text{ in } V) \qquad B_i\psi(x) = 0 \quad (i = 1, 2, \ldots, N;\ x \text{ in } S)$  (15.5-12a)
and
$\bar\psi(x) = \lambda\int_V \bar K(x, \xi)\bar\psi(\xi)\,d\xi$  (15.5-12b)
yield identical spectra; corresponding eigenfunctions ψ(x) and ψ̄(x) are related by
$\bar\psi(x) = \psi(x)\sqrt{B(x)}\,\sqrt[4]{|g(x)|}$
If the eigenvalue spectrum is purely discrete, there exist complete sets of eigenfunctions ψₖ(x) and ψ̄ₖ(x) such that
$\int_V B\psi_i^*\psi_k\,dV = \int_V \bar\psi_i^*\bar\psi_k\,d\xi = \delta_k^i$  (15.5-13)
In this case, K̄(x, ξ) is a normalizable kernel, and
$G(x, \xi) = \sum_{k=1}^{\infty}\frac{\psi_k(x)\psi_k^*(\xi)}{\lambda_k} \;(\text{in mean})$  (15.5-14)
which does not depend explicitly on B(x) or g(x). In the case of Sturm-Liouville problems with purely discrete spectra (Secs. 15.4-8a and 15.4-9), the series (14) converges absolutely and uniformly in V (see also Mercer's theorem, Sec. 15.3-4).
(c) If the given boundary-value problem (7) is expressed in terms of differential invariants (Sec. 16.10-7), then G(x, ξ), K(x, ξ), and K̄(x, ξ) are point functions invariant with respect to changes in the coordinate system used.
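The bilinear expansion (14) has an exact matrix analogue: for a hermitian positive-definite matrix, summing ψₖψₖᵀ/λₖ over all eigenpairs reproduces the inverse. A Python/NumPy sketch (our finite-difference stand-in for L):

```python
import numpy as np

# Matrix check of the bilinear expansion: for hermitian positive-definite
# L with eigenpairs (lambda_k, psi_k),
#   G = sum_k psi_k psi_k^T / lambda_k = L^{-1},
# the discrete analogue of G(x, xi) = sum_k psi_k(x) psi_k^*(xi)/lambda_k.
N = 100
h = 1.0 / (N + 1)
L = (np.diag(2.0 * np.ones(N))
     - np.diag(np.ones(N - 1), 1)
     - np.diag(np.ones(N - 1), -1)) / h ** 2
lam, psi = np.linalg.eigh(L)
G = (psi / lam) @ psi.T        # sum over modes: psi_k psi_k^T / lambda_k
err = np.max(np.abs(G @ L - np.eye(N)))
print("max deviation of G L from identity:", err)
```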
15.5-3. Application of the Green's-function Method to an Initial-value Problem: The Generalized Diffusion Equation (see also Secs. 10.4-7, 10.5-3, and 10.5-4). It is desired to find the solution Φ = Φ(x, t) of the initial-value problem
$\nabla^2\Phi - a^2\frac{\partial\Phi}{\partial t} - b\Phi = f(x, t) \qquad (x \text{ in } V)$
$\Phi(x, 0) = \Phi_0(x)$  (15.5-15)
with given constant coefficients a², b and homogeneous boundary conditions Φ = 0 or ∂Φ/∂n = 0 on the boundary of a given n-dimensional region V of points (x), where n = 1, 2, or 3. Then
$\Phi(x, t) = \int_0^t\!\!\int_V G(x, t; \xi, \tau)f(\xi, \tau)\,d\tau\,dV(\xi) - a^2\int_V G(x, t; \xi, 0)\Phi_0(\xi)\,dV(\xi) \qquad (t > 0)$  (15.5-16)
where the Green's function G(x, t; ξ, τ) satisfies the given boundary conditions and
$\nabla^2G - a^2\frac{\partial G}{\partial t} - bG = \delta(x, \xi)\delta(t - \tau) \qquad (x, \xi \text{ in } V)$
$G \equiv 0 \qquad (t < \tau)$  (15.5-17)
Given the eigenfunctions ψₖ(x) and eigenvalues λₖ of the space form of the wave equation
$\nabla^2\psi(x) + \lambda\psi(x) = 0 \qquad (x \text{ in } V)$  (15.5-18)
subject to the given boundary conditions (Secs. 10.4-4 and 10.4-5b), one has
$G(x, t; \xi, \tau) = -\frac{1}{a^2}\sum_{k=1}^{\infty} e^{-\frac{\lambda_k + b}{a^2}(t - \tau)}\,\psi_k(x)\psi_k(\xi) \qquad (t > \tau)$  (15.5-19)
If V encompasses the entire space, then
$G(x, t; \xi, \tau) = -\frac{1}{a^2}\left[\frac{a}{2\sqrt{\pi(t - \tau)}}\right]^n \exp\left[-\frac{b}{a^2}(t - \tau) - \frac{a^2|\mathbf r - \boldsymbol\rho|^2}{4(t - \tau)}\right] \qquad (t > \tau;\ n = 1, 2, 3)$  (15.5-20)
where |r − ρ| is the distance between the points (x) ≡ (r) and (ξ) ≡ (ρ). The resulting solution (16) is known as the Poisson-integral solution of the diffusion problem.
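A one-dimensional sketch of the Poisson-integral solution (with b = 0, f = 0, and a Gaussian initial condition chosen so that an exact answer is available for comparison):

```python
import numpy as np

# Free-space (n = 1, b = 0, f = 0) Poisson-integral sketch: the solution
# of d2Phi/dx2 = a^2 dPhi/dt is the convolution of Phi0 with the heat
# kernel K(x, t) = (a / (2 sqrt(pi t))) exp(-a^2 x^2 / (4 t)).
# A Gaussian initial temperature stays Gaussian, giving an exact check.
a2 = 1.0                                   # a^2
t = 0.5
x = np.linspace(-20.0, 20.0, 4001)
Phi0 = np.exp(-x ** 2)                     # initial temperature, variance 1/2

xq = np.linspace(-3.0, 3.0, 13)            # query points
K = np.sqrt(a2 / (4 * np.pi * t)) * np.exp(-a2 * (xq[:, None] - x[None, :]) ** 2 / (4 * t))
Phi = np.trapz(K * Phi0[None, :], x, axis=1)

s2 = 0.5 + 2 * t / a2                      # variance grows by 2t/a^2
exact = np.sqrt(0.5 / s2) * np.exp(-xq ** 2 / (2 * s2))
print("max error:", np.max(np.abs(Phi - exact)))
```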
15.5-4. The Green's-function Method for Nonhomogeneous Boundary Conditions (see also Sec. 15.4-2). (a) The solution Φ(x) of a three-dimensional linear boundary-value problem of the form
$L\Phi(x) = 0 \qquad (x \text{ in } V)$
$B\Phi(x) = b(x) \qquad (x \text{ in } S)$  (15.5-21)
can often be written as a surface integral
$\Phi(x) = \oint_S G_S(x, \xi)b(\xi)\,dA(\xi)$  (15.5-22)
for every given function b(x) integrable over S. Gs(x, £) must satisfy the given differential equation for x in V, £ in S, and
BGs(x, © = 0
(x, i in S; x 9* Q )
Gs(x, £) is variously defined as a Green's function of the second kind or simply as a Green's function (see also Sec. 15.6-6). Analogous rela tions hold in the two-dimensional case (see also Sec. 15.6-9).
Linear boundary-value problems involving nonhomogeneous differential equations as well as nonhomogeneous boundary conditions can often be solved by superposition of a volume integral (2) and a surface integral (22).
(b) As shown in Sec. 15.4-2, every boundary-value problem (21) can be rewritten as a boundary-value problem of the type (1). Hence, G_S(x, ξ) can be related to the ordinary Green's function G(x, ξ) defined in the manner of Sec. 15.5-1 for the problem with "complementary" homogeneous boundary conditions. In particular, consider two-dimensional or three-dimensional boundary-value problems involving real self-adjoint differential equations of the form
$L\Phi(x) \equiv -[\nabla^2 + q]\Phi(x) = 0 \qquad (x \text{ in } V)$  (15.5-24)
where q = q(x) is a real differentiable function. Given a Green's function G(x, ξ) satisfying
$-[\nabla^2 + q]G(x, \xi) = \delta(x, \xi)$  (15.5-25)
for suitably given homogeneous boundary conditions (Sec. 15.5-1), Green's formula (15.4-12) yields
$\Phi(x) = \oint_S\left[G(x, \xi)\frac{\partial\Phi(\xi)}{\partial\nu} - \Phi(\xi)\frac{\partial G(x, \xi)}{\partial\nu}\right]dA(\xi) \qquad (x \text{ in } V)$  (15.5-26)
where ∂/∂ν denotes normal differentiation with respect to ξ. Hence for boundary conditions of the form
$B\Phi(x) \equiv \Phi(x) = b(x) \qquad (x \text{ in } S)$  (Dirichlet conditions)  (15.5-27)
the solution (22) requires
$G_S(x, \xi) = -\frac{\partial G(x, \xi)}{\partial\nu} \qquad (\xi \text{ in } S)$  (15.5-28)
where G(x, ξ) satisfies Eq. (25) in V and vanishes on S. For boundary conditions of the form
$B\Phi(x) \equiv \frac{\partial\Phi}{\partial n} = b(x) \qquad (x \text{ in } S)$  (Neumann conditions)  (15.5-29)
the solution (22) of the differential equation (24) requires one to use the "Neumann function"
$G_S(x, \xi) = G(x, \xi) \qquad (\xi \text{ in } S)$  (15.5-30)
where G(x, ξ) satisfies Eq. (25) in V, and ∂G(x, ξ)/∂n = 0 for x in S. Sections 15.6-6 and 15.6-9 show the application of these relations to the solution of true boundary-value problems for elliptic differential equations. Section 10.3-6 illustrates a similar method for the solution of initial-value problems for hyperbolic differential equations (see also Sec. 10.3-5).

15.6. POTENTIAL THEORY
15.6-1. Introduction. Laplace's and Poisson's Differential Equations (see also Secs. 5.7-3, 10.4-3, and 10.4-5). Many important applications involve solutions Φ(r) of the linear partial differential equations
$\nabla^2\Phi(\mathbf r) = 0$  (Laplace's differential equation)  (15.6-1)
$\nabla^2\Phi(\mathbf r) = -4\pi Q(\mathbf r)$  (Poisson's differential equation)  (15.6-2)
where Φ(r) and Q(r) are functions of position in a three-dimensional Euclidean space of points (r) ≡ (x, y, z) ≡ (x¹, x², x³), or in a two-dimensional Euclidean space of points (r) ≡ (x, y) ≡ (x¹, x²). Φ(r) is most frequently interpreted as the potential of an irrotational vector field
$\mathbf F(\mathbf r) = -\nabla\Phi(\mathbf r)$
due to a distribution of charge or mass such that ∇ · F(r) = 4πQ(r) (Secs. 5.7-2, 5.7-3, and 15.6-5). The study of such potentials and, in particular, of solutions of Laplace's differential equation (1) is known as potential theory.
15.6-2. Three-dimensional Potential Theory: The Classical Boundary-value Problems. (a) The Dirichlet Problem. A bounded region V admitting a solution of the boundary-value problem
$\nabla^2\Phi(\mathbf r) = 0 \quad (\mathbf r \text{ in } V) \qquad \Phi(\mathbf r) = b(\mathbf r) \quad (\mathbf r \text{ in } S)$  (Dirichlet problem)  (15.6-3)
for any given continuous single-valued function b(r) is called a Dirichlet region; whenever a solution exists, it is necessarily unique. If V is an unbounded region, one must specify the asymptotic behavior of the solution at infinity, say Φ(r) = O(1/r) as r → ∞; the latter condition ensures uniqueness whenever a solution exists. Section 15.6-6d further discusses the existence of solutions.
The solution Φ(r) of the Dirichlet problem (3) yields a stationary value of the Dirichlet integral $\int_V (\nabla\Phi)^2\,dV$, where Φ(r) is assumed to be twice continuously differentiable in V and S and to satisfy the given boundary conditions (see also Sec. 15.4-7a). Dirichlet problems are of particular importance in electrostatics (Ref. 15.6).
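A numerical sketch of a Dirichlet problem (the five-point Laplacian and Jacobi relaxation are our choices): the boundary data b = x² − y² is itself harmonic, so the computed interior values can be compared with it directly, and the maximum principle can be observed:

```python
import numpy as np

# Dirichlet problem on the unit square: five-point Laplacian with
# boundary values b = x^2 - y^2 (itself harmonic), relaxed by Jacobi
# iteration.  The interior values reproduce x^2 - y^2, and the maximum
# is attained on the boundary S (maximum principle).
n = 41
x = np.linspace(0.0, 1.0, n)
X, Y = np.meshgrid(x, x, indexing='ij')
b = X ** 2 - Y ** 2                 # boundary data (and exact solution)

Phi = np.zeros((n, n))
Phi[0, :], Phi[-1, :], Phi[:, 0], Phi[:, -1] = b[0, :], b[-1, :], b[:, 0], b[:, -1]
for _ in range(5000):               # Jacobi relaxation of the interior
    Phi[1:-1, 1:-1] = 0.25 * (Phi[2:, 1:-1] + Phi[:-2, 1:-1]
                              + Phi[1:-1, 2:] + Phi[1:-1, :-2])
err = np.max(np.abs(Phi - b))
print("max error vs exact harmonic solution:", err)
```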
(b) The Neumann Problem. For the second classical boundary-value problem
$\nabla^2\Phi(\mathbf r) = 0 \quad (\mathbf r \text{ in } V) \qquad \frac{\partial\Phi}{\partial n} = b(\mathbf r) \quad (\mathbf r \text{ in } S)$  (Neumann problem)  (15.6-4)
where b(r) is a given continuous single-valued function, the existence of solutions requires $\oint_S b(\mathbf r)\,dA = 0$ (see also Gauss's theorem, Table 5.6-1). If V is an unbounded region, one requires Φ(r) = O(1/r) and ∂Φ/∂n = O(1/r²) as r → ∞. The solution of a Neumann problem for a bounded region V is unique except for an additive constant. Neumann problems occur, in particular, in connection with the flow of incompressible fluids (Ref. 15.6).
15.6-3. Kelvin's Inversion Theorem. If Φ(r) is a solution of Laplace's differential equation in a region V inside the sphere |r − a| = R, then

    \bar{\Phi}(\mathbf{r}) = \frac{R}{|\mathbf{r}-\mathbf{a}|}\,\Phi\!\left(\mathbf{a} + \frac{R^2}{|\mathbf{r}-\mathbf{a}|^2}\,(\mathbf{r}-\mathbf{a})\right)        (15.6-5)

is a solution in a corresponding region V̄ outside the sphere, and conversely (Kelvin's Inversion Theorem). In terms of spherical coordinates r, ϑ, φ with origin at a,

    \bar{\Phi}(r, \vartheta, \varphi) = \frac{R}{r}\,\Phi\!\left(\frac{R^2}{r}, \vartheta, \varphi\right)

Hence, the so-called exterior boundary-value problem

    \nabla^2 \Phi(\mathbf{r}) = 0 \quad (\mathbf{r}\ \text{outside}\ V\ \text{and}\ S)
    \alpha \frac{\partial \Phi}{\partial n} + \beta \Phi = b(\mathbf{r}) \quad (\mathbf{r}\ \text{in}\ S)
    \Phi(\mathbf{r}) = O(1/r) \quad \text{as}\ r \to \infty        (15.6-6)

for a bounded region V can be transformed into a corresponding boundary-value problem for the interior of an "inverted" region V̄ through the transformation

    \bar{\mathbf{r}} = \mathbf{a} + \frac{R^2}{|\mathbf{r}-\mathbf{a}|^2}\,(\mathbf{r}-\mathbf{a})        (15.6-7)
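The inversion theorem lends itself to a quick numerical spot check (an illustrative sketch added here, not part of the handbook text): with a = 0, apply the transform (15.6-5) to a function harmonic inside the sphere and verify with a finite-difference Laplacian that the result is harmonic outside. The radius R = 2 and the test function x² − z² are arbitrary choices.

```python
import math

R = 2.0  # sphere radius (arbitrary choice for this check)

def phi(x, y, z):
    # a function harmonic inside the sphere |r| = R  (Laplacian is 2 - 2 = 0)
    return x*x - z*z

def kelvin(x, y, z):
    # Kelvin transform (R/r) * phi(R^2 r / r^2), i.e. Eq. (15.6-5) with a = 0
    r = math.sqrt(x*x + y*y + z*z)
    s = R*R / (r*r)
    return (R / r) * phi(s*x, s*y, s*z)

def laplacian(f, x, y, z, h=1e-3):
    # central finite-difference Laplacian
    return (f(x+h, y, z) + f(x-h, y, z) + f(x, y+h, z) + f(x, y-h, z)
            + f(x, y, z+h) + f(x, y, z-h) - 6.0*f(x, y, z)) / (h*h)

# the transformed function should be harmonic at a point outside the sphere
lap_outside = laplacian(kelvin, 3.0, 1.0, -2.0)
```

Within finite-difference accuracy, `lap_outside` vanishes, as the theorem predicts.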
15.6-4. Properties of Harmonic Functions. (a) Mean-value and Maximum-modulus Theorems. Solutions Φ(r) of Laplace's differential equation (1) are called harmonic functions. Every function Φ(r) harmonic in the open region V is analytic (Sec. 4.10-5b) and has harmonic derivatives of every order throughout V. Each value Φ(r₁) equals the arithmetic mean (Sec. 4.6-3) of Φ(r) over the surface (and hence over the volume) of any sphere centered at r = r₁ and contained in V (Mean-value Theorem). Conversely, a continuous function Φ(r) is harmonic in every open region V where the above mean-value property holds. A function Φ(r) harmonic throughout the bounded region V and its boundary S cannot have a maximum or minimum in the interior of V (Maximum-modulus Theorem, see also Sec. 7.3-5). Conversely, a continuous function without maxima and minima in the interior of a sphere is harmonic throughout the sphere. If Φ(r) is harmonic in V, continuous in V and S, and equal to zero on S, then Φ(r) = 0 in V. If Φ(r) is harmonic in V, continuously differentiable in V and S, and ∂Φ/∂n = 0 on S, then Φ(r) is constant in V. The inversion theorem of Sec. 15.6-3 yields analogous theorems for the unbounded region V̄ outside S.

(b) Harnack's Convergence Theorems. The following convergence theorems are of interest in connection with series approximations for harmonic functions. If the sequence s₀(r), s₁(r), s₂(r), . . . of functions all harmonic in V and continuous on the boundary S of V converges uniformly on S, then the sequence converges uniformly in V to a function s(r) which is harmonic in V and such that s(r) = lim_{n→∞} sₙ(r) on S; the sequence of any given partial derivatives of the sₙ(r) converges uniformly to the corresponding partial derivative of s(r) in every closed subregion of V (Harnack's First Convergence Theorem). Given a sequence of functions s₀(r), s₁(r), s₂(r), . . . harmonic in V and such that s₀(r) ≥ s₁(r) ≥ s₂(r) ≥ . . . for all r in V, convergence at any point of V implies convergence throughout V and uniform convergence in every closed subregion of V; the limit is a harmonic function in V (Harnack's Second Convergence Theorem).
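The mean-value property of Sec. 15.6-4a can be checked by Monte Carlo sampling (a sketch added for illustration, not part of the handbook text): average the harmonic function 1/|r − a| over a sphere that does not enclose the singular point a and compare with the value at the center. All numerical choices below are arbitrary.

```python
import math
import random

random.seed(0)  # fixed seed so the check is reproducible

def phi(p):
    # harmonic away from the point charge at (3, 0, 0)
    return 1.0 / math.dist(p, (3.0, 0.0, 0.0))

center, rho = (0.0, 0.0, 0.0), 1.0   # sphere not containing the singularity
N = 50000
total = 0.0
for _ in range(N):
    # uniform point on the sphere via normalized Gaussian vector
    g = [random.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(v*v for v in g))
    p = tuple(c + rho*v/norm for c, v in zip(center, g))
    total += phi(p)
mean_on_sphere = total / N   # should approximate phi(center) = 1/3
```

The sampled surface mean agrees with Φ(r₁) = 1/3 to within Monte Carlo error.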
15.6-5. Solutions of Laplace's and Poisson's Equations as Potentials. (a) Potentials of Point Charges, Dipoles, and Multipoles. The following particular solutions of Laplace's equation have especially simple physical interpretations:

    \varphi_0(\mathbf{r}-\boldsymbol{\rho}) \equiv \frac{1}{|\mathbf{r}-\boldsymbol{\rho}|} = \frac{1}{\sqrt{(x-\xi)^2 + (y-\eta)^2 + (z-\zeta)^2}} \quad (\mathbf{r} \ne \boldsymbol{\rho})
    [POTENTIAL OF A UNIT POINT CHARGE AT THE POINT (ρ)]        (15.6-8)

    \varphi_1(\mathbf{r}-\boldsymbol{\rho}) \equiv (\mathbf{u}_1 \cdot \nabla_\rho)\,\varphi_0(\mathbf{r}-\boldsymbol{\rho}) = \frac{\mathbf{u}_1 \cdot (\mathbf{r}-\boldsymbol{\rho})}{|\mathbf{r}-\boldsymbol{\rho}|^3}
    [POTENTIAL OF A UNIT DIPOLE DIRECTED ALONG THE UNIT VECTOR u₁ AT THE POINT (ρ)]        (15.6-9)

More generally, the potential of a multipole of order j at the origin (ρ = 0) is

    \Phi_j(\mathbf{r}) = (-1)^j (\mathbf{p}_j \cdot \nabla)(\mathbf{p}_{j-1} \cdot \nabla) \cdots (\mathbf{p}_1 \cdot \nabla)\,\varphi_0(\mathbf{r})        (15.6-10)

where the so-called multipole-moment components are constants determined by the j vectors p₁, p₂, . . . , p_j defining the multipole. In terms of spherical coordinates r, ϑ, φ,

    \Phi_j(\mathbf{r}) = \frac{1}{r^{j+1}}\,Y_j(\vartheta, \varphi) \quad (\mathbf{r} \ne 0)        (15.6-11)

where Y_j(ϑ, φ) is a spherical surface harmonic of degree j (Sec. 21.8-12).
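The closed form of the dipole potential (15.6-9) can be checked against a finite-difference directional derivative of φ₀ with respect to the source point ρ (an illustrative sketch; the points and dipole direction below are arbitrary choices).

```python
import math

def phi0(r, p):
    # unit-point-charge potential, Eq. (15.6-8)
    return 1.0 / math.dist(r, p)

def dipole_exact(r, p, u):
    # closed form u.(r - p)/|r - p|^3 of Eq. (15.6-9)
    d = [a - b for a, b in zip(r, p)]
    return sum(ui*di for ui, di in zip(u, d)) / math.dist(r, p)**3

def dipole_fd(r, p, u, h=1e-6):
    # central finite-difference directional derivative (u . grad_p) phi0
    pp = [pi + h*ui for pi, ui in zip(p, u)]
    pm = [pi - h*ui for pi, ui in zip(p, u)]
    return (phi0(r, pp) - phi0(r, pm)) / (2.0*h)

r = (1.0, 2.0, -0.5)      # field point (arbitrary)
p = (0.2, -0.3, 0.4)      # source point (arbitrary)
u = (0.0, 0.6, 0.8)       # unit dipole direction
```

The two evaluations agree to finite-difference accuracy.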
(b) Potentials of Charge Distributions. Jump Relations. Other particular solutions of Laplace's equation are obtained by linear superposition or integration of point-charge and/or dipole potentials. Of particular interest are volume potentials of charge and dipole distributions,

    \Phi(\mathbf{r}) = \int_V Q(\boldsymbol{\rho})\,\varphi_0(\mathbf{r}-\boldsymbol{\rho})\,dV(\boldsymbol{\rho}) = \int_V \frac{Q(\boldsymbol{\rho})}{|\mathbf{r}-\boldsymbol{\rho}|}\,dV(\boldsymbol{\rho})        (15.6-12)

    \Phi(\mathbf{r}) = -\int_V [\mathbf{p}(\boldsymbol{\rho}) \cdot \nabla]\,\varphi_0(\mathbf{r}-\boldsymbol{\rho})\,dV(\boldsymbol{\rho}) = -\int_V [\mathbf{p}(\boldsymbol{\rho}) \cdot \nabla]\,\frac{1}{|\mathbf{r}-\boldsymbol{\rho}|}\,dV(\boldsymbol{\rho})        (15.6-13)

and surface-distribution potentials for charges and dipoles (potentials of single-layer and double-layer distributions),

    \Phi(\mathbf{r}) = \int_S \frac{\sigma(\boldsymbol{\rho})}{|\mathbf{r}-\boldsymbol{\rho}|}\,dA(\boldsymbol{\rho})        (15.6-14)

    \Phi(\mathbf{r}) = \int_S p(\boldsymbol{\rho})\,\frac{\partial}{\partial n}\,\frac{1}{|\mathbf{r}-\boldsymbol{\rho}|}\,dA(\boldsymbol{\rho})        (15.6-15)

1. The single-layer potential (14) is continuous at every regular point r₀ of the surface S. The same is true for the directional derivative ∂Φ/∂t in any given direction tangent to S at r₀; but the directional derivative ∂Φ/∂n along the normal of S at r₀ satisfies the jump relation

    \frac{\partial \Phi}{\partial n}\bigg|_+ - \frac{\partial \Phi}{\partial n}\bigg|_- = -4\pi\sigma(\mathbf{r}_0)        (15.6-16)

where the subscripts + and − indicate the respective unilateral limits as r → r₀ on the positive-normal side of S and on the negative-normal side of S.

2. The double-layer potential (15) and its tangential derivatives satisfy the jump relations

    \Phi_+(\mathbf{r}_0) - \Phi(\mathbf{r}_0) = \Phi(\mathbf{r}_0) - \Phi_-(\mathbf{r}_0) = 2\pi p(\mathbf{r}_0)        (15.6-17)

    \frac{\partial \Phi}{\partial t}\bigg|_+ - \frac{\partial \Phi}{\partial t} = \frac{\partial \Phi}{\partial t} - \frac{\partial \Phi}{\partial t}\bigg|_- = 2\pi \frac{\partial p}{\partial t}        (15.6-18)

at every regular point r₀ of S, while the normal derivative ∂Φ/∂n is continuous.

Note: In the special case p(r) ≡ p = constant, the double-layer potential (15) equals p times the solid angle subtended by S at the point (r); this angle is taken to be positive if (r) is on the side of the positive normals to S. For a closed surface S such a potential equals −4πp if (r) is inside S, and 0 if (r) is outside S.
(c) Multipole Expansion and Gauss's Theorem. Consider the potential Φ(r) due to any combination of charge distributions confined to a bounded sphere |r| < R; let Q_t be the finite total charge. Φ(r) will be a linear combination of potentials of the types (12) to (15); for |r| > R, one can expand Φ(r) as a Taylor series (5.5-4) of terms (10), or as a series of spherical harmonics (11). For sufficiently large r, the potential is thus successively approximated by the potential of a point charge Q_t at the origin, by a point-charge potential plus a dipole potential, etc. (multipole expansion).

For every regular surface S enclosing the entire charge distribution, Gauss's theorem (Table 5.6-1) takes the special form

    \oint_S d\mathbf{A} \cdot \nabla\Phi = \oint_S \frac{\partial \Phi}{\partial n}\,dA = -4\pi Q_t        (15.6-19)
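Gauss's flux theorem (19) can be spot-checked by quadrature (an illustrative sketch, not part of the handbook text): place a point charge Q_t at the origin and integrate ∂Φ/∂n over an enclosing sphere. The charge value, sphere center and radius, and grid sizes are arbitrary choices.

```python
import math

Qt = 1.5                        # total charge at the origin (arbitrary)
c, a = (0.5, 0.0, 0.0), 2.0     # enclosing sphere: center and radius

def grad_phi(p):
    # gradient of phi = Qt/|p|  ->  -Qt p / |p|^3
    r = math.sqrt(sum(x*x for x in p))
    return [-Qt*x / r**3 for x in p]

# midpoint-rule quadrature over the sphere in spherical angles
Nt, Np = 200, 400
flux = 0.0
for i in range(Nt):
    th = (i + 0.5) * math.pi / Nt
    for j in range(Np):
        ph = (j + 0.5) * 2.0*math.pi / Np
        n = (math.sin(th)*math.cos(ph), math.sin(th)*math.sin(ph), math.cos(th))
        p = tuple(ci + a*ni for ci, ni in zip(c, n))
        g = grad_phi(p)
        flux += sum(ni*gi for ni, gi in zip(n, g)) * math.sin(th)
flux *= a*a * (math.pi/Nt) * (2.0*math.pi/Np)   # dA = a^2 sin(th) dth dph
```

The computed flux reproduces −4πQ_t, independent of where the sphere is centered, as long as it encloses the charge.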
(d) General Solutions of Laplace's and Poisson's Equations as Potentials. Let V be a singly connected, bounded or unbounded three-dimensional region with regular boundary surface S, let Φ(r) be twice continuously differentiable in V and continuously differentiable on S, and let Φ(r) = O(1/r) as r → ∞. Then Green's theorem (Table 5.6-1 or Sec. 15.4-3c) permits one to represent Φ(r) in the form

    \Phi(\mathbf{r}) = \frac{1}{4\pi} \oint_S \left[\varphi_0(\mathbf{r}-\boldsymbol{\rho})\,\nabla_\rho \Phi(\boldsymbol{\rho}) - \Phi(\boldsymbol{\rho})\,\nabla_\rho \varphi_0(\mathbf{r}-\boldsymbol{\rho})\right] \cdot d\mathbf{A}(\boldsymbol{\rho})        (15.6-20a)
        {} - \frac{1}{4\pi} \int_V \varphi_0(\mathbf{r}-\boldsymbol{\rho})\,\nabla_\rho^2 \Phi(\boldsymbol{\rho})\,dV(\boldsymbol{\rho}) \quad (\mathbf{r}\ \text{in}\ V)        (15.6-20b)

with φ₀(r − ρ) = 1/|r − ρ|, where the operator ∇_ρ implies differentiations with respect to the components of ρ; note that ∇_ρ φ₀(r − ρ) = −∇φ₀(r − ρ). The theorem represents Φ(r) as the sum of three potentials:

1. A single-layer surface distribution (14) of density (1/4π) ∂Φ/∂n
2. A double-layer surface distribution (15) of density −(1/4π) Φ
3. A volume distribution (12) of density −(1/4π) ∇²Φ = Q(r)

The last potential vanishes if Φ(r) satisfies Laplace's differential equation (1) in V.

Note: The expression (20a) vanishes if r is outside V and S, and equals Φ(r)/2 for r in S.
15.6-6. Solution of Three-dimensional Boundary-value Problems by Green's Functions. The Green's-function methods of Secs. 15.5-1 and 15.5-4 express the solution Φ(r) of the Dirichlet problem (3) and the Neumann problem (4) for suitable regions V as surface integrals

    \Phi(\mathbf{r}) = \oint_S G_S(\mathbf{r}, \boldsymbol{\rho})\,b(\boldsymbol{\rho})\,dA(\boldsymbol{\rho})        (15.6-21)

Section 15.5-4b relates each "surface" Green's function G_S(r, ρ) simply to the ordinary Green's function G(r, ρ), which yields the solution

    \Phi(\mathbf{r}) = 4\pi \int_V G(\mathbf{r}, \boldsymbol{\rho})\,Q(\boldsymbol{\rho})\,dV(\boldsymbol{\rho})        (15.6-22)

of Poisson's differential equation (2) subject to "complementary" homogeneous Dirichlet or Neumann conditions. Superposition of Eqs. (21) and (22) yields solutions of Poisson's equation with given boundary values b(ρ) of Φ or ∂Φ/∂n (Sec. 15.4-2). Note G(ρ, r) = G(r, ρ) (Sec. 15.5-1). The Green's functions are easily found in the following special cases; note that the positive surface normals point outward.

(a) Green's Function for the Entire Space. If V is the entire three-dimensional space, Eq. (20) yields the unique solution (22) of Poisson's equation subject to the "boundary condition" Φ(r) = O(1/r) as r → ∞. The required Green's function is

    G(\mathbf{r}, \boldsymbol{\rho}) = \frac{1}{4\pi}\,\varphi_0(\mathbf{r}-\boldsymbol{\rho}) = \frac{1}{4\pi\,|\mathbf{r}-\boldsymbol{\rho}|}        (15.6-23)

(see also Sec. 5.7-3).

(b) Infinite Plane Boundary with Dirichlet Conditions. Image Charges. If V is the half-space z > 0, the Green's functions for Dirichlet conditions are
    G(\mathbf{r}, \boldsymbol{\rho}) = \frac{1}{4\pi}\left(\frac{1}{|\mathbf{r}-\boldsymbol{\rho}|} - \frac{1}{|\mathbf{r}-\bar{\boldsymbol{\rho}}|}\right) = \frac{1}{4\pi}\left[\varphi_0(\mathbf{r}-\boldsymbol{\rho}) - \varphi_0(\mathbf{r}-\bar{\boldsymbol{\rho}})\right]        (15.6-24)

    G_S(\mathbf{r}, \boldsymbol{\rho}) = -\frac{\partial G}{\partial \nu} = \frac{\partial G}{\partial \zeta}\bigg|_{\zeta=0} = \frac{z}{2\pi\left[(x-\xi)^2 + (y-\eta)^2 + z^2\right]^{3/2}}        (15.6-25)

where the point (ρ̄) ≡ (ξ, η, −ζ) is the reflected image of (ρ) ≡ (ξ, η, ζ) in the boundary plane. The solution (22) can be interpreted as a volume potential due to the given charges and induced image charges; Eq. (21) expresses the effect of nonhomogeneous Dirichlet conditions as the potential of a double layer on the boundary.

(c) Sphere with Dirichlet Conditions. Poisson's Integral Formulas. If V is the interior of a sphere |r| = R, the Green's functions for
Dirichlet conditions are
    G(\mathbf{r}, \boldsymbol{\rho}) = \frac{1}{4\pi}\left(\frac{1}{|\mathbf{r}-\boldsymbol{\rho}|} - \frac{R}{\rho\,|\mathbf{r}-\bar{\boldsymbol{\rho}}|}\right) = \frac{1}{4\pi}\left[\varphi_0(\mathbf{r}-\boldsymbol{\rho}) - \frac{R}{\rho}\,\varphi_0(\mathbf{r}-\bar{\boldsymbol{\rho}})\right]        (15.6-26)

with ρ̄ = (R²/ρ²)ρ (Kelvin inversion, Sec. 15.6-3), and

    G_S(\mathbf{r}, \boldsymbol{\rho}) = -\frac{\partial G}{\partial \nu}\bigg|_{\rho=R} = \frac{R^2 - r^2}{4\pi R\,(r^2 + R^2 - 2Rr\cos\gamma)^{3/2}}        (15.6-27)

where γ is the angle between r and ρ, or

    \cos\gamma = \cos\vartheta\cos\vartheta' + \sin\vartheta\sin\vartheta'\cos(\varphi - \varphi')        (15.6-28)

if the spherical coordinates of the points (r) and (ρ) are respectively denoted by r, ϑ, φ and ρ, ϑ′, φ′. The solution (21) of the Dirichlet problem

    \nabla^2 \Phi(r, \vartheta, \varphi) = 0 \quad (r < R), \qquad \Phi = b(\vartheta, \varphi) \quad (r = R)

becomes

    \Phi(r, \vartheta, \varphi) = \frac{R(R^2 - r^2)}{4\pi} \int_0^{2\pi}\! d\varphi' \int_0^{\pi} \frac{b(\vartheta', \varphi')\,\sin\vartheta'}{(R^2 + r^2 - 2Rr\cos\gamma)^{3/2}}\,d\vartheta'
    (Poisson's integral formula)        (15.6-29)

which may again be interpreted as a double-layer potential. The expression (27) can be expanded in spherical surface harmonics with the aid of Eq. (21.8-68) (multipole expansion, Sec. 15.6-5c; see also Sec. 10.4-9).

Equations (26) to (29) yield solutions in the interior of the sphere (r < R). If V is the exterior of the sphere (r > R), G(r, ρ) is still given by Eq. (26), but now G_S(r, ρ) = −∂G/∂ν = ∂G/∂ρ|_{ρ=R}, so that the signs on the right sides of Eqs. (27) and (29) must be reversed.

(d) An Existence Theorem. A Green's function G(r, ρ) for Poisson's equation with Dirichlet conditions, and hence a solution of the Dirichlet problem (3) with reasonable boundary values, exists for every region V bounded by a finite number of regular surface elements such that every boundary point can be the vertex of a tetrahedron outside V.
15.6-7. Two-dimensional Potential Theory. Logarithmic Potentials. (a) In many three-dimensional potential problems Φ(x, y, z) is independent of the z coordinate, and V is a right-cylindrical region whose boundary surface S intersects the xy plane in a boundary curve C. V is then represented by a plane region D bounded by C (e.g., potentials and flows about infinite cylinders, plane boundaries). Such problems form the subject matter of two-dimensional potential theory. The divergence theorem and Green's theorem take the form (4.6-33); one may use rectangular cartesian coordinates x, y, or plane polar coordinates r, φ, so that

    \nabla^2\Phi \equiv \frac{\partial^2\Phi}{\partial x^2} + \frac{\partial^2\Phi}{\partial y^2} \equiv \frac{1}{r}\frac{\partial}{\partial r}\left(r\,\frac{\partial\Phi}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2\Phi}{\partial\varphi^2} = 0        (Laplace's differential equation)        (15.6-30)

    \nabla^2\Phi = -2\pi Q(\mathbf{r})        (Poisson's differential equation)        (15.6-31)

with Φ = Φ(r), Q = Q(r), r ≡ (x, y) ≡ (r, φ). Theorems analogous to those of Sec. 15.6-4 apply to two-dimensional harmonic functions; it is only necessary to substitute circles for spheres in each mean-value theorem. Kelvin's inversion formula (5) holds for every circle |r − a| = R; thus, if Φ(r, φ) is harmonic, the same is true for Φ̄(r, φ) = Φ(R²/r, φ), where r, φ are polar coordinates with origin at a.

(b) The two-dimensional theory employs the logarithmic potential

    \varphi_0(\mathbf{r}-\boldsymbol{\rho}) \equiv \log_e \frac{1}{|\mathbf{r}-\boldsymbol{\rho}|} = \log_e \frac{1}{\sqrt{(x-\xi)^2 + (y-\eta)^2}}
    (logarithmic potential of a point source at r = ρ)        (15.6-32)

which describes the potential of a uniformly charged straight line in the z direction. The function (32) takes the role of the function (8) in the two-dimensional theory, so that the potentials (12), (14), and (15) of Sec. 15.6-5b are replaced by corresponding logarithmic potentials

    \Phi(\mathbf{r}) = \int_D Q(\boldsymbol{\rho})\,\varphi_0(\mathbf{r}-\boldsymbol{\rho})\,dA(\boldsymbol{\rho}) = \int_D Q(\boldsymbol{\rho})\,\log_e \frac{1}{|\mathbf{r}-\boldsymbol{\rho}|}\,dA(\boldsymbol{\rho})        (15.6-33)

    \Phi(\mathbf{r}) = \int_C \sigma(\boldsymbol{\rho})\,\log_e \frac{1}{|\mathbf{r}-\boldsymbol{\rho}|}\,ds(\boldsymbol{\rho})        (15.6-34)

    \Phi(\mathbf{r}) = \int_C p(\boldsymbol{\rho})\,\frac{\partial}{\partial n}\,\log_e \frac{1}{|\mathbf{r}-\boldsymbol{\rho}|}\,ds(\boldsymbol{\rho})        (15.6-35)

Equation (20) is replaced by

    \Phi(\mathbf{r}) = \frac{1}{2\pi} \oint_C \left[\varphi_0(\mathbf{r}-\boldsymbol{\rho})\,\frac{\partial\Phi(\boldsymbol{\rho})}{\partial n} - \Phi(\boldsymbol{\rho})\,\frac{\partial}{\partial n}\varphi_0(\mathbf{r}-\boldsymbol{\rho})\right] ds(\boldsymbol{\rho}) - \frac{1}{2\pi} \int_D \varphi_0(\mathbf{r}-\boldsymbol{\rho})\,\nabla'^2\Phi(\boldsymbol{\rho})\,dA(\boldsymbol{\rho})        (15.6-36)

Note

    \frac{\partial\varphi_0(\mathbf{r}-\boldsymbol{\rho})}{\partial n}\,ds(\boldsymbol{\rho}) = \frac{\partial\varphi_0}{\partial\xi}\,d\eta - \frac{\partial\varphi_0}{\partial\eta}\,d\xi        (15.6-37)
15.6-8. Two-dimensional Potential Theory: Conjugate Harmonic Functions (see also Secs. 7.3-2 and 7.3-3). (a) The xy plane may be regarded as a complex-number plane with points z = x + iy. A pair of (necessarily harmonic) functions Φ(x, y), Ψ(x, y) are harmonic conjugate functions in a region D of the plane if and only if Φ(x, y) + iΨ(x, y) is an analytic function of z = x + iy in D. Harmonic conjugate functions are related by the Cauchy-Riemann equations

    \frac{\partial\Phi}{\partial x} = \frac{\partial\Psi}{\partial y}, \qquad \frac{\partial\Phi}{\partial y} = -\frac{\partial\Psi}{\partial x}        (15.6-38)

and define each other uniquely throughout D, except for arbitrary additive constants. Given a function Φ(x, y) harmonic in D, one obtains Ψ(x, y) as the line integral

    \Psi(x, y) = \int_{(x_0, y_0)}^{(x, y)} \left(-\frac{\partial\Phi}{\partial y}\,dx + \frac{\partial\Phi}{\partial x}\,dy\right)        (15.6-39a)

where x₀ and y₀ are arbitrary constants, and the path of integration is located in D. If Ψ(x, y) is given, one has

    \Phi(x, y) = \int_{(x_0, y_0)}^{(x, y)} \left(\frac{\partial\Psi}{\partial y}\,dx - \frac{\partial\Psi}{\partial x}\,dy\right)        (15.6-39b)
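The line integral (39a) can be evaluated numerically for a simple harmonic function (an illustrative sketch, not part of the handbook text): for Φ = x² − y², the construction should recover the known conjugate Ψ = 2xy (up to an additive constant fixed by the choice of starting point). The path below is an L-shaped path, first horizontal and then vertical.

```python
def conj_harmonic(phi, x, y, x0=0.0, y0=0.0, n=2000, h=1e-5):
    # Eq. (15.6-39a): integral of (-dPhi/dy) dx + (dPhi/dx) dy along an
    # L-shaped path (x0,y0) -> (x,y0) -> (x,y); midpoint rule, FD derivatives
    def phix(u, v): return (phi(u+h, v) - phi(u-h, v)) / (2.0*h)
    def phiy(u, v): return (phi(u, v+h) - phi(u, v-h)) / (2.0*h)
    s = 0.0
    dx = (x - x0)/n
    for i in range(n):                 # horizontal leg, dy = 0
        u = x0 + (i + 0.5)*dx
        s += -phiy(u, y0) * dx
    dy = (y - y0)/n
    for i in range(n):                 # vertical leg, dx = 0
        v = y0 + (i + 0.5)*dy
        s += phix(x, v) * dy
    return s

phi = lambda x, y: x*x - y*y          # harmonic; its conjugate is 2xy
psi = conj_harmonic(phi, 1.2, 0.7)    # should be close to 2 * 1.2 * 0.7
```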
The curves Φ(x, y) = constant and Ψ(x, y) = constant are mutually orthogonal families. These curves have important physical interpretations (equipotential lines and gradient lines in electrostatics, lines of constant velocity potential and streamlines for incompressible flow). The function Φ + iΨ is often called the complex potential.

(b) Every transformation

    \bar{z} = \bar{z}(z) \qquad (z = x + iy, \quad \bar{z} = \bar{x} + i\bar{y})        (15.6-40)

which is analytic and such that d\bar{z}/dz ≠ 0 in D (conformal mapping, Sec. 7.9-1) transforms the harmonic conjugates Φ(x, y), Ψ(x, y) into a new pair of harmonic conjugates Φ̄(x̄, ȳ), Ψ̄(x̄, ȳ) with mutually orthogonal contour lines. This theorem permits one to simplify boundaries and contour lines by conformal mapping (see also Sec. 15.6-9).

(c) Let Ψ(x, y) be the solution of a Neumann problem

    \nabla^2\Psi = 0 \quad (\text{in}\ D), \qquad \frac{\partial\Psi}{\partial n} = b(x, y) \equiv b(s) \quad (\text{on}\ C)        (15.6-41)

such that Ψ(x, y) and its derivatives are continuous on C as well as in D. Then the conjugate harmonic function Φ(x, y) given by (39b) is the solution of the Dirichlet problem

    \nabla^2\Phi = 0 \quad (\text{in}\ D), \qquad \Phi(x, y) = B(x, y) \equiv B(s) \quad (\text{on}\ C)        (15.6-42)

with

    \frac{dB}{ds} = -b(s) \quad (\text{on}\ C) \qquad \text{or} \qquad B(s) = -\int b(s)\,ds + \text{constant}        (15.6-43)

The solution Φ(x, y) of the Dirichlet problem (42) similarly yields the solution (39a) of the Neumann problem (41), provided that

    \oint_C b(s)\,ds = \oint_C \frac{\partial\Psi}{\partial n}\,ds = 0        (15.6-44)

so that Gauss's theorem is satisfied (see also Table 5.6-1 and Sec. 15.6-5c).
15.6-9. Solution of Two-dimensional Boundary-value Problems. Green's Functions and Conformal Mapping (see also Secs. 15.5-1, 15.5-4, and 15.6-6). The Green's-function methods of Secs. 15.5-1 and 15.5-3 express the solution Φ(r) of Poisson's differential equation (31) with homogeneous linear boundary conditions in the form

    \Phi(\mathbf{r}) = 2\pi \int_D G(\mathbf{r}, \boldsymbol{\rho})\,Q(\boldsymbol{\rho})\,dA(\boldsymbol{\rho})        (15.6-45)

and the solution Φ(r) of Laplace's differential equation (30) with given boundary values b(r) of Φ or ∂Φ/∂n as

    \Phi(\mathbf{r}) = \oint_C G_S(\mathbf{r}, \boldsymbol{\rho})\,b(\boldsymbol{\rho})\,ds(\boldsymbol{\rho})        (15.6-46)

Solutions of Poisson's equation subject to nonhomogeneous linear boundary conditions may be obtained by superposition of suitable integrals (45) and (46). Note G(r, ρ) = G(ρ, r). The Green's functions are easily found in the following special cases.

(a) Green's Function for the Entire Plane. If D is the entire plane, Eq. (36) yields the unique solution (45) of Poisson's equation subject to the "boundary condition" Φ(r) → 0 as r → ∞. The required Green's function is

    G(\mathbf{r}, \boldsymbol{\rho}) = \frac{1}{2\pi}\,\log_e \frac{1}{|\mathbf{r}-\boldsymbol{\rho}|} = \frac{1}{2\pi}\,\varphi_0(\mathbf{r}-\boldsymbol{\rho})        (15.6-47)
(b) Half Plane with Dirichlet Conditions. If D is the half plane x > 0, the Green's functions for Dirichlet conditions are

    G(\mathbf{r}, \boldsymbol{\rho}) = \frac{1}{2\pi}\left[\log_e \frac{1}{|\mathbf{r}-\boldsymbol{\rho}|} - \log_e \frac{1}{|\mathbf{r}-\bar{\boldsymbol{\rho}}|}\right] = \frac{1}{2\pi}\left[\varphi_0(\mathbf{r}-\boldsymbol{\rho}) - \varphi_0(\mathbf{r}-\bar{\boldsymbol{\rho}})\right]
        = \frac{1}{4\pi}\,\log_e \frac{(x+\xi)^2 + (y-\eta)^2}{(x-\xi)^2 + (y-\eta)^2}        (15.6-48)

    G_S(\mathbf{r}, \boldsymbol{\rho}) = -\frac{\partial G}{\partial \nu} = \frac{\partial G}{\partial \xi}\bigg|_{\xi=0} = \frac{x}{\pi\left[x^2 + (y-\eta)^2\right]}        (15.6-49)

where the point (ρ̄) ≡ (−ξ, η) is the reflected image of (ρ) ≡ (ξ, η) in the boundary line.
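The image construction (48) can be checked directly (an illustrative sketch, not part of the handbook text): G must vanish on the boundary line x = 0, be symmetric in its two arguments, and be harmonic away from the source point.

```python
import math

def G(x, y, xi, eta):
    # half-plane (x > 0) Dirichlet Green's function, Eq. (15.6-48)
    num = (x + xi)**2 + (y - eta)**2     # distance^2 to the image (-xi, eta)
    den = (x - xi)**2 + (y - eta)**2     # distance^2 to the source (xi, eta)
    return math.log(num/den) / (4.0*math.pi)

def lap(f, x, y, h=1e-4):
    # central finite-difference Laplacian in two dimensions
    return (f(x+h, y) + f(x-h, y) + f(x, y+h) + f(x, y-h) - 4.0*f(x, y)) / (h*h)
```

Usage: `G(0.0, y, xi, eta)` is identically zero, and `lap` applied to `G` away from (ξ, η) is zero within finite-difference accuracy.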
(c) Circle with Dirichlet Conditions. Poisson's Integral Formulas. If D is the interior of a circle r = R, the Green's functions for Dirichlet conditions are

    G(\mathbf{r}, \boldsymbol{\rho}) = \frac{1}{4\pi}\,\log_e \frac{R^2 + r^2\rho^2/R^2 - 2r\rho\cos(\varphi - \varphi')}{r^2 + \rho^2 - 2r\rho\cos(\varphi - \varphi')}        (15.6-50)

    G_S(\mathbf{r}, \boldsymbol{\rho}) = -\frac{\partial G}{\partial \nu}\bigg|_{\rho=R} = \frac{1}{2\pi R}\,\frac{R^2 - r^2}{R^2 + r^2 - 2Rr\cos(\varphi - \varphi')}        (15.6-51)

where ρ, φ′ are the polar coordinates of the point (ρ). The solution (46) of the Dirichlet problem

    \nabla^2\Phi(r, \varphi) = 0 \quad (r < R), \qquad \Phi = b(\varphi) \quad (r = R)        (15.6-52)

becomes

    \Phi(r, \varphi) = \frac{R^2 - r^2}{2\pi} \int_0^{2\pi} \frac{b(\varphi')}{R^2 + r^2 - 2Rr\cos(\varphi - \varphi')}\,d\varphi'
    (Poisson's integral formula)        (15.6-53)

which may be expanded as a Fourier series in φ. Equations (50) to (53) yield solutions in the interior of the circle (r < R). If D is the exterior of the circle (r > R), G(r, ρ) is still given by Eq. (50), but now G_S(r, ρ) = −∂G/∂ν = ∂G/∂ρ|_{ρ=R}, so that the signs on the right sides of Eqs. (51) and (53) must be reversed.
(d) Existence of Green's Functions and Conformal Mapping. Section 15.6-9c and Riemann's mapping theorem (Sec. 7.10-1) imply the existence of a Green's function (and hence of a solution of the Dirichlet problem) for any region D which, together with its boundary C, can be mapped conformally onto the unit circle. More specifically, let w = w(z) be the analytic function mapping the points z = x + iy of D and C onto the unit circle so that the point z = ζ = ξ + iη of D is transformed into the origin, or

    |w(z)| = 1 \quad (z\ \text{on}\ C), \qquad w(\zeta) = 0        (15.6-54)

Then the Dirichlet problem for the region D admits the Green's function

    G(x, y;\ \xi, \eta) \equiv G(z, \zeta) = \log_e \frac{1}{|w(z)|}        (15.6-55)
15.6-10. Extension of the Theory to More General Differential Equations. Retarded and Advanced Potentials (see also Sec. 10.4-4). The theory of Secs. 15.6-5 to 15.6-7 and 15.6-9 attempts to construct solutions of Laplace's and Poisson's differential equations by superposition of point-charge and dipole potentials. The theory is readily generalized to deal with the more general differential equations

    \nabla^2\Phi + k^2\Phi = 0        (15.6-56)

    \nabla^2\Phi + k^2\Phi = -4\pi Q(\mathbf{r})        (15.6-57)

which include the space form of the wave equation (k real, Sec. 10.4-4) and the space form of the Klein-Gordon equation used in nuclear physics (k = iκ). The differential equation (57) is of the type discussed in Sec. 15.5-4b.
(a) Three-dimensional Case. The three-dimensional equation (56) admits the elementary particular solution

    \varphi(\mathbf{r}-\boldsymbol{\rho}) = \frac{e^{\pm ik|\mathbf{r}-\boldsymbol{\rho}|}}{|\mathbf{r}-\boldsymbol{\rho}|}        (15.6-58)

For real k the positive sign corresponds to outgoing waves, and the negative sign to incoming waves; for imaginary k = iκ, only the negative exponents −|κ| |r − ρ| are of general interest. Substitution of the appropriate expression (58) for φ₀(r − ρ) yields the analog of the representation (20):

    \Phi(\mathbf{r}) = \frac{1}{4\pi} \oint_S \left(\frac{e^{\pm ik|\mathbf{r}-\boldsymbol{\rho}|}}{|\mathbf{r}-\boldsymbol{\rho}|}\,\nabla_\rho\Phi\right) \cdot d\mathbf{A}(\boldsymbol{\rho}) - \frac{1}{4\pi} \oint_S \left(\Phi\,\nabla_\rho \frac{e^{\pm ik|\mathbf{r}-\boldsymbol{\rho}|}}{|\mathbf{r}-\boldsymbol{\rho}|}\right) \cdot d\mathbf{A}(\boldsymbol{\rho})        (15.6-59)
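That e^{+ikr}/r indeed satisfies the homogeneous equation (56) away from the origin can be checked with complex arithmetic and a finite-difference Laplacian (an illustrative sketch; the wave number and test point are arbitrary choices).

```python
import cmath
import math

k = 2.0  # wave number (arbitrary choice)

def u(x, y, z):
    # outgoing elementary solution exp(+ik r)/r of Eq. (15.6-56)
    r = math.sqrt(x*x + y*y + z*z)
    return cmath.exp(1j*k*r) / r

def helmholtz_residual(x, y, z, h=1e-4):
    # finite-difference evaluation of (Laplacian + k^2) u at a point r != 0
    lap = (u(x+h, y, z) + u(x-h, y, z) + u(x, y+h, z) + u(x, y-h, z)
           + u(x, y, z+h) + u(x, y, z-h) - 6.0*u(x, y, z)) / (h*h)
    return lap + k*k*u(x, y, z)

res = helmholtz_residual(1.0, 0.5, -0.8)   # should be ~ 0
```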
(b) Two-dimensional Case. In the case of two-dimensional differential equations of the form (56) or (57), the elementary particular solution (32) is replaced by

    \varphi(\mathbf{r}-\boldsymbol{\rho}) = \frac{i\pi}{2}\,H_0^{(1)}(k\,|\mathbf{r}-\boldsymbol{\rho}|) \quad \text{(outgoing waves)}
    \varphi(\mathbf{r}-\boldsymbol{\rho}) = -\frac{i\pi}{2}\,H_0^{(2)}(k\,|\mathbf{r}-\boldsymbol{\rho}|) \quad \text{(incoming waves)}        (15.6-60)

where the H₀^{(1)}, H₀^{(2)} are Hankel functions (Sec. 21.8-1).
15.7. RELATED TOPICS, REFERENCES, AND BIBLIOGRAPHY

15.7-1. Related Topics. The following topics related to the study of linear integral equations, boundary-value problems, and eigenvalue problems are treated in other chapters of this handbook:

    Functions of a complex variable ............ Chap. 7
    The Laplace transformation ................. Chap. 8
    Ordinary differential equations ............ Chap. 9
    Partial differential equations ............. Chap. 10
    Calculus of variations ..................... Chap. 11
    Linear vector spaces, linear operators ..... Chap. 14
    Numerical solutions ........................ Chap. 20
    Special transcendental functions ........... Chap. 21
15.7-2. References and Bibliography (see also Secs. 4.12-2, 10.6-2, 12.9-2, and 14.11-2).

15.1. Akhiezer, N. I.: Theory of Approximation, Ungar, New York, 1956.
15.2. Akhiezer, N. I., and I. M. Glazman: Theory of Linear Operators in Hilbert Space, Ungar, New York, 1963.
15.3. Banach, S.: Théorie des Opérations Linéaires, Chelsea, New York, 1933.
15.4. Berberian, S. K.: Introduction to Hilbert Space, Oxford, Fair Lawn, N.J., 1963.
15.5. Bochner, S.: Lectures on Fourier Integrals, Princeton, Princeton, N.J., 1959.
15.6. Courant, R., and D. Hilbert: Methods of Mathematical Physics, rev. ed., Wiley, New York, 1953/66.
15.7. Dieudonné, J. A.: Foundations of Modern Analysis, Academic, New York, 1960.
15.8. Dunford, N., and J. T. Schwartz: Linear Operators, Interscience, New York, 1964.
15.9. Edwards, R. E.: Functional Analysis, Holt, New York, 1965.
15.10. Feshbach, H., and P. M. Morse: Methods of Theoretical Physics (2 vols.), McGraw-Hill, New York, 1953.
15.11. Friedman, B.: Principles and Techniques of Applied Mathematics, Wiley, New York, 1956.
15.12. ———: Generalized Functions and Partial Differential Equations, Prentice-Hall, Englewood Cliffs, N.J., 1956.
15.13. Halmos, P. R.: Introduction to Hilbert Space and the Theory of Spectral Multiplicity, Chelsea, New York, 1957.
15.14. Kellogg, O. D.: Foundations of Potential Theory, Ungar, New York, 1943.
15.15. Kolmogorov, A., and S. V. Fomin: Elements of the Theory of Functions and Functional Analysis (2 vols.), Graylock, New York, 1957/61.
15.16. Lanczos, C.: Linear Differential Operators, Van Nostrand, Princeton, N.J., 1961.
15.17. Liusternik, L. A., and V. J. Sobolev: Elements of Functional Analysis, Ungar, New York, 1961.
15.18. Lorch, E. R.: Spectral Theory, Oxford, Fair Lawn, N.J., 1962.
15.19. Madelung, E.: Die mathematischen Hilfsmittel des Physikers, 7th ed., Springer, Berlin, 1964.
15.20. Mikhlin, S. G.: Integral Equations, Pergamon, New York, 1957.
15.21. Riesz, F., and B. Sz.-Nagy: Functional Analysis, Ungar, New York, 1955.
15.22. Vulikh, B. Z.: Functional Analysis for Scientists and Technologists, Pergamon, New York, 1963.

(See also the articles by F. Schlögl and J. Sneddon in Vol. I of the Handbuch der Physik, Springer, Berlin, 1956, and the references for Chap. 10.)
CHAPTER 16

REPRESENTATION OF MATHEMATICAL MODELS: TENSOR ALGEBRA AND ANALYSIS

16.1. Introduction
    16.1-1. Introductory Remarks
    16.1-2. Coordinate Systems and Admissible Transformations
    16.1-3. Description (Representation) of Abstract Functions by Components. Dummy (Umbral) Index Notation
    16.1-4. Schemes of Measurements and Induced Transformations. Invariants
16.2. Absolute and Relative Tensors
    16.2-1. Definition of Absolute and Relative Tensors in Terms of Their Induced Transformation Laws
    16.2-2. Infinitesimal Displacement. Gradient of an Absolute Scalar
16.3. Tensor Algebra: Definition of Basic Operations
    16.3-1. Equality of Tensors
    16.3-2. Null Tensor
    16.3-3. Tensor Addition
    16.3-4. Multiplication of a Tensor by an Absolute Scalar
    16.3-5. Contraction of a Mixed Tensor
    16.3-6. (Outer) Product of Two Tensors
    16.3-7. Inner Products
    16.3-8. Indirect Tests for Tensor Character
16.4. Tensor Algebra: Invariance of Tensor Equations
    16.4-1. Invariance of Tensor Equations
16.5. Symmetric and Skew-symmetric Tensors
    16.5-1. Symmetry and Skew-symmetry
    16.5-2. Kronecker Deltas
    16.5-3. Permutation Symbols
    16.5-4. Alternating Product of Two Vectors
16.6. Local Systems of Base Vectors
    16.6-1. Representation of Vectors and Tensors in Terms of Local Base Vectors
    16.6-2. Relations between Local Base Vectors Associated with Different Schemes of Measurements
16.7. Tensors Defined on Riemann Spaces. Associated Tensors
    16.7-1. Riemann Space and Fundamental Tensors
    16.7-2. Associated Tensors. Raising and Lowering of Indices
    16.7-3. Equivalence of Associated Tensors
    16.7-4. Operations with Tensors Defined on Riemann Spaces
16.8. Scalar Products and Related Topics
    16.8-1. Scalar Product (Inner Product) of Two Vectors Defined on a Riemann Space
    16.8-2. Scalar Products of Local Base Vectors. Orthogonal Coordinate Systems
    16.8-3. Physical Components of a Tensor
    16.8-4. Vector Product and Scalar Triple Product
16.9. Tensors of Rank Two (Dyadics) Defined on Riemann Spaces
    16.9-1. Dyadics
    16.9-2. Inner-product Notation
    16.9-3. Eigenvalue Problems
16.10. The Absolute Differential Calculus. Covariant Differentiation
    16.10-1. Absolute Differentials
    16.10-2. Absolute Differential of a Relative Tensor
    16.10-3. Christoffel Three-index Symbols
    16.10-4. Covariant Differentiation
    16.10-5. Rules for Covariant Differentiation
    16.10-6. Higher-order Covariant Derivatives
    16.10-7. Differential Operators and Differential Invariants
    16.10-8. Absolute (Intrinsic) and Directional Derivatives
    16.10-9. Tensors Constant along a Curve
    16.10-10. Integration of Tensor Quantities. Volume Element
    16.10-11. Differential Invariants and Integral Theorems for Dyadics
16.11. Related Topics, References, and Bibliography
    16.11-1. Related Topics
    16.11-2. References and Bibliography
16.1. INTRODUCTION

16.1-1. Tensors are mathematical objects (see also Sec. 12.1-1) associated as functions of "position" with a class (space) of other objects ("points" labeled with numerical coordinates). Each tensor is described (represented) by an ordered set of numerical functions (tensor components) of the coordinates in such a manner that it is possible to define mathematical relations between tensors independent of (invariant with respect to) the particular scheme of numerical description used. Tensor algebra was developed by successive generalizations of the theory of vector spaces (Secs. 12.4-1 and 14.2-1 to 14.2-7), linear algebras (Secs. 12.4-2 and 14.3-5), and their representations (Sec. 14.1-2). Tensor analysis is particularly concerned with tensors in their aspect as point functions and applies especially to the description of curved spaces (Chap. 17) and of continuous fields in physics. Tensor methods frequently link complicated numerical data measured in different frames of reference to relatively simple abstract models.
16.1-2. Coordinate Systems and Admissible Transformations. Consider a class (space, region of a space) of objects (points) labeled with corresponding ordered sets of n < ∞ continuously variable real numbers (coordinates) x¹, x², . . . , xⁿ. A coordinate transformation ("alias" point of view, see also Sec. 14.1-3) "admissible" in the sense of the following sections is a relabeling of each point with n new coordinates x̄¹, x̄², . . . , x̄ⁿ related to the original coordinates x¹, x², . . . , xⁿ by n transformation equations

    \bar{x}^k = \bar{x}^k(x^1, x^2, \ldots, x^n) \qquad (k = 1, 2, \ldots, n)        (16.1-1)

such that, throughout the region of points under consideration, (1) each function x̄^k(x¹, x², . . . , xⁿ) is single-valued and continuously differentiable, and (2) the Jacobian (Sec. 4.5-5) det [∂x̄^k/∂x^i] is different from zero.

The class of "admissible" transformations constitutes a group (Sec. 12.2-1) with respect to the operation of forming their "product," i.e., of applying two transformations successively (see also Sec. 12.2-8). The Jacobian of the product of two transformations is the product of the individual Jacobians. Each admissible transformation (1) has a unique inverse whose Jacobian is the reciprocal of the original Jacobian.
16.1-3. Description (Representation) of Abstract Functions by Components. Dummy (Umbral) Index Notation. Tensor analysis deals with abstract objects associated with the points (x¹, x², . . . , xⁿ) of an n-dimensional space (n < ∞, Sec. 14.1-2) as point functions (see also Sec. 5.4-1) defined on a region of the space. Each point function Q(x¹, x², . . . , xⁿ), say, will be described or represented by an ordered set of n^R < ∞ numerical functions (components of Q, see also Sec. 14.1-2) Q(j₁, j₂, . . . , j_R; x¹, x², . . . , xⁿ) of the coordinates x¹, x², . . . , xⁿ, where each index j₁, j₂, . . . , j_R runs from 1 to n. Depending on the type of object (Secs. 16.2-1 and 16.2-2), certain of the indices labeling each component are written as superscripts, and others as subscripts. Thus an object Q may be described by n^{r+s} components Q^{i_1 i_2 \cdots i_r}{}_{k_1 k_2 \cdots k_s}(x¹, x², . . . , xⁿ), where all indices run from 1 to n. With this notation, it is possible to abbreviate sets of sums like

    \sum_{k=1}^{n} A^i{}_k B^k = c^i \quad (i = 1, 2, \ldots, n) \qquad \text{by} \qquad A^i{}_k B^k = c^i

    \sum_{j=1}^{n} \sum_{k=1}^{n} A^i{}_j B^j{}_k C^k{}_h = D^i{}_h \quad (i, h = 1, 2, \ldots, n) \qquad \text{by} \qquad A^i{}_j B^j{}_k C^k{}_h = D^i{}_h

and also sums involving vectors, like

    \sum_{k=1}^{n} a^k \mathbf{e}_k = \mathbf{a} \qquad \text{by} \qquad a^k \mathbf{e}_k = \mathbf{a}

through the use of the following conventions.

1. Summation Convention. A summation from 1 to n is performed over every dummy (umbral) index appearing once as a superscript and once as a subscript. Any dummy index may be changed at will, since dummy indices are "canceled" by the summation (EXAMPLE: A^i{}_k B^k = A^i{}_j B^j = A^i{}_h B^h). Different summation indices should be used in the case of multiple summations.

2. Range Convention. All free indices appearing only as superscripts or only as subscripts are understood to run from 1 to n, so that an equation involving R free indices stands for n^R equations. The superscripts and subscripts on the two sides of any equation must match.

3. In derivatives like ∂a^i/∂x^k, k is considered as a subscript.

The dummy-index notation defined by the above conventions is used throughout the remainder of this chapter. Thus expressions like A^i{}_k B^k are understood to be sums, unless the contrary is explicitly indicated, as in (A^i{}_k B^k)_{no sum}.

In many applications, the dummy-index notation is considerably more powerful than the matrix notation employed in Chap. 13 (see Table 14.7-1 for a comparison of notations).
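The summation convention maps directly onto index-string notation in numerical code; `numpy.einsum` makes the correspondence explicit (an illustrative sketch, not part of the handbook text; the dimension and random data are arbitrary).

```python
import numpy as np

n = 3
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
C = rng.standard_normal((n, n))
b = rng.standard_normal(n)

# c^i = A^i_k b^k: the repeated dummy index k is summed, i is free
c = np.einsum('ik,k->i', A, b)

# D^i_h = A^i_j B^j_k C^k_h: dummy indices j, k summed; free indices i, h
D = np.einsum('ij,jk,kh->ih', A, B, C)
```

Both expressions reduce to ordinary matrix products, which provides an independent check.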
16.1-4. Schemes of Measurements and Induced Transformations. Invariants (see also Secs. 14.1-5 and 16.2-1). A scheme of measurements for a class (or a set of classes) of abstract point functions is a reference system (or a set of reference systems, one for each class of functions) for the (biunique) representation of each point function by a corresponding set of numerical components. Each function will be represented by the same number of components in all schemes of measurements considered. In physics, each component value usually corresponds to the result of a physical measurement.

It is useful to associate a definite scheme of measurements (x scheme of measurements) with each system of space coordinates x¹, x², . . . , xⁿ. The components of a point function Q(x¹, x², . . . , xⁿ) in the x scheme of measurements and the components of Q in the x̄ scheme of measurements are then related by an induced transformation (16.1-2) associated with (induced by) each admissible coordinate transformation (1).

A class C of abstract point functions Q(x¹, x², . . . , xⁿ) described in an x scheme of measurements by numerical components constitutes a class of invariants or geometrical objects (sometimes called tensors in the most general sense) if their mathematical properties can be defined in terms of (abstract) operations independent of the scheme of measurements (see also Secs. 12.1-1 and 16.4-1). Then the induced transformations (2) are related to the defining operations of C as follows:

1. The correspondence between any admissible coordinate transformation (1) and the corresponding induced transformation (2) is an isomorphism (Sec. 12.1-6) preserving the operation of forming the product of two transformations.
2. Every induced transformation is an isomorphism preserving the defining operations of C (see also Sec. 16.4-1). Defining operations involving two or more classes C₁, C₂, . . . of invariants (e.g., scalars and vectors) are also preserved by the induced transformations of C₁, C₂, . . . .

A class C of point functions may also be invariants only with respect to a subgroup of the group of admissible coordinate transformations (Sec. 12.2-8).

16.2. ABSOLUTE AND RELATIVE TENSORS
16.2-1. Definition of Absolute and Relative Tensors in Terms of Their Induced Transformation Laws (see also Table 16.2-1 and Sec. 6.3-3). Throughout this handbook (and in practically all applications), tensors are understood to be real absolute and relative tensors represented by real components. Each type of absolute or relative tensor is defined by a linear and homogeneous induced transformation law (Sec. 16.1-4) relating the tensor components in different schemes of measurements. Given a space of points (x¹, x², . . . , xⁿ) and a group of admissible coordinate transformations (16.1-1),

1. An (absolute) scalar (scalar invariant; absolute tensor of rank 0) α is an object represented in an x scheme of measurements by a function α(x¹, x², . . . , xⁿ) and in an x̄ scheme of measurements by a function ᾱ(x̄¹, x̄², . . . , x̄ⁿ) related to α(x¹, x², . . . , xⁿ) at each point by the induced transformation

    \bar{\alpha}(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^n) = \alpha(x^1, x^2, \ldots, x^n)        (16.2-1)

2. An (absolute) contravariant vector (absolute contravariant tensor of rank 1) a is an object represented in an x scheme of measurements by an ordered set of n functions (components)
Table 16.2-1. Definition of the Most Frequently Used Types of Absolute Tensor Quantities in Terms of Their Induced Transformation Laws (Sec. 16.2-1; refer to Sec. 16.1-3 for dummy-index notation)

Type of tensor quantity; components in x scheme of measurements; components in x̄ scheme of measurements:

Absolute scalar (scalar invariant), α (R = r = s = 0):
    α(x^1, x^2, . . . , x^n);    ᾱ(x̄^1, x̄^2, . . . , x̄^n) = α(x^1, x^2, . . . , x^n)

Absolute contravariant vector, a (R = r = 1, s = 0):
    a^i(x^1, x^2, . . . , x^n);    ā^k(x̄^1, x̄^2, . . . , x̄^n) = (∂x̄^k/∂x^i) a^i(x^1, x^2, . . . , x^n)

Absolute covariant vector, a (R = s = 1, r = 0):
    a_i(x^1, x^2, . . . , x^n);    ā_k(x̄^1, x̄^2, . . . , x̄^n) = (∂x^i/∂x̄^k) a_i(x^1, x^2, . . . , x^n)

Absolute contravariant tensor of rank 2, A (R = r = 2, s = 0):
    A^{ik}(x^1, x^2, . . . , x^n);    Ā^{ik}(x̄^1, x̄^2, . . . , x̄^n) = (∂x̄^i/∂x^j)(∂x̄^k/∂x^l) A^{jl}(x^1, x^2, . . . , x^n)

Absolute covariant tensor of rank 2, A (R = s = 2, r = 0):
    A_{ik}(x^1, x^2, . . . , x^n);    Ā_{ik}(x̄^1, x̄^2, . . . , x̄^n) = (∂x^j/∂x̄^i)(∂x^l/∂x̄^k) A_{jl}(x^1, x^2, . . . , x^n)

Absolute mixed tensor of rank 2, A (R = 2, r = s = 1):
    A^i_k(x^1, x^2, . . . , x^n);    Ā^i_k(x̄^1, x̄^2, . . . , x̄^n) = (∂x̄^i/∂x^j)(∂x^l/∂x̄^k) A^j_l(x^1, x^2, . . . , x^n)
a^i(x^1, x^2, . . . , x^n) and in an x̄ scheme of measurements by an ordered set of n components ā^k(x̄^1, x̄^2, . . . , x̄^n) related to the a^i(x^1, x^2, . . . , x^n) at each point by the induced transformation

    ā^k = (∂x̄^k/∂x^i) a^i    (16.2-2)
3. An (absolute) covariant vector (absolute covariant tensor of rank 1) a is an object represented in an x scheme of measurements by an ordered set of n functions (components) a_i(x^1, x^2, . . . , x^n) and in an x̄ scheme of measurements by an ordered set of n components ā_k(x̄^1, x̄^2, . . . , x̄^n) related to the a_i(x^1, x^2, . . . , x^n) at each point by the induced transformation

    ā_k = (∂x^i/∂x̄^k) a_i    (16.2-3)
4. An (absolute) tensor A of rank r + s, contravariant of rank r and covariant of rank s, is an object represented in an x scheme of measurements by an ordered set of n^{r+s} functions (components) A^{i_1 i_2 ... i_r}_{i_1' i_2' ... i_s'}(x^1, x^2, . . . , x^n) and in an x̄ scheme of measurements by an ordered set of n^{r+s} components Ā^{k_1 k_2 ... k_r}_{k_1' k_2' ... k_s'}(x̄^1, x̄^2, . . . , x̄^n) related to the A^{i_1 i_2 ... i_r}_{i_1' i_2' ... i_s'}(x^1, x^2, . . . , x^n) at each point by the induced transformation

    Ā^{k_1 k_2 ... k_r}_{k_1' k_2' ... k_s'} = (∂x̄^{k_1}/∂x^{i_1})(∂x̄^{k_2}/∂x^{i_2}) · · · (∂x̄^{k_r}/∂x^{i_r}) (∂x^{i_1'}/∂x̄^{k_1'}) · · · (∂x^{i_s'}/∂x̄^{k_s'}) A^{i_1 i_2 ... i_r}_{i_1' i_2' ... i_s'}    (16.2-4)
5. A relative tensor (pseudotensor) A of weight W and of rank r + s, contravariant of rank r and covariant of rank s, is an object represented in an x scheme of measurements by n^{r+s} functions (components) A^{i_1 i_2 ... i_r}_{i_1' i_2' ... i_s'}(x^1, x^2, . . . , x^n) and in an x̄ scheme of measurements by an ordered set of n^{r+s} components Ā^{k_1 ... k_r}_{k_1' ... k_s'}(x̄^1, x̄^2, . . . , x̄^n) related to the A^{i_1 ... i_r}_{i_1' ... i_s'}(x^1, x^2, . . . , x^n) at each point by the induced transformation

    Ā^{k_1 ... k_r}_{k_1' ... k_s'} = [∂(x^1, x^2, . . . , x^n)/∂(x̄^1, x̄^2, . . . , x̄^n)]^W (∂x̄^{k_1}/∂x^{i_1}) · · · (∂x̄^{k_r}/∂x^{i_r}) (∂x^{i_1'}/∂x̄^{k_1'}) · · · (∂x^{i_s'}/∂x̄^{k_s'}) A^{i_1 ... i_r}_{i_1' ... i_s'}    (16.2-5)

where W is a real integer. Relative tensors are called densities for W = 1 and capacities for W = −1 (EXAMPLES: volume and surface elements are scalar and vector capacities; see also Secs. 6.2-3b and 17.3-3c).
The defining transformation (5) includes Eqs. (1) to (4) as special cases (W = 0 for absolute tensors). A tensor represented by A^{i_1 ... i_r}_{i_1' ... i_s'} is a mixed tensor if and only if neither r nor s equals zero. The induced transformation (5) characterizing every absolute or relative tensor quantity is linear and homogeneous in the tensor components.
The corresponding inverse transformation is

    A^{i_1 ... i_r}_{i_1' ... i_s'} = [∂(x̄^1, x̄^2, . . . , x̄^n)/∂(x^1, x^2, . . . , x^n)]^W (∂x^{i_1}/∂x̄^{k_1}) · · · (∂x^{i_r}/∂x̄^{k_r}) (∂x̄^{k_1'}/∂x^{i_1'}) · · · (∂x̄^{k_s'}/∂x^{i_s'}) Ā^{k_1 ... k_r}_{k_1' ... k_s'}    (16.2-6)
Note: Relative ordering of superscripts and subscripts is frequently used to conserve symbols. Thus A_i{}^k and A^k{}_i denote different sets of components (see also Secs. 16.7-2 and 16.9-1).
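A minimal numerical sketch (not from the handbook, assuming NumPy) of the induced transformation law of Table 16.2-1 for an absolute mixed tensor of rank 2; the matrix J stands for the Jacobian [∂x̄^k/∂x^i] evaluated at one point, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
J = rng.standard_normal((n, n)) + n * np.eye(n)   # nonsingular Jacobian ∂x̄/∂x at a point
Jinv = np.linalg.inv(J)                           # ∂x/∂x̄

A = rng.standard_normal((n, n))                   # components A^i_k in the x scheme

# Mixed-tensor law: Ā^i_k = (∂x̄^i/∂x^j)(∂x^l/∂x̄^k) A^j_l
A_bar = np.einsum('ij,lk,jl->ik', J, Jinv, A)

# The contraction A^i_i is an absolute scalar (Sec. 16.3-5): it is invariant.
assert np.isclose(np.trace(A_bar), np.trace(A))

# Applying the inverse transformation (16.2-6) restores the components.
A_back = np.einsum('ij,lk,jl->ik', Jinv, J, A_bar)
assert np.allclose(A_back, A)
```

The check on the trace illustrates why a linear, homogeneous transformation law makes component equations coordinate-independent (Sec. 16.4-1).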
16.2-2. Infinitesimal Displacement. Gradient of an Absolute Scalar (see also Secs. 5.5-2, 5.7-1, 6.2-3, and 16.10-7). The coordinate differentials dx^i represent an absolute contravariant vector, the infinitesimal displacement dr. Given a suitably differentiable absolute scalar α, the components ∂α/∂x^i represent an absolute covariant vector called the gradient ∇α of α. A given absolute covariant vector described by a_i is the gradient of an absolute scalar if and only if

    ∂a_i/∂x^k − ∂a_k/∂x^i = 0

for all i, k.
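An illustrative numerical check (not from the handbook) that the gradient components transform covariantly, ∂α/∂x̄^k = (∂x^i/∂x̄^k) ∂α/∂x^i, for the example scalar α = x²y under the polar substitution x = r cos θ, y = r sin θ; all names are assumptions of this sketch.

```python
import numpy as np

def alpha(x, y):                 # an absolute scalar field (example choice)
    return x**2 * y

r, th = 1.7, 0.6
x, y = r * np.cos(th), r * np.sin(th)

# Gradient in Cartesian coordinates: ∂α/∂x = 2xy, ∂α/∂y = x².
grad_cart = np.array([2 * x * y, x**2])

# Jacobian matrix [∂x^i/∂x̄^k] with x̄ = (r, θ).
Jac = np.array([[np.cos(th), -r * np.sin(th)],
                [np.sin(th),  r * np.cos(th)]])

# Covariant transformation law (16.2-3) applied to the gradient components.
grad_polar = Jac.T @ grad_cart

# Compare with direct central-difference derivatives of α(r, θ).
h = 1e-6
f = lambda r_, th_: alpha(r_ * np.cos(th_), r_ * np.sin(th_))
grad_fd = np.array([(f(r + h, th) - f(r - h, th)) / (2 * h),
                    (f(r, th + h) - f(r, th - h)) / (2 * h)])
assert np.allclose(grad_polar, grad_fd, atol=1e-5)
```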
16.3. TENSOR ALGEBRA: DEFINITION OF BASIC OPERATIONS
16.3-1. Equality of Tensors. Two tensors A and B of the same type, rank, and weight are equal (A = B) at the point (x^1, x^2, . . . , x^n) if and only if their corresponding components in any one scheme of measurements are equal at this point:

    A^{i_1 ... i_r}_{i_1' ... i_s'}(x^1, x^2, . . . , x^n) = B^{i_1 ... i_r}_{i_1' ... i_s'}(x^1, x^2, . . . , x^n)    (tensor equality)    (16.3-1)

Corresponding components of A and B are then equal in every scheme of measurements (see also Sec. 16.4-1). Tensor equality is symmetric, reflexive, and transitive (Sec. 12.1-3).

Note: Relations between the "values" of tensor point functions at different points are not defined in ordinary tensor algebra. Some such relations are discussed, for the special case of tensors defined on Riemann spaces, in Sec. 16.10-9.
16.3-2. Null Tensor. The null tensor 0 of any given type, rank, and weight is the tensor whose components in any one scheme of measurements are all equal to zero. Thus A = 0 at the point (x^1, x^2, . . . , x^n) if and only if

    A^{i_1 ... i_r}_{i_1' ... i_s'}(x^1, x^2, . . . , x^n) = 0    (null tensor)    (16.3-2)
All components of A are then equal to zero in every scheme of measurements.

16.3-3. Tensor Addition. Given a suitable class of tensors all of the same type, rank, and weight, the sum C = A + B of two tensors A and B is the tensor described in any one scheme of measurements (and hence in all schemes of measurements) by the sums of corresponding components of A and B:

    C^{i_1 ... i_r}_{i_1' ... i_s'} = A^{i_1 ... i_r}_{i_1' ... i_s'} + B^{i_1 ... i_r}_{i_1' ... i_s'}    (tensor addition)    (16.3-3)

A + B is of the same rank, type, and weight as A and B. Tensor addition is commutative and associative.
16.3-4. Multiplication of a Tensor by an Absolute Scalar. The product B = αA of a tensor A and an absolute scalar α is the tensor represented in every scheme of measurements by the products of the components of A and the scalar α:

    B^{i_1 ... i_r}_{i_1' ... i_s'} = αA^{i_1 ... i_r}_{i_1' ... i_s'}    (multiplication by scalars)    (16.3-4)

αA is of the same rank, type, and weight as A. Multiplication by scalars is commutative, associative, and distributive with respect to both tensor and scalar addition. In particular, (−1)A = −A is the negative (additive inverse) of A, with A − A = 0.

16.3-5. Contraction of a Mixed Tensor. One may contract a mixed tensor A described by A^{i_1 ... i_r}_{i_1' ... i_s'} by equating a superscript to a subscript and summing over the pair. The resulting n^{r+s−2} sums describe a tensor of the same weight as A, contravariant of rank r − 1 and covariant of rank s − 1. In general, a mixed tensor can be contracted in more than one way and/or more than once.

EXAMPLE: An absolute or relative mixed tensor A of rank 2 described by A^i_k may be contracted to form the absolute or relative scalar (trace of A)

    A^i_i = A^1_1 + A^2_2 + · · · + A^n_n
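A brief illustrative sketch (assuming NumPy; not part of the handbook) of contraction via np.einsum: contracting a rank-3 tensor A^{ik}_l over the pair i, l leaves a rank-1 tensor, and the rank-2 case reduces to the trace.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n, n))     # components A^{ik}_l, axes ordered (i, k, l)

b = np.einsum('iki->k', A)             # b^k = A^{ik}_i : sum over the index pair

# The same contraction written as an explicit sum:
b_explicit = np.array([sum(A[i, k, i] for i in range(n)) for k in range(n)])
assert np.allclose(b, b_explicit)

# Rank-2 special case: contracting A^i_k gives the trace A^i_i of the example above.
M = rng.standard_normal((n, n))
assert np.isclose(np.einsum('ii->', M), np.trace(M))
```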
16.3-6. (Outer) Product of Two Tensors (see also Sec. 12.7-3). The (outer) product C = AB of two tensors A and B, respectively of weight W and W′ and represented by A^{i_1 ... i_r}_{i_1' ... i_s'} and B^{k_1 ... k_p}_{k_1' ... k_q'}, is the tensor described by

    C^{i_1 ... i_r k_1 ... k_p}_{i_1' ... i_s' k_1' ... k_q'} = A^{i_1 ... i_r}_{i_1' ... i_s'} B^{k_1 ... k_p}_{k_1' ... k_q'}    (outer multiplication)    (16.3-5)

AB is contravariant of rank r + p and covariant of rank s + q and is of weight W + W′. Outer multiplication is associative; it is distributive with respect to tensor addition. It is not in general commutative, since the relative order of the indices in Eq. (5) must be observed.

EXAMPLES: a^i b^k = A^{ik}; a^i b_k = A^i_k; a_i b_k = A_{ik}. The product (4) is a special case of an outer product.
16-3"7
TENSOR ANALYSIS
542
16.3-7. Inner Products. If the outer product of two tensors A and B, described by Eq. (5), can be contracted (Sec. 16.3-5) so that one or more superscripts of A^{i_1 ... i_r}_{i_1' ... i_s'} are summed against one or more subscripts of B^{k_1 ... k_p}_{k_1' ... k_q'}, and/or conversely, the resulting sums represent an inner product of the tensors A and B. In general, several such inner products can be formed.

Every inner product of two tensors A and B is a tensor of the same weight as AB. The rank of the inner product is equal to that of AB diminished by twice the number of index pairs summed. Inner multiplication is distributive with respect to tensor addition.

EXAMPLES: a^i b_i = γ; A^i_k a^k = c^i; A^i_k a_i = d_k; B^{ik} b_i = f^k; C_{ik} a^k = h_i.
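An illustrative sketch (assuming NumPy; the data are arbitrary) of outer products (16.3-5) and inner products (Sec. 16.3-7) written with np.einsum, including the rank bookkeeping: one contraction lowers the rank of AB by two.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3
a = rng.standard_normal(n)        # a^i
b = rng.standard_normal(n)        # b_k
A = rng.standard_normal((n, n))   # A^i_k

# Outer product: a^i b_k = C^i_k (rank 2).
C = np.einsum('i,k->ik', a, b)
assert np.allclose(C, np.outer(a, b))

# Inner products: contract the outer product over a superscript/subscript pair.
gamma = np.einsum('i,i->', a, b)          # a^i b_i, an invariant scalar
c = np.einsum('ik,k->i', A, a)            # A^i_k a^k, a contravariant vector
assert np.isclose(gamma, a @ b)
assert np.allclose(c, A @ a)

# Rank bookkeeping: rank(AB) = 3; one contraction lowers the rank by 2.
assert np.einsum('ik,l->ikl', A, b).shape == (n, n, n)
assert np.einsum('ik,i->k', A, b).shape == (n,)
```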
16.3-8. Indirect Tests for Tensor Character. Let Q be an object described in an x scheme of measurements by n^R components Q(j_1, j_2, . . . , j_R)(x^1, x^2, . . . , x^n), and let X be a tensor described by X^{i_1 ... i_r}_{i_1' ... i_s'}(x^1, x^2, . . . , x^n). The outer product QX is defined, as in Sec. 16.3-6, by the n^{R+r+s} components Q(j_1, j_2, . . . , j_R)X^{i_1 ... i_r}_{i_1' ... i_s'}. Inner products of Q and X are represented by sums formed through contraction (Sec. 16.3-5) of QX, so that one or more indices of Q(j_1, j_2, . . . , j_R) are summed against one or more superscripts and/or subscripts of X^{i_1 ... i_r}_{i_1' ... i_s'}.

The object Q is a tensor if and only if the outer product QX, or a given type of inner product of Q and X, is a tensor Y of fixed rank, type, and weight for any arbitrary tensor X of fixed rank, type, and weight.

An analogous theorem results if X is, instead, the outer product of R distinct arbitrary vectors of fixed types and weights. In either case one infers the rank and type of Q by matching superscripts and/or subscripts. The weight of Q must be the difference of the respective weights of Y and X.
EXAMPLES: (1) Q is an absolute tensor contravariant of rank r and covariant of rank s if, for every absolute vector a represented by a^{j_{r+s}},

    Σ_{j_{r+s}=1}^{n} Q(j_1, j_2, . . . , j_{r+s}) a^{j_{r+s}}

where the components on the right describe an absolute tensor depending on a.

(2) Q is an absolute tensor contravariant of rank r and covariant of rank s if, for every absolute tensor A represented by A^{j_{r+1} ... j_{r+s}}_{j_1 ... j_r},

    Σ_{j_1=1}^{n} Σ_{j_2=1}^{n} · · · Σ_{j_{r+s}=1}^{n} Q(j_1, j_2, . . . , j_{r+s}) A^{j_{r+1} ... j_{r+s}}_{j_1 ... j_r} = α

where α represents an absolute scalar depending on A.
Note: An object Q described by n^2 components Q_{ik} is an absolute covariant tensor of rank 2 if and only if

    Σ_{i=1}^{n} Σ_{k=1}^{n} Q_{ik} a^i a^k = α

represents a scalar invariant for every absolute contravariant vector a described by a^i, and Q_{ik} = Q_{ki}.
16.4. TENSOR ALGEBRA: INVARIANCE OF TENSOR EQUATIONS
16.4-1. Invariance of Tensor Equations. For each admissible coordinate transformation (16.1-1), the induced transformation laws used to define tensor components preserve the results of tensor addition, contraction, and outer (and hence also inner) multiplication, as well as tensor equality. Every relation between tensors expressible in terms of combinations of such operations (including convergent limiting processes) is invariant with respect to the group of admissible coordinate transformations. If the relation applies to tensor components in any one scheme of measurements, it holds in all schemes of measurements (see also Secs. 12.1-5 and 16.1-4).

EXAMPLE: A^{i_1 ... i_r}_{i_1' ... i_s'} + B^{i_1 ... i_r}_{i_1' ... i_s'} = C^{i_1 ... i_r}_{i_1' ... i_s'} implies Ā^{k_1 ... k_r}_{k_1' ... k_s'} + B̄^{k_1 ... k_r}_{k_1' ... k_s'} = C̄^{k_1 ... k_r}_{k_1' ... k_s'}, and conversely; this relation may be symbolized by the abstract equation "A + B = C."

One may, then, speak of tensors and tensor operations without reference to a specific scheme of measurements. Each suitable class of tensor point functions is a class of invariants and constitutes an abstract model definable (to within an isomorphism, Sec. 12.1-6) in terms of mathematical operations without reference to components (see also Secs. 12.1-1 and 16.1-4). Thus suitable classes of absolute tensors of rank 0, 1, and 2, respectively, constitute scalar fields (Sec. 12.3-1c), vector spaces (Sec. 12.4-1), and linear algebras (Sec. 12.4-2; see also Sec. 16.9-2). Classes of tensors of rank 2, 3, . . . may be built up as direct products of vector spaces (Sec. 12.7-3; see also Sec. 16.6-1c).

16.5. SYMMETRIC AND SKEW-SYMMETRIC TENSORS
16.5-1. Symmetry and Skew-symmetry. An object Q described by n^R components Q(j_1, j_2, . . . , j_R), each labeled with an ordered set of R indices j_1, j_2, . . . , j_R, is

1. Symmetric with respect to any pair of indices, say j_1 and j_2, if and only if

    Q(j_1 = i, j_2 = k, j_3, . . . , j_R) = Q(j_1 = k, j_2 = i, j_3, . . . , j_R)

2. Skew-symmetric (antisymmetric) with respect to any pair of indices, say j_1 and j_2, if and only if

    Q(j_1 = i, j_2 = k, j_3, . . . , j_R) = −Q(j_1 = k, j_2 = i, j_3, . . . , j_R)

for all sets of values of i, k, j_3, . . . , j_R, each running from 1 to n.

Q is (completely) symmetric or (completely) skew-symmetric with respect to a set of indices if and only if Q is, respectively, symmetric or skew-symmetric with respect to every pair of indices of the set. The symmetry or skew-symmetry of an absolute or relative tensor with respect to any pair of superscripts or any pair of subscripts is invariant with respect to the group of admissible coordinate transformations.

16.5-2. Kronecker Deltas. The (generalized) Kronecker delta of rank 2r is the absolute tensor represented by n^{2r} components δ^{i_1 i_2 ... i_r}_{k_1 k_2 ... k_r} defined as follows:

1. δ^{i_1 i_2 ... i_r}_{k_1 k_2 ... k_r} = +1 or −1 if all superscripts i_1, i_2, . . . , i_r are different, and the ordered set of subscripts k_1, k_2, . . . , k_r is obtained, respectively, by an even or odd permutation (even or odd number of transpositions) of the ordered set i_1, i_2, . . . , i_r

2. δ^{i_1 i_2 ... i_r}_{k_1 k_2 ... k_r} = 0 for all other combinations of superscripts and subscripts
Of particular interest is the Kronecker delta of rank 2 described by

    δ^i_k = 0 if i ≠ k;    δ^i_k = 1 if i = k    (16.5-1)

Contraction of any mixed tensor A by summation over a superscript i and a subscript i′ (Sec. 16.3-5) is equivalent to inner multiplication (Sec. 16.3-7) of A by δ^{i′}_i. Each Kronecker delta is completely skew-symmetric (Sec. 16.5-1) with respect to both the set of superscripts and the set of subscripts. Kronecker deltas of rank 2r > 2n are zero.
If A^{i_1 i_2 ... i_r}_{i_1' ... i_s'} is symmetric with respect to any pair of superscripts, then

    δ^{k_1 k_2 ... k_r}_{i_1 i_2 ... i_r} A^{i_1 i_2 ... i_r}_{i_1' ... i_s'} = 0    (16.5-2)

If A^{i_1 ... i_r}_{i_1' i_2' ... i_s'} is symmetric with respect to any pair of subscripts, then

    δ^{i_1' i_2' ... i_s'}_{k_1' k_2' ... k_s'} A^{i_1 ... i_r}_{i_1' i_2' ... i_s'} = 0    (16.5-3)

Note also

    δ^{i_1 i_2 ... i_r}_{k_1 k_2 ... k_r} = det [δ^{i_a}_{k_b}]    (a, b = 1, 2, . . . , r)    (16.5-4)

    δ^{i_1 ... i_r j_{r+1} ... j_n}_{k_1 ... k_r j_{r+1} ... j_n} = (n − r)! δ^{i_1 ... i_r}_{k_1 ... k_r}    (16.5-5)

    ∂x^i/∂x^k = δ^i_k    (16.5-6)
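An illustrative sketch (not from the handbook): the generalized Kronecker delta of Sec. 16.5-2 can be computed as the determinant of ordinary deltas, det[δ^{i_a}_{k_b}], which reproduces the sign-of-permutation definition; the function name is an assumption of this example.

```python
import numpy as np

def gen_delta(i_sup, k_sub):
    """delta^{i1...ir}_{k1...kr} as det of the r x r matrix of ordinary deltas."""
    M = np.array([[1.0 if i == k else 0.0 for k in k_sub] for i in i_sup])
    return round(np.linalg.det(M))

# +1 for an even permutation of the superscripts, -1 for an odd one:
assert gen_delta((1, 2, 3), (1, 2, 3)) == 1
assert gen_delta((1, 2, 3), (2, 1, 3)) == -1
assert gen_delta((1, 2, 3), (2, 3, 1)) == 1      # two transpositions: even
# 0 when the subscripts are not a permutation of the superscripts:
assert gen_delta((1, 2, 3), (1, 1, 3)) == 0
# Complete skew-symmetry (Sec. 16.5-1): swapping two subscripts flips the sign.
assert gen_delta((1, 2, 3), (3, 2, 1)) == -gen_delta((1, 2, 3), (1, 2, 3))
```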
16.5-3. Permutation Symbols (see also Sec. 16.7-2c). The permutation symbols

    ε^{i_1 i_2 ... i_n} = δ^{i_1 i_2 ... i_n}_{1 2 ... n}    and    ε_{i_1 i_2 ... i_n} = δ^{1 2 ... n}_{i_1 i_2 ... i_n}    (16.5-7)

represent completely skew-symmetric relative tensors of rank n and of weight +1 and −1, respectively. Note that

1. ε^{i_1 i_2 ... i_n} = ε_{i_1 i_2 ... i_n} = 0 if two or more of the indices i_1, i_2, . . . , i_n are equal.
2. ε^{i_1 i_2 ... i_n} = ε_{i_1 i_2 ... i_n} = +1 if the ordered set i_1, i_2, . . . , i_n is obtained by an even permutation of the set 1, 2, . . . , n.

3. ε^{i_1 i_2 ... i_n} = ε_{i_1 i_2 ... i_n} = −1 if the ordered set i_1, i_2, . . . , i_n is obtained by an odd permutation of the set 1, 2, . . . , n.

And note

    ε_{i_1 i_2 ... i_r i_{r+1} ... i_n} ε^{k_1 k_2 ... k_r i_{r+1} ... i_n} = (n − r)! δ^{k_1 k_2 ... k_r}_{i_1 i_2 ... i_r}    (16.5-8)

    ε_{i_1 i_2 ... i_n} det [A^i_k] = ε_{k_1 k_2 ... k_n} A^{k_1}_{i_1} A^{k_2}_{i_2} · · · A^{k_n}_{i_n}    (16.5-9)

    det [A^i_k] = ε^{i_1 i_2 ... i_n} A^1_{i_1} A^2_{i_2} · · · A^n_{i_n}    (16.5-10)
16.5-4. Alternating Product of Two Vectors (see also Secs. 16.8-4 and 16.10-6). The alternating product (sometimes called bivector) of two contravariant vectors and the alternating product of two covariant vectors are skew-symmetric tensors of rank two respectively represented by

    V^{ij} = a^i b^j − a^j b^i    and    V_{ij} = a_i b_j − a_j b_i    (16.5-11)

The weight of the alternating product is the sum of the weights of the two factors.

16.6. LOCAL SYSTEMS OF BASE VECTORS
16.6-1. Representation of Vectors and Tensors in Terms of Local Base Vectors. (a) Given an x scheme of measurements, each (absolute or relative) contravariant vector described by a^i(x^1, x^2, . . . , x^n) may be represented as an invariant linear form

    a = a^i(x^1, x^2, . . . , x^n) e_i(x^1, x^2, . . . , x^n)    (16.6-1)

(see Sec. 16.1-3 for umbral-index notation) in the n absolute contravariant local base vectors (Sec. 14.2-3) e_1(x^1, x^2, . . . , x^n), e_2(x^1, x^2, . . . , x^n), . . . , e_n(x^1, x^2, . . . , x^n) associated with the x scheme of measurements. The ith base vector e_i has the components δ^1_i, δ^2_i, . . . , δ^n_i.

(b) Similarly, each (absolute or relative) covariant vector b described by b_i(x^1, x^2, . . . , x^n) may be represented in the form

    b = b_i(x^1, x^2, . . . , x^n) e^i(x^1, x^2, . . . , x^n)    (16.6-2)

in terms of the n absolute covariant local base vectors e^1(x^1, x^2, . . . , x^n), e^2(x^1, x^2, . . . , x^n), . . . , e^n(x^1, x^2, . . . , x^n) associated with the x scheme of measurements. The ith base vector e^i has the components δ^i_1, δ^i_2, . . . , δ^i_n. Note that the vectors (1) and the vectors (2) will, in general, belong to different vector spaces.
(c) Every absolute or relative tensor A described by A^{i_1 ... i_r}_{i_1' ... i_s'} may be represented as an invariant form

    A = A^{i_1 i_2 ... i_r}_{i_1' i_2' ... i_s'} e_{i_1} e_{i_2} · · · e_{i_r} e^{i_1'} e^{i_2'} · · · e^{i_s'}    (16.6-3)

in the local base vectors e_i and e^i.

16.6-2. Relations between Local Base Vectors Associated with Different Schemes of Measurements. The 2n local base vectors
e_i(x^1, x^2, . . . , x^n) and e^i(x^1, x^2, . . . , x^n) may be thought of as defining the x scheme of measurements (Sec. 16.1-4) in invariant language. New local base vectors ē_k(x̄^1, x̄^2, . . . , x̄^n) and ē^k(x̄^1, x̄^2, . . . , x̄^n) associated with an x̄ scheme of measurements have the components δ^i_k and δ^k_i, respectively, in the x̄ scheme of measurements and are related to their respective counterparts associated with the x scheme of measurements as follows:

    ē_k(x̄^1, x̄^2, . . . , x̄^n) = (∂x^i/∂x̄^k) e_i(x^1, x^2, . . . , x^n)
    ē^k(x̄^1, x̄^2, . . . , x̄^n) = (∂x̄^k/∂x^i) e^i(x^1, x^2, . . . , x^n)    (16.6-4)

Note that ē_i(x̄^1, x̄^2, . . . , x̄^n) and e_i(x^1, x^2, . . . , x^n) are, in general, different vectors (of the same vector space), not different descriptions of the same vector. The base vectors e_i transform formally like (cogrediently with) absolute covariant vector components (Sec. 16.2-1), whereas the e^i transform like absolute contravariant vector components. Absolute contravariant and covariant vector components a^i and b_i (and hence contravariant and covariant base vectors e_i and e^i) transform contragrediently, so that inner products like a^i b_i are invariant (see also Sec. 16.4-1).

16.7. TENSORS DEFINED ON RIEMANN SPACES. ASSOCIATED TENSORS
16.7-1. Riemann Space and Fundamental Tensors. Riemann spaces permit the definition of scalar products of vectors in such a manner that the resulting definitions of distances and angles (Sec. 16.8-1; see also Sec. 14.2-7) lead to useful generalizations of Euclidean geometry (see also Secs. 17.4-1 to 17.4-7). A finite-dimensional space of points labeled by ordered sets of real* coordinates x^1, x^2, . . . , x^n is a Riemann space if it is possible to define an absolute covariant tensor of rank 2 (Sec. 16.2-1) described (in an x scheme of measurements) by components g_{ik}(x^1, x^2, . . . , x^n) having the following (invariant) properties throughout the region under consideration:

* The theory presented in Secs. 16.7-1 to 16.10-11 applies to vectors and tensors with real components defined on Riemann spaces described by real coordinates. The theory also applies to the Riemann spaces considered in relativity theory, where the introduction of an imaginary coordinate (Sec. 17.4-6) is essentially a notational convenience.
1. Each g_{ik}(x^1, x^2, . . . , x^n) is a real single-valued function of the coordinates and possesses continuous partial derivatives.
2. g_{ik}(x^1, x^2, . . . , x^n) = g_{ki}(x^1, x^2, . . . , x^n)
3. g = g(x^1, x^2, . . . , x^n) = det [g_{ik}(x^1, x^2, . . . , x^n)] ≠ 0
The matrix [g_{ik}(x^1, x^2, . . . , x^n)] is frequently, but not necessarily, positive definite (Sec. 13.5-2); the indefinite case is of interest in relativity theory (see also Secs. 16.8-1 and 17.4-6).
The metric tensor (see also Sec. 17.4-2) described by the g_{ik}(x^1, x^2, . . . , x^n) and the absolute symmetric tensor of rank 2 (conjugate or associated metric tensor) whose components g^{ik}(x^1, x^2, . . . , x^n) are defined by

    g^{ik} g_{kj} = δ^i_j    or    g^{ik} = G_{ik}/g    (16.7-1)

where G_{ik} (= G_{ki}) is the cofactor (Sec. 1.5-2) of g_{ik} in the determinant det [g_{ik}], are the fundamental tensors of the Riemann space. The components of either fundamental tensor define the element of distance ds and the entire intrinsic differential geometry of the Riemann space (Secs. 17.4-1 to 17.4-7).

A system of coordinates x^1, x^2, . . . , x^n is, respectively, right-handed or left-handed if the scalar density √|g(x^1, x^2, . . . , x^n)| is taken as positive or negative; an arbitrary choice of sign for any one coordinate system defines every admissible coordinate system as either right-handed or left-handed, since

    √|ḡ(x̄^1, x̄^2, . . . , x̄^n)| = √|g(x^1, x^2, . . . , x^n)| ∂(x^1, x^2, . . . , x^n)/∂(x̄^1, x̄^2, . . . , x̄^n)

(see also Secs. 6.2-3b and 6.4-3c).
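A minimal numerical sketch (not from the handbook, assuming NumPy) of the fundamental tensors (16.7-1) for spherical coordinates x^1 = r, x^2 = θ, x^3 = φ in Euclidean 3-space, where g_{ik} = diag(1, r², r² sin²θ); the conjugate metric is the matrix inverse, and √g = r² sin θ is the familiar volume-element density.

```python
import numpy as np

r, th = 2.0, 0.8
g_lo = np.diag([1.0, r**2, (r * np.sin(th))**2])   # g_ik at one point
g_up = np.linalg.inv(g_lo)                         # conjugate metric g^ik

# g^ik g_kj = delta^i_j  (Eq. 16.7-1)
assert np.allclose(g_up @ g_lo, np.eye(3))

# det[g_ik] = r^4 sin^2(th) > 0, so sqrt(g) = r^2 sin(th).
g = np.linalg.det(g_lo)
assert np.isclose(g, r**4 * np.sin(th)**2)
assert np.isclose(np.sqrt(g), r**2 * np.sin(th))
```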
16.7-2. Associated Tensors.* Raising and Lowering of Indices. An (absolute or relative) contravariant vector represented by a^i and a covariant vector represented by a_k, defined on a Riemann space and related at every point (x^1, x^2, . . . , x^n) by

    a_k = g_{ik} a^i    and hence    a^i = g^{ik} a_k    (16.7-2)

are called associated vectors. More generally, an associated tensor of a given tensor described by A^{i_1 ... i_r}_{k_1 ... k_s} is obtained by raising a subscript k through inner multiplication by g^{ki}, or by lowering a superscript i through inner multiplication by g_{ik}, or by any combination of such operations. A tensor of rank greater than one has more than one associated tensor. Since it is desirable to denote the components of all

* See also footnote to Sec. 13.3-1.
tensors associated with a given tensor A by the same symbol A, it is necessary to order superscripts and subscripts with respect to each other (see also Sec. 16.2-1). Thus the result of raising the subscript k_2 in A^{i_1 i_2 ... i_r}_{k_1 k_2 ... k_s} is denoted by

    A^{i_1 i_2 ... i_r k_2}_{k_1 · k_3 ... k_s} = g^{k_2 i} A^{i_1 i_2 ... i_r}_{k_1 i k_3 ... k_s}    (16.7-3)

Raising of previously lowered indices and/or lowering of previously raised indices restores the original tensor components.

Note: The contravariant and covariant permutation symbols (Sec. 16.5-3) are not associated relative tensors but are related by

    ε_{i_1 i_2 ... i_n} = (1/g) g_{i_1 k_1} g_{i_2 k_2} · · · g_{i_n k_n} ε^{k_1 k_2 ... k_n}
    ε^{k_1 k_2 ... k_n} = g g^{k_1 i_1} g^{k_2 i_2} · · · g^{k_n i_n} ε_{i_1 i_2 ... i_n}    (16.7-4)
16.7-3. Equivalence of Associated Tensors. The correspondence between associated tensors defined on a Riemann space is an equivalence relation partitioning the class of all tensors (Sec. 12.1-3b). The components of any associated tensor of a tensor A defined on a Riemann space are, then, considered as a different description (representation) of the same tensor A (see also Sec. 16.9-1).

In particular, vector components a^k and a_i related by Eq. (2) are interpreted as the contravariant and covariant representations of the same vector a in the x scheme of measurements used. In the notation of Sec. 16.6-1,

    a = a^k e_k = a_i e^i    (16.7-5)

so that the base vectors e_1, e_2, . . . , e_n and e^1, e^2, . . . , e^n defined on a Riemann space may be regarded as (reciprocal) bases of the same vector space (see also Sec. 16.8-2). They are related formally like associated vector components:

    e^k = g^{ik} e_i    e_i = g_{ik} e^k    (16.7-6)

Substitution of the appropriate expression (6) for some e^i or e_i in the expansion (16.6-3) of any tensor A corresponds, respectively, to raising or lowering the index in question (see also Sec. 16.9-1).
16.7-4. Operations with Tensors Defined on Riemann Spaces. If the tensors in question are defined on a Riemann space,

1. Any two tensors having the same rank and weight may be added in the manner of Sec. 16.3-3, after their components have been reduced to the same configuration of superscripts and subscripts through raising and/or lowering of indices.
2. A tensor may be contracted over any pair of indices in the manner of Sec. 16.3-5, after one of the indices has been appropriately raised or lowered. Contraction over two superscripts i, k corresponds, then, to inner multiplication by g_{ik}; contraction over two subscripts i, k corresponds to inner multiplication by g^{ik}.
3. Inner products of two tensors A and B are defined as contractions of their outer product AB over an index (or indices) of A and a corresponding index (or indices) of B as in (2) above.

16.8. SCALAR PRODUCTS AND RELATED TOPICS
16.8-1. Scalar Product (Inner Product) of Two Vectors Defined on a Riemann Space. In accordance with Sec. 16.7-4, it is possible to define the scalar product (inner product, see also Secs. 5.2-6, 6.4-2a, and 14.2-6) a · b of any two absolute or relative vectors a and b represented by (real) components a^i or a_k and b^i or b_k:

    a · b = g_{ik}(x^1, x^2, . . . , x^n) a^i b^k = a^k b_k = a_i b^i = g^{ik}(x^1, x^2, . . . , x^n) a_i b_k = b · a    (16.8-1)

The magnitude (norm, absolute value; Sec. 14.2-5) |a| of an absolute or relative vector a described by (real) components a^i or a_k is the absolute or relative scalar invariant

    |a| = +√|a²|    a² = a · a = g_{ik} a^i a^k = a^i a_i = g^{ik} a_i a_k    (16.8-2)

A unit vector is an absolute vector of magnitude one. The cosine of the angle γ between two absolute or relative vectors a and b is the absolute scalar invariant

    cos γ = a · b/(|a| |b|)    (16.8-3)

Equations (2) and (3) imply the elementary definition (5.2-5) of the scalar product.

Note: If the quadratic form g_{ik} a^i a^k is indefinite (indefinite metric, see also Sec. 17.4-4) at a point (x^1, x^2, . . . , x^n), the square a · a of an absolute or relative vector a represented by components a^i at that point is positive, negative, or zero depending on the sign of g_{ik} a^i a^k, and |a| = 0 does not necessarily imply a = 0.
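A minimal numerical sketch (not from the handbook, assuming NumPy) of the definitions (16.8-1) to (16.8-3) in polar coordinates, g_{ik} = diag(1, r²); the vectors are given by contravariant components at a point with r = 2.

```python
import numpy as np

r = 2.0
g = np.diag([1.0, r**2])           # g_ik at the point

a = np.array([1.0, 0.0])           # a^i: unit step in the r direction
b = np.array([0.0, 1.0])           # b^i: unit coordinate step in theta

dot = lambda u, v: u @ g @ v       # a.b = g_ik a^i b^k  (16.8-1)
norm = lambda u: np.sqrt(dot(u, u))

assert np.isclose(dot(a, b), 0.0)  # the coordinate system is orthogonal
assert np.isclose(norm(a), 1.0)
assert np.isclose(norm(b), r)      # |e_2| = sqrt(g_22) = r

# Angle (16.8-3) between a and a + b:
cos_gamma = dot(a, a + b) / (norm(a) * norm(a + b))
assert np.isclose(cos_gamma, 1.0 / np.sqrt(1.0 + r**2))
```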
16.8-2. Scalar Products of Local Base Vectors. Orthogonal Coordinate Systems (see also Secs. 6.3-3, 6.4-1, and 17.4-7a). The magnitudes of, and angles between, the local base vectors e_1, e_2, . . . , e_n and e^1, e^2, . . . , e^n at each point (x^1, x^2, . . . , x^n) (Sec. 16.6-1) are given by

    e_i · e_k = g_{ik}(x^1, x^2, . . . , x^n)    e^i · e^k = g^{ik}(x^1, x^2, . . . , x^n)    e^i · e_k = δ^i_k    (16.8-4)

The vectors e_i are directed along the corresponding coordinate lines, and the e^i are directed along the normals to the coordinate hypersurfaces. Each e^k is perpendicular to all e_i except e_k. A system of coordinates x^1, x^2, . . . , x^n is an orthogonal coordinate system if and only if g_{ik}(x^1, x^2, . . . , x^n) = 0 for i ≠ k; in this case the e_i (and thus also the e^i) are mutually orthogonal. Not every Riemann space admits orthogonal coordinates.

Note: Two sets of base vectors e_i and e^k satisfying the relation e_i · e^k = δ^k_i constitute reciprocal bases of the vector space in question (see also Sec. 14.7-6).
16.8-3. Physical Components of a Tensor (see also Sec. 6.3-4). The local unit vectors u_i in the coordinate directions corresponding to the subscript i are related to the e_i and e^i by

    u_i = e_i/(+√|g_{ii}|) = g_{ik} e^k/(+√|g_{ii}|)    e_i = +√|g_{ii}| u_i    (no summation over i)    (16.8-5)

The physical components â_j and Â_{j_1 j_2 ... j_R} of vectors and tensors are defined by

    a = Σ_{j=1}^{n} â_j u_j    â_j = +√|g_{jj}| a^j    (no summation over j)    (16.8-6)

    A = Σ_{j_1=1}^{n} Σ_{j_2=1}^{n} · · · Σ_{j_R=1}^{n} Â_{j_1 j_2 ... j_R} u_{j_1} u_{j_2} · · · u_{j_R}    (16.8-7)

where the Â_{j_1 j_2 ... j_R} are obtained by comparison of Eqs. (7) and (16.6-3). The physical component of a vector a in the direction of another vector b is defined as
a · b/|b|.

16.8-4. Vector Product and Scalar Triple Product (see also Secs. 5.2-7, 5.2-8, 6.3-3, and 6.4-2). The vector product a × b of two absolute or relative vectors a and b defined on a three-dimensional Riemann space (n = 3) is the vector represented by the components (see also Sec. 16.5-3)

    (a × b)^i = (1/√|g|) ε^{ijk} a_j b_k = (1/(2√|g|)) ε^{ijk} (a_j b_k − a_k b_j)
    (a × b)_i = √|g| ε_{ijk} a^j b^k = ½ √|g| ε_{ijk} (a^j b^k − a^k b^j)    (16.8-8)

so that

    a × b = (1/√|g|) det [e_1 e_2 e_3; a_1 a_2 a_3; b_1 b_2 b_3]
          = √|g| det [e^1 e^2 e^3; a^1 a^2 a^3; b^1 b^2 b^3] = −b × a    (16.8-9)

where the rows of each array are separated by semicolons.
Note

    e^1 = (e_2 × e_3)/[e_1 e_2 e_3]    e^2 = (e_3 × e_1)/[e_1 e_2 e_3]    e^3 = (e_1 × e_2)/[e_1 e_2 e_3]
    e_1 = (e^2 × e^3)/[e^1 e^2 e^3]    e_2 = (e^3 × e^1)/[e^1 e^2 e^3]    e_3 = (e^1 × e^2)/[e^1 e^2 e^3]    (16.8-10)
where the scalar triple product [abc] is defined, as in Sec. 5.2-8, by

    [abc] = a · (b × c) = (1/√|g|) det [a_1 a_2 a_3; b_1 b_2 b_3; c_1 c_2 c_3]
          = √|g| det [a^1 a^2 a^3; b^1 b^2 b^3; c^1 c^2 c^3]
          = (1/√|g|) ε^{ijk} a_i b_j c_k = √|g| ε_{ijk} a^i b^j c^k    (16.8-11)

so that

    [e_1 e_2 e_3] = √|g|    [e^1 e^2 e^3] = 1/√|g|    (16.8-12)

The formulas of Table 5.2-2 and Sec. 5.2-9 hold.
Note: The definition (8) of the vector product implies the elementary relation (5.2-6) and defines the vector product of two absolute vectors as an absolute vector. Some authors replace √|g| by 1 in the definition (8), so that the vector product of two absolute vectors becomes an "axial" vector (as contrasted to "polar" or absolute vectors) described either as a relative contravariant vector of weight +1, or as a relative covariant vector of weight −1.
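An illustrative sketch (not from the handbook): Eq. (16.8-8) specialized to Cartesian coordinates (g_{ik} = δ_{ik}, √|g| = 1), where it must reproduce the elementary cross product; the permutation symbol is generated from permutation parity, and the triple-product check mirrors (16.8-11).

```python
import numpy as np
from itertools import permutations

# Build epsilon_ijk from the parity (inversion count) of each permutation.
eps = np.zeros((3, 3, 3))
for p in permutations(range(3)):
    inversions = sum(p[i] > p[j] for i in range(3) for j in range(i + 1, 3))
    eps[p] = (-1) ** inversions

a = np.array([1.0, -2.0, 0.5])
b = np.array([0.3, 4.0, -1.0])

cross = np.einsum('ijk,j,k->i', eps, a, b)     # (a x b)^i with sqrt|g| = 1
assert np.allclose(cross, np.cross(a, b))

# Scalar triple product: [abc] = eps_ijk a^i b^j c^k = det of the array (a, b, c).
c = np.array([2.0, 0.0, 1.0])
triple = np.einsum('ijk,i,j,k->', eps, a, b, c)
assert np.isclose(triple, np.linalg.det(np.column_stack([a, b, c])))
```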
16.9. TENSORS OF RANK TWO (DYADICS) DEFINED ON RIEMANN SPACES
16.9-1. Dyadics. Absolute or relative tensors of rank 2 (dyadics) A, B, . . . defined on Riemann spaces and described, for instance, in terms of their respective mixed components A^i_k, B^i_k, . . . (see also Sec. 16.7-2 for ordering of indices) are of interest in many applications. They are sometimes thought to warrant the special notation outlined in the following sections. Every dyadic A may be represented as a sum of n dyads (outer products of two vectors):

    A = p_j q^j    A^i_k = (p_j · e^i)(q^j · e_k)    (16.9-1)

Either the antecedents p_j or the consequents q^j may be arbitrarily assigned as long as they are linearly independent (Sec. 14.2-3). In particular

    A = A^i_k e_i e^k = A^{ik} e_i e_k = A_i{}^k e^i e_k = A_{ik} e^i e^k    (16.9-2)
    A = e_i A^i = A_k e^k    A^i = A^i_k e^k    A_k = A^i_k e_i    (16.9-3)

In terms of physical components Â_{ik} (Sec. 16.8-3)

    A = Σ_{i=1}^{n} Σ_{k=1}^{n} Â_{ik} u_i u_k    Â_{ik} = +√|g_{ii} g_{kk}| A^{ik}    (no summation)    A^{ik} = g^{kj} A^i_j    (16.9-4)

In the case of orthogonal coordinates (Sec. 16.8-2)

    Â_{ik} = +√|g_{ii}/g_{kk}| A^i_k    (no summation)    (16.9-5)
16.9-2. Inner-product Notation (see also Sec. 16.8-1). The following notation is sometimes useful for the description of inner products involving (real) dyadics and vectors defined on Riemann spaces:

    A · a = p_j(q^j · a)    (vector described by A^i_k a^k)    (16.9-6)
    a · A = (a · p_j)q^j    (vector described by A^i_k a_i)    (16.9-7)
    A · B = p_j(q^j · p′_l)q′^l    (dyadic described by A^i_j B^j_k)    (16.9-8)

where B = p′_j q′^j = B^i_k e_i e^k. With these definitions, the algebra of dyadics is precisely the algebra of linear operators described in Secs. 14.3-1 to 14.3-6; a dyadic associates a linear transformation with each point (x^1, x^2, . . . , x^n). Table 14.7-1 relates the tensor notation to the "classical" notation of Chap. 14 and to the matrix notation for dyadics and vectors.

Symmetry and skew-symmetry of a dyadic are defined in the manner of Sec. 16.5-1. Thus the dyadic A is symmetric if and only if A^{ki} = A^{ik}, or if and only if A_{ki} = A_{ik}; but this does not necessarily imply that A^i_k = A^k_i, nor does the last relation imply the symmetry of A (see also Sec. 14.7-5).
Tr(A) = A^j_j = p_j · q^j is called the scalar of the dyadic (1). The double-dot product of A and B is the scalar

    A ·· B = Σ_{i=1}^{n} Σ_{k=1}^{n} (p_i · p′_k)(q^i · q′^k)
For n = 3, it is possible to define the cross products

    a × A = (a × p_j)q^j    A × a = p_j(q^j × a)    A × B = p_j(q^j × p′_l)q′^l    (16.9-9)

The vector v_A = p_j × q^j is called the vector of the dyadic (1). v_A = 0 if and only if A is symmetric. If A is skew-symmetric, then for every vector a

    a · A = ½ v_A × a = −A · a    (16.9-10)

so that vector multiplication is equivalent to inner multiplication by a skew-symmetric dyadic.
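An illustrative numerical sketch (not from the handbook; Cartesian components, n = 3) of Eq. (16.9-10): for a skew-symmetric dyadic A, the vector of the dyadic has components v_i = ε_{ijk}A_{jk}, and a · A = ½ v_A × a = −A · a for every vector a.

```python
import numpy as np

A = np.array([[ 0.0,  2.0, -1.0],
              [-2.0,  0.0,  3.0],
              [ 1.0, -3.0,  0.0]])          # skew-symmetric: A.T == -A
assert np.allclose(A.T, -A)

v = np.array([A[1, 2] - A[2, 1],            # v_i = eps_ijk A_jk
              A[2, 0] - A[0, 2],
              A[0, 1] - A[1, 0]])

rng = np.random.default_rng(4)
a = rng.standard_normal(3)

# a.A (components a_i A_ik) equals (1/2) v_A x a equals -A.a:
assert np.allclose(a @ A, 0.5 * np.cross(v, a))
assert np.allclose(a @ A, -(A @ a))
```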
16.9-3. Eigenvalue Problems (see also Sec. 14.8-3). Eigenvalues and eigenvectors of dyadics are defined at each point $(x^1, x^2, \ldots, x^n)$ in the manner of Sec. 14.8-3. The coefficients appearing in the characteristic equation (Sec. 14.8-5) corresponding to a dyadic are absolute or relative scalars. Given any symmetric dyadic A in Euclidean space with continuously differentiable components and a region V where $\det\,[A^i{}_k] \ne 0$, there exists an orthogonal coordinate system (Sec. 16.8-2) such that the matrix $[A^i{}_k]$ is diagonal throughout V (normal coordinates, see also Sec. 17.4-7). A symmetric dyadic A defined on a three-dimensional Euclidean space may be represented geometrically by a quadric surface (3.5-1), with $a_{ik} = A_{ik}$ for i, k = 1, 2, 3 (see also Sec. 3.5-1).
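The normal-coordinate statement above can be illustrated numerically at a single point: a symmetric matrix of dyadic components is diagonalized by an orthogonal change of axes. The sketch below assumes NumPy, and the matrix entries are arbitrary illustrative values, not taken from the handbook.

```python
import numpy as np

# Hypothetical symmetric dyadic, given by its component matrix at one point.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# For a symmetric matrix, eigh returns real eigenvalues and an orthogonal
# matrix of eigenvectors; the eigenvector columns are the normal axes in
# which the dyadic's matrix becomes diagonal.
eigvals, Q = np.linalg.eigh(A)
D = Q.T @ A @ Q          # component matrix in the rotated (normal) coordinates

assert np.allclose(D, np.diag(eigvals), atol=1e-12)
assert np.allclose(Q @ Q.T, np.eye(3), atol=1e-12)   # Q is orthogonal
```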
16.10. THE ABSOLUTE DIFFERENTIAL CALCULUS. COVARIANT DIFFERENTIATION
16.10-1. Absolute Differentials. (a) A small change or differential of a tensor quantity cannot be defined directly as the difference between "values" of the tensor function resulting from changes $dx^1, dx^2, \ldots, dx^n$ in the coordinates $x^1, x^2, \ldots, x^n$, since the tensor algebra of Secs. 16.3-1 to 16.3-7 does not define relations between tensor "values" at different points. The absolute differentials $da, d\mathbf a, d\mathbf b, d\mathsf A, d\mathsf B, \ldots$ of absolute scalars, vectors, and tensors $a, \mathbf a, \mathbf b, \mathsf A, \mathsf B, \ldots$ defined on a Riemann space in terms of suitably differentiable components are, instead, defined by the following postulates:

1. The absolute differentials $da$, $d\mathbf a$, and $d\mathsf A$ are absolute tensor quantities of the same respective ranks and types as $a$, $\mathbf a$, and $\mathsf A$.
2. The absolute differential $da$ of an absolute scalar $a$ is represented in the x scheme of measurements by

$Da = \frac{\partial a}{\partial x^j}\,dx^j = a_{,j}\,dx^j$   (16.10-1)
3. The following differentiation rules hold:

$d(\mathbf a\cdot\mathbf b) = \mathbf a\cdot d\mathbf b + \mathbf b\cdot d\mathbf a \qquad d(\mathsf A + \mathsf B) = d\mathsf A + d\mathsf B \qquad d(a\mathsf A) = a\,d\mathsf A + \mathsf A\,da \qquad d(\mathsf A\mathsf B) = \mathsf A\,d\mathsf B + (d\mathsf A)\mathsf B$   (16.10-2)
In particular, Eq. (2) implies $d(a\mathbf a) = a\,d\mathbf a + \mathbf a\,da$ and $d(\mathbf a + \mathbf b) = d\mathbf a + d\mathbf b$, so that

$d\mathbf a = d(a^i\mathbf e_i) = \mathbf e_i\,da^i + a^i\,d\mathbf e_i = a^i{}_{,j}(x^1, \ldots, x^n)\,dx^j\,\mathbf e_i = d(a_i\mathbf e^i) = \mathbf e^i\,da_i + a_i\,d\mathbf e^i = a_{i,j}(x^1, \ldots, x^n)\,dx^j\,\mathbf e^i$

i.e., $d\mathbf a$ is described by the components

$Da^i = a^i{}_{,j}\,dx^j \qquad\text{or}\qquad Da_i = a_{i,j}\,dx^j$   (16.10-3)
The postulates listed above result in a self-consistent and invariant (Secs. 16.4-1 and 16.10-7) generalization of the vector calculus described in Chap. 5; the postulates are satisfied if one chooses

$a^i{}_{,j} = \frac{\partial a^i}{\partial x^j} + \Gamma^i_{kj}\,a^k \qquad a_{i,j} = \frac{\partial a_i}{\partial x^j} - \Gamma^k_{ij}\,a_k$   (16.10-4)
where the functions $\Gamma^i_{jk}(x^1, x^2, \ldots, x^n)$ are the Christoffel three-index symbols of the second kind defined in Sec. 16.10-3. Note: Equations (3) and (4) express each component of the absolute differential $d\mathbf a$ as the sum of a "relative differential" $da^i$ or $da_i$ and a term due to the point-to-point changes in the base vectors (i.e., in the metric). One may define vector derivatives $\partial\mathbf a/\partial x^j$ by
$d\mathbf a = \frac{\partial\mathbf a}{\partial x^j}\,dx^j$   (16.10-5)

with

$\frac{\partial\mathbf a}{\partial x^j} = \frac{\partial}{\partial x^j}(a^i\mathbf e_i) = a^i{}_{,j}\,\mathbf e_i = a_{i,j}\,\mathbf e^i$

$\frac{\partial\mathbf e_i}{\partial x^j} = \Gamma^k_{ij}\,\mathbf e_k = [ij;k]\,\mathbf e^k$   (16.10-6)

$\frac{\partial\mathbf e^i}{\partial x^j} = -\Gamma^i_{kj}\,\mathbf e^k$   (16.10-7)
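As an illustrative check of Eq. (4) — a sketch assuming SymPy, with coordinates and vector field chosen for the example, not taken from the text — take plane polar coordinates: a field that is constant in the Cartesian sense must have vanishing covariant derivative.

```python
import sympy as sp

r, phi = sp.symbols('r phi', positive=True)
x = [r, phi]
g = sp.diag(1, r**2)        # polar-coordinate metric ds^2 = dr^2 + r^2 dphi^2
ginv = g.inv()

# Christoffel symbols of the second kind, Gamma^i_{jk}, from the metric
def gamma(i, j, k):
    return sum(ginv[i, l]*(sp.diff(g[l, j], x[k]) + sp.diff(g[l, k], x[j])
                           - sp.diff(g[j, k], x[l]))/2 for l in range(2))

# Polar components of the constant Cartesian field e_x:
# a^r = cos(phi), a^phi = -sin(phi)/r
a = [sp.cos(phi), -sp.sin(phi)/r]

# Covariant derivative a^i_{,j} = da^i/dx^j + Gamma^i_{kj} a^k  (Eq. 16.10-4)
cov = [[sp.simplify(sp.diff(a[i], x[j]) + sum(gamma(i, k, j)*a[k] for k in range(2)))
        for j in range(2)] for i in range(2)]

# A field constant in the plane has vanishing covariant derivative.
assert all(c == 0 for row in cov for c in row)
```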
(b) The postulates of Sec. 16.10-1a imply that the absolute differential of any absolute tensor A defined on a Riemann space in terms of suitably differentiable components $A^{i_1 i_2\cdots i_r}{}_{i_1' i_2'\cdots i_s'}$ is an absolute tensor of the same type and rank, with components $DA^{i_1\cdots i_r}{}_{i_1'\cdots i_s'}$ given by

$d\mathsf A = d\bigl(A^{i_1\cdots i_r}{}_{i_1'\cdots i_s'}\,\mathbf e_{i_1}\cdots\mathbf e_{i_r}\mathbf e^{i_1'}\cdots\mathbf e^{i_s'}\bigr) \qquad DA^{i_1\cdots i_r}{}_{i_1'\cdots i_s'} = A^{i_1\cdots i_r}{}_{i_1'\cdots i_s',j}\,dx^j$   (16.10-8)

with*

$A^{i_1\cdots i_r}{}_{i_1'\cdots i_s',j} = \frac{\partial A^{i_1\cdots i_r}{}_{i_1'\cdots i_s'}}{\partial x^j} + \Gamma^{i_1}_{kj}A^{k i_2\cdots i_r}{}_{i_1'\cdots i_s'} + \cdots + \Gamma^{i_r}_{kj}A^{i_1\cdots i_{r-1}k}{}_{i_1'\cdots i_s'} - \Gamma^{k}_{i_1'j}A^{i_1\cdots i_r}{}_{k i_2'\cdots i_s'} - \cdots - \Gamma^{k}_{i_s'j}A^{i_1\cdots i_r}{}_{i_1'\cdots i_{s-1}'k}$   (16.10-9)
16.10-2. Absolute Differential of a Relative Tensor. Absolute differentials of relative tensor quantities (Sec. 16.2-1) are defined in the manner of Sec. 16.10-1. In particular, the absolute differential da of a relative scalar $a$ of weight W is represented in the x scheme of measurements by

$Da = a_{,j}\,dx^j \qquad\text{with}\qquad a_{,j} = \frac{\partial a}{\partial x^j} - W\,\Gamma^k_{kj}\,a$   (16.10-10)

Equation (10) reduces to Eq. (1) for W = 0. The absolute differential $d\mathsf A$ of any (suitably differentiable) relative tensor of weight W takes the form (8) with a term

$-W\,\Gamma^k_{kj}\,A^{i_1\cdots i_r}{}_{i_1'\cdots i_s'}$   (16.10-11)

added in the expression (9) for $A^{i_1\cdots i_r}{}_{i_1'\cdots i_s',j}$, so that

$d\mathsf A$ is a relative tensor of the same rank, type, and weight as A.   (16.10-12)
16.10-3. Christoffel Three-index Symbols. (a) The Christoffel three-index symbols of the first kind $[ij;k] = \Gamma_{ij,k}(x^1, x^2, \ldots, x^n)$ and the Christoffel three-index symbols of the second kind $\Gamma^k_{ij}(x^1, x^2, \ldots, x^n)$ associated with an x scheme of measurements in a Riemann space are functions of the coordinates $x^1, x^2, \ldots, x^n$, viz.,

* Some authors use the notation $A^{i_1 i_2\cdots i_r}{}_{i_1'\cdots i_s';j}$ for the covariant derivative written here as $A^{i_1 i_2\cdots i_r}{}_{i_1'\cdots i_s',j}$.
$[ij;k] = \frac{1}{2}\left(\frac{\partial g_{ik}}{\partial x^j} + \frac{\partial g_{jk}}{\partial x^i} - \frac{\partial g_{ij}}{\partial x^k}\right) \qquad \Gamma^k_{ij} = g^{kl}\,[ij;l]$   (16.10-13)

where $g_{ik}$ and $g^{ik}$ are the fundamental-tensor components of the Riemann space in the x scheme of measurements. The Christoffel three-index symbols are not in general tensor components but transform according to the Christoffel transformation equations

$\overline{[hk;r]} = \frac{\partial x^i}{\partial\bar x^h}\frac{\partial x^j}{\partial\bar x^k}\frac{\partial x^s}{\partial\bar x^r}\,[ij;s] + \frac{\partial^2 x^i}{\partial\bar x^h\,\partial\bar x^k}\frac{\partial x^j}{\partial\bar x^r}\,g_{ij}$

$\bar\Gamma^s_{hk} = \frac{\partial x^i}{\partial\bar x^h}\frac{\partial x^j}{\partial\bar x^k}\frac{\partial\bar x^s}{\partial x^l}\,\Gamma^l_{ij} + \frac{\partial^2 x^l}{\partial\bar x^h\,\partial\bar x^k}\frac{\partial\bar x^s}{\partial x^l}$   (16.10-14)

where the functions $\overline{[hk;r]}$ and $\bar\Gamma^s_{hk}$ are the Christoffel three-index symbols associated with an $\bar x$ scheme of measurements related to the x scheme of measurements by a suitably differentiable coordinate transformation.
(b) The Christoffel three-index symbols satisfy the following relations:

$[ij;k] = [ji;k] \qquad \Gamma^k_{ij} = \Gamma^k_{ji}$   (16.10-15)

$[ij;k] = g_{kl}\,\Gamma^l_{ij} \qquad \Gamma^k_{ij} = g^{kl}\,[ij;l]$   (16.10-16)

$\frac{\partial g_{ij}}{\partial x^k} = [ik;j] + [jk;i] = g_{jl}\,\Gamma^l_{ik} + g_{il}\,\Gamma^l_{jk}$   (16.10-17)

$\frac{\partial g^{ij}}{\partial x^k} = -g^{il}\,\Gamma^j_{lk} - g^{jl}\,\Gamma^i_{lk}$   (16.10-18)

$\frac{\partial}{\partial x^j}\log_e\sqrt{|g|} = \Gamma^i_{ij} \qquad g \equiv \det\,[g_{ik}]$   (16.10-19)

In terms of orthogonal coordinates (Sec. 16.8-2, $g_{ik} = 0$ for $i \ne k$; no summation implied):

$[ij;k] = \Gamma^k_{ij} = 0 \quad (i \ne j \ne k \ne i)$

$[ii;k] = -\frac{1}{2}\frac{\partial g_{ii}}{\partial x^k} \qquad \Gamma^k_{ii} = -\frac{1}{2g_{kk}}\frac{\partial g_{ii}}{\partial x^k} \quad (i \ne k)$

$[ik;k] = [ki;k] = \frac{1}{2}\frac{\partial g_{kk}}{\partial x^i} \qquad \Gamma^k_{ik} = \Gamma^k_{ki} = \frac{1}{2g_{kk}}\frac{\partial g_{kk}}{\partial x^i} = \frac{1}{2}\frac{\partial\log_e g_{kk}}{\partial x^i}$   (16.10-21)
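Definition (13) is straightforward to evaluate symbolically. The sketch below (SymPy assumed; the unit-sphere metric is an illustrative choice, not from the text) computes second-kind symbols and checks the contraction identity (19):

```python
import sympy as sp

theta, phi = sp.symbols('theta phi', positive=True)
x = [theta, phi]
g = sp.diag(1, sp.sin(theta)**2)   # metric of the unit sphere
ginv = g.inv()

def second_kind(i, j, k):          # Gamma^i_{jk} from Eq. (16.10-13)
    return sp.simplify(sum(ginv[i, l]*(sp.diff(g[l, j], x[k]) + sp.diff(g[l, k], x[j])
                                       - sp.diff(g[j, k], x[l]))/2 for l in range(2)))

# Well-known values for the sphere
assert sp.simplify(second_kind(0, 1, 1) + sp.sin(theta)*sp.cos(theta)) == 0
assert sp.simplify(second_kind(1, 0, 1) - sp.cos(theta)/sp.sin(theta)) == 0

# Contraction identity (16.10-19): Gamma^i_{ij} = d(log sqrt g)/dx^j
log_sqrt_g = sp.log(g.det())/2
for j in range(2):
    contracted = sum(second_kind(i, i, j) for i in range(2))
    assert sp.simplify(contracted - sp.diff(log_sqrt_g, x[j])) == 0
```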
16.10-4. Covariant Differentiation. Because of the analogy between Eq. (4) or (9) and ordinary partial differentiation, the operation of obtaining the functions $A^{i_1\cdots i_r}{}_{i_1'\cdots i_s',j}$ from the tensor components $A^{i_1\cdots i_r}{}_{i_1'\cdots i_s'}$ is called covariant differentiation (with respect to the metric described by the $g_{ik}$). If the components $A^{i_1\cdots i_r}{}_{i_1'\cdots i_s'}$ represent an absolute tensor A, then the components $A^{i_1\cdots i_r}{}_{i_1'\cdots i_s',j}$ describe an absolute tensor

$\nabla\mathsf A = A^{i_1\cdots i_r}{}_{i_1'\cdots i_s',j}\,\mathbf e_{i_1}\cdots\mathbf e_{i_r}\mathbf e^{i_1'}\cdots\mathbf e^{i_s'}\mathbf e^j$   (16.10-22)

contravariant of rank r and covariant of rank s + 1. $\nabla\mathsf A$ is commonly called the covariant derivative of A.

Note that, in general, neither the functions $\partial A^{i_1\cdots i_r}{}_{i_1'\cdots i_s'}/\partial x^j$ nor the "relative" differentials $dA^{i_1\cdots i_r}{}_{i_1'\cdots i_s'}$ are tensor components.

16.10-5. Rules for Covariant Differentiation (see also Sec. 16.10-7). Computations are frequently simplified by the fact that the ordinary rules for differentiation of sums and products (Table 4.5-2) apply formally to covariant differentiation.
Note

$\bigl(A^{i_1\cdots i_r}{}_{i_1'\cdots i_s'} + B^{i_1\cdots i_r}{}_{i_1'\cdots i_s'}\bigr)_{,j} = A^{i_1\cdots i_r}{}_{i_1'\cdots i_s',j} + B^{i_1\cdots i_r}{}_{i_1'\cdots i_s',j}$

$\bigl(A^{i_1\cdots i_r}{}_{i_1'\cdots i_s'}\,B^{k_1\cdots k_R}{}_{k_1'\cdots k_S'}\bigr)_{,j} = A^{i_1\cdots i_r}{}_{i_1'\cdots i_s',j}\,B^{k_1\cdots k_R}{}_{k_1'\cdots k_S'} + A^{i_1\cdots i_r}{}_{i_1'\cdots i_s'}\,B^{k_1\cdots k_R}{}_{k_1'\cdots k_S',j}$   (16.10-23)

$\bigl(A^{i_1\cdots k\cdots i_r}{}_{i_1'\cdots k\cdots i_s'}\bigr)_{,j} = A^{i_1\cdots k\cdots i_r}{}_{i_1'\cdots k\cdots i_s',j}$   (contraction rule)

The last two rules apply to the covariant differentiation of inner products (Secs. 16.3-7 and 16.8-1).
Note also

$g_{ik,j} = g^{ik}{}_{,j} = 0 \qquad \nabla\mathbf e_i = \nabla\mathbf e^i = 0$   (Ricci's theorem)   (16.10-24)

Equation (24) shows that the fundamental tensors behave like constants with respect to covariant differentiation. The covariant derivative of every associated tensor of A is an associated tensor of $\nabla\mathsf A$ (see also Secs. 16.7-2 and 16.7-3). The covariant derivative of every Kronecker delta and permutation symbol (Secs. 16.5-2 and 16.5-3) is zero.
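Ricci's theorem can be verified directly from Eq. (9) applied to the covariant metric components. The sketch below (SymPy assumed; the unit-sphere metric is an illustrative choice) checks $g_{ik,j} = 0$:

```python
import sympy as sp

theta, phi = sp.symbols('theta phi')
x = [theta, phi]
g = sp.diag(1, sp.sin(theta)**2)   # illustrative Riemann metric (unit sphere)
ginv = g.inv()

def Gamma(i, j, k):                # Christoffel symbols of the second kind
    return sum(ginv[i, l]*(sp.diff(g[l, j], x[k]) + sp.diff(g[l, k], x[j])
                           - sp.diff(g[j, k], x[l]))/2 for l in range(2))

# Covariant derivative of a purely covariant second-rank tensor (Eq. 16.10-9):
# g_{ik,j} = dg_ik/dx^j - Gamma^l_{ij} g_lk - Gamma^l_{kj} g_il
residuals = []
for i in range(2):
    for k in range(2):
        for j in range(2):
            d = (sp.diff(g[i, k], x[j])
                 - sum(Gamma(l, i, j)*g[l, k] for l in range(2))
                 - sum(Gamma(l, k, j)*g[i, l] for l in range(2)))
            residuals.append(sp.simplify(d))

# Ricci's theorem: the metric is covariantly constant.
assert all(c == 0 for c in residuals)
```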
16.10-6. Higher-order Covariant Derivatives. Successive covariant differentiations* of tensor components yield higher-order covariant derivatives described by components like $A^{i_1 i_2\cdots i_r}{}_{i_1'\cdots i_s',j_1 j_2\cdots j_m}$. These tensors are not in general symmetric with respect to pairs of subscripts j; the order of covariant differentiation may be inverted if and only if the Riemann space in question is flat (Sec. 17.4-6c). For any vector described by $a_i$,

$a_{i,jk} - a_{i,kj} = R^r{}_{ijk}\,a_r$   (16.10-25)

where the $R^r{}_{ijk}$ are the components of the mixed curvature tensor (Sec. 17.4-5)

Table 16.10-1. Differential Invariants Defined on Riemann Spaces
($\nabla = \mathbf e^j\,\dfrac{D}{\partial x^j}$; see also Secs. 16.2-2, 16.10-1 to 16.10-7, and Tables 6.4-1 to 6.5-11)

(a) The gradient of an absolute scalar $a$ is the absolute vector

$\nabla a = \mathbf e^i\,\frac{\partial a}{\partial x^i} = \mathbf e^i a_{,i}$

(b) The covariant derivative $\nabla\mathbf a$ of an absolute vector a is called the gradient or local dyadic of a. The divergence of an absolute vector a is the absolute scalar $\nabla\cdot\mathbf a$ represented by

$a^i{}_{,i} = \frac{\partial a^i}{\partial x^i} + a^k\,\Gamma^i_{ki} = \frac{1}{\sqrt{|g|}}\frac{\partial}{\partial x^i}\bigl(\sqrt{|g|}\,a^i\bigr)$

Note: The appropriate formulas of Table 5.5-1 apply to gradients and divergences defined on Riemann spaces (see also Sec. 16.10-5).

(c) The Laplacian operator $\nabla^2 = \nabla\cdot\nabla$ is the invariant scalar operator represented by $g^{ik}\,\dfrac{D}{\partial x^i}\dfrac{D}{\partial x^k}$. In particular, the Laplacian $\nabla^2 a$ of an absolute scalar $a$ is represented by

$g^{ik}\left(\frac{\partial^2 a}{\partial x^i\,\partial x^k} - \Gamma^j_{ik}\frac{\partial a}{\partial x^j}\right) = \frac{1}{\sqrt{|g|}}\frac{\partial}{\partial x^i}\left(\sqrt{|g|}\,g^{ik}\frac{\partial a}{\partial x^k}\right)$

(d) Given an absolute vector a, the skew-symmetric absolute tensor represented by

$\Omega_{ij} = \frac{Da_i}{\partial x^j} - \frac{Da_j}{\partial x^i} = \frac{\partial a_i}{\partial x^j} - \frac{\partial a_j}{\partial x^i}$

is identically zero if and only if a is the gradient of an absolute scalar. Note: For n = 3, it is possible to define the curl of an absolute vector a as the absolute vector $\nabla\times\mathbf a$ represented by the components

$\frac{e^{ijk}}{\sqrt{|g|}}\frac{\partial a_k}{\partial x^j} = \frac{e^{ijk}}{2\sqrt{|g|}}\left(\frac{\partial a_k}{\partial x^j} - \frac{\partial a_j}{\partial x^k}\right)$

(see also Sec. 16.8-4). The formulas of Table 5.5-1 and Eq. (5.5-19) apply to absolute vectors defined on three-dimensional Riemann spaces.
of the Riemann space. * It is understood that the components $g_{ik}$ of the metric tensor as well as the tensor components to be differentiated are repeatedly differentiable.
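The $\sqrt{|g|}$ forms of the divergence and Laplacian in Table 16.10-1 lend themselves to a symbolic check. The sketch below (SymPy assumed; plane polar coordinates as an illustrative metric) recovers the familiar results $\nabla^2(r^2) = 4$ and $\nabla^2(r\cos\varphi) = 0$:

```python
import sympy as sp

r, phi = sp.symbols('r phi', positive=True)
x = [r, phi]
g = sp.diag(1, r**2)            # polar-coordinate metric on the plane
ginv = g.inv()
sqrt_g = sp.sqrt(g.det())       # sqrt|g| = r

# Divergence of a contravariant field, Table 16.10-1(b):
# a^i_{,i} = (1/sqrt g) d(sqrt g a^i)/dx^i
def divergence(a):
    return sp.simplify(sum(sp.diff(sqrt_g*a[i], x[i]) for i in range(2))/sqrt_g)

# Laplacian of a scalar, Table 16.10-1(c):
# (1/sqrt g) d/dx^i ( sqrt g g^{ik} da/dx^k )
def laplacian(f):
    grad = [sum(ginv[i, k]*sp.diff(f, x[k]) for k in range(2)) for i in range(2)]
    return divergence(grad)

assert laplacian(r**2) == 4            # x^2 + y^2 = r^2 has Laplacian 4
assert laplacian(r*sp.cos(phi)) == 0   # x = r cos(phi) is harmonic
```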
16.10-7. Differential Operators and Differential Invariants (see also Secs. 5.5-2 and 5.5-5). (a) If one defines

$\frac{D}{\partial x^j}\,A^{i_1\cdots i_r}{}_{i_1'\cdots i_s'}\,\mathbf e_{i_1}\cdots\mathbf e_{i_r}\mathbf e^{i_1'}\cdots\mathbf e^{i_s'} \equiv A^{i_1\cdots i_r}{}_{i_1'\cdots i_s',j}\,\mathbf e_{i_1}\cdots\mathbf e_{i_r}\mathbf e^{i_1'}\cdots\mathbf e^{i_s'}$

the covariant derivative (22) of a tensor A may be written as an "outer product" (Sec. 16.3-6) of A and the (invariant) vector differential operator

$\nabla = \mathbf e^j\,\frac{D}{\partial x^j}$   (del or nabla)   (16.10-26)

whose "components" $D/\partial x^j$ transform like covariant vector components (Sec. 16.2-1). For every admissible transformation (16.1-1) of coordinates describing the same Riemann space,

$\frac{D}{\partial\bar x^j} = \frac{\partial x^i}{\partial\bar x^j}\,\frac{D}{\partial x^i}$   (16.10-27)
(b) Tensor relations involving covariant differentiation as well as tensor addition and multiplication are invariant with respect to the group of admissible coordinate transformations (see also Secs. 16.4-1 and 16.10-1). Tensor quantities obtained through outer and/or inner multiplication of other tensor quantities by the invariant operator (26) are called differential invariants. Table 16.10-1 lists the most useful differential invariants.
16.10-8. Absolute (Intrinsic) and Directional Derivatives (see also Sec. 5.5-3). Given a regular arc in Riemann space,

$x^i = x^i(t) \qquad (t_1 \le t \le t_2)$   (16.10-28)

the components $dx^i/dt$ represent a contravariant vector $d\mathbf r/dt$ "directed" along the tangent to the given curve (Sec. 17.4-2). The absolute (intrinsic) derivative $d\mathsf A/dt$ of a suitably differentiable absolute or relative tensor A with respect to the parameter t along the given curve is a tensor of the same rank, type, and weight as A:

$\frac{d\mathsf A}{dt} = \left(\frac{d\mathbf r}{dt}\cdot\nabla\right)\mathsf A$   (16.10-29)

with components

$\frac{D}{dt}A^{i_1 i_2\cdots i_r}{}_{i_1' i_2'\cdots i_s'} = A^{i_1 i_2\cdots i_r}{}_{i_1' i_2'\cdots i_s',j}\,\frac{dx^j}{dt}$   (16.10-30)
If the components of A depend explicitly as well as implicitly on t, the partial derivative $\partial A^{i_1\cdots i_r}{}_{i_1'\cdots i_s'}/\partial t$ must be added on the right of Eq. (30).
The directional derivative $d\mathsf A/ds$ of A in the direction of the given curve ($ds \ne 0$, Sec. 17.4-2) is the absolute derivative of A with respect to the arc length s along the curve.

16.10-9. Tensors Constant along a Curve. Equations of Parallelism. A tensor A is defined as constant along a regular curve arc (28) (i.e., its "values" at neighboring points on the curve are "equal") if and only if its absolute derivative (30) (and thus also its absolute differential $d\mathsf A = \dfrac{d\mathsf A}{ds}\,ds$) along the curve is zero. Sums and products of such tensors are also constant along the curve in question, so that, for example, the absolute values of, and the angles between, constant vectors are constant. Every vector a whose components $a^i(x^1, x^2, \ldots, x^n)$ or $a_i(x^1, x^2, \ldots, x^n)$ satisfy the differential equations

$Da^i = 0 \qquad\text{or}\qquad Da_i = 0$   (equations of parallelism)   (16.10-31)

as the coordinates $x^1, x^2, \ldots, x^n$ vary along a curve (28) undergoes a "parallel displacement" along the curve. Note: A vector obtained by "parallel displacement" of a given vector along a closed curve is not in general equal to the original vector when the starting point is reached (see also Sec. 17.4-6).
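Equations (31) can be integrated numerically along a concrete closed curve. The sketch below (plain Python; the unit sphere and the latitude circle $\theta = \pi/3$ are illustrative choices, not from the text) exhibits the closing Note: the transported vector comes back rotated by the holonomy angle $2\pi\cos\theta_0$, here exactly reversed, while its length is preserved.

```python
import math

# Parallel transport (Da^i = 0) around the latitude circle theta = theta0 on
# the unit sphere, coordinates (theta, phi).  Nonzero Christoffel symbols:
# Gamma^theta_{phi phi} = -sin(t)cos(t), Gamma^phi_{theta phi} = cos(t)/sin(t).
theta0 = math.pi / 3

def rhs(a):
    """da^i/dphi = -Gamma^i_{kj} a^k dx^j/dphi along the circle phi = t."""
    at, ap = a
    return (math.sin(theta0) * math.cos(theta0) * ap,
            -math.cos(theta0) / math.sin(theta0) * at)

def transport(a, steps=20000):
    """Integrate the equations of parallelism once around the circle (RK4)."""
    h = 2 * math.pi / steps
    for _ in range(steps):
        k1 = rhs(a)
        k2 = rhs((a[0] + h/2*k1[0], a[1] + h/2*k1[1]))
        k3 = rhs((a[0] + h/2*k2[0], a[1] + h/2*k2[1]))
        k4 = rhs((a[0] + h*k3[0], a[1] + h*k3[1]))
        a = (a[0] + h/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
             a[1] + h/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))
    return a

a1 = transport((1.0, 0.0))      # start with the unit vector along e_theta

# Orthonormal components (a^theta, sin(theta0) a^phi): the holonomy rotation
# is 2*pi*cos(theta0) = pi for theta0 = pi/3, i.e. the vector is reversed.
ortho = (a1[0], math.sin(theta0) * a1[1])
assert abs(ortho[0] + 1.0) < 1e-6 and abs(ortho[1]) < 1e-6
assert abs(ortho[0]**2 + ortho[1]**2 - 1.0) < 1e-9   # length preserved
```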
16.10-10. Integration of Tensor Quantities. Volume Element. Integrals of tensor quantities over curves in Riemann space may be defined in terms of scalar integrals over a suitable parameter as in Secs. 5.4-5 and 6.2-3a. The volume element dV is the scalar capacity (Sec. 16.2-1) defined by (see also Secs. 6.2-3b and 6.4-3c)

$dV = \sqrt{|g|}\;dx^1\,dx^2\cdots dx^n$   (16.10-32)

Volume integrals over scalar invariants are scalar invariants, but volume integrals over tensors of rank R > 0 are not, in general, tensors. Volume elements in suitable subspaces take the place of surface elements in three-dimensional space. Generalizations of the integral theorems of Secs. 5.6-1 and 5.6-2 exist (Sec. 16.10-11 and Ref. 16.6, Chap. 7).
16.10-11. Differential Invariants and Integral Theorems for Dyadics (see also Sec. 16.9-1). The divergence of a suitably differentiable dyadic A defined on a Riemann space is the vector $\nabla\cdot\mathsf A$, where the operator $\nabla$ (Sec. 16.10-7) acts like a covariant vector. Note

$\nabla\cdot(a\mathsf A) = a\,\nabla\cdot\mathsf A + (\nabla a)\cdot\mathsf A$   (16.10-33)

$\nabla\cdot(\mathbf a\cdot\mathsf A) = (\nabla\mathbf a)\cdot\cdot\,\tilde{\mathsf A} + (\nabla\cdot\tilde{\mathsf A})\cdot\mathbf a \qquad \nabla\cdot(\mathsf A\cdot\mathbf a) = (\nabla\mathbf a)\cdot\cdot\,\mathsf A + (\nabla\cdot\mathsf A)\cdot\mathbf a$   (16.10-34)

where $\tilde{\mathsf A}$ is the transposed dyadic of A ($\tilde A_{ik} = A_{ki}$). The dyadic $\nabla\mathbf a$ is the gradient of a (Table 16.10-1). The following integral theorems analogous to those of Secs. 5.6-1 and 5.6-2 hold for suitable functions, surfaces, and curves:

$\int_V \nabla\cdot\mathsf A\,dV = \int_S d\mathbf A\cdot\mathsf A \qquad \int_V \nabla a\,dV = \int_S d\mathbf A\,a$   (16.10-35)

$\int_V \nabla\times\mathsf A\,dV = \int_S d\mathbf A\times\mathsf A$   (16.10-36)

$\int_V \nabla\cdot(\mathsf A\cdot\mathbf a)\,dV = \int_S d\mathbf A\cdot(\mathsf A\cdot\mathbf a)$   (16.10-37)

$\int_S d\mathbf A\cdot(\nabla\times\mathsf A) = \oint d\mathbf r\cdot\mathsf A$   (16.10-38)
16.11. RELATED TOPICS, REFERENCES, AND BIBLIOGRAPHY

16.11-1. Related Topics. The following topics related to the study of tensor analysis are treated in other chapters of this handbook:

Vector analysis — Chap. 5
Abstract algebra — Chap. 12
Linear transformations — Chap. 14
Rotation of a vector — Chap. 14
Differential geometry — Chap. 17
Partial differentiation — Chap. 4
Special coordinate systems — Chap. 6
16.11-2. References and Bibliography.
16.1. Brillouin, L.: Les Tenseurs en mécanique et en élasticité, Masson, Paris, 1949.
16.2. Eisenhart, L. P.: An Introduction to Differential Geometry, Princeton University Press, Princeton, N.J., 1947.
16.3. Lagally, M.: Vorlesungen über Vektor-Rechnung, Edwards, Ann Arbor, Mich., 1947.
16.4. Lichnerowicz, A.: Elements of Tensor Calculus, Wiley, New York, 1962.
16.5. Phillips, H. B.: Vector Analysis, Wiley, New York, 1933.
16.6. Rainich, G. Y.: Mathematics of Relativity, Wiley, New York, 1950.
16.7. Sokolnikoff, I. S.: Tensor Analysis, 2nd ed., Wiley, New York, 1964.
16.8. Synge, J. L., and A. Schild: Tensor Calculus, University of Toronto Press, Toronto, 1949.
CHAPTER 17

DIFFERENTIAL GEOMETRY
17.1. Curves in the Euclidean Plane
  17.1-1. Tangent to a Plane Curve
  17.1-2. Normal to a Plane Curve
  17.1-3. Singular Points
  17.1-4. Curvature of a Plane Curve
  17.1-5. Osculation
  17.1-6. Asymptotes
  17.1-7. Envelope of a Family of Plane Curves
  17.1-8. Isogonal Trajectories

17.2. Curves in Three-dimensional Euclidean Space
  17.2-1. Introduction
  17.2-2. The Moving Trihedron: (a) Tangent to a Curve; (b) Osculating Circle and Plane. Principal Normal; (c) Binormal. Normal and Rectifying Planes
  17.2-3. Serret-Frenet Formulas. Curvature and Torsion of a Space Curve
  17.2-4. Equations of the Tangent, Principal Normal, and Binormal, and of the Osculating, Normal, and Rectifying Planes
  17.2-5. Additional Topics
  17.2-6. Osculation

17.3. Surfaces in Three-dimensional Euclidean Space
  17.3-1. Introduction
  17.3-2. Tangent Plane and Surface Normal
  17.3-3. The First Fundamental Form of a Surface. Elements of Distance and Area
  17.3-4. Geodesic and Normal Curvature of a Surface Curve. Meusnier's Theorem
  17.3-5. The Second Fundamental Form. Principal Curvatures, Gaussian Curvature, and Mean Curvature
  17.3-6. Special Directions and Curves on a Surface. Minimal Surfaces
  17.3-7. Surfaces as Riemann Spaces. Three-index Symbols and Beltrami Parameters
  17.3-8. Partial Differential Equations Satisfied by the Coefficients of the Fundamental Forms. Gauss's Theorema Egregium
  17.3-9. Definition of a Surface by E, F, G, L, M, and N
  17.3-10. Mappings
  17.3-11. Envelopes
  17.3-12. Geodesics
  17.3-13. Geodesic Normal Coordinates. Geometry on a Surface
  17.3-14. The Gauss-Bonnet Theorem

17.4. Curved Spaces
  17.4-1. Introduction
  17.4-2. Curves, Distances, and Directions in Riemann Space
  17.4-3. Geodesics
  17.4-4. Riemann Spaces with Indefinite Metric. Null Directions and Null Geodesics
  17.4-5. Specification of Space Curvature
  17.4-6. Manifestations of Space Curvature. Flat Spaces and Euclidean Spaces
  17.4-7. Special Coordinate Systems

17.5. Related Topics, References, and Bibliography
  17.5-1. Related Topics
  17.5-2. References and Bibliography
17.1. CURVES IN THE EUCLIDEAN PLANE
17.1-1. Tangent to a Plane Curve. Given a plane curve C represented by

$y = f(x)$   (17.1-1a)
or $\varphi(x, y) = 0$   (17.1-1b)
or $x = x(t) \qquad y = y(t)$   (17.1-1c)

(Sec. 2.1-9) in terms of suitably differentiable functions, the tangent to C at the point $P_1 \equiv (x_1, y_1)$ is defined as the limit of a straight line (secant) through $P_1$ and a neighboring point $P_2$ as $P_2$ approaches $P_1$. The curve (1) has a unique tangent described by

$y - y_1 = \frac{dy}{dx}(x - x_1)$   (17.1-2a)
or $\frac{\partial\varphi}{\partial x}(x - x_1) + \frac{\partial\varphi}{\partial y}(y - y_1) = 0$   (17.1-2b)
or $x = \frac{dx}{dt}(t - t_1) + x_1 \qquad y = \frac{dy}{dt}(t - t_1) + y_1$   (17.1-2c)

at every regular point $(x_1, y_1)$ where it is possible to choose a parameter t so that x(t) and y(t) have unique continuous derivatives not both equal to zero; or, equivalently, where

$\tan\vartheta = \frac{dy}{dx} = \frac{dy}{dt}\bigg/\frac{dx}{dt} = -\frac{\partial\varphi}{\partial x}\bigg/\frac{\partial\varphi}{\partial y}$   (17.1-3)

exists.
17.1-2. Normal to a Plane Curve. The normal to the curve (1) at a regular point $P_1 \equiv (x_1, y_1)$ is the straight line through $P_1$ and perpendicular to the tangent at $P_1$:

$y - y_1 = -\frac{1}{dy/dx}\,(x - x_1)$   (17.1-4)

The direction of the positive normal is arbitrarily fixed with respect to the common positive direction of curve and tangent. The positive direction on a curve (1) is arbitrarily fixed by some convention (e.g., direction of increasing t, increasing x, etc.; see also Sec. 2.2-1).

17.1-3. Singular Points. Given a curve (1b) such that the nth-order derivatives of $\varphi(x, y)$ exist, if all first-order derivatives but not all second-order derivatives of $\varphi$ vanish at $P_1 \equiv (x_1, y_1)$, the slopes dy/dx of the two tangents are obtained as roots of the quadratic equation

$\frac{\partial^2\varphi}{\partial x^2} + 2\frac{\partial^2\varphi}{\partial x\,\partial y}\frac{dy}{dx} + \frac{\partial^2\varphi}{\partial y^2}\left(\frac{dy}{dx}\right)^2 = 0$   (17.1-5)

The roots of Eq. (5), and hence the two tangents, may be real and different (double point), coincident (cusp, or self-osculation point, see also Sec. 17.1-5), or imaginary (isolated point). The properties of a curve at a singular point can be similarly described in terms of discontinuous or multiple-valued derivatives of f(x) or of x(t) and y(t).
17.1-4. Curvature of a Plane Curve. The circle of curvature (osculating circle) of a plane curve C at the point $P_1$ is the limit of a circle through $P_1$ and two other distinct points $P_2$ and $P_3$ of C as $P_2$ and $P_3$ approach $P_1$. The center of this circle (center of curvature of C corresponding to the curve point $P_1$) is located on the normal to C at $P_1$. The coordinates of the center of curvature are

$x_K = x_1 - \frac{dy}{dx}\left[1 + \left(\frac{dy}{dx}\right)^2\right]\bigg/\frac{d^2y}{dx^2} \qquad y_K = y_1 + \left[1 + \left(\frac{dy}{dx}\right)^2\right]\bigg/\frac{d^2y}{dx^2}$   (17.1-6)

where all derivatives are computed for $x = x_1$ ($t = t_1$); dots indicate differentiation with respect to t.* The radius $\rho_K$ of the circle of curvature (radius of curvature of C at $P_1$) equals the reciprocal of the curvature $\kappa$ of C at $P_1$, defined as the rate of turn of the tangent with respect to the arc length s along C (Sec. 4.6-9):

$\kappa = \frac{1}{\rho_K} = \frac{d\vartheta}{ds} = \frac{d^2y}{dx^2}\bigg/\left[1 + \left(\frac{dy}{dx}\right)^2\right]^{3/2} = \frac{\dot x\ddot y - \dot y\ddot x}{(\dot x^2 + \dot y^2)^{3/2}}$   (17.1-7)

where all derivatives are computed for $x = x_1$ ($t = t_1$).* A given curve C is, respectively, concave or convex in the direction of the positive y axis wherever $d^2y/dx^2$, and thus $\kappa$, is positive or negative. Many authors introduce $|\kappa|$ rather than $\kappa$ as the curvature, as in Sec. 17.2-3. In terms of polar coordinates r, $\varphi$,

$ds^2 = dr^2 + r^2\,d\varphi^2 \qquad \tan\mu = r\,\frac{d\varphi}{dr}$   (17.1-8)

$\kappa = \frac{1}{\rho_K} = \left[r^2 + 2\left(\frac{dr}{d\varphi}\right)^2 - r\,\frac{d^2r}{d\varphi^2}\right]\bigg/\left[r^2 + \left(\frac{dr}{d\varphi}\right)^2\right]^{3/2} \qquad (\varphi = \varphi_1)$   (17.1-9)

where $\mu$ is the angle between the radius vector and the tangent.

* Equations (6) and (7) may be rewritten in terms of the partial derivatives of $\varphi(x, y)$.
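The parametric form of Eq. (7) can be checked symbolically. The sketch below (SymPy assumed; the circle and line are standard illustrative curves) confirms that a circle of radius R has constant curvature 1/R and a straight line has curvature zero:

```python
import sympy as sp

t, R = sp.symbols('t R', positive=True)

# Parametric curvature, Eq. (17.1-7): kappa = (x' y'' - y' x'')/(x'^2 + y'^2)^(3/2)
def curvature(xt, yt):
    xd, yd = sp.diff(xt, t), sp.diff(yt, t)
    xdd, ydd = sp.diff(xd, t), sp.diff(yd, t)
    return sp.simplify((xd*ydd - yd*xdd)/(xd**2 + yd**2)**sp.Rational(3, 2))

assert curvature(R*sp.cos(t), R*sp.sin(t)) == 1/R   # circle of radius R
assert curvature(t, 2*t + 1) == 0                   # straight line
```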
17.1-5. Osculation. A point $P_1 \equiv (x_1, y_1)$ is an osculation point (point of contact) of order n of two curves described by suitably differentiable equations y = f(x) and y = g(x) if and only if

$f(x_1) = g(x_1) \qquad f'(x_1) = g'(x_1) \qquad \cdots \qquad f^{(n)}(x_1) = g^{(n)}(x_1) \qquad f^{(n+1)}(x_1) \ne g^{(n+1)}(x_1)$   (17.1-10)

Such a point may be regarded as the limit of n + 1 real or imaginary points of intersection approaching one another. At every osculation point the tangents to the two curves coincide; the curves intersect (cross) if and only if n is even. A point where a curve intersects its own tangent is a point of inflection. At every point of inflection $\kappa = 0$.
17.1-6. Asymptotes. A straight line is an asymptote of the given curve C (C approaches the straight line asymptotically) if and only if the distance between the straight line and a point P = (x, y) on the curve tends to zero as $x^2 + y^2 \to \infty$. If C is a regular arc, the asymptote is the limiting case of the tangent at P.

17.1-7. Envelope of a Family of Plane Curves. The envelope of a suitable one-parameter family of plane curves described by

$\varphi(x, y, \lambda) = 0$   (17.1-11)

osculates, or contains a singular point of, every curve (11). One obtains the equation of the envelope by elimination of the parameter $\lambda$ from Eq. (11) and

$\frac{\partial\varphi}{\partial\lambda}(x, y, \lambda) = 0$   (17.1-12)

The envelope exists in a region of values of x, y, and $\lambda$ such that

$\frac{\partial(\varphi, \varphi_\lambda)}{\partial(x, y)} \ne 0 \qquad \varphi_{\lambda\lambda} \ne 0$   (17.1-13)

Equations (11) and (12) define the locus of limiting points of intersection of neighboring curves of the family (11).
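The elimination prescribed by Eqs. (11) and (12) can be carried out with a computer-algebra system. In the sketch below (SymPy assumed; the family of lines is an illustrative choice, not from the text), the envelope of $y = \lambda x - \lambda^2$ comes out as the parabola $y = x^2/4$:

```python
import sympy as sp

x, y, lam = sp.symbols('x y lam')

# Illustrative one-parameter family phi(x, y, lambda) = 0: the lines
# y = lam*x - lam**2.
phi = lam*x - lam**2 - y

# Envelope: eliminate lam between phi = 0 and d(phi)/d(lam) = 0 (Eq. 17.1-12)
dphi = sp.diff(phi, lam)                 # x - 2*lam
lam_sol = sp.solve(dphi, lam)[0]         # lam = x/2
envelope = sp.expand(phi.subs(lam, lam_sol))   # x**2/4 - y

assert envelope == x**2/4 - y            # the parabola y = x^2/4
```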
17.1-8. Isogonal Trajectories. The family of curves intersecting a given family (17.1-11) at a given angle $\gamma$ satisfies the differential equation

$\left(\frac{\partial\varphi}{\partial x}\cos\gamma - \frac{\partial\varphi}{\partial y}\sin\gamma\right)dx + \left(\frac{\partial\varphi}{\partial x}\sin\gamma + \frac{\partial\varphi}{\partial y}\cos\gamma\right)dy = 0$   (17.1-14)

For $\gamma = \pi/2$, Eq. (14) yields orthogonal trajectories.

17.2. CURVES IN THREE-DIMENSIONAL EUCLIDEAN SPACE
17.2-1. Introduction (see also Sec. 3.1-13). Sections 17.2-1 to 17.2-6 deal with the geometry of a regular arc C described by

$\mathbf r = \mathbf r(t) \qquad\text{or}\qquad x = x(t) \quad y = y(t) \quad z = z(t) \qquad (t_1 \le t \le t_2)$   (17.2-1)

where the functions (1) have unique continuous first derivatives and $d\mathbf r/dt \ne 0$ for $t_1 < t < t_2$. Higher-order derivatives will be assumed to exist as needed. It is convenient to introduce the arc length

$s = \int ds = \int\sqrt{d\mathbf r\cdot d\mathbf r} = \int\sqrt{dx^2 + dy^2 + dz^2}$

(Sec. 5.4-4) as a new parameter; the sign of ds is arbitrarily fixed to determine the positive direction of curve and tangents (see also Secs. 17.2-2 and 17.2-3). Differentiation with respect to s will be indicated by primes, so that, for example,

$x' = \frac{dx}{ds} = \frac{dx}{dt}\bigg/\frac{ds}{dt}$

The representation of curves in terms of curvilinear coordinates (Chap. 6) is briefly discussed in Sec. 17.4-1.
17.2-2. The Moving Trihedron (see also Secs. 17.2-3 and 17.2-4). (a) Tangent to a Curve. The tangent to a regular arc C at the point $P_1 \equiv (\mathbf r_1) \equiv (x_1, y_1, z_1)$ is the limit of a straight line (secant) through $P_1$ and another point $P_2$ of C as $P_2$ approaches $P_1$. A unique tangent exists at every point of a regular arc. The positive tangent direction coincides with the positive direction of C at $P_1$.

(b) Osculating Circle and Plane. Principal Normal. The osculating circle or circle of curvature of C at the curve point $P_1$ is the limit of a circle through $P_1$ and two other distinct points $P_2$ and $P_3$ of C as $P_2$ and $P_3$ approach $P_1$. The plane of this circle (osculating plane of C at $P_1$) contains the tangent to C at $P_1$. The directed straight line from the curve point $P_1$ to the center of the osculating circle (center of curvature) is perpendicular to the tangent and is called the principal normal of C at $P_1$.
(c) Binormal. Normal and Rectifying Planes. The binormal of C at $P_1$ is the directed straight line through $P_1$ such that the positive tangent, principal normal, and binormal form a system of right-handed rectangular cartesian axes (Sec. 3.1-3). These axes determine the "moving trihedron" comprising the osculating plane and the normal and rectifying planes respectively normal to the tangent and to the principal normal of C at $P_1$ (Fig. 17.2-1).

Fig. 17.2-1. The moving trihedron associated with a space curve C.

(d) The unit vectors t, n, and b respectively directed along the positive tangent, the principal normal, and the binormal are given by

$\mathbf t = \mathbf r'$   (unit tangent vector)
$\mathbf n = \frac{1}{\kappa}\,\mathbf r''$   (unit principal-normal vector)   (17.2-2)
$\mathbf b = \mathbf t\times\mathbf n$   (unit binormal vector)

at each suitable point of the curve. The vector $\kappa\mathbf n = \mathbf r''$ is called the curvature vector; $\kappa$ is the curvature further discussed in Sec. 17.2-3.
17.2-3. Serret-Frenet Formulas. Curvature and Torsion of a Space Curve (see also Secs. 17.2-4 and 17.2-5). (a) The unit vectors (2) satisfy the relations

$\mathbf t' = \kappa\mathbf n \qquad \mathbf n' = -\kappa\mathbf t + \tau\mathbf b \qquad \mathbf b' = -\tau\mathbf n$   (Serret-Frenet formulas)   (17.2-3)

with $\kappa = \dfrac{1}{\rho_K} = |\mathbf t'| = |\mathbf r''|$ at each curve point $P_1$. As s increases, the point $P_1$ moves along the curve C and

1. The tangent rotates about the instantaneous binormal direction at the (positive) angular rate $\kappa$ (curvature of C at $P_1$).
2. The binormal rotates about the instantaneous tangent direction at the angular rate $\tau$ (torsion of C at $P_1$; $\tau$ is positive wherever the curve turns in the manner of a right-handed screw).
3. The entire moving trihedron rotates about the instantaneous direction of the Darboux vector $\mathbf\Omega = \tau\mathbf t + \kappa\mathbf b$ at the (positive) angular rate $|\mathbf\Omega| = \sqrt{\tau^2 + \kappa^2}$ (total curvature of C at $P_1$).

The instantaneous rotation of t, n, and b becomes more evident on rewriting the Serret-Frenet formulas as follows (see also Secs. 5.3-2 and 14.10-5):

$\mathbf t' = \mathbf\Omega\times\mathbf t = (\kappa\mathbf b)\times\mathbf t \qquad \mathbf n' = \mathbf\Omega\times\mathbf n \qquad \mathbf b' = \mathbf\Omega\times\mathbf b = (\tau\mathbf t)\times\mathbf b$   (17.2-4)

$\rho_K = 1/\kappa$ is the radius of the circle of curvature (radius of curvature) of C at $P_1$; $\rho_\tau = 1/\tau$ is called the radius of torsion.

(b) The scalar functions $\kappa = \kappa(s)$ and $\tau = \tau(s)$ together define the curve C uniquely, except for its position and orientation in space (intrinsic equations of a space curve). C is a plane curve if and only if its torsion $\tau$ vanishes identically, and a straight line if and only if its curvature $\kappa$ vanishes identically.

(c) In terms of the more general parameter t,

$\kappa = \frac{1}{\rho_K} = \frac{|\dot{\mathbf r}\times\ddot{\mathbf r}|}{|\dot{\mathbf r}|^3} \qquad \tau = \frac{1}{\rho_\tau} = \frac{[\dot{\mathbf r}\,\ddot{\mathbf r}\,\dddot{\mathbf r}]}{|\dot{\mathbf r}\times\ddot{\mathbf r}|^2}$   (17.2-5)

where the dots indicate differentiations with respect to t. Note also

$\ddot{\mathbf r} = \ddot s\,\mathbf t + \frac{\dot s^2}{\rho_K}\,\mathbf n = \ddot s\,\mathbf t + \left(\frac{\dot s}{\rho_K}\,\mathbf b\right)\times\dot{\mathbf r}$   (17.2-6)

(decomposition of the acceleration of a moving point into tangential and normal components, see also Sec. 5.3-2).
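Equations (5) can be evaluated symbolically for a concrete curve. The sketch below (SymPy assumed; the circular helix is a standard illustrative example, not from the text) recovers the classical values $\kappa = a/(a^2 + b^2)$ and $\tau = b/(a^2 + b^2)$:

```python
import sympy as sp

t, a, b = sp.symbols('t a b', positive=True)

# Circular helix r(t) = (a cos t, a sin t, b t)
r = sp.Matrix([a*sp.cos(t), a*sp.sin(t), b*t])
r1, r2, r3 = r.diff(t), r.diff(t, 2), r.diff(t, 3)

def norm(v):                     # |v| for a real vector
    return sp.sqrt(v.dot(v))

cr = r1.cross(r2)
kappa = sp.simplify(norm(cr)/norm(r1)**3)     # 1/rho_K,   Eq. (17.2-5)
tau = sp.simplify(cr.dot(r3)/cr.dot(cr))      # 1/rho_tau, Eq. (17.2-5)

assert sp.simplify(kappa - a/(a**2 + b**2)) == 0
assert sp.simplify(tau - b/(a**2 + b**2)) == 0
```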
17.2-4. Equations of the Tangent, Principal Normal, and Binormal, and of the Osculating, Normal, and Rectifying Planes. (a) The tangent, principal normal, and binormal of C at the point $P_1 \equiv (\mathbf r_1) \equiv (x_1, y_1, z_1)$ are respectively described by

$\mathbf r = \mathbf r_1 + u\mathbf t \qquad \mathbf r = \mathbf r_1 + u\mathbf n \qquad \mathbf r = \mathbf r_1 + u\mathbf b$   (17.2-7)

where u is a variable parameter. The equations of the osculating, normal, and rectifying planes are, respectively,

$(\mathbf r - \mathbf r_1)\cdot\mathbf b = 0 \qquad (\mathbf r - \mathbf r_1)\cdot\mathbf t = 0 \qquad (\mathbf r - \mathbf r_1)\cdot\mathbf n = 0$   (17.2-8)
(b) The right-handed rectangular cartesian components of the unit vectors (2) are

$t_x = x' \qquad t_y = y' \qquad t_z = z'$   (direction cosines of the tangent)   (17.2-9a)

$n_x = \frac{1}{\kappa}\,x'' \qquad n_y = \frac{1}{\kappa}\,y'' \qquad n_z = \frac{1}{\kappa}\,z''$   (direction cosines of the principal normal)   (17.2-9b)

$b_x = \frac{1}{\kappa}(y'z'' - y''z') \qquad b_y = \frac{1}{\kappa}(z'x'' - z''x') \qquad b_z = \frac{1}{\kappa}(x'y'' - x''y')$   (direction cosines of the binormal)   (17.2-9c)

and

$\kappa = \frac{1}{\rho_K} = \sqrt{x''^2 + y''^2 + z''^2} \qquad \tau = \frac{1}{\rho_\tau} = \frac{1}{\kappa^2}\begin{vmatrix} x' & y' & z' \\ x'' & y'' & z'' \\ x''' & y''' & z''' \end{vmatrix}$   (17.2-10)

Substitution of the correct direction cosines (9) in

$\frac{x - x_1}{\cos\alpha_x} = \frac{y - y_1}{\cos\alpha_y} = \frac{z - z_1}{\cos\alpha_z}$   (17.2-11)

yields the equations of the tangent, principal normal, and binormal in terms of rectangular cartesian coordinates, and

$(x - x_1)\cos\alpha_x + (y - y_1)\cos\alpha_y + (z - z_1)\cos\alpha_z = 0$   (17.2-12)

represents the osculating, normal, and rectifying planes.
represents the osculating, normal, and rectifying planes. 17.2-5. Additional Topics, (a) The center of curvature associated with the curve point Pi has the position vector rK = ri + P
(17.2-13)
(b) The limit of a sphere through four distinct pointsPi, P2, P8, P4 of C as P2, P8, and P4 approach Pi is the osculating sphere of C at Pi.
Its center lies on the directed
straight line in the positive binormal direction through the center of curvature (axis of curvature or polar line of C at Pi). The radius pa of the osculating sphere and the position vector ra of its center are
P.2 = pK2 + (pTPK)2
r, = rK + pTpKb = tx + P(tn + pTpKb
(17.2-14)
C lies on a sphere of radius R if and only if pT = R. The polar lines of C are tangent to the polar curve defined as the locus of the centers of the osculating spheres of C. The polar surface (polar developable) of C is the ruled surface (Sec. 3.1-15) generated by the polar lines. (c) Involutes and Evolutes. The tangents of C generate a ruled surface (tan gent surface, tangential developable) of two sheets, which are tangent to one another at the given curve. The involutes of the given curve C are those curves on the tangent surface which are orthogonal to the generating tangents. Given the position vector r = r(s) of a moving point P8 on C, the points P/ = (g) of every involute of C are given by 9 = Q(s) = r(s) + (d- s)t(s)
Each involute corresponds to a specific value of d.
(17.2-15)
Note that d is the constant sum
of 8 and the tangent PJPi (string property of involutes).
A curve C' is an evolute of C if the tangents of C' are normal to C, i.e., if C is an involute of C'. The evolutes of C lie on its polar surface (see also Ref. 17.7).

17.2-6. Osculation (see also Sec. 17.1-5). A point $P_1 \equiv (\mathbf r_1) \equiv (x_1, y_1, z_1)$ is an osculation point (point of contact) of order n of two regular arcs represented by $\mathbf r = \mathbf f(s)$ and $\mathbf r = \mathbf g(s)$ if and only if at $P_1$ one has $s = s_1$ such that

$\mathbf f(s_1) = \mathbf g(s_1) = \mathbf r_1 \qquad \mathbf f'(s_1) = \mathbf g'(s_1) \qquad \cdots \qquad \mathbf f^{(n)}(s_1) = \mathbf g^{(n)}(s_1) \qquad \mathbf f^{(n+1)}(s_1) \ne \mathbf g^{(n+1)}(s_1)$   (17.2-16)
17.3. SURFACES IN THREE-DIMENSIONAL EUCLIDEAN SPACE

17.3-1. Introduction (see also Sec. 3.1-14; see Sec. 3.5-10 for examples). Sections 17.3-1 to 17.3-14 deal with the geometry of a regular surface element S described by

$\mathbf r = \mathbf r(u, v)$   (17.3-1a)
or $x = x(u, v) \qquad y = y(u, v) \qquad z = z(u, v)$   (17.3-1b, c)

in some region of values of the parameters (surface coordinates) u, v (Sec. 3.1-14). The functions (1) are to have continuous first partial derivatives such that the rank of the matrix

$\begin{bmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} & \dfrac{\partial z}{\partial u} \\ \dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} & \dfrac{\partial z}{\partial v} \end{bmatrix}$   (17.3-2)

equals 2 (Sec. 13.2-7), i.e., such that the three functional determinants $\partial(x, y)/\partial(u, v)$, $\partial(y, z)/\partial(u, v)$, $\partial(z, x)/\partial(u, v)$ do not vanish simultaneously, or

$\mathbf r_u\times\mathbf r_v \ne 0 \qquad \left(\mathbf r_u \equiv \frac{\partial\mathbf r}{\partial u},\; \mathbf r_v \equiv \frac{\partial\mathbf r}{\partial v}\right)$   (17.3-3)

Higher partial derivatives will be assumed to exist as needed. The conditions listed above ensure the existence and linear independence of the vectors $\mathbf r_u$ and $\mathbf r_v$ directed, respectively, along the tangents to the surface-coordinate lines v = constant and u = constant through the surface point (u, v). Surface points where the three determinants exist but vanish for every choice of the parameters u, v are singular points corresponding to edges, vertices, etc.
tangent at Pi. This tangent plane contains the tangents of all regular surface arcs through Pi; the equation of the tangent plane is
17.3-3
DIFFERENTIAL GEOMETRY X — Xi
[(r - ri)rttr„] = 0
or
dx
dx
du
dv
dy
dy
-r-
V - Vi "ST. du •£ dv z — Zi
dz
570
(17.3-4)
= 0
dz
—
—
du
dv
where all derivatives are taken for u = uh v = vx.
(b) The surface normal of S at the regular surface point P₁ = (r₁) ≡ (x₁, y₁, z₁) ≡ (u₁, v₁) is the straight line through P₁ normal to the tangent plane at that point. This surface normal is described by the parametric equation

\[
\mathbf{r} - \mathbf{r}_1 = t\,\mathbf{N}
\tag{17.3-5}
\]

where

\[
\mathbf{N} = \frac{\mathbf{r}_u \times \mathbf{r}_v}{|\mathbf{r}_u \times \mathbf{r}_v|}
= \frac{\mathbf{i}\,\dfrac{\partial(y,z)}{\partial(u,v)} + \mathbf{j}\,\dfrac{\partial(z,x)}{\partial(u,v)} + \mathbf{k}\,\dfrac{\partial(x,y)}{\partial(u,v)}}
{\sqrt{\left[\dfrac{\partial(y,z)}{\partial(u,v)}\right]^2 + \left[\dfrac{\partial(z,x)}{\partial(u,v)}\right]^2 + \left[\dfrac{\partial(x,y)}{\partial(u,v)}\right]^2}}
\tag{17.3-6}
\]

is the unit normal vector of S at the point P₁; all derivatives are taken for u = u₁, v = v₁. The direction of N is the direction of the positive normal at P₁; note that the positive u direction, the positive v direction, and the positive normal form a right-handed system of axes (Sec. 3.1-3) directed along r_u, r_v, and N.
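The regularity condition (3) and the unit normal (6) can be exercised numerically on a concrete surface. The following sketch (plain Python with central finite differences; the sphere parametrization and the numerical values are assumed examples, not from the text) computes r_u, r_v, and N for a sphere of radius R and confirms that N is the outward radial unit vector and that |r_u × r_v| = R² sin u ≠ 0 away from the poles.

```python
import math

R = 2.0      # sphere radius (example value)
h = 1e-6     # step for central finite differences

def r(u, v):
    # sphere of radius R: x = R sin u cos v, y = R sin u sin v, z = R cos u
    return (R*math.sin(u)*math.cos(v), R*math.sin(u)*math.sin(v), R*math.cos(u))

def partial(f, u, v, wrt):
    du, dv = (h, 0.0) if wrt == 'u' else (0.0, h)
    p, m = f(u + du, v + dv), f(u - du, v - dv)
    return tuple((a - b)/(2*h) for a, b in zip(p, m))

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def norm(a):
    return math.sqrt(sum(x*x for x in a))

u0, v0 = 0.7, 1.2
ru = partial(r, u0, v0, 'u')        # tangent along a v = constant line
rv = partial(r, u0, v0, 'v')        # tangent along a u = constant line
n = cross(ru, rv)
area_scale = norm(n)                # |r_u x r_v|; equals R^2 sin u for the sphere
N = tuple(x/area_scale for x in n)  # unit normal, Eq. (17.3-6)
radial = tuple(x/R for x in r(u0, v0))
```

For 0 < u < π the scale factor |r_u × r_v| is nonzero, so the parametrization is regular there; the poles u = 0, π are the singular points of this particular parametrization.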
17.3-3. The First Fundamental Form of a Surface. Elements of Distance and Area. (a) The vector path element (Sec. 5.4-4) along a surface curve

\[
\mathbf{r} = \mathbf{r}[u(t), v(t)]
\qquad\text{or}\qquad
u = u(t) \qquad v = v(t)
\tag{17.3-7}
\]

is

\[
d\mathbf{r} = \mathbf{r}_u\,du + \mathbf{r}_v\,dv
\tag{17.3-8}
\]

and the square of the element of distance ds = |dr| on the surface (1) is given at each surface point (u, v) by

\[
ds^2 = |d\mathbf{r}|^2 = E(u,v)\,du^2 + 2F(u,v)\,du\,dv + G(u,v)\,dv^2
\tag{17.3-9}
\]

with

\[
E(u,v) \equiv \mathbf{r}_u\cdot\mathbf{r}_u = \left(\frac{\partial x}{\partial u}\right)^2 + \left(\frac{\partial y}{\partial u}\right)^2 + \left(\frac{\partial z}{\partial u}\right)^2
\]
\[
F(u,v) \equiv \mathbf{r}_u\cdot\mathbf{r}_v = \frac{\partial x}{\partial u}\frac{\partial x}{\partial v} + \frac{\partial y}{\partial u}\frac{\partial y}{\partial v} + \frac{\partial z}{\partial u}\frac{\partial z}{\partial v}
\]
\[
G(u,v) \equiv \mathbf{r}_v\cdot\mathbf{r}_v = \left(\frac{\partial x}{\partial v}\right)^2 + \left(\frac{\partial y}{\partial v}\right)^2 + \left(\frac{\partial z}{\partial v}\right)^2
\]

(first fundamental form of the surface)

At every regular point of a real surface (1) described in terms of real surface coordinates u, v the quadratic form (9) is positive definite (Sec. 13.5-2), i.e.,

\[
E > 0 \qquad G > 0 \qquad EG - F^2 > 0
\tag{17.3-10}
\]
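For a concrete surface the coefficients E, F, G can be evaluated directly from Eq. (17.3-9). A minimal sketch (the sphere parametrization is an assumed example): for r = (R sin u cos v, R sin u sin v, R cos u) one expects E = R², F = 0, G = R² sin²u, and the inequalities (17.3-10) to hold away from the poles.

```python
import math

R = 2.0
h = 1e-5

def r(u, v):
    return (R*math.sin(u)*math.cos(v), R*math.sin(u)*math.sin(v), R*math.cos(u))

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def first_fundamental_form(u, v):
    # E = r_u . r_u, F = r_u . r_v, G = r_v . r_v  (Eq. 17.3-9), via central differences
    ru = tuple((a - b)/(2*h) for a, b in zip(r(u + h, v), r(u - h, v)))
    rv = tuple((a - b)/(2*h) for a, b in zip(r(u, v + h), r(u, v - h)))
    return dot(ru, ru), dot(ru, rv), dot(rv, rv)

E, F, G = first_fundamental_form(0.9, 0.4)
```

Because F = 0 here, the surface coordinates u, v of the sphere are orthogonal (see Sec. 17.3-3b below).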
(b) The angle γ between two regular surface arcs

\[
\mathbf{r} = \mathbf{R}_1(t) \quad\text{or}\quad u = U_1(t),\; v = V_1(t)
\qquad\text{and}\qquad
\mathbf{r} = \mathbf{R}_2(t) \quad\text{or}\quad u = U_2(t),\; v = V_2(t)
\]

through the surface point (u, v) is given by

\[
\cos\gamma = \frac{d\mathbf{R}_1\cdot d\mathbf{R}_2}{|d\mathbf{R}_1|\,|d\mathbf{R}_2|}
= \frac{E\,dU_1\,dU_2 + F(dU_1\,dV_2 + dV_1\,dU_2) + G\,dV_1\,dV_2}
{\sqrt{E\,dU_1^2 + 2F\,dU_1\,dV_1 + G\,dV_1^2}\;\sqrt{E\,dU_2^2 + 2F\,dU_2\,dV_2 + G\,dV_2^2}}
\]
\[
= E\,\frac{dU_1}{ds_1}\frac{dU_2}{ds_2}
+ F\left(\frac{dU_1}{ds_1}\frac{dV_2}{ds_2} + \frac{dV_1}{ds_1}\frac{dU_2}{ds_2}\right)
+ G\,\frac{dV_1}{ds_1}\frac{dV_2}{ds_2}
\tag{17.3-11}
\]

In particular, the angle γ₁ between the surface-coordinate lines u = constant and v = constant through the surface point (u, v) is given by

\[
\cos\gamma_1 = \frac{F}{\sqrt{EG}}
\qquad
\sin\gamma_1 = \frac{\sqrt{EG - F^2}}{\sqrt{EG}}
\tag{17.3-12}
\]

The surface coordinates u, v are orthogonal if and only if F = 0 (see also Secs. 6.4-1 and 16.8-2).
(c) The vector element of area dA and the scalar element of area dA at the regular surface point (u, v) are defined by

\[
d\mathbf{A} = (\mathbf{r}_u \times \mathbf{r}_v)\,du\,dv = \mathbf{N}\,|d\mathbf{A}|
\qquad
dA = \pm|d\mathbf{A}| = \sqrt{a(u, v)}\,du\,dv
\tag{17.3-13}
\]

with

\[
a(u,v) \equiv |\mathbf{r}_u \times \mathbf{r}_v|^2 = EG - F^2
= \left[\frac{\partial(y,z)}{\partial(u,v)}\right]^2
+ \left[\frac{\partial(z,x)}{\partial(u,v)}\right]^2
+ \left[\frac{\partial(x,y)}{\partial(u,v)}\right]^2
\]

The sign of dA may be fixed arbitrarily (see also Secs. 4.6-11, 5.4-6a, and 6.4-3b).
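Equation (17.3-13) turns area computation into a double integral of √(EG − F²). As a check (assumed example: sphere of radius R, for which √(EG − F²) = R² sin u), a midpoint-rule integration over 0 ≤ u ≤ π, 0 ≤ v < 2π should reproduce the total area 4πR²:

```python
import math

R = 1.5

def sqrt_a(u):
    # sphere: E = R^2, F = 0, G = R^2 sin^2 u, so sqrt(EG - F^2) = R^2 sin u
    return R*R*math.sin(u)

def sphere_area(n=2000):
    # A = integral of sqrt(a) du dv; the integrand is independent of v here,
    # so the v integration contributes a factor 2*pi
    du = math.pi/n
    s = sum(sqrt_a((i + 0.5)*du)*du for i in range(n))
    return 2*math.pi*s

A = sphere_area()
```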
17.3-4. Geodesic and Normal Curvature of a Surface Curve. Meusnier's Theorem. (a) At each point (u, v) of a regular surface arc C described by

\[
\mathbf{r} = \mathbf{r}[u(s), v(s)]
\qquad\text{or}\qquad
u = u(s) \qquad v = v(s)
\tag{17.3-14}
\]

the curvature vector r″ = κ**u** (Sec. 17.2-2d), where **u** is the unit principal-normal vector, has a unique component in the tangent plane (geodesic or tangential curvature vector) and a unique component along the surface normal (normal curvature vector), i.e.,

\[
\mathbf{r}'' = \kappa\mathbf{u} = \kappa_g(\mathbf{N} \times \mathbf{r}') + \kappa_n\mathbf{N}
\tag{17.3-15}
\]

where the primes indicate differentiation with respect to the arc length s along C. At each point [u(s), v(s)],

\[
\kappa_g = \kappa[\mathbf{r}'\,\mathbf{u}\,\mathbf{N}] = [\mathbf{r}'\,\mathbf{r}''\,\mathbf{N}]
\qquad\text{[geodesic or tangential curvature of C at (u, v)]}
\tag{17.3-16}
\]

is the curvature of the projection of C onto the tangent plane (see also Sec. 17.4-2d), and

\[
\kappa_n = \kappa(\mathbf{u}\cdot\mathbf{N}) = \mathbf{r}''\cdot\mathbf{N} = -\mathbf{r}'\cdot\mathbf{N}'
\qquad\text{[normal curvature of C at (u, v)]}
\tag{17.3-17}
\]

is the curvature of the normal section (intersection of the surface and a plane containing the surface normal) through the tangent of C (see also Sec. 17.3-5).

(b) At a given surface point (u, v), the curvature κ of every surface curve whose osculating plane (Sec. 17.2-2b) makes an angle α with the normal-section plane through the same tangent is

\[
\kappa = \frac{\kappa_n}{\cos\alpha}
\tag{17.3-18}
\]

where κ_n is the normal-section curvature given by Eq. (17) or (20) (Meusnier's Theorem). Equation (18) yields, in particular, the curvature κ of any oblique surface section in terms of the curvature κ_n of the normal section through the same tangent.
17.3-5. The Second Fundamental Form. Principal Curvatures, Gaussian Curvature, and Mean Curvature. (a) To write Eq. (17) in terms of the surface coordinates u, v, let dN = N_u du + N_v dv. Then

\[
-d\mathbf{r}\cdot d\mathbf{N} = L(u,v)\,du^2 + 2M(u,v)\,du\,dv + N(u,v)\,dv^2
\tag{17.3-19}
\]

with

\[
L(u,v) = -\mathbf{r}_u\cdot\mathbf{N}_u = \frac{[\mathbf{r}_{uu}\,\mathbf{r}_u\,\mathbf{r}_v]}{\sqrt{EG - F^2}}
\qquad
M(u,v) = -\mathbf{r}_u\cdot\mathbf{N}_v = -\mathbf{r}_v\cdot\mathbf{N}_u = \frac{[\mathbf{r}_{uv}\,\mathbf{r}_u\,\mathbf{r}_v]}{\sqrt{EG - F^2}}
\]
\[
N(u,v) = -\mathbf{r}_v\cdot\mathbf{N}_v = \frac{[\mathbf{r}_{vv}\,\mathbf{r}_u\,\mathbf{r}_v]}{\sqrt{EG - F^2}}
\]

(second fundamental form of the surface)

where all derivatives are taken at the surface point (u, v); the box products can be expanded by Eq. (5.2-11). At a given surface point (r) ≡ (u, v) the curvature of the normal section containing the adjacent surface point (r + dr) ≡ (u + du, v + dv) is

\[
\kappa_n = -\frac{d\mathbf{r}\cdot d\mathbf{N}}{d\mathbf{r}\cdot d\mathbf{r}}
= \frac{L\,du^2 + 2M\,du\,dv + N\,dv^2}{E\,du^2 + 2F\,du\,dv + G\,dv^2}
\tag{17.3-20}
\]
(b) Unless κ_n has the same value for every normal section through (u, v) (L : M : N = E : F : G, umbilic point), there exist two principal normal sections respectively associated with the largest value κ₁ and the smallest value κ₂ of κ_n [principal curvatures of S at the surface point (u, v)]. The planes of the principal normal sections are mutually perpendicular; for any normal section through (u, v) whose plane forms the oblique angle ϑ with the plane of the first principal normal section,

\[
\kappa_n = \kappa_1\cos^2\vartheta + \kappa_2\sin^2\vartheta
\qquad\text{(Euler's theorem)}
\tag{17.3-21}
\]

κ₁ and κ₂ are eigenvalues of the type discussed in Sec. 14.8-7 (see also Sec. 14.8-8); they are obtainable as roots of the characteristic equation

\[
\begin{vmatrix}
L - \kappa E & M - \kappa F\\
M - \kappa F & N - \kappa G
\end{vmatrix} = 0
\tag{17.3-22}
\]

The symmetric functions H(u, v) ≡ ½(κ₁ + κ₂) and K(u, v) ≡ κ₁κ₂ are respectively known as the mean curvature and the Gaussian curvature of the surface S at the point (u, v); note

\[
H \equiv \tfrac{1}{2}(\kappa_1 + \kappa_2) = \frac{EN - 2FM + GL}{2(EG - F^2)}
\qquad\text{(mean curvature)}
\tag{17.3-23}
\]
\[
K \equiv \kappa_1\kappa_2 = \frac{LN - M^2}{EG - F^2}
\qquad\text{(Gaussian curvature)}
\tag{17.3-24}
\]

(see also Secs. 17.3-8 and 17.3-13). κ₁, κ₂, H, and K are surface point functions independent of the particular surface coordinates u, v used. Depending on whether the quadratic form (19) is definite, semidefinite, or indefinite (Sec. 13.5-2) at the surface point (u, v), the latter is

An elliptic point with K = κ₁κ₂ > 0 (normal sections all convex or all concave; surface does not intersect its tangent plane. EXAMPLE: any point of an ellipsoid)

A parabolic point with K = κ₁κ₂ = 0 (EXAMPLE: any point of a cylinder)

A hyperbolic point (saddle point) with K = κ₁κ₂ < 0 (both convex and concave normal sections; surface intersects its tangent plane. EXAMPLE: any point of a hyperboloid of one sheet)
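The machinery of Eqs. (17.3-19) to (17.3-24) is easy to exercise numerically. The sketch below (assumed example: sphere of radius R; plain Python with finite differences) builds both fundamental forms and checks K = 1/R² and |H| = 1/R (the sign of H depends on the chosen orientation of N).

```python
import math

R = 2.0
h = 1e-4

def r(u, v):
    return (R*math.sin(u)*math.cos(v), R*math.sin(u)*math.sin(v), R*math.cos(u))

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def curvatures(u, v):
    ru = tuple((a - b)/(2*h) for a, b in zip(r(u + h, v), r(u - h, v)))
    rv = tuple((a - b)/(2*h) for a, b in zip(r(u, v + h), r(u, v - h)))
    ruu = tuple((a - 2*b + c)/h**2 for a, b, c in zip(r(u + h, v), r(u, v), r(u - h, v)))
    rvv = tuple((a - 2*b + c)/h**2 for a, b, c in zip(r(u, v + h), r(u, v), r(u, v - h)))
    ruv = tuple((a - b - c + d)/(4*h**2) for a, b, c, d in
                zip(r(u + h, v + h), r(u + h, v - h), r(u - h, v + h), r(u - h, v - h)))
    n = cross(ru, rv)
    nn = tuple(x/math.sqrt(dot(n, n)) for x in n)
    E, F, G = dot(ru, ru), dot(ru, rv), dot(rv, rv)
    # L = r_uu . N etc. (equivalent to -r_u . N_u, by differentiating r_u . N = 0)
    L, M, N_ = dot(ruu, nn), dot(ruv, nn), dot(rvv, nn)
    K = (L*N_ - M*M)/(E*G - F*F)              # Gaussian curvature, Eq. (17.3-24)
    H = (E*N_ - 2*F*M + G*L)/(2*(E*G - F*F))  # mean curvature, Eq. (17.3-23)
    return K, H

K, H = curvatures(0.8, 0.3)
```

Every point of the sphere is an umbilic (κ₁ = κ₂ = ±1/R) and elliptic (K > 0), consistent with the classification above.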
An umbilic point (κ₁ = κ₂, Sec. 17.3-5b) is necessarily either elliptic or parabolic.

17.3-6. Special Directions and Curves on a Surface. Minimal Surfaces. (a) A line of curvature is a surface curve whose tangent at each point belongs to a principal normal section; the differential equation

\[
\begin{vmatrix}
dv^2 & -du\,dv & du^2\\
E & F & G\\
L & M & N
\end{vmatrix} = 0
\tag{17.3-25}
\]

defines the two mutually perpendicular lines of curvature v = v(u) through each point (u, v).
(b) An asymptotic line is a surface curve of zero normal curvature (20), so that

\[
L\,du^2 + 2M\,du\,dv + N\,dv^2 = 0
\tag{17.3-26}
\]

(EXAMPLE: any straight line on the surface). The tangent to an asymptotic line defines an asymptotic direction on the surface. Note that the asymptotic lines at elliptic surface points are imaginary curves v = v(u). The lines of curvature bisect the asymptotic directions.

(c) Two regular surface arcs u = U₁(t), v = V₁(t) and u = U₂(t), v = V₂(t) have mutually conjugate directions at the surface point (u, v) if and only if the tangent of one arc is the limiting intersection of the tangent planes at two points approaching (u, v) on the other arc, or

\[
L\,dU_1\,dU_2 + M(dU_1\,dV_2 + dU_2\,dV_1) + N\,dV_1\,dV_2 = 0
\tag{17.3-27}
\]

This relationship is necessarily reciprocal [EXAMPLE: directions of the lines of curvature at (u, v)].

(d) The parametric lines u = constant, v = constant are

Orthogonal if and only if F = 0
Conjugate if and only if M = 0
Lines of curvature if and only if F = M = 0
Asymptotic lines if and only if L = N = 0

(e) A minimal surface is a surface such that H(u, v) ≡ 0; this is true if and only if the asymptotic lines form an orthogonal net. Surfaces having minimum area for a given reasonable boundary curve are minimal surfaces (Plateau's problem, soap-film models).
17.3-7. Surfaces as Riemann Spaces. Three-index Symbols and Beltrami Parameters. A regular surface element S with fundamental form (9) is a two-dimensional Riemann space of points (u, v) with metric-tensor components E, F, G (Secs. 16.7-1 and 17.4-1; see also Sec. 17.3-12); a(u, v) = EG − F² is the metric-tensor determinant (Sec. 16.7-1). One can define surface vectors and tensors, surface scalar products, and surface covariant differentiation on the Riemann space S in the manner of Secs. 16.2-1, 16.8-1, and 16.10-1. The Christoffel three-index symbols of the second kind (Sec. 16.10-3) for the surface S take the form

\[
\begin{Bmatrix}1\\1\,1\end{Bmatrix}_S = \frac{GE_u - 2FF_u + FE_v}{2(EG - F^2)}
\qquad
\begin{Bmatrix}2\\1\,1\end{Bmatrix}_S = \frac{-FE_u + 2EF_u - EE_v}{2(EG - F^2)}
\]
\[
\begin{Bmatrix}1\\1\,2\end{Bmatrix}_S = \begin{Bmatrix}1\\2\,1\end{Bmatrix}_S = \frac{GE_v - FG_u}{2(EG - F^2)}
\qquad
\begin{Bmatrix}2\\1\,2\end{Bmatrix}_S = \begin{Bmatrix}2\\2\,1\end{Bmatrix}_S = \frac{EG_u - FE_v}{2(EG - F^2)}
\tag{17.3-28}
\]
\[
\begin{Bmatrix}1\\2\,2\end{Bmatrix}_S = \frac{-FG_v + 2GF_v - GG_u}{2(EG - F^2)}
\qquad
\begin{Bmatrix}2\\2\,2\end{Bmatrix}_S = \frac{EG_v - 2FF_v + FG_u}{2(EG - F^2)}
\]

(subscripts u, v denote partial differentiations), where the subscript S has been introduced to distinguish the surface Christoffel symbols (28) clearly from the Christoffel symbols associated with the surrounding space.
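The six surface Christoffel symbols (17.3-28) depend only on E, F, G. A minimal sketch (assumed example: sphere metric E = R², F = 0, G = R² sin²u, for which the only nonzero symbols are {1; 2 2}_S = −sin u cos u and {2; 1 2}_S = cot u):

```python
import math

R = 2.0
h = 1e-5

def E(u, v): return R*R
def F(u, v): return 0.0
def G(u, v): return R*R*math.sin(u)**2

def d(f, u, v, wrt):
    # central-difference partial derivative of a metric coefficient
    return ((f(u + h, v) - f(u - h, v))/(2*h) if wrt == 'u'
            else (f(u, v + h) - f(u, v - h))/(2*h))

def christoffel(u, v):
    e, f, g = E(u, v), F(u, v), G(u, v)
    Eu, Ev = d(E, u, v, 'u'), d(E, u, v, 'v')
    Fu, Fv = d(F, u, v, 'u'), d(F, u, v, 'v')
    Gu, Gv = d(G, u, v, 'u'), d(G, u, v, 'v')
    a2 = 2*(e*g - f*f)
    return {                                   # keys: (upper, lower1, lower2)
        (1, 1, 1): (g*Eu - 2*f*Fu + f*Ev)/a2,
        (1, 1, 2): (g*Ev - f*Gu)/a2,
        (1, 2, 2): (-f*Gv + 2*g*Fv - g*Gu)/a2,
        (2, 1, 1): (-f*Eu + 2*e*Fu - e*Ev)/a2,
        (2, 1, 2): (e*Gu - f*Ev)/a2,
        (2, 2, 2): (e*Gv - 2*f*Fv + f*Gu)/a2,
    }

u0 = 0.6
c = christoffel(u0, 0.0)
```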
For suitably differentiable functions Φ(u, v), Ψ(u, v), the functions

\[
\nabla_S(\Phi, \Psi) \equiv \frac{E\,\Phi_v\Psi_v - F(\Phi_u\Psi_v + \Phi_v\Psi_u) + G\,\Phi_u\Psi_u}{EG - F^2}
\qquad\text{(Beltrami's first differential parameter)}
\]
\[
\nabla_S^2\Phi \equiv \frac{1}{\sqrt{EG - F^2}}
\left[\frac{\partial}{\partial u}\frac{G\Phi_u - F\Phi_v}{\sqrt{EG - F^2}}
+ \frac{\partial}{\partial v}\frac{E\Phi_v - F\Phi_u}{\sqrt{EG - F^2}}\right]
\qquad\text{(Beltrami's second differential parameter)}
\tag{17.3-29}
\]

are surface differential invariants respectively analogous to ∇Φ · ∇Ψ and ∇²Φ as defined in Table 16.10-1. Since the two-dimensional Riemann space S is "embedded" in three-dimensional Euclidean space, one can interpret covariant differentiation on a curved surface by actual comparison of surface vectors at adjacent surface points.

17.3-8. Partial Differential Equations Satisfied by the Coefficients of the Fundamental Forms. Gauss's Theorema Egregium. (a) The following relations describe the change of the linearly independent vectors r_u, r_v, N ("local trihedron") along the surface-coordinate directions:

\[
\mathbf{r}_{uu} = \begin{Bmatrix}1\\1\,1\end{Bmatrix}_S\mathbf{r}_u + \begin{Bmatrix}2\\1\,1\end{Bmatrix}_S\mathbf{r}_v + L\mathbf{N}
\qquad
\mathbf{r}_{uv} = \begin{Bmatrix}1\\1\,2\end{Bmatrix}_S\mathbf{r}_u + \begin{Bmatrix}2\\1\,2\end{Bmatrix}_S\mathbf{r}_v + M\mathbf{N}
\]
\[
\mathbf{r}_{vv} = \begin{Bmatrix}1\\2\,2\end{Bmatrix}_S\mathbf{r}_u + \begin{Bmatrix}2\\2\,2\end{Bmatrix}_S\mathbf{r}_v + N\mathbf{N}
\qquad\text{(Gauss's equations)}
\tag{17.3-30}
\]
\[
\mathbf{N}_u = \frac{1}{EG - F^2}\left[(FM - GL)\mathbf{r}_u + (FL - EM)\mathbf{r}_v\right]
\qquad
\mathbf{N}_v = \frac{1}{EG - F^2}\left[(FN - GM)\mathbf{r}_u + (FM - EN)\mathbf{r}_v\right]
\qquad\text{(Weingarten's equations)}
\tag{17.3-31}
\]
(b) The following relations are compatibility relations (integrability conditions, Sec. 10.1-2c) ensuring that r_uuv = r_uvu and r_uvv = r_vvu:

\[
L_v - M_u = \begin{Bmatrix}1\\1\,2\end{Bmatrix}_S L
+ \left(\begin{Bmatrix}2\\1\,2\end{Bmatrix}_S - \begin{Bmatrix}1\\1\,1\end{Bmatrix}_S\right)M
- \begin{Bmatrix}2\\1\,1\end{Bmatrix}_S N
\]
\[
M_v - N_u = \begin{Bmatrix}1\\2\,2\end{Bmatrix}_S L
+ \left(\begin{Bmatrix}2\\2\,2\end{Bmatrix}_S - \begin{Bmatrix}1\\1\,2\end{Bmatrix}_S\right)M
- \begin{Bmatrix}2\\1\,2\end{Bmatrix}_S N
\qquad\text{(Mainardi-Codazzi equations)}
\tag{17.3-32}
\]
\[
\frac{\partial}{\partial v}\begin{Bmatrix}2\\1\,1\end{Bmatrix}_S
- \frac{\partial}{\partial u}\begin{Bmatrix}2\\1\,2\end{Bmatrix}_S
+ \begin{Bmatrix}1\\1\,1\end{Bmatrix}_S\begin{Bmatrix}2\\1\,2\end{Bmatrix}_S
+ \begin{Bmatrix}2\\1\,1\end{Bmatrix}_S\begin{Bmatrix}2\\2\,2\end{Bmatrix}_S
- \begin{Bmatrix}1\\1\,2\end{Bmatrix}_S\begin{Bmatrix}2\\1\,1\end{Bmatrix}_S
- \begin{Bmatrix}2\\1\,2\end{Bmatrix}_S^2
= E\,\frac{LN - M^2}{EG - F^2}
\tag{17.3-33}
\]

(c) Equation (33) expresses K solely in terms of E, F, and G and their derivatives: the Gaussian curvature K(u, v) of a surface is a bending invariant unchanged by deformations which preserve the first fundamental form (Gauss's Theorema Egregium).
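The Theorema Egregium can be checked numerically from the first fundamental form alone. For the unit-sphere metric E = 1, F = 0, G = sin²u (an assumed example) the only nonzero Christoffel symbol entering Eq. (17.3-33) is {2; 1 2}_S = cot u, and with the sign convention used here the Gauss equation collapses to E K = −∂{2; 1 2}_S/∂u − ({2; 1 2}_S)², which should yield K = 1:

```python
import math

h = 1e-5

def gamma212(u):
    # {2; 1 2}_S = E G_u / (2(EG - F^2)) = cot u  for E = 1, F = 0, G = sin^2 u
    return math.cos(u)/math.sin(u)

def gauss_K(u):
    # Gauss equation with {1;11} = {1;12} = {2;11} = {2;22} = 0 and E = 1:
    # K = -d/du {2;12} - ({2;12})^2
    dg = (gamma212(u + h) - gamma212(u - h))/(2*h)
    return -dg - gamma212(u)**2

K = gauss_K(0.8)
```

The result uses no second-fundamental-form data at all, which is exactly the content of the theorem.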
17.3-9. Definition of a Surface by E, F, G, L, M, and N. Three given functions E(u, v) > 0, G(u, v) > 0, and F(u, v) such that EG − F² > 0 define the metric properties (intrinsic geometry) of the surface. Six functions E, F, G, L, M, N satisfying the above inequalities and the compatibility conditions (32) and (33) uniquely define a corresponding real surface r = r(u, v) except for its position and orientation in space (Fundamental Theorem of Surface Theory).
17.3-10. Mappings. (a) A suitably differentiable one-to-one transformation

\[
\bar{u} = \bar{u}(u, v) \qquad \bar{v} = \bar{v}(u, v)
\qquad
\left[\frac{\partial(\bar{u}, \bar{v})}{\partial(u, v)} \ne 0\right]
\tag{17.3-34}
\]

maps the points (u, v) of a given regular surface element r = r(u, v) onto corresponding points (ū, v̄) of another regular surface element r̄ = r̄(ū, v̄). In the following, barred symbols refer to the second surface. The mapping is

Isometric (preserves all metric properties) if and only if Ē(ū, v̄) = E(u, v), F̄(ū, v̄) = F(u, v), Ḡ(ū, v̄) = G(u, v)

Conformal (preserves angles) if and only if E(u, v) : F(u, v) : G(u, v) = Ē(ū, v̄) : F̄(ū, v̄) : Ḡ(ū, v̄)

Equiareal (preserves areas) if and only if

\[
\sqrt{EG - F^2} = \sqrt{\bar{E}\bar{G} - \bar{F}^2}\,\left|\frac{\partial(\bar{u}, \bar{v})}{\partial(u, v)}\right|
\]

(b) To obtain a conformal mapping (34), one maps each surface conformally onto a plane and relates the two planes by an analytic complex-variable transformation (Sec. 7.9-1).
To map the surface r = r(u, v) conformally onto a plane with rectangular cartesian coordinates ξ(u, v), η(u, v), solve the differential equation

\[
E(u,v)\,du^2 + 2F(u,v)\,du\,dv + G(u,v)\,dv^2 = 0
\tag{17.3-35a}
\]

or, equivalently,

\[
\frac{dv}{du} = \frac{1}{G}\left(-F + i\sqrt{EG - F^2}\right)
\qquad\text{and}\qquad
\frac{dv}{du} = \frac{1}{G}\left(-F - i\sqrt{EG - F^2}\right)
\tag{17.3-35b}
\]

to obtain the complex surface curves (isotropic or minimal surface "curves" defined by ds² = 0)

U(u, v) = constant    V(u, v) = constant

Then the real orthogonal surface coordinates

\[
\xi(u, v) = \frac{1}{2}(U + V)
\qquad
\eta(u, v) = \frac{1}{2i}(U - V)
\tag{17.3-36}
\]

(isometric or isothermic surface coordinates) reduce the first fundamental form of the surface to

\[
ds^2 = \Phi(\xi, \eta)(d\xi^2 + d\eta^2)
\tag{17.3-37}
\]

which is proportional to the first fundamental form dξ² + dη² of a plane with rectangular cartesian coordinates ξ, η.
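For the unit sphere (ds² = du² + sin²u dv², an assumed example) the isothermic coordinates reduce to the classical Mercator substitution ξ = v, η = log_e tan(u/2), which gives ds² = sin²u (dξ² + dη²). The sketch checks the key identity dη/du = 1/sin u numerically:

```python
import math

h = 1e-6

def eta(u):
    # Mercator (isothermic) coordinate for the unit-sphere metric du^2 + sin^2 u dv^2
    return math.log(math.tan(u/2))

u0 = 1.1
deta = (eta(u0 + h) - eta(u0 - h))/(2*h)
# with d(eta) = du / sin u:  du^2 + sin^2 u dv^2 = sin^2 u (d(eta)^2 + dv^2),
# i.e., the form (17.3-37) with Phi = sin^2 u
```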
17.3-11. Envelopes (see also Secs. 10.2-3 and 17.1-7). Let the equations

\[
\varphi(x, y, z, \lambda) = 0
\tag{17.3-38}
\]

represent a one-parameter family of surfaces such that

\[
\nabla\varphi \ne 0
\qquad
\frac{\partial^2\varphi}{\partial\lambda^2} \ne 0
\tag{17.3-39}
\]

in some region V of space. Then V contains a surface (envelope of the given surfaces) which touches (has a common tangent plane with) each surface (38) along a curve (characteristic) described by

\[
\varphi(x, y, z, \lambda) = 0
\qquad
\frac{\partial\varphi}{\partial\lambda} = 0
\tag{17.3-40}
\]

Elimination of λ from Eq. (40) yields the equation of the envelope. If, in addition to the condition (39), ∂³φ/∂λ³ ≠ 0 in V, then V also contains a curve (edge of regression of the envelope)

\[
\varphi = 0
\qquad
\frac{\partial\varphi}{\partial\lambda} = 0
\qquad
\frac{\partial^2\varphi}{\partial\lambda^2} = 0
\tag{17.3-41}
\]

which touches each characteristic (40) at a focal point obtainable by elimination of λ from Eq. (41). Equation (40) defines the locus of the limiting curves of intersection of neighboring surfaces (38).
17.3-12. Geodesics (see also Sec. 17.4-3). A geodesic on the regular surface element S is a regular arc whose geodesic curvature (Sec. 17.3-4a) is identically zero; a geodesic is either a straight line, or its principal normal coincides with the surface normal at each point. Every geodesic

u = u(s), v = v(s)    or    v = v(u)

satisfies the differential equations

\[
\frac{d^2u}{ds^2}
+ \begin{Bmatrix}1\\1\,1\end{Bmatrix}_S\left(\frac{du}{ds}\right)^2
+ 2\begin{Bmatrix}1\\1\,2\end{Bmatrix}_S\frac{du}{ds}\frac{dv}{ds}
+ \begin{Bmatrix}1\\2\,2\end{Bmatrix}_S\left(\frac{dv}{ds}\right)^2 = 0
\]
\[
\frac{d^2v}{ds^2}
+ \begin{Bmatrix}2\\1\,1\end{Bmatrix}_S\left(\frac{du}{ds}\right)^2
+ 2\begin{Bmatrix}2\\1\,2\end{Bmatrix}_S\frac{du}{ds}\frac{dv}{ds}
+ \begin{Bmatrix}2\\2\,2\end{Bmatrix}_S\left(\frac{dv}{ds}\right)^2 = 0
\tag{17.3-42a}
\]

or

\[
\frac{d^2v}{du^2}
= \begin{Bmatrix}1\\2\,2\end{Bmatrix}_S\left(\frac{dv}{du}\right)^3
+ \left(2\begin{Bmatrix}1\\1\,2\end{Bmatrix}_S - \begin{Bmatrix}2\\2\,2\end{Bmatrix}_S\right)\left(\frac{dv}{du}\right)^2
+ \left(\begin{Bmatrix}1\\1\,1\end{Bmatrix}_S - 2\begin{Bmatrix}2\\1\,2\end{Bmatrix}_S\right)\frac{dv}{du}
- \begin{Bmatrix}2\\1\,1\end{Bmatrix}_S
\tag{17.3-42b}
\]

These relations define a unique geodesic through any given point in any given direction.
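The geodesic equations (17.3-42a) integrate readily. For the unit sphere (E = 1, F = 0, G = sin²u, an assumed example) they read u″ = sin u cos u (v′)², v″ = −2 cot u u′v′; every solution is a great circle, so the embedded trajectory stays in the plane spanned by the initial position and velocity. A minimal RK4 sketch:

```python
import math

def rhs(y):
    # y = (u, v, u', v'); geodesic equations of the unit sphere
    u, v, up, vp = y
    return (up, vp, math.sin(u)*math.cos(u)*vp*vp,
            -2.0*(math.cos(u)/math.sin(u))*up*vp)

def rk4(y, ds, n):
    for _ in range(n):
        k1 = rhs(y)
        k2 = rhs(tuple(a + 0.5*ds*b for a, b in zip(y, k1)))
        k3 = rhs(tuple(a + 0.5*ds*b for a, b in zip(y, k2)))
        k4 = rhs(tuple(a + ds*b for a, b in zip(y, k3)))
        y = tuple(a + ds/6.0*(b + 2*c + 2*d + e)
                  for a, b, c, d, e in zip(y, k1, k2, k3, k4))
    return y

def embed(u, v):
    return (math.sin(u)*math.cos(v), math.sin(u)*math.sin(v), math.cos(u))

s2 = 1/math.sqrt(2)
y0 = (math.pi/2, 0.0, s2, s2)      # equator point, unit speed, heading 45 degrees
plane_normal = (0.0, s2, s2)       # p(0) x p'(0) for this start
y1 = rk4(y0, 0.001, 1000)          # integrate one unit of arc length
p1 = embed(y1[0], y1[1])
off_plane = sum(a*b for a, b in zip(plane_normal, p1))
speed2 = y1[2]**2 + math.sin(y1[0])**2*y1[3]**2   # E u'^2 + G v'^2, should stay 1
```

Conservation of the speed E u′² + G v′² is a built-in property of the arc-length parametrization and makes a convenient accuracy check on the integrator.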
Geodesics on a curved surface have many properties analogous to those of straight lines on a plane (see also Sec. 17.4-3). If there exists a surface curve of smallest or greatest arc length joining two given surface points, then that curve is a geodesic. The actual existence of a geodesic through two given surface points requires a separate investigation in each case.

17.3-13. Geodesic Normal Coordinates. Geometry on a Surface (see also Sec. 17.4-7). (a) For a system of geodesic normal coordinates u, v the u coordinate lines are orthogonal trajectories of a "field" of geodesics v = constant. The u coordinate lines are then geodesic parallels cutting off equal increments of u ≡ s on each geodesic v = constant, and

\[
ds^2 = du^2 + G(u, v)\,dv^2
\tag{17.3-43}
\]

In terms of geodesic normal coordinates u, v,

\[
K = -\frac{1}{\sqrt{G}}\frac{\partial^2\sqrt{G}}{\partial u^2}
= \frac{G_u^2 - 2GG_{uu}}{4G^2}
\tag{17.3-44}
\]

(b) In the special case of geodesic polar coordinates, the geodesics v = constant intersect at a point (origin, pole), and v is the angle (Sec. 17.3-3b) between the geodesic labeled by v and the geodesic v = 0. Each u coordinate line is a geodesic circle of radius u intersecting all v geodesics at right angles. The "circular arc" of "radius" u corresponding to an angular increment dv is

\[
\sqrt{G(u, v)}\,dv = \left[u - \tfrac{1}{6}K_0u^3 + o(u^3)\right]dv
\tag{17.3-45}
\]
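Formula (17.3-44) is convenient for numerical work, since it needs only √G. A sketch (assumed example: unit sphere in geodesic polar coordinates about a pole, where u is the arc distance from the pole and √G = sin u):

```python
import math

h = 1e-4

def sqrtG(u):
    # unit sphere in geodesic polar coordinates: sqrt(G) = sin u
    return math.sin(u)

def K(u):
    # K = -(1/sqrt(G)) d^2 sqrt(G)/du^2   (Eq. 17.3-44)
    d2 = (sqrtG(u + h) - 2*sqrtG(u) + sqrtG(u - h))/(h*h)
    return -d2/sqrtG(u)

K0 = K(0.9)
```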
where K₀ is the Gaussian curvature at the origin. The quantity (45) is less than, equal to, or greater than u dv if, respectively, K₀ > 0, K₀ = 0, or K₀ < 0. The circumference C₀(u) and the area A₀(u) of a small geodesic circle of radius u about the origin are related to the circumference 2πu and the area πu² of a plane circle of equal radius by

\[
K_0 = \frac{3}{\pi}\lim_{u\to 0}\frac{2\pi u - C_0(u)}{u^3}
= \frac{12}{\pi}\lim_{u\to 0}\frac{\pi u^2 - A_0(u)}{u^4}
\tag{17.3-46}
\]

(c) For any geodesic triangle on a surface of constant Gaussian curvature K, the excess of the sum of the angles A, B, C over π and the triangle area S_T are related by

\[
A + B + C - \pi = KS_T
\tag{17.3-47}
\]

The resulting surface geometry is Euclidean for K = 0, elliptic for K > 0, and hyperbolic for K < 0. Surfaces of equal constant Gaussian curvature are isometric (Minding's Theorem). EXAMPLES: On a sphere of radius R, K = 1/R² (see also Sec. B-6). A surface with constant K < 0 is obtained by rotation of the tractrix

\[
x = R\sin t
\qquad
y = R\left(\log_e\tan\frac{t}{2} + \cos t\right)
\]

(pseudosphere, Ref. 17.7).
17.3-14. The Gauss-Bonnet Theorem. Let K(u, v) be continuous on a simply connected surface region S whose boundary C consists of n regular arcs of geodesic curvature κ_g(u, v). Then the sum Θ of the n exterior angles of the boundary is related to the integral curvature ∬ K dA of the surface region S by

\[
\oint_C \kappa_g\,ds + \iint_S K\,dA = 2\pi - \Theta
\qquad\text{(Gauss-Bonnet theorem)}
\tag{17.3-48}
\]

The first integral vanishes if all the boundary segments are geodesics; Eq. (47) is a special case of Eq. (48).
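A quick numerical check of Eq. (17.3-48) (assumed example: the octant of the unit sphere bounded by three great-circle arcs meeting at right angles). The sides are geodesics, so ∮κ_g ds = 0; the three exterior angles are each π/2; and ∬K dA over the octant should equal 2π − 3π/2 = π/2:

```python
import math

n = 800
du = (math.pi/2)/n          # octant: 0 <= u <= pi/2, 0 <= v <= pi/2
K = 1.0                     # unit sphere

# integral of K dA with dA = sin u du dv; the integrand does not depend on v
int_K_dA = sum(K*math.sin((i + 0.5)*du)*du for i in range(n))*(math.pi/2)

kg_integral = 0.0           # boundary arcs are geodesics
theta = 3*(math.pi/2)       # sum of the three exterior angles
lhs = kg_integral + int_K_dA
rhs = 2*math.pi - theta
```

The same triangle also checks Eq. (17.3-47): A + B + C − π = 3(π/2) − π = π/2 = K S_T with S_T = π/2.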
17.4. CURVED SPACES
17.4-1. Introduction. The theory of Secs. 17.4-2 to 17.4-7 treats geometrical concepts like distance, angle, and curvature in terms of general curvilinear coordinates and extends these concepts to a class of multidimensional spaces, viz., the Riemann spaces introduced in Chap. 16. The tensor notation of Chap. 16 is employed.
17.4-2. Curves, Distances, and Directions in Riemann Space (see also Secs. 4.6-9, 5.4-4, and 6.2-3).* (a) The metric properties (see also Sec. 12.5-2) associated with an n-dimensional Riemann space of points (x¹, x², . . . , xⁿ) are specified by its metric-tensor components g_ik(x¹, x², . . . , xⁿ) (Sec. 16.7-1), which define scalar products, and thus magnitudes and directions of vectors, at each point (Sec. 16.8-1).

(b) A regular arc C is described by n parametric equations

\[
x^i = x^i(t)
\qquad (t_0 \le t \le t_1)
\tag{17.4-1}
\]

with unique continuous derivatives dxⁱ/dt which do not vanish simultaneously. The components dxⁱ = (dxⁱ/dt) dt represent a vector dr along (tangent to) the curve C at each of its points (x¹, x², . . . , xⁿ), and the element of distance between two neighboring points (x¹, x², . . . , xⁿ) and (x¹ + dx¹, x² + dx², . . . , xⁿ + dxⁿ) on C is defined as

\[
ds = \sqrt{\left|g_{ik}(x^1, x^2, \ldots, x^n)\,dx^i\,dx^k\right|}
= \sqrt{\left|g_{ik}(x^1, x^2, \ldots, x^n)\,\frac{dx^i}{dt}\frac{dx^k}{dt}\right|}\;dt
\tag{17.4-2}
\]

The sign of ds is chosen so that ds > 0 for dt > 0 (positive direction on the curve C). The arc length s along the curve C, measured from the curve point corresponding to t = t₀, is

\[
s = s(t) = \int_{t_0}^{t}\sqrt{\left|g_{ik}\,\frac{dx^i}{dt}\frac{dx^k}{dt}\right|}\,dt
\tag{17.4-3}
\]

The value of the integral (3) is independent of the particular parameter t used.

(c) The direction of the curve (1) at each of its points (x¹, x², . . . , xⁿ) is that of the vector dr; i.e., the angle γ between any vector a defined at (x¹, x², . . . , xⁿ) and the curve is given by cos γ = a · dr/(|a| |ds|) (Sec. 16.8-1). In particular, the angle γ between two regular arcs xⁱ = X₁ⁱ(t) and xⁱ = X₂ⁱ(t) is given by

\[
\cos\gamma = \frac{g_{ik}\,dX_1^i\,dX_2^k}{|ds_1|\,|ds_2|}
\tag{17.4-4}
\]

The points common to n − 1 of the n coordinate hypersurfaces xⁱ = constant (i = 1, 2, . . . , n) through a given point (x¹, x², . . . , xⁿ) lie on a coordinate line associated with the nth coordinate (see also Sec. 6.2-2). The cosine of the angle between the ith and the kth coordinate line through the point (x¹, x², . . . , xⁿ) equals g_ik/√(g_ii g_kk).

(d) The unit vector dr/ds (represented by dxⁱ/ds) is the unit tangent vector of C at each curve point (x¹, x², . . . , xⁿ). The first curvature vector d²r/ds², represented by (D/ds)(dxⁱ/ds), is perpendicular to the curve (principal-normal direction, see also Sec. 17.2-2b); the absolute value

\[
\left|\frac{d^2\mathbf{r}}{ds^2}\right|
= \sqrt{\left|g_{ik}\,\frac{D}{ds}\!\left(\frac{dx^i}{ds}\right)\frac{D}{ds}\!\left(\frac{dx^k}{ds}\right)\right|}
\tag{17.4-5}
\]

is the absolute geodesic curvature (absolute first curvature) of C at (x¹, x², . . . , xⁿ) (see also Sec. 17.3-7).

* Equations (4) to (6) apply directly whenever ds ≠ 0. The case of null directions (dr ≠ 0, ds = |dr| = 0) in Riemann spaces with indefinite metric is briefly discussed in Sec. 17.4-4.
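Definition (17.4-3) can be evaluated numerically for any metric. A sketch (assumed example: the Euclidean plane in polar coordinates, g₁₁ = 1, g₂₂ = r², g₁₂ = 0; the curve r = a, θ = t, 0 ≤ t ≤ 2π traverses a circle, so s = 2πa):

```python
import math

a = 1.5

def metric(x):
    # plane polar coordinates x = (r, theta): ds^2 = dr^2 + r^2 dtheta^2
    return ((1.0, 0.0), (0.0, x[0]**2))

def curve(t):      # circle of radius a
    return (a, t)

def velocity(t):   # dx^i/dt for the circle
    return (0.0, 1.0)

def arc_length(t0, t1, n=1000):
    # midpoint-rule evaluation of Eq. (17.4-3)
    dt = (t1 - t0)/n
    s = 0.0
    for i in range(n):
        t = t0 + (i + 0.5)*dt
        g, dx = metric(curve(t)), velocity(t)
        s2 = sum(g[j][k]*dx[j]*dx[k] for j in range(2) for k in range(2))
        s += math.sqrt(abs(s2))*dt
    return s

s = arc_length(0.0, 2*math.pi)
```

Reparametrizing the curve (e.g., θ = t²) changes the integrand but, as the text states, not the value of the integral.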
17.4-3. Geodesics (see also Sec. 17.3-12).* (a) A geodesic in a Riemann space is a regular arc whose geodesic curvature is identically zero, so that the unit tangent vector dr/ds is constant along the curve (parallel to itself, Sec. 16.10-9), i.e.,

\[
\frac{D}{ds}\!\left(\frac{dx^i}{ds}\right)
= \frac{d^2x^i}{ds^2} + \begin{Bmatrix}i\\j\,k\end{Bmatrix}\frac{dx^j}{ds}\frac{dx^k}{ds} = 0
\tag{17.4-6}
\]

The n second-order differential equations (6) define a unique geodesic xⁱ = xⁱ(s) through any given point [xⁱ = xⁱ(s₁)] in any given direction (given values of the dxⁱ/ds for s = s₁). More generally, the differential equations

\[
\frac{d^2x^i}{dt^2} + \begin{Bmatrix}i\\j\,k\end{Bmatrix}\frac{dx^j}{dt}\frac{dx^k}{dt}
= \varphi(t)\,\frac{dx^i}{dt}
\tag{17.4-7}
\]

(with suitable initial conditions on the xⁱ and dxⁱ/dt) define a geodesic xⁱ = xⁱ(t); the choice of the function φ(t) amounts to a choice of the parameter t used to describe the geodesic but does not affect the curve as such.
(b) Geodesics have many of the properties of the straight lines in Euclidean geometry (see also Sec. 17.4-6). If there exists a curve of
* See footnote to Sec. 17.4-2.
smallest or greatest arc length (3) joining two given points of a Riemann space, then that curve is a geodesic. More generally, the differential equations (6) or (7) of a geodesic may be regarded as Euler equations (Secs. 11.6-1 and 11.6-2) which ensure that the first variation of the arc length (3) along a curve joining the given points is equal to zero. The actual existence of a geodesic through two given points of a Riemann space requires a separate investigation in each case.

17.4-4. Riemann Spaces with Indefinite Metric. Null Directions and Null Geodesics. If the fundamental quadratic form g_ik(x¹, x², . . . , xⁿ) dxⁱ dxᵏ of a Riemann space is indefinite (Sec. 13.5-2) at a point (x¹, x², . . . , xⁿ), the square |a|² = g_ik aⁱaᵏ of a vector a may be positive, negative, or zero, and |a| = 0 does not necessarily imply a = 0. At any point (x¹, x², . . . , xⁿ) the direction of a vector a ≠ 0 such that |a|² = g_ik aⁱaᵏ = 0 is a null direction. For a vector displacement dr ≠ 0 in a null direction, ds = |dr| = 0; note that the points (x¹, x², . . . , xⁿ) and (x¹ + dx¹, x² + dx², . . . , xⁿ + dxⁿ) separated by such a null displacement dr are not identical. A curve xⁱ = xⁱ(t) such that

\[
g_{ik}\,\frac{dx^i}{dt}\frac{dx^k}{dt} = 0
\tag{17.4-9}
\]

has a null direction at each of its points (curve of zero arc length, null curve, minimal curve, see also Sec. 17.3-10b). A curve xⁱ = xⁱ(t) satisfying Eq. (7) as well as Eq. (9) is a null geodesic (geodesic null line). Every null direction defines a unique null geodesic through a given point (x¹, x², . . . , xⁿ) (application: light paths in relativity theory).
17.4-5. Specification of Space Curvature. (a) The Riemann-Christoffel curvature tensor of a given Riemann space is the absolute tensor of rank 4 defined by the mixed components

\[
R^i_{\;jkh}
= \frac{\partial}{\partial x^k}\begin{Bmatrix}i\\j\,h\end{Bmatrix}
- \frac{\partial}{\partial x^h}\begin{Bmatrix}i\\j\,k\end{Bmatrix}
+ \begin{Bmatrix}r\\j\,h\end{Bmatrix}\begin{Bmatrix}i\\r\,k\end{Bmatrix}
- \begin{Bmatrix}r\\j\,k\end{Bmatrix}\begin{Bmatrix}i\\r\,h\end{Bmatrix}
\tag{17.4-10}
\]

or by the covariant components

\[
R_{ijkh}
= \frac{1}{2}\left(
\frac{\partial^2 g_{ih}}{\partial x^j\,\partial x^k}
+ \frac{\partial^2 g_{jk}}{\partial x^i\,\partial x^h}
- \frac{\partial^2 g_{ik}}{\partial x^j\,\partial x^h}
- \frac{\partial^2 g_{jh}}{\partial x^i\,\partial x^k}
\right)
+ g^{rs}\{[jk; s][ih; r] - [jh; s][ik; r]\}
\tag{17.4-11}
\]

The components of the curvature tensor satisfy the following relations:

\[
R^i_{\;jkh} = -R^i_{\;jhk}
\qquad
R_{ijkh} = g_{ir}R^r_{\;jkh}
\qquad
R_{ijkh} = R_{khij} = -R_{jikh} = -R_{ijhk}
\tag{17.4-12}
\]
\[
R^i_{\;jkh} + R^i_{\;khj} + R^i_{\;hjk} = 0
\qquad
R_{ijkh} + R_{ikhj} + R_{ihjk} = 0
\tag{17.4-13}
\]

In an n-dimensional Riemann space there exist at most n²(n² − 1)/12 distinct nonvanishing covariant components R_ijkh. Note also

\[
R^i_{\;jkh,r} + R^i_{\;jhr,k} + R^i_{\;jrk,h} = 0
\qquad
R_{ijkh,r} + R_{ijhr,k} + R_{ijrk,h} = 0
\qquad\text{(Bianchi identities)}
\tag{17.4-14}
\]
(b) The Ricci tensor of a Riemann space is the absolute tensor of rank 2 defined by the covariant components

\[
R_{ij} = R_{ji} = R^k_{\;ijk}
= \frac{\partial^2}{\partial x^i\,\partial x^j}\log_e\sqrt{g}
- \frac{\partial}{\partial x^k}\begin{Bmatrix}k\\i\,j\end{Bmatrix}
- \begin{Bmatrix}k\\i\,j\end{Bmatrix}\frac{\partial}{\partial x^k}\log_e\sqrt{g}
+ \begin{Bmatrix}h\\i\,k\end{Bmatrix}\begin{Bmatrix}k\\h\,j\end{Bmatrix}
\tag{17.4-15}
\]

or by the mixed components R^i_j = g^{ik}R_{kj}. In an n-dimensional Riemann space there exist at most n(n + 1)/2 distinct nonvanishing covariant components R_ij. The eigenvector directions (Sec. 14.8-3) of the Ricci tensor are the Ricci principal directions at each point (x¹, x², . . . , xⁿ) of the Riemann space.
(c) The curvature invariant or scalar curvature of a Riemann space is the absolute scalar invariant

\[
R \equiv R^i_{\;i} \equiv g^{ij}R_{ij}
\tag{17.4-16}
\]

(d) The Einstein tensor of a Riemann space is the absolute tensor of rank 2 defined by the components

\[
G^i_j \equiv R^i_j - \tfrac{1}{2}R\,\delta^i_j
\qquad\text{or}\qquad
G_{ij} = g_{ik}G^k_j = R_{ij} - \tfrac{1}{2}R\,g_{ij}
\tag{17.4-17}
\]

The divergence (Sec. 16.10-7) G^j_{i,j} of the Einstein tensor vanishes identically.

(e) Refer to Sec. 17.3-7 for the special case of a two-dimensional Riemann space (curved surface in three-dimensional Euclidean space).
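The definitions above can be exercised on the two-dimensional example of Sec. 17.3-7. For the unit sphere (metric diag(1, sin²u), an assumed example) the nonzero Christoffel symbols are {1; 2 2} = −sin u cos u and {2; 1 2} = {2; 2 1} = cot u; with the index convention of Eq. (17.4-10) as written here, one should find R¹₂₁₂ = R₁₂₁₂ = sin²u (since g₁₁ = 1) and hence K = R₁₂₁₂/det g = 1. A sketch with finite-difference derivatives:

```python
import math

h = 1e-5

def gamma(i, j, k, x):
    # Christoffel symbols {i; j k} of the unit sphere, x = (u, v); 0-based indices
    u = x[0]
    if i == 0 and j == 1 and k == 1:
        return -math.sin(u)*math.cos(u)    # {1; 2 2}
    if i == 1 and j + k == 1:              # {2; 1 2} = {2; 2 1}
        return math.cos(u)/math.sin(u)
    return 0.0

def dgamma(i, j, k, x, wrt):
    xp, xm = list(x), list(x)
    xp[wrt] += h
    xm[wrt] -= h
    return (gamma(i, j, k, xp) - gamma(i, j, k, xm))/(2*h)

def riemann(i, j, k, l, x):
    # R^i_jkl = d_k {i; j l} - d_l {i; j k} + sum_r ({r; j l}{i; r k} - {r; j k}{i; r l})
    val = dgamma(i, j, l, x, k) - dgamma(i, j, k, x, l)
    for rr in range(2):
        val += (gamma(rr, j, l, x)*gamma(i, rr, k, x)
                - gamma(rr, j, k, x)*gamma(i, rr, l, x))
    return val

x0 = (0.7, 0.3)
R1212 = riemann(0, 1, 0, 1, x0)      # equals covariant R_1212 here since g_11 = 1
K = R1212/math.sin(x0[0])**2         # K = R_1212 / det g
```

(Other sign conventions for the curvature tensor occur in the literature; the check is meaningful only for the convention stated in the comment.)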
17.4-6. Manifestations of Space Curvature. Flat Spaces and Euclidean Spaces. (a) Parallel Propagation of a Vector. Parallel propagation of a vector a (Daⁱ = 0, Sec. 16.10-9) around the infinitesimal closed circuit (x¹, x², . . . , xⁿ) → (x¹ + dx¹, x² + dx², . . . , xⁿ + dxⁿ) → (x¹ + dx¹ + dξ¹, x² + dx² + dξ², . . . , xⁿ + dxⁿ + dξⁿ) → (x¹ + dξ¹, x² + dξ², . . . , xⁿ + dξⁿ) → (x¹, x², . . . , xⁿ) changes each component aⁱ of a by

\[
\delta a^i = -R^i_{\;jkh}\,a^j\,dx^k\,d\xi^h
\tag{17.4-18}
\]

[see also Eq. (16.10-25)].

(b) Geodesic Parallels. Geodesic Deviation. If a one-parameter family of geodesics xⁱ = xⁱ(s, λ) has orthogonal trajectories,* the latter are geodesic parallels cutting off equal increments of the arc length s on each geodesic of the given family (see also Secs. 17.3-12 and 17.4-7). Adjacent geodesics xⁱ = xⁱ(s, λ) and xⁱ = xⁱ(s, λ + dλ) are then related by

\[
x^i(s, \lambda + d\lambda) = x^i(s, \lambda) + \eta^i(s, \lambda)\,d\lambda
\]

The geodesic-deviation vector represented by the ηⁱ(s, λ) is normal to the geodesic xⁱ(s, λ), or g_ik ηⁱpᵏ = 0, where the pᵏ are the unit-tangent-vector components of the geodesic xⁱ = xⁱ(s, λ). Along any one of the given geodesics (λ = constant), the ηⁱ change with s so that

\[
\frac{D^2\eta^i}{Ds^2} = -R^i_{\;jkh}\,\eta^k\,p^j\,p^h
\qquad\text{(equation of geodesic deviation)}
\tag{17.4-19}
\]

* Note that the notation xⁱ = xⁱ(s, λ) excludes null geodesics.

(c) Flat Spaces. A Riemann space is flat

1. If and only if all components R^i_{jkh} or R_ijkh of the curvature tensor vanish identically, so that successive covariant differentiations commute [Eq. (16.10-25)], and parallel propagation around infinitesimal circuits leaves all tensor components constant (Secs. 16.10-9 and 17.4-6a)

2. If and only if the Riemann space admits a system of rectangular cartesian coordinates x̄¹, x̄², . . . , x̄ⁿ such that, at every point of the space,

\[
ds^2 = \epsilon_1(d\bar{x}^1)^2 + \epsilon_2(d\bar{x}^2)^2 + \cdots + \epsilon_n(d\bar{x}^n)^2
\tag{17.4-20}
\]

where each εᵢ is constant and equals either +1 or −1. In every scheme of measurements defined by a cartesian coordinate system all Christoffel symbols (Sec. 16.10-3) vanish identically; covariant differentiation reduces to ordinary differentiation, and every geodesic or geodesic null line can be described by linear parametric equations x̄ⁱ = aⁱt + bⁱ.

Equation (20) may be formally simplified by the introduction of homogeneous coordinates ξⁱ = √εᵢ x̄ⁱ; ξⁱ is imaginary if εᵢ = −1 (this convention is used in relativity theory, Ref. 17.4).
(d) A Euclidean space is a flat Riemann space having a positive-definite metric, so that all εᵢ in Eq. (20) are equal to +1 (Euclidean geometry, see also Chaps. 2 and 3). Note that the topology (Sec. 12.5-1) of a Euclidean space may differ from the "usual" topology employed in elementary geometry (EXAMPLES: surfaces of cylinders and cones).

17.4-7. Special Coordinate Systems. Because of the invariance of tensor equations (Secs. 16.1-4, 16.4-1, and 16.10-7b), it is often permissible to simplify mathematical arguments by using one of the following special coordinate systems.

(a) Not every Riemann space admits orthogonal coordinates (g_ik = 0 for i ≠ k, Sec. 16.8-2), but it is possible to choose one of the coordinates, say xⁿ, so that its coordinate lines are normal to all others, or

\[
ds^2 = \sum_{i=1}^{n-1}\sum_{k=1}^{n-1} g_{ik}\,dx^i\,dx^k + g_{nn}(dx^n)^2
\tag{17.4-21}
\]

at every point (x¹, x², . . . , xⁿ). It is always possible to choose such a coordinate system so that either g_nn = 1 or g_nn = −1; xⁿ is, then, measured by the arc length s along its coordinate lines, and the latter are geodesics normal to every hypersurface xⁿ = constant (geodesic normal coordinates, see also Sec. 17.3-13).

(b) Every Riemann space admits local rectangular cartesian coordinates ξ¹, ξ², . . . , ξⁿ such that the metric is given by Eq. (20) at any one given point (ξ¹, ξ², . . . , ξⁿ). Hence every Riemann space is "locally flat"; i.e., every sufficiently small portion of the space is flat (Euclidean if the metric is positive definite).

(c) Riemann coordinates with origin O are defined as xⁱ = spⁱ, where the pⁱ are unit-tangent-vector components at O of the geodesic joining O and the point (x¹, x², . . . , xⁿ), and s is the geodesic distance between these points. Every Riemann space admits Riemann coordinates for any given origin O; note that all Christoffel symbols vanish at O.
17.5. RELATED TOPICS, REFERENCES, AND BIBLIOGRAPHY
17.5-1. Related Topics. The following topics related to the study of differential geometry are treated in other chapters of this handbook:

Plane analytic geometry ......... Chap. 2
Solid analytic geometry ......... Chap. 3
Vector analysis ......... Chap. 5
Curvilinear coordinates ......... Chap. 6
Tensor analysis, Riemann spaces ......... Chap. 16
17.5-2. References and Bibliography.
17.1. Blaschke, W., and H. Reichardt: Vorlesungen über Differentialgeometrie, 2d ed., Springer, Berlin, 1960.
17.2. Eisenhart, L. P.: An Introduction to Differential Geometry, Princeton University Press, Princeton, N.J., 1947.
17.3. ———: Riemannian Geometry, Princeton University Press, Princeton, N.J., 1949.
17.4. Guggenheimer, H. W.: Differential Geometry, McGraw-Hill, New York, 1963.
17.5. Kreyszig, E.: Differential Geometry, 2d ed., University of Toronto Press, Toronto, Canada, 1963.
17.6. Rainich, G. Y.: Mathematics of Relativity, Wiley, New York, 1950.
17.7. Sokolnikoff, I. S.: Tensor Analysis, 2d ed., Wiley, New York, 1964.
17.8. Spain, B.: Tensor Calculus, Oliver & Boyd, London, 1953.
17.9. Struik, D. J.: Lectures on Classical Differential Geometry, 2d ed., Addison-Wesley, Reading, Mass., 1961.
17.10. Synge, J. L., and A. Schild: Tensor Calculus, University of Toronto Press, Toronto, Canada, 1949.
17.11. Willmore, T. J.: Introduction to Differential Geometry, Oxford, Fair Lawn, N.J., 1959.
(See also the article by H. Tietz in vol. II of the Handbuch der Physik, Springer, Berlin, 1955.)
CHAPTER 18

PROBABILITY THEORY AND RANDOM PROCESSES

18.1. Introduction
18.1-1. Introductory Remarks

18.2. Definition and Representation of Probability Models
18.2-1. Algebra of Events Associated with a Given Experiment
18.2-2. Mathematical Definition of Probabilities. Conditional Probabilities
18.2-3. Statistical Independence
18.2-4. Compound Experiments, Independent Experiments, and Independent Repeated Trials
18.2-5. Combination Rules
18.2-6. Bayes's Theorem
18.2-7. Representation of Events as Sets in a Sample Space
18.2-8. Random Variables
18.2-9. Representation of Probability Models in Terms of Numerical Random Variables and Distribution Functions

18.3. One-dimensional Probability Distributions
18.3-1. Discrete One-dimensional Probability Distributions
18.3-2. Continuous One-dimensional Probability Distributions
18.3-3. Expected Values and Variance. Characteristic Parameters of One-dimensional Probability Distributions
18.3-4. Normalization
18.3-5. Chebyshev's Inequality and Related Formulas
18.3-6. Improved Description of Probability Distributions: Use of Stieltjes Integrals
18.3-7. Moments of a One-dimensional Probability Distribution
18.3-8. Characteristic Functions and Generating Functions
18.3-9. Semi-invariants
18.3-10. Computation of Moments and Semi-invariants from χ_x(q), M_x(s), and γ_x(s). Relations between Moments and Semi-invariants

18.4. Multidimensional Probability Distributions
18.4-1. Joint Distributions
18.4-2. Two-dimensional Probability Distributions. Marginal Distributions
18.4-3. Discrete and Continuous Two-dimensional Probability Distributions
18.4-4. Expected Values, Moments, Covariance, and Correlation Coefficient
18.4-5. Conditional Probability Distributions Involving Two Random Variables
18.4-6. Regression
18.4-7. n-dimensional Probability Distributions
18.4-8. Expected Values and Moments
18.4-9. Regression. Multiple and Partial Correlation Coefficients
18.4-10. Characteristic Functions
18.4-11. Statistically Independent Random Variables
18.4-12. Entropy of a Probability Distribution, and Related Topics

18.5. Functions of Random Variables. Change of Variables
18.5-1. Introduction
18.5-2. Functions (or Transformations) of a One-dimensional Random Variable
18.5-3. Linear Functions (or Linear Transformations) of a One-dimensional Random Variable
18.5-4. Functions and Transformations of Multidimensional Random Variables
18.5-5. Linear Transformations
18.5-6. Mean and Variance of a Sum of Random Variables
18.5-7. Sums of Statistically Independent Random Variables
18.5-8. Compound Distributions

18.6. Convergence in Probability and Limit Theorems
18.6-1. Sequences of Probability Distributions. Convergence in Probability
18.6-2. Limits of Distribution Functions, Characteristic Functions, and Generating Functions. Continuity Theorems
18.6-3. Convergence in Mean
18.6-4. Asymptotically Normal Probability Distributions
18.6-5. Limit Theorems

18.7. Special Techniques for Solving Probability Problems
18.7-1. Introduction
18.7-2. Problems Involving Discrete Probability Distributions: Counting of Simple Events and Combinatorial Analysis
18.7-3. Problems Involving Discrete Probability Distributions: Successes and Failures in Component Experiments

18.8. Special Probability Distributions
18.8-6. Two-dimensional Normal Distributions
18.8-7. Circular Normal Distributions
18.8-8. n-dimensional Normal Distributions
18.8-9. Addition Theorems for Special Probability Distributions

18.9. Mathematical Description of Random Processes
18.9-1. Random Processes
18.9-2. Mathematical Description of Random Processes
18.9-3. Ensemble Averages: (a) General Definitions; (b) Ensemble Correlation Functions and Mean Squares; (c) Characteristic Functions; (d) Ensemble Averages of Integrals and Derivatives
18.9-4. Processes Defined by Random Parameters
18.9-5. Orthogonal-function Expansions

18.10. Stationary Random Processes. Correlation Functions and Spectral Densities
18.10-1. Stationary Random Processes
18.10-2. Ensemble Correlation Functions
18.10-3. Ensemble Spectral Densities
18.10-4. Correlation Functions and Spectra of Real Processes
18.10-5. Spectral Decomposition of Mean "Power" for Real Processes
Distribu
tions
18.8-1. Discrete One-dimensional
Probability Distributions 18.8-2. Discrete Multidimensional
Probability Distributions 18.8-3. Continuous Probability Dis tributions: the Normal (Gauss ian) Distribution 18.8-4. Normal Random Variables: Distribution of Deviations from the Mean
18.8-5. Miscellaneous Continuous One-
18.10-6. Some Alternative Ensemble
Spectral Densities 18.10-7. t Averages and
Ergodic
Processes
(a) t Averages
(b) Ergodic Processes 18.10-8. Non-ensemble Correlation
Functions and Spectral Den sities
18.10-9. Functions with Periodic Components 18.10-10. Generalized Fourier Trans
forms and Integrated Spectra
587
INTRODUCTION
18.11. Special Classes of Random Processes. Examples
18.1-1
18.11-6. Random Processes Generated
by Periodic Sampling
18.11-1. Processes with Constant and
Periodic Sample Functions (a) Constant Sample Functions (b) Random-phase Sine Waves (c) More General Periodic Processes 18.11-2. Band-limited Functions and
Processes.
Sampling
Theorems 18.11-3. Gaussian Random Processes 18.11-4. Markov
Processes
and the
Poisson Process
(a) (b) (c) (d)
Random Processes of Order n
Purely Random Processes
18.12. Operations on Random Processes
18.12-1. Correlation
Spectra 18.12-5. Nonlinear Operations 18.12-6. Nonlinear Operations on Gaussian Processes
(a) Price's Theorem (6) Series Expansion
18.11-5. Some Random Processes Gen
Theorem
and
18.12-4. Relations for t Correlation Functions and Non-ensemble
Markov Processes The Poisson Process
erated by a Poisson Process (a) Random Telegraph Wave (b) Process Generated by Poisson Sampling (c) Impulse Noise and Campbell's
Functions
Spectra of Sums 18.12-2. Input-Output Relations for Linear Systems 18.12-3. The Stationary Case
18.13. Related Topics, References, and Bibliography 18.13-1. Related Topics 18.13-2. References and Bibliography
18.1. INTRODUCTION
18.1-1. Mathematical probabilities are values of a real numerical function defined on a class of idealized events, which represent results of an experiment or observation. Mathematical probabilities are not defined directly in terms of "likelihood" or relative frequency of occurrence; they are introduced by a set of defining postulates (Sec. 18.2-2; see also Sec. 12.1-1) which abstract essential properties of statistical relative frequencies (Sec. 19.2-1). The concept of probability can, then, often be related to reality by the assumption that, in practically every sequence of independently repeated experiments, the relative frequency of each event tends to a limit represented by the corresponding probability (Sec. 19.2-1).* Theories based on the probability concept may, however, be useful even if they are not subject to direct statistical interpretation.
Probability theory deals with the definition and description of models involving the probability concept. The theory is especially concerned with methods for calculating the probability of an event from the known or postulated probabilities of other events which are logically related to the first event. Most applications of probability theory may be interpreted as special cases of random processes (Secs. 18.8-1 to 18.11-5).
* Whenever this proposition is justified, it must be regarded as a law of nature; it should not be confused with mathematical theorems like Bernoulli's theorem or the mathematical law of large numbers (Sec. 18.6-5).
18.2. DEFINITION AND REPRESENTATION OF PROBABILITY MODELS
18.2-1. Algebra of Events Associated with a Given Experiment. Each probability model describes a specific idealized experiment or observation having a class ℰ of theoretically possible results (events, states) E permitting the following definitions.
1. The union (logical sum) E1 ∪ E2 ∪ ⋯ (or E1 + E2 + ⋯) of a countable (finite or infinite) set of events E1, E2, . . . is the event of realizing at least one of the events E1, E2, . . . .
2. The intersection (logical product) E1 ∩ E2 (or E1E2) of two events E1 and E2 is the joint event of realizing both E1 and E2.
3. The (logical) complement Ē of an event E is the event of not realizing E ("opposite" or complementary event of E).
4. I is the certain event of realizing at least one of the events of ℰ.
5. O is the impossible event of realizing none of the events of ℰ.
In each case, the class ℰ of events, comprising I and O, is to constitute a completely additive Boolean algebra (algebra of events associated with the given experiment or observation) having all the properties outlined in Secs. 12.8-1 and 12.8-4.
Either E1 ∪ E2 = E1 or E1 ∩ E2 = E2 implies the logical inclusion relation E2 ⊂ E1 (E2 implies E1); note O ⊂ E ⊂ I. E1 and E2 are mutually exclusive (disjoint) if and only if E1 ∩ E2 = O. The set ℰ1 of joint events E ∩ E1 is the algebra of events associated with the given experiment under the hypothesis that E1 occurs; E1 ∩ E1 = E1 is the certain event in ℰ1 (see also Sec. 12.8-3).
18.2-2. Mathematical Definition of Probabilities. Conditional Probabilities. It is possible to assign a (mathematical) probability P[E] (probability of E, probability of realizing the event E) to each event E of the class ℰ (event algebra, Sec. 18.2-1) associated with a given experiment if and only if one can define a single-valued real function P[E] on ℰ so that
1. P[E] ≥ 0 for every event E of ℰ
2. P[E] = 1 if the event E is certain
3. P[E1 ∪ E2 ∪ ⋯] = P[E1] + P[E2] + ⋯ for every countable (finite or infinite) set of mutually exclusive events E1, E2, . . .
Postulates 1 to 3 imply 0 ≤ P[E] ≤ 1; in particular, P[E] = 0 if E is an impossible event. Note carefully that P[E] = 1 or P[E] = 0 does not necessarily imply that E is, respectively, certain or impossible.
A fourth defining postulate relates the "absolute" probabilities P[E] associated with the given experiment to the "conditional" probabilities P[E|E1] referring to a "simpler" experiment restricted by the hypothesis that E1 occurs. The conditional probability P[E|E1] of E on (relative to) the hypothesis that the event E1 occurs is defined by the postulate
4. The probability of the joint event E ∩ E1 is

    P[E ∩ E1] = P[E1]P[E|E1]   (multiplication law, LAW OF COMPOUND PROBABILITIES)

P[E|E1] is not defined if P[E1] = 0. In the context of the restricted experiment, the quantities P[E|E1] are ordinary probabilities associated with the joint events E ∩ E1 constituting the event algebra ℰ1 of the restricted experiment (Sec. 18.2-1). In practice, every probability can be interpreted as a conditional probability relative to some hypothesis implied by the experiment under consideration.
18.2-3. Statistical Independence. Two events E1 and E2 are statistically independent (stochastically independent) if and only if

    P[E1 ∩ E2] = P[E1]P[E2]   (18.2-1)

so that P[E1|E2] = P[E1] if P[E2] ≠ 0, and P[E2|E1] = P[E2] if P[E1] ≠ 0. N events E1, E2, . . . , EN are statistically independent if and only if not only each pair of events Ei, Ek but also each pair of possible joint events is statistically independent:

    P[Ei ∩ Ej] = P[Ei]P[Ej]   (1 ≤ i < j ≤ N)
    P[Ei ∩ Ej ∩ Ek] = P[Ei]P[Ej]P[Ek]   (1 ≤ i < j < k ≤ N)
    . . . . . . . . . .
    P[E1 ∩ E2 ∩ ⋯ ∩ EN] = P[E1]P[E2] ⋯ P[EN]   (18.2-2)

18.2-4. Compound Experiments, Independent Experiments, and Independent Repeated Trials. Frequently an experiment appears as a combination of component experiments (see also Secs. 18.7-3 and 18.8-1). Let E′, E″, E‴, . . . denote any result associated, respectively, with the first, second, third, . . . component experiment. The results of the compound experiment can be described as joint events E = E′ ∩ E″ ∩ E‴ ∩ ⋯ ; their probabilities will, in general, depend on the nature and interaction of all component experiments. The probability P[E′] of realizing the component result E′ in the course of a given compound experiment is, in general, different from the probability associated with E′ in an independently performed component experiment.
Two or more component experiments of a given compound experiment are independent if and only if their respective results E′, E″, E‴, . . . obtained in the course of the compound experiment are statistically independent, i.e.,

    P[E′ ∩ E″] = P[E′]P[E″]   P[E′ ∩ E″ ∩ E‴] = P[E′]P[E″]P[E‴]   . . .

for all E′, E″, E‴, . . . (Sec. 18.2-3). If a component experiment is independent of
all others, the probability of realizing each of its results in the course of the given compound experiment is equal to the corresponding probability for the independently performed component experiment.
Repeated independent trials are independent experiments each having the same set of possible results E and the same set of associated probabilities P[E]. The probability of obtaining the sequence of results E1, E2, . . . , En in the compound experiment corresponding to a sequence of n repeated independent trials is

    P[E1, E2, . . . , En] = P[E1]P[E2] ⋯ P[En]   (18.2-3)
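As a quick numerical illustration (a Python sketch added to this text, not part of the handbook; the trial probabilities are invented for the example), Eq. (18.2-3) is simply a product of per-trial probabilities:

```python
# Probability of a specific result sequence under n repeated independent
# trials, Eq. (18.2-3): P[E1, E2, ..., En] = P[E1] P[E2] ... P[En].
from functools import reduce

def sequence_probability(probs):
    """Multiply the per-trial probabilities of a result sequence."""
    return reduce(lambda a, b: a * b, probs, 1.0)

# Three throws of a fair die, each required to show a six:
p = sequence_probability([1/6, 1/6, 1/6])
```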
18.2-5. Combination Rules (see also Secs. 18.7-1 to 18.7-3). Each of the theorems in Table 18.2-1 expresses the probability of an event in terms of the (possibly already known) probabilities of other events logically related to the first event. More generally, the probability P≥m of realizing at least m, and the probability Pm of realizing exactly m, of N (not necessarily statistically independent) events E1, E2, . . . , EN is, respectively,

    P≥m = Σ_{j=m}^{N} (−1)^{j−m} C(j − 1, m − 1) Sj
    Pm = Σ_{j=m}^{N} (−1)^{j−m} C(j, m) Sj   (m = 1, 2, . . . , N)   (18.2-4)

where C(n, k) denotes the binomial coefficient, and

    S1 = Σ_i P[Ei]   S2 = Σ Σ_{i<k} P[Ei ∩ Ek]   . . .   SN = P[E1 ∩ E2 ∩ ⋯ ∩ EN]   (18.2-5)

Note that

    Sj = Σ_{k=j}^{N} C(k − 1, j − 1) P≥k = Σ_{k=j}^{N} C(k, j) Pk   (j = 1, 2, . . . , N)   (18.2-6)

If E1, E2, . . . , EN are statistically independent, the quantities (5) reduce to the symmetric functions (1.4-9) of the P[Ei] (Table 18.2-1b).
EXAMPLES: If the probability of throwing a given face of a die is 1/6, then
The probability of throwing either 1 or 6 is 1/6 + 1/6 = 1/3.
The probability of not throwing 6 is 1 − 1/6 = 5/6.
The probability of throwing 6 at least once in two throws is 1/6 + 1/6 − 1/36 = 11/36.
The probability of throwing 6 exactly once in two throws is 1/3 − 1/18 = 5/18.
The probability of throwing 6 twice in two throws is 1/36; etc.
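The dice examples above can be checked against Eq. (18.2-4) numerically. The following Python sketch (an addition to this text) computes the binomial moments Sj for statistically independent events, where they reduce to elementary symmetric functions of the P[Ei]:

```python
# Numerical check of Eq. (18.2-4): probabilities of "at least m" and
# "exactly m" of N events, from the binomial moments S_j of Eq. (18.2-5).
# For independent events, S_j is the j-th elementary symmetric function
# of the individual probabilities.
from itertools import combinations
from math import comb, prod

def binomial_moments(p_list):
    """S_j = sum over all j-subsets of the product of their probabilities."""
    n = len(p_list)
    return [sum(prod(c) for c in combinations(p_list, j)) for j in range(n + 1)]

def prob_at_least(m, p_list):
    S = binomial_moments(p_list)
    n = len(p_list)
    return sum((-1) ** (j - m) * comb(j - 1, m - 1) * S[j] for j in range(m, n + 1))

def prob_exactly(m, p_list):
    S = binomial_moments(p_list)
    n = len(p_list)
    return sum((-1) ** (j - m) * comb(j, m) * S[j] for j in range(m, n + 1))

dice = [1/6, 1/6]   # "6 on first throw", "6 on second throw"
# prob_at_least(1, dice) reproduces 11/36; prob_exactly(1, dice) gives 5/18.
```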
18.2-6. Bayes's Theorem (see also Sec. 18.4-5b). Let H1, H2, . . . be a set of mutually exclusive events such that H1 ∪ H2 ∪ ⋯ = I. Then, for each pair of events Hi, E,

    P[Hi|E] = P[Hi]P[E|Hi] / Σ_i P[Hi]P[E|Hi]   (Bayes's theorem)   (18.2-7)

Equation (7) can be used to relate the "a priori" probability P[Hi] of a hypothetical cause Hi of the event E to the "a posteriori" probability P[Hi|E] if (and only if) the Hi are "random" events permitting the definition of probabilities P[Hi].
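Equation (18.2-7) translates directly into a computation. In the following Python sketch (added here; the prior and conditional probabilities are illustrative values, not from the text) the hypotheses Hi are mutually exclusive and exhaustive:

```python
# Bayes's theorem, Eq. (18.2-7): posterior P[H_i | E] from priors P[H_i]
# and likelihoods P[E | H_i], for mutually exclusive, exhaustive H_i.
def bayes_posterior(priors, likelihoods):
    evidence = sum(p * l for p, l in zip(priors, likelihoods))  # P[E]
    return [p * l / evidence for p, l in zip(priors, likelihoods)]

post = bayes_posterior(priors=[0.5, 0.5], likelihoods=[0.9, 0.3])
# post[0] = 0.45 / 0.60 = 0.75
```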
Table 18.2-1. Probabilities of Logically Related Events

(a) Probability of not realizing the event E:
        P[Ē] = 1 − P[E]
    Probability of realizing at least one of two events E1 and E2 (E1 or E2 or both):
        P[E1 ∪ E2] = P[E1] + P[E2] − P[E1 ∩ E2]
    Probability of realizing all of N events E1, E2, . . . , EN:
        P[E1 ∩ E2 ∩ ⋯ ∩ EN] = P[E1]P[E2|E1]P[E3|E1 ∩ E2] ⋯ P[EN|E1 ∩ E2 ∩ ⋯ ∩ EN−1]

(b) Probability of realizing at least one of N statistically independent events E1, E2, . . . , EN:
        P[E1 ∪ E2 ∪ ⋯ ∪ EN] = 1 − {1 − P[E1]}{1 − P[E2]} ⋯ {1 − P[EN]}
    Probability of realizing all of N statistically independent events E1, E2, . . . , EN:
        P[E1 ∩ E2 ∩ ⋯ ∩ EN] = P[E1]P[E2] ⋯ P[EN]
18.2-7. Representation of Events as Sets in a Sample Space. Every class ℰ of events E permitting the definition of probabilities P[E] can be described in terms of a set S of mutually exclusive events ξ ≠ O such that each event E is the union of a corresponding subset of S. S is called a sample space or fundamental probability set associated with the given experiment; each set of sample points (simple events, elementary events, phases) ξ of S corresponds to an event E. In particular, S itself corresponds to a certain event, and an empty subset of S corresponds to an impossible event.
The probabilities P[E] can then be regarded as values of a set function, the probability function defining the probability distribution of the sample space. Each probability P[E] is the sum of the probabilities attached to the simple events included in the event E. The event algebra ℰ is thus represented isomorphically by an algebra of measurable sets (see also Secs. 4.6-17b and 12.8-4).
The fundamental probability set associated with the conditional probabilities P[E|E1] is the subset of I representing E1. Conversely, a sample space associated with any given experiment may be regarded as a subset "embedded" in a space of events associated with a more general experiment (see also Secs. 18.2-1 and 18.2-2).
18.2-8. Random Variables. A random variable (stochastic variable, chance variable, variate) is any (not necessarily numerical)* variable x whose "values" x = X constitute a fundamental probability set (sample space, Sec. 18.2-7) of simple events [x = X], or whose values label the points of a sample space on a reciprocal one-to-one basis. The associated probability distribution is the distribution of the random variable x. The definition of any random variable must specify its distribution.
Every single-valued measurable function (Sec. 4.6-14c) x defined on any fundamental probability set S is a random variable; its distribution is defined by the probabilities of the events (measurable subsets of S, Sec. 18.2-7) corresponding to each set of values of x.
18.2-9. Representation of Probability Models in Terms of Numerical Random Variables and Distribution Functions. The simple events (sample points) ξ of the fundamental probability set associated with a given problem are frequently labeled with corresponding values (sample values) X of a real numerical random variable x. Each sample value of x may, for instance, correspond to the result of a measurement defining a simple event. Compound events, like [x < a], [sin x > 0.5], or [x = arctan 2], correspond to measurable sets of values of x (see also Sec. 18.2-8).
More generally, each simple event may be labeled by a corresponding (ordered) set X ≡ (X1, X2, . . .) of real numbers X1, X2, . . . which
* The boldface type used to denote a multidimensional random variable x does not necessarily imply that x is a vector.
constitutes a "value" of a multidimensional random variable x ≡ (x1, x2, . . .). Each of the real variables x1, x2, . . . is itself a random variable (see also Sec. 18.4-1). Given a random variable x or x labeling the simple events of the given fundamental probability set on a one-to-one basis, the probabilities associated with the corresponding experiment are uniquely described by the probability distribution of the random variable. Throughout this handbook, all real numerical random variables are understood to range from −∞ to +∞; values of a numerical random variable which do not label a possible simple event ξ are treated as impossible events and are assigned the probability zero.
The distribution (or the probability function, Sec. 18.2-7) of any real numerical random variable x is uniquely described by its (cumulative) distribution function

    Φx(X) ≡ Φ(X) ≡ P[x ≤ X]   (18.2-8)

Similarly, the distribution of a multidimensional random variable x ≡ (x1, x2, . . .) is uniquely described by its (cumulative) distribution function

    Φx(X1, X2, . . .) ≡ Φ(X1, X2, . . .) ≡ P[x1 ≤ X1, x2 ≤ X2, . . .]   (18.2-9)

Conversely, the distribution function corresponding to a given probability distribution is uniquely defined for all values of the random variable in question. Every distribution function is a nondecreasing function of each of its arguments, and

    Φx(−∞) = 0   Φx(∞) = 1   (18.2-10)
    Φx(−∞, X2, X3, . . .) = Φx(X1, −∞, X3, . . .) = ⋯ = 0   Φx(∞, ∞, . . .) = 1   (18.2-11)
18.3. ONE-DIMENSIONAL PROBABILITY DISTRIBUTIONS
18.3-1. Discrete One-dimensional Probability Distributions (see Tables 18.8-1 to 18.8-7 for examples). The real numerical random variable x is a discrete random variable (has a discrete probability distribution) if and only if the probability

    px(X) ≡ p(X) ≡ P[x = X]   (18.3-1)

is different from zero only on a countable set of spectral values X = X(1), X(2), . . . (spectrum of the discrete random variable x). Each discrete probability distribution is defined by the function (1), or by the corresponding (cumulative) distribution function (Sec. 18.2-9)

    Φx(X) ≡ Φ(X) ≡ P[x ≤ X] = Σ_{X(i)≤X} p(X(i))   (18.3-2a)*

Throughout this handbook, the notation Σ_x y(x) will be used to signify summation of a function y(x) over all spectral values X(i) of a discrete random variable x (see also Sec. 18.3-6). Note

    Σ_x p(x) = Φ(∞) = 1   (18.3-3)
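Equations (18.3-1) to (18.3-3) can be sketched in a few lines of Python (an illustration added to this text; the spectrum and probabilities are invented):

```python
# A discrete one-dimensional distribution given by its probability function
# p(X), Eq. (18.3-1), with the distribution function of Eq. (18.3-2a).
p = {1: 0.2, 2: 0.5, 3: 0.3}   # illustrative spectrum {1, 2, 3}

def Phi(X):
    """Phi(X) = P[x <= X] = sum of p(X_i) over spectral values X_i <= X."""
    return sum(prob for value, prob in p.items() if value <= X)

# Normalization, Eq. (18.3-3): the sum over the whole spectrum is Phi(inf) = 1.
total = sum(p.values())
```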
18.3-2. Continuous One-dimensional Probability Distributions (see Table 18.8-8 for examples). The real numerical random variable x is a continuous random variable (has a continuous probability distribution) if and only if its (cumulative) distribution function Φx(X) ≡ Φ(X) is continuous and has a piecewise continuous derivative, the frequency function (probability density, differential distribution function) of x

    φx(X) ≡ φ(X) = lim_{Δx→0} P[X < x ≤ X + Δx]/Δx = dΦ/dX   (18.3-4)

for all X.† P[X < x ≤ X + dx] = dΦ = φ(X) dx is called a probability element (probability differential). Note

    P[x ≤ X] ≡ Φ(X) = ∫_{−∞}^{X} φ(x) dx   P[a < x ≤ b] = Φ(b) − Φ(a) = ∫_{a}^{b} φ(x) dx   (18.3-5)

    ∫_{−∞}^{∞} φ(x) dx = Φ(∞) = 1   (18.3-6)

If x is a continuous random variable, each event [x = X] has the probability zero but is not necessarily impossible. The spectrum of a continuous random variable x is the set of values x = X where φ(X) > 0.
* In terms of the step function U−(t) [U−(t) = 0 if t < 0, U−(t) = 1 if t ≥ 0, Sec. 21.9-1],
    Φ(X) ≡ p(X(1))U−(X − X(1)) + p(X(2))U−(X − X(2)) + ⋯   (18.3-2b)
† Some authors call a probability distribution continuous whenever its distribution function is continuous.
Note: A random variable can be continuous (i.e., have a piecewise continuous frequency function) over part of its range, while it is discrete elsewhere (see also Sec. 18.3-6).
18.3-3. Expected Values and Variance. Characteristic Parameters of One-dimensional Probability Distributions (see also Sec. 18.3-6). (a) The expected value (mean, mean value, mathematical expectation) of a function y(x) of a discrete or continuous random variable x is

    E{y(x)} = Σ_x y(x)p(x)   (x discrete)
    E{y(x)} = ∫_{−∞}^{∞} y(x)φ(x) dx   (x continuous)   (18.3-7)

if this expression exists in the sense of absolute convergence (see also Secs. 4.6-2 and 4.8-1).
(b) In particular, the expected value (mean, mean value, mathematical expectation) E{x} = ξ and the variance Var{x} = σ² of a discrete or continuous one-dimensional random variable x are defined by

    E{x} = ξ = Σ_x x p(x)   (x discrete)
    E{x} = ξ = ∫_{−∞}^{∞} x φ(x) dx   (x continuous)   (18.3-8)

    Var{x} = σ² = E{(x − ξ)²} = Σ_x (x − ξ)² p(x)   (x discrete)
    Var{x} = σ² = E{(x − ξ)²} = ∫_{−∞}^{∞} (x − ξ)² φ(x) dx   (x continuous)

For computation purposes note (see also Sec. 18.3-10)

    Var{x} = σ² = E{x²} − ξ² = E{x(x − 1)} − ξ(ξ − 1)   (18.3-9)

Whenever E{x} and Var{x} exist, the mean-square deviation E{(x − X)²} of the random variable x from one of its values X is least (and equal to σ²) for X = ξ.
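Both forms of Eq. (18.3-9) are easy to check numerically; the following Python sketch (an addition to this text) uses a fair die as the illustrative distribution:

```python
# E{x} and Var{x} for a discrete distribution, Eqs. (18.3-8) and (18.3-9).
p = {k: 1/6 for k in range(1, 7)}   # fair die

mean = sum(x * px for x, px in p.items())                    # xi = E{x}
var_def = sum((x - mean) ** 2 * px for x, px in p.items())   # E{(x - xi)^2}
var_alt = sum(x * x * px for x, px in p.items()) - mean ** 2  # E{x^2} - xi^2

# For the die, xi = 3.5 and sigma^2 = 35/12; both forms of (18.3-9) agree.
```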
(c) E{x} and Var{x} are not functions of x; they are functionals (Sec. 12.1-4) describing properties of the distribution of x. E{x} is a measure of location, and Var{x} is a measure of dispersion (or concentration) of the probability distribution of x. A number of other numerical "characteristic parameters" describing specific properties of one-dimensional probability distributions are defined in Table 18.3-1 and in Secs. 18.3-7 and 18.3-9. Note that one or more parameters like E{x}, Var{x}, E{|x − ξ|}, . . . may not exist for a given probability distribution.
(d) Tables 18.8-1 to 18.8-8 list mean values and variances for a number of frequently used probability distributions.

Table 18.3-1. Numerical Parameters Describing Properties of One-dimensional Probability Distributions (see also Secs. 18.3-3, 18.3-7, and 18.3-9)
(a) The Fractiles. The P fractiles (quantiles of order P) of a one-dimensional probability distribution are values xP of x such that

    P[x ≤ xP] = Φ(xP) = P   (0 < P < 1)

x½ is the median of the distribution. The quartiles x¼, x½, x¾, the deciles x0.1, x0.2, . . . , x0.9, and the percentiles x0.01, x0.02, . . . , x0.99 respectively divide the range of x into 4, 10, and 100 intervals corresponding to (compound) events having equal probabilities. The various fractiles exist for every probability distribution but are not necessarily unique. Tables of fractiles are widely used in statistics (Secs. 19.5-3 and 19.5-4).
(b) Measures of Location
1. The expected value E{x} (Sec. 18.3-3)
2. The median x½ (see above)
3. A mode of a continuous probability distribution is a value ξmode of x such that φ(ξmode) is a relative maximum (Sec. 11.2-1). A mode of a discrete probability distribution is a spectral value ξmode preceded and followed by spectral values associated with probabilities smaller than p(ξmode). A distribution having one, two, or more modes is respectively called unimodal, bimodal, or multimodal.
(c) Measures of Dispersion
1. The variance σ² = Var{x} (Sec. 18.3-3)
2. The standard deviation (root-mean-square deviation, dispersion) σ = +√Var{x}
4. The mean (absolute) deviation E{|x − ξ|}
5. The interquartile range x¾ − x¼ and the 10-90-percentile range x0.9 − x0.1
6. The range (difference between the largest and the smallest spectral value)
7. The half width of a unimodal continuous distribution is half the difference between the two values of x where φ(x) equals one-half of its maximum value.
(d) Measures of Skewness and Excess (or Curtosis). The first two of the following measures are defined in terms of the moments discussed in Sec. 18.3-7 (see also Sec. 18.3-9).
1. The coefficient of skewness γ1 = μ3/σ³ = κ3/κ2^(3/2)
2. The coefficient of excess γ2 = μ4/μ2² − 3 = κ4/κ2²
3. Pearson's measure of skewness for unimodal distributions s = (ξ − ξmode)/σ (see also Sec. 19.2-4)
The quantities γ1² and γ2 + 3 are also used instead of γ1 and γ2.
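The skewness and excess measures of Table 18.3-1d follow directly from the central moments. A short Python sketch (added to this text; the distribution is an invented illustration):

```python
# Coefficient of skewness gamma_1 = mu_3 / sigma^3 and coefficient of
# excess gamma_2 = mu_4 / mu_2^2 - 3 (Table 18.3-1d), from central moments.
p = {0: 0.5, 1: 0.3, 2: 0.2}   # illustrative discrete distribution

def central_moment(r):
    xi = sum(x * px for x, px in p.items())
    return sum((x - xi) ** r * px for x, px in p.items())

mu2, mu3, mu4 = central_moment(2), central_moment(3), central_moment(4)
gamma1 = mu3 / mu2 ** 1.5
gamma2 = mu4 / mu2 ** 2 - 3
```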
18.3-4. Normalization. Given a function ψ(x) ≥ 0 known to be proportional to the function p(x) associated with a discrete random variable x (Sec. 18.3-1),*

    p(x) = ψ(x) / Σ_x ψ(x)   E{y(x)} = Σ_x y(x)ψ(x) / Σ_x ψ(x)   (18.3-10)

Given a function ψ(x) ≥ 0 known to be proportional to the frequency function φ(x) of a continuous random variable x (Sec. 18.3-2),

    φ(x) = ψ(x) / ∫_{−∞}^{∞} ψ(x) dx   E{y(x)} = ∫_{−∞}^{∞} y(x)ψ(x) dx / ∫_{−∞}^{∞} ψ(x) dx   (18.3-11)

In either case, the divisor Σ_x ψ(x) or ∫_{−∞}^{∞} ψ(x) dx is called the normalization factor. Analogous procedures apply to multidimensional distributions (Sec. 18.4-1).
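The discrete case, Eq. (18.3-10), reads as follows in Python (a sketch added to this text; the weights are invented):

```python
# Normalization of an unnormalized weight function psi(x) >= 0 into a
# probability function p(x), Eq. (18.3-10).
psi = {0: 2.0, 1: 6.0, 2: 2.0}   # illustrative weights

norm = sum(psi.values())                       # normalization factor
p = {x: w / norm for x, w in psi.items()}

def expect(y):
    """E{y(x)} computed directly from the unnormalized weights."""
    return sum(y(x) * w for x, w in psi.items()) / norm
```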
18.3-5. Chebyshev's Inequality and Related Formulas. The following formulas specify upper bounds for the probability that a random variable x, or its absolute deviation |x − ξ| from the mean value ξ = E{x}, exceeds a given value a > 0.

    P[x ≥ a] ≤ ξ/a   (a > 0; x a nonnegative random variable)   (18.3-12)

    P[|x − ξ| ≥ a] ≤ σ²/a²   (a > 0)   (Chebyshev's inequality)   (18.3-13)

If x has a continuous distribution with a single mode (Table 18.3-1) ξmode, one has the stronger inequality

    P[|x − ξ| ≥ a] ≤ (4/9)(1 + s²)/(a/σ − |s|)²   (a/σ > |s|)   (18.3-14)

where s is Pearson's measure of skewness (Table 18.3-1); note that s = 0 if the distribution is symmetrical about the mode.
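For a distribution with known spectrum, the Chebyshev bound (18.3-13) can be compared with the exact tail probability. A Python sketch (an addition to this text, using a fair die as illustration):

```python
# Chebyshev's inequality, Eq. (18.3-13): P[|x - xi| >= a] <= sigma^2 / a^2,
# compared with the exact tail probability of a fair die.
p = {k: 1/6 for k in range(1, 7)}
xi = sum(x * px for x, px in p.items())
var = sum((x - xi) ** 2 * px for x, px in p.items())

def tail(a):
    """Exact P[|x - xi| >= a]."""
    return sum(px for x, px in p.items() if abs(x - xi) >= a)

# For every tested a > 0 the exact tail never exceeds the Chebyshev bound:
bounds_hold = all(tail(a) <= var / a ** 2 + 1e-12 for a in (0.5, 1.0, 2.0, 2.5))
```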
18.3-6. Improved Description of Probability Distributions: Use of Stieltjes Integrals. The treatment of discrete and continuous probability distributions is unified if one expresses the probability of each event [X − ΔX < x ≤ X + ΔX] as a Lebesgue-Stieltjes integral (Sec. 4.6-17)

    P[X − ΔX < x ≤ X + ΔX] = ∫_{X−ΔX}^{X+ΔX} dΦ(x)   (18.3-15)

where Φ(X) = P[x ≤ X] is the cumulative distribution function (Secs. 18.2-9, 18.3-1, and 18.3-2) defining the distribution of the random variable x. For continuous distributions the Stieltjes integral (15) reduces to a Riemann integral. For a discrete distribution, Φ(X) is given by Eq. (2), and P[X − ΔX < x ≤ X + ΔX] reduces to a sum of values of the function p(X) defined in Sec. 18.3-1.
In terms of the Stieltjes-integral notation,

    E{x} = ∫_{−∞}^{∞} x dΦ(x)   E{y(x)} = ∫_{−∞}^{∞} y(x) dΦ(x)   Var{x} = ∫_{−∞}^{∞} (x − ξ)² dΦ(x)   (18.3-16)

for both discrete and continuous distributions. The Stieltjes-integral notation applies also to probability distributions which are partly discrete and partly continuous. An analogous notation is used for multidimensional distributions (Secs. 18.4-4 and 18.4-8).
Discrete distributions may be formally represented in terms of a "probability density" involving impulse functions δ−(X − X(i)) (see also Secs. 18.3-1 and 21.9-6).
* In order to conform with the notation used in many textbooks, the values x = X of a random variable x will be denoted simply by x whenever this notation does not lead to ambiguities.
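For a distribution that is partly discrete and partly continuous, the Stieltjes integral (18.3-16) splits into a sum over the discrete spectrum plus an ordinary integral. A Python sketch (added to this text; the mixture below is an invented illustration):

```python
# E{y(x)} for a mixed distribution (Sec. 18.3-6): a point mass of 1/2 at
# x = 0 plus a uniform density of 1/2 on (0, 1).  The Stieltjes integral
# becomes a discrete sum plus a Riemann integral (midpoint rule here).
def expect(y, n=20000):
    discrete_part = 0.5 * y(0.0)
    h = 1.0 / n
    continuous_part = sum(y((k + 0.5) * h) * 0.5 * h for k in range(n))
    return discrete_part + continuous_part

# E{x} = 0.5 * 0 + 0.5 * (1/2) = 0.25
```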
18.3-7. Moments of a One-dimensional Probability Distribution (see also Secs. 18.3-6 and 18.3-10). (a) The moment of order r ≥ 0 (rth moment) about x = X of a given random variable x is the mean value E{(x − X)^r}, if this quantity exists in the sense of absolute convergence (Sec. 18.3-3).
(b) In particular, the rth moment of x about X = 0 is

    αr = E{x^r} = ∫_{−∞}^{∞} x^r dΦ(x) = Σ_x x^r p(x)   (x discrete)
    αr = E{x^r} = ∫_{−∞}^{∞} x^r φ(x) dx   (x continuous)   (18.3-17)

and the rth moment of x about its mean value ξ (central moment of order r) is

    μr = E{(x − ξ)^r} = ∫_{−∞}^{∞} (x − ξ)^r dΦ(x) = Σ_x (x − ξ)^r p(x)   (x discrete)
    μr = E{(x − ξ)^r} = ∫_{−∞}^{∞} (x − ξ)^r φ(x) dx   (x continuous)   (18.3-18)

The existence of αr or μr implies the existence of all moments αk and μk of order k ≤ r; the divergence of αr or μr implies the divergence of all moments αk and μk of order k ≥ r.
If the probability distribution is symmetric about its mean, all (existing) central moments μr of odd order r are equal to zero.
(c) The rth factorial moment of x about X = 0 is

    α[r] = E{x^[r]} = E{x(x − 1) ⋯ (x − r + 1)}   (18.3-19)

The rth central factorial moment of x is E{(x − ξ)^[r]}. The rth absolute moment of x about X = 0 is βr = E{|x|^r}. Note

    μ1 = 0   (18.3-20)

(d) A one-dimensional probability distribution is uniquely defined by its moments α0, α1, α2, . . . if they all exist and are such that the series Σ_{k=0}^{∞} αk s^k/k! converges absolutely for some |s| > 0 [see also Eq. (28) and the footnote to Sec. 18.3-8c].
(e) Refer to Tables 18.8-1 to 18.8-7 for examples, and to Sec. 18.3-10 for relations connecting the αr, μr, and α[r].
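The three kinds of moments defined above can be computed side by side for a discrete distribution. A Python sketch (added to this text; the distribution is an invented illustration):

```python
# Moments about the origin alpha_r, central moments mu_r, and factorial
# moments alpha_[r] of Sec. 18.3-7 for a discrete distribution.
from math import prod

p = {0: 0.25, 1: 0.5, 2: 0.25}   # illustrative distribution

def alpha(r):
    return sum(x ** r * px for x, px in p.items())

def mu(r):
    xi = alpha(1)
    return sum((x - xi) ** r * px for x, px in p.items())

def alpha_factorial(r):
    # E{x(x-1)...(x-r+1)}, Eq. (18.3-19); empty product gives alpha_[0] = 1
    return sum(prod(x - j for j in range(r)) * px for x, px in p.items())
```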
18.3-8. Characteristic Functions and Generating Functions (see also Sec. 18.3-6; refer to Tables 18.8-1 to 18.8-8 for examples).*
(a) The probability distribution of any one-dimensional random variable x uniquely defines its (generally complex-valued) characteristic function

    χx(q) ≡ E{e^(iqx)} = ∫_{−∞}^{∞} e^(iqx) dΦ(x) = Σ_x e^(iqx) p(x)   (x discrete)
    χx(q) ≡ E{e^(iqx)} = ∫_{−∞}^{∞} e^(iqx) φ(x) dx   (x continuous)   (18.3-21)

where q is a real variable ranging between −∞ and ∞.
(b) The probability distribution of a random variable x uniquely defines its moment-generating function

    Mx(s) ≡ E{e^(sx)} = ∫_{−∞}^{∞} e^(sx) dΦ(x) = Σ_x e^(sx) p(x)   (x discrete)
    Mx(s) ≡ E{e^(sx)} = ∫_{−∞}^{∞} e^(sx) φ(x) dx   (x continuous)   (18.3-22)

and its generating function (see also Sec. 8.6-5)

    γx(s) ≡ E{s^x} = ∫_{−∞}^{∞} s^x dΦ(x) = Σ_x s^x p(x)   (x discrete)
    γx(s) ≡ E{s^x} = ∫_{−∞}^{∞} s^x φ(x) dx   (x continuous)   (18.3-23)

for each value of the complex variable s such that the function in question exists in the sense of absolute convergence.
(c) The characteristic function χx(q) defines the probability distribution of x uniquely.† The same is true for each of the functions Mx(s) and γx(s) if it exists, in the sense of absolute convergence, throughout an interval of the real axis including s = 0 in the case of Mx(s), and s = 1 in the case of γx(s). Specifically, if x is a discrete or continuous random variable,

    p(x) = lim_{Q→∞} (1/2Q) ∫_{−Q}^{Q} e^(−iqx) χx(q) dq   (x discrete)
    φ(x) = (1/2π) ∫_{−∞}^{∞} e^(−iqx) χx(q) dq   (x continuous)   (18.3-24)

Equation (24) also yields p(x) or φ(x) in terms of Mx(s) or γx(s), since

    χx(q) = Mx(iq) = γx(e^(iq))   (18.3-25)

(d) In many problems it is much easier to obtain a description of a probability distribution in terms of χx(q), Mx(s), or γx(s) than to compute Φ(x), p(x), or φ(x) directly; in other problems Φ(x), p(x), or φ(x) must be found when χx(q), Mx(s), or γx(s) are known. The linear integral transformations (21) to (24) can often be made with the aid of tables of Fourier or Laplace transform pairs (Appendix D).
(e) The generating function γx(s) is particularly useful in problems involving discrete distributions with spectral values 0, 1, 2, . . . , for then

    γx(s) = Σ_{x=0}^{∞} s^x p(x)   p(x) = (1/x!) γx^((x))(0)   (x = 0, 1, 2, . . .)

whenever the series converges (see also Sec. 18.8-1; see Ref. 18.4 for a number of interesting applications).
* See footnote to Sec. 18.3-4.
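The generating function of Sec. 18.3-8e can be evaluated directly from a tabulated p(x). A Python sketch (added to this text), using a truncated Poisson distribution, for which the standard result γx(s) = e^(λ(s−1)) holds:

```python
# Generating function gamma_x(s) = sum s^x p(x) for a distribution with
# spectrum 0, 1, 2, ...; here a Poisson distribution truncated at x = 19.
from math import exp, factorial

lam = 2.0
p = {x: exp(-lam) * lam ** x / factorial(x) for x in range(20)}

def gamma(s):
    return sum(s ** x * px for x, px in p.items())

# gamma(0.0) recovers p(0) = exp(-lam); gamma(1.0) is the total probability,
# and gamma(s) matches exp(lam*(s - 1)) up to the tiny truncation error.
```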
18.3-9. Semi-invariants (see also Sec. 18.3-10). Given a one-dimensional probability distribution such that the rth moment αr exists, the first r semi-invariants (cumulants) κ1, κ2, . . . , κr of the distribution exist and are defined by

    log_e χx(q) = Σ_{k=1}^{r} κk (iq)^k/k! + o(q^r)   (18.3-26)

Under the conditions of Sec. 18.3-7d all semi-invariants κ1, κ2, . . . exist and define the distribution uniquely.
† Φ(x) is, then, uniquely defined except possibly on a set of measure zero; Φ(x) is unique wherever it is continuous (see also Sec. 18.2-9).
18.3-10. Computation of Moments and Semi-invariants from Xx(q), M«(s), and yx(s). Relations between Moments and Semiinvariants. Many properties of a distribution can be computed directly from Xx(q), Mx(s), or yx(s) without previous computation of $(x), v(x), or p(x).
If the quantities in question exist, or = t-^x,W(0) = M,<'>(0)
aM = Tx(r)(l)
*r =i- J log. Xx(q)]^Q =^ log, M.w]f_
(18.3-27)
Note 00
00
»
M.(s) - fc Y«*jj log. M.(«) - Jfc=0 J| "*F! 7x(S +X) =*=0 2 "w S (18'3-28> =0 ' provided that the function on the left (respectively the moment generating function, the semi-invariant-generating function, and the factorial-moment-generating function of x) is analytic throughout a neighborhood of s —0.
Equations (28) yield E{x\ and Var {x} with the aid of the relations E{x\ = £ = «i = am = *i
Var {x} = (72 = /i2 = <*2 - f2 = «[« - i(i -!) = ««
(18.3-29)
Table 18.3-1 lists other parameters which can be expressed in terms of moments.
The following additional formulas relate moments and semi-invariants:

\mu_r = \sum_{k=0}^{r} (-1)^k \binom{r}{k} \alpha_{r-k}\,\xi^k \qquad \alpha_r = \sum_{k=0}^{r} \binom{r}{k} \mu_{r-k}\,\xi^k \qquad (18.3\text{-}30)

\alpha_{[r]} = \sum_{k=0}^{r-1} S_k^r\,\alpha_{r-k} \qquad (r = 0, 1, 2, \ldots;\ \text{see also Sec. 21.5-3}) \qquad (18.3\text{-}31)

\alpha_2 = \alpha_{[2]} + \alpha_{[1]} = \kappa_2 + \kappa_1^2
\alpha_3 = \alpha_{[3]} + 3\alpha_{[2]} + \alpha_{[1]} = \kappa_3 + 3\kappa_1\kappa_2 + \kappa_1^3 \qquad (18.3\text{-}32)
\alpha_4 = \alpha_{[4]} + 6\alpha_{[3]} + 7\alpha_{[2]} + \alpha_{[1]} = \kappa_4 + 6\kappa_1^2\kappa_2 + 4\kappa_1\kappa_3 + 3\kappa_2^2 + \kappa_1^4

\mu_2 = \kappa_2 \qquad \mu_3 = \kappa_3 \qquad \mu_4 = \kappa_4 + 3\kappa_2^2 \qquad (18.3\text{-}33)

\kappa_4 = \mu_4 - 3\mu_2^2 \qquad \kappa_5 = \mu_5 - 10\mu_2\mu_3 \qquad \kappa_6 = \mu_6 - 15\mu_2\mu_4 - 10\mu_3^2 + 30\mu_2^3 \qquad (18.3\text{-}34)
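These relations can be verified numerically for any small discrete distribution. A minimal sketch (the values and probabilities below are chosen ad hoc) computes raw moments α_r and central moments μ_r directly and checks the cumulant identities of (18.3-32) and (18.3-34):

```python
# Numerical check of the moment/semi-invariant relations for an
# arbitrary small discrete distribution (ad hoc illustrative values).
vals  = [0, 1, 2, 5]
probs = [0.1, 0.4, 0.3, 0.2]

def alpha(r):            # moments about the origin
    return sum(p * v**r for v, p in zip(vals, probs))

xi = alpha(1)

def mu(r):               # central moments
    return sum(p * (v - xi)**r for v, p in zip(vals, probs))

k1, k2, k3 = xi, mu(2), mu(3)
k4 = mu(4) - 3 * mu(2)**2            # (18.3-34)

assert abs(alpha(2) - (k2 + k1**2)) < 1e-9
assert abs(alpha(3) - (k3 + 3*k1*k2 + k1**3)) < 1e-9
assert abs(alpha(4) - (k4 + 4*k1*k3 + 6*k1**2*k2 + 3*k2**2 + k1**4)) < 1e-9
```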
18.4. MULTIDIMENSIONAL PROBABILITY DISTRIBUTIONS
18.4-1. Joint Distributions (see also Sec. 18.2-9). The probability distribution of a multidimensional random variable x ≡ (x_1, x_2, . . .) is described as a joint distribution of real numerical random variables x_1, x_2, . . . . Each simple event (point of the multidimensional sample space) [x = X] ≡ [x_1 = X_1, x_2 = X_2, . . .] may be regarded as a result of a compound experiment in which each of the variables x_1, x_2, . . . is measured. Each joint distribution is completely defined by its (cumulative) joint distribution function.
18.4-2. Two-dimensional Probability Distributions. Marginal Distributions. The joint distribution of two random variables x_1, x_2 is defined by its (cumulative) distribution function

\Phi_x(X_1, X_2) \equiv \Phi(X_1, X_2) \equiv P[x_1 \le X_1,\ x_2 \le X_2] \qquad (18.4\text{-}1)

The distributions of x_1 and x_2 by themselves (marginal distributions derived from the joint distribution of x_1 and x_2) are described by the corresponding marginal distribution functions

\Phi_1(X_1) \equiv P[x_1 \le X_1] \equiv P[x_1 \le X_1,\ x_2 < \infty] \equiv \Phi(X_1, \infty)
\Phi_2(X_2) \equiv P[x_2 \le X_2] \equiv P[x_1 < \infty,\ x_2 \le X_2] \equiv \Phi(\infty, X_2) \qquad (18.4\text{-}2)
18.4-3. Discrete and Continuous Two-dimensional Probability Distributions. (a) A two-dimensional random variable x ≡ (x_1, x_2) is a discrete random variable (has a discrete probability distribution) if and only if the joint probability

p_x(X_1, X_2) \equiv p(X_1, X_2) \equiv P[x_1 = X_1,\ x_2 = X_2] \qquad (18.4\text{-}3)

is different from zero only for a countable set (spectrum) of "points" (X_1, X_2), i.e., if and only if both x_1 and x_2 are discrete random variables (Sec. 18.3-1). The marginal probabilities respectively associated with the marginal distributions of x_1 and x_2 (Sec. 18.4-2) are

p_1(X_1) \equiv P[x_1 = X_1] = \sum_{X_2} p(X_1, X_2) \qquad p_2(X_2) \equiv P[x_2 = X_2] = \sum_{X_1} p(X_1, X_2) \qquad (18.4\text{-}4)
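Summing a joint probability table over the other variable, as in (18.4-4), is easy to sketch directly; the joint table below is an ad hoc illustration:

```python
# Marginal probabilities (18.4-4) obtained by summing a joint
# probability table over the other variable (toy joint distribution).
joint = {            # p(X1, X2)
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.25, (1, 1): 0.45,
}

def p1(X1):
    return sum(p for (a, b), p in joint.items() if a == X1)

def p2(X2):
    return sum(p for (a, b), p in joint.items() if b == X2)

assert abs(p1(0) - 0.30) < 1e-12 and abs(p1(1) - 0.70) < 1e-12
assert abs(p2(0) - 0.35) < 1e-12 and abs(p2(1) - 0.65) < 1e-12
```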
(b) A two-dimensional random variable x ≡ (x_1, x_2) is a continuous random variable (has a continuous probability distribution) if and only if (1) Φ(X_1, X_2) is continuous for all X_1, X_2, and (2) the joint frequency function (probability density)

\varphi_x(X_1, X_2) \equiv \varphi(X_1, X_2) = \frac{\partial^2 \Phi(X_1, X_2)}{\partial X_1\,\partial X_2} \qquad (18.4\text{-}5)

exists and is piecewise continuous everywhere.* φ(X_1, X_2) dX_1 dX_2 is a probability element (18.4-6).

(c) Note

\sum_{X_1}\sum_{X_2} p(X_1, X_2) = \sum_{X_1} p_1(X_1) = \sum_{X_2} p_2(X_2) = 1
\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \varphi(x_1, x_2)\,dx_1\,dx_2 = \int_{-\infty}^{\infty} \varphi_1(x_1)\,dx_1 = \int_{-\infty}^{\infty} \varphi_2(x_2)\,dx_2 = 1 \qquad (18.4\text{-}7)
18.4-4. Expected Values, Moments, Covariance, and Correlation Coefficient. (a) The expected value (mean value, mathematical expectation) of a function y = y(x_1, x_2) of two random variables x_1, x_2 with respect to their joint distribution is

E\{y(x_1, x_2)\} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y(x_1, x_2)\,d\Phi(x_1, x_2)
= \sum_{X_1}\sum_{X_2} y(X_1, X_2)\,p(X_1, X_2) \quad \text{for discrete distributions}
= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y(x_1, x_2)\,\varphi(x_1, x_2)\,dx_1\,dx_2 \quad \text{for continuous distributions} \qquad (18.4\text{-}8)

if this expression exists in the sense of absolute convergence (see also Sec. 18.3-3).
Note: If y is a function of x_1 alone, the mean value (8) is identical with the mean value (marginal expected value) with respect to the marginal distribution of x_1.

(b) The mean values E{x_1} = ξ_1, E{x_2} = ξ_2 define a "point" (ξ_1, ξ_2) called the center of gravity of the joint distribution. The quantities E{(x_1 − X_1)^{r_1}(x_2 − X_2)^{r_2}} are called moments of order r_1 + r_2 about the "point" (X_1, X_2). In particular, the quantities

\alpha_{r_1 r_2} = E\{x_1^{r_1} x_2^{r_2}\} \qquad \mu_{r_1 r_2} = E\{(x_1 - \xi_1)^{r_1}(x_2 - \xi_2)^{r_2}\} \qquad (18.4\text{-}9)

are, respectively, the moments about the origin and the moments about the center of gravity (central moments) of order r_1 + r_2 (see also Sec. 18.3-7b).

(c) The second-order central moments are of special interest and warrant a special notation. Note the following definitions:

\lambda_{11} = E\{(x_1 - \xi_1)^2\} = \mathrm{Var}\{x_1\} = \sigma_1^2 \qquad \lambda_{22} = E\{(x_2 - \xi_2)^2\} = \mathrm{Var}\{x_2\} = \sigma_2^2 \quad \text{(variances of } x_1 \text{ and } x_2\text{)}
\lambda_{12} = \lambda_{21} = E\{(x_1 - \xi_1)(x_2 - \xi_2)\} = \mathrm{Cov}\{x_1, x_2\} \quad \text{(covariance of } x_1 \text{ and } x_2\text{)} \qquad (18.4\text{-}10)

\rho_{12} = \rho_{21} = \rho\{x_1, x_2\} = \frac{\lambda_{12}}{+\sqrt{\lambda_{11}\lambda_{22}}} = \frac{\mathrm{Cov}\{x_1, x_2\}}{+\sqrt{\mathrm{Var}\{x_1\}\,\mathrm{Var}\{x_2\}}} = E\left\{\frac{x_1 - \xi_1}{\sigma_1}\cdot\frac{x_2 - \xi_2}{\sigma_2}\right\} \qquad (18.4\text{-}11)
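The covariance and correlation coefficient of (18.4-10)/(18.4-11) can be computed directly from a discrete joint table; the table below is again a toy illustration:

```python
# Covariance and correlation coefficient (18.4-10)/(18.4-11) computed
# directly from a discrete joint distribution (toy table).
from math import sqrt

joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

def E(f):
    return sum(p * f(a, b) for (a, b), p in joint.items())

xi1, xi2 = E(lambda a, b: a), E(lambda a, b: b)
var1 = E(lambda a, b: (a - xi1)**2)
var2 = E(lambda a, b: (b - xi2)**2)
cov  = E(lambda a, b: (a - xi1) * (b - xi2))
rho  = cov / sqrt(var1 * var2)

# here Cov{x1,x2} = E{x1 x2} - xi1*xi2 = 0.4 - 0.5*0.6 = 0.10
assert abs(cov - 0.10) < 1e-12
assert -1.0 <= rho <= 1.0
```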
18.4-5. Conditional Probability Distributions Involving Two Random Variables. (a) The joint distribution of two random variables x_1, x_2 defines a conditional distribution of x_1 relative to the hypothesis that x_2 = X_2 for each value X_2 of x_2, and a conditional distribution of x_2 relative to each hypothesis x_1 = X_1. The conditional distributions of x_1 and x_2 derived from a discrete joint distribution (Sec. 18.4-3a) are discrete and may be described by the respective conditional probabilities (Sec. 18.2-2)

p_{1|2}(X_1|X_2) \equiv P[x_1 = X_1\,|\,x_2 = X_2] \equiv \frac{p(X_1, X_2)}{p_2(X_2)} \qquad p_{2|1}(X_2|X_1) \equiv P[x_2 = X_2\,|\,x_1 = X_1] \equiv \frac{p(X_1, X_2)}{p_1(X_1)} \qquad (18.4\text{-}12)
The conditional distributions of x_1 and x_2 derived from a continuous joint distribution (Sec. 18.4-3b) are continuous and may be described by the respective conditional frequency functions

\varphi_{1|2}(X_1|X_2) = \frac{\varphi(X_1, X_2)}{\varphi_2(X_2)} \qquad \varphi_{2|1}(X_2|X_1) = \frac{\varphi(X_1, X_2)}{\varphi_1(X_1)} \qquad (18.4\text{-}13)

(b) Note

\sum_{X_1} p_{1|2}(X_1|X_2) = \sum_{X_2} p_{2|1}(X_2|X_1) = 1 \qquad \int_{-\infty}^{\infty} \varphi_{1|2}(x_1|X_2)\,dx_1 = \int_{-\infty}^{\infty} \varphi_{2|1}(x_2|X_1)\,dx_2 = 1 \qquad (18.4\text{-}14)

p_{1|2}(X_1|X_2) = \frac{p_1(X_1)\,p_{2|1}(X_2|X_1)}{\sum_{x_1} p_1(x_1)\,p_{2|1}(X_2|x_1)} \qquad p_{2|1}(X_2|X_1) = \frac{p_2(X_2)\,p_{1|2}(X_1|X_2)}{\sum_{x_2} p_2(x_2)\,p_{1|2}(X_1|x_2)}

\varphi_{1|2}(X_1|X_2) = \frac{\varphi_1(X_1)\,\varphi_{2|1}(X_2|X_1)}{\int_{-\infty}^{\infty} \varphi_1(x_1)\,\varphi_{2|1}(X_2|x_1)\,dx_1} \qquad \varphi_{2|1}(X_2|X_1) = \frac{\varphi_2(X_2)\,\varphi_{1|2}(X_1|X_2)}{\int_{-\infty}^{\infty} \varphi_2(x_2)\,\varphi_{1|2}(X_1|x_2)\,dx_2}

(Bayes's theorem, see also Sec. 18.2-6) \qquad (18.4\text{-}15)
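The discrete form of Bayes's theorem (18.4-15) amounts to a weighted renormalization; a minimal sketch with ad hoc prior and conditional probabilities:

```python
# Bayes's theorem (18.4-15) for a discrete joint distribution:
# p_{1|2}(X1|X2) = p1(X1) p_{2|1}(X2|X1) / sum_x1 p1(x1) p_{2|1}(X2|x1).
p1 = {0: 0.4, 1: 0.6}                     # marginal ("prior") on x1
p2_given_1 = {0: {0: 0.9, 1: 0.1},        # p_{2|1}(. | X1 = 0)
              1: {0: 0.2, 1: 0.8}}        # p_{2|1}(. | X1 = 1)

def p1_given_2(X1, X2):
    denom = sum(p1[x1] * p2_given_1[x1][X2] for x1 in p1)
    return p1[X1] * p2_given_1[X1][X2] / denom

post = p1_given_2(1, 1)
assert abs(post - (0.6 * 0.8) / (0.4 * 0.1 + 0.6 * 0.8)) < 1e-12
assert abs(sum(p1_given_2(x, 1) for x in p1) - 1.0) < 1e-12
```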
(c) Given a discrete or continuous joint distribution of two random variables x_1 and x_2, the conditional expected value of a function y(x_1, x_2) relative to the hypothesis that x_1 = X_1 is

E\{y(x_1, x_2)|X_1\} = \sum_{X_2} y(X_1, X_2)\,p_{2|1}(X_2|X_1) \quad \text{for discrete distributions}
= \int_{-\infty}^{\infty} y(X_1, x_2)\,\varphi_{2|1}(x_2|X_1)\,dx_2 \quad \text{for continuous distributions} \qquad (18.4\text{-}16)

if this expression exists in the sense of absolute convergence. Note that E{y(x_1, x_2)|X_1} is a function of X_1. EXAMPLE: The conditional variances of x_1 and x_2 are the respective functions

\mathrm{Var}\{x_1|X_2\} = E\{(x_1 - E\{x_1|X_2\})^2|X_2\} \qquad \mathrm{Var}\{x_2|X_1\} = E\{(x_2 - E\{x_2|X_1\})^2|X_1\} \qquad (18.4\text{-}17)
18.4-6. Regression (see also Secs. 18.4-9 and 19.7-2). (a) Given the joint distribution of two random variables x_1 and x_2, a regression of x_2 on x_1 is any function g_2(x_1) used to approximate the statistical dependence of x_2 on x_1 by a deterministic relation x_2 ≈ g_2(x_1). More specifically, x_2 is written as a sum of two random variables,

x_2 = g_2(x_1) + h_2(x_1, x_2) \qquad (18.4\text{-}18)

where h_2(x_1, x_2) is regarded as a correction term. In particular, the function
g_2(x_1) = E\{x_2|x_1\} \quad \text{(mean-square regression of } x_2 \text{ on } x_1\text{; regression of the mean of } x_2 \text{ on } x_1\text{)} \qquad (18.4\text{-}19)

often simply called the regression of x_2 on x_1, minimizes the mean-square deviation

E\{[h_2(x_1, x_2)]^2\} = E\{[x_2 - g_2(x_1)]^2\} \qquad (18.4\text{-}20)

The corresponding curve x_2 = E{x_2|x_1} is the (theoretical) mean-square regression curve of x_2.

(b) It is often sufficient to approximate the regression (19) by the linear function

g_2(x_1) \approx g_2^{(1)}(x_1) = \xi_2 + \beta_{21}(x_1 - \xi_1) \qquad \beta_{21} = \rho_{12}\frac{\sigma_2}{\sigma_1} \quad \text{(linear mean-square regression of } x_2 \text{ on } x_1\text{)} \qquad (18.4\text{-}21)

Equation (21) describes a straight line, the mean-square regression line of x_2; β_{21} is the regression coefficient of x_2 on x_1. Equation (21) represents the linear function ax_1 + b whose coefficients a, b minimize the mean-square deviation

E\{[x_2 - (ax_1 + b)]^2\} = \sigma_2^2 + a^2\sigma_1^2 - 2a\rho_{12}\sigma_1\sigma_2 + (\xi_2 - a\xi_1 - b)^2
(c) The mean-square regression (19) may be approximated more closely by a polynomial of degree m (parabolic mean-square regression of order m) or by other approximating functions, with coefficients or parameters chosen so as to minimize (20).

(d) If x_2 is regarded as the independent variable, one has similarly

g_1(x_2) \equiv E\{x_1|x_2\} \quad \text{(mean-square regression of } x_1 \text{ on } x_2\text{)} \qquad (18.4\text{-}22)

g_1^{(1)}(x_2) \equiv \xi_1 + \beta_{12}(x_2 - \xi_2) \qquad \beta_{12} = \rho_{12}\frac{\sigma_1}{\sigma_2} \quad \text{(linear mean-square regression of } x_1 \text{ on } x_2\text{)} \qquad (18.4\text{-}23)

Note that in general neither (19) and (22) nor (21) and (23) are inverse functions. All mean-square regression curves and mean-square regression lines pass through the center of gravity (ξ_1, ξ_2) of the joint distribution.
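The regression line (18.4-21) follows from second-order moments alone; a minimal sketch, using illustrative sample values to estimate those moments:

```python
# Linear mean-square regression (18.4-21): g2(x1) = xi2 + beta21*(x1 - xi1),
# beta21 = rho12 * sigma2 / sigma1 = Cov{x1,x2} / Var{x1}.
# Moments estimated from illustrative data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1]
n = len(xs)
xi1 = sum(xs) / n
xi2 = sum(ys) / n
var1 = sum((x - xi1)**2 for x in xs) / n
cov  = sum((x - xi1)*(y - xi2) for x, y in zip(xs, ys)) / n
beta21 = cov / var1

def g2(x1):                      # regression of x2 on x1
    return xi2 + beta21 * (x1 - xi1)

# the regression line passes through the center of gravity (xi1, xi2)
assert abs(g2(xi1) - xi2) < 1e-12
```

The final assertion checks the property stated above: the mean-square regression line passes through the center of gravity.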
The above definitions apply, in particular, if either of the two random variables, say x_1 = t, becomes a given independent variable, and x_2(t) describes a random process (Sec. 18.9-1).
18.4-7. n-dimensional Probability Distributions. (a) The joint distribution of n random variables x_1, x_2, . . . , x_n is uniquely described by its (cumulative) joint distribution function

\Phi_x(X_1, X_2, \ldots, X_n) \equiv \Phi(X_1, X_2, \ldots, X_n) \equiv P[x_1 \le X_1,\ x_2 \le X_2,\ \ldots,\ x_n \le X_n] \qquad (18.4\text{-}24)

(Sec. 18.2-9). The joint distribution of m < n of the variables x_1, x_2, . . . , x_n is an m-dimensional marginal distribution derived from the original joint distribution. One obtains the corresponding marginal distribution function from the joint distribution function (24) by substituting X_j = ∞ for each of the n − m arguments X_j which do not occur in the marginal distribution, e.g.,

\Phi_{12}(X_1, X_2) \equiv \Phi(X_1, X_2, \infty, \ldots, \infty) \qquad \Phi_2(X_2) \equiv \Phi(\infty, X_2, \infty, \ldots, \infty) \equiv \Phi_{12}(\infty, X_2)

etc.
(b) An n-dimensional random variable x ≡ (x_1, x_2, . . . , x_n) is a discrete random variable (has a discrete probability distribution) if and only if the joint probability

p_x(X_1, X_2, \ldots, X_n) \equiv p(X_1, X_2, \ldots, X_n) \equiv P[x_1 = X_1,\ x_2 = X_2,\ \ldots,\ x_n = X_n] \qquad (18.4\text{-}25)

differs from zero only for a countable set (spectrum) of "points" (X_1, X_2, . . . , X_n), i.e., if and only if each of the n random variables x_1, x_2, . . . , x_n is discrete (see also Secs. 18.3-1 and 18.4-3a). Marginal probabilities and conditional probabilities are defined in the manner of Secs. 18.4-3a and 18.4-5a, e.g.,

p_{12}(X_1, X_2) = \sum_{X_3}\sum_{X_4}\cdots\sum_{X_n} p(X_1, X_2, X_3, \ldots, X_n)
p_2(X_2) = \sum_{X_1}\sum_{X_3}\cdots\sum_{X_n} p(X_1, X_2, X_3, \ldots, X_n) \equiv \sum_{X_1} p_{12}(X_1, X_2)

etc., and

p_{1|2}(X_1|X_2) = \frac{p_{12}(X_1, X_2)}{p_2(X_2)} \qquad p_{1|23}(X_1|X_2, X_3) = \frac{p_{123}(X_1, X_2, X_3)}{p_{23}(X_2, X_3)} \qquad p_{23|1}(X_2, X_3|X_1) = \frac{p_{123}(X_1, X_2, X_3)}{p_1(X_1)}

etc.
(c) An n-dimensional random variable x ≡ (x_1, x_2, . . . , x_n) is a continuous random variable (has a continuous probability distribution) if and only if (1) Φ(X_1, X_2, . . . , X_n) is continuous for all X_1, X_2, . . . , X_n, and (2) the joint frequency function (probability density)

\varphi_x(X_1, X_2, \ldots, X_n) \equiv \varphi(X_1, X_2, \ldots, X_n) = \frac{\partial^n \Phi(X_1, X_2, \ldots, X_n)}{\partial X_1\,\partial X_2 \cdots \partial X_n} \qquad (18.4\text{-}26)

exists and is piecewise continuous everywhere.* φ(X_1, X_2, . . . , X_n) dx_1 dx_2 · · · dx_n is called a probability element (see also Secs. 18.3-2 and 18.4-3b). The spectrum of a continuous probability distribution is the set of "points" (X_1, X_2, . . . , X_n) where the frequency function (26) is different from zero.
(d) Note

\sum_{X_1}\sum_{X_2}\cdots\sum_{X_n} p(X_1, X_2, \ldots, X_n) = 1 \qquad \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} \varphi(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2 \cdots dx_n = 1 \qquad (18.4\text{-}27)

(e) The frequency functions associated with the (necessarily continuous) marginal and conditional distributions derived from a continuous n-dimensional probability distribution are defined in the manner of Secs. 18.4-3b and 18.4-5a, e.g.,

\varphi_{12}(X_1, X_2) = \frac{\partial^2 \Phi_{12}(X_1, X_2)}{\partial X_1\,\partial X_2} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} \varphi(X_1, X_2, x_3, \ldots, x_n)\,dx_3 \cdots dx_n
\varphi_{1|2}(X_1|X_2) = \frac{\varphi_{12}(X_1, X_2)}{\varphi_2(X_2)} \qquad \varphi_{1|23}(X_1|X_2, X_3) = \frac{\varphi_{123}(X_1, X_2, X_3)}{\varphi_{23}(X_2, X_3)}

etc.

(f) The joint distribution of two or more multidimensional random variables x ≡ (x_1, x_2, . . .), y ≡ (y_1, y_2, . . .), . . . is the joint distribution of the random variables x_1, x_2, . . . ; y_1, y_2, . . . ; . . . .
Note: A joint distribution may be discrete with respect to one or more of the
random variables involved, and continuous with respect to one or more of the others; and each random variable may be partly discrete and partly continuous.
18.4-8. Expected Values and Moments (see also Sec. 18.4-4). (a) The expected value (mean value, mathematical expectation) of a function y = y(x_1, x_2, . . . , x_n) of n random variables x_1, x_2, . . . , x_n with respect to their joint distribution is

E\{y(x_1, x_2, \ldots, x_n)\} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} y(x_1, x_2, \ldots, x_n)\,d\Phi(x_1, x_2, \ldots, x_n)
= \sum_{X_1}\sum_{X_2}\cdots\sum_{X_n} y(X_1, X_2, \ldots, X_n)\,p(X_1, X_2, \ldots, X_n) \quad \text{(for discrete distributions)}
= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} y(x_1, x_2, \ldots, x_n)\,\varphi(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2 \cdots dx_n \quad \text{(for continuous distributions)} \qquad (18.4\text{-}28)

if this expression exists in the sense of absolute convergence.

* See footnote to Sec. 18.3-2.

Note: If y is a function of only m < n of the n random variables x_1, x_2, . . . , x_n, then the mean value (28) is identical with the mean value of y with respect to the joint distribution (marginal distribution, Sec. 18.4-7) of the m variables in question.
(b) The n mean values E{x_1} = ξ_1, E{x_2} = ξ_2, . . . , E{x_n} = ξ_n define a "point" (ξ_1, ξ_2, . . . , ξ_n) called the center of gravity of the joint distribution. The quantities E{(x_1 − X_1)^{r_1}(x_2 − X_2)^{r_2} · · · (x_n − X_n)^{r_n}} are the moments of order r_1 + r_2 + · · · + r_n about the "point" (X_1, X_2, . . . , X_n). In particular, the quantities

\alpha_{r_1 r_2 \ldots r_n} = E\{x_1^{r_1} x_2^{r_2} \cdots x_n^{r_n}\} \qquad \mu_{r_1 r_2 \ldots r_n} = E\{(x_1 - \xi_1)^{r_1}(x_2 - \xi_2)^{r_2} \cdots (x_n - \xi_n)^{r_n}\} \qquad (18.4\text{-}29)

are, respectively, the moments about the origin and the moments about the center of gravity (central moments).
(c) The second-order central moments are again of special interest and warrant a special notation; the quantities

\lambda_{ik} = \lambda_{ki} = E\{(x_i - \xi_i)(x_k - \xi_k)\} = \begin{cases} \mathrm{Var}\{x_i\} = \sigma_i^2 & (i = k) \\ \mathrm{Cov}\{x_i, x_k\} & (i \ne k) \end{cases} \qquad (18.4\text{-}30)

define the moment matrix [λ_{ik}] ≡ Λ and its reciprocal (Sec. 13.2-3)*

[\Lambda_{ik}] \equiv [\lambda_{ik}]^{-1} \equiv \Lambda^{-1}

det [λ_{ik}] is the generalized variance of the joint distribution. The (total) correlation coefficients

* Note that some authors denote the cofactor matrix [λ_{ik}]^{-1} det [λ_{ik}] by [Λ_{ik}]. The notation chosen here simplifies some expressions.
\rho_{ik} = \rho\{x_i, x_k\} = \frac{\lambda_{ik}}{\sigma_i\sigma_k} = E\left\{\frac{x_i - \xi_i}{\sigma_i}\cdot\frac{x_k - \xi_k}{\sigma_k}\right\} \qquad (i, k = 1, 2, \ldots, n) \qquad (18.4\text{-}31)

(see also Sec. 18.4-4c) define the correlation matrix [ρ_{ik}] of the joint distribution. Note

+\sqrt{\det\,[\rho_{ik}]} = +\sqrt{\det\,[\lambda_{ik}]}\,\big/\,(\sigma_1\sigma_2\cdots\sigma_n)

The matrices [λ_{ik}] and [ρ_{ik}] are real, symmetric, and nonnegative (Secs. 13.3-2 and 13.5-2). Their common rank (Sec. 13.2-7) r is the rank of the joint distribution.

The ellipsoid of concentration corresponding to a given n-dimensional probability distribution is the n-dimensional "ellipsoid"

\sum_{i=1}^{n}\sum_{k=1}^{n} \Lambda_{ik}\,x_i x_k = n + 2 \qquad (18.4\text{-}32)

defined so that a uniform distribution of a unit probability "mass" inside the hypersurface has the moment matrix [λ_{ik}]. The ellipsoid of concentration illustrates the "concentration" of the distribution in different "directions"; the "volume" of the ellipsoid is proportional to the square root of the generalized variance. For r < n, the probability distribution is singular: its spectrum (Sec. 18.4-7) is restricted to an r-dimensional linear manifold (straight line, plane, hyperplane) in the n-dimensional space of "points" (x_1, x_2, . . . , x_n), and the same is true for its ellipsoid of concentration. Thus the spectrum of a two-dimensional probability distribution is restricted to a straight line if r = 1, and to a point if r = 0.
18.4-9. Regression. Multiple and Partial Correlation Coefficients (see also Secs. 18.4-6 and 19.7-2). (a) Given the joint distribution of n random variables x_1, x_2, . . . , x_n, one may study the dependence of one of the variables, say x_1, on the remaining n − 1 variables by writing

x_1 = g_1(x_2, x_3, \ldots, x_n) + h_1(x_1, x_2, \ldots, x_n) \qquad (18.4\text{-}33)

where h_1(x_1, x_2, . . . , x_n) is regarded as a correction term. The function

g_1(x_2, x_3, \ldots, x_n) = E\{x_1|x_2, x_3, \ldots, x_n\} = \begin{cases} \sum_{X_1} X_1\,p_{1|23\ldots n}(X_1|x_2, x_3, \ldots, x_n) & \text{if } x_1 \text{ is discrete} \\ \int_{-\infty}^{\infty} x_1\,\varphi_{1|23\ldots n}(x_1|x_2, x_3, \ldots, x_n)\,dx_1 & \text{if } x_1 \text{ is continuous} \end{cases} \qquad (18.4\text{-}34)

(mean-square regression of x_1 on x_2, x_3, . . . , x_n) minimizes the mean-square deviation E{[x_1 − g_1(x_2, x_3, . . . , x_n)]^2}; E{x_1|X_2, X_3, . . . , X_n} is the conditional mean of x_1 relative to the hypothesis that x_2 = X_2, x_3 = X_3, . . . , x_n = X_n (see also Sec. 18.4-5c).
(b) The mean-square regression of any variable x_i on the remaining n − 1 variables is often approximated by the linear function

g_i^{(1)} = \xi_i + \sum_{k \ne i} \beta_{ik}(x_k - \xi_k) \quad \text{(linear mean-square regression of } x_i \text{ on the remaining } n-1 \text{ variables)} \qquad (18.4\text{-}35)

(see also Sec. 18.4-6).* The regression coefficients β_{ik} are uniquely determined if the distribution is nonsingular (Sec. 18.4-8c). The multiple correlation coefficient

\rho\{x_i, g_i^{(1)}\} = +\sqrt{1 - \frac{1}{\lambda_{ii}\Lambda_{ii}}} \qquad (18.4\text{-}36)

is a measure of the correlation between x_i and the remaining n − 1 variables.

(c) The random variable h_i^{(1)} = x_i − g_i^{(1)} (difference between x_i and its "linear estimate" g_i^{(1)}, for Λ_{ii} ≠ 0) is the residual of x_i with respect to the remaining n − 1 variables. Note

\mathrm{Cov}\{h_i^{(1)}, x_k\} = 0 \quad (i \ne k) \qquad \mathrm{Var}\{h_i^{(1)}\} = 1/\Lambda_{ii} \quad \text{(residual variance)} \qquad (18.4\text{-}37)

(d) Regressions and residuals may be similarly defined in connection with a suitable marginal distribution (Sec. 18.4-7a) of m < n variables, say x_1, x_2, . . . , x_m. The quantities analogous to β_{12}, β_{13}, . . . ; h_1^{(1)}, h_2^{(1)}, . . . are then respectively denoted by β_{12·34...m}, β_{13·24...m}, . . . ; h_{1·23...m}^{(1)}, h_{2·13...m}^{(1)}, . . . ; in each case, there is a subscript corresponding to each variable of the marginal distribution.
(e) The partial correlation coefficient of x_1 and x_2 with respect to x_3, x_4, . . . , x_n,

\rho_{12 \cdot 34 \ldots n} = \rho\{h_{1 \cdot 34 \ldots n}^{(1)},\, h_{2 \cdot 34 \ldots n}^{(1)}\} = -\frac{\Lambda_{12}}{\sqrt{\Lambda_{11}\Lambda_{22}}} \qquad (18.4\text{-}38)

measures the correlation of x_1 and x_2 after removal of the linearly approximated effects of x_3, x_4, . . . , x_n. In particular, for n = 3,

\rho_{12 \cdot 3} = \frac{\rho_{12} - \rho_{13}\rho_{23}}{\sqrt{(1 - \rho_{13}^2)(1 - \rho_{23}^2)}}
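The three-variable formula above is simple enough to sketch directly (correlation values below are illustrative):

```python
# Partial correlation (18.4-38) for n = 3:
# rho_{12.3} = (rho12 - rho13*rho23) / sqrt((1 - rho13**2)(1 - rho23**2))
from math import sqrt

def partial_corr(rho12, rho13, rho23):
    return (rho12 - rho13 * rho23) / sqrt((1 - rho13**2) * (1 - rho23**2))

# if x1 and x2 are correlated only through x3 (rho12 = rho13*rho23),
# the partial correlation vanishes:
assert abs(partial_corr(0.5 * 0.6, 0.5, 0.6)) < 1e-12
```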
18.4-10. Characteristic Functions (see also Sec. 18.3-8). The probability distribution of an n-dimensional random variable x ≡ (x_1, x_2, . . . , x_n) uniquely defines the corresponding characteristic function (joint characteristic function of x_1, x_2, . . . , x_n)

\chi_x(q) \equiv \chi_x(q_1, q_2, \ldots, q_n) \equiv E\left\{\exp\left(i\sum_{k=1}^{n} q_k x_k\right)\right\} \equiv \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} \exp\left(i\sum_{k=1}^{n} q_k X_k\right) d\Phi_x(X_1, X_2, \ldots, X_n) \qquad (18.4\text{-}39)

and conversely. For continuous distributions,

\varphi_x(X_1, X_2, \ldots, X_n) = \frac{1}{(2\pi)^n}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} \exp\left(-i\sum_{k=1}^{n} q_k X_k\right) \chi_x(q_1, q_2, \ldots, q_n)\,dq_1\,dq_2 \cdots dq_n \qquad (18.4\text{-}40)

The joint characteristic function corresponding to the marginal distribution of m < n of the n variables x_1, x_2, . . . , x_n is obtained by substitution of q_k = 0 in Eq. (39) whenever x_k does not occur in the marginal distribution; thus χ_{12}(q_1, q_2) = χ_x(q_1, q_2, 0, . . . , 0).

Moments and semi-invariants of suitable multidimensional probability distributions can be obtained as coefficients in multiple-series expansions of χ_x and log_e χ_x in a manner analogous to that of Sec. 18.3-10.

* See footnote to Sec. 18.4-8b.
18.4-11. Statistically Independent Random Variables (see also Secs. 18.2-3 and 18.5-7).* (a) A set of random variables x_1, x_2, . . . , x_n are statistically independent if and only if the events [x_1 ∈ S_1], [x_2 ∈ S_2], . . . , [x_n ∈ S_n] are statistically independent for every collection of real-number sets S_1, S_2, . . . , S_n. This is true if and only if

\Phi(X_1, X_2, \ldots, X_n) \equiv \Phi_1(X_1)\,\Phi_2(X_2) \cdots \Phi_n(X_n) \qquad (18.4\text{-}41)

or, in the respective cases of discrete and continuous random variables, if and only if

p(X_1, X_2, \ldots, X_n) \equiv p_1(X_1)\,p_2(X_2) \cdots p_n(X_n) \quad \text{or} \quad \varphi(X_1, X_2, \ldots, X_n) \equiv \varphi_1(X_1)\,\varphi_2(X_2) \cdots \varphi_n(X_n) \qquad (18.4\text{-}42)

The joint distribution of statistically independent random variables is completely defined by their individual marginal distributions. Statistically independent random variables x_1, x_2, . . . are uncorrelated, i.e., ρ_{ik} = 0 for all i ≠ k (Sec. 18.4-8c), but the converse is not necessarily true (see also Sec. 18.8-8).

(b) Statistical independence of multidimensional random variables x_1, x_2, . . . is defined by Eqs. (41) or (42) on substitution of the multidimensional variables x_1, x_2, . . . for x_1, x_2, . . . .

* See footnote to Sec. 18.3-4.
EXAMPLE: The multidimensional random variables (x_1, x_2) and (x_3, x_4, x_5) are statistically independent if and only if

\Phi_{12345}(x_1, x_2, x_3, x_4, x_5) \equiv \Phi_{12}(x_1, x_2)\,\Phi_{345}(x_3, x_4, x_5) \qquad (18.4\text{-}43)

Note that Eq. (43) implies the statistical independence of x_2 and x_5, x_1 and (x_3, x_4), (x_1, x_2) and (x_3, x_5), etc.

(c) Given a joint distribution of n discrete or continuous random variables x_1, x_2, . . . , x_n such that (x_1, x_2, . . . , x_m) is statistically independent of (x_{m+1}, x_{m+2}, . . . , x_n), note

p_{12\ldots m|m+1\ldots n}(X_1, X_2, \ldots, X_m|X_{m+1}, \ldots, X_n) \equiv p_{12\ldots m}(X_1, X_2, \ldots, X_m) \quad \text{or} \quad \varphi_{12\ldots m|m+1\ldots n}(X_1, X_2, \ldots, X_m|X_{m+1}, \ldots, X_n) \equiv \varphi_{12\ldots m}(X_1, X_2, \ldots, X_m) \qquad (18.4\text{-}44)

(d) Two random variables x_1 and x_2 are statistically independent if and only if their joint characteristic function is the product of their individual (marginal) characteristic functions (Sec. 18.4-10), i.e.,

\chi_{12}(q_1, q_2) = \chi_1(q_1)\,\chi_2(q_2) \qquad (18.4\text{-}45)

An analogous theorem applies for multidimensional random variables (see also Sec. 18.5-7).

(e) If the random variables x_1, x_2, . . . are statistically independent, the same is true for the random variables y_1(x_1), y_2(x_2), . . . . An analogous theorem holds for multidimensional random variables.
18.4-12. Entropy of a Probability Distribution, and Related Topics. (a) The entropy associated with the probability distribution of a one-dimensional random variable x is defined as

H\{x\} = \begin{cases} E\left\{\log_2 \dfrac{1}{p(x)}\right\} = -\displaystyle\sum_{x} p(x)\log_2 p(x) & (x \text{ discrete}) \\[2ex] E\left\{\log_2 \dfrac{1}{\varphi(x)}\right\} = -\displaystyle\int_{-\infty}^{\infty} \varphi(x)\log_2 \varphi(x)\,dx & (x \text{ continuous}) \end{cases} \qquad (18.4\text{-}46)

H{x} (entropy of x) is a measure of the expected uncertainty involved in a measurement of x.

In the case of discrete probability distributions, H{x} ≥ 0, with H{x} = 0 if and only if x has a causal distribution (Table 18.8-1). The continuous distribution having the largest entropy for a given variance σ² is the normal distribution (Sec. 18.8-3), with H{x} = \log_2\sqrt{2\pi e\sigma^2}.

(b) In connection with the discrete or continuous joint distribution of two random variables x_1, x_2, one defines the joint entropy

H\{x_1, x_2\} = \begin{cases} -E\{\log_2 p_{12}(x_1, x_2)\} & \text{(discrete distribution)} \\ -E\{\log_2 \varphi_{12}(x_1, x_2)\} & \text{(continuous distribution)} \end{cases} \qquad (18.4\text{-}47)

and the conditional entropies

H_{x_2}\{x_1\} = \begin{cases} -E\{\log_2 p_{1|2}(x_1|x_2)\} & \text{(discrete distribution)} \\ -E\{\log_2 \varphi_{1|2}(x_1|x_2)\} & \text{(continuous distribution)} \end{cases} \qquad \text{and } H_{x_1}\{x_2\} \text{ similarly} \qquad (18.4\text{-}48)

Note

H\{x_1, x_2\} = H\{x_2\} + H_{x_2}\{x_1\} = H\{x_1\} + H_{x_1}\{x_2\} \le H\{x_1\} + H\{x_2\} \qquad (18.4\text{-}49)

The equality on the right applies if and only if x_1 and x_2 are statistically independent (Sec. 18.4-11). The nonnegative quantity

I\{x_1, x_2\} = H\{x_1\} + H\{x_2\} - H\{x_1, x_2\} = H\{x_1\} - H_{x_2}\{x_1\} = H\{x_2\} - H_{x_1}\{x_2\} \qquad (18.4\text{-}50)

is a measure of the "statistical dependence" of x_1 and x_2. The functionals (46), (47), (48), and (50) have intuitive significance in statistical mechanics and in the theory of communications.
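The entropies and the dependence measure (18.4-50) can be sketched for a small discrete joint distribution (the table below is illustrative; its marginals are computed by hand):

```python
# Entropy and the dependence measure I{x1, x2} of (18.4-46)-(18.4-50)
# for a discrete joint distribution (toy table with known marginals).
from math import log2

joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}
p1 = {0: 0.5, 1: 0.5}      # marginal of x1
p2 = {0: 0.4, 1: 0.6}      # marginal of x2

H12 = -sum(p * log2(p) for p in joint.values())
H1  = -sum(p * log2(p) for p in p1.values())
H2  = -sum(p * log2(p) for p in p2.values())
I   = H1 + H2 - H12        # (18.4-50)

assert I >= -1e-12                # nonnegative
assert H12 <= H1 + H2 + 1e-12    # equality iff independent
```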
18.5. FUNCTIONS OF RANDOM VARIABLES. CHANGE OF VARIABLES

18.5-1. Introduction. The following relations permit one to calculate probability distributions of suitable functions of random variables and, in particular, to change the random variables employed to describe a given set of events.
18.5-2. Functions (or Transformations) of a One-dimensional Random Variable. (a) Given a transformation y = y(x) associating a unique value of a random variable y with each value of the random variable x, the probability distribution of y is uniquely determined by that of x [see also Sec. 18.2-8; y(x) must be a measurable function].

(b) Let the random variables x and y be related by a reciprocal one-to-one transformation y = y(x), with x = x(y). Then

1. If y(x) is an increasing function,

\Phi_y(Y) = \Phi_x[x(Y)] \qquad \Phi_x(X) = \Phi_y[y(X)] \qquad y_P = y(x_P) \qquad x_P = x(y_P) \quad (0 < P < 1) \qquad (18.5\text{-}1)

Note that either y(x) or −y(x) is necessarily an increasing function. In either case, the medians x_{1/2} and y_{1/2} are related by y_{1/2} = y(x_{1/2}).

2. If x and y are continuous random variables,

\varphi_y(Y) = \varphi_x[x(Y)]\left|\frac{dx}{dy}\right|_{y=Y} \qquad (18.5\text{-}2)

for all values Y of y such that dx/dy exists and is continuous.

Note: If x(y) is multiple-valued, one writes Eq. (2) for each branch x = x_k(y) and adds the contributions of the branches. EXAMPLE: If y = x²,

\varphi_y(Y) = \frac{1}{2\sqrt{Y}}\left[\varphi_x(\sqrt{Y}) + \varphi_x(-\sqrt{Y})\right] \quad (Y > 0) \qquad \varphi_y(Y) = 0 \quad (Y \le 0) \qquad (18.5\text{-}3)
(c) For single-valued, measurable y(x) and f(y),

E\{f(y)\} = E\{f[y(x)]\} = \int_{-\infty}^{\infty} f[y(x)]\,d\Phi_x(x) \qquad (18.5\text{-}4)

whenever this expected value exists; note that neither reciprocal one-to-one correspondence nor differentiability has been assumed for y(x). In particular, substitution of f(y) = e^{sy} in Eq. (4) yields the moment-generating function M_y(s) = E{e^{sy}}, and substitution of f(y) = e^{iqy} produces the characteristic function χ_y(q) = E{e^{iqy}} (Sec. 18.3-8). If the integrals can be calculated, one may then use Eq. (18.3-25) to find Φ_y(Y) or φ_y(Y).

EXAMPLE: Let y = sin (x + a), where a is a constant, and x is uniformly distributed between 0 and 2π. Then

\chi_y(q) = \frac{1}{2\pi}\int_0^{2\pi} e^{iq\,y(x)}\,dx = \frac{1}{\pi}\int_{-\pi/2}^{\pi/2} e^{iq\sin x}\,dx = \frac{1}{\pi}\int_{-1}^{1} \frac{e^{iqy}}{\sqrt{1 - y^2}}\,dy

where we have used the symmetry properties of sin x and the fact that dy = cos (x + a) dx = \sqrt{1 - y^2}\,dx. It follows that

\varphi_y(y) = \begin{cases} \dfrac{1}{\pi\sqrt{1 - y^2}} & (|y| < 1) \\ 0 & (|y| > 1) \end{cases}

(see also Sec. 18.11-1b).

(d) By an extension of the convolution theorem of Sec. 8.3-3 to bilateral Laplace transforms (Sec. 8.6-2), Eq. (4) can be rewritten as

E\{f(y)\} = \frac{1}{2\pi i}\int_{\sigma_1 - i\infty}^{\sigma_1 + i\infty} M_x(s)\left[\int_{-\infty}^{\infty} f[y(x)]\,e^{-sx}\,dx\right]ds \qquad (18.5\text{-}5)

where the integration contour parallels the imaginary axis in a suitable absolute-convergence strip; the quantity in square brackets is seen to be the bilateral Laplace transform of f[y(x)] (see also Sec. 8.6-2 and Table 8.6-1). The complex contour integral (5) may be easier to compute than the integral (4).

(e) Note that, in general, E{y(x)} ≠ y(E{x}) (see also Sec. 18.5-3).
18.5-3. Linear Functions (or Linear Transformations) of a One-dimensional Random Variable. (a) If x is a continuous random variable, and y = ax + b, then

\varphi_y(Y) = \frac{1}{|a|}\,\varphi_x\!\left(\frac{Y - b}{a}\right) \qquad (18.5\text{-}6)

(b) If the mean values in question exist,

E\{ax + b\} = a\,E\{x\} + b \qquad \mathrm{Var}\{ax + b\} = a^2\,\mathrm{Var}\{x\} \qquad (18.5\text{-}7)

E\{(ax + b)^r\} = a^r\alpha_r + \binom{r}{1}a^{r-1}b\,\alpha_{r-1} + \cdots + b^r
\chi_{ax+b}(q) = e^{ibq}\,\chi_x(aq) \qquad M_{ax+b}(s) = e^{bs}\,M_x(as) \qquad \gamma_{ax}(s) = \gamma_x(s^a) \qquad (18.5\text{-}8)

The semi-invariants (Sec. 18.3-9) κ_r' of y = ax + b are related to the semi-invariants κ_r of x by κ_1' = aκ_1 + b, κ_r' = a^r κ_r (r > 1).

(c) Of particular interest is the linear transformation to standard units

x' = \frac{x - \xi}{\sigma} \qquad (18.5\text{-}9)

with E{x'} = 0, Var {x'} = 1; x' is called a standardized random variable (see also Sec. 18.8-3).

(d) If y = y(x) is approximately linear throughout most of the spectrum of x, it is sometimes permissible to use the approximations

y \approx y(\xi) + (x - \xi)\,y'(\xi) \qquad E\{y(x)\} \approx y(\xi) \qquad \mathrm{Var}\{y(x)\} \approx [y'(\xi)]^2\,\mathrm{Var}\{x\} \qquad (18.5\text{-}10)

where y'(x) = dy/dx.
18.5-4. Functions and Transformations of Multidimensional Random Variables. (a) If the random variables

y_1 = y_1(x_1, x_2, \ldots, x_n) \qquad y_2 = y_2(x_1, x_2, \ldots, x_n) \qquad \cdots \qquad (18.5\text{-}11)

are single-valued measurable functions of the n random variables x_1, x_2, . . . , x_n for all x_1, x_2, . . . , x_n, then the probability distribution of each random variable y_i is uniquely determined by the joint distribution of x_1, x_2, . . . , x_n, and the same is true for each joint or conditional distribution involving a finite set of random variables y_i. Thus the distribution function of y_i and the joint distribution function of y_i and y_k are, respectively,

\Phi_{y_i}(Y_i) \equiv \int\!\cdots\!\int_{y_i(x_1, \ldots, x_n) \le Y_i} d\Phi_x(x_1, x_2, \ldots, x_n) \qquad (18.5\text{-}12)

\Phi_{y_i y_k}(Y_i, Y_k) \equiv \int\!\cdots\!\int_{\substack{y_i(x_1, \ldots, x_n) \le Y_i \\ y_k(x_1, \ldots, x_n) \le Y_k}} d\Phi_x(x_1, x_2, \ldots, x_n) \qquad (18.5\text{-}13)

(b) If x ≡ (x_1, x_2, . . . , x_n) and y ≡ (y_1, y_2, . . . , y_n) are continuous random variables related by a reciprocal one-to-one (nonsingular) transformation (11), their respective frequency functions are related by

\varphi_y(Y_1, Y_2, \ldots, Y_n) = \varphi_x(X_1, X_2, \ldots, X_n)\left|\frac{\partial(x_1, x_2, \ldots, x_n)}{\partial(y_1, y_2, \ldots, y_n)}\right| \qquad \text{where} \quad X_i = x_i(Y_1, Y_2, \ldots, Y_n) \quad (i = 1, 2, \ldots, n) \qquad (18.5\text{-}14)

for all Y_1, Y_2, . . . , Y_n such that the Jacobian exists and is continuous. If x(y) is multiple-valued, the contributions of the separate branches are added as in Sec. 18.5-2b.
(c) For single-valued, measurable y_i = y_i(x_1, x_2, . . . , x_n) (i = 1, 2, . . . , m) and f(y_1, y_2, . . . , y_m),

E\{f(y_1, y_2, \ldots, y_m)\} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} f\,d\Phi_x(x_1, x_2, \ldots, x_n) \qquad (18.5\text{-}15)

whenever this expected value exists. As in Sec. 18.5-2c, neither reciprocal one-to-one correspondence nor differentiability has been assumed. Substitution of f = exp (s_1y_1 + s_2y_2 + · · · + s_my_m) yields the joint moment-generating function of y_1, y_2, . . . , y_m, and substitution of f = exp (iq_1y_1 + iq_2y_2 + · · · + iq_my_m) yields the joint characteristic function. Transform methods analogous to Eq. (5) may be useful. Such methods have been successfully applied to special random-process problems (Sec. 18.12-5).

Table 18.5-1. Distribution of the Sum x = x_1 + x_2 + · · · + x_n of n Independent Random Variables (see also Secs. 18.5-7, 18.6-5, and 19.3-3)

The distribution of a sum x = x_1 + x_2 + · · · + x_n of n statistically independent random variables x_1, x_2, . . . , x_n with mean values ξ_1, ξ_2, . . . , ξ_n and variances σ_1², σ_2², . . . , σ_n². Let ξ = E{x} = ξ_1 + ξ_2 + · · · + ξ_n, and let σ² = Var {x} = σ_1² + σ_2² + · · · + σ_n²; let μ_2, μ_3, μ_4, . . . be the central moments of the sum x. φ_x(x) can be approximated by the Gram-Charlier series of Sec. 19.3-3 if the series converges (true, e.g., whenever |x| > a implies φ_x = 0 for some finite real a > 0, as for all physical variables). In this case, Eq. (18.8-6) implies

\varphi_x(x) \approx \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\xi)^2/2\sigma^2}\left[1 + \frac{\gamma_1}{3!}(x'^3 - 3x') + \frac{\gamma_2}{4!}(x'^4 - 6x'^2 + 3) + \cdots\right] \qquad \left(x' = \frac{x - \xi}{\sigma}\right)

where γ_1 = μ_3/σ³, γ_2 = μ_4/σ⁴ − 3 are the coefficients of skewness and excess of x (Table 18.3-1). The normal approximation is especially good if each x_i is symmetrically distributed about ξ_i, so that γ_1 = 0.
(d) For any two random variables x_1, x_2,

E\{x_1 x_2\} = E\{x_1\}\,E\{x_2\} + \mathrm{Cov}\{x_1, x_2\} \qquad (18.5\text{-}16)

if this quantity exists. If x_1, x_2, . . . , x_n are statistically independent, then

E\{x_1 x_2 \cdots x_n\} = E\{x_1\}\,E\{x_2\} \cdots E\{x_n\} \qquad (18.5\text{-}17)

if this quantity exists.

(e) If y = x_1x_2 and x_1, x_2 are continuous random variables,

\varphi_y(Y) = \int_{-\infty}^{\infty} \varphi\!\left(x_1, \frac{Y}{x_1}\right)\frac{1}{|x_1|}\,dx_1

Other suitable functions y = y(x_1, x_2) can be treated in a similar manner.

18.5-5. Linear Transformations (see also Secs. 14.5-1 and 14.6-1). For every nonsingular linear transformation

y_i = \eta_i + \sum_{k=1}^{n} a_{ik}(x_k - \xi_k) \quad (i = 1, 2, \ldots, n) \qquad [\text{in matrix form} \quad y = \eta + A(x - \xi)] \qquad (18.5\text{-}18)

the respective joint distributions of x_1, x_2, . . . , x_n and y_1, y_2, . . . , y_n are of equal rank (Sec. 18.4-8c), and

E\{y_i\} = \eta_i \quad (i = 1, 2, \ldots, n) \qquad [\text{in matrix form} \quad E\{y\} = \eta] \qquad (18.5\text{-}19)

\lambda'_{ik} = E\{(y_i - \eta_i)(y_k - \eta_k)\} = \sum_{j=1}^{n}\sum_{m=1}^{n} a_{ij}\,a_{km}\,\lambda_{jm} \quad (i, k = 1, 2, \ldots, n) \qquad [\text{in matrix form} \quad \Lambda' = A\Lambda\tilde{A}] \qquad (18.5\text{-}20)

if the quantities in question exist. Λ' = [λ'_{ik}] is the moment matrix (Sec. 18.4-8c) of (y_1, y_2, . . . , y_n).

The methods of Sec. 13.5-5 make it possible to find

1. An orthogonal transformation (18) such that the new moment matrix [λ'_{ik}] (and hence also the correlation matrix [ρ'_{ik}]) is diagonal (transformation to uncorrelated variables y_i).
2. A transformation (18) such that η_1 = η_2 = · · · = η_n = 0 and λ'_{ik} = δ_{ik} (transformation to uncorrelated standardized variables y_i; see also Secs. 18.8-6b and 18.8-8). The matrix [E{x_i x_k}] must be nonsingular.
18.5-6. Mean and Variance of a Sum of Random Variables. (a) For any two (not necessarily statistically independent) random variables x_1, x_2,

E\{x_1 \pm x_2\} = E\{x_1\} \pm E\{x_2\} = \xi_1 \pm \xi_2
\mathrm{Var}\{x_1 \pm x_2\} = \mathrm{Var}\{x_1\} + \mathrm{Var}\{x_2\} \pm 2\,\mathrm{Cov}\{x_1, x_2\} = \sigma_1^2 + \sigma_2^2 \pm 2\rho_{12}\sigma_1\sigma_2 \qquad (18.5\text{-}21)

if the quantities in question exist.

(b) More generally,

E\left\{a_0 + \sum_{i=1}^{n} a_i x_i\right\} = a_0 + \sum_{i=1}^{n} a_i\xi_i
\mathrm{Var}\left\{a_0 + \sum_{i=1}^{n} a_i x_i\right\} = \sum_{i=1}^{n}\sum_{k=1}^{n} a_i a_k \rho_{ik}\sigma_i\sigma_k \quad \text{(general variance law)} \qquad (18.5\text{-}22)

(c) If y = y(x_1, x_2, . . . , x_n) is approximately linear throughout most of the joint spectrum of (x_1, x_2, . . . , x_n), it may be permissible to use the approximation

y \approx y(\xi_1, \xi_2, \ldots, \xi_n) + \sum_{i=1}^{n}\left[\frac{\partial y}{\partial x_i}\right]_{x_1 = \xi_1, \ldots, x_n = \xi_n}(x_i - \xi_i)

and to compute approximate values of E{y} and Var {y} by means of Eqs. (19) and (20) (see also Sec. 18.5-7).
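The general variance law (18.5-22) can be checked against a direct computation on a small discrete joint distribution (the joint table and coefficients below are ad hoc):

```python
# General variance law (18.5-22): Var{a0 + sum a_i x_i} =
# sum_i sum_k a_i a_k Cov{x_i, x_k}, checked against a direct computation
# on a small discrete joint distribution.
joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}
a0, a1, a2 = 2.0, 3.0, -1.0

def E(f):
    return sum(p * f(u, v) for (u, v), p in joint.items())

m1, m2 = E(lambda u, v: u), E(lambda u, v: v)
cov = [[E(lambda u, v: ([u, v][i] - [m1, m2][i]) * ([u, v][k] - [m1, m2][k]))
        for k in range(2)] for i in range(2)]

lhs = E(lambda u, v: (a0 + a1*u + a2*v
                      - (a0 + a1*m1 + a2*m2))**2)       # direct variance
rhs = sum(ai * ak * cov[i][k]
          for i, ai in enumerate([a1, a2])
          for k, ak in enumerate([a1, a2]))

assert abs(lhs - rhs) < 1e-12
```

Note that the additive constant a_0 drops out of the variance, as the law requires.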
18.5-7. Sums of Statistically Independent Random Variables (refer to Sec. 18.8-9 for examples). (a) If x_1 and x_2 are statistically independent random variables, then

\Phi_{x_1+x_2}(X) \equiv \Phi_1(X) * \Phi_2(X) \equiv \int_{-\infty}^{\infty} \Phi_2(X - x_1)\,d\Phi_1(x_1) \qquad \chi_{x_1+x_2}(q) \equiv \chi_1(q)\,\chi_2(q) \qquad (18.5\text{-}23)

p_{x_1+x_2}(X) = p_1(X) * p_2(X) \equiv \sum_{X_1} p_2(X - x_1)\,p_1(x_1) = \sum_{X_2} p_1(X - x_2)\,p_2(x_2) \quad (x_1, x_2 \text{ discrete})

\varphi_{x_1+x_2}(X) = \varphi_1(X) * \varphi_2(X) \equiv \int_{-\infty}^{\infty} \varphi_2(X - x_1)\,\varphi_1(x_1)\,dx_1 \quad (x_1, x_2 \text{ continuous})

(b) More generally, if x = x_1 + x_2 + · · · + x_n is the sum of n < ∞ statistically independent random variables x_1, x_2, . . . , x_n,

\Phi_x(X) \equiv \Phi_1(X) * \Phi_2(X) * \cdots * \Phi_n(X) \qquad \chi_x(q) \equiv \chi_1(q)\,\chi_2(q) \cdots \chi_n(q) \qquad (18.5\text{-}24)

and, if the quantities in question exist,

p_x(X) = p_1(X) * p_2(X) * \cdots * p_n(X) \qquad \varphi_x(X) = \varphi_1(X) * \varphi_2(X) * \cdots * \varphi_n(X) \qquad (18.5\text{-}25)

M_x(s) \equiv M_1(s)\,M_2(s) \cdots M_n(s) \qquad \gamma_x(s) \equiv \gamma_1(s)\,\gamma_2(s) \cdots \gamma_n(s) \qquad (18.5\text{-}26)

E\{x\} = \xi = \sum_{i=1}^{n} \xi_i \qquad \mathrm{Var}\{x\} = \sigma^2 = \sum_{i=1}^{n} \mathrm{Var}\{x_i\} = \sum_{i=1}^{n} \sigma_i^2 \qquad (18.5\text{-}27)

E\{(x - \xi)^3\} = \sum_{i=1}^{n} E\{(x_i - \xi_i)^3\} \qquad (18.5\text{-}28)

\kappa_r = \sum_{i=1}^{n} \kappa_r^{(i)} \qquad (18.5\text{-}29)

where κ_r^{(i)} is the rth-order semi-invariant of x_i. Equations (24) and (26) permit the computation of higher-order moments with the aid of the relations given in Sec. 18.3-10.

(c) The distribution of the sum z ≡ (z_1, z_2, . . .) = x + y of two suitable statistically independent multidimensional random variables x ≡ (x_1, x_2, . . .) and y ≡ (y_1, y_2, . . .) is described by

\Phi_z(Z_1, Z_2, \ldots) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} \Phi_y(Z_1 - x_1,\ Z_2 - x_2,\ \ldots)\,d\Phi_x(x_1, x_2, \ldots) \qquad (18.5\text{-}30)

\chi_z(q_1, q_2, \ldots) \equiv \chi_x(q_1, q_2, \ldots)\,\chi_y(q_1, q_2, \ldots) \qquad (18.5\text{-}31)
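The discrete convolution of (18.5-23) is a direct computation; a minimal sketch, convolving a fair coin with the distribution of the sum of two fair coins:

```python
# Distribution of a sum of independent discrete variables by convolution
# (18.5-23): p_{x1+x2}(X) = sum_{x1} p2(X - x1) p1(x1).
p1 = {0: 0.5, 1: 0.5}            # fair coin
p2 = {0: 0.25, 1: 0.5, 2: 0.25}  # sum of two fair coins

def convolve(pa, pb):
    out = {}
    for xa, qa in pa.items():
        for xb, qb in pb.items():
            out[xa + xb] = out.get(xa + xb, 0.0) + qa * qb
    return out

psum = convolve(p1, p2)
# x1 + x2 is then the sum of three fair coins: binomial(3, 1/2)
assert abs(psum[0] - 0.125) < 1e-12
assert abs(psum[1] - 0.375) < 1e-12
assert abs(sum(psum.values()) - 1.0) < 1e-12
```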
18.5-8. Compound Distributions. Let $x_1, x_2, \ldots$ be independent random variables each having the same probability distribution, and let $k$ be a discrete random variable with spectral values 0, 1, 2, . . . ; let $k$ be statistically independent of $x_1, x_2, \ldots$. If the generating functions $\gamma_{x_1}(s)$ and $\gamma_k(s)$ exist, the distribution of the sum $x = x_1 + x_2 + \cdots + x_k$ is given by its generating function

$$\gamma_x(s)\equiv\gamma_k[\gamma_{x_1}(s)]\tag{18.5-32}$$
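Equation (18.5-32) can be checked numerically for one classical special case: if $k$ is a Poisson variable and each $x_i$ is a Bernoulli variable, the compound ("thinned") sum is again Poisson. The parameter values below are assumptions chosen for illustration:

```python
import math

# Sketch of Eq. (18.5-32): compose the generating function of k with that of x1.
lam, theta = 3.0, 0.4            # illustrative parameters (assumed)

def pgf_poisson(s, xi):
    """Generating function of a Poisson variable with mean xi."""
    return math.exp(xi * (s - 1.0))

def pgf_bernoulli(s, th):
    """Generating function of a Bernoulli variable with success probability th."""
    return (1.0 - th) + th * s

def pgf_compound(s):
    # gamma_x(s) = gamma_k[gamma_x1(s)]
    return pgf_poisson(pgf_bernoulli(s, theta), lam)
```

For every $s$, `pgf_compound(s)` equals the generating function of a Poisson variable with mean $\lambda\theta$, which identifies the compound distribution.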
18.6. CONVERGENCE IN PROBABILITY AND LIMIT THEOREMS
18.6-1. Sequences of Probability Distributions. Convergence in Probability (see also Sec. 18.6-2). A sequence of random variables $y_1, y_2, \ldots$ converges in probability to the random variable $y$ ($y_n$ converges in probability to $y$ as $n \to \infty$) if and only if the probability that $y_n$ differs from $y$ by any finite amount converges to zero as $n \to \infty$; i.e., $y_n \xrightarrow{\text{in }p} y$ as $n \to \infty$ if and only if

$$\lim_{n\to\infty}P[|y-y_n|>\epsilon]=0\qquad\text{for all }\epsilon>0\tag{18.6-1}$$
An $m$-dimensional random variable $\mathbf{y}_n$ converges in probability to the $m$-dimensional random variable $\mathbf{y}$ as $n \to \infty$ if and only if each component variable of $\mathbf{y}_n$ converges in probability to the corresponding component variable of $\mathbf{y}$. If the $m$ random variables $y_{n1}, y_{n2}, \ldots, y_{nm}$ converge in probability to the respective constants $a_1, a_2, \ldots, a_m$ as $n \to \infty$, then any function $g(y_{n1}, y_{n2}, \ldots, y_{nm})$ expressible as a positive power of a rational function of $y_{n1}, y_{n2}, \ldots, y_{nm}$ converges in probability to $g(a_1, a_2, \ldots, a_m)$, provided that this quantity is finite.
18.6-2. Limits of Distribution Functions, Characteristic Functions, and Generating Functions. Continuity Theorems. (a) $y_n$ converges in probability to $y$ as $n \to \infty$ if and only if the sequence of distribution functions $\Phi_{y_n}(Y)$ converges to the limit $\Phi_y(Y)$ for all $Y$ such that $\Phi_y(Y)$ is continuous.

(b) $y_n$ converges in probability to $y$ as $n \to \infty$ if and only if the sequence of characteristic functions $\chi_{y_n}(q)$ converges to a limit continuous for $q = 0$; in this case $\chi_y(q) = \lim_{n\to\infty}\chi_{y_n}(q)$ (Continuity Theorem for Characteristic Functions).

(c) A sequence of discrete random variables $y_1, y_2, \ldots$ converges in probability to the discrete random variable $y$ as $n \to \infty$ if and only if

$$\lim_{n\to\infty}p_{y_n}(Y)\equiv p_y(Y)\tag{18.6-2}$$

If the random variables $y_1, y_2, \ldots$ all have nonnegative integral spectral values 0, 1, 2, . . . and possess generating functions $\gamma_{y_1}(s), \gamma_{y_2}(s), \ldots$, then Eq. (2) holds if and only if $\lim_{n\to\infty}\gamma_{y_n}(s) = \gamma_y(s)$ for all real $s$ such that $0 < s < 1$ (Continuity Theorem for Generating Functions). Note that a sequence of discrete random variables may converge in probability to a random variable which is not discrete (see, for example, Table 18.8-3).
(d) Analogous definitions apply if y(n) converges in probability as a function of a continuous parameter n.
(e) Analogous theorems apply to multidimensional probability distributions.
18.6-3. Convergence in Mean (see also Sec. 12.5-3). Given a random variable $y$ having a finite mean and variance and a sequence of random variables $y_1, y_2, \ldots$ all having finite mean values and variances, $y_n$ converges in mean (in mean square) to $y$ as $n \to \infty$ if and only if

$$\lim_{n\to\infty}E\{y-y_n\}=0\qquad \lim_{n\to\infty}\operatorname{Var}\{y-y_n\}=0\tag{18.6-3}$$

Convergence in mean implies convergence in probability, but the converse is not true; $y_n \xrightarrow{\text{in }p} y$ as $n \to \infty$ does not even imply that $E\{y\}$ or $\operatorname{Var}\{y\}$ exists.
18.6-4. Asymptotically Normal Probability Distributions (refer to Table 18.8-3 and Sec. 19.5-3 for examples). The (probability distribution of a) random variable $y_n$ with the distribution function $\Phi_{y_n}(Y, n)$ is asymptotically normal with mean $\eta_n$ and variance $\sigma_n^2$ if and only if there exists a sequence of pairs of real numbers $\eta_n, \sigma_n$ such that the distribution of the standardized random variable $(y_n - \eta_n)/\sigma_n$ converges to a standardized normal distribution, i.e., if and only if

$$\lim_{n\to\infty}P[\eta_n+a\sigma_n<y_n\le\eta_n+b\sigma_n]=\Phi_u(b)-\Phi_u(a)\tag{18.6-4}$$

for every pair of real numbers $a$, $b$ with $a < b$. Equation (4) permits one to approximate the probability distribution of $y_n$ by a normal distribution with mean $\eta_n$ and variance $\sigma_n^2$.
18.6-5. Limit Theorems. (a) For every class of events $E$ permitting the definition of probabilities $P[E]$ (Sec. 18.2-2):

1. The relative frequency $h[E] = n_E/n$ (Sec. 19.2-1) of realizing the event $E$ in $n$ independent repeated trials (Sec. 18.2-4) is a random variable which converges to $P[E]$ in mean, and thus also in probability, as $n \to \infty$ (Bernoulli's Theorem).

2. $h[E]$ is asymptotically normal with mean $P[E]$ and variance $\frac{1}{n}P[E]\{1 - P[E]\}$ (see also Table 18.8-3). Note that

$$P(n_E)=\binom{n}{n_E}\{P[E]\}^{n_E}\{1-P[E]\}^{n-n_E}\qquad(n_E=0,1,2,\ldots,n)$$
$$E\left\{\frac{n_E}{n}\right\}=P[E]\qquad \operatorname{Var}\left\{\frac{n_E}{n}\right\}=\frac{1}{n}P[E]\{1-P[E]\}\tag{18.6-5}$$
(b) Let $x_1, x_2, \ldots$ be a sequence of statistically independent random variables all having the same probability distribution with (finite) mean value $\xi$. Then, as $n \to \infty$,

1. The random variable $\bar{x} = \frac{1}{n}(x_1 + x_2 + \cdots + x_n)$ converges in probability to $\xi$ (Khinchine's Theorem, Law of Large Numbers).

2. $\bar{x}$ is asymptotically normal with mean $\xi$ and variance $\sigma^2/n$, provided that the common variance $\sigma^2$ of $x_1, x_2, \ldots$ exists (Lindeberg-Lévy Theorem, Central Limit Theorem; see also Secs. 19.2-3 and 19.5-2).
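The Law of Large Numbers is easy to watch in a simulation. This sketch (the uniform population, seed, and sample size are assumptions chosen for illustration) draws $n$ independent uniform variables with $\xi = 0.5$ and forms the sample mean:

```python
import random
import statistics

# Simulation sketch of Khinchine's Theorem: the sample mean of n independent,
# identically distributed variables clusters around the common mean xi.
random.seed(0)
n = 100_000
xs = [random.random() for _ in range(n)]   # uniform on (0, 1), so xi = 0.5
xbar = statistics.fmean(xs)
```

For this population $\sigma^2 = 1/12$, so by the Lindeberg-Lévy theorem the deviation `xbar - 0.5` is approximately normal with standard deviation $\sqrt{1/(12n)} \approx 0.0009$.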
(c) Let $x_1, x_2, \ldots$ be any sequence of statistically independent random variables having (finite) mean values $\xi_1, \xi_2, \ldots$ and variances $\sigma_1^2, \sigma_2^2, \ldots$. Then, as $n \to \infty$,

1. If the $\sigma_i^2$ are uniformly bounded, $\dfrac{1}{n}\displaystyle\sum_{i=1}^n x_i-\dfrac{1}{n}\displaystyle\sum_{i=1}^n\xi_i \xrightarrow{\text{in }p} 0$ (Chebyshev's Theorem).

2. $\dfrac{1}{n}\displaystyle\sum_{i=1}^n x_i-\dfrac{1}{n}\displaystyle\sum_{i=1}^n\xi_i=\dfrac{1}{n}\displaystyle\sum_{i=1}^n(x_i-\xi_i)\xrightarrow{\text{in }p}0$ provided that

$$\lim_{n\to\infty}\frac{1}{n^2}\sum_{i=1}^n\sigma_i^2=0\qquad\text{(Law of Large Numbers)}$$

3. The random variable $\displaystyle\sum_{i=1}^n x_i$ is asymptotically normal with mean $\displaystyle\sum_{i=1}^n\xi_i$ and variance $\displaystyle\sum_{i=1}^n\sigma_i^2$, provided that, for every positive real number $\epsilon$,

$$\lim_{n\to\infty}\frac{\operatorname{Var}\Big\{\sum_{i=1}^n z_i\Big\}}{\sum_{i=1}^n\sigma_i^2}=1\qquad\text{where }z_i=\begin{cases}x_i&\text{if }x_i^2<\epsilon\sum_{i=1}^n\sigma_i^2\\[4pt]0&\text{if }x_i^2\ge\epsilon\sum_{i=1}^n\sigma_i^2\end{cases}\tag{18.6-6}$$

(Central Limit Theorem, Lindeberg conditions). The Lindeberg conditions are satisfied, in particular, if there exist two positive real numbers $a$ and $b$ such that $E\{|x_i|^{2+a}\}$ exists and is less than $b$ for all $i$.
18.7. SPECIAL TECHNIQUES FOR SOLVING PROBABILITY PROBLEMS
18.7-1. Introduction. Most probability problems require one to compute the distribution of a random variable $x$ (or the distributions of several random variables) from given conditions specifying the distributions of other random variables $x_1, x_2, \ldots$. As a rule, the simple events labeled by values of $x$ are compound events corresponding to various logical combinations of values of $x_1, x_2, \ldots$. The first step in the solution of any such problem must be the unequivocal definition of the fundamental probability set labeled by each random variable. The probabilities of compound events may then be computed by the methods of Secs. 18.2-2 to 18.2-6 and 18.5-1 to 18.7-3. Equation (18.3-3), (18.3-6), (18.4-7), or (18.4-27) may be used to check computations.
18.7-2. Problems Involving Discrete Probability Distributions: Counting of Simple Events and Combinatorial Analysis. Each fundamental probability set labeled by the spectral values of a discrete random variable (Sec. 18.3-1) is a countable set of simple events. The following relations (either alone or in combination with the relations of Secs. 18.2-2 to 18.2-6) aid in computing probabilities of compound events:
(a) If, as in many games of chance, equal probabilities are assigned to each of the $N$ simple events of a given finite fundamental probability set, then the probability of realizing a compound event ("success") defined as the union (Sec. 18.2-1) of $N_1$ specified simple events ("favorable" simple events) can be computed as

$$\text{Probability of success}=\frac{\text{number of favorable simple events}}{\text{total number of simple events}}=\frac{N_1}{N}\tag{18.7-1}$$

(b) Given a countable (finite or infinite) fundamental probability set, let an event $E$ be defined as the union of $N_1$ simple events each having the probability $p_1$, $N_2$ simple events each having the probability $p_2$, . . . ; then

$$P[E]=N_1p_1+N_2p_2+\cdots\tag{18.7-2}$$

$N_1 + N_2 + \cdots$ need not be finite.
(c) Given $N_1$ simple events $E'$, $N_2$ simple events $E''$, . . . , and $N_n$ simple events $E^{(n)}$ respectively associated with $n$ independent component experiments (Sec. 18.2-4), there exist exactly $N_1N_2\cdots N_n$ simple events $[E'\cap E''\cap\cdots\cap E^{(n)}] = [E', E'', \ldots, E^{(n)}]$. (d) In many problems, the simple events under consideration are various possible arrangements of a given set or sets of elements, so that the numbers $N_1, N_2, \ldots$ in (a), (b), and (c) above are numbers of permutations, combinations, etc. The most important relevant definitions and formulas are given in Appendix C.
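Equation (18.7-1) can be sketched for a classical game-of-chance example (the two-dice setup is an illustration, not from the text): enumerate the equally likely simple events and count the favorable ones.

```python
from itertools import product

# Eq. (18.7-1): 36 equally likely simple events of rolling two dice;
# the compound event ("success") is "the sum of the two faces is 7".
outcomes = list(product(range(1, 7), repeat=2))
favorable = [o for o in outcomes if sum(o) == 7]
p_success = len(favorable) / len(outcomes)      # N1 / N
```

Here $N = 36$ and $N_1 = 6$, so the probability of success is $1/6$.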
18.7-3. Problems Involving Discrete Probability Distributions: Successes and Failures in Component Experiments. Compound events are often described in terms of the results obtained in component experiments each admitting only two possible outcomes ("success" and "failure"). The probabilities of various compound events can be computed by the methods of Secs. 18.2-2 to 18.2-6 from the respective probabilities $\vartheta_1, \vartheta_2, \ldots$ of success in the first, second, . . . component experiment.

The methods of Secs. 18.5-6 to 18.5-8 may become applicable if one labels the events "success" and "failure" in the $k$th component experiment with the respective spectral values 1 and 0 of a discrete random variable $x_k$ whose distribution is described by

$$p_{x_k}(1)=\vartheta_k\qquad p_{x_k}(0)=1-\vartheta_k\qquad E\{x_k\}=\vartheta_k\qquad \operatorname{Var}\{x_k\}=\vartheta_k(1-\vartheta_k)\tag{18.7-3}$$
$$\chi_{x_k}(q)\equiv(1-\vartheta_k)+\vartheta_ke^{iq}\qquad M_{x_k}(s)\equiv(1-\vartheta_k)+\vartheta_ke^{s}\tag{18.7-4}$$
$$\gamma_{x_k}(s)\equiv(1-\vartheta_k)+\vartheta_ks\tag{18.7-5}$$
Successes in two or more independent experiments are, by definition, statistically independent events (Sec. 18.2-4). Repeated independent trials (Sec. 18.2-4) each having only two possible outcomes are called Bernoulli trials ($\vartheta_1 = \vartheta_2 = \cdots = \vartheta$). The probability of realizing exactly $x = x_1 + x_2 + \cdots + x_n$ successes in $n$ Bernoulli trials is given by the binomial distribution (Table 18.8-3). If the trials are independent, but the $\vartheta_k$ are not all equal, one obtains the generalized binomial distribution of Poisson.
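The binomial probabilities for $n$ Bernoulli trials can be computed directly (the parameter values below are assumptions chosen for illustration):

```python
from math import comb

# Probability of exactly x successes in n Bernoulli trials with per-trial
# success probability theta (Table 18.8-3).
def binomial_pmf(x, n, theta):
    """p(x) = C(n, x) * theta**x * (1 - theta)**(n - x)."""
    return comb(n, x) * theta**x * (1.0 - theta)**(n - x)

n, theta = 10, 0.3                               # illustrative parameters
pmf = [binomial_pmf(x, n, theta) for x in range(n + 1)]
mean = sum(x * p for x, p in enumerate(pmf))     # should equal n * theta
```

The pmf sums to 1 and its mean equals $n\vartheta$, in agreement with Table 18.8-3(b).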
A subsequence of $r$ successes or failures in any sequence of $n$ trials is called a run of length $r$ of successes or failures (see also Ref. 18.4, Chap. 13).

18.8. SPECIAL PROBABILITY DISTRIBUTIONS
18.8-1. Discrete One-dimensional Probability Distributions.* Tables 18.8-1 to 18.8-7 describe a number of discrete one-dimensional distributions of interest, for instance, in connection with sampling problems and games of chance. The generating function rather than the characteristic function or the moment-generating function is tabulated; the latter two functions are easily obtained from

$$\chi_x(q)\equiv\gamma_x(e^{iq})\qquad M_x(s)\equiv\gamma_x(e^{s})$$

(see also Sec. 18.3-8). Moments not tabulated are also easily derived by the methods of Sec. 18.3-10.

Table 18.8-1. The Causal Distribution (see also Table 18.8-8)
(a) $p(x)=\delta_x^{\xi}=\begin{cases}1&\text{if }x=\xi\\0&\text{if }x\ne\xi\end{cases}$  ($x = 0, \pm1, \pm2, \ldots$; $\xi$ is a real integer)

(b) $E\{x\}=\xi$  $\operatorname{Var}\{x\}=0$  $\gamma_x(s)=s^{\xi}$
Table 18.8-2. The Hypergeometric Distribution

(a) $p(x)=\dfrac{\dbinom{N_1}{x}\dbinom{N-N_1}{n-x}}{\dbinom{N}{n}}$  ($x = 0, 1, 2, \ldots, n$; $N > N_1 = \vartheta N > 0$)

(b) $E\{x\}=n\dfrac{N_1}{N}=n\vartheta$

$\operatorname{Var}\{x\}=n\dfrac{N_1(N-N_1)}{N^2}\left(1-\dfrac{n-1}{N-1}\right)=n\vartheta(1-\vartheta)\left(1-\dfrac{n-1}{N-1}\right)$

(c) Typical Interpretation. $p(x)$ is the probability that a random sample of size $n$ without replacement (Sec. 19.5-5) contains exactly $x$ objects of type 1, if the sample is taken from a population of $N$ objects of which $N_1 = \vartheta N$ are of type 1.

(d) Approximations. As $N \to \infty$ while $n$ and $\vartheta = N_1/N$ remain fixed, the hypergeometric distribution approaches a binomial distribution (Table 18.8-3; sampling with and without replacement become approximately equivalent if $n/N$ is small). The binomial approximation is usually permissible if $n/N < 0.1$. The binomial distribution may, in turn, be suitable for approximation by a normal distribution (see Table 18.8-3) or by a Poisson distribution (Table 18.8-4).

* See footnote to Sec. 18.3-4.
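The binomial approximation in (d) can be checked numerically. The population and sample sizes below are assumptions chosen so that $n/N = 0.01 < 0.1$:

```python
from math import comb

# Sampling without replacement (hypergeometric) vs. with replacement (binomial).
def hypergeom_pmf(x, N, N1, n):
    return comb(N1, x) * comb(N - N1, n - x) / comb(N, n)

def binom_pmf(x, n, theta):
    return comb(n, x) * theta**x * (1.0 - theta)**(n - x)

N, n, theta = 1000, 10, 0.2          # illustrative values; n/N = 0.01
N1 = int(theta * N)
max_diff = max(abs(hypergeom_pmf(x, N, N1, n) - binom_pmf(x, n, theta))
               for x in range(n + 1))
```

For these values the two pmfs agree to within a few parts in a thousand, consistent with the stated rule of thumb.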
Table 18.8-3. The Binomial Distribution (Fig. 18.8-1; see also Sec. 18.7-3)

(a) $p(x)=\dbinom{n}{x}\vartheta^x(1-\vartheta)^{n-x}$  ($x = 0, 1, 2, \ldots, n$; $0 < \vartheta < 1$)

$p(x)$ is largest when $x$ equals the largest integer $\le(n+1)\vartheta$; the sequence $p(0), p(1), p(2), \ldots$ increases monotonically for $\vartheta > n/(n+1)$ and decreases monotonically for $\vartheta < 1/(n+1)$; otherwise the binomial distribution is unimodal (Fig. 18.8-1). Note also

$$\Phi_x(x)=\sum_{i=0}^{x}\binom{n}{i}\vartheta^i(1-\vartheta)^{n-i}=1-I_\vartheta(x+1,\ n-x)\qquad(x=0,1,2,\ldots)$$

with $m_1 = 2(x+1)$, $m_2 = 2(n-x)$ (refer to Secs. 19.5-3 and 21.4-5).

(b) $E\{x\}=n\vartheta$  $\operatorname{Var}\{x\}=n\vartheta(1-\vartheta)$  $\gamma_x(s)=(\vartheta s+1-\vartheta)^n$

$\alpha_2=n\vartheta+n(n-1)\vartheta^2$  $\mu_3=n\vartheta(1-\vartheta)(1-2\vartheta)$

$\alpha_3=n(n-1)(n-2)\vartheta^3+3n(n-1)\vartheta^2+n\vartheta$  $\mu_4=n\vartheta(1-\vartheta)[1+3(n-2)\vartheta(1-\vartheta)]$

$\gamma_1=\dfrac{1-2\vartheta}{\sqrt{n\vartheta(1-\vartheta)}}$  $\gamma_2=\dfrac{1-6\vartheta(1-\vartheta)}{n\vartheta(1-\vartheta)}$

(c) Typical Interpretation. $p(x)$ is

1. The probability that a random sample of size $n$ with replacement (Sec. 19.5-5) contains exactly $x$ objects of type 1 if the sample is taken from a population of $N$ objects of which $\vartheta N$ are of type 1.

2. The probability of realizing an event ("success") exactly $x$ times in $n$ independent repeated trials (Bernoulli trials, Sec. 18.7-3) such that the probability of success in each trial is $\vartheta$.

(d) Approximations. As $n \to \infty$, the binomial variable $x$ is asymptotically normal with mean $n\vartheta = \xi$ and variance $n\vartheta(1-\vartheta) = \sigma^2$:

$$p(x)=\binom{n}{x}\vartheta^x(1-\vartheta)^{n-x}\approx\frac{1}{\sigma}\,\varphi_u\!\left(\frac{x-\xi}{\sigma}\right)$$
$$P\left[a<\frac{x-\xi}{\sigma}\le b\right]\to\Phi_u(b)-\Phi_u(a)\qquad\text{as }n\to\infty\text{ for fixed }a,b$$

Approximations based on these relations are usually permissible for $\sigma^2 = n\vartheta(1-\vartheta) > 9$. See Refs. 18.4 and 19.8 for discussions of the resulting errors. See Table 18.8-4 for the Poisson-distribution approximation to the binomial distribution.
[Fig. 18.8-1. The binomial distribution: (a) $n = 4$, $\vartheta = 0.4$; (b) $n = 3$, $\vartheta = 0.8$; (c) $n = 16$, $\vartheta = 0.7$. $m$ is the mode. (From Mood, A. M., Introduction to the Theory of Statistics, McGraw-Hill, New York, 1950.)]
18.8-2. Discrete Multidimensional Probability Distributions (see also Sec. 18.4-2). (a) A multinomial distribution is described by

$$p(x_1,x_2,\ldots,x_n)=\frac{N!}{x_1!\,x_2!\cdots x_n!}\,\vartheta_1^{x_1}\vartheta_2^{x_2}\cdots\vartheta_n^{x_n}\tag{18.8-1}$$
$$(x_1,x_2,\ldots,x_n=0,1,2,\ldots;\ x_1+x_2+\cdots+x_n=N)$$

where $\vartheta_1, \vartheta_2, \ldots, \vartheta_n$ are positive real numbers such that $\vartheta_1 + \vartheta_2 + \cdots + \vartheta_n = 1$.
[Fig. 18.8-2. The Poisson distribution for $\xi = 0.1$, $\xi = 1$, and $\xi = 10$. (From Goode, H. H., and R. E. Machol, System Engineering, McGraw-Hill, New York, 1957.)]
Table 18.8-4. The Poisson Distribution (Fig. 18.8-2; see also Sec. 18.9-3)

(a) $p(x)=e^{-\xi}\dfrac{\xi^x}{x!}$  ($x = 0, 1, 2, \ldots$; $\xi > 0$)

(b) $E\{x\}=\operatorname{Var}\{x\}=\xi$  $\gamma_x(s)=e^{\xi(s-1)}$

$\alpha_2=\xi(\xi+1)$  $\alpha_3=\xi(\xi^2+3\xi+1)$  $\alpha_4=\xi(\xi^3+6\xi^2+7\xi+1)$

$\mu_4=3\xi^2+\xi$  $\gamma_1=\xi^{-\frac12}$  $\gamma_2=\xi^{-1}$

(c) The Poisson distribution approximates a hypergeometric distribution (Table 18.8-2) or a binomial distribution (Table 18.8-3) as $N \to \infty$, $n \to \infty$, $\vartheta \to 0$ in such a manner that $n\vartheta$ has a finite limit $\xi$ (Law of Small Numbers). The approximation is often useful for $\vartheta < 0.1$, $n\vartheta > 1$. The most important applications of the Poisson distribution appear in connection with random processes of the type discussed in Sec. 17.9-3.
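The Law of Small Numbers in (c) can be checked directly; the parameters below are assumptions chosen so that $\vartheta$ is small and $n\vartheta = 2$ is moderate:

```python
from math import comb, exp, factorial

# Binomial pmf for small theta vs. Poisson pmf with xi = n * theta.
def binom_pmf(x, n, theta):
    return comb(n, x) * theta**x * (1.0 - theta)**(n - x)

def poisson_pmf(x, xi):
    return exp(-xi) * xi**x / factorial(x)

n, theta = 100, 0.02                 # illustrative values; n * theta = 2
xi = n * theta
max_diff = max(abs(binom_pmf(x, n, theta) - poisson_pmf(x, xi))
               for x in range(21))
```

For these values the two pmfs agree to within a few parts in a thousand at every spectral value.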
Given an experiment having $n$ mutually exclusive results $E_1, E_2, \ldots, E_n$ with respective probabilities $\vartheta_1, \vartheta_2, \ldots, \vartheta_n$ such that $\vartheta_1 + \vartheta_2 + \cdots + \vartheta_n = 1$, the expression (1) is the probability that the respective events $E_1, E_2, \ldots, E_n$ occur exactly $x_1, x_2, \ldots, x_n$ times in $N$ independent repeated trials (see also Sec. 18.7-3). In classical statistical mechanics, $x_1, x_2, \ldots, x_n$ are the occupation numbers of $n$ independent states with respective a priori probabilities $\vartheta_1, \vartheta_2, \ldots, \vartheta_n$.
Table 18.8-5. The Geometric Distribution

(a) $p(x)=\vartheta(1-\vartheta)^x$  ($x = 0, 1, 2, \ldots$; $0 < \vartheta < 1$)

(b) $E\{x\}=\dfrac{1-\vartheta}{\vartheta}$  $\operatorname{Var}\{x\}=\dfrac{1-\vartheta}{\vartheta^2}$  $\gamma_x(s)=\dfrac{\vartheta}{1-(1-\vartheta)s}$

(c) Typical Interpretation: $p(x)$ is the probability of realizing an event ("success") for the first time after exactly $x$ Bernoulli trials with probability of success $\vartheta$. $\Phi(x) = 1 - (1-\vartheta)^{x+1}$ ($x = 0, 1, 2, \ldots$) is the probability that the first success occurs after at most $x$ trials (see also Table 18.8-6).
Table 18.8-6. Pascal's Distribution

(a) $p(x)=\dbinom{m+x-1}{x}\vartheta^m(1-\vartheta)^x$  ($x = 0, 1, 2, \ldots$; $m = 1, 2, \ldots$; $0 < \vartheta < 1$)

(b) $E\{x\}=m\dfrac{1-\vartheta}{\vartheta}$  $\operatorname{Var}\{x\}=m\dfrac{1-\vartheta}{\vartheta^2}$  $\gamma_x(s)=\left[\dfrac{\vartheta}{1-(1-\vartheta)s}\right]^m$

(c) Typical Interpretation: $p(x)$ is the probability of realizing an event ("success") for the $m$th time after exactly $m + x$ Bernoulli trials with probability of success $\vartheta$. $\Phi(x)$ is the probability that the $m$th success occurs after at most $m + x$ trials. For $m = 1$ Pascal's distribution reduces to the geometric distribution (Table 18.8-5).
Table 18.8-7. Polya's Distribution (Negative Binomial Distribution)

(a) $p(x)=\dbinom{\frac{1}{\beta}+x-1}{x}(\beta\xi)^x(1+\beta\xi)^{-\frac{1}{\beta}-x}$  ($x = 0, 1, 2, \ldots$; $\xi > 0$, $\beta > 0$)

$p(0)=(1+\beta\xi)^{-\frac{1}{\beta}}$

(b) $E\{x\}=\xi$  $\operatorname{Var}\{x\}=\xi(1+\beta\xi)$  $\gamma_x(s)\equiv[1+\beta\xi(1-s)]^{-\frac{1}{\beta}}$

(c) Polya's distribution reduces to the Poisson distribution (Table 18.8-4) for $\beta = 0$, and to the geometric distribution (Table 18.8-5) for $\beta = 1$. See Ref. 18.4 for an interpretation in terms of a random process ("contagion").
(b) A multiple Poisson distribution is described by

$$p(x_1,x_2,\ldots,x_n)=e^{-(\xi_1+\xi_2+\cdots+\xi_n)}\frac{\xi_1^{x_1}\xi_2^{x_2}\cdots\xi_n^{x_n}}{x_1!\,x_2!\cdots x_n!}\tag{18.8-2}$$
$$(x_1,x_2,\ldots,x_n=0,1,2,\ldots;\ \xi_k>0,\ k=1,2,\ldots,n)$$
18.8-3. Continuous Probability Distributions: The Normal (Gaussian) Distribution. A continuous random variable $x$ is normally distributed (normal) with mean $\xi$ and variance $\sigma^2$ [or normal with parameters $\xi$, $\sigma^2$; normal with parameters $\xi$, $\sigma$; normal ($\xi$, $\sigma^2$); normal ($\xi$, $\sigma$)] if
$$\varphi(x)=\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x-\xi)^2}{2\sigma^2}}\tag{18.8-3}$$

The distribution of the standardized normal variable (normal deviate) $u = \dfrac{x-\xi}{\sigma}$ (see also Sec. 18.5-3c) is given by

$$\varphi_u(U)=\frac{1}{\sqrt{2\pi}}\,e^{-\frac{U^2}{2}}\qquad\text{(normal frequency function)}$$
$$\Phi_u(U)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{U}e^{-\frac{u^2}{2}}\,du=\frac{1}{2}\left[1+\operatorname{erf}\left(\frac{U}{\sqrt{2}}\right)\right]\qquad\text{(normal distribution function)}\tag{18.8-4}$$
$$E\{u\}=0\qquad\operatorname{Var}\{u\}=1$$

(see also Fig. 18.8-3 and Sec. 18.8-4). $\operatorname{erf}z$ is the frequently tabulated error function (normal error integral, probability integral; see also Sec. 21.3-2)

$$\operatorname{erf}z\equiv-\operatorname{erf}(-z)\equiv\frac{2}{\sqrt{\pi}}\int_0^ze^{-\zeta^2}\,d\zeta=2\Phi_u(z\sqrt{2})-1\tag{18.8-5}$$

$\varphi(X)$ has points of inflection for $X = \xi \pm \sigma$. Note

$$\varphi_u^{(k)}(U)\equiv(-1/\sqrt{2})^k\,H_k(U/\sqrt{2})\,\varphi_u(U)\tag{18.8-6}$$

where $H_k(z)$ is the $k$th Hermite polynomial (Sec. 21.7-1). Every normal distribution is symmetric about its mean value $\xi$; $\xi$ is the median and the (single) mode. The coefficients of skewness and excess are zero, and

$$\mu_{2k}=1\cdot3\cdots(2k-1)\,\sigma^{2k}\qquad\mu_{2k-1}=0\qquad(k=1,2,\ldots)\tag{18.8-7}$$
$$\kappa_1=\xi\qquad\kappa_2=\sigma^2\qquad\kappa_3=\kappa_4=\cdots=0\tag{18.8-8}$$
$$\chi_x(q)=\exp\left(i\xi q-\frac{\sigma^2q^2}{2}\right)\tag{18.8-9}$$
The moments $\alpha_r$ about the origin may be computed by the methods of Sec. 18.3-10. The normal distribution is of particular importance in many applications, especially in statistics (Secs. 19.3-1 and 19.5-3).

18.8-4. Normal Random Variables: Distribution of Deviations from the Mean. (a) For any normal random variable $x$ with mean $\xi$ and variance $\sigma^2$,

$$P[a<x\le b]=\Phi_u\!\left(\frac{b-\xi}{\sigma}\right)-\Phi_u\!\left(\frac{a-\xi}{\sigma}\right)\tag{18.8-10}$$
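The moment formula (18.8-7) can be checked by direct numerical integration of the normal frequency function. This sketch uses plain Riemann summation on a fine grid (grid spacing and cutoff are illustrative choices; they suffice because the integrand decays rapidly):

```python
import math

# Numerical check of Eq. (18.8-7) for sigma = 1: mu_2 = 1 and mu_4 = 1*3 = 3.
def normal_pdf(x):
    """Standardized normal frequency function, Eq. (18.8-4)."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

h, lim = 0.001, 10.0
grid = [-lim + i * h for i in range(int(2 * lim / h) + 1)]
mu2 = h * sum(x**2 * normal_pdf(x) for x in grid)
mu4 = h * sum(x**4 * normal_pdf(x) for x in grid)
```

The odd central moments vanish by symmetry, in agreement with $\mu_{2k-1} = 0$.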
[Fig. 18.8-3. (a) The normal frequency function $\varphi(X)$, with its points of inflection at $X = \xi \pm \sigma$, and (b) the normal distribution function $\Phi(X)$; $\xi$ is the mean and $\sigma$ the standard deviation of the distribution, $U = (X-\xi)/\sigma$, 1 p.e. = 0.6745$\sigma$, and 1 m.a.e. = 0.7979$\sigma$. (From Burington, R. S., and D. C. May, Handbook of Probability and Statistics, McGraw-Hill, New York, 1953.)]
and, for $Y > 0$,

$$P[|x-\xi|\le Y\sigma]=P[\xi-Y\sigma\le x\le\xi+Y\sigma]=2\Phi_u(Y)-1=\operatorname{erf}\left(\frac{Y}{\sqrt{2}}\right)\tag{18.8-11}$$
$$P[|x-\xi|>Y\sigma]=1-\operatorname{erf}\left(\frac{Y}{\sqrt{2}}\right)\equiv1-\Phi_{|u|}(Y)\tag{18.8-12}$$

(b) The fractiles

$$|u|_P\equiv u_{\frac{1}{2}(1+P)}\equiv-u_{\frac{1}{2}(1-P)}\tag{18.8-13}$$

defined by

$$P[|x-\xi|\le|u|_P\,\sigma]=P\tag{18.8-14}$$
Table 18.8-8. Continuous Probability Distributions*

1. Causal distribution: frequency function $\varphi(x)=\delta(x-\xi)$; distribution function $\Phi(x)=U(x-\xi)$.

2. Rectangular (uniform) distribution: $\varphi(x)=\dfrac{1}{2a}$ ($|x-\xi|<a$), $\varphi(x)=0$ ($|x-\xi|>a$); $\Phi(x)=0$ ($x<\xi-a$), $\Phi(x)=\dfrac{x-\xi+a}{2a}$ ($\xi-a\le x\le\xi+a$), $\Phi(x)=1$ ($x>\xi+a$).

3. Cauchy's distribution: $\varphi(x)=\dfrac{a}{\pi}\,\dfrac{1}{a^2+(x-\xi)^2}$; $\Phi(x)=\dfrac{1}{2}+\dfrac{1}{\pi}\arctan\dfrac{x-\xi}{a}$.

4. Laplace's distribution: $\varphi(x)=\dfrac{1}{2\beta}\,e^{-\frac{|x-\xi|}{\beta}}$; $\Phi(x)=\dfrac{1}{2}e^{\frac{x-\xi}{\beta}}$ ($x\le\xi$), $\Phi(x)=1-\dfrac{1}{2}e^{-\frac{x-\xi}{\beta}}$ ($x>\xi$).

5. Beta distribution ($\alpha>0$, $\beta>0$): $\varphi(x)=\dfrac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}x^{\alpha-1}(1-x)^{\beta-1}$ ($0<x<1$), $\varphi(x)=0$ ($x<0$, $x>1$); $\Phi(x)=0$ ($x<0$), $\Phi(x)=I_x(\alpha,\beta)$ ($0\le x\le1$), $\Phi(x)=1$ ($x>1$).

6. Gamma distribution ($\alpha>0$, $\beta>0$): $\varphi(x)=\dfrac{x^{\alpha-1}e^{-x/\beta}}{\beta^\alpha\Gamma(\alpha)}$ ($x>0$), $\varphi(x)=0$ ($x\le0$); $\Phi(x)=0$ ($x\le0$), $\Phi(x)=\dfrac{\Gamma_{x/\beta}(\alpha)}{\Gamma(\alpha)}$ ($x>0$).

* See footnote to Sec. 18.3-4.
are often referred to as tolerance limits of the normal deviate $u$ (see also Sec. 19.6-4). Note

$$|u|_{0.95}=u_{0.975}\approx1.96\qquad|u|_{0.99}=u_{0.995}\approx2.58\qquad|u|_{0.999}=u_{0.9995}\approx3.29\tag{18.8-15}$$

(c) Note the following measures of dispersion for normal distributions (see also Table 18.3-1):

The mean deviation (m.a.e.): $E\{|x-\xi|\}=\sigma\sqrt{2/\pi}\approx0.7979\sigma$

The probable deviation (p.e., median of $|x-\xi|$): $|u|_{0.5}\,\sigma\approx0.6745\sigma$, where $|u|_{0.5}=-u_{0.25}=u_{0.75}$

One-half the half width: $\sigma\sqrt{2\log_e2}\approx1.1774\sigma$

The lower and upper quartiles: $x_{1/4}=\xi-u_{3/4}\sigma\approx\xi-0.6745\sigma$ and $x_{3/4}=\xi+u_{3/4}\sigma\approx\xi+0.6745\sigma$
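The tolerance limits of Eq. (18.8-15) can be verified with Eq. (18.8-4), since $P[|u| \le |u|_P] = 2\Phi_u(|u|_P) - 1$:

```python
import math

# Check of Eq. (18.8-15) via the normal distribution function of Eq. (18.8-4).
def Phi_u(U):
    """Standardized normal distribution function: (1/2)[1 + erf(U/sqrt(2))]."""
    return 0.5 * (1.0 + math.erf(U / math.sqrt(2.0)))

p95 = 2.0 * Phi_u(1.96) - 1.0    # should be close to 0.95
p99 = 2.0 * Phi_u(2.58) - 1.0    # should be close to 0.99
p999 = 2.0 * Phi_u(3.29) - 1.0   # should be close to 0.999
```

`math.erf` evaluates the error function of Eq. (18.8-5) directly, so no tables are needed.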
Table 18.8-8. Continuous Probability Distributions (continued): means, variances, characteristic functions, and remarks

1. Causal distribution: $E\{x\}=\xi$, $\operatorname{Var}\{x\}=0$, $\chi_x(q)=e^{i\xi q}$. $x$ is almost always (Sec. 4.6-14b) equal to $\xi$. Note that the rectangular, Cauchy, and Laplace distributions approximate a causal distribution as $a\to0$ or $\beta\to0$ (see also Table 18.8-1 and Sec. 21.9-2).

2. Rectangular distribution: $E\{x\}=\xi$, $\operatorname{Var}\{x\}=\dfrac{a^2}{3}$, $\chi_x(q)=e^{i\xi q}\,\dfrac{\sin(aq)}{aq}$. $x$ is uniformly distributed over the interval $(\xi-a,\ \xi+a)$.

3. Cauchy's distribution: $E\{x\}$ and $\operatorname{Var}\{x\}$ do not exist; the Cauchy principal value (Sec. 4.6-2c) of $E\{x\}$ is $\xi$; $\chi_x(q)=e^{i\xi q-a|q|}$. Distribution of $x=\xi+a\tan y$ if $y$ is uniformly distributed between $y=-\pi/2$ and $y=\pi/2$ (rectangular distribution). Cauchy's distribution is symmetric about $x=\xi$. Half width and interquartile range are both equal to $2a$.

4. Laplace's distribution: $E\{x\}=\xi$, $\operatorname{Var}\{x\}=2\beta^2$, $\chi_x(q)=\dfrac{e^{i\xi q}}{1+\beta^2q^2}$. For $\xi=0$, the characteristic function is proportional to the frequency function of a Cauchy distribution with $a=1/\beta$.

5. Beta distribution: $E\{x\}=\dfrac{\alpha}{\alpha+\beta}$, $\operatorname{Var}\{x\}=\dfrac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$, $\chi_x(q)=F(\alpha;\ \alpha+\beta;\ iq)$. $I_x(\alpha,\beta)$ is the incomplete beta-function ratio (Sec. 21.4-5; see also Sec. 19.5-3 and Table 18.8-3). Unique mode $(\alpha-1)/(\alpha+\beta-2)$ for $\alpha>1$, $\beta>1$; $\alpha_r=\dfrac{\Gamma(\alpha+r)\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\alpha+\beta+r)}$.

6. Gamma distribution: $E\{x\}=\alpha\beta$, $\operatorname{Var}\{x\}=\alpha\beta^2$, $\chi_x(q)=(1-\beta iq)^{-\alpha}$. $\Gamma_x(\alpha)$ is the incomplete gamma function (Sec. 21.4-5).
18.8-5. Miscellaneous Continuous One-dimensional Probability Distributions. Table 18.8-8 describes a number of continuous one-dimensional probability distributions (see also Secs. 19.3-4, 19.3-5, and 19.5-3).
18.8-6. Two-dimensional Normal Distributions. (a) A two-dimensional normal distribution is a continuous probability distribution described by a frequency function of the form

$$\varphi(x_1,x_2)=\frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho_{12}^2}}\exp\left\{-\frac{1}{2(1-\rho_{12}^2)}\left[\frac{(x_1-\xi_1)^2}{\sigma_1^2}-2\rho_{12}\frac{(x_1-\xi_1)(x_2-\xi_2)}{\sigma_1\sigma_2}+\frac{(x_2-\xi_2)^2}{\sigma_2^2}\right]\right\}\tag{18.8-16}$$

The marginal distributions of $x_1$ and $x_2$ are both normal with respective mean values $\xi_1$, $\xi_2$ and variances $\sigma_1^2$, $\sigma_2^2$; $\rho_{12}$ is the correlation coefficient of $x_1$ and $x_2$. The conditional distributions of $x_1$ and $x_2$ are both normal, with

$$E\{x_1|x_2\}=\xi_1+\rho_{12}\frac{\sigma_1}{\sigma_2}(x_2-\xi_2)\qquad\operatorname{Var}\{x_1|x_2\}=\sigma_1^2(1-\rho_{12}^2)\tag{18.8-17}$$
$$E\{x_2|x_1\}=\xi_2+\rho_{12}\frac{\sigma_2}{\sigma_1}(x_1-\xi_1)\qquad\operatorname{Var}\{x_2|x_1\}=\sigma_2^2(1-\rho_{12}^2)\tag{18.8-18}$$

so that the regression curves are identical with the mean-square regression lines (Sec. 18.4-6). $x_1$ and $x_2$ are statistically independent if and only if they are uncorrelated ($\rho_{12} = 0$; see also Sec. 18.4-11).
Note

$$P[x_1>\xi_1,\ x_2>\xi_2]=P[x_1\le\xi_1,\ x_2\le\xi_2]=\frac{1}{4}+\frac{1}{2\pi}\arcsin\rho_{12}\tag{18.8-19}$$
$$P[x_1>\xi_1,\ x_2\le\xi_2]=P[x_1\le\xi_1,\ x_2>\xi_2]=\frac{1}{4}-\frac{1}{2\pi}\arcsin\rho_{12}$$

(b) The distribution (16) can be described in terms of the standardized normal variables $u_1$, $u_2$ with the correlation coefficient $\rho_{12}$, or in terms of statistically independent standardized normal variables (Sec. 18.5-5). Thus

$$\varphi(u_1,u_2)=\frac{1}{2\pi\sqrt{1-\rho_{12}^2}}\exp\left[-\frac{u_1^2-2\rho_{12}u_1u_2+u_2^2}{2(1-\rho_{12}^2)}\right]\tag{18.8-20}$$

with

$$u_1=\frac{x_1-\xi_1}{\sigma_1}\qquad u_2=\frac{x_2-\xi_2}{\sigma_2}\qquad \bar u_1=\frac{u_1-\rho_{12}u_2}{\sqrt{1-\rho_{12}^2}}\qquad \bar u_2=\frac{u_2-\rho_{12}u_1}{\sqrt{1-\rho_{12}^2}}\tag{18.8-21}$$
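A pair of standardized normal variables with a prescribed correlation coefficient can be generated from independent ones by inverting a transformation of the kind in Eq. (18.8-21). The construction and parameter values below are illustrative assumptions:

```python
import math
import random
import statistics

# Sketch: build correlated standardized normals u1, u2 from independent
# standardized normals v1, v2, then estimate their correlation coefficient.
random.seed(1)
rho = 0.6                       # illustrative correlation coefficient
n = 200_000

u1s, u2s = [], []
for _ in range(n):
    v1 = random.gauss(0.0, 1.0)
    v2 = random.gauss(0.0, 1.0)
    u1s.append(v1)
    u2s.append(rho * v1 + math.sqrt(1.0 - rho**2) * v2)

m1, m2 = statistics.fmean(u1s), statistics.fmean(u2s)
cov = statistics.fmean((a - m1) * (b - m2) for a, b in zip(u1s, u2s))
r = cov / (statistics.pstdev(u1s) * statistics.pstdev(u2s))
```

The sample correlation coefficient `r` estimates $\rho_{12}$, and the sample means of both variables estimate 0.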
(c) The distribution (16) is represented graphically by the contour ellipses

$$\frac{1}{1-\rho_{12}^2}\left[\frac{(x_1-\xi_1)^2}{\sigma_1^2}-2\rho_{12}\frac{(x_1-\xi_1)(x_2-\xi_2)}{\sigma_1\sigma_2}+\frac{(x_2-\xi_2)^2}{\sigma_2^2}\right]=\lambda^2=\text{const.}\tag{18.8-22}$$

The probability that the "point" $(x_1, x_2)$ is inside the contour ellipse (22) is $P = \Phi_{\chi^2(2)}(\lambda^2)$, i.e., $\lambda^2 = \chi_P^2(2)$ (Table 19.5-1). The two mean-square regression lines respectively defined by Eqs. (17) and (18) bisect all contour-ellipse chords in the $x_1$ and $x_2$ directions, respectively (see also Sec. 2.4-6).
18.8-7. Circular Normal Distributions. Equation (16) represents a circular normal distribution with dispersion $\sigma$ about the center of gravity $(\xi_1, \xi_2)$ if and only if $\rho_{12} = 0$, $\sigma_1 = \sigma_2 = \sigma$. In this case the distance $r$ of the point $(x_1, x_2)$ from the center of gravity has the distribution function

$$\Phi_r(R)\equiv P[(x_1-\xi_1)^2+(x_2-\xi_2)^2\le R^2]=1-e^{-\frac{R^2}{2\sigma^2}}\qquad(R\ge0)\tag{18.8-23}$$

(see also Sec. 18.11-1b and Table 19.5-1). Circular normal distributions are of particular interest in problems related to gunnery; circular probability paper shows contour circles for equal increments of $\Phi_r(R)$. Note

$$r_{0.5}=\sigma\sqrt{\chi_{0.5}^2(2)}\approx1.1774\sigma\qquad\text{(circular probable error, cpe, cep)}\tag{18.8-24}$$
$$E\{r\}=\sigma\sqrt{\pi/2}\approx1.2533\sigma\tag{18.8-25}$$
18.8-8. n-Dimensional Normal Distributions.* The joint distribution of $n$ random variables $x_1, x_2, \ldots, x_n$ is an $n$-dimensional normal distribution if and only if it is a continuous probability distribution having a frequency function of the form

$$\varphi(x_1,x_2,\ldots,x_n)=\frac{1}{\sqrt{(2\pi)^n\det[\lambda_{jk}]}}\exp\left[-\frac{1}{2}\sum_{j=1}^n\sum_{k=1}^n\Lambda_{jk}(x_j-\xi_j)(x_k-\xi_k)\right]\tag{18.8-26}$$

* See footnote to Sec. 18.4-8.
Each normal distribution is completely defined by its center of gravity $(\xi_1, \xi_2, \ldots, \xi_n)$ and its moment matrix $[\lambda_{jk}] = [\Lambda_{jk}]^{-1}$, or by the corresponding variances and correlation coefficients (Sec. 18.4-8). The characteristic function is

$$\chi_x(q_1,q_2,\ldots,q_n)=\exp\left[i\sum_{j=1}^n\xi_jq_j-\frac{1}{2}\sum_{j=1}^n\sum_{k=1}^n\lambda_{jk}q_jq_k\right]\tag{18.8-27}$$

Each marginal and conditional distribution derived from a normal distribution is normal. All mean-square regression hypersurfaces are identical with the corresponding mean-square regression hyperplanes (Sec. 18.4-9). $n$ random variables $x_1, x_2, \ldots, x_n$ having a normal joint distribution are statistically independent if and only if they are uncorrelated (see also Sec. 18.4-11).

Each $n$-dimensional normal distribution can be described as the joint distribution of $n$ statistically independent standardized normal variables related to the original variables by a linear transformation (18.5-15).
18.8-9. Addition Theorems for Special Probability Distributions* (see also Sec. 18.5-7 and Table 19.5-1). (a) The binomial distribution (Table 18.8-3), the Poisson distribution (Table 18.8-4), and the Cauchy distribution (Table 18.8-8) "reproduce themselves" on addition of independent variables. If the random variable $x$ is defined as the sum $x = x_1 + x_2 + \cdots + x_n$ of $n$ statistically independent random variables $x_1, x_2, \ldots, x_n$, then

$$p_i(x_i)=\binom{n_i}{x_i}\vartheta^{x_i}(1-\vartheta)^{n_i-x_i}\ \text{ implies }\ p_x(x)=\binom{n_0}{x}\vartheta^{x}(1-\vartheta)^{n_0-x}\qquad(n_0=n_1+n_2+\cdots+n_n)\tag{18.8-28}$$

$$p_i(x_i)=e^{-\xi_i}\frac{\xi_i^{x_i}}{x_i!}\ \text{ implies }\ p_x(x)=e^{-\xi}\frac{\xi^{x}}{x!}\qquad(\xi=\xi_1+\xi_2+\cdots+\xi_n)\tag{18.8-29}$$

$$\varphi_i(x_i)=\frac{a_i}{\pi}\,\frac{1}{a_i^2+(x_i-\xi_i)^2}\ \text{ implies }\ \varphi_x(x)=\frac{a}{\pi}\,\frac{1}{a^2+(x-\xi)^2}\qquad(\xi=\xi_1+\xi_2+\cdots+\xi_n,\ a=a_1+a_2+\cdots+a_n)\tag{18.8-30}$$

(b) The sum $x = x_1 + x_2 + \cdots + x_n$ of $n$ statistically independent random variables $x_1, x_2, \ldots, x_n$ is a normal variable if and only if $x_1, x_2, \ldots, x_n$ are normal variables. In this case,

$$\xi=\xi_1+\xi_2+\cdots+\xi_n\qquad\sigma^2=\sigma_1^2+\sigma_2^2+\cdots+\sigma_n^2$$

If $x_1, x_2, \ldots, x_n$ are (not necessarily statistically independent) normal variables, then $x = a_1x_1 + a_2x_2 + \cdots + a_nx_n$ is a normal variable whose mean and variance are given by Eq. (18.5-19).

* See footnote to Sec. 18.3-4.
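The Poisson addition theorem (18.8-29) can be verified by brute-force convolution of two Poisson pmfs, using the discrete convolution rule of Sec. 18.5-7; the parameter values are illustrative:

```python
from math import exp, factorial

# Check of Eq. (18.8-29): the convolution of Poisson(xi1) and Poisson(xi2)
# pmfs equals the Poisson(xi1 + xi2) pmf.
def poisson_pmf(x, xi):
    return exp(-xi) * xi**x / factorial(x)

xi1, xi2 = 1.5, 2.5                  # illustrative means
conv = [sum(poisson_pmf(k, xi1) * poisson_pmf(x - k, xi2)
            for k in range(x + 1))
        for x in range(21)]
max_err = max(abs(conv[x] - poisson_pmf(x, xi1 + xi2)) for x in range(21))
```

The agreement is exact up to floating-point round-off, reflecting the binomial-theorem identity behind (18.8-29).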
18.9. MATHEMATICAL DESCRIPTION OF RANDOM PROCESSES
18.9-1. Random Processes. Consider a variable $x$ capable of assuming different values $x(t)$ for different values of an independent variable $t$. A random process (stochastic process) selects a specific sample function $x(t)$ from a given theoretical population (Sec. 19.1-2) or ensemble of possible sample functions. More specifically, the functions $x(t)$ are said to describe a random process if and only if the sample values $x_1 = x(t_1)$, $x_2 = x(t_2)$, . . . are random variables admitting definition of a joint probability distribution for every finite set of values (sampling times) $t_1, t_2, \ldots$ (Fig. 19.8-1). The random process is discrete or continuous if the joint distribution of $x(t_1), x(t_2), \ldots$ is, respectively, discrete or continuous for every finite set $t_1, t_2, \ldots$. The process is a random series if the independent variable $t$ assumes only a countable set of values. More generally, a random process may be described by a multidimensional variable $\mathbf{x}(t) = [x(t), y(t), \ldots]$.

The definition of a random process implies the existence of a probability distribution on the (in general, infinite-dimensional) sample space (Sec. 18.2-7) of possible functions $x(t)$. Each particular function $x(t) = X(t)$ constitutes a simple event [sample point, "value" of the multidimensional random variable $x(t)$]. In most applications the independent variable $t$ is the time, and the variable $x(t)$ or $\mathbf{x}(t)$ labels the state of a physical system. EXAMPLES: Results of successive observations, states of dynamical systems in Gibbsian statistical mechanics or quantum mechanics, messages and noise in communications systems, economic time series.
18.9-2. Mathematical Description of Random Processes. (a) To describe a random process, one must specify the distribution of $x(t_1)$ and the respective joint distributions of $[x(t_1), x(t_2)]$, $[x(t_1), x(t_2), x(t_3)]$, . . . for every finite set of values $t_1, t_2, t_3, \ldots$ (first, second, third, . . . probability distributions associated with the random process). These distributions are described by the corresponding first, second, . . . (or first-order, second-order, . . .) distribution functions (see also Sec. 18.4-7)

$$\Phi^{(1)}(X_1,t_1)=P[x(t_1)\le X_1]\qquad \Phi^{(2)}(X_1,t_1;X_2,t_2)=P[x(t_1)\le X_1,\ x(t_2)\le X_2]\qquad\ldots\tag{18.9-1a}$$

or, respectively for discrete and continuous random processes, by the corresponding probabilities and frequency functions

$$p^{(1)}(X_1,t_1)=P[x(t_1)=X_1]\qquad p^{(2)}(X_1,t_1;X_2,t_2)=P[x(t_1)=X_1,\ x(t_2)=X_2]\qquad\ldots\tag{18.9-1b}$$
Note: The sequence of distribution functions (1a) describes the random process in increasing detail, since each distribution function $\Phi^{(n)}$ completely defines all preceding ones as marginal distribution functions (Sec. 18.4-7). The same is true for each sequence (1b). Each of the functions (1) is symmetric with respect to (unaffected by) interchanges of pairs $X_i, t_i$ and $X_k, t_k$.

(b) Conditional probability distributions descriptive of the random process are related to the functions (1b) in the manner of Sec. 18.4-7; thus

$$p(X_1,t_1;\ldots;X_m,t_m|X_{m+1},t_{m+1};\ldots;X_n,t_n)=\frac{p^{(n)}(X_1,t_1;\ldots;X_n,t_n)}{p^{(n-m)}(X_{m+1},t_{m+1};\ldots;X_n,t_n)}\tag{18.9-2}$$

Note: The functions (2) are not in general symmetric with respect to interchanges of pairs $X_i, t_i$ and $X_k, t_k$ separated by the bar.
(c) A multidimensional random process, say one generating a pair of sample functions $x(t)$, $y(t)$, is similarly defined in terms of joint distributions of sample values $x(t_i)$, $y(t_k)$. In particular,

$$\Phi^{(2)}(X_1,t_1;Y_2,t_2)=P[x(t_1)\le X_1,\ y(t_2)\le Y_2]\tag{18.9-3}$$

18.9-3. Ensemble Averages. (a) General Definitions. The ensemble average (statistical average, mathematical expectation) of a suitable function $f[x(t_1), x(t_2), \ldots, x(t_n)]$ of $n$ sample values $x(t_1), x(t_2), \ldots, x(t_n)$ (statistic, see also Sec. 19.1-1) is the expected value (Sec. 18.4-8a)

$$E\{f\}=E\{f[x(t_1),x(t_2),\ldots,x(t_n)]\}=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}f(X_1,X_2,\ldots,X_n)\,d\Phi^{(n)}(X_1,t_1;X_2,t_2;\ldots;X_n,t_n)\tag{18.9-4}$$

if this integral exists in the sense of absolute convergence. Integration in Eq. (4) is over $X_1, X_2, \ldots, X_n$; $E\{f\}$ is a function of $t_1, t_2, \ldots, t_n$. Similarly, for a multidimensional random process described by $x(t)$, $y(t)$,

$$E\{f[x(t_1),y(t_2),x(t_3),y(t_4),\ldots]\}=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}f(X_1,Y_2,X_3,Y_4,\ldots)\,d\Phi(X_1,t_1;Y_2,t_2;X_3,t_3;Y_4,t_4;\ldots)\tag{18.9-5}$$

if the integral exists in the sense of absolute convergence.
(b) Ensemble Correlation Functions and Mean Squares. The ensemble averages $E\{x(t_1)\} = \xi(t_1)$, $E\{x^2(t_1)\}$, and

$$R_{xx}(t_1,t_2)\equiv E\{x(t_1)x(t_2)\}\qquad\text{(ensemble autocorrelation function)}\tag{18.9-6a}$$
$$R_{xy}(t_1,t_2)\equiv E\{x(t_1)y(t_2)\}\qquad\text{(ensemble crosscorrelation function)}\tag{18.9-6b}$$

are of special interest. They abstract important properties of the random process and are frequently all that is known about the process; note that

$$E\{x^2(t_1)\}=R_{xx}(t_1,t_1)\qquad \operatorname{Var}\{x(t_1)\}\equiv R_{xx}(t_1,t_1)-[E\{x(t_1)\}]^2$$
$$\operatorname{Cov}\{x(t_1),y(t_2)\}\equiv R_{xy}(t_1,t_2)-E\{x(t_1)\}E\{y(t_2)\}\tag{18.9-7}$$

The definitions (6) and Eq. (7) apply to real $x(t)$, $y(t)$. If $x(t)$ and/or $y(t)$ is a complex variable (really a two-dimensional random variable), then one defines

$$R_{xx}(t_1,t_2)\equiv E\{x^*(t_1)x(t_2)\}\equiv R_{xx}^*(t_2,t_1)\qquad R_{xy}(t_1,t_2)\equiv E\{x^*(t_1)y(t_2)\}\equiv R_{yx}^*(t_2,t_1)\tag{18.9-8}$$

which includes (6) as a special case; $R_{xx}$ is necessarily real for real $x$ and $y$. Note that, for real or complex $x$, $y$,

$$R_{xx}(t_1,t_1)=E\{|x(t_1)|^2\}\tag{18.9-9}$$
$$|R_{xy}(t_1,t_2)|^2\le E\{|x(t_1)|^2\}E\{|y(t_2)|^2\}\tag{18.9-10}$$

Existence of the quantities on the right implies that of the correlation functions on the left.
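An ensemble autocorrelation function can be estimated by averaging over many sample functions. The random-phase sinusoid below is an illustrative construction (not from the text); for it, $R_{xx}(t_1, t_2) = \tfrac{1}{2}\cos\omega(t_1 - t_2)$:

```python
import math
import random

# Sketch: ensemble of sample functions x(t) = sin(omega*t + eta), with the
# random phase eta uniform on (0, 2*pi); estimate R_xx(t1, t2) = E{x(t1)x(t2)}.
random.seed(2)
omega = 2.0
phases = [random.uniform(0.0, 2.0 * math.pi) for _ in range(100_000)]

def R_xx_estimate(t1, t2):
    """Average of x(t1)*x(t2) over the ensemble of sample functions."""
    return sum(math.sin(omega * t1 + p) * math.sin(omega * t2 + p)
               for p in phases) / len(phases)
```

This process is stationary in the sense that the estimate depends (up to sampling error) only on the difference $t_1 - t_2$.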
(c) Characteristic Functions. The $n$th characteristic function corresponding to the $n$th distribution function (1a) of the random process (see also Sec. 18.4-10) is

$$\chi^{(n)}(q_1,t_1;q_2,t_2;\ldots;q_n,t_n)=E\left\{\exp\left[i\sum_{k=1}^nq_kx(t_k)\right]\right\}\tag{18.9-11}$$

Joint characteristic functions for $x(t), y(t), \ldots$ are similarly defined. Characteristic functions can yield moments like $E\{x(t_1)\}$, $E\{x^2(t_1)\}$, $R_{xx}(t_1, t_2)$, . . . by differentiation in the manner of Secs. 18.3-10 and 18.4-10.
(d) Ensemble Averages of Integrals and Derivatives (see also Sec. 18.6-3). Random integrals of the form

$$y = \int_a^b f(t)x(t)\,dt \tag{18.9-12}$$
are defined in the sense of convergence in probability (Sec. 18.6-1) or, if possible, in the mean-square sense of Sec. 18.6-3. The integral converges in mean (in the sense of Sec. 18.6-3) if and only if

$$E\{|y|^2\} = \int_a^b dt_1 \int_a^b f^*(t_1)f(t_2)R_{xx}(t_1, t_2)\,dt_2 \tag{18.9-13}$$

exists. If \int_a^b |f(t)|E\{|x(t)|\}\,dt exists, then the integral (12) exists in the sense of absolute convergence for each sample function x(t), except possibly for a set of probability 0, and

$$E\Bigl\{\int_a^b f(t)x(t)\,dt\Bigr\} = \int_a^b f(t)E\{x(t)\}\,dt \tag{18.9-14}$$
The important relation (14) is needed, in particular, to derive the input-output relations of Sec. 18.12-2 (see also Refs. 18.13 to 18.17). The random process generating x(t) is continuous in the mean (mean-square continuous) at t = t_0 in the sense of Sec. 18.6-3 if and only if

$$\lim_{t \to t_0} E\{|x(t) - x(t_0)|^2\} = 0$$

this is true if and only if R_xx(t_1, t_2) exists and is continuous for t_1 = t_2 = t_0. The random process generating \dot{x}(t) will be called the mean-square derivative of a random process generating x(t) if and only if

$$\lim_{\Delta t \to 0} E\Bigl\{\Bigl|\frac{x(t + \Delta t) - x(t)}{\Delta t} - \dot{x}(t)\Bigr|^2\Bigr\} = 0 \tag{18.9-15}$$

This is true if and only if \partial^2 R_{xx}(t_1, t_2)/\partial t_1\,\partial t_2 exists and equals \partial^2 R_{xx}(t_1, t_2)/\partial t_2\,\partial t_1 for all t_1 = t_2. It follows that

$$E\{\dot{x}(t)\} = \frac{d}{dt}E\{x(t)\} = \dot{\xi}(t) \qquad R_{x\dot{x}}(t_1, t_2) \equiv \frac{\partial}{\partial t_2}R_{xx}(t_1, t_2) \qquad R_{\dot{x}\dot{x}}(t_1, t_2) \equiv \frac{\partial^2}{\partial t_1\,\partial t_2}R_{xx}(t_1, t_2) \tag{18.9-16}$$

(see also Sec. 18.12-2).
18.9-4. Processes Defined by Random Parameters. It is often possible to represent each sample function of a random process as a deterministic function x = x(t; η_1, η_2, ...) of t and a set of random parameters η_1, η_2, .... The process is then defined by the joint distribution of η_1, η_2, ...; in this case,

$$E\{f[x(t_1), \ldots, x(t_n)]\} \equiv \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} f[x(t_1;\ \eta_1, \eta_2, \ldots),\ \ldots,\ x(t_n;\ \eta_1, \eta_2, \ldots)]\,d\Phi_{\eta_1\eta_2\ldots}(\eta_1, \eta_2, \ldots) \tag{18.9-17}$$
In particular, each probability distribution of such a random process is
uniquely defined by its characteristic function (Sec. 18.4-10):

$$\chi_{(n)}(q_1, t_1;\ \ldots;\ q_n, t_n) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} \exp\Bigl[i\sum_{k=1}^{n} q_k x(t_k;\ \eta_1, \eta_2, \ldots)\Bigr]\,d\Phi_{\eta_1\eta_2\ldots}(\eta_1, \eta_2, \ldots) \tag{18.9-18}$$

18.9-5. Orthonormal-function Expansions. Given a real or complex random process x(t) with E{x(t)} finite and R_xx(t_1, t_2) bounded and continuous on the closed observation interval [a, b], there exist complete orthonormal sets of functions u_1(t), u_2(t), ... (Sec. 15.2-4) such that

$$x(t) = \sum_{k=1}^{\infty} c_k u_k(t) \qquad \text{with} \qquad \int_a^b u_i^*(t)u_k(t)\,dt = \delta_i^k \qquad c_k = \int_a^b u_k^*(t)x(t)\,dt \qquad (i, k = 1, 2, \ldots) \tag{18.9-19}$$
where the series and the integral for each c_k converge in mean in the sense of Sec. 18.6-3 (see also Sec. 18.9-3d). The random process is, then, represented by the set of random coefficients c_1, c_2, ...; the first n coefficients may give a useful approximate representation. In particular, there exists a complete orthonormal set u_k(t) = ψ_k(t) such that all the c_k are uncorrelated standardized random variables, i.e.,

$$E\{c_k\} = 0 \qquad E\{c_i^* c_k\} = \delta_i^k \qquad (i, k = 1, 2, \ldots) \tag{18.9-20}$$

(Karhunen-Loève Theorem). Specifically, the required ψ_k(t) are the eigenfunctions of the integral equation

$$\lambda \int_a^b R_{xx}(t_1, t_2)\psi(t_2)\,dt_2 = \psi(t_1) \tag{18.9-21}$$
(see also Sec. 15.3-3). The corresponding eigenvalues λ_k are nonnegative and have at most a finite degree of degeneracy (by Mercer's theorem, Sec. 15.3-4), and

$$E\Bigl\{\int_a^b |x(t)|^2\,dt\Bigr\} = \int_a^b R_{xx}(t, t)\,dt = \sum_{k=1}^{\infty} \frac{1}{\lambda_k} \tag{18.9-22}$$

The Karhunen-Loève theorem constitutes a generalization of the theorem of Sec. 18.5-5.
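The integral equation (18.9-21) can be approximated by discretizing the kernel on a grid, which turns the Karhunen-Loève construction into a symmetric matrix eigenproblem. A hedged numerical sketch; the Wiener-process kernel R_xx(t_1, t_2) = min(t_1, t_2) on [0, 1] is an illustrative choice with known eigenvalues, not an example from the text:

```python
import numpy as np

# Discretize R_xx(t1,t2) = min(t1,t2) on a midpoint grid; the eigenvalues mu_k of
# the integral operator (i.e., of the matrix R*dt) correspond to 1/lambda_k in
# the convention of Eq. (18.9-21).
n = 400
t = (np.arange(n) + 0.5) / n
dt = 1.0 / n
R = np.minimum.outer(t, t)

mu = np.linalg.eigvalsh(R * dt)[::-1]         # operator eigenvalues, descending

# Trace identity (18.9-22): sum_k mu_k = integral of R_xx(t,t) dt = integral of t dt = 1/2
print(round(mu.sum(), 3))                     # 0.5
# For this kernel the exact operator eigenvalues are (2/((2k-1)*pi))^2:
print(abs(mu[0] - (2 / np.pi) ** 2) < 5e-3)   # True
```

The corresponding eigenvectors (columns of the decomposition) approximate the ψ_k(t); the dominant few carry most of the mean-square "energy" of the process.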
EXAMPLES: Periodic random processes (Sec. 18.11-1), band-limited flat-spectrum noise (Sec. 18.11-2b). Although explicit analytical solution of the integral equation (21) is rarely possible, the theorem is useful in detection theory (Ref. 19.24).

18.10. STATIONARY RANDOM PROCESSES. CORRELATION FUNCTIONS AND SPECTRAL DENSITIES
18.10-1. Stationary Random Processes. A random process, or the corresponding ensemble of functions x(t), is stationary if and only if each of its probability distributions is unchanged when t is replaced by t + t_0, so that

$$\Phi^{(n)}(X_1, t_1 + t_0;\ X_2, t_2 + t_0;\ \ldots;\ X_n, t_n + t_0) = \Phi^{(n)}(X_1, t_1;\ X_2, t_2;\ \ldots;\ X_n, t_n) \qquad (-\infty < t_0 < \infty;\ n = 1, 2, \ldots) \tag{18.10-1}$$

i.e., the nth probability distribution depends only on a set of n - 1 differences

$$\tau_1 = t_2 - t_1 \qquad \tau_2 = t_3 - t_1 \qquad \cdots \qquad \tau_{n-1} = t_n - t_1 \tag{18.10-2}$$

of sampling times t_k. Similarly, two or more random processes generating x(t), y(t), ... are jointly stationary if and only if their joint probability distributions are unchanged when t is replaced by t + t_0. For stationary and jointly stationary random processes, each ensemble average (18.9-4) or (18.9-5) depends only on n - 1 differences (2):

$$E\{f[x(t_1), x(t_2), \ldots, x(t_n)]\} = E\{f[x(0), x(\tau_1), \ldots, x(\tau_{n-1})]\} \tag{18.10-3}$$
for every t_1 (see also Sec. 18.10-2).

18.10-2. Ensemble Correlation Functions (see also Sec. 18.9-3b). (a) For stationary x(t) [and jointly stationary x(t), y(t)], the expected values

$$E\{x(t)\} \equiv E\{x\} \equiv \xi \qquad E\{|x(t)|^2\} \equiv E\{|x|^2\} \qquad E\{y(t)\} \equiv E\{y\} \equiv \eta \qquad \ldots$$

are constant, and the ensemble correlation functions (18.9-8) reduce to functions of the delay t_2 - t_1 = τ separating t_1 and t_2. In this case,

$$R_{xx}(\tau) \equiv E\{x^*(t)x(t + \tau)\} = R_{xx}^*(-\tau) \qquad R_{xy}(\tau) \equiv E\{x^*(t)y(t + \tau)\} = R_{yx}^*(-\tau) \tag{18.10-4}$$

$$|R_{xx}(\tau)| \le R_{xx}(0) = E\{|x|^2\} \tag{18.10-5}$$

$$|R_{xy}(\tau)|^2 \le R_{xx}(0)R_{yy}(0) = E\{|x|^2\}E\{|y|^2\} \tag{18.10-6}$$

Again, existence of the quantities on the right implies existence of the correlation functions on the left. If R_xx(τ) is continuous for τ = 0, it is continuous for all τ (Ref. 18.17). [R_xx(t_i - t_k)] is a positive-semidefinite hermitian matrix (Sec. 13.5-3) for every finite set t_1, t_2, ..., t_n.
(b) Normalized ensemble correlation functions are defined by

$$\rho_{xx}(\tau) \equiv \frac{R_{xx}(\tau) - |\xi|^2}{E\{|x|^2\} - |\xi|^2} \qquad \rho_{xy}(\tau) \equiv \frac{R_{xy}(\tau) - \xi^*\eta}{(E\{|x|^2\} - |\xi|^2)^{\frac{1}{2}}(E\{|y|^2\} - |\eta|^2)^{\frac{1}{2}}} \tag{18.10-7}$$

Note |ρ_xx| ≤ 1, |ρ_xy| ≤ 1. For real stationary x, y, ρ_xx and ρ_xy are real correlation coefficients (Sec. 18.4-4), and Eq. (4) implies

$$R_{xx}(\tau) = R_{xx}(-\tau) \qquad R_{xy}(\tau) = R_{yx}(-\tau) \tag{18.10-8}$$

Random processes which are not stationary or jointly stationary but have constant E{x(t)}, E{y(t)} and "stationary correlation functions" satisfying Eq. (4) are often called stationary, or jointly stationary, in the wide sense.
18.10-3. Ensemble Spectral Densities. If x(t) is generated by a stationary random process, and x(t), y(t) by jointly stationary random processes, the ensemble power spectral density Φ_xx(ω) and the ensemble cross-spectral density Φ_xy(ω) are defined by

$$\Phi_{xx}(\omega) \equiv \int_{-\infty}^{\infty} R_{xx}(\tau)e^{-i\omega\tau}\,d\tau \equiv \Phi_{xx}^*(\omega) \qquad \Phi_{xy}(\omega) \equiv \int_{-\infty}^{\infty} R_{xy}(\tau)e^{-i\omega\tau}\,d\tau \equiv \Phi_{yx}^*(\omega) \tag{18.10-9}$$

(Wiener-Khinchine relations). Assuming suitable convergence, this implies

$$R_{xx}(\tau) = \int_{-\infty}^{\infty} \Phi_{xx}(\omega)e^{i\omega\tau}\,\frac{d\omega}{2\pi} \qquad R_{xy}(\tau) = \int_{-\infty}^{\infty} \Phi_{xy}(\omega)e^{i\omega\tau}\,\frac{d\omega}{2\pi} \tag{18.10-10}$$

The Fourier transforms (9) are introduced, essentially, to simplify the relations between input and output correlation functions in linear time-invariant systems (Sec. 18.12-3). Existence of the transforms (9) requires, besides the existence of E{|x|^2} and E{|y|^2} (Sec. 18.9-3b), that R_xx(τ) or R_xy(τ) decay sufficiently quickly as τ → ∞. In the case of periodic and d-c processes, one extends the definitions of spectral densities to include delta-function terms chosen so that Eq. (10) is satisfied (Sec. 18.10-9).
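The transform pair can be checked numerically. A hedged sketch using the standard pair R_xx(τ) = e^{-|τ|} with Φ_xx(ω) = 2/(1 + ω^2); this particular pair is an illustrative assumption, not one of the handbook's equations:

```python
import numpy as np

# Evaluate Phi_xx(omega) = integral R_xx(tau) exp(-i omega tau) dtau  (Eq. 18.10-9)
# by a Riemann sum over a long, finely spaced tau grid, and compare with the
# closed form 2/(1 + omega^2).
tau = np.linspace(-40.0, 40.0, 160001)
dtau = tau[1] - tau[0]
Rxx = np.exp(-np.abs(tau))

def phi_xx(omega):
    return (Rxx * np.exp(-1j * omega * tau)).sum().real * dtau

for w in (0.0, 1.0, 3.0):
    print(round(phi_xx(w), 3), round(2.0 / (1.0 + w * w), 3))
```

The truncation at |τ| = 40 is harmless here because R_xx decays exponentially; a slowly decaying correlation function would need a correspondingly longer grid.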
18.10-4. Correlation Functions and Spectra of Real Processes. The relations (9) and (10) apply to both real and complex random processes x(t), y(t). Note that the power spectral density Φ_xx(ω) is always real, even if x is complex; but the cross-spectral density Φ_xy(ω) may be a complex function even for real x, y. If x and y are real, the same is true for the correlation functions R_xx(τ), R_xy(τ). In this case,

$$\Phi_{xx}(\omega) \equiv \int_{-\infty}^{\infty} R_{xx}(\tau)e^{-i\omega\tau}\,d\tau \equiv 2\int_0^{\infty} R_{xx}(\tau)\cos\omega\tau\,d\tau \equiv \Phi_{xx}(-\omega) \tag{18.10-11}$$

$$R_{xx}(\tau) \equiv \int_{-\infty}^{\infty} \Phi_{xx}(\omega)e^{i\omega\tau}\,\frac{d\omega}{2\pi} \equiv 2\int_0^{\infty} \Phi_{xx}(\omega)\cos\omega\tau\,\frac{d\omega}{2\pi} \equiv R_{xx}(-\tau) \tag{18.10-12}$$

$$\Phi_{xy}(\omega) \equiv \Phi_{xy}^*(-\omega) \equiv \Phi_{yx}(-\omega) \qquad R_{xy}(\tau) \equiv R_{yx}(-\tau) \tag{18.10-13}$$
Note again that Eqs. (11) to (13) apply to real x, y.

18.10-5. Spectral Decomposition of Mean "Power" for Real Processes. For real x(t), substitution of τ = 0 in Eqs. (11) and (12) yields

$$E\{x^2\} = R_{xx}(0) = \int_{-\infty}^{\infty} \Phi_{xx}(\omega)\,\frac{d\omega}{2\pi} = 2\int_0^{\infty} \Phi_{xx}(\omega)\,\frac{d\omega}{2\pi} \tag{18.10-14}$$

This is interpreted as a spectral decomposition of E{x^2} (mean "power"). In the first integral, contributions to E{x^2} are "distributed" over both positive and negative frequencies with density Φ_xx(ω) ("two-sided" power spectral density), measured in (x units)^2/cps, since ω/2π is frequency in cps. Alternatively, we can consider E{x^2} as distributed only over nonnegative ("real") frequencies with the "one-sided" power spectral density 2Φ_xx(ω) (x units)^2/cps.

Intuitive interpretation of the (in general complex) cross-spectral density Φ_xy(ω) is not quite so simple. For real x(t), y(t), substitution of τ = 0 in Eq. (10) yields

$$E\{xy\} = R_{xy}(0) = \int_{-\infty}^{\infty} \Phi_{xy}(\omega)\,\frac{d\omega}{2\pi} = 2\int_0^{\infty} \operatorname{Re}\Phi_{xy}(\omega)\,\frac{d\omega}{2\pi} \tag{18.10-15}$$

Re Φ_xy(ω) is often called a cross-power spectral density. Im Φ_xy(ω) (cross-quadrature spectral density) does not contribute to the mean "power" (15).
18.10-6. Some Alternative Ensemble Spectral Densities. Other spectral-density functions found in the literature are

$$S_{xx}(\nu) \equiv \Phi_{xx}(2\pi\nu) \qquad \text{with} \qquad E\{|x|^2\} = \int_{-\infty}^{\infty} S_{xx}(\nu)\,d\nu \tag{18.10-16}$$

(ν = ω/2π; two-sided spectral density in x units^2/cps)

$$g_{xx}(\omega) \equiv \frac{1}{2\pi}\Phi_{xx}(\omega) \qquad \text{with} \qquad E\{x^2\} = \int_{-\infty}^{\infty} g_{xx}(\omega)\,d\omega \tag{18.10-17}$$

(two-sided spectral density in x units^2/radian/sec) and the one-sided spectral densities

$$\Gamma_{xx}(\nu) \equiv 2S_{xx}(\nu) = 2\Phi_{xx}(2\pi\nu) \qquad (\nu \ge 0) \tag{18.10-18}$$

$$G_{xx}(\omega) \equiv 2g_{xx}(\omega) = \frac{1}{\pi}\Phi_{xx}(\omega) \qquad (\omega \ge 0) \tag{18.10-19}$$

Note that Γ_xx(ν) and G_xx(ω) are defined only for nonnegative frequencies. Similar definitions also apply to cross-spectral densities. Note that symbols and definitions vary greatly in the literature; the definition used should be restated and referred to in each case.
18.10-7. t Averages and Ergodic Processes. (a) t Averages. Given any function x(t), the t average (average over t, frequently a time average) of a measurable function f[x(t_1), x(t_2), ..., x(t_n)] is defined as

$$\langle f\rangle \equiv \langle f[x(t_1), x(t_2), \ldots, x(t_n)]\rangle \equiv \lim_{T \to \infty} \frac{1}{2T}\int_{-T}^{T} f[x(t_1 + t), x(t_2 + t), \ldots, x(t_n + t)]\,dt \tag{18.10-20}$$

if the limit exists.* If x(t) describes a random process, then

$$E\{\langle f\rangle\} = \langle E\{f\}\rangle \tag{18.10-21}$$

whenever the integrals exist.

(b) Ergodic Processes. A (necessarily stationary) random process generating x(t) is ergodic if and only if the probability associated with every stationary subensemble is either 0 or 1. Every ergodic process has the ergodic property: the t average (20) of every measurable function f[x(t_1), x(t_2), ..., x(t_n)] equals its ensemble average (18.9-4) with probability one, i.e.,

$$P[\langle f\rangle = E\{f\}] = 1 \tag{18.10-22}$$

whenever these averages exist. Any one of the functions x(t) will then define the random process uniquely with probability one, e.g., in terms of the characteristic functions (18.9-11) computed from x(t) by means of Eq. (21). Each t average, such as <x>, <x^2>, or R_xx(τ), will then

* The notation \bar{f} is sometimes used instead of <f>, as well as instead of E{f}; but the symbol \bar{f} is preferably reserved for the sample average

$$\bar{f} = \frac{1}{n}({}^1f + {}^2f + \cdots + {}^nf)$$

where {}^kf is the value of f obtained from one of an empirical random sample of n sample functions x(t) = {}^k x(t) (k = 1, 2, ..., n; see also Sec. 19.8-4).
represent, with probability one, a property common to the entire ensemble of functions x(t). Two or more jointly stationary random processes are jointly ergodic if and only if the probability associated with every stationary joint subensemble is either 0 or 1. The ergodic theorem applies to averages computed from sample values of jointly ergodic processes.

18.10-8. Non-ensemble Correlation Functions and Spectral Densities. Given the real or complex functions x(t), y(t) (which may or may not be sample functions of a random process) such that

$$\langle|x(0)|^2\rangle \equiv \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}|x(t)|^2\,dt \qquad \langle|y(0)|^2\rangle \equiv \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}|y(t)|^2\,dt \tag{18.10-23}$$

exist, the t averages

$$\langle x^*(0)x(\tau)\rangle \equiv \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}x^*(t)x(t+\tau)\,dt \equiv \mathsf{R}_{xx}(\tau) \quad \text{[autocorrelation function of } x(t)\text{]} \tag{18.10-24}$$

$$\langle x^*(0)y(\tau)\rangle \equiv \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}x^*(t)y(t+\tau)\,dt \equiv \mathsf{R}_{xy}(\tau) \quad \text{[crosscorrelation function of } x(t) \text{ and } y(t)\text{]} \tag{18.10-25}$$

exist (sans-serif symbols distinguish these non-ensemble quantities from the corresponding ensemble averages). These correlation functions satisfy all the relations listed in Sec. 18.10-2, if each ensemble average (expected value) is replaced by the corresponding t average. Again, the (non-ensemble) power spectral density Ψ_xx(ω) and the cross-spectral density Ψ_xy(ω) are introduced through the Wiener-Khinchine relations

$$\Psi_{xx}(\omega) \equiv \int_{-\infty}^{\infty}\mathsf{R}_{xx}(\tau)e^{-i\omega\tau}\,d\tau \equiv \Psi_{xx}^*(\omega) \qquad \Psi_{xy}(\omega) \equiv \int_{-\infty}^{\infty}\mathsf{R}_{xy}(\tau)e^{-i\omega\tau}\,d\tau \equiv \Psi_{yx}^*(\omega) \tag{18.10-26}$$

If these "individual" spectral densities exist (one formally admits delta-function terms, Sec. 18.10-9), they satisfy relations analogous to those listed in Secs. 18.10-3 to 18.10-5. Alternative non-ensemble spectral densities can be defined in the manner of Sec. 18.10-6.

If x(t), y(t) are sample functions of jointly stationary random processes, then the correlation functions (24), (25) and the spectral densities (26) are random variables whose expected values equal the corresponding ensemble functions whenever they exist. If x(t), y(t) are jointly ergodic, then the correlation functions (24), (25) and the spectral densities (26) are identical to the corresponding ensemble quantities with probability one.
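For an ergodic process, the t averages (24), (25) computed from one long sample function estimate the corresponding ensemble quantities. A hedged sketch using a random-phase sine wave, whose autocorrelation (a^2/2) cos ω_0 τ is known in closed form (the signal and all parameter values are illustrative choices):

```python
import numpy as np

# Time-average autocorrelation  <x(t) x(t + tau)>  from a single sample function
# (Eq. 18.10-24), approximating the T -> infinity limit by a long finite record.
rng = np.random.default_rng(1)
a, omega0, dt = 2.0, 2 * np.pi, 0.01
t = np.arange(0.0, 2000.0, dt)
x = a * np.sin(omega0 * t + rng.uniform(0.0, 2 * np.pi))

def R_t(lag):                                 # lag counted in samples
    return np.mean(x * x) if lag == 0 else np.mean(x[:-lag] * x[lag:])

# Compare with the theoretical (a^2/2) cos(omega0 tau) at tau = 0, 0.25, 0.5:
for lag_tau in (0.0, 0.25, 0.5):
    lag = int(round(lag_tau / dt))
    print(round(R_t(lag), 2), round(a * a / 2 * np.cos(omega0 * lag_tau), 2))
```

The estimate improves as the record length grows; the residual error here comes only from the finite record and the discrete lag grid.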
As an alternative definition, spectral densities are sometimes introduced by the formal relation

$$\Psi_{xy}(\omega) = \lim_{T\to\infty}\frac{1}{2T}a_T^*(\omega)b_T(\omega) \tag{18.10-27a}$$

where a_T(ω) and b_T(ω) are Fourier transforms of the "truncated" functions x_T(t), y_T(t) respectively equal to x(t), y(t) for |t| < T and zero for |t| > T:

$$a_T(\omega) = \int_{-T}^{T}x(t)e^{-i\omega t}\,dt \qquad b_T(\omega) = \int_{-T}^{T}y(t)e^{-i\omega t}\,dt \tag{18.10-27b}$$

The corresponding ensemble spectral density Φ_xy(ω) may then be defined by Φ_xy(ω) = E{Ψ_xy(ω)}, and the Wiener-Khinchine relations (26) follow from Borel's convolution theorem (Table 4.11-1). In general, however, Eq. (27) is valid only if both sides appear in an integral over ω (in particular, spectral densities often contain delta-function terms, Secs. 18.10-9 and 18.11-5; see also Sec. 18.10-10).

18.10-9. Functions with Periodic Components (see also Sec. 18.11-1). Like other t averages, non-ensemble correlation functions and spectra are of interest mainly if they happen to equal the corresponding ensemble quantities with probability one (this is true for all t averages in the case of ergodic processes, Sec. 18.10-7b). When this is true, the single integrals (24), (25) may be easier to compute than the double integrals (4). The ergodic property also permits interpretation of, say, Φ_xx(ω) in terms of the "frequency content" of a single "typical" sample function x(t), since Φ_xx(ω) ≡ Ψ_xx(ω) with probability one.

EXAMPLE: For

$$x(t) = a\cos(\omega_1 t + \alpha_1) \qquad y(t) = a\cos(\omega_2 t + \alpha_2) \tag{18.10-28a}$$

one has

$$\mathsf{R}_{xx}(\tau) = \frac{a^2}{2}\cos\omega_1\tau \qquad \mathsf{R}_{xy}(\tau) = \begin{cases}\dfrac{a^2}{2}\cos(\omega_1\tau + \alpha_2 - \alpha_1) & (\omega_2 = \omega_1)\\[4pt] 0 & \text{otherwise}\end{cases} \tag{18.10-28b}$$
More generally, let x(t) be a real function of bounded variation in every finite interval and such that <|x(0)|^2> exists. Then x(t) can be represented almost everywhere (Sec. 4.6-14b) as the sum of its average value <x(0)> = c_0, a countable set of periodic components, and an aperiodic component* p(t):

$$x(t) = \sum_{k=-\infty}^{\infty}c_k e^{i\omega_k t} + p(t) = c_0 + \sum_{k=1}^{\infty}(a_k\cos\omega_k t + b_k\sin\omega_k t) + p(t) = c_0 + \sum_{k=1}^{\infty}A_k\cos(\omega_k t + \varphi_k) + p(t) \tag{18.10-29a}$$
with ω_0 = 0, ω_k = -ω_{-k} > 0 (k = 1, 2, ...), so that

$$\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}x(t)e^{-i\omega t}\,dt = \begin{cases}c_0 & \text{if } \omega = 0\\ c_k & \text{if } \omega = \omega_k \quad (k = \pm1, \pm2, \ldots)\\ 0 & \text{otherwise}\end{cases}$$

$$\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}x(t)\cos\omega t\,dt = \begin{cases}c_0 & \text{if } \omega = 0\\ \tfrac{1}{2}a_k & \text{if } \omega = \pm\omega_k \quad (k = 1, 2, \ldots)\\ 0 & \text{otherwise}\end{cases}$$

$$\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}x(t)\sin\omega t\,dt = \begin{cases}\pm\tfrac{1}{2}b_k & \text{if } \omega = \pm\omega_k \quad (k = 1, 2, \ldots)\\ 0 & \text{otherwise}\end{cases}$$

$$c_k = \tfrac{1}{2}(a_k - ib_k) = \tfrac{1}{2}A_k e^{i\varphi_k} \qquad (k = 1, 2, \ldots) \tag{18.10-29b}$$

$$\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}p(t)e^{-i\omega t}\,dt = 0 \tag{18.10-29c}$$
Let y(t) be another real function satisfying the same conditions as x(t), so that

$$y(t) = \sum_{k=-\infty}^{\infty}\gamma_k e^{i\omega_k t} + q(t) = \gamma_0 + \sum_{k=1}^{\infty}(\alpha_k\cos\omega_k t + \beta_k\sin\omega_k t) + q(t) = \gamma_0 + \sum_{k=1}^{\infty}B_k\cos(\omega_k t + \psi_k) + q(t) \tag{18.10-30}$$

The set of circular frequencies ω_1, ω_2, ... is understood to include the periodic-component frequencies of both x(t) and y(t). Then
$$\mathsf{R}_{xx}(\tau) = \sum_{k=-\infty}^{\infty}|c_k|^2 e^{i\omega_k\tau} + \mathsf{R}_{pp}(\tau) = c_0^2 + \tfrac{1}{2}\sum_{k=1}^{\infty}A_k^2\cos\omega_k\tau + \mathsf{R}_{pp}(\tau)$$

$$\Psi_{xx}(\omega) = 2\pi\sum_{k=-\infty}^{\infty}|c_k|^2\delta(\omega - \omega_k) + \Psi_{pp}(\omega) = 2\pi c_0^2\delta(\omega) + \frac{\pi}{2}\sum_{k=1}^{\infty}A_k^2[\delta(\omega - \omega_k) + \delta(\omega + \omega_k)] + \Psi_{pp}(\omega) \tag{18.10-31}$$

$$(A_k^2 = a_k^2 + b_k^2,\ k = 1, 2, \ldots)$$

and

$$\mathsf{R}_{xy}(\tau) = \sum_{k=-\infty}^{\infty}c_k^*\gamma_k e^{i\omega_k\tau} + \mathsf{R}_{pq}(\tau) = c_0\gamma_0 + \tfrac{1}{2}\sum_{k=1}^{\infty}[(a_k\alpha_k + b_k\beta_k)\cos\omega_k\tau + (a_k\beta_k - b_k\alpha_k)\sin\omega_k\tau] + \mathsf{R}_{pq}(\tau)$$

$$= c_0\gamma_0 + \tfrac{1}{2}\sum_{k=1}^{\infty}A_kB_k\cos(\omega_k\tau + \psi_k - \varphi_k) + \mathsf{R}_{pq}(\tau) \tag{18.10-32}$$

$$\Psi_{xy}(\omega) = 2\pi\sum_{k=-\infty}^{\infty}c_k^*\gamma_k\delta(\omega - \omega_k) + \Psi_{pq}(\omega)$$
The crosscorrelation function R_xy(τ) measures the "coherence" of x(t) and y(t), or the "serial correlation" between the function values x(t) and y(t + τ) separated by a delay τ. x(t) and y(t) are uncorrelated if and only if R_xy(τ) ≡ 0.

Note: The (real) functions x(t), y(t) belong to a complex unitary vector space with inner product (u, v) ≡ <u*(t)v(t)> (Sec. 14.2-6). Note the useful orthogonality relations

$$\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}e^{i\omega_1 t}e^{-i\omega_2 t}\,dt = \begin{cases}1 & (\omega_2 = \omega_1)\\ 0 & \text{otherwise}\end{cases} \tag{18.10-33}$$

$$\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}\cos(\omega_1 t + \alpha)\cos(\omega_2 t + \beta)\,dt = \begin{cases}\tfrac{1}{2}\cos(\beta - \alpha) & (\omega_2 = \omega_1 \ne 0)\\ 0 & \text{otherwise}\end{cases} \tag{18.10-34}$$

18.10-10. Generalized Fourier Transforms and Integrated Spectra.
(a) To avoid the difficulties associated with delta-function terms in the Fourier transforms and spectral densities of periodic functions, one may introduce the generalized or integrated Fourier transform X_INT(iω) of x(t), defined (to within an additive constant) by

$$X_{INT}(i\omega) \equiv \int_{-\infty}^{\infty}x(t)\,\frac{e^{-i\omega t} - e^{-i\omega_0 t}}{-2\pi i t}\,dt \tag{18.10-35}$$

The corresponding inversion integral is the Stieltjes integral (Sec. 4.6-17)

$$x(t) = \int_{-\infty}^{\infty}e^{i\omega t}\,dX_{INT}(i\omega) \tag{18.10-36}$$

If the Fourier transform X_F(iω) of x(t) exists, then

$$X_{INT}(i\omega) - X_{INT}(i\omega_0) = \int_{\omega_0}^{\omega}X_F(i\omega)\,\frac{d\omega}{2\pi} \qquad X_F(i\omega) = \frac{dX_{INT}(i\omega)}{d(\omega/2\pi)} \tag{18.10-37}$$

If x(t) can be represented as \sum_k c_k e^{i\omega_k t} (this is, in particular, true for periodic functions; see also Sec. 18.11-1), then X_INT(iω) is a step function (Sec. 21.9-1).

(b) The integrated power spectrum Φ_INT(ω) of a stationary or wide-sense stationary random process generating x(t) is the generalized Fourier transform of its autocorrelation function:

$$\Phi_{INT}(\omega) \equiv \int_{-\infty}^{\infty}R_{xx}(\tau)\,\frac{e^{-i\omega\tau} - e^{-i\omega_0\tau}}{-2\pi i\tau}\,d\tau \qquad \text{with} \qquad R_{xx}(\tau) = \int_{-\infty}^{\infty}e^{i\omega\tau}\,d\Phi_{INT}(\omega) \tag{18.10-38}$$
Analogous relations can be written for non-ensemble correlation functions and spectra.

(c) Note the following generalizations of the Wiener-Khinchine relations (9) and (26) for real stationary (or wide-sense stationary) x(t):

$$\Phi_{xx}(\omega) = 2\pi\lim_{\epsilon\to0}\frac{1}{2\epsilon}E\{|X_{INT}[i(\omega + \epsilon)] - X_{INT}[i(\omega - \epsilon)]|^2\} = \lim_{\epsilon\to0}\frac{\pi}{\epsilon}[\Phi_{INT}(\omega + \epsilon) - \Phi_{INT}(\omega - \epsilon)] \tag{18.10-39}$$

wherever Φ_xx(ω) exists and is continuous, and

$$\lim_{\epsilon\to0}\frac{1}{2\epsilon}\int_{-\infty}^{\infty}|X_{INT}[i(\omega + \epsilon)] - X_{INT}[i(\omega - \epsilon)]|^2\cos\omega\tau\,d\omega = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}x(t)x(t + \tau)\,dt \equiv \mathsf{R}_{xx}(\tau) \tag{18.10-40}$$

For τ = 0, Eq. (40) yields Wiener's Quadratic-variation Theorem

$$\lim_{\epsilon\to0}\frac{1}{2\epsilon}\int_{-\infty}^{\infty}|X_{INT}[i(\omega + \epsilon)] - X_{INT}[i(\omega - \epsilon)]|^2\,d\omega = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}|x(t)|^2\,dt \tag{18.10-41}$$

If the non-ensemble power spectral density Ψ_xx(ω) exists, then

$$\Psi_{xx}(\omega) = 2\pi\lim_{\epsilon\to0}\frac{1}{2\epsilon}|X_{INT}[i(\omega + \epsilon)] - X_{INT}[i(\omega - \epsilon)]|^2 \tag{18.10-42}$$

18.11. SPECIAL CLASSES OF RANDOM PROCESSES. EXAMPLES
18.11-1. Processes with Constant and Periodic Sample Functions. (a) Constant Sample Functions (Fig. 18.11-1a). If each sample function x(t) is identically equal to a constant random parameter a with given probability distribution, the latter determines the resulting random process uniquely. The process is stationary, but it is not ergodic. If E{a^2} exists,

$$E\{x(t)\} = E\{a\} \qquad R_{xx}(\tau) = E\{a^2\} \tag{18.11-1a}$$

while

$$\langle x(t)\rangle = a \qquad \mathsf{R}_{xx}(\tau) = a^2 \tag{18.11-1b}$$

(b) Random-phase Sine Waves. Let

$$x(t) = a\sin(\omega t + \alpha) \tag{18.11-2}$$

(Fig. 18.11-1b) where a is a given constant, and the phase angle α is a random variable uniformly distributed between 0 and 2π. The process is stationary and ergodic, with

$$\varphi[x(t)] = \frac{1}{\pi\sqrt{a^2 - x^2}} \quad (|x| < a) \qquad E\{x(t)\} = \langle x(t)\rangle = 0 \qquad R_{xx}(\tau) = \mathsf{R}_{xx}(\tau) = \frac{a^2}{2}\cos\omega\tau \tag{18.11-3}$$
If the amplitude a of the random-phase sine wave is not a constant, but is itself a (positive) random variable independent of α (as in amplitude modulation), the process is stationary but not in general ergodic. Now

$$E\{x(t)\} = 0 \qquad R_{xx}(\tau) = \tfrac{1}{2}E\{a^2\}\cos\omega\tau \tag{18.11-4}$$

If, in particular, the amplitude a has a Rayleigh distribution defined by

$$\varphi_a(a) = \begin{cases}a e^{-a^2/2} & (a > 0)\\ 0 & (a < 0)\end{cases} \tag{18.11-5}$$

(circular normal distribution with σ^2 = 1, Sec. 18.8-7), then the random process is Gaussian (Sec. 18.11-3). If the phase angle α is not uniformly distributed between 0 and 2π, then the process is nonstationary even if the amplitude a is fixed.
Fig. 18.11-1. Sample functions x(t) for five examples of random processes. In Fig. 18.11-1e, x(t) is the sum of the individual pulses a_k v(t - t_k) shown.
(c) More General Periodic Processes (see also Sec. 18.10-9). The random-phase sine wave is a special case of the general random-phase periodic process represented by

$$x(t) = c_0 + \sum_{k=1}^{\infty}[a_k\cos k(\omega_0 t + \alpha) + b_k\sin k(\omega_0 t + \alpha)] \tag{18.11-6}$$

where α is uniformly distributed between 0 and 2π; it is assumed that the series converges in mean square in the sense of Sec. 18.6-3. The
Fig. 18.11-2. Autocorrelation function and power spectrum for a random telegraph wave (a) and a coin-tossing sample-hold process (b) having equal mean count rates α = 1/(2Δt), both with zero mean and mean square a^2. Note that different ω scales are used in (a) and (b). (From G. A. Korn, Random-process Simulation and Measurements, McGraw-Hill, New York, 1966.)
process is stationary and ergodic, with

$$E\{x(t)\} = \langle x(t)\rangle = c_0 \qquad R_{xx}(\tau) = \mathsf{R}_{xx}(\tau) = c_0^2 + \tfrac{1}{2}\sum_{k=1}^{\infty}(a_k^2 + b_k^2)\cos k\omega_0\tau \tag{18.11-7}$$

$$\Phi_{xx}(\omega) \equiv \Psi_{xx}(\omega) = 2\pi c_0^2\delta(\omega) + \frac{\pi}{2}\sum_{k=1}^{\infty}(a_k^2 + b_k^2)[\delta(\omega - k\omega_0) + \delta(\omega + k\omega_0)]$$
A still more general periodic process is defined by the Fourier series

$$x(t) = c_0 + \sum_{k=1}^{\infty}(a_k\cos k\omega_0 t + b_k\sin k\omega_0 t) \tag{18.11-8}$$

with real random coefficients c_0, a_k, b_k, assuming that the series converges in mean square. Such a process is wide-sense stationary if and only if

$$E\{a_k\} = E\{b_k\} = 0 \qquad E\{a_k^2\} = E\{b_k^2\} \qquad E\{c_0 a_k\} = E\{c_0 b_k\} = E\{a_i b_k\} = 0 \qquad E\{a_i a_k\} = E\{b_i b_k\} = 0 \quad (i \ne k) \tag{18.11-9}$$
In this case, Eq. (8) is an orthogonal-series expansion in the sense of Sec. 18.9-5, and

$$E\{x(t)\} = E\{c_0\} \qquad R_{xx}(\tau) = E\{c_0^2\} + \tfrac{1}{2}\sum_{k=1}^{\infty}E\{a_k^2 + b_k^2\}\cos k\omega_0\tau \tag{18.11-10}$$
18.11-2. Band-limited Functions and Processes. Sampling Theorems. (a) A function x(t) is band-limited between ω = 0 and ω = 2πB if and only if its Fourier transform X_F(iω) (Sec. 4.11-3) exists and equals zero for |ω| > 2πB; B (measured in cycles per second if t is measured in seconds) is the bandwidth associated with x(t). For every band-limited x(t),

$$x(t) = \sum_{k=-\infty}^{\infty}x(t_k)\,\frac{\sin 2\pi B(t - t_k)}{2\pi B(t - t_k)} \qquad \Bigl(t_k = \frac{k}{2B};\ k = 0, \pm1, \pm2, \ldots\Bigr) \tag{18.11-11}$$

i.e., x(t) is uniquely determined for all t by samples x(t_k) spaced 1/2B t-units apart (Nyquist-Kotelnikov-Shannon Sampling Theorem). The functions (Fig. 18.11-3)

$$u_k(t) = \sqrt{2B}\,\operatorname{sinc}2B(t - t_k) = \sqrt{2B}\,\frac{\sin 2\pi B(t - t_k)}{2\pi B(t - t_k)} \qquad (k = 0, \pm1, \pm2, \ldots) \tag{18.11-12}$$

Fig. 18.11-3. The sampling function sinc t = (sin πt)/(πt) (see also Table F-21).
constitute a complete orthonormal set for the space of functions x(t) band-limited between ω = 0 and ω = 2πB (Sec. 15.2-4); note

$$x_k = x(t_k) = 2B\int_{-\infty}^{\infty}x(t)\,\operatorname{sinc}2B(t - t_k)\,dt \qquad (k = 0, \pm1, \pm2, \ldots) \tag{18.11-13}$$

("sampling property" of the sinc function), and

$$\int_{-\infty}^{\infty}\operatorname{sinc}(\lambda - i)\,\operatorname{sinc}(\lambda - k)\,d\lambda = 2B\int_{-\infty}^{\infty}\operatorname{sinc}2B(t - t_i)\,\operatorname{sinc}2B(t - t_k)\,dt = \delta_i^k \qquad (i, k = 0, \pm1, \pm2, \ldots) \tag{18.11-14}$$
(b) A stationary or wide-sense stationary random process with sample functions x(t) is band-limited between ω = 0 and ω = 2πB if and only if its ensemble power spectral density Φ_xx(ω) exists and equals zero for |ω| > 2πB. In this case, the expansion (11) applies in the sense of mean-square convergence (Sec. 18.6-3), i.e.,

$$E\Bigl\{\Bigl[x(t) - \sum_{k=-\infty}^{\infty}x_k\,\operatorname{sinc}2B(t - t_k)\Bigr]^2\Bigr\} = 0 \tag{18.11-15}$$

and Eq. (11) represents each sample function x(t) in terms of its sample values x_k = x(k/2B) with probability one.

Note: In the special case of a stationary band-limited "flat-spectrum" process with

$$\Phi_{xx}(\omega) = \begin{cases}\Phi_0 & (|\omega| < 2\pi B)\\ 0 & (|\omega| > 2\pi B)\end{cases} \qquad R_{xx}(\tau) = 2\Phi_0 B\,\frac{\sin 2\pi B\tau}{2\pi B\tau} \tag{18.11-16}$$

the sample values x_k = x(k/2B) have zero mean and are uncorrelated.
18.11-3. Gaussian Random Processes (see also Secs. 18.8-3 to 18.8-8 and 18.12-6). A real random process is Gaussian if and only if all its probability distributions are normal distributions for all t_1, t_2, .... Every Gaussian process is uniquely defined by its (necessarily normal) second-order probability distribution, and hence by the ensemble autocorrelation function R_xx(t_1, t_2) ≡ E{x(t_1)x(t_2)} together with ξ(t) = E{x(t)}. Specifically, the joint distribution of every set of sample values x_1 = x(t_1), x_2 = x(t_2), ..., x_n = x(t_n) is a normal distribution with probability density

$$\varphi = \sqrt{(2\pi)^{-n}\det[\Lambda^{jk}]}\,\exp\Bigl[-\frac{1}{2}\sum_{j=1}^{n}\sum_{k=1}^{n}\Lambda^{jk}(x_j - \mu_j)(x_k - \mu_k)\Bigr] \tag{18.11-17}$$

with

$$\mu_j = E\{x(t_j)\} \qquad \lambda_{jk} = R_{xx}(t_j, t_k) - \mu_j\mu_k \qquad (j, k = 1, 2, \ldots, n) \tag{18.11-18}$$

$$[\Lambda^{jk}] = [\lambda_{jk}]^{-1} \tag{18.11-19}$$
Processes obtained through addition of Gaussian processes and/or linear operations on their sample functions are Gaussian (Sec. 18.12-2). Coefficients in orthogonal-function expansions of a Gaussian process (Sec. 18.9-5) are jointly Gaussian random variables.

18.11-4. Markov Processes and the Poisson Process. (a) Random Process of Order n. A random process of order n is a random process completely specified by its nth (nth-order) distribution function Φ_(n) (Sec. 18.9-2), but not by Φ_(n-1).

(b) Purely Random Processes. A random process described by x(t) is a purely random process if and only if the random variables x(t_1), x(t_2), ... are statistically independent for every finite set t_1, t_2, .... A purely random process is completely specified by Φ_(1)(X_1, t_1), p_(1)(X_1, t_1), or φ_(1)(X_1, t_1).
(c) Markov Processes. A discrete or continuous random process described by x(t) is a (simple) Markov process if and only if, for every finite set t_1 < t_2 < ... < t_{n-1} < t_n,

$$p(X_n, t_n|X_1, t_1;\ \ldots;\ X_{n-1}, t_{n-1}) = p(X_n, t_n|X_{n-1}, t_{n-1}) \tag{18.11-20a}$$

or

$$\varphi(X_n, t_n|X_1, t_1;\ \ldots;\ X_{n-1}, t_{n-1}) = \varphi(X_n, t_n|X_{n-1}, t_{n-1}) \tag{18.11-20b}$$

If x(t_{n-1}) = X_{n-1} is given, knowledge of x(t_{n-2}), x(t_{n-3}), ... contributes nothing to one's knowledge of the distribution of x(t_n). A Markov process is completely specified by its second-order probability distribution and hence by its first-order probability distribution together with the "transition probabilities" given by

$$p(X_2, t_2|x, t) \qquad \text{or} \qquad \varphi(X_2, t_2|x, t) \qquad (t < t_2) \tag{18.11-21}$$

A Markovian random series is often called a Markov chain. Every purely random process is a Markov process. Many physical processes can be described as Markov processes. An important class of problems involves the determination of the functions (21) from their given "initial values" specified for t = t_1. The defining property (20) of a Markov process implies the Chapman-Kolmogorov-Smoluchovski equation

$$p(X_2, t_2|X_1, t_1) = \sum_x p(X_2, t_2|x, t)p(x, t|X_1, t_1) \qquad (t_1 < t < t_2) \tag{18.11-22a}$$

or

$$\varphi(X_2, t_2|X_1, t_1) = \int_{-\infty}^{\infty}\varphi(X_2, t_2|x, t)\varphi(x, t|X_1, t_1)\,dx \qquad (t_1 < t < t_2) \tag{18.11-22b}$$
Equation (22) is a first-order difference equation (Sec. 20.4-3) which may be solved for the unknown function (21) of the independent variable t whenever p(x, t|X_1, t_1) or φ(x, t|X_1, t_1) is suitably given. If p_(1)(X_1, t_1) or φ_(1)(X_1, t_1) is known, the Markov process is now completely determined for all t > t_1.
(d) The Poisson Process. In many problems involving random searches, waiting lines, radioactive decay, etc., x(t) is a discrete random variable capable of assuming the spectral values 0, 1, 2, ... ("counting process"; number of "successes," telephone calls, disintegrations, etc.). A frequently useful model assumes the Markov property (20a) and

$$p(X_2, t_2|x, t) = \begin{cases}0 & \text{if } X_2 < x\\ 1 - \alpha\,\Delta t + o(\Delta t) & \text{if } X_2 = x\\ \alpha\,\Delta t + o(\Delta t) & \text{if } X_2 = x + 1\\ o(\Delta t) & \text{if } X_2 > x + 1\end{cases} \qquad (\Delta t = t_2 - t > 0;\ x, X_2 = 0, 1, 2, \ldots) \tag{18.11-23}$$

where o(Δt) denotes a term such that o(Δt)/Δt becomes negligible as Δt → 0 (Sec. 4.4-3). To find

$$p(X, t|X_1, t_1) \equiv P(K, T) \qquad (K = X - X_1 = 0, 1, 2, \ldots;\ T = t - t_1)$$

substitute the given transition probabilities (23) into the Smoluchovski equation (22a) for t_2 = t + Δt to obtain the difference equation

$$P(K, T + \Delta t) = (1 - \alpha\,\Delta t)P(K, T) + \alpha\,\Delta t\,P(K - 1, T) + o(\Delta t) \qquad (K = 0, 1, 2, \ldots) \tag{18.11-24}$$

with P(-1, T) ≡ 0. As Δt → 0, this reduces to an ordinary differential equation

$$\frac{d}{dT}P(K, T) = -\alpha P(K, T) + \alpha P(K - 1, T) \qquad (K = 0, 1, 2, \ldots) \tag{18.11-25}$$

for each K. These differential equations are solved successively for P(0, T), P(1, T), P(2, T), ..., with initial conditions given by

$$P(0, 0) = p(X_1, t_1|X_1, t_1) = 1 \qquad P(K, 0) = p(X_1 + K, t_1|X_1, t_1) = 0 \quad (K > 0) \tag{18.11-26}$$

It follows that

$$P(K, T) = e^{-\alpha T}\,\frac{(\alpha T)^K}{K!} \qquad (T > 0;\ K = 0, 1, 2, \ldots) \tag{18.11-27}$$

Thus, once the process is started, the number K of state changes in every time interval of length T has the Poisson distribution (Table 18.8-4). α is called the mean count rate of the Poisson process.
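The limiting argument above can be checked by direct simulation of the transition model (23). A hedged sketch; all numerical parameter values are illustrative assumptions:

```python
import numpy as np

# Simulate the counting model (18.11-23) on a fine time grid: each short step of
# length dt yields a state change with probability alpha*dt, so the count K over
# a window of length T should follow the Poisson law (18.11-27).
rng = np.random.default_rng(3)
alpha, T, dt = 2.0, 1.5, 1e-3
n_steps, n_runs = int(T / dt), 20000

K = rng.binomial(n_steps, alpha * dt, size=n_runs)   # state changes in [0, T]

print(round(float(K.mean()), 1))                     # theory: E{K} = alpha*T = 3.0
print(round(float(np.mean(K == 0)), 2))              # theory: P(0,T) = e^{-alpha*T} ~ 0.05
```

Shrinking dt makes the Bernoulli-per-step model converge to the exact Poisson law, which is precisely the Δt → 0 limit taken in Eqs. (24) and (25).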
The probability that no state changes take place is

$$P(0, T) = e^{-\alpha T} \qquad (T > 0) \tag{18.11-28}$$

so that the probability that at least one state change takes place is

$$1 - P(0, T) = 1 - e^{-\alpha T} \qquad (T > 0) \tag{18.11-29}$$

The time interval T_1 between successive state changes is a random variable with probability density

$$\varphi_{T_1}(T_1) = \alpha e^{-\alpha T_1} \qquad (T_1 > 0) \tag{18.11-30}$$

and expected value 1/α.

Within any finite time interval of length T, a Poisson process is also uniquely defined by the joint distribution of the K + 1 statistically independent random variables K, t_1, t_2, ..., t_K, where K is the number of state changes during the time T, and t_1, t_2, ..., t_K are now the respective times of the 1st, 2nd, ..., Kth state change during this time interval. One has

$$p_K(K) = P(K, T) = e^{-\alpha T}\,\frac{(\alpha T)^K}{K!} \qquad (K = 0, 1, 2, \ldots) \tag{18.11-31}$$

(e) See Refs. 18.15, 18.16, and 18.17 for treatments of more general Markov processes.
18.11-5. Some Random Processes Generated by a Poisson Process.
(a) Random Telegraph Wave (Fig. 18.11-1c). x(t) equals either a or -a, with sign changes generated by the state changes of a Poisson process of mean count rate α (Sec. 18.11-4d). The process is stationary and ergodic if started at t = -∞, and

$$E\{x(t)\} = 0 \qquad R_{xx}(\tau) = a^2 e^{-2\alpha|\tau|} \qquad \Phi_{xx}(\omega) = \frac{4a^2\alpha}{\omega^2 + (2\alpha)^2} \tag{18.11-32}$$
(b) Process Generated by Poisson Sampling (Fig. 18.11-1d). x(t) changes value at each state change of a Poisson process with mean count rate α; between state changes, x(t) is constant and takes continuously distributed random values x with given mean ξ and variance σ^2. The process is stationary and ergodic if started at t = -∞, and

$$E\{x(t)\} = \xi \qquad R_{xx}(\tau) = \xi^2 + \sigma^2 e^{-\alpha|\tau|} \qquad \Phi_{xx}(\omega) = 2\pi\xi^2\delta(\omega) + \frac{2\sigma^2\alpha}{\omega^2 + \alpha^2} \tag{18.11-33}$$
(c) Impulse Noise and Campbell's Theorem (Fig. 18.11-1e). x(t) is the sum of many similarly shaped transient pulses,

$$x(t) = \sum_{k=-\infty}^{\infty}a_k v(t - t_k) \tag{18.11-34}$$

whose shape is given by v = v(t), with

$$\int_{-\infty}^{\infty}v(t)e^{-i\omega t}\,dt = V_F(i\omega) \tag{18.11-35}$$

while the pulse amplitude a_k is a random variable with finite variance, and the times t_k are random incidence times determined by the state changes of a Poisson process with mean count rate α. The process is stationary and ergodic if started at t = -∞; it approximates a Gaussian random process if many pulses overlap. One has

$$E\{x(t)\} = \xi = \alpha E\{a_k\}\int_{-\infty}^{\infty}v(t)\,dt \qquad E\{x^2(t)\} = \xi^2 + \alpha E\{a_k^2\}\int_{-\infty}^{\infty}v^2(t)\,dt$$

$$R_{xx}(\tau) = \xi^2 + \alpha E\{a_k^2\}\int_{-\infty}^{\infty}v(t)v(t + \tau)\,dt \qquad \Phi_{xx}(\omega) = 2\pi\xi^2\delta(\omega) + \alpha E\{a_k^2\}|V_F(i\omega)|^2 \tag{18.11-36}$$

In the special case where a_k is a fixed constant, the formulas (36) are known as Campbell's theorem.
18.11-6. Random Processes Generated by Periodic Sampling. Certain measuring devices sample a stationary and ergodic random variable q(t) periodically and then hold their output x(t) for a constant sampling interval Δt. The resulting random process is stationary and ergodic if the timing of the periodic sampling commands is random and uniformly distributed between 0 and Δt. A sample function x(t) will be similar to Fig. 18.11-1d except that state changes must be separated by integral multiples of Δt. If q is a binary random variable capable of assuming only the values a and -a with probabilities 1/2, 1/2, then x(t) will resemble the random telegraph wave of Fig. 18.11-1c, except that state changes are, again, separated by integral multiples of Δt ("coin-tossing" sample-hold process). If different samples of q are statistically independent, then

$$R_{xx}(\tau) = [E\{q\}]^2 = [E\{x\}]^2 = \xi^2 \qquad (|\tau| > \Delta t)$$

$$R_{xx}(\tau) = E\{q^2\}\,\operatorname{Prob}[t, t + \tau \text{ in same sampling interval}] + [E\{q\}]^2\,\operatorname{Prob}[t, t + \tau \text{ not in same sampling interval}] = \operatorname{Var}\{x\}\Bigl(1 - \frac{|\tau|}{\Delta t}\Bigr) + \xi^2 \qquad (|\tau| \le \Delta t) \tag{18.11-37}$$

$$\Phi_{xx}(\omega) = \operatorname{Var}\{x\}\,\Delta t\Bigl[\frac{\sin(\omega\,\Delta t/2)}{\omega\,\Delta t/2}\Bigr]^2 + 2\pi\xi^2\delta(\omega) \tag{18.11-38}$$

Figure 18.11-2 compares R_xx(τ) and Φ_xx(ω) for a random telegraph wave
and a coin-tossing sample-hold process with equal mean count rates α = 1/(2Δt), zero mean, and E{x²} = a².

18.12. OPERATIONS ON RANDOM PROCESSES
18.12-1. Correlation Functions and Spectra of Sums. Let x(t), y(t) be generated by real or complex random processes. For

z(t) = αx(t) + βy(t)    (18.12-1)

with real or complex α, β, the correlation functions R_xz(t₁, t₂), R_zx(t₁, t₂), R_zz(t₁, t₂) are given by

R_xz = αR_xx + βR_xy
R_zx = α*R_xx + β*R_yx
R_zz = |α|²R_xx + |β|²R_yy + α*βR_xy + αβ*R_yx
    (18.12-2)
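The relation (18.12-2) for R_zz is an algebraic identity and can be confirmed at zero lag on short synthetic sequences; the complex coefficients and the mixing constants below are assumed values for the sketch:

```python
import random

random.seed(5)

alpha, beta = 1 + 2j, 2 - 1j     # arbitrary complex coefficients (assumed)
N = 1000

# Two correlated real sequences standing in for sample functions x(t), y(t).
xs = [random.gauss(0, 1) for _ in range(N)]
ys = [0.6 * u + 0.8 * random.gauss(0, 1) for u in xs]
zs = [alpha * u + beta * v for u, v in zip(xs, ys)]

def corr(u, v):
    """Sample correlation at zero lag, an estimate of E{u*(t) v(t)}."""
    return sum(complex(p).conjugate() * q for p, q in zip(u, v)) / len(u)

lhs = corr(zs, zs)
rhs = (abs(alpha) ** 2 * corr(xs, xs) + abs(beta) ** 2 * corr(ys, ys)
       + alpha.conjugate() * beta * corr(xs, ys)
       + alpha * beta.conjugate() * corr(ys, xs))
```

Because z = αx + βy holds sample by sample, lhs and rhs agree to floating-point rounding, not merely statistically.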
These relations also apply to the correlation functions R_xz(τ), R_zx(τ), R_zz(τ) of stationary random processes; the corresponding spectral densities are

Φ_xz = αΦ_xx + βΦ_xy
Φ_zx = α*Φ_xx + β*Φ_yx
Φ_zz = |α|²Φ_xx + |β|²Φ_yy + α*βΦ_xy + αβ*Φ_yx
    (18.12-3)

18.12-2. Input-Output Relations for Linear Systems. (a) Consider a real linear system with real input x(t) and output
y(t) = ∫_{−∞}^{∞} w(t, λ)x(λ) dλ = ∫_{−∞}^{∞} h(t, ζ)x(t − ζ) dζ    (18.12-4)

where the weighting function w(t, λ) (Green's function, Secs. 9.3-3 and 9.4-3) is the system response to a unit-impulse input δ(t − λ) (impulse applied at t = λ), and h(t, ζ) ≡ w(t, t − ζ). In the most important applications, t represents time, and w(t, λ) = 0 for t < λ, since physically realizable systems cannot respond to future inputs (see also Sec. 9.4-3).
(b) If x(t) is generated by a real random process, and if E{x²(t)} and E{y²(t)} exist, then

E{y(t)} = ∫_{−∞}^{∞} w(t, λ)E{x(λ)} dλ    (18.12-5)

R_xy(t₁, t₂) = R_yx(t₂, t₁) = ∫_{−∞}^{∞} w(t₂, λ)R_xx(t₁, λ) dλ
R_yy(t₁, t₂) = ∫_{−∞}^{∞} w(t₂, μ)R_yx(t₁, μ) dμ
    (generalized Wiener-Lee relations)    (18.12-6)

E{y²(t)} = ∫_{−∞}^{∞} w(t, μ)R_yx(t, μ) dμ    (18.12-7)
If x(t) is Gaussian, y(t) is also Gaussian and completely determined by Eqs. (5) to (7).
18.12-3. The Stationary Case. (a) If the input x(t) is stationary, and

w(t, λ) ≡ h(t − λ)    h(t, ζ) ≡ h(ζ)

with

H(iω) = ∫_{−∞}^{∞} h(ζ)e^{−iωζ} dζ    h(ζ) = ∫_{−∞}^{∞} H(iω)e^{iωζ} dω/2π    (18.12-8)

(time-invariant linear system, see also Sec. 9.4-3), then the system output y(t) is also stationary; y(t) will be ergodic (Sec. 18.10-7b) if this is true for x(t). The input-output relations (4) to (7) for real x(t), y(t) reduce to
y(t) = ∫_{−∞}^{∞} h(t − λ)x(λ) dλ = ∫_{−∞}^{∞} h(ζ)x(t − ζ) dζ    (18.12-9)

E{y} = E{x} ∫_{−∞}^{∞} h(ζ) dζ    (18.12-10)

R_xy(τ) = R_yx(−τ) = ∫_{−∞}^{∞} h(τ − λ)R_xx(λ) dλ = ∫_{−∞}^{∞} h(ζ)R_xx(τ − ζ) dζ
R_yy(τ) = ∫_{−∞}^{∞} h(τ − μ)R_yx(μ) dμ = ∫_{−∞}^{∞} h(ζ)R_yx(τ − ζ) dζ
    (Wiener-Lee relations)    (18.12-11)

E{y²} = ∫_{−∞}^{∞} h(ζ)R_xy(ζ) dζ    (18.12-12)
In most applications, physical realizability requires h(ζ) = 0 for ζ < 0 (see also Sec. 9.4-3).
(b) The important input-output relations (11) are greatly simplified if they are expressed in terms of spectral densities (Sec. 18.10-3):

Φ_xy(ω) = H(iω)Φ_xx(ω)
Φ_yx(ω) = H*(iω)Φ_xx(ω)
Φ_yy(ω) = H*(iω)Φ_xy(ω) = H(iω)Φ_yx(ω) = |H(iω)|²Φ_xx(ω)
    (18.12-13)
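A discrete-time sketch of the |H(iω)|² relation in (18.12-13): a complex exponential input is an eigenfunction of a time-invariant linear system, so its spectral component is multiplied by H and its power by |H|². The impulse response below is an assumed example, not from the handbook:

```python
import cmath

h = [0.5, 0.3, 0.2]     # assumed impulse response of a discrete linear system

def H(w):
    """Discrete analogue of H(i*omega) in Eq. (18.12-8): sum of h[k]*exp(-i*w*k)."""
    return sum(hk * cmath.exp(-1j * w * k) for k, hk in enumerate(h))

w, N = 0.7, 40
x = [cmath.exp(1j * w * n) for n in range(N)]                       # exponential input
y = [sum(h[k] * x[n - k] for k in range(len(h))) for n in range(len(h), N)]

# The exponential is an eigenfunction: y[n] = H(exp(i*w)) * x[n], so each
# spectral-density component is scaled by |H|^2, as in Eq. (18.12-13).
gain = y[0] / x[len(h)]
power_gain = abs(gain) ** 2
```

The eigenfunction property is the frequency-domain content of the Wiener-Lee relations: it holds exactly, frequency by frequency, once transients have passed.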
(c) Note also

R_yy(τ) = ∫_{−∞}^{∞} m(λ)R_xx(τ − λ) dλ

with

m(λ) ≡ ∫_{−∞}^{∞} h(μ)h(μ + λ) dμ = ∫_{−∞}^{∞} |H(iω)|² e^{iωλ} dω/2π
    (18.12-14)
In the special case of stationary white-noise input with R_xx(τ) = Φ₀δ(τ) (Sec. 18.11-4b), note

R_xy(τ) ≡ Φ₀h(τ)    (18.12-15)
R_yy(τ) ≡ Φ₀m(τ)    (18.12-16)
E{y²} ≡ Φ₀m(0) = Φ₀ ∫_{−∞}^{∞} h²(ζ) dζ    (18.12-17)
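The white-noise relation (18.12-17) has the discrete analogue E{y²} = Φ₀ Σ_k h_k² when the input samples are independent with variance Φ₀. A Monte Carlo sketch (impulse response and noise level assumed for illustration):

```python
import random

random.seed(11)

h = [0.6, 0.25, 0.1, 0.05]   # assumed impulse response (zero for negative argument)
Phi0 = 2.0                   # white-noise level: i.i.d. input samples with variance Phi0
N = 200000

x = [random.gauss(0, Phi0 ** 0.5) for _ in range(N)]
y = [sum(hk * x[n - k] for k, hk in enumerate(h)) for n in range(len(h), N)]

msq = sum(v * v for v in y) / len(y)
pred = Phi0 * sum(hk * hk for hk in h)   # discrete analogue of Phi0 * integral of h^2
```

The agreement improves like 1/√N; white-noise inputs are a convenient way to measure the "noise bandwidth" Σ h_k² of a filter.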
18.12-4. Relations for t Correlation Functions and Non-ensemble Spectra. The relations (2), (4), and (10) to (17) all hold if each ensemble average, correlation function, and spectral density is replaced by the corresponding t average, t correlation function, and non-ensemble spectral density (Secs. 18.10-7 to 18.10-9), whenever these quantities exist.
18.12-5. Nonlinear Operations. Given a random process generating x(t) and a single-valued, measurable function y = y(x), the functions

y(t) = y[x(t)]    (18.12-18)

represent a new random process produced by a (generally nonlinear) zero-memory operation on x(t); y(x) does not depend explicitly on t. Distributions and ensemble averages of the y process are obtained by the methods of Secs. 18.5-2 and 18.5-4. In particular, the autocorrelation function of the "output" y is, for real variables,
R_yy(t₁, t₂) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} y(x₁)y(x₂)φ(x₁, t₁; x₂, t₂) dx₁ dx₂    (18.12-19)

R_yy(t₁, t₂) = −(1/4π²) ∫_{C₁} ∫_{C₂} M_xx(s₁, s₂)Y(s₁)Y(s₂) ds₁ ds₂

with

Y(s) = ∫_{−∞}^{∞} y(x)e^{−sx} dx    (18.12-20)

where the integration contours C₁, C₂ parallel the imaginary axis in suitable absolute-convergence strips (Ref. 18.15). The "transform method" is especially useful in connection with certain practically important transfer characteristics y(x), e.g., limiters, half-wave detectors, quantizers, etc. (Refs. 18.13 and 18.15).

18.12-6. Nonlinear Operations on Gaussian Processes. (a) Price's Theorem (Ref. 18.17). Given two jointly normal random variables x₁, x₂ with covariance λ₁₂, and a function f(x₁, x₂) such that |f(x₁, x₂)| does not grow too rapidly for the averages in question to exist,

∂ⁿE{f(x₁, x₂)}/∂λ₁₂ⁿ = E{∂²ⁿf(x₁, x₂)/∂x₁ⁿ ∂x₂ⁿ}    (n = 1, 2, . . .)    (18.12-21)
Price's theorem yields ensemble averages (and, in particular, correlation functions) in the form

E{f(x₁, x₂)} = ∫₀^{λ₁₂} E{∂²f(x₁, x₂)/∂x₁ ∂x₂} dλ₁₂ + C    (18.12-22)

where C is the value of E{f(x₁, x₂)} for λ₁₂ = 0, i.e., for uncorrelated x₁, x₂. Price's theorem also leads to the useful recursion formula

E{x₁ⁿx₂ᵐ} = mn ∫₀^{λ₁₂} E{x₁ⁿ⁻¹x₂ᵐ⁻¹} dλ₁₂ + E{x₁ⁿ}E{x₂ᵐ}    (n, m = 1, 2, . . .)    (18.12-23)

In particular,

E{x₁²x₂²} = 2λ₁₂² + 4λ₁₂E{x₁}E{x₂} + E{x₁²}E{x₂²}    (18.12-24)
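Equation (18.12-24) can be checked by Monte Carlo sampling of a jointly normal pair; the means, standard deviations, and correlation coefficient below are assumed values for the sketch:

```python
import random

random.seed(2)

xi1, xi2 = 0.5, -0.3         # means (assumed)
s1, s2, rho = 1.0, 2.0, 0.6  # standard deviations and correlation coefficient
lam = rho * s1 * s2          # covariance lambda_12
N = 200000

acc = 0.0
for _ in range(N):
    u = random.gauss(0, 1)
    v = random.gauss(0, 1)
    x1 = xi1 + s1 * u                                      # jointly normal pair
    x2 = xi2 + s2 * (rho * u + (1 - rho * rho) ** 0.5 * v)
    acc += (x1 * x2) ** 2
lhs = acc / N

rhs = (2 * lam ** 2 + 4 * lam * xi1 * xi2
       + (xi1 ** 2 + s1 ** 2) * (xi2 ** 2 + s2 ** 2))      # Eq. (18.12-24)
```

Setting n = m = 2 in the recursion (18.12-23) and integrating E{x₁x₂} = λ₁₂ + ξ₁ξ₂ reproduces exactly the right-hand side computed here.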
(b) Series Expansion. Given a stationary Gaussian process x(t) with E{x} = 0, R_xx(0) = E{x²} = 1 (so that R_xx(τ) = ρ_xx(τ) is the normalized autocorrelation function), and a zero-memory transfer characteristic y = y(x), one has

E{y} = a₀    R_xy(τ) = a₁ρ_xx(τ)    R_yy(τ) = Σ_{k=0}^{∞} a_k² ρ_xx^k(τ)    (18.12-25)

with

a_k = [1/√(2πk!)] ∫_{−∞}^{∞} y(v) 2^{−k/2} H_k(v/√2) e^{−v²/2} dv    (k = 0, 1, 2, . . .)

where the H_k(v) are the Hermite polynomials defined in Table 21.7-1.
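A Monte Carlo sketch of the expansion (18.12-25) for the cubic characteristic y = x³ (chosen here for illustration): in that case a₁ = 3, a₃ = √6, and all other a_k vanish, so R_yy(τ) = 9ρ(τ) + 6ρ³(τ):

```python
import random

random.seed(4)

rho, N = 0.5, 300000     # assumed correlation between x(t) and x(t + tau)
acc = 0.0
for _ in range(N):
    u = random.gauss(0, 1)                                      # x(t)
    w = rho * u + (1 - rho * rho) ** 0.5 * random.gauss(0, 1)   # x(t + tau)
    acc += (u ** 3) * (w ** 3)                                  # y = x^3 at both times
est = acc / N

pred = 9 * rho + 6 * rho ** 3    # a1^2 * rho + a3^2 * rho^3 with a1 = 3, a3 = sqrt(6)
```

The statement E{x³(t)x³(t + τ)} = 9ρ + 6ρ³ for unit-variance jointly normal samples is a standard consequence of the Hermite-polynomial orthogonality used in the expansion.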
18.13. RELATED TOPICS, REFERENCES, AND BIBLIOGRAPHY

18.13-1. Related Topics. The following topics related to the study of probability theory and random processes are treated in other chapters of this handbook:

Measure, Lebesgue integrals, Stieltjes integrals, Fourier analysis . . . Chap. 4
Construction of mathematical models, abstract spaces, Boolean algebras . . . Chap. 12
Orthogonal-function expansions . . . Chap. 15
Mathematical statistics, random-process measurements and tests . . . Chap. 19
Permutations and combinations . . . Appendix C
18.13-2. References and Bibliography (see also Sec. 19.9-2).
18.1. Arley, N., and K. R. Buch: Introduction to the Theory of Probability and Statistics, Wiley, New York, 1950.
18.2. Burington, R. S., and D. C. May: Handbook of Probability and Statistics, 2d ed., McGraw-Hill, New York, 1967.
18.3. Cramér, H.: Mathematical Methods of Statistics, Princeton, Princeton, N.J., 1951.
18.4. Cramér, H.: The Elements of Probability Theory and Some of Its Applications, Wiley, New York, 1955.
18.5. Feller, W.: An Introduction to Probability Theory and Its Applications, vol. I, 2d ed., Wiley, New York, 1958; vol. II, 1966.
18.6. Gnedenko, B. V.: Theory of Probability, Chelsea, New York, 1962.
18.7. Gnedenko, B. V., and A. I. Khinchine: An Elementary Introduction to the Theory of Probability, Dover, New York, 1961.
18.8. Loève, M. M.: Probability Theory, 3d ed., Van Nostrand, Princeton, N.J., 1963.
18.9. Parzen, E.: Modern Probability Theory and Its Applications, Wiley, New York, 1960.
18.10. Richter, H.: Wahrscheinlichkeitstheorie, 2d ed., Springer, Berlin, 1967.

Random Processes
18.11. Bailey, N. T. J.: The Elements of Stochastic Processes with Applications to the Natural Sciences, Wiley, New York, 1964.
18.12. Bharucha-Reid, A. T.: Elements of the Theory of Markov Processes and Their Applications, McGraw-Hill, New York, 1960.
18.13. Davenport, W. B., Jr., and W. L. Root: Introduction to Random Signals and Noise, McGraw-Hill, New York, 1958.
18.14. Doob, J. L.: Stochastic Processes, Wiley, New York, 1953.
18.15. Middleton, D.: An Introduction to Statistical Communication Theory, McGraw-Hill, New York, 1960.
18.16. Parzen, E.: Stochastic Processes, Holden-Day, San Francisco, 1962.
18.17. Papoulis, A.: Probability, Random Variables, and Stochastic Processes, McGraw-Hill, New York, 1965.
18.18. Rosenblatt, M.: Random Processes, Oxford, New York, 1962.
18.19. Saaty, T. L.: Elements of Queueing Theory with Applications, McGraw-Hill, New York, 1961.
CHAPTER 19

MATHEMATICAL STATISTICS
19.1. Introduction to Statistical Methods
    19.1-1. Statistics
    19.1-2. The Classical Probability Model: Random-sample Statistics. Concept of a Population (Universe)
    19.1-3. Relation between Probability Model and Reality: Estimation and Testing

19.2. Statistical Description. Definition and Computation of Random-sample Statistics
    19.2-1. Statistical Relative Frequencies
        (a) Definition and Basic Properties
        (b) Mean and Variance
    19.2-2. The Distribution of the Sample. Grouped Data
        (a) The Empirical Cumulative Distribution Function
        (b) Class Intervals and Grouped Data
        (c) Sample Fractiles
    19.2-3. Sample Averages
        (a) The Sample Average of x
        (b) The Sample Average of y(x)
    19.2-4. Sample Variances and Moments
        (a) The Sample Variances
        (b) Sample Moments
        (c) Measures of Skewness and Excess
    19.2-5. Simplified Numerical Computation of Sample Averages and Variances. Corrections for Grouping
    19.2-6. The Sample Range

19.3. General-purpose Probability Distributions
    19.3-1. Introduction
    19.3-2. Edgeworth-Kapteyn Representation of Theoretical Distributions
    19.3-3. Gram-Charlier and Edgeworth Series Approximations
    19.3-4. Truncated Normal Distributions and Pareto's Distribution
    19.3-5. Pearson's General-purpose Distributions

19.4. Classical Parameter Estimation
    19.4-1. Properties of Estimates
    19.4-2. Some Properties of Statistics Used as Estimates
    19.4-3. Derivation of Estimates: The Method of Moments
    19.4-4. The Method of Maximum Likelihood
    19.4-5. Other Methods of Estimation

19.5. Sampling Distributions
    19.5-1. Introduction
    19.5-2. Asymptotically Normal Sampling Distributions
    19.5-3. Samples Drawn from Normal Populations. The χ², t, and v² Distributions
    19.5-4. The Distribution of the Sample Range
    19.5-5. Sampling from Finite Populations

19.6. Classical Statistical Tests
    19.6-1. Statistical Hypotheses
    19.6-2. Fixed-sample Tests: Definitions
    19.6-3. Level of Significance. Neyman-Pearson Criteria for Choosing Tests of Simple Hypotheses
    19.6-4. Tests of Significance
    19.6-5. Confidence Regions
    19.6-6. Tests for Comparing Normal Populations. Analysis of Variance
        (a) Pooled-sample Statistics
        (b) Comparison of Normal Populations
        (c) Analysis of Variance
    19.6-7. The χ² Test for Goodness of Fit
    19.6-8. Parameter-free Comparison of Two Populations: The Sign Test
    19.6-9. Generalizations

19.7. Some Statistics, Sampling Distributions, and Tests for Multivariate Distributions
    19.7-1. Introduction
    19.7-2. Statistics Derived from Multivariate Samples
    19.7-3. Estimation of Parameters
    19.7-4. Sampling Distributions for Normal Populations
        (a) Distribution of the Sample Correlation Coefficient
        (b) The r Distribution. Test for Uncorrelated Variables
        (c) Test for Hypothetical Values of the Regression Coefficient
        (d) ν-dimensional Samples
    19.7-5. The Sample Mean-square Contingency. Contingency-table Test for Statistical Independence of Two Random Variables
    19.7-6. Spearman's Rank Correlation. A Nonparametric Test of Statistical Dependence

19.8. Random-process Statistics and Measurements
    19.8-1. Simple Finite-time Averages
    19.8-2. Averaging Filters
    19.8-3. Examples
        (a) Measurement of Mean Value
        (b) Measurement of Mean Square
        (c) Measurement of Correlation Functions
    19.8-4. Sample Averages

19.9. Testing and Estimation with Random Parameters
    19.9-1. Problem Statement
    19.9-2. Bayes Estimation and Tests
    19.9-3. Binary State and Decision Variables: Statistical Tests or Detection
    19.9-4. Estimation (Signal Extraction, Regression)

19.10. Related Topics, References, and Bibliography
    19.10-1. Related Topics
    19.10-2. References and Bibliography
19.1. INTRODUCTION TO STATISTICAL METHODS

19.1-1. Statistics. In the most general sense of the word, statistics is the art of using quantitative empirical data to describe experience and to infer and test propositions (numerical estimates, hypothetical correspondences, predicted results, decisions). More specifically, statistics deals (1) with the statistical description of processes or experiments, and (2) with the induction and testing of corresponding mathematical models involving the probability concept. The relevant portions of probability theory constitute the field of mathematical statistics. These techniques extend the possibility of scientific prediction and rational decisions to many situations where deterministic prediction fails because essential parameters cannot be known or controlled with sufficient accuracy.

Statistical description and probability models apply to physical processes exhibiting the following empirical phenomenon. Even though individual measurements of a physical quantity x cannot be predicted with sufficient accuracy, a suitably determined function y = y(x₁, x₂, . . .) of a set (sample) of repeated measurements x₁, x₂, . . . of x can often be predicted with substantially better accuracy, and the prediction of y may still yield useful decisions. Such a function y of a set of sample values is called a statistic, and the incidence of increased predictability is known as statistical regularity. Statistical regularity, in each individual situation, is an empirical physical law which, like the law of gravity or the induction law, is ultimately derived from experience and not from mathematics. Frequently a statistic can be predicted with increasing accuracy as the size n of the sample (x₁, x₂, . . . , xₙ) increases (physical laws of large numbers). The best-known statistics are statistical relative frequencies (Sec. 19.2-1) and sample averages (Sec. 19.2-3).
19.1-2. The Classical Probability Model: Random-sample Statistics. Concept of a Population (Universe). (a) In an important class of applications, a continuously variable physical quantity (observable) x is regarded as a one-dimensional random variable with the inferred or estimated probability density φ(x); a random sample of size n is a set of sample values (x₁, x₂, . . . , xₙ) obtained in n statistically independent repeated measurements of x. The probability density in the n-dimensional sample space of "sample points" (x₁, x₂, . . . , xₙ) is the likelihood function

L(x₁, x₂, . . . , xₙ) = φ(x₁)φ(x₂) · · · φ(xₙ)    (19.1-1)

Every random-sample statistic defined as a measurable function

y = y(x₁, x₂, . . . , xₙ)

of the sample values is a random variable whose probability distribution (sampling distribution of y) is uniquely determined by the likelihood function, and hence by the distribution of x. Each sampling distribution will, in general, depend on the sample size n. While the assumptions made in this section do not apply to every physical situation, the model is capable of considerable generalization. The distribution of x need not
be continuous (games, quality control). The sample may be of infinite size and may not even be a countable set; the x_k may not have the same probability distribution and may not be statistically independent (random-process theory, Sec. 18.9-2). Finally, x, and thus each x_k, can be replaced by a multidimensional variable (Secs. 19.7-1 to 19.7-6).

(b) As the size n of a random sample increases, many sample statistics converge in probability to corresponding parameters of the theoretical distribution of x; in particular, statistical relative frequencies converge in mean to the corresponding probabilities (Sec. 19.2-1). Thus one considers each sample drawn from an infinite (theoretical) population (universe, ensemble) whose sample distribution (Sec. 19.2-2) is identical with the theoretical probability distribution of x. The probability distribution is then referred to as the population distribution, and its parameters are population parameters. In many applications, the theoretical population is an idealization of an actual population from which samples are drawn.
19.1-3. Relation between Probability Model and Reality: Estimation and Testing. (a) Estimation of Parameters. Statistical methods use empirical data (sample values) to infer specifications of a probability model, e.g., to estimate the probability density φ(x). In most applications, statistical relative frequencies (Sec. 19.2-1) are used directly only for rough qualitative (graphical) estimates of the population distribution. Instead, one infers (postulates) the general form of the theoretical distribution, say

φ(x) = φ(x; η₁, η₂, . . .)    (19.1-2)

where η₁, η₂, . . . are unknown population parameters to be estimated on the basis of the given random sample (x₁, x₂, . . . , xₙ). Sections 19.3-1 to 19.3-5 list a number of "general-purpose" frequency functions (2) to be chosen in accordance with the physical background, the form of the sample distribution, and convenience in computations.

The parameters η₁, η₂, . . . usually measure specific properties of the theoretical distribution of x (e.g., population mean, population variance, skewness; see also Table 18.3-1). In general, one attempts to estimate values of the parameters η₁, η₂, . . . "fitting" a given sample (x₁, x₂, . . . , xₙ) by the empirical values of corresponding sample statistics y₁(x₁, x₂, . . . , xₙ), y₂(x₁, x₂, . . . , xₙ), . . . which measure analogous properties of the sample (e.g., sample average, sample variance, Secs. 19.2-3 to 19.2-6). "Fitting" is interpreted subjectively and not necessarily uniquely; in particular, one prefers estimates y(x₁, x₂, . . . , xₙ) which converge in probability to η as n → ∞ (consistent estimates),
whose expected value equals η (unbiased estimates), whose sampling distribution has a small variance, and/or which are easy to compute (Secs. 19.4-1 to 19.4-5).
(b) Testing Statistical Hypotheses. Tests of a statistical hypothesis specifying some property of a theoretical distribution (say, an inferred set of parameter values η₁, η₂, . . .) are based on the likelihood (1) of a test sample (x₁, x₂, . . . , xₙ) when the hypothetical probability density (2) is used to compute L(x₁, x₂, . . . , xₙ). Generally speaking, the test will reject the hypothesis if the test sample (x₁, x₂, . . . , xₙ) falls into a region of small likelihood, or, equivalently, if the corresponding value of a test statistic y(x₁, x₂, . . . , xₙ) is improbable on the basis of the hypothetical likelihood function. The choice of specific conditions of rejection is again subjective and is ultimately based on the penalties of false rejection and/or acceptance and, to some extent, on the cost of obtaining test samples of various sizes (Secs. 19.6-1 to 19.6-9).

Nonparametric tests test hypothetical distribution properties other than parameter values (e.g., identity of two distributions, statistical independence of two random variables, Secs. 19.6-8 and 19.7-6) and are particularly convenient in suitable applications, since no specific form (2) of the population distribution need be inferred.
Note: Incorrect use of statistical methods can lead to grave errors and seriously wrong conclusions. All (possibly tacit) assumptions regarding a theoretical distribution must be checked. Never use the same sample for estimation and testing. Finally, remember that statistical tests cannot prove any hypothesis; they can only demonstrate a "lack of disproof."
19.2. STATISTICAL DESCRIPTION. DEFINITION AND COMPUTATION OF RANDOM-SAMPLE STATISTICS
19.2-1. Statistical Relative Frequencies. (a) Definition and Basic Properties. Consider an event E which occurs if and only if a measurement of the random variable x yields a value in some measurable set S_E (usually a class interval, Sec. 19.2-2b). Given a random sample (x₁, x₂, . . . , xₙ) of x, let n_E denote the number of times a sample value x_k implies the occurrence of the event E. The statistical relative frequency of the event E obtained from the given random sample is

h[E] = n_E/n    (19.2-1)

where n is the size of the sample.

The definition of statistical relative frequencies implies the existence of an event algebra (Sec. 18.2-1) for the experiment or observation in question. The defining
properties of mathematical probabilities (Sec. 18.2-2) are abstracted from corresponding properties of statistical relative frequencies. Thus the relative frequencies of mutually exclusive events add, the relative frequency of a certain event is 1, etc.
(b) Mean and Variance. Since the random sample may be regarded as a set of n Bernoulli trials (Sec. 18.7-3) which do or do not yield E, the random variable n_E has a binomial distribution (Table 18.8-3), where ϑ = P[E] = ∫_{S_E} dΦ(x) is the probability associated with the event E, and

E{h[E]} = P[E]    Var{h[E]} = P[E]{1 − P[E]}/n    (19.2-2)

The statistical relative frequency h[E] is an unbiased, consistent estimate of the corresponding probability P[E]; as n → ∞, h[E] is asymptotically normal with the parameters (2) (Secs. 18.6-4 and 18.6-5a).
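A quick Monte Carlo check of the moments (19.2-2); the event probability P[E] and the sample size below are assumed values:

```python
import random

random.seed(10)

p, n, trials = 0.3, 50, 20000   # assumed P[E], sample size, number of repeated samples
hs = []
for _ in range(trials):
    nE = sum(1 for _ in range(n) if random.random() < p)
    hs.append(nE / n)            # h[E] = n_E / n, Eq. (19.2-1)

mean_h = sum(hs) / trials
var_h = sum((h - mean_h) ** 2 for h in hs) / trials
# Expected: E{h} = p = 0.3 and Var{h} = p(1 - p)/n = 0.0042
```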
19.2-2. The Distribution of the Sample. Grouped Data. (a) The Empirical Cumulative Distribution Function. For a given random sample (x₁, x₂, . . . , xₙ), the empirical cumulative distribution function

F(X) = h[x ≤ X]    (19.2-3)

is a nondecreasing step function, with F(−∞) = 0, F(+∞) = 1. F(X) is an unbiased, consistent estimate of the cumulative distribution function Φ(X) = P[x ≤ X] (Sec. 18.2-9) and defines the distribution (frequency distribution) of the sample (empirical distribution based on the given sample).
(b) Class Intervals and Grouped Data. Let the range of the random variable x be partitioned into a finite or infinite number of conveniently chosen class intervals (cells) X_j − ΔX_j/2 < x ≤ X_j + ΔX_j/2 (j = 1, 2, . . .), respectively of length ΔX₁, ΔX₂, . . . and centered at x = X₁ < X₂ < · · · . For a given random sample, the class frequency (occupation number) n_j is the number of times an x_k falls into the jth class interval (description of the sample in terms of grouped data). The statistical relative frequencies h_j = n_j/n (relative frequencies of observations in the jth class interval) must add up to unity and are consistent, unbiased estimates of the corresponding probabilities

p_j = ∫_{X_j−ΔX_j/2}^{X_j+ΔX_j/2} dΦ(x)

The cumulative frequencies N_j and the cumulative relative frequencies F_j are defined by

N_j = n₁ + n₂ + · · · + n_j    F_j = N_j/n = h₁ + h₂ + · · · + h_j    (19.2-4)
Sample statistics can be calculated from the statistical relative frequencies h_j = n_j/n just as corresponding population parameters are calculated from probabilities. The roundoff implicit in numerical computation of statistics groups data with a class-interval width equal to one least-significant digit. Grouping into much larger class intervals may be economically advantageous, since "quantization errors" due to grouping very often average out or are easily corrected (Sec. 19.2-5 and Ref. 19.25). The statistics F(X), n_j, h_j, N_j, and F_j also yield various graphical representations of sample distributions and hence of estimated population distributions (bar charts, histograms, frequency polygons, probability graph paper, etc.; see Refs. 19.1 and 19.8).

(c) Sample Fractiles (see also Table 18.3-1 and Sec. 19.5-2b). The sample P fractiles (sample quantiles) X_P are defined by

h[x ≤ X_P] = F(X_P) = P    (0 < P < 1)    (19.2-5)

Equation (5) does not define X_P uniquely but brackets it by two adjacent sample values x_k. X_{1/2} is the sample median, and X_{1/4}, X_{1/2}, X_{3/4} are sample quartiles, with analogous definitions for sample deciles and sample percentiles.
19.2-3. Sample Averages (see also Secs. 18.3-3, 19.2-5, and 19.5-3). (a) The Sample Average of x. Given a random sample (x₁, x₂, . . . , xₙ), the sample average of x is

x̄ = (1/n)(x₁ + x₂ + · · · + xₙ) = (1/n) Σ_{k=1}^{n} x_k    (19.2-6)

In terms of the sample distribution over a set of class intervals centered at x = X₁, X₂, . . . , X_m (Sec. 19.2-2), x̄ is approximated by

x̄_G = (1/n)(n₁X₁ + n₂X₂ + · · · + n_mX_m) = (1/n) Σ_{j=1}^{m} n_jX_j    (sample average from grouped data)    (19.2-7)

x̄ is a measure of location of the sample distribution. Note

E{x̄} = ξ    Var{x̄} = σ²/n    (19.2-8)

E{(x̄ − ξ)⁴} = [3(n − 1)σ⁴ + μ₄]/n³    (19.2-9)

whenever the quantity on the right exists. x̄ is an unbiased, consistent estimate of the population mean ξ = E{x}; if σ² exists, x̄ is asymptotically normal with the parameters (8) as n → ∞ (Secs. 18.6-4 and 19.5-2).
(b) The Sample Average of y(x). The sample average of a function y(x) of the random variable x is

ȳ = (1/n)[y(x₁) + y(x₂) + · · · + y(xₙ)] = (1/n) Σ_{k=1}^{n} y(x_k)    (19.2-10)

or, from grouped data,

ȳ_G = (1/n) Σ_{j=1}^{m} n_j y(X_j)    (19.2-11)

Estimates based on Eq. (11) are sometimes improved by correction terms (Sec. 19.2-5).
19.2-4. Sample Variances and Moments (see also Secs. 18.3-3, 18.3-7, 19.2-5, 19.4-2, and 19.5-2). (a) The Sample Variances. The sample variances

s² = (1/n) Σ_{k=1}^{n} (x_k − x̄)²    (19.2-12)

S² = ns²/(n − 1) = [1/(n − 1)] Σ_{k=1}^{n} (x_k − x̄)²

are measures of dispersion of the sample distribution; s is called the sample standard deviation or sample dispersion. Note

E{s²} = (n − 1)σ²/n    E{S²} = σ²    (19.2-13)

Var{S²} = (1/n)[μ₄ − (n − 3)σ⁴/(n − 1)]    (19.2-14)

whenever the quantity on the right exists. S² is an unbiased, consistent estimate of the population variance σ² = Var{x} and is thus often more useful than s².
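The bias relations (19.2-13) are easily checked by simulation; the sketch below draws repeated samples from a normal population with assumed variance σ² = 4 and sample size n = 5:

```python
import random

random.seed(8)

n, trials, sigma2 = 5, 40000, 4.0
sum_s2 = sum_S2 = 0.0
for _ in range(trials):
    xs = [random.gauss(0, sigma2 ** 0.5) for _ in range(n)]
    xbar = sum(xs) / n
    ss = sum((v - xbar) ** 2 for v in xs)
    sum_s2 += ss / n          # s^2, Eq. (19.2-12)
    sum_S2 += ss / (n - 1)    # S^2, the unbiased variant
Es2 = sum_s2 / trials         # should approach (n-1)*sigma2/n = 3.2
ES2 = sum_S2 / trials         # should approach sigma2 = 4
```

The bias of s² is substantial for small n (here 20 per cent), which is why S² is preferred as an estimate of σ².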
(b) Sample Moments. The sample moments a_r and sample central moments m_r of order r are defined by

a_r = (1/n) Σ_{k=1}^{n} x_kʳ    m_r = (1/n) Σ_{k=1}^{n} (x_k − x̄)ʳ    (19.2-15)
Note

E{a_r} = α_r    Var{a_r} = (α_{2r} − α_r²)/n    (19.2-16)

E{m_r} = μ_r + O(1/n)
Var{m_r} = (1/n)(μ_{2r} − 2rμ_{r−1}μ_{r+1} − μ_r² + r²μ₂μ_{r−1}²) + o(1/n)    (19.2-17)

whenever the quantity on the right exists. a_r is an unbiased, consistent estimate of the corresponding population moment α_r = E{xʳ}. If α_{2r} exists, a_r is asymptotically normal with the parameters (16) as n → ∞. m_r is a consistent (but not unbiased) estimate of μ_r. The random variables

n²m₃/[(n − 1)(n − 2)]    and    [n(n² − 2n + 3)m₄ − 3n(2n − 3)m₂²]/[(n − 1)(n − 2)(n − 3)]

are, respectively, consistent, unbiased estimates of μ₃ and μ₄.

(c) Measures of Skewness and Excess (see also Table 18.3-1 and Sec. 19.3-5). The statistics

g₁ = m₃/s³    g₂ = (m₄/s⁴) − 3    (19.2-18)

respectively measure the skewness and the excess (curtosis, flatness) of the sample and are consistent estimates of the corresponding population parameters (see also Sec. 19.4-2). Roughly speaking, g₁ > 0 indicates a longer "tail" to the right. Some authors introduce g₁² and g₂ + 3 or (g₂ + 3)/2 instead of g₁ and g₂. Several other measures of skewness have been used (Ref. 19.1).

19.2-5. Simplified Numerical Computation of Sample Averages and Variances. Corrections for Grouping (see Refs. 19.3 and 19.8 for calculation-sheet layouts). (a) For numerical computations, it is convenient to choose a computing
origin X₀ near the center of the sample distribution ("guessed mean") and to compute

x̄ = X₀ + (1/n) Σ_{k=1}^{n} (x_k − X₀)    (19.2-19)

or, for grouped data,

x̄_G = X₀ + (1/n) Σ_{j=1}^{m} n_j(X_j − X₀)    (19.2-20)

(b) The sample variances s² and S² may be computed from

s² = (n − 1)S²/n = (1/n) Σ_{k=1}^{n} x_k² − x̄²    (19.2-21)
which is approximated for grouped data by

s_G² = (n − 1)S_G²/n = (1/n) Σ_{j=1}^{m} n_jX_j² − x̄_G²    (19.2-22)
(c) Computations with grouped data are simplified if all class intervals are of equal length ΔX, and if one of the class-interval mid-points X_j = X₀ is taken to be the computing origin, so that

X_j = X₀ + Y_j ΔX    (19.2-23)

where the Y_j are "coded" class-interval centers which take integral values 0, ±1, ±2, . . . . One has then

x̄_G = X₀ + ȳ_G ΔX    ȳ_G = (1/n) Σ_{j=1}^{m} n_jY_j    (19.2-24)

s_G² = (n − 1)S_G²/n = [(1/n) Σ_{j=1}^{m} n_jY_j² − ȳ_G²](ΔX)²    (19.2-25)

One may check computations by introducing a different computing origin, say X₀ = 0.
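A sketch of the coded computation (19.2-24) and (19.2-25) for an assumed small grouped-data table, checked against direct computation on the uncoded class centers:

```python
# Grouped data: class centers X_j = X0 + Y_j*dX with coded integers Y_j.
X0, dX = 10.0, 2.0                 # assumed computing origin and class-interval width
Y = [-2, -1, 0, 1, 2]              # coded class-interval centers
nj = [3, 7, 12, 6, 2]              # assumed class frequencies n_j
n = sum(nj)

ybarG = sum(nn * y for nn, y in zip(nj, Y)) / n
xbarG = X0 + ybarG * dX                                                   # Eq. (19.2-24)
sG2 = (sum(nn * y * y for nn, y in zip(nj, Y)) / n - ybarG ** 2) * dX ** 2  # Eq. (19.2-25)

# Direct computation on the uncoded class centers for comparison:
data = [X0 + y * dX for y, nn in zip(Y, nj) for _ in range(nn)]
mean = sum(data) / n
var = sum((v - mean) ** 2 for v in data) / n
```

The coded and direct results agree exactly; the coding merely replaces large class-center values by small integers, which was the point of the hand-computation scheme.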
(d) Sheppard's Corrections for Grouping. Let all class intervals be of equal length ΔX. Then, if the characteristic function (Sec. 18.3-8) χ_x(q) and its derivatives are small for |q| > 2π/ΔX, one may improve the grouped-data approximation s_G² to the true sample variance s² by adding Sheppard's correction −(ΔX)²/12. Analogous corrections for the grouped-data sample moments

a_{rG} = (1/n) Σ_{j=1}^{m} n_jX_jʳ    m_{rG} = (1/n) Σ_{j=1}^{m} n_j(X_j − x̄_G)ʳ    (19.2-26)
yield the improved estimates

a′₁ = a_{1G}
a′₂ = a_{2G} − (1/12)(ΔX)²
a′₃ = a_{3G} − (1/4)a_{1G}(ΔX)²
a′₄ = a_{4G} − (1/2)a_{2G}(ΔX)² + (7/240)(ΔX)⁴

m′₂ = m_{2G} − (1/12)(ΔX)²
m′₃ = m_{3G}
m′₄ = m_{4G} − (1/2)m_{2G}(ΔX)² + (7/240)(ΔX)⁴
    (19.2-27)

More generally,

a′_r = Σ_{k=0}^{r} [r!/(k!(r − k)!)](2^{1−k} − 1)B_k a_{(r−k)G}(ΔX)ᵏ    (r = 1, 2, . . .)    (19.2-28)

where the B_k are the Bernoulli numbers (Sec. 21.5-2).
These corrections become exact if χ_x(q) = 0 for |q| > 2π/ΔX − ε (ε > 0). For normal variables with ΔX ≤ 2σ, a′₁ = α₁ within 2.3 × 10⁻⁸ ΔX, and −(ΔX)²/12 is within 3.1 × 10⁻⁴σ² of the exact correction if ξ = 0.
Note: Sheppard's corrections often yield useful estimates of errors due to the use of rounded-off sample values in the exact formulas (12) and (15). Thus, if Sheppard's correction applies, a mean round-off error ΔX/2 in the x_k affects s² only as (ΔX)²/12.
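Sheppard's correction can be illustrated by grouping (rounding) a normal sample and comparing the grouped variance, with and without the −(ΔX)²/12 correction, to the exact sample variance; the class-interval width below is assumed:

```python
import random

random.seed(6)

dX = 0.5                                       # assumed class-interval width
xs = [random.gauss(0, 1) for _ in range(200000)]

def var(vals):
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)

s2 = var(xs)                                   # exact sample variance
grouped = [round(v / dX) * dX for v in xs]     # grouping = rounding to class centers
s2G = var(grouped)
corrected = s2G - dX ** 2 / 12                 # Sheppard's correction, Eq. (19.2-27)
```

The grouped variance overshoots by almost exactly (ΔX)²/12, and the corrected value recovers the exact sample variance to within sampling noise.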
19.2-6. The Sample Range (see also Sec. 19.5-4). The sample range w for a random sample (x₁, x₂, . . . , xₙ) is the difference of the largest and the smallest sample value x_k. The sample range has physical significance (quality control) and serves as a rough but conveniently calculated estimate of population parameters for specific theoretical distributions (Sec. 19.5-4). The sample range and the smallest and largest sample values are examples of rank (order) statistics.

19.3. GENERAL-PURPOSE PROBABILITY DISTRIBUTIONS
19.3-1. Introduction. The probability distributions described in Secs. 18.8-1 to 18.8-9 and, in particular, the normal, binomial, hypergeometric, and Poisson distributions can often serve as theoretical population distributions in statistical models. The applicability of a particular type of probability distribution (with suitably fitted parameters) may be inferred either theoretically or from the graph of an empirical distribution.

Normal distributions are particularly convenient. Each normal distribution is completely defined by its first- and second-order moments; again, the use of normal populations favors the computation of exact sampling distributions for use in statistical tests (Secs. 19.6-4 and 19.6-6). The use of a normal distribution is frequently justified by the central limit theorem (Sec. 18.6-5); in particular, errors in measurements are often regarded as normally distributed sums of many independent "elementary errors."
19.3-2. Edgeworth-Kapteyn Representation of Theoretical Distributions.* It is often desirable to fit the empirical distribution of a random variable x with a probability distribution described by

Φ_x(x) = Φ_u[(g(x) − μ)/σ_g]    φ_x(x) = (dg/dx)[1/(σ_g√(2π))] e^{−[g(x)−μ]²/2σ_g²}    (19.3-1)

where g(x) is a function of x selected so as to be normally distributed with parameters (μ, σ_g²). Once g(x) has been chosen (e.g., from theoretical considerations), only two parameters, μ and σ_g, remain to be estimated. A distribution of the form (1) arises, in particular, as the limiting distribution of a sequence of random variables y₁ = y₀ + z₁h(y₀), y₂ = y₁ + z₂h(y₁), . . . , where z₁, z₂, . . . are random variables satisfying the conditions of Sec. 18.6-5c, and

z₁ + z₂ + · · · ≈ ∫ˣ dy/h(y) = g(x)

The z₁, z₂, . . . may be considered as "reaction intensities" in a physical process which successively generates y₁, y₂, . . . .

EXAMPLES: (1) h(y) = constant yields a normal distribution for x. (2) h(y) = y − a results in a logarithmic normal distribution defined by

φ(x) = 0    (x ≤ a)
φ(x) = [1/((x − a)σ_g√(2π))] e^{−[log (x−a) − μ]²/2σ_g²}    (x > a)
    (19.3-2)

* See footnote to Sec. 18.3-4.
19.3-3. Gram-Charlier and Edgeworth Series Approximations. It is frequently convenient to approximate the frequency function of a standardized random variable x = (z − ξ)/√(Var{z}) (Sec. 18.5-5) in the form

φ_x(x) ≈ φ_u(x) − (γ₁/3!)φ_u⁽³⁾(x) + [(γ₂/4!)φ_u⁽⁴⁾(x) + (10γ₁²/6!)φ_u⁽⁶⁾(x)]    (19.3-3)

where the parameters μ₃, μ₄, γ₁, γ₂ refer to the theoretical distribution of z (Sec. 18.3-7b and Table 18.3-1). An analogous expression for the distribution function Φ_x(x) is obtained by substitution of Φ_u⁽ᵏ⁾(x) for φ_u⁽ᵏ⁾(x). See also Table 18.5-1.

For a rather restricted class of distributions, the approximation (3) comprises the first terms of the orthogonal expansion (Sec. 15.2-6)

φ_x(x) = φ_u(x) + (c₃/3!)φ_u⁽³⁾(x) + (c₄/4!)φ_u⁽⁴⁾(x) + · · ·    (19.3-4a)
Φ_x(x) = Φ_u(x) + (c₃/3!)Φ_u⁽³⁾(x) + (c₄/4!)Φ_u⁽⁴⁾(x) + · · ·    (19.3-4b)

with

c_k = (−1/√2)ᵏ ∫_{−∞}^{∞} H_k(x/√2)φ_x(x) dx    (19.3-4c)

(Gram-Charlier Type A series), where H_k(x) is the kth Hermite polynomial (Sec. 21.7-1). The series (4a) converges to φ_x(x) if μ₁, μ₂, . . . exist and ∫_{−∞}^{∞} e^{x²/4} dΦ_x(x) converges; the series (4b) will then converge to Φ_x(x).
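The structure of the approximation (19.3-3) can be checked numerically: since the correction terms are derivatives of φ_u(x), they integrate to zero and leave the mean and variance of the standardized variable unchanged, while the third moment becomes γ₁. The γ₁, γ₂ values below are assumed:

```python
import math

def phi(x):
    """Standard normal density phi_u(x)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def dphi(k, x):
    """k-th derivative of phi_u: (-1)^k He_k(x) phi_u(x), He_k probabilists' Hermite."""
    He = {3: x ** 3 - 3 * x,
          4: x ** 4 - 6 * x ** 2 + 3,
          6: x ** 6 - 15 * x ** 4 + 45 * x ** 2 - 15}
    return (-1) ** k * He[k] * phi(x)

g1, g2 = 0.5, 0.3   # assumed skewness and excess

def f(x):
    """Edgeworth approximation, Eq. (19.3-3)."""
    return (phi(x) - g1 / 6 * dphi(3, x)
            + g2 / 24 * dphi(4, x) + 10 * g1 ** 2 / 720 * dphi(6, x))

# Numerical moments on a fine grid (the integrand is negligible beyond |x| = 8).
dx = 0.001
xs = [-8 + i * dx for i in range(int(16 / dx))]
m0 = sum(f(x) for x in xs) * dx          # total probability -> 1
m1 = sum(x * f(x) for x in xs) * dx      # mean -> 0
m2 = sum(x * x * f(x) for x in xs) * dx  # second moment -> 1
m3 = sum(x ** 3 * f(x) for x in xs) * dx # third moment -> g1
```

This shows why the series is organized in derivatives of the normal density: each added term adjusts exactly one standardized moment without disturbing the lower ones.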
19.3-4. Truncated Normal Distributions and Pareto's Distribution. (a) A normal distribution (Sec. 18.8-3) with parameters μ, σ² truncated on the left at x = ξ_a has the frequency function

φ(x) = 0    (x < ξ_a)
φ(x) = [1/(1 − η₀)][1/(σ√(2π))] e^{−(x−μ)²/2σ²}    (x > ξ_a)
    (19.3-5)

where η₀ (degree of truncation) is the fraction of the original population removed by the truncation. If ξ_a is known, one may use

α₁ = ξ = E{x} = μ + σ²φ(ξ_a)
α₂ = E{x²} = μ² + σ² + σ²φ(ξ_a)(ξ_a + μ)
    (19.3-6)
to estimate n and
^=^ (* )a+1 with
*(x)=* - ay {x >*•>• °c>o) (i9-3-?)
£=E\x] =—^ £a
(a >i)
^ = Za2
(19.3-8)
19.3-5. Pearson's General-purpose Distributions. The frequency functions φ(x) defined by

dφ/dx = [(x + η1)/(η2 + η3x + η4x²)] φ(x)    (19.3-9)

form a family whose four parameters define each distribution completely. Each parameter ηk can be estimated as a function of the first four moments αr (Sec. 19.4-3), but the computation of sampling distributions is, at best, difficult. The distributions defined by Eq. (9) can be classified according to the nature of the roots of η2 + η3x + η4x² = 0 and include most of the continuous distributions described in Table 18.8-8, Sec. 18.8-3, and Sec. 19.5-3 as special cases (see also Ref. 19.7). For Pearson's distributions, Pearson's measure of skewness (Table 18.3-1) is

(ξ − ξmode)/σ = γ1(γ2 + 6)/[2(5γ2 − 6γ1² + 6)]    (19.3-10)

so that for small γ1, γ2 one has ξmode ≈ ξ − ½γ1σ.
19.4-1. Properties of Estimates (see also Sec. 19.1-3). (a) A sample statistic y(x1, x2, . . . , xn) is a consistent estimate of the theoretical population parameter η if and only if y converges to η in probability as the sample size n increases, i.e., if and only if the probability of realizing any finite deviation |y − η| converges to zero as n → ∞ (Sec. 18.6-1).
(b) The bias of an estimate y for the parameter η is the difference b(η) = E{y} − η. y is an unbiased estimate of η if and only if E{y} = η for all values of η.
(c) Asymptotically Efficient Estimates and Efficient Estimates. It is desirable to employ estimates whose sampling distributions cluster as densely as possible about the desired parameter value, i.e., estimates with small variance Var{y}, or small standard deviation (standard error of the estimate) √(Var{y}). For the important special class of estimates y(x1, x2, . . . , xn) whose sampling distributions are asymptotically normal with mean η and variance constant/n = λ/n (see also
Sec. 19.5-2),

λ = lim_{n→∞} n Var{y} ≥ λmin = 1/E{[∂ log_e φ(x; η)/∂η]²}    (19.4-1)

The asymptotic efficiency e∞{y} = λmin/λ of such an estimate y measures the concentration of the asymptotic sampling distribution about the parameter η; y is an asymptotically efficient estimate of the parameter η if and only if λ = λmin. Every asymptotically efficient estimate is consistent.
More generally, the "relative efficiency" of the estimate y(x1, x2, . . . , xn) of η from a given sample size n is measured by the reciprocal of the mean-square deviation E{(y − η)²}. Under quite general conditions (Ref. 19.4, Chap. 32), the mean-square deviations E{(y − η)²} of the various possible estimates y of a given parameter η have a lower bound given by

E{(y − η)²} ≥ [1 + ∂b/∂η]² / (n E{[∂ log_e φ(x; η)/∂η]²})    (19.4-2)

(Cramér-Rao Inequality). For unbiased estimates y, Eq. (2) reduces to

Var{y} ≥ λmin/n = 1/(n E{[∂ log_e φ(x; η)/∂η]²})    (19.4-3)

The efficiency e{y} = (λmin/n)/Var{y} of an unbiased estimate satisfying Eq. (3) measures the concentration of the sampling distribution about η = E{y}. lim_{n→∞} e{y} is again called the asymptotic efficiency, if this quantity exists. A (necessarily unbiased and consistent) estimate y is an efficient estimate of η if and only if Var{y} exists and equals the lower bound λmin/n.
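For a normal population with known σ, E{[∂ log_e φ/∂η]²} = 1/σ², so the bound (19.4-3) for unbiased estimates of the mean is σ²/n, and the sample mean attains it. A simulation sketch (helper names and numbers are mine, not the handbook's):

```python
import random
import statistics

def var_of_sample_mean(eta, sigma, n, trials, rng):
    """Observed sampling variance of the sample mean over many samples of size n."""
    means = [statistics.fmean(rng.gauss(eta, sigma) for _ in range(n))
             for _ in range(trials)]
    return statistics.pvariance(means)

sigma, n = 2.0, 25
# Cramer-Rao bound (19.4-3) for unbiased estimates of eta: sigma^2 / n.
bound = sigma ** 2 / n
observed = var_of_sample_mean(0.0, sigma, n, trials=4_000,
                              rng=random.Random(7))
```

With 4,000 replications the observed variance of x̄ sits within a few per cent of the bound, illustrating that x̄ is an efficient estimate here.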
(d) Sufficient Estimates. An estimate y(x1, x2, . . . , xn) of the parameter η is a sufficient estimate if and only if the likelihood function for one sample (Sec. 19.1-2) can be written in the form

L(x1, x2, . . . , xn; η) ≡ L1(Y; η) L2(x1, x2, . . . , xn)    [Y = y(X1, X2, . . . , Xn)]    (19.4-4)

where L2 is functionally independent (Sec. 4.5-6) of Y. In this case, the conditional probability distribution of (x1, x2, . . . , xn) on the hypothesis [y = Y] is independent of η, so that a sufficient estimate y of η embodies all the information about η which the given sample can supply. Efficient estimates are necessarily sufficient.
(e) Generalizations. Equations (1) to (3) apply to discrete probability distributions if the probability p(x1, x2, . . . , xn) is substituted for the probability density φ(x1, x2, . . . , xn); the theory of pars. a to d applies without change to populations described by multidimensional random variables.
A set of m suitable unbiased estimates y1, y2, . . . , ym are joint efficient estimates of m corresponding population parameters η1, η2, . . . , ηm if the concentration ellipsoid (Sec. 18.4-8c) of the joint sampling distribution coincides with a "maximum concentration ellipsoid" analogous to the minimum variance defined by Eq. (3). Joint asymptotically efficient estimates are similarly defined in terms of the asymptotic sampling distribution. The reciprocal of the generalized variance (Sec. 18.4-8c) associated with the joint sampling distribution of y1, y2, . . . , ym is a measure of the relative joint efficiency of the set of estimates. To define a set of m joint sufficient estimates y1, y2, . . . , ym, it is only necessary to replace the random variable y in Eq. (4) by the m-dimensional variable (y1, y2, . . . , ym).
19.4-2. Some Properties of Statistics Used as Estimates (see also Sec. 19.2-4). (a) Functions of Moments. Every statistic expressible as a power of a rational function of the sample moments ar is a consistent estimate of the same function of the corresponding population moments αr, provided that the αr's in question exist and yield a finite function value (see also Sec. 18.3-7). Multiplication of a biased consistent estimate by a suitable function of n will often yield an unbiased consistent estimate. An entirely analogous theorem applies to functions of the sample moments of multivariate samples. In particular, g1, g2, l_ik, r_ik, and r_{12.34 . . . m} (Sec. 19.7-2) are consistent estimates of the corresponding population parameters γ1, γ2, λ_ik, ρ_ik, and ρ_{12.34 . . . m}.
(b) For samples taken from a normal population

1. x̄ is an efficient estimate of ξ.
2. x̄ and s² are joint asymptotically efficient estimates of ξ and σ², but s² is biased; x̄ and S² = [n/(n − 1)]s² are joint sufficient and asymptotically efficient estimates of ξ and σ². S² has the efficiency (n − 1)/n.
3. If ξ is known, the statistic (1/n) Σk (xk − ξ)² is an efficient estimate of σ².
4. The sample median has the asymptotic efficiency 2/π.

For samples taken from a binomial distribution (Table 18.8-3), x̄ is an efficient estimate of ξ. For samples taken from a two-dimensional normal distribution (Sec. 18.8-6) with known center of gravity, the sample central moments l11, l12, and l22 (Sec. 19.7-2) are joint asymptotically efficient estimates of λ11, λ12, and λ22.
19.4-3. Derivation of Estimates: The Method of Moments.
If the population distribution is described by a given function Φ(x; η1, η2, . . .), the population moments are functions

α1 = α1(η1, η2, . . .)        α2 = α2(η1, η2, . . .)        . . .

of the parameters, if these quantities exist. The method of moments defines (joint) estimates y1(x1, x2, . . . , xn), y2(x1, x2, . . . , xn), . . . , ym(x1, x2, . . . , xn) of m corresponding population parameters η1, η2, . . . , ηm by the m equations

αr(y1, y2, . . . , ym) = ar(x1, x2, . . . , xn)    (r = 1, 2, . . . , m)    (19.4-5)

obtained on equating the first m sample moments ar to the corresponding population moments αr. The resulting estimates yk are necessarily functions of the sample moments (see also Sec. 19.4-2a).
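A hedged sketch of Eq. (19.4-5) for a gamma population, where the moment equations α1 = kθ and μ2 = kθ² can be solved in closed form; the gamma example, the function name, and the parameter values are illustrative choices of mine, not the handbook's:

```python
import random

def gamma_moment_estimates(xs):
    """Solve alpha_1 = k*theta, mu_2 = k*theta^2 (the moment equations
    (19.4-5) for a gamma population) for the joint estimates (k, theta)."""
    n = len(xs)
    a1 = sum(xs) / n                          # first sample moment
    m2 = sum((x - a1) ** 2 for x in xs) / n   # second central sample moment
    return a1 * a1 / m2, m2 / a1              # (k_hat, theta_hat)

rng = random.Random(3)
xs = [rng.gammavariate(4.0, 1.5) for _ in range(60_000)]
k_hat, theta_hat = gamma_moment_estimates(xs)
```

As the text notes, these estimates are consistent: with 60,000 draws they land close to the true parameters (4, 1.5).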
19.4-4. The Method of Maximum Likelihood. For any given sample (x1, x2, . . . , xn), the value of the likelihood function L(x1, x2, . . . , xn; η1, η2, . . .) (Sec. 19.1-2) is a function of the unknown parameters η1, η2, . . . . The method of maximum likelihood estimates each parameter ηk by a corresponding trial function yk(x1, x2, . . . , xn) chosen so that L(x1, x2, . . . , xn; y1, y2, . . .) is as large as possible for each sample (x1, x2, . . . , xn). One attempts to obtain a set of m (joint) maximum-likelihood estimates y1(x1, x2, . . . , xn), y2(x1, x2, . . . , xn), . . . , ym(x1, x2, . . . , xn) as nontrivial solutions of the m equations

∂/∂yk [log_e L(x1, x2, . . . , xn; y1, y2, . . . , ym)] = 0    (k = 1, 2, . . . , m)    (maximum-likelihood equations)    (19.4-6)

which constitute necessary conditions for a maximum of the likelihood function if the latter is suitably differentiable (Sec. 11.3-3).
Although the maximum-likelihood method often involves more complicated computations than the moment method (Sec. 19.4-3), maximum-likelihood estimates may be preferable, particularly in the case of small samples, because

1. If an efficient estimate (or a set of joint efficient estimates) exists, it will appear as a unique solution of the likelihood equation or equations (6).
2. If a sufficient estimate (or a set of sufficient estimates) exists, every solution of the likelihood equation or equations (6) will be a function of this estimate or estimates.

In addition, under quite general conditions (Ref. 19.4, Chap. 33), the likelihood equations (6) have a solution yielding consistent, asymptotically normal, and asymptotically efficient estimates.
EXAMPLE: If x is normally distributed, the maximum-likelihood estimate x̄ = (1/n) Σ_{k=1}^n xk for ξ is an efficient estimate and minimizes Σ_{k=1}^n (xk − X)² over X (method of least squares in the theory of errors).
Note that the maximum-likelihood method applies also to multidimensional populations and applies even in the case of nonrandom samples.
19.4-5. Other Methods of Estimation. A number of methods usually employed to test the goodness of fit of an estimate or estimates can be modified to infer parameter values (Sec. 19.6-7; see also Secs. 19.9-1 to 19.9-5).
19.5. SAMPLING DISTRIBUTIONS
19.5-1. Introduction (see also Secs. 19.1-2 and 19.1-3). Section 19.5-1 lists properties of a class of statistics frequently used as consistent estimates of corresponding population parameters. Section 19.5-2 deals with the approximate computation of sampling distributions for large samples, while Secs. 19.5-3 and 19.5-4b are concerned with the distributions of statistics derived from normal populations.
19.5-2. Asymptotically Normal Sampling Distributions (see also Sec. 18.6-4). The following theorems are derived from the limit theorems of Sec. 18.6-5 and permit one to approximate the sampling distributions of many statistics by normal distributions if the sample size is sufficiently large.
(a) Let y(x1, x2, . . . , xn) be any statistic expressible as a function of the sample moments mk such that y = f(m1, m2, . . .) exists and is twice continuously differentiable throughout a neighborhood of m1 = μ1, m2 = μ2, . . . . Then the sampling distribution of y is asymptotically normal as n → ∞, with mean f(μ1, μ2, . . .) and variance of the form constant/n + O(1/n²). The theorem applies, in particular, to the sample mean x̄, to the sample variances s² and S², and to the sample moments ar and mr (Sec. 19.2-4). An analogous theorem applies to multidimensional populations.
(b) The distribution of each sample fractile x_P is asymptotically normal with mean x_P (the corresponding population fractile) and variance P(1 − P)/{n[φ(x_P)]²}. The theorem applies, in particular, to the sample median x_{1/2}. Under analogous conditions, any joint distribution of the sample fractiles (and hence, for example, the sample interquartile range) is also asymptotically normal.
19.5-3. Samples Drawn from Normal Populations. The χ², t, and v² Distributions. (a) In the case of samples drawn from a normal population (normal samples), all sample values are normal variables, and many sampling distributions can be calculated explicitly with the aid of Secs. 18.5-7 and 18.8-9. The assumption of a normal population can often be justified by the central-limit theorem (Sec. 18.6-5, e.g., errors in measurements), or the method of Sec. 19.3-2 can be used.
For any sample of size n drawn from a normal population with mean ξ and variance σ²

1. (x̄ − ξ)/(σ/√n) has a standardized normal distribution (u distribution, Sec. 18.8-4).
2. (x̄ − ξ)/(S/√n) = (x̄ − ξ)/(s/√(n − 1)) (Student's ratio) has a t distribution with n − 1 degrees of freedom (Table 19.5-2).
3. (n − 1)S²/σ² = ns²/σ² = (1/σ²) Σ_{k=1}^n (xk − x̄)² has a χ² distribution with n − 1 degrees of freedom (Table 19.5-1).

and

4. (xi − ξ)/σ has a standardized normal distribution.
5. (x′ − x̄)√n/[S√(n + 1)] has a t distribution with n − 1 degrees of freedom, where x′ denotes an additional observation independent of the given sample.
Fig. 19.5-1. The χ² distribution for various values of m.
Table 19.5-1. The χ² Distribution with m Degrees of Freedom (Fig. 19.5-1; see also Secs. 18.8-7, 19.5-3, and 19.6-7)

(a) φ(Y) = 0 for Y < 0
    φ(Y) = [2^{m/2} Γ(m/2)]^{−1} Y^{(m−2)/2} e^{−Y/2} for Y > 0

(b) E{y} = m        Var{y} = 2m        mode = m − 2 (m ≥ 2)
    coefficient of skewness 2√(2/m)        coefficient of excess 12/m
    rth moment about y = 0: m(m + 2) · · · (m + 2r − 2)
    characteristic function: (1 − 2iq)^{−m/2}        semi-invariants: κr = 2^{r−1}(r − 1)! m

(c) Typical Interpretation. Given any m statistically independent standardized normal variables uk = (xk − ξk)/σk, the sum χ² = Σ_{k=1}^m uk² has a χ² distribution with m degrees of freedom.
If Σ_{k=1}^m uk² is expressed as a sum of r quadratic forms yj(u1, u2, . . . , um) of respective ranks (Sec. 13.5-4b) mj, then y1, y2, . . . , yr are statistically independent and distributed like χ² with m1, m2, . . . , mr degrees of freedom if and only if m1 + m2 + · · · + mr = m (Partition Theorem).
The sum of r independent variables y1, y2, . . . , yr respectively distributed like χ² with m1, m2, . . . , mr degrees of freedom has a χ² distribution with m = m1 + m2 + · · · + mr degrees of freedom (Addition Theorem or reproductive property; see also Sec. 18.8-9).

(d) The fractiles of y will be denoted by χ²_P or χ²_P(m); published tables frequently show χ²_{1−α}(m) vs. α.

(e) Approximations. As m → ∞,
    y is asymptotically normal with mean m and variance 2m
    y/m is asymptotically normal with mean 1 and variance 2/m
    √(2y) is asymptotically normal with mean √(2m − 1) and variance 1
A useful approximation based on the last item listed above is

χ²_P(m) ≈ ½(√(2m − 1) + u_P)²    (m > 30)

This approximation is worst if P is either small or large. A better approximation is given by
χ²_P(m) ≈ m[1 − 2/(9m) + u_P √(2/(9m))]³

6. (xi − x̄)√n/[S(n − 1)] = (xi − x̄)/[s√(n − 1)] has an r distribution with m = n − 2 (Sec. 19.7-4).
7. ns̄²/σ² = (1/σ²) Σ_{k=1}^n (xk − ξ)² has a χ² distribution with n degrees of freedom.
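A simulation sketch of theorem 3 above (function names and sample sizes are mine): the statistic ns²/σ² should behave like χ² with m = n − 1 degrees of freedom, so its mean and variance should be close to n − 1 and 2(n − 1):

```python
import random

def chi_square_statistic(n, sigma, rng):
    """n s^2 / sigma^2 for one normal sample of size n."""
    xs = [rng.gauss(0.0, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / sigma ** 2

rng = random.Random(11)
n, trials = 8, 20_000
ys = [chi_square_statistic(n, 1.5, rng) for _ in range(trials)]
mean_y = sum(ys) / trials
var_y = sum((y - mean_y) ** 2 for y in ys) / trials
# Table 19.5-1(b) with m = n - 1 = 7 predicts E{y} = 7, Var{y} = 14.
```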
Fig. 19.5-2. Student's t distribution compared to the standardized normal distribution.
Table 19.5-2. Student's t Distribution with m Degrees of Freedom (Fig. 19.5-2; see also Secs. 19.5-3 and 19.6-6)

(a) φ(Y) = [Γ((m + 1)/2) / (√(mπ) Γ(m/2))] (1 + Y²/m)^{−(m+1)/2}

(b) E{y} = 0 (m > 1)        mode = 0        Var{y} = m/(m − 2) (m > 2)
    coefficient of skewness 0
    2rth moment about y = 0: [1 · 3 · · · (2r − 1)] m^r / [(m − 2)(m − 4) · · · (m − 2r)]    (2r < m)
    coefficient of excess 3(m − 2)/(m − 4) − 3 = 6/(m − 4)    (m > 4)

(c) Typical Interpretation. y is distributed like the ratio

y = t = x0 / √[(1/m)(x1² + x2² + · · · + xm²)]

where x0, x1, x2, . . . , xm are m + 1 statistically independent normal variables, each having the mean 0 and the variance σ². Note that t is independent of σ².

(d) The fractiles y_P of y will be denoted by t_P; note t_P = −t_{1−P}. The distribution of |y| = |t| is related to the t distribution by P[|y| > |t|_{1−α}] = α with |t|_{1−α} = t_{1−α/2}; these fractiles (α values of t) are often tabulated for use in statistical tests; note that some published tables denote |t|_{1−α} by tα.

(e) Approximations. As m → ∞, y is asymptotically normal with mean 0 and variance 1, so that t_P ≈ u_P and |t|_{1−α} = t_{1−α/2} ≈ |u|_{1−α} = u_{1−α/2} (Sec. 18.8-4b) for m > 30.
Table 19.5-3. The Variance-ratio Distribution (v² Distribution, Snedecor F Distribution, w² Distribution) and Related Distributions (see also Sec. 19.6-6)

(a) φ(Y) = 0 for Y < 0
    φ(Y) = (m/m')^{m/2} [Γ((m + m')/2) / (Γ(m/2)Γ(m'/2))] Y^{(m−2)/2} (1 + mY/m')^{−(m+m')/2} for Y > 0

(b) E{y} = m'/(m' − 2)    (m' > 2)        mode = m'(m − 2)/[m(m' + 2)]    (m > 2)
    Var{y} = 2m'²(m + m' − 2) / [m(m' − 2)²(m' − 4)]    (m' > 4)
    rth moment about y = 0: (m'/m)^r Γ(m/2 + r)Γ(m'/2 − r) / [Γ(m/2)Γ(m'/2)]    (2r < m')

(c) Typical Interpretation. y is distributed like the ratio v² of two random variables having χ² distributions with m and m' degrees of freedom (Table 19.5-1), each divided by its degrees of freedom, or

y = v² = [(1/m)(x1² + x2² + · · · + xm²)] / [(1/m')(x'1² + x'2² + · · · + x'_{m'}²)]

where the xi and x'k are m + m' statistically independent normal variables each having the mean 0 and the variance σ².

(d) Related Distributions. (1) v = +√(v²), with

φv(V) = 0 for V < 0
φv(V) = 2(m/m')^{m/2} [Γ((m + m')/2) / (Γ(m/2)Γ(m'/2))] V^{m−1} (1 + mV²/m')^{−(m+m')/2} for V > 0

(2) z = log_e v = ½ log_e v², with

φz(Z) = 2(m/m')^{m/2} [Γ((m + m')/2) / (Γ(m/2)Γ(m'/2))] e^{mZ} [1 + (m/m')e^{2Z}]^{−(m+m')/2}    (Fisher's z distribution)

(3) w² = mv²/(m' + mv²) has the beta distribution (Table 18.8-8).

(e) Published tables usually present the fractiles v²_{1−α}(m, m') = 1/v²_α(m', m), v_{1−α}(m, m'), or z_{1−α}(m, m') as functions of α for various values of m, m'.
Special cases:

v²_P(1, m) = t²_{(1+P)/2}(m)        v²_P(m, 1) = 1/t²_{1−P/2}(m)
v²_P(m, ∞) = χ²_P(m)/m        v²_P(∞, m) = m/χ²_{1−P}(m)
v²_P(1, ∞) = u²_{(1+P)/2}        v²_P(∞, 1) = 1/u²_{P/2}

(f) Approximations. As m → ∞, m' → ∞, z is asymptotically normal with mean ½(1/m' − 1/m) and variance ½(1/m + 1/m'). This approximation is often useful for m > 30, m' > 30.

Table 19.6-1 lists additional formulas. Tables 19.5-1 to 19.5-3 detail the properties of the χ², t, and v² distributions; fractiles of these functions are available in tabular form.
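A quick simulation sketch of the interpretation in par. (c) of Table 19.5-3 (names and parameter choices are illustrative), checking the mean E{y} = m'/(m' − 2):

```python
import random

def v_squared(m, mprime, rng):
    """One draw of the variance ratio: (chi^2_m / m) / (chi^2_m' / m')."""
    num = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(m)) / m
    den = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(mprime)) / mprime
    return num / den

rng = random.Random(13)
m, mprime, trials = 5, 12, 30_000
mean_v2 = sum(v_squared(m, mprime, rng) for _ in range(trials)) / trials
# Table 19.5-3(b) predicts E{y} = m'/(m' - 2) = 12/10 = 1.2.
```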
(b) The sample mean x̄ and the sample variance s² are statistically independent if and only if the sample in question is drawn from a normal population. For every normal sample, the statistics g1 and g2 (Sec. 19.2-4) have the moments

E{g1} = 0        Var{g1} = 6(n − 2)/[(n + 1)(n + 3)]
E{g2} = −6/(n + 1)        Var{g2} = 24n(n − 2)(n − 3)/[(n + 1)²(n + 3)(n + 5)]    (19.5-2)
19.5-4. The Distribution of the Sample Range (see also Secs. 19.2-6 and 19.7-6). (a) For every continuous distribution, the frequency function of the range w for a random sample of size n is

φw(W) = n(n − 1) ∫_{−∞}^{∞} [Φx(x + W) − Φx(x)]^{n−2} φx(x) φx(x + W) dx    (W ≥ 0)    (19.5-3)

(b) For normal populations, both the mean and the dispersion of w are multiples of the population dispersion σ:

E{w} = knσ        Var{w} = cn²σ²    (19.5-4)

kn, cn, and cn/kn have been tabulated as functions of n (Ref. 19.3); w/kn is an unbiased estimate of σ. The average range

w̄ = (w1 + w2 + · · · + wm)/m

obtained from a sample of m random samples of size n is asymptotically normal with mean knσ and variance cn²σ²/m as m → ∞; w̄/kn is an unbiased, consistent estimate of σ.
(c) If the population is uniformly distributed over an interval (a, b),

E{w} = (b − a)(n − 1)/(n + 1)        Var{w} = (b − a)² 2(n − 1)/[(n + 1)²(n + 2)]    (19.5-5)

and w(n + 1)/(n − 1) is a consistent, unbiased estimate of b − a. Note also that the arithmetic mean of the smallest and the largest sample value is a consistent, unbiased estimate of E{x} (Ref. 19.4).
(d) For any continuous population distribution, the probability that at least a fraction q of the population lies between the extreme values xmin, xmax of a given random sample of size n is

1 − nq^{n−1} + (n − 1)q^n    (19.5-6)
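Formula (19.5-6) can be checked by simulation for a uniform population on (0, 1), where the fraction of the population between xmin and xmax is simply xmax − xmin (function name and parameters are mine):

```python
import random

def coverage_probability(n, q, trials, rng):
    """Estimated probability that the sample extremes of a uniform(0,1)
    sample of size n span at least a fraction q of the population."""
    hits = 0
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        if max(xs) - min(xs) >= q:
            hits += 1
    return hits / trials

rng = random.Random(17)
n, q = 10, 0.5
exact = 1 - n * q ** (n - 1) + (n - 1) * q ** n   # Eq. (19.5-6)
sim = coverage_probability(n, q, 40_000, rng)
```

Note that (19.5-6) is distribution-free; the uniform case is chosen only because the covered fraction is easy to evaluate.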
19.5-5. Sampling from Finite Populations. Given a finite population of N elements (events, results of observations) E1, E2, . . . , EN, each labeled with one of the M ≤ N spectral values X^{(1)}, X^{(2)}, . . . , X^{(M)}, where the value X^{(i)} occurs N_i times in the population, one has

ξ = E{x} = (1/N) Σ_{i=1}^M N_i X^{(i)}        Var{x} = σ² = (1/N) Σ_{i=1}^M N_i (X^{(i)} − ξ)²    (19.5-7)

For a random sample of size n drawn from this population without replacement,

E{x̄} = ξ        Var{x̄} = (σ²/n)(N − n)/(N − 1)    (19.5-8)

E{S²} = σ²N/(N − 1)    (19.5-9)

and Var{S²} (19.5-10) is given by an analogous but lengthier expression involving the population moments μ2 and μ4.
These formulas reduce to those given in Secs. 19.2-3 and 19.2-4 for N = ∞.
19.6. CLASSICAL STATISTICAL TESTS
19.6-1. Statistical Hypotheses. Consider a "space" of samples (sample points) (x1, x2, . . . , xn), where x1, x2, . . . , xn are numerical random variables. Every self-consistent set of assumptions involving the joint distribution of x1, x2, . . . , xn is a statistical hypothesis. A hypothesis H is a simple statistical hypothesis if it defines the probability distribution uniquely; otherwise it is a composite statistical hypothesis. More specifically, let the joint distribution of x1, x2, . . . , xn be defined by Φ(x1, x2, . . . , xn; η1, η2, . . .); a statistical hypothesis then specifies values or sets of values of the parameters η1, η2, . . . .
19.6-2. Fixed-sample Tests: Definitions. Given a fixed sample size n, a test of the statistical hypothesis H is a rule rejecting or accepting the hypothesis H on the basis of a test sample (X1, X2, . . . , Xn). Each test specifies a critical set (critical region, rejection region) Sc of "points" (x1, x2, . . . , xn) such that H will be rejected if the test sample (X1, X2, . . . , Xn) belongs to the critical set; otherwise H is accepted. Such rejection or acceptance does not constitute a logical disproof or proof, even if the sample is infinitely large. Four possible events arise:

1. H is true and is accepted by the test.
2. H is false and is rejected by the test.
3. H is true but is rejected by the test (error of the first kind).
4. H is false but is accepted by the test (error of the second kind).
For any set of true (actual) parameter values η1, η2, . . . , the probability that a critical region Sc will reject the hypothesis tested is

π_{Sc}(η1, η2, . . .) = P[(x1, x2, . . . , xn) ∈ Sc; η1, η2, . . .]
    = ∫_{Sc} dΦ(x1, x2, . . . , xn; η1, η2, . . .)    (power function of the critical region Sc)    (19.6-1)

Whenever the hypothesis H1 ≡ [η1 = η11, η2 = η21, . . .] contradicts the hypothesis tested, the rejection probability π_{Sc}(η11, η21, . . .) is called the power (power function) of the test defined by Sc with respect to the alternative (simple) hypothesis H1 (see also Fig. 19.6-1a). A graph of the correct-rejection probability 1 − β against the false-rejection probability α is called the operating characteristic of the test (Fig. 19.6-1b; see also Sec. 19.6-3).
19.6-3. Level of Significance. Neyman-Pearson Criteria for Choosing Tests of Simple Hypotheses. (a) It is, generally speaking, desirable to use a critical region Sc such that π_{Sc}(η1, η2, . . .) is small for parameter combinations η1, η2, . . . admitted by the hypothesis tested, and as large as possible for other parameter combinations. Given a critical region Sc used to test the simple hypothesis H0 ≡ [η1 = η10, η2 = η20, . . .] ("null hypothesis"), let H0 be true. Then the probability of falsely rejecting H0 (error of the first kind) is π_{Sc}(η10, η20, . . .) = α. α is called the level of significance of the test: the critical region Sc tests the simple hypothesis H0 at the level of significance α. In the case of discrete random variables x1, x2, . . . , xn, one cannot, in general, specify α at will, but an upper bound for α may be given. A critical region used to test a composite hypothesis H ≡ [(η1, η2, . . .) ∈ D] will, in general, yield different levels of significance for different simple hypotheses (parameter combinations η1, η2, . . .) admitted by H; one may specify the least upper bound of these levels of significance.
(b) For each given sample size n and level of significance α

1. A most powerful test of the simple hypothesis H0 ≡ [η1 = η10, η2 = η20, . . .] relative to the simple alternative H1 ≡ [η1 = η11, η2 = η21, . . .] is defined by the critical region Sc which yields the largest value of 1 − β = π_{Sc}(η11, η21, . . .).
2. A uniformly most powerful test is most powerful relative to every admissible alternative hypothesis H1; such a test does not always exist.
3. A test is unbiased if π_{Sc}(η11, η21, . . .) ≥ α for every alternative simple hypothesis H1; otherwise the test is biased. A most powerful unbiased test relative to a given alternative H1 and a uniformly most powerful unbiased test may be defined as above.

To construct the critical region Sc for a most powerful test, use all sample points (x1, x2, . . . , xn) such that the likelihood ratio φ(x1, x2, . . . , xn; η10, η20, . . .)/φ(x1, x2, . . . , xn; η11, η21, . . .) does not exceed a constant c; different values of c will yield "best" critical regions at different levels of significance α. Uniformly most powerful tests are of particular interest if one desires to test H0 against a composite alternative hypothesis. A uniformly most powerful unbiased test may exist even though no uniformly most powerful test exists (see also Ref. 19.4). In practice, ease of computation may be the deciding factor in a choice among several possible tests; one can usually increase the power of each test by increasing the sample size n (see also Sec. 19.6-9).
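A sketch of the Neyman-Pearson construction for a normal population with known σ, testing H0: ξ = ξ0 against H1: ξ = ξ1 > ξ0. Here the likelihood-ratio region reduces to x̄ > ξ0 + u_{1−α}σ/√n; the helper names and the bisection inverse of Φu are mine, not the handbook's:

```python
import math

def Phi(x):
    """Standardized normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def Phi_inv(p, lo=-10.0, hi=10.0):
    """Fractile u_p by bisection (adequate for illustration)."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if Phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def power_most_powerful_test(xi0, xi1, sigma, n, alpha):
    """Power 1 - beta of the likelihood-ratio test of H0: xi = xi0
    against H1: xi = xi1 > xi0 (sigma known, sample size n)."""
    u = Phi_inv(1.0 - alpha)       # critical region: x_bar > xi0 + u*sigma/sqrt(n)
    shift = (xi1 - xi0) * math.sqrt(n) / sigma
    return 1.0 - Phi(u - shift)

p = power_most_powerful_test(0.0, 0.5, 1.0, 25, 0.05)
```

As the unbiasedness criterion requires, the computed power exceeds the level of significance α.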
Fig. 19.6-1a. Test of the null hypothesis H0 against a simple alternate hypothesis H1 in terms of a test statistic y = y(x1, x2, . . . , xn). The sketch shows the probability densities of the test statistic under H0 and H1, the false-rejection probability α, the false-acceptance probability β, and the correct-rejection probability 1 − β (power of the test); H0 is rejected if the test statistic falls in the critical region.
Fig. 19.6-1b. Operating characteristic of a test: correct-rejection probability 1 − β plotted against false-rejection probability α (see also Fig. 19.6-1a).
Table 19.6-1. Some Tests of Significance Relating to the Parameters ξ, σ² of a Normal Population

Each row lists the hypothesis to be tested, the test statistic y = y(x1, x2, . . . , xn), and the critical region Sc rejecting the hypothesis at the level of significance α.

1. ξ = ξ0 (σ known):    y = (x̄ − ξ0)√n/σ;    reject if |y| > |u|_{1−α} = u_{1−α/2}
2. ξ ≤ ξ0 (σ known):    same y;    reject if y > u_{1−α}
3. ξ ≥ ξ0 (σ known):    same y;    reject if y < u_α = −u_{1−α}
4. ξ = ξ0:    y = (x̄ − ξ0)√n/S;    reject if |y| > |t|_{1−α} = t_{1−α/2}    (m = n − 1)
5. ξ ≤ ξ0:    same y;    reject if y > t_{1−α}    (m = n − 1)
6. ξ ≥ ξ0:    same y;    reject if y < t_α    (m = n − 1)
7. σ² = σ0²:    y = (n − 1)S²/σ0² = ns²/σ0²;    reject if y < χ²_{α/2} or y > χ²_{1−α/2}    (m = n − 1)
8. σ² ≤ σ0²:    same y;    reject if y > χ²_{1−α}    (m = n − 1)
9. σ² ≥ σ0²:    same y;    reject if y < χ²_α    (m = n − 1)

* Note that published tables often tabulate χ²_{1−α} rather than χ²_α; check carefully on the notation used in each case.
19.6-4. Tests of Significance. Many applications require one to test a hypothetical population property specified in terms of a set of parameter values η1 = η10, η2 = η20, . . . against a corresponding sample property described by the respective estimates y1(x1, x2, . . . , xn), y2(x1, x2, . . . , xn), . . . of η1, η2, . . . . One attempts to construct a "test statistic"

y = y(x1, x2, . . . , xn; η10, η20, . . .) = g(y1, y2, . . . ; η10, η20, . . .)    (19.6-2)

whose values measure a deviation or ratio comparing the sample property to the hypothetical population property for each test sample (x1, x2, . . . , xn). The simple hypothesis H0 ≡ [η1 = η10, η2 = η20, . . .] is then rejected at a given level of significance α (the "deviation" is significant) whenever the sample value of y falls outside an acceptance interval [y_{P1} < y ≤ y_{P2}] such that

P[y_{P1} < y ≤ y_{P2}] = P2 − P1 = 1 − α    (19.6-3)
Table 19.6-1 (continued). Power functions and remarks (see also Secs. 19.5-3a and 19.6-4; tests are based on random samples of size n, and the approximations of Tables 19.5-1 and 19.5-2 apply if n is large).

Test 1 (simple hypothesis) has the power function

π(ξ) = 1 − Φu[u_{1−α/2} − (ξ − ξ0)√n/σ] + Φu[−u_{1−α/2} − (ξ − ξ0)√n/σ]

and is a uniformly most powerful unbiased test; no uniformly most powerful test exists. Tests 2 and 3 refer to composite hypotheses; if the admissible hypotheses are restricted to ξ > ξ0 for test 2, and to ξ < ξ0 for test 3, either test is a uniformly most powerful test for the simple hypothesis ξ = ξ0.
Test 4 is a simple-hypothesis "two-tailed t test"; use Eq. (19.6-1) for the power function. Caution: many published tables denote |t|_{1−α} by tα. Tests 5 and 6 (composite hypotheses) are "one-tailed t tests."
Tests 7 to 9 have the power functions

π(σ²) = P[χ² < (σ0²/σ²)χ²_{α/2}] + P[χ² > (σ0²/σ²)χ²_{1−α/2}]    (m = n − 1)    (test 7)
π(σ²) = P[χ² > (σ0²/σ²)χ²_{1−α}]    (m = n − 1)    (test 8)
π(σ²) = P[χ² < (σ0²/σ²)χ²_α]    (m = n − 1)    (test 9)

Test 7 refers to a simple hypothesis; no uniformly most powerful test exists. Tests 8 and 9 refer to composite hypotheses; if the admissible hypotheses are restricted to σ² > σ0² for test 8, and to σ² < σ0² for test 9, either test is a uniformly most powerful test for the simple hypothesis σ² = σ0².
Tests defined in this manner are often called tests of significance. Equation (3) specifies y_{P1} = y_{P1}(η10, η20, . . .) and y_{P2} = y_{P2}(η10, η20, . . .) as fractiles of the sampling distribution of y(x1, x2, . . . , xn; η10, η20, . . .). It is frequently possible to choose a test statistic y such that its fractiles y_P are independent of η10, η20, . . . . Table 19.6-1 and Secs. 19.6-6 and 19.6-7 list a number of important examples (see also Fig. 19.6-1). In quality-control applications, the acceptance limits y_{P1} and y_{P2} defined by Eq. (3) are called tolerance limits at the tolerance level α, and the interval [y_{P1}, y_{P2}] is called a tolerance interval (see also Fig. 19.6-1).
19.6-5. Confidence Regions. (a) Assume that one has constructed a family of critical regions Sα(η1, η2, . . .) capable of testing a corresponding set of simple hypotheses (admissible parameter combinations) (η1, η2, . . .) at some given level of significance α.† Then for any fixed test sample (x1 = X1, x2 = X2, . . . , xn = Xn), the set Dα(X1, X2, . . . ,
† In the case of discrete distributions, the critical regions Sα(η1, η2, . . .) will, in general, be defined for a level of significance less than or equal to α.
Xn) of parameter combinations (η1, η2, . . .) ("points" in parameter space) accepted on the basis of the given sample is a confidence region at the confidence level α; 1 − α is called the confidence coefficient. The confidence region comprises all admissible parameter combinations whose acceptance on the basis of the given sample is associated with a probability P[(X1, X2, . . . , Xn) not in Sα(η1, η2, . . .)] at least equal to the confidence coefficient 1 − α.
The method of confidence regions relates a given sample to a corresponding set of "probable" parameter combinations without the necessity of assigning "inverse probabilities" to essentially nonrandom parameter combinations (see also Sec. 19.6-9).
(b) Confidence Intervals Based on Tests of Significance (see also Sec. 19.6-4). To find confidence regions relating values of one of the unknown parameters, say η1, to the given sample value Y = y(X1, X2, . . . , Xn) of a suitable test statistic y, refer to Fig. 19.6-2. Plot lower and upper acceptance limits (tolerance limits) y_{P1}(η1) and y_{P2}(η1) against η1 for a given level of significance α.* The intersections of these acceptance-limit curves with each line y = Y define upper and lower confidence limits (fiducial limits) η1 = γ2(Y), η1 = γ1(Y) bounding a confidence interval Dα = [γ1, γ2] at the confidence level α. The confidence interval comprises those values of η1 whose acceptance on the basis of the sample value y = Y is associated with a probability P[y_{P1}(η1) < Y < y_{P2}(η1)] at least equal to the confidence coefficient 1 − α.

Fig. 19.6-2. Acceptance limits y_{P1}(η1), y_{P2}(η1) plotted against the value of the theoretical parameter η1; their intersections with the line y = Y bound the confidence interval.

* See footnote to Sec. 19.6-5a.
Table 19.6-2. Confidence Intervals for Normal Populations (use the approximations given in Tables 19.5-1 to 19.5-3 for large n)

Each row lists the parameter and the confidence interval [γ1 < η1 < γ2] with confidence coefficient 1 − α = P2 − P1.

1. ξ (σ known):    x̄ − u_{P2} σ/√n < ξ < x̄ − u_{P1} σ/√n
2. ξ (σ unknown):    x̄ − t_{P2} S/√n < ξ < x̄ − t_{P1} S/√n    (m = n − 1)
3. σ²:    (n − 1)S²/χ²_{P2} < σ² < (n − 1)S²/χ²_{P1}    (m = n − 1)
19.6-6. Tests for Comparing Normal Populations. Analysis of Variance. (a) Pooled-sample Statistics. Consider r statistically independent random samples (x_{i1}, x_{i2}, . . . , x_{i n_i}) drawn, respectively, from r theoretical populations with means ξi and variances σi² (i = 1, 2, . . . , r). The ith sample is of size ni and has the mean and variance

x̄i = (1/ni) Σ_{k=1}^{ni} x_{ik}        Si² = [1/(ni − 1)] Σ_{k=1}^{ni} (x_{ik} − x̄i)²    (i = 1, 2, . . . , r)    (19.6-4)

The r samples may be regarded as a "pooled sample" whose size, mean, and variance are given by

n = Σ_{i=1}^r ni        x̄ = (1/n) Σ_{i=1}^r Σ_{k=1}^{ni} x_{ik} = (1/n) Σ_{i=1}^r ni x̄i    (19.6-5)

S² = [1/(n − 1)] Σ_{i=1}^r Σ_{k=1}^{ni} (x_{ik} − x̄)² = [1/(n − 1)] [(n − r)S0² + (r − 1)SA²]    (19.6-6)

with

S0² = [1/(n − r)] Σ_{i=1}^r Σ_{k=1}^{ni} (x_{ik} − x̄i)²        SA² = [1/(r − 1)] Σ_{i=1}^r ni(x̄i − x̄)²

The statistics S0² (pooled variance) and SA² are measures of dispersion within samples and between samples, respectively.
(b) Comparison of Normal Populations (see also Tables 19.5-1 to 19.5-3). If the r independent samples are drawn from normal populations with means ξi and identical variances σi² = σ², then S², S0², and SA² are all consistent and unbiased estimates (Sec. 19.4-1) of the (usually unknown) population variance σ², and (n − r)S0²/σ² and (r − 1)SA²/σ² have χ² distributions with n − r and r − 1 degrees of freedom, respectively. In this case

1. (x̄k − ξk)√(nk)/σ has a standardized normal distribution.
2. [(x̄i − x̄k) − (ξi − ξk)] √[ni nk/(ni + nk)] / S_{ik} has a t distribution with m = ni + nk − 2 degrees of freedom, where S_{ik}² is the pooled variance [Eq. (19.6-6)] of the ith and kth samples.
3. Si²/Sk² has a v² distribution, and log_e (Si/Sk) has a z distribution (m = ni − 1, m' = nk − 1); SA²/S0² has a v² distribution, and log_e (SA/S0) has a z distribution (m = r − 1, m' = n − r).

Table 19.6-3 shows the use of these test statistics in tests of significance (Sec. 19.6-4) comparing normal populations. It is also possible to calculate confidence limits (Sec. 19.6-5) for the difference ξi − ξk from the t distribution. The case of normal populations with different variances is discussed in Ref. 19.8.
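Test statistics 2 and 3 of par. (b) can be sketched as follows (illustrative names and data; equal population variances are assumed for the t statistic, as in the text):

```python
import math

def pooled_two_sample_t(xs, ys):
    """Student's ratio of Sec. 19.6-6b for testing xi_i - xi_k = 0,
    with n_i + n_k - 2 degrees of freedom."""
    ni, nk = len(xs), len(ys)
    mi, mk = sum(xs) / ni, sum(ys) / nk
    ss = (sum((x - mi) ** 2 for x in xs)
          + sum((y - mk) ** 2 for y in ys))
    s_pooled = math.sqrt(ss / (ni + nk - 2))   # pooled variance, Eq. (19.6-6)
    return (mi - mk) / s_pooled * math.sqrt(ni * nk / (ni + nk))

def variance_ratio(xs, ys):
    """S_i^2 / S_k^2, distributed like v^2 with m = n_i - 1, m' = n_k - 1."""
    ni, nk = len(xs), len(ys)
    mi, mk = sum(xs) / ni, sum(ys) / nk
    si2 = sum((x - mi) ** 2 for x in xs) / (ni - 1)
    sk2 = sum((y - mk) ** 2 for y in ys) / (nk - 1)
    return si2 / sk2

sample_i = [4.1, 3.9, 4.0, 4.2]
sample_k = [3.6, 3.8, 3.7, 3.5]
t = pooled_two_sample_t(sample_i, sample_k)    # compare with t fractiles, m = 6
v2 = variance_ratio(sample_i, sample_k)        # compare with v^2 fractiles (3, 3)
```

The computed statistics are then compared with the fractiles of Tables 19.5-2 and 19.5-3 in the manner of Table 19.6-3.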
(c) Analysis of Variance. The third test of Table 19.6-3 compares the mean values ξ_i by partitioning the over-all variance S² into components S_0² and S_A² respectively due to statistical fluctuation within samples and to differences between samples. This technique is known as analysis of variance; the particular case in question involves a one-way classification of samples corresponding to values of the index i. Many similar tests are used to analyze the effects of different medications, soil treatments, etc. (Refs. 19.3, 19.4, and 19.8).
Analysis of Variance for Two-way Classification (Randomized Blocks). Consider rq sample values x_{ik} arranged in an array

x_{11}  x_{12}  ···  x_{1q}
x_{21}  x_{22}  ···  x_{2q}
 ···
x_{r1}  x_{r2}  ···  x_{rq}
and introduce row averages x̄_{i·}, column averages x̄_{·k}, and the over-all average x̄ by

$$\bar x_{i\cdot} = \frac{1}{q}\sum_{k=1}^{q} x_{ik} \qquad \bar x_{\cdot k} = \frac{1}{r}\sum_{i=1}^{r} x_{ik} \qquad \bar x = \frac{1}{rq}\sum_{i=1}^{r}\sum_{k=1}^{q} x_{ik} \qquad (19.6\text{-}7)$$
The over-all variance is partitioned as follows:

$$S^2 = \frac{1}{rq-1}\sum_{i=1}^{r}\sum_{k=1}^{q}(x_{ik}-\bar x)^2 = \frac{1}{rq-1}\bigl[(r-1)(q-1)S_0^2 + (r-1)S_{\mathrm{Row}}^2 + (q-1)S_{\mathrm{Col}}^2\bigr] \qquad (19.6\text{-}8)$$

with

$$S_0^2 = \frac{1}{(r-1)(q-1)}\sum_{i=1}^{r}\sum_{k=1}^{q}(x_{ik}-\bar x_{i\cdot}-\bar x_{\cdot k}+\bar x)^2 \quad \text{(variance due to fluctuations within rows and columns)}$$

$$S_{\mathrm{Row}}^2 = \frac{q}{r-1}\sum_{i=1}^{r}(\bar x_{i\cdot}-\bar x)^2 \quad \text{(variance between rows)} \qquad S_{\mathrm{Col}}^2 = \frac{r}{q-1}\sum_{k=1}^{q}(\bar x_{\cdot k}-\bar x)^2 \quad \text{(variance between columns)}$$

If the random variables x_{ik} are normal with identical variances, then (r − 1)(q − 1)S_0²/σ², (r − 1)S_Row²/σ², and (q − 1)S_Col²/σ² are statistically independent random variables respectively distributed like χ² with (r − 1)(q − 1), r − 1, and q − 1 degrees of freedom. The test statistic S_Row²/S_0² is, then, distributed like v² with m = r − 1, m′ = (r − 1)(q − 1) and serves to test the equality of the row means E{x̄_{i·}} = ξ_{i·} in the manner of Table 19.6-3. Similarly, S_Col²/S_0² is distributed like v² with m = q − 1, m′ = (r − 1)(q − 1) and serves to test the equality of the column means E{x̄_{·k}} = ξ_{·k}.

Table 19.6-3. Significance Tests for Comparing Normal Populations (refer to Sec. 19.6-6; see also Tables 19.5-2 and 19.5-3)
Hypothesis to be tested | Critical region rejecting the hypothesis at a level of significance ≤ α

1a. σ_i² = σ_k² | |log_e (S_i/S_k)| > z_{1−α/2}
1b. σ_i² ≤ σ_k² | log_e (S_i/S_k) > z_{1−α}, or S_i²/S_k² > v²_{1−α}   (m = n_i − 1, m′ = n_k − 1)   (Fisher's z test and v² test)
2a. ξ_i = ξ_k (given σ_i² = σ_k²) | |x̄_i − x̄_k| / [S_0 √(1/n_i + 1/n_k)] > t_{1−α/2}
2b. ξ_i ≤ ξ_k | (x̄_i − x̄_k) / [S_0 √(1/n_i + 1/n_k)] > t_{1−α}   (m = n_i + n_k − 2)   (t test; note the special case n_i = 1)
3. ξ_1 = ξ_2 = ··· = ξ_r (given σ_1² = σ_2² = ··· = σ_r²) | |log_e (S_A/S_0)| > z_{1−α/2}   (m = r − 1, m′ = n − r)   (Fisher's z test)
19.6-7. The χ² Test for Goodness of Fit (see also Table 19.5-1).
(a) The χ² test checks the "fit" of the hypothetical probabilities
p_k = P[E_k] associated with r simple events E_1, E_2, ..., E_r to their relative frequencies h_k = h[E_k] = n_k/n in a sample of n independent observations. In many applications, each E_k is the event that some random variable x falls within a class interval (Sec. 19.2-2), so that the test compares the hypothetical theoretical distribution of x with its empirical distribution.
The goodness of fit is measured by the test statistic

$$y = \sum_{k=1}^{r}\frac{(n_k - np_k)^2}{np_k} \qquad (19.6\text{-}9)$$

y converges in probability to χ² with m = r − 1 degrees of freedom as n → ∞. If all np_k > 10 (pool some class intervals if necessary), the resulting test rejects the hypothetical probabilities p_1, p_2, ..., p_r at the level of significance α whenever the test sample yields y > χ²_{1−α}(m); for m > 30 one may replace the χ² distribution by a normal distribution with the indicated mean and variance.
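The statistic (19.6-9) is easy to evaluate directly; a minimal sketch with illustrative probabilities and counts:

```python
# Chi-square goodness-of-fit statistic y of Eq. (19.6-9) for hypothetical
# class probabilities p_k and observed class counts n_k (illustrative data).
p = [0.5, 0.3, 0.2]          # hypothetical probabilities p_k (sum to 1)
counts = [55, 27, 18]        # observed frequencies n_k
n = sum(counts)              # sample size

y = sum((nk - n * pk) ** 2 / (n * pk) for nk, pk in zip(counts, p))
# m = r - 1 = 2 degrees of freedom; reject if y > chi2_{1-alpha}(2)
print(round(y, 6))   # -> 1.0
```

Here y = 1.0 is far below, e.g., χ²_{0.95}(2) ≈ 5.99, so the hypothetical probabilities would not be rejected at the 5 per cent level.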
(b) The χ² Test with Estimated Parameters. If the hypothetical probabilities p_k depend on a set of q unknown population parameters η_1, η_2, ..., η_q, obtain their joint maximum-likelihood estimates from the given sample (Sec. 19.4-4) and insert the resulting values of p_k = p_k(η_1, η_2, ..., η_q) into Eq. (9). The test statistic y will then converge in probability to χ² with m = r − q − 1 degrees of freedom under very general conditions (see below), and the χ² test applies with m = r − q − 1. Tests of this type check the applicability of a normal distribution, Poisson distribution, etc., with unspecified parameters.
The theorem applies whenever the given functions p_k(η_1, η_2, ..., η_q) satisfy the following conditions throughout a neighborhood of the joint maximum-likelihood "point" (η_1, η_2, ..., η_q):

1. Σ_{k=1}^{r} p_k(η_1, η_2, ..., η_q) = 1.
2. The p_k(η_1, η_2, ..., η_q) have a common positive lower bound, are twice continuously differentiable, and the matrix [∂p_k/∂η_j] is of rank q.
19.6-8. Parameter-free Comparison of Two Populations: The Sign Test. One desires to test the hypothesis that two random variables x and y have identical probability distributions, or

P[x − y > 0] = P[x − y < 0] = 1/2

assuming that it is known that P[x = y] = 0. Consider a random sample of n pairs x_1, y_1; x_2, y_2; ...; x_n, y_n; neglect any pairs such that x_i = y_i (ties) in computing the sample size n. The probability that more than
m differences x_i − y_i are positive is

$$p(m) = \frac{1}{2^n}\left[\binom{n}{m+1} + \binom{n}{m+2} + \cdots + \binom{n}{n}\right]$$

Now let m_α be the smallest value of m such that p(m) ≤ α, and
1. Reject the hypothesis at the level of significance ≤ α if the number of positive differences exceeds m_α (One-tailed Sign Test).
2. Reject the hypothesis at the level of significance ≤ 2α if the number of positive or negative differences exceeds m_α (Two-tailed Sign Test).
m_α has been tabulated against α and n (Ref. 19.15). The sign test can also be used to test (1) the symmetry of a probability distribution; (2) the hypothesis that x = X is the median of a distribution (Ref. 19.15).
19.6-9. Generalizations (Ref. 19.17).
The fixed-sample tests of Secs. 19.6-1 to 19.6-7 admit only two alternatives, viz., acceptance or rejection of a given hypothesis on the basis of a test sample. Sequential tests permit an increase in the sample size (additional observations) as a third possible decision; it is then possible to specify both the level of significance and the power of the test (relative to some alternative hypothesis) when the test hypothesis is finally accepted or rejected.
Schemes of fixed-sample and sequential tests are special examples of statistical decision functions or rules of behavior associating a decision (a set of parameters η_1, η_2, ..., i.e., a "point" in parameter space) with each given sample (x_1, x_2, ...) of some observed quantities. In practice (operations research, detection theory), the decision function is designed so as to maximize the expected value of some measure of effectiveness (payoff function) involving the gains due to correct decisions, the losses due to incorrect decisions, and the cost of testing samples of various sizes.
19.7. SOME STATISTICS, SAMPLING DISTRIBUTIONS, AND TESTS FOR MULTIVARIATE DISTRIBUTIONS
19.7-1. Introduction.
Sections 19.7-2 to 19.7-7 are in no sense an exhaustive survey of multivariate statistics but present a number of frequently used definitions, formulas, and tests for convenient reference. Note that multivariate statistics often serve to estimate and test stochastic relationships between two or more random variables.
19.7-2. Statistics Derived from Multivariate Samples.
(a) Given a v-dimensional random variable x ≡ (x_1, x_2, ..., x_v) (Sec. 18.2-9), one proceeds by analogy with Secs. 19.1-2 and 19.2-2 to 19.2-4 to introduce a random sample of size n, (x_1, x_2, ..., x_n) ≡ (x_{11}, x_{21}, ..., x_{v1}; x_{12}, x_{22}, ..., x_{v2}; ...; x_{1n}, x_{2n}, ..., x_{vn}), and the statistics

$$\bar x_i = \frac{1}{n}\sum_{k=1}^{n} x_{ik} \qquad (i = 1, 2, \ldots, v) \quad \text{(sample average of } x_i\text{)} \qquad (19.7\text{-}1)$$
$$\overline{f(x_1, x_2, \ldots, x_v)} = \frac{1}{n}\sum_{k=1}^{n} f(x_{1k}, x_{2k}, \ldots, x_{vk}) \quad \text{[sample average of } f(x_1, x_2, \ldots, x_v)\text{]} \qquad (19.7\text{-}2)$$

$$l_{ij} = \overline{(x_i - \bar x_i)(x_j - \bar x_j)} = \frac{1}{n}\sum_{k=1}^{n}(x_{ik} - \bar x_i)(x_{jk} - \bar x_j) \qquad (i, j = 1, 2, \ldots, v) \qquad (19.7\text{-}3)$$
(sample variances s_i² = l_ii for i = j, sample covariances for i ≠ j)

$$r_{ij} = \frac{l_{ij}}{\sqrt{l_{ii}\,l_{jj}}} = \frac{l_{ij}}{s_i s_j} \quad \text{(sample correlation coefficients)} \qquad (19.7\text{-}4)$$
The "point" corresponding to x̄ ≡ (x̄_1, x̄_2, ..., x̄_v) is the sample center of gravity, and the matrix L ≡ [l_ij] is the sample moment matrix; det [l_ij] is the generalized variance of the sample.
(b) Given a two-dimensional random sample (x_{11}, x_{21}; x_{12}, x_{22}; ...; x_{1n}, x_{2n}) of a bivariate random variable (x_1, x_2), one defines the sample regression coefficients
$$b_{21} = \frac{l_{12}}{l_{11}} = r_{12}\frac{s_2}{s_1} \qquad b_{12} = \frac{l_{12}}{l_{22}} = r_{12}\frac{s_1}{s_2} \qquad (19.7\text{-}5)$$

The empirical linear mean-square regression of x_2 on x_1,

$$g^*(x_1) = \bar x_2 + b_{21}(x_1 - \bar x_1) \qquad (19.7\text{-}6)$$

is that linear function ax_1 + b which minimizes the sample mean-square deviation

$$\overline{[x_2 - (ax_1 + b)]^2} = \frac{1}{n}\sum_{k=1}^{n}[x_{2k} - (ax_{1k} + b)]^2 \qquad (19.7\text{-}7)$$
(see also Secs. 18.4-6b and 19.7-4c).
For v-dimensional populations, empirical multiple and partial correlation coefficients and regression coefficients are derived from the sample moments l_ij by analogy with Eqs. (18.4-35) to (18.4-38) and serve as estimates of the corresponding population parameters. See Refs. 19.4 and 19.8 for the complete theory.
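The bivariate statistics (19.7-3) to (19.7-5) can be sketched in a few lines; the data below are illustrative (an exactly linear sample, so that r_12 = 1):

```python
# Sample moments l_ij, correlation r_12, and regression coefficient b_21
# of Eqs. (19.7-3) to (19.7-5) for an illustrative bivariate sample.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [2.0, 4.0, 6.0, 8.0]        # exactly x2 = 2*x1
n = len(x1)

m1, m2 = sum(x1) / n, sum(x2) / n
l11 = sum((a - m1) ** 2 for a in x1) / n
l22 = sum((b - m2) ** 2 for b in x2) / n
l12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2)) / n

r12 = l12 / (l11 * l22) ** 0.5   # sample correlation coefficient
b21 = l12 / l11                  # regression coefficient of x2 on x1
print(r12, b21)   # -> 1.0 2.0
```

The empirical regression line (19.7-6) is then g*(x_1) = m2 + b21·(x_1 − m1).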
(c) The statistics (1) to (4) can be approximated by grouped-data estimates in the manner of Secs. 19.2-3 to 19.2-5. In particular, Sheppard's corrections for the grouped-data estimates l_{ikG} of the second-order central moments are given by

$$l'_{ii} = l_{iiG} - \tfrac{1}{12}(\Delta X_i)^2 \qquad l'_{ik} = l_{ikG} \quad (i \ne k) \qquad (19.7\text{-}8)$$

where the ΔX_i are the constant class-interval lengths. These formulas often render errors due to grouping negligible if ΔX_i is less than one-eighth the range of x_i. Practical computation schemes are given in Refs. 19.3 and 19.8.
19.7-3. Estimation of Parameters (see also Sec. 19.4-1). Since the theorem of Sec. 19.4-2a applies to multivariate distributions, the statistics
(3) to (5), as well as the sample averages x̄_i, are consistent estimates of the corresponding population parameters. The mean value and variance of each sample average x̄_i and sample variance l_ii = s_i² are given by Eqs. (19.2-8), (19.2-13), and (19.2-14); in addition, note

$$\operatorname{Var}\{l_{ij}\} = \frac{1}{n}\Bigl[E\{(x_i-\xi_i)^2(x_j-\xi_j)^2\} - \lambda_{ij}^2\Bigr] + O\!\left(\frac{1}{n^2}\right) \qquad (19.7\text{-}9)$$
$$\operatorname{Cov}\{l_{ii}, l_{jj}\} = \frac{1}{n}\Bigl[E\{(x_i-\xi_i)^2(x_j-\xi_j)^2\} - \lambda_{ii}\lambda_{jj}\Bigr] + O\!\left(\frac{1}{n^2}\right) \qquad (19.7\text{-}10)$$
$$\operatorname{Cov}\{l_{ii}, l_{ij}\} = \frac{1}{n}\Bigl[E\{(x_i-\xi_i)^3(x_j-\xi_j)\} - \lambda_{ii}\lambda_{ij}\Bigr] + O\!\left(\frac{1}{n^2}\right) \qquad (19.7\text{-}11)$$
$$E\{r_{ij}\} = \rho_{ij} + O\!\left(\frac{1}{n}\right) \qquad (19.7\text{-}12)$$

Var{r_ij} is of the order of 1/n as n increases (Ref. 19.4). For multidimensional normal distributions (see also Sec. 19.7-4) only,
$$E\{\det[l_{ij}]\} = \frac{(n-1)(n-2)\cdots(n-v)}{n^v}\det[\lambda_{ij}]$$
$$\operatorname{Var}\{\det[l_{ij}]\} = \frac{(n-1)(n-2)\cdots(n-v)}{n^{2v}}\Bigl[(n+1)n(n-1)\cdots(n-v+2) - (n-1)(n-2)\cdots(n-v)\Bigr]\det{}^2[\lambda_{ij}] \qquad (19.7\text{-}13)$$
19.7-4. Sampling Distributions for Normal Populations (see also Secs. 18.8-6 and 18.8-8). For random samples drawn from multivariate normal populations, sampling distributions and tests involving only the sample averages x̄_i and the sample variances l_ii = s_i² are obtained simply from Sec. 19.5-3. It remains to investigate statistics which describe stochastic relationships of different random variables x_i, in particular sample correlation and regression coefficients.
(a) Distribution of the Sample Correlation Coefficient. For simplicity, consider a random sample (x_{11}, x_{21}; x_{12}, x_{22}; ...; x_{1n}, x_{2n}) drawn from a two-dimensional normal population described by
$$\varphi(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\frac{(x_1-\xi_1)^2}{\sigma_1^2} - \frac{2\rho(x_1-\xi_1)(x_2-\xi_2)}{\sigma_1\sigma_2} + \frac{(x_2-\xi_2)^2}{\sigma_2^2}\right]\right\} \qquad (19.7\text{-}14)$$
(Sec. 18.8-6). The probability density of the sample correlation coefficient r_12 = r (Sec. 19.7-2) is

$$\varphi_r(r) = \frac{n-2}{\pi}(1-\rho^2)^{\frac{n-1}{2}}(1-r^2)^{\frac{n-4}{2}}\int_0^\infty \frac{d\beta}{(\cosh\beta - \rho r)^{n-1}} \qquad (-1 < r < 1) \qquad (19.7\text{-}15)$$

and equals zero for |r| > 1; note
$$E\{r\} = \rho + O\!\left(\frac{1}{n}\right) \qquad \operatorname{Var}\{r\} = \frac{(1-\rho^2)^2}{n} + O\!\left(\frac{1}{n^2}\right) \qquad (19.7\text{-}16)$$

The statistic

$$y = \frac{1}{2}\log_e\frac{1+r}{1-r} \qquad (19.7\text{-}17)$$

for n > 10 may be regarded as approximately normal with the approximate mean and variance

$$E\{y\} \approx \frac{1}{2}\log_e\frac{1+\rho}{1-\rho} + \frac{\rho}{2(n-1)} \qquad \operatorname{Var}\{y\} \approx \frac{1}{n-3} \qquad (19.7\text{-}18)$$
Figure 19.7-1 illustrates the behavior of the statistics r and y for different values of ρ and n.
If y and y′ are values of the statistic (17) calculated from two independent random samples of respective sizes n, n′ from the same normal population, then y − y′ is approximately normal with mean 0 and variance 1/(n − 3) + 1/(n′ − 3).
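The transformation (19.7-17) and its approximate standard deviation are trivial to compute; a minimal sketch with an illustrative sample correlation:

```python
from math import log, sqrt

# Fisher z-transform y of Eq. (19.7-17) and its approximate standard
# deviation 1/sqrt(n - 3) from Eq. (19.7-18); r and n are illustrative.
r, n = 0.6, 28
y = 0.5 * log((1 + r) / (1 - r))   # approximately normal for n > 10
sigma_y = 1 / sqrt(n - 3)
print(round(y, 4), sigma_y)        # -> 0.6931 0.2
```

Approximate confidence limits for ρ follow by inverting the transformation, r = tanh y.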
(b) The r Distribution. Test for Uncorrelated Variables. In the important special case ρ = 0 (test for uncorrelated variables!), the frequency function (15) reduces to

$$\varphi_r(r) = \frac{\Gamma\!\left(\frac{n-1}{2}\right)}{\sqrt{\pi}\,\Gamma\!\left(\frac{n-2}{2}\right)}(1-r^2)^{\frac{n-4}{2}} \qquad (-1 < r < 1) \qquad (19.7\text{-}19)$$

In this case, t = r√(n − 2)/√(1 − r²) has a t distribution with n − 2 degrees of freedom (Table 19.5-2). The statistic r√(n − 1) is said to have an r distribution with n − 2 degrees of freedom. The r distribution has been tabulated (Ref. 19.3) and is asymptotically normal with mean 0 and variance 1 as n → ∞. Either the t distribution or the r distribution yields tests for the hypothesis ρ = 0.
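The t statistic for the hypothesis ρ = 0 can be sketched directly; r and n below are illustrative:

```python
from math import sqrt

# Test statistic t = r*sqrt(n - 2)/sqrt(1 - r^2) of Sec. 19.7-4b for the
# hypothesis rho = 0; it has a t distribution with n - 2 degrees of freedom.
r, n = 0.5, 27
t = r * sqrt(n - 2) / sqrt(1 - r ** 2)
print(round(t, 4))   # -> 2.8868
```

The computed value would then be compared with t_{1−α/2} for m = n − 2 = 25 degrees of freedom (Table 19.5-2).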
[Fig. 19.7-1. Probability densities of the statistics r and y used to estimate the correlation coefficient ρ of a multidimensional normal distribution, shown for ρ = 0, 0.5, 0.8 with n = 50. (From Burington and May, Handbook of Probability and Statistics, McGraw-Hill, New York, 1953.)]
(c) Test for Hypothetical Values of the Regression Coefficient (see also Sec. 19.7-2). Given a random sample drawn from a bivariate normal population described by Eq. (14), one tests hypothetical values of the population regression coefficient β_21 = λ_12/λ_11 = ρ_12 σ_2/σ_1 with the statistic

$$t = \frac{s_1}{s_2}\sqrt{\frac{n-2}{1-r_{12}^2}}\,(b_{21} - \beta_{21}) \qquad (19.7\text{-}20)$$

which is distributed like t with n − 2 degrees of freedom. This test is far more convenient than one using the sampling distribution of the sample regression coefficient b_21 (Ref. 19.8).
(d) v-dimensional Samples. For random samples drawn from a v-dimensional normal population described by the probability density (18.8-26), the joint distribution of the sample averages x̄_1, x̄_2, ..., x̄_v is normal with mean values ξ_1, ξ_2, ..., ξ_v and moment matrix Λ/n. The joint distribution of the x̄_i is statistically independent of the joint distribution of the v(v + 1)/2 sample moments l_ij (Generalized Fisher's Theorem, see also Sec. 19.5-3b).
19.7-5. The Sample Mean-square Contingency. Contingency-table Test for Statistical Independence of Two Random Variables (see also Sec. 19.6-7). (a) Given a two-dimensional random sample (x_1, y_1; x_2, y_2; ...; x_n, y_n) of a pair of random variables x, y, a contingency table arranges the n sample-value pairs (x_k, y_k) in a matrix of s x class intervals and r y class intervals. Let there be

n_{i·} pairs (x_k, y_k) in the ith x class interval
n_{·j} pairs (x_k, y_k) in the jth y class interval
n_{ij} pairs (x_k, y_k) both in the ith x class interval and in the jth y class interval
The statistic

$$f^2 = \frac{1}{n}\sum_{i=1}^{s}\sum_{j=1}^{r}\frac{(n_{ij} - n_{i\cdot}n_{\cdot j}/n)^2}{n_{i\cdot}n_{\cdot j}/n} = \sum_{i=1}^{s}\sum_{j=1}^{r}\frac{n_{ij}^2}{n_{i\cdot}n_{\cdot j}} - 1 \quad \text{(sample mean-square contingency)} \qquad (19.7\text{-}21)$$

measures the "degree of association" (statistical dependence) between x and y. f² ranges between 0 and min (r, s) − 1 and reaches the latter value if and only if each row (r > s) or each column (r < s) contains only one element different from zero.
(b) If x and y are statistically independent, then the test statistic nf² converges in probability to χ² with m = (r − 1)(s − 1) degrees of freedom
as n → ∞ (Table 19.5-1). If all n_ij > 10 (pool some class intervals if necessary), the hypothesis of statistical independence is rejected at the level of significance α by the critical region nf² > χ²_{1−α}(m) (Sec. 19.6-3).
(c) The special case r = s = 2 (success or failure, two-by-two contingency table) is of special interest; in this case
$$f^2 = \frac{(n_{11}n_{22} - n_{12}n_{21})^2}{n_{1\cdot}n_{2\cdot}n_{\cdot 1}n_{\cdot 2}} \qquad (19.7\text{-}22)$$
(see also Ref. 19.4).
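The two-by-two case (19.7-22) can be sketched directly; the table counts below are illustrative:

```python
# Sample mean-square contingency f^2 for a 2x2 table, Eq. (19.7-22);
# n*f^2 is compared with chi-square (one degree of freedom). Counts are
# illustrative.
n11, n12, n21, n22 = 15, 5, 5, 15
n1_, n2_ = n11 + n12, n21 + n22      # row totals
n_1, n_2 = n11 + n21, n12 + n22      # column totals
n = n11 + n12 + n21 + n22

f2 = (n11 * n22 - n12 * n21) ** 2 / (n1_ * n2_ * n_1 * n_2)
print(f2, n * f2)   # -> 0.25 10.0
```

Here nf² = 10.0 exceeds χ²_{0.95}(1) ≈ 3.84, so statistical independence would be rejected at the 5 per cent level.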
19.7-6. Spearman's Rank Correlation. A Nonparametric Test of Statistical Dependence. Suppose that a random sample of n observation pairs (x_1, y_1; x_2, y_2; ...; x_n, y_n) yields only the information that x_k is the A_kth largest value of x in the sample, and y_k is the B_kth largest value of y (k = 1, 2, ..., n). If x and y are statistically independent, the test statistic

$$R = 1 - \frac{6}{n(n^2-1)}\sum_{k=1}^{n}(A_k - B_k)^2 \quad \text{(Spearman's rank correlation)} \qquad (19.7\text{-}23)$$

is asymptotically normal with mean 0 and variance 1/(n − 1) as n → ∞. For any value of n > 1, the hypothesis of statistical independence is rejected at the level of significance ≤ α (Sec. 19.6-3) if

$$R > \frac{u_{1-\alpha}}{\sqrt{n-1}} \quad \text{(one-tailed test)} \qquad (19.7\text{-}24)$$

or

$$|R| > \frac{u_{1-\alpha/2}}{\sqrt{n-1}} \quad \text{(two-tailed test)} \qquad (19.7\text{-}25)$$

Note: If x and y have a normal joint distribution, then 2 sin (πR/6) is a consistent estimate of their correlation coefficient ρ_xy.
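The rank statistic (19.7-23) can be sketched as follows; the data (and the tie-free ranking helper) are illustrative:

```python
# Spearman rank correlation R of Eq. (19.7-23) from the ranks A_k, B_k
# (rank 1 = largest value; ties are assumed absent).  Data are illustrative.
def ranks(v):
    order = sorted(range(len(v)), key=lambda i: -v[i])
    r = [0] * len(v)
    for pos, i in enumerate(order, start=1):
        r[i] = pos
    return r

x = [3.1, 1.2, 5.0, 4.4]
y = [2.0, 0.5, 4.1, 4.9]
A, B = ranks(x), ranks(y)
n = len(x)
R = 1 - 6 * sum((a - b) ** 2 for a, b in zip(A, B)) / (n * (n * n - 1))
print(R)   # -> 0.8
```

For a real test, |R| would be compared with u_{1−α/2}/√(n − 1) as in (19.7-25).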
19.8. RANDOM-PROCESS STATISTICS AND MEASUREMENTS
19.8-1. Simple Finite-time Averages.
Let

$$f(t_1, t_2, \ldots, t_n) = f[x(t_1), x(t_2), \ldots, x(t_n); y(t_1), y(t_2), \ldots, y(t_n); \ldots] \qquad (19.8\text{-}1)$$

be a given measurable function of the sample values x(t_i), y(t_i), ... generated by the one-dimensional or multidimensional random process
x(t), y(t), ... (Sec. 18.9-1). The finite-time averages

$$[f]_n = \frac{1}{n}\sum_{k=1}^{n} f(t_1 + k\,\Delta t,\ t_2 + k\,\Delta t,\ \ldots,\ t_n + k\,\Delta t) \equiv \frac{1}{n}\sum_{k=1}^{n} f(k\,\Delta t) \qquad (19.8\text{-}2)$$

$$\langle f\rangle_T = \frac{1}{T}\int_0^T f(t_1 + \lambda,\ t_2 + \lambda,\ \ldots,\ t_n + \lambda)\,d\lambda \equiv \frac{1}{T}\int_0^T f(\lambda)\,d\lambda \qquad (19.8\text{-}3)$$

can be obtained, respectively, through sampled-data and continuous averaging of finite real data. [f]_n and ⟨f⟩_T are random variables whose distributions are determined by the given random process. If x(t) represents a stationary (but not necessarily ergodic) random process, then

$$E\{[f]_n\} = E\{\langle f\rangle_T\} = E\{f\} \qquad (19.8\text{-}4)$$
The finite-time averages [f]_n and ⟨f⟩_T are, then, unbiased estimates of the unknown expected value E{f} and will be useful for estimating E{f} if their random fluctuations about their expected value (4) are reasonably small. More specifically, the mean-square error associated with estimation of E{f} by a measured value of [f]_n is

$$\operatorname{Var}\{[f]_n\} = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{k=1}^{n}\operatorname{Cov}\{f(i\,\Delta t), f(k\,\Delta t)\} = \frac{1}{n}\operatorname{Var}\{f\} + \frac{2}{n}\sum_{k=1}^{n-1}\left(1 - \frac{k}{n}\right)\operatorname{Cov}\{f(0), f(k\,\Delta t)\} \qquad (19.8\text{-}5)$$

If one introduces k Δt = λ and lets Δt → 0, n = T/Δt → ∞, then the sampled-data average (2) converges to the continuous average (3) if the latter exists. A similar limiting process applied to Eq. (5) yields

$$\operatorname{Var}\{\langle f\rangle_T\} = \frac{2}{T}\int_0^T\left(1 - \frac{\lambda}{T}\right)\operatorname{Cov}\{f(0), f(\lambda)\}\,d\lambda \qquad (19.8\text{-}6)$$

Depending on the nature of the autocovariance function

$$\operatorname{Cov}\{f(0), f(\lambda)\} = E\{[f(0) - E\{f\}][f(\lambda) - E\{f\}]\} = E\{f(0)f(\lambda)\} - [E\{f\}]^2 = R_{ff}(\lambda) - R_{ff}(\infty) = K_{ff}(\lambda) \qquad (19.8\text{-}7)$$

the mean-square error (5) or (6) may or may not decrease to an acceptably small value with increasing sample size n or integration time T.
19.8-2. Averaging Filters. Time averaging is often accomplished by low-pass filters implemented as electrical networks, by electromechanical measuring devices
with inertia and damping, or by digital sampled-data operations. Consider, in particular, a general time-invariant averaging filter with stationary input f(t) = f(t_1 + t, t_2 + t, ..., t_n + t), bounded weighting function h(ζ), and frequency-response function H(iω) (Sec. 9.4-7). If the filter input is applied between t = 0 and t = T (averaging time), the filter output is

$$z(T) = \int_0^T h_T(T - \lambda)f(\lambda)\,d\lambda = \int_0^T h_T(\zeta)f(T - \zeta)\,d\zeta \qquad (19.8\text{-}8)$$

$$h_T(\zeta) = \begin{cases} h(\zeta) & (0 \le \zeta \le T) \\ 0 & \text{otherwise} \end{cases} \qquad (19.8\text{-}9)$$

Hence

$$E\{z(T)\} = E\{f\}\int_0^T h(\zeta)\,d\zeta = a(T)\,E\{f\} \qquad (19.8\text{-}10)$$
so that z(T)/a(T) is an unbiased estimate of E{f}. The estimate variance is given by
$$\operatorname{Var}\{z(T)\} = E\{[z(T) - E\{z(T)\}]^2\} = \int_{-\infty}^{\infty} w_{h_T h_T}(\lambda)\bigl\{R_{ff}(\lambda) - [E\{f\}]^2\bigr\}\,d\lambda \qquad (19.8\text{-}11)$$

where (see also Sec. 18.12-3)

$$w_{h_T h_T}(\lambda) = \int_{-\infty}^{\infty} h_T(\zeta)h_T(\zeta + \lambda)\,d\zeta \qquad (19.8\text{-}12)$$
As T → ∞, Var{z(T)} will not in general go to zero but, rather, approaches the stationary output variance

$$\operatorname{Var}\{z(\infty)\} = \int_{-\infty}^{\infty} w_{hh}(\lambda)\bigl\{R_{ff}(\lambda) - [E\{f\}]^2\bigr\}\,d\lambda = \frac{1}{2\pi}\int_{-\infty}^{\infty}|H(i\omega)|^2\Phi(\omega)\,d\omega \approx 2|H(0)|^2\Phi(0)B_{\mathrm{eq}} \qquad (19.8\text{-}13)$$

where Φ(ω) is the power spectral density (Sec. 18.10-3) of f(t) − E{f}, and B_eq is the bandwidth of an equivalent "rectangular" low-pass filter having the frequency response

$$H_{\mathrm{eq}}(i\omega) = \begin{cases} H(0) & (|\omega| < 2\pi B_{\mathrm{eq}}) \\ 0 & \text{otherwise} \end{cases}$$
B_eq is a useful measure of the variance-reducing properties of a given averaging filter. Table 19.8-1 lists H(iω) and B_eq for some practical filters (flat-spectrum input).

Table 19.8-1. Averaging Filters (Ref. 19.25)

Filter | H(iω) | B_eq
First-order filter | 1/(iωT_0 + 1) | 1/(4T_0)
Second-order filters:
  (two time constants) | 1/[(iωT_1 + 1)(iωT_2 + 1)] | 1/[4(T_1 + T_2)]
  Critically damped system | 1/(iωT_0 + 1)² | 1/(8T_0)
  Underdamped system | (a² + ω_1²)/[(iω + a)² + ω_1²] | (a² + ω_1²)/(8a)
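The closed-form B_eq entries of Table 19.8-1 can be checked against the defining integral of Eq. (19.8-13); a minimal numerical sketch for the first-order filter:

```python
from math import pi

# Equivalent noise bandwidth B_eq = (1/(4*pi*|H(0)|^2)) * integral of
# |H(i w)|^2 over all w, evaluated numerically for the first-order filter
# of Table 19.8-1; the closed form is 1/(4*T0).  T0 is illustrative.
T0 = 2.0

def H2(w):                         # |H(i w)|^2 for H = 1/(i w T0 + 1)
    return 1.0 / (1.0 + (w * T0) ** 2)

W, N = 2000.0, 400_000             # wide symmetric range, fine grid
dw = 2 * W / N
integral = sum(H2(-W + k * dw) for k in range(N + 1)) * dw
B_eq = integral / (4 * pi)
print(round(B_eq, 4), round(1 / (4 * T0), 4))   # both ≈ 0.125
```

The same check applies to the other rows of the table by substituting the corresponding |H(iω)|².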
19.8-3. Examples (Ref. 19.25).
(a) Measurement of Mean Value. It is desired to measure the mean value E{x} = ξ of a stationary random voltage x(t) with

$$R_{xx}(\tau) = \sigma^2 e^{-\alpha|\tau|} + \xi^2 \qquad \Phi_{xx}(\omega) = \frac{2\alpha\sigma^2}{\omega^2 + \alpha^2} + 2\pi\xi^2\delta(\omega) \qquad (19.8\text{-}14)$$

(white noise passed through a simple filter with −3-db bandwidth α/2π cps, or random telegraph wave with counting rate α/2, Sec. 18.11-5). In this case,

$$\operatorname{Var}\{\langle x\rangle_T\} = \frac{2\sigma^2}{T}\int_0^T\left(1 - \frac{\lambda}{T}\right)e^{-\alpha\lambda}\,d\lambda = \frac{2\sigma^2}{\alpha^2 T^2}\bigl(\alpha T - 1 + e^{-\alpha T}\bigr) \le \frac{2\sigma^2}{\alpha T} \qquad (19.8\text{-}15)$$

and for the first-order averaging filter of Table 19.8-1 with T ≫ T_0 (T > 4T_0 for most practical purposes),

$$\operatorname{Var}\{z(T)\} \approx \operatorname{Var}\{z(\infty)\} = \frac{\sigma^2}{1 + \alpha T_0} \qquad (19.8\text{-}16)$$
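The variance formula (19.8-15) and its bound are easy to evaluate; a minimal sketch with illustrative parameter values:

```python
from math import exp

# Variance of the finite-time average <x>_T from Eq. (19.8-15) for
# R_xx(tau) = sigma^2 exp(-alpha |tau|) + xi^2, together with the
# bound 2 sigma^2 / (alpha T).  Parameter values are illustrative.
sigma2, alpha, T = 1.0, 0.5, 40.0
var = 2 * sigma2 / (alpha * T) ** 2 * (alpha * T - 1 + exp(-alpha * T))
bound = 2 * sigma2 / (alpha * T)
print(round(var, 4), bound)   # -> 0.095 0.1
```

For αT = 20 the exact variance is already within 5 per cent of the asymptotic bound, illustrating how slowly the estimate improves with T.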
(b) Measurement of Mean Square. It is desired to measure E{f} = E{x²} for Gaussian noise x(t) satisfying Eq. (14). In this case,

$$R_{ff}(\tau) = 2\sigma^4 e^{-2\alpha|\tau|} + 4\xi^2\sigma^2 e^{-\alpha|\tau|} + (\sigma^2 + \xi^2)^2 \qquad (19.8\text{-}17)$$

$$\operatorname{Var}\{\langle x^2\rangle_T\} = \frac{\sigma^4}{\alpha^2 T^2}\bigl(2\alpha T - 1 + e^{-2\alpha T}\bigr) \qquad (\xi = 0) \qquad (19.8\text{-}18)$$

and for the first-order averaging filter with T ≫ T_0 and f = x²,

$$\operatorname{Var}\{z(T)\} \approx \operatorname{Var}\{z(\infty)\} = \frac{2\sigma^4}{1 + 2\alpha T_0} \qquad (19.8\text{-}19)$$
(c) Measurement of Correlation Functions (see also Sec. 18.9-3b). The variances of correlation-function estimates for jointly stationary x(t), y(t) are given by Eqs. (5), (6), and (11) with f(t) ≡ x(t)y(t + τ). Unfortunately, each variance depends on

$$E\{f(0)f(\lambda)\} = E\{x(t)x(t + \lambda)y(t + \tau)y(t + \tau + \lambda)\} \qquad (19.8\text{-}20)$$

and this fourth-order moment of the joint distribution of x(t) and y(t) is hardly ever known. In the special case of jointly Gaussian and stationary signals x(t), y(t) with zero mean values,

$$E\{f(0)f(\lambda)\} = R_{xx}(\lambda)R_{yy}(\lambda) + R_{xy}^2(\tau) + R_{xy}(\tau + \lambda)R_{xy}(\tau - \lambda) \qquad (19.8\text{-}21)$$

but even this involves the unknown correlation function R_xy(τ) itself and hence yields useful information only in simple special cases. For stationary Gaussian x(t),

$$\operatorname{Var}\{\langle x(t)x(t + \tau)\rangle_T\} \le \frac{4}{T}\int_0^\infty R_{xx}^2(\lambda)\,d\lambda \qquad (|\tau| \ll T) \qquad (19.8\text{-}22)$$

To illustrate the dependence of the autocorrelation-function estimate

$$\langle x(t)x(t + \tau)\rangle_T = \frac{1}{T}\int_0^T x(\lambda)x(\lambda + \tau)\,d\lambda \qquad (19.8\text{-}23)$$
on signal bandwidth and delay, consider again Gaussian noise x(t) satisfying Eq. (14). In this case (for ξ = 0),

$$\operatorname{Var}\{\langle x(t)x(t+\tau)\rangle_T\} = \frac{\sigma^4}{2\alpha^2 T^2}\Bigl\{2\alpha T - 1 + 2e^{-2\alpha T} + \bigl[(2\alpha\tau + 1)(2\alpha T - 1) - 2(\alpha\tau)^2\bigr]e^{-2\alpha\tau}\Bigr\} \qquad (T \ge \tau \ge 0) \qquad (19.8\text{-}24)$$
For τ = 0, this agrees with Eq. (18). For observation times T large compared to the reciprocal signal bandwidth,

$$\frac{\sigma^4}{\alpha T} \le \operatorname{Var}\{\langle x(t)x(t+\tau)\rangle_T\} \le \frac{2\sigma^4}{\alpha T} \qquad (|\tau| \ll T) \qquad (19.8\text{-}25)$$

(within 1 per cent for αT > 10⁴).
For the more general case of a stationary Gaussian signal x(t) with

$$R_{xx}(\tau) = \sigma^2 e^{-\alpha|\tau|}\cos\omega_1\tau + [E\{x\}]^2 \qquad (19.8\text{-}26)$$

one similarly obtains

$$\operatorname{Var}\{\langle x(t)x(t+\tau)\rangle_T\} \le \frac{2\sigma^4}{\alpha T} \qquad (\alpha T > 10^4,\ |\tau| \ll T) \qquad (19.8\text{-}27)$$
19.8-4. Sample Averages. Different sample functions generated by the same random process will be denoted by ¹x(t), ²x(t), ... (Fig. 18.9-1) and are regarded as statistically independent; i.e., every finite set of samples ʲx(t_1), ʲx(t_2), ... is statistically independent of every set of samples from a different sample function ᵏx(t). If one can realize a set of sample functions ¹x(t), ²x(t), ..., ⁿx(t) in independent repeated experiments, then the sample values ¹x(t_1), ²x(t_1), ..., ⁿx(t_1) constitute a classical random sample of size n; i.e., the ᵏx(t_1) are statistically independent random variables with identical probability distributions. Similarly, ¹x(t_1), ¹x(t_2); ²x(t_1), ²x(t_2); ...; ⁿx(t_1), ⁿx(t_2) or ¹x(t_1), ¹y(t_2); ²x(t_1), ²y(t_2); ...; ⁿx(t_1), ⁿy(t_2) constitute bivariate random samples. Sample averages, like

$$\overline{x(t_1)} = \frac{1}{n}\bigl[{}^1x(t_1) + {}^2x(t_1) + \cdots + {}^nx(t_1)\bigr]$$
$$\overline{f[x(t_1), x(t_2), \ldots]} = \frac{1}{n}\sum_{k=1}^{n} f[{}^kx(t_1), {}^kx(t_2), \ldots] \qquad (19.8\text{-}28)$$
$$\overline{x(t_1)y(t_2)} = \frac{1}{n}\sum_{k=1}^{n}{}^kx(t_1)\,{}^ky(t_2)$$

are, then, random-sample statistics in the sense of classical statistical theory. Sample averages must be obtained from repeated or multiple
experiments, but it is usually much simpler to derive variances and probability distributions for sample averages than it is for time averages. In particular,

$$\operatorname{Var}\{\overline{x(t_1)}\} = \frac{1}{n}\operatorname{Var}\{x(t_1)\} \qquad \operatorname{Var}\{\bar f\} = \frac{1}{n}\operatorname{Var}\{f\} \qquad \operatorname{Var}\{\overline{x(t_1)y(t_2)}\} = \frac{1}{n}\operatorname{Var}\{x(t_1)y(t_2)\} \qquad (19.8\text{-}29)$$

just as in Sec. 19.2-3 (see also Fig. 19.8-1).
Fig. 19.8-1. Four sample functions ᵏx(t) (k = 1, 2, 3, 4) from a continuous random process represented by x(t).
19.9. TESTING AND ESTIMATION WITH RANDOM PARAMETERS
19.9-1. Problem Statement. A practically important class of decision situations can be represented by the model of Fig. 19.9-1. The cost C (risk) associated with one operation of the system shown is a function of the state of the environment represented by the m-dimensional random variable (s_1, s_2, ..., s_m) ≡ s and by a decision (response)
represented by the r-dimensional variable (y_1, y_2, ..., y_r) ≡ y, i.e.,
C = C(s, y)
(19.9-1)
[Fig. 19.9-1. Context for statistical decisions. Noise and the state of the environment s (external conditions affecting system performance) determine the stimuli received via sense organs or devices (communication, perception, observation, measurements); the decision-making mechanism, which involves a priori knowledge about system and environment, produces the responses y = y(x) (decisions, control forces); the system performance resulting from the interaction of system and environment is measured by C(s, y).]

The decision maker (man, machine, or system) arrives at the decision y on the basis of received, observed, or measured data represented by the
n-dimensional random variable (x_1, x_2, ..., x_n) ≡ x, which is related to the state of the environment through the given joint distribution of s and x. The decision maker forms y as a deterministic function (decision function, see also Sec. 19.6-9) of the received data:*

y = y(x)   (19.9-2)

Given the joint distribution of s and x together with the cost function (1) representing the system performance for each combination of environment state and decision, one desires to minimize the expected risk

$$E\{C(s, y)\} = \int_s\int_x C[s, y(x)]\,\varphi(s, x)\,ds\,dx \qquad (19.9\text{-}3)$$

s, x, and y may be discrete or continuous variables.

* Random or partially random selection of decisions (as in games with mixed strategies, Sec. 11.4-4b) will not be considered here.
19.9-2. Bayes Estimation and Tests. If the environment-state parameters s_1, s_2, ..., s_m are regarded as parameters of the unknown probability distribution of the observed sample (x_1, x_2, ..., x_n), the problem is somewhat similar to the classical problems of testing and estimation; the essential difference is that the parameters s_1, s_2, ..., s_m are now random variables.
For continuous variables s, x, the decision maker's knowledge of the environment state on the basis of a received sample x is, then, expressed by the conditional probability density φ(s|x). Minimization of the expected risk (3), or

$$E\{C(s, y)\} = \int_x \varphi(x)\,dx\int_s C[s, y(x)]\,\varphi(s|x)\,ds \qquad (19.9\text{-}4)$$

then requires minimization of the conditional risk

$$E\{C(s, y)\,|\,x\} = \int_s C[s, y(x)]\,\varphi(s|x)\,ds \qquad (19.9\text{-}5)$$

for each sample x through a proper choice of the decision function y(x). If one is given the "a priori" distribution of s and the conditional probability density φ(x|s), Bayes's formula yields

$$\varphi(s|x) = \frac{\varphi(s)\varphi(x|s)}{\displaystyle\int_s \varphi(s)\varphi(x|s)\,ds} \qquad (19.9\text{-}6)$$
Decision processes based on such minimization of the expected risk are known as Bayes estimation, and as Bayes tests if y is a discrete variable. Note that not all the unknown parameters s_i need affect C[s, y] explicitly (e.g., signal-carrier phase in amplitude-modulated radio transmission, Ref. 19.23). If, as is often the case, one has no reliable knowledge of the cost function C(s, y) and/or of the "a priori" distribution of s, then Bayes estimation and testing becomes impossible. One may then assume "worst-case" forms of C(s, y) and of the "a priori" distribution of s.
19.9-3. Binary State and Decision Variables: Statistical Tests or Detection (see also Secs. 19.6-1 to 19.6-4). Assume that there exist only two environment states, s = 0 (null hypothesis) and s = 1 (alternative hypothesis, corresponding, say, to the absence and presence of a target to be detected). Assume two possible decisions y = 0, y = 1, which correspond to acceptance or rejection of the null hypothesis on the basis of the observed data sample (x_1, x_2, ..., x_n). The problem amounts
to the choice of a critical region (rejection region) S_c of sample points (x_1, x_2, ..., x_n) which will minimize the expected risk

$$\begin{aligned} E\{C(s, y)\} = {}& C(s{=}0, y{=}0)\,p_0\int_{\bar S_c}\varphi(x_1, x_2, \ldots, x_n|s{=}0)\,dx_1\,dx_2\cdots dx_n \\ &+ C(s{=}0, y{=}1)\,p_0\int_{S_c}\varphi(x_1, x_2, \ldots, x_n|s{=}0)\,dx_1\,dx_2\cdots dx_n \\ &+ C(s{=}1, y{=}0)(1 - p_0)\int_{\bar S_c}\varphi(x_1, x_2, \ldots, x_n|s{=}1)\,dx_1\,dx_2\cdots dx_n \\ &+ C(s{=}1, y{=}1)(1 - p_0)\int_{S_c}\varphi(x_1, x_2, \ldots, x_n|s{=}1)\,dx_1\,dx_2\cdots dx_n \end{aligned} \qquad (19.9\text{-}7)$$

where p_0 = P[s = 0], and S̄_c is the complement of S_c (acceptance region). E{C(s, y)} will be minimized if one rejects the null hypothesis whenever the likelihood ratio
$$\Lambda(x_1, x_2, \ldots, x_n) = \frac{\varphi(x_1, x_2, \ldots, x_n|s{=}1)}{\varphi(x_1, x_2, \ldots, x_n|s{=}0)} \qquad (19.9\text{-}8)$$

(see also Sec. 19.6-3) exceeds the critical value

$$\Lambda_c = \frac{p_0}{1 - p_0}\,\frac{C(s{=}0, y{=}1) - C(s{=}0, y{=}0)}{C(s{=}1, y{=}0) - C(s{=}1, y{=}1)} = \frac{p_0}{1 - p_0}\,\frac{\text{added cost of false rejection (false alarm)}}{\text{added cost of false acceptance (miss)}} \qquad (19.9\text{-}9)$$
Note that (1) any increasing or decreasing function of the likelihood ratio (8) can replace the latter as a test statistic, and (2) the likelihood ratio (8) is itself a monotonic function of the "a posteriori" conditional probability p(s|x_1, x_2, ..., x_n), which may be regarded as a basic test statistic.
EXAMPLE: Signal Detection with Additive Flat-spectrum Gaussian Noise. The objective is to decide whether a received signal x(t) of bandwidth B is due to noise alone [s = 0, or x(t) = n(t)] or to signal and additive noise [s = 1, or x(t) = s(t) + n(t)]; in view of the finite bandwidth, one can describe signals and noise in terms of the sample values x_k = x(k Δt)
with

$$\Delta t = \frac{1}{2B} \qquad s_k = s(k\,\Delta t) \qquad n_k = n(k\,\Delta t) \qquad (k = 1, 2, \ldots, 2BT)$$

where T is the observation time (Sec. 18.11-2). One is given

$$\varphi(x_1, x_2, \ldots, x_{2BT}|s{=}0) = (2\pi\sigma^2)^{-BT}\exp\left[-\frac{1}{2\sigma^2}\sum_{k=1}^{2BT} x_k^2\right]$$
$$\varphi(x_1, x_2, \ldots, x_{2BT}|s{=}1) = (2\pi\sigma^2)^{-BT}\exp\left[-\frac{1}{2\sigma^2}\sum_{k=1}^{2BT}(x_k - s_k)^2\right]$$

where σ² denotes the noise power E{n_k²}.
so that

$$\Lambda(x_1, x_2, \ldots, x_{2BT}) = \exp\left[-\frac{1}{2\sigma^2}\sum_{k=1}^{2BT} s_k^2 + \frac{1}{\sigma^2}\sum_{k=1}^{2BT} s_k x_k\right]$$

Since

$$\frac{1}{2B}\sum_{k=1}^{2BT} s_k x_k = \sum_{k=1}^{2BT} s(k\,\Delta t)x(k\,\Delta t)\,\Delta t \approx \int_0^T s(t)x(t)\,dt \equiv z(x_1, x_2, \ldots, x_{2BT})$$

is an increasing function of the likelihood ratio Λ(x_1, x_2, ..., x_{2BT}), z rather than Λ may be used as a test statistic. The receiver forms z(x_1, x_2, ..., x_{2BT}), either by discrete summation or by continuous integration, and compares it with a threshold value z_c determined by p_0 and C(s, y); z > z_c corresponds to the decision y = 1 (crosscorrelation detector or matched-filter detector). See Refs. 19.22, 19.23, and 19.26 to 19.28 for additional examples and applications.
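The threshold logic of Eqs. (19.9-8) and (19.9-9) can be sketched for the simplest case of a single Gaussian observation; the signal level, prior, and costs below are illustrative:

```python
from math import exp, log

# Bayes likelihood-ratio test of Eqs. (19.9-8)/(19.9-9) for one observation:
# x = n (s = 0) versus x = a + n (s = 1), unit-variance Gaussian noise.
# The signal level a, prior p0, and costs are illustrative assumptions.
a = 1.0                        # known signal level
p0 = 0.5                       # a priori probability P[s = 0]
cost_fa, cost_miss = 1.0, 1.0  # added costs of false alarm and miss

def likelihood_ratio(x):       # phi(x|s=1)/phi(x|s=0)
    return exp(a * x - a * a / 2)

Lc = p0 / (1 - p0) * cost_fa / cost_miss
# decide y = 1 when Lambda(x) > Lc, i.e. when x > (log Lc)/a + a/2
x_threshold = log(Lc) / a + a / 2
print(x_threshold)   # -> 0.5
```

Because Λ is monotonic in x, the test reduces to comparing the observation itself with a fixed threshold, just as z is compared with z_c above.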
19.9-4. Estimation (Signal Extraction, Regression). In a typical measurement situation, the environment has a continuously distributed set of states represented by values of s ≡ (s_1, s_2, ..., s_m), and the m components y_k of the m-dimensional decision function y ≡ (y_1, y_2, ..., y_m) must be chosen so as to approximate the corresponding s_k as closely as possible in some sense specified by the cost function C(s, y). In the practically important case of least-squares estimation, one assumes a cost function of the form

$$C(s, y) = C(s_1, s_2, \ldots, s_m;\ y_1, y_2, \ldots, y_m) = \sum_{k=1}^{m}(s_k - y_k)^2 \qquad (19.9\text{-}10)$$
In this case, E{C(s, y)} is minimized if each y_k equals the conditional expected value of s_k for the measured sample (x_1, x_2, ..., x_n):

$$y_k = E\{s_k|x_1, x_2, \ldots, x_n\} = \int_s s_k\,\varphi(s|x_1, x_2, \ldots, x_n)\,ds \qquad (19.9\text{-}11)$$

(see also Secs. 18.4-5, 18.4-6, and 18.4-9), where the conditional probability density φ(s|x_1, x_2, ..., x_n) is given by Bayes's formula (6).
EXAMPLE: D-c Measurement with Additive Gaussian Noise. The objective is to measure a single random quantity s from a sample (x_1, x_2, ..., x_n) of measured values x_k = s + n_k, given

$$\varphi_s(s) = \frac{1}{\sigma_s\sqrt{2\pi}}\,e^{-(s-\xi_s)^2/2\sigma_s^2} \qquad \text{("a priori" probability density of } s\text{)}$$
$$\varphi(x_1, x_2, \ldots, x_n|s) = (2\pi\sigma^2)^{-n/2}\exp\left[-\frac{1}{2\sigma^2}\sum_{k=1}^{n}(x_k - s)^2\right]$$

which corresponds to additive-noise samples n_k statistically independent of each other and of s. Bayes's formula yields a normal "a posteriori" density φ(s|x_1, x_2, ..., x_n) with the conditional mean

$$\hat s = E\{s|x_1, x_2, \ldots, x_n\} = \frac{\dfrac{\xi_s}{\sigma_s^2} + \dfrac{n\bar x}{\sigma^2}}{\dfrac{1}{\sigma_s^2} + \dfrac{n}{\sigma^2}}$$

E{(y − s)²} is then minimized by the least-squares estimate y = ŝ, which depends on the sample values x_k only by way of the statistic x̄ (sample average) and is biased toward the a priori mean ξ_s by the a priori knowledge of s.
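The conditional-mean estimate of this example is a weighted blend of prior mean and sample average; a minimal sketch with illustrative numbers:

```python
# Least-squares (conditional-mean) estimate for the d-c measurement
# example of Sec. 19.9-4: normal prior on s, n independent Gaussian
# noise samples.  All numerical values are illustrative.
xi_s, var_s = 1.0, 4.0        # a priori mean and variance of s
var_n = 2.0                   # noise variance sigma^2
x = [1.4, 1.8, 2.2, 1.6]      # measured values x_k = s + n_k
n = len(x)
xbar = sum(x) / n

s_hat = (xi_s / var_s + n * xbar / var_n) / (1 / var_s + n / var_n)
print(s_hat)   # lies between the prior mean xi_s and the sample mean xbar
```

As n grows (or as the prior variance grows), the weight shifts toward x̄ and the estimate approaches the classical sample average.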
References 19.22, 19.23, and 19.26 to 19.28 describe additional applications.
19.10. RELATED TOPICS, REFERENCES, AND BIBLIOGRAPHY
19.10-1. Related Topics. The following topics related to the study of mathematical statistics are treated in other chapters of this handbook:

Probability theory, special distributions, limit theorems, random processes: Chap. 18
Numerical calculations: Chap. 20
Linear programming, game theory: Chap. 11
Combinatorial analysis: Appendix C
19.10-2. References and Bibliography (see also Sec. 18.13-2).
19.1. Brownlee, K. A.: Statistical Theory and Methodology in Science and Engineering, Wiley, New York, 1961.
19.2. Brunk, H. D.: An Introduction to Mathematical Statistics, 2d ed., Blaisdell, New York, 1964.
19.3. Burington, R. S., and D. C. May: Handbook of Probability and Statistics with Tables, 2d ed., McGraw-Hill, New York, 1967.
19.4. Cramér, H.: Mathematical Methods of Statistics, Princeton University Press, Princeton, N.J., 1951.
19.5. Dixon, W. J., and F. J. Massey, Jr.: An Introduction to Statistical Analysis, 2d ed., McGraw-Hill, New York, 1957. 19.6. Eisenhart, G. C, M. W. Hastay, and W. A. Wallis: Techniques of Statistical Analysis, McGraw-Hill, New York, 1947. 19.7. Elderton, W. P.: Frequency Curves and Correlation, 3d ed., Cambridge, New York, 1938.
19.8. Hald, A.: Statistical Theory with Engineering Applications, Wiley, New York, 1952.
19.9. Hoel, P. G.: Elementary Statistics, 2d ed., Wiley, New York, 1966. 19.10. Hogg, R. V., and A. T. Craig: Introduction to Mathematical Statistics, Macmillan, New York, 1959. 19.11. Lehmann, E. L.: Testing Statistical Hypotheses, Wiley, New York, 1959. 19.12. Mood, A. M., and F. A. Graybill: Introduction to the Theory of Statistics, 2d ed., McGraw-Hill, New York, 1963. 19.13. Neyman, J.: First Course in Probability and Statistics, Holt, New York, 1950. 19.14. Schefife, H.: The Analysis of Variance, Wiley, New York, 1959. 19.15. Van der Waerden, B. L.: Mathematische Statistik, 2d ed., Springer, Berlin, 1965. 19.16. Walsh, J. E.: Handbook of Nonparametric Statistics (2 vols.), Van Nostrand, Princeton, N.J., 1960/65. 19.17. Weiss, L.: Statistical Decision Theory, McGraw-Hill, New York, 1961. 19.18. Wilks, S. S.: Mathematical Statistics, 2d ed., Wiley, New York, 1962. Random-process Statistics and Decision Theory
19.19. Bendat, J. S.: Principles and Applications of Random Noise Theory, Wiley, New York, 1958. 19.20. and A. G. Piersol: Measurement and Analysis of Random Data, Wiley, New York, 1966. 19.21. Blackman, R. B., and J. W. Tukey: The Measurement of Power Spectra, Dover, New York, 1958. 19.22. Davenport, W. B., and W. L. Root: Introduction to Random Signals and Noise, McGraw-Hill, New York, 1958. 19.23. Hancock, J. C: Signal Detection, McGraw-Hill, New York, 1966. 19.24. Helstrom, C. W.: Statistical Theory of Signal Detection, Pergamon Press, New York, 1960. 19.25. Korn, G. A.: Random-process Simulation and Measurements, McGraw-Hill, New York, 1966. 19.26. Middleton, D.: An Introduction to Statistical Communication Theory, McGrawHill, New York, 1960. 19.27. Wainstein, L. A., and V. D. Zubakov: Extraction of Signals from Noise, Pren tice-Hall, Englewood Cliffs, N.J., 1962. 19.28- Wozencraft, J. M., and I. M. Jacobs: Principles of Communication Engineering, Wiley, New York, 1965.
CHAPTER 20

NUMERICAL CALCULATIONS AND FINITE DIFFERENCES
20.1. Introduction
    20.1-1. Survey
    20.1-2. Errors

20.2. Numerical Solution of Equations
    20.2-1. Introduction
    20.2-2. Iteration Methods. Newton-Raphson Method and Regula Falsi
    20.2-3. Numerical Solution of Algebraic Equations: Computation of Polynomial Values
        (a) Successive Multiplications (b) Horner's Scheme
    20.2-4. Solution of Algebraic Equations: Iteration Methods
        (a) General Methods (b) Muller's Method (c) The Bairstow Method
    20.2-5. Additional Methods for Solving Algebraic Equations
        (a) The Quotient-difference Algorithm (b) Graeffe's Root-squaring Process (c) A Matrix Method (d) Horner's Method
    20.2-6. Systems of Equations and the Problem of Finding Maxima and Minima
        (a) Problem Statement. Iteration Methods (b) Varying One Unknown at a Time (c) Search Methods for Finding Maxima and Minima (d) Treatment of Constraints
    20.2-7. Steepest-descent Methods (Gradient Methods)
        (a) Method of Steepest Descent (b) Descent with Computed Gradient Components
    20.2-8. The Newton-Raphson Method and the Kantorovich Theorem

20.3. Linear Simultaneous Equations, Matrix Inversion, and Matrix Eigenvalue Problems
    20.3-1. "Direct" or Elimination Methods for Linear Simultaneous Equations
        (a) Introduction (b) Basic Elimination Procedure (Gauss's Pivotal-condensation Method) (c) An Improved Elimination Scheme (Banachiewicz-Cholesky-Crout Method) (d) Direct Methods in Matrix Form
    20.3-2. Iteration Methods for Linear Equations
        (a) Introduction (b) Gauss-Seidel-type Iteration (c) Relaxation Methods (d) Systematic Overrelaxation (e) Steepest-descent Methods (Gradient Methods) (f) A Conjugate-gradient Method
    20.3-3. Matrix Inversion
    20.3-4. A Partitioning Scheme for the Solution of Simultaneous Linear Equations or for Matrix Inversion
    20.3-5. Eigenvalues and Eigenvectors of Matrices
        (a) The Characteristic Equation (b) An Iteration Method for Hermitian Matrices (c) A Relaxation Method (d) Jacobi-Von Neumann-Goldstine Rotation Process for Real Symmetric Matrices (e) Other Methods

20.4. Finite Differences and Difference Equations
    20.4-1. Finite Differences and Central Means
    20.4-2. Operator Notation
    20.4-3. Difference Equations
    20.4-4. Linear Ordinary Difference Equations
        (a) Superposition Theorems (b) The Method of Variation of Constants
    20.4-5. Linear Ordinary Difference Equations with Constant Coefficients: Method of Undetermined Coefficients
    20.4-6. Transform Methods for Linear Difference Equations with Constant Coefficients
        (a) The z-transform Method (b) Sampled-data Representation in Terms of Impulse Trains and Jump Functions. Laplace-transform Method
    20.4-7. Systems of Ordinary Difference Equations (State Equations). Matrix Notation
    20.4-8. Stability

20.5. Approximation of Functions by Interpolation
    20.5-1. Introduction
    20.5-2. General Formulas for Polynomial Interpolation (Argument Values Not Necessarily Equally Spaced)
        (a) Lagrange's Interpolation Formula (b) Divided Differences and Newton's Interpolation Formula (c) Aitken's Iterated-interpolation Method (d) The Remainder
    20.5-3. Interpolation Formulas for Equally Spaced Argument Values. Lozenge Diagrams
        (a) Newton-Gregory Interpolation Formulas (b) Symmetric Interpolation Formulas (c) Use of Lozenge Diagrams
    20.5-4. Inverse Interpolation
    20.5-5. "Optimum-interval" Interpolation
    20.5-6. Multiple Polynomial Interpolation
    20.5-7. Reciprocal Differences and Rational-fraction Interpolation

20.6. Approximation by Orthogonal Polynomials, Truncated Fourier Series, and Other Methods
    20.6-1. Introduction
    20.6-2. Least-squares Approximation over an Interval
    20.6-3. Least-squares Polynomial Approximation over a Set of Points
        (a) Optimum-interval Tabulation (b) Evenly Spaced Data
    20.6-4. Least-maximum-absolute-error Approximations
    20.6-5. Economization of Power Series
    20.6-6. Numerical Harmonic Analysis and Trigonometric Interpolation
    20.6-7. Miscellaneous Approximations

20.7. Numerical Differentiation and Integration
    20.7-1. Numerical Differentiation
        (a) Use of Difference Tables (b) Use of Divided Differences (c) Numerical Differentiation after Smoothing (d) Approximate Differentiation of Truncated Fourier Series
    20.7-2. Numerical Integration Using Equally Spaced Argument Values
        (a) Newton-Cotes Quadrature Formulas (b) Gregory's Formula (c) Use of the Euler-MacLaurin Formula
    20.7-3. Gauss and Chebyshev Quadrature Formulas
    20.7-4. Derivation and Comparison of Integration Formulas
    20.7-5. Numerical Evaluation of Multiple Integrals

20.8. Numerical Solution of Ordinary Differential Equations
    20.8-1. Introduction and Notation
    20.8-2. One-step Methods for Initial-value Problems: Euler and Runge-Kutta-type Methods
    20.8-3. Multistep Methods for Initial-value Problems
        (a) Starting the Solution (b) Simple Extrapolation Schemes (c) Predictor-corrector Methods and Step-size Changes
    20.8-4. Improved Multistep Methods
    20.8-5. Discussion of Different Solution Methods, Step-size Control, and Stability
    20.8-6. Ordinary Differential Equations of Order Higher Than the First and Systems of Ordinary Differential Equations
    20.8-7. Special Formulas for Second-order Equations
    20.8-8. Frequency-response Analysis

20.9. Numerical Solution of Boundary-value Problems, Partial Differential Equations, and Integral Equations
    20.9-1. Introduction
    20.9-2. Two-point Boundary-value Problems Involving Ordinary Differential Equations: Difference Technique
    20.9-3. The Generalized Newton-Raphson Method (Quasilinearization)
    20.9-4. Finite-difference Methods for Numerical Solution of Partial Differential Equations with Two Independent Variables
    20.9-5. Two-dimensional Difference Operators
    20.9-6. Representation of Boundary Conditions
    20.9-7. Problems Involving Three or More Independent Variables
    20.9-8. Validity of Finite-difference Approximations. Some Stability Conditions
    20.9-9. Approximation-function Methods for Numerical Solution of Boundary-value Problems
        (a) Approximation by Functions Which Satisfy the Boundary Conditions Exactly (b) Approximation by Functions Which Satisfy the Differential Equation Exactly
    20.9-10. Numerical Solution of Integral Equations

20.10. Monte-Carlo Techniques
    20.10-1. Monte-Carlo Methods
    20.10-2. Two Variance-reduction Techniques
        (a) Stratified Sampling (b) Use of Correlated Samples
    20.10-3. Use of A Priori Information: Importance Sampling
    20.10-4. Some Random-number Generators. Tests for Randomness

20.11. Related Topics, References, and Bibliography
    20.11-1. Related Topics
    20.11-2. References and Bibliography
20.1. INTRODUCTION
20.1-1. Survey. Chapter 20 outlines a number of the best-known numerical computation schemes, with emphasis on mathematical methods rather than on detailed layout or programming. Sections 20.4-1 to 20.5-7 introduce the calculus of finite differences and the solution of difference equations. Finite-difference methods are of great interest not only for the numerical solution of differential equations but also in connection with many mathematical models whose variables change in discrete steps by nature rather than by necessity.
20.1-2. Errors. Aside from possible mistakes (blunders) in the course of numerical computations, there may be errors in the initial data, round-off errors due to the use of a finite number of digits, and truncation errors due to finite approximations of limiting processes. Approximate effects of small errors \Delta x_i or fractional errors \Delta x_i/x_i on a result f(x_1, x_2, . . .) can often be studied through differentiation (Sec. 4.5-3); thus
\Delta(x_1 + x_2) = \Delta x_1 + \Delta x_2        \Delta(x_1 - x_2) = \Delta x_1 - \Delta x_2        \Delta(x_1 x_2) \approx x_1 \Delta x_2 + x_2 \Delta x_1        (20.1-1)

\left|\frac{\Delta(x_1 \pm x_2)}{x_1 \pm x_2}\right| \le \frac{|\Delta x_1| + |\Delta x_2|}{|x_1 \pm x_2|}        \frac{\Delta(x_1 x_2)}{x_1 x_2} \approx \frac{\Delta x_1}{x_1} + \frac{\Delta x_2}{x_2}        (20.1-2)
The generation and propagation of errors in more complicated computations are subject to continuing studies; most analytical results of such studies are restricted to very specialized classes of computations (Refs. 20.4 to 20.7, 20.10, 20.44, 20.49, 20.58, and 20.59). It is desirable to check on mistakes and errors by various checking routines (e.g., resubstitution of approximate solutions into given equations) carried along between appropriate steps of the main computation. As a very rough rule of thumb, one may carry two significant figures in excess of those justified by the precision of the initial data or by the required accuracy. Convergent iteration schemes (Secs. 20.2-2, 20.2-4, 20.3-2, 20.8-2c, and 20.9-3) will tend to check and reduce the effects of errors, unless the errors affect the convergence. A computing scheme (program of numerical operations, algorithm) is said to be numerically stable if it does not increase the (absolute or relative) effects of round-off errors in the given data and in the calculations. More precise definitions of numerical stability can often be related to the asymptotic stability of solutions of difference equations (recurrence relations) in the manner of Sec. 20.8-5.
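The first-order propagation rules (20.1-1) and (20.1-2) are easy to verify numerically; the following sketch (with arbitrarily chosen values) compares the true error of a product with the linearized estimate:

```python
x1, x2 = 3.0, 5.0
dx1, dx2 = 1e-3, -2e-3

exact = (x1 + dx1) * (x2 + dx2) - x1 * x2   # true error of the product
approx = x1 * dx2 + x2 * dx1                # first-order estimate (20.1-1)

# The two differ only by the second-order term dx1*dx2.
print(exact, approx, exact - approx)

# Relative form (20.1-2): the fractional errors add.
rel = approx / (x1 * x2)
print(rel, dx1 / x1 + dx2 / x2)
```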
20.2. NUMERICAL SOLUTION OF EQUATIONS
20.2-1. Introduction. The numerical solution of any equation

f(z) = 0        (20.2-1)

should be preceded by a rough (often graphical) survey yielding information regarding the existence and position of real roots, estimates for trial solutions, etc. (see also Secs. 1.6-6 and 7.6-9). Solutions may be checked by resubstitution.
20.2-2. Iteration Methods. Newton-Raphson Method and Regula Falsi. The following procedures apply, in particular, to the solution of transcendental equations.
(a) Rewrite the given equation (1) in the form

z = \varphi(z)        (20.2-2)

(see also Sec. 20.2-2b). Starting with a trial solution z^{[0]}, compute successive approximations

z^{[j+1]} = \varphi(z^{[j]})        (j = 0, 1, 2, . . .)        (20.2-3)
The convergence of such approximations to a desired solution z requires separate investigation. The Contraction-mapping Theorem of Sec. 12.5-6 can often be applied to establish convergence and rate of convergence. The iteration is terminated when |z^{[j]} - z^{[j-1]}|/|z^{[j]}| is sufficiently small.
Collatz's Convergence Criterion and Error Estimate (Ref. 20.3). If it is possible to choose a complex-plane region D such that (1) there exists a positive real number M < 1 with the property

|\varphi(Z_1) - \varphi(Z_2)| \le M |Z_1 - Z_2|        for all Z_1, Z_2 in D

and (2) D contains z^{[0]}, z^{[1]}, and every point z having the property

|z - z^{[j]}| \le \frac{M}{1 - M} |z^{[j]} - z^{[j-1]}|        (20.2-4)

for some value of j > 0, then the approximations (3) converge to a solution z of Eq. (1) which satisfies Eq. (4) and is unique in D. Note that the right side of Eq. (4) is an upper bound for the error of the jth approximation z^{[j]} (see also Sec. 12.5-6).
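The basic iteration (3), with the relative-step termination test mentioned above, can be sketched as follows (illustrative code, not from the handbook):

```python
import math

def fixed_point(phi, z0, tol=1e-12, max_iter=100):
    """Iterate z[j+1] = phi(z[j]) (Eq. 20.2-3), stopping when the
    relative step |z[j] - z[j-1]| / |z[j]| is sufficiently small."""
    z = z0
    for _ in range(max_iter):
        z_new = phi(z)
        if abs(z_new - z) <= tol * abs(z_new):
            return z_new
        z = z_new
    raise RuntimeError("iteration did not converge")

# z = cos z treated as a fixed-point problem; phi is a contraction
# near the root, so the scheme converges from z[0] = 1.
root = fixed_point(math.cos, 1.0)
print(root)   # about 0.739085
```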
(b) In general, there are many possible ways of rewriting the given equation (1) in the form (2). Particular choices of \varphi(z) yield the iteration schemes

z^{[j+1]} = z^{[j]} - k f(z^{[j]})        (k \ne 0)        (20.2-5)

z^{[j+1]} = z^{[j]} - \frac{f(z^{[j]})}{f'(z^{[j]})}        (Newton-Raphson method)        (20.2-6)

z^{[j+1]} = z^{[j]} - \frac{f(z^{[j]})}{f'(z^{[j]})} - \frac{1}{2} \frac{[f(z^{[j]})]^2 f''(z^{[j]})}{[f'(z^{[j]})]^3}        (20.2-7)
20.2-2
NUMERICAL CALCULATIONS
720
These iteration schemes are especially useful for the computation of real roots. To find complex roots of real equations, one must start with a complex trial solution z^{[0]}.
The convergence considerations and error criterion of Sec. 20.2-2a apply in each case. Equation (6) is a special case of the general Newton-Raphson-Kantorovich scheme of Sec. 20.2-8. If f'(z) exists and is continuously differentiable in the region under consideration, and one can find positive real bounds A, B, C such that
\left|\frac{1}{f'(z^{[0]})}\right| \le A        \left|\frac{f(z^{[0]})}{f'(z^{[0]})}\right| \le B        (20.2-8a)

and

|f''(z)| \le C        (20.2-8b)

whenever

|z - z^{[0]}| \le \frac{1}{AC} \left(1 - \sqrt{1 - 2ABC}\right)        (20.2-8c)

then the desired root z satisfies the last inequality, and the convergence rate of the Newton-Raphson iteration is given by

|z - z^{[j]}| \le \frac{B}{2^{j-1}} (2ABC)^{2^j - 1}        (j = 1, 2, . . .)        (20.2-9)
Note the relatively rapid convergence under these circumstances (see also Sec. 20.2-8 and Refs. 20.43 and 20.47).
USEFUL EXAMPLES: Application of the Newton-Raphson scheme (6) to 1/z - a = 0, z^2 - a = 0, 1/z^2 - 1/a = 0, and z^n - a = 0 yields iteration routines for 1/a, \sqrt{a}, and \sqrt[n]{a}:

z^{[j+1]} = z^{[j]} (2 - a z^{[j]}) \to 1/a        as j \to \infty        (a > 0)        (20.2-10)

z^{[j+1]} = \frac{1}{2} \left(z^{[j]} + \frac{a}{z^{[j]}}\right) \to \sqrt{a}        as j \to \infty        (a > 0)        (Heron's algorithm)        (20.2-11)

z^{[j+1]} = z^{[j]} \left[1 + \frac{a - (z^{[j]})^2}{2a}\right] \to \sqrt{a}        as j \to \infty        (a > 0)        (20.2-12)

z^{[j+1]} = \left(1 - \frac{1}{n}\right) z^{[j]} + \frac{a}{n (z^{[j]})^{n-1}} \to \sqrt[n]{a}        as j \to \infty        (a > 0)        (20.2-13)

The iteration (10) converges for all z^{[0]} between 0 and 2/a; (11), (12), and (13) require only z^{[0]} > 0.
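Two of these routines can be sketched directly (illustrative code; the fixed iteration counts are arbitrary):

```python
def heron_sqrt(a, z0=1.0, n_iter=30):
    """Heron's algorithm (20.2-11): z[j+1] = (z[j] + a/z[j])/2 -> sqrt(a)."""
    z = z0
    for _ in range(n_iter):
        z = 0.5 * (z + a / z)
    return z

def reciprocal(a, z0, n_iter=60):
    """Division-free iteration (20.2-10): z[j+1] = z[j](2 - a z[j]) -> 1/a,
    convergent for 0 < z[0] < 2/a."""
    z = z0
    for _ in range(n_iter):
        z = z * (2.0 - a * z)
    return z

print(heron_sqrt(2.0))        # about 1.4142136
print(reciprocal(4.0, 0.1))   # about 0.25
```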
(c) The Regula Falsi. The following iteration scheme applies particularly well to real equations and real roots and is used when f'(z) is not easily computed. Given Eq. (1), start with two trial values z = z^{[0]}, z = z^{[1]} and obtain successive approximations

z^{[j+1]} = z^{[j]} - \frac{z^{[j]} - z^{[k]}}{f(z^{[j]}) - f(z^{[k]})} f(z^{[j]})        (j = 0, 1, 2, . . . ; k < j)        (20.2-14)

For continuous real functions f(z), one attempts to bracket each real root between approximations z^{[j]} and z^{[k]} such that f(z^{[j]}) and f(z^{[k]}) have opposite signs. Each z^{[j]} can then be obtained by graphical straight-line interpolation; the scale of the graph should be appropriately increased at each step.
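A bracketing version of the regula falsi can be sketched as follows (an illustrative implementation; the stopping rule on |f| is an assumption, not from the text):

```python
import math

def regula_falsi(f, a, b, tol=1e-12, max_iter=200):
    """Bracketing regula falsi: f(a) and f(b) must have opposite signs.
    Each step replaces one bracket end by the straight-line
    interpolation point of Eq. (20.2-14)."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("root not bracketed")
    for _ in range(max_iter):
        c = b - (b - a) / (fb - fa) * fb   # straight-line interpolation
        fc = f(c)
        if abs(fc) < tol:
            return c
        if fa * fc < 0:                    # root lies in [a, c]
            b, fb = c, fc
        else:                              # root lies in [c, b]
            a, fa = c, fc
    return c

# Transcendental equation z - cos z = 0 (no derivative needed).
r = regula_falsi(lambda z: z - math.cos(z), 0.0, 1.0)
print(r)
```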
(d) Aitken-Steffensen Convergence Acceleration. If \varphi(z) is real and three times continuously differentiable for real z in a neighborhood of a real root where \varphi'(z) \ne 1, then one can improve on the convergence of the iteration sequence (3) by the following scheme: compute

\tilde{z} = \varphi(z^{[j]})        \tilde{\tilde{z}} = \varphi(\tilde{z})

and take

z^{[j+1]} = z^{[j]} - \frac{(\tilde{z} - z^{[j]})^2}{\tilde{\tilde{z}} - 2\tilde{z} + z^{[j]}}        (j = 0, 1, 2, . . .)        (20.2-15)

The iteration is terminated when one of the denominators turns out to be zero (in general, the desired accuracy is obtained before this if the sequence converges; see also Ref. 20.5). This method, like the regula falsi, may substitute for the fast-converging Newton-Raphson scheme if computation of f'(z) is too cumbersome.
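The scheme can be sketched as the classical Steffensen iteration (an illustrative reading of Eq. (15); the termination tolerances are assumptions):

```python
import math

def steffensen(phi, z0, tol=1e-12, max_iter=50):
    """Aitken-Steffensen acceleration of z = phi(z): each step takes two
    evaluations of phi and one Aitken delta-squared extrapolation."""
    z = z0
    for _ in range(max_iter):
        z1 = phi(z)
        z2 = phi(z1)
        denom = z2 - 2.0 * z1 + z
        if denom == 0.0:               # terminate on a zero denominator
            return z2
        z_new = z - (z1 - z) ** 2 / denom
        if abs(z_new - z) <= tol * abs(z_new):
            return z_new
        z = z_new
    return z

r = steffensen(math.cos, 1.0)
print(r)   # converges to the root of z = cos z in a handful of steps
```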
(e) Multiple Roots. Iteration schemes based on Eq. (6) (Newton-Raphson method) and Eq. (7) converge poorly in the neighborhood of a multiple root of the given equation. Note that multiple zeros of f(z) are zeros of f'(z); for algebraic equations, the greatest common divisor of f(z) and f'(z) can be obtained by the method of Sec. 1.7-3.
(f) Interval Halving. If a single simple real root of a real equation (1) is known to lie between z = a and z = b, compute f((a + b)/2). If this has the same sign as, say, f(b), the root lies between a and (a + b)/2; bisect this smaller interval and continue in the same manner.
20.2-3. Numerical Solution of Algebraic Equations: Computation of Polynomial Values. (a) Successive Multiplications. To compute values of a polynomial

f(z) \equiv a_0 z^n + a_1 z^{n-1} + \cdots + a_{n-1} z + a_n        (20.2-16)

for use in the iteration methods of Sec. 20.2-4, compute successively a_0 z + a_1, z(a_0 z + a_1) + a_2, . . . ; or obtain the desired quantities f(c), f'(c), f''(c)/2!, . . . by Horner's scheme.
(b) Horner's Scheme. Long division of f(z) by (z - c) yields a new polynomial f_1(z) and the remainder f(c) (Sec. 1.7-2); long division of f_1(z) by (z - c) in turn yields a new polynomial f_2(z) and the remainder f'(c). Continuing in this manner, one obtains successive remainders f''(c)/2!, f'''(c)/3!, . . . . Note that these remainders are the coefficients of the polynomial

F(u) \equiv f(u + c) = f(c) + f'(c) u + \frac{1}{2!} f''(c) u^2 + \cdots = f(z)        (20.2-17)
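Horner's scheme lends itself to a compact routine returning all the remainders f(c), f'(c), f''(c)/2!, . . . at once (an illustrative sketch; the function name is not from the text):

```python
def horner_expand(coeffs, c):
    """Repeated synthetic division of f(z) = a0 z^n + ... + an by (z - c):
    returns [f(c), f'(c), f''(c)/2!, ...], i.e. the coefficients of
    F(u) = f(u + c) in Eq. (20.2-17), lowest power first."""
    work = list(coeffs)
    n = len(work) - 1
    remainders = []
    for i in range(n + 1):
        # one synthetic division of the current polynomial by (z - c)
        for k in range(1, len(work) - i):
            work[k] += c * work[k - 1]
        remainders.append(work[len(work) - 1 - i])
    return remainders

# f(z) = z^3 - 2z + 1 at c = 2: f(2) = 5, f'(2) = 10, f''(2)/2! = 6
print(horner_expand([1, 0, -2, 1], 2.0))   # [5.0, 10.0, 6.0, 1]
```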
Horner's Scheme for Complex Arguments. Given a real polynomial f(z), let c = a + ib. Then f(c) = Ac + B = (Aa + B) + iAb, where Az + B is the (real) remainder of the quotient f(z)/(z^2 - 2az + a^2 + b^2) (Ref. 20.3).
20.2-4. Solution of Algebraic Equations: Iteration Methods. (a) General Methods. To compute single simple roots of an algebraic equation

f(z) \equiv a_0 z^n + a_1 z^{n-1} + \cdots + a_{n-1} z + a_n = 0        (20.2-18)

one may

1. Use the Newton-Raphson method of Eq. (6).
2. Employ Eq. (5) with k = 1/a_{n-1}, calculating successive values of the polynomial by Horner's scheme:

z^{[j+1]} = z^{[j]} - f(z^{[j]})/a_{n-1}        (20.2-19)

If a_{n-1} = 0, introduce u = z - c and use Eq. (17) to rewrite Eq. (18) as F(u) = 0.
3. Attempt to bracket the root between argument values yielding function values of opposite signs, or use the regula falsi (Sec. 20.2-2c).

Convergence of these and the following iteration methods depends on the initial trial value. This may be based on a priori information, on approximate solutions obtained by one of the methods of Sec. 20.2-5, or on a root of a second-order polynomial approximating f(z). In the absence of such information, the trial value may simply be zero or a small real or complex number. In the latter case, Horner's scheme for complex arguments may be employed if f(z) is a real polynomial, and one may simply attempt to find trial values a + ib so as to minimize the quantities A and B in that scheme. See also Ref. 20.3.
As successive roots z_i are found, the corresponding factors z - z_i are divided out (Sec. 1.7-2). z_i or z_i(1 + i) may be tried as a starting value for iteration on the next root.
The Muller and Bairstow methods simplify calculations for complex roots and tend to converge better than the Newton-Raphson method when roots are close together.
(b) Muller's Method (Ref. 20.5). Substitution of parabolic for linear interpolation in the derivation of the regula falsi (Sec. 20.2-2c) produces the iteration algorithm

z^{[j+1]} = z^{[j]} - (z^{[j]} - z^{[j-1]}) \frac{2 C_j}{B_j \pm \sqrt{B_j^2 - 4 A_j C_j}}        (20.2-20a)

where the sign in the denominator is chosen so as to make its absolute value as large as possible, and

A_j = q_j f_j - q_j (1 + q_j) f_{j-1} + q_j^2 f_{j-2}
B_j = (2 q_j + 1) f_j - (1 + q_j)^2 f_{j-1} + q_j^2 f_{j-2}        (j = 0, 1, 2, . . .)        (20.2-20b)
C_j = (1 + q_j) f_j

q_j = \frac{z^{[j]} - z^{[j-1]}}{z^{[j-1]} - z^{[j-2]}}        f_j = f(z^{[j]})

The solution may be started with z^{[0]} = -1, z^{[1]} = 1, and z^{[2]} = 0 (hence q_2 = -1/2), and f_0, f_1, f_2 respectively replaced with a_n - a_{n-1} + a_{n-2}, a_n + a_{n-1} + a_{n-2}, a_n.
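Equations (20a) and (20b) can be sketched as follows (illustrative code; the termination test is an assumption, and complex arithmetic is used throughout so that complex roots of real equations are found automatically):

```python
import cmath

def muller(f, z0=-1.0, z1=1.0, z2=0.0, tol=1e-12, max_iter=100):
    """Muller's method (Eqs. 20.2-20a,b): parabolic interpolation
    through the last three iterates."""
    z0, z1, z2 = complex(z0), complex(z1), complex(z2)
    f0, f1, f2 = f(z0), f(z1), f(z2)
    for _ in range(max_iter):
        q = (z2 - z1) / (z1 - z0)
        A = q * f2 - q * (1 + q) * f1 + q * q * f0
        B = (2 * q + 1) * f2 - (1 + q) ** 2 * f1 + q * q * f0
        C = (1 + q) * f2
        disc = cmath.sqrt(B * B - 4 * A * C)
        # pick the sign giving the denominator of largest magnitude
        denom = B + disc if abs(B + disc) >= abs(B - disc) else B - disc
        z3 = z2 - (z2 - z1) * 2 * C / denom
        if abs(z3 - z2) <= tol * max(abs(z3), 1.0):
            return z3
        z0, z1, z2 = z1, z2, z3
        f0, f1, f2 = f1, f2, f(z3)
    return z2

# A real polynomial with complex roots: z^2 + 1 = 0 yields +/- i.
r = muller(lambda z: z * z + 1)
print(r)
```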
Muller's method applies also to nonalgebraic equations.
(c) The Bairstow Method (Refs. 20.5, 20.9, and 20.54). Assume that the given equation (18) admits a quadratic factor z^2 - uz - v defining two distinct simple roots, and that one has trial values u^{[0]}, v^{[0]} which approximate u and v sufficiently well. Then a sequence of improved approximations converging to u, v is obtained from

u^{[j+1]} = u^{[j]} + \frac{b_{n-1}^{[j]} c_{n-2}^{[j]} - b_n^{[j]} c_{n-3}^{[j]}}{(c_{n-2}^{[j]})^2 - c_{n-1}^{[j]} c_{n-3}^{[j]}}        v^{[j+1]} = v^{[j]} + \frac{b_n^{[j]} c_{n-2}^{[j]} - b_{n-1}^{[j]} c_{n-1}^{[j]}}{(c_{n-2}^{[j]})^2 - c_{n-1}^{[j]} c_{n-3}^{[j]}}        (20.2-21a)

where the b_k^{[j]}, c_k^{[j]} are consecutively obtained for each j from

b_k^{[j]} = a_k + u^{[j]} b_{k-1}^{[j]} + v^{[j]} b_{k-2}^{[j]}        c_k^{[j]} = b_k^{[j]} + u^{[j]} c_{k-1}^{[j]} + v^{[j]} c_{k-2}^{[j]}        (k = 0, 1, 2, . . . , n; b_{-1}^{[j]} = b_{-2}^{[j]} = c_{-1}^{[j]} = c_{-2}^{[j]} = 0)        (20.2-21b)

This method has been found to work best for polynomials of even degree.
20.2-5. Additional Methods for Solving Algebraic Equations. (a) The Quotient-difference Algorithm (Ref. 20.5). The following scheme (a generalization of the classical method of Bernoulli, Refs. 20.5 and 20.9) can produce approximations to all roots of suitable algebraic equations (18) in one sequence of calculations. The method may be useful for finding trial values to be used in iteration methods. For a (suitable) given equation (18), write the tabular arrangement of Fig. 20.2-1, where successive entries are obtained from the "rhombus rules"

Q_{i+1}^{[k]} = Q_i^{[k]} + \left(E_i^{[k]} - E_{i+1}^{[k-1]}\right)        E_{i+1}^{[k]} = E_i^{[k]} \frac{Q_{i+1}^{[k+1]}}{Q_{i+1}^{[k]}}        (20.2-22a)

with

E_i^{[0]} = E_i^{[n]} = 0        (i = 1, 2, . . .)        (20.2-22b)

This procedure breaks down if a Q_{i+1}^{[k]} or E_i^{[k]} equals zero. The quotient-difference scheme exists, however, in many useful special cases, in particular, if all n roots z_1, z_2, . . . , z_n of the given equation (18) are positive; or if they are simple roots with distinct nonzero absolute values. In the latter case,

\lim_{i \to \infty} Q_i^{[k]} = z_k        (20.2-23)

\lim_{i \to \infty} E_i^{[k]} = 0        (20.2-24)
The table is started with Q_0^{[1]} = -a_1/a_0, Q_0^{[k]} = 0 (k = 2, 3, . . . , n), and E_0^{[k]} = a_{k+1}/a_k (k = 1, 2, . . . , n - 1).

Fig. 20.2-1. A quotient-difference table.
for each k, so that each Q column in Fig. 20.2-1 yields an approximation to a root.
More generally, let |z_1| \ge |z_2| \ge \cdots \ge |z_n| > 0. Assuming that the quotient-difference scheme does not break down, Eq. (23) still yields every root z_k which differs in absolute value from its neighbors in the above sequence;
and Eq. (24) still holds whenever |z_k| > |z_{k+1}|. This helps to identify roots with equal absolute values, such as complex-conjugate roots; see Ref. 20.5 for a refined procedure which actually approximates such roots.
(b) Graeffe's Root-squaring Process. Given the real algebraic equation

f(z) \equiv \prod_{k=1}^n (z - z_k) = z^n + a_1 z^{n-1} + \cdots + a_n = 0

obtain the coefficients a_k^{(1)} of

f(z) f(-z) \equiv (-1)^n \prod_{k=1}^n (z^2 - z_k^2) = (-1)^n [(z^2)^n + a_1^{(1)} (z^2)^{n-1} + \cdots + a_n^{(1)}]

by a tabular array whose columns k = n, n - 1, n - 2, . . . contain the coefficients a_k, their squares, and twice the products of symmetrically placed neighboring coefficients, taken with alternating signs:

a_k^{(1)} = (-1)^k [a_k^2 - 2 a_{k-1} a_{k+1} + 2 a_{k-2} a_{k+2} - \cdots]        (a_0 = 1)
725
METHODS FOR SOLVING ALGEBRAIC EQUATIONS
12.2-5
Repeat this process, obtaining successively the coefficients a_k^{(j)} of

(-1)^n \prod_{k=1}^n (z^{2^j} - z_k^{2^j}) \equiv (-1)^n [(z^{2^j})^n + a_1^{(j)} (z^{2^j})^{n-1} + \cdots + a_n^{(j)}]

As j increases, the array usually assumes a definite pattern: (1) the double products in a column may become negligible, so that successive column entries become squares with equal signs (regular column*); all entries of a column may have equal signs and absolute values equal to a definite fraction of the squared entry above (fractionally regular column); (2) column entries may have regularly alternating signs (fluctuating column); and (3) a column may be entirely irregular. Each pair of regular columns (say, k and k - r) separated by r - 1 nonregular columns corresponds to a set of r roots z of equal magnitude such that

\frac{a_k^{(j)}}{a_{k-r}^{(j)}} \approx |z|^{2^j r}        as j \to \infty        (20.2-25)

These r roots are all either real or pure imaginary if the r - 1 separating columns are fractionally regular. Specifically,

1. Two adjacent regular columns (say k and k - 1) yield a simple real root z such that

z^{2^j} \approx \frac{a_k^{(j)}}{a_{k-1}^{(j)}}        as j \to \infty

2. Two regular columns (k and k - 2) separated by a fluctuating column indicate a pair of simple complex-conjugate roots z such that

|z|^{2^{j+1}} \approx \frac{a_k^{(j)}}{a_{k-2}^{(j)}}        as j \to \infty
In practice, one first finds the real and purely imaginary roots; signs are determined by substitution or with the aid of Sec. 1.6-6. Lehmer's method, a refined version of Graeffe's scheme, permits direct determination of signs.
(c) A Matrix Method. It is possible to compute the roots of an nth-degree algebraic equation (18) with a_0 = 1 as eigenvalues of the n \times n "companion matrix"

A = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \cdot & \cdot & \cdot & & \cdot \\ 0 & 0 & 0 & \cdots & 1 \\ -a_n & -a_{n-1} & -a_{n-2} & \cdots & -a_1 \end{bmatrix}        (20.2-26)
by one of the methods outlined in Sec. 20.3-5.
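As a pure-Python illustration, the power iteration of Sec. 20.3-5 applied to the companion matrix recovers the root of largest magnitude (a sketch that assumes a real, well-separated dominant root; the function and variable names are illustrative):

```python
def companion_dominant_root(a, n_iter=200):
    """Power iteration on the companion matrix (20.2-26) of
    z^n + a1 z^(n-1) + ... + an: the dominant eigenvalue is the root
    of largest magnitude.  a = [a1, ..., an]."""
    n = len(a)
    x = [1.0] * n                  # starting vector
    lam = 0.0
    for _ in range(n_iter):
        # y = A x: a shift of the components plus the last-row product
        y = x[1:] + [-sum(a[n - 1 - k] * x[k] for k in range(n))]
        if x[-1] != 0:
            lam = y[-1] / x[-1]    # eigenvalue estimate
        norm = max(abs(v) for v in y)
        x = [v / norm for v in y]
    return lam

# z^2 - 5z + 6 = (z - 2)(z - 3): dominant root 3.
print(companion_dominant_root([-5.0, 6.0]))   # about 3.0
```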
* M. B. Reed and G. B. Reed, Mathematical Methods in Electrical Engineering, Harper, New York, 1951.
(d) Horner's Method.
Horner's method (for real roots) evaluates the coefficients of successive polynomials F_1(u) = f(u + c_1), F_2(u) = F_1(u + c_2), . . . by Horner's scheme, where c_1, c_2, . . . are chosen so as to reduce the absolute values of the remainders. If one succeeds in obtaining F_j(c_j) \approx 0, then the desired root is approximated by c_1 + c_2 + \cdots + c_j.
20.2-6. Systems of Equations and the Problem of Finding Maxima and Minima (see also Secs. 11.1-1, 12.5-6, 20.2-2, and 20.2-3). (a) Problem Statement. Iteration Methods. The problem of solving n simultaneous equations

f_i(x_1, x_2, . . . , x_n) = 0        (i = 1, 2, . . . , n)        (20.2-27)

for n unknowns x_1, x_2, . . . , x_n is equivalent to the problem of minimizing the function

F(x_1, x_2, . . . , x_n) \equiv \sum_{i=1}^n |f_i(x_1, x_2, . . . , x_n)|^2        (20.2-28)

or some other increasing real function of the absolute values |f_i| of the n residuals (errors) f_i = f_i(x_1, x_2, . . . , x_n). The problem of minimizing (or maximizing) a given function of n variables is of great practical importance in its own right.
A useful class of iteration methods starts with a trial solution x_i^{[0]} (i = 1, 2, . . . , n) and attempts to construct successive approximations

x_i^{[j+1]} = x_i^{[j]} + \lambda^{[j]} v_i^{[j]}        (i = 1, 2, . . . , n; j = 0, 1, 2, . . .)        (20.2-29)

which converge to a solution x_i as j \to \infty. Once the ratios v_1^{[j]} : v_2^{[j]} : \cdots : v_n^{[j]} ("direction" of the jth step) are chosen, one may minimize F(x_1^{[j+1]}, x_2^{[j+1]}, . . . , x_n^{[j+1]}) as a function F(\lambda^{[j]}) of the parameter \lambda^{[j]} which determines the step size. For this purpose, F(\lambda^{[j]}) may be approximated by a Taylor series or by an interpolation polynomial (Sec. 20.5-1) based on three to five trial values of \lambda^{[j]}. The latter method also applies to the computation of maxima and minima of a tabulated function F(x_1, x_2, . . . , x_n).
The convergence of such iteration methods can, again, be investigated with the aid of the Contraction-mapping Theorem of Sec. 12.5-6; the trial solutions (x_1, x_2, . . . , x_n) are regarded as vectors with the norm (x_1^2 + x_2^2 + \cdots + x_n^2)^{1/2} or |x_1| + |x_2| + \cdots + |x_n| (see also Ref. 20.5).
(b) Varying One Unknown at a Time. Attempt to minimize F by varying only one of the x{ at each step, either cyclically or so as to reduce the largest absolute residual (see also Sec. 20.3-2c, relaxation). Special one-dimensional search methods are discussed in Refs. 20.9 and 20.57.
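A minimal sketch of this one-unknown-at-a-time strategy, here with a simple step-halving rule (the step control is an assumption, not from the text):

```python
def coordinate_descent(F, x, step=0.5, tol=1e-10, max_sweeps=500):
    """Minimize F by varying one unknown at a time: each sweep tries
    +/- step moves in every coordinate, halving the step when no move
    improves F (an illustrative sketch of Sec. 20.2-6b)."""
    fx = F(x)
    for _ in range(max_sweeps):
        improved = False
        for i in range(len(x)):
            for d in (step, -step):
                trial = list(x)
                trial[i] += d
                ft = F(trial)
                if ft < fx:
                    x, fx = trial, ft
                    improved = True
                    break
        if not improved:
            step *= 0.5
            if step < tol:
                break
    return x, fx

# Residual form (20.2-28) for the system x1 + x2 - 3 = 0, x1 - x2 - 1 = 0.
F = lambda x: (x[0] + x[1] - 3.0) ** 2 + (x[0] - x[1] - 1.0) ** 2
sol, fmin = coordinate_descent(F, [0.0, 0.0])
print(sol, fmin)   # near x1 = 2, x2 = 1
```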
(c) Search Methods for Finding Maxima and Minima. Pure random searches, which merely select the largest or smallest value F(x_1, x_2, . . . , x_n) for a number of randomly chosen points (x_1, x_2, . . . , x_n), are useful mainly for finding starting approximations for iteration methods (Ref. 20.9). Random-perturbation methods start with a parameter point (x_1, x_2, . . . , x_n) and proceed to apply sets of random perturbations \Delta x_1, \Delta x_2, . . . , \Delta x_n to all unknowns at once until an improvement in the criterion function F is obtained. The resulting perturbed parameter point constitutes the next approximation, and the search continues. This technique, which, unlike a pure random search, can take advantage of the continuity of a criterion function, is a highly useful hill-climbing method when gradient methods are frustrated by adverse features of the multidimensional terrain, such as "ridges," "canyons," "flat spots," and multiple maxima and minima. Random-perturbation methods have been considerably refined by strategies involving step-size changes and preferential treatment of certain directions in parameter space depending on past successes or failures (Refs. 20.9 and 20.57).
(d) Treatment of Constraints (see also Secs. 11.3-4, 11.4-1, and 11.4-3). Maxima and minima with constraints
may be treated either by a penalty-function technique, which adds a term of the form

\sum_{k=1}^r K_k [\varphi_k(x_1, x_2, . . . , x_n)]^m

(where \varphi_k(x_1, x_2, . . . , x_n) = 0, k = 1, 2, . . . , r, are the given constraints) to the criterion function F(x_1, x_2, . . . , x_n), where each K_k is a large positive constant for minimization (negative for maximization), and m is an even positive integer. Another technique projects the computed gradient vector on the constraint surface, so that the gradient search proceeds along the latter (Refs. 20.9 and 20.57). The most useful digital-computer routines for optimization include options for changing the search strategy when convergence is slow.
20.2-7. Steepest-descent Methods (Gradient Methods). (a) Method of Steepest Descent. Choose v_i^{[j]} = -\partial F/\partial x_i, where all derivatives are computed for x_i = x_i^{[j]}, and reduce the step size \lambda^{[j]} as the minimum of F is approached (Fig. 20.2-2). For analytical functions F and small f_i, a Taylor-series approximation for F(\lambda^{[j]}) yields the optimum step size

\lambda^{[j]} = \frac{\sum_{k=1}^n \left(\frac{\partial F}{\partial x_k}\right)^2}{\sum_{k=1}^n \sum_{h=1}^n \frac{\partial^2 F}{\partial x_k \partial x_h} \frac{\partial F}{\partial x_k} \frac{\partial F}{\partial x_h}}        (j = 0, 1, 2, . . .)        (20.2-30)

where all derivatives are computed for x_i = x_i^{[j]}. Interpolation-polynomial approximations for F(\lambda^{[j]}) may be more convenient.
(b) Descent with Computed Gradient Components. Frequently, derivation of the gradient components \partial F/\partial x_k needed for a gradient descent is impossible or impractical. The required derivatives can then be approximated by difference coefficients \Delta F/\Delta x_k obtained through perturbations of one unknown x_k at a time. Since this takes n gradient-measuring steps for each "working step" in the gradient direction, one may prefer to continue in this gradient direction until the criterion function F is no longer improved and remeasure the gradient only then. A number of refined gradient-descent techniques are discussed in Refs. 20.9 and 20.57.
Fig. 20.2-2. Criterion-function surface for a two-variable minimum problem, showing three minima, level lines, and gradient lines. In this example, all three minima of F(x_1, x_2) have the same value, zero. (Based on L. D. Kovach and H. Meissinger: Solution of Algebraic Equations, Linear Programming, and Parameter Optimization, in H. D. Huskey and G. A. Korn, Computer Handbook, McGraw-Hill, New York, 1962.)
20.2-8. The Newton-Raphson Method and the Kantorovich Theorem (see also Secs. 20.2-2 and 20.9-3). (a) The Newton-Raphson Method. Start with a trial solution x_i^{[0]}, and obtain successive approximations x_i^{[j+1]} by solving the simultaneous linear equations

f_i + \sum_{k=1}^n \frac{\partial f_i}{\partial x_k} (x_k^{[j+1]} - x_k) = 0        (i = 1, 2, . . . , n)        (20.2-31)

with x_k = x_k^{[j]} (j = 0, 1, 2, . . .).
(b) Kantorovich's Convergence Theorem (see also Ref. 20.52).
Solution of the linear equations (31) implies that the matrix [\partial f_i/\partial x_k] has an inverse (Sec. 13.2-3), [\partial f_i/\partial x_k]^{-1} = [\Gamma_{ik}(x_1, x_2, . . . , x_n)], for the x_i
values in question. Let A, B, C be positive real bounds (matrix norms, Sec. 13.2-1) such that

\max_i \sum_{k=1}^n |\Gamma_{ik}| \le A        \max_i \left|\sum_{k=1}^n \Gamma_{ik} f_k\right| \le B        (20.2-32a)

if the initial trial values x_i = x_i^{[0]} are substituted in the f_i, \partial f_i/\partial x_k, and \Gamma_{ik}. Also, let the f_i(x_1, x_2, . . . , x_n) be twice continuously differentiable and satisfy

\sum_{j=1}^n \sum_{k=1}^n \left|\frac{\partial^2 f_i}{\partial x_j \partial x_k}\right| \le C        (i = 1, 2, . . . , n)        (20.2-32b)

for all (x_1, x_2, . . . , x_n) in the cubical region defined by

\max_i |x_i - x_i^{[0]}| \le \frac{1}{AC} \left(1 - \sqrt{1 - 2ABC}\right)        (20.2-33)

Then the given system (27) has a solution (x_1, x_2, . . . , x_n) in the region defined by Eq. (33), and the rate of convergence is given by

\max_i |x_i - x_i^{[j]}| \le \frac{B}{2^{j-1}} (2ABC)^{2^j - 1}        (20.2-34)
which again indicates relatively rapid convergence.

20.3. LINEAR SIMULTANEOUS EQUATIONS, MATRIX INVERSION, AND MATRIX EIGENVALUE PROBLEMS
INVERSION, AND MATRIX EIGENVALUE PROBLEMS 20.3-1. "Direct" or Elimination Methods for Linear Simultaneous
Equations,
(a) Introduction. The solution of a suitable system
of linear equations anXi + anX2 +
• • • + d\nxn = &i
^21^1 + 022^2 + ' *' + a2nXn = &2 an\X\ + aniXi +
/«q on
• • • + annxn = bn
by Cramer's rule (Sec. 1.9-2) requires too many multiplications for practi cal use if n > 4. The following procedures solve a given system (1) by successive elimination of unknowns.
(b) Basic Elimination Procedure (Gauss's Pivotal-condensation Method). Let a_IL be the coefficient having the largest absolute value. To eliminate x_L from the ith equation (i ≠ I), multiply the Ith equation by a_iL/a_IL and subtract from the ith equation. Repeat the process to eliminate a second unknown from the remaining n − 1 equations, etc.

(c) An Improved Elimination Scheme (Banachiewicz-Cholesky-Crout Method). Obtain x_n, x_{n−1}, . . . , x_1, in that order, by "back substitution" from the equations

x_1 + α_12 x_2 + α_13 x_3 + · · · + α_1n x_n = β_1
          x_2 + α_23 x_3 + · · · + α_2n x_n = β_2
          · · · · · · · · · · · · · · · · ·
                                        x_n = β_n        (20.3-2)
after calculating successive elements of the (n + 1) × n array

γ_11   α_12   α_13   · · ·   α_1n   β_1
γ_21   γ_22   α_23   · · ·   α_2n   β_2
· · · · · · · · · · · · · · · · · ·
γ_n1   γ_n2   γ_n3   · · ·   γ_nn   β_n

with the aid of the recursion formulas

γ_i1 = a_i1        (i = 1, 2, . . . , n)
α_1k = a_1k/a_11        (k = 2, 3, . . . , n)
γ_ik = a_ik − Σ_{j=1}^{k−1} γ_ij α_jk        (i, k = 2, 3, . . . , n; i ≥ k)
α_ik = (1/γ_ii)(a_ik − Σ_{j=1}^{i−1} γ_ij α_jk)        (i, k = 2, 3, . . . , n; i < k)        (20.3-3)
β_1 = b_1/a_11        β_i = (1/γ_ii)(b_i − Σ_{j=1}^{i−1} γ_ij β_j)        (i = 2, 3, . . . , n)
For the computation of determinants, note

det [a_ik] = γ_11 γ_22 · · · γ_nn        (20.3-4)

The Banachiewicz-Cholesky-Crout method requires less intermediate recording than the Gauss method and becomes particularly simple if the matrix [a_ik] is symmetric (a_ki = a_ik). In this important special case,

α_ik = γ_ki/γ_ii        (i < k)        (20.3-5)
(d) Direct Methods in Matrix Form. Gaussian elimination (Sec. 20.3-1b) can be interpreted as a transformation of the given system (1), i.e.,

Ax = b

to the form

T_n T_{n−1} · · · T_1 A x = T_n T_{n−1} · · · T_1 b

by successive nonsingular transformation matrices T_1, T_2, . . . , T_n chosen so that the resulting system matrix T_n T_{n−1} · · · T_1 A has a triangular form similar to that of Eq. (2). Again, the solution scheme of Sec. 20.3-1c amounts to the definition of a single transformation matrix T such that

TAx = Tb

has the triangular form (2). Various alternative methods employ postmultiplication or similarity transformations instead of the premultiplication used here.

Several modified triangularization schemes associated with the names of Crout, Doolittle, Givens, Householder, and Schmidt are described in Refs. 20.9, 20.11, 20.15, 20.22, and 20.25. Some of the modifications help to reduce round-off-error effects (Ref. 20.58). Wilkinson (Ref. 20.9) gives a comparison of methods on the basis of numerical stability and number of multiplications.
20.3-2. Iteration Methods for Linear Equations. (a) Introduction. Iteration methods (see also Secs. 20.2-6 and 20.2-7) are usually preferred to direct methods for solving linear equations only if the system matrix [a_ik] is "sparse," i.e., if most of the a_ik, especially those not near the main diagonal, are equal to zero. Precisely this is true for the (possibly very large) systems of linear equations generated by partial-differential-equation problems (Sec. 20.9-4). Given a system of linear equations (1) with real coefficients, each of the following iteration methods approximates the solution x_i (i = 1, 2, . . . , n) by successive approximations of the form (20.2-20). The residuals obtained at each step of the iteration will be denoted by

f_i^[j] ≡ Σ_{k=1}^n a_ik x_k^[j] − b_i        (i = 1, 2, . . . , n; j = 0, 1, 2, . . .)        (20.3-6)

Note: Many iteration schemes require the given coefficient matrix A = [a_ik] to be normal with a positive-definite symmetric part (Sec. 13.3-4), or to be symmetric and positive definite (Sec. 13.5-2). If this is not true for the given coefficient matrix, one may attempt to rearrange or recombine the given equations so as to obtain relatively large positive diagonal coefficients, or one may solve the equivalent system

Σ_{i=1}^n Σ_{k=1}^n a_ih a_ik x_k = Σ_{i=1}^n a_ih b_i        (h = 1, 2, . . . , n)        (20.3-7)

whose coefficient matrix is necessarily symmetric and positive definite whenever the system (1) is nonsingular.
(b) Gauss-Seidel-type Iteration. Rearrange the given system (1) (possibly by recombining equations and/or multiplication by constants) to obtain diagonal coefficients a_ii as large and positive as practicable. Starting with a trial solution x_i^[0] (i = 1, 2, . . . , n), compute successive approximations

x_i^[j+1] = x_i^[j] − f_i^[j]/a_ii = x_i^[j] − (1/a_ii)(Σ_{k=1}^n a_ik x_k^[j] − b_i)
        (i = 1, 2, . . . , n; j = 0, 1, 2, . . .)        (20.3-8)

or use

x_i^[j+1] = x_i^[j] − (1/a_ii)(Σ_{k=1}^{i−1} a_ik x_k^[j+1] + Σ_{k=i}^n a_ik x_k^[j] − b_i)
        (i = 1, 2, . . . , n; j = 0, 1, 2, . . .)        (20.3-9)

Both schemes are simple, but convergence is assured only if the matrix [a_ik] is positive definite (Sec. 13.5-2); and convergence may be slow.
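The scheme of Eq. (20.3-9) can be sketched as follows; updating the components in place automatically makes each new x_i use x_k^[j+1] for k < i and x_k^[j] for k ≥ i (the positive-definite 2 × 2 test system is an illustrative assumption):

```python
def gauss_seidel(a, b, x0, sweeps=50):
    """Iteration (20.3-9): one sweep updates x_1, ..., x_n in order,
    each time subtracting the current residual f_i divided by a_ii."""
    n = len(a)
    x = list(x0)
    for _ in range(sweeps):
        for i in range(n):
            resid = sum(a[i][k] * x[k] for k in range(n)) - b[i]  # f_i, Eq. (20.3-6)
            x[i] -= resid / a[i][i]
    return x

# illustrative positive-definite system: 4x + y = 1, x + 3y = 2
x = gauss_seidel([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0], [0.0, 0.0])
```

The exact solution is x_1 = 1/11, x_2 = 7/11; 50 sweeps reduce the error to round-off level for this small system.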
(c) Relaxation methods depend on the (manual) computer's judgment to obtain rapid convergence of an iteration process. Starting with a trial solution x_i^[0] (frequently simply x_1^[0] = 1, x_2^[0] = x_3^[0] = · · · = x_n^[0] = 0), one attempts to introduce successive approximations x_i^[j] in a loosely systematic fashion so as to reduce the n residuals (6) to zero. One tabulates the residuals f_i^[j] at each step and combines the following procedures:

1. Basic Relaxation Procedure. At each step, "liquidate" the residual f_r^[j] having the greatest absolute value by adjusting x_r alone:

x_r^[j+1] = x_r^[j] − f_r^[j]/a_rr        x_i^[j+1] = x_i^[j]        (i ≠ r)        (20.3-10)

Only rough values of x_r^[j+1] are required for the initial steps.

2. Block Relaxation and Group Relaxation. Apply equal increments x_i^[j+1] − x_i^[j] to a set ("block") of x_i^[j]'s so as to liquidate one of the residuals f_i^[j+1], or so as to reduce the sum of all residuals to zero. The latter procedure is useful particularly if all the initial residuals f_i^[0] are of equal sign. Group relaxation applies different increments to a chosen set of x_i^[j]'s for similar purposes.

3. Overrelaxation. One can often improve the convergence of a relaxation process by changing the sign of the residual operated on while
reducing its absolute value, without liquidating it entirely at this stage.

Relaxation methods are particularly useful for the (manual) solution of the large sets of simple linear difference equations frequently used to approximate a partial differential equation (Sec. 20.9-3). Relaxation methods at their best exploit the manual computer's skill, experience, and knowledge of the underlying physical situation. Many special tricks apply to particular applications (Refs. 20.21 to 20.30).
(d) Systematic overrelaxation is employed in an important class of iteration methods for solving the "sparse" equation systems generated by partial-differential-equation problems. Consider a system of equations (1) rearranged so that all diagonal coefficients equal unity (a_ii = 1), and modify the iteration (9) to read

x_i^[j+1] = x_i^[j] − ω_j (Σ_{k=1}^{i−1} a_ik x_k^[j+1] + Σ_{k=i}^n a_ik x_k^[j] − b_i)        (i = 1, 2, . . . , n; j = 0, 1, 2, . . .)

where ω_j is a real overrelaxation factor > 1 chosen (either once and for all or at each step) so as to speed up the convergence. The choice of ω_j is discussed in Refs. 20.15, 20.22, 20.23, and 20.30; Ref. 20.9 contains a large bibliography.
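A sketch of this successive-overrelaxation idea follows. The text assumes a_ii = 1; the version below divides by a_ii instead (an equivalent normalization), and the fixed factor ω = 1.2 and the test system are illustrative assumptions:

```python
def overrelax(a, b, x0, omega=1.2, sweeps=100):
    """Gauss-Seidel sweeps with each correction multiplied by a fixed
    overrelaxation factor omega; omega = 1 recovers Eq. (20.3-9)."""
    n = len(a)
    x = list(x0)
    for _ in range(sweeps):
        for i in range(n):
            resid = sum(a[i][k] * x[k] for k in range(n)) - b[i]
            x[i] -= omega * resid / a[i][i]   # overrelaxed correction
    return x

x = overrelax([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0], [0.0, 0.0])
```

For a symmetric positive-definite matrix the sweep converges for any 0 < ω < 2; the benefit of ω > 1 shows mainly on the large sparse systems the text has in mind.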
(e) Steepest-descent methods (gradient methods) minimize a positive function like F ≡ Σ_{i=1}^n |f_i|, F ≡ Σ_{i=1}^n |f_i|², . . . in the manner of Sec. 20.2-4d. If the given coefficient matrix [a_ik] is symmetric and positive definite, one may use

x_i^[j+1] = x_i^[j] − (Σ_{k=1}^n (f_k^[j])² / Σ_{k=1}^n Σ_{h=1}^n a_kh f_k^[j] f_h^[j]) f_i^[j]
        (i = 1, 2, . . . , n; j = 0, 1, 2, . . .)        (20.3-11)

If the convergence is of an oscillatory nature, it may be possible to accelerate the convergence by multiplying the last term in Eq. (11) by 0.9 for some values of j.
(f) A Conjugate-gradient Method. Given a system of linear equations (1) with a real, symmetric, and positive definite coefficient matrix [a_ik], start with a trial solution x_i^[0]. Let v_i^[0] = f_i^[0], and compute
successively

x_i^[j+1] = x_i^[j] − λ_j v_i^[j]
f_i^[j+1] = f_i^[j] − λ_j Σ_{k=1}^n a_ik v_k^[j]
v_i^[j+1] = f_i^[j+1] − (Σ_{k=1}^n Σ_{h=1}^n a_kh f_k^[j+1] v_h^[j] / Σ_{k=1}^n Σ_{h=1}^n a_kh v_k^[j] v_h^[j]) v_i^[j]

with        λ_j = Σ_{k=1}^n f_k^[j] v_k^[j] / Σ_{k=1}^n Σ_{h=1}^n a_kh v_k^[j] v_h^[j]
        (i = 1, 2, . . . , n; j = 0, 1, 2, . . .)        (20.3-12)

The resulting f_i^[j] satisfy Eq. (6). Without round-off errors, this procedure yields the exact solution x_i = x_i^[n] after n steps; in addition, the method has all the advantages of an iteration scheme (Sec. 20.1-2), but it requires many multiplications.

It is possible to write conjugate-gradient algorithms directly applicable to any given coefficient matrix [a_ik]; such schemes amount essentially to a solution of Eq. (7) by the method of Eq. (12).
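The three recursions of Eq. (20.3-12) can be sketched as follows (pure Python, with the same small symmetric positive-definite test system used for illustration; the helper names are assumptions):

```python
def conjugate_gradient(a, b, x0):
    """Conjugate-gradient iteration in the notation of Eq. (20.3-12):
    x, the residuals f of Eq. (20.3-6), and the search directions v.
    For an n x n symmetric positive-definite system the exact solution
    is reached (up to round-off) after n steps."""
    n = len(a)
    x = list(x0)
    f = [sum(a[i][k] * x[k] for k in range(n)) - b[i] for i in range(n)]
    v = list(f)                                   # v^[0] = f^[0]
    for _ in range(n):
        av = [sum(a[i][k] * v[k] for k in range(n)) for i in range(n)]
        denom = sum(v[i] * av[i] for i in range(n))
        if denom == 0.0:                          # residual already zero
            break
        lam = sum(f[i] * v[i] for i in range(n)) / denom
        x = [x[i] - lam * v[i] for i in range(n)]
        f = [f[i] - lam * av[i] for i in range(n)]
        mu = sum(f[i] * av[i] for i in range(n)) / denom
        v = [f[i] - mu * v[i] for i in range(n)]  # new direction, A-conjugate to v
    return x

x = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0], [0.0, 0.0])
```

For this 2 × 2 system two steps already give the exact solution x_1 = 1/11, x_2 = 7/11.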
20.3-3. Matrix Inversion (see also Secs. 13.2-3 and 14.5-3). (a) The methods of Secs. 20.3-1 and 20.3-2 apply directly to the numerical inversion of a given nonsingular n × n matrix A ≡ [a_ik], and also to the calculation of det [a_ik].

(b) An Iteration Scheme for Matrix Inversion. To find the inverse A^−1 of a given n × n matrix A, start with an n × n trial matrix X^[0] (say X^[0] = I) and compute successive approximations

X^[j+1] = X^[j](2I − AX^[j])        (j = 0, 1, 2, . . .)        (20.3-13)

If the sequence converges, the limit is A^−1 (see also Sec. 20.2-2b).
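A sketch of the iteration (20.3-13) follows. The starting matrix used here, X^[0] = A′/(max row sum × max column sum), is an assumption chosen because it is known to make the iteration converge; the text's X^[0] = I works only when I − A is already a small perturbation:

```python
def invert_by_iteration(a, iters=50):
    """X^[j+1] = X^[j](2I - A X^[j]), Eq. (20.3-13), with a scaled
    transpose as starting matrix (assumption; guarantees convergence
    for any nonsingular A)."""
    n = len(a)
    def matmul(p, q):
        return [[sum(p[i][t] * q[t][j] for t in range(n)) for j in range(n)]
                for i in range(n)]
    r = max(sum(abs(e) for e in row) for row in a)            # max row sum
    c = max(sum(abs(row[j]) for row in a) for j in range(n))  # max column sum
    x = [[a[j][i] / (r * c) for j in range(n)] for i in range(n)]
    for _ in range(iters):
        ax = matmul(a, x)
        t = [[(2.0 if i == j else 0.0) - ax[i][j] for j in range(n)]
             for i in range(n)]                               # 2I - AX
        x = matmul(x, t)
    return x

ainv = invert_by_iteration([[2.0, 1.0], [1.0, 3.0]])
```

Convergence, when it occurs, is quadratic: the number of correct digits roughly doubles at each step.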
(c) (See also Sec. 13.4-7.) For every nonsingular n × n matrix A,

A^−1 = −(1/α_n)(A^(n−1) + α_1 A^(n−2) + · · · + α_(n−1) I)

with

α_1 = −Tr (A)
α_j = −(1/j)[α_(j−1) Tr (A) + α_(j−2) Tr (A²) + · · · + α_1 Tr (A^(j−1)) + Tr (A^j)]        (j = 2, 3, . . . , n)        (20.3-14)
20.3-4. A Partitioning Scheme for the Solution of Simultaneous Linear Equations or for Matrix Inversion (see also Sec. 13.2-8). (a) A system of linear equations (1) can be written in matrix form as

Ax ≡
[a_11   a_12   · · ·   a_1n] [x_1]   [b_1]
[a_21   a_22   · · ·   a_2n] [x_2] = [b_2]  ≡ b        (20.3-15)
[ · ·    · ·   · · ·    · ·] [· ·]   [· ·]
[a_n1   a_n2   · · ·   a_nn] [x_n]   [b_n]

(Sec. 14.5-3). Equation (15) can be rewritten in the partitioned form

Ax ≡
[A_11   A_12] [X_1]   [B_1]
[A_21   A_22] [X_2] = [B_2]        (20.3-16)

with

A_11 = [a_ik]   (i, k = 1, 2, . . . , m)        X_1 = {x_1, x_2, . . . , x_m}        B_1 = {b_1, b_2, . . . , b_m}        (m < n)

and with A_12, A_21, A_22, X_2, B_2 defined analogously.
The resulting matrix equations

A_11 X_1 + A_12 X_2 = B_1
A_21 X_1 + A_22 X_2 = B_2        (20.3-17)

yield

(A_11 − A_12 A_22^−1 A_21) X_1 = B_1 − A_12 A_22^−1 B_2        (20.3-18)

which, once the (n − m) × (n − m) matrix A_22 has been inverted, furnishes m linear equations for the first m unknowns x_1, x_2, . . . , x_m. This procedure may be useful, in particular, if only the first m unknowns are of interest.
(b) The inverse matrix A^−1 is obtained in the partitioned form

A^−1 = [C_11   C_12]
       [C_21   C_22]

with

C_11 = (A_11 − A_12 A_22^−1 A_21)^−1        C_22 = (A_22 − A_21 A_11^−1 A_12)^−1
C_21 = −A_22^−1 A_21 C_11                   C_12 = −A_11^−1 A_12 C_22        (20.3-19)

so that the inversion of the n × n matrix A is reduced to the inversion of smaller m × m and (n − m) × (n − m) matrices.
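For the simplest possible partition, a 2 × 2 matrix split into four 1 × 1 blocks, the four formulas of Eq. (20.3-19) reduce to scalar arithmetic ("inverting" a block is just taking a reciprocal); the sketch below uses that case purely for illustration:

```python
def partitioned_inverse_2x2(a):
    """Eq. (20.3-19) with m = 1, n = 2: A11, A12, A21, A22 are scalars."""
    a11, a12 = a[0]
    a21, a22 = a[1]
    c11 = 1.0 / (a11 - a12 * (1.0 / a22) * a21)   # (A11 - A12 A22^-1 A21)^-1
    c22 = 1.0 / (a22 - a21 * (1.0 / a11) * a12)   # (A22 - A21 A11^-1 A12)^-1
    c21 = -(1.0 / a22) * a21 * c11                # -A22^-1 A21 C11
    c12 = -(1.0 / a11) * a12 * c22                # -A11^-1 A12 C22
    return [[c11, c12], [c21, c22]]

ainv = partitioned_inverse_2x2([[2.0, 1.0], [1.0, 3.0]])
```

The same four formulas apply verbatim when the blocks are matrices, with reciprocals replaced by matrix inverses and products by matrix products.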
20.3-5. Eigenvalues and Eigenvectors of Matrices (see also Secs. 13.4-2 to 13.4-6, 14.8-5, and 14.8-9). (a) The Characteristic Equation. The eigenvalues λ_i of a given n × n matrix A ≡ [a_ik] can be obtained as roots of the characteristic equation

F_A(λ) ≡ det [a_ik − λ δ_ik] = 0        (20.3-20)

by one of the methods outlined in Secs. 20.2-1 to 20.2-3. To avoid direct expansion of the determinant, one may evaluate F_A(λ) for n + 1 selected values of λ (say λ = 0, 1, 2, . . . , n), so that the nth-degree polynomial F_A(λ) is given (exactly) by one of the interpolation formulas of Sec. 20.5-3.
(b) An Iteration Method for Hermitian Matrices. Let A be a hermitian matrix, so that all eigenvalues are real; in many applications A is real and symmetric. Assuming that the eigenvalue λ_M with the largest absolute value (dominant eigenvalue) is nondegenerate, start with a trial column matrix x^[0] such as {1, 0, 0, . . . , 0} and compute successive matrix products

x^[j+1] = α_j A x^[j]        (j = 0, 1, 2, . . .)        (20.3-21)

where α_j is a convenient numerical factor chosen, for instance, so that the component of x^[j+1] largest in absolute value equals 1. As j increases, x^[j] will approximate an eigenvector associated with the dominant eigenvalue λ_M; the latter may then be obtained from the quotient

λ_M ≈ x̃^[j] A x^[j] / x̃^[j] x^[j] = Σ_{i=1}^n Σ_{k=1}^n a_ik x_i^[j]* x_k^[j] / Σ_{i=1}^n |x_i^[j]|²        (20.3-22)

This process converges more rapidly if |λ_M| is substantially larger than the absolute values of all other eigenvalues of A, and if the direction of x^[0] is already close to that of the desired eigenvector; if x^[0] = {1, 0, 0, . . . , 0} does not work well, try {0, 1, 0, . . . , 0}, etc. One can accelerate the convergence by using A² or A³ instead of A in Eq. (21). Additional eigenvectors and eigenvalues are found after reduction of the given matrix (Sec. 14.8-6); see also Refs. 20.7 and 20.48 for other procedures.
If λ_M is m-fold degenerate, the vectors described by Eq. (21) will not in general converge, but for large j they will tend to stay in a subspace spanned by the eigenvectors associated with λ_M. Hence m linearly independent matrices x^[j] will yield m mutually orthogonal eigenvectors by orthogonalization (Sec. 14.7-4b).

(c) A Relaxation Method. Starting with a trial vector represented by the column matrix x, compute y = Ax and adjust the matrix elements ξ_i of x so that the greatest absolute difference between two of the quotients η_1/ξ_1, η_2/ξ_2, . . . , η_n/ξ_n becomes as small as possible; the η_i are the matrix elements of y.
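The iteration (20.3-21) and the quotient (20.3-22) can be sketched for the real symmetric case as follows (the normalization choice and the 2 × 2 test matrix are illustrative assumptions):

```python
def dominant_eigenvalue(a, x0, iters=200):
    """Power iteration, Eq. (20.3-21): repeated products alpha_j A x^[j],
    with alpha_j normalizing the component largest in absolute value to 1.
    The eigenvalue estimate is the Rayleigh quotient of Eq. (20.3-22)."""
    n = len(a)
    x = list(x0)
    for _ in range(iters):
        y = [sum(a[i][k] * x[k] for k in range(n)) for i in range(n)]
        big = max(y, key=abs)            # component largest in magnitude
        x = [v / big for v in y]
    num = sum(a[i][k] * x[i] * x[k] for i in range(n) for k in range(n))
    return num / sum(v * v for v in x), x

lam, vec = dominant_eigenvalue([[2.0, 1.0], [1.0, 3.0]], [1.0, 0.0])
```

The test matrix has eigenvalues (5 ± √5)/2, so the dominant one is approximately 3.618034; the error of the Rayleigh quotient decreases like the square of the eigenvector error.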
(d) Jacobi-Von Neumann-Goldstine Rotation Process for Real Symmetric Matrices. Given a real symmetric matrix A = [a_ik] = A^[0], begin by eliminating the nondiagonal element a_IK having the largest absolute value through the orthogonal transformation

A^[1] = T_1′ A^[0] T_1        (T_1 ≡ [t_ik])

with

t_ik = δ_ik[1 + (cos ϑ_1 − 1)(δ_iI + δ_iK)] + sin ϑ_1 (δ_iI δ_kK − δ_iK δ_kI)        (i, k = 1, 2, . . . , n)
tan 2ϑ_1 = 2a_IK/(a_II − a_KK)        (20.3-23)

(rotation through an angle ϑ_1 in the "plane" spanned by e_I and e_K, see also Secs. 2.1-6 and 14.10-2b). If a_II = a_KK, then take ϑ_1 = ±π/4, the sign being that of a_IK. Apply an analogous procedure to A^[1] to obtain A^[2], and repeat the process. The product T_1 T_2 T_3 · · · of the transformation matrices converges to an orthogonal matrix which diagonalizes the given matrix A even if there are multiple eigenvalues.

Variations of the Jacobi method simply reduce all off-diagonal elements a_ik with absolute values greater than a predetermined threshold in succession (Ref. 20.9).
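The rotation process can be sketched as below; each pass zeroes the currently largest off-diagonal element by a plane rotation with the angle of Eq. (20.3-23) (the iteration limit, threshold, and 3 × 3 test matrix are illustrative assumptions):

```python
import math

def jacobi_eigenvalues(a, rotations=60):
    """Jacobi rotation process of Sec. 20.3-5d: repeatedly annihilate the
    off-diagonal element a_IK of largest absolute value by a rotation in
    the (e_I, e_K) plane; the diagonal converges to the eigenvalues."""
    n = len(a)
    a = [row[:] for row in a]
    for _ in range(rotations):
        I, K = 0, 1                       # locate largest |a_IK|, I < K
        for i in range(n):
            for k in range(i + 1, n):
                if abs(a[i][k]) > abs(a[I][K]):
                    I, K = i, k
        if abs(a[I][K]) < 1e-12:          # already (numerically) diagonal
            break
        if a[I][I] == a[K][K]:
            th = math.copysign(math.pi / 4, a[I][K])
        else:
            th = 0.5 * math.atan2(2 * a[I][K], a[I][I] - a[K][K])
        c, s = math.cos(th), math.sin(th)
        for j in range(n):                # rotate rows I and K
            aIj, aKj = a[I][j], a[K][j]
            a[I][j] = c * aIj + s * aKj
            a[K][j] = -s * aIj + c * aKj
        for j in range(n):                # rotate columns I and K
            ajI, ajK = a[j][I], a[j][K]
            a[j][I] = c * ajI + s * ajK
            a[j][K] = -s * ajI + c * ajK
    return sorted(a[i][i] for i in range(n))

vals = jacobi_eigenvalues([[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]])
```

The test matrix has eigenvalues 2 and 2 ± √2, against which the result can be checked.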
(e) Other Methods. A number of other numerical methods for solving matrix eigenvalue problems, including those involving nonhermitian matrices, will be found in Refs. 20.9, 20.22 to 20.26, and 20.31.
20.4. FINITE DIFFERENCES AND DIFFERENCE EQUATIONS
20.4-1. Finite Differences and Central Means. (a) Let y = y(x) be a function of the real variable x. Given a set of equally spaced argument values x_k = x_0 + k Δx (k = 0, ±1, ±2, . . . ; Δx = h > 0) and a corresponding set or table of function values y_k = y(x_k) = y(x_0 + k Δx), one defines the forward differences

Δy_k = y_{k+1} − y_k        (first-order forward differences)
Δ²y_k = Δy_{k+1} − Δy_k = y_{k+2} − 2y_{k+1} + y_k        (second-order forward differences)        (20.4-1)
Δ^r y_k = Δ^{r−1} y_{k+1} − Δ^{r−1} y_k = Σ_{j=0}^r (−1)^j (r choose j) y_{k+r−j}        (r = 2, 3, . . .)        (rth-order forward differences)
        (k = 0, ±1, ±2, . . .)
and the backward differences

∇y_k = y_k − y_{k−1} = Δy_{k−1}        (first-order backward differences)        (20.4-2)
∇^r y_k = ∇^{r−1} y_k − ∇^{r−1} y_{k−1} = Δ^r y_{k−r}        (r = 2, 3, . . .)        (rth-order backward differences)
        (k = 0, ±1, ±2, . . .)

(b) Even though the function values y_k = y(x_0 + k Δx) may not be known for half-integral values of k, one can calculate the central differences

δy_k = y_{k+½} − y_{k−½} = Δy_{k−½}
δ^r y_k = δ^{r−1} y_{k+½} − δ^{r−1} y_{k−½} = Δ^r y_{k−r/2}        (r = 2, 3, . . .)        (20.4-3)

and the central means

μy_k = ½(y_{k−½} + y_{k+½})
μ^r y_k = ½(μ^{r−1} y_{k−½} + μ^{r−1} y_{k+½})        (r = 2, 3, . . .)        (20.4-4)

for k = ±½, ±3/2, . . . if the order r is odd, and for k = 0, ±1, ±2, . . . if r is even.
(c) Finite differences are conveniently tabulated in arrays like

x_0    y_0
              Δy_0
x_1    y_1            Δ²y_0
              Δy_1            Δ³y_0
x_2    y_2            Δ²y_1            Δ⁴y_0        (20.4-5)
              Δy_2            Δ³y_1
x_3    y_3            Δ²y_2
              Δy_3
x_4    y_4

with entirely analogous arrays (20.4-6) for the backward differences ∇^r y_k and (20.4-7) for the central differences δ^r y_k and central means μ^r y_k; each entry is computed from the two entries of the preceding column between which it stands. Note that (1) the computation of an rth-order difference requires r + 1 function values, and (2) the nth-order differences of an nth-degree polynomial y(x) are constant.
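Both remarks are easy to verify by building the columns of such an array (a short sketch; the cubic sample values are an illustrative assumption):

```python
def forward_differences(y):
    """Columns of array (20.4-5): y_k, then Delta y_k, Delta^2 y_k, ...
    Each column is one entry shorter than the last, reflecting that an
    r-th order difference needs r + 1 function values."""
    cols = [list(y)]
    while len(cols[-1]) > 1:
        prev = cols[-1]
        cols.append([prev[k + 1] - prev[k] for k in range(len(prev) - 1)])
    return cols

ys = [k ** 3 for k in range(6)]      # y = x^3 sampled at x = 0, 1, ..., 5
cols = forward_differences(ys)
```

For the cubic, the third differences are the constant 3! = 6 and all higher differences vanish, illustrating remark (2).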
20.4-2. Operator Notation. (a) Definitions. Given a suitably defined function y = y(x) of the real variable x and a fixed increment Δx = h of x, one defines the displacement operator (shift operator) E by

E y(x) ≡ y(x + Δx)        E^r y(x) ≡ y(x + r Δx)        (20.4-8)

where r is any real number. The difference operators Δ, ∇, δ and the central-mean operator μ are defined by

Δy(x) ≡ y(x + Δx) − y(x)        (forward-difference operator)
∇y(x) ≡ y(x) − y(x − Δx)        (backward-difference operator)        (20.4-9)
δy(x) ≡ y(x + Δx/2) − y(x − Δx/2)        (central-difference operator)

μy(x) ≡ ½[y(x + Δx/2) + y(x − Δx/2)]        (central-mean operator)        (20.4-10)
Note that, for u_k = u(x_0 + k Δx), v_k = v(x_0 + k Δx),

Δ(u_k + v_k) = Δu_k + Δv_k        Δ(αu_k) = α Δu_k
Δ(u_k v_k) = u_k Δv_k + v_k Δu_k + Δu_k Δv_k        (20.4-11)

Note also that, for Δx = 1,

Δx^[r] ≡ Δ[x(x − 1) · · · (x − r + 1)] = r x(x − 1) · · · (x − r + 2) = r x^[r−1]        (r = 1, 2, . . .)        (20.4-12)
(b) Operator Relations (see also Sec. 14.3-1 and Ref. 20.36).

Δ = E − 1 = E∇ = E^½ δ
∇ = 1 − E^−1 = E^−1 Δ = E^−½ δ
δ = E^½ − E^−½ = E^−½ Δ = E^½ ∇        (20.4-13)
E = 1 + Δ        ∇Δ = Δ∇ = δ²        μ = ½(E^½ + E^−½)

As an aid to memory, note

Δ^r = (E − 1)^r = Σ_{j=0}^r (−1)^j (r choose j) E^{r−j}        (r = 1, 2, . . .)        (20.4-14)
together with ∇ = E^−1 Δ and δ = E^−½ Δ. Note also

E^r = (1 + Δ)^r = 1 + (r choose 1) Δ + (r choose 2) Δ² + · · ·        (20.4-15)

If r is a positive integer, the series (15) terminates; otherwise, its convergence requires investigation.
(c) For suitably differentiable operands,

E^r = e^{r Δx D} = 1 + r Δx D + (r Δx D)²/2! + · · ·        (D ≡ d/dx)        (20.4-16)

(operator notation for Taylor's series, Sec. 4.10-4; see also Table 8.3-1).
20.4-3. Difference Equations. (a) An ordinary difference equation of order r is an equation

G(x_k; y_k, y_{k+1}, y_{k+2}, . . . , y_{k+r}) ≡ G(x_k; y_k, Ey_k, E²y_k, . . . , E^r y_k) = 0
        (k = 0, ±1, ±2, . . . ; r = 1, 2, . . .)        (20.4-17)

relating values y_k = y(x_k) = y(x_0 + k Δx) of a function y = y(x) defined on a discrete set of values x = x_k = x_0 + k Δx, where Δx is a fixed increment. It is often convenient to introduce k = (x − x_0)/Δx = 0, ±1, ±2, . . . as a new independent variable. An ordinary difference equation of order r relates y_k and a set of finite differences Δ^i y_k, ∇^i y_k, or δ^i y_k of order i up to and including r; or the difference equation may relate y_k and difference coefficients Δ^i y_k/Δx^i, ∇^i y_k/Δx^i, or δ^i y_k/Δx^i of order i up to and including r.

To solve a difference equation (17) means to find solutions y = y(x) such that the sequence of the y_k satisfies the given equation for some range of values of k. The complete solution of an ordinary difference equation of order r will, in general, involve r arbitrary constants; the latter must be determined by accessory conditions on the y_k, such as initial conditions or boundary conditions. The solution of a difference equation over any finite range amounts, in principle, to that of a set of simultaneous equations. Difference equations are used (1) to approximate differential equations (Secs. 20.9-2 and 20.9-4) and (2) to deal with situations represented by discrete-variable models.
A SIMPLE EXAMPLE: SUMMATION OF SERIES. The problem of solving a first-order difference equation (recurrence relation) of the form

Δy_{k−1} = ∇y_k = α_k        or        y_k = y_{k−1} + α_k        (20.4-18)

with a given initial value y_0 = α_0 is identical with the problem of summation of series (Sec. 4.8-5):

y_k = y_{k−1} + α_k = Σ_{j=0}^k α_j        (k = 0, 1, 2, . . .)        (20.4-19)
The problem is analogous to the integration of a differential equation y′ = f(x). Σ and ∇ are inverse operators; note Eq. (12) and

Σ_{k=m}^n u_k Δv_k = (u_{n+1} v_{n+1} − u_m v_m) − Σ_{k=m}^n v_{k+1} Δu_k        (summation by parts)        (20.4-20)
(b) A partial difference equation relates values Φ_{ij...} = Φ(x_0 + i Δx, y_0 + j Δy, . . .) (i, j, . . . = 0, ±1, ±2, . . .) of a function Φ = Φ(x, y, . . .); the order of the partial difference equation is the largest difference between i values, j values, . . . occurring in the equation. Refer to Sec. 20.9-5 for formulas expressing various partial-difference operators in terms of function values Φ_{ij...}, and for the use of partial difference equations to approximate solutions of partial differential equations.
20.4-4. Linear Ordinary Difference Equations. (a) Superposition Theorems (see also Secs. 9.3-1 and 15.4-2). A linear difference equation is linear in the values and differences of the unknown function. Thus a linear ordinary difference equation of order r has the form

a_0(k) y_{k+r} + a_1(k) y_{k+r−1} + · · · + a_r(k) y_k ≡ [a_0(k) E^r + a_1(k) E^{r−1} + · · · + a_r(k)] y_k = f(k)        (20.4-21)

where the a_i(k) and f(k) are given functions of k = 0, ±1, ±2, . . . . As in Sec. 9.3-1, solutions y_k corresponding to f(k) = αf_1(k) + βf_2(k) are the corresponding linear combinations of solutions corresponding to the individual forcing functions f_1(k) and f_2(k) (Superposition Principle). The complete solution of Eq. (21) can be expressed as the sum of any particular solution and the complete solution of the linear and homogeneous "complementary equation"

[a_0(k) E^r + a_1(k) E^{r−1} + · · · + a_r(k)] y_k = 0        (20.4-22)

Again, every linear combination of solutions of a linear homogeneous difference equation (22) is itself a solution of Eq. (22). The theory of ordinary difference equations parallels that of ordinary differential equations in many details (see also Ref. 20.67). In particular, a homogeneous linear difference equation (22) admits at most r solutions y_{(1)k}, y_{(2)k}, . . . linearly independent on the set k = 0, 1, 2, . . . .
r such solutions are linearly independent if and only if the Casoratian determinant

K[y_{(1)k}, y_{(2)k}, . . . , y_{(r)k}] =
| y_{(1)k}        y_{(2)k}        · · ·   y_{(r)k}      |
| y_{(1)k+1}      y_{(2)k+1}      · · ·   y_{(r)k+1}    |        (20.4-23)
| · · · · · · · · · · · · · · · · · · · · · · · · · ·   |
| y_{(1)k+r−1}    y_{(2)k+r−1}    · · ·   y_{(r)k+r−1}  |

is not identically zero for k = 0, 1, 2, . . . . The Casoratian is analogous to the Wronskian in Sec. 9.3-2.
(b) The Method of Variation of Constants (see also Sec. 9.3-3). Assuming that r linearly independent solutions y_{(1)k}, y_{(2)k}, . . . , y_{(r)k} of the complementary equation (22) are known, the complete solution of the nonhomogeneous linear difference equation (21) is given by

y_k = C_1(k) y_{(1)k} + C_2(k) y_{(2)k} + · · · + C_r(k) y_{(r)k}        (20.4-24a)

with

Σ_{h=1}^r y_{(h)k+j} ΔC_h(k) = 0        (j = 1, 2, . . . , r − 1)
Σ_{h=1}^r y_{(h)k+r} ΔC_h(k) = f(k)        (20.4-24b)

After solving the r simultaneous linear equations (24b) for the ΔC_h(k), obtain each C_h(k) by summation as in Eq. (19).
20.4-5. Linear Ordinary Difference Equations with Constant Coefficients: Method of Undetermined Coefficients (see also Secs. 9.4-1 to 9.4-8). (a) The complete solution of the linear homogeneous difference equation

a_0 y_{k+r} + a_1 y_{k+r−1} + · · · + a_r y_k ≡ (a_0 E^r + a_1 E^{r−1} + · · · + a_r) y_k = 0        (20.4-25)

with given constant coefficients a_0, a_1, . . . , a_r can be expressed in terms of normal-mode terms completely analogous to those of Sec. 9.4-1c, viz.,

y_k = C_1 λ_1^k + C_2 λ_2^k + · · · + C_r λ_r^k        (20.4-26)

where λ_1, λ_2, . . . , λ_r are the roots of the algebraic equation

a_0 λ^r + a_1 λ^{r−1} + · · · + a_r = 0        (20.4-27)

provided that all r roots are different. If a root, say λ_1, is an m-fold root, then the corresponding term in the solution (26) becomes (C_1 + kC_2 + k²C_3 + · · · + k^{m−1} C_m) λ_1^k. Two terms corresponding to complex conjugate roots λ = a ± ib may be combined into |λ|^k (A cos kφ + B sin kφ), where φ = arg λ.

(b) Particular solutions of the nonhomogeneous linear difference equation

a_0 y_{k+r} + a_1 y_{k+r−1} + · · · + a_r y_k ≡ (a_0 E^r + a_1 E^{r−1} + · · · + a_r) y_k = f(k)        (20.4-28)
can often be derived in the manner of Sec. 20.4-4, but the special methods of Secs. 20.4-6 and 20.4-7a may be more convenient.

20.4-6. Transform Methods for Linear Difference Equations with Constant Coefficients. (a) The z-transform Method (see also Secs. 8.7-3 and 9.4-5). Given a difference equation (28), z transformation of both sides with the aid of the shift theorem 2 of Table 8.7-2
yields the z transform

Z[y_k; z] ≡ Y_Z(z) ≡ y_0 + y_1/z + y_2/z² + · · ·        (20.4-29)

of the unknown solution sequence y_0, y_1, y_2, . . . in the form

Y_Z(z) = Z[f(k); z]/(a_0 z^r + a_1 z^{r−1} + · · · + a_r) + G_Z(z)/(a_0 z^r + a_1 z^{r−1} + · · · + a_r)        (20.4-30)

where the first term on the right, as in Sec. 9.4-5, represents a "normal response" to the given forcing sequence f(0), f(1), f(2), . . . , and the second term represents the effect of r given initial values y_0, y_1, y_2, . . . , y_{r−1}. Specifically,

G_Z(z) = y_0(a_0 z^r + a_1 z^{r−1} + · · · + a_{r−1} z) + y_1(a_0 z^{r−1} + a_1 z^{r−2} + · · · + a_{r−2} z) + · · · + y_{r−1} a_0 z        (20.4-31)
The unknown y_k may be obtained as coefficients of 1/z^k in the power-series expansion of Y_Z(z), or one may utilize a table of z-transform pairs (Table 20.4-1). As in the case of Laplace transforms, Y_Z(z) can be reduced to a sum of simpler forms by a partial-fraction expansion, which yields normal modes corresponding to the roots of the characteristic equation (27).

(b) Sampled-data Representation in Terms of Impulse Trains and Jump Functions. Laplace-transform Method. If one formally admits the asymmetric impulse function δ_+(t) and step-function differentiation in the sense of Sec. 21.9-6, a sampled-data sequence u_0, u_1, u_2, . . . can be represented, on a reciprocal one-to-one basis, by a corresponding impulse train*

u†(t) = u_0 δ_+(t) + u_1 δ_+(t − T) + u_2 δ_+(t − 2T) + · · ·        (t > 0)        (20.4-32)

where t is a real variable, and T a real positive constant (sampling interval). If the sampled-data sequences y_0, y_1, y_2, . . . and f(0), f(1), f(2), . . . satisfy any difference equation (28), the corresponding functions y†(t), f†(t) satisfy the functional equation (difference equation for functions)

a_0 y†(t + r Δt) + a_1 y†[t + (r − 1) Δt] + · · · + a_r y†(t) = f†(t)        (t > 0)        (20.4-33)

with appropriate initial conditions. Unlike Eq. (28), this relation admits Laplace transformation, with

L[u†(t); s] = u_0 + u_1 e^{−Ts} + u_2 e^{−2Ts} + · · ·        (20.4-34)

(see also Sec. 8.2-2). The resulting transform method is analogous to the z-transform method with z = e^{sT} = e^{s Δt}, where Δt = T.

To avoid the use of symbolic functions, one can, instead, represent the sequence u_0, u_1, u_2, . . . by the corresponding jump function

u‡(t) = ∫_0^t [u†(τ) − u†(τ − Δt)] dτ = u_k        for kT ≤ t < (k + 1)T        (t > 0; k = 0, 1, 2, . . .)        (20.4-35)

* u†(t) is usually denoted by u*(t) in the literature on sampled-data systems.
Table 20.4-1. A Short Table of z Transforms

      Sequence of sample values y_k (k = 0, 1, 2, . . .)        z transform Z[y_k; z] = Σ_{k=0}^∞ y_k/z^k

 1    1                                                          z/(z − 1)
 2    k                                                          z/(z − 1)²
 3    k²                                                         z(z + 1)/(z − 1)³
 4    (k choose n)        (n = 0, 1, 2, . . .)                   z/(z − 1)^{n+1}
 5    a^k                                                        z/(z − a)
 6    k a^k                                                      az/(z − a)²
 7    (k choose n) a^{k−n}        (n = 0, 1, 2, . . .)           z/(z − a)^{n+1}
 8    (a^k − b^k)/(a − b)        (a ≠ b)                         z/[(z − a)(z − b)]
 9    a^k sin bk                                                 az sin b/(z² − 2az cos b + a²)
10    a^k cos bk                                                 z(z − a cos b)/(z² − 2az cos b + a²)
which can be physically interpreted as the output of a "zero-order data-hold" device. y‡(t) and f‡(t) satisfy the same functional equation as y†(t) and f†(t), so that Laplace transformation is again possible.
20.4-7. Systems of Ordinary Difference Equations (State Equations). Matrix Notation. Just as in the case of differential equations, one may be given a system of ordinary difference equations involving two or more unknown sequences y(x_k) ≡ y_k, z(x_k) ≡ z_k, . . . . One can reduce any rth-order difference equation (17) to a set of r first-order equations by introducing the E^i y_k or Δ^i y_k (i = 1, 2, . . . , r − 1) as new variables (state variables, see also Sec. 13.6-1). Again, as in Sec. 13.6-1, a given system of linear first-order difference equations (recurrence relations, state equations)

y_{k+1} = a_11 y_k + a_12 z_k + · · · + f_1(k)
z_{k+1} = a_21 y_k + a_22 z_k + · · · + f_2(k)        (20.4-36a)
· · · · · · · · · · · · · · · · · · · · ·

with constant coefficients a_ij may be rewritten as a matrix equation

Y_{k+1} = A Y_k + F(k)        (20.4-36b)

(see also Sec. 14.5-3), where Y_k is the column matrix {y_k, z_k, . . .}, F(k) is the column matrix {f_1(k), f_2(k), . . .}, and A ≡ [a_ij]. Given Y_0 = {y_0, z_0, . . .}, the solution is

Y_k = A^k Y_0 + Σ_{h=0}^{k−1} A^{k−h−1} F(h)        (k = 0, 1, 2, . . .)        (20.4-37)
where A^k, in analogy with Sec. 13.6-2b, is called the state-transition matrix for the system (36). The powers A^k required for the solution can be computed with the aid of Sylvester's theorem (Sec. 13.4-7b), so that each eigenvalue of A, i.e., each root λ of the characteristic equation

det (A − λI) = 0        (20.4-38)

again corresponds to a normal mode (see also Secs. 13.6-2 and 20.4-5).

This method applies, in particular, to the solution of the rth-order linear difference equation (28) if it is first reduced to a system of the form (36) through the introduction of y_{k+1}, y_{k+2}, . . . , y_{k+r−1} as new variables, so that

Y_k = {y_{k+r−1}, y_{k+r−2}, . . . , y_k}        F(k) = {f(k)/a_0, 0, 0, . . . , 0}

A =
[−a_1/a_0   −a_2/a_0   · · ·   −a_{r−1}/a_0   −a_r/a_0]
[    1          0      · · ·        0             0    ]
[    0          1      · · ·        0             0    ]        (20.4-39)
[  · · ·      · · ·    · · ·      · · ·         · · ·  ]
[    0          0      · · ·        1             0    ]
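Equation (20.4-37) can be evaluated directly (a sketch; matrix powers are formed by repeated multiplication rather than by Sylvester's theorem, and the homogeneous Fibonacci-type example with its companion matrix is an illustrative assumption):

```python
def state_solution(A, Y0, F, k):
    """Evaluate Eq. (20.4-37): Y_k = A^k Y_0 + sum_h A^(k-h-1) F(h)."""
    n = len(A)
    def matvec(m, v):
        return [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
    def matmul(p, q):
        return [[sum(p[i][t] * q[t][j] for t in range(n)) for j in range(n)]
                for i in range(n)]
    def matpow(m, e):
        r = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
        for _ in range(e):
            r = matmul(r, m)
        return r
    Yk = matvec(matpow(A, k), Y0)
    for h in range(k):
        step = matvec(matpow(A, k - h - 1), F(h))
        Yk = [Yk[i] + step[i] for i in range(n)]
    return Yk

# homogeneous example (F = 0): y_{k+2} = y_{k+1} + y_k, y_0 = 0, y_1 = 1,
# with companion matrix (20.4-39) and state Y_k = {y_{k+1}, y_k}
A = [[1.0, 1.0], [1.0, 0.0]]
Y10 = state_solution(A, [1.0, 0.0], lambda h: [0.0, 0.0], 10)
```

Here Y_10 = {y_11, y_10} reproduces the Fibonacci numbers 89 and 55.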
20.4-8. Stability. (a) By analogy with Sec. 9.4-4, a linear difference equation (28) or a linear system (36) with constant coefficients will be called completely stable if and only if all roots of the corresponding characteristic equation (27) or (38) have absolute values less than unity; this ensures that the effects of small changes in the initial conditions tend to zero as k increases.

Reference 20.63 presents Jury's version of the Schur-Cohn test for stability, which is analogous to the Routh-Hurwitz criterion (Sec. 1.6-6b) for roots with negative real parts.
(b) Lyapunov's definitions (Sec. 13.6-5) and the related theorems (Sec. 13.6-6) on the stability and asymptotic stability of solutions are readily extended to the solution sequences Y_0, Y_1, Y_2, . . . of linear and nonlinear autonomous difference-equation systems

Y_{k+1} = F(Y_k)        (k = 0, 1, 2, . . .)        (20.4-40)

Generally speaking, conditions like dV/dt < 0 for continuous-system Lyapunov functions V(y) simply translate into analogous conditions ΔV(Y_k) < 0 for discrete-parameter Lyapunov functions V(Y_k) (Ref. 20.63).

20.5. APPROXIMATION OF FUNCTIONS BY INTERPOLATION
20.5-1. Introduction (see also Secs. 12.5-4b and 15.2-5). Given n + 1 function values y(x_k) = y_k, an interpolation formula approximates the function y(x) by a suitable known function Y(x) = Y(x; a_0, a_1, a_2, . . . , a_n) depending on n + 1 parameters a_i chosen so that Y(x_k) = y(x_k) = y_k for the given set of n + 1 argument values x_k. In particular, Secs. 20.5-2 to 20.5-6 deal with polynomial interpolation. Other interpolation methods are discussed in Secs. 20.5-7 and 20.6-6.

The use of interpolation-type approximations (and, in particular, of polynomial interpolation) is not always justified. Thus, in the case of empirical functions, one may wish to smooth fluctuations in the y(x_k) due to random errors (Sec. 20.6-3).
20.5-2. General Formulas for Polynomial Interpolation (Argument Values Not Necessarily Equally Spaced). An nth-order polynomial interpolation formula approximates the function y(x) by an nth-degree polynomial Y(x) such that Y(x_k) = y(x_k) = y_k for a given set of n + 1 argument values x_k.

(a) Lagrange's Interpolation Formula. Given y_0 = y(x_0), y_1 = y(x_1), y_2 = y(x_2), . . . , y_n = y(x_n),

Y(x) = [(x − x_1)(x − x_2) · · · (x − x_n)] / [(x_0 − x_1)(x_0 − x_2) · · · (x_0 − x_n)] y_0
     + [(x − x_0)(x − x_2) · · · (x − x_n)] / [(x_1 − x_0)(x_1 − x_2) · · · (x_1 − x_n)] y_1 + · · ·
     + [(x − x_0)(x − x_1) · · · (x − x_{n−1})] / [(x_n − x_0)(x_n − x_1) · · · (x_n − x_{n−1})] y_n
        (Lagrange's interpolation formula)        (20.5-1)
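Equation (20.5-1) evaluates directly as a double loop (a sketch; the three sample points, taken from y = x², are an illustrative assumption):

```python
def lagrange(xs, ys, x):
    """Evaluate Eq. (20.5-1): sum of y_i times the product of
    (x - x_j)/(x_i - x_j) over all j != i."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# three samples of y = x^2; the quadratic is reproduced exactly
val = lagrange([0.0, 1.0, 3.0], [0.0, 1.0, 9.0], 2.0)   # approximately 4.0
```

Note that adding one more data point forces recomputation of every term, which is the drawback remedied by Newton's form below.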
(b) Divided Differences and Newton's Interpolation Formula. One defines the divided differences

Δ_1(x_0, x_1) = (y_1 − y_0)/(x_1 − x_0)
Δ_r(x_0, x_1, x_2, . . . , x_r) = [Δ_{r−1}(x_1, x_2, . . . , x_r) − Δ_{r−1}(x_0, x_1, . . . , x_{r−1})]/(x_r − x_0)        (r = 2, 3, . . .)        (20.5-2)

Then

Y(x) = y_0 + (x − x_0) Δ_1(x_0, x_1) + (x − x_0)(x − x_1) Δ_2(x_0, x_1, x_2) + · · ·
     + [Π_{k=0}^{n−1} (x − x_k)] Δ_n(x_0, x_1, x_2, . . . , x_n)        (Newton's interpolation formula)        (20.5-3)

Unlike in Eq. (1), the addition of a new pair of values x_{n+1}, y_{n+1} requires merely the addition of an extra term. It is convenient to tabulate the divided differences (2) for use in Eq. (3) in the manner of Eq. (20.4-5). Taking a divided difference is a linear operation (Sec. 15.2-7) on the function y(x). Each function (2) is completely symmetric in its arguments.
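The tabulation of Eq. (20.5-2) and the evaluation of Eq. (20.5-3) can be sketched as follows (the in-place table update and the nested-multiplication evaluation are standard devices, assumed here rather than taken from the text):

```python
def divided_differences(xs, ys):
    """Newton coefficients y_0, Delta_1(x_0,x_1), Delta_2(x_0,x_1,x_2), ...
    of Eq. (20.5-2), built column by column in place."""
    coef = list(ys)
    n = len(xs)
    for r in range(1, n):
        for i in range(n - 1, r - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - r])
    return coef

def newton_eval(xs, coef, x):
    """Evaluate Newton's formula (20.5-3) by nested multiplication."""
    acc = coef[-1]
    for i in range(len(coef) - 2, -1, -1):
        acc = acc * (x - xs[i]) + coef[i]
    return acc

xs, ys = [0.0, 1.0, 3.0], [0.0, 1.0, 9.0]    # samples of y = x^2
c = divided_differences(xs, ys)
val = newton_eval(xs, c, 2.0)
```

Appending a new sample point only appends one coefficient to c, illustrating the advantage over Eq. (20.5-1) noted above.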
(c) Aitken's Iterated-interpolation Method. The following scheme may be useful if one desires values of the interpolation polynomial Y(x) rather than a simple expression for Y(x). Let Y_{ijk...} be the interpolation polynomial through (x_i, y_i), (x_j, y_j), (x_k, y_k), . . . , so that Y_{012...n} = Y(x). Interpolation polynomials of increasing order are then obtained successively as follows:

Y_01 = [1/(x_1 - x_0)] det | y_0   x_0 - x |        Y_12 = [1/(x_2 - x_1)] det | y_1   x_1 - x |
                           | y_1   x_1 - x |                                   | y_2   x_2 - x |

Y_012 = [1/(x_2 - x_0)] det | Y_01   x_0 - x |      Y_0123 = [1/(x_3 - x_0)] det | Y_012   x_0 - x |
                            | Y_12   x_2 - x |                                   | Y_123   x_3 - x |   . . .
The process may be terminated whenever two interpolation polynomials of successive orders agree to the desired number of significant figures.
(d) The Remainder. If y(x) is suitably differentiable, the remainder (error) R_{n+1}(x) involved in the use of any polynomial-interpolation formula based on the n + 1 function values y_0, y_1, y_2, . . . , y_n may be
Fig. 20.5-1. Lozenge diagram (Fraser diagram) for interpolation formulas. [Diagram not reproduced: it arrays the differences Δ^k y_{-p} in columns, flanked by the factorials (u + p)_k, with marked paths corresponding to the Gregory-Newton (forward), Gregory-Newton (backward), Gauss (I), and Stirling interpolation formulas.]
The abbreviated notation (u + k)_s, standing for the binomial coefficient C(u + k, s), is used. To convert a path through the lozenge to an interpolation formula, the following rules are formulated:
1. Each time a difference column is crossed from left to right, a term is added.
2. If a path enters a difference column (from the left) at a positive slope, the term added is the product of the difference, say Δ^k y_{-p}, at which the column is crossed and the factorial (u + p - 1)_k lying just below that difference.
3. If a path enters a difference column (from the left) at a negative slope, the term added is the product of the difference, say Δ^k y_{-p}, at which the column is crossed and the factorial (u + p)_k lying just above that difference.
4. If a path enters a difference column horizontally (from the left), the term added
estimated from

R_{n+1}(x) = y(x) - Y(x) = [1/(n + 1)!] y^{(n+1)}(ξ) Π_{k=0}^{n} (x - x_k)   (20.5-4)

where ξ lies in the smallest interval I containing x_0, x_1, x_2, . . . , x_n, and x (see also Sec. 4.10-4; ξ will, in general, depend on x), and

|R_{n+1}(x)| ≤ [1/(n + 1)!] max_{ξ in I} |y^{(n+1)}(ξ)| Π_{k=0}^{n} |x - x_k|   (20.5-5)
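The Aitken iteration of Sec. 20.5-2c can be sketched in a few lines of Python (illustrative code, not from the handbook; the Neville variant shown combines neighboring polynomials, as in the Y_01, Y_12 → Y_012 scheme above). Successive rows hold the values of interpolation polynomials of increasing order, so the last two rows may be compared to decide when to stop:

```python
def neville(xs, ys, x):
    """Iterated (Aitken-Neville) interpolation: returns Y(x) together
    with the triangular table of successive polynomial values."""
    n = len(xs)
    rows = [list(ys)]
    for order in range(1, n):
        prev = rows[-1]
        # each entry combines two order-(order-1) polynomial values
        rows.append([((x - xs[i]) * prev[i + 1] - (x - xs[i + order]) * prev[i])
                     / (xs[i + order] - xs[i])
                     for i in range(n - order)])
    return rows[-1][0], rows
```

For data sampled from a cubic, the order-3 value reproduces the function exactly, consistent with the remainder formula (20.5-4) (the fourth derivative vanishes).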
20.5-3. Interpolation Formulas for Equally Spaced Argument Values. Lozenge Diagrams. Let y_k = y(x_k), x_k = x_0 + k Δx (k = 0, ±1, ±2, . . .), where Δx is a fixed increment as in Sec. 20.4-1, and introduce the abbreviation (x - x_0)/Δx = u.
(a) Newton-Gregory Interpolation Formulas. Given y_0, y_1, y_2, . . . or y_0, y_{-1}, y_{-2}, . . . , Eq. (3) becomes, respectively,

Y(x) = y_0 + (u/1!) Δy_0 + [u(u - 1)/2!] Δ²y_0 + · · ·
Y(x) = y_0 + (u/1!) ∇y_0 + [u(u + 1)/2!] ∇²y_0 + · · ·
          (Newton-Gregory interpolation formulas)   (20.5-6)
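For equally spaced data, the forward form of Eq. (20.5-6) needs only a difference table and a running binomial coefficient. A minimal Python sketch (illustrative names, not from the handbook):

```python
def gregory_newton_forward(ys, u):
    """Y(x0 + u*Dx) from the forward Newton-Gregory formula (20.5-6),
    using all available forward differences of ys = [y0, y1, y2, ...]."""
    row = list(ys)
    y = row[0]
    binom = 1.0
    for k in range(1, len(ys)):
        row = [b - a for a, b in zip(row, row[1:])]  # k-th forward differences
        binom *= (u - (k - 1)) / k                   # running C(u, k)
        y += binom * row[0]
    return y
```

With samples of a polynomial of degree at most n, the formula is exact for any u, since the higher differences vanish.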
(b) Symmetric Interpolation Formulas. More frequently, one is given y_0, y_1, y_2, . . . and y_{-1}, y_{-2}, . . . ; Table 20.5-1 lists the most useful interpolation formulas for this case (see also Fig. 20.5-1). Note that Everett's and Steffensen's formulas are of particular interest for use with printed tables, since only even-order or only odd-order differences need be tabulated.
is the product of the difference, say Δ^k y_{-p}, at which the column is crossed and the average of the two factorials (u + p)_k and (u + p - 1)_k lying, respectively, just above and just below that difference.
5. If a path crosses a difference column (from left to right) between two differences, say Δ^k y_{-(p+1)} and Δ^k y_{-p}, then the added term is the product of the average of these two differences and the factorial (u + p)_k at which the column is crossed.
6. Any portion of a path traversed from right to left gives rise to the same terms as would arise from going along this portion from left to right, except that the sign of each term is changed.
7. The zero-difference column corresponding to the tabulated values may be treated by the same rules as any other difference column, provided one thinks of the lozenge as being entered just to the left of this column. Thus this column can be crossed by a path making a positive, negative, or zero slope, just as is true of the other columns. (Reprinted by permission from K. S. Kunz, Numerical Analysis, McGraw-Hill, New York, 1957.)
Table 20.5-1. Symmetric Interpolation Formulas
One is given an odd number n + 1 = 2m + 1 of function values y_k = y(x_0 + k Δx) (k = 0, ±1, ±2, . . . , ±m), where Δx is a fixed increment; u = (x - x_0)/Δx.

1. Stirling's interpolation formula:
   Y(x) = y_0 + u μδy_0 + (u²/2!) δ²y_0 + [u(u² - 1²)/3!] μδ³y_0 + [u²(u² - 1²)/4!] δ⁴y_0 + · · ·
2. Bessel's interpolation formula:*
   Y(x) = μy_{1/2} + (u - ½) δy_{1/2} + [u(u - 1)/2!] μδ²y_{1/2} + [(u - ½)u(u - 1)/3!] δ³y_{1/2} + · · ·
3. Everett's interpolation formula (even-order differences only), with w ≡ 1 - u:
   Y(x) = u y_1 + C(u + 1, 3) δ²y_1 + C(u + 2, 5) δ⁴y_1 + · · ·
        + w y_0 + C(w + 1, 3) δ²y_0 + C(w + 2, 5) δ⁴y_0 + · · ·
4. Steffensen's interpolation formula (odd-order differences only):
   Y(x) = y_0 + [u(u + 1)/2!] δy_{1/2} - [u(u - 1)/2!] δy_{-1/2}
        + [(u + 2)(u + 1)u(u - 1)/4!] δ³y_{1/2} - [(u + 1)u(u - 1)(u - 2)/4!] δ³y_{-1/2} + · · ·

Remainder: R_{n+1}(x) = C(u + m, 2m + 1) y^{(2m+1)}(ξ) Δx^{2m+1}, where ξ lies in the smallest interval containing x and every x_0 + k Δx used; for Everett's and Steffensen's formulas, R_{n+1}(x) ≈ R_{2m+1}(x).

* Bessel's modified formula replaces the fourth-difference term by
  [(u + 1)u(u - 1)(u - 2)/24] [(δ⁴y_0 + δ⁴y_1)/2 - (191/924)(δ⁶y_0 + δ⁶y_1)/2]
and thus gives a simplified polynomial including the effect of sixth-order differences (see also Refs. 20.13 and 20.16).
(c) Use of Lozenge Diagrams. Many interpolation formulas can be obtained with the aid of a lozenge diagram (Fraser diagram) like that shown in Fig. 20.5-1. One can derive more sophisticated interpolation formulas (e.g., Everett's formula) by averaging two or more equivalent interpolation polynomials.
20.5-4. Inverse Interpolation. (a) Given x_k = x_0 + k Δx, y_k = y(x_k) for k = 0, 1, 2, . . . , n, it is desired to find a value ξ of x between x_0 and x_1 so that y(x) assumes a given value η; Δx is assumed to be so small that ξ is unique. One may apply the general interpolation formulas (1) or (3) by reversing the roles of x and y. Alternatively, one may apply one of the iteration schemes of Sec. 20.2-2 to solve the equation

Y(ξ) - η = 0

where Y(x) is a suitable interpolation polynomial approximating y(x); one uses linear interpolation for the first step of the iteration, second-order interpolation for the second step, etc.
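The root-finding view of inverse interpolation can be sketched as follows (Python, illustrative names; bisection stands in for the iteration schemes of Sec. 20.2-2, and a sign change of Y - η on (x_0, x_1) is assumed):

```python
def lagrange_eval(xs, ys, x):
    """Evaluate the interpolation polynomial Y(x) in Lagrange form."""
    total = 0.0
    for k, (xk, yk) in enumerate(zip(xs, ys)):
        w = yk
        for j, xj in enumerate(xs):
            if j != k:
                w *= (x - xj) / (xk - xj)
        total += w
    return total

def inverse_interpolate(xs, ys, eta, tol=1e-12):
    """Find xi in [x0, x1] with Y(xi) = eta by bisection on Y - eta
    (assumes a sign change of Y - eta on [x0, x1])."""
    a, b = xs[0], xs[1]
    fa = lagrange_eval(xs, ys, a) - eta
    for _ in range(200):
        m = 0.5 * (a + b)
        fm = lagrange_eval(xs, ys, m) - eta
        if fa * fm <= 0:
            b = m
        else:
            a, fa = m, fm
        if b - a < tol:
            break
    return 0.5 * (a + b)
```

For the data (1, 1), (2, 4), (3, 9) (sampled from y = x²) and η = 2, the routine returns ξ ≈ √2.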
(b) Inverse Interpolation by Reversion of Series. Use any desired interpolation formula and write it as a power series Y(x) = a_0 + a_1x + a_2x² + · · · ≈ y. Then

ξ = (η - a_0)/a_1 + c_2[(η - a_0)/a_1]² + c_3[(η - a_0)/a_1]³ + · · ·

with

c_2 = -a_2/a_1          c_3 = -a_3/a_1 + 2(a_2/a_1)²
20.5-5. "Optimal-interval" Interpolation (see also Sec. 20.6-3). The nth-degree polynomial Y(x) which equals y(x) at the n + 1 points x = x_j (j = 0, 1, 2, . . . , n) of [a, b] and minimizes

max_{a≤x≤b} |Π_{k=0}^{n} (x - x_k)|

will approximately minimize the maximum absolute value of the interpolation error (4) in [a, b]. The required polynomial Y(x) is given by

Y(x) = ½A_0 + Σ_{k=1}^{n} A_k T_k[(2x - a - b)/(b - a)]   (20.5-8a)

with

A_k = [2/(n + 1)] Σ_{j=0}^{n} y(x_j) cos [k(2j + 1)π/(2n + 2)]     (k = 0, 1, 2, . . . , n)

where T_k(t) is the kth-degree Chebyshev polynomial (Sec. 21.7-4), and

x_j = (a + b)/2 + [(b - a)/2] cos [(2j + 1)π/(2n + 2)]     (j = 0, 1, 2, . . . , n)   (20.5-8b)
20.5-6. Multiple Polynomial Interpolation. To approximate z = z(x, y) by a polynomial Z(x, y) such that Z(x, y) = z(x, y) for a given set of "points" (x_j, y_k), one may first interpolate with respect to x to approximate z(x, y_k) for a set of values of k; interpolation with respect to y will then yield Z(x, y). Alternatively, one can substitute an interpolation formula for y into a formula for interpolation with respect to x. Thus, if Δx = Δy = h is a given fixed increment, and

z(x_0 + j Δx, y_0 + k Δy) = z_{jk}     u = (x - x_0)/Δx     v = (y - y_0)/Δy
          (j, k = 0, ±1, ±2, . . .)   (20.5-9)
Table 20.5-2. Interpolation Coefficients*
Lagrange Five-point Interpolation

f_s ≈ L_{-2}(s)f_{-2} + L_{-1}(s)f_{-1} + L_0(s)f_0 + L_1(s)f_1 + L_2(s)f_2
For negative s, use the lower column labels.

   s     L_{-2}(s)   L_{-1}(s)   L_0(s)      L_1(s)      L_2(s)        s
   .0     .000000     .000000    1.000000    .000000     .000000      .0
   .1     .007838    -.059850     .987525    .073150    -.008663     -.1
   .2     .014400    -.105600     .950400    .158400    -.017600     -.2
   .3     .019338    -.136850     .889525    .254150    -.026163     -.3
   .4     .022400    -.153600     .806400    .358400    -.033600     -.4
   .5     .023438    -.156250     .703125    .468750    -.039063     -.5
   .6     .022400    -.145600     .582400    .582400    -.041600     -.6
   .7     .019338    -.122850     .447525    .696150    -.040163     -.7
   .8     .014400    -.089600     .302400    .806400    -.033600     -.8
   .9     .007838    -.047850     .151525    .909150    -.020663     -.9
  1.0     .000000     .000000     .000000   1.000000     .000000    -1.0
  1.1    -.008663     .051150    -.146475   1.074150     .029838    -1.1
  1.2    -.017600     .102400    -.281600   1.126400     .070400    -1.2
  1.3    -.026163     .150150    -.398475   1.151150     .123338    -1.3
  1.4    -.033600     .190400    -.489600   1.142400     .190400    -1.4
  1.5    -.039063     .218750    -.546875   1.093750     .273438    -1.5
  1.6    -.041600     .230400    -.561600    .998400     .374400    -1.6
  1.7    -.040163     .220150    -.524475    .849150     .495338    -1.7
  1.8    -.033600     .182400    -.425600    .638400     .638400    -1.8
  1.9    -.020663     .111150    -.254475    .358150     .805838    -1.9
  2.0     .000000     .000000     .000000    .000000    1.000000    -2.0
         L_2(s)      L_1(s)      L_0(s)     L_{-1}(s)   L_{-2}(s)     s

Note: All coefficients become exact if each terminal 8 is replaced by 75, and each terminal 3 by 25.
* From F. B. Hildebrand, Introduction to Numerical Analysis, McGraw-Hill, New York, 1956.
Newton Interpolation

f_s ≈ f_0 + s Δf_0 + C_2(s) Δ²f_0 + C_3(s) Δ³f_0 + C_4(s) Δ⁴f_0 + C_5(s) Δ⁵f_0
f_{n-s} ≈ f_n - s ∇f_n + C_2(s) ∇²f_n - C_3(s) ∇³f_n + C_4(s) ∇⁴f_n - C_5(s) ∇⁵f_n
(s positive for interpolation); C_k(s) is the binomial coefficient C(s, k).

    s      C_2(s)     C_3(s)     C_4(s)     C_5(s)
  -1.0    1.00000   -1.00000    1.00000   -1.00000
   -.9     .85500    -.82650     .80584    -.78972
   -.8     .72000    -.67200     .63840    -.61286
   -.7     .59500    -.53550     .49534    -.46562
   -.6     .48000    -.41600     .37440    -.34445
   -.5     .37500    -.31250     .27344    -.24609
   -.4     .28000    -.22400     .19040    -.16755
   -.3     .19500    -.14950     .12334    -.10607
   -.2     .12000    -.08800     .07040    -.05914
   -.1     .05500    -.03850     .02984    -.02447
    0      .00000     .00000     .00000     .00000
    .1    -.04500     .02850    -.02066     .01612
    .2    -.08000     .04800    -.03360     .02554
    .3    -.10500     .05950    -.04016     .02972
    .4    -.12000     .06400    -.04160     .02995
    .5    -.12500     .06250    -.03906     .02734
    .6    -.12000     .05600    -.03360     .02285
    .7    -.10500     .04550    -.02616     .01727
    .8    -.08000     .03200    -.01760     .01126
    .9    -.04500     .01650    -.00866     .00537
   1.0     .00000     .00000     .00000     .00000

Stirling Interpolation

f_s ≈ f_0 + s μδf_0 + C_2(s) δ²f_0 + C_3(s) μδ³f_0 + C_4(s) δ⁴f_0

    s     C_2(s)    C_3(s)     C_4(s)       s
    0     .00000    .00000     .00000       0
    .1    .00500   -.01650†   -.00041     -.1
    .2    .02000   -.03200†   -.00160     -.2
    .3    .04500   -.04550†   -.00341     -.3
    .4    .08000   -.05600†   -.00560     -.4
    .5    .12500   -.06250†   -.00781     -.5
    .6    .18000   -.06400†   -.00960     -.6
    .7    .24500   -.05950†   -.01041     -.7
    .8    .32000   -.04800†   -.00960     -.8
    .9    .40500   -.02850†   -.00641     -.9
   1.0    .50000    .00000     .00000    -1.0
† Change sign when reading s from the right-hand column.
Bessel Interpolation

f_s ≈ μf_{1/2} + (s - ½) δf_{1/2} + C_2(s) μδ²f_{1/2} + C_3(s) δ³f_{1/2} + C_4(s) μδ⁴f_{1/2} + C_5(s) δ⁵f_{1/2}

    s     C_2(s)    C_3(s)     C_4(s)    C_5(s)       s
    0     .00000    .00000     .00000    .00000      1.0
    .1   -.04500    .00600†    .00784   -.00063†      .9
    .2   -.08000    .00800†    .01440   -.00086†      .8
    .3   -.10500    .00700†    .01934   -.00077†      .7
    .4   -.12000    .00400†    .02240   -.00045†      .6
    .5   -.12500    .00000     .02344    .00000       .5
† Change sign when reading s from the right-hand column.

Everett Interpolation

f_s ≈ (1 - s)f_0 + C_2(s) δ²f_0 + C_4(s) δ⁴f_0 + s f_1 + C_2(1 - s) δ²f_1 + C_4(1 - s) δ⁴f_1

    s     C_2(s)    C_4(s)
    0     .00000    .00000
    .1   -.02850    .00455
    .2   -.04800    .00806
    .3   -.05950    .01044
    .4   -.06400    .01165
    .5   -.06250    .01172
    .6   -.05600    .01075
    .7   -.04550    .00890
    .8   -.03200    .00634
    .9   -.01650    .00329
   1.0    .00000    .00000

Steffensen Interpolation

f_s ≈ f_0 + C_1(s) δf_{1/2} + C_2(s) δ³f_{1/2} - C_1(-s) δf_{-1/2} - C_2(-s) δ³f_{-1/2}

    s     C_1(s)    C_2(s)
   -.5   -.12500    .02344
   -.4   -.12000    .02240
   -.3   -.10500    .01934
   -.2   -.08000    .01440
   -.1   -.04500    .00784
    0     .00000    .00000
    .1    .05500   -.00866
    .2    .12000   -.01760
    .3    .19500   -.02616
    .4    .28000   -.03360
    .5    .37500   -.03906
one may use Bessel's interpolation formula (Table 20.5-1) twice to obtain

Z(x, y) = ¼(z_00 + z_10 + z_01 + z_11) + ½(u - ½)(z_10 - z_00 + z_11 - z_01)
        + ½(v - ½)(z_01 - z_00 + z_11 - z_10) + (u - ½)(v - ½)(z_11 - z_10 - z_01 + z_00) + · · ·
          (Bessel's formula for two-way interpolation)   (20.5-10)

Analogous methods apply to functions of three or more variables.
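For the four corner values alone, the leading terms of Eq. (20.5-10) reduce to ordinary bilinear interpolation; the following Python sketch (illustrative, not from the handbook) makes that explicit:

```python
def bessel_two_way(z00, z10, z01, z11, u, v):
    """Leading terms of Eq. (20.5-10).  With only the four corner values,
    this equals bilinear interpolation written in 'difference' form."""
    return (0.25*(z00 + z10 + z01 + z11)
            + 0.5*(u - 0.5)*(z10 - z00 + z11 - z01)
            + 0.5*(v - 0.5)*(z01 - z00 + z11 - z10)
            + (u - 0.5)*(v - 0.5)*(z11 - z10 - z01 + z00))
```

The higher terms of (20.5-10), involving second and higher differences, refine this first approximation.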
20.5-7. Reciprocal Differences and Rational-fraction Interpolation. Given y(x_k) = y_k (k = 0, 1, 2, . . .), where x_0, x_1, x_2, . . . are not necessarily equally spaced, one defines the reciprocal differences

ρ_1(x_0, x_1) = (x_0 - x_1)/(y_0 - y_1)
ρ_2(x_0, x_1, x_2) = (x_0 - x_2)/[ρ_1(x_0, x_1) - ρ_1(x_1, x_2)] + y_1
ρ_r(x_0, x_1, x_2, . . . , x_r) = (x_0 - x_r)/[ρ_{r-1}(x_0, . . . , x_{r-1}) - ρ_{r-1}(x_1, . . . , x_r)]
                                 + ρ_{r-2}(x_1, x_2, . . . , x_{r-1})     (r = 3, 4, . . .)   (20.5-11)

Then

Y(x) = y_1 + (x - x_1)/ρ_1(x, x_1)   (20.5-12a)

where one successively substitutes

ρ_1(x, x_1) = ρ_1(x_1, x_2) + (x - x_2)/[ρ_2(x, x_1, x_2) - y_1]
ρ_2(x, x_1, x_2) = ρ_2(x_1, x_2, x_3) + (x - x_3)/[ρ_3(x, x_1, x_2, x_3) - ρ_1(x_1, x_2)]   . . .   (20.5-12b)

This yields a continued-fraction expansion approximating y(x) by a rational function Y(x) which assumes the given function values y_0, y_1, y_2, . . . for x = x_0, x_1, x_2, . . . (Thiele's Interpolation Formula, see also Ref. 20.36). The continued fraction terminates whenever y(x) is a rational function.
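A compact Python sketch of Eqs. (20.5-11) and (20.5-12) (illustrative names; no provision is made for the divisions by zero that signal a terminating fraction):

```python
def reciprocal_differences(xs, ys):
    """Heads rho_r(x0, ..., xr) of the reciprocal-difference table (20.5-11)."""
    n = len(xs)
    rm2 = [0.0] * (n + 1)       # rho_{-1} taken as zero
    rm1 = list(ys)              # rho_0 = y
    head = [ys[0]]
    for r in range(1, n):
        cur = [(xs[i] - xs[i + r]) / (rm1[i] - rm1[i + 1]) + rm2[i + 1]
               for i in range(n - r)]
        head.append(cur[0])
        rm2, rm1 = rm1, cur
    return head

def thiele_eval(xs, head, x):
    """Evaluate the continued fraction built from the table heads:
    Y = rho0 + (x-x0)/(rho1 + (x-x1)/((rho2-rho0) + ...))."""
    acc = 0.0
    for r in range(len(xs) - 1, 0, -1):
        denom = head[r] - (head[r - 2] if r >= 2 else 0.0)
        acc = (x - xs[r - 1]) / (denom + acc)
    return head[0] + acc
```

For data sampled from the rational function y = (1 + 2x)/(1 + x), three points already reproduce y exactly at other arguments, illustrating the terminating-fraction property.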
20.6. APPROXIMATION BY ORTHOGONAL POLYNOMIALS, TRUNCATED FOURIER SERIES, AND OTHER METHODS
20.6-1. Introduction. In practice, polynomial interpolation is best suited for analytic functions uncorrupted by noise or random errors, since high-order interpolation polynomials tend to follow the latter too closely, and low-order interpolation polynomials may waste essential information. In such cases, it is preferable to employ "smoothed" poly nomial or rational-fraction approximations designed to minimize either a weighted mean-square approximation error or the maximum absolute approximation error over a specified interval a < x < b. By contrast, Taylor-series approximations, which approximate an analytic function closely in the immediate neighborhood of one given point are, in general, useful for numerical approximation only if convergence is extraordinarily rapid.
20.6-2. Least-squares Polynomial Approximation over an Interval (see also Sec. 15.2-6). It is desired to approximate the given function f(x) by

F(x) = a_0φ_0(x) + a_1φ_1(x) + a_2φ_2(x) + · · · + a_nφ_n(x)   (20.6-1)

so as to minimize the weighted mean-square error

e² = ∫_a^b γ(x)[F(x) - f(x)]² dx = min   (20.6-2)

over the expansion interval (a, b), where γ(x) is a given nonnegative weighting function. If the φ_i(x) are orthogonal over (a, b) with respect to the weight γ(x), i.e., if

∫_a^b γ(x)φ_i(x)φ_j(x) dx = 0     (i ≠ j)   (20.6-3)

then the desired coefficients a_i are given by

a_i = [∫_a^b γ(x)f(x)φ_i(x) dx] / [∫_a^b γ(x)φ_i²(x) dx]     (i = 0, 1, 2, . . .)   (20.6-4)

Orthogonal-function approximations, exemplified by orthogonal-polynomial expansions (Secs. 20.6-2 to 20.6-4) and truncated Fourier series (Sec. 20.6-6), have the striking advantage that an improvement of the approximation through addition of an extra term a_{n+1}φ_{n+1}(x) does not affect the previously computed coefficients a_0, a_1, a_2, . . . , a_n. Substitution of x = αz + β, dx = α dz in Eqs. (1) to (4) yields rescaled and/or shifted expansion intervals. Note that computation of the coefficients (4) requires one to know f(x) throughout the entire expansion interval (a, b).
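A numerical sketch of Eqs. (20.6-3) and (20.6-4) in Python, taking the Legendre polynomials as the orthogonal family on (-1, 1) with γ(x) = 1 and a simple midpoint rule for the integrals (all names illustrative):

```python
def legendre(i, x):
    """P_i(x) by the three-term recurrence (k+1)P_{k+1} = (2k+1)xP_k - kP_{k-1}."""
    p0, p1 = 1.0, x
    if i == 0:
        return p0
    for k in range(1, i):
        p0, p1 = p1, ((2*k + 1)*x*p1 - k*p0) / (k + 1)
    return p1

def ls_coeffs(f, n, m=2000):
    """a_i of Eq. (20.6-4) for phi_i = P_i, gamma = 1 on (-1, 1),
    with both integrals done by the midpoint rule over m panels."""
    h = 2.0 / m
    xs = [-1.0 + (j + 0.5)*h for j in range(m)]
    coeffs = []
    for i in range(n + 1):
        num = h * sum(f(x)*legendre(i, x) for x in xs)
        den = h * sum(legendre(i, x)**2 for x in xs)   # exactly 2/(2i+1)
        coeffs.append(num / den)
    return coeffs
```

For f(x) = x², the expansion is x² = (1/3)P_0 + (2/3)P_2, and the computed a_1 is (numerically) zero; appending a_3 later would not change a_0, a_1, a_2.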
20.6-3. Least-squares Polynomial Approximation over a Set of Points. A different type of least-squares approximation of the form (1) requires f(x) to be given only at a discrete set of m + 1 points x_0, x_1, x_2, . . . , x_m. One minimizes the weighted mean-square error

e′² = Σ_{k=0}^{m} γ_k[F(x_k) - f(x_k)]²   (20.6-5)

where the γ_k are given positive weights. This is again relatively simple if the φ_i(x) are orthogonal over the given set of points, i.e., if

Σ_{k=0}^{m} γ_k φ_i(x_k)φ_j(x_k) = 0     (i ≠ j)   (20.6-6)
Such polynomials can be obtained from 1, x, x², . . . through the Gram-Schmidt orthogonalization procedure of Sec. 14.7-4. The coefficients a_i are given by

a_i = [Σ_{k=0}^{m} γ_k f(x_k)φ_i(x_k)] / [Σ_{k=0}^{m} γ_k φ_i²(x_k)]     (i = 0, 1, 2, . . . , n ≤ m)   (20.6-7)

The resulting polynomial will be an interpolation polynomial if n = m; if n < m, addition of extra terms a_iφ_i(x) does not affect the previously computed coefficients.
(a) If the m + 1 points are chosen on (-1, 1) as the Chebyshev abscissas

x_k = cos [(2k + 1)π/(2m + 2)]     (k = 0, 1, 2, . . . , m)   (20.6-8)

the approximation polynomials defined by Eq. (6) for unit weights γ_k = 1 will be the Chebyshev polynomials T_i(x) (see also Sec. 20.5-5).
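A Python sketch of the discrete least-squares fit (20.6-7) with unit weights at the abscissas (20.6-8) (illustrative names, not from the handbook):

```python
import math

def chebyshev_T(i, x):
    """T_i(x) by the recurrence T_{k+1} = 2x T_k - T_{k-1}."""
    t0, t1 = 1.0, x
    if i == 0:
        return t0
    for _ in range(i - 1):
        t0, t1 = t1, 2*x*t1 - t0
    return t1

def discrete_ls_chebyshev(f, n, m):
    """Coefficients (20.6-7), unit weights, over the m + 1 Chebyshev
    abscissas x_k = cos[(2k+1)pi/(2m+2)] of Eq. (20.6-8)."""
    xs = [math.cos((2*k + 1)*math.pi/(2*m + 2)) for k in range(m + 1)]
    coeffs = []
    for i in range(n + 1):
        num = sum(f(x)*chebyshev_T(i, x) for x in xs)
        den = sum(chebyshev_T(i, x)**2 for x in xs)
        coeffs.append(num / den)
    return coeffs
```

For f(x) = x³ the expansion is x³ = (3/4)T_1 + (1/4)T_3, and raising n from 2 to 3 leaves the first three coefficients unchanged, as promised for orthogonal-function fits.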
(b) Evenly Spaced Data (see also Ref. 20.6). If one has m + 1 = 2M + 1 arguments x_k evenly spaced in an expansion interval (a, b), so that

x_k = (a + b)/2 + k Δx     [Δx = (b - a)/(2M);  k = 0, ±1, ±2, . . . , ±M]   (20.6-9)

then one may use the polynomials

φ_i(x) = p_i{[x - (a + b)/2]/Δx, 2M}   (20.6-10)

where the p_i(t, 2M) are the Gram polynomials*

p_i(t, 2M) = Σ_{k=0}^{i} (-1)^{i+k} [(i + k)^{(2k)} (M + t)^{(k)}] / [(k!)² (2M)^{(k)}]
          (i = 0, 1, 2, . . . , 2M;  M = 1, 2, . . .)   (20.6-11a)

with

z^{(k)} ≡ z(z - 1)(z - 2) · · · (z - k + 1)     (k = 1, 2, . . .)          z^{(0)} ≡ 1   (20.6-11b)

and

Σ_{k=-M}^{M} p_i²(k, 2M) = [(2M + i + 1)!(2M - i)!] / {(2i + 1)[(2M)!]²}
          (i = 0, 1, 2, . . . , 2M;  M = 1, 2, . . .)   (20.6-12)

The Gram polynomials of lowest order are

p_0(t, 2M) = 1          p_1(t, 2M) = t/M
p_2(t, 2M) = [3t² - M(M + 1)] / [M(2M - 1)]
p_3(t, 2M) = [5t³ - (3M² + 3M - 1)t] / [M(M - 1)(2M - 1)]
p_4(t, 2M) = [35t⁴ - 5(6M² + 6M - 5)t² + 3M(M² - 1)(M + 2)] / [2M(M - 1)(2M - 1)(2M - 3)]
p_5(t, 2M) = [63t⁵ - 35(2M² + 2M - 3)t³ + (15M⁴ + 30M³ - 35M² - 50M + 12)t]
             / [2M(M - 1)(M - 2)(2M - 1)(2M - 3)]   (20.6-13)

* The normalization (12) chosen is that of Ref. 20.6. Other normalization factors are used by different authors.
In particular, for M = 2 (five data points),

p_0(t) = 1          p_1(t) = ½t          p_2(t) = ½(t² - 2)
p_3(t) = (1/6)(5t³ - 17t)          p_4(t) = (1/12)(35t⁴ - 155t² + 72)   (20.6-14)

If we denote the given function values f(x_k) by f_k, third-degree Gram-polynomial approximation with M = 2 yields the following smoothed approximation-polynomial values F_j ≡ F(x_j):

F_{-2} = (1/70)(69f_{-2} + 4f_{-1} - 6f_0 + 4f_1 - f_2) ≡ f_{-2} - (1/70) δ⁴f_0
F_{-1} = (1/35)(2f_{-2} + 27f_{-1} + 12f_0 - 8f_1 + 2f_2) ≡ f_{-1} + (2/35) δ⁴f_0
F_0 = (1/35)(-3f_{-2} + 12f_{-1} + 17f_0 + 12f_1 - 3f_2) ≡ f_0 - (3/35) δ⁴f_0   (20.6-15)
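The smoothing formulas (20.6-15) in executable form (Python, illustrative; the two right-hand values use the symmetric companions of the tabulated F_{-1}, F_{-2}):

```python
def smooth5(f):
    """Least-squares third-degree (Gram) smoothing over five equally
    spaced points, Eq. (20.6-15); returns (F_-2, F_-1, F_0, F_1, F_2)."""
    fm2, fm1, f0, f1, f2 = f
    d4 = fm2 - 4*fm1 + 6*f0 - 4*f1 + f2      # fourth central difference
    return (fm2 - d4/70.0,
            fm1 + 2.0*d4/35.0,
            f0 - 3.0*d4/35.0,
            f1 + 2.0*d4/35.0,
            f2 - d4/70.0)
```

Data sampled from any cubic are left unchanged (the fourth difference vanishes), while a superimposed fluctuation is attenuated.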
These formulas effect "smoothing by fourth differences" (δ⁴f_0 is defined in Sec. 20.4-1).
20.6-4. Least-maximum-absolute-error Approximations. (a) The computation of the coefficients a_i for an approximation polynomial (1) which minimizes the maximum absolute error |F(x) - f(x)| on (a, b) for
f(x) given either on the entire interval or on a discrete set of points requires a fairly laborious iterative procedure. Chebyshev's classical method is described in Ref. 20.61, while Hastings (Ref. 20.41) has evolved a heuristic method relying heavily on the investigator's intuition in relocating the zero-error points in successive plots of the error F(x) - f(x) by iterative parameter changes. Hastings' method is also applicable to more general nonpolynomial approximations of the form F(x) = F(x; α_1, α_2, . . . , α_n), where α_1, α_2, . . . , α_n are suitably adjusted parameters.
In practice, approximations of the form (1) in terms of Chebyshev polynomials (shifted and rescaled as needed, see also Sec. 20.6-4b), although really derived from least-squares considerations, are so close to the ideal least-absolute-error polynomial approximation that most texts on numerical analysis omit the laborious derivation of the latter. This is easily seen whenever it is reasonable to assume that the error of a Chebyshev-polynomial approximation

F(x) = Σ_{k=0}^{n} a_k T_k(αx)          [T_k(x) = cos kθ, cos θ = x]

is essentially equal to the omitted term a_{n+1}T_{n+1}(αx) for x in (a, b). T_{n+1}(αx) oscillates with amplitude 1 in the expansion interval. It follows that the maximum excursions of the absolute error in the expansion interval are all approximately equal to |a_{n+1}|.
(b) It is often convenient to employ the shifted Chebyshev polynomials T*_n(x) defined by

T*_n(x) ≡ T_n(2x - 1) = cos nθ          (cos θ = 2x - 1;  n = 0, 1, 2, . . .)   (20.6-16)

or by the recurrence

T*_{n+1}(x) = 2(2x - 1)T*_n(x) - T*_{n-1}(x)          (n = 1, 2, . . .)   (20.6-17)

with T*_0(x) = 1, T*_1(x) = 2x - 1 (Refs. 20.4 and 20.12).
(c) Table 20.6-1 lists the first few polynomials T_n(x) and T*_n(x), together with expressions for 1, x, x², . . . in terms of these polynomials (see also Sec. 21.7-4). Tables 20.6-2 to 20.6-4 list some useful polynomial approximations.
20.6-5. Economization of Power Series (Refs. 20.4 and 20.12). If computation of orthogonal-function-expansion coefficients is laborious, but a truncated-power-series approximation to f(x) is readily available, we can reduce its degree with little loss of accuracy by expressing higher
powers of x in terms of lower powers and a Chebyshev polynomial. For
Table 20.6-1. Chebyshev Polynomials T_n(x) and T*_n(x), and Powers of x

T_0 = 1                                        1 = T_0
T_1 = x                                        x = T_1
T_2 = 2x² - 1                                  x² = ½(T_0 + T_2)
T_3 = 4x³ - 3x                                 x³ = ¼(3T_1 + T_3)
T_4 = 8x⁴ - 8x² + 1                            x⁴ = ⅛(3T_0 + 4T_2 + T_4)
T_5 = 16x⁵ - 20x³ + 5x                         x⁵ = (1/16)(10T_1 + 5T_3 + T_5)
T_6 = 32x⁶ - 48x⁴ + 18x² - 1                   x⁶ = (1/32)(10T_0 + 15T_2 + 6T_4 + T_6)
T_7 = 64x⁷ - 112x⁵ + 56x³ - 7x                 x⁷ = (1/64)(35T_1 + 21T_3 + 7T_5 + T_7)
T_8 = 128x⁸ - 256x⁶ + 160x⁴ - 32x² + 1         x⁸ = (1/128)(35T_0 + 56T_2 + 28T_4 + 8T_6 + T_8)
T_9 = 256x⁹ - 576x⁷ + 432x⁵ - 120x³ + 9x       x⁹ = (1/256)(126T_1 + 84T_3 + 36T_5 + 9T_7 + T_9)

T*_0 = 1                                       1 = T*_0
T*_1 = 2x - 1                                  x = ½(T*_0 + T*_1)
T*_2 = 8x² - 8x + 1                            x² = ⅛(3T*_0 + 4T*_1 + T*_2)
T*_3 = 32x³ - 48x² + 18x - 1                   x³ = (1/32)(10T*_0 + 15T*_1 + 6T*_2 + T*_3)
T*_4 = 128x⁴ - 256x³ + 160x² - 32x + 1         x⁴ = (1/128)(35T*_0 + 56T*_1 + 28T*_2 + 8T*_3 + T*_4)
example, the left-hand side of Table 20.6-1 yields

x⁹ = (1/256)(576x⁷ - 432x⁵ + 120x³ - 9x) + (1/256)T_9(x)

The last term fluctuates with the relatively small amplitude 1/256 in (-1, 1). We omit the last term and substitute the rest for x⁹, leaving a seventh-degree polynomial; the procedure may then be repeated for x⁷, etc.
Instead, we could express all powers in terms of Chebyshev polynomials and neglect the higher-degree polynomials, which will tend to have small coefficients. The resulting expansions on (0, 1) or (-1, 1) are, in general, not identical with the true least-squares polynomial expansion of f(x), but the method is convenient.
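One economization step in Python (an illustrative sketch): the x⁹ term of a ninth-degree polynomial on (-1, 1) is replaced via the T_9 row of Table 20.6-1, so the value changes by at most |c_9|/256 anywhere in the interval.

```python
# T9(x) coefficients, constant term first (from Table 20.6-1)
T9 = [0, 9, 0, -120, 0, 432, 0, -576, 0, 256]

def evalpoly(coeffs, x):
    """Horner evaluation; coefficient list is constant term first."""
    y = 0.0
    for c in reversed(coeffs):
        y = y * x + c
    return y

def economize_x9(poly):
    """poly: degree-9 coefficient list, constant term first.  Uses
    x^9 = (T9 + 576x^7 - 432x^5 + 120x^3 - 9x)/256 and drops T9/256."""
    assert len(poly) == 10
    c9 = poly[9]
    return [c - c9 * t / 256.0 for c, t in zip(poly[:9], T9[:9])]
```

The same step can then be applied to the resulting x⁷ term with T_7, and so on, each time trading a small, equal-ripple error for one degree.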
20.6-6. Numerical Harmonic Analysis and Trigonometric Interpolation (see also Secs. 4.11-4b and 20.6-1). (a) Given m function values y(x_k) = y_k for x_k = kT/m (k = 0, 1, 2, . . . , m - 1), it is desired to approximate y(x) within the interval (0, T) by a trigonometric polynomial

Y(x) = ½A_0 + Σ_{j=1}^{n} [A_j cos (2πjx/T) + B_j sin (2πjx/T)]   (20.6-18)

so as to minimize the mean-square error Σ_{k=0}^{m-1} [Y(x_k) - y_k]². The required
Table 20.6-2. Some Polynomial Approximations

f(x) = e^{-x} (0 ≤ x ≤ 1):
  F(x) = 1 + a_1x + a_2x²; a_1 = -.9664, a_2 = .3536; max abs error 3 × 10⁻³
  F(x) = 1 + a_1x + a_2x² + a_3x³ + a_4x⁴; a_1 = -.99986 84, a_2 = .49829 26, a_3 = -.15953 32,
    a_4 = .02936 41; max abs error 3 × 10⁻⁵
  F(x) = 1 + a_1x + · · · + a_7x⁷; a_1 = -.99999 99995, a_2 = .49999 99206, a_3 = -.16666 53019,
    a_4 = .04165 73475, a_5 = -.00830 13598, a_6 = .00132 98820, a_7 = -.00014 13161;
    max abs error 2 × 10⁻¹⁰

f(x) = 10^x (0 ≤ x ≤ 1):
  F(x) = (1 + a_1x + a_2x² + a_3x³ + a_4x⁴)²; a_1 = 1.14991 96, a_2 = .67743 23, a_3 = .20800 30,
    a_4 = .12680 89; max abs error 7 × 10⁻⁴
  F(x) = (1 + a_1x + · · · + a_7x⁷)²; a_1 = 1.15129 277603, a_2 = .66273 088429, a_3 = .25439 357484,
    a_4 = .07295 173666, a_5 = .01742 111988, a_6 = .00255 491796, a_7 = .00093 264267;
    max abs error 5 × 10⁻⁸

f(x) = log_e (1 + x) (0 ≤ x ≤ 1):
  F(x) = a_1x + a_2x² + a_3x³ + a_4x⁴ + a_5x⁵; a_1 = .99949 556, a_2 = -.49190 896, a_3 = .28947 478,
    a_4 = -.13606 275, a_5 = .03215 845; max abs error 10⁻⁵
  F(x) = a_1x + · · · + a_8x⁸; a_1 = .99999 64239, a_2 = -.49987 41238, a_3 = .33179 90258,
    a_4 = -.24073 38084, a_5 = .16765 40711, a_6 = -.09532 93897, a_7 = .03608 84937,
    a_8 = -.00645 35442; max abs error 3 × 10⁻⁸

(References: Hastings, Ref. 20.41; Carlson and Goldstein, Ref. 20.38.)
f(x) = sin x / x (0 ≤ x ≤ π/2):
  F = 1 + a_2x² + a_4x⁴; a_2 = -.16605, a_4 = .00761; max abs error 2 × 10⁻⁴
  F = 1 + a_2x² + · · · + a_10x¹⁰; a_2 = -.16666 66664, a_4 = .00833 33315, a_6 = -.00019 84090,
    a_8 = .00000 27526, a_10 = -.00000 00239; max abs error 2 × 10⁻⁹

f(x) = cos x (0 ≤ x ≤ π/2):
  F = 1 + a_2x² + a_4x⁴; a_2 = -.49670, a_4 = .03705; max abs error 9 × 10⁻⁴
  F = 1 + a_2x² + · · · + a_10x¹⁰; a_2 = -.49999 99963, a_4 = .04166 66418, a_6 = -.00138 88397,
    a_8 = .00002 47609, a_10 = -.00000 02605; max abs error 2 × 10⁻⁹

f(x) = tan x / x (0 ≤ x ≤ π/4):
  F = 1 + a_2x² + a_4x⁴; a_2 = .31755, a_4 = .20330; max abs error 10⁻³
  F = 1 + a_2x² + · · · + a_12x¹²; a_2 = .33333 14036, a_4 = .13339 23995, a_6 = .05337 40603,
    a_8 = .02456 50893, a_10 = .00290 05250, a_12 = .00951 68091; max abs error 2 × 10⁻⁸

f(x) = x cot x (0 ≤ x ≤ π/4):
  F = 1 + a_2x² + a_4x⁴; a_2 = -.332867, a_4 = -.024369; max abs error 3 × 10⁻⁵
  F = 1 + a_2x² + · · · + a_10x¹⁰; a_2 = -.33333 33410, a_4 = -.02222 20287, a_6 = -.00211 77168,
    a_8 = -.00020 78504, a_10 = -.00002 62619; max abs error 4 × 10⁻¹⁰

(Carlson and Goldstein, Ref. 20.38.)
f(x) = arcsin x (0 ≤ x ≤ 1):
  F = π/2 - (1 - x)^{1/2}(a_0 + a_1x + a_2x² + a_3x³); a_0 = 1.57072 88, a_1 = -.21211 44,
    a_2 = .07426 10, a_3 = -.01872 93; max abs error 5 × 10⁻⁵
  F = π/2 - (1 - x)^{1/2}(a_0 + a_1x + · · · + a_7x⁷); a_0 = 1.57079 63050, a_1 = -.21459 88016,
    a_2 = .08897 89874, a_3 = -.05017 43046, a_4 = .03089 18810, a_5 = -.01708 81256,
    a_6 = .00667 00901, a_7 = -.00126 24911; max abs error 2 × 10⁻⁸

f(x) = arctan x (-1 ≤ x ≤ 1):
  F = a_1x + a_3x³ + a_5x⁵ + a_7x⁷ + a_9x⁹; a_1 = .99986 60, a_3 = -.33029 95, a_5 = .18014 10,
    a_7 = -.08513 30, a_9 = .02083 51; max abs error 10⁻⁵
  F = x(1 + a_2x² + · · · + a_16x¹⁶); a_2 = -.33333 14528, a_4 = .19993 55085, a_6 = -.14208 89944,
    a_8 = .10656 26393, a_10 = -.07528 96400, a_12 = .04290 96138, a_14 = -.01616 57367,
    a_16 = .00286 62257; max abs error 2 × 10⁻⁸

f(x) = Γ(x + 1) = x! (0 ≤ x ≤ 1):
  F = 1 + a_1x + · · · + a_5x⁵; a_1 = -.57486 46, a_2 = .95123 63, a_3 = -.69985 88,
    a_4 = .42455 49, a_5 = -.10106 78; max abs error 5 × 10⁻⁵
  F = 1 + a_1x + · · · + a_8x⁸; a_1 = -.57719 1652, a_2 = .98820 5891, a_3 = -.89705 6937,
    a_4 = .91820 6857, a_5 = -.75670 4078, a_6 = .48219 9394, a_7 = -.19352 7818,
    a_8 = .03586 8343; max abs error 3 × 10⁻⁷

(Hastings, Ref. 20.41; Carlson and Goldstein, Ref. 20.38.)
Table 20.6-3. Some Approximations for Cylinder Functions*

J_0(x) = 1 - 2.24999 97(x/3)² + 1.26562 08(x/3)⁴ - .31638 66(x/3)⁶ + .04444 79(x/3)⁸
         - .00394 44(x/3)¹⁰ + .00021 00(x/3)¹² + ε     (-3 ≤ x ≤ 3)     |ε| < 5 × 10⁻⁸

N_0(x) = (2/π) log_e (½x) J_0(x) + .36746 691 + .60559 366(x/3)² - .74350 384(x/3)⁴
         + .25300 117(x/3)⁶ - .04261 214(x/3)⁸ + .00427 916(x/3)¹⁰ - .00024 846(x/3)¹² + ε
         (0 < x ≤ 3)     |ε| < 1.4 × 10⁻⁸

For 3 ≤ x < ∞:  J_0(x) = x^{-1/2} f_0 cos θ_0,  N_0(x) = x^{-1/2} f_0 sin θ_0, with
f_0 = .79788 456 - .00000 077(3/x) - .00552 740(3/x)² - .00009 512(3/x)³ + .00137 237(3/x)⁴
      - .00072 805(3/x)⁵ + .00014 476(3/x)⁶ + ε     |ε| < 1.6 × 10⁻⁸
θ_0 = x - .78539 816 - .04166 397(3/x) - .00003 954(3/x)² + .00262 573(3/x)³ - .00054 125(3/x)⁴
      - .00029 333(3/x)⁵ + .00013 558(3/x)⁶ + ε′     |ε′| < 7 × 10⁻⁸

x⁻¹J_1(x) = ½ - .56249 985(x/3)² + .21093 573(x/3)⁴ - .03954 289(x/3)⁶ + .00443 319(x/3)⁸
            - .00031 761(x/3)¹⁰ + .00001 109(x/3)¹² + ε     (-3 ≤ x ≤ 3)     |ε| < 1.3 × 10⁻⁸

xN_1(x) = (2/π)x log_e (½x) J_1(x) - .63661 98 + .22120 91(x/3)² + 2.16827 09(x/3)⁴
          - 1.31648 27(x/3)⁶ + .31239 51(x/3)⁸ - .04009 76(x/3)¹⁰ + .00278 73(x/3)¹² + ε
          (0 < x ≤ 3)     |ε| < 1.1 × 10⁻⁷

For 3 ≤ x < ∞:  J_1(x) = x^{-1/2} f_1 cos θ_1,  N_1(x) = x^{-1/2} f_1 sin θ_1, with
f_1 = .79788 456 + .00000 156(3/x) + .01659 667(3/x)² + .00017 105(3/x)³ - .00249 511(3/x)⁴
      + .00113 653(3/x)⁵ - .00020 033(3/x)⁶ + ε     |ε| < 4 × 10⁻⁸
θ_1 = x - 2.35619 449 + .12499 612(3/x) + .00005 650(3/x)² - .00637 879(3/x)³ + .00074 348(3/x)⁴
      + .00079 824(3/x)⁵ - .00029 166(3/x)⁶ + ε′     |ε′| < 9 × 10⁻⁸

* From E. E. Allen, Analytical Approximations, Math. Tables and Other Aids to Computation, 8: 240-241 (1954); Polynomial Approximations to Some Modified Bessel Functions, Math. Tables and Other Aids to Computation, 10: 162-164 (1956) (with permission). Formula arrangement and error bounds from Ref. 20.1.
Table 20.6-4. Chebyshev-polynomial Approximations*
[T*_n(x) ≡ cos nθ with cos θ = 2x - 1]

e^{-x} (0 ≤ x ≤ 1):   Σ_n A_n T*_n(x)
  A_0 = .64503 5270    A_1 = -.31284 1606   A_2 = .03870 4116    A_3 = -.00320 8683
  A_4 = .00019 9919    A_5 = -.00000 9975   A_6 = .00000 0415    A_7 = -.00000 0015

e^x (0 ≤ x ≤ 1):   Σ_n A_n T*_n(x)
  A_0 = 1.75338 7654   A_1 = .85039 1654    A_2 = .10520 8694    A_3 = .00872 2105
  A_4 = .00054 3437    A_5 = .00002 7115    A_6 = .00000 1128    A_7 = .00000 0040
  A_8 = .00000 0001

log_e (1 + x) (0 ≤ x ≤ 1):   Σ_n A_n T*_n(x)
  A_0 = .37645 2813    A_1 = .34314 5750    A_2 = -.02943 7252   A_3 = .00336 7089
  A_4 = -.00043 3276   A_5 = .00005 9471    A_6 = -.00000 8503   A_7 = .00000 1250
  A_8 = -.00000 0188   A_9 = .00000 0029    A_10 = -.00000 0004  A_11 = .00000 0001

cos ½πx (-1 ≤ x ≤ 1):   Σ_n A_n T*_n(x²)
  A_0 = .47200 1216    A_1 = -.49940 3258   A_2 = .02799 2080    A_3 = -.00059 6695
  A_4 = .00000 6704    A_5 = -.00000 0047

sin ½πx (-1 ≤ x ≤ 1):   x Σ_n A_n T*_n(x²)
  A_0 = 1.27627 8962   A_1 = -.28526 1569   A_2 = .00911 8016    A_3 = -.00013 6587
  A_4 = .00000 1185    A_5 = -.00000 0007

arctan x (-1 ≤ x ≤ 1):   x Σ_n A_n T*_n(x²)
  A_0 = .88137 3587    A_1 = -.10589 2925   A_2 = .01113 5843    A_3 = -.00138 1195
  A_4 = .00018 5743    A_5 = -.00002 6215   A_6 = .00000 3821    A_7 = -.00000 0570
  A_8 = .00000 0086    A_9 = -.00000 0013   A_10 = .00000 0002
  For |x| > 1, use arctan x = ½π - arctan (1/x)

arcsin x (-½√2 ≤ x ≤ ½√2):   x Σ_n A_n T*_n(2x²)
  A_0 = 1.05123 1959   A_1 = .05494 6487    A_2 = .00408 0631    A_3 = .00040 7890
  A_4 = .00004 6985    A_5 = .00000 5881    A_6 = .00000 0777    A_7 = .00000 0107
  A_8 = .00000 0015    A_9 = .00000 0002
  Note: arccos x = π/2 - arcsin x. For ½√2 < x < 1, use arcsin x = arccos (1 - x²)^{1/2},
  arccos x = arcsin (1 - x²)^{1/2}

* Numerical data from C. W. Clenshaw, Polynomial Approximations to Elementary Functions, Math. Tables and Other Aids to Computation, 8: 143 (1954) (with permission). For more extensive tables, see C. W. Clenshaw, Chebyshev Series for Mathematical Functions, in Natl. Phys. Lab. Math. Tables, vol. 5, London, 1962.
coefficients A_j, B_j are

A_j = (2/m) Σ_{k=0}^{m-1} y_k cos (2πjk/m)          B_j = (2/m) Σ_{k=0}^{m-1} y_k sin (2πjk/m)   (20.6-19)

In the special case n = m/2, Eqs. (18) and (19) together with

A_{m/2} = (1/m) Σ_{k=0}^{m-1} (-1)^k y_k   (20.6-20)

yield Y(x_k) = y(x_k) (trigonometric interpolation) for arbitrary B_{m/2}.
See Ref. 20.62 for similar formulas applicable when the xk are not equidistant. Numerical methods for multidimensional Fourier analyses and syntheses are discussed in Ref. 20.2. Reference 20.70 discusses "fast" Fourier-analysis routines.
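The least-squares coefficients (20.6-19) in Python (an illustrative sketch; for large m one would use an FFT routine instead of the direct sums):

```python
import math

def fourier_coeffs(y, n):
    """A_j, B_j of Eq. (20.6-19) from m samples y_k = y(kT/m)."""
    m = len(y)
    A = [2.0/m * sum(y[k]*math.cos(2*math.pi*j*k/m) for k in range(m))
         for j in range(n + 1)]
    B = [2.0/m * sum(y[k]*math.sin(2*math.pi*j*k/m) for k in range(m))
         for j in range(n + 1)]
    return A, B

# demo samples of y(x) = 3 + 2 cos(2*pi*x/T) + 5 sin(4*pi*x/T) at m = 12 points
demo_y = [3.0 + 2.0*math.cos(2.0*math.pi*k/12.0) + 5.0*math.sin(4.0*math.pi*k/12.0)
          for k in range(12)]
demo_A, demo_B = fourier_coeffs(demo_y, 3)
```

For the demo samples, the recovered coefficients are A_0 = 6 (so ½A_0 = 3), A_1 = 2, and B_2 = 5, with all others numerically zero.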
(b) The 12-ordinate Scheme. The calculation of the sums (19) is simplified whenever m is divisible by 4. Table 20.6-5 shows a convenient computation scheme for m = 12. If no harmonics higher than the third are required, note the simpler formulas

6A_0 = y_0 + y_1 + y_2 + · · · + y_11          6A_3 = y_0 - y_2 + y_4 - y_6 + y_8 - y_10
4A_2 = y_0 - y_3 + y_6 - y_9                   6B_3 = y_1 - y_3 + y_5 - y_7 + y_9 - y_11
A_1 = ½(y_0 - y_6) - A_3                       B_1 = ½(y_3 - y_9) + B_3   (20.6-21)

and

4B_2 = y(T/8) - y(3T/8) + y(5T/8) - y(7T/8)   (20.6-22)

The four additional function values required for Eq. (22) can often be read directly from a graph of y(x) vs. x. See Ref. 20.62 for a 24-ordinate scheme.
(c) Determination of Unknown Periodic Components (Prony's Method). Given N values f_0 = f(0), f_1 = f(1), f_2 = f(2), . . . , f_{N-1} = f(N - 1) of an empirical function f(u) assumed to have the form

f(u) = Σ_{k=1}^{m} (A_k cos ω_k u + B_k sin ω_k u)

the quantities cos ω_1, cos ω_2, . . . , cos ω_m are the roots of an algebraic equation of degree m whose coefficients a_k must satisfy the linear equations

ε_t ≡ Σ_{k=1}^{m-1} (f_{t+k-1} + f_{2m+t-k-1})a_k + f_{m+t-1}a_m - f_{t-1} - f_{2m+t-1} = 0
          (t = 1, 2, . . . , N - 2m)

For the best least-squares fit, solve the m linear equations

Σ_{t=1}^{N-2m} ε_t (∂ε_t/∂a_k) = 0          (k = 1, 2, . . . , m)
Table 20.6-5. 12-ordinate Scheme for Harmonic Analysis [y_k = y(kT/12); refer to Sec. 20.6-6]
[Computation worksheet not reproduced. In successive lines, the scheme forms the sums s_k = y_k + y_{12-k} and differences d_k = y_k - y_{12-k} of the given ordinates, then sums and differences of these in turn; the final columns combine the results (partly through multiplication by √3/2 and ½) to yield 6A_0, 6A_1, . . . , 6A_6 and 6B_1, 6B_2, . . . , 6B_5.]
for the m coefficients a_k. Once the ω_k are known, it is relatively easy to find the A_k, B_k for the best least-squares fit in the manner indicated in Sec. 20.6-6a.
The frequencies of sinusoidal components in statistical time series can often be identified by inspection of the empirically determined autocorrelation function (Secs. 18.10-9 and 19.8-3c).
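For m = 1 Prony's equations collapse to a single unknown, which makes the idea easy to see in code (Python sketch, illustrative; noise-free samples at unit spacing are assumed):

```python
import math

def prony_single_frequency(samples):
    """m = 1 case: f(u) = A cos(wu) + B sin(wu) satisfies the recurrence
    f_{t-1} + f_{t+1} = 2 cos(w) f_t; estimate cos(w) by least squares."""
    num = sum(samples[t] * (samples[t - 1] + samples[t + 1])
              for t in range(1, len(samples) - 1))
    den = sum(2.0 * samples[t]**2 for t in range(1, len(samples) - 1))
    return math.acos(num / den)

# demo: samples of f(u) = cos(0.7u) + 0.3 sin(0.7u), u = 0, 1, ..., 19
demo_f = [math.cos(0.7*t) + 0.3*math.sin(0.7*t) for t in range(20)]
```

With noise-free data the recurrence holds exactly, so the recovered frequency is ω = 0.7 to machine accuracy; with m > 1 the same idea leads to the mth-degree equation of the text.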
20.6-7. Miscellaneous Approximations. (a) More general approximation methods are not restricted to sums of the form (1) but employ a rational or otherwise readily computable approximation function F(x) = F(x; α_1, α_2, . . . , α_n) with parameters α_1, α_2, . . . , α_n determined so as to minimize a weighted mean-square error or the maximum absolute approximation error.
(b) The Padé method (Refs. 20.15 and 20.16) produces rational-fraction approximations (Padé-table entries)

R_{mn}(x) ≡ (a_0 + a_1 x + · · · + a_m x^m)/(1 + b_1 x + · · · + b_n x^n)    (m, n = 0, 1, 2, . . .)    (20.6-23)
to a given suitably differentiable function f(x) by matching the m + n + 1 polynomial coefficients of the identity

[f(0) + f′(0)x + (1/2!)f″(0)x² + · · · + (1/(m + n)!)f^{(m+n)}(0)x^{m+n}](1 + b_1 x + · · · + b_n x^n) = a_0 + a_1 x + · · · + a_m x^m    (20.6-24)
The resulting equations determine the m + n + 1 coefficients a_i and b_k, if a solution exists. It is understood that the numerator and denominator in Eq. (23) have no common factor. The R_{m0}(x) are simply truncated MacLaurin series (Sec. 4.10-4).
EXAMPLE: For f(x) = e^x, R_{nm}(x) = 1/R_{mn}(−x), and

R_{40}(x) ≡ 1 + x + (1/2)x² + (1/6)x³ + (1/24)x⁴        R_{22}(x) = (12 + 6x + x²)/(12 − 6x + x²)
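The matching conditions (20.6-24) amount to a small linear system. A hedged Python sketch (the function `pade` and its calling convention are assumptions of this illustration) reproduces the R_{22} entry for e^x:

```python
import numpy as np
from math import factorial

# Padé coefficients from the matching identity (20.6-24): equate the
# coefficients of x^0 .. x^{m+n} in (Taylor series)*(denominator) = numerator.
def pade(c, m, n):
    """c: Taylor coefficients c[0..m+n] of f; returns (a, b) with b[0] = 1."""
    # Rows m+1 .. m+n have vanishing right-hand sides; solve them for b1..bn.
    A = np.array([[c[m + i - j] if 0 <= m + i - j else 0.0
                   for j in range(1, n + 1)] for i in range(1, n + 1)])
    rhs = -np.array([c[m + i] for i in range(1, n + 1)])
    b = np.concatenate(([1.0], np.linalg.solve(A, rhs)))
    # Rows 0 .. m then give the numerator coefficients directly.
    a = np.array([sum(b[j] * c[k - j] for j in range(min(k, n) + 1))
                  for k in range(m + 1)])
    return a, b

c = [1 / factorial(k) for k in range(5)]          # Taylor coefficients of e^x
a, b = pade(c, 2, 2)
print(np.round(12 * a, 6), np.round(12 * b, 6))   # numerator and denominator
                                                  # of R22, scaled by 12
```

Scaling by 12 exhibits the handbook's R_{22}(x) = (12 + 6x + x²)/(12 − 6x + x²).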
(c) Padé approximations, like truncated Taylor or MacLaurin series, suffer from errors increasing with distance from a specific expansion point. By contrast, Maehly's method (Refs. 20.15, 20.42, and 20.43) derives rational-fraction approximations as ratios of two truncated Chebyshev series. As another alternative, Ref. 20.41 describes a quite generally applicable heuristic iteration method for finding minimum-absolute-error approximations of the general form F(x; α_1, α_2, . . . , α_n). Further discussions of the use of approximation methods in connection with digital computation will be found in Refs. 20.38 to 20.46. The optimal form of the approximation chosen depends not only on the function to be approximated but also on the hardware and software configuration of the computer used. A number of examples of various types are presented in Table 20.6-6. In general, approximations computed for successive expansion intervals are "pieced together," and it is again possible to trade expansion-interval size for arithmetic complication. For frequently used, well-known functions, an approximation routine is, in general, substantially faster as well as more economical of computer storage than interpolation from a stored table. For digital computers incorporating fast division hardware, rational-function approximations are, in general, superior to polynomial approximations.
Table 20.6-6. Miscellaneous Approximations*
[The table does not survive extraction. It lists polynomial and rational approximations F(x), together with numerical parameter values and maximum absolute errors (ranging from about 10⁻³ to 10⁻⁸), for functions including log_10 x, arctan x, erf x, and the normal-distribution integral (1/√(2π))∫ e^{−λ²/2} dλ, on intervals such as −1 ≤ x ≤ 1 and 0 ≤ x < ∞.]
* Formulas from C. Hastings, Approximations for Digital Computers, Princeton University Press, Princeton, N.J.; copyright The Rand Corporation, Santa Monica, Calif., 1955.
20.7-1
NUMERICAL CALCULATIONS
770
20.7. NUMERICAL DIFFERENTIATION AND INTEGRATION
20.7-1. Numerical Differentiation. Numerical differentiation is subject to errors due to insufficient data, truncation, etc., and should be used with caution.
(a) Use of Difference Tables (equidistant argument values, see also Sec. 20.4-1). For suitably differentiable operands y(x),
d/dx ≡ D = (1/Δx) log_e (1 + Δ) = (1/Δx)[Δ − (1/2)Δ² + (1/3)Δ³ − · · ·]    (20.7-1)
so that, if x_k = x_0 + k Δx (k = 0, ±1, ±2, . . .),
y′_k = y′(x_k) = Dy_k = (1/Δx)[Δy_k − (1/2)Δ²y_k + (1/3)Δ³y_k − · · ·]
y″_k = y″(x_k) = D²y_k = (1/Δx)²[Δ²y_k − Δ³y_k + (11/12)Δ⁴y_k − (5/6)Δ⁵y_k + · · ·]    (20.7-2)

Differentiation of Stirling's and Bessel's interpolation formulas (Table 20.5-1) yields, respectively,
y′_k = (1/Δx)[μδy_k − (1/6)μδ³y_k + (1/30)μδ⁵y_k − · · ·]
y″_k = (1/Δx)²[δ²y_k − (1/12)δ⁴y_k + (1/90)δ⁶y_k − · · ·]    (20.7-3)

y′_{k+½} = (1/Δx)[δy_{k+½} − (1/24)δ³y_{k+½} + · · ·]
y″_{k+½} = (1/Δx)²[μδ²y_{k+½} − · · ·]    (20.7-4)
Many similar formulas, and also formulas for approximating higher-order derivatives, may be derived by differentiation of suitable interpolation formulas (see also Fig. 20.9-1). Note also the following explicit three-point differentiation formulas with error estimates:
y′_{−1} = (1/2Δx)(−3y_{−1} + 4y_0 − y_1) + (Δx²/3)y‴(ξ)
y′_0 = (1/2Δx)(−y_{−1} + y_1) − (Δx²/6)y‴(ξ)
y′_1 = (1/2Δx)(y_{−1} − 4y_0 + 3y_1) + (Δx²/3)y‴(ξ)    (20.7-5)

where x_{−1} < ξ < x_1. See Ref. 20.17 for analogous five-point formulas.
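The three formulas (20.7-5) can be sketched numerically as follows (Python; the test function y = sin x and Δx = 0.01 are choices of this illustration):

```python
import math

# The three explicit three-point formulas (20.7-5), applied to y = sin x
# near x = 1 with dx = 0.01; each carries an O(dx^2) error term.
dx = 0.01
y = [math.sin(1.0 - dx), math.sin(1.0), math.sin(1.0 + dx)]

d_left   = (-3 * y[0] + 4 * y[1] - y[2]) / (2 * dx)   # y' at the left point
d_center = (-y[0] + y[2]) / (2 * dx)                  # y' at the middle point
d_right  = (y[0] - 4 * y[1] + 3 * y[2]) / (2 * dx)    # y' at the right point

print(abs(d_center - math.cos(1.0)) < 1e-4)           # True
```

Note the central formula's error coefficient (1/6) is half that of the one-sided end formulas, matching the error terms in (20.7-5).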
(b) Use of Divided Differences (see also Sec. 20.5-2b). Given the function values y(x_k) for a set of not necessarily equally spaced argument values x_0, x_1, x_2, . . . , differentiation of Newton's interpolation formula (20.5-3) yields

y^{(r)}(x) ≈ F_r^{(r)}(x)Δ_r(x_0, x_1, . . . , x_r) + F_{r+1}^{(r)}(x)Δ_{r+1}(x_0, x_1, . . . , x_{r+1}) + · · ·    (r = 0, 1, 2, . . .)

with

F_j(x) ≡ Π_{k=0}^{j−1} (x − x_k)    (j = 0, 1, 2, . . .)    (20.7-6)

See Ref. 20.6 for a discussion of the errors due to the interpolation-polynomial approximation.
(c) Numerical Differentiation after Smoothing. The following formulas are based on differentiation of smoothed polynomial approximations rather than on interpolation polynomials and may thus be less affected by random errors in empirical data (Ref. 20.12):

y′_i ≈ [Σ_{k=−n}^{n} k y_{i+k}] / [2Δx Σ_{k=1}^{n} k²]    (20.7-7)

For n = 2, this yields

y′_i ≈ (1/10Δx)(−2y_{i−2} − y_{i−1} + y_{i+1} + 2y_{i+2})    (20.7-8)
Some other formulas are

y′_k ≈ (1/12Δx)(3y_{k+1} + 10y_k − 18y_{k−1} + 6y_{k−2} − y_{k−3})
y′_k ≈ (1/12Δx)[(y_{k−2} − y_{k+2}) − 8(y_{k−1} − y_{k+1})]
y′_k ≈ (1/12Δx)(y_{k+3} − 6y_{k+2} + 18y_{k+1} − 10y_k − 3y_{k−1})    (20.7-9)
(d) Approximate Differentiation of Truncated Fourier Series. If y(x) is approximated by an (n + 1)-term truncated Fourier series (20.6-18), Lanczos (Ref. 20.12) suggests estimation of y′(x) from

y′(x) ≈ [y(x + π/n) − y(x − π/n)] / (2π/n)    (20.7-10)
20.7-2. Numerical Integration Using Equally Spaced Argument Values. (a) Newton-Cotes Quadrature Formulas. Quadrature formulas of the closed Newton-Cotes type (Table 20.7-1) use the approximation

∫_{x_0}^{x_0+nΔx} y(x) dx ≈ α_0 y_0 + α_1 y_1 + α_2 y_2 + · · · + α_n y_n    with
α_k = [(−1)^{n−k} Δx / (k!(n − k)!)] ∫_0^n [λ(λ − 1)(λ − 2) · · · (λ − n)/(λ − k)] dλ    (20.7-11)
where the y_k = y(x_k) are given function values for n + 1 equally spaced argument values x_k = x_0 + k Δx (k = 0, 1, 2, . . . , n); the resulting error vanishes if y(x) is a polynomial of degree not greater than n. Instead of using values of n > 6, one adds m sums (11) of n ≤ 6 terms for successive subintervals:

∫_{x_0}^{x_0+mnΔx} y(x) dx = ∫_{x_0}^{x_0+nΔx} y(x) dx + ∫_{x_0+nΔx}^{x_0+2nΔx} y(x) dx + · · ·    (20.7-12)
(b) Gregory's Formula. The "symmetrical" formula

∫_{x_0}^{x_0+nΔx} y(x) dx ≈ Δx[(1/2)y_0 + y_1 + y_2 + · · · + y_{n−1} + (1/2)y_n
    + (1/12)(Δy_0 − Δy_{n−1}) − (1/24)(Δ²y_0 + Δ²y_{n−2})
    + (19/720)(Δ³y_0 − Δ³y_{n−3}) − (3/160)(Δ⁴y_0 + Δ⁴y_{n−4}) ± · · ·]
    (Gregory's quadrature formula)    (20.7-13)

adds correction terms to the trapezoidal rule of Table 20.7-1. The
Table 20.7-1. Quadrature Formulas of the Closed Newton-Cotes Type
[y_k = y(x_k) = y(x_0 + k Δx), k = 0, 1, 2, . . . , n; I = ∫_{x_0}^{x_0+nΔx} y(x) dx ≈ I′; add analogous expressions for m successive subintervals; x_0 < ξ < x_0 + n Δx]

1. Trapezoidal rule (n = 1):
   I′ = (Δx/2)(y_0 + y_1)        error I − I′ = −(1/12)(nΔx)³ y″(ξ)
2. Simpson's rule (n = 2):
   I′ = (Δx/3)(y_0 + 4y_1 + y_2)        error I − I′ = −(1/90)(nΔx/2)⁵ y⁗(ξ)
3. Weddle's rule (n = 6):*
   I′ = (3/10)Δx(y_0 + 5y_1 + y_2 + 6y_3 + y_4 + 5y_5 + y_6)        error I − I′ = O[(Δx)⁷]

* In Weddle's rule, the correct Newton-Cotes coefficient 41/140 of δ⁶y_0 has been replaced by 42/140 = 3/10.
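The repeated-subinterval use of Simpson's rule (Table 20.7-1 with Eq. 20.7-12) can be sketched as follows (Python; `simpson` and the test integral ∫_0^π sin x dx = 2 are choices of this illustration):

```python
import math

# Composite Simpson's rule: the n = 2 Newton-Cotes formula of Table 20.7-1
# applied over successive pairs of subintervals, as in Eq. (20.7-12).
def simpson(y, a, b, n):
    """Approximate the integral of y over (a, b) using n steps (n even)."""
    h = (b - a) / n
    s = y(a) + y(b)
    s += 4 * sum(y(a + k * h) for k in range(1, n, 2))   # odd ordinates
    s += 2 * sum(y(a + k * h) for k in range(2, n, 2))   # even ordinates
    return s * h / 3

# Exact for cubics; error of order (dx)^4 for smooth integrands.
print(abs(simpson(math.sin, 0.0, math.pi, 20) - 2.0) < 1e-4)   # True
```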
formula is derived in the manner of Sec. 20.7-4 and is suitably truncated in each case. The formula is exact for polynomials y(x) of up to the (2m + 1)st degree if up to 2mth-order differences are included (m = 0, 1, 2, . . .). In particular, omission of first-order and higher differences yields the trapezoidal-rule formula

∫_{x_0}^{x_0+nΔx} y(x) dx ≈ Δx[(1/2)y_0 + y_1 + · · · + y_{n−1} + (1/2)y_n]    (20.7-14)

which is exact for first-degree polynomials y(x). Omission of third-order and higher differences produces

∫_{x_0}^{x_0+nΔx} y(x) dx ≈ Δx[(3/8)y_0 + (7/6)y_1 + (23/24)y_2 + y_3 + · · · + y_{n−3} + (23/24)y_{n−2} + (7/6)y_{n−1} + (3/8)y_n]    (20.7-15)
which is exact for third-degree polynomials y(x).
(c) Use of Euler-MacLaurin Formula. The Euler-MacLaurin summation formula (4.8-10) yields a number of quadrature formulas involving values y′_k = y′(x_k), y″_k = y″(x_k), . . . of the derivatives of y(x) as well as of y(x). Thus,

∫_{x_0}^{x_0+Δx} y(x) dx = (Δx/2)(y_0 + y_1) − (Δx²/12)(y′_1 − y′_0) + (Δx⁴/720)(y‴_1 − y‴_0) + · · ·    (20.7-16)

again adds correction terms to the trapezoidal rule.
20.7-3. Gauss and Chebyshev Quadrature Formulas. (a) Rewrite the given definite integral ∫_a^b y(x) dx as ∫_{−1}^{1} η(ξ) dξ with the aid of the transformation

x = [(b − a)/2]ξ + (a + b)/2        η(ξ) = [(b − a)/2] y(x)    (20.7-17)

and approximate the latter integral by

∫_{−1}^{1} η(ξ) dξ ≈ Σ_{k=1}^{n} α_k η(ξ_k)    with    α_k = 2 / {(1 − ξ_k²)[P′_n(ξ_k)]²}    (n = 1, 2, . . .)
    (Gauss quadrature formula)    (20.7-18)

where the n argument values ξ_k are the n zeros of the nth-degree Legendre polynomial P_n(ξ) (Sec. 21.7-1). Table 20.7-2 lists the ξ_k and α_k for a number of values of n.
20.7-3
NUMERICAL CALCULATIONS
774
The error due to the use of the Gauss quadrature formula (18) is

E = [(n!)⁴(b − a)^{2n+1} / {(2n + 1)[(2n)!]³}] y^{(2n)}(X)    (a < X < b)    (20.7-19)
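A hedged Python sketch (the test integral ∫_1^3 dx/x = ln 3 and the use of numpy's node/weight routine are choices of this illustration) shows formula (18) together with the transformation (17):

```python
import numpy as np

# Gauss quadrature (20.7-18): the nodes are the zeros of P_n and the
# weights follow the formula in (18); numpy supplies both directly.
a, b = 1.0, 3.0
xi, alpha = np.polynomial.legendre.leggauss(8)   # xi_k and alpha_k for n = 8
x = 0.5 * (b - a) * xi + 0.5 * (a + b)           # transformation (20.7-17)
integral = 0.5 * (b - a) * np.sum(alpha / x)     # eta(xi) = ((b-a)/2) y(x)
print(abs(integral - np.log(3.0)) < 1e-6)        # True
```

With only 8 function evaluations the result is already far more accurate than a Newton-Cotes rule of comparable cost, as discussed in Sec. 20.7-4b.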
(b) A simpler class of quadrature formulas is obtained with the aid of the transformation (17) and an approximation of the form

∫_{−1}^{1} η(ξ) dξ ≈ (2/n)[η(ξ′_1) + η(ξ′_2) + · · · + η(ξ′_n)]    (n = 2, 3, 4, 5, 6, 7, 9)
    (Chebyshev quadrature formula)    (20.7-20)

Table 20.7-2 lists the ξ′_k for a number of values of n. The use of equal weights minimizes the probable error if y(x) is affected by normally distributed random errors. The derivation of the ξ′_k is discussed in Refs. 20.6 and 20.10.
Table 20.7-2. Abscissas and Weights for Gauss and Chebyshev Quadrature Formulas (Sec. 20.7-3; adapted from Ref. 20.11)

(a) Abscissas ξ_k and Weights α_k for the Gauss Quadrature Formula (18)

n = 2:  ±0.577350 (weight 1)
n = 3:  0 (8/9);  ±0.774597 (5/9)
n = 4:  ±0.339981 (0.652145);  ±0.861136 (0.347855)
n = 5:  0 (0.568889);  ±0.538469 (0.478629);  ±0.906180 (0.236927)

(b) Abscissas ξ′_k for the Chebyshev Quadrature Formula (20)

n = 2:  ±0.577350
n = 3:  0;  ±0.707107
n = 4:  ±0.187592;  ±0.794654
n = 5:  0;  ±0.374541;  ±0.832497
n = 6:  ±0.266635;  ±0.422519;  ±0.866247
n = 7:  0;  ±0.323912;  ±0.529657;  ±0.883862
n = 9:  0;  ±0.167906;  ±0.528762;  ±0.601019;  ±0.911589
Table 20.7-2. Abscissas and Weights for Gauss and Chebyshev Quadrature Formulas (Continued)

(c) Abscissas ξ_k and Weights α_k for the Gauss-Laguerre Quadrature Formula (21)

n = 2:  0.585786 (0.853553);  3.414214 (0.146447)
n = 3:  0.415775 (0.711093);  2.294280 (0.278518);  6.289945 (0.0103893)
n = 4:  0.322548 (0.603154);  1.745761 (0.357419);  4.536620 (0.0388879);  9.395071 (0.000539295)
n = 5:  0.263560 (0.521756);  1.413403 (0.398667);  3.596426 (0.0759424);  7.085810 (0.00361176);  12.640801 (0.0000233700)

(d) Abscissas ξ_k and Weights α_k for the Gauss-Hermite Quadrature Formula (22)

n = 2:  ±0.707107 (0.886227)
n = 3:  0 (1.181636);  ±1.224745 (0.295409)
n = 4:  ±0.524648 (0.804914);  ±1.650680 (0.0813128)
n = 5:  0 (0.945309);  ±0.958572 (0.393619);  ±2.020183 (0.0199532)
For n = 3, the error due to the use of the Chebyshev quadrature formula (20) is (1/360)[(b − a)/2]⁵ y⁗(X) (a < X < b). Again,

∫_0^∞ e^{−ξ} η(ξ) dξ ≈ Σ_{k=1}^{n} α_k η(ξ_k)    with    α_k = (n!)² / {ξ_k [L′_n(ξ_k)]²}    (n = 1, 2, . . .)
    (Gauss-Laguerre quadrature formula)    (20.7-21)

where the ξ_k are the zeros of the nth-degree Laguerre polynomial L_n(ξ) (Sec. 21.7-1); and

∫_{−∞}^{∞} e^{−ξ²} η(ξ) dξ ≈ Σ_{k=1}^{n} α_k η(ξ_k)    with    α_k = 2^{n+1} n! √π / [H′_n(ξ_k)]²    (n = 1, 2, . . .)
    (Gauss-Hermite quadrature formula)    (20.7-22)

where the ξ_k are the zeros of the nth-degree Hermite polynomial H_n(ξ) (Sec. 21.7-1). Again,

∫_{−1}^{1} η(ξ)/√(1 − ξ²) dξ ≈ Σ_{k=1}^{n} α_k η(ξ_k)    with    α_k = π/n    ξ_k = cos [(2k − 1)π/2n]    (k = 1, 2, . . . , n)
    (Gauss-Chebyshev quadrature formula)    (20.7-23)

The ξ_k are seen to be zeros of the nth Chebyshev polynomial (Sec. 21.7-4). Many similar formulas exist (Refs. 20.6 and 20.56). Table 20.7-2 lists some of the ξ_k and α_k for Eqs. (21) and (22).
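A minimal Python sketch of the Gauss-Hermite formula (22) (the test integrand η(ξ) = ξ², with ∫ e^{−ξ²} ξ² dξ = √π/2, is a choice of this illustration):

```python
import numpy as np

# Gauss-Hermite quadrature (20.7-22): exact for polynomial eta of
# degree <= 2n - 1 against the weight e^{-xi^2} over (-inf, inf).
xi, alpha = np.polynomial.hermite.hermgauss(5)    # xi_k and alpha_k for n = 5
approx = np.sum(alpha * xi ** 2)
print(abs(approx - np.sqrt(np.pi) / 2) < 1e-12)   # True
```

`numpy.polynomial.laguerre.laggauss` supplies the nodes and weights of the Gauss-Laguerre formula (21) in the same way.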
20.7-4. Derivation and Comparison of Integration Formulas. (a) In general, one can derive unknown coefficients α_k and/or abscissas x_k in an integration formula

∫ ϕ(x)y(x) dx ≈ α_0 y(x_0) + α_1 y(x_1) + α_2 y(x_2) + · · · + α_n y(x_n)    (20.7-24)

as follows (Ref. 20.4):

1. The formula is required to hold exactly for y(x) = 1, x, x², . . . , x^m (m ≤ n). This yields the equations

∫ ϕ(x)x^r dx = α_0 x_0^r + α_1 x_1^r + α_2 x_2^r + · · · + α_n x_n^r    (r = 0, 1, 2, . . . , m ≤ n)    (20.7-25)

for the unknown α_k and x_k.

2. One may prescribe some or all of the x_k (Newton-Cotes formulas, Gregory's formula) and/or the α_k (Chebyshev quadrature).

3. One may impose constraints on (relations between) the α_k, either for symmetry (Gregory's formula) or to minimize round-off-error effects. For the latter purpose, all α_k should be positive (subtraction increases round-off-error effects, Sec. 20.1-2).
The relative desirability of these measures depends on the application. The x_k may or may not be in the integration interval. The same method also applies to formulas containing added derivative terms like b_k y′(x_k), as in Sec. 20.7-2c.
(b) The Gauss-type integration formulas, like Eq. (18), are exact if η(ξ) is a polynomial of degree ≤ 2n − 1, while Newton-Cotes formulas using the same number of function values are exact only for polynomials of much lower degree. If y(x) has only a piecewise-continuous first-order derivative, then repeated use of the simple trapezoidal rule is about as good as any quadrature formula. See Ref. 20.56 for modified Gaussian quadrature formulas exact for trigonometric polynomials and other special functions.
20.7-5. Numerical Evaluation of Multiple Integrals. Multiple integrals may be evaluated by repeated application of the procedures outlined in Secs. 20.7-2 and 20.7-3; or divide the domain of integration into subregions separated by coordinate lines and use the approximations
∫_{−h}^{h} ∫_{−h}^{h} f(x, y) dx dy ≈ (2h²/3)(2f_{00} + f_{10} + f_{01} + f_{−1,0} + f_{0,−1})
    (Woolley's approximation)    (20.7-26)

∫_{−h}^{h} ∫_{−h}^{h} f(x, y) dx dy ≈ (h²/9)[16f_{00} + 4(f_{10} + f_{01} + f_{−1,0} + f_{0,−1}) + f_{11} + f_{1,−1} + f_{−1,1} + f_{−1,−1}]
    (Simpson's approximation)    (20.7-27)

where f_{jk} = f(j Δx, k Δy), Δx = Δy = h (j, k = 0, ±1), and

∫_{−h}^{h} ∫_{−h}^{h} ∫_{−h}^{h} f(x, y, z) dx dy dz ≈ (4h³/3)(f_{100} + f_{010} + f_{001} + f_{−1,0,0} + f_{0,−1,0} + f_{0,0,−1})    (20.7-28)
where f_{ijk} = f(i Δx, j Δy, k Δz), Δx = Δy = Δz = h (i, j, k = 0, ±1). A simple two-dimensional Gauss-type integration formula (Ref. 20.56) is

∫_{−1}^{1} ∫_{−1}^{1} f(ξ, η) dξ dη ≈ Σ_{k=1}^{3} Σ_{l=1}^{3} α_k α_l f(ξ_k, η_l)

with

ξ_1 = η_1 = −√(3/5)        ξ_2 = η_2 = 0        ξ_3 = η_3 = √(3/5)
α_1 = α_3 = 5/9        α_2 = 8/9    (20.7-29)
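The product formula (20.7-29) can be sketched directly (Python; the test integrand ξ²η², whose exact integral over the square is (2/3)² = 4/9, is a choice of this illustration):

```python
import numpy as np

# Two-dimensional Gauss product formula (20.7-29): 3-point Gauss nodes
# +-sqrt(3/5), 0 with weights 5/9, 8/9, 5/9 used in each coordinate.
nodes = np.array([-np.sqrt(0.6), 0.0, np.sqrt(0.6)])
w = np.array([5 / 9, 8 / 9, 5 / 9])
f = lambda u, v: u * u * v * v
approx = sum(w[i] * w[j] * f(nodes[i], nodes[j])
             for i in range(3) for j in range(3))
print(abs(approx - 4 / 9) < 1e-14)   # True
```

Since each one-dimensional rule is exact through degree 5, the product rule is exact for all monomials ξ^p η^q with p, q ≤ 5.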
Reference 20.56 also gives integration formulas for s-dimensional cubes, spheres, and other regions. Monte-Carlo methods (Sec. 20.10-1) are also of interest for multidimensional integration.

20.8. NUMERICAL SOLUTION OF ORDINARY DIFFERENTIAL EQUATIONS
20.8-1. Introduction and Notation. Refer to Chap. 9 for ordinary methods of solution. A rough graphical solution (Sec. 9.5-2) may precede the numerical solution for orientation purposes. Sections 20.8-2 to 20.8-8 deal with initial-value problems; the solution of boundary-value problems is discussed in Secs. 20.9-1 and 20.9-3.
To solve the first-order differential equation

y′ = f(x, y)    (20.8-1)

for a given initial value y(x_0) = y_0, consider fixed increments Δx = h of the independent variable x. Let x_k = x_0 + k Δx (k = 0, 1, 2, . . .), and denote successive samples of the computed (in general, approximate) solution y(x) and its derivative y′(x) by

y_k ≈ y(x_k) = y(x_0 + k Δx)
y′_k = f_k = f(x_k, y_k) ≈ y′(x_k)    (k = 0, 1, 2, . . .)    (20.8-2)
In the absence of round-off errors, the difference y_{k+1} − y_TRUE(x_{k+1}) is the truncation error due to a stepwise approximation of continuous integration. If exact solution values y_TRUE(x_k), y_TRUE(x_{k−1}), . . . are substituted for y_k, y_{k−1}, . . . , then y_{k+1} − y_TRUE(x_{k+1}) is the local truncation error for the given integration formula. The true truncation error, however, is affected by propagated errors from earlier solution steps as well as by the local truncation error (Sec. 20.8-5).
20.8-2. One-step Methods for Initial-value Problems: Euler and Runge-Kutta-type Methods. (a) For sufficiently small increments Δx = h and k = 0, 1, 2, . . . , the following simple recursion formulas produce stepwise approximations y_k to the solution y = y(x) of Eq. (1):

y_{k+1} = y_k + f_k Δx    (Euler-Cauchy polygonal approximation)    (20.8-3)

y_{k+1} = y_k + f(x_k + Δx/2, y_k + f_k Δx/2) Δx    (20.8-4)

y_{k+1} = y_k + (1/2)[f_k + f(x_{k+1}, y_k + f_k Δx)] Δx    (20.8-5)
(b) Table 20.8-1 lists Runge-Kutta-type routines for the numerical solution of Eq. (1). Methods (a) and (b) in Table 20.8-1 are usually called "third-order" methods,* because the formulas for y_{k+1} are exact for f(x, y) = 1, x, x², x³; for suitably differentiable f(x, y), the local truncation error is O[(Δx)⁴] as Δx → 0 (Sec. 4.4-3). Methods (c), (d), and (e) are, by an analogous definition, "fourth-order" methods. See Refs. 20.11 and 20.64 for higher-order Runge-Kutta formulas (see also Sec. 20.8-5).

* Some authors call (a) and (b) "fourth-order" methods, and (c) and (d) "fifth-order" methods instead.

20.8-3. Multistep Methods for Initial-value Problems. (a) Starting the Solution. Given the initial value y_0, each of the following solution schemes requires one to compute the first three to five function values y_1, y_2, . . . by one of the methods of Secs. 9.2-5, 20.8-2, or 20.8-4.
This "starting solution" should be computed more accurately than the required solution by at least a factor of 10. If the Runge-Kutta method is used to start the solution, employ a step size Δx = h smaller than that required for the subsequent difference scheme.

Table 20.8-1. Some Runge-Kutta-type Methods for Ordinary Differential Equations (Sec. 20.8-2) or Systems of Differential Equations (Sec. 20.8-6)

In each formula, k_1 = f(x_k, y_k) Δx = f_k Δx

(a) y_{k+1} = y_k + (1/6)(k_1 + 4k_2 + k_3)
    k_2 = f(x_k + Δx/2, y_k + k_1/2) Δx
    k_3 = f(x_k + Δx, y_k + 2k_2 − k_1) Δx

(b) y_{k+1} = y_k + (1/4)(k_1 + 3k_3)
    k_2 = f(x_k + Δx/3, y_k + k_1/3) Δx
    k_3 = f(x_k + (2/3)Δx, y_k + (2/3)k_2) Δx    (Runge-Kutta-Heun method)

(c) y_{k+1} = y_k + (1/6)(k_1 + 2k_2 + 2k_3 + k_4)
    k_2 = f(x_k + Δx/2, y_k + k_1/2) Δx
    k_3 = f(x_k + Δx/2, y_k + k_2/2) Δx
    k_4 = f(x_{k+1}, y_k + k_3) Δx

(d) y_{k+1} = y_k + (1/8)(k_1 + 3k_2 + 3k_3 + k_4)
    k_2 = f(x_k + Δx/3, y_k + k_1/3) Δx
    k_3 = f(x_k + (2/3)Δx, y_k − (1/3)k_1 + k_2) Δx
    k_4 = f(x_k + Δx, y_k + k_1 − k_2 + k_3) Δx

(e) y_{k+1} = y_k + (1/6)[k_1 + 2(1 − √(1/2))k_2 + 2(1 + √(1/2))k_3 + k_4]
    k_2 = f(x_k + Δx/2, y_k + k_1/2) Δx
    k_3 = f(x_k + Δx/2, y_k − (1/2 − √(1/2))k_1 + (1 − √(1/2))k_2) Δx
    k_4 = f(x_k + Δx, y_k − √(1/2) k_2 + (1 + √(1/2))k_3) Δx    (Runge-Kutta-Gill method)
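Method (c) of Table 20.8-1, the classical fourth-order Runge-Kutta step, can be sketched as follows (Python; the test problem y′ = y with y(0) = 1 is a choice of this illustration):

```python
import math

# Method (c) of Table 20.8-1: the classical fourth-order Runge-Kutta step.
def rk4_step(f, x, y, dx):
    k1 = f(x, y) * dx
    k2 = f(x + dx / 2, y + k1 / 2) * dx
    k3 = f(x + dx / 2, y + k2 / 2) * dx
    k4 = f(x + dx, y + k3) * dx
    return y + (k1 + 2 * k2 + 2 * k3 + k4) / 6

x, y, dx = 0.0, 1.0, 0.1
for _ in range(10):                       # integrate y' = y from x = 0 to 1
    y = rk4_step(lambda x, y: y, x, y, dx)
    x += dx
print(abs(y - math.e) < 1e-5)             # True
```

Note that each step costs four derivative evaluations, which is the point of comparison with the multistep schemes of Sec. 20.8-3.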
(b) Simple Extrapolation Schemes. Given y_k, y_{k−1}, y_{k−2}, . . . , one approximates successive solution values

y(x_{k+1}) ≈ y_k + ∫_{x_k}^{x_{k+1}} f(x, y) dx    (20.8-6)

by integrating an extrapolation polynomial through f_k, f_{k−1}, f_{k−2}, . . . instead of f(x, y). Using the second interpolation formula (20.5-6),
one obtains

y_{k+1} = y_k + [f_k + (1/2)∇f_k + (5/12)∇²f_k + (3/8)∇³f_k + (251/720)∇⁴f_k + (95/288)∇⁵f_k + · · ·] Δx    (20.8-7)

Truncation of the general "open" integration formula (7) after successively higher-order difference terms yields first the Euler formula (3) and the open trapezoidal rule

y_{k+1} = y_k + (1/2)(3f_k − f_{k−1}) Δx    (20.8-8)

both used in digital differential analyzers, and then the third-order formula

y_{k+1} = y_k + (1/12)(23f_k − 16f_{k−1} + 5f_{k−2}) Δx    (20.8-9)

and the fourth-order Adams-Bashforth predictor of Table 20.8-2.
(c) Predictor-corrector Methods and Step-size Changes. Denoting the "predicted" value (7) of y_{k+1} as y_{k+1}^PRED, one can improve the approximation of Sec. 20.8-3b by requiring the interpolation polynomial to assume the value f_{k+1}^PRED = f(x_{k+1}, y_{k+1}^PRED). The resulting "corrected" y_{k+1} is obtained by using the predicted f_{k+1} in the "closed" integration formula

y_{k+1} = y_{k+1}^CORR = y_k + [f_{k+1} − (1/2)∇f_{k+1} − (1/12)∇²f_{k+1} − (1/24)∇³f_{k+1} − (19/720)∇⁴f_{k+1} − (3/160)∇⁵f_{k+1} − · · ·] Δx    (20.8-10)
The corrector (10) is truncated like the predictor (7). The resulting difference y^CORR − y^PRED measures the local truncation error of the corrected approximation and is reduced to some preassigned value by suitable selection of the increment Δx. To halve the step size for fourth-order formulas, use the interpolation formulas

y_{k−1/2} = (1/128)[45y_k + 72y_{k−1} + 11y_{k−2} + (−9f_k + 36f_{k−1} + 3f_{k−2}) Δx]
y_{k−3/2} = (1/128)[11y_k + 72y_{k−1} + 45y_{k−2} − (3f_k + 36f_{k−1} − 9f_{k−2}) Δx]    (20.8-11)

To double the step size, one may restart, or use past solution values stored for this purpose (see also Ref. 20.4). A corrected solution value y_{k+1} can be successively improved further if it is resubstituted into the corrector formula as a new predicted value. In practice, step-size reduction and/or the use of modifiers (Sec. 20.8-4b) is often preferred to such iteration.
20.8-4. Improved Multistep Methods. (a) More general "open" integration formulas (useful as predictors) and "closed" integration formulas (useful as correctors) can be written in the respective forms

y_{k+1} = A_0 y_k + A_1 y_{k−1} + A_2 y_{k−2} + A_3 y_{k−3}
    + (B_0 f_k + B_1 f_{k−1} + B_2 f_{k−2} + B_3 f_{k−3}) Δx    (predictor)    (20.8-12)

y_{k+1} = a_0 y_k + a_1 y_{k−1} + a_2 y_{k−2}
    + (b_{−1} f_{k+1} + b_0 f_k + b_1 f_{k−1} + b_2 f_{k−2}) Δx    (corrector)    (20.8-13)
Instead of determining all the coefficients by making each formula exact for f(x, y) = 1, x, x², . . . , one usually prefers to require only "fourth-order" matching (up to and including x⁴); the coefficients thus left undetermined are chosen so as to improve error propagation and/or simplify computation (Sec. 20.8-5). Table 20.8-2 lists a number of useful fourth-order predictor and corrector formulas.

Table 20.8-2. Some Fourth-order Predictor-corrector Methods

Each predictor-corrector scheme may be used with or without modifiers, and f_{k+1}^MOD = f(x_{k+1}, y_{k+1}^MOD). In each case, the magnitude of the correction on the last line is an upper bound for the local truncation error.

(a) Adams-Bashforth Predictor with Adams-Moulton Corrector and (Optional) Modifier

y_{k+1}^PRED = y_k + (1/24)(55f_k − 59f_{k−1} + 37f_{k−2} − 9f_{k−3}) Δx
y_{k+1}^MOD = y_{k+1}^PRED + (251/270)(y_k^CORR − y_k^PRED)
y_{k+1}^CORR = y_k + (1/24)(9f_{k+1}^MOD + 19f_k − 5f_{k−1} + f_{k−2}) Δx
y_{k+1} = y_{k+1}^CORR − (19/270)(y_{k+1}^CORR − y_{k+1}^PRED)

(b) Hamming's Method

y_{k+1}^PRED = y_{k−3} + (4/3)(2f_k − f_{k−1} + 2f_{k−2}) Δx
y_{k+1}^MOD = y_{k+1}^PRED + (112/121)(y_k^CORR − y_k^PRED)
y_{k+1}^CORR = (1/8)(9y_k − y_{k−2}) + (3/8)(f_{k+1}^MOD + 2f_k − f_{k−1}) Δx
y_{k+1} = y_{k+1}^CORR − (9/121)(y_{k+1}^CORR − y_{k+1}^PRED)

(c) Milne's Method, here shown with Hamming's modifier, has relatively low local truncation error but is unstable if ∂f/∂y is a negative real number, or a real matrix having a negative eigenvalue.

y_{k+1}^PRED = y_{k−3} + (4/3)(2f_k − f_{k−1} + 2f_{k−2}) Δx
y_{k+1}^MOD = y_{k+1}^PRED + (28/29)(y_k^CORR − y_k^PRED)
y_{k+1}^CORR = y_{k−1} + (1/3)(f_{k+1}^MOD + 4f_k + f_{k−1}) Δx
y_{k+1} = y_{k+1}^CORR − (1/29)(y_{k+1}^CORR − y_{k+1}^PRED)

See Ref. 20.4 for additional formulas.
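Scheme (a) of Table 20.8-2 can be sketched in Python (this sketch omits the optional modifier and, for brevity, takes exact starting values; the function name and test problem y′ = −y are choices of this illustration):

```python
import math

# Scheme (a) of Table 20.8-2: Adams-Bashforth predictor followed by one
# Adams-Moulton correction, without the optional modifier.
def abm4(f, x0, y_hist, dx, steps):
    """y_hist: starting solution at x0, x0+dx, x0+2dx, x0+3dx."""
    ys = list(y_hist)
    fs = [f(x0 + i * dx, ys[i]) for i in range(4)]
    x = x0 + 3 * dx
    for _ in range(steps):
        yp = ys[-1] + dx / 24 * (55*fs[-1] - 59*fs[-2] + 37*fs[-3] - 9*fs[-4])
        fp = f(x + dx, yp)                  # derivative at the predicted value
        yc = ys[-1] + dx / 24 * (9*fp + 19*fs[-1] - 5*fs[-2] + fs[-3])
        x += dx
        ys.append(yc)
        fs.append(f(x, yc))
    return ys[-1]

f = lambda x, y: -y
dx = 0.1
start = [math.exp(-i * dx) for i in range(4)]   # exact starting solution
y_end = abm4(f, 0.0, start, dx, 7)              # integrate to x = 1.0
print(abs(y_end - math.exp(-1.0)) < 1e-5)       # True
```

Only one new derivative evaluation per step is needed after the correction, which is the economy claimed for multistep schemes in Sec. 20.8-5a.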
(b) Use of Modifiers (Ref. 20.4). In each predictor-corrector method, the difference y^CORR − y^PRED is roughly proportional to the local truncation error. Hence, one may improve the solution by adding a fraction C(y_k^CORR − y_k^PRED) of the preceding difference to y_{k+1}^PRED before substitution in the corrector, and subtracting (1 − C)(y_{k+1}^CORR − y_{k+1}^PRED) from y_{k+1}^CORR to obtain y_{k+1} (Table 20.8-2).
(c) The fourth-order method of Clippinger and Dimsdale (Ref. 20.11) requires iteration of

y_{k+1} = (1/2)(y_k + y_{k+2}) + (1/8)(f_k − f_{k+2}) Δx
y_{k+2} = y_k + (1/3)(f_k + 4f_{k+1} + f_{k+2}) Δx    (20.8-14)

starting with a trial value for y_{k+2}, say y_{k+2} ≈ y_k + 2f_k Δx, and permits step-size changes by simple substitution of new values of Δx. The method is self-starting, but once the solution is started, one can save iterations by employing extrapolation to predict y_{k+2}.
20.8-5. Discussion of Different Solution Methods, Step-size Control, and Stability. (a) Integration formulas specifically selected for low local truncation error can emphasize error propagation over successive solution steps. The design of differential-equation-solving routines requires a compromise between local truncation error, stability, and computation time. In addition, formula coefficients which produce summation terms of equal sign and not too different absolute values are preferred, so as to reduce fractional errors due to round-off. The final choice depends on the application and on the computer used. Double-precision accumulation of dependent variables is frequently indicated.

If the given f(x, y) is at all complicated, the principal cost (time) of the computation is associated with calculations of derivative values f(x, y). For differential-equation problems with reasonably smooth (repeatedly differentiable) integrator inputs f(x, y), multistep integration schemes require relatively few derivative calculations and conveniently permit timesaving automatic step-size control in terms of y^CORR − y^PRED. Runge-Kutta-type methods tend to be very stable (Sec. 20.8-5b) and do not require a separate starting routine; they are, therefore, often preferred for problems involving frequent step inputs. Runge-Kutta schemes require relatively more derivative computations per step, and efficient step-size control is more difficult (see also Refs. 20.4, 20.15, and 20.52).

To estimate the local truncation error for step-size control in Runge-Kutta routines, one can compare results obtained for different step sizes (using stored derivative values whenever possible), or investigate a suitable function of the k_i. For the frequently used Runge-Kutta method (c) in Table 20.8-1, the quantity

ε = 3f_{k+1} Δx + k_1 − 2k_3 − 2k_4    (20.8-15)

is a rough measure of the local truncation error, preferable to |k_2 − k_3|/|k_1 − k_2| and k_1 + k_4 − 2k_3, which are also used (Ref. 20.67).
(b) The stability of the approximate solution y_0, y_1, y_2, . . . of Eq. (1) computed, say, with the aid of a general multipoint formula

y_{k+1} = A_0 y_k + A_1 y_{k−1} + · · · + A_r y_{k−r} + (B_{−1} f_{k+1} + B_0 f_k + B_1 f_{k−1} + · · · + B_r f_{k−r}) Δx    (20.8-16)

depends on the stability of the corresponding linearized difference equation for the approximate error sequence e_0, e_1, e_2, . . . , viz.,

e_{k+1} ≈ A_0 e_k + A_1 e_{k−1} + · · · + A_r e_{k−r} + (∂f/∂y)_{x_k}(B_{−1} e_{k+1} + B_0 e_k + B_1 e_{k−1} + · · · + B_r e_{k−r}) Δx    (20.8-17)

(Sec. 20.4-8). The associated characteristic equation

[−1 + B_{−1}(∂f/∂y)_{x_k} Δx]z^{r+1} + [A_0 + B_0(∂f/∂y)_{x_k} Δx]z^r + [A_1 + B_1(∂f/∂y)_{x_k} Δx]z^{r−1}
    + · · · + [A_r + B_r(∂f/∂y)_{x_k} Δx] = 0    (20.8-18)

will have a root z_1 = exp[(∂f/∂y)_{x_k} Δx] + O(Δx²). If (∂f/∂y)_{x_k} > 0, then the corresponding normal mode of the error sequence is unstable, but so is the true solution y(x) near x = x_k, so that the fractional error may be acceptable for small Δx. For r = 0 (simple one-step method), z_1 is the only root. For r > 0, Eq. (18) will have additional roots corresponding to spurious modes in the computed solution because of the higher-order difference approximation. Relative stability of the computed solution requires that all these extra roots lie within the unit circle |z| = 1 (Sec. 20.4-8) for the values of Δx of interest. The worst hazards exist for step inputs (which, like round-off noise, may excite spurious modes) and near stability boundaries of the original differential equation. Such situations, if suspected, may be tested by artificially introduced perturbations.

The stability of a predictor-corrector scheme will depend on both the predictor and the corrector formula but is affected more strongly by the latter if the correction is small. Integration routines of order higher than the fourth require careful stability investigation, but their use may be economical in trajectory computations with large step sizes (Ref. 20.64).
20.8-6. Ordinary Differential Equations of Order Higher Than the First and Systems of Ordinary Differential Equations. (a) Each ordinary differential equation of the second or higher order is equivalent to a system of first-order equations (Sec. 9.1-3). If the latter is written in the matrix form of Sec. 13.6-1, each solution method of Secs. 20.8-2 to 20.8-4 yields an analogous method for the numerical solution of systems of differential equations.
(b) Specifically, consider a system of first-order differential equations

y′ = f(x, y, z, . . .)        z′ = g(x, y, z, . . .)    . . .    (20.8-19)

with solution y = y(x), z = z(x), . . . . Note that solution by the Taylor-series and Picard methods is essentially analogous to the procedures outlined in Secs. 9.2-5a and 9.2-5b.
The Runge-Kutta method is analogous to the scheme of Sec. 20.8-2b:

y_{k+1} = y_k + (1/6)(k_1 + 2k_2 + 2k_3 + k_4)
z_{k+1} = z_k + (1/6)(m_1 + 2m_2 + 2m_3 + m_4)    . . .
with

k_1 = f(x_k, y_k, z_k, . . .) Δx        m_1 = g(x_k, y_k, z_k, . . .) Δx
k_2 = f(x_k + Δx/2, y_k + k_1/2, z_k + m_1/2, . . .) Δx        m_2 = g(x_k + Δx/2, y_k + k_1/2, z_k + m_1/2, . . .) Δx
k_3 = f(x_k + Δx/2, y_k + k_2/2, z_k + m_2/2, . . .) Δx        m_3 = g(x_k + Δx/2, y_k + k_2/2, z_k + m_2/2, . . .) Δx
k_4 = f(x_k + Δx, y_k + k_3, z_k + m_3, . . .) Δx        m_4 = g(x_k + Δx, y_k + k_3, z_k + m_3, . . .) Δx    (20.8-20)
Any one of the multistep schemes of Sec. 20.8-3 may be applied to each equation (19); one writes

y(x_k) = y_k        z(x_k) = z_k    . . .
f(x_k, y_k, z_k, . . .) = f_k        g(x_k, y_k, z_k, . . .) = g_k    . . .    (20.8-21)
(c) The stability of the correct solution y = y(x) of a system (matrix equation) will depend on the matrix [∂f/∂y] (Sec. 13.6-5). Equation (17) becomes a matrix difference equation, whose characteristic roots must be compared to the eigenvalues of the matrix [∂f/∂y]_{x_k} to determine relative stability (Refs. 20.4, 20.6, and 20.52).
20.8-7. Special Formulas for Second-order Equations. Because of the practical importance of second-order differential equations, the following schemes for the numerical solution of the differential equation

y″ = f(x, y, y′)    (20.8-22)

with given initial values y(x_0) = y_0, y′(x_0) = y′_0 are of interest; in each case,

x_0 + k Δx = x_k        y(x_k) = y_k        y′(x_k) = y′_k        f(x_k, y_k, y′_k) = f_k    (k = 0, 1, 2, . . .)    (20.8-23)
These methods may be extended to apply to the solution of systems of two or more second-order equations in the manner of Sec. 20.8-6.
(a) A Runge-Kutta Method for Eq. (22).

y_{k+1} = y_k + y′_k Δx + (1/6)(k_1 + k_2 + k_3) Δx
y′_{k+1} = y′_k + (1/6)(k_1 + 2k_2 + 2k_3 + k_4)

with

k_1 = f(x_k, y_k, y′_k) Δx
k_2 = f(x_k + Δx/2, y_k + y′_k Δx/2, y′_k + k_1/2) Δx
k_3 = f(x_k + Δx/2, y_k + y′_k Δx/2 + (k_1/4) Δx, y′_k + k_2/2) Δx
k_4 = f(x_k + Δx, y_k + y′_k Δx + (k_2/2) Δx, y′_k + k_3) Δx    (20.8-24)
(b) An Interpolation-iteration Scheme. Starting with a trial value for f_{k+1}, iterate

y_{k+1} = 2y_k − y_{k−1} + [f_k + (1/12)∇²f_{k+1}] Δx²
y′_{k+1} = y′_{k−1} + [2f_k + (1/3)∇²f_{k+1}] Δx    (20.8-25)
(c) Prediction-correction Schemes.

y′_{k+1}^PRED = y′_{k−3} + (4/3)(2f_k − f_{k−1} + 2f_{k−2}) Δx    (predictor)
y_{k+1} = y_{k−1} + (1/3)(y′_{k+1} + 4y′_k + y′_{k−1}) Δx
y′_{k+1} = y′_{k−1} + (1/3)(f_{k+1} + 4f_k + f_{k−1}) Δx    (corrector)    (20.8-26)

If f(x, y, y′) does not contain y′ explicitly, transform the given differential equation in the manner of Sec. 9.1-5b, or use the following prediction-correction scheme:

y_{k+1}^PRED = 2y_{k−1} − y_{k−3} + (4/3)(f_k + f_{k−1} + f_{k−2})(Δx)²    (predictor)
y_{k+1}^CORR = 2y_k − y_{k−1} + (1/12)(f_{k+1}^PRED + 10f_k + f_{k−1})(Δx)²    (corrector)    (20.8-27)

Reference 20.4 gives modifier formulas for this scheme.
Reference 20.4 gives modifier formulas for this scheme. (d) Numerov's Method for Linear Equations (Refs. 20.4 and 20.11). Linear differential equations of the form
y" = f(x)y + g(x)
(20.8-28)
can be solved with the corrector of Eq. (27) alone; substitution of Eq. (28) yields uk+i = 2uk - uk-i + [fkyk + gk + K2fe+i - 2^ + gk-i)](Ax)2 with
yk =
(20.8-29)
uk
1 - M2/*(Az)2
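Numerov's recurrence (20.8-29) can be sketched as follows (Python; the test problem y″ = −y, i.e., f ≡ −1 and g ≡ 0, with exact solution y = sin x, and the starting values are choices of this illustration):

```python
import math

# Numerov's method (20.8-29) for y'' = f(x) y + g(x): advance the auxiliary
# variable u_k = [1 - (dx^2/12) f_k] y_k and recover y_k from it.
def numerov(f, g, x0, dx, y0, y1, steps):
    c = dx * dx / 12.0
    ys = [y0, y1]
    us = [(1 - c * f(x0)) * y0, (1 - c * f(x0 + dx)) * y1]
    for k in range(1, steps + 1):
        xk = x0 + k * dx
        d2g = g(xk + dx) - 2 * g(xk) + g(xk - dx)
        u_next = (2 * us[k] - us[k - 1]
                  + (f(xk) * ys[k] + g(xk) + d2g / 12.0) * dx * dx)
        y_next = u_next / (1 - c * f(xk + dx))
        us.append(u_next)
        ys.append(y_next)
    return ys

dx = 0.1
ys = numerov(lambda x: -1.0, lambda x: 0.0, 0.0, dx, 0.0, math.sin(dx), 30)
print(abs(ys[-1] - math.sin(31 * dx)) < 1e-5)   # True
```

Only one evaluation of f and g per step is required, and no derivative values y′_k appear at all, which is the attraction of the method for linear second-order equations.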
20.8-8. Frequency-response Analysis. Given a stable integration formula (12), a complex-sinusoid input f_k = e^{iωx_k} will eventually produce a steady-state sinusoidal solution y_k = H(iω)e^{iωx_k} (just as in Sec. 9.4-6). Substitution yields the sampled-data frequency-response function

H(iω) = G(z) = [(B_0 + B_1 z^{−1} + B_2 z^{−2} + B_3 z^{−3}) Δx] / [z − A_0 − A_1 z^{−1} − A_2 z^{−2} − A_3 z^{−3}]    (z = e^{iωΔx})    (20.8-30)

Integration formulas can now be designed so as to approximate the ideal integrator response 1/iω as closely as possible in amplitude and phase. To reduce round-off-noise propagation, it is desirable that the error |H(iω) − 1/iω| decrease with frequency (Ref. 20.4). The same type of analysis applies also to double-integration formulas (Sec. 20.8-7).
20.9. NUMERICAL SOLUTION OF BOUNDARY-VALUE
PROBLEMS, PARTIAL DIFFERENTIAL EQUATIONS, AND INTEGRAL EQUATIONS
20.9-1. Introduction. This chapter describes numerical methods applicable to the solution of boundary-value problems involving ordinary or partial differential equations and to the solution of hyperbolic and parabolic partial differential equations. Alternative numerical-solution methods outlined in this handbook include
Reduction of partial differential equations to ordinary differential equations by separation of variables (Secs. 10.1-3 and 10.4-9), solution of characteristic equations (Secs. 10.2-2 and 10.2-4), and the method of characteristics (Sec. 10.3-2)

Reduction to a variation problem (Secs. 11.1-1, 11.7-1, and 15.4-7) and solution by direct methods such as the Rayleigh-Ritz method (Sec. 11.7-2)

Reduction to an integral equation (Sec. 15.5-2) which may be solved directly or by means of one of the approximation methods of Sec. 20.9-5

Perturbation methods for the solution of eigenvalue problems are described in Sec. 15.4-11. Many combinations of the various solution methods are possible, and vast numbers of specialized procedures have been developed for special applications (Refs. 20.9, 20.21 to 20.31, and 20.50). The quasilinearization method for solving two-point boundary-value problems is outlined in Sec. 20.9-3.
20.9-2. Two-point Boundary-value Problems Involving Ordinary Differential Equations: Difference Technique (see also Secs. 9.3-3, 15.4-1, and 15.5-2). Two-point boundary-value problems requiring the solution of a given ordinary differential equation subject to boundary conditions at the end points of an interval (a, b) can often be reduced to initial-value problems by the method of Sec. 9.3-4. The following finite-difference method may be more convenient.

Divide the given interval (a, b) into subintervals of equal length by a net

x₀ = a    x₁ = x₀ + Δx    x₂ = x₀ + 2Δx    . . .    x_n = x₀ + n Δx = b

and replace each derivative in the given differential equation and in the boundary conditions by a corresponding difference approximation (Sec. 20.7-1a, Fig. 20.9-1, and Fig. 20.9-6) of equal (or higher) order. The given ordinary differential equation is thus approximated by a difference equation. The numerical solution of the difference equation amounts to the solution of a set of simultaneous equations for the unknown function values y_k = y(x₀ + k Δx) (Secs. 20.2-5 to 20.3-2). In typical linear problems, the system matrix is usually "sparse" (Sec. 20.3-2a), and the problem is thus suitable for the iteration methods of Sec. 20.3-2. In particular, relaxation methods (Sec. 20.3-2c) are useful for "manual" computation.
[Stencil ("molecule") diagrams omitted: central-difference operators for h²∇², 2h²∇², h⁴∇⁴, and a mixed-derivative operator ∂²/∂x ∂y.]
Fig. 20.9-1. Operators for central-difference approximations (rectangular cartesian coordinates: Δx = Δy = h). Percentage errors are of the order of h².
EXAMPLE: To solve d²y/dx² + 0.16y = 0 subject to the boundary conditions y(0) = 1, y(1) = 0, divide the interval (0, 1) into n subintervals of length Δx = 1/n and use d²y/dx² ≈ Δ²y/Δx² to obtain the difference equation

y_{k+1} − 2y_k + y_{k−1} + (0.16 Δx²)y_k = 0    (k = 1, 2, . . . , n − 1)

with y₀ = 1, y_n = 0. For n = 4 (k = 1, 2, 3), the simultaneous equations to be solved for y₁, y₂, y₃ are

−1.99y₁ + y₂ = −1
y₁ − 1.99y₂ + y₃ = 0
y₂ − 1.99y₃ = 0
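The three simultaneous equations of the example are easily solved by machine; the following check is ours (NumPy and the comparison against the analytical solution are our additions):

```python
import numpy as np

# Difference equations of the example, n = 4, dx = 1/4:
#   y_{k+1} - 2 y_k + y_{k-1} + 0.16 dx^2 y_k = 0,  y_0 = 1, y_4 = 0
A = np.array([[-1.99,  1.00,  0.00],
              [ 1.00, -1.99,  1.00],
              [ 0.00,  1.00, -1.99]])
b = np.array([-1.0, 0.0, 0.0])
y = np.linalg.solve(A, b)          # y_1, y_2, y_3 at x = 0.25, 0.5, 0.75

# Analytical solution of y'' + 0.16 y = 0 with y(0) = 1, y(1) = 0:
exact = lambda x: np.cos(0.4*x) - (np.cos(0.4)/np.sin(0.4)) * np.sin(0.4*x)
```

Even on this very coarse net the O(Δx²) difference solution agrees with the analytical solution to better than three decimal places.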
20.9-3. The Generalized Newton-Raphson Method (Quasilinearization). While both nonlinear and linear two-point boundary-value problems can be treated numerically by the methods of Secs. 9.3-4 and 20.9-2, both these methods are most easily applied to linear differential equations. One can often find solutions of the important class of nonlinear two-point boundary-value problems of the form

y'' = f(x, y, y')    (a < x < b)
φ[y(a), y'(a)] = 0    ψ[y(b), y'(b)] = 0
(20.9-1)

by the following so-called quasilinearization method. Start with a trial solution y^[0](x) which satisfies the given boundary conditions, and obtain successive approximations y^[1](x), y^[2](x), . . . by solving the linear problems

y''^[j+1] = f(x, y^[j], y'^[j]) + f_y(x, y^[j], y'^[j])[y^[j+1] − y^[j]] + f_{y'}(x, y^[j], y'^[j])[y'^[j+1] − y'^[j]]

φ_y[y^[j](a), y'^[j](a)][y^[j+1](a) − y^[j](a)] + φ_{y'}[y^[j](a), y'^[j](a)][y'^[j+1](a) − y'^[j](a)] = 0
ψ_y[y^[j](b), y'^[j](b)][y^[j+1](b) − y^[j](b)] + ψ_{y'}[y^[j](b), y'^[j](b)][y'^[j+1](b) − y'^[j](b)] = 0

(j = 0, 1, 2, . . .)    (20.9-2)
where the subscripts denote partial differentiation. This technique is readily generalized to apply to systems of differential equations if one introduces the matrix notation of Sec. 13.6-1, and it constitutes a generalization of the Newton-Raphson method of Sec. 20.2-8: like the latter, it may converge quite rapidly. While fairly general convergence conditions can be formulated, convergence is commonly tested by trial and error (Refs. 20.47 and 20.60).
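For the frequent special case y'' = f(x, y) with fixed end values, each sweep of (20.9-2) is a linear two-point problem that the difference technique of Sec. 20.9-2 handles directly. The following sketch is ours (the test equation y'' = 1.5y², a standard example from the quasilinearization literature, and all function names, net sizes, and tolerances are illustrative assumptions):

```python
import numpy as np

def quasilinearize(f, f_y, a, b, ya, yb, n=200, sweeps=25, tol=1e-12):
    """Generalized Newton-Raphson method for y'' = f(x, y),
    y(a) = ya, y(b) = yb.  Each sweep solves the linearized problem
        y_new'' = f(x, y_old) + f_y(x, y_old) (y_new - y_old)
    by central differences on an (n + 1)-point net."""
    x = np.linspace(a, b, n + 1)
    h = (b - a) / n
    y = ya + (yb - ya) * (x - a) / (b - a)   # straight-line trial solution
    for _ in range(sweeps):
        fj, fyj = f(x, y), f_y(x, y)
        m = n - 1                            # interior net points
        A = np.zeros((m, m))
        rhs = (fj[1:-1] - fyj[1:-1] * y[1:-1]) * h * h
        idx = np.arange(m)
        A[idx, idx] = -2.0 - h * h * fyj[1:-1]
        A[idx[:-1], idx[:-1] + 1] = 1.0      # superdiagonal
        A[idx[1:], idx[1:] - 1] = 1.0        # subdiagonal
        rhs[0] -= ya                         # known boundary values
        rhs[-1] -= yb
        y_new = y.copy()
        y_new[1:-1] = np.linalg.solve(A, rhs)
        if np.max(np.abs(y_new - y)) < tol:
            return x, y_new
        y = y_new
    return x, y

# Illustration: y'' = 1.5 y^2, y(0) = 4, y(1) = 1; exact solution 4/(1+x)^2.
x, y = quasilinearize(lambda x, y: 1.5*y**2, lambda x, y: 3.0*y,
                      0.0, 1.0, 4.0, 1.0)
```

The Newton-like iteration settles in a handful of sweeps; the residual error of the converged net function is the O(h²) discretization error of the underlying difference scheme.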
20.9-4. Finite-difference Methods for Numerical Solution of Partial Differential Equations with Two Independent Variables. Methods analogous to that of Sec. 20.9-2 apply to the solution of problems involving partial differential equations. Introduce an appropriate net of coordinate values x_i = x₀ + i Δx, y_k = y₀ + k Δy (i, k = 0, ±1, ±2, . . .); the unknown function Φ(x, y) will be represented by a discrete set of function values Φ(x_i, y_k) = Φ_ik. Approximate each differential operator by a corresponding difference operator, so that every derivative is approximated by a corresponding partial difference coefficient. The resulting difference equation will yield a system of simultaneous equations to be satisfied by the unknown function values Φ_ik. The most important problems lead to the following situations.

1. A boundary-value problem for an elliptic partial differential equation (e.g., a Dirichlet problem for ∇²Φ = 0, Sec. 15.6-2) produces a set of N simultaneous equations for N unknowns Φ_ik.

The large number of equations and unknowns associated with such difference techniques makes many practical partial-differential-equation problems a challenge for even the largest digital computers. As the only redeeming feature, the nature of the difference operators resulting from the usual second-order and fourth-order linear partial differential equations (Table 10.4-1) again leads to systems of linear equations whose system matrices are "sparse," i.e., which have few nonvanishing terms except near the main diagonal. Such problems are, then, well suited for solution by the iteration methods of Sec. 20.3-2.

2. A linear eigenvalue problem (e.g., ∇²Φ = λΦ with suitable boundary conditions, Sec. 15.4-1) yields a matrix eigenvalue problem.

3. Initial-value problems involving a parabolic or hyperbolic partial differential equation (Sec. 10.3-4) also lead to sets of simultaneous equations if the differencing scheme chosen relates each unknown Φ_ik to other unknown as well as to known function values (implicit methods for solving initial-value problems).

4. If forward-difference approximations (Sec. 20.7-1) are used for the time derivatives in an initial-value problem, one obtains a set of recursion formulas for successive Φ_ik, starting with the given initial values (explicit methods for solving initial-value problems).
In general, explicit methods will require less computation than implicit methods for solving initial-value problems; but the recursion schemes tend to involve error-propagation or stability problems similar to those associated with the numerical solution of ordinary differential equations. As in the latter case, approximations may be improved through predictor-corrector methods analogous to those of Sec. 20.8-3 (Refs. 20.9, 20.16, and 20.20; see also Sec. 20.9-8).

A vast number of special techniques, including coordinate transformations reducing suitable boundaries and differencing nets to more convenient shapes, will be found in Refs. 20.9, 20.16, 20.17, and 20.20 to 20.31. Reference 20.9, in particular, contains an extensive bibliography.

20.9-5. Two-dimensional Difference Operators.
Figures 20.9-1 to 20.9-6 list the most frequently needed linear difference operators.

[Stencil diagrams omitted: equilateral-triangle-net central-difference operators for ∇² and ∇⁴.]

Fig. 20.9-2. Central-difference approximations for equilateral-triangle nets (side h). Percentage errors are of the order of h².

In particular, each diagram of Figs. 20.9-1 to 20.9-5 yields a specified central-difference expression for the center of the "star" or "molecule" as the weighted sum of the function values Φ_ik "above," "below," "to the right," etc., of the center, each weighting coefficient being indicated in its proper position.
Thus, Fig. 20.9-1 yields

h²∇²Φ|_{x=x_i, y=y_k} ≈ Φ(x_i + h, y_k) + Φ(x_i − h, y_k) + Φ(x_i, y_k + h) + Φ(x_i, y_k − h) − 4Φ(x_i, y_k)
  = Φ_{i+1,k} + Φ_{i−1,k} + Φ_{i,k+1} + Φ_{i,k−1} − 4Φ_{ik}

where h = Δx = Δy is the mesh width used for both x and y.
Fig. 20.9-3. A central-difference operator suitable for use with graded nets, or for transition to odd mesh widths near a boundary. The percentage error is of the order of h.

Fig. 20.9-4. A central-difference operator for use with oblique cartesian-coordinate nets. The percentage error is of the order of h².

Figure 20.9-1 applies to rectangular cartesian coordinates and equally spaced points. Figure 20.9-3 is useful if one wishes to change mesh width either (1) to accommodate irregular boundaries or (2) to increase the accuracy of the computation in some region of special interest (use of graded nets). In any case, the net used can be refined after an initial rough computation.

Figures 20.9-2, 20.9-4, and 20.9-5 show difference operators for nets other than rectangular cartesian. Figure 20.9-6 shows a few forward- and backward-difference operators.
For computation purposes, the net points (x_i, y_k) are often labeled in some simple sequence 1, 2, . . . ; the corresponding function values Φ_ik are then denoted by Φ₁, Φ₂, . . . (see, for example, Fig. 20.9-7). In "manual" relaxation-type computations (Sec. 20.3-2) involving problems of type 2, it is customary to use a large plan of the region and to enter function values and residuals directly at each net point. Function values and residuals from earlier steps of the relaxation process are simply crossed out or erased.

Fig. 20.9-5. A central-difference operator for use with polar coordinates r, φ.

Fig. 20.9-6. Forward- and backward-difference approximations (rectangular cartesian coordinates, Δx = Δy = h). Percentage errors are of the order of h².
Note: In problems of types 2 and 3, one cannot always replace a given differential operator by a difference operator of higher order, since the resulting difference-equation solution might contain spurious oscillation modes. On the other hand, finite-difference solutions of problems of type 4 (initial-value problems associated with hyperbolic or parabolic differential equations) often employ difference operators of higher orders than those of the corresponding derivatives, just as was done in Secs. 20.8-1 to 20.8-5 (see also Refs. 20.17, 20.23, and 20.48 for a number of special methods).
20.9-6. Representation of Boundary Conditions (Fig. 20.9-7). Approximate the given boundary by mesh lines of the net used; introduce a graded net (Fig. 20.9-3 and Sec. 20.9-5) if necessary. Then

1. If boundary values of the unknown function are given, introduce them directly into the difference equations written for net points adjacent to the boundary.

[Net diagrams omitted: (a) rectangular net, (b) triangle net, showing function values Φ₁, Φ₂, Φ₃, Φ₄ and image values Φ'₂, Φ'₄ outside the boundary.]
Fig. 20.9-7. Representation of boundary conditions for ∇²Φ = 0 or ∇⁴Φ = 0. The net is continued to the left of the given boundary, and "image values" Φ'₂ = Φ₂, Φ'₄ = Φ₄ are introduced to yield difference equations representing ∂Φ/∂n = 0 at the boundary points shown. The boundary condition Φ = 0, ∇²Φ = 0 (e.g., free edge of an elastic plate) is similarly represented by Φ₁ = Φ₃ = 0, Φ'₂ = −Φ₂, Φ'₄ = −Φ₄.

EXAMPLE: Figure 20.9-7b shows how the boundary condition ∂Φ/∂n = 0 is approximated through "reflection" of function values in the boundary, so that Φ'₂ − Φ₂ = 0, Φ'₄ − Φ₄ = 0, and the difference equation for the boundary point 3 of the triangle net becomes

Φ'₂ + Φ₂ + Φ'₄ + Φ₄ + Φ₅ + Φ₁ − 6Φ₃ = 0

or, since the boundary condition implies Φ'₂ = Φ₂, Φ'₄ = Φ₄,

2Φ₂ + 2Φ₄ + Φ₅ + Φ₁ − 6Φ₃ = 0

The last equation involves only points inside or on the boundary.
20.9-7. Problems Involving Three or More Independent Variables. Analogous methods apply to problems involving three or more independent variables. In particular,

h²∇²Φ(x, y, z) ≈ Φ(x + h, y, z) + Φ(x − h, y, z) + Φ(x, y + h, z) + Φ(x, y − h, z) + Φ(x, y, z + h) + Φ(x, y, z − h) − 6Φ(x, y, z)    (20.9-3)

The number of independent variables can often be reduced through separation of variables (Sec. 10.1-3).
20.9-8. Validity of Finite-difference Approximations. Some Stability Conditions. (a) Every difference-approximation solution requires an investigation of its validity. The error propagation resulting from two different difference-equation approximations to the same differential equation may differ radically even though the same mesh widths are used. It is, moreover, not always possible to improve the approximation by decreases in mesh widths, even if exact solutions of the difference equations (with zero relaxation residuals) are available. For a proper finite-difference approximation, the solution of the difference equation should converge to that of the given differential equation in a nonoscillatory manner. Implicit-solution schemes for linear partial differential equations (Sec. 20.9-4) rarely lead to difficulties of this type, but in explicit methods for parabolic and hyperbolic differential equations, the various coordinate increments may have to satisfy special stability conditions.
(b) In particular, the solution of the difference equation

[Φ(x + Δx, t) − 2Φ(x, t) + Φ(x − Δx, t)]/Δx² = (1/a²)[Φ(x, t + Δt) − Φ(x, t)]/Δt

or (in recursion-relation form)

Φ(x, t + Δt) = (a²Δt/Δx²)[Φ(x − Δx, t) + Φ(x + Δx, t)] + [1 − 2a²Δt/Δx²]Φ(x, t)

converges to the solution of the partial differential equation (heat-conduction equation, Sec. 10.4-1)

∂²Φ/∂x² = (1/a²) ∂Φ/∂t

if Δt → 0, Δx → 0 so that Δt ≤ Δx²/2a². Δt = Δx²/6a² is a useful combination, which will also improve the approximation for finite increments (Ref. 20.9).
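The stability bound is easy to exhibit numerically. In this sketch (the net sizes and the alternating-sign initial state, which excites the most unstable mode, are our choices) the recursion is run with r = a²Δt/Δx² just below and just above 1/2:

```python
import numpy as np

def explicit_heat_steps(r, n=50, steps=100):
    """Run the explicit recursion
        Phi(x, t+dt) = r [Phi(x-dx, t) + Phi(x+dx, t)] + (1 - 2r) Phi(x, t)
    with r = a^2 dt / dx^2, fixed end values Phi = 0, and an
    alternating-sign initial state; return the final max |Phi|."""
    u = np.array([(-1.0)**j for j in range(n + 1)])
    u[0] = u[-1] = 0.0
    for _ in range(steps):
        # right side is evaluated from the old net function before assignment
        u[1:-1] = r * (u[:-2] + u[2:]) + (1 - 2*r) * u[1:-1]
    return np.max(np.abs(u))

stable = explicit_heat_steps(0.4)    # r <= 1/2: amplitudes decay
unstable = explicit_heat_steps(0.6)  # r >  1/2: amplitudes grow
```

For r ≤ 1/2 the recursion is a convex combination of old values (a discrete maximum principle), while for r > 1/2 the highest spatial mode is amplified at every step.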
By contrast, the implicit method based on the Crank-Nicholson approximation

−(a²Δt/2Δx²)[Φ(x − Δx, t + Δt) + Φ(x + Δx, t + Δt)] + [1 + a²Δt/Δx²]Φ(x, t + Δt)
  = [1 − a²Δt/Δx²]Φ(x, t) + (a²Δt/2Δx²)[Φ(x − Δx, t) + Φ(x + Δx, t)]

will yield approximations which converge to the correct solution of the heat-conduction equation whenever Δx → 0, Δt → 0.
(c) Again, the solution of the difference equation

[Φ(x + Δx, t) − 2Φ(x, t) + Φ(x − Δx, t)]/Δx² = (1/c²)[Φ(x, t + Δt) − 2Φ(x, t) + Φ(x, t − Δt)]/Δt²

converges to the solution of the wave equation

∂²Φ/∂x² = (1/c²) ∂²Φ/∂t²

if Δt → 0, Δx → 0 so that Δt ≤ Δx/c.
(d) Reference 20.9 lists numerous other approximation formulas and stability conditions for the heat-conduction and wave equations.
20.9-9. Approximation-function Methods for Numerical Solution of Boundary-value Problems. (a) Approximation by Functions Which Satisfy the Boundary Conditions Exactly. Let x be either a one-dimensional variable or a multidimensional variable x ≡ (x₁, x₂, . . . , xₙ) (Sec. 15.4-1). It is desired to solve an ordinary or partial differential equation

LΦ(x) = f(x)    (x in V)    (20.9-4a)

subject to a boundary condition

BΦ(x) = b(x)    (x in S)    (20.9-4b)

on the boundary S of a given region V (Sec. 15.4-1). Approximate the desired solution Φ(x) by an approximation function

Φ(x) ≈ φ(x; α₁, α₂, . . . , α_m)    (20.9-5a)

which satisfies the given boundary conditions and depends on m parameters α₁, α₂, . . . , α_m. In many applications, the equations (4) are linear, and one approximates Φ(x) by a linear combination of known
functions

φ(x; α₁, α₂, . . . , α_m) = Σ_{k=1}^m α_k φ_k(x)    (20.9-5b)

where the φ_k(x) are known functions. Substitution of the approximation (5) into the given differential equation (4a) yields the error function

E(x; α₁, α₂, . . . , α_m) ≡ Lφ(x; α₁, α₂, . . . , α_m) − f(x)    (20.9-6)

Determine the parameters α₁, α₂, . . . , α_m in Eq. (5a) or (5b) by one of the following schemes:

1. Collocation. The m equations

E(X_i; α₁, α₂, . . . , α_m) = 0    (i = 1, 2, . . . , m)    (20.9-7)

written for m suitably chosen points X₁, X₂, . . . , X_m in V are solved for α₁, α₂, . . . , α_m.

2. Least-squares Approximations (see also Sec. 20.6-2). Choose the α_k so as to minimize the mean-square error

J(α₁, α₂, . . . , α_m) = ∫_V |E(ξ; α₁, α₂, . . . , α_m)|² dξ    (20.9-8)

or the cruder mean-square error

J'(α₁, α₂, . . . , α_m) ≡ Σ_{h=1}^N b_h |E(X_h; α₁, α₂, . . . , α_m)|²    (20.9-9)

where the b_h are suitably determined weighting coefficients associated with N chosen points X₁, X₂, . . . , X_N in V. One may, for instance, space the points X_h evenly and let b₁ = b₂ = · · · = b_N = 1. The α_k are determined by the m conditions

∂J/∂α_k = 0    or    ∂J'/∂α_k = 0    (k = 1, 2, . . . , m)    (20.9-10)

3. Galerkin's Weighting-function Method. Choose m linearly independent "weighting functions" Ψ₁(x), Ψ₂(x), . . . , Ψ_m(x) (one frequently uses Ψ_k = φ_k) and solve the m equations

∫_V Ψ_i*(ξ) E(ξ; α₁, α₂, . . . , α_m) dξ = 0    (i = 1, 2, . . . , m)    (20.9-11)
If the given equations (4) are linear (linear boundary-value problem), then an approximation of the form (5b) will yield linear equations (7), (10), or (11).

(b) Approximation by Functions Which Satisfy the Differential Equation Exactly. It is frequently preferable to use an approximation function (5) which satisfies the given differential equation (4a) exactly for all x in V and to determine the parameters α_k so as to match the given boundary conditions in the sense of the collocation method (boundary collocation), the least-squares method, or the Galerkin method.
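A small illustration of collocation (scheme 1): the test problem y'' + y + x = 0 with y(0) = y(1) = 0, the basis functions x^k(1 − x), and the collocation points are our choices (a standard textbook case with exact solution sin x/sin 1 − x):

```python
import math
import numpy as np

# Solve y'' + y + x = 0, y(0) = y(1) = 0, by collocation with
#   phi(x) = a1 x(1 - x) + a2 x^2 (1 - x),
# both basis functions satisfying the boundary conditions exactly.
# Error function: E(x) = phi'' + phi + x.
def E_coeffs(x):
    """Coefficients of (a1, a2) in the error function E(x)."""
    c1 = -2.0 + x - x**2                 # phi1'' + phi1,  phi1 = x - x^2
    c2 = 2.0 - 6.0*x + x**2 - x**3       # phi2'' + phi2,  phi2 = x^2 - x^3
    return c1, c2

X = [0.25, 0.75]                         # collocation points X_i
A = np.array([E_coeffs(x) for x in X])
b = np.array([-x for x in X])            # conditions E(X_i) = 0
a1, a2 = np.linalg.solve(A, b)

phi = lambda x: a1*(x - x**2) + a2*(x**2 - x**3)
exact = lambda x: math.sin(x)/math.sin(1.0) - x
```

With only two parameters and two collocation points, the approximation already agrees with the exact solution to about three decimal places at midinterval.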
20.9-10. Numerical Solution of Integral Equations (see also Sec. 15.3-2). (a) The solution Φ(x) of a linear integral equation

Φ(x) − λ∫_V K(x, ξ)Φ(ξ) dξ = F(x)    (20.9-12)

can often be approximated by the iteration method of Sec. 15.3-8a; or one may approximate the given kernel K(x, ξ) by a polynomial or other degenerate kernel (Sec. 15.3-1b) to simplify the solution. Eigenvalue problems (F ≡ 0) can often be treated as variation problems (Sec. 15.3-6).
(b) The approximation-function methods of Sec. 20.9-9 also apply directly to the numerical solution of a linear integral equation (12). Use an approximation of the form (5b) and compute the m functions

f_k(x) ≡ φ_k(x) − λ∫_V K(x, ξ)φ_k(ξ) dξ    (k = 1, 2, . . . , m)    (20.9-13)

Then

1. The collocation method yields the m linear equations

Σ_{k=1}^m α_k f_k(X_i) = F(X_i)    (i = 1, 2, . . . , m)    (20.9-14)

2. The least-squares method yields the m linear equations

Σ_{k=1}^m A_ik α_k = β_i    (i = 1, 2, . . . , m)

with

A_ik = Σ_{h=1}^N b_h f_i*(X_h) f_k(X_h)
β_i = Σ_{h=1}^N b_h f_i*(X_h) F(X_h)
(20.9-15)
3. Galerkin's method yields the m linear equations

Σ_{k=1}^m A_ik α_k = β_i    (i = 1, 2, . . . , m)

with

A_ik = ∫_V Ψ_i*(ξ) f_k(ξ) dξ    β_i = ∫_V Ψ_i*(ξ) F(ξ) dξ    (20.9-16)
20.10. MONTE-CARLO TECHNIQUES

20.10-1. Monte-Carlo Methods (Refs. 20.42 and 20.46). In principle, every Monte-Carlo computation may be considered as the estimation of a suitable integral

I = ∫_{−∞}^∞ ∫_{−∞}^∞ · · · ∫_{−∞}^∞ f(x₁, x₂, . . . , x_N) dΦ(x₁, x₂, . . . , x_N)    (20.10-1)

by a random-sample average

f̄(x₁, x₂, . . . , x_N) = (1/n) Σ_{k=1}^n f(ᵏx₁, ᵏx₂, . . . , ᵏx_N)    (20.10-2)

where (x₁, x₂, . . . , x_N) is a (generally multidimensional, N ≥ 1) random variable with known distribution function Φ(x₁, x₂, . . . , x_N).

Monte-Carlo techniques are widely useful in investigations of random-process phenomena too complicated for explicit solution by probability theory, viz., neutron-diffusion problems, detection and communication problems, and a vast variety of operations-research studies. In addition, it often pays to recast other types of problems, especially those involving complicated multidimensional integrals, in a form suitable for Monte-Carlo solution.
For simplicity, the following discussion is based on Monte-Carlo estimation of the one-dimensional integral

I = ∫_{−∞}^∞ f(x) dΦ(x)    (20.10-3)

by

f̄(x) = (1/n)[f(¹x) + f(²x) + · · · + f(ⁿx)]    (20.10-4)

The variance of the estimate (4) of (3), on the basis of a random sample (¹x, ²x, . . . , ⁿx), is

Var {f̄(x)} = (1/n) Var {f(x)}    (20.10-5)

so that the rms fluctuation decreases only as 1/√n with increasing n (Sec. 19.2-3). The estimate variance is due to the random fluctuation in the distribution of different samples (¹x, ²x, . . . , ⁿx).
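A minimal numerical illustration of the estimate (4) (the integrand, the sample size, and the seed are our choices; the same integrand reappears in the example of Sec. 20.10-2b):

```python
import numpy as np

rng = np.random.default_rng(12345)
f = lambda x: (np.exp(x) - 1.0) / (np.e - 1.0)

# Estimate I = integral of f(x) dx over (0, 1), with x uniformly
# distributed; the exact value is (e - 2)/(e - 1) = 0.41802...
n = 200_000
x = rng.random(n)
estimate = np.mean(f(x))
exact = (np.e - 2.0) / (np.e - 1.0)
```

With Var {f(x)} ≈ 0.082, Eq. (5) predicts an rms fluctuation of about 0.00064 for this sample size, consistent with the observed error.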
20.10-2. Two Variance-reduction Techniques (Ref. 20.46). The following techniques attempt to "doctor" the sample (¹x, ²x, . . . , ⁿx) so as to reduce the variance of the sample mean while still preserving the relation

E{f̄(x)} = E{f(x)} = I    (20.10-6)

i.e., without biasing the estimate.
(a) Stratified Sampling. One divides the range of the random variable x into a number of suitably chosen class intervals ξ_{j−1} < x ≤ ξ_j and agrees to fix the number n_j of otherwise independent sample values ⁱx_j (i = 1, 2, . . . , n_j) falling into the jth class interval. Assuming a priori knowledge of the probabilities

P_j = P[ξ_{j−1} < x ≤ ξ_j] = Φ(ξ_j) − Φ(ξ_{j−1})    (20.10-7)

associated with the class intervals (e.g., on the basis of symmetry, uniform distribution, etc.), one can employ the stratified-sample average

f̄(x)_STRAT = Σ_j P_j (1/n_j) Σ_{i=1}^{n_j} f(ⁱx_j)    (20.10-8)

as an unbiased estimate of I, with

Var {f̄(x)_STRAT} = Σ_j (P_j²/n_j) Var {f(x_j)}    (20.10-9)

Note that repeated stratified samples will differ only within class intervals. The variance (9) can be smaller than the random-sample variance Var {f(x)}/n with n = Σ_j n_j if a priori information permits a favorable choice of the ξ_j and n_j. In principle, it would be best to choose class intervals for equal variances

Var {f(x_j)} = (1/P_j)∫_{ξ_{j−1}}^{ξ_j} f²(λ) dΦ(λ) − [(1/P_j)∫_{ξ_{j−1}}^{ξ_j} f(λ) dΦ(λ)]²    (20.10-10)

and to use the sample sizes

n_j = nP_j    (20.10-11)

In this ideal case, one has the relatively small estimate variance

Var {f̄(x)_STRAT} = (1/n) Var {f(x_j)}    (20.10-12)

As the class intervals are decreased, the stratified-sampling technique will produce results analogous to those of an integration formula, but ordinarily the class intervals are larger; practical applications are usually multidimensional, so that simple symmetry relations may yield favorable class intervals.
(b) Use of Correlated Samples. If individual sample values ᵏx are not statistically independent (as they would be in a true random sample), the expression (5) for the estimate variance is replaced by

Var {f̄(x)_CORREL} = (1/n) Var {f(x)} + (2/n²) Σ_{i<k} Cov {f(ⁱx), f(ᵏx)}    (20.10-13)

(see also Sec. 19.8-1). Judiciously introduced negative correlation between selected sample-value pairs ⁱx, ᵏx will produce negative covariance terms in Eq. (13) and may reduce the variance well below the random-sample variance Var {f(x)}/n without biasing the estimate.

As a simple example (Ref. 20.42), let x be uniformly distributed between x = 0 and x = 1, and let f(x) be the monotonic function (eˣ − 1)/(e − 1). One designs the sample so that n is even, and ²x = 1 − ¹x, ⁴x = 1 − ³x, . . . , ⁿx = 1 − ⁿ⁻¹x, with sample values otherwise independent. Since f(x) and f(1 − x) are negatively correlated, Var {f̄(x)_CORREL} ≈ (1/31) Var {f̄(x)}, so that the rms fluctuation is reduced by a factor of about 5.6. In addition, the correlated sample requires one to generate fewer random numbers. More interesting applications are, again, to multidimensional problems. Note that stratified sampling, in effect, also introduces negative correlation between sample values: ᵏ⁺¹x can no longer fall into a given class interval if ᵏx has filled the latter.

20.10-3. Use of A Priori Information: Importance Sampling.
As a matter of principle, Monte-Carlo computations often can and should be simplified through judicious application of partial a priori knowledge of results. As a case in point, importance-sampling techniques attempt to estimate an integral (1) by a sample average of f(y)/g(y), where y is a random variable with the probability density

φ₁(y) = g(y) dΦ(y)/dy    (20.10-14)

The estimate is easily seen to be unbiased. The function g(y) is chosen so that

Var {f(y)/g(y)} = ∫_{−∞}^∞ [f²(λ)/g(λ)] dΦ(λ) − I²    (20.10-15)

is small, subject to the constraint

∫_{−∞}^∞ g(λ) dΦ(λ) = 1
20.10-4. Some Random-number Generators. Tests for Randomness (Refs. 20.9, 20.51, and 20.55). Congruential methods for generating pseudo-random numbers x_i less than a given nonnegative modulus m start with any nonnegative x₀ < m and compute successive values

x_i = [a x_{i−1} + c] mod m    (i = 1, 2, . . . ; 0 < a < m, 0 ≤ c < m)    (20.10-16)

where modulo-m addition is defined in the manner of Sec. 12.2-10. On a binary computing machine, the modulus m is conveniently chosen equal to 2 raised to the computer word length. For c = 0, the generator is called a multiplicative congruential generator; otherwise, it is a mixed congruential generator.

Sequences obtained in this manner are not truly random but may have "pseudo-random" properties, such as a uniform distribution between 0 and m, zero correlation between different x_i, random-appearing runs of odd and even numbers, etc. The uniform distribution may be tested with a χ² test (Sec. 19.6-7), and serial correlation may be tested in the manner of Sec. 19.7-4. Even with zero correlation, samples (x₁, x₂, . . . , xₙ) taken from a pseudo-random-number sequence will not be statistically independent, a fact which, depending on the specific application, can result in disagreeable surprises. It may well be wise to obtain true random samples for
Monte-Carlo computations by analog-to-digital conversion of true analog noise (Ref. 20.55).
Pseudo-random numbers having other than uniform distributions are readily obtained as functions F(xi) of uniformly distributed pseudo-random variables. A number of other methods are discussed in Ref. 20.9.
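Eq. (20.10-16) in executable form; the particular constants a, c, m below are a common textbook choice (those of the ANSI C example generator), not values prescribed by the text:

```python
def mixed_congruential(x0, a=1103515245, c=12345, m=2**31, count=10):
    """Generate pseudo-random numbers x_i = (a x_{i-1} + c) mod m,
    Eq. (20.10-16); c > 0 makes this a mixed congruential generator."""
    x, out = x0, []
    for _ in range(count):
        x = (a * x + c) % m
        out.append(x)
    return out

seq = mixed_congruential(1, count=10_000)
fractions = [x / 2**31 for x in seq]      # scaled to (0, 1)
mean = sum(fractions) / len(fractions)
```

The sequence is completely reproducible from the seed x₀, which is convenient for debugging Monte-Carlo programs; the χ² and serial-correlation tests mentioned above can be applied directly to `fractions`.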
20.11. RELATED TOPICS, REFERENCES, AND BIBLIOGRAPHY
20.11-1. Related Topics. The following topics related to the study of numerical computations are treated in other chapters of this handbook.

Theory of equations: Chap. 1
Ordinary differential equations: Chaps. 9, 13
Partial differential equations: Chap. 10
Linear and nonlinear programming, dynamic programming, calculus of variations: Chap. 11
Matrices, diagonalization: Chaps. 13, 14
Eigenvalue problems: Chaps. 14, 15
Boundary-value problems: Chap. 15
Regression: Chaps. 18, 19
20.11-2. References and Bibliography.

General

20.1. Abramowitz, M., and I. A. Stegun (eds.): Handbook of Mathematical Functions, National Bureau of Standards, Washington, D.C., 1964.
20.2. Booth, A. D.: Numerical Methods, Butterworth, London, 1955.
20.3. Collatz, L.: Numerical Methods, in Handbuch der Physik, vol. 2, Springer, Berlin, 1955.
20.4. Hamming, R. W.: Numerical Methods for Engineers and Scientists, McGraw-Hill, New York, 1962.
20.5. Henrici, P.: Elements of Numerical Analysis, Wiley, New York, 1964.
20.6. Hildebrand, F. B.: Introduction to Numerical Analysis, McGraw-Hill, New York, 1956.
20.7. Householder, A. S.: Principles of Numerical Analysis, McGraw-Hill, New York, 1953.
20.8. Jennings, W.: First Course in Numerical Methods, Macmillan, New York, 1964.
20.9. Klerer, M., and G. A. Korn (eds.): Digital Computer User'sHandbook, McGrawHill, New York, 1967.
20.10. Kopal, Z.: Numerical Analysis, Wiley, New York, 1955.
20.11. Kunz, K. S.: Numerical Analysis, McGraw-Hill, New York, 1957.
20.12. Lanczos, C.: Applied Analysis, Prentice-Hall, Englewood Cliffs, N.J., 1956.
20.13. Milne, W. E.: Numerical Solution of Differential Equations, Wiley, New York, 1953.
20.14. Noble, B.: Numerical Methods, Oliver & Boyd, London, 1964. 20.15. Ralston, A.: A First Course in Numerical Analysis, McGraw-Hill, New York, 1964.
20.16. Ralston, A., and H. S. Wilf: Mathematical Methods for Digital Computers, 2 vols., Wiley, New York, 1960 and 1967.
20.17. Salvadori, M. G., and M. L. Baron: Numerical Methods in Engineering, 2d ed., Prentice-Hall, Englewood Cliffs, N.J., 1961.
20.18. Scarborough, J. B.: Numerical Mathematical Analysis, 5th ed., Johns Hopkins, Baltimore, 1962.
20.19. Stiefel, E. L.: An Introduction to Numerical Mathematics, Academic, New York, 1963.
20.20. Todd, J.: Survey of Numerical Analysis, McGraw-Hill, New York, 1962.

Linear Equations, Matrix Problems, and Partial Differential Equations
20.21. Allen, D. N.: Relaxation Methods, McGraw-Hill, New York, 1954.
20.22. Fadeev, D. K., and V. N. Fadeeva: Computational Methods in Linear Algebra, Freeman, San Francisco, 1963.
20.23. Forsythe, G. E., and W. R. Wasow: Finite-difference Methods for Partial Differential Equations, Wiley, New York, 1960.
20.24. Fox, L.: An Introduction to Numerical Linear Algebra, Oxford, Fair Lawn, N.J., 1964.
20.25. Householder, A. S.: The Theory of Matrices in Numerical Analysis, Blaisdell, New York, 1964.
20.26. Paige, L. J., and O. Taussky: Simultaneous Linear Equations and the Deter mination of Eigenvalues, National Bureau of Standards Applied Mathematics Series 29, 1953.
20.27. Shaw, F. S.: An Introduction to Relaxation Methods, Dover, New York, 1953.
20.28. Southwell, R. V.: Relaxation Methods in Engineering Science, Oxford, Fair Lawn, N.J., 1940.
20.29. Southwell, R. V.: Relaxation Methods in Theoretical Physics, Oxford, Fair Lawn, N.J., 1946.
20.30. Varga, R. S.: Matrix Iterative Analysis, Prentice-Hall, Englewood Cliffs, N.J., 1962.
20.31. Wilkinson, J. H.: The Algebraic Eigenvalue Problem, Oxford, Fair Lawn, N.J., 1965.
(See also the articles by J. H. Wilkinson and by W. K. Karplus and V. Vemuri in Ref. 20.9.)
Finite Differences and Difference Equations
20.32. Goldberg, S.: Introduction to Difference Equations, Wiley, New York, 1958.
20.33. Jolley, L. B.: Summation of Series, Chapman & Hall, London, 1925; reprinted, Dover, New York, 1960.
20.34. Jordan, C: Calculus of Finite Differences, Chelsea, New York, 1947.
20.35. Jury, E. I.: Theory and Application of the Z-transform Method, Wiley, New York, 1964.
20.36. Milne-Thomson, L. M.: The Calculus of Finite Differences, Macmillan, London, 1951.
20.37. Ragazzini, J. R., and G. F. Franklin: Sampled-data Control Systems, McGraw-Hill, New York, 1958.

Approximation Methods
20.38. Carlson, B., and M. Goldstein: Rational Approximation of Functions, Los Alamos Scientific Laboratory, Rept. LA-1943, 1955.
20.39. Davis, P. J.: Interpolation and Approximation, Blaisdell, New York, 1963.
20.40. Dunham, C. B.: Convergence Problems in Maehly's Second Method, J. ACM, April, 1965.
20.41. Hastings, C: Approximations forDigital Computers, Princeton, Princeton, N.J., 1955.
20.42. Maehly, H.: First Interim Progress Report on Rational Approximations, Project NR-044-196, Princeton University, 1958.
20.43. Maehly, H.: Methods for Fitting Rational Approximations, J. ACM, 10: 257 (1963).
20.44. Meinardus, G.: Approximation von Funktionen undihrenumerische Behandlung, Springer, Berlin, 1964.
20.45. Rice, J. R.: The Approximation ofFunctions, Addison-Wesley, Reading, Mass., 1964.
20.46. Snyder, M. A.: Chebyshev Methods in Numerical Approximation, Prentice-Hall, Englewood Cliffs, N.J., 1966.

Miscellaneous
20.47. Bellman, R. E., and R. E. Kalaba: Quasilinearization and Nonlinear Boundary-value Problems, Elsevier, New York, 1965.
20.48. Collatz, L.: Eigenwertaufgaben mit technischen Anwendungen, Akademische Verlagsgesellschaft m.b.H., Leipzig, 1949.
20.49. Fox, L.: Numerical Solution of Two-point Boundary-value Problems, Oxford, Fair Lawn, N.J., 1957.
20.50. Fox, L.: Numerical Solution of Ordinary and Partial Differential Equations, Pergamon Press, New York, 1962.
20.51. Hammersley, J. M., and D. C. Handscomb: Monte Carlo Methods, Wiley, New York, 1964.
20.52. Henrici, P.: Discrete-variable Methods in OrdinaryDifferential Equations, Wiley, New York, 1963.
20.53. Henrici, P.: Error Propagation for Difference Methods, Wiley, New York, 1963.
20.54. Ostrowski, A. M.: Solution of Equations and Systems of Equations, Academic, New York, 1960.
20.55. Schreider, Y. A.: Method of Statistical Testing (Monte Carlo Method), Elsevier, New York, 1964.
20.56. Stroud, A. H., and D. Secrest: Gaussian Quadrature Formulas, Prentice-Hall, Englewood Cliffs, N.J., 1966.
20.57. Wilde, D. J.: Optimum-seeking Methods, Prentice-Hall, Englewood Cliffs, N.J., 1964.
20.58. Wilkinson, J. H.: Rounding Errors in Algebraic Processes, Prentice-Hall, Englewood Cliffs, N.J., 1964.
20.59. Rall, L. B. (ed.): Error in Digital Computation, Wiley, New York, 1965.
20.60. Roberts, S. M., and J. S. Shipman: The Kantorovich Theorem and Two-point Boundary-value Problems, IBM J. Res., September, 1966.
20.61. Remez, E.: General Computation Methods for Chebyshev Approximation, Izv. Akad. Nauk Ukrainsk. SSR, Kiev, 1957.
20.62. Whittaker, E., and G. Robinson: The Calculus of Observations, 4th ed., Blackie, Glasgow, 1944.
20.63. Lindorff, D. P.: Theory of Sampled-data Control Systems, Wiley, New York, 1965.
20.64. Fehlberg, E.: New One-step Integration Methods of High-order Accuracy, NASA TR R-248, George C. Marshall Space Flight Center, Huntsville, Ala., 1966.
REFERENCES AND BIBLIOGRAPHY
20.65. Huskey, H. D., and G. A. Korn (eds.): Computer Handbook, McGraw-Hill, New York, 1962.
20.66. Korn, G. A., and T. M. Korn: Electronic Analog and Hybrid Computers, McGraw-Hill, New York, 1964.
20.67. Brand, L.: Differential and Difference Equations, Wiley, New York, 1966.
20.68. Warten, R. M.: Automatic Step-size Control for Runge-Kutta Integration, IBM J. Res., October, 1963.
20.69. Beckett, R., and J. Hurt: Numerical Calculations and Algorithms, McGraw-Hill, New York, 1967.
20.70. Cooley, J. W., and J. W. Tukey: An Algorithm for the Machine Calculation of Complex Fourier Series, Math. Comp., 19:297, April, 1965.
CHAPTER 21
SPECIAL FUNCTIONS
21.1. Introduction
  21.1-1. Introductory Remarks

21.2. The Elementary Transcendental Functions
  21.2-1. The Trigonometric Functions
  21.2-2. Relations between the Trigonometric Functions
  21.2-3. Addition Formulas and Multiple-angle Formulas
  21.2-4. The Inverse Trigonometric Functions
  21.2-5. Hyperbolic Functions
  21.2-6. Relations between the Hyperbolic Functions
  21.2-7. Formulas Relating Hyperbolic Functions of Compound Arguments
  21.2-8. Inverse Hyperbolic Functions
  21.2-9. Relations between Exponential, Trigonometric, and Hyperbolic Functions
  21.2-10. Decomposition of the Logarithm
  21.2-11. Relations between Inverse Trigonometric, Inverse Hyperbolic, and Logarithmic Functions
  21.2-12. Power Series and Other Expansions
  21.2-13. Some Useful Inequalities

21.3. Some Functions Defined by Transcendental Integrals
  21.3-1. Sine, Cosine, Exponential, and Logarithmic Integrals
  21.3-2. Fresnel Integrals and Error Function

21.4. The Gamma Function and Related Functions
  21.4-1. The Gamma Function: (a) Integral Representations; (b) Other Representations of Γ(z); (c) Functional Equations
  21.4-2. Stirling's Expansions for Γ(z) and n!
  21.4-3. The Psi Function
  21.4-4. Beta Functions
  21.4-5. Incomplete Gamma and Beta Functions

21.5. Binomial Coefficients and Factorial Polynomials. Bernoulli Polynomials and Bernoulli Numbers
  21.5-1. Binomial Coefficients and Factorial Polynomials
  21.5-2. Bernoulli Polynomials and Bernoulli Numbers: (a) Definitions; (b) Miscellaneous Properties
  21.5-3. Formulas Relating Polynomials and Factorial Polynomials
  21.5-4. Approximation Formulas for Binomial Coefficients

21.6. Elliptic Functions, Elliptic Integrals, and Related Functions
  21.6-1. Elliptic Functions: General Properties
  21.6-2. Weierstrass's ℘ Function
  21.6-3. Weierstrass's ζ and σ Functions
  21.6-4. Elliptic Integrals
  21.6-5. Reduction of Elliptic Integrals: (a) Formal Reduction Procedure; (b) Change of Variables. Weierstrass's and Riemann's Normal Forms; (c) Reduction to Legendre's Normal Form
  21.6-6. Legendre's Normal Elliptic Integrals: (a) Definitions; (b) Legendre's Complete Normal Elliptic Integrals; (c) Transformations
  21.6-7. Jacobi's Elliptic Functions: (a) Definitions; (b) Miscellaneous Properties and Special Values; (c) Addition Formulas; (d) Differentiation; (e) Transformations; (f) Series Expansions
  21.6-8. Jacobi's Theta Functions
  21.6-9. Relations between Jacobi's Elliptic Functions, Weierstrass's Elliptic Functions, and Theta Functions

21.7. Orthogonal Polynomials
  21.7-1. Survey
  21.7-2. Real Zeros of Orthogonal Polynomials
  21.7-3. Legendre Functions
  21.7-4. Chebyshev Polynomials of the First and Second Kind
  21.7-5. Associated Laguerre Polynomials and Functions
  21.7-6. Hermite Functions
  21.7-7. Some Integral Formulas
  21.7-8. Jacobi Polynomials and Gegenbauer Polynomials

21.8. Cylinder Functions, Associated Legendre Functions, and Spherical Harmonics
  21.8-1. Bessel Functions and Other Cylinder Functions
  21.8-2. Integral Formulas: (a) Integral Representations of J0(z), J1(z), J2(z), . . .; (b) Sommerfeld's and Poisson's Formulas; (c) Miscellaneous Integral Formulas Involving Cylinder Functions
  21.8-3. Zeros of Cylinder Functions
  21.8-4. The Bessel Functions J0(z), J1(z), J2(z), . . .: (a) Generation by Series Expansions; (b) Behavior for Real Arguments; (c) Orthogonality Relations
  21.8-5. Solution of Differential Equations in Terms of Cylinder Functions and Related Functions
  21.8-6. Modified Bessel and Hankel Functions
  21.8-7. The Functions ber_m z, bei_m z, her_m z, hei_m z, ker_m z, kei_m z
  21.8-8. Spherical Bessel Functions
  21.8-9. Asymptotic Expansion of Cylinder Functions and Spherical Bessel Functions for Large Absolute Values of z
  21.8-10. Associated Legendre Functions and Polynomials
  21.8-11. Integral Formulas Involving Associated Legendre Polynomials
  21.8-12. Spherical Harmonics. Orthogonality
  21.8-13. Addition Theorems: (a) Addition Theorem for Cylinder Functions; (b) Addition Theorems for Spherical Bessel Functions and Legendre Polynomials

21.9. Step Functions and Symbolic Impulse Functions
  21.9-1. Step Functions
  21.9-2. The Symbolic Dirac Delta Function
  21.9-3. "Derivatives" of Step Functions and Impulse Functions
  21.9-4. Approximation of Impulse Functions: (a) Continuously Differentiable Functions Approximating δ(x); (b) Discontinuous Functions Approximating δ(x); (c) Functions Approximating δ'(x), δ''(x), . . . , δ^(n)(x)
  21.9-6. Asymmetrical Impulse Functions
  21.9-7. Multidimensional Delta Functions

21.10. References and Bibliography
21.1. INTRODUCTION
21.1-1. Chapter 21 is essentially a collection of formulas relating to special functions. Refer to Chap. 7 for the relevant complex-variable theory, and to Chaps. 9, 10, and 15 for a treatment of differential equations. References 21.3 and 21.9 deal with the less frequently encountered special transcendental functions.

21.2. THE ELEMENTARY TRANSCENDENTAL FUNCTIONS
21.2-1. The Trigonometric Functions (see also Secs. 21.1-1 and 21.2-12 and Table 7.2-1). (a) The trigonometric functions w = sin z, w = cos z are defined by their power series (Sec. 21.2-12), as solutions of the differential equation d²w/dz² + w = 0, by z = arcsin w, z = arccos w (integral representation, Sec. 21.2-4), or, for real z, in terms of right-triangle geometry (goniometry, Fig. 21.2-1). The remaining trigonometric functions are defined by

$$\tan z = \frac{\sin z}{\cos z} \qquad \cot z = \frac{\cos z}{\sin z} = \frac{1}{\tan z} \tag{21.2-1}$$

$$\sec z = \frac{1}{\cos z} \qquad \operatorname{cosec} z = \frac{1}{\sin z} \tag{21.2-2}$$

(b) sin z and cos z are periodic with period 2π; tan z and cot z are periodic with period π. sin z, tan z, and cot z are odd functions, whereas cos z is an even function. Figure 21.2-2 shows graphs of sin z, cos z, tan z, and cot z for real arguments. Figure 21.2-3 shows triangles which
serve as memory aids for the derivation of function values for z = π/6 = 30 deg, π/4 = 45 deg, and π/3 = 60 deg (see also Table 21.2-1).

Fig. 21.2-1. Definitions of circular measure and of the trigonometric functions for a given angle φ:

$$\sin \varphi = \frac{y}{r} \qquad \cos \varphi = \frac{x}{r} \qquad \tan \varphi = \frac{y}{x} = \frac{\sin \varphi}{\cos \varphi}$$
$$\cot \varphi = \frac{x}{y} = \frac{1}{\tan \varphi} \qquad \sec \varphi = \frac{r}{x} = \frac{1}{\cos \varphi} \qquad \operatorname{cosec} \varphi = \frac{r}{y} = \frac{1}{\sin \varphi}$$

Fig. 21.2-2. Plots of the trigonometric functions for real arguments.

Table 21.2-1. Special Values of Trigonometric Functions

A (degrees)    0      30       45       60       90      180     270     360
A (radians)    0      π/6      π/4      π/3      π/2     π       3π/2    2π
sin A          0      1/2      ½√2      ½√3      1       0       −1      0
cos A          1      ½√3      ½√2      1/2      0       −1      0       1
tan A          0      1/√3     1        √3       ±∞      0       ±∞      0
cot A          ±∞     √3       1        1/√3     0       ±∞      0       ±∞

Fig. 21.2-3. Special triangles for deriving the trigonometric functions of 30 deg, 45 deg, and 60 deg.

Table 21.2-2. Relations between Trigonometric Functions of Different Arguments

         −A         90° ± A     180° ± A    270° ± A    n·360° ± A
sin     −sin A       cos A      ∓sin A      −cos A      ±sin A
cos      cos A      ∓sin A      −cos A      ±sin A       cos A
tan     −tan A      ∓cot A      ±tan A      ∓cot A      ±tan A
cot     −cot A      ∓tan A      ±cot A      ∓tan A      ±cot A

(c) The relations

$$\sin z = \cos\left(\frac{\pi}{2} - z\right) \qquad \cos z = \sin\left(\frac{\pi}{2} - z\right) \qquad \tan z = \cot\left(\frac{\pi}{2} - z\right) \tag{21.2-3}$$

permit one to express trigonometric functions of any real argument in terms of function values for arguments between 0 and π/2 = 90 deg (Table 21.2-2 and Fig. 21.2-1).
21.2-2. Relations between the Trigonometric Functions (see also Sec. 21.2-6). The basic relations

$$\sin^2 z + \cos^2 z = 1 \qquad \frac{\sin z}{\cos z} = \tan z = \frac{1}{\cot z} \tag{21.2-4}$$

yield

$$\sin z = \pm\sqrt{1-\cos^2 z} = \frac{\tan z}{\pm\sqrt{1+\tan^2 z}} = \frac{1}{\pm\sqrt{1+\cot^2 z}}$$
$$\cos z = \pm\sqrt{1-\sin^2 z} = \frac{1}{\pm\sqrt{1+\tan^2 z}} = \frac{\cot z}{\pm\sqrt{1+\cot^2 z}}$$
$$\tan z = \frac{\sin z}{\pm\sqrt{1-\sin^2 z}} = \frac{\pm\sqrt{1-\cos^2 z}}{\cos z} = \frac{1}{\cot z}$$
$$\cot z = \frac{\pm\sqrt{1-\sin^2 z}}{\sin z} = \frac{\cos z}{\pm\sqrt{1-\cos^2 z}} = \frac{1}{\tan z} \tag{21.2-5}$$
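A quick numerical spot-check of (21.2-4) and (21.2-5) — a sketch using Python's standard math module (the variable names are ours, not the handbook's); the angle is taken in the first quadrant so that every square root carries its positive sign:

```python
import math

z = 0.7  # any angle in (0, pi/2)
s, c, t = math.sin(z), math.cos(z), math.tan(z)

# (21.2-4): basic relations
assert abs(s**2 + c**2 - 1.0) < 1e-12
assert abs(s / c - t) < 1e-12

# (21.2-5): each function expressed through the others (positive roots)
assert abs(s - t / math.sqrt(1 + t**2)) < 1e-12      # sin z from tan z
assert abs(c - 1 / math.sqrt(1 + t**2)) < 1e-12      # cos z from tan z
assert abs(t - s / math.sqrt(1 - s**2)) < 1e-12      # tan z from sin z
assert abs(1 / t - c / math.sqrt(1 - c**2)) < 1e-12  # cot z from cos z
```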
21.2-3. Addition Formulas and Multiple-angle Formulas. The basic relation

$$\sin(A + B) = \sin A \cos B + \sin B \cos A \tag{21.2-6}$$

yields

$$\sin(A \pm B) = \sin A\cos B \pm \cos A\sin B \qquad \cos(A \pm B) = \cos A\cos B \mp \sin A\sin B$$
$$\tan(A \pm B) = \frac{\tan A \pm \tan B}{1 \mp \tan A\tan B} \qquad \cot(A \pm B) = \frac{\cot A\cot B \mp 1}{\cot B \pm \cot A} \tag{21.2-7}$$

$$\sin 2A = 2\sin A\cos A \qquad \cos 2A = \cos^2 A - \sin^2 A = 2\cos^2 A - 1 = 1 - 2\sin^2 A$$
$$\tan 2A = \frac{2\tan A}{1 - \tan^2 A} \qquad \cot 2A = \frac{\cot^2 A - 1}{2\cot A} = \frac{1}{2}(\cot A - \tan A) \tag{21.2-8}$$

$$\sin\frac{A}{2} = \pm\sqrt{\frac{1-\cos A}{2}} \qquad \cos\frac{A}{2} = \pm\sqrt{\frac{1+\cos A}{2}}$$
$$\tan\frac{A}{2} = \frac{1-\cos A}{\sin A} = \frac{\sin A}{1+\cos A} \qquad \cot\frac{A}{2} = \frac{1+\cos A}{\sin A} = \frac{\sin A}{1-\cos A} \tag{21.2-9}$$

$$a\sin A + b\cos A = r\sin(A + B) = r\cos(90^\circ - A - B) \qquad \left(r = +\sqrt{a^2+b^2},\ \tan B = \frac{b}{a}\right) \tag{21.2-10}$$

$$\sin A \pm \sin B = 2\sin\frac{A \pm B}{2}\cos\frac{A \mp B}{2}$$
$$\cos A + \cos B = 2\cos\frac{A+B}{2}\cos\frac{A-B}{2} \qquad \cos A - \cos B = -2\sin\frac{A+B}{2}\sin\frac{A-B}{2}$$
$$\tan A \pm \tan B = \frac{\sin(A \pm B)}{\cos A\cos B} \qquad \cot A \pm \cot B = \frac{\sin(B \pm A)}{\sin A\sin B} \tag{21.2-11}$$

$$2\cos A\cos B = \cos(A-B) + \cos(A+B) \qquad 2\sin A\sin B = \cos(A-B) - \cos(A+B)$$
$$2\sin A\cos B = \sin(A-B) + \sin(A+B) \qquad 2\cos^2 A = 1 + \cos 2A \qquad 2\sin^2 A = 1 - \cos 2A \tag{21.2-12}$$

$$\sin nA = \binom{n}{1}\cos^{n-1}A\sin A - \binom{n}{3}\cos^{n-3}A\sin^3 A + \binom{n}{5}\cos^{n-5}A\sin^5 A \mp \cdots$$
$$\cos nA = \cos^n A - \binom{n}{2}\cos^{n-2}A\sin^2 A + \binom{n}{4}\cos^{n-4}A\sin^4 A \mp \cdots \tag{21.2-13}$$

If n is an odd integer,

$$\sin^n z = \frac{(-1)^{(n-1)/2}}{2^{n-1}}\left[\sin nz - \binom{n}{1}\sin(n-2)z + \binom{n}{2}\sin(n-4)z - \cdots + (-1)^{(n-1)/2}\binom{n}{(n-1)/2}\sin z\right] \tag{21.2-14}$$
$$\cos^n z = \frac{1}{2^{n-1}}\left[\cos nz + \binom{n}{1}\cos(n-2)z + \binom{n}{2}\cos(n-4)z + \cdots + \binom{n}{(n-1)/2}\cos z\right]$$

If n is an even integer,

$$\sin^n z = \frac{(-1)^{n/2}}{2^{n-1}}\left[\cos nz - \binom{n}{1}\cos(n-2)z + \binom{n}{2}\cos(n-4)z - \cdots + (-1)^{n/2-1}\binom{n}{n/2-1}\cos 2z\right] + \frac{1}{2^n}\binom{n}{n/2} \tag{21.2-15}$$
$$\cos^n z = \frac{1}{2^{n-1}}\left[\cos nz + \binom{n}{1}\cos(n-2)z + \binom{n}{2}\cos(n-4)z + \cdots + \binom{n}{n/2-1}\cos 2z\right] + \frac{1}{2^n}\binom{n}{n/2}$$
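The multiple-angle expansions (21.2-13) lend themselves to a direct numerical check; the helper functions below are an illustrative sketch, not part of the handbook:

```python
import math

def sin_nA(n, A):
    """sin nA per (21.2-13): alternating sum over odd k of C(n,k) cos^(n-k)A sin^k A."""
    return sum((-1) ** ((k - 1) // 2) * math.comb(n, k)
               * math.cos(A) ** (n - k) * math.sin(A) ** k
               for k in range(1, n + 1, 2))

def cos_nA(n, A):
    """cos nA per (21.2-13): alternating sum over even k of C(n,k) cos^(n-k)A sin^k A."""
    return sum((-1) ** (k // 2) * math.comb(n, k)
               * math.cos(A) ** (n - k) * math.sin(A) ** k
               for k in range(0, n + 1, 2))

A = 0.4
for n in range(1, 9):
    assert abs(sin_nA(n, A) - math.sin(n * A)) < 1e-12
    assert abs(cos_nA(n, A) - math.cos(n * A)) < 1e-12
```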
21.2-4. The Inverse Trigonometric Functions (see also Table 7.2-1).* (a) The inverse trigonometric functions w = arcsin z, w = arccos z, w = arctan z, w = arccot z are respectively defined by z = sin w, z = cos w, z = tan w, z = cot w, or by

$$\arcsin z = \int_0^z\frac{dz}{\sqrt{1-z^2}} \qquad \arccos z = \int_z^1\frac{dz}{\sqrt{1-z^2}}$$
$$\arctan z = \int_0^z\frac{dz}{1+z^2} \qquad \operatorname{arccot} z = \int_z^\infty\frac{dz}{1+z^2} \tag{21.2-16}$$

Figure 21.2-4 shows plots of the inverse trigonometric functions for real arguments; note that arcsin z and arccos z are real if and only if z is real and |z| ≤ 1. All four functions are infinitely-many-valued because of the periodicity of the trigonometric functions. For real arguments, the principal value of arcsin z and arctan z is that between −π/2 and π/2 (see also Fig. 21.2-4); the principal value of arccos z and arccot z is that between 0 and π (see also Fig. 7.4-1).

Fig. 21.2-4. Plots of the inverse trigonometric functions.

(b) Note

$$\arcsin a \pm \arcsin b = \arcsin\left(a\sqrt{1-b^2} \pm b\sqrt{1-a^2}\right) = \arccos\left(\sqrt{1-a^2}\sqrt{1-b^2} \mp ab\right)$$
$$\arccos a \pm \arccos b = \arccos\left(ab \mp \sqrt{1-a^2}\sqrt{1-b^2}\right) = \arcsin\left(b\sqrt{1-a^2} \pm a\sqrt{1-b^2}\right)$$
$$\arctan a \pm \arctan b = \arctan\frac{a \pm b}{1 \mp ab} \tag{21.2-17}$$

* The functions arcsin z, arccos z, arctan z, and arccot z are often denoted by sin⁻¹ z, cos⁻¹ z, tan⁻¹ z, and cot⁻¹ z, respectively. This notation tends to be misleading and is not recommended.
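As a numerical sanity check of the addition formulas (21.2-17) — a sketch using Python's standard math module, with arguments chosen so the principal values add without an extra multiple of π:

```python
import math

# arctan addition (21.2-17), for pairs with ab < 1:
for a, b in [(0.3, 0.5), (0.2, -0.7), (-0.4, 0.1)]:
    lhs = math.atan(a) + math.atan(b)
    rhs = math.atan((a + b) / (1 - a * b))
    assert abs(lhs - rhs) < 1e-12

# arcsin addition, small positive arguments (principal branch):
a, b = 0.3, 0.4
lhs = math.asin(a) + math.asin(b)
rhs = math.asin(a * math.sqrt(1 - b * b) + b * math.sqrt(1 - a * a))
assert abs(lhs - rhs) < 1e-12
```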
21.2-5. Hyperbolic Functions (see also Fig. 21.2-5 and Table 7.2-1). The hyperbolic functions* w = sinh z, w = cosh z are defined by their power series (Sec. 21.2-12), as solutions of the differential equation d²w/dz² − w = 0, or simply by

$$\sinh z = \frac{e^z - e^{-z}}{2} \qquad \cosh z = \frac{e^z + e^{-z}}{2} \tag{21.2-18}$$

Four additional hyperbolic functions are defined as

$$\tanh z = \frac{\sinh z}{\cosh z} \qquad \coth z = \frac{\cosh z}{\sinh z} \qquad \operatorname{sech} z = \frac{1}{\cosh z} \qquad \operatorname{cosech} z = \frac{1}{\sinh z} \tag{21.2-19}$$

Geometrical Interpretation of sinh t and cosh t for Real t. If t/2 is the area bounded by the rectangular hyperbola (Sec. 2.5-2b) x² − y² = 1, the x axis, and the radius vector of the point (x, y) on the hyperbola, then y = sinh t, x = cosh t. Note that, if the hyperbola is replaced by the circle x² + y² = 1, then y = sin t, x = cos t.
21.2-6. Relations between the Hyperbolic Functions (see also Sec. 21.2-8). The basic relations

$$\cosh^2 z - \sinh^2 z = 1 \qquad \frac{\sinh z}{\cosh z} = \tanh z = \frac{1}{\coth z} \tag{21.2-20}$$

yield

$$\sinh z = \pm\sqrt{\cosh^2 z - 1} = \frac{\tanh z}{\pm\sqrt{1-\tanh^2 z}} = \frac{1}{\pm\sqrt{\coth^2 z - 1}}$$
$$\cosh z = \pm\sqrt{1+\sinh^2 z} = \frac{1}{\pm\sqrt{1-\tanh^2 z}} = \frac{\coth z}{\pm\sqrt{\coth^2 z - 1}}$$
$$\tanh z = \frac{\sinh z}{\pm\sqrt{1+\sinh^2 z}} = \frac{\pm\sqrt{\cosh^2 z - 1}}{\cosh z} = \frac{1}{\coth z}$$
$$\coth z = \frac{\pm\sqrt{1+\sinh^2 z}}{\sinh z} = \frac{\cosh z}{\pm\sqrt{\cosh^2 z - 1}} = \frac{1}{\tanh z} \tag{21.2-21}$$

* The symbols Sin z, Cos z, Tan z, Cot z are also used.
21.2-7. Formulas Relating Hyperbolic Functions of Compound Arguments (these formulas may also be derived from the corresponding formulas for trigonometric functions by using the relations of Sec. 21.2-9).

$$\sinh(A \pm B) = \sinh A\cosh B \pm \cosh A\sinh B \qquad \cosh(A \pm B) = \cosh A\cosh B \pm \sinh A\sinh B$$
$$\tanh(A \pm B) = \frac{\tanh A \pm \tanh B}{1 \pm \tanh A\tanh B} \qquad \coth(A \pm B) = \frac{\coth A\coth B \pm 1}{\coth B \pm \coth A} \tag{21.2-22}$$

$$\sinh 2A = 2\sinh A\cosh A \qquad \cosh 2A = \cosh^2 A + \sinh^2 A$$
$$\tanh 2A = \frac{2\tanh A}{1+\tanh^2 A} \qquad \coth 2A = \frac{\coth^2 A + 1}{2\coth A} \tag{21.2-23}$$

$$\sinh\frac{A}{2} = \pm\sqrt{\frac{\cosh A - 1}{2}} \qquad \cosh\frac{A}{2} = \pm\sqrt{\frac{\cosh A + 1}{2}}$$
$$\tanh\frac{A}{2} = \frac{\cosh A - 1}{\sinh A} = \frac{\sinh A}{\cosh A + 1} \qquad \coth\frac{A}{2} = \frac{\sinh A}{\cosh A - 1} = \frac{\cosh A + 1}{\sinh A} \tag{21.2-24}$$

$$\sinh A \pm \sinh B = 2\sinh\frac{A \pm B}{2}\cosh\frac{A \mp B}{2}$$
$$\cosh A + \cosh B = 2\cosh\frac{A+B}{2}\cosh\frac{A-B}{2} \qquad \cosh A - \cosh B = 2\sinh\frac{A+B}{2}\sinh\frac{A-B}{2}$$
$$\tanh A \pm \tanh B = \frac{\sinh(A \pm B)}{\cosh A\cosh B} \qquad \coth A \pm \coth B = \frac{\sinh(B \pm A)}{\sinh A\sinh B} \tag{21.2-25}$$

$$2\cosh A\cosh B = \cosh(A+B) + \cosh(A-B) \qquad 2\sinh A\sinh B = \cosh(A+B) - \cosh(A-B)$$
$$2\sinh A\cosh B = \sinh(A+B) + \sinh(A-B) \qquad 2\cosh^2 A = 1 + \cosh 2A \qquad 2\sinh^2 A = \cosh 2A - 1 \tag{21.2-26}$$
21.2-8. Inverse Hyperbolic Functions (see also Sec. 21.2-4). The inverse hyperbolic functions w = sinh⁻¹ z, w = cosh⁻¹ z, w = tanh⁻¹ z are respectively defined by z = sinh w, z = cosh w, z = tanh w,* or by integrals in the manner of Sec. 21.2-4. Note

$$\sinh^{-1}a \pm \sinh^{-1}b = \sinh^{-1}\left(a\sqrt{b^2+1} \pm b\sqrt{a^2+1}\right) = \cosh^{-1}\left(\sqrt{a^2+1}\sqrt{b^2+1} \pm ab\right)$$
$$\cosh^{-1}a \pm \cosh^{-1}b = \cosh^{-1}\left(ab \pm \sqrt{a^2-1}\sqrt{b^2-1}\right) = \sinh^{-1}\left(b\sqrt{a^2-1} \pm a\sqrt{b^2-1}\right)$$
$$\tanh^{-1}a \pm \tanh^{-1}b = \tanh^{-1}\frac{a \pm b}{1 \pm ab} \tag{21.2-27}$$
21.2-9. Relations between Exponential, Trigonometric, and Hyperbolic Functions (see also Secs. 1.2-3 and 21.2-12 and Table 7.2-1).

$$e^{iz} = \cos z + i\sin z \qquad \cos z = \frac{e^{iz}+e^{-iz}}{2} \qquad \sin z = \frac{e^{iz}-e^{-iz}}{2i} \tag{21.2-28}$$
$$e^{-iz} = \cos z - i\sin z \tag{21.2-29}$$
$$e^z = \cosh z + \sinh z \qquad e^{-z} = \cosh z - \sinh z \tag{21.2-30}$$
$$\cosh z = \frac{e^z+e^{-z}}{2} \qquad \sinh z = \frac{e^z-e^{-z}}{2} \tag{21.2-31}$$

$$\begin{aligned}
\cos z &= \cosh iz & \cosh z &= \cos iz\\
\sin z &= -i\sinh iz & \sinh z &= -i\sin iz\\
\tan z &= -i\tanh iz & \tanh z &= -i\tan iz\\
\cot z &= i\coth iz & \coth z &= i\cot iz
\end{aligned} \tag{21.2-32}$$

$$a^{ix} = e^{ix\log_e a} = \cos(x\log_e a) + i\sin(x\log_e a) \qquad x^i = e^{i\log_e x} = \cos(\log_e x) + i\sin(\log_e x)$$
$$i^x = e^{x\log_e i} = \cos\frac{\pi x}{2} + i\sin\frac{\pi x}{2} \qquad e^{2n\pi i} = 1 \qquad e^{(2n+1)\pi i} = -1 \qquad (n = 0, \pm 1, \pm 2, \ldots) \tag{21.2-33}$$
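The trigonometric–hyperbolic relations (21.2-28) and (21.2-32) can be spot-checked for a complex argument with Python's standard cmath module (a sketch; the test point is arbitrary):

```python
import cmath

z = 0.8 + 0.3j  # arbitrary complex test point

assert abs(cmath.exp(1j * z) - (cmath.cos(z) + 1j * cmath.sin(z))) < 1e-12  # (21.2-28)
assert abs(cmath.cos(z) - cmath.cosh(1j * z)) < 1e-12                       # (21.2-32)
assert abs(cmath.sin(z) - (-1j) * cmath.sinh(1j * z)) < 1e-12
assert abs(cmath.tan(z) - (-1j) * cmath.tanh(1j * z)) < 1e-12
assert abs(cmath.cosh(z) - cmath.cos(1j * z)) < 1e-12
assert abs(cmath.sinh(z) - (-1j) * cmath.sin(1j * z)) < 1e-12
```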
21.2-10. Decomposition of the Logarithm (see also Secs. 1.2-3 and 21.2-12 and Table 7.2-1).

$$\log_e z = \log_e|z| + i\arg(z) \tag{21.2-34}$$

For real x > 0,

$$\log_e(ix) = \log_e x + (2n + \tfrac{1}{2})\pi i \qquad \log_e(-x) = \log_e x + (2n+1)\pi i \qquad (n = 0, \pm 1, \pm 2, \ldots) \tag{21.2-35}$$

* This notation is the usual one in English-speaking countries, although it tends to be misleading (see also Sec. 21.2-4). An alternative notation is ar sinh z, ar cosh z, ar tanh z, or Ar Sin z, Ar Cos z, Ar Tan z.
Fig. 21.2-5. Hyperbolic functions sinh x, cosh x, tanh x. (From Baumeister and Marks, Mechanical Engineers' Handbook, 6th ed., McGraw-Hill, New York, 1958.)

Fig. 21.2-6. Exponential functions and logarithms. (From Baumeister and Marks, Mechanical Engineers' Handbook, 6th ed., McGraw-Hill, New York, 1958.)
21.2-11. Relations between Inverse Trigonometric, Inverse Hyperbolic, and Logarithmic Functions.

$$\begin{aligned}
\arccos z &= i\cosh^{-1}z & \cosh^{-1}z &= i\arccos z\\
\arcsin z &= -i\sinh^{-1}iz & \sinh^{-1}z &= -i\arcsin iz\\
\arctan z &= -i\tanh^{-1}iz & \tanh^{-1}z &= -i\arctan iz\\
\operatorname{arccot} z &= i\coth^{-1}iz & \coth^{-1}z &= i\operatorname{arccot} iz
\end{aligned} \tag{21.2-36}$$

$$\begin{aligned}
\arccos z &= -i\log_e\left(z + i\sqrt{1-z^2}\right) & \cosh^{-1}z &= \log_e\left(z + \sqrt{z^2-1}\right)\\
\arcsin z &= -i\log_e\left(iz + \sqrt{1-z^2}\right) & \sinh^{-1}z &= \log_e\left(z + \sqrt{z^2+1}\right)\\
\arctan z &= -\frac{i}{2}\log_e\frac{1+iz}{1-iz} & \tanh^{-1}z &= \frac{1}{2}\log_e\frac{1+z}{1-z}\\
\operatorname{arccot} z &= -\frac{i}{2}\log_e\frac{iz-1}{iz+1} & \coth^{-1}z &= \frac{1}{2}\log_e\frac{z+1}{z-1}
\end{aligned} \tag{21.2-37}$$
21.2-12. Power Series and Other Expansions. Power-series expansions, as well as some product and continued-fraction expansions for the elementary transcendental functions, are tabulated in Secs. E-7 to E-9 of Appendix E. See also Secs. 20.6-1 to 20.6-5 and Tables 20.6-2 to 20.6-4 for other numerical approximations.
21.2-13. Some Useful Inequalities (see also Figs. 21.2-2 and 21.2-6). For real x,

$$\sin x < x < \tan x \qquad \left(0 < x < \frac{\pi}{2}\right)$$
$$\sin x > \frac{2x}{\pi} \qquad \left(0 < x < \frac{\pi}{2}\right)$$
$$\cos x < \frac{\sin x}{x} < 1 \qquad (0 < x < \pi) \tag{21.2-38}$$

$$e^x \ge 1 + x \qquad e^{-x} \ge 1 - x \qquad e^x < \frac{1}{1-x} \quad (x < 1)$$
$$e^{-x/(1-x)} < 1 - x \le e^{-x} \quad (x < 1) \qquad e^{x/(1+x)} \le 1 + x \le e^x \quad (x > -1) \tag{21.2-39}$$

$$\frac{x}{1+x} < \log_e(1+x) < x \qquad (-1 < x \ne 0)$$
$$x < -\log_e(1-x) < \frac{x}{1-x} \qquad (0 < x < 1)$$
$$|\log_e(1-x)| < \frac{3x}{2} \qquad (0 < x < 0.5828) \tag{21.2-40}$$
21.3. SOME FUNCTIONS DEFINED BY TRANSCENDENTAL INTEGRALS

21.3-1. Sine, Cosine, Exponential, and Logarithmic Integrals (see also Fig. 21.3-1). One defines

$$\operatorname{Si}(x) \equiv \int_0^x\frac{\sin t}{t}\,dt = \frac{\pi}{2} - \int_x^\infty\frac{\sin t}{t}\,dt = x - \frac{x^3}{3\cdot 3!} + \frac{x^5}{5\cdot 5!} \mp \cdots \qquad \text{(sine integral)} \tag{21.3-1}$$

$$\operatorname{Ci}(x) \equiv -\int_x^\infty\frac{\cos t}{t}\,dt = C + \log_e x - \int_0^x\frac{1-\cos t}{t}\,dt$$
$$= C + \log_e x - \frac{x^2}{2\cdot 2!} + \frac{x^4}{4\cdot 4!} \mp \cdots \qquad (x > 0) \qquad \text{(cosine integral)} \tag{21.3-2}$$

Fig. 21.3-1. sin x/x = sinc (x/π) (a), and the sine integral ∫₀ˣ (sin t/t) dt = Si(x) (b). (From M. Schwartz, Information Transmission, Modulation, and Noise, McGraw-Hill, New York, 1959.)

$$\operatorname{Ei}(-x) = -\int_x^\infty\frac{e^{-t}}{t}\,dt \qquad (x > 0) \qquad \text{(exponential integral)} \tag{21.3-3}$$

$$\operatorname{li}(x) \equiv \int_0^x\frac{dt}{\log_e t} \qquad (x > 0) \qquad \text{(logarithmic integral)} \tag{21.3-4}$$

where C ≈ 0.577216 is the Euler-Mascheroni constant defined in Sec. 21.4-1b.

Fig. 21.3-2. The Fresnel integrals. (From Ref. 21.1.)

It is customary to introduce an alternative exponential integral Ēi(z) so that, for real x, y,

$$\bar{\operatorname{Ei}}(x) = \lim_{y\to 0+0}\operatorname{Ei}(x \pm iy) \mp i\pi = \operatorname{li}(e^x)$$
$$\bar{\operatorname{Ei}}(xe^{\pm i\pi}) = \operatorname{Ei}(-x) \pm i\pi \qquad \bar{\operatorname{Ei}}(\pm iy) = \operatorname{Ci}(y) \pm i\operatorname{Si}(y) \pm \frac{i\pi}{2} \tag{21.3-5}$$
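The power series for Si(x) in (21.3-1) is easy to confirm against direct numerical quadrature; the helper functions below are an illustrative sketch:

```python
import math

def Si_series(x, terms=30):
    """Si(x) = x - x^3/(3*3!) + x^5/(5*5!) - ..., the series in (21.3-1)."""
    return sum((-1) ** k * x ** (2 * k + 1)
               / ((2 * k + 1) * math.factorial(2 * k + 1))
               for k in range(terms))

def Si_quad(x, n=20000):
    """Midpoint-rule quadrature of sin(t)/t over [0, x], for comparison."""
    h = x / n
    return h * sum(math.sin((k + 0.5) * h) / ((k + 0.5) * h) for k in range(n))

for x in (0.5, 1.0, 2.0, 5.0):
    assert abs(Si_series(x) - Si_quad(x)) < 1e-6
```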
21.3-2. Fresnel Integrals and Error Function (see also Sec. 18.8-3 and Fig. 21.3-2). One defines

$$C(x) \equiv \int_0^x\cos\frac{\pi}{2}t^2\,dt = J_{1/2}\left(\frac{\pi}{2}x^2\right) + J_{5/2}\left(\frac{\pi}{2}x^2\right) + J_{9/2}\left(\frac{\pi}{2}x^2\right) + \cdots$$
$$S(x) \equiv \int_0^x\sin\frac{\pi}{2}t^2\,dt = J_{3/2}\left(\frac{\pi}{2}x^2\right) + J_{7/2}\left(\frac{\pi}{2}x^2\right) + J_{11/2}\left(\frac{\pi}{2}x^2\right) + \cdots \qquad \text{(Fresnel integrals)} \tag{21.3-6}$$

$$\operatorname{erf} x \equiv \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt = \frac{2}{\sqrt{\pi}}\left(x - \frac{x^3}{3} + \frac{x^5}{2!\,5} - \frac{x^7}{3!\,7} \pm \cdots\right) \qquad \text{(error function)} \tag{21.3-7}$$

where the J_{n/2}(z) are the half-integral-order Bessel functions discussed in Sec. 21.8-1.

Note

$$C(x) - iS(x) = \frac{1-i}{2}\operatorname{erf}\left[\frac{1+i}{2}x\sqrt{\pi}\right] \tag{21.3-8}$$
$$C(x) \to \frac{1}{2} \qquad S(x) \to \frac{1}{2} \qquad \operatorname{erf} x \to 1 \qquad \text{as } x \to \infty \tag{21.3-9}$$

Note also

$$\int_0^x C(t)\,dt = xC(x) - \frac{1}{\pi}\sin\frac{\pi}{2}x^2 \qquad \int_0^x S(t)\,dt = xS(x) + \frac{1}{\pi}\cos\frac{\pi}{2}x^2 - \frac{1}{\pi}$$
$$\int_0^x\operatorname{erf} t\,dt = x\operatorname{erf} x + \frac{1}{\sqrt{\pi}}\left(e^{-x^2} - 1\right) \tag{21.3-10}$$

The related integrals

$$\frac{1}{\sqrt{2\pi}}\int_0^x\frac{\sin t}{\sqrt{t}}\,dt = \sqrt{\frac{2}{\pi}}\left(\frac{x^{3/2}}{3} - \frac{x^{7/2}}{7\cdot 3!} + \frac{x^{11/2}}{11\cdot 5!} \mp \cdots\right)$$
$$\frac{1}{\sqrt{2\pi}}\int_0^x\frac{\cos t}{\sqrt{t}}\,dt = \sqrt{\frac{2}{\pi}}\left(x^{1/2} - \frac{x^{5/2}}{5\cdot 2!} + \frac{x^{9/2}}{9\cdot 4!} - \frac{x^{13/2}}{13\cdot 6!} \pm \cdots\right) \tag{21.3-11}$$

are also sometimes known as Fresnel integrals.

The function

$$\operatorname{erfc} z \equiv 1 - \operatorname{erf} z = \frac{2}{\sqrt{\pi}}\int_z^\infty e^{-t^2}\,dt \tag{21.3-12}$$

is known as the complementary error function.
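The error-function series (21.3-7) and the complement relation (21.3-12) can be checked against Python's built-in implementations (a sketch; `erf_series` is our own helper name):

```python
import math

def erf_series(x, terms=40):
    """erf x = (2/sqrt(pi)) (x - x^3/3 + x^5/(2! 5) - x^7/(3! 7) +- ...), per (21.3-7)."""
    s = sum((-1) ** k * x ** (2 * k + 1) / (math.factorial(k) * (2 * k + 1))
            for k in range(terms))
    return 2 / math.sqrt(math.pi) * s

for x in (0.1, 0.5, 1.0, 2.0):
    assert abs(erf_series(x) - math.erf(x)) < 1e-10
assert abs(math.erfc(1.0) - (1 - math.erf(1.0))) < 1e-12  # (21.3-12)
```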
21.4. THE GAMMA FUNCTION AND RELATED FUNCTIONS

21.4-1. The Gamma Function. (a) Integral Representations. The gamma function Γ(z) is most frequently defined by

$$\Gamma(z) = \int_0^\infty e^{-t}t^{z-1}\,dt \qquad [\operatorname{Re}(z) > 0] \tag{21.4-1}$$

(Euler's integral of the second kind), or, for Re (z) ≤ 0, by the continuation formula

$$\Gamma(z) = \frac{\Gamma(z+n)}{z(z+1)(z+2)\cdots(z+n-1)} \qquad [\operatorname{Re}(z) > -n;\ z \ne 0, -1, -2, \ldots] \tag{21.4-2}$$

the definition can be extended by analytic continuation (Sec. 7.8-1). The only singularities of Γ(z) in the finite portion of the z plane are simple poles with residues (−1)ⁿ/n! for z = −n (n = 0, 1, 2, . . .); 1/Γ(z) is an integral function. Figure 21.4-1 shows a graph of Γ(x) vs. x for real x. Note

$$\Gamma(1) = 1 \qquad \Gamma\left(\tfrac{1}{2}\right) = \sqrt{\pi} \qquad \Gamma(n+1) = n! \quad (n = 0, 1, 2, \ldots) \tag{21.4-3}$$

Fig. 21.4-1. Γ(n + 1) vs. n for real n. Note Γ(n + 1) = n! for n = 0, 1, 2, . . . , and the alternating maxima and minima given approximately by Γ(1.462) ≈ 0.886, Γ(−0.504) ≈ −3.545, Γ(−1.573) ≈ 2.302, Γ(−2.611) ≈ −0.888, . . . .

(b) Other Representations of Γ(z).

$$\Gamma(z) = \lim_{n\to\infty}\frac{n!\,n^z}{z(z+1)(z+2)\cdots(z+n)} \qquad \text{(Euler's definition)} \tag{21.4-4}$$

$$\frac{1}{\Gamma(z)} = ze^{Cz}\prod_{k=1}^\infty\left(1 + \frac{z}{k}\right)e^{-z/k} \qquad \text{(Weierstrass's product representation)} \tag{21.4-5}$$

C is the Euler-Mascheroni constant defined by

$$C = \lim_{n\to\infty}\left(\sum_{k=1}^n\frac{1}{k} - \log_e n\right) = -\int_0^\infty e^{-t}\log_e t\,dt = -\int_0^1\log_e\left(\log_e\frac{1}{t}\right)dt \approx 0.5772157 \tag{21.4-6}$$

(c) Functional Equations.

$$\Gamma(z+1) = z\Gamma(z) \tag{21.4-7}$$

$$\Gamma(z)\Gamma(1-z) = \frac{\pi}{\sin\pi z} \qquad \Gamma\left(\frac{1}{2}+z\right)\Gamma\left(\frac{1}{2}-z\right) = \frac{\pi}{\cos\pi z} \tag{21.4-8}$$

$$\Gamma(nz) = (2\pi)^{(1-n)/2}n^{nz-1/2}\,\Gamma(z)\Gamma\left(z+\frac{1}{n}\right)\Gamma\left(z+\frac{2}{n}\right)\cdots\Gamma\left(z+\frac{n-1}{n}\right) \qquad (n = 2, 3, \ldots) \qquad \text{(Gauss's multiplication theorem)} \tag{21.4-9}$$

21.4-2. Stirling's Expansions for Γ(z) and n! (see also Secs. 4.4-3, 4.8-6b, and 21.5-4).

$$\Gamma(z) = \sqrt{2\pi}\,z^{z-1/2}e^{-z}\left[1 + \frac{1}{12z} + \frac{1}{288z^2} - \frac{139}{51840z^3} - \frac{571}{2488320z^4} + O(z^{-5})\right] \qquad (|\arg z| < \pi) \tag{21.4-10}$$

$$\lim_{n\to\infty}\frac{n!}{n^ne^{-n}\sqrt{2\pi n}} = 1 \qquad \text{or} \qquad n! \approx n^ne^{-n}\sqrt{2\pi n} \quad \text{as } n\to\infty \qquad \text{(Stirling's formula)} \tag{21.4-11}$$

The fractional error in Stirling's formula is less than 10 per cent for n = 1 and decreases as n increases; this asymptotic formula applies particularly to computations of the ratio of two factorials or gamma functions, since in such cases the fractional error is of paramount interest. More specifically,

$$n^ne^{-n}\sqrt{2\pi n} < n! < n^ne^{-n}\sqrt{2\pi n}\,e^{1/12n} \tag{21.4-12}$$
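The two-sided bound (21.4-12) is straightforward to confirm numerically (a sketch using Python's standard math module):

```python
import math

# (21.4-12): n^n e^-n sqrt(2 pi n) < n! < n^n e^-n sqrt(2 pi n) e^(1/12n)
for n in range(1, 30):
    stirling = n ** n * math.exp(-n) * math.sqrt(2 * math.pi * n)
    fact = math.factorial(n)
    assert stirling < fact < stirling * math.exp(1 / (12 * n))
```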
$$n! \approx n^n\sqrt{2\pi n}\,\exp\left(-n + \frac{1}{12n} - \frac{1}{360n^3} + \cdots\right) \qquad \text{as } n\to\infty \tag{21.4-13}$$

21.4-3. The Psi Function (Digamma Function).

$$\psi(z) \equiv \frac{d}{dz}\log_e\Gamma(z) = \sum_{k=0}^\infty\left(\frac{1}{k+1} - \frac{1}{z+k}\right) - C \tag{21.4-14}$$

$$\psi(z) = \int_0^\infty\left(\frac{e^{-t}}{t} - \frac{e^{-zt}}{1-e^{-t}}\right)dt = -C + \int_0^1\frac{1-t^{z-1}}{1-t}\,dt \qquad [\operatorname{Re}(z) > 0] \tag{21.4-15}$$

Note

$$\psi(1) = -C \qquad \psi(z+1) = \psi(z) + \frac{1}{z} \tag{21.4-16}$$

21.4-4. Beta Functions.
The (complete) beta function is defined as

$$B(p, q) \equiv \frac{\Gamma(p)\Gamma(q)}{\Gamma(p+q)} = B(q, p) \tag{21.4-17}$$

or by analytic continuation of

$$B(p, q) = \int_0^1 t^{p-1}(1-t)^{q-1}\,dt \qquad [\operatorname{Re}(p) > 0,\ \operatorname{Re}(q) > 0] \qquad \text{(Euler's integral of the first kind)} \tag{21.4-18}$$

$$B(p, q) = \int_0^\infty\frac{t^{p-1}}{(1+t)^{p+q}}\,dt = 2\int_0^{\pi/2}\sin^{2p-1}\varphi\,\cos^{2q-1}\varphi\,d\varphi \tag{21.4-19}$$

Note

$$B(p, q)B(p+q, r) = B(q, r)B(q+r, p) \tag{21.4-20}$$

$$\frac{1}{B(n, m)} = m\binom{n+m-1}{n-1} = n\binom{n+m-1}{m-1} \qquad (n, m = 1, 2, \ldots) \tag{21.4-21}$$

21.4-5. Incomplete Gamma and Beta Functions. The incomplete gamma function Γₓ(p) and the incomplete beta function Bₓ(p, q) are respectively defined by analytic continuation of

$$\Gamma_x(p) = \int_0^x t^{p-1}e^{-t}\,dt \qquad [\operatorname{Re}(p) > 0] \tag{21.4-22}$$

$$B_x(p, q) = \int_0^x t^{p-1}(1-t)^{q-1}\,dt \qquad [\operatorname{Re}(p) > 0,\ \operatorname{Re}(q) > 0;\ 0 \le x \le 1] \tag{21.4-23}$$
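The beta-function identities (21.4-17), (21.4-20), and (21.4-21) can be spot-checked numerically; the `beta` helper below is our own sketch built on `math.gamma`:

```python
import math

def beta(p, q):
    """B(p, q) = Gamma(p) Gamma(q) / Gamma(p + q), per (21.4-17)."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)

p, q, r = 1.5, 2.25, 0.75
assert abs(beta(p, q) - beta(q, p)) < 1e-12                                    # symmetry
assert abs(beta(p, q) * beta(p + q, r) - beta(q, r) * beta(q + r, p)) < 1e-12  # (21.4-20)

# Integer case (21.4-21): 1/B(n, m) = m C(n+m-1, n-1) = n C(n+m-1, m-1)
for n in range(1, 6):
    for m in range(1, 6):
        assert abs(1 / beta(n, m) - m * math.comb(n + m - 1, n - 1)) < 1e-6
        assert abs(1 / beta(n, m) - n * math.comb(n + m - 1, m - 1)) < 1e-6
```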
21.5. BINOMIAL COEFFICIENTS AND FACTORIAL POLYNOMIALS. BERNOULLI POLYNOMIALS AND BERNOULLI NUMBERS

21.5-1. Binomial Coefficients and Factorial Polynomials (see also Sec. 1.4-1). Table 21.5-1 summarizes the definition and principal properties of the binomial coefficients $\binom{x}{n}$. The expression

$$x^{[n]} \equiv n!\binom{x}{n} \equiv x(x-1)\cdots(x-n+1) = S_0^{(n)}x^n + S_1^{(n)}x^{n-1} + \cdots + S_{n-1}^{(n)}x \qquad S_0^{(n)} = 1 \quad (n = 1, 2, \ldots) \tag{21.5-1}$$

is called a factorial polynomial of degree n. The coefficients S_k^{(n)} are known as Stirling numbers of the first kind and can be obtained with the aid of the recursion formulas

$$S_k^{(n+1)} = S_k^{(n)} - nS_{k-1}^{(n)} \tag{21.5-2}$$

or from Eq. (17). Note

$$(x+y)^{[n]} = \sum_{k=0}^n\binom{n}{k}x^{[n-k]}y^{[k]} \qquad (n = 0, 1, 2, \ldots) \qquad \text{(Vandermonde's binomial theorem)} \tag{21.5-3}$$

If one defines x^[0] = 1, x^[r] = 1/[(x + 1)(x + 2) · · · (x − r)] (r = −1, −2, . . .), then the relation x^[m+r] = x^[m](x − m)^[r] and Eq. (20.4-12) hold for all integral pseudo exponents r, m. Note 0^[0] = 1.
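The Stirling-number recursion (21.5-2) can be implemented in a few lines and checked against the product definition of x^[n]; the function names below are ours, not the handbook's:

```python
def stirling_coeffs(n):
    """Coefficients S_0^(n), ..., S_(n-1)^(n) of x^[n] = x(x-1)...(x-n+1),
    built with the recursion S_k^(n+1) = S_k^(n) - n S_(k-1)^(n) of (21.5-2)."""
    S = [1]                                  # x^[1] = x
    for m in range(1, n):
        S = [1] + [S[k] - m * S[k - 1] for k in range(1, m)] + [-m * S[m - 1]]
    return S

def falling(x, n):
    """x^[n] evaluated directly as a product."""
    p = 1
    for j in range(n):
        p *= x - j
    return p

# The coefficients must reproduce x^[n] exactly at integer points:
for n in range(1, 7):
    S = stirling_coeffs(n)
    for x in range(-3, 8):
        assert sum(S[k] * x ** (n - k) for k in range(n)) == falling(x, n)
```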
21.5-2. Bernoulli Polynomials and Bernoulli Numbers (see also Sec. 21.2-12). (a) Definitions. The Bernoulli polynomials B_k^{(n)}(x) of order n = 0, 1, 2, . . . and degree k are defined by their generating function (Sec. 8.6-5)

$$\frac{t^ne^{xt}}{(e^t-1)^n} = \sum_{k=0}^\infty B_k^{(n)}(x)\frac{t^k}{k!} \qquad (n = 0, 1, 2, \ldots) \tag{21.5-4}$$

Note

$$B_k^{(0)}(x) = x^k \qquad B_k^{(n+1)}(x) = \frac{k!}{n!}\frac{d^{n-k}}{dx^{n-k}}\left[(x-1)(x-2)\cdots(x-n)\right] \quad (n \ge k) \tag{21.5-5}$$

The B_k^{(n)}(0) ≡ B_k^{(n)} are called Bernoulli numbers of order n, with

$$\frac{t^n}{(e^t-1)^n} = \sum_{k=0}^\infty B_k^{(n)}\frac{t^k}{k!} \qquad (n = 0, 1, 2, \ldots) \tag{21.5-6}$$
Table 21.5-1. Definition and Properties of the Binomial Coefficients (see also Secs. 1.4-1 and 21.5-4)

(a) Definitions and General Properties. If x, y, z are real numbers, and n is an integer,

$$\binom{x}{n} = \begin{cases}\dfrac{x(x-1)\cdots(x-n+1)}{n!} & \text{for } n > 0\\[4pt] 1 & \text{for } n = 0\\[2pt] 0 & \text{for } n < 0\end{cases} \qquad \text{(definition)}$$

$$\binom{x}{n-1} + \binom{x}{n} = \binom{x+1}{n} \qquad \text{(addition theorem)} \qquad \binom{-x}{n} = (-1)^n\binom{x+n-1}{n}$$

(b) In particular, if N and n are positive integers,

$$\binom{N}{n} = \frac{N!}{n!(N-n)!} \quad \text{for } N \ge n \qquad \binom{N}{n} = 0 \quad \text{for } N < n \qquad \binom{N}{N-n} = \binom{N}{n} \qquad \binom{N}{0} = \binom{N}{N} = 1$$

(c) If M, N, n are positive integers such that M ≥ n, N ≥ n,

$$\binom{M+N}{n} = \sum_{k=0}^n\binom{M}{k}\binom{N}{n-k} \qquad \binom{N-1}{n} = \sum_{k=0}^n(-1)^{n-k}\binom{N}{k} \qquad \binom{N+1}{n+1} = \sum_{k=n}^N\binom{k}{n}$$

$$B_0^{(n)} = 1 \qquad B_1^{(n)} = -\frac{n}{2} \qquad B_2^{(n)} = \frac{n}{12}(3n-1) \qquad B_3^{(n)} = -\frac{n^2}{8}(n-1)$$
$$B_4^{(n)} = \frac{n}{240}(15n^3 - 30n^2 + 5n + 2) \qquad B_5^{(n)} = -\frac{n^2}{96}(n-1)(3n^2 - 7n - 2)$$
$$B_6^{(n)} = \frac{n}{4032}(63n^5 - 315n^4 + 315n^3 + 91n^2 - 42n - 16) \tag{21.5-7}$$

Bernoulli numbers of order 1 are often simply called Bernoulli numbers, B_k^{(1)} ≡ B_k; B_k = 0 for all odd k > 1, and
$$B_0 = 1 \qquad B_1 = -\tfrac{1}{2} \qquad B_2 = \tfrac{1}{6} \qquad B_4 = -\tfrac{1}{30} \qquad B_6 = \tfrac{1}{42}\ \ldots \tag{21.5-8}$$

with alternating signs. The Bernoulli numbers may also be obtained from the recursion formulas

$$B_0 = 1 \qquad \binom{k+1}{0}B_0 + \binom{k+1}{1}B_1 + \cdots + \binom{k+1}{k}B_k = 0 \qquad (k = 1, 2, \ldots) \tag{21.5-9}$$

or in determinant form by solution of the linear equations (9).

(b) Miscellaneous Properties. Note

$$\frac{d}{dx}B_k^{(n)}(x) = kB_{k-1}^{(n)}(x) \qquad \int_a^b B_k^{(n)}(t)\,dt = \frac{1}{k+1}\left[B_{k+1}^{(n)}(b) - B_{k+1}^{(n)}(a)\right] \tag{21.5-10}$$

$$\Delta B_m^{(n)}(x) \equiv B_m^{(n)}(x+1) - B_m^{(n)}(x) = mB_{m-1}^{(n-1)}(x)$$
$$\Delta^nB_m^{(n)}(x) = m(m-1)\cdots(m-n+1)x^{m-n} \qquad (m \ge n) \tag{21.5-11}$$

$$B_k^{(n)}(1) = B_k^{(n)} + kB_{k-1}^{(n-1)} \tag{21.5-12}$$

$$\int_x^{x+1}B_k^{(n)}(\xi)\,d\xi = B_k^{(n-1)}(x) \qquad \int_0^1 B_k^{(n)}(\xi)\,d\xi = B_k^{(n-1)} \tag{21.5-13}$$

$$B_k^{(n)}(n-x) = (-1)^kB_k^{(n)}(x) \tag{21.5-14}$$

$$B_k^{(n+1)}(x) = \left(1-\frac{k}{n}\right)B_k^{(n)}(x) + k\left(\frac{x}{n}-1\right)B_{k-1}^{(n)}(x) \tag{21.5-15}$$

$$B_k^{(1)}(mx) = m^{k-1}\sum_{j=0}^{m-1}B_k^{(1)}\left(x+\frac{j}{m}\right) \qquad \text{(multiplication theorem)} \tag{21.5-16}$$

$$B_{2k}^{(1)}(x) = (-1)^{k-1}\frac{2(2k)!}{(2\pi)^{2k}}\sum_{j=1}^\infty\frac{\cos 2\pi jx}{j^{2k}} \qquad B_{2k+1}^{(1)}(x) = (-1)^{k-1}\frac{2(2k+1)!}{(2\pi)^{2k+1}}\sum_{j=1}^\infty\frac{\sin 2\pi jx}{j^{2k+1}} \qquad (0 \le x \le 1;\ k = 1, 2, \ldots)$$
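The recursion (21.5-9) generates the Bernoulli numbers exactly when run over rational arithmetic; the sketch below uses Python's `fractions` module (the function name is ours):

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(kmax):
    """B_0, ..., B_kmax from the recursion (21.5-9):
    B_0 = 1,  sum_{j=0}^{k} C(k+1, j) B_j = 0  for k = 1, 2, ..."""
    B = [Fraction(1)]
    for k in range(1, kmax + 1):
        B.append(-sum(comb(k + 1, j) * B[j] for j in range(k)) / (k + 1))
    return B

B = bernoulli_numbers(8)
assert B[1] == Fraction(-1, 2)
assert B[2] == Fraction(1, 6)     # values of (21.5-8)
assert B[4] == Fraction(-1, 30)
assert B[6] == Fraction(1, 42)
assert B[3] == B[5] == B[7] == 0  # odd-index Bernoulli numbers vanish
```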
21.5-3. Formulas Relating Polynomials and Factorial Polynomials. Bernoulli polynomials and Bernoulli numbers relate factorial polynomials (Sec. 21.5-1) to powers of x; such relations aid in the solution of difference equations and, in particular, in the summation of series (Sec. 4.8-6d). Note

$$x^{[n]} \equiv x(x-1)\cdots(x-n+1) = B_n^{(n+1)}(x+1) = \sum_{k=0}^n\binom{n}{k}B_{n-k}^{(n+1)}(1)\,x^k \qquad (n = 0, 1, 2, \ldots) \tag{21.5-17}$$

$$\int_0^x(\xi-1)(\xi-2)\cdots(\xi-2r+1)\,d\xi = \int_0^x(\xi-1)^{[2r-1]}\,d\xi = \frac{1}{2r}\left[B_{2r}^{(2r)}(x) - B_{2r}^{(2r)}\right] \qquad (r = 1, 2, \ldots) \tag{21.5-18}$$

21.5-4. Approximation Formulas for $\binom{N}{n}$ (see also Sec. 21.4-2). If N is a positive integer and z ≡ (2n − N)/√N, then

$$\binom{N}{n} \approx 2^N\sqrt{\frac{2}{\pi N}}\,e^{-z^2/2} \tag{21.5-19}$$

If n ≪ N, use

$$\binom{N}{n} \approx \frac{N^n}{n!} \tag{21.5-20}$$

For large values of N, n, and N − n, use Stirling's formula (21.4-11).

21.6. ELLIPTIC FUNCTIONS, ELLIPTIC INTEGRALS, AND RELATED FUNCTIONS
1. f(z) is doubly periodicwith two finite primitive periods (smallest periods) coi, o>2 whose ratio is not a real number, i.e., f{z + mwi + no)2) s f(z)
\m, n=0, ±1, ±2, ... ;Im fc£\ *o] (21.6-1)
and
2. The only singularities of f(z) in the finite portion of the z plane are poles (see also Sees. 7.6-7 to 7.6-9).
A doubly periodic function f{z) repeats the values it takes in any period parallelogram, say that defined by the points 0, coi, co2, «i + w2, where the two sides joining the latter three points are excluded as belong
ing to adjacent parallelograms. The order of an elliptic function is the number of poles (counting multiplicities) in any period parallelogram. A doubly periodic integralfunction is necessarily a constant The residues in any period parallelogram add up to zero; hence the simplest nontrivial elliptic function is of order 2. An elliptic function f(z) of order r assumes
21.6-2
SPECIAL FUNCTIONS
828
every desired value w exactly r times in each period parallelogram, if one counts multiple roots of the equation f(z) - w = 0 (Liouville's Theorems, see also Sec. 7.6-5).
Elliptic functions are usually encountered in connection with integrals or differ ential equations which involve square roots of third- or fourth-degree polynomials (e.g., arc length of an ellipse, equation of motion of a pendulum; see also Sees. 4.6-7 and 21.6-4). Weierstrassfs elliptic functions and normal elliptic integrals are con structed in terms of simpler functions with known singularities for theoretical sim plicity (Sees. 21.6-2, 21.6-3, and 21.6-56). For numerical calculations, one prefers Jacobi's elliptic functions (Sec. 21.6-7), which may be regarded as generalizations of trigonometric functions; Legendre's normal elliptic integrals, which are closely related to the inverses of Jacobi's functions, are widely tabulated (Sees. 21.6-5 and 21.6-6).
21.6-2. Weierstrass's ℘ Function. (a)

$$\wp(z) = \wp(z|\omega_1, \omega_2) \equiv \frac{1}{z^2} + \sum_{m,n}{}'\left[\frac{1}{(z - m\omega_1 - n\omega_2)^2} - \frac{1}{(m\omega_1 + n\omega_2)^2}\right] = \wp(-z) \qquad \left[\operatorname{Im}\frac{\omega_2}{\omega_1} > 0\right] \tag{21.6-2}$$

where the summation ranges over all integer pairs m, n except for 0, 0. w = ℘(±z − C|ω₁, ω₂) satisfies the differential equation

$$\left(\frac{dw}{dz}\right)^2 = 4w^3 - g_2w - g_3 \equiv 4(w - e_1)(w - e_2)(w - e_3) \tag{21.6-3}$$

with

$$e_1 = \wp\left(\frac{\omega_1}{2}\right) \qquad e_2 = \wp\left(\frac{\omega_1+\omega_2}{2}\right) \qquad e_3 = \wp\left(\frac{\omega_2}{2}\right) \tag{21.6-4}$$

$$e_1 + e_2 + e_3 = 0 \qquad e_1e_2 + e_1e_3 + e_2e_3 = -\frac{g_2}{4} \qquad e_1e_2e_3 = \frac{g_3}{4} \tag{21.6-5}$$

The parameters g₂, g₃ determine the constants ω₁, ω₂ associated with each ℘ function and are known as invariants of ℘(z) ≡ ℘(z|ω₁, ω₂) ≡ ℘(z; g₂, g₃); note the homogeneity relation

$$\wp(z;\ g_2,\ g_3) = m^2\wp(mz;\ m^{-4}g_2,\ m^{-6}g_3) \tag{21.6-6}$$

The points w = e₁, e₂, e₃ and w = ∞ are the branch points of the inverse function

$$z = \int_\infty^w\frac{dw}{\sqrt{4w^3 - g_2w - g_3}} \qquad \text{(Weierstrass's normal elliptic integral of the first kind)} \tag{21.6-7}$$

of ℘(z|ω₁, ω₂) (see also Sec. 21.6-5b).

Note the series expansions

$$\wp(z) = \frac{1}{z^2} + \frac{g_2}{20}z^2 + \frac{g_3}{28}z^4 + \frac{g_2^2}{1200}z^6 + \frac{3g_2g_3}{6160}z^8 + \cdots = \frac{1}{z^2} + \sum_{k=2}^\infty a_kz^{2k-2} \qquad [0 < |z| < \min(|\omega_1|, |\omega_2|)] \tag{21.6-8}$$

$$a_2 = \frac{g_2}{20} \qquad a_3 = \frac{g_3}{28} \qquad a_k = \frac{3}{(k-3)(2k+1)}\left(a_2a_{k-2} + a_3a_{k-3} + \cdots + a_{k-2}a_2\right) \quad (k \ge 4)$$

$$g_2 = 60\sum_{m,n}{}'\frac{1}{(m\omega_1 + n\omega_2)^4} \qquad g_3 = 140\sum_{m,n}{}'\frac{1}{(m\omega_1 + n\omega_2)^6} \tag{21.6-9}$$

and the addition formula

$$\wp(z + u) = -\wp(z) - \wp(u) + \frac{1}{4}\left[\frac{\wp'(z) - \wp'(u)}{\wp(z) - \wp(u)}\right]^2 \tag{21.6-10}$$

(b) Every elliptic function f(z) with periods ω₁, ω₂ can be represented as a rational function of ℘(z; ω₁, ω₂) and ℘′(z; ω₁, ω₂). More specifically, f(z) can be written in the form

$$f(z) = R_1[\wp(z)] + \wp'(z)R_2[\wp(z)] \tag{21.6-11}$$

where R₁ and R₂ are rational functions; ℘′(z) is an odd elliptic function of order 3.
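The coefficient recursion quoted after (21.6-8) can be run symbolically, treating g₂ and g₃ as formal symbols; the sketch below (our own helper, with a monomial g₂^i g₃^j stored as the dict key (i, j)) reproduces the printed coefficients g₂²/1200 and 3g₂g₃/6160:

```python
from fractions import Fraction

def pe_coeffs(kmax):
    """Laurent coefficients a_k of (21.6-8), as polynomials in g2, g3, via
    a_k = 3/((k-3)(2k+1)) * sum_{m=2}^{k-2} a_m a_{k-m}."""
    a = {2: {(1, 0): Fraction(1, 20)},       # a_2 = g2/20
         3: {(0, 1): Fraction(1, 28)}}       # a_3 = g3/28
    for k in range(4, kmax + 1):
        acc = {}
        for m in range(2, k - 1):
            for (i1, j1), c1 in a[m].items():
                for (i2, j2), c2 in a[k - m].items():
                    key = (i1 + i2, j1 + j2)
                    acc[key] = acc.get(key, Fraction(0)) + c1 * c2
        factor = Fraction(3, (k - 3) * (2 * k + 1))
        a[k] = {key: factor * c for key, c in acc.items()}
    return a

a = pe_coeffs(5)
assert a[4] == {(2, 0): Fraction(1, 1200)}   # a_4 = g2^2/1200
assert a[5] == {(1, 1): Fraction(3, 6160)}   # a_5 = 3 g2 g3/6160
```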
21.6-3. Weierstrass's ζ and σ Functions. (a) Weierstrass's ζ and σ functions are not elliptic functions but may be used to construct elliptic functions with easily recognizable singularities. They are defined by

$$\zeta(z) = \zeta(z|\omega_1, \omega_2) \equiv \frac{1}{z} + \sum_{m,n}{}'\left[\frac{1}{z - m\omega_1 - n\omega_2} + \frac{1}{m\omega_1 + n\omega_2} + \frac{z}{(m\omega_1 + n\omega_2)^2}\right] = -\zeta(-z)$$

$$\sigma(z) = \sigma(z|\omega_1, \omega_2) \equiv z\prod_{m,n}{}'\left(1 - \frac{z}{m\omega_1 + n\omega_2}\right)\exp\left[\frac{z}{m\omega_1 + n\omega_2} + \frac{z^2}{2(m\omega_1 + n\omega_2)^2}\right] \qquad \left[\operatorname{Im}\frac{\omega_2}{\omega_1} > 0\right] \tag{21.6-12}$$

where the sums and products range over all integer pairs m, n except for 0, 0. ζ(z) has simple poles, and

$$\zeta'(z) = -\wp(z) \qquad \frac{\sigma'(z)}{\sigma(z)} = \zeta(z) \tag{21.6-13}$$

Equations (8) and (13) yield Laurent expansions of ζ(z) and σ(z). Note

$$\zeta(z) = -\int_{\text{const}}^{\wp(z)}\frac{w\,dw}{\sqrt{4w^3 - g_2w - g_3}} \qquad \text{(Weierstrass's normal elliptic integral of the second kind)} \tag{21.6-14}$$

and the addition theorem

$$\zeta(z + u) = \zeta(z) + \zeta(u) + \frac{1}{2}\frac{\wp'(z) - \wp'(u)}{\wp(z) - \wp(u)} \tag{21.6-15}$$

Given 2ζ(ω₁/2) = η₁, 2ζ(ω₂/2) = η₂, one has η₁ω₂ − η₂ω₁ = 2πi, and

$$\zeta(z + m\omega_1 + n\omega_2) = \zeta(z) + m\eta_1 + n\eta_2 \qquad (m, n = 0, \pm 1, \pm 2, \ldots) \tag{21.6-16}$$

(b) Every elliptic function f(z) can be represented in the form

$$f(z) = C\,\frac{\sigma(z - a_1)\sigma(z - a_2)\cdots}{\sigma(z - b_1)\sigma(z - b_2)\cdots} \qquad \left(\sum_j a_j = \sum_j b_j\right) \tag{21.6-18}$$
21.6-4. Elliptic Integrals (see also Sec. 4.6-7). The function F(z) = ∫ᶻ f(z) dz is an elliptic integral whenever f(z) is a rational function of z and of the square root √G(z) of a polynomial

$$G(z) \equiv a_0z^4 + a_1z^3 + a_2z^2 + a_3z + a_4 \equiv a_0(z - \alpha_1)(z - \alpha_2)(z - \alpha_3)(z - \alpha_4) \tag{21.6-19}$$

without multiple zeros; one includes the case of third-degree polynomials G(z) ≡ G₃(z) as well as fourth-degree polynomials G(z) ≡ G₄(z) by introducing α₄ = ∞ whenever a₀ = 0, so that formally a₀(z − α₄) ≡ a₁. Every elliptic integral is a multiple-valued function of z; different integration paths yield an infinite number of function values. The points z = α₁, z = α₂, z = α₃, z = α₄ are branch points. One joins α₁, α₂ and α₃, α₄ by two suitably defined branch cuts to obtain a Riemann surface (Sec. 7.4-3) connected like the surface of a torus. An elliptic integral of the first kind is finite for all z; its only singularities are branch points at z = α₁, α₂, α₃, α₄. An elliptic integral of the second kind is analytic throughout the z plane except for branch points at α₁, α₂, α₃, α₄ and a finite number of poles. An elliptic integral of the third kind has a logarithmic singularity (see also Secs. 21.6-5 and 21.6-6).
21.6-5. Reduction of Elliptic Integrals. The following procedures reduce every elliptic integral Jf(z) (fetoa weighted sum of elementary functions and three so-called normal elliptic integrals (see also Refs. 21.2, 21.11, and 21.12 for alternative procedures; Ref. 21.2 contains a very comprehensive collection of explicit formulas expressing elliptic integrals in terms of normal elliptic integrals).
(a) Formal Reduction Procedure. Note that even powers of √G(z) are polynomials in z, and rewrite
f(z) = [P₁(z) + P₂(z)√G(z)]/[P₃(z) + P₄(z)√G(z)]
     = [P₁(z) + P₂(z)√G(z)][P₃(z) − P₄(z)√G(z)]/[P₃²(z) − P₄²(z)G(z)] = R₁(z) + R₂(z)/√G(z)   (21.6-20)
where the Pᵢ(z) are polynomials, and R₁(z), R₂(z) are rational functions. R₁(z) can be integrated to yield elementary functions (Sec. 4.6-6c). Partial-fraction expansion of the rational function R₂(z) (Sec. 1.7-4) reduces the integration of R₂(z)/√G(z) to that of integrals of the form
Iₙ = ∫ (z − c)ⁿ dz/√G(z)   (n = 0, ±1, ±2, . . .)   (21.6-21)
Every one of these integrals can be expressed in terms of I₀, I₁, I₂, and I₋₁ alone with the aid of the recursion formula
(2n + 6)b₀I_{n+4} + (2n + 5)b₁I_{n+3} + (2n + 4)b₂I_{n+2} + (2n + 3)b₃I_{n+1} + (2n + 2)b₄Iₙ = 2(z − c)^{n+1}√G(z)   (n = 0, ±1, ±2, . . .)   (21.6-22)
where the coefficients bₖ are defined by the identity
G(z) ≡ a₀z⁴ + a₁z³ + a₂z² + a₃z + a₄ ≡ b₀(z − c)⁴ + b₁(z − c)³ + b₂(z − c)² + b₃(z − c) + b₄   (21.6-23)
In addition, (22) yields I₂ explicitly in terms of I₀, I₁, and I₋₁ if a₀ = 0, or if c is a root of G(z) = 0 (i.e., b₄ = 0). Even if c is not such a root, one can rewrite I₂ as
∫ (z − c)² dz/√G(z) = ∫ (z − c′)² dz/√G(z) + α ∫ (z − c′) dz/√G(z) + β ∫ dz/√G(z)   (21.6-24)
where z = c′ is a root of G(z) = 0. Hence every elliptic integral can be expressed as a weighted sum of elementary functions and three relatively simple types of elliptic integrals of the first, second, and third kinds (Sec. 21.6-4), viz.,
∫ dz/√G(z)      ∫ z dz/√G(z)      ∫ dz/[(z − c)√G(z)]   (21.6-25)
The first of these integrals is usually referred to as a normal elliptic integral of the first kind; it is often convenient not to use the other two integrals (25) directly but to introduce suitable linear combinations as normal elliptic integrals of the second and third kinds (Secs. 21.6-2, 21.6-3, and 21.6-6).
(b) Change of Variables. Weierstrass's and Riemann's Normal Forms. At any convenient stage of the reduction procedure, one may introduce a new integration variable z̄ = z̄(z) to transform the elliptic integrals (21) or (25) into new elliptic integrals involving a more convenient polynomial G(z̄) and, possibly, a simpler recursion formula (22). In particular, a bilinear transformation
z̄ = (Az + B)/(Cz + D)   (AD − BC ≠ 0)   (21.6-26)
(Sec. 7.9-2) chosen by substitution of corresponding values of z and z̄ so as to map the branch points z = α₁, α₂, α₃, α₄ into z̄ = e₁, e₂, e₃, ∞ yields elliptic integrals in Weierstrass's normal form with G(z̄) = 4z̄³ − g₂z̄ − g₃. These integrals are related to Weierstrass's ℘ function (Sec. 21.6-2). Again, a transformation (26) mapping z = α₁, α₂, α₃, α₄ into z̄ = 0, 1, 1/k, −1/k, where k is a real number between 0 and 1, yields elliptic integrals in Riemann's normal form, with G(z̄) = z̄(1 − z̄)(1 − k²z̄).
(c) Reduction to Legendre's Normal Form. More frequently, one desires to transform a real elliptic integral ∫ₐˣ f(x) dx to Legendre's normal form with G(z) = (1 − z²)(1 − k²z²), where k² is a real number between 0 and 1. The reduction procedure will then yield (real) Legendre's normal integrals (Sec. 21.6-6) whose values are available in published tables.
Let G(x) be a real polynomial greater than zero in (a, x); since ∫ₐˣ f(x) dx is to be real, the integration interval cannot include a real root of G(x) = 0. Table 21.6-1 (pages 712-713) lists transformations x = x(φ) mapping the real integration interval (a, x) into a corresponding range of real angles φ so that
dx/√G(x) = μ dφ/√(1 − k² sin²φ)   (0 ≤ k² ≤ 1)   (21.6-27)
for the various possible types of real fourth-degree polynomials G(x) ≡ G₄(x) and third-degree polynomials G(x) ≡ G₃(x). The correct values of the constant parameters k² and μ are also tabulated. In each case, the leading coefficient (a₀ or a₁) of G(x) is taken to be either 1 or −1. In the case of real roots, it is assumed that α₁ > α₂, α₃ > α₄; complex roots are denoted by b₁ ± ic₁ and b₂ ± ic₂, with b₁ > b₂, c₁ > 0, c₂ > 0. The following auxiliary quantities have been introduced:
α_{ik} ≡ α_i − α_k   (i, k = 1, 2, 3, 4)
tan ϑ₃ = (c₁ + c₂)/(b₁ − b₂)      tan ϑ₄ = (c₁ − c₂)/(b₁ − b₂)
(tan ½ϑ₅)² = cos ϑ₃/cos ϑ₄      ν = tan [(ϑ₂ − ϑ₁)/2] tan [(ϑ₁ + ϑ₂)/2]   (21.6-28)
where ϑ₁ and ϑ₂ are auxiliary angles associated with the complex roots b₁ ± ic₁, b₂ ± ic₂; their defining formulas accompany Table 21.6-1.
21.6-6. Legendre's Normal Elliptic Integrals (see also Secs. 21.6-4 and 21.6-5). (a) Definitions. Legendre's (incomplete) normal elliptic integrals are defined as
F(k, φ) ≡ ∫₀^φ dφ̄/√(1 − k² sin²φ̄) = ∫₀^z dz̄/√[(1 − z̄²)(1 − k²z̄²)] ≡ F(k, z)
(Legendre's normal elliptic integral of the first kind)
E(k, φ) ≡ ∫₀^φ √(1 − k² sin²φ̄) dφ̄ = ∫₀^z √[(1 − k²z̄²)/(1 − z̄²)] dz̄ ≡ E(k, z)   (21.6-29a)
(Legendre's normal elliptic integral of the second kind)
Π(c; k, φ) ≡ ∫₀^φ dφ̄/[(sin²φ̄ − c)√(1 − k² sin²φ̄)] = ∫₀^z dz̄/[(z̄² − c)√((1 − z̄²)(1 − k²z̄²))] ≡ Π(c; k, z)   (21.6-29b)
(Legendre's normal elliptic integral of the third kind)
where z = sin φ; the integrals are real for values of the amplitude φ between −π/2 and π/2 if k² is a real number between 0 and 1.
Fig. 21.6-1. Variation of the elliptic integral u of the first kind with the amplitude φ. (From Ref. 21.1.)
For real values of the modulus k, the modular angle
α = arcsin k   (21.6-29c)
is often tabulated as an argument instead of k (Fig. 21.6-2); one writes F(k, φ) ≡ F(φ\α), E(k, φ) ≡ E(φ\α). The parameter k² = m is often used as an argument as well.
(b) Legendre's Complete Normal Elliptic Integrals (see also Fig. 21.6-3). The functions
K = K(k) ≡ ∫₀^{π/2} dφ/√(1 − k² sin²φ) = F(k, π/2) ≡ F(90°\α)
E = E(k) ≡ ∫₀^{π/2} √(1 − k² sin²φ) dφ = E(k, π/2) ≡ E(90°\α)   (21.6-30)
with α = arcsin k
are respectively known as Legendre's complete elliptic integrals of the first and second kind. k = sin α and k′ = √(1 − k²) = cos α
Fig. 21.6-2. The incomplete elliptic integral of the first kind, F(φ\α), plotted for modular angles α from 0° to 90° (a), (b). (From Ref. 21.1.)
Fig. 21.6-3. The elliptic integrals K(k) ≡ F(90°\α), K′(k) ≡ F(90°\90° − α) (a), and E(k) ≡ E(90°\α), E′(k) ≡ E(90°\90° − α) (b) (k = sin α; from Ref. 21.1).
are called complementary moduli. K(k) and K′(k) = K(k′) are associated elliptic integrals of the first kind; E(k) and E′(k) = E(k′) are associated elliptic integrals of the second kind. Note
EK′ + E′K − KK′ = π/2   (Legendre's relation)   (21.6-31)
and K(0) = K′(1) = π/2, K(1) = K′(0) = ∞. Different values of the multiple-valued elliptic integral F(k, φ) differ by 4mK + 2niK′; different values of E(k, φ) differ by 4mE + 2ni(K′ − E′) (m, n = 0, ±1, ±2, . . .). Note also
K(k) = (π/2) F(½, ½; 1; k²)      E(k) = (π/2) F(−½, ½; 1; k²)   (21.6-32)
so that, for real k² < 1,
K = (π/2)[1 + (1/2)²k² + ((1·3)/(2·4))²k⁴ + · · ·]      E = (π/2)[1 − (1/2)²k² − ((1·3)/(2·4))²(k⁴/3) − · · ·]   (21.6-33)
where F(a, b; c; z) is the hypergeometric function defined in Sec. 9.3-9.
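Legendre's relation (21.6-31) lends itself to a quick numerical check. The sketch below uses SciPy (an assumption of this illustration, not part of the handbook); note that scipy.special's ellipk and ellipe take the parameter m = k² rather than the modulus k.

```python
# Numerical check of Legendre's relation (21.6-31): EK' + E'K - KK' = pi/2.
import math
from scipy.special import ellipk, ellipe

k = 0.6                   # modulus
m = k**2                  # parameter used by scipy
mc = 1.0 - m              # complementary parameter k'**2

K, Kp = ellipk(m), ellipk(mc)   # K(k) and K'(k) = K(k')
E, Ep = ellipe(m), ellipe(mc)   # E(k) and E'(k) = E(k')

legendre_relation = E * Kp + Ep * K - K * Kp
assert abs(legendre_relation - math.pi / 2) < 1e-12
```

The identity holds for every modulus 0 < k < 1, so any other value of k would serve equally well.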
(c) Transformations. Legendre's normal elliptic integrals (29) with moduli k and k̄ = 1/k, k′, 1/k′, ik/k′, k′/ik, (1 − k′)/(1 + k′), 2√k/(1 + k) are connected by the relations listed in Table 21.6-2, where
√(1 − k² sin²φ) ≡ Δ(φ, k) ≡ Δφ   (21.6-34)
(see also Sec. 21.6-7a). Table 21.6-3 lists similar relations for the complete normal elliptic integrals (30). In particular,
K(k) = (1 + k₁)K(k₁)      k₁ = (1 − k′)/(1 + k′)   (21.6-35)
Successive substitution of
k_{n+1} = (1 − k′ₙ)/(1 + k′ₙ)      k′_{n+1} = 2√k′ₙ/(1 + k′ₙ)   (n = 0, 1, 2, . . . ; k₀ = k)   (21.6-36)
in Eq. (35) yields kₙ → 0; since K(0) = π/2, one obtains
K(k) = (π/2) ∏_{n=1}^∞ (1 + kₙ)   (21.6-37)
which may be useful for numerical computation of K(k); F(k, φ) may be treated in analogous fashion.
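The descending-Landen product for K(k) converges quadratically, as a short sketch shows (SciPy is assumed only for the reference value; its ellipk takes the parameter m = k²):

```python
# Sketch of the Landen product (21.6-37): K(k) = (pi/2) * prod(1 + k_n),
# with k_{n+1} = (1 - k'_n)/(1 + k'_n); compared against scipy's ellipk.
import math
from scipy.special import ellipk

def K_landen(k, steps=20):
    prod = 1.0
    for _ in range(steps):
        kp = math.sqrt(1.0 - k * k)     # complementary modulus k'
        k = (1.0 - kp) / (1.0 + kp)     # next modulus; k -> 0 quadratically
        prod *= 1.0 + k
    return 0.5 * math.pi * prod

K_val = K_landen(0.8)
assert abs(K_val - ellipk(0.8**2)) < 1e-12
```

In practice far fewer than 20 steps suffice; the modulus roughly squares at each iteration.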
Table 21.6-1. Transformation to Legendre's Normal Form (all zeros of G(x) real): for each sign (±1) of the leading coefficient of G₄(x) or G₃(x) and each integration interval bounded by the real zeros α₁ > α₂ > α₃ > α₄ (or ∞), the table lists the substitution x = x(φ), expressed through sin²φ and the differences α_{ik}, the corresponding values of φ (between 0 and π/2), and the resulting parameters k² and μ. (From A. Erdélyi et al., Higher Transcendental Functions, vol. 2, McGraw-Hill, New York, 1953.)
Table 21.6-1. Transformation to Legendre's Normal Form (Continued; G(x) has complex zeros): analogous substitutions x = x(φ) for polynomials G₄(x) with two real and two complex zeros b₁ ± ic₁, and with four complex zeros b₁ ± ic₁, b₂ ± ic₂, expressed through the auxiliary angles ϑ₁, ϑ₂, . . . of (21.6-28); the corresponding values of φ and of k² and μ are tabulated for each case. (From A. Erdélyi et al., Higher Transcendental Functions, vol. 2, McGraw-Hill, New York, 1953.)
Table 21.6-2. Transformations of Elliptic Integrals: relations expressing F(φ̄, k̄) and E(φ̄, k̄) for the transformed moduli k̄ = ik/k′, k′, 1/k, 1/k′, k′/ik, (1 − k′)/(1 + k′) in terms of F(φ, k), E(φ, k), Δ(φ, k), sin φ, and cos φ. (From A. Erdélyi et al., Higher Transcendental Functions, vol. 2, McGraw-Hill, New York, 1953.)
Table 21.6-3. Transformations of Complete Elliptic Integrals: for each transformed modulus k̄ = 1/k, k′, 1/k′, ik/k′, k′/ik, (1 − k′)/(1 + k′), 2√k/(1 + k), the table expresses K(k̄), K′(k̄), E(k̄), and E′(k̄) as combinations of K, K′, E, and E′. (From A. Erdélyi et al., Higher Transcendental Functions, vol. 2, McGraw-Hill, New York, 1953.)
21.6-7. Jacobi's Elliptic Functions. (a) Definitions. Inversion of the elliptic integrals z = F(k, φ) and z = F(k, w) (Sec. 21.6-6a) yields the functions am z (amplitude of z) and sn z (sinus amplitudinis of z), i.e.,
φ = am z   if   z = ∫₀^φ dφ̄/√(1 − k² sin²φ̄) = F(k, φ)   (21.6-38)
w = sn z = sin (am z)   if   z = ∫₀^w dw̄/√[(1 − w̄²)(1 − k²w̄²)] = F(k, w)
In addition, one defines the functions cn z (cosinus amplitudinis of z) and dn z (delta amplitudinis of z) by
cn z = cos (am z) = √(1 − sn² z)      dn z = Δ(am z, k) = √(1 − k² sn² z)   (21.6-39)
sn z, cn z, and dn z are Jacobi's elliptic functions. A given value of the parameter k is implied in each definition; if required, one writes sn (z, k), cn (z, k), dn (z, k). k′, K, K′, E, and E′ are defined as in Sec. 21.6-6b. Jacobi's elliptic functions are real for real z and real k² between
Fig. 21.6-4. The Jacobian elliptic functions sn u, cn u, and dn u for k² = ½. (From Ref. 21.1.)
Fig. 21.6-5. One-quarter of a complete cycle of the elliptic functions cn u, sn u, and dn u, plotted against the normalized abscissa u/K, for three values of the modulus k. (From J. Cunningham, Introduction to Nonlinear Analysis, McGraw-Hill, New York, 1958.)
Table 21.6-4. Periods, Zeros, Poles, and Residues of Jacobi's Elliptic Functions*
(m and n are integers)

Function    Primitive periods   Zeros                      Poles                 Residues
sn (u, k)   4K, 2iK′            2mK + 2niK′                2mK + (2n + 1)iK′     (−1)^m/k
cn (u, k)   4K, 2K + 2iK′       (2m + 1)K + 2niK′          2mK + (2n + 1)iK′     (−1)^{m+n}/(ik)
dn (u, k)   2K, 4iK′            (2m + 1)K + (2n + 1)iK′    2mK + (2n + 1)iK′     (−1)^{n+1}i

* From A. Erdélyi et al., Higher Transcendental Functions, vol. 2, McGraw-Hill, New York, 1953.
0 and 1, and they reduce to elementary functions for k² = 0 and k² = 1; in particular,
sn (z, 0) = sin z      cn (z, 0) = cos z      dn (z, 0) = 1
sn (z, 1) = tanh z     cn (z, 1) = dn (z, 1) = 1/cosh z   (21.6-40)
Jacobi's elliptic functions can also be defined in terms of ℘, ζ, σ, or ϑ functions by the relations of Sec. 21.6-9.
(b) Miscellaneous Properties and Special Values. Jacobi's elliptic functions are of order 2 (Sec. 21.6-1); their periods, their (simple) zeros, and their (simple) poles are listed in Table 21.6-4. sn z is an odd function, while cn z and dn z are even functions of z. Table 21.6-5 lists special function values. Table 21.6-6 shows the effects of argument changes by quarter and half periods, using the convenient notation
sn (z, k) ≡ s      cn (z, k) ≡ c      dn (z, k) ≡ d   (21.6-41)
Note also
sn² z + cn² z = k² sn² z + dn² z = 1      dn² z − k² cn² z = k′²   (21.6-42)
sn (−z) = −sn z      cn (−z) = cn z      dn (−z) = dn z   (21.6-43)
sn (2K − z) = sn z        sn (2iK′ − z) = −sn z
cn (2K − z) = −cn z       cn (2iK′ − z) = −cn z   (21.6-44)
dn (2K − z) = dn z        dn (2iK′ − z) = −dn z
E(k, sn z) = ∫₀^z dn² z̄ dz̄   (21.6-45)
References 21.2 and 21.11 contain additional formulas.
Table 21.6-5. Special Values of Jacobian Elliptic Functions*: closed-form values of sn, cn, and dn at the quarter- and half-period points ½mK + ½niK′ (m, n = 0, 1, 2, 3), expressed through k and k′; e.g., sn (½K) = (1 + k′)^{−½}, cn (½K) = k′^{½}(1 + k′)^{−½}, dn (½K) = k′^{½}, and sn K = 1, cn K = 0, dn K = k′.
* From A. Erdélyi et al., Higher Transcendental Functions, vol. 2, McGraw-Hill, New York, 1953.
Table 21.6-6. Change of the Variable by Quarter and Half Periods; Symmetry*: values of sn (mK + niK′ ± u), cn (mK + niK′ ± u), and dn (mK + niK′ ± u) for m = −1, 0, 1, 2, 3 and n = −1, 0, 1, 2, expressed through s = sn (u, k), c = cn (u, k), d = dn (u, k); e.g., sn (K ± u) = c/d, cn (K ± u) = ∓k′s/d, dn (K ± u) = k′/d, and sn (iK′ ± u) = ±1/(ks).
* From A. Erdélyi et al., Higher Transcendental Functions, vol. 2, McGraw-Hill, New York, 1953.
Table 21.6-7. Transformations of the First Order of Jacobi's Elliptic Functions*: for each transformed argument ū = k′u, −iu, ku, −ik′u, −iku and modulus k̄ = ik/k′, k′, 1/k, 1/k′, k′/ik (with the associated quarter periods K̄, iK̄′), the table expresses sn (ū, k̄), cn (ū, k̄), and dn (ū, k̄) as quotients of sn (u, k), cn (u, k), and dn (u, k); e.g., sn (−iu, k′) = −i sn (u, k)/cn (u, k).
* From A. Erdélyi et al., Higher Transcendental Functions, vol. 2, McGraw-Hill, New York, 1953.
(c) Addition Formulas.
sn (A + B) = [sn A cn B dn B + sn B cn A dn A]/[1 − k² sn² A sn² B]
cn (A + B) = [cn A cn B − sn A dn A sn B dn B]/[1 − k² sn² A sn² B]   (21.6-46)
dn (A + B) = [dn A dn B − k² sn A cn A sn B cn B]/[1 − k² sn² A sn² B]
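The addition formulas (21.6-46) can be spot-checked numerically; the sketch below assumes SciPy, whose ellipj uses the parameter m = k².

```python
# Spot-check of the addition formulas (21.6-46) with scipy's ellipj.
from scipy.special import ellipj

m = 0.3
A, B = 0.4, 0.9
snA, cnA, dnA, _ = ellipj(A, m)
snB, cnB, dnB, _ = ellipj(B, m)
snAB, cnAB, dnAB, _ = ellipj(A + B, m)

den = 1.0 - m * snA**2 * snB**2
sn_sum = (snA * cnB * dnB + snB * cnA * dnA) / den
cn_sum = (cnA * cnB - snA * dnA * snB * dnB) / den
dn_sum = (dnA * dnB - m * snA * cnA * snB * cnB) / den

assert abs(sn_sum - snAB) < 1e-10
assert abs(cn_sum - cnAB) < 1e-10
assert abs(dn_sum - dnAB) < 1e-10
```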
(d) Differentiation.
d(sn z)/dz = cn z dn z = √[(1 − sn² z)(1 − k² sn² z)]
d(cn z)/dz = −sn z dn z = −√[(1 − cn² z)(k′² + k² cn² z)]   (21.6-47)
d(dn z)/dz = −k² sn z cn z = −√[(1 − dn² z)(dn² z − k′²)]
(e) Transformations. Table 21.6-7 shows relations between Jacobi's elliptic functions with moduli k and ik/k′, k′, 1/k, 1/k′, k′/ik (see also Sec. 21.6-6c).
(f) Series Expansions.
sn z = z − (1 + k²) z³/3! + (1 + 14k² + k⁴) z⁵/5! − · · ·
cn z = 1 − z²/2! + (1 + 4k²) z⁴/4! − (1 + 44k² + 16k⁴) z⁶/6! + · · ·   (21.6-48)
dn z = 1 − k² z²/2! + k²(4 + k²) z⁴/4! − k²(16 + 44k² + k⁴) z⁶/6! + · · ·
[|z| < min (|K′|, |2K + iK′|, |2K − iK′|)]
21.6-8. Jacobi's Theta Functions. (a) Given a complex variable v and a complex parameter q = e^{iπτ} such that τ has a positive imaginary part, the four theta functions*
ϑ₁(v) = ϑ₁(v|τ) = 2 Σ_{n=0}^∞ (−1)ⁿ q^{(n+½)²} sin (2n + 1)πv = i Σ_{n=−∞}^∞ (−1)ⁿ q^{(n−½)²} (e^{iπv})^{2n−1}
ϑ₂(v) = ϑ₂(v|τ) = 2 Σ_{n=0}^∞ q^{(n+½)²} cos (2n + 1)πv = Σ_{n=−∞}^∞ q^{(n−½)²} (e^{iπv})^{2n−1}   (21.6-49)
ϑ₃(v) = ϑ₃(v|τ) = 1 + 2 Σ_{n=1}^∞ q^{n²} cos 2nπv = Σ_{n=−∞}^∞ q^{n²} (e^{iπv})^{2n}
ϑ₄(v) = ϑ₄(v|τ) = 1 + 2 Σ_{n=1}^∞ (−1)ⁿ q^{n²} cos 2nπv = Σ_{n=−∞}^∞ (−1)ⁿ q^{n²} (e^{iπv})^{2n}
are (simply) periodic integral functions of v with the respective periods 2, 2, 1, 1. The four theta functions (49) have zeros at v = m + nτ, m + nτ + ½, m + (n + ½)τ + ½, and m + (n + ½)τ, respectively, where m, n = 0, ±1, ±2, . . . ; these zeros yield infinite-product representations (7.6-2) (Refs. 21.3 and 21.11). The theta functions are not elliptic functions. The very rapidly converging series (49) permit one to compute various elliptic functions and elliptic integrals with the aid of the relations of Sec. 21.6-9; in their own right, the theta functions are solutions of the partial differential equation
∂²ϑ/∂v² − 4πi ∂ϑ/∂τ = 0   (21.6-50)
which is related to the diffusion equation (Sec. 10.3-4b). (b) Note
ϑ₁(v + ½) = ϑ₂(v)      ϑ₂(v + ½) = −ϑ₁(v)   (21.6-51)
ϑ₃(v + ½) = ϑ₄(v)      ϑ₄(v + ½) = ϑ₃(v)   (21.6-52)
* Some authors denote ϑ₄(v) by ϑ₀(v) or ϑ(v).
(c) To find ϑᵢ(v|τ + 1) and ϑᵢ(v/τ | −1/τ) (and thus similarly transformed elliptic functions, Sec. 21.6-9), use
ϑ₁(v|τ + 1) = e^{iπ/4} ϑ₁(v|τ)      ϑ₂(v|τ + 1) = e^{iπ/4} ϑ₂(v|τ)
ϑ₃(v|τ + 1) = ϑ₄(v|τ)              ϑ₄(v|τ + 1) = ϑ₃(v|τ)   (21.6-53)
ϑ₃(v/τ | −1/τ) = √(τ/i) e^{iπv²/τ} ϑ₃(v|τ)   (21.6-54)
(d) The values of the four theta functions and their derivatives for v = 0 (zero-argument values) are simply denoted by ϑᵢ = ϑᵢ(0), ϑᵢ′ = ϑᵢ′(0), . . . (i = 1, 2, 3, 4) and satisfy the relations
ϑ₁′ = πϑ₂ϑ₃ϑ₄      ϑ₁′′′/ϑ₁′ = ϑ₂″/ϑ₂ + ϑ₃″/ϑ₃ + ϑ₄″/ϑ₄   (21.6-55)
21.6-9. Relations between Jacobi's Elliptic Functions, Weierstrass's Elliptic Functions, and Theta Functions. If the various parameters implicit in the definitions of sn z, cn z, dn z, ℘(z), ζ(z), σ(z), and ϑᵢ(v) are related by
k² = (e₂ − e₃)/(e₁ − e₃)      k′² = (e₁ − e₂)/(e₁ − e₃)   (21.6-56)
K = (ω₁/2)√(e₁ − e₃)      iK′ = (ω₂/2)√(e₁ − e₃) = τK   (Im τ > 0)   (21.6-57)
and
u = z√(e₁ − e₃) = 2Kv      v = z/ω₁ = u/2K   (21.6-58)
then
sn u = √[(e₁ − e₃)/(℘(z) − e₃)]      cn u = √[(℘(z) − e₁)/(℘(z) − e₃)]      dn u = √[(℘(z) − e₂)/(℘(z) − e₃)]   (21.6-59)
sn u = (ϑ₃/ϑ₂)[ϑ₁(v)/ϑ₄(v)]      cn u = (ϑ₄/ϑ₂)[ϑ₂(v)/ϑ₄(v)]      dn u = (ϑ₄/ϑ₃)[ϑ₃(v)/ϑ₄(v)]   (21.6-60)
21.7. ORTHOGONAL POLYNOMIALS
21.7-1. Survey. The orthogonal polynomials discussed in Secs. 21.7-1 to 21.7-8 are special solutions of linear homogeneous second-order differential equations related to the hypergeometric differential equation (9.3-31) (Legendre, Chebyshev, and Jacobi polynomials) or to the confluent hypergeometric differential equation (9.3-42) (Laguerre and Hermite polynomials). These special solutions are generated by special homogeneous boundary conditions: each class of orthogonal polynomials is a set of eigenfunctions for an eigenvalue problem reducible to the generalized Sturm-Liouville type (Secs. 15.4-8a and c). Only real z = x are of interest in most applications. The polynomials ψ₀(x), ψ₁(x), ψ₂(x), . . . of each type are, then, defined except for multiplicative constants, which are usually (but not always) chosen so that the coefficient of xⁿ in the nth-degree polynomial ψₙ(x) is unity. Successive polynomials ψ₀(x), ψ₁(x), ψ₂(x), . . . of each type can be derived
1. In terms of the appropriate hypergeometric series (Sec. 9.3-9a) or confluent hypergeometric series (Sec. 9.3-10)
2. By means of recursion formulas derived from the differential equation
3. By successive differentiations of a generating function γ(x, s) (see also Sec. 8.6-5)
4. Through Gram-Schmidt orthogonalization of the powers 1, x, x², . . . with the appropriate inner product (Sec. 15.2-5)
5. From an integral representation (Sec. 21.7-7), which is usually related to an integral-transform solution of the differential equation or to the Taylor- or Laurent-series coefficients of the generating function
Table 21.7-1 lists the principal formulas.
Series expansions in terms of orthogonal polynomials are derived in the manner of Sec. 15.2-4a and yield useful approximations which minimize appropriately defined mean-square errors (Sec. 15.2-6; see also Sees. 20.5-1 and 20.6-3).
21.7-2. Real Zeros of Orthogonal Polynomials. All zeros of each orthogonal polynomial discussed in Secs. 21.7-1 to 21.7-8 are simple and located in the interior of the expansion domain. Two consecutive zeros of ψₙ(x) bracket exactly one zero of ψ_{n+1}(x), and at least one zero of ψₘ(x) for each m > n (Ref. 21.9).
21.7-3. Legendre Functions (see also Secs. 21.8-10, 21.8-11, and 21.8-13; and Refs. 21.3, 21.9, and 21.11). The differential equation (Legendre's differential equation) and the recursion formulas for the Legendre polynomials (Table 21.7-1) are satisfied not only by the Legendre polynomials of the first kind Pₙ(z) of Table 21.7-1 but
also by the Legendre functions of the second kind
Q₀(z) = ½ logₑ [(1 + z)/(1 − z)]
Q₁(z) = (z/2) logₑ [(1 + z)/(1 − z)] − 1      Q₂(z) = ¼(3z² − 1) logₑ [(1 + z)/(1 − z)] − (3/2)z
. . .   (21.7-1)
More generally, the method of Sec. 9.3-8b permits one to derive linearly independent solutions Pₐ(z), Qₐ(z) of Legendre's differential equation (Legendre functions of the first and second kind) for nonintegral positive or negative or complex values of n = a; the solutions for n = a and n = −a − 1 are necessarily identical.
21.7-4. Chebyshev Polynomials of the First and Second Kind. The differential equation and the recursion formulas for the Chebyshev polynomials (Table 21.7-1) are satisfied not only by the Chebyshev polynomials of the first kind
Tₙ(x) ≡ cos (n arccos x)   (n = 0, 1, 2, . . .)   (21.7-2)
of Table 21.7-1, but also by the Chebyshev "polynomials" of the second kind
U₀(x) = arcsin x      Uₙ(x) ≡ sin (n arccos x) ≡ (√(1 − x²)/n) dTₙ(x)/dx   (n = 1, 2, . . .)   (21.7-3)
While the functions Uₙ(x) are not polynomials in x, the functions U_{n+1}(x)/√(1 − x²) and [1/(n + 1)] dT_{n+1}(x)/dx are polynomials; both are also sometimes referred to as Chebyshev polynomials of the second kind.
Note
Tₙ(x) = −(√(1 − x²)/n) dUₙ(x)/dx   (n = 1, 2, . . .)   (21.7-4)
∫₋₁¹ Uₙ(x)Uₙ′(x) dx/√(1 − x²) = 0 (n′ ≠ n or n′ = n = 0), = π/2 (n′ = n ≠ 0)   (21.7-5)
21.7-5. Associated Laguerre Polynomials and Functions (see also Secs. 9.3-10 and 10.4-6). (a) The associated (or generalized) Laguerre polynomials* of degree n − k and order k,
Lₙ^k(x) = (d^k/dx^k) Lₙ(x) = (−1)^k [(n!)²/(k!(n − k)!)] F(k − n; k + 1; x)   (n = 1, 2, . . . ; k = 0, 1, 2, . . . , n)   (21.7-6)
satisfy the differential equation
x d²w/dx² + (k + 1 − x) dw/dx + (n − k)w = 0   (21.7-7)
* This notation is used in most physics books. Some authors use alternative associated Laguerre polynomials L_n^{(k)}(x) = (−1)^k L_{n+k}^k(x)/(n + k)! of degree n, which satisfy the differential equation
x d²w/dx² + (k + 1 − x) dw/dx + nw = 0   (n, k = 0, 1, 2, . . .)
Fig. 21.7-1. Legendre polynomials Pₙ(x) as functions of x (a), (b), and as functions of ϑ = arccos x (c). (From Ref. 21.1.)
Fig. 21.7-2. Chebyshev polynomials Tₙ(x). (From Ref. 21.1.)
Fig. 21.7-3. Laguerre polynomials (a), and Hermite polynomials (b). (From Ref. 21.1.)
Table 21.7-1. Orthogonal Polynomials of Legendre, Chebyshev, Laguerre, and Hermite (see also Secs. 21.7-1 to 21.7-7)

Legendre polynomials Pₙ(x) (Fig. 21.7-1):
1. Differential equation: (1 − z²) d²w/dz² − 2z dw/dz + n(n + 1)w = 0
2. (Real) domain and boundary conditions: w(z) is unique and analytic for real z = x in (−1, 1); ∫₋₁¹ w²(x) dx exists
3. Orthogonality and normalization: ∫₋₁¹ Pₙ(x)Pₙ′(x) dx = 0 (n′ ≠ n), = 2/(2n + 1) (n′ = n)
4. Series (9.3-32): Pₙ(x) = F(−n, n + 1; 1; (1 − x)/2)
5. Recursion formula: (n + 1)P_{n+1}(x) = (2n + 1)x Pₙ(x) − n P_{n−1}(x)
6. Generalized Rodrigues's formula: Pₙ(x) = [1/(2ⁿn!)] dⁿ/dxⁿ (x² − 1)ⁿ
7. Generating function: 1/√(1 − 2sx + s²) = Σ_{n=0}^∞ Pₙ(x)sⁿ (|s| < 1), = Σ_{n=0}^∞ Pₙ(x)s^{−n−1} (|s| > 1)
8. First polynomials: P₀ = 1, P₁ = x, P₂ = ½(3x² − 1), P₃ = ½(5x³ − 3x), P₄ = ⅛(35x⁴ − 30x² + 3), P₅ = ⅛(63x⁵ − 70x³ + 15x)

Chebyshev polynomials Tₙ(x) (Fig. 21.7-2), with x = cos ϑ:
1. (1 − z²) d²w/dz² − z dw/dz + n²w = 0
2. w(z) is unique and analytic for real z = x in (−1, 1); ∫₋₁¹ w²(x) dx/√(1 − x²) exists
3. ∫₋₁¹ Tₙ(x)Tₙ′(x) dx/√(1 − x²) = 0 (n′ ≠ n), = π/2 (n′ = n ≠ 0), = π (n′ = n = 0)
4. Tₙ(x) = F(n, −n; ½; (1 − x)/2)
5. T_{n+1}(x) = 2x Tₙ(x) − T_{n−1}(x)
6. Tₙ(x) = [(−2)ⁿn!/(2n)!] √(1 − x²) dⁿ/dxⁿ (1 − x²)^{n−½}
7. (1 − sx)/(1 − 2sx + s²) = Σ_{n=0}^∞ Tₙ(x)sⁿ (|s| < 1)
8. T₀ = 1, T₁ = x = cos ϑ, T₂ = 2x² − 1 = cos 2ϑ, T₃ = 4x³ − 3x = cos 3ϑ, T₄ = 8x⁴ − 8x² + 1 = cos 4ϑ, T₅ = 16x⁵ − 20x³ + 5x = cos 5ϑ; in general Tₙ(cos ϑ) = cos nϑ

Laguerre polynomials Lₙ(x)* (Fig. 21.7-3a):
1. z d²w/dz² + (1 − z) dw/dz + nw = 0
2. w(z) is unique and analytic for real z = x in (0, ∞); ∫₀^∞ e^{−x}w²(x) dx exists
3. ∫₀^∞ e^{−x}Lₙ(x)Lₙ′(x) dx = 0 (n′ ≠ n), = (n!)² (n′ = n)
4. Series (9.3-43): Lₙ(x) = n! F(−n; 1; x) = (−1)ⁿ[xⁿ − n²x^{n−1} + (n²(n − 1)²/2!)x^{n−2} − · · ·]
5. L_{n+1}(x) = (2n + 1 − x)Lₙ(x) − n²L_{n−1}(x)
6. Lₙ(x) = eˣ dⁿ/dxⁿ (xⁿe^{−x})
7. [1/(1 − s)] e^{−xs/(1−s)} = Σ_{n=0}^∞ Lₙ(x) sⁿ/n! (|s| < 1)
8. L₀ = 1, L₁ = −x + 1, L₂ = x² − 4x + 2, L₃ = −x³ + 9x² − 18x + 6, L₄ = x⁴ − 16x³ + 72x² − 96x + 24, L₅ = −x⁵ + 25x⁴ − 200x³ + 600x² − 600x + 120

Hermite polynomials Hₙ(x)† (Fig. 21.7-3b):
1. d²w/dz² − 2z dw/dz + 2nw = 0
2. w(z) is unique and analytic for real z = x in (−∞, ∞); ∫_{−∞}^∞ e^{−x²}w²(x) dx exists
3. ∫_{−∞}^∞ e^{−x²}Hₙ(x)Hₙ′(x) dx = 0 (n′ ≠ n), = 2ⁿn!√π (n′ = n)
4. H_{2n}(x) = (−1)ⁿ2ⁿ(2n − 1)(2n − 3) · · · 3 · 1 F(−n; ½; x²); H_{2n+1}(x) = (−1)ⁿ2^{n+1}(2n + 1)(2n − 1) · · · 3 · 1 x F(−n; 3⁄2; x²)
5. H_{n+1}(x) = 2x Hₙ(x) − 2n H_{n−1}(x); dHₙ(x)/dx = 2n H_{n−1}(x)
6. Hₙ(x) = (−1)ⁿ e^{x²} dⁿ/dxⁿ e^{−x²}
7. e^{2sx − s²} = Σ_{n=0}^∞ Hₙ(x) sⁿ/n!
8. H₀ = 1, H₁ = 2x, H₂ = 4x² − 2, H₃ = 8x³ − 12x, H₄ = 16x⁴ − 48x² + 12, H₅ = 32x⁵ − 160x³ + 120x

* Some authors denote the polynomial (1/n!)Lₙ(x) by Lₙ(x).
† Some authors use alternative Hermite polynomials Heₙ(x) = 2^{−n/2}Hₙ(x/√2), which satisfy the differential equation d²w/dz² − z dw/dz + nw = 0.
Table 21.7-2. Coefficients for Orthogonal Polynomials, and for xⁿ in Terms of Orthogonal Polynomials*
(a) Legendre polynomials: coefficients cₘ in Pₙ(x) = aₙ⁻¹ Σₘ cₘxᵐ and dₘ in xⁿ = bₙ⁻¹ Σₘ dₘPₘ(x) for n = 0, 1, . . . , 7; e.g., P₆(x) = (1/16)[231x⁶ − 315x⁴ + 105x² − 5] and x⁶ = (1/231)[33P₀ + 110P₂ + 72P₄ + 16P₆].
(b) Chebyshev polynomials: the corresponding coefficients for Tₙ(x); e.g., T₆(x) = 32x⁶ − 48x⁴ + 18x² − 1 and x⁶ = (1/32)[10T₀ + 15T₂ + 6T₄ + T₆].
* Abridged from M. Abramowitz and I. A. Stegun (eds.), Handbook of Mathematical Functions, National Bureau of Standards, Washington, D.C., 1964.
Table 21.7-2. Coefficients for Orthogonal Polynomials, and for xⁿ in Terms of Orthogonal Polynomials (Continued)
(c) Laguerre polynomials: coefficients for Lₙ(x) and for the expansion of xⁿ; e.g., L₆(x) = x⁶ − 36x⁵ + 450x⁴ − 2400x³ + 5400x² − 4320x + 720, and x⁶ = Σ_{m=0}^{6} dₘLₘ(x)/m! with (d₀, . . . , d₆) = (720, −4320, 10800, −14400, 10800, −4320, 720).
(d) Hermite polynomials: coefficients for Hₙ(x) and for xⁿ; e.g., H₆(x) = 64x⁶ − 480x⁴ + 720x² − 120 and x⁶ = (1/64)[120H₀ + 180H₂ + 30H₄ + H₆].
for integral values n = 1, 2, . . . ; k = 0, 1, 2, . . . , n. Equation (7) reduces to the differential equation of the Laguerre polynomials (Table 21.7-1) for k = 0.
Note the generating function
[(−s)^k/(1 − s)^{k+1}] e^{−xs/(1−s)} = Σ_{n=k}^∞ Lₙ^k(x) sⁿ/n!   (k = 0, 1, 2, . . .)   (21.7-8)
and the orthogonality and normalization given by
∫₀^∞ e^{−x} x^k Lₙ^k(x) Lₙ′^k(x) dx = [(n!)³/(n − k)!] δₙⁿ′   (21.7-9)
If nonintegral real or complex values of n and k are admitted, the differential equation (7) defines the generalized Laguerre functions. These functions, of which the polynomials (6) are special cases, are confluent hypergeometric functions (Sec. 9.3-10) with a = k − n, c = k + 1.
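The handbook convention Lₙ^k(x) = d^k Lₙ(x)/dx^k can be expressed through SciPy's generalized Laguerre polynomials (which follow the "alternative" normalization of the footnote): Lₙ^k(x) = (−1)^k n! L_{n−k}^{(k)}(x). A sketch, with the normalization (21.7-9) checked by quadrature (SciPy is an assumption of this illustration):

```python
# Handbook L_n^k via scipy's eval_genlaguerre, plus a check of (21.7-9).
from math import factorial
import numpy as np
from scipy.special import eval_genlaguerre
from scipy.integrate import quad

def L_nk(x, n, k):                     # handbook convention L_n^k = d^k L_n / dx^k
    return (-1)**k * factorial(n) * eval_genlaguerre(n - k, k, x)

# spot-check: L_2^1(x) = d/dx (x^2 - 4x + 2) = 2x - 4, so L_2^1(1) = -2
assert abs(L_nk(1.0, 2, 1) - (-2.0)) < 1e-12

n, k = 3, 1
norm, _ = quad(lambda x: np.exp(-x) * x**k * L_nk(x, n, k)**2, 0, np.inf)
assert abs(norm - factorial(n)**3 / factorial(n - k)) < 1e-6   # (3!)^3/2! = 108
```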
(b) The functions φ_{nj}(x) = x^j e^{−x/2} L_{n+j}^{2j+1}(x) (n = 1, 2, . . . ; j = 0, 1, 2, . . . , n − 1), which are often referred to as associated Laguerre functions, satisfy the differential equation
d²w/dx² + (2/x) dw/dx + [−¼ + n/x − j(j + 1)/x²] w = 0   (n = 1, 2, . . . ; j = 0, 1, 2, . . . , n − 1)   (21.7-10)
and
∫₀^∞ φ²_{nj}(x) x² dx = 2n[(n + j)!]³/(n − j − 1)!   (21.7-11)
(see also Sec. 10.4-6).
21.7-6. Hermite Functions. The functions ψₙ(x) = e^{−x²/2}Hₙ(x) (n = 0, 1, 2, . . .), which are often referred to as Hermite functions, satisfy the differential equation
d²w/dx² + (2n + 1 − x²)w = 0   (n = 0, 1, 2, . . .)   (21.7-12)
and
∫_{−∞}^∞ ψₙ(x)ψₙ′(x) dx = 2ⁿn!√π δₙⁿ′   (21.7-13)
21.7-7. Some Integral Formulas (see Refs. 21.3 and 21.11 for additional formulas).
Pₙ(cos ϑ) = (1/π) ∫₀^π (cos ϑ + i sin ϑ cos t)ⁿ dt   (n = 0, 1, 2, . . .)   (Laplace's integral)   (21.7-14)
Pₙ(x) = (1/2πi) ∮ [(t² − 1)ⁿ/(2ⁿ(t − x)^{n+1})] dt   (n = 0, 1, 2, . . .)   (Schläfli's integral)   (21.7-15)
The integration contour in Eq. (15) surrounds the point t = x.
Hₙ(x) = (2ⁿ/√π) ∫_{−∞}^∞ (x + it)ⁿ e^{−t²} dt   (n = 0, 1, 2, . . .)   (21.7-16)
∫₀^∞ x^{k+1} e^{−x} [Lₙ^k(x)]² dx = [(n!)³/(n − k)!] (2n − k + 1)
∫₀^∞ x^{k+2} e^{−x} [Lₙ^k(x)]² dx = [(n!)³/(n − k)!] (6n² − 6nk + k² + 6n − 3k + 2)   (21.7-17)
(n = 1, 2, . . . ; k = 0, 1, 2, . . . , n − 1)
∫_{−∞}^∞ x e^{−x²} Hₙ(x)Hₙ′(x) dx = 2^{n−1} n! √π δₙ′^{n−1} + 2ⁿ(n + 1)! √π δₙ′^{n+1}   (21.7-18)
21.7-8. Jacobi Polynomials and Gegenbauer Polynomials (see Ref. 21.11 for a detailed discussion). (a) Jacobi's (hypergeometric) polynomials are special instances of hypergeometric functions (Sec. 9.3-9):
ψₙ(x) = F(−n, α + n; γ; x) = [x^{1−γ}(1 − x)^{γ−α}/(γ(γ + 1) · · · (γ + n − 1))] dⁿ/dxⁿ [x^{γ+n−1}(1 − x)^{α−γ+n}]   (21.7-19)
and satisfy the orthogonality conditions
∫₀¹ x^{γ−1}(1 − x)^{α−γ} ψₙ(x)ψₘ(x) dx = 0   (m ≠ n)
= n! Γ(γ)² Γ(α − γ + n + 1)/[(α + 2n) Γ(γ + n) Γ(α + n)]   (m = n)   [Re γ > 0; Re (α − γ) > −1]   (21.7-20)
(b) The functions
Cₙ^α(x) = [Γ(n + 2α)/(n! Γ(2α))] F(−n, n + 2α; α + ½; (1 − x)/2)   (21.7-21)
are known as Gegenbauer (ultraspherical) polynomials. They constitute a generalization of the Legendre polynomials (Table 21.7-1), to which they reduce for α = ½. The Gegenbauer polynomials satisfy the differential equation
(z² − 1) d²w/dz² + (2α + 1)z dw/dz − n(n + 2α)w = 0   (21.7-22)
and the orthogonality condition
∫₋₁¹ (1 − x²)^{α−½} [Cₙ^α(x)]² dx = π 2^{1−2α} Γ(n + 2α)/[n!(n + α)Γ(α)²]   (21.7-23)
The Gegenbauer polynomials can be generated as coefficients of the power series
(1 − 2sx + s²)^{−α} = Σ_{n=0}^∞ Cₙ^α(x) sⁿ   (21.7-24)
21.8. CYLINDER FUNCTIONS, ASSOCIATED LEGENDRE FUNCTIONS, AND SPHERICAL HARMONICS
21.8-1. Bessel Functions and Other Cylinder Functions. (a) A cylinder function (circular-cylinder function) of order m is a solution w = Z_m(z) of the linear differential equation
d²w/dz² + (1/z) dw/dz + (1 − m²/z²)w = 0   (Bessel's differential equation)   (21.8-1)
where m is any real number; one usually imposes the recurrence relations
Z_{m+1}(z) = (2m/z)Z_m(z) − Z_{m−1}(z) = (m/z)Z_m(z) − dZ_m(z)/dz   (21.8-2)
as additional defining conditions. The functions e^{±i(κz±mφ)} Z_m(iκr′) are solutions of Laplace's partial differential equation in cylindrical coordinates r′, φ, z.
(b) The most generally useful cylinder functions of order m satisfying the recurrence relations (2) are (see also Figs. 21.8-2 to 21.8-4)
J_m(z) = Σ_{k=0}^∞ [(−1)^k/(k!Γ(m + k + 1))] (z/2)^{m+2k}   (|arg z| < π)   (21.8-3)
(Bessel functions of the first kind)
N_m(z) = [1/sin mπ][J_m(z) cos mπ − J_{−m}(z)]   (m ≠ 0, ±1, ±2, . . .)
N_m(z) = (−1)^m N_{−m}(z) = (2/π) J_m(z)[logₑ(z/2) + C]
   − (1/π) Σ_{k=0}^∞ [(−1)^k/(k!(m + k)!)] (z/2)^{m+2k} [Σ_{j=1}^k (1/j) + Σ_{j=1}^{m+k} (1/j)]
   − (1/π) Σ_{k=0}^{m−1} [(m − k − 1)!/k!] (z/2)^{2k−m}   (m = 0, 1, 2, . . . ; |arg z| < π)   (21.8-4)
(Neumann's Bessel functions of the second kind)
H_m^{(1)}(z) ≡ J_m(z) + iN_m(z)      H_m^{(2)}(z) ≡ J_m(z) − iN_m(z)   (21.8-5)
(Hankel functions of the first and second kind)
(21.8-5)
The last three sums in Eq. (4) are given the value zero whenever the lower limit exceeds the upper limit; C ≈ 0.577216 is Euler's constant (21.4-6). Note that every function N_m(z) has a singularity at the origin.*
(c) Analytic Continuation. To obtain values of the cylinder functions for |arg z| > π, use

J_m(e^{in\pi}z) = e^{imn\pi}J_m(z)     N_m(e^{in\pi}z) = e^{-imn\pi}N_m(z) + 2i\,J_m(z)\sin mn\pi\,\cot m\pi   (n = 0, 1, 2, . . .)   (21.8-6)

where sin mnπ cot mπ is assigned its limiting value (−1)^{mn}n for integral m; and

H_m^{(1)}(e^{i\pi}z) = -e^{-im\pi}H_m^{(2)}(z) = -H_{-m}^{(2)}(z)     H_m^{(2)}(e^{-i\pi}z) = -e^{im\pi}H_m^{(1)}(z) = -H_{-m}^{(1)}(z)   (21.8-7)
Note that cylinder functions of integral order are single-valued integral functions (Sec. 7.6-5).
(d) Every cylinder function of order m can be expressed as a linear combination of J_m(z) and N_m(z) and as a linear combination of H_m^{(1)}(z) and H_m^{(2)}(z):

Z_m(z) = aJ_m(z) + bN_m(z) = a'H_m^{(1)}(z) + b'H_m^{(2)}(z)   (21.8-8)

(fundamental systems, Sec. 9.3-2). J_m(z) and J_{-m}(z) constitute a fundamental system unless m = 0, ±1, ±2, . . . , since then J_{-m}(z) = (-1)^m J_m(z). The three fundamental systems have the respective Wronskians (Sec. 9.3-2) 2/πz, −4i/πz, and −2 sin (mπ)/πz; the first two Wronskians are independent of m. Note

J_m(z) = \tfrac12\,[H_m^{(1)}(z) + H_m^{(2)}(z)]     N_m(z) = \frac{1}{2i}\,[H_m^{(1)}(z) - H_m^{(2)}(z)]   (21.8-9)

J_{-m}(z) = \tfrac12\,[e^{im\pi}H_m^{(1)}(z) + e^{-im\pi}H_m^{(2)}(z)]   (21.8-10)
(e) Cylinder functions with m = ±½, ±3/2, ±5/2, . . . can be written as elementary transcendental functions (see also Sec. 21.8-8):

J_{1/2}(z) = \sqrt{\frac{2}{\pi z}}\,\sin z     J_{-1/2}(z) = \sqrt{\frac{2}{\pi z}}\,\cos z   (21.8-12)

J_{k+1/2}(z) = \sqrt{\frac{2}{\pi z}}\,(-1)^k z^{k+1/2}\left(\frac{d}{z\,dz}\right)^k\frac{\sin z}{z}   (k = 1, 2, . . .)   (21.8-13)

N_{1/2}(z) = -\sqrt{\frac{2}{\pi z}}\,\cos z   (21.8-14)

H_{1/2}^{(1)}(z) = -i\sqrt{\frac{2}{\pi z}}\,e^{iz}     H_{1/2}^{(2)}(z) = i\sqrt{\frac{2}{\pi z}}\,e^{-iz}   (21.8-15)
* The Neumann functions N_m(z) are sometimes denoted by Y_m(z); some authors refer to them as Weber's Bessel functions of the second kind.
21.8-2. Integral Formulas (see Sec. 8.6-4, Table D-7, and Refs. 21.3 and 21.11 for additional relations). (a) Integral Representations of J_0(z), J_1(z), J_2(z), . . .

J_m(z) = \frac{1}{\pi}\int_0^{\pi}\cos (mt - z\sin t)\,dt   (m = 0, 1, 2, . . .)   (Bessel's integral formula)   (21.8-16)

J_{2m}(z) = \frac{2}{\pi}\int_0^{\pi/2}\cos (z\sin t)\cos 2mt\,dt
J_{2m+1}(z) = \frac{2}{\pi}\int_0^{\pi/2}\sin (z\sin t)\sin (2m+1)t\,dt   (m = 0, 1, 2, . . .)   (21.8-17)

J_m(z) = \frac{1}{2\pi}\int_0^{2\pi}e^{i(z\sin t - mt)}\,dt = \frac{i^{-m}}{2\pi}\int_0^{2\pi}e^{iz\cos t}\cos mt\,dt   (m = 0, 1, 2, . . .)   (Hansen's integral formula)   (21.8-18)

J_m(z) = \frac{1}{2\pi i}\left(\frac{z}{2}\right)^m\oint t^{-m-1}e^{t - z^2/4t}\,dt   (m = 0, 1, 2, . . .)   (21.8-19)

where the integration contour encloses the origin, and |arg z| < π.
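Bessel's integral formula (21.8-16) can be checked directly against the power series (21.8-3) with elementary quadrature; the sketch below (Python assumed; function names are ours) uses a composite Simpson rule:

```python
from math import cos, sin, pi, gamma

def bessel_j_series(m, z, terms=40):
    # power series (21.8-3)
    return sum((-1) ** k / (gamma(k + 1) * gamma(m + k + 1))
               * (z / 2) ** (m + 2 * k) for k in range(terms))

def bessel_j_integral(m, z, n=2000):
    # (1/pi) * integral_0^pi cos(m*t - z*sin t) dt, composite Simpson rule
    h = pi / n
    total = 0.0
    for i in range(n + 1):
        t = i * h
        w = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += w * cos(m * t - z * sin(t))
    return total * h / (3 * pi)
```

Both evaluations agree to quadrature accuracy, which makes the pair a useful cross-check for tabulated values.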
Fig. 21.8-1. Integration contours C_1, C_2, C_3 for Sommerfeld's integrals. (J. A. Stratton, Electromagnetic Theory, McGraw-Hill, New York, 1941.)

(b) Sommerfeld's and Poisson's Formulas. Referring to Fig. 21.8-1, the complex contour integral

Z_m(z) = \frac{e^{-im\pi/2}}{\pi}\int_C e^{i(z\cos t + mt)}\,dt   (Sommerfeld's integral)   (21.8-20)

equals H_m^{(1)}(z) for the contour C = C_1, H_m^{(2)}(z) for the contour C = C_2, and 2J_m(z) for the contour C = C_3. The contours may be deformed at will, provided that they start and terminate by approaching t → ∞ in the shaded areas indicated; t = 0 and t = π may be used as saddle points (Sec. 7.7-3e) for C_1 and C_2 in computations of Z_m(z) for large values of z (see also Sec. 21.8-9). Note also

J_m(z) = \frac{2(z/2)^m}{\sqrt{\pi}\,\Gamma(m+\frac12)}\int_0^{\pi/2}\cos (z\cos t)\,\sin^{2m}t\,dt   (m > -\tfrac12)   (Poisson's integral formula)   (21.8-21)
(c) Miscellaneous Integral Formulas Involving Cylinder Functions (see also Sec. 21.8-4c).
\int_0^x J_m(\alpha x)J_m(\beta x)\,x\,dx = \frac{x}{\alpha^2 - \beta^2}\,[\alpha J_m(\beta x)J_{m+1}(\alpha x) - \beta J_m(\alpha x)J_{m+1}(\beta x)]
 = \frac{x}{\alpha^2 - \beta^2}\,[\beta J_{m-1}(\beta x)J_m(\alpha x) - \alpha J_{m-1}(\alpha x)J_m(\beta x)]   (\alpha^2 - \beta^2 \ne 0)

\int_0^x [J_m(\alpha x)]^2\,x\,dx = \frac{x^2}{2}\left[\frac{dJ_m(\alpha x)}{d(\alpha x)}\right]^2 + \frac{1}{2}\left(x^2 - \frac{m^2}{\alpha^2}\right)[J_m(\alpha x)]^2   (m > -1)
(Lommel's integrals)   (21.8-22)

\int_0^{\infty}x^{n-m}J_m(ax)\,dx = \frac{2^{n-m}\,\Gamma\left(\frac{n+1}{2}\right)}{a^{n-m+1}\,\Gamma\left(m - \frac{n-1}{2}\right)}   (-1 < n < m + \tfrac12)   (21.8-23)
21.8-3. Zeros of Cylinder Functions. (a) All zeros of cylinder functions are simple zeros, except possibly z = 0. Consecutive positive or negative real zeros of two linearly independent real cylinder functions of order m alternate; z = 0 is the only possible common zero.

Fig. 21.8-2. The Bessel functions J_0(x), J_1(x), J_2(x), J_3(x) for real arguments. Note that J_m(-x) = (-1)^m J_m(x).
(b) (See also Fig. 21.8-2.) The function J_m(z) has an infinite number of real zeros; for m > −1, all its zeros are real. For m = 0, ½, 1, 3/2, 2, . . . and n = 1, 2, . . . , J_m(z) and J_{m+n}(z) have no common zeros. For m = 1, 2, . . . , consecutive positive or negative real zeros of J_m(z) are separated by unique real zeros of J_{m+1}(z) and by unique real zeros of J_{m-1}(z); and consecutive positive or negative real zeros of z^{-m}J_m(z) are separated by unique real zeros of z^{-m}J_{m+1}(z).
21.8-4. The Bessel Functions J_0(z), J_1(z), J_2(z), . . . (a) Generation by Series Expansions. Bessel functions of nonnegative integral order m = 0, 1, 2, . . . are single-valued integral functions of z. They may be "generated" (see also Sec. 8.6-5) as coefficients of the Laurent series (Sec. 7.5-3)

e^{\frac{z}{2}\left(s - \frac{1}{s}\right)} = \sum_{m=-\infty}^{\infty}J_m(z)s^m = J_0(z) + \sum_{m=1}^{\infty}\left[s^m + (-s)^{-m}\right]J_m(z)   (21.8-24)

or as coefficients of the Fourier series

\cos (z\sin t) = J_0(z) + 2\sum_{k=1}^{\infty}J_{2k}(z)\cos 2kt
\sin (z\sin t) = 2\sum_{k=1}^{\infty}J_{2k-1}(z)\sin (2k-1)t   (21.8-25a)

e^{\pm iz\sin t} = \sum_{m=-\infty}^{\infty}J_m(z)e^{\pm imt} = J_0(z) + 2\sum_{k=1}^{\infty}\left[J_{2k}(z)\cos 2kt \pm iJ_{2k-1}(z)\sin (2k-1)t\right]
(Jacobi-Anger formula)   (21.8-25b)

Note

1 = J_0(z) + 2\sum_{k=1}^{\infty}J_{2k}(z) = J_0^2(z) + 2\sum_{k=1}^{\infty}J_k^2(z)   (21.8-26)

\left(\frac{z}{2}\right)^n = \sum_{k=0}^{\infty}\frac{(n+2k)(n+k-1)!}{k!}\,J_{n+2k}(z)   (n = 1, 2, . . .)   (21.8-27)
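The sum rules (21.8-26) give a cheap sanity check on any table or implementation of J_m; a sketch (Python assumed, with J computed from the series (21.8-3)):

```python
from math import gamma

def J(m, z, terms=40):
    # power series (21.8-3)
    return sum((-1) ** k / (gamma(k + 1) * gamma(m + k + 1))
               * (z / 2) ** (m + 2 * k) for k in range(terms))

def check_sum_rules(z, kmax=20):
    # 1 = J_0 + 2*sum J_{2k}   and   1 = J_0^2 + 2*sum J_k^2   (21.8-26)
    s1 = J(0, z) + 2 * sum(J(2 * k, z) for k in range(1, kmax + 1))
    s2 = J(0, z) ** 2 + 2 * sum(J(k, z) ** 2 for k in range(1, kmax + 1))
    return s1, s2
```

Both truncated sums are extremely close to 1 for moderate z, because J_m(z) decays rapidly once m exceeds z.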
(b) Behavior for Real Arguments. For real z = x, the functions J_0(x), J_1(x), J_2(x), . . . are real; Fig. 21.8-2 illustrates their zeros, their maxima and minima, and their asymptotic behavior for x → ∞ (see also Secs. 21.8-3 and 21.8-9).
(c) Orthogonality Relations (see also Secs. 15.4-6 and 21.8-2c). Given any two real zeros x_i, x_k of J_m(z), one has the orthogonality relations

\int_0^1 J_m(x_i\xi)J_m(x_k\xi)\,\xi\,d\xi = \delta_i^k\,\tfrac12\,[J_{m+1}(x_i)]^2   (21.8-28)

which yield an orthogonal-function expansion. Note also

\int_0^{\infty}J_m(\alpha t)J_m(\beta t)\,t\,dt = \frac{1}{\alpha}\,\delta(\alpha - \beta)   (21.8-29)

where δ(x) is the impulse function introduced in Sec. 21.9-2.
Fig. 21.8-3. Bessel functions and Neumann functions. (From Ref. 21.1.)
Fig. 21.8-5. Modified Bessel and Hankel functions. (From Ref. 21.1.)
Fig. 21.8-6. ber x, bei x, ker x, and kei x. (From Ref. 21.1.)
Fig. 21.8-7. Spherical Bessel functions. (From Ref. 21.1.)
Fig. 21.8-8. Associated Legendre functions P_j^m(x). (From Ref. 21.1.)
21.8-5. Solution of Differential Equations in Terms of Cylinder Functions and Related Functions. The linear differential equation

\frac{d^2w}{dz^2} + \frac{1-2a}{z}\frac{dw}{dz} + \left[(bcz^{c-1})^2 + \frac{a^2 - m^2c^2}{z^2}\right]w = 0   (21.8-30)

has solutions of the form

w = z^a Z_m(bz^c)   (21.8-31)

Many special cases of Eq. (30) are of interest (Secs. 21.8-6 to 21.8-8, Refs. 21.6 and 21.9).

21.8-6. Modified Bessel and Hankel Functions. The modified cylinder functions of order m,

I_m(z) = i^{-m}J_m(iz)   (modified Bessel functions)
K_m(z) = \frac{\pi}{2}\,i^{m+1}H_m^{(1)}(iz)   (modified Hankel functions)   (21.8-32)

are defined with the aid of Eqs. (3) to (5); the definition is extended by means of Eqs. (6) and (7). The functions (32) are linearly independent solutions of the differential equation

\frac{d^2w}{dz^2} + \frac{1}{z}\frac{dw}{dz} - \left(1 + \frac{m^2}{z^2}\right)w = 0   (modified Bessel's differential equation)   (21.8-33)

(see also Sec. 10.4-3b) and satisfy the recursion formulas

I_{m+1}(z) = I_{m-1}(z) - \frac{2m}{z}I_m(z) = 2\frac{dI_m}{dz} - I_{m-1}(z)
K_{m+1}(z) = K_{m-1}(z) + \frac{2m}{z}K_m(z) = -2\frac{dK_m}{dz} - K_{m-1}(z)   (21.8-34)

I_m(z) and K_m(z) are real monotonic functions for m = 0, ±1, ±2, . . . and real positive z.
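For real arguments and integral m ≥ 0, the definition I_m(z) = i^{-m}J_m(iz) turns every term of the series (21.8-3) positive; a sketch (Python assumed):

```python
from math import factorial

def bessel_i(m, x, terms=40):
    """I_m(x) = i^(-m) J_m(ix): for integral m >= 0 the series (21.8-3)
    becomes  I_m(x) = sum_k (x/2)^(m+2k) / (k! (m+k)!),  all terms positive."""
    return sum((x / 2) ** (m + 2 * k) / (factorial(k) * factorial(m + k))
               for k in range(terms))
```

The recursion (21.8-34), I_{m-1}(x) − I_{m+1}(x) = (2m/x)I_m(x), again serves as a consistency check.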
21.8-7. The Functions ber_m z, bei_m z, her_m z, hei_m z, ker_m z, kei_m z. The functions ber_m z, bei_m z, her_m z, hei_m z, ker_m z, kei_m z, defined by

J_m(i^{\pm 3/2}z) \equiv \text{ber}_m z \pm i\,\text{bei}_m z
H_m^{(1)}(i^{3/2}z) \equiv \text{her}_m z + i\,\text{hei}_m z   (21.8-35)
i^{\mp m}K_m(i^{\pm 1/2}z) \equiv \text{ker}_m z \pm i\,\text{kei}_m z

are real for real values of z. The subscript m is omitted for m = 0, e.g., ber_0 z ≡ ber z. Note that

\text{ker}_m z = -\frac{\pi}{2}\,\text{hei}_m z     \text{kei}_m z = \frac{\pi}{2}\,\text{her}_m z   (21.8-36)

ber z, bei z, her z, and hei z, as well as J_0(i^{3/2}z) and H_0^{(1)}(i^{3/2}z), are solutions of the differential equation

\left(\frac{d^2}{dz^2} + \frac{1}{z}\frac{d}{dz}\right)^2 w = -w   (21.8-37)
Note

\text{ber } z = 1 - \frac{(z/2)^4}{(2!)^2} + \frac{(z/2)^8}{(4!)^2} - \cdots = 1 - \frac{z^4}{2^2\cdot 4^2} + \frac{z^8}{2^2\cdot 4^2\cdot 6^2\cdot 8^2} - \cdots
\text{bei } z = \frac{(z/2)^2}{(1!)^2} - \frac{(z/2)^6}{(3!)^2} + \cdots = \frac{z^2}{2^2} - \frac{z^6}{2^2\cdot 4^2\cdot 6^2} + \cdots   (21.8-38)

In some applications, it is convenient to introduce |J_m(i^{3/2}z)|, |K_m(i^{1/2}z)|, arg J_m(i^{3/2}z), and arg [i^{-m}K_m(i^{1/2}z)] as special functions instead of or together with ber_m z, bei_m z, ker_m z, and kei_m z (Ref. 21.6). All these functions are real for real values of z.
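The series (21.8-38) can be checked against the defining relation J_0(i^{3/2}z) = ber z + i bei z with complex arithmetic; a sketch (Python assumed):

```python
from math import factorial

def ber(z, terms=15):
    # ber z = 1 - (z/2)^4/(2!)^2 + (z/2)^8/(4!)^2 - ...   (21.8-38)
    return sum((-1) ** k * (z / 2) ** (4 * k) / factorial(2 * k) ** 2
               for k in range(terms))

def bei(z, terms=15):
    # bei z = (z/2)^2 - (z/2)^6/(3!)^2 + ...               (21.8-38)
    return sum((-1) ** k * (z / 2) ** (4 * k + 2) / factorial(2 * k + 1) ** 2
               for k in range(terms))

def j0_complex(u, terms=40):
    # J_0(u) for complex u, from the power series (21.8-3)
    return sum((-1) ** k * (u / 2) ** (2 * k) / factorial(k) ** 2
               for k in range(terms))
```

Evaluating j0_complex at u = i^{3/2}z reproduces ber z and bei z as its real and imaginary parts.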
21.8-8. Spherical Bessel Functions. The spherical Bessel functions of the first, second, third, and fourth kind,

j_j(z) = \sqrt{\frac{\pi}{2z}}\,J_{j+1/2}(z)     n_j(z) = \sqrt{\frac{\pi}{2z}}\,N_{j+1/2}(z)
h_j^{(1)}(z) = \sqrt{\frac{\pi}{2z}}\,H_{j+1/2}^{(1)}(z) = j_j(z) + in_j(z)     h_j^{(2)}(z) = \sqrt{\frac{\pi}{2z}}\,H_{j+1/2}^{(2)}(z) = j_j(z) - in_j(z)   (21.8-39)

satisfy the differential equation

\frac{d^2w}{dz^2} + \frac{2}{z}\frac{dw}{dz} + \left[1 - \frac{j(j+1)}{z^2}\right]w = 0   (21.8-40)

(see also Sec. 10.4-4c) and the recursion formulas

w_{j+1}(z) = \frac{2j+1}{z}w_j(z) - w_{j-1}(z) = -z^j\frac{d}{dz}\left[z^{-j}w_j(z)\right]   (21.8-41)

For integral values of j, the spherical Bessel functions are elementary transcendental functions (see also Sec. 21.8-1e):

j_0(z) = \frac{\sin z}{z}     n_0(z) = -\frac{\cos z}{z}     j_{-1}(z) = \frac{\cos z}{z}     n_{-1}(z) = \frac{\sin z}{z}
h_0^{(1)}(z) = -\frac{i}{z}\,e^{iz}     h_0^{(2)}(z) = \frac{i}{z}\,e^{-iz}   (21.8-42)

j_j(z) = (-1)^j z^j\left(\frac{d}{z\,dz}\right)^j\frac{\sin z}{z}     n_j(z) = (-1)^{j+1}j_{-j-1}(z)   (j = 1, 2, . . .)   (21.8-43)
21.8-9. Asymptotic Expansion of Cylinder Functions and Spherical Bessel Functions for Large Absolute Values of z (see also Secs. 4.4-3 and 4.8-6). As z → ∞,

J_m(z) \sim \sqrt{\frac{2}{\pi z}}\left[A_m(z)\cos\left(z - \frac{m\pi}{2} - \frac{\pi}{4}\right) - B_m(z)\sin\left(z - \frac{m\pi}{2} - \frac{\pi}{4}\right)\right]
N_m(z) \sim \sqrt{\frac{2}{\pi z}}\left[A_m(z)\sin\left(z - \frac{m\pi}{2} - \frac{\pi}{4}\right) + B_m(z)\cos\left(z - \frac{m\pi}{2} - \frac{\pi}{4}\right)\right]
   (|arg z| < π)   (21.8-44)

where A_m(z) and B_m(z) stand for the asymptotic series

A_m(z) = 1 - \frac{(4m^2-1)(4m^2-9)}{2!\,(8z)^2} + \frac{(4m^2-1)(4m^2-9)(4m^2-25)(4m^2-49)}{4!\,(8z)^4} - \cdots
B_m(z) = \frac{4m^2-1}{8z} - \frac{(4m^2-1)(4m^2-9)(4m^2-25)}{3!\,(8z)^3} + \cdots   (21.8-45)

Substitution of Eqs. (44) and (45) into Eq. (5) yields corresponding asymptotic expansions of H_m^{(1)}(z) and H_m^{(2)}(z). For |z| ≫ |m| as z → ∞,

J_m(z) \approx \sqrt{\frac{2}{\pi z}}\,\cos\left(z - \frac{m\pi}{2} - \frac{\pi}{4}\right)     N_m(z) \approx \sqrt{\frac{2}{\pi z}}\,\sin\left(z - \frac{m\pi}{2} - \frac{\pi}{4}\right)   (21.8-46)

H_m^{(1)}(z) \approx \sqrt{\frac{2}{\pi z}}\,e^{i\left(z - \frac{m\pi}{2} - \frac{\pi}{4}\right)}     H_m^{(2)}(z) \approx \sqrt{\frac{2}{\pi z}}\,e^{-i\left(z - \frac{m\pi}{2} - \frac{\pi}{4}\right)}   (21.8-47)

j_j(z) \approx \frac{1}{z}\sin\left(z - \frac{j\pi}{2}\right)     n_j(z) \approx -\frac{1}{z}\cos\left(z - \frac{j\pi}{2}\right)
h_j^{(1)}(z) \approx (-i)^{j+1}\,\frac{e^{iz}}{z}     h_j^{(2)}(z) \approx i^{j+1}\,\frac{e^{-iz}}{z}   (j = 0, 1, 2, . . .)   (21.8-48)

Note that the asymptotic relationship of J_m(z), N_m(z), H_m^{(1)}(z), and H_m^{(2)}(z) is analogous to the relationship of the more familiar trigonometric and exponential functions.
21.8-10. Associated Legendre Functions and Polynomials (see also Secs. 21.7-1 and 21.7-3). (a) The associated Legendre functions of degree j and order m are solutions of the differential equation

(1 - z^2)\frac{d^2w}{dz^2} - 2z\frac{dw}{dz} + \left[j(j+1) - \frac{m^2}{1-z^2}\right]w = 0   (21.8-49)

where j and m are complex numbers; Eq. (49) reduces to Legendre's differential equation (Table 21.7-1 and Sec. 21.7-3) for m = 0. The general theory of the associated Legendre functions is found in Refs. 21.3 and 21.9. In the most important applications (Secs. 10.4-3c and 21.8-12), j and m are effectively restricted to the real integral values 0, 1, 2, . . . , while z = x is the cosine of a real angle ϑ and hence a real number between −1 and 1. Under these conditions, Eq. (49) is satisfied by the associated Legendre "polynomials" of the first kind*

P_j^m(x) = \frac{(j+m)!}{2^m(j-m)!\,m!}\,(1-x^2)^{m/2}\,F\left(m-j,\,m+j+1;\,m+1;\,\frac{1-x}{2}\right)
 = (1-x^2)^{m/2}\,\frac{d^m}{dx^m}P_j(x)
 = \frac{(1-x^2)^{m/2}}{2^j\,j!}\,\frac{d^{j+m}}{dx^{j+m}}(x^2-1)^j
 = (-1)^{j+m}P_j^m(-x)
(real x between −1 and 1; j = 0, 1, 2, . . . ; m = 0, 1, 2, . . . , j)   (21.8-50)
(see also Sec. 9.3-9), with P_j^0(x) = P_j(x), and P_j^m(x) = 0 for m > j. In particular,

P_1^1(x) = \sqrt{1-x^2} = \sin\vartheta   (21.8-51)

P_2^1(x) = 3x\sqrt{1-x^2} = \tfrac32\sin 2\vartheta     P_2^2(x) = 3(1-x^2) = \tfrac32(1 - \cos 2\vartheta)   (21.8-52)

P_3^1(x) = \tfrac32(5x^2-1)\sqrt{1-x^2} = \tfrac38(\sin\vartheta + 5\sin 3\vartheta)
P_3^2(x) = 15x(1-x^2) = \tfrac{15}{4}(\cos\vartheta - \cos 3\vartheta)
P_3^3(x) = 15(1-x^2)\sqrt{1-x^2} = \tfrac{15}{4}(3\sin\vartheta - \sin 3\vartheta)   (21.8-53)

P_j^j(x) = 1\cdot 3\cdot 5\cdots(2j-1)\,(1-x^2)^{j/2} = 1\cdot 3\cdot 5\cdots(2j-1)\,\sin^j\vartheta   (j = 0, 1, 2, . . .)   (21.8-54)
where cos ϑ = x (see also Sec. 21.8-12).
(b) The associated Legendre "polynomials" defined by Eq. (50) satisfy the following recurrence relations (−1 < x < 1):

(2j+1)xP_j^m(x) - (j-m+1)P_{j+1}^m(x) - (j+m)P_{j-1}^m(x) = 0   (0 ≤ m ≤ j-1)   (21.8-55)

(x^2-1)\frac{d}{dx}P_j^m(x) - (j-m+1)P_{j+1}^m(x) + (j+1)xP_j^m(x) = 0   (0 ≤ m ≤ j)   (21.8-56)

P_j^{m+2}(x) - 2(m+1)\,\frac{x}{\sqrt{1-x^2}}\,P_j^{m+1}(x) + [j(j+1) - m(m+1)]P_j^m(x) = 0   (0 ≤ m ≤ j-2)   (21.8-57)

P_{j+1}^m(x) - P_{j-1}^m(x) = (2j+1)\sqrt{1-x^2}\,P_j^{m-1}(x)   (0 ≤ m ≤ j-1)   (21.8-58)
(j+m)(j+m+1)P_{j-1}^m(x) - (j-m)(j-m+1)P_{j+1}^m(x) = (2j+1)\sqrt{1-x^2}\,P_j^{m+1}(x)   (0 ≤ m ≤ j-1)   (21.8-59)

* Some authors reserve the symbol P_j^m(x) for the functions here denoted by (−1)^m P_j^m(x) or by (−1)^{j+m} P_j^m(x). Note that not all P_j^m(x) are actually polynomials in x.
(c) Asymptotic Behavior. As j → ∞,

P_j^m(\cos\vartheta) = (-1)^m\,\frac{\Gamma(j+m+1)}{\Gamma(j+\frac32)}\,\sqrt{\frac{2}{\pi\sin\vartheta}}\,\cos\left[\left(j + \tfrac12\right)\vartheta - \frac{\pi}{4} + \frac{m\pi}{2}\right] + O(j^{-1})   (0 < ϑ < π)   (21.8-60)
21.8-11. Integral Formulas Involving Associated Legendre Polynomials (see also Sec. 21.7-7).

P_j^m(x) = \frac{i^{-m}}{\pi}\,\frac{(j+m)!}{j!}\int_0^{\pi}\left(x + \sqrt{x^2-1}\,\cos t\right)^j\cos mt\,dt
(j = 0, 1, 2, . . . ; m = 0, 1, 2, . . . , j)   (Heine's integral formula)   (21.8-61)

\int_{-1}^{1}P_j^m(x)P_{j'}^m(x)\,dx = 0
(j, j' = 0, 1, 2, . . . ; m = 0, 1, 2, . . . , j < j')   (21.8-62)

\int_0^1\left[P_j^m(x)\right]^2dx = \frac{1}{2j+1}\,\frac{(j+m)!}{(j-m)!}   (j = 0, 1, 2, . . . ; m = 0, 1, 2, . . . , j)

\int_0^1\frac{\left[P_j^m(x)\right]^2}{1-x^2}\,dx = \frac{(j+m)!}{2m\,(j-m)!}   (j = 0, 1, 2, . . . ; m = 1, 2, . . . , j)   (21.8-63)
21.8-12. Spherical Harmonics. Orthogonality (see also Secs. 10.4-3c, 14.10-7b, and 15.2-6). (a) Solutions Φ(r, ϑ, φ) of Laplace's partial differential equation in spherical coordinates (10.4-15) are known as spherical harmonics. Spherical surface harmonics of degree j are solutions Y_j(ϑ, φ) of the differential equation

\frac{\partial^2 Y}{\partial\vartheta^2} + \cot\vartheta\,\frac{\partial Y}{\partial\vartheta} + \frac{1}{\sin^2\vartheta}\,\frac{\partial^2 Y}{\partial\varphi^2} + j(j+1)Y = 0   (21.8-64)

obtained on separation of variables in Eq. (10.4-15). If one imposes the "boundary" conditions of regularity and uniqueness for 0 ≤ ϑ ≤ π, 0 ≤ φ ≤ 2π, the functions

P_j^m(\cos\vartheta)\cos m\varphi     P_j^m(\cos\vartheta)\sin m\varphi   (j = 0, 1, 2, . . . ; m = 0, 1, 2, . . . , j)   (21.8-65)

are known as tesseral spherical harmonics of degree j and order m; they are periodic on the surface of a sphere and change sign along "node lines" ϑ = constant and φ = constant (Fig. 21.8-9).

Both the functions (65) and the frequently more convenient complex functions

Y_j^m(\vartheta, \varphi) = \frac{1}{2}\sqrt{\frac{2j+1}{\pi}\,\frac{(j-|m|)!}{(j+|m|)!}}\;P_j^{|m|}(\cos\vartheta)\,e^{im\varphi}
(j = 0, 1, 2, . . . ; m = 0, ±1, ±2, . . . , ±j)   (21.8-66)

are orthogonal sets of eigenfunctions in terms of the inner product

(f, h) = \int_0^{2\pi}d\varphi\int_0^{\pi}f(\vartheta, \varphi)h^*(\vartheta, \varphi)\sin\vartheta\,d\vartheta   (21.8-67)

(Sec. 15.4-6b). (f, h) = 0 for every pair of distinct functions (65) or of functions (66); the functions (66) are normalized so that the inner product of each with itself equals one. There are exactly 2j + 1 linearly independent spherical surface harmonics of degree j.
(b) Every twice continuously differentiable, suitably periodic real function Φ(ϑ, φ) can be expanded in the series

\Phi(\vartheta, \varphi) = \sum_{j=0}^{\infty}\left[\tfrac12\alpha_{j0}P_j(\cos\vartheta) + \sum_{m=1}^{j}P_j^m(\cos\vartheta)(\alpha_{jm}\cos m\varphi + \beta_{jm}\sin m\varphi)\right]
 = \sum_{j=0}^{\infty}\sum_{m=-j}^{j}\gamma_{jm}P_j^{|m|}(\cos\vartheta)e^{im\varphi}   (21.8-68a)

with

\alpha_{jm} = \frac{2j+1}{2\pi}\,\frac{(j-m)!}{(j+m)!}\int_0^{2\pi}d\varphi\cos m\varphi\int_0^{\pi}\Phi(\vartheta, \varphi)P_j^m(\cos\vartheta)\sin\vartheta\,d\vartheta
\beta_{jm} = \frac{2j+1}{2\pi}\,\frac{(j-m)!}{(j+m)!}\int_0^{2\pi}d\varphi\sin m\varphi\int_0^{\pi}\Phi(\vartheta, \varphi)P_j^m(\cos\vartheta)\sin\vartheta\,d\vartheta
\gamma_{jm} = \frac{2j+1}{4\pi}\,\frac{(j-|m|)!}{(j+|m|)!}\int_0^{2\pi}d\varphi\,e^{-im\varphi}\int_0^{\pi}\Phi(\vartheta, \varphi)P_j^{|m|}(\cos\vartheta)\sin\vartheta\,d\vartheta   (21.8-68b)
Expansions of the form (68) can be physically interpreted as multipole expansions of potentials (Secs. 15.6-5a and c).

21.8-13. Addition Theorems. (a) Addition Theorem for Cylinder Functions. Let P_1 and P_2 be two points of a plane, with polar coordinates (r_1, φ_1) and (r_2, φ_2), and let d be the distance P_1P_2. Referring to Fig. 21.8-10a, let r_1 > r_2, so that

d^2 = r_1^2 + r_2^2 - 2r_1r_2\cos(\varphi_1 - \varphi_2)     de^{i\psi} = r_1 - r_2e^{i(\varphi_1 - \varphi_2)}   (21.8-69)
Fig. 21.8-9. Nodes of the function P_j^3(cos ϑ) sin 3φ.

Fig. 21.8-10. Geometry for addition theorems. The addition theorems are useful for expressing effects at P_2 of a source of potential, radiation, etc., at P_1, or vice versa (see also Secs. 15.6-5 and 15.6-6).
Then, for every cylinder function Z_m(z) satisfying Eqs. (1) and (2),

Z_m(ad)e^{im\psi} = \sum_{k=-\infty}^{\infty}Z_{m+k}(ar_1)J_k(ar_2)e^{ik(\varphi_1 - \varphi_2)}   (addition theorem for cylinder functions)   (21.8-70)

where a is any complex number. In particular,

Z_m[a(r_1 + r_2)] = \sum_{k=-\infty}^{\infty}Z_k(ar_1)J_{m-k}(ar_2)   (21.8-71)
(b) Addition Theorems for Spherical Bessel Functions and Legendre Polynomials. Let P_1 and P_2 be two points in space, with spherical coordinates (r_1, ϑ_1, φ_1) and (r_2, ϑ_2, φ_2), and let d be the distance P_1P_2 (Fig. 21.8-10b). One has

d^2 = r_1^2 + r_2^2 - 2r_1r_2\cos\gamma     \cos\gamma = \cos\vartheta_1\cos\vartheta_2 + \sin\vartheta_1\sin\vartheta_2\cos(\varphi_1 - \varphi_2)   (21.8-72)

and

j_0(ad) = \frac{\sin ad}{ad} = \sum_{k=0}^{\infty}(2k+1)j_k(ar_1)j_k(ar_2)P_k(\cos\gamma)
h_0^{(1)}(ad) = -i\,\frac{e^{iad}}{ad} = \sum_{k=0}^{\infty}(2k+1)h_k^{(1)}(ar_1)j_k(ar_2)P_k(\cos\gamma)   (r_1 > r_2)
(addition theorems for zero-order spherical Bessel functions)   (21.8-73)

P_j(\cos\gamma) = P_j(\cos\vartheta_1)P_j(\cos\vartheta_2) + 2\sum_{m=1}^{j}\frac{(j-m)!}{(j+m)!}\,P_j^m(\cos\vartheta_1)P_j^m(\cos\vartheta_2)\cos m(\varphi_1 - \varphi_2)
(addition theorem for Legendre polynomials)   (21.8-74)
21.9. STEP FUNCTIONS AND SYMBOLIC IMPULSE FUNCTIONS
21.9-1. Step Functions (see also Secs. 4.6-17c, 18.3-1, and 20.4-5c). (a) A step function of the real variable x is a function which changes its value only on a discrete set of discontinuities (necessarily of the first kind, Sec. 4.4-7b). The function values at the discontinuities may or may not be defined. The most frequently useful step functions are (see also Fig. 21.9-1)*

U(x) = 0 for x < 0,  ½ for x = 0,  1 for x > 0   (symmetrical unit-step function)
U_-(x) = 0 for x ≤ 0,  1 for x > 0
U_+(x) = 0 for x < 0,  1 for x ≥ 0   (asymmetrical unit-step functions)   (21.9-1)

* The notations employed to denote the various unit-step functions vary; use caution when referring to different texts.
Fig. 21.9-1. The unit-step functions U(x) and U_+(x) and approximations to the impulse functions δ(x), δ_+(x), δ'(x), and δ'_+(x).

Note
U(0-0) = 0     U(0+0) = 1     U(e^x) \equiv 1   (21.9-2)
U[(x-a)(x-b)] = U[x - \max(a, b)] + U[\min(a, b) - x]   (21.9-3)

Every step function can be expressed (except possibly at its discontinuities x = x_k) as a sum of the form \sum_k a_kU(x - x_k), \sum_k a_kU_-(x - x_k), or \sum_k a_kU_+(x - x_k) (EXAMPLES: sgn x ≡ 2U(x) − 1; and the "jump" functions of Sec. 20.4-5c).
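The three unit-step conventions and the sgn representation can be stated compactly in code (Python assumed):

```python
def U(x):        # symmetrical unit step (21.9-1): value 1/2 at the jump
    return 0.0 if x < 0 else (0.5 if x == 0 else 1.0)

def U_minus(x):  # 0 for x <= 0, 1 for x > 0
    return 1.0 if x > 0 else 0.0

def U_plus(x):   # 0 for x < 0, 1 for x >= 0
    return 1.0 if x >= 0 else 0.0

def sgn(x):      # sgn x = 2U(x) - 1, so that sgn 0 = 0
    return 2 * U(x) - 1
```

Only the value assigned at the discontinuity distinguishes the three step functions.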
21.9-2
876
SPECIAL FUNCTIONS
(b) Approximation of Step Functions by Continuous Functions.

U(x) = \lim_{a\to\infty}\left[\frac12 + \frac{1}{\pi}\arctan ax\right]   (21.9-4)

U(x) = \lim_{a\to\infty}\tfrac12\,[\operatorname{erf}(ax) + 1]   (21.9-5)

U(x) = \lim_{a\to\infty}\frac{1}{\pi}\int_{-\infty}^{ax}\frac{\sin\xi}{\xi}\,d\xi   (21.9-7)
(c) Fourier-integral Representations (see also Sec. 4.11-6). The complex contour integral \frac{1}{2\pi i}\int_{-\infty}^{\infty}\frac{e^{i\omega t}}{\omega}\,d\omega is respectively equal to U(t) or -U(-t) if the integration contour passes below or above the origin. The Cauchy principal value of the integral (Sec. 4.6-2b) equals U(t) - ½ (21.9-8). Note also

U(1-t) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{\sin\omega\cos\omega t}{\omega}\,d\omega   (t > 0)   (21.9-9)
21.9-2. The Symbolic Dirac Delta Function. (a) The symmetrical unit-impulse function or Dirac delta function δ(x) of a real variable x is "defined" by

\int_a^b f(\xi)\delta(\xi - X)\,d\xi = \begin{cases} 0 & \text{if } X < a \text{ or } X > b \\ \tfrac12 f(X) & \text{if } X = a \text{ or } X = b \\ f(X) & \text{if } a < X < b \end{cases}   (a < b)   (21.9-10a)

where f(x) is an arbitrary function continuous for x = X. More generally, one "defines" δ(x) by

\int_a^b f(\xi)\delta(\xi - X)\,d\xi = \begin{cases} 0 & \text{if } X < a \text{ or } X > b \\ \tfrac12 f(X+0) & \text{if } X = a \\ \tfrac12 f(X-0) & \text{if } X = b \\ \tfrac12[f(X-0) + f(X+0)] & \text{if } a < X < b \end{cases}   (a < b)   (21.9-10b)

where f(x) is an arbitrary function of bounded variation in a neighborhood of x = X. δ(x) is not a true function, since the "definition" (10) implies the inconsistent relations

\delta(x) = 0 \; (x \ne 0)     \int_{-\infty}^{\infty}\delta(\xi)\,d\xi = 1   (21.9-10c)

δ(x) is a "symbolic function" permitting the formal representation of the functional identity transformation f(ξ) → f(x) as an integral transformation (Sec. 15.3-1a). The "formal" use of δ(x) furnishes a convenient notation permitting suggestive generalizations of many mathematical relations (see also Secs. 8.5-1, 15.3-1a, 15.5-1, and 18.3-1). Although no functions having the exact properties (10) exist, it is possible to "approximate" δ(x) by functions exhibiting the desired properties to any degree of approximation (Sec. 21.9-4). One can usually avoid the use of impulse functions by introducing Stieltjes integrals (Sec. 4.6-17); thus

\int_a^b f(\xi)\delta(\xi - X)\,d\xi = \int_a^b f(\xi)\,dU(\xi - X)

It is possible to introduce a generalizing redefinition of the concepts of "function" and "differentiation" (Schwartz's theory of distributions, Refs. 21.13 to 21.18). Otherwise, mathematical arguments involving the use of impulse functions should be regarded as heuristic and require rigorous verification (see also Sec. 8.5-1).
(b) "Formal" Relations Involving δ(x).

\delta(ax) = \frac{1}{a}\,\delta(x) \; (a > 0)     \delta(-x) = \delta(x)     x\delta(x) = 0   (21.9-11)

f(x)\delta(x - a) = \tfrac12[f(a-0) + f(a+0)]\,\delta(x - a)   (21.9-12)

\delta(x^2 - a^2) = \frac{1}{2a}\,[\delta(x - a) + \delta(x + a)] \; (a > 0)   (21.9-13)

\int_{-\infty}^{\infty}\delta(a - x)\delta(x - b)\,dx = \delta(a - b)   (21.9-14)
21.9-3. "Derivatives" of Step Functions and Impulse Functions (see also Sec. 8.5-1). Equations (10) and (19) and also the relation ℒ[δ(t − a)] = e^{−as} = sℒ[U(t − a)] (a > 0) suggest the symbolic relationship

\delta(x) = \frac{d}{dx}U(x)   (21.9-15)

The impulse functions δ'(x), δ''(x), . . . , δ^{(r)}(x) are "defined" by

\int_a^b f(\xi)\delta^{(r)}(\xi - X)\,d\xi = \begin{cases} 0 & \text{if } X < a \text{ or } X > b \\ \tfrac12(-1)^r f^{(r)}(X+0) & \text{if } X = a \\ \tfrac12(-1)^r f^{(r)}(X-0) & \text{if } X = b \\ \tfrac12(-1)^r[f^{(r)}(X-0) + f^{(r)}(X+0)] & \text{if } a < X < b \end{cases}   (a < b)   (21.9-16)

for an arbitrary function f(x) such that the unilateral limits f^{(r)}(X−0) and f^{(r)}(X+0) exist. The functions δ^{(r)}(ξ − X) are kernels of linear integral transformations (Sec. 15.2-1) representing repeated differentiations. Note also the symbolic relation

x^r\delta^{(r)}(x) = (-1)^r r!\,\delta(x)   (r = 0, 1, 2, . . .)   (21.9-17)
21.9-4. Approximation of Impulse Functions (see also Fig. 21.9-1). (a) Continuously Differentiable Functions Approximating δ(x). One can approximate δ(x) by the continuously differentiable functions

\delta(x, a) = \frac{a}{\pi(a^2x^2 + 1)}   as a → ∞   (21.9-18a)

\delta(x, a) = \frac{a}{\sqrt{\pi}}\,e^{-a^2x^2}   as a → ∞   (21.9-18b)

\delta(x, a) = \frac{a}{\pi}\,\frac{\sin ax}{ax}   as a → ∞   (21.9-18c)

in the sense that \lim_{a\to\infty}\delta(x, a) = 0 (x ≠ 0), and

\lim_{a\to\infty}\int_{-\infty}^{\infty}f(\xi)\delta(\xi - x, a)\,d\xi = \tfrac12[f(x-0) + f(x+0)]

wherever f(x−0) and f(x+0) exist; note also \lim_{a\to\infty}\int_{-\infty}^{\infty}\delta(\xi, a)\,d\xi = 1. Integration of the approximating functions (18) yields the corresponding step-function approximations (4) and (5). ℒ[δ(x − α, a)] (α > 0) converges to e^{−αs} = ℒ[δ(x − α)] as a → ∞ for each function (18).
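The sifting behavior of the approximations (21.9-18) is easy to observe numerically; the sketch below (Python assumed) applies the Gaussian family (18b) to a smooth test function:

```python
from math import exp, cos, sqrt, pi

def delta_gauss(x, a):
    # approximation (21.9-18b): (a/sqrt(pi)) * exp(-a^2 x^2)
    return a / sqrt(pi) * exp(-a * a * x * x)

def sift(f, a, lo=-0.1, hi=0.1, n=20000):
    """Crude midpoint-rule value of integral f(xi) delta(xi, a) d(xi);
    for large a essentially all of the mass lies near xi = 0."""
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) * delta_gauss(lo + (i + 0.5) * h, a)
               for i in range(n)) * h
```

With a = 200 the integral of cos(ξ)δ(ξ, a) is already very close to cos 0 = 1, illustrating the limit above.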
(b) Discontinuous Functions Approximating δ(x). δ(x) is often approximated by the central-difference coefficient (Sec. 20.4-3)

\delta(x, h) = \frac{U(x + h/2) - U(x - h/2)}{h}   as h → 0   (21.9-19)

Note also

\lim_{\omega\to\infty}\frac{1}{\pi}\int_{-\infty}^{\infty}f(\xi)\,\frac{\sin\omega(X - \xi)}{X - \xi}\,d\xi = \tfrac12[f(X-0) + f(X+0)]   (-∞ < X < ∞)   (Dirichlet's integral formula)   (21.9-20)

and

\lim_{a\to 1-0}\frac{1}{2\pi}\int_{-\pi}^{\pi}f(\xi)\,\frac{1 - a^2}{1 - 2a\cos(X - \xi) + a^2}\,d\xi = \tfrac12[f(X-0) + f(X+0)]   (-π < X < π)   (21.9-21)

if f(x) is of bounded variation in a neighborhood of x = X (see also Sec. 4.11-6).
(c) Functions Approximating δ'(x), δ''(x), . . . , δ^{(r)}(x). Successive differentiation of the approximating functions of Secs. 21.9-4a and b yields, for example,

\delta'(x, a) = -\frac{2a^3x}{\pi(a^2x^2 + 1)^2}   as a → ∞   (21.9-22)

\delta^{(r)}(x, a) = \frac{(-1)^r r!\,\sin\left[(r+1)\arctan\dfrac{a}{x}\right]}{\pi(x^2 + a^2)^{(r+1)/2}}   as a → +0   (r = 0, 1, 2, . . .)   (21.9-23)

Note also

\delta'(x, h) = \frac{U(x + h) - 2U(x) + U(x - h)}{h^2}   as h → 0   (21.9-24)
21.9-5. Fourier-integral Representations. Note the formal relations

\delta(x - X) = \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-i\omega X}e^{i\omega x}\,d\omega   (21.9-25)

\delta^{(r)}(x - X) = \frac{1}{2\pi}\int_{-\infty}^{\infty}(i\omega)^r e^{-i\omega X}e^{i\omega x}\,d\omega   (21.9-26)

\tfrac12\,[\delta(x - X) + \delta(x + X)] = \frac{1}{\pi}\int_0^{\infty}\cos\omega X\cos\omega x\,d\omega   (21.9-27)
21.9-6. Asymmetrical Impulse Functions (see also Secs. 8.5-1 and 9.4-3). (a) The asymmetrical impulse functions δ_+(x), δ'_+(x), . . . , δ_+^{(r)}(x) are "defined" by

\int_{a+0}^{b}f(\xi)\delta_+(\xi - X)\,d\xi = \begin{cases} 0 & \text{if } X < a \text{ or } X \ge b \\ f(X+0) & \text{if } a \le X < b \end{cases}   (a < b)   (21.9-28)

\int_{a+0}^{b}f(\xi)\delta_+^{(r)}(\xi - X)\,d\xi = \begin{cases} 0 & \text{if } X < a \text{ or } X \ge b \\ (-1)^r f^{(r)}(X+0) & \text{if } a \le X < b \end{cases}   (a < b)   (21.9-29)

One may write

\delta_+(x) \equiv 2\delta(x)U(x) \equiv \frac{d}{dx}U_+(x)   (21.9-30)

One way to obtain approximation functions for δ_+(x) is to substitute the approximation functions of Sec. 21.9-4 into one of the relations (30), e.g.,

\delta_+(x, h) = \frac{U(x) - U(x - h)}{h}   as h → 0   (21.9-31)

Note also

\delta'_+(x, h) = 4\,\frac{U(x) - 2U(x - h/2) + U(x - h)}{h^2}   as h → 0   (21.9-32)

(b) One may introduce δ_+(−x) ≡ δ_−(x) as a second asymmetrical impulse function corresponding to the "derivative" of the asymmetrical step function U_−(x) (Sec. 21.9-1).
21.9-7. Multidimensional Delta Functions (see also Sec. 15.5-1). For an n-dimensional space of "points" (x^1, x^2, . . . , x^n) with a volume element defined as

dV = dV(x^1, x^2, . . . , x^n) = \sqrt{|g(x^1, x^2, . . . , x^n)|}\,dx^1\,dx^2\cdots dx^n

(Sec. 16.10-10), the n-dimensional delta function δ(x^1, ξ^1; x^2, ξ^2; . . . ; x^n, ξ^n) must satisfy

\int_V f(\xi^1, \xi^2, . . . , \xi^n)\,\delta(X^1, \xi^1; X^2, \xi^2; . . . ; X^n, \xi^n)\,dV(\xi^1, \xi^2, . . . , \xi^n) = f(X^1, X^2, . . . , X^n)   (21.9-33)

for every "point" (X^1, X^2, . . . , X^n) in V where f(x^1, x^2, . . . , x^n) is continuous. Note that the definition of δ(x^1, ξ^1; x^2, ξ^2; . . . ; x^n, ξ^n) depends on the coordinate system used and is meaningless wherever dV = 0. In particular, for rectangular cartesian coordinates x, y, z, one has dV = dx dy dz, and

\delta(x, \xi; y, \eta; z, \zeta) = \delta(x - \xi)\,\delta(y - \eta)\,\delta(z - \zeta)   (21.9-34)
21.10. REFERENCES AND BIBLIOGRAPHY
21.1. Abramowitz, M., and I. A. Stegun (eds.): Handbook of Mathematical Functions, National Bureau of Standards Applied Mathematics Series 55, Washington, D.C., 1964.
21.2. Byrd, P. F., and M. D. Friedman: Handbook of Elliptic Integrals, Springer, Berlin, 1954.
21.3. Erdélyi, A.: Higher Transcendental Functions, vols. 1 and 2 (Bateman Project), McGraw-Hill, New York, 1953.
21.4. Jahnke, E., and F. Emde: Tables of Functions with Formulae and Curves, Dover, New York, 1945.
21.5. Hurwitz, A., and R. Courant: Vorlesungen über allgemeine Funktionentheorie und elliptische Funktionen, 4th ed., Springer, Berlin, 1964.
21.6. McLachlan, N. W.: Bessel Functions for Engineers, Oxford, Fair Lawn, N.J., 1946.
21.7. Schäfke, F. W.: Einführung in die Theorie der speziellen Funktionen der mathematischen Physik, Springer, Berlin, 1963.
21.8. Sneddon, I. N.: The Special Functions of Physics and Chemistry, Oliver & Boyd, Edinburgh, 1956.
21.9. Whittaker, E. T., and G. N. Watson: Modern Analysis, Macmillan, New York, 1943.
21.10. ———: A Course in Modern Analysis, Cambridge, New York, 1946.
21.11. Oberhettinger, F., and W. Magnus: Formulas and Theorems for the Functions of Mathematical Physics, Chelsea, New York, 1954; 3d ed., Springer, Berlin, 1966.
21.12. Tricomi, F. G.: Elliptische Funktionen, Akademische Verlagsgesellschaft, Leipzig, 1948. (See also the articles by J. Lense and J. Meixner in vol. I of Handbuch der Physik, Springer, Berlin, 1956; and see also Sec. 10.6-2.)

Generalized Functions and the Theory of Distributions

21.13. Arsac, J.: Fourier Transforms and the Theory of Distributions, Prentice-Hall, Englewood Cliffs, N.J., 1966.
21.14. Friedman, A.: Generalized Functions and Partial Differential Equations, Prentice-Hall, Englewood Cliffs, N.J., 1956.
21.15. Gelfand, I. M., et al.: Generalized Functions, Academic, New York, 1964.
21.16. Lighthill, M. J.: Introduction to Fourier Analysis and Generalized Functions, Cambridge, New York, 1958.
21.17. Schwartz, L.: Théorie des Distributions, 2d ed., Hermann & Cie, Paris, 1957.
21.18. Zemanian, A. H.: Distribution Theory and Transform Analysis, McGraw-Hill, New York, 1965.
APPENDIX A

FORMULAS DESCRIBING PLANE FIGURES AND SOLIDS

A-1. The Trapezoid
A-2. Regular Polygons
A-3. The Circle
A-4. Prisms, Pyramids, Cylinders, and Cones
A-5. Solids of Revolution
A-6. The Five Regular Polyhedra
A-1. The Trapezoid (sides a, b, c, d; a and b are parallel; the altitude h is the distance between a and b). The area S is given by

S = \frac{a+b}{2}\,h     h = \frac{2}{a-b}\sqrt{s(s-a+b)(s-c)(s-d)}     s = \frac{a-b+c+d}{2}   (A-1)

The trapezoid is a parallelogram if a = b and a rhombus if a = b = c = d.
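Formula (A-1) is easy to exercise: for an isosceles trapezoid with a = 6, b = 2, and legs c = d = √13, the altitude works out to 3 and the area to 12 (Python assumed; function name is ours):

```python
from math import sqrt

def trapezoid(a, b, c, d):
    """Altitude h and area S of a trapezoid with parallel sides a > b,
    from Eq. (A-1)."""
    s = (a - b + c + d) / 2
    h = 2 / (a - b) * sqrt(s * (s - a + b) * (s - c) * (s - d))
    return h, (a + b) / 2 * h
```

The auxiliary quantity s plays the role of a semiperimeter for the triangle obtained by sliding one leg along the parallel sides.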
881
APPENDIX A
882
A-2. Regular Polygons (length of side equal to a; n = number of sides). The radius of the circumscribed circle is r = a/(2 sin (π/n)), the radius of the inscribed circle is ρ = a/(2 tan (π/n)), and the area is S = ½naρ.

Regular polygon        n    Radius r of circumscribed circle    Radius ρ of inscribed circle    Area S
Equilateral triangle   3    (a/3)√3                             (a/6)√3                         (a²/4)√3
Square                 4    (a/2)√2                             a/2                             a²
Regular pentagon       5    (a/10)√(50+10√5)                    (a/10)√(25+10√5)                (a²/4)√(25+10√5)
Regular hexagon        6    a                                   (a/2)√3                         (3a²/2)√3
Regular octagon        8    (a/2)√(4+2√2)                       (a/2)(1+√2)                     2a²(1+√2)
Regular decagon       10    (a/2)(1+√5)                         (a/2)√(5+2√5)                   (5a²/2)√(5+2√5)
A-3. The Circle (radius r; see also Sec. 2.5-1). (a) Circumference = 2πr; area = πr².
(b) A central angle of φ radians subtends
  a sector of area ½r²φ
  a chord of length 2r sin (φ/2)
  a segment of area ½r²(φ − sin φ)
(c) The area between a circle of radius r₁ and an enclosed (not necessarily concentric) circle of radius r₂ is π(r₁ + r₂)(r₁ − r₂).
A-4. Prisms, Pyramids, Cylinders, and Cones. (a) Volume of a prism or cylinder (bounded by a plane parallel to the base of area S₁; altitude h): V = hS₁.
(b) Volume of a pyramid or cone (base area S₁, altitude h): V = ⅓hS₁.
(c) Volume of the frustum of a pyramid or cone (bounded by parallel planes; base areas S₁, S₂; altitude h): V = ⅓h(S₁ + √(S₁S₂) + S₂).
(d) Curved surface area of a right circular cone (base radius r, altitude h): πr√(r² + h²).
883
PLANE FIGURES AND SOLIDS
A-5. Solids of Revolution

No.  Solid                                                       Surface area                               Volume
1    Sphere of radius r                                          4πr²                                       (4/3)πr³
2    Oblate spheroid (axes 2a > 2b, e = √(1 − b²/a²))            2πa² + π(b²/e) log_e [(1+e)/(1−e)]         (4/3)πa²b
3    Prolate spheroid (axes 2a > 2b, e = √(1 − b²/a²))           2πb² + 2π(ab/e) arcsin e                   (4/3)πab²
4    Torus (anchor ring; circle of radius r rotated about
     an axis at a distance R > r from its center)                4π²Rr                                      2π²Rr²
5    Zone or segment of a sphere of radius r between parallel
     planes at a distance h; base radii r₁, r₂                   2πrh + π(r₁² + r₂²)                        (πh/6)(3r₁² + 3r₂² + h²)
A-6. The Five Regular Polyhedra (length of side equal to a; the respective numbers F of surfaces, E of vertices, and K of edges are related by E + F − K = 2).

Regular polyhedron   Number and type of surfaces   Surface area       Volume              Radius of circumscribed sphere   Radius of inscribed sphere
Tetrahedron          4 equilateral triangles       a²√3               (a³/12)√2           (a/4)√6                          (a/12)√6
Cube                 6 squares                     6a²                a³                  (a/2)√3                          a/2
Octahedron           8 equilateral triangles      2a²√3               (a³/3)√2            (a/2)√2                          (a/6)√6
Dodecahedron         12 regular pentagons         3a²√(25+10√5)       (a³/4)(15+7√5)      (a/4)(√15+√3)                    (a/20)√(250+110√5)
Icosahedron          20 equilateral triangles     5a²√3               (5a³/12)(3+√5)      (a/4)√(10+2√5)                   (a/12)√3 (3+√5)
APPENDIX B

PLANE AND SPHERICAL TRIGONOMETRY
Plane Trigonometry
B-1. Introduction
B-2. Right Triangles
B-3. Properties of Plane Triangles
B-4. Formulas for Triangle Computations

Spherical Trigonometry
B-5. Spherical Triangles: Introduction
B-6. Properties of Spherical Triangles
B-7. The Right Spherical Triangle
B-8. Formulas for Triangle Computations
B-9. Formulas Expressed in Terms of the Haversine Function

Bibliography
B-1. Introduction. Plane trigonometry describes relations between the sides and angles of plane triangles in terms of trigonometric functions (Secs. 21.2-1 to 21.2-4); note that all plane figures bounded by straight lines may be regarded as combinations of triangles. Since all plane triangles may be resolved into right triangles, the most important trigonometric relations are those relating the sides and angles of right triangles.

B-2. Right Triangles. In every right triangle (Fig. B-1) with sides a, b and hypotenuse c,

A + B = 90°     a² + b² = c²   (theorem of Pythagoras)
sin A = cos B = a/c     sin B = cos A = b/c
tan A = cot B = a/b     tan B = cot A = b/a   (B-1)

Fig. B-1. Right triangle.
B-3. Properties of Plane Triangles. In every plane triangle (Fig. B-2), the sum of the angles equals 180 deg. The sum of any two sides is greater than the third, and the greater of two sides opposes the greater of two angles.

Fig. B-2. Oblique triangle.

A plane triangle is uniquely determined (except for symmetric images) by

1. Three sides
2. Two sides and the included angle
3. One side and two angles
4. Two sides and the angle opposite the greater side

In every plane triangle, the three bisectors of angles intersect in the center M of the inscribed circle. The three perpendicular bisectors of the sides intersect in the center F of the circumscribed circle. The three medians intersect in the center of gravity G of the triangle. The three altitudes intersect in a point H collinear with the last two points, so that HG : GF = 2. The midpoints of the sides, the feet of the perpendiculars from the vertices to the sides, and the midpoints of the straight-line segments joining H to each vertex lie on a circle whose radius is half that of the circumscribed circle (nine-point circle, or Feuerbach circle). The center of the nine-point circle is the midpoint of the straight-line segment HF.
B-4. Formulas for Triangle Computations. In the following relations, A, B, C are the angles opposite the respective sides a, b, c of a plane triangle (Fig. B-2). The triangle area is denoted by S; r and ρ are the respective radii of the circumscribed and inscribed circles, and s = ½(a + b + c). Additional formulas are obtained by simultaneous cyclic permutation of A, B, C and a, b, c. Table B-1 permits the computation of the sides and angles of any plane triangle from three suitable sides and/or angles.

    a² = b² + c² − 2bc cos A = (b + c)² − 4bc cos²(A/2)    (law of cosines)    (B-2)

    a/sin A = b/sin B = c/sin C = 2r    (law of sines)    (B-3)

    c = a cos B + b cos A    (projection theorem)    (B-4)

    sin(A/2) = +√[(s − b)(s − c)/(bc)]        cos(A/2) = +√[s(s − a)/(bc)]
    tan(A/2) = +√[(s − b)(s − c)/(s(s − a))]                                    (B-5)
    sin A = +(2/bc)√[s(s − a)(s − b)(s − c)]

    tan((B + C)/2) / tan((B − C)/2) = (b + c)/(b − c)    (B-6)

    (b + c) sin(A/2) = a cos((B − C)/2)        (b − c) cos(A/2) = a sin((B − C)/2)    (B-7)

    ρ = (s − a) tan(A/2) = (s − b) tan(B/2) = (s − c) tan(C/2)
      = 4r sin(A/2) sin(B/2) sin(C/2) = s tan(A/2) tan(B/2) tan(C/2)    (B-8)

    s = 4r cos(A/2) cos(B/2) cos(C/2)    (B-9)

    S = ½ab sin C = ½bc sin A = ½ac sin B = [s(s − a)(s − b)(s − c)]^½
      = 2r² sin A sin B sin C = abc/(4r) = ρs

    Length of altitude        h_a = 2S/a
    Length of angular bisector  w_a = [1/(b + c)]√[bc(a + b + c)(b + c − a)]    (B-10)
    Length of median          m_a = ½√(2b² + 2c² − a²)
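A minimal sketch of a triangle computation with these formulas (function and variable names are mine): given three sides, compute the angles from the law of cosines (B-2) and the area, circumradius, and inradius from (B-10).

```python
import math

def solve_sss(a, b, c):
    """Solve a plane triangle from its three sides: angles in degrees via
    the law of cosines (B-2); area S by Heron's form of (B-10);
    circumradius r from S = abc/(4r); inradius rho from S = rho*s."""
    s = 0.5 * (a + b + c)
    A = math.degrees(math.acos((b*b + c*c - a*a) / (2*b*c)))
    B = math.degrees(math.acos((a*a + c*c - b*b) / (2*a*c)))
    C = 180.0 - A - B
    S = math.sqrt(s * (s - a) * (s - b) * (s - c))   # Heron's formula
    r = a * b * c / (4.0 * S)                        # circumradius
    rho = S / s                                      # inradius
    return A, B, C, S, r, rho

A, B, C, S, r, rho = solve_sss(3.0, 4.0, 5.0)
```

For the 3-4-5 triangle this reproduces the expected right angle, area 6, circumradius 2.5 (half the hypotenuse), and inradius 1.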
Table B-1. Solution of Plane Triangles. Obtain all other cases by cyclic permutation (refer to the formulas of Sec. B-4 and to Fig. B-2); note A + B + C = 180°. Conditions for the existence of a solution: see also Sec. B-3.

Case 1. Given three sides a, b, c: A, B, C from (B-2) or (B-5). The sum of two sides must be greater than the third.

Case 2. Given two sides and the included angle b, c, A: (B + C)/2 and (B − C)/2 from (B-6) or (B-7), hence B and C; or B, C from (B-3) and (B-4):

    tan B = b sin A / (c − b cos A)

a from (B-3) or (B-4).

Case 3. Given one side and two angles a, B, C: b, c from (B-3); A = 180° − B − C.

Case 4. Given two sides and an opposite angle b, c, B: from (B-3), sin C = (c/b) sin B; A = 180° − B − C; a = b sin A / sin B. The problem has one solution if b > c, and two solutions if b < c and c sin B < b.
SPHERICAL TRIGONOMETRY

B-5. Spherical Triangles: Introduction. On the surface of a sphere, the shortest distance between two points is measured along a great circle, i.e., a circle whose plane passes through the center of the sphere (geodesic, Sec. 17.3-12). The vertices of a spherical triangle are the intersections of three directed straight lines passing through the center of the sphere with the spherical surface. The sides a, b, c of the spherical triangle are those three angles between the three directed straight lines which are less than 180 deg. Corresponding to each triangle side, there is a great-circle segment on the surface of the sphere (Fig. B-3). The angles A, B, C of the spherical triangle opposite the sides a, b, c, respectively, are the angles less than 180 deg between the great-circle segments corresponding to the triangle sides, or the corresponding angles between the three planes defined by the three given straight lines. Spherical trigonometry is the study of relations between the sides and angles of spherical triangles (e.g., on the surface of the earth and on the celestial sphere).

Fig. B-3. Spherical triangle.

In many problems, physicists and engineers will prefer the use of rotation transformations (Sec. 14.10-1) to the use of spherical trigonometry.
B-6. Properties of Spherical Triangles. Each side or angle of a spherical triangle is, by definition, smaller than 180 deg. The geometry on the surface of a sphere is non-Euclidean (see also Sec. 17.3-13); in every spherical triangle, the sum of the sides will be between 0 and 360 deg, and the sum of the angles will be between 180 and 540 deg. In every spherical triangle, the greater of two sides opposes the greater of two angles. The sum of any two sides is greater than the third, and the sum of any two angles is less than 180 deg plus the third angle.

A spherical triangle is uniquely determined (except for symmetric images) by

1. Three sides
2. Three angles
3. Two sides and the included angle
4. Two angles and the included side
5. Two sides and an opposite angle, given that the other opposite angle is less than, equal to, or greater than 90 deg
6. Two angles and an opposite side, given that the other opposite side is less than, equal to, or greater than 90 deg

Note: In every spherical triangle, it is possible to define great circles as perpendicular bisectors of sides, bisectors of angles, medians, and altitudes. The planes of the three great circles of each type intersect in a straight line.
In analogy to the circumscribed circle of a plane triangle, there exists a circumscribed right circular cone containing the three straight lines defining the triangle; the axis of this cone is the straight line formed by the intersection of the planes of the perpendicular bisectors. There is also an inscribed right circular cone touching the three planes corresponding to the spherical triangle; the axis of this cone is the straight line formed by the intersection of the planes of the bisectors of the angles. The "radius" r of the circumscribed circle and the "radius" ρ of the inscribed circle are angles defined as half the vertex angles of the respective cones.

Given the radius R of the sphere, the area S_R of a spherical triangle is given by

    S_R = R²ε    (B-11)

    ε = A + B + C − π    (B-12)

where ε is the spherical excess measured in radians. The quantity d = 2π − (a + b + c) is called the spherical defect.

The polar triangle corresponding to a given spherical triangle is defined by three directed straight lines perpendicular to the planes associated with the sides of the original triangle. The sides of the polar triangle are equal to the supplements of the corresponding angles of the original triangle, and conversely. Thus every theorem or formula dealing with the sides and angles of the original triangle may be transformed into one dealing with the angles and sides of the polar triangle.
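The area formula (B-11)/(B-12) translates directly into code; the sketch below (names are mine) computes the area from the three angles:

```python
import math

def spherical_area(A_deg, B_deg, C_deg, R=1.0):
    """Area of a spherical triangle from its angles, via S_R = R^2 * eps
    with eps = A + B + C - pi the spherical excess in radians
    (Eqs. B-11, B-12)."""
    eps = math.radians(A_deg + B_deg + C_deg) - math.pi
    if eps <= 0:
        raise ValueError("angle sum must exceed 180 deg")
    return R * R * eps

# an octant of the unit sphere has three 90-deg angles and
# area (1/8)(4 pi) = pi/2
area = spherical_area(90, 90, 90)
```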
B-7. The Right Spherical Triangle. In a right spherical triangle, at least one angle, C, say, is equal to 90 deg; the opposite side, c, is called the hypotenuse. All important relations between the sides and angles of the right spherical triangle may be derived from Napier's rules, two convenient aids to memory:

Fig. B-4. Napier's rules.

Napier's Rules: In the diagram of Fig. B-4, the sine of any of the angles shown is equal

1. To the product of the tangents of the two angles adjoining it in the diagram
2. To the product of the cosines of the two angles opposing it in the diagram

EXAMPLE: To compute the sides and angles of a right spherical triangle with the hypotenuse c, given c and a. This problem has a solution only if sin a ≤ sin c; then

    cos b = cos c / cos a        cos B = tan a / tan c        sin A = sin a / sin c

Note: If a is less than, equal to, or greater than 90 deg, so is A, and conversely. If b is less than, equal to, or greater than 90 deg, so is B.

If a and A are given, the problem has a solution only if the above condition is satisfied and sin a ≤ sin A; unless a = A, there are two solutions. The situation is analogous if b and B are given.

If A and B are given, the problem has a solution only if 90 deg < A + B < 270 deg and −90 deg < A − B < 90 deg (Sec. B-6).

A spherical triangle having a side equal to 90 deg is called a quadrantal triangle and may be treated as the polar triangle (Sec. B-6) of a right spherical triangle.

For all problems involving spherical-triangle computations (right or oblique triangles), it is strongly recommended that a sketch be drawn which roughly indicates whether the various angles and sides will be less than, equal to, or greater than 90 deg.
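The worked example of this section can be sketched in a few lines of Python (function name is mine; the acute-branch inverse functions are adequate here because, per the note above, A has the same quadrant as a and B the same quadrant as b):

```python
import math

def solve_right_spherical(c_deg, a_deg):
    """Right spherical triangle (C = 90 deg) with hypotenuse c and side a
    given, both < 90 deg; returns b, A, B in degrees from the Sec. B-7
    example: cos b = cos c/cos a, sin A = sin a/sin c, cos B = tan a/tan c."""
    c, a = math.radians(c_deg), math.radians(a_deg)
    if math.sin(a) > math.sin(c):
        raise ValueError("no solution: need sin a <= sin c")
    b = math.acos(math.cos(c) / math.cos(a))
    A = math.asin(math.sin(a) / math.sin(c))
    B = math.acos(math.tan(a) / math.tan(c))
    return tuple(map(math.degrees, (b, A, B)))

b, A, B = solve_right_spherical(75.0, 40.0)
```

A useful cross-check is the law of cosines for the sides (B-14), which must hold for the computed triangle.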
B-8. Formulas for Triangle Computations (see also Fig. B-3). In the following relations, A, B, C are the angles opposite the respective sides a, b, c of a spherical triangle. The respective "radii" of the circumscribed and inscribed cones are denoted by r and ρ. Additional formulas are obtained by simultaneous cyclic permutation of A, B, C and a, b, c. Table B-2 permits the computation of the sides or angles of any spherical triangle from three suitable sides and/or angles. The inequalities noted in Sec. B-6 must be observed in order to avoid ambiguous results in triangle computations.

    sin a / sin A = sin b / sin B = sin c / sin C    (law of sines)    (B-13)

    cos a = cos b cos c + sin b sin c cos A    (law of cosines for the sides)    (B-14)

    cos A = −cos B cos C + sin B sin C cos a    (law of cosines for the angles)    (B-15)

    tan((B + C)/2) cos((b + c)/2) = cot(A/2) cos((b − c)/2)
    tan((B − C)/2) sin((b + c)/2) = cot(A/2) sin((b − c)/2)
                                                                (Napier's analogies)    (B-16)
    tan((b + c)/2) cos((B + C)/2) = tan(a/2) cos((B − C)/2)
    tan((b − c)/2) sin((B + C)/2) = tan(a/2) sin((B − C)/2)
    sin(a/2) cos((B − C)/2) = sin(A/2) sin((b + c)/2)
    sin(a/2) sin((B − C)/2) = cos(A/2) sin((b − c)/2)
                                                        (Delambre's or Gauss's analogies)    (B-17)
    cos(a/2) cos((B + C)/2) = sin(A/2) cos((b + c)/2)
    cos(a/2) sin((B + C)/2) = cos(A/2) cos((b − c)/2)

In the following formulas, s = ½(a + b + c) and S = ½(A + B + C).
    sin(A/2) = +√[sin(s − b) sin(s − c) / (sin b sin c)]
    cos(A/2) = +√[sin s sin(s − a) / (sin b sin c)]
                                                                (half-angle formulas)    (B-18)
    sin(a/2) = +√[−cos S cos(S − A) / (sin B sin C)]
    cos(a/2) = +√[cos(S − B) cos(S − C) / (sin B sin C)]

    sin A = +[2/(sin b sin c)] √[sin s sin(s − a) sin(s − b) sin(s − c)]
                                                                                        (B-19)
    sin a = +[2/(sin B sin C)] √[−cos S cos(S − A) cos(S − B) cos(S − C)]

    cot r = √[cos(S − A) cos(S − B) cos(S − C) / (−cos S)]        tan(a/2) = cos(S − A) / cot r
                                                                                        (B-20)
    tan ρ = √[sin(s − a) sin(s − b) sin(s − c) / sin s]           cot(A/2) = sin(s − a) / tan ρ

    tan(A/2 − ε/4) = +√[ tan((s − b)/2) tan((s − c)/2) / (tan(s/2) tan((s − a)/2)) ]    (B-21)

    tan(ε/4) = +√[ tan(s/2) tan((s − a)/2) tan((s − b)/2) tan((s − c)/2) ]
                                                                (L'Huilier's equation)    (B-22)
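The half-angle formulas (B-18) give a numerically convenient route from three sides to the three angles; the sketch below (names mine) uses the tan(A/2) form and checks itself against the law of cosines (B-14):

```python
import math

def spherical_angles_from_sides(a, b, c):
    """Angles A, B, C (radians) of a spherical triangle from its sides
    (radians), via tan(A/2) = sqrt(sin(s-b) sin(s-c) / (sin s sin(s-a))),
    s = (a + b + c)/2, which follows from the half-angle formulas (B-18)."""
    s = 0.5 * (a + b + c)

    def angle(x, y, z):  # angle opposite side x; y, z are the other sides
        t = math.sqrt(math.sin(s - y) * math.sin(s - z) /
                      (math.sin(s) * math.sin(s - x)))
        return 2.0 * math.atan(t)

    return angle(a, b, c), angle(b, c, a), angle(c, a, b)
```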
B-9. Formulas Expressed in Terms of the Haversine Function. Certain trigonometric relations become particularly suitable for logarithmic computations if they are expressed in terms of the trigonometric functions versed sine, coversed sine, and haversine, defined by

    vers A = 1 − cos A        covers A = 1 − sin A        hav A = ½(1 − cos A)    (B-23)

Thus, if tables of the haversine function are available, one may use the following formulas for spherical-triangle computations:

    hav a = hav(b − c) + sin b sin c hav A        hav A = sin(s − b) sin(s − c) / (sin b sin c)    (B-24)

Other similar relations may be obtained by cyclic permutation.
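The first relation of (B-24) is the basis of the modern "haversine formula" for great-circle distances; a minimal sketch (names mine) and a cross-check against the law of cosines (B-14):

```python
import math

def hav(x):
    """Haversine: hav x = (1 - cos x)/2 = sin^2(x/2)  (Eq. B-23)."""
    return 0.5 * (1.0 - math.cos(x))

def side_from_haversine(b, c, A):
    """Side a of a spherical triangle from sides b, c and included angle A
    (all in radians), via (B-24): hav a = hav(b - c) + sin b sin c hav A.
    On a unit sphere, with b, c the colatitudes of two points and A their
    longitude difference, a is the great-circle distance between them."""
    h = hav(b - c) + math.sin(b) * math.sin(c) * hav(A)
    return 2.0 * math.asin(math.sqrt(h))
```

The haversine form is better conditioned than (B-14) for small distances, since it avoids the cancellation in 1 − cos a.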
Table B-2. Solution of Spherical Triangles (refer to the formulas of Sec. B-8 and to Fig. B-3). Obtain other cases by cyclic permutation. Conditions for the existence of a solution: see also Sec. B-6.

Case 1. Given three sides a, b, c: A, B, C from (B-18) and cyclic permutation. The sum of two sides must be greater than the third.

Case 2. Given three angles A, B, C: a, b, c from (B-18) and cyclic permutation. 540° > A + B + C > 180°; the sum of two angles must be less than 180° plus the third angle.

Case 3. Given two sides and the included angle b, c, A: (B + C)/2 and (B − C)/2 from (B-16), hence B and C; a from (B-17), (B-18), or (B-14).

Case 4. Given two angles and the included side B, C, a: (b + c)/2 and (b − c)/2 from (B-16), hence b and c; A from (B-17), (B-18), or (B-15).

Case 5. Given two sides and an opposite angle b, c, B: C from (B-13); A and a from (B-16). The problem has either one or two solutions if sin c sin B < sin b. Retain the values of C which make A − B and a − b of like sign, and A + B − 180° and a + b − 180° of like sign.

Case 6. Given two angles and an opposite side B, C, b: c from (B-13); A and a from (B-16). The problem has either one or two solutions if sin b sin C < sin B. Retain the values of c which make A − B and a − b of like sign, and A + B − 180° and a + b − 180° of like sign.
Bibliography

Kells, L. M., et al.: Plane and Spherical Trigonometry, 3d ed., McGraw-Hill, New York, 1951.
Palmer, C. I., et al.: Plane and Spherical Trigonometry, 5th ed., McGraw-Hill, New York, 1950.
APPENDIX C

PERMUTATIONS, COMBINATIONS, AND RELATED TOPICS

Table C-1. Permutations and Partitions
Table C-2. Combinations and Samples
Table C-3. Occupancy of Cells or States
C-1. Use of Generating Functions
C-2. Polya's Counting Theorem
References and Bibliography

Refer to Sec. 1.2-4 and Table 21.5-1 for definitions and properties of factorials and binomial coefficients. Stirling's formula (Sec. 21.4-2) is useful in numerical computations.

Table C-1. Permutations and Partitions

1. Number of different orderings (permutations) of a set of n distinct objects:  n!

2. Number of distinguishable sequences of N objects comprising n ≤ N indistinguishable objects of type 1 and N − n indistinguishable objects of type 2, or number of distinguishable partitions of a set of N distinct objects into 2 classes of n ≤ N and N − n objects, respectively:

    N! / [(N − n)! n!]    (binomial coefficient, Sec. 21.5-1)

3. Number of distinguishable sequences of N = N₁ + N₂ + ⋯ + N_r objects comprising N₁ indistinguishable objects of type 1, N₂ indistinguishable objects of type 2, …, and N_r indistinguishable objects of type r, or number of distinguishable partitions of a set of N distinct objects into r classes of N₁, N₂, …, N_r objects, respectively:

    N! / (N₁! N₂! ⋯ N_r!)    (multinomial coefficient)
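The multinomial coefficient of Table C-1, 3 is easy to compute and to verify by brute force; a short sketch (function name mine):

```python
import math
from itertools import permutations

def multinomial(*counts):
    """N!/(N1! N2! ... Nr!) -- number of distinguishable sequences of
    N = N1 + ... + Nr objects with Ni indistinguishable objects of
    type i (Table C-1, 3)."""
    out = math.factorial(sum(counts))
    for k in counts:
        out //= math.factorial(k)
    return out

# brute-force check: distinct orderings of the multiset {a, a, b, b, b}
word = "aabbb"
count = len(set(permutations(word)))   # deduplicate identical orderings
assert multinomial(2, 3) == count == 10
```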
Table C-2. Combinations and Samples (see also Table C-3 and Sec. 18.7-2). Each formula holds for N < n, N = n, and N > n.

1. Number of distinguishable unordered combinations of N distinct types of objects taken n at a time:

    i. Each type of object may occur at most once in any combination (combinations without repetition; see also Table C-1, 2):  C(N, n) = N!/[(N − n)! n!]
    ii. Each type of object may occur 0, 1, 2, …, or n times in any combination (combinations with repetition):  C(N + n − 1, n)
    iii. Each type of object must occur at least once in each combination:  C(n − 1, n − N)

2. Number of distinguishable samples (sequences, ordered sets, variations) of size n taken from a population of N distinct types of objects:

    i. Each type of object may occur at most once in any sample (samples without replacement, sequences without repetition):  N(N − 1) ⋯ (N − n + 1)
    ii. Each type of object may occur 0, 1, 2, …, or n times in any sample (samples with replacement, sequences with repetition):  N^n

EXAMPLES: Given a set of N = 3 distinct types of elements a, b, c. For n = 2, there exist 3 combinations without repetition (ab, ac, bc); 6 combinations with repetition (aa, ab, ac, bb, bc, cc); 6 distinguishable samples without replacement (ab, ac, ba, bc, ca, cb); and 9 distinguishable samples with replacement (aa, ab, ac, ba, bb, bc, ca, cb, cc).
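The four counts of Table C-2 map one-to-one onto Python's standard combinatoric generators, which makes the example above directly checkable:

```python
from itertools import (combinations, combinations_with_replacement,
                       permutations, product)
from math import comb

types = "abc"            # N = 3 distinct types of objects
N, n = len(types), 2     # taken n = 2 at a time

# combinations without repetition: C(N, n)
assert len(list(combinations(types, n))) == comb(N, n) == 3
# combinations with repetition: C(N + n - 1, n)
assert len(list(combinations_with_replacement(types, n))) == comb(N + n - 1, n) == 6
# samples without replacement: N(N - 1)...(N - n + 1)
assert len(list(permutations(types, n))) == N * (N - 1) == 6
# samples with replacement: N^n
assert len(list(product(types, repeat=n))) == N**n == 9
```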
Table C-3. Occupancy of Cells or States (see also Table C-2 and Sec. 18.7-2). Each formula holds for N < n, N = n, and N > n.

1. Number of distinguishable arrangements of n indistinguishable objects in N distinct cells (states):

    i. No cell may contain more than one object:  C(N, n)
    ii. Any cell may contain 0, 1, 2, …, or n objects:  C(N + n − 1, n)
    iii. Each cell must contain at least one object:  C(n − 1, n − N)

2. Number of distinguishable arrangements of n distinct objects in N distinct cells:

    i. No cell may contain more than one object:  N(N − 1) ⋯ (N − n + 1)
    ii. Any cell may contain 0, 1, 2, …, or n objects:  N^n
C-1. Use of Generating Functions (Refs. C-1, C-3). (a) Combinations. The combinations of n distinct objects A₁, A₂, …, A_n, taken k at a time without repetition, are all "exhibited" as the coefficients a_k generated by the generating function (Sec. 8.)

    F(s) = Σ_{k=0}^{n} a_k s^k = (1 + A₁s)(1 + A₂s) ⋯ (1 + A_n s)    (C-1)

The numbers of such combinations of n objects are the coefficients a_k = C(n, k) generated by the enumerating generating function (enumerator)

    F_enum(s) = Σ_{k=0}^{n} a_k s^k = (1 + s)^n    (C-2)

This "product model" for combinations can be generalized. Thus, if the object A₁ may be repeated 0, r₁, or r₂ times while A₂, A₃, …, A_n may each occur once or not at all, the corresponding generating functions take the form

    F(s) = (1 + A₁^{r₁} s^{r₁} + A₁^{r₂} s^{r₂})(1 + A₂s) ⋯ (1 + A_n s)    (C-3)

    F_enum(s) = (1 + s^{r₁} + s^{r₂})(1 + s)^{n−1}    (C-4)

If there is no restriction on the number of repetitions,

    F_enum(s) = (1 + s + s² + ⋯)^n = [1/(1 − s)]^n = Σ_{k=0}^{∞} C(n + k − 1, k) s^k    (C-5)

If each of the n objects must occur at least once,

    F_enum(s) = (s + s² + ⋯)^n = [s/(1 − s)]^n = Σ_{k=n}^{∞} C(k − 1, n − 1) s^k    (C-6)

(b) Permutations. To enumerate the permutations of n distinct objects taken k at a time without repetition one may use the exponential generating function (Sec. 8.7-2)

    G_enum(s) = (1 + s)^n = Σ_{k=0}^{n} [n!/(n − k)!] (s^k/k!)    (C-7)

If one of the n objects can be repeated 0, r₁, or r₂ times, then

    G_enum(s) = (1 + s^{r₁}/r₁! + s^{r₂}/r₂!)(1 + s)^{n−1}    (C-8)

If any number of repetitions is allowed,

    G_enum(s) = (1 + s + s²/2! + ⋯)^n = e^{ns}    (C-9)

If each object must occur at least once,

    G_enum(s) = (s + s²/2! + ⋯)^n = (e^s − 1)^n    (C-10)
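The enumerators (C-2) and (C-5) can be exercised with a few lines of polynomial arithmetic (helper names are mine); multiplying out the factors reproduces the binomial coefficients, and truncated geometric factors reproduce the leading coefficients of (C-5):

```python
from math import comb

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists in s."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def enumerator(factors):
    """Product of enumerator factors, each a coefficient list in s."""
    out = [1]
    for f in factors:
        out = poly_mul(out, f)
    return out

# (C-2): (1 + s)^n exhibits C(n, k) combinations without repetition
n = 5
assert enumerator([[1, 1]] * n) == [comb(n, k) for k in range(n + 1)]

# (C-5), truncated: low-order coefficients of (1 + s + ... + s^5)^3
# equal C(3 + k - 1, k) = 1, 3, 6, 10, ...
assert enumerator([[1] * 6] * 3)[:4] == [1, 3, 6, 10]
```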
C-2. Polya's Counting Theorem. Consider a finite set D of n "points" p, each to be associated with one of the elements (figures) f of a second finite set R (figure store, figure collection); more than one p may be associated with the same f. One desires to investigate the class of such arrangements or configurations (patterns, mappings from D into R, Fig. C-1) subject to a given group G of permutations of the points p (Sec. 12.2-8). Two configurations C₁, C₂ are equivalent with respect to G if and only if a permutation in G transforms C₁ into C₂; equivalent configurations necessarily contain the same figures.

Fig. C-1. Two configurations are shown. The right-hand one is obtained from the left-hand one through a permutation of the points p_i. Two such configurations can be equivalent by virtue of some previously defined symmetry in the point set D and/or because two (or more) of the figures (in this case, f₁ and f₂) are indistinguishable.

The cycle index Z_G(s₁, s₂, …, s_n) of the permutation group G is a generating function defined as follows. Every permutation P of G classifies the points p of D into uniquely determined subsets (cycles) such that P produces only cyclic permutations of each subset (Sec. 12.2-8). Let b_k be the number of such cycles of length k for a given permutation (b₁ + 2b₂ + ⋯ + nb_n = n). Then the cycle index is defined as the polynomial

    Z_G(s₁, s₂, …, s_n) = (1/g) Σ g_{b₁b₂…} s₁^{b₁} s₂^{b₂} ⋯ s_n^{b_n}    (C-11)

where g is the total number of permutations in (i.e., the order of) G, and g_{b₁b₂…} is the number of permutations with b₁ cycles of length 1, b₂ cycles of length 2, etc.

Assuming that each type of figure f in R is associated with a nonnegative integer w (weight or content of f), let a_w be the number of distinguishable f's of weight w. The figure-counting series (store enumerator) is the generating function

    a(s) = Σ_{w=0}^{∞} a_w s^w    (C-12)

The content (weight) of a configuration is the sum of its figure weights; equivalent configurations have equal content. Let A_w be the number of nonequivalent configurations of content w; then the configuration-counting series (pattern enumerator) is the generating function

    A(s) = Σ_{w=0}^{∞} A_w s^w    (C-13)

The generating functions (C-12) and (C-13) are related by

    A(s) = Z_G[a(s), a(s²), …, a(s^n)]    (Polya's counting theorem)    (C-14)

In particular, the configuration inventory A(1) = Σ_{w=0}^{∞} A_w is related to the figure inventory a(1) = Σ a_w by

    A(1) = Z_G[a(1), a(1), …, a(1)]    (C-15)

The theorem can be generalized to apply to situations where figures and configurations are labeled with two or more weights (Ref. C-1).
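A classic application of (C-15) is counting necklaces: configurations are colorings of n beads on a cycle, G is the cyclic rotation group, and the configuration inventory is obtained by substituting the number of colors m for every s_k in the cycle index. The sketch below (names mine) does this and verifies it by brute-force enumeration:

```python
from itertools import product
from math import gcd

def necklaces(n, m):
    """Number of m-colorings of n beads on a cycle, distinct under rotation.
    The rotation by j positions has gcd(j, n) cycles, each of length
    n/gcd(j, n), so Z_G(m, ..., m) = (1/n) * sum_j m^gcd(j, n)  (Eq. C-15)."""
    return sum(m ** gcd(j, n) for j in range(n)) // n

def brute(n, m):
    """Direct enumeration: canonicalize each coloring by its least rotation."""
    seen = set()
    for w in product(range(m), repeat=n):
        seen.add(min(w[i:] + w[:i] for i in range(n)))
    return len(seen)

assert necklaces(4, 2) == brute(4, 2) == 6
```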
References and Bibliography

C-1. Beckenbach, E. F. (ed.): Applied Combinatorial Mathematics, Wiley, New York, 1964.
C-2. MacMahon, P. A.: Combinatory Analysis (2 vols.), Cambridge, London, 1915-1916.
C-3. Riordan, J.: An Introduction to Combinatorial Analysis, Wiley, New York, 1958.
C-4. Ryser, H. J.: Combinatorial Mathematics, Wiley, New York, 1963.
APPENDIX D

TABLES OF FOURIER EXPANSIONS AND LAPLACE-TRANSFORM PAIRS

Table D-1. Fourier Coefficients and Mean-square Values of Periodic Functions
Table D-2. Fourier Transforms
Table D-3. Fourier Cosine Transforms
Table D-4. Fourier Sine Transforms
Table D-5. Hankel Transforms
Table D-6. Laplace-transform Pairs Involving Rational Algebraic Functions
Table D-7. Laplace Transforms

D-1. Tables D-1 to D-7 present a number of Fourier expansions (Sec. 4.11-4), Hankel transforms (Sec. 8.6-4), and Laplace-transform pairs (Secs. 8.2-1, 8.2-6, and 8.4-1) for reference.
D-2. Fourier-transform Pairs and Laplace-transform Pairs. (a) For suitable functions f(t) (Secs. 4.11-4 and 8.2-6),

    f(t) = (1/√(2π)) ∫_{−∞}^{∞} C(ω) e^{iωt} dω = ∫_{−∞}^{∞} c(ν) e^{2πiνt} dν    (−∞ < t < ∞)
                                                                                        (D-1)
    C(ω) ≡ (1/√(2π)) ∫_{−∞}^{∞} f(t) e^{−iωt} dt        c(ν) ≡ F_F(iω) ≡ ∫_{−∞}^{∞} f(t) e^{−2πiνt} dt    (ω = 2πν)

    F(s) = ∫_{0}^{∞} f(t) e^{−st} dt    (t > 0)    (D-2)

so that tables of Laplace-transform pairs (D-2) (Tables D-6 and D-7) may be used to obtain many Fourier-transform pairs (D-1):

1. Given f(t), obtain, if possible, F(s) = ℒ[f(t)] and F₁(s) = ℒ[f(−t)]. If F(s) and F₁(s) are analytic for σ > 0, then

    C(ω) = (1/√(2π)) [F(iω) + F₁(−iω)]
                                                                                        (D-3)
    c(ν) ≡ F_F(iω) ≡ F(iω) + F₁(−iω)    (ω = 2πν)

2. Given C(ω) or c(ν) ≡ F_F(iω),

    f(t) = √(2π) ℒ⁻¹[C(s/i)] = ℒ⁻¹[c(s/2πi)] = ℒ⁻¹[F_F(s)]    (t > 0)    (D-4)

provided that this expression exists, so that C(s/i) or F_F(s) is analytic for σ > 0, and f(t) = 0 for t < 0.

(b) The following procedures permit one to obtain many Laplace-transform pairs (D-2) from tables of Fourier-transform pairs (D-1) (Table D-2 and Ref. 8.1):

1. Given f(t) such that f(t) is real and f(t) = 0 for t < 0, use the table of Fourier-transform pairs (or any other method) to obtain C(ω) or c(ν) ≡ F_F(iω). Then

    ℒ[f(t)] = F(s) = √(2π) C(s/i) = c(s/2πi) = F_F(s)    (σ > 0)    (D-5)

2. Given f(t) such that f(t) is real and even, f(−t) = f(t),*

    ℒ[f(t)] = F(s) = √(π/2) C(s/i) = ½ c(s/2πi) = ½ F_F(s)    (σ > 0)    (D-6)

3. Given F(s) analytic for σ > 0, obtain ℒ⁻¹[F(s)] for t > 0 as the function f(t) corresponding to

    C(ω) ≡ (1/√(2π)) F(iω)    or    c(ν) ≡ F_F(iω) = F(2πiν)    (ω = 2πν)    (D-7)

* Note that every function can be rewritten as the sum of even and odd parts (Sec. 4.3-2), and that t f(t) is even whenever f(t) is odd.
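The correspondence (D-5) is easy to spot-check numerically. Below, a crude trapezoidal quadrature (the helper name and step counts are mine, chosen only for illustration) compares c(ν) for the causal function f(t) = e^{−t} against F(2πiν) with F(s) = 1/(s + 1):

```python
import cmath
import math

def c_numeric(nu, f, T=60.0, steps=200000):
    """Trapezoidal estimate of c(nu) = integral_0^T f(t) e^{-2 pi i nu t} dt
    for a function f supported on t >= 0 (crude, for spot checks only)."""
    h = T / steps
    w = -2j * math.pi * nu
    total = 0.5 * (f(0.0) + f(T) * cmath.exp(w * T))
    for k in range(1, steps):
        t = k * h
        total += f(t) * cmath.exp(w * t)
    return total * h

# f(t) = e^{-t} (t >= 0) has F(s) = 1/(s + 1); by (D-5),
# c(nu) = F_F(i omega) = F(2 pi i nu) = 1/(1 + 2 pi i nu)
nu = 0.3
exact = 1.0 / (1.0 + 2j * math.pi * nu)
approx = c_numeric(nu, lambda t: math.exp(-t))
assert abs(approx - exact) < 1e-6
```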
Table D-1. Fourier Coefficients and Mean-square Values of Periodic Functions [Sec. 4.11-4a; sinc x ≡ (sin πx)/(πx)]

[The printed table gives, for each periodic function f(t) = f(t + T) — rectangular pulses, symmetrical triangular pulses, symmetrical trapezoidal pulses, half-sine pulses, clipped sinusoids, and a triangular waveform — the Fourier coefficients a_k, b_k (for the phasing shown in the accompanying waveform diagrams), the average value, and the mean-square value. The rotated table layout and the waveform diagrams did not survive in this copy; the two table footnotes expand as follows:]

* For T₀ = T/2:  f(t) = (A/π)[1 + (π/2) cos ωt + (2/3) cos 2ωt − (2/15) cos 4ωt + (2/35) cos 6ωt ∓ ⋯]  (half-wave rectified sinusoid).
† For T₀ = T:  f(t) = (2A/π)[1 + (2/3) cos 2ωt − (2/15) cos 4ωt + (2/35) cos 6ωt ∓ ⋯]  (full-wave rectified sinusoid).
Table D-2a. Fourier-transform Pairs*

[Each entry of the printed table is accompanied by sketches of f(t) and of its transform F_F(iω); the sketches are omitted here, and only the pairs recoverable from this copy are listed:]

    f(t) = 1 (|t| < T/2), 0 (|t| > T/2)  →  F_F(iω) = T sinc(ωT/2π) = T sin(ωT/2)/(ωT/2)
    f(t) = 1 − |t|/T (|t| ≤ T), 0 (|t| ≥ T)  →  T sinc²(ωT/2π)
    f(t) = e^{−π(t/T)²}  →  T e^{−ω²T²/4π}
    cos ω₀t  →  π[δ(ω − ω₀) + δ(ω + ω₀)]
    sin ω₀t  →  (π/i)[δ(ω − ω₀) − δ(ω + ω₀)]    (imaginary)
    Σ_{k=−∞}^{∞} δ(t − kT)  →  (2π/T) Σ_{k=−∞}^{∞} δ(ω − 2πk/T)
    δ(t − T)  →  e^{−iωT}    (complex)

* Reprinted from G. A. Korn, Basic Tables in Electrical Engineering, McGraw-Hill, New York, 1965.
Table D-2b. Fourier Transforms*

    f(x) = (1/√(2π)) ∫_{−∞}^{∞} C(ξ) e^{−iξx} dξ        C(ξ) = (1/√(2π)) ∫_{−∞}^{∞} f(x) e^{iξx} dx

    f(x)  →  C(ξ)

    sin(ax)/x  →  (π/2)^½ (|ξ| < a);  0 (|ξ| > a)
    e^{iωx} (p < x < q), 0 (x < p, x > q)  →  [e^{iq(ω+ξ)} − e^{ip(ω+ξ)}] / [i(2π)^½ (ω + ξ)]
    e^{−cx+iωx} (x > 0, c > 0), 0 (x < 0)  →  (2π)^{−½} / [c − i(ω + ξ)]
    e^{−px²}, Re p > 0  →  (2p)^{−½} e^{−ξ²/4p}
    cos(px²)  →  (2p)^{−½} cos(ξ²/4p − π/4)
    sin(px²)  →  (2p)^{−½} cos(ξ²/4p + π/4)
    |x|^{−s}, 0 < Re s < 1  →  (2/π)^½ Γ(1 − s) sin(sπ/2) |ξ|^{s−1}
    |x|^{−½}  →  |ξ|^{−½}
    e^{−a|x|}/|x|^½  →  [(a² + ξ²)^½ + a]^½ / (a² + ξ²)^½
    cosh(ax)/cosh(πx)  →  (2/π)^½ cos(a/2) cosh(ξ/2) / (cosh ξ + cos a)
    sinh(ax)/sinh(πx)  →  (2π)^{−½} sin a / (cosh ξ + cos a)
    (a² − x²)^{−½} (|x| < a), 0 (|x| > a)  →  (π/2)^½ J₀(aξ)
    sin[b(a² + x²)^½]/(a² + x²)^½  →  0 (|ξ| > b);  (π/2)^½ J₀(a(b² − ξ²)^½) (|ξ| < b)
    P_n(x) (|x| < 1), 0 (|x| > 1)  →  i^n ξ^{−½} J_{n+½}(ξ)
    cos[b(a² − x²)^½]/(a² − x²)^½ (|x| < a), 0 (|x| > a)  →  (π/2)^½ J₀(a(ξ² + b²)^½)
    cosh[b(a² − x²)^½]/(a² − x²)^½ (|x| < a), 0 (|x| > a)  →  (π/2)^½ J₀(a(ξ² − b²)^½)

* From I. N. Sneddon, Fourier Transforms, McGraw-Hill, New York, 1951.
Table D-3. Fourier Cosine Transforms*†

    f(x) = (2/π)^½ ∫₀^∞ C_c(ξ) cos(ξx) dξ        C_c(ξ) = (2/π)^½ ∫₀^∞ f(x) cos(ξx) dx

    f(x)  →  C_c(ξ)

    1 (0 < x < a), 0 (x > a)  →  (2/π)^½ sin(ξa)/ξ
    x^{p−1} (0 < p < 1)  →  (2/π)^½ Γ(p) cos(pπ/2) ξ^{−p}
    cos x (0 < x < a), 0 (x > a)  →  (2π)^{−½} { sin[a(1 − ξ)]/(1 − ξ) + sin[a(1 + ξ)]/(1 + ξ) }
    e^{−x}  →  (2/π)^½ / (1 + ξ²)
    sech(πx)  →  (2π)^{−½} sech(ξ/2)
    1/(1 + x²)  →  (π/2)^½ e^{−ξ}
    1/(1 + x⁴)  →  (π^½/2) e^{−ξ/√2} [cos(ξ/√2) + sin(ξ/√2)]
    e^{−x²/2}  →  e^{−ξ²/2}
    cos(½x²)  →  (1/√2) [cos(½ξ²) + sin(½ξ²)]
    sin(½x²)  →  (1/√2) [cos(½ξ²) − sin(½ξ²)]
    (1 − x²)^{ν−½} (0 < x < 1), 0 (x > 1)  →  2^{ν−½} Γ(ν + ½) ξ^{−ν} J_ν(ξ)

* Three general rules are worthy of notice:
1. If C_c(ξ) is the Fourier cosine transform of f(x), then f(ξ) is the Fourier cosine transform of C_c(x).
2. If f(x) is an even function of x in (−∞, ∞), then the Fourier cosine transform of f(x) (0 < x < ∞) is C(ξ).
3. The Fourier cosine transform of f(x/a) is a C_c(ξa).
† From I. N. Sneddon, Fourier Transforms, McGraw-Hill, New York, 1951.
Table D-4. Fourier Sine Transforms*†

    f(x) = (2/π)^½ ∫₀^∞ C_s(ξ) sin(ξx) dξ        C_s(ξ) = (2/π)^½ ∫₀^∞ f(x) sin(ξx) dx

    f(x)  →  C_s(ξ)

    e^{−x}  →  (2/π)^½ ξ/(1 + ξ²)
    x e^{−x²/2}  →  ξ e^{−ξ²/2}
    sin x / x  →  (2π)^{−½} log|(1 + ξ)/(1 − ξ)|
    x(1 − x²)^ν (0 < x < 1, ν > −1), 0 (x > 1)  →  2^ν Γ(ν + 1) ξ^{−ν−½} J_{ν+3/2}(ξ)
    x^{p−1} (0 < p < 1)  →  (2/π)^½ Γ(p) sin(pπ/2) ξ^{−p}
    x^{−1} e^{−ax}  →  (2/π)^½ tan^{−1}(ξ/a)
    0 (0 < x < a), (x² − a²)^{−½} (x > a)  →  (π/2)^½ J₀(aξ)

[A few further entries of the printed table, including one involving cos(ax²), are too badly garbled in this copy to reproduce.]

* In the calculation of Fourier sine transforms we may make use of the rules:
1. If C_s(ξ) is the Fourier sine transform of f(x), then f(ξ) is the Fourier sine transform of C_s(x).
2. If f(x) is an odd function of x in (−∞, ∞), then the Fourier sine transform of f(x) (0 < x < ∞) is −iC(ξ).
3. The Fourier sine transform of f(x/a) is a C_s(aξ).
† From I. N. Sneddon, Fourier Transforms, McGraw-Hill, New York, 1951.
Table D-5. Hankel Transforms*

    f̄(ξ) = ∫₀^∞ x f(x) J_ν(ξx) dx        f(x) = ∫₀^∞ ξ f̄(ξ) J_ν(ξx) dξ

    ν          f(x)                                    f̄(ξ)

    0          1 (0 < x < a), 0 (x > a)                (a/ξ) J₁(aξ)
    0          a² − x² (0 < x < a), 0 (x > a)          (4a/ξ³) J₁(aξ) − (2a²/ξ²) J₀(aξ)
    ν > −1     x^ν (0 < x < a), 0 (x > a)              (a^{ν+1}/ξ) J_{ν+1}(aξ)
    ν > −1     x^ν e^{−px²}                            [ξ^ν/(2p)^{ν+1}] e^{−ξ²/4p}
    ν > −1     x^ν e^{−px}                             2p (2ξ)^ν Γ(ν + 3/2) / [π^½ (ξ² + p²)^{ν+3/2}]
    0          x^{−1}                                  ξ^{−1}
    0          x^{−1} e^{−px}                          (ξ² + p²)^{−½}
    0          e^{−px}                                 p (ξ² + p²)^{−3/2}
    0          (a² + x²)^{−½}                          ξ^{−1} e^{−aξ}
    1          x^{−1} e^{−px}                          [(ξ² + p²)^½ − p] / [ξ (ξ² + p²)^½]
    0          x^{−1} sin(ax)                          (a² − ξ²)^{−½} (0 < ξ < a);  0 (ξ > a)

[The remaining entries of the printed table, including one whose transform involves the hypergeometric function ₂F₁ and two further entries involving sin(ax)/x², are too badly garbled in this copy to reproduce.]

* From I. N. Sneddon, Fourier Transforms, McGraw-Hill, New York, 1951.
Table D-6. Laplace-transform Pairs Involving Rational Algebraic Functions F(s) = D₁(s)/D(s)

Each formula holds for complex as well as for real polynomials D₁(s) and D(s) (Sec. 8.4-4); but the latter case is of greater practical interest. In this case the roots of D(s) = 0 are either real or they occur as pairs of complex conjugates, and the functions f(t) are real. Note that

    (s − a)² + ω₁² = [s − (a + iω₁)][s − (a − iω₁)]
    K₁ sin ω₁t + K₂ cos ω₁t = (K₁² + K₂²)^½ sin(ω₁t + α),  with α = arctan(K₂/K₁)

    No.   F(s)                          f(t) (t > 0)

    1.1   1/(s − a)                     e^{at}
    1.2   1/[s(s − a)]                  Ae^{at} + K;  A = 1/a,  K = −1/a
    1.3   (s + d)/[s(s − a)]            Ae^{at} + K;  A = (a + d)/a,  K = −d/a
    1.4   1/[(s − a)(s − b)]            Ae^{at} + Be^{bt};  A = 1/(a − b),  B = 1/(b − a)
    1.5   s/[(s − a)(s − b)]            Ae^{at} + Be^{bt};  A = a/(a − b),  B = b/(b − a)
    1.6   (s + d)/[(s − a)(s − b)]      Ae^{at} + Be^{bt};  A = (a + d)/(a − b),  B = (b + d)/(b − a)
    1.7   1/[s(s − a)(s − b)]           Ae^{at} + Be^{bt} + K;  A = 1/[a(a − b)],  B = 1/[b(b − a)],  K = 1/(ab)
    1.8   (s + d)/[s(s − a)(s − b)]     Ae^{at} + Be^{bt} + K;  A = (a + d)/[a(a − b)],  B = (b + d)/[b(b − a)],  K = d/(ab)
    1.9   (s² + gs + d)/[s(s − a)(s − b)]   Ae^{at} + Be^{bt} + K;  A = (a² + ga + d)/[a(a − b)],  B = (b² + gb + d)/[b(b − a)],  K = d/(ab)
FOURIER AND LAPLACE-TRANSFORM PAIRS
909
I
+
II
[I
SI «
II
3| O
3 +
a
3
II
P
II
+
H
3
+ 3
+
+
-I 3 n
ii
ii
II
5
II
II
+
0
+
+ 05
+
3
+
+
+
2.10
2.9
2.8
2.7
1
2.6
a)* + «i»]
8 + d
6)1(8 -
1
6)f(8 - a)3 + an*]
8(« - 6) [(8 - o)» + an*]
s(e -
(« -
8s 4- g8 + d
(a - b)[(8 - a)» + «,»]
a + d
(a - 6)[(« - a)* + m*]
F(8)
No.
4e«< sin (a»i* + a) + Be" + K
Ae°* 8in (&>i< + a) + Be"
/(*) (t > 0)
=
1 r(a+d)s+ «iO^ a + d
— arctan
a — 6
~ (a - 6)2 + wi*
6+ d
*
(a - 6)* + m*
an L
— arctan
— arctan
a — b
a>i
b[(b - a)2+ an*]
an
arctan — a
"
o — 6
m
6(tt2 + ail2)
ft»I (o2 + 0)l2)^[(o - 6)2 + o>,2]J4
ns — on8 + ag + d
an (2a + g)
(a - 6)2 + a>i2
.
wl
&J1
a = arctan —— — arctan a + d a — b
6)1
arctan — a
J
1 T(a* - «i2 + ag + d)2 + „12(2a + g)2~| J4
arctan
*
1 r(d +a)2 +an*"|H m(a* + mt)H [ (a - 6)2 + W1» J P- H(b - 6+ d JT d 0)2 + Wl2] " ft(02 + a,!8)
A
a =
a -6
*
«i L(a - 6)2 + a>i« J
a = arctan •
a
^
— arctan
a
=
*
wi[(o - 6)2 + «i*jJ*
A
~
(a -
6)2 + Wla
b* + bg + d
Table D-6. Laplace-transform Pairs Involving Rational Algebraic Functions F(s) = Di(s)/D(s) (Continued)
I
FOURIER AND LAPLACE-TRANSFORM PAIRS
911
ta
+ •8,
'S
3 & + _ +
+ (3
M 3
O
3
3 n
3 W
3 e
+
d
08
*»
3
3l«
II
d
00.
!
8 + IO d
'O
e
d
•*»
XZ
08
II
O 31^
! i
i
I
O
d
08
■♦»
03.
S § n
[I 0
3
+
§ 1
3
£,
5
I
+
.?
II
.3
+
Cb
+
+
+
3
+
3
I
-Is
-Is-
3
3
3
+
+
+
H
«ls
3 II
ti
(I
+ 'a
+
3
3
+
+
+ 3
+ 'a I
+
09
Ok
+
S
3.11
3.10
3.9
3.8
3.7
3.6
3.5
3.4
3.3
1
a)
a)2
a)
a) 2
o)
(a -
(8 -
6)
.
(A 4- A*t)e" + Be"
1
A =
!
6+ d
(.-».
a*
—
C2
a2 - d
a2
d
a3
A=--
8 + d
a)2(« -
—
a8
a -f- d
A = - —
A =
«2
A = -
A = l
A = 0
A
1
a)2
(ii 4- Ait)e°* + X
Ae* + K + tftf
(A + Ai<)e*
f(t) (t > 0)
a)2(8 - 6)
8(8 -
82 + ga + d
8(8 — a) 2
8(8 -
1
82(8 -
82 + g8 + d
82(8 -
s + d
82(S -
(8 -
8 + <*
a)2
1
3.2
(8 -
F(s)
No.
a + d
*-;-»
a
«8 + *0 + d
a
a+ d
.1
,
Ai -
J
A =
a
Ai = -
K = -A
K= -A
Ai = a + d
Ai = 1
d
a
a
--
a
„
.
B = -4
_
K = 1 — A
x=-a
K = -A
JCi -
Ki
Table D-6. Laplace-transform Pairs Involving Rational Algebraic Functions F(s) = Di(s)/D(8) (Continued)
d
X
I
3.18
3.17
3.16
3.15
3.14
3.13
3.12
6)
8 + d
a)2(« - 6)
a(a - a) 2(8 - 6)
82 4- ga 4- d
8(8 - a)2(8 - 6)
8(8 -
1
«2(S - a) (a - 6)
a* + ga + d
8*(8 - a) (a - 6)
a + d
82(8 - a) (a - 6)
1
a)*(a -
8* + ga + d
(a -
(A 4- Ait)e« + Be" 4- K
Ae<" + Be* + K + Kit
1
a + 6
a2(a -
°+d
6)
6 -
2a
a262
1
&+ d
a8(a - 6)2
oJ + 2ad -
a2(a - 6)2
a2(6 + g) 4- <*(2a - 6)
b(a -
6)2
62 + da 4- d
A" "
B ~ 6(a - 6)2
A
B ~ 6(a - 6)2 6d
a6g + d(a + 6)
a2(a -
a8 4- ag + d
a262
a2(a - 6) a6 + d(a 4- 6)
A~ a2(a - 6)2
* ~
_
1
6)
(a - 6)2
a8 — 2a6 — bg — d
K~ a2&2
A~ a)
a)
*
6)
o + d
a86
a(a -
a&
62(6 -
a)
62 4- 6a + d
a6
62(6 -
6+ d
a6
*
&»(» -
a- b
a2 + ag + d
d
a2 + ag 4- d
a2&
* ~ " a»&
d
M = a(a - 6)
=
1S= a(a- 6) X
^
B
B
^' -
s (a - 6)2
6« + bg 4-d
I
CD
»
O
"1
CD
>
H
w
O
o
3.24
3.23
3.22
3.21
3.20
1
a) 2
6)(8 -
O) 2
a)2(8 -
1
«»(«.- a) 2
6)2
82 + ga 4- d
82(8 -
a + d
82(8 -
1
a)2(s -
(8 -
(a -
6)(8 -
8 -\-d
a)2(8 -
3.19
(8 -
F(a)
No.
c)
c)
(A + Ai*)e«* + K + Xif
(A 4- Ait)e°* 4- Be" + Ce"
fd) (t > 0)
=
=
„
=
B «
i
if =
A
X
X
_
B .
^
-A
(a -
2
a»
6)«
ao + 2d
-A
—
-A
a»
q + 2d
-A
a»
(6 - a) 2(6 - c)
64-d
(a - 6)2(a - c)8
a2 4- 2ad - d(6 4- c) - 6c
X
(6 - a) 2(6 - c)
(a - 6)2(a - c)»
(6 4-c) - 2a
a*
a2
(c -
(a -
(c -
(a -
~"
Ai =
4| =
1
a)2(c -
c+ d
6)(a -
1
6)2
(a - 6)2
(a -
a2
a2
6)
c)
*—
6)
a + d m
a)2(c -
a8 4- ag + d
a8
a8
1 6)(a - c)
A - °+d
c
Ai =
c
^
Table D-6. Laplace-transform Pairs Involving Rational Algebraic Functions F(s) = D1(s)/D(s) (Continued)
W
©
FOURIER AND LAPLACE-TRANSFORM PAIRS
915
31* d
s 8
I
5l" d
!
II
ii
I * I +
4-,
«3 •©
D
II
II
II
•5
«l
X3 I C I II
£ 3
+
-
.©
3
I
+ «
+ ^
£ 3
4-
e
I II
II
I
I
i
II
II
ii
3
^
+
3 N
+ +
%
« 5
+
34-
^
O
II
oq
05 «5
4-
+
+
4+
4-
\
+
I
+
+
+
•a
+
3
3
+
4-
4 +
4.9
4.7
4.6
No.
[(8 - a)2 4- «12J2
(8 — a)8 — &>!»
[(8 - a)2 + Wi2]2
[(8 - a) 2 4- «l2]2
(8 - 6)2[(8 - a) 2 4- w,2]
8s + ga + d
(8 - 6)2[(8 - a) 2 + W1»]
a 4-d
(a - 6)2[(8 - a) 2 + m*]
1
F(a)
2«i«
<e°' cos »i<
2wi
-—- e8' (sin uit — mt cos «i£)
Ae"' sin (wi< + a) + (B 4* Bi061
/(0 tt > 0)
6)
[(a - 6)2 4- 0.12J2
(a -
6)2 + m*
-2 arctan -
a — wctan
(a -
a — 6
68 + bg + d (o _ 6)j 4. W1t
2 arctan a-b
as — a>i8 4- ag + d
o)i(2a -f a)
6)2 + U1»
a 4- d 6 + d
« arctan —— — 2 arctan
[(a - 6)2 + mq(26 + a) 4- 2(a - 6)(&2 4- bg 4- d)
7/ — 6)2-|T^T~t mi] 7* «i[(a
. _ [(a8 - ^2 + ag 4- d)8+ WIJ(2a + g)*]H
[(a - 6)2 + „i2]s
B = (a - 6)2 + q,t» + 2(a - 6)(6 + d)
m[(a - 6)2 4- u,sj
^ = [(a 4- d)2 + m*]M
[(a - 6)2 4- «i2]2
2(a -
«i[(a - 6)2 4- W1sj
Table D-6. Laplace-transform Pairs Involving Rational Algebraic Functions F(s) = Di(s)/D(s) (Continued)
Table D-7. Table of Laplace Transforms†

No.   F(s)                                   f(t)  (t > 0)
1     1/s                                    1
2     1/s^2                                  t
3     1/s^n  (n = 1, 2, ...)                 t^{n-1}/(n - 1)!
4     1/√s                                   1/√(πt)
5     s^{-3/2}                               2√(t/π)
6     s^{-(n+1/2)}  (n = 1, 2, ...)          2^n t^{n-1/2}/[1 · 3 · 5 ··· (2n - 1)√π]
7     Γ(k)/s^k  (k > 0)                      t^{k-1}
8     1/(s - a)                              e^{at}
9     1/(s - a)^2                            t e^{at}
10    1/(s - a)^n  (n = 1, 2, ...)           t^{n-1} e^{at}/(n - 1)!
11    Γ(k)/(s - a)^k  (k > 0)                t^{k-1} e^{at}
12*   1/[(s - a)(s - b)]                     (e^{at} - e^{bt})/(a - b)
13*   s/[(s - a)(s - b)]                     (a e^{at} - b e^{bt})/(a - b)
14*   1/[(s - a)(s - b)(s - c)]              -[(b - c)e^{at} + (c - a)e^{bt} + (a - b)e^{ct}]/[(a - b)(b - c)(c - a)]
15    1/(s^2 + a^2)                          (1/a) sin at
16    s/(s^2 + a^2)                          cos at

* Here a, b, and (in 14) c represent distinct constants.
† From Ruel V. Churchill, Operational Mathematics, 2d ed., McGraw-Hill, New York, 1958.
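The rational-function pairs above can be spot-checked by evaluating the defining integral F(s) = ∫₀^∞ e^{-st} f(t) dt directly. The sketch below is an editorial addition, not part of the original table; it checks No. 12 with the illustrative values a = -1, b = -2, s = 1 using a plain trapezoidal rule.

```python
import math

def laplace(f, s, T=60.0, n=120_000):
    """Approximate F(s) = integral_0^T exp(-s*t) f(t) dt by the trapezoidal rule.

    T is chosen large enough that the neglected tail is negligible for s > 0.
    """
    h = T / n
    total = 0.5 * (f(0.0) + math.exp(-s * T) * f(T))
    for k in range(1, n):
        t = k * h
        total += math.exp(-s * t) * f(t)
    return h * total

# Table D-7, No. 12: F(s) = 1/((s-a)(s-b)),  f(t) = (e^{at} - e^{bt})/(a - b)
a, b = -1.0, -2.0          # illustrative (decaying) values, so the tail dies out
f = lambda t: (math.exp(a * t) - math.exp(b * t)) / (a - b)
F = lambda s: 1.0 / ((s - a) * (s - b))

print(laplace(f, 1.0))     # ~ 0.16667
print(F(1.0))              # 1/6
```

The same helper works for any of the decaying pairs in this table.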
APPENDIX D
918
Table D-7. Table of Laplace Transforms (Continued)

No.   F(s)                                   f(t)  (t > 0)
17    1/(s^2 - a^2)                          (1/a) sinh at
18    s/(s^2 - a^2)                          cosh at
19    1/[s(s^2 + a^2)]                       (1/a^2)(1 - cos at)
20    1/[s^2(s^2 + a^2)]                     (1/a^3)(at - sin at)
21    1/(s^2 + a^2)^2                        (1/2a^3)(sin at - at cos at)
22    s/(s^2 + a^2)^2                        (t/2a) sin at
23    s^2/(s^2 + a^2)^2                      (1/2a)(sin at + at cos at)
24    (s^2 - a^2)/(s^2 + a^2)^2              t cos at
25    s/[(s^2 + a^2)(s^2 + b^2)]  (a^2 ≠ b^2)   (cos at - cos bt)/(b^2 - a^2)
26    1/[(s - a)^2 + b^2]                    (1/b) e^{at} sin bt
27    (s - a)/[(s - a)^2 + b^2]              e^{at} cos bt
28    3a^2/(s^3 + a^3)                       e^{-at} - e^{at/2}[cos (√3 at/2) - √3 sin (√3 at/2)]
29    4a^3/(s^4 + 4a^4)                      sin at cosh at - cos at sinh at
30    s/(s^4 + 4a^4)                         (1/2a^2) sin at sinh at
31    1/(s^4 - a^4)                          (1/2a^3)(sinh at - sin at)
32    s/(s^4 - a^4)                          (1/2a^2)(cosh at - cos at)
33    8a^3 s^2/(s^2 + a^2)^3                 (1 + a^2 t^2) sin at - at cos at
34    (1/s)[(s - 1)/s]^n                     (e^t/n!)(d^n/dt^n)(t^n e^{-t})
35    s/(s - a)^{3/2}                        (1/√(πt)) e^{at}(1 + 2at)
36    √(s - a) - √(s - b)                    (1/(2√(πt^3)))(e^{bt} - e^{at})
Table D-7. Table of Laplace Transforms (Continued)

No.   F(s)                                            f(t)  (t > 0)
37    1/(√s + a)                                      1/√(πt) - a e^{a^2 t} erfc (a√t)
38    √s/(s - a^2)                                    1/√(πt) + a e^{a^2 t} erf (a√t)
39    √s/(s + a^2)                                    1/√(πt) - (2a/√π) e^{-a^2 t} ∫₀^{a√t} e^{λ^2} dλ
40    1/[√s (s - a^2)]                                (1/a) e^{a^2 t} erf (a√t)
41    1/[√s (s + a^2)]                                (2/(a√π)) e^{-a^2 t} ∫₀^{a√t} e^{λ^2} dλ
42    (b^2 - a^2)/[(s - a^2)(b + √s)]                 e^{a^2 t}[b - a erf (a√t)] - b e^{b^2 t} erfc (b√t)
43    1/[√s (√s + a)]                                 e^{a^2 t} erfc (a√t)
44    1/[(s + a)√(s + b)]                             (1/√(b - a)) e^{-at} erf (√(b - a) √t)
45    (b^2 - a^2)/[√s (s - a^2)(√s + b)]              e^{a^2 t}[(b/a) erf (a√t) - 1] + e^{b^2 t} erfc (b√t)
46*   (1 - s)^n/s^{n+1/2}                             n!/[(2n)! √(πt)] H_{2n}(√t)
47*   (1 - s)^n/s^{n+3/2}                             -n!/[√π (2n + 1)!] H_{2n+1}(√t)
48†   √(s + 2a)/√s - 1                                a e^{-at}[I1(at) + I0(at)]
49†   1/[√(s + a) √(s + b)]                           e^{-(a+b)t/2} I0((a - b)t/2)
50†   Γ(k)/[(s + a)^k (s + b)^k]  (k > 0)             √π (t/(a - b))^{k-1/2} e^{-(a+b)t/2} I_{k-1/2}((a - b)t/2)
51†   1/[(s + a)^{1/2}(s + b)^{3/2}]                  t e^{-(a+b)t/2}[I0((a - b)t/2) + I1((a - b)t/2)]
52†   (√(s + 2a) - √s)/(√(s + 2a) + √s)               (1/t) e^{-at} I1(at)

* H_n(x) is the Hermite polynomial.
† I_n(x) = i^{-n} J_n(ix), where J_n is Bessel's function of the first kind.
Table D-7. Table of Laplace Transforms (Continued)

No.   F(s)                                            f(t)  (t > 0)
53    (a - b)^k/(√(s + a) + √(s + b))^{2k}  (k > 0)   (k/t) e^{-(a+b)t/2} I_k((a - b)t/2)
54    (√(s + a) + √s)^{-2ν}/√(s(s + a))  (ν > -1)     (1/a^ν) e^{-at/2} I_ν(at/2)
55    1/√(s^2 + a^2)                                  J0(at)
56    (√(s^2 + a^2) - s)^ν/√(s^2 + a^2)  (ν > -1)     a^ν J_ν(at)
57    1/(s^2 + a^2)^k  (k > 0)                        (√π/Γ(k)) (t/2a)^{k-1/2} J_{k-1/2}(at)
58    (√(s^2 + a^2) - s)^k  (k > 0)                   (k a^k/t) J_k(at)
59    (s - √(s^2 - a^2))^ν/√(s^2 - a^2)  (ν > -1)     a^ν I_ν(at)
60    1/(s^2 - a^2)^k  (k > 0)                        (√π/Γ(k)) (t/2a)^{k-1/2} I_{k-1/2}(at)
61    e^{-ks}/s                                       0 when 0 < t < k;  1 when t > k
62    e^{-ks}/s^2                                     0 when 0 < t < k;  t - k when t > k
63    e^{-ks}/s^μ  (μ > 0)                            0 when 0 < t < k;  (t - k)^{μ-1}/Γ(μ) when t > k
64    (1 - e^{-ks})/s                                 1 when 0 < t < k;  0 when t > k
65    1/[s(1 - e^{-ks})] = (1 + coth ½ks)/2s          1 + [t/k] = n when (n - 1)k < t < nk  (n = 1, 2, ...)  (Fig. D-1)
66    1/[s(e^{ks} - a)]                               0 when 0 < t < k;  1 + a + a^2 + ··· + a^{n-1} when nk < t < (n + 1)k  (n = 1, 2, ...)
67    (1/s) tanh ks                                   M(2k, t) = (-1)^{n-1} when 2k(n - 1) < t < 2kn  (n = 1, 2, ...)  (Fig. D-2)
68    1/[s(1 + e^{-ks})]                              ½M(k, t) + ½ = [1 - (-1)^n]/2 when (n - 1)k < t < nk
69    (1/s^2) tanh ks                                 H(2k, t)  (Fig. D-3)
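The shifted pairs Nos. 61-63 express the delay rule L{f(t - k)u(t - k)} = e^{-ks}F(s). A minimal editorial check of No. 62 (delayed ramp), with the illustrative values k = 2 and s = 1.5:

```python
import math

def laplace(f, s, T=80.0, n=160_000):
    # Trapezoidal approximation of integral_0^T exp(-s*t) f(t) dt.
    h = T / n
    total = 0.5 * (f(0.0) + math.exp(-s * T) * f(T))
    for k in range(1, n):
        t = k * h
        total += math.exp(-s * t) * f(t)
    return h * total

k_delay = 2.0   # the shift "k" of Nos. 61-63 (illustrative value)
s = 1.5

ramp = lambda t: (t - k_delay) if t > k_delay else 0.0   # No. 62: f(t) = t - k for t > k
print(laplace(ramp, s))                   # ~ exp(-k*s)/s^2
print(math.exp(-k_delay * s) / s ** 2)
```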
Table D-7. Table of Laplace Transforms (Continued)

No.   F(s)                                   f(t)  (t > 0)
70    1/(s sinh ks)                          F(t) = 2(n - 1) when (2n - 3)k < t < (2n - 1)k  (t > 0)
71    1/(s cosh ks)                          M(2k, t + 3k) + 1 = 1 + (-1)^n when (2n - 3)k < t < (2n - 1)k
72    (1/s) coth ks                          F(t) = 2n - 1 when 2k(n - 1) < t < 2kn
73    [k/(s^2 + k^2)] coth (πs/2k)           |sin kt|
74    1/[(s^2 + 1)(1 - e^{-πs})]             sin t when (2n - 2)π < t < (2n - 1)π;  0 when (2n - 1)π < t < 2nπ
75    (1/s) e^{-k/s}                         J0(2√(kt))
76    (1/√s) e^{-k/s}                        (1/√(πt)) cos 2√(kt)
77    (1/√s) e^{k/s}                         (1/√(πt)) cosh 2√(kt)
78    s^{-3/2} e^{-k/s}                      (1/√(πk)) sin 2√(kt)
79    s^{-3/2} e^{k/s}                       (1/√(πk)) sinh 2√(kt)
80    s^{-μ} e^{-k/s}  (μ > 0)               (t/k)^{(μ-1)/2} J_{μ-1}(2√(kt))
81    s^{-μ} e^{k/s}  (μ > 0)                (t/k)^{(μ-1)/2} I_{μ-1}(2√(kt))
82    e^{-k√s}  (k > 0)                      (k/(2√(πt^3))) exp (-k^2/4t)
83    (1/s) e^{-k√s}  (k ≥ 0)                erfc (k/(2√t))
84    (1/√s) e^{-k√s}  (k ≥ 0)               (1/√(πt)) exp (-k^2/4t)
85    s^{-3/2} e^{-k√s}  (k ≥ 0)             2√(t/π) exp (-k^2/4t) - k erfc (k/(2√t))
86    a e^{-k√s}/[s(a + √s)]  (k ≥ 0)        -e^{ak} e^{a^2 t} erfc (a√t + k/(2√t)) + erfc (k/(2√t))
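The heat-conduction pairs Nos. 82-85 involve the complementary error function, which Python's standard library provides directly. A small editorial cross-check of No. 83 with the illustrative values k = 1, s = 1:

```python
import math

def laplace(f, s, T=120.0, n=240_000):
    # Trapezoidal approximation of integral_0^T exp(-s*t) f(t) dt.
    h = T / n
    total = 0.5 * (f(0.0) + math.exp(-s * T) * f(T))
    for k in range(1, n):
        t = k * h
        total += math.exp(-s * t) * f(t)
    return h * total

kk = 1.0   # the parameter "k" of No. 83 (illustrative value)
f = lambda t: math.erfc(kk / (2.0 * math.sqrt(t))) if t > 0 else 0.0
s = 1.0
print(laplace(f, s))                        # ~ exp(-k*sqrt(s))/s
print(math.exp(-kk * math.sqrt(s)) / s)
```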
Table D-7. Table of Laplace Transforms (Continued)

No.   F(s)                                                      f(t)  (t > 0)
87    e^{-k√s}/[√s (a + √s)]  (k ≥ 0)                           e^{ak} e^{a^2 t} erfc (a√t + k/(2√t))
88    e^{-k√(s(s+a))}/√(s(s + a))                               0 when 0 < t < k;  e^{-at/2} I0(½a√(t^2 - k^2)) when t > k
89    e^{-k√(s^2+a^2)}/√(s^2 + a^2)                             0 when 0 < t < k;  J0(a√(t^2 - k^2)) when t > k
90    e^{-k√(s^2-a^2)}/√(s^2 - a^2)                             0 when 0 < t < k;  I0(a√(t^2 - k^2)) when t > k
91    e^{-k(√(s^2+a^2)-s)}/√(s^2 + a^2)  (k ≥ 0)                J0(a√(t^2 + 2kt))
92    e^{-ks} - e^{-k√(s^2+a^2)}                                0 when 0 < t < k;  (ak/√(t^2 - k^2)) J1(a√(t^2 - k^2)) when t > k
93    e^{-k√(s^2-a^2)} - e^{-ks}                                0 when 0 < t < k;  (ak/√(t^2 - k^2)) I1(a√(t^2 - k^2)) when t > k
94    a^ν e^{-k√(s^2+a^2)}/[√(s^2 + a^2)(√(s^2 + a^2) + s)^ν]  (ν > -1)   0 when 0 < t < k;  [(t - k)/(t + k)]^{ν/2} J_ν(a√(t^2 - k^2)) when t > k
95    (1/s) log s                                               Γ'(1) - log t  [Γ'(1) ≅ -0.5772]
96    (1/s^k) log s  (k > 0)                                    t^{k-1}{Γ'(k) - Γ(k) log t}/[Γ(k)]^2
97    (log s)/(s - a)  (a > 0)                                  e^{at}[log a - Ei (-at)]
98    (log s)/(s^2 + 1)                                         cos t Si t - sin t Ci t
99    (s log s)/(s^2 + 1)                                       -sin t Si t - cos t Ci t
100   (1/s) log (1 + ks)  (k > 0)                               -Ei (-t/k)
Table D-7. Table of Laplace Transforms (Continued)

No.   F(s)                                   f(t)  (t > 0)
101   log [(s - a)/(s - b)]                  (1/t)(e^{bt} - e^{at})
102   (1/s) log (1 + k^2 s^2)                -2 Ci (t/k)
103   (1/s) log (s^2 + a^2)  (a > 0)         2 log a - 2 Ci (at)
104   (1/s^2) log (s^2 + a^2)  (a > 0)       (2/a)[at log a + sin at - at Ci (at)]
105   log [(s^2 + a^2)/s^2]                  (2/t)(1 - cos at)
106   log [(s^2 - a^2)/s^2]                  (2/t)(1 - cosh at)
107   arctan (k/s)                           (1/t) sin kt
108   (1/s) arctan (k/s)                     Si (kt)
109   e^{k^2 s^2} erfc (ks)  (k > 0)         (1/(k√π)) exp (-t^2/4k^2)
110   (1/s) e^{k^2 s^2} erfc (ks)  (k > 0)   erf (t/2k)
111   e^{ks} erfc (√(ks))  (k > 0)           √k/[π√t (t + k)]
112   (1/√s) erfc (√(ks))                    0 when 0 < t < k;  (πt)^{-1/2} when t > k
113   (1/√s) e^{ks} erfc (√(ks))  (k > 0)    1/√(π(t + k))
114   erf (k/√s)                             (1/(πt)) sin (2k√t)
115   (1/√s) e^{k^2/s} erfc (k/√s)           (1/√(πt)) e^{-2k√t}
116   K0(ks)                                 0 when 0 < t < k;  (t^2 - k^2)^{-1/2} when t > k
117   K0(k√s)                                (1/2t) exp (-k^2/4t)
118   (1/s) e^{ks} K1(ks)                    (1/k) √(t(t + 2k))
119   (1/√s) K1(k√s)                         (1/k) exp (-k^2/4t)
120   (1/√s) e^{k/s} K0(k/s)                 (2/√(πt)) K0(2√(2kt))
121   π e^{-ks} I0(ks)                       [t(2k - t)]^{-1/2} when 0 < t < 2k;  0 when t > 2k
122   e^{-ks} I1(ks)                         (k - t)/[πk√(t(2k - t))] when 0 < t < 2k;  0 when t > 2k
123   -e^{as} Ei (-as)  (a > 0)              1/(t + a)
124   (1/a) + s e^{as} Ei (-as)  (a > 0)     1/(t + a)^2
125   (π/2 - Si s) cos s + Ci s sin s        1/(t^2 + 1)

Fig. D-1 (staircase, No. 65).  Fig. D-2 (square wave M(2k, t), No. 67).  Fig. D-3 (triangular wave H(2k, t), No. 69).
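Pair No. 107, L{(sin kt)/t} = arctan (k/s), admits the same direct numerical check as the earlier entries; the values k = 3, s = 2 below are illustrative editorial choices.

```python
import math

def laplace(f, s, T=200.0, n=400_000):
    # Trapezoidal approximation of integral_0^T exp(-s*t) f(t) dt.
    h = T / n
    total = 0.5 * (f(0.0) + math.exp(-s * T) * f(T))
    for k in range(1, n):
        t = k * h
        total += math.exp(-s * t) * f(t)
    return h * total

kk = 3.0
f = lambda t: math.sin(kk * t) / t if t > 0 else kk   # f(0+) = k
s = 2.0
print(laplace(f, s))         # ~ arctan(k/s)
print(math.atan(kk / s))
```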
BIBLIOGRAPHY

Campbell, G. A., and R. M. Foster: Fourier Integrals for Practical Applications, Van Nostrand, Princeton, N.J., 1948.
Erdelyi, A. (ed.): Tables of Integral Transforms, vols. 1 and 2 (Bateman Project), McGraw-Hill, New York, 1954.
Oberhettinger, F.: Tabellen zur Fourier Transformation, Springer, Berlin, 1957.
Smith, J. J.: Tables of Green's Functions, Fourier Series, and Impulse Functions for Rectangular Coordinates, Trans. AIEE, 70, 22, 1951.
Ditkin, V. A., and A. P. Prudnikov: Integral Transforms and Operational Calculus, Pergamon Press, New York, 1965.
APPENDIX E

INTEGRALS, SUMS, INFINITE SERIES AND PRODUCTS, AND CONTINUED FRACTIONS
Integral Tables

E-1. Elementary Indefinite Integrals
E-2. Indefinite Integrals

    Integrals containing                              Start with formula number
    ax + b                                            1
    ax + b and cx + d                                 31
    a + x and b + x                                   35
    ax^2 + bx + c                                     40
    a^2 ± x^2                                         54
    a^3 ± x^3                                         80
    a^4 ± x^4                                         94
    √x                                                102
    √(ax + b)                                         121
    √(a^2 - x^2)                                      157
    √(x^2 + a^2)                                      185
    √(x^2 - a^2)                                      213
    √(ax^2 + bx + c)                                  241
    Miscellaneous irrational forms                    263
    sin                                               274
    cos                                               313
    sin and cos                                       354
    tan and cot                                       411
    Hyperbolic functions                              424
    Exponential functions                             449
    Logarithmic functions                             474
    Inverse trigonometric and hyperbolic functions    492

E-3. Definite Integrals

    Integrals containing                              Start with formula number
    Algebraic functions                               1
    Trigonometric functions                           9
    Exponential and hyperbolic functions              38
    Logarithmic functions                             61

Sums and Infinite Series
E-4. Some Finite Sums
E-5. Miscellaneous Infinite Series
E-6. Power Series; Binomial Series
E-7. Power Series for Elementary Transcendental Functions

Infinite Products and Continued Fractions
E-8. Some Infinite Products
E-9. Some Continued Fractions
E-1. Elementary Indefinite Integrals. Add a constant of integration in each case.

1. ∫x^n dx = x^{n+1}/(n + 1)  (n ≠ -1)
2. ∫dx/x = log_e |x|  (x ≠ 0)
3. ∫sin x dx = -cos x
4. ∫cos x dx = sin x
5. ∫tan x dx = -log_e |cos x|
6. ∫cot x dx = log_e |sin x|
7. ∫dx/cos^2 x = tan x
8. ∫dx/sin^2 x = -cot x
9. ∫dx/(a^2 + x^2) = (1/a) arctan (x/a)  (a ≠ 0)
10. ∫dx/(a^2 - x^2) = (1/a) tanh^{-1} (x/a) = (1/2a) log_e [(a + x)/(a - x)]  (|x| < a)
11. ∫dx/(x^2 - a^2) = -(1/a) coth^{-1} (x/a) = (1/2a) log_e [(x - a)/(x + a)]  (|x| > a)
12. ∫e^x dx = e^x
13. ∫a^x dx = a^x/log_e a  (a > 0, a ≠ 1)
14. ∫sinh x dx = cosh x
15. ∫cosh x dx = sinh x
16. ∫tanh x dx = log_e cosh x
17. ∫coth x dx = log_e |sinh x|
18. ∫dx/cosh^2 x = tanh x
19. ∫dx/sinh^2 x = -coth x
20. ∫dx/√(a^2 - x^2) = arcsin (x/a)  (a > 0)
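Each entry of a table like E-1 can be verified mechanically: the derivative of the right-hand side must reproduce the integrand. A minimal editorial sketch, using No. 9 with illustrative values a = 2, x = 0.7:

```python
import math

# E-1, No. 9: d/dx [ (1/a) arctan(x/a) ] should equal 1/(a^2 + x^2).
def antiderivative(x, a):
    return math.atan(x / a) / a

def numerical_derivative(g, x, h=1e-6):
    # Central difference, accurate to O(h^2).
    return (g(x + h) - g(x - h)) / (2.0 * h)

a, x = 2.0, 0.7
print(numerical_derivative(lambda u: antiderivative(u, a), x))  # ~ 1/(a^2 + x^2)
print(1.0 / (a * a + x * x))
```

The same two-line check applies to any differentiable entry in the tables that follow.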
E-2. Indefinite Integrals.* Add constants of integration as needed. Note: as customary in integral tables, interpret log_e f(x) as log_e |f(x)| whenever it occurs on the right-hand side of an integral formula and f(x) is negative; m, n are integers.

(a) Integrals containing ax + b  (a ≠ 0)

1. ∫(ax + b)^n dx = (ax + b)^{n+1}/[a(n + 1)]  (n ≠ -1)
2. ∫dx/(ax + b) = (1/a) log_e (ax + b)
3. ∫x(ax + b)^n dx = (ax + b)^{n+2}/[a^2(n + 2)] - b(ax + b)^{n+1}/[a^2(n + 1)]  (n ≠ -1, -2)
4. ∫x^m(ax + b)^n dx
   = (1/[a(m + n + 1)]) [x^m(ax + b)^{n+1} - mb ∫x^{m-1}(ax + b)^n dx]
   = (1/(m + n + 1)) [x^{m+1}(ax + b)^n + nb ∫x^m(ax + b)^{n-1} dx]  (m > 0, m + n + 1 ≠ 0)
5. ∫x dx/(ax + b) = x/a - (b/a^2) log_e (ax + b)
6. ∫x dx/(ax + b)^2 = (1/a^2) [log_e (ax + b) + b/(ax + b)]
7. ∫x dx/(ax + b)^3 = (1/a^2) [b/(2(ax + b)^2) - 1/(ax + b)]
8. ∫x dx/(ax + b)^n = (1/a^2) [b/((n - 1)(ax + b)^{n-1}) - 1/((n - 2)(ax + b)^{n-2})]  (n ≠ 1, 2)

* Adapted, in part, from I. Bronstein and K. Semendjajev, Pocketbook of Mathematics, 6th ed., published by the Soviet Government, Moscow, 1956.
9. ∫x^2 dx/(ax + b) = (1/a^3) [(ax + b)^2/2 - 2b(ax + b) + b^2 log_e (ax + b)]
10. ∫x^2 dx/(ax + b)^2 = (1/a^3) [(ax + b) - 2b log_e (ax + b) - b^2/(ax + b)]
11. ∫x^2 dx/(ax + b)^3 = (1/a^3) [log_e (ax + b) + 2b/(ax + b) - b^2/(2(ax + b)^2)]
12. ∫x^2(ax + b)^n dx = (1/a^3) [(ax + b)^{n+3}/(n + 3) - 2b(ax + b)^{n+2}/(n + 2) + b^2(ax + b)^{n+1}/(n + 1)]  (n ≠ -1, -2, -3)
13. ∫dx/[x(ax + b)] = (1/b) log_e [x/(ax + b)]
14. ∫dx/[x(ax + b)^2] = 1/[b(ax + b)] - (1/b^2) log_e [(ax + b)/x]
15. ∫dx/[x(ax + b)^3] = (1/b^3) [½((ax + 2b)/(ax + b))^2 + log_e (x/(ax + b))]
16. ∫dx/[x^2(ax + b)] = -1/(bx) + (a/b^2) log_e [(ax + b)/x]
17. ∫dx/[x^2(ax + b)^2] = -(b + 2ax)/[b^2 x(ax + b)] + (2a/b^3) log_e [(ax + b)/x]
18. ∫dx/[x^3(ax + b)] = (2ax - b)/(2b^2 x^2) + (a^2/b^3) log_e [x/(ax + b)]

Let ax + b ≡ X (a ≠ 0). Then

19. ∫x^3 dx/X = (1/a^4)(X^3/3 - 3bX^2/2 + 3b^2 X - b^3 log_e X)
20. ∫x^3 dx/X^2 = (1/a^4)(X^2/2 - 3bX + 3b^2 log_e X + b^3/X)
21. ∫x^3 dx/X^3 = (1/a^4)(X - 3b log_e X - 3b^2/X + b^3/(2X^2))
22. ∫x^3 dx/X^4 = (1/a^4)(log_e X + 3b/X - 3b^2/(2X^2) + b^3/(3X^3))
23. ∫x^3 dx/X^n = (1/a^4) [-1/((n - 4)X^{n-4}) + 3b/((n - 3)X^{n-3}) - 3b^2/((n - 2)X^{n-2}) + b^3/((n - 1)X^{n-1})]  (n ≠ 1, 2, 3, 4)

24-30. The remaining forms ∫dx/(x^m X^n) reduce to Nos. 13-18 by repeated use of

∫dx/(x X^n) = (1/b) [1/((n - 1)X^{n-1}) + ∫dx/(x X^{n-1})]  (n > 1)
∫dx/(x^m X^n) = -1/[(m - 1)b x^{m-1} X^{n-1}] - [(m + n - 2)a/((m - 1)b)] ∫dx/(x^{m-1} X^n)  (m > 1)
(b) Integrals containing ax + b and cx + d  (a ≠ 0, c ≠ 0)

31. ∫(ax + b) dx/(cx + d) = (a/c)x + ((bc - ad)/c^2) log_e (cx + d)
32. ∫dx/[(ax + b)(cx + d)] = (1/(bc - ad)) log_e [(cx + d)/(ax + b)]  (bc - ad ≠ 0)
33. ∫x dx/[(ax + b)(cx + d)] = (1/(bc - ad)) [(b/a) log_e (ax + b) - (d/c) log_e (cx + d)]  (bc - ad ≠ 0)
34. ∫dx/[(ax + b)^2(cx + d)] = (1/(bc - ad)) [1/(ax + b) + (c/(bc - ad)) log_e ((cx + d)/(ax + b))]
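As with the elementary table, a partial-fraction entry such as No. 33 can be checked by differentiating its right-hand side numerically; the constants below are illustrative editorial choices (bc - ad = -5 ≠ 0).

```python
import math

# E-2(b), No. 33: for (ax+b)(cx+d) with bc - ad != 0,
#   integral of x dx/((ax+b)(cx+d))
#     = (1/(bc-ad)) * [ (b/a) log(ax+b) - (d/c) log(cx+d) ].
a, b, c, d = 2.0, 1.0, 1.0, 3.0

def rhs(x):
    return ((b / a) * math.log(a * x + b) - (d / c) * math.log(c * x + d)) / (b * c - a * d)

def deriv(g, x, h=1e-6):
    return (g(x + h) - g(x - h)) / (2.0 * h)

x = 1.3
print(deriv(rhs, x))                          # ~ x/((ax+b)(cx+d))
print(x / ((a * x + b) * (c * x + d)))
```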
(c) Integrals containing a + x and b + x  (a ≠ b)

35. ∫x dx/[(a + x)(b + x)^2] = b/[(a - b)(b + x)] - (a/(a - b)^2) log_e [(a + x)/(b + x)]
36. ∫x^2 dx/[(a + x)(b + x)^2] = b^2/[(b - a)(b + x)] + (a^2/(b - a)^2) log_e (a + x) + ((b^2 - 2ab)/(b - a)^2) log_e (b + x)
37. ∫dx/[(a + x)^2(b + x)^2] = -(1/(a - b)^2)[1/(a + x) + 1/(b + x)] + (2/(a - b)^3) log_e [(a + x)/(b + x)]
38. ∫x dx/[(a + x)^2(b + x)^2] = (1/(a - b)^2)[a/(a + x) + b/(b + x)] + ((a + b)/(a - b)^3) log_e [(b + x)/(a + x)]
39. ∫x^2 dx/[(a + x)^2(b + x)^2] = -(1/(a - b)^2)[a^2/(a + x) + b^2/(b + x)] + (2ab/(a - b)^3) log_e [(a + x)/(b + x)]
(d) Integrals containing ax^2 + bx + c  (a ≠ 0)

40. ∫dx/(ax^2 + bx + c)
    = (1/√(b^2 - 4ac)) log_e [(2ax + b - √(b^2 - 4ac))/(2ax + b + √(b^2 - 4ac))]  (b^2 > 4ac)
    = (2/√(4ac - b^2)) arctan [(2ax + b)/√(4ac - b^2)]  (b^2 < 4ac)
    = -2/(2ax + b)  (b^2 = 4ac)

In Nos. 41 through 47, let b^2 - 4ac ≠ 0.

41. ∫dx/(ax^2 + bx + c)^2 = (2ax + b)/[(4ac - b^2)(ax^2 + bx + c)] + (2a/(4ac - b^2)) ∫dx/(ax^2 + bx + c)
42. ∫dx/(ax^2 + bx + c)^{n+1} = (2ax + b)/[n(4ac - b^2)(ax^2 + bx + c)^n] + (2(2n - 1)a/(n(4ac - b^2))) ∫dx/(ax^2 + bx + c)^n
43. ∫x dx/(ax^2 + bx + c) = (1/2a) log_e (ax^2 + bx + c) - (b/2a) ∫dx/(ax^2 + bx + c)
44. ∫x^2 dx/(ax^2 + bx + c) = x/a - (b/2a^2) log_e (ax^2 + bx + c) + ((b^2 - 2ac)/2a^2) ∫dx/(ax^2 + bx + c)
45. ∫x^n dx/(ax^2 + bx + c) = x^{n-1}/[(n - 1)a] - (c/a) ∫x^{n-2} dx/(ax^2 + bx + c) - (b/a) ∫x^{n-1} dx/(ax^2 + bx + c)
46. ∫x dx/(ax^2 + bx + c)^{n+1} = -(2c + bx)/[n(4ac - b^2)(ax^2 + bx + c)^n] - (b(2n - 1)/(n(4ac - b^2))) ∫dx/(ax^2 + bx + c)^n
47. ∫x^m dx/(ax^2 + bx + c)^{n+1}
    = -x^{m-1}/[(2n - m + 1)a(ax^2 + bx + c)^n]
      - ((n - m + 1)/(2n - m + 1))(b/a) ∫x^{m-1} dx/(ax^2 + bx + c)^{n+1}
      + ((m - 1)/(2n - m + 1))(c/a) ∫x^{m-2} dx/(ax^2 + bx + c)^{n+1}  (m ≠ 2n + 1)

Let ax^2 + bx + c ≡ X (a ≠ 0). Then

48. ∫x^{2n-1} dx/X^n = (1/a) ∫x^{2n-3} dx/X^{n-1} - (c/a) ∫x^{2n-3} dx/X^n - (b/a) ∫x^{2n-2} dx/X^n
49. ∫dx/(xX) = (1/2c) log_e (x^2/X) - (b/2c) ∫dx/X
50. ∫dx/(xX^n) = 1/[2c(n - 1)X^{n-1}] - (b/2c) ∫dx/X^n + (1/c) ∫dx/(xX^{n-1})
51. ∫dx/(x^2 X) = (b/2c^2) log_e (X/x^2) - 1/(cx) + ((b^2 - 2ac)/2c^2) ∫dx/X
52. ∫dx/(x^m X^n) = -1/[(m - 1)c x^{m-1} X^{n-1}] - ((n + m - 2)b/((m - 1)c)) ∫dx/(x^{m-1} X^n) - ((2n + m - 3)a/((m - 1)c)) ∫dx/(x^{m-2} X^n)  (m > 1)
53. ∫dx/[(fx + g)X] = (1/(2(ag^2 - bfg + cf^2))) [f log_e ((fx + g)^2/X) + (2ag - bf) ∫dx/X]
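The three branches of No. 40 can each be checked by differentiation. A short editorial sketch of the arctangent branch (b^2 < 4ac), with illustrative coefficients a = 1, b = 1, c = 3:

```python
import math

# E-2(d), No. 40 with b^2 < 4ac: the antiderivative is
#   (2/sqrt(4ac - b^2)) * arctan((2ax + b)/sqrt(4ac - b^2)).
# Differentiating it numerically should reproduce 1/(ax^2 + bx + c).
a, b, c = 1.0, 1.0, 3.0          # b*b - 4*a*c = -11 < 0
root = math.sqrt(4 * a * c - b * b)

def antiderivative(x):
    return (2.0 / root) * math.atan((2 * a * x + b) / root)

def deriv(g, x, h=1e-6):
    return (g(x + h) - g(x - h)) / (2.0 * h)

x = 0.9
print(deriv(antiderivative, x))          # ~ 1/(ax^2 + bx + c)
print(1.0 / (a * x * x + b * x + c))
```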
(e) Integrals containing a^2 ± x^2, with X ≡ a^2 ± x^2  (a ≠ 0)

Here Y ≡ arctan (x/a) if X ≡ a^2 + x^2; if X ≡ a^2 - x^2,

Y ≡ tanh^{-1} (x/a) = (1/2) log_e [(a + x)/(a - x)]  (|x| < a)
Y ≡ coth^{-1} (x/a) = (1/2) log_e [(x + a)/(x - a)]  (|x| > a)

Where ± or ∓ appears in a formula, the upper sign refers to X ≡ a^2 + x^2 and the lower sign to X ≡ a^2 - x^2.
54. ∫dx/X = Y/a
55. ∫dx/X^2 = x/(2a^2 X) + Y/(2a^3)
56. ∫dx/X^3 = x/(4a^2 X^2) + 3x/(8a^4 X) + 3Y/(8a^5)
57. ∫dx/X^{n+1} = x/(2na^2 X^n) + ((2n - 1)/(2na^2)) ∫dx/X^n
58. ∫x dx/X = ±(1/2) log_e X
59. ∫x dx/X^2 = ∓1/(2X)
60. ∫x dx/X^3 = ∓1/(4X^2)
61. ∫x dx/X^{n+1} = ∓1/(2nX^n)  (n ≠ 0)
62. ∫x^2 dx/X = ±x ∓ aY
63. ∫x^2 dx/X^2 = ∓x/(2X) ± Y/(2a)
64. ∫x^2 dx/X^3 = ∓x/(4X^2) ± x/(8a^2 X) ± Y/(8a^3)
65. ∫x^3 dx/X = ±x^2/2 - (a^2/2) log_e X
66. ∫x^3 dx/X^2 = a^2/(2X) + (1/2) log_e X
67. ∫x^3 dx/X^3 = a^2/(4X^2) - 1/(2X)
68-69. ∫x^3 dx/X^{n+1} = a^2/(2nX^n) - 1/(2(n - 1)X^{n-1})  (n > 1)
70. ∫dx/(xX) = (1/2a^2) log_e (x^2/X)
71. ∫dx/(xX^2) = 1/(2a^2 X) + (1/2a^4) log_e (x^2/X)
72. ∫dx/(xX^3) = 1/(4a^2 X^2) + 1/(2a^4 X) + (1/2a^6) log_e (x^2/X)
73. ∫dx/(x^2 X) = -1/(a^2 x) ∓ Y/a^3
74. ∫dx/(x^2 X^2) = -1/(a^4 x) ∓ x/(2a^4 X) ∓ 3Y/(2a^5)
75. ∫dx/(x^2 X^3) = -1/(a^6 x) ∓ x/(4a^4 X^2) ∓ 7x/(8a^6 X) ∓ 15Y/(8a^7)
76. ∫dx/(x^3 X) = -1/(2a^2 x^2) ± (1/2a^4) log_e (X/x^2)
77. ∫dx/(x^3 X^2) = -1/(2a^4 x^2) ∓ 1/(2a^4 X) ± (1/a^6) log_e (X/x^2)
78. ∫dx/(x^3 X^3) = -1/(2a^6 x^2) ∓ 1/(a^6 X) ∓ 1/(4a^4 X^2) ± (3/2a^8) log_e (X/x^2)
79. ∫dx/[(x + c)X] (resolve by partial fractions; cf. No. 53 with f = 1, g = c)

(f) Integrals containing a^3 ± x^3, with X ≡ a^3 ± x^3  (a ≠ 0)

Where ± or ∓ appears in a formula, the upper sign refers to X ≡ a^3 + x^3 and the lower sign to X ≡ a^3 - x^3.

80. ∫dx/X = ±(1/6a^2) log_e [(a ± x)^2/(a^2 ∓ ax + x^2)] + (1/(a^2 √3)) arctan [(2x ∓ a)/(a√3)]
81. ∫dx/X^2 = x/(3a^3 X) + (2/3a^3) ∫dx/X
82. ∫x dx/X = (1/6a) log_e [(a^2 ∓ ax + x^2)/(a ± x)^2] ± (1/(a√3)) arctan [(2x ∓ a)/(a√3)]
83. ∫x dx/X^2 = x^2/(3a^3 X) + (1/3a^3) ∫x dx/X
84. ∫x^2 dx/X = ±(1/3) log_e X
85. ∫x^2 dx/X^2 = ∓1/(3X)
86. ∫x^3 dx/X = ±x ∓ a^3 ∫dx/X
87. ∫x^3 dx/X^2 = ∓x/(3X) ± (1/3) ∫dx/X
88. ∫dx/(xX) = (1/3a^3) log_e (x^3/X)
89. ∫dx/(xX^2) = 1/(3a^3 X) + (1/3a^6) log_e (x^3/X)
90. ∫dx/(x^2 X) = -1/(a^3 x) ∓ (1/a^3) ∫x dx/X
91. ∫dx/(x^2 X^2) = -1/(a^6 x) ∓ x^2/(3a^6 X) ∓ (4/3a^6) ∫x dx/X
92. ∫dx/(x^3 X) = -1/(2a^3 x^2) ∓ (1/a^3) ∫dx/X
93. ∫dx/(x^3 X^2) = -1/(2a^6 x^2) ∓ x/(3a^6 X) ∓ (5/3a^6) ∫dx/X
(g) Integrals containing a^4 ± x^4  (a ≠ 0)

94. ∫dx/(a^4 + x^4) = (1/(4a^3 √2)) log_e [(x^2 + ax√2 + a^2)/(x^2 - ax√2 + a^2)] + (1/(2a^3 √2)) arctan [ax√2/(a^2 - x^2)]
95. ∫x dx/(a^4 + x^4) = (1/2a^2) arctan (x^2/a^2)
96. ∫x^2 dx/(a^4 + x^4) = -(1/(4a√2)) log_e [(x^2 + ax√2 + a^2)/(x^2 - ax√2 + a^2)] + (1/(2a√2)) arctan [ax√2/(a^2 - x^2)]
97. ∫x^3 dx/(a^4 + x^4) = (1/4) log_e (a^4 + x^4)
98. ∫dx/(a^4 - x^4) = (1/4a^3) log_e [(a + x)/(a - x)] + (1/2a^3) arctan (x/a)
99. ∫x dx/(a^4 - x^4) = (1/4a^2) log_e [(a^2 + x^2)/(a^2 - x^2)]
100. ∫x^2 dx/(a^4 - x^4) = (1/4a) log_e [(a + x)/(a - x)] - (1/2a) arctan (x/a)
101. ∫x^3 dx/(a^4 - x^4) = -(1/4) log_e (a^4 - x^4)

(h) Integrals containing √x and a^2 + b^2 x  (a, b ≠ 0)
102. ∫√x dx/(a^2 + b^2 x) = 2√x/b^2 - (2a/b^3) arctan (b√x/a)
103. ∫x√x dx/(a^2 + b^2 x) = 2x√x/(3b^2) - 2a^2 √x/b^4 + (2a^3/b^5) arctan (b√x/a)
104. ∫√x dx/(a^2 + b^2 x)^2 = -√x/[b^2(a^2 + b^2 x)] + (1/ab^3) arctan (b√x/a)
105. ∫x√x dx/(a^2 + b^2 x)^2 = (2b^2 x√x + 3a^2 √x)/[b^4(a^2 + b^2 x)] - (3a/b^5) arctan (b√x/a)
106. ∫dx/[(a^2 + b^2 x)√x] = (2/ab) arctan (b√x/a)
107. ∫dx/[(a^2 + b^2 x) x√x] = -2/(a^2 √x) - (2b/a^3) arctan (b√x/a)
108. ∫dx/[(a^2 + b^2 x)^2 √x] = √x/[a^2(a^2 + b^2 x)] + (1/a^3 b) arctan (b√x/a)

(i) Integrals containing √x and a^2 - b^2 x > 0  (a, b ≠ 0)
109. ∫√x dx/(a^2 - b^2 x) = -2√x/b^2 + (a/b^3) log_e [(a + b√x)/(a - b√x)]
110. ∫x√x dx/(a^2 - b^2 x) = -2x√x/(3b^2) - 2a^2 √x/b^4 + (a^3/b^5) log_e [(a + b√x)/(a - b√x)]
111. ∫√x dx/(a^2 - b^2 x)^2 = √x/[b^2(a^2 - b^2 x)] - (1/2ab^3) log_e [(a + b√x)/(a - b√x)]
112. ∫x√x dx/(a^2 - b^2 x)^2 = (3a^2 √x - 2b^2 x√x)/[b^4(a^2 - b^2 x)] - (3a/2b^5) log_e [(a + b√x)/(a - b√x)]
113. ∫dx/[(a^2 - b^2 x)√x] = (1/ab) log_e [(a + b√x)/(a - b√x)]
114. ∫dx/[(a^2 - b^2 x) x√x] = -2/(a^2 √x) + (b/a^3) log_e [(a + b√x)/(a - b√x)]
115. ∫dx/[(a^2 - b^2 x)^2 √x] = √x/[a^2(a^2 - b^2 x)] + (1/2a^3 b) log_e [(a + b√x)/(a - b√x)]
116. ∫dx/[(a^2 - b^2 x)^2 x√x] = -2/(a^4 √x) + b^2 √x/[a^4(a^2 - b^2 x)] + (3b/2a^5) log_e [(a + b√x)/(a - b√x)]

(j) Other integrals containing √x
117. ∫√x dx/(a^4 + x^2) = -(1/(2a√2)) log_e [(x + a√(2x) + a^2)/(x - a√(2x) + a^2)] + (1/(a√2)) arctan [a√(2x)/(a^2 - x)]
118. ∫dx/[(a^4 + x^2)√x] = (1/(2a^3 √2)) log_e [(x + a√(2x) + a^2)/(x - a√(2x) + a^2)] + (1/(a^3 √2)) arctan [a√(2x)/(a^2 - x)]
119. ∫√x dx/(a^4 - x^2) = (1/2a) log_e [(a + √x)/(a - √x)] - (1/a) arctan (√x/a)
120. ∫dx/[(a^4 - x^2)√x] = (1/2a^3) log_e [(a + √x)/(a - √x)] + (1/a^3) arctan (√x/a)
(k) Integrals containing √(ax + b), with X ≡ ax + b  (a ≠ 0); in Nos. 146-156, Y ≡ fx + g and Δ ≡ bf - ag

121. ∫√X dx = (2/3a)√(X^3)
122. ∫x√X dx = 2(3ax - 2b)√(X^3)/(15a^2)
123. ∫x^2 √X dx = 2(15a^2 x^2 - 12abx + 8b^2)√(X^3)/(105a^3)
124. ∫dx/√X = 2√X/a
125. ∫x dx/√X = 2(ax - 2b)√X/(3a^2)
126. ∫x^2 dx/√X = 2(3a^2 x^2 - 4abx + 8b^2)√X/(15a^3)
127. ∫dx/(x√X) = (1/√b) log_e [(√X - √b)/(√X + √b)]  (b > 0);  = (2/√(-b)) arctan (√X/√(-b))  (b < 0)
128. ∫√X dx/x = 2√X + b ∫dx/(x√X)
129. ∫dx/(x^2 √X) = -√X/(bx) - (a/2b) ∫dx/(x√X)
130. ∫√X dx/x^2 = -√X/x + (a/2) ∫dx/(x√X)
131. ∫dx/(x^n √X) = -√X/[(n - 1)b x^{n-1}] - ((2n - 3)a/((2n - 2)b)) ∫dx/(x^{n-1} √X)
132. ∫√(X^3) dx = 2√(X^5)/(5a)
133. ∫x√(X^3) dx = (2/a^2)(√(X^7)/7 - b√(X^5)/5)
134. ∫x^2 √(X^3) dx = (2/a^3)(√(X^9)/9 - 2b√(X^7)/7 + b^2 √(X^5)/5)
135. ∫√(X^3) dx/x = 2√(X^3)/3 + 2b√X + b^2 ∫dx/(x√X)
136. ∫x dx/√(X^3) = (2/a^2)(√X + b/√X)
137. ∫x^2 dx/√(X^3) = (2/a^3)(√(X^3)/3 - 2b√X - b^2/√X)
138. ∫dx/(x√(X^3)) = 2/(b√X) + (1/b) ∫dx/(x√X)  (b ≠ 0)
139. ∫√(X^3) dx/x^2 = -√(X^3)/x + 3a√X + (3ab/2) ∫dx/(x√X)
140. ∫X^{±n/2} dx = 2X^{(2±n)/2}/[a(2 ± n)]  (n ≠ ∓2)
141. ∫xX^{±n/2} dx = (2/a^2)[X^{(4±n)/2}/(4 ± n) - bX^{(2±n)/2}/(2 ± n)]  (n ≠ ∓2, ∓4)
142. ∫x^2 X^{±n/2} dx = (2/a^3)[X^{(6±n)/2}/(6 ± n) - 2bX^{(4±n)/2}/(4 ± n) + b^2 X^{(2±n)/2}/(2 ± n)]  (n ≠ ∓2, ∓4, ∓6)
143. ∫X^{n/2} dx/x = 2X^{n/2}/n + b ∫X^{(n-2)/2} dx/x  (n ≠ 0)
144. ∫dx/(xX^{n/2}) = 2/[(n - 2)bX^{(n-2)/2}] + (1/b) ∫dx/(xX^{(n-2)/2})  (n ≠ 2, b ≠ 0)
145. ∫dx/(x^2 X^{n/2}) = -1/(bxX^{(n-2)/2}) - (na/2b) ∫dx/(xX^{n/2})  (b ≠ 0)
146. ∫dx/√(XY) = (2/√(af)) log_e (√(aY) + √(fX)) + C1  (af > 0);  = (2/√(-af)) arctan √(fX/(-aY))  (af < 0)
147. ∫x dx/√(XY) = √(XY)/(af) - ((ag + bf)/(2af)) ∫dx/√(XY)
148. ∫dx/√(XY^3) = -2√X/(Δ√Y)  (Δ ≠ 0)
149. ∫dx/(Y√X) = (2/√(-fΔ)) arctan √(fX/(-Δ))  (fΔ < 0);  = (1/√(fΔ)) log_e [(√(fX) - √Δ)/(√(fX) + √Δ)]  (fΔ > 0)
150. ∫√X dx/Y = 2√X/f + (Δ/f) ∫dx/(Y√X)  (f ≠ 0)
151. ∫dx/(Y^n √X) = -√X/[(n - 1)ΔY^{n-1}] - ((2n - 3)a/(2(n - 1)Δ)) ∫dx/(Y^{n-1} √X)  (n > 1, Δ ≠ 0)
152. ∫√X dx/Y^n = (a/f) ∫dx/(Y^{n-1} √X) + (Δ/f) ∫dx/(Y^n √X)  (f ≠ 0)
153. ∫Y^n dx/√X = 2Y^n √X/[(2n + 1)a] - (2nΔ/((2n + 1)a)) ∫Y^{n-1} dx/√X
154. ∫√(XY) dx = ((aY + fX)/(4af))√(XY) - (Δ^2/(8af)) ∫dx/√(XY)
155. ∫√X Y^n dx = (a/f) ∫Y^{n+1} dx/√X + (Δ/f) ∫Y^n dx/√X  (f ≠ 0)
156. ∫√(Y/X) dx = √(XY)/a - (Δ/2a) ∫dx/√(XY)

(l) Integrals containing √(a^2 - x^2), with X ≡ a^2 - x^2  (a > 0)

157. ∫√X dx = ½[x√X + a^2 arcsin (x/a)]
158. ∫x√X dx = -(1/3)√(X^3)
159. ∫x^2 √X dx = -(x/4)√(X^3) + (a^2/8)[x√X + a^2 arcsin (x/a)]
160. ∫x^3 √X dx = √(X^5)/5 - a^2 √(X^3)/3
161. ∫(√X/x) dx = √X - a log_e [(a + √X)/x]
162. ∫(√X/x^2) dx = -√X/x - arcsin (x/a)
163. ∫(√X/x^3) dx = -√X/(2x^2) + (1/2a) log_e [(a + √X)/x]
164. ∫dx/√X = arcsin (x/a)
165. ∫x dx/√X = -√X
166. ∫x^2 dx/√X = -(x/2)√X + (a^2/2) arcsin (x/a)
167. ∫x^3 dx/√X = √(X^3)/3 - a^2 √X
168. ∫dx/(x√X) = -(1/a) log_e [(a + √X)/x]
169. ∫dx/(x^2 √X) = -√X/(a^2 x)
170. ∫dx/(x^3 √X) = -√X/(2a^2 x^2) - (1/2a^3) log_e [(a + √X)/x]
171. ∫√(X^3) dx = ¼[x√(X^3) + (3a^2 x/2)√X + (3a^4/2) arcsin (x/a)]
172. ∫x√(X^3) dx = -√(X^5)/5
173. ∫x^2 √(X^3) dx = -x√(X^5)/6 + a^2 x√(X^3)/24 + a^4 x√X/16 + (a^6/16) arcsin (x/a)
174. ∫x^3 √(X^3) dx = √(X^7)/7 - a^2 √(X^5)/5
175. ∫(√(X^3)/x) dx = √(X^3)/3 + a^2 √X - a^3 log_e [(a + √X)/x]
176. ∫(√(X^3)/x^2) dx = -√(X^3)/x - (3x/2)√X - (3a^2/2) arcsin (x/a)
177. ∫(√(X^3)/x^3) dx = -√(X^3)/(2x^2) - (3/2)√X + (3a/2) log_e [(a + √X)/x]
178. ∫dx/√(X^3) = x/(a^2 √X)
179. ∫x dx/√(X^3) = 1/√X
180. ∫x^2 dx/√(X^3) = x/√X - arcsin (x/a)
181. ∫x^3 dx/√(X^3) = √X + a^2/√X
182. ∫dx/(x√(X^3)) = 1/(a^2 √X) - (1/a^3) log_e [(a + √X)/x]
183. ∫dx/(x^2 √(X^3)) = (1/a^4)(x/√X - √X/x)
184. ∫dx/(x^3 √(X^3)) = -1/(2a^2 x^2 √X) + 3/(2a^4 √X) - (3/2a^5) log_e [(a + √X)/x]

(m) Integrals containing √(x^2 + a^2), with X ≡ x^2 + a^2  (a > 0)

185. ∫√X dx = ½[x√X + a^2 sinh^{-1} (x/a)] = ½[x√X + a^2 log_e (x + √X)] + C1
186. ∫x√X dx = √(X^3)/3
187. ∫x^2 √X dx = (x/4)√(X^3) - (a^2/8)[x√X + a^2 log_e (x + √X)] + C1
188. ∫x^3 √X dx = √(X^5)/5 - a^2 √(X^3)/3
189. ∫(√X/x) dx = √X - a log_e [(a + √X)/x]
190. ∫(√X/x^2) dx = -√X/x + log_e (x + √X) + C1
191. ∫(√X/x^3) dx = -√X/(2x^2) - (1/2a) log_e [(a + √X)/x]
192. ∫dx/√X = sinh^{-1} (x/a) = log_e (x + √X) + C1
193. ∫x dx/√X = √X
194. ∫x^2 dx/√X = (x/2)√X - (a^2/2) log_e (x + √X) + C1
195. ∫x^3 dx/√X = √(X^3)/3 - a^2 √X
196. ∫dx/(x√X) = -(1/a) log_e [(a + √X)/x]
197. ∫dx/(x^2 √X) = -√X/(a^2 x)
198. ∫dx/(x^3 √X) = -√X/(2a^2 x^2) + (1/2a^3) log_e [(a + √X)/x]
199. ∫√(X^3) dx = ¼[x√(X^3) + (3a^2 x/2)√X + (3a^4/2) log_e (x + √X)] + C1
200. ∫x√(X^3) dx = √(X^5)/5
201. ∫x^2 √(X^3) dx = x√(X^5)/6 - a^2 x√(X^3)/24 - a^4 x√X/16 - (a^6/16) log_e (x + √X) + C1
202. ∫x^3 √(X^3) dx = √(X^7)/7 - a^2 √(X^5)/5
203. ∫(√(X^3)/x) dx = √(X^3)/3 + a^2 √X - a^3 log_e [(a + √X)/x]
204. ∫(√(X^3)/x^2) dx = -√(X^3)/x + (3x/2)√X + (3a^2/2) log_e (x + √X) + C1
205. ∫(√(X^3)/x^3) dx = -√(X^3)/(2x^2) + (3/2)√X - (3a/2) log_e [(a + √X)/x]
206. ∫dx/√(X^3) = x/(a^2 √X)
207. ∫x dx/√(X^3) = -1/√X
208. ∫x^2 dx/√(X^3) = -x/√X + log_e (x + √X) + C1
209. ∫x^3 dx/√(X^3) = √X + a^2/√X
210. ∫dx/(x√(X^3)) = 1/(a^2 √X) - (1/a^3) log_e [(a + √X)/x]
211. ∫dx/(x^2 √(X^3)) = -(1/a^4)(√X/x + x/√X)
212. ∫dx/(x^3 √(X^3)) = -1/(2a^2 x^2 √X) - 3/(2a^4 √X) + (3/2a^5) log_e [(a + √X)/x]

(n) Integrals containing √(x^2 - a^2), with X ≡ x^2 - a^2  (a > 0)

213. ∫√X dx = ½[x√X - a^2 log_e (x + √X)] + C1
214. ∫x√X dx = √(X^3)/3
215. ∫x^2 √X dx = (x/4)√(X^3) + (a^2/8)[x√X - a^2 log_e (x + √X)] + C1
216. ∫x^3 √X dx = √(X^5)/5 + a^2 √(X^3)/3
217. ∫(√X/x) dx = √X - a arccos (a/x)
218. ∫(√X/x^2) dx = -√X/x + log_e (x + √X) + C1
219. ∫(√X/x^3) dx = -√X/(2x^2) + (1/2a) arccos (a/x)
220. ∫dx/√X = cosh^{-1} (x/a) = log_e (x + √X) + C1
221. ∫x dx/√X = √X
222. ∫x^2 dx/√X = (x/2)√X + (a^2/2) log_e (x + √X) + C1
223. ∫x^3 dx/√X = √(X^3)/3 + a^2 √X
224. ∫dx/(x√X) = (1/a) arccos (a/x)
225. ∫dx/(x^2 √X) = √X/(a^2 x)
226. ∫dx/(x^3 √X) = √X/(2a^2 x^2) + (1/2a^3) arccos (a/x)
227. ∫√(X^3) dx = ¼[x√(X^3) - (3a^2 x/2)√X + (3a^4/2) log_e (x + √X)] + C1
228. ∫x√(X^3) dx = √(X^5)/5
229. ∫x^2 √(X^3) dx = x√(X^5)/6 + a^2 x√(X^3)/24 - a^4 x√X/16 + (a^6/16) log_e (x + √X) + C1
230. ∫x^3 √(X^3) dx = √(X^7)/7 + a^2 √(X^5)/5
231. ∫(√(X^3)/x) dx = √(X^3)/3 - a^2 √X + a^3 arccos (a/x)
232. ∫(√(X^3)/x^2) dx = -√(X^3)/x + (3x/2)√X - (3a^2/2) log_e (x + √X) + C1
233. ∫(√(X^3)/x^3) dx = -√(X^3)/(2x^2) + (3/2)√X - (3a/2) arccos (a/x)
234. ∫dx/√(X^3) = -x/(a^2 √X)
235. ∫x dx/√(X^3) = -1/√X
236. ∫x^2 dx/√(X^3) = -x/√X + log_e (x + √X) + C1
237. ∫x^3 dx/√(X^3) = √X - a^2/√X
238. ∫dx/(x√(X^3)) = -1/(a^2 √X) - (1/a^3) arccos (a/x)
239. ∫dx/(x^2 √(X^3)) = -(1/a^4)(√X/x + x/√X)
240. ∫dx/(x^3 √(X^3)) = 1/(2a^2 x^2 √X) - 3/(2a^4 √X) - (3/2a^5) arccos (a/x)
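No. 157 has a familiar geometric consequence: evaluated from 0 to a it gives πa^2/4, the area of a quarter circle. A brief editorial cross-check by Simpson quadrature (a = 1.7 is an illustrative value):

```python
import math

# E-2(l), No. 157: [x*sqrt(a^2-x^2)/2 + (a^2/2)*arcsin(x/a)] from 0 to a = pi*a^2/4.
def simpson(g, lo, hi, n=20_000):          # n must be even
    h = (hi - lo) / n
    s = g(lo) + g(hi)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(lo + k * h)
    return s * h / 3.0

a = 1.7
area = simpson(lambda x: math.sqrt(max(a * a - x * x, 0.0)), 0.0, a)
print(area)                      # ~ pi*a^2/4
print(math.pi * a * a / 4.0)
```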
(o) Integrals containing √(ax^2 + bx + c), with X ≡ ax^2 + bx + c  (a ≠ 0) and Δ ≡ 4ac - b^2

241. ∫dx/√X
     = (1/√a) log_e (2√(aX) + 2ax + b) + C  (a > 0)
     = (1/√a) sinh^{-1} [(2ax + b)/√Δ] + C1  (a > 0, Δ > 0)
     = (1/√a) log_e (2ax + b)  (a > 0, Δ = 0)
     = -(1/√(-a)) arcsin [(2ax + b)/√(-Δ)]  (a < 0, Δ < 0)
242. ∫dx/(X√X) = 2(2ax + b)/(Δ√X)
243. ∫dx/(X^2 √X) = 2(2ax + b)/(3ΔX√X) + 16a(2ax + b)/(3Δ^2 √X)
244. ∫dx/X^{(2n+1)/2} = 2(2ax + b)/[(2n - 1)ΔX^{(2n-1)/2}] + (8(n - 1)a/((2n - 1)Δ)) ∫dx/X^{(2n-1)/2}  (Δ ≠ 0)
245. ∫√X dx = (2ax + b)√X/(4a) + (Δ/8a) ∫dx/√X
246. ∫X√X dx = (2ax + b)X√X/(8a) + (3Δ/16a) ∫√X dx
247. ∫X^{(2n+1)/2} dx = (2ax + b)X^{(2n+1)/2}/[4a(n + 1)] + ((2n + 1)Δ/(8a(n + 1))) ∫X^{(2n-1)/2} dx
248. ∫x dx/√X = √X/a - (b/2a) ∫dx/√X
249. ∫x dx/(X√X) = -2(bx + 2c)/(Δ√X)
250. ∫x dx/X^{(2n+1)/2} = -1/[(2n - 1)aX^{(2n-1)/2}] - (b/2a) ∫dx/X^{(2n+1)/2}
251. ∫x√X dx = X√X/(3a) - (b/2a) ∫√X dx
252. ∫x^2 dx/√X = (x/2a - 3b/4a^2)√X + ((3b^2 - 4ac)/8a^2) ∫dx/√X
253. ∫x^2 dx/(X√X) = [2(b^2 - 2ac)x + 2bc]/(aΔ√X) + (1/a) ∫dx/√X  (Δ ≠ 0)
254. ∫xX√X dx = X^2 √X/(5a) - (b/2a) ∫X√X dx
255. ∫xX^{(2n+1)/2} dx = X^{(2n+3)/2}/[(2n + 3)a] - (b/2a) ∫X^{(2n+1)/2} dx
256-258. ∫dx/(x√X)
     = -(1/√c) log_e [(2√(cX) + bx + 2c)/x] + C  (c > 0)
     = -(1/√c) sinh^{-1} [(bx + 2c)/(x√Δ)] + C1  (c > 0, Δ > 0)
     = (1/√(-c)) arcsin [(bx + 2c)/(x√(-Δ))]  (c < 0, Δ < 0)
259. ∫dx/(x^2 √X) = -√X/(cx) - (b/2c) ∫dx/(x√X)
260. ∫dx/(xX√X) = 2(2ac - b^2 - abx)/(cΔ√X) + (1/c) ∫dx/(x√X)
261. ∫√X dx/x = √X + (b/2) ∫dx/√X + c ∫dx/(x√X)
262. ∫√X dx/x^2 = -√X/x + a ∫dx/√X + (b/2) ∫dx/(x√X)
(p) Other irrational forms

263. ∫dx/(x√(ax^2 + bx)) = -2√(ax^2 + bx)/(bx)  (b ≠ 0)
264. ∫dx/√(2ax - x^2) = arcsin [(x - a)/a]
265. ∫x dx/√(2ax - x^2) = -√(2ax - x^2) + a arcsin [(x - a)/a]
266. ∫√(2ax - x^2) dx = [(x - a)/2]√(2ax - x^2) + (a^2/2) arcsin [(x - a)/a]
267. ∫dx/[(ax^2 + b)√(fx^2 + g)]
     = (1/(√b √(ag - bf))) arctan [x√(ag - bf)/(√b √(fx^2 + g))]  (ag - bf > 0)
     = (1/(2√b √(bf - ag))) log_e [(√b √(fx^2 + g) + x√(bf - ag))/(√b √(fx^2 + g) - x√(bf - ag))]  (ag - bf < 0)
268. ∫x^{n-1} √(ax^n + b) dx = 2√((ax^n + b)^3)/(3na)
269. ∫x^{n-1} dx/√(ax^n + b) = 2√(ax^n + b)/(na)
270. ∫dx/(x√(x^n + a^2)) = -(2/na) log_e [(a + √(x^n + a^2))/√(x^n)]
271. ∫dx/(x√(x^n - a^2)) = (2/na) arccos (a/√(x^n))
272. ∫√x dx/√(a^3 - x^3) = (2/3) arcsin √(x^3/a^3)

(q) Recursion formulas (m, n, p are integers)

273. ∫x^m(ax^n + b)^p dx
     = [x^{m+1}(ax^n + b)^p + npb ∫x^m(ax^n + b)^{p-1} dx]/(m + np + 1)
     = [-x^{m+1}(ax^n + b)^{p+1} + (m + n + np + 1) ∫x^m(ax^n + b)^{p+1} dx]/[b(m + 1)]
     = [x^{m+1}(ax^n + b)^{p+1} - a(m + n + np + 1) ∫x^{m+n}(ax^n + b)^p dx]/[b(m + 1)]
     = [x^{m-n+1}(ax^n + b)^{p+1} - (m - n + 1)b ∫x^{m-n}(ax^n + b)^p dx]/[a(m + np + 1)]

(r) Integrals containing the sine function  (a ≠ 0)
274. / sin aa; da; = —- cos ax a
275. / sin2 ax da; = -a; — — sin 2ax
/sir
276. / sin8 aa; da; = — cos aa; + 5- cos8 ax '
a
/'
3a
3
1
1
8
4a
32a
277. / sin4aa; dx = ~x — 7- sin 2ax + ^77- sin 4ax 1
278. Jf8in»axdx=-sian~lqa;C08ga; +^^/'sia"-2axdx (»v >0)' na n J 0-n
279.
j1 xt .
,
sin ax
x cos aa;
/ x sin aa; dx = —=
280. J/ a;2 sin ax dx = —r sin ax —[\a a2
;) cos ax
a8/
281. Ja;8 sin as dx - (|? - Ji)**** ~(J - §) x" sin axdx =
9S9 /"sin rc ^
a
cos ax
cos ax + - / xn"1 cos aa; dx aj
(ox)8
(ox)^
_(aaO^
953
rt0.
INTEGRAL TABLES
f sin ax ,
sin ax ,
284. / —=— dx = J x2
rt0„
f sinax ,
f cos axdx
ha / J
x
x
1 sinax .
a
f cos ax ,
285- J —dx=-fr=-i-x^r+—i] -F=r* 286. / _
J sin ax
= / cosec axdx = - log, tan -^ J
a
°
2
= - logfl (cosec ax — cot ax) a
287. /; / -r^= -ia cot ax sin2 ax ooo
/"da;
cos ax
, 1,
.
ax
m J iiir^x - "2alE^x +2^log«tan 2
289. yf-^<w>D an" ax =—r1-^-?vJrL a(n — 1) sinn_l ax +!L^ n — 1 J[ttttsin" 2ax /" xdx _ 1 r
(ax)'
7(ax)'
31(ax)'
290 J sW = a2 L"* + 3TT! + 3T5T5I + nTTl • 127(ax)'
"*" 3 • 5 • 9! "*" / x dx
x
1
o.^2 ^ = —-a cot ax + ~21°£« s*n ax a2
sin2 ax
x dx 292. 7/ -^ sinn si ax
x cos ax
1
(n — l)a sinw_1 ax
(n — l)(n — 2)a2 sinn~2ax
n —2 f
xdx
^ijsliF^ sax
293. j/,_-^— --itaift-f) 1 + sin ax a \4 2/ 294. J/"-—* Itan^ +f?) 1 — sin ax a \4 2/ one f Xdx X, /V OX\ , 2 , /*T OX\ 295. / 7—:—: = — tan ( 7 — -jr- ) + -5 logfi COS ( 7 — 7T ) J 1 + sin ax a \4 2/ a2 s \4 2/
ooa /" __?_dx
x , /V
ax\ , 2 ,
. Ar
ax\
296. / ^ ; = - COt ( 7 — 7t J + -; log, SU1 ( 7 — ^r ) J 1 — sin ax a \4 2/ a2 e \4 2/
/ ^ o\
(n>2)
APPENDIX E
954
"•/fro—+i-(S*f) •»/=?rs=3-i,-(i,,f)+i,*,-! «»/irF==?-5-(S-f)-B-(S-T) «*/ir=fcsi-B-(i-T)+B-'(i-?) ax
2
Qm f
sinoxdx
1
/*•
ox\ , 1 . t /»
ax\
QAO /* sinoxdx
1
/»
ox\ , 1
ax\
../t
303. J/,1 + *sin2ax -4-arcsin(^""M 2V2a \ sin*ax + 1 / 304. J/",—44-/"-^-^Itanax 1 — sin2 ax J cos2 ax a OAC /* .
. , ,
sin (a —6)x
sin0/v(a + 6)x . '
305. / sin ax sin 6x dx = -o7* r^ y 2(a —6)
2(a + 6)
(|a|^|6|;for|a| - |6| see 275)
ana f
30". / r—:
dx:
J 6 + c sin ox
= —,
2
.
aVb2 - c2
1
arctan
6tan/ (ax/2) +c — V&2 - c2
,. t ^ ftX
(62 > c2)
1q 6tan(ax/2)+c- Vc2 - 61
aVc2 - 62 °gd6tan (ox/2) + c+ Vc2 - 62
(62 < c2)
307 I g"103^ _. 5 _ £ / dx j b+ C8wax c cjb + c8max ono /" dx : r = -r 1 ,logatan . ox (j f 308. / 77—: -r- — £ / J sin ax (6 + c sin ax)
/
a6
^
2
r-J
dxr-
6,/ 6 + csu ; sin ax
dx
c cos ax
(6 + c sin ax)2
a(62 —c2)(6 + c sin ax)
^b*-c*J b + cm a m ax
955
rt«_
INTEGRAL TABLES
f
sin ox: dx
6 cos ox
31°- J (6+7ii sin ax)2
a(c2 - 62)(6 + c sin ax)
__c
/*
dx
cz — —b2 b2 J J 6 b + + csi] c sin ax
~„
f
dx
311. / ,» . o « o— =
J b2 + c2 sin2 ox
0
f
1
,
.
a6V62 + c2
dx
arctan
1
.
312. / rr ^-t-5— = y arctan J 62 - c2 sin2 ox o5V62 - c2
Vb2 + c2 tan ax —r
b
,, . AN
(6 > 0)
Vb* - c2 tan ox r &
(62 > c2, 6 > 0) 1 . Vc2 - 62 tan ox + 6 ;log, 2a6Vc2 - 62 Vc2 - 62 tan ax - 6
(c2 > 62, 6 > 0)
(s) Integrals containing the cosine function
(a ^ 0)
313. / cos ax dx = - sin ax a
314. / cos2 axdx = =» + j- sin 2ax 315. / cos8 ax dx = - sinax —=-sin8 ax J
a
6a
/ 3
1 1 cos4 ax dx = jcx + jsin 2ax + t^tsin 4ax
rt„_
/"
»
317. / cosn ax dx -
cos11""1 ax sin ax , n — 1 f
J-
010
318.
,
/•
,
na
_.
,
/ cos*"2 ax dx
n
y
cos ax . x sin ax
/ x cos ax dx = —=
a2
1
a
/2x
/x2
2\
a2
\ a
a8/
x2cos ax dx = -r cos ax + (
=Jisin ax
320. J x8cosaxdx =(^-|jjcosax +^-^)i
sin ax
321. f x-cosaxdx =2l5^«?Jxn-isinaxdx (n >0)
APPENDIX E
956
(ox)4 _ (ox)6 4-4!
6-6!
323. [S°8a5d:c= _cos_ox_ /"sinox^ y
x2
x
oo^ f cos ax .
y
x
cos ax
a
fain ax dx
62A' J ^T** " "(n - i)x-« ~^rij -7=r-
.
(n " 1}
325' j c^x =l^&ten (f +|)=|log. (sec ax +tan ax) /J
-j
—;— = - tan ax cos2 ax a
o97 f
dx
sin ax
, 1
/*• , ax\
327' j ^i8^ =2acos2ax +2^logetanU +yj 328 f dx ^ 1 sin ax n—2 /* dx ' y cosn ax
a(n - 1) cos**"1 ax "*" n - 1y cosn~2 ax
329 /" xdx =1 [M? , (™Y ' J cos ax
.
^n
'
5(ax)6 61(ax)8
a2 L 2 t4-2!t6-4!t 8-6!
. l,385(ax)"
1
+ 10-8! +' "J / x dx
x
1
^ ^ - -tan ax+ -2log. cos ax
331
/ ^^ _
x sin ax
1
' J cosn ax ~* (n —l)a cos*1"1 ax "" (n —l)(n —2)a2 cosn~2 ax , n —2 /"
xdx • ax
/* dx = ±tanf 1 332. y/r^S 2 1 + cos ax ~~ aa
333. 11 ft & =-^cot^ cos ax a 2 334 f
^^
QQk
xdx
^
/
a;
ox , 2 , x
. ax , 2 ,
ox .ax
j 1-cosax ="S00^ +a2log'smy
/^rtv
957
INTEGRAL TABLES
cos< 336., I/* r?™^ =s-ltang* cos ax
a
2
S iax dx oo* / iC O wd 337. / ^ = —x
1
1 - cos ax
a
, ax cot -jr2
338. y[ cos ax (1J^+ cos ax)-=il0g„tanfc +^)-*tanf a \4 2/ a 2 ooa
f
^a;
1,
y cos ox (1 — cos ax) /dx
(1 + cos ax)2
,
a
°
1 , ax a
2/
1
ax
1
ax
2a
2
6a
2
/dx
1
ax
1
7i \5 = ~T" (1 — cos ax)2 2a °0t "752
n,n
(it , ax\ \4
f
cosaxdx
1 ,
o^o f
cosaxdx
1
ax
2
ax
£"" 6a Cot8 li" 2
1 . ,ax
342. / Tj—1 rr = jr- tan -^— — tan3 -^y (1 + cos ax)2 2a 2 6a 2
, ax
1
,, ax
343. / 7^ r^ = h- cot -jr— — cot3 -xy (1 — cos ax)2 2a 2 6a 2
„„.
f
dx
1
. /I - 3cos2ax\
344. / r—j——r— = —7=- arcsin [ -7-j ; ) y 1 + cos2 ax 2V2a \ 1 + cos2 ax /
345. y/ q;1
5— = y/ -t-5— = —a cot ax sin2 ax
— cos2 ax
o^c //* cos ax cos 6x l dx j = sin (a —6)x , sin (a + 6)x —^ rr- H—^—, ,/
346.
y
2(a — 6)
2(a + 6)
(|a| ^ |6|;for |a| = |6|, see 314)
qav
f
dx
2
,
(6 —c); tan (ax/2) v ' '
347. / r-r = —. arctanJ b + c cos ax aV62 — c2
=
Vb2 - c2
/t9 ^ 9X
(b2 > c2)
1 t (c - 6) tan (ax/2) + Vc2 - 62 aVc2 - 62 °ge (c - 6) tan (ox/2) - Vc2 - 62 (62 < c2)
348 f cos ax dx
x __ 6 f
J 6 + ccos ax ^ c
dx
c J 6 + ccos ax
APPENDIX E
958
349' Jcosax (b+ ccosax) =aV0g«tan(f +l) ~s/& +
dx
c cos ax
350.
dx
i (6T c cosax)2
c sin ax
a(c2 - 62)(6 + c cos ax)
_
6
f
dx
c2 - 62y 6 +
q*i
/ _cos ax dx
35L i (6Tc cos ax)2
6 sin ax
a(62 - c2)(6 + c cos ax)
_
c
f
dx
62 - c2 y 6 +
3*2 /
c cos ax
dx
1
,
6tanax
,. ^ ^x
m J SM^^gS-atV6. +c."CtmvF+^ 1
,
2a6Vc2 - 62
c cos ax
0)
6 tan ax — Vc2 — 62
*6 tan ox + Vc2 - 62 (c2 > 62, 6 > 0)
(t) Integrals containing both sineand cosine
t /sir
(a 5* 0)
354. / sin ax cosaxdx = ^- sin2 ax 2a
355. / sin2 ax cos2 ax dx = fo - ^§i^ 32a 356
/sinn ax cos ax dx = —,—r-rrsinn+1 ax a(?i + 1)
. / sin ax cosn ax dx = —-.——pr cosn+1 ax
=
sin*1"1 ax cosm+1 ax , n —1 f . „a» • ^.x
a(n + m)
(n^-1) N
'
(n^ —1)
r —7—; / sinn~2 ax coswax dx n + mJ (m, n > 0)
959
INTEGRAL TABLES
m- 1 f n +. mJ, sinn ax cosw~2 ax dx (m, n > 0)
sinn+1 ax cos™""1 ax
a(n + m)
359. J/
sm ax cos ax
= -a log* tan ax
m / sin2 afcos ax =\ [log«tan (i +?) " STs] 36L / sinoxtos'ax =\(log«t&n?+c^x) 362. yf-^-^ =-(log,tanax-9 *J sm3 ax cos ax a\ 2 sin2 ax/ 363. y/ _sin axdxcos3 ax =ia\Aog,6 tan ax +02 cos21. ax)^
364. Jf sin2 . -t ax***cos2. ax a cot 2ax 365. Jf sin2 .2dxi =l\^L.-J+hoget&n(l +^)'\ ax cos3 ax a L2 cos2 ax sin ax 2 \4 2 /J /*
dx
1/
367 f
dx
=
368 f
dx
=
1
cos ax , 3,
.
ax\
366' J sin8 ax cos2 ax " aV^x" ~~ 2sin2 ax + 2l0ge tan 2) " y sin ax cosnax
' J sinn ax cos ax
1
a(n — 1) cos*1"1 ax
1
, f
a(n —1) sinn_1 ax
dx
y sin ax cosn~2 ax (n^ 1); (see 361, 363)
. [
dx
y sinn""2 ax cos ax
(n^ 1); (see 360, 362)
369. J(-.si dx
sinn ax cosm ax
1 a(n — 1)
t n+ n + m-2 m —2 ff dx 11 sinn_1 ax cos"1"1 ax ax n_1 axcos™"1 n— —11 Jy sinn~2 ax cosm ax (m > 0, n > 1)
1
1
a(m — 1)
sin*-1 ax cos"1"1 ax
n + m-2 f m —1
y
dx sinn ax cosm~*2 ax
(n > 0, m > 1)
APPENDIX E
370./ sin ax dx _ cos2 ax
960
1 a cos ax
371. yf^to
1
y cos* ax ~~ a(n — 1) cos*-1 ax
o7« /* sin2 ax dx 1. 1 Ar , ax\ 373' j "^ax- " —sinax +-logetan^ +-j
«7.
f sin2 ax dx
1T sin ax
1.
, Ar , ax\l
374- J -£FST - aU^x" - 2l0getanU+2-JJ 375 /" sin2 ax dx _ ' J cos* ax
sin ax
1_ f
a(n - 1) cos*-1 ax
n —1J (n ^ 1)
Qf?a f sin3 ax dx
1 /sin2 ax , .
cos*"2 ax
(see 325, 326, 328)
\
376' J -^s-aT- =-^-"2- +logeCosaxj Q77 f sin8 ax dx 5
o77. /
y
cos2 ax
1/
= - I cos ax H
a\
378 /" sin3 ax dx _ 1I" J cos* ax
1 \1
cos ax/
1
aL(w —1) cos*-1 ax
1
"]
(n —3) cos*-3 axj (n^l,n^3)
379. yfgg^cb,-™ri"+ fsin*-2axdx cos ax a(n — I) J cos ax
v
'
380. y[ainnaxdx= , sin"+laa! n-m +2 f sin-ax cosm ax a(m — 1) cos™-1 ax m — 1 J cosm-2 ax (m^ 1) sin*"1 ax
a(w — m) cos*1"1 ax
, n — 1 //" sin*-2 sin*"2 ax a dx
n —m J
cosw < ax (wi t^ n)
sin*"1 ax a(m — 1) cos*1"1 ax
n —1 /* sin*-1 ax dx m —1y
cos*1"2 ax (m^l)
961
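Entries of this family can be checked the same way. The sketch below (our helper and sample values, not the handbook's) verifies 370 and the n = 4 case of 375 on an interval where ax stays inside (−π/2, π/2):

```python
import math

def simpson(f, a, b, n=2000):
    # composite Simpson's rule; n must be even
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * i * h) for i in range(1, n // 2))
    return s * h / 3

a = 0.7  # keeps ax in (-pi/2, pi/2) on [0, 1]

# 370: antiderivative 1/(a cos ax)
F370 = lambda x: 1 / (a * math.cos(a * x))
err370 = abs(simpson(lambda x: math.sin(a * x) / math.cos(a * x) ** 2, 0, 1)
             - (F370(1) - F370(0)))

# 375 with n = 4: sin ax/(3a cos^3 ax) - tan(ax)/(3a)
F375 = lambda x: (math.sin(a * x) / (3 * a * math.cos(a * x) ** 3)
                  - math.tan(a * x) / (3 * a))
err375 = abs(simpson(lambda x: math.sin(a * x) ** 2 / math.cos(a * x) ** 4, 0, 1)
             - (F375(1) - F375(0)))
```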
381. ∫ (cos ax/sin²ax) dx = −1/(a sin ax)
382. ∫ (cos ax/sin³ax) dx = −1/(2a sin²ax)
383. ∫ (cos ax/sin^n ax) dx = −1/[a(n−1) sin^(n−1) ax]   (n ≠ 1)
384. ∫ (cos²ax/sin ax) dx = (1/a)[cos ax + loge tan(ax/2)]
385. ∫ (cos²ax/sin³ax) dx = −(1/(2a))[cos ax/sin²ax + loge tan(ax/2)]
386. ∫ (cos²ax/sin^n ax) dx = −[1/(n−1)][cos ax/(a sin^(n−1) ax) + ∫ dx/sin^(n−2) ax]   (n ≠ 1); (see 289)
387. ∫ (cos³ax/sin ax) dx = (1/a)[(cos²ax)/2 + loge sin ax]
388. ∫ (cos³ax/sin²ax) dx = −(1/a)(sin ax + 1/sin ax)
389. ∫ (cos³ax/sin^n ax) dx = (1/a)[1/((n−3) sin^(n−3) ax) − 1/((n−1) sin^(n−1) ax)]   (n ≠ 1, n ≠ 3)
390. ∫ (cos^n ax/sin ax) dx = cos^(n−1) ax/[a(n−1)] + ∫ (cos^(n−2) ax/sin ax) dx
391. ∫ (cos^n ax/sin^m ax) dx
   = −cos^(n+1) ax/[a(m−1) sin^(m−1) ax] − [(n−m+2)/(m−1)] ∫ (cos^n ax/sin^(m−2) ax) dx   (m ≠ 1)
   = cos^(n−1) ax/[a(n−m) sin^(m−1) ax] + [(n−1)/(n−m)] ∫ (cos^(n−2) ax/sin^m ax) dx   (m ≠ n)
   = −cos^(n−1) ax/[a(m−1) sin^(m−1) ax] − [(n−1)/(m−1)] ∫ (cos^(n−2) ax/sin^(m−2) ax) dx   (m ≠ 1)
392. ∫ dx/[sin ax (1 ± cos ax)] = ∓1/[2a(1 ± cos ax)] + (1/(2a)) loge tan(ax/2)
393. ∫ dx/[cos ax (1 ± sin ax)] = ±1/[2a(1 ± sin ax)] + (1/(2a)) loge tan(π/4 + ax/2)
394. ∫ sin ax dx/[cos ax (1 ± cos ax)] = (1/a) loge[(1 ± cos ax)/cos ax]
395. ∫ cos ax dx/[sin ax (1 ± sin ax)] = (1/a) loge[sin ax/(1 ± sin ax)]
396. ∫ sin ax dx/[cos ax (1 ± sin ax)] = 1/[2a(1 ± sin ax)] ± (1/(2a)) loge tan(π/4 + ax/2)
397. ∫ cos ax dx/[sin ax (1 ± cos ax)] = −1/[2a(1 ± cos ax)] ± (1/(2a)) loge tan(ax/2)
398. ∫ sin ax dx/(sin ax ± cos ax) = x/2 ∓ (1/(2a)) loge(sin ax ± cos ax)
399. ∫ cos ax dx/(sin ax ± cos ax) = ±x/2 + (1/(2a)) loge(sin ax ± cos ax)
400. ∫ dx/(sin ax ± cos ax) = [1/(a√2)] loge tan(ax/2 ± π/8)
401. ∫ dx/(1 + cos ax ± sin ax) = ±(1/a) loge[1 ± tan(ax/2)]
402. ∫ dx/(b sin ax + c cos ax) = [1/(a√(b² + c²))] loge tan[(ax + θ)/2],
   where sin θ = c/√(b² + c²), tan θ = c/b
403. ∫ sin ax dx/(b + c cos ax) = −(1/(ac)) loge(b + c cos ax)
404. ∫ cos ax dx/(b + c sin ax) = (1/(ac)) loge(b + c sin ax)
405. ∫ sin²ax dx/(b + c cos²ax) = (1/(ac)) √((b+c)/b) arctan[√(b/(b+c)) tan ax] − x/c
406. ∫ sin ax cos ax dx/(b cos²ax + c sin²ax) = [1/(2a(c−b))] loge(b cos²ax + c sin²ax)   (c ≠ b)
407. ∫ dx/(b²cos²ax + c²sin²ax) = (1/(abc)) arctan[(c/b) tan ax]
408. ∫ dx/(b²cos²ax − c²sin²ax) = [1/(2abc)] loge[(b + c tan ax)/(b − c tan ax)]
409. ∫ sin ax cos bx dx = −cos(a+b)x/[2(a+b)] − cos(a−b)x/[2(a−b)]   (a² ≠ b²; for a = b see 354)
410. ∫ dx/(b + c cos ax + d sin ax)
   = −[1/(a√(b²−c²−d²))] arcsin{[c² + d² + b(c cos ax + d sin ax)]/[√(c²+d²)(b + c cos ax + d sin ax)]}   (b² > c² + d², |ax| < π)
   = [1/(a√(c²+d²−b²))] loge{[c² + d² + b(c cos ax + d sin ax) + √(c²+d²−b²)(c sin ax − d cos ax)]/[√(c²+d²)(b + c cos ax + d sin ax)]}   (b² < c² + d², |ax| < π)
   = (c sin ax − d cos ax)/[ab(b + c cos ax + d sin ax)]   (b² = c² + d²)
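Entries with free parameters b and c are the ones most easily mis-transcribed, so a quick check helps. The sketch below (our helper and sample constants, not the handbook's) verifies 403 and 407 on intervals where the integrands are smooth:

```python
import math

def simpson(f, a, b, n=2000):
    # composite Simpson's rule; n must be even
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * i * h) for i in range(1, n // 2))
    return s * h / 3

# 403 with a = 1.1, b = 3, c = 1: antiderivative -(1/(ac)) ln(b + c cos ax)
a, b, c = 1.1, 3.0, 1.0
F403 = lambda x: -math.log(b + c * math.cos(a * x)) / (a * c)
err403 = abs(simpson(lambda x: math.sin(a * x) / (b + c * math.cos(a * x)), 0, 2)
             - (F403(2) - F403(0)))

# 407 with a = 1, b = 2, c = 3 on [0, 0.6] (tan ax stays finite there)
a, b, c = 1.0, 2.0, 3.0
F407 = lambda x: math.atan((c / b) * math.tan(a * x)) / (a * b * c)
err407 = abs(simpson(lambda x: 1 / (b**2 * math.cos(a*x)**2 + c**2 * math.sin(a*x)**2),
                     0, 0.6)
             - (F407(0.6) - F407(0)))
```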
(u) Integrals containing tangent and cotangent functions (a ≠ 0)

411. ∫ tan ax dx = −(1/a) loge cos ax
412. ∫ tan²ax dx = (1/a) tan ax − x
413. ∫ tan³ax dx = (1/(2a)) tan²ax + (1/a) loge cos ax
414. ∫ tan^n ax dx = tan^(n−1) ax/[a(n−1)] − ∫ tan^(n−2) ax dx   (n > 1)
415. ∫ dx/(b + c tan ax) = ∫ cot ax dx/(b cot ax + c) = [1/(b²+c²)][bx + (c/a) loge(b cos ax + c sin ax)]
416. ∫ dx/√(b + c tan²ax) = [1/(a√(b−c))] arcsin[√((b−c)/b) sin ax]   (b > 0, b² > c²)
417. ∫ (tan^n ax/cos²ax) dx = tan^(n+1) ax/[a(n+1)]   (n ≠ −1)
418. ∫ cot ax dx = (1/a) loge sin ax
419. ∫ cot²ax dx = ∫ dx/tan²ax = −(1/a) cot ax − x
420. ∫ cot³ax dx = −(1/(2a)) cot²ax − (1/a) loge sin ax
421. ∫ cot^n ax dx = ∫ dx/tan^n ax = −cot^(n−1) ax/[a(n−1)] − ∫ cot^(n−2) ax dx   (n > 1)
422. ∫ dx/(b + c cot ax) = ∫ tan ax dx/(b tan ax + c) = [1/(b²+c²)][bx − (c/a) loge(c cos ax + b sin ax)]
423. ∫ (cot^n ax/sin²ax) dx = −cot^(n+1) ax/[a(n+1)]   (n ≠ −1)

(v) Integrals containing hyperbolic functions (a ≠ 0)

424. ∫ sinh x dx = cosh x
425. ∫ sinh²x dx = (sinh 2x)/4 − x/2
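A spot-check of 413 and 425 (our quadrature helper and sample values, not the handbook's):

```python
import math

def simpson(f, a, b, n=2000):
    # composite Simpson's rule; n must be even
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * i * h) for i in range(1, n // 2))
    return s * h / 3

# 413 with a = 1.1 on [0, 0.9] (ax stays below pi/2)
a = 1.1
F413 = lambda x: math.tan(a * x) ** 2 / (2 * a) + math.log(math.cos(a * x)) / a
err413 = abs(simpson(lambda x: math.tan(a * x) ** 3, 0, 0.9)
             - (F413(0.9) - F413(0)))

# 425: antiderivative sinh(2x)/4 - x/2
F425 = lambda x: math.sinh(2 * x) / 4 - x / 2
err425 = abs(simpson(lambda x: math.sinh(x) ** 2, 0, 2)
             - (F425(2) - F425(0)))
```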
426. ∫ dx/sinh x = loge tanh(x/2)
427. ∫ dx/sinh²x = −coth x
428. ∫ cosh x dx = sinh x
429. ∫ cosh²x dx = (sinh 2x)/4 + x/2
430. ∫ dx/cosh x = 2 arctan e^x = arctan(sinh x)   (the two antiderivatives differ by the constant π/2)
431. ∫ dx/cosh²x = tanh x
432. ∫ (sinh x/cosh²x) dx = −1/cosh x
433. ∫ (cosh x/sinh²x) dx = −1/sinh x
434. ∫ x sinh x dx = x cosh x − sinh x
435. ∫ x cosh x dx = x sinh x − cosh x
436. ∫ tanh x dx = loge cosh x
437. ∫ tanh²x dx = x − tanh x
438. ∫ coth x dx = loge sinh x
439. ∫ coth²x dx = x − coth x
440. ∫ sinh^n ax dx
   = (1/(an)) sinh^(n−1) ax cosh ax − [(n−1)/n] ∫ sinh^(n−2) ax dx   (n > 0)
   = sinh^(n+1) ax cosh ax/[a(n+1)] − [(n+2)/(n+1)] ∫ sinh^(n+2) ax dx   (n < −1)
441. ∫ cosh^n ax dx
   = (1/(an)) sinh ax cosh^(n−1) ax + [(n−1)/n] ∫ cosh^(n−2) ax dx   (n > 0)
   = −sinh ax cosh^(n+1) ax/[a(n+1)] + [(n+2)/(n+1)] ∫ cosh^(n+2) ax dx   (n < −1)
442. ∫ sinh ax sinh bx dx = sinh(a+b)x/[2(a+b)] − sinh(a−b)x/[2(a−b)]   (a² ≠ b²)
443. ∫ cosh ax cosh bx dx = sinh(a+b)x/[2(a+b)] + sinh(a−b)x/[2(a−b)]   (a² ≠ b²)
444. ∫ sinh ax cosh bx dx = cosh(a+b)x/[2(a+b)] + cosh(a−b)x/[2(a−b)]   (a² ≠ b²)
445. ∫ sinh ax sin ax dx = (1/(2a))(cosh ax sin ax − sinh ax cos ax)
446. ∫ cosh ax cos ax dx = (1/(2a))(sinh ax cos ax + cosh ax sin ax)
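The mixed hyperbolic-trigonometric products are a frequent source of sign slips, so here is a check of 442 and 445 (our helper and sample constants, not the handbook's):

```python
import math

def simpson(f, a, b, n=2000):
    # composite Simpson's rule; n must be even
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * i * h) for i in range(1, n // 2))
    return s * h / 3

# 442 with a = 0.9, b = 0.4
a, b = 0.9, 0.4
F442 = lambda x: (math.sinh((a + b) * x) / (2 * (a + b))
                  - math.sinh((a - b) * x) / (2 * (a - b)))
err442 = abs(simpson(lambda x: math.sinh(a * x) * math.sinh(b * x), 0, 2)
             - (F442(2) - F442(0)))

# 445 with a = 0.8
a = 0.8
F445 = lambda x: (math.cosh(a * x) * math.sin(a * x)
                  - math.sinh(a * x) * math.cos(a * x)) / (2 * a)
err445 = abs(simpson(lambda x: math.sinh(a * x) * math.sin(a * x), 0, 2)
             - (F445(2) - F445(0)))
```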
447. ∫ sinh ax cos ax dx = (1/(2a))(cosh ax cos ax + sinh ax sin ax)
448. ∫ cosh ax sin ax dx = (1/(2a))(sinh ax sin ax − cosh ax cos ax)

(w) Integrals containing exponential functions (a ≠ 0)
449. ∫ e^(ax) dx = e^(ax)/a
450. ∫ x e^(ax) dx = e^(ax)(ax − 1)/a²
451. ∫ x² e^(ax) dx = e^(ax)(x²/a − 2x/a² + 2/a³)
452. ∫ x^n e^(ax) dx = (1/a) x^n e^(ax) − (n/a) ∫ x^(n−1) e^(ax) dx   (n > 0)
453. ∫ (e^(ax)/x) dx = loge|x| + ax/(1·1!) + (ax)²/(2·2!) + (ax)³/(3·3!) + ···
454. ∫ (e^(ax)/x^n) dx = [1/(n−1)][−e^(ax)/x^(n−1) + a ∫ (e^(ax)/x^(n−1)) dx]   (n > 1)
455. ∫ dx/(1 + e^(ax)) = (1/a) loge[e^(ax)/(1 + e^(ax))]
456. ∫ dx/(b + c e^(ax)) = x/b − (1/(ab)) loge(b + c e^(ax))
457. ∫ dx/(b + c e^(−ax)) = x/b + (1/(ab)) loge(b + c e^(−ax))
458. ∫ dx/(b e^(ax) + c e^(−ax))
   = [1/(a√(bc))] arctan[e^(ax) √(b/c)]   (bc > 0)
   = [1/(2a√(−bc))] loge[(c + e^(ax)√(−bc))/(c − e^(ax)√(−bc))]   (bc < 0)
459. ∫ x e^(ax) dx/(1 + ax)² = e^(ax)/[a²(1 + ax)]
460. ∫ e^(ax) loge x dx = (1/a) e^(ax) loge x − (1/a) ∫ (e^(ax)/x) dx
461. ∫ e^(ax) sin bx dx = e^(ax)(a sin bx − b cos bx)/(a² + b²)
462. ∫ e^(ax) cos bx dx = e^(ax)(a cos bx + b sin bx)/(a² + b²)
463. ∫ x e^(ax) sin bx dx = x e^(ax)(a sin bx − b cos bx)/(a² + b²) − e^(ax)[(a² − b²) sin bx − 2ab cos bx]/(a² + b²)²
464. ∫ x e^(ax) cos bx dx = x e^(ax)(a cos bx + b sin bx)/(a² + b²) − e^(ax)[(a² − b²) cos bx + 2ab sin bx]/(a² + b²)²
465. ∫ e^(ax) sin bx sin cx dx = e^(ax)[(b−c) sin(b−c)x + a cos(b−c)x]/{2[a² + (b−c)²]} − e^(ax)[(b+c) sin(b+c)x + a cos(b+c)x]/{2[a² + (b+c)²]}
466. ∫ e^(ax) cos bx cos cx dx = e^(ax)[(b−c) sin(b−c)x + a cos(b−c)x]/{2[a² + (b−c)²]} + e^(ax)[(b+c) sin(b+c)x + a cos(b+c)x]/{2[a² + (b+c)²]}
467. ∫ e^(ax) sin bx cos cx dx = e^(ax)[a sin(b−c)x − (b−c) cos(b−c)x]/{2[a² + (b−c)²]} + e^(ax)[a sin(b+c)x − (b+c) cos(b+c)x]/{2[a² + (b+c)²]}
468. ∫ e^(ax) sin bx sin(bx + c) dx = e^(ax) cos c/(2a) − e^(ax)[a cos(2bx + c) + 2b sin(2bx + c)]/[2(a² + 4b²)]
469. ∫ e^(ax) cos bx cos(bx + c) dx = e^(ax) cos c/(2a) + e^(ax)[a cos(2bx + c) + 2b sin(2bx + c)]/[2(a² + 4b²)]
470. ∫ e^(ax) sin bx cos(bx + c) dx = −e^(ax) sin c/(2a) + e^(ax)[a sin(2bx + c) − 2b cos(2bx + c)]/[2(a² + 4b²)]
471. ∫ e^(ax) cos bx sin(bx + c) dx = e^(ax) sin c/(2a) + e^(ax)[a sin(2bx + c) − 2b cos(2bx + c)]/[2(a² + 4b²)]
472. ∫ e^(ax) sin^n bx dx = e^(ax) sin^(n−1) bx (a sin bx − nb cos bx)/(a² + n²b²) + [n(n−1)b²/(a² + n²b²)] ∫ e^(ax) sin^(n−2) bx dx
473. ∫ e^(ax) cos^n bx dx = e^(ax) cos^(n−1) bx (a cos bx + nb sin bx)/(a² + n²b²) + [n(n−1)b²/(a² + n²b²)] ∫ e^(ax) cos^(n−2) bx dx
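A check of the workhorse formulas 461 and 462 (our helper and sample constants, not the handbook's):

```python
import math

def simpson(f, a, b, n=2000):
    # composite Simpson's rule; n must be even
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * i * h) for i in range(1, n // 2))
    return s * h / 3

a, b = 0.5, 1.7
den = a * a + b * b

# 461: e^(ax)(a sin bx - b cos bx)/(a^2 + b^2)
F461 = lambda x: math.exp(a * x) * (a * math.sin(b * x) - b * math.cos(b * x)) / den
err461 = abs(simpson(lambda x: math.exp(a * x) * math.sin(b * x), 0, 2)
             - (F461(2) - F461(0)))

# 462: e^(ax)(a cos bx + b sin bx)/(a^2 + b^2)
F462 = lambda x: math.exp(a * x) * (a * math.cos(b * x) + b * math.sin(b * x)) / den
err462 = abs(simpson(lambda x: math.exp(a * x) * math.cos(b * x), 0, 2)
             - (F462(2) - F462(0)))
```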
INTEGRAL TABLES
(x) Integrals containing logarithmic functions
(a 9^ 0)
474. / log«axdx = xlog*ax —x
475. J (loge ax)2 dx = x(logfl ax)2 —2x log« ax + 2x
476. y (log<»ax)ndx =x(logeax)n —w. f (logeax)""1 dx
(n^ —1)
(loge aa;)3
+ 3-3! +• ' "J 478. / x loge ax dx = ^-loge ax — 4 2 6e / x 3
x2 loge axdx = -5- loge ax —
./xMogeaxdx =x-[^-^] (»„-!)
480
xw(loge ax)*dx =^--j (logeax)w - ^j / x» (logfl ax)"1"1 dx
482. J[SSbS&te =(hfc^p-1 x n + 1
(w, n 7* —1)
(n ^ _1}' v
483. yflgfe**,-l0g;f -7 J-— xn (n — ljx*-1 (»- l^x*"1
(„*!) v y
484. J/" <*°^ dx =- (n, ^ l)xn~l f" t+-5Lf ^ xn^m"1 dx xn ^ n - 1J (» * 1) / x n dx
1
r
logT^ =a^1 Ll0ge (1°g° "** + (n + 2) ^ "* "1"
(w + l^dog.ax)2 (w+l)3(log«ax)3 2-2! ~l~ 3-3!
" Jh3 J *y
b/ =(» +1) log.ax]
APPENDIX E
486 f x"dx _ ' J G0ge «^)m
—xn+l
(W —l)(l0ge ax)"1-1
970
, n+ 1 f
x^ dx
m —1J (loge aIx)1"^1 (m^l)
487. /jr^;log. dog. ax) loge ax dx
488.
dx f dh J a;(loge ax)n
(n — 1)(loge ax)"-1 x
489. / sin (log« ax) dx = ^2 [sin (log« ax) —cos (log* ax)]
490. / cos (loge ax) dx = ^2 [sin (log« ax) + cos (loge ax)] x
e"* loge 6x dx = - e°* loge 6x
/ — dx
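A check of 475 and the n = 2 case of 483 (our helper and sample constants, not the handbook's):

```python
import math

def simpson(f, a, b, n=2000):
    # composite Simpson's rule; n must be even
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * i * h) for i in range(1, n // 2))
    return s * h / 3

a = 2.0

# 475: x(ln ax)^2 - 2x ln ax + 2x
F475 = lambda x: x * math.log(a * x) ** 2 - 2 * x * math.log(a * x) + 2 * x
err475 = abs(simpson(lambda x: math.log(a * x) ** 2, 1, 3)
             - (F475(3) - F475(1)))

# 483 with n = 2: -ln(ax)/x - 1/x
F483 = lambda x: -math.log(a * x) / x - 1 / x
err483 = abs(simpson(lambda x: math.log(a * x) / x ** 2, 1, 3)
             - (F483(3) - F483(1)))
```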
(y) Integrals containing inverse trigonometric and hyperbolic functions (a > 0)

492. ∫ arcsin(x/a) dx = x arcsin(x/a) + √(a² − x²)
493. ∫ x arcsin(x/a) dx = (x²/2 − a²/4) arcsin(x/a) + (x/4)√(a² − x²)
494. ∫ x² arcsin(x/a) dx = (x³/3) arcsin(x/a) + (1/9)(x² + 2a²)√(a² − x²)
495. ∫ x^n arcsin(x/a) dx = [x^(n+1)/(n+1)] arcsin(x/a) − [1/(n+1)] ∫ x^(n+1) dx/√(a² − x²)   (n ≠ −1)
496. ∫ [arcsin(x/a)/x] dx = x/a + x³/(2·3·3 a³) + 1·3 x⁵/(2·4·5·5 a⁵) + 1·3·5 x⁷/(2·4·6·7·7 a⁷) + ···   (x² < a²)
497. ∫ [arcsin(x/a)/x²] dx = −(1/x) arcsin(x/a) − (1/a) loge[(a + √(a² − x²))/x]
498. ∫ [arcsin(x/a)]² dx = x[arcsin(x/a)]² + 2√(a² − x²) arcsin(x/a) − 2x
499. ∫ arccos(x/a) dx = x arccos(x/a) − √(a² − x²)
500. ∫ x arccos(x/a) dx = (x²/2 − a²/4) arccos(x/a) − (x/4)√(a² − x²)
501. ∫ x² arccos(x/a) dx = (x³/3) arccos(x/a) − (1/9)(x² + 2a²)√(a² − x²)
502. ∫ x^n arccos(x/a) dx = [x^(n+1)/(n+1)] arccos(x/a) + [1/(n+1)] ∫ x^(n+1) dx/√(a² − x²)   (n ≠ −1)
503. ∫ [arccos(x/a)/x] dx = (π/2) loge x − [x/a + x³/(2·3·3 a³) + 1·3 x⁵/(2·4·5·5 a⁵) + 1·3·5 x⁷/(2·4·6·7·7 a⁷) + ···]   (x² < a²)
504. ∫ [arccos(x/a)/x²] dx = −(1/x) arccos(x/a) + (1/a) loge[(a + √(a² − x²))/x]
505. ∫ [arccos(x/a)]² dx = x[arccos(x/a)]² − 2√(a² − x²) arccos(x/a) − 2x
506. ∫ arctan(x/a) dx = x arctan(x/a) − (a/2) loge(a² + x²)
507. ∫ x arctan(x/a) dx = (1/2)(x² + a²) arctan(x/a) − ax/2
508. ∫ x² arctan(x/a) dx = (x³/3) arctan(x/a) − ax²/6 + (a³/6) loge(a² + x²)
509. ∫ x^n arctan(x/a) dx = [x^(n+1)/(n+1)] arctan(x/a) − [a/(n+1)] ∫ x^(n+1) dx/(a² + x²)   (n ≠ −1)
510. ∫ [arctan(x/a)/x] dx = x/a − x³/(3² a³) + x⁵/(5² a⁵) − x⁷/(7² a⁷) + ···   (|x| < a)
511. ∫ [arctan(x/a)/x²] dx = −(1/x) arctan(x/a) − (1/(2a)) loge[(a² + x²)/x²]
512. ∫ [arctan(x/a)/x^n] dx = −arctan(x/a)/[(n−1) x^(n−1)] + [a/(n−1)] ∫ dx/[x^(n−1)(a² + x²)]   (n ≠ 1)
513. ∫ arccot(x/a) dx = x arccot(x/a) + (a/2) loge(a² + x²)
514. ∫ x arccot(x/a) dx = (1/2)(x² + a²) arccot(x/a) + ax/2
515. ∫ x² arccot(x/a) dx = (x³/3) arccot(x/a) + ax²/6 − (a³/6) loge(a² + x²)
516. ∫ x^n arccot(x/a) dx = [x^(n+1)/(n+1)] arccot(x/a) + [a/(n+1)] ∫ x^(n+1) dx/(a² + x²)   (n ≠ −1)
517. ∫ [arccot(x/a)/x] dx = (π/2) loge x − x/a + x³/(3² a³) − x⁵/(5² a⁵) + x⁷/(7² a⁷) − ···
518. ∫ [arccot(x/a)/x²] dx = −(1/x) arccot(x/a) + (1/(2a)) loge[(a² + x²)/x²]
519. ∫ [arccot(x/a)/x^n] dx = −arccot(x/a)/[(n−1) x^(n−1)] − [a/(n−1)] ∫ dx/[x^(n−1)(a² + x²)]   (n ≠ 1)
520. ∫ sinh⁻¹(x/a) dx = x sinh⁻¹(x/a) − √(x² + a²)
521. ∫ x sinh⁻¹(x/a) dx = (x²/2 + a²/4) sinh⁻¹(x/a) − (x/4)√(x² + a²)
522. ∫ cosh⁻¹(x/a) dx = x cosh⁻¹(x/a) ∓ √(x² − a²)   (upper sign for cosh⁻¹(x/a) > 0)
523. ∫ tanh⁻¹(x/a) dx = x tanh⁻¹(x/a) + (a/2) loge(a² − x²)
524. ∫ x tanh⁻¹(x/a) dx = [(x² − a²)/2] tanh⁻¹(x/a) + ax/2
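A check of 506 and 497 from this group (our helper and sample constants, not the handbook's; for 497 the interval is kept inside 0 < x < a):

```python
import math

def simpson(f, a, b, n=2000):
    # composite Simpson's rule; n must be even
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * i * h) for i in range(1, n // 2))
    return s * h / 3

# 506 with a = 1.5: x arctan(x/a) - (a/2) ln(a^2 + x^2)
a = 1.5
F506 = lambda x: x * math.atan(x / a) - (a / 2) * math.log(a * a + x * x)
err506 = abs(simpson(lambda x: math.atan(x / a), 0, 2) - (F506(2) - F506(0)))

# 497 with a = 2 on [0.5, 1.5]
a = 2.0
F497 = lambda x: (-math.asin(x / a) / x
                  - (1 / a) * math.log((a + math.sqrt(a * a - x * x)) / x))
err497 = abs(simpson(lambda x: math.asin(x / a) / x ** 2, 0.5, 1.5)
             - (F497(1.5) - F497(0.5)))
```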
525. ∫ coth⁻¹(x/a) dx = x coth⁻¹(x/a) + (a/2) loge(x² − a²)

E-3. Definite Integrals (see also Secs. 21.4-1 and 21.6-6 and Appendix D). m, n are integers.
(a) Integrals containing algebraic functions

1. ∫₀^∞ a dx/(a² + x²) = π/2 (a > 0); 0 (a = 0); −π/2 (a < 0)
3. ∫₀^1 x^α (1−x)^β dx = 2 ∫₀^1 x^(2α+1) (1−x²)^β dx = ∫₀^∞ x^α dx/(1+x)^(α+β+2) = Γ(α+1)Γ(β+1)/Γ(α+β+2) = B(α+1, β+1)
4. ∫₀^∞ dx/[(1+x) x^α] = π/sin απ   (0 < α < 1)
6. ∫₀^∞ dx/[(1−x) x^α] = −π cot απ   (0 < α < 1; Cauchy principal value)
7. ∫₀^∞ x^(a−1) dx/(1 + x^b) = π/[b sin(aπ/b)]   (0 < a < b)
8. ∫₀^1 dx/√(1 − x^a) = √π Γ(1/a)/[a Γ(1/a + 1/2)]   (a > 0)
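The Beta-integral identity in entry 3 can be confirmed directly against `math.gamma` (our helper and the integer sample exponents are ours, not the handbook's):

```python
import math

def simpson(f, a, b, n=2000):
    # composite Simpson's rule; n must be even
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * i * h) for i in range(1, n // 2))
    return s * h / 3

alpha, beta = 2, 3
num = simpson(lambda x: x ** alpha * (1 - x) ** beta, 0, 1)
exact = math.gamma(alpha + 1) * math.gamma(beta + 1) / math.gamma(alpha + beta + 2)
err_beta = abs(num - exact)  # exact value here is 2! * 3! / 6! = 1/60
```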
(b) Integrals containing trigonometric functions (see also Secs. E-3c and E-3d)

9. ∫₀^π sin mx cos nx dx = 0 (m − n even); 2m/(m² − n²) (m − n odd)
10. ∫₀^2π sin mx sin(nx + ϑ) dx = ∫₀^2π cos mx cos(nx + ϑ) dx = 0 (m ≠ n); π cos ϑ (m = n)
11. ∫₀^(π/2) sin^α x dx = ∫₀^(π/2) cos^α x dx = (√π/2) Γ((α+1)/2)/Γ(α/2 + 1)   (α > −1)
   = [1·3·5···(α−1)]/[2·4·6···α] · (π/2)   (α = 2, 4, ...)
   = [2·4·6···(α−1)]/[3·5·7···α]   (α = 3, 5, ...)
12. ∫₀^(π/2) sin^(2α+1) x cos^(2β+1) x dx = (1/2) B(α+1, β+1) = α! β!/[2(α+β+1)!]   (α, β ≠ −1; the last form for integers α, β ≥ 0)

Note: Use formula (12), with the factorials replaced by gamma functions, to obtain integrals such as ∫₀^(π/2) √(sin x) dx.

13. ∫₀^∞ (sin ax/x) dx = π/2 (a > 0); −π/2 (a < 0)
14. ∫₀^∞ (cos ax/x) dx = ∞   (a ≠ 0)
16. ∫₀^∞ [(cos ax − cos bx)/x] dx = loge(b/a)   (a, b > 0)
17. ∫₀^∞ [(sin x cos ax)/x] dx = 0 (|a| > 1); π/4 (|a| = 1); π/2 (|a| < 1)
18. ∫₀^∞ (sin x/√x) dx = ∫₀^∞ (cos x/√x) dx = √(π/2)
20. ∫₀^∞ cos ax dx/(1 + x²) = (π/2) e^(−|a|)
21. ∫₀^∞ (sin²ax/x²) dx = (π/2)|a|
22. ∫₀^∞ (sin ax sin bx/x²) dx = πa/2   (0 < a ≤ b)
23. ∫₀^∞ cos(x²) dx = ∫₀^∞ sin(x²) dx = (1/2)√(π/2)
24. ∫₀^π sin²mx dx = ∫₀^π cos²mx dx = π/2
25. ∫₀^π dx/(a + b cos x) = π/√(a² − b²)   (a > b > 0)
26. ∫₀^(π/2) dx/(a + b cos x) = arccos(b/a)/√(a² − b²)
27. ∫₀^(π/2) dx/(a²cos²x + b²sin²x) = π/(2ab)   (ab ≠ 0)
28. ∫₀^(π/2) dx/(a²cos²x + b²sin²x)² = π(a² + b²)/(4a³b³)
29. ∫₀^π (a − b cos x) dx/(a² − 2ab cos x + b²) = 0 (a² < b²); π/(2a) (a = b); π/a (a² > b²)
30. ∫₀^π cos nx dx/(1 − 2b cos x + b²) = πb^n/(1 − b²)   (b² < 1)
31. ∫₀^1 dx/(1 + 2x cos α + x²) = α/(2 sin α)   (0 < α < π)
32. ∫₀^∞ dx/(1 + 2x cos α + x²) = α/sin α   (0 < α < π)
33. ∫₀^(π/2) sin x dx/√(1 − k²sin²x) = [1/(2k)] loge[(1 + k)/(1 − k)]   (|k| < 1)
34. ∫₀^(π/2) cos x dx/√(1 − k²sin²x) = (1/k) arcsin k
35. ∫₀^(π/2) sin²x dx/√(1 − k²sin²x) = (1/k²)(K − E)
36. ∫₀^(π/2) cos²x dx/√(1 − k²sin²x) = (1/k²)[E − (1 − k²)K]   (|k| < 1)

(K and E denote the complete elliptic integrals of modulus k.)
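Entries 25 and 31 can be verified over their finite intervals directly (our helper and sample constants, not the handbook's):

```python
import math

def simpson(f, a, b, n=2000):
    # composite Simpson's rule; n must be even
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * i * h) for i in range(1, n // 2))
    return s * h / 3

# 25 with a = 3, b = 1
a, b = 3.0, 1.0
err25 = abs(simpson(lambda x: 1 / (a + b * math.cos(x)), 0, math.pi)
            - math.pi / math.sqrt(a * a - b * b))

# 31 with alpha = 1
alpha = 1.0
err31 = abs(simpson(lambda x: 1 / (1 + 2 * x * math.cos(alpha) + x * x), 0, 1)
            - alpha / (2 * math.sin(alpha)))
```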
(c) Integrals containing exponential and hyperbolic functions (a > 0)

37. ∫₀^∞ e^(−ax) dx = 1/a
38. ∫₀^∞ x^b e^(−ax) dx = Γ(b+1)/a^(b+1)   (a > 0, b > −1); = b!/a^(b+1)   (a > 0, b = 0, 1, 2, ...)
39. ∫₀^∞ x^b e^(−ax²) dx = Γ((b+1)/2)/[2a^((b+1)/2)]   (a > 0, b > −1)
   = [1·3·5···(b−1)/2^((b/2)+1)] √(π/a^(b+1))   (a > 0, b = 0, 2, 4, ...)
   = [((b−1)/2)!]/[2a^((b+1)/2)]   (a > 0, b = 1, 3, 5, ...)
40. ∫₀^∞ e^(−a²x²) dx = √π/(2a)
41. ∫₀^∞ x e^(−x²) dx = 1/2
42. ∫₀^∞ x² e^(−x²) dx = √π/4
43. ∫₀^∞ √x e^(−ax) dx = [1/(2a)]√(π/a)
44. ∫₀^∞ (e^(−ax)/√x) dx = √(π/a)
45. ∫₀^∞ e^(−x² − a²/x²) dx = (√π/2) e^(−2a)
46. ∫₀^∞ [(e^(−ax) − e^(−bx))/x] dx = loge(b/a)   (a, b > 0)
47. ∫₀^∞ x dx/(e^x − 1) = π²/6
48. ∫₀^∞ x dx/(e^x + 1) = π²/12
49. ∫₀^∞ e^(−ax) cos bx dx = a/(a² + b²)
50. ∫₀^∞ e^(−ax) sin bx dx = b/(a² + b²)
51. ∫₀^∞ e^(−ax) cosh bx dx = a/(a² − b²)   (a > b > 0)
52. ∫₀^∞ e^(−ax) sinh bx dx = b/(a² − b²)   (a > b > 0)
53. ∫₀^∞ x e^(−ax) sin bx dx = 2ab/(a² + b²)²
54. ∫₀^∞ x e^(−ax) cos bx dx = (a² − b²)/(a² + b²)²
55. ∫₀^∞ e^(−a²x²) cos bx dx = [√π/(2a)] e^(−b²/(4a²))
56. ∫₀^∞ x² e^(−ax) sin bx dx = 2b(3a² − b²)/(a² + b²)³
57. ∫₀^∞ x² e^(−ax) cos bx dx = 2a(a² − 3b²)/(a² + b²)³
58. ∫₀^∞ [e^(−ax) sin x/x] dx = arctan(1/a)
59. ∫₀^∞ dx/cosh ax = π/(2a)
60. ∫₀^∞ x dx/sinh ax = π²/(4a²)

(d) Integrals containing logarithmic functions

61. ∫₀^1 loge|loge x| dx = ∫₀^∞ e^(−x) loge x dx = −C = −0.5772157
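The Laplace-transform pair 49 and 50 can be checked by truncating the infinite range at a point where the exponential factor is negligible (our helper, truncation point, and constants are ours, not the handbook's):

```python
import math

def simpson(f, a, b, n=2000):
    # composite Simpson's rule; n must be even
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * i * h) for i in range(1, n // 2))
    return s * h / 3

a, b, X = 1.0, 2.0, 40.0  # e^(-aX) ~ 4e-18, so [0, X] effectively covers [0, inf)

err49 = abs(simpson(lambda x: math.exp(-a * x) * math.cos(b * x), 0, X, 40000)
            - a / (a * a + b * b))
err50 = abs(simpson(lambda x: math.exp(-a * x) * math.sin(b * x), 0, X, 40000)
            - b / (a * a + b * b))
```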
62. ∫₀^1 [loge x/(1 − x)] dx = −π²/6
63. ∫₀^1 [loge x/(1 + x)] dx = −π²/12
67. ∫₀^1 x loge(1 − x) dx = −3/4
68. ∫₀^1 x loge(1 + x) dx = 1/4
73. ∫₀^1 √(loge(1/x)) dx = √π/2
74. ∫₀^1 [loge(1/x)]^a dx = Γ(a + 1)   (a > −1)
75. ∫₀^1 (loge x)^n dx = (−1)^n n!   (n = 1, 2, ...)
76. ∫₀^1 x^a [loge(1/x)]^n dx = Γ(n+1)/(a+1)^(n+1)   (a, n > −1)
77. ∫₀^(π/2) loge sin x dx = ∫₀^(π/2) loge cos x dx = −(π/2) loge 2
78. ∫₀^π x loge sin x dx = −(π²/2) loge 2
79. ∫₀^π loge(a ± b cos x) dx = π loge[(a + √(a² − b²))/2]   (a ≥ b)
80. ∫₀^π [loge(1 + sin α cos x)/cos x] dx = πα
81. ∫₀^(π/2) sin x loge sin x dx = loge 2 − 1
82. ∫₀^(π/2) loge tan x dx = 0
84. ∫₀^(π/4) loge(1 + tan x) dx = (π/8) loge 2
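Entries 68 and 84 have smooth integrands over finite intervals and so are easy to confirm (our helper, not the handbook's):

```python
import math

def simpson(f, a, b, n=2000):
    # composite Simpson's rule; n must be even
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * i * h) for i in range(1, n // 2))
    return s * h / 3

# 68: integral of x ln(1 + x) over [0, 1] equals 1/4
err68 = abs(simpson(lambda x: x * math.log(1 + x), 0, 1) - 0.25)

# 84: integral of ln(1 + tan x) over [0, pi/4] equals (pi/8) ln 2
err84 = abs(simpson(lambda x: math.log(1 + math.tan(x)), 0, math.pi / 4)
            - (math.pi / 8) * math.log(2))
```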
SUMS AND INFINITE SERIES
E-4. Some Finite Sums

1. 1 + 2 + 3 + ··· + (n−1) + n = n(n+1)/2
2. p + (p+1) + (p+2) + ··· + (q−1) + q = (q+p)(q−p+1)/2
3. 1 + 3 + 5 + ··· + (2n−3) + (2n−1) = n²
4. 2 + 4 + 6 + ··· + (2n−2) + 2n = n(n+1)
5. 1² + 2² + 3² + ··· + (n−1)² + n² = n(n+1)(2n+1)/6
6. 1³ + 2³ + 3³ + ··· + (n−1)³ + n³ = n²(n+1)²/4
7. 1² + 3² + 5² + ··· + (2n−1)² = n(4n²−1)/3
8. 1³ + 3³ + 5³ + ··· + (2n−1)³ = n²(2n²−1)
9. 1⁴ + 2⁴ + 3⁴ + ··· + n⁴ = n(n+1)(2n+1)(3n²+3n−1)/30
10. a₀ + (a₀+d) + (a₀+2d) + ··· + (a₀+nd) = [(n+1)/2](2a₀ + nd)   (finite arithmetic series)
11. 1 + a + a² + ··· + a^n = (1 − a^(n+1))/(1 − a)   (finite geometric series)
12. Σ(k=1..n) 1/[k(k+1)] = 1 − 1/(n+1) = n/(n+1)
13. Σ(k=1..n) 1/[k(k+1)(k+2)] = (1/2)[1/(1·2) − 1/((n+1)(n+2))]
14. Σ(k=1..n) sin(ka + b) = sin(na/2) sin[(n+1)a/2 + b]/sin(a/2)   (n = 1, 2, ...)
15. Σ(k=1..n) cos(ka + b) = sin(na/2) cos[(n+1)a/2 + b]/sin(a/2)
16. Σ(k=1..2n−1) cos(kπ/n) = −1;   Σ(k=1..2n−1) sin(kπ/n) = 0
E-5. Miscellaneous Infinite Series

3. 1 − 1/2 + 1/3 − 1/4 + ··· ± 1/n ∓ ··· = loge 2
4. 1 − 1/3 + 1/5 − 1/7 + ··· = π/4
5. 1 − 1/2 + 1/4 − 1/8 + ··· ± 1/2^n ∓ ··· = 2/3
7. 1/(1·3) + 1/(3·5) + 1/(5·7) + ··· + 1/[(2n−1)(2n+1)] + ··· = 1/2
8. 1/(1·2) + 1/(2·3) + 1/(3·4) + ··· + 1/[n(n+1)] + ··· = 1
9. 1/(1·3) + 1/(2·4) + 1/(3·5) + ··· + 1/[n(n+2)] + ··· = 3/4
10. 1/(3·5) + 1/(7·9) + 1/(11·13) + ··· + 1/[(4n−1)(4n+1)] + ··· = 1/2 − π/8
11. 1/(1·2·3) + 1/(2·3·4) + ··· + 1/[n(n+1)(n+2)] + ··· = 1/4
12. 1/(1·2···l) + 1/[2·3···(l+1)] + ··· + 1/[n(n+1)···(n+l−1)] + ··· = 1/[(l−1)(l−1)!]   (l = 2, 3, ...)
13. 1 + 1/2² + 1/3² + ··· = π²/6
14. 1 + 1/2⁴ + 1/3⁴ + ··· = π⁴/90
15. 1 + 1/2⁶ + 1/3⁶ + ··· = π⁶/945
16. 1 + 1/2⁸ + 1/3⁸ + ··· = π⁸/9450

Note: The series Σ(k=1..∞) k^(−σ) is referred to as Riemann's zeta function ζ(σ).

17. 1 + a + a² + ··· = 1/(1 − a)   (|a| < 1, infinite geometric series)
18. Σ(k=1..∞) 1/[k(k+1)] = 1
19. Σ(k=1..∞) 1/[k(k+1)···(k+m−1)] = 1/[(m−1)(m−1)!]   (m = 2, 3, ...)
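Partial sums converge to the stated limits at the expected rates; the sketch below checks entries 3 and 13 with a million terms (the term count and tolerances are ours; for entry 13 the tail after N terms is about 1/N, for entry 3 it is below 1/(N+1)):

```python
import math

N = 10 ** 6

# entry 13: sum of 1/k^2 tends to pi^2/6
partial2 = sum(1.0 / (k * k) for k in range(1, N + 1))
err_zeta2 = abs(partial2 - math.pi ** 2 / 6)

# entry 3: alternating harmonic series tends to ln 2
partial_ln2 = sum((-1.0) ** (k + 1) / k for k in range(1, N + 1))
err_ln2 = abs(partial_ln2 - math.log(2))
```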
E-6. Power Series: Binomial Series
1. (1 ± x)^m = 1 ± mx + [m(m−1)/2!] x² ± [m(m−1)(m−2)/3!] x³ + ··· + (±1)^n [m(m−1)···(m−n+1)/n!] x^n + ···   (m > 0)
2. (1 ± x)^(1/4) = 1 ± x/4 − (1·3)/(4·8) x² ± (1·3·7)/(4·8·12) x³ − (1·3·7·11)/(4·8·12·16) x⁴ ± ···
3. (1 ± x)^(1/3) = 1 ± x/3 − (1·2)/(3·6) x² ± (1·2·5)/(3·6·9) x³ − (1·2·5·8)/(3·6·9·12) x⁴ ± ···
4. (1 ± x)^(1/2) = 1 ± x/2 − (1·1)/(2·4) x² ± (1·1·3)/(2·4·6) x³ − (1·1·3·5)/(2·4·6·8) x⁴ ± ···
5. (1 ± x)^(3/2) = 1 ± 3x/2 + (3·1)/(2·4) x² ∓ (3·1·1)/(2·4·6) x³ + (3·1·1·3)/(2·4·6·8) x⁴ ∓ ···
6. (1 ± x)^(5/2) = 1 ± 5x/2 + (5·3)/(2·4) x² ± (5·3·1)/(2·4·6) x³ − (5·3·1·1)/(2·4·6·8) x⁴ ∓ ···
7. (1 ± x)^(−m) = 1 ∓ mx + [m(m+1)/2!] x² ∓ [m(m+1)(m+2)/3!] x³ + ··· + (∓1)^n [m(m+1)···(m+n−1)/n!] x^n + ···   (m > 0)
8. (1 ± x)^(−1/4) = 1 ∓ x/4 + (1·5)/(4·8) x² ∓ (1·5·9)/(4·8·12) x³ + (1·5·9·13)/(4·8·12·16) x⁴ ∓ ···
9. (1 ± x)^(−1/3) = 1 ∓ x/3 + (1·4)/(3·6) x² ∓ (1·4·7)/(3·6·9) x³ + (1·4·7·10)/(3·6·9·12) x⁴ ∓ ···
10. (1 ± x)^(−1/2) = 1 ∓ x/2 + (1·3)/(2·4) x² ∓ (1·3·5)/(2·4·6) x³ + (1·3·5·7)/(2·4·6·8) x⁴ ∓ ···
11. (1 ± x)^(−1) = 1 ∓ x + x² ∓ x³ + x⁴ ∓ ···
12. (1 ± x)^(−3/2) = 1 ∓ 3x/2 + (3·5)/(2·4) x² ∓ (3·5·7)/(2·4·6) x³ + (3·5·7·9)/(2·4·6·8) x⁴ ∓ ···
13. (1 ± x)^(−2) = 1 ∓ 2x + 3x² ∓ 4x³ + 5x⁴ ∓ ···
14. (1 ± x)^(−5/2) = 1 ∓ 5x/2 + (5·7)/(2·4) x² ∓ (5·7·9)/(2·4·6) x³ + (5·7·9·11)/(2·4·6·8) x⁴ ∓ ···
15. (1 ± x)^(−3) = 1 + (1/2)(∓2·3x + 3·4x² ∓ 4·5x³ + 5·6x⁴ ∓ ···)
16. (1 ± x)^(−4) = 1 + [1/(1·2·3)](∓2·3·4x + 3·4·5x² ∓ 4·5·6x³ + 5·6·7x⁴ ∓ ···)
17. (1 ± x)^(−5) = 1 + [1/(1·2·3·4)](∓2·3·4·5x + 3·4·5·6x² ∓ 4·5·6·7x³ + 5·6·7·8x⁴ ∓ ···)

(All of the series 1 to 17 converge for |x| < 1.)

Note: The series for (1 ± x)^m reduces to finite sums (Binomial Theorem, Sec. 1.4-1) if m = 1, 2, 3, ....
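The coefficient patterns above are easy to generate by a running product; the sketch below sums series 10 for both signs at a sample point inside the convergence circle (the sample x and term count are ours):

```python
x = 0.3
coef = 1.0
s_plus = 1.0   # series for (1 + x)^(-1/2)
s_minus = 1.0  # series for (1 - x)^(-1/2)
for k in range(1, 200):
    coef *= (2 * k - 1) / (2 * k)   # builds 1*3*5.../(2*4*6...)
    s_plus += ((-1) ** k) * coef * x ** k
    s_minus += coef * x ** k

err_plus = abs(s_plus - (1 + x) ** -0.5)
err_minus = abs(s_minus - (1 - x) ** -0.5)
```

With |x| = 0.3 the truncation error after 200 terms is far below double precision, so both residuals are at rounding level.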
E-7. Power Series for Elementary Transcendental Functions
1. e^z = 1 + z + z²/2! + z³/3! + ···
2. sin z = z − z³/3! + z⁵/5! − ···
3. cos z = 1 − z²/2! + z⁴/4! − ···
4. sinh z = z + z³/3! + z⁵/5! + ···
5. cosh z = 1 + z²/2! + z⁴/4! + ···
6. sin(z + a) = sin a + z cos a − (z²/2!) sin a − (z³/3!) cos a + (z⁴/4!) sin a + ··· + (z^n/n!) sin(a + nπ/2) + ···
7. cos(z + a) = cos a − z sin a − (z²/2!) cos a + (z³/3!) sin a + (z⁴/4!) cos a − ··· + (z^n/n!) cos(a + nπ/2) + ···

(Series 1 to 7 converge for |z| < ∞.)

8. loge(1 + z) = z − z²/2 + z³/3 − z⁴/4 + ···   (|z| < 1)
9. arcsin z = z + z³/(2·3) + 1·3 z⁵/(2·4·5) + 1·3·5 z⁷/(2·4·6·7) + ···   (|z| < 1)
10. sinh⁻¹ z = z − z³/(2·3) + 1·3 z⁵/(2·4·5) − 1·3·5 z⁷/(2·4·6·7) + ···   (|z| < 1)
11. arctan z = z − z³/3 + z⁵/5 − ···   (|z| < 1)
12. tanh⁻¹ z = (1/2) loge[(1 + z)/(1 − z)] = z + z³/3 + z⁵/5 + ···   (|z| < 1)

Note: Series 11 converges also for |z| = 1, unless z = ±i.

13. tan z = Σ(k=1..∞) 2^(2k)(2^(2k) − 1) B_k z^(2k−1)/(2k)! = z + z³/3 + 2z⁵/15 + 17z⁷/315 + ···   (|z| < π/2)
14. cot z = 1/z − Σ(k=1..∞) 2^(2k) B_k z^(2k−1)/(2k)! = 1/z − z/3 − z³/45 − 2z⁵/945 − ···   (0 < |z| < π)

where the B_k are the Bernoulli numbers defined in Sec. 21.5-2. Series 13 and 14 yield similar series for tanh z and coth z with the aid of Eq. (21.2-32).

15. arctan z = ±π/2 − 1/z + 1/(3z³) − 1/(5z⁵) + ···   (|z| > 1, z² ≠ −1; upper sign for Re z > 0)
16. arctan z = [z/(1 + z²)][1 + (2/3)(z²/(1 + z²)) + ((2·4)/(3·5))(z²/(1 + z²))² + ···]   (z² ≠ −1)
17. coth⁻¹ z = 1/z + 1/(3z³) + 1/(5z⁵) + ···   (|z| > 1)
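The explicit leading terms of series 13 and 14 can be checked against the library functions at a small argument (the sample z and tolerances are ours; the next omitted terms are of order 10^-8 and 10^-9 respectively):

```python
import math

z = 0.2

# series 13, truncated after the z^7 term
tan_series = z + z ** 3 / 3 + 2 * z ** 5 / 15 + 17 * z ** 7 / 315
err_tan = abs(tan_series - math.tan(z))

# series 14, truncated after the z^5 term
cot_series = 1 / z - z / 3 - z ** 3 / 45 - 2 * z ** 5 / 945
err_cot = abs(cot_series - 1 / math.tan(z))
```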
INFINITE PRODUCTS AND CONTINUED FRACTIONS
E-8. Some Infinite Products (see also Sec. 7-6)
1. sin z = z Π(k=1..∞) [1 − z²/(k²π²)]   (|z| < ∞)
2. cos z = Π(k=1..∞) {1 − [2z/((2k−1)π)]²}   (|z| < ∞)
3. sinh z = z Π(k=1..∞) [1 + z²/(k²π²)]
4. cosh z = Π(k=1..∞) {1 + [2z/((2k−1)π)]²}
E-9. Some Continued Fractions (see also Secs. 4.8-8 and 20.5-7)

1. tan z = z/(1 − z²/(3 − z²/(5 − z²/(7 − ···))))
2. e^z = 1/(1 − z/(1 + z/(2 − z/(3 + z/(2 − z/(5 + z/(2 − z/(7 + ···))))))))   (|z| < ∞)
3. loge(1 + z) = z/(1 + z/(2 + z/(3 + 4z/(4 + 4z/(5 + 9z/(6 + 9z/(7 + ···)))))))   (z in the plane cut from −1 to −∞)
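Both continued fractions can be evaluated by working backward from a finite depth; the sketch below (the function names, depths, and test points are ours, and the numerator pattern for the logarithm, a_1 = z and a_{2k} = a_{2k+1} = k²z, is read off the expansion above) compares them with the library functions:

```python
import math

def log1p_cf(z, depth=40):
    # loge(1+z) = z/(1 + z/(2 + z/(3 + 4z/(4 + ...)))): numerators 1,1,1,4,4,9,9,...
    def num(k):
        return z if k == 1 else ((k // 2) ** 2) * z
    val = 0.0
    for k in range(depth, 0, -1):
        val = num(k) / (k + val)
    return val

def exp_cf(z, depth=30):
    # e^z = 1/(1 - z/(1 + z/(2 - z/(3 + z/(2 - z/(5 + ...))))))
    def den(i):
        if i < 2:
            return 1.0
        return 2.0 if i % 2 == 0 else float(i)  # 1, 1, 2, 3, 2, 5, 2, 7, ...
    val = den(depth)
    for i in range(depth - 1, -1, -1):
        sign = -1.0 if i % 2 == 0 else 1.0     # signs alternate -, +, -, + ...
        val = den(i) + sign * z / val
    return 1.0 / val

err_log = abs(log1p_cf(0.8) - math.log(1.8))
err_exp = abs(exp_cf(0.7) - math.exp(0.7))
```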
Table E-1. Operations with Series*

Let s1 = 1 + a1 x + a2 x² + a3 x³ + a4 x⁴ + ···, s2 = 1 + b1 x + b2 x² + b3 x³ + b4 x⁴ + ···, and write the result as s3 = 1 + c1 x + c2 x² + c3 x³ + c4 x⁴ + ···.

s3 = s1 s2:
  c1 = a1 + b1;  c2 = a2 + a1 b1 + b2;  c3 = a3 + a2 b1 + a1 b2 + b3;
  c4 = a4 + a3 b1 + a2 b2 + a1 b3 + b4

s3 = s1/s2:
  c1 = a1 − b1;  c2 = a2 − (b1 c1 + b2);  c3 = a3 − (b1 c2 + b2 c1 + b3);
  c4 = a4 − (b1 c3 + b2 c2 + b3 c1 + b4)

s3 = s1^n:
  c1 = n a1;  c2 = n a2 + (1/2)(n−1) c1 a1;
  c3 = n a3 + (n−1) c1 a2 + (1/6)(n−1)(n−2) c1 a1²;
  c4 = n a4 + (n−1) c1 a3 + (1/2) n(n−1) a2² + (1/2)(n−1)(n−2) c1 a1 a2 + (1/24)(n−1)(n−2)(n−3) c1 a1³

s3 = s1^(1/2):
  c1 = a1/2;  c2 = (1/2) a2 − (1/8) a1²;  c3 = (1/2) a3 − (1/4) a1 a2 + (1/16) a1³;
  c4 = (1/2) a4 − (1/4) a1 a3 − (1/8) a2² + (3/16) a1² a2 − (5/128) a1⁴

s3 = s1^(−1/2):
  c1 = −a1/2;  c2 = (3/8) a1² − (1/2) a2;  c3 = (3/4) a1 a2 − (1/2) a3 − (5/16) a1³;
  c4 = (3/4) a1 a3 + (3/8) a2² − (1/2) a4 − (15/16) a1² a2 + (35/128) a1⁴

s3 = s1^(−1):
  c1 = −a1;  c2 = a1² − a2;  c3 = 2 a1 a2 − a3 − a1³;
  c4 = 2 a1 a3 + a2² − a4 − 3 a1² a2 + a1⁴

s3 = s1^(−2):
  c1 = −2 a1;  c2 = 3 a1² − 2 a2;  c3 = 6 a1 a2 − 2 a3 − 4 a1³;
  c4 = 6 a1 a3 + 3 a2² − 2 a4 − 12 a1² a2 + 5 a1⁴

s3 = exp(s1 − 1):
  c1 = a1;  c2 = a2 + (1/2) a1²;  c3 = a3 + a1 a2 + (1/6) a1³;
  c4 = a4 + a1 a3 + (1/2) a2² + (1/2) a1² a2 + (1/24) a1⁴

s3 = 1 + loge s1:
  c1 = a1;  c2 = a2 − (1/2) a1 c1;  c3 = a3 − (1/3)(a2 c1 + 2 a1 c2);
  c4 = a4 − (1/4)(a3 c1 + 2 a2 c2 + 3 a1 c3)

* From M. Abramowitz and I. A. Stegun (eds.), Handbook of Mathematical Functions, National Bureau of Standards, Washington, D.C., 1964.
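Any row of Table E-1 can be cross-checked against a direct power-series recursion. The sketch below does this for s3 = s1^(−1), comparing the tabulated closed forms with the standard reciprocal-series recursion c_n = −(a1 c_{n−1} + ··· + a_n c_0), c_0 = 1 (the sample coefficients are ours):

```python
# sample coefficients of s1 = 1 + a1 x + a2 x^2 + a3 x^3 + a4 x^4 + ...
a1, a2, a3, a4 = 0.7, -0.4, 0.25, -0.1

# closed forms from Table E-1 for s3 = s1^(-1)
c1 = -a1
c2 = a1 ** 2 - a2
c3 = 2 * a1 * a2 - a3 - a1 ** 3
c4 = 2 * a1 * a3 + a2 ** 2 - a4 - 3 * a1 ** 2 * a2 + a1 ** 4

# independent check: reciprocal-series recursion
a = [1.0, a1, a2, a3, a4]
c = [1.0]
for n in range(1, 5):
    c.append(-sum(a[j] * c[n - j] for j in range(1, n + 1)))

err_table = max(abs(c1 - c[1]), abs(c2 - c[2]), abs(c3 - c[3]), abs(c4 - c[4]))
```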
BIBLIOGRAPHY
Abramowitz, M., and I. A. Stegun (eds.): Handbook of Mathematical Functions, National Bureau of Standards, Washington, D.C., 1964.
Bierens de Haan, D.: Nouvelles tables d'intégrales définies, Stechert, New York, 1939.
Byrd, P. F., and M. D. Friedman: Handbook of Elliptic Integrals for Engineers and Physicists, Springer, Berlin, 1954.
Gröbner and Hofreiter: Integraltafel, 2d ed., Springer, Vienna, 1958.
Lindman, C. F.: Examen des nouvelles tables d'intégrales définies de M. Bierens de Haan, 1944.
Luke, Y. L.: Integrals of Bessel Functions, McGraw-Hill, New York, 1962.
Petit Bois, G.: Tables of Indefinite Integrals, Dover, New York, 1961.
Ryshik, I. M., and I. S. Gradstein: Tables of Series, Products, and Integrals, Academic Press, New York, 1964.
APPENDIX F

NUMERICAL TABLES

Table F-1. Squares
Table F-2. Logarithms
Table F-3. Radians, Sine, Cosine, Tangent, Cotangent, and Their Logarithms
Table F-4. Exponential and Hyperbolic Functions and Their Logarithms
Table F-5. Natural Logarithms
Table F-6. Integral Sine and Cosine
Table F-7. Exponential Integral
Table F-8. Complete Elliptic Integrals
Table F-9. Factorials and Binomial Coefficients
Table F-10. Gamma Function
Table F-11. Bessel Functions
Table F-12. Legendre Polynomials
Table F-13. Error Function
Table F-14. Normal Distribution
Table F-15. Normal Curve Ordinates
Table F-16. t Distribution
Table F-17. χ² Distribution
Table F-18. F Distribution
Table F-19. Random Numbers
Table F-20. Normal Random Numbers
Table F-21. sin x/x
Table F-22. Chebyshev Polynomials

The following numerical tables are intended less for extensive numerical computations than as quantitative background material indicating the behavior of the most important transcendental functions. The following numerical constants are frequently useful:

π = 3.1415927          log10 π = 0.4971499
2π = 6.2831853         log10 2π = 0.7981799
1/(2π) = 0.1591549     log10 [1/(2π)] = 9.2018201 − 10
π² = 9.8696044         log10 π² = 0.9942997
e = 2.71828183         log10 e = 0.43429448
loge 10 = 1/(log10 e) = 2.302585093
Table F-1. Table of Squares*

The table gives the squares n² of the integers n = 0 to 999; the row label supplies the leading digits of n and the column label (0 to 9) its final digit.

* This table is reprinted from Table XXVII of Fisher and Yates, Statistical Tables for Biological, Agricultural, and Medical Research, published by Oliver & Boyd, Ltd., Edinburgh, by permission of the authors and publishers.

Exact squares of four-figure numbers can be quickly calculated from the identity (a ± b)² = a² ± 2ab + b². Thus 693.3² = 480249 + 415.8 + 0.09 = 480664.89.
Table F-2. Five-place Common Logarithms of Numbers, 100–155†

The table gives the five decimal places of log10 N for N = 100 to 155, together with columns of proportional parts for interpolation; an asterisk (*) indicates a change in the first two decimal places.

† From R. H. Perry, Engineering Manual, McGraw-Hill, New York, 1959.
19.5 23.4 27.3 31.2 35.1
38
37
J6^
3.8 7.6 11.4 15.2
3.7 7.4 11.1 14.8
3.6 7.2 10.8 14.4
19.0
18.5
18.0
22.8 26.6 30.4 34.2
22.2 25.9 29.6 33.3
21.6 25.2 28.8 32.4
35
34
33
3.5
3.4
3.3
7.0 10.5 14.0 17.5 21.0 24.5 28.0 31.5
6.8 10.2 13.6 17.0 20.4 23.8 27.2 30.6
6.6 9.9 13.2 16.6 19.8 23.1 26.4 29.7
32
31
30
3.2 6.4 9.6 12.8 16.0 19.2 22.4 25.6 28.8
3.1 6.2 9.3 12.4 15.5 18.6 21.7 24.8 27.9
3.0 6.0 9.0 12.0 15.0 18.0 21.0 24.0 27.0
♦107
1 2 3 4 5 6 7 8 9
Proportional parts
APPENDIX F
Table F-2.
994
Five-place Common Logarithms of Numbers (Continued) 155-210
No.
1/
0
i
2
3
4
5
6
1
8
155 156 157 158 159
19
033 312 590 866 140
061 340 618 893 167
089 368 645 921 194
117 396 673 948 222
145 424 700 976 249
173 451 728 •003 276
201 479 756 ♦030 303
229 507 783 •058 330
257 535 811 •085 358
285 562 838 •112 385
412 683 952 219 484
439 710 978 245 511
466 737 •005 272 537
493 763 •032 299 564
520 790 •059 325 590
548 817 •085 352 617
575 844 ♦112 378 643
602 871 •139 405 669
629 898 ♦165 431
656 925 •192 458 722
748 011 272 531 789
775 037 298 557 814
801 063 324 583 840
827 089 350 608 866
854 115 376 634 891
880 141 401
906 168 427 686 943
932 194 453 712 968
958 220 479
737 994
985 246 505 763 •019
045 300 553 805 055
070 325 578 830 080
096 350 603 855 105
121 376 629
147 401 654 905 155
198 452 704 955 204
223 477 729 980 229
249 502 754 •005 254
274 528 776 •030 279
160 161 162 163 164 165 166 167 168 169
170 171 172 173 174
20
21
22
23
24
880 130
660 917 172 426 679
930 180
696
' 9
Proportional parts
1 2
3 4 5
6 7 8 9
1 2 3 4 5 6 7 8 9
29
28
2.9 5.8 8.7 11.6
2.8 5.6 8.4 11.2
14.5
14.0 16.8
17.4 20.3 23.2 26.1
19.6 22.4 25.2
27
26
2.7 5.4 8.1 10.8 13.5 16.2 18.9 21.6
2.6 5.2 7.8 10.4 13.0 15.6 18.2 20.8
OA.
OO
9
A
25
175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194
25
26
27
28
195 196 197 198 199
29
200 201 202 203 204
30
205 206 207 208
31
209
32
210 No.
L
304 551 797 042 285
329 576 822 066 310
353 601 846 091 334
378 625 871 115 358
403 650 895 139 382
428 674 920 164 406
452 699 944 188 431
477 724 969 212 455
502 748 993 237 479
527 768 007 245 482
551 792 031 269 505
575 816 055 293 529
600 840 079 316 553
624 864 102 340 576
648 888 126 364 600
672 912 150 387 623
696 935 174 411 647
720 959 198
435 670
744 983 221 458 694
717 951 184 416 646
741 975 207 439 669
764 998 231 462 692
788 •021 254 485 715
811 •045 277 508 738
834 •068 300 531 761
858 ♦091 323 554 784
881 •114 346 577 807
905 •138 370 600 830
928 •161 393 623 853
875 103 330 556 780
898 126 353 578 803
921 149 375 601 825
944 172 398
990
623 847
967 194 421 646 870
217 443 668 892
♦012 240 466 691 914
•035 262 488 713 937
•058 285 511 735 959
•081 308 533 758 981
003 226 447 667 885
026 248 469 688 907
048 270 491 710 929
070 292 513 732 951
092 314 535 754 973
115 336 557 776 994
137 358 579 798 ♦016
159 380 601 820 •038
181 403 623 842 •060
203 425 645 863 •081
103 320 535 750 963
125 341 557 771 984
146 363 578 792 •006
168 384 600 814 •027
190 406 621 835 •048
211 428 643 856 ♦069
233 449 664 878
255 471 685 899 •112
276 492 707 920 •133
298 514 728 942 ♦154
175 387 597 806 015
197 408 618 827 035
218 429 639 848 056
239 450 660 869 077
222
243
263
284
0
1
2
3
260 471 681 890 098 305
1 4
281 492 702 911 118 325 5
* Indicates change in the first two decimal places.
♦091
527 773 •018
261 503
302 513 723 931 139
323 534 744 952 160
345 555 765 973 181
366 576 785 994 201
346
366
387
408
1 6 1 7
8
9
1
° K
2 3 4
5.0 7.5 10.0 12.5 15.0 17.5 20.0
5
6 7 8 Q
24 1 2 3 4 5 6 7 8 9
2.4 4.8 7.2 9.6 12.0 14.4 16.8 19.2 *A.U 23
1
2 3 4 5 6 7 8 9
2.3 4.6 6.9 9.2 11.5 13.8 16.1
18.4 on
v
*v.t
22 1
2 3 4 5 6 7 8 9
2.2 4.4 6.6 8.8 11.0 13.2 15.4 17.6 19.8
Proportional parts
NUMERICAL TABLES
995
Table F-2. Five-place Common Logarithms of Numbers (Continued) 210-265
Proportional parts
No.
L
6
1
2
3
5
6
1
210 211 212 213 214
32
33
222 428 634 838 041
243 449 654 858 062
263 469 675 879 082
284 490 695 899 102
305 511 715 919 122
325 531 736 940 143
346 552 756 960 163
366 572 777 980 183
593 797 ♦001 203
264 465 666 866 064
284 486 686 885 084
304 506 706 905 104
325 526 726 925 124
345 546 746 945 143
365
34
244 445 646 846 044
385 586 786 985 183
405 606 806 ♦005 203
242
282 479 674 869 064
301 498 694 889 083
321 518 713 908 102
341 537 733 928 122
361 557 753 947 141
380
400
577 772 967 160
596 792 986
025
262 459 655 850 044
180
420 616 811 •005 199
218 411 603 793 984
238 430 622 813 •003
257 449 641 832 •021
276 468 660 851 •040
295 488 679 870 •059
315 507 698 889 •078
334 526 717 908
372 564 755
392 583 774
946
965
•097
353 545 736 927 •116
♦135
•154
1 2
173 361 549 736
211 399 586 773 959
229 418 605 791 977
248 436 624 810 996
267 455 642 829 •014
286 474 661 847 •033
305 493
324 511
680 866 •051
698 884 ♦070
342 530 717 903 •088
3 4
922
192 380 568 754 940
107 291 475 658 840
125 310 493 676 858
144 328 511 694 876
162 346 530 712 894
181 365 548 731 912
199 383
566 749 931
218 401 585 767 949
021 202 382 561 739
039 220 399 579 757
057 238 417 596 775
075 256 435 614 792
093 274 453 632 810
112 292 471 650 828
917 094 270 445 620
934 111 287 463 637
952 129 305 480 655
970
146 322 498 672
987 164 340 515 690
794 967 140
811 985 157 329 500
829 •002 175 346 518
846 •019 192 364 535
863
4
8
9
387
408 613 818 •021 224 21
215 216 217 218 219 220 221 222 223
224
439 635 830 35
225 226 227 228 229 230 231 232
36
233 234
566 766 965 163
425 626 826
•025 223
1 2
3 4 6 6
7 8
37
237 238 239 240 241 242 243 244
245 246 247 248 249 250 251 252
38
39
40
253 254 255 256 257 258 259
312 483
41
260 261 262
263 264
42
265 No. ♦
L
603 785 967
438 621 803 985
273 457 639 822 •003
130 310 489 668 846
148 328 507 686 863
166 346 525 703 881
184 364 543 721 899
•005 182 358 533 707
•023 199 375 550 724
♦041 217
•058 235 410 585 759
•076 252 428 602 777
♦037 209 381 552
881 •054 226 398 569
898 •071 243 415 586
933 •106 278 449 620
950 •123 295 466 637
756 926 •095 263 430
773 943 ♦III 280
790 960 •128
447
296 464
807 976 •145 313 481
236 420
393 568 742
915 ♦088 261 432 603
254
654 824 993 162 330
671 841 ♦010 179 347
688 858 •027 196 364
705 875 ♦044 212 380
722 892 ♦061 229 397
739 909 •078 246 414
497 664 830 996 160
514 681 847 ♦012 177
531 697 863
564 731
193
547 714 880 ♦045 210
896 •062 226
581 747 913 •078 243
597 764 929 •095 259
614 780 946 ♦111 275
631 797 963 •127 292
814 979 ♦144 308
325
341
357
374
390
406
423
439
456
472
0
1
2
3
4
5
6
7
8
9
'•029
Indicates change in the first two decimal places.
647
6.3 8.4 10.5 12.6 14.7
16.8
9 1 lO.S
20
5 6 7
2.0 4.0 6.0 8.0 10.0 12.0 14.0
9
16.0 18.0
1 2 3 4 5 6 7 8 9
3.8 5.7 7.6 9.5 11.4 13.3 15.2 17.1
8
235 236
4.2
19 1.9
18 1 2 3 4 5 6 7 8 9
x.o
3.6 5.4 7.2 9.0 10.8 12.6 14.4 16.2
Proportional parts
APPENDIX F Table F-2
996
Five-place Common Logari thms of Numbers (Continued) 265-320
No.
L
0
i
2
3
4
5
6
1
8
9
265 266 267 268 269
42
325 488 651 813 975
341 504 667 830 991
357 521 684 846 •008
374 537 700 862 •024
390 553 716 878 •040
406 570 732 894 •056
423 586 749 911 •072
439
602 765 927 •088
456 619 781 943 •104
472 635 797 959 •120
270 271 272 273 274
43
136 297 457 616 775
152 313 473 632 791
169 329 489 648 807
185 345 505 664 823
201 361 521 680 838
217 377 537 696 854
233 393 553 712 870
249 409 569 727 886
265 425 584 743 902
281 441 600 759 917
933 091 248 404 560
949 107 264 420 576
965 122 279 436 592
981
996 154 311 467 623
•012 170 326 483 638
•028 185 342 498 654
•044
138 295 451 607
201 358 514 669
•059 217 373 529 685
•075 232 389 545 700
716 871 025 179 332
731 886 040 194 347
747 902 056 209 362
762 917 071 225 378
778 932 086 240 393
793 948 102 255 408
809 963 117 271 423
824 979 133 286 439
840 994 148 301 454
855 •010 163
484 637 788 939 090
500 652 803 954 105
515 667 818 969 120
530 682 834 984 135
545 697 849 •000 150
561 712 864 •015 165
576 728 879 •030 180
591 743 894 •045 195
606 758 909 •060 210
621 773 924 •075 225
240 389 538 687 835
255 404 553 702 850
270 419 568 716 864
285 434 583 731 879
300 449 598 746 894
315 464 613 761 909
330 479 627 776 923
345 494 642 790 938
359 509 657 805 953
374 523 672 820 967
982 129 276 422 567
997 144 290 436 582
•012 159
•026 173 319 465 611
♦041
188 334 480 625
•056 202 349 494 640
•070 217 363 509 654
•085 232 378 524 669
•100
305 451 596
392 538 683
•115 261 407 553 698
712 857 001 144 287
727 871 015 159 302
741 886 029 173 316
756 900 044 187 330
770 914 058 202 344
784 929 073 216 359
799 943 087 230 373
813 958 101 245 387
828 972 116 259 402
842 986 130 273 416
430 572 714 855 996
444 586 728 869 ♦010
458 601 742 883 •024
473 615 756 897 •038
487 629 770 911 ♦052
501 643 785 926 •066
515 657 799 940 •080
530 671 813 954 •094
544 686 827 968 •108
558 700 841 982 •122
136 276 415 554 693
150 290 429 568 707
164 304 443 582 721
178 318 457 596
192
748
220 360 499 638 776
234 374 513 651 790
248 388 527 665
734
206 346 485 624 762
803
262 402 541 679 817
831 969 106
845
859
872
900
243 379
982 120 256 393
996 133 270 406
•010 147 284 420
886 •024
927 •065 202 338 474
941 •079 215 352 488
955 •092 229 365 501
515
529
542
637
0
1
2
275 276 277 278 279 280 281 282 283 284
285 286 287 288 289
44
45
46
290 291 292 293 294 295 296 297 298 299
300 301 302 303 304
47
48
305 306 307 308 309 310 311 312 313 314 315 316 317 318 319
49
50
320 No.
L
332 471 610
17
246
161 297 433
•037 174 311 447
914 ♦051 188 325 461
556
569
583
596
610
623
3
4
5
6
7
8
* Indicates change in the first two decimal places.
Proportional parts
317 469
9 1
1 2
3 4 5 6 7
8 9
1.7
3.4 5.1 6.8 8.5 10.2 11.9 13.6
15.3
16 1 2 3 4 5 6 7 8 o
1.6 3.2 4.8
6.4 8.0 9.6 11.2 12.8 141
15 1
2 8 4 5 6
7 8 9
1.5 3.0 4.5 6.0 7.6 9.0 10.6 12.0 13.5
14 1 2 3 4 5 6 7 8 9
1.4
2.8 4.2 5.6 7.0 8.4 9.8 11.2 12.6
Proportional parts
997
NUMERICAL TABLES
Table F-2. Five-place Common Logarithms of Numbers (Continued) 320-375
6
7
"ft
*
515 651 786 920 055
529 664 799 934 068
542 678 813 947 081
556 691 826 961 095
569 705 840 974 108
583 718 853 987 121
596 732 866 ♦001 135
610 745 880 •014 148
623 759 893 •028 162
637 772 907 •041 175
188 322 455 587 720
202 335 468 601 733
215 348 481 614 746
228 362 495 627 759
242 375 508 640 772
255 388 521 654 786
268 402 534 667 799
282 415 548 680 812
295 428 561 693 825
308 441 574 706 838
851 983 114 244 375
865 996 127 257 388
878 ♦009 140 271 401
891 •022 153 284 414
904 •035 166 297 427
917 •048 179 310 440
930 ♦061 192 323 453
943 •075 205 336 466
957 •088 218 349 479
970 •101 231 362 492
504 634
517 647 776 905 033
530 660 789 917 046
543 673 802 930 058
556 686 815 943 071
569 699 827 956 084
582 711 840 969 097
595
763 892 020
724 853 982 110
608 737 866 994 122
621 750 879 •007 135
340 341 342 343 344
148 275 403 529 656
161 288 415 542 668
173 301 428 555 681
186 314 441 567 694
199 326 453 580 706
212 339 466 593 719
224 352 479 605 732
237 365 491 618 744
250 377 504 631 757
263 390 517 643 769
345 346 347 348 349
782 908 033 158 283
795 920 045 170 295
807 933 058 183 307
820 945 070 195 320
832 958 083 208 332
845 970 095 220 345
857 983 108 233 357
870 995 120 245
370
883 •008 133 258 382
895 •020 145 270 394
407 531 654 777 900
419 543 667 790 913
432 555 679 802 925
444 568
456 580 704 827 949
469 593
716 839 962
481 605 728
494 617 741 864 986
506 630 753 876 998
518 642 765 888 •011
023 145 267 388 509
035 157 279 400 522
047 169 291 413 534
303 425 546
072 194 315 437 558
084 206 328 449 570
108 230 352 473 594
121 242 364 485 606
133 255 376 497 618
630 751 871 991 110
642 763 883 •003 122
654 775 895 •015 134
666 787 907 •027 146
678 799 919 •038 158
691 811 931 ♦050 170
♦062 182
715 835 955 •074 194
727 847 967 •086 205
739 859 979 •098 217
229 348 467 585 703
241 360 478 597 714
253 372 490 608 726
265 384 502 620 738
277 396 514 632 750
289 407 526 644 761
301 419 538 656 773
313 431 549 667 785
324 443 561 679 797
336 455 573 691 808
820 937 054
844 961 078 194 310
855 972 089
867 984 101
879 996
891 •008 124
206
322
217 334
229 345
241 357
902 •019 136 252 368
914
113
171 287
832 949 066 183 299
♦031 148 264 380
926 •043 159 276 392
403
415
426
438
449
461
473
484
496
507
0
1
2
3
4
5
7
8
9
No.
L"
320 321 322 323 324
50
51
325 326 327 328 329 330 331 332 333 334
52
335 336 337 338 339
53
54
350 351 352 353 354 355 356 357 358 359
360 361 362 363 364
55
56
365 366 367 368 369
370 371 372
57
373 374
375
No.
L
o 1 1
1
*
691 814 937 060 182
4
•4
* Indicates changein the firsttwo decimal places.
851 974 096 218
340 461 582
703 823 943
6
Proportional parts
14
1 2 3 4 5 6 7 8 g
1.4 2.8 4.2 5.6 7.0
8.4 9.8 11.2 1Q
A
18 1 2 8 4 5 6 7 8 9
1.3 2.6 3.9 5.2 6.5 7.8 9.1 10.4 11.7
1 2 3 4 6 6
1.2 2.4
12
3.6
4.8 6.0 7.2
7
8.4
8 9
9.6 10.8
Proportional parts
998
APPENDIX F
Table F-2. Five-place Common Logarithms of Numbers (Continued) 375-430
5
6
7
415 530 646 761 875
426 542 657 772
565 680 795 910
461 577 692 807 921
473 588
887
438 553 669 784 898
703 818 933
484 600 715 830 944
496 611 726 841 956
507 623 738 852 967
978 093 206 320 433
990 104 218 331 444
•001 115 229 343 456
•013 127 240 354 467
•024 138 252 365 478
♦035 149 263 377 490
♦047 161 275 388 501
•058 172 286 399 512
•070 184 297 411 524
•081 195 309 422 535
546 659 771 883 995
557 670 782 894 •006
569 681 794 906 •017
580 692 805 917 •028
591 704 816 928 •040
602 715 827 939 ♦051
614 726 838 950 ♦062
625 737 850 961 •073
636 749 861 973 •084
647 760 872 984 •095
106 218
118 229 340 450 561
129 240 351 461 572
140 251 362 472 583
151 262 373 483 594
162 273 384 494 605
173 284 395 506 616
671
693 802 912 •021 130
704 813 923 •032 141
715 824 934 ♦043 152
726 835 945 ♦054 163
228
0
375 376 377 378 379
57
403 519 634 749 864
58
385 386 387 388 389 390 391 392 393 394
59
Proportional parts
3
L
380 381 382 383 384
ft
2
No.
329 439 550
"i
• 4 449
9
627
195 306 417 528 638
207 318 428 539 649
737 846 956 •065 173
748 857 966 •076 184
759 868 977 •086 195
184 295 406 517
780 890 999 108
682 791 901 •010 119
403 404
206 314 423 531 638
217 325 433 541 649
336 444 552 660
239 347 455 563 670
249 358 466 574 681
260 369 477 584 692
271 379 487 595 703
282 390 498 606 713
293 401 509 617 724
304 412 520 627 735
405 406 407 408 409
746 853 959 066 172
756 863 970 077 183
767 874 981 087 194
778 885 991 098 204
788 895 •002 109 215
799 906 ♦013 119 225
810 917 ♦023 130 236
821 927 •034 140 247
831 938 •045 151 257
842 949 •055 162 268
410 411 412 413 414
278 384 490 595 700
289
300 405 511 616 721
310
321
395 500 606 711
416 521 627 731
426 532 637 742
331 437 542 648 752
342 448 553 658 763
352 458 563 669 773
363 469 574 679 784
374 479 584 690 794
415 416 417 418 419
805 909
826 930 034 138 242
836 941 045 149 252
847 951 055 159 263
857 962
868 972
066 170 273
076 180 284
878 982 086 190 294
888
014 118 221
815 920 024 128 232
993 097 201 304
899 •003 107 211 315
325 428 531 634 737
335 439 542 644 747
346 449 552 655 757
356 459
562 665 767
366 469 572 675 778
377 480 583 685 788
387 490 593 696 798
397 500 603 706 808
408 511 614 716 818
418 521 624 726 829
839 941 043 144 246
849
951 053 155 256
859 961 063 165 266
870 972 073 175 276
880 982 083 185 286
890 992 094 195 296
900 •002 104 205 306
910 •012 114 215 317
921 •022 124 225 327
931 •033 134 236 337
347
357
367
377
387
397
407
417
428
438
0
1
2
4
5
6
7
8
9
60
400 401 402
61
2 3 4 5 6
7
660 770 879 988 097
395 396 397 398 399
J.1
1
8 9
1.1
2^2 3.3 4.4 5.5 6.6 7.7
8^8 9.9
1U
62
420 421 422 423 424 425
426 427 428 429
63
430
No. IL
13
* Indicates change in the first two decimal places.
1
1.0
2
2*0
3
3.0 4.0
4 5 6
7 8 o
s!o 6.0 7.0 8.0 on
Propori lonal parts
999
NUMERICAL TABLES
Table F-2. Five-place Common Logarithms of Numbers (Continued) 430-485
6
No. 1 L 430 431 432 433
63
1 • 1 2
3
4
5
6
1
8
387 488 589 689 789
397 498 599 699 799
407 508 609 709 809
417 518 619 719 819
428 528 629 729 829
438 538 639 739 839
1 9
347 448 548 649 749
357 458 558 659 759
468 568 669 769
377 478 579 679 779
438 439
849 949 048 147 246
859 959 058 157 256
869 969 068 167 266
879 979 078 177 276
889 988 088 187 286
899 998 098 197 296
909 •008 108 207 306
919 •018 118 217 316
929 ♦028 128 227 326
939 ♦038 137 237 335
440 441 442 443 444
345 444 542 640 738
355 454 552 650 748
365 464 562 660 758
375 473 572 670 768
385 483 582 680 777
395 493 591 689 787
404 503 601 699 797
414 513 611 709 807
424 523 621 719 816
434 532 631 729 826
445 446 447 448 449
836 933 031 128 225
846 943 040 137 234
856 953 050 147 244
865 963 060 157 254
875 972
895
263
885 982 079 176 273
992 089 186 283
904 •002 099 196 292
914 ♦011 108 205 302
924 ♦021 118 215 312
450 451 452 453 454
321 418 514 610 706
331 427 523 619 715
341 437 533 629 725
350 447 543 639 734
360 456 552 648 744
369 466 562 658 753
379 475 571 667 763
389 485 581 677 773
398 495
408
591 686 782
504 600 696 792
455 456 457 458 459
801 896 992 087 181
811 906 •001 096 191
820 916 •011 106 200
830 925 •020 115 210
839 935 •030 124 219
849 944 •039 134 229
858 954 •049 143 238
868 963 •058 153 247
877 973 ♦068 162 257
♦077 172 266
460 461 462 463 464
276 370 464 558 652
285 380 474 567 661
295 389 483 577 671
304 398 492 586 680
314 408 502 596 689
323 417 511 605 699
332 427 521 614 708
342 436 530 624 717
351 445 539 633 727
361 455 549 642 736
465 466 467 468 469
745 839 932 025 117
755 848 941 034 127
764 857 950 043 136
773 867 960 052 145
783 876 969 062 154
792 885 978 071 164
801 894 987 080 173
811 904 997 090 182
820 913 ♦006 099 191
829 922 •015 108 201
470 471 472 473 474
210 302 394 486 578
219 311 403 495 587
228 321 413 504 596
238 330 422 514 605
247 339 431 523 614
256 348 440 532 624
265 357 449 541 633
274 367 459 550 642
284 376 468 560 651
293 385 477 569 660
475 476 477 478 479
669 761 852 943 034
679 770 861 952
688 779
043
870 961 052
697 788 879 970 061
706 797 888 979 070
715 806 897 988 079
724 815 906 997 088
733 825 916 •006 097
742 834 925 ♦015 106
752 843 934 ♦024 115
124 215 305 395 485
133 224 314 404 494
142 233 323 413 502
151 242 332 422 511
160 251 341 431 520
169 260 350 440 529
178 269 359 449 538
187 278 368 458 547
196 287 377 467 556
205 296 386 476 565
574
583
592
601
610
619
628 | 637
646
655
0
1
2
3
434 1 435 436 437
64
65
66
67
68
480 481 482 483 484 485
No.
L
367
070 167
4 1
5
• Indicates change in the first two decimal places.
6 1
Proportional parts
1
1.0
2 3 4 5 6 7 8 9
2^0 3.0 4.0 6.0 6.0
7.0 8.0 9.0
887 982
9
7
8 1
9
1 2 3 4 5
6 7 8 9
0.9
1.8 2.7 3.6 4.5 5.4 6.3 7.2 8.1
Proportional parts
1000
APPENDIX F
Table F-2. Five-place Common Logarithms of Numbers (Continued) 485-540
Proportional parts
No.
L
0
1
2
3
4
*
6
7
ft
9
485 486 487 488 489
68
574 664
592 682 771 860 949
601
610
690 780 869 958
699 789 878 966
619 708 797 886 975
628 717 806 895 984
637 726
931
583 673 762 851 940
815 904 993
646 735 824 913 ♦002
655 744 833 922 ♦011
490 491 492 493 494
69
020 108 197 285 373
028 117 205 294 381
037 126 214 302 390
046 135 223 311 399
055 144 232 320 408
064 152 241 329 417
073 161 249 338 425
082
090
099
170 258 346 434
267 355 443
495 496 497 498 499
461 548 636 723 810
469 644 732 819
478 566 653 740 827
487 574 662 749 836
496 583 671 758 845
504 592 679 767 854
513 601 688 775 862
522 609 697 784 871
531 618 705
793 880
539 627 714 801 888
500 501 502 503 504
897 984
906 992 079 165 252
914 ♦001 088
923 ♦010
174 260
096 183 269
932 ♦018 105 191
278
940 ♦027 114 200 286
949 ♦036 122 209 295
958 ♦044 131 217 303
966 •053 140 226 312
975 ♦062 148 234 321
338 424 509 595 680
346 432 518 603 689
355 441 526 612 697
364 449 535 621 706
372 458 544 629 714
381 467
389
398 484
731
569 655 740
406 492 578 663 749
783 868 952 037 122
791 876 961 046 130
800 885 969
012 096
766 851 935 020 105
774 859 944 029 113
054 139
978 063 147
817 902 986 071 155
825 910 995 079 164
834 919 •003 088 172
515 516 517 518 519
181 265 349 433 517
189 273 357 441 525
198 282 366 450 533
206 290 374 458 542
214 299 383 467 550
223 307 391 475 559
231 315 399 483 567
240 324 408 492 575
248 332 416 500 584
257 341 425 508 592
520 521 522 523 524
600 684 767 850 933
609 692 775 858 941
617 700 784 867 950
625 709 792 875 958
634
642 725 809 892 975
650 734
659 742 825 908 991
667
717 800 883 966
750 834 917 999
675 759 842 925 •008
016 099 181 263 346
024
032 115 198 280
041 123 206 288 370
049 132 214 296 378
057 140 222 305 387
066
074 156 239 321 403
082
090
148 230 313 395
165 247 329 411
173 255 337 419
530 531 532 533 534
428 509 591 673 754
436
452 534
460
542 624 705 787
469 550 632 713 795
477 559 640 722 803
485 567 648 730 811
493 575 656 738 819
501 583 665 746
535 536 537 538 539
835 916 997
876 957 ♦038 119 199
884 965 ♦046 127 207
892
973 ♦054 135 215
900 981 ♦062
143 223
908 989 •070 151 231
753 842
70
505 506 507 508 509 510 511 512 513 514
525 526 527 528 529
329 415 501 586 672 757 842
927 71
72
73
540
No.
070 157 243
L
557
107 189 272 354
518 599 681 762
362 444 526 607 689 770
616 697 779
552 638 723 808 893
817 900 983
475 561 646
179
188 276 364 452
078 159
852 933 ♦014 094
175
102 183
868 949 ♦030 111 191
239
247
255
264
272
280
288
296
304
312
0
1
2
3
4
5
6
7
8
9
* Indicates change in the first two decimal places.
0.9 1.8 2.7
3.6 4.5 5.4
7
6.3
8
7l2
9
8.1
1 2 3 4 5
0.8
8
6 7 8 9
1.6 2.4 3.2 4.0 4.8 6.6 6.4 7.2
827
844 925 ♦006 086 167
860 941 ♦022
9 1 2 3 4 6 6
Proportional parts
1001
NUMERICAL TABLES
Table F-2. Five-place Common Logarithms of Numbers (Continued) 540-595
No.
L 1 o
1
2
3
4
5
6
1
ft
9
540 541 542 543 544
73 I239
255
320 400 480 560
247 328 408 488 568
416 496 576
264 344 424 504 584
272 352 432 512 592
280 360 440
288 368 448 528 608
296 376 456 536 616
304 384 464 544 624
312 392 472 552 632
545 546 547 548 549
640 719 799 878 957
648 727 807 886 965
656 735 815 894 973
664 743 823 902 981
672
687 767 846 926
695 775 854 934 ♦013
703 783 862 941
711 791 870 949
♦020
♦028
036 115 194 273 351
044 123 202 280 359
052
060 139 218 296 374
068
210 288 367
147 225 304 382
076 155 233 312 390
092 170 249 327 406
099 178 257 335 414
107 186 265 343 421
555 556 557 558 559
429 507 586 663 741
437 515 593 671 749
445 523 601 679 757
453 531 609 687 764
461 539 617 695 772
468 547 624 702 780
632 710 788
484 562 640 718 796
492 570 648 726 803
500 578 656 733 811
560 561 562 563 564
819 896 974 051 128
827 904 981 059 136
834 912 989 066 143
842 920 997
074 151
850 927 •005 082 159
858 935 ♦012 089 166
865 943 •020 097 174
873 950 •028 105 182
881 958 ♦035 113 189
966 •043 120 197
565 566 567 568 569
205 282 358 435 511
213 289 366 442 519
220 297 374 450 526
228 305 381 458 534
236 389 465 542
243 320 397 473 549
251 328 404 481 557
259 335 412 488 565
266 343 420 496 572
570 571 572
587 664 740 815 891
595 671
610 686 762 838 914
618 694
626 702
633 709
770 846
778 853
785
921
929
861 937
641 717 793 868 944
648 724 800 876 952
656
747 823 899
603 679 755 831 906
974 050 125 200 275
982 057 133 208 283
989 065 140 215 290
997 072 148 223 298
♦005 080 155 230 305
♦012
•020
087 163 238 313
095 170 245 320
♦027 103 178 253 328
•035 110 185 260 335
358 433
373 448 522 597 671
380 455 530 604 678
388 462 537 612 686
395 470 545 619 693
403 477 552 626 701
410 485 559 634 708
753 827 901 975
768 842 916 989 063
775 849 923 997 070
782 856 930 •004 078 151 225 298 371
550 551 552 553 554
74
75
573 574 575 576 577 578 579
76
967 042
118 193 268
336
131
751 830 910 989
312
520 600
679 759 838 918 997
♦005
084
162 241 320 398 476 554
274
351 427 504 580
732 808 884 959
418 492 567 641
350 425 500 574 649
656
365 440 515 589 664
585 586 587 588 589
716 790 864 938 012
723 797 871 945 019
730 805 879 953 026
738 812 886 960 034
745 819 893 967 041
048
760 834 908 982 056
085 159 232 305 379
093 166 240 313 386
100 173 247 320 393
107 181 254 327 401
115 188 262 335 408
122 195 269 342 415
129 203 276 349 422
137 210 283 357 430
144 218 291 364 437
444
452
459
466
474
481
488
495
503
510
517
2
3
4
5
6
7
8
9
77
590
591 592 593 594 595
No.
L
0 1 1
507 582
• Indicates changein the first two decimal places.
9 1 2 3 4 5 6 7 8 9
0.9 1.8 2.7
1 2 3 4 5 6
0.8 1.6 2.4 3.2
3.6 4.5
5.4 6.3 7.2 8.1
889
580 581 582 583 584
343
Proportional parts
8
7 8 9
4.0
4.8 5.6
6.4
7.2
7 1 2 3 4 5
6 7 8 9
0.7 1.4 2.1 2.8 3.5 4.2 4.9 5.6 6.3
Proportional parts
1002
APPENDIX F
Table F-2. Five-place Common Logarithms of Numbers (Continued) 595-450 8
9
656 728 801
517 590 663 735 808
866 938 •010 082 154
873 945 •017 089 161
880 952 ♦025 097 168
219 290 362 433 505
226 297 369 440 512
233 305 376 447 519
240 312 383 455 526
569 640 711 781 852
576 647 718 789 859
583 654 725 796 866
590 661 732 803 873
597 668 739 810 880
916 986 057 127 197
923 993 064
134 204
930 •000 071 141 211
937 ♦007 078 148 218
944 •014 085 155 225
951 ♦021 092 162 232
267 337 407 477
532
260 330 400 470 539
546
274 344 414 484 553
281 351 421 491 560
288 358 428 498 567
295 365 435 505 574
302 372 442 512 581
602
609
671 741 810 879
678 748 817 886
616 685 754 824 893
623 692 761 831 900
630 699 768 837 906
637 706 775 844 913
644 713 782 851 920
651 720 789 858
948
955 024 092 161 229
962 030 099 168 236
969
975 044 113 182
982 051 120
989 058 127
250
188 257
195 264
996 065 134 202 271
318 387 455 523 591
325 393 462 530 598
332 400
0
i
2
*
4
5
6
77
452
459
481 554 627 699 772
495
532 605 677 750
474 546 619 692 764
488
525 597 670 743
466 539 612 685 757
561 634 706 779
815 887 960 032 104
822 895 967 039 111
830 902 974 046 118
837 909 981 053 125
844 916 989 061 132
605 606 607 608 609
176 247 319 390 462
183 254 326 398 469
190 262 333 405 476
197 269 340 412 483
610 611 612 613
533 604 675 746 817
540 611 682 753 824
547 618 689 760 831
554 625 696
888 958
895 965 036 106 176
902 972 043 113 183
909
246 316 386 456 525
253 323 393 463
600 601 602 603 604
78
614 615 616 617 618 619
79
029 099 169
620
239
621 622 623 624
309 379 449 518
625 626 627 628 629
588 657 727 796 865
595 664 734 803
630 631 632 633 634
934
941 010
822
-1
510 583
L
No. 595 596 597 598 599
767 838
979 050 120 190
568 641 714 786
503 576 648 721 793
851 924 996 068 140
859 931 •003 075 147
204 276 347 419 490
211 283 355 426 497
561 633 704 774 845
Proportional parts
8
1 2
3 4 5
6 7 8 9
0.8 1.6 2.4 3.2 4.0 4.8 5.6
6.4 7.2
927 7
037 106 175 243
003 072 140 209
079 147 216
017 085 154 223
277 346 414 482 550
284 353 421 489 557
291 359 428 496 564
298 366 434 502 570
305 373 441 509 577
312 380
640 641 642 643 644
618 686 754 821
625 693 760 828 895
632 699 767 835 902
638
706 774 841 909
645 713 781 848 916
652 720
787 855 922
659 726 794 862 929
645 646 647 648 649
956 023
963
969
976
983
990
996
090 158 224
030 097 164 231
037 104 171 238
043 111 178 245
050 117 184 251
057 124 191 258
064 131 198 265
291
298
305
311
318
325
331
338
3
4
5
6
7
80
635
636 637 638 639
889
81
650
No.
♦
L
0
1 1 1 2
448 516 584
Indicates change in the first two decimal places.
339 407 475
468 536 604
543 611
665 733 801 868 936
672
679
740 808 875 943
747 814 882
•003 070 137
♦010 077
♦017
204 271
1 2
3 4 5 6 7
8 9
M.I
1.4 2.1 2.8 3.6 4.2 4.9 5.6 6.3
949
144 211 278
084 151 218 285
345
351
1 8 1 9
Proportional parts
1003
NUMERICAL TABLES
Table F-2. Five-place Common Logarithms of Numbers (Continued) 650-705
No.
L
0
1
2
3
4
*
6
7
ft
9
650 651 652 653 654
81
291 358 425 491 558
298 365 431 498 564
305 371 438 505 571
311 378 445 511 578
318 385 451 518 584
325 391 458 525
591
331 398 465 531 598
338 405 471 538 604
345 411 478 544 611
351 418 485 551 618
624 690 757 823 889
631 697 763 829 895
637 704 770 836 902
644 710 776 842 908
651 717 783 849 915
657 723 790 856 921
664
671
730 796 862 928
737 803 869 935
677 743 809 875 941
750 816 882 948
954
961 027 092 158 223
968 033 099 164 230
974 040 105
981 046 112 178
987 053 119 184 250
994 060 125 191 256
♦000
066 132 197 263
♦007 073 138 204 269
♦014 079 145 210 276
321 387 452 517 582
328 393 458 523 588
334 400 465 530 595
341 406 471 536 601
655 656 657 658 659 660 661 662 663 664
82
665 666 667 668 669
282 347 413 478 543
607 672 737 802 866
670 671 672 673 674 675 676 677 678 679
020 086 151 217
83
APPENDIX F

Table F-2. Five-place Common Logarithms of Numbers (Continued)

[Tabular matter, pages 1004–1010: five-place mantissas of log₁₀ N for N from about 680 through 1000, tabulated against a final digit 0 to 9. The printed page captions cover the ranges 705–760, 760–815, 815–870, 870–925, 925–980, and 980–1000. Marginal "Proportional parts" tables support linear interpolation in the last figure. An asterisk before an entry indicates a change in the first two decimal places.]