This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
, the statement about the
CHAP. 18
774
Complex Analysis and Potential Theory
maximum of
(III) If II and
X-Xo y~yo
Under these assumptions the maximum principle (I) is still applicable. The problem of determining
THEOREM 5
Uniqueness Theorem for the Dirichlet Problem
Iflor a given region and given boulldary mlues the Dirichlet problem for the Laplace equatio/l in two l'ariables has a solutioll. the solution is ullique.
---..
.... -~
...-.-----
kl 2 around the unit circle. Does your result contradict Theorem I?
1. Integrate
12-" 1 VERIFY THEOREM 1 for the given F(::),
::0'
and
circle of radius l. 2. (::: + 1)3':::0 = 2 3. (::: - 2)2. ::0 = ~ 4. 10:::
15-71
4
'':0
=
0
VERIFY THEOREM 2 for the given cI>(x. y).
(xo. Yo) and Circle of radius I. 5. (x - 2)(y - 2), (4. -4)
6. x 2
-
.'"2. (3, 8)
7. x 3
-
3xy2, (I, I)
8. Derive Theorem 2 from Poisson's integral formula. 9. CAS EXPERIMENT. Graphing Potentials. Graph the potentials in Probs. 5 and 7 and for three other
functions of your choice as smfaces over a rectangle or a disk in the xy-plane. Find the locations of maxima and minima by inspecting these graphs. 10. TEAM PROJECT. Maximum Modulus of Analytic Functions. (a) Verify Theorem 3 for (i) F(:::) = ::2 and the square 4 ~ x ~ 6. 2 ~ Y ~ 4, (ii) F(:::) = e1z and any bounded domain. (iii) F(:::) = sin:: and the unit disk. (b) F(x) = cos x (x real) has a maximum I at O. How does it follow that this cannot be a maximum of IFez.) I = Icos:::1 in a domain containing Z = O? (c) F(::) = 1 + 1::12 is not Lero in the disk Izi ~ 4 and has a minimum at an interior point. Does this contradict Theorem 3? (d) If F(:::) is analytic and not constant in the closed unit disk D: 1<:1 ~ 1 and IF(.;;)I = C = COIlSt on the unit circle. show that F(.:) must have a zero in D. Can you extend this to an arbitrary simple closed curve?
775
Chapter 18 Review Questions and Problems Ill-13
I
MAXIMUM MODULUS
Find the location and si7e of the maximum of IF(;:)I in the unit disk 1:<:1 ;::; I. 11. F(;:) = ;:2 -
12. F(;:)
=
a;:
1
+ b la,
b complex)
13. F(;:) = cos 2;:
14. Verify the maximum principle for <1>(x. y) = eX cos y and the rectangle a ;::; x ;::; b. 0 ;::; y ;::; 2'iT.
. . = .-.... •.-. ... : .-====== --
.~
15. (Conjugate) Do ¢ and a harmonic conjugate \)! of (I) in a region R have their maximum at the same point of R? 16. (Conformal mapping) Find the location (u l . VI) of the maximum of <1>* = ell cos V in R*: Iwl ;::; 1. V ~ 0, where w = 1I + iv. Find the region R that is mapped onto R* by w = f(;:) = ;:2. Find the potential in R resulting from <1>* and the location (xl' .\'1) of the maximum. Is (lib VI) the image of (Xl' YI)? If so, is this ju~t by chance?
•• , -:..ll£STIONS AND PROBLEMS
1. Why can potential problems be modeled and solved by complex analysis? For what dimensions? 2. What is a harmonic function? A harmonic conjugate?
3. Give a few example~ of potential problems considered in this chapter.
4. What is a complex potential? What does it give physically'?
5. How can conformal mapping be used in connection with the Dirichlet problem?
6. What heat problems reduce to potential problems? Give a few examples. 7. Write a short essay on potential theory in fluid flow from memory. 8. What is a mixed boundary value problem? Where did it occur? 9. State Poisson's formula and its derivation from Cauchy's formula. 10. State the maximum modulus theorem and mean value theorems for harmonic functions. 11. Find the potential and complex potential between the plates y = x and y = x + 10 kept at 10 V and 110 V, respecti vel y. 12. Find the potential between the cylinders 1:1 = I cm having potential 0 and Id = 10 em having potential 20 kV.
13. Find the complex potential in Prob. 12. 14. Find the equipotential line U = 0 V between the cylinders Id = 0.2S cm and 1:1 = 4 cm kept at -220 V and 220 V. respectively. (Guess first.) 15. Find the potential between the cylinders Izl = 10 cm and 1:::1 = 100 cm kept at the potentials 10 kV and 0, respectively.
16. Find the potential in the angular region between the plates Arg ::: = 'iT16, kept at 8 k V, and Arg ;: = 'iT13, kept dt 6kV.
17. Find the equipotential lines of F(;:)
= i
Ln z..
18. Find and sketch the equipotential lines of F(:::) = (l
+
i)/:::.
19. What is the complex potential in the upper half-plane if the negative half of the x-axis has potential I kV and the positive half is grounded? 20. Find the potential on the ray)' = x, x > 0, and on the positive half of the x-axis if the positive half of the .v-axis is at 1200 V and the negative half IS grounded. 21. Interpret Prob. 20 as a problem in heat conduction. 22. Find the temperature in the upper half-plane if the portion x > 2 of the x-axis is kept at SO"C and the other portion at O°C.
23. Show that the isotherm~ of Fr;:) = hyperbolas.
_;:::2
+ ;: are
24. If the region between two concentric cylinders of radii 2 cm and 10 cm contains water and the outer cylinder is kept at 20°C, to what temperature must we heat the inner cylinder in order [0 have 30°C at distance 5 cm from the axis? 25. What are the streamlines of Fr:::) = i/;:?
26. What is the complex potential of a flow around a cylinder of radius 4 without circulation? 27. Find the complex potential of a source at ::: = 5. What are the streamlines? 28. Find the temperature in the unit disk 1:1 ;::; I in the form of an infinite series if the left semicircle of Izi = 1 has the temperature of SO°C and the right semicircle has the temperature DoC.
29. Same task as in Prob. 2~ if the upper semicircle is at 40°C and the lower at DoC.
30. Find a series for the potential in the unit disk with boundary values <1>( I, 0) = 0 2 (-'iT < 0 < 'iT).
776
CHAP. 18
Complex Analysis and Potential Theory
......
•
~
,,~--
Complex Analysis and Potential Theory Potential theory is the theory of solutions of Laplace's equation (1)
Solutions who~e second pattial derivatives are continuous are called harmonic functions. Equation (I) is the most important PDE in physics, where it is of interest in two and three dimensions. It appears in electrostatics (Sec. 18.1), steady-state heat problems (Sec. 18.3), fluid flow (Sec. 18.4), gravity, etc. Whereas the three-dimensional case requires other methods (see Chap. 12), two-dimensional potential theory can be handled by complex analysis. since the real and imaginary parts of an analytic function are harmonic (Sec. 13.4). They remain harmonic under conformal mapping (Sec. 18.2), so that conformal mapping becomes a powerful tool in solving boundary value problems for (1), as is illustrated in this chapter. With a real potential
(2)
F(z) =
+
i\If
(Sec. 18.1).
Then both families of curves
,
PA RT
••
E
•
•1
,
t
-
.(
.,
..""
~ (
Numeric
Analysis Software (p. 778-779) CHAPTER 19
Numerics in General
CHAPTER 20
Numeric Linear Algebra
CHAPTER 21
Numerics for ODEs and PDEs Numeric analysis, more briefly also called numerics, concerns numeric methods. that is, methods for solving problems in terms of numbers or corresponding graphical representations. It also includes the investigation of the range of applicability and of the accuracy and stability of these methods. Typical tasks for numerics are the evaluation of definite integrals. the solution of equations and linear systems, the solution of differential or integral equations for which there are no solution formulas, and the evaluation of experimental data for which we want to obtain. for example. an approximating polynomial. Numeric methods then provide the transition from the mathematical model to an algorithm, which is a detailed stepwise recipe for solving a problem of the indicated kind to be programmed on your computer, using your CAS (computer algebra system) or other software, or on your programmable calculator. In this and the next two chapters we explain and illustrate the most frequently used basic numeric methods in algorithmic form. Chapter 19 concerns numerics in general; Chap. 20 numeric linear algebra. in particular, methods for linear systems and matrix eigenvalue problems; and Chap. 21 numerics for ODEs and PDEs. The algorithms are given in a form that seems best for showing how a method works. We suggest that you also make use of programs from public-domain or commercial software listed on pp. 778-779 or obtainable on the lnternet.
778
PART E
Numeric Analysis
Numerics has increased in importance to the engineer more than any other field of mathematics owing to the ongoing development of powerful software resulting from great research activity in numerics: new methods are invented, existing methods are improved and adapted. and old methods-impractical in precomputer times-are rediscovered. A main goal in these activities is the development of well-stmctured software. And in largescale work-millions of equations or steps of iteration--even small algorithmic improvements may have a large effect on computing time, storage demand, accuracy, and stability. On average this makes the algorithms used in practice more and more complicated. However, the more sophisticated modem software will become, the more important it will be to understand concepts and algorithms in a basic form that shows original motivating ideas of recent developments. To avoid misunderstandings: Various simple classical methods are still very useful in many routine situations and produce satisfactory results. In other words, not everything has become more sophisticated.
Software See also http://www.wiley.com/college/kreyszig/ The following list will help you if you wish to find software. You may also obtain information on known and new software from magazines, such as Byte Magazine or PC Maga::,ille, from articles published by the Americall Mathematical Society (see also their website at www.ams.org).theSocietyforlndustrialandAppliedMathematics(SIAM.at www.siam.org). the Association for COl1lpllting Machillel:r (ACM. at www.acm.org), or the Institute of Electrical and Electronics Engineers (IEEE, at www.ieee.org). Consult also your library, Computer Science Department, or Mathematics Department.
Derive. Texas Instntments, Inc .. Dallas, TX. Phone 1-800-842-2737 or (972) 917-8324, website at www.derive.com or www.education.ti.com. EISPACK. See LAPACK. GAMS (Guide to Available Mathematical Software). Website at http://gams.nist.gov. On-line cross-index of software development by NIST. with links to IMSL. NAG, and NETUB. IMSL (International Mathematical and Statistical Library). Visual Numerics, Inc., Housron. TX. Phone \-800-222-4675 or (713) 784-3131. website at www.vni.com. Mathematical and statistical Fortran routines with graphics. LAPACK. Fortran 77 routines for linear algebra. This software package supersedes UNPACK and EISPACK. You can download the routines (see http://cm.bell-labs.comlnetliblbib/minors.html) or order them directly from NAG. The LAPACK User's Guide is available at www.netlib.org.
PART E
Numeric Analysis
779
LINPACK see LAPACK Maple. Waterloo Maple, Inc., Waterloo, ON, Canada. Phone 1-800-267-6583 or (519) 747-2373, website at www.maplesoft.com. Maple Computer Guide. For Advanced Engineering Mathematics, 9th edition. By E. Kreyszig and E. J. Norminton. J. Wiley and Sons. [nc .. Hoboken. NJ. Phone 1-800-225-5945 or (201) 748-6000. Mathcad. MathSoft, Inc., Cambridge, MA. Phone 1-800-628-4223 or (617) 444-8000. website at www.mathcad.com or www.mathsofLcom. Mathematica. Wolfram Research, Inc., Champaign. IL. Phone 1-800-965-3726 or (217) 3lJ8-0700, website at www.wolframresearch.com. Mathematica Computer Guide. For Advanced Engineering Mathematics. lJth edition. By E. Kreyszig and E. J. Norminton. J. Wiley and Sons. Inc .. Hoboken. NJ. Phone 1-800-225-5945 or (201) 748-6000. Matlab. The MathWorks, Inc., Natick, MA. Phone (508) 647-7000, website at www.mathworks.com. NAG. Numerical Algorithms Group, Inc., Downders Grove. IL. Phone (630) 971-2337, website at www.nag.com. Numeric routines in Fortran 77, Fortran 90, and C. NETLIB. Extensive library of public-domain software. See at www.netIib.org and http://cm.bell-Iabs.comlnetlib/. NIST. National Institute of Standards and Technology, Gaithersburg, MD. Phone (301) 975-2000, website at www.nist.gov. For Mathematical and Computational Science Division phone (301) 975-3800. See also http://math.nist.gov. Numerical Recipes. Cambridge University Press, New York. NY. Phone (212) 924-3900. website at www.us.cambridge.org. Books (also source codes on CD ROM and discettes) containing numeric routines in C, C++. Fortran 77, and F0l1ran 90. To order. call office at West Nyack. NY, at 1-800-872-7423 or (845) 353-7500 or online at www.numerical-recipes.com. FURTHER SOFTWARE IN STATISTICS. See Part G.
~
I
:·~f
~.._I
,
.... ~
"1
. ... . /, -. "«:-
CHAPTER
19
i ....J
Numerics in General This first chapter on numerics begins with an explanation of some general concepts. such as floating point, roundoff errors, and general numeric errors and their propagation. In Sec. 19.2 we discuss methods for solving equations. Interpolation methods, including splines, follow in Secs. 19.3 and 19.4. The last section (19.5) concerns numeric integration and differentiation. The purpose of this chapter is twofold. First, for all these tasks the student should become familiar with the most basic (but not too complicated) numeric solution methods. These are indispensable for the engineer, because for many problems there is no solution formula (think of a complicated integral or a polynomial of high degree or the interpolation of values obtained by measurements). In other cases a complicated solution formula may exist but may be practically useless. Second. the student should learn to understand some ba<;ic ideas and concepts that are important throughout numerics, such as the practical form of algorithms. the estimation of errors, and the order of convergence. Prerequisite: Elementary calculus References and Answers to Problems: App. J Part E, App. 2
19.1
Introduction Numeric methods are used to solve problems on computers or calculators by numeric calculations, resulting in a table of numbers andlor graphical representations (figures). The steps from a given situation (in engineering, economics. etc.) to the final answer are usually as follows.
1. Modeling. We set up a mathematical model of our problem. such as an integral, a system of equations. or a differential equation. 2. Choosing a numeric method and parameters (e.g., step size), perhaps with a preliminary error estimation. 3. Programming. We u~e the algorithm to write a corresponding program in a CAS, such as Maple, Mathematica, Matlab, or Mathcad, or, say, in Fortran, C, or C++, selecting suitable routines from a software system as needed. 4. Doing the computation. S. Interpreting the results in physical or other terms, also deciding to rerun if further results are needed. 780
SEC. 19.1
Introduction
781
Steps and 2 are related. A slight change of the model may often admit of a more efficient method. To choose methods, we must first get to know them. Chapters 19- 21 contain efficient algorithms for the most important classes of problems OcculTing frequently in practice. In Step 3 the program consists of the given data and a sequence of instructions to be executed by the computer in a certain order for producing the answer in numeric or graphic form. To create a good understanding of the nature of numeric work, we continue in this section with some simple general remarks.
Floating-Point Form of Numbers We know that in decimal notation, every real number is represented by a finite or an infinite sequence of decimal digits. Now most computers have two ways of representing numbers, called fixed poillf and floating point. In a fixed-point system all numbers are given with a fixed number of decimals after the decimal point; for example, numbers given with 3 decimals are 62.358, 0.014, 1.000. In a text we would write, say, 3 decimals as 3D. Fixed-point representations are impractical in most scientific computations because of their limited range (explain!) and will not concern us. In a floating-point system we write. for instance.
0.1735' 10- 13 ,
-0.2000' 10- 1
1.735' 10- 14,
- 2.000' 10- 2 .
or sometimes also
We see that in this system the number of significant digits is kept fixed, whereas the decimal point is "t1oating." Here, a significant digit of a number c is any given digit of c, except possibly for zeros to the left of the first nonzero digit; these zeros serve only to fix the position of the decimal point. (Thus any other zero is a significant digit of c.) For instance, each of the numbers
1360,
1.360,
0.001360
has 4 significant digits. In a text we indicate, say, 4 significant digits, by 4S. The use of exponents permits us to represent very large and very small numbers. Indeed. theoretically any nonzero number a can be written as
0.1 ~
(I)
Iml <
1,
11
integer.
On the computer, 111 is limited to k digits (e.g., k = 8) and n is limited. giving representations (for finitely many numbers only!)
(2)
a = :!:.m· 10",
These numbers a are often called k-digit decimal macbine numbers. Their fractional part m (or Iii) is called the mamis.m. This has nothing to do with "mantissa" as used for logarithms. 11 is called the exponent of a.
782
CHAP. 19
Numerics in General
Underflow and Overflow. The range of exponents that a typical computer can handle is very large. The IEEE (Institute of Electrical and Electronics Engineers) floating-point standard for single precision (the usual number of digits in calculations) is about - 38 < 11 < 38 (about - 125 < n* < 125 for the exponent in binary representations, i.e., representations in base 2). [For so-called double precision it is about - 308 < n < 308 (about -1020 < n* < 1020 for binary).] If in a computation a number outside that range occurs, this is called underflow when the number is smaller and overflow when it is larger. In the case of underflow the result is usually set to zero and computation continues. Overflow causes the computer to halt. Standard codes (by IMSL, NAG, etc.) are written to avoid overflow. Error messages on overflow may then indicate progran1ming errors (incorrect input data, etc.).
Roundoff An error is caused by chopping (= discarding all decimals from some decimal on) Or rounding. This error is called roundoff error, regardless of whether we chop or round. The rule for rounding off a number to k decimals is as follows. (The rule for rounding off to k significant digits is the same, with "decimal" replaced by "significant digit.") Roundoff Rule. Discard the (k + 1)th and all subsequent decimals. (a) If the number thus discarded is less than half a unit in the Ath place, leave the kth decimal unchanged (" rounding down"). (b) If it is greater than half a unit in the A1h place, add one to the kth decimal ("rounding lip"). (c) If it is exactly half a unit, round off to the nearest el'en decimal. (Example: Rounding off 3.45 and 3.55 to I decimal gives 3.4 and 3.6, respectively.) The last part of the rule is supposed to ensure that in discarding exactly half a decimal, rounding up and rounding down happens about equally often, on the average. If we round off 1.2535 to 3, 2, I decimals, we get 1.254. 1.25. 1.3, but if 1.25 is rounded off to one decimal, without fmther information, we get 1.2. Chopping is not recommended because the corresponding error can be larger than that in rounding, and is systematic. (Nevertheless, some computers use it because it is simpler and faster. On the other hand, some compurers and calculators improve accuracy of results by doing intermediate calculations using one or more extra digits, called guarding digits.) Error in Rounding. Let a = fl(a) in (2) be the floating-point computer approximation of a in (I) obtained by rounding. where fl suggests floating. Then the roundoff rule gives (by dropping exponents) 1111 - ilil ~ ~. lO- k . Since Iml ~ 0.\, this implies (when a *- 0) (3)
II - a a
I
I
1111 - iii 11/
I
~
I l-k _.\0 2
The right side u = ~. 10 1 - k is called the rounding unit. If we write a = ll( I + 8), we have by algebra (li - a)/a = 8, hence 181 ~ 1I by (3), This sholl"s that the rounding l/17it u is all error bOl/nd in rounding. Rounding errors may ruin a computation completely, even a small computation. In general, these errors become the more dangerous the more arithmetic operations (perhaps several millions!) we have to perform, It is therefore important to analyze computational programs for expected rounding eITors and to find an arrangement of the computations such that the effect of rounding errors is as small as possible. The arithmetic in a computer is not exact either and causes further errors; however, these will not be relevant to our discussion.
SEC. 19.1
783
Introduction
Accuracy in Tables. Although available software has rendered various tables of function values superfluous, some tables (of higher functions, of coefficients of integration formulas, etc.) will still remain in occasional use. [f a table shows k significant digits, it is conventionally assumed that any value a in the table deviates from the exact value a by at most ±! unit of the kth digit.
Algorithm. Stability Numeric methods can be formulated as algorithms. An algorithm is a step-by-step procedure that states a numeric method in a form (a "pseudocode") understandable to humans. (Turn pages to see what algorithms look like.) The algorithm is then used to write a program in a programming language that the computer can understand so that it can execute the numeric method. Important algorithms follow in the next sections. For routine tasks your CAS or some other software system may contain programs that you can use or include as pat1s of larger programs of your own. Stability. To be useful, an algorithm should be stable; that is, small changes in the initial data should cause only small changes in the final results. However, if small changes in the initial data can produce large changes in the final results, we call the algorithm unstable. This "numeric instability," which in most cases can be avoided by choosing a better algorithm, must be distinguished from "mathematical instability" of a problem, which is called "ill-conditiolling, " a concept we discuss in the next section. Some algorithms are stable only for certain initial data, so that one must be careful in such a case.
Errors of Numeric Results Final results of computations of unknown quantities generally are approximations; that is, they are not exact but involve errors. Such an error may result from a combination of the following effects. Roundoff errors result from rounding, as discussed on p. 782. Experimental errors are en-ors of given data (probably arising from measurements). Truncating errors result from truncating (prematurely breaking off), for instance, if we replace a Taylor series with the sum of its first few terms. These errors depend on the computational method used and must be dealt with individually for each method. ["Truncating" is sometimes used as a term for chopping off (see before), a terminology that is not recommended.] Formulas for Errors. If a is at1 approximate value of a quantity whose exact value is a, we call the difference
(4)
E=a-a
the error of a. Hence (4*)
a
=
a + E,
True value
= Approximation + Error.
For instance, if a = 10.5 is an approximation of a = 10.], its error is error of an approximation a = 1.60 of a = 1.82 is E = 0.22.
E
= -0.3. The
CHAP. 19
784
Numerics in General
CAUTION! In the literature la - al ("ab~olute error") or used as definitions of error. The relative error E,. of is defined by
a-
a are sometimes also
a
(5)
Er
=
E
a-a
Error
a
a
True value
(a =f:. 0).
This looks useless because a is unknown. But if Itl is much less than a instead of a and get
lal. then we can use
(5')
This still looks problematic hecause E is unknown-if it were known, we could get a = a + E from (4) and we would be done. But what one often can ohtain in practice is an error bound for a, that is, a number f3 such that
Itl
~
f3,
la - al
hence
~
f3.
This tells us how far away from our computed a the unknown a can at most lie. Similarly. for the relative error, an en-or bound is a number f3r such that a - a --a-
hence
I
I
~
f3r·
Error Propagation This is an imp0l1ant matter. It refers to how en-ors at the beginning and in later steps (roundoff, for example) propagate into the computation and affect accuracy, sometimes very drastically. We state here what happens to en-or bounds. Namely, bounds for the error add under addition and subtraction, whereas bounds for the relative error add under multiplication and division. You do well to keep this in mind.
THEOREM 1
Error Propagation (a) In addition and sllbtraction, an error bound for the results is given by the sum of the error bounds for the terms. (b) In multiplication and division, all error bound for the relative error of the results is given (approximately) by the sum of the bounds for the relative errors of the gil'en llumbers.
PROOF
(a) We use the notations x = E of the difference we obtain
Itl
=
=
x + EI' Y = Y+ E2, IEII ~ f31.IE21 ~ f32' Then for the en-or Ix Ix -
y - (x -
y)1
x - (y -
y)1
= h - E21
~
hi
+
k21
~ f31 + f32'
SEC. 19.1
785
Introduction
The proof for the sum is similar and is left to the student. (b) For the relative error and bounds f3rI, f31'2
IErI
=
Er
of xy we get from the relative errors
Ixy - xy I Ixy =
xy
(x - EI)(Y - E2 ) xY
Erl
I IElY + E
2X -
=
and
Er2
of X,
y
EI E21
x)'
This proof shows what "approximately" means: we neglected EI E2 as small in absolute value compared to lEI I and IE21. The proof for the quotient is similar but slightly more tricky (see Prob. 15). •
Basic Error Principle Every numeric method should be accompanied by an error estimate. [f such a formula is lacking, is extremely complicated, or is impractical because it involves information (for instance, on derivatives) that is not available, the following may help.
Error Estimation by Comparison. Do a calculation twice with different accuracy. Regard the difference a2 - al of the results aI, a2 as a (perhaps en/de) estimate of the error EI of the inferior result al' Indeed, al + EI = a2 + E2 by formula (4*). This implies a2 - al = EI - E2 = EI because a2 is generally more accurate than aI, so that IE21 is small compared to IEII.
Loss of Significant Digits This means that a result of a calculation has fewer correct digits than the numbers from which it was obtained. This happens if we subtract two numbers of about the same size, for exan1ple. 0.1439 - 0.1426 ("subtractive cancellation"). It may occur in simple problems, but it can be avoided in most cases by simple changes of the algorithm-if one is aware of it! Let us illustrate this with the following basic problem. E X AMP L E 1
Quadratic Equation. Loss of Significant Digits Find the roots of the equation X2 -
40x
+2
=
0,
using 4 significant digits (abbreviated 4S) in the computation.
Solution. A formula for the roots Xl' -"2 of a quadratic equation ax2 +
bx
+c
=
0 is
(6)
Furthermore, since xlx2 (7)
=
cIa, another formula for those roots is Xl a~
before.
Fro~ (6) we obtain X = 20 :+: V398 = ~O.OO :+: 19.95. This gives Xl = 20.00 + 19.95 = 39.95, involving no dIffIculty, whereas X2 = 20.00 - 19.95 = 0.05 is poor because it involves loss of significant digits. In contrast, .(7) gives Xl = 39.?5, X2 = 2.000/39.95 = 0.05006, in error by less than one unit of the last digit, as a computatIon Wlth more digIts shows. (The lOS-value is 0.05006265674.)
786
CHAP. 19
Numerics in General
Comment. To avoid misunderstandings: 4S was used for convenience; (7) is beller than (6) regardless of the number of digits used. For instance. the 8S-computation by (6) is xl = 39.949937. X2 = 0.050063. which i~ poor. and by (7) it is Xl as before. X2 = 2/.\"1 = 0.050062657. In a quadratic equation with real root,. if X2 is absolutely largest (because b > OJ. use (6) for X2 dnd then xl =
-_ _-...........
c!(m'2)'
...
•
....
- •••.. - -.... - .. y-~ . . .-. •• - . _".
1. (Floating point) Write 98.17, -100.987, 0.0057869, - 13600 in floating-point form, rounded to 4S (4 significant digits). 2. Write -0.0286403. 11.25845. - 3168\.55 in f1oatingpoint form rounded to 6S. 3. Small differences of large numbers may be particularly strongly affected by rounding errors. l\Iu~trate this by computing 0.36443/(17.862 - 17.798) as given with 5S. then rounding stepwise to 4S. 3S. and 2S. where "stepwise" means: round the rounded numbers. not the given ones. 4. Do the work in Prob. 3 with numbers of your choice that give even more drastically different results. How can you avoid such difficulties? 5. The quotient in Prob. 3 is of the form a/(b - c). Write it as alb + c:)/(b 2 - c 2 ). Compute it first with 5S, then rounding numerator 12.996 and denominator 2.28 stepwise as in Prob. 3. Compare and comment. 6. (Quadratic equation) Solve.\"2 - 20x + I = 0 by (6) and by (7). using 6S in the computation. Compare and comment. 7. Do the computations in Prob. 6 with 4S and 2S. 8. Solve.\"2 + 100x + 2 = 0 by (6) and (7) with 5S and compare. 9. Calculate lIe = 0.367879 (6S) from the partial sums of 5 to 10 terms of the Maclaurin series (a) of e-" with x = 1, (b) of eX with x = 1 and then taking the reciprocal. Which is more accurate? 10. Addition with a fixed number of significant digits depends on the order in which you add the numbers. Illustrate this with an example. Find an empirical rule for the best order. 11. Approximations of 7T = 3.141 592 653 589 79 ... are 22/7 and 355/1 13. Determine the corresponding errors and relative errors to 3 significant digits. 12. Compute 7T by Machin's approximation 16 arctan ( 115) - 4 arctan (1/239) to lOS (which are correct). (In 19X6, D. H. Bailey computed almost 30 million decimals of 7T on a CRA Y-2 in less than 30 hours. The race for more and more decimals is continuing. ) 13. (Rounding and adding) Let al' . . . , an be numbers with aj correctly rounded to Dj decimals. In calculating
the sum (II + ... + (In' retmmng D = min DJ decimals, is it essential that we first add and then round the result or that we first round each number to D decimals and then add?
14. (Theorems on errors) Prove Theorem I(a) for addition. 15. Prove Theorem I(b) for division. 16. Show that in Example 1 the absolute value of the error of X2 = 2.000/39.95 = 0.05006 is less than 0.00001. 17. Overflow and underflow can sometimes be avoided by simple changes in a formula. Explain this in terms of V.~ + y2 = I + (ylx)2 with x 2 ~ y2 and x so large that x 2 would cause ovelt10w. Invent examples of your own. 18. (Nested form) Evaluate I(x) = x 3 - 7.5x 2 + 1l.2x + 2.8
xV
= «x - 7.5)x
+ 11.2)x + 2.8
at x = 3.94 using 3S arithmetic and rounding. in both of the given forms. The latter, called the nestedfamz, is usually preferable since it minimizes the number of operations and thus the effect of rounding. 19. CAS EXPERIMENT. Chopping and Rounding. (a) Let x = 4/7 andy = 113. Find the error~ Echop, Eround and the relative errors Er.ch, Er.rd of x + y. x - y. xy. x I)" in chopping and rounding to 5S. Experiment with other fractions of your choice. (b) Graph Echop and Eround (for 5S) of k121 as a function of k = I, 2, . , . , 21 on common axes. What average value can you read from the graph for Echop? For Eround? Experiment with other integers that give similar graphs. Different types of graphs. Can you characteri7e the ditTerent types in terms of prime factors? (c) How does the situation in (b) change if you take 4S instead of 5S? (d) Write programs for the work in (a)-(c). 20. WRITI.'JG PROJECT. Numerics. In your own words write about the overall role of numeric methods in applied mathematics, wby they are impol1ant, where and when they must be used or can be used, and how they are influenced by the use of the computer in engineering and other work.
SEC. 19.2
19.2
787
Solution of Equations by Iteration
by Iteration
Solution of Equations
From here on. each ~ection will be devoted to some basic kind of problem and corresponding solution methods. We begin with methods of finding solutions of a single equation (1)
f(x)
=
0
where f is a given function. For this task there are practically no formulas (except in a few :,imple cases). so that one depends almost entirely on numeric algorithms. A solution of (1) is a number x = s such that f(s) = O. Here. s suggests "solution." but we shall also use other letters. Examples are x 3 + x = I, sin x = O.5x. tan x = x, cosh x = sec x, cosh x cos x = -1, which can all be written in the form (1). The first concerns an algebraic equation because the corresponding f is a polynomial, and in this case the solutions are also called roots of the equation. The other equations are transcendental equations because they involve transcendental functions. Solving equations (1) is a task of prime importance because engineering applications abound: some occur in Chaps. 2.4. 8 (characteristic equations), 6 (partial fractions), 12 (eigenvalues. zeros of Bessel functions). and 16 (integration). but there are many, many others. To solve (1) when there is no formula for the exact solution. we can use an approximation method. in plli1icular an iteration method, that is, a method in which we start from an initial guess .1'0 (which may be poor) and compute step by step (in general better and better) approximations Xl' X2' ... of an unknown solution of (1). We discuss three such methods that are of particular practical importance and mention two others in the problem set. These methods and the underlying principles are basic for understanding the diverse methods in software packages. In general, iteration methods are easy to program because the computational operations are the smne in each step-just the data change from step to step--and. more important. if in a concrete case a method converges. it is stable (see Sec. 19.1) in generaL
Fixed-Point Iteration for Solving Equations
f{x) = 0
Our present use of the word "fixed point"" has absolutely nothing to do with that in the last section. In one way or another we transform (1) algebraically into the form
(2) Then we choose an Xo and compute
(3)
Xl
x
=
=
g(xo), X2
g(x).
=
g(Xl)' and in general
(11 = 0, 1, .. ').
A solution of (2) is called a fixed point of g, motivating the name of the method. This is a solution of 0). since from X = g(x) we can return to the original form f(x) = O. From (I) we may get several different forms of (2). The behavior of corresponding iterative sequences Xo. x l ' . . . may differ, in particular. with respect to their speed of convergence. Indeed, some of them may not converge at all. Let us illustrate these facts with a simple example.
788
CHAP. 19
Numerics in General
E X AMP L ElAn Iteration Process (Fixed-Point Iteration) Set up an iteration process for the equation f(x) = x 2 x = 1.5 ±
vT.25.
thus
-
3x
+I
O. Since we know the solutions
=
and
2.618034
0.381966.
we can watch the behavior of the error as the iteration proceeds.
Solution.
The equation may be written thu~
(4a)
Xn+l =
~(X71 2 +
I).
If we choose Xo = I, we obtain the sequence (Fig. -l23a: computed with 6S and then rounded) Xo = 1.000,
Xl
= 0.667,
X2 =
0.481,
X3 =
which seems to approach the smaller solution. If we choose Xo ~o = 3, we obtain the sequence (Fig. 423a, upper part) Xo = 3.000.
Xl
= 3.333.
X2
=
4.037.
X3
=
0.411.
X4
0.390,"
=
.
2, the situation is similar. If we choose
= 5.766.
=
X4
11.415 •...
which diverges. Our equation may also be written (divide by x) thu~
(4b)
xn+l =
3-
and if we choose Xo = I, we obtain the sequence (Fig. 423b) Xo = 1.000,
Xl =
2.000,
X2
= 2.500,
X3
= 2.600,
X4
2.615,· ..
=
which seems to approach the larger solution. Similarly, if we choose Xo = 3, we obtain the sequence (Fig. 423b) Xo
=
3.000,
Xl =
2.667,
X2 =
2.625,
X3 =
2.619,
X4
=
2.618,· ".
Our figures show the following. In the lower part of Fig. 423a the slope of gl(x) is less than the slope of < I, and we seem to have convergence. In the upper part, gl(x) is steeper (g~(x) > I) and we have divergence. In Fig. 423b the slope of g2lx) is less near the intersection point Ix = 2.618, fixed point of g2, solution of f(x) = 0), and both sequences seem to converge. From all this we conclude that convergence seems to depend on the fact that in a neighborhood of a solution the curve of g(x) is less steep than the straight line y = x. and we shall now see that thi~ condition Ig' (x)1 < I (= slope of y = x) is sufficient for convergence. •
y = x, which is I, thus Ig~(x)1
5
5
~/
/
c.l
/ x (a)
Fig. 423.
5
0
0
x (h)
Example 1, iterations (4a) and (4b)
g2(x)
5
SEC. 19.2
789
Solution of Equations by Iteration
An iteration process defined by (3) is called convergent for an Xo if the corresponding sequence xo, Xb ••• is convergent. A sufficient condition for convergence is given in the following theorem, which has various practical applications.
THEOREM 1
Convergence of Fixed-Point Iteration
Let x = s be a solution of x = g(x) and suppose that g has a continuous derivative in some interval J containing s. Then !f Ig' (x) I ~ K < 1 in J, the iteration process defined by (3) converges jor any Xo in J. and the limit of the sequence {xn} is s.
PROOF
By the mean value theorem of differential calculus there is a t between x and s such that g(x) - g(s)
Since g(s) = s and Xl = g(xo), Ig' (x) I in the theorem
X2
=
g'(t) (x - s)
(x
in J).
= g(Xl)' ... , we obtain from this and the condition on
Applying this inequality n times, for n, n - 1, ... , 1 gives
Since K < I, we have K n -----? 0; hence IXn
sl-----? 0 as n -----?
-
•
00.
We mention that a function g satisfying the condition in Theorem 1 is called a contraction because Ig(x) - g(v)1 ~ Klx - vi, where K < 1. Furthermore, K gives information on the speed of convergence. For instance, if K = 0.5, then the accuracy increases by at least 2 digits in only 7 steps because 0.5 7 < 0.01. E X AMP L E 2
An Iteration Process. Illustration of Theorem 1
+ x-I ~ a by iteration.
Find a solution of I(x) = ..3
Solution.
A sketch shows that a solution lies near x = 1. We may write the equation as (x 2
1 x = gl(x) = - - - 2 .
1
+
SO
x
that
xn+ 1 =
---2 .
1 +xn
+
l)x = lor
Also
for any x because 4x2 /(1 + x 2 )4 = 4x 2 /(1 + 4x 2 + ... ) < 1, so that by Theorem 1 we have convergence for any Xo. Choosing Xo = 1. we obtaio (Fig. 424 on p. 790) Xl ~
0.500,
X2
= 0.800,
X3
=
0.610,
X4
= 0.729,
X5
= 0.653,
X6
= 0.701, ....
The solution exact to 6D is s = 0.682 328. The given equation may also be written Then and this is greater than 1 near the solution, so that we cannot apply Theorem I and as~ert convergence. Try = 0.5. Xo = 2 and see what happens. ,The example shows that the transformation of a given i(x) = a into the form x = g(x) with g satisfying Ig (x)1 3 K < 1 may need some experimentation. •
Xo = 1. Xo
790
CHAP. 19
Numerics in General
1.0
0.5 x
Fig. 424.
Iteration in Example 2
Newton's Method for Solving Equations t(x) = 0 Newton's method, also known as Newton-Raphson's method,! is another iteration method for solving equations f(x) = 0, where f is assumed to have a continuous derivative f'. The method is commonly used because of its simplicity and great speed. The underlying idea is that we approximate the graph of f by suitable tangents. Using an approximate value Xo obtained from the graph of f, we let Xl be the point of intersection of the x-axis and the tangent to the curve of f at Xo (see Fig. 425). Then hence In the second step we compute X2 = Xl - f(xIV!' (Xl), in the third step X3 from X2 again by the same formula, and so on. We thus have the algorithm shown in Table 19.1. Formula (5) in this algorithm can also be obtained if we algebraically solve Taylor's formula (5*)
y
x
Fig. 425.
Newton's method
IJOSEPH RAPHSON (]648-l715), English mathematician who published a method similar to Newton's method. For historical details. see Ref. [GR2], p. 203. listed in App. I.
SEC. 19.2
791
Solution of Equations by Iteration
Table 19.1
Newton's Method for Solving Equations I(x)
ALGORITHM NEWTON (j,
t' , XO,
E,
=0
N)
This algorithm computes a solution of f(x) = 0 given an initial approximation Xo (starting value of the iteration). Here the function f(x) is continuous and has a continuous derivative f' tx).
f, f', initial approximation xo, tolerance
INPUT:
€
> O. maximum number of
iterations N. OUTPUT: For n
Approximate solution
Xn
(n ~ N) or message of failure.
0, I, 2, ... , N - 1 do:
=
Compute
t' (x,,).
If f' (xn) = 0 then OUTPUT "Failure". Stop.
2
[Procedure completed unsuccessfully]
Else compute
3
(5)
4
If
IXn+l -
xnl ~
Elx,J then OUTPUT xn +
1.
Stop.
[Procedure completed sllccessfullv]
End 5
OUTPUT "Failure". Stop. [Procedure completed unsuccessfully after N iterations]
End NEWTON
If it happens that t' (x n ) = 0 for some n (see line 2 of the algorithm), then try another statting value Xo. Line 3 is the heart of Newton's method. The inequality in line 4 is a termination criterion. [f the sequence of the Xn converges and the criterion holds, we have reached the desired accuracy and stop. In this line the factor IXnl is needed in the case of zeros of very small (or very large) absolute value because of the high density (or of the scarcity) of machine numbers for those x. WARNING! The criterion by itself does not imply convergence. Example. The harmonic series diverges, although its partial sums Xn = Lk~l 11k satisfy the criterion because lim (Xn+l - xn) = lim (1/(n + 1) = O. Line 5 gives another termination criterion and is needed because Newton's method may diverge or, due to a POOT choice of xo, may not reach the desired accuracy by a reasonable number of iterations. Then we may try another Xo. If I(x) = 0 has more than one solution, different choices of Xo may produce different solutions. Also, an iterative sequence may sometimes converge to a solution different from the expected one.
792 E X AMP L E 3
CHAP. 19
Numerics in General
Square Root Set up a Newton iteration for computing the square root x of a given positive number c and apply it to c = 2.
Solutioll.
We have x =
Vc. hence f(x) =
r2 - c
= O.
f' (x)
=
X3 =
1.414216,
2r. and (5) takes the form
For c = 2, choosing '\0 = I. we obtain Xl X4
E X AMP L E 4
= 1.500000,
=
X2
1.416667,
X4 =
1.414214....
•
is exact to 6D.
Iteration for a Transcendental Equation Find the positive solution of 2 sin x = x.
Solution.
=
Setting f(x)
xn+l
r - 2 sin x. we have
=
Xn -
Xn -
f' (x) =
2 sinxn
I - 2 cos x. and (5) gives
2( sin xn - xn cos xn)
1- 2cosxn
1- 2cosxn
From the graph of f we conclude that the solution is near Xo = 2. We compute:
I:
1
2
3 X4 =
E X AMP L E 5
2.00000 1.90100 1.89552 1.89550
1.83229 1.64847 1.63809 1.63806
3.48318 3.12470 3.10500 3.10493
1.90100 1.89552 1.89550 1.89549
1.89549 is exact to 5D since the solution to 6D is
1.~95
•
494.
Newton's Method Applied to an Algebraic Equation Apply Newton's method to the equation f(x) = x3
Solution.
+X-
I =
o.
From (5) we have
Starting from Xo = 1. we obtain Xl =
0.750000,
X2
= 0.686047,
X3 =
0.682 340,
X4 =
0.682 328, ...
where .'4 ha~ the error -I . 10- . A comparison with Example 2 shows that the present convergence is much • more rapid. This may motivate the concept of the order of all iteration process, to be discussed next. 6
Order of an Iteration Method. Speed of Convergence The quality of an iteration method may be characterized by the speed of convergence, as follows.
Let xn+ 1
=
= g(x'Y!)
g(x). Then
define an iteration method, and let Xn approximate a solution s of
=
S - En' where En is the error of X n . Suppose that g is differentiable a number of times, so that the Taylor formula gives
x
Xn
Xn+l
(6)
=
g(xn )
=
g(s)
= g(s)
+ g'(s)(xn - g'(S)En
- s)
+ h"(s)(xn
+ !g"(S)En 2 +
-
S)2
+
SEC. 19.2
793
Solution of Equations by Iteration
The exponent of En in the first oonvanishing term after g(s) is called the order of the iteration process defined by g. The order measures the speed of convergence. To see this, subtract g(s) = s on both sides of (6). Then on the left you get Xn+l - S = -E,,+l, where E,,+1 is the error of Xn+l' And on the right the remaining expression equals approximately its first nonzero term because IEnl is small in the case of convergence. Thus (a)
in the case of first order,
E,,+l=+g'(S)En
(7)
in the case of second order,
etc.
Thus if En = lO-k in some step, then for second order. En+l = C· (l0-k)2 = c· 1O-2k, so that the number of significant digits is about doubled in each step.
Convergence of Newton's Method In Newton's method, g(x)
=
x - f(x)/f' (x). By differentiation,
g'(x) = 1 -
f' (X)2
-
f
(8)
f(.T)f"(X)
,
2
(x)
f(x)f"(x}
f' (X)2 Since f(s) = 0, this shows that also g'(s) = O. Hence Newton's method is at least of second order. If we differentiate again and set x = s, we find that
(8*)
g"(S) =
f"(S) f'(s)
which will oot be zero in general. This proves
THEOREM 2
Second-Order Convergence of Newton's Method
if f(x) is three times differentiable and f' and f" are not z.ero at a solution s of f(x)
=
0, then for Xo sufficiently close to s, Newton's method is of second order.
Comments.
For Newton's method, (7b) becomes, by (8*),
For the rapid convergence of the method indicated in Theorem 2 it is important that s be a simple zero of f(x) (thUS (s) =I=- 0) and that Xo be close to s, because in Taylor's formula we took only the linear term [see (5*)], assuming the quadratic term to be negligibly small. (With a bad Xo the method may even diverge!)
f'
794 E X AMP L E 6
CHAP. 19
Numerics in General
Prior Error Estimate of the Number of Newton Iteration Steps Use .1"0 = 2 and xl = I.YUl 111 Example 4 for estimating how many Iteration steps we need to produce the solution to 5D accuracy. This is an a priori estimate or prior estimate because we can compute it after only one iteration, prior to further iterations.
Solution.
We have I(x} =
x-
2 sin X = O. Differentiation gives
{'(s)
{'(Xl) 2 sin x] = - - - = ---.....:.--- =
,
2J (s)
2I' (.~I)
0.57.
Hence (Y) gives 1"n+ll
= 0.57"n2 = 0.57(0.57"~_IJ2 =
where M = 2n + 2 n condition becomes
I
+ ... + 2 + I = 2n+I -
0.573"~t_l
= ... = 0.57M,,~+I ;;: 5' 10-6
I. We show below that Eo = -0.11. Consequently. our
Hence II = 2 is the smallest possible n. according to this cnlde estimate. in good agrecment with Example 4. "0 = -0.11 is obtained from "1 - Eo = ("1 - s) - (Eo - s) = -Xl + Xo = 0.10. hence 2 2 "1 = "0 + 0.10 = -0.57"0 or 0.57"0 + "0 + 0.10 = O. which gives "0 = -0.11. •
Difficulties may arise if It' (x) I is very small near a solution s of f(x) = o. for instance, if s is a zero of f(x) of second (or higher) order (so that Newton's method converges only linearly, as an application of I'H6pital's rule to (8) shows). Geometrically, small It' (x) I means that the tangent of f(x) near s almost coincides with the x-axis (so that double precision may be needed to get f(x) and f' (x) accurately enough). Then for values x = far away from s we can still have small function values
Difficulties in Newton's Method.
s
R(s)
= fCS).
In this case we call the equation f(x) = 0 ill-conditioned. R(s) is called the residual of f(x) = 0 at S. Thus a small residual guarantees a small error of only if the equation is not iII-conditioned.
s
E X AMP L E 7
An Ill-Conditioned Equation J(x) = X5 + 1O- 4.t = 0 is ill·conditioned. x = 0 is a solution . .t' (0) = 10-4 is small. At s = 0.1 the residual J(O.I) = 2· 10- 5 is small, but the en-or -0.1 is laIger in absolute value by a factor 5000. Invent a more drastic example of your own. •
Secant Method for Solving {(x)
=
0
Newton's method is very powerful but has the disadvantage that the derivative f' may sometimes be a far more difficult expression than f itself and its evaluation therefore computationally expensive. This situation suggests the idea of replacing the derivative with the difference quotient
t' (xn ) =
f(x n )
-
Xn -
f(x n -
1)
Xn - 1
Then instead of (5) we have the formula of the popular secant method
SEC. 19.2
Solution of Equations by Iteration
795 y
x
Fig. 426.
Secant method
(10)
Geometrically, we intersect the x-axis at -',,+1 with the secant of f(x) passing through Pn - 1 and Pn in Fig. 426. We need two starting values Xo and xl' Evaluation of derivatives is now avoided. [t can be shown that convergence is superlinear (that is, more rapid than linear, IEn+11 = const 'IEnIL62; see [E5] in App. 1), almost quadratic like Newton's method. The algorithm is similar to that of Newton's method, as the student may show. CAUTION!
It is not good to write (0) as
xn-d(xn) - -'llf(Xn -1) f(xn) - f(X n -1)
because this may lead to loss of significant digits if Xn and Xn-l are about equal. (Can you see this from the formula?) EXA M P L E 8
Secant Method Find the positive solution of f(x) = x - 2 sin x = 0 by the secant method, starting from .Io = 2,
Solution.
Xl =
1.9.
Here, (10) is
Numerical values are:
n
X n -1
xn
Nn
D"
2
2.000000 1.900000 1.895747
1.900000 1.895747 1.895494
-0.000740 -0.000002 0
-0.174005 -0.006986
3 X3
= 1.895 494 is exact
to
6D. See Example 4.
-0.004253 -0.000252
o
•
Summary of Methods. The methods for computing solutions s of f(x) = 0 with given continuous (or differentiable) f(x) start with an initial approximation Xo of s and generate a sequence Xl, X2, . . . by iteration. Fixed point methods solve f(x) = 0 written as x = g(x), so that s is a fixed point of g, that is, s = g(s). For g(x) = X - f(x)li' (X) this is Newton's method, which for good Xo and simple zeros converges quadratically (and for multiple zeros linearly). From Newton's method the secant method follows by replacing f' (x) by a difference quotient. The bisection method and the method of false position in Problem Set 19.2 always converge, but often slowly.
CHAP. 19
796
Numerics in General
...... - . --11-71 FIXED-POINT ITERATION Apply fixed-point iteration and answer related questions where indicated. Show details of your work. 1. x = 1.4 sin x, Xo = 1.4
2. 00 the iterations indicated at the end of Example 2. Sketch a figure similar to Fig. 424.
3. Why do we obtain a monotone sequence in Example I, but not in Example 2?
f = X4 - X + 0.2 = 0, the root near I, Xo 5. f as in Prob. 4, the root near 0, Xo = 0
=
6. Find the smallest positive solution of sin x
e- 0 . 5X ,
4.
Xo =
=
I.
7. (Bessel functions, drumhead) A partial sum of the Maclaurin series of JoCr) (Sec. 5.5) is f(x) = I - !x 2 + i'4x4 - 23~4X6. Conclude from a sketch that f(x) = 0 near x = 2. Write f(x) = 0 as x = g(x) (by dividing f(x) by !x and taking the resulting x-term to the other side). Find the zero. (See Sec. 12.9 for the importance of these zeros.) 8. CAS PROJECT. Fixed-Point Iteration. (a) Existence. Prove that if g is continuous in a closed interval f and its range lies in f, then the equation x = g(x) has at least one solution in f. Illustrate that it may have more than one solution in f. (b) Convergence. Let f(x) = x 3 + 2r2 - 3x - 4 = o. Write this asx = g(x), for g choosing (1) (x 3 - /)113, (2) (r2 - !n1l2, (3) x + !f, (4) x(l + !f), (5) (x 3 - f)lx 2 , (6) (2x 2 - f)/2x, (7) x - flf' and in each case Xo = 1.5. Find out about convergence and divergence and the number of steps to reach exact 6S-values of a root.
19-181 NEWTON'S METHOD Apply Newton's method (60 accuracy). First sketch the function(s) to see what is going on. ~ ~nx
= cot x,
Xo
=
12. x
-
5x
+ In x
+ 3 =
2,
16. (Legendre polynomials) Find the largest root of the Legendre polynomial P 5 (x) given by 5 P5(X) = k(63x - 70x 3 + 15x) (Sec 5.3) (to be needed in Gauss integration in Sec. 19.5) (a) by Newton's method, (b) from a quadratic equation. 17. Design a Newton iteration for cube roots and compute ~ (60, Xo = 2). 18. Design a Newton iteration for -\7c" (c > 0). Use it to 3-.4(::;- ,5(::;compute \12, V'2, v 2, v 2 (60, Xo = I). 19. TEAM PROJECT. Bisection Method. This simple but slowly convergent method for finding a solution of f(x) = 0 with continuous f is based on the intermediate value theorem, which states that if a continuous function f has opposite signs at some x = a and x = b (> a). that is, either f(a) < 0, feb) > 0 or f(a) > 0, feb) < 0, then f must be 0 somewhere on [a, b]. The solution is found by repeated bisection of the interval and in each iteration picking that half which also satisfies that sign condition. (a) Algorithm. Write an algorithm for the method. (b) Comparison. Solve x
= cosx by Newton's method and by bisection. Compare.
(e) Solve e- x = In x and eX
+ x4 + X
2 by bisection.
=
20. TEAM PROJECT. Method of False Position (Regula falsi). Figure 427 shows the idea. We assume that f is continuous. We compute the x-intercept Co of the line through (ao, f(ao), (b o , f(bo». If f(co) = 0, we are done. If f(ao)f(c o ) < 0 (as in Fig. 427), we set a 1 = ao, b 1 = Co and repeat to get Cl, etc. If f(ao)ffco) > 0, then f(co)f(b o ) < 0 and we set al = Co, b 1 = b o, etc. (a) Algorithm. Show that
10. x = cos x, xo
11. x 3
15. (Associated Legendre functions) Find the smallest positive zero of 2 P4 = (I - X2)p~ = ¥(-7x 4 + 8x 2 - 1) (Sec. 5.3) (a) by Newton's method, (b) exactly, by solving a quadratic equation.
0, Xo = 2 Xo =
2
13. (Vibrating beam) Find the solution of cos x cosh x = I near x = ~ 7T. (This determines a frequency of a vibrating beam: see Problem Set 12.3.) 14. (Heating. cooling) At what time x (4S-accuracy only) will the processes governed by fl(X) = 100(1 - e- O.2x ) and f2(X) = 4Oe- O.Olx reach the same temperature? Also find the latter.
Co
=
aof(bo ) - bof(ao) f(bo) - f(ao)
and write an algorithm for the method. (b) Comparison. Solve x 3 = 5x + 6 by Newton's method, the secant method, and the method of false position. Compare. (C) Solve X4 = 2, cos x = V~, and by [he method of false position.
x
+
In
x
2
SEC. 19.3
797
Interpolation
y
121-241
~/
SECANT METHOD
Solve. using
~ y=f(x) x
Xo
and
23. Prob. 9,
=
0.5
Xl
=
J
Fig. 427.
19.3
indicated.
= 2.0 = I. Xl
I.
Xo =
24. Prob. 10, Xo
/(
Xl i1~
21. Prob. ll, Xo = 0.5, Xl 22. e- x - tan X = o. Xo
=
Xl
0.5,
0.7
I
25. WRITING PROJECT. Solution of Equations. Compare the methods in this section and problem set, discussing advantages and disadvantages using examples of your own.
Method of false position
Interpolation Interpolation means finding (approximate) values of a function f(x) for an X between Xl' • • . , xn at which the values of f(x) are given. These values may come from a "mathematical" function, such as a logarithm or a Bessel function, or, perhaps more frequently, they may be measured or automatically recorded values of an "empirical" function, such as the air resistance of a car or an airplane at different speeds, or the yield of a chemical process at different temperatures, or the size of the U.S. population as it appears from censuses taken at IO-year intervals. We write these given values of a function f in the form
different x-values xo,
fo
=
f(xo),
fn
= f(x,J
or as ordered pairs
A standard idea in interpolation now is to find a polynomial Pn(x) of degree n (or less) that assumes the given values; thus (1)
We call this Pn an interpolation polynomial and xo, ... , xn the nodes. And if f(x) is a mathematical function, we call Pn an approximation of f (or a polynomial approximation, because there are other kinds of approximations, as we shall see later). We use Pn to get (approximate) values of f for x's between Xo and xn ("interpolation") or sometimes outside this interval Xo ~ x ~ Xn ("extrapolation"). Motivation. Polynomials are convenient to work with because we can readily differentiate and integrate them. again obtaining polynomials. Moreover, they approximate continuous functions with any desired accuracy. That is. for any continuous f(x) on an interval 1: a ~ x ~ b and error bound f3 > 0, there is a polynomial Pn(x) (of sufficiently high degree n) such that If(x) - Pn(X)
I < f3
for all x on 1.
This is the famous Weierstrass approximation theorem (for a proof see Ref. [GR7], p. 280; see App. O.
798
CHAP. 19
Numerics in General
Existence and Uniqueness. Pn satisfying (I) for given data exists-we give formulas for it below. Pn is unique. Indeed, if another polynomial qn also satisfies lJn(.ro) = fo, ... , lJn(xn) = f n • then Pn(x) - lJn(x) = 0 at xo . ...• X n , but a polynomial Pn - qn of degree 11 (or less) with 11 + I roots must be identically zero, as we know from algebra; thus Pn(x) = q,,{x) for all x, which means uniqueness. • How to Find Pn? This is the important practical question. We answer it by explaining several standard methods. For given data. these methods give the same polynomial. by the uniqueness just proved (which is thus of practical interest!), but expressed in several forms suitable for different purposes.
Lagrange Interpolation Given (xo, f 0), (Xl' f 1). ' , , , (Xn , f n) with arbitrarily spaced Xj' Lagrange had the idea of multiplying each f j by a polynomial that is I at Xj and 0 at the other 1l nodes and then taking the sum of these 11 + 1 polynomials. Clearly, this gives the unique interpolation polynomial of degree 11 or less. Beginning with the simplest case, let us see how this works.
Linear interpolation is interpolation by the straight line through (x0, f0), (x1, f1); see Fig. 428. Thus the linear Lagrange polynomial p1 is a sum p1 = L0 f0 + L1 f1 with L0 the linear polynomial that is 1 at x0 and 0 at x1; similarly, L1 is 0 at x0 and 1 at x1. Obviously,

    L0(x) = (x − x1)/(x0 − x1),        L1(x) = (x − x0)/(x1 − x0).

This gives the linear Lagrange polynomial

(2)    p1(x) = L0(x) f0 + L1(x) f1 = ((x − x1)/(x0 − x1)) f0 + ((x − x0)/(x1 − x0)) f1.

Fig. 428. Linear interpolation

EXAMPLE 1
Linear Lagrange Interpolation
Compute a 4D-value of ln 9.2 from ln 9.0 = 2.1972, ln 9.5 = 2.2513 by linear Lagrange interpolation and determine the error, using ln 9.2 = 2.2192 (4D).

Solution. x0 = 9.0, x1 = 9.5, f0 = ln 9.0, f1 = ln 9.5. In (2) we need

    L0(x) = (x − 9.5)/(−0.5) = −2.0(x − 9.5),    L0(9.2) = −2.0(−0.3) = 0.6,
    L1(x) = (x − 9.0)/0.5 = 2.0(x − 9.0),        L1(9.2) = 2.0 · 0.2 = 0.4

(see Fig. 429) and obtain the answer

    ln 9.2 ≈ p1(9.2) = L0(9.2) f0 + L1(9.2) f1 = 0.6 · 2.1972 + 0.4 · 2.2513 = 2.2188.

The error is ε = a − ã = 2.2192 − 2.2188 = 0.0004. Hence linear interpolation is not sufficient here to get 4D-accuracy; it would suffice for 3D-accuracy.

Fig. 429. L0 and L1 in Example 1
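Formula (2) translates directly into code. The following minimal Python sketch (ours, not the book's; the function name lagrange_linear is invented for this illustration) reproduces the numbers of Example 1:

    def lagrange_linear(x, x0, x1, f0, f1):
        # Linear Lagrange polynomial (2)
        L0 = (x - x1) / (x0 - x1)      # 1 at x0, 0 at x1
        L1 = (x - x0) / (x1 - x0)      # 0 at x0, 1 at x1
        return L0 * f0 + L1 * f1

    # Example 1: 4D-values of ln 9.0 and ln 9.5
    print(round(lagrange_linear(9.2, 9.0, 9.5, 2.1972, 2.2513), 4))   # 2.2188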
Quadratic interpolation is interpolation of given (x0, f0), (x1, f1), (x2, f2) by a second-degree polynomial p2(x), which by Lagrange's idea is

(3a)    p2(x) = L0(x) f0 + L1(x) f1 + L2(x) f2

with

(3b)    L0(x) = l0(x)/l0(x0) = ((x − x1)(x − x2)) / ((x0 − x1)(x0 − x2)),
        L1(x) = l1(x)/l1(x1) = ((x − x0)(x − x2)) / ((x1 − x0)(x1 − x2)),
        L2(x) = l2(x)/l2(x2) = ((x − x0)(x − x1)) / ((x2 − x0)(x2 − x1)).

How did we get this? Well, the numerator makes Lk(xj) = 0 if j ≠ k. And the denominator makes Lk(xk) = 1 because it equals the numerator at x = xk.

EXAMPLE 2
Quadratic Lagrange Interpolation
Compute ln 9.2 by (3) from the data in Example 1 and the additional third value ln 11.0 = 2.3979.

Solution. In (3),

    L0(x) = ((x − 9.5)(x − 11.0)) / ((9.0 − 9.5)(9.0 − 11.0)) = x² − 20.5x + 104.5,          L0(9.2) = 0.5400,
    L1(x) = ((x − 9.0)(x − 11.0)) / ((9.5 − 9.0)(9.5 − 11.0)) = −(1/0.75)(x² − 20x + 99),    L1(9.2) = 0.4800,
    L2(x) = ((x − 9.0)(x − 9.5)) / ((11.0 − 9.0)(11.0 − 9.5)) = (1/3)(x² − 18.5x + 85.5),    L2(9.2) = −0.0200

(see Fig. 430), so that (3a) gives, exact to 4D,

    ln 9.2 ≈ p2(9.2) = 0.5400 · 2.1972 + 0.4800 · 2.2513 − 0.0200 · 2.3979 = 2.2192.

Fig. 430. L0, L1, L2 in Example 2
General Lagrange Interpolation Polynomial. For general n we obtain

(4a)    f(x) ≈ pn(x) = Σ (k = 0 to n) Lk(x) fk = Σ (k = 0 to n) (lk(x)/lk(xk)) fk

where Lk(xk) = 1 and Lk is 0 at the other nodes, and the Lk are independent of the function f to be interpolated. We get (4a) if we take

(4b)    l0(x) = (x − x1)(x − x2) ··· (x − xn),
        lk(x) = (x − x0) ··· (x − x_{k-1})(x − x_{k+1}) ··· (x − xn),    0 < k < n,
        ln(x) = (x − x0)(x − x1) ··· (x − x_{n-1}).

We can easily see that pn(xk) = fk. Indeed, inspection of (4b) shows that lk(xj) = 0 if j ≠ k, so that for x = xk, the sum in (4a) reduces to the single term (lk(xk)/lk(xk)) fk = fk.
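Formula (4a) with the products (4b) can be evaluated directly. Here is a short Python sketch of (4a)-(4b), written for this discussion rather than taken from the book; checking it against Example 2 reproduces the 4D-value 2.2192:

    def lagrange_interp(x, xs, fs):
        # General Lagrange polynomial (4a) with the lk of (4b)
        p = 0.0
        for k in range(len(xs)):
            Lk = 1.0
            for j in range(len(xs)):
                if j != k:                   # all factors except (x - xk)
                    Lk *= (x - xs[j]) / (xs[k] - xs[j])
            p += Lk * fs[k]
        return p

    # Quadratic case of Example 2: ln 9.2 from ln 9.0, ln 9.5, ln 11.0
    print(round(lagrange_interp(9.2, [9.0, 9.5, 11.0],
                                [2.1972, 2.2513, 2.3979]), 4))    # 2.2192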
Error Estimate. If f is itself a polynomial of degree n (or less), it must coincide with pn because the n + 1 data (x0, f0), ..., (xn, fn) determine a polynomial uniquely, so the error is zero. Now the special f has its (n + 1)st derivative identically zero. This makes it plausible that for a general f its (n + 1)st derivative f^(n+1) should measure the error

    εn(x) = f(x) − pn(x).

It can be shown that this is true if f^(n+1) exists and is continuous. Then, with a suitable t between x0 and xn (or between x0, xn, and x if we extrapolate),

(5)    εn(x) = f(x) − pn(x) = (x − x0)(x − x1) ··· (x − xn) f^(n+1)(t) / (n + 1)!.

Thus |εn(x)| is 0 at the nodes and small near them, because of continuity. The product (x − x0) ··· (x − xn) is large for x away from the nodes. This makes extrapolation risky. And interpolation at an x will be best if we choose nodes on both sides of that x. Also, we get error bounds by taking the smallest and the largest value of f^(n+1)(t) in (5) on the interval x0 ≤ t ≤ xn (or on the interval also containing x if we extrapolate).
Most importantly, since pn is unique, as we have shown, we have

THEOREM 1  Error of Interpolation

Formula (5) gives the error for any polynomial interpolation method if f(x) has a continuous (n + 1)st derivative.

Practical error estimate. If the derivative in (5) is difficult or impossible to obtain, apply the Error Principle (Sec. 19.1), that is, take another node and the Lagrange polynomial p_{n+1}(x) and regard p_{n+1}(x) − pn(x) as a (crude) error estimate for pn(x).
EXAMPLE 3
Error Estimate (5) of Linear Interpolation. Damage by Roundoff. Error Principle
Estimate the error in Example 1 first by (5) directly and then by the Error Principle (Sec. 19.1).

Solution. (A) Estimation by (5). We have n = 1, f(t) = ln t, f'(t) = 1/t, f''(t) = −1/t². Hence

    ε1(x) = (x − 9.0)(x − 9.5)(−1/t²)/2!,    thus    ε1(9.2) = 0.03/t².

t = 9.0 gives the maximum 0.03/9² = 0.00037 and t = 9.5 gives the minimum 0.03/9.5² = 0.00033, so that we get 0.00033 ≤ ε1(9.2) ≤ 0.00037, or better, 0.00038 because 0.03/81 = 0.000370... But the error 0.0004 in Example 1 disagrees, and we can learn something! Repetition of the computation there with 5D instead of 4D gives

    ln 9.2 ≈ p1(9.2) = 0.6 · 2.19722 + 0.4 · 2.25129 = 2.21885

with an actual error ε = 2.21920 − 2.21885 = 0.00035, which lies nicely near the middle between our two error bounds. This shows that the discrepancy (0.0004 vs. 0.00035) was caused by rounding, which is not taken into account in (5).
(B) Estimation by the Error Principle. We calculate p1(9.2) = 2.21885 as before and then p2(9.2) as in Example 2 but with 5D, obtaining

    p2(9.2) = 0.54 · 2.19722 + 0.48 · 2.25129 − 0.02 · 2.39790 = 2.21916.

The difference p2(9.2) − p1(9.2) = 0.00031 is the approximate error of p1(9.2) that we wanted to obtain; this is an approximation of the actual error 0.00035 given above.
Newton's Divided Difference Interpolation
For given data (x0, f0), ..., (xn, fn) the interpolation polynomial pn(x) satisfying (1) is unique, as we have shown. But for different purposes we may use pn(x) in different forms. Lagrange's form just discussed is useful for deriving formulas in numeric differentiation (approximation formulas for derivatives) and integration (Sec. 19.5).
Practically more important are Newton's forms of pn(x), which we shall also use for solving ODEs (in Sec. 21.2). They involve fewer arithmetic operations than Lagrange's form. Moreover, it often happens that we have to increase the degree n to reach a required accuracy. Then in Newton's forms we can use all the previous work and just add another term, a possibility without counterpart for Lagrange's form. This also simplifies the application of the Error Principle (used in Example 3 for Lagrange). The details of these ideas are as follows.
Let p_{n-1}(x) be the (n − 1)st Newton polynomial (whose form we shall determine); thus p_{n-1}(x0) = f0, p_{n-1}(x1) = f1, ..., p_{n-1}(x_{n-1}) = f_{n-1}. Furthermore, let us write the nth Newton polynomial as
(6)    pn(x) = p_{n-1}(x) + gn(x);

hence

(6')    gn(x) = pn(x) − p_{n-1}(x).

Here gn(x) is to be determined so that pn(x0) = f0, pn(x1) = f1, ..., pn(xn) = fn. Since pn and p_{n-1} agree at x0, ..., x_{n-1}, we see that gn is zero there. Also, gn will generally be a polynomial of nth degree because so is pn, whereas p_{n-1} can be of degree n − 1 at most. Hence gn must be of the form

(6'')    gn(x) = an(x − x0)(x − x1) ··· (x − x_{n-1}).
We determine the constant an. For this we set x = xn and solve (6'') algebraically for an. Replacing gn(xn) according to (6') and using pn(xn) = fn, we see that this gives

(7)    an = (fn − p_{n-1}(xn)) / ((xn − x0)(xn − x1) ··· (xn − x_{n-1})).

We write ak instead of an and show that ak equals the kth divided difference, recursively denoted and defined as follows:

    f[x0, x1] = (f1 − f0)/(x1 − x0),    f[x0, x1, x2] = (f[x1, x2] − f[x0, x1])/(x2 − x0),

and in general

(8)    f[x0, ..., xk] = (f[x1, ..., xk] − f[x0, ..., x_{k-1}]) / (xk − x0).
PROOF  If n = 1, then p_{n-1}(xn) = p0(x1) = f0 because p0(x) is constant and equal to f0, the value of f(x) at x0. Hence (7) gives

    a1 = (f1 − f0)/(x1 − x0) = f[x0, x1],

and (6) and (6'') give the Newton interpolation polynomial of the first degree

    p1(x) = f0 + (x − x0) f[x0, x1].

If n = 2, then this p1 and (7) give

    a2 = (f2 − p1(x2)) / ((x2 − x0)(x2 − x1)) = f[x0, x1, x2],

where the last equality follows by straightforward calculation and comparison with the definition of the right side. (Verify it; be patient.) From (6) and (6'') we thus obtain the second Newton polynomial

    p2(x) = f0 + (x − x0) f[x0, x1] + (x − x0)(x − x1) f[x0, x1, x2].

For n = k, formula (6) gives

    pk(x) = p_{k-1}(x) + (x − x0)(x − x1) ··· (x − x_{k-1}) f[x0, ..., xk].

With p0(x) = f0, repeated application with k = 1, ..., n finally gives Newton's
divided difference interpolation formula

(10)    f(x) ≈ f0 + (x − x0) f[x0, x1] + (x − x0)(x − x1) f[x0, x1, x2]
             + ··· + (x − x0)(x − x1) ··· (x − x_{n-1}) f[x0, ..., xn].
An algorithm is shown in Table 19.2. The first do-loop computes the divided differences and the second the desired value pn(x̃).
Example 4 shows how to arrange differences near the values from which they are obtained; the latter always stand a half-line above and a half-line below in the preceding column. Such an arrangement is called a (divided) difference table.

Table 19.2  Newton's Divided Difference Interpolation

ALGORITHM INTERPOL (x0, ..., xn; f0, ..., fn; x̃)
This algorithm computes an approximation pn(x̃) of f(x̃) at x̃.

INPUT: Data (x0, f0), (x1, f1), ..., (xn, fn); x̃
OUTPUT: Approximation pn(x̃) of f(x̃)

Set f[xj] = fj  (j = 0, ..., n).
For m = 1, ..., n do:
    For j = 0, ..., n − m do:
        f[xj, ..., x_{j+m}] = (f[x_{j+1}, ..., x_{j+m}] − f[xj, ..., x_{j+m-1}]) / (x_{j+m} − xj)
    End
End
Set p0(x̃) = f0.
For k = 1, ..., n do:
    pk(x̃) = p_{k-1}(x̃) + (x̃ − x0) ··· (x̃ − x_{k-1}) f[x0, ..., xk]
End
OUTPUT pn(x̃)
End INTERPOL
EXAMPLE 4  Newton's Divided Difference Interpolation Formula
Compute f(9.2) from the values shown in the first two columns of the following table.

    xj       fj = f(xj)    f[xj, x_{j+1}]    f[xj, ..., x_{j+2}]    f[xj, ..., x_{j+3}]
    8.0      2.079442
                           0.117783
    9.0      2.197225                        −0.006433
                           0.108134                                 0.000411
    9.5      2.251292                        −0.005200
                           0.097735
    11.0     2.397895

Solution. We compute the divided differences as shown. Sample computation: (0.097735 − 0.108134)/(11 − 9) = −0.005200. The values we need in (10) stand along the top edge of the table: 2.079442, 0.117783, −0.006433, 0.000411. We have

    f(x) ≈ p3(x) = 2.079442 + 0.117783(x − 8.0) − 0.006433(x − 8.0)(x − 9.0)
                 + 0.000411(x − 8.0)(x − 9.0)(x − 9.5).

At x = 9.2,

    f(9.2) ≈ 2.079442 + 0.141340 − 0.001544 − 0.000030 = 2.219208.

The value exact to 6D is f(9.2) = ln 9.2 = 2.219203. Note that we can nicely see how the accuracy increases from term to term:

    p1(9.2) = 2.220782,    p2(9.2) = 2.219238,    p3(9.2) = 2.219208.
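The algorithm of Table 19.2 is easy to program. The following Python sketch is our own rendering (it overwrites the divided differences in place instead of storing the full table); applied to the data of Example 4 it reproduces 2.219208:

    def newton_interp(xs, fs, x):
        # Newton's divided difference formula (10), as in Table 19.2
        n = len(xs) - 1
        coef = list(fs)                  # coef[k] will become f[x0, ..., xk]
        for m in range(1, n + 1):        # divided differences by recursion (8)
            for j in range(n, m - 1, -1):
                coef[j] = (coef[j] - coef[j - 1]) / (xs[j] - xs[j - m])
        p, factor = coef[0], 1.0         # accumulate the terms of (10)
        for k in range(1, n + 1):
            factor *= (x - xs[k - 1])
            p += coef[k] * factor
        return p

    # Example 4: ln 9.2 from ln at 8.0, 9.0, 9.5, 11.0
    xs = [8.0, 9.0, 9.5, 11.0]
    fs = [2.079442, 2.197225, 2.251292, 2.397895]
    print(round(newton_interp(xs, fs, 9.2), 6))   # 2.219208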
Equal Spacing: Newton's Forward Difference Formula
Newton's formula (10) is valid for arbitrarily spaced nodes as they may occur in practice in experiments or observations. However, in many applications the xj's are regularly spaced, for instance in measurements taken at regular intervals of time. Then, denoting the distance by h, we can write

(11)    x0,  x1 = x0 + h,  x2 = x0 + 2h,  ...,  xn = x0 + nh.

We show how (8) and (10) now simplify considerably!
To get started, let us define the first forward difference of f at xj by

    Δfj = f_{j+1} − fj,

the second forward difference of f at xj by

    Δ²fj = Δf_{j+1} − Δfj,

and, continuing in this way, the kth forward difference of f at xj by

(12)    Δ^k fj = Δ^{k-1} f_{j+1} − Δ^{k-1} fj    (k = 1, 2, ...).
Examples and an explanation of the name "forward" follow below.
What is the point of this? We show that if we have regular spacing (11), then

(13)    f[x0, ..., xk] = (1/(k! h^k)) Δ^k f0.

We prove (13) by induction. It is true for k = 1 because x1 = x0 + h, so that

    f[x0, x1] = (f1 − f0)/(x1 − x0) = (1/h)(f1 − f0) = (1/(1! h¹)) Δf0.

Assuming (13) to be true for all forward differences of order k, we show that (13) holds for k + 1. We use (8) with k + 1 instead of k; then we use (k + 1)h = x_{k+1} − x0, resulting from (11), and finally (12) with j = 0, that is, Δ^{k+1} f0 = Δ^k f1 − Δ^k f0. This gives

    f[x0, ..., x_{k+1}] = (1/((k + 1)h)) [(1/(k! h^k)) Δ^k f1 − (1/(k! h^k)) Δ^k f0]
                        = (1/((k + 1)! h^{k+1})) Δ^{k+1} f0,

which is (13) with k + 1 instead of k. Formula (13) is proved.
In (10) we finally set x = x0 + rh. Then x − x0 = rh, x − x1 = (r − 1)h since x1 − x0 = h, and so on. With this and (13), formula (10) becomes Newton's (or Gregory²-Newton's) forward difference interpolation formula

(14)    f(x) ≈ pn(x) = Σ (s = 0 to n) (r choose s) Δ^s f0        (x = x0 + rh,  r = (x − x0)/h)

                     = f0 + rΔf0 + (r(r − 1)/2!) Δ²f0 + ··· + (r(r − 1) ··· (r − n + 1)/n!) Δ^n f0,

where the binomial coefficients in the first line are defined by

(15)    (r choose 0) = 1,    (r choose s) = r(r − 1)(r − 2) ··· (r − s + 1)/s!    (s > 0, integer)

and s! = 1 · 2 ··· s.

Error. From (5) we get, with x − x0 = rh, x − x1 = (r − 1)h, etc.,

(16)    εn(x) = f(x) − pn(x) = (h^{n+1}/(n + 1)!) r(r − 1) ··· (r − n) f^(n+1)(t)

with t as characterized in (5). Formula (16) is an exact formula for the error, but it involves the unknown t. In Example 5 (below) we show how to use (16) for obtaining an error estimate and an interval in which the true value of f(x) must lie.

Comments on Accuracy. (A) The order of magnitude of the error εn(x) is about equal to that of the next difference not used in pn(x).
(B) One should choose x0, ..., xn such that the x at which one interpolates is as well centered between x0, ..., xn as possible.

²JAMES GREGORY (1638-1675), Scots mathematician, professor at St. Andrews and Edinburgh. Δ in (14) and ∇ (below) have nothing to do with the Laplacian.
The reason for (A) is that in (16),

    |r(r − 1) ··· (r − n)| / (1 · 2 ··· (n + 1)) ≤ 1    if    |r| ≤ 1

(and actually for any r as long as we do not extrapolate). The reason for (B) is that |r(r − 1) ··· (r − n)| becomes smallest for that choice.
EXAMPLE 5  Newton's Forward Difference Formula. Error Estimation
Compute cosh 0.56 from (14) and the four values in the following table and estimate the error.

    j    xj     fj = cosh xj    Δfj          Δ²fj         Δ³fj
    0    0.5    1.127626
                                0.057839
    1    0.6    1.185465                     0.011865
                                0.069704                  0.000697
    2    0.7    1.255169                     0.012562
                                0.082266
    3    0.8    1.337435

Solution. We compute the forward differences as shown in the table. The values we need are those along the top edge: 1.127626, 0.057839, 0.011865, 0.000697. In (14) we have r = (0.56 − 0.50)/0.1 = 0.6, so that (14) gives

    cosh 0.56 ≈ 1.127626 + 0.6 · 0.057839 + (0.6(−0.4)/2) · 0.011865 + (0.6(−0.4)(−1.4)/6) · 0.000697
              = 1.127626 + 0.034703 − 0.001424 + 0.000039 = 1.160944.

Error estimate. From (16), since the fourth derivative is cosh⁽⁴⁾ t = cosh t,

    ε3(0.56) = (0.1⁴/4!) · 0.6(−0.4)(−1.4)(−2.4) cosh t = A cosh t,

where A = −0.00000336 and 0.5 ≤ t ≤ 0.8. We do not know t, but we get an inequality by taking the largest and smallest cosh t in that interval:

    A cosh 0.8 ≤ ε3(0.56) ≤ A cosh 0.5.

Since f(x) = p3(x) + ε3(x), this gives

    p3(0.56) + A cosh 0.8 ≤ cosh 0.56 ≤ p3(0.56) + A cosh 0.5.

Numeric values are

    1.160939 ≤ cosh 0.56 ≤ 1.160941.

The exact 6D-value is cosh 0.56 = 1.160941. It lies within these bounds. Such bounds are not always so tight. Also, we did not consider roundoff errors, which will depend on the number of operations.

This example also explains the name "forward difference formula": we see that the differences in the formula slope forward in the difference table.
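As an illustration, here is a minimal Python sketch of (14) (our own code, not the book's; the helper name newton_forward is invented). It builds the forward differences of (12) and sums the series with the binomial coefficients (15), reproducing Example 5:

    from math import cosh

    def newton_forward(values, h, x, x0):
        # Newton's forward difference formula (14) for equally spaced data
        d = list(values)
        deltas = [d[0]]                 # deltas[k] = Delta^k f0, see (12)
        while len(d) > 1:
            d = [d[i + 1] - d[i] for i in range(len(d) - 1)]
            deltas.append(d[0])
        r = (x - x0) / h
        p, binom = deltas[0], 1.0
        for s in range(1, len(deltas)): # binomial coefficients (15)
            binom *= (r - s + 1) / s
            p += binom * deltas[s]
        return p

    # Example 5: cosh 0.56 from cosh at 0.5, 0.6, 0.7, 0.8
    vals = [cosh(0.5 + 0.1 * j) for j in range(4)]
    print(round(newton_forward(vals, 0.1, 0.56, 0.5), 6))   # about 1.160944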
Equal Spacing: Newton's Backward Difference Formula
Instead of forward-sloping differences we may also employ backward-sloping differences. The difference table remains the same as before (same numbers, in the same positions), except for a very harmless change of the running subscript j (which we explain in Example 6, below). Nevertheless, purely for reasons of convenience it is standard to introduce a second name and notation for differences as follows. We define the first backward difference of f at xj by

    ∇fj = fj − f_{j-1},

the second backward difference of f at xj by

    ∇²fj = ∇fj − ∇f_{j-1},

and, continuing in this way, the kth backward difference of f at xj by

(17)    ∇^k fj = ∇^{k-1} fj − ∇^{k-1} f_{j-1}    (k = 1, 2, ...).

A formula similar to (14) but involving backward differences is Newton's (or Gregory-Newton's) backward difference interpolation formula

(18)    f(x) ≈ pn(x) = Σ (s = 0 to n) (r + s − 1 choose s) ∇^s f0        (x = x0 + rh,  r = (x − x0)/h)

                     = f0 + r∇f0 + (r(r + 1)/2!) ∇²f0 + ··· + (r(r + 1) ··· (r + n − 1)/n!) ∇^n f0.
EXAMPLE 6  Newton's Forward and Backward Interpolations
Compute a 7D-value of the Bessel function J0(x) for x = 1.72 from the four values in the following table, using (a) Newton's forward formula (14), (b) Newton's backward formula (18).

    j_for    j_back    xj     J0(xj)        1st Diff.     2nd Diff.     3rd Diff.
    0        −3        1.7    0.3979849
                                            −0.0579985
    1        −2        1.8    0.3399864                   −0.0001693
                                            −0.0581678                  0.0004093
    2        −1        1.9    0.2818186                    0.0002400
                                            −0.0579278
    3         0        2.0    0.2238908

Solution. The computation of the differences is the same in both cases. Only their notation differs.
(a) Forward. In (14) we have r = (1.72 − 1.70)/0.1 = 0.2, and j goes from 0 to 3 (see first column). In each column we need the first given number, and (14) thus gives

    J0(1.72) ≈ 0.3979849 + 0.2(−0.0579985) + (0.2(−0.8)/2)(−0.0001693) + (0.2(−0.8)(−1.8)/6)(0.0004093)
             = 0.3979849 − 0.0115997 + 0.0000135 + 0.0000196 = 0.3864183,

which is exact to 6D, the exact 7D-value being 0.3864185.
(b) Backward. For (18) we use j as shown in the second column, and in each column the last number. Since r = (1.72 − 2.00)/0.1 = −2.8, we thus get from (18)

    J0(1.72) ≈ 0.2238908 − 2.8(−0.0579278) + (−2.8(−1.8)/2)(0.0002400) + (−2.8(−1.8)(−0.8)/6)(0.0004093)
             = 0.2238908 + 0.1621978 + 0.0006048 − 0.0002750 = 0.3864184.
Central Difference Notation
This is a third notation for differences. The first central difference of f(x) at xj is defined by

    δfj = f_{j+1/2} − f_{j-1/2},

and the kth central difference of f(x) at xj by

(19)    δ^k fj = δ^{k-1} f_{j+1/2} − δ^{k-1} f_{j-1/2}    (k = 2, 3, ...).

Thus in this notation a difference table, for example, for f_{-1}, f0, f1, f2, looks as follows:

    x_{-1}    f_{-1}
                         δf_{-1/2}
    x0        f0                       δ²f0
                         δf_{1/2}                  δ³f_{1/2}
    x1        f1                       δ²f1
                         δf_{3/2}
    x2        f2

Central differences are used in numeric differentiation (Sec. 19.5), differential equations (Chap. 21), and centered interpolation formulas (e.g., Everett's formula in Team Project 22). These are formulas that use function values "symmetrically" located on both sides of the interpolation point x. Such values are available near the middle of a given table, where centered interpolation formulas tend to give better results than those of Newton's formulas, which do not have that "symmetry" property.
1. (Linear interpolation) Calculate p1(x) in Example 1. Compute from it ln 9.4 ≈ p1(9.4).
2. Estimate the error in Prob. 1 by (5).
3. (Quadratic interpolation) Calculate the Lagrange polynomial p2(x) for the 4D-values of the Gamma function [(24), App. 3.1], Γ(1.00) = 1.0000, Γ(1.02) = 0.9888, Γ(1.04) = 0.9784, and from it approximations of Γ(x) for x = 1.005, 1.010, 1.015, 1.025, 1.030, 1.035.
4. (Error bounds) Derive error bounds for p2(9.2) in Example 2 from (5).
5. (Error function) Calculate the Lagrange polynomial p2(x) for the 5D-values of the error function f(x) = erf x = (2/√π) ∫ (0 to x) e^(−w²) dw, namely, f(0.25) = 0.27633, f(0.5) = 0.52050, f(1) = 0.84270, and from p2 an approximation of f(0.75) (= 0.71116, 5D).
6. Derive an error bound in Prob. 5 from (5).
7. (Sine integral) Calculate the Lagrange polynomial p2(x) for the 4D-values of the sine integral Si(x) [(40) in App. 3.1], namely, Si(0) = 0, Si(1) = 0.9461, Si(2) = 1.6054, and from p2 approximations of Si(0.5) (= 0.4931, 4D) and Si(1.5) (= 1.3247, 4D).
8. (Linear and quadratic interpolation) Find e^(−0.25) and e^(−0.75) by linear interpolation with x0 = 0, x1 = 0.5 and x0 = 0.5, x1 = 1, respectively. Then find p2(x) interpolating e^(−x) with x0 = 0, x1 = 0.5, x2 = 1 and from it e^(−0.25) and e^(−0.75). Compare the errors of these linear and quadratic interpolations. Use 4D-values of e^(−x).
9. (Cubic Lagrange interpolation) Calculate and sketch or graph L0, L1, L2, L3 for x = 0, 1, 2, 3 on common axes. Find p3(x) for the data (0, 1), (1, 0.765198), (2, 0.223891), (3, −0.260052) [values of the Bessel function J0(x)]. Find p3 for x = 0.5, 1.5, 2.5 by interpolation.
10. (Interpolation and extrapolation) Calculate p2(x) in Example 2. Compute from it approximations of ln 9.4, ln 10, ln 10.5, ln 11.5, ln 12, compute the errors by using exact 4D-values, and comment.
11. (Extrapolation) Does a sketch or graph of the product of the (x − xj) in (5) for the data in Prob. 10 indicate that extrapolation is likely to involve larger errors than interpolation does?
12. (Lower degree) Find the degree of the interpolation polynomial for the data (−2, 33), (0, 5), (2, 9), (4, 45), (6, 113).
13. (Newton's forward difference formula) Set up (14) for the data in Prob. 7 and derive p2(x) from (14).
14. Set up Newton's forward difference formula for the data in Prob. 3 and compute Γ(1.01), Γ(1.03), Γ(1.05).
15. (Newton's divided difference formula) Compute f(0.8) and f(0.9) from f(0.5) = 0.479, f(1.0) = 0.841, f(2.0) = 0.909 by quadratic interpolation.
16. Compute f(6.5) from f(6.0) = 0.1506, f(7.0) = 0.3001, f(7.5) = 0.2663, f(7.7) = 0.2346 by cubic interpolation, using (10).
17. (Central differences) Write the differences in the table in Example 5 in central difference notation.
18. (Subtabulation) Compute the Bessel function J1(x) for x = 0.1, 0.3, ..., 0.9 from J1(0) = 0, J1(0.2) = 0.09950, J1(0.4) = 0.19603, J1(0.6) = 0.28670, J1(0.8) = 0.36884, J1(1.0) = 0.44005. Use (14) with n = 5.
19. (Notations) Compute a difference table of f(x) = x³ for x = 0, 1, 2, 3, 4, 5. Choose x0 = 2 and write all occurring numbers in terms of the notations (a) for central differences, (b) for forward differences, (c) for backward differences.
20. CAS EXPERIMENT. Adding Terms in Newton Formulas. Write a program for the forward formula (14). Experiment on the increase of accuracy by successively adding terms. As data use values of some function of your choice for which your CAS gives the values needed in determining errors.
21. WRITING PROJECT. Interpolation: Comparison of Methods. Make a list of 5-6 ideas that you feel are most basic in this section. Arrange them in the best logical order. Discuss them in a 2-3 page report.
22. TEAM PROJECT. Interpolation and Extrapolation.
(a) Lagrange practical error estimate (after Theorem 1). Apply this to p1(9.2) and p2(9.2) for the data x0 = 9.0, x1 = 9.5, x2 = 11.0, f0 = ln x0, f1 = ln x1, f2 = ln x2 (6S-values).
(b) Extrapolation. Given (xj, f(xj)) = (0.2, 0.9980), (0.4, 0.9686), (0.6, 0.8443), (0.8, 0.5358), (1.0, 0). Find f(0.7) from the quadratic interpolation polynomials based on (α) 0.6, 0.8, 1.0, (β) 0.4, 0.6, 0.8, (γ) 0.2, 0.4, 0.6. Compare the errors and comment. [Exact f(x) = cos(½πx²), f(0.7) = 0.7181 (4S).]
(c) Graph the product of factors (x − xj) in the error formula (5) for n = 2, ..., 10 separately. What do these graphs show regarding accuracy of interpolation and extrapolation?
(d) Central differences. Show that δ²fm = f_{m+1} − 2fm + f_{m-1}, and, furthermore, δ³f_{m+1/2} = f_{m+2} − 3f_{m+1} + 3fm − f_{m-1}, δ^n fm = Δ^n f_{m-n/2} = ∇^n f_{m+n/2}.
(e) Everett's interpolation formula

(20)    f(x) ≈ (1 − r)f0 + rf1 + ((2 − r)(1 − r)(−r)/3!) δ²f0 + ((r + 1)r(r − 1)/3!) δ²f1 + ···

is an example of a formula involving only even-order differences. Use it to compute the Bessel function J0(x) for x = 1.72 from J0(1.60) = 0.4554022 and J0(1.7), J0(1.8), J0(1.9) in Example 6.
19.4 Spline Interpolation

Given data (function values, points in the xy-plane) (x0, f0), (x1, f1), ..., (xn, fn) can be interpolated by a polynomial pn(x) of degree n or less so that the curve of pn(x) passes through these n + 1 points (xj, fj); here f0 = f(x0), ..., fn = f(xn). See Sec. 19.3.
Now if n is large, there may be trouble: pn(x) may tend to oscillate for x between the nodes x0, ..., xn. Hence we must be prepared for numeric instability (Sec. 19.1). Figure 431 shows a famous example by C. Runge³ for which the maximum error even approaches infinity as n → ∞ (with the nodes kept equidistant and their number increased). Figure 432 illustrates the increase of the oscillation with n for some other function that is piecewise linear.
Those undesirable oscillations are avoided by the method of splines initiated by I. J. Schoenberg in 1946 (Quarterly of Applied Mathematics 4, pp. 45-99, 112-141). This method is widely used in practice. It also laid the foundation for much of modern CAD (computer-aided design). Its name is borrowed from a draftsman's spline, which is an elastic rod bent to pass through given points and held in place by weights.
The mathematical idea of the method is as follows: Instead of using a single high-degree polynomial pn over the entire interval a ≤ x ≤ b in which the nodes lie, that is,

(1)    a = x0 < x1 < ··· < xn = b,

we use n low-degree, e.g., cubic, polynomials

    q0(x), q1(x), ..., q_{n-1}(x),

one over each subinterval between adjacent nodes, hence q0 from x0 to x1, then q1 from x1 to x2, and so on.

Fig. 431. Runge's example f(x) = 1/(1 + x²) and interpolating polynomial p10(x)
Fig. 432. Piecewise linear function f(x) and interpolation polynomials of increasing degrees

³CARL RUNGE (1856-1927), German mathematician, also known for his work on ODEs (Sec. 21.1).

From this we compose an interpolation function g(x), called a spline, by fitting these polynomials together into a single continuous curve passing through the data points, that is,

(2)    g(x0) = f0,  g(x1) = f1,  ...,  g(xn) = fn.

Note that g(x) = q0(x) when x0 ≤ x ≤ x1, then g(x) = q1(x) when x1 ≤ x ≤ x2, and so on, according to our construction of g. Thus spline interpolation is piecewise polynomial interpolation.
The simplest qj's would be linear polynomials. However, the curve of a piecewise linear continuous function has corners and would be of little interest in general; think of designing the body of a car or a ship.
We shall consider cubic splines because these are the most important ones in applications. By definition, a cubic spline g(x) interpolating given data (x0, f0), ..., (xn, fn) is a continuous function on the interval a = x0 ≤ x ≤ xn = b that has continuous first and second derivatives and satisfies the interpolation condition (2); furthermore, between adjacent nodes, g(x) is given by a polynomial qj(x) of degree 3 or less.
We claim that there is such a cubic spline. And if in addition to (2) we also require that

(3)    g'(x0) = k0,    g'(xn) = kn

(given tangent directions of g(x) at the two endpoints of the interval a ≤ x ≤ b), then we have a uniquely determined cubic spline. This is the content of the following existence and uniqueness theorem, whose proof will also suggest the actual determination of splines. (Condition (3) will be discussed after the proof.)
THEOREM 1  Existence and Uniqueness of Cubic Splines

Let (x0, f0), (x1, f1), ..., (xn, fn) with arbitrarily spaced given xj [see (1)] and given fj = f(xj), j = 0, 1, ..., n. Let k0 and kn be any given numbers. Then there is one and only one cubic spline g(x) corresponding to (1) and satisfying (2) and (3).
PROOF  By definition, on every subinterval Ij given by xj ≤ x ≤ x_{j+1} the spline g(x) must agree with a polynomial qj(x) of degree not exceeding 3 such that

(4)    qj(xj) = f(xj),    qj(x_{j+1}) = f(x_{j+1})    (j = 0, 1, ..., n − 1).

For the derivatives we write

(5)    qj'(xj) = kj,    qj'(x_{j+1}) = k_{j+1}    (j = 0, 1, ..., n − 1)

with k0 and kn given and k1, ..., k_{n-1} to be determined later. Equations (4) and (5) are four conditions for each qj(x). By direct calculation, using the notation

(6*)    cj = 1/hj,    hj = x_{j+1} − xj    (j = 0, 1, ..., n − 1),

we can verify that the unique cubic polynomial qj(x) (j = 0, 1, ..., n − 1) satisfying (4) and (5) is

(6)    qj(x) = f(xj) cj²(x − x_{j+1})²[1 + 2cj(x − xj)] + f(x_{j+1}) cj²(x − xj)²[1 − 2cj(x − x_{j+1})]
             + kj cj²(x − xj)(x − x_{j+1})² + k_{j+1} cj²(x − xj)²(x − x_{j+1}).
Differentiating twice, we obtain

(7)    qj''(xj) = −6cj² f(xj) + 6cj² f(x_{j+1}) − 4cj kj − 2cj k_{j+1},

(8)    qj''(x_{j+1}) = 6cj² f(xj) − 6cj² f(x_{j+1}) + 2cj kj + 4cj k_{j+1}.

By definition, g(x) has continuous second derivatives. This gives the conditions

    q_{j-1}''(xj) = qj''(xj)    (j = 1, ..., n − 1).

If we use (8) with j replaced by j − 1, and (7), these n − 1 equations become

(9)    c_{j-1} k_{j-1} + 2(c_{j-1} + cj) kj + cj k_{j+1} = 3[c_{j-1}² ∇fj + cj² ∇f_{j+1}]

where ∇fj = f(xj) − f(x_{j-1}) and ∇f_{j+1} = f(x_{j+1}) − f(xj), and j = 1, ..., n − 1, as before. This linear system of n − 1 equations has a unique solution k1, ..., k_{n-1} since the coefficient matrix is strictly diagonally dominant (that is, in each row the (positive) diagonal entry is greater than the sum of the other (positive) entries). Hence the determinant of the matrix cannot be zero (as follows from Theorem 3 in Sec. 20.7), so that we may determine unique values k1, ..., k_{n-1} of the first derivative of g(x) at the nodes. This proves the theorem.

Storage and Time Demands in solving (9) are modest, since the matrix of (9) is sparse (has few nonzero entries) and tridiagonal (may have nonzero entries only on the diagonal and on the two adjacent "parallels" above and below it). Pivoting (Sec. 7.3) is not necessary because of that dominance. This makes splines efficient in solving large problems with thousands of nodes or more. For some literature and some critical comments, see American Mathematical Monthly 105 (1998), 929-941.

Condition (3) includes the clamped conditions

(10)    g'(x0) = f'(x0),    g'(xn) = f'(xn)

in which the tangent directions f'(x0) and f'(xn) at the ends are given. Other conditions of practical interest are the free or natural conditions

(11)    g''(x0) = 0,    g''(xn) = 0

(geometrically: zero curvature at the ends, as for the draftsman's spline), giving a natural spline. These names are motivated by Fig. 290 in Problem Set 12.3.
Determination of Splines. Let k0 and kn be given. Obtain k1, ..., k_{n-1} by solving the linear system (9). Recall that the spline g(x) to be found consists of n cubic polynomials q0, ..., q_{n-1}. We write these polynomials in the form

(12)    qj(x) = aj0 + aj1(x − xj) + aj2(x − xj)² + aj3(x − xj)³

where j = 0, ..., n − 1. Using Taylor's formula, we obtain

(13)    aj0 = qj(xj) = fj                                                  by (2),
        aj1 = qj'(xj) = kj                                                 by (5),
        aj2 = ½qj''(xj) = (3/hj²)(f_{j+1} − fj) − (1/hj)(k_{j+1} + 2kj)    by (7),
        aj3 = ⅙qj'''(xj) = (2/hj³)(fj − f_{j+1}) + (1/hj²)(k_{j+1} + kj),

with aj3 obtained by calculating qj''(x_{j+1}) from (12), that is,

    qj''(x_{j+1}) = 2aj2 + 6aj3 hj,

equating the result to (8), then subtracting 2aj2 as given in (13) and simplifying.
Note that for equidistant nodes of distance hj = h we can write cj = c = 1/h in (6*) and have from (9) simply

(14)    k_{j-1} + 4kj + k_{j+1} = (3/h)(f_{j+1} − f_{j-1})    (j = 1, ..., n − 1).

EXAMPLE 1
Spline Interpolation. Equidistant Nodes
Interpolate f(x) = x⁴ on the interval −1 ≤ x ≤ 1 by the cubic spline g(x) corresponding to the nodes x0 = −1, x1 = 0, x2 = 1 and satisfying the clamped conditions g'(−1) = f'(−1), g'(1) = f'(1).

Solution. In our standard notation the given data are f0 = f(−1) = 1, f1 = f(0) = 0, f2 = f(1) = 1. We have h = 1 and n = 2, so that our spline consists of n = 2 polynomials

    q0(x) = a00 + a01(x + 1) + a02(x + 1)² + a03(x + 1)³    (−1 ≤ x ≤ 0),
    q1(x) = a10 + a11 x + a12 x² + a13 x³                   (0 ≤ x ≤ 1).

We determine the kj from (14) (equidistance!) and then the coefficients of the spline from (13). Since n = 2, the system (14) is a single equation (with j = 1 and h = 1)

    k0 + 4k1 + k2 = 3(f2 − f0).

Here f0 = f2 = 1 (the value of x⁴ at the ends) and k0 = −4, k2 = 4, the values of the derivative 4x³ at the ends −1 and 1. Hence

    −4 + 4k1 + 4 = 3(1 − 1) = 0,    k1 = 0.

From (13) we can now obtain the coefficients of q0, namely, a00 = f0 = 1, a01 = k0 = −4, and

    a02 = 3(f1 − f0) − (k1 + 2k0) = 3(0 − 1) − (0 − 8) = 5,
    a03 = 2(f0 − f1) + (k1 + k0) = 2(1 − 0) + (0 − 4) = −2.

Similarly, for the coefficients of q1 we obtain from (13) the values a10 = f1 = 0, a11 = k1 = 0, and

    a12 = 3(f2 − f1) − (k2 + 2k1) = 3(1 − 0) − (4 + 0) = −1,
    a13 = 2(f1 − f2) + (k2 + k1) = 2(0 − 1) + (4 + 0) = 2.

This gives the polynomials of which the spline g(x) consists, namely,

    g(x) = q0(x) = 1 − 4(x + 1) + 5(x + 1)² − 2(x + 1)³    if −1 ≤ x ≤ 0,
    g(x) = q1(x) = −x² + 2x³                               if 0 ≤ x ≤ 1.

Figure 433 shows f(x) and this spline. Do you see that we could have saved over half of our work by using symmetry?

Fig. 433. Function f(x) = x⁴ and cubic spline g(x) in Example 1
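For equidistant nodes, determining a clamped spline is a tridiagonal solve (14) followed by the coefficient formulas (13). The Python sketch below is our own minimal rendering of these two steps (it assumes n ≥ 2 and equal spacing; the function name is invented); applied to Example 1 it returns the coefficients found above:

    def clamped_spline_coeffs(xs, fs, k0, kn):
        # Solve (14) for k1..k_{n-1}, then get the coefficients (13) of (12).
        n = len(xs) - 1                  # number of subintervals, assumed >= 2
        h = xs[1] - xs[0]                # equal spacing assumed
        d = [3.0 / h * (fs[j + 1] - fs[j - 1]) for j in range(1, n)]
        d[0] -= k0                       # move the known end slopes to the
        d[-1] -= kn                      # right side of the system (14)
        b = [4.0] * (n - 1)
        for i in range(1, n - 1):        # Thomas algorithm; no pivoting needed
            w = 1.0 / b[i - 1]           # (strict diagonal dominance)
            b[i] -= w
            d[i] -= w * d[i - 1]
        ks = [0.0] * (n - 1)
        for i in range(n - 2, -1, -1):
            ks[i] = (d[i] - (ks[i + 1] if i < n - 2 else 0.0)) / b[i]
        k = [k0] + ks + [kn]
        coeffs = []
        for j in range(n):               # formulas (13)
            aj2 = 3.0 / h**2 * (fs[j + 1] - fs[j]) - (k[j + 1] + 2.0 * k[j]) / h
            aj3 = 2.0 / h**3 * (fs[j] - fs[j + 1]) + (k[j + 1] + k[j]) / h**2
            coeffs.append([fs[j], k[j], aj2, aj3])
        return coeffs

    # Example 1: f(x) = x^4, nodes -1, 0, 1, clamped with k0 = -4, k2 = 4
    print(clamped_spline_coeffs([-1.0, 0.0, 1.0], [1.0, 0.0, 1.0], -4.0, 4.0))
    # [[1.0, -4.0, 5.0, -2.0], [0.0, 0.0, -1.0, 2.0]]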
EXAMPLE 2  Natural Spline. Arbitrarily Spaced Nodes
Find a spline approximation and a polynomial approximation for the curve of the cross section of the circular-shaped Shrine of the Book in Jerusalem shown in Fig. 434.

Fig. 434. Shrine of the Book in Jerusalem (Architects F. Kissler and A. M. Bartus)

Solution. Thirteen points, about equally distributed along the contour (not along the x-axis!), give these data:

    xj    −5.8   −5.0   −4.0   −2.5   −1.5   −0.8   0     0.8    1.5    2.5    4.0    5.0    5.8
    fj     0      1.5    1.8    2.2    2.7    3.5   3.9   3.5    2.7    2.2    1.8    1.5    0

The figure shows the corresponding interpolation polynomial of 12th degree, which is useless because of its oscillation. (Because of roundoff your software will also give you small error terms involving odd powers of x.) The polynomial is

    p12(x) = 3.9000 − 0.65083x² + 0.033858x⁴ + 0.011041x⁶ − 0.0014010x⁸
             + 0.000055595x¹⁰ − 0.00000071867x¹².

The spline follows practically the contour of the roof, with a small error near the nodes −0.8 and 0.8. The spline is symmetric. Its six polynomials corresponding to positive x have the following coefficients of their representations (12). (Note well that (12) is in terms of powers of x − xj, not x!)

    j    x-interval     aj0    aj1      aj2      aj3
    0    0.0 ... 0.8    3.9    0.00     −0.61    −0.015
    1    0.8 ... 1.5    3.5    −1.01    −0.65    0.66
    2    1.5 ... 2.5    2.7    −0.95    0.73     −0.27
    3    2.5 ... 4.0    2.2    −0.32    −0.091   0.084
    4    4.0 ... 5.0    1.8    −0.027   0.29     −0.56
    5    5.0 ... 5.8    1.5    −1.13    −1.39    0.58
1. WRITING PROJECT. Splines. In your own words, and using as few formulas as possible, write a short report on spline interpolation, its motivation, a comparison with polynomial interpolation, and its applications.
2. (Individual polynomial qj) Show that qj(x) in (6) satisfies the interpolation condition (4) as well as the derivative condition (5).
3. Verify the differentiations that give (7) and (8) from (6).
4. (System for derivatives) Derive the basic linear system (9) for k1, ..., k_{n-1} as indicated in the text.
5. (Equidistant nodes) Derive (14) from (9).
6. (Coefficients) Give the details of the derivation of aj2 and aj3 in (13).
7. Verify the computations in Example 1.
8. (Comparison) Compare the spline g in Example 1 with the quadratic interpolation polynomial over the whole interval. Find the maximum deviations of g and p2 from f. Comment.
9. (Natural spline condition) Using the given coefficients, verify that the spline in Example 2 satisfies g''(x) = 0 at the ends.
[10-16] DETERMINATION OF SPLINES
Find the cubic spline g(x) for the given data with k0 and kn as given.
10. f(−2) = f(−1) = f(1) = f(2) = 0, f(0) = 1, k0 = k4 = 0
11. If we started from the piecewise linear function in Fig. 435, we would obtain g(x) in Prob. 10 as the spline satisfying g'(−2) = f'(−2) = 0, g'(2) = f'(2) = 0. Find and sketch or graph the corresponding interpolation polynomial of 4th degree and compare it with the spline. Comment.

Fig. 435. Spline and interpolation polynomial in Problems 10 and 11

12. f0 = f(0) = 1, f1 = f(2) = 9, f2 = f(4) = 41, f3 = f(6) = 41, k0 = 0, k3 = −12
13. f0 = f(−1) = 0, f1 = f(0) = 4, f2 = f(1) = 0, k0 = 0, k2 = 0. Is g(x) even? (Give reason.)
14. f0 = f(0) = 0, f1 = f(1) = 1, f2 = f(2) = 6, f3 = f(3) = 10, k0 = 0, k3 = 0
15. f0 = f(0) = 1, f1 = f(1) = 0, f2 = f(2) = −1, f3 = f(3) = 0, k0 = 0, k3 = −6
16. It can happen that a spline is given by the same polynomial in two adjacent subintervals. To illustrate this, find the cubic spline g(x) for f(x) = sin x corresponding to the partition x0 = −π/2, x1 = 0, x2 = π/2 of the interval −π/2 ≤ x ≤ π/2 and satisfying g'(−π/2) = f'(−π/2) and g'(π/2) = f'(π/2).
17. (Natural conditions) Explain the remark after (11).
18. CAS EXPERIMENT. Spline versus Polynomial. If your CAS gives natural splines, find the natural splines when x is integer from −m to m, and y(0) = 1 and all other y equal to 0. Graph each such spline along with the interpolation polynomial p2m. Do this for m = 2 to 10 (or more). What happens with increasing m?
19. If a cubic spline is three times continuously differentiable (that is, it has continuous first, second, and third derivatives), show that it must be a single polynomial.
20. TEAM PROJECT. Hermite Interpolation and Bezier Curves. In Hermite interpolation we are looking for a polynomial p(x) (of degree 2n + 1 or less) such that p(x) and its derivative p'(x) have given values at n + 1 nodes. (More generally, p(x), p'(x), p''(x), ... may be required to have given values at the nodes.)
(a) Curves with given endpoints and tangents. Let C be a curve in the xy-plane parametrically represented by r(t) = [x(t), y(t)], 0 ≤ t ≤ 1 (see Sec. 9.5). Show that for given initial and terminal points of a curve and given initial and terminal tangents, say,

    A: r0 = [x(0), y(0)] = [x0, y0],       B: r1 = [x(1), y(1)] = [x1, y1],
    v0 = [x'(0), y'(0)] = [x0', y0'],      v1 = [x'(1), y'(1)] = [x1', y1'],

we can find a curve C, namely,

(15)    r(t) = r0 + v0 t + (3(r1 − r0) − (2v0 + v1)) t² + (2(r0 − r1) + v0 + v1) t³;

in components,

    x(t) = x0 + x0' t + (3(x1 − x0) − (2x0' + x1')) t² + (2(x0 − x1) + x0' + x1') t³,
    y(t) = y0 + y0' t + (3(y1 − y0) − (2y0' + y1')) t² + (2(y0 − y1) + y0' + y1') t³.

Note that this is a cubic Hermite interpolation polynomial, and n = 1 because we have two nodes (the endpoints of C). (This has nothing to do with the Hermite polynomials in Sec. 5.8.) The two points

    GA: [x0 + x0', y0 + y0']    and    GB: [x1 − x1', y1 − y1']

are called guidepoints because the segments AGA and BGB specify the tangents graphically. A, B, GA, GB determine C, and C can be changed quickly by moving the points. A curve consisting of such Hermite interpolation polynomials is called a Bezier curve, after the French engineer P. Bezier of the Renault Automobile Company, who introduced them in the early 1960s in designing car bodies. Bezier curves (and surfaces) are used in computer-aided design (CAD) and computer-aided manufacturing (CAM). (For more details, see Ref. [E21] in App. 1.)
(b) Find and graph the Bezier curve and its guidepoints if A: [0, 0], B: [1, 0], v0 = [½, ½√3], v1 = [−½, −½√3].
(c) Changing guidepoints changes C. Moving guidepoints farther away makes C "stay near the tangents for a longer time." Confirm this by changing v0 and v1 in (b) to 2v0 and 2v1 (see Fig. 436).
(d) Make experiments of your own. What happens if you change v1 in (b) to −v1? If you rotate the tangents? If you multiply v0 and v1 by positive factors less than 1?

Fig. 436. Team Project 20(b) and (c): Bezier curves
19.5 Numeric Integration and Differentiation

Numeric integration means the numeric evaluation of integrals

    J = ∫_a^b f(x) dx

where a and b are given and f is a function given analytically by a formula or empirically by a table of values. Geometrically, J is the area under the curve of f between a and b (Fig. 437).
We know that if f is such that we can find a differentiable function F whose derivative is f, then we can evaluate J by applying the familiar formula

    J = ∫_a^b f(x) dx = F(b) − F(a)    [F'(x) = f(x)].

Tables of integrals or a CAS (Mathematica, Maple, etc.) may be helpful for this purpose. However, applications often lead to integrals whose analytic evaluation would be very difficult or even impossible, or whose integrand is an empirical function given by recorded numeric values. Then we may obtain approximate numeric values of the integral by a numeric integration method.
Rectangular Rule. Trapezoidal Rule
Numeric integration methods are obtained by approximating the integrand f by functions that can easily be integrated. The simplest formula, the rectangular rule, is obtained if we subdivide the interval of integration a ≤ x ≤ b into n subintervals of equal length h = (b − a)/n and in each subinterval approximate f by the constant f(xj*), the value of f at the midpoint xj* of the jth subinterval (Fig. 438). Then f is approximated by a step function (piecewise constant function), the n rectangles in Fig. 438 have the areas f(x1*)h, ..., f(xn*)h, and the rectangular rule is

(1)    J = ∫_a^b f(x) dx ≈ h[f(x1*) + f(x2*) + ··· + f(xn*)]        (h = (b − a)/n).

The trapezoidal rule is generally more accurate. We obtain it if we take the same subdivision as before and approximate f by a broken line of segments (chords) with endpoints [a, f(a)], [x1, f(x1)], ..., [b, f(b)] on the curve of f (Fig. 439). Then the area under the curve of f between a and b is approximated by n trapezoids of areas

    ½[f(a) + f(x1)]h,    ½[f(x1) + f(x2)]h,    ...,    ½[f(x_{n-1}) + f(b)]h.

Fig. 437. Geometric interpretation of a definite integral
Fig. 438. Rectangular rule
Fig. 439. Trapezoidal rule
By taking their sum we obtain the trapezoidal rule

(2)    J = ∫_a^b f(x) dx ≈ h[½f(a) + f(x1) + f(x2) + ··· + f(x_{n-1}) + ½f(b)]

where h = (b − a)/n, as in (1). The xj's and a and b are called nodes.

EXAMPLE 1  Trapezoidal Rule
Evaluate J = ∫_0^1 e^(−x²) dx by means of (2) with n = 10.
Solution. J ≈ 0.1(0.5 · 1.367879 + 6.778167) = 0.746211 from Table 19.3.

Table 19.3  Computations in Example 1

    j      xj      xj²      e^(−xj²) (j = 0, 10)    e^(−xj²) (j = 1, ..., 9)
    0      0       0        1.000000
    1      0.1     0.01                             0.990050
    2      0.2     0.04                             0.960789
    3      0.3     0.09                             0.913931
    4      0.4     0.16                             0.852144
    5      0.5     0.25                             0.778801
    6      0.6     0.36                             0.697676
    7      0.7     0.49                             0.612626
    8      0.8     0.64                             0.527292
    9      0.9     0.81                             0.444858
    10     1.0     1.00     0.367879
    Sums                    1.367879                6.778167
Error Bounds and Estimate for the Trapezoidal Rule
An error estimate for the trapezoidal rule can be derived from (5) in Sec. 19.3 with n = 1 by integration as follows. For a single subinterval we have

    f(x) − p1(x) = (x − x0)(x − x1) f''(t(x))/2

with a suitable t depending on x, between x0 and x1. Integration over x from a = x0 to x1 = x0 + h gives

    ∫_{x0}^{x0+h} f(x) dx − (h/2)[f(x0) + f(x1)] = ∫_{x0}^{x0+h} (x − x0)(x − x0 − h) (f''(t(x))/2) dx.

Setting x − x0 = v and applying the mean value theorem of integral calculus, which we can use because (x − x0)(x − x0 − h) does not change sign, we find that the right side equals

(3*)    [∫_0^h v(v − h) dv] (f''(t̃)/2) = (h³/3 − h³/2)(f''(t̃)/2) = −(h³/12) f''(t̃)

where t̃ is a (suitable, unknown) value between x0 and x1. This is the error for the trapezoidal rule with n = 1, often called the local error.
Hence the error ε of (2) with any n is the sum of such contributions from the n subintervals; since h = (b − a)/n, nh³ = n(b − a)³/n³, and (b − a)² = n²h², we obtain

(3)    ε = −((b − a)³/(12n²)) f''(t̂) = −((b − a)/12) h² f''(t̂)

with (suitable, unknown) t̂ between a and b. Because of (3) the trapezoidal rule (2) is also written

(2*)    J = ∫_a^b f(x) dx = h[½f(a) + f(x1) + ··· + f(x_{n-1}) + ½f(b)] − ((b − a)/12) h² f''(t̂).

Error Bounds are now obtained by taking the largest value for f'', say, M2, and the smallest value, M2*, in the interval of integration. Then (3) gives (note that K is negative)

(4)    KM2 ≤ ε ≤ KM2*    where    K = −(b − a)³/(12n²).

Error Estimation by Halving h is advisable if f'' is very complicated or unknown, for instance, in the case of experimental data. Then we may apply the Error Principle of Sec. 19.1. That is, we calculate by (2), first with h, obtaining, say, J = Jh + εh, and then with ½h, obtaining J = J_{h/2} + ε_{h/2}. Now if we replace h² in (3) with (½h)², the error is multiplied by ¼. Hence ε_{h/2} ≈ ¼εh (not exactly because t̂ may differ). Together, J_{h/2} + ε_{h/2} = Jh + εh ≈ Jh + 4ε_{h/2}. Thus J_{h/2} − Jh ≈ (4 − 1)ε_{h/2}. Division by 3 gives the error formula for J_{h/2}

(5)    ε_{h/2} ≈ ⅓(J_{h/2} − Jh).
EXAMPLE 2  Error Estimation for the Trapezoidal Rule by (4) and (5)
Estimate the error of the approximate value in Example 1 by (4) and (5).

Solution. (A) Error bounds by (4). By differentiation, f''(x) = 2(2x² − 1)e^(−x²). Also, f'''(x) > 0 if 0 < x < 1, so that the minimum and maximum of f'' occur at the ends of the interval. We compute M2 = f''(1) = 0.735759 and M2* = f''(0) = −2. Furthermore, K = −1/1200, and (4) gives

    −0.000614 ≤ ε ≤ 0.001667.

Hence the exact value of J must lie between

    0.746211 − 0.000614 = 0.745597    and    0.746211 + 0.001667 = 0.747878.

Actually, J = 0.746824, exact to 6D.
(B) Error estimate by (5). Jh = 0.746211 from Example 1. Also,

    J_{h/2} = 0.05 [½(1 + 0.367879) + Σ (j = 1 to 19) e^(−(j/20)²)] = 0.746671.

Hence ε_{h/2} = ⅓(J_{h/2} − Jh) = 0.000153 and J_{h/2} + ε_{h/2} = 0.746824, exact to 6D.
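A minimal Python sketch of (2) together with the halving estimate (5) (our own code, not the book's) reproduces the numbers of Examples 1 and 2:

    from math import exp

    def trapezoid(f, a, b, n):
        # Trapezoidal rule (2) with n equal subintervals
        h = (b - a) / n
        s = 0.5 * (f(a) + f(b)) + sum(f(a + j * h) for j in range(1, n))
        return h * s

    f = lambda x: exp(-x * x)
    Jh  = trapezoid(f, 0.0, 1.0, 10)      # 0.746211, the value of Example 1
    Jh2 = trapezoid(f, 0.0, 1.0, 20)      # 0.746671, as in Example 2(B)
    print(round(Jh, 6), round(Jh2, 6))
    print(round((Jh2 - Jh) / 3.0, 6))     # estimate (5): 0.000153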
Simpson's Rule of Integration
Piecewise constant approximation of f led to the rectangular rule (1), piecewise linear approximation to the trapezoidal rule (2), and piecewise quadratic approximation will lead to Simpson's rule, which is of great practical importance because it is sufficiently accurate for most problems, but still sufficiently simple.
To derive Simpson's rule, we divide the interval of integration a ≤ x ≤ b into an even number of equal subintervals, say, into n = 2m subintervals of length h = (b − a)/(2m), with endpoints x0 (= a), x1, ..., x_{2m-1}, x_{2m} (= b); see Fig. 440. We now take the first two subintervals and approximate f(x) in the interval x0 ≤ x ≤ x2 = x0 + 2h by the Lagrange polynomial p2(x) through (x0, f0), (x1, f1), (x2, f2), where fj = f(xj). From (3) in Sec. 19.3 we obtain

(6)    p2(x) = ((x − x1)(x − x2))/((x0 − x1)(x0 − x2)) f0 + ((x − x0)(x − x2))/((x1 − x0)(x1 − x2)) f1
             + ((x − x0)(x − x1))/((x2 − x0)(x2 − x1)) f2.

The denominators in (6) are 2h², −h², and 2h², respectively. Setting s = (x − x1)/h, we have

    x − x0 = x − (x1 − h) = (s + 1)h,    x − x1 = sh,    x − x2 = x − (x1 + h) = (s − 1)h,

and we obtain

    p2(x) = ½s(s − 1)f0 − (s + 1)(s − 1)f1 + ½(s + 1)s f2.

Fig. 440. Simpson's rule
We now integrate with respect to x from x0 to x2. This corresponds to integrating with respect to s from −1 to 1. Since dx = h ds, the result is

(7*)    ∫_{x0}^{x2} f(x) dx ≈ ∫_{x0}^{x2} p2(x) dx = h(⅓f0 + (4/3)f1 + ⅓f2).

A similar formula holds for the next two subintervals from x2 to x4, and so on. By summing all these m formulas we obtain Simpson's rule⁴

(7)    ∫_a^b f(x) dx ≈ (h/3)(f0 + 4f1 + 2f2 + 4f3 + ··· + 2f_{2m-2} + 4f_{2m-1} + f_{2m}),

where h = (b − a)/(2m) and fj = f(xj). Table 19.4 shows an algorithm for Simpson's rule.

Table 19.4  Simpson's Rule of Integration

ALGORITHM SIMPSON (a, b, m, f0, f1, ..., f_{2m})
This algorithm computes the integral J = ∫_a^b f(x) dx from given values fj = f(xj) at equidistant x0 = a, x1 = x0 + h, ..., x_{2m} = x0 + 2mh = b by Simpson's rule (7), where h = (b − a)/(2m).

INPUT: a, b, m, f0, ..., f_{2m}
OUTPUT: Approximate value J̃ of J

Compute    s0 = f0 + f_{2m}
           s1 = f1 + f3 + ··· + f_{2m-1}
           s2 = f2 + f4 + ··· + f_{2m-2}
           h = (b − a)/(2m)
           J̃ = (h/3)(s0 + 4s1 + 2s2)
OUTPUT J̃. Stop.
End SIMPSON
Error of Simpson's Rule (7). If the fourth derivative f^(4) exists and is continuous on a ≤ x ≤ b, the error of (7), call it ε_S, is

(8)    ε_S = −((b − a)⁵/(180(2m)⁴)) f^(4)(t̂) = −((b − a)/180) h⁴ f^(4)(t̂);

here t̂ is a suitable unknown value between a and b. This is obtained similarly to (3). With this we may also write Simpson's rule (7) as

(7**)    ∫_a^b f(x) dx = (h/3)(f0 + 4f1 + 2f2 + ··· + 4f_{2m-1} + f_{2m}) − ((b − a)/180) h⁴ f^(4)(t̂).

Error Bounds. By taking for f^(4) in (8) the maximum M4 and minimum M4* on the interval of integration we obtain from (8) the error bounds (note that C is negative)

(9)    CM4 ≤ ε_S ≤ CM4*    where    C = −(b − a)⁵/(180(2m)⁴).

Degree of Precision (DP) of an integration formula. This is the maximum degree of arbitrary polynomials for which the formula gives exact values of integrals over any intervals. Hence for the trapezoidal rule, DP = 1 because we approximate the curve of f by portions of straight lines (linear polynomials). For Simpson's rule we might expect DP = 2 (why?). Actually, DP = 3 by (9) because f^(4) is identically zero for a cubic polynomial. This makes Simpson's rule sufficiently accurate for most practical problems and accounts for its popularity.

Numeric Stability with respect to rounding is another important property of Simpson's rule. Indeed, for the sum of the roundoff errors εj of the 2m + 1 values fj in (7) we obtain, since h = (b − a)/(2m),

    (h/3)|ε0 + 4ε1 + ··· + ε_{2m}| ≤ ((b − a)/(3 · 2m)) 6mu = (b − a)u,

where u is the rounding unit (u = ½ · 10⁻⁶ if we round off to 6D; see Sec. 19.1). Also 6 = 1 + 4 + 1 is the sum of the coefficients for a pair of intervals in (7); take m = 1 in (7) to see this. The bound (b − a)u is independent of m, so that it cannot increase with increasing m, that is, with decreasing h. This proves stability.

Newton-Cotes Formulas. We mention that the trapezoidal and Simpson rules are special closed Newton-Cotes formulas, that is, integration formulas in which f(x) is interpolated at equally spaced nodes by a polynomial of degree n (n = 1 for trapezoidal, n = 2 for Simpson), and closed means that a and b are nodes (a = x0, b = xn). n = 3 (the three-eighths rule; Review Prob. 33) and a higher n are used occasionally. From n = 8 on, some of the coefficients become negative, so that a positive fj could make a negative contribution to an integral, which is absurd. For more on this topic see Ref. [E25] in App. 1.

⁴THOMAS SIMPSON (1710-1761), self-taught English mathematician, author of several popular textbooks. Simpson's rule was used much earlier by Torricelli, Gregory (in 1668), and Newton (in 1676).
EXAMPLE 3  Simpson's Rule. Error Estimate
Evaluate J = ∫_0^1 e^(−x²) dx by Simpson's rule with 2m = 10 and estimate the error.

Solution. Since h = 0.1, Table 19.5 gives

    J ≈ (0.1/3)(1.367879 + 4 · 3.740266 + 2 · 3.037901) = 0.746825.

Estimate of error. Differentiation gives f^(4)(x) = 4(4x⁴ − 12x² + 3)e^(−x²). By considering the derivative f^(5) of f^(4) we find that the largest value of f^(4) in the interval of integration occurs at 0 and the smallest value at x* = (2.5 − 0.5√10)^(1/2). Computation gives the values M4 = f^(4)(0) = 12 and M4* = f^(4)(x*) = −7.419. Since 2m = 10 and b − a = 1, we obtain C = −1/1,800,000 = −0.00000056. Therefore, from (9),

    −0.000007 ≤ ε_S ≤ 0.000005.

Hence J must lie between 0.746825 − 0.000007 = 0.746818 and 0.746825 + 0.000005 = 0.746830, so that at least four digits of our approximate value are exact. Actually, the value 0.746825 is exact to 5D because J = 0.746824 (exact to 6D). Thus our result is much better than that in Example 1 obtained by the trapezoidal rule, whereas the number of operations is nearly the same in both cases.

Table 19.5  Computations in Example 3

    j      xj      xj²      e^(−xj²) (j = 0, 10)    e^(−xj²) (j odd)    e^(−xj²) (j even)
    0      0       0        1.000000
    1      0.1     0.01                             0.990050
    2      0.2     0.04                                                 0.960789
    3      0.3     0.09                             0.913931
    4      0.4     0.16                                                 0.852144
    5      0.5     0.25                             0.778801
    6      0.6     0.36                                                 0.697676
    7      0.7     0.49                             0.612626
    8      0.8     0.64                                                 0.527292
    9      0.9     0.81                             0.444858
    10     1.0     1.00     0.367879
    Sums                    1.367879                3.740266            3.037901
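The algorithm SIMPSON of Table 19.4 in Python (a sketch of ours, not the book's code); with m = 5 it reproduces Example 3:

    from math import exp

    def simpson(f, a, b, m):
        # Simpson's rule (7) with 2m subintervals, as in Table 19.4
        h = (b - a) / (2 * m)
        s0 = f(a) + f(b)
        s1 = sum(f(a + (2 * j - 1) * h) for j in range(1, m + 1))  # f1 + f3 + ...
        s2 = sum(f(a + 2 * j * h) for j in range(1, m))            # f2 + f4 + ...
        return h / 3.0 * (s0 + 4.0 * s1 + 2.0 * s2)

    print(round(simpson(lambda x: exp(-x * x), 0.0, 1.0, 5), 6))   # 0.746825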
Instead of picking an n = 2m and then estimating the error by (9), as in Example 3, it is better to require an accuracy (e.g., 6D) and then determine n = 2m from (9).

EXAMPLE 4  Determination of n = 2m in Simpson's Rule from the Required Accuracy
What n should we choose in Example 3 to get 6D-accuracy?

Solution. Using M4 = 12 (which is bigger in absolute value than M4*), we get from (9), with b − a = 1 and the required accuracy,

    |C| M4 = 12/(180(2m)⁴) ≤ ½ · 10⁻⁶,    thus    m = [2 · 10⁶ · 12 / (180 · 2⁴)]^(1/4) = 9.55.

Hence we should choose n = 2m = 20. Do the computation, which parallels that in Example 3.
Note that the error bounds in (4) or (9) may sometimes be loose, so that in such a case a smaller n = 2m may already suffice.
Error Estimation for Simpson's Rule by Halving h. The idea is the same as in (5) and gives

(10)    ε_{h/2} ≈ (1/15)(J_{h/2} − Jh).

Jh is obtained by using h and J_{h/2} by using ½h, and ε_{h/2} is the error of J_{h/2}.
Derivation. In (5) we had ⅓ as the reciprocal of 3 = 4 − 1, and ¼ = (½)² resulted from h² in (3) by replacing h with ½h. In (10) we have 1/15 as the reciprocal of 15 = 16 − 1, and 1/16 = (½)⁴ results from h⁴ in (8) by replacing h with ½h.

EXAMPLE 5  Error Estimation for Simpson's Rule by Halving
Integrate f(x) = ¼πx⁴ cos ¼πx from 0 to 2 with h = 1 and apply (10).

Solution. The exact 5D-value of the integral is J = 1.25953. Simpson's rule gives

    Jh = ⅓[f(0) + 4f(1) + f(2)] = ⅓(0 + 4 · 0.555360 + 0) = 0.740480,

    J_{h/2} = ⅙[f(0) + 4f(½) + 2f(1) + 4f(3/2) + f(2)]
            = ⅙[0 + 4 · 0.045351 + 2 · 0.555361 + 4 · 1.521579 + 0] = 1.22974.

Hence (10) gives ε_{h/2} = (1/15)(1.22974 − 0.74048) = 0.032617 and thus J ≈ J_{h/2} + ε_{h/2} = 1.26236, with an error −0.00283, which is less in absolute value than 1/10 of the error 0.02979 of J_{h/2}. Hence the use of (10) was well worthwhile.
Adaptive Integration
The idea is to adapt step h to the variability of f(x). That is, where f varies but little, we can proceed in large steps without causing a substantial error in the integral, but where f varies rapidly, we have to take small steps in order to stay everywhere close enough to the curve of f.
Changing h is done systematically, usually by halving h, and automatically (not "by hand") depending on the size of the (estimated) error over a subinterval. The subinterval is halved if the corresponding error is still too large, that is, larger than a given tolerance TOL (maximum admissible absolute error), or is not halved if the error is less than or equal to TOL.
Adapting is one of the techniques typical of modern software. In connection with integration it can be applied to various methods. We explain it here for Simpson's rule. In Table 19.6 a star means that for that subinterval, TOL has been reached.

EXAMPLE 6  Adaptive Integration with Simpson's Rule
Integrate f(x) = ¼πx⁴ cos ¼πx from x = 0 to 2 by adaptive integration with Simpson's rule and TOL[0, 2] = 0.0002.

Solution. Table 19.6 shows the calculations. Figure 441 shows the integrand f(x) and the adapted intervals used. The first two intervals ([0, 0.5], [0.5, 1.0]) have length 0.5, hence h = 0.25 [because we use 2m = 2 subintervals in Simpson's rule (7**)]. The next two intervals ([1.00, 1.25], [1.25, 1.50]) have length 0.25 (hence h = 0.125), and the last four intervals have length 0.125.
Sample computations. For 0.740480 see Example 5. Formula (10) gives (0.123716 − 0.122794)/15 = 0.000061. Note that 0.123716 refers to [0, 0.5] and [0.5, 1], so that we must subtract the value corresponding to [0, 1] in the line before. Etc. TOL[0, 2] = 0.0002 gives 0.0001 for subintervals of length 1, 0.00005 for length 0.5, etc. The value of the integral obtained is the sum of the values marked by an asterisk (for which the error estimate has become less than TOL). This gives

    J ≈ 0.123716 + 0.528895 + 0.388263 + 0.218483 = 1.25936.

The exact 5D-value is J = 1.25953. Hence the error is 0.00017, about 1/200 of the absolute value of the error 0.02979 of J_{h/2} in Example 5. Our more extensive computation has produced a much better result.

Table 19.6  Computations in Example 6

    Interval              Integral      Error (10)    TOL         Comment
    [0, 2]                0.740480
    [0, 1]                0.122794
    [1, 2]                1.10695
      Sum = 1.22974                     0.032617      0.0002      Divide further
    [0.0, 0.5]            0.004782
    [0.5, 1.0]            0.118934
      Sum = 0.123716*                   0.000061      0.0001      TOL reached
    [1.0, 1.5]            0.528176
    [1.5, 2.0]            0.605821
      Sum = 1.13400                     0.001803      0.0001      Divide further
    [1.00, 1.25]          0.200544
    [1.25, 1.50]          0.328351
      Sum = 0.528895*                   0.000048      0.00005     TOL reached
    [1.50, 1.75]          0.388235
    [1.75, 2.00]          0.218457
      Sum = 0.606692                    0.000058      0.00005     Divide further
    [1.500, 1.625]        0.196244
    [1.625, 1.750]        0.192019
      Sum = 0.388263*                   0.000002      0.000025    TOL reached
    [1.750, 1.875]        0.153405
    [1.875, 2.000]        0.065078
      Sum = 0.218483*                   0.000002      0.000025    TOL reached

Fig. 441. Adaptive integration in Example 6
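The bookkeeping of Table 19.6 can be automated by recursion. The Python sketch below is one possible rendering of the idea, not the book's algorithm: each subinterval is accepted when the estimate (10) meets its share of the tolerance, which is halved along with the interval, as described above.

    from math import pi, cos

    def simpson2(f, a, b):
        # Simpson value over [a, b] with 2m = 2 subintervals
        h = (b - a) / 2.0
        return h / 3.0 * (f(a) + 4.0 * f(a + h) + f(b))

    def adaptive(f, a, b, tol):
        # Accept [a, b] if the halving estimate (10) is within tol, else bisect
        coarse = simpson2(f, a, b)
        mid = 0.5 * (a + b)
        fine = simpson2(f, a, mid) + simpson2(f, mid, b)
        if abs(fine - coarse) / 15.0 <= tol:
            return fine
        return adaptive(f, a, mid, tol / 2.0) + adaptive(f, mid, b, tol / 2.0)

    f = lambda x: 0.25 * pi * x**4 * cos(0.25 * pi * x)
    print(round(adaptive(f, 0.0, 2.0, 0.0002), 5))
    # close to the exact 5D value 1.25953; the book's tabulation gives 1.25936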
Gauss Integration Formulas.
Maximum Degree of Precision
Our integration formulas discussed so far use function values at predetermined (equidistant) x-values (nodes) and give exact results for polynomials not exceeding a certain degree [called the degree of precision; see after (9)]. But we can get much more accurate integration formulas as follows. We set

(11)    ∫_{−1}^{1} f(t) dt ≈ Σ (j = 1 to n) Aj fj        [fj = f(tj)]

with fixed n, and t = ±1 obtained from x = a, b by setting x = ½[a(1 − t) + b(1 + t)]. Then we determine the n coefficients A1, ..., An and n nodes t1, ..., tn so that (11) gives exact results for polynomials of degree k as high as possible. Since n + n = 2n is the number of coefficients of a polynomial of degree 2n − 1, it follows that k ≤ 2n − 1.
Gauss has shown that exactness for polynomials of degree not exceeding 2n − 1 (instead of n − 1 for predetermined nodes) can be attained, and he has given the location of the tj (= the jth zero of the Legendre polynomial Pn in Sec. 5.3) and the coefficients Aj, which depend on n but not on f(t), and are obtained by using Lagrange's interpolation polynomial, as shown in Ref. [E5] listed in App. 1. With these tj and Aj, formula (11) is called a Gauss integration formula or Gauss quadrature formula. Its degree of precision is 2n − 1, as just explained. Table 19.7 gives the values needed for n = 2, ..., 5. (For larger n, see pp. 916-919 of Ref. [GR1] in App. 1.)
Table 19.7  Gauss Integration: Nodes tj and Coefficients Aj

    n    Nodes tj           Coefficients Aj    Degree of Precision
    2    −0.5773502692      1                  3
          0.5773502692      1

    3    −0.7745966692      0.5555555556       5
          0                 0.8888888889
          0.7745966692      0.5555555556

    4    −0.8611363116      0.3478548451       7
         −0.3399810436      0.6521451549
          0.3399810436      0.6521451549
          0.8611363116      0.3478548451

    5    −0.9061798459      0.2369268851       9
         −0.5384693101      0.4786286705
          0                 0.5688888889
          0.5384693101      0.4786286705
          0.9061798459      0.2369268851

EXAMPLE 7  Gauss Integration Formula with n = 3
Evaluate the integral in Example 3 by the Gauss integration formula (11) with n = 3.
Solution. We have to convert our integral from 0 to 1 into an integral from −1 to 1. We set x = ½(t + 1). Then dx = ½ dt, and (11) with n = 3 and the above values of the nodes and the coefficients yields

    ∫_0^1 e^(−x²) dx = ½ ∫_{−1}^{1} exp(−¼(t + 1)²) dt
    ≈ ½ [(5/9) exp(−¼(1 − √0.6)²) + (8/9) exp(−¼) + (5/9) exp(−¼(1 + √0.6)²)] = 0.746815

(exact to 6D: 0.746825), which is almost as accurate as the Simpson result obtained in Example 3 with a much larger number of arithmetic operations. With 3 function values (as in this example) and Simpson's rule we would get ⅙(1 + 4e^(−0.25) + e^(−1)) = 0.747180, with an error over 30 times that of the Gauss integration.
E X AMP L E 8
Gauss Integration Formula with n
=
4 and 5
4
Integrate f(x) = !7TX cos !7TX from x = 0 to 2 by Gau". Compare with the adaptive integration in Example 6 and comment.
Solutioll.
x =
1
+ 1 gives f(t) =
J = Ad1 + ... + A4f4
!7T(t
+ 1)4 cos (!7T(t + 1». as needed in (11). For 11 = 4 we calculate (6S)
= A 1{f1
+
f4)
+
A 2 (f2
+ i3)
= 0.347855(0.000290309 + 1.02570) + 0.652145(0.129464
+ 1.25459) = 1.25950.
The error is 0.00003 because J = 1.25953 (6S). Calculating with lOS and 11 = 4 gives the same result; so the error is due to the formula. not rounding. For Il = 5 and lOS we get J = 1.25952 6185, too large by the amount 0.000000250 because J = 1.259525935 (lOS). The accuracy is impressive. particularly if we compare the • dmount of work with that in Example 6.
Gauss integration is of considerable practical importance. Whenever the integrand f is given by a formula (not just by a table of numbers) or when experimental measurements can be set at times tj (or whatever t represents) shown in Table 19.7 or in Ref. [GRl], then the great accuracy of Gauss integration outweighs the disadvantage of the complicated tj and Aj (which may have to be stored). Also, Gauss coefficients Aj are positive for all n, in contrast with some of the Newton-Cotes coefficients for larger 11. Of course, there are frequent applications with equally spaced nodes, so that Gauss integration does not apply (or has no great advantage if one first has to get the tj in (11) by interpolation). Since the endpoints -1 and 1 of the interval of integration in ell) are not zeros of Pn' they do not occur among to, ... , tn. and the Gauss fonnula (11) is called. therefore, an open formula, in contrast with a closed formula, in which the endpoints of the interval of integration are to and tn- [For example. (2) and (7) are closed formulas.]
Numeric Differentiation Numeric differentiation is the computation of values of the derivative of a function f from given values of f. Numeric differentiation should be avoided whenever possible, because, whereas integration is a smoothing process and is not affected much by small inaccuracies in function values, differentiation tends to make matters rough and generally gives values of f' much le~s accurate than those of f-remember that the derivative is the limit of the difference quotient. and in the latter you u!>ually have a small difference oflarge quantities that you then divide by a small quantity. However, the formulas to be obtained will be basic in the numeric solution of differential equations. We use the notations fi = t' (x), f;' = f"exj), etc., and may obtain rough approximation formulas for derivatives by remembering that
t' (x)
= lim f(x h~O
This suggests
+
11) - f(x) h
CHAP. 19
828
Numerics in General
t' -
(12)
fl - fo
8f1/2 h
1/2 -
h
Similarly, for the second derivative we obtain etc.
(13)
More accurate approximations are obtained b) differentiating suitable Lagrange polynomials. Differentiating (6) and remembering that the denominators in (6) are 2h2, -h 2 , 2h 2 , we have
Evaluating this at
(14)
Xo, Xl> X2,
we obtain the "three-point formula,,"
I
(a)
f~
=
211 (-3fo
(b)
f~
=
2h (-fo
(c)
f2
,
I
1
= -21z
(fo -
+ 4f1
+
- f2),
f2)'
4fI +
3f2)'
Applying the same idea to the Lagrange polynomial P4(X), we obtain similar formulas, in particular. (15) Some examples and further formulas are included in the problem set as well as in Ref. [E5] listed in App. I.
1. (Rectangular rule) Evaluate the integral in Example I by the rectangular rule (1) with a subinterval of length 0.1. 2. Derive a formula for lower and upper bounds for the rectangular rule and apply it to Prob. I.
!3-8!
TRAPEZOIDAL AND SIMPSON'S RULES Evaluate the integrals numelically as indicated and determine the error by using an integration formula known from calculus. F(x) =
J x
1
dx* -
x*
H(x) =
.
G(x)
IX 0
1~\:*e-~* dx* o
dx* 2 cos x* '
3. F(2l by (2).
Il
=
10
4. F(2) by (7), n = [0
5. Gn) by (2).11 = 10 6. G(I) by (7),11 = 10
7. H(4) by (2),
Il
10
8. H(4) by (7),
Il
= 10
!9-121
HALVING Estimate the error by halving. 9. In Prob. 5 10. In Prob. 6 11. In Prob. 7 12. In Prob. 8
Chapter 19 Review Questions and Problems
829
NON ELEMENTARY INTEGRALS
If IE3d ~ TOL, stop. The result is is3 = 132 + E32 . (Why does 24 = 16 come in?) Show that we obtain 1"32 = -0.000266, so that we can stop. Arrange your 1- and E-values in a kind of "difference table."
113-191
The following integrals cannot be evaluated by the usual methods of calculus. Evaluate them as indicated. Si(x) =
Sex)
I
sinx*
x
o
dx~'.
- - .x~
= {'Sin (X*2) dx*.
C(x)
= {"cos (X*2) dx*
o
0
Si(x) is the sine integral. S(x) and C(x) are the Fresnel integrals. (See App. 3.1.)
en.
13. SiO) by II = 5. II = 10 14. Using the values in Prob. 13. obtain a better value for Si(l). Hint. Use (5). 15. Si(l) by (7), 2111 = 2. 2m = 4 16. Obtain a better value in Prob. 15. Hint. Use (10). 17. Si(l) by (7). 2m = 10 18. S(1.25) by (7). 2111 = 10 19. C( 1.25) by (7). 2111 = 10
If IE3d were greater than TOL, you would have [0 go on and calculate in the next step 141 from (1) with h = ~: then
143
=
142
+
1"42
with
q2
=
20. (Stability) Prove that the trapezoidal rule is stable with respect to rounding.
144
=
143
+
E43
with
E43
= 63
142
=
141
+
I
with
E41
1"41
=
"3 (141
-
1 31 )
I
15 (142 -
1 32 )
I
(J43 -
is3)
where 63 = 26 - 1. (How does this come in?) Apply the Romberg method to the integral of f( 1:) = ~7TX4 cos !7TX from x = 0 to 2 with TOL = 10-4 .
121-241 GAUSS INTEGRATION Integrate by (11) with II = 5: 21. IIx from I to 3 22. co~ x from 0 to!7T 23. e-x" from 0 to I
DIFFERENTIATION
24. sin <x 2 ) from 0 to 1.25 25. (Given TOL) Find the smallest 11 in computing the integral of 1Ix from I to 2 for which 50-accuracy is guaranteed (a) by (4) in the use of (2). (b) by (9) in the use of (7). Compare and comment. 26. TEAM PROJECT. Romberg Integration (W. Romberg. Norske Videllskab. Trolldheil11, FfJrh. 28, Nr. 7, 1955). This method uses the trapezoidal rule and gains precision stepwise by halving h and adding an error estimate. Do this for the integral of f(x) = e- X ti'om x = 0 to x = 2 with TOL = 10-3 , as follows. Step 1. Apply the trapezoidal rule (2) with h = 2 (hence n = I) to get an approximation In. Halve II and use (2) to gel 121 and an error estimate
27. Consider f(x) = X4 for Xo = 0, Xl = 0.2, X2 = 0.4, X3 = 0.6, X4 = 0.8. Calculate f~ from (14a), (14b), (14c), (15). Determine the errors. Compare and comment. 28. A "four-point formula" for the derivative is
Apply it to f(x) = X4 with Xl' ... , X4 as in Prob. 27, determine the error. and compare it with that in the case of (15).
29. The derivative f' (x) can also be approximated in terms of first-order and higher order differences (see Sec. 19.3):
1 E21 =
22 _
I
(121 -
1u)·
If IE211 ~ TOL, stop. The result is 122 = 121 + E21' Step 2. Show that E21 = -0.066596, hence 1"211 > TOL and go on. Use (2) with hl4 to get 131 and add to it the error estimate 1"31 = i(131 - 1 21 ) to get the better 132 = 131 + 1"31- Calculate E32
=
I 24 _
I
I
(132 -
1 22)
= 15
(132 -
1 22),
+
"3I .1.3 fo
-
4"1 ...\ 4 fo + - . .. ) .
Compute t' (0.4) in Prob. 27 from this fOlmula usino differences up to and including first order, ~econd order, third order, fourth order. 30. Derive the formula in Prob. 29 from (14) in Sec. 19.3.
830
CHAP. 19
Numerics in General
==::==
: ::&W :::....
1. What is a numeric method? How has the computer influenced numeric methods? 2. What is floating-point representation of nwnbers? Overflow and underflow?
3. How do error and relative enor behave under addition? Under multiplication? 4. Why are roundoff errors important? State the rounding rules.
5. What is an algorithm"! Which of its properties are important in software implementation? 6. Why is the selection of a good method at least as important on a large computer as it is on a small one? 7. Explain methods for solving equations, in particular fixed-point iteration and its convergence. 8. Can the Newton (-Raphson) method diverge? Is it fast? Same questions for the bisection method.
S T ION SAN D PRO B L EMS 23. What is the relative error of l1a in terms of that of a? 24. Show that the relative error of
25. Compute the solution of x 5 = x + 0.2 near J. = 0 by transforming the equation into the f01Tl1 x = g(x) and starting from Xo = O. (Use 6S.) 26. Solve cos x = x by iteration (6S, Xo = I), writing it as x = (O.74x + cosx)/1.74. obtainingx4 = 0.739085 (exact to (is!). Why does this converge so rapidly? 27. Solve X4 - x 3 - 2x - 34 = 0 by Newton's method with Xo = 3 and 6S accuracy. 28. Solve cos x - x
22. Answer the question in Prob. 21 for the difference 4.81 - 11.752.
0 by the method of false position.
.fO.Q) = 3.00000
= 1.98007
f(l.2)
f(1.4) = 2.92106 f( 1.6) = 1.111534
10. What do you remember about errors in polynomial imerpolation?
13. In what sense is Gau~s integration optimal? Explain details. 14. What does adaptive imegration mean? Why is it useful? 15. Why is numeric differentiation generally more delicate than numeric integration? 16. Write -0.35287. 1274.799, -0.00614. 14.9482. 113, 8517 in floating-point form with 5S (5 significant digits, properly rounded). 17. Compute (5.346 - 3.644)/(3.454 - 3.055) as given and then rounded stepwise to 3S. 2S. I S. ("Stepwise" means rounding the four rounded numbers. not the given ones.) Comment on your results. 18. Compute 0.29731/(4.1132 - 4.0872) with the numbers as given and then rounded stepwise (that is. rounding the rounded numbers) to 4S. 3S, 2S. Comment. 19. Solve x 2 - 50x + 1 = 0 by (6) and by (7) in Sec. 19.1, using 5S in the computation. Compare and comment. 20. Solve x 2 - 100x + 4 = 0 by (6) and by (7) in Sec. 19.1, using 5S in the computation. Compare and comment. 21. Let 4.81 and 12.752 be correctly rounded to the number of digits shown. Determine the smallest interval in which the sum (using the true instead of the rounded values) must lie.
=
29. Compute f(1.28) from
9. What is the advamage of Newton's interpolation formulas over Lagrange's?
11. What is spline interpolation? Tts advantage over polynomial interpolation? 12. List and compare numeric integration methods. When would you appl} them'!
a2 is about twice that of
a.
f( 1.8) = 2.69671
f(l.O) = 2.54030 by linear interpolation. By quadratic interpolation. using f( 1.2), fOA-), IO.6).
30. Find the cubic spline for the data f(-I) = 3
f(1) = I f(3) = 23
f(5) = -1-5
ko = k3 = 3. 31. Compute the integral of X3 from 0 to I by the trapezoidal rule with 11 = 5. What error bounds are obtained from (4) in Sec. 19.5? What is the actual error of the result? Why is this result larger than the exact value? 32. Compute the integral of cos (X2) from 0 to I by Simpson's rule with 2111 = 2 and 2171 = 4 and estimate the error by (10) in Sec. 19.5. (This is the Fresnel integral (38) in App. 3.1 with x = 1.) 33. Compute the integral of cos x from 0 to three-eights rule
f
the
3
b
a
-!rr by
f(x) dx =
"8 hUo + 3.fl + 3f2 + f3) -
-
1
80
.
(b - a)h 4 !'IV)(t)
and give error bounds; here a ~ f ~ band Xj = a + (b - a)jI3, j = 0, ... , 3.
831
Summary of Chapter 19
,
.... ....._......
-
..
II
•
Numerics in General [n this chapter we discussed concepts that are relevant throughout numeric work as a whole and methods of a general nature, as opposed to methods for linear algebra (Chap. 20) or differential equations (Chap. 21). In scientific computations we use the floating-point representation of numbers (Sec. 19.1); fixed-point representation is less suitable in most cases. Numeric methods give approximate values a of quantities. The error E of a is
(Sec. 19.1)
(1)
where a is the exact value. The relatil'e error of Ii is E/a. Errors arise from rounding, inaccuracy of measured values, truncation (that is. replacement of integrals by sums, series by partial sums), and so on. An algorithm is called numerically stable if small changes in the initial data give only cOiTespondingly small changes in the final results. Unstable algorithms are generally useless because errors may become so large that results will be very inaccurate. Numeric instability of algorithms must not be confused with mathematical instability of problems ("ill-conditioned problems," Sec. 19.2). Fixed-point iteration is a method for solving equations f(x) = 0 in which the equation is first transformed algebraically to x = g(x), an initial guess Xo for the solution is made, and then approximations X10 X2 • • • • , are successively computed by iteration from (see Sec. 19.2)
(2)
Xn+l
=
g(xn )
Newton's method for solving equations f(x)
(3)
X,,+l
= Xn
-
(n
= O. 1. ... ).
= 0 is an iteration (Sec. 19.2).
Here Xn+1 is the x-intercept of the tangent of the curve y = f(x) at the point X n • This method is of second order (Theorem 2, Sec. 19.2). If we replace f' in (3) by a difference quotient (geometrically: we replace the tangent by a secant). we obtain the secant method; see (10) in Sec. 19.2. For the bisection method (which converges slowly) and the method offalse position, see Problem Set 19.2. Polynomial interpolation means the detelTnination of a polynomial P>l(x) such that Pn(Xj) = Ii, where.i = 0, ... , 11 and (xo, f 0), ••. , (xn , fn) are measured or observed values, values of a function, etc. Pn(x) is called an intelpo/afioll poZvl1omial. For given data. Pn(X) of degree 11 (or less) is unique. However. it can be written in different forms, notably in Lagrange's form (4). Sec. 19.3, or in Newton's divided difference form (10). Sec. 19.3, which requires fewer operations. For regularly spaced xo, Xl = Xo + h, .... xn = Xo + l1h the latter becomes Newton's forward difference formula (formula (14) in Sec. 19.3)
832
CHAP. 19
Numerics in General
f(x) = Pn(X) = fo
(4)
where r
=
+
rt:.fo
+ ... +
r(r - 1) ... (r - n
+
I)
t:.nfo
n!
(x - xo)lh and the forward differences are t:.fj = f j +1 - fj and (k
=
2,3, .. ').
A similar formula is Newton's backward difference illterpolationfo1711ula (formula (18) in Sec. 19.3). Interpolation polynomials may become numerically unstable as 11 increases, and instead of interpolating and approximating by a single high-degree polynomial it is preferable to use a cubic spline g(x). that is. a twice continuously differentiable interpolation function [thus. g(Xj) = fjJ, which in each subinterval Xj ~ x ~ Xj+1 consists of a cubic polynomial qj(x): see Sec. 19.4. Simpson's rule of numeric integration is [see (7), Sec. 19.5] h
b
(5)
fa
f(x) dx =
3"
(fo
+
4fL
+
2f2
+ 4j3 + ... +
2f2711-2
+ ~f2m-l +
f2711)
with equally spaced nodes Xj = Xo + jh. j = 1.... ,2m, h = (b - a)/(2m). and f j = f(xj)' It is simple but accurate enough for many applications. Its degree of precision is DP = 3 because the error (8), Sec. 19.5. involves h4. A more practical error estimate is (10), Sec. 19.5.
obtained by first computing with step h, then with step /1/2, and then taking III 5 of the difference of the results. Simpson's rule is the most important of the Newton-Cotes formulas, which are obtained by integrating Lagrange interpolation polynomials, linear ones for the trapezoidal rule (2), Sec. 19.5, quadratic for Simpson's mle. cubic for the three-eights rule (see the Chap. 19 Review Problems). etc. Adaptive integration (Sec. 19.5, Example 6) is integration that adjusts ("adapts") the step (automatically) to the variability of f(x). Romberg integration (Team Project 26, Problem Set 19.5) starts from the trapezoidal rule (2). Sec. 19.5. with h. h12, h/4, etc. and improves results by systematically adding error estimates. Gauss integration (II), Sec. 19.5, is important because of its great accuracy (DP = 217 - 1, compared to Newton-Cotes's DP = 11 - 1 or 11). This is achieved by an optimal I;hoice of the nodes, which are not equally spaced; see Table 19.7, Sec. 19.5. Numeric differentiation is discussed at the end of Sec. 19.5. (Its main application (to differential equations) follows in Chap. 21.)
, CHAPTER
J
20
Numeric Linear Algebra In this chapter we explain "orne of the most important numeric methods for solving linear systems of equations (Secs. 20.1-20.4), for fitting straight lines or parabolas (Sec. 20.5), and for matrix eigenvalue problems (Secs. 20.6-20.9). These methods are of considerable practical importance because many problems in engineering, statistics, and elsewhere lead to mathematical models whose solution requires methods of numeric linear algebra. COM MEN T. This chapter is independent of Chap. 19 and can be studied immediately after Chap. 7 or 8.
Prerequisite: Secs. 7.1. 7.2, 8.1. Sections that may be omitted in a shorter course: 20.4. 20.5. 20.9 References and Answers to Problems: App. I Part E. App. 2
20.1
Linear Systems: Gauss Elimination A linear system of n equations in E I , . . . , En of the form
11
unknowns
Xl ••• ,
Xn is a set of equations
(1)
where the coefficients ajk and the bj are given numbers. The system is called homogeneous if all the bj are zero; otherwise it is called nonhomogeneous. Usmg matrix multiplication (Sec. 7.2), we can write (1) as a single vector equation (2)
Ax
where the coefficient matrix A
A=
=
=b
[ajk] is the 11 X 11 matrix
au
al2
al n
a21
a22
a2n and
anI
an2
ann
bl
Xl
x=
and
Xn
and b =
bn 833
834
CHAP. 20
Numeric Linear Algebra
are column vectors. The following matrix system (I):
A=
[A
b]
A is
called the augmented matrix of the
=
A solution of (I) is a set of numbers Xl' • . • , Xn that satisfy all the II equations, and a solution vector of (1) is a vector x whose components constitute a solution of (1). The method of solving such a system by determinants (Cramer's rule in Sec. 7.7) is not practical, even with efficient methods for evaluating the determinants. A practical method for the solution of a linear system is the so-called Gal/ss eliminatioll, which we shall now discuss (proceeding illdependently of Sec. 7.3).
Gauss Elimination This standard method for solving linear systems (I) is a systematic process of elimination that reduces (1) to "triangular form" because the system can then be easily solved by "back substitution." For instance, a triangular system is
and back substitution gives
X3
= 3/6 = 112 from the third equation, then
from the second equation, and finally from the first equation
How do we reduce a given system (I) to triangular form? In the first step we elimil1ate from equation E2 to En in (I). We do this by adding (or subtracting) suitable multiples of E1 from equations E2 , ••• , En and taking the resulting equations, call them E~, ... , E~ as the new equations. The first equation, E 1 , is called the pivot equation in this step, and llU is called the pivot. This equation is left unaltered. In the second step we take the new second equation E~ (which no longer contains Xl) as the pivot equation and use it to eliminate X2 from E; to E~. And so on. After Il - I steps this gives a triangular system that can be solved by back substitution as just shown. In this way we obtain precisely all solutions of the given system (as proved in Sec. 7.3). The pivot llkk (in step k) must be different from zero and should be large in absolute value, to avoid roundoff magnification by the multiplication in the elimination. For this we choose as our pivot equation one that has the absolutely largest ajk in column k on or below the main diagonal (actually, the uppermost if there are several such equations). This popular method is called partial pivoting. It is used in CASs (e.g., in Maple). Xl
SEC 20.1
835
Linear Systems: Gauss Elimination
Partial pivoting distinguishes it from total pivoting, which involves both row and column interchanges but is hardly used in practice. Let us illustrate this method with a simple example. EXAMPLE 1
Gauss Elimination. Partial Pivoting Solve the system
Solutioll.
We must pivot since El ha, no xl-term. In Column 1. equation E3 ha, the large,t coefficient. Hence we interchange El and E3, 6Xl + 2x2 + 8x3 = 3X1
26
+ 5X2 + 2X3
=
8.1:2 + 2X3
~
8
-7.
Step 1. Elimillatioll of Xl It would suftlce to show the augmented matrix and operate on it. We show both the equations and the augmented matrix. In the first step. the first equation is the pivot equation. Thus
Pivot 6----+(§i)+ 2X2 + 8x3
=
26
Eliminate - - ' > ~ + 5x2 + 2r3 = 8X2 + 2x3
8
= -7
[:
2
8
5
2
8
2
':]
-7
To eliminate Xl from the other equations (here. from the second equation). do: Subtract 3/6
~
112 times the pivot equation from the second equation.
The result is ntl
+
+
=
26
2
8
4X2 - 2'-3 = -5
4
-2
-5
8x2 + 2'3 = -7
8
2
-7
2x2
8x3
26]
Step 1. Elimillatioll of X2 The largest coefficient in Column 2 is 8. Hence we take the nell' third equation as the pivot equation, interchanging equations 2 and 3,
26
2
8
Pivot 8----+
~+ 2x3 =-7
8
2
-7
Eliminate
I§l- 2X3 =
4
-2
-5
6Xl + 2x2 + 8x3 =
--'>
-5
[:
26]
To eliminate .1:2 from the third equation. do: Subtract 112 times the pivot equation from the third equation. The resulting triangular system i, shown below. Thi, is the end of the forward elimination. Now comes the back substitution.
Back substitutioll. Determillatioll of X3, X2, Xl The trIangular system obtained in Step 2 is 2
8
8
2
-7
o
-3
_.2
26] 2
836
CHAP. 20
Numeric Linear Algebra
From this system. taking the last equation, then the second equation. and finally the fust equation. we compute the solution x 3 -- 12 X2
= ~(-7 - 2X3) = -I
Xl
= ~(26
- 2~2 -
8x3)
= 4.
This agrees with the values given above. before the beginning of the example.
•
The general algorithm for the Gauss elimination is shown in Table 20.1. To help explain the algorithm, we have numbered some of its lines. bj is denoted by aj,n+l' for uniformity. In lines I and 2 we look for a possible pivot. [For k = I we can always find one; otherwise Xl would not occur in (l ).] In line 2 we do pivoting if necessary, picking an ajk of greatest absolute value (the one with the smallest j if there are several) and interchange the corresponding rows. If lakkl is greatest, we do no pivoting. 11ljk in line 3 suggests multiplier, since these are the factors by which we have [0 multiply the pivor equation E~ in Step k before subtracting it from an equation Ef below E~' from which we want to eliminate Xk' Here we have written E~ and E/ to indicate that after Step I these are no longer the given equations in (1), but these underwent a change in each step, as indicated in line 4. Accordingly, ajk etc. in lines 1-4 refer to the most recent equations, andj ~ k in line I indicates that we leave un[Ouched all the equations that have served as pivot equations in previous steps. For p = k in line 4 we get 0 on the right. as it should be in the elimination.
In line 5, if the last equation in the triangular system is 0 = b~ *- 0, we have no = 0, we have no unique solution because we then have fewer solution. If it is 0 = equations than unknowns.
b:
E X AMP L E 2
Gauss Elimination in Table 20.1, Sample Computation In Example 1 we had all = O. so that pivoting wa, necessary. The greatest coefficient in Column I was a3l' Thus J = 3 in line 2. and we interchanged El and E 3 . Then in lines 3 and 4 we computed 11121 = 3/6 = ~ and G22 =
5 - ~ . 2 = 4.
G23 =
2 - ~. 8 = -2.
G24
= 8 - ~ . 26 = -5.
and then 11131 = 0/6 = O. so that the third equation 8X2 + 2X3 = -7 did not change in Step I. In Step 2 = 2) we had 8 as the greatest coefficient in Column 2. hence J = 3. We interchanged equations 2 and 3. computed 11132 = -4/8 = - ! in line 4. and the a33 = -2 - !·2 = -3. a34 = -5 - ~(-7) = -~. This produced the triangular fOim used in the back substitution. • (k
If akk = 0 in Step k, we must pivot. If lakkl is small. we should pivot because of roundoff error magnification that may seriously affect accuracy or even produce nonsensical re~ults. E X AMP L E 3
Difficulty with Small Pivots The solution of the 'ystem
0.0004Xl + 1.402x2 = 1.406 0.4003xl - 1.502x2 = 2.501 is Xl = 10'"'2 = 1. We solve this system by the Gauss elimination. using four-digit floating-point arithmetic. (4D is for simplicity. Make an 8D-arithmetic example that shows the same.)
In
(a) Picking the first of the given equations as the pivot equation. we have to mUltiply this equation by 0.4003/0.0004 = 1001 and subtract the result from the second equation. obtaining
=
SEC. 20.1
837
Linear Systems: Gauss Elimination
Table 20.1
Gauss Elimination
ALGORITHM GAUSS
(A =
[ajd = LA
b])
This algorithm computes a unique solution x
=
[Xj] of the system (1) or indicates that
(1) has no unique solution.
INPUT:
Augmented n X (n
=
matrix
A=
[ajk]' where aj,n+l = bj
1. . . . , n - 1. do: 0 for all j exists."' Stop
If ajk =
1
I)
Solution x = lXj] of (I) or message that the system (1) has no unique solution
OCTPUT:
For k
+
~
k then OUTPUT "No unique solution
[Procedure completed unsuccessfully; A is singular]
2
Else exchange the contents of rows J and k of A with J the smallest j ~ k such that lajkl is maximum in column k.
3
For j
k
=
+
I. .... n. do:
. _ ajk 1njk- - - akk
For p = k
4
I
+
ajp:
I. . . . , n + I. do:
= ajp - 1njk a kp
End End End If ann = 0 then OUTPUT "No unique solution exists." Stop Else
5
6
[Start back substitution]
For i 7
=
n - 1. . . . . 1, do: Xi =
....!...
(ai,n+l -
aii
i
aijXi)
j=i+l
End OUTPUT x = [Xj]. Stop End GAUSS
~ 1405x2 =
Hence x2
=
~ 1404/( ~ 1405)
Xl
This failure occurs because large error in Xl'
=
~ 1404.
= 0.9993, and from the first equation. 1
00004 (l.406 ~ 1.402· 0.9993) .
instead of Xl U.oo5
=-= 0.0004
=
10. we get
12.5.
iani is small compared with ia12i. so that a small roundoff error in X2 leads to a
838
CHAP. 20
Numeric Linear Algebra
(b) Picking the second of the given equations as the pivot equation. we have
10
multiply this equation by
0.0004/0.4003 = 0.0009993 and subtract the result from the first equation. obtaining
1.404x2 = 1.404.
Hence .\"2 = I, and from the pivot equation xl = 10. This success occur~ because la2l1 is not very small compared to la221. so that a small roundoff enOf in .\"2 would not lead to a large error in Xl' Indeed. for instance. if we had the value x2 = 1.002. we would still have from the pivot equation the good value Xl = (2.501 + 1.505)10.4003 = 10.01. •
Error estimates for the Gauss elimination are discussed in Ref. [E5] listed in App. I. Row scaling means the multiplication of each Row j by a suitable scaling factor ~J- It is done in connection with partial pivoting to get more accurate solutions. Despite much research (see Refs. [E9]. [E24] in App. I) and the proposition of several principles, scaling is still not well understood. As a possibility, one can !>cale for pivot choice only (not in the calculation, to avoid additional roundoff) and take as first pivot the entry aj1 for which IlljlltlAjl is largest: here Aj is an entry of largest absolute value in Row j. Similarly in the further steps of the Gauss elimination. For instance, for the system 4.0000X1
+
14020.\"2
=
14060
0.4003.\"1 - 1.502.\"2 = 2.501 we might pick 4 as pivot, but dividing the first equation by 104 gives the system in Example 3, for which the second equation is a better pivot equation.
Operation Count Quite generally, important factors in judging the quality of a numeric method are Amount of storage Amount of time (== number of operations) Effect of roundoff elTOL For the Gauss elimination, the operation count for a full matrix (a matrix with relatively many nonzero entries) is as follows. In Step k we eliminate Xk from n - k equations. This needs n - k divisions in computing the IIljk (line 3) and (n - k)(11 - k + 1) multiplications and as many subtractions (both in line 4). Since we do 11 - 1 steps. k goes from I to n - I and thus the total number of operations in this forward elimination is n-1
fen)
=
L
n-1
(n -
k)
+2L
k~l
n-1
=L s~l
(11 - k)(n - k
+
(write Il - k = s)
1)
k~l
n-1
S
+
2
L
s(s
+
1) = !(Il - 1)11
+ 1(11 2
-
1)n = 111
3
S=l
where 2n 3/3 is obtained by dropping lower powers of 11. We see that f(ll) grows about proportional to /1 3 . We say that f(n) is of order n 3 and write
SEC. 20.1
839
Linear Systems: Gauss Elimination
where 0 suggests order. The general definition of 0 is as follows. We write f(ll) = O(h(n))
if the quotient If(n)lh(Iz)1 remains bounded (does not trail off to infinity) as 11 ~ x. In our present case, hen) = 11 3 and, indeed. f(n)ln 3 ~ 2/3 because the omitted telms divided by 11 3 go to zero as n _ h . In the back substitution of Xi we make 11 - i multiplications and as many subtractions, as well as 1 division. Hence the number of operations in the back substitution is n
b(ll) = 2
2: (11
-
i)
+ 11
= 2
2: s + n
=
n(n
+
I)
+
n =
/1
2
+
211 =
2 0(11 ).
5=1
We see that it grows more slowly than the number of operations in the forward elimination of the Gauss algorithm, so that it is negligible for large systems because it is smaller by a factor 11, approximately. For instance, if an operation takes 10- 9 sec, then the times needed are: Algorithm
n = 1000
Elimination Back substitution
n=10000
0.7 sec
11 min
0.001 sec
0.1 sec
.P RO 8 E=E M- S E.E10:;-L For applicatiolls of linear systems see Secs. 7.1 and 8.2.
11-31
5. 2X1
GEOMETRIC INTERPRETATION
6.\1
-
8x2 = -4
+
2x2
= 14
Solve graphically and explain geometrically. 1. 4.\1
+
X2 =
3.\1 - 5.\2
2.
=
-4.3
6. 25.38x1
-33.7
-7. 05x 1
1.820.\1 - 1.183x2 = 0 -12.74.\1
3.
+
7.2.\1 -21.6.\1
14-141
+
8.281.\2 = 0 3.5x2 =
10.5.\2 = -4R.5
6X2
X2
=
-3
= 30.60
=
4.30.\2
+
8.
5x 1
+
3X2
- 4x2 IOx1 - 6.\2
= 137.86
8X3
= -85.88 178.54
+
X3
15x2
=
2
+ 8X3 = -3
+
26x3
9. 4.\1 + IOx2 - 2X3 -Xl -
-8.50
13x3
13x1 - 8X2
GAUSS ELIMINATION
+
+
6X1
16.0
Solve the following linear systems by Gauss elimination. with partial pivoting if necessary (but without scaling). Show the intermediate steps. Check the result by substitution. If no solution or more than one solution exists, give a reason. 4. 6.\"1
7.
15.48x2
+
= =
3X3 =
0 -20 30
25x2 - 5X3 = -50
CHAP. 20
840 10.
Xl
+
2X2
lOx 1
+
X2 IOx2
11. 3.4x 1 -Xl
3X3
-II
+
X3
8
i-
2X3
=
(b) Gauss elimination and nonexistence. Apply the Gauss elimination to the following two systems and compare the calculations step by step. Explain why the elimination fails if no solution exists.
2
6.l2x2 - 2. 72x3 =0
+
1.80X2
2.7x1 - 4.86x2
12.
Numeric Linear Algebra
3X2
+
+
0.80X3
-
2. 16x3=0
5X3
=
1.20736
+
6X3
=
1.6x2
-1.6
+ 4. 8x3
4.8x2 - 9.6x3 7.2x3
14.
4.4x2
+
32.0
+
7.2x4 = -78.0
+ 4.8x4
20.4
=
3.0x3 - 6.6x4
=
-4.65
+ 8.4X4
=
4.62 -4.35
- 7.6x3
+
3.0x4 =
5.97
15. CAS EXPERIMENT. Gauss Elimination. Write a program for the Gauss eliminarion with pivoting. Apply it to Probs. 11-14. Experiment with systems whose coefficient determinant is small in absolute value. Also investigate the perfonnance of your program for larger systems of your choice. including sparse systems.
2X2 - x3
5
9X1
+
5X2 - X3
13
Xl
+
+
3
X2
X3 =
(c) Zero determinant. Why maya computer program give you the result that a homogeneous linear system has only the trivial solution although you know its coefficient determinant to be zero? (d) Pivoting. Solve System lA) (below) by the Gauss elimination first without pivoting. Show that for any fixed machine word length and sufficiently small E > 0 the computer gives X2 = I and then Xl = O. What is the exact solution? Its limit as E ~ O? Then solve the system by the Gauss elimination with pivoting. Compare and comment. (e) Pivoting. Solve System (H) by the Gauss elimination and three-digit rounding arithmetic. choosing (i) the first equation, (ii) the second equation as pivot equation. (Remember to round to 3S after each operation before doing the next, just as would be done on a computer!) Then use four-digit rounding arithmetic in those two calculations. Compare and comment.
2
16. TEAM PROJECT. Linear S~'stems and Gauss Elimination. (a) Existence and uniqueness. Find a and b such that aX1 + X2 = b, Xl + X2 = 3 has (i) a unique solution. (ii) infinitely many solutions, (iii) no solutions.
20.2
+
4X1
X2
-0.329193
13. 6.4x 1 + 3.2x2 3.2x1 -
3
+
-2.34066
3X1 - 4X2 5x1
0
+ X3
Xl
(E)
-4.61 6.21x1
+
3.35x2 = -7.19
Linear Systems: LU-Factorization, Matrix Inversion We continue our discussion of numeric methods for solving linear systems of n equations in 11 unknowns Xl' ••• , X n , (1)
Ax
=
b
where A = [ajk] is the 11 X 11 coefficient matrix and x T = [Xl' .. xn] and b T = [hI' .. bnJ. We present three related methods that are modifications of the Gauss elimination, which
SEC. 20.2
841
Linear Systems: LU-Factorization, Matrix Inversion
require fewer arithmetic operations. They are named after Doolittle, Crout, and Cholesky and use the idea of the LV-factorization of A, which we explain first. An LV-factorization of a given square matrix A is of the form
(2)
A
LV
=
where L is lower triangular and U i, upper triallgular. For example,
A =
[2 3J = 8
[1 OJ [2 3J
LU =
5
4
I
0-7
It can be proved that for any nonsingular matrix (see Sec. 7.8) the rows can be reordered so that the resulting matrix A has an LU-factorization (2) in which L turns out to be the matrix of the multipliers Injk of the Gauss elimination, with main diagonal 1, ... , 1, and V is the matrix of the triangular system at the end of the Gauss elimination. (See Ref. [E5], pp. 155-156, listed in App. 1.) The crucial idea now is that Land U in (2) can be computed directly, without solving simultaneous equations (thus, without using the Gauss elimination). As a count shows. this needs about n 3 /3 operations. about half as many as the Gauss elimination. which needs about 21l 3 /3 (see Sec. 20.1). And once we have (2), we can use it for solving Ax = b in two steps. involving only about 11 2 operations. simply by noting that Ax = LUx = b may be written
(3)
(a)
Ly
=b
where
Ux
(b)
=
y
and solving first (3a) for y and then (3b) for x. Here we can require that L have main diagonal 1, ... ,las stated before; then this is called Doolittle's method. Both systems (3a) and (3b) are triangular, so we can solve them as in the back substitution for the Gauss elimination. A similar method, Crout's method, is obtained from (2) if V (instead of L) is required to have main diagonal 1. ...• 1. In either case the factorization (2) is unique. EXAMPLE 1
Doolittle's Method Solve the system in Example I of Sec. 20.1 by Doolittle's method.
S Diu ti~Il.
The decomposition (2) is obtained from
5
a12
A
=
[ajk]
=
[ "n
a21
a22
{[23 = [ 30 ""]
8
a31
a32
a33
6
2
'] [' 2
=
8
~]F
0
1Il21 nJ31
11132
U12
""] 1123
"22
0
li33
by determining the 111jk and liJ1<' using matrix multiplication. By going through A row by rov. we get successively
all =
3 = I . lIll =
11121 =
lill
a12
=
5
=
I • lI12
0
= 1112
a13
=2 =
I . 1/13 =
a23
= 2 =
111211113
1123
-I
li23
=2
=2.2 11132 =
+
li13
I •2 +
li33
842
CHAP. 20
Numeric Linear Algebra
Thus the factorization (2) is
:
:
:] =
6
2
8
[
LU [:1 0 =
2
-I
~] [:
: :] .
I
0
0
6
We first solve Ly = b, determining '"I = 8. then.\"2 = -7, then Y3 from 2Y1 - )"2 + thu~ (note the interchange in b because of the interchange in A!)
[~
2
0
:] [::::] = [ _ : ] .
-I
26
1)3
Then we solve Ux = y. determining
Solution
x3 =
)"3 =
16 + 7 + Y3 = 26;
,=[:]
3/6. then x2, then xl' that is,
Solution
x=
[J •
This agrees with the solution in Example I of Sec. 20.1.
Our formulas in Example I suggest that for general 11 the entries of the matrices L = rmjk] (with main diagonal l. ...• 1 and mjk suggesting "multiplier") and U = [Ujk] in the Doolittle method are computed from k=l,"',n j = 2,···, n j-1
(4) Ujk
=
lljk -
L
k
11ljs U s k
= j .... , n; j
~
2
s=l
j = k
Row Interchanges.
+ 1, ... , n; k
~
2.
Matrices, such as
[~
:J
or
[~ ~J
have no LU-factorization (try!). This indicates that for obtaining an LU-factorization. row interchanges of A (and corresponding interchanges in b) may be necessary.
Cholesky's Method
'*
For a symmetric, positive definite matrix A (thUS A = AT, X TAx > 0 for all x 0) we can in (2) even choose U = L T, thus Ujk = n1kj (but cannot impose conditions on the main diagonal entries). For example,
SEC. 20.2
843
Linear Systems: LU-Factorization. Matrix Inversion
The popular method of solving Ax = b based on this factorization A = LLT is called Cholesky's method. In terms of the entries of L = [ljkl the formulas for the factorization are
= 2,"',11
j j-l
(6)
L
Gjj -
Ij8
2
j = 2.···.11
8=1
= j + I, . . . , 11;
p
j
~
2.
If A is symmetric but not positive definite, this method could still be applied, but then leads to a complex matrix L, so that the method becomes impractical. E X AMP L E 1
Cholesky's Method Solve by Cholesky"s method: 4.\"1 +
2X2
+ 14.\"3
2"1 + 17.\"2 14xl -
Solution.
5x2
=
14
5.\"3 = -!o1
+
83.\"3 =
155.
From (6) or from the form of the factorization
I~ ~:l
: [
-5
14
o
: 1[~1
= [:::
83
133
131
o
0
we compute. in the given order. 111 =
~
=
2
121 =
a21
-
2
=
-
2
111
a31
= I
131 =
-
14
= -
111
=
2
7
This agrees with (5). We now have to solve Ly = b, that is. 0
[;
4
-3
0] ["'] [ 14]
~ ~~
As the second step. we have to solve Ux
[:
4 0
=
=
-~:~
LT X
Solution .
y~ [-':J
= y, that is.
-;] [:} [-,;]
Solution
x~ [-:J
•
CHAP. 20
844
Numeric Linear Algebra
Stability of the Cholesky Factorization
THEOREM 1
The Clwlesky LLT -!actorizatio/1 is numerically stable (as defined in Sec. 19.1).
PRO 0 F
We have ajj = lj12 + Ij22 + ... + ljf by squaring the third formula in (6) and solving it for ajj. Hence for allljk (note that ljk = 0 for k > j) we obtain (the inequality being trivial)
That is,
ljk
2
is bounded by an entry of A, which means stability against rounding.
•
Gauss-Jordan Elimination. Matrix Inversion Another variant of the Gauss elimination is the Gauss-Jordan elimination, introduced by W. Jordan in 1920, in which back substitution is avoided by additional computations that reduce the matrix to diagonal form. instead of the triangular form in the Gauss elimination. But this reduction from the Gauss triangular to the diagonal form requires more operations than back substitution does, so that the method is disadvantageous for solving systems Ax = h. But it may be used for matrix inversion, where the situation is as follows. The inverse of a nonsingular square matrix A may be determined in principle by solving the 11 systems (j
(7)
=
1, ... , 11)
where bj is the jth column of the 11 X 11 unit matrix. However. it is preferable to produce A -1 by operating on the unit matrix I in the same way as the Gauss-Jordan algorithm, reducing A to I. A typical illustrative example of this method is given in Sec. 7.8.
..........- . ........
!'!I!.~.!I!.!IIII~III!.':!I!_~_lJ!!I!!y~~::P---
11-71
•• _ -
...,.
DOOLITTLE'S METHOD
Show the factorilation and solve by Doolittle's method. 1.
3Xl
+
15xl
+
2x2
=
15.2
Ilx2 = 77.3
6.
-9.88 3.3x3 = -16.54
0.5Xl - 3.0X2 + -1.5Xl - 3.5x2 -
7.
3Xl 18xl
+
9X2
+
+ 48x2 +
9Xl -
27x2
1O.4x3 =
6X3 =
21.02
2.3
39x3 = 13.6
+ 42x3
=
4.5
8. TEAM PROJECT. Crout's method factorizes A = LV, where L is lower ttiangular and V is upper
SEC. 20.3
845
Linear Systems: Solution by Iteration
triangular with diagonal entnes lljj = l,j = 1, ... ,11, (a) Formulas. Obtain formulas for Crout's method similar to (4).
20
(b) Examples. Solve Probs. I and 7 by Crout's method. (e) Factor the following matrix by the Doolittle. Crout. and Cholesky methods.
-4
l-:
14. CAS PROJECT. Cholesky's Method. (a) Write a program for solving linear systems by Cho)esky's method and apply it to Example 2 in the text, to Probs. 9-11, and to systems of your choice.
25 4
(b) Splines. Apply the factorization part of the program to the following matrices (as they occur in (9), Sec. 19.4 (with cJ = 1), in connection with splines).
(d) Give the formulas for factoring a tridiagonal matrix by Crout's method. (e) When can you obtain Crout's factorization from Doolittle's by transposition?
19-131
4
CHOLESKY'S METHOD
[
Show the factorization and solve.
9.
9X1
+
6X2
+
12x3 =
6Xl
+
13x2
+
IIx3 = 118
12x1
+
IIx2
+
26x3 = 154
0.64X2
+
0.32x3 = 1.6
0.12x1 + 0.32x2
+
0.56x3 = 5.4
+
6x 1
+
8X1
+
12.
Xl -Xl
-
+
6X2 +
8X3 =
0
34x2
+
52x3 =
-80
52x2
+
129x3 = -226
X2
+
3X3
5X2 -
+
3X1 - 5X2
+
19x3
+
2X1 - 2X2
+
3X3
+
20.3
2X4 =
4
o
2
INVERSE Find the inverse by the Gauss-Jordan method. showing the details.
16. In Prob 4. 17. In Prob. 5. 18. In Prob. 6.
19.
30
3X4 =
188
=
2
21x4
o o
116-191
2x4 = -70
5X3
].
o o
15. (Definiteness) Let A and B be positive definite 11 X /1 matrices. Are - A, A f. A + B, A - B positive definite?
0. 12x3 = 1.4
11. 4xl
4
87
+
10. 0. 04X 1
o
2
In Prob. 7.
20. (Rounding) For the following matrix A find det A. What happens if you round off the given entries to (a) 5S, (b) 4S, (c) 3S, (d) 2S, (e) IS? What is the practical implication of your work? 114
-3/28
Linear Systems: Solution by Iteration The Gauss elimination and its variants in the last two sections belong to the direct methods for solving linear systems of equations; these are methods that give solutions after an amount of computation that can be specified in advance. In contrast. in an indirect or iterative method we start from an approximation to the true solution and. if successful, obtain better and better approximations from a computational cycle repeated as often as may be necessary for achieving a required accuracy, so that the amount of arithmetic depends upon the accuracy required and varies from case to case.
846
CHAP. 20
Numeric Linear Algebra
We apply iterative methods if the convergence is rapid (if matrices have large main diagonal entries, as we shall see), so that we save operations compared to a direct method. We also use iterative methods if a large system is sparse, that is, has very many zero coefficients. so that one would waste space in storing zeros, for instance, 9995 zeros per equation in a potential problem of 104 equations in 104 unknowns with typically only 5 nonzero terms per equation (more on this in Sec. 21.4).
Gauss-Seidel Iteration Method l This is an iterative method of great practical importance. which we can simply explain in terms of an example. E X AMP L E 1
Gauss-Seidel Iteration We consider the linear system =
-0.25x I
+
50
- 0.25x4 = 50
(I)
+
-0.25.1'1
- 0.25.1'2 - 0.25.1'3
(Equations of this form arise in the numeric in the fonn
~olution
+
of PDEs and in spline interpolation.) We write the system
0.25x2 + 0.25x3
Xl =
+ 50
X2
= 0.25.1'1
+ 0.25x4 + 50
X3
= 0.25xl
+ 0.25x4 + 25
(2)
0.25 t2 + 0.25x3
X4 =
+ 25.
These equations are now used for iteration: that is. we start from a (possibly poor) approximatIOn to the solution. say xiO) = 100, x~O> = 100. x~O) = 100. x~O) = 100. and compute from (2) a perhaps better approximation
Use "old" values ("New" values here not yet available)
t 0.25x~o)
(3)
X 2(1)-
0.25x~O)
+ 50.00 = 100.00
0. 25x ll
i
0.25x~0l
+ 50.00 = 100.00
ill
0. 25xiO)
+ 25.00 = 75.00
0. 25x lll X4 -
+
0.25x~1l + 0.25x~1l
+ 25.00 = 68.75
t
Use "new" values
These equations (3) are obtained from (2) by substituting on the right the //lost recellt approximation for each unknown. In fact, correspondmg value~ replace previous ones as soon as they have been computed. so that in
IPHlLIPP LUDWIG VON SEIDEL (1821-1896), German mathematician. For Gauss see foomore 5 in Sec. 5.4.
SEC. 20.3
847
Linear Systems: Solution by Iteration
the second and third equations we use xill (not xiOJ), and in the last equation of (3) we use x~l) and x~l) (not x~Ol and x~OJ). Using the same principle. we obtain in the next step 0.25x~ll + 0.25x~ll
+ 50.00
X~2) = 0.25xi2 )
+ 0.25x~1l + 50.00
x~2) = 0.25xi
+
2
)
0.25x~1)
(2) _
X4
-
=
93.750
= 90.625
+ 25.00
= 65.625
+ 25.00
= M.062
Further steps give the values
Xl
X2
X3
X4
89.062 87.891 87.598 87.524 87.506
88.281 87.695 87.549 87.512 87.503
63.281 62.695 62.549 62.512 62.503
62.891 62.598 62.524 62.506 62.502
Hence convergence to the exact solution Xl
=
X2 =
87.5, x3
= X4
= 62.5 (verify!) seems rather fast.
•
An algorithm for the Gauss-Seidel iteration is shown on the next page. To obtain the algorithm, let us derive the general formulas for this iteration. We assume that ajj = I for j = 1, ... , n. (Note that this can be achieved if we can realTange the equations so that no diagonal coefficient is zero; then we may divide each equation by the corresponding diagonal coefficient.) We now write
(4)
A=I+L+U
(ajj
=
1)
where I is the n x n unit matrix and Land U are respectively lower and upper triangular matrices with zero main diagonals. If we substitute (4) into Ax = b. we have Ax = (I
+ L + U) x = b.
Taking Lx and Ux to the right, we obtain, since Ix = x, (5)
x = b - Lx - Ux.
Remembering from (3) in Example I that below the main diagonal we took "new" approximations and above the main diagonal "old" ones, we obtain from (5) the desired iteration formulas
(6) where x(m) [_l}m)] is the mth approximation and x(m+ 1) = [X}m+ 1)] is the (m + I )st approximation. In components this gives the formula in line I in Table 20.2. The matrix A must satisfy Gjj =1= 0 for allj. In Table 20.2 our assumption ajj = I is no longer required, but is automatically taken care of by the factor l/ajj in line 1.
848
CHAP. 20
Table 20.2
Numeric Linear Algebra
Gauss-Seidel Iteration
ALGORITHM GAUSS-SEIDEL (A. b,
XlV), E, N)
This algorithm computes a solution x of the system Ax = b given an initial approximation x W), where A = [ajd is an 11 X 11 matrix with lljj =1= 0, j = I, ... , 11.
A, b, initial approximation of iterations N
INPUT: OUTPUT:
x(Q),
tolerance
E
> 0, maximum number
Approximate solution x(m) = [x7")] or failure message thal X(N) does not satisfy the tolerance condition
For 111 = 0..... N - 1. do: For j = L ... , /I, do: r~m+ll =
•.1
-
I
(j-l b. - ~
£.J
J
(l ..
2
€
£.J
Jk k
k-]
JJ
End If m':IX IX)7n+ll - xj",JI <
It
~
(/. x(m+lJ -
(/.
x(mJ Jk k
)
k~j+l
then OUTPUT
xcm+ll.
Stop
J
[Procedure completed successtittly 1
End OUTPUT:
"No solution satisfying the tolerance condition obtained after N iteration steps." Stop [Procedure completed
U11.\"Ucces.~f~ttly]
End GAUSS-SEIDEL
Convergence and Matrix Norms An iteration method for solving Ax = b is :-'did to converge for an initial x(V) if the corresponding iterative sequence xW>, x(1), X(2 ), . . . converges to a solution of the given system. Convergence depends on the relation between x(m) and x(m+lJ. To get this relation for the Gauss-Seidel method, we use (6). We first have (I
and by multiplying by (I (7)
x{m+lJ
=
Cx{m)
+
+
+
L)x(m+lJ
=
b -
Ux(m)
L)-1 from the left,
(I
+ L) -1 b
where
C = -(I + L)
-1
U.
The Gauss-Seidel iteration converges for every x(O) if and only if all the eigenvalues (Sec. 8.1) of the "iteration matrix" C = [CjkJ have absolute value less than l. (Proof in Ref. [E5]. p. 191, listed in App. 1.)
CAUTION! If you want to get C, first divide the rows of A by au to have main diagonal 1, ... , 1. If the spectral radius of C (= maximum of those absolute values) is smalL then the convergence is rapid.
Sufficient Convergence Condition.
A sufficient condition for convergence is
(8)
IICII
< 1
SEC. 20.3
849
Linear Systems: Solution by Iteration
Here
IIcll
is some matrix norm, such as
(9)
(Frobenius norm)
or the greatest of the sums of the
!cjkl in a colulIlll of C
(10)
IICIi =
n
max k
or the greatest of the sums of the
2: !cjkl
(Column "sum" norm)
j=l
!cjkl in a row of C n
(Row "sum" norm).
(11)
These are the most frequently used matrix norms in numerics. In most cases the choice of one of these norms is a matter of computational convenience. However, the following example shows that sometimes one of these norms is preferable to the others. E X AMP L E 1
Test of Convergence of the Gauss-Seidel Iteration Test whether the
Gauss~"eidel
iteration converges for the system
2x+ y+ ;::=4 x + 2y
+ :: =
\" = 2 - ~x -~;::
written
4
:: = 2 - ~x
x+ y+2::=4
Solution.
-
h.
The decomposition (multiply the matrix by 112 - why?) is
In]
[l~
112
112
1/2
1
-U + L)-'U
~
1/2
=I+L+U=I
[0 ° + 112
112
1/2
112
~][:
1/2
In] 112
0
0
:] + [:
0
0
It shows that
c
~
0
- [
)12 -1/4
-112
We compute the Frobenius norm of e
lIell = ( -4I + -41
I 16
+-
I
1 + -9 + -16+ -64 64
0 0
In]
-1/2
112
[:
0
114
-In] -114
118
318
r
Y'2 = (-5064 2= 0884 < I .
amI conclude from (8) that this Gauss-Seidel iteration converges. It is interesting that the other two norms would permit no conclusion. as you ~hould verify. Of course. this points to the fact that (8) tS sufficient for convergence rather than necessary. •
Residual. Given a system Ax defined by (12)
b, the residual r of x with respect to this system is
r
= b - Ax.
CHAP. 20
850
Numeric Linear Algebra
Clearly, r = 0 if and only if x is a solution. Hence r *- 0 for an approximate solution. In the Gauss-Seidel iteration, at each stage we modify or relax a component of an approximate solution in order to reduce a component of r to zero. Hence the Gauss-Seidel iteration belongs to a class of methods often called relaxation methods. More about the residual follows in the next section.
Jacobi Iteration The Gauss-Seidel iteration is a method of successive corrections because for each component we successively replace an approximation of a component by a corresponding new approximation as soon as the latter has been computed. An iteration method is called a method of simultaneous corrections if no component of an approximation x(m) is used until all the components of x(m) have been computed. A method of this type is the Jacobi iteration, which is similar to the Gauss-Seidel iteration but involves not using improved values until a step has been completed and then replacing x cm) by x(m+D at once, directly before the beginning of the next step. Hence if we write Ax = b (with ajj = I as before!) in the form x = b + (I - A)x, the Jacobi iteration in matrix notation is x(7n-t-l) = b + (I - A)x(m)
(13)
(ajj
=
I).
This method converges for every choice of x(O) if and only if the spectral radius of 1 - A is less than 1. It has recently gained greater practical interest since on parallel processors alln equations can be solved simultaneously at each iteration step. For Jacobi, see Sec. 10.3. For exercises. see the problem set.
- ....-----..... . ,
_~
_........
..-.
..---..
• . . . . . . . A _ _ _ . . . ~ ... • . . .
1. Verify the claim at the end of Example 2. 2. Show that for the system in Example 2 the Jacobi iteration diverges. Him. Use eigenvalues.
13-81
-Xl
+
4X2 X2
Xl
+
X2
+
6X3
=
7.
-61.3 8. 185.8
4.
X2
5Xl
+
.1.'2
Xl
+
6X2
+
7X3
=
25.5 0
+
Xl
+
4X2 -
2x 1
+
3x2
+
X3
= -10.5
2X3
=
-2
8X3
=
39
21
X2
GAUSS-SEIDEL ITERATION
Do 5 steps, starring [rom Xo = [I l]T and using 6S in the computation. Hint. Make sure that you solve each equation for the variable that has the largest coefficient (why?). Show the details.
3.
6. 4Xl -
X3
= -45
+ 4x3 =
33
lOx]
+
X2
+
.\"3
6
Xl
+
lOx2
+
X3
6
x] +
x2
+
IOx3
6
4xl
+
5x3
=
12.5
Xl
+
6.\"2
+
2X3
=
18.5
8xl
+
2X2
+
X3
=
-1l.5
9. Apply the Gauss-Seidel iteration (3 steps) to the system in Prob. 7, starting from (a) 0, 0, 0, (b) 10. 10, 10. Compare and comment. 10. In Prob. 7, compute C (a) if you solve the rust equation for Xl. the second for .\"2' the third for X 3 , proving convergence; (b) if you nonsensically solve the third equation for Xlo the first for X 2 , the second for X3, proving divergence.
SEC 20.4
851
Linear Systems: Ill-Conditioning, Norms
improvement of convergence. (Spectacular gains are made with larger systems.)
11. CAS PROJECT. Gauss-Seidel Iteration. ta) Write a program for Gauss-Seidel iteration. (b) Apply the program to A(t)x = b, starting from [0 0 O]T. where
A(t)
=
r:
t
J
b
=
rJ
For t = 0.2.0.5.0.8.0.9 determine the number of steps to obtain the exact solution to 6S and the corresponding spectral radius of C. Graph the number of steps and the spectral radius as functions of t and comment. (c) Successive overrelaxation (SOR). Show that by adding and subtracting x Cm) on the right, formula (6) can be written XCm+l) =
X Cm )
+
b - LxCm + 1 )
-
(U
+
l)x Cm )
Anticipation of further corrections motivates the introduction of an overrelaxation factor w > I to get the SOR formula for Gauss-Seidel X Cm + ll =
X Cm )
+ web -
112-151
Do 5 steps. starting from Xo = [I 1 IJT. Compare with the Gauss-Seidel iteration. Which of the two seems to converge faster? (Show the details of your work.) 12. The system in Prob. 6 13. The system in Prob. 5 14. The system in Prob. 8 15. Show convergence in Prob. 14 by verifying that I - A, where A is the matrix in Prob. 14 with the rows divided by the corresponding main diagonal entries. has the eigenvalues -0.519589 and 0.259795 ::'::: 0.246603i.
I
116-20 NORMS Compute the norms (9). (10), (II) for the following (square) matrices. Comment on the reasons for greater or smaller differences among the three numbers. 16. The matrix in Prob. 3 17. The matrix in Prob. 7 18. The matrix in Prob. 8
19.
Lx Cm + ll
(14)
20.4
r~
-k
(ajj = 1)
intended to give more rapid convergence. A recommended value is w = 2/(1 + ,h - p). where p is the spectral radius of C in (7). Apply SOR to the matrix in (b) for t = 0.5 and 0.8 and notice the
JACOBI ITERATION
20.
r-:
17
-k -2k
-:1
-k
2k
-3
-12
-:1
Linear Systems: Ill-Conditioning, Norms One does not need much experience to observe that some systems Ax = b are good, giving accurate solutions even under roundoff or coefficient inaccuracies. whereas others are bad. so that these inaccuracies affect the solution strongly. We want to see what is going on and whether or not we can "trust" a linear system. Let us first formulate the two relevant concepts (ill- and well-conditioned) for general numeric work and then tum to linear systems and matrices. A computational problem is called ill-conditioned (or ill-posed) if "small" changes in the data (the input) cause "large" changes in the solution (the output). On the other hand, a problem is called well-conditioned (or well-posed) if "small" changes in the data cause only "small" changes in the solution. These concepts are qualitative. We would certainly regard a magnification of inaccuracies by a factor 100 as "large," but could debate where to draw the line between "large" and "small." depending on the kind of problem and on uur viewpoint. Double precision may sometimes help, but if data are measured inaccurately, one should attempt changing the mathematical setting of the problem to a well-conditioned one.
852
CHAP. 20
Numeric Linear Algebra
Let us now tum to linear systems. Figure 442 explains that ill-conditioning occurs if and only if the two equations give two nearly parallel lines. so that their intersection point (the solution of the system) moves substantially if we raise or lower a line just a little. For larger systems the situation is similar in principle, although geometry no longer helps. We shall see that we may regard ill-conditioning as an approach to singularity of the matrix. y
y
x
x
(a)
(b)
Fir 442. (a) Well-conditioned and (b) ill-conditioned linear system of two equations in two unknowns
c X AMP L ElAn III-Conditioned System You may verify that the system 0.9999x -
1.000ly = 1
xhas the solution x = 0.5. Y = -0.5,
wherea~
y=l
the syMem
0.9999x - 1.000ly
x-
=
1
)"=I+E
has the solution x = 0.5 + 5000.5 E. Y = -0.5 + 4999.5 E. Thi~ shows that the system is ill-conditioned because a change on the right of magnitude E produces a change in the solution of magnitude 5000E. approximately. We see that the lines given by the equations have nearly the same slope. •
Well-conditioning can be asserted if the main diagonal entries of A have large absolute values compared to those of the other entries. Similarly if A-I and A have maximum entries of about the same absolute value. Ill-conditioning is indicated if A-I has entries of large absolute value compared to those of the solution (about 5000 in Example I) and if poor approximate solutions may still produce small residuals. Residual.
The residual r of an approximate solution
(2)
b is defined as
r = b - Ax.
(1)
Now b
xof Ax =
=
Ax, so that r
=
A(x -
x).
Hence r is small if x has high accuracy, but the converse may be false:
SEC. 20.4
85}
Linear Systems: Ill-Conditioning, Norms
E X AMP L E 1
Inaccurate Approximate Solution with a Small Residual The system
1.0001.\"1 + Xl
+ I.OOOlx2
=
2.0001
has the exact solution xl = 1, -'"2 = I. Can you see this by inspection? The very inaccurate approximation Xl = 2.0000, X2 = 0.0001 has the very small residual (to 4D)
2.0001 J [ 1.0001 I.OOOOJ [2.0000J [2.000IJ _ [2.0003J = [-O.0002J . [2.0001 - 1.0000 1.0001 0.0001 2.0001 2.0001 0.0000 =
r =
From this. a naive per,on might draw the false conc\u,ion that the appfoximation should be accumte to 3 Of 4 decimals. Our result is probably unexpected. but we shall see that it has to do with the fact that the system is ill-conditioned. •
Our goal is to show that ill-conditioning of a linear system and of its coefficient matrix A can be measured by a number. the "condition number" K(A). Other measures for ill-conditioning have also been proposed, but K(A) is probably the most widely used one. K(A) is defined in terms of norm. a concept of great general interest throughout numerics (and in modem mathematics in general!). We shall reach our goal in three steps. discussing 1. Vector norms
2. Matrix norms 3. Condition number
K
of a square matrix.
Vector Norms A vector norm for column vectors x = [Xj] with 11 components (11 fixed) is a generalized length or distance. It is denoted by II xII and is defined by four properties of the usual length of vectors in three-dimensional space. namely.
(3)
(a)
IIxll is a nonnegative real number.
(b)
IIxll =0
(c)
IIkxll = Iklllxll
Cd)
Ilx
+
if and only if
yll ~ Ilxll
x
= O.
for all k.
+
Ilyll
(Triangle inequality).
If we use several norms, we label them by a subscript. Most important in connection with computations is the p-Ilorm defined by
(4) where p is a fixed number and p ~ 1. In practice, one usually takes p third norm, Ilxlix (the latter as defined below), that is,
+ ... +
(5)
IIxlll = IX11
(6)
IIxl12
= YX12 + ... + xn2
(7)
IIxlix
= ~x J
Ixjl
= 1 or 2 and. as a
IXnl ("Euclidean" or "12-norm") ("lx-norm").
854
CHAP. 20
Numeric Linear Algebra
For n = 3 the 12-norm is the usual length of a vector in three-dimensional space. The lrnorm and lx-norm are generally more convenient in computation. But all three norms are in common use. E X AMP L E 3
Vector Norms IfxT
=
[2
-3
0
L -4], then
IIxll! =
IIxll2 = V30, Ilxll oo =
10,
•
4.
In three-dimensional space, two points with position vectors x and xhave distance Ix - xl from each other. For a linear system Ax = b, this suggests that we take IIx - xii as a measure of inaccuraty and call it the distance between an exact and an approximate solution. or the error of x.
Matrix Norm If A is an n X n matrix and x any vector with n components, then Ax is a vector with 11 components. We now take a vector norm and consider Ilxll and IIAxll. One can prove (see Ref. [E17]. p. 77, 92-93, listed in App. 1) that there is a number c (depending on A) such that
IIAxll
(8)
~
cllxll
for all x.
Let x*- O. Then IIxll > 0 by (3b) and division gives IIAx11/1lx11 ~ c. We obtain the smallest possible c valid for all x (=1= 0) by taking the maximum on the left. This smallest c is called the matrix norm of A corresponding to the vector norm we picked and is denoted by IIAII. Thus
(9)
IIAII
the maximum being taken over all x (10)
=1=
=max
IIAxl1
(x =1= 0),
W
O. Alternatively [see (c) in Team Project 24],
IIAII
= max IIxll=
1
IIAxll·
The maximum in (10) and thus also in (9) exists. And the name "matrix Ilorm" is justified because IIAII satisfies (3) with x and y replaced by A and B. (Proofs in Ref. [EI7] pp. 77,92-93.) Note carefully that IIAII depends on the vector norm that we selected. In particular, one can show that for the lrnorm (5) one gets the column "sum" norm (10), Sec. 20.3, for the lx-norm (7) one gets the row "sum" norm (11), Sec. 20.3. By taking our best possible (our smallest) (11)
IIAxl1
~
c = IIAII
we have from (8)
IIAllllxll·
This is the formula we shall need. Formula (9) also implies for two Ref. [E17], p. 98) (12)
thus
11
X n
matrices (see
SEC. 20.4
855
Linear Systems: III-Conditioning, Norms
See Refs. [E9] and [E 17] for other useful formulas on norms. Before we go on. let us do a simple illustrative computation. E X AMP L E 4
Matrix Norms Compute the matrix norms of the coefficient matrix A in Example I and of its inver,e A-I. assuming that we use (a) the lrvector norm, (b) the loo-vector norm.
Solutioll.
We use (4*). Sec. 7.!l, for the inverse and then (10) and (II) in Sec. 20.3. Thus 0.9999
-1.0001 ]
A-I =
A= [ 1.0000 - 1.0000
-5000.0 [ -5000.0
5000.5J. 4999.5
(a) The Irvector norm gives the column "sum" norm (10). Sec. 20.3: from Column 2 we thus obtain = 1-1.00011 + 1-1.00001 = 2.0001. Similarly. IIA -111 = 10000.
IIAII
(b) The lx-vector norm gives the row "sum" norm (11), Sec. 20.3: thus IIAII = 2, IIA -111 = 10000.5 from Row 1. We notice that IIA -111 is surprisingly large. which makes the product IIAII IIA -111 large (20001). We shall see below that this is typical of an ill-conditioned system. •
Condition Number of a Matrix We are now ready to introduce the key concept in our discussion of ill-conditioning, the condition number K(A) of a (nonsingular) square matrix A, defined by (13)
K(A)
= IIAII IIA -III·
The role of the condition number is seen from the following theorem.
THEOREM 1
Condition Number
A linear system of equations Ax = b and its matrix A whose condition number (13) is small are well-cmzditioned. A large condition number indicates ill-conditioning.
PROOF
b = Ax and (11) give IIbll ::; IIAII IIxll. Let b =1= 0 and x =1= O. Then division by Ilbll IIxll gives 1 IIxll
"All lib II
-<--
(14)
=
Multiplying (2) r = Atx - x) by A-I from the left and interchanging sides, we have x - x = A -Ir . Now (11) with A -1 and r instead of A and x yields IIx - xII
=
IIA -Irll ~ IIA -III IIrll.
Division by IIxll [note that IIxll =1= 0 by (3b)] and use of (14) finally gives
(15)
M
IIx - xII < _1_ -1 < IIAII -1 _ IIxll = IIxll IIA IIlIrll = IIbll IIA II IIrll - K(A) IIbil .
Hence if K(A) is small, a small IIrll/llbll implies a small relative error IIx - xll/llxll, so that the system is well-conditioned. However, this does not hold if K(A) is large; then a small IIrli/llbil does not necessarily imply a small relative error IIx - xll/llxll •
856
E X AMP L E 5
CHAP. 20
Numeric Linear Algebra
Condition Numbers. Gauss-Seidel Iteration
5 [ A =:
I]
has the
: :
Since A is symmetric. (10) anll
inver~e
(II) in Sec. 20.3 give the same condition number
We see that a linear system Ax = b "ith this A is well-conditioned. 9]T (confirm this). For instance. if b = [14 0 28]T. the Gauss algorithm gives the solution x = [2 -5 Since the main diagonal entries of A are relatively large. we can expect reasonably good convergence of the Gauss-Seidel iteration. Inlleed, qaning from. say, Xo = [I I qT. we obtain the first 8 steps (3D values)
EXAMPLE 6
Xl
X2
X3
1.000 2.400 1.630 1.870 1.967 1.993 1.998 2.000 2.000
1.000 -1.100 -3.882 -4.734 -4.942 -4.988 -4.997 -5.000 -5.000
1.000 6.950 8.534 8.900 8.979 8.996 8.999 9.000 9.000
•
III-Conditioned Linear System Example 4 gives by (10) or (II). Sec. 20.3, for the matrix in Exmnple I the very large condition number K(A) = 2.0001 . 10 000 ~ 2· 10 000.5 ~ 200001. This confirms that the sy,tem is very ill-conditioned. Similarl) in Example 2. where by (4""). Sec. 7.8 and 6D-computation. A-I = _1_ [
0.0001
-1.oo00J = [
1.0001 -1.0000
l.UOOI
5000.5 -5000.0
-5.000.0J 5000.5
so that (10), Sec. 20.3, gives a very large K(A), explaining the surprising result in Example 2. K(A) =
(1.0001 + 1.0(00)(5000.5 + 5000.0) = 20002.
•
In practice, A-I will not be known, so that in computing the condition number K(A), one must estimate IIA -111. A method for this (proposed in 1979) is explained in Ref. [E9] listed in App. I.
Inaccurate Matrix Entries.
KlA) can be used for estimating the effect ox of an inaccuracy oA of A (errors of measurements of the Qjb for instance). Instead of Ax = b we then have (A
+ oA)(x + ox)
Multiplying out and subtracting Ax
Aox
=
=
b.
b on both sides, we obtain
+ oA(x + ox) = O.
Multiplication by A-I from the left and taking the second term to the right gives
SEC. 20.4
857
Linear Systems: III-Conditioning, Norms
+
Applying (II) with A-I and vector oA(x
Applying (11) on the right, with
oA and x
ox) instead of A and x, we get
- ox instead of A and x, we obtain
lIoxll ~ IIA -111 IISAII IIx
+
Sxll .
Now IIA -111 = K(A)/IIAil by the definition of K(A), so that division by IIx + oxll shows that the relative inaccuracy of x is related to that of A via the condition number by the inequality II ox II W
(16)
= IIx
II ox II + oxll
<:
IIA-IIIII
=
~AII
b
_
-
A K(
IIoAil
) IIAII .
Conclusion, If the system is well-conditioned, small inaccuracies IISAII/IIAil can have only a small effect on the :>olution. However. in the case of ill-conditioning. if II oA 11/11 A II is small. IISxll/llxll11lay be large. Inaccurate Right Side. You may show that. similarly, when A is accurate, an inaccuracy Sb of b causes an inaccuracy ox satisfying
(17)
Hence lIoxll/llxll must remain relatively small whenever K(A) is small. E X AMP L E 7
Inaccuracies. Bounds (16) and (17) If each of the nine entries of A in Example 5 i_ measured with an inaccuracy of 0.1. then 115A:: (16) gives 115xll
3 '0.1
lJ;f ~ 7.5' - 7-
= 0.321
115xll ~ 0.321 IIxll
thus
= 0.321
. 16
=
9' 0.1 and
= 5.14.
By experimentation you will find that the actual inaccuracy Iloxll is only about 30% of the bound 5.14. This is typicaL Similarly. if 5b = [0.1 0.1 O.I]T. then 115bll = 0.3 and Ilbll = 42 in Example 5. so that (17) gives 115xll TxlI ~ 7.5'
0.3 42 = 0.0536.
hence
115xll ~ 0.0536' 16 = 0.857
but this bound is again much greater than the actual inaccuracy. which is about 0.15.
Further Comments on Condition Numbers. may be helpful.
•
The following additional explainarions
1. There is no sharp dividing line hetween "well-conditioned" and "ill-conditioned," but generally the situation will get worse as we go from systems with small K(A) to systems with larger K(A). Now always K(A) ~ I, so that values of 10 or 20 or so give no reason for concern, whereas K(A) = 100, say, calls for caution. and systems such as those in Examples I and 2 are extremely ill-conditioned.
CHAP. 20
858
Numeric Linear Algebra
2. If K(A) is large (or small) in one norm, it will be large (or small, respectively) in any other norm. See Example 5.
3. The literatme on ill-conditioning is extensive. For an introduction to it, see [E9]. This is the end of our discussion of numerics for solving linear systems. In the next section we consider curve fitting, an important area in which solutions are obtained from linear systems.
........ --
-.... .. ......... ==:=:=::: =........_-,._==.-_----,-
-
~~
VECTOR NORMS Compute (5), (6), (7). Compute a corresponding unit vector (vector of norm 1) with respect to the too-norm. 1. [1 -6 5]
2. [0.4 3. [-4
4. 5. 6. 7. 8.
- 1.2 4 3
0 8.01 -3]
18. Verify the calculations in Examples 5 and 6 of the text.
119-2°1
ILL-CONDITIONED SYSTEMS Solve Ax = bb Ax = b 2 • compare the soLutions. and comment. Compute the condition number of A.
[0 0 0 0] -0.1 [0.3 0.5 LO] [L6 21 54 -119] [1 1 1 1 1 1] [3 0 0 -3 0]
9. Show that
16. Verify (II) for x = [4 -5 2]T taken with the loe-norm and the matrix in Prob. L5. 17. Verify (12) for the matrices in Probs. to and 1 L
19. A =
Ilxll oo ~ IIxl12 ~ Ilxlli.
[
LAJ . b i -- [1.4J
2 1.4
1
1
•
b2 -- [1.44J 1
20. A= [ 5 -7J. b= [-2J .b = [-2J -7
110-151
10
i
3
2
3.1
MATRIX NORMS. CONDITION NUMBERS Compute the matrix norm and the condition number corresponding to the II-vector norm.
21. (Residual) For Ax = b] in Prob. 19 guess what the residuaL of = [113 -160]T might be (the solution
II.
22. Show that K(A) ~ 1 for the matrix norms (10), (11), Sec. 20.3, and K(A) ~ V;; for the Frobenius nOim (9), Sec. 20.3.
13.
[ 57
ro~,
21
10.5
7
5.25
10.5
7
5.25
4.2
7
5.25
4.2
3.5
5.25
4.2
3.5
3
14.
15.
[0~1
01
0.1
0]
o 100
o
'~l
x
being x = [0
I]T). Then calculate and comment.
23. CAS EXPERIMENT. Hilbert Matrices. The 3 X 3 Hilbert matrix is
The n X n HiLbert matrix is Hn = fhjk], where hjk = l/(j + k - 1). (Similar matrices occur in curve fitting by Least squares.) Compute the condition number K(Hn) for the matrix norm corresponding to the loc- (or ld vector norm, for fl = 2, 3, ... ,6 (or further if you wish). Try to find a formula that gives reasonable approximate values of these rapidly growing numbers. Solve a few linear systems involving an Hn of your choice.
SEC. 20.5
859
Least Squares Method
24. TEAM PROJECT. Norms. (a) Vector norms in our text are equivalent, that is, they are related by double inequalities; for instance, (a)
I\xl\x ~ I\XI\I ~ IIl\xIL,,:
(b)
-lIxllI ~ IIxlix ~ IIxll l
(18) 1
·
11
Hence if for some x, one norm is large (or small), the other norm must also be large (or small). Thus in many investigations the particular choice of a noml is not essential. Prove (18). (b) The Cauchy-Schwarz inequality is
(19b) (c) Formula (10) is often more practical than (9). Derive (10) from (9). (d) Matrix normS. lllustrate (11) with examples. Give examples of (12) with equality as well as with strict inequality. Prove that the matrix norms (10), (II) in Sec. 20.3 satisfy the axioms of a nonn IIL\II ~ IIAI\
= 0 if and only if A = o. IIkAil
IIA It is very important. (Proof in Ref. [GR7] listed in App. 1.) Use it to prove (l9a)
20.5
o.
+ BII
=
IkIIiAII,
~ IIAII
+ IIBII·
25. WRITING PROJECT. Norms and Their Use in This Section. Make a list of the most important of the many ideas covered in this section and write a two-page report on them.
Least Squares Method Having discussed numerics for linear systems, we now turn to an important application. curve fitting. in which the solutions are obtained from linear systems. In curve fitting we are given 11 points (pairs of numbers) (Xb "1)' ... , (Xn , Yn) and we want to determine a function f(x) such that
approximately. The type of function (for example. polynomials. exponential functions, sine and cosine functions) may be suggested by the nature of the problem (the underlying physical law, for instance), and in many cases a polynomial of a certain degree will be appropriate. Let us begin with a motivation. If we require strict equality [(Xl) = )'1, . . . , f(xn) = Yn and use polynomials of sufficiently high degree, we may apply one of the methods discussed in Sec. 19.3 in connection with interpolation. However, in certain situations this would not be the appropriate solution of the acmal problem. For instance. to the four points (1)
(-1.3.0.103),
(-0.1. 1.099),
(0.2, 0.808),
there corresponds the interpolation polynomial [(x)
:~
Fig. 443.
= X3
-
X
(l.3, 1.897)
+ I (Fig. 443), but if we
/
t=ilx ~
ApprOXimate fitting of a straight line
860
CHAP. 20
Numeric Linear Algebra
graph the points, we see that they lie nearly on a straight line. Hence if these values are obtained in an experiment and thus involve an experimental error, and if the nature of the experiment suggests a linear relation, we better fit a straight line through the points (Fig. 443). Such a line may be useful for predicting values to be expected for other values of x. A widely used principle for fitting straight lines is the method of least squares by Gauss and Legendre. In the present situation it may be formulated as follows.
Method of Least Squares.
The straight line
(2)
y
=a+
bx
should be fitted through the givell points (Xl, Yl)' ...• (Xn . Yn) so that the sum of the squares of the distances of those points from the straight line is minimum. where the distance is measured in the vertical direction (the .v-direction).
The point on the line with abscissa Xj has the ordinate a + bXj. Hence its distance from (Xj, .\) is h1 - a - bxjl (Fig. 444) and that sum of squares is n
q =
L
()J - (/ - bXjl.
j~l
q depends on a and b. A necessary condition for q
to
be minimum is
(3)
oq
ob
= - 2L
Xj
(\J - a - b;f)
=
0
(where we sum over j from 1 to n). Dividing by 2, writing each sum as three sums, and taking one of them to the right, we obtain the result
(4)
an
+b
a~ L.J x·J
+
~ L.J X·J
b~ L.J
= ~ y.
X·2 =
J
L.J
.J
~ L.J
X·V·
J-J'
These equations are called the normal equations of our problem.
Vertical distance of a point (xi' Yj) from a straight line y = a + bx
Fig. 444.
SEC. 20.5
861
Least Squares Method
E X AMP L E 1
Straight Line Using the method of lea~t squares. fit a straight line to the four points given in formula (I).
Solutioll. 11
We obtain
= 4,
2::>'J =
2: x/ =
0.1,
2: Yj =
3.43,
2: XjYj =
3.907,
2.3839.
Hence the normal equations are 4ll + O.JOb
=
3.9070
O.lll + 3.43b
=
2.3839.
The solution
•
+ 0.6670T.
Curve Fitting by Polynomials of Degree m Our method of curve fitting can be generalized from a polynomial y polynomial of degree III
(5) where
p(x) III ~ 11 -
a
+
bx to a
= b o + bix + ... + b",x'Tn
l. Then q takes the form
"
q =
2: (Yj -
p(Xj»2
j~I
and depends on conditions
III
+
I parameters bo, ... , bm . Instead of (3) we then have
aq
(6)
abo
aq abm
= 0,
=
0
which give a system of 111 + I normal equations. In the case of a quadratic polynomial
(7) the normal equations are (summation from I to n)
+ b 2: Xj + b2 2: x/ = 2: )j ol1 boL + b 2: xl + b 2: x/ = 2: bo 2: xl + b] 2: x/ + b 2: x/ = 2: xl,vj· b
(8)
i
Xj
i
2
2
The derivation of (8) is left to the reader.
XjYj
III
+
I
862
CHAP. 20
E X AMP L E 2
Numenc Linear Algebra
Quadratic Parabola by Least Squares Fit a parabola through the data (0. 5), (2. 4), (4, 1), (6, 6). (8, 7).
Solul;on. For the nonnal equations we need 11 = 5, L.\) = 20, L)"j = 23, LXj)] = 104. LX/)'j = 696. Hence these equations are 5bo + 20b! + 120b2 20bo
=
LX/
= 120, LX/ = 800.
2.x/ = 5664.
23
+ 120b! + 800b2 = 104
120bo + 800b! + 5664b 2
= 696.
Solving them we obtain the quadratic least squares parabola (Fig. 445) y = 5.11429 - 1.4l429x
•
+ O.21429x2.
y
8
•
2
• o Fig. 445.
2
4
6
8
x
Least squares parabola in Example 2
For a general polynomial (5) the normal equations form a linear system of equations in the unknowns b o, ... , b"n" When its matrix M is nonsingular, we can solve the system by Cholesky's method (Sec. 20.2) because then M is positive definite (and symmetric). When the equations are nearly linearly dependent, the normal equations may become illconditioned and should be replaced by other methods; see [E5], Sec. 5.7, listed in App. 1. The least squares method also plays a role in statistics (see Sec. 25.9).
11-61
FITTING A STRAIGHT LINE
Fit a straight line to the given points (x, y) by least squares. Show the details. Check your result by sketching the points and the line. Judge the goodness of fit. 1. (2,0), (3, 4), (4. lO), (5. 16)
2. How does the line in Prob. I change if you add a point far above it, say. (3. 20)? 3. (2.5, 8.0), (5.0, 6.9), (7.5. 6.2), (10.0, 5.0) 4. (Ohm's law U = Ri) Estimate the resistance R from the least squares line that fits (i, U) = (2.0, 104). (4.0,206), (6.0, 314), (10.0, 530). 5. (Average speed) Estimate the average speed vav of a car traveling according to s = v • t [km] (s = distance
traveled. t [h] = time) from (t, s) (11, 3lO). (\2. 4lO).
=
(9, 140), (10, 220),
6. (Hooke's law F = ks) Estimate the spring modulus k from the force F [lb] and extension s [cm], where (F, s) = (1, 0.50), (2, 1.02), (4, 1.99), (6, 3.01), (10.4.98), (20. 10.03). 7. Derive the normal equations (8).
18-10
1
FITTING A QUADRATIC PARABOLA
Fit a parabola (7) to the given points Cx, y) by least squares. Check by Sketching. 8. (-1,3), (0, 0), (I. 2). (2. 8) 9. (0,4), (2, 2), (4, -1), (6, -5)
SEC 20.6
863
Matrix Eigenvalue Problems: Introduction
10. Worker's time on duty x [h] 2 Worker's reaction lime [sec] 1.50 1.28
lAO 1.85 2.20
16. TEAM PROJECT. The least squares approximation of a function f(x) on an interval a ;;a x ;;a h by a function
11. Fit (2) and (7) by lea<;t squares to (-1.0,5.4), (-0.5,4.1), (0,3.9), (0.5,4.8), (1.0, 6.3), (l.5, 9.3). Graph the data and the curves on common axes and comment.
where Yo(x), ... ,y",(x) are given functions, requires the determination of the coefficients ao, .•• , am such that
3
4
5
12. (Cubic parabola) Derive the formula for the normal equations of a cubic least squares parabola. 13. Fit curves (2) and (7) and a cubic parabola by least squares to (-2, -35), (-1. -9), (0, -I), (I, -I),
(2, 17). (3. 63). Graph the three curves and the points on common axes. Comment on the goodness of fit.
14. CAS PROJECT. Least Squares. Write programs for calculating and solving the normal equations (4) and (8). Apply the programs to Probs. 3, 5, 9, 11. If your CAS has a command for fitting (Maple and Mathematica do), compare your results with those by your CAS commands. 15. CAS EXPERIMENT. Least Squares versus Interpolation. For the given data and for data of your choice find the interpolation polynomial and the least squares approximations (linear. quadratic, etc.). Compare and comment. (a) (-2,0), (-I, 0), (0, I), (I, 0), (2, 0) (b) (-4. 0), (-3. 0), (-2. 0). (-1. 0), (0, I). (I, 0). (2. 0), (3. 0). (4. 0)
I
(9)
b
[f(x) - Fm(x)]2 dx
a
becomes mmlmum. This integral is denoted by Ilf - Fm1l 2 , and IIf - Fmll is called the L 2 -norm of f - F m (L suggesting Lebesgue2 ). A necessary condition for that minimum is given by allf - F ml12/aaj = 0, j = 0, ... , m [the analog of (6)]. (a) Show that this leads to m + 1 normal equations (j = 0, ... , m) m
2:
hjkak = b j
where
k=O
(10)
hjk =
I I
b
)'j(X)Yk(X) dx,
a
bj =
b
!(X)'\".i(x) dx.
Q
(C) Choose five points on a straight line, e.g., (0, 0), (l, 1), ... , (4, 4). Move one point 1 unit upward and
(b) Polynomial. What form does (lO) take if Fm(x) = a o + lllX + ... + amx"'? What is the coefficient matrix of (10) in this case when the interval is 0 ;;a x;;a I?
find the quadratic least squares polynomial. Do this for each point. Graph the five polynomials on Cornmon axes. Which of the five motions has the greatest effect?
(c) Orthogonal functions. What are the solutions of (10) if .\'o(x), ... , Ym(x) are orthogonal on the interval a ;;a x ;;a b? (For the definition, see Sec. 5.7. See also Sec. 5.8.)
20.6
Matrix Eigenvalue Problems: Introduction In the remaining sections of this chapter we discuss some of the most important ideas and numeric methods for matrix eigenvalue problems. This very extensive part of numeric linear algebra is of great practical importance, with much research going on, and hundreds, if not thousands of papers published in various mathematical journals (see the references in [ES], [E9], [Ell], [E29]). We begin with the concepts and general results we shall need in explaining and applying numeric methods for eigenvalue problems. (For typical models of eigenvalue problems see Chap. 8.)
2HENRI LEBESGUE (1875-1941). great French mathematician. creator of a modern theory of measure and integration in his famous doctoral thesis of 1902.
864
CHAP. 20
Numeric Linear Algebra
An eigenvalue or characteristic value lor latent root) of a given Il X n matrix A = [ajk] is a real or complex number A such that the vector equation Ax
(1)
= Ax
has a nontrivial solution, that is, a solution x "* 0, which is then called an eigenvector or characteristic vector of A corresponding to that eigenvalue A. The set of all eigenvalues of A is called the spectrum of A. Equation (I) can be written lA - AI)x = 0
(2)
where I is the n X n unit matrix. This homogeneous system has a nontrivial solution if and only if the characteristic determinant det (A - AI) is 0 (see Theorem 2 in Sec. 7.5). This gives (see Sec. 8.1)
Eigenvalues
THEOREM 1
The eigenvalues of A are the solutions A of the characteristic equation all -
A
a21
(3)
a 1n
a12 a22 -
A
a2n
det (A - AI) =
= O.
ann -
an2
anI
A
Developing the characteristic determinant, we obtain the characteristic polynomial of A, which is of degree n in A. Hence A has at least one and at most n numerically different eigenvalues. If A is real, so are the coefficients of the characteristic polynomial. By familiar algebra it follows that then the roots (the eigenvalues of A) are real or complex conjugates in pairs. We shall usually denote the eigenvalues of A by
with the understanding that some (or all) of them may be equal. The sum of these n eigenvalues equals the sum of the entries on the main diagonal of A, called the trace of A; thus n
(4)
trace A =
L j~1
n
ajj
=
L
Ak .
k~1
Also, the product of the eigenvalues equals the determinant of A,
(5) Both formulas follow from the product representation of the characteristic polynomial, which we denote by f(A),
SEC. 20.6
865
Matrix Eigenvalue Problems: Introduction
If we take equal factors together and denote the numerically distinct eigenvalues of A by Ab ... , AI" (r ~ 11), then the product become~
(6) The exponent 111j is called the algebraic multiplicity of Aj • The maximum number of linearly independent eigenvectors corresponding to Aj is called the geometric multiplicity of Aj . It is equal to or smaller than 111j. A subspace S of R n or en (if A is complex) is called an invariant subspace of A if for every v in S the vector A,- is also in S. Eigenspaces of A (spaces of eigenvectors; Sec. 8.1) are important invaJiant subspaces of A. An 11 X 11 matrix B is called similar to A if there is a nonsingular n X 11 matrix T such that
(7) Similarity is important for the following reason. THEOREM 2
Similar Matrices
Similar matrices have the same eigellmlues. If x is an eigenvector of A, then y = T-Ix is all eigel1l'ector of B in (7) wrrespollding to the sallie eigellmille. (Proof in Sec. 8.4.)
Another theorem that has vaJious applications in numerics is as follows. THEOREM 1
Spectral Shift
IfA has the eigenvalues A1> •.• , An' then A - kI with arbitrary k has the eigenvalues Al - k . .. '. An - k.
This theorem is a special case of the following spectral mapping theorem. THEOREM 4
Polynomial Matrices
If A is all eigenvalue of A, then
is all eigenvalue of the polynomial matrix
q(A)x
+ O's_lAs - 1 + ... ) x = asAsx + as_lAs-Ix + .. . = asAsx + as_lAs-Ix + ... = q(A) x. =
(asAS
•
CHAP. 20
866
Numeric Linear Algebra
The eigenvalues of important special matrices can be characterized as follows.
Special Matrices
THEOREM 5
The eigenvalues of Hermitian matrices (i.e., -AT = A), hence of real symmetric matrices (i.e., AT = A), lire real. The eigenvalues of skew-Hermitian matrices (i.e., AT = -A), hence of real skew-symmetric matrices (i.e., AT = -A) are pure imaginary or O. The eigenvalues of unifllry matrices (i.e., AT = A-I). hence of orthogonalmlltrices (i.e .• AT = A-I), have absolute value l. (Proofs in Secs. 8.3 and 8.5.)
The choice of a numeric method for matrix eigenvalue problems depends essentially on two circumstances, on the kind of matrix (real symmetric, real general, complex, sparse, or full) and on the kind of information to be obtained, that is, whether one wants to know all eigenvalues or merely specific ones, for instance, the largest eigenvalue, whether eigenvalues lind eigenvectors are wanted. and so on. It is clear that we cannot enter into a systematic discussion of all these and further possibilities that arise in practice, but we shall concentrate on some basic aspects and methods that will give us a general understanding of this fascinating field.
20.7
Inclusion of Matrix Eigenvalues The whole numerics for matrix eigenvalues is motivated by the fact that except for a few trivial cases we cannot determine eigenvalues exactly by a finite process because these values are the roots of a polynomial of Ilth degree. Hence we must mainly use iteration. In this section we state a few general theorems that give approximations and error bounds for eigenvalues. Our matrices will continue to be real (except in formula (5) below). but since (nonsymmetric) matrices may have complex eigenvalues. complex numbers will playa (very modest) role in this section. The important theorem by Gerschgorin gives a region conSisting of closed circular disks in the complex plane and including all the eigenvalues of a given matrix. Indeed. for each j = 1. .... n the inequality (1) in the theorem detennines a closed circular disk in the complex A-plane with center {ljj and radius given by the right side of (1); and Theorem 1 states that each of the eigenvalues of A lies in one of these n disks.
Gerschgorin's Theorem
THEOREM 1
Let A be all eigenvalue qf all arbitrary integer j (l ~ j ~ 11) we have
PROOF
Il
X
Il
matrix A
[ajk]. Then for some
Let x be an eigenvector corresponding to an eigenvalue A of A. Then
(2)
Ax = Ax
or
(A - AI)x = O.
Let Xj be a component of x that is largest in absolute value. Then we have /x11 /Xj/ ~ I for = 1, ... , n. The vector equation (2) is equivalent to a system of 11 equations for the
I7l
SEC. 20.7
867
Inclusion of Matrix Eigenvalues
n components of the vectors on both sides. The jth of these n equations with j as just indicated is
Division by Xj (which cannot be zero; why?) and reshuffling terms gives
By taking absolute values on both sides of this equation, applying the triangle inequality la + bl ~ lal + Ibl (where a and b are any complex numbers), and observing that because of the choice of j (which is crucial !), IX1lxjl ~ 1, ... , IXn1xjl ~ I, we obtain (l), and the theorem is proved. • E X AMP L E 1
Gerschgorin's Theorem For the eigenvalues of the matrix 112
5
we get the Gerschgorin disks (Fig. 446)
°
1:
Center 0,
radiu~
I,
02:
Center 5.
radius 1.5.
03:
Center 1.
radius 1.5.
The centers are the main diagonal entries of A. These would be the eigenvalues of A if A were diagonal. We can take these values as crude approximation~ of the unknown eigenvalues (3D values) Al = -0.209, A2 = 5.305. A3 = 0.904 (verify this): then the radii of the disks are corresponding error bounds. Since A is symmetric, it follow, from Theorem 5, Sec. 20.6, that the spectrum of A must actually lie in the intervals [-1. 2.5] and [3.5, 6.5]. It is interesting that here the Gerschgorin disks form two disjoint sets, namely, 01 U 3 , which contains two • eigenvalues, and 02. which contains one eigenvalue. This is typical. as the following theorem shows.
°
y
x
Fig. 446.
THEOREM 2
Gerschgorin disks in Example 1
Extension of Gerschgorin's Theorem
If p Gerschgorin disks fom! a set S that is disjoint from the n - p other disks of a given matrix A. t!zen S contains precisely p eigenl'Q/ues of A (each cOllnted with its algebraic multiplicity. as defined in Sec. 20.6).
Idea of Proof. Set A apply Theorem I to At
= B + C, where B is the diagonal matrix with entries
= B + tC with real t growing from 0 [0 1.
ajj. and
•
868 E X AMP L E 2
CHAP. 20
Numeric Linear Algebra
Another Application of Gerschgorin's Theorem. Similarity Suppose that we have diagonalized a matrix by some numeric method that left us with some off-diagonal entries of size 10- 5 , say, 5
10- ] 10-5
2
.
4 What can we conclude about
deviation~
of the
eigenvalue~
from the main diagonal
entrie~?
By Theorem 2. one eigenvalue must lie in the disk of radius 2· 10- 5 centered at 4 and two 5 eigenvalues (or an eigenvalue of algebraic mulliplicity 2) in the disk of radius 2· centered at 2. Actually, since the matrix is symmetric. these eigenvalues must lie in the intersections of these disks and the real axis. by Theorem 5 in Sec. 20.6. We sho" how an isolated disk can always be reduced in size by a similarity transformation. The matrix
Solution.
10-
o
5
10- [I0 10-
l~
o
5]
o
4
2
0
o
]
;]
is similar to A. Hence by Theorem 2. Sec. 20.6. it has the same eigenvalues as A. From Row 3 we get the smaller disk of radius 2· 10- 10. Note that the other disks got bigger. approximately by a factor of 105 . And in choosing T we have to watch lhat the new disks do not overlap with the disk whose size we want to decrease. For funher interesting facts, see the new book [E28j. •
By definition, a diagonally dominant matrix A = [ajk] is an n X n matrix such that
lajjl
(3)
~
L lajkl
j
=
I,"', n
k*j
where we sum over all off-diagonal entries in Row j. The matrix is said to be strictly diagonally dominant if > in (3) for all j. Use Theorem I to prove the following basic property.
THEOREM 3
Strict Diagonal Dominance
Strictly diagonally dominant matrices are nOllSingular.
Further Inclusion Theorems An inclusion theorem is a theorem that specifies a set which contains at least one eigenvalue of a given matlix. Thus. Theorems I and 2 are inclusion theorems; they even include the whole spectrum. We now discuss some famous theorems that yield further inclusions of eigenvalues. We state the first two of them without proofs (which would exceed the level of this book).
SEC. 20.7
869
Inclusion of Matrix Eigenvalues
THEOREM 4
Schur's Theorem 3
Let A = [ajk] be an
11
X n matrix. Then for each of its eigenvalues A1> •.• , An' n
(4)
n
n
\AmJ 2 ~ L \Ai\2 ~ L L i=l
In (4) the second equality sign holds
\ajk\2
(Schur's inequality).
j=l k=l
if alld ollly if A is such that
(5)
Matrices that satisfy (5) are called normal matrices. It is not difficult to see that Hermitian, skew-Hermitian, and unitary matrices are normal, and so are real symmetric, skew-symmetric, and orthogonal matlices. E X AMP L E 1
Bounds for Eigenvalues Obtained from Schur's Inequality For the matrix
we obtain from Schur's inequality IAI ;;: v'I949 = 44.1475. You may verify that the eigenvalues are 30. 25, • and 20. Thus 302 + 25 2 + 202 = 1925 < 1949; in fact, A is not normal.
The preceding theorems are valid for every real or complex square matrix. Other theorems hold for special classes of matrices only. Famous is the following.
THEOREM 5
Perron's Theorem4
Let A be a real n X n matrix whose e1l1ries are all positive. Then A has a positive real eigenvalue A = p of multiplicity I. The corresponding eigenvector can be chosen with all components positive. (The other eigenvalues are less than p in absolute value.)
For a proof see Ref. [B3], vol. II, pp. 53-62. The theorem also holds for matrices with nonnegative real entries ("Perron-Frobenius Theorem,,4) provided A is irreducible, that is, it cannot be brought to the following form by interchanging rows and columns: here Band F are square and 0 is a zero matrix.
:ISSAI SCHUR (1875-1941), German mathematician, also known by his important work in group theory. OSKAR PERRON (1880-1975), GEORG FROBENIUS (1849-1917), LOTHAR COLLATZ (1910-1990), German mathematicians, known for their work in potential theory, ODEs (Sec. 5.4) and group theory, and numerics. respectively.
870
CHAP. 20
Numeric Linear Algebra
Perron's theorem has various applications, for instance, in economics. It is interesting that one can obtain from it a theorem that gives a numeric algorithm:
Collatz Inclusion Theorem4
THEOREM 6
Let A = [ajk] be a real n X 11 matrLl: whose elements are all positive. Let x be a1l\' real vector whose components Xl' • • . • Xn are positive, alld let )'1> •••• Yn be the components of the vector y = Ax. Then the closed imen'al 011 the real axis bounded by the smallest and the largest of the n quotients % = )'/Xj contains at least one eigenvalue of A.
PROOF
We have Ax
=
yor
(6)
y - Ax
= O.
The transpose AT satisfies the conditions of Theorem 5. Hence AT has a positive eigenvalue A and, corresponding to this eigenvalue. an eigenvector u whose components Uj are all positive. Thus ATu = Au, and by taking the transpose we obtain uTA = AUT. From this and (6) we have
or written out n
L
u/Yj -
AXj)
= O.
j~l
Since all the components
Uj
are positive, it follows that
Yj - A.r:j
~
O.
that is.
for at least one j.
Yj - ,\xj
~
0,
that is,
for at least one j.
(7)
and
Since A and AT have the same eigenvalues. A is an eigenvalue of A. and from (7) the statement of the theorem follows. • E X AMP L E 4
Bounds for Eigenvalues from Collatz's Theorem. Iteration For a given matrix A with positive entries we choose an x = "0 and iterate, that is, we compute Xl = Axo· X2 = AXI • . . . • X20 = AX19' In each step, taking x = ":i and y = AXj = Xj+l we compute an inclusion interval by Collatz's theorem. This gives (6S)
0.02 A =
[ 049 0.02
0.28
0.22] 0.20
0.22
0.20
0.40
•• ' . xl9
=
,xo ~
[I] 1
0.002 16309J 0.00108155 [ 0.00216309
,Xl =
I
,x20
[0.73] 0.50
,X2 ~
0.82
=
[0.00155743 J 0.000778713 0.00155743
[0.5481] 0.3186 , 0.5886
SEC. 20.7
871
Inclusion of Matrix Eigenvalues and the intervals 0.5 have length
~
j Length
0.32
A ~ 0.82, 0.3186/0.50
= 0.6372 ~
A ~ 0.548l/0.73
= 0.750822, etc. These intervals
2
3
10
15
20
0.113622
0.0539835
0.0004217
0.0000132
0.0000004
Using the characteristic polynomial. you may verify that the eigenvalues of A are 0.72. 0.36. 0.09. so that those intervals include the largest eigenvalue. 0.72. Their lengths decreased withj. so that the iteration was worthwhile. • The reason will appear in the next ~ection, where we discuss an iteration method for eigenvalues.
PROBLEM SET 2
11-61
Find and sketch disks or interval~ that contain the eigenvalues. If you have a CAS, find the spectrum and compare.
1.[-~ 2.
-: ]
[1~_2
10- 2
10- 2
10- 2
9. If a symmetric 11 X 11 matrix A = [ajkl has been diagonalized except for small off-diagonal entries of size 10- 6 , what can you say about the eigenvalues?
10. (Extended Gerschgorin theorem) Prove Theorem 2.
11. Prove Theorem 3.
2
10- ] 10-2
8
8. By what integer factor can you at most reduce the Gerschgorin circle with center 3 in Prob. 6?
12. (Normal matrices) Show that Hermitian, skewHermitian, and unitary matrices (hence real symmetric. skew-symmetric, and orthogonal matrices) are normal. Why is this of practical interest? peA»~ Show that p(A) cannot be greater than the row sum norm of A.
13. (Spectral radius 9
14. (Eigenvalues on the circle) Illustrate with a 2
X
2
matrix that an eigenvalue may very well lie on a Gerschgorin circle (so that Gerschgorin disks can generally not be replaced with smaller disks without losing the inclusion property). 1
4.
+
0.3
i
0.3i
-3
+ 2i
115--171
0.5i] 0.1
[
5. [
O.li
0.2
4i
0.1i
O.li
o o
-1
+
10
6. [
i
4-i
0.1
0.1
6
-0.2
o
SCHUR'S INEQUALITY
Use (4) to obtain an upper bound for the spectral radius: 15. In Prob. I 16. In Prob. 6 17. In Prob. 3
118-191
COLLATZ'S THEOREM
Apply Theorem 6, choosing the given vectors as vectors x.
10 I
18.
9
[
3 2
7. (Similarity) Find T-TAT such that in Prob. 2 the radius of the Gerschgorin circle with center 5 is reduced by a factor III 00.
'9. [:
4 2
J[J[J[J
CHAP. 20
872
Numeric Linear Algebra
(b) Apply the program to symmetric matrices of your choice. Explore how convergence depends on the choice of initial vectors. Can you construct cases in which the lengths of the inclusion intervals are not monotone decreasing? Can you explain the reason? Can you experiment on the effect of rounding?
20. CAS EXPERIMENT. Collatz Iteration. (a) Write a program for the iteration in Example 4 (with any A and xo) that at each step prints the midpoint (why?). the endpoints. and the length of the inclusion interval.
20.8
Power Method for Eigenvalues A simple standard procedure for computing approximate values of the eigenvalues of an n X n matrix A = [ajd is the power method. In this method we start from any vector xo (* 0) with n components and compute successively Xs
=
AX.<_l·
For simplifying notation, we denote Xs-l by x and Xs by y. so that y = Ax. The method applies to any n X n matrix A that has a dominant eigenvalue (a A such that is greater than the absolute values of the other eigenvalues). If A is symmetric. it also gives the error bound (2), in addition to the approximation (1).
IAI
THEOREM 1
Power Method, Error Bounds
Let A be an n X n real symmetric matrix. Let x (* 0) be any real vector with n components. Furthermore. let y
= Ax,
mo
=
1111
XTX,
=
xTy,
Then the quotient
(I)
(Rayleigh 5 quotient)
q=
is lin lIpproximation for an eigenvalue A of A (usually that which is greatest in absolute value, but no general statements are possible). FlIrthe17llore.
if we set q =
A-E.
so that
E
is the error of q. then
(2)
5 LORD RAYLEIGH (JOHN WILLIAM STRUTT) (1842-1919), great English physicist and mathematician. professor at Cambridge and London, known for his important contributions to various branches of applied mathematics and theoretical physics. in particular. the theory of waves. elasticity. and hydrodynamics. In 1904 he received a Nobel Prize in physics.
SEC. 20.8
873
Power Method for Eigenvalues
PROOF
fJ2
denotes the radicand in (2), Since
1111
= qmo by ( I). we have
Since A is real symmetric, it has an orthogonal set of n real unit eigenvectors ZI' . . . , zn corresponding to the eigenvalues AI, . . . , An' respectively (some of which may be equal). (Proof in Ref. [B3], vol. I, pp. 270-272, listed in App. I.) Then x has a representation of the fmm
Now
AZI
=
AIZI,
and. since the
Zj
etc., and we obtain
are orthogonal unit vectors.
(4) It follows that in (3),
Since the
Zj
are orthogonal unit vectors, we thus obtain from (3)
Now let Ae be an eigenvalue of A to which q is closest, where c suggests ·'closest"". Then (Ae - q)2 ~ (AJ - q)2 for j = 1, ... , n. From this and (5) we obtain the inequality
Dividing by
1110,
taking square roots. and recalling the meaning of
This shows that fJ is a bound for the error A and completes the proof.
E
fJ2
gives
of the approximation q of an eigenvalue of •
The main advantage of the method is its simplicity. And it can handle sparse matrices too large to store as a full square array. Its disadvantage is its possibly slow convergence. From the proof of Theorem ] we see that the speed of convergence depends on the ratio of the dominant eigenvalue to the next in absolute value (2:] in Example I, below). If we want a convergent sequence of eigenvectors, then at the beginning of each step we scale the vector, say, by dividing its components by an absolutely largest one, as in Example 1, as follows.
874 E X AMP L E 1
CHAP. 20
Numeric Linear Algebra
Application of Theorem 1. Scaling For the symmetric matrix A in Example 4, Sec. 20.7. and Xo indicated scaling
A
=
O.4Y
0.02
om
0.28
0.22
0.20
[I]
0.22]
[
~::~ , Xo
=
XlO
=
~
~.504682
[0.890244] ,
Xl
[0.931193]
~.609756 , X2
=
0,999707]
0,990663] X5 =
I] T we obtain from (I) and (2) and the
= [I
.
~.500146
0,999991] .
~.500005
Xl5 =
.
[
[
[
~.541284
=
Here AXV = [0.73 0.5 0.S2IT, scaled to Xl = [0.73/0.82 0.5/0.82 lIT. etc. The dominam eigenvalue is 0.72, an eigenvector [I 0.5 lIT. The corresponding q and 8 are computed each time before the next scaling. Thus in the first step. XOTAxO 2.05 - - - T - = -3- = 0.683333 Xo Xo
,, __ (1112 _ q2)1/2 __
v
1110
(AXo)TAxo 2)112 T - q = (1.4553 - - - q 2)112 = 0.134743. Xo Xo 3
This gives the foIlowing values of q, 8, and the error
j
€
=
0.72 - q (calculations with 10D, rounded to 6D):
2
5
10
q
0.683333
0.716048
0.719944
0.720000
8
0.134743 0.036667
0.038887 0.003952
0.004499 0.000056
0.000l41 5' 10-8
€
The error bounds are much larger than the actual errors. This is typicaL although the bounds cannot be improved: that is, for special symmetric matrices [hey agree with the errors. Our present results are somewhat better than those of CoIlatz'~ method in Example 4 of Sec. 20.7, at the expense of more operations. •
Spectral shift, the transition from A to A - kI, shifts every eigenvalue by -k. Although finding a good k can hardly be made automatic, it may be helped by some other method or small preliminary computational experiments. In Example I. Gerschgorin' s theorem gives -0.02 ~ A ~ 0.82 for the whole spectrum (verify!). Shifting by -0.4 might be too much (then -0.42 ~ A ~ 0.42), so let us try -0.2. E X AMP L E 2
Power Method with Spectral Shift For A - 0.21 with A as in Example I we obtain the following substantial improvements (where the index 1 refers to Example I and the index 2 to the present example).
j
81 82 €l €2
0.134743 0.134743 0.036667 0.036667
2
5
10
0.038887 0.034474 0.003952 0.002477
0.004499 0.000693 0.000056 1.3. 10-6
0.000141 1.8· 10-6 5 '10- 8 9' 10- 12
•
SEC. 20.9
875
Tridiagonalization and QR-Factorization
PRO B. L EMS E1::::..2 D-;-8_
11-71
°
POWER METHOD WITH SCALING
Apply the power method (3 steps) with scaling. using = [I I]T or [I I I]T or [I I I I]T. as applicable. Give Rayleigh quotients and error bounds. Show the details of your work.
Xo
3.5 1. [
3.
0.6
2.0J
2. [
U.5
2.0
11. (Spectral shift, smallest eigenvalue) In Prob. 5 set B = A - 31 (as perhaps suggested by the diagonal entries) and try whether you may get a sequence of q's converging to an eigenvalue of A that is smallesT (not largest) in absolute value. Use Xv = [I I nT. Do 8 step". Verify that A has the spectrum (0.3.51. 12. CAS EXPERIMENT. Power Method with Scaling. Shifting. (a) Write a program for 11 X 11 matrices that prints every step. Apply it to the (nonsymmetric!) matrix (20 steps), starting from [1 I If.
0.8J
-0.6
0.8
[~ ~J
.{; :J
5'[-~ -~ ~] I
2
9. Prove that if x is an eigenvector, then 8 = in (2). Give two examples. 10. (Rayleigh quotient) Why does q generally approximate the eigenvalue of greatest absolute value? When will q be a good approximation?
A
3
=
° 4
4
-I
° 2
5 8
3
7.
6.
0
2
3
2
o
8
2
-2
°
° ° ° 3
o
5
8. (Optimality of 8) In Prob. 2 choose Xo = [3 _l]T and show that q = 0 and 8 = I for all steps and that the eigenvalues are ::'::: 1. so that the interval [q - 8, q + 8] cannot be shortened in general! Experiment with other xo.
20.9
[
::
-19
~: I:] . - 36
-7
(b) Experiment in (a) with shifting. Which shift do you find optimal? (c) Write a program as in (a) but for symmetric matrices that prints vectors. scaled vectors, q, and 8. Apply it to the matrix in Prob. 6. (d) Find a (nonsymmetric) matrix for which 8 in (2) is no longer an error bound. (e) Experiment systematically with speed of convergence by choosing matrices with the second greatest eigenvalue (i) almost equal to the greatest, (ii) somewhat different, (iii) much different.
Tridiagonalization and QR-Factorization We consider the problem of computing all the eigenvalues of a real symmetric matrix A = [ajk]' discussing a method widely used in practice. In the first stage we reduce [he given matrix stepwise to a tridiagonal matrix, that is, a matrix having all its nonzero entries on the main diagonal and in the positions immediately adjacent to the main diagonal (such as A3 in Fig. 447, Third Step). This reduction was invented by A. S. Householder (1. Assn. Comput. Machinery 5 (1958), 335-342). See also Ref. [E29] in App. I. This Householder tridiagonalization will simplify the matrix without changing its eigenvalues. The latter will then be determined (approximately) by factoring the tridiagonalized matrix, as discussed later in this section.
876
CHAP. 20
Numeric Linear Algebra
Householder's Tridiagonalization Method An 11 X 11 real symmetric matrix A = [ajk] being given, we reduce it by Il - 2 successive similarity transformations (see Sec. 20.6) involving matrices PI' ... , P n-2 to tridiagonal form. These matrices are orthogonal and symmetric. Thus Pi l = PIT = PI and similarly for the others. These transformations produce from the given Ao = A = [lIjk] the matrices Al = raJ};]. A2 = [aj~)], .... A n - 2 = [a3k- 2 )] in the form Al = PIAoPl A2 (1)
=
P 2A I P 2
The transformations (1) create the necessary zeros, in the first step in Row I and Column 1, in the second step in Row 2 and Column 2, etc., as Fig. 447 illustrates for a 5 X 5 matrix. B is tridiagonal. How do we determine PI' P 2 , . . . Pn - 2 ? Now, all these P r are of the form
where I is the 0; thus
(3)
= I •... , 11
(r
(2) 11
X 11
-
2)
unit matrix and Vr = [Vjr] is a unit vector with its first r components
o
o
o
*
o
o
*
*
V n -2
=
* *
*
*
where the asterisks denote the other components (which will be nonzero in general).
Step 1. VI has the components Vn
= 0
(a)
(4)
j = 3. 4 .... ,
(b)
11
where (c)
Sl
=
\/a212 + a3I 2 + ... + a nI 2
where SI > 0, and sgn a21 = + 1 if a21 ~ 0 and sgn lI21 = -1 if a21 compute PI by (2) and then Al by (I). This wa<; the first step.
< O. With this we
SEC 20.9
Tridiagonalization and QR-Factorization
877
[.~~; .. *"'J ,.
o.!_
[:~:::** _.....
[::::* * :::*.:=;::J
..~.!'*:
*
*.
1
r:
:;: -
First Step
Second Step
Third Step
At =PtAP t
A" = P,At P 2
~ =P3 A"P j
Fig. 447. Householder's method for a 5 X 5 matrix. Positions left blank are zeros created by the method.
Step 2. We compute v 2 by (4) with all subscripts increased by 1 and the aj~), the entries of Al just computed. Thus [see also (3)]
ajk
replaced by
-I ( 1 +la~~1 - -) 2 S2
(4*)
j = 4,5, ... , n
where
+
a~~
2
With this we compute P2 by (2) and then A2 by (1).
Step 3. We compute V3 by (4*) with all subscripts increased by 1 and the ajil replaced by the entries aj'fl of A2 , and so on. E X AMP L E 1
Householder Tridiagonalization Tridiagonalize the real symmetric matrix 4
6
5 2
Solution. Step sgn
G21
=
2
1. We compute Sl2 = 4 + ]2 + ]2 = l!l from (4c). Since (4) by straightforward computation
Q21 =
+ 1 in (4bl and get tram
0 1 [00.119573 161. [ 0.985 598 56
V21
VI =
V31
V41
=
0.119 573 16
From this and (2).
o
o
-~.235
-0.942 809 04
-0.235 702 27
-0.235 702 27
0.97140452
-0.G28 595 48
-0.235 702 27
-0.028 595 48
0.97140452
70227J
4 > 0, we have
878
CHAP. 20
Numeric Linear Algebra
From the first line in (I) we now get
-Vt8
Step 2. From (4*) we compute
S22
o
7
-I
-(
-I
9/2
312
-I
3/2
912
01
= 2 and
[:~]
V2 =
=
[:.923879 ,,]. 0.38268343
V42
From this and (2).
P,
~ [~
0
-l~
0
0
0 0
-Itv'2
0
-1V2
]
Itv'2
The ",cond line in (I) now gives
[,
-Vt8
0
7
,/2
Y2
6
0
0
-v'18
B = A2 = P2 A I P2 =
0
0
~l
This matrix B is tridiagonal. Since our given matrix has order 11 = 4, we needed 11 - 2 = 2 steps to accomplish this reduction. as claimed. (Do yOll see that we got more zeros than we can expect in general?) B is similar to A, as we now show in general. This is essential because B thus has the same spectrum as A, by Theorem 2 in Sec. 20.b. •
We assert that Bin (1) is similar to A = Ao. The matrix PT is symmetric;
B Similar to A. indeed,
T
TT
T
TT
T
Pr = (I - 2v,.vr) = 1 - 2(v rvr ) = [ - 2v vr = Pro T
Also, PT is orthogonal because T
P."Pr
=
Pr
2
=
Vr
is a unit vector. so that T2
(I - 2vrvr )
T
V,T V T
=
1 and thus
T
T
= 1 - 4v rvr + 4v.rv r V."Vr T
T
T
= 1 - 4v,.vr + 4vAvr vr)Vr = I. Hence p;l
= pr T =
Pr and from (1) we now obtain
B
= P11-2A11-3Pn-2 = ...
. . . = Pn - 2 Pn - 3 -1
•••
-}
= Pn - 2 P n - 3
P I AP1
...
-1
•.•
PI API"
Pn - 3 Pn - 2 . Pn - 3 Pn - 2
1
= P- AP where P
=
P IP 2 ... P n-2' This proves our assertion.
•
SEC. 20.9
879
Tridiagonalization and QR-Factorization
QR-Factorization Method In 1958 H. Rutishauser of Switzerland proposed the idea of using the LU-factorization (Sec. 20.2; he called it LR-factorization) in solving eigenvalue problems. An improved version of Rutishauser's method (avoiding breakdown if certain submatrices become singular, etc.; see Ref. [E291) is the QR-method, independently proposed by the American I. G. F. Francis (Computer 1. 4 (1961-62), 265-271, 332-345) and the Russian V. N. Kublanovskaya (Zhllmal v.ych. Mal. i Mat. Fi::.. 1 (1961),555-570). The QR-method uses the factorization QR with orthogonal Q and upper triangular R. We discuss the QR-method for a real symmetric matrix. (For extensions to general matrices see Ref. rE29] in App. 1.) In this method we first transform a given real symmetric f1 X 11 matrix A into a tridiagonal matrix Bo = B by Householder's method. This creates many Leros and thus reduces the amount of further work. Then we compute B 1 , B 2 , . . . stepwise according to the following iteration method.
Step 1. Factor Bo = QoRo with orthogonal Ro and upper triangular Bl = RoQo·
Ro.
Then compute
Step 2. Factor Bl = QIRI' Then compute B2 = R 1 Ql' Ge1leral Step s
+ 1.
(5)
Here Qs is orthogonal and Rs upper triangular. The factorization (Sa) will be explained below. B 8 + 1 Similar to B. Convergence to a Diagonal Matrix. From (5a) we have Rs = Q;lBs' Substitution into (5b) gives
(6) Thus Bs+l is similar to Bs' Hence Bs+l is similar to Bo = B for all s. By Theorem 2, Sec. 20.6, this implies that Bs+l has the same eigenvalues as B. Also, Bs+l is symmetric. This follows by induction. Indeed. Bo = B is symmetric. Assuming Bs to be symmetric, that is, BsT = Bs, and using Q;l = Qs T (since Qs is orthogonal), we get from (6) the symmetry,
If the eigenvalues of B are different In absolute value, say, then
lim
s_oc
Bs =
IAll > IA21 > ... > IAnl,
D
where D is diagonal, with main diagonal entries listed in App. I.)
"-]0
"-2, ... ,
An. (Proof in Ref. [E29J
880
CHAP. 20
Numeric Linear Algebra
How to Get the QR-Factorization, say, B = Bo = [bjk ] = QoRo. The tridiagonal matrix B has 11 - I generally nonzero entries below the main diagonal. These are b2l • b32 , . . . • bn,n-l' We multiply B from the left by a matrix C 2 such that C 2 B = [b)~] has bW = O. We multiply this by a matrix C3 such that C3 C 2 B = [bj'r] has b~~ = O. etc. After 11 - I such multiplications we are left with an upper triangular matrix Ro. namely. (7)
These
11
X
f1
matrices Cj are very simple. Cj has the 2
X
2 submatrix
(OJ suitable)
in Rows j - 1 and j and Columns j - 1 and j; everywhere else on the main diagonal the matrix Cj has entries 1; and all its other entries are O. (This submatrix is the matrix of a plane rotation through the angle OJ; see Team Project 28. Sec. 7.2.) For instance. if fl = 4. writing Cj = cos OJ. sJ = sin OJ. we have
C2
=
~
- 'C, 2
S2
0
C2
0
0 0
0 0
0
~l4 ~ ~
0
0
Cg
S3
-.1'3
C3
0
0
o
0
o
These Cj are orthogonal. Hence their product in (7) is orthogonal, and so is the inverse of this product. We call this inverse Qo. Then from (7), (8)
where, with Ci 1
= C/o
(9) This is our QR-factorization of Bo. From it we have by (5b) with s = 0 (10)
We do not need Qo explicitly, but to get Bl from (10). we first compute R OC 2 T. then (ROC 2 T)C3 T, etc. Similarly in the further steps that produce B2 • B 3 , • • • • Determination of cos OJ and sin OJ. We finally show how to find the angles of rotation. cos f)2 and sin ~ in C 2 must be such that b~) = 0 in the product
o
o
SEC 20.9
881
Tridiagonalization and QR-Factorization
Now b~21 is obtained by multiplying the second row of C 2 by the first column of B,
cos 82 =
v'l
(II)
E X AMP L E 2
VI]
82
tan 82
sin 82 = Similarly for 83 , 84 ,
+ tan 2
V]
+
+
(b 21 /b n
)2
b21lb n
V]
tan 2 82
+
(b 21 /b n
)2
The next example illustrates all this.
.. '.
QR-Factorization Method Compute all the eigenvalues of the matrix 4 6 5
Solution.
We fust reduce A to tridiagonal foml. Applying Householder's method. we obtain (see Example I)
[
A2 =
-~!J8 -~T8
o ,\;'2
o
v2
6
o
0
o
From the characteristic determinant we see that A 2 • hence A, has the eigenvalue 3. (Can you see this directly from A2?1 Hence it suffices to apply the QR-method to the tridiagonal 3 X 3 matrix
'.
~ ~ -~Is •
-,\;18 7
[
v2
Step I. We multiply B from the lefl by
C2 =
[COO ~
sin
f)2
-s~
cos
f)2
f)2
]
0
Here (-sin O2 )' 6 + (cos 02)(-~) With these values we compute
and lhen C 2 B by
= 0 gives
7,34846<) 23
C2 B
=
[
C,
~
0
[:
cos
1.154 700 54
o
1.414213 56
6.00000000
(cos
7.348469 23
= C g C2 B = [
f)3) •
f)3
= -0.57735027.
.
1.414 213 56 = 0 the values cos
-7.50555350
f)3]'
-0.816 496 58]
3.26598632
.!.
cos
f)3
(II) cos O2 = 0.81649658 and sin f)2
-7.50555350
Si:
f)3
- ,in
0
In C 3 we get from (- sin f)3) • 3.265 <)86 32 and sin f)3 = 0.39735971. This gives
Ro
0] f)3
-0.8\649658J
0
3.55902608
3.443 784 13
o
o
5.04714615
.
= 0.917 662 94
CHAP. 20
882
Numeric Linear Algebra
From this we compute
Bl
= RoC2
T
C3
T
=
10.333 333 33
- 2.054 804 67
-:.05480467
4.03508772
:.005 532 51 ]
2.00553251
4.63157895
[
which is symmetnc and tridiagonal. The off-diagonal entries in BI are still large in absolute value. Hence we have to go on.
Step 2. We do the same computations as in the first step. with Bo = B replaced by Bl and C z and C 3 changed accordingly, the new angles being 62 = 0.196291 533 and 63 = 0.513415589. We obtain -2.802322 -II
[ 1053565375 Rl
~
-0.391 145 88]
0
4.08329584
3.98824028
0
0
3.06832668
1
and from this
B2 =
[ 10.8" 879 88
-0.79637918
-:.796379 18
5.44738664
~50702500
1.50702500
2.672 733 48
We see that the off-diagonal entries are somewhat smaller in absolute value than those of B I . but ,till much too large for the diagonal entries to be good approximations of the eigenvalues of B.
Further Steps. Ib~~)1 = Ib~i)1 in all
We list the main diagonal entries and the absolutely largest off-diagonal entry, which is steps. You may show that the given matrix A has the spectrum I 1.6,3,2. (J)
Stepj
b\·i)
b<.i)
HJ)
maXj*k Jbjkl
3 5 7 9
10.966892 9 10.9970872 10.999742 I 10.999977 2
5.94589856 6.00] 81541 6.00024439 6.00002267
2.08720851 2.00109738 2.00001355 2.00000017
0.58523582 0.12065334 0.035911 07 0.01068477
11
22
33
•
Looking back at our discussion, we recognize that the purpose of applying Householder's tridiagonalization before the QR-factorization method is a substantial reduction of cost in each QR-factorization. in particular if A is large. Convergence acceleration and thus further reduction of cost can be achieved by a spectral shift, that is, by taking Bs - ksI instead of Bs with a suitable ks . Possible choices of ks are discussed in Ref. [E291, p. 510.
---. .•. "1-41
-...... HOUSEHOLDER TRIDIAGONALIZATION
Tridiagonalize. showing the details:
1. [
3.5
1.0
1.0
5.0
3.0
1.5
3.0
3.5
0.98 3.
0.04
1.5J
~[
0.04
0.56
OAO
0.44
0.40
0.80
[
8
2
2
8
8
2
2
2
2
6
4
2
2
4
6
4.
0
]
15-91 0.44]
8
QR-FACTORIZATION
Do three QR-steps to find approximations of the eigenvalues of: 5. The matrix in the answer to Prob 1
883
Chapter 20 Review Questions and Problems 6. The matrix in the answer to Prob. 3
9.
0,:]
0.2
4.1
,, 1. What are the main problem algebra?
0.1
0.1
4.0
ro
O.~] 1.0
0.1
10. CAS EXPERIMENT. QR-Method. Try to find out experimentally on what properties of a matrix the speed of decrease of off-diagonal entries in the QR-method depends. For this purpose write a program that first tridiagonalizes and then does QR-steps. Try the program out on the matrices in Probs. 1. 3. and 4. Summarize your findings in a short report.
-U.I
-4.3
7.0
area~
=1 :
in numeric linear
S T ION SAN D PRO B L EMS Xl
+
X2
+
X3
S
2. What is pivoting? When and how would you apply it?
Xl
+
2X2
+
2x3
6
3. What happens if you apply Gauss elimination to a system that has no solutions?
Xl
+
2x2
t-
3X3
8
4. What is Doolittle's method? Its connection to Gauss elimination?
17.
18.
5X1
5. What is Cholesk)"s method? When would you apply it?
19.
10. Why are similarity transformations of matrices important in designing numeric methods? Give examples. 11. What is the power method for eigenvalues? What are its advantages and disadvantages?
13. State Schur's inequality and give some applications of it.
GAUSS ELIMINATION
116-]91 Solve: 16.
4x2 5Xl
+
6Xl -
3X2
+
7X2
+
=
11.8
X3 =
34.2
3X3
2X3
=
-3.1
-10
3X2
+ +
21:1
X2
9X3
0
=
3X3
=
IS
X3
=
-13
+ SX3 =
26
20. Solve Prob. 17 by Doolittle's method. 21. Solve Prob. 17 by Cholesky's method. 122-24 1 INVERSE MATRIX Compute the inverse of:
22.
14. What is tridiagonalization? When would you apply it? 15. What is the idea of the QR-method? When would you apply the method?
- SX 2 + ISx 3
3x 1 -
12. State Gerschgorin's theorem from memory. Can you remember its proof?
17
X2
4X2 -
8. What is least squares approximation? What are the normal equations? 9. What is an eigenvalue of a matrix? Why are eigenvalue problems important? Give typical examples.
3x3
2Xl -
6. What do you know about the convergence of the Gauss-Seidel method? 7. What is ill-conditioning? What is the condition number and its significance?
+
23.
r"
2.0
05] O.S
0.5
1.0
1.5
2.0
1.0
2.0
10]
r'
2.0
3.S
1.0
1.S
I.S
9.0
CHAP. 20
884
Numeric Linear Algebra
125-261 GAUSS-SEIDEL METHOD Do 3 steps without scaling, starting from [1 25.
+ 15x2 -
Xl
X3
+ 3x2
lOx I
11
=
-17
5
34. In Prob. 18 35. In Prob. 19 136-381 CONDITION NUMBER Compute the condition number (corresponding to the Coo-vector norm) of the coefficient matrix: 36. In Prob. 22 37. In Prob. 23 38. In Prob. 24 @9-40 1
FITTING BY LEAST SQUARES
Fit:
VECTOR NORMS
127-321
Compme the f\-, C2 -, and Ceo-norms of the vectors 27. [0 4 -8 3]T
28. [3 29. 30. 31. 32.
8
[-4 [0
-ll]T
1
0
0
2]T O]T
[-5
-2
7
[0.3
1.4
0.2
0
O]T
-0.6]T
133-351 MATRIX NORM Compute the matrix norm corresponding to the Ccc-vector norm for the coetlicient matrix: 33. In Prob. 17
39. A straight line to (- 2. O. \). (0. 1.9). (2. 3.8), (4, 6.\), (6,7.8) 40. A quadmtic pambola to O. 9). (2, 5), (3. 4), (4. 5). (5. 7)
141-431 EIGENVALUES Find three circular disks that must contain all the eigen values of the matrix: 41. In Prob. 22 42. In Prob. 23 43. Tn Prob. 24 44. (Power method) Do 4 sleps of the power method for the matrix in Prob. 24. starting from [I I I]T and computing the Rayleigh quotients and error bounds. 45. (Householder and QR) Tridiagonalize the matrix in Prob. 23. Then apply 3 QR steps. (Spectrum (6S): 9.65971, 4.07684. 0.263451)
Numeric Linear Algebra Main tasks are the numeric solution of linear systems (Secs. 20.1-20.4), curve fitting (Sec. 20.5). and eigenvalue problems (Secs. 20.6-20.9). Linear systems Ax = b with A = [ajk]' written out
(1)
can be solved by a direct method (one in which the number of numeric operations can be specified in advance. e.g., Gauss's elimination) or by an indirect or iterative method (in which an initial approximation is improved stepwise).
885
Summary of Chapter 20
The Gauss elimination (Sec. 20.1) is direct, namely, a systematic elimination process that reduces (1) stepwise to triangular fonn. In Step I we eliminate Xl from equations E2 to En by subtracting (a2I/an) EI from E 2, then (a3I/an) EI from E 3. etc. Equation EI is called the pivot equation in this step and an the pivot. In Step 2 we take the new second equation as pivot equation and eliminate X2, etc. If the triangular fonn is reached, we get Xn from the last equation, then Xn-I from the second last. etc. Partial pivoting (= interchange of equations) is necessary if candidates for pivots are zero, and advisable if they are small in absolute value. Doolittle's, Crout's, and Cholesky's methods in Sec. 20.2 are variants of the Gauss elimination. They factor A = LV (L lower triangular, U upper triangular) and solve Ax = LUx = b by solving Ly = b for y and then Ux = y for x. In the Gauss-Seidel iteration (Sec. 20.3) we make an = a22 = ... = ann = I (by division) and write Ax = (I + L + U)x = b; thus x = b - (L + U)x, which suggests the iteration formula (2)
XCrn + I )
=b
- Lx(m+l) - UX Crn)
in which we always take the most recent approximate x./s on the right. If Ilell < l. where C = -(I + L)-IU, then this process converges. Here. IICII denotes any matrix norm (Sec. 20.3). If the condition number K(A) = IIAII IIA -111 of A is large. then the system Ax = b is ill-conditioned (Sec. 20.4), and a small residual r = b - Ax does 1Iot imply that x is close to the exact solution. The fitting of a polynomial p(x) = b o + blx ., ... + bmx'" through given data (points in the J\"}·-plane) (Xl' YI), ... , (Xm Yn) by the method of least squares is discussed in Sec. 20.5 (and in statistics in Sec. 25.9). Eigenvalues A (values A for which Ax = Ax has a solution x =1= 0, called an eigenvector) can be characterized by inequalities (Sec. 20.7), e.g. in Gerschgorin's theorem, which gives 11 circular disks which contain the whole spectrum (all eigenvalues) of A, of centers ajj and radii Llajkl (sum over k from I to 11, k =1= j). Approximations of eigenvalues can be obtained by iteration. starting from an Xo =1= 0 and computing Xl = Axo, x 2 = Ax!> ... , xn = AXn-i. In this power method (Sec. 20.8) the Rayleigh quotient
(3)
q=
gives an approximation of an eigenvalue (usually that of the greatest absolute value) and, if A is symmetric, an error bound is
(4) Convergence may be slow but can be improved by a speCTral shift. For determining all the eigenvalues of a symmetric matrix A it is best to first rridiagonalize A and then to apply the QR-method (Sec. 20.9), which is based on a factorization A = QR with 0I1hogonai Q and upper triangular R and uses similarity transformations.
CHAPTER
21
>~-
Numerics for ODEs and PDEs Numeric methods for differential equation" are of great practical importance to the engineer and physicist because practical problems often lead to differential equations that cannot be solved by one of the methods in Chaps. 1-6 or 12 or by similar methods. Also, sometimes an ODE does have a solution fonnula (as the ODEs in Secs. 1.3-1.5 do), which, however, in some specific cases may become so complicated that one prefers to apply a numeric method instead. This chapter explains and applies basic methods for the numeric solution of ODEs (Secs. 21.1-21.3) and POEs (Secs. 21.4-21.7). Sections 21.1 alld 21.2 may be studied immediately after Chap. 1 alld Sec. 21.3 immediately after Chap. 2, because these sections are independent of Chaps. 19 and 20. Sections 21.4-21.7 Oil PDEs may be studied immediately after Chap. 12 if students have some knowledge of linear systems of algebraic equations.
Prerequisite: Secs. 1.1-1.5 for ODEs, Secs. 12.1-12.3, 12.5, 12.10 for POEs. References and Answers to Problems App. 1 Part E (see also Parts A and C), App. 2.
21.1
Methods for First-Order ODEs From Chap. 1 we know that an ODE of the first order is of the form F(x, y. y') = 0 and can often be written in the explicit form y' = f(x. y). An initial value problem for this equation is of the fonn
y' = f(x. yt
(1)
)'(xo) = Yo
where .1.'0 and Yo are given and we assume that the problem has a unique solution on some open interval (/ < x < b containing .1.'0' In thi~ section we shall discu~s methods of computing approximate numeric values of the solution y(x) of (1) at the equidistant points on the x-axis Xl
=
.1.'0
+
h.
where the step size h is a fixed number, for instance, 0.2 or O. I or 0.01. whose choice we discuss later in this section. Those methods are step-by-step methods, using the same formula in each step. Such formulas are suggested by the Taylor series 2
(2) 886
y(x
+ h)
= y(x)
+
hy' (x)
+
h ""2 y"(X) + ....
SEC 21.1
887
Methods for First-Order ODEs
For a small h the higher powers h 2 , h 3 , approximation y(x
+ h)
.•.
= y(x)
= y(x)
are very small. This suggests the crude
+ hy' (x) + hf(.\", y)
(with the second line obtained from the given ODE) and the following iteration process. In the first step we compute
+
Yl
=
Y(.\"o
+
h). In the second step we compute
which approximates )'(X2) = 1'(xo
+
2h), etc., and in general
which approximates Y(XI)
=
Yo
hf(xo, Yo)
(3)
(n
= 0, 1, .. ').
This is called the Euler method or the Euler-Cauchy method. Geometrically it is an approximation of the curve of y(x) by a polygon whose first side is tangent to this curve at xo (see Fig. 448).
y
Fig. 448.
Euler method
This crude method is hardly ever u~ed in practice, but since it is simple, it nicely explains the principle of methods based on the Taylor series. Taylor's formula with remainder has the foml y(X
+ h) = y(x) + hy' (x) + ~h2y"W
(where x ~ l; ~ x + h). It shows that in the Euler method the tru/lcation error il1 each step or local truncation error is propOitional to h 2 , written 0(11 2 ), where 0 suggests order (see also Sec. 20.1). Now over a fixed x-interval in which we want to solve an ODE the number of steps is proportional to 1111. Hence the total error or global error is propOitional 2 to h 0/h) = hI. For this reason. the Euler method is called a first-order method. In addition, there are roundoff errors in this and other methods, which may affect the accuracy of the values Yb .\'2, ..• more and more as 11 increases, as we shall see.
888
CHAP. 21 Table 21.1
Numerics for ODEs and PDEs Euler Method Applied to (4) in Example 1 and Error
+
xn
)'n
0 I
0.0
0.000
0.000
0.2 0.4 0.6 0.8 1.0
0.000 0.040 0.128 0.274 0.489
0.040
2 3 4 5
E X AMP L E 1
0.2(xn
11
Exact Values
)'n)
0.088 0.146 0.215
Error
En
0.000 0.021
0.000
0.092 0.222 0.426 n.7U<
0.052 0.094 0.152 0.229
O.02J
Euler Method Apply the Euler method to the following initial value problem. choosing II = 0.2 and computing )'1 • . . .
y' =
(4)
Solution.
Here f(x.
y)
= x
+ y; hence
X
+ y,
yeO) = o.
f(x n , Yn) = xn
Yn+1
=
)"n
, \'5:
+ Yn' and we see that (3)
+ 0.2lxn
-j-
become~
Yn)'
Table 21.1 shows the computations. the values of the exact solution y(x) = eX -
x-I
obtained from (4) in Sec. 1.5. and the error. Tn practice the exact solution is unknown, but an indication of the accuracy of the values can be obtained by applying the Euler method once more with step 211 = 0.4. letting )'n* denote the approximation now obtained. and comparing corresponding approximations. This computation is:
xn
vn *
0.0 0.4 0.8
0.000 0.000 0.160
O.4(Xn
+ )'n)
0.000 0.160
Yn in Table 21.1
Difference Yn - )'n*
0.000 0.040 0.274
0.000 0.040 0.1l4
Let En and En" be the errors of the computations with II and 211. respectively. Since the error is of order 112. in a switch from II to 217 it is multiplied by 22 = 4, but since we need only half as many steps as before, it will be multiplied only by 412 = 2. Hence En" = 2En SO that the difference is En * - En = 2En - En = En' Now Y = )'n + En = Yn * + En * by the definition of error; hence En" - En = )'n - )'n* indicates En qualitatively. Tn our computations, Y2 - )'2* = 0.04 - 0 = 0.04 (actual error 0.052. see Table 21.1) and )"4 - )'4* = 0.274 - 0.160 = 0.114 (actually 0.152). •
E X AMP L E 2
Euler Method for a Nonlinear ODE Figure 449 concerns the initial value problem (5)
yeO) = OA
and shows the curve of the solution y = 1/[2.5 - Sex)] + 0.01x 2 where Sex) is the Fresnel integral (38) in App. 3.1. It also shows 80 approximate values for 0 ~ x ~ 4 obtained by the Euler method from (3).
Although Ii = 0.05 is smaller than Ii in Example 1, the accuracy is still not good. It is interesting that the error is not monotone increasing. obviously since the solution is not monOlone. We shall return to this ODE in the problem set. •
SEC. 21.1
889
Methods for First-Order ODEs
y
0.70
0.60
0.50
0.40
o Fig. 449.
2
3
4
x
Solution curve and Euler approximation in Example 2
Automatic Variable Step Size Selection in Modern Numeric Software The idea of adaptive integration as motivated and explained in Sec. 19.5 applies equally well to the numeric solution of ODEs. It now concerns automatically changing the step size h depending on the variability of y' = f determined by (6*)
Accordingly, modern software automatically selects variable step sizes h n so that the error of the solution will not exceed a given maximum size TOL (suggesting tolerance). Now for the Euler method. when the step size is h = h m the local error at Xn is about ~hn21y"(xn)l. We require that this be equal to a given tolerance TOL,
(6)
(a)
21"(Xn) 1=
I 2hn )'
2TOL thus
TOL,
Iy''lxn) 1
y"(X) must not be zero on the interval J: Xo ~ x = xN on which the solution is wanted.
Let K be the minimum of 1y" (X) 1on J and assume that K > O. Minimum 1,/'(x) 1corresponds to maximum h = H = Y2 TOLIK by (6). Thus. Y2 TOL = HVK. We can insert this into (6b). obtaining by straightforward algebra K
(7)
where
For other methods, automatic step size selection is based on the same principle.
Improved Euler Method By taking more terms in (2) into account we obtain numeric methods of higher order and precision. But there is a practical problem. If we substitute y' = f(x, y(x)) into (2), we have (2*)
890
CHAP. 21
Numerics for ODEs and PDEs
Now y in .f depends on x, so that we have f' as shown in (6*) and .f", .fill even much more cumbersome. The general strategy now is to avoid the computation of these derivatives and to replace it by computing .f for one or several suitably chosen auxiliary values of (x. y). "Suitably" means that these values are chosen to make the order of the method as high as possible (to have high accuracy). Let us discuss two such methods that are of practical imp0l1ance. namely. the improved Euler method and the (classical) Runge-Kutta method. In the improved Euler method or improved Euler-Cauchy method (sometimes also called Heun method), in each step we compute first the auxiliary value (8a)
and then the new value (8b) Thi~ method ha" a simple geometric interpretation. In fact. we may say that in the we approximate the solution y by the straight line through interval from x" to Xn + (xn , Yn) with slope f(x n , Yn), and then we continue along the straight line with slope f(X n +l, Y:;+I) until x reaches X,,+I' The improved Euler-Cauchy method is a predictor-corrector method, because in each step we first predict a value by (8a) and then correct it by (Sb). In algorithmic form. using the notations kl = hf(xn, Yn) in (Sa) and k2 = hf(.'n+b Y~+I) in (8b), we can write this method as shown in Table 21.2.
!h
Table 21.2
Improved Euler Method (Heun's Method)
ALGORITHM EULER (f, xo. Yo, h. N) This algorithm compute<; the solution of the initial value problem y' = f(x. y) . .\"(xo) = Yo at equidistant points xl = Xo + 17, X2 = Xo + 211, ... , XN = Xo + Nfl; here f is such that this problem has a unique solution on the mterval [xo. xNl (see Sec. 1.7). INPUT:
Initial values xo, Yo, step size II, number of steps N
OUTPUT: Approximation Yn+l to the solution Y(Xn +l) at where 11 = 0, . . . , N - 1 For 11 =
o.
I. .... N - 1 do:
Xn+l = Xn
+ 11
kl = hf(xn, y,,)
k2
= hJ(Xn+b
Y,,+l =
y"
Yn
+ k1 )
+ 2(k1 + k2 )
OUTPUT Xn+l, ."n+l End Stop End EULER
xn+l
=
Xo
+
(n
+
l)h.
SEC 21.1
891
Methods for First-Order ODEs
E X AMP L E 3
Improved Euler Method Apply the improved Euler method to the initial vallie problem (4). choosmg h
Solution.
=
0.2. as before.
For the present problem we have in Table 21.2
k2 .1',,+1 = Yn +
=
0.2
""2
0.2(x" + 0.2 + "n
+ 0.2(xn + .1',,»
(2.2xn + 2.2Yn + 0.2) = Yn + 0.22(xn + .1',,) + 0.02.
Table 2 1.3 show~ thai ollr present results are more accurate than tho~e in Example I: see also Table 21.6. •
Table 21.3
Improved Euler Method Applied to (4) and Error
+ .1'n) + 0.02
U.22(xn
n
xn
•vn
0
0.0 0.2 0.4 0.6 0.8 1.0
0.0000 0.0200 0.0884 0.2158 0.4153 0.7027
2 3 4 5
Exact Values (4D)
Error
0.0000 0.0214 0.0918 0.2221 0.4255 0.7183
0.0000 0.0014 0.0034 0.0063 0.0102 0.0156
0.0200 0.0684 0.1274 0.1995 0.2874
Error of the Improved Euler Method. The local error is of order h 3 and the globed error of order h 2 , so that the method is a second-order method. PROOF
Setting
In = Ierm .1'(xn » and using (r), we have
(9a)
Y(Xn
+
h) - .r(xn )
=
-
hI,.
1 2-' 1 3-" + :z./I In + 6 h In + ....
Approximating the expression in the brackets in (8b) by Taylor expansion, we obtain from (8b) 1 [)'n+l - )'n ="2h In _
(where'
1
[-
-"2 h In
(9b)
= dldx."
+
] In+l
+
(fn
-
+
-,
1
1"
+ 1"+1
2 -If
hIn +"2h In
and again using the
+ ... )]
etc.). Subtraction of (9b) from (9a) gives the local error
h
3
_If
h
3
_If
11 3
-If
-6 .f n - -4 I n + ... -- - -12 .f" + .... Since the number of steps over a fixed x-interval is proportional to I11z, the global error 3 2 • is of order h /11 = h , so that the method is of second order.
892
CHAP. 21
Numerics for ODEs and PDEs
Runge-Kutta Methods (RK Methods) A method of great practical importance and much greater accuracy than that of the improved Euler method is the classical Runge-Kutta method of fourth order, which we call briefly the Runge-Kutta method. 1 It is shown in Table 21.4. We see that in each step we first compute four auxiliary quantities k 1, k2 , k 3 , k4 and then the new value Yn+1' The method is well suited to the computer because it needs no special starting procedure, makes light demand on storage, and repeatedly uses the same straightforward computational procedure. It is numerically stable. Note that if f depends only on x, this method reduces to Simpson's rule of integration (Sec. 19.5). Note further that kJ, ... , k4 depend on 11 and generally change from step to step. Table 21.4
Classical Runge-Kutta Method of Fourth Order
ALGORITHM RUNGE-KUTTA (f, x o, )'0' h, N). This algorithm computes the solution of the initial value problem y' at equidistant points Xl = Xo
+
h. X2
=
Xo
+
2h . . . . , XN = Xo
= f(x, y), y(Xo) = Yo
+ Nil:
here f is such that this problem has a unique solution on the interval [xo, XN] (see Sec. 1.7). INPUT:
Function f, initial values xo, Yo, step size h, number of steps N
OUTPUT: Approximation Yn+1 to the solution 'y(Xn +1) at X,,+1 where n = 0, I, ... , N - 1 For n
=
= Xo
+
(n
+
l)h.
0, 1, ... , N - 1 do:
k1 = hf(xn, y,,)
+ ~h, Yn + ~kl) hf(xn + ~h, Yn + ~k2) hf(xn + h, Yn + k3) = Xn + h = Yn + ~(k1 + 2k2 + 2k3 +
k2 = hf(xn k3 = k4 = Xn+1 Yn+1
k 4)
OUTPUT Xn +], )'1/+1 End Stop End RUNGE-KUTTA
INamed after the German mathematicians KARL RUNGE (Sec. 19.4) and WILHELM KUTTA (1867-1944). Runge [Math. Annalen 46 (1895),167-178], KARL HEUN [Zeitschr. Math. Phys. 45 (1900), 23-38], and Kutta [Zeitschr. Math. Phys. 46 1901l. 435-453] developed various such methods. Theoretically, there are infinitely many fourth-order methods using four function values per step. The method in Table 21.4 is most popular [rom a practical viewpoint because of its "symmetrical" form and its simple coefficients. It was given by Kutta.
SEC 21.1
893
Methods for First-Order ODEs
E X AMP L E 4
Classical Runge-Kutta Method Apply the Runge-Kutta method to the initial "alue problem (4) in Example 1, choosing II computing five steps.
Solutioll.
For the present problem we have f(x. v}
x
=
+ y.
0.2, as before, and
=
Hence
kl = 0.2(xn
+ y,,),
k2 = 0.2(x"
+ 0.1 + y" + O.Sk l ),
k3 = 0.2(x"
+ 0.1 + Yn + 0.Sk2 }.
k4 = 0.2(x"
+ 0.2 + y" + k3 ).
Table 21.5 shows the results and their errors, which are smaller by factors 103 and 104 than those for the two Euler methods. See also Table 21.6. We mention in passing that since the present k 1• . . . . k4 ate simple, operations were saved by substituting kl into k2. then k2 into k3 . etc.: the resulting formula is shown in Column • 4 of Table 21.5.
Table 21.5
Runge-Kutta Method Applied to (4)
n
x"
v ."
0
0.0 0.2 0.4 0.6 0.8 1.0
0 0.021400 0.091 818 0.222107 0.425521 0.718251
2 3 4 5
0.2214(x" + Yn) + 0.0214
bxact Values (6D) y = eX - x-I
106 X Error
0.021400 0.070418 0.130289 0.203 414 0.292730
0.000000 0.021403 0.091825 0.222 119 0.425541 0.718282
0 3 7 12 :W 31
ofy"
Table 21.6 Comparison of the Accuracy of the Three Methods Under Consideration in the Case of the Initial Value Problem (4), with h = 0.2
Error
!
x
r=ex-x-I
Euler (Table 21.1)
Improved Euler (Table 21.3)
Runge-Kutta (Table 21.5)
0.2 0.4 0.6 0.8 1.0
0.021403 0.091825 0.222119 0.425541 0.718282
0.021 0.052 0.094 0.152 0.229
0.U014 0.0034 0.0063 0.0102 0.0156
0.000003 0.000007 0.000011 0.000020 0.000031
I
Error and Step Size Control. RKF (Runge-Kutta-Fehlberg) The idea of adaptive integration (Sec. 19.5) has analog<; for Runge-Kutta (and other) methods. In Table 21.4 for RK (Runge-Kutta), if we compute in each step approximations y and .17 with step sizes hand 2h. respectively, the latter has error per step equal to 5 2 = 32 times that of the former; however, since we have only half as many steps for 2h, the actual factor is 2 5 /2 = 16. so that, say, and thus
894
CHAP. 21
Numerics for ODEs and PDEs
Hence the error
E =
d- h ) for step size h is abour
(10)
where y problem
y = lh) -
y(2hl, as said before. Table 21.7 illustrates (10) for the initial value
y' =
(1L)
(y - x-l)2
+ 2,
yeo)
= I.
the step size h = 0.1 and 0 ~ x ~ 0.4. We see that the estimate is close to the actual error. This method of error estimation is simple but may be unstable. Table 21.7 Runge-Kutta Method Applied to the Initial Value Problem (11) and Error Estimate (10). Exact Solution y = tan x + x + 1 \:
0.0 0.1 0.2 0.3 0.4
v
y
(Step size /7)
(Step si7e 211)
Error Estimate (10)
Actual Error
Exact Solution (9D)
1.000 000 000 1.200 334 589 1.402 709 878 1.609 336 039 1.822 792 993
l.000 000 000
0.000000000
1.402 707 408
0.000 000 165
1.822 788 993
0.000000 267
0.000000000 0.000 000 083 0.000 000 157 0.000 000 210 0.000 000 226
1.000000000 1.200334672 1.402 710 036 1.609 336 250 l.822 793 219
RKF. E. Fehlberg [Computing 6 (1970), 61-71] proposed and developed error control by using two RK methods of different orders to go from (xn , Yn) to (xn +1> Yn+l)' The difference of the computed y-values at Xn+l gives an error estimate to be used for step size controL Fehlberg discovered two RK formulas that together need only 6 function evaluations per step. We present these formulas here because RKF has become quite popular. For instance. Maple uses it (also for systems of ODEs). Fehlberg's fifth-order RK method is (12a)
with coefficient vector y = [Yl ... Y6]. (12b)
y=
U;5
o
28561 56430
6656 12825
i5]'
His fourth-order RK method is (13a)
with coefficient vector (13b)
Y * -_
[25 216
o
1408 2565
2197 4104
_1] 5'
SEC 21.1
895
Methods for First-Order ODEs
In both formulas we use only 6 different function evaluations altogether. namely.
(14)
k1
= hf(x."
k2
= hf(x., +
k3
= Izf(x n + ~h.
y.,
+
k4
= hf(xn + ~~h,
Yn
+ ~~~~kl
k5
= hf(x., + h.
J'n
+ i~~k1
k6
= hf(x., + ~h.
.\" n -
J'n)
!Iz,
Yn +
!k1)
~k1 +
i2k2)
~~g~k2 + ~~~~k3) 8k2 + 3:~~k3 - 481t~k4) 2k2 - ~~~k3 + ~~g:k4 - ~~k5)'
+
287 k1
The difference of (12) and (13) gives the error estimate (15)
E X AMP L E 5
En+l -- .\'n+1 -
* --...Lk Yn+1 360 1
128 k 4275 3
-
2197 k 75240 4
+..!.k +..2..k 50 5 55 6'
Runge-Kutta-Fehlberg For the initial value problem (II) we obtain from (12)-(14) with 11= 0.1 in the first step the 0.200062 500000
k1 = 0.200000 000000
k2
k3 = 0.200140756867
k4 = 0.2008.'i6926154
k5 = 0.201006676700
k6 = 0.200250418651
.\'1*
=
=
12S-value~
1.20033466949
.\'1 = 1.20033467253
and the error
e~timate E1 =."1 -
.\'~ = 0.000000 00304.
The exact 12S-value is .1'(0.1) = 1.20033467209. Hence the actual error of."1 i~ -4.4· 10- 10, smaller than that in Table 21.7 by a factor 200. •
Table 21.8 summarizes essential features of the methods in this section. It can be shown that these methods are Ilumerically stahle (definition in Sec. 19.1). They are one-step methods because in each step we use the data of just one preceding step. in contrast to multistep methods where in each step we use data from several preceding steps. as we shall see in the next section. Table 21.8
Methods Considered and Their Order (= Their Global Error)
Method Euler Improved Euler RK (fourth order) RKF
Function Evaluation per Step
2 4 6
Global Error
Local Error
0(11)
0(17 2 )
2
0(h ) 0(114) 0(h 5 )
0(h 3 ) 0(11 5 ) 0(h 6 )
896
CHAP. 21
Numerics for ODEs and PDEs
Backward Euler Method. Stiff ODEs The backward Euler formula for numerically solving (I) is (n = 0, 1, .. ').
(16)
This formula is obtained by evaluating the right side at the new location (Xn +1, )'n+1); this is called the backward Euler scheme. For known )'n it gives )'n+l implicitly, so it defines an implicit method, in contrast to the Euler method (3), which gives Yn+l explicitly. Hence (16) must be solved for )'n+l' How difficult this is depends on f in (1). For a linear ODE this provides no problem, as Example 6 (below) illustrates. The method is particularly useful for "stiff' ODEs. as they occur quite frequently in the study of vibrations, electric circuits. chemical reactions. etc. The situation of stiffness is roughly as follows; for details, see, for example, [E5]. [E25], [E26] in App. I. Error terms of the methods considered so far involve a higher derivative. And we ask what happens if we let h illcrease. Now if the error (the derivative) grows fast but the desired solution also grows fast. nothing will happen. However. if that solution does not grow fast, then with growing h the error term can take over to an extent that the numeric result becomes completely nonsensical. as in Fig. 450. Such an ODE for which h must thus be restricted to small values, and the physical system the ODE models. are called stiff. This term is suggested by a mass-:-.pring system with a stiff ~pring (spring with a large k; see Sec. 2.4). Example 6 illustrates that implicit methods remove the difficulty of increasing II in the case of stiffness: it can be shown that in the application of an implicit method the solution remains stable under any increase of h, although the accuracy decreases with increasing h. E X AMP L E 6
Backward Euler Method. Stiff ODE The initial value prublem
y'
= f(x. y) =
-20.,'
+
20x
2
+
2x.
y(O) = I
has the solution (verify!) y =
e- 20x
+
x 2.
The backward Euler formula (16) is ),,,+1 =
Noting that xn+1
y"
+
hJ(xn+b Yn+1) = Yn
= X" + h. tak:ing the term Yn
(l6*)
+
h( -20Yn+1
+ 20X~+1 +
2Xn +1)'
-20hY,,+1 to the left. and dividing. we obtain
+
h[20(xn
Yn+l =
+
h}2
+
2(x"
+ h)j
1 + 20h
The numeric re,ulr- in Table 21.9 show the following. Stability of the backward Euler method for h = 0.05 and also for h = 0.2 with an error increase by about a factOT 4 for h = 0.2. Stability of the Euler method fOf h
= 0.05 but instability for h = 0.1
(Fig. 450).
Stability of RK for h = 0.1 but instability for h = 0.2. 1l1b illustrates that the ODE is stiff. Note thm even in the case of stdbility the appruximation of the ~olution near x = 0 is poor. •
Stiffness will be considered further in Sec. 21.3 in connection with systems of ODEs.
SEC. 21.1
897
Methods for First-Order ODEs y
•
2.0
I I I I I I
1.0 , , ,
I
I I I
,I
o " lO.2~
:0.4', I.. 0.6 I
,
\
I
\: -1.0
'I
1,
~:
0.8 4' LO x
'
"
•
Fig. 450. Euler method with h = 0.1 for the stiff ODE in Example 6 and exact solution
Table 21.9
Backward Euler Method (BEM) for Example 6. Comparison with Euler and RK
BEM
x
h
0.05
=
1.00000 0.26188 0.10484 0.10809 0.16640 0.25347 0.36274 0.49256 0.64252 0.81250 1.00250
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
BEM h = 0.2
Euler h = 0.05
Euler h = 0.1
1.00000
1.00000 0.00750 0.03750 0.08750 0.15750 0.24750 0.35750 0.48750 0.63750 0.80750 0.99750
1.00000 -1.00000 1.04000 -0.92000 1.16000 -0.76000 1.36000 -0.52000 1.64000 -0.20000 2.00000
0.24800 0.20960 0.37792 0.65158 1.01032
RK h
=
RK
0.1
1.00000 0.34500 0.15333 0.12944 0.17482 0.25660 0.36387 0.49296 0.64265 0.81255 1.00252
h = 0.2
1.000 5.093 25.48 127.0 634.0 316H
Exact
1.00000 0.14534 0.05832 0.09248 0.16034 0.25004 0.36001 0.49001 0.64000 0.81000 1.00000
===== .... -........ ,,_ . . . . ..-..-.
=
- - ......- ............... -- .... ---... ..
11-41
EULER METHOD
7.
Do 10 steps. Solve the problem exactly. Compute the error. (Show the details.)
1. y' 2. )"'
= y, yeO) =
=
y, yeO)
3. y' = (y 4.
y' = (y +
IS-lO I
=
.d. X)2.
1. h = 0.1 1. h = 0.01 yCO) = 0, h yCO) = o. h
1. h
6. (Logistic population»)."
=
=
1, h = O. I
8. y' + )' tan x = sin 2x, yeO) = I, h = 0.1 9. Do Prob. 7 using the Euler method with h = 0.1 and
=
0.1
=
0.1
111-171
0.1. Compare with Prob I y - )'2, yeO)
=
0.2, h
=
0.1
h 2 ,y(0)=0,h=0.1
CLASSICAL RUNGE-KUTTA METHOD OF FOURTH ORDER
Do 10 steps. Compare as indicated. Comment. (Show the details. ) 11.
y' - xy2 = O• .1'(0) = 1, h = 0.1. Compare with Prob. 7. Apply (10) to )'10'
12. y' =
yeo)
compare the accuracy.
Do 10 steps. Solve exactly. Compute the error. (Show the details.)
=
xy2 = O.
10./ = I
IMPROVED EULER METHOD
S. y' = y. -,,(0) and comment.
y' -
= y - y2 • .1'(0) = 0.2, h = 0.1. Compare with Prob. 6. Apply (10) to )'10'
898 13. 14. 15.
CHAP. 21
Numerics for ODEs and PDEs
(b) Graph solution curves of the ODE in (5) for various positive and negative initial values. (c) Do a similar experimem as in (a) for an initial value problem that has a monotone increasing or monotone decreasing solution. Compare the behavior of the error with that in (a). Comment.
y' = (l + X-I»),. yO) = e. h = 0.2 y' = !U'lx - xl),). ),(2) = 2. h = 0.2 y' + -" tan x = sin 2x, yeo) = 1. II = 0.1
16. In Prob. 15 use h = 0.2 (5 steps) and compare the error. 17. y' + 5x 4 y2 = 0, yeO) = 1. h = 0.2
20. CAS EXPERIMENT. RKF. (a) W]ite a program for RKF that gives .tn' y". the estimate (10). and if the
18. Kulla's third-order method is defined by Yn+l = y" + irk] + 4k2 + k3*) with kl and k2 as in RK (Table 21.4) and k3* = 1If(xn +].),,, - k] + 2k2 ). Apply this method to (4) in Example I. Choose h = 0.2 and do 5 steps. Compare with Table 21.6. 19. CAS EXPERIMENT. Euler-Cauchy vs. RK. (a) Solve (5) in Example 2 by Euler. Improved Euler. and RK for 0 ~ x ~ 5 with step h = 0.2. Compare the errors for x = 1,3,5 and comment.
solution is known. the actual error
En'
(b) Apply the program to Example 5 in the text
(10 steps, 11 = 0.1). (c) E" in (b) gives a relatively good idea of the size of the actual error. Is this typical or accidental? Find out by experimentation with other problems on what properties of the ODE or solution this might depend.
21.2 Multistep Methods In a one-step method we compute )'n+1 using only a single step, namely, the previous value y.,. One-step methods (Ire "self-starting," they need no help to get going because they obtain ."1 from the initial value ."0' etc. All methods in Sec. 21.1 are one-step. In contrast, a multistep method uses in each step values from two or more previous steps. These methods are motivated by the expectation that the additional information will increase accuracy and stability. But to get staJ1ed, one needs values. say. YO,."b ."2'."3 in a 4-step method, obtained by Runge-Kutta or another accurate method. Thus, multistep methods are not self-starting. Such methods are obtained as follows.
Adams-Bashforth Methods We consider an initial value problem (1)
y'
=
f(x, y),
as before. with f such that the problem has a unique solution on some open interval containing .1'0' We integrate y' = f(x, y) from x" to xn+1 = Xn + h. This gives
Now comes the main idea. We replace f(x. y(x» by an interpolation polynomial p(x) (see Sec. 19.3), so that we can later integrate. This gives approximations )'n+1 of y(xn +1) and Yn of y(xn ), x n+ 1
(2)
)'n+1
= )'n +
J x"
p(x) dr.
SEC. 21.2
899
Multistep Methods
Different choices of p(x) will now produce different methods. We explain the principle by taking a cubic polynomial, namely, the polynomial P3(x) that at (equidistant)
has the respective values In
= I(xn , Yn)
(3)
This will lead to a practically useful formula. We can obtainp3(x) from Newton's backward difference formula (18), Sec. 19.3:
where
x - Xn h
r=
We integrate p3(X) over x from Xn to Xn+l x = Xn
The integral oqdr
(4)
I
x".
Xn
+
+ hr.
= Xn + 11. thus over r from 0 to
we have
1) is 5/12 and that ofir(r 1
= 11 dr.
+ l)(r + 2) is 3/8. We thus obtain
(1~VIn+ 5
P3dX=lzlp3dr=h In+ 0
dx
1. Since
~) - V2In+ ':"V- 3fl1 .
2
12
8
It is practical to replace these differences by their expressions in terms of I:
\In V2In
= In - In-l = In - 2In-l + 1",-2
We substitute this into (4) and collect terms. This gives the multistep formula of the Adams-Bashforth method of fourth order
(5)
)'n+l = Yn
+
h 24 (55In - 59In-l
+ 37In-2
- 9In-3)'
It expresses the new value Yn+ I [approximation of the solution y of (I) at Xn+ 1] in terms of 4 values of.f computed from the y-values ohtained in the preceding 4 steps. The local truncation error is of order 11 5 , as can be shown, so that the global error is of order 114; hence (5) does define a fourth-order method.
900
CHAP. 21
Numerics for ODEs and PDEs
Adams-Moulton Methods Adams-Moulton methods are obtained if for p(x) in (2) we choose a polynomial that interpolates f(x, .v(x» at x n +l' x n , Xn_1> ... (as opposed to Xn , X"'-I' ... used before; this is the main point). We explain the principle for the cubic polynomial P3(X) that interpolates at X n +l' X'" Xn -l' X.,-2' (Before we had x n ' Xn -l, X,,-2' X.,-3') Again using (18) in Sec. 19.3 but now setting r = (x - x n +1)lh, we have
We now integrate over x from Xn to Xn+l as before. This corresponds to integrating over r from -I to O. We obtain
Replacing the differences as before gives (6)
Yn-ll
=
Yn
+
f
lz
Xn+l
P3(X) dx
=
Yn
+ 24
(9f"+1
+ 19fn - 5f,,-1 +
fn-2)'
Xu
This is usually called an Adams-Moulton formula. It is an implicit formula because fn+l = f(X,,+b )'n+l) appears on the right, so that it defines Yn+l only implicitly, in contrast to (5), which is an explicit formula, not involving Yn+l on the right. To use (6) we must predict a value y ~;+ I> for instance, by using (5), that is,
(7a)
Y~+1
= Yn +
h
24 (55fn - 59fn-l
+ 37fn-2
- 9fn-3)'
The corrected new value Yn+ 1 is then obtained from (6) with f n+ 1 replaced by f~+1 = J(x n +1, Y~+1) and the other fs as in (6); thus,
(7b)
h
Yn+1 = Yn
+ 24
(9f~+1
+
19fn - 5fn-l
+
f n-2)'
This predictor-corrector method (7a). (7b) is usually called the Adams-Moulton method offourth order. It has the advantage mer RK that (7) gives the error estimate
as can be shown. This is the analog of (10) in Sec. 21.1. Sometimes the name 'Adams-Moulton method' is reserved for the method with several corrections per step by (7b) until a specific accuracy is reached. Popular codes exist for both versions of the method. Getting Started. In (5) we need fo, fl, f2, f3' Hence from (3) we see that we must first compute Yr. -"2, .1'3 by some other method of comparable accuracy, for instance, by RK or by RKF. For other choices see Ref. [E26] listed in App. I.
SEC. 21.2
901
Multistep Methods
E X AMP L E 1
Adams-Bashforth Prediction (7a), Adams-Moulton Correction (7b) Solve the initial value problem
y'
(8)
=
T
+ y,
by (7a), (7b) on the interval 0 ::;; x ::;; 2. choosing h
=
y(O) =
0
0.2.
Solutio". The problem is the same as in Examples 1-3. Sec. 21.1. so that we can compare the results. We compute starting values Yl . .'"2• .'"3 by the classical Runge-Kutta method. Then in each step we predict by (7a) and make one correction by (7b) before we execute the next step. The result~ are shown and compared with the exact values in Table 21.10. We see that the corrections improve the accuracy considerably. This is typicaL • Table 21.10 Adams-Moulton Method Applied to the Initial Value Problem (8); Predicted Values Computed by (7a) and Corrected values by (7b) Il
Xn
0 1 2 3 4 5 6 7 8 9 10
0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0
106 • Error
Yn
Exact Values
0.425529 0.718270 1.120106 1.655191 2.353026 3.249646 4.389062
0.000 000 0.021403 0.091825 0.222119 0.425541 0.718282 1.120117 1.655200 2.353032 3.249647 4.389056
0 3 7 12 12 12 11 9 6 1 -6
Starting
Predicted
Corrected
Yn
Yn*
0.000000 0.021 400 0.091 818 0.222 107 0.425361 0.718066 1.119855 1.654885 2.352653 3.249 190 4.388505
ofYn
Comments on Comparison of Methods. An Adams-Moulton formula is generally much more accurate than an Adams-Bashforth formula of the same order. This justifies the greater complication and expense in using the former. The method (7a), (7b) is Ilumerically stable, whereas the exclusive use of (7a) might cause instability. Step size control is relatively simple. If ICorrector - Predictorl > TOL, use interpolation to generate "old" results at half the current "tep size and then try /1/2 as the new step. Wherea~ the Adams-Moulton fOIillula (7a), (7b) needs only 2 evaluations per step, Runge-Kutta needs 4; however, with Runge-Kutta one may be able to take a step size more than twice as large, so that a comparison of this kind (widespread in the literature) is meaningless. For more details, see Refs. [E25], LE26] listed in App. 1.
........ -.. -.. ,,_ ...... ..-...... -.... ......-............. --... 1. Carry out and show the details of the calculations leading to (4 H7) in the text.
12-111
ADAMS-MOULTON METHOD (7a), (7b)
Solve the initial value problems by Adams-Moulton. 10 steps with I correction ~r step. Solve exactly and compute the enOL (Use RK where no starting values are given.)
2.),' = J, yeO) = I, II = 0.1 (1.105171, 1.221403,
1.349859)
= -0.2xy. yeO) = I, h = 0.2 4. y' = 2xy, yeo) = I. h = 0.1 5. y' = I + y2, )"(0) = O. h = 0.1 3. y'
6. Do Prob. 4 by RK, 5 steps, h = 0.2. Compare the errors.
CHAP. 21
902
Numerics for ODEs and PDEs
solution and comment.
7. Do Prob. 5 by RK. 5 steps. h = 0.2. Compare the errors. 8. 9.
14. How much can you reduce the error in Prob. 13 by halving II (20 steps. h = 0.05)? First guess, then compute.
y' = xly. y(l) = 3, h = 0.2 y' = (x + Y - 4)2, y(O) = 4, h = 0.2. only 7 steps (why?)
15. CAS PROJECT. Adams-Moulton. (a) Accurate starting is important in (7a), (7b). Illustrate this in Example I of the text by using starting values from the improved Euler-Cauchy method and compare the results with those in Table 21.9. (b) How much does the error in Prob. 11 decrease if you use exact starting values (instead of RK-values)? (c) Experiment to find out for what ODEs poor starting is very damaging and for what ODEs it is not. (d) The classical RK method often gives the same accuracy with step 211 as Adams-Moulton with step h. so that the total number of function evaluations is the same in both cases. Illustrate this with Prob. 8. (Hence corresponding comparisons in the literature in favor of Adams-Moulton are not valid. See also Probs. 6 and 7.)
y' = I - 4.\'2, y(O) = O. II = 0.1 11. y' = x + y . .1'(0) = O. h = 0.1 (0.00517083.
10.
0.0214026.0.04(8585) 12. Show that by applying the method in the text to a polynomial of second degree we obtain the predictor and corrector formulas II
Y~+1 = Yn
+ 12
Yn+l = Yn
+ 12
(23fn -
h
(5fn+l
16f,,_1
+
+
5f,,-2)
8fn - fn-l)'
13. Use Prob. 12 to solve y' = 2x.\". y(O) = I (10 steps, It = 0.1, RK starting values). Compare with the exact
21.3
Methods for Systems and Higher Order ODEs Initial value problems for first-order
(1)
y'
=
system~
f(x.
of ODEs are of the form
y).
in components
f is assumed to be such that the problem has a unique solution y(x) on some open x-interval containing Xo. Our discussion will be independent of Chap. 4 on system~. Before explaining solution methods it is important to note that (l) includes initial value problem:-- for single mth-order ODEs,
(2)
y'>n)
= f(x, y, y',
and initial conditions )'(xo) = Kb
y' (xo)
.v"..... y(Tn-D)
= K 2 , ••• , y<m-l)(xo) = Km as special cases.
SEC. 21.3
Methods for Systems and Higher Order ODEs
903
Indeed, the connection is achieved by setting (3)
)' = _ 1: ",
I
_\. 1 = _\' ,
_3
)'2 =)'
Then we obtain the system
, )'1 =)'2
(4)
,
Y'111-1 ~ Y·uz.
Y~t
= I(x.
Yl ....• Ym)
Euler Method for Systems Methods for single first-order ODEs can be extended to systems (1) simply by writing vector functions Y and f instead of scalar functions y and I, whereas x remains a scalar variable. We begin with the Euler method. Just as for a single ODE this method will not be accurate enough for practical purposes, but it nicely illustrates the extension principle. E X AMP L E 1
Euler Method for a Second-Order ODE. Mass-Spring System Solve the initial value problem for a damped mass-spring system y"
+ 2/ +
0.75.1' = 0,
1'(0)
= 3.
/(0) = -2.5
by the Euler method for systems with step h = 0.2 for x from 0 to I (where x is time).
Solutioll.
The Euler method (3). Sec. 21.1. generalizes to systems in the form
(5) in
Yn+l
= Yn + hf(xn' Yn),
componelll~
and similarly for systems of more than two equations. By (4) the given ODE converts to the system
y~
=
y~
= .f2(X• ."1' .1'2) = -2Y2 - 0.75.\"1'
h(x. Yl' .1'2)
= Y2
Hence (5) becomes Yl.n+l = )'1,11.
+ O.2Y2.n
)'2.n+l = )'2,n
+
0.2(-2)'2,11. - 0.75Yl,n)'
The initial conditions are y(O) = )'1(0) = 3, -'" (0) = Y2(O) = -2.5. The calculation~ are shown in Table 21.11 on the next page. As for single ODEs. the results would not be accurate enough for practical purposes. The example merely serves to illustrate the method because the problem can be readily solved exactly. thu~
•
904
CHAP. 21
Table 21.11
Numerics for ODEs and PDEs
Euler Method for Systems in Example 1 (Mass-Spring System) ."1 Exact
.1" 1.71
0 2 3 4
0.0 0.2 0.4 0.6 0.8
5
1.0
(5D)
3.00000 2.50000 2.11000 1.80100 1.55230 1.34905
buor
.\'2 Exact )"2.11
= )"1 - )"I.n
EI
3.00000 2.55049 2.18627 1.88821 1.64183 1.43619
-2.50000 - 1.95000 -1.54500 -1.24350 -1.01625 -0.84260
0.00000 0.05049 0.76270 0.08721 0.08953 0.08714
Euor
(5D)
E2
-2.50000 -2.01606 -1.64195 -1.35067 -1.12211 -0.94123
= )"2
-
)"2."
0.00000 -0.06606 -0.09695 -0.10717 -0.10586 -0.09863
Runge-Kutta Methods for Systems As for Euler methods, we obtain RK methods for an initial value problem (1) simply by writing vector formulas for vectors with III components, which for 111 = 1 reduce to the previous scalar formulas. Thus for the classical RK method of fourth order in Table 21.4 we obtain
(6a)
(Initial values)
and for each step
11
= 0, 1, ... , N
- 1 we obtain the 4 auxiliary quantities
ki
=
II f(x n ,
k2
=
hf(xn
+
!h,
Yn
+ !kI )
k3
=
hf(xn
+
!h,
Yn
+ !k2)
Yn)
(6b)
and the new value [approximation of the solution y(x) at x n + 1 = Xo
+
(n
+ I)h1
(6c)
E X AMP L E 2
RK Method for Systems. Airy's Equation. Airy Function Ai(x) Solve the initial value problem y" =
X-,"~
.1"(0) ~ 1/(32/3. rO/3J) = 0.35502805.
/(0) = -1I(3113·r(lf3)) = -0.25881940
by the Runge-Kutta method for systems with h = 0.2: do 5 steps. This is Airy's equation,2 which arose in optics (see Ref. [Al3]. p. 188. listed in App. I). r is the gamma function (see App. A3.11. The initial conditions are such that we obtain a standard solution, the Airy function Ai(x), a special function that has been thoroughly investigated: for numeric values. see Ref. [GRI], pp. 446. 475.
2Named after Sir GEORGE BIDELL AIRY (1801-1892), English mathematician. who is known for his work in elasticity and in PDEs.
SEC. 21.3
905
Methods for Systems and Higher Order ODEs
Solution.
For y"
=
xy.
setting YI = Y,)"2 =
yi =
v' we obtain the system (4)
, )"1 =)"2
Hence f = [h f2]T in (1) has the components hex. y) = .\'2' f2(X. Y) = XYI' We now write (6) in components. The initial conditions (6a) are YI.O = 0.35502805. )"2.0 = -0.25881 940. In 16b) we have fewer subscripts by simply writing kl = a, k2 = b. k3 = C, ~ = d. so that a = [01 02]T. etc. Then (6b) takes the form
a=h
Y2,n [
]
X·n Yl.1l
(6b*)
For example. the second component of b Now in b (= k 2 ) the first argument is
i~
obtained as follows. fIx. y) has the second component f2(x, y) =
x = x"
+ lh.
Y = Yn
+ !a.
X\'I'
The second argument in b is
and the first component of this is
Together.
Similarly for the other components in (6b*). Finally. (6c*)
Yn+! = Yn
+ !(a +
2b
+ 2c + d).
Table 21.12 shows the values y(x) = YI(x) of the Airy function Ai(x) and of its derivative as of the (rather small!) error of y(x).
y' (x)
= )'2(x)
a~ well •
Table 21.12 RK Method for Systems: Values Yl,n(X n) of the Airy Function Ai(x) in Example 2 II
Xn
Yl,n(X n )
Yl(X n ) Exact (8D)
108 • Error of .V I
Y2,n(Xn )
0
0.0 0.2 0.4 0.6 0.8 1.0
0.35502805 0.30370303 0.25474211 0.20979973 0.16984596 0.13529207
0.35502805 0.30370315 0.25474235 0.20980006 0.16984632 0.13529242
0 12 24 33 36 35
-0.25881 940 -0.25240464 -0.23583 073 -0.21279 185 -0.18641 171 -0.15914687
2 3 4
5
906
CHAP. 21
Numerics for ODEs and PDEs
Runge-Kutta-Nystrom Methods (RKN Methods) RKN methods are direct extensions of RK methods (Runge-Kutta methods) to secondorder ODEs y" = f(x, y. y'), as given by the Finnish mathematician E. J. Nystrom [Acta Soc. Sci. fenn., 1925, L, No. 131. The best known of these uses the following formulas, where 11 = 0, I, ... , N - I (N the number of steps): kl
~h.f(xn' -".", y~)
=
k2 = ~hf(xn (7a)
+
~h.
)'n
+ K. y~ +
k3
= ~Izf(xn + ~h . ."n +
k4
=
~hf(xn + h, ."n
K.
k1)
where K
=
~h (y~
where L
=
h (y~
+ ~kl)
Y:L + k 2)
+ L, .":1 +
2k3)
From this we compute the approximation )'n+l of Y(Xn+lJ at Xn+l
=
+
Xo
k3)'
+
(n
+
l)h,
(7b)
and the approximation y~+ 1 of the derivative y' Ltn+ 1) needed in the next step. (7c)
=
RKN for ODEs y" f(x, y) Not Containing y'. Then k2 the method particularly advantageous and reduces (7) to
(7*)
=
kl
= ~Izf(xn' Yn)
k2
= ~hf(xn + ~h. Yn + ~h(y~ + ~kl» =
k4
=
Yn+l
~hf(xn
+
h, Yn
+
h(Y:1
= Yn + h(y~ + !(k1 +
+
k3 in (7), which makes
k3
k 2)
2k2 »
Y~+1 = Y:1 + -!(k1 + 4k2 + k 4 )· E X AMP L E 3
RKN Method. Airy's Equation. Airy Function Ai(x) For the problem k2
=
k3
In
=
Example 2 and"
O.l(xn
+
0.))(.\""
+
=
0.2 as betore we obtain from (Y) silllply
O.IY;l
+ 0.05k1 !.
k4
=
O.I(xn
+
"1
0.2)(y"
= O.I.~nYn
and
+ O.2y~ + 0.2k2)·
Table 21.13 ,how, the result,. The accuracy is the 'ame as in Example 2. but the work wa' much less.
•
Table 21.13 Runge-Kutta-Nystrom Method Applied to Airy's Equation, Computation of the Airy Function y = Ai(x)
,
Xn
Yn
\' ,n
0.0 0.2 0.4 0.6 0.8
0.35502805 0.303 703 04 0.254742 II 0.20979974 0.16984599 0.135292 18
-0.258 81940 -0.252404 64 -0.235 830 70 -0.21279172 -0.18641134 -0.15914609
LO
y(x) Exact (8D)
0.35502805 0.303 703 15 0.25474235 0.209800 06 0.16984632 0.13529242
lOB. Error
of -"n
0 II
24 32 33 24
I
SEC. 21.3
907
Methods for Systems and Higher Order ODEs
Our work in Examples 2 and 3 also illustrates that usefulness of methods for ODEs in the computation of values of "higher transcendental functions."
Backward Euler Method for Systems. Stiff Systems The backward Euler formula (16) in Sec. 21.1 generalizes to systems in the form
(8)
+
Yn+l = Yn
h f(Xn+l'
(11
Yn+l)
= O. I ... ').
This is again an implicit method. giving Yn+l implicitly for given Yn- Hence (8) must be solved for Yn+l' For a linear system this is shown in the next example. This example also illustrates that. similar to the case of a single ODE in Sec. 21.1, the method is very useful for stiff systems. These are systems of ODEs whose matrix has eigenvalues A of very different magnitudes, having the effect that, just as in Sec. 21.1, the step in direct methods, RK for example, cannot be increased beyond a certain threshold without losing stability. (A = -I and -10 in Example 4, but larger differences do occur in applications.) E X AMP L E 4
Backward Euler Method for Systems of ODEs. Stiff Systems Compare the backward Euler method (8) with the Euler and the RK methods for numerically solving the initial value problem
y" + 11/ + lOy
= lOx
+ II.
)"'(0) =
"(0) = 2.
-10
converted to a system of first-order ODEs,
Solution.
The given problem "an easily be '<JIved, obtaining y = e- x
+ e- lOx + x
so that we can compute errors, Conversion to a system by setting Y = Yl.)"' = Y2 [see (4)] gives
,
Yl
)"1(0) =
=)"2
y~ = -IOYI - 11.'"2
+
lOx
+ II
2
)"2(0) = -10,
The coefficient matrix -A
has the characteristic determinant
whose value is A2 + IIA + lO = (A The backward Euler formula is
Yn+l
=
.'"I,n+lJ [
=
."2.n+l
+
I)(A
I-10
+ 10), Hence the eigenvalues are
-I and -lO as claimed above.
_"I'''J [ "2,n+l J [ Y2.11 + Ii -10)"I,n+1 - IlY2,n+l + lOXn+l + II -
Reordering terms gives the linear system in the unknowns )"I,n+l and )"2.n 11 11)'2,n+l
)"1.,,+1 -
IOh)'1.,,+1
+
(1
+ 1 Ih)Y2_11+ 1
=
)"I,n
= )'2;n
+ 1011!-,,, +
h) -t
1111.
The coefficient determinant is D = I + IIIz + IOh 2 , and Cramer's rule (in Sec 7_6) gives the solution 1I1i)Yl.n
+
hY2,n
+
10112xn
+
11112
-lOhY1,,,
+
Y2,n
+
lOlnn
+
1111
0 + Yn+l = D
[
+
3 1011 ] .
+ 1011 2
CHAP. 21
908
Numerics for ODEs and PDEs
Table 21.14
Backward Euler Method (BEM) for Example 4. Comparison with Euler and RK
BEM h = 0.2
x
h
BEM = 0.4
0.0 0.2
2.00000 1.36667
2.00000
0.4 0.6 0.8 1.0 1.2 1.4
1.20556 1.21574 1.29460 1.40599 1.53627 1.67954 L.83272 1.99386 2.16152
1.31429
l" 1.8 2.0
Euler = 0.1
Euler h = 0.2
RK
RK
h = 0.2
Iz = 0.3
2.00000 1.01000 1.56100 1.13144 1.23047 1.34868 1.48243 1.62877 1.78530 1.95009 2.12158
2.00000 0.00000 2.04000 0.11200 2.20960 0.32768 2.46214 0.60972 2.76777 0.93422 3.10737
2.00000 1.35207 1.18144 1.18585 1.26168 1.37200 1.50257 1.64706 1.80205 1.96535 2.13536
2.00000
h
1.35020 1.57243 1.86191 2.18625
Exact 2.00000 L.l5407 L.08864 1.15129 1.24966 1.36792 1.50120 1.64660 1.80190 1.96530 2.13534
3.03947
5.07561}
8.72329
Table 21.14 shows the following. Stability of the backward Euler method for h accuracy for increasing Ii. ~
Stability of the Euler method for Ii Stability of RK for h
=
~
0.2 and 0.4 (and in fact for any Ii; try Ii = 5.0) with decreasing
0.1 but instability for h
~
0.2.
0.2 but instability for II = 0.3.
Figure 451 show. the Euler method for Ii ~ 0.18, an interesting case with initial jumping (for about x < 3) but later monotone following the solution curve of y = Y1' See also CAS Experiment 21.
•
y
..
4.0
-..-
3.0
~
.-
2.0
1.0
0
Fig. 451.
-===== -.•. -.. -. -.... •
_
-
~
4. Y~
Y2, Y1(0)
=
x
0.18 in Example 4
=
I, )'2(0)
=
Yb Y; = -J'2, .VI(O)
=
2, Y2(0)
=
2, h
= 0.1,
10 steps 5. y"
2. y~ = -3)"1 + )"2' y~ = Y1 - 3)"2' )"1(0) = 2, )"2(0) = 0, h = 0.1,5 steps
=
4
3
5 steps
EULER FOR SYSTEMS AND SECOND-ORDER ODES Solve by the Euler method:
Y;
2
........
L~
)"1,
•
Euler method with h
1. Verify the calculations in Example I.
3. y~ =
I \I \I
. . .-..--
.. _ - . · . . . . . . . A -~ __ .....
_
" "" "I
=
-1, h = 0.2,
+
4y =
0, yeO) = 1, y'(O) = 0, h
0.2,
1, y' (0)
0.1,
5 steps 6. y" - )' 5 steps
x, nO)
-2, h
SEC. 21.4
909
Methods for Elliptic PDEs
7. y~
= -.\"1 + Y2' Y; = -.\"1 - Y2, ."1(0) = .\"2(0) = 4, h = 0.1, 10 steps
o.
17. ,," -
i,
~ -141
RK FOR SYSTEMS
Solve by the classical RK: 9. The system in Prob. 7. How much smaller is the error? 10. The ODE in Prob. 6. By what factor did the error decrease? 11. Undamped Pendulum. Y" + siny = 0, yt77) = O. y' (77) = 1. h = 0.2, 5 steps. How doe~ your result fit into Fig. 92 in Sec. 4.5?
12. Bessel Function 10 , xy" + y' + xy = 0, y(l) = 0.765198,/ (I) = -0.-1-40051, h = 0.5,5 steps. (This gives the standard solution 10(x) in Fig. 107 in Sec. 5.5.) 13•
.r ~
=
-4Y1
+ )'2'
y~ =
.\'1 -
4)'2'
YI(O) = 0,
Y2(0) = 2, h = 0.1, 5 steps
14. The system in Prob. 2. Ho'W much smaller is the error? 15. Verify the calculations for the Airy equation in Example 3.
fi.-I91
RUNGE-KUTTA-NYSTROM METHOD
= -
3, ,,' (0) 6x 2 -+ 3.)
(x
2
I -
=
!
0,
In 2.
20. CAS EXPERIMENT. Comparison of Methods. (a) Write program~ for RKN and RK for systems. (b) Try them out for second-order ODEs of your choice to find out empirically which is better in specific cases. (c) In using RKN, would it pay to first eliminate y' (see Prob. 29 in Problem Set 5.5)? Find out experimentally. 21. CAS
EXPERIMENT. Backward Euler Stiffness. Extend Example 4 as follows.
and
(a) Verify the values in Table 21.14 and show them graphically as in Fig. 451. (b) Compute and graph Euler values for h near the "critical" h = 0.18 to determine more exactly when instability starts.
(c) Compute and graph RK values for values of h between 0.2 and 0.3 to find h for which the RK approximation begins to increase away from the exact solution. (d) Compute and graph backward Euler values for large h: confirm stability and investigate the error increase for growing h.
Do by RKN:
16. Prob. 12 (Bessel function Jo). Compare the results.
21.4
n" + 4,' = 0, ,'(0) 0.2: 5 steps '(Exact: y '= x4
- x)y" - xy' + y = 0, y(!) = y'(!) = I - In 2, h = 0.1. 4 steps 19. Prob. II. Compare the results.
18.
8. Verify the formulas and calculations for the Airy equation in Example 2.
=
Methods for Elliptic PDEs The remaining sections of this chapter are devoted to numerics for PDEs (partial differential equations), particularly for the Laplace, Poisson, heat, and wave equations. These POEs are basic in applications and, at the same time, are model cases of elliptic, parabolic, and hyperbolic PDEs, respectively. The definitions are as follows. (recall also Sec. 12.4). A POE is called quasilinear if it is linear in the highest derivatives. Hence a secondorder quasilinear PDE in two independent variables x, y is of the form
(I) 1I is an unknown function of x and y (a solution sought). F is a given function of the indicated variables. Depending on the discriminant ac - b 2 , the PDE (I) is said to be of
elliptic type
if
ac - b 2 > 0
(example: Lap/ace e£juation)
parabolic type
if
ae - b 2 = 0
(example: beat equation)
hyperbolic type
if
lie - b 2 < 0
(example: wal'e equation).
910
CHAP. 21
Numerics for ODEs and PDEs
Here. in the heat and wave equations, y is time t. The coefficients G, b, c may be functions of x, y. so that the type of (I) may be different in different regions of the xy-plane. This classification is not merely a formal matter but is of great practical importance because the general behavior of solutions differs from type to type and so do the additional conditions (boundary and initial conditions) that must be taken into account. Applications involving elliptic equatio1ls usually lead to boundary value problems in a region R. called a first boundary value problem or Dirichlet problem if u is prescribed on the boundary curve C of R, a second bOllndary I'llllle problem or Neumann problem if lln = au/all (normal derivative of lI) is prescribed on C. and a third or mixed problem if II is prescribed on a part of C and Un on the remaining part. C usually is a closed curve (or sometimes consists of two or more such curves).
Difference Equations for the Laplace and Poisson Equations In this section we consider the Laplace equation
(2) and the Poisson equation
(3) These are the most important elliptic PDEs in applications. To obtain methods of numeric solution. we replace the pmtial derivatives by conesponding difference quotients, as follows. By the Taylor formula,
(4) (b)
We subtract (4b) from (4a), neglect terms in h3 , 114, ..• , and solve for (5a)
ux(x, y) =
1 21z [u(x
+ ", y)
-
llx.
Then
h. y)].
lI(x -
Similarly, u(x. y
+ k)
= u(x. y)
+
/...uy(x, y)
+ ~k2Uyy(X, y) + .. -
and u(x, y - k)
=
lI(X. y) -
kuy{.r. y)
+ ~k2Uyy(x, y) + - ...
By subtracting, neglecting tenns in k 3 , k4, ... , and solving for u y we obtain ~
(5b)
I lIy(x,
y) =
2k [u(x, Y
+ k)
- u(x, Y - k)].
SEC. 21.4
911
Methods for Elliptic PDEs
We now turn to second derivatives. Adding (4a) and (4b) and neglecting terms in h 5 , . • . , we obtain u(x + h, y) + u(x - h, y) = 2u(x, y) + h 2 ux .",(x, y). Solving for u xx ' we have
h4,
(6a)
I
l/xx(x, Y) =
fl
uyy(x. y) =
k2
+
[u(x
h, y) - 2l/(x, y)
+
+ k) -
+ u(x. y
It(x -
/z, y)].
Similarly, (6b)
1
[u(x. y
2u(x. y)
-
k)].
We shall not need (see Prob. 1) 1 + h. \' '4hk'
+ k)
UXy(x. v) = - - [u(x
(6c)
.
- u(x - h. v
.
- u(x
+
+
k)
h, y - k)
+
u(x - h, Y - k)].
Figure 452a shows the points (x + h, y), (x - h, y), ... in (5) and (6). We now substitute (6a) and (6b) into the Poisson equation (3), choosing k a simple formula: (7)
u(x
+ h, y) + It(x, Y + h) + u(x -
h, y)
+ u(x, Y -
=
=
h) - 4lt(x, y)
h to obtain
h 2 f(x, y).
This is a difference equation corresponding to (3). Hence for the Laplace equation (2) the corresponding difference equation is (8)
u(x
+ h, y) +
u(x, y
+ h) +
u(x - h, y)
+
u(x, y - h) - 4u(x, y) = O.
h is called the mesh size. Equation (8) relates u at (x. y) to u at the four neighboring points shown in Fig. 452b. It has a remarkable interpretation: u at (x, y) equals the mean of the values of u at the four neighboring points. This is an analog of the mean value property of harmonic functions (Sec. 18.6). Those neighbors are often called E (East), N (North), W (West), S (South). Then Fig. 452b becomes Fig. 452c and (7) is (7*)
u(E)
+
u(N)
+
N
X
x
X
kl
hi
h
(x-h,y) x~x (x+h,y) (x,y)
k
u(S) - 4u(x, y) = h 2 f(.'(, y).
(x,y+hJ (x,y+k)
h
+
u(W)
I
h
(x-h,y)
x--
r" .
hi h
--x
(x,y)
(x+h,y)
h
x
(x,y-k)
(aJ Points in (5) and (6)
Fig. 452.
W X __ h_
-6 _h_X (x,Y)
h X
x
(x,y-h)
s
(b) POints In (7) and (8)
(e) Notation in (7*)
Points and notation in (5)-(8) and (7*)
E
912
CHAP. 21
Numerics for ODEs and PDEs
nn
Our approximation of h 2 ',pu in (7) and is a 5-point approximation with the coefficient scheme or stencil (also called pattern. lIloleeule. or star)
(9)
r
-4
I}' We
may
now write (7) a, {I
-4
I}
u
~ h'!,<,
y).
Dirichlet Problem In numerics for the Dirichlet problem in a region R we choose an h and introduce a sljuare grid of horizontal and vertical straight lines of distance h. Their intersections are called mesh points (or lattice poillts or nodes). See Fig. 453. Then we approximate the given PDE by a difference equation [(8) for the Laplace equation], which relates the unknown values of u at the mesh points in R to each other and to the given boundary values (details on p. 913). This gives a linear system of algebraic equations. By solving it we get approximations of the unknown values of u at the mesh points in R. We shall see that the number of equations equals the number of unknowns. Now come" an important point. If the number of internal mesh points. call it p, is small, say, p < 100, then a direct solution method may be applied to that linear system of p < 100 equations in p unknowns. However, if p is large, a storage problem will arise. Now since each unknown u is related to only 4 of its neighbors, the coefficient matrix of the system is a sparse matrix, that is. a matrix with relatively few nonzero entries (for instance, 500 of 10000 when p = 100). Hence for large p we may avoid storage difficulties by using an iteration method. notably the Gauss-Seidel method (Sec. 20.3), which in PDEs is also called Liebmann's method. Remember that in this method we have the storage convenience that we can overwrite any solution component (value of ll) as soon as a "new" value is available. Both cases, large p and small p, are of interest to the engineer, large p if a fine grid is used to achieve high accuracy, and small p if the boundary values are known only rather inaccurately, so that a coarse grid will do it because in this case it would be meaningless to try for great accuracy in the interior of the region R. We illustrate this approach with an example. keeping the number of equations small, for simplicity. As convenient llOllitiOIlS for mesh poillts lllld correspondillg Vlt/ues of the solution (and of approximate solutions) we use (see also Fig. 453) Pij
(10)
=
(ih. jh),
llij
= uUh, jll).
y
x
Fig. 453. Region in the xy-plane covered by a grid of mesh h, also showing mesh points Pll = (h, h), ... , Pij = (ih, jh), ...
SEC. 21.4
Methods for Elliptic PDEs
913
With this notation we can write (8) for any mesh point (11)
EXAMPLE 1
Ui+l,j
+
Ui,j+l
+ Ui-l,j +
Pi)
in the form
4uij
Ui,j-l -
O.
=
Laplace Equation. Liebmann's Method The four sides of a square plate of side 12 em made of homogeneous material are kept at constant temperature
ooe and IDOoe as shown in Fig. 454a. Using a (very wide) grid of mesh.j. cm and applying Liebmann's method (that is, Gauss-Seidel iteration), find the (steady-state) temperature at the mesh points.
Solution.
In the case of independence of time. the heat equation (see Sec. 10.8)
reduces to the Laplace equation. Hence our problem i, a Dirichlet problem for the latter. We ehoo,e the grid shown in Fig. 454b and consider the mesh points in the order Pn, P 21 . P 12 • P 22 . We use (11) and, in each equation. take 10 the right all the terms resulting from the given boundary values. Then we obtain the system =
+
u22
= -200
- 4"12 +
U22
= -100
(12) Un U21
-200
+
U12 -
4U22 =
-100.
In practice, one would solve such a small system by the Gauss elimination, finding "n = "21 = 87.5. "12 = U22 = 62.5. More exact values (exact to 3S) of the solution of the actual problem [as opposed to its model (12)] are 88.1 and 61.9. respectively. (These were obtained by using Fourier series.) Hence the error is about 1%. which is surprisingly accurate for a grid of such a large mesh size h. If the system of equations were large, one would solve it by an indirect method, such as Liebmann's method. For (12) this is as follows. We write (12) in the form (divide by -4 and take terms to the right) [/n =
0.25"21
+
1112 = 0.25"n 0.25u21
+
+ 50
0.25u12
+
0.25u22
+ 50
+
0.25u22
+ 25
+ 25.
0.25u12
These equations are now used for the Gauss-Seidel iteration. They are identical with (2) in Sec. 20.3, where un = Xl' U21 = X2, "12 = X3, U22 = X4. and the iteration is explained there, with 100, 100, 100, 100 chosen as starting values. Some work can be saved by better starting values, usually by taking the average of the bound:.u)' values that enter into the linear system. The exact solution of the system is Ull = "21 = 87.5. u12 = U22 = 62.5, as you may verify.
u=o
y
u=o
121------'"'1 P 22
P12
*-4
u = 100
--,
R
POl
Ipll
u = 100
I p21
-e - .
IPlO 'P I
20
U
(a) Given problem
=
100
(b) Grid and mesh points
Fig. 454.
Example 1
914
CHAP. 21
Numerics for ODEs and PDEs
Remark.
It is interesting to note that if we choose mesh h = LlII (L = side of R) and consider the (II - 1)2
internal
me~h
points (i.e .. mesh points not on the boundary) row by row in the order
then the system of equations has the
(11 -
X (II -
1)2
1)2
coefficient matrix
-4
B
-4
B
(13)
Here
A =
B=
-4
B
-4
B
is an (II - I) X (11 - 1) matrix. (In (12) we have 11 = 3. (11 - 1)2 = 4 internal mesh points. two submatrices B. and two submatrices I.) The matrix A is nonsingular. This follows by noting that the off-diagonal entries in each row of A have the sum 3 (or 2). wherea~ each diagonal entry of A equals -4. so that non~ingularity is implied by Gerschgorin's theorem in Sec. 20.7 because no Gerschgorin disk can include O. •
A matrix is called a band matrix if it has all its nonzero entries on the main diagonal and on sloping lines parallel to it (separated by sloping lines of zeros or not). For example. A in (13) is a band matrix. Although the Gauss elimination does not pre~erve zeros between bands. it does not introduce nonzero entries outside the limits defined by the original bands. Hence a band structure is advantageous. In (13) it has been achieved by carefully ordering the mesh points.
ADI Method A matrix is called a tridiagonal matrix if it has all its nonzero entries on the main diagonal and on the two sloping parallels immediately above or below the diagonal. (See also Sec. 20.9.) In this case the Gauss elimination is particularly simple. This raises the question of whether in the solution of the Dirichlet problem for the Laplace or Poisson equations one could obtain a system of equations whose coefficient matrix is tridiagonal. The answer is yes, and a popular method of that kind, called the ADI method (alternating direction implicit method) was developed by Peaceman and Rachford. The idea is as follows. The stencil in (9) shows that we could obtain a tridiagonal matrix if there were only the three points in a row (or only the three points in a column). This suggests that we write (II) in the form (l4a)
Ui-l,j -
4Uij
+
ui+I,j
=
-Ui,j-l -
lIi,j+I
so that the left side belongs to y-Row j only and the right side to x-Column i. Of course, we can also write (11) in the form (l4b)
lIi.j-1 -
4Uij
+
Ui.j+l
=
-Ui-I,j -
ui+l,j
so that the left side belongs to Column i and the right side to Row j. In the AD! method we proceed by iteration. At every mesh point we choose an arbitrary starting value u~T. In each step we compute new values at all mesh points. In one step we use an iteration
SEC 21.4
91S
Methods for Elliptic PDEs
formula resulting from (14a) and in the next step an iteration formula resulting from (14b), and so on in alternating order. In detail: suppose approximations lI~j) have been computed. Then, to obtain the next approximations lI~'JHl), we substitute the 1I~'J') on the right side of (l4a) and solve for the l/~j'H1) on the left side; that is, we use (lSa)
We use (lSa) for a fixed j, that is, for a fixed row j, and for all internal mesh points in this row. This gives a linear system of N algebraic equations (N = number of internal mesh points per row) in N unknowns, the new approximations of l/ at these mesh points. Note that (ISa) involves not only approximations computed in the previous step but also given boundary values. We solve the system (15a) (j fixed!) by Gauss elimination. Then we go to the next row, obtain another system of N equations and solve it by Gauss. and so on, until all rows are done. In the next step we alternate direction, that is, we compute the next approximations u~jn+2) column by column from the ugn + ll and the given boundary values, using a fonnula obtained from (l4b) by substituting the u~jn+]) on the right: (lSb)
For each fixed i, that is. for each colulIlll. this is a system of M equations (M = number of internal mesh points per column) in M unknowns. which we solve by Gauss elimination. Then we go to the next column. and so on, until all columns are Jone. Let us consider an example that merely serves to explain the entire method. E X AMP L E 2
Dirichlet Problem. ADI Method Explain the procedure and formulas of the AD! method in terms of the problem in Example I. using the same grid and starting values 100. 100. 100. 100.
Solution.
While working. we keep an eye on Fig. 454b on p. 913 and the given boundary values. We obtain 11\]1.. Iti~. lI\]d from (15a) with 111 = O. We w,ite boundary values contained in (15a) first approximations without an upper index. for better identification and to indicate that these given values remain the same during the iteration. From (15a) with 111 = a we have for j = I (first row) the system
"ttl..
(i~1)
(i
The solution is
ilill =
=
2)
u~li ~ 100. Fori = 2 (second row) we obtain fium (15a) the system (i = I) (i = 2)
The solution is /li~ ~
/I\]d
~ 66.667.
Secolld approximatiolls 1tW,. iI~1. Iti~. /I~i are now obtained from (l5b) with 111 = I by using the first approximations just computed and the boundary values. For i ~ I (first column) we obtain from (15b) the system (J = 1)
= -1101 -
II'll
(J = 2)
The solution is
/lfl. =
91.11.
,,cli =
(J =
I)
(J
2)
~
64.44. For i = 2 (second column) we obtain from (15b) the system
The solutIOn is /I~l = 91.11, /I~i = 64,44.
=
-1t'iV. - It:ll
Numerics for ODEs and PDEs
CHAP. 21
916
In this example. which merely serves to explain the practical procedure in the AD! method. the accuracy of the second approximations is about the same as that of two Gauss-Seidel steps in Sec. 20.3 (where IIU = xl' lt2l = .\"2' lt12 = X3' 1122 = X4)' as the following table shows.
-Method
lin
lI21
AD!. 2nd approximations Gauss-Seidel, 2nd approximations Exact solution of (12)
91.11 93.7S 87.S0
91.11 90.62 87.50
-
64.44 6S.62 62.50
64.44 64.06 62.50
•
Improving Convergence. Additional improvement of the convergence of the ADI method results from the following interesting idea. Introducing a parameter p, we can also write (11) in the form (a)
Ui-l,j -
(2
+ P)Uij + ui+l,j
=
-Ui,j-l
+
(2 -
(b)
lli,j-l -
(2
+ p)uij + ui.j+l
=
-Ui-l,j
+
(2 - P) lI ij
P)Uij -
Ui,j+l
(16) -
lIi+l,j'
This gives the more general ADI iteration formulas (a) (17)
(b)
u\m+2) _ 1,,)-1
(')
....
+ p)1l\1n+2) + 1,)
1l\,,!-+2) 1.,)+1
=
_u\m+D z-I,)
+ ("") L.
p)u(m+D 'lJ
u(m+~) 1,+1,) .
For P = 2. this is (IS). The parameter p may be used for improving convergence. Indeed, one can show that the ADI method converges for positive p. and that the optimum value for maximum rate of convergence is
Po
(18)
=
'iT
2 sin K
where K is the larger of M + I and N + 1 (see above). Even better results can be achieved by letting P vary from step to step. More details of the ADI method and variants are discussed in Ref. rE2S] listed in App. 1.
---.•.--.......-........ ..
-.-~
--
.. ..........
,,-~--.-.
1. Derive (Sb). (6b). and (6c).
12-71
GAUSS ELIMINATION, GAUSS-SEIDEL ITERATION
For the grid in Fig. 455 compute the potential at the four internal points by Gauss and by 5 Gauss-Seidel steps with starting value~ 100. 100, 100, 100 (showing the details of your work) if the boundary values on the edges are: 2.
3 II = 0 on the left. x on the lower edge, 27 - 9.\'2 on the righl. x 3 - 27x on the upper edge.
ue.
lIe I, 0) = 60. 0) = 300. II = lOO on the other three edges. 4. 1/ = X4 on the lower edge. 81 - 54.\'2 + )"4 on the right. x 4 - 54x 2 + 81 on the upper edge, y4 on the left. Verity the exact solution X4 - 6x 2y2 + y4 and determine Ihe elTor.
3.
5.
II
6.
U = no on the upper and lower edges, lID on the left and right.
= sin !7TX on the upper edge, 0 on the other edges. 10 steps.
SEC 21.5
Neumann and Mixed Problems. Irregular Boundary
7. Vo on the upper and lower edges, - Vo on the left and right. Sketch the equipotentIal lines.
917 use symmetry; take II = 0 as the boundary value at the two points at which the potential has a jump. u = 110 V
"~110V~"~11",
2 y
x
Fig. 455.
2
"~-110Vt.fj "~-110V
3
Problems 2-7 u=-110V
8. Verify the calculations in Example 1. Find out experimentally how many steps are needed to obtain the solution of the linear system with an accuracy of 3S. 9. tUse of symmetry) Conclude from the boundary values in Example I that U21 = Un and U22 = U12' Show that this leads to a system of two equations and solve it. 10. (3 X 3 grid) Solve Example I, choosing h = 3 and starting values 100, 100, .... 11. For the square 0 ~ x ~ 4, 0 ~ y ~ 4 let the boundary temperatures be O°C on the horizontal and 50°C on the vertical edges. Find the temperatures at the interior points of a square grid with h = I. 12. Using the answer to Prob. II, try to sketch some isotherms. 13. Find the isotherms for the square and grid in Prob. II if U = sin ~'iTx on the horizontal and -sin ~'iTY on the vertical edges. Try to sketch some isotherms. 14. (Intluence of starting values) Do Prob. 5 by Gauss-Seidel, starting from O. Compare and comment. 15. Find the potential in Fig. 456 using (a) the coarse grid, (b) the fine grid, and Gauss elimination. Hint. In (b),
21.5
Fig. 456.
Region and grids in Problem 15
16. (ADI) Apply the ADI method to the Dirichlet problem in Prob. 5, using the grid in Fig. 455, as before and starting values zero. 17. What Po in (18) should we choose for Prob. 16? Apply the ADI formulas (17) with Po = l. 7 to Prob. 16, performing I step. lllustrate the improved convergence by comparing with the corresponding values 0.077, 0.308 after the first step in Prob. 16. (Use the starting values zero.) 18. CAS PROJECT. Laplace Equation. la) Write a program for Gauss-Seidel with 16 equations in 16 unknowns, composing the matrix (13) from the indicated 4 X 4 submatrices and including a transfOimation ofthe vector ofthe boundary values into the vector b of Ax = b. (b) Apply the progranl to the square grid in 0 ~ x ~ 5. o ~ y ~ 5 with h = I and u = 220 on the upper and lower edges, U = 110 on the left edge and u = -10 on the right edge. Solve the linear system also by Gauss elimination. What accuracy is reached in the 20th Gauss-Seidel step?
Neumann and Mixed Problems. Irregular Boundary We continue our discussion of boundary value problems for elliptic PDEs in a region R in the xy-plane. The Dirichlet problem was studied in the last section. In solving Neumann and mixed problems (defined in the last section) we are confronted with a new situation, because there are boundary points at which the (outer) normal derivative Un = au/an of the solution is given, but u itself is unknown since it is not given. To handle such points we need a new idea. This idea is the same for Neumann and mixed problems. Hence we may explain it in connection with one of these two types of problems. We shall do so and consider a typical example as follows.
918
E X AMP L E 1
CHAP. 21
Numerics for ODEs and PDEs
Mixed Boundary Value Problem for a Poisson Equation Solve the mixed boundary value problem for the Poisson equation
shown in Fig.
~57a.
1.5
x
u=o (a)
Regio~
(b) Grid (h = 0.5)
R and boundary values
Fig. 457.
Mixed boundary value problem in Example 1
Solution.
We u~e the grid shown in Fig. 457b. where" = 0.5. We recall that (7) in Sec. 21A has the right side f (x. y) = 0.5 2 • 12xy = 3xy. From the formulas II = 3,·3 and I/ n = 6x given on the boundary we compute the bounda.ry data
,,2
(1)
PH
1/31 = 0.375. and
P21
ilU12 U32 =
3.
ely
= 6·0.5 = 3,
=
6·1 = 6.
are internal mesh points and can be handled as in the last section. Indeed. from (7), Sec. 21.4. with
{,2 = 0.25 and ,,2f (X. y) = 3xy and from the given boundary values we obtain two equations corresponding to PH and P 21 • as follows (with -0 resulting from the left boundary). 12(O.5·0.5)·! - 0 = 0.75 (2a)
+ 1122
=
12(1 • 0.5)·! - 0.375
=
1.125
The only difficulty with these equations seems to be that they involve the unknown values U12 and 1/22 of 1/ at P 12 and P22 on the boundary. where the normal derivative I/ n = ol//iln = flu/o)" is given. instead of 1/: but we shall overcome this difficulty as follows. We consider P12 and P 22 . The idea that will help us here is thb. We imagine the region R to be extended above to the first row of external mesh points (corresponding to y = 1.5), and we assume that the Poisson equation also holds in the extended region. Then we can write down two more equations as before (Fig. 457b) =
1.5 - 0 = 1.5
(2b)
+ "23
=3
- 3
= O.
On the right. 1.5 is 12xy,,2 at (0.5. I) and 3 is 12x.'"{,2 at (I. 1) and 0 (at P02 ) and 3 (at P32 ) are given boundary values. We remember that we have not yet used the boundary condition on the upper part of the boundary of R, and we also notice that in (2b) we have introduced two more unknowns 1/13' 1/23' But we can now use that comlilion and get rid of "13' 1/23 by applying the central difference formula for dl/ Idy. From (I) we then obtain (see Fig. 457b) 3=
6=
flll12
oy 01/22
oy
1/13 - 1/11 = 1113 - lIll. 21z 1/23 2h
hence
1/13
hence
1123
=
1111 + 3
u21
= 1I23 - lI21'
=
"21 + 6.
SEC. 21.5
Neumann and Mixed Problems. Irregular Boundary
919
Substituting these results into (2b) and simplifying, we have 2"11
2"21
41112
+
+
1112 -
1122 4U22
= 1.5 - 3 = -1.5 = 3 - 3 - 6 = -6.
Together with (2a) this yields, written in matrix form,
(3)
r
-~ -4 0 ~] r::::] r~:::5] r ~:::5]. =
2
0
o
2
1
1112
-4
1122
-4
1.5 - 3
-1.5
~0-6
-6
(The entnes 2 come from U13 and l/23, and so do -3 and -6 on the right). The solution of (3) (obtained by Gauss elimination) is as follows; the exact values of the problem are given in parentheses. U12
= 0.R66
un = 0.077
= I.RI2
(exact I)
U22
lexact 0.125)
"21 =
0.191
(exact 2) (exact 0.25).
•
Irregular Boundary We continue our discussion of boundary value problems for elliptic PDEs in a region R in the xy-plane. If R has a simple geometric shape, we can usually arrange for certain mesh points to lie on the boundary C of R, and then we can approximate partial derivatives as explained in the last section. However, if C intersects the grid at points that are not mesh points, then at points close to the boundary we must proceed differently, as follows. The mesh point 0 in Fig. 458 is of that kind. For 0 and its neighbors A and P we obtain from Taylor'S theorem (a)
UA
=
+
uh
lIo -
h
Uo
(4)
(b)
lip
=
auo ax auo ax
+ +
1 -
2 -
I
2
a2uo
(uhf -'-2-
ax
h2
2 a uo 2 ax
+ ...
+
We disregard the terms marked by dots and eliminate alia lax. Equation (4b) times a plus equation (4a) gives
p
Q
Fig. 458.
c
Curved boundary C of a region R, a mesh point 0 near C, and neighbors A, B, P, Q
920
CHAP. 21
Numerics for ODEs and PDEs
We solve this last equation algebraically for the derivative, obtaining ;PU o
ax 2
2 [I =
+ a)
a(l
h2
I
+~
UA
I
Up -
-;; U o
lIQ -
I -b
]
Similarly, by considering the points O. B. and Q. a2Uo 2
iJy
2 h
=
-2
[1+ b(\
b)
+ - -I I +b
LIB
Uo
]
.
By addition, (5)
y 2 U o = -22
[UA
h
For example, if a
a (l
=
+
a)
+
+
liB
b (1
+
Up
1+a
b)
+ ~ _ ta + b)uo ] I +b ab
.
!, b = !, instead of the stencil (see Sec. 21.4) we now have
because 1/[a(1 + a)l = ~, etc. The sum of all five terms still being zero (which is useful for checking). Using the same ideas, you may show that in the case of Fig. 459. 2 (6) V U o
2 1z2
= -
[UA + a(a
+ p)
UB b(b
+
q)
+
Up
pep
+
a)
+
ap + bq abpq
uQ
--~-
q(q
+
b)
]
Uo
,
a formula that takes care of all conceivable cases. B
bh 0
ph
ah
P~--""---~A qh Q
Neighboring points A. B. P, Q of a mesh point 0 and notations in formula (6)
Fig. 459.
E X AMP L E 2
Dirichlet Problem for the Laplace Equation. Curved Boundary Find the potential II in the region in Fig. 460 that has the boundary values given in that figure: here the curved portion of the boundary is an arc of the circle of radius 10 about (0, 0). Use the grid in the figure. 3 II is a solution of the Laplace equation. From the given formulas for the boundary values II = x , 512 - 24y2 . ... we compute the values at [he points where we need them: the result is shown in [he figure. For Pll and P 12 we have the usual regular' stencil. and for P21 and P 22 we use (6), obtaining
Solution.
1/ =
0.5 (7)
-2.5
0.5
0+
0.9
-3 0.6
0+
SEC. 21.5
Neumann and Mixed Problems. Irregular Boundary
921
y
6
u=512-24/
u=o u=o
3
u = 296
u= 216
u = 27
°O~----~3~~--~6--~8~~x 3
U =X
Region, boundary values of the potential, and grid in Example 2
Fig. 460.
We use this and the boundary obtain the system
value~
and take the mesh points in the
o-
- 4u n + 0.6ll n
+
- 2. 5u21
0.6U21
0.5U22 =
+ 0.61112 -
u~ual
order P lI • P 21 , P 12 • P 22 . Then we
27
=
-27
-0.9' 296 - 0.5 . 216 = -374.4
+0
U22
=
702
31122
=
0.9 . 352
702
+ 0.9 . 936 = 1159.2.
Tn matrix form,
-2.5 (8)
o
0
-4
0.6
0.6
:.5] I
r:: :] r-~~:.4].
-3
1112
702
1122
1159.2
Gauss elimination yields the (ruundedl value, Uu = -55.6,
U21 =
49.2,
ll12 =
-298.5,
ll22 =
-436.3.
Clearly, from a grid with so few mesh points we cannot expect great accuracy. The exact solution of the PDE (not of the difference equation) having the given boundary values is u = x 3 - 3xy2 and yields the values Ull =
-54,
!l21 =
54,
U12 =
-297,
U22 =
-432.
Tn praclice one would use a much finer grid and solve the resulting large system by an indirect method. •
1. Verify the calculation for the Poisson equation in Example 1. Check the values for (3) at the end. 2. Delive (5) in particular when a = h =
!.
3. Deri ve the general stencil formula (6) in all detail. 4. Verify the calculation for the boundary value problem in Example 2. 5. Do Example I in the text for "\7 2 u = 0 with grid and boundary data as before.
MIXED BOUNDARY VALUE PROBLEMS 6. Solve the mixed boundary value problem for the Laplace equation "\7 2 u = 0 in the rectangle in Fig. 457a (using the grid in Fig. 457b) and the boundary conditions u~ = 0 on the left edge, U x = 3 on the right edge, u = x 2 on the lower edge, and u = x 2 - 1 on the upper edge. 7. Solve Prob. 6 when un = 1 on the upper edge and u = 1 on the other edges.
922
CHAP. 21
Numerics for ODEs and PDEs
8. Solve the mixed boundary value problem for the Poisson equation ,2u = 2(x 2 + y2) in the region and for the boundary conditions shown in Fig. 461, using the indicated grid.
y
/u=o 3
u
2
0
Problems 8 and 10
9. CAS EXPERIMENT. Mixed Problem. Do Example 1 in the text with finer and finer grids of your choice and study the accuracy of the approximate values by comparing with the exact solution u = 2.\"y3. Verify the latter. 10. Solve , 211 = -1T2y sin! 1T.\' for the grid in Fig. 461 and lIy(l, 3) = lI y (2, 3) = !vW, II = 0 on the other three sides of the square.
1/
0
Fig. 462.
11--<>--=--
u=o
P21
3
x
Problem 11
12. If in Prob. II the axes are grounded (II = 0), what constant potential must the other portion of the boundary have in order to produce 100 volts at Pu? 13. What potential do we have in Prob. II if L/ = 190 volts on the axes and L/ = 0 on the other portion of the boundary? 14. Solve the Poisson equation y 2L/ = 2 in the region and for the boundary values shown in Fig. 463, using the grid also shown in the figure. y
3
u=/-3y----.......
1.5
n-<>----I~ u = /
IRREGULAR BOUNDARIES
11. Solve the Laplace equation in the region and for the boundary values shown in Fig. 462. using the indicated grid. (The sloping portion of the boundary is y = 4.5 - x.)
21.6
2
u =3x
u=o---"
Fig. 461.
- 1.5x
~u=9-3y
P ll 2 I----{>--='::.....,:>---='--~
2
P12
u=o---
y
=X
- 1.5y
u=o Fig. 463.
Problem 14
Methods for Parabolic PDEs The last two sections concerned elliptic POEs. and we now tum to parabolic POEs. Recall that the definitions of elliptic, parabolic, and hyperbolic POEs were given in Sec. 21.4. There it was also mentioned that the general behavior of solutions differs from type to type, and so do the problems of practical interest. This reflects on numerics as follows. For all three types, one replaces the POE by a corresponding difference equation, but for parabolic and hyperbolic POEs this does not automatically guarantee the convergence of the approximate solution to the exact solution as the mesh h ~ 0; in fact, it does not even guarantee convergence at all. For these two types of POEs one needs additional conditions (inequalities) to assure convergence and stability, the latter meaning that ~mall perturbations in the initial data (or small errors at any time) cause only small changes at later times. In this section we explain the numeric solution of the prototype of parabolic PDEs, the one-dimensional heat equation (c constant).
SEC. 21.6
923
Methods for Parabolic POEs
This POE is usually considered for x in some fixed interval, say, 0 :;;; x :;;; L, and time t ~ 0, and one prescribes the initial temperature u(x, 0) = f(x) (f given) and boundary conditions at x = 0 and x = L for all t ~ 0, for instance 1/(0. t) = 0, I/(L, t) = O. We may assume c = I and L = I; this can always be accomplished by a linear transformation of x and t (Prob. I). Then the heat equation and those conditions are O:;;;x:;;;Lt~O
(1)
(2)
II(X,
(3)
u(O. t)
=
0)
=
(Initial condition)
f(x)
u(], t)
=0
(Boundary conditions).
A simple finite difference approximation of (1) is [see (6a) in Sec. 21.4;j is the number of the time step]
(4)
]
k
1
(Ui,j+l -
Uij)
= /7 2
(Ui-l,j -
2Uij
+ Ui-l,j)'
Figure 464 shows a corresponding grid and mesh points. The mesh size is h in the x-direction and k in the t-direction. Formula (4) involves the four points shown in Fig. 465. On the left in (4) we have used ajonmrd difference quotient since we have no information for negative t at the start. From (4) we calculate ui.j+l' which corresponds to time row j + 1, in terms of the three other II that correspond to time row j. Solving (4) for lIi,j+I' we have
(5)
lIi,j+1 =
(1 -
+
2r)uij
r(ui+l,j
+
Ui-l,j),
u = 0 --....,..,I---o-----<J---o----l (j = 2) ~u=O
I---o-----<J---o----l(j=l)
k h u
Fig. 464.
!
x
= {(x)
Grid and mesh points corresponding to (4), (5)
(i,j + 1)
X
/k ( i - l . j ) X - - - h - - X - - - h - - X (i + l,j) (i,j)
Fig. 465.
The four points in (4) and (5)
r=
924
CHAP. 21
Numerics for ODEs and PDEs
Computations by this explicit method based on (S) are simple. However, it can be shown that crucial to the convergence of this method is the condition
(6)
r=
That is. Uij should have a positive coefficient in (S) or (for r =~) be absent from (S). Intuitively, (6) means that we should not move too fast in the (-direction. An example is given below.
Crank-Nicolson Method Condition (6) is a h<mdicap in practice. Indeed, to attain sufficient accuracy, we have to choose h small, which makes k very small by (6). For example, if h = 0.1, then k ::::; O.OOS. Accordingly, we should look for a more satisfactory discretization of the heat equation. A method that imposes no restriction on r = klh 2 is the Crank-Nicolson method, which uses values of u at the six points in Fig. 466. The idea of the method is the replacement of the difference quotient on the right side of (4) by ~ times the sum of two such difference quotients at two time rows (see Fig. 466). Instead of (4) we then have I
k
1 (Ui,j+l -
Uij) =
2h2 (Ui+l,j
(7)
+
-
2uij
I 2h2 (Ui+l,j+l -
2Ui,j+1
+
Ui-l,j+l)'
Multiplying by 2k and wliting r = klh 2 as before, we collect the terms corresponding to time row j + I on the left and the terms corresponding to time row j on the right: (8)
(2
+ 2r)Ui,j+1
-
rtUi+I,j+l
+
lIi-l,j+l)
= (2 -
2r)uij
+
r(ui+l,j
+ Ui-l,j)'
How do we use (8)? In general, the three values on the left are unknown, whereas the three values on the right are known. If we divide the x-interval 0 ::::; x ::::; I in (I) into II equal intervals, we have II 1 internal mesh points per time row (see Fig. 464, where II = 4). Then for j = 0 and i = I,···. II I. formula (8) gives a linear system of II - I equations for the II - I unknown values Ull, 11 2 1, ••• , U n -I,1 in the first time row in terms of the initial values lIoo, UlO, . • • , UnO and the boundatl' values Um (= 0). lInl (= 0). Similarly for j = l,.i = 2, and so on; that is, for each time row we have to solve such a linear system of II - I equations resulting from (8). Although r = klh 2 is no longer restricted, smaller r will still give better results. In practice, one chooses a k by which one can save a considerable amount of work, without making r too large. For instance, often a good choice is r = 1 (which would be impossible in the previous method). Then (8) becomes simply
(9)
411i ,j+l -
ui+l,j+l -
Ui-l,j+l
=
Time rowj + 1
X
X
Time rowj
X
X
ui+l,j
+ Ui-l,j' X
Ik h
h
Fig. 466. The six points in the CrankNicolson formulas (7) and (8)
X
SEC. 21.6
925
Methods for Parabolic PDEs
j=5
0.20 0.16
.
.. . ..
0.12
P12
j=3
P 22
r-
j=2 ----- ----I Ip d ----- - - ----- e - - j = 1 I IplO Ip40 P20 P30 j=O 0.2 0.4 0.6 0.8 1.0 ;=1 ;=3 i = 2 i =4 i= 5
0.08
---
IP
0.04 t= 0
x=O
u
Grid in Example 1
Fig. 467.
E X AMP L E 1
j=4
Temperature in a Metal Bar. Crank-Nicolson Method, Explicit Method Consider a laterally insulated metal bar of length I and such that c 2 = I in the heat equation. Suppose that the ends of the bar are kept at temperature" = O°C and the temperature in the bar at some instant -call it I = 0i, .f(x) = sin 1T.\' Applying the Crank-Nicolson method with h = 0.2 and r = I, find the temperature u(x. f) in the bar for 0 ~ t ~ 0.2. Compare the results with the exact solution. Also apply (5) with an r satisfying (6), say, r = 0.25. and with values nO! satisfying (6). say. ,. = I and,. = 2.5.
Solution by Crank-Nicolson.
Since,. = I. formula (8) takes the form (9). Since h = 0.2 and klh 2 = I. we have k = h 2 = 0.04. Hence we have to do 5 steps. Figure 467 shows the grid. We shall need the initial values ,. =
1110
=
sin 0.277
=
0.587785.
"20 = sin 0.41T = 0.951 057.
Also, "30 = "20 and "40 = lI1O' (Recall that lI10 means" at P10 in Fig. 467. etc.) In each time row in Fig. 467 there are 4 internal mesh points. Hence in each time step we would have to solve 4 equations in 4 unknowns. But since the initial temperature distribution is symmetric with respect to x = 0.5, and II = 0 al both ends for all t. we have lI31 = lI21, "41 = lIll in the first time row and similarly for the other rows. This reduces each system to 2 equations in 2 unknowns. By (9). since 1/31 = 1/21 and 1101 = O. for j = 0 these equations are (i = I)
=
1100 + "20 = 0.951 057
(i = 2)
The solution is lin = 0.399274.
lI21 =
(i = I) (i = 2)
The solution is (Fig. 468):
I
U12
t
x=O
0.00 0.04 0.08 0.12 0.16 0.20
0 0 0 0 0 0
0.646039. Similarly, for time row j = I we have the system 4"12 -£l12
+
1122 = "01 +
£l21 =
0.646039
+
U21 =
1.045313.
31/22 =
"11
0.271 221. 1/22 = 0.438844, and so on. This gives the temperature distribution
X
= u.2
0.588 0.399 0.271 0.184 0.125 0.085
x = U.4
x = U.6
x = u.8
0.951 0.646 0.439 0.298 0.202 0.138
0.951 0.646 0.439 0.298 0.202 0.138
0.588 0.399 0.271 0.184 0.125 0.085
x= 0 0 0 0 0 0
926
CHAP. 21
Numerics for ODEs and PDEs u(x. t)
1
Fig. 468.
Temperature distribution in the bar in Example 1
Comparison witll the exact solutio1l.
The present problem can be solved exactly by separating
variables (Sec. 12.5): the result is (10)
= 0.25.
For h = 0.2 and r = klh 2 = 0.25 we have k = rh = 0.25 . 0.04 = 0.0 I. Hence we have to perform 4 times as many step' as with the Crank-Nicolson method! Formula (5) with,. = 0.25 is
Solution by the explicit method (5) with r 2
(11)
We can again make use of the symmetry. For j "20 = "30 = 0.951 057 and compute
= 0 we need
"00
= O.
1110
= 0.5877!lS (see p. 925).
lin = 0.25(1100 + 21110 + "20) = 0.531 657 U21
= 0.25 (1110 + 2"20 +
Of cou[,e we can omit the boundary terms compute
1130)
O.
1101 =
1112 = 0.25(21111 1122
= 0.25 (1110 +
+
"02 =
31120)
=
O.R(iO 239.
0.··· from the formulas. For j
I we
1121) = 0.480888
= 0.25(1111 + 31121) = 0.778094
and so on. We have to perform 20 steps in,tead of the S CN steps. but the numeric values ,how that the accuracy is only about the same as that of the Crank-Nicolson values CN. ll1e exact 3D-values follow from (10).
x
=
0.2
x = 0.4
t
0.04 0.08 0.12 0.16 0.20
CN
By (11)
Exac[
CN
By (11)
Exac[
0.399 0.271 0.184 0.125 0.085
0.393 0.263 0.176 0.118 0.079
0.396 0.267 0.180 0.121
0.646 0.439 0.298 0.202
0.637 0.426 0.285 0.191
0.641 0.432 0.291 0.196
0.082
0.138
0.128
0.132
Failure of (5) with r violati1lg (6).
Formula (5) with h = 0.2 and r = I-which violates (6)-is
SEC. 21.6
Methods for Parabolic PDEs and
give~
927
very poor values: ,orne of the,e are T =
0.04 0.12 0.20 FOlmula (5) with an even larger r these are
0.1 0.3
0.363 0.139 0.053
= (I -
2r)uoj
+
Exact
T
0.396 0.180 0.082
=
0.4
0.588 0.225 0.086
Exact 0.641 0.291 0.132
= 2.5 (and h = 0.2 as before) gives completely nonsensical results: some of x = 0.2
Exact
0.1)265 0.0001
0.2191 0.0304
1. (Nondimensional form) Show that the heat equation ut = c 2 uj'j', 0 ~ X ~ L, can be transformed to the "nondimensionar' standard fonn lit = l(ox. 0 ~ x ~ I. by setting x = .WL. t = c 271L2. II = uluo, where lto is any constant temperature. 2. Derive the difference approximation (4) of the heat equation. 3. Derive (5) from (4). 4. Using the explicit method [(5) with II = I and k = 0.5]. find the temperature at t = 2 in a laterally insulated bar of length 10 with ends kept at temperature 0 and initial temperature fIx) = x - 0.lx 2 • 5. Solve the heat problem (I )-(3) by Crank-Nicolson for 0 ~ t ~ 0.20 with II = 0.2 and k = 0.04 when fIx) = x if 0 ~ x < !. lex) = I - x if ! ~ x ~ I. Compare with the exact values for t = 0.20 obtained from the series (2 terms) in Sec. 12.5. 6. Solve Prob. 5 by the explicit method with h = 0.2 and k = 0.01. Do II steps. Compare the last values with the Crank-Nicolson 3S-values 0.107. 0.175 and the exact 3S-values 0.108, 0.175. 7. The accuracy of the explicit method depends on r (~!). Illustrate this for Prob. 6. choosing r = ~ (and h = 0.2 as before). Do 4 steps. Compare the values for t = 0.04 and 0.08 with the 3S-values in Prob. 6, which are 0.156. 0.254 (t = 0.04).0.105,0.170 (t = 0.08). 8. If the left end of a laterally insulated bar extending from x = 0 to x = I is insulated. the boundary condition at x = 0 is 11,,(0. t) = u.,.(O. t) = O. Show that in the application of the explicit method given by (5), we can compute IIO,j+ I by the fomlllla 1I0. j + I
0.2
2ruij'
Apply this with II = 0.2 and r = 0.25 to determine the temperature II(X. t) in a laterally insulated bar extending fi'om x = 0 to I if lI(X, 0) = O. the left end is insulated
9.
10.
11.
12.
x
=
0.4
0.0429 0.0001
Exact 0.3545 0.0492.
•
and the right end is kept at temperature get) = sin ¥'lTt. Hint. Use 0 = Buo/ilx = (IIIj - IL1.) I2h. In a laterally insulated bar of length I let the initial temperature be fIx) = x if 0 ~ x ~ 0.2, fIx) = O.lS( I - x) if 0.2 ~ x ~ I. Let u(O, t) = O. 11(1. t) = 0 for all t. Apply the explicit method with II = 0.2, k = 0.01. Do 5 steps. Solve Prob. 9 for f(x) = x if 0 ~ x ~ 0.5, I(x) = I - x if 0.5 ~ x ~ l, all the other data being as before. Can you expect the solution to satisfy lI(x. t) = 11(1 - x, t) for all t? Solve Prob. 9 by (9) with II = 0.2, 2 steps. Compare with exact values obtained from the series in Sec. 12.5 (2 tenns) with suitable coefficients. CAS EXPERIMENT. Comparison of Methods. (a) Write programs for the explicit and the Crank-Nicolson methods. (b) Apply the programs to the heat problem of a laterally insulated bar of length I with lI(x, 0) = sin 'lTX and u(O, t) = 11(1. t) = 0 for all t, using h = 0.2, k = 0.01 for the explicit method (20 steps), II = 0.2 and (9) for the Crank-Nicolson method (5 steps). Obtain exact 6D-values from a suitable series and compare. (e) Graph temperature curves in (b) in two figures similar to Fig. 296 in Sec. 12.6. (d) Expeliment with smaller h (0.1,0.05. etc.) for both methods to find out to what extent accuracy increases under systematic changes of II and k.
113-151
CRANK-NICOLSON
Solve (I )-(3) b) Crank-Nicolson with r = 1 (5 steps). where: 13. fIx) = X( I - x), lz = 0.2 14. f(x) = x( I - x), h = O. I (Compare with Prob. 13.) 15. f(x) = 5x if 0 ~ x < 0.2. f(x) = 1.25(1 - x) if 0.2 ~ x ~ 1, h = 0.2
928
21.7
CHAP. 21
Numerics for ODEs and PDEs
Method for Hyperbolic PDEs In thi~ section we consider the numeric solution of problems involving hyperbolic POEs. We explain a standard method in terms of a typical setting for the prototype of a hyperbolic POE. the wave equation:
=
o~ x
~ 1, 1 ~
0
(1)
U tt
(2)
u(x. 0)
=
j(x)
(Given initial displacement)
(3)
ut(x. 0)
=
g(x)
(Given initial velocity)
(4)
u(O, t)
=
U xx
u(l, t)
=0
(Boundary conditions).
Note that an equation U tt = c 2 u-r;x and another x-interval can be reduced to the form (I) by a linear transformation of x and 1. This is similar to Sec. 21.6, Prob. I. For instance, (1)-(4) is the model of a vibrating elastic string with fixed ~nds at x = 0 and x = I (see Sec. 12.2). Although an analytic solution of the problem is given in (13). Sec. 12.4, we lise the problem for explaining basic ideas of the numeric approach that are also relevant for more complicated hyperbolic PDEs. Replacing the derivatives by difference quotients as before, we obtain from (1) [see (6) in Sec. 21.4 with y = t]
(5)
I k2
(Ui,j+l -
2Uij
+
Ui,j-I)
=
I
h2
(Ui+l,j -
2Uij
+
Ui-I,j)
where h is the mesh size in x, and k is the mesh size in t. This difference equation relates 5 points as shown in Fig. 469a. It suggests a rectangular grid similar to the grids for parabolic equations in the preceding section. We choose r* = k 2flz2 = L Then Uij drops out and we have
(6)
(Fig. 469b).
It can be shown that for 0 < r* ~ 1 the present explicit method is stable, so that from (6) we may expect reasonable results for initial data that have no discontinuities. (For a hyperbolic POE the latter would propagate into the solution domain-a phenomenon that would be difficult to deal with on our present grid. For unconditionally stable implicit methods see [El] in App. L) Equation (6) still involves 3 time steps j - I,j,j + J, whereas the formulas in the parabolic case involved only 2 time steps. Furthermore, we now have 2 initial conditions.
x
Time rowj + 1
•
Time rowj
x-,-x
Ik
x--x--x h
Ik h
x (a) Formula (5)
Fig. 469.
Time rowj-l
I
X (bl Formula (6)
Mesh points used in (5) and (6)
SEC. 21.7
Method for Hyperbolic PDEs
929
So we ask how we get stal1ed and how we can use the initial condition (3) This can be done as follows. From lItex. 0) = g(x) we derive the difference formula I 2k
(7)
=
where &
(uil -
g(ih). For t
Into this we substitute and by simplification
Ui,-l)
hence
= gi.
= O. that is, j = 0, equation (6) is
Ui.-l
as given in (7). We obtain
Ua
=
Ui-l,O
+
Ui+l,O -
+
Uil
2kgi
(8) This expresses E X AMP L E 1
lta
in terms of the initial data. It is for the beginning only. Then use (6).
Vibrating String, Wave Equation Apply the present method "ith" = k = 0.1 to the problem (1)-(4). where i(x) = sin
g(x) = O.
77X.
Solutio1l. The grid is the same as in Fig. 467, Sec. 21.6, except for the values of t, which now are 0.2. 004, ... (instead of 0.04. 0.08, ... ). The initial values From (8) and gtx) = 0 we have
liDO, lIlO' . . •
From this we compute. using
0.587785.1120 = lI30 =
and
ltOI =
0.951 057,
1111 =
~(1I00
+
1120) =
~. 0.951 057 = 0.475528
2)
"21 =
~(1I1O
+
"30) =
~. 1.538841 = 0.769421
"31 = 1121,1141 = 1111
using
sin 0.277 =
(i = I) (i =
and
lI10 = 1140 =
are the same as in Example 1, Sec. 21.6.
1102 = . . . =
by symmetry as in Sec. 21.6, Example I. From (6) withj = I we now compute, 0,
(i = I)
U12 = 1101
+
lI21 -
£lID =
0.76') 421 - 0.587785
(i = 2)
U22 = "11
+
£131 -
1I20 =
0.475528
+ 0.769421 - 0.951057
=
U.UH 636
=
0.293 !l92,
by symmetry; and so on. We thus obtain the following values of the dIsplacement over the first half-cycle:
"32 = 1I22, "42 = lI12
II(X. t)
of the
~tring
x=O 0.0 0.2 0.4 0.6 0.8 1.0
0 0 0 0 0 0
x
= 0.2
x = 0.4
X
0.588 0.476 0.182 -0.182 -0.476 -0.588
0.951 0.769 0.294 -0.294 -0.769 -0.951
0.951
0.588
0
0.769 0.294 -0.294 -0.769 -0.951
0.476 0.182 -0.182 -0.476 -0.588
0 0 0 0 0
X
= 0.6
=
0.8
x=
CHAP. 21
930
The~e
Numerics for ODEs and POEs
values are exact to 3D (3 decimals), the exact solution of the problem being (see Sec. 12.3) II(X,
I) = sin
TlX
cos
TIl.
The reason for the exactness follows from d'Alemberfs solution (4), Sec. 12.4. (See Prob. -I, below.)
•
This is the end of Chap. 21 on numerics for ODEs and POEs, a rapidly developing field of basic applications and interesting research. in which large-scale and complicated practical problems can now be attacked and solved by the computer.
,@
VIBRATING STRING
Solve (I )-(4) by the present method with h = k = 0.2 for the given initial deflection fix) and initial velocity 0 on the given (-interval. ~ ( ~ 2 I - x). 0 ~ ( ~ I
O.OlxO - x), 0
1. f(x)
=
2. f(x) 3. f(x)
= x2(
if 0 0.2 < x ~ I = x
~
x
~
0.2. f(x)
0.25(1 - x) if
4. Show that from d' Alemberfs solution (13) in Sec. 11.4 with c = I it follows that (6) in the present section gives the exact value 1I;,i+l = lI(ih, (j + l)ft). 5. If the string governed by the wave equation (I) starts from its equilibrium position with initial velocity g(x) = sin 'In. what is its displacement at time ( = 0.4 and x = 0.2. 0.4. 0.6. 0.8? (Use the present method with h = 0.2. k = 0.2. Use (8). Compare with the exact values obtained from (I2) in Sec. 12.4.)
................. ... ....... _. __ .... -.. _ .... ......... __"a.-.-.~
1. Explain the Euler and Improved Euler methods in geometrical terms. 2. What are the local and global orders of a method? Give examples. 3. What do you know about error estimates? Why are they important? 4. How did we obtain numeric mcthods by using the Taylor series'? 5. In each Runge-Kutta step we computed auxiliary values. How many? Why? 6. What are one-step and multistep methods? Give examples. 7. What is the idea of a predictor--corrector method? Mention some of these methods.
6. Compute approximate values in Prob. 5. using a finer grid (ft = 0.1. k = 0.1), and notice the increase in accuracy.
7. Illustrate the starting procedure when both f and g are not identically zero. say, f(x) = I - cos 2'lTx, g(X) = x - ,\'2. Choose ft = k = 0.1 and do 2 time steps.
8. Show that (I2) in Sec. 12.4 gives as another starting formula lin
I = ~
(Ui+l.O
+
I
lIi-1.0)
+ ?
-
-
I
k
Xi
+
g(s) ds
Xi- k
(where one can evaluate the integral numerically if necessary). In what case is this identical with (8)? 0.1, 9. Compute u in Prob. 7 for r = 0.1 and x 0.2, .. " 0.9, using the formula in Prob. 8. and compare the values. 10. Solve (I)-(3) (h = k = 0.2, 5 time steps) subject to f(x) = .\.2. g(x) = 2x. 11,.(0. t) = 2t. lIe I. t) = (l + 1)2.
STIONS AND PROBLEMS 10. What is automatic step size control? How is it done in practice? 11. Why and how did we use finite differences in this chapter,? 12. Make a list of types of PDEs. corresponding problems, and methods for their numeric solution. 13. How did we approximate the Laplace equation? The Poisson equation? 14. Will a difference equation give exact solutions of a PDE? 15. How did we handle (a) irregularly shaped domains. (b) given normal derivatives at the boundary? 16. Solve y' = 2x.\', yeO) = I, by the Euler method with ft = 0.1. 10 steps. Compute the en·or.
B- What is the idea of the Rungc-Kutta-Fehlberg method?
17. Solve y' = 1 + y2. yCO) = O. by the improved Euler method with h = 0.1,5 steps. Compute the error.
9. How can Runge-Kutta be generalized to systems of ODEs?
18. Solve y' = (x + y h = 0.2, 7 steps.
4)2, yeO)
= 4. by RK with
931
Chapter 21 Review Questions and Problems 19. Solve Prob. 17 by RK with II = 0.1, 5 steps. Compute the error. Compare with Prob. 17. 20. (Fair comparison) Solve y' = 2x- i Vy - In x + X-I, y(l) = 0 for I ~ x ~ 1.8 (a) by the Euler method with h = 0.1, (b) by the improved Euler method with h = 0.2. (c) by RK with h = 0.4. Verify that the exact solution is y = (In X)2 + In x. Compute and compare the errors. Why is the comparison fair? 21. Compute eX for x = 0, 0.1, .... 1.0 by applying RK to y' = r. yeO) = I. h = 0.1. Show that the result is 5D-exact.
I~O-321
POTENTIAL
Find the potential in Fig. 471, using the given grid and the boundary values: 30. u = 70 on the upper and left sides, U = 0 on the lower and right sides 31. u{PlO) = u(P 30 ) = 960. u(P20 ) = -480, U = 0 elsewhere on the boundary 32. u(P Ol ) u(P oa ) u{p 41) = u(P 43) = 200, u(P lO ) = u(P 30 ) 400, u(P 20 ) = 1600, u(P 02 ) = U(P 42 ) U(P 34 ) = 0
U(P I4 )
=
lI(P 24 ) =
22. Solve y' = (x + y)2, yeO) = 0 by RK with h = 0.2. 5 steps. 23. Show thaI by applying the method in Sec. 21.2 to a polynomial of first degree we obtain the multistep predictor and corrector formulas
)'~~+l
= Yn
+
Yn.1 = )'n
+
h (3fn -
2
fn-I)
Fig. 471.
where.f~ 11
=
f(Xn+l~ Y:+1)'
24. Apply the multistep method in Prob. 23 to the initial value problem y' = x + y. yeO) = 0, choosing h = 0.2 and doing 5 steps. Compare with the exact values. 25. Solve y' = (y - x - 1)2 + 2. y(O) = 1 for 0 ~ x ~ I by Adam~-Moulton with h = 0.1 <md starting values I. 1.200334589. 1.402709878. 1.609336039. 26. Solve y" + Y = O. y(O) = O. y' (0) = I by RKN with h = 0.2, 5 steps. Find the elTOr. 27. Solve y~ = -4."1 + 3.\'2' y~ = 5)'1 - 6Y2' )'1(0) = 3. ."2(0) = -5. by RK for systems. h = O.L 5 steps.
28. Solve y~ = -5.\"1 + 3)'2')'~ = -3.\"1 - 5.\"2' .\"1(0) = 2. Y2(0) = 2 by RK for systems. h = 0.1. 5 steps. 29. Find rough approximate values of the electrostatic potential at Pu. P 12 , P 13 in Fig. 470 that lie in a field between conducting plates (in Fig. 470 appearing as sides of a rectangle) kept at potentials 0 and I IO volts as shown. (Use the indicated grid.)
YI 4
u = 110 V
~_"""I"",1_. 'P13
2
u=o
u=o-
u=o Fig. 470.
Problem 29
Problems 30-32
33. Verify (13) in Sec. 21.4 for the sy~tem (12) and show that A in (\ 2) is nonsingular. 34. Derive the difference approximation of the heat equation. 35. Solve the heat equation (l). Sec. 21.6. for the initial condition J(x) = x if 0 ~ x ~ 0.2, J(x) = 0.25(1 - x) if 0.2 < x ~ I and boundary condition (3). Sec. 21.6. by the explicit method [formula (5) in Sec. 21.6] with h = 0.2 and k = 0.0 I so that you get values of the temperature at time t = 0.05 as the answer. 36. A laterally insulated homogeneous bar with ends at x = 0 and x = I has irlitial temperature O. Its left end is kept at O. whereas the temperature at the right end varies sinusoidally according to u(t, I) = get) = sin ~1T1.
Find the temperature u(x. t) in the bar [solution of (1) in Sec. 21.6] by the explicit method with h = 0.2 and r = 0.5 (one period, that is, 0 ~ t ~ 0.24). 37. Find lI(X, 0.12) and lI(X, 0.24) in Prob. 36 if the left end of the bar is kept at -g(t) (instead of 0). all the other data being as before. 38. Find out how the results of Prob. 36 can be used for obtaining the results in Prob. 37. Use the values 0.054, 0.172, 0.325, 0.406 (r = 0.12, x = 0.2,0.4.0.6, 0.8) and -0.009, -0.086. -0.252. -0.353 (r = 0.24) from the answer to Prob. 36 to check your answer to Prob. 37. 39. Solve lit = lIxx (0 ~ X ~ I. r ~ 0), 2 lI(X. 0) = x ( I - x), u(O. t) = lI( I, r) = 0 by Crank-Nicolson with h = 0.2. k = 0.04. 5 time steps. 40. Find the solution of the vibrating string problem lit{ = lIxx• U(x. 0) = x(l - x), lit = 0, u(O, t) = lI( I. t) = 0 by the method in Sec. 21.7 with h = 0.1 and k = 0.1 for t = 0.3.
932
CHAP. 21
Numerics for ODEs and PDEs
Numerics for ODEs and PDEs fn this chapter we discussed numerics for ODEs (Secs. 21.1-21.3) and PDEs (Secs. 21.4-21.7). Methods for initial value problems
v' = f(x, y),
(I)
Y(Xo)
=
Yo
involving a first-order ODE are obtained by truncating the Taylor series y(x
+ 11)
= y(x)
+ hr
,
(x)
/7 2
+2
/I
y (x)
+ ...
where, by (I), y' = f, y" = f' = aflilx + (ilflay»)"'. etc. Truncating after the term Izy', we get the Euler method, in which we compute step by step
(2)
J'n+ 1
= Yn + hf(x", Yn)
(11
= 0, I, ... ).
Taking one more term into account, we obtain the improved Euler method. Both methods show the basic idea but are too inaccurate in most cases. Truncating after the term in /7 4 , we get the important classical Runge-Kutta (RK) method of fourth order. The crucial idea in this method is the replacement of the cumbersome evaluation of derivatives by the evaluation of f(x. y) at suitable points (x, y); thus in each step we first compute four auxiliary quantities (Sec. 21.1) kl
= hf(xno Yn)
k2
= hf(xn + 41z. )'n + 4k1 )
k3
= hf(xn + 4h. Yn + 4k2)
(3a)
and then the new value (3b)
Error and step size control are possible by step halving or by RKF (Runge-Kutta-Fehlberg). The methods in Sec. 21.1 are one-step methods since they get )"n+l from the result Yn of a single step. A multistep method (Sec. 21.2) uses the values of Ym Yn-l' ... of severa) steps for computing Yn+l. Integrating cubic interpolation polynomials gives the Adams-Bashforth predictor (Sec. 21.2) (4a)
= Yn
+
I
24 h(55fn - 59fn-l
+
37fn-2 - 9fn-3)
933
Summary of Chapter 21
where (4b)
h
f(Xj, }J)' and an Adams-Moulton corrector (the actual new value)
Yn+I = Yn
+
1 24 h(9f~+1
+
19fn - 5fn-I
+
fn-2)'
where f~+1 = f(Xn+h ."~+I)· Here, to get started, .1'1, ."2' Y3 must be computed by the Runge-Kutta method or by some other accurate method. Section 19.3 concerned the extension of Euler and RK methods to systems y' = f(x, y),
j
thus
= 1, ... , 111.
This includes single mth order ODEs, which are reduced to systems. Second-order equations can also be solved by RKN (Runge-Kutta-Nystrom) methods. These are particularly advantageous for .v" = f(x • .1') with f not containing y'. Numeric methods for PDEs are obtained by replacing pattial derivatives by difference quotients. This leads to approximating difference equations, for the Laplace equation to (Sec. 21.4)
(5) for the heat equation to
(6)
1
k
I
(Ui,j+I - Uij)
= h2
(Ui+I,j - 2Uij
+ Ui-I,j)
(Sec. 21.6)
and for the wave equation to (7)
(Sec. 21.7);
here hand k are the mesh sizes of a grid in the x- and y-directions, respectively. where in (6) and (7) the variable .1' is time t. These PDEs are elliptic, parabolic, and h}perbolic, respectively. Con'esponding numeric methods differ. for the following reason. For elliptic PDEs we have boundary value problems, and we discussed for them the Gauss-Seidel method (also known as Liebmann's method) and the ADI method (Sees. 21.4, 21.5). For parabolic PDEs we are given one initial condition and boundary conditions, and we discussed an nplicit method and the Crank-Nicolson method (Sec. 21.6). For hyperbolic PDEs, the problems are similar but we are given a second initial condition (Sec. 21.7).
..
~\'f
. J
PA RT
.&
,
...
F
,,
-
",
'1
- --
I
-
IIii
•
)
-
-'
•
~
Optimization, Graphs
C HAP T E R 22
Unconstrained Optimization. Linear Programming
C HAP T E R 23
Graphs. Combinatorial Optimization Ideas of optimization and application of graphs play an increasing role in engineering. computer science, systems theory. economics. and Olher areas. In the first chapter of this part we explain some basic concepts. methods. and results in unconstrained and constrained optimization. The second chapter is devoted to graphs and the corresponding so-called combinatorial optimization, a relatively new interesting area of ongoing applied and theoretical research.
935
~\-:' ...:
CHAPTER , . Ii.
22
-.IIt:!!!
, .:i!fi3!!iI .... - _. . .
Jill"' 1
Unconstrained Optimization. Linear Programming Optimization principles are of basic importance in modern engineering design and systems operation in various areas. The recent development has been influenced by computers capable of solving large-scale problems and by the creation of conesponding new optimization techniques. so that the entire field has become a large area of its own. In the present chapter we give an introduction to the more impOltant concepts, methods, and results on unconstrained optimization (the so-called gradient method) and constrained optimization (linear programming). Prerequisite: a modest working knowledge of linear systems of equations References and Answers to Problems: App. 1 Part F, App. 2.
22.1
Basic Concepts. Unconstrained Optimization In an optimization problem the objective is to opti11li~e (l11axil1li~e or 1I1ini111i~e) some function f. This function f is called the objective function. For example, an objective function f to be maximized may be the revenue in a production of TV sets, the yield per minute in a chemical process, the mileage per gallon of a certain type of car, the hourly number of customers served in a bank, the hardness of steel, or the tensile strength of a rope. Similarly, we may want to minimize f if f is the cost per unit of producing certain cameras, the operating cost of some power plant, the daily loss of heat in a heating system. the idling time of some lathe. or the time needed to produce a fender. In most optimization problems the objective function f depends on several variables
These are called control variables because we can "control" them, that is. choose their values. For example, the yield of a chemical process may depend on pressure Xl and temperature x 2 • The efficiency of a cel1ain air-conditioning system may depend on temperature XI' air pressure X2' moisture content X3, cross-sectional area of outlet X4. and so on. Optimization theory develops methods for optimal choices of XI • ••• , x'" which maximize (or minimize) (he objective function f, that is, methods for finding optimal values of Xl • • • • , X n . 936
SEC. 22.1
937
Basic Concepts. Unconstrained Optimization
In many problems the choice of values of Xl, . . . , Xn is not entirely free but is subject to some constraints, that is. additional restrictions arising from the nature of the problem and the variables. For example, if Xl is production cost, then Xl ~ 0, and there are many other variables (time. weight, distance traveled by a salesman, etc.) that can take nonnegative values only. Constraints can also have the form of equations (instead of inequalities). We first consider unconstrained optimization in the case of a function I(x}> ... , X,J. We also wlite x = (Xl, . . . , xn) and I<x), for convenience. By definition,
I has a minimum at a point x = Xo in a region
R (where
I is defined)
if I(x)
for all x in R. Similarly,
~
I(X o)
I has a maximum at Xo in R if
for all x in R. Minima and maxima together are called extrema. Furthermore, I is said to have a local minimum at Xo if
for all x in a neighborhood of Xo, say, for all x satisfying
where Xo = (Xl, ... , Xn) and r > 0 is sufficiently small. Similarly, I has a local maximum at Xo if I(x) ~ I(X o ) for all x satisfying Ix - xol < r. If I is differentiable and has an extremum at a point Xo in the interior of a region R (that is, not on the boundary), then the partial derivatives ai/axI' ... , aflax., must be zero at Xo. These are the components of a vector that is called the gradient of I and denoted by grad lor VI. (For 11 = 3 this agrees with Sec. 9.7.) Thus (1)
A point Xo at which (1) holds is called a stationary point of I. Condition (1) is necessary for an extremum of I at Xo in the interior of R, but is not sufficient. Indeed, if 11 = 1. then for -" = Ilx), condition ll) is y' = !' (Xo) = 0; and, for instance, )' = x 3 satisfies y' = 3x2 = 0 at X = Xo = 0 where I has no extremum but a point of inflection. Similarly, for I(x) = XIX2 we have VI(O) = 0, and I does not have an extremum but has a saddle point at O. Hence after solving (I), one must still find out whether one has obtained an extremum. In the case 11 = I the conditions y' (Xo) = 0, -,,"(Xo) > 0 guarantee a local minimum at Xo and the conditions y' (xo) = 0, -,,"(Xo) < 0 a local maximum, as is known from calculus. For 11 > 1 there exist similar criteria. However. in practice even solving (1) will often be difficult. For this reason, one generally prefers solution by iteration, that is, by a search process that starts at some point and moves stepwise to points at which I is smaller (if a minimum of I is wanted) or larger (in the case of a maximum).
938
CHAP. 22
Unconstrained Optimization. Linear Programming
The method of steepest descent or gradient method is of this type. We present it here in its standard form. (For refinements see Ref. [E25] listed in App. 1.) The idea of this method is to find a minimum of f(x) by repeatedly computing minima of a function get) of a single vruiable t, as follows. Suppose that .f has a minimum at Xo and we strut at a point x. Then we look for a minimum of f closest to x along the straight line in the direction of - vf(x). which is the direction of steepest descent (= direction of maximum decrease) of fat x. That is, we determine the value of t and the corresponding point
(2)
z(t)
=
x -
tV f(x)
at which the function g(t)
(3)
= f(z(t»
has a minimum. We take this z(t) as our next approximation E X AMP L E 1
to
Xo'
Method of Steepest Descent Determinc a minimum of (4)
starting ti'01TI Xo
= (6. 3) = 6i + 3j and applying the method of steepest descent.
Solutioll.
Clearly. inspection shows that j(x) has a minimum at tI. Knowing the solution gives u~ a better feel of how the method worh. We obtain,fIx) = 2Xl i + 6x2j and from this z(t) = x - t'f(x) = (\ - 2tlx1 i g(t) =
f(zv»
+ (I - 6rlx2 j
2 2 2 = (1 - 2t) x]2 + 3(1 - 6t) X2 .
We now calculate the derivative
set g' (t) = O. and solve for t. finding
Starting from Xo = 6i + 3j, we compute the value~ in Table 22.1. which are shown in Fig. 472. Figure 472 suggests that in the case of slimmer ellipses C'a long narrow valley"). convergence would be poor. You may confirm this by replacing the coefficient 3 in (4) with a large coefficieIlt. For more sophisticated descent and other methods. some of them also applicable to vector functions of vector variables. we refer to the references lIsted in Part F of App. 1: see also [E25j. •
x2
Fig. 472.
Method of steepest descent in Example 1
Linear Programming
SEC. 22.2
939
Table 22.1
Method of Steepest Descent, Computations in Example 1 X
II
0
6.000 3.484 1.327 0.771 0.294 0.170 0.065
I
2 3 4 5 6
3.000 -0.774 0.664 -0.171 0.147 -0.038 0.032
0.210 0.310 0.210 0.310 0.210 0.310
STEEPEST DESCENT Do 3 steepest descent steps when: 3. f(x) = 3X I 2 + 2X22 - 12xl S. f(X) = Xo =
6.
+ 16x2, Xo = [I l]T + 2x/ - Xl - 6x2 , Xo = [0 OJT 0.5X12 + 0.7X22 - Xl + 4.2x2 + I,
X1
[-I
f(x) = .\"0 = [2
2
I]T
x/ +
0.lx22
+
8Xl
+
X2
+ 22.5.
-I]T
7. f(x) = 0.2x/ + x/ - 0.08Xl' Xo = [4 4]T 8. fIx) = X1 2 - X22. Xo = [2 I ]T. 5 steps. First guess. Then compute. Sketch your path.
22.2
0.581 0.381 0.581 0.38] 0.581 0.381
-0.258 -0.S57 -0.258 -0.857 -0.258 -0.857
10. fIx) = Xl 2 - X2' Xo = [I l]T. Sketch your path. Predict the outcome of further steps. 11. .f(x) = aXI + bX2' any "0. First guess. then compute.
,3-111
=
I - 6t
9. f(x) = X1 2 + cx/. Xo = [c I ]T. Show that 2 steps give [c llT times a factor. -4c2 /(c 2 - 1)2. What can you conclude from this about the speed of convergence?
1. What happens if you apply the method of steepesl descent to fIx) = X1 2 + X2 2 ? 2. Verify that in Example I. successive gradients are orthogonal. What is the reason?
4. f(x)
I - 2t
12. CAS EXPERIMENT. Steepest Descent. (a) Write a program for the method. (b) Apply your program to f(x) = X1 2 + 4.\"22. experimenting with respect to speed of convergence depending on the choice of Xo . (c) Apply your program to fIx) = .\"12 + X24 and to fIx) = X14 + X24, Xo = [2 I]T. Graph level curves and your path of descent.
Linear Programming Linear programming or linear optimization consists of methods for solving optimization problems with constrai1Zts, that is, methods for finding a maximum (or a minimum) x = [Xl' ••• , xnl of a li1Zear objective function
satisfying the constraints. The latter are linear inequalities, such as 3x1 + 4X2 :;; 36, or 0, etc. (examples below). Problems of this kind arise frequently, almost daily, for instance, in production, inventory management, bond trading, operation of power plants. routing delivery vehicles, airplane scheduling, and so on. Progress in computer technology has made it possible to solve programming problems involving hundreds or thousands or more variables. Let us explain the setting of a linear programming problem and the idea of a "geometric" solution, so that we shall see what is going on. Xl ~
940
EXAMPLE 1
CHAP. 22
Unconstrained Optimization. Linear Programming
Production Plan Energy Savers. Inc .. produces heaters of types Sand L. The wholesale price is $40 per heater for Sand $88 for L. Two time constraints result from the use of two machines M1 and M 2. On M1 one needs 2 min for an S heater and g min for an L heater. On M2 one needs S min for an S heater and 2 min for an L heater. Determine production figures Xl and X2 for Sand L. respectively (number of heaters produced per hour) so that the hourly revenue ~ =
fer)
= 40X1
+ 88x2
is maximum.
Solution.
Production figures Xl and X2 must be nonnegative. Hence the objective function (to be maximized) and the four constraints are (0)
(1)
2\~1
+ 8x2
~
60 min time on machine
(2)
SX1
+ 2x2
~
60 min time on machine M2
~
(3)
Ml
0
(4)
Figure 473 shows (0)--(4) as follows.
Con~tancy
lines Z = COllst
are marked (0). These are lines of constant reHnue. Their slope is -40/88 = - SIll. To increase ~ we must move the line upward (parallel to itself), as the arrow shows. Equation (I) with the equality sign is marked (1) It intersects the coordinate axes at Xl = 6012 = 30 (set x2 = 0) and x2 = 60/8 = 7.S (set Xl = 0). The arrow marks the side on which the points (Xl. x2) lie that satisfy the inequality in (I). Similarly for Eqs. (2)-(4). The blue quadrangle thus obtained is called the feasibilit) region. It is the set of all feasible solutions. meaning solutions that satisfy all four constraints. The figure also lists the revenue at O. A. B. C. The optimal solution is obtained by moving the line of constant revenue up as much as possible without leaving the feasibility region completely. Obviously. this optimum is reached when that line pas~es through B. the intersection (10. S) of (I) and (2). We see that the optimal revenue ';:max = 40· 10
is obtained by producing twice as many S
heater~
+ 88· S = $840
•
as L heaters.
x2
0: z=O A: z = 40·12 = 480 B: z = 40 . 10 + 88 . 5 = 840 C: z=88·7.5=660
20 (3)
(4)
"
o
"
10
A
20
""""(0) ~""O
"",~O
~
""'(O)~ "'"" ""80
Fig. 473.
Linear programming in Example 1
SEC 22.2
941
Linear Programming
Note well that the problem in Example I or similar optimization problems call1lot be solved by setting certain partial derivatives equal to zero. because crucial to such problems is the region in which the control variables are allowed to vary. Furthermore, our "geometric" or graphic method illustrated in Example I is confined to two variables Xl, x2' However, most practical problems involve much more than two variables, so that we need other methods of solution.
Normal Form of a Linear Programming Problem To prepare for general solution methods. we show that constraints can be written more uniformly. Let us explain the idea in terms of (1),
This inequality implies 60 -
2XI -
8x2
;;:;
0 (and conversely), that is, the quantity
is nonnegative. Hence, our original inequality can now be written as an equation
where
is a nonnegative auxiliary variable introduced for converting inequalities to equations. Such a variable is called a slack variable, because it "takes up the slack" or difference between the two sides of the inequality.
X3
E X AMP L E 2
Conversion of Inequalities by the Use of Slack Variables With the help of two slack variables following form. Maximize
\-3' .\"4
we can write the linear programming problem in Example I in the
subject to the constraints
1, ... , 4).
(i =
We now have /I = 4 variables and III = 2 (linearly independent) equations. so that two of the four variables. for example. Xl' x2. determine the others. Also note that each of the four sides of the quadrangle in Fig. 473 now has an equation of the form Xi = 0:
OA: x2 = O. AB:
x4 =
BC:
X3
0,
= 0,
CO: Xl = O. A vertex of the quadrangle is the intersection of two sides. Hence at a vertex. n - 1/1 = 4 - 2 variables are /ero and the others are nonnegallve. Thus at A we have x2 = 0, \"4 = 0, and so on.
=
2 of the •
942
CHAP. 22
Unconstrained Optimization. Linear Programming
Our example suggests that a general linear optimization problem can be brought to the following normal form. Maxill1i~e (5)
subject to tlte cOllStrail1ts
(6)
Xi
~
0
(i
=
I, ... ,
11)
with all bj nonnegative. (If a bj < O. multiply the equation by -1.) Here Xl' .... Xn include the slack variables (for which the c/s in f are zero). We assume that the equations in (6) are linearly independent. Then, if we choose values for 11 - m of the variables. the system uniquely detelmines the others. Of course. since we must have Xl ~
0, ... , Xu
:;;
0,
this choice is not entirely free. Our problem also includes the minimization of an objective function f since this corresponds to maximizing - f and thus needs no separate consideration. An n-tuple (Xl' ••• ,xn ) that satisfies all the constraints in (6) is called afeasible poillt or feasible solution. A feasible solution is called an optimal solution iffor it the objective function .f becomes maximum. compared with the values of f at all feasible solutions. Finally, by a basic feasible solution we mean a feasible solution for which at least 11 - III of the variables Xl' ••• , X" are zero. For instance, in Example 2 we have 11 = 4, III = 2. and the basic feasible solutions are the four vertices 0, A. B. C in Fig. 473. Here B is an optimal solution (the only one in this example). The following theorem is fundamental.
THEOREM 1
,--------------------------------------Optimal Solution
Some optimal solutioll of a lil1ear progra/1lmil1g problem (5). (6) is also a basic feasible .I'nlutimz of (5), (6).
J
For a proof, see Ref. [F5], Chap. 3 (listed in App. I). A problem can have many optimal solutions and not all of them may be basic feasible solutions: but the theorem guarantees that we can find an optimal solution by searching through the basic feasible solutions only. This is a great simplification; but since there are (
Il) (Il)
1/ - I l l
III
different ways
of equating 11 - III of the 11 variables to zero, considering all these possibilities. dropping those which are not feasible and then searching through the rest would still involve very much work. even when 11 and /1l are relatively small. Hence a systematic search is needed. We shall explain an important method of this type in the next section.
SEC 1l.2
943
Linear Programming
1. What is the meaning of the slack variables X3' X4 in Example 2 in terms of the problem in Example I? 2. Can we always expect a unique solution (as is the case in Example L)?
3. Could we find a profit f(XI' X2) = QIXI + Q2X2 whose maximum is at an interior point of the quadrangle in Fig. 473? (Give a reason for your answer.)
4. Why are slack variables always nonnegative? How many of them do we need?
REGIONS AND CONSTRAINTS
15-101
Describe and graph the region in the first quadrant of the x l x2-plane determined by the inequalities:
5. Xl +
2X2 ~
IO
X2 ~
0
X2 ::;;
2
Xl -
+
7. 2.0XI
5.0XI +
8. 2XI 4XI Xl
9.
6.0X2 ~
18.0
2.5x2 ~
20.0
X2 ::;;
6
+
5X2 ~
40
-
2X2 ::;;
-3
-
Xl
+
X2 ::;;
3
Xl
+
X2 ~
9
10.
+
X2 ~
X2 ::;;
2
+
5X2 ::;;
15
X2 ::;;
-2
2X2 ~
LO
3x I
2x) -
-Xl +
111-151
MAXIMIZATION AND MINIMIZATION
Maximize the given objective function given constraints. 11.
f
=
-lOx) +
2X2,
16. (Maximum output) Giant Ladders, Inc., wants to maximize its daily total output of large step ladders by producing Xl of them by a process PI and X2 by a process P 2 , where PI requires 2 hours of labor and 4 machine hours per ladder, and P2 requires 3 hours of labor and 2 machine hours. For this kind of work, 1200 hours of labor and 1600 hours on the machines are at most available per day. Find the optimal Xl and X2'
18. (Minimum cost) Hardbrick, Inc., has two kilns. Kiln I can produce 3000 grey bricks, 2000 red bricks, and 300 glazed bricks daily. For Kiln II the corresponding figures are 2000, 5000, and 1500. Daily operating costs of Kilns I and II are $400 and $600. respectively. Find the number of days of operation of each kiln so that the operation cost in filling an order of 18000 grey, 34000 red, and 9000 glazed blicks is minimized.
3
Xl +
15. Minimize f in Prob. 11.
17. (Maximum profit) Universal Electric, Inc .. manufactures and sells two models of lamps, LI and L 2 , the profit being $150 and $100, respectively. The process involves two workers WI and W2 who are available for this kind of work 100 and 80 hours per month, respectiveLy. WI assembles LI in 20 min and L2 in 30 min. W 2 paints LI in 20 min and L2 in 10 min. Assuming that all lamps made can be sold without difficulty. determine production figures that maximize the profit.
-Xl +'\2 ::;; -3 -Xl
14. Minimize fin Prob. 13.
Xl::;; 0,
f subject to the X2::;; 0,
19. (Maximum profit) United MetaL Inc .. produces alloys B) (special brass) and B2 (yellow tombac). BI contains 50% copper and 50% zinc. (Ordinary brass contains about 65% copper and 35% zinc.) B2 contains 75% copper and 25% zinc. Net profits are $120 per ton of Bl and $100 per ton of B 2 . The daily copper supply is 45 tons. The daily zinc supply is 30 tons. Maximize the net profit of the daily production. 20. (Nutrition) Foods A and B have 600 and 500 calories, contain L5 g and 30 g of protein, and cost $1.80 and $2.10 per unit, respectively. Find the minimum cost diet of at least 3900 calories containing at least 150 g of protein.
944
22.3
CHAP. 22
Unconstrained Optimization. Linear Programming
Simplex Method From the last section we recall the following. A linear optimization problem (linear proglamming problem) can be written in normal form; that is:
Maximize (1)
subject to the cOllstraillts
(2)
Xi
~
0
(i
= I. .... 11).
For finding an optimal solution of this problem. we need to consider only the basic feasible solutions (defined in Sec. 22.2), but there are still so many that we have to follow a systematic search procedure. In 1948 G. B. Dantzig published an iterative method, called the simplex method, for that purpose. In this method, one proceeds stepwise from one basic feasible solution to another in such a way that the objective function f always increases its value. Let us explain this method in terms of the example in the last section. In its original form the problem concerned the maximization of the objective function z
subject to
=
40x 1
+
88x2
2'\'1
+
&-2 ~ 60
X2 ~
O.
Converting the first two inequalities to equations by introducing two slack variables X3, we obtained the normal form ofthe problem in Example 2. Together with the objective function (written as an equation;:: - 40X1 - 8!h2 = 0) this normal fonn is
X4'
= (3)
0
= 60
where Xl ~ 0, ... , .1.'4 ~ O. This is a linear system of equations. To find an optimal solution of it, we may consider its augmented matrix (see Sec. 7.3) b
(4)
SEC. 22.3
945
Simplex Method
This matrix is called a simplex tableau or simplex table (the illitial simplex table). These are standard names. The dashed lines and the letters
-
b
~,
are for ease in further manipulation. Every simplex table contains two kinds of variables .r} By basic variables we mean those whose columns have only one nonzero entry. Thus X3' X4 in (4) are basic variables and Xl' -'"2 are nonbasic variables. Every simplex table gives a basic feasible solution. It is obtained by setting the nonbasic variables to zero. Thus (4) gives the basic feasible solution .\"3
= 60/1 = 60,
X4
=
z=o
60/1 = 60,
with .\"3 obtained from the second row and .\"4 from the third. The optimal solution (its location and value) is now obtained stepwise by pivoting, designed to take us to basic feasible solutions with higher and higher values of::: until the maximulll of z is reached. Here, the choice of the pivot equation and pivot are quite different from that in the Gauss elimination. The reason is that Xl> X2, .\"3, X4 are restricted to nonnegative values.
Step 1. Operatioll 0 1 : Selection of the Column of the Pivot Select as the colunrn of the pivot the first column with a negative entry in Row 1. In (4) this is Column 2 (because of the -40). Operatioll O 2 : Selection of the Row of the Pivot. Divide the right sides [60 and 60 in (4)] by the corresponding entries of the column just selected (6012 = 30. 60/5 = 12). Take as the pivot equation the equation that gives the smallest quotient. Thus the pivot is 5 because 60/5 is smallest. Operatioll 0 3 : Elimination by Row Operations. pivot (as in Gauss-Jordan, Sec. 7.8).
This gives zeros above and below the
With the notation for row operations as introduced in Sec. 7.3, the calculations in Step give from the simplex table To in (4) the following simplex table (augmented matrix). with the blue letters referring to the previous table.
z
(5)
[
b
-~--t--~--~z~.;-t-~---~~:-t-j!~-l I
o
:
I
5
Row I
= 60/5 = 12,
8 Row 3
Row 2 - 0.4 Row 3
I
2: 0
:
60
We ,ee that basic variables are now Xh X3 and nonbasic variables are latter to zero, we obtain the basic feasible solution given by T 1, Xl
+
X3
= 36/1 = 36,
-1."2, .\"4.
Setting the
::: =
480.
This is A in Fig. 473 (Sec. 22.2). We thus have moved from 0: (0, 0) wIth::: = 0 to A: (12, 0) with the greater::: = 480. The reason for this increase is our elimination of a term (-40.1."1) with a negative coefficient. Hence elimillation is applied only to Ilegative e1ltries in Row I but to no others. This motivates the selection of the COIUIIlIl of the pivot
946
CHAP. 22
Unconstrained Optimization. Linear Programming
We now motivate the selection of the row of the pivot. Had we taken the second row of To instead (thus 2 as the pivot), we would have obtained z = 1200 (verify!), but this line of constant revenue ~ = 1200 lies entirely outside the feasibility region in Fig. 473. This motivates our cautious choice of the entry 5 as our pivot because it gave the smallest quotient (60/5 = 12).
Step 2. The basic feasible solution given by (5) is not yet optimal because of the negative entry -72 in Row 1. Accordingly, we perform the operations 0 1 to 0 3 again, choosing a pivot in the column of -72. Operatioll 0 1 , Select Column 3 of T 1 in (5) as the column of the pivot (because -72 < 0). Operation 02' We have 3617.2 = 5 and 60/2 = 30. Select 7.2 as the pivot (because 5 < 30). Operation 0 3 , Elimination by row operations gives
z
T2
(6)
=
b
~-~-i--~--~~-f--J~----~~.~-i-~:~-] o
:5 I I
0
: I I
1:
3.6
We see that now Xl' X2 are basic and x 3 , from T 2 the basic feasible solution Xl
= 50/5 = 10,
X2
+
10 Row 2
1
50
I I
0.9
Row I
Row 3
""'::"-Row 2 7.'2
nonbasic. Setting the latter to zero, we obtain
X4
= 36/7.2 = 5,
.: = 840.
This is B in Fig. 473 (Sec. 22.2). In this step . .: has increased from 480 to 840. due to the elimination of -72 in T l . Since T2 contains no more negative entries in Row I, we conclude that z = f(10, 5) = 40 . 10 + 88 . 5 = 840 is the maximum po:"sible revenue. It is obtained if we produce twice as many S heaters as L heaters. This is the solution of our problem by the simplex method of linear programming. • Minimization. If we want to millimize <: = f(x) (instead of maximize), we take as the columns of the pivots those whose entry in Row I is positive (instead of negative). In such a Column k we consider only positive entries Tjk and take as pivot a ~ik for which b/tjk is smallest (as before). For examples, see the problem set .
•... _-_ .. _ _.......-- --- ........... - ......... _....... .... --. ~
-=9l
L _
SIMPLEX METHOD
Write in nonnal form and solve by the simplex method, assuming all xJ to be nonnegative. 1. Maximize.f = 3Xl + 2X2 subject to 3x1 + 4.l2 ~ 60. 4Xl + 3X2 ~ 60. IOxl + 2X2 ~ 120. 2. Prob. 16 in Problem Set 22.2. 3. Maximize the profit in the daily production of XI metal frames Fl ($90 profit/frame) and X2 frames F2 ($50 profit/frame) under the restrictions XI + 3X2 ;§; 1800 (material). Xl + .\"2 ;§; 1000 (machine hours), 3Xl + X2 ;§; 2400 (labor).
4. Maximize f = X\
+
X2
+
2Xl
X3 ;§;
+ 3X2 +
4.8, lOxI +
\"3
subject to 9.9, -"2 -
X3 ~
X3 ;§;
0.2.
5. The problem in the text with the order of the constmims interchanged. 6. Minimize f =
4Xl -
3.\"1
+
4X2
+
5.\"3
2Xl
+
3X3
~
30.
~
IOx2 - 20.\"3 subject to 60, 2.\"1 + .\"2 ~ 20,
7. MinimiLe f = 5Xl - 20X2 subject to 2Xl + 5x2 ~ 10. 8. Prob. 20 in Problem Set 22.2.
2Xl
+ 1Ox2 ;§; 5,
SEC. 22.4
Simplex Method:
9. Maximize
f
34xI
=
+ 2X2 + Xl + X2 + 5X3
X3 ~
8Xl
~
+
947
Difficulties
+
29x2
54. 3xI
+
32x3
8X2
+
(b) Write a program for maximizing::: in R.
subject to 59.
2X3 ~
39.
= {{IXI
+ 1I2X2
(el Write a program for maximi7ing ::: = {{lXI + ... + lInX" subject to linear constraints.
10. CAS PROJECT. Simplex Method. (a) Write a program for graphing a region R in the first quadrant
(d) Apply your programs to problems in this problem
of the xlx2-plane determined by linear constraints.
set and the previous one.
22.4 Simplex Method:
Difficulties
We recall from the last section that in the simplex method we proceed stepwise from one basic feasible solution to another, thereby increasing the value of the objective function f until we reach an optimal solution. Occasionally (but rather infi'equently in practice). two kinds of difficulties may occur. The first of these is degeneracy. A degenerate feasible solution is a feasible solution at which more than the usual number 11 - 111 of variables are zero. Here 11 is the number of variables (slack and others) and 111 the number of constraints (not counting the '~i ~ 0 conditions). Tn the last section, 11 = 4 and 111 = 2, and the occurring basic feasible solutions were nondegenerate; n - 111 = 2 variables were zero in each such solution. In the case of a degenerate feasible solution we do an extra elimination step in which a basic variable that is zero for that solution becomes nonbasic (and a nonbasic variable becomes basic instead). We explain this in a typical case. For more complicated cases and techniques (rarely needed in practice) see Ref. [F5J in App. I. E X AMP L E 1
Simplex Method, Degenerate Feasible Solution AB Steel. Inc .. produces two kinds of iron 11' '2 by using three kinds ot raw matenal R I . R2• R3 (scrap iron and two kind, of ore) as shown. Maximi7e the daily profit.
Raw Material Needed per Ton
Raw Material
[ron 11
iron 12
2
I
Rl R2 R3
Raw Material Available per Day (tons) 16
I
8 3.5
0
Net profit per ton
$150
$300
Let Xl and X2 denote the amount (in tons) of iron '} and 12 , respectively. produced per day. Then our problem is as follows. Maximize
SoIZlti01l.
;: = I(x) = subject
to
the constraints Xl
~
O. -'"2
~
150Xl
+
300X2
0 and (raw material R I ) (raw material R 2 ) (raw material R3 ).
948
CHAP. 22
Unconstrained Optimization. Linear Programming
By introducing slack
variable~ X3' X4. X5
we obtain the normal form of the
con~tr.lints
16
8
(1)
Xi;;;;
I. .... 5).
(i =
0
As in the last section we obtain from (\ ) and (2) the initial simplex tahle
b
[
-~l-~~Q-~J~O-l-~---~---~-r~~-l
To =
(3)
I
I
o: o II
:
0
I I
0
0
I .
0 :
Thi~
=
1611 = 16.
is 0: (0.0) in Fig. 474. We have
11
X5
= 5 variables Xj'
m
3.5
= x2 = 0 we have frolll (3) the basic
We see that xl. x2 are nonbasic variables and x3' x4' x!) are basic. With Xl feasible solution X3
H
I I
0
= 3.511 = 3.5.
= 3 constraints, and 11
-
111
;: =
O.
= 2 variables equal
to zero in our ",Iution. which thus is nondegenerate.
Step 1 of Pivoting Operatioll 0
1:
Column Selection of Pivot. Column 2 hince -150 < 0).
Operatioll O 2: Row Selection of Pivot. 1612 = H, 811 = 8; 3.510 is not possible. Hence we could choose Row 2 or Row 3. We choose Row 2. The pivot is L Operation 0 3: Elimination by Row Operations. This gives the simplex table Xl
X3
-"2
X4
b
X5
------------------------------75 0 I 0 -225 0
(4)
Tl =
[~
0
2
0 0
1
2
1
0
-2
0
0
0
I I I I I I I I
We see that the basic variables are X], X4.'\5 and the nonbasic are x2' we obtain from T] the basic feasible solution
I~ 1
Rm, I
'5 Rm\
Rm
4 Ro'
16
0 3.5 t3'
Row 4
Setting the nonoosic variables to rero,
SEC. 22.4
Simplex Method:
949
Difficulties
Xl =
1611 = 8.
X4
= 0/1 = O.
X5
= 3.5/1 = 3.5.
~ =
1200
This is A: (8. 0) in Fig. 474. This solution in degenerate because X4 = 0 (in addition to X2 = O. x3 = 0): geometrically: the straight line x4 = 0 also pa"es through A. This requires the next step. in which x4 will become nonbasic.
Step 2 of Pivoti1lg Operatioll 0 1 : Column Selection of Pivot. Column 3 (since -225 < 0). Operatioll O 2: Row Selection of Pivot. 16/1 = 16.0/! = O. Hence! must serve as the piVOL Operatioll 0 3 : Elimination by Row Operations. This gives the following simplex table. h
~ _:_ Q- - -
.9_:_-::1JQ. - - ~-?.O- - - - Q. _~!~Q_]
Ro",
o :
0 :
Row
(5)
2
I
2
I
[ 0: 0
!:
o :
0 :
0
-2
0:
I()
I
.! Rm .
I
-!
I
0:
- 2
:
0 3.5
Row -I- - 2 Row .•
We see that the basic variables are Xl. x2. x5 and the nonbasic are x3' x4' Hence x4 has become nonbasic. as intended. By equating the nonbasic variables to /ero we obtain from T 2 the basic feasible solution Xl
= 16/2 = 8.
-'2 =
O/! =
o.
-'5 =
3.511 = 3.5.
;: =
1200.
This is still A: (8. 0) in Fig. 474 and;: has not increa,ed. But this opens the way to the maximum. which we reach in the next step.
Step J of Pivoting Operatioll 0 1 : Column Selection of Pivot. Column 4 (since -150 < 0). Operatioll O 2: Row Selection of Pivot. 16/2 = 8. O/(-!) = O. 3.511 = 3.5. We crultakc I as the pivot. (With
-! as the pivot we would nm leave A. Try iL)
Operatioll 0 3 : Elimination by Row Operations. This gives the simplex table b
_1_ ~_ Q(6)
[
o : I o II o :
2
0: 0
0
!
I I I
0
0:
We see that basic variables are basic feasible solution Xl =
- - Q- ~ _0__ J~O___ !?Q - ~ J 2?J__ ]
2
Xl' X2' X3
0
2
-2:
0
1 -2
-2 and non basic X4'
912 = 4.5.
X3 =
X5'
9
I I I
1.75
:
3.5
Row
150 Row-l-
Row
2 Row-l-
Equating the latter to zero we obtain from T3 the
3.5/1 = 3.5.
;: =
1725.
This is B: (-1-.5.3.5) in Fig. 474. Since Row of T3 has no negative entries. we have reached the maximum daily profit ::max = f(4.5. 3.5) = 150' 4.5 + 300' 3.5 = SI725. This is obtained by u~ing 4.5 tons of iron h ~U~cl~~ •
Difficulties in Starting As a second kind of difficulty, it may sometimes be hard to find a basic feasible solution to start from. In such a case the idea of an artificial variable (or several such variables) is helpful. We explain this method in terms of a typical example.
950
E X AMP L E 2
CHAP. 22
Unconstrained Optimization. Linear Programming
Simplex Method: Difficult Start, Artificial Variable Maximize (7)
subject to the constraints Xl
Solutioll.
~
O.
X2 ~
0 and (Fig. 475)
By means of ~Iack variables we achieve the normal foml of the constrrnnts
=0
=
1
=2
(8)
Xi ~
0
(i =
I, .... 51.
Note that the first slack variable is negative (OT zero). which makes \3 nonnegative within the (and negative outside). From (7) and (8) we obtain the simplex table
fea~ibilit)
region
b
r--L- -~:-l-
-!o I I o II
ro :
I
-2
I
-I
_Q ___
I I I I
-1
0
0
0
I
0
:
0
0
I 1 I . I :2 I
:
4
are nonbasic. and we would like to take x3. '\4. X5 as basic variables. By our usual process of equating the nonbasic variables to zero we obtain from this table
Xl. X2
Xl =
O.
.1."3 =
1I(-l) = -1,
X4 =
211 = 2.
"5 =
4/1 = 4 .
~ =
O.
"3 < 0 indicates that (0, 0) lies outside the feasibility region. Since X3 < O. we cannot proceed immediately. Now. instead of searching for other basic variables. we use the following idea. Solving the second equation in (8) for .1."3. we have
To this we now add a variable
X6
on the right.
B
2
f=7 \ \ \ \ \
~ C
o ,------A,
o
Fig. 475.
1
~ 2
3
Xl
Feasibility region in Example 2
SEC. 22.4
Simplex Method:
Difficulties
951
(9)
is called an artificial variable and is subject to the constraint X6 ~ O. We must take care that x6 (which is not part of the given problem!) will disappear eventually. We shall see that we can accomplish this by adding a term -MX6 with very large M to the objective function. Because of (7) and (9) (solved for x6) this gives the modified objective function for this "extended problem"
X6
(10)
We see that the simplex table corresponding to (10) and (8) is
b
\"1
I L I -2 - M -I +!M __ ______________
MOO 0 _________________
-!
:
-I
-I
I I I
0
0:
:
0
I
I
o: I oI
To =
I ~
I
-!
oI
I
0
0
0:
0
0
0
0
- I
I ~
-M_ ___
I I I
2
0:
4
I
0
I
The last row of this table results from (9) written as Xl - !X2 - x3 + x6 = 1. We see that we can now start. taking x4. x5. X6 as the basic variables and Xl' X2. X3 a, the nonbasic variables. Column 2 has a negative first entry. We can take the second entry (1 in Row 2) as the pivot. This gives
b
I
I
0
-2
o :
I
-!: -
I
-2
0
0
0
I
2
0
0
0: I
I
---,-------,------------------T--I
T1
=
I
-!:
I
0:
0
0
0:
0
O:O~:
0:3 I 10
I I 01001000
This corresponds to Xl = 1,.1"2 = 0 (point A in Fig. 475). x3 = 0, -"4 = I, Row 5 and Column 7. In this way we get rid of \"6' as wanted. and obtain
X5
= 3. X6 = O. We can now drop b
-~ -t -~- -~! -t- ~~ ---~ ---~-l- ~-j r
I I I 0 _1 I 2 I I I 3 I 01021
o
1
I
0
0
I I I I I 13
In Column 3 we choose 312 as the next pivot. We obtain
b
This corresponds to Xl = 2. -'"2 = 2 (this is B in Fig. 475). X3 as the pivot, by the usual principle. This gives
= O. X4
= 2, X5
= O. In Column 4 we choose 4/3
CHAP. 22
952
Unconstrained Optimization. Linear Programming
b
[ This cone5ponds to Xl
=
3.
-~-1-~----~-t-~---!---!-1-~-1 I
I
0:0
O:~
o
X2 =
I I
0
3 I "I
I .
~:2
1
3 I 3 "I"
3
()
1 (point C in Fig. 475).
-" -"3 =
~. x4 = O. x5 = O. This is the maximum
fmax = I(3. 1) = 7.
-............... _..... -.•........
•
2E4
~
If in a step you have a choice between pivots. take the one that comes first in the column considered.
1. Maximize:;; = fleX) = 6xl + 12x2 subject to o ~ Xl ::0; 4. 0::0; X2 ~ 4. 6x1 + 12x2 ::0; 72. 2. Do Prob. I with the last two constraints interchanged. 3. Maximi7e the daily output in producing Xl glass plates by a process PI and X2 glass plates by a process P2 subject to the constraints (labor hours. machine hours, raw material supply)
to input constraints /limitation of machine time) 4Xl 8X1
+
2X2
+ 4X2
X2 ;;;
4. Maximize:;; = 300.\"1 + 500.\"2 subject to 2Xl + 8X2 ~ 60. 2X1 + .\"2 ~ 30. 4XI + 4.\"2 ~ 60. 5. Do Prob. 4 with the last two constraints interchanged. Comment on the resulting simplification. 6. Maximize the total output f = Xl + X2 + X3 (production figures of three different production processes) subject
5X2
+ +
8X3 4X3
::0; 12, ::0; 12.
::0; 40.
9. Maximize f
::0; 140.
5X2
7. Maximize f = 6.\"1 + 6.\"2 + 9.\"3 subject to .\"j ;;; 0 (j = I. .... 5), and Xl + .\"3 + X4 = 1. X2 + .\"3 + \"5 = I. 8. Using an artificial variable. minimize f = 2x 1 - X2 subject to Xl ;;; O. X2 ;;; 0, X1 + X2 ;;; 5, -Xl + X2 ~ I, 5XI
4Xl
+ +
O.
X3
=
::0; 0,
+ X2 + + X2 + x3
4Xl
2X3
Xl
::0; 1.
subject to Xl ;;; Xl
+ X2
.\"3
o.
~ O.
10. If one uses the method of artificial variables in a problem without solution. this nonexistence will become apparent by the fact that one cannot get rid of the artificial variable. TIIustrate this by trying to maximize f = 2Xl + X2 subject to Xl ;;; 0, X2 ;;; 0, 2Xl + X2 ::0; 2, Xl + 2X2 ;;; 6, Xl + X2 ~ 4.
==::,.:.,::.===:=:--:e-".;::&':==.:U S T ION SAN D PRO B L EMS 1. What is the difference between constrained and unconstrained optimization?
2. State the idea and the basic formulas of the method of steepest descent. 3. Write down an algorithm for the method of steepest descent. 4. Design a "method of steepest ascent" for determining maxima. 5. What i~ linear programming? objective function?
lt~
ba~ic
idea? An
6. Why can we not use methods of calculus for extrema in linear programming?
f(x) = X1 2 + 1.5X22. starting from (6. 3). Do 3 steps. Why is the convergence faster than in Example 1. Sec. 22.1?
9. What does the method of steepe~t de~cent amount to in the case of a single variable? 10. In Prob. 8 start from Xo = [1.5 I ]T. Show that the next even-numbered approximations are X2 = kXo. X4 = k 2xo. etc .. where k = OJl4. 11. What happens in Example I of Sec. 22.1 if you replace the function fex) = X 12 + 3X22 by fex) = x 12 + 5X22? Do 5 steps, starting from Xo = [6 3]T. Is the convergence faster or slower?
7. Whar are slack variables? Artificial variables? Why did we use them'?
12. Apply the method of steepest descent to f(x) = 9X12 + X2 2 + 18.\'1 - 4X2, 5 steps. starting from Xo = [2 4]T.
8. Apply the method of steepest descent to
13. In Prob. 12, could you start from [0
O]T and do 5 steps?
Summary of Chapter 22
953
121-25]
14. Show that the gradients in Prob. 13 are orthogonal. Give a reason.
Xl
115-20 1 Graph or sketch the region in the first quadrant of the X1x2-plane detelwined by the following inequalities. 15.
17.
Xl
+
2X1
+
3X2 ~
6
X2
~
4
-Xl
+
X2
~
0
XI
+
X2
~
4
16.
0.8X1
18.
2X2 ~
Xl -
Xl -
+
X2
2X1
-4
2xI
+
X2 ~
12
XI
+
X2 ~
8
Xl
+
X2 ~
2
3X2 ~
-12
~
15
f =
~
+
X2
f ~
+ x2
-XI
+
20.
X2 ~
5
X2 ~
3
2X1 -
2
Xl
+ X2
~
Xl X2
20X2
subject to Xl
~
5.
+ ~
X2
subject to
Xl
+
2X2 ~
+ +
40X2 ~
1800
(Machine hours),
20X2 ~
6300
(Labor).
25. Maximize the daily output in producing XI chairs by a process PI and X2 chairs by a process P 2 subject to 3x 1 + 4X2 ~ 550 (machine hours), 5xI + 4X2 ~ 650 (labor).
--:!:.,: .'fI.=.:.,=:~"'=="==:._,=:':==:='
Unconstrained Optimization.
Linear Programming
In optimization problems we maximize or minimize an objective function.: = f(x) depending on control variables Xl' ••• , X'" whose domain is either unrestricted ("unconstrained optimization," Sec. 22. I) or restricted by constraints in the fonn of inequalities or equations or both ("constrained optimization," Sec. 22.2). If the objective function is linear and the constraints are linear inequalities in X10 ••• , -'""" then by introducing slack variables X m +l, . • . , Xn we can write the optimization problem in normal form with the objective function given by
(I) (where c m + 1
= ell
10.
4.
~
200X1
XI
=
LO.
+
4.
= 2X1 IOx2 subject to Xl - X2 ~ 4. 14. Xl + X2 ~ 9. -Xl + 3x2 ~ 15. 24. A factory produces two kinds of gaskets, G I • G 2 • with net profit of $60 and $30. respectively. Maximize the total daily profit subject to the constraints (Xj = number of gaskets Gj produced per day)
2xI
40X1
19.
lOx.
X2 ~
6.
23. Minimize f
6
2x2 ~
+ X2
22. Maximize
-2
~
Maximize or minimize as indicated.
21. Maximize
= 0) and the constraints given by
(2) Xl ~
o... '. x.,
~
O.
[n this case we can then apply the widely used simplex method (Sec. 22.3), a systematic stepwise search through a very much reduced subset of all feasible solutions. Section 22 4 shows how to overcome difficulties with this method
t
--\ ... ~
23
---, C HAP T E R
Graphs. Combinatorial Optimization Graphs and digraphs (= directed graphs) have developed into powerful tools in areas, such as electrical and civil engineering, communication networks, operations research, computer science, economics, industrial management, and marketing. An essential factor of this growth is the use of computers in large-scale optimization problems that can be modeled by graphs and solved by algorithms provided by graph theory. This approach yields models of general applicability and economic imporlance. II lies in the center of combinatorial optimization, a term denoting optimization problems that are of pronounced discrete or combinatorial structure. This chapter gives an introduction to this wide area, which constitutes a shift of emphasis away from differential equation~. eigenvalues, and so on, and is full of new ideas as well as open problems-in connection, for instance, with efficient computer algorithms. The classes of problems we shall consider include transportation of minimum cost or time, best assignment of workers to jobs, most efficient use of communication networks, and many others. Problems for these classes often form the core of larger and more involved practical problems.
Prerequisite: none. References and Answers
10
Problems: App. I Parl F, App. 2.
23.1 Graphs and Digraphs Roughly, a graph consists of points, called vertices, and lines connecting them, called edges. For example, these may be four cities and five highways connecting them, as in Fig. 476. Or the points may represent some people, and we connect by an edge those who do business with each other. Or the vertices may represent computers in a network and the edges connections between them. Let us now give a fonnal definition . ./JLOOP
~ flsolated
\ ......------... "
vertex
'-------"" Double edge
Fig. 476. Graph consisting of 4 vertices and 5 edges 954
Fig. 477. Isolated vertex, loop, double edge. (Excluded by definition.)
SEC. 23.1
955
Graphs and Digraphs
DEFINITION
Graph
A graph G consists of two finite sets (sets having finitely many elements), a set V of points, called vertices, and a set E of connecting lines, called edges, such that each edge connects two vertices, called the endpoints of the edge. We write G
=
(V, E).
Excluded are isolated vertices (vertices that are not endpoints of any edge), loops (edges whose endpoints coincide), and lIlultiple edges (edges that have both endpoints in common. See Fig. 477.
CAUTION! Our three exclusions are practical and widely accepted, but not uniformly. For instance, some authors permit multiple edges and call graphs without them simple graphs. • We denote vertices by letters, Lt, v, ... or VI' V 2 , .•• or simply by numbers 1,2, ... (as in Fig. 476). We denote edges by el, e2, ... or by their two endpoints; for instance, el = (1, 4), e2 = (1, 2) in Fig. 476. An edge (Vi' V) is called incident with the vertex Vi (and conversely); similarly, (Vi' Vj) is incident with Vj. The number of edges incident with a vertex V is called the degree of v. Two vertices are called adjacent in G if they are connected by an edge in G (that is, if they are the two endpoints of some edge in G). We meet graphs in ditlerent fields under different names: as "networks" in electrical engineering, "structures" in civil engineering, "molecular structures" in chemistry, "organizational structures" in economics, "sociograms," "road maps," "telecommunication networks," and so on.
Digraphs (Directed Graphs) Nets of one-way streets, pipeline networks, sequences of jobs in construction work, flows of computation in a computer, producer-consumer relations, and many other applications suggest the idea of a "digraph" (= directed graph), in which each edge has a direction (indicated by an arrow, as in Fig. 478).
Fig. 478.
DEFINITION
Digraph
Digraph (Directed Graph)
A digraph G = (V, E) is a graph in which each edge e its "initial point" i to its "terminal point" j.
=
(i,j) has a direction from
Two edges connecting the same two points i, j are now permitted, provided they have opposite directions, that is, they are (i,j) and (j, i). Example. (1,4) and (4, 1) in Fig. 478.
956
CHAP. 23
Graphs. Combinatorial Optimization
A subgraph or subdigraph of a given graph or digraph G = (V. E), respectively, is a graph or digraph obtained by deleting some of the edges and vertices of G. letaining the other edges of G (together with their pairs of endpoints). For instance, el' e3 (together with the vertices I, 2. 4) form a subgraph in Fig. 476. and e3, e4, e5 (together with the vertices l. 3. 4) fonn a subdigraph in Fig. 478.
Computer Representation of Graphs and Digraphs Drawings of graphs are useful to people in explaining or illustrating specific situations. Here one should be aware that a graph may be <;ketched in various ways; see Fig. 479. For handling graphs and digraphs in computers. one uses matrices or lists as appropriate data stmctures. as follows.
(a)
(b)
Fig. 479.
Different sketches of the same graph
Adjacency Matrix of a Graph G:
{/ij =
(c)
Matrix A =
{Io
[{lU]
with entries
if G has an edge (i, j), else.
Thus £Iij = 1 if and only if two veItices i and j are adjacent in G. Here, by definition, no vertex is considered to be adjacent to itself; thus, {/i; = O. A is symmetric, {/ij = {lji- (Why?) The adjacency matrix of a graph is generally much smaller than the so-called illcidellce matrix (see Probs. 21, 22) and is preferred over the latter if one decides to store a graph in a computer in matrix form. EXAMPLE 1
Adjacency Matrix of a Graph Vertex Vertex I 2
3 4
Adjacency Matrix of a Digraph G:
{/ij
=
{Io
2
[~
3
4
0
0
0
J
Matrix A = [aij] with entries
if G has a directed edge (i, j), else.
This matrix A is not symmetric. (Why?)
•
SEC. 23.1
957
Graphs and Digraphs
E X AMP L E 2
Adjacency Matrix of a Digraph To vertex
2
From vertex I 2 3 4
3
4
0
[i
0
0 0
0
U
;]
•
Lists. The vertex incidence list of a graph shows for each vertex the incident edges. The edge incidence list shows for each edge its two endpuints. Similarly for a digraph; in the vertex list, outgoing edges then get a minus sign, and in the edge list we now have ordered pairs of vertices. E X AMP L E 3
Vertex Incidence List and Edge Incidence List of a Graph This graph is the same as in Example I. except for notation.
Vertex
Incident Edges
Edge
Endpoints
VI
el' e5
el
VI, V 2
V2
el' C2, e3
C2
V 2 , V3
V3
e2' C4
C3
V 2 , V4
V4
C3 • C4 , e5
e4
V 3• V4
e5
VI. V 4
•
"Sparse graphs" are graphs with few edges (far fewer than the maximum possible number n(n - I )/2. where n is the number of vertices). For these graphs. matrices are not efficient.
Lists then have the advantage of requiring much less storage and being easier to handle; they can be ordered, sorted, or manipulated in various other ways directly within the computer. For instance, in tracing a "walk" (a connected sequence of edges with pairwise common endpoints), one can easily go back and furth between the two lists just discussed, instead of scanning a large column of a matrix for a single 1. Computer science has deVeloped more refined lists, which, in addition to the actual content, contain "pointers" indicating the preceding item or the next item to be scanned or both items (in the case of a "walk": the preceding edge or the subsequent one). For details, see Refs. [E 16] and LF7]. This section was devoted to basic concepts and notations needed throughuut this chapter, in which we shall discuss some of the most impurtant classes of combinatorial optimization problems. This will at the same time help us to become more and more familiar with graphs and digraphs.
958
CHAP. 23
-.•.--..-- ..... -.... --~--
Graphs. Combinatorial Optimization
..
~.-. ........ j--. .... - -.....-. --
1. Sketch the graph consisting of the vertices and edges of a square. Of a tetrahedron. 2. Worker WI can do jobs 11 and 13 , worker W2 job 14 , worker W3 jobs 12 and J 3 . Represent this by a graph. 3. Explain how the following may be regarded as graphs or digraphs: flight connections between given cities: memberships of some persons in some committees; relations between chapters of a book: a tennis tournament; a family tree. 4. How would you represent a net of one-way and two-way streets by a digraph? 5. Give further examples of situations that could be represented by a graph or digraph. 6. Find the adjacency matrix of the graph in Fig. 476.
Sketch the graph whose adjacency matrix is:
0 0
14. 0
0 0 0
0
0
0
0
0
0
0
0
0 0
0
0
0
16.
ADJACENCY MATRIX
Find the adjacency matrix of the graph or digraph.
8.
0
15.
7. When will the adjacency matrix of a graph be symmetric? Of a digraph?
18-131
0
0 Sketch the digraph whose adjacency matrix is: 17. The matrix in Prob. 14.
10.
18. The matrix in Prob. 16. 19. (Complete graph) Show that a graph G with 11 vertices can have at most 11(11 - 1)/2 edges, and G has exactly n(1l - I )12 edges if G is complete, that is, if every pair of vertices of G is joined by an edge. (Recall that loops and multiple edges are excluded.) 20. In what case are all the off-diagonal entries of the adjacency matrix of a graph G equal to I?
11.
Incidence Matrix of a Graph: Matrix B = [bjk ] with entries if vertex j is an endpoint of edge ek otherwise. Find the incidence matrix of: 21. The graph in Prob. 9.
12.
22. The graph in Prob. 8. Incidence Matrix of a Digraph: Matrix B = entries
a;jk] with
if edge ek leaves vertex j if edge ek enters vertex j
13.
othelwise. Find the incidence matrix of: 23. The digraph in Prob. II. 24. The digraph in Prob. 13. 25. Make a vertex incidence list ofthe digraph in Prob. 13.
SEC. 23.2
Shortest Path Problems.
23.2
959
Complexity
Shortest Path Problems.
Complexity
Beginning in this section, we shall discuss some of the most important classes of optimization problems that concern graphs and digraphs as they arise in applications. Basic ideas and algorithms will be explained and illustrated by small graphs, but you should keep in mind that real-life problems may often involve many thousands or even millions of vertices and edges (think of telephone networks, worldwide air travel, companies that have offices and stores in all larger cities). Then reliable and efficient systematic methods are an absolute necessity-solution by inspection or by trial and error would no longer work, even if "'nearly optimal" solutions are acceptable. We begin with shortest path problems, as they arise, for instance, in designing shortest (or least expensive, or fastest) routes for a traveling salesman. for a cargo ship. etc. Let us first explain what we mean by a path. In a graph G = (V, E) we can walk from a vertex VI along some edges to some other vertex vk- Here we can (A) make no restrictions. or (B) require that each edge of G be traversed at most once, or
(C) require that each vertex be visited at most once. In case (A) we call this a walk. Thus a walk from
VI
to
Vk
is of the form
(1)
where some of these edges or vertices may be the same. In case (B), where each edge may occur at most once, we call the walk a trail. Finally, in case (C), where each vertex may occur at most once (and thus each edge automatically occurs at most once), we call the trail a path. We admit that a walk. trail. or path may end at the vertex it started from. in which case we call it closed; then Vk = VI in (1). A closed path is called a cycle. A cycle has at least three edges (because we do not have double edges; see Sec. 23.1). Figure 480 illustrates all these concepts.
Walk, trail, path, cycle
Fig. 480.
14 11-
21 22-
3233-
2 is 34 4-
a walk (not a trail). 4 - 5 is a trail (not a path). 5 is a path (not a cycle). 1 is a cycle.
Shortest Path To define the concept of a shortest path, we assume that G = (V, E) is a weighted graph, that is, each edge (Vi, V) in G has a given weight or length lij > O. Then a shortest path VI ----;. Vk (with fixed VI and Vk) is a path (1) such that the sum of the lengths of its edges 112
(/12 = length of VI
to
Vk)'
+
123
+
134
+ ... +
Ik-I,K
(Vb V 2 ), etc.) is minimum (as small as possible among all paths from Similarly, a longest path VI ----;. Vk is one for which that sum is maximum.
960
CHAP. 23
Graphs. Combinatorial Optimization
Shortest (and longest) path problems are among the most important optimization problems. Here, "length" lij (often also called "cost" or "weight") can be an actual length measured in miles or travel time or gasoline expenses, but it may also be something entirely different. For instance, the "traveling salesman problem" requires the determination of a shortest Hamiltonian i cycle in a graph, that is, a cycle that contains all the vertices of the graph. As another example, by choosing the "most profitable" route VI --4 Vk, a salesman may want to maximize "Llij , where lij is his expected commission minus his travel expenses for going from town i to townj. In an investment problem, i may be the day an investment is made,j the day it matures, and lij the resulting profit, and one gets a graph by considering the various possibilities of investing and reinvesting over a given period of time.
Shortest Path if All Edges Have Length I = 1 Obviously, if all edges have length I, then a shortest path VI - Vk is one that has the smallest number of edges among all paths VI - Vk in a given graph G. For this problem we discuss a BFS algorithm. BFS stands for Breadth First Search. This means that in each step the algorithm visits all neighhnring (all adjacent) vertices of a vertex reached. as opposed to a DFS algorithm (Depth First Search algorithm), which makes a long trail (as in a maze). This widely used BFS algorithm is shown in Table 23.1. We want to find a shortest path in G from a vertex s (start) to a vertex t (terminal). To guarantee that there is a path from s to t, we make sure that G does not consist of separate portions. Thus we assume that G is connected, that is, for any two vertices V and w there is a path V _ w in G. (Recall that a vertex V is called adjacent to a vertex 11 if there is an edge (u, v) in G.) Table 23.1
Moore's BFS for Shortest Path (All Lengths One)
Proceedings of the Illfemationai Symposillm for Switching TIleory. Part II. pp. 285-292. Cambridge: Harvard University Press. 1959.
ALGORITHM MOORb [G
=
(V, E), s, t]
This algorithm determines a shortest path in a connected graph G s to a vertex t. INPUT:
OUTPUT: 1. 2. 3. 4. 5.
= (V, E)
from a vertex
Connected graph G = (V, E), in which one vertex is denoted by sand one by t, and each edge (i, j) has length lij = 1. Initially all vertices are unlabeled. A shortest path s ---+ t in G = (V. £)
Label s with O. Set i = O. Find all unlabeled vertices adjacent to a vertex labeled i. Label the vertices just found with i + l. If vertex t is labeled. then "backtracking" gives the shortest path k (= label of t), k - 1, k - 2, ... , 0
OUTPUT k, k - 1, k - 2, ... , O. Stop Else increase i by I. Go to Step 3. End MOORE
lWILLIAM ROWAN HAMILTON (1805-1865), Irish mathematician, known for his work in dynamics
SEC. 23.2
Shortest Path Problems.
E X AMP L E 1
961
Complexity
Application of Moore's BFS Algorithm Find a shortest path s
---+
t in the graph G shown in Fig. 481.
Solution.
Figure 481 shows the labels. The blne edges form a shortest path (length 4). There is another shortest path s ---+ t. (Can yon find it?) Hence in the program we must introduce a rule that makes backtracking unique because otherwise the computer would not know what to do next if at some step there is a choice (for instance, in Fig. 481 when it got back to the vertex labeled 2) The following rule seems to be natural.
Backtracking rule.
Using the numbeIing of the vertices from I to II (not the labeling'). at each step, if a vertex labeled i i, reached, take as the next ve11ex that with the smallest number (not label!) among all the • vertices labeled i - I .
2
Fig. 481.
Example 1, given graph and result of labeling
Complexity of an Algorithm Complexity of Moore's algorithm. To find the vertices to be labeled 1, we have to scan all edges incident with s. Next, when i = 1, we have to scan all edges incident with vertices labeled I, etc. Hence each edge is scanned twice. These are 2m operations (Ill = number of edges of G). This is a function c(m). Whether it is 2171 or 5111 + 3 or 12m is not so essential; it is essential that c(m) is proportional to 111 (not nz 2 , for example); it is of the "order" 1II. We write for any function alll + b simply Oem), for any function am 2 + bm + d simply 0(11/ 2 ), and so on; here, 0 suggests order. The underlying idea and practical aspect are as follows. lnjudging an algorithm, we are mostly interested in its behavior for very large problems (large m in the present case), since these are going to determine the limits of the applicability of the algorithm. Thus, the essential item is the fastest growing term (am 2 in alll 2 + bill + d, etc.) since it will overwhelm the others when III is large enough. Also, a constant factor in this term is not very essential; for instance, the difference between two algorithms of orders. say, 5111 2 and 8m 2 is generally not very essential and can be made irrelevant by a modest increase in the speed of computers. However, it does make a great practical difference whether an algorithm is of order 11/ or m 2 or of a still higher power lII P • And the biggest difference occurs between these "polynomial orders" and "exponential orders," such as 2'11. For instance, on a computer that does I 09 operations per second. a problem of size m = 50 will take 0.3 second with an algorithm that requires /715 operation~, but 13 days with an algorithm that requires 2m operations. But this is not our only reason for regarding polynomial orders as good and exponential orders as bad. Another reason is the gain in using afaster computer. For example let two algorithms be Oem) and 0(111 2 ). Then, since 1000 = 31.62 , an increase in speed by a factor 1000 has the effect that per hour we can do problems 1000 and 31.6 times as big, respectively. But since 1000 = 2 9 .97 • with an algorithm that is 0(2"'), all we gain is a relatively modest increase of 10 in problem size because 2 9 . 97 • 2 m = 2>n+9.97.
962
CHAP. 23
Graphs. Combinatorial Optimization
The symbol 0 is quite practical and commonly used whenever the order of growth is essential. but not the specific form of a function. Thus if a function g(m) is of the form gem)
= kh(m) + more slowly growing terms
(k
*" 0, constant),
we say that g(m) is of the order h(111) and write g(m)
=
O(lZ(111)).
For instance, am
+
b
=
5 . 2m
0(111).
+
3m 2 = 0(21n).
We want an algorithm .iI to be "efficient." that is. "good" with respect to (i) Time (number C.,l(111) of computer operations), or (ii) Space (storage needed in the internal memory)
or both. Here c. J suggests "complexity" of .iI. Two popular choices for cjm) = longest time 9'1 takes for a problem of size
(Worst case) (Average case)
C~'I
are
/11,
c.cim) = average time sl takes for a problem of size
111.
In problems on graphs, the "size" will often be 11l (number of edges) or 11 (number of vertices). For our present simple algorithm, cJm) = 2m in both cases. For a "good" algorithm.iI, we want that c.jm) does not grow too fast. Accordingly, we call .cJ1 efficient if C,,'1(111) = O(mk) for some integer k ~ 0; that is, C". I may contain only powers of m (or functions that grow even more slowly, such as In m), but no exponential functions. Furthermore, we call sJ polynomially bounded if .9l is efficient when we choose the "worst case" C,,«111). These conventional concepts have intuitive appeal. as our discussion shows. Complexity should be investigated for every algorithm, so that one can also compare different algorithms for the same task. This may often exceed the level in this chapter; accordingly, we shall confine ourselves to a few occasional comments in this direction.
[1=6]
SHORTEST PATH
5.
Find a shortest path P: s --> t and its length by Moore's BFS algorithm; si--erch the graph with the labels and indicate P by heavier lines (as in Fig. 481). 1.
2't~\
\
·V~
. . . . . /"-J s
4. /
"':::) ' " 0
~~-''b t/
6.
S
/\_1-1- . . . .
V\/_. /"
__
t
/\/~( !
sQ\
•
•
7. (Nonuniqueness) A shonest path s --> t for given sand t need not be unique. Illustrate this by finding another shortest path s --> t in Example I in the text. 8. (Maximum length) If P is a shortest path between any two vertices in a graph with n vertices, how many edges can P at most have'? In a complete graph (with all edges of length 1)7 Give a reason. 9. (Moore's algorithm) Show that if a vertex v has label .A(v) = k, then there is a path s --> v of length k.
SEC. 23.3
Bellman's Principle.
10. Call the length of a shortest path s ~ v the distance of v from s. Show that if v has distance /, it has label A(v)
=
963
Dijkstra's Algorithm 16. Find 4 different closed Euler trails in Fig. 484. 2
I.
/\/\
11. (Hamiltonian cycle) Find and sketch a Hamiltonian cycle in the graph of Prob. 3.
12. Find and sketch a Hamiltonian cycle in the graph of a dodecahedron. which has 12 pentagonal faces and 20 vertices (Fig. 482). This is a problem Hamilton himself considered.
Fig. 482.
Problem 12
13. Find and sketch a Hamiltonian cycle Sec. 23.1.
In
Fig. 479.
14. (Euler graph) An EllIeI' graph G is a graph that has a clo:-.ed Euler trail. An Euler trail is a trail that contains every edge of G exactly once. Which subgraph with four edges of the graph in Example I, Sec. 23.1. is an Euler graph? 15. Is the graph in Fig. 483 an Euler graph? (Give a reason.)
4
3
Fig. 484.
5
Problem 16
17. The postman problem is the problem of finding a closed walk W: s ~ s (s the post office) in a graph G with edges (i,j) of length lij > 0 such that every edge of G is traversed at least once and the length of W is minimum. Find a solution for the graph in Fig. 483 by inspection. (The problem is also called the Chinese postman problem since it was published in the journal Chinese MathenlOtic.I' 1 (1962),273-277.) 18. Show that the length of a shortest postman trail is the same for every starting verteX. 19. (Order) Show that 0(111 3 ) kO(111 P )
=
+
0(111 3 ) =
0(111 3 ) and
O(m P )'
20. Show that ~ = 0(111), O.02e m
+
100m2
= O(e m ).
21. If we switch from one computer to another that is 100 times as fast. what is our gain in problem size per hour in the use of an algorithm that is 0(111), 0(111 2 ). 0(111 5 ). O(e"")?
4
3}-------{
Fig. 483.
23.3
Problems 15, 17
Bellman's Principle.
22. CAS PROBLEM. Moore's Algorithm. Write a computer program for the algorithm in Table 23.1. Test the program with the graph in Example I. Apply it to Probs. 1-3 and to some graphs of your own choice.
Dijkstra's Algorithm
We continue our discussion of the shorrest path problem in a graph G. The last section concerned the special case that all edges had length 1. But in most applications the edges (i, j) will have any lengths lij > 0, and we now turn to this general case, which is of greater practical importance. We write lij = :x; for any edge (i,j) that does not exist in G (setting 'Xl + a = :x; for any number a, as usual). We consider the problem of finding shortest paths from a given veltex. denoted by I and called the origin, to all other vel1ices 2. 3 ..... Il of C. We let Lj denote the length of a shortest path p/ I ~ j in G. THEOREM 1
Bellman's Minimality Principle or Optimality Principle2
{l Pj
: I ~ j is a ShOrTeST path from 1 TO j ill G lllld (i. j) is tlze llist edge of Pj (Fig. 485), Then Pi: 1 ~ i [obtained by droppi1lg (i, j) from Pj ] is a sh017est path
I~i.
2RICHARD BELLMAN (1920--1984). American mathematician, known for his WOIX in dynamic programming.
964
CHAP. 23
Graphs. Combinatorial Optimization P.I ~
__________~A~__________~\
~
"'/-v~ Fig. 485.
PROOF
j
Paths P and Pi in Bellman's minimality principle
Suppose that the conclusion is false. Then there is a path P;"': I _ i that is shorter than Pi' Hence if we now add U. j) to Pi*, we get a path I _ j that is shorter than Pj' This • contradicts our assumption that Pj is "hortest. From Bellman's principle we can derive basic equations as follows. For fixed j we may obtain various paths 1 _ j by taking shortest paths Pi for vmious i for which there is in G an edge (i,j), and add U,.i) to the conesponding Pi' These paths obviously have lengths Li + lij (Li = length of Pi)' We can now take the minimum over i. that is. pick an i for which Li + lij is smallest. By the Bellman principle. this gives a shortest path I ~ j. It has the length Ll = 0
(1)
L·:J = min (L· i*j
1,
+
I) 1,J'
j = 2, ... ,
fl.
These are the Bellman equations. Since Iii = 0 by definition, instead of mini'1'j we can simply write mini' These equations suggest the idea of one of the best-known algorithms for the shortest path problem, as follows.
Dijkstra's Algorithm for Shortest Paths Dijkstra's3 algorithm is shown in Table 23.2, where a connected graph G is a graph in which for any two vertices v and 1I' in G there is a path v _ w. The algorithm is a labeling procedure. At each stage of the computation. each vertex v gers a label, either (PU a permallent label
=
length Lv of a shortest path 1 ----7
V
or (TL) a temporary label = upper bound
Lv for the length of a shortest path
1 ~ v.
We denote by '!P;£ and 2J;£ the sets of vertices with a permanent label and with a temporary label, respectively. The algorithm has an initial step in which vertex I gets the permanent label Ll = 0 and the other vertices get temporary labels, and then the algorithm alternates between Steps 2 and 3. In Step 2 the idea is to pick k "minimally." In Step 3 the idea is that the upper bounds will in general improve (decrease) and must be updated accordingly. Namely, the new temporary label I j of vertex j will be the old one if there is no improvement or it will be Lk + Ikj if there is.
³EDSGER WYBE DIJKSTRA (1930-2002), Dutch computer scientist, 1972 recipient of the ACM Turing Award. His algorithm appeared in Numerische Mathematik 1 (1959), 269-271.
Table 23.2  Dijkstra's Algorithm for Shortest Paths

ALGORITHM DIJKSTRA [G = (V, E), V = {1, ..., n}, lᵢⱼ for all (i, j) in E]

Given a connected graph G = (V, E) with vertices 1, ..., n and edges (i, j) having lengths lᵢⱼ > 0, this algorithm determines the lengths of shortest paths from vertex 1 to the vertices 2, ..., n.

INPUT: Number of vertices n, edges (i, j), and lengths lᵢⱼ
OUTPUT: Lengths Lⱼ of shortest paths 1 → j, j = 2, ..., n

1. Initial step
   Vertex 1 gets PL: L₁ = 0. Vertex j (= 2, ..., n) gets TL: L̃ⱼ = l₁ⱼ (= ∞ if there is no edge (1, j) in G).
   Set 𝒫ℒ = {1}, 𝒯ℒ = {2, 3, ..., n}.

2. Fixing a permanent label
   Find a k in 𝒯ℒ for which L̃ₖ is minimum, set Lₖ = L̃ₖ. Take the smallest k if there are several. Delete k from 𝒯ℒ and include it in 𝒫ℒ.
   If 𝒯ℒ = ∅ (that is, 𝒯ℒ is empty) then
      OUTPUT L₂, ..., Lₙ. Stop
   Else continue (that is, go to Step 3).

3. Updating temporary labels
   For all j in 𝒯ℒ, set L̃ⱼ = min {L̃ⱼ, Lₖ + lₖⱼ} (that is, take the smaller of L̃ⱼ and Lₖ + lₖⱼ as your new L̃ⱼ).
   Go to Step 2.

End DIJKSTRA
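Here is a minimal Python rendering of Table 23.2; it is a sketch under stated assumptions (dictionary representation of the lengths, function name dijkstra), not the book's code. The dictionaries L and tl hold the permanent and temporary labels, so together with their values they play the roles of 𝒫ℒ and 𝒯ℒ.

    INF = float("inf")

    def dijkstra(n, length):
        l = dict(length)
        l.update({(j, i): v for (i, j), v in length.items()})    # undirected edges
        L = {1: 0.0}                                             # permanent labels (PL)
        tl = {j: l.get((1, j), INF) for j in range(2, n + 1)}    # Step 1: temporary labels (TL)
        while tl:
            k = min(sorted(tl), key=tl.get)   # Step 2: minimal label, smallest k on ties
            L[k] = tl.pop(k)
            for j in tl:                      # Step 3: update remaining temporary labels
                tl[j] = min(tl[j], L[k] + l.get((k, j), INF))
        return [L[j] for j in range(1, n + 1)]

    edges = {(1, 2): 8, (1, 3): 5, (1, 4): 7, (2, 3): 1, (2, 4): 2}
    print(dijkstra(4, edges))   # [0.0, 6.0, 5.0, 7.0], as in Example 1 below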
EXAMPLE 1  Application of Dijkstra's Algorithm

Applying Dijkstra's algorithm to the graph in Fig. 486a, find shortest paths from vertex 1 to vertices 2, 3, 4.
Solution.  We list the steps and computations.

1. L₁ = 0, L̃₂ = 8, L̃₃ = 5, L̃₄ = 7                           𝒫ℒ = {1}, 𝒯ℒ = {2, 3, 4}
2. L₃ = min {L̃₂, L̃₃, L̃₄} = 5, k = 3                          𝒫ℒ = {1, 3}, 𝒯ℒ = {2, 4}
3. L̃₂ = min {8, L₃ + l₃₂} = min {8, 5 + 1} = 6
   L̃₄ = min {7, L₃ + l₃₄} = min {7, ∞} = 7
2. L₂ = min {L̃₂, L̃₄} = min {6, 7} = 6, k = 2                 𝒫ℒ = {1, 2, 3}, 𝒯ℒ = {4}
3. L̃₄ = min {7, L₂ + l₂₄} = min {7, 6 + 2} = 7
2. L₄ = 7, k = 4                                              𝒫ℒ = {1, 2, 3, 4}, 𝒯ℒ = ∅

Figure 486b shows the resulting shortest paths, of lengths L₂ = 6, L₃ = 5, L₄ = 7.

(a) Given graph G    (b) Shortest paths in G
Fig. 486.  Example 1

Complexity.  Dijkstra's algorithm is O(n²).
PROOF
Step 2 requires comparison of elements, first n − 2, the next time n − 3, etc., a total of (n − 2)(n − 1)/2. Step 3 requires the same number of comparisons, a total of (n − 2)(n − 1)/2, as well as additions, first n − 2, the next time n − 3, etc., again a total of (n − 2)(n − 1)/2. Hence the total number of operations is 3(n − 2)(n − 1)/2 = O(n²). ∎
1. The net of roads in Fig. 487 connecting four villages is to be reduced to minimum length, but so that one can still reach every village from every other village. Which of the roads should be retained? Find the solution (a) by inspection, (b) by Dijkstra's algorithm.
Fig. 487.  Problem 1

2-7  DIJKSTRA'S ALGORITHM
Find shortest paths for the following graphs:
2.-7. (graphs shown in the figures)
8. Show that in Dijkstra's algorithm, for Lₖ there is a path P: 1 → k of length Lₖ.
9. Show that in Dijkstra's algorithm, at each instant the demand on storage is light (data for fewer than n edges).
10. CAS PROBLEM. Dijkstra's Algorithm. Write a program and apply it to Probs. 2-4.
23.4  Shortest Spanning Trees: Greedy Algorithm

So far we have discussed shortest path problems. We now turn to a particularly important kind of graph, called a tree, along with related optimization problems that arise quite often in practice. By definition, a tree T is a graph that is connected and has no cycles. "Connected" was defined in Sec. 23.3; it means that there is a path from any vertex in T to any other vertex in T. A cycle is a path s → t of at least three edges that is closed (t = s); see also Sec. 23.2. Figure 488a shows an example.

CAUTION!  The terminology varies; cycles are sometimes also called circuits.
A spanning tree T in a given connected graph G = (V, E) is a tree containing all the n vertices of G. See Fig. 488b. Such a tree has n − 1 edges. (Proof?) A shortest spanning tree T in a connected graph G (whose edges (i, j) have lengths lᵢⱼ > 0) is a spanning tree for which Σ lᵢⱼ (sum over all edges of T) is minimum compared to Σ lᵢⱼ for any other spanning tree in G.
Trees are among the most important types of graphs, and they occur in various applications. Familiar examples are family trees and organization charts. Trees can be used to exhibit, organize, or analyze electrical networks, producer-consumer and other business relations, information in database systems, syntactic structure of computer programs, etc. We mention a few specific applications that need no lengthy additional explanations.

The set of shortest paths from vertex 1 to the vertices 2, ..., n in the last section forms a spanning tree. Railway lines connecting a number of cities (the vertices) can be set up in the form of a spanning tree, the "length" of a line (edge) being the construction cost, and one wants to minimize the total construction cost. Similarly for bus lines, where "length" may be the average annual operating cost. Or for steamship lines (freight lines), where "length" may be profit and the goal is the maximization of total profit. Or in a network of telephone lines between some cities, a shortest spanning tree may simply represent a selection of lines that connect all the cities at minimal cost. In addition to these examples we could mention others from distribution networks, and so on.

We shall now discuss a simple algorithm for the problem of finding a shortest spanning tree. This algorithm (Table 23.3) is particularly suitable for sparse graphs (graphs with very few edges; see Sec. 23.1).
Table 23.3  Kruskal's Greedy Algorithm for Shortest Spanning Trees
Proceedings of the American Mathematical Society 7 (1956), 48-50.

ALGORITHM KRUSKAL [G = (V, E), lᵢⱼ for all (i, j) in E]

Given a connected graph G = (V, E) with edges (i, j) having length lᵢⱼ > 0, the algorithm determines a shortest spanning tree T in G.

INPUT: Edges (i, j) of G and their lengths lᵢⱼ
OUTPUT: Shortest spanning tree T in G

1. Order the edges of G in ascending order of length.
2. Choose them in this order as edges of T, rejecting an edge only if it forms a cycle with edges already chosen.
   If n − 1 edges have been chosen, then
      OUTPUT T (= the set of edges chosen). Stop

End KRUSKAL
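A minimal Python sketch of Table 23.3 may be helpful; it is an illustration, not the book's code. The rejection test uses root labels rᵢ in the spirit of the double labeling of vertices discussed after Example 1: an edge is rejected when both ends already have the same root, that is, already lie in the same subtree.

    def kruskal(n, edges):
        """edges: list of (length, i, j); returns the edge set of a shortest spanning tree."""
        root = list(range(n + 1))               # r_i: initially every vertex is its own root

        def find(i):                             # root of the subtree containing vertex i
            while root[i] != i:
                root[i] = root[root[i]]          # path halving keeps the subtrees flat
                i = root[i]
            return i

        tree = []
        for length, i, j in sorted(edges):       # Step 1: ascending order of length
            ri, rj = find(i), find(j)
            if ri != rj:                         # Step 2: reject if the edge would close a cycle
                root[max(ri, rj)] = min(ri, rj)  # merge, retaining the smaller root
                tree.append((i, j))
                if len(tree) == n - 1:           # a spanning tree has n - 1 edges
                    break
        return tree

    # Graph of Example 1 (Fig. 489) below:
    edges = [(1, 3, 6), (2, 1, 2), (4, 1, 3), (6, 4, 5), (7, 2, 3), (8, 3, 4), (9, 5, 6), (11, 2, 4)]
    print(kruskal(6, edges))   # [(3, 6), (1, 2), (1, 3), (4, 5), (3, 4)], total length 21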
Fig. 488.  Example of (a) a cycle, (b) a spanning tree in a graph
EXAMPLE 1  Application of Kruskal's Algorithm

Using Kruskal's algorithm, we shall determine a shortest spanning tree in the graph in Fig. 489.
Fig. 489.  Graph in Example 1
Solution.  See Table 23.4. In some of the intermediate stages the edges chosen form a disconnected graph (see Fig. 490); this is typical. We stop after n − 1 = 5 choices since a spanning tree has n − 1 edges. In our problem the edges chosen are in the upper part of the list. This is typical of problems of any size; in general, edges farther down in the list have a smaller chance of being chosen.
Table 23.4  Solution in Example 1

Edge      Length    Choice
(3, 6)       1       1st
(1, 2)       2       2nd
(1, 3)       4       3rd
(4, 5)       6       4th
(2, 3)       7       Reject
(3, 4)       8       5th
(5, 6)       9
(2, 4)      11
The efficiency of Kruskal's method is greatly increased by double labeling of vertices. Each vertex i carries a double label (rᵢ, pᵢ), where

   rᵢ = Root of the subtree to which i belongs,
   pᵢ = Predecessor of i in its subtree, pᵢ = 0 for roots.

This simplifies rejecting. If (i, j) is next in the list to be considered, reject (i, j) if rᵢ = rⱼ (that is, i and j are in the same subtree, so that they are already joined by edges and (i, j) would thus create a cycle). If rᵢ ≠ rⱼ, include (i, j) in T. If there are several choices for rᵢ, choose the smallest. If subtrees merge (become a single tree), retain the smallest root as the root of the new subtree.

For Example 1 the double-label list is shown in Table 23.5. In storing it, at each instant one may retain only the latest double label. We show all double labels in order to exhibit the process in all its stages. Labels that remain unchanged are not listed again. Underscored are the two 1's that are the common root of vertices 2 and 3, the reason for rejecting the edge (2, 3). By reading for each vertex the latest label we can read from this list that 1 is the vertex we have chosen as a root and the tree is as shown in the last part of Fig. 490.
Fig. 490.  Choice process in Example 1 (stages First through Fifth)
This is made possible by the predecessor label that each vertex carries. Also, for accepting or rejecting an edge we have to make only one comparison (the roots of the two endpoints of the edge).

Ordering is the more expensive part of the algorithm. It is a standard process in data processing for which various methods have been suggested (see Sorting in Ref. [E25] listed in App. 1). For a complete list of m edges, an algorithm would be O(m log₂ m), but since the n − 1 edges of the tree are most likely to be found earlier, by inspecting the q (≪ m) topmost edges, for such a list of q edges one would have O(q log₂ m).
Table 23.5  List of Double Labels in Example 1

Vertex   Choice 1 (3, 6)   Choice 2 (1, 2)   Choice 3 (1, 3)   Choice 4 (4, 5)   Choice 5 (3, 4)
  1                         (1, 0)
  2                         (1, 1)
  3       (3, 0)                              (1, 1)
  4                                                             (4, 0)            (1, 3)
  5                                                             (4, 4)            (1, 4)
  6       (3, 3)                              (1, 3)

1-6  KRUSKAL'S ALGORITHM
Find a shortest spanning tree by Kruskal's algorithm.
1.-6. (graphs shown in the figures)
7. CAS PROBLEM. Kruskal's Algorithm. Write a corresponding program. (Sorting is discussed in Ref. [E25] listed in App. 1.)
8. Design an algorithm for obtaining longest spanning trees.
9. Apply the algorithm in Prob. 8 to the graph in Example 1. Compare with the result in Example 1.
10. To get a minimum spanning tree, instead of adding shortest edges, one could think of deleting longest edges. For what graphs would this be feasible? Describe an algorithm for this.
11. Apply the method suggested in Prob. 10 to the graph in Example 1. Do you get the same tree?
12. Find a shortest spanning tree in the complete graph of all possible 15 connections between the six cities given (distances by airplane, in miles, rounded). Can you think of a practical application of the result?

              Dallas   Denver   Los Angeles   New York   Washington, DC
Chicago         800      900        1800         700           650
Dallas                   650        1300        1350          1200
Denver                               850        1650          1500
Los Angeles                                     2500          2350
New York                                                       200

13. (Forest) A (not necessarily connected) graph without cycles is called a forest. Give typical examples of applications in which graphs occur that are forests or trees.

14-20  GENERAL PROPERTIES OF TREES
Prove:
14. (Uniqueness) The path connecting any two vertices u and v in a tree is unique.
15. If in a graph any two vertices are connected by a unique path, the graph is a tree.
16. If a graph has no cycles, it must have at least 2 vertices of degree 1 (definition in Sec. 23.1).
17. A tree with exactly two vertices of degree 1 must be a path.
18. A tree with n vertices has n − 1 edges. (Proof by induction.)
19. If two vertices in a tree are joined by a new edge, a cycle is formed.
20. A graph with n vertices is a tree if and only if it has n − 1 edges and has no cycles.

23.5  Shortest Spanning Trees: Prim's Algorithm
Prim's algorithm shown in Table 23.6 is another popular algorithm for the shortest spanning tree problem (see Sec. 23.4). This algorithm avoids ordering edges and gives a tree T at each stage, a property that Kruskal's algorithm in the last section did not have (look back at Fig. 490 if you did not notice it).

In Prim's algorithm, starting from any single vertex, which we call 1, we "grow" the tree T by adding edges to it, one at a time, according to some rule (in Table 23.6) until T finally becomes a spanning tree, which is shortest.

We denote by U the set of vertices of the growing tree T and by S the set of its edges. Thus, initially U = {1} and S = ∅; at the end, U = V, the vertex set of the given graph G = (V, E), whose edges (i, j) have length lᵢⱼ > 0, as before.
Thus at the beginning (Step 1) the labels λ₂, ..., λₙ of the vertices 2, ..., n are the lengths of the edges connecting them to vertex 1 (or ∞ if there is no such edge in G). And we pick (Step 2) the shortest of these as the first edge of the growing tree T and include its other end j in U (choosing the smallest j if there are several, to make the process unique).

Updating labels in Step 3 (at this stage and at any later stage) concerns each vertex k not yet in U. Vertex k has label λₖ = l_{i(k),k} from before. If lⱼₖ < λₖ, this means that k is closer to the new member j just included in U than k is to its old "closest neighbor" i(k) in U. Then we update the label of k, replacing λₖ = l_{i(k),k} by λₖ = lⱼₖ and setting i(k) = j. If, however, lⱼₖ ≥ λₖ (the old label of k), we don't touch the old label. Thus the label λₖ always identifies the closest neighbor of k in U, and this is updated in Step 3 as U and the tree T grow. From the final labels we can backtrack the final tree, and from their numeric values we compute the total length (sum of the lengths of the edges) of this tree.
Table 23.6  Prim's Algorithm for Shortest Spanning Trees
Bell System Technical Journal 36 (1957), 1389-1401. For an improved version of the algorithm, see Cheriton and Tarjan, SIAM Journal on Computation 5 (1976), 724-742.

ALGORITHM PRIM [G = (V, E), V = {1, ..., n}, lᵢⱼ for all (i, j) in E]

Given a connected graph G = (V, E) with vertices 1, 2, ..., n and edges (i, j) having length lᵢⱼ > 0, this algorithm determines a shortest spanning tree T in G and its length L(T).

INPUT: n, edges (i, j) of G and their lengths lᵢⱼ
OUTPUT: Edge set S of a shortest spanning tree T in G; L(T)
[Initially, all vertices are unlabeled.]

1. Initial step
   Set i(k) = 1, U = {1}, S = ∅.
   Label vertex k (= 2, ..., n) with λₖ = l₁ₖ [= ∞ if G has no edge (1, k)].

2. Addition of an edge to the tree T
   Let λⱼ be the smallest λₖ for vertex k not in U. Include vertex j in U and edge (i(j), j) in S.
   If U = V then compute L(T) = Σ lᵢⱼ (sum over all edges in S)
      OUTPUT S, L(T). Stop
      [S is the edge set of a shortest spanning tree T in G.]
   Else continue (that is, go to Step 3).

3. Label updating
   For every k not in U, if lⱼₖ < λₖ, then set λₖ = lⱼₖ and i(k) = j.
   Go to Step 2.

End PRIM
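The following minimal Python sketch renders Table 23.6; it is an illustration, not the book's code. The dictionary lam holds the labels λₖ and i_of holds i(k); a vertex is in U exactly when it no longer appears in lam.

    INF = float("inf")

    def prim(n, length):
        l = dict(length)
        l.update({(j, i): v for (i, j), v in length.items()})     # undirected edges
        i_of = {k: 1 for k in range(2, n + 1)}                    # Step 1: i(k) = 1
        lam = {k: l.get((1, k), INF) for k in range(2, n + 1)}    # initial labels lambda_k
        S, total = [], 0
        while lam:
            j = min(sorted(lam), key=lam.get)    # Step 2: vertex with the smallest label
            total += lam.pop(j)
            S.append((i_of[j], j))
            for k in lam:                        # Step 3: label updating
                if l.get((j, k), INF) < lam[k]:
                    lam[k] = l[(j, k)]
                    i_of[k] = j
        return S, total

    edges = {(1, 2): 2, (1, 3): 4, (2, 3): 7, (2, 4): 11, (3, 4): 8, (3, 6): 1, (4, 5): 6, (5, 6): 9}
    print(prim(6, edges))   # ([(1, 2), (1, 3), (3, 6), (3, 4), (4, 5)], 21), as in Example 1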
Fig. 491.  Graph in Example 1

EXAMPLE 1  Application of Prim's Algorithm

Find a shortest spanning tree in the graph in Fig. 491 (which is the same as in Example 1, Sec. 23.4, so that we can compare).

Solution.  The steps are as follows.

1. i(k) = 1, U = {1}, S = ∅; initial labels see Table 23.7.
2. λ₂ = l₁₂ = 2 is smallest, U = {1, 2}, S = {(1, 2)}.
3. Update labels as shown in Table 23.7, column (I).
2. λ₃ = l₁₃ = 4 is smallest, U = {1, 2, 3}, S = {(1, 2), (1, 3)}.
3. Update labels as shown in Table 23.7, column (II).
2. λ₆ = l₃₆ = 1 is smallest, U = {1, 2, 3, 6}, S = {(1, 2), (1, 3), (3, 6)}.
3. Update labels as shown in Table 23.7, column (III).
2. λ₄ = l₃₄ = 8 is smallest, U = {1, 2, 3, 4, 6}, S = {(1, 2), (1, 3), (3, 4), (3, 6)}.
3. Update labels as shown in Table 23.7, column (IV).
2. λ₅ = l₄₅ = 6 is smallest, U = V, S = {(1, 2), (1, 3), (3, 4), (3, 6), (4, 5)}. Stop.

The tree is the same as in Example 1, Sec. 23.4. Its length is 21. You will find it interesting to compare the growth process of the present tree with that in Sec. 23.4.
Table 23.7  Labeling of Vertices in Example 1

                               Relabeling
Vertex   Initial Label   (I)          (II)         (III)        (IV)
  2      l₁₂ = 2
  3      l₁₃ = 4         l₁₃ = 4
  4      ∞               l₂₄ = 11     l₃₄ = 8      l₃₄ = 8
  5      ∞               ∞            ∞            l₆₅ = 9      l₄₅ = 6
  6      ∞               ∞            l₃₆ = 1

1-7  PRIM'S ALGORITHM
Find a shortest spanning tree by Prim's algorithm. Sketch it.
1. For the graph in Prob. 1, Sec. 23.4
2. For the graph in Prob. 2, Sec. 23.4
3. For the graph in Prob. 4, Sec. 23.4
4.-7. (graphs shown in the figures)
8. (Complexity) Show that Prim's algorithm has complexity O(n²).
9. How does Prim's algorithm prevent the formation of cycles as one grows T?
10. For a complete graph (or one that is almost complete), if our data is an n × n distance table (as in Prob. 12, Sec. 23.4), show that the present algorithm [which is O(n²)] cannot easily be replaced by an algorithm of order less than O(n²).
11. In what case will Prim's algorithm give S = E as the final result?
12. TEAM PROJECT. Center of a Graph and Related Concepts.
(a) Distance, eccentricity. Call the length of a shortest path u → v in a graph G = (V, E) the distance d(u, v) from u to v. For fixed u, call the greatest d(u, v) as v ranges over V the eccentricity ε(u) of u. Find the eccentricity of vertices 1, 2, 3 in the graph in Prob. 7.
(b) Diameter, radius, center. The diameter d(G) of a graph G = (V, E) is the maximum of d(u, v) as u and v vary over V, and the radius r(G) is the smallest eccentricity ε(v) of the vertices v. A vertex v with ε(v) = r(G) is called a central vertex. The set of all central vertices is called the center of G. Find d(G), r(G), and the center of the graph in Prob. 7.
(c) What are the diameter, radius, and center of the spanning tree in Example 1?
(d) Explain how the idea of a center can be used in setting up an emergency service facility on a transportation network. In setting up a fire station, a shopping center. How would you generalize the concepts in the case of two or more such facilities?
(e) Show that a tree T whose edges all have length 1 has center consisting of either one vertex or two adjacent vertices.
(f) Set up an algorithm of complexity O(n) for finding the center of a tree T.
13. What would the result be if you applied Prim's algorithm to a graph that is not connected?
14. CAS PROBLEM. Prim's Algorithm. Write a program and apply it to Probs. 4-6.
23.6  Flows in Networks

After shortest path problems and problems for trees, as a third large area in combinatorial optimization we discuss flow problems in networks (electrical, water, communication, traffic, business connections, etc.), turning from graphs to digraphs (directed graphs; see Sec. 23.1).

By definition, a network is a digraph G = (V, E) in which each edge (i, j) has assigned to it a capacity cᵢⱼ > 0 [= maximum possible flow along (i, j)], and at one vertex, s, called the source, a flow is produced that flows along the edges of the digraph G to another vertex, t, called the target or sink, where the flow disappears.
In applications, this may be the flow of electricity in wires, of water in pipes, of cars on roads, of people in a public transportation system, of goods from a producer to consumers, of e-mail from senders to recipients over the Internet, and so on. We denote the flow along a (directed!) edge (i, j) by fᵢⱼ and impose two conditions:
1. For each edge (i, j) in G the flow does not exceed the capacity cᵢⱼ,

(1)    0 ≤ fᵢⱼ ≤ cᵢⱼ    ("Edge condition").

2. For each vertex i, not s or t,

       Inflow = Outflow    ("Vertex condition," "Kirchhoff's law");
in a formula,

(2)    Σⱼ fⱼᵢ − Σₖ fᵢₖ  =  0    if vertex i ≠ s, t,
       (Inflow − Outflow) = −f   at the source s,
                          =  f   at the target (sink) t,

where f is the total flow (and at s the inflow is zero, whereas at t the outflow is zero). Figure 492 illustrates the notation (for some hypothetical figures).
Fig. 492.  Notation in (2): inflow and outflow for a vertex i (not s or t)
Paths

By a path v₁ → vₖ from a vertex v₁ to a vertex vₖ in a digraph G we mean a sequence of edges

    (v₁, v₂), (v₂, v₃), ..., (vₖ₋₁, vₖ),

regardless of their directions in G, that forms a path as in a graph (see Sec. 23.2). Hence when we travel along this path from v₁ to vₖ we may traverse some edge in its given direction, in which case we call it a forward edge of our path, or opposite to its given direction, in which case we call it a backward edge of our path. In other words, our path consists of one-way streets, and forward edges (backward edges) are those that we travel in the right direction (in the wrong direction). Figure 493 shows a forward edge (u, v) and a backward edge (w, v) of a path v₁ → vₖ.
CAUTION!  Each edge in a network has a given direction, which we cannot change. Accordingly, if (u, v) is a forward edge in a path v₁ → vₖ, then (u, v) can become a backward edge only in another path x₁ → xⱼ in which it is an edge and is traversed in the opposite direction as one goes from x₁ to xⱼ; see Fig. 494. Keep this in mind, to avoid misunderstandings.
Fig. 493.  Forward edge (u, v) and backward edge (w, v) of a path v₁ → vₖ
Fig. 494.  Edge (u, v) as forward edge in the path v₁ → vₖ and as backward edge in the path x₁ → xⱼ
Flow Augmenting Paths

Our goal will be to maximize the flow from the source s to the target t of a given network. We shall do this by developing methods for increasing an existing flow (including the special case in which the latter is zero). The idea then is to find a path P: s → t all of whose edges are not fully used, so that we can push additional flow through P. This suggests the following concept.
DEFINITION  Flow Augmenting Path

A flow augmenting path in a network with a given flow fᵢⱼ on each edge (i, j) is a path P: s → t such that
(i) no forward edge is used to capacity; thus fᵢⱼ < cᵢⱼ for these;
(ii) no backward edge has flow 0; thus fᵢⱼ > 0 for these.

EXAMPLE 1
Flow Augmenting Paths

Find flow augmenting paths in the network in Fig. 495, where the first number is the capacity and the second number a given flow.

Fig. 495.  Network in Example 1. First number = Capacity, Second number = Given flow
Solution.  In practical problems, networks are large and one needs a systematic method for augmenting flows, which we discuss in the next section. In our small network, which should help to illustrate and clarify the concepts and ideas, we can find flow augmenting paths by inspection and augment the existing flow f = 9 in Fig. 495. (The outflow from s is 5 + 4 = 9, which equals the inflow 6 + 3 into t.)

We use the notation

    Δᵢⱼ = cᵢⱼ − fᵢⱼ   for forward edges,
    Δᵢⱼ = fᵢⱼ         for backward edges,
    Δ = min Δᵢⱼ taken over all edges of a path.

From Fig. 495 we see that a flow augmenting path P₁: s → t is P₁: 1 − 2 − 3 − 6 (Fig. 496), with Δ₁₂ = 20 − 5 = 15, etc., and Δ = 3. Hence we can use P₁ to increase the given flow 9 to f = 9 + 3 = 12. All three edges of P₁ are forward edges. We augment the flow by 3. Then the flow in each of the edges of P₁ is increased by 3, so that we now have f₁₂ = 8 (instead of 5), f₂₃ = 11 (instead of 8), and f₃₆ = 9 (instead of 6). Edge (2, 3) is now used to capacity. The flow in the other edges remains as before.

We shall now try to increase the flow in this network in Fig. 495 beyond f = 12. There is another flow augmenting path P₂: s → t, namely, P₂: 1 − 4 − 5 − 3 − 6 (Fig. 496). It shows how a backward edge comes in and how it is handled. Edge (3, 5) is a backward edge. It has flow 2, so that Δ₃₅ = 2. We compute Δ₁₄ = 10 − 4 = 6, etc. (Fig. 496) and Δ = 2. Hence we can use P₂ for another augmentation to get f = 12 + 2 = 14. The new flow is shown in Fig. 497. No further augmentation is possible. We shall confirm later that f = 14 is maximum.

Fig. 496.  Flow augmenting paths in Example 1
Cut Sets

A "cut set" is a set of edges in a network. The underlying idea is simple and natural. If we want to find out what is flowing from s to t in a network, we may cut the network somewhere between s and t (Fig. 497 shows an example) and see what is flowing in the edges hit by the cut, because any flow from s to t must sometimes pass through some of these edges. These form what is called a cut set. [In Fig. 497, the cut set consists of the edges (2, 3), (5, 2), (4, 5).] We denote this cut set by (S, T). Here S is the set of vertices on that side of the cut on which s lies (S = {s, 2, 4} for the cut in Fig. 497) and T is the set of the other vertices (T = {3, 5, t} in Fig. 497). We say that a cut "partitions" the vertex set V into two parts S and T. Obviously, the corresponding cut set (S, T) consists of all the edges in the network with one end in S and the other end in T.
Fig. 497.  Maximum flow in Example 1
By definition, the capacity cap (S, T) of a cut set (S, T) is the sum of the capacities of all forward edges in (S, T) (forward edges only!), that is, the edges that are directed from S to T,

(3)    cap (S, T) = Σ cᵢⱼ    [sum over the forward edges of (S, T)].

Thus, cap (S, T) = 11 + 7 = 18 in Fig. 497.

The other edges (directed from T to S) are called backward edges of the cut set (S, T), and by the net flow through a cut set we mean the sum of the flows in the forward edges minus the sum of the flows in the backward edges of the cut set.
CAUTION!  Distinguish well between forward and backward edges in a cut set and in a path: (5, 2) in Fig. 497 is a backward edge for the cut shown but a forward edge in the path 1 − 4 − 5 − 2 − 3 − 6.

For the cut in Fig. 497 the net flow is 11 + 6 − 3 = 14. For the same cut in Fig. 495 (not indicated there), the net flow is 8 + 4 − 3 = 9. In both cases it equals the flow f. We claim that this is not just by chance, but cuts do serve the purpose for which we have introduced them:
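To make this arithmetic concrete, here is a tiny Python check (an illustration added here, not from the book) that evaluates (3) and the net flow for the cut with S = {1, 2, 4} in Fig. 497, using the capacities and flows quoted above for the edges hit by the cut.

    cap  = {(2, 3): 11, (4, 5): 7}                # capacities of the forward edges
    flow = {(2, 3): 11, (4, 5): 6, (5, 2): 3}     # flows, incl. the backward edge (5, 2)
    S = {1, 2, 4}

    forward  = [e for e in flow if e[0] in S and e[1] not in S]   # directed S -> T
    backward = [e for e in flow if e[0] not in S and e[1] in S]   # directed T -> S

    print(sum(cap[e] for e in forward))                                     # 18 = cap (S, T)
    print(sum(flow[e] for e in forward) - sum(flow[e] for e in backward))   # 14 = f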
THEOREM 1  Net Flow in Cut Sets

Any given flow in a network G is the net flow through any cut set (S, T) of G.
PROOF  By Kirchhoff's law (2), multiplied by −1, at a vertex i we have

(4)    Σⱼ fᵢⱼ − Σₗ fₗᵢ  =  0   if i ≠ s, t,
       (Outflow − Inflow) =  f   if i = s.
Here we can sum over j and l from 1 to n (= number of vertices) by putting fᵢⱼ = 0 for j = i and also for edges without flow or nonexisting edges; hence we can write the two sums as one,

    Σⱼ (fᵢⱼ − fⱼᵢ) = 0   if i ≠ s, t,
                   = f   if i = s.

We now sum over all i in S. Since s is in S, this sum equals f:

(5)    Σ_{i∈S} Σ_{j∈V} (fᵢⱼ − fⱼᵢ) = f.
We claim that in this sum, only the edges belonging to the cut set contribute. Indeed, edges with both ends in T cannot contribute, since we sum only over i in S; but edges (i, j) with both ends in S contribute +fᵢⱼ at one end and −fᵢⱼ at the other, a total contribution of 0. Hence the left side of (5) equals the net flow through the cut set. By (5), this is equal to the flow f, which proves the theorem. ∎

This theorem has the following consequence, which we shall also need later in this section.
THEOREM 2  Upper Bound for Flows

A flow f in a network G cannot exceed the capacity of any cut set (S, T) in G.

PROOF  By Theorem 1 the flow f equals the net flow through the cut set, f = f₁ − f₂, where f₁ is the sum of the flows through the forward edges and f₂ (≥ 0) is the sum of the flows through the backward edges of the cut set. Thus f ≤ f₁. Now f₁ cannot exceed the sum of the capacities of the forward edges; but this sum equals the capacity of the cut set, by definition. Together, f ≤ cap (S, T), as asserted. ∎

Cut sets will now bring out the full importance of augmenting paths:
THEOREM 3  Main Theorem. Augmenting Path Theorem for Flows

A flow from s to t in a network G is maximum if and only if there does not exist a flow augmenting path s → t in G.
PROOF
(a) If there is a flow augmenting path P: s ~ t, we can use it to push through it an additional flow. Hence the given t10w cannot be maximum. (b) On the other hand, suppose that there is no flow augmenting path s ~ t in G. Let So be the set of all vertices i (induding .1') such that there is a flow augmenting path s ~ i, and let To be the set of the other vertices in G. Consider any edge (i, j) with i in So and j in To. Then we have a t10w augmenting path s ~ i since i is in So, but s ~ i ~ j is not t10w augmenting because j is not in So. Hence we must have forward
(6)
if (i, j) is a [
edge of the path s backward
~
i ~ j.
Otherwise we could use (i, j) to get a flow augmenting path s → i → j. Now (S₀, T₀) defines a cut set (since t is in T₀; why?). Since by (6), forward edges are used to capacity and backward edges carry no flow, the net flow through the cut set (S₀, T₀) equals the sum of the capacities of the forward edges, which is cap (S₀, T₀) by definition. This net flow equals the given flow f by Theorem 1. Thus f = cap (S₀, T₀). We also have f ≤ cap (S₀, T₀) by Theorem 2. Hence f must be maximum since we have reached equality. ∎
The end of this proof yields another basic result (by Ford and Fulkerson, Canadian Journal of Mathematics 8 (1956), 399-404), namely, the so-called
THEOREM 4  Max-Flow Min-Cut Theorem

The maximum flow in any network G equals the capacity of a "minimum cut set" (= a cut set of minimum capacity) in G.
PROOF
We have just seen that f = cap (So, To) for a maximum flow f and a suitable cut set (So, To). Now by Theorem :2 we also have f ~ cap (S. T) for this f and any cut set (S, T) in G. Together, cap (So, To) ~ cap (S, n. Hence (So, To) is a minimum cut set. The existence of a maximum flow in this theorem follows for rational capacities from the algorithm in the next section and for arbitrary capacities from the Edmonds-Karp BFS also in that section. • The two basic tools in connection with networks are flow augmeming paths and cut sets. In the nexl section we show how flow augmenting paths can be used in an algorithm for maximum flows.
1-4  FLOW AUGMENTING PATHS
Find flow augmenting paths:
1.-4. (networks shown in the figures)
5-8  MAXIMUM FLOW
Find the maximum flow by inspection:
5. In Prob. 1.   6. In Prob. 2.   7. In Prob. 3.   8. In Prob. 4.

9-11  CAPACITY
In Fig. 495 find T and cap (S, T) if S equals
9. {1, 2, 3}   10. {1, 2, 4, 5}   11. {1, 3, 5}

12. Find a minimum cut set in Fig. 495 and verify that its capacity equals the maximum flow f = 14.
13. Find examples of flow augmenting paths and the maximum flow in the network in Fig. 498.

Fig. 498.

14-16  CAPACITY
In Fig. 498 find T and cap (S, T) if S equals
14. {1, 2, 4}   15. {1, 2, 4, 6}   16. {1, 2, 3, 4, 5}

17. In Fig. 498 find a minimum cut set and its capacity.
18. Why are backward edges not considered in the definition of the capacity of a cut set?
19. In which case can an edge (i, j) be used as a forward as well as a backward edge of a path in a network with a given flow?
20. (Incremental network) Sketch the network in Fig. 498, and on each edge (i, j) write cᵢⱼ − fᵢⱼ and fᵢⱼ. Do you recognize that from this "incremental network" one can more easily see flow augmenting paths?

23.7  Maximum Flow: Ford-Fulkerson Algorithm
Flow augmenting paths, as discussed in the last section, are used as the basic tool in the Ford-Fulkerson⁴ algorithm in Table 23.8, in which a given flow (for instance, zero flow in all edges) is increased until it is maximum. The algorithm accomplishes the increase by a stepwise construction of flow augmenting paths, one at a time, until no further such paths can be constructed, which happens precisely when the flow is maximum.

In Step 1, an initial flow may be given. In Step 3, a vertex j can be labeled if there is an edge (i, j) with i labeled and

    cᵢⱼ > fᵢⱼ    ("forward edge")

or if there is an edge (j, i) with i labeled and

    fⱼᵢ > 0    ("backward edge").
4LESTER RANDOLPH FORD (horn 1927) and DELBERT RAY FULKERSON (1924-1976), American mathematicians known for their pioneering work on flow algorithms.
980
CHAP. 23
Graphs. Combinatorial Optimization
Table 23.8  Ford-Fulkerson Algorithm for Maximum Flow
Canadian Journal of Mathematics 9 (1957), 210-218.

ALGORITHM FORD-FULKERSON [G = (V, E), vertices 1 (= s), ..., n (= t), edges (i, j), cᵢⱼ]

This algorithm computes the maximum flow in a network G with source s, sink t, and capacities cᵢⱼ > 0 of the edges (i, j).

INPUT: n, s = 1, t = n, edges (i, j) of G, cᵢⱼ
OUTPUT: Maximum flow f in G

1. Assign an initial flow fᵢⱼ (for instance, fᵢⱼ = 0 for all edges), compute f.

2. Label s by ∅. Mark the other vertices "unlabeled."

3. Find a labeled vertex i that has not yet been scanned. Scan i as follows. For every unlabeled adjacent vertex j, if cᵢⱼ > fᵢⱼ, compute

      Δᵢⱼ = cᵢⱼ − fᵢⱼ   and   Δⱼ = Δᵢⱼ if i = 1,  Δⱼ = min (Δᵢ, Δᵢⱼ) if i > 1

   and label j with a "forward label" (i⁺, Δⱼ); or if fⱼᵢ > 0, compute

      Δⱼ = min (Δᵢ, fⱼᵢ)

   and label j by a "backward label" (i⁻, Δⱼ).
   If no such j exists then OUTPUT f. Stop
      [f is the maximum flow.]
   Else continue (that is, go to Step 4).

4. Repeat Step 3 until t is reached.
   [This gives a flow augmenting path P: s → t.]
   If it is impossible to reach t then OUTPUT f. Stop
      [f is the maximum flow.]
   Else continue (that is, go to Step 5).

5. Backtrack the path P, using the labels.

6. Using P, augment the existing flow by Δₜ. Set f = f + Δₜ.

7. Remove all labels from vertices 2, ..., n. Go to Step 3.

End FORD-FULKERSON
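Below is a minimal Python sketch of Table 23.8 with the Edmonds-Karp BFS scanning order mentioned above; it is an illustrative rendering, not the book's code. The capacities are those of Example 1 (Fig. 499), except that cap (3, 5) = 4 is an assumption (its value does not affect the maximum flow, since the minimum cut consists of (2, 3) and (5, 6)).

    from collections import deque

    def max_flow(n, cap):
        """Vertices 1 (= s), ..., n (= t); cap[(i, j)] = c_ij."""
        flow = {e: 0 for e in cap}               # Step 1: start from the zero flow
        f = 0
        while True:
            label = {1: None}                    # Step 2: label s, others unlabeled
            queue = deque([1])
            while queue and n not in label:      # Steps 3-4: scan in BFS order
                i = queue.popleft()
                for a, b in cap:
                    if a == i and b not in label and cap[a, b] > flow[a, b]:
                        label[b] = (i, +1)       # forward label (i+)
                        queue.append(b)
                    elif b == i and a not in label and flow[a, b] > 0:
                        label[a] = (i, -1)       # backward label (i-)
                        queue.append(a)
            if n not in label:
                return f, flow                   # t not reached: f is maximum
            path, j = [], n                      # Step 5: backtrack P using the labels
            while j != 1:
                i, sign = label[j]
                path.append(((i, j) if sign > 0 else (j, i), sign))
                j = i
            delta = min(cap[e] - flow[e] if s > 0 else flow[e] for e, s in path)
            for e, s in path:                    # Step 6: augment by Delta_t
                flow[e] += s * delta
            f += delta                           # Step 7: labels are rebuilt above

    cap = {(1, 2): 20, (1, 4): 10, (2, 3): 11, (3, 5): 4, (3, 6): 13,
           (4, 5): 7, (5, 2): 3, (5, 6): 3}
    print(max_flow(6, cap)[0])   # 14, as in Example 1 below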
EXAMPLE 1  Ford-Fulkerson Algorithm

Applying the Ford-Fulkerson algorithm, determine the maximum flow for the network in Fig. 499 (which is the same as that in Example 1, Sec. 23.6, so that we can compare).
Solution.  The algorithm proceeds as follows.

1. An initial flow f = 9 is given.

2. Label s (= 1) by ∅. Mark 2, 3, 4, 5, 6 "unlabeled."
Fig. 499.  Network in Example 1 with capacities (first numbers) and given flow
3. Scan 1. Compute Δ₁₂ = 20 − 5 = 15 = Δ₂. Label 2 by (1⁺, 15). Compute Δ₁₄ = 10 − 4 = 6 = Δ₄. Label 4 by (1⁺, 6).

4. Scan 2. Compute Δ₂₃ = 11 − 8 = 3, Δ₃ = min (Δ₂, 3) = 3. Label 3 by (2⁺, 3). Compute Δ₅ = min (Δ₂, 3) = 3. Label 5 by (2⁻, 3).
   Scan 3. Compute Δ₃₆ = 13 − 6 = 7, Δ₆ = Δₜ = min (Δ₃, 7) = 3. Label 6 by (3⁺, 3).

5. P: 1 − 2 − 3 − 6 (= t) is a flow augmenting path.

6. Δₜ = 3. Augmentation gives f₁₂ = 8, f₂₃ = 11, f₃₆ = 9, other fᵢⱼ unchanged. Augmented flow f = 9 + 3 = 12.

7. Remove labels on vertices 2, ..., 6. Go to Step 3.

3. Scan 1. Compute Δ₁₂ = 20 − 8 = 12 = Δ₂. Label 2 by (1⁺, 12). Compute Δ₁₄ = 10 − 4 = 6 = Δ₄. Label 4 by (1⁺, 6).

4. Scan 2. Compute Δ₅ = min (Δ₂, 3) = 3. Label 5 by (2⁻, 3).
   Scan 4. [No vertex left for labeling.]
   Scan 5. Compute Δ₃ = min (Δ₅, 2) = 2. Label 3 by (5⁻, 2).
   Scan 3. Compute Δ₃₆ = 13 − 9 = 4, Δ₆ = min (Δ₃, 4) = 2. Label 6 by (3⁺, 2).

5. P: 1 − 2 − 5 − 3 − 6 (= t) is a flow augmenting path.

6. Δₜ = 2. Augmentation gives f₁₂ = 10, f₅₂ = 1, f₃₅ = 0, f₃₆ = 11, other fᵢⱼ unchanged. Augmented flow f = 12 + 2 = 14.

7. Remove labels on vertices 2, ..., 6. Go to Step 3.

One can now scan 1 and then scan 2, as before, but in scanning 4 and then 5 one finds that no vertex is left for labeling. Thus one can no longer reach t. Hence the flow obtained (Fig. 500) is maximum, in agreement with our result in the last section.
Fig. 500.  Maximum flow in Example 1
1. Do the computations indicated near the end of Example 1 in detail.
2. Solve Example 1 by Ford-Fulkerson with initial flow 0. Is it more work than in Example 1?
3. Which are the "bottleneck" edges by which the flow in Example 1 is actually limited? Hence which capacities could be decreased without decreasing the maximum flow?

4-7  MAXIMUM FLOW
Find the maximum flow by Ford-Fulkerson:
4. In Prob. 2, Sec. 23.6.
5. In Prob. 1, Sec. 23.6.
6. In Prob. 4, Sec. 23.6.
7. In Prob. 3, Sec. 23.6.

8. What is the (simple) reason that Kirchhoff's law is preserved in augmenting a flow by the use of a flow augmenting path?
9. How does Ford-Fulkerson prevent the formation of cycles?
10. How can you see that Ford-Fulkerson follows a BFS technique?
11. Are the consecutive flow augmenting paths produced by Ford-Fulkerson unique?
12. (Integer flow theorem) Prove that if the capacities in a network G are integers, then a maximum flow exists and is an integer.
13. CAS PROBLEM. Ford-Fulkerson. Write a program and apply it to Probs. 4-7.
14. If the Ford-Fulkerson algorithm stops without reaching t, show that the edges with one end labeled and the other end unlabeled form a cut set (S, T) whose capacity equals the maximum flow.
15. (Several sources and sinks) If a network has several sources s₁, ..., sₖ, show that it can be reduced to the case of a single-source network by introducing a new vertex s and connecting s to s₁, ..., sₖ by k edges of capacity ∞. Similarly if there are several sinks. Illustrate this idea by a network with two sources and two sinks.
16. Find the maximum flow in the network in Fig. 501 with two sources (factories) and two sinks (consumers).
17. Find a minimum cut set in Fig. 499 and its capacity.
18. Show that in a network G with all cᵢⱼ = 1, the maximum flow equals the number of edge-disjoint paths s → t.
19. In Prob. 17, the cut set contains precisely all forward edges used to capacity by the maximum flow (Fig. 500). Is this just by chance?
20. Show that in a network G with capacities all equal to 1, the capacity of a minimum cut set (S, T) equals the minimum number q of edges whose deletion destroys all directed paths s → t. (A directed path v → w is a path in which each edge has the direction in which it is traversed in going from v to w.)
Fig. 501.  Problem 16

23.8  Bipartite Graphs. Assignment Problems
From digraphs we return to graphs and discuss another important class of combinatorial optimization problems that arises in assignment problems of workers to jobs, jobs to machines, goods to storage, ships to piers, classes to classrooms, exams to time periods, and so on. To explain the problem, we need the following concepts.

A bipartite graph G = (V, E) is a graph in which the vertex set V is partitioned into two sets S and T (without common elements, by the definition of a partition) such that every edge of G has one end in S and the other in T. Hence there are no edges in G that have both ends in S or both ends in T. Such a graph G = (V, E) is also written G = (S, T; E).

Figure 502 shows an illustration. V consists of seven elements, three workers a, b, c, making up the set S, and four jobs 1, 2, 3, 4, making up the set T. The edges indicate that worker a can do the jobs 1 and 2, worker b the jobs 1, 2, 3, and worker c the job 4. The problem is to assign one job to each worker so that every worker gets one job to do. This suggests the next concept, as follows.
DEFINITION  Maximum Cardinality Matching
A matching in G = (S, T; E) is a set M of edges of G such that no two of them have a vertex in common. If M consists of the greatest possible number of edges, we call it a maximum cardinality matching in G.
For instance, a matching in Fig. 502 is M₁ = {(a, 2), (b, 1)}. Another is M₂ = {(a, 1), (b, 3), (c, 4)}; obviously, this is of maximum cardinality.
Fig. 502.  Bipartite graph in the assignment of a set S = {a, b, c} of workers to a set T = {1, 2, 3, 4} of jobs
A vertex v is exposed (or not covered) by a matching M if v is not an endpoint of an edge of M. This concept, which always refers to some matching, will be of interest when we begin to augment given matchings (below). If a matching leaves no vertex exposed, we call it a complete matching. Obviously, a complete matching can exist only if S and T consist of the same number of vertices.

We now want to show how one can stepwise increase the cardinality of a matching M until it becomes maximum. Central in this task is the concept of an augmenting path. An alternating path is a path that consists alternately of edges in M and not in M (Fig. 503A). An augmenting path is an alternating path both of whose endpoints (a and b in Fig. 503B) are exposed. By dropping from the matching M the edges that are on an augmenting path P (two edges in Fig. 503B) and adding to M the other edges of P (three in the figure), we get a new matching, with one more edge than M. This is how we use an augmenting path in augmenting a given matching by one edge. We assert that this will always lead, after a number of steps, to a maximum cardinality matching. Indeed, the basic role of augmenting paths is expressed in the following theorem.
Fig. 503.  Alternating and augmenting paths. Heavy edges are those belonging to a matching M. (A) Alternating path; (B) augmenting path P.
THEOREM 1  Augmenting Path Theorem for Bipartite Matching

A matching M in a bipartite graph G = (S, T; E) is of maximum cardinality if and only if there does not exist an augmenting path P with respect to M.

PROOF
(a) We show that if such a path P exists, then M is not of maximum cardinality. Let P have q edges belonging to M. Then P has q + 1 edges not belonging to M. (In Fig. 503B we have q = 2.) The endpoints a and b of P are exposed, and all the other vertices on P are endpoints of edges in M, by the definition of an alternating path. Hence if an edge of M is not an edge of P, it cannot have an endpoint on P since then M would not be a matching. Consequently, the edges of M not on P, together with the q + 1 edges of P not belonging to M, form a matching of cardinality one more than the cardinality of M because we omitted q edges from M and added q + 1 instead. Hence M cannot be of maximum cardinality.

(b) We now show that if there is no augmenting path for M, then M is of maximum cardinality. Let M* be a maximum cardinality matching and consider the graph H consisting of all edges that belong either to M or to M*, but not to both. Then it is possible that two edges of H have a vertex in common, but three edges cannot have a vertex in common since then two of the three would have to belong to M (or to M*), violating that M and M* are matchings. So every v in V can be in common with two edges of H or with one or none. Hence we can characterize each "component" (= maximal connected subset) of H as follows.

(A) A component of H can be a closed path with an even number of edges (in the case of an odd number, two edges from M or two from M* would meet, violating the matching property). See (A) in Fig. 504.

(B) A component of H can be an open path P with the same number of edges from M and edges from M*, for the following reason. P must be alternating, that is, an edge of M is followed by an edge of M*, etc. (since M and M* are matchings). Now if P had an edge more from M*, then P would be augmenting for M [see (B2) in Fig. 504], contradicting our assumption that there is no augmenting path for M. If P had an edge more from M, it would be augmenting for M* [see (B3) in Fig. 504], violating the maximum cardinality of M*, by part (a) of this proof.

Hence in each component of H, the two matchings have the same number of edges. Adding to this the number of edges that belong to both M and M* (which we left aside when we made up H), we conclude that M and M* must have the same number of edges. Since M* is of maximum cardinality, this shows that the same holds for M, as we wanted to prove. ∎
Fig. 504.  Proof of the augmenting path theorem for bipartite matching
[Solid: edge from M; dashed: edge from M*. (A) a closed path (possible); (B1) an open path; (B2) augmenting for M; (B3) augmenting for M*]
This theorem suggests the algorithm in Table 23.9 for obtaining augmenting paths, in which vertices are labeled for the purpose of backtracking paths. Such a label is in addition to the number of the vertex, which is also retained. Clearly, to get an augmenting path, one must start from an exposed vertex, and then trace an alternating path until one arrives at another exposed vertex. After Step 3 all vertices in S are labeled. In Step 4, the set T contains at least one exposed vertex, since otherwise we would have stopped at Step 1.
Table 23.9  Bipartite Maximum Cardinality Matching

ALGORITHM MATCHING [G = (S, T; E), M, n]

This algorithm determines a maximum cardinality matching M in a bipartite graph G by augmenting a given matching in G.

INPUT: Bipartite graph G = (S, T; E) with vertices 1, ..., n, matching M in G (for instance, M = ∅)
OUTPUT: Maximum cardinality matching M in G

1. If there is no exposed vertex in S then
      OUTPUT M. Stop
      [M is of maximum cardinality in G.]
   Else label all exposed vertices in S with ∅.

2. For each i in S and edge (i, j) not in M, label j with i, unless already labeled.

3. For each nonexposed j in T, label i with j, where i is the other end of the unique edge (i, j) in M.

4. Backtrack the alternating paths P ending on an exposed vertex in T by using the labels on the vertices.

5. If no P in Step 4 is augmenting then
      OUTPUT M. Stop
      [M is of maximum cardinality in G.]
   Else augment M by using an augmenting path P.
   Remove all labels. Go to Step 1.

End MATCHING
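For comparison, here is a compact Python sketch of the augmenting-path principle of Theorem 1 (an illustration, not the book's code). It uses a recursive depth-first search for an augmenting path from each exposed vertex of S instead of the explicit labeling and backtracking of Table 23.9, but the matching grows by exactly the same augmentation step.

    def max_matching(S, adj):
        """adj[i] = vertices of T adjacent to i in S; returns a maximum cardinality matching."""
        mate = {}                           # mate[j] = i for every matched edge (i, j)

        def augment(i, seen):
            for j in adj[i]:                # follow an edge (i, j) not in M
                if j in seen:
                    continue
                seen.add(j)
                # j is exposed, or the mate of j can be re-matched along another path:
                if j not in mate or augment(mate[j], seen):
                    mate[j] = i             # flip the edges along the augmenting path
                    return True
            return False

        for i in S:
            augment(i, set())               # one augmentation attempt per vertex of S
        return [(i, j) for j, i in sorted(mate.items())]

    # Workers a, b, c and jobs 1-4 as in Fig. 502:
    adj = {"a": [1, 2], "b": [1, 2, 3], "c": [4]}
    print(max_matching(["a", "b", "c"], adj))   # [('b', 1), ('a', 2), ('c', 4)], 3 edges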
EXAMPLE 1  Maximum Cardinality Matching

Is the matching M₁ in Fig. 505a of maximum cardinality? If not, augment it until maximum cardinality is reached.
Fig. 505.  Example 1. (a) Given graph and matching M₁; (b) matching M₂ and new labels
Solution.  We apply the algorithm.

1. Label 1 and 4 with ∅.
2. Label 7 with 1. Label 5, 6, 8 with 3.
3. Label 2 with 6, and 3 with 7. [All vertices are now labeled as shown in Fig. 505a.]
4. P₁: 1 − 7 − 3 − 5. [By backtracking, P₁ is augmenting.] P₂: 1 − 7 − 3 − 8. [P₂ is augmenting.]
5. Augment M₁ by using P₁, dropping (3, 7) from M₁ and including (1, 7) and (3, 5). Remove all labels. Go to Step 1.

Figure 505b shows the resulting matching M₂ = {(1, 7), (2, 6), (3, 5)}.

1. Label 4 with ∅.
2. Label 7 with 2. Label 6 and 8 with 3.
3. Label 1 with 7, and 2 with 6, and 3 with 5.
4. P₃: 5 − 3 − 8. [P₃ is alternating but not augmenting.]
5. Stop. M₂ is of maximum cardinality (namely, 3).
1-6  BIPARTITE OR NOT?
Are the following graphs bipartite? If your answer is yes, find S and T.
1.-6. (graphs shown in the figures)

7. Can you obtain the answer to Prob. 3 from that to Prob. 1?

8-10  AUGMENTING PATHS
Find an augmenting path:
8.-10. (graphs shown in the figures)

11-13  MAXIMUM CARDINALITY MATCHING
Augmenting the given matching, find a maximum cardinality matching:
11. In Prob. 9.   12. In Prob. 8.   13. In Prob. 10.
14. (Scheduling and matching) Three teachers x₁, x₂, x₃ teach four classes y₁, y₂, y₃, y₄ for these numbers of periods:
        y₁   y₂   y₃   y₄
x₁       1    0    1    1
x₂       0    1    1    1
x₃       0    1    1    1
Show that this arrangement can be represented by a bipartite graph G and that a teaching schedule for one period corresponds to a matching in G. Set up a teaching schedule with the smallest possible number of periods.
15. (Vertex coloring and exam scheduling) What is the smallest number of exam periods for six subjects a, b, c, d, e, f if some of the students simultaneously take a, b, f, some c, d, e, some a, c, e, and some c, e? Solve this as follows. Sketch a graph with six vertices a, ..., f and join vertices if they represent subjects simultaneously taken by some students. Color the vertices so that adjacent vertices receive different colors. (Use numbers 1, 2, ... instead of actual colors if you want.) What is the minimum number of colors you need? For any graph G, this minimum number is called the (vertex) chromatic number χᵥ(G). Why is this the answer to the problem? Write down a possible schedule.
16. How many colors do you need in vertex coloring the graph in Prob. 5?
17. Show that all trees can be vertex colored with two colors.
18. (Harbor management) How many piers does a harbor master need for accommodating six cruise ships S₁, ..., S₆ with expected dates of arrival A and departure D in July, (A, D) = (10, 13), (13, 15), (14, 17), (12, 15), (16, 18), (14, 17), respectively, if each pier can accommodate only one ship, arrival being at 6 a.m. and departures at 11 p.m.? Hint: Join Sᵢ and Sⱼ by an edge if their intervals overlap. Then color vertices.
19. What would be the answer to Prob. 18 if only the five ships S₁, ..., S₅ had to be accommodated?
20. (Complete bipartite graphs) A bipartite graph G = (S, T; E) is called complete if every vertex in S is joined to every vertex in T by an edge, and is denoted by K_{n₁,n₂}, where n₁ and n₂ are the numbers of vertices in S and T, respectively. How many edges does this graph have?
21. (Planar graph) A planar graph is a graph that can be drawn on a sheet of paper so that no two edges cross. Show that the complete graph K₄ with four vertices is planar. The complete graph K₅ with five vertices is not planar. Make this plausible by attempting to draw K₅ so that no edges cross. Interpret the result in terms of a net of roads between five cities.
22. (Bipartite graph K₃,₃ not planar) Three factories 1, 2, 3 are each supplied underground by water, gas, and electricity, from points A, B, C, respectively. Show that this can be represented by K₃,₃ (the complete bipartite graph G = (S, T; E) with S and T consisting of three vertices each) and that eight of the nine supply lines (edges) can be laid out without crossing. Make it plausible that K₃,₃ is not planar by attempting to draw the ninth line without crossing the others.
23. (Four- (vertex) color theorem) The famous four-color theorem states that one can color the vertices of any planar graph (so that adjacent vertices get different colors) with at most four colors. It had been conjectured for a long time and was eventually proved in 1976 by Appel and Haken [Illinois J. Math 21 (1977), 429-567]. Can you color the complete graph K₅ with four colors? Does the result contradict the four-color theorem? (For more details, see Ref. [F8] in App. 1.)
24. (Edge coloring) The edge chromatic number χₑ(G) of a graph G is the minimum number of colors needed for coloring the edges of G so that incident edges get different colors. Clearly, χₑ(G) ≥ max d(u), where d(u) is the degree of vertex u. If G = (S, T; E) is bipartite, the equality sign holds. Prove this for K_{n,n}.
25. Vizing's theorem states that for any graph G (without multiple edges!), max d(u) ≤ χₑ(G) ≤ max d(u) + 1. Give an example of a graph for which χₑ(G) does exceed max d(u).
Chapter 23 Review Questions and Problems
1. What is a graph? A digraph? A tree? A cycle? A path?
2. State from memory how you can handle graphs and digraphs on computers.
3. Describe situations and problems that can be modeled using graphs or digraphs.
4. What is a shortest path problem? Give applications.
5. What is BFS? DFS? In what connection did these concepts occur?
6. Give some applications in which spanning trees play a role.
7. What are bipartite graphs? What applications motivate this concept?
8. What is the traveling salesman problem?
9. What is a network? What optimization problems are connected with it?
10. Can a forward edge in one path be a backward edge in another path? In a cut set? Explain.
11. There is a famous theorem on cut sets. Can you remember and explain it?
12-17  MATRICES FOR GRAPHS OR DIGRAPHS
Find the adjacency matrix of:
12.-17. (graphs and digraphs shown in the figures)

21. Make a vertex incidence list of the digraph in Prob. 13.
22. Make a vertex incidence list of the digraph in Prob. 14.

23-28  SHORTEST PATHS
Find a shortest path and its length by Moore's BFS algorithm, assuming that all the edges have length 1:
23.-25. (graphs shown in the figures)
Find shortest paths by Dijkstra's algorithm:
26.-28. (graphs shown in the figures)

29. (Shortest spanning tree) Find a shortest spanning tree for the graph in Prob. 26.
30. Find a shortest spanning tree in Prob. 27.
31. Cayley's theorem states that the number of spanning trees in a complete graph with n vertices is n^(n−2). Verify this for n = 2, 3, 4.
32. Show that O(m³) + O(m²) = O(m³).

33-34  MAXIMUM FLOW
Find the maximum flow, where the given numbers are capacities:
33.-34. (networks shown in the figures)

35. Company A has offices in Chicago, Los Angeles, and New York; Company B in Boston and New York; Company C in Chicago, Dallas, and Los Angeles. Represent this by a bipartite graph.
36. (Maximum cardinality matching) Augmenting the given matching, find a maximum cardinality matching: (graph shown in the figure)
Summary of Chapter 23
Graphs and Combinatorial Optimization

Combinatorial optimization concerns optimization problems of a discrete or combinatorial structure. It uses graphs and digraphs (Sec. 23.1) as basic tools. A graph G = (V, E) consists of a set V of vertices v₁, v₂, ..., vₙ (often simply denoted by 1, 2, ..., n) and a set E of edges e₁, e₂, ..., eₘ, each of which connects two vertices. We also write (i, j) for an edge with vertices i and j as endpoints. A digraph (= directed graph) is a graph in which each edge has a direction (indicated by an arrow). For handling graphs and digraphs in computers, one can use matrices or lists (Sec. 23.1).

This chapter is devoted to important classes of optimization problems for graphs and digraphs that all arise from practical applications, and corresponding algorithms, as follows.

In a shortest path problem (Sec. 23.2) we determine a path of minimum length (consisting of edges) from a vertex s to a vertex t in a graph whose edges (i, j) have a "length" lᵢⱼ > 0, which may be an actual length or a travel time or cost or an electrical resistance [if (i, j) is a wire in a net], and so on. Dijkstra's algorithm (Sec. 23.3) or, when all lᵢⱼ = 1, Moore's algorithm (Sec. 23.2) are suitable for these problems.

A tree is a graph that is connected and has no cycles (no closed paths). Trees are very important in practice. A spanning tree in a graph G is a tree containing all the vertices of G. If the edges of G have lengths, we can determine a shortest spanning tree, for which the sum of the lengths of all its edges is minimum, by Kruskal's algorithm or Prim's algorithm (Secs. 23.4, 23.5).

A network (Sec. 23.6) is a digraph in which each edge (i, j) has a capacity cᵢⱼ > 0 [= maximum possible flow along (i, j)] and at one vertex, the source s, a flow is produced that flows along the edges to a vertex t, the sink or target, where the flow disappears. The problem is to maximize the flow, for instance, by applying the Ford-Fulkerson algorithm (Sec. 23.7), which uses flow augmenting paths (Sec. 23.6). Another related concept is that of a cut set, as defined in Sec. 23.6.

A bipartite graph G = (V, E) (Sec. 23.8) is a graph whose vertex set V consists of two parts S and T such that every edge of G has one end in S and the other in T, so that there are no edges connecting vertices in S or vertices in T. A matching in G is a set of edges, no two of which have an endpoint in common. The problem then is to find a maximum cardinality matching in G, that is, a matching M that has a maximum number of edges. For an algorithm, see Sec. 23.8.
PART G  Probability, Statistics
CHAPTER 24  Data Analysis. Probability Theory
CHAPTER 25  Mathematical Statistics

Probability theory (Chap. 24) provides models of probability distributions (theoretical models of the observable reality involving chance effects) to be tested by statistical methods, and it will also supply the mathematical foundation of these methods in Chap. 25.

Modern mathematical statistics (Chap. 25) has various engineering applications, for instance, in testing materials, control of production processes, quality control of production outputs, performance tests of systems, robotics, and automatization in general, production planning, marketing analysis, and so on. To this we could add a long list of fields of applications, for instance, in agriculture, biology, computer science, demography, economics, geography, management of natural resources, medicine, meteorology, politics, psychology, sociology, traffic control, urban planning, etc. Although these applications are very heterogeneous, we shall see that most statistical methods are universal in the sense that each of them can be applied in various fields.
Additional Software for Probability and Statistics
See also the list of software at the beginning of Part E on Numerical Analysis.

DATA DESK. Data Description, Inc., Ithaca, NY. Phone 1-800-573-5121 or (607) 257-1000, website at www.datadescription.com.
MINITAB. Minitab, Inc., College Park, PA. Phone 1-800-448-3555 or (814) 238-3280, website at www.minitab.com.
SAS. SAS Institute, Inc., Cary, NC. Phone 1-800-727-0025 or (919) 677-8000, website at www.sas.com.
S-PLUS. Insightful Corporation, Inc., Seattle, WA. Phone 1-800-569-0123 or (206) 283-8802, website at www.insightful.com.
SPSS. SPSS, Inc., Chicago, IL. Phone 1-800-543-2185 or (312) 651-3000, website at www.spss.com.
STATISTICA. StatSoft, Inc., Tulsa, OK. Phone (918) 749-1119, website at www.statsoft.com.
24
CHAPTER
".
Data Analysis. Probability Theory We first show how to handle data numelically or in terms of graphs, and how to extract information (average size. spread of data, etc.) from them. If these data are influenced by "chance," by factors whose effect we cannot predict exactly (e.g., weather Jata, stock prices, lifespans of tires, etc.), we have to rely on probability theory. This theory originated in games of chance, such as flipping coins, rolling dice, or playing cards. Nowadays it gives mathematical models of chance processes called random experiments or, briefly, experiments. In such an experiment we observe a random variable X, that is, a function whose values in a trial (a pelfOimance of an experiment) occur "by chance" (Sec. 24.3) according to a probability distribution that gives the individual probabilities with which possible values of X may occur in tlle long run. (Example: Each of the six faces of a die should occur witll the same probability. l/6.) Or we may simultaneously observe more than one random variable, for instance. height alld weight of persons or hardness alld tensile strength of steel. This is discussed in Sec. 24.9, which will abo give the basis for tlle mathematical justification of the statistical methods in Chap. 25.
Prereqllisite: Calculus. References and Answers to Problems: App. L Part G, App. 2.
24.1
Data Representation.
Average.
Spread
Data can be represented numelically or graphically in various ways. For instance, your daily newspaper may contain tables of stock plices and money exchange rates, curves or bar charts illustrating economical or political developments, or pie charts showing how your tax dollar is spent. And there are numerous other representations of data for special purposes. In this section we discuss the use of standard representations of data in statistics. (For these, software packages, such as DATA DESK and MINITAB, are available, and Maple or Mathematica may also be helpful; see pp. 778 and 991) We explain corresponding concepts and methods in terms of typical examples, beginning with (1)
89
84
87
81
89
86
91
90
78
89
87
99
83
89.
These are 11 = 14 measurements of the tensile strength of sheet steel in kg/mm2, recorded in the order obtained and rounded to integer values. To see what is going on, we sort these data, that is, we order them by size, (2)
78
81
83
84
86
87
87
89
89
89
89
90
91
99.
S0l1ing is a standard process on the computer; see Ref. [E25], listed in App. 1. 993
994
CHAP. 24
Data Analysis. Probability Theory
Graphic Representation of Data We shall now discuss standard graphic representations used in statistics for obtaining information on properties of data.
Stem-and-Leaf Plot This is one of the simplesl but most useful representations of data. For (I) it is shown in Fig. 506. The numbers in (1) range from 78 to 99; see (2). We divide these numbers into 5 groups, 75-79, 80-84, 85-89, 90-94, 95-99. The integers in the tens position of the groups are 7,8,8,9,9. These form the stem in Fig. 506. The first lealis 8 (representing 78). The second leaf is 134 (representing 81, 83, 84), and so on. The number of times a value occurs is called its absolute frequency. Thus 78 has absolute frequency 1, the value 89 has absolute frequency 4, etc. The column to the extreme left in Fig. 506 shows the cumulative absolute frequencies, that is, the sum of the absolute frequencies of the values up to the line of the leaf. Thus, the number 4 in the second line on the left shows that (1) has 4 values up to and including 84. The number 11 in the next line shows that there are II values not exceeding 89, etc. Dividing the cumulative absolute frequencies by 11 (= 14 in Fig. 506) gives the cumulative relative frequencies.
Histogram For large sets of data, histograms are better in displaying the distribution of data than stem-and-leaf plots. The principle is explained in Fig. 507. (An application to a larger data set is shown in Sec. 25.7). The bases of the rectangles in Fig. 507 are the x-intervals (known as class intervals) 74.5-79.5, 79.5-84.5, 84.5-89.5, 89.5-94.5, 94.5-99.5, whose midpoints (known as class marks) are x = 77, 82, 87. 92, 97, respectively. The height of a rectangle with class mark x is the relative class frequency frel(X), defined as the number of data values in that class interval, divided by 1l (= 14 in our case). Hence the areas of the rectangles are proportional to these relative frequencies, so that histograms give a good impression of the distribution of data.
Center and Spread of Data: Median, Quartiles As a center of the location of data values we can simply take the median, the data value that falls in the middle when the values are ordered. In (2) we have 14 values. The seventh of them is 87, the eighth is 89, and we split the difference, obtaining the median 88. (In general, we would get a fraction.) The spread (vmiabil ity) of the data values can be measured by the range R = Xmax - xmin' the largesl minus the smallest data values, R = 99 - 78 = 21 in (2).
0.5
Leaf unit = 1.0 1 4 11 13 14
7 8 8 9 9
8 134 6779999 01 9
Fig. 506. Stem-and-Ieaf plot of the data in (I) and (2)
0.4
0.3
0.2 0.1
o
x
Fig. 507. Histogram of the data in (I) and (2) (grouped as in Fig. S06)
SEC. 24.1
Data Representation.
Average.
995
Spread
Better information gives the interquartile range IQR = qu - qv Here the upper quartile qu is the middle value among the data values above the median. The lower quartile qL is the middle value among the data values below the median. Thus in (2) we have qu = 89 (the fourth value from the end), qL = 84 (the fOlllth value from the beginning). and IQR = 89 - 84 = 5. The median is also called the middle quartile and is denoted by qM. The rule of "splitting the difference" (just applied to the middle quartile) is equally well used for the other quartiles if necessary.
Boxplot The boxplot of (I) in Fig. 508 is obtained from the five numbers xmin, qv qM, qu, Xmax just determined. The box extends from qL to quo Hence it has the height IQR. The position of the median in the box shows that the data distribution is not symmetric. The two lines extend from the box to Xmin below and to Xmax above. Hence they mark the range R. Boxplots are particularly suitable for making comparisons. For example, Fig. 508 shows boxplots of the data sets (I) and (3)
91
89
93
91
87
94
92
85
91
90
96
93
89
91
92
93
93
94
96
(consisting of 11 = 13 values). Ordeling gives
(4)
85
87
89
89
90
91
91
(tensile strength, as before). From the plot we immediately see that the box of (3) is shorter than the box of (I) (indicating the higher quality of the steel sheets!) and that qM is located in the middle of the box (showing the more symmetric form of the distribution). Finally, '"max is closer to qu for (3) than it is for (l), a fact that we shall discuss later. For plotting the box of (3) we took from (4) the values xmin = 85, qL = 89, qM = 91, qu = 93, Xmax = 96.
Outliers An outlier is a value that appears to be uniquely different from the rest of the data set. It might indicate that something went wrong with the data collection process. In connection with qumtiles an outlier is conventionally defined as a value more than a distance of 1.5 IQR from either end of the box.
100 95 gqU qM
90
qu qM
85
I
qL
qL
80 75 Data set 0)
Fig. 508.
Data set (3)
Boxplots of data sets (1) and (3)
CHAP. 24
996
Data Analysis. Probability Theory
For the data in (1) we have lQR = 5, qL = 84, qu = 89. Hence outliers are smaller than 84 - 7.5 or larger than 89 + 7.5, so that 99 is an outlier [see (2)]. The data (3) have no outliers, as you can readily verify.
Mean.
Standard Deviation.
Variance
Medians and quartiles are easily obtained by ordering and counting. practically without calculation. But they do not give full information on data: you can change data values to some extent without changing the median. Similarly for the qUaItiles. The average size of the data values can be measured in a more refined way by tlle mean
1 .t = -
(5)
n
2: Xj
11 j=l
I
= -
(xl
+
X2
+ ... +
X,.).
11
This is the aritllmetic mean of the data values, obtained by taking their sum and dividing by the data si::e 11. Thus in (I).
x=
l~ (89
+ 84 + ... + 89) =
6~1 = 87.3.
Every data value contributes, and changing one of them will change the mean. Similarly, the spread (variability) of tlle data values can be measured in a more refined way by the standard deviation s or by its square, the variance
(6) Thus, to obtain the variance of the data, take the difference .\1 - .r of each data value fi·om the mean, square it, take the sum of these 1l squares, and divide it by /I - 1 (not 11, as we motivate in Sec. 25.2). To get the standard deviation s, take the square root of S2. For example, using.t = 61117, we get for the data (I) tlle variance S2
=
113
[(89 - 6~1)2 + (84 -
6i l )2 + ... + (89
-
6i l )21
=
1~6 =
25.14.
Hence the standard deviation is s = V176/7 = 5.014. Note that the standard deviation has the same dimension as the data values (kg/mm2, see at the beginning), which is an advantage. On the other hand, the variance is preferable to the standard deviation in developing statistical metl1Ods. as we shall see in Chap. 25. CAUTION! Your CAS (Maple, for instance) may use lin instead of 1/(n - 1) in (6), but the latter is better when 11 is small (see Sec. 25.2).
--.... -_...........--- --....--................ .•.
'1_1
~
°1
DATA REPRESENTATIONS
Represent the data by a stem-and-leaf plot, a histogram, and a boxplot:
1. 20 2. 7
21 6
20 4
0
19 7
20 1
2
19
21
4
6
19
6
3.56 58 54 33 38 38 49 39 4. 12.1 10 12.4 14.7 9.9 5. 70.6 70.9 69.1 71.1 68.9 70.3
41 10.5
30
44
9.2
37 17.2
51
46
56
11.4
11.8
71.3 70.5 69.7 7l.5 69.2 71.2 70.4 72.8
69.8
SEC. 24.2
997
Experiments, Outcomes, Events
6. -0.52 0.11 -0.48 0.94 0.24 -0.19 -0.55 7. Reaction time [sec] of an automatic switch 2.3 2.2 2.4 1.5 2.3 2.3 2.4 2.1 2.5 2.4 2.6 1.3 2.5 2.1 2.4 2.2 2.3 2.5 2.4 2.4 8. Carbon content ['k J of coal 89 87
90 86
89 82
84 85
!i8 89
80 76
90 87
89 86
88 86
90
!i5
9. Weight offilled bottles [g] in an automatic tilling process 403
399
398
401
400
401
401
10. Gasoline consumption [gallons per mile] of six cars of the same model 14.0
111-161
14.5
13.5
14.0
14.5
14.0
AVERAGE AND SPREAD
Find the mean and compare it with the median. Find the standard deviation and compare it with the interquartile range.
24.2
11. The data in Prob. I. 12. 13. 14. 15.
The data in The data in The data in The data in
16. 5
22
Prob. Prob. Prob. Prob.
7 23
2. 5. 6. 9.
6. Why is
Ix - qMI
so large?
17. Construct the simplest possible data with.t = 100 but qM = O. 18. (Mean) Prove that t must always lie between the smallest and the largest data values. 19. (Outlier, reduced data) Calculate s for the data 4 I 3 10 2. Then reduce the data by deleting the outlier and calculate s. Comment. 20. WRITING PROJECT. Average and Spread. Compare QM' IQR and .t, s, illustrating the advantages and disadvantages with examples of your own.
Experiments, Outcomes, Events We now turn to probability theory. This theory has the purpose of providing mathematical models of situations affected or even governed by "chance effects," for instance, in weather foreca~ting, life insurance, quality of technical products (computers. batteries, steel sheets, etc.). traffic problems, and, of course, games of chance with cards or dice. And the accuracy of these models can be tested by suitable observations or experiments-this is a main purpose of statistics to be explained in Chap. 25. We begin by defining some standard terms. An experiment is a process of measurement or observation, in a laboratory, in a factDlY, on the street, in nature, or wherever; so "experiment" is used in a rather general sense. Our interest is in expeliments that involve randomness, chance effects, so that we cannot predict a result exactly. A trial is a single performance of an experiment. Its result is called an outcome or a sample point. n trial" then give a sample of size 11 consisting of n sample points. The sample space S of an experiment is the set of all pussible outcumes.
E X AMP L E S 1 - 6
Random Experiments. Sample Spaces (1)
In'peeting a IightbuIb. S = {Defective. Nondefeetive}.
(2)
RoIling a die. S = {I. 2. 3.4,5. Ii}.
(3)
Measuring tensile strength of wire. S the numbers in some interval.
(4)
Measuring coppel' content of brass. S: 50% to 909<, say.
(5)
Counting daily traffic acciden(, in New York. S the integers in some interval.
(6) Asking for opinion about a new car model. S = (Like. Dislike. Undecided).
•
The subsets of S are called events and the outcomes simple events. E X AMP L E 7
Events
In (2). evcms are A = {l. 3, 5} events are {I}. {2} .... , {6}.
("Odd /lllmbe,.··), B =
{2. 4. 6}
("El'ell Illlmbe,."). C =
{5. 6}, etc. Simple
998
CHAP. 24
Data Analysis. Probability Theory
If in a trial an outcome a happens and a E A (a is an element of A), we say that A happens. For instance, if a die turns up a 3, the event A: Odd number happens. Similarly, if C in Example I happens (meaning 5 or 6 turns up), then D = {4, 5, 6} happens. Also note that S happens in each triaL meaning that some event of S always happens. All this is quite natural.
Unions, Intersections,
Complements of Events
In connection with basic probability laws we shall need the following concepts and facts about events (subsets) A, B, C, ... of a given sample space S. The union A U B of A and B consists of all points in A or B or both. The intersection A
n
B of A and B consists of all points that are in both A and B.
If A and B have no points in common. we write AnB=0
where 0 is the empty set (set with no elements) and we call A and B mutually exclusive (or disjoint) because in a trial the occurrence of A excludes thal of B (and conversely)if your die turns up an odd number, it cannot turn up an even number in the same trial. Similarly, a coin cannot turn up Head and Tail at the same time. Complement A C of A. This is the set of all the points of S Ilot in A. Thus, An A C
= 0,
AU N = S.
In Example 7 we have A C = B, hence A U A C = {l, 2, 3, 4, 5, 6} = S. Another notation for the complement of A is A (instead of A C ), but we shall not use this because in set theory A is used to denote the closure of A (not needed in our work). Unions and intersections of more events are defined similarly. The union
of events AI' ... , Am consists of all points that are in at least one Aj . Similarly for the union A I U A2 U ... of infinitely many subsets A b A 2, ... of an ififinite sample space S (that is, S consists of infinitely many points). The intersection
of AI> ... , Am consists of the points of S thar are in each of these events. Similarly for the intersection Al n A2 n ... of infinitely many subsets of S. Working with events can be illustrated and facilitated by Venn diagrams I for showing unions, intersections. and complements, as in Figs. 509 and 510, which are typical examples thal give the idea. E X AMP L E 8
Unions and Intersections of 3 Events In rolling a die, consider the events
A:
Number greater thall 3,
B:
Number less thall 6.
c:
Evell /lumber.
Then A n B = (4, 5), B n c = (2. 4). C n A = (4, 6), An B n c = {4). Can you sketch a Venn diagram of this? Furthermore. A U B = S. hence A U B U C = S (why?). •
I JOHN
VENN (1834-1923), English mathematician.
SEC. 24.2
999
Experiments, Outcomes, Events
s
s
Intersection A n B
UmonAuB
Fig. 509. Venn diagrams showing two events A and B in a sample space 5 and their union A U B (colored) and intersection A n B (colored)
Fig. 510. A
--
---- ........ -- .... .. ........... 11-91
_.......
=
Venn diagram for the experiment of rolling a die, showing 5, {1, 3, 5}, C = {5, 6}, A U C = {l, 3, 5, 6}, A n C = {5}
...... ----
.-.
~.-----
SAMPLE SPACES, EVENTS
Graph a sample space for the expeliment: 1. Tossing 2 coins 2. Drawing 4 screws from a lot of right-handed and left-handed screws 3. Rolling 2 dice
VENN DIAGRAMS
115-201
15. In connection with a trip to Europe by some students, consider the events P that they see Paris, G that they have a good time. and M that they run out of money. and describe in words the events 1, .. " 7 in the diagram. G
4. Tossing a coin until the fIrst Head appears 5. Rolling a die until the first "Si,," appears 6. Drawing bolts from a lot of 20, containing one defective D, until D is drawn. one at a time tmd assuming sampling without replacement. that is, bolts drawn are not returned to the lot 7. Recording the lifetime of each of 3 lightbulbs 8. Choosing a committee of 3 from a group of 5 people
Problem 15
9. Recording the daily maximum tempemture X and the maximum air pressure Y at some point in a city
16. Using Venn diagrams, graph and check the mles 10. In Prob. 3, circle and mark the events A: Equal faces, B: Sum exceeds 9, C: SUIll equals 7. 11. In rolling 2 dice, are the events A: SIIII1 divisible by 3 and B: Sum dil'isible by 5 mutually exclusive? 12. Answer the question in Prob. 11 for rolling 3 dice. 13. In Prob. 5 list the outcomes that make up the event E: First "Six" in rolling at most 3 times. Describe E C • 14. List all 8 subsets of the sample space S = (a, b, c}.
A U (B A
n
n
C) = (A U B)
(B U C) = (A
n
n
(A U C)
B) U (A
n
C)
17. (De Morgan's laws) Using Venn diagrams. graph and check De Morgan's laws (A U B)c = A C
(A
n
n
BC
B)c = A C U B C •
CHAP. 24
1000
Data Analysis. Probability Theory (A C)C
18. Using a Venn diagram. show that A <;;; B if and only if An B
=
A.
SC
A UN = S,
19. Show that, by the definition of complement, for any subset A of a sample space S,
24.3
= A,
= 0,
0 c = S, AnN = 0.
20. Using a Venn diagram, show that A <;;; B if and only if AU B = B.
Probability The "probability" of an event A in an experiment is supposed to measure how frequently A is about to occur if we make many trials. If we flip a coin, then heads H and tails T will appear about equally often-we say that Hand T are "equally likely." Similarly, for a regularly shaped die of homogeneous material ("fair die") each of the six outcomes 1, ... , 6 will be equally likely. These are examples of experiments in which the sample space S consists of finitely many outcomes (points) that for reasons of some symmetry can be regarded as equally likely. This suggests the following definition.
First Definition of Probability
DEFINITION 1
If the sample space S of an experiment consists of finitely many outcomes (points) that are equally likely, then the probability peA) of an event A is (1)
peA) =
Number of points in A Number of points in S
From this definition it follows immediately that. in particular,
(2)
E X AMP L E 1
peS)
=
1.
Fair Die In rolling a fair die once. what is the probability peA) of A of obtaining a 5 or a 6? The probabilIty of B: "El'en 11l1llzber"?
Solutioll.
The six outcomes are equally likely. so that each has probability 1/6. Thus peA) = 2/6 = 1/3 • because A = (5, 6J has 2 points, and PCB) = 3/6 = 112.
Definition 1 takes care of many games as well as some practical applications, as we shall see, but celtainly not of all experiments, simply because in many problems we do not have finitely many equally likely outcomes. To arrive at a more general definition of probability, we regard probability as the coullterpart of relative frequellcy. Recall from Sec. 24.1 that the absolute frequency f(A) of an event A in n trials is the number of times A occurs, and the relative frequency of A in these trials is f(A)ln; thus
(3)
Number of times A occurs Number of trials
SEC 24.3
1001
Probability Now if A did not occur, then f(A) = O. If A always occurred. then f(A) = the extreme cases. Division by 11 gives
11.
These are
(4*) In particular, for A = S we have f(S) = 11 because S always occurs (meaning that some event always occurs; if necessary, see Sec. 24.2, after Example 7). Division by 11 gives (5*)
fre1(S) = I.
Finally. if A and B are mutually exclusive, they cannot occur together. Hence the absolute frequency of their union A U B must equal the sum of the absolute frequencies of A and B. Division by 11 gives the same relation for the relative frequencies, (6*)
f re1(A U B) = f rei (A)
+
n
(A
fre1(B)
B
= 0).
We are now ready to extend the definition of probability to experiments in which equally likely outcomes are not available. Of course, the extended definition should include Definition 1. Since probabilities are supposed to be the theoretical cOllnterpmt of relative frequencies, we choose the properties in (4*), (5*), (6"") as axioms. (Historically, such a choice is the result of a long process of gaining experience on what might be best and most practical.)
DEFINITION 2
General Definition of Probability
Given a sample space S, with each event A of S (subset of S) there is associated a number P(A), called the probability of A. such that the following axioms of probability are satisfied. 1. For every A in S, (4)
0:0;:
peA)
:0;:
I.
2. The entire sample space S has the probability (5)
peS) = 1.
3. For mutually exclusive events A and B (A
(6)
peA U B)
n
= peA) +
B = 0; see Sec. 24.2),
PCB)
(A
n
B = 0).
If S is infinite (has infinitely many points). Axiom 3 has to be replaced by 3'. For mutually exclusive events AI> A 2 • • . • , (6')
In the infinite case the subsets of S on which peA) is defined are restricted to form a so-called u-algebra. as explained in Ref. (GR6j (not (G6]!) in App. 1. This is of no practical consequence to us.
CHAP. 24
1002
Data Analysis. Probability Theory
Basic Theorems of Probability We shall see that the axioms of probability will enable us to build up probability theory and its application to statistics. We begin with three basic theorems. The first of them is useful if we can get the probability of the complement A C more easily than peA) itself.
THEOREM 1
Complementation Rule
For an event A and its complemellt A C in a sample space S,
(7)
PROOF
peN)
I - peA).
By the definition of complement (Sec. 24.2). we have S Hence by Axioms 2 and 3, I
EXAMPLE 2
=
=
peS)
=
peA)
+ P(A
C
).
=
A U A C and A
n
AC
0.
•
thus
Coin Tossing Five coin~ are tossed simultaneously. Find the probability of the event A: At least one head tums up. Assume that the coins are fair.
Solution. Since each coin can [Urn up heads or mils, the sample space consists of 25 = 32 Olilcomes. Since the coins are fair. we may assign the same probability (1/32) to each outcome. Then the event A C (No heads c c tum LIp) consists of only 1 outcome. Hence P(A ) = 1/32, and the answer is peA) = 1 - P(A ) = 31/32. • The next theorem is a simple extension of Axiom 3, which you can readily prove by induction.
THEOREM 2
Addition Rule for Mutually Exclusive Events
For mutually exclusive events AI, ... , Am in a sample space S,
E X AMP L E 3
Mutually Exclusive Events If the probability that on any workday a garage will get 10-20,21-30,31-40, over 40 cars to service is 0.20,
0.35, 0.25, 0.12, respectively, what is the probability that on a given workday the garage gets at least 21 cars to service?
Solution. Since these are mutually exclusive events, Theorem 2 gives the answer 0.35 + 0.25 + 0.12 = 0.72. Check this by the complementation rule. • In many cases, events will not be mutually exclusive. Then we have
THEOREM 3
Addition Rule for Arbitrary Events
For events A and B in
(9)
1I
sample space,
peA U B)
=
peA)
+ PCB)
- peA
n
B).
SEC. 24.3
1003
Probability
PROOF
C, D. E in Fig. 511 make up A U B and are mutually exclusive (disjoint). Hence by Theorem 2. peA U B)
= P( C) + P(D) +
This gives (9) because on the right P(C) and peE) = PCB) - PW) = PCB) - peA
+ P(D) = peA) by Axiom 3 and disjointness; n B). also by Axiom 3 and disjointness. •
B
A
Fig. 511.
peE).
Proof of Theorem 3
Note that for mutually exclusive events A and B we have A by comparing (9) and (6), (10)
P(0)
(Can you also prove this by (5) and E X AMP L E 4
n
B
= 0 by definition and.
= O.
om
Union of Arbitrary Events In tossing a fair die. what is the probability of getting an odd number or a number less than 4? Solutioll.
Let A be the event "Odd number" and B the event "Numberless than 4." Then Theorem 3 gives
the answer P(AUB)=~+~-~=~
because A
n
•
B = .. Odd lIl//uber less thall 4" = {I, 3}.
Conditional Probability.
Independent Events
Often it is required to find the probability of an event B under the condition that an event A occurs. This probability is called the conditional probability of B given A and is denoted by p(BIA). In this case A serves as a new (reduced) sample space, and that probability is the fraction of peA) which corresponds to A n B. Thus (11)
p(BIA)
=
PeA n B) peA)
[peA)
*- 01.
[PCB)
*-
Similarly, the cunditiO/wl probability of A gil'en B is p(AIB)
(12)
PCB)
Solving (II) and (12) for peA
THEOREM 4
= peA n B)
n
B), we obtain
Multiplication Rule
If A and B lire events in a sample space Sand peA) *- 0, PCB) *- O. then (13)
peA
n
B)
=
P(A)P(BIA)
=
P(B)P(AIB).
0].
1004 E X AMP L E 5
CHAP. 24
Data Analysis. Probability Theory
Multiplication Rule In producing screw~.letA mean "screw too slim" and B "screw too short." Let peA) = 0.1 and let the conditional probability that a slim sere", is also too short be p(BIA) = n.2. What is the probability that a scre'" that we pick randoml) from the lot produced will be both too slim and too short?
Solution.
PIA
n
B) = p(A)p(BIA) = 0.1 • 0.2 = 0.02 = 2'7c. by Theorem 4.
Independent Events.
•
If events A and B are such that
(14)
P(A
n
B) = P(A)P(B),
they are called independent events. Assuming P(A) that in this case p(AIB)
=
peA),
* O. P(B) * 0, we see from (II )-( 13)
P(BIA)
= PCB).
This means that the probability of A does not depend on the occurrence or nonoccurrence of B, and conversely. This justifies the term "independent." Independence of III Events.
Similarly,
111
events A I,
••• ,
Am are called independent if
(ISa) as well as for every k different events A jl , Ah , ... , Aj ". (ISb) where k = 2. 3, ... ,
111 -
l.
Accordingly. three events A. B. C are independent if and only if peA
n
B) = P(A)P(B),
n 0 = P(B)P(0, P(C n A) = P(C)P(A). PCB
(16) peA
n
B
n0
= P(A)P(B)P(O.
Sampling. Our next example has to do with randomly drawing objects, one at a time, from a given set of objects. This is called sampling from a popUlation, and there are two ways of sampling, as follows. 1. In sampling with replacement, the object that was drawn at random is placed back to the given set and the set is mixed thoroughly. Then we draw the next object at random. 2. In sampling without replacement the object that was drawn is put aside. E X AMP L E 6
Sampling With and Without Replacement A box contains 10 screws. three of which are defective. Two screws are drawn at random. Find the probability that none of the lWO screws is defective.
Solution.
We consider the event~ A: First drawn screll' 1100ldefectil'e. B: Second drawl1 screw /101ulefectil'e.
SEC. 24.3
1005
Probability
Clearly. PIA) = -k becau,e 7 of the 10 screw, are nomlefective and we sample at ramI om. so thm each screw has the same probability (to) of being picked. If we sample with replacement. the situation before the second drawing is the same as at the beginning. and PCB) = -k. The events are independent. and the answer is PeA
n
B) = P(A)P(B) = 0.7' 0.7 = 0.49 = 49<)}.
If we sample without replacement. then PIA) = in the box, 3 of which are defective. Thus P(B/A) PIA
n
-k, a~ before. If A has occurred. then there are 9 ~crews left = ~ = ~. and 1l1eorem 4 yields the answer
B) =
-k . ~ =
47<)}.
•
Is it intuitively clear that this value mu,t be smaller than the preceding one'!
--.•.. .... - •••
"'A
1. Three screws are drawn at random from a lot of 100 screws. 10 of which are defective. Find the probability that the screws drawn will be nondefective in drawing (a) with replacement. (b) without replacement.
11. In roIling two fair dice. what is the probability of
2. In Prob. I find the probability of E: At least I defective (i) directly. (ii) by using complements; in both cases (a) and (b).
13. A motor drives an eiectIic generator. During a 30-day period. the motor needs repair with probability 8%- and the generator needs repair with probability 49'r. What is the probability that during a given period. the entire apparatus (consisting of a motor and a generator) will need repair?
3. If we inspect paper by drawing 5 sheets without replacement from every batch of 500. what is the probability of getting 5 clean sheets although 2% of the sheets contain spots') First guess. 4. Under what conditions will it make practically no difference whether we sample with or witholll replacement? Give numeric examples. 5. If you need a right -handed screw from a box containing 20 right-handed and 5 left-handed screws. what is the probability that you get at least one right-handed screw in drawing 2 screws with replacement? 6. If in Prob. 5 you draw without replacement. does the probability decrease or increase? First think, then calculate. 7. What gives the greater probability of hitting some target at least once: (a) hitting in a shot with probability 112 and firing I shot. or (b) hitting in a shot with probability 1I4 and firing 2 shots? First guess. Then calculate. 8. Suppose that we draw cards repeatedly and with replacement from a file of 100 cards. 50 of which refer to male and 50 to female persons. What is the probability of obtaining the second "female" card before the third "male" card? 9. What is the complementary event of the event considered in Prob. 8'1 Calculate its probability and use it to check your result in Prob. 8. 10. In rolling two fair dice. what is the probability of obtaining a sum greater than 4 but not exceeding 7'1
obtaining equal numbers or numbers with an even product? 12. Solve Prob. II by considering complements.
14. If a circuit contains 3 automatic switches and we want that. with a probability of 959'c. during a given time interval they are all working. what probability of failure per time interval can we admit for a single switch? 15. If a certain kind of tire has a life exceeding 25 000 miles with probability 0.95. what is the probability that a set of 4 of these tires on a car will last longer than 25000 miles? 16. In Prob. 15. what is the probability that at least one of the tires will not last for 25 000 miles? 17. A pressure control apparatus contains 4 valves. The apparatus will not work unless all valves are operative. If the probability of failure of each valve during some interval of time is 0.03. what is the corresponding probability of failure of the apparatus? 18. Show that if B is a subset of A, then P(B) 19. Extending Theorem 4. show that PeA n B n C) = P(A)P(BIA)P(CiA
n
~
peA).
B).
20. You may wonder whether in (16) the last relation follows from the others. but the answer is no. To see this. imagine that a chip is drawn from a box containing 4 chips numbered 000,01 I. 101, 110. and let A. B. C be the events that the first. second. and third digit. respectively. on the drawn chip is 1. Show that then the first three formulas in (16) hold but the last one does not hold.
CHAP. 24
1006
Data Analysis. Probability Theory
24.4 Permutations and Combinations Permutations and combinations help in finding probabilities peA) = alk by systematically counting the number a of pointf> of which an event A consists; here, k is the number of points of the sample space S. The practical difficulty is that a may often be surprisingly large, so that actual counting becomes hopeless. For example, if in assembling some instrument you need 10 different screws in a certain order and you want to draw them randomly from a box (which contains nothing else) the probability of obtaining them in the required order is only 1/3 628800 because there are 10!
=
I . 2' 3 ·4·5·6·7' 8 . 9 . 10
= 3628800
orders in which they can be drawn. Similarly, in many other situations the numbers of orders, anangements, etc. are often incredibly large. (If you are unimpressed. take 20 screws-how much bigger will the number be?)
Permutations A permutation of given things (elements or objects) is an arrangement of these things in a row in some order. For example, for three letters a, b, c there are 3! = I . 2 . 3 = 6 permutations: abc, acb, bac, bca, cab, cba. This illustrates (a) in the following theorem.
THEOREM 1
Permutations
(a) Different t/zings. The lUt1llber of permutations of n different things taken all at a time is (1)
n! = I . 2 . 3 ...
l
(read "n.factorial").
Il
(b) Classes of equal things. If n given things can be divided into c classes of alike things differing from class to class, then the number of permutations of these things taken all at a time is
(2)
n!
(Ill
+
n2
+ ... +
llc
=
11)
where nj is the 1lumber of things ill the jth class.
PROOF
(a) There are 11 choices for filling the first place in the row. Then n - I things are still available for filling the second place, etc. (b) 171 alike things in class 1 make n 1 ! permutations collapse into a single permutation (those in which class I things occupy the same 111 positions), etc., so that (2) follows • from (I).
SEC. 24.4
1007
Permutations and Combinations
E X AMP L E 1
Illustration of Theorem l(b) If a box contains 6 red and 4 blue balls. the probability of drawing first the red and then the blue balls is p
~
•
6!4!110! = 11210 = 0.5%.
A permutation of n things taken k at a time is a permutation containing only k of the n given things. Two such permutations consisting of the ~ame k elements. in a different
order, are different, by definition. For example, there are 6 different permutations of the three letters a, b, e, taken two letters at a time, ab, ae, be, ba, ea, eb. A permutation of Il things taken k at a time with repetitions is an arrangement obtained by putting any given thing in the first position, any given thing, including a repetition of the one just used, in the second, and continuing until k positions are filled. For example, there are 32 = 9 different such permutations of a, b, e taken 2 letters at a time, namely, the preceding 6 permutations and aa, bb, ec. You may prove (see Team Project 18):
THEOREM 2
Permutations
TIle number of different pennutations of n different things taken k at a time without repetitions is
(3a)
n(n -
1)(n - 2) ... (n - k
+
1)
=
n! (n - k)!
alld with repetitions is (3b)
E X AMP L E 2
Illustration of Theorem 2 In a coded telegram the letters are arranged in groups of five letters, called words. From (3b) we see that the number of different such words is 265 = II 8!B 376. From (3a) it follows that the number of different such words containing each letter no more tiMn once is 26!/(26 - 5)!
= 26' 25 . 24 . 23' 22 = 7893600.
•
Combinations In a permutation, the order of the selected things is essential. In contrast, a combination of given things means any selection of one or more things without regard to order. There are two kinds of combinations, as follows. The number of combinations of Il different things, taken k at a time, without repetitions is the number of sets that can be made up from the n given things, each set containing k different things and no two sets containing exactly the same k things. The number of combinations of Il different things, taken k at a time, with repetitions is the number of sets that can be made up of k things chosen from the given n things, each being used as often as desired.
CHAP. 24
1008
Data Analysis. Probability Theory
For example, there are three combinations of the three letters a, b, c, taken two letters at a time, without repetitions. namely, lib, lIC, bc, and six such combinations with repetitions, namely, lib, lIC, be, aa, bb, cc.
Combinations
THEOREM 3
The Ilumber of d~fferent combinations of 11 d!fferent things taken. k at a time. withollt repetitiollS. is
(4a)
C) =
1) ... (n - k
n(n -
n!
+
I)
I ' 2· .. k
k!(n - k)!
lind the number of those combinlltions with repetitions is
(4b)
PROOF
E X AMP L E 3
The statement involving (4a) follows from the first part of Theorem 2 by noting that there are k! permutations of k things from the given n things that differ by the order of the elements (see Theorem I). but there is only a single combination of those k things of the type characterized in the first statement of Theorem 3. The last statement of Theorem 3 can be proved by induction (see Team Project 18). • Illustration of Theorem 3 The number of samples of five lightbulbs that can be selected from a 1m of 500 bulbs is rsee (4a 11 500 . 499 . 498 . -1-97 . 496 500) 500! = 255 244 687 600. ( 5 = 5!495! = 1·2·3'4·5
•
Factorial Function In (I )-(4) the factorial function is basic. By definition. (5)
O!
=
L
Values may be computed recursively from given values by
(6)
(n
+
l)!
=
(II
+
l)n!.
For large 11 the function is very large (see Table A3 in App. 5). A convenient approximation for large 11 is the Stirling formula 2
(7)
2 jAMES
(e
STIRUI\G 0692-1770). Scots mathematician.
= 2.718 ... )
SEC. 24.4
1009
Permutations and Combinations
where - is read "asymptotically equal" and means that the ratio of the two sides of (7) approaches 1 as 11 approaches infinity. EXAMPLE 4
Stirling Formula
[
n!
By (7)
Exact Value
Relative Error
4! I
IO!
23.5 3598696 2.422 79 . 1018
24 3628800 2 432 902 008 176 640 000
2.1% 0.8% 0.4%
l
20!
•
Binomial Coefficients The binomial coefficients are defined by the formula 0) = o(a (k
(8)
1)(0 - 2) ... (0 - k
+
1)
~
(k
k!
0, integer).
The numerator has k factors. FUlthermore. we define
(~)
(9) For integer
II
=
11
(~)
in particular,
= I,
= 1.
we obtain from (8)
(n ~ 0, 0 ~ k ~ n).
(10)
Binomial coefficients may be computed recursively, because 0
(11)
(
k
+ +
1) 1
~
(k
0, integer).
Formula (8) also yields (k
(12)
~
0, integer) (m > 0).
There are numerous further relations; we mention two important ones,
(13)
~ s=o
(k+S) = (l1+k) k
k
+
(k ~
O. n ~ 1, both integer)
1
and
(14)
(r
~
0, integer).
1010
CHAP. 24
Data Analysis. Probability Theory
1. List all pennutations of four digits I, 2, 3, 4, taken all at a time. 2. List (a) all permutations, (b) all combinations without repetitions, (c) all combinations with repetitions, of 5 letters G, e, i, 0, 1I taken 2 at a time. 3. In how many ways can we assign 8 workers to 8 jobs (one worker to each job and conversdy)? 4. How many samples of 4 objects can be drawn from a lot of 80 objects? 5. In how many different ways can we choose a committee of 3 from 20 persons? First guess. 6. In how many different wa) s can we select a committee consisting of 3 engineers. 2 biologists. and 2 chemists from 10 engineers, 5 biologist~, and 6 chernists? First guess. 7. Of a lot of 10 items, 2 are defective. (a) Find the number of different samples of 4. Find the number of samples of 4 containing (b) no defectives, (c) I defective, (d) 2 defectives. 8. If a cage contains 100 mice, two of which are male, what is the probability that the two male mice will be included if 12 mice are randomly selected? 9. An urn contains 2 blue, 3 green, and 4 red balls. We draw I ball at random and put it aside. Then we draw the next ball, and so on. Find the probability of drawing at first the 2 blue balls, then the 3 green ones, and finally the red ones. 10. By what factor is the probability in Prob. 9 decreased if the number of balls is doubled (4 blue, etc.)? 11. Detennine the number of different bridge hands. (A bridge hand consists of 13 cards selected from a full deck of 52 cards.) 12. In how many different ways can 5 people be seated at a round table? 13. If 3 suspects who committed a burglary and 6 innocent persons are lined up, what is the probability that a witne~s who is not sure and has to pick three persons will pick the three suspects by chance? That the witness picks 3 innocent persons by chance?
24.5
14. (Birthday problem) What is the probability that in a group of 20 people (that includes no twins) at least two have the same birthday. if we assume that the probability of having birthday on a given day is 11365 for every day. First guess. 15. How many different license plates showing 5 symbols, nanlely, 2 letters followed by 3 digits. could be made? 16. How many automobile registrations may the police have to check in a hit-and-run accident if a witness reports KDP5 and cannot remember the last two digits on the license plate but is certain that all three digits were different? 17. CAS PROJECT. Stirling formula. (a) Using (7), compute approximate values of n! for 11 = I, ... ,20. (b) Detennine the relative error in (a). Find an empirical formula for that relative error. (el An upper bound for that relative error is e 1/ 12n Try to relate your empirical formula to this.
-
L.
(d) Search through the literature for further infonnation on Stirling's fonnula. Write a short report about your findings, arranged in logical order and illustrated with numeric examples. 18. TEAM PROJECT. Permutations, Combinations. (a) Prove Theorem 2. (b) Prove the last statement of Theorem 3. (e) Derive (11) from (8). (d) By the binomial theorem,
so that llkbn - k has the coefficient (~). Can you conclude this from Theorem 3 or is this a mere coincidence? (e) Prove (14) by using the binomial theorem. (f) Collect further formulas for binomial coefficients from the literature and illustrate them numerically.
Random Variables. Probability Distributions In Sec. 24.1 we considered frequency distributions of data. These distributions show the absolute or relative frequency of the data values. Similarly, a probability distribution or, briefly, a distribution, shows the probabilities of events in an experiment. The quantity that we observe in an experiment will be denoted by X and called a random variable (or
SEC 24.5
1011
Random Variables. Probability Distributions
stochastic variable) because the value it will assume in the next trial depends on chance, on randomness-if you roll a dice, you get one of the numbers from I to 6, but you don't know which one will show up next. Thus X = Number a die tUnIS up is a random variable. So is X = Elasticity of rubber (elongation at break). ("Stochastic" means related to chance.) If we COUllt (cars on a road, defective screws in a production, tosses until a die shows the first Six), we have a discrete random variable and distribution. If we measure (electric voltage, rainfall, hardness of steel), we have a continuous random variable and distribution. Precise definitions follow. In both cases the distribution of X is determined by the distribution function F(x)
(1)
= P(X ~ x);
this is the probability that in a trial, X will assume any value not exceeding x. CAUTION! The terminology is not uniform. F(x) is sometimes also called the cumulative distribution function. For (I) to make sense in both the discrete and the continuous case we formulate conditions as follows.
DEFINITION
r
Random Variable
A random variable X is a function defined on the sample space S of an experiment. Its values are real numbers. For every number 11 the probability P(X = a)
with which X assumes a is defined. Similarly, for any interval 1 the probability P(X E I)
with which X assumes any value in 1 is defined. Although this definition is very general, practically only a very small number of distributions wil1 occur over and over again in applications. From (I) we obtain the fundamental formula for the probability corresponding to an interval a < x ~ b, Pea < X
(2)
~
b) = F(b) - F(a).
This follows because X ~ a ("X assumes any value not exceeding a") and a < X ~ b ("X assumes any value in the illterval a < x ~ b") are mutually exclusive events, so that by (1) and Axiom 3 of Definition 2 in Sec. 24.3 F(b)
= P(X ~ b) = P(X ~ =
F(a)
+
and subtraction of F(a) on both sides gives (2).
a)
+
Pea < X
Pea < X
~
b)
~
b)
1012
CHAP. 24
Data Analysis. Probability Theory
Discrete Random Variables and Distributions By definition, a random variable X and its distribution are discrete if X assumes only finitely many or at most countably many values Xl' -'"2' X3, .... called the possible values of X, with positive probabilities PI = P(X = Xl), P2 = P(X = X2), P3 = P(X = X3), ... , whereas the probability P(X E 1) is zero for any interval 1 containing no possible value. Clearly, the discrete distribution of X is also determined by the probability function f(x) of X, defined by p
(3)
f(x)
=
{
~
(j
=
1,2. ... ),
otherwise
From this we get the values of the distribution function F(x) by taking sums,
(4)
where for any given x we sum all the probabilities Pj for which Xj is smaller than or equal to that of x. This is a step function with upward jumps of size Pj at the possible values Xj of X and constant in between. E X AMP L E 1
Probability Function and Distribution Function Figure 512 shows the probability function f(x) and the distribution function F(x) of the discrete random variable
x=
Number a fair die tums up.
X has the possible values x = 1, 2, 3, 4, 5, 6 with probability 116 each. At these x the distribution function has upward jumps of magnitude 1/6. Hence from the graph of flx) we can construct the graph of F(x), and conversely. In Figure 512 (and the next one) at each jump the fat dot indicates theftmctioll value at the jump! •
t(xl
Y6[ I I I I I I o
5
x
o
5
10
12
x
12
x
F(x)
F(x)
1 30 36
20 36
1
2"
10
36
o
,
5
x
Fi-. 512. Probability function {(x) and distribution function F(x) of the random variable X = Number obtained in tossing a fair die once
!
[
10
Fig. 513. Probability function {(x) and distribution function F(x) of the random variable X = Sum of the two numbers obtained in tossing two fair dice once
SEC. 24.5
1013
Random Variables. Probability Distributions
E X AMP L E 2
Probability Function and Distribution Function The random variable X = Slim of the two Illlll/bers t ....o fair dice tum "I' is discrete and has the possible values 2 (= 1 + II. 3.4..... 12 (= 6 + 6). There are 6' 6 = 36 equally likely outcomes (I. 1) (I. 2)....• (6. 6). where the first number is lhat shown on the first die and the second number that on the other die. Each such outcome has probability 1/36. Now X = 2 occurs in the case of the outcome O. I): X = 3 in the case of the lwo outcomes (I. 2) and (2. I \: X = 4 in the case of the three outcomes (1. 3), (2. 2). (3. I): and so on. Hence fix) = PIX = x) and F(x) = PIX ~ x) have the values
I
i
x
2
3
4
5
f(x)
1136 1136
2/36 3/36
3/36 6/36
I F(x)
6
7
8
9
10
II
12
4/36
5/36 15/36
6/36 21/36
5/36 26/36
4/36
10136
30136
3/36 33136
2136 35/36
1136 36/36
Figure 513 shows a bar chan of this function and the graph of the distribution funcllon. which is again a srep function. with jumps (of different height!) at the possible values of X.
Two useful formulas for discrete distributions are readily obtained as follows. For the probability cOlTesponding to intervals we have from (2) and (4)
(5)
P(a
< X~
=
b)
=
F(b) - F(a)
2:
Pj
(X discrete).
a<xJ~b
This is the sum of all probabilities Pj for which Xj satisfies a < Xj ~ b. (Be careful about < and ~!) From this and peS) = I (Sec. 24.3) we obtain the following formula.
2: Pj =
(6)
(sum of all probabilities).
I
j
E X AMP L E 3
illustration of Formula (5) In Example 2. compme [he probability of a sum of at least 4 and at
Solution. E X AMP L E 4
P(3
< X ~ 8) = F(8) - F(3) = ~ -
mo~t
8.
•
:k = ~.
Waiting Time Problem. Countably Infinite Sample Space In to~sing a fair coin. let X = Number of trials limit the first head appears. Then. by independence of events (Sec. 24.3). PiX
=
1)
= P(H)
_
1
=
~.~
-2
PIX = 2) = P(TH)
PIX = 3) = P(TTH) =
and in general PiX series.
= II) =
(!)n.
II
! .~ .! = ~.
(H =
Head)
(T=
Tail)
etc.
= 1. 2•.... Also. (6) can be confirmed by the slim form lila for the geometric 1
2 +
1
4
1
+ 8 + ... = -I
+ 1-
~
=-1+2=1.
•
1014
CHAP. 24
Data Analysis. Probability Theory
Continuous Random Variables and Distributions Discrete random variables appear in experiments in which we count (defectives in a production, days of sunshine in Chicago. customers standing in a line, etc.). Continuous random variables appear in experiments in which we measure (lengths of screws, voltage in a power line, Brinell hardness of steel, etc.). By definition. a random variable X and its distribution are of continuous type or, briefly. continuous, if its distribution function F(x) [defined in (1)] can be given by an integral
(7)
=
F(x)
r
f(v) dv
-00
(we write v because x is needed as the upper limit of the integral) whose integrand f(x), called the density of the distribution, is nonnegative, and is continuous, perhaps except for finitely many x-values. Differentiation gives the relation of f to F as
(8)
f(x) = F'(x)
for every x at which f(x) is continuous. From (2) and (7) we obtain the very important formula for the probability corresponding to an interval:
(9)
Pta
<X~
b) = F(b) - F(a) =
I
b
f(v) dv.
a
This is the analog of (5). From (7) and peS) = I (Sec. 24.3) we also have the analog of (6):
I
(10)
oo
f(v) dv = 1.
_00
Continuous random variables are simpler than discrete ones with respect to intervals. Indeed, in the continuous case the four probabilities corresponding to a < X ~ b, a < X < b, a ~ X < b, and a ~ X ~ b with any fixed a and b (> a) are all the same. Can you see why? (Answer. This probability is the area under the density curve, as in Fig. 514, and does not change by adding or subtracting a single point in the interval of integration.) This is different from the discrete case! (Explain.) The next example illustrates notations and typical applications of our present formulas. Curve of density
f(X)~
/
1~
a
Fig. 514.
b
x
Example illustrating formula (9)
SEC. 24.5
1015
Random Variables. Probability Distributions
E X AMP L E 5
Continuous Distribution Let X have the density function I(x) = O.75( I - x 2 ) if -I ~ x ~ I and 7ero otherwi~e. Find the distribution function. Find the probabilities P( -~ ~ X ;;; ~) and PC! ~ X ~ 2). Find x such that P(X ~ x) = 0.95.
Solutioll. From
(7)
F(x)
we obtain F(x) = 0 if x ~ -I,
= 0.75
IX
(I - v 2 ) dv
= 0.5 + 0.75x -
0.25x3
if-I
<x~
I,
-1
and
F( t)
= I if x >
l. From this and (9) we get
P{-~ ~ X ~~) =
F@ -
F{-~) =
0.75
I
1/2
(I - v
2
)
dv = 68.75%
-1/2
(because P( -~
~ X ~~) = P{ -~
< X ;;; ~) for a continuous distribution) and
P{~ ~ X ~ 2) =
F(2) - F(!) = 0.75
f
1 (1 -
v
2
)
dv = 31.64%.
1/4
(Note that the upper limit of integration is I, not 2. Why?) Finally, P{X ~ x) = F(x) = 0.5
+
0.75x - 0.25,3 = 0.95.
3
Algebraic simplification gives 3x - x = 1.8. A solution is x = 0.73, approximately. Sketch fet) and mark x = -~,~, !. and 0.73. so that you can see the results (the probabilities) as areas under the curve. Sketch also Fex). • Further examples of continuous distributions are included in the next problem set and in later sections.
-_·.·_.w· ........-...____ - ..... ,--.-----....,;. ..... y
_
....... . - . - - , _
h 2 (x = 1, 2, 3,4, 5; k suitable) and the distribution function. Graph the density function fIx) = b: 2 (Q ~ X ~ 5: k suitable) and the distribution function. (Uniform distribution) Graph f and F when the density is f(x) = k = const if -4 ~ x ~ 4 and 0 elsewhere. In Prob. 3 find P(O ~ x ~ 4) and c such that P( -c < X < c) = 95%. Graph f and F when f{ -2) = f(2) = 1/8, f( -1) = f(1) = 3/8. Can f have further positive values? Graph the distribution function F(x) = I - e- 3x if x > 0, F(x) = 0 if x ~ 0, and the density f(x). Find x such that F(x) = 0.9. Let X be the number of years before a particular type of machine will need replacement. Assume that X has the probability function f(l) = 0.1, f(2) = 0.2, f(3) = 0.2, 1(4) = 0.2. 1(5) = 0.3. Graph I and F. Find the probability that the machine needs no
1. Graph the probability function f(x)
2. 3.
4. 5.
6.
7.
=
replacement during the first 3 years. 8. If X has the probability function f(x) = k/2x (x = O. l. 2 ... '). what are k and P(X ~ 4)? 9. Find the probability that none of the three bulbs in a traffic signal must be replaced during the first 1200 hours of operation if the probability that a bulb must be replaced is a random variable X with density f(x) = 6[0.25 - (x - 1.5)2] when 1 ~ x ~ 2 and fIx) = 0 otherwise. where x is time mea~ured in mUltiples of 1000 hours. 10. Suppose that certain bolts have length L = 200 + X mm, where X is a random variable with density f(x) = ~(l - x 2 ) if - 1 ~ x ~ I and 0 otherwise. Determine c so that with a probability of 95% a bolt will have any length between 200 - c and 200 + c. Hint: See also Example 5.
11. Let X [millimeters] be the thickne~~ of washers a machine turns out. Assume that X has the density f(x) = h if 1.9 < x < 2.1 and 0 otherwise. Find k. What is the probability that a washer will have thickness between 1.95 mm and 2.05 mm?
1016
CHAP. 24
Data Analysis. Probability Theory that X is between ::!.5 (40% profit) and 5 (20% profit)?
12. Suppose that in an automatic process of filling oil into cans, the content of a can (in gallons) is Y = 50 + X. where X is a random variable with density fIx) = I - Ixl when Ixl :=; 1 and 0 when Ixl > I. Graph fIx) and F(x). In a lot of 100 cans, about how many will contain 50 gallons or more'? What is the probability that a can will contain less than 49.5 gallons? Less than 49 gallons? 13. Let the random variable X with density fIx) = ke-;c if o ~ x :=; 2 and 0 otherwise (x = time measured in years) be the time after which cel1ain ball bearings are worn out. Find k and the probability that a bearing will last at least I year. 14. Let X be the ratio of sales to profits of some fiml. Assume that X has the distribution function F(x) = 0 if x < 2, F(x) = (x 2 - 4)/5 if 2 :=; x < 3. F(x) = I if x ~ 3. Find and graph the density. What is the probability
15. Show that b < c implies P(X:=; b) :=; PIX :=; c). 16. If the diameter X of axles has the density fIx} = k if 119.9 :=; x ~ 120.1 and 0 otherwise, how many defectives will a lot of 500 axles approximately contain if defectives are axles slimmer than 119.92 or thicker than 120.08?
17. Let X be a random variable that can as~ume evelY real value. What are the complements of the events X ~ b. X<~X~~X>~b:=;X~~b<X~~
18. A box contains 4 right-handed and 6 left-handed screws. Two screws are drawn at random without replacement. Let X be the number of left-handed screws drawn. Find the probabilities PIX = m. PIX = I). PIX = 2). P( I < X < 2), P(X :=; I). PIX ~ I). PIX > I), and P(0.5 < X < 10).
24.6 Mean and Variance of a Distribution The mean J-L and variance 0'2 of a random variable X and of its distribution are the theoretical counterpalts of the mean x and variance S2 of a frequency distribution in Sec. 24.1 and serve a similar purpose. Indeed, the mean characterizes the central location and the variance the spread (the variability) of the distribution. The mean J-L (mu) is defined by (a)
/1-
=
2: xjf(xj)
(Discrete distribution)
j
(1)
(b)
J-L
= {"
xf(x) dx
(Continuous distribution)
-x
and the variance 0'2 (sigma square) by (a)
0'2 =
2: L~j -
J-L)2f(xj)
(Discrete distribution)
j
(2) (b)
0'2 =
f"" (x -
J-L)2f{x) dx
(Continuous distribution).
-=
a (the positive square root of (T2) is called the standard deviation of X and its distribution.
f
is the probability function or the density, respectively, in (a) and (b). The mean J-L is also denoted by E(X) and is called the expectation of X because it gives the average value of X to be expected in many trials. Quantities such as J-L and 0'2 that measure certain properties of a distribution are called parameters. J-L and 0'2 are the two
most important ones. From (2) we see that (3) (except for a discrete "distribution" with only one possible value, so that 0'2 = 0). We assume that J-L and 0'2 exist (are finite), a<; is the ca<;e for practically all distributions that are useful in applications.
SEC. 24.6
1017
Mean and Variance of a Distribution
E X AMP L E 1
Mean and Variance The random variable X = Number o.{heads ina single toss o{ a lair coin has the possible values X = 0 and X = I + 1 = ~. and with probabiliues PIX = 0) = ~ and PIX = 1) = From (1 a) we thus obtain the mean f.L = (2a) yields the variance
o·l
i.
·l
•
E X AMP L E 1
Uniform Distribution. Variance Measures Spread The distribution with the
den~ity
fIx) =
b -
a<x
if
1I
and.f = 0 otherwise is called the uniform distribution on [he interval a < x < h. From (I b) (or from Theorem I. below) we find that f.L = (0 + b)l2. and (2b) yields the variance
I (. _(/ b
2 _ (T
-
a
.~
+ b)2 2
2
_1_ b-l/
._
d.\-
(b - l/) 12
•
Figure 515 illustrates that the spread is large if and only if (T2 is large.
((x)
{(x)
If-----,
o
1
(0
2
= 1/12)
x
1
o
-1
F(x)
2
x
2
x
F(x)
1
-1
Fig. 515.
Uniform distributions having the same mean (0.5) but different variances u'
Symmetry. We can obtain the mean JL without calculation if a distribution is symmetric. Indeed, you may prove THEOREM 1
Mean of a Symmetric Distribution
If a distribution is symmetric ~rith re!oJpect to x thell JL = c. (Examples I and 2 illustrate this.)
= c, that is,
f(c - x) = f(c
+ x),
Transformation of Mean and Variance Given a random variable X with mean JL and variance (]"2. we want to calculate the mean and variance of X* = al + a 2X. where al and a2 are given constants. This problem is important in statistics, where it appears often.
CHAP. 24
1018
THE 0 REM 2
Data Analysis. Probability Theory
Transformation of Mean and Variance (a)
If a
random variable X has mean J.L and variance
(]"2,
then the random
variable
(4) has the mean J.L* and variance
(]"*2.
where
(5)
and
(b) In particular, the standardized random variable Z corresponding to X, given by
X-J.L Z=--
(6)
(]"
has the mean 0 and the variance 1.
PROOF
We prove (5) for a continuous distribution. To a small interval I of length .it" on the x-axis there corresponds the probability f(X)flx [approximately; the area of a rectangle of base .i,. and height f(x)]. Then the probability f(x).::lx must equal that for the corresponding interval on the x*-axis, that is, f*(x*)Llx*, where f* is the density of X* and Llx* is the length of the interval on the x*-axis corresponding to l. Hence for differentials we have f*(x*) dx* = f(x) dx. Also. x* = al + a2x by (4). so that (lb) applied to X* gives J.L* =
I"" x*f*(x*) dx* -co
=
I
x
(al
+
a2x )f(x) dx
-x
=
al
{Xl f(x) dx + a2 I= xf(x) dx. -x
-CD
On the right the first integral equals 1, by (10) in Sec. 24.5. The second integral is J.L. This proves (5) for J.L *. It implies
From this and (2) applied to X*, again using f*(x*) dx* = f(x) dx. we obtain the second formula in (5),
(]"*2
=
IX (x* -~
J.L*)21*(x*) dx*
=
a 22
!"C (x -
J.L)2f(x) dx
=
a2 2(]"2.
-x
For a discrete distribution the proof of (5) is similar. Choosing a l = - J.LI(]" and a2 = I/(]" we obtain (6) from (4), writing X* {II, a2 formula (5) gives J.L* = 0 and (]"*2 = 1, as claimed in (b).
= Z. For these •
SEC. 24.6
1019
Mean and Variance of a Distribution
Moments
Expectation,
Recall that (1) defines the expectation (the mean) of X, the value of X to be expected on the average, written IL = E(X). More generally, if g(x) is non constant and continuous for all x, then g(X) is a random variable. Hence its mathematical expectation or, briefly, its expectation E(g(X» is the value of g(X) to be expected on the average. defined [similarly to (I)] by
(7)
E(g(X»
=
2: g(.l)f(xj)
E(g(X»
or
=
fC
g(x)f(x) dx.
-x
j
In the first fonnula, f is the probability function of the discrete random variable X. In the second formula, f is the density of the continuous random variable X. Important special cases are the kth moment of X (where k = L 2.... )
(8)
E(Xk) =
2: x/f(xj)
or
j
and the kth central moment of X (k
(9)
E([X - IL]k)
=
2: (Xj
=
L, 2, ... )
or
- ILlf(X.i)
{Xl (x _ ILlf(x) dx. -x
j
This includes the first moment. the mean of X (10)
IL
=
E(X)
[(8) with k
= I].
It also includes the second central moment, the variance of X [(9) with k = 2].
(11)
For later use you may prove (12)
11-:.6J
E(l)
MEAN, VARIANCE
Find the mean and the variance of the random variable X with probability function or density f(x).
1. f(x)
= 2x
(0;:;; x ;:;; I)
2. f(O) = 0.512, f(3) = 0.008
f(l) = 0.384,
f(2) = 0.096,
3. X = Number a fair die turns up 4. Y = -4X + 5 with X as in Prob. 1 5. Uniform distribution on [0, 8] 6. f(x) = 2e- 2x (x ~ 0)
= 1.
7. What is the expected daily profit if a store sells X air conditioners per day with probability f(lO) = 0.1, fell) = 0.3, f(12) = 0.4, f(13) = 0.2 and the profit per conditioner is $55? 8. What is the mean life of a light bulb whose life X [hours] has the density f(x) = O.OOle- O.OOlx (x ~ O)? 9. If the mileage (in multiples of 1000 mi) after which a tire must be replaced is given by the random variable X with density f(x) = ()e- flx (x > 0). what mileage can you expect to get on one of these tires? Let = 0.04 and find the probability that a tire will last at least 40000 mi.
e
1010
CHAP. 24
Data Analysis. Probability Theory
10. What sum can you expect in rolling a fair die 10 times? Do it. Repeat this experiment 20 times and record how the sum varies. 11. A small filling station is supplied with gasoline every Saturday afternoon. Assume that its volume X of sales in ten thousands of gallons has the probability density f(x) = 6x( I - x) if 0 ~ x ~ 1 and 0 otherwise. Determine the mean. the vaIiance. and the standardized variable. 12. What capacity must the tank in Prob. II have in order that the probability that the tank will be emptied in a given week be 5%'1 13. Let X [cm] be the diameter of bolts in a production. Assume that X has the density f(x) = k(x - 0.9)( 1.1 - x) if 0.9 < x < 1.1 and 0 otherwise. Detern1ine k. sketch fIx). and find JL and u 2 . 14. Suppose that in Prob. 13. a bolt is regarded as being defective if its diameter deviates from 1.00 cm by more than 0.09 cm. What percentage of defective bolts should we then expect? 15. For what choice of the maximum possible deviation c from 1.00 cm shall we obtain 3o/c defectives in Probs. 13 and 14?
24.7
16. TEAM PROJECT. Means, Variances, Expectations. (a) Show that E(X - IL) = 0, (]"2 = E(X 2 ) - IL2. (b) Prove (10)-(12).
(c) Find all the moments of the uniform distribution on an interval a ~ x ~ b. (d) The skewness 'Y of a random variable X is defined
by
(13)
'Y = 3
1
3
E([X - ILl ).
(]"
Show that for a symmetric distribution (whose third central moment exists) the skewness is 7ero. (e) Find the skewness of the distribution with density fIx) = xe- x when x > 0 and fIx) = 0 otherwise. Sketch f(x). (I) Calculate the skewness of a few simple discrete distributions of your own choice.
(g) Find a non symmetric discrete distribution with 3 possible values. mean O. and skewness O.
Binomial, Poisson, and Hypergeometric D istri butions These are the three most important discrete distributions. with numerous applications.
Binomial Distribution The binomial distribution occurs in games of chance (rolling a die, see below, etc.), quality inspection (e.g., counting of the number of defectives), opinion polls (counting number of employees favoring certain schedule changes, etc.), medicine (e.g., recording the number of patients recovered by a new medication), and so on. The conditions of its OCCUlTence are as follows. We are interested in the number of times an event A occurs in n independent trials. In each trial the event A has the same probability P(A) = p. Then in a trial, A will not occur with probability q = I - p. rn 11 trials the random variable that interests us is
x
=
Number oj times the event A occurs in
11
trials.
X can assume the values 0, I. .... 11. and we want to detennine the corresponding probabilities. Now X = x means that A occurs in l" trials and in n - x trials it does not occur. This may look as follows.
SEC. 24.7
1021
Binomial, Poisson. and Hypergeometric Distributions
A
A··· A
B
'---v----'
(1)
B··· B.
~
x times
11 -
x times
Here B = A C is the complement of A, meaning that A does not Occur (Sec. 24.2). We now use the assumption that the trials are independent, that is. they do not influence each other. Hence (I) has the probability (see Sec. 24.3 on independent events) pp ... p
. qq'"
'-------v-----'
(1 *)
x times
q
=
p"·qn-x.
'-------v-----'
x times
11 -
Now (l) is just one order of ananging x A's and 11 - x B's. We now use Theorem l(b) in Sec. 24.4, which gives the number of permutations of 11 things (the 11 outcomes of the II trials) consisting of 2 classes, class I containing the III = x A's and class 2 containing the 11 - III = 11 - x B's. This number is n! x!(n - x)! = (:) .
Accordingly, (1 *) multiplied by this binomial coefficient gives the probability P(X = x) of X = x, that is, of obtaining A precisely x times in 11 trials. Hence X has the probability function
(\' = 0, I, ... , /l)
(2)
and I(x) = 0 otherwise. The distribution of X with probability function (2) is called the binomial distribution or Be1'1loll11i distriblltion. The occunence of A is called sllccess (regardless of what it actually is: it may mean that you miss your plane or lose your watch) and the nonoccunence of A is calledfailllre. Figure 516 shows typical examples. Numeric values can be obtained from Table A5 in App. 5 or from your CAS. The mean of the binomial distribution is (see Team Project 16)
(3)
J.L
= np
and the variance is (see Team Project 16)
(4)
(]'2
=
npq.
0.5
Fig. 516.
°O!:---'--'-~----:C5
O!--,--'-...L~5
p=O.l
p=O.2
jll
O:--,--'-...L.1....:!5
0~~-'-.......,.5
p=O.5
p=O.8
p =0.9
0
5
Probability function (2) of the binomial distribution for n = 5 and various values of p
1022
CHAP. 24
Data Analysis. Probability Theory
For the symmetric case of equal chance of success and failure (p the mean nl2, the variance nJ4, and the probability function
(2*)
E X AMP L E 1
f(x)
= (:)
=
q
(~r
= 112) this gives
(x = 0, 1, ... , n).
Binomial Distribution Compute the probability of obtaining at least two "Six" in rolling a fair die 4 times.
Solution.
p = peA) = P("Six") = 1/6, q 2 or 3 or 4 "Six." Hence the answer is P = J(2)
+
J(3)
+
J(4) =
(~)
=
5/6. n
= 4. The event "At leasr two
r(r
'Six'" occurs if we obtain
r(
(i % + (~) (i %) + (:) (i
I
= - 4 (6·25 + 4·5 + 6
I)
r •
171
= - - = 13.2%. 1296
Poisson Distribution The discrete distribution with infinitely many possible values and probability function
(5)
(x = 0, 1, . , .)
f(x) =
is called the Poisson distribution, named after S. D. Poisson (Sec, 18.5). Figure 517 shows (5) for some values of f.L. It can be proved that this distribution is obtained as a limiting case of the binomial distribution, if we let p ~ and n ~ 00 so that the mean f.L = np approaches a finite value. (For instance, f.L = np may be kept constant.) The Poisson distribution has the mean f.L and the variance (see Team Project 16)
°
(6) Figure 517 gives the impression that with increasing mean the spread of the distribution increases, thereby illustrating formula (6), and that the distribution becomes more and more (approximately) symmetric.
0.5
o 11 = 0.5
Fig. 517.
11=1
11=2
5
10
11=5
Probability function (S) of the Poisson distribution for various values of J1,
SEC. 24.7
1023
Binomial, Poisson. and Hypergeometric Distributions
E X AMP L E 2
Poisson Distribution If the probability of producing a defective screw is I' = 0.01. what is the probability that a lot of 100 will contain more than 2 defectives'!
screw~
Solutioll.
The complementary event is AC : Not more thaI! 2 defeethoes. For its probability we get from the binomial distribution with mean M = 111' = I the value [see (2)] P(A c ) =
(
2 98 100) 0 (J.YY lOO + (100) I (J.OI • 0.yy99 + ( 100) 2 0.01' 0.Y9 .
Since p is very small. we can approximale this by the much more conveniem Poisson distriblllion with mean J1. = 111' = 100· 0.01 = I. obtaining [see (5)]
=
91.97%.
Thu, PIA) = 8.03%. Show that the binomial distribution gives PIA) = 7.94%, so that the is quite good.
E X AMP L E 3
Poi~son
approximation •
Parking Problems. Poisson Distribution If on the average. 2 cars enter a certain parking lot per minute. what is the probability that during any given minute 4 or more car, will enter the lot?
Solutioll.
To understand that the Poisson distribution is a model of the situation. we imagine the minute to be divided into very many short time intervals, let p be the (constant) probability that a car will enter the lot during any such short interval. and assume independence of the events that happen during those imervals. Then we are dealing with a binomial distribution with very large 11 and very small 1', which we can approximate by the Pois~on distriblllion with M = 1117 = 2,
because 2 cars enter on the average. The complementary event of the event ·'4 cars or more during a given minute" is "3 cars orfewer elller the lot" and ha~ the prohability 0
f(O)
+
f(l)
+
f(2)
+
f(3) = e-
=
Answer: l·t3"k.
2
2 ( -0 1
21
+ -
I!
22
+ -
2!
3 2 )
+ -
3!
0.857.
•
(Why did we consider that complement"!)
Sampling with Replacement This mean, that we draw things from a given set one by one, and after each trial we replace the thing drawn (put it back to the given set and mix) before we draw the next thing. This guarantees independence of trials and leads to the binomial distribution. Indeed, if a box contains N things, for example. screws. M of which are defective, the probability of drawing a defective screw in a trial is p = MIN. Hence the probability of drawing a nondefective screw is q = 1 - p = 1 - MIN, and (2) gives the probability of drawing x defectives in n trials in the form
(7)
_ (11) (M)X (I - M)n-X -
f(x) -
x
N
N
(x
= 0,
I, ... ,
11).
1024
CHAP. 24
Data Analysis. Probability Theory
Sampling without Replacement. Hypergeometric Distribution Sampling without replacement means that we return no screw to the box. Then we no longer have independence of trials (why?). and instead of (7) the probability of drawing x defectives in n trials is
(x
I(x) =
(8)
=
0, I, ... ,
11).
The distribution with this probability function is called the hypergeometric distribution (because its moment generating function (see Team Project 16) can be expres~ed by the hypergeometric function defined in Sec. 5.4, a fact that we shall not use). Derivation of (8). By (4a) in Sec. 24.4 there are (a) (:) different ways of picking
11
things from N.
(b)
(~)
(c)
(Nn -- M\ different ways of picking n xJ
different ways of picking x defectives from M, x nondefectives from
N- M,
and each way in (b) combined with each way in (c) gives the total number of mutually exclusive ways of obtaining x defectives in 11 drawings without replacement. Since (a) is the total number of outcomes and we draw at random. each such way has the probability l/(Ij . From this, (8) follows.
•
The hypergeometric distribution has the mean (Team Project 16) M
(9)
J.L=/l-
N
and the variance nM(N - MeN - 11)
(10)
E X AMP L E 4
N 2 (N -
I)
Sampling with and without Replacement We want to draw random samples of two gaskets from a box containing 10 gaskets. three of which are defective. Find the probability function of the random variable X = Number of defectives ill the sample.
Solution.
We have N = 10. M = 3. N - M = 7. f(x)
=
(~) I~ (
r( r7 10
Il
= 2. For sampling with replacement. (7) yields
x
J(O) = 0.49,
f(l)
= 0.42, J(2) = 0.09.
For sampling without replacement we have to use (8), finding 21
fCO) = f(l) =
45 "" 0.47.
3 f(2) = 45 = 0.07.
•
SEC. 24.7
1025
Binomial, Poisson, and Hypergeometric Distributions
If N, M, and N - M are large compared with n, then it does not matter too much whether we sample with or without replacemellt, and in this case the hypergeometric distribution may be approximated by the binomial distribution (with p = MIN), which is somewhat simpler. Hence in slimp ling from an indefinitely large population ("infinite population") we 1Il11\' use the binomial distribution, regardless of whether we sample with or withollI replacement.
===== -... -...-.. -..
:
..
..".-~
...
--
1. Four fair coins are tossed simultaneously. Find the probability function of the random variable X = Number of heads and compute the probabilities of obtaining no heads, precisely I head, at least I head, not more than 3 heads.
2. If the probability of hitting a target in a single shot is 10% and 10 shots are fired independently. what is the probability that the target will be hit at least once? 3. In Prob. 2, if the probability of hitting would be 5% and we fired 20 shots. would the probability of hitting at least once be less than. equal to, or greater than in Prob. 2? Guess first, then compute. 4. Suppose that 3% of bolts made by a machine are defective, the defectives occurring at random during production. If the bolts are packaged 50 per box, what is the Poisson approximatIOn of the probabilit) that a given box will contain x = 0, 1, ... , 5 defeClives? 5. Let X be the number of cars per minute passing a certain point of some road between 8 A.M. and 10 A.M. on a Sunday. Assume that X has a Poisson disuibution with mean 5. Find the probability of observing 3 or fewer cars during any given minute. 6. Suppose thar a telephone switchboard of some company on the average handles 300 calls per hour, and that the board can make at most 10 connections per minute. Using the Poisson disU'ibution, estimate the probability that the board will be overtaxed during a given minute. (Use Table A6 in App. 5 or your CAS.) 7. (Rutherford-Geiger experiments) In 1910. E. Rutherford and H. Geiger showed experimenrally that the number of alpha particles emitted per second in a radioactive process is a random variable X having a Poisson distribution. If X has mean 0.5, whar is the probability of observing two or more particles during any given second? 8. A process of manufacturing screws is checked every hour by inspecting 11 screws selected at random from that hour's production. If one or more screws are defective, the process is halted and carefully examined. How large should n be if the manufacturer wants the probability to be about 95% that the process will be
haIted when 10% of the screws being produced are defective? (Assume independence of the quality of any screw of that of the other screws.)
9. Suppose that in the production of 50-n resistors, nondefective items are those that have a resistance between 45 nand 55 n and the probability of a resistor's being defective is 0.2%. The resistors are sold in lots of 100, with the guarantee that all resistors are nondefective. What is the probability that a given lot will violate this guarantee? (Use the Poisson distribution.) 10. Let P = lo/c be the probability that a certain type of lightbulb will fail in a 24-hr test. Find the probability that a sign consisting of 10 such bulbs will bum 24 hours with no bulb failures. 11. Guess how much less the probability in Prob. 10 would be if the sign consisted of 100 bulbs. Then calculate. 12. Suppose that a certain type of magnetic tape contains. on the averdge, 2 defects per 100 meters. What is the probability that a roll of tape 300 meters long will contain (a) x defects, (b) no defects?
13. Suppose thar a test for extrasensory perception consists of naming (in any order) 3 cards randomly drawn from a deck of 13 cards. Find the probability that by chance alone, the person will correctly name (a) no cards, (b) I card, (c) 2 cards, (d) 3 cards. 14. A carton contains 20 fuses, 5 of which are defective. Find the probability that. if a sample of 3 fuses is chosen from the carton by random drawing without replacement, x fuses in the sample will be defective. 15. (Multinomial distribution) Suppose a trial can result in precisely one of k mutually exclusive events Ab ... , Ak with probabilities PI' .•• , Pk' respectively, where PI + ... + Pk = 1. Suppose that /I independent trials are performed. Show that the probability of getting Xl AI's, . . • , Xk Ak's is
where Xl
0
~ Xj ~ II,
+ ... + Xk
= 11.
j = I, ... , k. and The distribution having this
CHAP. 24
1026
Data Analysis. Probability Theory (b) Shov. that the binomial distribution has the moment generating function
probability function is called the lIlultinomial distribution.
16. TEAM PROJECT. Moment Generating Function. The moment generating function G(t) is defined by ~
tX
tx·
G(t) = E(e J) = .L.J e 'f(xj)
= (pet
+ q)".
or G(t)
=
E(etx )
=
(c) Using (b), prove (3).
fX et·t:f(x) dx
(d)
-x
where X is a discrete or continuous random variable, respectively. (a) Assuming that termwise differentiation and differentiation under the integral sign are permissible, show that E(Xk) = dkl(O), where d k ) = dkG/dtk. in particular, /L = G' (0).
24.8
Prove (4).
(e) Show that the Poisson distribution has the moment generating function G{t) = e-lLe lLe ' and prove (6).
(f)
Prove x
(~)
Using this. prove
= M
(~ =!) .
(9).
Normal Distribution Turning from discrete to continuous distributions, in this section we discuss the normal distribution. This is the most important continuous distribution because in applications many random variables are normal random variables (that is, they have a normal distribution) or they are approximately normal or can be transformed into normal random variables in a relatively simple fashion. Furthermore, the normal distribution is a useful approximation of more complicated distributions. and it also occurs in the proofs of various statistical tests. The normal distribution or Gauss distribution is defined as the distribution with the density
(1)
f(x) = -I- exp [_ 21
d\!2;
(x -cr )2J J.L
(cr> 0)
where exp is the exponential function with base e = 2.718 .... This is simpler than it may at first look. fex) has these features (see also Fig. 518). 1. J.L is the mean and cr the standard deviation.
2. l/(dV2;) is a constant factor that makes the area under the curve of f(x) from to x equal to L as it must be by (0), Sec. 24.5.
-x
3. The curve of f(x) is symmetric with respect to x = J.L because the exponent is quadratic. Hence for J.L = 0 it is symmetric with respect to the y-axis x = 0 (Fig. 518, "bell-shaped curves"). 4. The exponential function in (1) goes to zero very fast-the faster the smaller the standard deviation cr is, as it should be (Fig. 518).
SEC. 24.8
1027
Normal Distribution
((x)
cr = 1.0 2
Fig. 518.
x
Density (1) of the normal distribution with /L = 0 for various values of u
Distribution Function F(x) From (7) in Sec. 24.5 and (I) we see that the normal di<;tribution has the distribution function
(2)
.~ IX 27T
Fex) =
(TV
exp [- 21 (u - J.L (T
-00
)2J du.
Here we needed x as the upper limit of integration and wrote u (instead of x) in the integrand. For the conesponding standardized normal distribution with mean 0 and standard deviation I we denote F(x) by
I Vh I
(3)
Z
e- u2/2 duo
cI>(.:) = - -
-x
This integral cannot be integrated by one of the methods of calculus. But this is no serious handicap because its values can be obtained from Table A 7 in App. 5 or from your CAS. These values are needed in working with the normal distribution. The curve of cI>(z) is S-shaped. It increases monotone (why?) from 0 to 1 and intersects the vertical axis at 112 (why?), as shown in Fig. 519. Relation Between F(x) and «I>(z). Although your CAS will give you values of F(x) in (2) with any J.L and (T directly, it is important to comprehend that and why any such an F(x) can be expressed in terms of the tabulated standard cI>(z), as follows. y ,",(xl
~
1.0 0.8
/
0.6
o. /0.2
-3
Fig. 519.
-2
-1
0
2
3
x
Distribution function (z) of the normal distribution with mean 0 and variance 1
1028
CHAP. 24
THE 0 REM 1
Data Analysis. Probability Theory
Use of the Normal Table A7 in App. 5
r
The distributioll fimctioll FCr) of the nonnal distriblltioll with allY J.L and 0" see (2)] is related to the stalldardi-;.ed distribution filllCtiOIl <1>(.:) ill (3) by the formula
(4)
PROOF
F(x)
= ( -x-J.L) 0"-
.
Comparing (2) and (3) we see that we should set
u=
Then v
= x gives
u= 0"
0"
as the new upper limit of integration. Also v - J.L 0" drops out. I I(X-P)/u u2 2 F(x)
= --O"~
e-
/
= 0"
O"U,
du
thus dv
=
0" du.
Together, since
X ) = ( _ _J.L_
0"
-ex:
•
Probabilities corresponding to intervals will be needed quite frequently in statistics in Chap. 25. These are obtained as follows. THEOREM 2
Normal Probabilities for Intervals The probability that a normal random variable X with mean J.L and standard del'iatioll 0" assume allY vallie ill all inten'al a < x ~ b is
(5)
PROOF
Pea
<
X
~
b) = F(b) - F(a) = ( -b-J.L) 0 " - - <J:> (a-J.L) -0"- .
.
Formula (2) in Sec. 24.5 gives the first equality in (5). and (4) in this section gives the ~~~~~.
Numeric Values In practical work with the normal disnibution it is good to remember that about 2/3 of all values of X to be observed will lie between J.L ± 0", about 95% between J.L ± 20", and practically all between the three-sigma limits J.L ± 30". More precisely, by Table A7 in App. 5,
(6)
<X
(a)
P(J.L - 0"
(b)
P(J.L - 20"
<
(c)
P(J.L - 30"
<X
~
J.L
X ~ J.L ~ J.L
+
0") = 68%
+ 20") = +
95.5%
30") = 99.7%.
Formulas (6a) and (6b) are illustrated in Fig. 520. The formulas in (6) show that a value deviating from JL by more than 0", 20", or 30" will occur in one of about 3, 20, and 300 trials, respectively.
SEC. 24.8
1029
Normal Distribution
2.25% -\
r
.u-20 (a)
95.5%
'/
2.25%
'tl
!
.u
.u+20
(b)
Fig. 520.
Illustration of formula (6)
In tests (Chap. 25) we shall ask conversely for the intervals that corre<;pond to certain given probabilities; practically most important are the probabilities of 95%, 99%, and 99.9%. For these, Table AS in App. 5 gives the answers J.L ± 2u, J.L ± 2.5u, and J.L ± 3.3u, respectively. More precisely,
(7)
(a)
P(J.L - 1.96u
<X
~
J.L
+
1.96u)
= 95%
(b)
P(J.L - 2.5Su
<X
~
J.L
+
2.5Su)
= 99%
(e)
P(J.L - 3.29u
<X
~
J.L
+
3.29u) = 99.9%.
Working With the Normal Tables A7 and AS in App. 5 There are two normal tables in App. 5, Tables A7 and A8. If you want probabilities, use Table A7. If probabilities are given and corresponding intervals or x-values are wanted, use Table AS. The following examples are typical. Do them with care. verifying all values, and don't just regard them as dull exercises for your software. Make sketches of the density to see whether the results look reasonable. E X AMP L E 1
Reading Entries from Table A7 If X is standardi7ed nonnal (so that /L = O. a = 1), then P(X ~ 2.'14) = 0.9927 = 99~% P(X~
P(X
~
PO.O
E X AMP L E 2
-1.16)
= 1 - <1>(1.16) = 1 - 0.8770 = 0.1230 = 12.3'J1:
1) = 1 - P(X
~
X
~
~
I)
=
1 - 0.H413 = 0.1587 by (7), Sec. 24.3
1.8) = <1>(1.8) - <1>(1.0) = 0.9641 - 0.8413 = 0.1228.
•
Probabilities for Given Intervals, Table A7 Let X be normal with mean 0.8 and variance 4 (so that a = 2). Then by (4) and (5) P(X ~ 2.44)
=
F(2.44)
= q) (
2.44 - 0.80 ) 2
= <1>(0.82) = 0.7939
= 80%
or if you like it better (similarly in the other cases) P(X ~ 2.44)
PIX ~ 1)
=
=
P(
2.44 - 0.80 ) X - 0.80 2 ~ 2
=
1-08) I - P(X ~ 1) = I - ( --2-'-
P(1.0 ~ X ~ 1.8) = <1>(0.5) - <1>(0.1)
=
P(Z ~ 0.82)
=
= 0.7939
I - 0.5398 = 0.4602
0.6915 - 0.5398 = 0.1517.
•
1030
E X AMP L E 3
CHAP. 24
Data Analysis. Probability Theory
Unknown Values c for Given Probabilities, Table AS Let X be nonnal with mean 5 and variance 0.04 (hence standard deviation 0.1). Find c or k corresponding to the given probability PIX
~ c) =
P(5 - k P(X
E X AMP L E 4
~
950/(. X
~ c) =
~
19C.
5)
c -
=
5 + k) = 90%, thus PIX
c - 5 0.1 = 1.645.
95%.
5 + k = 5.319 ~ c) =
(as before: why?)
c-5
0:2
99%.
c = 5.319
=
c = 5.46S.
2.326.
•
Defectives In a production of iron rods let the diameter X be nunnally distributed with mean 2 in. ilnd standard deviation 0.008 in. (a) What percentage of defectives can we expect if we ,et the tolerance limit, at 2 ::':: 0.01 in.? (b) How should we set the tolerance limits to allow for 4'lt defectives?
Solutioll.
(a) I!'k because from (S) and Table A7 we obtain for the complementary event the probability
P(
1.98 ~ X ~ 2.02) =
( 1.9H - 2.00 ) 2.02 - 2.00) 0.008 -
=
<1)(2.5) -
=
0.9938 - (I - 0.9938)
=
0.9876
= 98~%. (b) 2 ::':: 0.0164 because for the complementary event we have
0.96
=
P(2 - c ~ X ~ 2
+ c)
or 0.98 = P(X
~
2 +
c)
so that Table A8 gives
0.98 =
1+C-2) 0 . 0.08 2+c-2 0.008
=
2.0S4.
•
c = 0.0164.
Normal Approximation of the Binomial Distribution The probability function of the binomial distribution is (Sec. 24.7)
(8)
(x
= 0, 1, ... , n).
If 11 is large, the binomial coefficients and powers become very inconvenient. It is of great practical (and theoretical) importance that in this case the normal distribution provides a good approximation of the binomial distribution, according to the following theorem, one of the most important theorems in all probability theory.
SEC. 24.8
1031
Normal Distribution
THEOREM 3
Limit Theorem of De Moivre and Laplace
For large n. f(x) ~ f*(x)
(9)
Here
f
(x
= 0, 1, ... , n).
is given by (8). The function
(10)
f* (x) =
x - np
---c=---==
V2;v,;pq
v;pq
is the density of the n01711al distribution with mean J.L = np and variance (j2 = npq (the mean and variance of the billomial distribution). The symbol ~ (read asymptotically equal) means that the ratio of both sides approaches 1 as n approaches so. Flirthe17nore. fbr any llOllIlegative integers a and b (> a).
Pea
~ X ~ b) = ~a (:) p~:qn-x ~ cI>(f3) -
cI>(ex),
(11 )
ex=
a - IIp - 0.5
v,;pq
13=
b - I1p + 0.5
v,;pq
A proof of this theorem can be found in [03] listed in App. 1. The proof shows that the term 0.5 in ex and 13 is a correction caused by the change from a discrete to a continuous distribution.
11-131
NORMAL DISTRIBUTION
1. Let X be normal with mean 80 and variance 9. Find P(X > 83). P(X < 81), P(X < 80), and P(78 < X < 82).
2. Let X be normal with mean 120 and variance 16. Find P(X;;;; 126), P(X > 116), P(l25 < X < 130). 3. Let X be normal with mean 14 and variance 4. Determine c such that P(X ;;;; c) = 95%, P(X ;;;; c) = 5%, P(X;;;; c) = 99.5%.
under the guarantee? 6. If the standard deviation in Prob. 5 were smaller. would that percentage be smaller or larger? 7. A manufacturer knows from experience that the resistance of resistors he produces is normal with mean /L = 150 n and standard deviation (T = 5 n. What percentage of the resistors will have resistance between 148 nand 152 n? Between 140 nand 160 n?
4. Let X be normal with mean 4.2 and variance 0.04. Find c such that P(X ;;;; c) = 50%, P(X > C) = 10%, P(-c < X - 4.2;;;; c) = 99%.
8. The breaking strength X [kg] of a certain type of plastic block is normally distributed with a mean of 1250 kg and a standard deviation of 55 kg. What is the maximum load such that we can expect no more than 5% of the blocks to break?
5. If the lifetime X of a certain kind of automobile battery is normally distributed with a mean of 4 yr and a standard deviation of I yr, and the manufacturer wishes to guarantee the battery for 3 yr, what percentage of the batteries will he have to replace
9. A manufacturer produces airmail envelopes whose weight is normal with mean /L = 1.950 grams and standard deviation if = 0.025 grams. The envelopes are sold in lots of 1000. How many envelopes in a lot will be heavier than 2 grams?
1032
CHAP. 24
Data Analysis. Probability Theory
10. If the resistance X of ce11ain wires in an electrical network is normal with mean 0.01 D and standard deviation 0.001 D, how many of 1000 wires will meet the specification that they have resistance between 0.009 and 0.011 Q?
(d) Considering cI>2(ao) and introducing polar coordinates in the double integral (a standard trick worth remembering), prove (12)
11. If the mathematics scores of the SAT college entrance exams are normal with mean 480 and standard deviation 100 (these are about the actual values over the past years) and if some college sets 500 as the minimum score for new students, what percent of students will not reach that score?
1
~
\. 2'11"
IX e-
ll 2
12
dll
= 1.
-x
(I) Bernoulli's law oflarge nwnbers.ln an experiment let an event A have probability p (0 < P < I), and let X be the number of time~ A happens in /I independent trials. Show that for any given E > 0,
as
/1-+
x.
(g) Transformation. If X is normal with mean J.L and variance u 2 , show that X* = clX + C2 (cI > 0) is normal with mean J.L* = CIJ.L + C2 and variance U*2
= C1 2 U 2 .
15. WRITING PROJECT. Use of Tables. Give a systematic discussion of the use of Tables A7 and A8 for obtaining P(X < b), P(X > a), P(a < X < fA PIX < c) = k. P(X > c) = k. as well as P(J.L - C < X < J.L + c) = k: include simple examples. If you have a CAS. describe to what extent it makes the use of those tables superfluous; give examples.
14. TEAM PROJECT. Normal Distribution. (a) Derive the formula~ in (6) and (7) from the appropriate normal table. (b) Show that cI>(-:) = I - cI>(:). Give an example. (e) Find the points of inflection of the curve of (1).
24.9
=
(e) Show that u in (1) is indeed the standard deviation of the normal distribution. [Use (12).]
12. If the monthly machine repair and maintenance cost X in a ce11ain factory is known to be normal with mean $12000 and standard deviation $2000, what is the probability that the repair cost for the next month will exceed the hudgeted amount of $150007 13. [f sick-leave time X used by employees of a company in one month is (very roughly) normal with mean 1000 hours and standard deviation 100 hours. how much time t should be budgeted for sick leave during the next month if t is to be exceeded with probability of only 20o/c?
cI>(x)
Distributions of Several Random Variables Distributions of two or more random variables are of interest for two reasons:
1. They occur in experiments in which we observe several random variables, for example, carbon content X and hardness Y of steel, amount of fertilizer X and yield of corn Y, height Xl' weight X 2 , and blood pressure X3 of persons, and so on. 2. They will be needed in the mathematical justification of the methods of statistics in Chap. 25. In this section we consider two random variables X and Yor, as we also say, a twodimensional random variable (X, Y). For (X, Y) the outcome of a trial is a pair of numbers X = x, Y = y, briefly (X, Y) = (x, y), which we can plot as a point in the XY-plane. The two-dimensional probability distribution of the random variable (X, Y) is given by the distribution function (1)
F(:.:, y)
=
P(X ~
x,
Y ~ y).
This is the probability that in a trial, X will assume any value not greater than x and in the same trial, Y will assume any value not greater than y. This corresponds to the blue region in Fig. 521, which extends to -00 to the left and below. F(x, y) determines the
SEC. 24.9
1033
Distributions of Several Random Variables
Fig. 521.
Formula (1)
probability distribution uniquely, because in analogy to formula (2) in Sec. 24.5, that is, < X ~ b) = F(b) - F(a), we now have for a rectangle (see Prob. 14)
Pea
As before, in the two-dimensional case we shall also have discrete and continuous random variables and distributions.
Discrete Two-Dimensional Distributions In analogy to the case of a single random variable (Sec. 24.5), we call (X, Y) and its distribution discrete if (X, Y) can assume only finitely many or at most countably infinitely many pairs of values (XI' YI), (X2' )'2), '" with positive probabilities, whereas the probability for any domain containing none of those values of (X, Y) is zero. Let (x;, -':) be any of those pairs and let P(X = Xi' Y = Yj) = Pij (where we admit that Pij may be 0 for certain pairs of subscripts i, j). Then we define the probability function f(x. y) of ex, Y) by (3)
f(x, y) = Pij
if x = Xi, Y = Yj
f(x, y) = 0
and
otherwise;
here, i = 1,2, ... andj = 1,2, ... independently. In analogy to (4), Sec. 24.5, we now have for the distribution function the formula
(4)
Instead of (6) in Sec. 24.5 we now have the condition
2: 2: f(Xi' Yj)
(5)
=
1.
j
E X AMP LEI
Two-Dimensional Discrete Distribution If we
~imilltaneously
toss a dime and a nickel and consider
x=
Number of heads the dime turns up,
Y = Number of heads the nickel turns up, then X and Y can have the values 0 or 1. and the probability function is fCO, 0) = fO, 0) = f(O, 1) = f(1, 1) =~,
f(x, y) =
°
otherwise.
•
1034
CHAP. 24
Data Analysis. Probability Theory y
Fig. 522.
Notion of a two-dimensional distribution
Continuous Two-Dimensional Distributions In analogy to the case of a single random variable (Sec. 24.5) we caIl (X, Y) and its distribution continuous if the corresponding distribution function F(x, \') can be given by a double integral
(6)
F(x, y)
=
I
,j
IX
-00
f(x*, y*) dx* dy*
-':X:l
whose integrand f, called the density of (X, Y), is nonnegative everywhere, and is continuous, possibly except on finitely many curves. From (6) we obtain the probability that (X, Y) assume any value in a rectangle (Fig. 522) given by the formula
(7) E X AMP L E 2
Two-Dimensional Uniform Distribution in a Rectangle Let R be the rectangle "1 < x
(8)
~
f31' "2
f(x. y) = Ilk
< Y ~ f32' The density (see Fig. 523) if (x. y)
i~
in R.
f(x, y) = 0 otherwise
defines the so-called uniform distribution ill the rectallgle R: here k = (f31 - "1)({32 - "2) is the area of R. The distribution function is shown in Fig. 524. •
y
x x
o Fig. 523.
Density function (8) of the uniform distribution
o Fig. 524. Distribution function of the uniform distribution defined by (8)
Marginal Distributions of a Discrete Distribution This is a rather natural idea, without counterpart for a single random variable. It amounts to being interested only in one of the two variables in (X, Y), say, X, and asking for its
SEC. 24.9
1035
Distributions of Several Random Variables
distribution, called the marginal distribution of X in (X, V). So we ask for the probability = x, Yarbitrary). Since (X, Y) is discrete, so is X. We get its probability function, call it f1(x), from the probability function f(x, y) of (X, Y) by summing over y:
P(X
(9)
f1(x)
=
P(X
=
=
x, Yarbitrary)
2: f(x, y) y
where we sum all the values of f(x, y) that are not 0 for that x. From (9) we see that the distribution function of the marginal distribution of X is (10)
Similarly, the probability function (11)
f2(Y)
=
P(X arbitrary. Y
=
=
y)
2: f(x. y) :r
determines the marginal distribution of Y in (X, V). Here we sum all the values of f(x, y) that are not zero for the conesponding y. The distribution function of this marginal distribution is (12) y*~y
E X AMP L E 3
Marginal Distributions of a Discrete Two-Dimensional Random Variable In drawing 3 cards with replacement from a bridge deck let us consider (X. Yl.
x=
NlIl11ber of queens.
Y = Number of kings or lIces.
The deck has 52 cards. These include 4 queens. 4 kings. and 4 aces. Hence in a single trial a queen has probability 4/52 = 1/13 and a king or ace 8/52 = 2/13. This gives the probability function of (X, Y), 3! (-I /<x,\") = . x! y! (3 - x - y)! 13
)X ( - 2 )Y ( - 10 )3-T-Y 13
(x
13
+ y
~
3)
and fIx. y) = 0 otherwise. Table 24.1 shows in the center the values of fIx, y) and on the right and lower margins the values of the probability functions hex) am] hey) of the marginal distributions of X and Y, respectively . •
Table 24.1 Values of the Probability Functions f{x, y), fl{X), f 2{Y) in Drawing Three Cards with Replacement from a Bridge Deck, where X is the Number of Queens Drawn and Y is the Number of Kings or Aces Drawn
x 0
y
0
I
2
3
f1(x)
1000 2197
600 2197
120 2197
8 2197
1728 2197
,IOU
1
2197
120 2197
12 2197
0
432 2197
2
30 2197
6 2197
0
0
36 2197
3
1 2197
0
0
0
2197
f2(Y)
1331 2197
726 2197
132 2197
8 2197
1
1036
CHAP. 24
Data Analysis. Probability Theory
Marginal Distributions of a Continuous Distribution This is conceptually the same as for discrete distributions. with probability functions and sums replaced by densities and integrals. For a continuous random variable (X, Y) with density f(x, y) we now have the marginal distribution of X in (X. Yl. defined by the distribution function (13)
Fl(x)
=
P(X ;:;; x,
-QO
<
Y
<
ce)
=
IX
fl(X*) dx*
-co
with the density f 1 of X obtained from f(x. y) by integration over y,
(14)
fleX)
=
I=
f(x, y) dy.
-x
Interchanging the roles of X and Y, we obtain the marginal distribution of Y in (X, Y) with the distribution function (15)
F 2(y) = P( - x
< X<
ce, y;:;; y) =
I
y
f2(V*) dy*
-x
and density
(16)
f2(Y)
=
fX f(x, y) dx. -co
Independence of Random Variables X and Y in a (discrete or continuous) random variable (X, Y) are said to be independent if (17) holds for all (x, y). Otherwise these random variables are said to be dependent. These definitions are suggested by the corresponding definitions for events in Sec. 24.3. Necessary and sufficient for independence is (IS)
for all x and y. Here the f's are the above probability functions if (X, Y) is discrete or those densities if (X, Y) is continuous. (See Prob. 20.) E X AMP L E 4
Independence and Dependence In tossing a dime and a nickel. X = Number of heads all the dillie, Y = Number of headf all the nickel may • assume the values 0 or I and are independent. The random variables in Table 24.1 are dependent.
Extension of Independence to II-Dimensional Random Variables. This will be needed throughout Chap. 25. The distribution of such a random variable X determined by a distribution function of the form
= (Xl> ... , Xn) is
SEC. 24.9
1037
Distributions of Several Random Variables
The random variables Xl> .... Xn are said to be independent if (19)
for all (Xl> ••• , xn). Here of Xj in X, that is,
FiXj)
is the distribution function of the marginal distlibution
Otherwise these random variahles are said to be dependent.
Functions of Random Variables When 11 = 2, we write Xl = X, X 2 = Y, Xl = X, X2 = y. Taking a nonconstant continuous function g(x, y) defined for all x, y, we obtain a random variable Z = g(X, Y). For example, if we roll two dice and X and Yare the numbers the dice turn up in a trial, then Z = X + Y is the sum of those two numbers (see Fig. 513 in Sec. 24.5), In the case of a discrete random variable (X, Y) we may obtain the probability function f(:::.) of Z = g(X. Y) by summing all f(x, y) for which g(x, y) equals the value of :::. considered; thus (20)
=
f(:::.)
P(Z
LL f(x, y).
= :::.) =
g(x.y)~z
Hence the distribution function of Z is (21)
=
F(::.)
P(Z ~ :::.)
=
LL f(x, y) g(x,Y)""Z
where we sum all values of f(x, y) for which g(x, y) ~ z. In the case of a continuous random variable eX, Y) we similarly have (22)
F(z)
=
P(Z
~ z) =
ff
f(x, y) dx dy
g(x,Y)""z
where for each z we integrate the density f(x, y) of (X, Y) over the region g(x, y) the xy-plane. the boundary curve of this region being g(x, v) = z.
~
z in
Addition of Means The number
LL (23)
x
E(g(X, Y»
=
{
x
g(x, y)f(x. y)
[(X, Y) discrete]
Y x
ix ixg(X,
y)f(x, y) dx dy
[(X, Y) continuous]
CHAP. 24
1038
Data Analysis. Probability Theory
is called the mathematical expectatio/1 or, briefly, the expectation of g(X, Y). Here it is assumed that the double series converges absolutely and the integral of Ig(x, y)I.f(x, y) over the xy-plane exists (is finite). Since summation and integration are linear processes, we have from (23) (24)
E(ag(X, Y)
+ bh(X, Y»
=
aE(g(X, Y»
+ bE(h(X, Y».
An imponam special case is E(X
+
Y)
= E(X) + E( Y),
and by induction we have the following result.
THEOREM 1
Addition of Means
The mean (expectation) of a sum of random variables equals the (expectations), that is,
Sll1l1
of the means
Furthermore, we readily obtain
THEROEM 2
Multiplication of Means
The mean (e.\pectation) of the product (~lilldependellt random variables equals the product qf the meam (expectatiolls), that is, (26)
PROOF
If X and Yare independent random variahles (both discrete or hoth continuous), then E(XY) = E(X)E(Y). In fact, in the di~crete case we have E(XY)
=
2: 2: xyf(x, y) x
y
=
2: xfl(x) 2: yf2(Y) = E(X)E(y), y
and in the continuous case the proof of the relation is similar. Extension to random variables gives (26), and Theorem 2 is proved.
/1
independent •
Addition of Variances This is another matter of practical impOltance that we shall need. As before, let Z = X + Y and denote the mean and Valiance of Z by I-t and u 2 • Then we first have (see Team Project 16(a) in Problem Set 24.6)
SEC. 24.9
1039
Distributions of Several Random Variables
From (24) we see that the first term on the right equals
For the second term on the right we obtain from Theorem I [E(z)f
=
[E(X)
+
E(y)]2
=
lE(X)f
+
2E(X)E(Y)
+
[E(y)]2.
By substituting these expressions into the formula for u 2 we have u2
=
E(X2) - [E(X)]2
+
+ E(y2)
- [E(Y)]2
2[E(XY) - E(X)E(Y)].
From Team Project 16, Sec. 24.6, we see that the expression in the first line on the right is the sum of the variances of X and Y, which we denote by U1 2 and U2 2, respectively. The quantity in the second line (except for the factor 2) is (27)
UXY = E(XY) - E(X)E(Y)
and is called the covariance of X and Y. Consequently, our result is (28) If X and Y are independent, then E(XY)
= E(X)E(Y):
hence UXY = 0, and (29)
Extension to more than two variables gives the basic
THEOREM 3
Addition of Variances
The variance of the sum of independent random variables equals the sum of the variances of these variables.
CAUTION! In the numerous applications of Theorems 1 and 3 we must always remember that Theorem 3 holds only for independent variables. This is the end of Chap. 24 on probability theory. Most of the concepts, methods, and special distributions discussed in this chapter will play a fundamental role in the next chapter, which deals with methods of statistical inference, that is, conclusions from samples to populations. whose unknown properties we want to know and try to discover by looking at suitable properties of samples that we have obtained.
CHAP. 24
1040
Data Analysis. Probability Theory
1. Let f(x, y) = k when 8 ~ x ~ 12 and 0 ~ y ~ 2 and zero elsewhere. Find k. Find P(X ~ 11, 1 ~ Y ~ 1.5) and P(9 ~ X ~ 13, Y ~ I).
2. Find P(X > 2, Y> 2) and P(X ~ 1. Y ~ 1) if (X. Y) has the density f{x, y) = 1/8 if x ~ O. Y ~ 0, X -t- Y ~ 4. =
k if x
> O.
y
f(x,
y) =
2500
if
0.99 < x < 1.01. 1.00 < Y < 1.02
> 0, x +
y < 3 and 0 otherwise. Find k. Sketch f(x, y). Find P(X + Y ~ I), P(Y> X).
3. Let f(x, y)
12. Let X [cm1 and Y [cm 1 be the diameter of a pin and hole. respectively. Suppose that (X, Y) has the density
4. Find the density of the marginal distribution of X in Prob 2.
5. Find the density of the marginal distribution of Y in Fig. 523.
and 0 otherwise. (a) Find the marginal distributions. (b) What is the probability that a pin chosen at random will fit a hole whose diameter is l.00?
13. An electronic device consists of two components. Let X and Y [months] be the length of time until failure of the first and second component, respectively. Assume that (X, Y) has the probability density
6. If certain sheets of wrapping paper have a mean weight of 10 g each. with a standard deviation of 0.05 g. what are the mean weight and standard deviation of a pack of IO 000 sheets?
7. What are the mean thickness and the standard deviation of transformer cores each consisting of 50 layers of sheet metal and 49 insulating paper layers if the metal sheets have mean thickness 0.5 mm each with a ~tandard deviation of 0.05 mm and the paper layers have mean 0.05 mm each with a standard deviation of O.02mm?
S. If the weight of certain (empty) containers has mean 2 Ib and standard deviation 0.1 lb. and if the filling of the containers has mean weight 751b and standard deviation 0.8 lb. what are the mean weight and standard deviation of filled containers'!
9. A 5-gear assembly is put together with spacers between the gears. The mean thickness of the gear~ is 5.020 em with a standard deviation of 0.003 cm. The mean thickness of the spacers is 0.040 cm with a standard deviation of 0.002 cm. Find the mean and standard deviation of the a~sembled units consisting of 5 randomly selected gears and 4 randomly selected spacers.
10. Give an example of two different discrete distributions that have the same marginal distributions.
f(x, y) = 0.01 e -O.l(x+y)
if x > 0 and y > 0
and 0 otherwise. (a) Are X and Y dependent or independent? (b) Find the densities of the marginal distributions. (c) What is the probability that the first component has a lifetime of 10 months or longer?
14. Prove (2). 15. Find P(X > Y) when (X. Y) has the density
ftx, Y) = 0.25e- O.5 (x+y) if x
~
0, y
~
0
and 0 otherwise.
16. Let (X. Y) have the density f(x,
y) =
k if x 2
+ y2 <
1
and 0 otherwise. Determine k. Find the densities of the marginal distributions. Find the probability P(X 2
+
y2
<
1/4).
17. Let (X, Y) have the probability function f(O.o)
=
f(1. I)
=
lIS.
f(O, 1) = f(l. 0) = 3/8.
11. Show that the random vatiables with the densities Are X and Y independent? f(l:, y) =
X
+Y
and g(x, y) = (x
if 0
~
x
~
g(x. y) =
distribution.
+ ~)(y + ~)
I, 0 ~ Y ~ I and f(x. y) = 0 and 0 elsewhere. have the same marginal
IS. Using Theorem 1, obtain the formula for the mean of the hypergeometrie distribution. Can you use Theorem 3 to obtain the variance of that distribution? 19. Using Theorems I and 3, obtain the formulas for the mean and the variance of the binomial distribution. 20. Prove the statement involving (18).
1041
Chapter 24 Review Questions and Problems
. ::a.::iI ...':==IU STIONS AND PROBLEMS 1. Why did we begin the chapter with a section on handling data?
23. Find the mean, standard deviation, and variance in Prob.21.
2. What are stem-and-Ieaf plots? Boxplots? Histograms? Compare their advantages.
24. Find the mean, standard deviation, and variance Prob.22.
3. What quantities measure the average size of data? The spread?
25. What are the outcomes of the sample space of
4. Why did we consider probability theory? What is its
26. What are the outcomes in the sample space of the experiment of simultaneously tossing three coins?
role in statistics?
In
X: Tossing a coin until the first Head appears?
5. What do we mean by an experiment? By a random variable related with it? What are outcomes? Events?
27. A box contains 50 screws, five of which are defective. Find the probability function of the random variable
6. Give examples of experiments in which you have equally likely cases and others in which you don't.
X = Number of defective screws in drawing tl1'O screws without replacement and compute its values.
7. State the definition of probability from memory.
28. Find the values of the distribution function in Prob. 27.
8. What is the difference between the concepts of a permutation and a combination?
29. Using a Venn diagram, show that A <::;; B if and only if AUB=B.
9. State the main theorems on probability. IIIustmte them by simple examples.
30. Using a Venn diagram, show that A <::;; B if and only if
10. What is the distribution of a random variable? The distribution function? The probability function'! The density?
31. If X has the density f(x) = 0.5x (0 :;::: x :;::: 2) and o otherwise, what are the mean and the Vallance of X* = -2X + 5?
11. State the definitions of mean and variance of a random variable from memory.
32. If 6 different inks are available, in how many ways can we select two colors for a printing job? Four colors?
12. If peA)
=
PCB) and A <::;; B, can A =1= B?
13. If E =1= S (= the sample space), can P(E)
=
I?
14. What distributions correspond to sampling with replacement and without replacement? 15. When will an experiment involve a binomial distribution? A hypergeometric distribution? 16. When will the Poisson distribution be a good approximation of the binomial distribution? 17. What do you know about the approximation of the binomial distribution by the normal distribution?
An B =A.
33. Compute 5! by the Stirling formula and find the absolute and relative errors. 34. Two screws are randomly drawn without replacement from a box containing 7 right-handed and 3 lefthanded screws. Let X be the number of left-handed screws drawn. Find P(X = 0), P(X = I), P(X = 2), P(I
<
X
<
2), P(O
<
X
<
5).
35. Find the mean and the variance of the distribution having the density f(x) = ~e-Ixl. 36. Find the skewness of the distribution with density f(x) = 2(1 - x) if 0 < x < I, f(x) = 0 otherwise.
18. Explain the use of the tables of the normal distribution. If you have a CAS, how would you proceed without the tables?
37. Sketch the probability function f(x) = x 2 /30 (x = 1, 2, 3,4) and the distribution function. Find
19. Can the probability function of a discrete random variable have infinitely many positive values?
38. Sketch F(x) = 0 if x :;::: 0, F(x) = 0.2x if 0 F(x) = 1 if x > 5, and its density f(x).
20. State the most important facts about distributions of two random variables and their marginal distributions.
39. If the life of tires is normal with mean 25 000 km and variance 25 000 000 km 2 , what is the probability that a given one of those tires will last at least 30 000 km? At least 35000 km?
21. Make a stem-and-Ieaf plot, histogram, and boxplot of the data 22.5. 23.2, 22.1, 23.6, 23.3, 23.4, 24.0, 20.6, 23.3. 22. Do the same task as in Prob. 21, for the data 210, 213, 209,218,210,215,204,211,216,213.
<
/-L.
x :;::: 5,
40. If the weight of bags of cement is normal with mean 50 kg and standard deviation 1 kg, what is the probability that 100 bags will be heavier than 5030 kg?
1042
CHAP. 24
Data Analysis. Probability Theory
: ' ·-11 :
Data Analysis. Probability Theory A random experiment, briefly called experiment, is a process in which the result ("outcome") depends on "chance" (effects of factors unknown to us). Examples are games of chance with dice or cards, measuring the hardness of steel. observing weather conditions, or recording the number of accidents in a city. (Thus the word "experiment" is used here in a much wider sense than in common language.) The outcomes are regarded as points (elements) of a set S, called the sample space, whose subsets are called events. For events E we define a probability PtE) by the axioms (Sec. 24.3)
o~ (1)
peE)
u
£2 U ... )
=
I
= I
peS) peEl
~
peEl)
+ P(£2) +
These aXiOms are motivated by properties of frequency distributions of data (Sec. 24.1). The complement ~ of E has the probability (2)
=
P(EC )
1 - peE).
The conditional probability of an event B under the condition that an event A happens is (Sec. 24.3) (3)
p(BIA)
=
peA n B) peA)
[peA)
> 0].
Two events A and B are called independent if the probability of their simultaneous appearance in a trial equals the product of their probabilities, that is, if
(4)
PtA
n
B)
=
P(A)P(B).
With an experiment we associate a random variable X. This is a function defined on S whose values are real numbers; furthermore, X is such that the probability p(X = a) with which X assumes any value a, and the probability pea < X ~ b) with which X assumes any value in an interval a < X ~ b are defined (Sec. 24.5). The probability distribution of X is determined by the distribution function (5)
F(x)
=
P(X
~
x).
In applications there are two important kinds of random variables: those of the discrete type, which appear if we count (defective items, customers in a bank, etc.) and those of the continuous type, which appear if we measure (length, speed, temperature, weight, etc.).
1043
Summary of Chapter 24
A discrete random variable has a probability function f(x) = P(X = x}.
(6)
Its mean 11- and variance a 2 are (Sec. 24.6) (7)
11-
=
L
u2
and
xjf(xj)
L
=
j
(Xj - 11-}2f(xj)
j
where the Xj are the values for which X has a positive probability. Important discrete random variables and distributions are the binomial. Poisson. and hypergeometric distributions discussed in Sec. 24.7. A continuous random variable has a density
(8)
[see (5)j.
f(x} = F'(x}
Its mean and variance are (Sec. 24.6)
(9)
11- =
fO
xf(x) dx
u2 =
and
-=
fC
(x - 11-)2f(x) dt:.
-oc
Very important is the normal distribution (Sec. 24.8), whose density is
I
f(1;) = - - exp
(l0)
u\,f2;
[1 (x -
11_._u
- 2
)2J
and whose distribution function is (Sec. 24.8: Tables A 7. A8 in App. 5) (II)
A two-dimensional random variable (X, Y) occurs if we simultaneously observe two quantities (for example, heightX and weight Yof adults}. Its distribution function is (Sec. 24.9) ( 12)
F(x, y}
=
P(X
~
x, Y
~ y}.
X and Y have the distribution functions (Sec. 24.9) (13)
FI(x}
=
P(X
~
x, Yarbitrary)
and
F 2 (y)
= P(x arbitrary,
Y
~
y)
respectively; their distribution-; are called marginal distributions. If both X and Y are discrete. then (X. Y) has a probability function f(x, y}
=
P(X
=
x, Y
=
y}.
If both X and Yare continuous. then (X. Y) has a density f(x, y).
CHAPTER
Of
2 5
Mathematical Statistics In probability theory we set up mathematical models of processes that are affected by "chance". In mathematical statistics or, briefly. statistics, we check these models against the observable reality. This is called statistical inference. It is done by sampling, that is, by drawing random samples, briefly called samples. These are sets of values from a much larger set of values that could be studied, called the popUlation. An example is 10 diameters of screws drawn from a large lot of screws. Sampling is done in order to see whether a model of the population is accurate enough for practical purposes. If this is the case, the model can be used for predictions, decisions, and actions, for instance, in planning productions, buying equipment, investing in business projects, and so on. Most important methods of statistical inference are estimation of parameters (Secs. 25.2). determination of confidence intervals (Sec. 25.3), and hypothesis testing (Secs. 25.4. 25.7. 25.8). with application to quality control (Sec. 25.5) and acceptance sampling (Sec. 25.6). In the last section (25.9) we give an introduction to regression and correlation analysis, which concern experiments involving two variables.
Prerequisite: Chap. 24. Sections that may be omitted in a shorter course: 25.5, 25.6. 25.8. References, Answers to Problems. a1ld Statistical Tables: App. I Part G, App. 2, App.5.
25.1
Introduction.
Random Sampling
Mathematical statistics consists of methods for designing and evaluating random experiments to obtain information about practical problems. such as exploring the relation between iron content and density of iron ore, the quality of raw material or manufactured products, the efficiency of air-conditioning systems, the performance of certain cars, the effect of advertising, the reactions of consumers to a new product, etc. Random variables occur more frequently in engineering (and elsewhere) than one would think. For example, properties of mass-produced articles (screws, lightbulbs. etc.) always show random variation, due to small (uncontrollable!) differences in raw material or manufacturing processes. Thus the diameter of screws is a random variable X and we have nOlldefecfive screws. with diameter between given tolerance limits, and defective screws, with diameter outside those limits. We can ask for the distribution of X, for the percentage of defective screws to be expected, and for necessary improvements of the production process.
Samples are selected from populations (20 screws from a lot of 1000, 100 of 5000 voters, 8 beavers in a wildlife conservation project) because inspecting the entire population would be too expensive, time-consuming, impossible, or even senseless (think of destructive testing of lightbulbs or dynamite). To obtain meaningful conclusions, samples must be random selections. Each of the 1000 screws must have the same chance of being sampled (of being drawn when we sample), at least approximately. Only then will the sample mean x̄ = (x₁ + ··· + x₂₀)/20 (Sec. 24.1) of a sample of size n = 20 (or any other n) be a good approximation of the population mean μ (Sec. 24.6); and the accuracy of the approximation will generally improve with increasing n, as we shall see. Similarly for other parameters (standard deviation, variance, etc.).

Independent sample values will be obtained in experiments with an infinite sample space S (Sec. 24.2), certainly for the normal distribution. This is also true in sampling with replacement. It is approximately true in drawing small samples from a large finite population (for instance, 5 or 10 of 1000 items). However, if we sample without replacement from a small population, the effect of dependence of sample values may be considerable.

Random numbers help in obtaining samples that are in fact random selections. This is sometimes not easy to accomplish because there are many subtle factors that can bias sampling (by personal interviews, by poorly working machines, by the choice of nontypical observation conditions, etc.). Random numbers can be obtained from a random number generator in Maple, Mathematica, or other systems listed on p. 991. (The numbers are not truly random, as they would be produced in flipping coins or rolling dice, but are calculated by a tricky formula that produces numbers that do have practically all the essential features of true randomness.)
EXAMPLE 1 Random Numbers from a Random Number Generator

To select a sample of size n = 10 from 80 given ball bearings, we number the bearings from 1 to 80. We then let the generator randomly produce 10 of the integers from 1 to 80 and include the bearings with the numbers obtained in our sample, for example,

44 55 53 03 52 61 67 78 39 54

or whatever. Random numbers are also contained in (older) statistical tables. •
Representing and processing data were considered in Sec. 24.1 in connection with frequency distributions. These are the empirical counterparts of probability distributions and helped motivate axioms and properties in probability theory. The new aspect in this chapter is randomness: the data are samples selected randomly from a population. Accordingly, we can immediately make the connection to Sec. 24.1, using stem-and-leaf plots, box plots, and histograms for representing samples graphically. Also, we now call the mean x̄ in (5), Sec. 24.1, the sample mean

(1) $\bar{x} = \dfrac{1}{n}(x_1 + \cdots + x_n).$

We call n the sample size, the variance s² in (6), Sec. 24.1, the sample variance

(2) $s^2 = \dfrac{1}{n-1}\sum_{j=1}^{n} (x_j - \bar{x})^2,$

and its positive square root s the sample standard deviation. x̄, s², and s are called parameters of a sample; they will be needed throughout this chapter.
25.2 Point Estimation of Parameters

Beginning in this section, we shall discuss the most basic practical tasks in statistics and corresponding statistical methods to accomplish them. The first of them is point estimation of parameters, that is, of quantities appearing in distributions, such as p in the binomial distribution and μ and σ in the normal distribution.

A point estimate of a parameter is a number (point on the real line), which is computed from a given sample and serves as an approximation of the unknown exact value of the parameter of the population. An interval estimate is an interval ("confidence interval") obtained from a sample; such estimates will be considered in the next section. Estimation of parameters is of great practical importance in many applications.

As an approximation of the mean μ of a population we may take the mean x̄ of a corresponding sample. This gives the estimate μ̂ = x̄ for μ, that is,
(1) $\hat{\mu} = \bar{x} = \dfrac{1}{n}(x_1 + \cdots + x_n),$

where n is the sample size. Similarly, an estimate σ̂² for the variance of a population is the variance s² of a corresponding sample, that is,

(2) $\hat{\sigma}^2 = s^2 = \dfrac{1}{n-1}\sum_{j=1}^{n} (x_j - \bar{x})^2.$

Clearly, (1) and (2) are estimates of parameters for distributions in which μ or σ² appear explicitly as parameters, such as the normal and Poisson distributions. For the binomial distribution, p = μ/n [see (3) in Sec. 24.7]. From (1) we thus obtain for p the estimate

(3) $\hat{p} = \dfrac{\bar{x}}{n}.$
We mention that (1) is a special case of the so-called method of moments. In this method the parameters to be estimated are expressed in terms of the moments of the distribution (see Sec. 24.6). In the resulting formulas those moments of the distribution are replaced by the corresponding moments of the sample. This gives the estimates. Here the kth moment of a sample x₁, ..., xₙ is

$m_k = \dfrac{1}{n}\sum_{j=1}^{n} x_j^k.$
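As a small illustration (not part of the original text), the estimates (1)-(3) and the kth sample moment can be computed directly; the sample values below are hypothetical.

```python
# A minimal sketch using only the Python standard library;
# statistics.variance uses the divisor n - 1, as in (2).
from statistics import mean, variance

sample = [2, 3, 1, 4, 2, 3]          # hypothetical observations x_1, ..., x_n

mu_hat = mean(sample)                # estimate (1): sample mean
var_hat = variance(sample)           # estimate (2): sample variance

def sample_moment(xs, k):
    # kth moment of the sample, used in the method of moments
    return sum(x**k for x in xs) / len(xs)

print(mu_hat, var_hat, sample_moment(sample, 2))
```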
Maximum Likelihood Method

Another method for obtaining estimates is the so-called maximum likelihood method of R. A. Fisher [Messenger Math. 41 (1912), 155-160]. To explain it we consider a discrete (or continuous) random variable X whose probability function (or density) f(x) depends on a single parameter θ. We take a corresponding sample of n independent values x₁, ..., xₙ. Then in the discrete case the probability that a sample of size n consists precisely of those n values is
(4) $l = f(x_1)\, f(x_2) \cdots f(x_n).$

In the continuous case the probability that the sample consists of values in the small intervals $x_j \le x \le x_j + \Delta x$ (j = 1, 2, ..., n) is

(5) $f(x_1)\Delta x\; f(x_2)\Delta x \cdots f(x_n)\Delta x = l\,(\Delta x)^n.$

Since f(xⱼ) depends on θ, the function l in (5) given by (4) depends on x₁, ..., xₙ and θ. We imagine x₁, ..., xₙ to be given and fixed. Then l is a function of θ, which is called the likelihood function. The basic idea of the maximum likelihood method is quite simple, as follows. We choose that approximation for the unknown value of θ for which l is as large as possible. If l is a differentiable function of θ, a necessary condition for l to have a maximum in an interval (not at the boundary) is
(6) $\dfrac{\partial l}{\partial \theta} = 0.$
(We write a partial derivative, because l depends also on x₁, ..., xₙ.) A solution of (6) depending on x₁, ..., xₙ is called a maximum likelihood estimate for θ. We may replace (6) by
(7) $\dfrac{\partial \ln l}{\partial \theta} = 0,$
because f(xⱼ) > 0, a maximum of l is in general positive, and ln l is a monotone increasing function of l. This often simplifies calculations.

Several Parameters. If the distribution of X involves r parameters θ₁, ..., θᵣ, then instead of (6) we have the r conditions ∂l/∂θ₁ = 0, ..., ∂l/∂θᵣ = 0, and instead of (7) we have
(8) $\dfrac{\partial \ln l}{\partial \theta_1} = 0, \quad \ldots, \quad \dfrac{\partial \ln l}{\partial \theta_r} = 0.$

EXAMPLE 1 Normal Distribution

Find maximum likelihood estimates for θ₁ = μ and θ₂ = σ in the case of the normal distribution.

Solution.
From (1), Sec. 24.8, and (4) we obtain the likelihood function

$l = \left(\dfrac{1}{\sqrt{2\pi}}\right)^{n} \left(\dfrac{1}{\sigma}\right)^{n} e^{-h}, \qquad \text{where} \quad h = \dfrac{1}{2\sigma^2}\sum_{j=1}^{n} (x_j - \mu)^2.$
Taking logarithms, we have

$\ln l = -n \ln \sqrt{2\pi} - n \ln \sigma - h.$
The first equation in (8) is ∂(ln l)/∂μ = 0, written out

$\dfrac{\partial \ln l}{\partial \mu} = -\dfrac{\partial h}{\partial \mu} = \dfrac{1}{\sigma^2}\sum_{j=1}^{n} (x_j - \mu) = 0, \qquad \text{hence} \quad \sum_{j=1}^{n} x_j - n\mu = 0.$

The solution is the desired estimate μ̂ for μ; we find

$\hat{\mu} = \dfrac{1}{n}\sum_{j=1}^{n} x_j = \bar{x}.$
The second equation in (8) is ∂(ln l)/∂σ = 0, written out

$\dfrac{\partial \ln l}{\partial \sigma} = -\dfrac{n}{\sigma} - \dfrac{\partial h}{\partial \sigma} = -\dfrac{n}{\sigma} + \dfrac{1}{\sigma^3}\sum_{j=1}^{n} (x_j - \mu)^2 = 0.$
Replacing μ by μ̂ and solving for σ², we obtain the estimate

$\hat{\sigma}^2 = \dfrac{1}{n}\sum_{j=1}^{n} (x_j - \bar{x})^2,$

which we shall use in Sec. 25.7. Note that this differs from (2). We cannot discuss criteria for the goodness of estimates but want to mention that for small n, formula (2) is preferable. •
1. Find the maximum likelihood estimate for the parameter μ of a normal distribution with known variance σ² = σ₀².
2. Apply the maximum likelihood method to the normal distribution with μ = 0.
3. (Binomial distribution) Derive a maximum likelihood estimate for p.
4. Extend Prob. 3 as follows. Suppose that m times n trials were made and in the first n trials A happened k₁ times, in the second n trials A happened k₂ times, ..., in the mth n trials A happened k_m times. Find a maximum likelihood estimate of p based on this information.
5. Suppose that in Prob. 4 we made 4 times 5 trials and A happened 2, 1, …, 4 times, respectively. Estimate p.
6. Consider X = Number of independent trials until an event A occurs. Show that X has the probability function f(x) = pq^{x−1}, x = 1, 2, …, where p is the probability of A in a single trial and q = 1 − p. Find the maximum likelihood estimate of p corresponding to a sample x₁, ..., xₙ of observed values of X.
7. In Prob. 6 find the maximum likelihood estimate of p corresponding to a single observation x of X.
8. In rolling a die, suppose that we get the first Six in the 7th trial and in doing it again we get it in the 6th trial. Estimate the probability p of getting a Six in rolling that die once.
9. (Poisson distribution) Apply the maximum likelihood method to the Poisson distribution.
10. (Uniform distribution) Show that in the case of the parameters a and b of the uniform distribution (see Sec. 24.6), the maximum likelihood estimate cannot be obtained by equating the first derivative to zero. How can we obtain maximum likelihood estimates in this case?
11. Find the maximum likelihood estimate of θ in the density f(x) = θe^{−θx} if x ≥ 0 and f(x) = 0 if x < 0.
12. In Prob. 11, find the mean μ, substitute it in f(x), find the maximum likelihood estimate of μ, and show that it is identical with the estimate for μ which can be obtained from that for θ in Prob. 11.
13. Compute θ̂ in Prob. 11 from the sample 1.8, 0.4, 0.8, 0.6, 1.4. Graph the sample distribution function F̂(x) and the distribution function F(x) of the random variable, with θ = θ̂, on the same axes. Do they agree reasonably well? (We consider goodness of fit systematically in Sec. 25.7.)
14. Do the same task as in Prob. 13 if the given sample is 0.5, 0.7, 0.1, 1.1, 0.1.
15. CAS EXPERIMENT. Maximum Likelihood Estimates (MLEs). Find experimentally how much MLEs can differ depending on the sample size. Hint: Generate many samples of the same size n, e.g., of the standardized normal distribution, and record x̄ and s². Then increase n.
25.3 Confidence Intervals

Confidence intervals¹ for an unknown parameter θ of some distribution (e.g., θ = μ) are intervals θ₁ ≤ θ ≤ θ₂ that contain θ, not with certainty but with a high probability γ, which we can choose (95% and 99% are popular). Such an interval is calculated from a sample. γ = 95% means probability 1 − γ = 5% = 1/20 of being wrong: one of about 20 such intervals will not contain θ. Instead of writing θ₁ ≤ θ ≤ θ₂, we denote this more distinctly by writing

(1) $\mathrm{CONF}_\gamma\{\theta_1 \le \theta \le \theta_2\}.$

Such a special symbol, CONF, seems worthwhile in order to avoid the misunderstanding that θ must lie between θ₁ and θ₂.

γ is called the confidence level, and θ₁ and θ₂ are called the lower and upper confidence limits. They depend on γ. The larger we choose γ, the smaller is the error probability 1 − γ, but the longer is the confidence interval. If γ → 1, then its length goes to infinity. The choice of γ depends on the kind of application. In taking no umbrella, a 5% chance of getting wet is not tragic. In a medical decision of life or death, a 5% chance of being wrong may be too large and a 1% chance of being wrong (γ = 99%) may be more desirable.

Confidence intervals are more valuable than point estimates (Sec. 25.2). Indeed, we can take the midpoint of (1) as an approximation of θ and half the length of (1) as an "error bound" (not in the strict sense of numerics, but except for an error whose probability we know).

θ₁ and θ₂ in (1) are calculated from a sample x₁, ..., xₙ. These are n observations of a random variable X. Now comes a standard trick. We regard x₁, ..., xₙ as single observations of n random variables X₁, ..., Xₙ (with the same distribution, namely, that of X). Then θ₁ = θ₁(x₁, ..., xₙ) and θ₂ = θ₂(x₁, ..., xₙ) in (1) are observed values of two random variables Θ₁ = Θ₁(X₁, ..., Xₙ) and Θ₂ = Θ₂(X₁, ..., Xₙ). The condition (1) involving γ can now be written
(2) $P(\Theta_1 \le \theta \le \Theta_2) = \gamma.$

Let us see what all this means in concrete practical cases. In each case in this section we shall first state the steps of obtaining a confidence interval in the form of a table, then consider a typical example, and finally justify those steps theoretically.
¹JERZY NEYMAN (1894-1981), American statistician, developed the theory of confidence intervals (Annals of Mathematical Statistics 6 (1935), 111-116).
Confidence Interval for μ of the Normal Distribution with Known σ²

Table 25.1 Determination of a Confidence Interval for the Mean μ of a Normal Distribution with Known Variance σ²

Step 1. Choose a confidence level γ (95%, 99%, or the like).
Step 2. Determine the corresponding c:

γ | 0.90 | 0.95 | 0.99 | 0.999
c | 1.645 | 1.960 | 2.576 | 3.291

Step 3. Compute the mean x̄ of the sample x₁, ..., xₙ.
Step 4. Compute k = cσ/√n. The confidence interval for μ is
(3) $\mathrm{CONF}_\gamma\{\bar{x} - k \le \mu \le \bar{x} + k\}.$

EXAMPLE 1 Confidence Interval for μ of the Normal Distribution with Known σ²

Determine a 95% confidence interval for the mean of a normal distribution with variance σ² = 9, using a sample of n = 100 values with mean x̄ = 5.
Solution. Step 1. γ = 0.95 is required. Step 2. The corresponding c equals 1.960; see Table 25.1. Step 3. x̄ = 5 is given. Step 4. We need k = 1.960 · 3/√100 = 0.588. Hence x̄ − k = 4.412, x̄ + k = 5.588 and the confidence interval is

CONF₀.₉₅{4.412 ≤ μ ≤ 5.588}.

This is sometimes written μ = 5 ± 0.588, but we shall not use this notation, which can be misleading. With your CAS you can determine this interval more directly. Similarly for the other examples in this section. •
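The steps of Table 25.1 are easy to automate. The following sketch reproduces Example 1, assuming SciPy is available to supply the constant c in place of the table.

```python
# Confidence interval (3) for the mean with known variance.
from math import sqrt
from scipy.stats import norm

gamma = 0.95
n, xbar, sigma = 100, 5.0, 3.0               # data of Example 1

c = norm.ppf((1 + gamma) / 2)                # c = 1.960 for gamma = 0.95
k = c * sigma / sqrt(n)                      # k = 0.588
print(xbar - k, xbar + k)                    # CONF_0.95 {4.412 <= mu <= 5.588}
```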
Theory for Table 25.1. The method in Table 25.1 follows from the basic

THEOREM 1 Sum of Independent Normal Random Variables

Let X₁, ..., Xₙ be independent normal random variables each of which has mean μ and variance σ². Then the following holds.

(a) The sum X₁ + ··· + Xₙ is normal with mean nμ and variance nσ².

(b) The following random variable X̄ is normal with mean μ and variance σ²/n.

(4) $\bar{X} = \dfrac{1}{n}(X_1 + \cdots + X_n)$

(c) The following random variable Z is normal with mean 0 and variance 1.
(5) $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
PROOF The statements about the mean and variance in (a) follow from Theorems 1 and 3 in Sec. 24.9. From this and Theorem 2 in Sec. 24.6 we see that X̄ has the mean (1/n)nμ = μ and the variance (1/n)²nσ² = σ²/n. This implies that Z has the mean 0 and variance 1, by Theorem 2(b) in Sec. 24.6. The normality of X₁ + ··· + Xₙ is proved in Ref. [G3] listed in App. 1. This implies the normality of (4) and (5). •
Derivation of (3) in Table 25.1. Sampling from a normal distribution gives independent sample values (see Sec. 25.1), so that Theorem 1 applies. Hence we can choose γ and then determine c such that

(6) $P(-c \le Z \le c) = P\left(-c \le \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le c\right) = \gamma.$

For the value γ = 0.95 we obtain c = 1.960 from Table A8 in App. 5, as used in Example 1. For γ = 0.90, 0.99, 0.999 we get the other values of c listed in Table 25.1. Finally, all we have to do is to convert the inequality in (6) into one for μ and insert observed values obtained from the sample. We multiply −c ≤ Z ≤ c by −1 and then by σ/√n, writing cσ/√n = k (as in Table 25.1),
$P(-c \le Z \le c) = P(c \ge -Z \ge -c) = P\left(c \ge \dfrac{\mu - \bar{X}}{\sigma/\sqrt{n}} \ge -c\right) = P(k \ge \mu - \bar{X} \ge -k) = \gamma.$

Adding X̄ gives

(7) $P(\bar{X} + k \ge \mu \ge \bar{X} - k) = \gamma \qquad \text{or} \qquad P(\bar{X} - k \le \mu \le \bar{X} + k) = \gamma.$

Inserting the observed value x̄ of X̄ gives (3). Here we have regarded x₁, ..., xₙ as single observations of X₁, ..., Xₙ (the standard trick!), so that x₁ + ··· + xₙ is an observed value of X₁ + ··· + Xₙ and x̄ is an observed value of X̄. Note further that (7) is of the form (2) with Θ₁ = X̄ − k and Θ₂ = X̄ + k. •

EXAMPLE 2 Sample Size Needed for a Confidence Interval of Prescribed Length

How large must n be in Example 1 if we want to obtain a 95% confidence interval of length L = 0.4?

Solution. The interval (3) has the length L = 2k = 2cσ/√n. Solving for n, we obtain

$n = \left(\dfrac{2c\sigma}{L}\right)^2.$

In the present case the answer is n = (2 · 1.960 · 3/0.4)² ≈ 865 (rounded up). Figure 525 shows how L decreases as n increases and that for γ = 99% the confidence interval is substantially longer than for γ = 95% (and the same sample size n). •
Fig. 525. Length of the confidence interval (3) (measured in multiples of σ) as a function of the sample size n for γ = 95% and γ = 99%
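A short sketch of the sample-size computation in Example 2 (SciPy assumed available for c):

```python
# Sample size n so that the interval (3) has prescribed length L.
from math import ceil
from scipy.stats import norm

gamma, sigma, L = 0.95, 3.0, 0.4
c = norm.ppf((1 + gamma) / 2)                # 1.960
n = ceil((2 * c * sigma / L) ** 2)           # smallest n giving length <= L
print(n)                                     # 865
```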
Confidence Interval for μ of the Normal Distribution with Unknown σ²

In practice σ² is frequently unknown. Then the method in Table 25.1 does not help and the whole theory changes, although the steps of determining a confidence interval for μ remain quite similar. They are shown in Table 25.2. We see that k differs from that in Table 25.1, namely, the sample standard deviation s has taken the place of the unknown standard deviation σ of the population. And c now depends on the sample size n and must be determined from Table A9 in App. 5 or from your CAS. That table lists values z for given values of the distribution function (Fig. 526)
(8) $F(z) = K_m \int_{-\infty}^{z} \left(1 + \dfrac{u^2}{m}\right)^{-(m+1)/2} du$

of the t-distribution. Here, m (= 1, 2, ...) is a parameter, called the number of degrees of freedom of the distribution (abbreviated d.f.). In the present case, m = n − 1; see Table 25.2. The constant K_m is such that F(∞) = 1. By integration it turns out that K_m = Γ(½m + ½)/[√(mπ) Γ(½m)], where Γ is the gamma function (see (24) in App. A3.1).

Table 25.2 Determination of a Confidence Interval for the Mean μ of a Normal Distribution with Unknown Variance σ²
Step 1. Choose a confidence level γ (95%, 99%, or the like).
Step 2. Determine the solution c of the equation

(9) $F(c) = \tfrac{1}{2}(1 + \gamma)$

from the table of the t-distribution with n − 1 degrees of freedom (Table A9 in App. 5; or use a CAS; n = sample size).
Step 3. Compute the mean x̄ and the variance s² of the sample x₁, ..., xₙ.
Step 4. Compute k = cs/√n. The confidence interval is

(10) $\mathrm{CONF}_\gamma\{\bar{x} - k \le \mu \le \bar{x} + k\}.$
Fig. 526. Distribution functions of the t-distribution with 1 and 3 d.f. and of the standardized normal distribution (steepest curve)

Fig. 527. Densities of the t-distribution with 1 and 3 d.f. and of the standardized normal distribution
Figure 527 compares the curve of the density of the t-distribution with that of the normal distribution. The latter is steeper. This illustrates that Table 25.1 (which uses more information, namely, the known value of σ²) yields shorter confidence intervals than Table 25.2. This is confirmed in Fig. 528, which also gives an idea of the gain by increasing the sample size.

Fig. 528. Ratio of the lengths L' and L of the confidence intervals (10) and (3) with γ = 95% and γ = 99% as a function of the sample size n for equal s and σ
EXAMPLE 3 Confidence Interval for μ of the Normal Distribution with Unknown σ²

Five independent measurements of the point of inflammation (flash point) of Diesel oil (D-2) gave the values (in °F)

144 147 146 142 144.

Assuming normality, determine a 99% confidence interval for the mean.

Solution.
Step 1. γ = 0.99 is required.
Step 2. F(c) = ½(1 + γ) = 0.995, and Table A9 in App. 5 with n − 1 = 4 d.f. gives c = 4.60.
Step 3. x̄ = 144.6, s² = 3.8.
Step 4. k = √3.8 · 4.60/√5 = 4.01. The confidence interval is

CONF₀.₉₉{140.5 ≤ μ ≤ 148.7}.

If the variance σ² were known and equal to the sample variance s², thus σ² = 3.8, then Table 25.1 would give k = cσ/√n = 2.576 · √3.8/√5 = 2.25 and CONF₀.₉₉{142.35 ≤ μ ≤ 146.85}. We see that the present interval is almost twice as long as that obtained from Table 25.1 (with σ² = 3.8). Hence for small samples, the difference is considerable! See also Fig. 528. •
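The following sketch carries out the steps of Table 25.2 for the data of Example 3, assuming SciPy for the t-quantile c in (9):

```python
# Confidence interval (10) for the mean with unknown variance.
from math import sqrt
from statistics import mean, variance
from scipy.stats import t

sample = [144, 147, 146, 142, 144]           # flash points from Example 3
n = len(sample)
gamma = 0.99

c = t.ppf((1 + gamma) / 2, n - 1)            # c = 4.60 for 4 d.f.
k = c * sqrt(variance(sample) / n)           # variance() uses divisor n - 1
print(mean(sample) - k, mean(sample) + k)    # about [140.6, 148.6]
```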
Theory for Table 25.2. For deriving (10) in Table 25.2 we need from Ref. [G3]

THEOREM 2 Student's t-Distribution

Let X₁, ..., Xₙ be independent normal random variables with the same mean μ and the same variance σ². Then the random variable

(11) $T = \dfrac{\bar{X} - \mu}{S/\sqrt{n}}$

has a t-distribution [see (8)] with n − 1 degrees of freedom (d.f.); here X̄ is given by (4) and

(12) $S^2 = \dfrac{1}{n-1}\sum_{j=1}^{n} (X_j - \bar{X})^2.$
Derivation of (10). This is similar to the derivation of (3). We choose a number γ between 0 and 1 and determine a number c from Table A9 in App. 5 with n − 1 d.f. (or from a CAS) such that

(13) $P(-c \le T \le c) = F(c) - F(-c) = \gamma.$

Since the t-distribution is symmetric, we have F(−c) = 1 − F(c), hence

$F(c) - F(-c) = 2F(c) - 1 = \gamma,$

and (13) assumes the form (9). Substituting (11) into (13) and transforming the result as before, we obtain

(14) $P(\bar{X} - K \le \mu \le \bar{X} + K) = \gamma,$

where K = cS/√n. By inserting the observed values x̄ of X̄ and s² of S² into (14) we finally obtain (10). •
Confidence Interval for the Variance σ² of the Normal Distribution

Table 25.3 shows the steps, which are similar to those in Tables 25.1 and 25.2.

Table 25.3 Determination of a Confidence Interval for the Variance σ² of a Normal Distribution, Whose Mean Need Not Be Known
Step 1. Choose a confidence level γ (95%, 99%, or the like).
Step 2. Determine solutions c₁ and c₂ of the equations

(15) $F(c_1) = \tfrac{1}{2}(1 - \gamma), \qquad F(c_2) = \tfrac{1}{2}(1 + \gamma)$

from the table of the chi-square distribution with n − 1 degrees of freedom (Table A10 in App. 5; or use a CAS; n = sample size).
Step 3. Compute (n − 1)s², where s² is the variance of the sample x₁, ..., xₙ.
Step 4. Compute k₁ = (n − 1)s²/c₁ and k₂ = (n − 1)s²/c₂. The confidence interval is

(16) $\mathrm{CONF}_\gamma\{k_2 \le \sigma^2 \le k_1\}.$
EXAMPLE 4 Confidence Interval for the Variance of the Normal Distribution

Determine a 95% confidence interval (16) for the variance, using Table 25.3 and a sample (tensile strength of sheet steel in kg/mm², rounded to integer values)

89 84 87 81 89 86 91 90 78 89 87 99 83 89.
Solution.
Step 1. γ = 0.95 is required.
Step 2. For n − 1 = 13 we find c₁ = 5.01 and c₂ = 24.74.
Step 3. 13s² = 326.9.
Step 4. 13s²/c₁ = 65.25, 13s²/c₂ = 13.21. The confidence interval is

CONF₀.₉₅{13.21 ≤ σ² ≤ 65.25}.

This is rather large, and for obtaining a more precise result, one would need a much larger sample. •
Theory for Table 25.3. In Table 25.1 we used the normal distribution, in Table 25.2 the t-distribution, and now we shall use the χ²-distribution (chi-square distribution), whose distribution function is F(z) = 0 if z < 0 and

$F(z) = C_m \int_{0}^{z} e^{-u/2}\, u^{(m-2)/2}\, du \qquad (z \ge 0)$

(Fig. 529).
Fig. 529. Distribution function of the chi-square distribution with 2, 3, 5 d.f.
The parameter m (= 1, 2, ...) is called the number of degrees of freedom (d.f.), and

$C_m = \dfrac{1}{2^{m/2}\,\Gamma(\tfrac{1}{2}m)}.$

Note that the distribution is not symmetric (see also Fig. 530). For deriving (16) in Table 25.3 we need the following theorem.
THEOREM 3 Chi-Square Distribution

Under the assumptions in Theorem 2 the random variable

(17) $Y = (n - 1)\dfrac{S^2}{\sigma^2}$

with S² given by (12) has a chi-square distribution with n − 1 degrees of freedom.

Proof in Ref. [G3], listed in App. 1.
Fig. 530. Density of the chi-square distribution with 2, 3, 5 d.f.
Derivation of (16). This is similar to the derivation of (3) and (10). We choose a number γ between 0 and 1 and determine c₁ and c₂ from Table A10, App. 5, such that [see (15)]

$P(Y \le c_1) = F(c_1) = \tfrac{1}{2}(1 - \gamma), \qquad P(Y \le c_2) = F(c_2) = \tfrac{1}{2}(1 + \gamma).$

Subtraction yields

$P(c_1 \le Y \le c_2) = F(c_2) - F(c_1) = \gamma.$

Transforming c₁ ≤ Y ≤ c₂ with Y given by (17) into an inequality for σ², we obtain

$\dfrac{n - 1}{c_2}\,S^2 \le \sigma^2 \le \dfrac{n - 1}{c_1}\,S^2.$

By inserting the observed value s² of S² we obtain (16).
•
Confidence Intervals for Parameters of Other Distributions

The methods in Tables 25.1-25.3 for confidence intervals for μ and σ² are designed for the normal distribution. We now show that they can also be applied to other distributions if we use large samples.

We know that if X₁, ..., Xₙ are independent random variables with the same mean μ and the same variance σ², then their sum Yₙ = X₁ + ··· + Xₙ has the following properties.

(A) Yₙ has the mean nμ and the variance nσ² (by Theorems 1 and 3 in Sec. 24.9).
(B) If those variables are normal, then Yₙ is normal (by Theorem 1).

If those random variables are not normal, then (B) is not applicable. However, for large n the random variable Yₙ is still approximately normal. This follows from the central limit theorem, which is one of the most fundamental results in probability theory.
THEOREM 4 Central Limit Theorem

Let X₁, ..., Xₙ, ... be independent random variables that have the same distribution function and therefore the same mean μ and the same variance σ². Let Yₙ = X₁ + ··· + Xₙ. Then the random variable

(18) $Z_n = \dfrac{Y_n - n\mu}{\sigma\sqrt{n}}$

is asymptotically normal with mean 0 and variance 1; that is, the distribution function Fₙ(x) of Zₙ satisfies

$\lim_{n \to \infty} F_n(x) = \Phi(x) = \dfrac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-u^2/2}\, du.$

A proof can be found in Ref. [G3] listed in App. 1.

Hence when applying Tables 25.1-25.3 to a nonnormal distribution, we must use sufficiently large samples. As a rule of thumb, if the sample indicates that the skewness of the distribution (the asymmetry; see Team Project 16(d), Problem Set 24.6) is small, use at least n = 20 for the mean and at least n = 50 for the variance.
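The central limit theorem is easy to observe numerically. The following sketch (not from the book) standardizes sums of uniform random variables as in (18) and checks that about 95% of the values fall within ±1.96, as for a standard normal variable.

```python
# Empirical illustration of Theorem 4 with uniform(0, 1) summands.
import random
from math import sqrt

def z_n(n):
    # uniform(0, 1) has mu = 1/2 and sigma^2 = 1/12
    y = sum(random.random() for _ in range(n))
    return (y - n * 0.5) / (sqrt(1 / 12) * sqrt(n))

values = [z_n(50) for _ in range(10_000)]
inside = sum(-1.96 <= z <= 1.96 for z in values) / len(values)
print(inside)                                # close to 0.95
```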
1-7 MEAN (VARIANCE KNOWN)
1. Find a 95% confidence interval for the mean μ of a normal population with standard deviation 4.00 from the sample 30, 42, 40, 34, 48, 50.
2. Does the interval in Prob. 1 get longer or shorter if we take γ = 0.99 instead of 0.95? By what factor?
3. By what factor does the length of the interval in Prob. 1 change if we double the sample size?
4. Find a 90% confidence interval for the mean μ of a normal population with variance 0.25, using a sample of 100 values with mean 212.3.
5. What sample size would be needed for obtaining a 95% confidence interval (3) of length 2σ? Of length σ?
6. (Use of Fig. 525) Find a 95% confidence interval for a sample of 200 values with mean 120 from a normal distribution with variance 4, using Fig. 525.
7. What sample size is needed to obtain a 99% confidence interval of length 2.0 for the mean of a normal population with variance 25? Use Fig. 525. Check by calculation.

8-12 MEAN (VARIANCE UNKNOWN)
Find a 99% confidence interval for the mean of a normal population from the sample:
8. 425, 420, 425, 435
9. Length of 20 bolts with sample mean 20.2 cm and sample variance 0.04 cm²
10. Knoop hardness of diamond 9500, 9800, 9750, 9200, 9400, 9550
11. Copper content (%) of brass 66, 66, 65, 64, 66, 67, 64, 65, 63, 64
12. Melting point (°C) of aluminum 660, 667, 654, 663, 662
13. Find a 95% confidence interval for the percentage of cars on a certain highway that have poorly adjusted brakes, using a random sample of 500 cars stopped at a roadblock on that highway, 87 of which had poorly adjusted brakes.
14. Find a 99% confidence interval for p in the binomial distribution from a classical result by K. Pearson, who in 24000 trials of tossing a coin obtained 12012 Heads. Do you think that the coin was fair?
15-20 VARIANCE
Find a 95% confidence interval for the variance of a normal population from the sample:
15. A sample of 30 values with variance 0.0007
16. The sample in Prob. 9
17. The sample in Prob. 11
18. Carbon monoxide emission (grams per mile) of a certain type of passenger car (cruising at 55 mph): 17.3, 17.8, 18.0, 17.7, 18.2, 17.4, 17.6, 18.1
19. Mean energy (keV) of delayed neutron group (Group 3, half-life 6.2 sec) for uranium U235 fission: 435, 451, 430, 444, 438
20. Ultimate tensile strength (k psi) of alloy steel (Maraging H) at room temperature: 251, 255, 258, 253, 253, 252, 250, 252, 255, 256
21. If X is normal with mean 27 and variance 16, what distributions do −X, 3X, and 5X − 2 have?
22. If X₁ and X₂ are independent normal random variables
with mean 23 and 4 and variance 3 and 1, respectively, what distribution does 4X₁ − X₂ have? Hint: Use Team Project 14(g) in Sec. 24.8.
23. A machine fills boxes weighing Y lb with X lb of salt, where X and Y are normal with mean 100 lb and 5 lb and standard deviation 1 lb and 0.5 lb, respectively. What percent of filled boxes weighing between 104 lb and 106 lb are to be expected?
24. If the weight X of bags of cement is normally distributed with a mean of 40 kg and a standard deviation of 2 kg, how many bags can a delivery truck carry so that the probability of the total load exceeding 2000 kg will be 5%?
25. CAS EXPERIMENT. Confidence Intervals. Obtain 100 samples of size 10 of the standardized normal distribution. Calculate from them and graph the corresponding 95% confidence intervals for the mean and count how many of them do not contain 0. Does the result support the theory? Repeat the whole experiment, compare and comment.
25.4 Testing of Hypotheses. Decisions
The ideas of confidence intervals and of tests² are the two most important ideas in modern statistics. In a statistical test we make inference from sample to population through testing a hypothesis, resulting from experience or observations, from a theory or a quality requirement, and so on. In many cases the result of a test is used as a basis for a decision, for instance, to buy (or not to buy) a certain model of car, depending on a test of the fuel efficiency (miles/gal) (and other tests, of course); to apply some medication, depending on a test of its effect; to proceed with a marketing strategy, depending on a test of consumer reactions, etc.

Let us explain such a test in terms of a typical example and introduce the corresponding standard notions of statistical testing.

EXAMPLE 1
Test of a Hypothesis. Alternative. Significance Level α

We want to buy 100 coils of a certain kind of wire, provided we can verify the manufacturer's claim that the wire has a breaking limit μ = μ₀ = 200 lb (or more). This is a test of the hypothesis (also called null hypothesis) μ = μ₀ = 200. We shall not buy the wire if the (statistical) test shows that actually μ = μ₁ < μ₀: the wire is weaker, the claim does not hold. μ₁ is called the alternative (or alternative hypothesis) of the test. We shall accept the hypothesis if the test suggests that it is true, except for a small error probability α, called the significance level of the test. Otherwise we reject the hypothesis. Hence α is the probability of rejecting a hypothesis although it is true. The choice of α is up to us; 5% and 1% are popular values.

For the test we need a sample. We randomly select 25 coils of the wire, cut a piece from each coil, and determine the breaking limit experimentally. Suppose that this sample of n = 25 values of the breaking limit has the mean x̄ = 197 lb (somewhat less than the claim!) and the standard deviation s = 6 lb.
²Beginning around 1930, a systematic theory of tests was developed by NEYMAN (see Sec. 25.3) and EGON SHARPE PEARSON (1895-1980), English statistician, the son of Karl Pearson (see the footnote on p. 1066).
At this point we could only speculate whether this difference 197 − 200 = −3 is due to randomness, is a chance effect, or whether it is significant, due to an actually inferior quality of the wire. To continue beyond speculation requires probability theory, as follows.

We assume that the breaking limit is normally distributed. (This assumption could be tested by the method in Sec. 25.7. Or we could remember the central limit theorem (Sec. 25.3) and take a still larger sample.) Then

$T = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}}$

in (11), Sec. 25.3, with μ = μ₀ has a t-distribution with n − 1 degrees of freedom (n − 1 = 24 for our sample). Also x̄ = 197 and s = 6 are observed values of X̄ and S to be used later. We can now choose a significance level, say, α = 5%. From Table A9 in App. 5 or from a CAS we then obtain a critical value c such that P(T ≤ c) = α = 5%. For P(T ≤ c̃) = 1 − α = 95% the table gives c̃ = 1.71, so that c = −c̃ = −1.71 because of the symmetry of the distribution (Fig. 531).

We now reason as follows; this is the crucial idea of the test. If the hypothesis is true, we have a chance of only α (= 5%) that we observe a value t of T (calculated from a sample) that will fall between −∞ and −1.71. Hence if we nevertheless do observe such a t, we assert that the hypothesis cannot be true and we reject it. Then we accept the alternative. If, however, t ≥ c, we accept the hypothesis.

A simple calculation finally gives t = (197 − 200)/(6/√25) = −2.5 as an observed value of T. Since −2.5 < −1.71, we reject the hypothesis (the manufacturer's claim) and accept the alternative μ = μ₁ < 200: the wire seems to be weaker than claimed. •
Fig. 531. t-distribution in Example 1
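The test of Example 1 can be written out as follows, assuming SciPy for the critical value c of the t-distribution:

```python
# Left-sided t-test for the mean, as in Example 1.
from math import sqrt
from scipy.stats import t as t_dist

n, xbar, s = 25, 197.0, 6.0                  # sample data from Example 1
mu0, alpha = 200.0, 0.05

c = -t_dist.ppf(1 - alpha, n - 1)            # left-sided critical value, -1.71
t_obs = (xbar - mu0) / (s / sqrt(n))         # observed value of T, -2.5
print(t_obs < c)                             # True: reject the hypothesis
```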
This example illustrates the steps of a test:

1. Formulate the hypothesis θ = θ₀ to be tested. (θ₀ = μ₀ in the example.)
2. Formulate an alternative θ = θ₁. (θ₁ = μ₁ in the example.)
3. Choose a significance level α (5%, 1%, 0.1%).
4. Use a random variable Θ̂ = g(X₁, ..., Xₙ) whose distribution depends on the hypothesis and on the alternative, and this distribution is known in both cases. Determine a critical value c from the distribution of Θ̂, assuming the hypothesis to be true. (In the example, Θ̂ = T, and c is obtained from P(T ≤ c) = α.)
5. Use a sample x₁, ..., xₙ to determine an observed value θ̂ = g(x₁, ..., xₙ) of Θ̂. (t in the example.)
6. Accept or reject the hypothesis, depending on the size of θ̂ relative to c. (t < c in the example, rejection of the hypothesis.)
Two important facts require further discussion and careful attention. The first is the choice of an alternative. In the example, μ₁ < μ₀, but other applications may require μ₁ > μ₀ or μ₁ ≠ μ₀. The second fact has to do with errors. We know that α (the significance level of the test) is the probability of rejecting a true hypothesis. And we shall discuss the probability β of accepting a false hypothesis.
One-Sided and Two-Sided Alternatives (Fig. 532)

Let θ be an unknown parameter in a distribution, and suppose that we want to test the hypothesis θ = θ₀. Then there are three main kinds of alternatives, namely,

(1) θ > θ₀
(2) θ < θ₀
(3) θ ≠ θ₀.

(1) and (2) are one-sided alternatives, and (3) is a two-sided alternative.
We call rejection region (or critical region) the region such that we reject the hypothesis if the observed value in the test falls in this region. In (1) the critical c lies to the right of θ₀ because so does the alternative. Hence the rejection region extends to the right. This is called a right-sided test. In (2) the critical c lies to the left of θ₀ (as in Example 1), the rejection region extends to the left, and we have a left-sided test (Fig. 532, middle part). These are one-sided tests. In (3) we have two rejection regions. This is called a two-sided test (Fig. 532, lower part).

All three kinds of alternatives occur in practical problems. For example, (1) may arise if θ₀ is the maximum tolerable inaccuracy of a voltmeter or some other instrument. Alternative (2) may occur in testing strength of material, as in Example 1. Finally, θ₀ in (3) may be the diameter of axle-shafts, and shafts that are too thin or too thick are equally undesirable, so that we have to watch for deviations in both directions.
Fig. 532. Test in the case of alternative (1) (upper part of the figure), alternative (2) (middle part), and alternative (3) (lower part)
Errors in Tests

Tests always involve risks of making false decisions:

(I) Rejecting a true hypothesis (Type I error). α = Probability of making a Type I error.
(II) Accepting a false hypothesis (Type II error). β = Probability of making a Type II error.
Clearly, we cannot avoid these errors because no absolutely certain conclusions about populations can be drawn from samples. But we show that there are ways and means of choosing suitable levels of risks, that is, of values α and β. The choice of α depends on the nature of the problem (e.g., a small risk α = 1% is used if it is a matter of life or death).

Let us discuss this systematically for a test of a hypothesis θ = θ₀ against an alternative that is a single number θ₁, for simplicity. We let θ₁ > θ₀, so that we have a right-sided test. For a left-sided or a two-sided test the discussion is quite similar. We choose a critical c > θ₀ (as in the upper part of Fig. 532, by methods discussed below). From a given sample x₁, ..., xₙ we then compute a value

$\hat{\theta} = g(x_1, \ldots, x_n)$

with a suitable g (whose choice will be a main point of our further discussion; for instance, take g = (x₁ + ··· + xₙ)/n in the case in which θ is the mean). If θ̂ > c, we reject the hypothesis. If θ̂ ≤ c, we accept it. Here, the value θ̂ can be regarded as an observed value of the random variable

(4) $\hat{\Theta} = g(X_1, \ldots, X_n)$

because xⱼ may be regarded as an observed value of Xⱼ, j = 1, ..., n. In this test there are two possibilities of making an error, as follows.
e
(5)
a is called the significance level of the test, as mentioned before. Type II Error (see Table 25.4). The hypothesis is false but is accepted because assumes a value ~ c. The probability of making such an error is denoted by {3; thus
e
e
(6) 7] = I - {3 is called the power of the test. Obviously, the power TJ is the probability of avoiding a Type I1 error.
Table 25.4 Type I and Type II Errors in Testing a Hypothesis ()o Against an Alternative () ()J
() =
=
Unknown Truth
'0 Cl)
E. '-' '-'
e = eo
'-'
e = el
e = eo
e = el
True decision
Type II error
P=l-a
P=f3
Type 1 error
True decision P=I-{3
P=a
1062
CHAP. 25
Mathematical Statistics
Formulas (5) and (6) show that both (]' and f3 depend on c, and we would like to choose c so that these probabilities of making errors are as small as possible. But the important Figure 533 shows that these are conflicting requirements because to let (]' decrease we must shift c to the right, but then f3 increases. In practice we first choose (]' (5%, sometimes 1%), then determine c, and finally compute f3. If f3 is large so that the power 1] = 1 - f3 is small, we should repeat the test, choosing a larger sample, for reasons that will appear shortly.
"
Density of e if the hypothesis is true
:
&
Density of if the alternative
/
l;f
,,. . . . -T-.. . . . . . . is true // "
//
I I
" ...
I I I I
" ... "
a-::::----...! eo
........ _
"
el
c
Acceptance region ~ Rejection region (Critical regIOn)
Fig. 533.
Illustration of Type I and II errors in testing a hypothesis e = eo against an alternative e = e, (> eo, right-sided test)
If the alternative is not a single number but is of the form (1}-(3), then f3 becomes a function of e. This function f3( e) is called the operating characteristic (OC) of the test and its curve the OC curve. Clearly, in this case 1] = 1 - f3 also depends on e. This function 1](e) is called the power function of the test. (Examples will follow.) Of course, from a test that leads to the acceptance of a certain hypothesis Bo, it does not follow that this is the only possible hypothesis or the best possible hypothesis. Hence the terms "not reject" or "fail to reject" are perhaps better than the term "accept."
Test for IL of the Normal Distribution with Known u 2 The following example explains the three kinds of hypotheses. E X AMP L E 2
Test for the Mean of the Normal Distribution with Known Variance Let X be a normal random variable with variance a 2 = 9. Using a sample of size hypothesis /L = /Lo = 24 against the three kinds of alternatives. namely. (a)
Solution.
/L> /Lo
(b)
/L < /Lo
/L
(c)
'*
11
= 10 with mcan.Y, test the
/Lo'
We choose the significance level a = 0.05. An estimate of the mean will be obtained from I X = - (Xl
+ ... +
X17)'
11
If the hypothesis is true. X is normal with mean /L = 24 and variance Hence we may obtain the critical value (' from Table A8 in App. 5.
Case (a).
(T
2
Right-Sided Test. We determine e from PIX > e)fL~24 = a
-
P(X ;" C)fL~24 =
(e-24) vo:9
/n = 0.9. see Theorem I, Sec. 25.3.
=
0.05, that is.
= I - a = 0.95.
Table A~ in App. 5 _gives (e - 24)/VO.9 = 1.645, and e = 25.56. which is grcater than /Lo, as in the upper part of FIg. 532. If x ;" 25.56, the hypothesis is accepted. If.ct' > 25.56, it is rejected. The power function of the test is (Fig. 534)
SEC. 25.4
Testing of Hypotheses.
Decisions
1063
0.8 0.6 O.Ll
0.2
20
Fig. 534.
> 25.56)" = I -
(7)
= I -
P(X ~ 25.56)"
25.56 - fL) vo.9 = I -
Left-Sided Test. The critical value c is obtained from the equation
-
PiX ~
C),,-24
=
(cYo.9 -
24)
Table A8 in App. 5 yields c = 24 - 1.56 = 22.44. If x reject it. Ihe power function of the test is (8)
Case (c).
fL
Power function 1)(fL) in Example 2, case (a) (dashed) and case (c)
7}(fL) = P(X
Case (b).
28
fLO
7}(fL)
-
= P(X
~
22.44)" = <1> ~
~
= a = 0.05.
22.44. we accept the hypothesis. If x < 22.44. we
( 22.44 - fL ) ~ = <1>(23.65 VO.9
Two-Sided Test. Since the normal distribution is symmetric. we choose + k. and detennine k from
cl
and
c2
equidistant from
fL = 24. say. cl = 24 - k and c2 = 24
P(24 - k
~ X ~ 24 + k), ,~24 =
<1>(
k
VO.9
) -
k
VO.9
)
= I - a = 0.95.
Table A8 in App. 5 gives k/Y0.9 = 1.960. hence k = 1.86. This gives the values cl = 24 - 1.86 = 22.14 and c2 = 24 + 1.86 = 25.86. If x is not smaller than cl and not greater than c2. we accept the hypothesis. Otherwise we reject it. The power function of the test is (Fig. 534) 7}(fL}
(9)
=
P(X
< 22.14)" +
P(X
=1+<1> ( = I
+
> 25.86)" =
22.14 - fL)
Yo.9
P(X
-
< 22.14)" + I -
P(X ~ 25.86)"
(25.86 - fL) VO.9
<1>(23.34 - 1.05fL) -
Consequently. the operating characteristic f3(.fL) = I - 7}(fL) (see before i is (Fig. 535)
If we take a larger sample. ~ay. of size 11 = 100 (instead of 10). then a 2 /1l = 0.09 (instead of 0.9) and the critical values are Cl = 23.41 and c2 = 24.59, as can be readily verified. Ihen the operating characteristic of the test is f3(fL) =
=
fL) _ <1>( 23.41
. fL) V'0.09
1064
CHAP. 25
Mathematical Statistics
Figure 535 shows that the corresponding OC curve is steeper than that for 11 = 10. l11is means that the increase of 11 has led to an improvement of the test. In any practical case, 11 is chosen as small as possible but SO large that the test brings out deviations bet",een J-L and J.Lo that are of practical interest. For instance. if deviations of ±2 units are of interest. we see from Fig. 535 that 11 = 10 is much too small because when J-L = 24 - 2 = 22 or J-L = 24 + 2 = 26 f3 is almost 50%. On the other hand, we see that" = I 00 is sufticient for that purpose. • (3(p.)
1.0 \ \
0.8 0.6 0.4
n = 10
\
0.2
n= 100 ,
....... 6-
28 p.
26
P.o
Curves of the operating characteristic (OC curves) in Example 2, case (e), for two different sample sizes n
Fig. 535.
Test for μ When σ² Is Unknown, and for σ²

EXAMPLE 3 Test for the Mean of the Normal Distribution with Unknown Variance

The tensile strength of a sample of n = 16 manila ropes (diameter 3 in.) was measured. The sample mean was x̄ = 4482 kg, and the sample standard deviation was s = 115 kg (N. C. Wiley, 41st Annual Meeting of the American Society for Testing Materials). Assuming that the tensile strength is a normal random variable, test the hypothesis μ₀ = 4500 kg against the alternative μ₁ = 4400 kg. Here μ₀ may be a value given by the manufacturer, while μ₁ may result from previous experience.

Solution. We choose the significance level α = 5%. If the hypothesis is true, it follows from Theorem 2 in Sec. 25.3 that the random variable

$T = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}} = \dfrac{\bar{X} - 4500}{S/4}$

has a t-distribution with n − 1 = 15 d.f. The test is left-sided. The critical value c is obtained from P(T < c)_{μ₀} = α = 0.05. Table A9 in App. 5 gives c = −1.75. As an observed value of T we obtain from the sample t = (4482 − 4500)/(115/4) = −0.626. We see that t > c and accept the hypothesis. For obtaining numeric values of the power of the test, we would need tables called noncentral Student t-tables; we shall not discuss this question here. •
EXAMPLE 4 Test for the Variance of the Normal Distribution

Using a sample of size n = 15 and sample variance s² = 13 from a normal population, test the hypothesis σ² = σ₀² = 10 against the alternative σ² = σ₁² = 20.

Solution. We choose the significance level α = 5%. If the hypothesis is true, then

$Y = (n - 1)\dfrac{S^2}{\sigma_0^2} = \dfrac{14 S^2}{10} = 1.4 S^2$

has a chi-square distribution with n − 1 = 14 d.f. by Theorem 3, Sec. 25.3. From

P(Y > c) = α = 0.05, that is, P(Y ≤ c) = 0.95,

and Table A10 in App. 5 with 14 degrees of freedom we obtain c = 23.68. This is the critical value of Y. Hence
to S² = σ₀²Y/(n − 1) = 0.714Y there corresponds the critical value c* = 0.714 · 23.68 = 16.91. Since s² = 13 < c*, we accept the hypothesis. If the alternative is true, the random variable Y₁ = 14S²/σ₁² = 0.7S² has a chi-square distribution with 14 d.f. Hence our test has the power

$\eta = P(S^2 > c^*)_{\sigma^2 = \sigma_1^2} = P(Y_1 > 0.7\,c^*) = P(Y_1 > 11.84).$

From a more extensive table of the chi-square distribution (e.g., in Ref. [G3] or [G8]) or from your CAS, you see that η ≈ 62%. Hence the Type II risk is very large, namely, 38%. To make this risk smaller, we would have to increase the sample size. •
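A sketch of the variance test and its power in Example 4, assuming SciPy for the chi-square quantiles and tail probabilities:

```python
# Right-sided chi-square test for the variance, as in Example 4.
from scipy.stats import chi2

n, s2 = 15, 13.0
sigma0_sq, sigma1_sq, alpha = 10.0, 20.0, 0.05

c = chi2.ppf(1 - alpha, n - 1)               # 23.68
c_star = sigma0_sq * c / (n - 1)             # 16.91, critical value for s^2
print(s2 > c_star)                           # False: accept the hypothesis

power = chi2.sf((n - 1) * c_star / sigma1_sq, n - 1)
print(power)                                 # about 0.62, so beta is about 38%
```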
Comparison of Means and Variances

EXAMPLE 5
Comparison of the Means of Two Normal Distributions

Using a sample x₁, ..., x_{n₁} from a normal distribution with unknown mean μₓ and a sample y₁, ..., y_{n₂} from another normal distribution with unknown mean μ_y, we want to test the hypothesis that the means are equal,³ μₓ = μ_y, against an alternative, say, μₓ > μ_y. The variances need not be known but are assumed to be equal. Two cases of comparing means are of practical importance:
Case A. The samples have the same size. Furthermore, each value of the first sample corresponds to precisely one value of the other, because corresponding values result from the same person or thing (paired comparison); for example, two measurements of the same thing by two different methods or two measurements from the two eyes of the same person. More generally, they may result from pairs of similar individuals or things, for example, identical twins, pairs of used front tires from the same car, etc. Then we should form the differences of corresponding values and test the hypothesis that the population corresponding to the differences has mean 0, using the method in Example 3. If we have a choice, this method is better than the following.

Case B. The two samples are independent and not necessarily of the same size. Then we may proceed as follows. Suppose that the alternative is μₓ > μ_y. We choose a significance level α. Then we compute the sample means x̄ and ȳ as well as (n₁ − 1)sₓ² and (n₂ − 1)s_y², where sₓ² and s_y² are the sample variances. Using Table A9 in App. 5 with n₁ + n₂ − 2 degrees of freedom, we now determine c from
(10) $P(T \le c) = 1 - \alpha.$
We finally compute

(11) $t_0 = \sqrt{\dfrac{n_1 n_2 (n_1 + n_2 - 2)}{n_1 + n_2}} \cdot \dfrac{\bar{x} - \bar{y}}{\sqrt{(n_1 - 1)s_x^2 + (n_2 - 1)s_y^2}}.$
It can be shown that this is an observed value of a random variable that has a t-distribution with n₁ + n₂ − 2 degrees of freedom, provided the hypothesis is true. If t₀ ≤ c, the hypothesis is accepted. If t₀ > c, it is rejected. If the alternative is μₓ ≠ μ_y, then (10) must be replaced by

(10*) $P(T \le c_1) = 0.5\alpha, \qquad P(T \le c_2) = 1 - 0.5\alpha.$
Note that for samples of equal size n₁ = n₂ = n, formula (11) reduces to

(12) $t_0 = \sqrt{n}\; \dfrac{\bar{x} - \bar{y}}{\sqrt{s_x^2 + s_y^2}}.$
³This assumption of equality of variances can be tested, as shown in the next example. If the test shows that they differ significantly, choose two samples of the same size n₁ = n₂ = n (not too small, > 30, say), use the test in Example 2 together with the fact that (12) is an observed value of an approximately standardized normal random variable.
To illustrate the computations, let us consider the two samples (x₁, ..., x_{n₁}) and (y₁, ..., y_{n₂}) given by

105 108 86 103 103 107 124 105
and
89 92 84 97 103 107 111 97,
showing the relative output of tin plate workers under two different working conditions (J. J. B. Worth, Journal of Industrial Engineering). Using these samples, test the hypothesis μₓ = μ_y against the alternative μₓ ≠ μ_y.
Solution. We find

x̄ = 105.125,  ȳ = 97.500,  sₓ² = 106.125,  s_y² = 84.000.
We choose the significance level α = 5%. From (10*) with 0.5α = 2.5%, 1 − 0.5α = 97.5% and Table A9 in App. 5 with 14 degrees of freedom we obtain c₁ = −2.14 and c₂ = 2.14. Formula (12) with n = 8 gives the value

$t_0 = \sqrt{8} \cdot 7.625/\sqrt{190.125} = 1.56.$
Since c₁ ≤ t₀ ≤ c₂, we accept the hypothesis μₓ = μ_y that under both conditions the mean output is the same.

Case A applies to the example because the two first sample values correspond to a certain type of work, the next two were obtained in another kind of work, etc. So we may use the differences

16 16 2 6 0 0 13 8

of corresponding sample values and the method in Example 3 to test the hypothesis μ = 0, where μ is the mean of the population corresponding to the differences. As a logical alternative we take μ ≠ 0. The sample mean is d̄ = 7.625, and the sample variance is s² = 45.696. Hence

$t = \sqrt{8}\,(7.625 - 0)/\sqrt{45.696} = 3.19.$
From P(T ≤ c₁) = 2.5%, P(T ≤ c₂) = 97.5% and Table A9 in App. 5 with n − 1 = 7 degrees of freedom we obtain c₁ = −2.36, c₂ = 2.36 and reject the hypothesis because t = 3.19 does not lie between c₁ and c₂. Hence our present test, in which we used more information (but the same samples), shows that the difference in output is significant. •
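Both tests of Example 5 (the independent-samples statistic (12) and the paired comparison of Case A) can be reproduced as follows, assuming SciPy for the t-quantiles:

```python
# Two-sample and paired t-tests on the data of Example 5.
from math import sqrt
from statistics import mean, variance
from scipy.stats import t as t_dist

x = [105, 108, 86, 103, 103, 107, 124, 105]
y = [89, 92, 84, 97, 103, 107, 111, 97]
n, alpha = len(x), 0.05

# Case B: independent-samples statistic (12) for equal sample sizes
t0 = sqrt(n) * (mean(x) - mean(y)) / sqrt(variance(x) + variance(y))
c2 = t_dist.ppf(1 - alpha / 2, 2 * n - 2)    # 2.14; c1 = -c2
print(t0, abs(t0) <= c2)                     # 1.56, True: accept

# Case A: paired comparison via the differences
d = [xi - yi for xi, yi in zip(x, y)]
t_paired = sqrt(n) * mean(d) / sqrt(variance(d))
c = t_dist.ppf(1 - alpha / 2, n - 1)         # 2.36
print(t_paired, abs(t_paired) > c)           # 3.19, True: reject
```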
EXAMPLE 6 Comparison of the Variances of Two Normal Distributions

Using the two samples in the last example, test the hypothesis σₓ² = σ_y²; assume that the corresponding populations are normal and the nature of the experiment suggests the alternative σₓ² > σ_y².
Solution. We find

$v_0 = \dfrac{s_x^2}{s_y^2} = \dfrac{106.125}{84.000} = 1.26.$

We choose the significance level α = 5%. Using P(V ≤ c) = 1 − α = 95% and Table A11 in App. 5, with (n₁ − 1, n₂ − 1) = (7, 7) degrees of freedom, we determine c = 3.79. Since v₀ ≤ c, we accept the hypothesis. If v₀ > c, we would reject it. This test is justified by the fact that v₀ is an observed value of a random variable that has a so-called F-distribution with (n₁ − 1, n₂ − 1) degrees of freedom, provided the hypothesis is true. (Proof in Ref. [G3] listed in App. 1.) •

The F-distribution with (m, n) degrees of freedom was introduced by R. A. Fisher⁴ and has the distribution function F(z) = 0 if z < 0 and

(13) $F(z) = K_{mn} \int_{0}^{z} t^{(m-2)/2} (mt + n)^{-(m+n)/2}\, dt \qquad (z \ge 0).$
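A sketch of the F-test in Example 6, assuming SciPy for the critical value of the F-distribution:

```python
# Right-sided F-test for equality of two variances, as in Example 6.
from statistics import variance
from scipy.stats import f as f_dist

x = [105, 108, 86, 103, 103, 107, 124, 105]
y = [89, 92, 84, 97, 103, 107, 111, 97]
alpha = 0.05

v0 = variance(x) / variance(y)               # 1.26
c = f_dist.ppf(1 - alpha, len(x) - 1, len(y) - 1)   # 3.79
print(v0, v0 <= c)                           # True: accept the hypothesis
```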
⁴After the pioneering work of the English statistician and biologist KARL PEARSON (1857-1936), the founder of the English school of statistics, and WILLIAM SEALY GOSSET (1876-1937), who discovered the t-distribution (and published under the name "Student"), the English statistician Sir RONALD AYLMER FISHER (1890-1962), professor of eugenics in London (1933-1943) and professor of genetics in Cambridge, England (1943-1957) and Adelaide, Australia (1957-1962), had great influence on the further development of modern statistics.
This long section contained the basic ideas and concepts of testing, along with typical applications, and you may perhaps want to review it quickly before going on, because the next sections concern an adaptation of these ideas to tasks of great practical importance and the resulting tests in connection with quality control, acceptance (or rejection) of goods produced, and so on.
1. Test μ = 0 against μ > 0, assuming normality and using the sample 1, −1, 1, 3, −8, 6, 0 (deviations of the azimuth [multiples of 0.01 radian] in some revolution of a satellite). Choose α = 5%.
2. In one of his classical experiments Buffon obtained 2048 heads in tossing a coin 4040 times. Was the coin fair?
3. Do the same test as in Prob. 2, using a result by K. Pearson, who obtained 6019 heads in 12000 trials.
4. Assuming normality and known variance σ² = 4, test the hypothesis μ = 30.0 against the alternative (a) μ = 28.5, (b) μ = 30.7, using a sample of size 10 with mean x̄ = 28.5 and choosing α = 5%.
5. How does the result in Prob. 4(a) change if we use a smaller sample, say, of size 4, the other data (x̄ = 28.5, α = 5%, etc.) remaining as before?
6. Determine the power of the test in Prob. 4(a).
7. What is the rejection region in Prob. 4 in the case of a two-sided test with α = 5%?
8. Using the sample 0.80, 0.81, 0.81, 0.82, 0.81, 0.82, 0.80, 0.82, 0.81, 0.81 (length of nails in inches), test the hypothesis μ = 0.80 in. (the length indicated on the box) against the alternative μ ≠ 0.80 in. (Assume normality, choose α = 5%.)
9. A firm sells oil in cans containing 1000 g oil per can and is interested to know whether the mean weight differs significantly from 1000 g at the 5% level, in which case the filling machine has to be adjusted. Set up a hypothesis and an alternative and perform the test, assuming normality and using a sample of 20 fillings with mean 996 g and standard deviation 5 g.
10. If a sample of 50 tires of a certain kind has a mean life of 32 000 mi and a standard deviation of 4000 mi, can the manufacturer claim that the true mean life of such tires is greater than 30 000 mi? Set up and test a corresponding hypothesis at a 5% level, assuming normality.
11. If simultaneous measurements of electric voltage by two different types of voltmeter yield the differences (in volts) 0.8, 0.2, −0.3, 0.1, 0.0, 0.5, 0.7, 0.2, can we assert at the 5% level that there is no significant difference in the calibration of the two types of instruments? (Assume normality.)
12. If a standard medication cures about 70% of patients with a certain disease and a new medication cured 148 of the first 200 patients on whom it was tried, can we conclude that the new medication is better? (Choose α = 5%.)
13. Suppose that in the past the standard deviation of weights of certain 25.0-oz packages filled by a machine was 0.4 oz. Test the hypothesis H₀: σ = 0.4 against the alternative H₁: σ > 0.4 (an undesirable increase), using a sample of 10 packages with standard deviation 0.507 and assuming normality. (Choose α = 5%.)
14. Suppose that in operating battery-powered electrical equipment, it is less expensive to replace all batteries at fixed intervals than to replace each battery individually when it breaks down, provided the standard deviation of the lifetime is less than a certain limit, say, less than 5 hours. Set up and apply a suitable test, using a sample of 28 values of lifetimes with standard deviation s = 3.5 hours and assuming normality; choose α = 5%.
15. Brand A gasoline was used in 9 automobiles of the same model under identical conditions. The corresponding sample of 9 values (miles per gallon) had mean 20.2 and standard deviation 0.5. Under the same conditions, high-power brand B gasoline gave a sample of 10 values with mean 21.8 and standard deviation 0.6. Is the mileage of B significantly better than that of A? (Test at the 5% level; assume normality.)
16. The two samples 70, 80, 30, 70, 60, 80 and 140, 120, 130, 120, 120, 130, 120 are values of the differences of temperatures (°C) of iron at two stages of casting, taken from two different crucibles. Is the variance of the first population larger than that of the second? (Assume normality. Choose α = 5%.)
17. Using samples of sizes 10 and 16 with variances sₓ² = 50 and s_y² = 30 and assuming normality of the corresponding populations, test the hypothesis H₀: σₓ² = σ_y² against the alternative σₓ² > σ_y². Choose α = 5%.
18. Assuming normality and equal variance and using independent samples with n₁ = 9, x̄ = 12, sₓ = 2, n₂ = 9, ȳ = 15, s_y = 2, test H₀: μₓ = μ_y against μₓ ≠ μ_y; choose α = 5%.
19. Show that for a normal distribution the two types of errors in a test of a hypothesis H₀: μ = μ₀ against an alternative H₁: μ = μ₁ can be made as small as one pleases (not zero) by taking the sample sufficiently large.
20. CAS EXPERIMENT. Tests of Means and Variances. (a) Obtain 100 samples of size 10 each from the normal distribution with mean 100 and variance 25. For each sample test the hypothesis μ₀ = 100 against the alternative μ₁ > 100 at the level of α = 10%. Record the number of rejections of the hypothesis. Do the whole experiment once more and compare. (b) Set up a similar experiment for the variance of a normal distribution and perform it 100 times.
Quality Control The ideas on testing can be adapted and extended in various ways to serve basic practical needs in engineering and other fields. We show this in the remaining sections for some of the most important tash; solvahle by statistical methods. As a first such area of problems, we discuss industrial quality control, a highly successful method used in various industries. No production process is so perfect that all the products are completely alike. There is always a small variation that is caused by a great number of small. uncontrollable factors and must therefore be regarded as a chance variation. It is important to make sure that the products have required values (for example, length. strength, or whatever property may be essential in a particular case). For this purpose one makes a test of the hypothesis that the products have the required property. say. fL = fLo, where fLo is a required value. If this is done after an entire lot has been produced (for example, a lot of 100 000 screws), the test will tell us how good or how bad the products are, but it it obviously too late to alter undesirable results. It is much better to test during the production run. This is done at regular intervals of time (for example, every hour or half-hour) and is called quality control. Each time a sample of the same size is taken, in practice 3 to 10 times. If the hypothesis is rejected. we stop the production and look for the cause of the trouble. If we stop the production process even though it is progressing properly, we make a Type I error. If we do not stop the process even though something is not in order. we make a Type II error (see Sec. 25.4). The result of each test is marked in graphical form on what is called a control chart. This was proposed by W. A. Shew hart in 1924 and makes quality cuntrol particularly effective.
Control Chart for the Mean An illustration and example of a control chart is given in the upper part of Fig. 536. This control chart for the mean shows the lower control limit LCL, the center control line CL, and the upper control limit UCL. The two control limits correspond to the critical values Cl and C 2 in case (c) of Example 2 in Sec. 25.4. As soon as a silmple mean falls outside the range between the control limits, we reject the hypothesis and assert that the production process is "out of control"; that is, we assert that there has been a shift in process level. Action is called for whenever a point exceeds the limits. If we choose control limits that are too loose, we shall not detect process shifts. On the other hand, if we choose control limits that are too tight, we shall be unable to run the process because of frequent searches for nonexistent trouble. The usual significance level
SEC 25.5
Quality Control
1069
is Q' = 1%. From Theorem I in Sec. 25.3 and Table A8 in App. 5 we see that in the case of the normal distribution the corresponding control limits for the mean are
(1)
0"
LCL = lLo - 2.58 ~!
UCL
'
=
lLo
Vll
+
0"
2.58 ---;= . \'11
Here 0" is assumed to be known. If 0" is unknown, we may compute the standard deviations of the first 20 or 30 samples and take their arithmetic mean as an approximation of 0". The broken line connecting the means in Fig. 536 is merely to display the results. Additional, more subtle controls are often used in industry. For instance, one observes the motions of the sample means above and below the centerline, which should happen frequently. Accordingly, long runs (conventionally of length 7 or more) of means all above (or all below) the centerline could indicate trouble. 4.20
~
I
I I
0.5%
ue'
4.15
I c
~ :::;;:
4.10
/
~
,Y
II
1\
CL
f"
'f I I
l
99%
lel_{
4.05
\
0.5%
/
[
)
4.00 Sample no.
10
5
0.04 0.0365
I
I I .!
I
1%
II
UCL-
/ 0.03
I
c
a
:;::;
'"
.;;
1
W
"0 "Cl
ro "Cl
0.02
0.01
Sample no.
't.\
1\
I 1\II
c
'" ill
\
I
\
\
99 %
\
1/
o 5
10
Control charts for the mean (upper part of figure) and the standard deviation in the case of the samples on p. 1070
Fig. 536.
1070
CHAP. 25
Mathematical Statistics
Table 25.5 Twelve Samples of Five Values Each (Diameter of Small Cylinders, Measured in Millimeters)
Sample Number
-
Sample Values
I
2 3 4 5 6 7 8 9
10 II
12
x
s
R
4.06 4.10 4.06 4.06 4.08
4.08 4.10 4.06 4.08 4.10
4.08 4.12 4.08 4.08 4.12
4.08 4.12 4.10 4.10 4.12
4.10 4.12 4.12 4.12 4.12
4.080 4.112 4.084 4.088 4.108
0.014 0.011 0.026 0.023 0.018
0.04 0.02 0.06 0.06 0.04
-1-.08 4.06 4.08 4.06 4.06
4.10 4.08 4.08 4.08 4.08
4.10 4.08 4.10 4.10 4.10
4.10 4.10 4.10 4.12 4.12
4.12 4.12 4.12 4.14 4.16
4.100 4.088 4.096 4.100 4.104
0.014 0.023 0.017 0.032 0.038
0.04 0.06 0.04 0.08 0.10
4.12 4.14
4.14 4.14
4.14 4.16
4.14 4.16
4.16 4.16
4.140 4.152
0.014 0.011
0.04 0.02
Control Chart for the Variance In addition to the mean, one often controls the variance. the standard deviation, or the range. To set up a control chart for the variance in the case of a normal distribution, we may employ the method in Example 4 of Sec. 25.4 for detennining control limits. It is customary to use only one control limit. namely. an upper control limit. Now from Example 4 of Sec. 25.4 we have S2 = uo2 Y/(n - I), where because of our normality assumption the random variable Y hao.; a chi-square distribution with 11 - 1 degrees of freedom. Hence the desired control limit is
(2)
UCL
=
n-I
where (' is obtained from the equation P(Y> c) = a,
that is,
P(Y
~
c)
=
I - a
and the table of the chi-square distribution (Table AIO in App. 5) with 11 - 1 degrees of freedom (or from your CAS); here a (51k or I st. say) is the probability that in a properly running process an observed value S2 of S2 is greater than the upper control limit. If we wanted a control chart for the variance with both an upper control limit UCL and a lower control limit LCL. these Iimi[s would be
u 2c1 LCL= _ _
(3) where
(4)
1/-1 Cl
and
C2
and
UCL=
are obtained from Table A I 0 with and
11 -
P(Y
I d.f. and the equations ~ C2)
=
a I - -. 2
SEC. 25.5
Quality Control
1071
Control Chart for the Standard Deviation To set up a control chart for the standard deviation, we need an upper control limit
uVc
UCL=
(5)
Vn"=l
obtained from (2). For example, in Table 25.5 we have 11 = 5. Assuming that the corresponding population is normal with standard deviation u 0.02 and choosing a = I %, we obtain from the equation P(Y
~
c)
= I -
a
= 99%
and Table A lOin App. 5 with 4 degrees of freedom the critical value c (5) the corresponding value
UCL=
0.02vrns
V4
= 13.28 and from
= 0.0365,
which is shown in the lower part of Fig. 536. A control chart for the standard deviation with both an upper and a lower control limit is obtained from (3).
Control Chart for the Range Instead of the variance or standard deviation, one often controls the range R (= largest sample value minus smallest sample value). It can be shown that in the case of the normal distribution, the standard deviation u is proportional to the expectation of the random variable R* for which R is an observed value, say, u = AnE(R*), where the factor of proportionality An depends on the sample size 11 and has the values II
An
=
uIE(R*) 11
2
3
4
5
6
7
8
9
10
0.89
0.59
0.49
0.43
0.40
0.37
0.35
0.34
0.32
12
14
16
18
20
30
40
50
0.31
0.29
0.28
0.28
0.27
0.25
0.23
0.22
Since R depends on two sample values only, it gives less information about a sample than s does. Clearly, the larger the sample size 11 is, the more information we lose in using R instead of s. A practical rule is to use s when 11 is larger than 10.
I. Suppose a machine for filling cans with lubricating oil
is set so that it will generate fillings which form a normal population with mean I gal and standard deviation 0.03 gal. Set up a conrrol chart of the type shown in Fig. 536 for controlling the mean (that is. find LCL and VCL). a~suming that the sample size is 6. 2. (Three-sigma control chart) Show that in Prob. I, the
requirement of the significance level a = 0.3% leads to LCL = J.L - 3ulY;; and VCL = J.L + 3ulY;;, and find the corresponding numeric values. 3. What sample size should we choose in Prob. 1 if we want LCL and VCL somewhat closer together. say. VCL - LCL = 0.05. without changing the significance level'!
CHAP. 25
1072
Mathematical Statistics
4. How does the meaning of the control limits ( I ) change if we apply a control chart with these limits in the case of a population that is not normal?
5. How should we change the sample size in controlling the mean of a normal population if we want the difference UCL - LCL to decrease to half its original value?
6. What LCL and UCL should we use instead of (1) if instead of x we use the sum Xl + ... + xn of the sample values? Detennine these limits in the case of Fig. 536.
7. Ten samples of size 2 were taken from a production lot of bolts. The values (length in mm) are as shown. Assuming that the population is normal with mean 27.5 and variance 0.024 and using (I ), set up a control chart for the mean and graph the sample means on the chart. Sample
Length
4
3
2
No.
5
7
6
8
9
10
27.4 27.4 '17.5 27.3 27.9 27.6 27.6 27.8 27.5 27.3
14. How would progressive tool wear in an automatic lathe operation be indicated by a control chart of the mean? Answer the same question for a sudden change in the position of the tool In that operation.
15. (Number of defects per unit) A so-called c-chart or defects-per-unit chart is used for the control of the number X of defects per unit (for instance, the number of defects per 10 meters of paper. the number of missing rivets in an airplane wing, etc.) (a) Set up formulas for CL and LCL, UCL corresponding to J.L ::':: 3u.
assuming that X has a Poisson distribution. (b) Compute CL, LCL, and UCL in a control process ofthe number of imperfections in sheet glass; assume that this number is 2.5 per sheet on the average when the process is under control.
16. (Attribute control charts). Twenty samples of size 100 were taken from a production of containers. The numbers of defectives (leaking containers) in those samples (in the order observed) were
27.6 27.4 '17.7 27.4 27.5 27.5 27.4 27.3 27.4 27.7
8. Graph the means of the following 10 samples (thickness of washers. coded values) on a control chart for means, assuming that the population is normal with mean 5 and standard deviation 1.55. Time
8:008:309:00 9:30 10:00 10:30 11:00 11:3012:00 12:30
13 Sample 4 Values
18 .4
3 6 6
5 2 5
8
6
7 5 4 4
7 3 6 5
4 4 3 6
5 6 4 6
6 4 6 4
5 5 6
4
5 2 5 3
9. Graph the ranges of the samples in Prob. 8 on a control chart for ranges. 10. What effect on UCL - LCL does it have if we double the sample size? Ifwe switch from £l' = 1% to £l' = 5%? 11. Since the presence of a point outside control limits for the mean indicates trouble ("the process is out of contro]"), how often would we be making the mistake oflooking for nonexistent trouble if we used (a) I-sigma limits. (b) 2-sigma limits? (Assume normality.) 12. Graph An = uIE(R*) as a function of 11. Why is monotone decreasing function of 11?
An a
13. (Number of defectives) Find formulas for the UCL, CL, and LCL (corresponding to 3u-limits) in the case of a control charI for the number of defectives, assuming that in a state of statistical control the fraction of defectives is p.
376
45497056134
902
12
8.
From previous experience it was known that the average fraction defective is p = 5% provided that the process of production is mnning properly. Using the binomial distribution. ~et up afmction defectil'e chart (also called a p-chart). that is. choose the LCL = 0 and determine the UCL for the fraction defective (in percent) by the use of 3-sigma limits. where u 2 is the variance of the random variable
X
=
Fractioll defectil'e ill a sample of size 100.
Is the process under control? 17. CAS PROJECT. Control Charts. (a) Obtain 100 samples of 4 values each from the normal distribution with mean 8.0 and variance 0.16 and their means. variances, and ranges. (b) Use these samples for making up a control chart for the mean. (c) Use them on a control chart for the standard deviation. (d) Make up a control chart for the range. (e) Describe quantitative properties of the samples that you can see from those charts (e.g., whether the corresponding process is under control, whether the quantities observed vary randomly, etc.).
SEC. 25.6
1073
Acceptance Sampling
25.6
Acceptance Sampling Acceptance sampling is usually done when products leave the factory (or in some cases even within the factory). The standard situation in acceptance sampling is that a producer supplies to a consumer (a buyer or wholesaler) a lot of N items (a carton of screws. for instance). The decision to accept or reject the lot is made by determining the number x of defectives (= defective items) in a sample of size 11 from the lot. The lot is accepted if x ~ c, where c is called the acceptance number, giving the allowable number of defectives. If x > c, the consumer rejects the lot. Clearly, producer and consumer must agree on a ce11ain sampling plan giving 11 and c. From the hypergeometric distribution we see that the event A: "Accept the lot" has probability (see Sec. 24.7) c
(1)
P(A) = P(X ~ c) =
2: x=o
where M is the number of defectives in a lot of N items. In terms of the fraction defective e = MIN we can write (1) as
(2)
P(A; e) can assume n + 1 values conesponding to e = 0, liN, 21N, ... , NIN; here, n and c are fixed. A monotone smooth curve through these points is called the operating characteristic curve (OC curve) of the sampling plan considered.
E X AMP L E 1
Sampling plan Suppose that certain tool bits are packaged 20 to a box. and the following sampling plan is u~ed. A sample of two tool bits is drawn, and the corresponding box is accepted if and only if both bits in the sample are good. In Ihis case, N = 20, 11 = 2, C = O. and (2) t
(20 - 208)(19 - 200) 380 The values of peA. 0) for 8 = O. 1120.2120. ...• 20/20 and the re,ulting OC curve are shown in Fig. 537 on p. 1074. (Verify!) •
In most practical cases e will be small (less than 10%). Then if we take small samples compared to N, we can approximate (2) by the Poisson distribution (Sec. 24.7); thus (3)
(J.-t
= ne).
1074
CHAP. 25
p(A;e)
Mathematical Statistics
0.5
p(Ae) 0.5
0.2 Ii
Fig. 537.
E X AMP L E 2
Ii
OC curve of the sampling plan with n = 2 and c = 0 for lots of size N = 20
Fig. 538.
OC curve in Example 2
Sampling Plan. Poisson Distribution Suppose that for large lots the following sampling plan is used. A sample of size II = 20 is taken. If it contains not more than one defective. the lot is accepted. If the sample contains two or more defectives. the lot is rejected. In this plan, we obtain from (3)
e- 2o o(1 + 20/J).
peA: /J) -
•
The corresponding OC curve is shown in Fig. 538.
Errors in Acceptance Sampling We -;how how acceptance sampling fits into general test theory (Sec. 25.4) and what this means from a practical point of view. The producer wants the probability a of rejecting an acceptable lot (a lot for which 6 does not exceed a certain number 60 on which the two pm1ies agree) to be small. 60 is called the acceptable quality level (AQL). Similarly, P(A:Ii)
95%
S]~ \
Producer's risk
a = 5°
50%
\ 15%
:;ollsumer's risk
{3= 15°1-.
-1"--------I I
o 60 iiI = 1% = 5% Good : Indifference ' Poor material, zone : material Fig. 539.
OC curve, producer's and consumer's risks
SEC. 25.6
1075
Acceptance Sampling
the consumer (the buyer) wants the probability f3 of accepting an unacceptable lot (a lot for which e is greater than or equal to some e1 ) to be small. e1 is called the lot tolerance percent defective (LTPD) or the rejectable quality level (RQL). a is called producer's risk. It corresponds to a Type [ error in Sec. 25.4. f3 is called consumer's risk and corresponds to a Type II error. Figure 539 shows an example. We see that the points (eo, I - a) and (e1 • f3) lie on the OC curve. It can be shown that for large lots we can choose eo, el (> eo), a, f3 and then determine 11 and c such that the OC curve runs very close to those prescribed points. Table 25.6 shows the analogy between acceptance sampling and hypothesis testing in Sec. 25.4. Table 25.6
Acceptance Sampling and Hypothesis Testing
Acceptance Sampling Hypothesis Testing ---- . - - - - ----+-------=-------------1 Acceptable quality level (AQL) e = 80 Hypothesis tI = tlo Lot tolerance percent defectives (LTPD) Alternative 8 = til 8 = 81 Allowable number of defectives c Critical value c Producer·s risk a of rejecting a lot Probability a of making a Type Terror (significance level) with 8 ~ 80 Consumer's risk {3 of accepting a lot Probability {3 of making a Type II error with 8 ~ 81
Rectification Rectification of a rejected lot means that the lot is inspected item by item and all defectives are removed and replaced by nondefective items. (This may be too expensive if the lot is cheap; in this case the lot may be sold at a cut-rate price or scrapped.) If a production turns out 100e% defectives, then in K lots of size N each, KN8 of the KN items are defectives. Now KP(A; 8) of these lots are accepted. These contain KPNe defectives, whereas the rejected and rectified lots contain no defectives, because of the rectification. Hence after the rectification the fraction defective in all K lots equals KPNeIKN. This is called the average outgoing quality (AOQ); thus
(4)
AOQ(e) = ep(A; e).
\
OC curve \
0.5
\
AOQL.-~ 0/" o Fig. 540.
..f e*
..
0.5
~
e
OC curve and AOQ curve for the sampling plan in Fig. 537
CHAP. 25
1076
Mathematical Statistics
Figure 540 on p. 1075 shows an example. Since AOQ(O) = 0 and peA; 1) = 0, the AOQ curve has a maximum at some = 8*, giving the average outgoing quality limit (AOQL). This is the worst average quality that may be expected to be accepted under rectification.
e
--_....---........ -........... ..
-~
......
.-..
1. Lots of knives are inspected by a sampling plan that uses a sample of size 20 and the acceptance number c = I. What are probabilitIes of accepting a lot with 1%, 2%, 10% defectives (dull blades)? Use Table A6 in App. 5. Graph the OC curve. 2. What happens in Prob. I if the sample size is increased to 50? First guess. Then calculate. Graph the OC curve and compare. 3. How will the probabilities in Prob. I with 11 = 20 change (up or down) if we decrease c to zero? First guess.
compare with Example I.
11. Samples of 5 screws are drawn from a lot with fraction defective O. The lot is accepted if the sample contains (a) no defective screws, (b) at most 1 defective screw. Using the binomial distribution, find, graph, and compare the OC curves. 12. Find the risks in the single sampling plan with 11 = 5 and c = 0, assuming that the AQL is 00 = I % and the RQL is 01 = 15%.
4. What are the producer's and consumer's risks in Prob. I if the AQL is 1.5% and the RQL is 7.5'70?
13. Why is it impossible for an OC curve to have a vertical portion separating good from poor quality?
5. Large lots of batterie~ are inspected according to the following plan. 11 = 30 batteries are randomly drawn from a lot and tested. If this sample contains at most c = I defective battery, the lot is accepted. Otherwise it is rejected. Graph the OC curve of the plan, using the Poisson approximation. 6. Graph the AOQ curve in Prob. 5. Determine the AOQL, assuming that rectification is applied.
14. If in a single sampling plan for large lots of spark plugs, the sample size is 100 and we want the AQL to be 5% and the producer's risk 2%, what acceptance number c should we choose? (Use the normal approximation.)
7. Do the work required in Prob. 5 if 11 = 50 and c = O. 8. Find the binomial approximation of the hypergeometric distribution in Example 1 and compare the approximate and the accurate values.
15. What is the consumer's risk in Prob. 14 if we want the RQL to be 12%? 16. Graph and compare sampling plans with c = I and increasing values of II, say, 11 = 2. 3, 4. (Use the binomial distribution.)
9. In Example 1, what are the producer's and consumer's risks if the AQL is 0.1 and the RQL is 0.6?
17. Samples of 3 fuses are drawn from lots and a lot is accepted if in the corresponding sample we find no more than I defective fuse. Criticize this sampling plan. In particular, find the probability of accepting a lot that is 509r defective. (Use the binomial distribution.)
10. Calculate peA; 0) in Example 1 if the sample size is increased from 11 = 2 to 11 = 3, the other data remaining as before. Compute peA; 0.10) and peA; 0.20) and
18. Graph the OC curve and [he AOQ curve for the single sampling plan for large lots with 11 = 5 and c = 0, and find the AOQL.
25.7
Goodness of Fit. To test for goodness of fit means that we wish to test that a certain function F(x) is the distribution function of a distribution from which we have a sample Xl' . • . ,xn . Then we test whether the sample distribution function F(x) defined by
F(x)
=
SUIIl
of the relative frequencies of all sample l'lllues xJ not exceedbza x b
fits F(x) "sufficiently well." If this is so, we shall accept the hypothesis that F(x) is the distribution function of the population; if not, we shall reject the hypothesis.
SEC. 25.7
Goodness of Fit.
x 2 -Test
1077
This test is of considerable practical importance, and it differs in character from the tests for parameters (IL. a 2 • etc.) considered so far. To test in that fashion, we have to know how much F(x) can differ from F(x) if the hypothesis is true. Hence we must first introduce a quantity that measures the deviation of F(x) from F(x), and we must know the probability distribution of this quantity under the assumption that the hypothesis is true. Then we proceed as follows. We determine a number c such that if the hypothesis is true, a deviation greater than c has a small preassigned probability. If, nevertheless, a deviation greater than c occurs, we have reason to doubt that the hypothesis is true and we reject it. On the other hand. if the deviation does not exceed c, so that F(x) approximates F(x) sufficiently well. we accept the hypothesis. Of course. if we accept the hypothesis, this means that we have insufficient evidence to reject it, and this does not exclude the possibility that there are other functions that would not be rejected in the test. In this respect the situation is quite similar to that in Sec. 25.4. Table 25.7 shows a test of that type, which was introduced by R. A. Fisher. This test is justified by the fact that if the hypothesis is true, then Xo 2 is an observed value of a random variable whose distribution function approaches that of the chi-square distribution with K - I degrees of freedom (or K - r - 1 degrees of freedom if r parameters are estimated) as 11 approaches infinity. The requirement that at least five sample values lie in each interval in Table 25.7 results from the fact that for finite 11 that random variable has only approximately a chi-square distribution. A proof can be found in Ref. [G3] listed in App. 1. If the sample is so small that the requirement cannot be satisfied. one may continue with the test, but then use the result with caution. Table 25.7 Chi-square Test for the Hypothesis That F(x) is the Distribution Fundion of a Population from Which a Sample XlJ ••• , Xn is Taken Step 1. Subdivide the x-axis into K intecvab 110 12 ,
. . . , IK such thm each interval contains at least 5 values of the given smuple Xl, . . . , x n . Determine the number bj of smupJe values in the interval I j , where j = I .... , K. If a sample value lies at a common boundary point of two intervals, add 0.5 to each of the two corresponding b;r
Step 2. U:-ing F(x), compute the probability Pj that the random variable X under consideration assumes any value in the interval I j , where j = 1, ... , K. Compute ej
= IIPj.
(This is the number of sample values theoretically expected in I j if the hypothesis is true.) Step 3. Compute the deviation
(1) Step 4. Choose a significance level (5%. I %. or the like). Step 5. Determine the solution c of the equation P(X 2 ~ c)
=
I -
Q'
from the table of the chi-sqare distribution with K - I degrees of fi'eedom (Table AIO in App. 5). If rparameters of F(x) are unknown and their maximum likelihood estimates (Sec. 25.2) are used. then use K - r - I degrees of freedom (instead 2 of K - I). If Xo ~ c, accept the hypothesis. If Xo 2 > c, reject the hypothesis.
1078
CHAP. 25
Mathematical Statistics
Table 25.8 Sample of 100 Values of the Splitting Tensile Strength (Ib/in?) of Concrete Cylinders
320 350 370 320 400 420 390 360 370 340
380 340 390 350 360 400 330 390 400 360
340 350 390 360 350 350 360 350 360 390
410 360 440 340 390 370 380 370 350 400
340 350 390 350 350 320 330 350 380 410
380 370 330 340 400 330 350 370 380 370
350 370 360 390 340 380 300 370 340 400
360 380 330 350 360 390 360 390 360 360
320 300 400 380 370 400 360 370 330 340
370 420 370 340 420 370 360 340 370 360
D. L. IVEY. Splitting tensile tests on structural lightweight aggregate concrete. Texas Transp0l1ation Institute. College Station. Texas.
EXAMPLE 1
Test of Normality Test whether the population from which the sample in Table 25.8
wa~
taken is normal.
SolutiOIl. Table 25.8 show~ the values (column by column) in the order obtained in the experiment. Table 25.9 gives the frequency distribution and Fig. 541 the histogram. It is hard to guess the outcome of the test-does the histogram resemble a normal density curve sufficiently well or not? The maximum likelihood estimates for IL and cr2 are jL = X = 364.7 and ;;2 = 712.9. The computation in Table 25.10 yields Xo2 = 2.942. It is very intereMing that the interval 375 ... 385 contributes over 501ft of X02. From the histogram we see that the corresponding frequency looks much too small. The second largest contribution comes from 395 ... 405. and the histogram shows that the frequency seems somewhat too large. which is perhaps not obvious from inspection.
Table 25.9
Tensile Strength x [lb/in. 2 ]
300 310 320 330 340
Frequency Table of the Sample in Table 25.8 2 Absolute Frequency
3
4
5
Relative Frequency
Cumulative Absolute Frequency
Cumulative Relative Freguency FlX)
lex) 2
o 4
6 II
350 360 370 380 390
14 16 15 8
400 410 420 430 440
8 2 3
10
o 1
0.02 0.00 0.04 0.06 0.11
2 6 12 23
0.02 0.02 0.06 0.12 0.23
0.14 0.16 0.15 0.08 0.10
37 53 68 76 86
0.37 0.53 0.68 0.76 0.86
0.08 0.02 0.03 0.00 0.01
94 96 99 99 100
0.94 0.96 0.99 0.99
2
LOO
SEC. 25.7
1079
Goodness of Fit. x2-Test 0.20,--------,---,----...,-----,
0.15 ((x)
0.10
0.05
o~
250
__~~~~~--~~~~ 350 2
[lb.lin. ]
Fig. 541.
Frequency histogram of the sample in Table 25.8
We choose a = 5%. Since K = 10 and we e5timated r = 2 parameters we have to usc Table AlO in App. 5 with K - ,. - I = 7 degrees of freedom. We find c = 14.07 as the solution of P(X 2 ~ c) = 95%. Since 2 Xo < c, we accept the hypothesis that the population is normal. •
Table 25.10
Computations in Example 1 Xj -
Xj
26.7
ej
hj
Ternl in (1)
0.0000 ... 0.0681
6.81
6
0.0%
0.0681 ... 0.1335
6.54
6
0.045
-1.11 ... -0.74
0.1335 ... 0.2296 0.2296 ... 0.3594 0.3594 ... 0.4960
9.61 12.98 13.66
11 14 16
0.201 0.080 0.401
0.4960 ... 0.6517 0.6517 ... 0.7764
15.57 12.47
15 8
0.021 1.602
0.7764 ... 0.8708
9.44
10
0.033
0.8708 ... 0.9345 0.9345 . . . 1.0000
6.37
8
0.417
6.55
6
0.046
335 ... 345 345 ... 355 355 ... 365
385 ... 395
364.7) 26.7
-1.49 ... - 1.11
-x
395· .. 405 405 ... co
Xj -
... -1.49
-:x;···325 325 ... 335
365 ... 375 375 ... 385
¢(
364.7
-0.74' .. -0.36 -0.36· . 0.01 0.01 ... 0.39 0.39' . 0.76 0.76" . 1.13 1.13 ... 1.51 1.51 ..
co
X02 =
1. If 100 flips of a coin result in 30 heads and 70 tails. can we assel1 on the 5% level that the coin is fair? 2. If in 10 flips of a coin we get the same ratio as in Prob. I (3 heads and 7 tails), is the conclusion the same as in Prob. I? First conjecture. then compute. 3. What would be the smallest number of heads in Prob. I under which the hypothesis "Fair coin" is still accepted (with ex = 5%)? 4. If in rolling a die 180 times we get 39. 22. 41. 26. 20, 32. can we claim on the 5% level that the die is fair?
2.942
5. Solve Prob. 4 if the sample is 25, 31. 33, 27, 29. 35. 6. A manufacturer claims that in a process of producing kitchen knives, only 2.5% of the knives are dull. Test the claim against the alternative that more than 2.5% of the knives are dull, using a sample of 400 knives containing 17 dull ones. (Use ex = 5%.) 7. Between 1 P.M. and 2 P.M. on five consecutive days (Monday through Friday) a certain service station has 92,60. 66. 62. and 90 customers, respectively. Test the hypothesis that the expected number of customers during that hour is the same on lhose days. (Use ex = 591:.)
CHAP. 25
1080
Mathematical Statistics
8. Test for normality at the I % level using a sample of /I = 79 (rounded) values x (tensile strength [kg/mm2] of steel sheets of 0.3 nun thickness). a = a(x) = absolute frequency. (Take the first two values together, also the last three, to get K = 5.) 58
59
60
61
62
63
10
17
27
8
9
3
64
17.
9. In a sample of 100 patients having a certain disease 45 are men and 55 women. Does this support the claim that the disease is equally common among men and women? Choose a = 5%. 10. In Prob. 9 find the smallest number (>50) of women that leads to the rejection ofthe hypothesis on the levels 5%, 1%, 0.5%. 11. Verify the calculations in Example 1 of the text. 12. Does the random variable X = Number of accide/lTs per week in 1I certain foundry have a Poisson distribution if within 50 weeks. 33 were accident-free. I accident occurred in II of the 50 weeks. 2 in 6 of the weeks and more than 2 accidents in no week? (Choose a = 5%.) 13. Using the given sample, test that the corresponding population has a Poisson distribution. x is the number of alpha particles per 7.5-sec intervals observed by E. Rutherford and H. Geiger in one of their classical experiments in 1910, and a(x) is the absolute frequency (= number of time periods during which exactly x particles were observed). (Use a = 5%.)
x
0
a
I 57
t
a
I
16.
2
3
4
5
6
203
383
525
532
408
273
7
8
9
10
II
12
~B
139
45
27
10
4
2
0
18.
19.
load of 5000 lb, can we claim that a new process yields the same breakage rate if we find that in a sample of 80 rods produced by the new process, 27 rods broke when subjected to that load? (Use a = 5%.) Three samples of 200 livets each were taken from a large production of each of three machines. The numbers of defective rivets in the samples were 7, 8, and 12. Is this difference significant? (Use a = 5%.) In a table of properly rounded function values, even and odd last decimals should appear about equally often. Test this for the 90 values of lI(:':) in Table Al in App. 5. Are the 5 tellers in a ceI1ain bank equally time-efficient if during the same time interval on a certain day they serve 120.95, 110, 108, 102 customers? (Use a = 5%.) CAS EXPERIMENT. Random Number Generator. Check your generator expelimentally by imitating results of 11 trials of rolling a fair die, with a convenient 11 (e.g .. 60 or 300 or the like). Do this many times and see whether you can notice any "nonrandomness" features, for example. too few Sixes, too many even numbers. etc .. or whether your generator ~eems to work properly. Design and perform other kinds of checks.
20. TEAM PROJECT. Difficulty with Random Selection. 77 students were asked to choose 3 of the imegers II, 12, 13, ... ,30 completely arbitrarily. The amazing result was as follows.
14. Can we assert that the traffic on the three lanes of an expressway (in one direction) is about the same on each lane if a count gives 910. 850. 720 cars on the right. middle. and left lanes, respectively. during a particular time interval? (Use a = 5%.) 15. If it i5 known that 25% of certain steel rod~ produced by a standard process will break when subjected to a
'\lumber Il Frequ.
11
Number 21 rrequ.
12
12
13
14
15
16
17
18
19
20
10 20
8
13
9
21
9
16
8
22 23 24 25 26 27
28
8
15
10
10
9
12
8
29 30 13
9
If the selection were completely random, the following hypotheses should be true.
(a) The 20 numbers are equally likely. (b) The 10 even numbers together are as likely as the 10 odd numbers together. (c) The 6 prime numbers together have probability 0.3 and the 14 other numbers together have probability 0.7. Te~t these hypotheses. using a = 5%. Design further experiments that illustrate the difficulties of random selectiOn.
25.8 Nonparametric Tests Nonparametric tests, also called distribution-free tests, are valid for any distribution. Hence they are used in cases when the kind of distribution is unknown, or is known but such that no tests specifically designed for it are available. In this section we shall explain the basic idea of these tests, which are based on "order statistics" and are rather simple.
SEC. 25.8
1081
Nonparametric Tests
If there is a choice, then tests designed for a specific distribution generally give better results than do nonparametric tests. For instance. this applies to the tests in Sec. 25.4 for the normal distribution. We shall discuss two tests in terms of typical examples. In deriving the distributions used in the test, it is essential that the distributions from which we sample are continuous. (Nonparametric tests can also be derived for discrete distributions, but this is slightly more complicated. ) E X AMP L E 1
Sign Test for the Median A median of the population is a solution x ~ Ii of the equation F(x) ~ 0.5. where F is the distribution function of the population. Suppose that eight radio operators were tested, first in rooms without air-conditioning and then m air-conditioned rooms over the same period of time. and the difference of errors (unconditioned minus conditioned) were 9
o
4
o
4
11.
7
Test the hypothesis Ii ~ 0 (that is, air-conditioning has no effect) against the alternative ji > 0 (that is. inferior performance in unconditioned rooms).
Solution.
We choose the significance level a ~ 5%. If the hypothesis is true. the probability p of a positive difference is the same as that of a negative difference. Hence in this case. l' ~ 0.5. and the random variable
x
= NLlmber of positil'e \'a/LIes omollg
11
I'a/ues
has a binomial distribution with p = 0.5. Our sample has eight values. We omit the values O. which do not contribute to the decision. Then six values are left. all of which are positive. Since P(X
~ Ii) = (~)
(0.5)6(0.5)0
= 0.0156 =
1.56%
we do have observed an event whose probability is very small if the hypothesis is true: in fact 1.56% < a = 5%. Hence we assert that the alternative Ii > 0 is true. Thal is. the number of errors made in unconditioned rooms is significantly higher, so that installation of air conditioning should be considered. •
E X AMP L E 2
Test for Arbitrary Trend A certain machine is used for cutting lengths of wire. Five successive piece, had the lengths
29
31
28
30
32.
Using this sample. test the hypothesis that there is no trend, that is. the machine does not have the tendency to produce longer and longer pieces or shorter and sh0l1er pieces. Assume that the type of machine suggests the alternative that there is positil'e trend, that is. there is the tendency of successive pieces to get longer.
Solution.
We count the number of transpositions in the sample. that is. the number of times a larger value precedes a smaller value: 29 precedes 28
(1 transposition),
31 precedes 28 and 30
(2 transpositions).
The remaining three sample values follow in ascendmg order. Hence in the sample there are I transpositions. We now consider the random valiable
.L
2
3
T = Number of transpositiolls. If the hypothesis is true (no trend). then each of the 5! = 120 permutations of five elements I 2 3 4 5 has the same probability (11120). We alTange these permutations according to their number of transpositions:
1082
CHAP. 25
Mathematical Statistics
2
3
T=2
T=I
T=O 4
5
2 2 3 2
3 4
2 3
5 4 3 5 4 5 4 5
2
5 3 5
4
2
2
3 5 3
2
4
2 3 3 4
5
2
1
3
2 2
I
4
3 1
2
3
T=3
4 4
2 3 3 4 4 5
3 4 4
5 5 4
5 5 5
2 2 2
3
2 3 2 3 3
4
3
2
4
5 4 3 5 2 5 2 4 2 5 3 3 2 5 2 3 4 4 5 3 5 3 4 etc. I 5 4 4 5 I 3 5 2 5 4 4 2 5 4 5 2 3 5
4
From this we obtain P(T:o:; 3)
= l~O +
lio
+
l~O
+
)1;0
= )2io = 24%.
We accept the hypothesis because we have ob~erved an event that ha, a relatively large probability (certainly much more than 5'k) if the hypothesis i~ true. Values of the distribution function of T in the case of no trend are shoINn in Table A12. App. 5. For insrance. if /I = 3. then FlO) = 0.167. F(I) = 0.500, F(2) = I - 0.167. If /I = 4. then F(O) = 0.042. F(l) = 0.167, F(2) = 0.375. F(3) = I - 0.375, F(4) = I - 0.167, and SO on. Our method and those values refer to contilluous distributions. Theoretically. we may then expect that all the \'alues of a sample arc different. Practically. some sample values may still be equal. because of rounding: If 111 "alues are equal. add m(m - 1)14 (= mean value of thc transpo,itions in the case of the perIl1Ulalion~ of 111 elements). that is.l for each pair of equal values. ~ for each triple. etc. •
.,
-
1. What would change in Example 1. had we observed only 5 positive values? Only 4? 2. Does a process of producing plastic pipes of length f.L = 2 meters need adjustment if in a sample. 4 pipes have the exact length and 15 are shorter and 3 longer than 2 meters? (Use the nonnal approximation of the binomial distribution.) 3. Do the computations in Prob. 2 without the use of the DeMoivre-Laplace limit theorem (in Sec. 24.8). 4. Test whether a thermostatic switch is properly set to 200 e against the alternative that its setting is too low. Use a sample of9 values, 8 of which are less than 20 0 e and I is greater than 20°e. 5. Are air filters of type A better than type B filters if in IO trials. A gave cleaner air than B in 7 cases, B gave cleaner air than A in I case. whereas in 2 of the trials the results for A and B were practically the same?
6. In a clinical experiment. each of 10 patients were given two different sedatives A and B. The following table shows the effect (increase of sleeping time. measured in hours). Using the sign test. find out whether the difference is significant. A B
1.9 0.8 1.I 0.1 -0.1 4.4 5.5 1.6 4.6 3.4 0.7 -1.6 -0.2 -1.2 -0.1 3.4 3.7 0.8 0.0 2.0
Difference 1.2
2.4
1.3
1.3
0.0 1.0 1.8 0.8 4.6 1.4
7. Assuming that the populations corresponding to the samples in Prob. 6 are nOlmal. apply a suitable test for the normal distribution. 8. Thirty new employees were grouped into 15 pairs of similar intelligence and expeIience and were then instructed in data processing by an old method (A) applied to one (randomly selected) person of each pair. and by a new presumably better method (B) applied to
SEC. 25.9
Regression.
1083
Fitting Straight Lines. Correlation
the other person of each pair. Test for equality of methods against the alternative that (B) is better than (A), using the following scores obtained after the end of the training period. A 60 70 80 85 75 40 70 45 95 80 90 60 80 75 65
I
B 65 85 85 80 95 65 100 60 90 85 100 75 90 60 80
9. Assuming normality. solve Prob. 8 by a suitable test from Sec. 25.4. 10. Set up a sign test for the lower quartile q25 (defined by the condition F(q25) = 0.25). 11. How would you proceed in the sign test if the hypothesis is fi = fio (any number) instead of fi = O?
Temperature T
rCJ
Reading V [volts]
10
30
20
40
50
99.5 101.1 100.4 100.8 101.6
15. In a swine-feeding experiment. the following gains in weight [kg] of 10 animals (ordered according to increasing amounts of food given per day) were recorded:
20
17
19
18
23
16
25
28
24
22.
Test for no trend against positive trend. 16. Apply the test explained in Example 2 to the following data (x = diastolic blood pressure [mm Hgl. y = weight of hemt [in grams] of 10 patients who died of cerebral hemorrhage).
12. Check the table in Example 2 of the text.
A'121 120 95
13. Apply the test in Example 2 to the following data (x = disulfide content of a certain type of wool, measured in percent of the content in umeduced fibers; y = saturation water content of the wool. measured in percent). Test for no trend against negative trend.
,) 521 465 352 455 490 388 301 395 375 418
y
50
17. Does an increase in temperature cause an increase of the yield of a chemical reaction from which the following sample was taken"!
15
30
40
50
55
80
100
Temperature tOe]
10
20
30
40
60
80
46
43
42
36
39
37
33
Yield [kg/min]
O.b
1.1
0.9
1.6
1.2
2.0
14. Test the hypothesis that for a certain type of voltmeter. readings are independent of temperature T [0C] against the alternative that they tend to increase with T. Use a sample of values obtained by applying a constant voltage:
25.9
123 140 112 92 100 102 91
Regression. Correlation
18. Does the amount of feltilizer increase the yield of wheat X [kg/plot]? Use a sample of values ordered according to increasing amounts of fertilizer:
41.4
43.3
39.6
43.0
44.1
45.6
-l4.5
46.7.
Fitting Straight Lines.
So far we were concerned with random experiments in which we observed a single quantity (random variable) and got samples whose values were single numbers. In this section we discuss experiments in which we observe or measure two quantities simultaneously, so that we get samples of pairs of values (Xl, .\'1), (X2' )'2), ... , (x"' JII). Most applications involve one of two kinds of experiments, as follows.
1. In regression analysis one of the two variables. call it x. can be regarded as an ordinary variable because we can measure it without substantial en'or or we can even give it values we want. x is called the independent variable. or sometimes the controlled variable because we can control it (set it at values we choose). The other variable, Y. is a random variable, and we are interested in the dependence of Yon x. Typical examples are the dependence of the blood pressure Y on the age x of a person or, as we shall now say, the regression of Yon x. the regression of the gain of weight Y of certain animals on the daily ration of food x. the regression of the heat conductivity Y of cork on the specific weight x of the cork. etc.
1084
CHAP. 25
Mathematical Statistics
2. In correlation analysis both quantities are random variables and we are interested in relations between them. Examples are the relation (one says "correlation") between wear X and wear Y of the front tires of cars, between grades X and Y of students in mathematics and in physics, respectively. between the hardness X of steel plates in the center and the hardness Y near the edges of the plates, etc.
Regression Analysis In regression analysis the dependence of Y on x is a dependence of the mean /.L of Yon x, so that /.L = /.L(x) is a function in the ordinary sense. The curve of /.L(x) is called the regression curve of Y on x. In this section we discuss the simplest case, namely, that of a straight regression line (1)
Then we may want to graph the sample values as n points in the xY-plane, fit a straight line through them, and use it for estimating /.L(x) at values of x that interest us, so that we know what values of Y we can expect for those x. Fitting that line by eye would not be good because it would be sUbjective; that is, different persons' results would come out differently, particularly if the point<; are scattered. So we need a mathematical method that gives a unique result depending only on the Il points. A widely used procedure is the method of least squares by Gauss and Legendre. For our task we may fOlmulate it as follows.
Least Squares Principle
The straight line should be fitted through the given points so that the slim of the squares of the distances of those points from the straight line is minimum, where tile distance is measured in the vertical direction (the y-direction). (Formulas below.)
To get uniqueness of the straight line, we need some extra condition. To see this, take the sample (0, I), (0, -I). Then all the lines y = k1x with any kl satisfy the principle. (Can you see it?) The following assumption will imply uniqueness, as we shall find out.
General Assumption (Al)
The x-values Xl, . . . ,
Xn
in OLlr sample (Xl' Yl), . . . , (Xn, Yn) are not all equal.
From a given sample (Xl. Yl)' ••.. (X". Yn) we shall now determine a straight line by least ~quares. We write the line as
(2) and call it the sample regression line because it will be the counterpart of the population regression line (1). Now a sample point (Xj, )J) has the vertical distance (distance measured in the y-direction) from (2) given by (see Fig. 542).
SEC. 25.9
Regression.
1085
Fitting Straight Lines. Correlation y
x
x. J
Fig. 542.
Vertical distance of a point (Xj' Yj) from a straight line Y = ko
+ k,x
Hence the sum of the squares of these distances is n
(3)
q
= ~
(Yj -
ko - klXj)2.
j~I
In the method ofleast squares we now have to determine ko and kl such that q is minimum. From calculus we know that a necessary condition for this is
aq = 0
(4)
ilq
and
ilko
ilkl
=
o.
We shall see that from this condition we obtain for the sample regression line the formula (5)
Here i and.v are the means of the x- and the y-values in our sample, that is, (a)
j: =
-
I (Xl
+ ... + xn)
(."1
+ . . . + Yn)·
11
(6)
I
Y=
(b)
11
The slope kl in (5) is called the regression coefficient of the sample and is given by SXY
(7)
S 2 x
Here the "sample covariance" 1
(8)
and
Sxy
Sx
2
n
= -11 - I
~ (x· J
is
x)(,·· -J
y)
-
=
11 -
J~l
is given by 1
(9a)
Sxy
S 2 x
= -f)-I
n
~ j~I
(Xo J
x)2
=
f)-I
I
1086
CHAP. 25
Mathematical Statistics
:n.
From (5) we see that the sample regression line passes through the point Ct', by which it is detennined, together with the regression coefficient (7). We may call Sx 2 the variance of the x-values, but we should keep in mind that x is an ordinary variable. not a random variable. We shall soon also need S 2 Y
(9b)
71 = -1-I ~ ~ 11 -
j=l
Derivation of (5) and (7).
(y. _ 1')2 -J.
= -I1 Il -
[n
~ 1'.2 ~ -J j~l
-I
(71 ~
n
j=l
)2]
1'.
~-J
.
Differentiating (3) and using (4), we first obtain
iJq
ako
aq
ak
= -2 ~
)'i)j -
ko - k1xj)
=0
l
where we sum over j from 1 to 11. We now divide by 2, write each of the two sums as three sums, and take the sums containing )j and XjYj over to the right. Then we get the "normal equations" = ~ , .. ~.J
(10)
This is a linear system of two equations in the two unknowns ko and k1 . Its coefficient determinant is [see (9)] 11
and is not zero because of Assumption (A I). Hence the system has a unique solmion. Dividing the first equation of (10) by 11 and using (6), we get ko = Y - k1x. Together with y = ko + k1x in (2) this gives (5). To get (7), we solve the system (10) by Cramer's rule (Sec. 7.6) or elimination, finding
(II)
This gives (7)-(9) and completes the derivation. [The equality of the two expressions in (8) and in (9) may be shown by the student: see Prob. 14]. •
E X AMP L E 1
Regression Line The decrease of volume y ['i!-] of leather for certain fixed values of high pres~ure x [atmospheres I was measured. The resulh are shown in the first mo columns of Table 25.1 L Find the regression line of .'. on x. Solutioll. and (X)
We see that
11
= 4 and obtain the values.r = 2800014 = 7000. -" = II}.O/4 = 4.75. and from (9)
SEC. 25.9
Regression.
1087
Fitting Straight Lines. Correlation
Table 25.11 Regression of the Decrease of Volume y [%] of Leather on the Pressure x [Atmospheres] I
Given Values
Auxiliary Values
x}
Xj
.v· J
4000 6000 8000 10000
2.3 4.1 5.7 6.9
16000000 36000000 64000000 100000000
9200 24600 45600 69000
28000
19.0
216000000
148400
2 Sx
.
="3I
~XIJ
(
2
28 000 ) 216000000 - - - 4 -
. _ ~ ( 3 148400
-
xjYj
=
20 000 000 3
28000'19) _ 4 -
15400 3 .
Hence k] = 15 ·:100120000000 = 0.00077 from (7). and the regression line is
y - 4.75
=
0.000 77(x - 7000)
or
y
=
0.000 77r: - 0.64.
Note that y(O) = -0.64. which is physically meaningless. but typically indicates that a linear relation is merely an approximation valid 011 some restricted interval. •
Confidence Intervals in Regression Analysis If we want to get confidence intervals, we have to make assumptions about the distribution of Y (which we have not made so far; least squares is a "geometric principle," nowhere involving probabilities!). We assume normality and independence in sampling: Assumption (A2)
For each fixed x the random )'Qriable Y is /lonnalwith mean (I), that is, (12)
and varial/ce
(]"2
independent of x.
Assumption (A3)
The
11
pel.1lJ/7llaIlCeS of the experi1llellt by which we obtain a sample
a re independent. K1 in (12) is called the regression coefficient of the population because it can be shown that under Assumptions (AI)-(A3) the maximum likelihood estimate of KI is the sample regression coefficient kl given by (11). Under Assumptions (A1)-(A3) we may now obtain a confidence interval for Kb as shown in Table 25.12.
1088
CHAP. 25
Table 25.12
Mathematical Statistics
Determination of a Confidence Interval for
Kl
in (1) under Assumptions
(Al)-(A3) Step 1. Choose a confidence level ')'(95%,99%, or the like). Step 2. Determine the solution c of the equation (13)
F(c)
~(1
=
+ ')')
from the table of the t-distribution with n - 2 degrees of freedom (Table A9 in App. 5; 11 = sample size).
Step 3. Using a sample (Xlo Y1),
.•• ,
(xn , Yn), compute (n -
l)sx
2
from (9a), (n -
l)S.TY
from (8), kl from (7), n
(14)
I )Sy 2 = ~ Yj 2
(n -
-
n
[as in (9b)], and (15)
Step 4. Compute
K=c
(11 - 2)(n -
I )s,.2 .
The confidence interval is (16)
E X AMP L E 2
Confidence Interval for the Regression Coefficient Using the sample in Table 25.1 L determine a confidence interval for
Solution.
Step 1. We choose l'
=
/(1
by the method in Table 25.12.
0.95.
Step 2. Equation (13) takes the form HC) = 0.975, and Table A9 in App. 5 with 11 gives c = 4.30.
-
2 = 2 degrees offreedom
Step 3. From Example I we have 3s",2 = 20000000 and k1 = 0.00077. From Table 25.11 we compute 3s y2 = 102.2 =
192
4
11.95,
qo = I L')5 - 20 (X)O
om . OJ)(J0772
=
0.092.
=
4.30v'0.092/(2 . 20 000 (00)
Step 4. We thus obtain
K
= 0.000206 and CONFo.95 (0.00056 ~
K1 ~
0.000981.
•
SEC 25.9
Regression.
1089
Fitting Straight Lines. Correlation
Correlation Analysis We shall now give an introduction to the basic facts in correlation analysis: for proofs see Ref. [G2J or [G8] in App. I. Correlation analysis is concerned with the relation between X and Y in a two-dimensional random variable (X, Y) (Sec. 24.9). A sample consists of 11 ordered pairs of values (Xl' .\"1)' . . . , (xn , y,,), as before. The interrelation between the \" and y values in the sample is measured by the sample covariance Sxy in (8) or by the sample correlation coefficient (17)
with Sx and Sy given in (IJ). Here r has the advantage that it does not change under a multiplication of the X and y values by a factor (in going from feet to inches, etc.).
THEOREM 1
Sample Correlation Coefficient
The sample correlation coefficient r sati.~fies - I ~ r ~ 1. In particular. r and onl.r if the sample values lie on a straight line. (See Fig. 543.)
= :::'::: 1
if
The theoretical counterpart of r is the correlation coefficient p of X and Y, p=
(18)
where JLx = E(X) , JLy = E(Y), ux2 = E([X - JLxf), Uy2 = E([Y - JLy]2) (the means and variances of the marginal distributions of X and Y; see Sec. 24.9), and UXy is the
r=l 10
00
••
•
••
•• •
10
00 .~.
543.
• • •
r = 0.98
••
r=O 10
00
10
00
••
• • 10
•••
••
• 20
r = 0.6 • • •• • • • • • 10
20
•
• •
•
•
•
• 20
10
r = -0.3 10
00
• • •
• • • • • • •
10
20
r = -0.9 10
00
• • •
• 10
•• •
•
20
••
Samples with various values of the correlation coefficient r
1090
CHAP. 25
Mathematical Statistics
covariance of X and Y given by (see Sec. 24.9) (19)
UXY = E([X - J.Lx][Y - J.Ly]) = E(XY) - E(X)E(Y).
The analog of Theorem 1 is
THEOREM 2
Correlation Coefficient
The correlation coefficient p satisfies - 1 ~ P ~ 1. In particular. p = :::': 1 ollly if X alld Yare linearly related, that is. Y = yX + 8. X = y* Y + 8*.
if alld
X and Yare called uncorrelated if p = O.
THEOREM 3
Independence.
Normal Distribution
(a) Indepelldent X and Y (see Sec. 24.9) are uncorrelated. (b) If (X, Y) is nOl1llal (see below), then uncorrelated X alld Yare independent.
Here the two-dimensional normal distribution can be introduced by taking two independent standardized normal random variables X*. Y*, whose joint distribution thus has the density (20)
f*(x*. y*)
=
_1_
e-<x*2+y*2)/2
27T
(representing a surface of revolution over the x*y*-plane with a bell-shaped curve as cross section) and setting X
=
J.Lx
Y = J.Ly
+ +
uxX *
+ ~ uyY*.
pUyX*
This gives the general two-dimensional normal distribution with the density f(x, y) =
(21a)
1
2 e- h (x,y)/2
27TUXUy~
where (21b) hex. y)
=
In Theorem 3(b), normality is important, as we can see from the following example. E X AMP L E 3
Uncorrelated but Dependent Random Variables If X assumes -1, 0, I with probability 113 and Y = X2. then EO() = 0 and in (3) __
3
3
1
CTXY - E(XY) - E(X ) ~ (-1) . -
3
+
1 3 0 . 3
1 = 0 3'
+] 3 . -
.
so that p = 0 and X and Yare uncorrelated. But they are cenainly not independent since they are even functlonally ~~
SEC. 25.9
Regression.
Fitting Straight Lines. Correlation
1091
Test for the Correlation Coefficient p Table 25.13 shows a test for p in the case of the two-dimensional normal distribution. t is an observed value of a random variable that has a t-distribution with n - 2 degrees of freedom. This was shown by R. A. Fisher (Biometrika 10 (1915), 507-521).
Table 25.13 Test of the Hypothesis p = 0 Against the Alternative p of the Two-Dimensional Normal Distribution
> 0 in the Case
Step 1. Choose a significance level a (5%, 1%, or the like). Step 2. Determine the solution c of the equation P(T:;;::: c)
I - a
=
from the t-distribution (Table A9 in App. 5) with n - 2 degrees of freedom.
Step 3. Compute r from (17), using a sample
(XIo
Yl), ... , (x", Yn)'
Step 4. Compute
t=r(~). ~~ If t
E X AMP L E 4
~
c, accept the hypothesis. If t > c, reject the hypothesis.
Test for the Correlation Coefficient p Test the hypothesis p = 0 (independence of X and Y, because of Theorem 3) against the alternative p > 0, using the data in the lower left corner of Fig. 543. where r = 0.6 (manual soldering errors on 10 two-sided circuit boards done by 10 workers; x = front, y = back of the boards).
Solution. We choose a = 5%; thus 1 - a = 95%. Since n = 10, n - 2 = 8, the table gives c = 1.86. Also. t = 0.6VS/0.64 = 2.12 > c. We reject the hypothesis and assert that there is a positive correlation. A worker making few (many) errors on the front side also tends to make few (many) errors on the reverse side of the board. •
11-101
SAMPLE REGRESSION LINE
Find and sketch or graph the sample regression line of Y and x and the given data as points on the same axes. 1. (-1, 1), (0, 1.7), (1, 3) 2. (3, 3.5), (5, 2), (7, 4.5), (9, 3) 3. (2, 12), (5, 24), (9. 33), (14, 50) 4. (11, 22), (15, 18), e17, 16), (20, 9), (22, 10)
x
6
9
II
13
22
7. x = Revolutions per minute. y engine [hpJ
5. Speed x [mph] of a car
30
40
50
60
Stopping distance y [tt]
150
195
240
295
Also find the stopping disrance ar 35 mph.
6. x = Deformation of a certain steel [mm], y hardness [kg/mm 2 ]
26
28
= Brinell 33
35
= Power of a Diesel
x
400
500
600
700
750
y
580
1030
1420
18!m
2100
1092
CHAP. 2S
8. Humidity of air x [%] Expansion of gelatin y [%]
Mathematical Statistics 10
20
30
40
111-131
0.8
1.6
2.3
2.8
Find a 95% confidence interval for the regression coefficient Kl, assuming that (A2) and (A3) hold and using the sample:
9. Voltage x [V]
40
40
80
80
110
110
Current)' [A]
5.1
4.8
10.0
LO.3
13.0
12.7
CONFIDENCE INTERVALS
11. In Prob. 6 Also find the resistance R [il] by Ohms' law (Sec. 2.9]. 10. Force x [Ib] Extension y [in] of a spring
2
4
4.1
7.8
6
8
12.3 15.8
Also find the spring modulus by Hooke's law (Sec. 2.4).
Chapter 25 Review Questions and Problems

1. What is a sample? Why do we take samples?
2. What is the role of probability theory in statistics?
3. Will you get better results by taking larger samples? Explain.
4. Do several samples from a certain population have the same mean? The same variance?
5. What is a parameter? How can we estimate it? Give an example.
6. What is a statistical test? What errors occur in testing?
7. How do we test in quality control?
8. What is the χ²-test? Give a simple example from memory.
9. What are nonparametric tests? When would you apply them?
10. In what tests did we use the t-distribution? The χ²-distribution?
11. What are one-sided and two-sided tests? Give typical examples.
12. List some areas of application of statistical tests.
13. What do we mean by "goodness of fit"?
14. Acceptance sampling uses principles of testing. Explain.
15. What is the power of a test? What can you do if the power is low?
16. Explain the idea of a maximum likelihood estimate from memory.
17. How does the length of a confidence interval depend on the sample size? On the confidence level?
18. Couldn't we make the error in interval estimation zero simply by choosing the confidence level 1?
19. What is the least squares principle? Give applications.
20. What is the difference between regression and correlation analysis?
21. Find the maximum likelihood estimates of mean and variance of a normal distribution using the sample 5, 4, 6, 5, 3, 5, 7, 4, 6, 5, 8, 6.
22. Determine a 95% confidence interval for the mean μ of a normal population with variance σ² = 16, using a sample of size 400 with mean 53.
23. What will happen to the length of the interval in Prob. 22 if we reduce the sample size to 100?
24. Determine a 99% confidence interval for the mean of a normal population with standard deviation 2.2, using the sample 28, 24, 31, 27, 22.
25. What confidence interval do we obtain in Prob. 24 if we assume the variance to be unknown?
26. Assuming normality, find a 95% confidence interval for the variance from the sample 145.3, 145.1, 145.4, 146.2.

27–29   Find a 95% confidence interval for the mean μ, assuming normality and using the sample:
27. Nitrogen content [%] of steel 0.74, 0.75, 0.73, 0.75, 0.74, 0.72
28. Diameters of 10 gaskets with mean 4.37 cm and standard deviation 0.157 cm
29. Density [g/cm³] of coke 1.40, 1.45, 1.39, 1.44, 1.38
30. What sample size should we use in Prob. 28 if we want to obtain a confidence interval of length 0.1, assuming that the standard deviation of the samples is (about) the same?
31–32   Find a 99% confidence interval for the variance σ², assuming normality and using the sample:
31. Rockwell hardness of tool bits 64.9, 64.1, 63.8, 64.0
32. A sample of size n = 128 with variance s² = 1.921
33. Using a sample of 10 values with mean 14.5 from a normal population with variance σ² = 0.25, test the hypothesis μ₀ = 15.0 against the alternative μ₁ = 14.4 on the 5% level.
34. In Prob. 33, change the alternative to μ ≠ 15.0 and test as before.
35. Find the power in Prob. 33.
36. Using a sample of 15 values with mean 36.2 and variance 0.9, test the hypothesis μ₀ = 35.0 against the alternative μ₁ = 37.0, assuming normality and taking α = 1%.
37. Using a sample of 20 values with variance 8.25 from a normal population, test the hypothesis σ₀² = 5.0 against the alternative σ₁² = 8.1, choosing α = 5%.
38. A firm sells paint in cans containing 1 kg of paint per can and is interested to know whether the mean weight differs significantly from 1 kg, in which case the filling machine must be adjusted. Set up a hypothesis and an alternative and perform the test, assuming normality and using a sample of 20 fillings having a mean of 991 g and a standard deviation of 8 g. (Choose α = 5%.)
39. Using samples of sizes 10 and 5 with variances s_x² = 50 and s_y² = 20 and assuming normality of the corresponding populations, test the hypothesis H₀: σ_x² = σ_y² against the alternative σ_x² > σ_y². Choose α = 5%.
40. Assume the thickness X of washers to be normal with mean 2.75 mm and variance 0.00024 mm². Set up a control chart for μ, choosing α = 1%, and graph the means of the five samples (2.74, 2.76), (2.74, 2.74), (2.79, 2.81), (2.78, 2.76), (2.71, 2.75) on the chart.
41. What effect on UCL − LCL in a control chart for the mean does it have if we double the sample size? If we switch from α = 1% to α = 5%?
42. The following samples of screws (length in inches) were taken from an ongoing production. Assuming that the population is normal with mean 3.500 and variance 0.0004, set up a control chart for the mean, choosing α = 1%, and graph the sample means on the chart.

Sample No.   1      2      3      4      5      6      7      8
Length       3.49   3.48   3.52   3.50   3.51   3.49   3.52   3.53
             3.50   3.47   3.49   3.51   3.48   3.50   3.50   3.49

43. A purchaser checks gaskets by a single sampling plan that uses a sample size of 40 and an acceptance number of 1. Use Table A6 in App. 5 to compute the probability of acceptance of lots containing the following percentages of defective gaskets: ¼%, ½%, 1%, 2%, 5%, 10%. Graph the OC curve. (Use the Poisson approximation.)
44. Does an automatic cutter have the tendency of cutting longer and longer pieces of wire if the lengths of subsequent pieces [in.] were 10.1, 9.8, 9.9, 10.2, 10.6, 10.5?
45. Find the least squares regression line for the data (−2, 1), (0, 1), (2, 3), (4, …), (6, 5).
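For Prob. 43 above, the Poisson approximation makes the OC curve simple to tabulate: with sample size n and acceptance number c, the probability of accepting a lot with fraction defective p is approximately the sum of e^(−μ) μ^k / k! for k = 0, …, c, with μ = np. A minimal sketch follows (Python; note that the two garbled fractional percentages in the problem are rendered here as ¼% and ½%, which is our reading of the scan):

```python
from math import exp

def p_accept(p, n=40, c=1):
    """Poisson approximation of the acceptance probability P(X <= c)."""
    mu = n * p
    term, total = exp(-mu), 0.0   # term = e^{-mu} mu^k / k!, starting at k = 0
    for k in range(c + 1):
        total += term
        term *= mu / (k + 1)
    return total

for pct in (0.25, 0.5, 1, 2, 5, 10):   # percentages of defective gaskets
    print(f"{pct:5.2f}%   P(acceptance) = {p_accept(pct / 100):.3f}")
```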
Summary of Chapter 25   Mathematical Statistics

We recall from Chap. 24 that with an experiment in which we observe some quantity (number of defectives, height of persons, etc.) there is associated a random variable X whose probability distribution is given by a distribution function

(1)   F(x) = P(X ≤ x)   (Sec. 24.5)

which for each x gives the probability that X assumes any value not exceeding x.
In statistics we take random samples x₁, …, xₙ of size n by performing that experiment n times (Sec. 25.1) and draw conclusions from properties of samples about properties of the distribution of the corresponding X. We do this by calculating point estimates or confidence intervals, or by performing a test for parameters (μ and σ² in the normal distribution, p in the binomial distribution, etc.), or by a test for distribution functions.

A point estimate (Sec. 25.2) is an approximate value for a parameter in the distribution of X obtained from a sample. Notably, the sample mean (Sec. 25.1)

(2)   x̄ = (1/n) ∑_{j=1}^{n} x_j = (1/n)(x₁ + ⋯ + xₙ)

is an estimate of the mean μ of X, and the sample variance (Sec. 25.1)

(3)   s² = (1/(n − 1)) ∑_{j=1}^{n} (x_j − x̄)²

is an estimate of the variance σ² of X. Point estimation can be done by the basic maximum likelihood method (Sec. 25.2).

Confidence intervals (Sec. 25.3) are intervals θ₁ ≤ θ ≤ θ₂ with endpoints calculated from a sample such that with a high probability γ we obtain an interval that contains the unknown true value of the parameter θ in the distribution of X. Here, γ is chosen at the beginning, usually 95% or 99%. We denote such an interval by CONF_γ {θ₁ ≤ θ ≤ θ₂}.

In a test for a parameter we test a hypothesis θ = θ₀ against an alternative θ = θ₁ and then, on the basis of a sample, accept the hypothesis, or we reject it in favor of the alternative (Sec. 25.4). Like any conclusion about X from samples, this may involve errors leading to a false decision. There is a small probability α (which we can choose, 5% or 1%, for instance) that we reject a true hypothesis, and there is a probability β (which we can compute and decrease by taking larger samples) that we accept a false hypothesis. α is called the significance level and 1 − β the power of the test. Among many other engineering applications, testing is used in quality control (Sec. 25.5) and acceptance sampling (Sec. 25.6).

If not merely a parameter but the kind of distribution of X is unknown, we can use the chi-square test (Sec. 25.7) for testing the hypothesis that some function F(x) is the unknown distribution function of X. This is done by determining the discrepancy between F(x) and the distribution function F̃(x) of a given sample.

"Distribution-free" or nonparametric tests are tests that apply to any distribution, since they are based on combinatorial ideas. These tests are usually very simple. Two of them are discussed in Sec. 25.8.

The last section deals with samples of pairs of values, which arise in an experiment when we simultaneously observe two quantities. In regression analysis one of the quantities, x, is an ordinary variable and the other, Y, is a random variable whose mean μ depends on x, say, μ(x) = κ₀ + κ₁x. In correlation analysis the relation between X and Y in a two-dimensional random variable (X, Y) is investigated, notably in terms of the correlation coefficient ρ.
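As a concrete instance of the interval CONF_γ just described, the following minimal sketch computes a 95% confidence interval for μ with unknown variance via the t-distribution as in Sec. 25.3 (Python with SciPy; we reuse the nitrogen sample of review Prob. 27, so the printed interval is our own check, not a value taken from the text):

```python
import numpy as np
from scipy import stats

# Sample: nitrogen content [%] of steel (review Prob. 27)
x = np.array([0.74, 0.75, 0.73, 0.75, 0.74, 0.72])
n, gamma = len(x), 0.95

xbar = x.mean()              # sample mean, Eq. (2)
s = x.std(ddof=1)            # square root of the sample variance, Eq. (3)
c = stats.t.ppf((1 + gamma) / 2, df=n - 1)   # two-sided critical value
k = c * s / np.sqrt(n)
print(f"CONF_0.95 {{ {xbar - k:.4f} <= mu <= {xbar + k:.4f} }}")
```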
APPENDIX 2
Answers to Odd-Numbered Problems

Problem Set 1.1, page 8
1. (cos πx)/π + c
5. First order
7. Second order
9. Third order
11. y = ½ tan (2x + nπ), n = 0, ±1, ±2, …
13. y = e^(−…)
15. (A) No. (B) No. Only y = 0.
17. y″ = g, y′ = gt, y = gt²/2
19. y″ = k, y′ = kt + 6, y = ½kt² + 6t, y(60) = 1800k + 360 = 3000, k = 1.47, y′(60) = 1.47 · 60 + 6 = 94 [m/sec] = 210 [mph]
21. e^(kH) = ½, H = (ln ½)/k = 1570 [years]
Problem Set 1.2, page 11
11. y = −(2/π) cos ½πx + c
15. y = x(1 − ln x) + c
17. Verify the general solution y² + x² = c. Circle of radius 3√2.
19. mv′ = mg − bv², v′ = 9.8 − v², v(0) = 10. v′ = 0 gives the limit √9.8 = 3.1 [meter/sec].
Problem Set 1.3, page 18
+
3. cos 2y dy = '2 dx, y = ~ arcsin (4x + c) 7. dy/y = cot 77X dx, )' = c(sin 7TX)lhT t2 11. r = roe-
36x 2 = c, ellipses 9. y = tan (c - e- 7rx/7T) 13. I = Ioe- RtiL
15.y=ex/~
17. y = 4ln x
5. )'2
= Vln (X - 2x + e) y' = (y - b)/(x - a), y - b = c(x - a) yoe k = 2yo, e k = 2 (l week), e2k = 22 (2 weeks), e4k = 24 y = yoe kt = yoe- 0,OO01213t = )'oe-O.OO01213.4000 = 0.62yo; 62%; cf. Example 2. 27. y' = -ky, y = Yoe-1
y
2
V2ih
31. h = gt 2 /2, t = Y2h/g, v = gt = gY2h/g = 33. y' = 0 - (2/800)y, y = 200e- 0,0025t, f = 300 [min], y(300)
= 94.5 [lb]
35. (A) is related to the enor function and (C) concerns the Fresnel integral C(x); see App.3.1. (D) y' = '2.\y + I, yeO) = 0
Problem Set 1.4, page 25 1. Exact. x4 + y4 = e 3. Exact. u = cos TTX sinh Y + key), u y = cos TTX cosh y + k', k' = O. Ans. cos TTX sinh y = e 5. Exact. 9x2 + 4y2 = e 2f1 2f1 7. Exact, Mil = NT = -2e- , u = re- 2(J + k«(}), U e = -2re- + k', k' = O.
2fJ Ans. re- 21i = e, r = ee 9. Exact. u = ylx + sin 2x + key). lIy = lIx ...!.. k' = IIx - 2 sin 2y. AlIS. ylx + sin 2x + cos 2y = e 11. Not exact. F = 1/x2 by Theorem I. -ylx 2 dx + IIx dy = d(ylx) = O. Ails. Y = ex 13. - 3y 2/x 4 dx + 2ylx3 dy = d(\,2/x 3) = O. Y = e).2./2 (semicubical parabolas) 15. Exact, U = e 2x cos Y + key), lly = _e2x sin y + k', k' = O. AlIS. e2x cos y = e, e = 1 17. Not exact. Try R. F = e- x , e-X(cos wx + w sin wx) dx + dy = 0, U = Y + [(x), x U x = [' = e-X(cos wx + w sin wx), U = Y + [ = Y - e- cos wx = c, c = 0 Y Y 19. U = eX + key), u y = k' = -1 + e , k = -y + e • Ans. eX - y + e Y = e 21. B = C, !Ax2 + Cxy + !Dy2 = e
Problem Set 1.5, page 32 3. y
7. 9. 13. 17.
=
Cf;,-3.5x
+
0.8 ee- kx
5. y = 2.6e-1.25x + 4 2 e /(x13k if k =1= 0 11. y = 2xe cos 2x 15. Y = e llX (x2 + c), c = 4.1
+ c (if k = 0). y = + Separate. y - 2.5 = c cosh4 1.5x y = sin 2x + c/sin2 2x, e = 1 y = (c + cosh lOx)/x3 • Note (X 3 y)' = 5 sinh lOx.
y = x
19. )' = l/u
!
,
1I
= ce- 57 .x
-
6.5 5.7
-
21. u = y-2 = ecc\ l + ce2x ), c = 3, u(O) = 4 23. Separate variables. y2 = 1 - ce cos x, e = - 1/e 25 • .v' = Ry + k. y = ce Rt - klR. c = Yo + /.JR. Yo = 1000, R = 0.06. t = 65 - 25 = 40, k = 1000. Y = $178076.12. StaJt at 45 gives Yo[(I + 1I0.06)eo.o6.20 - 110.06] = 41.988732yo = 178076.12, Yo = k = $4241.05. 27. y' = 175(0.0001 - y/450), yeO) = 450· 0.0004 = 0.18, y = 0.135e-O.3889t + 0.045 = 0.1812, e-O.3889t = (0.09 - 0.045)/0.135 = 1/3. t = (In 3)/0.3889 = 2.82. AilS. About 3 years 29. y' = A - ky, yeO) = o. y = A(l - e-kt)/k 31. y' = By2 - Ay = By(y - AlB), A> 0, B > O. Constant solutions y = 0, y = AlB. y' > 0 if y > AlB (unlimited growth),.v' < 0 if 0 < y < AlB (extinction). y = AI(ceAt + B), yeO) > AlB if c < 0, yeo) < AlB if e > O. 33. y' = y - y2 - 0.2y, Y = 11(1.25 - 0.75e- O . 8t ), limit 0.8, limit 1 35. y' = y - 0.25y2 - O.ly = 0.25y(3.6 - y). Equilibrium harvest 3.6, y = 18/(5 + ce-O. 9t ) 37. (YI + )'2)' + P(YI + Y2) = c.v/ 39. (YI + .\"2)' + P(YI + Y2) = (YI' 41. Solution of eyt' + PQ'l = e(y/
+ PYI) + (1'2' + Ph) = 0 + 0 = 0 + PYI) + (Y2' + PY2) = r + 0 = r + PYl) = cr
43. CAS Experiment (a) y = x sin (lIx) + c\. c = 0 if y(2/rr) = 2/rr. y is undefined at x = 0, the point at which the "waves" of sin (11x) accumulate; the factor x makes them smaller and smaller. Experiment with various x-intervals. (b) )' = x"Lsin (llx) + c]. y(2/rr) = (2/rr)n. n need not be an integer. Try n = ~. Try n = - 1 and see how the "waves" near 0 become larger and larger. 45. y = uy*, y' + py = u'y* + uy*' + puy* = u'y* + u(y*' + py*) = u'y* + U' 0 = r, u' = r/y* = re Jp da', U = I eJp d.e r dx + c. Thus, y = UYh gives (4). We shall see that this method extends to higher-order ODEs (Secs. 2.10 and 3.3).
Problem Set 1.6, page 36
1. y' = 4, .v' = -1/4, Y = -x/4 + c* 3. y/x = c, y'lx = Y/X2, y' = ylx,)" = -X/y,)'2 + x 2 = c*, circles 5.2xy + x\" = 0, y' = -2y/x, y' = x/(2y), y2 - x2/2 = c*. hyperbolas 7. ye-.l:2/2 = c, y' = xy, y' = -1I(xy), yy' = -l/x, y2/2 = -In Ixl + c**, x = c*e- y2/2 , bell-shaped curves (with x and y interchanged) 9. y' = -4x/y. y' = )il4x, 4 In l}il = In Ixl + c':'*. x = C*J4. parabolas 11. xe- yl4 = c, y' = 4/x,}i' = -x/4, y = -x 2 /8 + c* 13. Use dy/d\ = I/(dx/dy). (y - 2x)e'" = c, tv' - 2 + Y - 2x)e" = 0, y' = 2 - Y + 2x, dxld"y = -2 + v - 2:r is linear, dx/dy + 2x = Y - 2, x = c*e- 2y + y/2 - 5/4 15. II = c, uxdx + uydy = 0, y' = -u,Ju y. TrajectOlies y' = uy/ux- Now v = c*. v,rdx + vydy = 0, y' = -v:r/Vy. This agrees with the trajectory ODE in u if U.l , = uy (equal denominators) and u y = -v.~ (equal numerators). But these are just the Cauchy-Riemann equations. 17.2r + 2y.v' = o. y' = -x/yo Trajectories y' = 5h. In IJI = In Ixl + c**, y = c*x. 19. y' = -4.\19)'. Trajectories}i' = 9}'14x. y = c*x9/4 (c* > 0). Sketch or graph these curves. Problem Set 1.7, page 41
1. In Ix - xol < a; just take b in ex = b/K large. namely, b = aK. 3. No. At a common point (Xl> .vI) they would both satisfy the "initial condition"" .v(Xl) = Yl, violating uniqueness. 5.)" = f(x. y) = rex) - pCr))': hence af/ay = -p(x) is continuous and i~ thus bounded in the closed interval Ix - xol ~ iI. 7. R has sides 2a and 2b and center (1, 1) since y(l) = 1. In R, f = 2y2 ~ 2(b + 1)2 = K, a = b/K = b/(2(b + 1)2). da/db = 0 gives b = 1, and a opt = b/K = 1/8. Solution by dy/)'2 = 2 dx, etc., y = 11(3 - 2x). 9. 11 + .v 2 1 ~ K = I + b 2 , a = b/K. da/db = O. b = 1. a = 1/2. Chapter 1 Review Questions and Problems, page 42
+ ~) = 4 dx. 2 arctan 2y = 4x + c*. y = ! tan (21' + c) = l/u, y' = -u' /u 2 = 4/u - 1/112, 1/ = c*e-4.r + ~ 15. dy/(y2 + 1) = x 2 dx, arctan y = x 3/3 + c, y = tan (x 3/3 + c) 17. Bernoulli.),' + xy = x/y, u = )'2, II' = 2)y' = 2x - 2ru linear, 2 J X2") U = e' (e' ~X dr + c) = 1 + ce- x , l' = V I u. Or write , (2 1) and separate. . )y = -x y 11. dy/(y2
13. Logistic ODE. y
-:t:
Z
--\.
19. Linear, y = e Cos xU e- cos x sin x dx
+
ce cos x
+ 1. Or by separation. 21. Not exact. Use Theorem 1, Sec. 1.4: R = 2/x, F = x 2 : the resulting exact ODE is 3x 2 sin 2y dx + 2x 3 cos 2y dy = d(x 3 sin 2y), x 3 sin 2y = c. Or by separation, cot 2y dy = - 3/(2x) d.r. etc., sin 2y = n·- 3 . 23. Exact. /I = I M dx = sin xy - x 2 + k, lIy = X cos xy + k' = N, k = y2, sin .l)" - x 2 + y2 = c. 25. Not exact. R* = 1 in Theorem 2, Sec. 1.4, F* = e Y • Exact is e Y sin (y - x) dx + eY[cos (y - x) - sin (y - x)J dy = O. II = I M dx = eY cos (y - x) + k, lIy = eY(cos (y - x) - sin (y - x» + k' = N, eY cos (y - x) = c. 27. Separation. )'2 + x 2 = 25 29. Separation. )' = tan (x + c), c = 31. Exact. u = X\2 + cos X + 2-" = c, c = 1I(0, 1) = 3 33. y' = x/yo Trajectories y' = -Yfx. y = c*/x by separation. Hyperbolas. 35. Y = Yoekt, e 4k = 0.9, k = ! In 0.9, e kt = 0.5, f = (In 0.5)/k = (In 0.5)/[(ln 0.9)/4] = 26.3 [daysl 37. ekt = 0.01, t = On O.OI)/k = 175 [days] 39. y' = -4x/)'. Trajectories y = CIX1l4 or X = C2y4 41. Logistic ODE y' = Ay - By2, Y = 1/1/, U' + All = +B, 1I = ce- At + B/A 43. A = amount of incident light. A thin layer of thicknes!> ,lx absorbs M = -kALh (-k = constant of proportionality). Thus ,lA/b.x = -kA. Let b.x ~ O. Then A' = -kA. A = Aoe-kl: = amount of light in a thick layer at depth x from the surface of incidence. c)
=
-!7T
Problem Set 2.1, page 52 1. \" = 2.5e4x + 0.5e- 4x
3. y = e- x cos x 9. Yes if a 1= 0 15. F(x, z, z') = 0
7. Yes 13. No 19. .'" dddy = 4:::, y = (CIX + C2)-1I3 21. (dddy)~ = _.::3 sin y, -11.:: = -dx/dy 23. y"y' = 2. y = ~(t + 1)3/2 - i. ),(3) = 25. y" = ky',.::' = k.:: . .:: = cle kx = y', Cl
5. Y = 4x2 11. No
= cos y + CI. X = -sin y + 3l, /(3) = 4 = 1, Y = (e kx - l)/k
+ 7/x 2
ClY
+
C2
Problem Set 2.2, page 59 1. Y = cle7X + C2 e - x 5. y = cleO.9X + C2e-L1X 9. y = cle3.5X + C2e-1.5X 13. y = Cl e12,- + C2e-12X
3. Y = (ci + c2x )e2.5x 7. y = eO. 5X(A cos I.5x + B sin 1.5.\) 11. y = A cos 3rr.r + B sin 3rr.r 15. y" - 3y' + 2y = 0 19. y" - 16y = 0 23. Y = e- 2x(2 cos x - sin x)
17. -" " - 2'[;;3' v.J y + 3Y = 0 3x 21. Y = 4e - 2e- X 27. y = (2 - 4x)e-O. 25x 25. \" = 2 + e- 7TX 29. y = e-o.1:r(3.2 cos 0.2x + 1.6 sin 0.2x) 31. \" = 4e5X - 4e- 5J; x x x 33')"1 = e- '.'"2 = O.OOle + e· E • , 1 , 35. W nte = e -a:"C/2 , c = cos wx, s = S1l1 wx. Note that £ = -"2a£, C = -ws, 2 s' = we. Substitute, drop £, collect c-terms, then s-terms, and use w = b - !a 2 , to get c(b - !a 2 + !02 - w 2) + s( -ow + !aw + ~aw) = 0 + 0 = O.
Problem Set 2.3, page 61 1. O. 0, -2 cos x 3. -0.8 X 3 S. -12x 3 + 9x2 + 8x - 2. -28 sin 4x - 4 cos 4x. 0 7. Y 11. y
=
=
2X (Cl + c2x )eCle-3.1X + C2e-x
+
6x
2
+ 0.4,
O. eO. 4x
= e- 3X(A cos 2x + B sin 2x) 13. y = A cos 4.2wx + B sin 4.2wx 9. y
Problem Set 2.4, page 68 1. Y = Yo cos wof + (uo/wo) sin wof. At integer f (if Wo = 7T), because of periodicity. 3. lIlLe" = -lIlg sin e = -mge (tangential component of W = lIlg). e" + w02e = O. WO/(27T)
=
\/i/i/(27T).
5. No. because the frequency depends only on kIm. 7. (i) Greater by a factor vi (ii) Lower 9. w* = [w0 2 - c 2/(41112)1112 = wo[1 - c 2/(411lk)]112 = wo(l - c 2/8111k) = 2.9583 11. 27T/W* since Eq. (10) and y' = 0 give tan (w';'f - 8) = -a/w*; tan is periodic with period 7T/W"'. 13. Case (II) of (5) with c = "\, '417lk = V 4· 500' 4500 = 3000 [kg/secJ. where 500 kg is the mass per wheel. 15. y = [Yo + (uo + aYo)f1e- at , Y = [1 + (uo + l)t]e- t ; (ii) u o = -2. -3/2, -4/3, -5/4, -6/5 17. Y = 0 gives Cl = -C2e-2{Jt, which has one or no positive zero, depending on the initial conditions.
Problem Set 2.5, page 72 1. CIX3
+
C2X-2
5. xlA cos (In Ixl) 9. CIXO.1 + C2XO.9
+
13. x-o. 5 [2 cos (10 In
B sin (In
Ixl) -
Ixl)]
sin (10 In
Ixl)]
3. (Cl + C2 In Ixl )x4 7. CIX1.4 + C2X1.6 11.3x2 - 2x 3 15.2.\"-3 + 10
Problem Set 2.6, page 77 1. y" - 0.25.'" = 0, W = -1 3. y" - 21..;/ + k 2 y = 0, W = e2kx 5. x 2/ ' + 0.5x/ + 0.0625), = 0, W = x-O. 5 7. x2y" + xy' + 4y = 0, W = 2/x 9. x2y" 0.75.'" = O. W = -2 11. y" - 6.25." = O. W = 2.5 13. y" + 2/ + 1.64." = O. W = 0.8e- 2x 15. y" + 5/ + 6.34y = 0, W = 0.3e- 5X 7 6rr 17. y" + 7.67Ty' + 14.44~\" = 0, W = e- . :,.
Problem Set 2.7, page 83 1. cIe- x
+ c 2e- 2x + 2.5e2x 3. cIe4X + C2 e - 4X + 2.4xe4.'t 3 5. Cle2X + C2e-3X - x - 3x - 0.5 7. e- 3X(A cos 8x + B sin 8x) + eX(cos 4x + ! sin 4x) 9. c 1e-O. 4x + C2eo.4.'t + 20xeo.4.'t - 2Q,e-O. 4x 11. Cl cos 1.2x + C2 sin 1.2x + lOx sin 1.2x 2T 13. e- (A cos x + B sin x) + 5x 2 - 8x + 4.4 - 1.6 cos 2x + 0.2 sin 2x 15. 4x sin 2x
-
4e x
17. e-O. IX ( 1.5 cos 0.5.t - sin 0.5.\:) + 2eO. 5x 19. 2e- 3X + 3e4x - 12.\"3 + 3x 2 - 6.5x
Problem Set 2.S, page 90 1. -0.4 cos 3t + 7.2 sin 3t 3. -12.8 cos 4.5t + 3.6 sin 4.5t 5.0.16 cos 2t + 0.12 sin 2t 7. 475 cos 3t - 415 sin 3t 9 • c 1 e- t/2 + C 2 e- 3tJ2 - 32 5 cos t - 1 5 sin t 11. (ci + c2t)e-3t/2 - ~ cos 31 - sin 3t 13. e-1. 5t (A cos t + B sin t) + 4 + 0.8 cos 2t - 6.4 sin 2t 15. 0.32e- t cos 5t + 0.68 cos 3t + 0.24 sin 3t 17. 5e- 41 - 4e- 2t - 0.3 cos 2t + 0.1 sin 2t 19. e-1. 5t (O.2 cos t - 1.1 sin t) + 0.8 cos t + 0.4 sin t
Problem Set 2.9, page 97 1. LI' 3. Rl'
+ Rl = E,1 = (E/R) + ce- RtJL = 2.4 + + lIC = O. I = ce-tJ(RC)
ce- 50t
5. I = 5(cus t - cos 1Ot)/99 7.10 is maximum when S = 0; thus C = 1/(w2 L). 9. R > Rcrit = 2VLIC is Case I. etc. 11.0 13. c l e- 20t + C2e-lOt + 16.5 sin lOt + 5.5 cos lOt 15. £' = -e- 4t (7.605 cos ~t + 1.95 sin ~t), I = e-o.lt(A cos ~t + B sin ~t) - e- 4t cos ~t 17. £(0) = 600. I' (0) = 600, 1 = e- 3t ( - 100 cos 4t + 75 sin 4t) + 100 co'> t 19. (b) R = 2 fl, L = 1 H, C = 1112 F, £ = 4.4 sin lOt V
Problem Set 2.10, page 101
+ B sin x - x cos x + (sin x) In Isin xl 3. CIX + C2X2 - X cos x 5. (cos X)(ci + sin .\" - In Isec x + tan xl) + (sin X)(C2 - cos x) = (cl - In Isec x + tan xl) cos x + C2 sin x 7. (CI + ~x) sin x + (C2 + In Icos xl) cos x 9. (Cl + c2x)ex + x 2 + 4x + 6 - eXOn Ixl + I) 11. c] cos 2x + C2 sin 2x + ~.\: cosh 2x 13. CIX + C2X2 - x sin x 1. A cos x
15. A cos x + B sin r + Ypi + Y p 2. Ypi as .\"p in Example 1, .\"p2 = ~~ sin 5x 17. lI" + 1I = 0 by substitution of Y = lIX- I/2 . YI = x- 1I2 cos x . .\"2 = x- 1I2 1I2 sin from (2) with the ODE in standard sin x. Yp = _~x1l2 cos X + form.
ix-
x
Chapter 2 Review Questions and Problems, page 102 9.
4 cle ."
11. e-
4X
+
C2e-2X -
(A cos 3r
+
1.1 cos 6x - 0.3 sin 6x B sin 3x) - ~ cos 3x
+
~ sin 3x
13. Y1 = x 3 , Y2 = x- 4 , r = x- 5 , W = -7x- 2 , Yp = - 412X-3 - ~X-3 = -*x- 3 15. )'1 = eX, Y2 = xe x , W = e2.T, Yp = e X I(2x) 17. -'"1 = eX cos X')'2 = eX sin x, W = e 2x , yp = -xe x cos x + eX(sin x) In Isin xl 19. y = 4e 2x + 2e- 7x 21. Y = 9x- 4 + 6x 6 2 3x 23. Y = e-2.1' - 2e- + 18x - 30t" + 19 25. Y = ~X3 + 4x 2 - 5x- 2 27. Y = -16 cos 2t + 12 sin 2t + 16(cos 0.5t - sin 1.5t). Resonance for wl(27T) = 2/(27T) = 1/7T 29. w = 3.1 is close to Wo = -vkj;, = 3, Y = 25(cos 3t - cos 3.1t). 31. R = 9,0, L = 0.5 H. C = 0.025 F, E = 17 sin 6t V, hence 0.5/" + 91' + 401 = 102 cos 6t, 1= -8.16e- 8t + 7.5e- lOt + 0.66 cos 6t + 1.62 sin 61 33. E' = 220·314 cos 314t, I = e- 50t(A cos 150t + B sin 150r) + 0.847001 sin 3141 - 1.985219 cos 314t
Problem Set 3.1, page 111 7. Linearly independent 11. xlxl = x2 if x > 0, linearly dependent 13. Linearly independent 17. Linearly independent
9. Linearly dependent 15. Linearly independent 19. Linearly dependent
Problem Set 3.2, page 115
+ 11y' -
=
3. yiv - Y = 0 7. C1 + C2 cos x -I:=. C3 sin x 9. C1ex + (c2 + c3x)e-·" 11. C1 ex + c2il+v7lX + c 3e(1-\ 7)x 13. eO. 25x + 4.3e-O. 7X + 12.1 cos O. Lt - 0.6 sin O.Ix 15. 2.4 + e-1.6x(cos 1.5x - :2 sin 1.5x) 17. y = cosh 5x - cos 4x 19. y = c 1x- 2 + C2X + C3X2. W = 121x2 1. Y'" - 6y" 5. /v + 4-,""
=
6-,"
0
0
Problem Set 3.3, page 122 1. (Cl + c2x)e2x + C3e-2.T - 0.04e- 3x + x 2 + X + 3. c] cos ~x + C2 sin ~x + X(C3 cos ~x + C4 sin ~x) - ~e-x sin !x 5. C1XO.5 + C2X + C3X1.5 + 0.1.\.5·5
7. C] cos X + C2 sin x + C3 cos 3x + C4 sin 3x 9. Y = (4 - x 2)e 3X - 0.5 cos 3x + 0.5 sin 3x 11. x- 2 - x 2 + 5x 4 + x(In x + I) 13. 3 + ge- 2x cos 9x - (1.6 - 1.5x)e X
+ 0.2 cosh
2x
Chapter 3 Review Questions and Problems, page 122 7. Cl + C 2x 1/2 + C3X-1/2 9. cle-O. 5x + C2 e O. 5X + C3e-L5x 2 11. Clx (i In x - ~) + C2X2 + c3.,· + C4 + tx 7 13. C1e-x + e X / 2(c2 cos (~V'3x) + C3 sin (~v'3x» + 8ext2 15. (c1 + c2x)eX + C3e-x + 0.25x 2e x 17. -0.5x- 1 + 1.5x- 5 19. cos 7x + e 3x - 0.02 cosh x
Problem Set 4.1, page 135 1. Yes 5. y~ = 0.02(-)'1 + )'2), y~ = 0.02(\'1 - 2)'2 + )'3)')'~ = 0.02(.1'2 - )'3) 7. Cl = 1, C2 = -5 9.3 and 0 2t 11. )'~ = )'2, y~ = 4Yb .1'1 = c 1e- + C2 e2t = y, Y2 = v~ 13. y~ = Y2' y~ = )'2, eigenvalues 0, 1')'1 = Cl + C2 e t, Y2 = y~ = y' IS.)'~ = )'2, )'~ = 0.109375.1'1 + 0.75)'2 (divide by 64). Yl = cle-O.125t + c2eO.875t
Problem Set 4.3, page 146
+ C2e 6t , )'2 = -2Cle-6t + 2c2e6t 3')'1 Cle2t + C2, )'2 = Cl e2t - C2 5. Yl = Cle4it + C2e-4it = (Cl + C2) cos 41 + i(CI - C2) sin 41 = A cos 4t + B sin 4t, Y2 = iC1e4it - iC2e-4it = (iCI - iC2) cos 4t + i(iCI + iC2) sin 4t = B cos 4t - A sin 4t, A 1.
)"1
= =
c 1e- 6t
= Cl
+
C2.
B = i(CI - C2) 2e 1 + C2e-6t • .1'2 = -Cl + C3e-6t, )'3 = -Cl + 2(C2 + c 3 )e-6t 9f + 2c e-1. 8t , )'2 = 2Cle1.8t + c e- O.9t - 2C3e-1.8t, )"1 = c1e1.8t + 2c2e-O. 2 3 )'3 = 2Cle1.8t - 2C2e-O.9t + C3 e -1. 8t 11')'1 = 10 + 6e 2t , )'2 = -5 + 3e 2t 13. )"1 = 2.4e- t - 2e2.5t , .1'2 = 1.8e- t + 2e2.5t 15')'1 = 2e 14.5t + 10, Y2 = 5e 14.5t - 4
7. 9.
)'1
=
17. )'2 = .1'~ + Yb Y~ = )'~ + )'~ = -)'1 - Y2 = -Yl - ()'~ + .1'1), y~ + 2y~ + 2Y1 = 0, Y1 = e-t(A cos 1 + B sin t), )'2 = y~ + )'1 = e-t(B cos t - A sin t). Note that r2 = Y12 + .1'22 = e- 2t (A 2 + B2). 19.11 = 4c1e-200t + C2 e - 50t , 12 = -Cle-200t - 4C2e-50t
Problem Set 4.4, page 150
1. Saddle point, unstable, .1'1 = Cle-4t + c 2e 4t, )'2 = -2c 1e- 4t + 2c2 e 4t 3. Unstable node . .1'1 = Clet + C2e3t, )'2 = -Clet + C2e3t 5. Stable and attractive node, .'11 = Cle-3t + C2e-5t, .'12 = c 1e- 3t - c 2 e- 5t 7. Center, stable'.\'1 = A cos 41 + B sin 4t, .1'2 = -2B cos 4t + 2A sin 4t 9. Saddle point, unstable, .1'1 = Cle3t + C2e-t, .1'2 = C1e3t - C2e-t 11.)\ = Y = c 1e kt + c 2 e- kt ')'2 = y', hyperbolas k 2.'112 - Y2 2 = const 13. )' = e- 2t (A cos t + B sin t), stable and attractive spirals 17. For instance, (a) -2, (b) -1, (c) -~, (d) 1, (e) 4.
Problem Set 4.5, page 158
(9,
0), y~ = )'2, Y~ = 3.1'1, saddle poim; (0, -1)')'1 = 5\')'2 = -1 + Y2, y~ = -Y2, )1 = 35\. center 3. (0, 0), Y~ = 4.1'2. Y~ = 2Yl, saddle point; (2, 0), Yl = 2 + )lb )'2 = )12' Y~ = 4Y2, y~ = -2Yl' center 5. (0, 0), Y~ = -Yl + )'2, y~ = - Yl - .1'2' stable and attractive spiral point; (-2, 2), )'1 = -2 + )11> Y2 = 2 + Y2, Y~ = -.h - 3.V2, y~ = -j\ - )12, saddle point
1.
App. 2 7 • .\'~
Answers to Odd-Numbered Problems
=
.\"2' Y~
= -.\"10 - 4)'1), (0, O),)'~ = .\"2, .,.~ = -."1, center; + Yb )'2 = 5'2, Y~ = 5'2, Y~ = (-! - 5'1)( -4Yl)' y~ =
(!, 0)'.\'1 = !
Yl' saddle 9. (~7T ::!: 2117T, 0) saddle points; (-~7T ::!: 2117T, 0) centers. Use -cos (::!:~7T + :VI) = sin (::!:5'1) = ::!:Yl' 11 . .\"~ = .\'2, y~ = -)'1(2 + )'1)(2 - .\"1)' (0, 0), y; = -4.\"1' center; (-2, 0), y~ = 85'1' saddle point; (2, 0), y~ = 85'1 saddle point 13. )'''/y' + 2y' /)' = 0, In y' + 2 In Y = c, y' y2 = Y2Y] 2 = const 15.), = A cos t + B sin t, radius YA 2 + B2
Problem Set 4.f page 162 3. Yt = A cos 4t + B sin 41 + ~~, )'2 = B cos 4t - A sin 4t - ~t 5. )'1 = Cle4t + C2e-3t + 4, Y2 = c 1e 4t - 2.5c2e-3t - 10 7. Y1 = 2cle-9t + C2e-4t - 90t + 28'.\"2 = Cle-9t + C2e-4t - J26t + 14 9')'1 = Clet + 4c2e 2t - 3t - 4 - 2e- t ')'2 = -Clet - 5C2e2! + 5t + 7.5 + e- t 11')'1 = 3 cos 2t - sin 2t + t + 1,)'2 = cos 2t + 3 sin 2t + 2t - ~ 13'."1 = 4e- t - 4et + e 2t'."2 = -4e- t + t 15'."1 = 7 - 2e 2t + e 3t - 4e- 3t , )'2 = _e 2t + 3e- 3t 17. I~ + 2.5(ft - 12) = 845 sin t, 2.5(1~ - I~) + 2512 = 0, 11 = (95 + 162.5t)C 5t - 95 cos t + 312.5 sin t, 12 = (-30 - 162.5t)e-5t + 30 cos t + 12.5 sin t 19. I~ + 2(11 - 12) = 200, 2(12 - It) + 812 + 2 I 12 dt = O. II = 2cleA,t + 2C2e"~2t + 100, 12 = (1.1 + Vo:4T)cleA,t + 0.1 - \"0.41)C2eAzt, Al = -0.9 + \,'0.41, A2 = -0.9 - v'6AT
Chapter 4 Review Questions and Problems, page 163 11')'1 = Cle8t + C2e-8t, )'2 = 2c 1e 8t - 2c2e-8t. Saddle point 13. ."1 = Clet + C2e-6t, ."2 = Clet - 6c2e-6t. Saddle point 15. )'1 = Cle7.5t + C2e-3t, Y2 = -Cle7.5t + 0.75c2e-3t. Saddle point 17. ."1 = Cle5t + C2et, .\'2 = Cle5t - c2e t . Unstable node 19. Yl = e-t(A cos 2t + B sin 2t), )'2 = e-t(B cos 2t - A sin 2t). Stable and attractive spiral point 21. )'1 = Clet + C2e-t + e 2t + e- 2t , )'2 = -C2e-t - 1.5e- 2t 23')'1 = Clet + C2e-2t - 6e- t - 5. Y2 = -Clet - 2c2e-21- + 10e- t + 6 25• .\"1 = Cle3t + C2e-t + t 2 - 2t + 2, )'2 = c 1e 3t - C2e-t - t 2 + 2t - 2 27. A saddle point at (0, 0) 29. I] = 4e- 40t - e- lOt , 12 = _e- 40t + 4e- lOt
31. (117T, 0) center for even 11 and saddle point for odd n 33. Saddle points at (0, 0) and (~, ~), centers at (0, ~) and (~, 0)
(oblem Set 5. 1 page 170 1. aoO + x + ~X2 + ... ) = aoe x 3. aoO - 2X2 + ~X4 - + ... ) + al(x - ~x3 = ao cos 2,r + ~al sin 2x
+
I~X5 -
+ ... )
App. 2
AU
Answers to Odd-Numbered Problems
5. ao(1 + ~x) 7. ao + aox + (~ao
+ ~)x2 + ... = aoe x + eX - x - I = ce x 9. ao + a1x + ~alx2 + ... = ao - a l + alex 11. s = ~ - 4x + 8x 2 - 3ix 3 + 3ix 4 - I;:X5, s(O.2) = 0.69900 13. s = ~ + ~x - 1s x 3 + ~OX5, sCI) = 0.73125 15. s = I + x - x 2 - ~X3 + ~X4 + ~!X5, s@ = ~~~
x-I, c
= ao + 1
Problem Set 5.2, page 176 1. lei 5.0
3. 2 (as function of t = lx - 3)2). Ans.
9. 1 13.
V2
7. 2 11.
7T
15.
L
(_1)S-l
L
xS'R
5(s - 2)
s~3
= 1
,
(s - 4)2
s~5
(s-3)!
xS ' R =
Cf)
'
+ tx 3 + 2~X4 - :14x5 - ... ) 19. ao + aI(x - ~X3 + ~X5 - 2i x 7 + 227X9 - I~5Xll + - ... ) 21. lIo(l - ~X2 - :14x4 + 7I~ox6 + ... ) + aI(x - tx 3 - 214X5 + 1O~SX7 + ... ) 23. ao(1 + x 2 + x 3 + X4 + x 5 + x 6 + ... ) + alx 17. ao(l - l2X4 _lo-\:5 - ... )
+
a1(x
+
~x2
Problem Set 5.3, page 180 3. P 6lx) = I~l231x6 - 315x4 + 105x2 - 5), P 7lx) = I~(429x7 - 693x 5 + 315x 3 - 35x) 7. Set x = az.. y = cIPn(x/a) + C2Qn(x/a)
=~. P 21 = 3x~, P22 2 P 4 = (l - x 2)(105x2 - 15)/2
15. P l 1
= 3(1 - x 2),
Problem Set 5.4, page 187 1.
VI
= 1+
3•."1 = 1 -
x2
~
3!
~
12
5. r(r - 1) + 4r
+
X4
~
5! I ~-.r4 384
+ +
1 2
+ ... = -
sinh x x
. ..
Y2
v = 9". 1 In
'.2
=
x
x -
2 = 0, r l = -1, r2 = -2; Y1 =
+
24
720
+
x
2! 1M
3 + x + .,. ~
4! 36
x
x
6
+
x2 2
rl
25x 4
1024
+ ... ,
120
+ - ...
7. Euler-Cauchy equation with t = x + 3, h = (x + 3)5, Y2 = ."1 In (x 9. ho = 1, Co = 0, r2 = 0, Yl = e- x , ."2 = e- x In x 11'."1 = 1I(x + 1), .1'2 = 1Ix
13. ho = ~, Co = 0,
cosh x x
-2+
X4
=
+ 3)
= ~, r2 = 0'."1 = x I/2 (1 + 2x + 2X2 + ~X3 + ... ),
Y2=I +2x+2x + ... 15'."1 = (x - 4)7, .1'2 = (x - 4)-5 (Euler-Cauchy with t = x - 4) 3 17'."1 = X + x - I~X4 + I~X5 - 2~X6 + .. " Y2 = 1 + 3x 2 - tx 3 2
+
~~x6
-+ ...
+
~X4 - ~x5
+ ...
14
App. 2
Answers to Odd-Numbered Problems
= c1F(~, ~, ~; x) + C2 -yt;:F(l, 1,~; x) 21. y = A(l - 4x + ~X2) + B-yt;:F( -~, ~, ~: x) 23. y = c1F(2, -2, -~; t - 2) + C2(t - 2)3/2F(~, 19. Y
-~,~; t - 2)
Problem Set 5.5, page 197 1. Use (7b) in Sec. 5.2. 3.0.77958 (exact 0.76520), 0.19674 (0.22389). -0.27651 (-0.26005). -0.39788 (-0.39715), -0.17038 (-0.17760), 0.15680 (0.15065), 0.30086 (0.30008),0.16833 (0.17165) 5. Y = C1I,,(Ax) + c2L,,(Ax:), v*" 0, ± I, .. . 7. Y = C1I,,(-yt;:) + C2LvC-yt;:), v O. ± L .. . 9. Y = C1xI1(2x), II, I_I linearly dependent 11. y = X-"lC1I,,(X) + C2I_,,(X)], 0, ±1, .. . 13. Y = C1I,,(x3) + C2L,,(x3), v =1= 0, ± 1. .. . 15. Y = c]-yt;:1 1(2-yt;:), I]. I_I linearly dependent 17. Y = Xl/4ft(~Xl/4), II' L1 linearly dependent
*'
v*'
19. y
=
.~/\C1I8/5(4xl/4)
+ c2I_s/5(4x l/4» = 0, (24a) with v = I, (24d) with
v = 2, respectively. 23. I n (X1) = I,,(x2) = 0 implies x 1- n 1,,(X1) = X2 -71 I,,(x2) = 0 and [x- n 1 n (x)]' = 0 somewhere between Xl and X2 by Rolle's theorem. Now use (24b) to get 1 n + l (x) = 0 there. Conversely, 1,,+ 1(X3) = 1,,+I(X4) = 0, thus X3n-lIn+I(X3) = X4n+11n+1(X4) = 0 implies 1n(x) = 0 in between by Rolle's theorem and (24a) with v = 11 + L 25. Integrate the formulas in (24). 27. Use (24a) with JJ = 1, partial integration, (24b) with v = 0, partial integration. 33. CAS Experiment. (b) Xo = I. Xl = 2.5, X2 = 20, approximately. It increases with (c) (14) is exact. (d) It oscillates. (e) Formula (24b) with v = 0
21. Use (24b) with v
Problem Set 5.6, page 202 1. )' = C]15(X) + C2 Y5(X) 3. Y = C1I O(-yt;:) + C2 Yo(-yt;:) 2 5. Y = C112(X2) + C2 Y2(X ) 7. y = X-\CI15(X) + C2Y5(X» 11. Set H(l) = kH c21, use (10). 9. Y = X3(c1I3(X3) + C2 Y3(X3» 13. Set x = is in (l), Sec. 5.5, to get the present ODE (12) in terms of s. Use (20), Sec. 5.5.
Problem Set 5.7, page 209 3. Set x = ct + k. 5 ..\ = cos e. dx = -sin e de, etc. 7. Am = (11lr./5)2, 111 = 1,2, ... ; Ym = sin tll1r.x/5) 9. Am = [(2m + 1)r.12L]2, m = 0, 1. ... : Ym(x) = sin [(2m + l)m/2L] 2 11. Am = 111 , m = 0, 1, ... ; Yo = 1, Ym = cos II1X, sin IIlX, III = 1,2, ... 13. k = k m from tan k = -k. Am = k m 2 , III = 1,2, ... ; Ym = sin kmx 15. Am = 1112, 111 = 1, 2, ... ; Ym = x sin (m In /x/) 17• j'J -- e Sx, q -- 0, r -- e Sx, Am , . IIIX, III = I , 2,... = III 2 ; Ym = e -4x SIn 19. Am = (1Ilr.)2, Ym = X cos lIlr.x, X sin 111r.X, m = 0, 1, ...
n.
App. 2
A15
Answers to Odd-Numbered Problems
Problem Set 5.8, page 216
3. ~P3(X) - ~P2(X)
1. 1.6P4 (x) - 0.6Po(x) 7. -0.4775P l (x) - 0.6908P3(x)
+ ~Pl(X) - ~Po(.x) + 0.1544Pg (x) + ... ,
+ 1.844P5 (x) - 0.8234P7 (x) = 9. Rounding seems to have considerable influence in Probs. 6-15. 9. 0.3799P2(x) + 1.673P4 (x) - 1.397P6 (x) + 0.3968Ps(x) + ... . 1110 = 8 11. 1.175Po(x) + 1.l04Pl (x) + 0.3575P2(x) + 0.0700P3(x) - .... 1Il0 = 3 or 4 1110
13. 0.7855Po(x) - 0.3550P2(x) + 0.0900P4 (x) - ... ,
= 4
1110
15. 0.1212Po(x) - 0.7955P2(x) + 0.9600P4 (x) - 0.3360P6 (x) 2 17. (c) am = (2IJ1 (aO;rn»(J1(aO. m )/ao,m) = 2/(ao. m ll(aO. m
»
+ .... 1Il0 =
8
Chapter 5 Review Questions and Problems, page 217 11. e 3x , e- 3x , or cosh 3x, sinh 3x 13. eX, 1 + x 15. e-X'-, xe- x2 17. e- x , e- x In x 2 2 19. I/(l - x ), x/(l - x ) or 1/(1 - x), 1/(1 + x) 21. Y = ("11V2(6x) + c21 _ v2(6x) 23. Y = Clit (x2)
+
25. y = ~[clll/4(~kx2) + ['21 _l/4(!kx 2)J 27. Am = (21117Tl, Yo = I, Ym = cos nl17TX, sin 2m7Tx,
I, 2, ...
111 =
2 C2 Yl (x )
o.
29. Y = c l l l (kx) + C2Yl(kx). C2 = O. y(l) = c l l l (k) = k = k m = al,m (the positive zeros of 1 1 ), Ym = l l (a1.",x) 31. 1.813Po(x) + 2.923P l (x) + 1.759P2(x) + 0.663P3(x) + 0.185P4 (x) + ... 33. 0.693Po(x) - 0.285P2(x) + 0.144P4 (x) - 0.09IP6 (x) + ... 35.0.25Po(x) + O.5P1 (x) + 0.3 125P2(x) - 0.0938P4 (x) + O.0508P6 (x) + ...
Problem Set 6.1, page 226 2
2
s s s cos () - w sin ()
7. ---;;2:------:2;:---
+
S
k
13. s
(1 -
s-2
s
1."""3-"""2
w
e- bs )
5.
15.
1
1
9.--s + 2b 1 - (1
2
(s - 2) -
11.-S2 + 4 1 - e- bs be- bs 17. ---;;:--- - - -
+ 2s)e- 2S
----;;2---
2s
s
S2
(I - e- s )2
19. - - - s
23. Set ct
= p.
Then :£(f(ct»
=
{'oC e-stf(ct) dt L=e-(s'C)Pf(p) dp/c =
= F(s/c)/c.
0
29. 4 cos
7Tt -
3 sin
7Tt
35.2 - 2e- 4t 39.
45.
1
Vs
sin
,,1st -
37. (ev'3t - e-V5t)/(V3 e- 5t
41.
(s -
3.8 2 2.4)
+ k) + b 2 + k) + 1
a(s
(s
51. 3e- 2t sin 5t
53. e- 5 "1Tt sinh 7Tt
+ Vs) 5w 43.
(s
2
+ a) +
w
2
A16
App. 2
Answers to Odd-Numbered Problems
Problem Set 6.2, page 232 1
1.
(s - k)
2
7. (S2 + ~17"2)2 9. Use shifting. Use cos 2 0:' = ~ + ~ cos 20:'; use cos 2 0:' + sin2 0:' = 1. Ans. (2S2 + 1) 1[2s(S2 + 1)] 11. (s + ~)Y = -1 + 17· 21(s2 + 4), y = 7e-t{2 + 2 sin 2r - 8 cos 2t 13. (S2 - ~)Y = 4s, y = 4 cosh ~t 15. (S2 + 2s + 2)Y = s - 3 + 2 . 1. Y = (s + 1 - 2)/[(s + 1)2 + 11, y = e-t(cos t - 2 sin t) 17. (S2 + 7s + 12)Y = 3.58 - 10 + 24.5 + 211(s - 3), Y = ~e3t + ~e-4t 19. (s + 1.5fY = s + 31.5 + 3 + 541s 4 + 64ls, Y = lI(s + 1.5) + lI(s + 1.5)2 y = (1 + t)e-1. 5t + 4t3 - 16t2
+ ~e-3t
241s4 - 321s 3 + 321s2, 32t 21. t = t + 2, l' = 4/(s - 6), ji = 4e 6t, y = 4e 6Ct - 2) 23. t = t + 1, (s - l)(s + 4)1' = 4s + 17 + 6/(s - 2), Y = 3et - 1 + e 2Ct - ll 25. (b) In the proof, integrate from 0 to 1I and then from II to 0: and see what happens. (c) Find 3:;(f) and 3:;(f') by integration and substitute them into (1 *).
27.2 - 2e-
t/2
29.
+ +
1 ""k.2
(e
kt
~ r;:: 31. cosh v5t - 1
t k
- 1) -
33. t sinh 2t - ~t
Problem Set 6.3, page 240 3. (l - e 2 -
2S
(~ S3
~ S2
5. 7.
+
s
S2
+
17"2
)
I(s - 1)
+
~) e- s - (~ S S3
+
~ S2 e (-S21 + -10) s
( -e -s - e -4s)
-20s (e- 3s S2 + 17"2
1
13. - - (e- 2s + 27T _ e- 4s + 47T ) s - 17" 15. 0 if t < 4, t - 4 if t > 4 17. sin t if 217" < r < 817", 0 elsewhere 19. 0 if t < 2, (t - 2)4/24 if t > 2 21. lI(t - 3) cosh (2r - 6) t 23. e- sin t 25. e- 2t cos 3t + 9 cos 2t + 8 sin 2t 27. sin 3t + sin r if 0 < t < 17" and ~ sin 3t if t > 17" 29. t - sin t if 0 < t < 1, cos (t - 1) + sin (t - 1) - sin t if t > I 31. et - sin t + u(t - 217")(sin t - ~ sin 2t) 33. t = 1 + t, 'f" + 4ji = 8(1 + t)2(1 - lI({ - 4», cos 2t + 2r2 - I if t < 5, cos 2r + 49 cos (2r - 10) + 10 sin (2t - 10) if t > S 35. Rq' + qlC = 0, Q = 3:;(q), q(O) = CVo, i = q' (r). R(sQ - CVo) + QIC = 0, q = CVoe- tlCRC ) 11.
37. IO[
100 + -[
+
e- 6s )
100
= - - e- 2s , [
s S2 1 - e- lOCt - 2 ) if t > 2
1= e- 2s ( S
s
]
+ 10
), i = 0 if t < 2 and
-lOs
App. 2
Answers to Odd-Numbered Problems
=
39. i
e- 20t
+
+ lI(t - 2)[ - 20t + 1 + 3ge-20Ct-2J] 5t 490e- [l - lI(t - 1)], i = 20(e- 5t - e- 250t )
20t - 1
41. O.Ii' + 25i = + e-250t+245]
43. i
(10 sin lOt
=
45. i'
+
+
2i
2
A17
f
+
20£1(1 - 1)[-e- 5t
+
100 sin t)(lI(t - 7T) - u(t - 37T))
t
i(T) dT
= 1 - 1I(t - 2), I =
(1 - e- 2S )/(s2
+
+
2s
2),
°
i = e- t sin t - lI(t - 2) e-t+2 sin (t - 2)
= 27 cos t + 6 sin t + lI(t - 27T) [- 27 cos t
47. i
e- t (27 cos 3t + 11 sin 3t) - 6 sin t + e-(t-2TI"J(27 cos 3t
+
11 sin 3t)1
Problem Set 6.4, page 247 1. Y = 10 cos t if 0 < t < 27T and 10 cos t + sin t if t > 27T 3. Y = 5.5e t + 4.5e- t + 5(e t - 1I2 - e- t +1/2 )u(t - ~) - 50(e t - 1 - e-t+l)lI(t - 1) 5. Y = O.lfe t + e- 2 t(-cos t + 7 sin t)] + O.llI(t - lO)[-e t + e- 2t+ 3 0(cos (t - LO) - 7 sin (t - 10))] 7. Y = 1 + le- t sin 3t + lI(t - 4)[-1 + e-t+ 4 (cos (3t - 12) + l sin (3t - 12))] l~ lI(t - 5)e -t+5 sin (3t - 15) 9. Y = 5t - 2 - 501l(t - 7T)e- t +7r sin 2t. Straight line. sharply deformed between 7T and about 8 11. Y = (O.4t + 1.52)et
+
0.48e- 4t
+
1.611(t -
2)[ -et + e- 4t +1O ]
Problem Set 6.5, page 253 1. t
1 5. - sin wt
I kt 7. 2k (e
w
9. ~(e3t - e- 5t ) ll. 13. t - sin t 15. 19. Y = 3/((S2 + 4)(S2 + 9», y = 0.3 sin 2t 21. (S2 + 9)Y = 4 + 8(1 + e-=)/(s2 + 1), y = 23. 0 if 0
<
t
<
1,
~
f
t
sin (2( T
-
1)) dT =
~(~ -
e-
-
kt
)
! cos 2t)
=
L
k
= ~
sinh kl
sin 2 t
t(cosh 3t - 1) 0.2 sin 3t sin t + sin 3t if t < 7T, ~ sin 3t if t
-~ cos (2t
- 2)
+ ~ if t >
>
7T
1
1
25. Y = 2e- 2t - e- 4t
+
(e- 2t + 2 - e- 4t + 4)u(t t
27. Y - 1 * y = 1, y = e 31. Y(1 + l/s2) = lis, y = cos t
L)
+
(e- 2t + 4 - e- 4t +S )1I(t - 2)
29. y - y * sin t = cos t, Y = lis, y = 1 33. Y(1 + 2/(s - 1) = (s - 1)-2, Y = sinh t
Problem Set 6.6, page 257 4
1.
2
(s - 1)
+4 4s + 5)2 2w(3s 2 - w 2) 9. (S2 + W2)3 2s
5. (S2
+
3. (S2
2ws + w2)2
24s2 + 128 7. (S2 - 16y3 2s cos k
11.
+
(s
2
(S2 - I) sin k
+
2
1)
A18
App. 2
Answers to Odd-Numbered Problems
15. te- 2t sin t 19. In s - In (5 - 1); (e t
13.6te- t 2 kt
17. t e
-
l)/t
Problem Set 6.7, page 262
1. .'"1 = -e- t sin t, )"2 = e- t cos t 3 ..'·1 = 2e- 4t - 4e 2t , )"2 = e- 4t - 8e 2t 5'.'"1 = 2e- t + 4e- 2t +!t -~, )"2 = -3e- t - 4e- 2t - ! t + ~ 7. )"1 = e- t (2 cos 2t + 6 sin 2t) + t 2• .'"2 = lOe- t sin '2t - t 2 9• .'"1 = 4 cos 5t + 6 sin 5t - 2 cos t - 25 sin t. .'"2 = 2 cos 5t - 10 sin 5t + 20 sin t 11'.'"1 = -cos t + sin t + I + 1I(t - 1)1-1 + cos (t - I) - sin (t - 1)1 )"2 = cos t + sin t -I + 1I(t - l)ll - cos (t - 1) - sin (t - 1)] 13. Y1 = 2u(t - 2)(e 4t - e t + 6 ), Y2 = e 2t + ll(t - 2)(e 4t - 3e2t + 4 + 2e t + 6) 15. Yl = _e- 2t + et + ~llU - I)(-e- 2t + 3 + et ), Y2 = _e- 2t + 4e t + ~lllt - l)l-e- 2t + 3 + et ) 17')"1 = 3 sin 2t + 8e- 3t , )'2 = -3 sin 21 + 5e- 3t 19')'1 = e t - e- t , Y2 = et , Y3 = e- t 25.4i 1 + 8(il - i2) + 2i~ = 390 cos t, 8i2 + 8(i2 - i 1 ) + 4i~ = 0, il = - 26e- 2t - 16e- 8t + 42 cos t + 15 sin t, i2 = -26e- 2t + 8e- 8t + 18 cos t + 12 sin t Chapter 6 Review Questions and Problems, page 267 11.
17.
13.
(s - 3)2 S (s - 1)(5 2
23. 10 cos
+ 4)
tV2
29. te- 2t sin t
19.
2 S(S2
+ 4)
15. e-m ( 7T 5
25. 3e- 2t sin 4t 31. (t2 - l)u(t - 1)
21.
s12 )
a-b
2S2 -4-5 - I
+
(5 - (1)(5 -
b)
27. lI(t - 2)(5 + 4(t - 2» 7T 33. ---:3 (wt - sin wt) w
35.20 sin t + li(1 - 1)[1 - cos (t - I)] 37. 10 cos 21 - ~ sin 2t + 4u(t - 5) sin (2t - 10) 39. e- t (7 cos 3t + 2 sin 3t) t 41. e- + u(t - 7T)[1.2 cos t - 3.6 sin t + 2e-t+7T - 0.8e2t - 2"] 43. u(t - l)(t - l)e 2t - 2 + 4u(t - 2)(2 - t)e 2t - 4 45. Y1 = e t + ~e-t - ~ cos t - ~ sin t, )'2 = -et + ~e-t + ~ cos t + ~ sin t 47. Y1 = ~e-t sin 2t, )"2 = e-t(cos 2t - ~ sin 2t) 49. )'1 = e 2t , )'2 = e 2t + et 51. I = (l - e- 2S )/[s(s + 10)], i = 0.1 (I - e- lOt ) + O.lu(t - 2)[-1 + e- lOt + 20 ] 53. I = e- 2t(76 cos 4t - 42 sin 4t) - 76 cos 20t + 16 sin 201 55. i~ + IOUI - i2) = 100 t 2• 30i~ + 1O(i~ - i~) + 100i2 = 0, ;1 = (~ + 4t)e- 5t + 10t2 -~, i2 = (~ + 2t)e- 5t + 2t - ~ Problem Set 7.1, page 277
App. 2
A19
Answers to Odd-Numbered Problems
-:l[-: -1]
[-0
3. Undef.,
~
6
o
9
5.
[-48 -2] [ 38
-44
67
-15
.
-33 9.
24
72
60
smne,
. same.
-5.0
-4.9
1.8
-3.6
3.5
-48
-34] 3.8
0.4
[_~::] -2.2
2X2 =
4
=
0
4X2
24
[-03
= -3
-5X2
-3Xl
48]
0 3(,
-4.2
+ +
-5Xl
-9
-12
7. [ 6~], [ :.2]'
, undef.
Problem Set 7.2, page 286
64
[ 230
-92
-92
38
-12
6
5. [20
-3
-114
-114
126
-1:]
-7], [-62
34
2],
[50S] 525 ,same 790
5
7.
r :
0
~] [-310 8 ,31, -62
0
16
0
170 34
-124
337
8
9. [ 252
49
-100]
-308
52
233
-68
, same,
10]; ,same
68 257
68
232
97
-248
-16
[
-188J -96 265
-72] ,
110
A20
App. 2
11. [
13. 19. 21. 23. 27.
Answers to Odd-Numbered Problems
324
32
2M
38
-244
-10
216
-104
280
-132
-280
140
-320] [ -322. 366
4324
1520
[ 3636
-4816]
1242
-451!S
-3700
-1046
5002
7060
960
-5120]
7548
1246
-5434
-8140
-1090
6150
1M] [
-68, 76
83, 166, 593, 0 (d) AB = (AB)T = BTAT = BA; etc. (e) AilS. If AB = -BA. Triangular are U 1 + U 2 , U 1 U2 , U 12 , L1 + L 2 , L 1 L 2 , L12. [0.8 1.2]T, [0.76 1.24]T, [0.752 1.248]T P = [110 45 801 T, v = [92000 863001 T
Problem Set 7.3, page 295
= 2.5, Y = -4.2 x = 0, y = - 2, z = 9
1. x
3. x = 0.2, y = 1.6 7. x = 4, Y = 0, z = -2 x = 3y + 2, y arb.,.: = -y + 6 11. Y = 2x + 3.: + I, x, .: arb. w = I, Y = 2.: - x, x, .: arb. 15. w = 3, x = 0, y = -2, ;;: = 8 h = (R 1 + R2)Eo/(R 1R 2), 12 = Eo/R1' 13 = Eo/R2 [Amps] 11 - 12 - 13 = O. (3 + 2 + 5)1t + 10/2 = 95 + 35, 10/2 - 5/3 = 35, 11 = 8, 12 = 5. 13 = 3 Amps 21. Xl + X4 = 500, Xl + X2 = 800. X2 + X3 = 1100, X3 + X4 = 800, Xl = 500 - X4, X2 = 300 + X4, X3 = 800 - X4, X4 arbitrary 5. 9. 13. 17. 19.
Problem Set 7.4, page 301
1. L [I 3.3, [I [0 0 5. 2, [3 7.2, r8 9. 3, II 11. 4, [I 13. No 19. Yes 29. No 35. I, [5
-2]; [1
4
0
-3]T 7], [0 -2 0
I
3], rO
0
5
105]; r -2
-I-
5]T, [0
I
5]T,
I]T
0 0 0 0
5], 4], 3 0
[0 3 [0 2 0], [0 0], [0
41; [3 0 5]T, [0 3 4]T OJ; [8 0 4 O]T, [0 2 0 41T 5 8 -371, [0 0 -74 296]; same transposed o 0], [0 0 1 0], [0 0 0 I]; same transposed 15. No 17. Yes 21. (c) I 27.2, [I -I OJ, lO 0 1] 31. I, [-i l 1] 33. No
Problem Set 7.7, page 314
5. 107 9. -66.88 13. 113 + v 3 + w 3 - 311VW 19. x = -1.2, Y = 0.8, .: = 3.1 23.3
7. cos (a + f3) 11.0 15.4 21. 1
App. 2
Answers to Odd-Numbered Problems
A2l
Problem Set 7.8, page 322 1. [
1.80
3.
- 2.32J
-0.25
0.60
[COS '28
-sin 28J
sin 28
9. A-I
15. (A'j-L (A-'j' ~ [I: -~
cos 28
=A
11. No
inver~e
=]
19. AA -1 = I, (AA -1)-1 = (A -l)-lA -1 = I. Multiply by A from the right. 21. det A = - I. C 12 = C21 = C33 = - I, the other Cjk are zero. 23. uet A = 1. Cl l = I, C 12 = -2, C22 = 1, C 13 = 3. C23 = -4, C33 = I
Problem Set 7.9, page 329 1. Yes, 2, [3 5. Yes, 2, [0
5 0
-51 T 1 OIT, [0 0
OIT, [2
0
11. Yes, 2, xe-:r , e- x 13. [I O]T, [0 I]T; [I I]T, [-1 15. Xl = -0.6Yl + 0A.Y2 X2 = -0.8y! + 0.2.'"2 19. Xl = 5-"1 + 3.'"2 - 3)'3 X2 =
o
0
[_~ ~J
7. Yes, 1,
X3
3. No
0
3.'"1
+
= 2.'"1 -
2.1'2 -
)"2
+
l]T
9. No
I]T; [I
O]T, [0
17.
Xl =
x2
=
-l]T
2.1"1 5\" . 1
+ Y2 + 3\".2
2)"3 2.'"3
21. Vs6 25. 2
23. 16Vs 29. 4V1 -
3V2
= 0,
V
= ±
[53
15]T
Chapter 7 Review Questions and Problems, page 330 11. X = 4, Y = 7 15. x = ~. Y = -~. z = ~ 19. x = 22. Y = 4, 2 arbitrary
23.638.0,0
13. x = v + 6, z = y, )" arbitrary 17.x=7.)"= -3
21. 0
25.
8.0
-3.6
1.2
-3.6
2.6
2.4
1.2
2.4
9.0
[ 12
27. 14, 14.
2; [
o o o
r
J
20
29. [-20
9
-3],
_:
]
A22
App. 2
Answers to Odd-Numbered Problems
31.2,2
33.2.2
35.2,2
37.
39.
4~ [~~ 1~ 23
_ 5] 42
41.
-10
5\ [
4 -5
,~ r
:]
72
-72
31
-3:2
-19
20
132] 59 -35
43. It = 33 A, 12 = 11 A, 13 = 22 A 45. II = 12 A, 12 = 18 A, 13 = 6 A Problem Set 8.1, page 338
1. -2, [1 O]T; 0.4, [0 I]T 3.4,2\"1 + (-4 - 4)X2 = 0, say, Xl = 4. X2 = 1; -4, [0 I]T 5. -4, [2 9]T; 3, [1 I]T 7.0.8 + 0.6i, [1 _i]T; 0.8 - 0.6i, [l i]T 9.5, [1 2]T; 0, [-2 I]T 11.4, [I 0 O]T; 0, [0 1 OlT; -I, [0 0 I]T 13. -(A3 - I8A2 + 99A - I62)/(A - 3) = -(A2 - I5A + 54); 3. [2 -2 I]T; 6, [l 2 2]T; 9, [2 I _2]T 15. I, [-3 2 lO]T; 4, [0 1 2]T; 2, [0 0 I]T 17. -(A3 - 7A2 - 5A + 75)/(A + 3) = -(A2 - lOA + 25); -3, [I 2 _I]T; 5, [3 0 l]T, [-2 I O]T 19. -(A - 9)3; 9, [2 -2 I]T; defect 2 21. MA 3 - 8A2 - 16A + I28)/(A - 4) = A(A 2 - 4A - 32); 4. [-1 3 1 l]T; -4,[1 1 -I -l]T;O.[I I L 1]T;8, [1 -3 1 -3]T 23.2, [8 8 -16 l]T; 1, [0 7 0 4]T; 3, [0 0 9 2]T, -6, [0 0 0 l]T 25. (A + IfcA2 + 2A - 15); -I, [I 0 0 O]T, [0 1 0 O]T; -5, [- 3 -3 1 l]T, 3, [3 -3 I -I]T 29. Use that real entries imply real coefficients of the characteristic polynomial. Problem Set 8.2, page 343
1. [ -
~
~] ; -1, [~] ; I, [~] ; any point (x, 0) on the x-axis is mapped onto
(-x, 0), ~u that [l
3.
(x, y)
O]T is an eigenvector corresponding to A = -1.
maps onto (x, 0).
[~
~] ; 1, [~] ; 0, [~] . A point on the x-axis maps
onto itself, a point on the y-axis maps onto the OI;gin. 5. (x. y) maps onto (5x, 5)'). 2 X 2 diagonal matrix with entries 5. 7. -2, [I -l]T, -45°; 8, [1 1]T,45° 9.2, [3 -l]T, -18.4°; 7, [I 3]T.71.6° 11. I. [-l/V6 1],112.2°; 8, [1 1IV6]. 22.2° 13. 1. [l I]T, 45°; -5, [I _l]T, -45° 15. c[l5 24 50l T , c > 0 17. x = (I - A)-l y = [0.73 0.59 1.04]T (rounded) 19. [I I 1JT 21. 1.8 23. 2.1
App. 2
A23
Answers to Odd-Numbered Problems
Problem Set 8.3, page 348
5. A-I = (_AT)-1 = _(A- 1 )T
3. No
7. No since det A = det (AT) = det (-A) = (-1)3 det A = -det A = O. 11. Neither, 2, 2, defect 1 9. Orthogonal, 0.96 ± 0.28i Symmetric, 18, 15.0rthugunal, 1, i, -i Symmetric, 1I + 2b, a - b, 1I - b
13.
9, 1R
17.
Problem Set 8.4, page 355
2]T. [2
1.[1
- I]T, [1
3. [1
[~
-1]T:X=
2J -1
D=
[7
0
:J
I]T, D = [: :J
5. [2 7. [1 9. [0
_l]T, [2 1]T, diag (-2, 4) 0 O]T, [1 -2 1]T, [0 1 O]T, diag (I, 2, 3) 3 2]T, [5 3 O]T, [1 0 2]T, diag (45. 9, -27)
13. [~: -~:J: -5. [~J :2. [~J : = [-~J . [-~J -72J. [-3J x
-30
15. [
4~
17'[-:: ~: 66
19. C = [
1
12 3 21. C =
[ -4
100
=
27. C
=
[
-3
12J ' IOY1 2 -6 -4J
, 5Y1 2
[ 16
~]; -2
15yl = 5, x = [0.8 0.6
-
5.\"22
-
= 0,
X
=
- 3
+ 5"_22 = 10,
\' 2 '.1
1 -6J , 7y . 12 -6 1 12
-4
6
1:] 1~]
23.C~ [~ V3] 2 25. C
. [-3J
3
-1::];4,[_:]:_2'[_:]:1'[
-12
[=:]
F
[9J. .2 . . O. [12J.. x = -4 -5
32
5y .22
-
16J ' 28Y1 2 12
-
x
0.6J y, hyperbola -0.8
[2tV5 -ltv's
[
=
= 35 , x = [
112 - v 3/2 ~r:;
1IV2 -l/V2
4yl = 112, x =
ltv's] 2tv's
y, straight lines
V312] y, ellipse 112
lIV2J lIV2
N2J
[lIV2 1 ItV2 - lIV2
y, hyperbola
y, hyperbola
Problem Set 8.5, page 361 3. (ABC)T = CTBTj\T = C- 1(-B)A 5. Hermitian, 3 + v'2. [-i I - v'2]T; 3 - 0. [-; I + 7. Hermitian. unitary, 1, [1 ; - iV2]T; -1, [1 i + ;"\,'2]T 9. Skew-Hermitian. 5;. [I 0 O]T, [0 I I]T; -5i, [0 I -1]T 11. Skew-Hermitian, unitary, ;, [I 0 IJ T. [0 1 01 T: -i. [l 0 _lIT 13. Skew-Hennitian. -66i 15. Hermitian, 10
v2T
Chapter 8 Review Questions and Problems, page 362
[ 3 2J [3 2J = [5 OJ 11. [-1/3 2/3J A [1 9.
-4
_3
2/3
oJ!lo ~
-9
A
-1/3
[
4.8
19. I.lY1 2 +
- 3
0
=
A
-9
-7
~J [~ -~J
2
;l rl-~ -:l
-1.0 15.
-4
-2
2
4
=
- I
=
0
0
17r~
OJ,s. -I
5.0
yl
rlo~ -: ~l
I. ellipse
Problem Set 9.1, page 370 1. 5. 9. 13. 17. 23. 27. 31.
2, -4,0; V20; [ltv's, -2tv's, 0] -8. -6.0; 10; [-0.8, -0.6,0] ~, £); V37/8 [4, -2,0], [-2, 1,0], [-I,~, 0] [2!S, -14, -14] (5.5, 5.5, 0), (t, ~, IS'» l-8, -2,4]; V84 [-9. 0, 0]. [0, -2, 0]. [0, 0, -11]. Yes.
ci,
35. [
~ , ~]
- [-
~ , ~]
3. 7. 11. 15. 19. 25. 29. 33.
= [
-1,0,5; V26: [-I/V26, 0, 5/V26] (7. 5, 0); ViO (0, 1, ~); V37/2 [10, -5, -15J [-2, 1,8], [6, -3, -24] [0,0,91; 9 v = [0. O. -9] Ip + q + ul ~ 6. Nothing
~ , ~]
37. Iwl/(2 sin a)
Problem Set 9.2, page 376 1. 4 3. -v24T 5. [12. -8.4], [-18. -9. -36] 9. -4.4 11. -24 17·la
19.0
2
7. 17 IS. Use (I) and Icos
+ bl + la - bl 2 = aoa + 2aob + bob + (aoa - 2a°b + bob) = 21. 15
yl ~ 1. 2/a/2 + 2/b/ 2
23. Orthogonality. Yes
27. 79.11 ° 25.2,2,0, -2 33.54.79°,79.11°,46.10° 31. 54.74° 39. 1.4 37.3 41. If lal = Ibl or if a and bare OIthogonal
29. 82.45° 35.63.43°.116.57°
Problem Set 9.3, page 383
1. [0,0, -I OJ, [0, o. 10J 3. [-4, -8, 26J 5. [0. 0, - 60J 7. -20, -20 9.240 11. [19, -21. 24], V1378 13. [10, -5, -I] 15.2 17.30, -30 19. -20. -20 25. [-2. 2. 0] x [4.4. OJ = -16k. 16 27. [I, -1.2] x [1,2,3] = [-7, -1, 3J, v'59 29. [0, 10, 0] x [4, 3, 0] = [0, 0, -40], speed 40 31.1[7. O. OJ x [I. 1.0]1 = 7 33. ~V3 35. [18,14,26]; 9x + 7.v + 13z = c, 9·4 + 7·8 + 13·0 = 92 = c 37. 16 39. c = 2.5 Problem Set 9.4, page 389
5. Circles 1. Hyperholas 3. Hyperbolas 7. Ellipses; 288, 100, 409; elliptic ring between the ellipses and 9. Ellipsoids
11. Cones
23. [8x. 0, .\''::], [0,0, x.::], LO, 18;:, xy]; [0, ;:, yl, [z, O. x],
13. Planes
b', x,
0]
Problem Set 9.5, page 398
1. 5. 9. 13. 17. 23. 25.
[4 + 3 cos t, 6 + 3 sin t] 3. L2 - t. 0, 4 + t] [3, -2 + 3 cos t. 3 sin f] 7. [a + 3t, b - 21, c + 5t1 11. Helix on (x - 2)2 + (y - 6)2 = r2 [v'2 cos t, sin t. sin t] Circle (x - 2)2 + (y + 2)2 = 1, z = 5 IS. X4 + .l = 1 Hyperbola xy = 1 r' = [-5 sin 1. 5 cos t. 0], U = [-sin t. cos t. 0]. q = [4 - 3w. 3 + 4w. 0] r' = [sinh t. cosh f], U = (cosh 21)-112 [sinh t, cosh t]. q = Li + 4H'. ~ + 5wJ
v;:r:;r
= cosh t. l = sinh I = 1.175 Stan from ret) = [t, f(t)]. v = r' = ll. 2t. O].lvl = VI + 4t 2 • a = [0. 2. 0] v(O) = 2wRi, a(O) = -w2 Rj I year = 365' 86400 sec, R = 30' 365' 86400121T = 151· 106 [km]. lal = w 2R = Iv/ 2 JR = 5.98· 10-6 [kmlsec2 ] 39. R = 3960 + 80 mi = 2.133' 107 ft. g = lal = w 2 R = IvI 2 /R, Ivl = = V6.61· 108 = 25700 [ft/sec] = 17500 [mph] 43. ret) = [t, yet), 0], r' = [1, y', 0], r' • r' = I + y'2, r" = [0, y", 0], etc. 47. 3/(1 + 9t2 + 9t4 ) 27. 29. 33. 35. 37.
ViR
Problem Set 9.6, page 403 1. w' 3. w'
= 2V2(sinh 4t)/(cosh 4t)1I2 = (cosh
t)sinh
t-\(cosh2 t) In (cosh t) + sinh 2 t) 7. e 4u sin 2 2v, ~e4u sin 4v
5. w' = 3(2t4 + t 8 f(8t 3 + 8t7 ) 9. -2(u 2 + V 2)-3U, -2(u 2 + V 2 )-3V
Problem Set 9.7, page 409 1. [2x, 2y1 7. [6. 4. 4] 13. [-4, 2] 19. [6. 4] 23. [-0.0015. O. 29. [8, 6. 0] 35.7/3
3. [1Iy, - X/y2] 9. [-1.25. 01 15. [-18, 24] 21. [-6, -12] -0.0020] 31. [108, 108, 108] 37.2e 2 /V13 41. X4
+ y3
5. Lv + 2. x - 2] 11. [0. -e] 17. [48, -36]
27. [a, h. 33.V2t3
c]
3~2
-
Problem Set 9.8, page 413 1. 3(x
+ y)2
3. 2(x
+ x~ +
5. (y
z)
+x +
1) cos xy
7.9x2y2.::2 9. [Vt, V2, V31 = r' = [x', y', ~'] = b', o. 0].::.' = 0, z = C3, y' = 0, y = C2, x' = Y = C2, X = C2t + Cl' Hence as t increases from 0 to 1, this "shear flow" transforms the cube into a parallelepiped of volume l. 11. div (w x r) = 0 because Vb V2, V3 do not depend on x, y, z, respectively.
13. (b) (fVl)x + (fV2)y + (fv 3 )z = ![lVI)x (c) Use (b) with v = Vg. 15. 4(x + )')/(y - x)3 17.0
+
(V2)y
+ (v 3 )z] +
fxv]
+
fyV2
19. e XY\v 2z2
+ !zV3, etc.
+ X 2.: 2 + x2y2)
Problem Set 9.9, page 416 1. [0. 0, 4x - I] 3. [0, O. 2e x sin y1 5. [0, 0, -4y/(x 2 + )'2)] 9. curl v = [- 2~. 0, 01, incompressible, v = r' = [x'. y'. z'] = [0. .:2, 0]. x = CI, Z = C3, y' = Z2 = C32, Y = c32t + C2 11. curl v = [0, 0, -2], incompressible. x' = y, y' = -x, z' = 0, Z = C3, )' dy + x dx = 0, x 2 + )'2 = C 13. Irrotational, div v = 1, compressible, r = ['let, C2e-t. C3etJ 17.0,0, [xy - .:x, yz - xy, ,:x' - yz] 19. 0, 0, 0, - 2.\'Z2 - 2.:x 2 - 2.\"y2
Chapter 9 Review Questions and Problems, page 416 11. [-1, 9, 24] 15. [0, 0, -740], [0, 0, 19. -495, -495 23. If u x v = 0. Always 27.3.4
-740]
13. 0, [-43, 54, 3], [43, -54, 17. [-24, 3, -398], [114, 95, 21.90°, 95.4° 25. [VI, V2, -3] 29. If 'Y > ~7T, ~7T
-3] -76]
33.45/6 37. 0, 2y2 41. 0, 2X2
+ (z + xl + 4y2 + 2z 2 +
35. No 39. l-l, 1, - l], [-2z, 43.4881"\13323 45.0
4xz
-2x, -2y]
Problem Set 10.1, page 425 1. 5. 7. 9. 11. 15.
F(r(t» F(r(t» F(r(t» F(r(t» F(r(t» 17/3
= 1125t6 , t 3 , 0], 1644817 = 2350 3.0 + 160 = [cosh t sinh 2 t. cosh 2 t sinh tl, 93.09 cos t, sin t], 67T lcosh ~t, sinh !t, ett8 ], 0.6857 = let, et2 , e t2 ], e 2 + 2e 4 - 3 17. [367T.
= It,
=
~(87Tl,
367T]
Problem Set 10.2, page 432 3• _12 e -cx2+y2) , 0 11. sinh ae
1. sin xy, 1 7. x 2 )' + cosh z, 392 15. cea - aeb
5. e Xz 13. No 19. No
17.~a2bc2
+ y, -2
Problem Set 10.3, page 438
f Ix 7. f ~(e3x I + i) 1
x3
3.
-
(x 2 - x 5 )] dx
o
4
- e- X ) dx
5. ~ cosh 6 - cosh 3
= l2
= !e 12 + !e-4 -
o
1
(2x 2
11.
dx
-1
15.
=
i
19. Ix
=
(a
~
+ b)h 124, Iy =
~
(e Sin y cos y - cos y) dy
9.
0 1
0
x = ib, Y = ~h 3
f 13. f f
+~
x
0
= e- 2
~ d)' dx = ~
17. Ix = bh 3 /l2. Iy = b 3 h/4 h(a
4
-
4
b )/(48(a - b»
Problem Set 10.4, page 444 1. 2x 3 ), - 2.\}"3, 81 - 36 = 45 5. e'C- Y - e X + Y, -~e3 + !e 2 + e- 1 - ! 9. 0 (why?) 13. Y from 0 to ~x. x from 0 to 2. Ans. cosh 15. Y from 1 to 5 - x 2 • Ans. 56
3. 3x 2 + 3y2, 18757T/2 = 2945 7. 2x - 2v, -56/15 11. Integrand 4. Ans. 407T 2 - ! sinh 2 19. 4e 4 - 4
Problem Set 10.5, page 448 1. Straight lines, k 3. x 2/a 2 + y2/b 2 = 1, ellipses. straight lines, [-b CllS v, a sin v, 0] 5. z = (cla)Y + y2, circles. straight lines, [-aeu cos v, -aeu sm v, a 2u] 7. x 2/9 + ."2116 = z, ellipses, parabolas. [-811 2 cos V, -6112 sin v, 12u] 2 9. x /4 + )'2/9 + z2/]6 = I, ellipses, [12 cos 2 v cos u, 8 cos 2 v sin u, 6 cos v sin v] 13. [lOu, IOv, 1.6 - 4u + 2v], [40, -20, LOO] 15. [-2 + cos v cos u, cos v sin £I, 2 + sin v], 2 [cos V cos u, cos 2 v sin u, cus v sin v]
r
17. [Lt, v, 3v 2 ], [0, -6v, II 19. [u cos V, 311 sin v, 3111, [-911 cos v,
-3u sin v. 311] 21. Because r1< and rl" are tangent to the coordinate curves v = respectively. 23. [iT. v, li 2 + v 2 ]. N = [-2u, -2v, 1]
COllsT
and
1I
=
COIlST,
Problem Set 10.6, page 456 1.-64 7.27r 15. 140V6/3
3. -18 9. ~a3 17. 1287TV213 = 189.6 19. 6 7T 2 (373/2 - 53/2 ) = 22.00 27. 7Th4/V2 29. 7Th + 27Th 3 /3
i
5. -1287T 11. 17h14
25. 27Th
Problem Set 10.7, page 463 1. 8a 3b 3c 3127 7. 2347T 13.7Th5 /lO 21. 0
3.6 9.2a 5 /3 17. 1087T 23.8
5. 42~7T 11. ha 4 rr12 19. 2167T 25. 3847T
Problem Set 10.8, page 468 1. 3. 5. 7.
Integrab ..j.. I . 1 (x = 1). ..j.· 1 . 1 (y = 1), -8' 1 . I (z = 1).0 (x = v = 2 (volume integral of 6y2), 2 (surface integral over x = 1). Others 0 Volume integral of 6)'2 - 6x2 is O. 2 (x = 1), -2 (v = 1), others O. F = [x. y. z], div F = 3. In (2). Sec. 10.7. Fon = IFllnl cos cp
=
z
= 0)
Vx 2 + y2 + Z2 cos cp = r cos cp.
9. F = [x,
0,
0], div F
= 1, use (2*), Sec. 10.7, etc.
Problem Set 10.9, page 473 1. 3. 5. 7. 11. 15. 19.
[0,
16]0[0, -1, 1], :::':12 -ex. eY ] [-1. -1. 1], ±(e2 - 1) S: [u, v, v 2 1, (curl F)oN = _4ve 2v2 , ±(4 - 4e 2 ) (curl F)on = 312, :::':3a 2 /2 9. The sides contribute a, 3a 212, -a, O. curl F = [0, o. 6],247T 13. (curl F)on = 2x - 2y, 1/3 -7T/4 17. (curl F)oN = 7T(COS TTX + sin -n:v), 2 For = [-sin e, cos eJo[ -sin e, cos e) = I. 27T, 0 8z,
r-ez ,
0
Chapter 10 Review Questions and Problems, page 473 11. Exact. -542/3 17. By Stokes, ± 18rr 23. 0, 4a137T 29. By Gauss, 1007T 35. Direct, 5(e 2 - 1)
13. Not exact, e 4 - 7 19. By Stokes, :::': 127T 25. 817. 118/49 31. By Gauss, 40abc
15. By Green, 1152rr 21. 4/5, 8/15 27. Direct. 5 33. Direct, rrh
Problem Set 11.1, page 485
3. 13.
15
k, k,
2mn, 27T11l.
±+ ! (COS + x -
7T
'2
9
7T
I
sin
- sin 2x 2
y -
19. - -4 ( cos x 7T
2 (sin x
I 21. "3
7T
I 23. 6
7T 2 -
t'
+ ..!..
2 (cos x
4
+
cos 5x -
9
7T
7T
+
+
+
cos 3x
+ ... )
+ 4 (cos x + ..!.. cos 3x + _1_ cos 5x + ... )
17. -
29.
kin
kill,
2
+
4
+ - sin 3x - + ... 3
+
+
sin 3x
+ -I
cos 5x
25
+
+ ... )
+ ... )
sin 5x
"4I cos LX + 9"1 cos 3x - + ... )
I cos x - - cos 2x 2
-
7T
=
25
I
cos 3x
9
+ _1_ cos 5x + ... )
cos 3x
+ -1
4 ( cos x -
-
25
2x, f" = 2,jl =
4
1
277T
8
+ - - cos 3x + - cos 4x - ...
O,j~ = -47T,j~ = 0, an =
_1_ (n7T
..!..)
cos
(-47T)
117T,
etc.
11
Problem Set 11.2, page 490 1. :
3.
..!.. 3
~x
(sin
~ 2 7T
+
(cos
i
sin
3~X ..!..
TTX -
cos
4
+
2+4 (cos 9. "3 7T2 3
11. 4
4 7T
(
2
( co:>
TTX
2
+ ..!.. cos 9
27TX
+ -1 9
3
1
+ -
2
cus 2r
15. Translate by!.
+
37TX -
+ ... )
+ -I- cos 47TX + -1- cos 3·5
cos
1 2
+ - cos
TTX
37TX
5·7
+ -1
25
cos
1 9
1
-8 cos 4x
17. Setx
37TX 2
57TX
= O.
67TX
+ ... )
+ ... ) 47Tt"
+ - ... )
I 57TX cos - 25 2
+ - cos - - + -
+ .. -) 13. 8
+ ... )
I cos "41 cos 27TX + 9"1 cos 37TX - 16
TTX -
cos -
TTX
5~X
sin
27TX
5. Rectifier, -2-4 - (1 - - cos 7T 7T 1·3 7. Rectifier, -1 - 24 2 7T
+
1 cos 37lX 18
+ -
Problem Set 11.3, page 496
1. 3. 5. 7. 9.
Even, odd, neither, even, neither, odd Odd Neither Odd Odd
. 2
13.
~7r
+
11 7r
~7r
~9 cos 3x +
+
(cos x
~9 sin 3x +
(Sin x -
4 ( 71X 15. 1 - --:;; sin ""2
~
+ ... )
_1_ sin 5x 25
37rx "31 sin -2-
+
+ ... )
_1_ cos 5x
+
51
+ ...
+
8 ( 7T.X 19. (a) 1 + 7r2 cos ""2
+ 9 cos -2- + 25 cos -2- + ...
4 (
21. (a)
n"( sin 2
"23 -
23. (a)
"2L -
2
7rX
L
2
371X
1
1
9
7r
+
9
~
)
4
I
57rx
1
37rx
cos
3m:-
L
1
+ 25 cos
77rx - --:;1 cos -2-
+ - ...
I
57rx sin -2- -
"9 sin 371X
571X
)
L
+ ...
- "21 sin L27rx + "31 sin L37rx - + ... )
+ -4 ( cos x + -I cos
(b) 2 ( sin x
3
1
+
)
571X
- "3 cos -2- + 5 cos 2
4L ( 7rX 7r2 cos
+ ...
37rx I ) + -1 sin - + - sin 27rx + ...
1
7H
1
37rx
""2 - "3 sin 71X + "3 sin -2- + 5
2L ( 71X (b) --;; sin L
25. (a) -7r 2
+ -1 sin
7rX 2 ( 7r cos ""2
6 ( (b) 7r sin
1
+
5nr 51 sin -2-
)
4 ( 71X 17. (a) 1. (b) 7r sin ""2
(b) 7r
37rx "31 sin -2-
57T.X sin -2-
sin 2x
+
3x
+
+ - I cos 5x + ... )
sin 3x
25
+ . . .)
Problem Set 11.4, page 499
3. Use (5). (-1)n
00
L... 9 .1· " n=-oo
- - einx ' n
2i 7 . - - ~ _ _ _ e(2n+ 1)ix 7r n=-oo 2n + 1 2 1)n 11. ~ + 2 ~ ---- einx 3 n2 n=-(X: 00
n*O co
13. 7r
+
i ~ n=-x
(
)
+ ... )
Problem Set 11.5, page 501
3. (0.0511)2 in Dn changes to (0.02n)2. which gives C5 = 0.5100. leaving the other coefficients almost unaffected. 5. Y = Cl cos wt + C2 sin wt + A{w) cos t, A(w) = 1/(w2 - L) < 0 if w2 < I (phase shift!) and> 0 if w 2 > I N
7. y =
Cl
cos wt
+ C2 sin
wt
+
an
L n=l
9. )' = c cos wt
.+ .. )
11. Y =
Cl
cos wt
+ c sin , +
C2
1
-
2
3· 5(w
16)
-
wt
sin wl
cos4t-
w
2
- n
+ -7T- + -4 26" ~
2
cos nt
(
1 -----0;---
cos t +
'u'-I
oo
-
1· 3(w2
cos 2t
4)
-
cos 3t
6i'-9
1
+ 2w2
1/9
•
13. The situation is the same as in Fig. 53 in Sec. 2.8. 3c
15. y = -
64
+
9c
2
cos 3t -
8 64 + 9c
2
sin 3t
Problem Set 11.6, page 505
±
1. F = 2 ( sin x -
sin Ir
+ ... +
4 ( 1 3. F = -7T 2 - -7T cos x + -9 cos 3x
{_l)N+l
N
)
sin Nx , E*
=
8.1,5.0,3.6,2.8,2.3
1 cos 5x + . .. ) E* = 00748 00748 + -25 ....,
0.01 19, 0.01 19. 0.0037
5. F = -2-4 - (1 - - cos 2x + -1- cos 4x 7T IT 1·3 3·5 E* = 0.5951, 0.0292. 0.0292, 0.0066, 0.0066 7. F
=
-
4 ( sin x
+ -1
7T
0.6243. 0.4206 9. -8 ( sin x 7T
+ -1
27
0.0015, 0.00023
sin 3x + -1 sin 5.1' 3 5 (0.1272 when N = 20) sin 3x
+
1~5
sin 5x
+ -1- cos 6x + . .. ) , 5·7
+ . .. ) . E*
+ ..
J
= 1.1902, 1.1902, 0.6243,
E* = 0.0295, 0.0295, 0.0015,
Problem Set 11.7, page 512
LX
1. f(x) = 7Te- x (x > 0) gives A = e- V cos wv dv = , B = __t_v----=-2 + w2 1 + tr (see Example 3), etc. 0 2 X 3.I(x) = !7Te- gives A = 1/(1 + w ). 5. Use f = (7TI2) cos v and (11) in App. 3.1 to get A = (cos (ml'!2»/(1 - w 2 ).
2 7. 7T
2 11. -
Icc
sin
7T
cos XW --dw
2
9 -
• 7T
W
0
LX cos TTW + I l-w
7To
2 17. -
(ltv
LX
tv
w2
0
LX
7To
7TH' - sin 7TW
sin
2
0
2 15. -
cos xw dw
2
I"" -cos- -w-+- , ,sin- -w--- - cos
XIV
2 19. rr
dw
IV
HI'
sin mv 2 sin xw dw l-w
L wa cc
0
sin wa 2
sin xw dw
IV
Problem Set 11.8, page 517 1.
7.
{2 ( sin 2w -
V--;;
2 sin w )
5.
w
v:;;n cos w if 0 <
11. V(2/7T) w/(w
2
+
7T
tv
2
< 7T12 and 0 if w > 7T12
v:;;n e-x
(x> 0)
9. Yes, no
)
19. In (5) for f(ax) set a.>.: = v.
Problem Set 11.9, page 528 3. ik(e- ibw 7. [(I
- l)/(v'2;w) iw)e- iw - 1]/(v'2;w2 )
+
11 ·21 e -w 13. (e ibw
5. V(2/7T)k (sin w)/w 9. V(21'7T)i(cos w - 1)/w
2 /2
-
e- ibw )/(i}vv'2;) = v'2/;-(sin bw)/w
Chapter 11 Review Questions and Problems, page 532 11. 4k (sin TTX
+
7T
13. 4 ( sin 15.
8 ( 7T2
"2x- I"2 sin
!3
sin 31TX
sin x
+
2nt - '9I sin
+
!5
sin 57TX
+ ... )
'31 sin 23x - 4"1 sin 4x + "51 sin 25x 3nt -2- +
I 25
5nr sin -2- -
17. -2-4 - (1 - - cos 16rrx + -1- cos 327TX 7T 7T 1·3 3·5
+ ...
)
)
+ -1- cos 5·7
+ ...
481TX
+ ...)
dw
19.
7r 2
12 -
I l L cos 6x + 16 cos 8 \" -
"4 cos 4x - 9
+
cos 2x
23. 7r 2 /8 by Prob. 15
21. rr/4 by Prob. 11 25.
1
"2 [f(x)
+
+ ...
1
"2 Lt(x) -
f(-x)1,
fe-x)]
27. rr - -8 ( cos -x + -I cos -3x 7r 2 9 2
+ -1
25
cos -5x + ... ) 2
29.8.105,4.963,3.567,2.781,2.279, 1.929. 1.673, 1.477
31. y
C] cos wI + C2 sin wt
=
L"" (cos
1
7r
3w
+ w sin w - 1) cos
W
LX
sin
tV -
0
7r
(cos I 2 w - 1
1 4
-
-
•
cos 2t _ 4
w2
+
1 9
cos 3t w2
-
cos
H' 2
W
+
(sin w - w cos w) sin
2
tVX
dH'
. •• • Sin tU d~~
W
LX sin 2w -
4 37. -
tVx
w
0
2 35. 7r
4
--2 -
+ _ ... )
__1_. cos 41 16 w 2 - 16 33. -
7?
+
2w cos 2w w
0,
3
39.
cos wx dl\'
~.----::--+4
,,-;
I\,2
t' , Problem Set 12.1, page 537
1. 1I 5. 1I
=
Cl(X)
=
c(x)e- Y
9. 1I
=
Cl(X»)'
cos 4)' +
+
+
C2(X)
eXY/(x
+
3. 7.
sin 4)' 1)
C2(.1'»),-2
15. C = 1/4 19. 7r/4 27. 1I = 110 - (lIO/in 100) In
(X2
+ )'2)
+
1I = Cl(X) 1I = c(x)
exp
C2(X)y
(h 2 cosh x)
11. u = c(x)eY + h(y) 17. Any C 21. Any C and w 29. 1I = ClX + C2(Y)
Problem Set 12.3, page 546
1. k cos 2m sin 2m; 3. '8k 3 ( cos 7r1 sin 7rX + -
5.
7r
27
cos 37r1 sin 37rx
+
4 57r
( COS 7r1 sin 7rX - -1 cos 37rf sin 37rx 9
+
--2
2
7. 7r 2
+
( cv'2
-
~ Cv'2 +
1) cos m sin 7rX
+
1~5
cos 57rISin57r,+ ... )
_1_ cos 57rf sin 57rx 25
"21 cos 2m sin 27rx
1) cos 3m sin 31TX - .• -)
+ ...)
9
9. 22 ~
( (2 - V2) cos m sin = - -I (2
_1_ (2
25
it =
3=
+
+ V2) cos 5m sin 5= + ... ) 2
17.
+ v'2) cos 3m sin
9
8L ~3
(
cos
[(~)2J C L t
L7TX
sin
1 cos [(3~)2J 33 C L t
+
sin
371:\ + ...)
L
19. (a) u(O, t) = 0, (b) lI(L, t) = 0, (c) 1l:t"(0. t) = 0, (d) uxCL, t) = O. C = -A, D = -B from (a), (b). Insert this. The coefficient determinant resulting b'om (c), (d) must be zero to have a nontrivial solution. This gives (22).
Problem Set 12.4, page 552 3. 11. 13. 15. 17. 19.
c 2 = 300/[0.9/(2·9.80)] = 80.83 2 [m2 /sec2 ] Hyperbolic. u = fl(X) + f2(X + y) Elliptic, U = fl(Y + 3ix) + f2(Y - 3ix) Parabolic, 1I = Xfl(X - Y) + f2(x - y) Parabolic. 1I = Xfl(2x + y) + f2(2x + y) Hyperbolic, u = (l/y)fl(XY) + f2(Y)
Problem Set 12.5, page 560 5.
It
= sin OAnt e-1.752.167T2t/lOO
7.
II
=
9. u = 11.
II
=
! (~
sin O.lnt
2:~
(sin 0.17T.t e-O.017527T2t
til
+ Un. where
Un
=
II -
UI
~
+
e-O.017527T2t
+
sin 0.27TX e-O.01752(2m2t - . . . )
~
sin 0.3n\" e-O.01752(37T)2t - - ... )
satisfies the boundary conditions of the text. so
fL
oc 117T.t 2 2 that u II = L... ~ B sin - - e-lCn7TIL) t B = n' L . II L
n~1
0
13. F = A cospx + B sinpx, F'(O) p = n~/L, etc. 15. u = 1
17. II =
2;2 + ~2
19. u = 12 23 . - -K~ L
l1~l [f(x) (x)] sin -L - dx . - II I'
~
4 (cos x e- t -
= Bp = O. B = 0, F'(L) = -Ap sinpL = 0,
cos 2x e- 4t
+
i
I
I
4
9
cos 3x e- 9t -
+ ... )
+ cos 2,' e- 4t + - cos 4x e- 16t + - cos 6x e- 36t + ... .
L OC
nB II e- A " 2 t
25. w
= e-{3t
1I~1
27. v t - c 2v xx =
0,
w"
=
-Ne- ux/c 2, w
=
C~2
[_e- ux -
~
(J - e-uL)x
so that w(O) = n'(L) = O. 29.
1I
= (sin i~x sinh ~~y)/sinh
1T
80 ~
31. II = -
L... ~ n~l
(2n - 1)=
(211 - 1) sinh (211 -
l)~
Sll1
24
(211 - 1)7TV
sinh - - - - 24
+
1] '
33.
= Aox +
II
=
Ao
I -2
24
L
sinh (I171:T/24) 117TY An - - - - - cos -24 ' sinh 117T
f
fey) dy,
24
117TX
oc
35. ~ An sin - - sinh
f24 0
117T(b - y)
a
n~l
I
12
= -
An
0
, An
a
117TV
f( \") cos - - ' dv 24-
=
2
fa
a sinh (l17Tb/a)
0
117TX
f(x) sin - - dx a
Problem Set 12.6, page 568 2 sin ap
1. A =
, B = 0,
fX -sin-ap- cos px e- c
2
2
= -
II
rr
~
L cos
2
p t
dp
p
0
oo
= e- P , B = 0, u =
3. A
px
dp
e-p-cZp2t
o
5. Set
= s. A = I if 0 <
TTV
p/7T
<
I, B
= 0, u
{T
=
cos px
e-c2p2t
dp
o
~
7. A
2[cos P
+ P sin p
1)/(~2)]. B
-
O.
=
f"
u =
A cos px e-c2p2t dp
o
Problem Set 12.8, page 578
1. 3. 5. 7. 11.
(a), (b) It is multiplied by v'2. (c) Half = 16/(ml17T2) if 111. 11 odd. 0 otherwise Bmn = (-l)n+18/(mn7T2) if 111 odd, 0 if m even Bmn
=
Bmn
(_l)m+n4/(mn7T2)
k cos v'29 1 sin 2x sin 5y 6.4 "" "" I 13. - 2 ~ ~ -----:33 cos (1 m~l n~l 117 II 111, n odd
7T
17.
V
117
2
+
2 /1 )
sin
l/IX
sin
11)"
(colTesponding eigenfunctions F 4,16 and F 1614 ), etc. 0 (m or 11 even). Bmn = 16k/(l1l1l7T2 ) (m, 11 odd)
C7TY'260
19. Bmn
=
21. Bmn = (-l)m+nI44a 3 b 3 /(m 3 11 3 1fl)
9
23. cos ( 7Tt
2 a
+ 216)
47TV
37TX
sin - - sin - - ' a b
b
Problem Set 12.9, page 585
7. 30r cos 8 + 10,-3 cos 38 9.55
220 + -
( r cos 8 - -1 ,.3 cos 38
3
7T
11.
7T _
2
~ 7T
(r cos 8 + ~9
,.3
cos 38
15. Solve the problem in the disk,. < and -lio on the lower semicircle. u
=
4uo 7T
(!.. a
sin 8
17. Increase by a factor
+
~ 3a
v'2
r3
a
+ -1 5
+
_1_ 25
r 5 cos 58 ,.5
cos 58
:-.ubject to
sin 38
+
Uo
~ r 5a
19. T
5
+ ... )
+ ... )
(given) on the upper semicircle
sin 58
+ ...)
= 6.826pR2 f1 2
21. No 23. Differentiation brings in a factor lIAm = RI(eCi m ).
Problem Set 12.10, page 593 11. v
=
F(r)C(t), F"
+
k 2F
= 0, G +
e 2k 2C
2 R
C n = Bn exp (-e 21l 27T 2tIR2), Bn = -
= 0,
fR
Fn
= sin (I1'mfR), WITT
rf(r) sin - - dr
R
0
13. u = 100 15. u = ~r3P3(CoS 1;) - ~rPl(cos 1;) 17. 64r4P4(COS 1;) 21. Analog of Example 1 in the text with 55 replaced by 50 23. v = r(cos ())lr2 = r/(x 2 + )"2), V = xyl(x 2 + )"2)2
Problem Set 12.11, page 596 e(s)
= -
X
S
+
x
, W(O, s) = 0, e(s) = 0, w(x, t) = x(t - 1 + e- t ) + 1) 7. w = f(x)g(t), xI' g + fi = xt, take f(x) = x to get g = ce- t + t - 1 and e = 1 from w(x, 0) = x(e - L) = o. 9. Set x 2/(4c 27) = Z2. Use z as a new variable of integration. Use erf(x) = I. 5. W
S2(S
Chapter 12 Review Questions and Problems, page 597 19. u
=
+
el(y)e X
c2(y)e- 2X
21. u = g(x)(1 - e- Y ) + f(x) 25. u = ~ cos 1 sin x - -! cos 3t sin 3x
23. u = cos t sin x - ~ cos 21 sin 2x 27. l/ = sin (0.02'i1:\") e-0.0045721 200 29 u = 7T 2
•
31. u
=
(7TX sin 50
·
e-0.004572t -
37T.\" -1 sin - e-0.04115t 9
50
100 cos 4x e- 16t 7T
16
33. u = - - 2 7T
(1-
4
cos 2x e- 4t
+ -
1
36
cos 6.1: e- 36t
+ ... )
+ -I- cos lOx e- lOOt 100
+ ... ) 37. u = fl(Y) + f2(X + y) 39. II = fl(Y - 2ix) + f2(Y + 2ix) 41. l/ = Xfl(Y - x) + f2(Y - x) 49. II = (111 - 1l0l(1n r)lIn (rl/ro) + (110 In rl - Ul In ro)lIn (rl/rO)
Problem Set 13.1, page 606 5. x - iy = -(x + iy), x = 0 9. -5/169 11. -7/13 -(22/l3)i 15. -7/17 - (l1117)i 17.xl(x2 + y2)
7.484 13. - 273 + 136i 19. (x 2 - y2)/(x 2 + y2)2
Problem Set 13.2, page 611 1. 3V2(cos (--!7T) + i sin (-!7T)) 3. 5 (cos 7T + i sin 7T) = 5 cos 7T
1 5• cos "27T
. 1 + I.sm "27T
App. 2
Answers to Odd-Numbered Problems
A37
9. -37r/4 7. ~v'6I (cos arctan ~ + i sin arctan ~) 15. 37r/4 11. arctan (±3/4) 13. ±7r/4 17.2.94020 + 0.59601i 19.0.54030 - 0.84147i 21. cos (-~7r) + i sin (-~7r), cos ~7r + i sin ~7r 23. ±O ±i)rV2 25. -I, cos!7r ::!: i sin !7r, cos ~7r ± i sin ~7r 27. 4 + 3i, 4 - 8i 29. ~ - i, 2 + ~i 35. 1<:1 + z21 2 = (:1 + Z2)(Zl + Z2) = (:1 + z'2)(':1 + ':2)' Multiply out and use Re <:1':2 ~ IZ1z21 (Prob. 32): 2 :1':1 + :1':2 + :2':1 + ::2':2 = hl + 2 Re :1':2 + IZ212 ~ 1:112 + 2blk:21 + IZ212 = (1:11 + IZ21)2. Take the square root to get (6).
Problem Set 13.3, page 617
1. 3. 5. 9. 13. 15. 19. 23.
Circle of radius ~, center 3 + 2i Set obtained from an open disk of radius I by omitting its center z = 1 Hyperbola xy = 1 7 . .v-axis The region above y = x f = I - 11(::; + 1) = 1 - (x + 1 - iy)/l(x + 1)2 + y21; 0.9 - O.li (x 2 - y2 - 2.ixr)/(x2 + y2)2, -i/2 17. Yes since r2(sin 2e1/r ~ 0 Yes 21. 6;::2(;::3 + i) 2i (1 - Z)-3
Problem Set 13.4, page 623
5. Yes 1. Yes 3. No 9.Yesforz*0 7. No 11. rx = x/r = cos e"y = sin e, ex = -(sin e)/r, ey = (cos e)/r, (a) 0 = Ux - Vy = It,. cos e + tllI(-sin e)/r - Vr sin e - vo(cos e)/r. (b) 0 = lty + Vx = 1I,. sin e + lle(COS e)/r + Vr cos e + v e ( -sin e)/r. Multiply (a) by cos e, (b) by sin e, and add. Etc. 13. z2/2 15. In Izl + i Arg : 17. Z3 19. No 21. No 23. c = I, cos x sinh y 27. Use (4), (5), and (1). Problem Set 13.5, page 626
3. -1.13120 + 2.47173i, e = 2.71828 5. -i, 1 9. e- 2x cos 2.y, _e- 2T sin 2.y 7. eO.s(cos 5 - i sin 5), 2.22554 2 2 11. exp (x - )"2) cos 2xy. exp (x - y2) sin 2.\)" 13. e i7r/4 , e 5m/4 15. Vr exp [ice + 2k7r)/Il], k = 0," ',11 - 1 17. ge m 19. z = In 2. + tri + 21l7ri (11 = 0, ±l,"') 21.: = In 5 - arctan ~i ± 21l7ri (11 = 0,1,"') Problem Set 13.6, page 629
3. Use (II), then (5) for e iy , and simplify. 5. Use (II) and simplify. 7. cos 1 cosh 1 - i sin 1 sinh 1 = 0.83373 - 0.98890i
11. -3.7245 - 0.51182i 15. cosh 4 = 27.308 19. z = ~(21l + 1)7T - (-l)n1.4436i
9.74.203, 74.210 13. -1
17. z = :::':::(211 + 1)7Ti/2 21. ::: = :::':::lI7Ti 25. Insert the definitions on the left, multiply out, simplify.
Problem Set 13.7, page 633 1. 5. 7. 11. 15. 17. 19. 21. 25.
In 10 + 7Ti 3. ~ In 8 - ~7Ti In 5 + (arctan ~ - 7T)i = 1.609 - 2.214i 0.9273i 9. ~ In 2 - ~7Ti :::':::(211 + I )7Ti. 11 = 0, I. . . . 13. In 6 :::'::: (21Z + 117Ti, 11 = O. 1. ... (7T - I :::'::: 2117T)i, Il = 0, 1, ... In (;2) = (:::':::211 + i)7Ti. 21n i = :::':::(411 + l)'ITi. n = O. 1. ... 3 eO. (cos 0.7 + i sin 0.7) = 1.032 + 0.870i 2 e (l + i)/V2 23. 64(cos (In 4) + i sin (In 4» 2.8079 + 1.3179i 27. (I + ;)/,\0.
Chapter 13 Review Questions and Problems, page 634 19. -~ - ~i 25. 12e- 77i/2 31. fez) = liz 37. (-x 2 + )"2)/2 43.0.6435i
17. -32 - 24i 23. 6V2e37ri/4
29. (:::':::1 :::'::: i)/V2 z2 35. f(:::) = e 41.0
21.5 - 3i 27. :::':::(2 + 2i) 33. fez) = (l + i):::2 39. No 45. -1.5431
Problem Set 14.1, page 645 1. 3. 7. 9.
Straight segment from I + 3i to 4 + 12i Circle of radius 3, center 4 + i 5. Semicircle. radius 1. center 0 Ellipse, half-axes 6 and 5 from - 1 - ~i to 2 + 4i Parabola)' =
¥3
11. e- it (0 ~ t ~ 27T) 13. t + ilt (1 ~ t ~ 4) 17. -Q - ib + re- it (0 ~ t ~ 27T) 21. 0
25. il2 29.2 sinh ~
15. t + (4 - 4t2 )i (-1 ~ t ~ I) 19.~ + ~i 23. 7Ti + ~i sinh 27T 27. -I + i tanh ~7T = -I + 0.6558i
Problem Set 14.2, page 653 1. 7Ti. no 3.0, yes 7. O. yes 9.0, no 15. Yes, by the deformation principle 21. 7Ti
23.27Ti
27. (a) 0, (b) 7T
29.0
5.0, yes 11. O. yes 19. 7Ti by path deformation 25. 0
Problem Set 14.3, page 657 1.-4~
7.0 13. ~ 17. ~i cosh2(l
5.8m
3.4~
+
11.m
9. -~i 15. 2m Ln 4 = 8.710i i) = ~(-0.2828 + 1.6489i)
Problem Set 14.4, page 661 1. 27~i/4 7. 2~i if lal < 2.0 if lal 11. 2~2i
5. ma 3 /3 >2 9. m(cos i-sin ~) 13. ~iea/2124 if la - 2 - il < 3 0 if la - 2 - il > 3 3. -2~ierr/2
Chapter 14 Review Questions and Problems, page 662 17. -6~i 23. ~i sin 8
21. 27.
19.0 25.0
-~i
~
29.0
Problem Set 15.1, page 672 Bounded, divergent, ± J 3. Bounded, convergent, 0 Unbounded Bounded. divergent, ±llV'2" ± i, O. I, -2 Convergent, 0 13. IZn - II < ~E, Iz~ - [*1 < ~E (n > N(E)), hence IZn + z~ - (l + [*)1 < ~E + ~E 17. Convergent 19. Divergent 21. Conditionally convergent 23. Divergent by Theorem 3 27.11= 1100 + 75il = 125 (why?); 1100 + 75iI 125/125! = 125125/[\h50~ (l251el 25J = e125tV250~ = 6.91 . 1052
1. 5. 7. 9.
Problem Set 15.2, page 677 1. ~ a n z2n = ~ a n(z2)n, Iz21 < R = lim lanlan+ll, hence Izl <\'R. 3. -i, I 5. -1, e by (6) and (1 + I1n)n ~ e. 7.0, Iblal 9. O. 1 11.0, 1 15. i, 1IV2 17. 0, V2 13. 3 - 2i. 1
Problem Set 15.3, page 682 3. V2 9. ]
1. 3 7. 1/\/7
5.
V5i3
Problem Set 15.4, page 690 1. 1 - 2z + 2z 2 - ~Z3 + ~Z4 - + .. ', R = 3. e- 2i (1 + (z + 2i) + hz + 2i)2 + t(z + 2i)3 5. I - ~(z - ~~)2
7. ~
+ ~i + !i(z
+
- i)
l4(Z - !~)4 - 7~O(Z
+
(-~
+ ~O(Z
-
- ;)2 -
00
+
+ 2i)4 + ... , R = 00 !~)6 + - ... , R = oc ~(z - ;)3 + - ... , R = V2 l4(Z
+ t:: 4 - .i8::.6 + 3~4<.8 - + .. " R = x 11.4(:: - 1) + \O(z - 1)2 + 16(:: - 1)3 + 14(:: - \)4 + 6(:: 13. (2/v;.)(:: - z3/3 + ::5(2!5) - z7/(3!7) + ... ), R = GC 15. ;:3/(l!3) - ::7/(3!7) + ;:1l/(5!1l) - + . ". R = x 19.:: + ~Z3 + 1~::5 + i{5:: 7 + . ", R = !7T 9. 1 - !::2
- 1)5
+
(z - 1)6
Problem Set 15.5, page 697 3. R = \[\1'; > 0.56 7. Itanhn Izll ~ I, 1/(/1 2 + 11. Izl ~ 2 - 8 (8 > 0) 15.lzl ~ \,'5 - 8 (8) 0)
1. Use Theorem 1. 5. Iz nl ~ 1 and ~ 1//1 2 converges. 9.
Iz +
L - 2il ~ r
4
13. Nowhere
\) <
\//1
2
Chapter 15 Review Questions and Problems, page 698 13. 1. ! Ln[(1 + z)/(1 - .:)] 17. 1/v;., [l - 7T(Z - 2i)2r 1
11. x, e 3 -
15.
GC
19. 1/3 21. -1 - (::. - 7Ti) - (:: - 7Ti)2/2! - .. "
23.! + ~(:: + I) + t(z +
R = x
1 1 6(Z
1)2 + + 1)3 + . ", R = 2 + 3;: + 6;:2 + 10;:3 + .. '. R = I. Differentiate the geometric + (:: + i) - i(z + i)2 - (z + i)3 + .. " R = 1
25. \ 27. i
1
+
29. -(:: - 2 7T)
3!\ (:: -
13
27T) -
1
5! (;: -
15
27T)
+ - .. '. R =
series.
ex;
Problem Set 16.1, page 707 \ 1. .:4
1
1
1
+ -::3 + -Z2 + -z + 1 + z +
1111 + - - Z2 2;:: 6
3.
-3 -
5.
-_3
I 24 .(.
+ -
Z
\
1
+ -
::5
1 2::7
ez-1e 7. - - = e Z -
9. -
\
(i 2:"2 x
n~O
1 6;:9
+ oc
+ -
(::: -
2:
n~O
I (.: -
12 120 ~
I)n-l
= e
i)n-l = -
R
= \
+ - ... R = ,
[1
-Z- \
+
1
z- \ + - - +"'],R=X 2!
il2 \; --=-=-=+ "4 + -8 (:: <.
i) -
I
R=2 x
11. -
2: (.: + ;)n-l =
-
n~O
3 13. - - z- 1
1 ~ Z
+
2
+
(z -
\)
I
x
= ox
'
/1!
+ .. '.
- -
+ ... R
n
-l-l
Z2
- \ - (.: + ;) - .. " R =
1
1 16 (:: -
;)2 -
.. "
15.
L
-L
< I.
:x;
1:::1
;::4n+2,
< I,
> 1
~
l
x
-L
4n+2'
Izl >
1
'11=0 Z
'>1=0
19.
Izl
_3n+3'
n~O
17. L
1
Xl
~3n, 1::1
i
I
+ - - + i + (;:: ~ -
(::: - i)2
i
L ~4n, Izl < x
21. (I - 4:::)
i)
(4 I) L I
1,
3
n=O
x
- 'I
Z
Z
4n'
n~O Z
Izl >
I
Problem Set 16.2, page 711 1. :::,:::~. :::':::~, ... (poles of 2nd order), :x (essential singularity) 3.0, :::,:::v:;.;.. :::,:::\12;. ... (simple poles), :x (essential singularity) 5. :x (essential singularity) 7. :::'::: I, :::':::i (fourth-order poles), :x (essential singularity) 9. :::':::i (essential singularities) 13. -16i (fourth order) 15. :::'::: 1, :::':::2, ... (third order) 17. :::':::irV3 (simple) 19. :::':::2i (simple), 0, ::':.2 m, :!::4'ITi, ... (second order) 21. O. :::':::271. :::':::471• ... (fourth order) 23. f(~) = (~ - ~o)ng(:::), g(zo)
"* 0, hence p(:~) =
(::: - ;::0)
2n
l(z).
Problem Set 16.3, page 717 1. i. 4i 5. 1/5! (at::: = 0) (at ~ = I), ~ (at z = -1) 9. 15. el/z = 1 + 1/;:: + ... , AilS. 2m 19. -4'ITi sinh ~7I
-!
23.0
3. -~i (at ~ = 2i), ~i (at - 2i) 7. I (at :::':::ll'1T) 11. -1 (at z = :::':::~71", :::':::~71", ... ) 17. Simple poles at :!::~. Am. -4i 21. -4i sinh ~ 25. ~ (at ~ = ~), 2 (at ~ = ~). Ans. 5m
Problem Set 16.4, page 725 1. 271/\;TI 7.0 13.71"/16 19.0 25.0
5. 271/3
3.271/35 9. 1f 15.0 21. 0 27. -71"/2
11. 271/3
17.71"12 23.71"
Chapter 16 Review Questions and Problems, page 726 17. 271"i/3 23. m/4 27.6m 33.0
19. 571" 25.0 (11 even), 29. 271/7 35. 71/2
(_I)
21. ~'IT cos 10 271"i/(/1 - I)! (/1 odd) 31. 4'IT/V3
Problem Set 17.1, page 733 3. Only the size 5. x = e, W = -y + ie, y = k. w = -k 7. -371"/4 < Arg w < 31714, Iwl < 1/8 11.lwl ~ 16, v ~ 0 15. In 2 ~ II ~ In 3, 71"/4 ~ v ~ 17/2 19.0, ±1, ±2, . . .
+
ix
9.lwl > 3 13. Annulus 3 < 17. ±1, ±i 21. -a/2
Iwl
< 5
23. (/ and 0, ~ 25. M = eX = 1 when x = 27. M = 1IIzi = 1 on the unit circle, J = 111<:12
o. J =
e
2x
Problem Set 17.2, page 737 _ __ -
~.~.
-nv
-2w + 3
9. z = 0 13. z = ±i 17. (/ - d = 0, ble = 1 by (5)
5i
7.;::= - - 4w - 2
n. <: = ~ + i ± Vi + i 15. w = 4/z, etc. 19. w = add (a *- 0, d *-
0)
Problem Set 17.3, page 741 5. Apply the inverse g of f on both sides of Zl = f(zl) to get g(Zl) 7. w = (<: + 2i)/(z - 2i) 9. w = z - 4
11. w
=
15. w = 19. w =
liz (z + 1)/(-3z + I) 4 (Z4 - i)/(-i: + 1)
=
13. w = (3iz + 1)lz 17. w = (2z - i)/(-iz - 2)
Problem Set 17.4, page 745 1. Annulus 1 ~ Iwl ~ e 2 3. liVe < Iwl < Ye, 371"/4 < arg w < 571"/4 5. 1 < Iwl < e. u > 0 7. w-plane without 0 9. u 2 /cosh 2 1 + v 2 /sinh 2 1 ~ 1, u ~ 0 11. Elliptic annulus bounded by u2/cosh2 I + v 2 /sinh 2 1 = I and u 2/cosh 2 5 + v 2 /sinh 2 5 = \ 13. ::!:(211 + 1)17/2, II = 0, \, ... 15.0 < [m t < 71" is the image of R under t = Z2. AilS. e t = e z2 17. O. ±i, ±2i, ... 19. 112/cosh 2 1 + v 2 /sinh 2 1 ~ \, v < 0 21. v < 0 23. -I ~ l/ ~ I. v = 0 (c = 0). u 2 /cosh2 e + u2 /sinh 2 e = I (e *- 0) 25. In 2 ~ l/ ~ In 3, 71"/4 ~ v ~ 7[/2
Problem Set 17.5, page 747 1. w moves once around the unit circle. 7. - i/2, 3 sheets
g(f(zl»
5. - 5/3, 2 sheets 9. 0, 2 sheets
=
Zl·
Chapter 17 Review Questions and Problems, page 747 11.
15. 17. 23. 27. 33. 39. 45.
1I = ~V2 I, ~V2 - I The domain between II Iw + ~I = ~ 0, (:::'::: 1 :::'::: i)/V2 0, :::':::i/V2 tV = zi(z + 2) I + i : :': : v'1+2i iz 3 + 1
=
13.lwl = 20.25, larg wi < n12 ~ - v 2 and It = 1 - ~V2 19. 1I = 1 21.larg wi < 7T/4 25. n/8 :::'::: 117T/2. 11 = O. I, ... 29. tV = iz 31. w = liz 35. ::!:V2 37. 2 : :': : v'6 41. w = e3z 43. z2 12k
Problem Set 18.1, page 753
3. 110 - SOx)" 110 + 25iz 2 1. 20x + 200, 20z + 200 7. F = 200 - (lOOlln 2) Ln z 5. F = (1lO/ln 2)Ln z 13. Use Fig. 388 in Sec. 17.4 with the z- and w-planes interchanged, and cos z = sin (z + !7T). 15. <1> = 220 - 110xy Problem Set 18.2, page 757 2 )') c]:> 2 2 2 1• 1I 2 - v 2 = e2 X(cos 2 ] \' - sin, xx = 4e X(cos )' - sin .v) = -c]:> YY' V2c]:> = 0 3. Straightforward calculation, involving the chain rule and the Cauchy-Riemann equations 5. See Fig. 389 in Sec. 17.4. c]:> = sin 2 x cosh2 J - cos 2 x sinh 2 y. 9. (i) c]:> = U1(l - "\y). (ii) w = iz 2 maps R onto -2 ~ 1I ~ O. thus <1>* = U1(l + !1I) = U1(l + -2x),». 11. By Theorem 1 in Sec. 17.2 13. <1> = 10[1 - (lin) Arg (z - 4)J, F = lOll + (i/n) Ln (z - 4)1 15. Corresponding rays in the w-plane make equal angles, and the mapping is conformal.
i(
Problem Set 18.3, page 760 3. (lOO/d»'. Rotate through 7T12. 5. 100 - 2408/7T 7. Re F(z) = 100 + (200/n) Re (arcsin z) 9. (240/n) Arg z 11. To + (2/7T)(TI - To) Arg z 13.50 + (400/7T) Arg z Problem Set 18.4, page 766 1. V = iV2 = iK, 'It = - Kx = canst, c]:> 3. F(z) = Kz (K positive real) 5. V = (I + 2i)K. F = (l - 2i)K::. 7. F(z) = Z3
= Ky = canST
9. Hyperbolas (x + I)y = COllst. Flow around a corner formed by x x-axis. 11. Y/(X2 + .1'2) = C or X2 + (y - k)2 = k 2 13. F(z) = ::.Iro + rolz
= -} and the
z with the roles of the z- and w-planes
15. Use that w = arccos Z is w = cos interchanged.
Problem Set 18.5, page 771 5. I - ,.2 cos 28 7. 2(r sin 8 - !r2 sin 28
+ ~,.3 sin 38 - + ... )
9. ~r2 sin 28 - ~r6 sin 68 11. ~7T2 - 4(,. cos 8 - ~r2 cos 28
+
,.3 sin 38 +
-1
13. -4 ( r sin 8 - -I
9
7T
+ ... )
~r3 cos 38 -
~
r5
+ ... )
sin 58 -
Problem Set 18.6, page 774 1. 7. 13. 15.
No; Izl2 is not analytic. 3. Use (2). F(~) = 2i 5. cJ:>(4, -4) = -12 Use (3). cJ:>(L 1) = -2 11. IF(ei 1)1 2 = 2 - 2 cos 'lA, e = 7T12, Max = 2 IF(z) I = [cos 2 2x + sinh2 2y]1I2, z = ±i, Max = [1 + sinh 2 2]112 = cosh 'l = 3.7622 No
Chapter 18 Review Questions and Problems, page 775 11. 13. 17. 23.
cJ:> = 10(1 - x + y), F = 10 - 10(1 (201ln 10) Ln ;:: Arg z = const T(x. y) = x(2y + I) = const
27. F(;::)
= -
c
Ln (;:: - 5), Arg (;:: - 5)
+
i)z.
15. (101In 1O)(ln 100 - In r) 19. (-i/7T) Ln z 25. Circles (x - C)2 + )'2 = c 2
=
C
27T
29.20
+ -80 ( r sin 7T
e + -I ,.3 sin 3fJ + -I ,.5 sin 5fJ + ... ) 3
5
Problem Set 19.1, page 786 1. 0.9817' 102, -0.1010' 103 ,0.5787' 10- 2, -0.1360' 105 3.0.36443/(17.862 - 17.798) = 0.36443/0.064 = 5.6942, 0.3644/(17.86 - 17.80) = 0.3644/0.06 = 6.073, 0.364/(17.9 - 17.8) = 3.64, impossible
5.
0.36443(17.862 + 17.798) 17.8622 _ 17.7982
=
0.36443' 35.660 12.996 319.05 _ 316.77 = ~ = 5.7000,
13.00 13.0 13 10 -22 = 5.702, ~2 = 5.70. = 5.7. . 8 _. 8 2.3 2
=
5
7. 19.95,0.049,0.05013: 20, 0, 0.05 9. In the present calculation, (b) is more accurate than (a). 11. -0.126' 10-2, -0.402' 10-3 ; -0.267' 10-6 , -0.847' 10- 7 13. Add first, then round. 15.
~ Q2
=
~I + EI Q2
+
102
=
a
l
:
a2
EI
(1 _
~2 + ~2: a2
a2
_
+ ...)
= ~I + ~l a2
a2
_
~2 . ~1 a2
Q2
,
hence
l al)/I -all = lEII( -a - -:::::a2 a2 al {/2
-
E21 ~ IErll + IEr21 ~ {3rl + {3r2
a2
19. (a) 19121 = 0.904761905, Echop = Eround = 0.1905.10- 5 , 5 E,·.chop = Er.round = 0.2106.10- , etc. Problem Set 19_2, page 796
1. g = 1.4 sin x, 1.37263 (= x5) 5. g = x⁴ + 0.2, 0.20165 (= x3) 7. 2.403 (= x5, exact to 3S) 9. 0.904557 (= x3) 11. 1.834243 (= x4) 13. x0 = 4.5, x4 = 4.73004 (6S exact) 15. (a) 0.5, 0.375, 0.377968, 0.377964; (b) 1/√7 = 0.377964473 17. xn+1 = (2xn + 7/xn²)/3, 1.912931 (= x3)
19. (a) Algorithm Bisect (f, a0, b0, N) [Bisection Method]
This algorithm computes an interval [aN, bN] containing a solution of f(x) = 0 (f continuous) or it computes a solution cn, given an initial interval [a0, b0] such that f(a0)f(b0) < 0. Here N is determined by (b − a)/2^N ≤ β, β the required accuracy.
INPUT: Initial interval [a0, b0], maximum number of iterations N.
OUTPUT: Interval [aN, bN] containing a solution, or a solution cn.
For n = 0, 1, …, N − 1 do:
Compute cn = ½(an + bn).
If f(cn) = 0 then OUTPUT cn. Stop. [Procedure completed]
Else continue.
If f(an)f(cn) < 0 then an+1 = an and bn+1 = cn.
Else set an+1 = cn and bn+1 = bn.
End
OUTPUT [aN, bN]. Stop. [Procedure completed]
End BISECT
Note that [aN, bN] gives (aN + bN)/2 as an approximation of the zero and (bN − aN)/2 as a corresponding error bound. (b) 0.739085; (c) 1.30980, 0.429494
21. 1.834243 23. 0.904557
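The Bisect pseudocode in Prob. 19(a) translates almost line for line into a program. The following Python sketch is one such translation; the test function cos x − x and the interval [0, 1] are chosen here only to reproduce the value 0.739085 of part (b).

    import math

    def bisect(f, a, b, N):
        """Bisection method: halve [a, b] N times, keeping a sign change."""
        if f(a) * f(b) >= 0:
            raise ValueError("f(a) and f(b) must have opposite signs")
        for _ in range(N):
            c = 0.5 * (a + b)
            if f(c) == 0.0:          # exact hit (rare in floating point)
                return c, c
            if f(a) * f(c) < 0:      # root lies in [a, c]
                b = c
            else:                    # root lies in [c, b]
                a = c
        return a, b                  # midpoint of the final interval approximates the root

    a, b = bisect(lambda x: math.cos(x) - x, 0.0, 1.0, 30)
    print(0.5 * (a + b))             # 0.7390851..., as in Prob. 19(b)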
1. Lo(x) = -2.r + 19, L 1 (x) = 2x - 18, Pl(X) = 0.1082x + 1.2234, Pl(9.4) = 2.2405 3.0.9971,0.9943.0.9915 (0.9916 4D), 0.9861 (0.9862 4D), 0.9835, 0.9809 5. P2(X) = -0.44304x 2 + 1.30906x - 0.02322, P2(0.75) = 0.70929 7. P2(X) = -0.1434x2 + 1.0895x, P2(0.5) = 0.5089, P2(1.5) = 1.3116 9. La = -t(x - I)(x - 2)(x - 3), Ll = !x(x - 2)(x - 3), ~ = -~x(x - 1)(x - 3), ~ = tx(x - I)(x - 2); P3(X) = 1 + 0.039740x - 0.335187x2 + 0.060645x 3; P3(O·5) = 0.943654 (6S-exact 0.938470), P3(1.5) = 0.510116 (0.511828), P3(2.5) = -0.047993 (-0.048384)
13. P2(X)
=
0.9461x - 0.2868x(x - 1)/2
= -0.1434x2 +
1.0895x
15.0.722,0.786 17. 8f1/2 = 0.057839, 8f3/2 = 0.069704. etc.
Problem Set 19.4, page 815 9. [-1.39(x - 5)2 + 0.58(x - 5)3]" = 0.004 at x = 5.8 (due to roundoff; should be 0). 11. 1 - ~X2 + ~X4 13.4 - 12x2 - 8x 3 ,4 - 12x2 + 8x 3 . Yes 15. I - x 2 , -2(x - l) - (x - 1)2 + 2(x - 1)3, -1 + 2(x - 2) + 5(x - 2)2 - 6(x - 2)3 17. Curvature f"/(I + j'2)3/2 = f" if If'l is small 19. Use that the third derivative of a cubic polynomial is constant, so that gill is piecewise constant, hence constant throughout under the present assumption. Now integrate three times.
Problem Set 19.5, page 828 1. 0.747131 3. 0.69377 (5S-exact 0.69315) 7.0.894 (3S-exact 0.908) 5. 1.566 (4S-exact 1.557) 9.1h /2 + Eh/2 = 1.55963 - 0.00221 = 1.55742 (6S-exact 1.55741) 11. J h /2 + Eh/2 = 0.90491 + 0.00349 = 0.90840 (5S-exact 0.90842) 13. 0.94508, 0.94583 (5S-exact 0.94608) 15.0.94614588, 0.94608693 (8S-exact 0.94608307) 17. 0.946083 (6S-exact) 19. 0.9774586 (7S-exact 0.9774377) 21. x - 2 = t, 1.098609 (7S-exact 1.098612) 23. x = !U + 1),0.7468241268 (lOS-exact 0.7468241330) 25. (a) M2 = 2. M2* = ~, IKM21 = 2/(J2n 2). 11 = 183. (b) iv) = 24/x 5 , 2m = 14 27. 0.08, 0.32, 0.176, 0.256 (exact) 29.5(0.1040 - !. 0.1760 + i· 0.1344 - ~. 0.0384) = 0.256
r
Chapter 19 Review Questions and Problems, page 830 17.4.266,4.38, 6.0, impossible 19.49.980,0.020; 49.980, 0.020008 21. 17.5565 ~ s ~ 17.5675 23. The same as that of a. 25. -0.2, -0.20032, -0.200323 27.3,2.822785,2.801665,2.801386, VlOI386 29.2.95647.2.96087 31. 0.26, M2 = 6, M2* = 0, -0.02 ~ E ~ 0, 0.24 ~ a ~ 0.26 33. 1.001005, -0.001476 ~ E ~ 0
Problem Set 20.1, page 839 1. Xl = -2.4, X2 = 5.3 5. Xl = 2, X2 = 1
3. No solutlOn 7. Xl = 6.78, x2 = -11.3,
X3
= 15.82
9. Xl = 0, x 2 = T1 arbitrary, X3 = 5f1 + 10 11. Xl = fl , x 2 = t2, both arbitrary, X3 = 1.25fl 13. Xl = 1.5, X2 = -3.5, X3 = 4.5, X4 = -2.5
-
2.25t2
Problem Set 20.3, page 850 Exact 21.5, 0, -13.8 5. Exact 2, I, 4 7. Exact 0.5, 0.5, 0.5 (3 )T = [0.49982 0.50001 0.50002], (b) X (3)T = [0.50333 0.49985 0.49968] 6, 15, 46, 96 steps; spectral radius 0.09, 0.35, 0.72, 0.85, approximately [1.99934 1.00043 3.99684]T (Jacobi, step 5): [2.00004 0.998059 4.00072jT (Gauss-Seidel) 17. v'306 = 17.49, 12, 12 19. V 18k2 = 4.24Ikl, 41kl, 41kl 3. 9. 11. 13.
(a) X
Problem Set 20.4, page 858
1. 5. 7. 13. 17. 21.
[t -]
~] -0.1 1]
12, v'62 = 7.87,6, 1.9, V1.35 = 1.16, 1. lO.3 6, \''6, 1. [1 1 K = 100· 100 46 ~ 6· 17 or 7 . 17 [-0.6 2.8]T
0.5
3.14, V56 = 7.07,4, [-1 I ~ -~] 1.0] 11. II AliI = 17, II A-I 111 = 17, K = 289 15. K = 1.2' 1~~ = 1.469 19. [0 11T, [1 -OAIT,289 23.27,748,28375,943656,29070279
Problem Set 20.5, page 862
1. 5. 11. 13.
3. 8.95 − 0.388x 5. −11.4 + 5.4x; s = −675 + 90t, vav = 90 km/h 7. 4 − 0.75x − 0.125x² 9. 5.248 + 1.543x, 3.900 + 0.5321x + 2.021x² 11. −2.448 + 16.23x, −9.114 + 13.73x + 2.500x² 13. −2.270 + 1.466x − 1.778x² + 2.852x³
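The fitted polynomials above come from the least squares principle of Sec. 20.5 (normal equations). A minimal Python sketch of the computation for a straight line; the data values below are a made-up example, not one of the data sets stated with the problems.

    import numpy as np

    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])   # hypothetical data
    y = np.array([1.1, 2.9, 5.2, 6.9, 9.1])

    # Normal equations for y = b0 + b1*x:
    # [[n, sum x], [sum x, sum x^2]] [b0, b1]^T = [sum y, sum xy]^T
    A = np.array([[len(x), x.sum()], [x.sum(), (x**2).sum()]])
    b = np.array([y.sum(), (x * y).sum()])
    b0, b1 = np.linalg.solve(A, b)
    print(b0, b1)        # same coefficients as np.polyfit(x, y, 1)[::-1]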
Problem Set 20.7, page 871
1. 5 ~ A ~ 9 3.5,0, 7; radii 4, 6, 6 5.IA - 4il ~ v'2 + 0.1. IAI ~ 0.1, IA - 9il ~ v'2 7.111 = 100,122 = 133 = I 9. They lie in the intervals with endpoints ajj :::t::: (11 - 1)10- 6 . (Why?) 11. 0 lies in no Gerschgorin disk, by (3) with >; hence det A = Al ... An 13. peA) ~ Row sum norm II A
IIx = max L J
15.
Vi53
=
12.37
17.
lajkl = max (Iaiil
k
Vl22 = 11.05
=1=
O.
+ GerschgOlin radius)
J
19. 6 ~ A ~ 10. 8 ~ A ~ 8
Problem Set 20.8, page 875
1. q = 4, 4.493, 4.4999; |ε| ≤ 1.5, 0.1849, 0.0206 3. q = 8, 8.1846, 8.2252; |ε| ≤ 1, 0.4769, 0.2200 5. q = 4, 4.786, 4.917; |ε| ≤ 1.63, 0.619, 0.399 7. q = 5.5, 5.5738, 5.6018; |ε| ≤ 0.5, 0.3115, 0.1899; eigenvalues (4S) 1.697, 3.382, 5.303, 5.618 9. y = Ax = λx, yᵀx = λxᵀx, yᵀy = λ²xᵀx, ε² ≤ yᵀy/xᵀx − (yᵀx/xᵀx)² = λ² − λ² = 0 11. q = 1, …, −2.8993 approximates −3 (an eigenvalue of the given matrix), |ε| ≤ 1.633, …, 0.7024 (Step 8)
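Prob. 9 is why the power method's error bound is computable from y = Ax alone: the bound collapses to 0 exactly when x is an eigenvector. A small numeric illustration; the symmetric matrix and starting vector are arbitrary test choices, not taken from the text.

    import numpy as np

    A = np.array([[4.0, 1.0], [1.0, 3.0]])     # hypothetical symmetric matrix
    x = np.array([1.0, 1.0])
    for _ in range(3):
        y = A @ x
        q = (y @ x) / (x @ x)                      # Rayleigh quotient q, approximates an eigenvalue
        delta = np.sqrt((y @ y) / (x @ x) - q**2)  # error bound of Prob. 9; shrinks each step
        print(q, delta)
        x = y / np.linalg.norm(y)                  # scale to avoid overflow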
Problem Set 20.9, page 882 [
1.
3~
- ~.!lO2776
-UlO1776 6.730769 l.H-l6154
L~61~1 ~[
1.769230
0.9g0000
-0.-1--1-1814
-O.-1-·HS1-1-
0.H70l64
U37iOO:]
0
0.371803
0.4H9H36
5. Eigenvalues 8, 3, 1
-2.50867
r 5.M516
01 r 7.45139 0.374953 , -1.56325
5.307219
-2.5086~
0.374953
r 7.91494
1.04762
0.0983071
1.00446
03124691 :1.000482
3.08458 0.0312469
r18.3171
IB3786 o0.360275 1 , r 0.396511
0.881767
~.881767
8.29042 0.360275
r18.3910
1.39250
0.396511 8.24727
0
0.0600924
:O6~241· 1.37414
0.177669
~.177669
9.
1
0098307
3.544142
-0.646602
-~.646602
7.
0
-1.56325
8.23540
:01lm141
0.0100214
1.37363
7an24
0.0571287
~.0571287
4.00088
r 7OO322
1 r7~n98 0.0249333 , 0.0326363
0.0326363
o
0.0249333
0.996875
4.U0034
0
0.00621221
:=21221 0.996681
0.0186419
~.0186419
r
4.00011
:.001547821
U.00154782
0.Y96669
Chapter 20 Review Questions and Problems, page 883 17.
r4
-I
2{
19.
L6
-3
I{
21. All nonzero entries of the factors are 1.
23.
2.8193
-1.5904
-1.5904
1.2048
-0.0482
-0.0241
[ 27. 15,
,189,
8
=~:::~l
,/21, 4
31. 14,
,178, 7
37. 11.5 . 4.4578 = 51.2651
35.9
39. Y = 1.98
1
0.1205 29.7,
33. 6
25. Exact [-2
(4D-values)
+ 0.98x
41. Centers 1. 1, 1. radii 2.5. 1. 2.5 (A = 2.944.0.028 ::':: 0.290i, 3D) 43. Centers 5, 6, 8: radii 2,1, I, (A = 4.1864. 6.4707, 8.3429. 5S) -2.23607
[ 45.
15 -~.23607
o
5.8
-3.1
-3.1
6.7
J
,Step3:
[
9.M913
-1.06216 0
-1.06216 4.28682 -0.00308
-:00308J 0.26345
Problem Set 21.1, page 897 1. Y = eX, 0.0382, 0.1245 (elTor of X5, .\"10) 3. Y = x - tanh x (set y - x = til, 0.009292, 0.0188465 (elTor of .\"5, XIO) 5. y = eX, 0.001275, 0.004200 (elTor of X5' XIO) 7. Y = 11(1 - x 2 /2), 0.00029, 0.01187 (elTor of X5' xw) 9. y = 11(1 - x 212), 0.03547. 0.28715 (elTor of X5' XIO) 11. Y = 11(1 - x 2 /2); error -10- 8 , -4' 10-8 , . . " -6' 10-7 , + 10- 5; about 1.3' 10-5 by (10) 13.y = xe x ; eITor'l~ (for x = L"" 3) 19,46,85,139,213,315,454,640,889,1219 15. Y = 3 cos x - 2 cos 2 x; error' 107 : 0.18, 0.74, 1.73,3.28,5.59,9.04, 14.33,22.77, 36.80, 61.42 17. Y = 1I(x5 + 1), 0.000307, -0.000259 (error of X5' XIO) 19. The elTors are for E.-c. 0.02000, 0.06287. 0.05076. for Improved E.-C. -0.000455, 0.012086,0.009601, for RK 0.0000012,0.000016,0.000536.
Problem Set 21.2, page 901 3. y = e- O. 1x2 ; elTors 10-6 to 6· 10-6 5. y = tan x; .\'4' .. " )'10 (error' 105): 0.422798 (-0.48),0.546315 (-1.2),0.684161 (-2.4),0.842332 (-4.4),1.029714 (-7.5),1.260288 (-13),1.557626 (-22) 7. RK-elTor smaller. elTor' 105 = 0.4, 0.3, 0.2. 5.6 (for x = 0.-1-, 0.6, 0.8, 1.0) 9• .\'4 = 4.229690, Y5 = 4.556 859, )'6 = 5.360657. Y7 = 8.082 563 11. ElTors between -6 . 10-7 and + 3 . 10- 7 . Solution eX - x - I 13. Errors' 105 from x = 0.3 to 0.7: -5, -II, -19, -31, -47 15. (a) 0, 0.02, 0.0884, 0.215 848,)'4 = 0.417818'.\'5 = 0.708887 (poor). (b) By 30-50%
Problem Set 21.3, page 908 3. )'1 = eX, )'2 = -ex. elTors range from ±0.02 to ±0.23, monotone. 5. )'~ = Y2, )'~ = -4)'1' )' = Yl = I, 0.84, 0.52, 0.0656, -0.4720; y = cos 2x x x 7')'1 = 4e- sin x')'2 = 4e- cos x; elTors from 0 to about ±O.I 9. ElTors smaller by about a factor 104 11. Y = 0.198669,0.389494,0.565220,0.719632,0.847790; y' = 0.980132,0.922062,0.830020,0.709991,0.568572 13. Y1 = e- 3 .1: - e- 5 :r')'2 = e- 3 .1: + e- 5 .1:;)'1 = 0.1341. 0.1807, 0.1832, 0.1657, 0.1409;)'2 = 1.348,0.9170.0.6300,0.4368.0.3054 17. You gel the exact solution, except for a roundoff elTor [e.g., Yl = 2.761 608, y(0.2) = 2.7616 (exact), etc.]. Why? 19. Y = 0.198669,0.389494,0.565220,0.719631. 0.847789; y' = 0.980132,0.922061. 0.830019, 0.709988, 0.568568
Problem Set 21.4, page 916 3. 105, 155, 105, 115; Step 5: 104.94, 154.97, 104.97, 114.98 5.0.108253, 0.108253,0.324760,0.324760; Step 10: 0.108538, 0.108396, 0.324902, 0.324831
7. 0, O. O. O. All equipotentia11ines meet at the comers (why?). Step 5: 0.29298. 0.14649,0.14649.0.073245 9. - 3un + U12 = -200, Un - 3U12 = -100 11. U12 = U32 = 31.25, U21 = U23 = 18.75, ujk = 25 at the others 13. U21 = U23 = 0.25, U12 = U32 = -0.25, Ujk = 0 else 15. (a) Un = -U12 = -66. (b) Reduce to 4 equations by symmetry. Un = U31 = -U15 = - 1I35 = -92.92, U21 = -U25 = -87.45, U12 = U 32 = -U 14 = -U 34 = -64.22, U 2 2 = -U24 = -53.98, U13
17.
=
U23
=
U33 =
0
\13,
Un = U21 = 0.0849, U12 = U22 = 0.3170. (0.1083, 0.3248 are 4S-values of the solution of the linear system of the problem.)
Problem Set 21.5, page 921 5. Un = 0.766. U21 = 1.109. U12 = 1.957. U22 = 3.293 7. A as in Example I, right sides -2, -2, -2, -2. Solution Un = U21 = 1.14286, U 1 2 = U22 = 1.42857 11. -4un + U21 + U12 = - 3. Un - 4U21 + U22 = -12, Un - 4U12 + U22 = 0, 2U21 + 2U12 - 12u22 = - 14. Un = U22 = 2, U21 = 4. U12 = 1. Here -14/3 = -~(1 + 2.5) with 4/3 from the stencil. 13. b = [-380 -190, -190, O]T; Un = 140, U21 = U 1 2 = 90, U22 = 30
Problem Set 21.6, page 927 5.0.1636.0.2545
(t
= 0.04. x = 0.2,0.4).0.1074.0.1752 (t = 0.08),0.0735.0.1187
(t = 0.12),0.0498,0.0807 (t
= 0.16),0.0339,0.0548
(t
= 0.2; exact 0.0331,0.0535)
7. Substantially less accurate, 0.15, 0.25 (1 = 0.04),0.100,0.163 (t = 0.08) 9. Step 5 gives 0,0.06279,0.09336,0.08364,0.04707, O. 11. Step 2: 0 (exact 0),0.0453 (0.0422),0.0672 (0.0658), 0.0671 (0.0628),0.0394 (0.0373), 0 (0) 13.0.1018,0.1673,0.1673,0.1018 (t = 0.04),0.0219,0.0355, ... (t = 0.20) 15.0.3301,0.5706.0.4522.0.2380 (t = 0.04).0.06538.0.10604,0.10565.0.6543 (t = 0.20)
Problem Set 21.7, page 930 1. For x = 0.2, 0.4 we obtain 0.012, 0.02 (t = 0.2), 0.004, 0.008 (t = 0.4), -0.004, -0.008 (t = 0.6). etc. 3. u(x, 1) = 0, -0.05, -0.10, -0.15, -0.075,0 5.0.190,0.308,0.308,0.190 (0.178, 0.288, 0.288, 0.178 exact to 3D) 7.0,0.354.0.766, 1.271, 1.679. 1.834.... (t = 0.1); 0.0.575.0.935, 1.135, 1.296. 1.357, ... (t = 0.2)
Chapter 21 Review Questions and Problems, page 930 17.
y = tan x; 0 (0),0.10050 (-0.00017). 0.20304 (-0.00033), 0.30981 (-0.00047), 0.42341 (-0.00062), 0.54702 (-0.00072)
19. 0.1 003349 (0.8 . 10-7 ) 0.2027099 (I.6 . 10- 7 ), 0.3093360 (2.1 . 10-7 ). 0.4227930 (2.3· 10-7 ),0.5463023 (1.8' 10-7 ) 25.y(0.4) = 1.822798,.\'(0.5) = 2.046315,)'(0.6) = 2.284161,)'(0.7) = 2.542332, y(0.8) = 2.829714, y(0.9) = 3.160288, .v(1.0) = 3.557626 27'.\"1 = 3e- 9x,.I"2 = -5e- 9:r, [1.23251 -2.05419J, [0.506362 -0.843937],···, [0.035113 -0.058522] 29. 1.96, 7.86, 29.46 31. II(P l l ) = 1I(P31) = 270. U(P21 ) = U(P13 ) = U(P23 ) = U(P33 ) = 30, U(P 12 ) = U(P32 ) = 90, 1I(P22 ) = 60 35.0.06279,0.09336,0.08364,0.04707 37.0, -0.352, -0.153,0.153,0.352,0 if t = 0.12 and 0,0.344,0.166, -0.166, -0.344, 0 if t = 0.24 39.0.010956.0.017720.0.017747,0.010964 if t = 0.2
Problem Set 22.1, page 939 3. f = 3(-"1 - 2)2 + 2(X2 + 4)2 - 44. Step 3: [2.0055 - 3.9Y75]T 5. f = 0.5(x1 - 1)2 + 0.7(X2 + 3)2 - 5.8, Step 3: rO.99406 -3.0015]T 7. f = 0.2(X1 - 0.2)2 + X2 2 - 0.008. Step 3: [0.493 -O.Oll]T, Step 6: [0.203 0.004]T
Problem Set 22.2, page 943 1. X3, X4 unused time on MI' M 2. respectively 3. No 11. fmax = f(O. 5) = 10 13. f max = f(9, 6) = 36 15. fmin = f(3.5, 2.5) = - 30 17. X1/3 + x2/2 ~ 100, x1/3 + -"2/6 ~ 80, f = 150X1 + fmax = f(210, 60) = 37500 19. 0.5X1 + 0.75x2 ~ 45 (copper), 0.5X1 + 0.25x2 ~ 30, f = 120x1 + 100x2, fmax = f(45, 30) = 8400
Problem Set 22.3, page 946 1. f(12011 I, flOIl 1) = 48011 I
2100 200) 3. f ( - 3 - ' 2/3 = 78000
5. Matrices with Rows 2 and 3 and Columns 4 and 5 interchanged 7. f(O, [0) = -10 9. f(5, 4, 6) = 478
Problem Set 22.4, page 952 1. f(-l-. -1-) = 72 7. f(l, 1, 0) = 12
3. f(10, 30) = 50 f(!, 0, ~) = 3
5. f(lO, 5) = 5500
9.
Chapter 22 Review Questions and Problems, page 952
n. Step 5: [0.353 -0.028]T. Slower 13. Of course! Step 5: [-1.003 1.897]T 21. f(2, -1-) = 100 23. f(3, 6) = -54
25. /(50, 100) = 150
Problem Set 23.1, page 958
o o
9.
0
o o
1
15.
0
o o o
o 0
CD-------®
11.
1
f:
0
15'm 3
]
0
0
0
0
0
0
0
0
0
0
13. 0
4
Edge
o 21.
><
2
>
3
~
0
25.
0
E
23.
0
4
1 ><
>
0
2 3
0
Vertex 1 2 3 4
Incident Edges -eh -e2, e3, -e4 el e2, -e3 e4
Problem Set 23.2, page 962 1.4
5.4
3.5
9. The idea is to go backward. There is a VIc-I adjacent to Vk and labeled k - 1, etc. Now the only vertex labeled 0 is s. Hence A(vo) = 0 implies Vo = s, so that Vo - VI - ... - Vk-l - Vic is a path s ~ Vic that has length k. 15. No; there is no way of traveling along (3, 4) only once. 21. From Tn to 100m, 10m, 2.5m, 111 + 4.6
Problem Set 23.3, page 966 1. 3. 5. 7.
(1. 2). (1, 2), (1,4), (1, 5),
(2,4). (I, 4), (2, 4), (2. 3),
(4, (2, (3, (2.
3); 3); 4), 6),
L2 = 6. L3 = 18. L4 = 14 L2 = 2, L3 = 5, L4 = 5 (3, 5); L2 = 4, L3 = 3, L4 = 2, L5 = 8 (3. 4), (3, 5); ~ = 9, ~ = 7, L4 = 8, L5 = 4, L6 = 14
Problem Set 23.4, page 969 2 1. 1 :,
/
3
L
=
12
I / 3. 4 ,\""2
3- 5
4"\
5
L
=
10
8 / 5. 1 - 2"\ 5 3~ 6- 4 "\
7
L
= 28
L = 38 11. Yes 5-6, 15. G is connected. If G were not a tree, it would have a cycle, but this cycle would provide two paths between any pair of its vertices, contradicting the uniqueness. 19. If we add an edge (u, u) to T, then since T is connected, there is a path U ~ u in T which. together with (II, u), forms a cycle. 9. I - 3 - 4 '\.
Problem Set 23.5, page 972
1. (I, 2), (1.4), (3, 4), (4,5). L = 12 3. (I. 2). (2, 8), (8, 7), (8. 6), (6, 5), (2, 4), (4, 3), L 5. (1,4), (3,4). (2,4). (3,5), L = 20
= 40
7. (I, 2), (I, 3), (I, 4), (2, 6), (3, 5), L = 32 11. If G is a tree 13. A shortest spanning tree of the largest connected graph that contains vertex 1 Problem Set 23.6, page 978
1. 3. 5. 7.
I - 2 - 5, Ilf = 2; 1 - 4 - 2 - 5, Ilf = 2, etc. I - 2 - 4 - 6 . .1f = 2; I - 2 - 3 - 5 - 6. Ilf = I, etc. f12 = 4, f13 = 1. f14 = 4. f42 = 4. f43 = 0,125 = 8, f35 = 1, f = 9 f12
= 4. f13 = 3, f24 = 4, f35 = 3, f54 = 2, f4fj = 6, f56 = 1, f = 7
~ {~5,6},28 11. {2,~ 6},50 13. I - 2 - 3 - 7, lJ.f = 2; I - 4 - 5 - 6 - 7, Ilf = 1; 1 - 2 - 3 - 6 - 7, Ilf = 1; fmax = 14 15. {3, 5, 7}. 22 17. S = {I, 41. cap (S. 19. If fii < Cij as well as fii > 0
n=
6 + 8
= 14
Problem Set 23.7, page 982 3. 5. 7. 9.
(2, 3) and (5. 6) 1 - 2 - 5, .:It = 2; 1 - 4 - 2 - 5, ~t = I; f = 6 + 2 + 1 = 9 1 - 2 - 4 - 6, .:It = 2: 1 - 3 - 5 - 6. Il t = 1; f = 4 + 2 + I = 7 By considering only edges with one labeled end and one unlabeled end 17. S = {I, 2,4, 51. T = {3, 6}, cap (S, n = 14 Problem Set 23.8, page 986
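The answers above list augmenting paths with their increments Δt. A compact sketch of the underlying method — flow augmentation along BFS-shortest paths (the Edmonds–Karp form of Ford–Fulkerson); the capacity matrix is a made-up example, not a network from the text.

    from collections import deque

    def max_flow(cap, s, t):
        """Repeatedly augment along a shortest s -> t path in the residual network."""
        n = len(cap)
        flow = [[0] * n for _ in range(n)]
        total = 0
        while True:
            parent = [-1] * n
            parent[s] = s
            queue = deque([s])
            while queue and parent[t] == -1:          # BFS for an augmenting path
                v = queue.popleft()
                for w in range(n):
                    if parent[w] == -1 and cap[v][w] - flow[v][w] > 0:
                        parent[w] = v
                        queue.append(w)
            if parent[t] == -1:                       # no augmenting path left: done
                return total
            delta, w = float("inf"), t                # bottleneck Delta_t along the path
            while w != s:
                v = parent[w]
                delta = min(delta, cap[v][w] - flow[v][w])
                w = v
            w = t                                     # augment by delta
            while w != s:
                v = parent[w]
                flow[v][w] += delta
                flow[w][v] -= delta
                w = v
            total += delta

    cap = [[0, 4, 3, 0],      # hypothetical capacities, vertices 0..3
           [0, 0, 2, 5],
           [0, 0, 0, 3],
           [0, 0, 0, 0]]
    print(max_flow(cap, 0, 3))   # 7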
1. No 3. No 5. Yes, S = { I, 4, 5, 8} 7. Yes; a graph is not bipartite if it has a nonbipartite subgraph. 9.1 - 2 - 3 - 5 11. (1, 5), (2, 3) by inspection. The augmenting path I - 2 - 3 - 5 gives I - 2 - 3 - 5, that is, (\, 2), (3, 5). 13. (1,4), (2. 3). (5, 7) by inspection. Or (1, 2), (3, 4), (5, 7) by the use of the path 1 - 2 - 3 - 4. 15. 3
25. K3
19. 3
23. No; K5 is not planar.
Chapter 23 Review Questions and Problems, page 987 0
0
13.
r~
1
]
0 0
0
0
15. 0
0
0
0
o o o
CD---0
o
o o o
17.
21.
o
1
19.\
o o
CD
o 23.4
Incident Edges
Ve11ex
/
e2, -e3
2 3
-eb e3 'eb -e2
25.4 29. I - 4 - 3 - 2, L
=
27. L2 = 10. L3 33. f = 7
16
=
15, L4
=
13
Problem Set 24.1, page 996 1. qL
19, qM = 20, qu = 20.5 5. qL = 69.7, qM = 70.5, qu = 71.2 9. qL = 399, qM = 401, qu = 401 13..r = 70.49, s = 1.047,IQR = 1.5 17. 0 0 300 =
3.qL = 38,QM = 44,Qu = 54 7. qL = 2.3, qM = 2.4, qu = 2.45 11. x = 19.875, s = 0.835, IQR = 1.5 15. x = 400.4, s = 1.618, IQR = 2 19. 3.54, 1.29
Problem Set 24.2, page 999 1. 4 outcomes: HH, HT, TH, TT (H = Head, T = Tail) 3.62 = 36 outcomes (1, 1), (1, 2), .. " (6, 6) 5. Infinitely many outcomes S, SCS, ScScS, ... (S = "Six") 7. The space of ordered triples of nonnegative numbers 9. The space of ordered pairs of numbers 11. Yes 13. E = IS, scs, SCSCS}, E" = {SCSCSCS, ScScScScS, ... } (S = "Six")
Problem Set 24.3, page 1005 1. (a) 0.9 3 = 72.9%, (b) 190~ • ~~. ~~ = 72.65% 3 490. 489 • 488 • 487 • 486 - 90 3501. • 500
499
498
497
496 -
5. 1 - 2~ = 0.96 9. P(MMM)
+
P(MMFM)
.
/0
7. I - 0.75 2 = 0.4375 < 0.5
+ P(MFMM) + P(FMMM) =
~
+ 3 . 1~
=
1~
11. 3~ + ~~ - 3~ = ~~ by Theorem 3, or by counting outcomes 13. 0.08 + 0.04 - 0.08 . 0.04 = 11.68% 17. 1 - 0.97 4 = 11.5% 15.0.95 4 = 81.5%
Problem Set 24.4, page 1010 5. (2~)
3. In 40320 ways 7.210,70. 112.28 11. (~~) = 635013559600 15.676000
= 1140 = 1260. AIlS. 111260
9. 9!/(2!3!4!) 13. 1184, 5121
Problem Set 24.5, page 1015 1. k = 1/55 by (6)
3. k = II8 by (10) 7. 1 - P(X ~ 3) = 0.5
5. No because of (6)
9. P(X
> 1200)
f
=
2
6[0.25 -
(x -
1.5)2] dx = 0.896.
AilS.
0.8963 = 72%
1.2
11. k 17. X
= 2.5; 50% > b, X ~ b.
13. k = 1.1565; 26.9% X < c. X
~
c. etc.
Problem Set 24.6, page 1019 1. 2/3, 1118
3.3.5.2.917 7. $643.50
5.4, 16/3 9. JL = lie = 25; P 13. 750, 1, 0.002
=
20.2%
11.~, 2~'
(X -
~)V20
15. 15c - 500c 3 = 0.97. c = O.ms55
Problem Set 24.7, page 1025 1. 0.0625, 0.25, 0.9375, 0.9375 3.64% 5.0.265 7. f(x) = OS"e- o.5 /x!, f(O) + f(l) = e- O.5 ( 1.0 + 0.5) = 0.91. Am. 9% 9. I - e- O.2 = 18% 11. 0.99100 = 36.6% 13. ~~~, ~~~, 2~~' :di6
Problem Set 24.8, page 1031 1. 0.1587, 0.6306, 0.5, 0.4950 5.16% 9. About 23 13. t = 1084 hours
3. 17.29, 10.71, 19.152 7. 31.1 %, 95.5% 11. About 58st
Problem Set 24.9, page 1040
1. 1/8, 3/16, 3/8
3. 2/9, 2/9, 1/2
5. f2(Y) = 11(/32 - ll'2) if ll'2 < Y < /32 and 0 elsewhere 7. 27.45 mm, 0.38 mm 9. 25.26 cm, 0.0078 cm
= O.le- O.lx if X> 0,
f2(Y)
= O.le- Oly if Y > 0, 36.8%
17. No
Chapter 24 Review Questions and Problems, page 1041 23. x = 22.89, s = 1.028. 21. QL = 22.3, QM = 23.3, Qu = 23.5 25. H, TH, TTH, etc. 27. f(O) = 0.80816. f(l) = 0.18367. f(2) = 0.00816 29. Always B !: A U B. If also A !: B, then B = A U B, etc. 31. 7/3, 8/9 33. 118.019, 1.98, 1.65% 35.0, 2 37. JL = 100/30 39. 16%, 2.3% (see Fig. 520 in Sec. 24.8)
S2
= 1.056
Problem Set 25.2, page 1048 3. 1 = pk(1 - p)n-k, 5. 11120 7. 1 = f(x), aOn l)/ap 9. it = x 13. = 1
p = kin, k = number of Sllccesses in n trials = lip -
e
p
= 0, = 1/x 11. = nl'i. Xj = l!x 15. Variability larger than perhaps expected
(x - 1)10 - p)
e
Problem Set 25.3, page 1057 1. CONFo.95 [37.47 ~ JL ~ 43.87} 3. Shorter by a factor v'2 5.4, 16 7. Cf. Example 2. n = 166 11. CONFo.99 [63.71 ::::; JL ~ 66.29} 9. CONFo.99 [20.07 ~ JL ~ 20.33} 13. c = 1.96, x = 87, S2 = 87' 413/500 = 71.86, k = cslVn = 0.743, CONFo.95 {86 ~ JL ~ 88}, CONF095 (0.17 ~ p ~ 0.18} 15. CONFo.95 (0.00045 ~ a 2 ~ 0.00l31} 17. CONFO.95 [0.73 ::::; a 2 ~ 5.19} 19. CONFO.95 [23 ~ a 2 ~ 548}. Hence a larger sample would be desirable. 21. Normal distributions, means -27,81, 133, variances 16, 144,400 23. Z = X + Y is normal with mean 105 and variance 1.25. Ans. P(l04 ~ Z ~ 106) = 63%
Problem Set 25.4, page 1067 t = V7(0.286 - 0)/4.31 = 0.18 < c = 1.94; do not reject the hypothesis. c = 6090 > 6019; do nol reject the hypothesis. ifln = I, c = 28.36: do not reject the hypothesis. JL < 28.76 or JL > 31.24 Alternative JL =1= 1000, t = v'2O (996 - 1000)/5 = -3.58 < c = - 2.09 (Table A9, 19 degrees of freedom). Reject the hypothesis JL = 1000 g. 11. Test JL = 0 against JL =1= O. t = 2.11 < c = 2.36 (7 degrees of freedom). Do not reject the hypothesis. 13. ll' = 5%, c = 16.92 > 9.0.5 2 /0.4 2 = 14.06; do not reject hypothesis. 15. to = \1'10' 9·17119 (21.8 - 20.2)1\1'9.0.62 + 8.0.5 2 = 6.3 > c = 1.74 (17 degrees of freedom). Reject the hypothesis and assert that B is better.
1. 3. 5. 7. 9.
ASS
App. 2
17.
= 50/30 = 1.67 < c = 2.59 [(9, 15) degrees of freedom]: do not reject the hypothesis.
Vo
Problem Set 25.5, page 1071 1. LCL = I - 2.58 . 0.03/v6 = 0.968, VCL = 1.032 3.11 = 10 5. Choose 4 times the original sample size (why?). 7. 2.58VO.024/\12 = 0.283. VCL = 27.783, LCL = 27.217 11. In 30% (5%) of the cases, approximately 13. VCL = Ill' + 3Vllp(\ - p), CL = Ill', LCL = "l' - 3Vl1p(1 - p) 15. CL = JL = 2.5, VCL = JL + 3~ = 7.2, LCL = JL - 3~ is negative in (b) and we set LCL = O. Problem Set 25.6, page 1076 1. 0.9825, 0.9384, 0.4060 5. peA; e) = e- 30H(1 + 308) 7. P(A; 8) = e- 50 (J 11. (l - 8)5, (l - 8)5 + 5e(l - 8)4
15. <1>«9 - 12 + ~)/V12(1 - 0.12)) = 17. (l - ~)3 + 3 . ~(l - ~)2 = ~
3.0.8187,0.6703.0.1353 9. 19.5%, 14.7% 13. Because /I is finite 0.22 (if c = 9)
Problem Set 25.7, page 1079 1. X02 = (30 - 50)2/50 + (70 - 50)2150 = 16 > c = 3.84: no 3.41 5. X02 = 2.33 < c = 11.07. Yes 7. ej = I1Pj = 370/5 = 74. X02 = 984174 = 13.3. c = 9.49. Reject the hypothesis. 9. X02 = I < 3.84; yes 13. Combining the results for x = lo, II, 12, we have K - r - 1 = 9 (r = I since we estimated the mean. 1~ci:f = 3.87). Xo2= 12.98 < c = 16.92. Do not reject. 15. X02 = 49/20 + 49/60 = 3.27 < c = 3.84 (1 degree of freedom, a = 5%), which supports the claim. 17. 42 even digits, accept. Problem Set 25.8, page 1082 3. (~l8(l + 18 + 153 + 816) = 0.0038 5. Hypothesis: A and B are equally good. Then the probability of at least 7 trials favorable to A is ~8 + 8 . ~8 = 3.5%. Reject the hypothesis. 7. Hypothesis JL = O. Alternative JL > 0, .r = 1.58, t = \liD. 1.58/1.23 = 4.06 > c = 1.83 (a = 5%). Hypothesis rejected. 9. x = 9.67, s = 11.87, to = 9.67/(11.871'\115) = 3.15 > c = 1.76 (a = 5%). Hypothesis rejected. 11. Consider .'J = Xj - Po. 13. peT ~ 2) = 0.1% from Table A12. Reject.
App. 2
A59
Answers to Odd-Numbered Problems
15. P(T ~ 15) = 10.8%. Do not reject. 17. P{T ~ 2) = 2.8%. Reject.
Problem Set 25.9, page 1091
1. Y = 1.9 + x 3. y = 6.7407 + 3.068x 5. y = 4 + 4.8x. 172 ft 7. y = -1146 + 4.32x 9. y = 0.5932 + 0.1l38x, R = 1/0.1138 11. qo = 76, K = 2.36V76/(7· 944) = 0.253, CONFo.95 { -1.58 ~ Kl ~ -1.06} 13. 3sx 2 = 500, 3sxy = 33.5, kl = 0.067, 3s y2 = 2.268. qo = 0.023. K = 0.02] CONFo.95 {0.046 ~ Kl ~ 0.088} Chapter 25 Review Questions and Problems, page 1092
i/
21. fl = 5.33. = 1.722 25. CONFo.99 { 19.1 ~ J.L ~ 33.7} 29. CONFo.95 { 1.373 ~ J.L ~ 1.4511 33. c
= 14.74 > 14.5; reject J.Lo.
23. It will double. 27. CONFo.95 {0.726 ~ J.L ~ 0.75]} 31. CONFo.99 {0.05 ~ u 2 ~ 10} 14.74 - 14.40) 35. cD ( •~ = 0.9842 v 0.025
37.30.14/3.8 = 7.93 < 8.25. Reject. 39. Vo = 2.5 < 6.0 [(9.4) degrees of freedom]; accept the hypothesis. 41. Decrease by a factor v'2. By a factor 2.58/1.96 = 1.32. 43.0.9953,0.9825,0.9384, etc. 45. y = 1.70 + 0.55x
;;~".·APPENDIX
II
p
',.. j
---- ....
3
n"
Auxiliary Material
A3.1
Formulas for Special Functions For tables of IlUllleric values. see Appelldix 5.
Exponential function e
eX
(Fig. 544)
= 2.71828 1828459045235360287471353
(1)
Natural logarithm (Fig. 545) In (xy) = In x
(2)
+ In y.
In (xly) = Inx - 1ny.
In x is the inverse of eX, and e ln x = x, e- ln x
= e1n nIx)
= IIx.
Logarithm of base ten 10glOx or simply log x (3)
log x = M In x,
(4) In x
=
I X M loa b"
M I M
= log e
=
0.434294481903251 82765 11289 18917
= In 10 = 2.30258509299404568401 7991454684
log x is the inverse of lOT, and I Olog x
= x,
I O-log X
=
IIx.
Sine and cosine functions (Figs. 546.547). In calculus, angles are measured in radians, so that sin x and cos x have period 27['. sin .1' is odd. sin (-x) = - sin x, and cos x is even. cos (-x) = cos x. y
I
y
5 2
o x
Fig. 544. A60
Exponential function eX
I
x
-2,
Fig. 545.
Natural logarithm In x
SEC. A3.1
Formulas for Special Functions
A61
y
y
/
x
/
~l
\. X
sin x
Fig. 546.
Fig. 547.
1°
=
0.01745 32925 19943 radian
radian
=
57° 17' 44.80625"
cos x
= 57.29577 95131 ° sin 2 x
(5)
sin (x
+ y) =
sin (x - y)
(6) cos (x
{ sin (7T
(9)
cos 2
(10)
X
=
{
cos x cos y - sin x sin y
= cos x cos y + sin x sin y cos 2x
x) = sinx.
-
~(1
cos
= cos2 X
-
sin2 x
=
~[-cos (x
cos x cos y
=
~[cos (x
sin x cos y
=
Msin (x
cos u
+
sin v
x) = -cosx
(7T -
sin 2 x
+ cos 2x),
=
+ y) +
+ y) + + y) +
~(l - cos 2x)
cos (x - y)]
em (x - y)] sin (x - y)]
u+v
u-v
2
2
u+v
u-v
u+v
u-v
2
2
= 2 sin - - - cos - - -
+ cos v = 2 cos --2- cos --2-
cos v - cos u
(14)
- cos x sin y
x= cos (x - ;) = cos ( ; - x) cos x= sin (x + ;) = sin ( ; - x)
sin u (12)
sin y
sin
sin x sin y (1 L)
+ cus x
= sin x cos y
sin 2x = 2 sin x cos x,
(8)
(13)
cos 2 X = 1
sin x cos y
+ y) =
cos (x - y)
(7)
+
A cos x
+B
sin x =
VA
2
A cos x
+B
sin x =
VA
2
= 2 sin - - - sin - - -
+ B2 cos (x ± +
B2 sin (x
0),
± 8),
tan 8 =
tan 8
=
sin 8
B
+-
cos 8 sin 8 eas 8
A
=
A -T-
B
A62
APP. 3
Auxiliary Material
y
y
5
5
)
) -Tr
/
Tr
-Tr
x
( (
\
-5
\
-5
tan x
Fig. 548.
cot x
Fig. 549.
Tangent, cotangent, secant, cosecant (Figs. 548, 549) (15) tanx
(16)
=
sinx cosx
tan (x
+ y)
=
cot x
=
tan x
+
cosx
sec x
sinx tany
=
tan (x - y) =
1 - tanxtany
cscx
cos x
=
sin x
tan x - tan y
+
1
tan x tan v
Hyperbolic functions (hyperbolic sine sinh x, etc.; Figs. 550, 551) (17)
tanh x
(18)
cosh x
(19)
+
=
sinh x coshx '
sinh x
sinh2 x
(21)
=
i(cosh 2x - 1),
=
sinh x
cosh x - sinh x
= eX,
COSh2 X
(20)
cosh x
coth x
-
sinh2 x
= I
COSh2 X
=
i(cosh 2x
y
y
4
4
2
-2
g. 550.
/
2 x
;-2
-4
-4
Fig. 551.
+
l)
\
2 x
-2
-2
sinh x (dashed) and cosh x
e- x
=
tanh x (dashed) and coth x
SEC A3.1
A63
Formulas for Special Functions
{
(22)
sinh (x ± y)
sinh x cosh y ± cosh x sinh y
=
cosh (x ± y) = cosh x cosh y ± sinh x sinh y
tanh (x ± y)
(23)
tanh x ± tanh y
= ------I ± tanh x tanh y
Gamma function (Fig. 552 and Table A2 in App. 5). The gamma function f(a) is defined by the integral (24)
(a> 0)
which is meaningful only if a> 0 (or, if we consider complex a, for those a whose real part is positive). Integration by parts gives the importantfullctional relatio1l of the gamma function, (25)
f(a
+
1)
= af(a).
From (24) we readily have r(l) = I: hence if a is a positive integer. say k. then by repeated application of (25) we obtain
+
r(k
(26)
1)
=
(k = 0, 1, .. ').
k!
This shows that the ga11l111afunction can be regarded as a generalization of the elementary factorial jilllction. [Sometimes the notation (a - L)! is used for rea), even for noninteger values of a, and the gamma function is also known as the factorial function.] By repeated application of (25) we obtain
f(a)=
rea +
1)
f(a a(a
+ +
2)
f(a + k + 1) + I )(a + 2) ... (a +
----'--
a(a
1)
nw
4
I I I I I I I I
-2
in Fig. 552.
-4
Gamma function
ex
k)
A64
APP. 3
Auxiliary Material
and we may use this relation
rea)
(27)
rea + k + I) = -------a(a + I) ... (a + k)
(a oF 0, -1, -2,· .. )
for defining the gamma function for negative a (oF -1, -2, " .), choosing for k the smallest integer such that a + k + I > O. Together with (24), this then gives a definition of rca) for all a not equal to zero or a negative integer (Fig. 552). It can be shown that the gamma function may also be represented as the limit of a product namely. by the formula (28)
rea)
n! nl> =
lim
n~!XJ
(
a a
+ I )(a + 2) ...
(a
(a oF 0, - 1, .. ').
+ 11)
From (27) or (28) we see that, for complex a, the gamma function r( a) is a meromurphic function with simple poles at a = 0, - 1, - 2, .... An approximation of the gamma function for large positive a is given by the Stirling
formula (29)
where e is the base of the natural logarithm. We finally mention the special value (30)
Incomplete gamma functions Q(a. x)
(31)
= f=e-tt
U
-
1
dt
(a> 0)
x
(32)
rca)
=
pea, x)
+
Q(a. x)
Beta function (33)
B(x. y) =
I
1 1 (] -
tX-
t)y-l
(x> 0, y > 0)
dt
o
Representation in terms of gan1ma functions: (34)
B(x. y)
=
f(x)f()')
rex. + .y)
Error function (Fig. 553 and Table A4 in App. 5) (35)
erf x = -2-
v:;;:
IXe-
t2
dt
0
7
(36)
x 3!7
+ _ ... )
SEC. A3.1
A65
Formulas for Special Functions erfx 1
0.5
-2
x
/
~
/0.5 -1
Error function
Fig. 553.
erf (x)
1, C017lple171CllTal}, error jill1ction
=
erfc x = I - erf x
(37)
=
~ 2I V'Tr
I""e -
t
2
dt
x
Fresnel integrals! (Fig. 554) x
C(x) = {cos (t o
(38) C(x)
=
-v:;;;s, S(X)
= vi'Tr/S,
2
)
Set)
dt,
=
Jo sin (t2) dt
co171plemelllary fimctiollS
!¥ -
c(x)
=
s(x)
=
C(x)
(39)
LXcos (t
=
r; - to S(x) =
\
S
2
)
dt
sin (12) dt
x
Sine integral (Fig. 555 and Table A4 in App. 5) Sitx)
(40)
=
x
J o
sin t
-~
t
dt
y
1
C(x) /
-' '- _/_' ~----4'~
~ '-_/
/
o
-
~.//--
x
Fig. 554,
Fresnel integrals
lAUGUSTIN FRESNEL (1788-1827), French physicist and mathematician. For tables ~ee Ref. [GRI].
A66
APP. 3
Auxiliary Material
Si(X~l ,,/2
-
1 O~'--~I---L--~--L-~5~-L--~--~~L-~1~~~~x
Fig. 555.
Si(:x:)
=
Sine integral
7T/2. ("(}17lplemento ry jimction
(41)
si(x)
=
7T
-
f
Si(x) =
-
2
x
.
SIll
t
- - dt
t
x
Cosine integral (Table A4 in App. 5) (42)
citx}
f
=
cc
cos t
--
dt
(x> 0)
dt
(x> 0)
t
x
Exponential integral
(43)
f
Ei(x) =
x
-t
~ t
x
Logarithmic integral (44)
A3.2
lie\")
=
x
Io
-
dt
In t
Partial Derivatives For differentiation formulas, see inside of front cover. Let ~ = f(x, y) be a real function of two independent real variables, x and y. If we keep y constant. say, y = )'1' and think of x as a variable, then f(x, )'1) depends on x alone. If the derivative of f(x, Yl) with respect to x for a value x = Xl exists. then the value of this derivative is called the partial derivative of f(x. \") u'ith respect to x at tbe point (Xl' .'"1) and is denoted by or by Other notations are and these may be used when subscripts are not used for another purpose and there is no danger of confusion.
SEC. A3.2
A67
Partial Derivatives
We thus have, by the definition of the derivative,
(1)
The partial delivative of.: x constant, say, equal to
Xl,
= f(x. y) with respect to y is defined similarly; we now keep and differentiate f(XI, y) with respect to y. Thus
(2) Other notations are fy(x}. YI) and ~y(XI' )'1)' It is clear that the values of those two partial derivatives will in general depend on the point (Xl, YI)' Hence the partial delivatives a~/ax and iJz/iJ.v at a variable point (x, y) are functions of x and y. The function az/iJx is obtained as in ordinary calculus by differentiating z = f(x, y) with respect to x. treating y as a constant, and Bz/By is obtained by differentiating z with respect to y, treating x as a constant. E X AMP L E 1
Let:::
2
= I(x. y) = x y
+ x sin y. Then
iiI
~
ilx
= 2n
+
~in'·.
.
ilj 2 = x ily
~
+ \. COS
Y.
.
•
The partial derivatives iJ:;:/i)x and a~/ay of a function;: = f(x, y) have a very simple geometric interpretation. The function ;:: = f(x, y) can be represented by a surface in space. The equation y = Yl then represents a vertical plane intersecting the surface in a curve. and the partial derivative a::Ji)x at a point (Xl' Yl) is the slope of the tangent (that is, tan a where a is the angle shown in Fig. 556) to the curve. Similarly. the partial derivative iJ;:/ay at (Xl, Yl) is the slope of the tangent to the curve X = Xl on the surface z = f(x, y) at (Xl' YI)'
Fig. 556.
Geometrical interpretation of first partial derivatives
A68
APP. 3
Auxiliary Material
The partial derivatives azlax and a-::.My are called first partial derivatives or partial derivatives of first order. By differentiating these derivatives once more, we obtain the four second partial derivatives (or partial derivatives of second order)2 a2 f
a
ax 2
ax
a2f
a
ax ay
ax
a2f
a
(3)
ayax a2 f ay2
(:~ ) = fxx
(::.)
= fyx
f (a ) = fxy ay ax
a aI'
f -ay ) =fyy' (a
It can be shown that if all the derivatives concerned are continuous, then the two mixed partial derivatives are equal, so that the order of differentiation does not matter (see Ref. [GR4] in App. 1), that is,
(4) E X AMP L E 2
For the function in Example I. fxx = 2y,
f xy
= 2x + cos Y =
f yx,
fyy = -x siny.
•
By differentiating the second partial derivatives again with respect to x and y, respectively, we obtain the third partial derivatives or partial derivatives of the third order of f, etc. If we consider a function f(x, y, z) of three independent varutbles, then we have the three first partial derivatives fAx, y, z), fy{x, y, z), and fz(x, y, z). Here Ix is obtained by differentiating f with respect to x, treating both y and z as constallts. Thus, analogous to (I), we now have
etc. By differentiating derivatives of f, etc. E X AMP L E 3
f.p
f y, fz again in this fashion we obtain the second partial
Let f(x, y, z) = x 2 + y2 + Z2 + xy eZ • Then
f"
=
2x
+ )' eZ ,
fy = 2y
+ x eZ ,
fxx = 2,
fxy = f y ,' = e
fyy = 2.
fyz = fzy
=
fz = 2z
+ xy eZ ,
Z •
z
xe ,
•
2 LAUTIO]'l In the subscript notation the subscripts are written in the order in which we differentiate whereas in the "iY' notation the order is opposite. '
SEC. A3.3
A69
Sequences and Series
A3.3
Sequences and Series See also Chap. 15.
Monotone Real Sequences We call a real sequence xl, X2, increasing, that is,
. • • ,Xn , .••
a monotone sequence if it is either monotone
or monotone decreasing, that is,
We call Xl, for all n.
THE 0 REM 1
PROOF
X2, •••
a bounded sequence if there is a positive constant K such that
IXnl
(f a real sequence is bounded and monOTOne, it converges.
Let Xl> X2' . • • be a bounded monotone increasing sequence. Then its terms are smaller than some number B and, since Xl ~ Xn for all n. they lie in the interval Xl ~ X." ~ B. which will be denoted by 1o, We bisect 1o; that is, we subdivide it into two parts of equal length. If the right half (together with its endpoints) contains terms of the sequence, we denote it by 11' If it does not contain terms of the sequence, then the left half of 10 (together with its endpoints) is called 11 , This is the first step. In the second step we bisect h, select one half by the same rule, and call it 12 , and so on (see Fig. 557 on p. A 70). In this way we obtain shorter and shorter intervals 1o, 11 , 12 , • • . with the following prope11ies. Each 1m contains all In for n > 111. No term of the sequence lies to the right of 1m , and, since the sequence is monotone increasing. all Xn with Il greater than some number N lie in 1m; of course, N will depend on 111. in general. The lengths of the 1m approach zero as 111 approaches infinity. Hence there is precisely one number, call it L, that lies in all those intervals,3 and we may now easily prove that the sequence is convergent with the limit L. In fact, given an E > 0, we choose an I1l such that the length of 1m is less than E. Then L and all the Xn with n > N(m) lie in 1m' and. therefore, IXn - LI < E for all those n. This completes the proof for an increasing sequence. For a decreasing sequence the proof is the same, except for a suitable interchange of "left" and "right" in the construction of those intervals. •
3 This statement seems to be obvious, but actually it is not; it may be regarded as an axiom of the real number system in the following form. Let h. 12 , .•• be closed intervals such that each 1m contains all 1n with 11 > m. and the lengths of the 1m approach zero as III approaches intinity. Then there is precisely one real number that is contained in all those intervals. This is the so-called Cantor-Dedekind axiom, named after the German mathematicians GEORG CANTOR (1845-1918). the creamr of set theory, and RICHARD DEDEKIND (1831-1916), known for his fundamental work in nwnber theory. For further details see Ref. [GR2] in App. I. (An interval 1 is said to be closed if its two endpoints are regarded as points belonging to J. It is said to be open if the endpoints are not regarded as points of I.)
APP. 3
A70
Auxiliary Material
10
---------1 B I
I~<~--I---:: I I'~"III. :1 Fig. 557.
Proof of Theorem 1
Real Series THEOREM 2
Leibniz Test for Real Series
Let
Xl' X2, •••
be real and monotone decreasing to zero, that is,
(1)
lim
(b)
Xm
=
O.
1U----+X
Then the series witll terms of altematillg signs
converges, and for the remainder Rn after the nth term we have the estimate
(2)
PROOF
Let
Sn
so that
be the 11th partial sum of the series. Then, because of (1 a),
S2 ~ S3 ~ Sl'
Proceeding in this fashion, we conclude that (Fig. 558)
(3) which shows that the odd partial sums form a bounded monotone sequence, and so do the even partial sums. Hence. by Theorem L both sequences converge, say, lim
n_x
S2n+l
= S,
I 2
Fig. 558.
S2n
= s*.
-X 2
IE
8
lim
n_x
8
C-" =:j 4
8
3
8
1
Proof of the Leibniz test
SEC. A3.4
Grad, Div, Curl, V2 in Curvilinear Coordinates
Now. since
S271+1 -
s -
s*
S2n
= lim
=
X2n+1'
A71
we readily see that Ob) implies
lim
S271+1 -
n~x
= n_x lim (S271+1
S2n
n~x
-
= n_x lim '2n+1 =
S2")
Hence s* = s. and the series converges with the sum s. We prove the estimate (2) for the remainder. Since s" -
o.
s, it follows from (3) that
and also By subtracting
S2n
and
respectively, we obtain
S2n-1'
In these inequalities, the first expression is equal to X2,,+1' the last is equal to -X2m and the expressions between the inequality signs are the remainders R2n and R 2n - 1 . Thus the inequalities may be written
•
and we see that they imply (2). This completes the proof.
A3.4
Grad, Div, Curl, V 2 in Curvilinear Coordinates To simplify formulas we write Cartesian coordinates -' = Xl' Y = -'2' Z = -'3' We denote curvilinear coordinates by qb q2, q3' Through each point P there pass three coordinate surfaces q1 = const, q2 = COllSt. q3 = COllSt. They intersect along coordinate curves. We assume the three coordinate curves through P to be orthogonal (perpendicular to each other). We write coordinate transformations as (I)
Corresponding transformations of grad, div, curl, and V2 can all be written by using
(2) Next to Cartesian coordinates. most important are cylindrical coordinates = ;: (Fig. 559a on p. A 72) defined by
q1
= r, q2 =
e.
lJ3
(3)
Xl
=
q1
cos
q2
= r cos
e,
and spherical coordinates tiI
(4)
Xl
=
q1
cos q2 sin
q3
=
r,
= r cos
-'3 =
=
=
e, q3 = 4> (Fig. 559b) defined by4
q2
q1
sin q2 =
e,
X2
e sin 4>, q1
cos
-'2 q3 = r
=
cos
r
sin
q1
sin
q2
sin
q3
=
r sin
e sin 4>
4>.
4This is the notation used in calculus and in many other books. It is logical since in it. 8 play, the ,arne role as in polar coordinates. C-\L'TIOM Some books interchange the roles of IJ and
A72
APP. 3
Auxiliary Material z
z h~
e. z)
--
-z
e x
(a)
Y
Y
r
Cylindrical coordinates
Fig. 559.
(b)
Spherical coordinates
Special curvilinear coordinates
In addition to the general formulas for any orthogonal coordinates qh additional formulas for these important special cases.
Linear Element ds.
q2, Q3,
we shall give
In Cartesian coordinates, (Sec. 9.5).
For the q-coordinates, (5) (5')
(Cylindrical coordinates).
For polar coordinates set d-;,2
=
o.
(5")
(Spherical coordinates).
Gradient. grad f = vf = [f Xl' f X2' f X) (partial derivatives; Sec. 9.7). In the q-system, with u, v, w denoting unit vectors in the positive directions of the Ql, Q2, Q3 coordinate curves, respectively,
(6)
(6')
(6")
divF
=
V.F
1
ar
r
= Vf = - u + -
!!fad!
=""f
1 a -:-{rF1 ) r ilr
= -
af
= -u
<0
(7')
ilf
gr ad!
a,.
af aR
--'--v
(Cylindrical coordinates)
iJz
1 af 1 a! + - - -y + --w
rsin i/O
I iJF
aF
iJO
az
+ _ ~ + __3 r
iiI
+-w
,. 0<1>
(Spherical coordinates).
(Cylindrical coordinates)
SEC. A3.4
Grad, Div, Curl,
(7")
,2
A73
in Curvilinear Coordinates
div F =
v· F =
I iJ 2 -;- (,.2 F1 ) r "r
I
iJF2
rsm
UV
I
iJ
+ - . - --:;-;- + - . - -.- (sin
(Spherical coordinates).
rsm
(8')
(Cylindrical coordinates)
(8")
(Spherical coordinates).
Curl (Sec. 9.9):
]
(9)
curlF=,xF=--lz]h2 h 3
hlu
h2 v
h3W
a
a
a
aq1
aq2
aq3
hlFI
h2F2
h3 F 3
For cylindrical coordinates we have in (9) (as in the previous formulas)
and for spherical coordinates we have hI = h,. = I,
1z2 = h" = qI sin q3 = r sin
(
.. I ,, oS.
"
1
4
APPENDIX
Additional Proofs ~
..
Section 2.6, page 73
Uniqueness 1 Assuming that the problem consisting of the ODE
PROOF OF THEOREM 1
)''' + p(x)y' + q(x)y = 0
(1)
and the two initial conditions
(3) has two solutions Yllx) and difference
on the interval 1 in the theorem, we show that their
Y2(X)
is identically zero on 1: then Yl == )'2 on 1, which implies uniqueness. Since (I) is homogeneous and linear. y is a solution of that ODE on 1. and since )"2 satisfy the !>ame initial conditions, y satisfies the conditions (10)
=
0,
z(x)
=
ylxo)
y' (xo)
VI
and
= O.
We consider the function y(x)2
+ y' (X)2
and its derivative
z'
=
2yy' + 2)"')"".
From the ODE we have y"
=
,
-py - q)'.
By substituting this in the expression for z.' we obtain (11)
Now, since y and
y' are real.
IThis proof was suggested by my colleague. Prof. A. D. Ziebur. In this proof we use formula numbers that have not yet been used in Sec. 2.6.
A74
APP. 4
A75
Additional Proofs
From this and the definition of :: we obtain the two inequalities (12)
(a)
2yy'
~ )'2
From (l2b) we have 2)')" ~ obtain
+
= ;::,
y'2
-z. Together, 12)')"] ~
-2qyy'
-2y)"'
(b)
1-2qyy'l
~
~ y2
+
y'2
= z.
z. For the last term in (II) we now
Iq112y/1 ~ Iqlz.
=
Using this result a" well as -p ~ Ipl and applying (I2a) to the term 2yy' in (11), we find
Since / 2 ~
)'2
+ )"2
= .<;,
from this we obtain
z'
~ (l
+ 21pl + Iql)z
or, denoting the function in parentheses by 11,
z'
(l3a)
~ 11::
for all x on 1
Similarly. from (11) and (12) it follows that (l3b)
~ Z
+ 2Ip!;:: + Iq\::: = 11;::.
The inequalities (I3a) and (13b) are equivalent to the inequalities (14)
.<;' -
h::
~ 0,
z' +
h.<; ~
o.
Integrating factors for the two expressions on the left are and The integrals in the exponents exist because Iz is continuous. Since Fl and F2 are positive. we thus have from (14) and This means that F1z is non increasing and F 2 z is nondecreasing on I. Since ::(xo) = 0 by (10). when x ~ Xo we thus obtain
and similarly, when
x ~ Xo,
Dividing by Fl and F2 and noting that these functions are positive. we altogether have
z~ · 1·IeS that;:: = y 2+' l'2 ThisImp
"'"
o.
z~o
0 on I. Hence y "'" 0 or YI "'" Y2 on I.
for all x on I.
•
A76
APP. 4
Additional Proofs
Section 5.4, pages 184 PROOF 0 F THE 0 REM 2
Frobenius Method. Basis of Solutions. Three Cases The formula numbers in this proof are the same as in the text of Sec. 5.4. An additional formula not appearing in Sec. 5.4 will be called (A) (see below). The ODE in Theorem 2 is b(x),
where
c(x)
)''' + -x -·\' + -x )' = 0• 2
(1)
b(,)
and c(x) are analytic functions. We can write it x 2 y"
(1')
+ xb(x)),' +
dx)),
+
Co
= O.
The indicial equation of (l) is
(4)
r(r - 1)
bor
+
= O.
The roots 1"1' 1"2 of this quadratic equation determine the general form of a basis of solutions of 0). and there are three possible cases as follows. Case 1. Distinct Roots not Differing by an Integer. form
A first solution of 0) is of the
(5)
and can be determined as in the power series method. For a proof that in this case, the ODE (1) has a second independent solution of the form
(6) see Ref. [All] listed in App. 1. Case 2. Double Root. The indicial equation (4) has a double root r if and only if (b o - 1)2 - 4co = 0, and then r = !( I - b o). A first solution (7)
r
=
!(l -
boY,
can be determined as in Case I. We show that a second independent solution is of the form (x> 0).
(8)
We use the method of reduction of order (see Sec. 2.1), that is, we determine ll(X) such that )'2(X) = U(X)Yl(X) is a solution of (I). By inserting this and the derivatives
=
""
)'2
II )'1
+
I
,
2u Yl
+ UYIII
into the ODE (I') we obtain x
2("
U)'1
+')" _U .'"1 +
" + xb(ll ,)'1 + UYI) , +
UYI)
cUYI
= O.
APP. 4
A77
Additional Proofs
Since )'1 is a solution of (I r), the sum of the terms involving u is zero, and this equation reduces to
By dividing by
2 X )'1
and inserting the power series for b we obtain u
"
(
+ 2 -)'1, + -b o + . .. x
Yt
)
u
,
o.
=
Here and in the following the dots designate terms that are constant or involve positive powers of x. Now from (7) if follows that y~
+ l)alx + ... ] x [aO+alX +···]
xT - 1 ll"ao
+
(I"
T
Yt
I"
x
+
Hence the previous equation can be written (A)
U rr
+
(21": bo
+ ... )
o.
Ur =
Since r = (l - b o)/2, the term (21" + bo)/x equals I/x, and by dividing by u' we thus have u"
u
,
x
+
By integration we obtain In u' = -In x + ... , hence u' = (Ilx)e<·· .J. Expanding the exponential function in powers of x and integrating once more, we see that u is of the form
Inserting this into)'2
=
UY1,
we obtain for )'2 a representation of the form (8).
Case 3. Roots Differing by an Integer. positive integer. A first solution
We write 1"1 =
I"
and 1"2 =
I" -
P where p is a
(9) can be determined as in Cases 1 and 2. We show that a second independent solution is of the form (10)
where we may have k -=I=- 0 or k = O. As in Case 2 we set)'2 literally as in Case 2 and give Eq. (A), u"
+
(21": b
o
+ .. -)
u'
= O.
=
1t)'1.
The first steps are
APP.4
A7B
Additional Proofs
Now by elementary algebra. the coefficient b o - I of r in (4) equals minus the sum of the roots,
=
bo - I Hence 2r
+ r2) = -(r + r - p) = -2r + p.
-(r1
+ bo = p + L and division by u' gives
The further ')teps are as in Case 2. Integrating, we find
In u'
= -(p
+ 1) In x + ... ,
thus
I
u = x
-(p+ll ( ... J
e
where dots stand for some series of nonnegative integer powers of x. By expanding the exponential function as before we obtain a series of the form I
U
We integrate once more. Writing the resulting logarithmic term first, we get u = k Inx p
+ (- _1_P - ... - kp- 1 + kp+IX + ... )
Hence. by (9) we get for Y2 =
px
UYI
x
the formula
But this is of the form (10) with k = kp since rl - P = r2 and the product of the two • series involves nonnegative integer powers of x only.
Section 5.7, page 205
THEOREM
Reality of Eigenvalues
If p, q, r, alld p'
ill the Sturm-Liouville equation (I) of Sec. 5.7 are real-valued and continuous on the interval a ~ x ~ band rex) > 0 throughout that interval (or rex) < 0 throughout that interval). then all the eigenvalues of the Stunl1-Liouville problem (1). (2). Sec. 5.7. are real.
PROOF
Let A =
0'
+ i{3 be an eigenvalue of the problem and let y(x) = u(x)
be a corresponding eigenfunction: here Sec. 5.7, we have (
pu I
0',
+ iu(x)
{3. u. and u are real. Substituting this into (1),
+.lpU ')' + (q +
O'r
+ i{3r)(u + iu) = o.
APP. 4
Additional Proofs
A79
This complex equation is equivalent to the following pair of equations for the real and the imaginary parts: (pu')'
+
(q
+
ar)u - {3rv
=
0
(pU')'
+
(q
+
ar)u
+ 13m
=
O.
Multiplying the first equation by u, the second by -(3(1I 2
+
u 2 )r
-ll
and adding, we get
= u(pu')' - u(pu')'
= [(pu')lI - (pu')uJ'. The expression in brackets is continuous on a ~ x ~ b. for reasons similar to those in the proof of Theorem I, Sec. 5.7. Integrating over x from a to b. we thus obtain
Because of the boundary conditions the right side is zero; this is as in that proof. Since y is an eigenfunction, u 2 + u2 O. Since y and r are continuous and r > 0 (or r < 0) on the interval a ~ x ~ b, the integral on the left is not zero. Hence, (3 = 0, which means that A = a is reaL This completes the proof. •
*'
Section 7.7, page 308
THEOREM
Determinants
The definition of a detennillant
(7)
D
= detA =
as given in Sec. 7.7 is unambiguous, that is, it yields the same value of D no matter which rows or columns we choose in developings.
PROOF
In this proof we shall use fonnula numbers not yet used in Sec. 7.7. We shall prove first that the same value is obtained no matter which row is chosen. The proof is by induction. The statement is true for a second-order determinant. for which the developments by the first row aU{/22 + a 1 2( -(21) and by the second row a21(-a12) + {/22 a ll give the same value alla 22 - a12a21' Assuming the statement to be true for an (n - l)st-order determinant, we prove that it is true for an nth-order determinant.
ABO
APP. 4
Additional Proofs
For this purpose we expand D in terms of each of two arbitrary rows, say, the ith and the jth, and compare the results. Without loss of generality let us assume i < j.
First Expansioll.
We expand D by the ith row. A typical term in this expansion
i~
The minor Mik of aik in D is an (11 - 1)st-order determinant. By the induction hypothesis we may expand it by any row. We expand it by the row corresponding to the jth row of D. This row contains the entries ajl (/ =1= k). It is the (j - I )st row of M ik • because Mik does not contain entries of the ith row of D. and i < j. We have to distinguish between two ca<;es as follows. Case I. If I < k, then the entry ajl belongs to the lth column of Mik (see Fig. 560). Hence the term involving ajl in this expansion is
(20)
where M ikj / is the minor of (ljl in M ik . Since this minor is obtained from Mik by deleting the row and column of ajl, it is obtained from D by deleting the ith and jth rows and the kth and lth columns of D. We insert the expansions of the Mik into that of D. Then it follows from (19) and (20) that the terms of the resulting representation of D are of the form (2Ia)
(l
<
k)
where b=i+k+j+l-l. 11. If I > k, the only difference is that then ajl belongs to the (l - I )st column of because Mik does not contain entries of the kth column of D, and k < I. This causes an additional minus sign in (20). and. instead of (21 a). we therefore obtain Case
M ik ,
(2Ib)
(l
where b is the same as before.
lth
kth
kth
lth
col.
col.
col.
col.
I I
I I
--~------@---I
ith row
I
--6}-----~---I I I
I I I
Case I
Fig. 560.
-
-&-----~--I
jth row
I
---+------@--I
I
I I
I I
Case II
Cases I and II of the two expansions of D
> k)
APP. 4
A81
Additional Proofs
Second Expansion. expansion is
We now expand D at first by the jth row. A typical term in this
(22) By the induction hypothesis we may expand the minor Mjl of ajl in D by its ith row, which corresponds to the ith row of D. since j > i.
> I, the entry 0ik in that row belongs to the (k - I )st column of Mjl' because does not contain entIies of the Ith column of D. and I < k (see Fig. 560). Hence the term involving aik in this expansion is Case T. If k
Mjl
(23)
aik·
· Mjl) (cof actor 0 f aik In
= llik·
(-
l)i+(k- D M ikjZ,
where the minor M ikjl of aik in Mjl is obtained by deleting the ith and jth rows and the kth and Ith columns of D [and is, therefore, identical with M ikj /, in (20), so that our notation is consistentJ. We insert the expansions of the Mjl into that of D. It follows from (22) and (23) that this yields a representation whose terms are identical with those given by (21 a) when I < k.
< I, then 0ik belongs to the kth column of Mjl' we obtain an additional minus sign, and the result agrees with that characterized by (21 b).
Case 1I. If k
We have shown that the two expansions of D consist of the same terms, and this proves our statement concerning rows. The proof of the statement concerning colu11l1ls is quite similar; if we expand D in terms of two arbitrary columns, say, the kth and the !th. we find that the general term involving 0jlaik is exactly the same as before. This proves that not only all column expansions of D yield the same value, but also that their common value is equal to the common value of the row expansion), of D. This completes the proof and shows that ollr definitioll of all mil-order detel7ninalll is unambiguolls.
•
Section 9.3, page 377 PROOF OF FORMULA (2)
We prove that in right-handed
Carte~ian
coordinates. the vector product
has the components
(2) We need only consider the case v =1= O. Since v is perpendicular to both a and b, Theorem 1 in Sec. 9.2 gives a • v = 0 and b • v = 0; in components [see (2), Sec. 9.2],
(3)
A82
APP. 4
Additional Proofs
Multiplying the first equation by b3 , the last by
a3.
and subtracting, we obtain
Multiplying the first equation by b l , the last by
ill'
and subtracting, we obtain
We can
ea~ily
verify that these two equations are
~atisfied
by
(4) where c is a constant. The reader may verify by inserting that (4) also satisfies (3). Now each of the equations in (3) represents a plane thruugh the origin in VIV2v3-space. The vectors a and b are normal vectors of these planes (see Example 6 in Sec. 9.2). Since v =1= 0, these vectors are not parallel and the two planes do not coincide. Hence their intersection is a straight line L through the origin. Since (4) is a solution of (3) and, for varying c, represents a straight line, we conclude that (4) represents L, and every solution of (3) must be of the form (4). Tn particular, the components of v must be of this form, where c is to be determined. From (4) we obtain
This can be written
as can be verified by performing the indicated multiplications in both formulas and comparing. Using (2) in Sec. 9.2, we thus have
By comparing this with formula (12) in Team Project 24 of Problem Set 9.3 we conclude that c = ±1. We show that c = + 1. This can be done a" follows. If we change the lengths and directions of a and b continuously and so that at the end a = i and b = j (Fig. l86a in Sec. 9.3), then v will change its length and direction continuously, and at the end, v = i X j = k. Obviously we may effect the change so that both a and b remain different from the zero vector and are not parallel at any instant. Then v is never equal to the zero vector, and since the change is continuous and c can only assume the values + 1 or -I, it follows that at the end c must have the same value as before. Now at the end a = i, b = j. v = k and, therefore, al = 1. b2 = I, V3 = L and the other components in (4) are zero. Hence from (4) we see that V3 = c = + 1. This proves Theorem I. For a left -handed coordinate system, i X j = -k (see Fig. 186b in Sec. 9.3), resulting in c = -1. This proves the statement right after formula (2). •
APP. 4
A83
Additional Proofs
Section 9.9, page 416 PROOF OF THE INVARIANCE OF THE CURL
This proof will follow from two theorems (A and B), which we prove first.
THEOREM A
Transformation Law for Vector Components
For any vector v the componenTs VI, V2 , V3 and Vl*' V2*' V3* in allY two systems of Cartesian coordinates Xl> x3, X3 and Xl*' X2*' x3*' respectively, are related by
0) and conversely
(2)
with coefficients C 13
= i*·k
(3) C31
= k*·i
C32
=
c 33 = k*·k
k*· j
satisfying 3
(4)
L
CkjCmj =
i5km
(k,1I/
= 1, 2, 3),
j~1
where the Kronecker delta2 is given by (k
* m)
(k
= 11/)
and i, j, k and i*, j*, k* denote the unit vectors ill the positive X2*-' x3*-directions, respectively.
Xl-,
X2-, X3- and
Xl*-'
2LEOPOLD KRONECKER (l823-18YI), German mathematician at Berlin, who made important contributions to algebra. group theory. and number theory. We shall keep our discussion completely independent of Chap. 7, but readers familiar with matrices should rccognize that we are dealing with orthogonal transformations and matrices and that our present theorem follows from Theorem 2 in Sec. 8.3.
APP. 4
A84
PROOF
Additional Proofs
The representation of v in the two systems are
(5) Since i* • i* = L i* • j* from this and (Sa)
= 0, i* • k* = 0, we get from (5b) simply i* • v =
Vl*
and
Because of (3), this is the first formula in (1), and the other two fommlas are obtained similarly. by considering j* • v and then k* • v. Formula (2) follows by the same idea. taking i • v = VI from (Sa) and then from (5b) and (3)
and similarly for the other two components. We prove (4). We can write (1) and (2) briefly as 3
(6)
(a)
Vj
3
= ~ cl1lJ v m *,
(b)
Vk
*=
Vj
into
Vk *,
CkjVj.
j=1
m=1
Substituting
~
we get 3 3 3
Vk*
=~
Ckj
j=1
~
CmjV n
.* = ~
m=1
Vm *
m=l
where k = 1, 2, 3. Taking k = 1, we have
For this to hold for eve1), vector v, the first sum must be I and the other two sums O. This proves (4) with k = 1 for m = 1, 2, 3. Taking k = 2 and then k = 3. we obtain (4) with • k = 2 and 3, form = 1,2,3.
THEOREM B
Transformation Law for Cartesian Coordinates
The trclllc~f01111{{tioll of allY Cartesian XIX2x3-coordinate system into any other Cartesian XI *X2 *X3 *-coordillate system is of the fonn 3
(7)
Xm*
= ~ CnljXj + b m ,
111
= 1.2.3,
j=1
with coefficients (3) and COllstants bI> b 2, b3; cOllversely, 3
(8)
Xk
=
:L 1£=1
CnkXn
* + bk ,
k
=
1.2,3.
APP. 4
A8S
Additional Proofs
Theorem B follows from Theorem A by noting that the most general transformation of a Cartesian coordinate system into another such system may be decomposed into a tranSf0I111ation of the type just considered and a translation; and under a translation, cOlTesponding coordinates differ merely by a constant.
PROOF OF THE INVARIANCE OF THE CURL We write again Xl, X2, X3 instead of x, y,~, and similarly Xl*'
X2*' X3i< for other Cartesian coordinates. assuming that both systems are right-handed. Let al. a 2 • 03 denote the components of curl v in the xIx2x3-coordinates. as given by (l), Sec. 9.9. with
v=X2,
.
Similarly, let al*' a2*' a 3* denote the components of curl v in the xl*x2*x3*-coordinate system. We prove that the length and direction of curl v are independent of the particular choice of Cartesian coordinates. as asserted. We do this by showing that the components of curl v satisfy the transformation law (2), which is characteristic of vector components. We consider {[I' We use (6a), and then the chain rule for functions of several variables (Sec. 9.6). This gives
3
=
3
L L
(
aV.,/ c m3 (Jx/'
m=I j=l
From this and (7) we obtain
Note what we did. The double sum had 3 X 3 = 9 terms, 3 of which were zero (when = j). and the remaining 6 terms we combined in pairs as we needed them in getting a I *, a2*' {/3*
111
We now use (3), Lagrange's identity (see Team Project 24 in Problem Set 93) and k* x j* = -i* and k X j = -i. Then
= (k* x j*) • (k x j) = i* • i = Cn.
etc.
A86
APP. 4
Additional Proofs
Hence a] = clla]* + c2]a2* + c3]a3*' This is of the form of the first formula in (2) in Theorem A, and the other two formulas of the form (2) are obtained similarly. This proves the theorem for right-handed systems. If the .\"lx2'\·3-coordinates are left-handed, then k X j = +i, but then there is a minus sign in front of the determinant in (1), Sec. 9.9. •
Section 10.2, pages 426-427 PROOF 0 F THE 0 REM 1, PA R T (b)
f
(1)
We prove that if
f
F(r) • dr =
(Fl dx
c
C
+
F2 dy
+
F3 dz)
with continuous F I , F 2, F3 in a domain D is independent of path in D, then F = grad f in D for some f; in components
(2' ) We choose any fixed A: (xo, Yo, zo) in D and any B: (x, )" z) in D and define f by
(3)
f(x, y, z)
=
fo
+
I
B
(F] dx*
+
F2 dy*
+
F3 dz*)
A
with any constant fo and any path from A to BinD. Since A is fixed and we have independence of path. the integral depends only on the coordinates x. y. z. so that (3) defines a function f(x, y. z) in D. We show that F = grad f with this f, beginning with the first of the three relations (2'). Because of independence of path we may integrate from A to B]: (x], y, z) and then parallel to the x-axis along the segment B]B in Fig. 561 with B] chosen so that the whole segment lies in D. Then f(x, y, z) = fo
+
I
Bl
(FI d.x:*
+
F2 dy*
A
+
F3 dz*)
+
f
B
(FI dx*
+
F2 dy*
+
F3 dz*).
~
We now take the partial derivative with respect to x on both sides. On the left we get iJf/iJx. We show that on the right we get F]. The derivative of the first integral is zero because A: (xo, Yo, zo) and B 1 : (x], y, z) do not depend on x. We consider the second integral. Since on the segment B]B, both y and z are constant, the terms F2 dy* and
z
y x
Fig. 561.
Proof of Theorem 1
APP.4
A87
Additional Proofs
F3 d::.* do not contribute to the detivative of the integral. The remaining part can be written as a definite integral.
Hence its partial derivative with respect to x is Fl(X, y, ::.), and the first of the relations (2') is proved. The other two formulas in (2') follow by the same argument. • Section 13.4, page 620 Cauch~-Riemann Equations We prove that Cauchy-Riemann equations
PROOF 0 F THE 0 REM 1
(1)
are sufficient for a complex function fez) = u(x, y) + iv(x, y) to be analytic; precisely. (f the real parI u and the inzaginaJ~v part v of f(z.) satisfy (I) ill a domain D ill the complex plane and if the ponied derivatives i11 (I) are COlltillllOUS in D, then fez) is analytic in D. In this proof we write D.::. = .lx is as follows.
+ iD.y and D.f = fez. + D.z.) - f(z.).
The idea of proof
(a) We express .If in terms of first partial derivatives of II and v. by applying the mean value theorem of Sec. 9.6. (b) We get rid of partial derivatives with respect to y by applying the Cauchy-Riemann equations. (c) We let .l.:: approach zero and show that then D.fl.l::. as obtained approaches a limit which is equal to U x + iv x , the right side of (4) in Sec. 13.4. regardless of the way of approach to zero. (a) Let P: (x, y) be any fixed point in D. Since D is a domain, it contains a neighborhood of P. We can choose a point Q: (x + D.x, y + D.)') in this neighborhood such that the straight-line segment PQ is in D. Because of our continuity a~sumptions we may apply the mean value theorem in Sec. 9.6. This yields
u(x
+
v(x
+ D.x, y + .ly) - vex, y)
D.x, y
+
.ly) -
lI(X,
y)
= (.lx)ux(M 1 ) + (D.y)uyCM 1 ) =
(D.x)v x (M 2)
+ (D.y)vyCM2)
where Ml and M2 (01= Ml in general!) are suitable points on that segment. The first line is Re D.f and the second is 1m .If, so that
(b)
uy
=
-vx
and
Vy
=
IIx
by the Cauchy-Riemann equations, so that
A88
APP. 4
Additional proofs
Also 11::. = 11.\' = 0::. -
~x
+ il1.\', so that we can write I1x = ~::. - il1."'" in = -i(.1.:: - .1x) in the second term. This gives
the first term and
~x)li
By performing the multiplications and reordering we obtain
111 =
(.1::')lI x (M l ) - iI1Y{lIx (Ml) - lI x (M2 )}
+
i[(~::.Wr(Ml) -
..lx{ux(Ml ) - u x(M2 )}]·
Division by 11::. now yields
i:lx
i.1y .1~
.
{lI x (Ml) -
~-
tl x(M2 )} -
-
{ux(M l ) - u x(M2 )}·
(e) We finally let 11::. approach zero and note that 111-,,111::.1 ~ I and Il1xll1zl ~ I in (A). Then Q: (x + ~x, y + ~y) approaches P: (x, y), so that Ml and M2 must approach P. Also, since the partial derivatives in (A) are assumed to be continuous, they approach their value at P. In particular, the differences in the braces {... } in (A) approach zero. Hence the limit of the right side of (A) exists and is independent of the path along which 11::. ---7 O. We see that this limit equals the right side of (4) in Sec. 13.4. This means that 1(::.) is analytic at every point.:: in D, and the proof is complete. •
Section 14.2, pages 647-648 Goursat proved Cauchy's integml theorem without assuming that f' (.::) is continuous, as follows. We start with the case when C is the boundary of a triangle. We orient C counterclockwise. By joining the midpoints of the sides we subdivide the triangle into four congruent triangles (Fig. 562). Let CI . C n . C m . CIV denote their boundaries. We claim that (see Fig. 562).
GOURSAT'S PROOF OF CAUCHY'S INTEGRAL THEOREM
(I)
fI G
d.::
=
f
G,
I
d.::
+
f
Gn
I
dz
+
f
GIll
I
d.::
+
f
I
d::..
G,v
Indeed, on the right we integrate along each of the three segments of subdivision in both possible directions (Fig. 562), so that the corresponding integrals cancel out in pairs, and the sum of the integrals on the right equals the integral on the left. We now pick an integral on the right that is biggest in absolute value and call its path Cl' Then. by the triangle inequality (Sec. 13.2),
Fig. 562.
Proof of Cauchy's integral theorem
APP. 4
A89
Additional proofs
We now subdivide the triangle bounded by C 1 as before and select a triangle of subdivil>ion with boundary C2 for which Then Continuing in this fashion, we obtain a sequence of triangles T 1 , T2 , ••• with boundaries Cl> C2 , . • • that are similar and such that Tn lies in Tm when 11 > Ill, and
n = 1,2, .. '.
(2) Let ':::0 be the point that belongs to all these triangles. Since the derivative J' (':::0) exists. Let
(3)
h(.:::)
=
fez) - f(zo)
z -:'::0
f is differentiable at :.:: =
:'::0,
J' (zo)·
-
Solving this algebraically for f(:.::) we have fez) = f(:.::o)
+ (z
- zo)J' (:'::0)
+
11(:::)(:.:: -
':::0)·
Integrating this over the boundary Cn of the triangle Tn gives
Since f(.:::o) and J' (zo) are constants and Cn is a dosed path. the first two integrals on the right are zero, as follows from Cauchy's proof, which is applicable because the integrands do have continuous derivatives (0 and const, respectively). We thus have
f
fez) d:::
en
=
f
h(z)(z -
:'::0)
d:.::.
en
Since J' (:'::0) is the limit of the difference quotient in (3), for given 8 > 0 such that (4)
Ih(z)1 < e
when
Iz -
E
> 0 we can find a
:'::01 < 8.
We may now take 11 so large that the tliangle Tn lies in the disk Iz - :::01 < 8. Let Ln be the length of Cn' Then I::: - zol < Ln for all :: on Cn and::oin Tn. From this and (4) we have 111(.:::)(z - :(0)1 < eLn . The ML-inequality in Sec. 14.1 now gives (5)
Now denote the length of C by L. Then the path C1 has the length Ll = Ll2, the path C2 has the length ~ = Ll/'2 = Ll4, etc., and Cn has the length L" = Ll2n. Hence Ln 2 = L2/4n. From (2) and (5) we thus obtain
A90
APP. 4
Additional Proofs
By choosing E (> 0) sufficiently small we can make the expression on the right as small as we please, while the expression on the left is the definite value of an integral. Consequently. this value must be zero, and the proof is complete. The proof for the case ill which C is the boundary of a polygon follows from the previous proof by subdividing the polygon into triangles (Fig. 563). The integral corresponding to each such triangle is zero. The sum of these integrals is equal to the integral over C, because we integrate along each segment of subdivision in both directions, the corresponding integrals cancel out in pairs, and we are left with the integral over C. The case of a general simple closed path C can be reduced to the preceding one by inscribing in C a closed polygon P of chords, which approximates C "sufficiently accurately," and it can be shown that there is a polygon P such that the integral over P differs from that over C by less than any preassigned positive real number E, no matter how small. The details of this proof are somewhat involved and can be found in Ref. [D6] listed in App. 1. •
Fig. 563.
Proof of Cauchy's integral theorem for a polygon
Section 15.1, page 667 PROOF 0 F THE 0 REM 4
Cauchy's Convergence Principle for Series
(a) [n this proof we need two concepts and a theorem, which we list first. 1. A bounded sequence SI' S2, • • . is a sequence whose terms all lie in a disk of (sufficiently large, finite) radius K with center at the origin; thus ISnl < K for alln. 2. A limit point a of a sequence Sb S2, . • • is a point such that, given an E> 0, there are infinitely many terms satisfying ISn - al < E. (Note that this does not imply convergence, since there may still be infinitely many tenns that do not lie within that circle of radius E and center a.) Example: ~. ~, ~. ~, 1~' ~~ • . . . has the limit points 0 and 1 and diverges.
3. A bounded sequence in the complex plane has at least one limit point. (Bolzano-Weierstrass theorem: proof below. Recall that "sequence" always mean infinite sequence.) (b) We now turn to the actual proof that every E > 0 we can find an N such that (1)
Izn+l
ZI
+ ... + zn+pl <
E
+
~2
+ ...
converges if and only if for
for every n > Nandp
Here, by the definition of partial sums, Sn+p -
Sn
=
2n+l
+ ... + zn+p'
= 1,2, . ".
APP. 4
A91
Additional proofs
Writing
II
+P=
r, we see from this that (1) is equivalent to
for all r > Nand n
(1*)
>
N.
Suppose that SI' S2, ... converges. Denote its limit by s. Then for a given E> 0 we can find an N such that for every n Hence, if r > Nand n
>
>
N.
N, then by the triangle inequality (Sec. 13.2),
that is, (I *) holds. (c) Conversely, assume that SI, S2, • . . satisfies (1 *). We first prove that then the sequence must be bounded. Indeed, choose a fixed E and a fixed n = no > N in (1 *). Then (1 *) implies that all s,. with r > N lie in the disk of radius E and center s"o and only fillitely many tel7l1S SI, ... , SN may not lie in this disk. Clearly, we can now find a circle so large that this disk and these finitely many terms all lie within this new circle. Hence the sequence is bounded. By the Bolzano-Weierstrass theorem, it has at least one limit point, call it s. We now show that the sequence is convergent with the limit s. Let E > 0 be given. Then there is an N* such that 1ST - snl < E/2 for all r > N* and 11 > N*, by (1 *). Also, by the definition of a limit point, ISn - sl < E/2 for infinitely lI1allY n, so that we can find and fix an Il > N* such that ISn - sl < El2. Together, for e\'el)' r > N*, ST -
1
sl =
I(ST - S.,}
+ (Sn -
s)1
~
Is,. -
snl
+ ISn -
sl
E
E
< 2 + 2 =
E;
that is. the sequence SI, S2' ... is convergent with the limit s.
THEOREM
•
Bolzano-Weierstrass Theorem 3
A bounded infillite sequellce Z1> Z2, 23, . . • in the complex plane has at least one
limit point.
PROOF
It is obvious that we need both conditions: a finite sequence cannot have a limit point.
and the sequence I, 2, 3.... , which is infinite but not bounded. has no limit point. To prove the theorem, consider a bounded infinite sequence ZI. Z2 • ... and let K be such that < K for all n. If only finitely many values of the Zn are different, then. since the sequence is infinite, some number z must occur infinitely many times in the sequence, and, by definition, this number is a limit point of the sequence. We may now tum to the case when the sequence contains infinitely many differem terms. We draw a large square Qo that contains all Zw We subdivide Qo into four congruent squares, which we number 1, 2. 3, 4. Clearly, at least one of these squares (each taken
Iznl
3BERNARD BOLZANO (1781-1848). Austrian mathematician and professor of religious studies, was a pioneer in the study of point sets, the foundation of analysis, and mathematical logic. For Weierstrass. see Sec. 15.5.
A92
APP. 4
Additional Proofs
with its complete boundary) must contain infinitely many terms of the sequence. The square of this type with the lowest number (1. 2, 3, or 4) will be denoted by Q1' This is the first step. In the next step we subdivide Q1 into four congruent squares and select a square Q2 by the same rule, and so on. This yields an infinite sequence of squares Qo. Q1, Q2, ... , Qn, ... with the property that the side of Qn approaches zero as 11 approaches infinity, and Qm contains all Qn with 11 > m. It is not difficult to see that the number which belongs to all these squares,4 call it:::: = a, is a limit point of the sequence. In fact, given an E > O. we can choose an N so large that the side of the square QN is less than € and, since QN contains infinitely many Zn. we have Izn - aJ < E for infinitely many 11. This completes the proof. • Section 15.3, pages 681-682 T (b) OF THE PROOF OF THEOREM 5
We have to show that
=
L
an LlZ[(Z
+
+
LlZ)",-2
2z(z
+
uz)n-3
+ ... +
(11 -
1) zn-2],
11.=2
thus, (z
+
LlZ)n - zn Llz
= LlZ[(Z + Llz)n-2 + If we set
Z
+ .1z =
band z
2z(z
+
LlZ)n-3
= a, thuf, ,lz =
+ ... +
(n -
l)z11.-2].
h - a, this becomes simply
(7a)
(11
=
2,3, ... ),
where An is the expression in the brackets on the right, (7b)
thus, A2 = 1, A3 = b since then
+
2a. etc. We prove (7) by induction. When n
- 2a
=
(b
+
a)(b - a) b-a
- 2a
= 2. then
(7) holds.
= b - a = (b - a)A 2 .
Assuming that (7) holds for 11 = k, we show that it holds for n = k + 1. By adding and subtracting a term in the numerator and then dividing we first obtain bk + 1 b-a
-
ba k
+
bak
-
ak + 1
b-a
4The fact that such a unique number;:; = a exists seems [0 be obvious, but it actually follows from an axiom of the real number system, the so-called CantOl~Dedekind axiom: see footnote 3 in App. A3.3.
APP. 4
Additional Proofs
A93
By the induction hypothesis, the right side equals b[(b - a)Ak calculation shows that this is equal to
From (7b) with
Il
+
ka k - 1 ]
+
a k . Direct
= k we see that the expression in the braces {... } equals
bk - 1
+
2ab k - 2
+ ... +
bk + 1
ak + 1
(k - l)ba k - 2
+
ka k - 1 = A k + 1 •
+
(k
+
Hence our result is _
b-a
= (b - a)A k + 1
Taking the last term to the left, we obtain (7) with n integer n ~ 2 and completes the proof
=
k
+
l)a k .
I. This proves (7) for any •
Section 18.2, page 754 without the use of a harmonic conjugate
A NOT HER PROOF 0 F THE 0 REM 1
We show that if w = u + iu = i(z) is analytic and maps a domain D conformally onto a domain D* and *(u, u) is harmonic in D*, then (1)
(x, y) = *(u(x, y), u(x, y))
is harmonic in D. that is, y2<1> = 0 in D. We make no use of a hmmonic conjugate of <1>*, but use straightforward differentiation. By the chain rule,
We apply the chain rule again. underscoring the terms that will drop out when we form
,2<1>:
=
~v is
multiplied by
which is 0 by the Cauchy-Riemann equations. Also y 2 U = 0 and y 2 U = O. There remains
By the Cauchy-Riemann equations this becomes
and is 0 since <1>* is harmonic.
•
APPENDIX
5
Tables For Tables of Laplace transforms see Secs. 6.8 and 6.9. For Tables of Fourier transforms see Sec. 11.10. If you have a Computer Algebra System (CAS), you may not need the present tables, but you may still find them convenient from time to time. Table Al
Bessel Functions
For more extensive tables see Ref. [GRll in App. I. x
1 0(x)
1 1 (x)
x
1 0(:0:)
0.0 0.1 0.2 0.3 0.4
1.0000 0.9975 0.9900 0.9776 0.9604
0.0000 0.0499 0.0995 0.1483 0.1960
3.0 3.1 3.2 3.3 3.4
-0.2601 -0.2921 -0.3202 -0.3443 -0.3643
0.3391 0.3009 0.2613 0.2207 0.1792
6.0 6.1 6.2 6.3 6.4
0.1506 0.1773 0.2017 0.2238 0.2433
-0.2767 -0.2559 -0.2329 -0.2081 -0.1816
0.5 0.6 0.7 0.8 0.9
0.9385 0.9120 0.8812 0.8-1-63 0.8075
0.2423 0.2867 0.3290 0.3688 0.4059
3.5 3.6 3.7 3.8 3.9
-0.3801 -0.3918 -0.3992 -0.4026 -0.4018
0.1374 0.0955 0.0538 0.0118 -0.0272
6.5 6.6 6.7 6.8 6.9
0.2601 0.2740 0.2851 0.2931 0.2981
-0.1538 -0.1250 -0.0953 -0.0652 -0.0349
1.0 1.1 1.2 1.4
0.7652 0.7196 0.6711 0.6201 0.5669
0...\401 0.4709 OA983 0.5220 0.5419
-l.0 4.1 4.2 4.3 4.4
-0.3971 -0.3887 -0.3766 -0.3610 -0.3423
-0.0660 -0.1033 -0.1386 -0.1719 -0.2028
7.0 7.1 7.2 7.3 7.4
0.3001 0.2991 0.2951 0.2882 0.2786
-0.0047 0.0152 0.0543 0.0826 0.1096
1.5 1.6 1.7 1.8 1.9
0.5118 0.4554 0.3980 0.3400 0.2818
0.5579 0.5699 0.5778 0.5815 0.5812
4.5 -l.6 4.7 4.8 4.9
-0.3205 -0.1961 -0.1693 -0.2404 -0.2097
-0.2311 -0.2566 -0.2791 -0.2985 -0.3147
7.5 7.6 7.7 7.8 7.9
0.2663 0.2516 0.2346 0.2154 0.1944
0.1352 0.1592 0.1813 0.2014 0.2192
2.0 2.1 2.2 2.3 2.4
0.2239 0.1666 0.1104 0.0555 0.0025
0.5767 0.5683 0.5560 0.5399 0.5202
5.0 5.1 5.2 5.3 5.4
-0.1776 -0.1443 -0.1103 -0.0758 -0.0412
-0.3276 -0.3371 -0.3431 -0.3460 -0.3453
8.0 8.1 8.2 8.3 8.4
0.1717 0.1475 0.1222 0.0960 0.0692
0.2346 0.2476 0.2580 0.2657 0.2708
2.5 2.6 2.7 2.8 2.9
-0.0484 -0.0968 -0.1424 -0.1850 -0.2143
0.4971 0.-1-708 0.4416 0.4097 0.3754
5.5 5.6 5.7 5.8 5.9
-0.0068 0.0270 0.0599 0.0917 0.1220
-0.3414 -0.3343 -0.3241 -0.3110 -0.2951
8.5 8.6 8.7 8.8 8.9
0.0419 0.0146 -0.0125 -0.0392 -0.0653
0.2731 0.2728 0.2697 0.2641 0.2559
1.3
-
hex)
I
x
1 1 (x)
10(x)
I
fo(x) ~ 0 for x = 2.40483. 5.52008, 8.65373, 11.7915, 14.9309, 1~.0711, 21.2116, 24.3525. 27.4935, 30.6346
11(x) = 0 for x = 3.83171, 7.01559,10.1735, 13.3237. 16.4706. 19.6159,22.7601,25.9037,29.0468.32.1897
A94
A95
APP. 5 Tables
(continued)
Table A1 -
x
:I
-
I
I
0.0 0.5
Y1 (xj
YolX}
I
:~ I
2.0
(-x)
(-x)
-0.445 0.088 0.382 0.510
-1.471 -0.781 -0.412 -0.107
Table Al
x
Yo\.l:)
Yl~\')
x
YoIX)
2.5 3.0 3.5 4.0 4.5
0.498 0.377 0.189 -0.017 -0.195
0.146 0.325 0.410 0.398 0.301
5.0 5.5 6.0 6.5 7.0
-0.309 -0.339 -0.288 -0.173 -0.026
Yl~\")
I
0.148 -0.024 -0.175 -0.274 -0.303
I I
Gamma Function [see (24) in App. A3.1]
a
f(a)
a
na)
a
r(a)
a
na)
Q'
f(a)
1.00
1.000000
1.20
0.911l169
1.40
0.1l1l7264
1.60
0.1l93515
1.1l0
0.93131l4
1.02 1.04 1.06 LOll
0.988844 0.978438 0.968744 0.959725
1.22 1.24 1.26 1.28
0.913106 0.908521 0.904397 0.900718
1.42 1.44 1.46 1048
0.886356 0.885805 0.885604 0.885747
1.62 1.64 1.66 1.68
0.895924 0.898642 0.901668 0.905001
1.82 1.84 1.86 1.88
0.936845 0.942612 0.948687 0.955071
1.10
0.951351
1.30
0.897471
1.50
0.886227
1.70
0.908639
1.90
0.961766
1.12 1.14 1.16 1.18
0.943590 0.936416 0.929803 0.923728
1.32 1.34 1.36 1.38
0.894640 0.892216 0.890185 0.888537
1.52 1.54 1.56 1.58
0.887039 0.888178 0.889639 0.891420
1.72 1.74 1.76 1.78
0.912581 0.916826 0.921375 0.926227
1.92 1.94 1.96 1.98
I 0.976099 0.983743
1.20
0.918 169
1.40
0.887264
1.60
0.893515
1.80
0.931 384
2.00 11.000 000
: I
I
I
'-
Table Al
0.968774
0.991 708
Factorial Function and Its Logarithm with Base 10
I
I
II
11!
log (II!)
11
11!
log (II!)
11
11'
log VI!)
1 2 3 4 5
1 2 6 24 120
u.uuuOOO 0.301 030 0.778 151 1.380211 2.079 181
6 7 8 9 10
72u 5040 4032u 362880 3628800
2.857332 3.702431 4.605521 5.559763 6.559763
11 12 13 14 15
39916800 479001 600 6227 u2u IlUO 87178291200 1 307674368000
7.6Ul 156 8.680337 9.794280 10.940408 12.116500
I
Table A4
I
erfx
Silxj
CIIXI
x
erfx
Si(xi
ci(x)
U.U
u.uuuu
u.uuuu
:JO
2.u
0.9953
1.6054
-0.4230
0.2 0.4 0.6 0.8 1.0
0.2227 0.4284 0.6039 0.7421 0.8427
0.1996 0.3965 0.5881 0.7721 0.9461
1.0422 0.3788 0.0223 -0.1983 -0.3374
2.2 2.4 2.6 2.8 3.0
0.9981 0.9993 0.9998 0.9999 1.0000
1.6876 1.7525 1.8004 1.8321 1.8487
-0.3751 -0.3173 -0.2533 -0.1865 -0.1196
1.2
0.9103 0.9523 0.9763 0.9891 0.9953
1.1080 1.2562 1.3892 1.5058 1.6054
-0.4205 -0.4620 -0.4717 -0.4568 -0.4230
3.2 3.4 3.6 3.8 4.0
1.0000 1.0000 1.0000 1.0000 1.0000
1.8514 1.8419 1.8219 1.7934 1.7582
-0.0553 0.0045 0.0580 0.1038 0.1410
I
I
1.8 2.0
I
Error Function, Sine and Cosine Integrals [see (35), (40), (42) in App. A3.1]
l:
104 1.6
I
II
A96
APP. 5
Tables
Table A5  Binomial Distribution
Probability function f(x) [see (2), Sec. 24.7] and distribution function F(x), for p = 0.1, 0.2, 0.3, 0.4, 0.5 and n = 1, 2, ..., 8.
(The numeric entries of this table are omitted here; they can be regenerated as shown below.)
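The omitted entries follow from the defining formula f(x) = (n choose x) p^x (1 - p)^(n-x) of Sec. 24.7. A minimal Python sketch (an illustration, not part of the book):

    from math import comb

    def binom_f(x, n, p):
        # Binomial probability function f(x), Sec. 24.7, Eq. (2)
        return comb(n, x) * p**x * (1 - p)**(n - x)

    n, p, F = 3, 0.2, 0.0
    for x in range(n + 1):
        F += binom_f(x, n, p)
        print(x, f"f = {binom_f(x, n, p):.4f}", f"F = {F:.4f}")
    # prints f(0) = 0.5120, ..., F(3) = 1.0000, as in the table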
Table A6  Poisson Distribution
Probability function f(x) [see (5), Sec. 24.7] and distribution function F(x), for μ = 0.1, 0.2, ..., 0.9, 1, 1.5, 2, 3, 4, 5.
(The numeric entries of this table are omitted here; they can be regenerated as shown below.)
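Likewise, the Poisson entries follow from f(x) = e^(-μ) μ^x / x! of Sec. 24.7. A minimal Python sketch (an illustration, not part of the book):

    from math import exp, factorial

    def poisson_f(x, mu):
        # Poisson probability function f(x), Sec. 24.7, Eq. (5)
        return exp(-mu) * mu**x / factorial(x)

    mu, F = 2.0, 0.0
    for x in range(6):
        F += poisson_f(x, mu)
        print(x, f"f = {poisson_f(x, mu):.4f}", f"F = {F:.4f}")
    # prints f(0) = 0.1353, f(1) = 0.2707, ..., as in the table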
Table A7  Normal Distribution
Values of the distribution function Φ(z) [see (3), Sec. 24.8]; Φ(-z) = 1 - Φ(z).
The table gives Φ(z) to four decimals for z = 0.01 to 3.00 in steps of 0.01; for example, Φ(1.00) = 0.8413, Φ(2.00) = 0.9772, Φ(3.00) = 0.9987.
(The numeric entries of this table are omitted here; they can be regenerated as shown below.)
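Φ(z) has the closed form Φ(z) = (1/2)[1 + erf(z/√2)], so the whole table can be regenerated from the error function. A short Python check (an illustration, not part of the book):

    from math import erf, sqrt

    def Phi(z):
        # Standard normal distribution function via the error function
        return 0.5 * (1.0 + erf(z / sqrt(2.0)))

    for z in (1.0, 2.0, 3.0):
        print(f"Phi({z}) = {Phi(z):.4f}")   # 0.8413, 0.9772, 0.9987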
Table A8  Normal Distribution
Values of z for given values of Φ(z) [see (3), Sec. 24.8] and D(z) = Φ(z) - Φ(-z)
Example: z = 0.279 if Φ(z) = 61%; z = 0.860 if D(z) = 61%.

  %     z(Φ)     z(D)      %     z(Φ)     z(D)      %       z(Φ)     z(D)
  1   -2.326    0.013     41   -0.228    0.539     81      0.878    1.311
  2   -2.054    0.025     42   -0.202    0.553     82      0.915    1.341
  3   -1.881    0.038     43   -0.176    0.568     83      0.954    1.372
  4   -1.751    0.050     44   -0.151    0.583     84      0.994    1.405
  5   -1.645    0.063     45   -0.126    0.598     85      1.036    1.440
  6   -1.555    0.075     46   -0.100    0.613     86      1.080    1.476
  7   -1.476    0.088     47   -0.075    0.628     87      1.126    1.514
  8   -1.405    0.100     48   -0.050    0.643     88      1.175    1.555
  9   -1.341    0.113     49   -0.025    0.659     89      1.227    1.598
 10   -1.282    0.126     50    0.000    0.674     90      1.282    1.645
 11   -1.227    0.138     51    0.025    0.690     91      1.341    1.695
 12   -1.175    0.151     52    0.050    0.706     92      1.405    1.751
 13   -1.126    0.164     53    0.075    0.722     93      1.476    1.812
 14   -1.080    0.176     54    0.100    0.739     94      1.555    1.881
 15   -1.036    0.189     55    0.126    0.755     95      1.645    1.960
 16   -0.994    0.202     56    0.151    0.772     96      1.751    2.054
 17   -0.954    0.215     57    0.176    0.789     97      1.881    2.170
 18   -0.915    0.228     58    0.202    0.806     97.5    1.960    2.241
 19   -0.878    0.240     59    0.228    0.824     98      2.054    2.326
 20   -0.842    0.253     60    0.253    0.842     99      2.326    2.576
 21   -0.806    0.266     61    0.279    0.860     99.1    2.366    2.612
 22   -0.772    0.279     62    0.305    0.878     99.2    2.409    2.652
 23   -0.739    0.292     63    0.332    0.896     99.3    2.457    2.697
 24   -0.706    0.305     64    0.358    0.915     99.4    2.512    2.748
 25   -0.674    0.319     65    0.385    0.935     99.5    2.576    2.807
 26   -0.643    0.332     66    0.412    0.954     99.6    2.652    2.878
 27   -0.613    0.345     67    0.440    0.974     99.7    2.748    2.968
 28   -0.583    0.358     68    0.468    0.994     99.8    2.878    3.090
 29   -0.553    0.372     69    0.496    1.015     99.9    3.090    3.291
 30   -0.524    0.385     70    0.524    1.036     99.91   3.121    3.320
 31   -0.496    0.399     71    0.553    1.058     99.92   3.156    3.353
 32   -0.468    0.412     72    0.583    1.080     99.93   3.195    3.390
 33   -0.440    0.426     73    0.613    1.103     99.94   3.239    3.432
 34   -0.412    0.440     74    0.643    1.126     99.95   3.291    3.481
 35   -0.385    0.454     75    0.674    1.150     99.96   3.353    3.540
 36   -0.358    0.468     76    0.706    1.175     99.97   3.432    3.615
 37   -0.332    0.482     77    0.739    1.200     99.98   3.540    3.719
 38   -0.305    0.496     78    0.772    1.227     99.99   3.719    3.891
 39   -0.279    0.510     79    0.806    1.254
 40   -0.253    0.524     80    0.842    1.282
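These are quantiles of the standard normal distribution: z(Φ) = Φ^(-1)(p) and, since D(z) = 2Φ(z) - 1, z(D) = Φ^(-1)((1 + p)/2). With SciPy (an assumption, not part of the book) the caption's example is reproduced by:

    from scipy.stats import norm

    p = 0.61
    print(f"z(Phi) = {norm.ppf(p):.3f}")            # 0.279
    print(f"z(D)   = {norm.ppf((1 + p) / 2):.3f}")  # 0.860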
Table A9  t-Distribution
Values of z for given values of the distribution function F(z) [see (8) in Sec. 25.3]
Example: For 9 degrees of freedom, z = 1.83 when F(z) = 0.95.

         Number of Degrees of Freedom
F(z)     1      2      3      4      5      6      7      8      9     10
0.5     0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
0.6     0.32   0.29   0.28   0.27   0.27   0.26   0.26   0.26   0.26   0.26
0.7     0.73   0.62   0.58   0.57   0.56   0.55   0.55   0.55   0.54   0.54
0.8     1.38   1.06   0.98   0.94   0.92   0.91   0.90   0.89   0.88   0.88
0.9     3.08   1.89   1.64   1.53   1.48   1.44   1.41   1.40   1.38   1.37
0.95    6.31   2.92   2.35   2.13   2.02   1.94   1.89   1.86   1.83   1.81
0.975  12.7    4.30   3.18   2.78   2.57   2.45   2.36   2.31   2.26   2.23
0.99   31.8    6.96   4.54   3.75   3.36   3.14   3.00   2.90   2.82   2.76
0.995  63.7    9.92   5.84   4.60   4.03   3.71   3.50   3.36   3.25   3.17
0.999 318.3   22.3   10.2    7.17   5.89   5.21   4.79   4.50   4.30   4.14

F(z)    11     12     13     14     15     16     17     18     19     20
0.5     0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
0.6     0.26   0.26   0.26   0.26   0.26   0.26   0.26   0.26   0.26   0.26
0.7     0.54   0.54   0.54   0.54   0.54   0.54   0.53   0.53   0.53   0.53
0.8     0.88   0.87   0.87   0.87   0.87   0.86   0.86   0.86   0.86   0.86
0.9     1.36   1.36   1.35   1.35   1.34   1.34   1.33   1.33   1.33   1.33
0.95    1.80   1.78   1.77   1.76   1.75   1.75   1.74   1.73   1.73   1.72
0.975   2.20   2.18   2.16   2.14   2.13   2.12   2.11   2.10   2.09   2.09
0.99    2.72   2.68   2.65   2.62   2.60   2.58   2.57   2.55   2.54   2.53
0.995   3.11   3.05   3.01   2.98   2.95   2.92   2.90   2.88   2.86   2.85
0.999   4.02   3.93   3.85   3.79   3.73   3.69   3.65   3.61   3.58   3.55

F(z)    22     24     26     28     30     40     50    100    200     ∞
0.5     0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
0.6     0.26   0.26   0.26   0.26   0.26   0.26   0.25   0.25   0.25   0.25
0.7     0.53   0.53   0.53   0.53   0.53   0.53   0.53   0.53   0.53   0.52
0.8     0.86   0.86   0.86   0.85   0.85   0.85   0.85   0.85   0.84   0.84
0.9     1.32   1.32   1.31   1.31   1.31   1.30   1.30   1.29   1.29   1.28
0.95    1.72   1.71   1.71   1.70   1.70   1.68   1.68   1.66   1.65   1.65
0.975   2.07   2.06   2.06   2.05   2.04   2.02   2.01   1.98   1.97   1.96
0.99    2.51   2.49   2.48   2.47   2.46   2.42   2.40   2.36   2.35   2.33
0.995   2.82   2.80   2.78   2.76   2.75   2.70   2.68   2.63   2.60   2.58
0.999   3.50   3.47   3.43   3.41   3.39   3.31   3.26   3.17   3.13   3.09
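With SciPy (an assumption, not part of the book), any entry is the quantile t.ppf(F, n); for example, the caption's case of 9 degrees of freedom and F(z) = 0.95:

    from scipy.stats import t

    print(f"{t.ppf(0.95, df=9):.2f}")   # 1.83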
Table A10  Chi-square Distribution
Values of z for given values of the distribution function F(z) (see Sec. 25.3 before (17))
Example: For 3 degrees of freedom, z = 11.34 when F(z) = 0.99.

         Number of Degrees of Freedom
F(z)      1      2      3      4      5      6      7      8      9     10
0.005   0.00   0.01   0.07   0.21   0.41   0.68   0.99   1.34   1.73   2.16
0.01    0.00   0.02   0.11   0.30   0.55   0.87   1.24   1.65   2.09   2.56
0.025   0.00   0.05   0.22   0.48   0.83   1.24   1.69   2.18   2.70   3.25
0.05    0.00   0.10   0.35   0.71   1.15   1.64   2.17   2.73   3.33   3.94
0.95    3.84   5.99   7.81   9.49  11.07  12.59  14.07  15.51  16.92  18.31
0.975   5.02   7.38   9.35  11.14  12.83  14.45  16.01  17.53  19.02  20.48
0.99    6.63   9.21  11.34  13.28  15.09  16.81  18.48  20.09  21.67  23.21
0.995   7.88  10.60  12.84  14.86  16.75  18.55  20.28  21.95  23.59  25.19

F(z)     11     12     13     14     15     16     17     18     19     20
0.005   2.60   3.07   3.57   4.07   4.60   5.14   5.70   6.26   6.84   7.43
0.01    3.05   3.57   4.11   4.66   5.23   5.81   6.41   7.01   7.63   8.26
0.025   3.82   4.40   5.01   5.63   6.26   6.91   7.56   8.23   8.91   9.59
0.05    4.57   5.23   5.89   6.57   7.26   7.96   8.67   9.39  10.12  10.85
0.95   19.68  21.03  22.36  23.68  25.00  26.30  27.59  28.87  30.14  31.41
0.975  21.92  23.34  24.74  26.12  27.49  28.85  30.19  31.53  32.85  34.17
0.99   24.72  26.22  27.69  29.14  30.58  32.00  33.41  34.81  36.19  37.57
0.995  26.76  28.30  29.82  31.32  32.80  34.27  35.72  37.16  38.58  40.00

F(z)     21     22     23     24     25     26     27     28     29     30
0.005   8.0    8.6    9.3    9.9   10.5   11.2   11.8   12.5   13.1   13.8
0.01    8.9    9.5   10.2   10.9   11.5   12.2   12.9   13.6   14.3   15.0
0.025  10.3   11.0   11.7   12.4   13.1   13.8   14.6   15.3   16.0   16.8
0.05   11.6   12.3   13.1   13.8   14.6   15.4   16.2   16.9   17.7   18.5
0.95   32.7   33.9   35.2   36.4   37.7   38.9   40.1   41.3   42.6   43.8
0.975  35.5   36.8   38.1   39.4   40.6   41.9   43.2   44.5   45.7   47.0
0.99   38.9   40.3   41.6   43.0   44.3   45.6   47.0   48.3   49.6   50.9
0.995  41.4   42.8   44.2   45.6   46.9   48.3   49.6   51.0   52.3   53.7

F(z)     40     50     60     70     80     90    100    > 100 (Approximation)
0.005  20.7   28.0   35.5   43.3   51.2   59.2   67.3    ½(h - 2.58)²
0.01   22.2   29.7   37.5   45.4   53.5   61.8   70.1    ½(h - 2.33)²
0.025  24.4   32.4   40.5   48.8   57.2   65.6   74.2    ½(h - 1.96)²
0.05   26.5   34.8   43.2   51.7   60.4   69.1   77.9    ½(h - 1.64)²
0.95   55.8   67.5   79.1   90.5  101.9  113.1  124.3    ½(h + 1.64)²
0.975  59.3   71.4   83.3   95.0  106.6  118.1  129.6    ½(h + 1.96)²
0.99   63.7   76.2   88.4  100.4  112.3  124.1  135.8    ½(h + 2.33)²
0.995  66.8   79.5   92.0  104.2  116.3  128.3  140.2    ½(h + 2.58)²

In the last column, h = √(2m - 1), where m is the number of degrees of freedom.
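The quality of the large-m approximation in the last column is easy to check. The sketch below (assuming SciPy, which is not part of the book) compares it with the exact 95% quantile for m = 100:

    from math import sqrt
    from scipy.stats import chi2

    m, z95 = 100, 1.64
    h = sqrt(2 * m - 1)
    approx = 0.5 * (h + z95) ** 2    # tabulated approximation for F = 0.95
    exact = chi2.ppf(0.95, df=m)     # exact chi-square quantile
    print(f"approx = {approx:.1f}, exact = {exact:.1f}")   # 124.0 vs 124.3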
Table A11  F-Distribution with (m, n) Degrees of Freedom
Values of z for which the distribution function F(z) [see (13), Sec. 25.4] has the value 0.95 or 0.99.
Example: For (7, 4) d.f., z = 6.09 if F(z) = 0.95.
The table has four parts: F(z) = 0.95 for m = 1, ..., 9 and for m = 10, 15, 20, 30, 40, 50, 100, ∞, and F(z) = 0.99 for the same two groups of m, with n running from 1 up to 1000 and ∞.
(The numeric entries of this table are omitted here; they can be regenerated as shown below.)
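With SciPy (an assumption, not part of the book), any entry is the quantile f.ppf(F, m, n); the caption's example for (7, 4) degrees of freedom:

    from scipy.stats import f

    print(f"{f.ppf(0.95, 7, 4):.2f}")   # 6.09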
Table A12  Distribution Function F(x) = P(T ≤ x) of the Random Variable T in Section 25.8
Tabulated for sample sizes n = 3 to n = 20.
(The numeric entries of this table are omitted here; they can be regenerated as shown below.)
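In the trend test of Sec. 25.8, T is the number of transpositions of adjacent elements needed to order the sample, which equals the number of inversions of the corresponding permutation. Assuming that interpretation (supplied here as a gloss; the table itself only lists values), the distribution can be regenerated by a standard dynamic program; for n = 6 the printed values 0.001, 0.008, 0.028, 0.068, ... agree with the tabulated column:

    from math import factorial

    def inversion_counts(n):
        # counts[k] = number of permutations of n elements with exactly k inversions
        counts = [1]
        for m in range(2, n + 1):
            new = [0] * (len(counts) + m - 1)
            for k, c in enumerate(counts):
                for j in range(m):   # inserting the m-th element adds 0..m-1 inversions
                    new[k + j] += c
            counts = new
        return counts

    n = 6
    c, total, F = inversion_counts(n), factorial(n), 0
    for x in range(8):
        F += c[x]
        print(x, f"F = {F / total:.3f}")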
PHOTO CREDITS
Part A Opener: John Sohm/Chromosohm/Photo Researchers. Part B Opener: Lester Lefkowitz/Corbis Images. Part C Opener: Science Photo Library/Photo Researchers. Part D Opener: Lester Lefkowitz/Corbis Images. Part E Opener: Walter Hodges/Stone/Getty Images. Chapter 19, Figure 434: From Dennis Sharp, ARCHITEKTUR. © 1973 der deutschsprachigen Ausgabe Edition Praeger GmbH, München. Part F Opener: Charles O'Rear/Corbis Images. Part G Opener: Greg Pease/Stone/Getty Images. Appendix 1 Opener: Patrick Bennett/Corbis Images. Appendix 2 Opener: Richard T. Nowitz/Corbis Images. Appendix 3 Opener: Chris Kapolka/Stone/Getty Images. Appendix 4 Opener: John Olson/Corbis Images.
INDEX
Page numbers A1, A2, A3, ... refer to App. 1 to App. 5 at the end of the book.
A Abel 78 Absolute convergence 667, 697 frequency 994, 1000 value 607 Absolutely integrable 508 Acceleration 395, 995 Acceptance sampling 1073 Adams-Bashforth methods 899 Adams-Moulton methods 900 Adaptive 824 Addition of complex numbers 603 matrices 275 means 1038 normal random variables 1050 power series 174, 680 variances 1039 vectors 276, 324, 367 Addition rule 1002 Adiabatic 561 ADI method 915 Adjacency matrix 956 Adjacent vertices 955 Airfoil 732 Airy's equation 552, 904 Algebraic multiplicity 337, 865 Algorithm 777, 783 Dijkstra 964 efficient 962 Ford-Fulkerson 979 Gauss 837 Gauss-Seidel 848 Greedy 967 Kruskal 967 Moore 960 polynomially bounded 962 Prim 971 Runge-Kutta 892, 904 stable, unstable 783 Aliasing 526
Allowable number of defectives 1073 Alternating direction implicit method 915 path 983 Alternative hypothesis 1058 Ampere 92 Amplification 89 Amplitude spectrum 506 Analytic function 175,617,681 Analytic at infinity 711 Angle between curves 35 between vectors 372 Angular speed 38 L 765 Annulus 613 Anticommutative 379 AOQ, AOQL 1075-1076 Approximate solution of differential equations 9, 886-934 eigenvalue problems 863-882 equations 787-796 systems of equations 833-858 Approximation least squares 860 polynomial 797 trigonometric 502 A priori estimate 794 AQL 1074 Arc of a curve 391 Archimedian principle 68 Arc length 393 Arctan 634 Area 435, 442, 454 Argand diagram 605 Argument 607 Artificial variable 949 Assignment problem 982 Associated Legendre functions 182 Asymptotically equal 191, 1009 normal 1057 stable 148
Attractive 148 Augmented matrix 288, 833 Augmenting path 975, 983 theorem 977, 984 Autonomous 31, 151 Average (see Mean value) Average outgoing quality 1075 Axioms of probability 1001
B Back substitution 289, 834 Backward differences 807 edge 974, 976 Euler method 896, 907 Band matrix 914 Bashforth method 899 Basic feasible solution 942, 944 variables 945 Basis 49, 106, 113, 138, 300, 325, 360 Beam 120, 547 Beats 87 Bellman optimality principle 963 Bell-shaped curve 1026 Bernoulli 30 distribution 1020 equation 30 law of large numbers 1032 numbers 690 Bessel 189 equation 189, 204 functions 191, 198, 202, 207, A94 functions, tables A94 inequality 215, 504 Beta function A64 Bezier curves 816 BFS 960 Bijective mapping 729 Binary 782 Binomial coefficients 1009 distribution 1020, A96 series 689 theorem 1010 Binormal (Fig. 210) 397 Bipartite graph 982, 985 Birthday problem 1010
Bisection method 796 Bolzano-Weierstrass theorem A91 Bonnet 181 Boundary 433 conditions 203, 540, 558, 571, 587 point 433, 613 value problem 203, 558 Bounded domain 646 function 38 region 433 sequence A69 Boxplot 995 Branch cut 632 point 746 Breadth first search 960 Buoyancy force 68
C Cable 52, 198, 593 CAD (Computer aided design) 810 Cancellation law 321 Cantor-Dedekind axiom A69 Capacitance 92 Capacitor 92 Capacity of a cut set 976 of an edge 973 Cardano 602 Cardioid 443 Cartesian coordinates 366, 604 CAS (Computer algebra system) vii, 777 Catenary 399 Cauchy 69 convergence principle 667, A90 determinant 112 -Goursat theorem 647 -Hadamard formula 676 inequality 660 integral formula 654 integral theorem 647, 652 method of steepest descent 938 principal value 719, 722 product 680 -Riemann equations 37, 618, 621 -Schwarz inequality 326, 859 Cayley transformation 739
Center 143 of a graph 973 of gravity 436, 457 of a power series 171 Central differences 808 limit theorem 1057 moments 1019 Centrifugal, centripetal 396 Cgs system: Front cover Chain rule 401 Characteristic determinant 336, 864 equation 59, 111, 336, 551, 864 function 542, 574 of a partial differential equation 551 polynomial 336, 864 value 324, 864 vector 324, 864 Chebyshev polynomials 209 Chinese postman problem 963 Chi-square 1055, 1077, A101 Cholesky's method 843 Chopping 782 Chromatic number 987 Circle 391 of convergence 675 Circuit 95 Circular disk 613 helix 391, 394 membrane 580 Circulation 764 Cissoid 399 Clairaut equation 34 Class intervals 994 Closed disk 613 integration formula 822, 827 interval A69 path 959 point set 613 region 433 Coefficient matrix 288, 833 Coefficients of a differential equation 46 power series 171 system of equations 287, 833 Cofactor 309 Collatz's theorem 870
Column 275 space 300 sum norm 849 vector 275 Combination 1007 Combinatorial optimization 954-986 Comparison test 668 Complement 613, 988 Complementary error function A64 Fresnel integrals A65 sine integral A66 Complementation rule 1002 Complete bipartite graph 987 graph 958 matching 983 orthonormal set 214 Complex conjugate numbers 605 exponential function 57, 623 Fourier integral 519 Fourier series 497 function 614 hyperbolic functions 628, 743 impedance 98 indefinite integral 637 integration 637-663, 712-725 line integral 633 logarithm 630, 688 matrices 356 number 602 plane 605 plane, extended 710 potential 761 sequence 664 series 666 sphere 710 trigonometric functions 626, 688 trigonometric polynomial 524 variable 614 vector space 324, 359 Complexity 961 Component 366, 374 Composite transformation 281 Compound interest 9 Compressible fluid 412 Computer aided design (CAD) 810 algebra system (CAS) vii, 777
Computer (Cont.) graphics 287 software (see Software) Conchoid 399 Condition number 855 Conditionally convergent 667 Conditional probability 1003 Conduction of heat 465, 552, 757 Cone 406, 448 CONF 1049 Confidence intervals 1049-1058 level 1049 limits 1049 Conformal mapping 730, 754 Conic sections 355 Conjugate complex numbers 605 harmonic function 622 Connected graph 960 set 613 Conservative 415, 428 Consistent equations 292, 303 Constraints 937 Consumer's risk 1075 Continuity of a complex function 615 equation 413 of a vector function 387 Continuous distribution 1011 random variable 1011, 1034 Contour integral 647 Contraction 789 Control chart 1068 limit 1068 variables 936 Convergence absolute 667 circle of 675 conditional 667 interval 172, 676 of an iterative process 793, 848 mean 214 mean-square 214 in norm 214 principle 667 radius 172
Convergence (Cont.) of a sequence 386, 664 of a series 171, 666 superlinear 795 tests 667-672 uniform 691 Conversion of an ODE to a system 134 Convolution 248, 523 Cooling 14 Coordinate transformations A71, A84 Coordinates Cartesian 366, 604 curvilinear A71 cylindrical 587, A71 polar 137, 443, 580, 607 spherical 588, A71 Coriolis acceleration 396 Corrector 890, 900 Correlation analysis 1089 Cosecant 627, A62 Cosine of a complex variable 627, 688, 743 hyperbolic 688 integral A66, A95 of a real variable A60 Cotangent 627, A62 Coulomb 92 law 409 Covariance 1039, 1085 Cramer's rule 306-307, 312 Crank-Nicolson method 924 Critical damping 65 point 31, 142, 730 region 1060 Cross product 377 Crout's method 841 Cubic spline 811 Cumulative distribution function 1011 frequency 994 Curl 414, 430, 472, A71 Curvature 397 Curve 389 arc length of 393 fitting 859 orientation of 390 piecewise smooth 421 rectifiable 393 simple 391
Curve (Cont.) smooth 421, 638 twisted 391 Curvilinear coordinates A71 Cut set 976 Cycle 959 Cycloid 399 Cylinder 446 flow around 763, 767 Cylindrical coordinates 587, A71
D D'Alembert's solution 549, 551 Damping 64, 88 Dantzig 944 DATA DESK 991 Decay 5 Decreasing sequence A69 Decrement 69 Dedekind A69 Defect 337 Defective item 1073 Definite complex integral 639 Definiteness 356 Deformation of path 649 Degenerate feasible solution 947 Degree of precision 822 a vertex 955 Degrees of freedom 1052, 1055, 1066 Deleted neighborhood 712 Delta Dirac 242 Kronecker 210, A83 De Moivre 610 formula 610 limit theorem 1031 De Morgan's laws 999 Density 1014 Dependent linearly 49, 74, 106, 297, 300, 325 random variables 1036 Depth first search 960 Derivative of a complex function 616, 658 directional 404 left-hand 484 right-hand 484 of a vector function 387
DERIVE 778 Descartes 366 Determinant 306, 308 Cauchy 112 characteristic 336, 864 of a matrix 305 of a matrix product 321 Vandermonde 112 DFS 960 Diagonalization 351 Diagonal matrix 284 Diagonally dominant matrix 868 Diameter of a graph 973 Differences 802, 804, 807-808 Difference table 803 Differentiable complex function 616 Differential 19, 429 form 20, 429 geometry 389 operator 59 Differential equation (ODE and PDE) Airy 552, 904 Bernoulli 30 Bessel 189, 204 Cauchy-Riemann 37, 618, 621 with constant coefficients 53, 111 elliptic 551, 909 Euler-Cauchy 69, 185 exact 20 homogeneous 27, 46, 105, 535 hyperbolic 551, 928 hypergeometric 188 Laguerre 257 Laplace 407, 465, 536, 579, 587, 910 Legendre 177, 204, 590 linear 26, 45, 105, 535 nonhomogeneous 27, 46, 105, 535 nonlinear 45, 151, 535 numeric methods for 886-934 ordinary 4 parabolic 551, 909, 922 partial 535 Poisson 910, 918 separable 12 Sturm-Liouville 203 of vibrating beam 547, 552 of vibrating mass 61, 86, 135, 150, 243, 261, 342, 499 of vibrating membrane 569-586 of vibrating string 538
Differentiation analytic functions 691 complex functions 616 Laplace transforms 254 numeric 827 power series 174, 680 series 696 vector functions 387 Diffusion equation 464, 552 Diffusivity 552 Digraph 955 Dijkstra's algorithm 964 Dimension of vector space 300, 325, 369 Diodes 399 Dirac's delta 242 Directed graph 955 line segment 364 path 982 Directional derivative 404 Direction field 10 Direct method 845 Dirichlet 467 discontinuous factor 509 problem 467, 558, 587, 915 Discharge of a source 767 Discrete Fourier transform 525 random variable 1011, 1033 spectrum 507, 524 Disjoint events 998 Disk 613 Dissipative 429
Distribution 1010 Bernoulli 1020 binomial 1020, A96 chi-square 1055, A101 continuous 1011 discrete 1011, 1033 Fisher's F- 1066, A102 -free test 1080 function 1011, 1032 Gauss 1026 hypergeometric 1024 marginal 1035 multinomial 1025 normal 1026, 1047-1057, 1062-1067, A98 Poisson 1022, 1073, A97 Student's t- 1053, A100 two-dimensional 1032
Distribution (Cont.) uniform 1015, 1017, 1034 Divergence theorem of Gauss 459 of vector fields 410, A72 Divergent sequence 665 series 171, 667 Divided differences 802 Division of complex numbers 604, 609 Domain 401, 613, 646 Doolittle's method 841 Dot product 325, 371 Double Fourier series 576 integral 433 labeling 968 precision 782 Driving force 84 Drumhead 569 Duffing equation 159 Duhamel's formula 597
Electrostatic (Cont.) potential 588-592, 750 Element of matrix 273 Elementary matrix 296 operations 292 Elimination of first derivative 197 Ellipse 390 Ellipsoid 449 Elliptic cylinder 448 paraboloid 448 partial differential equation 551, 909 Empty set 998 Engineering system: Front cover Entire function 624, 661, 711 Entry 273, 309 Equality of complex numbers 602 matrices 275 vectors 365 Equally likely 1000 Equipotential lines 750, 762 surfaces 750 Equivalence relation 296 Equivalent linear systems 292 Erf 568, 690, A64, A95 Error 783 bound 784 estimation 785 function 568, 690, A64, A95 propagation 784 Type I, Type II 1060 Essential singularity 708 Estimation of parameters 1046-1057 Euclidean norm 327 space 327 Euler 69 backward methods for ODEs 896, 907 beta function A64 -Cauchy equation 69, 108, 116, 185, 589 -Cauchy method 887, 890, 903 constant 200 formula 58, 496, 624, 627, 687 formulas for Fourier coefficients 480, 487 graph 963 numbers 690 method for systems 903
Euler (Cont.) trail 963 Evaporation 18 Even function 490 Event 997 Everett interpolation formula 809 Exact differential equation 20 differential form 20, 429 Existence theorem differential equations 37, 73, 107, 109, 137, 175 Fourier integral 508 Fourier series 484 Laplace transforms 226 Linear equations 302 Expectation 1016 Experiment 997 Explicit solution 4 Exponential decay 5 function, complex 57, 623 function, real A60 growth 5, 31 integral A66 Exposed vertex 983 Extended complex plane 710, 736 Extension, periodic 494 Extrapolation 797 Extremum 937
F Factorial function 192, 1008, A95 Failure 1021 Fair die 1000 Falling body 8 False position 796 Family of curves 5, 35 Faraday 92 Fast Fourier transform 526 F-distribution 1066, A102 Feasible solution 942 Fehlberg 894 Fibonacci numbers 683 Field conservative 415, 428 of force 385 gravitational 385, 407, 411, 587
Field (Cont.) irrotational 415, 765 scalar 384 vector 384 velocity 385 Finite complex plane 710 First fundamental form 457 Green's formula 466 shifting theorem 224 Fisher, R. A. 1047, 1066, 1077 F-distribution 1066, A102 Fixed decimal point 781 point 736, 781, 787 Flat spring 68 Floating point 781 Flow augmenting path 975 Flows in networks 973-981 Fluid flow 412, 463, 761 Flux 412, 450 integral 450 Folium of Descartes 399 Forced oscillations 84, 499 Ford-Fulkerson algorithm 979 Forest 970 Form Hermitian 361 quadratic 353 skew-Hermitian 361 Forward differences 804 edge 974, 976 Four-color theorem 987 Fourier 477 -Bessel series 213, 583 coefficients 480, 487 coefficients, complex 497 constants 210 cosine integral 511 cosine series 491 cosine transform 514, 529 double series 576 half-range expansions 494 integral 508, 563 integral, complex 519 -Legendre series 212, 590 matrix 525 series 211, 480, 487 series, complex 497
Fourier (Cont.) series, generalized 210 sine integral 511 sine series 491, 543 sine transform 514, 530 transform 519, 531, 565 transform, discrete 525 transform, fast 526 Fractional linear transformation 734 Fraction defective 1073 Fredholm 201 Free fall 18 oscillations 61 Frenet formulas 400 Frequency 63 of values in samples 994 Fresnel integrals 690, A65 Friction 18-19 Frobenius 182 method 182 norm 849 theorem 869 Fulkerson 979 Full-wave rectifier 248 Function analytic 175, 617 Bessel 191, 198, 202, 207, A94 beta A64 bounded 38 characteristic 542, 574 complex 614 conjugate harmonic 622 entire 624, 661, 711 error 568, 690, A64, A95 even 490 exponential 57, 623, A60 factorial 192, 1008, A95 gamma 192, A95 Hankel 202 harmonic 465, 622, 772 holomorphic 617 hyperbolic 628, 743, A62 hypergeometric 188 inverse hyperbolic 634 inverse trigonometric 634 Legendre 177 logarithmic 630, A60 meromorphic 711 Neumann 201
Function (Cont.) odd 490 orthogonal 205, 482 orthonormal 205, 210 periodic 478 probability 1012, 1033 rational 617 scalar 384 space 382 staircase 248 step 234 trigonometric 626, 688, A60 unit step 234 vector 384 Function space 327 Fundamental form 457 matrix 139 mode 542 period 485 system 49, 106, 113, 138 theorem of algebra 662
G Galilei 15 Gamma function 192, A95 GAMS 778 Gauss 188 distribution 1026 divergence theorem 459 elimination method 289, 834 hypergeometric equation 188 integration formula 826 -Jordan elimination 317, 844 least squares 860, 1084 quadrature 826 -Seidel iteration 846, 913 General powers 632 solution 64, 106, 138, 159 Generalized Fourier series 210 function 242 solution 545 triangle inequality 608 Generating function 181, 216, 258 Geometric multiplicity 337 series 167, 668, 673, 687, 692
Gerschgorin's theorem 866 Gibbs phenomenon 490, 510 Global error 887 Golden Rule 15, 23 Goodness of fit 1076 Gosset 1066 Goursat 648, A88 Gradient 403, 415, 426, A72 method 938 Graph 955 bipartite 982 complete 958 Euler 963 planar 987 sparse 957 weighted 959 Gravitation 385, 407, 411, 587 Greedy algorithm 967 Greek alphabet: Back cover Green 439 formulas 466 theorem 439, 466 Gregory-Newton formulas 805 Growth restriction 225 Guldin's theorem 458
H Hadamard's formula 676 Half-life time 9 Half-plane 613 Half-range Fourier series 494 Half-wave rectifier 248, 489 Halving 819, 824 Hamiltonian cycle 960 Hanging cable 198 Hankel functions 202 Hard spring 159 Harmonic conjugate 622 function 465, 622, 772 oscillation 63 series 670 Heat equation 464, 536, 553, 757, 923 potential 758 Heaviside 221 expansions 245 formulas 247
Heaviside (Cont.) function 234 Helicoid 449 Helix 391, 394, 399 Helmholtz equation 572 Henry 92 Hermite interpolation 816 polynomials 216 Hermitian 357, 361 Hertz 63 Hesse's normal form 375 Heun's method 890 High-frequency line equations 594 Hilbert 201, 326 matrix 858 space 326 Histogram 994 Holomorphic 617 Homogeneous differential equation 27, 46, 105, 535 system of equations 288, 304, 833 Hooke's law 62 Householder's tridiagonalization 875 Hyperbolic differential equation 551, 909, 928 functions, complex 628, 743 functions, real A62 paraboloid 448 partial differential equations 551, 928 spiral 399 Hypergeometric differential equation 188 distribution 1024 functions 188 series 188 Hypocycloid 399 Hypothesis 1058
I Idempotent matrix 286 Identity of Lagrange 383 matrix (see Unit matrix) theorem for power series 679 transformation 736 trick 351 Ill-conditioned 851 Image 327, 729
Imaginary axis 604 part 602 unit 602 Impedance 94, 98 Implicit solution 20 Improper integral 222, 719, 722 Impulse 242 IMSL 778 Incidence list 957 matrix 958 Inclusion theorem 868 Incomplete gamma function A64 Incompressible 765 Inconsistent equations 292, 303 Increasing sequence A69 Indefinite integral 637, 650 integration 640 Independence of path 426, 648 Independent events 1004 random variables 1036 Indicial equation 184 Indirect method 845 Inductance 92 Inductor 92 Inequality Bessel 215, 504 Cauchy 660 ML- 644 Schur 869 triangle 326, 372, 608 Infinite dimensional 325 population 1025, 1045 sequence 664 series (see Series) Infinity 710, 736 Initial condition 6, 48, 137, 540 value problem 6, 38, 48, 107, 886, 902 Injective mapping 729 Inner product 325, 359, 371 space 326 Input 26, 84, 230 Instability (see Stability) Integral contour 647
Integral (Cont.) definite 639 double 433 equation 252 Fourier 508, 563 improper 719, 722 indefinite 637, 650 line 421, 633 surface 449 theorems, complex 647, 654 theorems, real 439, 453, 469 transform 221, 513 triple 458 Integrating factor 23 Integration complex functions 637-663, 701-727 Laplace transforms 255 numeric 817-827 power series 680 series 695 Integro-differential equation 92 Interest 9, 33 Interlacing of zeros 197 Intermediate value theorem 796 Interpolation 797-815 Hermite 816 Lagrange 798 Newton 802, 805, 807 spline 811 Interquartile range 995 Intersection of events 998 Interval closed A69 of convergence 172, 676 estimate 1046 open 4, A69 Invariant subspace 865 Inverse hyperbolic functions 634 mapping principle 733 of a matrix 315, 844 trigonometric functions 634 Inversion 735 Investment 9, 33 Irreducible 869 Irregular boundary 919 Irrotational 415, 765 Isocline 10 Isolated singularity 707 Isotherms 758
Iteration for eigenvalues 872 for equations 787-794 Gauss-Seidel 846, 913 Jacobi 850 Picard 41
J Jacobian 436, 733 Jacobi iteration 850 Jerusalem, Shrine of the Book 814 Jordan 316 Joukowski airfoil 732
K Kirchhoff's laws 92, 973 Kronecker delta 210, A83 Kruskal's algorithm 967 Kutta 892
L l1, l2, l∞ 853 L2 863 Labeling 968 Lagrange 50 identity of 383 interpolation 798 Laguerre polynomials 209, 257 Lambert's law 43 LAPACK 778 Laplace 221 equation 407, 465, 536, 579, 587, 910 integrals 512 limit theorem 1031 operator 408 transform 221, 594 Laplacian 443, A73 Latent root 324 Laurent series 701, 712 Law of absorption 43 cooling 14 gravitation 385 large numbers 1032 mass action 43 the mean (see Mean value theorem) LC-circuit 97
LCL 1068 Least squares 860, 1084 Lebesgue 863 Left-hand derivative 484 limit 484 Left-handed 378 Legendre 177 differential equation 177, 204, 590 functions 177 polynomials 179, 207, 590, 826 Leibniz 14 convergence test A70 Length of a curve 393 of a vector 365 Leonardo of Pisa 638 Leontief 344 Leslie model 341 Libby 13 Liebmann's method 913 Likelihood function 1047 Limit of a complex function 615 cycle 157 left-hand 484 point A90 right-hand 484 of a sequence 664 vector 386 of a vector function 387 Lineal element 9 Linear algebra 271-363 combination 106, 325 dependence 49, 74, 106, 108, 297, 325 differential equation 26, 45, 105, 535 element 394, A72 fractional transformation 734 independence 49, 74, 106, 108, 297, 325 interpolation 798 operator 60 optimization 939 programming 939 space (see Vector space) system of equations 287, 833 transformation 281, 327 Linearization of systems of ODEs 151 Line integral 421, 633 Lines of force 751
LINPACK 779 Liouville 203 theorem 661 Lipschitz condition 40 List 957 Ljapunov 148 Local error 887 minimum 937 Logarithm 630, 688, A60 Logarithmic decrement 69 integral A66 spiral 399 Logistic population law 30 Longest path 959 Loss of significant digits 785 Lotka-Volterra population model 154 Lot tolerance per cent defective 1075 Lower control limit 1068 triangular matrix 283 LTPD 1075 LU-factorization 841
M Maclaurin 683 series 683 trisectrix 399 Magnitude of a vector (see Length) Main diagonal 274, 309 Malthus's law 5, 31 MAPLE 779 Mapping 327, 729 Marconi 63 Marginal distributions 1035 Markov process 285, 341 Mass-spring system 61, 86, 135, 150, 243, 252, 261, 342, 499 Matching 985 MATHCAD 779 MATHEMATICA 779 Mathematical expectation 1019, 1038 MATLAB 779 Matrix addition 275 augmented 288, 833 band 914
Matrix (Cont.) diagonal 284 eigenvalue problem 333-363, 863-882 Hermitian 357 identity (see Unit matrix) inverse 315 inversion 315, 844 multiplication 278, 321 nonsingular 315 norm 849, 854 normal 362, 869 null (see Zero matrix) orthogonal 345 polynomial 865 scalar 284 singular 315 skew-Hermitian 357 skew-symmetric 283, 345 sparse 812, 912 square 274 stochastic 285 symmetric 283, 345 transpose 282 triangular 283 tridiagonal 812, 875, 914 unit 284 unitary 357 zero 276 Max-flow min-cut theorem 978 Maximum 937 flow 979 likelihood method 1047 matching 983 modulus theorem 772 principle 773 Mean convergence 214 Mean-square convergence 214 Mean value of a (an) analytic function 771 distribution 1016 function 764 harmonic function 772 sample 996 Mean value theorem 402, 434, 454 Median 994, 1081 Membrane 569-586 Meromorphic function 711 Mesh incidence matrix 278 Method of false position 796
Method of (Cont.) least squares 860, 1084 moments 1046 steepest descent 938 undetermined coefficients 78, 117, 160 variation of parameters 98, 118, 160 Middle quartile 994 Minimum 937, 942, 946 MINITAB 991 Minor 309 Mixed boundary value problem 558, 587, 759, 917 triple product 381 Mixing problem 13, 130, 146, 163, 259 Mks system: Front cover ML-inequality 644 Möbius 453 strip 453, 456 transformation 734 Mode 542, 582 Modeling 2, 6, 13, 61, 84, 130, 159, 222, 340, 499, 538, 569, 750-767 Modified Bessel functions 203 Modulus 607 Molecule 912 Moivre's formula 610 Moment central 1019 of a distribution 1019 of a force 380 generating function 1026 of inertia 436, 455, 457 of a sample 1046 vector 380 Monotone sequence A69 Moore's shortest path algorithm 960 Morera's theorem 661 Moulton 900 Moving trihedron (see Trihedron) M-test for convergence 969 Multinomial distribution 1025 Multiple point 391 Multiplication of complex numbers 603, 609 determinants 322 matrices 278, 321 means 1038 power series 174, 680 vectors 219, 371, 377 Multiplication rule for events 1003
Multiplicity 337, 865 Multiply connected domain 646 Multistep method 898 "Multivalued function" 615 Mutually exclusive events 998
N Nabla 403 NAG 779 Natural frequency 63 logarithm 630, A60 spline 812 Neighborhood 387, 613 Nested form 786 NETLIB 779 Networks 132, 146, 162, 244, 260, 263, 277, 331 in graph theory 973 Neumann, C. 201 functions 201 problem 558, 587, 917 Newton 14 -Cotes formulas 822 interpolation formulas 802, 805, 807 law of cooling 14 law of gravitation 385 method 800 -Raphson method 800 second law 62 Neyman 1049, 1058 Nicolson 924 Nicomedes 399 Nilpotent matrix 286 NIST 779 Nodal incidence matrix 277 line 574 Node 142, 797 Nonbasic variables 945 Nonconservative 428 Nonhomogeneous differential equation 27, 78, 116, 159, 305, 535 system of equations 288, 304 Nonlinear differential equations 45, 151 Nonorientable surface 453 Nonparametric test 1080 Nonsingular matrix 315 Norm 205, 326, 346, 359, 365, 849
Normal acceleration 395 asymptotically 1057 to a curve 398 derivative 444, 465 distribution 1026, 1047-1057, 1062-1067, A98 two-dimensional 1090 equations 860, 1086 form of a PDE 551 matrix 362, 869 mode 542, 582 plane 398 to a plane 375 random variable 1026 to a surface 447 vector 375, 447 Null hypothesis 1058 matrix (see Zero matrix) space 301 vector (see Zero vector) Nullity 301 Numeric methods 777-934 differentiation 827 eigenvalues 863-882 equations 787-796 integration 817-827 interpolation 797-816 linear equations 833-858 matrix inversion 315, 844 optimization 936-953 ordinary differential equations (ODEs) 886-908 partial differential equations (PDEs) 909-930 Nystrom method 906
O O 962 Objective function 936 OC curve 1062 Odd function 490 ODE 4 (see also Differential equations) Ohm's law 92 One-dimensional heat equation 553 wave equation 539 One-sided derivative 484 test 1060
One-step method 898 One-to-one mapping 729 Open disk 613 integration formula 827 interval A69 Operating characteristic 1062 Operational calculus 59, 220 Operation count 838 Operator 59, 327 Optimality principle, Bellman's 963 Optimal solution 942 Optimization 936-953, 959-990 Orbit 141 Order 887, 962 of a determinant 308 of a differential equation 4, 535 of an iteration process 793 Ordering 969 Ordinary differential equations 2-269, 886-908 (see also Differential equation) Orientable surface 452 Orientation of a curve 638 surface 452 Orthogonal coordinates A71 curves 35 eigenvectors 350 expansion 210 functions 205, 482 matrix 345 polynomials 209 series 210 trajectories 35 transformation 346 vectors 326, 371 Orthonormal functions 205, 210 Oscillations of a beam 547, 552 of a cable 198 in circuits 91 damped 64, 88 forced 84 free 61, 500, 547 harmonic 63 of a mass on a spring 61, 86, 135, 150, 243, 252, 261, 342, 499 of a membrane 569-586
Oscillations (Cont.) self-sustained 157 of a string 204, 538, 929 undamped 62 Osculating plane 398 Outcome 997 Outer product 377 Outlier 995 Output 26, 230 Overdamping 65 Overdetermined system 292 Overflow 782 Overrelaxation 851 Overtone 542
P Paired comparison 1065 Pappus's theorem 458 Parabolic differential equation 551, 922 Paraboloid 448 Parachutist 12 Parallelepiped 382 Parallel flow 766 Parallelogram equality 326, 372, 612 law 367 Parameter of a distribution 1016 Parametric representation 389, 446 Parking problem 1023 Parseval's equality 215, 504 Partial derivative 388, A66 differential equation 535, 909 fractions 231, 245 pivoting 291, 834 sum 171, 480, 666 Particular solution 6, 48, 106, 159 Pascal 399 Path in a digraph 974 in a graph 959 of integration 421, 637 PDE 535, 909 (see also Differential equation) Peaceman-Rachford method 915 Pearson, E. S. 1058 Pearson, K. 1066 Pendulum 68, 152, 156 Period 478
Periodic extension 494 function 478 Permutation 1006 Perron-Frobenius theorem 344, 869 Pfaffian form 429 Phase angle 88 of complex number (see Argument) lag 88 plane 141, 147 portrait 141, 147 Picard iteration method 41 theorem 709 Piecewise continuous 226 smooth 421, 448, 639 Pivoting 291, 834 Planar graph 987 Plane 315, 375 Plane curve 391 Poincaré 216 Point estimate 1046 at infinity 710, 736 set 613 source 765 spectrum 507, 524 Poisson 769 distribution 1022, 1073, A97 equation 910, 918 integral formula 769 Polar coordinates 437, 443, 580, 607-608 form of complex numbers 607 moment of inertia 436 Pole 708 Polynomial approximation 797 matrix 865 Polynomially bounded 962 Polynomials 617 Chebyshev 209 Hermite 216 Laguerre 207, 257 Legendre 179, 207, 590, 826 trigonometric 502 Population in statistics 1044 Population models 31, 154, 341
Position vector 366 Positive definite 326, 372 Possible values 1012 Postman problem 963 Potential 407, 427, 590, 750, 762 complex 763 theory 465, 749 Power method for eigenvalues 872 of a test 1061 series 167, 673 series method 167 Precision 782 Predator-prey 154 Predictor-corrector 890, 900 Pre-Hilbert space 326 Prim's algorithm 971 Principal axes theorem 354 branch 632 diagonal (see Main diagonal) directions 340 normal (Fig. 210) 397 part 708 value 607, 630, 632, 719, 722 Prior estimate 794 Probability 1000, 1001 conditional 1003 density 1014, 1034 distribution 1010, 1032 function 1012, 1033 Producer's risk 1075 Product (see Multiplication) Projection of a vector 374 Pseudocode 783 Pure imaginary number 603
Q QR-factorization method 879 Quadratic equation 785 form 353 interpolation 799 Qualitative methods 124, 139-165 Quality control 1068 Quartile 995 Quasilinear 551, 909 Quotient of complex numbers 604
R Rachford method 915 Radiation 7, 561 Radiocarbon dating 13 Radius of convergence 172, 675, 686 of a graph 973 Random experiment 997 numbers 1045 variable 1010, 1032 Range of a function 614 sample 994 Rank of a matrix 297, 299, 311 Raphson 790 Rational function 617 Ratio test 669 Rayleigh 159 equation 159 quotient 872 RC-circuit 97, 237, 240 Reactance 93 Real axis 604 part 602 sequence 664 vector space 324, 369 Rectangular membrane 571 pulse 238, 243 rule 817 wave 211, 480, 488, 492 Rectifiable curve 393 Rectification of a lot 1075 Rectifier 248, 489, 492 Rectifying plane 398 Reduction of order 50, 116 Region 433, 614 Regression 1083 coefficient 1085, 1088 line 1084 Regula falsi 796 Regular point of an ODE 183 Sturm-Liouville problem 206 Rejectable quality level 1075 Rejection region 1060 Relative class frequency 994
Relative (Cont.) error 784 frequency 1000 Relaxation 850 Remainder 171, 666 Removable singularity 709 Representation 328 Residual 849, 852 Residue 713 Residue theorem 715 Resistance 91 Resonance 86 Response 28, 84 Restoring force 62 Resultant of forces 367 Riccati equation 34 Riemann 618 sphere 710 surface 746 Right-hand derivative 484 limit 484 Right-handed 377 Risk 1095 RL-circuit 97, 240 RLC-circuit 95, 241, 244, 499 Robin problem 558, 587 Rodrigues's formula 181 Romberg integration 829 Root 610 Root test 671 Rotation 381, 385, 414, 734, 764 Rounding 782 Row echelon form 294 -equivalent 292, 298 operations 292 scaling 838 space 300 sum norm 849 vector 274 Runge-Kutta-Fehlberg 893 Runge-Kutta methods 892, 904 Runge-Kutta-Nystrom 906
S Saddle point 143 Sample 997, 1045 covariance 1085
Sample (Cont.) distribution function 1076 mean 996, 1045 moments 1046 point 997 range 994 size 997, 1045 space 997 standard deviation 996, 1046 variance 996, 1045 Sampling 1004, 1023, 1073 SAS 991 Sawtooth wave 248, 493, 505 Scalar 276, 364 field 384 function 384 matrix 284 multiplication 276, 368 triple product 381 Scaling 838 Scheduling 987 Schoenberg 810 Schrödinger 242 Schur's inequality 869 Schwartz 242 Schwarz inequality 326 Secant 627, A62 method 794 Second Green's formula 466 shifting theorem 235 Sectionally continuous (see Piecewise continuous) Seidel 846 Self-starting 898 Self-sustained oscillations 157 Separable differential equation 12 Separation of variables 12, 540 Sequence 664, A69 Series 666, A69 addition of 680 of Bessel functions 213, 583 binomial 689 convergence of 171, 666 differentiation of 174, 680 double Fourier 576 of eigenfunctions 210 Fourier 211, 480, 487 geometric 167, 668, 673, 687, 692 harmonic 670
Series (Cont.) hypergeometric 188 infinite 666, A70 integration of 680 Laurent 701, 712 Maclaurin 683 multiplication of 174, 680 of orthogonal functions 210 partial sums of 171, 666 power 167, 673 real A69 remainder of 171, 666, 684 sum of 171, 666 Taylor 683 trigonometric 479 value of 171, 666 Serret-Frenet formulas (see Frenet formulas) Set of points 613 Shifted data problem 232 Shifting theorems 224, 235, 528 Shortest path 959 spanning tree 967 Shrine of the Book 814 Sifting 242 Significance level 1059 Significant digit 781 in statistics 1059 Sign test 1081 Similar matrices 350 Simple curve 391 event 997 graph 955 pole 708 zero 709 Simplex method 944 table 945 Simply connected 640, 646 Simpson's rule 821 Simultaneous corrections 850 differential equations 124 linear equations (see Linear systems) Sine of a complex variable 627, 688, 742 hyperbolic 688
Sine (Cont.)
  integral 509, 690, A65, A95
  of a real variable A60
Single precision 782
Single-valued relation 615
Singular at infinity 711
  matrix 315
  point 183, 686, 707
  solution 8, 50
  Sturm-Liouville problem 206
Singularity 686, 707
Sink 464, 765, 973
SI system: Front cover
Size of a sample 997, 1045
Skew-Hermitian 357, 361
Skewness 1020
Skew-symmetric matrix 283, 345
Skydiver 12
Slack variable 941
Slope field 9
Smooth curve 421, 638
  piecewise 421, 448, 639
  surface 448
Sobolev 242
Soft spring 159
Software 778, 991
Solution of a differential equation 4, 46, 105, 536
  general 6, 48, 106, 138
  particular 6, 48, 106
  singular 8, 50
  space 304
  steady-state 88
  of a system of differential equations 136
  of a system of equations 288
  vector 288
SOR 851
Sorting 969, 993
Source 464, 765, 973
Span 300
Spanning tree 967
Sparse graph 957
  matrix 812, 912
  system of equations 846
Spectral density 520
  mapping theorem 344, 865
Spectral (Cont.)
  radius 848
  representation 520
  shift 344, 865, 874
Spectrum 324, 542, 864
Speed 394
Sphere 446
Spherical coordinates 588, A71
Spiral 399
  point 144
Spline 811
S-PLUS, SPSS 992
Spring 62
Square error 503
  matrix 274
  membrane 575
  root 792
  wave 211, 480, 488, 492
Stability 31, 148, 783, 822, 922
  chart 148
Stagnation point 763
Staircase function 248
Standard basis 328, 369
  deviation 1016
  form of a linear ODE 26, 45, 105
Standardized random variable 1018
Stationary point 937
Statistical inference 1044
  tables A96-A106
Steady 413, 463
  state 88
Steepest descent 938
Steiner 399, 457
Stem-and-leaf plot 994
Stencil 912
Step-by-step method 886
Step function 234
Step size control 889
Stereographic projection 711
Stiff ODE 896
  system of ODEs 907
Stirling formula 1008, A64
Stochastic matrix 285
  variable 1011
Stokes's theorem 469
Straight line 375, 391
Stream function 762
Streamline 761
Strength of a source 767
Strictly diagonally dominant 868
String 204, 538, 594
Student's t-distribution 1053, A100
Sturm-Liouville problem 203
Subgraph 956
Submarine cable equations 594
Submatrix 302
Subsidiary equation 220, 230
Subspace 300
  invariant 865
Success 1021
Successive corrections 850
  overrelaxation 851
Sum (see Addition)
Sum of a series 171, 666
Superlinear convergence 795
Superposition principle 106, 138
Surface 445
  area 435, 442, 454
  integral 449
  normal 406
Surjective mapping 729
Symmetric matrix 283, 345
System of differential equations 124-165, 258-263, 902
  linear equations (see Linear systems)
  units: Front cover
T
Tables
  on differentiation: Front cover
  of Fourier transforms 529-531
  of functions A94-A106
  of integrals: Front cover
  of Laplace transforms 265-267
  statistical A96-A106
Tangent 627, A62
  to a curve 392, 397
  hyperbolic 629, A62
  plane 406, 447
  vector 392
Tangential acceleration 395
Target 973
Tarjan 971
Taylor 683
  formula 684
  series 683
Tchebichef (see Chebyshev)
t-distribution 1053, A100
Telegraph equations 594
Termination criterion 791
Termwise differentiation 696
  integration 695
  multiplication 680
Test
  chi-square 1077
  for convergence 667-672
  of hypothesis 1058-1068
  nonparametric 1080
Tetrahedron 382
Thermal conductivity 465, 552
  diffusivity 465, 552
Three-eighths rule 830
Three-sigma limits 1028
Timetabling 987
Torricelli's law 15
Torsion of a curve 397
Torsional vibrations 68
Torus 454
Total differential 19
Trace of a matrix 344, 355, 864
Trail 959
Trajectories 35, 133, 141
Transfer function 230
Transformation of Cartesian coordinates A84
  by a complex function 729
  of integrals 437, 439, 459, 469
  linear 281
  linear fractional 734
  orthogonal 346
  similarity 350
  unitary 359
  of vector components A83
Transient state 88
Translation 365, 734
Transmission line equations 593
Transpose of a matrix 282
Transpositions 1081, A106
Trapezoidal rule 817
Traveling salesman problem 960
Tree 966
Trend 1081
Trial 997
Triangle inequality 326, 372, 608
Triangular matrix 283
Tricomi equation 551, 558
Tridiagonalization 875
Tridiagonal matrix 812, 875, 914
Trigonometric approximation 502
  form of complex numbers 607, 624
  functions, complex 626, 688
  functions, real A60
  polynomial 502
  series 479
  system 479, 482
Trihedron 398
Triple integral 458
  product 381
Trivial solution 27, 304
Truncation error 783
Tuning 543
Twisted curve 391
Two-dimensional distribution 1032
  normal distribution 1090
  random variable 1032
  wave equation 571
Two-sided test 1060
Type of a differential equation 551
Type I and II errors 1060
U
UCL 1068
Unconstrained optimization 937
Uncorrelated 1090
Undamped system 62
Underdamping 66
Underdetermined system 292
Underflow 782
Undetermined coefficients 79, 117, 160
Uniform convergence 691
  distribution 1015, 1017, 1034
Union of events 998
Uniqueness
  differential equations 37, 73, 107, 137
  Dirichlet problem 774
Uniqueness (Cont.)
  Laurent series 705
  linear equations 303
  power series 678
Unit binormal vector 398
  circle 611
  impulse 242
  matrix 284
  normal vector 447
  principal normal vector 398
  step function 234
  tangent vector 392, 398
  vector 326
Unitary matrix 357
  system of vectors 359
  transformation 359
Unstable (see Stability)
Upper control limit 1068
V
Value of a series 171, 666
Vandermonde determinant 112
Van der Pol equation 157
Variable
  complex 614
  random 1010, 1032
  standardized random 1018
  stochastic 1011
Variance of a distribution 1016
  sample 996, 1045
Variation of parameters 98, 118, 160
Vector 274, 364
  addition 276, 327, 367
  field 384
  function 384
  moment 380
  norm 853
  product 377
  space 300, 323, 369
  subspace 300
Velocity 394
  field 385
  potential 762
  vector 394
Venn diagram 998
Verhulst 31
Vertex 955
  coloring 987
  exposed 983
  incidence list 957
Vibrations (see Oscillations)
Violin string 538
Vizing's theorem 987
Volta 92
Voltage drop 92
Volterra 154, 201, 253
Volume 435
Vortex 767
Vorticity 764
W
Waiting time problem 1013
Walk 959
Wave equation 536, 539, 569, 929
Weber 217
  equation 217
Weber (Cont.)
  functions 201
Website: see Preface
Weierstrass 618, 696
  approximation theorem 797
  M-test 696
Weighted graph 959
Weight function 205
Well-conditioned 852
Wessel 605
Wheatstone bridge 296
Work 373, 423
  integral 422
Wronskian 75, 108
Z
Zero of analytic function 709
  matrix 276
  vector 367
Systems of Units. Some Important Conversion Factors

The most important systems of units are shown in the table below. The mks system is also known as the International System of Units (abbreviated SI), and the abbreviations s (instead of sec), g (instead of gm), and N (instead of nt) are also used.

System of units      Length           Mass           Time          Force
cgs system           centimeter (cm)  gram (gm)      second (sec)  dyne
mks system           meter (m)        kilogram (kg)  second (sec)  newton (nt)
Engineering system   foot (ft)        slug           second (sec)  pound (lb)
1 inch (in.) = 2.540000 cm
1 foot (ft) = 12 in. = 30.480000 cm
1 yard (yd) = 3 ft = 91.440000 cm
1 statute mile (mi) = 5280 ft = 1.609344 km
1 nautical mile = 6080 ft = 1.853184 km
1 acre = 4840 yd^2 = 4046.8564 m^2
1 mi^2 = 640 acres = 2.5899881 km^2
1 fluid ounce = 1/128 U.S. gallon = 231/128 in.^3 = 29.573730 cm^3
1 U.S. gallon = 4 quarts (liq) = 8 pints (liq) = 128 fl oz = 3785.4118 cm^3
1 British Imperial and Canadian gallon = 1.200949 U.S. gallons = 4546.087 cm^3
1 slug = 14.59390 kg
1 pound (lb) = 4.448444 nt
1 newton (nt) = 10^5 dynes
1 British thermal unit (Btu) = 1054.35 joules
1 joule = 10^7 ergs
1 calorie (cal) = 4.1840 joules
1 kilowatt-hour (kWh) = 3414.4 Btu = 3.6 · 10^6 joules
1 horsepower (hp) = 2542.48 Btu/h = 178.298 cal/sec = 0.74570 kW
1 kilowatt (kW) = 1000 watts = 3414.43 Btu/h = 238.662 cal/sec
°F = °C · 1.8 + 32
1° = 60′ = 3600″ = 0.017453293 radian
For further details see, for example, D. Halliday, R. Resnick, and J. Walker, Fundamentals of Physics, 7th ed., New York: Wiley, 2005. See also an American National Standard, ASTM/IEEE Standard Metric Practice, Institute of Electrical and Electronics Engineers, Inc., 445 Hoes Lane, Piscataway, NJ 08854.
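For readers working numerically, the factors above are easy to apply programmatically. The following is a minimal sketch (ours, not part of the original tables); the dictionary and function names are illustrative assumptions, while the numerical factors are transcribed directly from the table.

# Minimal sketch: applying the conversion factors above in Python.
# The factor values are transcribed from the table; the names are our own.
TO_SI = {
    "in":  0.02540000,   # inch -> meter
    "ft":  0.30480000,   # foot -> meter
    "mi":  1609.344,     # statute mile -> meter
    "lb":  4.448444,     # pound (force) -> newton
    "Btu": 1054.35,      # British thermal unit -> joule
    "cal": 4.1840,       # calorie -> joule
}

def to_si(value, unit):
    """Convert a value in the given unit to the corresponding SI unit."""
    return value * TO_SI[unit]

print(to_si(1.0, "mi"))   # 1609.344 m, i.e. 1.609344 km
print(to_si(1.0, "Btu"))  # 1054.35 J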
Differentiation

(cu)' = cu'   (c constant)
(u + v)' = u' + v'
(uv)' = u'v + uv'
(u/v)' = (u'v - uv')/v^2
du/dx = (du/dy)(dy/dx)   (chain rule)
(x^n)' = nx^(n-1)
(e^x)' = e^x
(sin x)' = cos x
(cos x)' = -sin x
(tan x)' = sec^2 x
(cot x)' = -csc^2 x
(sinh x)' = cosh x
(cosh x)' = sinh x
(ln x)' = 1/x
(arcsin x)' = 1/√(1 - x^2)
(arccos x)' = -1/√(1 - x^2)
(arctan x)' = 1/(1 + x^2)
(arccot x)' = -1/(1 + x^2)

Integration

∫ uv' dx = uv - ∫ u'v dx   (integration by parts)
∫ x^n dx = x^(n+1)/(n + 1) + c   (n ≠ -1)
∫ (1/x) dx = ln |x| + c
∫ e^(ax) dx = (1/a) e^(ax) + c
∫ sin x dx = -cos x + c
∫ cos x dx = sin x + c
∫ tan x dx = -ln |cos x| + c
∫ cot x dx = ln |sin x| + c
∫ sec x dx = ln |sec x + tan x| + c
∫ csc x dx = ln |csc x - cot x| + c
∫ dx/(x^2 + a^2) = (1/a) arctan (x/a) + c
∫ dx/√(a^2 - x^2) = arcsin (x/a) + c
∫ dx/√(x^2 + a^2) = arcsinh (x/a) + c
∫ dx/√(x^2 - a^2) = arccosh (x/a) + c
∫ sin^2 x dx = x/2 - (sin 2x)/4 + c
∫ cos^2 x dx = x/2 + (sin 2x)/4 + c
∫ tan^2 x dx = tan x - x + c
∫ cot^2 x dx = -cot x - x + c
∫ ln x dx = x ln x - x + c
∫ e^(ax) sin bx dx = e^(ax) (a sin bx - b cos bx)/(a^2 + b^2) + c
∫ e^(ax) cos bx dx = e^(ax) (a cos bx + b sin bx)/(a^2 + b^2) + c
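Entries in the integral table can be spot-checked symbolically. The following sketch, assuming SymPy is available, differentiates the tabulated antiderivative of e^(ax) sin bx and simplifies the result back to the integrand; the variable names are ours.

# A quick symbolic check of one entry in the integral table above.
import sympy as sp

x = sp.symbols('x')
a, b = sp.symbols('a b', positive=True)  # ensures a^2 + b^2 != 0

# Tabulated antiderivative of e^(ax) sin(bx):
F = sp.exp(a*x) * (a*sp.sin(b*x) - b*sp.cos(b*x)) / (a**2 + b**2)
integrand = sp.exp(a*x) * sp.sin(b*x)

# Differentiating F and simplifying should recover the integrand exactly.
assert sp.simplify(sp.diff(F, x) - integrand) == 0
print("d/dx of the tabulated antiderivative equals e^(ax) sin(bx)")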
Polar Coordinates

x = r cos θ,   y = r sin θ
tan θ = y/x
dx dy = r dr dθ

Some Constants

e = 2.71828 18284 59045 23536
√e = 1.64872 12707 00128 14685
e^2 = 7.38905 60989 30650 22723
π = 3.14159 26535 89793 23846
π^2 = 9.86960 44010 89358 61883
√π = 1.77245 38509 05516 02730
log10 π = 0.49714 98726 94133 85435
ln π = 1.14472 98858 49400 17414
log10 e = 0.43429 44819 03251 82765
ln 10 = 2.30258 50929 94045 68402
√2 = 1.41421 35623 73095 04880
∛2 = 1.25992 10498 94873 16477
√3 = 1.73205 08075 68877 29353
∛3 = 1.44224 95703 07408 38232
ln 2 = 0.69314 71805 59945 30942
ln 3 = 1.09861 22886 68109 69140
γ = 0.57721 56649 01532 86061 (see Sec. 5.6)
ln γ = -0.54953 93129 81644 82234
1° = 0.01745 32925 19943 29577 rad
1 rad = 57.29577 95130 82320 87680° = 57° 17′ 44.806″

Series

1/(1 - x) = Σ_{m=0}^∞ x^m                      (|x| < 1)
ln (1 - x) = -Σ_{m=1}^∞ x^m / m                (|x| < 1)
sin x = Σ_{m=0}^∞ (-1)^m x^(2m+1) / (2m + 1)!
cos x = Σ_{m=0}^∞ (-1)^m x^(2m) / (2m)!
arctan x = Σ_{m=0}^∞ (-1)^m x^(2m+1) / (2m + 1)   (|x| ≤ 1)

Greek Alphabet

α        Alpha        ν        Nu
β        Beta         ξ, Ξ     Xi
γ, Γ     Gamma        ο        Omicron
δ, Δ     Delta        π, Π     Pi
ε        Epsilon      ρ        Rho
ζ        Zeta         σ, Σ     Sigma
η        Eta          τ        Tau
θ, ϑ, Θ  Theta        υ, Υ     Upsilon
ι        Iota         φ, ϕ, Φ  Phi
κ        Kappa        χ        Chi
λ, Λ     Lambda       ψ, Ψ     Psi
μ        Mu           ω, Ω     Omega

Vectors

a · b = a1 b1 + a2 b2 + a3 b3

        | i   j   k  |
a × b = | a1  a2  a3 |
        | b1  b2  b3 |

grad f = ∇f = (∂f/∂x) i + (∂f/∂y) j + (∂f/∂z) k

div v = ∇ · v = ∂v1/∂x + ∂v2/∂y + ∂v3/∂z

                 | i     j     k    |
curl v = ∇ × v = | ∂/∂x  ∂/∂y  ∂/∂z |
                 | v1    v2    v3   |
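The series and vector formulas above lend themselves to quick numerical sanity checks. Here is a minimal Python sketch (ours; the function names are illustrative): it compares partial sums of two of the Maclaurin series with library values and verifies that a × b is orthogonal to a.

# Numerical sanity checks of the series and vector formulas above.
import math

def sin_series(x, terms=10):
    """Partial sum of sum_{m>=0} (-1)^m x^(2m+1) / (2m+1)!."""
    return sum((-1)**m * x**(2*m + 1) / math.factorial(2*m + 1)
               for m in range(terms))

def arctan_series(x, terms=200):
    """Partial sum of sum_{m>=0} (-1)^m x^(2m+1) / (2m+1), valid for |x| <= 1."""
    return sum((-1)**m * x**(2*m + 1) / (2*m + 1) for m in range(terms))

def cross(a, b):
    """Cross product a x b, expanding the determinant formula above."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

x = 0.5
print(sin_series(x), math.sin(x))       # partial sum vs. library value
print(arctan_series(x), math.atan(x))   # agree to many digits for |x| <= 1

a, b = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
c = cross(a, b)
print(sum(ai*ci for ai, ci in zip(a, c)))  # 0.0: a x b is orthogonal to a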