Numerical Analysis by Dr. Anita Pal Assistant Professor Department of Mathematics National Institute of Technology Durgapur Durgapur-713209 email:
[email protected]
Chapter 1 Numerical Errors
Module No. 1 Errors in Numerical Computations
Two major techniques are used to solve any mathematical problem: analytical and numerical. An analytical solution is obtained in a compact form and is generally free from error. On the other hand, a numerical method is a technique used to solve a problem with the help of a computer or calculator. In general, the solution obtained by this method contains some error. But for some classes of problems it is very difficult to obtain an analytical solution; for these problems we generally use numerical methods. For example, the solutions of complex non-linear differential equations cannot be determined by analytical methods, but these problems can easily be solved by numerical methods. In a numerical method there is always scope for errors to occur, and hence it is important to understand the source, propagation, magnitude and rate of growth of these errors. To solve a problem with the help of a computer, a special kind of method is required, known as a numerical method; analytical methods are not suitable for solving a problem by computer. Thus, numerical methods are highly appreciated and extensively used by scientists and engineers. Let us discuss the sources of error.
1.1 Sources of error It is well known that the solution of a problem obtained by a numerical method contains some error, and our intention is to minimize it. To minimize the error, the most essential thing is to identify its causes or sources. Three sources of error, viz. inherent errors, round-off errors and truncation errors, occur when a problem is solved by a numerical method. They are discussed below.
(i) Inherent errors: These errors occur due to the simplified assumptions made during the mathematical modelling of the problem. They also occur when the data is obtained from physical measurements of the parameters of the problem.
(ii) Round-off errors: Generally, numerical methods are carried out on a computer. In numerical computation, all numbers are represented as decimal fractions, and a computer can store only a finite number of digits for a number. Some numbers, viz. 1/3, 1/6, 1/7 etc., cannot be represented by a decimal fraction with a finite number of digits.
Thus, to represent these numbers some digits must be discarded, and hence the numbers have to be rounded off to some finite number of digits. So in arithmetic computation some errors occur due to the finite representation of numbers; these errors are called round-off errors. These errors depend on the word length of the computer used.
(iii) Truncation errors: These errors occur due to the finite representation of an inherently infinite process. This type of error is explained by an example. Let us consider the cosine series. The Taylor's series expansion of cos x is

cos x = 1 − x²/2! + x⁴/4! − x⁶/6! + ··· .
It is well known that this series is infinite. If we take only the first five terms to calculate the value of cos x for a given x, then we obtain an approximate value. The error due to the truncation of the remaining terms of the series is called the truncation error. Note that the truncation error is independent of the computing machine.
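To make the truncation error concrete, here is a small Python sketch (illustrative only; cos_truncated is a name coined for this example) comparing the five-term partial sum of the series with the library cosine:

```python
import math

def cos_truncated(x, terms=5):
    """Partial sum of the Taylor series of cos x with the given number of terms."""
    return sum((-1)**k * x**(2*k) / math.factorial(2*k) for k in range(terms))

x = 0.5
approx = cos_truncated(x)      # 1 - x^2/2! + x^4/4! - x^6/6! + x^8/8!
exact = math.cos(x)            # accurate to machine precision
print(approx, exact, abs(exact - approx))   # the gap is the truncation error
```

The gap shrinks as more terms are kept, independently of the machine used, which is exactly what distinguishes truncation error from round-off error.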
1.2 Exact and approximate numbers In numerical computation, a number is considered as either an exact or an approximate value of a solution of a problem. An exact number represents the true value of a result, while an approximate number represents a value close to the true value. For example, in the statements 'a book has 134 pages' and 'the population of a locality is 15000', the numbers 134 and 15000 are exact numbers. But in the assertions 'the time taken to fly from Kolkata to New Delhi is 2 hrs' and 'the number of leaves of a mango tree is 150000', the numbers 2 and 150000 are approximate numbers, as the time to fly from Kolkata to New Delhi is approximately 2 hrs, and similarly the number of leaves of the tree is approximately 150000, because it is not possible to count the exact number of leaves of a big tree. These approximations come either from the imperfection of measuring instruments or because the measurement depends on other parameters. There are no absolutely exact measuring instruments; each of them has its own accuracy.
It may be noted that the same number may be exact as well as approximate. For example, the number 3 is exact when it represents the number of rooms of a house, and approximate when it represents the number π.

The accuracy of a solution is defined in terms of the number of digits used in the computation. The significant digits or significant figures of a number are all its digits except the zeros which appear to the left of the first non-zero digit; the zeros at the end of a number are always significant digits. The numbers 0.000342 and 8921.2300 have 3 and 8 significant digits respectively.

Sometimes we need to cut off usable digits; the number of digits to be cut off depends on the problem. This process of cutting digits off a number is called rounding-off. In the rounding process the number is approximated by a very close number consisting of fewer digits. In that case, one or more digits are kept in the number, taken from left to right, and all other digits are discarded.

Rules of rounding-off
(i) If the discarded digits constitute a number which is larger than half the unit in the last decimal place that remains, then the last remaining digit is increased by one. If the discarded digits constitute a number which is smaller than half the unit in the last decimal place that remains, then the remaining digits do not change.
(ii) If the discarded digits constitute a number which is equal to half the unit in the last decimal place that remains, then the last remaining digit is increased by one if it is odd, and is unchanged if it is even. This rule is often called the rule of the even digit.

In Table 1.1, we consider different cases to illustrate the round-off process. In this table the numbers are rounded off to six significant figures. (A computer keeps more digits during round-off; how many depends on the computer and on the type of the number declared in a programming language.) Note that rounded-off numbers contain errors, and these errors are called round-off errors.
Exact number      Rounded to six significant figures
26.0123728        26.0124      (added 1 in the last digit)
23.12432615       23.1243      (last digit remains unchanged)
30.455354         30.4554      (added 1 in the last digit)
19.652456         19.6525      (added 1 in the last digit)
126.3545          126.354      (last digit remains unchanged: even-digit rule)
34.4275           34.4280      (added 1 in the last digit to make it even)
8.999996          9.00000      (added 1 in the last digit)
9.999997          10.0000      (added 1 in the last digit)
0.0023456573      0.00234566   (added 1 in the last digit)
6.237             6.23700      (added two 0's to make six figures)
67542159          675422×10²   (integer rounded to six digits)

Table 1.1: Different cases of round-off numbers.
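Rule (ii) above is the round-half-to-even convention that Python's decimal module (and the built-in round) also follows; the sketch below, with an illustrative helper round_sig, reproduces some rows of Table 1.1:

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_sig(x, sig=6):
    """Round x to `sig` significant figures with the round-half-to-even rule."""
    d = Decimal(str(x))
    exp = d.adjusted()                    # position of the leading digit
    q = Decimal(1).scaleb(exp - sig + 1)  # quantum 10**(exp - sig + 1)
    return d.quantize(q, rounding=ROUND_HALF_EVEN)

print(round_sig(26.0123728))    # 26.0124
print(round_sig(23.12432615))   # 23.1243
print(round_sig(126.3545))      # 126.354  (exact half; last kept digit 4 is even)
print(round_sig(19.652456))     # 19.6525
```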
1.3 Absolute, relative and percentage errors Let xA be an approximate value of an exact number xT. The difference between the exact value xT and its approximate value xA is the error. But in principle it is not possible to determine the value of the error xT − xA, or even its sign, when the exact number xT is unknown. Errors are classified as absolute error, relative error and percentage error.

Absolute error: Let xA be an approximate value of the exact number xT. Then the absolute error is denoted by ∆x and satisfies the relation

∆x ≥ |xT − xA|.

Note that the absolute error is an upper bound of the difference between xT and xA; this definition is applicable when there are many approximate values of the exact number xT. Otherwise, ∆x = |xT − xA|.
Also, the exact value xT lies between xA − ∆x and xA + ∆x. This can be written as

xT = xA ± ∆x.    (1.1)

The upper bound of the absolute error is

absolute error ≤ (1/2) × 10⁻ᵐ,    (1.2)
when the number is rounded to m decimal places. Note that the absolute error measures only the quantitative side of the error: it does not indicate how accurate the measurement is. For example, suppose the length and width of a pond are measured by a tape in metres, with width w = 50 ± 2 m and length l = 250 ± 2 m. In both measurements the absolute error is 2 m, but it is obvious that the second measurement is more accurate. To judge the quality of measurements, we introduce a new concept called relative error.

Relative error: The relative error is denoted by δx and is defined by

δx = ∆x/|xT| or ∆x/|xA|,  |xT| ≠ 0 and |xA| ≠ 0.
This expression can also be written as xT = xA(1 ± δx) or xA = xT(1 ± δx). Note that the absolute error is the total error of the whole measured quantity, while the relative error is the error per unit of measurement. In the above example, the relative errors are δw = 2/50 = 0.04 and δl = 2/250 = 0.008. Thus, the second measurement is more accurate. In general, the relative error measures both the quantity of error and the quality of the measurement; it is therefore a better measure of error than the absolute error.

Percentage error: The relative error is measured on a 1-unit scale while the percentage error is measured
on a 100-unit scale. The percentage error is δx × 100%. This error is sometimes called the relative percentage error. The percentage error measures both quantity and quality; generally, when the relative error is very small, the percentage error is reported instead. Note that the relative and percentage errors are free from the unit of measurement, while the absolute error depends on the measuring unit.

Example 1.1 Find the absolute, relative and percentage errors in xA when xT = 1/7 and xA = 0.1429.
Solution. The absolute error is

∆x = |xT − xA| = |1/7 − 0.1429| = |(1 − 1.0003)/7| = 0.0003/7 = 0.000043, rounded to two significant figures.

The relative error is

δx = ∆x/xT = 0.000043/(1/7) ≈ 0.0003.
The percentage error is δx × 100% = 0.0003 × 100% ≈ 0.03%.

Example 1.2 Find the absolute error and the exact number corresponding to the approximate number xA = 7.543, given that the percentage error is 0.1%.

Solution. The relative error is δx = 0.1% = 0.001. Therefore, the absolute error is ∆x = |xA × δx| = 7.543 × 0.001 = 0.007543 ≈ 0.0075. Thus, the exact value is 7.543 ± 0.0075.

Example 1.3 Suppose two exact numbers and their approximate values are given by xT = 17/19 ≈ 0.8947 and yT = √71 ≈ 8.4261. Find out which approximation is better.

Solution. To find the absolute errors, we take the numbers xA and yA with a larger number of decimal digits: xA = 17/19 ≈ 0.894736···, yA = √71 ≈ 8.426149···. Therefore, the absolute errors are ∆x = |0.894736··· − 0.8947| ≈ 0.000036 and ∆y = |8.426149··· − 8.4261| ≈ 0.000049.
Thus, δx = 0.000036/0.8947 ≈ 0.000040 = 0.0040% and δy = 0.000049/8.4261 ≈ 0.0000058 = 0.00058%. The percentage error in the second case is 0.00058 while in the first case it is 0.0040. Thus, the second measurement is better than the first one.
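The three measures can be wrapped in a small helper (a sketch; the function name is illustrative) and applied to Example 1.3:

```python
def errors(x_true, x_approx):
    """Return the (absolute, relative, percentage) errors of an approximation."""
    abs_err = abs(x_true - x_approx)
    rel_err = abs_err / abs(x_true)      # assumes x_true != 0
    return abs_err, rel_err, 100 * rel_err

print(errors(17 / 19, 0.8947))       # percentage error around 0.004%
print(errors(71 ** 0.5, 8.4261))     # around 0.0006%: the better approximation
```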
1.4 Valid significant digits A decimal integer can be represented in many ways. For example, the number 7600000 can be written as 760 × 10⁴ or 76.0 × 10⁵ or 0.7600000 × 10⁷. Note that each representation has two parts: the first part is called the mantissa and the second part the exponent. In the last form the mantissa is a proper fraction and the first digit after the decimal point is non-zero; this form is known as the normalized form and is commonly used in computers.

Every positive decimal number a can be expressed as

a = d₁ × 10ᵐ + d₂ × 10ᵐ⁻¹ + ··· + dₙ × 10ᵐ⁻ⁿ⁺¹ + ···,

where the dᵢ are the digits constituting the number (i = 1, 2, ...), d₁ ≠ 0, and 10ᵐ⁻ⁱ⁺¹ is the place value of the ith digit counted from the left.

Let dₙ be the nth digit of the approximate number x. This digit is called a valid significant digit (or simply a valid digit) if it satisfies the condition

∆x ≤ 0.5 × 10ᵐ⁻ⁿ⁺¹.    (1.3)
If inequality (1.3) is not satisfied, the digit dₙ is said to be doubtful. If dₙ is a valid digit, then all the digits preceding dₙ are also valid.

Theorem 1.1 If a number is correct up to n significant figures and the first significant digit is k, then the relative error is less than

1/(k × 10ⁿ⁻¹).

Proof. Let xA and xT be the approximate and exact values, and assume that xA is correct up to n significant figures and m decimal places. Three cases arise: (i) m < n, (ii) m = n and (iii) m > n.
From (1.2) it is known that the absolute error ∆x ≤ 0.5 × 10⁻ᵐ.

(i) When m < n. In this case, the number of digits in the integral part is n − m. Let k be the first significant digit in xT. Therefore, ∆x ≤ 0.5 × 10⁻ᵐ and |xT| ≥ k × 10ⁿ⁻ᵐ⁻¹ − 0.5 × 10⁻ᵐ. Thus, the relative error is

δx = ∆x/|xT| ≤ (0.5 × 10⁻ᵐ)/(k × 10ⁿ⁻ᵐ⁻¹ − 0.5 × 10⁻ᵐ) = 1/(2k × 10ⁿ⁻¹ − 1).

Since n is a positive integer and k is an integer between 1 and 9, 2k × 10ⁿ⁻¹ − 1 > k × 10ⁿ⁻¹ for all k and n except k = n = 1. Hence,

δx < 1/(k × 10ⁿ⁻¹).

(ii) When m = n. In this case, the first significant digit is the first digit after the decimal point, i.e. the number is a proper fraction. As in the previous case,

δx = (0.5 × 10⁻ᵐ)/(k × 10ⁿ⁻ᵐ⁻¹ − 0.5 × 10⁻ᵐ) = 1/(2k × 10ⁿ⁻¹ − 1) < 1/(k × 10ⁿ⁻¹).

(iii) When m > n. In this case, the first significant digit k is at the (n − m + 1)th = −(m − n − 1)th position and the integral part is zero. Then ∆x ≤ 0.5 × 10⁻ᵐ and |xT| ≥ k × 10⁻⁽ᵐ⁻ⁿ⁺¹⁾ − 0.5 × 10⁻ᵐ. Thus,

δx = (0.5 × 10⁻ᵐ)/(k × 10⁻⁽ᵐ⁻ⁿ⁺¹⁾ − 0.5 × 10⁻ᵐ) = 1/(2k × 10ⁿ⁻¹ − 1) < 1/(k × 10ⁿ⁻¹).

Hence the theorem.
Chapter 1 Numerical Errors
Module No. 2 Propagation of Errors and Computer Arithmetic
This module is a continuation of Module 1. In this module, the propagation of errors during arithmetic operations is discussed in detail. The representation of numbers in a computer and their arithmetic calculations are also explained.
2.1 Propagation of errors in arithmetic operations In numerical computation it is always assumed that there is an error in every number; it may be very small or large. The errors present in the numbers are propagated during the arithmetic process, but the rate of propagation depends on the type of arithmetic operation. This is discussed in the following subsections. 2.1.1
Errors in sum and difference
Let us consider the exact numbers X1, X2, ..., Xn and let their corresponding approximate numbers be x1, x2, ..., xn respectively. Assume that ∆x1, ∆x2, ..., ∆xn are the absolute errors in x1, x2, ..., xn, so that Xi = xi ± ∆xi, i = 1, 2, ..., n. Let X = X1 + X2 + ··· + Xn and x = x1 + x2 + ··· + xn. The total absolute error is

|X − x| = |(X1 − x1) + (X2 − x2) + ··· + (Xn − xn)| ≤ |X1 − x1| + |X2 − x2| + ··· + |Xn − xn|.

This shows that the total absolute error in the sum is

∆x = ∆x1 + ∆x2 + ··· + ∆xn.
(2.1)
Thus, the absolute error in the sum of approximate numbers is equal to the sum of the absolute errors of all the numbers. The following points should be kept in mind during the addition of numbers:
(i) identify the number (or numbers) of least accuracy,
(ii) round off the other numbers, retaining one digit more than in the identified number,
(iii) perform the addition with all retained digits,
(iv) round off the result by discarding the last digit.
Subtraction
The case of subtraction is similar to addition. Let x1 and x2 be approximate values of the exact numbers X1 and X2 respectively, and let X = X1 − X2, x = x1 − x2. Then one can write X1 = x1 ± ∆x1 and X2 = x2 ± ∆x2. Now,

|X − x| = |(X1 − x1) − (X2 − x2)| ≤ |X1 − x1| + |X2 − x2|.

Hence,

∆x = ∆x1 + ∆x2.
(2.2)
It may be noted that the absolute error in the difference of two numbers is equal to the sum of the individual absolute errors. 2.1.2
The error in product
Let us consider two exact numbers X1 and X2 with approximate values x1 and x2. Let X1 = x1 ± ∆x1 and X2 = x2 ± ∆x2, where ∆x1 and ∆x2 are the absolute errors in x1 and x2. Now, X1X2 = x1x2 ± x1∆x2 ± x2∆x1 ± ∆x1·∆x2. Therefore, |X1X2 − x1x2| ≤ |x1∆x2| + |x2∆x1| + |∆x1·∆x2|. The terms |∆x1| and |∆x2| represent errors and are small, so their product is smaller still; we discard it and divide both sides by |x| = |x1x2| to get the relative error. Hence, the relative error is

|(X1X2 − x1x2)/(x1x2)| = |∆x2/x2| + |∆x1/x1|.    (2.3)
From this expression we conclude that the relative error in the product of two numbers is equal to the sum of the individual relative errors. This result can be extended to n numbers as follows. Let X = X1X2···Xn and x = x1x2···xn. Then

|(X − x)/x| = |∆x1/x1| + |∆x2/x2| + ··· + |∆xn/xn|.    (2.4)
That is, the total relative error in the product of n numbers is equal to the sum of the individual relative errors.

In particular, let all the approximate values x1, x2, ..., xn be positive and x = x1x2···xn. Then log x = log x1 + log x2 + ··· + log xn. In this case,

∆x/x = ∆x1/x1 + ∆x2/x2 + ··· + ∆xn/xn.

Hence, |∆x/x| = |∆x1/x1| + |∆x2/x2| + ··· + |∆xn/xn|.

Let us consider another particular case. Suppose x = kx1, where k is a non-zero real number. Now,

δx = |∆x/x| = |k∆x1/(kx1)| = |∆x1/x1| = δx1.

Also, |∆x| = |k∆x1| = |k||∆x1|. Observe that the relative errors in x and x1 are the same, while the absolute error in x is |k| times the absolute error in x1.

2.1.3
The error in quotient
Let X1 and X2 be two exact numbers and let their approximate values be x1 and x2. Again, let X = X1/X2 and x = x1/x2. If ∆x1 and ∆x2 are the absolute errors, then X1 = x1 + ∆x1, X2 = x2 + ∆x2. Suppose both x1 and x2 are non-zero. Now,

X − x = (x1 + ∆x1)/(x2 + ∆x2) − x1/x2 = (x2∆x1 − x1∆x2)/(x2(x2 + ∆x2)).

Dividing both sides by x and taking absolute values,

|(X − x)/x| = |(x2∆x1 − x1∆x2)/(x1(x2 + ∆x2))| = |x2/(x2 + ∆x2)| · |∆x1/x1 − ∆x2/x2|.

Since the error ∆x2 is small compared to x2, we have x2/(x2 + ∆x2) ≈ 1. Thus,

δx = |∆x/x| = |(X − x)/x| = |∆x1/x1 − ∆x2/x2| ≤ |∆x1/x1| + |∆x2/x2|,    (2.5)

i.e. δx = δx1 + δx2. This expression shows that the total relative error in a quotient is equal to the sum of the individual relative errors.
The relative error δx of (2.5) can also be expressed as

|∆x/x| = |∆x1/x1 − ∆x2/x2| ≥ | |∆x1/x1| − |∆x2/x2| |.    (2.6)

It may be observed that the relative error in a quotient is greater than or equal to the difference of the individual relative errors. In the case of positive numbers one can determine the error via the logarithm function. Let x1 and x2 be the approximate numbers and x = x1/x2. Now, log x = log x1 − log x2. Thus,

∆x/x = ∆x1/x1 − ∆x2/x2, i.e. |∆x/x| ≤ |∆x1/x1| + |∆x2/x2|.

Example 2.1 Find the sum of the approximate numbers 120.237, 0.8761, 78.23, 0.001234, 234.3, 128.34, 35.4, 0.0672, 0.723, 0.08734, in each of which all the written digits are valid. Find the absolute error in the sum.

Solution. The least exact numbers are 234.3 and 35.4; the maximum error of each of them is 0.05. We round off all the other numbers to two decimal places (one digit more than in the least exact numbers). Their sum is

120.24 + 0.88 + 78.23 + 0.00 + 234.3 + 128.34 + 35.4 + 0.07 + 0.72 + 0.09 = 598.27.

Rounding off the sum to one decimal place gives 598.3. There are two kinds of error in the sum. The first is the initial error: the sum of the errors of the least exact numbers and the rounding errors of the other numbers, which is 0.05 × 2 + 0.0005 × 8 = 0.104 ≈ 0.10. The second is the error in rounding off the sum, which is 598.3 − 598.27 = 0.03. Thus, the total absolute error in the sum is 0.10 + 0.03 = 0.13, and the sum can be expressed as 598.3 ± 0.13.

Example 2.2 Let x1 = 43.5 and x2 = 76.9 be two approximate numbers with absolute errors 0.02 and 0.008 respectively. Find the difference between these numbers and evaluate the absolute and relative errors.

Solution. Here x = x1 − x2 = −33.4, and the total absolute error is ∆x = 0.02 + 0.008 = 0.028. Hence, the difference is −33.4 and the absolute error is 0.028. The relative error is 0.028/|−33.4| ≈ 0.00084 = 0.084%.
Example 2.3 Let x1 = 12.4 and x2 = 45.356 be two approximate numbers in which all digits are valid. Find the product, and the relative and absolute errors.

Solution. The numbers of valid decimal places in the first and second numbers are one and three respectively, so we round off the second number to one decimal place. After rounding, the numbers become x1 = 12.4 and x2 = 45.4. Now, the product is x = x1x2 = 12.4 × 45.4 = 562.96 ≈ 56.0 × 10, the result being rounded since the least number of valid significant digits among the given numbers is three. The relative error in the product is

δx = |∆x/x| = |∆x1/x1| + |∆x2/x2| = 0.05/12.4 + 0.0005/45.356 = 0.004043 ≈ 0.40%.

The absolute error is (56.0 × 10) × 0.004043 = 2.26408 ≈ 2.3.

Example 2.4 Let x1 = 7.235 and x2 = 8.72 be two approximate numbers, where all the digits of the numbers are valid. Find the quotient, and the relative and absolute errors.

Solution. Here x1 = 7.235 and x2 = 8.72 have four and three valid significant digits respectively. Now,

x1/x2 = 7.235/8.72 = 0.830.

We keep three significant digits, since the less exact number contains three valid significant digits. The absolute errors in x1 and x2 are ∆x1 = 0.0005 and ∆x2 = 0.005 respectively. The relative error in the quotient is

|∆x1/x1| + |∆x2/x2| = 0.0005/7.235 + 0.005/8.72 = 0.000069 + 0.000573 ≈ 0.001 = 0.1%.

The absolute error is (x1/x2) × 0.001 = 0.830 × 0.001 = 0.00083 ≈ 0.001.
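A sketch of the propagation rules of this section: under multiplication (and, by (2.5), also under division) the relative errors add. The helper name is illustrative:

```python
def rel_err_product(pairs):
    """pairs = [(value, abs_error), ...]; relative error of their product."""
    return sum(dx / abs(x) for x, dx in pairs)

# Example 2.3: x1 = 12.4 (max error 0.05), x2 = 45.356 (max error 0.0005)
rel = rel_err_product([(12.4, 0.05), (45.356, 0.0005)])
print(rel)              # ~0.004043, i.e. about 0.40%
print(560 * rel)        # ~2.26: the absolute error in the product above
```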
2.1.4
The errors in power and in root
Let x1 be an approximate value of an exact number X1 and let its relative error be δx1. We determine the relative error of x = x1ᵏ, where k is a real number. Since x = x1ᵏ = x1 · x1 ··· (k times), formula (2.4) gives the relative error as

δx = δx1 + δx1 + ··· + δx1 (k times) = k δx1.
(2.7)
Thus, the relative error of the approximate number x is k times the relative error of x1.

Let us consider the kth root of a positive approximate value x1, i.e. the number x = x1^(1/k). Since x1 > 0,

log x = (1/k) log x1.

Therefore,

∆x/x = (1/k)(∆x1/x1), or |∆x/x| = (1/k)|∆x1/x1|.

Thus, the relative error in the kth root of x1 is

δx = (1/k) δx1.
Example 2.5 Let a = 5.27, b = 28.61, c = 15.8 be the approximate values of some numbers, and let the absolute errors in a, b, c be 0.01, 0.04 and 0.02 respectively. Calculate the value of E = a²·b^(1/3)/c³ and the error in the result.

Solution. It is given that the absolute errors are ∆a = 0.01, ∆b = 0.04 and ∆c = 0.02. One more significant figure is retained in the intermediate calculations. The approximate values of the terms a², b^(1/3), c³ are 27.77, 3.0585, 3944.0 respectively. The approximate value of the expression is

E = (27.77 × 3.0585)/3944.0 = 0.0215.
Three significant digits are taken in the result, since the least number of significant digits in the given numbers is three. The relative error is given by

δE = 2δa + (1/3)δb + 3δc = 2 × (0.01/5.27) + (1/3) × (0.04/28.61) + 3 × (0.02/15.8)
   ≈ 0.0038 + 0.00047 + 0.0038 ≈ 0.008 = 0.8%.

The absolute error ∆E in E is 0.0215 × 0.008 = 0.0002. Hence, E = 0.0215 ± 0.0002 and the relative error is 0.008.

In the above example, E is an expression in three variables a, b, c, and the error present in E was estimated. The general rule for the error in a function of several variables is derived below.

Error in a function of several variables
Let y = f(x1, x2, ..., xn) be a differentiable function of n variables x1, x2, ..., xn, and let ∆xi be the error in xi, for i = 1, 2, ..., n. The absolute error ∆y in y is given by

y + ∆y = f(x1 + ∆x1, x2 + ∆x2, ..., xn + ∆xn)
       = f(x1, x2, ..., xn) + Σ_{i=1}^{n} (∂f/∂xi)∆xi + ···   (by Taylor's series expansion)
       = y + Σ_{i=1}^{n} (∂f/∂xi)∆xi   (neglecting second and higher powers of ∆xi),

i.e. ∆y = Σ_{i=1}^{n} (∂f/∂xi)∆xi.

This is the formula for the total absolute error in computing a function of several variables. The relative error is then

∆y/y = Σ_{i=1}^{n} (∂f/∂xi)(∆xi/y).
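The total-error formula can be evaluated mechanically. The sketch below assumes the sympy library is available (not part of the original text) and reproduces the estimates of Example 2.5 from the partial derivatives:

```python
import sympy as sp

a, b, c = sp.symbols('a b c', positive=True)
E = a**2 * sp.cbrt(b) / c**3
vals = {a: 5.27, b: 28.61, c: 15.8}
errs = {a: 0.01, b: 0.04, c: 0.02}

# total absolute error: sum of |dE/dxi| * (error in xi), the formula derived above
dE = sum(abs(sp.diff(E, v).subs(vals)) * errs[v] for v in (a, b, c))
print(float(E.subs(vals)))   # ~0.0215
print(float(dE))             # ~0.0002
```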
2.2 Significant error It may be remembered that some significant digits are lost during arithmetic calculation, due to the finite representation of numbers in computing instruments. This error is called significant error. In the following two cases there is a high chance of losing several significant digits, and care should be taken in these situations:
(i) when two nearly equal numbers are subtracted, and
(ii) when a division is made by a very small divisor compared to the dividend.
It should be remembered that significant error is more serious than round-off error. These cases are illustrated in the following examples.

Example 2.6 Find the difference √10.23 − √10.21 and calculate the relative error in the result.

Solution. Let X1 = √10.23 and X2 = √10.21, and let their approximate values be x1 = 3.198 and x2 = 3.195. Let X = X1 − X2. Then the absolute errors are ∆x1 = 0.0005 and ∆x2 = 0.0005, and the approximate difference is x = 3.198 − 3.195 = 0.003. Thus, the total absolute error in the subtraction is ∆x = 0.0005 + 0.0005 = 0.001, and the relative error is δx = 0.001/0.003 = 0.3333.

But by changing the calculation scheme one can obtain a more accurate result. For example,

X = √10.23 − √10.21 = (10.23 − 10.21)/(√10.23 + √10.21) = 0.02/(3.198 + 3.195) ≈ 0.003128 = x (say).

The relative error is

δx = (∆x1 + ∆x2)/(x1 + x2) = 0.001/(3.198 + 3.195) ≈ 0.0002 = 0.02%.

Observe that the relative error is much less than in the previous case.
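The gain from the rationalization in Example 2.6 is easy to demonstrate: keeping the square roots to four significant figures, the direct difference retains one significant digit while the rationalized form retains four (a sketch):

```python
import math

x1 = round(math.sqrt(10.23), 3)     # 3.198, the four-figure value used above
x2 = round(math.sqrt(10.21), 3)     # 3.195

direct = x1 - x2                             # 0.003: most figures cancelled
rationalized = (10.23 - 10.21) / (x1 + x2)   # ~0.003128
exact = math.sqrt(10.23) - math.sqrt(10.21)  # ~0.0031283
print(direct, rationalized, exact)
```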
Example 2.7 Find the roots of the equation x² − 1500x + 0.5 = 0.

Solution. To illustrate the difficulty, let us assume that the computing machine uses four significant digits for all arithmetic calculations. The roots of the equation are

(1500 ± √(1500² − 2))/2.

Now, 1500² − 2 = 0.2250 × 10⁷ − 0.0000 × 10⁷ = 0.2250 × 10⁷. Thus √(1500² − 2) = 0.1500 × 10⁴. Hence, the roots are

(0.1500 × 10⁴ ± 0.1500 × 10⁴)/2 = 0.1500 × 10⁴, 0.0000 × 10⁴.

That is, the smaller root is zero (to the four digits retained); this occurs due to the finite representation of the numbers. But note that 0 is not a root of the given equation. To get a more accurate result, we transform the calculation. The smaller root of the equation is now calculated as follows:

(1500 − √(1500² − 2))/2 = [(1500 − √(1500² − 2))(1500 + √(1500² − 2))]/[2(1500 + √(1500² − 2))]
                        = 2/[2(1500 + √(1500² − 2))] = 0.0003333.

Hence, the smaller root of the equation is 0.0003333, which is much closer to the exact root; the other root is 0.1500 × 10⁴. This situation arises when |4ac| ≪ b². So care should be taken when two nearly equal numbers are subtracted; it is handled by keeping a sufficient number of reserve valid digits.
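The transformation used for the smaller root is the standard cancellation-free quadratic formula: compute the larger-magnitude root first, then get the other from the product of the roots c/a. A sketch (real roots assumed):

```python
import math

def quadratic_roots(a, b, c):
    """Roots of ax^2 + bx + c = 0, avoiding subtraction of nearly equal numbers."""
    d = math.sqrt(b * b - 4 * a * c)
    q = -(b + math.copysign(d, b)) / 2   # sign chosen so b and d do not cancel
    return q / a, c / q                  # second root from the product c/a

print(quadratic_roots(1, -1500, 0.5))    # (1499.99966..., 0.00033333...)
```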
2.3 Representation of numbers in computer It was mentioned earlier that numerical methods are used to solve problems using a computer. But a computer has a limited amount of space to store a number, whether it is an integer or a real (floating point) number. Generally, two bytes of memory are used to store an integer and four bytes to store a floating point number. Due to this limitation of space, the rules of arithmetic used in mathematics do not always hold in computer arithmetic.

The representation of a floating point number in a computer is different from our conventional technique. In computer representation, the technique is designed to preserve the maximum number of significant digits and to increase the range of values of the real numbers; this representation is known as the normalized floating point mode. In this representation, the whole number is converted to a proper fraction in such a way that the first digit after the decimal point is non-zero, adjusted by multiplying by a suitable power of 10. For example, the number 3876.23 is represented in normalized form as .387623 × 10⁴, and in computer representation it is written as .387623E4 (E4 denotes 10⁴). It is observed that in normalized floating point representation a number has two parts: the mantissa and the exponent. In this example, .387623 is the mantissa and 4 is the exponent. In this representation the mantissa is always greater than or equal to .1 (in magnitude) and the exponent is an integer.

To explain computer arithmetic, it is assumed in this section that the computer uses only four digits to store the mantissa and two digits for the exponent; the mantissa and the exponent have their own signs. Under this assumption, the range of magnitudes of floating point numbers is .1000 × 10⁻⁹⁹ to .9999 × 10⁹⁹.
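The hypothetical four-digit machine can be mimicked with a small helper that chops a number to a four-digit mantissa and a two-digit exponent (chop4 is an illustrative sketch, not a library routine):

```python
import math

def chop4(x):
    """Chop x to the normalized form .dddd x 10^e (4-digit mantissa, truncation)."""
    if x == 0.0:
        return 0.0
    e = math.floor(math.log10(abs(x))) + 1    # exponent placing mantissa in [.1, 1)
    if not -99 <= e <= 99:
        raise OverflowError("exponent needs more than two digits")
    return math.trunc(x / 10**e * 10**4) / 10**4 * 10**e

print(chop4(3876.23))                  # 3876.0, i.e. .3876E4
print(chop4(0.7487e10 + 0.6712e10))    # 1.419e10, i.e. .1419E11: the 5th digit is lost
```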
2.4 Arithmetic of normalized floating point numbers In this section, the four basic arithmetic operations on normalized floating point numbers are discussed. 2.4.1
Addition
The addition of two normalized floating point numbers is done using the following rules:
(i) If the two numbers have the same exponent, then the mantissas are added directly and the exponent of the sum is the common exponent.
(ii) If the exponents are different, then the number with the lower exponent is shifted to the higher exponent by adjusting its mantissa, and rule (i) is then used to add them.

All the possible cases are discussed in the following examples.

Example 2.8 Add the following normalized floating point numbers.
(i) .2678E15 and .4876E15 (same exponent)
(ii) .7487E10 and .6712E10 (same exponent)
(iii) .3451E3 and .3218E8 (different exponents)
(iv) .3876E25 and .8541E27 (different exponents)
(v) .8231E99 and .6541E99 (overflow condition)

Solution. (i) Here the exponents are the same, so the numbers are added by adding the mantissas. The sum is .7554E15.
(ii) Here also the exponents are equal and, as in the previous case, the sum is 1.4199E10. Notice that the sum contains five significant figures, but our assumed computer can store only four. So, the mantissa is shifted right one place before being stored: the exponent is increased by 1 and the last digit is truncated. Hence the sum is .1419E11.
(iii) Here the exponents are different and the difference is 8 − 3 = 5. The mantissa of the smaller number (lower exponent) is shifted right 5 places, and the number becomes .0000E8. Now the numbers have the same exponent, and the final result is .0000E8 + .3218E8 = .3218E8.
(iv) Here the exponents are also different and the difference is 27 − 25 = 2. The mantissa of the smaller number (the first one) is shifted right by 2 places and it becomes .0038E27. Now the sum is .0038E27 + .8541E27 = .8579E27.
(v) This case is different. The exponents are the same and the sum is 1.4772E99. The mantissa has five significant digits, so it is shifted right and the exponent is increased by 1, making the exponent 100. Since by our assumption the maximum value of the exponent is 99, the number exceeds the capacity of the floating point numbers of the assumed computer. It cannot be stored, and this situation is called an overflow condition; the computer will generate an error message.
2.4.2
Subtraction
Subtraction is a special type of addition: one positive number is added to a negative number. The different cases of subtraction are illustrated in the following examples.

Example 2.9 Subtract the normalized floating point numbers indicated below:
(i) .2832E10 from .8432E10
(ii) .2693E15 from .2697E15
(iii) .2786E–17 from .2134E–16
(iv) .7224E–99 from .7273E–99.

Solution. (i) Here the exponents are equal, and hence the mantissas are subtracted directly: .8432E10 – .2832E10 = .5600E10.
(ii) Here also the exponents are equal, so the result is .2697E15 – .2693E15 = .0004E15. This mantissa is not in normalized form. Since the computer always stores normalized numbers, it must be converted: the normalized number corresponding to .0004E15 is .4000E12. This is the final answer.
(iii) Here the exponents are different. The number with the smaller exponent is shifted right, the exponent being increased by 1 for every right shift. The second number becomes .0278E–16, and the result is .2134E–16 – .0278E–16 = .1856E–16.
(iv) The result is .7273E–99 – .7224E–99 = .0049E–99 = .4900E–101 (in normalized form). The number of digits in the exponent is now 3, but our hypothetical computer can store only two. The result is smaller than the smallest number that can be stored in our computer. This situation is called the underflow condition, and the computer will give an error message. 2.4.3
Multiplication
The multiplication of normalized floating point numbers is similar to the multiplication of ordinary numbers.
Two normalized floating point numbers are multiplied by multiplying the mantissas and adding the exponents. After multiplication, the mantissa is converted into normalized floating point form and the exponent is adjusted accordingly. Multiplication is illustrated in the following examples.

Example 2.10 Multiply the following floating point numbers:
(i) .2198E6 by .5671E12
(ii) .2318E17 by .8672E–17
(iii) .2341E52 by .9231E51
(iv) .2341E–53 by .7652E–51.

Solution. (i) Here .2198E6 × .5671E12 = .12464858E18. The mantissa has 8 significant figures, but on our assumed computer the result is .1246E18 (the last four significant figures are truncated).
(ii) Here .2318E17 × .8672E–17 = .20101696E0 = .2010E0.
(iii) .2341E52 × .9231E51 = .21609771E103. The exponent has three digits, which is not allowed on our assumed computer. The overflow condition occurs, and an error message will be generated.
(iv) .2341E–53 × .7652E–51 = .17913332E–104 = .1791E–104, and again an error message will be generated (underflow). 2.4.4
Division
The division of normalized floating point numbers is likewise similar to the division of ordinary numbers. The only difference is that the mantissa retains only four significant digits (as per our assumed computer) instead of all digits. The quotient mantissa must be written in normalized form and the exponent adjusted accordingly.

Example 2.11 Perform the following divisions:
(i) .8765E43 ÷ .3131E21
(ii) .9999E5 ÷ .1452E–99
(iii) .3781E–18 ÷ .2871E94.

Solution. (i) .8765E43 ÷ .3131E21 = 2.7994251038E22 = .2799E23.
(ii) Here the number is divided by a small number: .9999E5 ÷ .1452E–99 = 6.8863636364E104 = .6886E105.
The overflow situation occurs.
(iii) Here the number is divided by a large number: .3781E–18 ÷ .2871E94 = 1.3169627307E–112 = .1316E–111. As per our computer, the underflow condition occurs.

2.5 Effect of normalized floating point arithmetic
Sometimes floating point arithmetic gives unpredictable results due to the truncation of the mantissa. To illustrate this situation, let us consider the following example. It is well known that (1/6) × 12 = 2. But in floating point arithmetic 1/6 = .1667, and hence (1/6) × 12 = .1667 × 12 = .2000E1. Also, one can determine the value of (1/6) × 12 by repeated addition. Note that .1667 + .1667 + .1667 + .1667 + .1667 + .1667 (6 times) = 1.0002 = .1000E1, but .1667 added 12 times gives .1996E1. Thus, in floating point arithmetic multiplication is not always the same as repeated addition, i.e. 12x = x + x + ··· + x (12 times) does not always hold.
Also, in floating point arithmetic the associative and distributive laws do not always hold, due to the truncation of the mantissa. That is,
(i) (a + b) + c ≠ a + (b + c)
(ii) (a + b) − c ≠ (a − c) + b
(iii) a(b − c) ≠ ab − ac.
These results are illustrated in the following examples:
(i) Suppose a = .6889E2, b = .7799E2 and c = .1008E2. Now, a + b = .1468E3 and (a + b) + c = .1468E3 + .1008E2 = .1468E3 + .0100E3 = .1568E3. Again, b + c = .8807E2 and a + (b + c) = .6889E2 + .8807E2 = .1569E3. Hence, for this example, (a + b) + c ≠ a + (b + c).
(ii) Let a = .7433E1, b = .6327E–1, c = .6672E1. Then a + b = .7496E1 and (a + b) − c = .7496E1 – .6672E1 = .8240E0. Again, a − c = .7610E0 and (a − c) + b = .7610E0 + .0632E0 = .8242E0. Thus, (a + b) − c ≠ (a − c) + b.
(iii) Let a = .6683E1, b = .4684E1, c = .4672E1. Then b − c = .1200E–1 and a(b − c) = .6683E1 × .1200E–1 = .0801E0 = .8010E–1. But ab = .3130E2, ac = .3122E2 and ab − ac = .8000E–1. Thus, a(b − c) ≠ ab − ac.
From these examples one might think that numerical computation is very dangerous. It is not so dangerous, as an actual computer generally stores seven digits for the mantissa (in single precision); the larger mantissa length gives more accurate results.
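The same failures occur in genuine IEEE double-precision arithmetic, only in the sixteenth digit rather than the fourth; for instance, addition is not associative:

```python
a, b, c = 0.1, 0.2, 0.3
print((a + b) + c == a + (b + c))    # False
print((a + b) + c, a + (b + c))      # 0.6000000000000001 versus 0.6
```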
2.5.1 Zeros in floating point numbers
There is a definite meaning of zero in mathematics, but in computer arithmetic exact equality of a number to zero can never be guaranteed, because most numbers in floating point representation are approximate. The behaviour of zero is illustrated in the following example. The exact roots of the equation x² + 2x − 5 = 0 are x = −1 ± √6. In floating point representation (4-digit mantissa) these are .1449E1 and –.3449E1.

When x = .1449E1, the left hand side of the equation is

.1449E1 × .1449E1 + .2000E1 × .1449E1 – .5000E1 = .0209E2 + .2898E1 – .5000E1 = .0209E2 + .0289E2 – .0500E2 = –.0002E2.

When x = –.3449E1, the left hand side of the equation is

(–.3449E1) × (–.3449E1) + .2000E1 × (–.3449E1) – .5000E1 = .1189E2 – .6898E1 – .5000E1 = .1189E2 – .0689E2 – .0500E2 = .0000E2,

which is equal to 0. It is interesting that one root satisfies the equation perfectly while the other does not, though both are roots of the equation. Since .1449E1 is a root, one can say that the residual –0.02 acts as a zero here. Thus, we can conclude the following:

Note 2.1 There is no fixed value of zero in computer arithmetic as there is in mathematical calculation. Thus, it is not advisable to give any instruction based on testing whether a floating point number is exactly zero. Instead, it is suggested that a number be treated as zero if its magnitude is less than a given (very) small number.
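In code, Note 2.1 becomes the usual rule: never compare a computed floating point value with zero by exact equality; compare its magnitude against a small tolerance. A sketch:

```python
def is_zero(x, tol=1e-9):
    """Treat x as zero when its magnitude is below a chosen tolerance."""
    return abs(x) < tol

r = 1.449                           # 4-digit root of x^2 + 2x - 5 = 0
residual = r * r + 2 * r - 5        # about -0.0024, not exactly zero
print(residual, is_zero(residual, tol=1e-2))   # treated as zero at this tolerance
```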
Chapter 1 Numerical Errors
Module No. 3 Operators in Numerical Analysis
Many operators are used in numerical analysis/computation. Some of the frequently used operators, viz. forward difference (∆), backward difference (∇), central difference (δ), shift (E) and mean (µ), are discussed in this module.

Let the function y = f(x) be defined on the closed interval [a, b] and let x0, x1, ..., xn be n + 1 values of x. Assume that these values are equidistant, i.e. xi = x0 + ih, i = 0, 1, 2, ..., n, where h is a suitable real number called the difference of the interval or spacing. When x = xi, the value of y is denoted by yi and is defined by yi = f(xi). The values of x and y are called arguments and entries respectively.
3.1 Finite difference operators Different types of finite difference operators are defined; among them the forward difference, backward difference and central difference operators are widely used. In this section, these operators are discussed. 3.1.1
Forward difference operator
The forward difference is denoted by ∆ and is defined by ∆f (x) = f (x + h) − f (x).
(3.1)
When x = xi, the above equation gives ∆f(xi) = f(xi + h) − f(xi), i.e. ∆yi = yi+1 − yi, i = 0, 1, 2, ..., n − 1.
(3.2)
In particular, ∆y0 = y1 − y0, ∆y1 = y2 − y1, ..., ∆yn−1 = yn − yn−1. These are called first order differences. The differences of the first order differences are called second order differences, denoted by ∆²y0, ∆²y1, .... Two second order differences are

∆²y0 = ∆y1 − ∆y0 = (y2 − y1) − (y1 − y0) = y2 − 2y1 + y0,
∆²y1 = ∆y2 − ∆y1 = (y3 − y2) − (y2 − y1) = y3 − 2y2 + y1.

The third order differences are defined in a similar manner:

∆³y0 = ∆²y1 − ∆²y0 = (y3 − 2y2 + y1) − (y2 − 2y1 + y0) = y3 − 3y2 + 3y1 − y0,
∆³y1 = y4 − 3y3 + 3y2 − y1.
Similarly, higher order differences can be defined. In general, ∆ⁿ⁺¹f(x) = ∆[∆ⁿf(x)], i.e. ∆ⁿ⁺¹yi = ∆[∆ⁿyi], n = 0, 1, 2, ....
(3.3)
Again, ∆ⁿ⁺¹f(x) = ∆ⁿ[f(x + h) − f(x)] = ∆ⁿf(x + h) − ∆ⁿf(x), and ∆ⁿ⁺¹yi = ∆ⁿyi+1 − ∆ⁿyi, n = 0, 1, 2, ....
(3.4)
It must be remembered that ∆⁰ ≡ the identity operator, i.e. ∆⁰f(x) = f(x), and ∆¹ ≡ ∆. All the forward differences can be represented in tabular form, called the forward difference or diagonal difference table. Let x0, x1, ..., x4 be five arguments. All the forward differences of these arguments are shown in Table 3.1.

x     y      ∆       ∆²       ∆³       ∆⁴
x0    y0
             ∆y0
x1    y1             ∆²y0
             ∆y1              ∆³y0
x2    y2             ∆²y1              ∆⁴y0
             ∆y2              ∆³y1
x3    y3             ∆²y2
             ∆y3
x4    y4

Table 3.1: Forward difference table.

3.1.2
Error propagation in a difference table
If any entry of the difference table is erroneous, then this error spreads over the table in a convex manner. The propagation of error in a difference table is illustrated in Table 3.2. Let us assume that y3 is erroneous and that the amount of the error is ε. The following observations may be noted from Table 3.2.
x     y        ∆y        ∆²y         ∆³y          ∆⁴y          ∆⁵y
x0    y0
               ∆y0
x1    y1                 ∆²y0
               ∆y1                   ∆³y0 + ε
x2    y2                 ∆²y1 + ε                 ∆⁴y0 − 4ε
               ∆y2 + ε               ∆³y1 − 3ε                 ∆⁵y0 + 10ε
x3    y3 + ε             ∆²y2 − 2ε                ∆⁴y1 + 6ε
               ∆y3 − ε               ∆³y2 + 3ε                 ∆⁵y1 − 10ε
x4    y4                 ∆²y3 + ε                 ∆⁴y2 − 4ε
               ∆y4                   ∆³y3 − ε
x5    y5                 ∆²y4
               ∆y5
x6    y6

Table 3.2: Error propagation in a finite difference table.
(i) The error increases with the order of the differences.
(ii) The error is maximum (in magnitude) along the horizontal line through the erroneous tabulated value.
(iii) In the kth difference column, the coefficients of the errors are the binomial coefficients in the expansion of (1 − x)ᵏ. In particular, the errors in the second difference column are ε, −2ε, ε, and in the third difference column they are ε, −3ε, 3ε, −ε, and so on.
(iv) The algebraic sum of the errors in any complete column is zero.

If there is an error in a single entry of the table, then it can be detected and corrected from the difference table. The position of the erroneous entry is identified by the following steps:
(i) If at any stage the differences do not follow a smooth pattern, then there is an error.
(ii) If the differences of some order (it generally happens at higher orders) become alternating in sign, then the middle entry contains an error.

Properties
Some common properties of the forward difference operator are presented below (a numerical illustration follows this list):
(i) ∆c = 0, where c is a constant.
(ii) ∆[f1(x) + f2(x) + ··· + fn(x)] = ∆f1(x) + ∆f2(x) + ··· + ∆fn(x).
(iii) ∆[cf(x)] = c∆f(x).
Combining properties (ii) and (iii), one can generalize property (ii) as
(iv) ∆[c1f1(x) + c2f2(x) + ··· + cnfn(x)] = c1∆f1(x) + c2∆f2(x) + ··· + cn∆fn(x).
(v) ∆ᵐ∆ⁿf(x) = ∆ᵐ⁺ⁿf(x) = ∆ⁿ∆ᵐf(x) = ∆ᵏ∆ᵐ⁺ⁿ⁻ᵏf(x), k = 0, 1, 2, ..., m or n.
(vi) ∆[cˣ] = cˣ⁺ʰ − cˣ = cˣ(cʰ − 1), for some constant c.
(vii) ∆[ˣCr] = ˣCr−1, where r is fixed and h = 1, since ∆[ˣCr] = ˣ⁺¹Cr − ˣCr = ˣCr−1.
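A difference table is easy to generate, and the error-detection rules above can be watched in action by perturbing one entry of otherwise smooth data (a sketch; the function name is illustrative):

```python
def forward_differences(y):
    """Return the columns [y, dy, d2y, ...] of the forward difference table."""
    cols = [list(y)]
    while len(cols[-1]) > 1:
        prev = cols[-1]
        cols.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return cols

y = [x * x for x in range(7)]   # smooth data: second differences are constant
y[3] += 1                       # inject an error eps = 1 into y3
for col in forward_differences(y)[2:5]:
    print(col)                  # binomial pattern 1, -3, 3, -1 centred on the error
```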
Example 3.1
∆[f(x)g(x)] = f(x + h)g(x + h) − f(x)g(x)
            = f(x + h)g(x + h) − f(x + h)g(x) + f(x + h)g(x) − f(x)g(x)
            = f(x + h)[g(x + h) − g(x)] + g(x)[f(x + h) − f(x)]
            = f(x + h)∆g(x) + g(x)∆f(x).
Also, it can be shown that ∆[f(x)g(x)] = f(x)∆g(x) + g(x + h)∆f(x) = f(x)∆g(x) + g(x)∆f(x) + ∆f(x)∆g(x).

Example 3.2 ∆[f(x)/g(x)] = [g(x)∆f(x) − f(x)∆g(x)]/[g(x + h)g(x)], g(x) ≠ 0.

∆[f(x)/g(x)] = f(x + h)/g(x + h) − f(x)/g(x)
             = [f(x + h)g(x) − g(x + h)f(x)]/[g(x + h)g(x)]
             = {g(x)[f(x + h) − f(x)] − f(x)[g(x + h) − g(x)]}/[g(x + h)g(x)]
             = [g(x)∆f(x) − f(x)∆g(x)]/[g(x + h)g(x)].

In particular, when the numerator is 1,

∆[1/f(x)] = −∆f(x)/[f(x + h)f(x)].
3.1.3
Backward difference operator
The symbol ∇ is used to represent backward difference operator. The backward difference operator is defined as ∇f (x) = f (x) − f (x − h).
(3.5)
When x = xi , the above relation reduces to ∇yi = yi − yi−1 ,
i = n, n − 1, . . . , 1.
(3.6)
In particular, ∇y1 = y1 − y0 , ∇y2 = y2 − y1 , . . . , ∇yn = yn − yn−1 .
(3.7)
These are called the first order backward differences. The second order differences are denoted by ∇²y2, ∇²y3, ..., ∇²yn. The first two second order backward differences are

∇²y2 = ∇(∇y2) = ∇(y2 − y1) = ∇y2 − ∇y1 = (y2 − y1) − (y1 − y0) = y2 − 2y1 + y0,

and ∇²y3 = y3 − 2y2 + y1, ∇²y4 = y4 − 2y3 + y2. The other second order differences can be obtained in a similar manner.
In general, ∇ᵏyi = ∇ᵏ⁻¹yi − ∇ᵏ⁻¹yi−1,
i = n, n − 1, . . . , k,
(3.8)
where ∇⁰yi = yi and ∇¹yi = ∇yi. Like forward differences, these backward differences can be written in tabular form, called the backward difference or horizontal difference table. All the backward differences for the arguments x0, x1, ..., x4 are shown in Table 3.3.

x     y     ∇       ∇²       ∇³       ∇⁴
x0    y0
x1    y1    ∇y1
x2    y2    ∇y2     ∇²y2
x3    y3    ∇y3     ∇²y3     ∇³y3
x4    y4    ∇y4     ∇²y4     ∇³y4     ∇⁴y4
Table 3.3: Backward difference table.

It is observed that, for a given table of values, the forward and backward difference tables contain the same entries. Practically there is no difference among the values in the tables, but theoretically they have separate significance.

3.1.4
Central difference operator
There is another kind of finite difference operator known as central difference operator. This operator is denoted by δ and is defined by δf (x) = f (x + h/2) − f (x − h/2).
(3.9)
When x = xi, the first order central difference, in terms of ordinates, is

δyi = yi+1/2 − yi−1/2,

where yi+1/2 = f(xi + h/2) and yi−1/2 = f(xi − h/2). In particular, δy1/2 = y1 − y0, δy3/2 = y2 − y1, ..., δyn−1/2 = yn − yn−1. The second order central differences are

δ²yi = δyi+1/2 − δyi−1/2 = (yi+1 − yi) − (yi − yi−1) = yi+1 − 2yi + yi−1.
(3.10)
In general, δⁿyi = δⁿ⁻¹yi+1/2 − δⁿ⁻¹yi−1/2.
(3.11)
All the central differences for the five arguments x0, x1, ..., x4 are shown in Table 3.4.

x     y     δ        δ²       δ³        δ⁴
x0    y0
            δy1/2
x1    y1             δ²y1
            δy3/2             δ³y3/2
x2    y2             δ²y2                δ⁴y2
            δy5/2             δ³y5/2
x3    y3             δ²y3
            δy7/2
x4    y4

Table 3.4: Central difference table.

It may be observed that all odd order differences have fractional suffixes and all even order differences have integral suffixes.

3.1.5
Shift, average and differential operators
Shift operator, E: The shift operator is denoted by E and is defined by Ef (x) = f (x + h).
(3.12)
In terms of y, the above formula becomes Eyi = yi+1 .
(3.13)
Note that the shift operator increases the subscript of y by one. When the shift operator is applied twice to the function f(x), the subscript of y is increased by 2.
That is,

E²f(x) = E[Ef(x)] = E[f(x + h)] = f(x + 2h).    (3.14)

In general,

Eⁿf(x) = f(x + nh) or Eⁿyi = yi+n.    (3.15)

The inverse shift operator can be found in a similar manner. It is denoted by E⁻¹ and is defined by

E⁻¹f(x) = f(x − h).    (3.16)
Similarly, second and higher order inverse operators are defined as follows: E −2 f (x) = f (x − 2h)
and
E −n f (x) = f (x − nh).
(3.17)
The general definition of shift operator is E r f (x) = f (x + rh),
(3.18)
where r is a positive or negative rational number.

Properties
A few common properties of the E operator are given below:
(i) Ec = c, where c is a constant.
(ii) E{cf(x)} = cEf(x).
(iii) E{c1f1(x) + c2f2(x) + ··· + cnfn(x)} = c1Ef1(x) + c2Ef2(x) + ··· + cnEfn(x).
(iv) EᵐEⁿf(x) = EⁿEᵐf(x) = Eᵐ⁺ⁿf(x).
(v) EⁿE⁻ⁿf(x) = f(x). In particular, EE⁻¹ ≡ I, where I is the identity operator, sometimes denoted by 1.
(vi) (Eⁿ)ᵐf(x) = Eᵐⁿf(x).
(vii) E[f(x)/g(x)] = Ef(x)/Eg(x).
(viii) E{f(x) g(x)} = Ef(x) Eg(x).
(ix) E∆f(x) = ∆Ef(x).
(x) ∆ᵐf(x) = ∇ᵐEᵐf(x) = Eᵐ∇ᵐf(x) and ∇ᵐf(x) = ∆ᵐE⁻ᵐf(x) = E⁻ᵐ∆ᵐf(x).

Average operator, µ: The average operator is denoted by µ and is defined by

µf(x) = (1/2)[f(x + h/2) + f(x − h/2)].    (3.19)

In terms of y, the above definition becomes

µyi = (1/2)[yi+1/2 + yi−1/2].    (3.20)

Here the average of the values of f(x) at the two points (x + h/2) and (x − h/2) is taken as the value of µf(x).

Differential operator, D: The differential operator is well known from differential calculus and is denoted by D; it gives the derivative. That is,

Df(x) = (d/dx)f(x) = f′(x),
D²f(x) = (d²/dx²)f(x) = f″(x),
···
Dⁿf(x) = (dⁿ/dxⁿ)f(x) = f⁽ⁿ⁾(x).    (3.21)
3.1.6
Factorial notation
The factorial notation is a very useful notation in the calculus of finite differences. Using this notation one can find differences of all orders by rules similar to those of differential calculus. It is also a very useful and simple notation for finding anti-differences. The nth factorial of x is denoted by x⁽ⁿ⁾ and is defined by

x⁽ⁿ⁾ = x(x − h)(x − 2h) ··· (x − (n − 1)h),
(3.22)
where each factor is decreased from the previous one by h, and x⁽⁰⁾ = 1. Similarly, the nth negative factorial of x is defined by

x⁽⁻ⁿ⁾ = 1/[x(x + h)(x + 2h) ··· (x + (n − 1)h)].
(3.23)
A very interesting and obvious relation is x⁽ⁿ⁾ · x⁽⁻ⁿ⁾ ≠ 1. The following results show the similarity between the factorial notation and the differential operator.

Property 3.1 ∆x⁽ⁿ⁾ = nhx⁽ⁿ⁻¹⁾.

Proof.
∆x⁽ⁿ⁾ = (x + h)x(x − h) ··· (x + h − (n − 1)h) − x(x − h)(x − 2h) ··· (x − (n − 1)h)
      = x(x − h)(x − 2h) ··· (x − (n − 2)h)[(x + h) − (x − (n − 1)h)]
      = nhx⁽ⁿ⁻¹⁾.

Note that this property is analogous to the differential formula D(xⁿ) = nxⁿ⁻¹ when h = 1. The above formula can also be used to find the anti-difference (like integration in integral calculus), as

∆⁻¹x⁽ⁿ⁻¹⁾ = x⁽ⁿ⁾/(nh).    (3.24)
3.2 Relations among operators Many useful and interesting results can be derived relating the operators discussed above. First of all, we determine the relation between the forward and backward difference operators:

∆yi = yi+1 − yi = ∇yi+1 = δyi+1/2,
∆²yi = yi+2 − 2yi+1 + yi = ∇²yi+2 = δ²yi+1, etc.

In general,

∆ⁿyi = ∇ⁿyi+n,
i = 0, 1, 2, . . . .
(3.25)
There is a good relation between E and ∆ operators. ∆f (x) = f (x + h) − f (x) = Ef (x) − f (x) = (E − 1)f (x). From this relation one can conclude that the operators ∆ and E − 1 are equivalent. That is, ∆≡E−1
or
E ≡ ∆ + 1.
(3.26)
The relation between ∇ and E operators is derived below: ∇f (x) = f (x) − f (x − h) = f (x) − E −1 f (x) = (1 − E −1 )f (x). That is, ∇ ≡ 1 − E −1 .
(3.27)
The expression for higher order forward differences in terms of function values can be derived in the following way: ∆³y0 = (E − 1)³y0 = (E³ − 3E² + 3E − 1)y0 = y3 − 3y2 + 3y1 − y0.

The relation between the operators δ and E is given below:

δf(x) = f(x + h/2) − f(x − h/2) = E^{1/2}f(x) − E^{−1/2}f(x) = (E^{1/2} − E^{−1/2})f(x).

That is,

δ ≡ E^{1/2} − E^{−1/2}.    (3.28)

The average operator µ is expressed in terms of E and δ as follows:

µf(x) = (1/2)[f(x + h/2) + f(x − h/2)] = (1/2)[E^{1/2}f(x) + E^{−1/2}f(x)] = (1/2)(E^{1/2} + E^{−1/2})f(x).

Thus,

µ ≡ (1/2)(E^{1/2} + E^{−1/2}).    (3.29)
Again,

µ²f(x) = (1/4)(E^{1/2} + E^{−1/2})²f(x) = (1/4)[(E^{1/2} − E^{−1/2})² + 4]f(x) = (1/4)(δ² + 4)f(x).

Hence,

µ ≡ √(1 + δ²/4).    (3.30)

Every operator defined earlier can be expressed in terms of other operator(s). A few more relations among the operators ∆, ∇, E and δ are deduced in the following.
∇Ef(x) = ∇f(x + h) = f(x + h) − f(x) = ∆f(x).
Also, δE^{1/2}f(x) = δf(x + h/2) = f(x + h) − f(x) = ∆f(x).
Thus,

∆ ≡ ∇E ≡ δE^{1/2}.    (3.31)

There is a very nice relation between the operators E and D, deduced below.

Ef(x) = f(x + h) = f(x) + hf′(x) + (h²/2!)f″(x) + (h³/3!)f‴(x) + ···   [by Taylor's series]
      = f(x) + hDf(x) + (h²/2!)D²f(x) + (h³/3!)D³f(x) + ···
      = [1 + hD + (h²/2!)D² + (h³/3!)D³ + ···]f(x)
      = e^{hD}f(x).

Hence,

E ≡ e^{hD}.    (3.32)
This result can also be written as

hD ≡ log E.    (3.33)
The relation between the operators D and δ is deduced below:

δf(x) = [E^{1/2} − E^{−1/2}]f(x) = [e^{hD/2} − e^{−hD/2}]f(x) = 2 sinh(hD/2) f(x).

Thus,

δ ≡ 2 sinh(hD/2).    (3.34)

Similarly, µ ≡ cosh(hD/2). Again,

µδ ≡ 2 cosh(hD/2) sinh(hD/2) = sinh(hD).    (3.35)
This relation gives the inverse result, hD ≡ sinh⁻¹(µδ).
(3.36)
From the relation (3.33), and using the relations E ≡ 1 + ∆ and E⁻¹ ≡ 1 − ∇, we obtain
hD ≡ log E ≡ log(1 + ∆) ≡ − log(1 − ∇) ≡ sinh⁻¹(µδ).
(3.37)
Some operators commute with other operators. For example, µ and E are commutative, since

µEf(x) = µf(x + h) = (1/2)[f(x + 3h/2) + f(x + h/2)],

and

Eµf(x) = E{(1/2)[f(x + h/2) + f(x − h/2)]} = (1/2)[f(x + 3h/2) + f(x + h/2)].

Hence,

µE ≡ Eµ.    (3.38)
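The symbolic relation E ≡ e^{hD} can be checked numerically: applying the exponential-of-derivative series to a function reproduces its shifted value. A sketch with f = sin, whose derivatives cycle through cos, −sin, −cos:

```python
import math

derivs = [math.sin, math.cos,
          lambda t: -math.sin(t), lambda t: -math.cos(t)]   # f, f', f'', f'''
x, h = 0.7, 0.2

series = sum(h**k / math.factorial(k) * derivs[k % 4](x) for k in range(12))
print(series, math.sin(x + h))   # e^{hD} f(x) agrees with f(x + h)
```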
Example 3.3 Prove the following relations.
(i) (1 + ∆)(1 − ∇) ≡ 1
(ii) µ ≡ cosh(hD/2)
(iii) µδ ≡ (∆ + ∇)/2
(iv) ∆∇ ≡ ∇∆ ≡ δ²
(v) µδ ≡ ∆E⁻¹/2 + ∆/2
(vi) E^{1/2} ≡ µ + δ/2
(vii) 1 + δ²µ² ≡ (1 + δ²/2)²
(viii) ∆ ≡ δ²/2 + δ√(1 + δ²/4).

Solution.
(i) (1 + ∆)(1 − ∇)f(x) = (1 + ∆)[f(x) − f(x) + f(x − h)] = (1 + ∆)f(x − h) = f(x − h) + f(x) − f(x − h) = f(x). Therefore, (1 + ∆)(1 − ∇) ≡ 1.

(ii) µf(x) = (1/2)[E^{1/2} + E^{−1/2}]f(x) = (1/2)[e^{hD/2} + e^{−hD/2}]f(x) = cosh(hD/2) f(x).    (3.39)

(iii) [(∆ + ∇)/2]f(x) = (1/2)[∆f(x) + ∇f(x)]
= (1/2)[f(x + h) − f(x) + f(x) − f(x − h)]
= (1/2)[f(x + h) − f(x − h)] = (1/2)[E − E⁻¹]f(x)
= µδf(x) (as in the previous case).

Thus,

µδ ≡ (∆ + ∇)/2.    (3.40)
(iv) ∆∇f(x) = ∆[f(x) − f(x − h)] = f(x + h) − 2f(x) + f(x − h). Again, ∇∆f(x) = f(x + h) − 2f(x) + f(x − h) = (E − 2 + E⁻¹)f(x) = (E^{1/2} − E^{−1/2})²f(x) = δ²f(x). Hence,

∆∇ ≡ ∇∆ ≡ (E^{1/2} − E^{−1/2})² ≡ δ².    (3.41)

(v) [∆E⁻¹/2 + ∆/2]f(x) = (1/2)[∆f(x − h) + ∆f(x)]
= (1/2)[f(x) − f(x − h) + f(x + h) − f(x)]
= (1/2)[f(x + h) − f(x − h)] = (1/2)[E − E⁻¹]f(x)
= (1/2)(E^{1/2} + E^{−1/2})(E^{1/2} − E^{−1/2})f(x)
= µδf(x).
Hence,

∆E⁻¹/2 + ∆/2 ≡ µδ.    (3.42)

(vi) [µ + δ/2]f(x) = {(1/2)[E^{1/2} + E^{−1/2}] + (1/2)[E^{1/2} − E^{−1/2}]}f(x) = E^{1/2}f(x). Thus,

E^{1/2} ≡ µ + δ/2.    (3.43)

(vii) δµf(x) = (1/2)(E^{1/2} + E^{−1/2})(E^{1/2} − E^{−1/2})f(x) = (1/2)[E − E⁻¹]f(x). Therefore,

(1 + δ²µ²)f(x) = [1 + (1/4)(E − E⁻¹)²]f(x) = [1 + (1/4)(E² − 2 + E⁻²)]f(x) = (1/4)(E + E⁻¹)²f(x)
= [1 + (1/2)(E^{1/2} − E^{−1/2})²]²f(x) = (1 + δ²/2)²f(x).

Hence,

1 + δ²µ² ≡ (1 + δ²/2)².    (3.44)

(viii) [δ²/2 + δ√(1 + δ²/4)]f(x)
= (1/2)(E^{1/2} − E^{−1/2})²f(x) + (E^{1/2} − E^{−1/2})√(1 + (1/4)(E^{1/2} − E^{−1/2})²) f(x)
= (1/2)[E + E⁻¹ − 2]f(x) + (1/2)(E^{1/2} − E^{−1/2})(E^{1/2} + E^{−1/2})f(x)
= (1/2)[E + E⁻¹ − 2]f(x) + (1/2)(E − E⁻¹)f(x)
= (E − 1)f(x).
Hence,

δ²/2 + δ√(1 + δ²/4) ≡ E − 1 ≡ ∆.    (3.45)
In Table 3.5, it is shown that any operator can be expressed with the help of another operator. E
∆
∇
E
E
∆+1
(1 − ∇)−1
∆
E−1
∆
(1 − ∇)−1 − 1
∇
1 − E −1
1 − (1 + ∆)−1
∇
δ E 1/2−E −1/2 ∆(1 + ∆)−1/2 ∇(1 − ∇)−1/2 E 1/2+E −1/2 (1 + ∆/2) (1−∇/2)(1−∇)−1/2 µ 2 ×(1 + ∆)−1/2 hD
log E
log(1 + ∆)
− log(1 − ∇)
δ δ2
hD r
δ2 + δ 1+ ehD 2 r 4 δ2 δ2 +δ 1+ ehD − 1 2 4 r δ2 δ2 1 − e−hD − +δ 1+ 2 4 δ 2 sinh(hD/2) 2 δ 1+ cosh(hD/2) 4 1+
2 sinh−1 (δ/2)
hD
Table 3.5: Relationship between the operators. From earlier discussion we noticed that there is an approximate equality between ∆ operator and derivative. These relations are presented below. 16
...................................................................................... By the definition of derivative, f (x + h) − f (x) ∆f (x) = lim . h→0 h→0 h h
f 0 (x) = lim
Thus, ∆f (x) ' hf 0 (x) = hDf (x). Again, f 0 (x + h) − f 0 (x) h→0 h ∆f (x + h) ∆f (x) − h h ' lim h→0 h ∆f (x + h) − ∆f (x) ∆2 f (x) = lim = lim . h→0 h→0 h2 h2
f 00 (x) = lim
Hence, ∆2 f (x) ' h2 f 00 (x) = h2 D2 f (x). In general, ∆n f (x) ' hn f n (x) = hn Dn f (x). That is, for small values of h, the operators ∆ and hD are almost equal.
3.3 Polynomial using factorial notation According to the definition of factorial notation, one can write x(0) = 1 x(1) = x x(2) = x(x − h) x(3)
= x(x − h)(x − 2h)
x(4)
= x(x − h)(x − 2h)(x − 3h)
(3.46)
and so on. From these equations it is obvious that the base terms (x, x2 , x3 , . . .) of a polynomial can be expressed in terms of factorial notations x(1) , x(2) , x(3) , . . ., as shown below. 1 = x(0) x = x(1) x2 = x(2) + hx(1)
(3.47)
x3 = x(3) + 3hx(2) + h2 x(1) x4 = x(4) + 6hx(3) + 7h2 x(2) + h3 x(1) 17
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Operators in Numerical Analysis and so on. Note that the degree of xk (for any k = 1, 2, 3, . . .) remains unchanged while expressed it in factorial notation. This observation leads to the following lemma. Lemma 3.1 Any polynomial f (x) in x can be expressed in factorial notation with same degree. Since all the base terms of a polynomial are expressed in terms of factorial notation, every polynomial can be written with the help of factorial notation. Once a polynomial is expressed in a factorial notation, then its differences can be determined by using the formula like differential calculus. Example 3.4 Express f (x) = 10x4 − 41x3 + 4x2 + 3x + 7 in factorial notation and find its first and second differences. Solution. For simplicity, we assume that h = 1. Now by (3.47), x = x(1) , x2 = x(2) + x(1) , x3 = x(3) + 3x(2) + x(1) , x4 = x(4) + 6x(3) + 7x(2) + x(1) . Substituting these values to the function f (x) and we obtained f (x) = 10 x(4) + 6x(3) + 7x(2) + x(1) − 41 x(3) + 3x(2) + x(1) + 4 x(2) + x(1) + 3x(1) + 7 = 10x(4) + 19x(3) − 49x(2) − 24x(1) + 7. Now, the relation ∆x(n) = nx(n−1) (Property 3.1) is used to find the first and second order differences. Therefore, ∆f (x) = 10.4x(3) + 19.3x(2) − 49.2x(1) − 24.1x(0) = 40x(3) + 57x(2) − 98x(1) − 24 = 40x(x − 1)(x − 2) + 57x(x − 1) − 98x − 24 = 40x3 − 63x2 − 75x − 24 and ∆2 f (x) = 120x(2) + 114x(1) − 98 = 120x(x − 1) + 114x − 98 = 120x2 − 6x − 98. The above process to convert a polynomial in a factorial notation is a very labourious task when the degree of the polynomial is large. There is a systematic method, similar to Maclaurin’s formula in differential calculus, is used to convert a polynomial in factorial notation. This technique is also useful for a function which satisfies the Maclaurin’s theorem for infinite series. Let f (x) be a polynomial in x of degree n. We assumed that in factorial notation f (x) is of the following form f (x) = a0 + a1 x(1) + a2 x(2) + · · · + an x(n) , 18
(3.48)
...................................................................................... where ai ’s are unknown constants to be determined and an 6= 0. Now, we determine the different differences of (3.48) as follows. ∆f (x) = a1 + 2a2 x(1) + 3a3 x(2) + · · · + nan x(n−1) ∆2 f (x) = 2.1a2 + 3.2a3 x(1) + · · · + n(n − 1)an x(n−2) ∆3 f (x) = 3.2.1a3 + 4.3.2.x(1) + · · · + n(n − 1)(n − 2)an x(n−3) ······ ··· ············································· ∆n f (x) = n(n − 1)(n − 2) · · · 3 · 2 · 1an = n!an . Substituting x = 0 to the above relations and we obtained a0 = f (0),
∆f (0) = a1 , ∆2 f (0) ∆2 f (0) = 2.1.a2 or, a2 = 2! 3 f (0) ∆ ∆3 f (0) = 3.2.1.a3 or, a3 = 3! ·················· ······ ··············· ∆n f (0) ∆n f (0) = n!an or, an = . n! Using these results equation (3.48) transferred to f (x) = f (0) + ∆f (0)x(1) +
∆n f (0) (n) ∆2 f (0) (2) ∆3 f (0) (3) x + x + ··· + x . 2! 3! n! (3.49)
Observed that this formula is similar to Maclaurin’s formula of differential calculus. This formula can also be used to expand a function in terms of factorial notation. To expand a function in terms of factorial notation different forward differences are needed at x = 0. These differences can be determined using the forward difference table and the entire method is explained with the help of the following example. Example 3.5 Express f (x) = 15x4 − 3x3 − 6x2 + 11 in factorial notation. Solution. Let h = 1. For the given function, f (0) = 11, f (1) = 17, f (2) = 203, f (3) = 1091, f (4) = 3563. 19
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Operators in Numerical Analysis x
f (x)
0
11
∆f (x)
∆2 f (x)
∆3 f (x)
∆4 f (x)
6 1
17
180 186
2
522
203
702
360
888 3
882
1091
1584 2472
4
3563
Thus by formula (3.49) ∆2 f (0) (2) ∆3 f (0) (3) ∆4 f (0) (4) f (x) = f (0) + ∆f (0)x(1) + x + x + x 2! 3! 4! = 15x(4) + 87x(3) + 90x(2) + 6x(1) + 11. There is another method to find the coefficients of a polynomial in factorial notation, presented below. Example 3.6 Find f (x), if ∆f (x) = x4 − 10x3 + 11x2 + 5x + 3. Solution. The synthetic division is used to express ∆f (x) in factorial notation. 1 2 3
1 1 1
−10
11
5
1
−9
2
−9
2
7
2
−14
−7
−12
3
3 4
1
−4
1 Therefore, ∆f (x) = x(4) − 4x(3) − 12x(2) + 7x(1) + 3. Hence, 1 4 12 7 f (x) = x(5) − x(4) − x(3) + x(2) + 3x(1) + c, [using Property 1] 5 4 3 2 1 = x(x − 1)(x − 2)(x − 3)(x − 4) − x(x − 1)(x − 2)(x − 3) 5 7 −4x(x − 1)(x − 2) + x(x − 1) + 3x + c, where c is arbitrary constant. 2 20
......................................................................................
3.4 Difference of a polynomial Let f (x) = a0 xn + a1 xn−1 + · · · + an−1 x + an be a polynomial in x of degree n, where ai ’s are the given coefficients. Suppose, f (x) = b0 x(n) + b1 x(n−1) + b2 x(n−2) + · · · + bn−1 x(1) + bn be the same polynomial in terms of factorial notation. The coefficients bi ’s can be determined by using any method discussed earlier. Now, ∆f (x) = b0 nhx(n−1) + b1 h(n − 1)x(n−2) + b2 h(n − 2)x(n−3) + · · · + bn−1 h. Clearly this is a polynomial of degree n − 1. Similarly, ∆2 f (x) = b0 n(n − 1)h2 x(n−2) + b1 (n − 1)(n − 2)h2 x(n−3) + · · · + bn−2 h2 , ∆3 f (x) = b0 n(n − 1)(n − 2)h3 x(n−3) + b1 (n − 1)(n − 2)(n − 3)h3 x(n−4) + · · · + bn−3 h3 . In this way, ∆k f (x) = b0 n(n − 1)(n − 2) · · · (n − k + 1)hk x(n−k) . Thus finally, ∆k f (x), k < n is a polynomial of degree n − k, ∆n f (x) = b0 n!hn = n!hn a0 is constant, and ∆k f (x) = 0, if k > n.
In particular, ∆n+1 f (x) = 0.
Example 3.7 Let ui (x) = (x − x0 )(x − x1 ) · · · (x − xi ), where xi = x0 + ih, i = 0, 1, 2, . . . , n; h > 0. Prove that ∆k ui (x) = (i + 1)i(i − 1) · · · (i − k + 2)hk (x − x0 )(x − x1 ) · · · (x − xi−k ). Solution. Let ui (x) = (x − x0 )(x − x1 ) · · · (x − xi ) be denoted by (x − x0 )(i+1) . 21
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Operators in Numerical Analysis Therefore, ∆ui (x) = (x + h − x0 )(x + h − x1 ) · · · (x + h − xi ) − (x − x0 ) · · · (x − xi ) = (x + h − x0 )(x − x0 )(x − x1 ) · · · (x − xi−1 ) −(x − x0 )(x − x1 ) · · · (x − xi ) = (x − x0 )(x − x1 ) · · · (x − xi−1 )[(x + h − x0 ) − (x − xi )] = (x − x0 )(x − x1 ) · · · (x − xi−1 )(h + xi − x0 ) = (x − x0 )(x − x1 ) · · · (x − xi−1 )(i + 1)h
[since xi = x0 + ih]
= (i + 1)h(x − x0 )(i) . By similar way, ∆2 ui (x) = (i + 1)h[(x + h − x0 )(x + h − x1 ) · · · (x + h − xi−1 ) −(x − x0 )(x − x1 ) · · · (x − xi−1 )] = (i + 1)h(x − x0 )(x − x1 ) · · · (x − xi−2 )[(x + h − x0 ) − (x − xi−1 )] = (i + 1)h(x − x0 )(i−1) ih = (i + 1)ih2 (x − x0 )(i−1) . Also, ∆3 ui (x) = (i + 1)i(i − 1)h3 (x − x0 )(i−2) . Hence, in this way ∆k ui (x) = (i + 1)i(i − 1) · · · (i − k − 2)hk (x − x0 )(i−k−1) = (i + 1)i(i − 1) · · · (i − k + 2)hk (x − x0 )(x − x1 ) · · · (x − xi−k ).
22
.
Chapter 2 Interpolation
Module No. 1 Lagrange’s Interpolation
...................................................................................... We start this module with a problem. It is well known that the population of India are known for the years 1951, 1961, 1971, 1981, 1991, 2001, 2011. What is the population of India in the year 2008? Exact result is not known for this question. What is the approximate value? By guessing we can say an approximate figure. But, guessing gives different results for different persons, in different times, etc. To avoid this ambiguity, a method is developed known as interpolation which gives an approximate value of this problem. Obviously, this value is not exact, contains some error. Also, there is a method to estimate such error. Now, we state the general interpolation problem: Let y = f (x) be a function, whose explicit express is not known, but a table of values of y is known for a given set of values x0 , x1 , x2 , . . ., xn of x. There is no other information available about the function f (x). That is, f (xi ) = yi , i = 0, 1, . . . , n.
(1.1)
The problem of interpolation is to find the value of y(= f (x)) for a given value of x say, x ¯. Obviously, the value of y at x ¯ is unknown. Many different methods are available to find the value of y at the given point x = x ¯. The main step of interpolation is to find an approximate function, say, φ(x), for the given function f (x) based on the given tabulated values. The approximate function should be simple and easy to handle. The constructed function φ(x) may be a polynomial, exponential, geometric function, Taylor’s series, Fourier series, etc. If the function φ(x) is a polynomial, then the corresponding interpolation is called polynomial interpolation. Polynomial interpolation is used in most of the situations as polynomial is easy to evaluate, continuous, differentiable and integrable in any range. A polynomial φ(x) is polynomial if yi = f (xi ) = φ(xi ), i = interpolating called dk φ dk f = for some finite k, and x ¯ is one of the values of x0 , 0, 1, 2, . . . , n and dxk x¯ dxk x¯ x1 , . . ., xn . Every interpolating polynomial must satisfies the following condition. Theorem 1.1 If the function f (x) is continuous on [a, b], then for any pre-assigned positive number ε > 0, there exists a polynomial φ(x) such that |f (x) − φ(x)| < ε for all x ∈ (a, b). This theorem implies that the interpolating polynomial φ(x) is bounded by the functions y = f (x) − ε and y = f (x) + ε, for some given ε. The relation between the 1
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Lagrange’s Interpolation
Figure 1.1: Interpolation of a function. function y = f (x), interpolating polynomial φ(x) and two boundaries functions are shown in Figure 1.1. Several interpolation methods are available in literature. Among them Lagrange’s interpolation is one of the mostly useful method. We assumed that a polynomial of degree n, means a polynomial of degree not higher than n.
1.1 Lagrange’s interpolation polynomial Let y = f (x) be a real valued function which is defined on the closed interval [a, b] and yi = f (xi ), i = 0, 1, . . . , n. Suppose the following table is given. x x0 x1 · · · xn y
y0
y1
···
yn
Note that the table contains n + 1 points of the function y = f (x). Now, our problem is to find a polynomial φ(x) of degree less than or equal to n which satisfied the condition φ(xi ) = yi , i = 0, 1, . . . , n.
(1.2)
The polynomial φ(x) is called the interpolation polynomial. If all the points are collinear, then the degree of φ(x) is linear. 2
...................................................................................... Suppose the polynomial φ(x) is the following form n X
φ(x) =
Li (x) yi ,
(1.3)
i=0
where each Li (x) is polynomial in x, of degree less than or equal to n. The function Li (x) called the Lagrangian function. The polynomial φ(x) satisfies the condition (1.2) if ( Li (xj ) =
0, for i 6= j 1, for i = j.
This condition implies that the polynomial Li (x) is zero at the points x0 , x1 , . . . , xi−1 , xi+1 , . . . , xn , and it is 1 when x = xi . This statement leads to the following form of Li (x). Li (x) = ai (x − x0 )(x − x1 ) · · · (x − xi−1 )(x − xi+1 ) · · · (x − xn ), where ai is a constant and its value is to be determined by using the condition Li (xi ) = 1. Thus, ai (xi − x0 )(xi − x1 ) · · · (xi − xi−1 )(xi − xi+1 ) · · · (xi − xn ) = 1. That is, ai = 1/{(xi − x0 )(xi − x1 ) · · · (xi − xi−1 )(xi − xi+1 ) · · · (xi − xn )}. Hence, Li (x) =
(x − x0 )(x − x1 ) · · · (x − xi−1 )(x − xi+1 ) · · · (x − xn ) . (xi − x0 )(xi − x1 ) · · · (xi − xi−1 )(xi − xi+1 ) · · · (xi − xn )
(1.4)
Finally, the Lagrange’s interpolation polynomial φ(x) is given by φ(x) =
n X
Li (x) yi ,
i=0
where Li (x) is given in (1.4). Note that each Li (x) is a polynomial of degree at most n and it can be written in many different forms. One of such form is Li (x) =
n Y x − xj . xi − xj j=0 j6=i
3
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Lagrange’s Interpolation With this notation φ(x) is given by φ(x) =
n Y n X x − xj yi . xi − xj j=0 i=0
j6=i
Another form of Li (x) is deduced below. Let w(x) = (x − x0 )(x − x1 ) · · · (x − xn ).
(1.5)
This is a polynomial of degree n+1 and it vanishes at the n+1 points x = x0 , x1 , . . . , xn . The derivative w0 (x) is given by w0 (x) = (x − x1 )(x − x2 ) · · · (x − xn ) + (x − x0 )(x − x2 ) · · · (x − xn ) + · · · + (x − x0 )(x − x1 ) · · · (x − xi−1 )(x − xi+1 ) · · · (x − xn ) + · · · + (x − x0 )(x − x1 ) · · · (x − xn−1 ). At x = xi , w0 (x) is w0 (xi ) = (xi − x0 )(xi − x1 ) · · · (xi − xi−1 )(xi − xi+1 ) · · · (xi − xn ). Note that, this is the denominator of Li (x). In terms of w(x), the Li (x) is expressed as Li (x) =
w(x) . (x − xi )w0 (xi )
Thus, the Lagrange’s interpolation polynomial in terms of w(x) is φ(x) =
n X i=0
w(x) yi . (x − xi )w0 (xi )
(1.6)
Note the following important observations. 1. For a set of n + 1 points, the degree of Lagrange’s interpolating polynomial is at most n, i.e. the degree of the polynomial depends on the number of given points. 2. The Lagrange’s interpolating polynomial always pass through the given points (xi , yi ), i = 0, 1, 2, . . . , n, i.e. yi = φ(xi ). Let us consider the following example to illustrate the Lagrange’s interpolation. 4
......................................................................................
Example 1.1 The following table is obtained from a function f (x) whose explicit form is unknown x
−1
0
1
f (x)
3
8
11
Use Lagrange’s interpolating polynomial to find the polynomial φ(x) and hence find approximate value of the function f (x) at x = 0.5. Solution. Given that x0 = −1, x1 = 0, x2 = 1 and f (x0 ) = 3, f (x1 ) = 8, f (x2 ) = 11. Let φ(x) be the approximate polynomial of the function f (x). 2 X Then φ(x) = Li (x)f (xi ) i=0
where
(x − 0)(x − 1) x2 − x (x − x1 )(x − x2 ) = = . (x0 − x1 )(x0 − x2 ) (−1 − 0)(−1 − 1) 2 (x + 1)(x − 1) x2 − 1 (x − x0 )(x − x2 ) = . L1 (x) = = (x1 − x0 )(x1 − x2 ) (0 + 1)(0 − 1) −1 (x − x0 )(x − x1 ) (x + 1)(x − 0) x2 + x L2 (x) = = = . (x2 − x0 )(x2 − x1 ) (1 + 1)(1 − 0) 2 L0 (x) =
Therefore, x2 − x x2 − 1 x2 + x ×3+ ×8+ × 11 2 −1 2 = −x2 + 4x + 8.
φ(x) =
Hence, f (0.5) = 9.75.
1.2 Linear interpolation There is a major disadvantage of Lagrange’s interpolation. The degree of the interpolating polynomial is high for a large data set. To remove this drawback different interpolating formulae are developed. One of them is linear interpolation. In this method, for each interval [xi , xi+1 ], i = 0, 1, 2, . . . , n − 1, one interpolating polynomial is constructed. But, for two given points there is a polynomial whose degree must be one, i.e. linear. The linear interpolating polynomial is described below. 5
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Lagrange’s Interpolation Let (x0 , y0 ) and (x1 , y1 ) be two points. For this case, φ(x) = L0 (x)y0 + L1 (x)y1 x − x0 x − x0 x − x1 y0 + y1 = y0 + (y1 − y0 ). = x0 − x1 x1 − x0 x1 − x0
(1.7)
This polynomial is known as linear interpolation polynomial for the interval [x0 , x1 ]. For the given n + 1 points (xi , yi ), i = 0, 1, 2, . . . , n, the linear interpolating polynomials are φ(x) = yi +
x − xi (yi+1 − yi ), when xi ≤ x ≤ xi+1 , i = 0, 1, 2, . . . , n − 1. xi+1 − xi
1.3 Lagrangian interpolation formula for equally spaced points The Lagrange’s interpolation is laborious, particular when the values of xi ’s are fraction and unequal spaced. But, if the xi ’s are equal spaced, then a modified formula can be derived and we will see that it is more easier than the previous one. Since xi ’s are in equally spaced, therefore they can be written as xi = x0 + ih, i = 0, 1, 2 . . . , n, where h is the spacing (difference between two consecutive x’s). Now, a new variable s is introduced which is related with x as x = x0 + sh. Then x − xi = (s − i)h and xi − xj = (i − j)h. In this notation w(x) is given by w(x) = (x − x0 )(x − x1 ) · · · (x − xn ) = sh(s − 1)h(s − 2)h · · · (s − n)h = hn+1 s(s − 1)(s − 2) · · · (s − n). Again, w0 (xi ) = (xi − x0 )(xi − x1 ) · · · (xi − xi−1 )(xi − xi+1 )(xi − xi+2 ) · · · (xi − xn ) = (ih)(i − 1)h · · · (i − i − 1)h(i − i + 1)h(i − i + 2)h · · · (i − n)h = hn i(i − 1) · · · 1 · (−1)(−2) · · · ({−(n − i)} = hn i!(−1)n−i (n − i)!. 6
...................................................................................... With these values the Lagrange’s interpolation formula for equal spaced points is given by φ(x) = =
n X hn+1 s(s − 1)(s − 2) · · · (s − n) i=0 n X
(−1)n−i hn i!(n − i)!(s − i)h (−1)n−i
i=0
yi
s(s − 1)(s − 2) · · · (s − n) yi , i!(n − i)!(s − i)
(1.8)
where x = x0 + sh. Note that, in general, s is a fraction and each time an integer is subtracted from it. Obviously is it easy, particularly in hand calculation. In the following, it is proved that for every tabulated values, the Lagrange’s interpolation polynomial exists and unique. Theorem 1.2 The Lagrange’s interpolation polynomial exists and unique. Proof. By the construction of Lagrange’s interpolation formula yi = φ(xi ), for all i = 0, 1, . . . , n.
(1.9)
When n = 1, φ(x) =
x − x0 x − x1 y0 + y1 . x0 − x1 x1 − x0
(1.10)
When n = 2, φ(x) =
(x − x1 )(x − x2 ) (x − x0 )(x − x2 ) y0 + y1 (x0 − x1 )(x0 − x2 ) (x1 − x0 )(x1 − x2 ) (x − x0 )(x − x1 ) + y2 . (x2 − x0 )(x2 − x1 )
(1.11)
In general, for any positive integer n, φ(x) =
n X
Li (x)yi ,
(1.12)
i=0
where Li (x) =
(x − x0 ) · · · (x − xi−1 )(x − xi+1 ) · · · (x − xn ) , (xi − x0 ) · · · (xi − xi−1 )(xi − xi+1 ) · · · (xi − xn ) i = 0, 1, . . . , n.
(1.13) 7
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Lagrange’s Interpolation For two points the interpolating polynomial is a linear (see (1.10)) and φ(x0 ) = y0 and φ(x1 ) = y1 . For three points it is a second degree polynomial (see (1.11)) and φ(x0 ) = y0 , φ(x1 ) = y1 , φ(x2 ) = y2 , i.e. satisfy (1.9). Thus, the condition (1.13) is satisfied for n = 1, 2. In general, the polynomials (1.13) are expressed in the form of a fraction whose numerator is a polynomial of degree n and denominator is a non-zero number. Again, Li (xi ) = 1 and Li (xj ) = 0 for j 6= i, j = 0, 1, . . . , n. That is, φ(xi ) = yi . Thus, the conditions of interpolating polynomial are satisfied. Hence, the Lagrange’s polynomial exists. Uniqueness of the polynomial Let φ(x) be a polynomial of degree n, where φ(xi ) = yi , i = 0, 1, . . . , n.
(1.14)
If possible let, φ∗ (x) be another polynomial of degree n which satisfied the conditions of interpolating polynomial, i.e. φ∗ (xi ) = yi , i = 0, 1, . . . , n.
(1.15)
Then from (1.14) and (1.15), φ∗ (xi ) − φ(xi ) = 0, i = 0, 1, . . . , n.
(1.16)
If φ∗ (x) − φ(x) 6= 0, then φ∗ (x) − φ(x) represents a polynomial of degree at most n. So it has at most n zeros, which contradicts (1.16), whose number of zeros is n + 1. Hence, φ∗ (x) = φ(x). Thus φ(x) is unique.
1.4 Properties of Lagrangian functions The Lagrange’s functions Li (x) satisfies many interesting properties discussed below. 1. The Lagrangian functions depend only on xi ’s and does not depend on the values of the functions. 2. Lagrangian functions are invariant under linear transformation. Proof. Let x = au + b, where a, b are constants. 8
...................................................................................... Therefore, xj = auj + b and x − xj = a(u − uj ). Again, xi − xj = a(ui − uj ) when (i 6= j). Now, w(x) and w0 (x) are change to w(x) = (x − x0 )(x − x1 ) · · · (x − xn ) = an+1 (u − u0 )(u − u1 ) · · · (u − un ) and w0 (xi ) = (xi − x0 )(xi − x1 ) · · · (xi − xi−1 )(xi − xi+1 ) · · · (xi − xn ) = an (ui − u0 )(ui − u1 ) · · · (ui − ui−1 )(ui − ui+1 ) · · · (ui − un ). Using these values Li (x) becomes, Li (x) =
w(x) (x − xi )w0 (xi )
an+1 (u − u0 )(u − u1 ) · · · (u − un ) a(u − ui )an (ui − u0 )(ui − u1 ) · · · (ui − ui−1 )(ui − ui+1 ) · · · (ui − un ) w(u) = = Li (u), (u − ui )w0 (ui )
=
where w(u) = (u − u0 )(u − u1 ) · · · (u − un ). This shows that all Li (x)’s are invariant. 3. Sum of Lagrangian functions is 1, i.e.
n X
Li (x) = 1.
i=0
Proof. From definition n X
Li (x) =
i=0
n X i=0
w(x) (x − xi )w0 (xi )
(1.17)
where w(x) = (x − x0 )(x − x1 ) · · · (x − xn ). Let us consider the expression 1 A0 A1 Ai An = + + ··· + + ··· + . w(x) x − x0 x − x1 x − xi x − xn This gives
(1.18)
1 = A0 (x − x1 )(x − x2 ) · · · (x − xn ) + A1 (x − x0 )(x − x2 ) · · · (x − xn ) + · · · + Ai (x − x0 )(x − x1 ) · · · (x − xi−1 )(x − xi+1 ) · · · (x − xn ) + · · · + An (x − x0 )(x − x1 ) · · · (x − xn−1 ).
(1.19) 9
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Lagrange’s Interpolation When x = x0 then from (1.19) we have 1 = A0 (x0 − x1 )(x0 − x2 ) · · · (x0 − xn ), i.e. A0 =
1 1 . = 0 (x0 − x1 )(x0 − x2 ) · · · (x0 − xn ) w (x0 )
Similarly, x = x1 gives A1 =
1 w0 (x1 )
.
1 1 and An = 0 . w0 (xi ) w (xn ) With these values equation (1.18) becomes 1 1 1 = + + ··· 0 w(x) (x − x0 )w (x0 ) (x − x1 )w0 (x1 ) 1 1 + + ··· + 0 (x − xi )w (xi ) (x − xn )w0 (xn ) n n X X w(x) i.e., 1 = = Li (x). (x − xi )w0 (xi )
In general, Ai =
i=0
i=0
Hence the results.
1.5 Error in interpolating polynomial In every numerical computation there must be an error. In interpolation, a function f (x) is approximated by a polynomial φ(x). So there should be an error at the nontabular points. In the following, we estimate the amount of error in interpolating polynomial. Theorem 1.3 Let I be an interval such that all interpolating points x0 , x1 , . . . , xn belong to I. If f (x) is continuous and have continuous derivatives of order n + 1 for all x in I, then the error at any point x is given by En (x) = (x − x0 )(x − x1 ) · · · (x − xn )
f (n+1) (ξ) , where ξ ∈ I. (n + 1)!
(1.20)
Proof. Let the error be En (x) = f (x) − φ(x). Let φ(x) be the interpolating polynomial which approximates the function f (x). The degree of φ(x) is less than or equal to n. 10
...................................................................................... At x = xi , the error term is En (xi ) = f (xi ) − φ(xi ) = 0 for i = 0, 1, . . . , n. Motivated from this result, it is assumed that En (x) = w(x)k, where w(x) = (x − x0 )(x − x1 ) . . . (x − xn ). Let u ∈ I be an arbitrary value of x, other than x0 , x1 , . . . , xn . Then, En (u) = w(u)k or f (u) − φ(u) = kw(u).
(1.21)
Now we define an auxiliary function F (x) = f (x) − φ(x) − kw(x).
(1.22)
The function F (x) vanishes at x = x0 , x1 , . . . , xn , since f (xi ) = φ(xi ) and w(xi ) = 0. Also, F (u) = 0, by (1.21). Thus, F (x) = 0 has n + 2 roots in I. By Roll’s theorem, F 0 (x) = 0 has n + 1 roots in I. F 00 (x) = 0 has n roots in I. Finally, F (n+1) (x) = 0 must have at least one root in I. Let ξ be one such root. Then F (n+1) (ξ) = 0. This gives, f (n+1) (ξ) − 0 + k(n + 1)! = 0 [φ(x) is a polynomial of degree n so φ(n+1) (x) = 0 and w(x) is a polynomial of degree n + 1, so w(n+1) (x) = (n + 1)!]. Hence, k=
f (n+1) (ξ) . (n + 1)!
Thus, the error at x = u is En (u) = kw(u) [by (1.21)] = (u − x0 )(u − x1 ) · · · (u − xn )
f (n+1) (ξ) . (n + 1)!
Hence, the error at any point x is given by f (n+1) (ξ) En (x) = (x − x0 )(x − x1 ) · · · (x − xn ) . (n + 1)!
Note 1.1 The above expression gives the error at any point x. Practically, this formula is not much useful because in many situations f (n+1) (ξ) cannot be determined. But, this expression gives an upper bound of the error. Let Mn+1 be the upper bound of f (n+1) (ξ) in I. Then |En (x)| ≤
Mn+1 |w(x)|. (n + 1)!
(1.23)
11
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Lagrange’s Interpolation
Note 1.2 In case of equispaced points, the error term is given by En (x) = s(s − 1)(s − 2) · · · (s − n)hn+1
f (n+1) (ξ) . (n + 1)!
(1.24)
Example 1.2 For equispaced points show that h 3 M3 √ , 9 3 h4 M4 (ii) |E3 (x)| ≤ , 24 (i) |E2 (x)| ≤
x 0 ≤ x ≤ x2
(1.25)
x 0 ≤ x ≤ x3 ,
(1.26)
where |f (n+1) (ξ)| ≤ Mn+1 for x0 ≤ ξ ≤ xn . |f (3) (ξ)| . 3! Let g(s) = s(s − 1)(s − 2). Then g 0 (s) = 3s2 − 6s + 2. Thus the extreme value is 1 given by g 0 (s) = 0, i.e. s = 1 ± √ . 3 Since g 00 (s) = 6(s − 1) < 0 at s = 1 − √13 , the maximum value of g(s) is Solution. (i) |E2 (x)| = |s(s − 1)(s − 2)|h3
1 1 1 2 1− √ −√ − √ −1 = √ . 3 3 3 3 3
Hence, 2 M3 h3 M3 |E2 (x)| ≤ √ h3 = √ . 6 3 3 9 3 |f (4 (ξ)| . 4! 0 Let p(s) = s(s − 1)(s − 2)(s − 3). Then p (s) = 4s3 − 18s2 + 22s − 6. (ii) |E3 (x)| = |s(s − 1)(s − 2)(s − 3)|h4
0 (s) = 0, i.e. 2s3 − 9s2 + 11s − 3 = 0. The extreme points is given by p√ 3 3± 5 This equation gives s = , . 2 2 √ 3± 5 It can be shown that |p(s)| is maximum when s = and the maximum value 2 is |p(s)| = 1.
Thus, |E3 (x)| ≤ 1.h4 12
M4 h4 M4 = . 24 24
......................................................................................
Example 1.3 Let f (x) = sin x over [0, 1.5] and it is approximated by Lagrange’s interpolation. Find the maximum error when f (x) is approximated in quadratic and cubic polynomials. Solution. Here f (x) = sin x. |f 0 (x)| = | cos x|, |f 00 (x)| = | sin x|, |f 000 (x)| = | cos x|, |f iv (x)| = | sin x|. |f 000 (x)| ≤ | cos 0| = 1, so M3 = 1, and |f iv (x)| ≤ | sin 1.5| = 0.997495, and hence M4 = 0.997495. In case of quadratic approximation the value of h is h = (1.5 − 0)/2 = 0.75 and the maximum error is given by |E2 (x)| ≤
(0.75)3 × 1.0 h3 M3 √ ≤ √ = 0.02706. 9 3 9 3
In case of cubic approximation h = (1.5 − 0)/3 = 0.5. The error bound is |E3 (x)| ≤
(0.5)4 × 0.997495 h4 M4 ≤ = 0.00260. 24 24
Example 1.4 Determine the step size h which is used in the tabulation of f (x) = sin x in the interval [0,1] so that the quadratic interpolation will be correct up to six decimal places. Solution. The upper bound of the error in quadratic approximation is |E2 (x)| ≤ f (x) = sin x,
h 3 M3 √ , 9 3
f 0 (x) = cos x,
M3 = max f 000 (x). 0≤x≤1
f 00 (x) = − sin x,
f 000 (x) = − cos x.
The value of M3 is given by M3 = max |f 000 (x)| = max | cos x| = 1. 0≤x≤1
0≤x≤1
h3 Therefore, √ × 1 ≤ 5 × 10−6 . 9 3 √ This gives h3 ≤ 45 3 × 10−6 , i.e. h ≤ 0.0427. Example 1.5 Use Lagrange’s interpolation formula obtain a quadratic polynomial approximation of the function f (x) = (1.25 − x)e−x , taking x = 0, 0.5, 1. Solution. Let x0 = 0, x1 = 0.5, x2 = 1 and f (x0 ) = 1.25, f (x1 ) = 0.75e−0.5 , f (x2 ) = 0.25e−1 . 13
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Lagrange’s Interpolation The quadratic polynomial φ(x) is (x − x1 )(x − x2 ) (x − x0 )(x − x2 ) (x − x0 )(x − x1 ) f (x0 ) + f (x1 ) + f (x2 ) (x0 − x1 )(x0 − x2 ) (x1 − x0 )(x1 − x2 ) (x2 − x0 )(x2 − x1 ) (x − 0)(x − 1) (x − 1/2)(x − 1) × 1.25 + × 0.75e−0.5 = (0 − 1/2)(0 − 1) (1/2 − 0)(1/2 − 1) (x − 0)(x − 1/2) + × 0.25e−1 (1 − 0)(1 − 1/2) = 1.25(2x − 1)(x − 1) − 3e−0.5 x(x − 1) + 0.25e−1 x(2x − 1)
φ(x) =
= 0.864348x2 − 2.022378x + 1.250000. The given function f (x) and the interpolating polynomial are shown in Figure 1.2.
Figure 1.2: The function f (x) and the polynomial φ(x)
Example 1.6 Using Lagrange’s interpolation formula, express the function x2 + 2x + 3 (x + 1)x(x − 1) as sums of partial fractions. Solution. Let f (x) = x2 + 2x + 3. We tabulate f (x) for x = −1, 0, 1 as follows:
14
x
:
−1
0
1
f (x)
:
2
3
6
...................................................................................... The Lagrange’s functions are (x − x1 )(x − x2 ) x(x − 1) = . (x0 − x1 )(x0 − x2 ) 2 (x − x0 )(x − x2 ) (x + 1)(x − 1) L1 (x) = = . (x1 − x0 )(x1 − x2 ) −1 (x − x0 )(x − x1 ) (x + 1)x L2 (x) = . = (x2 − x0 )(x2 − x1 ) 2 L0 (x) =
By Lagrange’s interpolation formula the polynomial f (x) is given by x(x − 1) (x + 1)(x − 1) (x + 1)x ×2+ ×3+ ×6 2 −1 2 = x(x − 1) − 3(x + 1)(x − 1) + 3x(x + 1).
f (x) =
Thus, f (x) 1 3 3 x2 + 2x + 3 = = − + . (x + 1)x(x − 1) (x + 1)x(x − 1) x+1 x x−1 Merit and demerits of Lagrangian interpolation Lagrange’s interpolation formula is used for any type of data as in this formula there is no restriction on the spacing h. This formula is also applicable to find the value of f (x) at any point x within the minimum and maximum values of x0 , x1 , . . . , xn . There are also some disadvantages of this formula. If the number of interpolating points decreases or increases, then a fresh calculation is needed to find all Lagrange’s functions. Again, each Li (x) is a polynomial of degree n if the table contains n + 1 points. Thus we have to calculate n such polynomials each of degree n and it is very laborious. Generally, we think that if the number of points increases then there is a significant improvement in the approximate value, but it is not true for all functions. For example, if we consider f (x) = cos x, then for 11 points the Lagrange’s interpolating polynomial exactly matches with f (x) within the interval [−5, 5]. But, this is not happened for all functions. In case of Lorentz function f (x) = 1/(1 + x2 ), if the number of points increases, the approximate function oscillates more rapidly.
15
.
Chapter 2 Interpolation
Module No. 2 Newton’s Interpolation Methods
...................................................................................... In Lagrange’s interpolation lots of arithmetic calculations are involved. But, this method is applicable for both equal and unequal spaced points. In this module, we discussed about new kind of interpolation methods based on finite differences known as Newton’s interpolations. In these methods, we assumed that the values of x’s are equispaced.
2.1
Newton’s forward difference interpolation formula
Suppose the explicit form of the function y = f (x) is unknown, but the values of y at some equispaced points x0 , x1 , . . . , xn are known. Let yi be the value of f (x) at x = xi and it is known for all i = 0, 1, 2, . . . , n. If the function is given, then one can calculate such values. Since the values of x’s are equispaced, therefore xi = x0 + ih, i = 0, 1, . . . , n, where h is called the spacing. Thus, we have a set of n + 1 points (xi , yi ), i = 0, 1, 2, . . . , n. Now, the problem is to construct a polynomial, say, φ(x) of degree less than or equal to n which passes through the points yi = φ(xi ),
i = 0, 1, . . . , n.
(2.1)
Let the polynomial φ(x) be the following form φ(x) = a0 + a1 (x − x0 ) + a2 (x − x0 )(x − x1 ) + a3 (x − x0 )(x − x1 )(x − x2 ) + · · · + an (x − x0 )(x − x1 ) · · · (x − xn−1 ),
(2.2)
where a0 , a1 , . . . , an are constants and their values are to be determined using (2.1). Now, our problem is to find the values of ai ’s. To find these values, substitute x = xi in (2.2) according to the order i = 0, 1, 2, . . . , n. When x = x0 , then φ(x0 ) = a0 , i.e. a0 = y0 . When x = x1 , then φ(x1 ) = a0 + a1 (x1 − x0 )
y1 − y0 ∆y0 = . h h When x = x2 , φ(x2 ) = a0 + a1 (x2 − x1 ) + a2 (x2 − x0 )(x2 − x1 ) y1 − y0 i.e. y2 = y0 + .2h + a2 (2h)(h). h y2 − 2y1 + y0 ∆ 2 y0 Therefore, a2 = = . 2!h2 2!h2 i.e. y1 = y0 + a1 h. Thus, the value of a1 is a1 =
1
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Newton’s Interpolation Methods ∆n y0 . n!hn By substituting all the values of ai in (2.2), the polynomial φ(x) becomes
In this way, all ai ’s can be determined. In general, an =
∆y0 ∆ 2 y0 + (x − x0 )(x − x1 ) h 2!h2 3 ∆ y0 +(x − x0 )(x − x1 )(x − x2 ) 3!h3 ∆n y0 + · · · + (x − x0 )(x − x1 ) · · · (x − xn−1 ) . n!hn
φ(x) = y0 + (x − x0 )
(2.3)
This is the interpolating polynomial, but it can be simplified by using the equispaced condition and introducing a suitable variable discussed below. Let the new variable u be defined as x = x0 + uh. Since xi = x0 + ih, therefore, x − xi = (u − i)h for i = 0, 1, 2, . . . , n. Using these values the equation (2.3) reduced to ∆2 y0 ∆ 3 y0 ∆y0 + (uh)(u − 1)h + (uh)(u − 1)h(u − 2)h h 2!h2 3!h3 n ∆ y0 + · · · + (uh)(u − 1)h(u − 2)h · · · (u − n − 1)h n!hn u(u − 1) 2 u(u − 1)(u − 2) 3 = y0 + u∆y0 + ∆ y0 + ∆ y0 2! 3! u(u − 1)(u − 2) · · · (u − n − 1) n +··· + ∆ y0 . n!
φ(x) = y0 + (uh)
(2.4)
x − x0 for a given value of x. h This is known as Newton or Newton-Gregory forward difference interpolating The value of u is obtained by u =
polynomial. Obviously, this interpolation formula also contains some error, and the error is estimated in the following. Error in Newton’s forward interpolation formula In previous module, it is seen that the error in polynomial interpolation is E(x) = (x − x0 )(x − x1 ) · · · (x − xn )
f (n+1) (ξ) (n + 1)!
= u(u − 1)(u − 2) · · · (u − n)hn+1 2
f (n+1) (ξ) (using x = x0 + uh) (n + 1)!
...................................................................................... where ξ is a value of x and it lies between min{x0 , x1 , . . . , xn , x} and max{x0 , x1 , . . . , xn , x}. In terms of forward difference, the value of f (n+1) (ξ) is approximately equal to hn+1 ∆n+1 y0 . Thus, the error in terms of forward difference is E(x) '
u(u − 1)(u − 2) · · · (u − n) n+1 ∆ y0 . (n + 1)!
Let us consider a particular case: If 0 < u < 1 then 2 1 1 1 |u(u − 1)| = (1 − u)u = u − u = − − u ≤ and 4 2 4 |(u − 2)(u − 3) · · · (u − n)| ≤ |(−2)(−3) · · · (−n)| = n!. 2
Thus, |E(x)| ≤
1 1 n! |∆n+1 y0 | = |∆n+1 y0 |. 4 (n + 1)! 4(n + 1)
Again, |∆n+1 y0 | ≤ 9 in the last significant figure. 9 Hence, |E(x)| ≤ < 1 for n > 2 and 0 < u < 1. 4(n + 1) This is a very interesting result and indicates that the maximum error in Newton’s forward interpolation is 1, provided u = |(x − x0 )/h| < 1, i.e. |x − x0 | < h. Example 2.1 Given below is a table of values of the probability integral y=
2 π
x
Z
2
e−x dx.
0
Determine the value of y when the value of x is 0.456. x
:
0.45
0.46
0.47
0.48
0.49
y
:
0.475482
0.484656
0.497452
0.502750
0.511668
Solution. Let us first construct the forward difference table as follows:
3
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Newton’s Interpolation Methods x
y
0.45
0.475482
∆y
∆2 y
∆3 y
0.009174 0.46
0.484656
0.003622 0.012796
0.47
0.497452
–0.011120 –0.007498
0.005298 0.48
0.502750
0.011118 0.003620
0.008918 0.49
0.511668
For this problem, x0 = 0.45, x = 0.456, h = 0.01, u =
x − x0 0.456 − 0.45 = = 0.6. h 0.01
Now, u(u − 1)(u − 2) 3 u(u − 1) 2 ∆ y0 + ∆ y0 2! 3! 0.6(0.6 − 1) = 0.475482 + 0.6 × 0.009174 + × 0.003622 2 0.6(0.6 − 1)(0.6 − 2) + × (−0.011120) 6 = 0.475482 + 0.0055044 − 0.0008693 − 0.0006227
y(0.456) = y0 + u∆y0 +
= 0.4794944. Hence, the value of y when x = 0.456 is 0.479494. Note 2.1 In general, Newton’s forward difference formula is used to compute the approximate value of f (x) when the value of x is near to x0 of the given table. But, if the value of x is at the end of the table, then this formula gives more error. In this case, Newton’s backward formula is used, discussed below.
2.2 Newton’s backward difference interpolation formula This is another form of Newton’s interpolation. Let the set of values (xi , yi ), i = 0, 1, 2, . . . , n be given, where yi = f (xi ) and xi = x0 + ih, h is the spacing. 4
...................................................................................... We have to construct a polynomial of degree less than or equal to n, which passes through the given points. Let φ(x) be such a polynomial and it is consider as follows: φ(x) = a0 + a1 (x − xn ) + a2 (x − xn )(x − xn−1 ) +a3 (x − xn )(x − xn−1 )(x − xn−2 ) + · · · + an (x − xn )(x − xn−1 ) · · · (x − x1 ).
(2.5)
Here ai ’s are constants and their values are to be determined using the conditions yi = φ(xi ), i = 0, 1, . . . , n.
(2.6)
To find the values of ai ’s, we substitute x = xn , nn−1 , . . . , x1 in (2.5). For x = xn , φ(xn ) = a0 , i.e. a0 = yn . For x = xn−1 , φ(xn−1 ) = a0 +a1 (xn−1 −xn ), i.e. yn−1 = yn +a1 (−h). This gives a1 = When x = xn−2 ,
∇yn yn − yn−1 = . h h
φ(xn−2 ) = a0 + a1 (xn−2 − xn ) + a2 (xn−2 − xn )(xn−2 − xn−1 ) yn − yn−1 = yn + (−2h) + a2 (−2h)(−h) h ∇ 2 yn yn − 2yn−1 + yn−2 = . That is, yn−2 = 2yn−1 − yn + a2 .2!h2 . Hence, a2 = 2!h2 2!h2 In this manner, all ai ’s can be determined. In general, ak =
∇ k yn . k!hk
Using these values, φ(x) becomes ∇2 yn ∇yn + (x − xn )(x − xn−1 ) h 2!h2 3 ∇ yn +(x − xn )(x − xn−1 )(x − xn−2 ) + ··· 3!h3 ∇n yn +(x − xn )(x − xn−1 )(x − xn−2 ) · · · (x − x1 ) . n!hn
φ(x) = yn + (x − xn )
(2.7)
Since the values of xi ’s are equispaced, further simplification is possible. Let v be a new variable defined by x = xn + vh. Note that v is unit less quantity. 5
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Newton’s Interpolation Methods Thus, xi = x0 +ih and x = xn +vh. Therefore, x−xn−i = (xn +vh)−(x0 +n − ih) = (xn − x0 ) + (v − n − i)h = (v + i)h, i = 0, 1, . . . , n. Using these results, (2.7) reduces to ∇3 yn ∇2 yn ∇yn + vh(v + 1)h(v + 2)h + ··· + vh(v + 1)h h 2!h2 3!h3 ∇ n yn +vh(v + 1)h(v + 2)h · · · (v + n − 1)h n!hn v(v + 1)(v + 2) 3 v(v + 1) 2 ∇ yn + ∇ yn + · · · = yn + v∇yn + 2! 3! v(v + 1)(v + 2) · · · (v + n − 1) n + ∇ yn . n!
φ(x) = yn + vh
(2.8)
This is well known Newton’s backward or Newton-Gregory backward interpolation formula. Note 2.2 The Newton’s backward difference interpolation formula is used to compute the value of f (x) when x is near to xn , i.e. when x is at the end of the table.
Error in Newton’s backward interpolation formula The error occur in backward difference interpolation formula is given by E(x) = (x − xn )(x − xn−1 ) · · · (x − x1 )(x − x0 ) = v(v + 1)(v + 2) · · · (v + n)hn+1
where v =
f (n+1) (ξ) (n + 1)!
f (n+1) (ξ) , (n + 1)!
(2.9)
x − xn and ξ lies between min{x0 , x1 , . . . , xn , x} and max{x0 , x1 , . . . , xn , x}. h
Example 2.2 The following table of values of x and f (x) is given. x
:
0.25
0.30
0.35
0.40
0.45
0.50
f (x)
:
2.6754
2.8765
2.9076
3.2876
3.3451
3.7139
Use Newton’s backward interpolation formula to determine the value of f (0.48). Solution. The backward difference table is 6
......................................................................................
∇f (x)
∇2 f (x)
∇3 f (x)
∇4 f (x)
x
f (x)
0.25
2.6754
0.30
2.8765
0.2011
0.35
2.9076
0.0311
–0.1700
0.40
3.2876
0.3800
0.3489
0.5189
0.45
3.3451
0.0575
–0.3225
–0.6714
–1.1903
0.50
3.7139
0.3688
0.3113
0.6338
1.3052
In this problem, xn = 0.50, x = 0.48, h = 0.05, v =
x − xn 0.48 − 0.50 = = −0.4. h 0.05
Then, v(v + 1)(v + 2) 3 v(v + 1) 2 ∇ f (xn )+ ∇ f (xn ) 2! 3! v(v + 1)(v + 2)(v + 3) 4 + ∇ f (xn )+· · · 4! −0.4(−0.4 + 1) = 3.7139 − 0.4 × 0.3688 + × 0.3113 2 −0.4(−0.4 + 1)(−0.4 + 2) + × 0.6338 6 −0.4(−0.4 + 1)(−0.4 + 2)(−0.4 + 3) × 1.3052 + 24 = 3.7139 − 0.14752 − 0.037356 − 0.040563 − 0.054296
f (0.48) = f (xn ) + v∇f (xn )+
= 3.43416 ' 3.4342. Thus, f (0.48) = 3.4342. Example 2.3 The following table gives the value of x and y. x
:
2
3
4
5
6
f (x)
:
5
8
12
20
37
Calculate the value of y at x = 5.5 by considering third degree Newton’s backward interpolation polynomial. Again, find the same value by considering the fourth degree polynomial. Solution. To find a third degree polynomial, we consider last four data and the corresponding backward difference table is 7
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Newton’s Interpolation Methods ∇y
∇2 y
x
y
3
8
4
12
4
5
20
8
4
6
37
17
9
In this problem, xn = 6, x = 5.5, h = 1, v =
∇3 y
5
x − xn 5.5 − 6 = = −0.5. h 1
v(v + 1) 2 v(v + 1)(v + 2) 3 ∇ yn + ∇ yn 2! 3! −0.5(−0.5 + 1) = 37 − 0.5 × 17 + ×9 2 −0.5(−0.5 + 1)(−0.5 + 2) ×5 + 6 = 37 − 8.5 − 1.125 − 0.3125
y(5.5) = yn + v∇yn +
= 27.0625. Now, we consider fourth degree polynomial to calculate y(5.5). The difference table is shown below. The additional calculations are marked by red colour. ∇y
∇2 y
∇3 y
x
y
2
5
3
8
3
4
12
4
1
5
20
8
4
3
6
37
17
9
5
∇4 y
2
The value of y(5.5) can be determined by the following formula. Note that there is only one term we have to calculate, other terms are already determined in previous step. v(v + 1) 2 v(v + 1)(v + 2) 3 v(v + 1)(v + 2)(v + 3) 4 ∇ yn + ∇ yn + ∇ yn . 2! 3! 4! The value of the last term is v(v + 1)(v + 2)(v + 3) 4 ∇ yn 4! −0.5(−0.5 + 1)(−0.5 + 2)(−0.5 + 3) = × 2 = −0.078125. 4! Thus, the value of y(5.5) is 27.0625 − 0.078125 = 26.984375.
f (0.48) = yn + v∇yn +
8
......................................................................................
Example 2.4 The upward velocity of a rocket is given below: t (sec)
:
0
10
15
20
25
30
v(t) (m/sec)
:
0
126.75
350.50
510.80
650.40
920.25
Determine the value of the velocity at t = 26 sec using Newton’s backward and forward formulae and compare the results. Solution. The velocity at t = 0 is zero, and hence it does not give any information. Therefore, we discard this data. Using Newton’s backward formula The backward difference table is ∇v
∇2 v
∇3 v
t
v(t)
10
126.75
15
350.50
223.75
20
510.80
160.30
–63.45
25
650.40
139.60
–20.70
42.45
30
920.25
269.85
130.25
150.95
Here, tn = 30, t = 26, h = 5, v = By Newton’s backward formula
∇4 v
108.50
t − tn 26 − 30 = = −0.8. h 5
v(v + 1) 2 v(v + 1)(v + 2) 3 ∇ yn + ∇ yn 2! 3! v(v + 1)(v + 2)(v + 3) 4 ∇ yn + 4! −0.8(−0.8 + 1) × (130.25) = 920.25 − 0.8 × 269.85 + 2 −0.8(−0.8 + 1)(−0.8 + 2) + × 150.95 6 −0.8(−0.8 + 1)(−0.8 + 2)(−0.8 + 3) + × 108.50 24 = 687.21.
v(26) = yn + v∇yn +
Thus the velocity of the rocket at t = 16 sec is 687.21 m/s. By Newton’s forward formula Since the Newton’s forward interpolation formula is used when the given value is at the beginning of the table. For this purpose the table is written in reverse order as 9
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Newton’s Interpolation Methods t (sec)
:
30
25
20
15
10
0
v(t) (m/s)
:
920.25
640.40
510.80
350.50
126.75
0
Here also we discard the last data as it has no importance. The forward difference table is t
v(t)
30
920.25
∇v
∇2 v
∇3 v
∇4 v
–269.85 25
650.40
130.25 –139.60
20
510.80
–150.95 –20.70
–160.30 15
350.50
108.50 –42.45
–63.45 –223.75
10
126.75
Here t0 = 30, t = 26, h = −5, u =
t − t0 26 − 30 = = 0.8. h −5
Then u(u − 1)(u − 2) 3 u(u − 1) 2 ∆ v0 + ∆ v0 2! 3! u(u − 1)(u − 2)(u − 3) 4 + ∆ v0 4! 0.8(0.8 − 1) = 920.25 + 0.8 × (−269.85) + × (130.25) 2 0.8(0.8 − 1)(0.8 − 2) 0.8(0.8 − 1)(0.8 − 2)(0.8 − 3) + × (−150.95) + × 108.50 6 24 = 687.21.
v(26) = y0 + u∆v0 +
Thus, the value of v(16) obtained by Newton’s forward formula is 687.21 which is same as the value obtained by Newton’s backward formula. From the above example one can conclude that if a problem is solved by Newton’s forward formula, then it can be solved by Newton’s backward formula also. In the following section, we discuss another interpolation formula called Newton’s divided difference interpolation formula. 10
......................................................................................
2.3 Divided differences and their properties Newton’s forward and backward interpolation formulae are deduced from forward and backward difference operators. In this section, another type of difference called divided difference is defined and based on this difference, Newton’s divided interpolation formula is derived. Let y = f (x) be a function whose explicit form is unknown, but it is known at some values of x, say at x0 , x1 , . . . , xn . Suppose (xi , yi ), where yi = f (xi ), i = 0, 1, . . . , n are known at n + 1 points x0 , x1 , . . . , xn . Like Lagrange’s interpolation, the points x0 , x1 , . . . , xn are not necessarily equispaced. The divided differences of different orders are defined below: Zeroth order divided difference The zeroth order divided difference for the argument x0 is denoted by f [x0 ] and is defined by f [x0 ] = f (x0 ). First order divided difference The first order divided difference for two arguments x0 and x1 is denoted by f [x0 , x1 ]. This is defined below: f (x0 ) − f (x1 ) . f [x0 , x1 ] = x0 − x1 In general, for the arguments xi and xj , the first order divided difference is f (xi ) − f (xj ) f [xi , xj ] = . xi − xj Second order divided difference For the arguments x0 , x1 and x2 the second order divided difference is denoted by f [x0 , x1 , x2 ] and is defined below: f [x0 , x1 ] − f [x1 , x2 ] . f [x0 , x1 , x2 ] = x0 − x2 In general, f [xi , xj , xk ] =
f [xi , xj ] − f [xj , xk ] . xi − xk
nth order divided differences 11
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Newton’s Interpolation Methods For the n+1 arguments x0 , x1 , . . . , xn the divided difference is denoted by f [x0 , x1 , . . . , xn ], which is defined as
f [x0 , x1 , . . . , xn−1 ] − f [x1 , x2 , . . . , xn ] . x0 − xn From the definition of first order divided difference it is seen that if the two arguments
f [x0 , x1 , . . . , xn ] =
are equal, then it does not give any definite value. But, there is a meaning, which is discussed below. 2.3.1
Divided differences for equal arguments or divided differences for confluent arguments
In the following, it is shown by limiting process, that there is a definite value of divided difference when the arguments are equal. The divided differences for equal arguments is known as confluent divided differences. f [x0 , x0 ] = lim f [x0 , x0 + ε] = lim ε→0
ε→0
f (x0 + ε) − f (x0 ) = f 0 (x0 ), ε
provided f (x) is differentiable. Thus, f [x0 , x0 ] is nothing but the first order derivative of f (x) at x = x0 . For three arguments, the confluent divided difference is determined as follows: f [x0 + ε, x0 ] − f [x0 , x0 ] ε→0 ε 0 − f (x0 )
f [x0 , x0 , x0 ] = lim f [x0 + ε, x0 , x0 ] = lim ε→0
= lim
f (x0 +ε)−f (x0 ) ε
ε f (x0 + ε) − f (x0 ) − εf 0 (x0 ) 0 = lim form ε→0 ε2 0 0 f (x0 + ε) − f 0 (x0 ) (by L’Hospital rule) = lim ε→0 2ε f 00 (x0 ) = . 2! ε→0
We obtained second order derivative.
000
f (x0 ) Similarly, the confluent divided difference of four arguments is f [x0 , x0 , x0 , x0 ] = . 3! In general, for (k + 1) arguments the confluent divided difference is given by (k+1) times
}| { z f k (x0 ) . f [x0 , x0 , . . . , x0 ] = k! For this relation we can write 12
(2.10)
......................................................................................
(k+1) times
z }| { dk f (x0 ) = k! f [x0 , x0 , . . . , x0 ]. dxk
(2.11)
Some interesting properties of divided difference are presented below. 2.3.2
Properties of divided differences
(i) Divided difference of a constant is zero Let f (x) = c, where c is an arbitrary constant. f (x0 ) − f (x1 ) c−c Then f [x0 , x1 ] = = = 0. x0 − x1 x0 − x1 (ii) Divided difference of cf (x), c is constant, is the divided difference of f (x) multiplied by c, i.e. if g(x) = cf (x), then g[x0 , x1 ] = cf [x0 , x] Let g(x) = cf (x). Therefore, g[x0 , x1 ] =
g(x0 ) − g(x1 ) cf (x0 ) − cf (x1 ) f (x0 ) − f (x1 ) = =c = cf [x0 , x1 ]. x0 − x1 x0 − x1 x0 − x1
(iii) Divided difference is linear Let f (x) = ap(x) + bq(x). Now, f (x0 ) − f (x1 ) ap(x0 ) + bq(x0 ) − ap(x1 ) − bq(x1 ) = x0 − x1 x0 − x1 p(x0 ) − p(x1 ) q(x0 ) − q(x1 ) =a +b x0 − x1 x0 − x1 = ap[x0 , x1 ] + bq[x0 , x1 ].
f [x0 , x1 ] =
This shows that divided difference is a linear operator. (iv) Divided differences are symmetric The first order divided difference is f [x0 , x1 ] =
f (x1 ) − f (x0 ) f (x0 ) − f (x1 ) = = f [x1 , x0 ]. x0 − x1 x1 − x0 13
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Newton’s Interpolation Methods So, it is symmetric. This expression can be written as 1 1 f (x0 ) + f (x1 ). Also, f [x0 , x1 ] = x0 − x1 x1 − x0 The second order difference can also be expressed to the above form. For three arguments f [x0 , x1 ] − f [x1 , x2 ] x0 − x2 o 1 hn 1 1 = f (x0 ) + f (x1 ) x0 − x2 x0 − x1 x1 − x0 oi n 1 1 f (x1 ) + f (x2 ) − x1 − x2 x2 − x1 1 = f (x0 ) (x0 − x2 )(x0 − x1 ) 1 1 + f (x1 ) + f (x2 ). (x1 − x0 )(x1 − x2 ) (x2 − x0 )(x2 − x1 )
f [x0 , x1 , x2 ] =
In this manner, the divided differences for (n + 1) arguments is written as 1 f (x0 ) (x0 − x1 )(x0 − x2 ) · · · (x0 − xn ) 1 f (x1 ) + · · · + (x1 − x0 )(x1 − x2 ) · · · (x1 − xn ) 1 + f (xn ) (xn − x0 )(xn − x1 ) · · · (xn − xn−1 ) n X f (xi ) . = (xi − x0 ) · · · (xi − xi−1 )(xi − xi+1 ) · · · (xi − xn )
f [x0 , x1 , . . . , xn ] =
i=0
(2.12) Thus, it is easy to conclude that the divided differences are symmetric. (v) For equispaced arguments, the divided differences can be expressed in terms of forward differences, i.e. f [x0 , x1 , . . . , xn ] =
1 hn .n!
∆n y0 .
Since the arguments are equispaced, therefore xi = x0 + ih, i = 0, 1, . . . , n. 14
(2.13)
...................................................................................... Therefore, the first order divided difference is given by f [x0 , x1 ] =
y1 − y0 ∆y0 f (x1 ) − f (x0 ) = = . x1 − x0 x1 − x0 h
(2.14)
Again, the second order difference is f [x0 , x1 ] − f [x1 , x2 ] 1 ∆y0 ∆y1 = − x0 − x2 −2h h h 2 ∆y1 − ∆y0 ∆ y0 = = . 2 2h 2!h2
f [x0 , x1 , x2 ] =
(2.15)
Now, we use mathematical induction to prove the result. The result (2.13) is true for n = 1, 2. Suppose the result be true for n = k. Therefore, f [x0 , x1 , . . . , xk ] =
∆k y0 . k! hk
Now, f [x0 , x1 , . . . , xk ] − f [x1 , x2 , . . . , xk+1 ] x0 − xk+1 h ∆k y 1 ∆k y1 i 0 = − −(xk+1 − x0 ) k! hk k! hk 1 = [∆k y1 − ∆k y0 ] (k + 1)k! hk+1 ∆k+1 y0 = . (k + 1)! hk+1
f [x0 , x1 , . . . , xk , xk+1 ] =
Hence, by mathematical induction we conclude that f [x0 , x1 , . . . , xn ] =
1 hn .n!
∆n y0 .
(2.16)
(vi) The nth order divided difference of a polynomial of degree n is constant Let f (x) = a0 xn + a1 xn−1 + a2 xn−2 + · · · + an−1 x + an , (a0 6= 0) be a polynomial of degree n. 15
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Newton’s Interpolation Methods Then f (x) − f (x0 ) x − x0 xn − xn0 xn−1 − xn−1 xn−2 − xn−2 x − x0 0 0 = a0 + a1 + a2 + · · · + an−1 x − x0 x − x0 x − x0 x − x0 = a0 [xn−1 + xn−2 x0 + xn−3 x20 + · · · + xxn−2 + xn−1 ] 0 0
f [x, x0 ] =
+a1 [xn−2 + xn−3 x0 + xn−4 x20 + · · · + xxn−3 + xn−2 ] + · · · + an−1 0 0 = a0 xn−1 + (a0 x0 + a1 )xn−2 + (a0 x20 + a1 x0 + a2 )xn−3 + · · · + an−1 = b0 xn−1 + b1 xn−2 + b2 xn−3 + · · · + bn−1 , where b0 = a0 , b1 = a0 x0 + a1 , b2 = a0 x20 + a1 x0 + a2 , . . . , bn−1 = an−1 . Thus, the first order divided difference of a polynomial of degree n is a polynomial of degree n − 1. In this manner, one can prove that the nth order divided difference of a polynomial is constant.
2.4 Newton’s fundamental interpolation formula Based on divided difference, one can construct another type of interpolation formula called Newton’s fundamental interpolation formula. From this formula, Newton’s forward and backward difference and also Lagrange’s interpolation formulae can be derived. So, in this context, this formula is highly importance. Let the function y = f (x) be known at the arguments x0 , x1 , . . . , xn and let yi = f (xi ), i = 0, 1, . . . , n. The points xi , i = 0, 1, . . . , n are not necessary equispaced. The first order divided difference for the arguments x0 , x is f (x) − f (x0 ) . x − x0 This can be written as f (x) = f (x0 ) + (x − x0 )f [x0 , x]. f [x0 , x] =
For the arguments x0 , x1 and x, the divided difference is f [x0 , x1 ] − f [x0 , x] x1 − x i.e., f [x0 , x] = f [x0 , x1 ] + (x − x1 )f [x0 , x1 , x] f (x) − f (x0 ) i.e., = f [x0 , x1 ] + (x − x1 )f [x0 , x1 , x] x − x0 Thus, f (x) = f (x0 ) + (x − x0 )f [x0 , x1 ] + (x − x0 )(x − x1 )f [x0 , x1 , x]. f [x0 , x1 , x] =
16
...................................................................................... Similarly, for the arguments x0 , x1 , x2 , x, f (x) = f (x0 ) + (x − x0 )f [x0 , x1 ] + (x − x0 )(x − x1 )f [x0 , x1 , x2 ] + (x − x0 )(x − x1 )(x − x2 )f [x0 , x1 , x2 , x] In this way, for the arguments x0 , x1 , x2 , . . . , xn and x, we get f (x) = f (x0 ) + (x − x0 )f [x0 , x1 ] + (x − x0 )(x − x1 )f [x0 , x1 , x2 ] + (x − x0 )(x − x1 )(x − x2 )f [x0 , x1 , x2 , x3 ] + · · · + (x − x0 )(x − x1 ) · · · (x − xn )f [x0 , x1 , x2 , . . . , xn , x].
(2.17)
This formula is known as Newton’s fundamental or Newton’s general interpolation formula. The last term (x − x0 )(x − x1 ) · · · (x − xn )f [x0 , x1 , x2 , . . . , xn , x] is the error term. The divided differences of different order are shown in Table 2.1. x x0 x0 − x1 x0 − x2 x0 − x3
x1 x1 − x2
x1 − x3
x2 x2 − x3 x3
.. . .. . .. . .. . .. . .. . .. . .. .
f (x)
First
Second
Third
f (x0 ) f [x0 , x1 ] f (x1 )
f [x0 , x1 , x2 ] f [x1 , x2 ]
f (x2 )
f [x0 , x1 , x2 , x3 ] f [x1 , x2 , x3 ]
f [x2 , x3 ] f (x3 )
Table 2.1: Divided difference table.
Error term The error term is E(x) = (x − x0 )(x − x1 ) · · · (x − xn )f [x0 , x1 , . . . , xn , x] f n+1 (ξ) = (x − x0 )(x − x1 ) · · · (x − xn ) , [using (2.10)] (n + 1)!
(2.18) 17
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Newton’s Interpolation Methods where min{x0 , x1 , . . . , xn , x} < ξ < max{x0 , x1 , . . . , xn , x}. It is very interesting that the Newton’s fundamental interpolation formula gives both the interpolating polynomial as well as error term simultaneously. Example 2.5 Using Newton’s divided difference formula, find the value of f (1.2) from the following table: x
:
0
2
5
7
11
f (x)
:
2.153
3.875
4.279
4.891
5.256
Solution. The divided difference table is x
f (x)
0
2.153
–2 –5 –7 –11
3.875
–3
–9
3rd
4th
–0.1453 0.1346
5
4.279
–2 –6
2nd
0.8610 2
–5
1st
0.0257 0.0343
0.3060 7
4.891
–4
–0.0030 –0.0078
–0.0358 0.0912
11
5.256
Here x = 1.2, x0 = 0, x1 = 2, x2 = 5, x3 = 7, x4 = 11, f [x0 ] = 2.153, f [x0 , x1 ] = 0.8610, f [x0 , x1 , x2 ] = −0.1453, f [x0 , x1 , x2 , x3 ] = 0.0257, f [x0 , x1 , x2 , x3 , x4 ] = −0.0030. The Newton’s divided difference formula is f (1.2) = f [x0 ] + (x − x0 )f [x0 , x1 ] + (x − x0 )(x − x1 )f [x0 , x1 , x2 ] +(x − x0 )(x − x1 )(x − x2 )f [x0 , x1 , x2 , x3 ] +(x − x0 )(x − x1 )(x − x2 )(x − x3 )f [x0 , x1 , x2 , x3 , x4 ] = 2.153 + 1.0332 + 0.1399 + 0.0938 + 0.0635 = 3.4834. Hence, f (1.2) = 3.4834. 18
......................................................................................
2.5 Deductions of other interpolation formulae from Newton’s divided difference formula Theoretically, Newton’s divided difference formula is very powerful because from this formula several interpolation formulae can be derived.
2.6 Newton’s forward difference interpolation formula Let the arguments be equispaced, i.e. xi = x0 + ih, i = 0, 1, . . . , n. Then the kth order divided difference is (from (2.16)) f [x0 , x1 , . . . , xk ] =
∆k f (x0 ) , k! hk
for k = 1, 2, . . . , n. Using this assumption the formula (2.17) reduces to ∆f (x0 ) ∆2 f (x0 ) + (x − x0 )(x − x1 ) 1!h 2!h2 3 ∆ f (x0 ) + (x − x0 )(x − x1 )(x − x2 ) + ··· 3!h3 ∆n f (x0 ) ∆n+1 f (ξ) + (x − x0 )(x − x1 ) · · · (x − xn−1 ) + (x − x )(x − x ) · · · (x − x ) . 0 1 n n!hn (n + 1)! hn+1
φ(x) = f (x0 ) + (x − x0 )
The last term is the error term. To convert it to usual form of Newton’s forward difference formula, let u = i.e. x = x0 + uh. Since xi = x0 + ih, i = 0, 1, 2, . . . , n.
x − x0 , h
Therefore, x − xi = (u − i)h. Then u(u − 1) 2 ∆ f (x0 ) + · · · 2! u(u − 1)(u − 2) · · · (u − n + 1) n + ∆ f (x0 ) n! f n+1 (ξ) + u(u − 1)(u − 2) · · · (u − n) . (n + 1)!
φ(x) = f (x0 ) + u∆f (x0 ) +
(2.19)
The value of ξ lies between min{x, x0 , x1 , . . . , xn } and max{x, x0 , x1 , . . . , xn }. 19
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Newton’s Interpolation Methods This is the well known Newton’s forward difference interpolation formula including error term.
2.7 Newton’s backward difference interpolation formula In this case also, we assumed that xi = x0 + ih, i = 0, 1, . . . , n. Let φ(x) = f (xn ) + (x − xn )f [xn , xn−1 ] + (x − xn )(x − xn−1 )f [xn , xn−1 , xn−2 ] + · · · + (x − xn )(x − xn−1 ) · · · (x − x1 )f [xn , xn−1 , . . . , x1 , x0 ] +E(x),
(2.20)
where E(x) = (x − xn )(x − xn−1 ) · · · (x − x1 )(x − x0 )f [x, xn , xn−1 , . . . , x1 , x0 ]. From the arguments xn , xn−1 , . . . , xn−k , the relation (2.16) becomes f [xn , xn−1 , . . . , xn−k ] =
∆k f (xn−k ) . k!hk
Again, ∆k f (xn−k ) ∇k f (xn ) = . k!hk k!hk Thus, f [xn , xn−1 , . . . , xn−k ] =
∇k f (xn ) . k!hk
With this notation (2.20) reduces to ∇2 f (xn ) ∇f (xn ) + (x − xn )(x − xn−1 ) + ··· 1!h 2!h2 ∇n f (xn ) + (x − xn )(x − xn−1 ) · · · (x − x1 )(x − x0 ) + E(x), n!hn
φ(x) = f (xn ) + (x − xn )
where E(x) = (x − xn )(x − xn−1 ) · · · (x − x1 )(x − x0 ) where min{x, x0 , x1 , . . . , xn } < ξ < max{x, x0 , x1 , . . . , xn }. 20
∇n+1 f (ξ) , (n + 1)!hn+1
...................................................................................... This is the Newton’s backward difference interpolation formula with error term E(x).
2.8 Lagrange’s interpolation formula Let x0 , x1 , . . . , xn be the (n + 1) arguments, not necessarily equispaced. For these arguments the n order divided difference is f [x0 , x1 , . . . , xn ] =
n X i=0
f (xi ) . (xi − x0 )(xi − x1 ) · · · (xi − xi−1 )(xi − xi+1 ) · · · (xi − xn )
Similarly, for the (n + 2) arguments x, x0 , . . . , xn , the (n + 1) order divided difference is f [x, x0 , x1 , . . . , xn ] = +
f (x) (x − x0 )(x − x1 ) · · · (x − xn ) n X f (xi ) i=0
(xi − x0 ) · · · (xi − xi−1 )(xi − xi+1 ) · · · (xi − xn ) n
f (x) X f (xi ) = + , w(x) (xi − x)w0 (xi ) i=0
where w(x) = (x − x0 )(x − x1 ) · · · (x − xn ). Thus, f (x) =
n X i=0
=
n X i=0
w(x)f (xi ) + w(x)f [x, x0 , x1 , . . . , xn ] (x − xi )w0 (xi ) Li (x)f (xi ) + w(x)
f n+1 (ξ) [using (2.16)] (n + 1)!
where min{x0 , x1 , . . . , xn , x} < ξ < max{x0 , x1 , . . . , xn , x} and w(x) Li (x) = , the ith Lagrange’s function. (x − xi )w0 (xi ) This is the Lagrange’s interpolation formula for unequal spacing with error term. Note that both Lagrange’s and Newton’s divided difference interpolation formulae are applicable for unequal spaced arguments. Also, they are equivalent proved in the next section.
21
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Newton’s Interpolation Methods
2.9 Equivalence of Lagrange’s and Newton’s divided difference formulae Let (xi , yi ), i = 0, 1, . . . , n be the given set of points. The arguments xi ’s are not necessarily equispaced. The Lagrange’s interpolation polynomial for these points is
φ(x) =
n X
Li (x)yi ,
(2.21)
i=0
where Li (x) =
(x − x0 )(x − x1 ) · · · (x − xi−1 )(x − xi+1 ) · · · (x − xn ) . (2.22) (xi − x0 )(xi − x1 ) · · · (xi − xi−1 )(xi − xi+1 ) · · · (xi − xn )
Again the Newton’s divided difference interpolation formula for these points is φ(x) = f (x0 ) + (x − x0 )f [x0 , x1 ] + (x − x0 )(x − x1 )f [x0 , x1 , x2 ] + · · · +(x − x0 )(x − x1 ) · · · (x − xn−1 )f [x0 , x1 , · · · , xn ] f (x1 ) f (x0 ) + = f (x0 ) + (x − x0 ) x0 − x1 x1 − x0 +(x − x0 )(x − x1 ) × f (x1 ) f (x2 ) f (x0 ) + + (x0 − x1 )(x0 − x2 ) (x1 − x0 )(x1 − x2 ) (x2 − x0 )(x2 − x1 ) + · · · + (x − x0 )(x − x1 ) · · · (x − xn−1 ) × f (x0 ) f (xn ) + ··· + . (2.23) (x0 − x1 ) · · · (x0 − xn ) (xn − x0 ) · · · (xn − xn−1 ) The coefficient of f (x0 ) in the above expression is x − x0 (x − x0 )(x − x1 ) (x − x0 )(x − x1 ) · · · (x − xn−1 ) + + ··· + x0 − x1 (x0 − x1 )(x0 − x2 ) (x0 − x1 )(x0 − x2 ) · · · (x0 − xn ) (x − x1 ) x − x0 (x − x0 )(x − x2 ) = 1+ + + (x0 − x1 ) x0 − x2 (x0 − x2 )(x0 − x3 ) (x − x0 )(x − x2 ) · · · (x − xn−1 ) +··· + (x0 − x2 )(x0 − x3 ) · · · (x0 − xn ) 1+
22
......................................................................................
(x − x1 ) x − x2 (x − x0 )(x − x2 ) = + + (x0 − x1 ) x0 − x2 (x0 − x2 )(x0 − x3 ) (x − x0 )(x − x2 ) · · · (x − xn−1 ) +··· + (x0 − x2 )(x0 − x3 ) · · · (x0 − xn ) (x − x1 )(x − x2 ) x − x0 (x − x0 )(x − x3 ) · · · (x − xn−1 ) = + ··· + 1+ (x0 − x1 )(x0 − x2 ) x0 − x3 (x0 − x3 )(x0 − x4 ) · · · (x0 − xn ) (x − x1 )(x − x2 ) x − x3 (x − x0 )(x − x3 ) · · · (x − xn−1 ) = + ··· + (x0 − x1 )(x0 − x2 ) x0 − x2 (x0 − x3 )(x0 − x4 ) · · · (x0 − xn ) = ··························· x − x0 (x − x1 )(x − x2 ) · · · (x − xn−1 ) 1+ = (x0 − x1 )(x0 − x2 ) · · · (x0 − xn−1 ) x0 − xn (x − x1 )(x − x2 ) · · · (x − xn−1 )(x − xn ) = (x0 − x1 )(x0 − x2 ) · · · (x0 − xn−1 )(x0 − xn ) = L0 (x). By similarly process, it can be shown that the coefficient of f (xi ) is Li (x) for i = 2, 3, . . . , n. Hence, (2.23) becomes φ(x) = L0 (x)f (x0 ) + L1 (x)f (x1 ) + · · · + Ln (x)f (xn ) =
n X
Li (x)f (xi ) =
i=1
n X
Li (x)yi .
i=1
Thus the Lagrange’s and the Newton’s divided difference interpolation formulae are equivalent. Example 2.6 For the following table of data, find the interpolating polynomial using (i) Lagrange’s formula and (ii) Newton’s divided difference formula. x
:
0
3
8
10
y
:
12
21
1
0
Also, compare the results. Solution. (i) The Lagrange’s interpolation polynomial is φ(x) =
(x − 3)(x − 8)(x − 10) (x − 0)(x − 8)(x − 10) × 12 + ×8 (0 − 3)(0 − 8)(0 − 10) (3 − 0)(3 − 8)(3 − 10) (x − 0)(x − 3)(x − 8) (x − 0)(x − 3)(x − 10) + ×1+ ×0 (8 − 0)(8 − 3)(8 − 10) (10 − 0)(10 − 3)(10 − 8) 23
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Newton’s Interpolation Methods
x3 − 21x2 + 134x − 240 x3 − 18x2 + 80x × 12 + × 21 −240 105 x3 − 13x2 + 30x + ×1+0 −80 1 = [11x3 − 191x2 + 714x + 960]. 80
=
(ii) The divided difference table is x
f (x)
1st div.
2nd div.
3rd div.
diff.
diff.
diff.
0
12
3
21
3
8
1
–4
–7/8
10
0
–1/2
1/2
11/80
Newton’s divided difference polynomial is 7 φ(x) = 12 + (x − 0) × 3 + (x − 0)(x − 3) × (− ) 8 11 +(x − 0)(x − 3)(x − 8) × 80 11 3 7 2 = 12 + 3x − (x − 3x) + (x − 11x2 + 24x) 8 80 1 = [11x3 − 191x2 + 714x + 960]. 80 See that both the polynomials are same. It is expected because interpolating polynomial is unique. Note 2.3 From the above calculations it is obvious that the Newton’s divided difference interpolation formula needs less computation than Lagrange’s interpolation formula. So, it is suggested to use Newton’s divided difference interpolation formula when the arguments are unequal spaced.
24
.
Chapter 2 Interpolation
Module No. 3 Central Difference Interpolation Formulae
...................................................................................... There is a limitation on Newton’s forward and Newton’s backward formulae. This formulae are useful only when the unknown point is either in the beginning or in the ending of the table. But, when the point is on the middle of the table then these formulae give more error. So some different methods are required and fortunately developed for central point. These methods are known as central difference formulae. Many central difference methods are available in literature. Among them Gaussian forward and backward, Stirling’s and Bessel’s interpolation formulae are widely used and these formulae are discussed in this module. Like Newton’s formulae, there are two types of Gaussian formulae, viz. forward and backward difference formulae. Again, if the number of points is odd then there is only one middle point, but for even number of points there are two middle points. Thus two formulae are developed for odd number and even number of points.
3.1 Gauss’s forward difference formula There are two types of Gauss’s forward difference formulae are deduced, one for even number of arguments and other for odd number of arguments. 3.1.1
For odd (2n + 1) number of arguments
Let y = f (x) be given at 2n+1 equally spaced points x−n , x−(n−1) , . . . , x−1 , x0 , x1 , . . . , xn−1 , xn . That is, yi = f (xi ), i = 0, ±1, ±2, . . . , ±n are known. Based on these values we construct a polynomial φ(x) of degree at most 2n which pass through the points (xi , yi ), i.e. φ(xi ) = yi , i = 0, ±1, ±2, . . . , ±n,
(3.1)
where xi = x0 + ih, h is the spacing. The form of Gauss’s forward difference interpolation formula is similar to the Newton’s forward difference interpolation formula. Let the function φ(x) be of the form φ(x) = a0 + a1 (x − x0 ) + a2 (x − x0 )(x − x1 ) + a3 (x − x−1 )(x − x0 )(x − x1 ) +a4 (x − x−1 )(x − x0 )(x − x1 )(x − x2 ) + · · · +a2n−1 (x − x−n+1 )(x − x−n+2 ) · · · (x − x−1 )(x − x0 ) · · · (x − xn−1 ) +a2n (x − x−n+1 )(x − x−n+2 ) · · · (x − x−1 )(x − x0 ) · · · (x − xn ).
(3.2) 1
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Central Difference Interpolation Formulae The coefficients ai ’s are unknown and their values are to be determined by substituting x = x0 , x1 , x−1 , x2 , x−2 , . . . , xn , x−n to (3.2). Note that the appearance of the arguments to (3.2) follow the order x0 , x1 , x−1 , x2 , x−2 , . . . , xn , x−n and the same order is maintained to calculate the values of ai ’s. By successive substitution of the values of x, we get the following equations. y0 = a0 y1 = a0 + a1 (x1 − x0 ) i.e., y1 = y0 + a1 h, ∆y0 y1 − y0 = i.e., a1 = . x1 − x0 h Again, y−1 = y0 + a1 (−h) + a2 (−h)(−2h) ∆y0 + a2 h2 · 2! = y0 − h h y−1 − 2y0 + y1 ∆2 y−1 i.e., a2 = = . 2! h2 2! h2 y2 = a0 + a1 (x2 − x0 ) + a2 (x2 − x0 )(x2 − x1 ) +a3 (x2 − x−1 )(x2 − x0 )(x2 − x1 ) y1 − y0 y−1 − 2y0 + y1 = y0 + (2h) + (2h)(h) + a3 (3h)(2h)(h) h 2!h2 y2 − 3y1 + 3y0 − y−1 ∆3 y−1 or, a3 = = . 3!h3 3!h3 In this manner, the values of the remaining ai ’s can be determined. That is, a4 =
∆2n−1 y−(n−1) ∆4 y−2 ∆5 y−2 ∆2n y−n , a = , . . . , a = , a = . 5 2n−1 2n 4!h4 5!h5 (2n − 1)!h2n−1 (2n)!h2n
Therefore, the Gauss’s forward difference interpolation formula is given by ∆y0 ∆2 y−1 + (x − x0 )(x − x1 ) h 2!h2 3 ∆ y−1 + ··· +(x − x−1 )(x − x0 )(x − x1 ) 3!h3 ∆2n−1 y−(n−1) +(x − x−(n+1) ) · · · (x − xn−1 ) (2n − 1)!h2n−1 ∆2n y−n +(x − x−(n+1) ) · · · (x − xn−1 )(x − xn ) . (2n)!h2n
φ(x) = y0 + (x − x0 )
(3.3)
In this formula, we assumed that the arguments are in equispaced, i.e. x± i = x0 ± ih, i = 0, 1, 2, . . . , n. So, a new variable s is introduced, where x = x0 + sh. 2
...................................................................................... Thus, x − x±i = (s ∓ i)h. By these substitution, x − x0 = sh, x − x1 = (s − 1)h, x − x−1 = (s + 1)h, x − x2 = (s − 2)h, x − x−2 = (s + 2)h and so on. Using these values φ(x) reduces to ∆3 y−1 ∆y0 ∆2 y−1 + (s + 1)hsh(s − 1)h + ··· + sh(s − 1)h h 2!h2 3!h3 ∆2n−1 y−(n−1) +(s + n − 1)h · · · sh(s − 1)h · · · (s − n − 1)h (2n − 1)!h2n−1 ∆2n y−n +(s + n − 1)h · · · sh(s − 1)h · · · (s − n − 1)h(s − n)h (2n)!h2n ∆2 y−1 ∆3 y−1 = y0 + s∆y0 + s(s − 1) + (s + 1)s(s − 1) + ··· 2! 3! ∆2n−1 y−(n−1) +(s + n − 1) · · · s(s − 1) · · · (s − n − 1) (2n − 1)! ∆2n y−n +(s + n − 1) · · · s(s − 1) · · · (s − n − 1)(s − n) (2n)! 2 3 ∆ y−1 ∆ y−1 = y0 + s∆y0 + s(s − 1) + s(s2 − 12 ) 2! 3! 4y ∆ −2 + ··· +s(s2 − 12 )(s − 2) 4! ∆2n−1 y−(n−1) 2 2 +s(s2 − n − 1 )(s2 − n − 2 ) · · · (s2 − 12 ) (2n − 1)! ∆2n y−n 2 2 . +s(s2 − n − 1 )(s2 − n − 2 ) · · · (s2 − 12 )(s − n) (2n)!
φ(x) = y0 + sh
(3.4)
The formula (3.3) or (3.4) is known as Gauss’s forward central difference interpolation formula or the first interpolation formula of Gauss. 3.1.2
For even (2n) number of arguments
In this case, there are two points in the middle position. So, the arguments be taken as x0 , x±1 , . . . , x±(n−1) and xn . By using the previous process, the Gauss’s forward interpolation formula for even 3
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Central Difference Interpolation Formulae number of arguments becomes: ∆2 y−1 ∆3 y−1 + (s + 1)s(s − 1) 2! 3! ∆4 y−2 +(s + 1)s(s − 1)(s − 2) 4! ∆5 y−2 +(s + 2)(s + 1)s(s − 1)(s − 2) + ··· 5! ∆2n−1 y−(n−1) +(s + n − 1) · · · s · · · (s − n − 1) (2n − 1)! 2 ∆ y−1 ∆3 y−1 = y0 + s∆y0 + s(s − 1) + s(s2 − 12 ) 2! 3! 4y ∆5 y−2 ∆ −2 + (s2 − 22 )(s2 − 12 )s + ··· +s(s2 − 12 )(s − 2) 4! 5! ∆2n−1 y−(n−1) 2 +(s2 − n − 1 ) · · · (s2 − 12 )s . (2n − 1)!
φ(x) = y0 + s∆y0 + s(s − 1)
3.1.3
(3.5)
Error in Gauss’s forward central difference formula
From general expression of error term (discussed in Module number 1 of Chapter 2), we have for 2n + 1 arguments f 2n+1 (ξ) (2n + 1)! = (s + n)(s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1)(s − n) f 2n+1 (ξ) × h2n+1 (2n + 1)! 2n+1 f (ξ) = s(s2 − 12 ) · · · (s2 − n2 ).h2n+1 (3.6) (2n + 1)!
E(x) = (x − x−n )(x − x−(n−1) ) · · · (x − x−1 )(x − x0 ) · · · (x − xn )
where x = x0 + sh and ξ lies between min{x−n , x−(n−1) , . . . , x0 , x1 , . . . , xn−1 , xn } and max{x−n , x−(n−1) , . . . , x0 , x1 , . . . , xn−1 , xn }. For 2n arguments, the error term is E(x) = (s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1)(s − n)h2n 2
= s(s2 − 12 ) · · · (s2 − n − 1 )(s − n).h2n 4
f 2n (ξ) , (2n)!
f 2n (ξ) (2n)! (3.7)
...................................................................................... where, min{x−n , x−(n−1) , . . . , x0 , x1 , . . . , xn−1 , xn } < ξ < max{x−n , x−(n−1) , . . . , x0 , x1 , . . . , xn−1 , xn }.
3.2 Gauss’s backward difference formula Like previous case, there are two formulae for Gauss’s backward difference interpolation one for odd number of arguments and other for even number number of arguments. 3.2.1
For odd (2n + 1) number of arguments
Assumed that the function y = f (x) be known for 2n + 1 equispaced arguments x±i , i = 0, 1, 2, . . . , n, where x±i = x0 ± ih, i = 0, 1, 2, . . . , n. Let y±i = f (x±i ), i = 0, 1, 2, . . . , n. Suppose φ(x) be the approximate polynomial which passes through the 2n points x±i = x0 ± ih, i = 0, 1, 2, . . . , n and the degree of it is at most 2n. That is, φ(x±i ) = y±i , i = 0, 1, . . . , n.
(3.8)
Let the polynomial φ(x) be of the following form. φ(x) = a0 + a1 (x − x0 ) + a2 (x − x−1 )(x − x0 ) + a3 (x − x−1 )(x − x0 )(x − x1 ) +a4 (x − x−2 )(x − x−1 )(x − x0 )(x − x1 ) +a5 (x − x−2 )(x − x−1 )(x − x0 )(x − x1 )(x − x2 ) + · · · +a2n−1 (x − x−(n−1) ) · · · (x − x−1 )(x − x0 ) · · · (x − xn−1 ) +a2n (x − x−n )(x − x−(n−1) ) · · · (x − x−1 )(x − x0 ) · · · (x − xn−1 ),
(3.9)
where ai ’s are unknown constants and their values are to be determined by using the relations (3.8). Also, xi = x0 +ih, x−j = x0 −jh for i, j = 0, 1, 2, . . . , n. Therefore, xi −x−j = (i+j)h and (x−i − xj ) = −(i + j)h. To find the values of ai ’ we substitute x = x0 , x−1 , x1 , x−2 , x2 , . . . , x−n , xn to (3.9) in succession. 5
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Central Difference Interpolation Formulae The values of ai ’s are given by y0 = a 0 φ(x−1 ) = a0 + a1 (x−1 − x0 ) i.e., y−1 = y0 + a1 (−h), y0 − y−1 ∆y−1 i.e., a1 = = h h φ(x1 ) = a0 + a1 (x1 − x0 ) + a2 (x1 − x−1 )(x1 − x0 ) ∆y−1 y1 = y0 + h. + a2 (2h)(h) h y1 − y0 − (y0 − y−1 ) ∆2 y−1 i.e., a2 = = 2!h2 2!h2
φ(x−2 ) = a0 + a1 (x−2 − x0 ) + a2 (x−2 − x−1 )(x−2 − x0 ) +a3 (x−2 − x−1 )(x−2 − x0 )(x−2 − x1 ) ∆y−1 ∆2 y−1 i.e., y−2 = y0 + (−2h) + (−h)(−2h) + a3 (−h)(−2h)(−3h) h 2!h2 = y0 − 2(y0 − y−1 ) + (y1 − 2y0 + y−1 ) + a3 (−1)3 (3!)h3 y1 − 3y0 + 3y−1 − y−2 ∆3 y−2 or, a3 = = . 3!h3 3!h3 The other values can be obtained in similar way. a4 =
∆4 y−2 ∆5 y−3 ∆2n−1 y−n ∆2n y−n , a = , . . . , a = , a = . 5 2n−1 2n 4!h4 5!h5 (2n − 1)!h2n−1 (2n)!h2n
Using the values of ai ’s, equation (3.9) reduces to ∆y−1 ∆2 y−1 + (x − x−1 )(x − x0 ) 1!h 2!h2 3 ∆ y−2 +(x − x−1 )(x − x0 )(x − x1 ) 3!h3 ∆4 y−2 +(x − x−2 )(x − x−1 )(x − x0 )(x − x1 ) + ··· 4!h4
φ(x) = y0 + (x − x0 )
∆2n−1 y−n (2n − 1)!h2n−1 ∆2n y−n +(x − x−n )(x − x−1 )(x − x0 )(x − x1 ) · · · (x − xn−1 ) . (3.10) (2n)!h2n +(x − x−(n−1) ) · · · (x − x−1 )(x − x0 ) · · · (x − xn−1 )
6
...................................................................................... Like previous case, we introduced a unit less variable s, where x = x0 + sh. The advantage to use such variable is that the formula becomes simple and easy to calculate. Let us consider two identities x − xi x − x0 − ih = = s − i and h h x − x−i x − x0 + ih = = s + i, i = 0, 1, 2, . . . , n. h h Then the above formula becomes (s + 1)s 2 (s + 1)s(s − 1) 3 ∆ y−1 + ∆ y−2 2! 3! (s + 2)(s + 1)s(s − 1) 4 ∆ y−2 + · · · + 4! (s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1) 2n−1 ∆ y−n + (2n − 1)! (s + n)(s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1) 2n + ∆ y−n . (2n)!
φ(x) = y0 + s∆y−1 +
(3.11)
The formula (3.11) is known as Gauss’s backward interpolation formula or second interpolation formula of Gauss for odd number of arguments. 3.2.2
For even (2n) number of arguments
Here the number of middle values is two. So we take the arguments as x0 , x±1 , . . . , x±(n−1) and x−n , where x±i = x0 ± ih, i = 0, 1, . . . , n − 1 and x−n = x0 − nh. Proceeding as in previous case we obtained the Gauss’s backward interpolation formula as (s + 1)s 2 (s + 1)s(s − 1) 3 ∆ y−1 + ∆ y−2 2! 3! (s + 2)(s + 1)s(s − 1) 4 + ∆ y−2 + · · · 4! (s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1) 2n−1 + ∆ y−n , (2n − 1)!
φ(x) = y0 + s∆y−1 +
where s =
(3.12)
x − x0 . h
7
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Central Difference Interpolation Formulae
3.2.3
Error term in Gauss’s backward central difference formula
The error term for the (2n + 1) equispaced arguments is f 2n+1 (ξ) (2n + 1)! = (s + n)(s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1)(s − n) f 2n+1 (ξ) × h2n+1 , (2n + 1)! f 2n+1 (ξ) = s(s2 − 12 )(s2 − 22 ) · · · (s2 − n2 ) × h2n+1 , (2n + 1)!
E(x) = (x − x−n )(x − x−(n−1) ) · · · (x − x−1 )(x − x0 ) · · · (x − xn )
The error term for the 2n equispaced arguments is f 2n (ξ) (2n)! 2n f (ξ) = (s + n)(s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1)h2n , (2n)! f 2n (ξ) 2 = s(s2 − 12 )(s2 − 22 ) · · · (s2 − n − 1 )(s + n)h2n , (2n)!
E(x) = (x − x−n )(x − x−(n−1) ) · · · (x − x−1 )(x − x0 ) · · · (x − xn−1 )
In both the cases ξ lies between min{x−n , x−n , . . . , x0 , x1 , . . . , xn−1 , xn−1 } and max{x−n , x−(n−1) , . . . , x0 , x1 , . . . , xn−1 }.
3.3 Stirling’s interpolation formula The average of Gauss’s forward and backward difference formulae for odd number of equispaced arguments gives Stirling’s interpolation formula. The Stirling’s formula is obtained from the equations (3.4) and (3.11) as φ(x)forward + φ(x)backward 2 s ∆y−1 + ∆y0 s2 2 s(s2 − 12 ) ∆3 y−2 + ∆3 y−1 = y0 + + ∆ y−1 + 1! 2 2! 3! 2 2 2 2 2 2 2 2 5 s (s − 1 ) 4 s(s − 1 )(s − 2 ) ∆ y−3 + ∆5 y−2 + ∆ y−2 + + ··· 4! 5! 2 2 s2 (s2 − 12 )(s2 − 22 ) · · · (s2 − n − 1 ) 2n + ∆ y−n . (2n)!
φ(x) =
8
(3.13)
...................................................................................... The error term of this formula is given by E(x) =
s(s2 − 12 )(s2 − 22 ) · · · (s2 − n2 ) 2n+1 2n+1 h f (ξ), (2n + 1)
(3.14)
where min{x−n , . . . , x0 , . . . , xn } < ξ < max{x−n , . . . , x0 , . . . , xn }. The formula (3.13) is known as Stirling’s central difference interpolation formula. Note 3.1 (a) It may be noted that the Stirling’s interpolation formula is applicable when the argument x, for which f (x) to be calculated, is at the centre of the table and the number of arguments is odd. If the number of arguments in the given tabulated values is even, then discard one end point from the table to make odd number of points such that x0 be in the middle of the table. (b) It is verified that the Stirling’s interpolation formula gives the better approximate result when −0.25 < s < 0.25. So it is suggested that, assign the subscripts to the points in such a way that s =
x−x0 h
satisfies this condition.
3.4 Bessel’s interpolation formula This is another useful central difference interpolation formula obtained from Gauss’s forward and backward interpolation formulae. It is also obtained by taking average of Gauss’s forward and backward interpolation formulae after shifting one step of backward formula. This formula is application for the even number of arguments. Let us consider 2n equispaced arguments x−(n−1) , . . . , x−1 , x0 , x1 , . . . , xn−1 , xn , where x±i = x0 ± ih, h is the spacing. Since the number of points is even, there are two points in the middle position. For the above numbering, the number of arguments to the right of x0 is n and to the left is n − 1. For this representation the Gauss’s backward difference interpolation formula (3.12) is (s + 1)s(s − 1) 3 s(s + 1) 2 ∆ y−1 + ∆ y−2 2! 3! (s + 2)(s + 1)s(s − 1) 4 + ∆ y−2 + · · · 4! (s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1) 2n−1 + ∆ y−n . (2n − 1)!
φ(x) = y0 + s∆y−1 +
(3.15) 9
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Central Difference Interpolation Formulae Now we consider x1 be the middle argument of the data. That is, the number of points to the right of x1 is n − 1 and to the left is n. Therefore, for this assumption, we have to shift the points one step to the right. Then
x − x1 x − (x0 + h) x − x0 = = − 1 = s − 1. h h h Also, the indices of all the differences of (3.15) will be increased by 1. So we replace s by s − 1 and increase the indices of (3.15) by 1, and the Gauss’s backward difference interpolation formula becomes s(s − 1)(s − 2) 3 s(s − 1) 2 ∆ y0 + ∆ y−1 2! 3! (s + 1)s(s − 1)(s − 2) 4 + ∆ y−1 4! (s + 1)s(s − 1)(s − 2)(s − 3) 5 ∆ y−2 + · · · + 5! (s + n − 2) · · · (s + 1)s(s − 1)(s − 2) · · · (s − n) 2n−1 + ∆ y−n+1 . (2n − 1)!
φ1 (x) = y1 + (s − 1)∆y0 +
(3.16)
The average of (3.16) and Gauss’s forward interpolation formula (3.5) gives, φ1 (x) + φ(x)forward 2 y0 + y1 1 s(s − 1) ∆2 y0 + ∆2 y−1 = + s− ∆y0 + 2 2 2! 2 (s − 21 )s(s − 1) 3 s(s − 1)(s + 1)(s − 2) ∆4 y−2 + ∆4 y−1 + ∆ y−1 + 3! 4! 2 1 (s − 2 )s(s − 1)(s + 1)(s − 2) 5 ∆ y−2 + · · · + 5! (s − 12 )s(s − 1)(s + 1) · · · (s + n − 2)(s − n − 1) 2n−1 + ∆ y−(n−1) , (3.17) (2n − 1)!
φ(x) =
where x = x0 + sh. As in previous cases, we introduce the new variable u defined by u = s − By this substitution the above formula becomes φ(x) =
10
x − x0 1 1 = − . 2 h 2
u2 − 14 ∆2 y−1 + ∆2 y0 u(u2 − 14 ) 3 y0 + y1 + u∆y0 + · + ∆ y−1 2 2! 2 3! 2 (u2 − 41 )(u2 − 49 ) · · · (u2 − (2n−3) ) 2n−1 4 + ∆ y−(n−1) . (2n − 1)!
(3.18)
...................................................................................... This formula is known as Bessel’s central difference interpolation formula. Note 3.2 (a) It can be shown that the Bessel’s formula gives the best result when u lines between −0.25 and 0.25, i.e. 0.25 < u < 0.75. (b) Bessel’s central difference interpolation formula is used when the number of arguments is even and the interpolating point is near the middle of the table. Example 3.1 Use Bessel central difference interpolation formula to find the values of y at x = 1.55 from the following table x
:
1.0
1.5
2.0
2.5
3.0
y
:
10.2400
12.3452
15.2312
17.5412
19.3499
Solution. The central difference table is given below: i
xi
yi
−2
1.0
10.2400
∆yi
∆ 2 yi
∆3 yi
2.1052 −1
1.5
12.3452
0.7808 2.8860
0
2.0
15.2312
–1.3568 –0.5760
2.3100 1
2.5
17.5412
0.0747 –0.5013
1.8087 2
3.0
19.3499
Here x = 1.55. Let x0 = 2.0. Therefore, s = (1.55 − 2.0)/0.5 = −0.9. By Bessel’s formula y0 + y1 1 s(s − 1) ∆2 y0 + ∆2 y−1 + s− ∆y0 + 2 2 2! 2 1 1 3 + s− s(s − 1)∆ y−1 3! 2 15.2312 + 17.5412 = + (−0.9 − 0.5) × 2.3100 2 −0.9(−0.9 − 1) −0.5760 − 0.5013 + 2! 2 1 + (−0.9 − 0.5)(−0.9)(−0.9 − 1) × 0.0747 6 = 13.5829.
y(1.55) =
11
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Central Difference Interpolation Formulae
Example 3.2 Use Stirling central difference interpolation formula to find the values of y at x = 2.30 from the table of above example. In this case x = 2.30. Let x0 = 2.0. Thus s = (2.30 − 2.0)/0.5 = 0.6. Now, by Stirling’s formula y(x) = y0 + s
∆y−1 + ∆y0 s2 2 s(s2 − 12 ) ∆3 y−2 + ∆3 y−1 + ∆ y−1 + 2 2! 3! 2
2.3100 + 2.8860 (0.6)2 + × (−0.5760) 2 2 0.6(0.36 − 1) −1.3568 + 0.0747 + 6 2 = 15.2312 + 1.5588 − 0.10368 + 0.0410272 ' 16.2773.
y(2.30) = 15.2312 + 0.6
Note 3.3 In Newton’s forward and backward interpolation formulae the first or the last interpolating point is taken as initial point. But, in central difference interpolation formulae, a middle point is taken as the initial point x0 .
12
.
Chapter 2 Interpolation
Module No. 4 Aitken’s and Hermite’s Interpolation Methods
...................................................................................... In this module, two new kind of polynomial interpolation formulae are deduced. The first one is Aitken interpolation formula based on iteration process. Other is Hermite interpolation formula which uses the values of the function and its first order derivative. At first we describe the Aitken iteration interpolation formula.
4.1 Aitken’s interpolation Recall that, in Lagrange’s interpolation formula each term Li (x) is a polynomial of degree n, while in Newton’s formulae the degree of the terms are gradually increase starting from 0 degree. The Aitken iteration formula is similar to Newton’s formula. This formula is also successively generates the higher degree interpolation polynomials. But, the most advantage of this formula is, it can be easily programmed for a computer. Let y = f (x) be a function which may or may not be known explicitly, but a table of values of yi at x = x0 , x1 , . . . , xn are known, where yi = f (xi ), i = 0, 1, . . . , n. The points xi , i = 0, 1, . . . , n need not be equispaced. The Aitken iteration process is described below: At first step, a set of linear polynomials between the points (x0 , y0 ) and (xj , yj ), j = 1, 2, . . . , n is determined. This is the first approximation. In second step, a set of quadratic polynomials is generated for the points (x0 , y0 ), (x1 , y1 ) and (xj , yj ), j = 2, 3, . . . , n. The second approximation be performed with the help of first approximation to reduce the computational effort. This process is repeated for n times. Let the linear polynomial for the points (x0 , y0 ) and (x1 , y1 ) be denoted by p01 (x) and is given by x − x0 1 x − x1 y0 + y1 = [(x1 − x)y0 − (x0 − x)y1 ] x0 − x1 x1 − x0 x1 − x0 y x − x 1 0 0 = . x1 − x0 y1 x1 − x
p01 (x) =
Similarly, for the arguments x0 and xj the linear polynomial is y x − x 1 0 0 p0j (x) = , j = 1, 2, . . . , n. xj − x0 yj xj − x
(4.1)
(4.2)
That is, each of these polynomials passes through the points (x0 , y0 ) and (xj , yj ), j = 1, 2, . . . , n. 1
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Aitken’s and Hermite’s Interpolation Methods In similar way, we can determine the quadratic polynomial passing through the points (x0 , y0 ), (x1 , y1 ), (xj , yj ). This polynomial be denoted by p01j (x) and it is obtained as p (x) x − x 1 1 01 p01j (x) = (4.3) , j = 2, 3, . . . , n. xj − x1 p0j (x) xj − x In general, for the (k+2) points (x0 , y0 ), (x1 , y1 ), . . . , (xk , yk ) and (xj , yj ), the (k+1)th degree interpolating polynomial is p xk − x 1 012···k (x) p012···kj (x) = , for j = k + 1, . . . , n. xj − xk p012···(k−1)j (x) xj − x
(4.4)
The entire calculation can be represented as a tabular form shown below. xj yj p0j p01j p012j p0123j xj − x x0 − x
x0
y0
x1
y1
p01
x2
y2
p02
p012
x3
y3
p03
p013
p0123
x4
y4
p04
p014
p0124
p01234
x4 − x
···
···
···
···
···
···
···
xn
yn
p0n
p01n
p012n
p0123n
xn − x
x1 − x x2 − x x3 − x
Example 4.1 Consider the following table of values and calculate the value of y(1.5) using Aitken interpolation method. x
:
0
1
2
3
y(x)
:
21.4
27.5
32.6
40.3
Solution. In this case, x = 1.5. The following table can be formed easily.
2
p0j
p01j
p012j
xj − x
xj
yj
0
21.4
−1.5
1
27.5
−0.5
2
32.6
0.5
3
40.3
1.5
...................................................................................... Now, we calculate the first approximation. The formula is y x −x 1 0 0 j = 1, 2, 3. p0j = , xj − x0 yj xj − x Therefore, p01 p02 p03
1 21.4 = 1 27.5 1 21.4 = 2 32.6 1 21.4 = 3 40.3
−1.5 = 30.55. −0.5 −1.5 = 29.80. 0.5 −1.5 = 30.85. 1.5
After first approximation, the updated table is p0j
p01j
p012j
xj − x
xj
yj
0
21.4
1
27.5
30.55
−0.5
2
32.6
29.80
0.5
3
40.3
30.85
1.5
−1.5
The second approximations are determined as follows.
p01j p012 p013
p x −x 1 01 1 = j = 2, 3. , xj − x1 p0j xj − x 1 30.55 −0.5 = = 30.175. 1 29.80 0.5 1 30.55 −0.5 = = 30.625. 2 30.85 1.5
After second approximation, the updated table is p0j
p01j
p012j
xj − x
xj
yj
0
21.4
1
27.5
30.55
2
32.6
29.80
30.175
0.5
3
40.3
30.85
30.625
1.5
−1.5 −0.5
3
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Aitken’s and Hermite’s Interpolation Methods The third approximation is given below. p x − x 1 012 2 p012j = , xj − x2 p01j xj − x 1 30.175 0.5 p0123 = = 29.95. 1 30.625 1.5
j = 3.
The final table is xj
yj
p0j
p01j
0
21.4
1
27.5
30.55
2
32.6
29.80
30.175
3
40.3
30.85
30.625
p012j
xj − x −1.5 −0.5 0.5
29.95
1.5
Hence, the value of y at x = 1.5 is 29.95. Example 4.2 Use Aitken formula to determine a quadratic polynomial from the following data x
:
1
2
3
y(x)
:
7
5
2
Solution. The first approximations are calculated below. y x −x 1 0 0 p0j = j = 1, 2. , xj − x0 yj xj − x Now, p01 p02
1 7 = 1 5 1 7 = 2 2
1 − x = 9 − 2x. 2−x 1 − x 1 = (19 − 2x). 3−x 2
The second approximation is p01j p012 4
p x −x 1 01 1 = j = 2. , xj − x1 p0j xj − x 1 9 − 2x 2 − x = 1 = 8 − 0.5x − 0.5x2 . 1 2 (19 − 5x) 3 − x
...................................................................................... Hence, the required quadratic polynomial is 8 − 0.5x − 0.5x2 .
4.2 Hermite’s interpolation formula Now, we discuss another type of interpolation method by considering first order derivatives and it is known as Hermite’s interpolation. In the previous interpolation formulae, a polynomial of degree n is constructed based on given (n+1) points. Suppose that the values of the function y = f (x) and its first order derivative are given at (n + 1) points. Then an interpolating polynomial φ(x) of degree (2n + 1) can be obtained. This polynomial must satisfies the following (2n + 2) conditions. φ(xi ) = f (xi ) φ0 (xi ) = f 0 (xi ), i = 0, 1, 2, . . . , n.
(4.5)
This new type of interpolating polynomial is known as Hermite’s interpolation formula. Note that, in this formula, the number of conditions is (2n + 2), the number of coefficients of the polynomial to be determined is (2n + 2) and the degree of the polynomial is (2n + 1). Let the Hermite’s interpolating polynomial be the following form φ(x) =
n X
hi (x)f (xi ) +
i=0
n X
Hi (x)f 0 (xi ),
(4.6)
i=0
where hi (x) and Hi (x), i = 0, 1, 2, . . . , n, are polynomials in x of degree at most (2n+1). By the conditions (4.5), we obtained ( hi (xj ) =
1, if i = j 0, if i 6= j
Hi (xj ) = 0, for all i h0i (xj ) = 0, for all i; ( 1, if i = j Hi0 (xj ) = 0, if i = 6 j
(4.7) (4.8) (4.9) (4.10)
The Lagrangian function Li (x) is Li (x) =
(x − x0 )(x − x1 ) · · · (x − xi−1 )(x − xi+1 ) · · · (x − xn ) , (xi − x0 )(xi − x1 ) · · · (xi − xi−1 )(xi − xi+1 ) · · · (xi − xn ) i = 0, 1, 2, . . . , n.
(4.11) 5
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Aitken’s and Hermite’s Interpolation Methods Obviously,
( Li (xj ) =
1, if i = j 0, if i 6= j.
(4.12)
The above conditions are similar to the conditions of Lagrange’s interpolation. So, we may express hi (x) and Hi (x) in terms of Lagrangian functions as follows: hi (x) = (ai x + bi )[Li (x)]2 Hi (x) = (ci x + di )[Li (x)]2
(4.13)
Note that each hi (x) or Hi (x) is a polynomial of degree 2n + 1. Using the conditions (4.7)-(4.10), we get ai xi + bi
=1
ci xi + di
=0
ai +
2L0i (xi )
ci
=0
(4.14)
=1
Thus we have ai = −2L0i (xi ), ci = 1
bi = 1 + 2xi L0i (xi ) and
di = −xi
(4.15)
Therefore, hi (x) and Hi (x) are obtained in terms of Li (x) and L0i (x) as follows. hi (x) = [−2xL0i (xi ) + 1 + 2xi L0i (xi )][Li (xi )]2 = [1 − 2(x − xi )L0i (xi )][Li (x)]2 and Hi (x) = (x − xi )[Li
(4.16)
(x)]2 .
Hence the Hermite interpolating polynomial is φ(x) =
n X
[1 − 2(x − xi )L0i (xi )][Li (x)]2 f (xi )
i=0
+
n X i=0
6
(x − xi )[Li (x)]2 f 0 (xi ).
(4.17)
......................................................................................
Example 4.3 Find the fifth degree Hermite’s interpolating polynomial which satisfies the following data and hence find an approximate value of y when x = 1.5. x : 1 2 3 y
:
1
8
27
y0
:
3
12
27
Solution. (x − x1 )(x − x2 ) (x − 2)(x − 3) 1 = = (x2 − 5x + 6) (x0 − x1 )(x0 − x2 ) (1 − 2)(1 − 3) 2 (x − x0 )(x − x2 ) (x − 1)(x − 3) L1 (x) = = = −(x2 − 4x + 3) (x1 − x0 )(x1 − x2 ) (2 − 1)(2 − 3) (x − 1)(x − 2) 1 (x − x0 )(x − x1 ) = = (x2 − 3x + 2) L2 (x) = (x2 − x0 )(x2 − x1 ) (3 − 1)(3 − 2) 2 L0 (x) =
Thus 1 (2x − 5), 2 1 L02 (x) = (2x − 3). 2 L00 (x) =
L01 (x) = −(2x − 4),
Now, L00 (x0 ) = −3/2,
L01 (x1 ) = 0,
L02 (x2 ) = 3/2. 1 h0 (x) = [1 − 2(x − x0 )L00 (x0 )][L0 (x)]2 = [1 + (x − 1)(3)][ (x2 − 5x + 6)]2 4 1 2 2 = (3x − 2)(x − 5x + 6) 4 h1 (x) = [1 − 2(x − x1 )L01 (x1 )][L1 (x)]2 = (x2 − 4x + 3)2 h2 (x) = [1 − 2(x − x2 )L02 (x2 )][L2 (x)]2 1 = (10 − 3x)(x2 − 3x + 2)2 . 4 1 H0 (x) = (x − x0 )[L0 (x)]2 = (x − 1)(x2 − 5x + 6)2 4 H1 (x) = (x − x1 )[L1 (x)]2 = (x − 2)(x2 − 4x + 3)2 1 H2 (x) = (x − x2 )[L2 (x)]2 = (x − 3)(x2 − 3x + 2)2 . 4 7
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Aitken’s and Hermite’s Interpolation Methods Hence, the required Hermite polynomial is 1 (3x − 2)(x2 − 5x + 6)2 (1) + (x2 − 4x + 3)2 .(8) 4 1 + (10 − 3x)(x2 − 3x + 2)2 .(27) 4 1 + (x − 1)(x2 − 5x + 6)2 .(3) 4 +(x − 2)(x2 − 4x + 3)2 .(12) 1 + (x − 3)(x2 − 3x + 2)2 .(27) 4 1 = (6x − 5)(x2 − 5x + 6)2 + (12x − 16)(x2 − 4x + 3)2 4 27 + (7 − 2x)(x2 − 3x + 2)2 . 4
φ(x) =
The approximate value of y at x = 1.5 is ( 14
8
× 0.5625 × 4) + (0.5625 × 2) + ( 27 4 × 0.0625 × 4) = 3.375.
.
Chapter 2 Interpolation
Module No. 5 Spline Interpolation
...................................................................................... The Lagrange’s and Newton’s interpolation formulae for (n + 1) points (xi , yi ), i = 0, 1, . . . , n give a polynomial of degree n. It is well known that the evaluation of a polynomial of higher degree is a very complicated task. The Horner method is used to evaluated a polynomial of degree n and it needs n additions and n multiplications. So it is a laborious and time consuming for hand calculation and also for computer. But, in computer graphics, image processing, etc. interpolation is frequently used. So a new interpolation method called spline is developed to avoid this limitation. Spline interpolation is a very powerful and widely used method and has many applications in numerical differentiation, integration, solution of boundary value problems, two and three - dimensional graph plotting, etc. In spline interpolation a function is interpolated between a given set of points by means of piecewise smooth polynomials. In this interpolation, the curve passes through the given set of points and also its slope and its curvature are continuous at each point. The splines with different degree are found in literature, among them cubic splines are widely used.
5.1 Piecewise linear interpolation Let (xi , yi ), i = 0, 1, . . . , n, where y = f (x) be a given set of points. In each interval one can construct a polynomial of degree one since an interval contains two end points. Let pi (x) be the linear polynomial on the ith interval [xi , xi+1 ]. Then pi (x) = yi +
yi − yi+1 (x − xi ), i = 0, 1, . . . , n − 1. xi − xi+1
Figure 5.1: Piecewise linear interpolation 1
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Spline Interpolation Now, the polynomial approximation (φ(x)) of the function f (x) is φ(x) = pi (x), xi ≤ x ≤ xi+1 , i = 0, 1, . . . , n − 1. The set of n such polynomials constitute the piecewise linear interpolation, but the resultant curve is not smooth (See Figure 5.1). To construct a smooth piecewise polynomials we need more conditions and it is discussed in next section.
5.2 Cubic spline Let (xi , yi ), i = 0, 1, . . . , n be a set of (n+1) points, where y = f (x). The function f (x) may or may not be known explicitly. A cubic spline is a set of n third degree polynomials and each polynomial is constructed in each interval [xi , xi+1 ], i = 0, 1, . . . , n − 1 under certain conditions. It is defined more precisely below: Let
φ(x) =
p0 (x), p1 (x), ···
x0 ≤ x ≤ x1 x1 ≤ x ≤ x2 ···
(5.1)
pi (x), xi ≤ x ≤ xi+1 ··· ··· pn−1 (x), xn−1 ≤ x ≤ xn
where each pi (x) is a polynomial of degree three. The function φ(x) is called cubic spline in [x0 , xn ] if φ(x) satisfies the following conditions. p0i−1 (xi ) = p0i (xi ), i = 1, 2, . . . , n − 1 (equal slope).
(5.2)
p00i−1 (xi ) = p00i (xi ), i = 1, 2, . . . , n − 1 (equal curvature).
(5.3)
and pi (xi ) = yi ,
pi (xi+1 ) = yi+1 , i = 0, 1, . . . , n − 1.
(5.4)
The continuity on slope and curvature are not defined at the endpoints x0 and xn . The conditions at these points are assigned based on the problems. Let the interval [xi , xi+1 ], i = 0, 1, . . . , n − 1 be denoted by (i + 1)th interval. The length (hi+1 ) of the (i+1)th interval is denoted by hi+1 = xi+1 −xi , i = 0, 1, 2, . . . , n−1. Let the cubic spline on the (i + 1)th interval be φ(x) = pi (x) = ai (x − xi )3 + bi (x − xi )2 + ci (x − xi ) + di , 2
in [xi , xi+1 ],
(5.5)
...................................................................................... where ai , bi , ci and di are unknown and their values are to be determined by imposing the conditions (5.2)-(5.4). Since it passes through the end points xi and xi+1 , therefore, yi = φ(xi ) = di
(5.6)
and yi+1 = ai (xi+1 − xi )3 + bi (xi+1 − xi )2 + ci (xi+1 − xi ) + di = ai h3i+1 + bi h2i+1 + ci hi+1 + di .
(5.7)
Now, differentiate the equation (5.5) twice and get the following equations. φ0 (x) = 3ai (x − xi )2 + 2bi (x − xi ) + ci . 00
and φ (x) = 6ai (x − xi ) + 2bi .
(5.8) (5.9)
00 From equation (5.9), yi00 = φ00 (xi ) = 2bi and yi+1 = φ00 (xi+1 ) = 6ai hi+1 + 2bi .
For simplicity, we denote yi00 by Mi for i = 0, 1, 2, . . . , n. Using this notation, the above equations become Mi = 2bi , Mi+1 = 6ai hi+1 + 2bi . Thus, Mi , 2 Mi+1 − Mi ai = . 6hi+1 bi =
(5.10) (5.11)
Thus, the values of ai and bi are obtained in terms of Mi . Using the values of ai , bi , the equation (5.7) becomes Mi+1 − Mi 3 Mi 2 hi+1 + h + ci hi+1 + yi 6hi+1 2 i+1 yi+1 − yi 2hi+1 Mi + hi+1 Mi+1 i.e. ci = − . hi+1 6 yi+1 =
(5.12)
Thus, we get ci . That is, all the coefficients ai , bi , ci and di for i = 0, 1, . . . , n − 1 of (5.5) are obtained and they are available in terms of n + 1 unknowns M0 , M1 , . . . , Mn . Now the problem is to determine these unknowns. To find them we use the condition of equation (5.2), i.e. p0i−1 (xi ) = p0i (xi ), i = 1, 2, . . . , n − 1. 3
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Spline Interpolation For the polynomial pi (x) p0i (xi ) = 3ai (xi − xi )2 + 2bi (xi − xi ) + ci = ci ,
(5.13)
and for the polynomial pi−1 (x) p0i−1 (xi ) = 3ai−1 (xi − xi−1 )2 + 2bi−1 (xi − xi−1 ) + ci−1 .
(5.14)
Now, from the equations (5.13) and (5.14), we get ci = 3ai−1 h2i + 2bi−1 hi + ci−1 . We substitute the values of ai−1 , bi−1 , ci−1 and ci to the above equation and we get yi+1 − yi 2hi+1 Mi + hi+1 Mi+1 − hi+1 6 Mi − Mi−1 2 Mi yi − yi−1 2hi Mi−1 + hi Mi =3 hi + hi + − . 6hi 2 hi 6 This gives y yi − yi−1 i+1 − yi hi Mi−1 + 2(hi + hi+1 )Mi + hi+1 Mi+1 = 6 − . hi+1 hi
(5.15)
where i = 1, 2, . . . , n − 1.
y yi − yi−1 i+1 − yi . Let Ai = hi , Bi = 2(hi + hi+1 ), Ci = hi+1 and Di = 6 − hi+1 hi Using these notations, the equation (5.15) reduces to Ai Mi−1 + Bi Mi + Ci Mi+1 = Di , i = 1, 2, . . . , n − 1.
(5.16)
Note that, this is a system of tri-diagonal equations contains n−1 equations and n+1 unknowns M0 , M1 , . . . , Mn . This is a system of n − 1 linear equations containing n + 1 unknowns M0 , M1 , . . . , Mn . So two more equations/conditions are needed to solve this system completely. Thus, using two additional equations/conditions one can solve this system. By solving the above system of equations, we get the values of ai , bi , ci and di for all i = 0, 1, . . . , n − 1 completely and hence we get the cubic spline φ(x) on [x0 , xn ]. The followings are the common conditions which are used to solve this system. (i) M0 = Mn = 0. The corresponding spline is called natural spline. 4
...................................................................................... (ii) M0 = Mn , M1 = Mn+1 , y0 = yn , y1 = yn+1 , h1 = hn+1 . The spline for this condition is called periodic spline. (iii) y 0 (x0 ) = y00 , y 0 (xn ) = yn0 , i.e. 6 y1 − y0 − y00 h1 h1 6 0 yn − yn−1 and Mn−1 + 2Mn = yn − . hn hn 2M0 + M1 =
The spline satisfying the above conditions is called non-periodic spline or clamped cubic spline. hn (Mn−1 − Mn−2 ) h1 (M2 − M1 ) and Mn = Mn−1 + . The spline obh2 hn−1 tained by this condition is called extrapolated spline.
(iv) M0 = M1 −
(v) M0 = y000 and Mn = yn00 . If a spline satisfy these conditions, then the spline is called endpoint curvature-adjusted spline. Case I. (Natural spline) In this case M0 = Mn = 0. Then the tri-diagonal system for M1 , M2 , . . . , Mn−1 becomes B1 C1 0 0 · · · 0 0 0 M D 1 1 A2 B 2 C 2 0 · · · 0 0 0 M2 D 2 0 A3 B3 C3 · · · 0 0 0 . = . . . ··· ··· ··· ··· ··· ··· ··· ··· . . Mn−1 Dn−1 0 0 0 0 · · · 0 An−1 Bn−1 and M0 = Mn = 0. Case II. (Non-periodic spline) By introducing the conditions of non-periodic spline, we get 2M0 + M1 = D0 and Mn−1 + 2Mn = Dn , 6 y1 − y0 − y00 where D0 = h1 h1 6 yn − yn−1 0 and Dn = y − . hn n hn 5
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Spline Interpolation Then the system of tri-diagonal equations for M0 , M1 , . . . , Mn is given below. D0 M0 2 1 0 0 ··· 0 0 0 D1 M1 A1 B1 C1 0 · · · 0 0 0 0 A B C ··· 0 D M 0 0 2 2 2 2 2 .. = .. . ··· ··· ··· ··· ··· ··· ··· ··· . . 0 0 0 0 · · · An−1 Bn−1 Cn−1 Mn−1 Dn−1 0 0 0 0 ··· 0 1 2 Dn Mn Case III. (Extrapolated spline) In this case the values of M0 and Mn are given by the following equations M0 = M1 −
hn (Mn−1 − Mn−2 ) h1 (M2 − M1 ) and Mn = Mn−1 + . h2 hn−1
(5.17)
The first equation can be written as A1 h1 A1 h1 + M2 C 1 − = D1 or, M1 B10 + M2 C10 = D1 M1 A1 + B1 + h2 h2 A1 h1 A1 h1 and C10 = C1 − . h2 h2 0 Similarly, the second equation is changed to Mn−2 A0n−1 + Mn−1 Bn−1 = Dn−1 where
where B10 = A1 + B1 +
A0n−1 = An−1 −
Cn−1 hn hn Cn−1 0 and Bn−1 = Bn−1 + Cn−1 + . hn−1 hn−1
Finally, the system of tri-diagonal equations for M1 , M2 , . . ., Mn−1 is B10 C10 0 0 · · · 0 0 0 D1 M1 A2 B 2 C 2 0 · · · 0 0 0 M 2 D2 0 A3 B3 C3 · · · 0 0 0 . = . . . . ··· ··· ··· ··· ··· ··· ··· ··· . . Dn−1 Mn−1 0 0 0 0 0 · · · 0 A0n−1 Bn−1 The values of M0 and Mn are obtained from the equation (5.17). Case IV. (Endpoint curvature-adjusted spline) For this case, the values of M0 and Mn are given. The values of M1 , M2 , . . ., Mn−1 are 6
...................................................................................... obtained by solving the following system of tri-diagonal equations. D10 B1 C1 0 0 · · · 0 0 0 M1 D2 A2 B 2 C 2 0 · · · 0 0 0 M2 . 0 A3 B3 C3 · · · 0 0 0 . = .. , . ··· ··· ··· ··· ··· ··· ··· ··· . Dn−2 Mn−1 0 0 0 0 0 · · · 0 An−1 Bn−1 Dn−1 where D10 = D1 − A1 y000 ,
0 Dn−1 = Dn−1 − Cn−1 yn00 .
Let us consider two problems. Example 5.1 The following table gives the values of x and y = f (x). x
:
–1
0
1
2
f (x)
:
1.52
2.34
3.58
4.67
Fit a cubic spline curve for this data with the end conditions y 00 (−1) = 0.0 and y 00 (2) = 0.0. Also, find the value of f (x) when x = 0.5. Solution. For this problem the intervals are (−1, 0), (0, 1) and (1, 2). For each interval a cubic polynomial is to be constructed. Let the natural cubic spline on the interval [xi , xi+1 ] be pi (x) = ai (x − xi )3 + bi (x − xi )2 + ci (x − xi ) + di where the coefficients ai , bi , ci and di are obtained from the following equations. Mi+1 − Mi Mi , bi = , 6hi+1 2 yi+1 − yi 2hi+1 Mi + hi+1 Mi+1 ci = , − hi+1 6
ai =
di = yi ,
for i = 0, 1, 2. Here h1 = h2 = h3 = 1. The values of M ’s are given by the equation (5.15), i.e. Mi−1 + 4Mi + Mi+1 = 6(yi+1 − 2yi + yi−1 ),
i = 1, 2.
The system of tri-diagonal equations is given by M0 + 4M1 + M2 = 6(y2 − 2y1 + y0 ) = 6 × (3.58 − 2 × 2.34 + 1.52) = 2.52 M1 + 4M2 + M3 = 6(y3 − 2y2 + y1 ) = 6 × (4.67 − 2 × 3.58 + 2.34) = −0.9. 7
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Spline Interpolation Given that M0 = y 00 (−1) = 0 and M3 = y 00 (2) = 0. Using these values the above equations becomes 4M1 + M2 = 2.52,
M1 + 4M2 = −0.9.
Solving these equations we get M1 = 0.7320 and M2 = −0.4080. Now, the values of ai , bi , ci , di are given by a0 = c0 = a1 = c1 = a2 = c2 =
M0 M1 − M0 = 0.1220, b0 = = 0.0, 6 2 y1 − y0 2M0 + M1 − = 0.6980, d0 = y0 = 1.52. 1 6 M2 − M 1 M1 = −0.1900, b1 = = 0.3660, 6 2 y2 − y1 2M1 + M2 − = 1.0604, d1 = y1 = 2.3400. 1 6 M3 − M2 M2 = 0.0680, b2 = = −0.2040, 6 2 y3 − y2 2M2 + M3 − = 1.2260, d2 = y2 = 3.58. 1 6
Hence the required cubic spline is p0 (x) = 0.1220(x + 1)3 + 0.6980(x + 1) + 1.52,
−1 ≤ x ≤ 0
p1 (x) = −0.1900x3 + 0.3660x2 + 1.0604x + 2.34,
0≤x≤1
p2 (x) = 0.0680(x − 1)3 − 0.2040(x − 1)2 + 1.2260(x − 1) + 3.58,
1 ≤ x ≤ 2.
The value of f (x) is obtained from p1 (x) as x = 0.5 belongs to 0 ≤ x ≤ 1. Thus f (0.5) = p1 (0.5) = 2.9398. Example 5.2 Fit a non-periodic cubic spline for the data x
:
1.5
3.0
4.5
6.0
y
:
10.5
12.8
14.2
16.3
along with the conditions y 0 (1.5) = 1.1 and y 0 (6.0) = 7.3. Solution. In this problem, the intervals are (1.5, 3.0), (3.0, 4.5) and (4.5, 6.0). Here, h1 = h2 = h3 = 1.5. 8
...................................................................................... Let the cubic spline on the interval [xi , xi+1 ] be pi (x) = ai (x − xi )3 + bi (x − xi )2 + ci (x − xi ) + di where Mi Mi+1 − Mi , bi = , 6hi+1 2 yi+1 − yi 2hi+1 Mi + hi+1 Mi+1 ci = − , hi+1 6
ai =
di = yi , for i = 0, 1, 2.
The following equations are used to determine the values of M ’s. That is, yi+1 − yi yi − yi−1 1.5Mi−1 + 6Mi + 1.5Mi+1 = 6 − 1.5 1.5 for i = 1, 2. This gives y − y y1 − y0 2 1 1.5M0 + 6M1 + 1.5M2 = 6 − 1.5 1.5 y − y y2 − y1 3 2 − . 1.5M1 + 6M2 + 1.5M3 = 6 1.5 1.5 Again, the boundary conditions give, 6 y1 − y0 6 0 y3 − y2 2M0 + M1 = − y00 and M2 + 2M3 = y3 − . h1 h1 h3 h3 Thus the system of equations in M ’s are M0 + 4M1 + M2 = −2.4, M1 + 4M2 + M3 = 1.8666, 2M0 + M1 = 1.7334, M2 + 2M3 = 23.6000. Solution of these equations is M0 = 0.9333, M1 = −0.1333, M2 = −2.8000 and M3 = 13.2000. Therefore, a0 = −0.1185, b0 = 0.4667, c0 = 1.1000, d0 = 10.5. a1 = −0.2963, b1 = −0.0667, c1 = 1.70, d1 = 12.8. a2 = 1.7778, b2 = −1.4000, c2 = −0.5000, d2 = 14.2000. Hence, the cubic spline is given by p0 (x) = −0.1185(x − 1.5)3 + 0.4667(x − 1.5)2 + 1.1000(x − 1.5) + 10.5, p1 (x) = −0.2963(x − 3)3 − 0.0667(x − 3)2 + 1.7000(x − 3) + 12.8, p2 (x) = 1.7778(x − 4.5)3 − 1.4000(x − 4.5)2 − 0.5000(x − 4.5) + 14.2,
1.5 ≤ x ≤ 3
3 ≤ x ≤ 4.5 4.5 ≤ x ≤ 6. 9
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Spline Interpolation The value of y at x = 1.75 is obtained from p0 (x) and it is y(1.75) = p0 (1.75) = 10.8023. Example 5.3 Let ( f (x) =
2x3 − 4.5x2 + 3x + 5, 0 ≤ x ≤ 1, −x3 + 4.5x2 − 6x + 8, 1 ≤ x ≤ 2.
Show that f (x) is a cubic spline. Solution. Let p0 (x) = 2x3 − 4.5x2 + 3x + 5,
0 ≤ x ≤ 1,
and p1 (x) = −x3 + 4.5x2 − 6x + 8,
1 ≤ x ≤ 2.
For this problem x0 = 0, x1 = 1 and x2 = 2. The function f (x) will be a cubic spline if the following conditions are satisfied. pi (xi ) = f (xi ), p0i−1 (xi )
=
p0i (xi ),
pi (xi+1 ) = f (xi+1 ), p00i−1 (xi )
=
p00i (xi ),
i = 0, 1 and i = 1.
That is, we have to show p0 (x1 ) = f (x1 ), p1 (x1 ) = f (x1 ), p00 (x1 ) = p01 (x1 ), p000 (x1 ) = p001 (x1 ). p0 (x0 ) = f (x0 ) and p1 (x2 ) = f (x2 ) are obvious. Now, p00 (x) = 6x2 − 9x + 3,
p01 (x) = −3x2 + 9x − 6
p000 (x) = 12x − 9,
p001 (x) = −6x + 9.
p00 (x1 ) = p00 (1) = 0.0, p01 (x1 ) = p01 (1) = 0.0,, i.e. p00 (x1 ) = p01 (x1 ). and p000 (x1 ) = p000 (1) = 3 and p001 (x1 ) = p001 (1) = 3. Therefore, p000 (x1 ) = p001 (x1 ). Hence, f (x) is a cubic spline.
5.3 Comparison with other methods The Lagrange’s, Newton’s and Gaussian interpolation formulae construct a polynomial of degree n for n + 1 points (xi , yi ), i = 0, 1, 2, . . . , n, while the cubic spline 10
...................................................................................... constructs a set of n cubic polynomials for each interval. Computation of an n degree polynomial is a very difficult task whereas it is easy for a cubic one. Thus, in computational point of view the cubic spline is very useful. But, with respect to the computer storage, the cubic spline needs more space than the other interpolating polynomials. To store a polynomial of degree n only n + 1 coefficients are to be stored, while to store a third degree polynomial needs 4 units space. Again, for a cubic spline there are n polynomials each of degree three. So to store a cubic spline in total 4n coefficients are to be stored. Thus, with respective to storage, the other polynomials are better.
11
.
Chapter 2 Interpolation
Module No. 6 Inverse Interpolation
...................................................................................... In previous chapters, different types of interpolation methods are discussed. In these methods, from a table of values, we determine the value of y for a given value of x. But, in inverse interpolation the problem is just opposite, i.e. in this case, the value of x is given we have to determine the value of x. Many inverse interpolation methods are available in literature among them the commonly used methods are discussed in this module. In this module, three inverse interpolation methods are discussed, viz. Lagrange method and the methods based on Newton forward and Newton backward interpolation formulae. The inverse interpolation based on Lagrange’s formula is a direct method while the formulae based on Newton’s interpolation formulae are iterative.
6.1 Inverse interpolation based on Lagrange’s formula The idea of this method is simple compare to other inverse interpolation formulae. The Lagrange’s interpolation formula of y on x is y=
n X i=0
w(x) yi . (x − xi )w0 (xi )
From this formula, a polynomial in x is obtained, and hence one can determine the value of y when x is given. Now, we interchange x and y in the above formula and we obtained x=
n X i=0
n
X w(y)xi = Li (y)xi , (y − yi )w0 (yi ) i=0
where Li (y) =
w(y) (y−y0 )(y−y1 ) · · · (y−yi−1 )(y−yi+1 ) · · · (y−yn ) = , 0 (y−yi )w (yi ) (yi −y0 )(yi −y1 )· · ·(yi −yi−1 )(yi −yi+1 )· · ·(yi −yn )
and w(y) = (y − y0 )(y − y1 ) · · · (y − yn ). It is easy to observed that this formula gives the value of x for a given value of y. This formula is known as Lagrange’s inverse interpolation formula. The concept of this formula is simple but, practically this formula needs more computation time. 1
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Inverse Interpolation
Example 6.1 The following table represents the time t and the corresponding velocity v of a particle moving with non-uniform velocity. t
:
0.0
1.0
1.5
2.0
v
:
2.5
3.8
4.6
5.3
From this table determine the time when the velocity of the particle becomes 2.75. Solution. The inverse Lagrange’s interpolation formula is used to solve this problem. Here, v = 2.75. In this case, the formula is t = L0 (v)t0 + L1 (v)t1 + L2 (v)t2 + L3 (v)t3 , where (v − v1 )(v − v2 )(v − v3 ) (v0 − v1 )(v0 − v2 )(v0 − v3 ) (v − v0 )(v − v2 )(v − v3 ) L1 (v) = (v1 − v0 )(v1 − v2 )(v1 − v3 ) (v − v0 )(v − v1 )(v − v3 ) L2 (v) = (v2 − v0 )(v2 − v1 )(v2 − v3 ) (v − v0 )(v − v1 )(v − v2 ) L3 (v) = (v3 − v0 )(v3 − v1 )(v3 − v2 ) L0 (v) =
(2.75 − 3.8)(2.75 − 4.6)(2.75 − 5.3) (2.5 − 3.8)(2.5 − 4.6)(2.5 − 5.3) (2.75 − 2.5)(2.75 − 4.6)(2.75 − 5.3) = (3.8 − 2.5)(3.8 − 4.6)(3.8 − 5.3) (2.75 − 2.5)(2.75 − 3.8)(2.75 − 5.3) = (4.6 − 2.5)(4.6 − 3.8)(4.6 − 5.3) (2.75 − 2.5)(2.75 − 3.8)(2.75 − 4.6) = (5.3 − 2.5)(5.3 − 3.8)(5.3 − 4.6) =
= 0.64801 = 0.75601 = −0.56920 = 0.16518
Thus, the required time t is t = 0.64801 × 0 + 0.75601 × 1 − 0.56920 × 1.5 + 0.16518 × 2 = 0.23257.
6.2 Method of successive approximations Deduction of inverse interpolation formula from Newton’s interpolation formula is not straight forward like Lagrange’s formula. The Newton’s inverse interpolation methods are based on iteration.
2
......................................................................................
6.2.1
Iteration formula from Newton’s forward difference interpolation formula
The Newton’s forward difference interpolation formula is u(u − 1) 2 u(u − 1)(u − 2) 3 ∆ y0 + ∆ y0 + · · · 2! 3! u(u − 1)(u − 2) · · · (u − n + 1) n + ∆ y0 , n!
y = y0 + u∆y0 +
x − x0 . h From the second term of the above formula one can obtained the value of u as u(u − 1) 2 1 h u(u − 1)(u − 2) 3 y − y0 − ∆ y0 − ∆ y0 − · · · u= ∆y0 2! 3! u(u − 1)(u − 2) · · · (u − n + 1) n i − ∆ y0 . (6.1) n!
where u =
Note that the value of u depends on the value of u. So this formula does not give the value of u unless some initial value of u is not available. Suppose the first approximation of u be denoted by u(1) . By neglecting the second and higher differences we obtained the first approximate value of u as follows: 1 ∆y0 (y − y0 ). (u(2) ) of u, is
u(1) = Next, the second approximate value
obtained by neglecting third and
higher order differences from equation (6.1), i.e. u(2) =
1 h u(1) (u(1) − 1) 2 i y − y0 − ∆ y0 . ∆y0 2!
Similarly, the third approximation u(3) is obtained by the following formula. u(3) =
1 h u(2) (u(2) − 1) 2 u(2) (u(2) − 1)(u(2) − 2) 3 i y − y0 − ∆ y0 − ∆ y0 . ∆y0 2! 3!
In general, the (k+1)th approximate value of u is obtained from the following formula. u(k) (u(k) − 1) 2 u(k) (u(k) − 1)(u(k) − 2) 3 1 (k+1) u = y − y0 − ∆ y0 − ∆ y0 ∆y0 2! 3! u(k) (u(k) − 1) · · · (u(k) − k) k+1 −··· − ∆ y0 , (k + 1)! k = 0, 1, 2, . . . . 3
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Inverse Interpolation This process is repeated until two successive approximations u(k+1) and u(k) be equal up to desired number of significant figures, i.e. until |u(k+1) − u(k) | < ε, where ε is a very small pre-assigned number. Once the value of u is obtained, the from the equation x = x0 + u(k+1) h we obtained the value of x. Let us consider the following example to illustrate the method. Example 6.2 Based on the following table
x
:
2.5
3.0
3.5
4.0
4.5
y
:
1.3240
4.6231
6.1245
7.4322
8.3217
determined the value of x when y = 4.8 using the method of successive approximations. Solution. The forward difference table for the given set of data is
x
y
2.5
1.3240
∆y
∆2 y
∆3 y
3.2991 3.0
4.6231
–1.7977 1.5014
3.5
6.1245
1.6040 –0.1937
1.3077 4.0
7.4322
–0.2245 –0.4182
0.8895 4.5
8.3217
Let x0 = 3.0, h = 0.5. Now, we apply the successive approximation method to find u. The first approximation is u(1) = 4
1 1 (y − y0 ) = (3.2 − 4.6231) = 0.1178. ∆y0 1.5014
...................................................................................... The other approximations are u(1) (u(1) − 1) 2 u(1) (u(1) − 1) ∆2 y0 1 (2) y − y0 − u = ∆ y0 = u(1) − ∆y0 2! 2! ∆y0 0.1178(0.1178 − 1) −0.1937 = 0.1178 − = 0.1111. 2 1.5014 u(2) (u(2) − 1) ∆2 y0 u(2) (u(2) − 1)(u(2) − 2) ∆3 y0 − u(3) = u(1) − 2 ∆y0 3! ∆y0 0.1111(0.1111 − 1) −0.1937 = 0.1178 − · 2 1.5014 0.1111(0.1111 − 1)(0.1111 − 2) −0.2245 − · 6 1.5014 = 0.1160. Thus, the value of x when y = 4.8 is obtained from x = x0 +u(3) h = 3.0+0.1160×0.5 = 3.058. 6.2.2
Based on Newton’s backward difference interpolation formula
From this formula one can deduced similar formula for inverse interpolation. The Newton’s backward interpolation formula is v(v + 1) 2 v(v + 1)(v + 2) 3 ∇ yn + ∇ yn + · · · 2! 3! v(v + 1)(v + 2) · · · (v + n − 1) n + ∇ yn , n!
y = yn + v∇yn +
x − xn or x = xn + vh. h The variable v can be obtained as 1 h v(v + 1) 2 v(v + 1)(v + 2) 3 v= y − yn − ∇ yn − ∇ yn − · · · ∇yn 2! 3! v(v + 1) · · · (v + n − 1) n i − ∇ yn . n!
where v =
The first approximation of v is obtained by neglecting second and higher order differences as
1 (y − yn ). ∇yn Similarly, the second approximation of v is given by 1 h v (1) (v (1) + 1) 2 i v (2) = ∇ yn . y − yn − ∇yn 2! v (1) =
5
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Inverse Interpolation The third approximation is v (3) =
v (2) (v (2) + 1) 2 v (2) (v (2) + 1)(v (2) + 2) 3 i 1 h y − yn − ∇ yn − ∇ yn . ∇yn 2! 3!
In general, the (k + 1) approximation of v is given by the formula v
(k+1)
1 v (k) (v (k) + 1) 2 v (k) (v (k) + 1)(v (k) + 2) 3 = y − yn − ∇ yn − ∇ yn ∇yn 2! 3! v (k) (v (k) + 1) · · · (v (k) + k) k+1 −··· − ∇ yn , (k + 1)!
for k = 0, 1, 2, . . . . This iteration process repeats until two consecutive values of v’s become equal up to a desired number of significant figures. Finally, the value of x is determined from the equation x = xn + v (k+1) h. The inverse interpolation method is used to solve many problems. One of them is determination of roots of an equation discussed below.
6.3 Computation of roots of an equation by inverse interpolation Let f (x) = 0 be an equation and one of the roots, say ξ, lies between a and b. That is, f (ξ) = 0 and a ≤ ξ < b. To find the root ξ by inverse interpolation method, a table is to be formed for some values of x between a and b. The first row contains the values of x and second row contains the values of y = f (x). If we use Lagrange’s method, then arguments (the values of x) may be unequal spaced, but for Newton’s methods the arguments must be equal spaced. Now, the problem is to find the value of x such that y = 0 and this value of x is the desired root of the equation. Example 6.3 Find a real root of the equation x3 − 2x2 + 0.5 = 0. Solution. Let y = x3 − 2x2 + 0.5. It is easy to check that one root of this equation lies between 1/2 and 3/4. Let us consider six values of x within this interval, viz. x = 0.50, 0.55, 0.60, 0.65, 0.70, 0.75. The values of x and y are shown in the following table. 6
......................................................................................
x
:
0.50
0.55
0.60
0.65
0.70
0.75
y
:
0.125000
0.061375
–0.004000
–0.070375
–0.137000
–0.203125
The difference of table is shown below. x
y
∆y
∆2 y
∆3 y
0.50
0.125000
0.55
0.061375
–0.063625
0.60
–0.004000
–0.065375
–0.00175
0.65
–0.070375
–0.066375
–0.00100
0.00075
0.70
–0.137000
–0.066625
–0.00025
0.00075
0.75
–0.203125
–0.066125
0.00050
0.00075
Since, we have to find a root of the given equation, so the problem is find the value of x such that y = 0. Now, the first approximation is y0 0.125 = 1.964636. = ∆y0 0.063625 1 u(1) (u(1) − 1) 2 ∆ y0 =− y0 + ∆y0 2 1 1.964636 × (1.964636 − 1) = 0.125 + × −0.00175 0.063625 2 = 1.938573. 1 u(2) (u(2) − 1) 2 u(2) (u(2) − 1)(u(2) − 2) 3 =− y0 + ∆ y0 + ∆ y0 ∆y0 2! 3! = 1.939394.
u(1) = − u(2)
u(3)
Therefore, the third approximate value of u is 1.939394. Thus, x = x0 + u(3) × h = 0.50 + 1.939394 × 0.05 = 0.596970. This is a root of the given equation. An interesting result Here we consider an interesting function for interpolation. 1 Let f (x) = , [−3, 3]. The second degree interpolating polynomial is y = φ2 (x) = 1 + x2 1 − 0.1x2 and the fourth degree polynomial is y = φ4 (x) = 0.15577u4 − 1.24616u3 + 2.8904u2 − 1.59232u + 0.1 where u = (x + 3)/1.5. The graph of the curves y = f (x), y = φ2 (x) and y = φ4 (x) are shown in Figure 6.1. 7
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Inverse Interpolation
Figure 6.1: The graph of the curves y = f (x), y = φ2 (x) and y = φ4 (x). For this function, the fourth degree polynomial gives an negative result at x = 2, but the exact value is positive. At this point f (2) = 0.2, φ2 (2) = 0.6 and φ4 (2) = −0.01539. It may be noted that the functions y = f (x) and y = φ2 (x) are positive for all values of x, but y = φ4 (x) is negative for some values of x.
6.4 Choice of interpolation formulae In this chapter, different methods are discussed to construct an interpolating polynomial in case of one independent variable. Some methods have certain advantage over other method. In this section we discussed the merits and demerits of the methods. If the arguments are not equally spaced then Lagrange’s, Newton’s divided difference or Aitken’s iterated interpolation formulae may be used. All these formulae need more computation time than Newton’s forward and backward interpolation methods. But, Newton’s forward and backward methods are applicable only when the arguments are equally spaced. The Newton’s forward formula is used when the interpolating point is at the beginning of the table, whereas Newton’s backward formula is used when such 8
...................................................................................... point is at the end of the table. Stirling’s and Bessel’s formulae are useful when the interpolating point is at the centre of the table. It is known that the interpolation polynomial is unique and hence all the formulae discussed in this chapter are just different forms of one and the same interpolation polynomial. For a given table of values, the results obtained by these formulae must be same, provided all the terms of the formulae are consider to calculate the results. In many cases, if the number of points is very large a subset of the entire data is consider. For example, if we interpolate at the beginning of the table, then a subset from the beginning of the table is consider and in this case the recommended formula is Newton’s forward formula, if the arguments are in equispaced. For the same reasons the central difference formulae like Stirling’s, Bessel’s, Everett’s etc. are used for interpolation near the centre of the table. For central difference interpolation, if the table is large, then some points may be discarded from the ends of the table to reduced the points. The proper choice of a central interpolation formulae depends on the error terms and the data set. It can be shown that the Stirling’s formula gives the more accurate result for −1/4 ≤ s ≤ 1/4, and Bessel’s formula gives better result near s = 1/2, i.e. for 1/4 ≤ s ≤ 3/4. If all the terms of the formulae are considered, then both the formulae give same result. Based on these discussions, the following rules are suggested to use interpolation formula. (i) If the interpolating point is at the beginning of the table and the arguments are equispaced, then Newton’s forward formula is suggested with a suitable starting point x0 such that 0 < u < 1. (ii) If the interpolating point is at the end of the table and the arguments are equispaced, then Newton’s backward formula is recommended with a suitable starting point xn such that −1 < u < 0. (iii) If the interpolating point (with equispaced arguments) is at the centre of the table and the number of arguments is odd, then Stirling’s formula can be used. (iv) If the interpolating point (with equispaced arguments) is at the centre of the table and the number of points is even, then Bessel’s or Everett’s formula is recommended. 9
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Inverse Interpolation (v) If the arguments are not equispaced, then Lagrange’s formula or Newton’s divided difference formula or Aitken’s iterative formula is suggested.
10
.
Chapter 2 Interpolation
Module No. 7 Bivariate Interpolation
...................................................................................... In previous modules, we consider only one independent variable for polynomial interpolation. In this module, two independent variables are consider for polynomial interpolation, and this type of interpolation is known as bivariate interpolation. Recently bivariate interpolation becomes important due to its extensive use in digital image processing, digital filter design, computer-aided design, solution of non-linear simultaneous equations, etc. To construct bivariate interpolation formulae, the following two approaches are used. (i) For a given table of values for two independent and one dependent variables, constructing a function for the dependent variable that satisfy exactly the functional values at all the given points. (ii) Constructing a function that approximately fits the data. This approach is desirable when the data likely to have errors and require smooth functions. Mainly two approaches, viz. matching method and approximation method are used. Also, local and global methods are available for these approaches. In this module, only matching methods are discussed. In matching method, the interpolated function passes through the given points, but in the approximate method the function approximately fits the given data.
7.1 Local matching methods Commonly used two two local matching methods, viz. triangular interpolation and rectangular grid or bilinear interpolation are discussed. 7.1.1
Triangular interpolation
The very simple local interpolating surface is of the form F (x, y) = a + bx + cy. The values of a, b, c are evaluated based on the given data. These values are determine from the three corners of a triangle. This procedure generates a piecewise linear surface which is global continuous. 1
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Bivariate Interpolation Let the function f (x, y) be known at the points (x1 , y1 ), (x2 , y2 ) and (x3 , y3 ). Let f1 = f (x1 , y1 ), f2 = f (x2 , y2 ) and f3 = f (x3 , y3 ). That is, f1 , f2 , f3 are known. Let the interpolating polynomial be F (x, y) = a + bx + cy
(7.1)
such that F (xi , yi ) = f (xi , yi ), i = 1, 2, 3. Therefore, f1 = a + bx1 + cy1 ,
f2 = a + bx2 + cy2 ,
f3 = a + bx3 + cy3 .
The solution of these equations give the values of a, b and c as (x2 y3 − x3 y2 )f1 − (x1 y3 − x3 y1 )f2 + (x1 y2 − x2 y1 )f3 ∆ (f2 − f1 )(y3 − y1 ) − (f3 − f1 )(y2 − y1 ) b= ∆ (f3 − f1 )(x2 − x1 ) − (f2 − f1 )(x3 − x1 ) c= ∆
a=
where ∆ = (x2 − x1 )(y3 − y1 ) − (x3 − x1 )(y2 − y1 ). By substituting the values of a, b, c we get the required polynomial. But, the function F (x, y) can be also be written in the following form F (x, y) = Af1 + Bf2 + Cf3 , (x2 − x)(y3 − y) − (x3 − x)(y2 − y) where A = ∆ (x3 − x)(y1 − y) − (x1 − x)(y3 − y) B= ∆ (x1 − x)(y2 − y) − (x2 − x)(y1 − y) . C= ∆
(7.2) (7.3) (7.4) (7.5)
Note that, in this form the values of A, B, C does not depend on the f1 , f2 , f3 . Some observations of this method are given below. Note 7.1
(i) If A + B + C = 1 then ∆ 6= 0.
(ii) If ∆ = 0 then the points (xi , yi ), i = 1, 2, 3 are collinear. (iii) Suppose (xi , yi ) and f (xi , yi ), i = 1, 2, . . . , n are given. If we choose non-overlapping triangles which cover the region containing all these points, then a continuous polynomial can be constructed in this region. 2
......................................................................................
Example 7.1 Let f (1, 1) = 3, f (3, 1) = 8 and f (2, 2) = 15 be given. Find the approximate value of f (2, 1.5) using triangular interpolation. Solution. For this peoblem x1 = 1,
y1 = 1, f1 = f (x1 , y1 ) = 3
x2 = 3,
y2 = 1, f2 = f (x2 , y2 ) = 8
x3 = 2,
y3 = 2, f3 = f (x3 , y3 ) = 15
Let x = 2, y = 1.5. Therefore, ∆ = (x2 − x1 )(y3 − y1 ) − (x3 − x1 )(y2 − y1 ) = (3 − 1)(2 − 1) − (2 − 1)(1 − 1) = 2. (x2 − x)(y3 − y) − (x3 − x)(y2 − y) A= = 0.25 ∆ (x3 − x)(y1 − y) − (x1 − x)(y3 − y) = 0.25 B= ∆ (x1 − x)(y2 − y) − (x2 − x)(y1 − y) C= = 0.50. ∆ Thus f (2, 1.5) ' F (2, 1.5) = Af1 + Bf2 + Cf3 = 0.25 × 3 + 0.25 × 8 + 0.5 × 15 = 10.25. 7.1.2
Bilinear interpolation
In this method, four corner points of a rectangle are needed. Suppose the function f (x, y) be known at the points (x1 , y1 ), (x1 + h, y1 ), (x1 , y1 + k) and (x1 + h, y1 + k). Based on these points we construct a bilinear polynomial F (x, y), which passes through these points and defined within the rectangle. Let f1 = f (x1 , y1 ), f2 = f (x1 + h, y1 ), f3 = f (x1 , y1 + k) and f4 = f (x1 + h, y1 + k). Let the polynomial F (x, y) be the following form F (x, y) = a + b(x − x1 ) + c(y − y1 ) + d(x − x1 )(y − y1 ).
(7.6)
Since this polynomial must passes through these points, therefore F (x1 , y1 ) = f (x1 , y1 ) = f1 ,
F (x1 + h, y1 ) = f (x1 + h, y1 ) = f2 ,
F (x1 , y1 + k) = f (x1 , y1 + k) = f3 , F (x1 + h, y1 + k) = f (x1 + h, y1 + k) = f4 . 3
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Bivariate Interpolation Substituting the four corner points to the equation (7.6), we obtained the following equations. f1 = a, f2 = a + bh, f3 = a + ck and f4 = a + hb + kc + hkd. The solution of these equations is a = f1 ,
b=
f3 − f1 f2 − f1 ,c = h k
and d =
f4 + f1 − f2 − f3 . hk
(7.7)
Thus, the required bilinear polynomial is give by the equation (7.6) and the values of a, b, c, d are obtained from the equation (7.7). Example 7.2 Suppose the function f (x, y) is known at the points (2, 2), (3, 2), (2, 3) and (3, 3) such that f (2, 2) = 4, f (3, 2) = 20, f (2, 3) = 15 and f (3, 3) = 25. Construct a bilinear polynomial F (x, y) based on these data. Also, find an approximate value of f (2.50, 2.75). Solution. Here x1 = 2, y1 = 2,
f1 = f (x1 , y1 ) = 4
x1 + h = 3,
y1 = 2,
f2 = f (x1 + h, y1 ) = 20
x1 = 2,
y1 + k = 3,
f3 = f (x1 , y1 + k) = 15
x1 + h = 3, y1 + k = 3, f4 = f (x1 + h, y1 + k) = 25. It is easy to see that, h = 1, k = 1. Thus,
f2 − f1 20 − 4 = = 16, h 1 f3 − f1 15 − 4 f4 + f1 − f2 − f3 c= = = 11, d = = −6. k 1 hk
a = f1 = 4, b =
Hence, the required bilinear polynomial is f (x, y) ' F (x, y) = a + b(x − x1 ) + c(y − y1 ) + d(x − x1 )(y − y1 ) = 4 + 16(x − 2) + 11(y − 2) − 6(x − 2)(y − 2). Also, f (2.50, 2.75) = 18.
7.2 Lagrange’s bivariate interpolation The Lagrange’s bivariate interpolation formula is an extension of single variable case. 4
...................................................................................... Let f (x, y) be a function defined at (m + 1)(n + 1) distinct points (xi , yi ), i = 0, 1, . . . , m; j = 0, 1, . . . , n. Let F (x, y) be the interpolating polynomial of the function (f (x, y). The degree of F (x, y) is at most m in x and n in y, and which satisfy the condition F (xi , yj ) = f (xi , yj ),
i = 0, 1, . . . , m; j = 0, 1, . . . , n.
(7.8)
As in case of single variable, the Lagrangian functions are defined as wx (x) , (x − xi )wx0 (xi ) wy (y) Ly,j (y) = , (y − yj )wy0 (yj ) Lx,i (x) =
i = 0, 1, . . . , m
(7.9)
j = 0, 1, . . . , n
(7.10)
where wx (x) = (x − x0 )(x − x1 ) · · · (x − xm ) and wy (y) = (y − y0 )(y − y1 ) · · · (y − yn ). It is observed that the functions Lx,i (x) and Ly,j (y) are the polynomials of degree m in x and n in y respectively. These functions also satisfy the following conditions. ( Lx,i (xk ) =
0, if xi 6= xk 1, if xi = xk
( and Ly,j (yk ) =
0, if yi 6= yk 1, if yi = yk
(7.11)
Using these notations the Lagrange’s bivariate polynomial is given by F (x, y) =
m X n X
Lx,i (x)Ly,j (y) f (xi , yj ).
(7.12)
i=0 j=0
Note that the formula is just an extension of single variable case, and it is easy to remembers. This formula is illustrated by the following example. Example 7.3 Let f (0, 0) = 1, f (0, 1) = 2, f (1, 0) = 3, f (1, 1) = 5, f (2, 0) = 5, f (2, 1) = 10 are known for the function f (x, y). Construct a bivariate interpolating polynomial using Lagrange’s method. Also, calculate the value of f (1.5, 0.75). Solution. m = 2, n = 1, x0 = 0, x1 = 1, x2 = 2, y0 = 0, y1 = 1. 5
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Bivariate Interpolation The Lagrangian functions are Lx,0 (x) = Lx,1 (x) = Lx,2 (x) = Ly,0 (y) = Ly,1 (y) =
1 (x − x1 )(x − x2 ) = (x2 − 3x + 2) (x0 − x1 )(x0 − x2 ) 2 (x − x0 )(x − x2 ) = −(x2 − 2x) (x1 − x0 )(x1 − x2 ) 1 (x − x0 )(x − x1 ) = (x2 − x) (x2 − x0 )(x2 − x1 ) 2 (y − y1 ) =1−y (y0 − y1 ) (y − y0 ) = y. (y1 − y0 )
Now, the interpolating polynomial is
F (x, y) =
2 X 1 X
Lx,i (x)Ly,j (y) f (xi , yj )
i=0 j=0
= Lx,0 (x){Ly,0 (y)f (x0 , y0 ) + Ly,1 (y)f (x0 , y1 )} +Lx,1 (x){Ly,0 (y)f (x1 , y0 ) + Ly,1 (y)f (x1 , y1 )} +Lx,2 (x){Ly,0 (y)f (x2 , y0 ) + Ly,1 (y)f (x2 , y1 )} 1 = (x2 − 3x + 2)[(1 − y) × 1 + y × 2] − (x2 − 2x)[(1 − y) × 3 + y × 5] 2 1 2 = (x − x)[(1 − y) × 5 + y × 10] 2 = x2 y + y + 2x + 1. Hence, the value of f (1.5, 0.75) is approximately F (1.5, 0.75) = 6.4375
7.3 Newton’s bivariate interpolation formula The concept of Lagrange’s bivariate formula is simple. But, the idea of Newton’s bivariate formula is new. Since, f (x, y) contains two independent variables, two types of differences are considered. One difference is taken with respect to x and other for y. Let f (x, y) be given at (m + 1)(n + 1) distinct points (xi , yj ), i = 0, 1, . . . , m; j = 0, 1, . . . , n. Also, let xi = x0 + ih, yj = y0 + jk, x = x0 + sh and y = y0 + tk. 6
...................................................................................... Now, we defined the first and higher order differences with respect to x and y.
∆x f (x, y) = f (x + h, y) − f (x, y) = Ex f (x, y) − f (x, y) = (Ex − 1)f (x, y) ∆y f (x, y) = f (x, y + k) − f (x, y) = Ey f (x, y) − f (x, y) = (Ey − 1)f (x, y) ∆xx f (x, y) = (Ex2 − 2Ex + 1)f (x, y) = (Ex − 1)2 f (x, y) ∆yy f (x, y) = (Ey2 − 1)2 f (x, y) ∆xy f (x, y) = ∆x {f (x, y + k) − f (x, y)} = {f (x + h, y + k) − f (x, y + k)} − {f (x + h, y) − f (x, y)} = Ex Ey f (x, y) − Ey f (x, y) − Ex f (x, y) + f (x, y) = (Ex − 1)(Ey − 1)f (x, y)
and so on. Let us consider the function f (x, y) as,
f (x, y) = f (x0 + sh, y0 + tk) = Exs Eyt f (x0 , y0 ) = (1 + ∆x )s (1 + ∆y )t f (x0 , y0 ) n o s(s − 1) = 1 + s∆x + ∆xx + · · · 2! o n t(t − 1) ∆yy + · · · f (x0 , y0 ) × 1 + t∆y + 2! s(s − 1) t(t − 1) ∆xx + ∆yy = 1 + s∆x + t∆y + 2! 2! +st∆xy + · · · f (x0 , y0 ).
x − x0 y − y0 and t = . h k x − x0 − h x − x1 y − y1 Then s − 1 = = and t − 1 = . h h k Now, we substitute s =
7
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Bivariate Interpolation Thus, the Newton’s bivariate interpolation formula is x − x0 y − y0 F (x, y) = f (x0 , y0 ) + ∆x + ∆y f (x0 , y0 ) h k 1 (x − x0 )(x − x1 ) 2(x − x0 )(y − y0 ) + ∆xx + ∆xy 2! h2 hk (y − y0 )(y − y1 ) + ∆yy f (x0 , y0 ) + · · · . k2
(7.13)
To simplify the formula, two unit less quantities u and v are introduced which are defined as x = x0 + uh and y = y0 + vk. Then x − xs = (u − s)h and y − yt = (v − t)k. Hence, the equation (7.13) becomes 1 [u(u − 1)∆xx 2! +2uv∆xy + v(v − 1)∆yy ]f (x0 , y0 ) + · · ·
F (x, y) = f (x0 , y0 ) + [u∆x + v∆y ]f (x0 , y0 ) +
(7.14)
Example 7.4 The function f (x, y) is known as f (0, 0) = −1, f (0, 1) = 2, f (0, 2) = 3, f (1, 0) = 4, f (1, 1) = 0, f (1, 2) = 4, f (2, 0) = 2, f (2, 1) = −2, f (2, 2) = 3. For these values construct the Newton’s bivariate polynomial. Also, find the approximate values of f (1.25, 0.75) and f (1.0, 1.5). Solution. ∆x f (x0 , y0 ) = f (x0 + h, y0 ) − f (x0 , y0 ) = f (x1 , y0 ) − f (x0 , y0 ) = 4 − (−1) = 5 ∆y f (x0 , y0 ) = f (x0 , y0 + k) − f (x0 , y0 ) = f (x0 , y1 ) − f (x0 , y0 ) = 2 − (−1) = 3 ∆xx f (x0 , y0 ) = f (x0 + 2h, y0 ) − 2f (x0 + h, y0 ) + f (x0 , y0 ) = f (x2 , y0 ) − 2f (x1 , y0 ) + f (x0 , y0 ) = 2 − 2 × (4) + (−1) = −7 ∆yy f (x0 , y0 ) = f (x0 , y0 + 2k) − 2f (x0 , y0 + k) + f (x0 , y0 ) = f (x0 , y2 ) − 2f (x0 , y1 ) + f (x0 , y0 ) = 3 − 2 × 2 + (−1) = −2 ∆xy f (x0 , y0 ) = f (x0 + h, y0 + k) − f (x0 , y0 + k) − f (x0 + h, y0 ) + f (x0 , y0 ) = f (x1 , y1 ) − f (x0 , y1 ) − f (x1 , y0 ) + f (x0 , y0 ) = 0 − 2 − 4 + (−1) = −7. 8
......................................................................................
Figure 7.1: The interpolated bivariate polynomial F (x, y) y − y0 x − x0 = x, v = = y. h k Thus, the approximate polynomial F (x, y) for the function f (x, y) is
Here h = k = 1. u =
F (x, y) = 1 + [x × 5 + y × 3] 1 + [x(x − 1) × (−7) + 2xy × (−7) + y(y − 1) × (−2)] 2! 1 = [2 + 17x + 8y − 7x2 − 14xy − 2y 2 ]. 2 The approximate polynomial F (x, y) is shown in Figure 7.1. Also, the approximate values are f (1.25, 0.75) ' F (1.25, 0.75) = 2.03125 and f (1.0, 1.5) ' F (1.0, 1.5) = −0.75.
9
.
Chapter 3 Approximation of Functions
Module No. 1 Least Squares Method
...................................................................................... Lots of experimental data are available in science and engineering applications and it is very essential to construct a curve based on this data. From the fitted curve, it is easy to determine the (approximate) values of unknown parameter(s) at non-experimental points. So, curve fitting is a very useful and important topic. But, the question is which curve is better for a given data set? There are two broad methods are available for this problem, one is matching method and another is approximation method. In Chapter 2, lots of interpolation methods are discussed to construct a polynomial which passes through the given points. These methods are called the matching methods. If n points (xi , yi ), i = 1, 2, . . . , n are given, then in interpolation an nth degree polynomial is constructed. But, it is obvious that the evaluation of large degree polynomial is a difficult task, though it gives exact values at the given nodes x0 , x1 , . . . , xn . Again, the cubic spline generates n third degree polynomials for n data points. By using approximation method one can construct lower degree polynomials such as linear, quadratic etc. and other types of curve, viz., geometric, exponential etc. Mainly least squares method is used to construct such curve. The least squares method does not give the guarantee that the curve must passes through the given points, but it will minimizes the sum of square of the absolute errors. In this method, the user can decide the degree of the polynomial irrespective of the data size, but in interpolation the degree of the polynomial depends on the data set.
1.1 General least squares method Let (xi , yi ), i = 1, 2, . . . , n be the give data set, i.e. a bivariate sample of size n. Our problem is to construct a curve of the following form y = g(x; a0 , a1 , . . . , ak ),
(1.1)
where a0 , a1 , . . . , ak are the unknown parameters and their values are to be determined in terms of given sample. When x = xi , the ith argument, the value of y obtained from the equation (1.1) is denoted by Yi . This value is called the predicted value or expected value or observed value of y and in general, it is approximate value of y, i.e. erroneous value, where as yi is the exact value of y. 1
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Least Squares Method These points and the least squares curve are shown in Figure 1.1. The points are drawn in blue colour and the curve is drawn in red colour. The dotted line represents the actual value of y at x = xi . The predicted value Yi is the length between the (red) curve and the x-axis.
y 6 r r 6 6 r yi -Yi ? r r r r6 r r r y
r
r r r
r r
r r
r
r
i
Yi
? ?
xi
O
r
- x
Figure 1.1: The sample points and the least squares curve The predicted value Yi is given by the equation Yi = g(xi ; a0 , a1 , . . . , ak ).
(1.2)
In general, Yi and yi are different as the predicted values does not necessarily satisfy the curve (1.1). The difference (yi − Yi ) is called the residual corresponding to x = xi . Now, the sum of the square of the residuals be denoted by S and it is given by the following expression S=
n X i=1
(yi − Yi )2 =
n X
[yi − g(xi ; a0 , a1 , . . . , ak )]2 .
(1.3)
i=1
The values of the parameters a0 , a1 , . . . , ak are determined by least squares method in such a way that S is minimum. Note that S contains (k + 1) parameters a0 , a1 , . . . , ak . 2
...................................................................................... The value of S will be minimum if ∂S ∂S ∂S = 0, = 0, . . . , = 0. ∂a0 ∂a1 ∂ak
(1.4)
This is a system of (k + 1) equations and these equations are called normal equations. Let the solution of these (k + 1) equations for the parameters a0 , a1 , . . . , ak be a0 = a∗0 , a1 = a∗1 , . . . , ak = a∗k . Then the required fitted curve is y = g(x; a∗0 , a∗1 , . . . , a∗k ).
(1.5)
The sum of the square of residuals is determined from the following equation S=
n X
2
(yi − Yi ) =
i=1
n X
[yi − g(xi ; a∗0 , a∗1 , . . . , a∗k )]2 .
(1.6)
i=1
This is the general method to fit a curve by using least squares method. Now, we discuss this method to fit some special type of curves.
1.2 Fitting of a straight line The straight line is a most simple curve and it is easy to handle. Let the equation of a straight line be y = a + bx,
(1.7)
where a and b are two parameters and their values are to be determined by the least squares method. Let (xi , yi ), i = 1, 2, . . . , n, be a given sample of size n. In this case, the sum of square of residuals S is given by S=
n X i=1
(yi − Yi )2 =
n X
(yi − a − bxi )2 .
i=1
Here, the parameters are a and b. The normal equations are X ∂S = −2 (yi − a − bxi ) = 0 ∂a X ∂S = −2 (yi − a − bxi )xi = 0. ∂b 3
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Least Squares Method After simplification, these equations reduce to X
X yi = na + b xi X X X x i yi = a xi + b x2i . The solution of these equations is P P P n x i yi − x i yi P P b= n x2i − ( xi )2
and
a=
(1.8)
X i 1hX yi − b xi . n
(1.9)
The expression of b can also be written as n P
b=
(x − x)(y − y)
i=1 n P
, where x = (xi − x)2
n
n
i=1
i=1
1X 1X xi , y = yi . n n
(1.10)
i=1
The above formula is generally used when the data set is large. Let a∗ and b∗ be the values of a and b. Then the fitted straight line is y = a∗ + b∗ x.
(1.11)
Example 1.1 Let (2, 2), (−1.5, −2), (4, 4.5) and (−2.5, −3) be a sample. Use least squares method to fit the line y = a + bx based on this sample and estimate the total error. Solution. The sample size is n = 4. The normal equations for straight line are X
X yi = na + b xi X X X x i yi = a xi + b x2i The values of
P
xi ,
P
yi and
Total 4
P
xi yi are calculated in the following table. xi
yi
x2i
x i yi
2.00
2.00
4.00
4.00
−1.50
−2.00
2.25
3.00
4.00
4.50
16.00
18.00
−2.50
−3.00
6.25
7.50
2.00
1.50
28.50
32.50
...................................................................................... Thus the normal equations are 1.5 = 4a + 2b and 32.5 = 2a + 28.5b. The solution of these equations is a = −0.2023, b = 1.1545. Hence, the fitted straight line is y = −0.2023 + 1.1545x. Estimation of error.
x
Given y (yi )
y, obtained from the curve (Yi )
2.0 −2.0 4.5 −3.0
2.1067 −1.9340 4.3657 −3.0885
2.0 −1.5 4.0 −2.5
(yi − Yi )2
Total
0.0114 0.0043 0.0180 0.0078 0.0415
Hence, the sum of the square of the residuals is 0.0415.
1.3 Fitting of polynomial of degree k Here, we consider a general case of polynomial curve fitting. Let (xi , yi ), i = 1, 2, . . . , n be the given bivariate sample of size n. The problem is to fit the following polynomial curve based on the above data. Let y = a0 + a1 x + a2 x2 + · · · + ak xk ,
(1.12)
be a polynomial curve of degree k, where a0 , a1 , . . . , ak are the (k + 1) parameters and their values are to be determined by least squares method in terms of the sample values. Let Yi be the value of y when x = xi obtained from the curve (1.12). The sum of square of residuals is S=
n X i=0
(yi − Yi )2 =
n h i2 X yi − (a0 + a1 xi + a2 x2i + · · · + ak xki ) .
(1.13)
i=0
5
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Least Squares Method As in previous cases, the normal equations are given by ∂S ∂S ∂S = 0, = 0, . . . , = 0. ∂a0 ∂a1 ∂ak After simplification, these equations become X X X X na0 + a1 xi + a2 x2i + · · · + ak xki = yi X X X X X a0 xi + a1 x2i + a2 x3i + · · · + ak xk+1 = x i yi i X X X X X a0 x2i + a1 x3i + a2 x4i + · · · + ak xk+2 = x2i yi i
(1.14)
················································ ··· ······ X X X X X a0 xki + a1 xk+1 + a2 xk+2 + · · · + ak x2k xki yi . i = i i This is a set of (k + 1) linear equations containing (k + 1) parameters a0 , a1 , . . . , ak . Let a0 = a∗0 , a1 = a∗1 , . . . , ak = a∗k be the solution of this system of linear equations. Then the fitted polynomial is y = a∗0 + a∗1 x + a∗2 x2 + · · · + a∗k xk . 1.3.1
(1.15)
Fitting of second degree parabolic curve
Let us consider a particular case of kth degree polynomial. Here, we consider k = 2. Then the curve is known as parabolic curve or a polynomial curve of degree 2. Let the equation of the parabolic curve be y = a + bx + cx2 ,
(1.16)
where a, b, c are unknown parameters and their values are to be determined by least squares method. Let Yi be the value of y when x = xi obtained from the curve (1.16). Therefore, Yi = a + bxi + cx2i . For this case, the sum of square of residuals S is S=
n X
(yi − Yi )2 =
i=1
n X
(yi − a − bxi − cx2i )2 .
i=1
Hence, the normal equations are ∂S = 0, ∂a 6
∂S = 0, ∂b
and
∂S = 0. ∂c
...................................................................................... After simplification, these equations become
X
X X yi = na + b xi + c x2i X X X X xi yi = a xi + b x2i + c x3i X X X X x2i yi = a x2i + b x3i + c x4i .
Let a = a∗ , b = b∗ and c = c∗ be the solution of the above equations. Then the fitted parabolic curve is
y = a∗ + b∗ x + c∗ x2
(1.17)
Example 1.2 Fit a quadratic polynomial curve for the following data
x
:
1
2
3
4
5
6
7
8
9
y
:
3
4
6
7
7
8
10
11
11
Solution. We transform the x values by u = 5−x to reduce the arithmetic computation. Let the quadratic curve be y = a + bu + cu2 . For this problem, the normal equations are
X
P P = na + b u + c u2 X P P P uy = a u + b u2 + c u3 X P P P u2 y = a u2 + b u3 + c u4 . y
The necessary calculations are shown below: 7
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Least Squares Method
x
u
y
u2
uy
u3
u2 y
u4
1 2 3 4 5 6 7 8 9
4 3 2 1 0 −1 −2 −3 −4
3 4 6 7 7 8 10 11 11
16 9 4 1 0 1 4 9 16
12 12 12 7 0 −8 −20 −33 −44
64 27 8 1 0 −1 −8 −27 −64
48 36 24 7 0 8 40 99 176
256 81 16 1 0 1 16 81 256
Total
0
67
60
−62
0
438
708
Using this values the normal equations are 67 = 9a + 0b + 60c −62 = 0a + 60b + 0c 438 = 60a + 0b + 708c. Solution of these equations is b = −1.033, a = 7.632, c = −0.028. Thus, the fitted quadratic curve is y = 7.632 + (−1.033)u + (−0.028)u2 = 7.632 + (−1.033)(5 − x) + (−0.028)(5 − x)2 = 1.767 + 1.313x − 0.028x3 .
1.4 Fitting of other curves Using least squares method one can fit many other curves such as geometric curve, exponent curve, hyperbolic curve, etc. These type of curves occur in different applications and the nature of the curve depends on the problem and the data. Some of these curves are described below.
8
......................................................................................
1.4.1
Geometric curve
Let us consider the geometric curve. Let y = axb
(1.18)
be the geometric curve. Here x and y are independent and dependent variables respectively, and a and b are called the parameters of the curve. The values of these parameters are to be determined by least squares method in terms of the sample (xi , yi ), i = 1, 2, . . . , n. Taking logarithm on both sides of the equation (1.18), we obtain log y = log a+b log x. This equation can be written as Y = A + bX, where Y = log y, X = log x, A = log a. Thus to fit the curve of equation (1.18), we have to determine the values of the parameters A and b. For this purpose, the given sample (xi , yi ) is to be converted to (Xi , Yi ), i = 1, 2, . . . , n. Using this sample, and by using the procedure described in Section 1.2, one can determine the values of A and b. The value of a is then determine from the equation a = eA . Example 1.3 Fit a geometric curve of the form y = axb for the following data. x
:
1
2
3
4
5
y
:
2.7
5.6
10.2
15.3
21
Solution. Let y = axb . Here a and b are parameters. Taking logarithm both sides, we get Y = A + bX, where log y = Y, log x = X, log a = A. The normal equations for the curve Y = A + bX are
X
X Yi = nA + b Xi X X X Xi Yi = A Xi + b Xi2 . The necessary values are calculated in the following table. 9
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Least Squares Method x
y
X
Y
X2
XY
1
2.7
0
1.99325
0
0
2
5.6
0.69315
1.72276
0.48045
1.19413
3
10.2
1.09861
2.32238
1.20694
2.55139
4
15.3
1.38629
2.72785
1.92179
3.78159
5
21.0
1.60944
3.04452
2.59029
4.89997
4.78749
10.81076
6.19947
12.42708
Total
Thus the normal equations are 5A + 4.78749b = 10.81076, 4.78749A + 6.19947b = 12.42708. The solution of these equations is A = 0.93180, b = 1.28496. Then a = eA = 2.53907. Hence, the fitted curve is y = 2.53907 x1.28496 . 1.4.2
Exponential curve
This is a particular case of geometric curve. Let the exponential curve be y = aebx .
(1.19)
Here the parameters are a and b and their values are to be determined by least squares method. Taking logarithm on both sides of the equation (1.19), we get log y = log a + bx. This equation can be written as Y = A + bx, where Y = log y, A = log a, i.e. a = eA . The parameters A (and hence a) and b can be determined by the same process discussed in Section 1.2. 1.4.3
Rectangular hyperbola
Let us consider another kind of curve called the rectangular hyperbola whose equation is of the following form: y= 10
1 . a + bx
(1.20)
...................................................................................... This is a non-linear equation and it can be converted to linear form by substituting Y = 1/y. Then the above equation is transformed to Y = a + bx. By using same process discussed in Section 1.2, we can fit the rectangular hyperbola.
1.5 Weighted least squares method This is an extension of the least squares method. Weighted least squares method has two fold importance. If a particular sample point, say (xk , yk ) is repeated for wk (say) times, then this method is useful. Here, wk is an integer. Again, if a particular point(s), say (xk , yk ) is (are) very significant, then the amount of importance can be represented by a weight say wk . In this case, wk need not be an integer. Some times it may required that the curve must passes through some given points. For this case also the weighted least squares method is useful. If all the sample points have same significance, then the weights are taken as 1. This method is similar to the least squares method, only the weights are incorporated in this new method. 1.5.1
Fitting of weighted straight line
Let y = a + bx be the straight line to be fitted for the sample (xi , yi ), i = 1, 2, . . . , n. Each sample point (xi , yi ) is associated with a weight wi , i = 1, 2, . . . , n, where wi is a real number. Let the predicted value of y at x = xi be Yi . Then the error for the ith sample point is yi − Yi and the sum of square of residuals is
S=
n X
h i2 wi yi − (a + bxi ) .
(1.21)
i=1
Like previous cases, for the minimum S, ∂S = 0, ∂a
∂S = 0. ∂b 11
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Least Squares Method These give −2
n X
wi [yi − (a + bxi )] = 0
i=1
and
−2
n X
wi [yi − (a + bxi )]xi = 0
i=1
After simplification, the above equations reduce to a
n X i=1
wi + b
n X
wi x i =
i=1
n X
wi yi
and
a
n X
i=1
wi x i + b
i=1
n X
wi x2i
=
i=1
n X
wi x i yi .
i=1
These are the normal equations and solving these equations we obtain the values of a and b. This method is illustrated in the following example. The same method can be used to fit other type of curves in case of weighted data. Example 1.4 Consider the following sample: x
:
0
3
6
9
y
:
9
16
21
28
(a) Fit a straight line for this data. (b) Fit a straight line by considering the weights of the sample points (3,16) and (6,21) as 6 and 15 respectively. (b) Again, fit the line by considering the modified weights 20 and 30 for the same sample points. Solution. Let the straight line be y = a + bx. (a) As in previous case, the fitted straight line is y1 = 9.2000 + 2.0667x. (b) In case of weighted least squares method, the normal equations to fit the straight line are X X X a wi + b wi x i = wi yi X X X and a wi x i + b wi x2i = wi x i yi . In this case, the weights are 1, 6, 15, 1. The calculations are shown in the following table. 12
......................................................................................
x
y
w
wx
wx2
wy
wxy
0
9
1
0
0
9
0
3
16
6
18
54
96
288
6
21
15
90
540
315
1890
9
28
1
9
81
28
252
Total
23
117
675
448
2430
Then the normal equations are 23a + 117b = 448 and 117a + 675b = 2430. The solution is a = 9.8529, b = 1.8921. Thus the fitted line is y2 = 9.8529 + 1.8921x. (c) In this case, the weights are 1, 20, 30, 1. The calculations are shown in the following table. x
y
w
wx
wx2
wy
wxy
0
9
1
0
0
9
0
3
16
20
60
180
320
960
6
21
30
180
1080
630
3780
9
28
1
9
81
28
252
Total
52
249
1341
987
4992
Then the normal equations are 52a + 249b = 987 and 249a + 1341b = 4992 and the solution is a = 10.4202, b = 1.7877. Hence, the fitted line is y3 = 10.4202 + 1.7877x. Estimation of error. x
y
y1
y2
y3
|y − y1 |
|y − y2 |
|y − y3 |
0 3 6 9
9 16 21 28
9.2000 15.4001 21.6002 27.8003
9.8529 15.5292 21.2055 26.8818
10.4202 15.7833 21.1464 26.5095
0.2000 0.5999 0.6002 0.1997
0.8529 0.4708 0.2055 1.1182
1.4202 0.2167 0.1464 1.4905
1.5998
2.6474
3.2738
Sum of square of errors
13
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Least Squares Method Note that the sum of the square of errors is minimum for the line y1 , than the lines for y2 and y3 . When the weights on second and third sample points are increased then the absolute errors in y are reduced at these points, but, the sum of square of errors is increased for the lines y2 and y3 . The three lines are shown in Figure 1.2.
Figure 1.2: Comparison of three fitted lines
14
.
Chapter 3 Approximation of Functions
Module No. 2 Approximation of Function using Orthogonal Polynomials
2.1. Least squares method for continuous data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . In the previous module, the least squares method is used to fit a curve from a given sample. That is, from a discrete sample (xi , yi ), i = 1, 2, . . . , n a pre-assumed curve is fitted. The (absolute) error is determined for each sample point and from these errors, the sum of square of errors S is computed. Since the sample is discrete, so the absolute errors for the sample points are discrete, therefore the sum of all such errors is determined. The least squares method can also be used for continuous data. In this case, the sum of square of residuals is determined by replacing summation by integration. This method is discussed in the next section.
2.1 Least squares method for continuous data Let y = f (x) be a continuous function defined on the closed interval [a, b]. We approximate this function to a kth degree polynomial defined below y = a0 + a1 x + a2 x2 + · · · + ak xk .
(2.1)
Let w(x) be a suitable weight function and Y be the predicted value of y obtained from the equation (2.1) for a particular value of x. Then the total (sum of) square of the residuals S is defined as Z S=
b
w(x)(y − Y )2 dx =
a
Z
b
w(x)[y − (a0 + a1 x + a2 x2 + · · · + ak xk )]2 dx, (2.2)
a
where a0 , a1 , . . . , ak are parameters and their values are to be determined by minimizing S. For minimum S, ∂S ∂S ∂S = = ··· = = 0. ∂a0 ∂a1 ∂ak
(2.3)
These are called the normal equations. After differentiation, the above equations become 1
. . . . . . . . . . . . . . . . . Approximation of Function using Orthogonal Polynomials
Z b −2 w(x)[y − (a0 + a1 x + a2 x2 + · · · + ak xk )] dx = 0 a Z b −2 w(x)[y − (a0 + a1 x + a2 x2 + · · · + ak xk )]x dx = 0 a Z b w(x)[y − (a0 + a1 x + a2 x2 + · · · + ak xk )]x2 dx = 0 −2 a
.. . Z −2
.. .
b
w(x)[y − (a0 + a1 x + a2 x2 + · · · + ak xk )]xk dx = 0.
a
Assume that the term by term integration is valid. Then after simplification these equations simplified as
Z
b
Z
b
b
Z
Z
k
b
w(x)dx + a1 xw(x) dx + · · · + ak x w(x) dx = w(x)y dx a a a a Z b Z b Z b Z b a0 xw(x)dx + a1 x2 w(x) dx + · · · + ak xk+1 w(x) dx = w(x)xy dx a a a a Z b Z b Z b Z b a0 x2 w(x)dx + a1 x3 w(x) dx + · · · + ak xk+2 w(x) dx = w(x)x2 y dx a0
a
a
a
a
.. . Z a0
b
xk w(x)dx + a1
a
b
Z
xk+1 w(x) dx + · · · + ak
a
.. . Z
b
x2k w(x) dx =
a
(2.4) Z
b
w(x)xk y dx
a
Remember that the weight function w(x) and the given function y = f (x) are known. Therefore, the above equations is a system of (k + 1) linear equations with (k + 1) unknowns a0 , a1 , . . . , ak . Let a0 = a∗0 , a1 = a∗1 , . . . , ak = a∗k be the solution of the above system of equations. Thus, the approximate polynomial corresponding to the function y = f (x) is y = a∗0 + a∗1 x + a∗2 x2 + · · · + a∗k xk .
2
2.1. Least squares method for continuous data . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Example 2.1 Use least squares method to approximate the function y = xex to a quadratic polynomial (yl ) on [0, 1]. Also, compare the values of yl , y3 and y4 , where y3 and y4 are the Taylor’s series approximations of xex up to third and fourth degree terms. Solution. For simplicity, we assume that w(x) = 1. Let the quadratic polynomial be y = a + bx + cx2 , which approximates the function xex , where a, b and c are the parameters. The total square of residuals S is Z 1 Z 1 [xex − (a + bx + cx2 )]2 dx. [y − (a + bx + cx2 )]2 dx = S= 0
0
Thus, the normal equations are Z 1 Z 1 2 [a + bx + cx ] dx = y dx 0 0 Z 1 Z 1 2 3 [ax + bx + cx ] dx = xy dx 0 0 Z 1 Z 1 [ax2 + bx3 + cx4 ] dx = x2 y dx. 0
0
That is, Z
1
Z
1
Z
1
Z
1
dx + b x dx + c x dx = xex dx 0 0 0 0 Z 1 Z 1 Z 1 Z 1 a x dx + b x2 dx + c x3 dx = x2 ex dx 0 0 0 0 Z 1 Z 1 Z 1 Z 1 a x2 dx + b x3 dx + c x4 dx = x3 ex dx. a
0
0
2
0
0
After simplification these equations reduce to 1 a+ b+ 2 1 1 a+ b+ 2 3 1 1 a+ b+ 3 4
1 c = 1, 3 1 c = e − 2, 4 1 c = 6 − 2e. 5
The solution of these equations is a = 0.0449, b = 0.4916, c = 2.1278. 3
. . . . . . . . . . . . . . . . . Approximation of Function using Orthogonal Polynomials Thus, the quadratic approximation to the curve y = xex is yl = 0.0449 + 0.4916x + 2.1278x2 .
(2.5)
The Taylor’s series expansion of the function y = xex up to third and fourth degree terms are x3 2 x3 x4 y4 = x + x2 + + . 2 6
y3 = x + x2 +
(2.6) (2.7)
The values of y for some values of x on [0, 1] obtained from (2.5), (2.6) and (2.7) are tabulated below. x
yl
y3
y4
Exact
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
0.044900 0.115338 0.228332 0.383882 0.581988 0.822650 1.105868 1.431642 1.799972 2.210858
0.000000 0.105500 0.224000 0.358500 0.512000 0.687500 0.888000 1.116500 1.376000 1.669500
0.000000 0.105517 0.224267 0.359850 0.516267 0.697917 0.909600 1.156517 1.444267 1.778850
0.000000 0.110517 0.244281 0.404958 0.596730 0.824361 1.093271 1.409627 1.780433 2.213643
The above table shows that the least squares quadratic approximation produces better result as compared to both Taylor’s series approximation up to third degree term, and fourth degree approximation of Taylor series.
2.2 Use of orthogonal polynomials In previous section, it is seen that the least squares method generates a system of linear equations and it takes time to find the values of the parameters, particularly for 4
2.1. Least squares method for continuous data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . large system. But, by using orthogonal polynomials one can determine the parameters easily. In any polynomial of degree k, the fundamental terms are 1, x, x2 , . . . , xk . These terms are called base functions, because, any function or even discrete data are approximated based on these functions. But, in some applications the base functions may be taken as orthogonal polynomials. The use of these functions as base functions has certain advantages. For example, the values of the parameters a0 , a1 , . . . , ak can be determined easily with the help of orthogonal polynomials. A set of polynomials {f0 (x), f1 (x), . . . , fk (x)} is called orthogonal with respect to the weight function w(x) if the inner product between two different orthogonal functions is zero, i.e.
b
Z a
if i 6= j 0, fi (x)fj (x)w(x) dx = Z b 2 fi (x)w(x) dx, if i = j.
(2.8)
a
Let {f0 (x), f1 (x), . . ., fk (x)} be a set of orthogonal polynomials and the given function be approximated as y = a0 f0 (x) + a1 f1 (x) + · · · + ak fk (x),
(2.9)
where fi (x) is a polynomial in x of degree i, i = 0, 1, 2, . . . , n. In this case, the total square of residuals S is Z S=
b
w(x)[y − {a0 f0 (x) + a1 f1 (x) + · · · + ak fk (x)}]2 dx,
(2.10)
a
where w(x) is a suitable weight function, and a0 , a1 , . . . , ak are parameters. The values of these parameters are determine by minimizing S. For minimum S, ∂S ∂S ∂S = 0, = 0, . . . , = 0. ∂a0 ∂a1 ∂ak 5
. . . . . . . . . . . . . . . . . Approximation of Function using Orthogonal Polynomials These equations give the following normal equations. b
Z −2
w(x)[y − {a0 f0 (x) + a1 f1 (x) + · · · + ak fk (x)}]f0 (x) dx = 0 a b
Z −2
w(x)[y − {a0 f0 (x) + a1 f1 (x) + · · · + ak fk (x)}]f1 (x) dx = 0 a
.. . Z
.. .
b
−2
w(x)[y − {a0 f0 (x) + a1 f1 (x) + · · · + ak fk (x)}]fk (x) dx = 0. a
This is a system of (k + 1) equations containing (k + 1) parameters. Let us consider ith equation. After simplification, it becomes Z
b
Z
b
w(x)f1 (x)fi (x) dx + · · · Z b Z b 2 +ai w(x)fi (x) dx + · · · + ak w(x)fk (x)fi (x) dx a a Z b w(x) y fi (x) dx, = a0
w(x)f0 (x)fi (x) dx + a1
a
a
(2.11)
a
i = 0, 1, 2, . . . , k. Using the property of orthogonal polynomial, the equation (2.11) reduces to Z ai
b
w(x)fi2 (x)
a
b
Z dx =
w(x)yfi (x) dx,
i = 0, 1, 2, . . . , k.
a
That is, b
Z
w(x) y fi (x) dx a
ai = Z
,
b
w(x)fi2 (x)
(2.12)
dx
a
for i = 0, 1, 2, . . . , k. Therefore, all the parameters are obtained from the equation (2.12). Notice that this expression is simple and need not require to solve a system of linear equations to find ai ’s. Thus, an outline to approximate a function using orthogonal polynomials is given. Now, we discuss about the shape of such polynomials. Several orthogonal polynomials 6
2.1. Least squares method for continuous data . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Name
fi (x)
Interval
w(x)
Legendre Leguerre Hermite Chebyshev
Pn (x) Ln (x) Hn (x) Tn (x)
[−1, 1] [0, ∞) (−∞, ∞) [−1, 1]
1 e−x 2
e−x (1 − x2 )−1/2
Table 2.1: Some standard orthogonal polynomials.
are available in literature, some standard orthogonal polynomials along with their weight functions are listed in Table 2.1. Any one of these orthogonal polynomials be used to approximate a function. Also, we can generate a set of orthogonal polynomials. A commonly used method to generate a set of orthogonal polynomials is described below.
2.3 Gram-Schmidt orthogonalization process This is a very well known method in linear algebra, to find a set of orthogonal vectors from a set of linearly independent vectors. Let {f0 (x), f1 (x), f2 (x), . . . , fn (x)} be a set of polynomials, where fi (x) is a polynomial in x of degree i. Let w(x) be a weight function defined on the interval [a, b]. The set of orthogonal polynomials {f0∗ (x), f1∗ (x), f2∗ (x), . . . , fn∗ (x)} on the interval [a, b] with respect to the weight function w(x) is determined from the following equation fi∗ (x) = xi −
i−1 X
cir fr∗ (x),
i = 1, 2, . . . , n,
(2.13)
r=0
where cir ’s are constants and f0∗ (x) is taken as 1. The values of cir ’s are determine as follows. The equation (2.13) is multiplied by w(x)fk∗ (x), k = 0, 1, 2, . . . , i − 1 and after integration between a and b, the equation (2.13) becomes Z a
b
fi∗ (x)fk∗ (x)w(x)
Z dx = a
b
xi fk∗ (x)w(x)
dx −
Z bX i−1
cir fr∗ (x)fk∗ (x)w(x) dx.
a r=0
7
. . . . . . . . . . . . . . . . . Approximation of Function using Orthogonal Polynomials Since fr∗ (x) and fk∗ (x) are orthogonal, therefore, b
Z
fr∗ (x)fk∗ (x)w(x) dx = 0.
a
Hence, the above equation reduces to Z b Z b i ∗ x fk (x)w(x) dx − cik f ∗ 2k (x)w(x) dx = 0 a a Rb i ∗ x fk (x) w(x) dx , k = 0, 1, 2, . . . , i − 1. i.e. cik = aR b ∗2 (x) w(x) dx f k a Thus, finally the set of orthogonal polynomials {f0∗ (x), f1∗ (x), f2∗ (x), . . . , fn∗ (x)} is given by f0∗ (x) = 1 fi∗ (x) = xi −
i−1 X
cir fr∗ (x),
i = 1, 2, . . . , n
r=0
Rb
xi fr∗ (x) w(x) dx . where cir = aR b ∗2 (x) w(x) dx f r a
(2.14)
Note 2.1 The Gram-Schmidt process generates a sequence of monic (leading coefficient unity) orthogonal polynomials.
Example 2.2 Taking the weight function w(x) = 1 and using Gram-Schmidt orthogonalization process determine the first five orthogonal polynomials on the interval [−1, 1]. Solution. The first orthogonal polynomial is f0∗ (x) = 1. The second orthogonal polynomial is R1 f1∗ (x)
=x−
c10 f0∗ (x),
where c10 = R−11
x dx
−1 dx
Therefore, f1∗ (x) = x. The third orthogonal polynomial is f2∗ (x) = x2 − c20 f0∗ (x) − c21 f1∗ (x). 8
= 0.
2.1. Least squares method for continuous data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . The constants c20 and c21 are obtained as R1 2 R1 2 x .x dx 1 −1 x dx = , c21 = R−11 = 0. c20 = R 1 3 dx x2 dx −1
−1
Thus, f2∗ (x) = x2 −
1 1 = (3x2 − 1). 3 3
Again, f3∗ (x) = x3 − c30 f0∗ (x) − c31 f1∗ (x) − c32 f2∗ (x) where R1
R1 c30 =
3 −1 x dx R1 −1 dx
= 0, c31 =
Hence,
3 −1 x .x dx R1 2 −1 x dx
R1 3 1 x . (3x2 − 1) dx 3 = , c32 = R−11 1 3 = 0. 5 (3x2 − 1)2 dx −1 9
3 1 f3∗ (x) = x3 − x = (5x3 − 3x). 5 5
The last polynomial is f4∗ (x) = x4 − c40 f0∗ (x) − c41 f1∗ (x) − c42 f2∗ (x) − c43 f3∗ (x). The constants are R1 4 R1 4 R1 4 1 x . (3x2 − 1) dx 1 6 −1 x dx −1 x .x dx c40 = R 1 = , c41 = R 1 = 0, c42 = R−11 1 3 = 2 2 2 5 7 −1 dx −1 x dx −1 9 (3x − 1) dx R1 4 1 x . 5 (5x3 − 3x) dx c43 = R−1 = 0. 1 1 3 2 −1 25 (5x − 3x) dx Hence,
1 6 1 1 − . (3x2 − 1) = (35x4 − 30x2 + 3). 5 7 3 35 Thus the first five orthogonal polynomials are f4∗ (x) = x4 −
f0∗ (x) = 1, f1∗ (x) = x, 1 f2∗ (x) = (3x2 − 1), 3 1 ∗ f3 (x) = (5x3 − 3x) and 5 1 f4∗ (x) = (35x4 − 30x2 + 3). 35 These polynomials are called (monic) Legendre polynomials. 9
.
Chapter 3 Approximation of Functions
Module No. 3 Approximation of Function by Chebyshev Polynomials
...................................................................................... In previous module, it is shown that use of orthogonal polynomials reduced the computational time to approximate a function. Among several orthogonal polynomials, Chebyshev polynomials seem to be more economic. In this module, we use Chebyshev polynomials to approximate a function.
3.1 Chebyshev polynomials The Chebyshev polynomial in x of degree n is denoted by Tn (x) and is defined by Tn (x) = cos(n cos−1 x), on the interval [−1, 1]
n = 0, 1, 2, . . . .
(3.1)
This polynomial can also be written as Tn (x) = cos nθ,
where
x = cos θ.
(3.2)
This polynomial looks like a trigonometric function. It is obvious that T0 (x) = 1 and T1 (x) = x. This polynomial satisfies the following second order differential equation (1 − x2 )
d2 y dy −x + n2 y = 0. 2 dx dx
(3.3)
The Chebyshev polynomials satisfy many interesting properties. Some of them are discussed below. Since cos θ is an even function, therefore from equation (3.1), we have Tn (x) = T−n (x). Also, T2n (x) = T2n (−x) and T2n+1 (−x) = −T2n+1 (x), i.e. Tn (x) is even or odd
functions according as n is even or odd.
Now, we deduce a recurrence relation for Chebyshev polynomial. It will help us to find all higher degree polynomials. From trigonometry, we know cos(n − 1)θ + cos(n + 1)θ = 2 cos nθ. cos θ. Thus, Tn−1 (x) + Tn+1 (x) = 2xTn (x) i.e. Tn+1 (x) = 2xTn (x) − Tn−1 (x),
n = 1, 2, 3, . . . .
(3.4) 1
. . . . . . . . . . . . . . . . . . . . . . Approximation of Function by Chebyshev Polynomials This is a recurrence relation for Chebyshev polynomial. From the recurrence relation (3.4) and T0 (x) = 1, T1 (x) = x, one can generate all order Chebyshev polynomials. The first seven Chebyshev polynomials are T0 (x) = 1 T1 (x) = x T2 (x) = 2x2 − 1
T3 (x) = 4x3 − 3x
(3.5)
T4 (x) = 8x4 − 8x2 + 1
T5 (x) = 16x5 − 20x3 + 5x
T6 (x) = 32x6 − 48x4 + 18x2 − 1
T7 (x) = 64x7 − 112x5 + 56x3 − 7x. The graph of first four Chebyshev polynomials are shown in Figure 3.1. Tn (x) 61
T0 (x)
T4 (x)
T3 (x)
T1 (x) −1
0
- x
1
T2 (x) −1 Figure 3.1: Chebyshev polynomials Tn (x), n = 0, 1, 2, 3, 4. The equation Tn (x) = 0 will be satisfied for n values of x, where the ith value of x is 2
...................................................................................... given by xi = cos
(2i + 1)π , 2n
i = 0, 1, 2, . . . , n − 1.
(3.6)
Since | cos θ| ≤ 1, therefore the zeros of Chebyshev polynomial Tn (x) lie between −1
and 1. Also, all zeros are distinct.
These values are called the Chebyshev abscissas or nodes. From the equation (3.2), it is obvious that |Tn (x)| ≤ 1 for −1 ≤ x ≤ 1. Thus extreme
values of Tn (x) are −1 and 1.
From the recurrence relation (3.4), it also observed that the leading coefficient of
Tn+1 (x) is twice that of Tn (x). Again, T1 (x) = x. Therefore, the coefficient of xn in Tn (x) is 2n−1 , n ≥ 1.
3.2 Orthogonality of Chebyshev polynomials It is mentioned that the Chebyshev polynomials are orthogonal with respect to the weight function (1 − x2 )−1/2 , that is, if Ti (x) and Tj (x) are two Chebyshev polynomials of degree i and j, then
Z
1
−1
Ti (x)Tj (x) √ dx = 0 1 − x2
for i 6= j.
Proof: Let x = cos θ. Then Z 1 Z π Ti (x)Tj (x) √ I= dx = Ti (cos θ)Tj (cos θ) dθ 1 − x2 0 −1 Z π Z 1 π = cos iθ cos jθ dθ = [cos(i + j)θ + cos(j − i)θ] dθ 2 0 0 1 sin(j + i)θ sin(j − i)θ π . = + 2 j +i j−i 0 It is easy to see that when j 6= i, then I = 0. When j = i = 0 then I = π and when
j = i 6= 0 then I = π2 . Thus,
Z
1
−1
0
if j = 6 i Ti (x)Tj (x) √ dx = π if j = i = 0 1 − x2 π/2 if j = i 6= 0
(3.7)
3
. . . . . . . . . . . . . . . . . . . . . . Approximation of Function by Chebyshev Polynomials
Example 3.1 Use Chebyshev polynomials to find least squares approximation of second degree for f (x) = (1 − x2 )5/2 on the interval [−1, 1]. Solution. Let f (x) ≃ a0 T0 (x) + a1 T1 (x) + a2 T2 (x). Then the residue is Z 1 S= w(x)[f (x) − {a0 T0 (x) + a1 T1 (x) + a2 T2 (x)}]2 dx, −1
where w(x) = (1 − x2 )−1/2 . For minimum S,
Now,
∂S = −2 ∂a0
Z
∂S = 0, ∂a0
∂S = 0, ∂a1
∂S = 0. ∂a2
1 −1
w(x)[f (x) − {a0 T0 (x) + a1 T1 (x) + a2 T2 (x)}]T0 (x) dx = 0.
Using the property of orthogonal polynomials, this equation is simplified as Z 1 Z 1 w(x)f (x)T0 (x) dx − w(x)a0 T02 (x) dx = 0. −1
−1
This equation gives R 1
R1 w(x)f (x)T (x) dx (1 − x2 )2 dx 0 a0 = −1 = R 1 −1 R1 2 2 −1/2 dx −1 w(x)T0 (x) dx −1 (1 − x ) =
8/15 16 = . π/2 15π
∂S ∂S = 0 and = 0, we obtain ∂a1 ∂a2 R1 R1 x(1 − x2 )2 dx −1 w(x)f (x)T1 (x) dx a1 = R 1 = R 1 −1 2 2 −1/2 dx 2 −1 w(x)T1 (x) dx −1 x (1 − x ) =0 R1 R1 (1 − x2 )2 (2x2 − 1) dx −1 w(x)f (x)T2 (x) dx a2 = R 1 = R 1 −1 2 2 2 2 −1/2 dx −1 w(x)T2 (x) dx −1 (2x − 1) (1 − x )
Similarly, from the equations
=
−8/21 32 =− . π/5 21π
Thus the Chebyshev approximation is f (x) = 4
16 32 16 32 T0 (x) − T2 (x) = − (2x2 − 1) = 0.82457 − 0.97009x2 . 15π 21π 15π 21π
...................................................................................... From the equation (3.5) it is easy to observed that all powers of x, i.e. all the base functions of a polynomial can be expressed in terms of Chebyshev polynomials. From that equation, we have 1 = T0 (x) x = T1 (x) x2 = 21 [T0 (x) + T2 (x)] x3 = 41 [3T1 (x) + T3 (x)] x4 = 18 [3T0 (x) + 4T2 (x) + T4 (x)] x5 = x6
=
x7
=
1 16 [10T1 (x) 1 32 [10T0 (x) 1 64 [35T1 (x)
(3.8)
+ 5T3 (x) + T5 (x)] + 15T2 (x) + 6T4 (x) + T6 (x)] + 21T3 (x) + 7T5 (x) + T7 (x)].
Thus every polynomial can be approximated in terms of Chebyshev polynomials. Example 3.2 Express x4 − x3 + 3x + 2 in terms of Chebyshev polynomials. Solution. We know, 1 = T0 (x) x = T1 (x) 1 x2 = [T0 (x) + T2 (x)] 2 1 3 x = [3T1 (x) + T3 (x)] 4 1 4 x = [3T0 (x) + 4T2 (x) + T4 (x)] 8 Thus, x4 − x3 + 3x + 2 1 1 = [3T0 (x) + 4T2 (x) + T4 (x)] − [3T1 (x) + T3 (x)] + 3T1 (x) + 2T0 (x) 8 4 19 9 1 1 1 = T0 (x) + T1 (x) + T2 (x) − T3 (x) + T4 (x). 8 4 2 4 8
5
. . . . . . . . . . . . . . . . . . . . . . Approximation of Function by Chebyshev Polynomials
Example 3.3 Express the function ex up to third degree term using Chebyshev polynomials Solution. The function ex up to third degree term is ex ≃ 1 + x + Thus,
1 2 1 x + x3 . 2! 3!
1 1 2 x + x3 2! 3! 1 1 1 1 = T0 (x) + T1 (x) + · [T0 (x) + T2 (x)] + · [3T1 (x) + T3 (x)] 2 2 6 4 1 [30T0 (x) + 27T1 (x) + 6T2 (x) + T3 (x)] . = 24
ex ≃ 1 + x +
3.3 Expansion of function using Chebyshev polynomials From the above example, it is seen that if a function be expanded by Taylor’s series, then it can be approximated by Chebyshev polynomials. Let y = f (x) be any function to be approximated by Chebyshev polynomials. Then by Chebyshev polynomials f (x) can be expressed as f (x) = a0 T0 (x) + a1 T1 (x) + a2 T2 (x) + · · · + ak Tk (x). The coefficients ai are given by
1
Z
ai = Z−11
−1
1 yTi (x) dx 1 − x2 , 1 2 √ Ti (x) dx 1 − x2
√
i = 0, 1, 2, . . . , k.
The denominator of ai is an improper integral and it is not easy to find its value. Therefore, a discretization technique is adopted to approximate a function using Chebyshev polynomials. The orthogonality of Chebyshev polynomials for discrete case is written as
0 if i 6= j n+1 Ti (xk )Tj (xk ) = if i = j = 6 0 2 k=0 n + 1 if i = j = 0
n X
6
(3.9)
......................................................................................
(2k + 1)π where xk = cos , k = 0, 1, 2, . . . , n. 2n + 2 The following theorem can be established using this result. Theorem 3.1 (Chebyshev approximation). The function f (x) can be approximated over [−1, 1] by Chebyshev polynomials as f (x) ≃
n X
ai Ti (x).
(3.10)
i=0
The coefficients ai are given by n n 1 X 1 X a0 = f (xj )T0 (xj ) = f (xj ), n+1 n+1 j=0 j=0 (2j + 1)π xj = cos 2n + 2 n X 2 and ai = f (xj )Ti (xj ) n+1 j=0 n 2 X (2j + 1)iπ = f (xj ) cos n+1 2n + 2
(3.11)
(3.12)
j=0
for i = 1, 2, . . . , n.
Example 3.4 Use Chebyshev polynomials to approximate the function f (x) = ex up to second order over the interval [−1, 1]. Solution. The second order Chebyshev approximation of f (x) is f (x) ≃ a0 T0 (x) + a1 T1 (x) + a2 T2 (x). The values of a’s are defined 2
a0 =
1 X f (xj ), n+1
2
ai =
j=0
2 X f (xj )Ti (xj ), n+1
i = 1, 2
j=0
and values of xj are given by xj = cos
(2j + 1)π , j = 0, 1, 2. 6 7
. . . . . . . . . . . . . . . . . . . . . . Approximation of Function by Chebyshev Polynomials That is, x0 = 0.86660254, x1 = 0, x2 = −0.8660254. Hence,
1 a0 = [f (x0 ) + f (x1 ) + f (x2 )] = 1.2660209 3 2 (2j + 1)π 2X f (xj ) cos a1 = = 1.1297721 3 6 j=0 2
a2 =
2X f (xj ) cos 3 j=0
2(2j + 1)π 6
= 0.26602093.
Therefore, f (x) ≃ 1.2660209 T0 (x) + 1.1297721 T1 (x) + 0.2660209 T2 (x) = 1 + 1.1297721x + 0.5320418x2 .
3.4 Minimax principle The error in polynomial interpolation for (n + 1) interpolating points is En (x) = wn+1 (x)
f (n+1) (ξ) , (n + 1)!
where wn+1 (x) = (x − x0 )(x − x1 ) · · · (x − xn ) is a polynomial of degree (n + 1). The upper bound of the error is now determine by
|En (x)| ≤ max |wn+1 (x)| −1≤x≤1
max |f (n+1) (ξ)|
−1≤x≤1
(n + 1)!
.
(3.13)
If the function f (x) is bounded, then max |f (n+1) (ξ)| is finite. So, the error bound −1≤x≤1
depends on the polynomial |wn+1 (x)|.
Again, |wn+1 (x)| depends on the choice of
the nodes x0 , x1 , . . . , xn . Thus, the maximum error depends on the product of two
terms max |wn+1 (x)| and max |f (n+1) (ξ)|. Chebyshev suggested that, arguments −1≤x≤1
−1≤x≤1
x0 , x1 , . . . , xn must be chosen in such a way that wn+1 (x) = 2−n Tn+1 (x). The polynomial 2−n Tn+1 (x) is a monic Chebyshev polynomial and it is denoted by T˜n+1 (x). 8
...................................................................................... For a given n, and all possible choices of the arguments and wn+1 (x) on the interval [−1, 1], the polynomial T˜n+1 (x) = 2−n Tn+1 (x) is the unique choice which satisfies the following inequality max {|T˜n+1 (x)|} ≤ max {|wn+1 (x)|}.
−1≤x≤1
−1≤x≤1
Since, |Tn+1 (x)| ≤ 1, therefore, max {|T˜n+1 (x)|} = 2−n . −1≤x≤1
This property is called minimax principle and the polynomial T˜n+1 (x) = 2−n Tn+1 (x)
or,
T˜n (x) = 21−n Tn (x)
is called minimax polynomial.
3.5 Economization of power series Computation of power series with finite number of terms is easy and unambiguous, but the computation of infinite series is a difficult task. The following questions occur for an infinite series. (a) What is the radius of convergence? (b) How many terms of the series gives the answer correct up to certain significant figures? It is obvious that, if we consider a large number of terms to find the value of the series, then we get the answer with given accuracy. But, it is not computationally better, as it takes large computational time. So, our aim is to find the minimum number of terms such that these terms give the answer with desire accuracy. The Chebyshev polynomials can help us to find such minimum number of terms. We have seen that, every polynomial f (x) = a0 + a1 x + a2 x2 + · · · + an xn
(3.14)
can be expressed in terms of Chebyshev polynomials as f (x) = b0 + b1 T1 (x) + b2 T2 (x) + · · · + bn Tn (x).
(3.15)
It can be shown that the expansion of the form (3.15) converges more rapidly than the form (3.14). Thus the representation of the polynomial f (x) of the form (3.15) is computationally better. 9
. . . . . . . . . . . . . . . . . . . . . . Approximation of Function by Chebyshev Polynomials Computation of a power series with less number of terms or less amount of time, which gives the desired accuracy, is called economization of the power series. This process is illustrated in the following example. Example 3.5 Approximate the power series cos x = 1 −
x2 x4 x6 x8 + − + − ··· 2! 4! 6! 8!
with minimum number of terms which gives the answer of cos(x) correct to four decimal places. Solution. The coefficients of the term x8 is 1/8! = 0.0000248 and it is affected in the fifth decimal place only. Thus the terms
x8 8!
can be truncated.
The reduced series is cos x = 1 −
x2 x4 x6 + − . 2! 4! 6!
(3.16)
Now, the above series in terms of Chebyshev polynomials, is cos x = T0 (x) −
1 1 1 1 · [T0 (x) + T2 (x)] + · [3T0 (x) + 4T2 (x) + T4 (x)] 2! 2 4! 8
1 1 · [10T0 (x) + 15T2 (x) + 6T4 (x) + T6 (x)] 6! 32 = 0.7651910 − 0.2298177 T2 (x) + 0.0049479 T4 (x) − 0.0000434 T6 (x). −
Since, |T6 (x)| ≤ 1, so the term 0.0000434 T6 (x) has no affect in the fourth decimal
place. Therefore, this term is discarded and hence the reduced economized series is given by cos x = 0.7651910 − 0.2298177 T2 (x) + 0.0049479 T4 (x).
In terms of x, the above series reduces to
cos x = 0.7651910 − 0.2298177 (2x2 − 1) + 0.0049479 (8x4 − 8x2 + 1) = 0.9999566 − 0.4992186 x2 + 0.0395832 x4 .
This is the final expression for cos x and it gives the result correct up to four decimal places.
10
.
Chapter 4 Solution of Non-linear Equation
Module No. 1 Newton’s Method to Solve Transcendental Equation
...................................................................................... Finding roots of algebraic and transcendental equations is a very important task. These equations occur in many applications of science and engineering. A function f (x) is called algebraic if each term of f (x) contains only the arithmetic operations between real numbers and x with rational power. On the other hand, a transcendental function includes at least one non-algebraic function, i.e. an exponential function, a logarithmic function, trigonometric functions, etc. An equation f (x) = 0 is called algebraic or transcendental according as f (x) is algebraic or transcendental. The equations x9 − 120x2 + 12 = 0 and x15 + 23x10 − 9x8 + 30x = 0 are the examples
of algebraic equations and the equations log x + xex = 0, 3 sin x − 9x2 + 2x = 0 are the examples of transcendental equations.
Lot of numerical methods are available to solve the equation f (x) = 0. But, each method has some advantages and disadvantages over another method. Mainly, the following points are considered to compare the methods: rate of convergence, domain of applicability, number of evaluation of functions, precomputation step, etc. The commonly used methods to solve an algebraic and transcendental equations are bisection, regula-falsi, secant, fixed point iteration, Newton-Raphson, etc. In this module, only Newton-Raphson method is discussed. It is a very interesting method and rate of convergence of this method is high compare to other methods.
1.1 Newton-Raphson method This method is also known as method of tangent. The Newton-Raphson method is an iteration method and so it needs an initial or starting value. Let f (x) = 0 be the given equation and x0 be the initial guess, i.e. the initial approximate root. Let x1 = x0 + h be an exact root of the equation f (x) = 0, where h is a correction of the root, i.e. the amount of error. Generally, it is assumed that h is small. Therefore, f (x1 ) = 0. Now, by Taylor’s series, the equation f (x1 ) = f (x0 + h) = 0 is expanded as f (x0 ) + hf ′ (x0 ) +
h2 ′′ f (x0 ) + · · · = 0. 2! 1
. . . . . . . . . . . . . . . . . . . . . . . Newton’s Method to Solve Transcendental Equation Since h is small, so the second and higher power terms of h are neglected and then the above equation reduces to f (x0 ) + hf ′ (x0 ) = 0 or, h = −
f (x0 ) . f ′ (x0 )
Note that this is an approximate value of h. Using this h, the value of x1 is x1 = x0 + h = x0 −
f (x0 ) . f ′ (x0 )
(1.1)
It is obvious that x1 is a better approximation of x than x0 . Since x1 is not an exact root of the equation f (x) = 0, therefore another iteration is to be performed to find the next better root. For this purpose, the value of x0 is replaced by x1 in equation (1.1) to get second approximate root x2 . That is, x2 = x1 −
f (x1 ) . f ′ (x1 )
(1.2)
In this way, the (n + 1)th iterated value is given by xn+1 = xn −
f (xn ) . f ′ (xn )
(1.3)
The above formula generates a sequence of numbers x1 , x2 , . . . , xn , . . .. The terms of this sequence go to the exact root ξ. The method will terminate when |xn+1 − xn | ≤ ε, where ε is a pre-assigned very small positive number called the error tolerance.
Note 1.1 This method is also used to find a complex root of the equation f (x) = 0. But, for this case, the initial root is taken as a complex number.
Geometrical interpretation The geometrical interpretation of Newton-Raphson method is shown in Figure 1.1. Here, a tangent is drawn at the point (x0 , f (x0 )) to the curve y = f (x). Let the tangent cuts the x-axis at the point (x1 , 0). Again, a tangent is drawn at the point (x1 , f (x1 )). Suppose this tangent cuts the x-axis at the point (x2 , 0). This process is repeated until the nth iterated root xn coincides with the exact root ξ, for large n. For this reason this method is known as method of tangents. 2
...................................................................................... f (x) 6 6
6 6
O
ξ x2
x1
x0
- x
Figure 1.1: Geometrical interpretation of Newton-Raphson method.
The choice of initial guess x0 is a very serious task. If the initial guess is close to the root then the method converges rapidly. But, if the initial guess is not much close to the root or if it is wrong, then the method may generates an endless cycle. Also, if the initial guess is not close to the exact root, the method may generates a divergent sequence of approximate roots. Thus, to choose the initial guess the following rule is suggested. Let a root of the equation f (x) = 0 be in the interval [a, b]. If f (a) · f ′′ (x) > 0 then x0 = a be taken as the initial guess of the equation f (x) = 0 and if f (b) · f ′′ (x) > 0, then x0 = b be taken as the initial guess. 1.1.1
Convergence of Newton-Raphson method
Suppose a root of the equation f (x) = 0 lies in the interval [a, b]. The Newton-Raphson iteration formula (1.3) is
xi+1 = xi −
f (xi ) = φ(xi ) (say). f ′ (xi )
(1.4)
If ξ is a root of the equation f (x) = 0, therefore ξ = φ(ξ).
(1.5) 3
. . . . . . . . . . . . . . . . . . . . . . . Newton’s Method to Solve Transcendental Equation Subtracting (1.4) from (1.5), ξ − xi+1 = φ(ξ) − φ(xi )
= (ξ − xi )φ′ (ξi ) (by MVT) (where ξi lies between ξ and xi )
Now, substituting i = 0, 1, 2, . . . , n to the above equation and multiplying them we get (ξ − xn+1 ) = (ξ − x0 )φ′ (ξ0 )φ′ (ξ1 ) · · · φ′ (ξn )
or |ξ − xn+1 | = |ξ − x0 ||φ′ (ξ0 )||φ′ (ξ1 )| · · · |φ′ (ξn )|
(1.6)
Let |φ′ (x)| ≤ l for all x ∈ [a, b]. Then from the equation (1.6) |ξ − xn+1 | ≤ ln+1 |ξ − x0 |. Now, if l < 1 then |ξ − xn+1 | → ∞. Therefore, lim xn+1 = ξ. n→∞
Hence, the sequence {xn } converges to ξ for all x ∈ [a, b], if |φ′ (x)| < 1 or d f (x) x − or |f (x) · f ′′ (x)| < |f ′ (x)|2 <1 dx f ′ (x)
(1.7)
within the interval [a, b].
Thus, the Newton-Raphson method converges if the initial guess x0 is chosen sufficiently close to the root and the functions f (x), f ′ (x) and f ′′ (x) are continuous and bounded within [a, b]. This is the sufficient condition for the convergence of the NewtonRaphson method. The rate of convergent of Newton-Raphson method is calculated in the following theorem. Theorem 1.1 The rate of convergence of Newton-Raphson method is quadratic. Proof. The Newton-Raphson iteration formula is xn+1 = xn −
f (xn ) . f ′ (xn )
(1.8)
Let ξ and xn be an exact root and the nth approximate root of the equation f (x) = 0. Let εn be the error occurs at the nth iteration. Then xn = εn + ξ and f (ξ) = 0. 4
...................................................................................... Therefore, from the equation (1.8) εn+1 + ξ = εn + ξ −
f (εn + ξ) f ′ (εn + ξ)
That is, f (ξ) + εn f ′ (ξ) + (ε2n /2)f ′′ (ξ) + · · · [by Taylor’s series expansion] f ′ (ξ) + εn f ′′ (ξ) + · · · ε2n f ′′ (ξ) ′ f (ξ) εn + 2 f ′ (ξ) + · · · [since f (ξ) = 0] = εn − f ′′ (ξ) ′ f (ξ) 1 + εn f ′ (ξ) + · · ·
εn+1 = εn −
ε2 f ′′ (ξ) + ··· = εn − εn + n ′ 2 f (ξ) 1 f ′′ (ξ) = ε2n ′ + O(ε3n ). 2 f (ξ)
f ′′ (ξ) 1 + εn ′ + ··· f (ξ)
−1
Neglecting the third and higher powers of εn , the above expression reduces to εn+1 = Cε2n , where C =
f ′′ (ξ) a constant number. 2f ′ (ξ)
(1.9)
Since the power of εn is 2, therefore the rate of convergence of Newton-Raphson method is quadratic. Example 1.1 Using Newton-Raphson method find a root of the equation x3 − 2 sin x − 2 = 0 correct up to five decimal places.
Solution. Let f (x) = x3 − 2 sin x − 2. One root lies between 1 and 2. Let x0 = 1 be
the initial guess.
The iteration scheme is f (xn ) f ′ (xn ) x3 − 2 sin xn − 2 = xn − n 2 . 3xn − 2 cos xn
xn+1 = xn −
The sequence {xn } for different values of n is shown below. 5
. . . . . . . . . . . . . . . . . . . . . . . Newton’s Method to Solve Transcendental Equation n
xn
xn+1
0
1.000000
2.397806
1
2.397806
1.840550
2
1.840550
1.624820
3
1.624820
1.588385
4
1.588385
1.587366
5
1.587366
1.587365
Therefore, one root of the given equation is 1.58736 correct up to five decimal places. Example 1.2 Find an iteration scheme to find the kth root of a number a and hence find the cube root of 2. Solution. Let x be the kth root of a. Therefore, x = a1/k or xk − a = 0. Let f (x) = xk − a. Then the Newton-Raphson iteration scheme is f (xn ) f ′ (xn ) xk − a k xkn − xkn + a = xn − n k−1 = k xn k xk−1 n 1h a i = (k − 1)xn + k−1 . k xn
xn+1 = xn −
Second part: Here, a = 2 and k = 3. Then the above iteration scheme reduces to xn+1
1h 2 i 2 = 2xn + 2 = 3 xn 3
x3n + 1 x2n
.
All calculations are shown in the following table.
Thus, the value of 6
√ 3
n
xn
xn+1
0
1.00000
1.33333
1
1.33333
1.26389
2
1.26389
1.25993
3
1.25993
1.25992
2 is 1.2599, correct up to four decimal places.
......................................................................................
5 x3n + 1 is an iteration scheme to find a root of the 9 equation f (x) = 0. Find the function f (x).
Example 1.3 Suppose 2xn+1 =
Solution. Let l be a root obtained from the given iteration scheme 2xn+1 =
5 x3n + 1 . 9
Then, lim xn = l. n→∞
h i Now, lim 18xn+1 = 5 lim x3n + 1 . n→∞
n→∞
That is, 18l = (5l3 + 1), or 5l3 − 18l + 1 = 0.
Therefore, the required equation is 5x3 − 18x + 1 = 0, and hence f (x) = 5x2 − 18x + 1. Example 1.4 Discuss the Newton-Raphson method to find a root of the equation x15 − 1 = 0 starting with x0 = 0.5.
Solution. It is obvious that the real roots of the given equation are ±1. Here f (x) = x15 − 1. Therefore,
xn+1 = xn −
x15 14x15 n −1 n +1 = . 14 15xn 15x14 n
14 × (0.5)15 + 1 = 1092.7333. This is 15 × (0.5)14 far away from the root 1. This is because 0.5 is not close enough to the exact root Let the initial guess be x0 = 0.5. Then x1 =
x = 1. But, the initial guess x0 = 0.9 gives the first approximate root as x1 = 1.131416 and it is close to the root 1. This example shows the importance of initial guess in Newton-Raphson method. The Newton-Raphson method may also be used to find the complex root. This is illustrated in the following example.
7
. . . . . . . . . . . . . . . . . . . . . . . Newton’s Method to Solve Transcendental Equation
Example 1.5 Find a complex root of the equation z 3 + 3z 2 + 3z + 2 = 0. An initial guess may be taken as 0.5 + 0.5i. Solution. Let z0 = 0.5+0.5i = (0.5, 0.5) be the initial guess and f (z) = z 3 +3z 2 +3z+2. Then f ′ (z) = 3z 2 + 6z + 3. The Newton-Raphson iteration scheme is zn+1 = zn −
f (zn ) . f ′ (zn )
The values of zn and zn+1 at each iteration are tabulated below: n
zn
zn+1
0 1
( 0.50000000, 0.50000000) (–0.10666668, 0.41333333)
(–0.10666668, 0.41333333) (–0.62715298, 0.53778100)
2 3
(–0.62715298, 0.53778100) (–0.47841841, 1.0874815)
(–0.47841841, 1.0874815) (–0.50884020, 0.90368903)
4 5
(–0.50884020, 0.90368903) (–0.50117314, 0.86686337)
(–0.50117314, 0.86686337) (–0.50000149, 0.86602378)
6 7
(–0.50000149, 0.86602378) (–0.49999994, 0.86602539)
(–0.49999994, 0.86602539) (–0.49999994, 0.86602539)
Thus one complex root is (−0.49999994, 0.86602539), i.e. −0.49999994+ 0.86602539 i
correct up to eight decimal places.
1.2 Newton-Raphson method for multiple root Using Newton-Raphson method, one can determined the multiple root of the equation f (x) = 0. But, the following modified formula xn+1 = xn − k
f (xn ) f ′ (xn )
(1.10)
gives a more faster convergent scheme, where k is the multiplicity of the root. The term in the formula k1 f ′ (xn ) is the slope of the straight line passing through point (xn , f (xn )) and intersecting the x-axis at the point (xn+1 , 0). Let ξ be a root of the equation f (x) = 0 with multiplicity k. Then ξ is also a root of the equation f ′ (x) = 0 with multiplicity (k − 1). In general, ξ is a root of the
equation f p (x) = 0 with multiplicity (k − p), p < k. If the equation f (x) = 0 has a 8
...................................................................................... root with multiplicity k and if the initial guess is very close to the exact root ξ, then the expressions x0 − k
f (x0 ) , f ′ (x0 )
x0 − (k − 1)
f ′ (x0 ) , f ′′ (x0 )
x0 − (k − 2)
f ′′ (x0 ) f k−1 (x0 ) , . . . , x − 0 f ′′′ (x0 ) f k (x0 )
must have the same value. Theorem 1.2 The rate of convergence of the formula (1.10) is quadratic. Proof. Let ξ be a multiple root of the equation f (x) = 0 of multiplicity k. Therefore, f (ξ) = f ′ (ξ) = f ′′ (ξ) = · · · = f k−1 (ξ) = 0 and f k (ξ) 6= 0. Let xn and εn be the nth
approximate root and the error at this step. Then εn = xn − ξ. Now, from the iteration
scheme (1.10), we have
εn+1 = εn − k = εn − k
f (εn + ξ) f ′ (εn + ξ) f (ξ) + εn f ′ (ξ) + · · · + f ′ (ξ) + εn f ′′ (ξ) + · · · +
= εn − k
εkn k k! f (ξ)
+
εk−1 k−1 (ξ) n (k−1)! f εk−2 n
(k−2)! f
εk+1 k+1 (ξ) n (k+1)! f
εk−1 n k (k−1)! f (ξ)
+
εkn k+1 (ξ) k! f
+
k−1 (ξ) +
εkn k k! f (ξ) εk−1 n
(k−1)! f
+
εk+1 k+1 (ξ) n (k+1)! f
k (ξ)
+
+ ···
εkn k+1 (ξ) k! f
+ ···
+ ··· + ···
−1 εn ε2n f k+1 (ξ) εn f k+1 (ξ) = εn − k + + ··· 1 + + ··· k k(k + 1) f k (ξ) k f k (ξ) εn ε2n f k+1 (ξ) εn f k+1 (ξ) = εn − k + + ··· 1 − + ··· k k(k + 1) f k (ξ) k f k (ξ) ε2 f k+1 (ξ) ε2n f k+1 (ξ) = εn − εn + n − + ··· k + 1 f k (ξ) k f k (ξ) 1 f k+1 (ξ) 2 = εn + O(ε3n ). k(k + 1) f k (ξ)
1 f k+1 (ξ) . Neglecting cube and higher order terms of εn , the above k(k + 1) f k (ξ) equation becomes εn+1 = Cε2n . Let C =
Thus, the rate of convergence of the scheme (1.10) is quadratic. 9
. . . . . . . . . . . . . . . . . . . . . . . Newton’s Method to Solve Transcendental Equation
Example 1.6 Find the multiple root with multiplicity 3 of the equation x4 − x3 − 3x2 + 5x − 2 = 0.
Solution. Let the initial guess be x0 = 0.5. Also, let f (x) = x4 − x3 − 3x2 + 5x − 2. f ′ (x) = 4x3 − 3x2 − 6x + 5, f ′′ (x) = 12x2 − 6x − 6, f ′′′ (x) = 24x − 6.
The first iterated values are f (x0 ) f (0.5) x1 = x0 − 3 ′ = 0.5 − 3 ′ = 1.035714 f (x0 ) f (0.5) f ′ (x0 ) f ′ (0.5) x1 = x0 − 2 ′′ = 0.5 − 2 ′′ = 1.083333 and f (x0 ) f (0.5) f ′′ (0.5) f ′′ (x0 ) = 0.5 − ′′′ = 1.5. x1 = x0 − ′′′ f (x0 ) f (0.5) The first two values of x1 are closed to 1. It indicates that the equation may have a
double root near 1. Let x1 = 1.035714. f (x1 ) f (1.035714) Then x2 = x1 − 3 ′ = 1.035714 − 3 ′ = 1.000139 f (x1 ) f (1.035714) f ′ (x1 ) f ′ (1.035714) x2 = x1 − 2 ′′ = 1.035714 − 2 ′′ = 1.000277 f (x1 ) f (1.035714) f ′′ (x1 ) f ′′ (1.000277) x2 = x1 − ′′′ = 1.000277 − ′′′ = 1.000812. f (x1 ) f (1.000277) Here it is seen that the three values of x2 are very close to 1. So the equation has a multiple root near 1 of multiplicity 3. Let x2 = 1.000139. The third iterated values are f (x2 ) f ′ (x2 ) x3 = x2 − 3 ′ = 1.000000, x3 = x2 − 2 ′′ = 1.000000 and f (x2 ) f (x2 ) f ′′ (x2 ) x3 = x2 − ′′′ = 1.000000. f (x2 ) All the values of x3 are same and hence one root of the equation is 1.000000 correct up to six decimal places, with multiplicity 3.
1.3 Modification on Newton-Raphson method After development of Newton-Raphson method, some modifications have been made on this method. One of them is discussed below. 10
...................................................................................... Note that in the Newton-Raphson method the derivative of the function f (x) is evaluated at each iteration. That is, to find xn+1 , the value of f ′ (xn ) is required for n = 0, 1, 2, . . .. Therefore, at each iteration two functions are evaluated at the point xn , n = 0, 1, 2, . . .. So, a separate method is required to find derivatives. Thus, in each iteration of this method more calculations are needed. But, the following proposed method can reduced the computational effort: xn+1 = xn −
f (xn ) . f ′ (x0 )
(1.11)
In this method, the derivative of f (x) is calculated only at the initial guess x0 and obviously it reduces the computation time at each iteration. But, the rate of convergence of this method reduced to 1. This is, proved in the following theorem. Theorem 1.3 The rate of convergence of the modified Newton-Raphson method (1.11) is linear. Solution. Let ξ be an exact root of the equation f (x) = 0 and xn be the approximate root at the nth iteration. Then f (ξ) = 0. Let εn be the error occurs at the nth iteration. Then εn = xn − ξ.
Now, from the formula (1.11), we have f (εn + ξ) f (ξ) + εn f ′ (ξ) + · · · εn+1 = εn − = ε − n f ′ (x0 ) f ′ (x0 ) ′ f (ξ) = εn 1 − ′ + O(ε2n ). f (x0 ) Neglecting square and higher power terms of εn , the above equation reduces to f ′ (ξ) εn+1 = εn 1 − ′ . f (x0 ) Let C = 1 −
tion becomes
f ′ (ξ) , which is free from εn . Using this notation the above error equaf ′ (x0 ) εn+1 = Cεn .
(1.12)
This shows that the rate of convergence of the formula (1.11) is linear. 11
. . . . . . . . . . . . . . . . . . . . . . . Newton’s Method to Solve Transcendental Equation
Example 1.7 Find a root of the equation x3 − 3x2 + 1 = 0 using modified NewtonRaphson formula (1.11) and Newton-Raphson method correct up to four decimal places.
Solution. Let f (x) = x3 − 3x2 + 1. One root of this equation lies between 0 and 1.
Let the initial guess be x0 = 0.5. Now, f ′ (x) = 3x2 − 6x and hence f ′ (x0 ) = −2.25. The iteration scheme for the formula (1.11) is
f (xn ) f ′ (x0 ) x3 − 3x2n + 1 x3 − 3x2n + 2.25xn + 1 = xn − n = n . −2.25 2.25
xn+1 = xn −
All the approximate roots are calculated in the following table. n
xn
xn+1
0
0.50000
0.66667
1
0.66667
0.65021
2
0.65021
0.65313
3
0.65313
0.65263
4
0.65263
0.65272
5
0.65272
0.65270
Therefore, 0.6527 is a root of the given equation correct up to four decimal places. By Newton-Raphson method The iteration scheme for Newton-Raphson method is f (xn ) f ′ (xn ) 2x3 − 3x2n − 1 x3 − 3x2 + 1 = n2 . = xn − n 2 n 3xn − 6xn 3xn − 6xn
xn+1 = xn −
Let x0 = 0.5. The successive iterations are shown below.
12
n
xn
xn+1
0
0.50000
0.66667
1
0.66667
0.65278
2
0.65278
0.65270
3
0.65270
0.65270
...................................................................................... Therefore, 0.6527 is a root correct up to four decimal places. In this example, Newton-Raphson method takes less number of iterations whereas as expected the modified formula (1.11) needs more iterations.
13
.
Chapter 4 Solution of Non-linear Equation
Module No. 2 Roots of a Polynomial Equation
...................................................................................... Derivation of all roots of a polynomial equation is a very important task. In many applications of science and engineering all roots of a polynomial equations are needed to solve a particular problem. For example, to find the poles, singularities, etc. of a function, the zeros of the denominator (polynomial) are needed. The available analytic methods are useful when the degree of the polynomial is at most four. So, numerical methods are required to find the roots of the higher degree polynomial equations. Fortunately, many direct and iterated numerical methods are developed to find all the roots of a polynomial equation. In this module, two iterated methods, viz. Birge-Vieta and Bairstow methods are discussed.
2.1 Roots of polynomial equations Let Pn (x) be a polynomial in x of degree n. If a0 , a1 , . . . , an are coefficients of Pn (x), then equation Pn (x) = 0 can be written in explicit form as Pn (x) ≡ a0 xn + a1 xn−1 + · · · + an−1 x + an = 0.
(2.1)
Here, we assumed that the coefficients a0 , a1 , . . . , an are real numbers. A number ξ (may be real or complex) is a root of the polynomial equation Pn (x) = 0 if and only if Pn (ξ) = 0. That is, Pn (x) is exactly divisible by x − ξ. If Pn (x) is exactly divisible by (x − ξ)k (k ≥ 1), but it is not divisible by (x − ξ)k+1 , then ξ is called a root of multiplicity k. The roots of multiplicity k = 1 are called simple roots or single roots. From fundamental theorem of algebra, we know that every polynomial equation has a root. More precisely, every polynomial equation Pn (x) = 0, (n ≥ 1) with any numerical coefficients has exactly n, real or complex roots. The roots of any polynomial equation are either real or complex. If the coefficients of the equation are real and it has a complex root α + iβ of multiplicity k, then α − iβ must be another complex root of the equation with multiplicity k. Let a0 xn + a1 xn−1 + · · · + an−1 x + an = 0,
(2.2)
be a polynomial equation, where a0 , a1 , . . . , an are real coefficients. Also, let A = 1
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Roots of a Polynomial Equation max{|a1 |, |a2 |, . . . , |an |} and B = max{|a0 |, |a1 |, . . . , |an−1 |}. Then the magnitude of a 1 A root of the equation (2.2) lies between and 1 + . 1 + B/|an | |a0 | The other methods are also available to find the upper bound of the positive roots of the polynomial equation. Two such results are stated below: Theorem 2.1 (Lagrange’s). If the coefficients of the polynomial a0 xn + a1 xn−1 + · · · + an−1 x + an = 0 satisfy the conditions a0 > 0, a1 , a2 , . . . , am−1 ≥ 0, am < 0, for some m ≤ n, then the p upper bound of the positive roots of the equation is 1 + m B/a0 , where B is the greatest of the absolute values of the negative coefficients of the polynomial.
Theorem 2.2 (Newton’s). If for x = c the polynomial f (x) = a0 xn + a1 xn−1 + · · · + an−1 x + an and its derivatives f 0 (x), f 00 (x), . . . assume positive values then c is the upper bound of the positive roots of the equation f (x) = 0. In the following sections, two iteration methods, viz. Birge-Vieta and Bairstow methods are discussed to find all the roots of a polynomial equation of degree n.
2.2 Birge-Vieta method In Module 1 of this chapter, Newton-Raphson method is described to find a root of an algebraic and transcendental equations. The rate of convergence of this method is quadratic. So, Newton-Raphson method can be used to find a root of a polynomial equation as polynomial equation is an algebraic equation. Birge-Vieta method is based on the Newton-Raphson method. Let ξ be a root of the polynomial equation Pn (x) = 0 (Pn (x) is a polynomial in x of degree n). Then (x − ξ) is a factor of the polynomial Pn (x). Thus, the problem is to find the root ξ. This root can be determined by Newton-Raphson method. Then Pn (x) is divided by the factor (x − ξ) and obtained the quotient Qn−1 (x) of degree n − 1. 2
...................................................................................... Let the polynomial equation be Pn (x) = xn + a1 xn−1 + a2 xn−2 + · · · + an−1 x + an = 0.
(2.3)
Assume that Qn−1 (x) and R be the quotient and remainder when Pn (x) is divided by the factor (x − ξ). Here, Qn−1 (x) is a polynomial of degree (n − 1), so it can be written as Qn−1 (x) = xn−1 + b1 xn−2 + b2 xn−3 + · · · + bn−2 x + bn−1 .
(2.4)
Pn (x) = (x − ξ)Qn−1 (x) + R.
(2.5)
Thus,
If ξ is an exact root of the equation Pn (x) = 0, then R must be zero. Thus, the value of R depends on the accuracy of ξ. The Newton-Raphson method or any other method be used to find the value of ξ starting from an initial guess x0 such that R(ξ) = Pn (ξ) = 0.
(2.6)
The Newton-Raphson iteration scheme for the equation Pn (x) = 0 is xk+1 = xk −
Pn (xk ) , k = 0, 1, 2, . . . . Pn0 (xk )
(2.7)
This method determines the approximate value of ξ, so for this ξ, R is not exactly 0, but it is a small number. Since Pn (x) is a polynomial, so it is differentiable everywhere. Also, the values of Pn (xk ) and Pn0 (xk ) can be determined by synthetic division or any other method. To find the polynomial Qn−1 (x) and R, comparing the coefficient of like powers of x on both sides of the equation (2.5). Thus we get the following equations. a1 = b1 − ξ
b 1 = a1 + ξ
a2 = b2 − ξb1 .. .
b2 = a2 + ξb1 .. .
ak = bk − ξbk−1 .. .
bk = ak + ξbk−1 .. .
an = R − ξbn−1
R = an + ξbn−1 3
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Roots of a Polynomial Equation From equation (2.5), Pn (ξ) = R = bn (say).
(2.8)
bk = ak + ξbk−1 , k = 1, 2, . . . , n, with b0 = 1.
(2.9)
Thus,
Therefore, bn is the value of Pn . To determine the value of Pn0 , the equation (2.5) is differentiated with respect to x, i.e. Pn0 (x) = (x − ξ)Q0n−1 (x) + Qn−1 (x). That is, Pn0 (ξ) = Qn−1 (ξ) = ξ n−1 + b1 ξ n−2 + · · · + bn−2 ξ + bn−1 .
(2.10)
Pn0 (xi ) = xn−1 + b1 xn−2 + · · · + bn−2 xi + bn−1 . i i
(2.11)
Again,
Thus, the evaluation of Pn0 (x) is same as Pn (x). Differentiating (2.9) with respect to ξ, we get dbk−1 dbk = bk−1 + ξ . dξ dξ
(2.12)
dbk = ck−1 . dξ
(2.13)
We denote
Then the equation (2.12) reduces to ck−1 = bk−1 + ξck−2 Therefore, the recurrence relation of ck is ck = bk + ξck−1 , k = 1, 2, . . . , n − 1. 4
(2.14)
...................................................................................... Now, from equation (2.8), we have Pn0 (ξ) =
dR dbn = = cn−1 dξ dξ
[using (2.13)].
Hence, the iteration scheme (2.7) becomes xk+1 = xk −
bn , k = 0, 1, 2, . . . . cn−1
(2.15)
This method is known as Birge-Vieta method. The values of bk and ck are generally written in a tabular form shown in Table 2.1. x0 1 a1 a2
· · · an−2
an−1
an
x0 x0 b1 · · · x0 bn−3 x0 bn−2
x0 bn−1
· · · bn−2
bn = R
x0 1 b1 b2
bn−1
x0 x0 c1 · · · x0 cn−3 x0 cn−2 1 c1 c2
· · · cn−2
cn−1 = Pn0 (x0 )
Table 2.1: Tabular form of b’s and c’s.
Example 2.1 Find all the roots of the polynomial equation x4 +x3 −8x2 −11x−3 = 0. Solution. Let P4 (x) = x4 + x3 − 8x2 − 11x − 3 be the given polynomial. Also, let the initial guess be x0 = −0.5. First iteration for first root –0.5 –0.5
1
1
–8
–11
–3
–0.500000
–0.250000
4.125000
3.437500
0.500000
–8.250000
–6.875000
0.437500 =b4 = P4 (x0 )
–0.500000
–0.000000
4.125000
0.000000
–8.250000
–2.750000=c3 = P40 (x0 )
1 1
Therefore, x1 = x0 −
b4 0.437500 = −0.500000 − = −0.340909. c3 −2.750000
This is the first iterated value. 5
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Roots of a Polynomial Equation Second iteration for first root –0.340909 –0.340909
1 1 1
1
–8
–11
–3
–0.340909
–0.224690
2.803872
2.794135
0.659091
–8.224690
–8.196128
–0.205865=b4
–0.340909
–0.108471
2.840850
0.318182
–8.333161
–5.355278=c3
Then the second iterated root is x2 = x1 −
b4 −0.205865 = −0.340909 − = −0.379351. c3 −5.355278
Third iteration for first root –0.379351 –0.379351
1 1
1
–8
–11
–3
–0.379351
–0.235444
3.124121
2.987720
0.620649
–8.235444
–7.875879
–0.012280=b4
–0.379351
–0.091537
3.158846
1
0.241299 –8.326981 –4.717033=c3 −0.012280 b4 = −0.379351 − = −0.381954. Therefore, x3 = x2 − c3 −4.717033 Fourth iteration for first root –0.381954 –0.381954
1 1
1
–8
–11
–3
–0.381954
–0.236065
3.145798
2.999944
0.618046
–8.236065
–7.854202
–0.000056=b4
–0.381954
–0.090176
3.180241
1
0.236092 –8.326241 –4.673960=c3 b4 −0.000056 Then x4 = x3 − = −0.381954 − = −0.381966. c3 −4.673960 Therefore, one root of the equation is −0.38197. The reduce polynomial is x3 + 0.618034x2 − 8.236068x − 7.854102 = 0 (obtained from the third row of the above table). First iteration for second root Let x0 = 1.0. 6
......................................................................................
1 1
1 1 1
0.618034
–8.236068
–7.854102
1.000000
1.618030
–6.618040
1.618030
–6.618040
–14.472139=b3
1.000000
2.618030
2.618030
–4.000010= c2
Therefore, b3 −14.472139 = 1.000000 − = −2.618026. c2 −4.000010 Second iteration for second root x1 = x0 −
–2.618026 –2.618026
1 1 1
0.618034
–8.236068
–7.854102
–2.618026
5.236043
7.854150
–1.999996
–3.000027
0.000050=b3
–2.618026
12.090104
–4.618022
9.090076= c2
The second iterated value is b3 0.000050 = −2.618032. x2 = x1 − = −2.618026 − c2 9.090076 It is seen that x = −2.61803 is another root. The next reduced equation is x2 − 2.00000x − 3.00003 = 0. Roots of this equation are x = 3.00001, 1.00000. Hence, the roots of the given equation are −0.38197, −2.61803, 3.00001, 1.00000. Note 2.1 The Birga-Vieta method is used to find all real roots of a polynomial equation. But, the current form of this method is not applicable to find the complex roots. After modification, this method may be used to find all roots (real or complex) of a polynomial equation. Since the method is based on Newton-Raphson method, the rate of convergent of this method is quadratic, as the rate of convergent of Newton-Raphson method is quadratic.
7
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Roots of a Polynomial Equation
2.3 Bairstow method This method is also an iterative method. In this method, a quadratic factor is extracted from the polynomial Pn (x) by iteration. As a by product the deflated polynomial (the polynomial obtained by dividing Pn (x) by the quadratic factor) is also obtained. It is well known that the determination of roots (real or complex) of a quadratic equation is easy. Therefore, by extracting all quadratic factors one can determine all the roots of a polynomial equation. This is the basic principle of Bairstow method. Let the polynomial Pn (x) of degree n be xn + a1 xn−1 + a2 xn−2 + · · · + an−1 x + an .
(2.16)
Let x2 + px + q be a factor of the polynomial Pn (x), n > 2. When this polynomial is divided by the factor x2 + px + q, then the quotient is a polynomial of degree (n − 2) and remainder is a linear polynomial. Let the quotient and the remainder be denoted by Qn−2 (x) and M x + N , where M and N are two constants. Using this notation, Pn (x) can be written as Pn (x) = (x2 + px + q)Qn−2 (x) + M x + N.
(2.17)
The polynomial Qn−2 (x) is called deflated polynomial and let it be Qn−2 (x) = xn−2 + b1 xn−3 + · · · + bn−3 x + bn−2 .
(2.18)
It is obvious that the values of M and N depends on p and q. If x2 + px + q is an exact factor of Pn (x), then the remainder M x + N , i.e. M and N must be zero. Thus the main aim of Bairstow method is to find the values of p and q such that M (p, q) = 0 and N (p, q) = 0.
(2.19)
These are two non-linear equations in p and q and these equations can be solved by Newton-Raphson method for two variables (discussed in Module 3 of this chapter). Let (pT , qT ) be the exact values of p and q and ∆p, ∆q be the (errors) corrections to p and q. Therefore, pT = p + ∆p 8
and
qT = q + ∆q.
...................................................................................... Hence, M (pT , qT ) = M (p + ∆p, q + ∆q) = 0
and
N (pT , qT ) = N (p + ∆p, q + ∆q) = 0.
By Taylor’s series expansion, we get ∂M ∂M + ∆q + ··· = 0 ∂p ∂q ∂N ∂N and N (p + ∆p, q + ∆q) = N (p, q) + ∆p + ∆q + · · · = 0. ∂p ∂q M (p + ∆p, q + ∆q) = M (p, q) + ∆p
All the derivatives are evaluated at the approximate value (p, q) of (pT , qT ). Neglecting square and higher powers of ∆p and ∆q, as they are small, the above equations become ∆pMp + ∆qMq = −M
(2.20)
∆pNp + ∆qNq = −N.
(2.21)
Therefore, the values of ∆p and ∆q are obtained by the formulae ∆p = −
M Nq − N Mq N Mp − M Np , ∆q = − . Mp Nq − Mq Np Mp Nq − Mq Np
(2.22)
It is expected that in this stage the values of ∆p and ∆q are either 0 or very small. Now, the coefficients of the deflated polynomial Qn−2 (x) and the expressions for M and N in terms of p and q are computed below. From equation (2.17) xn + a1 xn−1 + a2 xn−2 + · · · + an−1 x + an = (x2 + px + q)(xn−2 + b1 xn−3 + · · · + bn−3 x + bn−2 ) + M x + N. (2.23) Comparing both sides, we get a1 = b1 + p
b1 = a1 − p
a2 = b2 + pb1 + q .. .
b2 = a2 − pb1 − q .. .
ak = bk + pbk−1 + qbk−2 .. .
bk = ak − pbk−1 − qbk−2 .. .
an−1 = M + pbn−2 + qbn−3
M = an−1 − pbn−2 − qbn−3
an = N + qbn−2
N = an − qbn−2 .
(2.24)
9
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Roots of a Polynomial Equation In general, bk = ak − pbk−1 − qbk−2 ,
k = 1, 2, . . . , n.
(2.25)
The values of b0 and b−1 are taken as 1 and 0 respectively. With this notation, the expressions for M and N are M = bn−1 , N = bn + pbn−1 .
(2.26)
Note that M and N depend on b’s. Differentiating the equation (2.25) with respect to p and q to find the partial derivatives of M and N . ∂bk−1 ∂bk−2 ∂bk = −bk−1 − p −q , ∂p ∂p ∂p ∂bk ∂bk−1 ∂bk−2 = −bk−2 − p −q , ∂q ∂q ∂q
∂b0 ∂b−1 = =0 ∂p ∂p ∂b0 ∂b−1 = =0 ∂q ∂q
(2.27) (2.28)
For simplification, we denote ∂bk = −ck−1 , ∂p ∂bk and = −ck−2 . ∂q
k = 1, 2, . . . , n
(2.29) (2.30)
With this notation, the equation (2.27) simplifies as ck−1 = bk−1 − pck−2 − qck−3 .
(2.31)
Also, the equations (2.28) becomes ck−2 = bk−2 − pck−3 − qck−4 .
(2.32)
Hence, the recurrence relation for ck is ck = bk − pck−1 − qck−2 , k = 1, 2, . . . , n − 1 and c0 = 1, c−1 = 0. Therefore,
∂bn−1 = −cn−2 ∂p ∂bn ∂bn−1 Np = +p + bn−1 = bn−1 − cn−1 − pcn−2 ∂p ∂p ∂bn−1 Mq = = −cn−3 ∂q ∂bn ∂bn−1 Nq = +p = −(cn−2 + pcn−3 ). ∂q ∂q
Mp =
10
(2.33)
...................................................................................... From the equation (2.22), the explicit expressions for ∆p and ∆q, are obtained as follows: bn cn−3 − bn−1 cn−2 − cn−3 (cn−1 − bn−1 ) bn−1 (cn−1 − bn−1 ) − bn cn−2 ∆q = − 2 . cn−2 − cn−3 (cn−1 − bn−1 )
∆p = −
c2n−2
(2.34)
Therefore, the improved values of p and q are p + ∆p and q + ∆q. Thus if p0 , q0 be the initial guesses of p and q, then the first approximate values of p and q are p1 = p0 + ∆p
and
q1 = q0 + ∆q.
(2.35)
Table 2.2 is helpful to calculate the values of bk ’s and ck ’s, where p0 and q0 are taken as initial values of p and q. 1 −p0
a1
a2
···
ak
···
an−1
an
−p0
−p0 b1
···
−p0 bk−1
···
−p0 bn−2
−p0 bn−1
−q0
···
−q0 bk−2
···
−q0 bn−3
−q0 bn−2
b1
b2
···
bk
···
bn−1
bn
−p0
−p0 c1
···
−p0 ck−1
···
−p0 cn−2
−q0
···
−q0 ck−2
···
−q0 cn−3
−q0 1 −p0 −q0 1
c1
c2 ··· ck ··· cn−1 Table 2.2: Tabular form of b’s and c’s.
The second approximate values p2 , q2 of p and q are determined from the equations: p2 = p1 + ∆p,
q2 = q1 + ∆q.
In general,
pk+1 = pk + ∆p,
qk+1 = qk + ∆q,
(2.36)
the values of ∆p and ∆q are calculated at p = pk and q = qk . The iteration process to find the values of p and q will be terminated when both |∆p| and |∆q| are very small. The next quadratic factor can be obtained by similar process from the deflated polynomial Qn−2 (x). The values of ∆p and ∆q are obtained by applying Newton-Raphson method for two variables case. Also, the rate of convergence of Newton-Raphson method is quadratic. Hence, the rate of convergence of this method is quadratic. 11
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Roots of a Polynomial Equation
Example 2.2 Extract all the quadratic factors from the equation x4 + 2x3 + 3x2 + 4x + 1 = 0 by using Bairstow method and hence solve this equation. Solution. Let the initial guess of p and q be p0 = 0.5 and q0 = 0.5. First iteration 1
4.000000 1.000000
−0.500000 −0.750000 −0.875000 −1.187500
−0.500000
−0.500000 −0.750000 −0.875000 1.500000
1.750000
2.375000 −1.062500
−0.500000
−0.500000 −0.500000 −0.375000
−0.500000
−0.500000 −0.500000 1
c22
3.000000
−0.500000 1
∆p = −
2.000000
1.000000
0.750000
1.500000
= c1
= c2
= c3
b4 c1 − b3 c2 = 1.978261, − c1 (c3 − b3 )
∆q = −
b3 (c3 − b3 ) − b4 c2 = 0.891304 c22 − c1 (c3 − b3 )
Therefore, p1 = p0 + ∆p = 2.478261, q1 = q0 + ∆q = 1.391304. Second iteration 4.00000 −7.00000 −22.00000
1.00000 −1.38095
−1.38095 −3.61678
2.57143
2.57143
−1.38095
−1.38095 −1.70975
9.92031
2.57143
2.57143
3.18367
1.23810 −7.18367
8.94893
1.00000 ∆p = 0.52695, ∆q = −0.29857.
p2 = p1 + ∆p = 1.90790, q2 = q1 + ∆q = −2.86999.
5.73794
6.73469 −20.68805
2.61905 −8.04535 −4.15506
1.00000
12
11.11025
24.00000
9.04989
...................................................................................... Third iteration 1 −2.478261
2.000000 −2.478261
−1.391304
3.000000
4.000000
1.000000
1.185256 −6.924140
5.597732
−1.391304
0.665407 −3.887237
1 −0.478261
2.793951 −2.258734
−2.478261
7.327033 −21.634426
−2.478261 −1.391304
−1.391304 1 −2.956522
2.710495
4.113422
8.729680 −19.779737
∆p = −0.479568, ∆q = −0.652031. p3 = p2 + ∆p = 1.998693, q3 = q2 + ∆q = 0.739273. Fourth iteration 1 −1.998693
2.000000
4.000000
1.000000
−1.998693 −0.002613 −4.513276
1.027812
−0.739273
3.000000
−0.739273 −0.000967 −1.669363 1
−1.998693
0.001307
2.258114 −0.514242
−1.998693
3.992159 −11.014794
−0.739273
−0.739273 1 −1.997385
0.358449
1.476613
5.511000 −10.052423
∆p = −0.187110, ∆q = −0.258799. p4 = p3 + ∆p = 1.811583, q4 = q3 + ∆q = 0.480474. Fifth iteration 1 −1.811583
2.000000
4.000000
1.000000
−1.811583 −0.341334 −3.945975
0.066131
−0.480474
−0.480474 −0.090530 −1.046566 1
−1.811583
3.000000
0.188417
2.178192 −0.036504
−1.811583
2.940498 −8.402511
−0.480474
−0.480474 1 −1.623165
0.019565
0.779889
4.638216 −7.659126
∆p = −0.015050, ∆q = −0.020515. p5 = p4 + ∆p = 1.796533, q5 = q4 + ∆q = 0.459960. 13
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Roots of a Polynomial Equation Sixth iteration 1 −1.796533
2.000000
4.000000
1.000000
−1.796533 −0.365535 −3.906570
0.000282
−0.459960
−0.459960 −0.093587 −1.000184 1
−1.796533
3.000000
0.203467
2.174505 −0.000157
−1.796533
2.861996 −8.221908
−0.459960
−0.459960 1 −1.593066
0.000098
0.732746
4.576541 −7.489319
∆p = −0.000062, ∆q = −0.000081. p6 = p5 + ∆p = 1.796471, q6 = q5 + ∆q = 0.459879. Note that, ∆p and ∆q are correct up to four decimal places. Thus p = 1.7965, q = 0.4599 correct up to four decimal places. Therefore, a quadratic factor is x2 + 1.7965x + 0.4599 and the deflated polynomial is Q2 (x) = P4 (x)/(x2 + 1.7965x + 0.4599) = x2 + 0.2035x + 2.1745. Thus, P4 (x) = (x2 + 1.7965x + 0.4599)(x2 + 0.2035x + 2.1745). Hence, the roots of the given equation are −0.309212, −1.487258, (−0.1018, 1.4711), (−0.1018, −1.4711).
14
.
Chapter 4 Solution of Non-linear Equation
Module No. 3 Solution of System of Non-linear Equations
......................................................................................
3.1 Nonlinear equations In Module 1 of this chapter, Newton-Raphson method is discussed to solve a nonlinear equation (algebraic or transcendental). Here only one equation is consider at a time. We observed that the solution of a nonlinear equation is a difficult task. The nonlinear equation f (x) = 0 represents a plane curve and if the equation cuts the x-axis at the point (x1 , 0), then x1 is the root of the equation. But, a pair of nonlinear equations f (x, y) = 0, g(x, y) = 0 represent surfaces in 3-dimension. A point of intersection of these surfaces is the solution of the equations. But, it is a very difficult task to find such point of intersection. In this module, three methods, viz. fixed point iteration, Seidel iteration and NewtonRaphson method are discussed to solve a pair of nonlinear equations. These methods can also be extended to solve a system of three or more equations/variables.
3.2 Fixed point iteration method Let a pair of nonlinear equations be f (x, y) = 0 and
g(x, y) = 0.
(3.1)
The equations f (x, y) = 0 and g(x, y) = 0 are either algebraic and/or transcendental. Like fixed point iteration, in case of single variable, these equations are rewritten as x = φ(x, y) and
y = ψ(x, y).
(3.2)
The functions φ and ψ are not unique. The equations f (x, y) = 0, g(x, y) = 0 can be written in many different ways to get φ and ψ. All such representations are not acceptable. They must satisfy some conditions discussed latter. For example, let f (x, y) ≡ x2 y + 2xq− 3y = 0, g(x, y) ≡ xy + ex − y 2 = 0. From first equation, x = 12 (y − x2 y) or x = 3y−2x and from second equation we can write y √ 1 2 x x y = xy + e or y = x (y − e ). 1
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Solution of System of Non-linear Equations Let (ξ, η) be an exact root of the pair of equations (3.2) and let (x0 , y0 ) be the initial guess for this root. The first approximate root (x1 , y1 ) is then determine as x1 = φ(x0 , y0 ), y1 = ψ(x0 , y0 ). Similarly, the second approximate root is given by x2 = φ(x1 , y1 ), y2 = ψ(x1 , y2 ) and on. In general, xn+1 = φ(xn , yn ), yn+1 = ψ(xn , yn ).
(3.3)
Thus, the fixed point iteration method generates a (double) sequence of numbers {(xn , yn )}. If the sequence converges, i.e. lim xn = ξ
and
ξ = φ(ξ, η)
and
n→∞
lim yn = η,
n→∞
then η = ψ(ξ, η).
(3.4)
But, there is no guarantee that the sequence {(xn , yn )} will converges to a root. The sufficient condition for convergent is stated below. Theorem 3.1 Let R be a region containing the root (ξ, η). Assumed that the functions x = φ(x, y), y = ψ(x, y) and their first order partial derivatives are continuous within the region R. If the initial guess (x0 , y0 ) is sufficiently close to (ξ, η) and if ∂φ ∂φ ∂ψ ∂ψ + <1 + < 1, and ∂x ∂y ∂x ∂y
(3.5)
for all (x, y) ∈ R, then the sequence {(xn , yn )} obtained from the equation (3.3) converges to the root (ξ, η). In case of three variables the sufficient condition is stated below. The condition for the functions x = φ(x, y, z), y = ψ(x, y, z), z = ζ(x, y, z) is ∂φ ∂φ ∂φ + + < 1, ∂x ∂y ∂z ∂ψ ∂ψ ∂ψ + + <1 ∂x ∂y ∂z ∂ζ ∂ζ ∂ζ and + + < 1 ∂x ∂y ∂z 2
...................................................................................... for all (x, y, z) ∈ R. Example 3.1 Solve the following system of equations x2 + y 2 − 4x = 0,
x2 + y 2 − 8x + 15 = 0
starting with (3.5, 1.0) by iteration method. Solution. p p From first equation we have x = 2+ 4 − y 2 and from second equation y = 1 − (x − 4)2 . p p Let, φ(x, y) = 2 + 4 − y 2 and ψ(x, y) = 1 − (x − 4)2 . The iteration scheme is p xn+1 = φ(xn , yn ) = 2 + 4 − yn2 , p and yn+1 = ψ(xn , yn ) = 1 − (xn − 4)2 . The value of xn , yn , xn+1 and yn+1 for n = 0, 1, . . . are shown in the following table. n
xn
yn
xn+1
yn+1
0
3.500000
1.000000
3.732051
0.866025
1
3.732051
0.866025
3.802776
0.963433
2
3.802776
0.963433
3.752654
0.980358
3
3.752654
0.980358
3.743243
0.968927
4
3.743243
0.968927
3.749623
0.966476
5
3.749623
0.966476
3.750978
0.968148
6
3.750978
0.968148
3.750054
0.968498
7
3.750054
0.968498
3.749861
0.968260
8
3.749861
0.968260
3.749992
0.968210
9
3.749992
0.968210
3.750020
0.968244
10
3.750020
0.968244
3.750001
0.968251
11
3.750001
0.968251
3.749997
0.968246
Thus, a root correct up to five decimal places is (3.75000, 0.96825).
3
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Solution of System of Non-linear Equations
3.3 Seidal method The above method can be accelerated by modified the iteration scheme (3.3). The modification is very simple. When yn+1 is computed, then the value of xn+1 is already available. So, we can use this updated value of x while calculating yn+1 . Thus, the modified iteration scheme is xn+1 = φ(xn , yn ) yn+1 = ψ(xn+1 , yn ).
(3.6)
This method is called Seidal iteration. In case of three variables this iteration scheme is extended as xn+1 = φ(xn , yn , zn ) yn+1 = ψ(xn+1 , yn , zn )
(3.7)
and zn+1 = ζ(xn+1 , yn+1 , zn ). Example 3.2 Solve the following system of equations x2 + y 2 − 4x = 0,
x2 + y 2 − 8x + 15 = 0
starting with (3.5, 1.0) by Seidal iteration method. Solution. p The first equation can be written as x = 2 + 4 − y 2 and second equation be y = p 1 − (x − 4)2 . p p Thus, φ(x, y) = 2 + 4 − y 2 and ψ(x, y) = 1 − (x − 4)2 . The iteration scheme is p xn+1 = φ(xn , yn ) = 2 + 4 − yn2 , p and yn+1 = ψ(xn+1 , yn ) = 1 − (xn+1 − 4)2 . The value of xn , yn , xn+1 and yn+1 for n = 0, 1, . . . are shown in the following table. 4
......................................................................................
n
xn
yn
xn+1
yn+1
0
−
1.000000
3.500000
1.000000
1
3.500000
1.000000
3.732051
0.963433
2
3.732051
0.963433
3.752654
0.968927
3
3.752654
0.968927
3.749623
0.968148
4
3.749623
0.968148
3.750054
0.968260
5
3.750054
0.968260
3.749992
0.968244
6
3.749992
0.968244
3.750001
0.968246
Therefore, a root correct up to five decimal places is (3.75000, 0.96825). Observed that this problem has been solved in Example 3.1 and iteration method takes 11 iterations, while Seidal method takes only 6 iterations to obtained the same result.
3.4 Newton-Raphson method Another efficient method to solve a pair of nonlinear equations is Newton-Raphson method. Let the pair of equations be f (x, y) = 0 and g(x, y) = 0.
(3.8)
Also, let (x0 , y0 ) be an initial guess to the root (ξ, η). If h0 and k0 be the errors at x0 and y0 respectively, then (x0 + h0 , y0 + k0 ) is a root of the given equations. Therefore, f (x0 + h0 , y0 + k0 ) = 0 g(x0 + h0 , y0 + k0 ) = 0.
(3.9)
If f (x, y) and g(x, y) are differentiable, then by Taylor’s series expansion, we have ! ! ∂f ∂f f (x0 , y0 ) + h0 + k0 + ··· = 0 ∂x ∂y (x0 ,y0 ) (x0 ,y0 ) ! ! ∂g ∂g g(x0 , y0 ) + h0 + k0 + ··· = 0 (3.10) ∂x ∂y (x0 ,y0 )
(x0 ,y0 )
5
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Solution of System of Non-linear Equations Neglecting square and higher order terms of h0 and k0 , the equations of (3.10) reduce to ∂f0 ∂f0 + k0 = −f0 ∂x ∂y ∂g0 ∂g0 h0 + k0 = −g0 ∂x ∂y ∂f0 ∂f etc. where f0 = f (x0 , y0 ), = ∂x ∂x (x0 ,y0 ) The above equations are written as matrix notation shown below: ∂f0 ∂f0 " # " # ∂x ∂y h0 −f 0 = . k −g 0 0 ∂g0 ∂g0 h0
∂x
(3.11)
∂y
This is a system of two equations and two variables. It can be solved by matrix inverse method or Crammer rule or any other method. By Crammer −f0 1 h0 = J0 −g0
rule, the values of h0 and k0 are obtained as ∂f0 ∂f0 ∂f 0 −f 0 ∂x ∂y ∂x 1 , k0 = , where J0 = J0 ∂g0 ∂g0 ∂g0 −g 0 ∂x ∂y ∂x
∂f0 ∂y ∂g0 ∂y
. (3.12)
In this process the second and higher power of h0 and k0 are neglected, so the values of h0 and k0 obtained from equation (3.12) are approximate. Thus, (x0 + h0 , y0 + k0 ) is not an exact root, but it is more better root than the initial root (x0 , y0 ). Let this new approximate root be (x1 , y1 ), where x1 = x0 + h0 , Similarly, the second ∂f1 −f1 ∂y 1 h1 = J1 ∂g1 −g1 ∂y 6
y1 = y0 + k0 .
(3.13)
approximate root is x2 = x1 + h1 , y2 = y1 + k1 , where ∂f1 ∂f1 ∂f1 −f1 ∂x ∂y ∂x , k1 = 1 . , J1 = J 1 ∂g1 ∂g1 ∂g1 −g1 ∂x ∂x ∂y
...................................................................................... All the derivatives are calculated at the point (x1 , y1 ). In general, the (n + 1)th approximate root (xn+1 , yn+1 ) is given by xn+1 = xn + hn ,
yn+1 = yn + kn ,
(3.14)
where −fn 1 hn = Jn −gn
∂fn ∂y ∂gn ∂y
∂fn ∂x 1 , kn = J n ∂gn ∂x
−fn
−gn
∂fn ∂x and Jn = ∂gn ∂x
∂fn ∂y ∂gn ∂y
. (3.15)
Here also all derivatives are calculated at the point (xn , yn ). The method will terminate when |xn+1 − xn | < ε and |yn+1 − yn | < ε, where ε is a very small positive pre-assigned number called the error tolerance. The sufficient condition for convergent of the iteration process is stated below. Theorem 3.2 Let R be a region which contains the root (ξ, η). Let (x0 , y0 ) be an initial guess to a root (ξ, η) of the equations f (x, y) = 0, g(x, y) = 0. If (i) the functions f (x, y), g(x, y) and their first order partial derivatives are continuous and bounded in R, and (ii) Jn 6= 0 in R, then the sequence of approximation xn+1 = xn + hn , yn+1 = yn + kn , where hn and kn are given by (3.15), converges to the root (ξ, η).
Note 3.1 The Newton-Raphson method reduces a pair of non-linear equations to a pair of linear equations in h and k. In general, this method converts a system of non-linear equations to a system of linear equations.
7
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Solution of System of Non-linear Equations
Example 3.3 Solve the pair of nonlinear equations 3x2 − 2y 2 − 1 = 0 and x2 − 2x + 2y − 8 = 0 by Newton-Raphson method. The initial guess may be taken as (2.5, 3.0). Solution. Let f (x, y) = 3x2 − 2y 2 − 1 and g(x, y) = x2 − 2x + 2y − 8. ∂f ∂f ∂g ∂g = 6x, = 4y, = 2x − 2, = 2. ∂x ∂y ∂x ∂y Therefore, "
∂f ∂x ∂g ∂x
#
∂f ∂y ∂g ∂y
" =
6x
−4y
2x − 2
2
# .
First iteration At (x0 , y0 ), "
−f0 −g0
#
" =
0.25
# .
0.75
Therefore, "
15 Since, J0 = 3 By matrix inverse " h0 k0
15
−12
#"
h0
#
"
0.25
#
= 0.75 3 2 k0 −12 = 66. 2 method, # # " # #" " 0.143939 0.25 2 12 1 = = 66 0.159091 0.75 −3 15
Thus, x1 = x0 + h0 = 2.5 + 0.143939 = 2.643939, y1 = y0 + k0 = 3.0 + 0.159091 = 3.159091. This is the first approximate root. Second iteration At 1 ), " " (x1 , y# # −f1 −0.011536 = . −g1 −0.020719 " 15.863637 3.287879 8
−12.636364 2.000000
#"
h1 k1
#
" =
−f1 −g1
#
...................................................................................... 15.863637 J1 = 3.287879 Therefore, # " h1 k1
−12.636364 = 73.274109. 2.000000
=
=
" 2.000000 1 73.274109 −3.287879 " # −0.003888 . −0.003968
12.636364
#"
−0.011536
#
−0.020719
15.863637
Therefore, x2 = x1 + h1 = 2.643939 − 0.003888 = 2.640052, y2 = y1 + k1 = 3.159091 − 0.003968 = 3.155123. Third iteration At 2 ), " " (x2 , y# # −f2 −0.000015 = . −g2 −0.000015 "
15.840309
−12.620492
3.280103
2.000000
#"
h2 k2
#
" =
−f2
#
−g2
15.840309 −12.620492 Here, J2 = = 73.077133. 3.280103 2.000000 Thus, " #" # " # 2.000000 12.620492 −0.000015 h2 1 = 73.077133 −3.280103 15.840309 −0.000015 k2 " # −0.000003 = . −0.000003 Hence, x3 = x2 + h2 = 2.640052 − 0.000003 = 2.640049, y3 = y2 + k2 = 3.155123 − 0.000003 = 3.155120. Thus, one root is x = 2.64005, y = 3.155120 correct up to five decimal places.
9
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Solution of System of Non-linear Equations
Note 3.2 In iteration and Seidal methods, some pre-calculations are required to find the functions φ(x, y) and ψ(x, y). But, finding of these functions is a very difficult task, particularly for three or more variables case. On the other hand, no precalculation is required for Newton-Raphson method. But, in Newton-Raphson method partial derivatives of the functions f (x, y) and g(x, y) are required. No derivatives are required in iteration and Seidal methods. The rate of convergent of Newton-Raphson method is quadratic whereas iteration and Seidal methods it is linear.
10
.
Chapter 5 Solution of System of Linear Equations
Module No. 1 Matrix Inverse Method
...................................................................................... Systems of linear and non-linear equations occur in many applications. Many direct and iterative methods have been developed to solve a system of linear equations. The oldest methods are Cramer's rule and the matrix inverse method, but these depend on evaluating a determinant and computing the inverse of the coefficient matrix. A few methods are available to evaluate a determinant; among them the pivoting method is the most efficient and is applicable to all types of determinants. In this module, the pivoting method is discussed to evaluate a determinant and the inverse of the coefficient matrix. Then the matrix inverse method is described to solve a system of linear equations. Other direct and iteration methods are discussed in the next modules.

A system of m linear equations with n variables is given by

a11 x1 + a12 x2 + · · · + a1n xn = b1
· · · · · · · · · · · · · · · · · · · · · · · ·
ai1 x1 + ai2 x2 + · · · + ain xn = bi                                  (1.1)
· · · · · · · · · · · · · · · · · · · · · · · ·
am1 x1 + am2 x2 + · · · + amn xn = bm .

The quantities x1 , x2 , . . . , xn are the unknowns (variables) of the system, a11 , a12 , . . . , amn are called the coefficients and generally they are known, and the numbers b1 , b2 , . . . , bm are the constant or free terms of the system. The system of equations (1.1) can be written as a single equation:

Σ_{j=1}^{n} aij xj = bi ,   i = 1, 2, . . . , m.                       (1.2)

Also, the entire system (1.1) can be written with the help of matrices as

AX = b,                                                                (1.3)

where

A = [ a11 a12 · · · a1n ]        [ b1 ]           [ x1 ]
    [ a21 a22 · · · a2n ]        [ b2 ]           [ x2 ]
    [ · · · · · · · · · ] ,  b = [ ·· ]  and  X = [ ·· ] .             (1.4)
    [ am1 am2 · · · amn ]        [ bm ]           [ xn ]
A system of linear equations may or may not have a solution. If the system (1.1) has a solution then it is called consistent, otherwise it is called inconsistent or incompatible. A consistent system of linear equations may have a unique solution or multiple solutions. Finding a unique solution is easy, but determining multiple solutions, if they exist, is a complicated problem.

To solve a system of linear equations, usually three types of elementary transformations are applied. These are discussed below.
Interchange: the order of two equations can be changed.
Scaling: multiplication of both sides of an equation by any non-zero number.
Replacement: addition to (subtraction from) both sides of one equation of the corresponding sides of another equation multiplied by any number.

If all the constant terms b1 , b2 , . . . , bm of a system are zero, then the system is called homogeneous, otherwise it is called non-homogeneous.

Two types of methods are available to solve a system of linear equations, viz. direct methods and iteration methods. Among the direct methods, Cramer's rule, matrix inversion, Gauss elimination, matrix factorization, etc. are well known; the most used iteration methods are Jacobi's iteration, Gauss-Seidel's iteration, etc.

In many applications, we have to determine the value of a determinant, so an efficient method is required for this purpose. One efficient method based on pivoting is discussed in the following section.
1.1 Evaluation of determinant
One of the best methods to evaluate a determinant is known as triangularization; it is also known as the Gauss reduction method. The main idea of this method is to convert the given determinant (D) into a lower or upper triangular form by using only elementary row operations. If the determinant is reduced to a triangular form (say D′), then the value of D is obtained by multiplying the diagonal elements of D′.
Let D be a determinant of order n given by

D = | a11 a12 · · · a1n |
    | a21 a22 · · · a2n |
    | · · · · · · · · · |
    | an1 an2 · · · ann |

Using the elementary row operations, D can be reduced to the following upper triangular form:

D′ = | a11  a12      a13      · · ·  a1n       |
     | 0    a22^(1)  a23^(1)  · · ·  a2n^(1)   |
     | 0    0        a33^(2)  · · ·  a3n^(2)   |
     | ···  ···      ···      · · ·  ···       |
     | 0    0        0        · · ·  ann^(n−1) |

To arrive at this form, many elementary operations are required. To convert all the elements of the first column, except the first element, to 0, the following elementary operations are used:

aij^(1) = aij − (ai1 / a11) a1j , for i, j = 2, 3, . . . , n.

Similarly, to convert all the elements of the second column below the second element to 0, the following operations are used:

aij^(2) = aij^(1) − (ai2^(1) / a22^(1)) a2j^(1) , for i, j = 3, 4, . . . , n.

All these elementary operations can be written as

aij^(k) = aij^(k−1) − (aik^(k−1) / akk^(k−1)) akj^(k−1) ;   i, j = k + 1, . . . , n;  k = 1, 2, . . . , n − 1,      (1.5)

with aij^(0) = aij , i, j = 1, 2, . . . , n. Once D′ is available, the value of D is given by

D = a11 a22^(1) a33^(2) · · · ann^(n−1) .

It is observed that the formula for the elementary operations is simple and easy to program. The time taken by this method is O(n³). But there is a serious drawback of this formula, which is discussed below.
To compute the value of aij^(k), one division is required. If akk^(k−1) is zero, then the method fails; if akk^(k−1) is very small, then there is a chance of losing significant digits or of data overflow. To avoid this situation, pivoting techniques are used.

A pivot is the largest magnitude element in a row, or in a column, or in the principal diagonal, or in a leading or trailing sub-matrix of order i (2 ≤ i ≤ n). Let us consider the following matrix to illustrate these terms:

A = [ 0    1    0   −5 ]
    [ 1   −8    3   10 ]
    [ 9    3  −33   18 ]
    [ 4  −40    9   11 ]

For this matrix, 9 is the pivot for the first column, −33 is the pivot for the principal diagonal, −40 is the pivot for the entire matrix, and −8 is the pivot for the leading sub-matrix [0 1; 1 −8].

If any column pivot element (during the elementary operations) is zero or very small relative to the other elements in that column, then we rearrange the remaining rows in such a way that the pivot becomes non-zero, or at least not a very small number. This technique is called pivoting. Pivoting methods are of two types, viz. partial pivoting and complete pivoting; these are discussed below.

1.1.1 Partial pivoting
In the partial pivoting method, the pivot is the largest magnitude element in a column. In the first stage, find the first pivot, which is the largest element in magnitude among the elements of the first column. If it is a11, there is nothing to do; if it is ai1, then interchange rows i and 1. Then apply the elementary row operations to make all the elements of the first column, except the first element, 0. In the next stage, the second pivot is determined by finding the largest element in magnitude among the elements of the second column, leaving the first element; let it be aj2. In this case, interchange the second and jth rows and then apply the elementary row operations. This process continues (n − 1) times. In general, at the kth stage the smallest index j is chosen for which

|ajk^(k)| = max{ |akk^(k)|, |ak+1,k^(k)|, . . . , |ank^(k)| } = max{ |aik^(k)| : i = k, k + 1, . . . , n }

and the rows k and j are interchanged.

Complete pivoting or full pivoting
In partial pivoting, the pivot is chosen from a column. In complete pivoting, the pivot element is the largest element (in magnitude) among all the elements of the determinant. Let it be at the (l, m)th position (taking the first such position); thus alm is the first pivot. Then interchange the first and lth rows, and the first and mth columns. In the second stage, the largest element (in magnitude) is determined among all elements leaving the first row and first column; this element is the second pivot. In this manner, at the kth stage we choose l and m such that

|alm^(k)| = max{ |aij^(k)| : i, j = k, k + 1, . . . , n },

and then interchange the rows k, l and the columns k, m. In this case, akk becomes the kth pivot element. It is obvious that complete pivoting is more complicated than partial pivoting; partial pivoting is easy to program and is generally used for hand calculation as well.

We have mentioned earlier that pivoting is used to find the value of all kinds of determinants. To determine the pivot and to interchange the rows and/or columns, some additional time is required. But for some types of determinants one can determine the value without pivoting. Such types of determinants are stated below.

Note 1.1 If the coefficient matrix A is diagonally dominant, i.e.

Σ_{j=1, j≠i}^{n} |aij| < |aii|   or   Σ_{j=1, j≠i}^{n} |aji| < |aii| ,   for i = 1, 2, . . . , n,      (1.6)

or real symmetric and positive definite, then no pivoting is necessary.
Note 1.2 Every diagonally dominant matrix is non-singular.
Example 1.1 Convert the determinant

A = |  1   0  3 |
    | −2   7  1 |
    |  5  −1  6 |

into upper triangular form using (i) partial pivoting and (ii) complete pivoting, and hence determine the value of A.

Solution. (i) (Partial pivoting) The largest element in the first column is 5, present in the third row, and it is the first pivot of A. Therefore the first and third rows are interchanged, and the reduced determinant is

|  5  −1  6 |
| −2   7  1 |
|  1   0  3 |

Since two rows are interchanged, the value of the determinant is to be multiplied by −1. To keep track of this, a variable sign is used; at this point sign = −1. Now we apply elementary row operations to convert all elements of the first column, except the first, to 0: add 2/5 times the first row to the second row and −1/5 times the first row to the third row, i.e. R2′ = R2 + (2/5)R1 and R3′ = R3 − (1/5)R1 (R2 and R2′ denote the original and the modified second row respectively). The reduced determinant is

| 5  −1     6    |
| 0  33/5   17/5 |
| 0  1/5    9/5  |

Now we determine the second pivot element. In this case the pivot element is already at the (2,2) position, so no interchange is required. Adding −(1/5)/(33/5) = −1/33 times the second row to the third row, i.e. R3′ = R3 − (1/33)R2, the reduced determinant is

| 5  −1     6     |
| 0  33/5   17/5  |
| 0  0      56/33 |

Note that this is an upper triangular determinant, and hence its value is sign × (5)(33/5)(56/33) = −56.

(ii) (Complete pivoting) The largest element in A is 7, at position (2,2). Interchanging the first and second columns (setting sign = −1) and then interchanging the first and second rows (setting sign = −sign = 1), the updated determinant is

|  7  −2  1 |
|  0   1  3 |
| −1   5  6 |

Adding 1/7 times the first row to the third row, i.e. R3′ = R3 + (1/7)R1, the reduced determinant is

| 7  −2     1    |
| 0   1     3    |
| 0  33/7   43/7 |

Now we determine the second pivot element from the trailing sub-matrix obtained by deleting the first row and column, i.e. from [1 3; 33/7 43/7]. The second pivot is 43/7, at the (3,3) position. Interchanging the second and third columns (sign = −sign = −1) and then the second and third rows (sign = 1), the modified determinant is

| 7   1     −2   |
| 0  43/7   33/7 |
| 0   3      1   |

Now we apply the row operation R3′ = R3 − (21/43)R2 and obtain the required upper triangular determinant

| 7   1     −2     |
| 0  43/7   33/7   |
| 0   0    −56/43  |

Hence, the value of the determinant is sign × (7)(43/7)(−56/43) = −56. Observe that the values obtained by both methods are the same, as expected.

Advantages and disadvantages of partial and complete pivoting
In the pivoting method, the symmetry or regularity of the original matrix may be lost. It is easily observed that partial pivoting requires less time, as it needs fewer interchanges than complete pivoting; it also needs fewer comparisons to find the pivot element. A combination of partial and complete pivoting is expected to be very effective, not only for computing a determinant but also for solving a system of linear equations. Pivoting prevents the loss of significant digits.
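The triangularization formula (1.5) combined with partial pivoting translates directly into a short program. The following Python sketch evaluates a determinant this way; the function name and the singularity tolerance are our own illustrative choices. The variable sign plays exactly the role of the variable sign in Example 1.1.

    def determinant(a, eps=1e-12):
        """Evaluate det(a) by reduction to upper triangular form
        with partial pivoting; a is a list of n lists of numbers."""
        n = len(a)
        a = [row[:] for row in a]       # work on a copy
        sign = 1.0
        for k in range(n - 1):
            # pivot: largest |a[i][k]| for i >= k
            p = max(range(k, n), key=lambda i: abs(a[i][k]))
            if abs(a[p][k]) < eps:
                return 0.0              # (numerically) singular
            if p != k:
                a[k], a[p] = a[p], a[k]
                sign = -sign            # a row swap flips the sign
            for i in range(k + 1, n):
                m = a[i][k] / a[k][k]
                for j in range(k, n):
                    a[i][j] -= m * a[k][j]
        d = sign
        for k in range(n):
            d *= a[k][k]                # product of diagonal elements
        return d

    # The determinant of Example 1.1:
    print(determinant([[1, 0, 3], [-2, 7, 1], [5, -1, 6]]))   # -56.0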
1.2 Inverse of a matrix
Let A be a non-singular square matrix. If there exists a matrix B such that AB = I, then B is called the inverse of A, and vice versa. The inverse of a matrix A is denoted by A−1. Using some theory of matrices, it can be shown that the inverse of A is given by

A−1 = adj A / |A| .                                                     (1.7)

The matrix adj A is called the adjoint of A and is defined as

adj A = [ A11 A21 · · · An1 ]
        [ A12 A22 · · · An2 ]
        [ · · · · · · · · · ]
        [ A1n A2n · · · Ann ]

where Aij is the cofactor of aij in |A|. This is the basic definition used to find the inverse of a matrix. But this definition is not suitable for a large matrix, as it needs a huge amount of arithmetic calculation: we have to compute n² cofactors, and each cofactor is a determinant of order (n − 1). It was mentioned in the previous section that to evaluate a determinant of order n, O(n³) arithmetic operations are required. Thus, computing all the cofactors needs O(n³ × n²) = O(n⁵) operations, a huge amount of time for large matrices. Fortunately, many efficient methods are available to find the inverse of a matrix; among them Gauss-Jordan is most popular. In the following, the Gauss-Jordan method is discussed to find the inverse of a square non-singular matrix.
1.2.1 Gauss-Jordan method
In this method, the matrix A is augmented with a unit matrix of the same size, and only elementary row operations are applied to get the inverse of the matrix. Let the order of A be n × n and let it be augmented with the unit matrix I. This augmented matrix is denoted by [A|I]; its order is n × 2n. The augmented matrix is of the following form:

[A|I] = [ a11 a12 · · · a1n | 1 0 · · · 0 ]
        [ a21 a22 · · · a2n | 0 1 · · · 0 ]
        [ · · · · · · · · · | · · · · · · ]
        [ an1 an2 · · · ann | 0 0 · · · 1 ]                             (1.8)

Now the inverse of A is calculated in two phases. In the first phase, the left half of the augmented matrix is converted into an upper triangular matrix by using only elementary row operations. In the second phase, this upper triangular matrix is converted into an identity matrix, again by row operations only. All these operations are applied to the augmented matrix [A|I], and after the second phase the augmented matrix [A|I] is transformed into [I|A−1]. Thus the right half becomes the inverse of A. Symbolically,

[A|I]  −−(Gauss-Jordan)−→  [I|A−1] .
Example 1.2 Use the partial pivoting method to find the inverse of the following matrix:

A = [  2   0  1 ]
    [ −1   3  4 ]
    [  4  −2  0 ]

Solution. The augmented matrix [A|I] is

[A|I] = [  2   0  1 | 1 0 0 ]
        [ −1   3  4 | 0 1 0 ]
        [  4  −2  0 | 0 0 1 ]

Phase 1 (reduction to upper triangular form). In the first column, 4 is the largest element, so it is the first pivot. We interchange the first and third rows to place the pivot element 4 at the (1,1) position:

[  4  −2  0 | 0 0 1 ]
[ −1   3  4 | 0 1 0 ]
[  2   0  1 | 1 0 0 ]

∼ [ 1  −1/2  0 | 0 0 1/4 ]   R1′ = (1/4)R1
  [ −1   3   4 | 0 1  0  ]
  [  2   0   1 | 1 0  0  ]

∼ [ 1  −1/2  0 | 0 0  1/4 ]   R2′ = R2 + R1 , R3′ = R3 − 2R1
  [ 0  5/2   4 | 0 1  1/4 ]
  [ 0   1    1 | 1 0 −1/2 ]

All the elements of the first column, except the first, are now 0. To convert the element at the (3,2) position to 0, we find the largest element (in magnitude) in the second column leaving the first element; it is 5/2, already at the (2,2) position, so no interchange of rows is needed.

∼ [ 1  −1/2  0   | 0  0   1/4  ]   R2′ = (2/5)R2
  [ 0   1    8/5 | 0 2/5  1/10 ]
  [ 0   1    1   | 1  0  −1/2  ]

∼ [ 1  −1/2  0    | 0   0    1/4  ]   R3′ = R3 − R2
  [ 0   1    8/5  | 0  2/5   1/10 ]
  [ 0   0   −3/5  | 1 −2/5  −3/5  ]

∼ [ 1  −1/2  0   |  0    0   1/4  ]   R3′ = −(5/3)R3
  [ 0   1    8/5 |  0   2/5  1/10 ]
  [ 0   0    1   | −5/3  2/3  1   ]

Phase 2 (make the left half a unit matrix).

∼ [ 1  0  4/5 |  0    1/5  3/10 ]   R1′ = R1 + (1/2)R2
  [ 0  1  8/5 |  0    2/5  1/10 ]
  [ 0  0  1   | −5/3  2/3   1   ]

∼ [ 1  0  0 |  4/3  −1/3  −1/2 ]   R1′ = R1 − (4/5)R3 , R2′ = R2 − (8/5)R3
  [ 0  1  0 |  8/3  −2/3  −3/2 ]
  [ 0  0  1 | −5/3   2/3    1  ]

Now the left half is a unit matrix, so the right half is the inverse of the given matrix:

A−1 = [  4/3  −1/3  −1/2 ]
      [  8/3  −2/3  −3/2 ]
      [ −5/3   2/3    1  ]

Complexity of the algorithm
By analyzing each step of the method, it can be shown that the time complexity of computing the inverse of a non-singular n × n matrix is O(n³).
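A compact Python sketch of Gauss-Jordan inversion with partial pivoting follows. Names and the singularity tolerance are illustrative; note that this sketch eliminates above and below each pivot in a single pass, a common variant of the two-phase scheme described above.

    def gauss_jordan_inverse(a, eps=1e-12):
        """Invert the n x n matrix a (list of lists) by Gauss-Jordan
        elimination with partial pivoting."""
        n = len(a)
        # build the augmented matrix [A | I]
        aug = [row[:] + [float(i == j) for j in range(n)]
               for i, row in enumerate(a)]
        for k in range(n):
            # partial pivoting: bring the largest |aug[i][k]|, i >= k, to row k
            p = max(range(k, n), key=lambda i: abs(aug[i][k]))
            if abs(aug[p][k]) < eps:
                raise ValueError("matrix is singular")
            aug[k], aug[p] = aug[p], aug[k]
            piv = aug[k][k]
            aug[k] = [x / piv for x in aug[k]]          # normalize pivot row
            for i in range(n):                          # clear column k elsewhere
                if i != k:
                    m = aug[i][k]
                    aug[i] = [x - m * y for x, y in zip(aug[i], aug[k])]
        return [row[n:] for row in aug]                 # right half is A^(-1)

    # Example 1.2:
    for row in gauss_jordan_inverse([[2, 0, 1], [-1, 3, 4], [4, -2, 0]]):
        print(row)     # rows of A^(-1): [4/3, -1/3, -1/2], [8/3, -2/3, -3/2], ...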
1.3 Matrix inverse method
A system of equations (1.1) can be written in the matrix form (1.3) as Ax = b, where A, b and x are defined in (1.4). The solution of Ax = b is obtained from the equation

x = A−1 b,                                                              (1.9)

where A−1 is the inverse of the matrix A. Thus, the vector x can be obtained by finding the inverse of A and then multiplying it by b.

Example 1.3 Solve the following system of equations by the matrix inverse method:
x1 + 12x2 + 3x3 − 4x4 + 6x5 = 2,
13x1 + 4x2 + 5x3 + 4x5 = 4,
5x1 + 4x2 + 3x3 + 2x4 − 2x5 = 6,
5x1 + 14x2 + 3x4 − 2x5 = 10,
−5x1 + 4x2 + 3x3 + 4x4 + 5x5 = 13.

Solution. The given equations can be written as Ax = b, where

A = [  1  12  3  −4   6 ]        [ x1 ]        [  2 ]
    [ 13   4  5   0   4 ]        [ x2 ]        [  4 ]
    [  5   4  3   2  −2 ] ,  x = [ x3 ] ,  b = [  6 ]
    [  5  14  0   3  −2 ]        [ x4 ]        [ 10 ]
    [ −5   4  3   4   5 ]        [ x5 ]        [ 13 ]

Using the partial pivoting method, the inverse of A is obtained as

A−1 = [ −0.0362   0.0788  −0.0641   0.0357  −0.0309 ]
      [  0.0358  −0.0241   0.0068   0.0464  −0.0024 ]
      [  0.0798  −0.0646   0.3333  −0.1531   0.0280 ]
      [ −0.1186   0.0473  −0.0682   0.0768   0.1079 ]
      [ −0.0178   0.0990  −0.2150   0.0291   0.0679 ]

Thus, the solution vector is

x = A−1 b = (−0.1872, 0.4486, 0.7333, 1.7136, 0.2430)t .

Hence x1 = −0.1872, x2 = 0.4486, x3 = 0.7333, x4 = 1.7136, x5 = 0.2430, correct up to four decimal places.

Note 1.3 It was mentioned earlier that the time to compute the inverse of an n × n matrix is O(n³), and the same order of time is required to multiply two matrices of that order. Hence, the time complexity of solving a system of n linear equations by this method is O(n³).
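For systems of this size, the computation is conveniently checked with NumPy. This is only a verification sketch mirroring the matrix inverse method of this module; in practice numpy.linalg.solve is preferred over forming the inverse explicitly.

    import numpy as np

    # Coefficient matrix and right-hand side of Example 1.3.
    A = np.array([[ 1, 12, 3, -4,  6],
                  [13,  4, 5,  0,  4],
                  [ 5,  4, 3,  2, -2],
                  [ 5, 14, 0,  3, -2],
                  [-5,  4, 3,  4,  5]], dtype=float)
    b = np.array([2, 4, 6, 10, 13], dtype=float)

    x = np.linalg.inv(A) @ b      # matrix inverse method: x = A^(-1) b
    print(np.round(x, 4))         # approx. [-0.1872  0.4486  0.7333  1.7136  0.243]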
Chapter 5 Solution of System of Linear Equations
Module No. 2 Iteration Methods to Solve System of Linear Equations
In the previous module, it was mentioned that apart from direct methods there is another class of methods, called iteration methods, to solve a system of linear equations. Generally, iteration methods are used for large systems. Theoretically, a direct method gives the exact solution, but practically this is not possible due to the finite representation of numbers in a computer. Again, in a direct method there is no scope to update the solution once it is obtained. In an iteration method, it is possible to obtain the roots of a system to a specified accuracy as the limit of a sequence of vectors; the process generating such a sequence is known as an iterative process. A number of iteration methods are available to solve a system of linear equations, viz. Jacobi's iteration method, Gauss-Seidel's iteration method, the relaxation method, etc. The efficiency of an iteration method depends on the choice of the initial vector and on the rate of convergence of the process. Also, iteration methods are not applicable to all types of systems of equations. Let us define some useful terms.
2.1 Some terminologies
Let us consider a system of n linear equations containing n variables:

a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2                                   (2.1)
· · · · · · · · · · · · · · · · · · · · · · · ·
an1 x1 + an2 x2 + · · · + ann xn = bn .

A set of values of the variables x1 , x2 , . . . , xn which satisfies the above equations is called the roots of the system. But such values cannot be determined in one execution: the values obtained in a particular iteration are updated in the next iteration, and this process continues until we get the desired accuracy.

Let xi^(k), i = 1, 2, . . . , n be the kth (k = 1, 2, . . .) iterated value of the variable xi, and let x^(k) = (x1^(k), x2^(k), . . . , xn^(k))t be the solution vector obtained at the kth iteration. The sequence of vectors {x^(k)}, k = 1, 2, . . . is said to converge to a vector x = (x1 , x2 , . . . , xn)t if for each i (= 1, 2, . . . , n)

xi^(k) −→ xi  as  k −→ ∞.                                               (2.2)

Let ξ = (ξ1 , ξ2 , . . . , ξn)t be the exact solution of the given system of linear equations. Then the error εi^(k) of the ith variable xi at the kth iteration is given by

εi^(k) = ξi − xi^(k) .                                                  (2.3)

The errors of all variables at the kth iteration are collected in the vector ε^(k), i.e.

ε^(k) = (ε1^(k), ε2^(k), . . . , εn^(k))t .                             (2.4)

The difference e^(k) between two consecutive iterates is given by

e^(k) = x^(k+1) − x^(k) = ε^(k) − ε^(k+1) ,                             (2.5)

where ei^(k) = xi^(k+1) − xi^(k).

The rate of convergence of an iteration process depends on the method. The rate of convergence for a method applied to a system of linear equations is defined as follows: an iteration method is said to be of order p ≥ 1 if there exists a positive constant A such that for all k

||ε^(k+1)|| ≤ A ||ε^(k)||^p .                                           (2.6)

Now we discuss a simple iteration method to solve a system of linear equations.
2.2 Jacobi's iteration method
Let us consider the following system of n linear equations containing n variables:

a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2                                   (2.7)
· · · · · · · · · · · · · · · · · · · · · · · ·
an1 x1 + an2 x2 + · · · + ann xn = bn .

For this iteration method, we assume that the coefficient matrix is diagonally dominant, i.e. either

Σ_{j=1, j≠i}^{n} |aij| < |aii|  for all i = 1, 2, . . . , n

or

Σ_{i=1, i≠j}^{n} |aij| < |ajj|  for all j = 1, 2, . . . , n.

Now we rewrite the equations (2.7) as

x1 = (1/a11)(b1 − a12 x2 − a13 x3 − · · · − a1n xn)
x2 = (1/a22)(b2 − a21 x1 − a23 x3 − · · · − a2n xn)                     (2.8)
· · · · · · · · · · · · · · · · · · · · · · · ·
xn = (1/ann)(bn − an1 x1 − an2 x2 − · · · − an,n−1 xn−1).

Let x1^(0), x2^(0), . . . , xn^(0) be the initial guess for the solution of the system. In some cases, the initial solution may be taken as the zero vector (0, 0, . . . , 0). This initial solution is substituted into the right hand side of the system (2.8); this gives the first approximate roots x1^(1), x2^(1), . . . , xn^(1) of the given system:

x1^(1) = (1/a11)(b1 − a12 x2^(0) − a13 x3^(0) − · · · − a1n xn^(0))
x2^(1) = (1/a22)(b2 − a21 x1^(0) − a23 x3^(0) − · · · − a2n xn^(0))     (2.9)
· · · · · · · · · · · · · · · · · · · · · · · ·
xn^(1) = (1/ann)(bn − an1 x1^(0) − an2 x2^(0) − · · · − an,n−1 xn−1^(0)).

Again, substituting x1^(1), x2^(1), . . . , xn^(1) into the right hand side of (2.8) gives the second approximate roots x1^(2), x2^(2), . . . , xn^(2). In general, if x1^(k), x2^(k), . . . , xn^(k) are the kth approximate roots, then the (k + 1)th approximate roots are given by

x1^(k+1) = (1/a11)(b1 − a12 x2^(k) − a13 x3^(k) − · · · − a1n xn^(k))
x2^(k+1) = (1/a22)(b2 − a21 x1^(k) − a23 x3^(k) − · · · − a2n xn^(k))   (2.10)
· · · · · · · · · · · · · · · · · · · · · · · ·
xn^(k+1) = (1/ann)(bn − an1 x1^(k) − an2 x2^(k) − · · · − an,n−1 xn−1^(k)),
k = 0, 1, 2, . . . .

This process is repeated until all the roots converge to the required number of significant figures. This process of iteration is called Jacobi's iteration, or simply the method of iteration. But there is a limitation of this method: Jacobi's iteration is not applicable to all systems of linear equations. The sufficient condition for convergence of this method is discussed below.

2.2.1 Convergence of Gauss-Jacobi's iteration
The (k+1)th iterated value of the variable xi obtained by the Gauss-Jacobi iteration method is given by

xi^(k+1) = (1/aii)( bi − Σ_{j=1, j≠i}^{n} aij xj^(k) ),  i = 1, 2, . . . , n.      (2.11)

If ξi is the exact solution for the variable xi, then

ξi = (1/aii)( bi − Σ_{j=1, j≠i}^{n} aij ξj ).                            (2.12)

Thus, the difference between the exact value and the (k+1)th iterated value of the ith variable is

ξi − xi^(k+1) = −(1/aii) Σ_{j=1, j≠i}^{n} aij ( ξj − xj^(k) ),

or

εi^(k+1) = −(1/aii) Σ_{j=1, j≠i}^{n} aij εj^(k) .

That is,

||εi^(k+1)|| ≤ (1/|aii|) Σ_{j≠i} |aij| ||εj^(k)|| ≤ (1/|aii|) Σ_{j≠i} |aij| ||ε^(k)|| .

Let

M = max_i { (1/|aii|) Σ_{j=1, j≠i}^{n} |aij| } .

Then the above inequality reduces to

||ε^(k+1)|| ≤ M ||ε^(k)|| .                                              (2.13)

Here the bound in ||ε^(k)|| is linear, so the rate of convergence of the Gauss-Jacobi method is linear. Again,

||ε^(k+1)|| ≤ M ||ε^(k)|| ≤ M² ||ε^(k−1)|| ≤ · · · ≤ M^(k+1) ||ε^(0)|| .

That is,

||ε^(k)|| ≤ M^k ||ε^(0)|| .                                              (2.14)

If M < 1 then M^k → 0 as k → ∞, and consequently ||ε^(k)|| → 0 as k → ∞; thus the iteration process converges. Hence the sufficient condition for convergence of the Gauss-Jacobi iteration method is M < 1, i.e. Σ_{j=1, j≠i}^{n} |aij| < |aii| for all i. In other words, a sufficient condition for convergence of the Gauss-Jacobi iteration method is that the coefficient matrix be diagonally dominant.

The inequality (2.13) can also be written as

||ε^(k+1)|| ≤ M ||ε^(k)|| = M ||e^(k) + ε^(k+1)||        [by (2.5)]
            ≤ M ||e^(k)|| + M ||ε^(k+1)|| ,

or

||ε^(k+1)|| ≤ ( M / (1 − M) ) ||e^(k)|| .                                (2.15)

From this inequality one can estimate the absolute error at the (k+1)th iteration in terms of the differences at the kth iteration.
Example 2.1 Consider the following system of linear equations:
14x − 3y + 5z = 4
3x + 10y − 2z = 12
5x + 2y + 20z = 16.
Solve these equations by the Gauss-Jacobi method correct up to four decimal places. Also calculate the upper bound of the absolute error.

Solution. Since

|−3| + |5| < |14|,   |3| + |−2| < |10|,   |5| + |2| < |20|,

the system is diagonally dominant, and hence the iteration scheme converges. The Gauss-Jacobi iteration scheme is

x^(k+1) = (1/14)(4 + 3y^(k) − 5z^(k))
y^(k+1) = (1/10)(12 − 3x^(k) + 2z^(k))
z^(k+1) = (1/20)(16 − 5x^(k) − 2y^(k)).

Let the initial solution be (0, 0, 0). The successive iterations are shown in the following table.
k      x          y          z
0      0          0          0
1      0.28571    1.20000    0.80000
2      0.25714    1.27429    0.60857
3      0.34143    1.24457    0.60829
4      0.33516    1.21923    0.59019
5      0.33620    1.21749    0.59429
6      0.33436    1.21800    0.59420
7      0.33450    1.21853    0.59461
8      0.33447    1.21857    0.59452
9      0.33451    1.21856    0.59453
10     0.33450    1.21855    0.59452
11     0.33451    1.21855    0.59452
Thus, the solution correct up to four decimal places is x = 0.3345, y = 1.2186, z = 0.5945. Here

M = max_i { (1/|aii|) Σ_{j≠i} |aij| } = max{ 8/14, 5/10, 7/20 } = 4/7

and e^(10) = (1 × 10⁻⁵, 0, 0). Therefore, the upper bound of the absolute error is

||ε^(10)|| ≤ ( M / (1 − M) ) ||e^(10)|| = 1.3333 × 10⁻⁵.
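A short Python sketch of the Gauss-Jacobi iteration, applied to the system of Example 2.1, could look as follows; the function name and stopping rule are illustrative choices.

    def jacobi(A, b, x0, tol=1e-5, max_iter=100):
        """Gauss-Jacobi iteration for Ax = b; the coefficient matrix
        is assumed to be diagonally dominant."""
        n = len(A)
        x = x0[:]
        for _ in range(max_iter):
            # every component of the new iterate uses only old values
            x_new = [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i))
                     / A[i][i] for i in range(n)]
            if max(abs(x_new[i] - x[i]) for i in range(n)) < tol:
                return x_new
            x = x_new
        return x

    A = [[14, -3, 5], [3, 10, -2], [5, 2, 20]]
    b = [4, 12, 16]
    print(jacobi(A, b, [0.0, 0.0, 0.0]))   # approx. [0.3345, 1.2186, 0.5945]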
2.3 Gauss-Seidel's iteration method
This method is obtained by a simple modification of Jacobi's iteration method, and most of the time it gives faster convergence. Let us consider the same system of n linear equations in n variables:

a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2                                   (2.16)
· · · · · · · · · · · · · · · · · · · · · · · ·
an1 x1 + an2 x2 + · · · + ann xn = bn .

In this method also we assume that the coefficient matrix is diagonally dominant. If it is not, then the equations are rearranged in such a way that this condition holds; if that is not possible, the method may or may not give the solution. As in Jacobi's method, the equations (2.16) are rewritten in the following form:

x1 = (1/a11)(b1 − a12 x2 − a13 x3 − · · · − a1n xn)
x2 = (1/a22)(b2 − a21 x1 − a23 x3 − · · · − a2n xn)                     (2.17)
· · · · · · · · · · · · · · · · · · · · · · · ·
xn = (1/ann)(bn − an1 x1 − an2 x2 − · · · − an,n−1 xn−1).

Let x2^(0), x3^(0), . . . , xn^(0) be the initial values of the variables x2 , x3 , . . . , xn respectively. Substituting these values into the first equation of (2.17) gives the first approximate value x1^(1) of x1. Next, substituting x1^(1) for x1 and x3^(0), x4^(0), . . . , xn^(0) for x3 , x4 , . . . , xn in the second equation of (2.17) gives the first approximate value x2^(1) of x2. In general, substituting x1^(1), . . . , xi−1^(1), xi+1^(0), . . . , xn^(0) for x1 , . . . , xi−1 , xi+1 , . . . , xn in the ith equation of (2.17) gives the first approximation xi^(1) of xi. Continuing this process for all the equations completes the first iteration; in a similar way one can calculate the second and later approximate values of all the variables.

If xi^(k), i = 1, 2, . . . , n are the kth approximate values, then the (k + 1)th approximate values of x1 , x2 , . . . , xn are given by

x1^(k+1) = (1/a11)(b1 − a12 x2^(k) − a13 x3^(k) − · · · − a1n xn^(k))
x2^(k+1) = (1/a22)(b2 − a21 x1^(k+1) − a23 x3^(k) − · · · − a2n xn^(k))          (2.18)
· · · · · · · · · · · · · · · · · · · · · · · ·
xi^(k+1) = (1/aii)(bi − ai1 x1^(k+1) − · · · − ai,i−1 xi−1^(k+1) − ai,i+1 xi+1^(k) − · · · − ain xn^(k))
· · · · · · · · · · · · · · · · · · · · · · · ·
xn^(k+1) = (1/ann)(bn − an1 x1^(k+1) − an2 x2^(k+1) − · · · − an,n−1 xn−1^(k+1)),
k = 0, 1, 2, . . . .

In compact form, the above equations can be written as

xi^(k+1) = (1/aii)( bi − Σ_{j=1}^{i−1} aij xj^(k+1) − Σ_{j=i+1}^{n} aij xj^(k) ),
i = 1, 2, . . . , n  and  k = 0, 1, 2, . . . .

This iteration process is repeated until |xi^(k+1) − xi^(k)| < ε for all i = 1, 2, . . . , n, for a pre-assigned number ε > 0, called the error tolerance. This method is known as Gauss-Seidel's iteration method.

Note that the most recent values of the variables are used to calculate the value of the next variable.
Example 2.2 Solve the system of equations 14x − 3y + 5z = 4, 3x + 10y − 2z = 12, 5x + 2y + 20z = 16 by Gauss-Seidel's iteration method, correct up to four decimal places.

Solution. In this case, the iteration scheme is

x^(k+1) = (1/14)(4 + 3y^(k) − 5z^(k))
y^(k+1) = (1/10)(12 − 3x^(k+1) + 2z^(k))
z^(k+1) = (1/20)(16 − 5x^(k+1) − 2y^(k+1)).

Let y = 0, z = 0 be the initial solution. All calculations are shown in the following table.
k      x          y          z
0      −          0          0
1      0.28571    1.11429    0.61714
2      0.30408    1.23220    0.60076
3      0.33520    1.21959    0.59424
4      0.33483    1.21840    0.59445
5      0.33450    1.21854    0.59452
6      0.33450    1.21855    0.59452
7      0.33450    1.21855    0.59452
The solution correct up to four decimal places is x = 0.3345, y = 1.2186, z = 0.5945.

Note 2.1 Note that for the same system of equations, the Gauss-Jacobi method needs eleven iterations while the Gauss-Seidel method takes only seven. The sufficient condition for convergence of this method is the same as for Jacobi's method, i.e. the coefficient matrix should be diagonally dominant. This is justified in the next section.
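The only change from the Jacobi sketch above is that new values are used as soon as they are available. A minimal Python version (names illustrative):

    def gauss_seidel(A, b, x0, tol=1e-5, max_iter=100):
        """Gauss-Seidel iteration for Ax = b: each component update
        immediately reuses the components already updated in this sweep."""
        n = len(A)
        x = x0[:]
        for _ in range(max_iter):
            diff = 0.0
            for i in range(n):
                s = sum(A[i][j] * x[j] for j in range(n) if j != i)
                new = (b[i] - s) / A[i][i]
                diff = max(diff, abs(new - x[i]))
                x[i] = new        # overwrite in place: the most recent value
            if diff < tol:
                break
        return x

    A = [[14, -3, 5], [3, 10, -2], [5, 2, 20]]
    b = [4, 12, 16]
    print(gauss_seidel(A, b, [0.0, 0.0, 0.0]))   # approx. [0.3345, 1.2186, 0.5945]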
2.3.1 Convergence of Gauss-Seidel's method
Let

M = max_i { (1/|aii|) Σ_{j≠i} |aij| }   and   mi = (1/|aii|) Σ_{j=1}^{i−1} |aij| ,  i = 1, 2, . . . , n.

It is obvious that 0 ≤ mi ≤ M < 1. As in the case of the Gauss-Jacobi method,

|εi^(k+1)| ≤ (1/|aii|) ( Σ_{j<i} |aij| |εj^(k+1)| + Σ_{j>i} |aij| |εj^(k)| )
           ≤ (1/|aii|) ( Σ_{j<i} |aij| ||ε^(k+1)|| + Σ_{j>i} |aij| ||ε^(k)|| )
           ≤ mi ||ε^(k+1)|| + (M − mi) ||ε^(k)|| .

Thus, for a fixed i one can write

||ε^(k+1)|| ≤ mi ||ε^(k+1)|| + (M − mi) ||ε^(k)|| ,

that is,

||ε^(k+1)|| ≤ ( (M − mi) / (1 − mi) ) ||ε^(k)|| .                        (2.19)

Now, (M − mi)/(1 − mi) ≤ M, since 0 ≤ mi ≤ M < 1. Thus the relation (2.19) becomes

||ε^(k+1)|| ≤ M ||ε^(k)|| .                                              (2.20)

Here also the power of ||ε^(k)|| is one, so the rate of convergence of the Gauss-Seidel iteration is also linear. That is, theoretically the rates of convergence of both methods are the same, but in practice, for most problems, the Gauss-Seidel iteration converges faster.

From equation (2.20), successive substitution gives ||ε^(k)|| ≤ M^k ||ε^(0)||. Now, if M < 1, i.e. Σ_{j=1, j≠i}^{n} |aij| < |aii| for all i, then ||ε^(k)|| → 0 as k → ∞. Thus the sufficient condition for the Gauss-Seidel iteration method can be stated as: if the coefficient matrix is diagonally dominant, then the method converges to the roots.

As for the Gauss-Jacobi method, the absolute error at the (k + 1)th iteration satisfies

||ε^(k+1)|| ≤ ( M / (1 − M) ) ||e^(k)||   when M < 1.
Note 2.2 Usually, the Gauss-Seidel method converges more rapidly than the Gauss-Jacobi method. But this is not always true: there are examples in which the Gauss-Jacobi method converges faster than the Gauss-Seidel method.

Example 2.3 Solve the equations 20x + 3y + 4z + 5w = 10.8, 2x + 3y + 8.5z + 25.5w = 21.4, −x + 15y + 3z + 4w = 20.3, x + 10y + 21z + 5w = 15.7 by the Gauss-Seidel method correct up to four significant figures.

Solution. The given system of equations is not diagonally dominant. But the rearranged system
20x + 3y + 4z + 5w = 10.8
−x + 15y + 3z + 4w = 20.3
x + 10y + 21z + 5w = 15.7
2x + 3y + 8.5z + 25.5w = 21.4
is diagonally dominant. Therefore, the Gauss-Seidel iteration scheme is

x^(k+1) = (1/20)(10.8 − 3y^(k) − 4z^(k) − 5w^(k))
y^(k+1) = (1/15)(20.3 + x^(k+1) − 3z^(k) − 4w^(k))
z^(k+1) = (1/21)(15.7 − x^(k+1) − 10y^(k+1) − 5w^(k))
w^(k+1) = (1/25.5)(21.4 − 2x^(k+1) − 3y^(k+1) − 8.5z^(k+1)).

Let y = 0, z = 0, w = 0 be the starting values of y, z, w. The successive iterations are shown below.
k      x          y          z          w
0      −          0          0          0
1      0.54000    1.38933    0.06032    0.61331
2      0.16621    1.18880    0.02758    0.67713
3      0.18688    1.17971    0.01573    0.68052
4      0.18977    1.18136    0.01400    0.68068
5      0.18983    1.18167    0.01381    0.68070
6      0.18981    1.18170    0.01379    0.68071
7      0.18981    1.18171    0.01379    0.68071
Hence, the solution correct up to four decimal places is x = 0.1898, y = 1.1817, z = 0.0138, w = 0.6807. Let us now consider a system of linear equations which is not diagonally dominant.
Example 2.4 Solve the equations 3x + 2y + z = 7, x + 4y + 5z = 15, x + 3y + 10z = 23 using the Gauss-Seidel method.

Solution. The Gauss-Seidel iteration scheme is

x^(k+1) = (1/3)(7 − 2y^(k) − z^(k))
y^(k+1) = (1/4)(15 − x^(k+1) − 5z^(k))
z^(k+1) = (1/10)(23 − x^(k+1) − 3y^(k+1)).

Let y = 0, z = 0 be the initial values of y and z. The successive approximate values are shown below.
k      x           y          z
0      −           0          0
1      2.33333     3.16667    1.21667
2      −0.18333    2.27500    1.73583
3      0.23806     1.52069    1.91999
4      0.67954     1.18013    1.97801
5      0.88724     1.05568    1.99457
6      0.96469     1.01561    1.99885
7      0.98998     1.00395    1.99982
8      0.99743     1.00087    2.00000
9      0.99942     1.00015    2.00001
10     0.99990     1.00001    2.00001
11     0.99999     0.99999    2.00000
12     1.00000     1.00000    2.00000
Therefore, the solution correct up to four decimal places is x = 1.0000, y = 1.0000, z = 2.0000.
Note 2.3 Note that the given system is not diagonally dominant, but the Gauss-Seidel iteration scheme converges to the solution. Let us consider a very interesting problem. Let

x1 + x2 = 2,   x1 − 3x2 = 1

be a system of equations. Note that the coefficient matrix of these equations is not diagonally dominant. Let us consider two different Gauss-Seidel iteration schemes:

x1^(k+1) = 2 − x2^(k),   x2^(k+1) = (1/3)(−1 + x1^(k+1))                 (2.21)

and

x1^(k+1) = 1 + 3x2^(k),   x2^(k+1) = 2 − x1^(k+1).                       (2.22)

The approximate roots at each step are shown below, and the behaviour of the solutions is depicted in Figures 2.1 and 2.2.
k
x1
x2
k
x1
x2
0
-
0
0
-
0
1
2.00000
0.33333
1
1
1
2
1.66667
0.22222
2
4
–2
3
1.77778
0.25926
3
–5
7
4
1.74074
0.24691
4
22
–20
5
1.75309
0.25103
5
–59
61
6
1.74897
0.24966
6
184
–182
7
1.75034
0.25011
7
547
–545
8
1.74989
0.24996
8
–1634
1636
9
1.75004
0.25001
9
4909
–4907
10
1.74999
0.25000
10
–14720
14722
11
1.75000
0.25000
11
44167
–44165
Note that the first scheme converges to a solution, where as second scheme diverges. 13
The first scheme gives the solution of the equations, x1 = 1.75000, x2 = 0.25000, correct up to five decimal places. This example shows that the condition 'diagonally dominant' is a sufficient condition, but not a necessary one, for the Gauss-Seidel iteration method.

[Figure 2.1: Convergence case of Gauss-Seidel's iteration scheme (2.21), showing the lines x1 + x2 = 2 and x1 − 3x2 = 1.]

[Figure 2.2: Divergence case of Gauss-Seidel's iteration scheme (2.22), showing the same two lines.]
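The contrast between schemes (2.21) and (2.22) is easy to reproduce numerically; a small Python sketch:

    # Compare the two Gauss-Seidel schemes (2.21) and (2.22) for
    # x1 + x2 = 2, x1 - 3x2 = 1 (exact solution: x1 = 1.75, x2 = 0.25).
    x1 = x2 = 0.0
    for k in range(12):               # scheme (2.21): converges
        x1 = 2 - x2
        x2 = (-1 + x1) / 3
    print(x1, x2)                     # approx. 1.75000 0.25000

    x1 = x2 = 0.0
    for k in range(12):               # scheme (2.22): diverges
        x1 = 1 + 3 * x2
        x2 = 2 - x1
    print(x1, x2)                     # the iterates blow up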
2.4 Comparison of direct and iteration methods
Mainly two types of methods are used to solve a system of equations, viz. direct and iterative. Both have some advantages and disadvantages, which are discussed below:
(i) The direct methods are applicable to almost all types of problems, whereas the iterative methods are applicable only to particular types of problems.
(ii) In a direct method, the rounding errors may become large, particularly for ill-conditioned systems (discussed in the next module), whereas in an iterative method the rounding error is small, since it is committed only in the last iteration. Thus, for ill-conditioned systems an iterative method is a good choice.
(iii) A direct method gives the solution of a system in one execution, while in an iteration method the same process is repeated many times.
(iv) There is no scope to update the solution obtained by a direct method, but it is possible in an iteration method.
(v) Most direct methods operate on the coefficient matrix, and hence the entire matrix has to be stored in the primary memory of the computer during execution of the program. But the iteration methods are applied to a single equation at a time, so only one equation needs to be stored in primary memory at a time. Thus iterative methods are more efficient than direct methods with respect to space complexity.
Chapter 5 Solution of System of Linear Equations
Module No. 3 Methods of Matrix Factorization
...................................................................................... Let the system of linear equations be

Ax = b,                                                                 (3.1)

where

A = [ a11 a12 · · · a1n ]        [ b1 ]           [ x1 ]
    [ a21 a22 · · · a2n ]        [ b2 ]           [ x2 ]
    [ · · · · · · · · · ] ,  b = [ ·· ]  and  x = [ ·· ] .              (3.2)
    [ am1 am2 · · · amn ]        [ bm ]           [ xn ]

In the matrix factorization method, the coefficient matrix A is expressed as a product of two or more other matrices. By finding the factors of the coefficient matrix, methods can be devised that solve a system of linear equations in less computational time. In this module, the LU decomposition method is discussed to solve a system of linear equations. In this method, the coefficient matrix A is written as a product of two matrices L and U, where L is a lower triangular matrix and U is an upper triangular matrix.
3.1 LU decomposition method
The LU decomposition method is also known as matrix factorization or Crout's reduction method. Let the coefficient matrix A be written as A = LU, where L and U are lower and upper triangular matrices respectively. Unfortunately, this factorization is not possible for all matrices. Such a factorization exists and is unique if all the leading principal minors of A are non-zero, i.e.

a11 ≠ 0,   | a11 a12 | ≠ 0,   | a11 a12 a13 |
           | a21 a22 |        | a21 a22 a23 | ≠ 0,   · · · ,   |A| ≠ 0.        (3.3)
                              | a31 a32 a33 |

Since the matrices L and U are lower and upper triangular respectively, they can be written in the following form:

L = [ l11   0    0   · · ·  0   ]          U = [ u11 u12 u13 · · · u1n ]
    [ l21  l22   0   · · ·  0   ]              [ 0   u22 u23 · · · u2n ]
    [ l31  l32  l33  · · ·  0   ]    and       [ 0   0   u33 · · · u3n ]        (3.4)
    [ ···  ···  ···  · · ·  ··· ]              [ ··· ··· ··· · · · ··· ]
    [ ln1  ln2  ln3  · · ·  lnn ]              [ 0   0   0   · · · unn ]

If the factorization is possible, then the equation Ax = b can be expressed as

LUx = b.                                                                (3.5)
Let Ux = z; then equation (3.5) reduces to Lz = b, where z = (z1 , z2 , . . . , zn)t is an unknown vector. Thus, the equation (3.1) is decomposed into two systems of linear equations, both of which are easy to solve. The equation Lz = b in explicit form is

l11 z1 = b1
l21 z1 + l22 z2 = b2
l31 z1 + l32 z2 + l33 z3 = b3                                           (3.6)
· · · · · · · · · · · · · · · · · · · · · · · ·
ln1 z1 + ln2 z2 + ln3 z3 + · · · + lnn zn = bn .

This system of equations can be solved by forward substitution: the value of z1 is obtained from the first equation; using this value, z2 can be determined from the second equation, and so on. From the last equation we determine zn, since at that stage the values of z1 , z2 , . . . , zn−1 are available. Having found z, one can solve the equation Ux = z. In explicit form, this system is

u11 x1 + u12 x2 + u13 x3 + · · · + u1n xn = z1
u22 x2 + u23 x3 + · · · + u2n xn = z2
u33 x3 + u34 x4 + · · · + u3n xn = z3                                   (3.7)
· · · · · · · · · · · · · · · · · · · · · · · ·
un−1,n−1 xn−1 + un−1,n xn = zn−1
unn xn = zn .
Observe that the value of the last variable xn can be determined from the last equation. Using this value, one can compute the value of xn−1 from the last but one equation, and so on. Lastly, from the first equation we can find the value of x1, since at that stage all the other variables are already known. This process is called backward substitution.

Thus the outline for solving the system Ax = b is complete; the remaining, and more complicated, step is to determine the matrices L and U. They are obtained from the relation A = LU. Note that this matrix equation gives n² equations containing lij and uij for i, j = 1, 2, . . . , n, while the number of unknown elements of the matrices L and U is n(n + 1)/2 + n(n + 1)/2 = n² + n. So n additional conditions are required to find L and U completely. Such conditions are discussed below. When uii = 1 for i = 1, 2, . . . , n, the method is known as Crout's decomposition method. When lii = 1 for i = 1, 2, . . . , n, the method is known as Doolittle's method. In particular, when lii = uii for i = 1, 2, . . . , n, the corresponding method is called Cholesky's decomposition method.

3.1.1 Computation of L and U
In this section it is assumed that uii = 1 for i = 1, 2, . . . , n (Crout's method). The equation LU = A then becomes

[ l11  l11 u12        l11 u13              · · ·  l11 u1n                        ]   [ a11 a12 a13 · · · a1n ]
[ l21  l21 u12 + l22  l21 u13 + l22 u23    · · ·  l21 u1n + l22 u2n              ]   [ a21 a22 a23 · · · a2n ]
[ l31  l31 u12 + l32  l31 u13 + l32 u23 + l33  ·  l31 u1n + l32 u2n + l33 u3n    ] = [ a31 a32 a33 · · · a3n ]
[ ···  ···            ···                  · · ·  ···                            ]   [ ··· ··· ··· · · · ··· ]
[ ln1  ln1 u12 + ln2  ln1 u13 + ln2 u23 + ln3  ·  ln1 u1n + ln2 u2n + · · · + lnn ]  [ an1 an2 an3 · · · ann ]

From the first column and the first row, we have

li1 = ai1 , i = 1, 2, . . . , n   and   u1j = a1j / l11 , j = 2, 3, . . . , n.
Similarly, from the second column and the second row we get the following equations:

li2 = ai2 − li1 u12 , for i = 2, 3, . . . , n,
u2j = (a2j − l21 u1j) / l22 , for j = 3, 4, . . . , n.

Solving these equations, we obtain the second column of L and the second row of U. In general, the elements lij of the matrix L and uij of the matrix U are determined from the following equations:

lij = aij − Σ_{k=1}^{j−1} lik ukj ,  i ≥ j,                             (3.8)

uij = ( aij − Σ_{k=1}^{i−1} lik ukj ) / lii ,  i < j,                   (3.9)

with uii = 1, lij = 0 for j > i, and uij = 0 for i > j.
The matrix equations Lz = b and Ux = z can also be solved by finding the inverses of L and U:

z = L−1 b                                                               (3.10)
x = U−1 z.                                                              (3.11)

But this process is time consuming, because finding an inverse takes much time; it may be noted, though, that the time to find the inverse of a triangular matrix is less than that of an arbitrary matrix. The inverse of A can also be determined from the relation

A−1 = U−1 L−1 .                                                         (3.12)

Few properties of triangular matrices
Let L = [lij] and U = [uij] be lower and upper triangular matrices.
• The determinant of a triangular matrix is the product of its diagonal elements.
• The product of two lower (upper) triangular matrices is a lower (upper) triangular matrix.
• The square of a lower (upper) triangular matrix is a lower (upper) triangular matrix.
• The inverse of a lower (upper) triangular matrix is also a lower (upper) triangular matrix.
• Since A = LU, |A| = |L||U|.

Let us illustrate the LU decomposition method.

Example 3.1 Let

A = [ 2  −3   1 ]
    [ 1   2  −3 ]
    [ 4  −1  −2 ]

Express A as A = LU, where L and U are lower and upper triangular matrices, and hence solve the system of equations 2x1 − 3x2 + x3 = 1, x1 + 2x2 − 3x3 = 4, 4x1 − x2 − 2x3 = 8. Also determine L−1, U−1, A−1 and |A|.

Solution. Let

[ 2 −3  1 ]   [ l11  0   0  ] [ 1 u12 u13 ]   [ l11  l11 u12        l11 u13                  ]
[ 1  2 −3 ] = [ l21 l22  0  ] [ 0  1  u23 ] = [ l21  l21 u12 + l22  l21 u13 + l22 u23        ]
[ 4 −1 −2 ]   [ l31 l32 l33 ] [ 0  0   1  ]   [ l31  l31 u12 + l32  l31 u13 + l32 u23 + l33  ]

To find the values of lij and uij, we compare both sides and obtain
l11 = 2,  l21 = 1,  l31 = 4,
l11 u12 = −3  or  u12 = −3/2,
l11 u13 = 1   or  u13 = 1/2,
l21 u12 + l22 = 2   or  l22 = 7/2,
l31 u12 + l32 = −1  or  l32 = 5,
l21 u13 + l22 u23 = −3  or  u23 = −1,
l31 u13 + l32 u23 + l33 = −2  or  l33 = 1.
Hence L and U are given by

L = [ 2   0   0 ]            U = [ 1  −3/2  1/2 ]
    [ 1  7/2  0 ]   and          [ 0   1    −1  ]
    [ 4   5   1 ]                [ 0   0     1  ]

The given equations can be written as Ax = b, where

A = [ 2 −3  1 ]        [ x1 ]        [ 1 ]
    [ 1  2 −3 ] ,  x = [ x2 ] ,  b = [ 4 ]
    [ 4 −1 −2 ]        [ x3 ]        [ 8 ]

Let A = LU. Then LUx = b. Setting Ux = z, the given equation reduces to Lz = b. First we consider the equation Lz = b:

[ 2   0   0 ] [ z1 ]   [ 1 ]
[ 1  7/2  0 ] [ z2 ] = [ 4 ]
[ 4   5   1 ] [ z3 ]   [ 8 ]

In explicit form these equations are

2z1 = 1,
z1 + (7/2)z2 = 4,
4z1 + 5z2 + z3 = 8.

The solution of the above equations is z1 = 1/2, z2 = 1, z3 = 1. Therefore z = (1/2, 1, 1)t. Now we solve the equation Ux = z, i.e.

[ 1  −3/2  1/2 ] [ x1 ]   [ 1/2 ]
[ 0   1    −1  ] [ x2 ] = [  1  ]
[ 0   0     1  ] [ x3 ]   [  1  ]

In explicit form, the equations are

x1 − (3/2)x2 + (1/2)x3 = 1/2
x2 − x3 = 1
x3 = 1.

The solution is x3 = 1, x2 = 1 + 1 = 2, x1 = 1/2 + (3/2)x2 − (1/2)x3 = 3, i.e. x1 = 3, x2 = 2, x3 = 1.

Third Part. The Gauss-Jordan method is used to find L−1. The augmented matrix is
[L|I] = [ 2   0   0 | 1 0 0 ]
        [ 1  7/2  0 | 0 1 0 ]
        [ 4   5   1 | 0 0 1 ]

      ∼ [ 1   0   0 | 1/2 0 0 ]   R1′ = (1/2)R1
        [ 1  7/2  0 |  0  1 0 ]
        [ 4   5   1 |  0  0 1 ]

      ∼ [ 1   0   0 |  1/2 0 0 ]   R2′ = R2 − R1 , R3′ = R3 − 4R1
        [ 0  7/2  0 | −1/2 1 0 ]
        [ 0   5   1 |  −2  0 1 ]

      ∼ [ 1   0   0 |  1/2   0   0 ]   R2′ = (2/7)R2
        [ 0   1   0 | −1/7  2/7  0 ]
        [ 0   5   1 |  −2    0   1 ]

      ∼ [ 1   0   0 |  1/2    0    0 ]   R3′ = R3 − 5R2
        [ 0   1   0 | −1/7   2/7   0 ]
        [ 0   0   1 | −9/7  −10/7  1 ]

Thus,

L−1 = [  1/2    0    0 ]
      [ −1/7   2/7   0 ]
      [ −9/7  −10/7  1 ]

Using the same process, one can determine U−1; but here another method is used. We know that the inverse of an upper triangular matrix is upper triangular.
Therefore, let

U−1 = [ 1  b12  b13 ]
      [ 0   1   b23 ]
      [ 0   0    1  ]

From the identity U−1 U = I, we have

[ 1 b12 b13 ] [ 1 −3/2 1/2 ]   [ 1   −3/2 + b12   1/2 − b12 + b13 ]   [ 1 0 0 ]
[ 0  1  b23 ] [ 0   1   −1 ] = [ 0        1           −1 + b23    ] = [ 0 1 0 ]
[ 0  0   1  ] [ 0   0    1 ]   [ 0        0               1       ]   [ 0 0 1 ]

Comparing both sides,

−3/2 + b12 = 0  or  b12 = 3/2,
1/2 − b12 + b13 = 0  or  b13 = 1,
−1 + b23 = 0  or  b23 = 1.

Thus,

U−1 = [ 1  3/2  1 ]
      [ 0   1   1 ]
      [ 0   0   1 ]
Now,

A−1 = U−1 L−1 = [ 1  3/2  1 ] [  1/2    0    0 ]   [  −1     −1    1 ]
                [ 0   1   1 ] [ −1/7   2/7   0 ] = [ −10/7  −8/7   1 ]
                [ 0   0   1 ] [ −9/7  −10/7  1 ]   [ −9/7  −10/7   1 ]

Last Part. |A| = |L||U| = 2 × (7/2) × 1 × 1 = 7.
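The whole procedure, equations (3.8)-(3.9) followed by forward and backward substitution, is compactly expressed in code. The following Python sketch of Crout's method (uii = 1; names are illustrative) reproduces Example 3.1, assuming the factorization exists:

    def crout_solve(A, b):
        """Solve Ax = b by Crout's LU decomposition (u[i][i] = 1),
        following equations (3.8)-(3.9)."""
        n = len(A)
        L = [[0.0] * n for _ in range(n)]
        U = [[float(i == j) for j in range(n)] for i in range(n)]
        for j in range(n):
            for i in range(j, n):        # column j of L, eq. (3.8)
                L[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(j))
            for i in range(j + 1, n):    # row j of U, eq. (3.9)
                U[j][i] = (A[j][i] - sum(L[j][k] * U[k][i] for k in range(j))) / L[j][j]
        # forward substitution: L z = b
        z = [0.0] * n
        for i in range(n):
            z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
        # backward substitution: U x = z
        x = [0.0] * n
        for i in range(n - 1, -1, -1):
            x[i] = z[i] - sum(U[i][k] * x[k] for k in range(i + 1, n))
        return x

    # Example 3.1: expected solution x = [3, 2, 1].
    print(crout_solve([[2, -3, 1], [1, 2, -3], [4, -1, -2]], [1, 4, 8]))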
3.2 Cholesky method
The Cholesky method is used to solve a system of linear equations Ax = b when the coefficient matrix A is symmetric and positive definite. This method is also known as the square-root method. Let A be such a matrix; then A can be written as the product of a lower triangular matrix and its transpose, that is,

A = LLt ,                                                               (3.13)

where L = [lij], with lij = 0 for i < j, is a lower triangular matrix and Lt is the transpose of L. Again, the matrix A can also be written as

A = UUt ,                                                               (3.14)

where U is an upper triangular matrix. Using (3.13), the equation Ax = b becomes

LLt x = b.                                                              (3.15)

Let

Lt x = z;                                                               (3.16)

then

Lz = b.                                                                 (3.17)

Using forward substitution one can easily solve equation (3.17) to obtain the vector z. Then, solving the equation Lt x = z by back substitution, we obtain the vector x. Alternatively, the values of z and then x can be determined from the following equations:

z = L−1 b   and   x = (Lt)−1 z = (L−1)t z.                              (3.18)

As an intermediate result, the inverse of A can be determined from the equation

A−1 = (L−1)t L−1 .

From this discussion it is clear that the solution of the system of equations depends completely on the matrix L. The procedure to compute L is discussed below.
3.2.1 Procedure to determine L
Since A = LLt, comparing both sides of

[ l11   0   · · ·  0  ] [ l11 l21 · · · ln1 ]
[ l21  l22  · · ·  0  ] [  0  l22 · · · ln2 ]
[ ···  ···  · · · ···  ] [ ··· ··· · · · ··· ] = A
[ ln1  ln2  · · · lnn ] [  0   0  · · · lnn ]

element by element gives the following system of equations:

l11 = (a11)^{1/2}
lii = ( aii − Σ_{j=1}^{i−1} lij² )^{1/2} ,  i = 2, 3, . . . , n
li1 = ai1 / l11 ,  i = 2, 3, . . . , n                                  (3.19)
lij = (1/ljj)( aij − Σ_{k=1}^{j−1} ljk lik ) ,  for i = j + 1, j + 2, . . . , n
lij = 0 ,  i < j.

Note that this system of equations gives the values of lij. Similarly, the elements of the matrix U for the factorization (3.14) are given by

unn = (ann)^{1/2}
uin = ain / unn ,  i = 1, 2, . . . , n − 1
uij = (1/ujj)( aij − Σ_{k=j+1}^{n} uik ujk ) ,                          (3.20)
      for i = n − 2, n − 3, . . . , 1;  j = i + 1, i + 2, . . . , n − 1
uii = ( aii − Σ_{k=i+1}^{n} uik² )^{1/2} ,  i = n − 1, n − 2, . . . , 1
uij = 0 ,  i > j.

This method is illustrated by the following example.

Example 3.2 Solve the following system of equations by the Cholesky method:
4x1 + 2x2 + 6x3 = 16
2x1 + 82x2 + 39x3 = 206
6x1 + 39x2 + 26x3 = 113.

Solution. The given system of equations can be written as
Ax = b, where x = (x1, x2, x3)t, b = (16, 206, 113)t, and

A = [ 4   2   6 ]
    [ 2  82  39 ]
    [ 6  39  26 ]

Note that the coefficient matrix A is symmetric and positive definite, and hence it can be written as LLt = A. Let

L = [ l11   0    0  ]
    [ l21  l22   0  ]
    [ l31  l32  l33 ]

Therefore,

LLt = [ l11²      l11 l21            l11 l31              ]   [ 4   2   6 ]
      [ l21 l11   l21² + l22²        l21 l31 + l22 l32    ] = [ 2  82  39 ]
      [ l31 l11   l31 l21 + l32 l22  l31² + l32² + l33²   ]   [ 6  39  26 ]

Comparing both sides, we get the following system of equations:

l11² = 4  or  l11 = 2
l11 l21 = 2  or  l21 = 1
l11 l31 = 6  or  l31 = 3
l21² + l22² = 82  or  l22 = (82 − 1)^{1/2} = 9
l31 l21 + l32 l22 = 39  or  l32 = (1/l22)(39 − l31 l21) = 4
l31² + l32² + l33² = 26  or  l33 = (26 − l31² − l32²)^{1/2} = 1.

Therefore,

L = [ 2  0  0 ]
    [ 1  9  0 ]
    [ 3  4  1 ]

Now, the system of equations Lz = b becomes

2z1 = 16
z1 + 9z2 = 206
3z1 + 4z2 + z3 = 113.

The solution of these equations is z1 = 8.0, z2 = 22.0, z3 = 1.0. Now, from the equation Lt x = z,

[ 2  1  3 ] [ x1 ]   [  8 ]
[ 0  9  4 ] [ x2 ] = [ 22 ]
[ 0  0  1 ] [ x3 ]   [  1 ]

In explicit form the equations are

2x1 + x2 + 3x3 = 8
9x2 + 4x3 = 22
x3 = 1.

The solution of these equations is x3 = 1.0, x2 = 2.0, x1 = 1.5. Hence the solution is x1 = 1.5, x2 = 2.0, x3 = 1.0; this is the exact solution of the given system of equations.
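A minimal Python sketch of the Cholesky factorization and the two substitution sweeps, following (3.19), is given below; names are illustrative, and A is assumed symmetric and positive definite.

    import math

    def cholesky_solve(A, b):
        """Solve Ax = b via A = L L^t, eq. (3.19)."""
        n = len(A)
        L = [[0.0] * n for _ in range(n)]
        for j in range(n):
            L[j][j] = math.sqrt(A[j][j] - sum(L[j][k] ** 2 for k in range(j)))
            for i in range(j + 1, n):
                L[i][j] = (A[i][j] - sum(L[j][k] * L[i][k] for k in range(j))) / L[j][j]
        # forward substitution: L z = b
        z = [0.0] * n
        for i in range(n):
            z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
        # back substitution: L^t x = z  (note L[k][i] = (L^t)[i][k])
        x = [0.0] * n
        for i in range(n - 1, -1, -1):
            x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
        return x

    # Example 3.2:
    print(cholesky_solve([[4, 2, 6], [2, 82, 39], [6, 39, 26]], [16, 206, 113]))
    # -> [1.5, 2.0, 1.0]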
3.3 Gauss elimination method to find the inverse of a matrix
The Gauss elimination method is applied to the augmented matrix [A|b] to solve a system of linear equations Ax = b. One can also obtain the inverse of a matrix by this method: to find the inverse of A, the method is applied to the augmented matrix [A|I]. The method converts the matrix A (= LU) into an upper triangular matrix U, and the unit matrix I into a lower triangular matrix; this lower triangular matrix is the inverse of L. Now, AA−1 = I reduces to LUA−1 = I, i.e.

UA−1 = L−1 .                                                            (3.21)

The right hand side of equation (3.21) is a lower triangular matrix, and the matrices U and L−1 are known. Therefore, by solving the systems of equations in (3.21) column by column using back substitution, one can easily determine the matrix A−1. The following problem is considered to illustrate the method.

Example 3.3 Find the inverse of the matrix

A = [ 1   2  4 ]
    [ 1  −2  6 ]
    [ 2  −1  0 ]

by using the Gauss elimination method.

Solution. The augmented matrix [A|I] is

[A|I] = [ 1   2  4 | 1 0 0 ]
        [ 1  −2  6 | 0 1 0 ]
        [ 2  −1  0 | 0 0 1 ]

      → [ 1   2   4  |  1  0 0 ]   R2′ = R2 − R1 , R3′ = R3 − 2R1
        [ 0  −4   2  | −1  1 0 ]
        [ 0  −5  −8  | −2  0 1 ]

      → [ 1   2   4     |  1     0   0 ]   R3′ = R3 − (5/4)R2
        [ 0  −4   2     | −1     1   0 ]
        [ 0   0  −21/2  | −3/4 −5/4  1 ]

Thus we get

U = [ 1   2   4    ]           L−1 = [  1     0   0 ]
    [ 0  −4   2    ]   and           [ −1     1   0 ]
    [ 0   0  −21/2 ]                 [ −3/4 −5/4  1 ]

Let

A−1 = [ x11 x12 x13 ]
      [ x21 x22 x23 ]
      [ x31 x32 x33 ]
Since UA−1 = L−1,

[ 1   2   4    ] [ x11 x12 x13 ]   [  1     0   0 ]
[ 0  −4   2    ] [ x21 x22 x23 ] = [ −1     1   0 ]
[ 0   0  −21/2 ] [ x31 x32 x33 ]   [ −3/4 −5/4  1 ]

This equation generates the following systems of linear equations. Comparing the first column, we get

x11 + 2x21 + 4x31 = 1
−4x21 + 2x31 = −1
−(21/2)x31 = −3/4.

The second column gives

x12 + 2x22 + 4x32 = 0
−4x22 + 2x32 = 1
−(21/2)x32 = −5/4.

From the third column we obtain

x13 + 2x23 + 4x33 = 0
−4x23 + 2x33 = 0
−(21/2)x33 = 1.

The solution of these equations is

x11 = 1/7,   x21 = 2/7,    x31 = 1/14,
x12 = −2/21, x22 = −4/21,  x32 = 5/42,
x13 = 10/21, x23 = −1/21,  x33 = −2/21.

Therefore,

A−1 = [ 1/7   −2/21  10/21 ]
      [ 2/7   −4/21  −1/21 ]
      [ 1/14   5/42  −2/21 ]
3.4 Matrix partition method
We generally presume that the size of the coefficient matrix of a system is not very large, so that the entire matrix can be stored in the primary memory of a computer. But in many applications the size of the matrix is very large and it cannot be stored in the primary memory of a computer. In such cases the entire matrix is divided into some matrices of smaller size, and with the help of these lower order matrices one can find the inverse of the given matrix. This process of division is known as the matrix partitioning method. This method is also useful when a few more variables, and consequently a few more equations, are added to the original system.

Suppose the coefficient matrix A is partitioned as

A = [ B | C ]
    [ D | E ]                                                           (3.22)

where B is an l × l matrix, C is an l × m matrix, D is an m × l matrix and E is an m × m matrix; l, m are positive integers with l + m = n. Let A−1 be partitioned as

A−1 = [ P | Q ]
      [ R | S ]                                                         (3.23)

where the matrices P, Q, R and S are of the same orders as the matrices B, C, D and E respectively. Then

AA−1 = [ B | C ] [ P | Q ]   [ I1 |  0 ]
       [ D | E ] [ R | S ] = [  0 | I2 ]                                (3.24)

where I1 and I2 are identity matrices of order l and m respectively. From (3.24), we have

BP + CR = I1
BQ + CS = 0
DP + ER = 0
DQ + ES = I2 .
From BQ + CS = 0 we have Q = −B−1 CS, i.e. DQ = −DB−1 CS. Again, from DQ + ES = I2, we have (E − DB−1 C)S = I2. Thus,

S = (E − DB−1 C)−1 .

Similarly, the other matrices are given by

Q = −B−1 CS
R = −(E − DB−1 C)−1 DB−1 = −SDB−1
P = B−1 (I1 − CR) = B−1 − B−1 CR.

Note that we have to determine the inverses of two square matrices, B and (E − DB−1 C), of orders l × l and m × m respectively. That is, the inverse of the matrix A of order n × n depends on the inverses of two matrices of (roughly half) lower order. If the matrices B, C, D, E are still too large to fit in the computer memory, then they are partitioned further.

Example 3.4 Using the matrix partition method, find the inverse of the matrix

A = [ 1   2  3 ]
    [ 2  −1  0 ]
    [ 0   2  4 ]

Hence, find the solution of the system of equations x1 + 2x2 + 3x3 = 1, 2x1 − x2 = 0, 2x2 + 4x3 = −1.

Solution. Suppose the matrix A is partitioned as

A = [ B | C ]   with  B = [ 1   2 ] ,  C = [ 3 ] ,  D = [ 0  2 ] ,  E = [ 4 ] .
    [ D | E ]             [ 2  −1 ]        [ 0 ]
Then the inverse of A is A^{-1} = [P Q; R S], where the matrices P, Q, R and S are obtained from the formulae
S = (E - DB^{-1}C)^{-1}, \quad R = -SDB^{-1}, \quad P = B^{-1} - B^{-1}CR, \quad Q = -B^{-1}CS.
Now,
\[
B^{-1} = -\frac{1}{5}\begin{bmatrix} -1 & -2\\ -2 & 1 \end{bmatrix}
= \frac{1}{5}\begin{bmatrix} 1 & 2\\ 2 & -1 \end{bmatrix},
\]
\[
E - DB^{-1}C = 4 - \begin{bmatrix} 0 & 2 \end{bmatrix}\frac{1}{5}\begin{bmatrix} 1 & 2\\ 2 & -1 \end{bmatrix}\begin{bmatrix} 3\\ 0 \end{bmatrix}
= 4 - \frac{1}{5}\begin{bmatrix} 0 & 2 \end{bmatrix}\begin{bmatrix} 3\\ 6 \end{bmatrix} = 4 - \frac{12}{5} = \frac{8}{5},
\]
so that S = 5/8. Then
\[
R = -SDB^{-1} = -\frac{5}{8}\begin{bmatrix} 0 & 2 \end{bmatrix}\frac{1}{5}\begin{bmatrix} 1 & 2\\ 2 & -1 \end{bmatrix}
= -\frac{1}{8}\begin{bmatrix} 4 & -2 \end{bmatrix} = \begin{bmatrix} -1/2 & 1/4 \end{bmatrix},
\]
\[
CR = \begin{bmatrix} 3\\ 0 \end{bmatrix}\begin{bmatrix} -1/2 & 1/4 \end{bmatrix}
= \begin{bmatrix} -3/2 & 3/4\\ 0 & 0 \end{bmatrix},
\]
\[
P = B^{-1} - B^{-1}CR = \frac{1}{5}\begin{bmatrix} 1 & 2\\ 2 & -1 \end{bmatrix}
- \frac{1}{5}\begin{bmatrix} -3/2 & 3/4\\ -3 & 3/2 \end{bmatrix}
= \begin{bmatrix} 1/2 & 1/4\\ 1 & -1/2 \end{bmatrix},
\]
\[
Q = -B^{-1}CS = -\frac{1}{5}\begin{bmatrix} 1 & 2\\ 2 & -1 \end{bmatrix}\begin{bmatrix} 3\\ 0 \end{bmatrix}\frac{5}{8}
= -\frac{1}{8}\begin{bmatrix} 3\\ 6 \end{bmatrix} = \begin{bmatrix} -3/8\\ -3/4 \end{bmatrix}.
\]
Hence,
\[
A^{-1} = \begin{bmatrix} 1/2 & 1/4 & -3/8\\ 1 & -1/2 & -3/4\\ -1/2 & 1/4 & 5/8 \end{bmatrix}.
\]
Now, the solution of the given system of equations is
\[
x = A^{-1}b = \begin{bmatrix} 1/2 & 1/4 & -3/8\\ 1 & -1/2 & -3/4\\ -1/2 & 1/4 & 5/8 \end{bmatrix}
\begin{bmatrix} 1\\ 0\\ -1 \end{bmatrix}
= \begin{bmatrix} 7/8\\ 7/4\\ -9/8 \end{bmatrix}.
\]
Hence, the required solution is x_1 = 7/8, x_2 = 7/4, x_3 = -9/8.
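The partition formulae translate directly into a few lines of NumPy. The following sketch (my own illustration; the name block_inverse is hypothetical and not from the text) computes A^{-1} from the blocks B, C, D, E and reproduces Example 3.4.

```python
import numpy as np

def block_inverse(A, l):
    """Invert A by partitioning into B (l x l), C, D, E and applying
    S = (E - D B^-1 C)^-1, R = -S D B^-1, Q = -B^-1 C S,
    P = B^-1 - B^-1 C R."""
    B, C = A[:l, :l], A[:l, l:]
    D, E = A[l:, :l], A[l:, l:]
    Binv = np.linalg.inv(B)
    S = np.linalg.inv(E - D @ Binv @ C)   # inverse of the Schur complement
    R = -S @ D @ Binv
    Q = -Binv @ C @ S
    P = Binv - Binv @ C @ R
    return np.block([[P, Q], [R, S]])

A = np.array([[1., 2., 3.], [2., -1., 0.], [0., 2., 4.]])
Ainv = block_inverse(A, l=2)
print(Ainv @ A)                        # identity, up to rounding
print(Ainv @ np.array([1., 0., -1.]))  # [0.875 1.75 -1.125] = (7/8, 7/4, -9/8)
```

The same idea applies recursively when the blocks themselves are too large to store at once.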
Chapter 5 Solution of System of Linear Equations
Module No. 4 Gauss Elimination Method and Tri-diagonal Equations
In this module, the Gauss elimination method for solving a system of linear equations is discussed. Another special type of system, the tri-diagonal system of equations, is also introduced here; such systems occur in many applications. A special case of the LU-decomposition method is used to solve a tri-diagonal system of equations.
4.1 Gauss elimination method

Let
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1
\cdots\cdots\cdots
a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n = b_i \tag{4.1}
\cdots\cdots\cdots
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n = b_n
be a system of n linear equations in n variables. In the Gauss elimination method the variables are eliminated from the system one by one: x_1 is eliminated from the second to the nth equation, x_2 from the third to the nth equation, x_3 from the fourth to the nth equation, and so on; finally, x_{n-1} is eliminated from the nth equation. The reduced system of linear equations is then upper triangular and can be solved by back substitution.

Assume that a_{11} \neq 0. To eliminate x_1 from the second, third, ..., nth equations, the first equation is multiplied by -a_{21}/a_{11}, -a_{31}/a_{11}, ..., -a_{n1}/a_{11} respectively and successively added to the second, third, ..., nth equations. After this step the system reduces to
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n = b_1
a^{(1)}_{22}x_2 + a^{(1)}_{23}x_3 + \cdots + a^{(1)}_{2n}x_n = b^{(1)}_2
a^{(1)}_{32}x_2 + a^{(1)}_{33}x_3 + \cdots + a^{(1)}_{3n}x_n = b^{(1)}_3 \tag{4.2}
\cdots\cdots\cdots
a^{(1)}_{n2}x_2 + a^{(1)}_{n3}x_3 + \cdots + a^{(1)}_{nn}x_n = b^{(1)}_n,
where
a^{(1)}_{ij} = a_{ij} - \frac{a_{i1}}{a_{11}}a_{1j}, \qquad b^{(1)}_i = b_i - \frac{a_{i1}}{a_{11}}b_1; \qquad i, j = 2, 3, \ldots, n.
Now, to eliminate x_2 (here it is also assumed that a^{(1)}_{22} \neq 0) from the third, fourth, ..., nth equations, the second equation is multiplied by -a^{(1)}_{32}/a^{(1)}_{22}, -a^{(1)}_{42}/a^{(1)}_{22}, ..., -a^{(1)}_{n2}/a^{(1)}_{22} respectively and successively added to the third, fourth, ..., nth equations. The reduced system of equations becomes
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n = b_1
a^{(1)}_{22}x_2 + a^{(1)}_{23}x_3 + \cdots + a^{(1)}_{2n}x_n = b^{(1)}_2
a^{(2)}_{33}x_3 + \cdots + a^{(2)}_{3n}x_n = b^{(2)}_3 \tag{4.3}
\cdots\cdots\cdots
a^{(2)}_{n3}x_3 + \cdots + a^{(2)}_{nn}x_n = b^{(2)}_n,
where
a^{(2)}_{ij} = a^{(1)}_{ij} - \frac{a^{(1)}_{i2}}{a^{(1)}_{22}}a^{(1)}_{2j}; \qquad i, j = 3, 4, \ldots, n.
Finally, after eliminating x_{n-1}, the above system of equations is converted to
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n = b_1
a^{(1)}_{22}x_2 + a^{(1)}_{23}x_3 + \cdots + a^{(1)}_{2n}x_n = b^{(1)}_2
a^{(2)}_{33}x_3 + \cdots + a^{(2)}_{3n}x_n = b^{(2)}_3 \tag{4.4}
\cdots\cdots\cdots
a^{(n-1)}_{nn}x_n = b^{(n-1)}_n,
where
a^{(k)}_{ij} = a^{(k-1)}_{ij} - \frac{a^{(k-1)}_{ik}}{a^{(k-1)}_{kk}}a^{(k-1)}_{kj}; \qquad i, j = k+1, \ldots, n; \; k = 1, 2, \ldots, n-1,
and a^{(0)}_{pq} = a_{pq}; p, q = 1, 2, \ldots, n.
Note that from the last equation one can easily determine the value of x_n. From the last but one equation we can determine x_{n-1} using the value of x_n. In this way the values of all the variables are determined; this process is known as back substitution.
From the last equation we have x_n = b^{(n-1)}_n / a^{(n-1)}_{nn}. Using this value, we can determine x_{n-1} from the last but one equation, and so on; finally, the first equation gives the value of x_1.

The process of determining the values of the variables x_i is a back substitution, because we first determine the value of the last variable x_n; the evaluation of the elements a^{(k)}_{ij}, on the other hand, proceeds forward.

Note 4.1 In the Gauss elimination method it is assumed that the diagonal (pivot) elements are non-zero. If one of these elements is zero or close to zero, the method as described is not applicable, even though the system may have a solution. In that case the partial or complete pivoting method must be used to find a solution, or a better solution. As mentioned in the previous module, if the system is diagonally dominant, or real symmetric and positive definite, then no pivoting is necessary.

Example 4.1 Solve the equations 2x_1 - x_2 + x_3 = 5, x_1 + 2x_2 + 3x_3 = 10, x_1 + 3x_2 - 2x_3 = 7 by the Gauss elimination method.

Solution. Multiplying the second and third equations by 2 and subtracting them from the first equation, we obtain
2x_1 - x_2 + x_3 = 5, \quad -5x_2 - 5x_3 = -15, \quad -7x_2 + 5x_3 = -9.
Multiplying the third equation by 5/7 and subtracting it from the second, we get
2x_1 - x_2 + x_3 = 5, \quad -5x_2 - 5x_3 = -15, \quad -\frac{60}{7}x_3 = -\frac{60}{7}.
The value of x_3 is now determined from the third equation: x_3 = 1. Using this value, the second equation gives x_2 = 2. Finally, from the first equation 2x_1 = 5 + 2 - 1 = 6, i.e. x_1 = 3. Hence, the solution is x_1 = 3, x_2 = 2, x_3 = 1.
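The elimination and back-substitution steps above are easy to express in code. The following is a minimal NumPy sketch of my own (no pivoting, so it assumes all pivots are non-zero, as in the text); it reproduces Example 4.1.

```python
import numpy as np

def gauss_elimination(A, b):
    """Solve Ax = b by forward elimination and back substitution.
    Assumes the pivots a_kk are non-zero (no pivoting)."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for k in range(n - 1):              # forward elimination, step k
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]       # multiplier a_ik / a_kk
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):      # back substitution
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

A = np.array([[2, -1, 1], [1, 2, 3], [1, 3, -2]])
b = np.array([5, 10, 7])
print(gauss_elimination(A, b))   # [3. 2. 1.]
```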
4.2 Gauss-Jordan elimination method

In the Gauss elimination method the coefficient matrix is transformed into an upper triangular form and the solution is then obtained by back substitution. In the Gauss-Jordan method the coefficient matrix is transformed into a diagonal (identity) matrix by row operations, and the values of the variables are then obtained directly from each row. Using the Gauss-Jordan elimination method the system of equations (4.1) reduces to the form
\[
\begin{bmatrix} 1 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \cdots & \cdots & \cdots & \cdots\\ 0 & 0 & \cdots & 1 \end{bmatrix}
\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}
= \begin{bmatrix} b'_1\\ b'_2\\ \vdots\\ b'_n \end{bmatrix}. \tag{4.5}
\]
It is obvious that the solution of the given system is x_1 = b'_1, x_2 = b'_2, \ldots, x_n = b'_n. Symbolically, the Gauss-Jordan method can be written as
\[
[A \,\vdots\, b] \;\xrightarrow{\text{Gauss-Jordan}}\; [I \,\vdots\, b']. \tag{4.6}
\]
Normally the Gauss-Jordan method is not used to solve a system of equations, as it needs more arithmetic computations than the Gauss elimination method. But it is widely used to find the inverse of a matrix.

Example 4.2 Use the Gauss-Jordan elimination method to solve the equations x_1 + x_2 + x_3 = 4, 2x_1 - x_2 + 3x_3 = 1, 3x_1 + 2x_2 - x_3 = 1.

Solution. For this problem the associated matrices are
A = \begin{bmatrix} 1 & 1 & 1\\ 2 & -1 & 3\\ 3 & 2 & -1 \end{bmatrix}, x = (x_1, x_2, x_3)^t and b = (4, 1, 1)^t.
The augmented matrix [A ⋮ b] is reduced as follows:
[1 1 1 | 4; 2 -1 3 | 1; 3 2 -1 | 1]
~ [1 1 1 | 4; 0 -3 1 | -7; 0 -1 -4 | -11]   (R'_2 = R_2 - 2R_1, R'_3 = R_3 - 3R_1)
~ [1 1 1 | 4; 0 -3 1 | -7; 0 0 -13/3 | -26/3]   (R'_3 = R_3 - (1/3)R_2)
~ [1 1 1 | 4; 0 1 -1/3 | 7/3; 0 0 1 | 2]   (R'_2 = -(1/3)R_2, R'_3 = -(3/13)R_3)
~ [1 0 4/3 | 5/3; 0 1 -1/3 | 7/3; 0 0 1 | 2]   (R'_1 = R_1 - R_2)
~ [1 0 0 | -1; 0 1 0 | 3; 0 0 1 | 2]   (R'_1 = R_1 - (4/3)R_3, R'_2 = R_2 + (1/3)R_3).
Thus the given system reduces to x_1 = -1, x_2 = 3, x_3 = 2.
Hence, the required solution is x_1 = -1, x_2 = 3, x_3 = 2.
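A compact Gauss-Jordan reduction of the augmented matrix can be sketched as follows (my own illustration, again without pivoting, so every pivot is assumed non-zero); it reproduces Example 4.2.

```python
import numpy as np

def gauss_jordan(A, b):
    """Reduce the augmented matrix [A | b] to [I | x] by row operations."""
    n = len(b)
    M = np.hstack([A.astype(float), b.astype(float).reshape(-1, 1)])
    for k in range(n):
        M[k] /= M[k, k]                 # normalize the pivot row
        for i in range(n):
            if i != k:
                M[i] -= M[i, k] * M[k]  # clear the rest of column k
    return M[:, -1]

A = np.array([[1, 1, 1], [2, -1, 3], [3, 2, -1]])
b = np.array([4, 1, 1])
print(gauss_jordan(A, b))   # [-1.  3.  2.]
```

Applying the same loop to [A | I] instead of [A | b] yields A^{-1}, which is the usual application of the method.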
4.3 Solution of tri-diagonal systems

The tri-diagonal system of equations is a particular case of a system of linear equations. Equations of this type occur in many applications, e.g. cubic spline interpolation and the solution of boundary value problems. A tri-diagonal system of equations has the form
b_1x_1 + c_1x_2 = d_1
a_2x_1 + b_2x_2 + c_2x_3 = d_2
a_3x_2 + b_3x_3 + c_3x_4 = d_3 \tag{4.7}
\cdots\cdots\cdots
a_nx_{n-1} + b_nx_n = d_n.
The coefficient matrix and the right-hand side of this system are
\[
A = \begin{bmatrix}
b_1 & c_1 & 0 & \cdots & 0 & 0\\
a_2 & b_2 & c_2 & \cdots & 0 & 0\\
0 & a_3 & b_3 & \cdots & 0 & 0\\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\
0 & 0 & \cdots & a_{n-1} & b_{n-1} & c_{n-1}\\
0 & 0 & \cdots & 0 & a_n & b_n
\end{bmatrix}
\quad\text{and}\quad
d = \begin{bmatrix} d_1\\ d_2\\ \vdots\\ d_n \end{bmatrix}. \tag{4.8}
\]
This matrix has many interesting properties. The main diagonal and its two adjacent diagonals (just below and just above) may be non-zero while all other elements are zero. Such a matrix is called a tri-diagonal matrix (it is a special case of a band matrix), and the system is called a tri-diagonal system of equations. A tri-diagonal system can of course be solved by the methods discussed earlier, but by exploiting its special structure it can be solved in a much simpler way, starting from the LU decomposition method. Let A = LU, where
\[
L = \begin{bmatrix}
\gamma_1 & 0 & 0 & \cdots & 0 & 0\\
\beta_2 & \gamma_2 & 0 & \cdots & 0 & 0\\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\
0 & 0 & \cdots & \beta_{n-1} & \gamma_{n-1} & 0\\
0 & 0 & \cdots & 0 & \beta_n & \gamma_n
\end{bmatrix},
\qquad
U = \begin{bmatrix}
1 & \alpha_1 & 0 & \cdots & 0 & 0\\
0 & 1 & \alpha_2 & \cdots & 0 & 0\\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\
0 & 0 & \cdots & 0 & 1 & \alpha_{n-1}\\
0 & 0 & \cdots & 0 & 0 & 1
\end{bmatrix}.
\]
Then
\[
LU = \begin{bmatrix}
\gamma_1 & \gamma_1\alpha_1 & 0 & \cdots & 0 & 0\\
\beta_2 & \alpha_1\beta_2 + \gamma_2 & \gamma_2\alpha_2 & \cdots & 0 & 0\\
0 & \beta_3 & \alpha_2\beta_3 + \gamma_3 & \cdots & 0 & 0\\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\
0 & 0 & 0 & \cdots & \beta_n & \alpha_{n-1}\beta_n + \gamma_n
\end{bmatrix}.
\]
Now, comparing both sides of the matrix equation LU = A, we obtain the following system of equations:
\gamma_1 = b_1,
\gamma_i\alpha_i = c_i, i.e. \alpha_i = c_i/\gamma_i, \quad i = 1, 2, \ldots, n-1,
\beta_i = a_i, \quad i = 2, \ldots, n,
\gamma_i = b_i - \alpha_{i-1}\beta_i = b_i - a_i\frac{c_{i-1}}{\gamma_{i-1}}, \quad i = 2, 3, \ldots, n. \tag{4.9}
Hence the elements of the matrices L and U are given by the equations
\gamma_1 = b_1, \quad \gamma_i = b_i - a_i\frac{c_{i-1}}{\gamma_{i-1}}, \; i = 2, 3, \ldots, n; \quad \beta_i = a_i, \; i = 2, 3, \ldots, n, \tag{4.10}
\alpha_i = c_i/\gamma_i, \quad i = 1, 2, \ldots, n-1. \tag{4.11}
Note that this is a very simple system of equations. The solution of Ax = d, where d = (d_1, d_2, \ldots, d_n)^t, is now obtained by solving Lz = d by forward substitution and then Ux = z by back substitution. The solution of Lz = d is
z_1 = \frac{d_1}{\gamma_1}, \quad z_i = \frac{d_i - a_iz_{i-1}}{\gamma_i}, \; i = 2, 3, \ldots, n, \tag{4.12}
and the solution of Ux = z is
x_n = z_n, \quad x_i = z_i - \alpha_ix_{i+1} = z_i - \frac{c_i}{\gamma_i}x_{i+1}, \; i = n-1, n-2, \ldots, 1. \tag{4.13}
Observe that the number of computations is linear, i.e. O(n) for n equations. Thus this special method needs significantly less time compared to the general methods for solving a tri-diagonal system.

Example 4.3 Solve the tri-diagonal system of equations
x_1 + 2x_2 = 4, \quad -x_1 + 2x_2 + 3x_3 = 6, \quad 3x_2 + x_3 = 8.

Solution. For this problem, b_1 = 1, c_1 = 2, a_2 = -1, b_2 = 2, c_2 = 3, a_3 = 3, b_3 = 1, d_1 = 4, d_2 = 6, d_3 = 8. Thus,
\gamma_1 = b_1 = 1, \quad \gamma_2 = b_2 - a_2\frac{c_1}{\gamma_1} = 2 - (-1)\cdot 2 = 4, \quad \gamma_3 = b_3 - a_3\frac{c_2}{\gamma_2} = 1 - 3\cdot\frac{3}{4} = -\frac{5}{4},
z_1 = \frac{d_1}{\gamma_1} = 4, \quad z_2 = \frac{d_2 - a_2z_1}{\gamma_2} = \frac{10}{4} = \frac{5}{2}, \quad z_3 = \frac{d_3 - a_3z_2}{\gamma_3} = \frac{8 - 15/2}{-5/4} = -\frac{2}{5},
x_3 = z_3 = -\frac{2}{5}, \quad x_2 = z_2 - \frac{c_2}{\gamma_2}x_3 = \frac{5}{2} + \frac{3}{10} = \frac{14}{5}, \quad x_1 = z_1 - \frac{c_1}{\gamma_1}x_2 = 4 - \frac{28}{5} = -\frac{8}{5}.
Therefore, the required solution is x_1 = -8/5, x_2 = 14/5, x_3 = -2/5.

The above method is not applicable to all tri-diagonal systems. The equations (4.12) and (4.13) are valid only if \gamma_i \neq 0 for all i = 1, 2, \ldots, n; if any \gamma_i vanishes at some stage, the method fails. Remember that this method is based on the LU decomposition method, which is applicable and gives a unique factorization only if all the leading principal minors of the coefficient matrix are non-zero. Fortunately, a modified method is available when one or more \gamma_i are zero. It is described below.
Without loss of generality, let us assume that \gamma_k = 0 and \gamma_i \neq 0 for i = 1, 2, \ldots, k-1. Give \gamma_k the symbolic value x, and calculate the remaining \gamma_i, i = k+1, \ldots, n, from equation (4.9). Using these \gamma's, the values of z_i and x_i are determined from the formulae (4.12) and (4.13); in general they depend on the symbol x. Finally, the solution is obtained by substituting x = 0 in the resulting expressions.
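For the regular case (\gamma_i \neq 0 throughout), the scheme (4.10)-(4.13) is the well-known Thomas algorithm. Here is a minimal NumPy sketch of my own that reproduces Example 4.3.

```python
import numpy as np

def thomas(a, b, c, d):
    """Solve a tri-diagonal system via (4.10)-(4.13).
    a: sub-diagonal (a[0] unused), b: diagonal,
    c: super-diagonal (c[-1] unused), d: right-hand side.
    Assumes all gamma_i are non-zero."""
    n = len(b)
    gamma = np.zeros(n)
    z = np.zeros(n)
    gamma[0] = b[0]
    z[0] = d[0] / gamma[0]
    for i in range(1, n):                 # forward sweep: Lz = d
        gamma[i] = b[i] - a[i] * c[i - 1] / gamma[i - 1]
        z[i] = (d[i] - a[i] * z[i - 1]) / gamma[i]
    x = np.zeros(n)
    x[-1] = z[-1]
    for i in range(n - 2, -1, -1):        # back substitution: Ux = z
        x[i] = z[i] - c[i] / gamma[i] * x[i + 1]
    return x

# Example 4.3
a = np.array([0., -1., 3.])
b = np.array([1., 2., 1.])
c = np.array([2., 3., 0.])
d = np.array([4., 6., 8.])
print(thomas(a, b, c, d))   # [-1.6  2.8 -0.4] = (-8/5, 14/5, -2/5)
```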
4.4 Evaluation of a tri-diagonal determinant

For n \geq 3, the general form of a tri-diagonal matrix T = [t_{ij}]_{n\times n} is
\[
T = \begin{bmatrix}
b_1 & c_1 & 0 & \cdots & 0 & 0\\
a_2 & b_2 & c_2 & \cdots & 0 & 0\\
0 & a_3 & b_3 & \cdots & 0 & 0\\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\
0 & 0 & \cdots & a_{n-1} & b_{n-1} & c_{n-1}\\
0 & 0 & \cdots & 0 & a_n & b_n
\end{bmatrix},
\]
with t_{ij} = 0 for |i - j| \geq 2. The first and last rows contain at most two non-zero elements, and every other row contains at most three (these elements may also be zero in particular cases). In general an n x n matrix has n^2 elements, but a tri-diagonal matrix has only 3(n - 2) + 4 = 3n - 2 possibly non-zero elements. So this matrix can be stored using only three vectors c = (c_1, c_2, \ldots, c_{n-1}), a = (a_2, a_3, \ldots, a_n) and b = (b_1, b_2, \ldots, b_n).

Let us define a vector d = (d_1, d_2, \ldots, d_n) by
d_i = b_1 \text{ if } i = 1, \qquad d_i = b_i - \frac{a_i}{d_{i-1}}c_{i-1} \text{ if } i = 2, 3, \ldots, n. \tag{4.14}
The value of the determinant is the product P = \prod_{i=1}^n d_i.

If d_i = 0 for some particular i, then set d_i = x (x is just a symbolic name) and use this value to calculate the remaining d's, i.e. d_{i+1}, d_{i+2}, \ldots, d_n. In this case the d's contain x, so the product P = \prod_{i=1}^n d_i depends on x; the value of the determinant is then obtained by substituting x = 0 in P.

Example 4.4 Find the values of the tri-diagonal determinants
A = \begin{vmatrix} 1 & 1 & 0\\ 1 & 1 & -3\\ 0 & -1 & 3 \end{vmatrix}, \qquad
B = \begin{vmatrix} 1 & 2 & 0\\ -1 & 2 & -2\\ 0 & -1 & 2 \end{vmatrix}.

Solution. For the determinant A: d_1 = 1, d_2 = b_2 - (a_2/d_1)c_1 = 0. Here d_2 = 0, so let d_2 = x. Then d_3 = b_3 - (a_3/d_2)c_2 = 3 - 3/x = (3x - 3)/x. Thus P = d_1d_2d_3 = 1 \cdot x \cdot (3x - 3)/x = 3x - 3. Putting x = 0 gives A = -3.

For the determinant B: d_1 = 1, d_2 = b_2 - (a_2/d_1)c_1 = 4, d_3 = b_3 - (a_3/d_2)c_2 = 3/2. Therefore P = d_1d_2d_3 = 1 \cdot 4 \cdot (3/2) = 6, that is, B = 6.
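The symbolic-x device is easy to mirror with SymPy. The sketch below (my illustration; the helper name tridiag_det is hypothetical) follows (4.14) literally, replacing a zero d_i by a symbol and substituting x = 0 at the end, and reproduces both determinants of Example 4.4.

```python
import sympy as sp

def tridiag_det(a, b, c):
    """Tri-diagonal determinant via d_1 = b_1,
    d_i = b_i - (a_i / d_{i-1}) c_{i-1}; a zero d_i is replaced by a
    symbol x, and x = 0 is substituted at the end, as in the text."""
    x = sp.Symbol('x')
    n = len(b)
    d = [sp.Integer(b[0])]
    for i in range(1, n):
        di = sp.simplify(b[i] - sp.Rational(a[i]) * c[i - 1] / d[i - 1])
        d.append(x if di == 0 else di)
    P = sp.simplify(sp.prod(d))
    return P.subs(x, 0)

print(tridiag_det(a=[0, 1, -1], b=[1, 1, 3], c=[1, -3]))    # -3
print(tridiag_det(a=[0, -1, -1], b=[1, 2, 2], c=[2, -2]))   # 6
```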
Chapter 5 Solution of System of Linear Equations
Module No. 5 Generalized Inverse of Matrix
5.1 Generalized inverse (g-inverse)

From linear algebra we know that the inverse of a matrix exists and is unique if the matrix is square and non-singular. Several methods to find the inverse of such a matrix were discussed in the previous modules. The matrix inverse is widely used in many areas of science and engineering. In many situations, however, particularly in statistics and data analysis, some weaker kind of inverse of a singular square or rectangular matrix is required. Such an inverse is known as a generalized inverse or g-inverse, and much work has been done on g-inverses during the last few decades.

The generalized inverse of an m x n matrix A is a matrix X of order n x m. Unlike the conventional inverse, generalized inverses are of different types: depending on the shape and singularity of the matrix and on the requirements, different authors have defined different g-inverses. The g-inverse is of special importance when the given matrix is singular or rectangular; for a non-singular square matrix there is no need to determine a g-inverse.

To define the g-inverses, consider the following matrix equations, where A may be rectangular, or square and singular or non-singular:
(i) AXA = A, \quad (ii) XAX = X, \quad (iii) AX = (AX)^*, \quad (iv) XA = (XA)^*, \tag{5.1}
where ^* denotes the conjugate transpose. The matrix X is called
(a) a generalized inverse of A, denoted by A^-, if (i) holds;
(b) a reflexive generalized inverse of A, denoted by A^-_r, if (i) and (ii) hold;
(c) a minimum norm inverse of A, denoted by A^-_m, if (i) and (iv) hold;
(d) a least-squares inverse of A, denoted by A^-_l, if (i) and (iii) hold;
(e) the Moore-Penrose inverse of A, denoted by A^+, if (i), (ii), (iii) and (iv) all hold.
Among these g-inverses only the Moore-Penrose inverse A^+ is unique; all the others are non-unique. If A is square and non-singular, all these inverses reduce to the conventional inverse A^{-1}. Since the Moore-Penrose inverse A^+ is unique, it is the one most widely used in applications.
Some important properties of the Moore-Penrose inverse are:
(i) (A^+)^+ = A; \tag{5.2}
(ii) (A^+)^t = (A^t)^+; \tag{5.3}
(iii) A^+ = A^{-1}, if A is square and non-singular; \tag{5.4}
(iv) in general, (AB)^+ \neq B^+A^+. \tag{5.5}
Let A be a matrix of order m x n, not necessarily square, and let A^t be its transpose.
* If the rank of A is 0, then A^+ is the null matrix of order n x m;
* if the rank of A is 1, then A^+ = \frac{1}{\mathrm{trace}(AA^t)}A^t;
* if the rank of A is n, then A^+ = (A^tA)^{-1}A^t;
* if the rank of A is m, then A^+ = A^t(AA^t)^{-1}.

The g-inverse A^+ of the matrix A is used to solve the system of equations
Ax = b, \; (b \neq 0), \tag{5.6}
where A is an m x n matrix and x, b are n x 1 and m x 1 vectors respectively. The solution of Ax = b is taken as
x = A^+b. \tag{5.7}
It can be shown that this solution has minimum Euclidean norm \|x\| = \sqrt{x^*x}. A vector x^* is called a least squares solution of Ax = b if Ax^* - b \neq 0 but \|Ax^* - b\| is minimum; that is, a least-squares solution minimizes the Euclidean norm \|Ax - b\| for an inconsistent system. The solution (5.7) of equation (5.6) is the minimum norm least squares solution. A few methods are available in the literature for finding the Moore-Penrose inverse; among them, Greville's algorithm is the one most frequently used.

5.1.1 Greville's algorithm to find the Moore-Penrose inverse

This is a recursive algorithm for finding the Moore-Penrose inverse of a matrix of order m x n.
Let
A = (\alpha_1 \; \alpha_2 \; \ldots \; \alpha_k \; \ldots \; \alpha_n), \tag{5.8}
where \alpha_k = (a_{1k}, a_{2k}, \ldots, a_{mk})^t is the kth column of the matrix A. Also, let A_k be the sub-matrix formed by the first k columns of A, so that
A_k = (\alpha_1 \; \alpha_2 \; \ldots \; \alpha_k) = (A_{k-1} \; \alpha_k).
Greville's algorithm is recursive over the columns. The initial condition of the recursion is: if \alpha_1 = 0 (a null column), then
A_1^+ = 0 \; (\text{a } 1 \times m \text{ null row}); \tag{5.9}
else
A_1^+ = (\alpha_1^t\alpha_1)^{-1}\alpha_1^t. \tag{5.10}
Now we define the column vectors \delta_k and \gamma_k as follows:
\delta_k = A_{k-1}^+\alpha_k \tag{5.11}
and
\gamma_k = \alpha_k - A_{k-1}\delta_k. \tag{5.12}
Another vector \beta_k is defined below.
If \gamma_k \neq 0, then \beta_k is defined as
\beta_k = \gamma_k^+ = (\gamma_k^t\gamma_k)^{-1}\gamma_k^t; \tag{5.13}
else
\beta_k = (1 + \delta_k^t\delta_k)^{-1}\delta_k^tA_{k-1}^+. \tag{5.14}
Then the matrix A_k^+ of the matrix A_k is computed as
A_k^+ = \begin{bmatrix} A_{k-1}^+ - \delta_k\beta_k\\ \beta_k \end{bmatrix}. \tag{5.15}
The process is continued for k = 1, 2, \ldots, n. The method is illustrated below.

Example 5.1 Find the g-inverse (Moore-Penrose inverse) of
\[
\begin{bmatrix} 0 & 1 & 0 & 2\\ 1 & 0 & 2 & 1\\ -1 & 2 & -1 & 0 \end{bmatrix}
\]
and hence solve the system of equations
x_2 + 2x_4 = 2, \quad x_1 + 2x_3 + x_4 = 3, \quad -x_1 + 2x_2 - x_3 = 1.

Solution. For this problem the vectors \alpha_i are
\alpha_1 = (0, 1, -1)^t, \; \alpha_2 = (1, 0, 2)^t, \; \alpha_3 = (0, 2, -1)^t, \; \alpha_4 = (2, 1, 0)^t.
Thus A_1 = (\alpha_1) = (0, 1, -1)^t. Now,
A_1^+ = (\alpha_1^t\alpha_1)^{-1}\alpha_1^t = \tfrac{1}{2}[0 \; 1 \; -1] = [0 \; 1/2 \; -1/2].
\delta_2 = A_1^+\alpha_2 = [0 \; 1/2 \; -1/2](1, 0, 2)^t = -1.
\gamma_2 = \alpha_2 - A_1\delta_2 = (1, 0, 2)^t + (0, 1, -1)^t = (1, 1, 1)^t \neq 0 (not the null column vector).
Hence
\beta_2 = \gamma_2^+ = (\gamma_2^t\gamma_2)^{-1}\gamma_2^t = \tfrac{1}{3}[1 \; 1 \; 1], \qquad
\delta_2\beta_2 = [-1/3 \; -1/3 \; -1/3],
\[
A_2^+ = \begin{bmatrix} A_1^+ - \delta_2\beta_2\\ \beta_2 \end{bmatrix}
= \begin{bmatrix} 1/3 & 5/6 & -1/6\\ 1/3 & 1/3 & 1/3 \end{bmatrix}.
\]
This is the second iterated matrix. For the third iteration,
\delta_3 = A_2^+\alpha_3 = (11/6, \; 1/3)^t,
\gamma_3 = \alpha_3 - A_2\delta_3 = (0, 2, -1)^t - (1/3, \; 11/6, \; -7/6)^t = (-1/3, \; 1/6, \; 1/6)^t \neq 0.
Hence
\beta_3 = \gamma_3^+ = (\gamma_3^t\gamma_3)^{-1}\gamma_3^t = 6\,[-1/3 \; 1/6 \; 1/6] = [-2 \; 1 \; 1],
\[
\delta_3\beta_3 = \begin{bmatrix} -11/3 & 11/6 & 11/6\\ -2/3 & 1/3 & 1/3 \end{bmatrix},
\]
and the third iterated matrix is
\[
A_3^+ = \begin{bmatrix} A_2^+ - \delta_3\beta_3\\ \beta_3 \end{bmatrix}
= \begin{bmatrix} 4 & -1 & -2\\ 1 & 0 & 0\\ -2 & 1 & 1 \end{bmatrix}.
\]
Now,
\delta_4 = A_3^+\alpha_4 = (7, \; 2, \; -3)^t,
\gamma_4 = \alpha_4 - A_3\delta_4 = (2, 1, 0)^t - (2, 1, 0)^t = 0 (the null column vector).
Thus,
\beta_4 = (1 + \delta_4^t\delta_4)^{-1}\delta_4^tA_3^+ = \tfrac{1}{63}[7 \; 2 \; -3]\,A_3^+ = \tfrac{1}{63}[36 \; -10 \; -17],
\[
\delta_4\beta_4 = \frac{1}{63}\begin{bmatrix} 252 & -70 & -119\\ 72 & -20 & -34\\ -108 & 30 & 51 \end{bmatrix},
\qquad
A_3^+ - \delta_4\beta_4 = \begin{bmatrix} 0 & 1/9 & -1/9\\ -1/7 & 20/63 & 34/63\\ -2/7 & 11/21 & 4/21 \end{bmatrix}.
\]
This is the last step, and hence
\[
A^+ = A_4^+ = \begin{bmatrix} A_3^+ - \delta_4\beta_4\\ \beta_4 \end{bmatrix}
= \begin{bmatrix} 0 & 1/9 & -1/9\\ -1/7 & 20/63 & 34/63\\ -2/7 & 11/21 & 4/21\\ 4/7 & -10/63 & -17/63 \end{bmatrix}.
\]
The given system of equations can be written as Ax = b with b = (2, 3, 1)^t. Note that the coefficient matrix is rectangular, so its inverse is A^+, not A^{-1}. Thus the solution is given by x = A^+b:
\[
x = A^+\begin{bmatrix} 2\\ 3\\ 1 \end{bmatrix} = \begin{bmatrix} 2/9\\ 76/63\\ 25/21\\ 25/63 \end{bmatrix}.
\]
Hence, the required solution is x_1 = 2/9, x_2 = 76/63, x_3 = 25/21, x_4 = 25/63.

Note 5.1 This problem is very interesting. Notice that a system of three equations in four variables has been solved by the g-inverse, which is not possible with the conventional inverse. Such a system has infinitely many solutions; the solution x = A^+b is the one of minimum Euclidean norm.
Note 5.2 It may be verified that A^+ satisfies all the conditions of (5.1). Again, A_3^+ = A_3^{-1}, as |A_3| = -1, i.e. A_3 is non-singular. Also, for this matrix A, AA^+ = I_3, but
\[
A^+A = \begin{bmatrix}
0.2222 & -0.2222 & 0.3333 & 0.1111\\
-0.2222 & 0.9365 & 0.0952 & 0.0317\\
0.3333 & 0.0952 & 0.8571 & -0.0476\\
0.1111 & 0.0317 & -0.0476 & 0.9841
\end{bmatrix} \neq I_4,
\]
the unit matrix of order 4.

Now we consider another example of finding a g-inverse, in which the first column of the matrix is zero.

Example 5.2 Find the Moore-Penrose inverse of the matrix
\[
\begin{bmatrix} 0 & 1 & 0 & 1\\ 0 & 1 & -1 & 1\\ 0 & 0 & 1 & 2 \end{bmatrix}.
\]
Use this matrix to solve the system of equations
x_1 + x_3 = 3, \quad x_1 - x_2 + x_3 = 2, \quad x_2 + 2x_3 = 1.
Solution. In this problem the vectors \alpha_i are
\alpha_1 = (0, 0, 0)^t, \; \alpha_2 = (1, 1, 0)^t, \; \alpha_3 = (0, -1, 1)^t, \; \alpha_4 = (1, 1, 2)^t.
Since \alpha_1 = 0 (the null column vector),
A_1^+ = [0 \; 0 \; 0].
Obviously \delta_2 = A_1^+\alpha_2 = [0] and \gamma_2 = \alpha_2 - A_1\delta_2 = \alpha_2 = (1, 1, 0)^t \neq 0. Now,
\beta_2 = (\gamma_2^t\gamma_2)^{-1}\gamma_2^t = \tfrac{1}{2}[1 \; 1 \; 0] = [1/2 \; 1/2 \; 0], \qquad \delta_2\beta_2 = [0 \; 0 \; 0],
\[
A_2^+ = \begin{bmatrix} A_1^+ - \delta_2\beta_2\\ \beta_2 \end{bmatrix}
= \begin{bmatrix} 0 & 0 & 0\\ 1/2 & 1/2 & 0 \end{bmatrix}.
\]
Next,
\delta_3 = A_2^+\alpha_3 = (0, \; -1/2)^t,
\gamma_3 = \alpha_3 - A_2\delta_3 = (0, -1, 1)^t - (-1/2, -1/2, 0)^t = (1/2, \; -1/2, \; 1)^t \neq 0.
Hence
\beta_3 = (\gamma_3^t\gamma_3)^{-1}\gamma_3^t = \tfrac{2}{3}[1/2 \; -1/2 \; 1] = [1/3 \; -1/3 \; 2/3],
\[
\delta_3\beta_3 = \begin{bmatrix} 0 & 0 & 0\\ -1/6 & 1/6 & -1/3 \end{bmatrix},
\qquad
A_3^+ = \begin{bmatrix} A_2^+ - \delta_3\beta_3\\ \beta_3 \end{bmatrix}
= \begin{bmatrix} 0 & 0 & 0\\ 2/3 & 1/3 & 1/3\\ 1/3 & -1/3 & 2/3 \end{bmatrix}.
\]
Now,
\delta_4 = A_3^+\alpha_4 = (0, \; 5/3, \; 4/3)^t,
\gamma_4 = \alpha_4 - A_3\delta_4 = (1, 1, 2)^t - (5/3, \; 1/3, \; 4/3)^t = (-2/3, \; 2/3, \; 2/3)^t \neq 0.
Thus
\beta_4 = (\gamma_4^t\gamma_4)^{-1}\gamma_4^t = \tfrac{3}{4}[-2/3 \; 2/3 \; 2/3] = [-1/2 \; 1/2 \; 1/2],
\[
\delta_4\beta_4 = \begin{bmatrix} 0 & 0 & 0\\ -5/6 & 5/6 & 5/6\\ -2/3 & 2/3 & 2/3 \end{bmatrix}.
\]
Therefore,
\[
A^+ = A_4^+ = \begin{bmatrix} A_3^+ - \delta_4\beta_4\\ \beta_4 \end{bmatrix}
= \begin{bmatrix} 0 & 0 & 0\\ 3/2 & -1/2 & -1/2\\ 1 & -1 & 0\\ -1/2 & 1/2 & 1/2 \end{bmatrix}.
\]
Notice that the first column of A is null and, correspondingly, the first row of A^+ is null.
The given system of equations can be written as Bx = b, where
\[
B = \begin{bmatrix} 1 & 0 & 1\\ 1 & -1 & 1\\ 0 & 1 & 2 \end{bmatrix}
\quad\text{and}\quad
b = \begin{bmatrix} 3\\ 2\\ 1 \end{bmatrix}.
\]
If we add a zero column at the beginning of the matrix B, it becomes the matrix A, i.e. A = [0 ⋮ B]. Therefore,
\[
A^+b = \begin{bmatrix} 0 & 0 & 0\\ 3/2 & -1/2 & -1/2\\ 1 & -1 & 0\\ -1/2 & 1/2 & 1/2 \end{bmatrix}
\begin{bmatrix} 3\\ 2\\ 1 \end{bmatrix}
= \begin{bmatrix} 0\\ 3\\ 1\\ 0 \end{bmatrix}.
\]
Discarding the first component (the first row of A^+ corresponds to the zero column added to B), the required solution is x_1 = 3, x_2 = 1, x_3 = 0.
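Greville's recursion (5.9)-(5.15) is short enough to code directly. The following NumPy sketch is my own illustration (the name greville is hypothetical); it handles both the null-column case and the \gamma_k = 0 case, and its output can be checked against NumPy's built-in pseudo-inverse.

```python
import numpy as np

def greville(A):
    """Moore-Penrose inverse by Greville's column-recursive algorithm
    (equations (5.9)-(5.15)); a sketch, not optimized."""
    A = A.astype(float)
    m, n = A.shape
    a1 = A[:, :1]
    if not a1.any():                          # first column null: (5.9)
        Aplus = np.zeros((1, m))
    else:                                     # (5.10)
        Aplus = a1.T / (a1.T @ a1)
    for k in range(1, n):
        ak = A[:, k:k + 1]
        delta = Aplus @ ak                    # (5.11)
        gamma = ak - A[:, :k] @ delta         # (5.12)
        if np.allclose(gamma, 0):             # (5.14)
            beta = (delta.T @ Aplus) / (1.0 + delta.T @ delta)
        else:                                 # (5.13)
            beta = gamma.T / (gamma.T @ gamma)
        Aplus = np.vstack([Aplus - delta @ beta, beta])   # (5.15)
    return Aplus

A = np.array([[0., 1., 0., 2.], [1., 0., 2., 1.], [-1., 2., -1., 0.]])
Aplus = greville(A)
print(np.allclose(Aplus, np.linalg.pinv(A)))      # True
print(Aplus @ np.array([2., 3., 1.]))             # (2/9, 76/63, 25/21, 25/63)
```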
Chapter 5 Solution of System of Linear Equations
Module No. 6 Solution of Inconsistent and Ill Conditioned Systems
In the previous modules we discussed several methods for solving a system of linear equations, assuming the given system to be well-posed: if one (or more) coefficient of the system is changed slightly, there is no major change in the solution. Otherwise the system of equations is called ill-posed or ill-conditioned. In this module we discuss methods for solving ill-conditioned systems of equations. Before that, we define some basic terms from linear algebra which are used to describe the methods.
6.1 Vector and matrix norms

Let x = (x_1, x_2, \ldots, x_n) be a vector of dimension n. The norm of the vector x, denoted by \|x\|, measures the size or length of x. It is a mapping from the set of vectors to the real numbers satisfying the following conditions:
(i) \|x\| \geq 0, and \|x\| = 0 iff x = 0; \tag{6.1}
(ii) \|\alpha x\| = |\alpha|\,\|x\| for any real scalar \alpha; \tag{6.2}
(iii) \|x + y\| \leq \|x\| + \|y\| (triangle inequality). \tag{6.3}
Several types of norms have been defined by different authors. The most useful vector norms are:
(i) \|x\|_1 = \sum_{i=1}^n |x_i|; \tag{6.4}
(ii) \|x\|_2 = \Big(\sum_{i=1}^n |x_i|^2\Big)^{1/2} (the Euclidean norm); \tag{6.5}
(iii) \|x\|_\infty = \max_i |x_i| (the maximum or uniform norm). \tag{6.6}
Now we define matrix norms. Let A and B be two matrices such that A + B and AB are defined. The norm of a matrix A is denoted by \|A\| and it
satisfies the following conditions:
(i) \|A\| \geq 0, and \|A\| = 0 iff A = 0; \tag{6.7}
(ii) \|\alpha A\| = |\alpha|\,\|A\|, \alpha a real scalar; \tag{6.8}
(iii) \|A + B\| \leq \|A\| + \|B\|; \tag{6.9}
(iv) \|AB\| \leq \|A\|\,\|B\|. \tag{6.10}
From (6.10) it follows that
\|A^k\| \leq \|A\|^k \tag{6.11}
for any positive integer k. Like the vector norms, some common matrix norms are:
(i) \|A\|_1 = \max_j \sum_i |a_{ij}| (the column norm); \tag{6.12}
(ii) \|A\|_2 = \Big(\sum_i \sum_j |a_{ij}|^2\Big)^{1/2} (the Euclidean norm); \tag{6.13}
(iii) \|A\|_\infty = \max_i \sum_j |a_{ij}| (the row norm). \tag{6.14}
The Euclidean norm is also known as the Erhard-Schmidt norm, the Schur norm or the Frobenius norm. The concept of matrix norm is used to study the convergence of iterative methods for solving systems of linear equations; it is also used to study the stability of a system of equations.

Example 6.1 Let
\[
A = \begin{bmatrix} 1 & 0 & -4 & 1\\ 4 & 5 & 7 & 0\\ 1 & -2 & 0 & 3 \end{bmatrix}.
\]
Find the matrix norms \|A\|_1, \|A\|_2 and \|A\|_\infty.

Solution. Taking absolute values of the entries,
\|A\|_1 = \max\{1+4+1, \; 0+5+2, \; 4+7+0, \; 1+0+3\} = \max\{6, 7, 11, 4\} = 11,
\|A\|_2 = \sqrt{1^2 + 0^2 + (-4)^2 + 1^2 + 4^2 + 5^2 + 7^2 + 0^2 + 1^2 + (-2)^2 + 0^2 + 3^2} = \sqrt{122},
\|A\|_\infty = \max\{1+0+4+1, \; 4+5+7+0, \; 1+2+0+3\} = \max\{6, 16, 6\} = 16.
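NumPy implements all three norms, which gives a quick check of Example 6.1 (a small illustration of my own):

```python
import numpy as np

A = np.array([[1., 0., -4., 1.], [4., 5., 7., 0.], [1., -2., 0., 3.]])
print(np.linalg.norm(A, 1))        # 11.0   -- maximum absolute column sum
print(np.linalg.norm(A, 'fro'))    # 11.045 -- sqrt(122), the Euclidean norm
print(np.linalg.norm(A, np.inf))   # 16.0   -- maximum absolute row sum
```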
6.2 Ill-conditioned systems of linear equations

Let us consider the system of linear equations
x + \tfrac{1}{3}y = 1.33, \quad 3x + y = 4. \tag{6.15}
It is easy to verify that this system has no solution (multiplying the first equation by 3 gives 3x + y = 3.99 \neq 4). But for different approximate values of 1/3 the system behaves in a very interesting way.

First take 1/3 \simeq 0.3. Then the system becomes
x + 0.3y = 1.33, \quad 3x + y = 4, \tag{6.16}
whose solution is x = 1.3, y = 0.1. If we approximate 1/3 by 0.33, the reduced system is
x + 0.33y = 1.33, \quad 3x + y = 4, \tag{6.17}
and its solution is x = 1, y = 1. If the approximation is 0.333, the system is
x + 0.333y = 1.33, \quad 3x + y = 4, \tag{6.18}
and its solution is x = -2, y = 10. When 1/3 \simeq 0.3333, the system is
x + 0.3333y = 1.33, \quad 3x + y = 4, \tag{6.19}
and its solution is x = -32, y = 100.

Note the systems of equations (6.16)-(6.19) and their solutions: the situation is very confusing. Which is the better approximation of 1/3, 0.3 or 0.3333? Observe that the solution changes enormously while the coefficient of y in the first equation is increased only slightly. That is, a small change in one coefficient produces a large change in the solution. Such systems are called ill-conditioned or ill-posed. If, on the other hand, small changes in the coefficients produce only small changes in the solution, the system is called well-conditioned or well-posed.

Let us consider the system of equations
Ax = b. \tag{6.20}
Suppose one or more elements of A and/or b are changed, giving A' and b', and let y be the solution of the new system, i.e.
A'y = b'. \tag{6.21}
Assume that the changes in the coefficients are very small. The system (6.20) is called ill-conditioned when the change from x to y is large compared to the solution vector x of (6.20); otherwise it is well-conditioned. If a system is ill-conditioned, the corresponding coefficient matrix is called an ill-conditioned matrix. For the above problem, i.e. for the system (6.17), the coefficient matrix is
\begin{bmatrix} 1 & 0.33\\ 3 & 1 \end{bmatrix},
and it is an ill-conditioned matrix.
When |A| is small then, in general, the matrix A is ill-conditioned. But the term "small" has no definite meaning, so several measures of ill-conditioning have been suggested. One simple measure is the condition number of A, defined by
Cond(A) = \|A\|\,\|A^{-1}\|, \tag{6.22}
where \|A\| is any matrix norm. If Cond(A) is large, the matrix and the corresponding system of equations are called ill-conditioned; if Cond(A) is small, they are called well-conditioned.
Let us consider two matrices to illustrate the ill-conditioned and well-conditioned cases. Let
A = \begin{bmatrix} 0.33 & 1\\ 1 & 3 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 4 & 4\\ 3 & 5 \end{bmatrix}.
Then
A^{-1} = \begin{bmatrix} -300 & 100\\ 100 & -33 \end{bmatrix} \quad\text{and}\quad B^{-1} = \begin{bmatrix} 0.625 & -0.500\\ -0.375 & 0.500 \end{bmatrix}.
The Euclidean norms are \|A\|_2 = \sqrt{0.1089 + 1 + 1 + 9} = 3.3330 and \|A^{-1}\|_2 = 333.300. Thus Cond(A) = \|A\|_2\|A^{-1}\|_2 = 3.3330 \times 333.300 = 1110.8889, a very large number; hence A is ill-conditioned. For the matrix B, \|B\|_2 = \sqrt{16 + 16 + 9 + 25} = 8.1240 and \|B^{-1}\|_2 = 1.01550, so Cond(B) = 8.24992, a relatively small quantity; thus the matrix B is well-conditioned.

The value of Cond(A) lies between 0 and \infty, and "large" again has no precise meaning, so this measure is not entirely satisfactory. We therefore define another parameter whose value lies between 0 and 1. Let A = [a_{ij}] be a matrix and r_i = \big(\sum_{j=1}^n a_{ij}^2\big)^{1/2}, i = 1, 2, \ldots, n. The quantity
\nu(A) = \frac{|A|}{r_1r_2\cdots r_n} \tag{6.23}
measures the smallness of the determinant |A|. It can be shown that -1 \leq \nu \leq 1. If |\nu(A)| is close to zero, then the matrix A is ill-conditioned, and if it is close to 1, then A is well-conditioned.

For the matrix A = \begin{bmatrix} 1 & 4\\ 0.22 & 1 \end{bmatrix}, r_1 = \sqrt{17}, r_2 = 1.0239, |A| = 0.12, so \nu(A) = \frac{0.12}{\sqrt{17} \times 1.0239} = 0.0284; and for the matrix B = \begin{bmatrix} 3 & 5\\ -2 & 2 \end{bmatrix}, r_1 = \sqrt{34}, r_2 = \sqrt{8}, |B| = 16, so \nu(B) = \frac{16}{\sqrt{34}\times\sqrt{8}} = 0.9702. Thus the matrix A is ill-conditioned, while B is well-conditioned, its \nu-value being very close to 1.
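Both measures are one-liners in NumPy. The helper names below are my own; the computations match the two worked examples.

```python
import numpy as np

def cond_euclid(A):
    """Cond(A) = ||A|| * ||A^-1|| with the Euclidean (Frobenius) norm (6.22)."""
    return np.linalg.norm(A, 'fro') * np.linalg.norm(np.linalg.inv(A), 'fro')

def nu(A):
    """nu(A) = |A| / (r1 r2 ... rn), the smallness measure (6.23)."""
    r = np.sqrt((A ** 2).sum(axis=1))   # row norms r_i
    return np.linalg.det(A) / np.prod(r)

print(cond_euclid(np.array([[0.33, 1.], [1., 3.]])))   # about 1110.9
print(cond_euclid(np.array([[4., 4.], [3., 5.]])))     # about 8.25
print(nu(np.array([[1., 4.], [0.22, 1.]])))            # about 0.0284
print(nu(np.array([[3., 5.], [-2., 2.]])))             # about 0.9702
```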
6.3 Least squares method for inconsistent systems

Let us consider a system of equations in which the number of equations is not equal to the number of variables:
Ax = b, \tag{6.24}
where A, x and b are of order m x n, n x 1 and m x 1 respectively. Note that the coefficient matrix is rectangular; thus either the system has no solution or it has an infinite number of solutions. Assume that the system is inconsistent, so that it has no solution in the usual sense. It may, however, have a least squares solution: a vector x_0 is a least squares solution if Ax_0 - b \neq 0 but \|Ax_0 - b\| is minimum. The solution x_m is called the minimum norm least squares solution if
\|x_m\| \leq \|x_l\| \tag{6.25}
for every x_l such that
\|Ax_l - b\| \leq \|Ax - b\| \quad\text{for all } x. \tag{6.26}
Since A is a rectangular matrix, such a solution can be determined from
x = A^+b, \tag{6.27}
where A^+ is the g-inverse of A. Since the Moore-Penrose inverse A^+ is unique, the minimum norm least squares solution is unique.

The solution can also be determined without finding the g-inverse of A. This method is described below. If x is an exact solution of Ax = b, then Ax - b = 0; otherwise Ax - b is a non-null vector of order m x 1, whose components are
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n - b_1,
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n - b_2,
\cdots\cdots\cdots
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n - b_m.
Let the square of the norm \|Ax - b\| be denoted by S. Then
S = (a_{11}x_1 + \cdots + a_{1n}x_n - b_1)^2 + (a_{21}x_1 + \cdots + a_{2n}x_n - b_2)^2 + \cdots + (a_{m1}x_1 + \cdots + a_{mn}x_n - b_m)^2
= \sum_{i=1}^m \Big(\sum_{j=1}^n a_{ij}x_j - b_i\Big)^2. \tag{6.28}
The quantity S is called the sum of squares of the residuals. Our aim is to find the vector x = (x_1, x_2, \ldots, x_n)^t for which S is minimum. The necessary conditions for S to be minimum are
\frac{\partial S}{\partial x_1} = 0, \; \frac{\partial S}{\partial x_2} = 0, \; \ldots, \; \frac{\partial S}{\partial x_n} = 0. \tag{6.29}
(Since S is a convex quadratic function of x, these conditions indeed give a minimum.) The system of equations (6.29), called the normal equations, is non-homogeneous and contains n equations in the n unknowns x_1, x_2, \ldots, x_n; it can be solved by any method described in the previous modules. Let x_1 = x^*_1, x_2 = x^*_2, \ldots, x_n = x^*_n be its solution. Then the least squares solution of the system (6.24) is
x^* = (x^*_1, x^*_2, \ldots, x^*_n)^t. \tag{6.30}
The sum of squares of the residuals (i.e. the sum of the squares of the absolute errors) is given by
S^* = \sum_{i=1}^m \Big(\sum_{j=1}^n a_{ij}x^*_j - b_i\Big)^2. \tag{6.31}
Let us consider two examples to illustrate the least squares method for inconsistent systems of equations.

Example 6.2 Find the g-inverse of the singular matrix A = \begin{bmatrix} 4 & 8\\ 1 & 2 \end{bmatrix} and hence find a least squares solution of the inconsistent system 4x + 8y = 2, x + 2y = 1.

Solution. Let \alpha_1 = (4, 1)^t, \alpha_2 = (8, 2)^t, A_1 = (4, 1)^t. Then
A_1^+ = (\alpha_1^t\alpha_1)^{-1}\alpha_1^t = \tfrac{1}{17}[4 \; 1],
\delta_2 = A_1^+\alpha_2 = \tfrac{32 + 2}{17} = 2,
\gamma_2 = \alpha_2 - A_1\delta_2 = (8, 2)^t - 2(4, 1)^t = (0, 0)^t (a null vector).
Hence
\beta_2 = (1 + \delta_2^t\delta_2)^{-1}\delta_2^tA_1^+ = \tfrac{1}{5}\cdot 2\cdot\tfrac{1}{17}[4 \; 1] = [8/85 \; 2/85],
\delta_2\beta_2 = [16/85 \; 4/85].
Therefore,
\[
A_2^+ = \begin{bmatrix} A_1^+ - \delta_2\beta_2\\ \beta_2 \end{bmatrix}
= \begin{bmatrix} 4/85 & 1/85\\ 8/85 & 2/85 \end{bmatrix}.
\]
This is the g-inverse of A.

Second part: in matrix notation the given system of equations is Ax = b, where
A = \begin{bmatrix} 4 & 8\\ 1 & 2 \end{bmatrix}, \quad x = \begin{bmatrix} x\\ y \end{bmatrix}, \quad b = \begin{bmatrix} 2\\ 1 \end{bmatrix}.
Note that the coefficient matrix is singular, so the system has no conventional solution. Its least squares solution is
\[
x = A^+b = \frac{1}{85}\begin{bmatrix} 4 & 1\\ 8 & 2 \end{bmatrix}\begin{bmatrix} 2\\ 1 \end{bmatrix}
= \begin{bmatrix} 9/85\\ 18/85 \end{bmatrix}.
\]
Hence the least squares solution is x = 9/85, y = 18/85.
Example 6.3 Find the least squares solution of the following system of linear equations x + 2y = 2.0, x − y = 1.0, x + 3y = 2.3, and 2x + y = 2.9. Also, estimate the residual. Solution. Let x∗ , y ∗ be the least squares solution of the given system of equations. Then the sum of square of residuals S is S = (x∗ + 2y ∗ − 2.0)2 + (x∗ − y ∗ − 1.0)2 + (x∗ + 3y ∗ − 2.3)2 + (2x∗ + y ∗ − 2.9)2 . Now, the problem is to find the values of x∗ and y ∗ in such a way that S is minimum. Thus, ∂S ∂S = 0 and = 0. ∗ ∂x ∂y ∗ 8
...................................................................................... Therefore the normal equations are, 2(x∗ + 2y ∗ − 2.0) + 2(x∗ − y ∗ − 1.0) + 2(x∗ + 3y ∗ − 2.3) + 4(2x∗ + y ∗ − 2.9) = 0 and 4(x∗ + 2y ∗ − 2.0) − 2(x∗ − y ∗ − 1.0) + 6(x∗ + 3y ∗ − 2.3) + 2(2x∗ + y ∗ − 2.9) = 0. After simplification, these equations reduce to 7x∗ +6y ∗ = 11.1 and 6x∗ +15y ∗ = 12.8. 1 The solution of these equations is x∗ = 1.3 and y ∗ = = 0.3333. This is the least 3 squares solution of the given system of equations. The sum of the square of residuals is S = (1.3 + 2 × 0.3333 − 2)2 + (1.3 − 0.3333 − 1)2 + (1.3 + 3 × 0.3333 − 2.3)2 + (2 × 1.3 + 0.3333 − 2.9)2 = 0.0033.
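The normal equations (6.29) are equivalent to A^tAx = A^tb, which NumPy solves directly; the built-in least squares routine gives the same answer. This small check of Example 6.3 is my own illustration.

```python
import numpy as np

A = np.array([[1., 2.], [1., -1.], [1., 3.], [2., 1.]])
b = np.array([2.0, 1.0, 2.3, 2.9])

# Solve the normal equations A^t A x = A^t b
x = np.linalg.solve(A.T @ A, A.T @ b)
print(x)                               # [1.3    0.3333]
print(np.sum((A @ x - b) ** 2))        # 0.0033, the residual S*

# The same result from the built-in least squares routine
print(np.linalg.lstsq(A, b, rcond=None)[0])
```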
6.4 Method to solve an ill-conditioned system

It is very difficult to solve an ill-conditioned system of equations, and only a few methods are available. One simple idea is to carry out the calculations with a large number of significant digits, but computation with more significant digits takes much time. A better approach is to improve the accuracy of an approximate solution by an iterative method, considered below.

Let us consider the ill-conditioned system of equations
\sum_{j=1}^n a_{ij}x_j = b_i, \quad i = 1, 2, \ldots, n. \tag{6.32}
Let \{\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_n\} be an approximate solution of (6.32). Since this solution is approximate, \sum_j a_{ij}\tilde{x}_j is not necessarily equal to b_i; denote the right-hand side obtained from this solution by \tilde{b}_i. Thus, for this solution, equation (6.32) becomes
\sum_{j=1}^n a_{ij}\tilde{x}_j = \tilde{b}_i, \quad i = 1, 2, \ldots, n. \tag{6.33}
Subtracting (6.33) from (6.32), we get
\sum_{j=1}^n a_{ij}(x_j - \tilde{x}_j) = b_i - \tilde{b}_i, \quad\text{i.e.}\quad \sum_{j=1}^n a_{ij}\varepsilon_j = d_i, \tag{6.34}
where \varepsilon_j = x_j - \tilde{x}_j and d_i = b_i - \tilde{b}_i, i, j = 1, 2, \ldots, n. Equation (6.34) is again a system of linear equations whose unknowns are \varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n. Solving these equations, we obtain the values of the \varepsilon_j's; the new solution x_j = \tilde{x}_j + \varepsilon_j is then a better approximation than \tilde{x}_j. This technique may be repeated to improve the solution further.
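The correction scheme (6.34) is known as iterative refinement. The following sketch (my own; in serious use the residual d should be accumulated in higher precision than the rest of the computation) corrects an approximate solution obtained from any source.

```python
import numpy as np

def refine(A, b, x_tilde, steps=2):
    """Improve an approximate solution x_tilde by repeatedly solving
    A*eps = d, where d = b - A*x_tilde, and setting x = x_tilde + eps
    (equation (6.34))."""
    x = x_tilde.astype(float).copy()
    for _ in range(steps):
        d = b - A @ x                 # d_i = b_i - b_tilde_i
        x += np.linalg.solve(A, d)    # eps_j = x_j - x_tilde_j
    return x

# The ill-conditioned system (6.19): x + 0.3333 y = 1.33, 3x + y = 4
A = np.array([[1., 0.3333], [3., 1.]])
b = np.array([1.33, 4.])
x_tilde = np.array([1., 1.])          # rough approximate solution
print(refine(A, b, x_tilde))          # converges to the exact (-32, 100)
```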
6.5 The relaxation method

The relaxation method, developed by Southwell, is an iterative method for solving a system of linear equations. Let
\sum_{j=1}^n a_{ij}x_j = b_i \tag{6.35}
be the ith (i = 1, 2, \ldots, n) equation of the system, and let x^{(k)} = (x_1^{(k)}, x_2^{(k)}, \ldots, x_n^{(k)})^t be the kth iterated solution, so that \sum_j a_{ij}x_j^{(k)} \simeq b_i. The kth iterated residual of the ith equation is
r_i^{(k)} = b_i - \sum_{j=1}^n a_{ij}x_j^{(k)}, \quad i = 1, 2, \ldots, n. \tag{6.36}
If r_i^{(k)} = 0 for all i = 1, 2, \ldots, n, then x^{(k)} is the exact solution of the given system. If the residuals are not zero, or not all small, the same device is applied again to reduce them.

In the relaxation method the solution is improved successively by reducing the largest residual to zero at each iteration. For fast convergence the equations are rearranged so that the largest coefficients appear on the diagonal, i.e. so that the coefficient matrix becomes diagonally dominant.

Let r_p be the residual of largest magnitude at a particular iteration, occurring in the pth equation. Then the value of the variable x_p is changed by
dx_p = \frac{r_p}{a_{pp}},
i.e. x_p is replaced by x_p + dx_p, to relax r_p, that is, to reduce r_p to zero. The modified solution after this iteration is
x^{(k+1)} = \big(x_1^{(k)}, \ldots, x_{p-1}^{(k)}, \; x_p^{(k)} + dx_p, \; x_{p+1}^{(k)}, \ldots, x_n^{(k)}\big).
The process is repeated until all the residuals become zero or tend to zero.

Example 6.4 Solve the following system of linear equations by the relaxation method, taking (0, 0, 0) as the initial solution:
27x + 6y - z = 54, \quad 6x + 15y + 2z = 72, \quad x + y + 54z = 110.

Solution. The given system of equations is diagonally dominant. The residuals r_1, r_2, r_3 are given by
r_1 = 54 - 27x - 6y + z, \quad r_2 = 72 - 6x - 15y - 2z, \quad r_3 = 110 - x - y - 54z.
For the initial solution x = y = z = 0 the residuals are r_1 = 54, r_2 = 72, r_3 = 110. The largest residual in magnitude is r_3, so the third equation has the most error and z is improved first. The increment in z is
dz = \frac{r_3}{a_{33}} = \frac{110}{54} = 2.037.
Thus the first iterated solution is (0, 0, 2.037). In the next iteration we find the residual of largest magnitude and relax it to zero, and so on, until all the residuals become zero or very small. All the steps are shown below:
  k |   r1     |   r2    |   r3    | largest | p |  dx_p  |   x    |   y    |   z
  0 |    -     |    -    |    -    |    -    | - |   -    | 0      | 0      | 0
  1 |  54      |  72     | 110     | 110     | 3 |  2.037 | 0      | 0      | 2.037
  2 |  56.037  |  67.926 |   0.003 |  67.926 | 2 |  4.528 | 0      | 4.528  | 2.037
  3 |  28.869  |   0.006 |  -0.4526|  28.869 | 1 |  1.069 | 1.069  | 4.528  | 2.037
  4 |   0.006  |  -6.408 |  -5.89  |  -6.408 | 2 | -0.427 | 1.069  | 4.101  | 2.037
  5 |   2.568  |  -0.003 |  -5.168 |  -5.168 | 3 | -0.096 | 1.069  | 4.101  | 1.941
  6 |   2.472  |   0.189 |   0.016 |   2.472 | 1 |  0.092 | 1.161  | 4.101  | 1.941
  7 |  -0.012  |  -0.363 |  -0.076 |  -0.363 | 2 | -0.024 | 1.161  | 4.077  | 1.941
  8 |   0.132  |  -0.003 |  -0.052 |   0.132 | 1 |  0.005 | 1.166  | 4.077  | 1.941
  9 |  -0.003  |  -0.033 |  -0.057 |  -0.057 | 3 |  0.001 | 1.166  | 4.077  | 1.940
 10 |  -0.004  |  -0.031 |  -0.003 |  -0.031 | 2 | -0.002 | 1.166  | 4.075  | 1.940
 11 |   0.008  |  -0.001 |  -0.001 |   0.008 | 1 |  0.000 | 1.166  | 4.075  | 1.940
 12 |  -0.008  |  -0.001 |  -0.001 |   0.008 | 2 |  0.000 | 1.166  | 4.075  | 1.940

At this stage all residuals are very small. The solution of the given system of equations is x = 1.166, y = 4.075, z = 1.940, correct up to three decimal places.
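A direct implementation of the relaxation sweep is short (my own sketch; the names are hypothetical). Each step relaxes the residual of largest magnitude, exactly as in the table above.

```python
import numpy as np

def relaxation(A, b, x=None, tol=1e-3, max_iter=100):
    """Southwell's relaxation: repeatedly set the residual of largest
    magnitude to zero via dx_p = r_p / a_pp."""
    n = len(b)
    x = np.zeros(n) if x is None else x.astype(float)
    for _ in range(max_iter):
        r = b - A @ x
        p = np.argmax(np.abs(r))      # equation with the largest residual
        if abs(r[p]) < tol:
            break
        x[p] += r[p] / A[p, p]
    return x

A = np.array([[27., 6., -1.], [6., 15., 2.], [1., 1., 54.]])
b = np.array([54., 72., 110.])
print(relaxation(A, b))   # about [1.166  4.075  1.940]
```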
6.6 Successive overrelaxation (S.O.R.) method

The relaxation method can be modified to achieve faster convergence. For this purpose a suitable relaxation factor w is introduced. The ith equation of the system \sum_{j=1}^n a_{ij}x_j = b_i, i = 1, 2, \ldots, n, is
\sum_{j=1}^n a_{ij}x_j = b_i. \tag{6.37}
This equation can be written as
\sum_{j=1}^{i-1} a_{ij}x_j + \sum_{j=i}^{n} a_{ij}x_j = b_i. \tag{6.38}
Let (x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)}) be the initial solution, and let
\big(x_1^{(k+1)}, \ldots, x_{i-1}^{(k+1)}, \; x_i^{(k)}, x_{i+1}^{(k)}, \ldots, x_n^{(k)}\big)
be the solution when the ith equation is being considered. Then equation (6.38) becomes
\sum_{j=1}^{i-1} a_{ij}x_j^{(k+1)} + \sum_{j=i}^{n} a_{ij}x_j^{(k)} = b_i. \tag{6.39}
Since this is only an approximate solution of the given system, the residual of the ith equation is
r_i = b_i - \sum_{j=1}^{i-1} a_{ij}x_j^{(k+1)} - \sum_{j=i}^{n} a_{ij}x_j^{(k)}, \quad i = 1, 2, \ldots, n. \tag{6.40}
We denote the difference of x_i at two consecutive iterations by \varepsilon_i^{(k)} = x_i^{(k+1)} - x_i^{(k)}. In the successive overrelaxation (SOR) method one sets
a_{ii}\varepsilon_i^{(k)} = w\,r_i, \quad i = 1, 2, \ldots, n, \tag{6.41}
where the scalar w is called the relaxation factor. Thus equation (6.41) becomes
a_{ii}x_i^{(k+1)} = a_{ii}x_i^{(k)} - w\Big[\sum_{j=1}^{i-1} a_{ij}x_j^{(k+1)} + \sum_{j=i}^{n} a_{ij}x_j^{(k)} - b_i\Big], \tag{6.42}
i = 1, 2, \ldots, n; \; k = 0, 1, 2, \ldots
The iteration process is repeated until the desired accuracy is achieved. The method is called the overrelaxation method when 1 < w < 2 and the underrelaxation method when 0 < w < 1. When w = 1, it becomes the well-known Gauss-Seidel iteration method.
The proper choice of w can speed up the convergence of the iteration scheme, and it depends on the given system of equations.

Example 6.5 Solve the system of linear equations 4x_1 + 2x_2 + x_3 = 5, x_1 + 5x_2 + 2x_3 = 6, -x_1 + x_2 + 7x_3 = 2 by the SOR method, taking the relaxation factor w = 1.02.

Solution. The SOR iteration scheme for the given system of equations is
4x_1^{(k+1)} = 4x_1^{(k)} - 1.02\big[4x_1^{(k)} + 2x_2^{(k)} + x_3^{(k)} - 5\big],
5x_2^{(k+1)} = 5x_2^{(k)} - 1.02\big[x_1^{(k+1)} + 5x_2^{(k)} + 2x_3^{(k)} - 6\big],
7x_3^{(k+1)} = 7x_3^{(k)} - 1.02\big[-x_1^{(k+1)} + x_2^{(k+1)} + 7x_3^{(k)} - 2\big].
Let x_1^{(0)} = x_2^{(0)} = x_3^{(0)} = 0. The iterations are shown below:

  k |   x_1     |   x_2     |   x_3
  0 | 0         | 0         | 0
  1 | 1.275     | 0.9639    | 0.33676
  2 | 0.67204   | 0.93023   | 0.24707
  3 | 0.72414   | 0.95686   | 0.25257
  4 | 0.70811   | 0.95736   | 0.25006
  5 | 0.70882   | 0.95823   | 0.25008
  6 | 0.70835   | 0.95829   | 0.25001
  7 | 0.70835   | 0.95832   | 0.25000
  8 | 0.70833   | 0.95833   | 0.25000
  9 | 0.70833   | 0.95833   | 0.25000

The solutions at the 8th and 9th iterations are the same. Hence the required solution, correct up to four decimal places, is x_1 = 0.7083, x_2 = 0.9583, x_3 = 0.2500.
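Equation (6.42) maps directly onto a double loop. The following sketch is my own (the name sor is hypothetical); with w = 1 it reduces to Gauss-Seidel, and with w = 1.02 it reproduces Example 6.5.

```python
import numpy as np

def sor(A, b, w=1.02, x=None, tol=1e-6, max_iter=200):
    """Successive overrelaxation, equation (6.42)."""
    n = len(b)
    x = np.zeros(n) if x is None else x.astype(float)
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            # sum over updated components (j < i) and old components (j >= i)
            s = A[i, :i] @ x[:i] + A[i, i:] @ x_old[i:]
            x[i] = x_old[i] - w * (s - b[i]) / A[i, i]
        if np.linalg.norm(x - x_old, np.inf) < tol:
            break
    return x

A = np.array([[4., 2., 1.], [1., 5., 2.], [-1., 1., 7.]])
b = np.array([5., 6., 2.])
print(sor(A, b, w=1.02))   # [0.7083 0.9583 0.25]
```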
Chapter 6 Eigenvalues and Eigenvectors of Matrix
Module No. 1 Construction of Characteristic Equation of a Matrix
1.1 Eigenvalue of a matrix

Computation of the eigenvalues and eigenvectors of a matrix is a very important problem of linear algebra; eigenvalues and eigenvectors are used to solve many problems of mathematics, physics, chemistry and engineering.

Let A be a square matrix of order n. If A satisfies the equation
AX = \lambda X \tag{1.1}
for some non-null column vector X and scalar \lambda, then \lambda is called an eigenvalue (or characteristic value) of the matrix A and X the corresponding eigenvector. The matrix equation (1.1) can be written as (A - \lambda I)X = 0. The equation
|A - \lambda I| = 0 \tag{1.2}
is called the characteristic equation of the matrix A. In explicit form it is written as
\[
\begin{vmatrix}
a_{11} - \lambda & a_{12} & a_{13} & \cdots & a_{1n}\\
a_{21} & a_{22} - \lambda & a_{23} & \cdots & a_{2n}\\
\cdots & \cdots & \cdots & \cdots & \cdots\\
a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nn} - \lambda
\end{vmatrix} = 0. \tag{1.3}
\]
Note that this is a polynomial equation in \lambda of degree n, where n is the order of the matrix. The roots \lambda_i, i = 1, 2, \ldots, n, of the equation (1.2) are the eigenvalues of A, and for each eigenvalue \lambda_i there exists a column vector X_i such that
AX_i = \lambda_iX_i. \tag{1.4}
The eigenvalues and eigenvectors satisfy many interesting properties. The eigenvalues may be real or complex, and either distinct or repeated. The eigenvalues of a real symmetric matrix are real, and those of a real skew-symmetric matrix are either zero or purely imaginary. The set of all eigenvalues of a matrix A is called the spectrum of A, and the largest eigenvalue in magnitude is called the spectral radius of A.
The eigenvalues of a given matrix are bounded; an upper bound is given by the following theorem.

Theorem 1.1 The largest eigenvalue (in magnitude) of a square matrix A cannot exceed the largest sum of the moduli of the elements along any row or any column, i.e.
|\lambda| \leq \max_i \sum_{j=1}^n |a_{ij}| \quad\text{and}\quad |\lambda| \leq \max_j \sum_{i=1}^n |a_{ij}|. \tag{1.5}

The following is another interesting property of eigenvalues.

Theorem 1.2 (Shifting of eigenvalues) Let \lambda be an eigenvalue of the matrix A and X its corresponding eigenvector. If c is any constant, then \lambda - c is an eigenvalue of the matrix A - cI with the same eigenvector X.

There are also relations between the eigenvalues and the elements of the matrix. Let
\det(A - \lambda I) = \lambda^n + c_1\lambda^{n-1} + c_2\lambda^{n-2} + \cdots + c_n \tag{1.6}
be the characteristic polynomial of the matrix A. Then
-c_1 = \sum_i a_{ii} = \mathrm{Tr}\,A, the trace of A (the sum of the diagonal elements);
c_2 = \sum_{i<j}\begin{vmatrix} a_{ii} & a_{ij}\\ a_{ji} & a_{jj} \end{vmatrix}, the sum of all principal minors of order two of A;
-c_3 = \sum_{i<j<k}\begin{vmatrix} a_{ii} & a_{ij} & a_{ik}\\ a_{ji} & a_{jj} & a_{jk}\\ a_{ki} & a_{kj} & a_{kk} \end{vmatrix}, the sum of all principal minors of order three of A;
and in general (-1)^kc_k is the sum of all principal minors of order k of A; in particular, (-1)^nc_n = \det A.
1.2 Leverrier-Faddeev method to construct the characteristic equation

The Leverrier-Faddeev method was developed by Leverrier and later modified by Faddeev. Let
\det(A - \lambda I) \equiv \lambda^n + c_1\lambda^{n-1} + c_2\lambda^{n-2} + \cdots + c_n = 0 \tag{1.7}
be the characteristic equation of a square matrix A of order n x n, and let the roots (i.e. the eigenvalues) of this equation be \lambda_1, \lambda_2, \ldots, \lambda_n. Let S_k be the sum of the kth powers of these roots:
S_1 = \lambda_1 + \lambda_2 + \cdots + \lambda_n = \mathrm{Tr}\,A,
S_2 = \lambda_1^2 + \lambda_2^2 + \cdots + \lambda_n^2 = \mathrm{Tr}\,A^2,
\cdots\cdots\cdots \tag{1.8}
S_n = \lambda_1^n + \lambda_2^n + \cdots + \lambda_n^n = \mathrm{Tr}\,A^n.
Then by Newton's formula (on polynomials) we have
S_k + c_1S_{k-1} + c_2S_{k-2} + \cdots + c_{k-1}S_1 = -k\,c_k, \tag{1.9}
for k \leq n. This is a recurrence relation between the coefficients of the characteristic equation and the S_k's. Substituting k = 1, 2, \ldots, n in (1.9), the coefficients are obtained as
c_1 = -S_1,
c_2 = -\tfrac{1}{2}(S_2 + c_1S_1),
\cdots\cdots\cdots
c_n = -\tfrac{1}{n}(S_n + c_1S_{n-1} + c_2S_{n-2} + \cdots + c_{n-1}S_1).
Thus the values of the coefficients c_1, c_2, \ldots, c_n depend on the traces of the matrices A, A^2, \ldots, A^n.
The Leverrier-Faddeev method generates a sequence of matrices B_1, B_2, \ldots, B_n as shown below:
B_1 = A,            d_1 = \mathrm{Tr}\,B_1,                 D_1 = B_1 - d_1I
B_2 = AD_1,         d_2 = \tfrac{1}{2}\mathrm{Tr}\,B_2,     D_2 = B_2 - d_2I
B_3 = AD_2,         d_3 = \tfrac{1}{3}\mathrm{Tr}\,B_3,     D_3 = B_3 - d_3I
\cdots\cdots\cdots \tag{1.10}
B_{n-1} = AD_{n-2}, d_{n-1} = \tfrac{1}{n-1}\mathrm{Tr}\,B_{n-1}, D_{n-1} = B_{n-1} - d_{n-1}I
B_n = AD_{n-1},     d_n = \tfrac{1}{n}\mathrm{Tr}\,B_n,     D_n = B_n - d_nI.
Now the coefficients of the characteristic polynomial are given by c_1 = -d_1, c_2 = -d_2, \ldots, c_n = -d_n, and it can be verified that D_n = 0.

Example 1.1 Use the Leverrier-Faddeev method to find the characteristic polynomial of the matrix
\[
A = \begin{bmatrix} 3 & 2 & 1\\ 2 & -3 & 1\\ 0 & 0 & 1 \end{bmatrix}.
\]

Solution.
B_1 = A, \quad d_1 = \mathrm{Tr}\,B_1 = 3 - 3 + 1 = 1, \quad D_1 = B_1 - d_1I = \begin{bmatrix} 2 & 2 & 1\\ 2 & -4 & 1\\ 0 & 0 & 0 \end{bmatrix}.
B_2 = AD_1 = \begin{bmatrix} 10 & -2 & 5\\ -2 & 16 & -1\\ 0 & 0 & 0 \end{bmatrix}, \quad d_2 = \tfrac{1}{2}\mathrm{Tr}\,B_2 = \tfrac{1}{2}(10 + 16) = 13, \quad D_2 = B_2 - d_2I = \begin{bmatrix} -3 & -2 & 5\\ -2 & 3 & -1\\ 0 & 0 & -13 \end{bmatrix}.
B_3 = AD_2 = \begin{bmatrix} -13 & 0 & 0\\ 0 & -13 & 0\\ 0 & 0 & -13 \end{bmatrix}, \quad d_3 = \tfrac{1}{3}\mathrm{Tr}\,B_3 = \tfrac{1}{3}(-13 - 13 - 13) = -13, \quad D_3 = B_3 - d_3I = 0.
Thus c_1 = -d_1 = -1, c_2 = -d_2 = -13, c_3 = -d_3 = 13. Hence the characteristic equation is
\lambda^3 - \lambda^2 - 13\lambda + 13 = 0.

Note 1.1 The Leverrier-Faddeev method can also be used to find the inverse of a matrix. We have seen that D_n = 0. Therefore B_n - d_nI = 0, or AD_{n-1} = d_nI, which can be written as D_{n-1} = d_nA^{-1}. Thus
A^{-1} = \frac{D_{n-1}}{d_n} = \frac{D_{n-1}}{-c_n}. \tag{1.11}
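The recursion (1.10) and the inverse formula (1.11) fit in a dozen lines of NumPy. This sketch is my own illustration (the function name is hypothetical; it assumes n >= 2 and a non-singular A for the inverse); it reproduces Example 1.1.

```python
import numpy as np

def leverrier_faddeev(A):
    """Coefficients c1..cn of det(A - lam I) = lam^n + c1 lam^(n-1) + ... + cn
    via B_k = A D_{k-1}, d_k = tr(B_k)/k, D_k = B_k - d_k I (eq. (1.10)).
    Also returns A^-1 = D_{n-1}/d_n (eq. (1.11))."""
    n = A.shape[0]
    B = A.astype(float).copy()
    c, D_prev, D_last = [], None, None
    for k in range(1, n + 1):
        if k > 1:
            B = A @ D_prev
        d = np.trace(B) / k
        c.append(-d)
        D_prev = B - d * np.eye(n)
        if k == n - 1:
            D_last = D_prev.copy()    # D_{n-1}, needed for the inverse
    Ainv = D_last / (-c[-1])          # D_{n-1} / d_n, since d_n = -c_n
    return c, Ainv

A = np.array([[3., 2., 1.], [2., -3., 1.], [0., 0., 1.]])
c, Ainv = leverrier_faddeev(A)
print(c)            # [-1.0, -13.0, 13.0] -> lam^3 - lam^2 - 13 lam + 13
print(Ainv @ A)     # the identity matrix
```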
1.3 Eigenvectors using the Leverrier-Faddeev method

The Leverrier-Faddeev method can also be used to find all the eigenvectors. Assume that all the eigenvalues \lambda_1, \lambda_2, \ldots, \lambda_n and the matrices D_1, D_2, \ldots, D_{n-1} are known; these matrices are available from the construction of the characteristic equation. Then the ith eigenvector X_i can be determined from the formula
X_i = \lambda_i^{n-1}e_0 + \lambda_i^{n-2}e_1 + \lambda_i^{n-3}e_2 + \cdots + e_{n-1}, \tag{1.12}
where e_0 is a unit vector and e_1, e_2, \ldots, e_{n-1} are column vectors of the matrices D_1, D_2, \ldots, D_{n-1} taken in the same position as e_0. That is, if e_0 = (1, 0, \ldots, 0)^t, then e_1, e_2, \ldots, e_{n-1} are the first column vectors of the matrices D_1, D_2, \ldots, D_{n-1} respectively.
Example 1.2 Use the Leverrier-Faddeev method to find the characteristic equation and all the eigenvectors of the matrix
\[
A = \begin{bmatrix} 1 & 1 & -2\\ -1 & 2 & 1\\ 0 & 1 & -1 \end{bmatrix}.
\]

Solution.
B_1 = A, \quad d_1 = \mathrm{Tr}\,B_1 = 1 + 2 - 1 = 2, \quad D_1 = B_1 - 2I = \begin{bmatrix} -1 & 1 & -2\\ -1 & 0 & 1\\ 0 & 1 & -3 \end{bmatrix}.
B_2 = AD_1 = \begin{bmatrix} -2 & -1 & 5\\ -1 & 0 & 1\\ -1 & -1 & 4 \end{bmatrix}, \quad d_2 = \tfrac{1}{2}\mathrm{Tr}\,B_2 = \tfrac{1}{2}(-2 + 0 + 4) = 1, \quad D_2 = B_2 - I = \begin{bmatrix} -3 & -1 & 5\\ -1 & -1 & 1\\ -1 & -1 & 3 \end{bmatrix}.
B_3 = AD_2 = \begin{bmatrix} -2 & 0 & 0\\ 0 & -2 & 0\\ 0 & 0 & -2 \end{bmatrix}, \quad d_3 = \tfrac{1}{3}(-2 - 2 - 2) = -2.
Thus c_1 = -d_1 = -2, c_2 = -d_2 = -1, c_3 = -d_3 = 2. Therefore the characteristic equation is \lambda^3 - 2\lambda^2 - \lambda + 2 = 0, whose roots are -1, 1, 2. The eigenvalues are \lambda_1 = 1, \lambda_2 = 2, \lambda_3 = -1.
Let e_0 = (1, 0, 0)^t; then e_1 = (-1, -1, 0)^t and e_2 = (-3, -1, -1)^t (the first columns of D_1 and D_2). The formula X_i = \lambda_i^2e_0 + \lambda_ie_1 + e_2 gives,
for \lambda_1 = 1: X_1 = (1, 0, 0)^t + (-1, -1, 0)^t + (-3, -1, -1)^t = (-3, -2, -1)^t;
for \lambda_2 = 2: X_2 = 4(1, 0, 0)^t + 2(-1, -1, 0)^t + (-3, -1, -1)^t = (-1, -3, -1)^t;
for \lambda_3 = -1: X_3 = (1, 0, 0)^t - (-1, -1, 0)^t + (-3, -1, -1)^t = (-1, 0, -1)^t.
Thus the eigenvectors are (-3, -2, -1)^t, (-1, -3, -1)^t and (-1, 0, -1)^t.
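Formula (1.12) can be checked numerically. The short script below (my own) rebuilds D_1 and D_2 from the recursion, forms each X_i, and verifies AX_i = \lambda_iX_i for Example 1.2.

```python
import numpy as np

A = np.array([[1., 1., -2.], [-1., 2., 1.], [0., 1., -1.]])
n = A.shape[0]

# Rebuild D1, D2 from the Leverrier-Faddeev recursion (1.10)
d1 = np.trace(A);                  D1 = A - d1 * np.eye(n)
B2 = A @ D1; d2 = np.trace(B2)/2;  D2 = B2 - d2 * np.eye(n)

e0 = np.array([1., 0., 0.])
e1, e2 = D1[:, 0], D2[:, 0]        # first columns of D1 and D2

for lam in (1.0, 2.0, -1.0):
    X = lam**2 * e0 + lam * e1 + e2          # formula (1.12)
    print(lam, X, np.allclose(A @ X, lam * X))   # each check prints True
```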
Chapter 6 Eigenvalues and Eigenvectors of Matrix
Module No. 2 Eigenvalue and Eigenvector of Arbitrary Matrices
Several methods are available for determining the eigenvalues and eigenvectors of a matrix. In this module two of them, the Rutishauser method and the power method, are introduced.

2.1 Rutishauser method

Let A be an arbitrary square matrix of order n. The Rutishauser method constructs a sequence of similar matrices A_1, A_2, \ldots which converges to an upper triangular (or nearly triangular) matrix; if the eigenvalues are real, they appear on the diagonal of the limit matrix. The method is based on the LU-decomposition technique.

Initially, let A_1 = A and A_1 = L_1U_1, where L_1 and U_1 are lower and upper triangular matrices with l_{ii} = 1. Then construct the matrix A_2 = U_1L_1. The matrices A_1 and A_2 are similar, because A_2 = U_1L_1 = U_1A_1U_1^{-1}, so they have the same eigenvalues. Now A_2 is factorized as A_2 = L_2U_2 (again with l_{ii} = 1) and the matrix A_3 = U_2L_2 is constructed, and so on. In this way we construct a sequence of similar matrices A_1, A_2, A_3, \ldots which converges to an upper triangular or near-triangular matrix; if the eigenvalues are real, they are all obtained from its diagonal.

In practice this method is somewhat laborious. Sometimes the lower triangular matrix L is replaced by Q, where Q is a unitary or orthogonal matrix. The rate of convergence of the method is low, so a shifting technique may be used to accelerate it.

Example 2.1 Find all the eigenvalues of the following matrix using the Rutishauser method:
\[
A = \begin{bmatrix} 1 & 1 & 1\\ 2 & 1 & 2\\ 1 & 3 & 2 \end{bmatrix}.
\]
Solution. Let A_1 = A = L_1U_1, with unit diagonal in L_1. This gives l_{21} = 2, l_{31} = 1, l_{32} = -2 and u_{11} = 1, u_{12} = 1, u_{13} = 1, u_{22} = -1, u_{23} = 0, u_{33} = 1, i.e.
\[
A_1 = \begin{bmatrix} 1 & 0 & 0\\ 2 & 1 & 0\\ 1 & -2 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 1\\ 0 & -1 & 0\\ 0 & 0 & 1 \end{bmatrix} = L_1U_1.
\]
Form
\[
A_2 = U_1L_1 = \begin{bmatrix} 1 & 1 & 1\\ 0 & -1 & 0\\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0\\ 2 & 1 & 0\\ 1 & -2 & 1 \end{bmatrix}
= \begin{bmatrix} 4 & -1 & 1\\ -2 & -1 & 0\\ 1 & -2 & 1 \end{bmatrix}.
\]
Again let A_2 = L_2U_2. This gives l_{21} = -1/2, l_{31} = 1/4, l_{32} = 7/6 and u_{11} = 4, u_{12} = -1, u_{13} = 1, u_{22} = -3/2, u_{23} = 1/2, u_{33} = 1/6. Hence
\[
A_2 = \begin{bmatrix} 1 & 0 & 0\\ -1/2 & 1 & 0\\ 1/4 & 7/6 & 1 \end{bmatrix}
\begin{bmatrix} 4 & -1 & 1\\ 0 & -3/2 & 1/2\\ 0 & 0 & 1/6 \end{bmatrix} = L_2U_2.
\]
Now form
\[
A_3 = U_2L_2 = \begin{bmatrix} 19/4 & 1/6 & 1\\ 7/8 & -11/12 & 1/2\\ 1/24 & 7/36 & 1/6 \end{bmatrix}.
\]
Let A_3 = L_3U_3; then l_{21} = 7/38, l_{31} = 1/114, l_{32} = -11/54, and
\[
U_3 = \begin{bmatrix} 19/4 & 1/6 & 1\\ 0 & -18/19 & 6/19\\ 0 & 0 & 2/9 \end{bmatrix}.
\]
Continuing in this way, we find
\[
A_4 = \begin{bmatrix} 4.78947 & -0.03704 & 1.00000\\ -0.17174 & -1.01161 & 0.31579\\ 0.00195 & -0.04527 & 0.22222 \end{bmatrix}, \quad
A_5 = \begin{bmatrix} 4.79121 & 0.00763 & 1.00000\\ 0.03647 & -0.99732 & 0.35165\\ 0.00008 & 0.00921 & 0.20611 \end{bmatrix}, \; \ldots,
\]
\[
A_7 = \begin{bmatrix} 4.79129 & 0.0003 & 1.00000\\ 0.001584 & -0.9989 & 0.34562\\ 0.00008 & 0.00040 & 0.20859 \end{bmatrix},
\]
and so on. The sequence {A_i} converges slowly to an upper triangular matrix whose diagonal elements converge to the eigenvalues of A. Hence the eigenvalues of A are approximately 4.7913, -0.9989 and 0.2086. (The exact eigenvalues are (5 + \sqrt{21})/2 \approx 4.79129, -1 and (5 - \sqrt{21})/2 \approx 0.20871.)
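Because the iteration needs an LU factorization without pivoting, a tiny Doolittle routine is written by hand in the sketch below (my own illustration; it assumes each iterate admits such a factorization, as happens for this example).

```python
import numpy as np

def lu_doolittle(A):
    """LU factorization with unit lower-triangular L and no pivoting."""
    n = A.shape[0]
    L, U = np.eye(n), np.zeros((n, n))
    for k in range(n):
        U[k, k:] = A[k, k:] - L[k, :k] @ U[:k, k:]
        L[k+1:, k] = (A[k+1:, k] - L[k+1:, :k] @ U[:k, k]) / U[k, k]
    return L, U

def rutishauser(A, iterations=50):
    """A_{m+1} = U_m L_m where A_m = L_m U_m; the diagonal of the
    iterates tends to the (real) eigenvalues."""
    A = A.astype(float).copy()
    for _ in range(iterations):
        L, U = lu_doolittle(A)
        A = U @ L
    return np.diag(A)

A = np.array([[1., 1., 1.], [2., 1., 2.], [1., 3., 2.]])
print(rutishauser(A))   # about [4.7913, -1.0, 0.2087]
```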
2.2 Power method
The power method is used to find the eigenvalue of largest magnitude of a square matrix; this eigenvalue is sometimes called the first eigenvalue. Let the eigenvalues of the n × n matrix A be λ1, λ2, ..., λn. Without loss of generality, it is assumed that |λ1| > |λ2| > |λ3| > ··· > |λn|; that is, λ1 is the first eigenvalue of A. Let X1, X2, ..., Xn be the eigenvectors corresponding to the eigenvalues λ1, λ2, ..., λn respectively. Then, by the definition of eigenvalues and eigenvectors of the matrix A, we have
AX1 = λ1X1, AX2 = λ2X2, ..., AXn = λnXn.
(2.1)
Also, it is assumed that all these eigenvectors are independent. In this case, these n eigenvectors generate a vector space V of dimension n. Therefore, a basis of this vector space is {X1 , X2 , . . . , Xn }. Then a vector X in V can be written as a linear combination of the eigenvectors X1 , X2 , . . . , Xn as X = c1 X 1 + c2 X 2 + · · · + cn X n .
(2.2)
Now, multiplying the equation (2.2) by A and using equation (2.1), we get
AX = c1λ1X1 + c2λ2X2 + ··· + cnλnXn
   = λ1 [c1X1 + c2(λ2/λ1)X2 + ··· + cn(λn/λ1)Xn].   (2.3)
Again, multiplying the equation (2.2) by A², ..., A^k, A^(k+1) successively, we obtain the following equations:
A²X = λ1² [c1X1 + c2(λ2/λ1)²X2 + ··· + cn(λn/λ1)²Xn]   (2.4)
  ⋮
A^k X = λ1^k [c1X1 + c2(λ2/λ1)^k X2 + ··· + cn(λn/λ1)^k Xn]   (2.5)
A^(k+1) X = λ1^(k+1) [c1X1 + c2(λ2/λ1)^(k+1) X2 + ··· + cn(λn/λ1)^(k+1) Xn].   (2.6)
Since |λ1| > |λi| for all i = 2, 3, ..., n, we have (λi/λ1)^k → 0 as k → ∞ for all i = 2, 3, ..., n. Thus, when k → ∞, equations (2.5) and (2.6) give
A^k X ≃ λ1^k c1X1 and A^(k+1) X ≃ λ1^(k+1) c1X1.
Therefore, for k → ∞,
A^(k+1) X = λ1^(k+1) c1X1 = λ1 (λ1^k c1X1) = λ1 A^k X.   (2.7)
We know that two vectors are equal if and only if their corresponding components are equal; that is, (A^(k+1) X)i = λ1 (A^k X)i, where (A^k X)i denotes the ith component of the vector A^k X. Hence, from the equation (2.7),
λ1 = lim (k → ∞) (A^(k+1) X)i / (A^k X)i,  i = 1, 2, ..., n.   (2.8)
This ratio surely converges to the eigenvalue λ1; moreover, if |λ2| ≪ |λ1|, then the term within the square brackets of the equation (2.6) converges rapidly to c1X1. The round-off error can be reduced by normalizing the approximate eigenvector at each iteration: in a normalized eigenvector the element of largest magnitude is unity, and it is obtained by dividing all the elements of the vector by the element of largest magnitude. Let X0 be an initial vector. Generally, X0 is arbitrary, non-null and non-orthogonal to X1.
...................................................................................... Now, let λ^(i+1) be the largest element (in magnitude) of Z_{i+1}, where
Z_{i+1} = A X_i  and  X_{i+1} = Z_{i+1} / λ^(i+1),  for i = 0, 1, 2, ... .   (2.9)
The number λ^(i+1) is the (i+1)th approximation to λ1. Then the largest eigenvalue is given by
λ1 = lim (k → ∞) (Z_{k+1})_i / (X_k)_i,  i = 1, 2, ..., n,   (2.10)
and the vector X_{k+1} is the eigenvector corresponding to the eigenvalue λ1.
Note 2.1 The initial vector X0 is generally chosen as X0 = (1, 1, ..., 1)^T. But if this vector is not close to the eigenvector corresponding to λ1, then the iteration scheme (2.10) does not necessarily converge to the eigenvalue λ1, i.e. the limit of the ratio (Z_{k+1})_i / (X_k)_i may not exist. In this case, the initial vector must be changed. Sometimes a unit vector may be taken as the initial vector.
Example 2.2 Find the largest eigenvalue in magnitude, and the corresponding eigenvector, of the matrix
A =
[ 2  3  1 ]
[ 4  3  4 ]
[ 7  6  1 ].
Solution. Let the initial vector be X0 = (1, 1, 1)^T. The first iteration gives
Z1 = AX0 = (6, 11, 14)^T.
Therefore, λ^(1) = 14 and X1 = Z1/14 = (0.428571, 0.785714, 1.000000)^T. The subsequent iterations give:
Z2 = AX1 = (4.214286, 8.071428, 8.714286)^T, λ^(2) = 8.714286, X2 = (0.483607, 0.926229, 1.000000)^T;
Z3 = AX2 = (4.745902, 8.713115, 9.942623)^T, λ^(3) = 9.942623, X3 = (0.477329, 0.876340, 1.000000)^T;
Z4 = (4.583677, 8.538334, 9.599340)^T, λ^(4) = 9.599340, X4 = (0.477499, 0.889471, 1)^T;
Z5 = (4.623411, 8.578409, 9.679319)^T, λ^(5) = 9.679319, X5 = (0.477659, 0.886262, 1)^T;
Z6 = (4.614102, 8.569420, 9.661180)^T, λ^(6) = 9.661180, X6 = (0.477592, 0.886995, 1)^T;
Z7 = (4.616169, 8.571354, 9.665114)^T, λ^(7) = 9.665114, X7 = (0.477611, 0.886834, 1)^T;
Z8 = (4.615726, 8.570949, 9.664286)^T, λ^(8) = 9.664286, X8 = (0.477606, 0.886868, 1)^T;
Z9 = (4.615818, 8.571031, 9.664455)^T, λ^(9) = 9.664455, X9 = (0.477608, 0.886861, 1)^T.
Thus the required largest eigenvalue is 9.6645, correct up to four decimal places, and the corresponding eigenvector is (0.4776, 0.8869, 1.0000)^T.
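A short Python sketch of the normalized power iteration (an illustration, not from the text; NumPy assumed, and the tolerance and iteration cap are arbitrary choices) reproduces these figures:

```python
import numpy as np

def power_method(a, tol=1e-6, max_iter=100):
    # Z_{i+1} = A X_i; lambda^(i+1) = largest element of Z_{i+1} in magnitude;
    # X_{i+1} = Z_{i+1} / lambda^(i+1), as in equation (2.9).
    a = np.asarray(a, dtype=float)
    x = np.ones(a.shape[0])                  # X0 = (1, 1, ..., 1)^T
    lam = 0.0
    for _ in range(max_iter):
        z = a @ x
        lam_new = z[np.argmax(np.abs(z))]    # keeps the sign of the dominant entry
        x = z / lam_new
        if abs(lam_new - lam) < tol:
            break
        lam = lam_new
    return lam_new, x

lam, vec = power_method([[2, 3, 1], [4, 3, 4], [7, 6, 1]])
print(round(lam, 4), np.round(vec, 4))       # ~ 9.6645 and (0.4776, 0.8869, 1.0)
```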
2.2.1 Power method for least eigenvalue
The power method may also be used to find the least eigenvalue of the matrix A. Let X be the eigenvector corresponding to the eigenvalue λ. Then AX = λX. If A is a non-singular matrix, then A^(−1) exists and hence A^(−1)(AX) = λA^(−1)X, i.e. A^(−1)X = (1/λ)X. Thus, if λ is an eigenvalue of A, then 1/λ is an eigenvalue of A^(−1), and X is the eigenvector of the matrix A^(−1) corresponding to the eigenvalue 1/λ. Hence, if λ is the largest (in magnitude) eigenvalue of the matrix A, then 1/λ is the least eigenvalue of the matrix A^(−1). But this method is not suitable for large matrices, as computation of the inverse of a large matrix is time consuming. Another method is available to find the least eigenvalue without computing the matrix inverse; it is discussed below.
Let λ1, λ2, ..., λn be the eigenvalues of the n × n matrix A and let |λ1| > |λ2| > ··· > |λn|. That is, λ1 and λn are the largest and smallest (in magnitude) eigenvalues of A. Let B = A − λ1I. Then the eigenvalues of the matrix B are λ′i = λi − λ1 (called shifted eigenvalues) for all i = 1, 2, ..., n. Obviously, λ′n is the largest-magnitude eigenvalue of B. Thus, by computing the largest (in magnitude) eigenvalues of A and B, one can find the smallest eigenvalue of A: it is λn = λ′n + λ1. Again, if Xn is an eigenvector of B corresponding to the eigenvalue λ′n, then BXn = λ′nXn, or (A − λ1I)Xn = (λn − λ1)Xn, i.e. AXn = λnXn. Hence, Xn is also the eigenvector of A corresponding to the smallest-magnitude eigenvalue λn. This idea is explained in the following example.
Let
A =
[  2  3  0 ]
[  5  8  1 ].
[ −2  3  4 ]
The largest eigenvalue (in magnitude) of A is 10.1962. The matrix B is defined as B = A − 10.1962 I, i.e.
B =
[ −8.1962   3.0000   0.0000 ]
[  5.0000  −2.1962   1.0000 ].
[ −2.0000   3.0000  −6.1962 ]
Now the power method is used to find the largest-magnitude eigenvalue of the matrix B, and it is −10.3924. Therefore, the smallest-magnitude eigenvalue of A is λn = λ′n + λ1 = −10.3924 + 10.1962 = −0.1962.
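The shifting idea can be scripted directly on top of the same power iteration. The following Python sketch is illustrative (NumPy assumed; it repeats a compact version of the power-method routine shown earlier so as to be self-contained):

```python
import numpy as np

def power_method(a, tol=1e-8, max_iter=500):
    # normalized power iteration; returns the dominant eigenvalue and eigenvector
    x = np.ones(a.shape[0])
    lam = 0.0
    for _ in range(max_iter):
        z = a @ x
        lam_new = z[np.argmax(np.abs(z))]
        x = z / lam_new
        if abs(lam_new - lam) < tol:
            break
        lam = lam_new
    return lam_new, x

a = np.array([[2.0, 3, 0], [5, 8, 1], [-2, 3, 4]])
lam1, _ = power_method(a)                # largest eigenvalue, ~ 10.1962
b = a - lam1 * np.eye(3)                 # shifted matrix B = A - lam1*I
lam_shift, x_n = power_method(b)         # largest eigenvalue of B, ~ -10.3924
print(lam_shift + lam1)                  # smallest eigenvalue of A, ~ -0.1962
```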
Chapter 6 Eigenvalues and Eigenvectors of Matrix
Module No. 3 Eigenvalues and Eigenvectors of Symmetric Matrices
...................................................................................... In this module, we consider real symmetric matrices. The methods discussed in Module 2 of this chapter can also be used to find the eigenvalues and eigenvectors of real symmetric matrices. But since these matrices have some special properties, more efficient methods are available to find all eigenvalues and all eigenvectors of real symmetric matrices. In this module, Jacobi's and Householder's methods are discussed. One very important result from linear algebra is stated below: all the eigenvalues of a real symmetric matrix are real.
3.1 Jacobi's method
Jacobi's method is used to find all eigenvalues and eigenvectors of a real symmetric matrix. Let A be a real symmetric matrix. From linear algebra, we know that there exists a real orthogonal matrix R such that R^(−1)AR is a diagonal matrix D. The diagonal matrix D and the matrix A are similar, hence the diagonal elements of D are the eigenvalues of A and the column vectors of R are the eigenvectors of A. But it is not an easy task to find the matrix R directly. The main principle of Jacobi's method is to build up such a matrix R as a series of orthogonal transformations R1, R2, ... . Let aij be the largest-magnitude element among the off-diagonal elements of the n × n matrix A. The first orthogonal matrix R1 is defined by
rii = cos θ, rij = −sin θ, rji = sin θ, rjj = cos θ.   (3.1)
All other diagonal elements of R1 are unity and all its other off-diagonal elements are zero. Thus, the explicit form of the matrix R1 is
R1 =
[ 1  ···  0       ···  0       ···  0 ]
[ ⋮       ⋮            ⋮            ⋮ ]
[ 0  ···  cos θ   ···  −sin θ  ···  0 ]
[ ⋮       ⋮            ⋮            ⋮ ]
[ 0  ···  sin θ   ···  cos θ   ···  0 ]
[ ⋮       ⋮            ⋮            ⋮ ]
[ 0  ···  0       ···  0       ···  1 ].   (3.2)
Notice that the four elements cos θ, −sin θ, sin θ and cos θ occupy the positions (i, i), (i, j), (j, i) and (j, j) respectively. Let A̅1 be the 2 × 2 submatrix of A formed by the elements aii, aij, aji and ajj, i.e.
A̅1 =
[ aii  aij ]
[ aji  ajj ],
and let R̅1 be the corresponding submatrix of R1,
R̅1 =
[ cos θ  −sin θ ]
[ sin θ   cos θ ],
where θ is an unknown quantity. The matrix R̅1 is orthogonal. We apply the orthogonal transformation R̅1 to A̅1 so that the matrix R̅1^(−1) A̅1 R̅1 becomes diagonal. Now,
R̅1^(−1) A̅1 R̅1
= [ cos θ   sin θ ] [ aii  aij ] [ cos θ  −sin θ ]
  [ −sin θ  cos θ ] [ aji  ajj ] [ sin θ   cos θ ]
= [ aii cos²θ + aij sin 2θ + ajj sin²θ        (ajj − aii) sin θ cos θ + aij cos 2θ  ]
  [ (ajj − aii) sin θ cos θ + aij cos 2θ      aii sin²θ − aij sin 2θ + ajj cos²θ    ].
Since θ is arbitrary, we can choose θ in such a way that R̅1^(−1) A̅1 R̅1 becomes diagonal. Therefore,
(ajj − aii) sin θ cos θ + aij cos 2θ = 0.
...................................................................................... That is,
tan 2θ = 2aij / (aii − ajj).   (3.3)
The value of θ can be obtained from the following equation:
θ = (1/2) tan^(−1) [2aij / (aii − ajj)].   (3.4)
This equation gives four values of θ, but to get the smallest rotation, θ must satisfy the inequality −π/4 ≤ θ ≤ π/4. The equation (3.4) is valid for all i, j with aii ≠ ajj. If aii = ajj, then
θ = π/4 if aij > 0,  and  θ = −π/4 if aij < 0.   (3.5)
For this rotation, the off-diagonal elements at the positions (i, j) and (j, i) of R1^(−1)AR1 vanish and the diagonal elements are updated. Thus, the first rotation produces the matrix D1 = R1^(−1)AR1. If D1 is a diagonal matrix, then no further rotation is required: the diagonal elements of D1 are the eigenvalues and the column vectors of R1 are the eigenvectors of A. Otherwise, another rotation (iteration) is required. In the next iteration, the largest (in magnitude) off-diagonal element of the matrix D1 is determined and the same method is applied to find another orthogonal matrix R2 and to compute the matrix D2. That is,
D2 = R2^(−1)D1R2 = R2^(−1)R1^(−1)AR1R2 = (R1R2)^(−1)A(R1R2).
In this process, a series of orthogonal rotations is generated. At the kth rotation, the matrix Dk is given by
Dk = Rk^(−1)R_{k−1}^(−1) ··· R1^(−1) A R1R2 ··· R_{k−1}Rk
   = (R1R2···Rk)^(−1) A (R1R2···Rk) = R^(−1)AR,   (3.6)
where R = R1R2···Rk.
According to this principle, as k → ∞ the matrix Dk becomes a diagonal matrix; the diagonal elements of Dk are then the eigenvalues of the matrix A, and the columns of R are the corresponding eigenvectors. Thus, Jacobi's method determines all eigenvalues and eigenvectors of a real symmetric matrix. But the method has a drawback: elements which have been transformed to zero may not remain zero during subsequent rotations.
Note 3.1 From linear algebra, we know that R^(−1) = R^T for any orthogonal matrix R.
Note 3.2 The number of above-diagonal elements of a symmetric matrix of order n × n is n(n − 1)/2. Therefore, to make all off-diagonal elements zero, one expects that n(n − 1)/2 rotations are required. But for some problems the number of such rotations is less than this number, while for other problems it is more than n(n − 1)/2.
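As a rough illustration of this rotation loop, the following Python sketch (not from the text; NumPy assumed, tolerance and rotation cap are arbitrary choices) repeats the basic Jacobi step with the angle rules (3.4)-(3.5) until the off-diagonal part is negligible:

```python
import numpy as np

def jacobi_eigen(a, tol=1e-10, max_rot=100):
    # Jacobi's method for a real symmetric matrix:
    # returns (approximate eigenvalues, matrix R whose columns are eigenvectors).
    d = np.array(a, dtype=float)
    n = d.shape[0]
    r = np.eye(n)
    for _ in range(max_rot):
        off = np.abs(d)
        np.fill_diagonal(off, 0.0)
        i, j = divmod(np.argmax(off), n)       # largest off-diagonal element in magnitude
        if off[i, j] < tol:
            break
        if d[i, i] == d[j, j]:                 # rule (3.5)
            theta = np.pi / 4 if d[i, j] > 0 else -np.pi / 4
        else:                                  # rule (3.4)
            theta = 0.5 * np.arctan(2 * d[i, j] / (d[i, i] - d[j, j]))
        rk = np.eye(n)
        c, s = np.cos(theta), np.sin(theta)
        rk[i, i] = rk[j, j] = c
        rk[i, j], rk[j, i] = -s, s
        d = rk.T @ d @ rk                      # R_k^{-1} = R_k^T for orthogonal R_k
        r = r @ rk
    return np.diag(d), r

vals, vecs = jacobi_eigen([[2, 3, 1], [3, 2, 1], [1, 1, 3]])
print(np.round(vals, 5))                       # ~ 5.73205, -1, 2.26795 (Example 3.1)
```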
Example 3.1 Find the eigenvalues and eigenvectors of the symmetric matrix
A =
[ 2  3  1 ]
[ 3  2  1 ]
[ 1  1  3 ]
using Jacobi's method.
Solution. The largest off-diagonal element is 3, at the (1, 2) and (2, 1) positions. The rotational angle θ is given by tan 2θ = 2a12/(a11 − a22) = 6/0 = ∞, i.e. θ = π/4. Thus the orthogonal matrix R1 is
R1 =
[ cos π/4  −sin π/4  0 ]   [ 1/√2  −1/√2  0 ]
[ sin π/4   cos π/4  0 ] = [ 1/√2   1/√2  0 ].
[ 0         0        1 ]   [ 0      0     1 ]
Then the first rotation yields
D1 = R1^(−1)AR1 =
[  1/√2  1/√2  0 ] [ 2  3  1 ] [ 1/√2  −1/√2  0 ]   [ 5        0   1.41421 ]
[ −1/√2  1/√2  0 ] [ 3  2  1 ] [ 1/√2   1/√2  0 ] = [ 0       −1   0       ].
[  0     0     1 ] [ 1  1  3 ] [ 0      0     1 ]   [ 1.41421  0   3       ]
The largest off-diagonal element of D1 is now 1.41421, at the (1, 3) position, and hence the rotational angle is
θ = (1/2) tan^(−1) [2a13/(a11 − a33)] = 0.477658.
The second orthogonal matrix R2 is
R2 =
[ cos θ  0  −sin θ ]   [ 0.88807  0  −0.45970 ]
[ 0      1   0     ] = [ 0        1   0       ].
[ sin θ  0   cos θ ]   [ 0.45970  0   0.88807 ]
Then the second rotation gives
D2 = R2^(−1)D1R2 =
[  0.88807  0  0.45970 ] [ 5        0  1.41421 ] [ 0.88807  0  −0.45970 ]   [ 5.73205  0   0       ]
[  0        1  0       ] [ 0       −1  0       ] [ 0        1   0       ] = [ 0       −1   0       ].
[ −0.45970  0  0.88807 ] [ 1.41421  0  3       ] [ 0.45970  0   0.88807 ]   [ 0        0   2.26795 ]
Thus D2 is a diagonal matrix and hence the eigenvalues are 5.73205, −1, 2.26795. The eigenvectors are the columns of R, where
R = R1R2 =
[ 1/√2  −1/√2  0 ] [ 0.88807  0  −0.45970 ]   [ 0.62796  −0.70711  −0.32506 ]
[ 1/√2   1/√2  0 ] [ 0        1   0       ] = [ 0.62796   0.70711  −0.32506 ].
[ 0      0     1 ] [ 0.45970  0   0.88807 ]   [ 0.45970   0         0.88807 ]
Hence, the eigenvalues are 5.73205, −1, 2.26795 and the corresponding eigenvectors are (0.62796, 0.62796, 0.45970)^T, (−0.70711, 0.70711, 0)^T and (−0.32506, −0.32506, 0.88807)^T respectively. Note that the eigenvectors are normalized. In this problem only two rotations were used, fewer than expected. The following example shows that a symmetric matrix may instead need more rotations — here six — to be diagonalized.
Example 3.2 Find all the eigenvalues and eigenvectors of the symmetric matrix
A =
[ 1  3  2 ]
[ 3  5  1 ]
[ 2  1  4 ]
by Jacobi's method.
Solution. Here A =
[ 1  3  2 ]
[ 3  5  1 ]
[ 2  1  4 ].
The largest off-diagonal element is 3, at the (1, 2) and (2, 1) positions. The rotational angle θ is given by tan 2θ = 2a12/(a11 − a22), i.e. θ = −0.491397. Thus the orthogonal matrix R1 is
R1 =
[  0.88167  0.47186  0 ]
[ −0.47186  0.88167  0 ].
[  0        0        1 ]
The first rotation yields
D1 = R1^(−1)AR1 =
[ −0.60555  0        1.29149 ]
[  0        6.60555  1.82539 ].
[  1.29149  1.82539  4.00000 ]
The largest off-diagonal element is 1.82539, at the (2, 3) and (3, 2) positions. The rotational angle is θ = (1/2) tan^(−1) [2a23/(a22 − a33)] = 0.47668. So
R2 =
[ 1  0        0        ]
[ 0  0.88908  −0.45775 ].
[ 0  0.45775  0.88908  ]
D2 = R2^(−1)D1R2 =
[ −0.60555  0.59119  1.14824 ]
[  0.59119  7.54538  0       ].
[  1.14824  0        3.06017 ]
The largest off-diagonal element in magnitude is 1.14824, at the positions (1, 3) and (3, 1). The rotational angle is θ = (1/2) tan^(−1) [2a31/(a33 − a11)] = 0.279829. Therefore
R3 =
[  0.96110  0  0.27619 ]
[  0        1  0       ].
[ −0.27619  0  0.96110 ]
D3 = R3^(−1)D2R3 =
[ −0.93552  0.56819  0       ]
[  0.56819  7.54538  0.16328 ].
[  0        0.16328  3.39014 ]
The largest off-diagonal element in magnitude is 0.56819, at the positions (1, 2) and (2, 1). The rotational angle is θ = (1/2) tan^(−1) [2a12/(a11 − a22)] = −0.066600. Therefore
R4 =
[  0.99778  0.06655  0 ]
[ −0.06655  0.99778  0 ].
[  0        0        1 ]
D4 = R4^(−1)D3R4 =
[ −0.97342  0.00000  −0.01087 ]
[  0.00000  7.58328   0.16292 ].
[ −0.01087  0.16292   3.39014 ]
The largest off-diagonal element in magnitude is 0.16292, at the positions (2, 3) and (3, 2). The rotational angle is θ = (1/2) tan^(−1) [2a32/(a33 − a22)] = −0.038776. Therefore
R5 =
[ 1  0        0        ]
[ 0  0.99925  −0.03877 ].
[ 0  0.03877  0.99925  ]
D5 = R5^(−1)D4R5 =
[ −0.97342  −0.00042  −0.01086 ]
[ −0.00042   7.58960   0       ].
[ −0.01086   0         3.38382 ]
The largest off-diagonal element in magnitude is −0.01086, at the positions (1, 3) and (3, 1). The rotational angle is θ = (1/2) tan^(−1) [2a13/(a11 − a33)] = 0.00249. Therefore
R6 =
[ 1        0  −0.00249 ]
[ 0        1   0       ].
[ 0.00249  0   1       ]
D6 = R6^(−1)D5R6 =
[ −0.97346  −0.00042  −0.00001 ]
[ −0.00042   7.58960   0.00000 ].
[ −0.00001   0.00000   3.38387 ]
This matrix is almost diagonal, and hence the eigenvalues are −0.9735, 7.5896, 3.3839, correct up to four decimal places. The eigenvectors are the columns of
R = R1R2···R6 =
[  0.87115  0.47998   0.01515 ]
[ −0.39481  0.73872  −0.54628 ].
[ −0.27339  0.47329   0.83747 ]
That is, the eigenvectors corresponding to the eigenvalues −0.9735, 7.5896, 3.3839 are respectively (0.87115, −0.39481, −0.27339)^T, (0.47998, 0.73872, 0.47329)^T and (0.01515, −0.54628, 0.83747)^T.
Note 3.3 In this example, it is seen that the elements which were zero in a particular rotation, may not remain zero during the subsequent rotations.
3.2 Eigenvalues of a symmetric tri-diagonal matrix
In this section, an efficient method to compute all eigenvalues of a real symmetric tri-diagonal matrix is described.
Let A be a real symmetric tri-diagonal matrix, where
A =
[ a1  b2  0   0   ···  0   0  ]
[ b2  a2  b3  0   ···  0   0  ]
[ 0   b3  a3  b4  ···  0   0  ]
[ ···························· ]
[ 0   0   0   0   ···  bn  an ].
The characteristic equation of the matrix A is
Cn(λ) ≡ det
[ a1 − λ  b2      0       ···  0      ]
[ b2      a2 − λ  b3      ···  0      ]
[ 0       b3      a3 − λ  ···  0      ]
[ ································· ]
[ 0       0       0       ···  an − λ ] = 0.
The recurrence relation for Cn(λ) is obtained as
C_{k+1}(λ) = (a_{k+1} − λ) C_k(λ) − b_{k+1}² C_{k−1}(λ),   (3.7)
where C0(λ) = 1, C1(λ) = a1 − λ, k = 1, 2, ..., n − 1.
If all of b2, b3, ..., bn are nonzero, then {Ck(λ)} is a Sturm sequence. Then, by Sturm's theorem, we can find an interval which contains an eigenvalue of the matrix A by substituting different values of λ. Let n(λ) be the number of sign changes in the Sturm sequence at λ. Then |n(a) − n(b)| gives the number of eigenvalues lying within the interval [a, b], where Cn(a) and Cn(b) are nonzero. Once the location of an eigenvalue is known, any root-finding method, such as the Newton-Raphson method or the fixed-point iteration method, can be used to compute the eigenvalue. Repeating this process, we can find all eigenvalues of a real symmetric tri-diagonal matrix A. After the eigenvalues of A are computed, the eigenvectors can be obtained directly from the equation (A − λI)X = 0.
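The Sturm-sequence counting is easy to automate. The following Python sketch (an illustration, not from the text; NumPy assumed, and the sample matrix is the one used in Example 3.3 below) evaluates the sequence (3.7) and counts sign changes:

```python
import numpy as np

def sturm_sequence(a, b, lam):
    # Values C_0(lam), ..., C_n(lam) of the Sturm sequence (3.7).
    # a: diagonal (a_1, ..., a_n); b: off-diagonal (b_2, ..., b_n).
    c = [1.0, a[0] - lam]
    for k in range(1, len(a)):
        c.append((a[k] - lam) * c[-1] - b[k - 1] ** 2 * c[-2])
    return c

def sign_changes(seq):
    # n(lam): number of sign changes in the sequence, zeros ignored
    signs = [s for s in np.sign(seq) if s != 0]
    return sum(s1 != s2 for s1, s2 in zip(signs, signs[1:]))

a, b = [2, 2, 2], [1, 3]                   # tri-diagonal matrix of Example 3.3
for lo, hi in [(-2, -1), (5, 6)]:
    count = (sign_changes(sturm_sequence(a, b, hi))
             - sign_changes(sturm_sequence(a, b, lo)))
    print(f"eigenvalues in ({lo}, {hi}):", count)   # one in each interval
# each bracketed root can then be refined by bisection or Newton-Raphson
```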
Example 3.3 Find the eigenvalues of the following tri-diagonal matrix:
[ 2  1  0 ]
[ 1  2  3 ].
[ 0  3  2 ]
Solution. The Sturm sequence {Ck(λ)} is given by
C0(λ) = 1,
C1(λ) = 2 − λ,
C2(λ) = (2 − λ)C1(λ) − 1·C0(λ) = λ² − 4λ + 3,
C3(λ) = (2 − λ)C2(λ) − 9C1(λ) = −λ³ + 6λ² − 2λ − 12.
Now tabulate the signs of C0, C1, C2, C3 for different values of λ.
λ    C0  C1  C2  C3  n(λ)
−2   +   +   +   +   0
−1   +   +   +   −   1
 0   +   +   +   −   1
 1   +   +   0   −   1
 2   +   0   −   0   1
 3   +   −   0   +   2
 4   +   −   +   +   2
 5   +   −   +   +   2
 6   +   −   +   −   3
Here C3(2) = 0, so λ = 2 is an eigenvalue. The other two eigenvalues lie in the intervals (−2, −1) and (5, 6), since n(−1) − n(−2) = 1 and n(6) − n(5) = 1.
To find the eigenvalue within (−2, −1) using the Newton-Raphson method:
Let λ^(i) be the ith approximation to the eigenvalue in (−2, −1). The Newton-Raphson iteration scheme is
λ^(i+1) = λ^(i) − C3(λ^(i)) / C3′(λ^(i)).
Initially, let λ^(0) = −1. The successive iterations are shown below.
λ^(i)       C3(λ^(i))   C3′(λ^(i))   λ^(i+1)
−1          −3          −17          −1.17647
−1.17647    0.28576     −20.26988    −1.16237
−1.16237    0.00185     −20.00175    −1.16228
−1.16228    0.00005     −20.00004    −1.16228
Hence, the other eigenvalue is −1.16228.
To find the eigenvalue within (5, 6) using the Newton-Raphson method:
Let the initial value be λ^(0) = 5. The calculations are shown in the following table.
λ^(i)      C3(λ^(i))   C3′(λ^(i))   λ^(i+1)
5          3           −17          5.17647
5.17647    −0.28576    −20.26988    5.16237
5.16237    −0.00185    −20.00175    5.16228
5.16228    −0.00005    −20.00004    5.16228
The other eigenvalue is 5.16228. Hence, all the eigenvalues are 2, −1.16228 and 5.16228; the exact values are 2 and 2 ± √10. The drawback of Jacobi's method is removed in Householder's method, discussed below.
3.3 Householder's method
Householder's method is used to find the eigenvalues and eigenvectors of a real symmetric matrix of order n × n. The main principle of this method is to convert the given real symmetric matrix into a real symmetric tri-diagonal matrix by applying a series of orthogonal transformations; all the eigenvalues can then be found using the method described in the previous section. In each rotation, a complete row of zeros in the appropriate positions is produced without changing the rows obtained in the previous rotations. Thus, (n − 2) Householder transformations are required to get the tri-diagonal matrix. The following orthogonal transformation is used in Householder's method:
R = I − 2SS^T,   (3.8)
where S = (s1, s2, ..., sn)^T is a column matrix with n components satisfying the condition
S^T S = s1² + s2² + ··· + sn² = 1.   (3.9)
The matrix R is symmetric and orthogonal, which is justified below:
R^T = (I − 2SS^T)^T = I − 2SS^T = R,
and also
R^T R = (I − 2SS^T)(I − 2SS^T) = I − 4SS^T + 4SS^T SS^T = I − 4SS^T + 4SS^T = I.   (3.10)
Thus,
R^(−1)AR = R^T AR = RAR,   (3.11)
since R is orthogonal and symmetric. Now we construct a sequence of rotations defined by
Ai = Ri A_{i−1} Ri,  i = 2, 3, ..., n − 1,   (3.12)
where Ri = I − 2Si Si^T, Si = (0, 0, ..., 0, si, s_{i+1}, ..., sn)^T with si² + s_{i+1}² + ··· + sn² = 1, and A1 = A. In the first transformation, we determine the si's in such a way that the elements in the positions (1, 3), (1, 4), ..., (1, n) of the matrix A2 become zero; the elements in the corresponding positions of the first column also become zero. Therefore, the first transformation produces n − 2 zeros in the first row and the first column. In the second transformation, the elements in the positions (2, 4), (2, 5), ..., (2, n) and (4, 2), (5, 2), ..., (n, 2) are reduced to zero. Thus, (n − 2) Householder rotations are needed to obtain the tri-diagonal matrix A_{n−1}. The method is explained below for a matrix A of order 4 × 4.
In the first rotation, let S2 = (0, s2, s3, s4)^T, where
s2² + s3² + s4² = 1.   (3.13)
Now,
R2 = I − 2S2S2^T =
[ 1  0          0          0         ]
[ 0  1 − 2s2²   −2s2s3     −2s2s4    ]
[ 0  −2s2s3     1 − 2s3²   −2s3s4    ]
[ 0  −2s2s4     −2s3s4     1 − 2s4²  ].   (3.14)
The first rows of A1 and R2A1 are the same. The elements in the first row of A2 = [a′ij] = R2A1R2 are obtained from the following equations:
a′11 = a11,
a′12 = (1 − 2s2²)a12 − 2s2s3 a13 − 2s2s4 a14 = a12 − 2s2p1,
a′13 = −2s2s3 a12 + (1 − 2s3²)a13 − 2s3s4 a14 = a13 − 2s3p1,
a′14 = −2s2s4 a12 − 2s3s4 a13 + (1 − 2s4²)a14 = a14 − 2s4p1,
where p1 = s2a12 + s3a13 + s4a14.
It can be shown that the expression a11² + a12² + a13² + a14² is invariant, i.e.
a′11² + a′12² + a′13² + a′14² = a11² + a12² + a13² + a14².
Since a′11² = a11², therefore
a′12² + a′13² + a′14² = a12² + a13² + a14² = q² (say).   (3.15)
As per the rule, the elements in positions (1, 3) and (1, 4) of A2 are to be zero, so a′13 = 0 and a′14 = 0. Thus
a13 − 2s3p1 = 0   (3.16)
and
a14 − 2s4p1 = 0,   (3.17)
while
a′12 = ±q, or a12 − 2p1s2 = ±q.   (3.18)
Now, multiplying the equations (3.18), (3.16) and (3.17) by s2, s3, s4 respectively and adding them, we get
p1 − 2p1(s2² + s3² + s4²) = ±qs2, i.e. p1 = ∓s2q.   (3.19)
s4 = ∓
a14 . 2s2 q
(3.20)
We see that, if the value of s2 is large, then better accuracy can be achieved. This can be done by choosing suitable sign of the expressions in (3.20). Let a12 × sign(a12 ) 1 2 1+ . s2 = 2 q
(3.21)
Taking the positive square root of s2 , we obtain s3 =
a13 × sign(a12 ) , 2q × s2
s4 =
a14 × sign(a12 ) . 2q × s2
In second rotation, we have to make zeros in positions (2, 4) and (4, 2). For second rotation, let S3 = (0, 0, s3 , s4 )T 1 0 0 0 0 1 0 0 R3 = 0 0 1 − 2s2 −2s s 3 4 3 0 0 −2s3 s4 1 − 2s24
.
(3.22)
The same method is used to find the values of s3 and s4 and we obtained new matrix A3 = S3 A2 S3 . It is guaranteed that the zeros in first row and first column obtained in first rotation remain unchanged during computation of the matrix A3 . Thus, A3 becomes to a tri-diagonal matrix. The extension of this method for a general n × n matrix is straight forward. Now, at the kth iteration the transformation vector Sk is Sk = (0, · · · , 0, sk , sk+1 , · · · , sn ) where v u X u n 1 akr × sign(akr ) 2 sk = 1+ , r = k + 1, where q = t a2ki 2 q i=k+1
si =
aki × sign(akr ) , 2qsk
i = k + 1, . . . , n.
The tri-diagonal matrix An−1 is similar to the original matrix A, therefore their eigenvalues are identical. Now, the eigenvalues and eigenvectors of the matrix An−1 can be determined using the method discussed in previous section. 14
......................................................................................
Example 3.4 Use the Householder method to reduce the matrix
1
1 2
1
1 2 1 −1 A= 2 1 3 1 1 −1 1 4 into the tri-diagonal form. Solution. First rotation.
p √ Let S2 = (0, s2 , s3 , s4 )T , q = a212 + a213 + a214 = 6, (1)(1) 1 2 1+ √ s2 = = 0.7041, s2 = 0.8391, 2 6 s3 = 0.4865,
s4 = −0.2433.
S2 = (0, 0.8391, 0.4865, 0.2433)T . R2 = I − 2S2 ST 2 1 0 0 0 0 −0.4082 −0.8165 −0.4082 . = 0 −0.8165 0.5266 −0.2367 0 −0.4082 −0.2367 0.8816 1 −2.4455 0 0 −2.4495 4 −0.2367 −0.5266 . A2 = R2 A1 R2 = 0 −0.2367 0.8936 0.5798 0 −0.5266 0.5798 4.1064 Second rotation. p S3 = (0, 0, s3 , s4 )T . q = a223 + a224 = 0.5774, 1 (−0.2367)(−1) s23 = 1+ = 0.7050, s3 = 0.8396, 2 0.5774 s4 =
(−0.5266)(−1) = 0.5431 2 × 0.5774 × 0.8396 15
. . . . . . . . . . . . . . . . . . . . . . Eigenvalues and Eigenvectors of Symmetric Matrices S3 = (0, 0, 0.8396, 0.5431)T 1 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 . R3 = = 2 0 0 1 − 2s −2s s 0 0 −0.4099 −0.9120 3 4 3 2 0 0 −2s3 s4 1 − 2s4 0 0 −0.9120 0.4101 1 −2.4435 0 0 −2.4485 4 0.5773 0 . then A3 = R3 A2 R3 = 0 0.5773 3.9988 −0.8170 0 0 −0.8170 1.0001 This is the required tri-diagonal real symmetric matrix.
16
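A compact Python sketch of this reduction (an illustration, not from the text; NumPy assumed) builds each reflector from the sign rule (3.21) and applies the similarity transformation R A R of (3.12):

```python
import numpy as np

def householder_tridiagonal(a):
    # Reduce a real symmetric matrix to tri-diagonal form by n-2 transformations A <- R A R.
    a = np.array(a, dtype=float)
    n = a.shape[0]
    for k in range(n - 2):
        x = a[k, k + 1:]                   # elements a_{k,k+1}, ..., a_{k,n}
        q = np.linalg.norm(x)
        if q == 0.0:
            continue                       # this row is already in tri-diagonal form
        sgn = np.sign(x[0]) if x[0] != 0 else 1.0
        s = np.zeros(n)
        s[k + 1] = np.sqrt(0.5 * (1 + abs(x[0]) / q))     # eq. (3.21)
        s[k + 2:] = x[1:] * sgn / (2 * q * s[k + 1])
        r = np.eye(n) - 2 * np.outer(s, s) # R = I - 2SS^T, eq. (3.8)
        a = r @ a @ r
    return a

a = [[1, 1, 2, 1], [1, 2, 1, -1], [2, 1, 3, 1], [1, -1, 1, 4]]
print(np.round(householder_tridiagonal(a), 4))   # matches A3 of Example 3.4
```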
Chapter 7 Numerical Differentiation and Integration
Module No. 1 Numerical Differentiation
...................................................................................... Numerical differentiation is a very interesting problem. From a given table of values of a function y = f(x), one can find the derivatives of different orders of the function f(x) without knowing the function f(x) explicitly. The main idea of numerical differentiation is the following: construct an appropriate interpolation polynomial from the given set of values of x and y, and then differentiate it at the required value of x. As with interpolation, many formulae are available for differentiation; based on the given set of values, different types of formulae can be constructed. The common formulae are based on Lagrange's and Newton's interpolation formulae. These formulae are discussed in this module.
1.1 Differentiation based on Lagrange's interpolation formula
Suppose the following table of values of y = f(x) is given:
x :  x0  x1  x2  ···  xn
y :  y0  y1  y2  ···  yn
The values of the x's are not necessarily equispaced. The problem is to find the values of f′(x), f″(x), ... for a given x, where min{x0, x1, x2, ..., xn} < x < max{x0, x1, x2, ..., xn}.
The Lagrange's interpolation formula for the given table of values is
φ(x) = w(x) Σ (i = 0 to n) yi / [(x − xi) w′(xi)],  where w(x) = (x − x0)(x − x1)···(x − xn).
Then
φ′(x) = w′(x) Σ (i = 0 to n) yi / [(x − xi) w′(xi)] − w(x) Σ (i = 0 to n) yi / [(x − xi)² w′(xi)],   (1.1)
φ″(x) = w″(x) Σ (i = 0 to n) yi / [(x − xi) w′(xi)] − 2w′(x) Σ (i = 0 to n) yi / [(x − xi)² w′(xi)] + 2w(x) Σ (i = 0 to n) yi / [(x − xi)³ w′(xi)],   (1.2)
and so on.
The value of w′(x) is given by
w′(x) = Σ (j = 0 to n) (x − x0)(x − x1)···(x − x_{j−1})(x − x_{j+1})···(x − xn).
Note that the formulae (1.1) and (1.2) can be used to find the values of φ′(x) and φ″(x) for all values of x except x = xi, i = 0, 1, ..., n. To find such values at the points x0, x1, ..., xn, the Lagrange's polynomial is rewritten as
φ(x) = w(x) Σ (i = 0 to n, i ≠ j) yi / [(x − xi) w′(xi)]
     + [(x − x0)(x − x1)···(x − x_{j−1})(x − x_{j+1})···(x − xn)] / [(xj − x0)(xj − x1)···(xj − x_{j−1})(xj − x_{j+1})···(xj − xn)] · yj.
Now,
d/dx { [(x − x0)···(x − x_{j−1})(x − x_{j+1})···(x − xn)] / [(xj − x0)···(xj − x_{j−1})(xj − x_{j+1})···(xj − xn)] } at x = xj
= Σ (i = 0 to n, i ≠ j) 1/(xj − xi).
Thus,
φ′(xj) = w′(xj) Σ (i = 0 to n, i ≠ j) yi / [(xj − xi) w′(xi)] + yj Σ (i = 0 to n, i ≠ j) 1/(xj − xi),   (1.3)
and the value of w′(xj) is given by
w′(xj) = (xj − x0)(xj − x1)···(xj − x_{j−1})(xj − x_{j+1})···(xj − xn) = Π (i = 0 to n, i ≠ j) (xj − xi).
The formula (1.3) is used to find the first derivative of φ(x), and hence the first approximate derivative of f(x), at the points x = x0, x1, ..., xn.
1.2 Error in Lagrange's differentiation formula
In Module 1 of Chapter 2, it is shown that the error in the interpolation polynomial is
E(x) = w(x) f^(n+1)(ξ) / (n + 1)!,
where min{x, x0, ..., xn} < ξ < max{x, x0, ..., xn}, w(x) = (x − x0)(x − x1)···(x − xn), and ξ = ξ(x) is a quantity depending on x. Thus,
E′(x) = w′(x) f^(n+1)(ξ)/(n + 1)! + w(x) f^(n+2)(ξ) ξ′(x)/(n + 1)!.   (1.4)
Since ξ′(x) is an unknown function, its upper bound is not known. When x = xi, w(xi) = 0; therefore,
E′(xi) = w′(xi) f^(n+1)(ξ(xi)) / (n + 1)!,   (1.5)
where min{x, x0, ..., xn} < ξ(xi) < max{x, x0, ..., xn}.
Note 1.1 If the function f(x) is approximated by the polynomial φ(x) of degree at most n, then the first derivative of f(x) is approximated by the first derivative of φ(x). But the error in φ′(x) is larger than the error in φ(x).
Example 1.1 Using Lagrange's differentiation formula, find the values of f′(3) and f′(1.4) from the following table.
x :  1  3  5  6
y :  3  0  13  22
Solution. Here x0 = 1, x1 = 3, x2 = 5, x3 = 6,
w(x) = (x − x0)(x − x1)(x − x2)(x − x3) = (x − 1)(x − 3)(x − 5)(x − 6),
w′(x) = (x − 3)(x − 5)(x − 6) + (x − 1)(x − 5)(x − 6) + (x − 1)(x − 3)(x − 6) + (x − 1)(x − 3)(x − 5).
By the formula (1.3),
f′(x1) ≃ φ′(x1) = w′(x1) Σ (i ≠ 1) yi / [(x1 − xi) w′(xi)] + y1 Σ (i ≠ 1) 1/(x1 − xi)
= w′(3) [ y0/((3 − 1)w′(1)) + y2/((3 − 5)w′(5)) + y3/((3 − 6)w′(6)) ] + y1 [ 1/(3 − 1) + 1/(3 − 5) + 1/(3 − 6) ].
Now, w′(1) = −40, w′(3) = 12, w′(5) = −8, w′(6) = 15. Thus
f′(3) ≃ 12 [ 3/(2 × (−40)) + 13/((−2) × (−8)) + 22/((−3) × 15) ] + 0 = 3.43333.
Also, by the formula (1.1),
f′(1.4) ≃ w′(1.4) Σ yi / [(1.4 − xi) w′(xi)] − w(1.4) Σ yi / [(1.4 − xi)² w′(xi)].
Now,
w′(1.4) = (1.4 − 3)(1.4 − 5)(1.4 − 6) + (1.4 − 1)(1.4 − 5)(1.4 − 6) + (1.4 − 1)(1.4 − 3)(1.4 − 6) + (1.4 − 1)(1.4 − 3)(1.4 − 5) = −14.62400,
w(1.4) = (1.4 − 1)(1.4 − 3)(1.4 − 5)(1.4 − 6) = −10.59840.
Therefore,
f′(1.4) ≃ −14.62400 × [ 3/(0.4 × (−40)) + 0 + 13/((−3.6) × (−8)) + 22/((−4.6) × 15) ]
        + 10.59840 × [ 3/((0.4)² × (−40)) + 0 + 13/((−3.6)² × (−8)) + 22/((−4.6)² × 15) ]
= −14.62400 × (−0.05495) + 10.59840 × (−0.52482)
= 0.80361 − 5.56228 = −4.75867.
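The computation of formula (1.1) is mechanical and can be scripted. The following Python sketch (an illustration, not from the text; NumPy assumed, and valid only when x is not one of the nodes) reproduces f′(1.4):

```python
import numpy as np

def lagrange_derivative(xs, ys, x):
    # phi'(x) from formula (1.1); x must differ from every node.
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    n = len(xs)
    wp_nodes = np.array([np.prod(xs[i] - np.delete(xs, i)) for i in range(n)])  # w'(x_i)
    w = np.prod(x - xs)                                                         # w(x)
    wp = sum(np.prod(np.delete(x - xs, i)) for i in range(n))                   # w'(x)
    s1 = np.sum(ys / ((x - xs) * wp_nodes))
    s2 = np.sum(ys / ((x - xs) ** 2 * wp_nodes))
    return wp * s1 - w * s2

print(lagrange_derivative([1, 3, 5, 6], [3, 0, 13, 22], 1.4))   # ~ -4.7587
```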
1.3 Differentiation based on Newton's forward interpolation formula
We know that the Newton's forward and backward interpolation formulae are applicable only when the arguments are equispaced. So, in this section, it is assumed that the given arguments are equispaced.
Let the function y = f(x) be known at the (n + 1) equispaced arguments x0, x1, ..., xn, and yi = f(xi) for i = 0, 1, ..., n. Since the arguments are equispaced, one can write xi = x0 + ih. Also, let u = (x − x0)/h, where h is called the spacing.
For this data set, the Newton's forward interpolation formula is
φ(x) = y0 + u∆y0 + u(u − 1)/2! ∆²y0 + ··· + u(u − 1)···(u − n + 1)/n! ∆ⁿy0
     = y0 + u∆y0 + (u² − u)/2! ∆²y0 + (u³ − 3u² + 2u)/3! ∆³y0 + (u⁴ − 6u³ + 11u² − 6u)/4! ∆⁴y0 + (u⁵ − 10u⁴ + 35u³ − 50u² + 24u)/5! ∆⁵y0 + ···   (1.6)
The error term of this interpolation formula is
E(x) = u(u − 1)···(u − n)/(n + 1)! h^(n+1) f^(n+1)(ξ),
where min{x, x0, ..., xn} < ξ < max{x, x0, ..., xn}.
Differentiating (1.6) three times gives the first three derivatives of φ(x):
φ′(x) = (1/h) [∆y0 + (2u − 1)/2! ∆²y0 + (3u² − 6u + 2)/3! ∆³y0 + (4u³ − 18u² + 22u − 6)/4! ∆⁴y0 + (5u⁴ − 40u³ + 105u² − 100u + 24)/5! ∆⁵y0 + ···]   (1.7)
(as du/dx = 1/h),
φ″(x) = (1/h²) [∆²y0 + (6u − 6)/3! ∆³y0 + (12u² − 36u + 22)/4! ∆⁴y0 + (20u³ − 120u² + 210u − 100)/5! ∆⁵y0 + ···],   (1.8)
φ‴(x) = (1/h³) [∆³y0 + (24u − 36)/4! ∆⁴y0 + (60u² − 240u + 210)/5! ∆⁵y0 + ···].   (1.9)
In this way, all other derivatives can be found. It may be noted that ∆y0, ∆²y0, ∆³y0, ... are constants. The above three formulae give the first three (approximate) derivatives of f(x) at any argument x, where x = x0 + uh.
The above formulae become simple when x = x0, i.e. u = 0. That is,
φ′(x0) = (1/h) [∆y0 − (1/2)∆²y0 + (1/3)∆³y0 − (1/4)∆⁴y0 + (1/5)∆⁵y0 − ···]   (1.10)
φ″(x0) = (1/h²) [∆²y0 − ∆³y0 + (11/12)∆⁴y0 − (5/6)∆⁵y0 + ···]   (1.11)
φ‴(x0) = (1/h³) [∆³y0 − (3/2)∆⁴y0 + (7/4)∆⁵y0 − ···].   (1.12)
Error in Newton's forward differentiation formula
The error of the Newton's forward differentiation formula is calculated from the expression of the error of the Newton's forward interpolation formula, which is given by
E(x) = u(u − 1)···(u − n) h^(n+1) f^(n+1)(ξ)/(n + 1)!.
Differentiating this expression with respect to x, we have
E′(x) = h^(n+1) f^(n+1)(ξ)/(n + 1)! · (1/h) d/du [u(u − 1)···(u − n)] + u(u − 1)···(u − n)/(n + 1)! h^(n+1) d/dx [f^(n+1)(ξ)]
      = hⁿ f^(n+1)(ξ)/(n + 1)! d/du [u(u − 1)···(u − n)] + u(u − 1)···(u − n)/(n + 1)! h^(n+1) f^(n+2)(ξ1),   (1.13)
where ξ and ξ1 are two quantities depending on x, and min{x, x0, x1, ..., xn} < ξ, ξ1 < max{x, x0, x1, ..., xn}.
The expression for the error at the starting argument x = x0, i.e. u = 0, is evaluated as
E′(x0) = hⁿ f^(n+1)(ξ)/(n + 1)! · d/du [u(u − 1)···(u − n)] at u = 0
       = hⁿ (−1)ⁿ n! f^(n+1)(ξ)/(n + 1)!  (since d/du [u(u − 1)···(u − n)] at u = 0 is (−1)ⁿ n!)
       = (−1)ⁿ hⁿ f^(n+1)(ξ)/(n + 1),   (1.14)
where ξ lies between min{x, x0, x1, ..., xn} and max{x, x0, x1, ..., xn}.
......................................................................................
Example 1.2 Consider the following table.
x :  1.0    1.5    2.0     2.5     3.0     3.5
y :  1.234  4.659  11.179  21.154  35.654  55.779
Find dy/dx and d²y/dx² at x = 1, and dy/dx at x = 1.2.
Solution. The forward difference table is
x     y        ∆y       ∆²y     ∆³y    ∆⁴y    ∆⁵y
1.0   1.234
               3.425
1.5   4.659             3.095
               6.520            0.36
2.0   11.179            3.455          0.71
               9.975            1.07          −0.68
2.5   21.154            4.525          0.03
               14.500           1.10
3.0   35.654            5.625
               20.125
3.5   55.779
Here x0 = 1 and h = 0.5. At x = 1, u = 0 and hence
y′(1) = (1/h) [∆y0 − (1/2)∆²y0 + (1/3)∆³y0 − (1/4)∆⁴y0 + (1/5)∆⁵y0 − ···]
      = (1/0.5) [3.425 − (1/2) × 3.095 + (1/3) × 0.36 − (1/4) × 0.71 − (1/5) × 0.68]
      = 3.36800,
y″(1) = (1/h²) [∆²y0 − ∆³y0 + (11/12)∆⁴y0 − (5/6)∆⁵y0]
      = 4 [3.095 − 0.36 + (11/12) × 0.71 + (5/6) × 0.68]
      = 15.8100.
Now, at x = 1.2, with h = 0.5, u = (x − x0)/h = (1.2 − 1)/0.5 = 0.4. Therefore,
y′(1.2) = (1/0.5) [∆y0 + (2u − 1)/2! ∆²y0 + (3u² − 6u + 2)/3! ∆³y0 + (4u³ − 18u² + 22u − 6)/4! ∆⁴y0 + (5u⁴ − 40u³ + 105u² − 100u + 24)/5! ∆⁵y0]
= (1/0.5) [3.425 + (2 × 0.4 − 1)/2! × 3.095 + (3 × 0.4² − 6 × 0.4 + 2)/3! × 0.36 + (4 × 0.4³ − 18 × 0.4² + 22 × 0.4 − 6)/4! × 0.71 + (5 × 0.4⁴ − 40 × 0.4³ + 105 × 0.4² − 100 × 0.4 + 24)/5! × (−0.68)]
= 6.26948.
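A small Python sketch (an illustration, not from the text; NumPy assumed) builds the leading forward differences and evaluates formula (1.10) at the start of the table:

```python
import numpy as np

def forward_differences(y):
    # Leading forward differences: [y0, Δy0, Δ²y0, ...].
    y = np.asarray(y, dtype=float)
    lead = [y[0]]
    while len(y) > 1:
        y = np.diff(y)
        lead.append(y[0])
    return lead

def first_derivative_at_x0(y, h):
    # phi'(x0) by formula (1.10): (1/h)[Δy0 - Δ²y0/2 + Δ³y0/3 - ...].
    d = forward_differences(y)
    return sum((-1) ** (k + 1) * d[k] / k for k in range(1, len(d))) / h

y = [1.234, 4.659, 11.179, 21.154, 35.654, 55.779]
print(round(first_derivative_at_x0(y, 0.5), 5))   # 3.36800, as in Example 1.2
```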
1.4 Differentiation based on Newton's backward interpolation formula
Like the Newton's forward differentiation formula, one can derive a Newton's backward differentiation formula based on the Newton's backward interpolation formula. Suppose the function y = f(x) is not known explicitly, but its values at the (n + 1) arguments x0, x1, ..., xn are given, i.e. yi = f(xi), i = 0, 1, 2, ..., n. Since the Newton's backward interpolation formula is applicable only when the arguments are equispaced, xi = x0 + ih, i = 0, 1, 2, ..., n, and we set v = (x − xn)/h.
The Newton's backward interpolation formula is
φ(x) = yn + v∇yn + v(v + 1)/2! ∇²yn + v(v + 1)(v + 2)/3! ∇³yn + v(v + 1)(v + 2)(v + 3)/4! ∇⁴yn + v(v + 1)(v + 2)(v + 3)(v + 4)/5! ∇⁵yn + ···
Differentiating this formula with respect to x successively, the formulae for the derivatives
of different orders can be derived as
φ′(x) = (1/h) [∇yn + (2v + 1)/2! ∇²yn + (3v² + 6v + 2)/3! ∇³yn + (4v³ + 18v² + 22v + 6)/4! ∇⁴yn + (5v⁴ + 40v³ + 105v² + 100v + 24)/5! ∇⁵yn + ···]   (1.15)
φ″(x) = (1/h²) [∇²yn + (6v + 6)/3! ∇³yn + (12v² + 36v + 22)/4! ∇⁴yn + (20v³ + 120v² + 210v + 100)/5! ∇⁵yn + ···]   (1.16)
φ‴(x) = (1/h³) [∇³yn + (24v + 36)/4! ∇⁴yn + (60v² + 240v + 210)/5! ∇⁵yn + ···]   (1.17)
and so on.
The above formulae give the approximate values of dy/dx, d²y/dx², d³y/dx³, and so on, at any value of x (= xn + vh), where min{x0, x1, ..., xn} < x < max{x0, x1, ..., xn}.
When x = xn, then v = 0. In this particular case, the above formulae reduce to the following forms:
φ′(xn) = (1/h) [∇yn + (1/2)∇²yn + (1/3)∇³yn + (1/4)∇⁴yn + (1/5)∇⁵yn + ···]   (1.18)
φ″(xn) = (1/h²) [∇²yn + ∇³yn + (11/12)∇⁴yn + (5/6)∇⁵yn + ···]   (1.19)
φ‴(xn) = (1/h³) [∇³yn + (3/2)∇⁴yn + (7/4)∇⁵yn + ···]   (1.20)
Error in Newton's backward differentiation formula
The error can be calculated by differentiating the error of the Newton's backward interpolation formula, which is given by
E(x) = v(v + 1)(v + 2)···(v + n) h^(n+1) f^(n+1)(ξ)/(n + 1)!,
where v = (x − xn)/h and ξ lies between min{x, x0, x1, ..., xn} and max{x, x0, x1, ..., xn}. Differentiating E(x), we get
E′(x) = hⁿ f^(n+1)(ξ)/(n + 1)! d/dv [v(v + 1)(v + 2)···(v + n)] + v(v + 1)(v + 2)···(v + n)/(n + 1)! h^(n+1) f^(n+2)(ξ1),
where min{x, x0, x1, ..., xn} < ξ, ξ1 < max{x, x0, x1, ..., xn}. This expression gives the error of differentiation at any argument x. In particular, when x = xn, i.e. when v = 0, then
E′(xn) = hⁿ f^(n+1)(ξ)/(n + 1)! d/dv [v(v + 1)(v + 2)···(v + n)] at v = 0
       = hⁿ n! f^(n+1)(ξ)/(n + 1)!  (as d/dv [v(v + 1)···(v + n)] at v = 0 is n!)
       = hⁿ f^(n+1)(ξ)/(n + 1).   (1.21)
Example 1.3 A slider in a machine moves along a fixed straight rod. Its distance x (in cm) along the rod is given in the following table for various values of the time t (in seconds):
t (sec) :  0   2   4   6    8
x (cm)  :  20  50  80  120  180
Find the velocity and the acceleration of the slider at time t = 8.
Solution. The backward difference table is
t    x     ∇x   ∇²x  ∇³x  ∇⁴x
0    20
           30
2    50          0
           30         10
4    80          10         0
           40         10
6    120         20
           60
8    180
Here h = 2. The velocity is
dx/dt = (1/h) [∇ + (1/2)∇² + (1/3)∇³ + (1/4)∇⁴ + ···] xn
      = (1/2) [60 + (1/2) × 20 + (1/3) × 10]
      = 0.5 × [70 + 10/3] = 36.66667.
The acceleration is
d²x/dt² = (1/h²) [∇² + ∇³ + (11/12)∇⁴ + (5/6)∇⁵ + ···] xn
        = (1/4) [20 + 10 + 0] = 7.50.
yi+1 − yi y(xi + h) − y(xi ) ∆yi = = h h h
(forward difference formula) (1.22)
and f 0 (xi ) '
∇yi yi − yi−1 y(xi ) − y(xi − h) = = (backward difference formula) (1.23) h h h
Adding these two equations, we get the central difference formula for first order derivative, as f 0 (xi ) '
y(xi + h) − y(xi − h) (central difference formula) 2h
(1.24)
Equations (1.22) to (1.24) give two-point formulae to find first order derivative of the function f (x) at x = xi . Similarly, from the second order finite differences, we have f 00 (xi ) '
f 00 (xi ) '
∆ 2 yi yi+2 − 2yi+1 + yi y(xi +2h) − 2y(xi + h)+y(xi ) = = , h2 h2 h2
(1.25)
∇2 yi yi − 2yi−1 +yi−2 y(xi )−2y(xi − h)+y(xi − 2h) = = h2 h2 h2
(1.26)
and f 00 (x0 ) '
∆2 y−1 y1 − 2y0 + y−1 y(x0 + h) − 2y(x0 ) + y(x0 − h) = = . h2 h2 h2
In general, f 00 (xi ) '
y(xi + h) − 2y(xi ) + y(xi − h) . h2
(1.27) 11
The equations (1.25) to (1.27) give the three-point formulae for the second order derivative. All these formulae can also be deduced from the Taylor's series expansion of a function. In the following, the four-point formula for the first derivative is deduced from the Taylor's series expansion.
The Taylor's series expansions of f(x + h) and f(x − h) are
f(x + h) = f(x) + hf′(x) + h²/2! f″(x) + h³/3! f‴(x) + h⁴/4! f^(iv)(x) + h⁵/5! f^(v)(ξ1)   (1.28)
and
f(x − h) = f(x) − hf′(x) + h²/2! f″(x) − h³/3! f‴(x) + h⁴/4! f^(iv)(x) − h⁵/5! f^(v)(ξ2).   (1.29)
Subtracting (1.29) from (1.28), we get
f(x + h) − f(x − h) = 2hf′(x) + 2f‴(x)h³/3! + 2f^(v)(ξ3)h⁵/5!.
Now, replacing h by 2h in the above formula, we get
f(x + 2h) − f(x − 2h) = 4hf′(x) + 16f‴(x)h³/3! + 64f^(v)(ξ4)h⁵/5!.
All these ξ's lie between x − 2h and x + 2h. Let us consider the expression f(x − 2h) − f(x + 2h) + 8f(x + h) − 8f(x − h); after simplification, we have
f(x − 2h) − f(x + 2h) + 8f(x + h) − 8f(x − h) = 12hf′(x) + [16f^(v)(ξ3) − 64f^(v)(ξ4)] h⁵/120.   (1.30)
Suppose f^(v)(x) is continuous; then f^(v)(ξ3) ≃ f^(v)(ξ4) = f^(v)(ξ) (say), where ξ lies between x − 2h and x + 2h. By this assumption, 16f^(v)(ξ3) − 64f^(v)(ξ4) = −48f^(v)(ξ). Therefore, the equation (1.30) becomes
f(x − 2h) − f(x + 2h) + 8f(x + h) − 8f(x − h) = 12hf′(x) − (2/5)h⁵f^(v)(ξ).
Hence, f′(x) is evaluated by the following formula:
f′(x) ≃ [f(x − 2h) − 8f(x − h) + 8f(x + h) − f(x + 2h)]/(12h) + f^(v)(ξ)h⁴/30.   (1.31)
This is a very interesting formula: the first term on the right hand side is a four-point formula for the first derivative of f(x), and the second term represents the corresponding truncation error.
Corollary 1.1 If f is differentiable up to fifth order within the interval [a, b] and x_{−2}, x_{−1}, x1, x2 ∈ [a, b], then
f′(x0) = [−f(x2) + 8f(x1) − 8f(x_{−1}) + f(x_{−2})]/(12h) + h⁴f^(v)(ξ)/30,   (1.32)
where ξ lies between x_{−2} and x2.
1.5.1 Error analysis and optimum step size
Suppose f(x) is continuously differentiable up to third order within the interval [a, b] and xi − h, xi, xi + h ∈ [a, b]. Now, by Taylor's series expansion,
f(xi + h) = f(xi) + hf′(xi) + h²/2! f″(xi) + h³/3! f‴(ξ1)
and
f(xi − h) = f(xi) − hf′(xi) + h²/2! f″(xi) − h³/3! f‴(ξ2).
Hence, by subtracting, we get
f(xi + h) − f(xi − h) = 2hf′(xi) + [f‴(ξ1) + f‴(ξ2)]/3! h³.   (1.33)
Since f‴ is continuous, by the intermediate value theorem one can write
[f‴(ξ1) + f‴(ξ2)]/2 = f‴(ξ),
where ξ lies between xi − h and xi + h. Then, from equation (1.33),
f′(xi) = [f(xi + h) − f(xi − h)]/(2h) − f‴(ξ)h²/3!
       = [f(xi + h) − f(xi − h)]/(2h) + E_trunc,   (1.34)
where E_trunc = −f‴(ξ)h²/3! is called the truncation error. Note that the order of the truncation error is O(h²).
Another type of error occurs during computer arithmetic; it is called round-off error. Let y(xi − h) and y(xi + h) be the computed (approximate) values of the function f at the points xi − h and xi + h respectively, and let ε_{−1} and ε_1 be the corresponding round-off errors. Then
f(xi − h) = y(xi − h) + ε_{−1} and f(xi + h) = y(xi + h) + ε_1.
Thus,
f′(xi) = [f(xi + h) − f(xi − h)]/(2h) + E_trunc
       = [y(xi + h) − y(xi − h)]/(2h) + (ε_1 − ε_{−1})/(2h) + E_trunc
       = [y(xi + h) − y(xi − h)]/(2h) + E,
where E is given by
E = E_round + E_trunc = (ε_1 − ε_{−1})/(2h) − h²f‴(ξ)/6.   (1.35)
This is the total error. Suppose |ε_{−1}| ≤ ε, |ε_1| ≤ ε and M3 = max |f‴(x)| over a ≤ x ≤ b. Then the upper bound of the total error is given by
|E| ≤ (|ε_1| + |ε_{−1}|)/(2h) + h²/6 |f‴(ξ)| ≤ ε/h + M3h²/6.   (1.36)
For minimum |E|, d|E|/dh = 0, i.e. −ε/h² + hM3/3 = 0. Thus the optimum value of h which minimizes |E| is
h = (3ε/M3)^(1/3).   (1.37)
Hence, the minimum total error is
|E| = ε (3ε/M3)^(−1/3) + (M3/6)(3ε/M3)^(2/3).
......................................................................................
Example 1.4 The values of x and f(x) = x sin x are tabulated as follows:
x    :  0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9      1.0
f(x) :  0.03973  0.08866  0.15577  0.23971  0.33874  0.45095  0.57388  0.70499  0.84147
Find the value of f′(0.6) using the two-point and four-point formulae
f′(x0) = [f(x0 + h) − f(x0 − h)]/(2h)
and
f′(x0) = [−f(x0 + 2h) + 8f(x0 + h) − 8f(x0 − h) + f(x0 − 2h)]/(12h)
with step size h = 0.1.
Solution. By the two-point formula,
f′(0.6) ≃ [f(0.7) − f(0.5)]/(2 × 0.1) = (0.45095 − 0.23971)/0.2 = 1.05620,
and by the four-point formula,
f′(0.6) ≃ [−f(0.8) + 8f(0.7) − 8f(0.5) + f(0.4)]/(12 × 0.1)
        = (−0.57388 + 8 × 0.45095 − 8 × 0.23971 + 0.15577)/1.2 = 1.05984.
The exact value of f′(0.6) is sin(0.6) + 0.6 × cos(0.6) = 1.05984. Thus the error in the two-point formula is 0.00364, while that in the four-point formula is 0.00000. Clearly, the four-point formula gives a better result than the two-point formula.
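The comparison is easy to repeat in code. A short Python sketch (an illustration, not from the text; only the standard library is needed):

```python
import math

f = lambda x: x * math.sin(x)
x0, h = 0.6, 0.1
two_point  = (f(x0 + h) - f(x0 - h)) / (2 * h)
four_point = (-f(x0 + 2 * h) + 8 * f(x0 + h) - 8 * f(x0 - h) + f(x0 - 2 * h)) / (12 * h)
exact = math.sin(x0) + x0 * math.cos(x0)
print(two_point, four_point, exact)   # 1.05620  1.05984  1.05984
```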
1.6 Determination of extremum of a tabulated function
Numerical methods may also be used to find the extreme values of a function even when the explicit form of the function is unknown. It is well known that if a function f(x) is differentiable, then its maximum and minimum points can be determined by solving the equation f′(x) = 0. This method is also applicable to a function known only at some particular arguments. Here we use the Newton's forward difference interpolation formula to find the extreme values of a function.
y = f(x) = y0 + u∆y0 + u(u − 1)/2! ∆²y0 + u(u − 1)(u − 2)/3! ∆³y0 + ···,   (1.38)
where u = (x − x0)/h. The first order derivative of f(x) is
dy/dx = (1/h) [∆y0 + (2u − 1)/2 ∆²y0 + (3u² − 6u + 2)/6 ∆³y0 + ···].
At a maximum or minimum, dy/dx = 0. That is,
∆y0 + (2u − 1)/2 ∆²y0 + (3u² − 6u + 2)/6 ∆³y0 + ··· = 0.   (1.39)
Neglecting the fourth and higher order differences, the above equation reduces to
∆y0 + (2u − 1)/2 ∆²y0 + (3u² − 6u + 2)/6 ∆³y0 = 0.
Let a = (1/2)∆³y0, b = ∆²y0 − ∆³y0, c = ∆y0 − (1/2)∆²y0 + (1/3)∆³y0. Using these notations, the above equation can be written as
au² + bu + c = 0.
This is a quadratic equation. Solving this equation, one can determine the values of u at which f(x) has an extremum. Then the values of x are determined from the relation x = x0 + uh, and the extreme values of f(x) can be obtained from the equation (1.38).
Example 1.5 Find the value of x for which y is maximum, and also find the corresponding value of y, from the following table.
x :  1.0      1.5      2.0      2.5      3.0
y :  0.40668  0.86553  1.21290  1.51025  1.83346
Solution. The forward difference table is
x     y        ∆y       ∆²y
1.0   0.40668
               0.45885
1.5   0.86553           −0.11148
               0.34737
2.0   1.21290           −0.05002
               0.29735
2.5   1.51025            0.02586
               0.32321
3.0   1.83346
Let x0 = 1.0. Neglecting the third and higher order differences, we have
∆y0 + (2u − 1)/2 ∆²y0 = 0, or 0.45885 + (2u − 1)/2 × (−0.11148) = 0, or u = 4.61598.
Therefore, x = x0 + uh = 1.0 + 0.5 × 4.61598 = 3.30799.
Note that this value of x is at the end of the table. Therefore, the Newton's backward interpolation formula is used to find the value of y. For this purpose, we calculate
v = (x − xn)/h = (3.30799 − 3)/0.50 = 0.61598.
Then, by the Newton's backward interpolation formula,
y(3.30799) = yn + v∇yn + v(v + 1)/2 ∇²yn
           = 1.83346 + 0.61598 × 0.32321 + 0.61598 × (0.61598 + 1)/2 × 0.02586
           = 2.04542.
Thus, the approximate maximum value of y is 2.04542, attained at x = 3.30799.
Choice of differentiation formula
The choice of a differentiation formula is the same as the choice of an interpolation formula. That is, if the given argument is at the beginning of the table, then the Newton's forward differentiation formula is used. Similarly, if the given argument is at the end of the table, then the Newton's backward differentiation formula is used. The Lagrange's differentiation formula can be used for any argument.
Chapter 7 Numerical Differentiation and Integration
Module No. 2 Newton-Cotes Quadrature
...................................................................................... Integration is a very common and fundamental tool of integral calculus. But integration is not easy for all kinds of functions, even when the function is known completely. Again, in many real-life problems only a set of values of x and y is available, and we have to find the integral of such a function. In these situations, separate methods are developed, and they are known as numerical integration or quadrature. The problem of numerical integration is stated below:
Given a set of points (x0, y0), (x1, y1), ..., (xn, yn) of a function y = f(x), the problem is to find the value of the definite integral ∫ (x0 to xn) f(x) dx. The function f(x) is replaced by a suitable interpolating polynomial φ(x). The approximate value of the definite integral is then evaluated by the following formula:
∫ (x0 to xn) f(x) dx ≃ ∫ (x0 to xn) φ(x) dx.   (2.1)
Mainly three types of numerical integration formulae are known: Newton-Cotes, Gaussian and Monte-Carlo. A quadrature formula is said to be of closed type if the limits of integration a (= x0) and b (= xn) are taken as two interpolating points. If a and b are not included among the interpolating points, then the formula is known as an open type formula.
Degree of precision
The degree of precision of a quadrature formula is a positive integer n such that the error is zero for all polynomials of degree less than or equal to n, but non-zero for at least one polynomial of degree n + 1.
The Newton-Cotes quadrature formula is one of the simplest and most widely used numerical quadrature formulae. This formula is derived from the Lagrange's interpolation polynomial for equispaced arguments.
2.1 Newton-Cotes quadrature formulae (closed type)
Like the formulae for numerical differentiation, the formulae for numerical integration can be obtained by integrating interpolating polynomials. The Newton-Cotes quadrature formula is derived from the Lagrange's interpolation polynomial in the case of equispaced arguments.
Suppose y = f(x) is a function defined on the interval [a, b]; the function itself is unknown, but its values at the points x0, x1, ..., xn are known, i.e. yi = f(xi), i = 0, 1, 2, ..., n. Let the arguments be equispaced, i.e. xi = x0 + ih, i = 1, 2, ..., n. Also, let x0 = a, xn = b and h = (b − a)/n, called the length of each subinterval.
Then the approximate value of the integral ∫ (a to b) f(x) dx is computed as
∫ (a to b) f(x) dx = ∫ (a to b) φ(x) dx,
where φ(x) is an interpolating polynomial corresponding to the function f(x). The Newton-Cotes quadrature formula is considered in the following form:
∫ (a to b) f(x) dx ≃ Σ (i = 0 to n) Ci yi,   (2.2)
where the Ci's are suitable constants. Now our problem is to find the values of the Ci's from a set of values of xi and yi = f(xi). The Lagrange's interpolation polynomial is
φ(x) = Σ (i = 0 to n) Li(x) yi,   (2.3)
where the Lagrangian function Li(x) is given by
Li(x) = [(x − x0)(x − x1)···(x − x_{i−1})(x − x_{i+1})···(x − xn)] / [(xi − x0)(xi − x1)···(xi − x_{i−1})(xi − x_{i+1})···(xi − xn)].   (2.4)
Let us consider a new variable s, defined by x = x0 + sh. With this substitution, x − xi = (s − i)h and xi − xj = (i − j)h. Thus,
Li(x) = [sh(s − 1)h···(s − i + 1)h(s − i − 1)h···(s − n)h] / [ih(i − 1)h···(1)h(−1)h···(i − n)h]
      = (−1)^(n−i)/[i!(n − i)!] · s(s − 1)(s − 2)···(s − n)/(s − i).   (2.5)
Then the equation (2.2) is transformed into
Σ (i = 0 to n) Ci yi ≃ ∫ (x0 to xn) f(x) dx
= ∫ (x0 to xn) Σ (i = 0 to n) (−1)^(n−i)/[i!(n − i)!] · s(s − 1)(s − 2)···(s − n)/(s − i) yi dx
= Σ (i = 0 to n) { ∫ (x0 to xn) (−1)^(n−i)/[i!(n − i)!] · s(s − 1)(s − 2)···(s − n)/(s − i) dx } yi.   (2.6)
Comparing both sides, we obtain the expression for Ci as
Ci = ∫ (x0 to xn) (−1)^(n−i)/[i!(n − i)!] · s(s − 1)(s − 2)···(s − n)/(s − i) dx
   = (−1)^(n−i) h/[i!(n − i)!] ∫ (0 to n) s(s − 1)(s − 2)···(s − n)/(s − i) ds
   = (−1)^(n−i) (b − a)/[i!(n − i)! n] ∫ (0 to n) s(s − 1)(s − 2)···(s − n)/(s − i) ds,   (2.7)
i = 0, 1, 2, ..., n, where x = x0 + sh. Now we can write
Ci = (b − a)Hi,   (2.8)
where
Hi = (1/n) · (−1)^(n−i)/[i!(n − i)!] ∫ (0 to n) s(s − 1)(s − 2)···(s − n)/(s − i) ds,  i = 0, 1, 2, ..., n.   (2.9)
The quantities Hi (or Ci) are called Cotes coefficients. The quadrature formula (2.2) then becomes
∫ (a to b) f(x) dx ≃ (b − a) Σ (i = 0 to n) Hi yi,   (2.10)
where the values of the Hi's are given by (2.9).
Note 2.1 The Cotes coefficients Hi are pure numbers, independent of the values of the yi's.
2.1.1 Properties of Cotes coefficients
(i) Σ (i = 0 to n) Ci = b − a.
We know that the sum of the Lagrangian functions is 1, i.e.
Σ (i = 0 to n) w(x)/[(x − xi) w′(xi)] = 1.
Then, by integrating,
∫ (a to b) Σ (i = 0 to n) w(x)/[(x − xi) w′(xi)] dx = ∫ (a to b) dx = b − a.   (2.11)
Again,
∫ (a to b) Σ (i = 0 to n) w(x)/[(x − xi) w′(xi)] dx = Σ (i = 0 to n) h(−1)^(n−i)/[i!(n − i)!] ∫ (0 to n) s(s − 1)(s − 2)···(s − n)/(s − i) ds = Σ (i = 0 to n) Ci.   (2.12)
Hence, from the equations (2.11) and (2.12), we have
Σ (i = 0 to n) Ci = b − a.   (2.13)
(ii) Σ (i = 0 to n) Hi = 1.
From the equation (2.8), Ci = (b − a)Hi, i.e.
Σ (i = 0 to n) Ci = (b − a) Σ (i = 0 to n) Hi.
By (2.13), b − a = (b − a) Σ (i = 0 to n) Hi. Thus,
Σ (i = 0 to n) Hi = 1.   (2.14)
(iii) Ci = C_{n−i}.
From the expression for Ci, we have
C_{n−i} = (−1)^i h/[(n − i)! i!] ∫ (0 to n) s(s − 1)(s − 2)···(s − n)/[s − (n − i)] ds.
Now we substitute t = n − s. Then C_{n−i} becomes
C_{n−i} = −(−1)^i (−1)^n h/[i!(n − i)!] ∫ (n to 0) t(t − 1)(t − 2)···(t − n)/(t − i) dt
        = (−1)^(n−i) h/[i!(n − i)!] ∫ (0 to n) s(s − 1)(s − 2)···(s − n)/(s − i) ds = Ci.
Hence,
Ci = C_{n−i}.   (2.15)
(iv) Hi = H_{n−i}.
This is obtained from the equations (2.15) and (2.8):
Hi = H_{n−i}.   (2.16)
The Newton-Cotes formula is a general quadrature formula. From this formula one can derive many simple formulae for different values of n. Some particular cases are discussed below.
2.2 Deduction of some standard quadrature formulae
2.2.1 Trapezoidal formula
One of the simplest quadrature formulae is the trapezoidal formula. To obtain this formula, we substitute n = 1 into the equation (2.10). Then
∫ (a to b) f(x) dx = (b − a) Σ (i = 0 to 1) Hi yi = (b − a)(H0y0 + H1y1).
Now H0 and H1 are determined from the equation (2.9).
Note that the formula is very simple and it gives only a rough value of the integral. So, if the interval [a, b] is divided into some subintervals and the formula is applied to each of these subintervals, then a much better approximate result may be obtained. This formula is known as the composite trapezoidal formula, described below.

Composite trapezoidal formula

Suppose the interval [a, b] is divided into n equal subintervals by the points a = x_0, x_1, x_2, ..., x_n = b. That is, x_i = x_0 + ih, i = 1, 2, ..., n, where h is the length of each subinterval. Now, the trapezoidal formula is applied to each of the subintervals, and we obtain the composite formula as follows:
\[
\int_a^b f(x)\,dx = \int_{x_0}^{x_1} f(x)\,dx + \int_{x_1}^{x_2} f(x)\,dx + \cdots + \int_{x_{n-1}}^{x_n} f(x)\,dx
\]
\[
\simeq \frac{h}{2}[y_0+y_1] + \frac{h}{2}[y_1+y_2] + \frac{h}{2}[y_2+y_3] + \cdots + \frac{h}{2}[y_{n-1}+y_n]
= \frac{h}{2}\big[y_0 + 2(y_1+y_2+\cdots+y_{n-1}) + y_n\big]. \tag{2.17}
\]
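The composite rule (2.17) translates directly into a short program. The following C sketch is illustrative only; the function name trap_composite and the function-pointer interface are our own choices, not from the text.

/* Composite trapezoidal rule (2.17): a minimal sketch.
   f is the integrand, [a,b] the interval, n the number of subintervals. */
double trap_composite(double (*f)(double), double a, double b, int n)
{
    double h = (b - a) / n;        /* length of each subinterval */
    double sum = f(a) + f(b);      /* y0 + yn */
    int i;
    for (i = 1; i < n; i++)        /* interior ordinates get weight 2 */
        sum += 2.0 * f(a + i * h);
    return sum * h / 2.0;
}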
Since it is a numerical formula, it must contain some error. The error in the trapezoidal formula is calculated below.

Error in trapezoidal formula

The difference between the integral $\int_a^b f(x)\,dx$ and the approximate value $\frac{h}{2}(y_0+y_1)$ obtained by the trapezoidal formula is the error. Thus,
\[
E = \int_a^b f(x)\,dx - \frac{h}{2}(y_0+y_1). \tag{2.18}
\]
Let y = f(x) be a continuously differentiable function on the interval [a, b]. Also, it is assumed that there exists a function F(x) such that F'(x) = f(x) on [x_0, x_1] = [a, b]. Therefore,
\[
\int_a^b f(x)\,dx = \int_{x_0}^{x_1} F'(x)\,dx = F(x_1) - F(x_0) = F(x_0+h) - F(x_0)
\]
\[
= \Big[F(x_0) + hF'(x_0) + \frac{h^2}{2!}F''(x_0) + \frac{h^3}{3!}F'''(x_0) + \cdots\Big] - F(x_0)
= hf(x_0) + \frac{h^2}{2!}f'(x_0) + \frac{h^3}{3!}f''(x_0) + \cdots
= hy_0 + \frac{h^2}{2}y_0' + \frac{h^3}{6}y_0'' + \cdots. \tag{2.19}
\]
Also,
\[
\frac{h}{2}(y_0+y_1) = \frac{h}{2}[y_0 + y(x_0+h)]
= \frac{h}{2}\Big[y_0 + y(x_0) + hy'(x_0) + \frac{h^2}{2!}y''(x_0) + \cdots\Big]
= \frac{h}{2}\Big[2y_0 + hy_0' + \frac{h^2}{2!}y_0'' + \cdots\Big]. \tag{2.20}
\]
Substituting the values of the equations (2.19) and (2.20) into the equation (2.18), we obtain
\[
E = h\Big[y_0 + \frac{h}{2}y_0' + \frac{h^2}{6}y_0'' + \cdots\Big] - \frac{h}{2}\Big[2y_0 + hy_0' + \frac{h^2}{2!}y_0'' + \cdots\Big]
= -\frac{h^3}{12}y_0'' + \cdots = -\frac{h^3}{12}f''(x_0) + \cdots \simeq -\frac{h^3}{12}f''(\xi), \tag{2.21}
\]
where a = x_0 < ξ < x_1 = b. This is the error of the basic trapezoidal formula, i.e. the error on the interval [x_0, x_1]. The error (E_c) in the composite formula is the sum of the errors over all the subintervals, i.e.
\[
E_c = -\frac{h^3}{12}\,(y_0'' + y_1'' + \cdots + y_{n-1}'').
\]
If M is an upper bound of the n quantities $|y_0''|, |y_1''|, \ldots, |y_{n-1}''|$, then
\[
|E_c| \le \frac{1}{12}h^3 n M = \frac{(b-a)}{12}h^2 M, \quad\text{since } nh = b-a.
\]
Note 2.2 The error term in the trapezoidal formula indicates that if the second and higher order derivatives of the function f(x) vanish, then the trapezoidal formula gives the exact result. That is, the trapezoidal formula is exact when the integrand is linear.
[Figure 2.1: Geometrical interpretation of trapezoidal formula. The curve y = f(x) is shown with the chord AB joining A(x_0, y_0) and B(x_1, y_1), forming the trapezium ABCD over [x_0, x_1].]

Geometrical interpretation of trapezoidal formula

In the trapezoidal formula, the integrand y = f(x) is replaced by the straight line joining the points A(x_0, y_0) and B(x_1, y_1) (see Figure 2.1). Then the area bounded by the curve y = f(x), the ordinates x = x_0, x = x_1 and the x-axis is approximated by the area of the trapezium (ABCD) bounded by the straight line AB, the straight lines x = x_0, x = x_1 and the x-axis. That is, the value of the integration $\int_a^b f(x)\,dx$ obtained by the trapezoidal formula is nothing but the area of the trapezium ABCD.
2.2.2 Simpson's 1/3 formula

We substitute n = 2 in the equation (2.10) to get another quadrature formula. Thus,
\[
\int_a^b f(x)\,dx = (b-a)\sum_{i=0}^{2} H_i y_i + E, \tag{2.22}
\]
where E is the error and
\[
H_0 = \frac{1}{2}\cdot\frac{1}{2}\int_0^2 (s-1)(s-2)\,ds = \frac{1}{6},\quad
H_1 = -\frac{1}{2}\int_0^2 s(s-2)\,ds = \frac{2}{3},\quad
H_2 = \frac{1}{2}\cdot\frac{1}{2}\int_0^2 s(s-1)\,ds = \frac{1}{6}.
\]
In this case $h = \dfrac{b-a}{2}$. Thus, the equation (2.22) reduces to
\[
\int_a^b f(x)\,dx = (b-a)(H_0y_0 + H_1y_1 + H_2y_2) + E = \frac{h}{3}(y_0 + 4y_1 + y_2) + E.
\]
The formula is known as Simpson's 1/3 formula or simply Simpson's formula.
Composite Simpson's 1/3 formula

In the above formula, the interval of integration [a, b] is divided into two subintervals. Now, we divide the interval [a, b] into n (an even number) equal subintervals by the arguments x_0, x_1, x_2, ..., x_n, where x_i = x_0 + ih, i = 1, 2, ..., n. Therefore,
\[
\int_a^b f(x)\,dx = \int_{x_0}^{x_2} f(x)\,dx + \int_{x_2}^{x_4} f(x)\,dx + \cdots + \int_{x_{n-2}}^{x_n} f(x)\,dx
\]
\[
= \frac{h}{3}[y_0+4y_1+y_2] + \frac{h}{3}[y_2+4y_3+y_4] + \cdots + \frac{h}{3}[y_{n-2}+4y_{n-1}+y_n]
\]
\[
= \frac{h}{3}\big[y_0 + 4(y_1+y_3+\cdots+y_{n-1}) + 2(y_2+y_4+\cdots+y_{n-2}) + y_n\big] \tag{2.23}
\]
\[
= \frac{h}{3}\big[y_0 + 4\,(\text{sum of } y\text{'s with odd subscripts}) + 2\,(\text{sum of interior } y\text{'s with even subscripts}) + y_n\big].
\]
This is known as the composite Simpson's 1/3 quadrature formula.
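The composite formula (2.23) can be sketched in C in the same style as the trapezoidal code above; here n must be even. The name simpson13_composite is again our own illustrative choice.

/* Composite Simpson's 1/3 rule (2.23): a minimal sketch; n must be even. */
double simpson13_composite(double (*f)(double), double a, double b, int n)
{
    double h = (b - a) / n;
    double sum = f(a) + f(b);              /* y0 + yn */
    int i;
    for (i = 1; i < n; i += 2)             /* odd subscripts: weight 4 */
        sum += 4.0 * f(a + i * h);
    for (i = 2; i < n; i += 2)             /* even interior subscripts: weight 2 */
        sum += 2.0 * f(a + i * h);
    return sum * h / 3.0;
}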
Error in Simpson's 1/3 quadrature formula

The expression for the error in Simpson's 1/3 quadrature formula over [x_0, x_2] is
\[
E = \int_{x_0}^{x_2} f(x)\,dx - \frac{h}{3}[y_0+4y_1+y_2]. \tag{2.24}
\]
Suppose that the function f(x) is continuous on [x_0, x_2] and continuously differentiable of all orders. Also, assume that there exists a function F(x) such that F'(x) = f(x) on [x_0, x_2]. Now,
\[
\int_{x_0}^{x_2} f(x)\,dx = \int_{x_0}^{x_2} F'(x)\,dx = F(x_2) - F(x_0) = F(x_0+2h) - F(x_0)
\]
\[
= F(x_0) + 2hF'(x_0) + \frac{(2h)^2}{2!}F''(x_0) + \frac{(2h)^3}{3!}F'''(x_0) + \frac{(2h)^4}{4!}F^{iv}(x_0) + \frac{(2h)^5}{5!}F^{v}(x_0) + \cdots - F(x_0)
\]
\[
= 2hf(x_0) + 2h^2f'(x_0) + \frac{4}{3}h^3 f''(x_0) + \frac{2}{3}h^4 f'''(x_0) + \frac{4}{15}h^5 f^{iv}(x_0) + \cdots. \tag{2.25}
\]
Also,
\[
\frac{h}{3}[y_0+4y_1+y_2] = \frac{h}{3}[f(x_0)+4f(x_0+h)+f(x_0+2h)]
\]
\[
= \frac{h}{3}\Big[f(x_0) + 4\Big\{f(x_0)+hf'(x_0)+\frac{h^2}{2!}f''(x_0)+\frac{h^3}{3!}f'''(x_0)+\frac{h^4}{4!}f^{iv}(x_0)+\cdots\Big\}
\]
\[
\qquad + \Big\{f(x_0)+2hf'(x_0)+\frac{(2h)^2}{2!}f''(x_0)+\frac{(2h)^3}{3!}f'''(x_0)+\frac{(2h)^4}{4!}f^{iv}(x_0)+\cdots\Big\}\Big]
\]
\[
= 2hf(x_0) + 2h^2f'(x_0) + \frac{4}{3}h^3f''(x_0) + \frac{2}{3}h^4f'''(x_0) + \frac{5}{18}h^5 f^{iv}(x_0) + \cdots. \tag{2.26}
\]
Using these values, the equation (2.24) becomes
\[
E = \Big(\frac{4}{15} - \frac{5}{18}\Big)h^5 f^{iv}(x_0) + \cdots \simeq -\frac{h^5}{90}f^{iv}(\xi), \tag{2.27}
\]
where x_0 < ξ < x_2. This is the expression of the error in Simpson's 1/3 formula on the interval [x_0, x_2]. The error in the composite Simpson's 1/3 formula is
\[
E = -\frac{h^5}{90}\{f^{iv}(x_0) + f^{iv}(x_2) + \cdots + f^{iv}(x_{n-2})\}
\simeq -\frac{h^5}{90}\cdot\frac{n}{2}\,f^{iv}(\xi) = -\frac{nh^5}{180}f^{iv}(\xi) = -\frac{(b-a)}{180}h^4 f^{iv}(\xi), \tag{2.28}
\]
where $f^{iv}(\xi) = \max\{f^{iv}(x_0), f^{iv}(x_2), \ldots, f^{iv}(x_{n-2})\}$.
Geometrical interpretation of Simpson's 1/3 formula

In this formula the curve y = f(x) is approximated by a quadratic parabola passing through the points A(x_0, y_0), B(x_1, y_1) and C(x_2, y_2). Thus, the area bounded by the curve y = f(x), the ordinates x = x_0, x = x_2 and the x-axis is approximated by the area bounded by the parabola ABC, the straight lines x = x_0, x = x_2 and the x-axis.

[Figure 2.2: Geometrical interpretation of Simpson's 1/3 formula. The curve y = f(x) and the approximating parabola through A, B, C are shown over [x_0, x_2].]
Example 2.1 Evaluate $\int_1^3 (x+1)e^{x^2}\,dx$, taking 10 intervals, by (i) the trapezoidal formula, and (ii) Simpson's 1/3 formula.

Solution. Here n = 10, a = 1, b = 3, y = f(x) = (x+1)e^{x^2}.
So, $h = \dfrac{b-a}{n} = \dfrac{3-1}{10} = 0.2$.
The tabulated values of x and y are shown below.

  x_i :  1.0      1.2      1.4       1.6       1.8       2.0
  y_i :  5.4366   9.2855   17.0384   33.6331   71.4944   163.7944

  x_i :  2.2        2.4         2.6         2.8         3.0
  y_i :  404.7019   1078.9843   3105.5119   9652.7784   32412.3359

(i) By the trapezoidal formula:
\[
\int_1^3 (x+1)e^{x^2}\,dx = \frac{h}{2}[y_0 + 2(y_1+y_2+y_3+y_4+y_5+y_6+y_7+y_8+y_9) + y_{10}]
\]
\[
= \frac{0.2}{2}[5.4366 + 2(9.2855+17.0384+33.6331+71.4944+163.7944+404.7019+1078.9843+3105.5119+9652.7784) + 32412.3359]
= 6149.2217.
\]
(ii) By Simpson's formula:
\[
\int_1^3 (x+1)e^{x^2}\,dx = \frac{h}{3}[y_0 + 4(y_1+y_3+y_5+y_7+y_9) + 2(y_2+y_4+y_6+y_8) + y_{10}]
\]
\[
= \frac{0.2}{3}[5.4366 + 4(9.2855+33.6331+163.7944+1078.9843+9652.7784) + 2(17.0384+71.4944+404.7019+3105.5119) + 32412.3359]
= \frac{0.2}{3}[83369.1685] = 5557.9445.
\]
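Assuming the two sketches given after (2.17) and (2.23) are in scope, Example 2.1 can also be checked with a few lines of C; the printed values agree with the hand computation up to the rounding used in the tabulated ordinates.

/* Checking Example 2.1 numerically (assumes the two sketches above). */
#include <stdio.h>
#include <math.h>

double trap_composite(double (*f)(double), double a, double b, int n);
double simpson13_composite(double (*f)(double), double a, double b, int n);

double fn(double x) { return (x + 1.0) * exp(x * x); }

int main(void)
{
    printf("Trapezoidal : %.4f\n", trap_composite(fn, 1.0, 3.0, 10));      /* about 6149.2 */
    printf("Simpson 1/3 : %.4f\n", simpson13_composite(fn, 1.0, 3.0, 10)); /* about 5557.9 */
    return 0;
}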
2.2.3 Simpson's 3/8 formula

This formula is obtained by substituting n = 3 in (2.10). In this case, the quadrature formula is
\[
\int_a^b f(x)\,dx = \int_{x_0}^{x_3} f(x)\,dx = \sum_{i=0}^{3} C_i y_i + E = (b-a)\sum_{i=0}^{3} H_i y_i + E, \tag{2.29}
\]
where E is the error and
\[
H_0 = \frac{1}{3}\cdot\frac{(-1)^3}{0!\,3!}\int_0^3 (s-1)(s-2)(s-3)\,ds = \frac{1}{8},\qquad
H_1 = \frac{1}{3}\cdot\frac{(-1)^2}{1!\,2!}\int_0^3 s(s-2)(s-3)\,ds = \frac{3}{8},
\]
\[
H_2 = \frac{1}{3}\cdot\frac{(-1)^1}{2!\,1!}\int_0^3 s(s-1)(s-3)\,ds = \frac{3}{8},\qquad
H_3 = \frac{1}{3}\cdot\frac{(-1)^0}{3!\,0!}\int_0^3 s(s-1)(s-2)\,ds = \frac{1}{8}.
\]
In this case, $h = \dfrac{b-a}{3}$. Thus,
\[
\int_a^b f(x)\,dx = (b-a)(H_0y_0+H_1y_1+H_2y_2+H_3y_3) + E
= 3h\Big(\frac{1}{8}y_0+\frac{3}{8}y_1+\frac{3}{8}y_2+\frac{1}{8}y_3\Big) + E
= \frac{3h}{8}[y_0+3y_1+3y_2+y_3] + E,
\]
where E is the error term, which is given by the expression
\[
E = -\frac{3}{80}h^5 f^{iv}(\xi), \quad x_0 < \xi < x_3.
\]
This is the well known Simpson's 3/8 formula. To obtain the composite Simpson's 3/8 formula, the interval [a, b] is divided into n (divisible by 3) equal subintervals by the arguments x_0, x_1, ..., x_n. Now, the formula is applied to each group of three consecutive subintervals. Then
\[
\int_{x_0}^{x_n} f(x)\,dx = \int_{x_0}^{x_3} f(x)\,dx + \int_{x_3}^{x_6} f(x)\,dx + \cdots + \int_{x_{n-3}}^{x_n} f(x)\,dx
\]
\[
= \frac{3h}{8}\big[(y_0+3y_1+3y_2+y_3) + (y_3+3y_4+3y_5+y_6) + \cdots + (y_{n-3}+3y_{n-2}+3y_{n-1}+y_n)\big]
\]
\[
= \frac{3h}{8}\big[y_0 + 3(y_1+y_2+y_4+y_5+y_7+y_8+\cdots+y_{n-2}+y_{n-1}) + 2(y_3+y_6+y_9+\cdots+y_{n-3}) + y_n\big]. \tag{2.30}
\]
This formula is known as the composite Simpson's 3/8 formula.

2.2.4 Boole's formula
Substitute n = 4 in the equation (2.10). In this case, the quadrature formula is
\[
\int_a^b f(x)\,dx = \int_{x_0}^{x_4} f(x)\,dx = \sum_{i=0}^{4} C_i y_i + E = (b-a)\sum_{i=0}^{4} H_i y_i + E. \tag{2.31}
\]
The Cotes coefficients are given by
\[
H_0 = \frac{1}{4}\cdot\frac{(-1)^4}{0!\,4!}\int_0^4 (s-1)(s-2)(s-3)(s-4)\,ds = \frac{7}{90},\qquad
H_1 = \frac{1}{4}\cdot\frac{(-1)^3}{1!\,3!}\int_0^4 s(s-2)(s-3)(s-4)\,ds = \frac{32}{90},
\]
\[
H_2 = \frac{1}{4}\cdot\frac{(-1)^2}{2!\,2!}\int_0^4 s(s-1)(s-3)(s-4)\,ds = \frac{12}{90},\qquad
H_3 = \frac{1}{4}\cdot\frac{(-1)^1}{3!\,1!}\int_0^4 s(s-1)(s-2)(s-4)\,ds = \frac{32}{90},
\]
\[
H_4 = \frac{1}{4}\cdot\frac{(-1)^0}{4!\,0!}\int_0^4 s(s-1)(s-2)(s-3)\,ds = \frac{7}{90}.
\]
Thus,
\[
\int_{x_0}^{x_4} f(x)\,dx = (b-a)[H_0y_0+H_1y_1+H_2y_2+H_3y_3+H_4y_4] + E
= 4h\Big[\frac{7}{90}y_0+\frac{32}{90}y_1+\frac{12}{90}y_2+\frac{32}{90}y_3+\frac{7}{90}y_4\Big] + E
\]
\[
= \frac{2h}{45}[7y_0+32y_1+12y_2+32y_3+7y_4] + E,
\]
and the error term (E) is $-\dfrac{8h^7}{945}f^{vi}(\xi)$, a < ξ < b. This formula is known as Boole's quadrature formula.

2.2.5 Weddle's formula
When n = 6, we obtain another quadrature formula, known as Weddle's formula. In this case,
\[
\int_a^b f(x)\,dx = \int_{x_0}^{x_6} f(x)\,dx = (b-a)\sum_{i=0}^{6} H_i y_i
= 6h(H_0y_0+H_1y_1+H_2y_2+H_3y_3+H_4y_4+H_5y_5+H_6y_6)
\]
\[
= 6h[H_0(y_0+y_6)+H_1(y_1+y_5)+H_2(y_2+y_4)+H_3y_3].
\]
Here, we have to determine seven $H_i$'s. But the $H_i$'s are symmetric, i.e. $H_i = H_{n-i}$. Also, we know $\sum_{i=0}^{6} H_i = 1$. Thus,
\[
H_3 = 1 - (H_0+H_1+H_2+H_4+H_5+H_6) = 1 - 2(H_0+H_1+H_2).
\]
Now,
\[
H_0 = \frac{1}{6}\cdot\frac{1}{6!}\int_0^6 \frac{s(s-1)(s-2)\cdots(s-6)}{s}\,ds = \frac{41}{840}.
\]
Similarly,
\[
H_1 = \frac{216}{840},\qquad H_2 = \frac{27}{840},\qquad H_3 = \frac{272}{840}.
\]
Thus,
\[
\int_a^b f(x)\,dx = \frac{h}{140}[41y_0+216y_1+27y_2+272y_3+27y_4+216y_5+41y_6]. \tag{2.32}
\]
Again, we know that $\Delta^6 y_0 = y_0 - 6y_1 + 15y_2 - 20y_3 + 15y_4 - 6y_5 + y_6$, i.e.
\[
\frac{h}{140}[y_0-6y_1+15y_2-20y_3+15y_4-6y_5+y_6] - \frac{h}{140}\Delta^6 y_0 = 0.
\]
Now, we add the left hand side of this expression (as it is zero) to the right hand side of the equation (2.32). Then we obtain
\[
\int_a^b f(x)\,dx = \frac{3h}{10}[y_0+5y_1+y_2+6y_3+y_4+5y_5+y_6] - \frac{h}{140}\Delta^6 y_0.
\]
The first term is Weddle's formula and the last term is the error in addition to the truncation error.
The weights of the Newton-Cotes integration formula for different n (the coefficients $nH_i$ of h, so that $\int_a^b f(x)\,dx \approx h\sum_{i=0}^{n}(nH_i)\,y_i$) are:

  n = 1 :  1/2, 1/2
  n = 2 :  1/3, 4/3, 1/3
  n = 3 :  3/8, 9/8, 9/8, 3/8
  n = 4 :  14/45, 64/45, 24/45, 64/45, 14/45
  n = 5 :  95/288, 375/288, 250/288, 250/288, 375/288, 95/288
  n = 6 :  41/140, 216/140, 27/140, 272/140, 27/140, 216/140, 41/140

Table 2.1: Weights of Newton-Cotes integration formula for different n

The degree of precision of some quadrature formulae is given in Table 2.2.
  Method          Degree of precision
  Trapezoidal              1
  Simpson's 1/3            3
  Simpson's 3/8            3
  Boole's                  5
  Weddle's                 5

Table 2.2: Degree of precision of some standard quadrature formulae.
Note 2.3 All these formulae (i.e. trapezoidal, Simpson's 1/3, etc.) are called Newton-Cotes quadrature formulae. Another kind of quadrature formula is also available; these are called Gaussian quadrature formulae and are discussed in Module 3 of this chapter.
Note 2.4 Note that all these quadrature formulae are applicable only to proper integrals. If the lower and/or upper limit is infinite (improper integral of the first kind), or if f(x) has an infinite discontinuity within the interval of integration [a, b] (improper integral of the second kind), then none of these formulae is applicable.
2.3 Newton-Cotes formulae (Open type)

Newton-Cotes quadrature formulae can be applied to improper integrals of the second kind by choosing appropriate arguments. But these formulae can never be used for improper integrals of the first kind, nor when the integrand f(x) has an infinite discontinuity at an endpoint of the interval. In this section, some formulae are introduced which are applicable when the integrand has an infinite discontinuity at the endpoints. In these formulae the endpoints are not used as nodes, and hence the formulae are known as Newton-Cotes open type formulae. These formulae are sometimes known as the Steffensen formulae. Also, these formulae are useful for solving ordinary differential equations numerically when the boundary conditions are not specified at the endpoints.
(i) Mid-point formula
\[
\int_{x_0}^{x_1} f(x)\,dx = hf(x_0+h/2) + \frac{1}{24}h^3 f''(\xi), \quad x_0 \le \xi \le x_1. \tag{2.33}
\]
(ii) Two-point formula
\[
\int_{x_0}^{x_3} f(x)\,dx = \frac{3h}{2}[f(x_1)+f(x_2)] + \frac{3h^3}{4}f''(\xi), \quad x_0 \le \xi \le x_3. \tag{2.34}
\]
(iii) Three-point formula
\[
\int_{x_0}^{x_4} f(x)\,dx = \frac{4h}{3}[2f(x_1)-f(x_2)+2f(x_3)] + \frac{14h^5}{45}f^{iv}(\xi), \quad x_0 \le \xi \le x_4. \tag{2.35}
\]
(iv) Four-point formula
\[
\int_{x_0}^{x_5} f(x)\,dx = \frac{5h}{24}[11f(x_1)+f(x_2)+f(x_3)+11f(x_4)] + \frac{95h^5}{144}f^{iv}(\xi), \quad x_0 \le \xi \le x_5. \tag{2.36}
\]
The last terms in the formulae are the errors. These formulae are obtained by integrating Lagrange's interpolating polynomial for the points (x_i, y_i), i = 1, 2, ..., (n−1), between the given limits.
Chapter 7 Numerical Differentiation and Integration
Module No. 3 Gaussian Quadrature
In the Newton-Cotes method (discussed in Module 2 of Chapter 7), the finite interval of integration [a, b] is divided into n equal subintervals. That is, the arguments x_i, i = 0, 1, 2, ..., n, are known and they are equispaced. Also, all the Newton-Cotes formulae give the exact result for polynomials of degree up to n. It has been mentioned that the Newton-Cotes formulae have some limitations; in particular, these formulae are not applicable to improper integrals. This drawback can be removed by taking non-equispaced arguments. But the question is how one can choose the arguments; that is, in this case the arguments are unknown. For this situation, a new kind of quadrature formula is devised which gives the exact result for polynomials of degree up to 2n − 1. These methods are called Gaussian quadrature methods, described below.
3.1 Gaussian quadrature

The Gaussian quadrature formula is of the following form
\[
\int_a^b \psi(x)f(x)\,dx = \sum_{i=1}^{n} w_i f(x_i) + E, \tag{3.1}
\]
where x_i and w_i are respectively called nodes and weights, ψ(x) is called the weight function, and E is the error. Here, the weights w_i's are discrete numbers, but the weight function ψ(x) is a continuous function defined on the interval of integration [a, b]. In the Newton-Cotes formulae the weights w_i's were unknown but the x_i's were known, while in the Gaussian formulae both are unknown. By changing the weight function ψ(x) one can derive different quadrature formulae.

The fundamental theorem of Gaussian quadrature is stated below: the optimal nodes of the n-point Gaussian quadrature formula are precisely the zeros of the orthogonal polynomial for the same interval and weight function. Gaussian quadrature gives the exact result for all polynomials of degree up to 2n − 1.

Suppose the Gaussian nodes x_i's are chosen in some way. The weights w_i's can be computed by using Lagrange's interpolating formula. Let
\[
\pi(x) = \prod_{j=1}^{m} (x - x_j). \tag{3.2}
\]
Then
\[
\pi'(x_j) = \prod_{\substack{i=1 \\ i \ne j}}^{m} (x_j - x_i). \tag{3.3}
\]
Then Lagrange's interpolating polynomial for m arguments is
\[
\phi(x) = \sum_{j=1}^{m} \frac{\pi(x)}{(x-x_j)\,\pi'(x_j)}\,f(x_j) \tag{3.4}
\]
for an arbitrary point x. Now, from the equation (3.1),
\[
\int_a^b \phi(x)\psi(x)\,dx = \int_a^b \sum_{j=1}^{m} \frac{\pi(x)\psi(x)}{(x-x_j)\,\pi'(x_j)}\,f(x_j)\,dx = \sum_{j=1}^{m} w_j f(x_j). \tag{3.5}
\]
Comparing, we get
\[
w_j = \frac{1}{\pi'(x_j)} \int_a^b \frac{\pi(x)\psi(x)}{x-x_j}\,dx. \tag{3.6}
\]
The weights w_j are sometimes called the Christoffel numbers. Any finite interval [a, b] can be converted to the interval [−1, 1] using the following linear transformation:
\[
x = \frac{b-a}{2}\,t + \frac{b+a}{2} = qt + p, \quad\text{where } q = \frac{b-a}{2} \text{ and } p = \frac{b+a}{2}. \tag{3.7}
\]
Then,
\[
\int_a^b f(x)\,dx = \int_{-1}^{1} f(qt+p)\,q\,dt. \tag{3.8}
\]
In Gaussian quadrature the limits of the integration are taken as −1 and 1, which is possible for any finite interval as shown above. Thus, we consider the following Gaussian integral
\[
\int_{-1}^{1} \psi(x)f(x)\,dx = \sum_{i=1}^{n} w_i f(x_i) + E. \tag{3.9}
\]
Depending on the weight function ψ(x), one can generate different Gaussian quadrature formulae. In this module, we consider two Gaussian quadrature formulae.
3.2 Gauss-Legendre quadrature formulae

In this formula, the weight function ψ(x) is taken as 1. Then the quadrature formula is
\[
\int_{-1}^{1} f(x)\,dx = \sum_{i=1}^{n} w_i f(x_i) + E. \tag{3.10}
\]
Here, the w_i's and x_i's are 2n unknown parameters. Therefore, the w_i's and x_i's can be determined such that the formula (3.10) gives the exact result when f(x) is a polynomial of degree up to 2n − 1. Let
\[
f(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_{2n-1} x^{2n-1} \tag{3.11}
\]
be a polynomial of degree 2n − 1. Now, the left hand side of the equation (3.10) is
\[
\int_{-1}^{1} f(x)\,dx = \int_{-1}^{1} [c_0 + c_1 x + c_2 x^2 + \cdots + c_{2n-1} x^{2n-1}]\,dx = 2c_0 + \frac{2}{3}c_2 + \frac{2}{5}c_4 + \cdots. \tag{3.12}
\]
When x = x_i, equation (3.11) becomes
\[
f(x_i) = c_0 + c_1 x_i + c_2 x_i^2 + c_3 x_i^3 + \cdots + c_{2n-1} x_i^{2n-1}.
\]
The right hand side of the equation (3.10) is
\[
\sum_{i=1}^{n} w_i f(x_i) = w_1[c_0 + c_1x_1 + c_2x_1^2 + \cdots + c_{2n-1}x_1^{2n-1}]
+ w_2[c_0 + c_1x_2 + c_2x_2^2 + \cdots + c_{2n-1}x_2^{2n-1}]
+ \cdots
+ w_n[c_0 + c_1x_n + c_2x_n^2 + \cdots + c_{2n-1}x_n^{2n-1}]
\]
\[
= c_0(w_1+w_2+\cdots+w_n) + c_1(w_1x_1+w_2x_2+\cdots+w_nx_n) + c_2(w_1x_1^2+w_2x_2^2+\cdots+w_nx_n^2) + \cdots
+ c_{2n-1}(w_1x_1^{2n-1}+w_2x_2^{2n-1}+\cdots+w_nx_n^{2n-1}). \tag{3.13}
\]
Hence, equation (3.10) becomes
\[
2c_0 + \frac{2}{3}c_2 + \frac{2}{5}c_4 + \cdots
= c_0(w_1+\cdots+w_n) + c_1(w_1x_1+\cdots+w_nx_n) + \cdots + c_{2n-1}(w_1x_1^{2n-1}+\cdots+w_nx_n^{2n-1}).
\]
Comparing the coefficients of the c_i's on both sides, we obtain the following 2n equations:
\[
\begin{aligned}
w_1+w_2+\cdots+w_n &= 2\\
w_1x_1+w_2x_2+\cdots+w_nx_n &= 0\\
w_1x_1^2+w_2x_2^2+\cdots+w_nx_n^2 &= \frac{2}{3}\\
&\cdots\\
w_1x_1^{2n-1}+w_2x_2^{2n-1}+\cdots+w_nx_n^{2n-1} &= 0.
\end{aligned} \tag{3.14}
\]
Now, the equation (3.14) is a system of non-linear equations containing 2n equations and 2n unknowns w_i and x_i, i = 1, 2, ..., n. Let the solution of these equations be w_i = w_i^* and x_i = x_i^*, i = 1, 2, ..., n. Then the Gauss-Legendre quadrature formula is given by
\[
\int_{-1}^{1} f(x)\,dx = \sum_{i=1}^{n} w_i^* f(x_i^*). \tag{3.15}
\]
Unfortunately, the system of equations (3.14) is non-linear and it is very difficult to find its solution for large n. But, for lower values of n, one can find the exact solution of the system. Some particular cases are discussed below.

Case I. When n = 1, the Gauss-Legendre quadrature formula becomes
\[
\int_{-1}^{1} f(x)\,dx = w_1 f(x_1)
\]
and the system of equations is w_1 = 2 and w_1x_1 = 0, i.e. x_1 = 0. Thus, for n = 1,
\[
\int_{-1}^{1} f(x)\,dx = 2f(0). \tag{3.16}
\]
Note that this is a very simple formula to get the value of an integration. This formula is known as the 1-point Gauss-Legendre quadrature formula. It gives an approximate value of the integration, and it gives the exact answer when f(x) is a polynomial of degree one.

Case II. When n = 2, the quadrature formula reduces to
\[
\int_{-1}^{1} f(x)\,dx = w_1 f(x_1) + w_2 f(x_2). \tag{3.17}
\]
In this case, the system of equations (3.14) becomes
\[
w_1 + w_2 = 2,\qquad w_1x_1 + w_2x_2 = 0,\qquad w_1x_1^2 + w_2x_2^2 = \frac{2}{3},\qquad w_1x_1^3 + w_2x_2^3 = 0. \tag{3.18}
\]
The solution of these equations is $w_1 = w_2 = 1$, $x_1 = -1/\sqrt{3}$, $x_2 = 1/\sqrt{3}$. Hence, the 2-point Gauss-Legendre quadrature formula is
\[
\int_{-1}^{1} f(x)\,dx = f(-1/\sqrt{3}) + f(1/\sqrt{3}). \tag{3.19}
\]
The above system of equations can also be obtained by substituting f(x) = 1, x, x², x³ in the equation (3.17) successively.
This 2-point quadrature formula gives the exact answer when f(x) is a polynomial of degree up to three.

Case III. When n = 3, the Gauss-Legendre formula becomes
\[
\int_{-1}^{1} f(x)\,dx = w_1f(x_1) + w_2f(x_2) + w_3f(x_3). \tag{3.20}
\]
In this case, the system of equations containing the six unknowns x_1, x_2, x_3 and w_1, w_2, w_3 is
\[
\begin{aligned}
w_1+w_2+w_3 &= 2\\
w_1x_1+w_2x_2+w_3x_3 &= 0\\
w_1x_1^2+w_2x_2^2+w_3x_3^2 &= \frac{2}{3}\\
w_1x_1^3+w_2x_2^3+w_3x_3^3 &= 0\\
w_1x_1^4+w_2x_2^4+w_3x_3^4 &= \frac{2}{5}\\
w_1x_1^5+w_2x_2^5+w_3x_3^5 &= 0.
\end{aligned}
\]
This system of equations can also be obtained by substituting f(x) = 1, x, x², x³, x⁴, x⁵ in the equation (3.20). The solution of this system of equations is
\[
x_1 = -\sqrt{3/5},\quad x_2 = 0,\quad x_3 = \sqrt{3/5},\qquad w_1 = 5/9,\quad w_2 = 8/9,\quad w_3 = 5/9.
\]
Hence, in this case, the Gauss-Legendre quadrature formula is
\[
\int_{-1}^{1} f(x)\,dx = \frac{1}{9}\big[5f(-\sqrt{3/5}) + 8f(0) + 5f(\sqrt{3/5})\big]. \tag{3.21}
\]
This is known as the 3-point Gauss-Legendre quadrature formula. In this way, one can determine Gauss-Legendre quadrature formulae for higher values of n. Note that the system of equations (3.14) is non-linear with respect to the x_i's, but if the x_i's are known, then the system becomes a linear one.

It is very interesting that the nodes x_i, i = 1, 2, ..., n, are the zeros of the nth degree Legendre polynomial
\[
P_n(x) = \frac{1}{2^n n!}\,\frac{d^n}{dx^n}\big[(x^2-1)^n\big].
\]
This is a well known orthogonal polynomial and it is obtained from the following recurrence relation:
\[
(n+1)P_{n+1}(x) = (2n+1)xP_n(x) - nP_{n-1}(x), \tag{3.22}
\]
where P_0(x) = 1 and P_1(x) = x. Some lower order Legendre polynomials are
\[
P_0(x) = 1,\quad P_1(x) = x,\quad P_2(x) = \tfrac{1}{2}(3x^2-1),\quad P_3(x) = \tfrac{1}{2}(5x^3-3x),\quad P_4(x) = \tfrac{1}{8}(35x^4-30x^2+3). \tag{3.23}
\]
It can be verified that the roots of the equation P_n(x) = 0 are the nodes x_i's of the n-point Gauss-Legendre quadrature formula. Finding the zeros of a lower degree Legendre polynomial is easier than solving the system of equations (3.14). For example, the roots of the equation P_2(x) = 0 are ±1/√3 and these are the nodes of the 2-point Gauss-Legendre quadrature formula. Similarly, the roots of the equation P_3(x) = 0 are 0, ±√(3/5) and these are the nodes of the 3-point formula, and so on. Again, it can be proved that the weights w_i's can be determined from the following equation:
\[
w_i = \frac{2}{(1-x_i^2)\,[P_n'(x_i)]^2}. \tag{3.24}
\]
It can be shown that the error of this formula is
\[
E = \frac{2^{2n+1}(n!)^4}{(2n+1)[(2n)!]^3}\,f^{(2n)}(\xi), \quad -1 < \xi < 1. \tag{3.25}
\]
The nodes and weights for some lower values of n are listed in Table 3.1.

Note 3.1 The Gauss-Legendre quadrature generates several formulae for different values of n. These formulae are known as n-point formulae, where n = 1, 2, ....
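The recurrence (3.22), the weight formula (3.24) and the fact that the nodes are the zeros of P_n(x) can be combined into a small routine that computes an n-point Gauss-Legendre rule. The C sketch below uses Newton's method with the standard starting guesses cos(π(i − 1/4)/(n + 1/2)); these guesses and the derivative identity P_n'(x) = n(xP_n(x) − P_{n−1}(x))/(x² − 1) are well known facts, stated here as assumptions rather than taken from this module.

/* Nodes and weights of the n-point Gauss-Legendre rule: a sketch. */
#include <math.h>

static void legendre(int n, double x, double *pn, double *dpn)
{
    double p0 = 1.0, p1 = x, p2;   /* P0(x) and P1(x) */
    int k;
    for (k = 1; k < n; k++) {      /* recurrence (3.22) */
        p2 = ((2.0 * k + 1.0) * x * p1 - k * p0) / (k + 1.0);
        p0 = p1; p1 = p2;
    }
    *pn  = p1;                                  /* Pn(x)  */
    *dpn = n * (x * p1 - p0) / (x * x - 1.0);   /* Pn'(x) */
}

void gauss_legendre(int n, double x[], double w[])
{
    const double PI = acos(-1.0);
    int i, it;
    for (i = 1; i <= n; i++) {
        double t = cos(PI * (i - 0.25) / (n + 0.5)), pn, dpn, dt;
        for (it = 0; it < 100; it++) {          /* Newton's method on Pn(t) = 0 */
            legendre(n, t, &pn, &dpn);
            dt = -pn / dpn;
            t += dt;
            if (fabs(dt) < 1.0e-15) break;
        }
        legendre(n, t, &pn, &dpn);
        x[i - 1] = t;
        w[i - 1] = 2.0 / ((1.0 - t * t) * dpn * dpn);  /* formula (3.24) */
    }
}

For n = 2 and n = 3 this routine reproduces the nodes ±1/√3 and 0, ±√(3/5) and the weights 1, 1 and 5/9, 8/9, 5/9 derived above.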
  n    node x_i       weight w_i    order of truncation error
  2    ±0.57735027    1.00000000    f^(4)(ξ)
  3     0.00000000    0.88888889    f^(6)(ξ)
       ±0.77459667    0.55555556
  4    ±0.33998104    0.65214515    f^(8)(ξ)
       ±0.86113631    0.34785485
  5     0.00000000    0.56888889    f^(10)(ξ)
       ±0.53846931    0.47862867
       ±0.90617985    0.23692689
  6    ±0.23861919    0.46791393    f^(12)(ξ)
       ±0.66120939    0.36076157
       ±0.93246951    0.17132449

Table 3.1: Values of x_i and w_i for Gauss-Legendre quadrature
Example 3.1 Find the value of $\int_0^1 x^2\sin x\,dx$ by the Gauss-Legendre formula for n = 2, 4, 6. Also, calculate the absolute errors.

Solution. To apply the Gauss-Legendre formula, the limits are transformed to −1, 1 by substituting $x = \frac{1}{2}u(1-0) + \frac{1}{2}(1+0) = \frac{1}{2}(u+1)$. Then,
\[
I = \int_0^1 x^2\sin x\,dx = \int_{-1}^{1} \frac{(u+1)^2}{8}\sin\Big(\frac{u+1}{2}\Big)\,du \simeq \frac{1}{8}\sum_{i=1}^{n} w_i f(u_i),
\]
where $f(u_i) = (u_i+1)^2 \sin\big(\frac{u_i+1}{2}\big)$.

For the two-point formula (n = 2), x_1 = −0.57735027, x_2 = 0.57735027, w_1 = w_2 = 1. Then
\[
I = \tfrac{1}{8}[1 \times 0.037469207 + 1 \times 1.7650614] = 0.22531632.
\]
For the four-point formula (n = 4), x_1 = −0.33998104, x_2 = −0.86113631, x_3 = −x_1, x_4 = −x_2, w_1 = w_3 = 0.65214515, w_2 = w_4 = 0.34785485. Then,
\[
I = \tfrac{1}{8}[w_1\{f(x_1)+f(-x_1)\} + w_2\{f(x_2)+f(-x_2)\}]
= \tfrac{1}{8}[0.65214515 \times (0.14116516 + 1.1149975) + 0.34785485 \times (0.0013377874 + 2.77785)]
= 0.22324429.
\]
For the six-point formula (n = 6), x_1 = −0.23861919, x_2 = −0.66120939, x_3 = −0.93246951, x_4 = −x_1, x_5 = −x_2, x_6 = −x_3, w_1 = w_4 = 0.46791393, w_2 = w_5 = 0.36076157, w_3 = w_6 = 0.17132449. Then,
\[
I = \tfrac{1}{8}[w_1\{f(x_1)+f(-x_1)\} + w_2\{f(x_2)+f(-x_2)\} + w_3\{f(x_3)+f(-x_3)\}]
\]
\[
= \tfrac{1}{8}[0.46791393 \times (0.2153945 + 0.89054879) + 0.36076157 \times (0.019350185 + 2.0375335) + 0.17132449 \times (0.00015395265 + 3.0725144)]
= 0.22324427.
\]
The exact value is 0.22324428. The following table gives a comparison among the different Gauss-Legendre formulae.

  n    Exact value    Gauss formula    Error
  2    0.22324428     0.22531632       2.07 × 10⁻³
  4    0.22324428     0.22324429       0.01 × 10⁻⁶
  6    0.22324428     0.22324427       0.01 × 10⁻⁶
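The node/weight sketch given before Table 3.1 can be used to verify Example 3.1 mechanically. The transformation (3.7) with q = p = 1/2 maps [0, 1] onto [−1, 1]; the program below is illustrative only.

/* Checking Example 3.1 with computed Gauss-Legendre rules. */
#include <stdio.h>
#include <math.h>

void gauss_legendre(int n, double x[], double w[]);  /* from the earlier sketch */

int main(void)
{
    double x[6], w[6], q = 0.5, p = 0.5;
    int n, i;
    for (n = 2; n <= 6; n += 2) {
        double s = 0.0, t;
        gauss_legendre(n, x, w);
        for (i = 0; i < n; i++) {
            t = q * x[i] + p;            /* x = qt + p, see (3.7) */
            s += w[i] * t * t * sin(t);
        }
        printf("n = %d : I = %.8f\n", n, q * s);  /* tends to 0.22324428 */
    }
    return 0;
}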
By considering the weight function ψ(x) = (1 − x²)^{−1/2}, we obtain another type of Gaussian quadrature known as Gauss-Chebyshev quadrature, discussed in the next section.
3.3 Gauss-Chebyshev quadrature formulae

Gauss-Chebyshev quadrature is also known as Chebyshev quadrature. Its weight function is taken as ψ(x) = (1 − x²)^{−1/2}. The general form of this method is
\[
\int_{-1}^{1} \frac{1}{\sqrt{1-x^2}}\,f(x)\,dx = \sum_{i=1}^{n} w_i f(x_i) + E, \tag{3.26}
\]
where E is the error. This formula also contains 2n unknown parameters. So, as for any Gaussian quadrature, this method gives the exact answer for polynomials of degree up to 2n − 1.

For n = 3 the equation (3.26) becomes
\[
\int_{-1}^{1} \frac{1}{\sqrt{1-x^2}}\,f(x)\,dx = w_1f(x_1) + w_2f(x_2) + w_3f(x_3). \tag{3.27}
\]
Since the method gives the exact value for polynomials of degree up to 2n − 1, i.e. up to 5, for f(x) = 1, x, x², x³, x⁴, x⁵ the following equations are obtained from (3.27):
\[
\begin{aligned}
w_1+w_2+w_3 &= \pi\\
w_1x_1+w_2x_2+w_3x_3 &= 0\\
w_1x_1^2+w_2x_2^2+w_3x_3^2 &= \frac{\pi}{2}\\
w_1x_1^3+w_2x_2^3+w_3x_3^3 &= 0\\
w_1x_1^4+w_2x_2^4+w_3x_3^4 &= \frac{3\pi}{8}\\
w_1x_1^5+w_2x_2^5+w_3x_3^5 &= 0.
\end{aligned}
\]
The solution of this system of equations is $x_1 = \sqrt{3}/2$, $x_2 = 0$, $x_3 = -\sqrt{3}/2$, $w_1 = w_2 = w_3 = \pi/3$. Thus the formula (3.27) becomes
\[
\int_{-1}^{1} \frac{1}{\sqrt{1-x^2}}\,f(x)\,dx = \frac{\pi}{3}\Big[f(\sqrt{3}/2) + f(0) + f(-\sqrt{3}/2)\Big]. \tag{3.28}
\]
This is known as the 3-point Gauss-Chebyshev quadrature formula. Like the Gauss-Legendre quadrature formulae, many Gauss-Chebyshev quadrature formulae can be derived for different values of n.

In the Gauss-Chebyshev quadrature formulae, the nodes x_i, i = 1, 2, ..., n, are the zeros of the Chebyshev polynomial
\[
T_n(x) = \cos(n\cos^{-1}x). \tag{3.29}
\]
That is, the nodes x_i's are given by
\[
x_i = \cos\Big(\frac{(2i-1)\pi}{2n}\Big), \quad i = 1, 2, \ldots, n. \tag{3.30}
\]
The weights w_i's are the same for all values of i and are given by
\[
w_i = -\frac{\pi}{T_{n+1}(x_i)\,T_n'(x_i)} = \frac{\pi}{n}, \quad i = 1, 2, \ldots, n. \tag{3.31}
\]
Using these results, the 1-point Gauss-Chebyshev quadrature formula is deduced below. For n = 1, x_1 = cos(π/2) = 0 and w_1 = π. That is,
\[
\int_{-1}^{1} \frac{1}{\sqrt{1-x^2}}\,f(x)\,dx = w_1 f(x_1) = \pi f(0). \tag{3.32}
\]
For n = 2, x_i = cos((2i−1)π/4), i = 1, 2. Thus,
\[
x_1 = \cos\frac{\pi}{4} = \frac{1}{\sqrt{2}} \quad\text{and}\quad x_2 = \cos\frac{3\pi}{4} = -\frac{1}{\sqrt{2}}. \tag{3.33}
\]
The weights are w_1 = w_2 = π/2. Thus, the 2-point Gauss-Chebyshev quadrature formula is
\[
\int_{-1}^{1} \frac{1}{\sqrt{1-x^2}}\,f(x)\,dx = w_1f(x_1) + w_2f(x_2) = \frac{\pi}{2}\Big[f\Big(\frac{1}{\sqrt{2}}\Big) + f\Big(-\frac{1}{\sqrt{2}}\Big)\Big]. \tag{3.34}
\]
The error in Gauss-Chebyshev quadrature is
\[
E = \frac{2\pi}{2^{2n}(2n)!}\,f^{(2n)}(\xi), \quad -1 < \xi < 1. \tag{3.35}
\]
The more general Gauss-Chebyshev quadrature formula is then
\[
\int_{-1}^{1} \frac{f(x)}{\sqrt{1-x^2}}\,dx = \frac{\pi}{n}\sum_{i=1}^{n} f\Big(\cos\frac{(2i-1)\pi}{2n}\Big) + \frac{2\pi}{2^{2n}(2n)!}\,f^{(2n)}(\xi). \tag{3.36}
\]
In Table 3.2, the values of the nodes and weights for the first few Gauss-Chebyshev quadrature formulae are provided.
  n    node x_i      weight w_i   order of truncation error
  2    ±0.7071068    1.5707963    f^(4)(ξ)
  3     0.0000000    1.0471976    f^(6)(ξ)
       ±0.8660254    1.0471976
  4    ±0.3826834    0.7853982    f^(8)(ξ)
       ±0.9238795    0.7853982
  5     0.0000000    0.6283185    f^(10)(ξ)
       ±0.5877853    0.6283185
       ±0.9510565    0.6283185

Table 3.2: Nodes and weights for Gauss-Chebyshev quadrature formulae
Example 3.2 Find the value of $\int_0^1 \frac{1}{1+x^2}\,dx$ using the Gauss-Chebyshev four-point formula.

Solution. Let $f(x) = \dfrac{\sqrt{1-x^2}}{1+x^2}$. Here x_1 = 0.3826834 = −x_2, x_3 = 0.9238795 = −x_4 and w_1 = w_2 = w_3 = w_4 = 0.7853982. Then
\[
I = \int_0^1 \frac{1}{1+x^2}\,dx = \frac{1}{2}\int_{-1}^{1} \frac{1}{1+x^2}\,dx = \frac{1}{2}\int_{-1}^{1} \frac{f(x)}{\sqrt{1-x^2}}\,dx
\]
\[
= \frac{1}{2}[w_1f(x_1)+w_2f(x_2)+w_3f(x_3)+w_4f(x_4)]
= \frac{1}{2}\times 0.7853982\,[f(x_1)+f(x_2)+f(x_3)+f(x_4)]
\]
\[
= \frac{1}{2}\times 0.7853982\,[2\times 0.8058636 + 2\times 0.2064594] = 0.7950767,
\]
while the exact value is π/4 = 0.7853982. Thus, the absolute error is 0.0096785.
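Since the nodes (3.30) and weights (3.31) are known in closed form, the n-point Gauss-Chebyshev rule needs no equation solving at all. A minimal C sketch follows; applied to f(x) = √(1 − x²)/(1 + x²) with n = 4 and the factor 1/2 of Example 3.2, it reproduces the value 0.7950767.

/* n-point Gauss-Chebyshev rule (3.36): nodes (3.30), weights pi/n (3.31). */
#include <math.h>

double gauss_chebyshev(double (*f)(double), int n)
{
    const double PI = acos(-1.0);
    double sum = 0.0;
    int i;
    for (i = 1; i <= n; i++)
        sum += f(cos((2.0 * i - 1.0) * PI / (2.0 * n)));  /* node (3.30) */
    return PI * sum / n;                                  /* weight pi/n */
}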
.
Chapter 7 Numerical Differentiation and Integration
Module No. 4 Monte-Carlo Method and Double Integration
...................................................................................... It is mentioned in Module 2 of this chapter that there are three types of quadrature formulae available in literature, viz. Newton-Cotes, Gaussian and Monte-Carlo. Two such methods are discussed in Modules 2 and 3. The third method, i.e. Monte-Carlo method is discussed in this module. This method is based on random numbers, i.e. it is a statistical method whereas earlier two methods are non-statistical.
4.1 Monte-Carlo method The Monte Carlo method is a statistical method and applied to solve different type of problems. For example, it is used in computer based simulation methods, to find extreme values of a function, etc. This method was invented by a Polish mathematician Stanislaw Ulam, in the year 1946 and the first paper on it was published in 1949. The basic principle of this method is stated below. Generate a set of random numbers. The Monte Carlo method is then performed for this random numbers for one random trial. The trial is repeated for several number of times and the trials are independent. The final result is the average of all the results obtained in different trials. This method is not suitable for hand calculation, because it depends on a large number of random numbers. Now, we discuss the Monte Carlo method to find numerical integration of a single valued function. Let the definite integral be
b
g(x) dx,
I=
(4.1)
a
where g(x) is a real valued function defined on the closed interval [a, b]. Now, this definite integral is converted to a particular form that can be solved by Monte Carlo method. For this purpose, a uniform probability density function (pdf) is defined on [a, b] in the following.
f (x) =
1 b−a ,
a<x
0,
otherwise.
This function is included to the equation (4.1) to obtain the following integral I. b g(x)f (x) dx. (4.2) I = (b − a) a
1
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Monte-Carlo Method and Double Integration Note that the integral on the right hand side of the equation (4.2) is nothing but the expectation of the function g(x) under uniform probability density function. Thus, the integral I can be written as
b
I = (b − a)
g(x)f (x) dx = (b − a)g,
(4.3)
a
where g is the expected value or mean value of g(x) over the interval [a, b]. Now, a sample {x1 , x2 , . . . , } is drawn from the probability density function f (x). For each such xi the value of g(xi ) is evaluated. To get a better approximation, a large size of sample, say N , is considered. Let G be the average of all such values of g(xi ), i = 1, 2, . . . , N . Then G=
N 1 g(xi ). N
(4.4)
i=1
It can easily be proved that the expectation of the average of N samples is the expectation of g(x), i.e. G = g. Hence N 1 g(xi ) . I = (b − a)G (b − a) N
(4.5)
i=1
The Monte Carlo method is illustrated in Figure 4.1. y = g(x) 6
g(x)
x=a
- x
x=b
Figure 4.1: Random points are chosen in [a, b]. The value of g(x) is evaluated at each random point (marked by straight lines in the figure). Thus the approximate value of the integral I on the interval [a, b] is determined by taking the average of N results of the integrand with the random variable x distributed 2
...................................................................................... uniformly over the interval [a, b]. This indicates that the interval [a, b] is finite, because there is no uniform probability density function in an infinite interval. The infinite interval can be accommodated with more sophisticated techniques. The true variance in the average G is equal to the true variance in g, i.e. var(G) =
1 var(g). N
(4.6)
If an error is committed in the estimation of the integral I with the standard deviation, √ then one may expect the error in the estimate of I to decrease by the factor 1/ N . Since the Monte-Carlo method is based on random numbers, so a suitable method is required to generate random numbers. Several methods are available to generate random numbers. In C programming language, the function rand() generates random numbers. In the next section, a method is described to generate pseudo-random numbers.
4.2 Generation of random numbers The random numbers are generated by a random process and these are the values of a random variable. Several methods are available to generate random numbers, but these methods do not generate real random numbers, because they follow some mathematical formula. A very simple method to generate random numbers is known as power residue method. In this method, a sequence of non-negative integers x1 , x2 , . . . is generated by the following formula: xn+1 = (a xn ) (mod m)
(4.7)
where x0 is a starting value called the seed, a and m are two positive integers (a < m). The expression (axn ) (mod m) gives the remainder when axn is divided by m. Note that the possible values of xn+1 are 0, 1, 2, . . . , m − 1. That is, the number of different random numbers is m. The period of random number depends on the values of the parameter a and the seed x0 and m. The proper values of a, x0 and m generate a long period random numbers. Different choices are suggested by many people. One such choice is presented below. Suppose the word-length of the computational machine be b bits. Let m = 2b−1 − 1, a is 3
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Monte-Carlo Method and Double Integration an odd integer of the form 8k±3 and near to 2b/2 , and x0 is an odd integer between 0 and m. Thus, ax0 is a 2b bits integer. The least b significant bits form the random number x1 . The process is repeated for a desired number of times. For a 32-bit computer, a set of values of the parameters as per above suggestion are m = 231 − 1 = 2147483647, a = 216 + 3 = 65539, x0 = 1267835015, (0 < x0 < m). The above method generates the random numbers between 0 and m − 1. To obtain the random numbers between [0, 1], all numbers are divided by m − 1. These numbers are uniformly distributed over the interval [0, 1]. In C, a function rand() is available in the header file stdlib.h which generates a random number between 0 and RAND MAX (the maximum integer provided by the computer). The C statement (float)rand()/RAND MAX may be used to generate random numbers between 0 and 1. Then the random number between a and b can be obtained by using the statement a+(b-a)*(float)rand()/RAND MAX.
The following C program finds the value of an integration Carlo method.
0
1
1 dx by Monte1 + x2
/* Program Monte Carlo Program to find the value of the integration of 1/(1+x^2) between 0 and 1 by Monte Carlo method for different values of N. */ #include<stdio.h> #include<stdlib.h> void main() { float g(float x); /* g(x) may be changed accordingly */ float x,y,I,sum=0.0,a=0.0,b=1.0; int i, N; srand(100); /* seed for random number */ printf("Enter the sample size "); scanf("%d",&N); for(i=0;i
/* rand() generates a random number between 0 and RAND_MAX */ y=(float)rand()/RAND_MAX; /* generates a random number between 0 and 1*/
4
......................................................................................
x=a+(b-a)*y; sum+=g(x); } I=sum*(b-a)/N; printf("%f",I); } /* definition of function */ float g(float x) { return(1/(1+x*x)); } The results obtained for different values of N are tabulated below. N
value of integration
500
0.790020
1000
0.789627
1500
0.786979
2000
0.787185
3000
0.786553
4000
0.784793
9000
0.784254
10000
0.784094
15000
0.782420
The exact answer is π/4 ≈ 0.785398. None of the values shown in the above table matches the exact value to six decimal places. The closest value is 0.784793 (correct up to three decimal places), and it is obtained for N = 4000.

Note 4.1 Note that the Monte-Carlo method can be used to find the integration of a function, but it does not produce a better result for all types of functions.

The three types of integration methods have now been discussed, and all these methods are applicable to single valued functions. Now, we discuss the trapezoidal and Simpson's methods to find a double integration.
4.3 Double integration

Let us consider the double integral
\[
I = \int_c^d\int_a^b f(x,y)\,dx\,dy. \tag{4.8}
\]
This integral can be evaluated using the Newton-Cotes, Gaussian and Monte-Carlo methods. In this section, the trapezoidal and Simpson's methods are considered to find the double integral. The integral can be evaluated numerically in two successive steps: in the first step, the integration is evaluated with respect to x taking y constant, and in the second step the resulting integrand is evaluated with respect to y.

4.3.1 Trapezoidal method

Let the double integral be
\[
I = \int_c^d\int_a^b f(x,y)\,dx\,dy.
\]
Suppose y is held constant. Then we integrate with respect to x by the trapezoidal method. By the trapezoidal formula,
\[
I = \int_c^d \frac{b-a}{2}\,[f(a,y)+f(b,y)]\,dy = \frac{b-a}{2}\Big[\int_c^d f(a,y)\,dy + \int_c^d f(b,y)\,dy\Big]. \tag{4.9}
\]
On the right hand side, there are two integrals in the variable y. Again applying the trapezoidal rule to the two integrals of the right hand side, we obtain the trapezoidal formula for the double integral as
\[
I = \frac{b-a}{2}\Big[\frac{d-c}{2}\{f(a,c)+f(a,d)\} + \frac{d-c}{2}\{f(b,c)+f(b,d)\}\Big]
= \frac{(b-a)(d-c)}{4}[f(a,c)+f(a,d)+f(b,c)+f(b,d)]. \tag{4.10}
\]
In terms of x and y, the above formula can be written as follows. Let h = b − a, k = d − c, x_0 = a, x_1 = b, y_0 = c, y_1 = d. Then
\[
I = \frac{hk}{4}[f(x_0,y_0)+f(x_0,y_1)+f(x_1,y_0)+f(x_1,y_1)]. \tag{4.11}
\]
Thus, only four values at the corner points (a, c), (a, d), (b, c) and (b, d) of the rectangular region [a, b; c, d] are required to find the double integration by the trapezoidal rule. This formula gives a very rough value of the integration.

Note 4.2 The trapezoidal formula finds the repeated integral of the double integral (4.8). In many cases the repeated integral is equal to the double integral.
Composite trapezoidal formula

The composite trapezoidal formula is generally used to get a better result for the integral (4.8). In this case, the interval [a, b] is divided into n equal subintervals each of length h, and the interval [c, d] is divided into m equal subintervals each of length k. That is,
\[
x_0 = a,\quad x_i = x_0 + ih,\ i = 1, 2, \ldots, n,\quad x_n = b,\quad h = \frac{b-a}{n},
\]
\[
y_0 = c,\quad y_j = y_0 + jk,\ j = 1, 2, \ldots, m,\quad y_m = d,\quad k = \frac{d-c}{m}.
\]
Now, integrating (4.8) with respect to x between a and b by the trapezoidal formula, as in the case of a single variable, we get
\[
\int_a^b f(x,y)\,dx = \frac{h}{2}\,\big[f(x_0,y) + 2\{f(x_1,y)+f(x_2,y)+\cdots+f(x_{n-1},y)\} + f(x_n,y)\big]. \tag{4.12}
\]
Again, integrating the above with respect to y between c and d by the same formula, we get
\[
I = \int_c^d\int_a^b f(x,y)\,dx\,dy
\]
\[
= \frac{h}{2}\Big[\frac{k}{2}\big\{f(x_0,y_0)+2(f(x_0,y_1)+f(x_0,y_2)+\cdots+f(x_0,y_{m-1}))+f(x_0,y_m)\big\}
\]
\[
\quad + 2\cdot\frac{k}{2}\big\{f(x_1,y_0)+2(f(x_1,y_1)+f(x_1,y_2)+\cdots+f(x_1,y_{m-1}))+f(x_1,y_m)\big\}
+ \cdots
\]
\[
\quad + 2\cdot\frac{k}{2}\big\{f(x_{n-1},y_0)+2(f(x_{n-1},y_1)+\cdots+f(x_{n-1},y_{m-1}))+f(x_{n-1},y_m)\big\}
\]
\[
\quad + \frac{k}{2}\big\{f(x_n,y_0)+2(f(x_n,y_1)+f(x_n,y_2)+\cdots+f(x_n,y_{m-1}))+f(x_n,y_m)\big\}\Big]
\]
\[
= \frac{hk}{4}\Big[f_{00} + 2\{f_{01}+f_{02}+\cdots+f_{0,m-1}\} + f_{0m}
+ 2\sum_{i=1}^{n-1}\{f_{i0}+2(f_{i1}+f_{i2}+\cdots+f_{i,m-1})+f_{im}\}
+ f_{n0}+2(f_{n1}+f_{n2}+\cdots+f_{n,m-1})+f_{nm}\Big], \tag{4.13}
\]
where f_{ij} = f(x_i, y_j), i = 0, 1, ..., n; j = 0, 1, 2, ..., m. The method is of second order in both h and k.

  x_0 :  [f_{00} + 2(f_{01} + f_{02} + ... + f_{0,m-1}) + f_{0m}] k/2 = I_0
  x_1 :  [f_{10} + 2(f_{11} + f_{12} + ... + f_{1,m-1}) + f_{1m}] k/2 = I_1
  ...
  x_n :  [f_{n0} + 2(f_{n1} + f_{n2} + ... + f_{n,m-1}) + f_{nm}] k/2 = I_n

Table 4.1: Tabular form of trapezoidal formula for double integration
Using the notations of Table 4.1, the trapezoidal formula for the double integral can be written as
\[
I = \frac{h}{2}[I_0 + 2(I_1+I_2+\cdots+I_{n-1}) + I_n], \quad\text{where}\quad
I_j = \frac{k}{2}[f_{j0} + 2(f_{j1}+f_{j2}+\cdots+f_{j,m-1}) + f_{j,m}],\ j = 0, 1, \ldots, n. \tag{4.14}
\]
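Formula (4.14) leads directly to the short double-loop sketch below; the name trap2d and the function-pointer interface are our own choices. For Example 4.1 below, a call with n = m = 4 gives approximately 0.61277.

/* Composite trapezoidal rule (4.14) over the rectangle [a,b] x [c,d]. */
double trap2d(double (*f)(double, double),
              double a, double b, double c, double d, int n, int m)
{
    double h = (b - a) / n, k = (d - c) / m, I = 0.0;
    int i, j;
    for (i = 0; i <= n; i++) {
        double x = a + i * h, Ij = 0.0;
        for (j = 0; j <= m; j++)                       /* inner sum Ij of (4.14) */
            Ij += ((j == 0 || j == m) ? 1.0 : 2.0) * f(x, c + j * k);
        Ij *= k / 2.0;
        I += ((i == 0 || i == n) ? 1.0 : 2.0) * Ij;    /* outer trapezoidal sum */
    }
    return I * h / 2.0;
}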
Now, we consider Simpson's 1/3 formula, which gives a better result than the trapezoidal formula.

4.3.2 Simpson's 1/3 method
Let the double integral be
\[
I = \int_c^d\int_a^b f(x,y)\,dx\,dy. \tag{4.15}
\]
In Simpson's 1/3 formula both intervals are divided into two subintervals. That is, x_0 = a, x_1 = x_0 + h, x_2 = b are the three points on the interval [a, b] and, similarly, y_0 = c, y_1 = y_0 + k, y_2 = d are the three points on the interval [c, d], where h = (b−a)/2 and k = (d−c)/2. Then, by Simpson's 1/3 rule as used in the single variable case, taking y constant, we get
\[
\int_a^b f(x,y)\,dx = \frac{h}{3}[f(x_0,y) + 4f(x_1,y) + f(x_2,y)].
\]
Again, by the same rule on each term of the above formula, we get
\[
I = \frac{h}{3}\Big[\frac{k}{3}\{f(x_0,y_0)+4f(x_0,y_1)+f(x_0,y_2)\}
+ 4\cdot\frac{k}{3}\{f(x_1,y_0)+4f(x_1,y_1)+f(x_1,y_2)\}
+ \frac{k}{3}\{f(x_2,y_0)+4f(x_2,y_1)+f(x_2,y_2)\}\Big]
\]
\[
= \frac{hk}{9}\,[f_{00}+f_{02}+f_{20}+f_{22}+4(f_{01}+f_{10}+f_{12}+f_{21})+16f_{11}], \tag{4.16}
\]
where f_{ij} = f(x_i, y_j), i = 0, 1, 2; j = 0, 1, 2. In general, over the intervals [x_{i−1}, x_{i+1}] and [y_{j−1}, y_{j+1}] the formula is
\[
\int_{y_{j-1}}^{y_{j+1}}\int_{x_{i-1}}^{x_{i+1}} f(x,y)\,dx\,dy = \frac{hk}{9}\big[f_{i-1,j-1}+f_{i-1,j+1}+f_{i+1,j-1}+f_{i+1,j+1}
+ 4(f_{i-1,j}+f_{i,j-1}+f_{i,j+1}+f_{i+1,j}) + 16f_{ij}\big]. \tag{4.17}
\]
This formula is known as Simpson's 1/3 rule for double integration.

  x_0 :  [f_{00} + 4f_{01} + 2f_{02} + 4f_{03} + 2f_{04} + ... + 4f_{0,m-1} + f_{0m}] k/3 = I_0
  x_1 :  [f_{10} + 4f_{11} + 2f_{12} + 4f_{13} + 2f_{14} + ... + 4f_{1,m-1} + f_{1m}] k/3 = I_1
  ...
  x_n :  [f_{n0} + 4f_{n1} + 2f_{n2} + 4f_{n3} + 2f_{n4} + ... + 4f_{n,m-1} + f_{nm}] k/3 = I_n

Table 4.2: Tabular form of Simpson's formula for double integration

The composite Simpson's 1/3 formula can easily be derived from Table 4.2. The composite Simpson's 1/3 formula is
\[
I = \frac{h}{3}[I_0 + 4(I_1+I_3+\cdots+I_{n-1}) + 2(I_2+I_4+\cdots+I_{n-2}) + I_n], \tag{4.18}
\]
where
\[
I_j = \frac{k}{3}[f_{j0} + 4f_{j1} + 2f_{j2} + 4f_{j3} + 2f_{j4} + \cdots + 4f_{j,m-1} + f_{j,m}],\ j = 0, 1, \ldots, n.
\]
Example 4.1 Find the value of $\int_0^1\int_0^1 \dfrac{dx\,dy}{(1+x^2)(1+y^2)}$ by the trapezoidal and Simpson's 1/3 formulae taking h = k = 0.25.

Solution. Since h = k = 0.25, let x = 0.00, 0.25, 0.50, 0.75, 1.00 and y = 0.00, 0.25, 0.50, 0.75, 1.00. Let $f(x,y) = \dfrac{1}{(1+x^2)(1+y^2)}$. The following table gives the values of f(x, y) for different values of x and y.

  x \ y |  0.00     0.25     0.50     0.75     1.00
  0.00  |  1.00000  0.94118  0.80000  0.64000  0.50000
  0.25  |  0.94118  0.88581  0.75294  0.60235  0.47059
  0.50  |  0.80000  0.75294  0.64000  0.51200  0.40000
  0.75  |  0.64000  0.60235  0.51200  0.40960  0.32000
  1.00  |  0.50000  0.47059  0.40000  0.32000  0.25000
Let x be fixed and y be the varying variable. Then, by the trapezoidal formula on each row of the above table, we get
\[
I_0 = \int_0^1 f(0,y)\,dy = \frac{0.25}{2}[1.00000 + 2(0.94118+0.80000+0.64000) + 0.50000] = 0.78279,
\]
\[
I_1 = \int_0^1 f(0.25,y)\,dy = \frac{0.25}{2}[0.94118 + 2(0.88581+0.75294+0.60235) + 0.47059] = 0.73675,
\]
\[
I_2 = \int_0^1 f(0.5,y)\,dy = \frac{0.25}{2}[0.80000 + 2(0.75294+0.64000+0.51200) + 0.40000] = 0.62624,
\]
\[
I_3 = \int_0^1 f(0.75,y)\,dy = \frac{0.25}{2}[0.64000 + 2(0.60235+0.51200+0.40960) + 0.32000] = 0.50099,
\]
\[
I_4 = \int_0^1 f(1.0,y)\,dy = \frac{0.25}{2}[0.50000 + 2(0.47059+0.40000+0.32000) + 0.25000] = 0.39140.
\]
Hence, finally,
\[
\int_0^1\int_0^1 \frac{dx\,dy}{(1+x^2)(1+y^2)} = \frac{h}{2}[I_0 + 2(I_1+I_2+I_3) + I_4]
= \frac{0.25}{2}[0.78279 + 2(0.73675+0.62624+0.50099) + 0.39140] = 0.61277.
\]
Again, by Simpson's 1/3 formula on each row of the above table, we have
\[
I_0 = \int_0^1 f(0,y)\,dy = \frac{0.25}{3}[1.00000 + 4(0.94118+0.64000) + 2(0.80000) + 0.50000] = 0.78539.
\]
Similarly, I_1 = 0.73919, I_2 = 0.62831, I_3 = 0.50265, I_4 = 0.39270. Hence, by Simpson's 1/3 formula, the double integration is
\[
I = \frac{h}{3}[I_0 + 4(I_1+I_3) + 2I_2 + I_4]
= \frac{0.25}{3}[0.78539 + 4(0.73919+0.50265) + 2(0.62831) + 0.39270] = 0.61684.
\]
Merits and demerits of the quadrature formulae

Three types of quadrature formulae have been discussed in the last three modules. Their advantages, disadvantages and limitations are discussed below.

(i) The Newton-Cotes quadrature formulae are applicable only to proper integrals, whereas the Gaussian quadrature formulae are useful for both proper and improper integrals.

(ii) The simplest but crudest integration formula is the trapezoidal formula. This formula gives a rough value of the integral using very few calculations.

(iii) Simpson's 1/3 rule gives a more accurate result than the trapezoidal formula and it is also simple. If f(x) does not fluctuate rapidly and is explicitly known, then this method with a suitable subinterval can be used. If high accuracy is required, then Gaussian quadrature may be used.

(iv) If the integrand f(x) has infinite discontinuity(ies) at one or both of the limits, then the open type Newton-Cotes formulae may be used. But, in this case, Gaussian quadrature gives a better result.

(v) The Gauss-Legendre or Gauss-Chebyshev formulae give more accurate results than the trapezoidal, Simpson's 1/3, Simpson's 3/8, Boole's and Weddle's formulae, with fewer calculations.

(vi) The Gauss-Legendre or Gauss-Chebyshev formulae may be applied directly to improper integrals of the second kind, provided the improper integral is convergent. If such an integrand has singularity(ies) at some nodes of the k-point Gauss-Legendre (or Gauss-Chebyshev) formula, then use either the (k−1)-point or (k+1)-point or any other Gauss-Legendre (or Gauss-Chebyshev) formula to avoid the singularities.

(vii) If one of the limits of the integration is infinite, then replace x by 1/x to make it finite. Then use an appropriate formula.

(viii) Simpson's 1/3 formula also gives a better result than the trapezoidal formula for double integration, provided that the repeated integral and the double integral are the same for the given problem; otherwise Simpson's 1/3 (and also the trapezoidal) formula gives a completely different answer.

(ix) If the integrand is violently oscillating or fluctuating, then the Monte Carlo method can be used.
Chapter 8 Numerical Solution of Ordinary Differential Equations
Module No. 1 Runge-Kutta Methods
One of the most useful tools of applied mathematics is the differential equation; it may be an ordinary differential equation (ODE) or a partial differential equation (PDE). Both ODEs and PDEs are widely used to model a great many mathematical and engineering problems. Unfortunately, it is not possible to find the analytical solution of every ODE or PDE; finding the analytic solution of an ODE or a PDE is, in general, a very difficult task. But several numerical techniques are available to solve ODEs and PDEs.

Let us consider the following first order differential equation
\[
\frac{dy}{dx} = f(x, y) \tag{1.1}
\]
with initial condition
\[
y(x_0) = y_0. \tag{1.2}
\]
If the analytic solution of an ODE is not available, then we go for a numerical solution. The numerical solution of a differential equation can be determined in one of the following two forms:
(i) A power series solution for y in terms of x; the values of y are then obtained by substituting x = x_0, x_1, ..., x_n.
(ii) A set of tabulated values of y for x = x_0, x_1, ..., x_n with spacing h.

The general solution of an ODE of order n contains n arbitrary constants. To find such constants, n conditions are required. The problems in which all the conditions are specified at the initial point only are called initial value problems (IVPs). The ODEs of order two or more for which the conditions are specified at two or more points are called boundary value problems (BVPs). A solution of an ordinary differential equation may not always exist; a sufficient condition for the existence of a unique solution is provided by the Lipschitz condition.

The methods to find an approximate solution of an initial value problem are called finite difference methods or discrete variable methods. The solutions are determined at a set of discrete points called a grid or mesh of points. That is, a given differential equation is converted to a discrete algebraic equation. By solving such an equation, we get an approximate solution of the given differential equation.
A finite difference method is convergent if the solution of the finite difference equation approaches a limit as the size of the grid spacing tends to zero. But, in general, there is no guarantee that this limit corresponds to the exact solution of the differential equation. Mainly, two types of numerical methods are used, viz. explicit methods and implicit methods. If the value of y_{i+1} depends only on the values of y_i, h and f(x_i, y_i), then the method is called an explicit method; otherwise the method is called an implicit method. In this chapter, we discuss some useful methods to solve ODEs.
1.1 Euler's method

Euler's method is the simplest but crudest method to solve a differential equation of the form
\[
\frac{dy}{dx} = f(x,y), \quad y(x_0) = y_0. \tag{1.3}
\]
Let x_1 = x_0 + h, where h is small. Then by Taylor's series,
\[
y_1 = y(x_0+h) = y_0 + h\Big(\frac{dy}{dx}\Big)_{x_0} + \frac{h^2}{2}\Big(\frac{d^2y}{dx^2}\Big)_{c_1}
= y_0 + hf(x_0,y_0) + \frac{h^2}{2}\,y''(c_1), \tag{1.4}
\]
where c_1 lies between x_0 and x_1. Thus, neglecting the second order term, we get
\[
y_1 = y_0 + hf(x_0, y_0). \tag{1.5}
\]
In general,
\[
y_{n+1} = y_n + hf(x_n, y_n), \quad n = 0, 1, 2, \ldots. \tag{1.6}
\]
This is a very slow method. To find a reasonably accurate solution, the value of h must be taken small. Euler's method is less efficient in practical problems, because if h is not sufficiently small then it gives an inaccurate result. The method is modified to get a better result as follows. The value of y_1 obtained by Euler's method is repeatedly modified to get a more accurate result. For this purpose, the term f(x_0, y_0) is replaced by the average of f(x_0, y_0) and f(x_0+h, y_1^{(0)}), where $y_1^{(0)} = y_0 + hf(x_0, y_0)$.
That is, the modified Euler's method is
\[
y_1^{(1)} = y_0 + \frac{h}{2}\big[f(x_0, y_0) + f(x_1, y_1^{(0)})\big], \tag{1.7}
\]
where $y_1^{(0)} = y_0 + hf(x_0, y_0)$. Note that the value of y_1 depends on the value of $y_1^{(0)}$ obtained in the previous iteration, so this method is known as an implicit method, whereas Euler's method is an explicit method.

Other simple methods, viz. Taylor's series, Picard's, Runge-Kutta, etc., are also available to solve a differential equation of the form (1.1). Now, we describe the Runge-Kutta method. In this method, several function evaluations are required at each step, but it avoids the computation of higher order derivatives. There are several types of Runge-Kutta methods, such as second, third, fourth, fifth, etc. order. The fourth-order Runge-Kutta method is the most popular. These are single-step explicit methods.
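One step of Euler's method (1.6) and of its modified (predictor-corrector) version (1.7) can be written in C as follows; a single corrector pass is shown, which is how (1.7) is stated above. The function names are our own illustrative choices.

/* One step of Euler's method (1.6) and of the modified Euler method (1.7). */
double euler_step(double (*f)(double, double), double x, double y, double h)
{
    return y + h * f(x, y);                        /* y_{n+1} = y_n + h f(x_n, y_n) */
}

double modified_euler_step(double (*f)(double, double), double x, double y, double h)
{
    double yp = y + h * f(x, y);                   /* predictor y1^(0) */
    return y + 0.5 * h * (f(x, y) + f(x + h, yp)); /* corrector, formula (1.7) */
}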
1.2 Second-order Runge-Kutta method

First we deduce the second-order Runge-Kutta method from the modified Euler's method. The modified Euler's method is
\[
y_1 = y_0 + \frac{h}{2}\big[f(x_0,y_0) + f(x_1, y_1^{(0)})\big], \tag{1.8}
\]
where $y_1^{(0)} = y_0 + hf(x_0, y_0)$. Now, we substitute the value of $y_1^{(0)}$ into the equation (1.8). Then
\[
y_1 = y_0 + \frac{h}{2}\big[f(x_0,y_0) + f(x_0+h,\ y_0 + hf(x_0,y_0))\big].
\]
Let
\[
k_1 = hf(x_0, y_0) \quad\text{and}\quad k_2 = hf(x_0+h,\ y_0+hf(x_0,y_0)) = hf(x_0+h,\ y_0+k_1). \tag{1.9}
\]
Using this notation, the equation (1.8) becomes
\[
y_1 = y_0 + \frac{1}{2}(k_1 + k_2). \tag{1.10}
\]
This method is known as the second-order Runge-Kutta method. Note that this is an explicit formula, whereas the modified Euler's method is an implicit method. The local truncation error of this method is O(h³). This formula can also be derived independently.

Independent derivation of second-order Runge-Kutta method

Suppose the solution of the equation
\[
\frac{dy}{dx} = f(x,y), \quad y(x_0) = y_0 \tag{1.11}
\]
is of the following form:
\[
y_1 = y_0 + ak_1 + bk_2, \tag{1.12}
\]
where k_1 = hf(x_0, y_0), k_2 = hf(x_0+αh, y_0+βk_1), and a, b, α, β are constants. By Taylor's theorem,
\[
y_1 = y(x_0+h) = y_0 + hy_0' + \frac{h^2}{2}y_0'' + \frac{h^3}{6}y_0''' + \cdots
= y_0 + hf(x_0,y_0) + \frac{h^2}{2}\Big[\frac{\partial f}{\partial x} + f\,\frac{\partial f}{\partial y}\Big]_{(x_0,y_0)} + O(h^3)
\]
\[
\Big(\text{since } \frac{df}{dx} = \frac{\partial f}{\partial x} + f(x,y)\,\frac{\partial f}{\partial y}\Big),
\]
and
\[
k_2 = hf(x_0+\alpha h,\ y_0+\beta k_1)
= h\Big[f(x_0,y_0) + \alpha h\,\frac{\partial f}{\partial x}\Big|_{(x_0,y_0)} + \beta k_1\,\frac{\partial f}{\partial y}\Big|_{(x_0,y_0)} + O(h^2)\Big]
\]
\[
= hf(x_0,y_0) + \alpha h^2 f_x(x_0,y_0) + \beta h^2 f(x_0,y_0)\,f_y(x_0,y_0) + O(h^3).
\]
Substituting these values into the equation (1.12), we get
\[
y_0 + hf(x_0,y_0) + \frac{h^2}{2}[f_x(x_0,y_0) + f(x_0,y_0)f_y(x_0,y_0)] + O(h^3)
= y_0 + (a+b)hf(x_0,y_0) + bh^2[\alpha f_x(x_0,y_0) + \beta f(x_0,y_0)f_y(x_0,y_0)] + O(h^3).
\]
Equating the coefficients of f, f_x and f_y on both sides, we get the following equations:
\[
a+b = 1, \quad b\alpha = \frac{1}{2} \quad\text{and}\quad b\beta = \frac{1}{2}. \tag{1.13}
\]
Obviously, α = β. Here, the number of equations is three and the number of unknowns is four. Therefore, the system of equations has many solutions. However, usually the parameters are chosen as α = β = 1, so that a = b = 1/2. For this set of parameters, the equation (1.12) becomes
\[
y_1 = y_0 + \frac{1}{2}(k_1+k_2) + O(h^3), \tag{1.14}
\]
where
\[
k_1 = hf(x_0,y_0) \quad\text{and}\quad k_2 = hf(x_0+h,\ y_0+k_1). \tag{1.15}
\]
1.3 Fourth-order Runge-Kutta method

The fourth-order Runge-Kutta formula to calculate y_1 is
\[
y_1 = y_0 + \frac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4), \tag{1.16}
\]
where
\[
k_1 = hf(x_0,y_0),\quad k_2 = hf(x_0+h/2,\ y_0+k_1/2),\quad k_3 = hf(x_0+h/2,\ y_0+k_2/2),\quad k_4 = hf(x_0+h,\ y_0+k_3).
\]
Using the value of y_1 one can find the value of y_2 as
\[
y_2 = y_1 + \frac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4), \tag{1.17}
\]
where
\[
k_1 = hf(x_1,y_1),\quad k_2 = hf(x_1+h/2,\ y_1+k_1/2),\quad k_3 = hf(x_1+h/2,\ y_1+k_2/2),\quad k_4 = hf(x_1+h,\ y_1+k_3).
\]
In general,
\[
y_{i+1} = y_i + \frac{1}{6}\big(k_1^{(i)} + 2k_2^{(i)} + 2k_3^{(i)} + k_4^{(i)}\big), \tag{1.18}
\]
where
\[
k_1^{(i)} = hf(x_i,y_i),\quad k_2^{(i)} = hf(x_i+h/2,\ y_i+k_1^{(i)}/2),\quad k_3^{(i)} = hf(x_i+h/2,\ y_i+k_2^{(i)}/2),\quad k_4^{(i)} = hf(x_i+h,\ y_i+k_3^{(i)}),
\]
for i = 0, 1, 2, ....

Example 1.1 Given y' = x − y² with x = 0, y = 1. Find y(0.2) by the second and fourth-order Runge-Kutta methods.

Solution. Here x_0 = 0, y_0 = 1, f(x, y) = x − y².

By the second-order Runge-Kutta method:
Let h = 0.1.
k_1 = hf(x_0, y_0) = 0.1 × (0 − 1²) = −0.10000.
k_2 = hf(x_0+h, y_0+k_1) = 0.1 × f(0.1, 0.9) = −0.07100.
Now we calculate
y_1 = y(0.1) = y(0) + ½(k_1+k_2) = 1 + 0.5 × (−0.1 − 0.071) = 0.91450.
To determine y_2 = y(0.2): x_1 = 0.1 and y_1 = 0.91450.
k_1 = hf(x_1, y_1) = 0.1 × {0.1 − (0.91450)²} = −0.073633.
k_2 = hf(x_1+h, y_1+k_1) = 0.1 × [0.2 − (0.91450 − 0.07363)²] = −0.05071.
Therefore,
y_2 = y(0.2) = y(0.1) + ½(k_1+k_2) = 0.91450 + 0.5 × (−0.07363 − 0.05071) = 0.85233.
1.1. Euler’s method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . By fourth-order Runge-Kutta methods: Here h = 0.1, x0 = 0, y0 = 1, f (x, y) = x − y 2 . Then, k1 = hf (x0 , y0 ) = 0.1 × (0 − (1)2 ) = −0.1. k2 = hf (x0 + h/2, y0 + k1 /2) 0.1 0.1 2 = 0.1 × {( ) − (1 − ) } = −0.08525. 2 2 k3 = hf (x0 + h/2, y0 + k2 /2) 0.1 0.08525 2 = 0.1 × {( ) − (1 − ) } = −0.08666. 2 2 0.08666 2 0.1 ) } = −0.07342. k4 = 0.1 × {( ) − (1 − 2 2 Therefore, 1 y1 = y(0.1) = y0 + (k1 + 2k2 + 2k3 + k4 ) 6 1 = 1 + [−0.1000 + 2 × {(−0.08525) + (−0.08666)} − 0.07342] = 0.91379. 6 To find, y2 = y(0.2), h = 0.1, x1 = 0.1, y1 = 0.91379. Then, k1 = hf (x1 , y1 ) = 0.1 × {0.1 − (0.91379)2 } = −0.07350. k2 = hf (x1 + h/2, y1 + k1 /2) 0.07350 2 0.1 ) − (0.91379 − ) } = −0.06192. = 0.1 × {(0.1 + 2 2 k3 = hf (x1 + h/2, y1 + k2 /2) 0.1 0.06192 2 = 0.1 × {(0.1 + ) − (0.91379 − ) } = −0.06294. 2 2 k4 = hf (x1 + h, y1 + k3 ) = 0.1 × {(0.1 + 0.1) − (0.91379 − 0.06294)2 } = −0.05239.
Therefore, 1 y2 = y(0.2) = y(0) + (k1 + 2k2 + 2k3 + k4 ) 6 1 = 0.91379 + [−0.07350 + 2 × {(−0.06192) + (−0.06294)} − 0.05239] 6 = 0.85119. 7
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Runge-Kutta Methods
Note 1.1 The fourth order Runge-Kutta method gives better result, though, this method has some disadvantages. In this method, a lot of function calculations are required. Thus, if the function f (x, y) is complicated, then the Runge-Kutta method is very laborious. Error in fourth-order Runge-Kutta method The fourth-order Runge-Kutta formula can be written as y1 = y0 +
i 4(k2 + k3 ) (h/2) h k1 + + k4 . 3 2
This form is similar to the Simpson’s 1/3 formula with step size h/2. Thus, the local h5 iv truncation error of this formula is − y (c1 ), i.e. of O(h5 ). After n steps, the 2880 accumulated error is −
n X xn − x0 iv h5 iv y (ci ) = − y (c)h4 = O(h4 ). 2880 5760 i=1
1.4 Runge-Kutta method for a pair of equations The second and fourth-order Runge-Kutta methods can also used to solve a pair of first order differential equations. Let us consider a pair of first-order differential equations dy = f (x, y, z) dx (1.19) dz = g(x, y, z) dx with initial conditions x = x0 , y(x0 ) = y0 , z(x0 ) = z0 .
(1.20)
Here, x is the independent variable and y, z are the dependent variables. So, in this problem we have to determine the values of y, z for different values of x. 8
1.1. Euler’s method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . The fourth-order Runge-Kutta method to find the values of yi and zi for x = xi from a pair of equations is 1 (i) (i) (i) (i) yi+1 = yi + [k1 + 2k2 + 2k3 + k4 ], 6 1 (i) (i) (i) (i) zi+1 = zi + [l1 + 2l2 + 2l3 + l4 ], 6
where
(1.21)
(i)
k1 = hf (xi , yi , zi ) (i)
l1 = hg(xi , yi , zi ) (i)
(i)
(i)
(i)
(i)
(i)
(i)
(i)
(i)
(i)
(i)
(i)
k2 = hf (xi + h/2, yi + k1 /2, zi + l1 /2) l2 = hg(xi + h/2, yi + k1 /2, zi + l1 /2)
(1.22)
k3 = hf (xi + h/2, yi + k2 /2, zi + l2 /2) l3 = hg(xi + h/2, yi + k2 /2, zi + l2 /2) (i)
(i)
(i)
(i)
(i)
(i)
k4 = hf (xi + h, yi + k3 , zi + l3 ) l4 = hg(xi + h, yi + k3 , zi + l3 ). Example 1.2 Solve the following pair of differential equations dy dz = z and = y + xz at x = 0.2 given y(0) = 1, z(0) = 0 . dx dx Solution. Let h = 0.2. Here f (x, y, z) = z, g(x, y, z) = y + xz. We calculate the values of k1 , k2 , k3 , k4 ; l1 , l2 , l3 , l4 as follows. k1 = hf (x0 , y0 , z0 ) = 0.2 × 0 = 0 l1 = hg(x0 , y0 , z0 ) = 0.2 × (1 + 0) = 0.2 0.2 = 0.02 2 l2 = hg(x0 + h/2, y0 + k1 /2, z0 + l1 /2) = 0.2 × [(1 + 0) + (0 + k2 = hf (x0 + h/2, y0 + k1 /2, z0 + l1 /2) = 0.2 ×
0.2 2 )
× (0 +
0.2 2 )]
= 0.2 × (1 + 0.1 + 0.1) = 0.2020 0.202 = 0.02020 2 0.02 l3 = hg(x0 + h/2, y0 + k2 /2, z0 + l2 /2) = 0.2 × [(1 + 2 ) + (0 + k3 = hf (x0 + h/2, y0 + k2 /2, z0 + l2 /2) = 0.2 ×
0.2 2 )
× (0 +
0.2020 2 )]
= 0.20402 k4 = hf (x0 + h, y0 + k3 , z0 + l3 ) = 0.2 × 0.20402 = 0.04080 l4 = hg(x0 + h, y0 + k3 , z0 + l3 ) = 0.2 × [(1 + 0.02020) + (0 + 0.2) × (0 + 0.20402)] = 0.21220. 9
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Runge-Kutta Methods Hence, 1 y(0.2) = y1 = y(0) + [k1 + 2(k2 + k3 ) + k4 ] 6 1 = 1 + [0 + 2(0.02 + 0.0202) + 0.0408] = 1.0202. 6 1 z(0.2) = z1 = z(0) + [l1 + 2(l2 + l3 ) + l4 ] 6 1 = 0 + [0.2 + 2(0.2020 + 0.20402) + 0.21220] = 0.20404. 6
1.5 Runge-Kutta method for a system of equations The above method can be extended to solve a system of first order differential equations. Let us consider the following system of first order differential equations dy1 = f1 (x, y1 , y2 , . . . , yn ) dx dy2 = f2 (x, y1 , y2 , . . . , yn ) dx ··· ·················· dyn = fn (x, y1 , y2 , . . . , yn ) dx
(1.23)
with initial conditions y1 (x0 ) = y10 , y2 (x0 ) = y20 , . . . , yn (x0 ) = yn0 .
(1.24)
In this problem, x is the independent variable and y1 , y2 , . . . , yn are n dependent variables. The above system of equations can be written as a vector notation as follows: dy = f (x, y1 , y2 , . . . , yn ) with y(x0 ) = y0 . dx The fourth-order Runge-Kutta method for the equation (1.25) is
(1.25)
1 yj+1 = yj + [k1 + 2k2 + 2k3 + k4 ] 6
(1.26)
where
k11
k12
k13
k14
k24 k22 k23 k21 k1 = . , k2 = . , k3 = . , k4 = . .. .. .. .. kn4 kn1 kn2 kn3 10
(1.27)
1.1. Euler’s method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . and ki1 = hfi (xj , y1j , y2j , . . . , ynj ) ki2 = hfi (xj + h/2, y1j + k11 /2, y2j + k21 /2, . . . , ynj + kn1 /2) ki3 = hfi (xj + h/2, y1j + k12 /2, y2j + k22 /2, . . . , ynj + kn2 /2) ki4 = hfi (xj + h, y1j + k13 , y2j + k23 , . . . , ynj + kn3 ) for i = 1, 2, . . . , n. Here, ykj is the value of the variable yk evaluated at xj .
1.6 Runge-Kutta method for second order IVP The Runge-Kutta methods can be used to solve second order IVP by converting the IVP to a pair of first order differential equations. Let the second order differential equation be a(x)y 00 + b(x)y 0 + c(x)y = f (x)
(1.28)
and the initial conditions be x = x0 , y(x0 ) = y0 , y 0 (x0 ) = z0 .
(1.29)
To convert it to first order equation, let y 0 (x) = z(x).
(1.30)
Then y 00 (x) = z 0 (x). Therefore, the equation (1.29) reduces to dy =z dx dz 1 = [f (x) − b(x)z − c(x)y] = g(x, y, z) dx a(x)
(1.31)
with initial conditions x = x0 , y(x0 ) = y0 , z(x0 ) = z0 .
(1.32)
This is a pair of first order differential equations. Now, Runge-Kutta methods may be used to solve this pair of equations (1.31) with initial conditions (1.32). The value of y is the solution of the given IVP and z is the first order derivatives. 11
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Runge-Kutta Methods
Example 1.3 Find the value of y(0.1) for the following second order differential equation by fourth-order Runge-Kutta method. y 00 (x) − 2y 0 (x) + y(x) = x + 1 with y(0) = 1, y 0 (0) = 0. Solution. Substituting y 0 = z. Then the given equation reduces to z 0 − 2z + y = x + 1, or z 0 = 2z − y + x + 1 = g(x, y, z) (say). Therefore, y 0 = z and z 0 = (2z − y + x + 1). Here, x0 = 0, y0 = 1, z0 = 0, h = 0.1. k1 = h × z0 = 0.1 × 0 = 0 l1 = hg(x0 , y0 , z0 ) = 0.1 × [(2 × 0) − 1 + 0 + 1] = 0 k2 = h × (z0 + l1 /2) = 0.1 × 0 = 0 0.1 0 ) + 1] l2 = hg(x0 + h/2, y0 + k1 /2, z0 + l1 /2) = 0.1 × [(2 × 0) − (1 + ) + (0 + 2 2 = 0.1 × [−1 + 0.05 + 1] = 0.005 0.005 k3 = h × (z0 + l2 /2) = 0.1 × = 0.00025 2 l3 = hg(x0 + h/2, y0 + k2 /2, z0 + l2 /2) = 0.1 × [0.005 − 1 + 0.05 + 1] = 0.0055 k4 = h × (z0 + l3 ) = 0.1 × 0.0055 = 0.00055 l4 = hg(x0 + h, y0 + k3 , z0 + l3 ) = 0.1 × [{2 × (0 + 0.0055)} − (1 + 0.00025) + (0 + 0.1) + 1] = 0.011075. Therefore, y(0.1) = y1 1 = y0 + [k1 + 2(k2 + k3 ) + k4 ] 6 1 = 1 + [0 + 2(0 + 0.00025) + 0.00055] = 1.000175 6 1 0 z(0) = y (0.1) = z(0) + [l1 + 2(l2 + l3 ) + l4 ] 6 1 = 0 + [0 + 2(0.005 + 0.0055) + 0.011075] = 0.005346. 6 The required value of y(0.1) is 1.000175. In addition, y 0 (0.1) is 0.005346. 12
1.1. Euler’s method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.7 Runge-Kutta-Fehlberg method Many variant of Runge-Kutta methods are available in literature. The Runge-KuttaFehlberg method gives better result by solving the differential equation twice using step sizes h and h/2. In this method, at each step two different approximations for the solution are calculated. If the two approximations are closed then the solution is obtained. If the approximation does not meet the required accuracy, then the step size may be reduced. In each step the following six values are required. k1 = hf (xi , yi ) k1 h k2 = hf xi + , yi + 4 4 3h 3 9 k3 = hf xi + , yi + k1 + k2 8 32 32 12 1932 7200 7296 k4 = hf xi + h, yi + k1 − k2 + k3 13 2197 2197 2197 439 3680 845 k5 = hf xi + h, yi + k1 − 8k2 + k3 − k4 216 513 4104 h 8 3544 1859 11 k6 = hf xi + , yi − k1 + 2k2 − k3 + k4 − k5 . 2 27 2565 4104 40
(1.33)
Then the approximate value of yi+1 using fourth-order Runge-Kutta method is yi+1 = yi +
25 1408 2197 1 k1 + k3 + k4 − k5 . 216 2565 4104 5
(1.34)
In this formula, the value of k2 is not used. Another value of y is determined by the following fifth order Runge-Kutta method: ∗ yi+1 = yi +
16 6656 28561 9 2 k1 + k3 + k4 − k5 + k6 . 135 12825 56430 50 55
(1.35)
∗ | is small enough, then the method If these two values are closed, i.e. if |yi+1 − yi+1
gives the solution of the given problem, otherwise, the computation is repeated by reducing the step size h.
13
.
Chapter 8 Numerical Solution of Ordinary Differential Equations
Module No. 2 Predictor-Corrector Methods
...................................................................................... A method is called single step, if only one value of the dependent variable, say yi , is required to compute the next value, i.e. yi+1 . Taylor’s series, Picard, Euler’s, RungeKutta are the examples of the single-step method. On the other hand, a method is called multistep if it needs more than one values of the dependent variable to compute the next value. In k-step method, the values of yi−k−1 , yi−k−2 , . . . , yi−1 and yi are needed to compute yi+1 . To solve an ordinary differential equation, a special type of method is used known as predictor-corrector method. This method has two formulae. The first formula determines an approximate value of y and it is known as predictor formula. The second formula, known as corrector formula, improves the value of y obtained by predictor formula. The commonly used predictor-corrector methods are Adams-Bashforth-Moulton, Milne-Simpson, etc. These two methods are discussed in this module.
2.1 Adams-Bashforth-Moulton methods Let us consider a differential equation dy = f (x, y) with initial conditions x = x0 , y(x0 ) = y0 . dx
(2.1)
This problem is to be solved by Adams-Bashforth-Moulton method. It is a fourthstep predictor-corrector method. So, to compute the value of yi+1 , four values yi−3 , yi−2 , yi−1 and yi are required. These values are called starting values. To find these values, generally, any single step method such as Euler, Runge-Kutta, etc., are used. Now, we integrate the differential equation (2.1) between xi and xi+1 and obtain the following equation Z
xi+1
yi+1 = yi +
f (x, y) dx.
(2.2)
xi
The second term of the above equation can not be determined as y is an unknown dependent variable. To find this integration, the function f (x, y) is approximated by Newton’s backward interpolation formula, i.e. by the following expression f = fi + v∇fi +
v(v + 1) 2 v(v + 1)(v + 2) 3 ∇ fi + ∇ fi 2! 3! 1
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Predictor-Corrector Methods where v =
x − xi and fi = f (xi , yi ). After simplification, it becomes h f = fi + v∇fi +
v2 + v 2 v 3 + 3v 2 + 2v 3 ∇ fi + ∇ fi . 2 6
Therefore, yi+1
1
v2 + v 2 v 3 + 3v 2 + 2v 3 = yi + h fi + v∇fi + ∇ fi + ∇ fi dv 2 6 0 1 5 3 = yi + hfi + h∇fi + h∇2 fi + ∇3 fi 2 12 8 h = yi + (−9fi−3 + 37fi−2 − 59fi−1 + 55fi ). 24 Z
(2.3)
This formula is known as Adams-Bashforth predictor formula and it is denoted by p yi+1 .
Thus,
p yi+1 = yi +
h [−9f (xi−3 , yi−3 ) + 37f (xi−2 , yi−2 ) − 59f (xi−1 , yi−1 ) + 55f (xi , yi )].(2.4) 24
The same technique is used to find corrector formula. To find the corrector formula, the function f (x, y) is approximated by the following Newton’s backward interpolation polynomial. f (x, y) = fi+1 +v∇fi+1 +
v(v+1) 2 v(v+1)(v+2) 3 x − xi+1 ∇ fi+1 + ∇ fi+1 , where v = . 2! 3! h
Using this approximation, the equation (2.2) becomes Z 0 v2 + v 2 v 3 + 3v 2 + 2v 3 yi+1 = yi + h fi+1 + v∇fi+1 + ∇ fi+1 + ∇ fi+1 dv 2 6 −1 [since, x = xn + vh, dx = hdv] 1 1 2 1 3 = yi + h fi+1 − ∇fi+1 − ∇ fi+1 − ∇ fi+1 2 12 24 h = yi + [fi−2 − 5fi−1 + 19fi + 9fi+1 ]. (2.5) 24 This is known as Adams-Moulton corrector formula. The corrector value is denoted c . Thus by yi+1 c = yi + yi+1
2
h p [f (xi−2 , yi−2 ) − 5f (xi−1 , yi−1 ) + 19f (xi , yi ) + 9f (xi+1 , yi+1 )]. (2.6) 24
......................................................................................
p The predicted value yi+1 is computed from the equation (2.4). The formula (2.6) can
be used repeatedly to get the value of yi+1 to the desired accuracy. Note that the predictor formula is an explicit formula, whereas corrector formula is an implicit formula. Note 2.1 The modified Euler’s method is a single step predictor-corrector method.
Example 2.1 Find the value of y(0.20) and y(0.25) from the differential equation y 0 = xy + y sin x − 1 with y(0) = 1 taking step size h = 0.05 using Adams-BashforthMoulton predictor-corrector method. Solution. The Runge-Kutta method is used to find the starting values at x = 0.05, 0.10, 0.15. Here f (x, y) = 2y − y 2 , h = 0.05. The values are shown below: i
xi
yi
k1
k2
k3
k4
yi+1
0
0.00
1.000000
-0.050000
-0.047563
-0.047560
-0.045239
0.952419
1
0.05
0.952419
-0.045239
-0.043030
-0.043021
-0.040914
0.909377
2
0.10
0.909377
-0.040914
-0.038903
-0.038890
-0.036967
0.870466
Thus the starting values are y0 = y(0) = 1, y1 = y(0.05) = 0.952419, y2 = y(0.10) = 0.909377, y3 = y(0.15) = 0.870466. Now, y p (0.20)
= y4p = y3 + = 0.870466
h 24 [−9f (x0 , y0 ) + 37f (x1 , y1 ) − 59f (x2 , y2 ) + 55f (x3 , y3 )] + 0.05 24 [−9 × 1 + 37 × (−0.904778) − 59 × (−0.818276)
+55 × (−0.739349)] = 0.83533. p h 24 [f (x1 , y1 ) − 5f (x2 , y2 ) + 19f (x3 , y3 ) + 9f (x4 , y4 )] + 0.05 24 [(−0.904778) − 5 × (−0.818276) + 19 × (−0.739349)]
y c (0.20) = y4c = y3 + = 0.870466
+9 × (−0.666978)] = 0.83533. Thus, y4 = y(0.20) = 0.83533. Note that the predicted and corrected values are equal up to five decimal places. 3
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Predictor-Corrector Methods y p (0.25) = y5p = y4 + = 0.83533 +
h 24 [−9f (x1 , y1 ) + 37f (x2 , y2 )−59f (x3 , y3 )+55f (x4 , y4 )] 0.05 24 [−9 × (−0.904778) + 37 × (−0.818276) − 59 × (−0.739349)
+55 × (−0.666978)] = 0.805461. y c (0.25)
= y5c = y4 + = 0.83533 +
p h 24 [f (x2 , y2 ) − 5f (x3 , y3 )+19f (x4 , y4 )+9f (x5 , y5 )] 0.05 24 [(−0.818276) − 5 × (−0.739349)+19 × (−0.666978)
+9 × (−0.599361)] = 0.80369. y c (0.25) = y5c = y4 + = 0.83533 +
h c 24 [f (x2 , y2 ) − 5f (x3 , y3 )+19f (x4 , y4 )+9f (x5 , y5 )] 0.05 24 [(−0.818276) − 5 × (−0.739349)+19 × (−0.666978)
+9 × (−0.600241)] = 0.80367. Hence, y(0.25) = 0.8037 correct upto four decimal places. Error in Adams-Bashforth-Moulton method The local truncation error for predictor formula is Z 1 v(v + 1)(v + 2)(v + 3) 4 251 v h ∇ fi dv ' y (ξ1 )h5 4! 720 0 and for the corrector formula it is Z 0 v(v + 1)(v + 2)(v + 3) 4 19 v h ∇ fi+1 dv ' − y (ξ2 )h5 . 4! 720 −1 When h is small then the local truncation error for Adams-Bashforth-Moulton method is −
19 c p (y − yi+1 ). 270 i+1
(2.7)
Another useful predictor-corrector method is due to Milne-Simpson, describe below.
2.2 Milne-Simpson method This method is also known as Milne’s method. The Minle’s method is used to solve the following problem: dy = f (x, y) with initial condition y(x0 ) = y0 . dx 4
(2.8)
...................................................................................... Now, we integrate the above differential equation between xi−3 and xi+1 and find Z xi+1 yi+1 = yi−3 + f (x, y) dx. (2.9) xi−3
To find the value of the above integral, the function f (x, y) is approximated by Newton’s forward difference interpolation polynomial as f (x, y) = fi−3 + u∆fi−3 +
u(u − 1)(u − 2) 3 u(u − 1) 2 ∆ fi−3 + ∆ fi−3 , 2! 3!
(2.10)
x − xi−3 . h Using this approximation, the equation (2.9) reduces to Z 4 u3 − 3u2 + 2u 3 u2 − u 2 ∆ fi−3 + ∆ fi−3 du yi+1 = yi−3 + h fi−3 + u∆fi−3 + 2 6 0 i h 8 20 = yi−3 + h 4fi−3 + 8∆fi−3 + ∆2 fi−3 + ∆3 fi−3 3 3 4h [2fi−2 − fi−1 + 2fi ]. = yi−3 + 3
where u =
Thus the predictor formula due to Milne is p yi+1 = yi−3 +
4h [2f (xi−2 , yi−2 ) − f (xi−1 , yi−1 ) + 2f (xi , yi )]. 3
(2.11)
The corrector formula is derived in the same way. Again, we integrate the given differential equation between xi−1 and xi+1 and the function f (x, y) is approximated by the Newton’s forward interpolation polynomial f (x, y) = fi−1 + u∆fi−1 +
u(u − 1) 2 ∆ fi−1 . 2
(2.12)
Then, yi+1
xi+1
u(u − 1) 2 = yi−1 + ∆ fi−1 dx fi−1 + u∆fi−1 + 2 xi−1 Z 2h i u2 − u 2 = yi−1 + h fi−1 + u∆fi−1 + ∆ fi−1 du 2 0 h i 1 2 = yi−1 + h 2fi−1 + 2∆fi−1 + ∆ fi−1 3 h = yi−1 + [f (xi−1 , yi−1 ) + 4f (xi , yi ) + f (xi+1 , yi+1 )]. 3 Z
5
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Predictor-Corrector Methods c . Therefore, This formula is known as corrector formula and it is denoted by yi+1 c yi+1 = yi−1 +
h p [f (xi−1 , yi−1 ) + 4f (xi , yi ) + f (xi+1 , yi+1 )]. 3
(2.13)
p The predicted value yi+1 is computed from the equation (2.11) and this value is p c | is not sufficiently small, corrected by the formula (2.13). If the value of |yi+1 − yi+1 c . In this case, y p then use the corrector formula again to update the value of yi+1 i+1 is c to be replaced by yi+1 in the corrector formula.
Example 2.2 Find the value of y(0.20) for the initial value problem dy = xy + 2y with y(0) = 1 dx using Milne’s predictor-corrector method, taking h = 0.05. Solution. Let f (x, y) = xy + 2y, x0 = 0, y0 = 1, h = 0.05. Fourth-order Runge-Kutta method is used to compute the starting values y1 , y2 and y3 . i
xi
yi
k1
k2
k3
k4
yi+1
0
0.00
1.000000
0.100000
0.106312
0.106632
0.113430
1.106553
1
0.05
1.106553
0.113422
0.120689
0.121066
0.128900
1.227525
2
0.10
1.227525
0.128890
0.137272
0.137717
0.146764
1.365130
Therefore, y1 = y(0.05) = 1.106553, y2 = y(0.10) = 1.227525, y3 = y(0.15) = 1.365130. The predictor value 4h [2f (x1 , y1 ) − f (x2 , y2 ) + 2f (x3 , y3 )] 3 4 × 0.05 [2f (0.05, 1.106553) − f (0.10, 1.227525) + 2f (0.15, 1.365130)] =1+ 3 4 × 0.05 =1+ [2 × 2.268434 − 2.577802 + 2 × 2.935030] 3 = 1.521942.
y4p = y0 +
6
...................................................................................... The corrector value is h [f (x2 , y2 ) + 4f (x3 , y3 ) + f (x4 , y4p )] 3 0.05 = 1.227525 + [2.577802 + 4 × 2.935030 + 3.348272] 3 = 1.52196.
y4c = y2 +
Again, the corrector value y4c is calculated by using the formula h [f (x2 , y2 ) + 4f (x3 , y3 ) + f (x4 , y4c )] 3 0.05 = 1.227525 + [2.577802 + 4 × 2.935030 + 3.348312] 3 = 1.52196.
y4c = y2 +
Since these two values are same, the required solution is y4 = y(0.20) = 1.52196 correct up to five decimal places. Error in Milne-Simpson method The local truncation error for predictor and corrector formulae are 1 28 v y (ci+1 )h5 = O(h5 ) and − y v (di+1 )h5 = O(h5 ) 90 90 respectively. Note 2.2 The predictor-corrector methods are widely used to solve the differential equation of the form (2.8). These methods give more accurate result than the methods discussed earlier. In these methods, the value of y can be corrected by using corrector formula repeatedly. But, these methods need starting values y1 , y2 , y3 to obtain y4 . These values may be obtained from any single-step method, such as Taylor’s series, Euler’s, Runge-Kutta or any similar method. So, these methods need lot of function calculations.
7
.
Chapter 8 Numerical Solution of Ordinary Differential Equations
Module No. 3 Finite Difference Method and its Stability
...................................................................................... The finite difference method is a simple and widely used method to solve second order initial value problem (IVP) and boundary value problem (BVP). In this method, the first and second order derivatives y 0 and y 00 are replaced by finite differences (either by forward or central). By this substitution the given BVP is converted to a system of linear algebraic equations. Then the solution of this system of algebraic equations is the solution of the given BVP. The central difference approximation of the first and second order derivatives are given by y 0 (xi ) =
yi+1 − yi−1 + O(h2 ) 2h (3.1)
yi+1 − 2yi + yi−1 + O(h2 ). and y 00 (xi ) = h2 The method to solve first order differential equation using finite difference method is the Euler’s method.
3.1 Second order initial value problem (IVP) Let us consider the following second order linear IVP y 00 + p(x)y 0 + q(x)y = r(x)
(3.2)
with the initial conditions x = x0 ,
y(x0 ) = y0 ,
y 0 (x0 ) = a,
(3.3)
where p(x), q(x) and r(x) are given functions. Now, we discretize the given differential equation by substituting x = xi , i = 0, 1, 2, . . .. Then yi00 + p(xi )yi0 + q(xi )yi = r(xi ).
(3.4)
The values of y 0 (xi ) and y 00 (xi ) are substituted from the equation (3.1) to the equation (3.2). Then the equation (3.4) reduces to yi+1 − 2yi + yi−1 yi+1 − yi−1 + p(xi ) + q(xi )yi = r(xi ). 2 h 2h
(3.5) 1
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .Finite Difference Method and its Stability After simplification it becomes [2 − hp(xi )]yi−1 + [2h2 q(xi ) − 4]yi + [2 + hp(xi )]yi+1 = 2h2 r(xi ).
(3.6)
For simplicity, we denote Ci = 2 − hp(xi ) Ai = 2h2 q(xi ) − 4 Bi = 2 + hp(xi ) and
(3.7)
Di = 2h2 r(xi ). Then equation (3.6) is turned to the following form Ci yi−1 + Ai yi + Bi yi+1 = Di
(3.8)
for i = 0, 1, 2, . . .. Again, using the equation (3.1), the initial condition y 0 (x0 ) = a reduces to a=
y1 − y−1 2h
or
y−1 = y1 − 2ha.
(3.9)
Again, from the equation (3.8), for i = 0 C0 y−1 + A0 y0 + B0 y1 = D0 .
(3.10)
Now, we eliminate y−1 from the equations (3.9) and (3.10). Therefore, the value of y1 is given by the following equation y1 =
D0 − A0 y0 + 2hC0 a . C0 + B0
(3.11)
The coefficients Ai , Bi , Ci , Di of the equation (3.8) are known quantities as p, q, r are known functions. Note that, all the terms of right hand side of the equation (3.11) are known quantities. Therefore, from this equation, we can find the value of y1 . Also, y0 is known. Hence, from the equation yi+1 =
Di − Ci yi−1 − Ai yi , Bi
xi+1 = xi + h
(3.12)
one can determine the value of yi+1 for all i = 1, 2, . . . . This is a recurrence relation of y’s and we can determine the value of y one by one. 2
......................................................................................
Example 3.1 Solve the following IVP y 00 = xy − 4y with y(0) = 3 and y 0 (0) = 0 using finite difference method for x = 0.01, 0.02, . . . , 0.10. yi+1 − 2yi + yi−1 in the given h2 differential equation and following system of equations is derived. Solution. The second order derivative y 00 is replaced by yi+1 − 2yi + yi−1 = xi yi − 4yi h2
or
yi+1 − (2 + h2 xi − 4h2 )yi + yi−1 = 0
and
y1 − y−1 2h Again from above equation y00 =
or
y−1 = y1 − 2hy00 .
y1 − (2 + h2 x0 − 4h2 )y0 + y−1 = 0, or y1 − (2 − 4h2 )3 + (y1 − 2hy00 ) = 0, that is, y1 =
[2 − (4 × (0.01)2 )] × 3 + 2 × 0.01 × 0 = 2.99440. 2
The values of yi , i = 2, 3, . . . are obtained from the relation yi+1 = (2 + h2 xi − 4h2 )yi − yi−1 . i
yi−1
xi
yi
yi+1
1
3.000000
0.01
2.999400
2.997603
2
2.999400
0.02
2.997603
2.994613
3
2.997603
0.03
2.994613
2.990434
4
2.994613
0.04
2.990434
2.985071
5
2.990434
0.05
2.985071
2.978529
6
2.985071
0.06
2.978529
2.970813
7
2.978529
0.07
2.970813
2.961929
8
2.970813
0.08
2.961929
2.951884
9
2.961929
0.09
2.951884
2.940685
10
2.951884
0.10
2.940685
2.928339
Thus, y(0.01) = 2.999400, y(0.02) = 2.997603, y(0.03) = 2.994613, y(0.04) = 2.990434, y(0.05) = 2.985071, y(0.06) = 2.978529, y(0.07) = 2.970813, y(0.08) = 2.961929, y(0.09) = 2.951884, y(0.10) = 2.940685, y(0.11) = 2.928339. 3
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .Finite Difference Method and its Stability Error in finite difference method for IVP The error occurs due to the approximation of first and second order derivatives to the central differences. Thus, the local truncation error is yi−1 − 2yi + yi+1 yi+1 − yi−1 00 0 E= − yi + pi − yi . h2 2h Expanding the terms yi−1 and yi+1 by Taylor’s series and after simplification, E becomes
h2 iv (y + 2pi yi000 ) + O(h4 ). 12 i Thus, the finite difference approximation has second-order accuracy for the functions E=
with continuous fourth derivatives. Now, we apply the finite difference method to solve BVP.
3.2 Second order boundary value problem (BVP) The BVP occurs in many applications of applied mathematics, physics and different branches of engineering. But, the solution of every BVP does not exist. The existence and uniqueness conditions of BVP are stated below. Theorem 3.1 Assume that f (x, y, y 0 ) is continuous on the region R = {(x, y, y 0 ) : a ≤ x ≤ b, −∞ < y < ∞, −∞ < y 0 < ∞} and that fy and fy0 are continuous on R. If there exists a constant M > 0 for which fy and fy0 satisfy fy > 0 for all (x, y, y 0 ) ∈ R and |fy0 (x, y, y 0 )| < M for all (x, y, y 0 ) ∈ R, then the BVP y 00 = f (x, y, y 0 ) with y(a) = γ1 and y(b) = γ2 has a unique solution y = y(x) for a ≤ x ≤ b. Let us consider the following linear second order differential equation y 00 + p(x)y 0 + q(x)y = r(x),
a<x
(3.13)
with boundary conditions y(a) = γ1 and y(b) = γ2 . It is assumed that the above BVP has a unique solution. Let the interval [a, b] be divided into n equal subintervals with spacing h. That is, xi = x0 + ih, i = 1, 2, . . . , n − 1 and x0 = a, xn = b. We discretize the given equation by substituting x = xi . Then yi00 + p(xi )yi0 + q(xi )yi = r(xi ). 4
(3.14)
...................................................................................... Now, we replace yi00 and yi0 by the following central difference scheme yi00 =
yi−1 − 2yi + yi+1 + O(h2 ), h2
yi0 =
yi+1 − yi−1 + O(h2 ). 2h
Using these approximations the equation (3.14) reduces to yi−1 − 2yi + yi+1 yi+1 − yi−1 + p(xi ) + q(xi )yi + O(h2 ) = r(xi ). h2 2h Drooping the term O(h2 ) as it is small for small h. Then after simplification, the above equation reduces to yi−1 [2 − hp(xi )] + yi [2h2 q(xi ) − 4] + yi+1 [2 + hp(xi )] = 2h2 r(xi ).
(3.15)
For simplicity, we use the following notations. Ci = 2 − hp(xi ) Ai = 2h2 q(xi ) − 4 Bi = 2 + hp(xi )
(3.16)
and Di = 2h2 r(xi ). With these notations, the equation (3.15) reduces to Ci yi−1 + Ai yi + Bi yi+1 = Di ,
(3.17)
where i = 1, 2, . . . , n − 1. This is a system of n − 1 linear algebraic equations containing n + 1 unknowns y0 , y1 , . . . , yn . Among them two values are obtained from the boundary conditions y0 = γ1 and yn = γ2 . For i = 1 and n − 1, the equations are C1 y0 + A1 y1 + B1 y2 = D1 , or A1 y1 + B1 y2 = D1 − C1 γ1 (as y0 = γ1 ) and Cn−1 yn−2 + An−1 yn−1 + Bn−1 yn = Dn−1 , or Cn−1 yn−2 + An−1 yn−1 = Dn−1 − Bn−1 γ2 (as yn = γ2 ). The system of equations (3.17) can be written in matrix notation as Ay = b
(3.18) 5
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .Finite Difference Method and its Stability where y = [y1 , y2 , . . . , yn−1 ]T b = 2h2 [r(x1 ) − {C1 γ1 }/(2h2 ), r(x2 ), . . . , r(xn−2 ), r(xn−1 ) − {Bn−1 γ2 }/(2h2 )]T A1 B 1 0 0 · · · 0 0 C2 A2 B2 0 · · · 0 0 and A = 0 C3 A3 B3 · · · 0 0 . ··· ··· ··· ··· ··· ··· ··· 0 0 0 0 · · · Cn−1 An−1 Here, y is an unknown column vector, b is a known column vector and A is a known tri-diagonal matrix. That is, (3.18) is a system of linear tri-diagonal equations with n − 1 variables y1 , y2 , . . . , yn−1 and n − 1 equations. This system of equations can be solved by any method. But, an efficient method is described in Module 4 of Chapter 5. The solution of this system, i.e. the values of y1 , y2 , . . . , yn−1 are the approximate solution of the given BVP at x1 , x2 , . . . , xn−1 . Example 3.2 Solve the following boundary value problem y 00 +y 0 +x = 0 with boundary conditions y(0) = 0, y(1) = 0. Solution. Here nh = 1. The difference scheme is yi−1 − 2yi + yi+1 yi+1 − yi−1 + + xi = 0. h2 2h That is, yi−1 (2 − h) − 4yi + yi+1 (2 + h) + 2h2 xi = 0, i = 1, 2, . . . , n − 1 together with boundary condition y0 = 0, yn = 0. Let n = 2. Then h = 1/2, x0 = 0, x1 = 1/2, x2 = 1, y0 = 0, y2 = 0. The difference scheme is y0 (2 − h) − 4y1 + y2 (2 + h) + 2h2 x1 = 0 or −4y1 + 2h2 x1 = 0 or −4y1 + 2( 14 )( 21 ) = 0 or y1 = 0.0625. 6
(3.19)
...................................................................................... That is, y(0.5) = 0.0625. Let n = 4. Then h = 0.25, x0 = 0, x1 = 0.25, x2 = 0.50, x3 = 0.75, x4 = 1.0, y0 = 0, y4 = 0. The system of equations (3.19) becomes y0 (2 − h) − 4y1 + y2 (2 + h) + 2h2 x1 = 0 y1 (2 − h) − 4y2 + y3 (2 + h) + 2h2 x2 = 0 y2 (2 − h) − 4y3 + y4 (2 + h) + 2h2 x3 = 0. This system is finally simplified to −4y1 + 2.250y2 + 0.03125 = 0 1.750y1 − 4y2 + 2.250y3 + 0.0625 = 0 1.750y2 − 4y3 + 0.09375 = 0. The solution of this system is y1 = y(0.25) = 0.043510, y2 = y(0.50) = 0.063462, y3 = y(0.75) = 0.051202. This is the solution of the given differential equation. Error in finite difference method for BVP The local truncation error of this method is similar to the the error in IVP. That is, this method is also of second-order accuracy for functions with continuous fourth order derivatives on [a, b]. It may be observed that when h → 0, then the local truncation error tends to zero. Therefore, the more accuracy in the solution can be achieved by taking small h. But, when h is small then the number of algebraic equations become large and hence need more computational effort.
3.3 Stability of finite difference method Let us consider the following simple second order differential equation y 00 + ky 0 = 0,
(3.20)
where k is a constant and very large in comparison to 1. The central difference approximation of this equation is 1 k (yi+1 − 2yi + yi−1 ) + (yi+1 − yi−1 ) = 0. 2 h 2h
(3.21) 7
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .Finite Difference Method and its Stability The solution of this difference equation is yi = c1 + c2
2 − kh 2 + kh
i ,
(3.22)
where c1 and c2 are arbitrary constants and their values are to be determined from the given boundary conditions. The analytical solution of the given equation is y(x) = A + Be−kx ,
(3.23)
where A and B are arbitrary constants. If k > 0 and x → ∞ then the solution becomes bounded. The term e−kx is monotonic for k > 0 and k < 0. Thus, it is expected that the finite difference solution is also monotonic for k > 0 and k < 0. It can be shown that the exponential term of (3.22) behaves monotonic for k > 0 and k < 0 if h < |2/k|. This is the condition for stability of the finite difference scheme.
8
.
Chapter 8 Numerical Solution of Ordinary Differential Equations
Module No. 4 Shooting Method and Stability Analysis
...................................................................................... Finding solution of initial value problems is easy. Many simple methods are available to solve such problems. But, boundary value problems are relatively complicated. In Module 3 of this chapter, finite difference method is used to solve a boundary value problem. In this method, a system of linear algebraic equations is generated. The solution of this system is the result of the BVP. But, we have seen that, if the step length h is small then the number of equations becomes large and hence to solve such system huge amount of time is required. This drawback is removed in shooting method. In shooting method, the given BVP is decomposed into two IVPs. Solving and combining the solutions of these IVPs, we get the solution of the given BVP.
4.1 Shooting method for boundary value problem Let the BVP be y 00 − p(x)y 0 − q(x)y = r(x) with the Dirichlet conditions y(a) = α and y(b) = β.
(4.1)
To solve this problem, we have to perform the following three steps. (i) Convert the given BVP to two equivalent IVPs, (ii) Solve these two IVPs by Runge-Kutta, predictor-corrector method, or any other method, (iii) Combine these two solutions to get the required solution of the given BVP. Let u(x) be a solution to the IVP u00 (x) − p(x)u0 (x) − q(x)u(x) = r(x) with u(a) = α and u0 (a) = 0.
(4.2)
Also, let v(x) be a solution to the IVP, v 00 (x) − p(x)v 0 (x) − q(x)v(x) = 0 with v(a) = 0 and v 0 (a) = 1.
(4.3)
Then the linear combinations of the solutions of the equations (4.2) and (4.3), i.e. y(x) = c1 u(x) + c2 v(x)
(4.4) 1
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shooting Method and Stability Analysis is a solution of the equation (4.1), where c1 , c2 are arbitrary constants. The boundary conditions y(a) = α and y(b) = β give, α = c1 u(a) + c2 v(a) and β = c1 u(b) + c2 v(b). From the first equation we get, c1 = 1 and from second equation, c2 =
β − u(b) . v(b)
Hence, the equation (4.4) reduces to y(x) = u(x) +
β − u(b) v(x). v(b)
(4.5)
This is a solution of the given BVP (4.1) and it also satisfies the boundary conditions y(a) = α, y(b) = β. Thus, a second order linear BVP is converted to two second order linear IVPs. In Module 1 of Chapter 8, it is seen that any second order linear IVP can be converted to a pair of first order linear initial value problems. Now, we substitute u0 (x) = w(x) to the first equation. Then it becomes u0 = w(x) with u(a) = α and w0 (x) = p(x)w(x) + q(x)u(x) + r(x) with w(a) = u0 (a) = 0.
(4.6)
In second equation, we substitute v 0 (x) = z(x), then it becomes a pair of the following equations v0
= z(x) with v(a) = 0
z 0 (x)
= p(x)z(x) + q(x)v(x) with z(a) = v 0 (a) = 1.
(4.7)
Finally, the desired solution y(x) is obtained from (4.5). This is a good method to solve BVP. If the expression p(x)y 0 (x) + q(x)y(x) + r(x) is linear then there is no problem. But, if the expression p(x)y 0 (x) + q(x)y(x) + r(x) is non-linear then many iterations are required to get the solution at x = b. When y(b) is a very sensitive function of y 0 (a), then it is very difficult to obtain convergent solution. In this situation, the integration may be done from the opposite directions by guessing a value of y 0 (b). And the iteration be repeated until y(a) is sufficiently close to y0 . The speed of convergence depends on the initial guess.
2
......................................................................................
Example 4.1 Use shooting method to find the solution of the boundary value problem y 00 − x2 y 0 + (x + 1)y = 1 with y(0) = 0, y(1) = 1. Solution. We use second-order Runge-Kutta method to solve the initial value problems with h = 0.20. Here p(x) = x2 , q(x) = −(x + 1), r(x) = 1, a = 0, b = 1, α = 0, β = 1. The two IVPs are u0 = w with u(0) = 0
(4.8)
w0 = x2 w − (x + 1)u + 1 and w(0) = u0 (0) = 0 and
v 0 = z with v(0) = 0
(4.9)
z 0 = x2 z − (x + 1)v and z(0) = v 0 (0) = 1. Solution of the system (4.8) is shown below. i
xi
ui
wi
k1
k2
l1
l2
ui+1
wi+1
0 1 2 3 4
0.00 0.20 0.40 0.60 0.80
0.00000 0.02000 0.07984 0.17830 0.31443
0.00000 0.20080 0.39714 0.58799 0.77685
0.00000 0.04016 0.07943 0.11760 0.15537
0.04000 0.07952 0.11750 0.15465 0.19262
0.20000 0.19681 0.19035 0.18528 0.18624
0.20160 0.19588 0.19133 0.19245 0.20470
0.02000 0.07984 0.17830 0.31443 0.48842
0.20080 0.39714 0.58799 0.77685 0.97232
Solution of the second system (4.9) is shown in the following table. i xi
vi
zi
k1
k2
l1
l2
vi+1
zi+1
0 0.00 0.00000 1.00000 0.20000 0.20000 0.00000 -0.04000 0.20000 0.98000 1 0.20 0.20000 0.98000 0.19600 0.18797 -0.04016 -0.08081 0.39198 0.91952 2 0.40 0.39198 0.91952 0.18390 0.16784 -0.08033 -0.12386 0.56785 0.81742 3 0.60 0.56785 0.81742 0.16348 0.13891 -0.12286 -0.17438 0.71905 0.66880 4 0.80 0.71905 0.66880 0.13376 0.09911 -0.17325 -0.24202 0.83549 0.46117 These are the solutions of the induced IVPs. Now, to find the solution of the given boundary value problem, we calculate the constant c as c=
β − u(b) 1 − u(1) 1 − 0.48842 = = = 0.61231. v(b) v(1) 0.83549
The values of y(x) are obtained from the equation y(x) = u(x) + cv(x) = u(x) + 0.61231 v(x) and they are listed below: 3
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shooting Method and Stability Analysis x
u(x)
v(x)
y(x)
0.2 0.02000 0.20000 0.14246 0.4 0.07984 0.39198 0.31985 0.6 0.17830 0.56785 0.52600 0.8 0.31443 0.71905 0.75471 1.0 0.48842 0.83549 1.00000
4.2 Stability analysis Analysis of stability is an important part of numerical solution of ODE. Most of the methods used to solve ODE are based on difference equation. To investigate the stability of the numerical methods, the model differential and difference equations are considered. Model differential equation Without loss of generality, we consider the following model initial value differential equation y 0 = λy,
y(0) = y0 ,
(4.10)
where λ is a real or a complex constant. The solution of this problem is y = eλt y0 .
(4.11)
Let us assume that λ = λR + iλI , where λR and λI represent respectively the real and imaginary parts of λ, with λR ≤ 0. Model difference equation Similar to the differential equation, let us consider the following model linear initial value difference problem yn+1 = σyn ,
n = 0, 1, 2, . . .
(4.12)
where y0 is given and σ is, in general, a complex number. The solution of this problem is yn = σ n y0 . 4
(4.13)
......................................................................................
Figure 4.1: Stability region of exact solution. It is obvious that the solution remains bounded only if |σ| ≤ 1. To connect the exact solution and the solution obtained from difference equation (generated by a numerical method) we determine the exact solution at tn = nh, for n = 0, 1, . . . where h > 0. Then, yn = eλtn y0 = eλnh y0 = σ n y0 , where σ = eλh .
(4.14)
The exact solution is bounded, if |σ| = |eλh | ≤ 1. This is possible if Re(λh) = λR h ≤ 0. That is, in the Argand plane, the region of stability of the exact solution is the left half-plane as shown in Figure 4.1. In the following section, we discuss the stability of Euler’s and Runge-Kutta methods. 4.2.1
Stability of Euler’s method
For the model differential equation (4.10), the value of yn+1 obtained from Euler’s method is yn+1 = yn + hf (xn , yn ) = yn + λhyn = (1 + λh)yn .
(4.15)
This is a difference equation and its solution is yn = (1 + λh)n y0 = σ n y0 ,
(4.16)
where σ = 1 + λh. 5
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shooting Method and Stability Analysis This solution is stable if |σ| ≤ 1. In general, λ is a complex number. Now, we consider different type of values of λ. (i) Case I: Real λ In this case, |1 + λh| < 1, or −2 < λh < 0, (ii) Case II: Purely imaginary λ Let λ = iw, w is real. Therefore, |1 + iwh| =
√
1 + w2 h2 > 1. That is, the method is not stable in case
λ is purely imaginary. (iii) Case III: Complex λ Let λ = λR + iλI . In this case, |σ| = |1 + λh| = |1 + λR h + iλI h| =
p
(1 + λR h)2 + (λI h)2 ≤ 1.
That is, λh lies inside the unit circle. From these three cases, we observed that a small portion of the left half-plane is the region of stability for the Euler’s method. That is, the inside of the circle (1 + λR h)2 + (λI h)2 = 1, shown in Figure 4.2, is the region of stability.
Figure 4.2: Stability region of Euler’s method. Thus, we conclude that, to get a stable numerical solution by Euler’s method, the step size h must be reduced such that λh falls within the circle. If λ is real and negative 6
...................................................................................... then the maximum step size for stability is h ≤ 2/|λ|. If λ is real and the numerical solution is unstable, then |1 + λh| > 1. This means, (1 + λh) is negative with magnitude greater than 1. Since yn = (1 + λh)n y0 , the numerical solutions exhibits oscillations with changes of sign at every step. 4.2.2
Stability of Runge-Kutta methods
Let us consider the second-order Runge-Kutta method. Then the values of k1 and k2 are k1 = hf (xn , yn ) = λhyn k2 = hf (xn + h, yn + k1 ) = λh(yn + k1 ) = λh(yn + λhyn ) = λh(1 + λh)yn . Now, the value of yn+1 is yn+1
where σ = 1 + λh +
1 (λh)2 = yn + (k1 + k2 ) = yn + λh + yn 2 2 λ2 h2 yn = σyn , = 1 + λh + 2
(4.17)
λ2 h2 2 .
For stability, |σ| ≤ 1. Now, the stability is discussed for different cases of λ. (i) Case I: Real λ Thus, 1 + λh +
λ2 h2 2
≤ 1, or −2 ≤ λh ≤ 0.
(ii) Case II: Purely imaginary λ q Let λ = iw. Then |σ| = 1 + 41 w4 h4 > 1. In this case, the method is unstable. (iii) Case III: Complex λ Let σ = 1 + λh +
λ2 h2 2
= eiθ . Now, find the complex roots, λh, of this polynomial
for different values of θ. It is easy to verify that |σ| = 1 for all values of θ. The inner region of Figure 4.3 is the stability region for second order Runge-Kutta method. 7
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shooting Method and Stability Analysis
Figure 4.3: Stability regions of Runge-Kutta methods. Inner and outer regions are the stability regions for second and fourth order Runge-Kutta methods respectively Let us consider the fourth order Runge-Kutta method. For the model differential equation k1 = λhyn k2 = λh(yn + k1 /2) = λh(1 + λh/2)yn 1 1 k3 = λh(yn + k2 /2) = λh 1 + λh + λ2 h2 yn 2 4 1 2 2 1 3 3 k4 = λh(yn + k3 ) = λh 1 + λh + λ h + λ h yn . 2 4 Then, 1 yn+1 = yn + (k1 + 2k2 + 2k3 + k4 ) 6 1 1 1 2 3 4 = 1 + λh + (λh) + (λh) + (λh) yn 2! 3! 4! = σyn , where σ = 1 + λh +
(4.18) (4.19)
1 1 1 (λh)2 + (λh)3 + (λh)4 . 2! 3! 4!
For stability, |σ| ≤ 1. For different λ, the stability of fourth-order Runge-Kutta method is discussed below: 8
...................................................................................... (i) Case I: Real λ In this, −2.785 ≤ λh ≤ 0. (ii) Case II: Purely imaginary λ √ For this case, 0 ≤ |λh| ≤ 2 2. (iii) Case III: Complex λ In this case, the stability region is obtained by finding the roots of the fourth-order polynomial with complex coefficients: 1 1 1 1 + λh + (λh)2 + (λh)3 + (λh)4 = eiθ . 2! 3! 4! The outer region of Figure 4.3 is the region of stability of fourth order Runge-Kutta method. It indicates that the region of stability of fourth order method is significantly larger than second-order method. That is, the step length h in fourth order Runge-Kutta method may be taken as larger than the second order method.
4.3 Discussion about the methods In this chapter, different numerical methods are described to solve ordinary differential equation. The advantages and disadvantages of these methods are now discussed here. The Euler’s method is a very simple single-step method, but it gives a very rough answer of the problem. The Runge-Kutta methods are most widely used method to solve a single or a system of IVP. These are single-step explicit methods. But, in these methods, lot of function calculations are required and hence take time. These methods can also be used to solve higher order IVPs. Also, these methods are used to calculate starting values of some predictor-corrector methods. A major drawback of Runge-Kutta methods is that there is no scope to check or estimate the error occurs in any step. If an error occurs at any step, then it will propagate through the subsequent steps without detection. The multi-step methods, in general, give more better solution than the single-step methods. But, these methods are not self starting. If the starting values are available, then few number of function evaluations is required. The predictor-corrector methods 9
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shooting Method and Stability Analysis are generally multi-step methods. These are, implicit methods and have two formulae. The first formula gives a predicted value of the dependent variable and it is corrected at the second formula. The first one is an explicit formula called predictor formula. The second formula is an implicit formula and it is known as corrector formula. Another useful method called finite difference method is used to solve IVPs and BVPs. This method converts a BVP to a system of tri-diagonal algebraic equations. This is a simple method, but when the step size is small then the method generates a large system of equations and hence takes lot of time. The shooting method is described which is used to solve second order BVP. In this method, the given BVP is converted to two pairs of IVPs. It is well known that solution of IVP is simple. So shooting method is a good method to solve BVPs. Nowadays, another good method is used to solve ODE, know as finite element method. This method is not discussed in this chapter, but the references are provided in Learn More section.
10
.
Chapter 9 Numerical Solution of Partial Differential Equations
Module No. 1 Partial Differential Equation: Parabolic
...................................................................................... The partial differential equation (PDE) is one of the most important and useful topics of mathematics, physics and different branches of engineering. But, finding of solution of PDEs is a very difficult task. Several analytical methods are available to solve PDEs, but, all these methods need in depth mathematical knowledge. On the other hand numerical methods are simple, but generate erroneous result. Most widely used numerical method is finite-difference method due to its simplicity. In this module, only finite-difference method is discussed to solve PDEs.
1.1 Classification of partial differential equations The general second order PDE is of the following form: ∂2u ∂2u ∂2u ∂u ∂u + B + C +D +E + Fu = G 2 2 ∂x ∂x∂y ∂y ∂x ∂y Auxx + Buxy + Cuyy + Dux + Euy + F u = G,
A i.e.,
(1.1) (1.2)
where A, B, C, D, E, F, G are functions of x and y. The second order PDEs are of three types, viz. elliptic, hyperbolic and parabolic. The type of a PDE can be determined by finding the discriminant ∆ = B 2 − 4AC. The PDE of equation (1.2) is called elliptic, parabolic and hyperbolic according as the value of ∆ at any point (x, y) is < 0, = 0 or > 0. Elliptic equation The most simple examples of this type of PDEs are Poisson’s equation ∂2u ∂2u + 2 = g(x, y) ∂x2 ∂y
(1.3)
and Laplace equation ∂2u ∂2u + 2 =0 ∂x2 ∂y
or
∇2 u = 0.
(1.4)
The analytic solution of an elliptic equation is a function of x and y which satisfies the PDE at every point of the region S which is bounded by a plane closed curve C and satisfies some conditions at every point on C. The condition that the dependent variable satisfies along the boundary curve C is known as boundary condition. 1
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Partial Differential Equation: Parabolic Parabolic equation The simplest example of parabolic equation is the heat conduction equation ∂u ∂2u = α 2. ∂t ∂x In parabolic PDE, time t is involved as an independent variable.
(1.5)
The solution u represents the temperature at a distance x units from one end of a thermally insulated bar after t seconds of heat conduction. In this problem, the temperature at the ends of a bar are known for all time, i.e. the boundary conditions are known. Hyperbolic equation The third type PDE is hyperbolic. Also, in this equation time t is taken as an independent variable. The simplest example of hyperbolic equation is the one-dimensional wave equation 2 ∂2u 2∂ u = c . ∂t2 ∂x2
(1.6)
Here, u is the transverse displacement of a vibrating string of length l at a distance x from one end after a time t. Hyperbolic equations generally originate from vibration problems. The values of u at the ends of the string are generally known for all time (i.e. boundary conditions are known) and the shape and velocity of the string are given at initial time (i.e. initial conditions are known).
1.2 Finite-difference approximation to partial derivatives Let x and y be two independent variables and u be the dependent variable of the given PDE. Now, we divide the xy-plane into set of equal rectangles of lengths ∆x = h and ∆y = k by drawing the equally spaced grid lines parallel to the coordinates axes. That is,
2
xi = ih,
i = 0, ±1, ±2, . . .
yj = jk,
j = 0, ±1, ±2, . . . .
...................................................................................... The intersection of horizontal and vertical lines is called mesh point and the ijth mesh point is denoted by P (xi , yj ) or P (ih, jk). The value of u at this mesh point is denoted by ui,j , i.e. ui,j = u(xi , yj ) = u(ih, jk). The first order partial derivatives at this mesh point are approximated as follows: ui+1,j − ui,j + O(h) h (forward difference approximation) ui,j − ui−1,j + O(h) = h (backward difference approximation) ui+1,j − ui−1,j = + O(h2 ) 2h (central difference approximation)
ux (ih, jk) =
Similarly,
ui,j+1 − ui,j + O(k) k (forward difference approximation) ui,j − ui,j−1 = + O(k) k (backward difference approximation) ui,j+1 − ui,j−1 = + O(k 2 ) 2k (central difference approximation)
uy (ih, jk) =
(1.7)
(1.8)
(1.9)
(1.10)
(1.11)
(1.12)
The second order partial derivatives are approximated as follows: ui−1,j − 2ui,j + ui+1,j + O(h2 ). h2 ui,j−1 − 2ui,j + ui,j+1 + O(k 2 ). uyy (ih, jk) = k2
uxx (ih, jk) =
(1.13) (1.14)
The above equations are used to approximate a PDE to a system of difference equations.
1.3 Parabolic equations Here, two methods are described to solve a parabolic PDE. Let us consider the following parabolic equation known as heat conduction equation ∂u ∂2u =α 2 ∂t ∂x
(1.15) 3
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Partial Differential Equation: Parabolic with the initial condition u(x, 0) = f (x) and the boundary conditions u(0, t) = φ(t), u(1, t) = ψ(t). 1.3.1
An explicit method
We approximate the given PDE by using the finite-difference approximation for ut and uxx defined in (1.10) and (1.13). Then the equation (1.15) reduces to ui,j+1 − ui,j ui−1,j − 2ui,j + ui+1,j 'α , k h2
(1.16)
where xi = ih and tj = jk; i, j = 0, 1, 2, . . .. After simplification the above equation becomes ui,j+1 = rui−1,j + (1 − 2r)ui,j + rui+1,j ,
(1.17)
where r = αk/h2 . From this formula one can compute the value of ui,j+1 at the mesh point (i, j + 1) when the values of ui−1,j , ui,j and ui+1,j are known. So this method is called the explicit method. By stability analysis, it can be shown that the formula is stable, if 0 < r ≤ 1/2. The grids and mesh points are shown in Figure 1.1. For this problem, the initial and boundary values are given. These values are shown in the figure by filled circles. That is, the values of u are known along x-axis and two vertical lines (t = 0 and t = 1). Also, it is mentioned in the figure that if the values at the meshes (i − 1, j), (i, j) and (i + 1, j) are known (shown by filled circles) then one can determine the value of u for the mesh (i, j + 1) (shown by circle). Example 1.1 Solve the following heat equation ∂u ∂2u = ∂t ∂x2 subject to the boundary conditions u(0, t) = 0, u(1, t) = 3t and initial condition u(x, 0) = 1.5x. Solution. Let h = 0.2 and k = 0.01, so r = k/h2 = 0.25 < 1/2. The initial and boundary values are shown in the following table. 4
...................................................................................... t 6 u
u
u
u
u
e
u
ui,j+1
u
u
u
u known values
u
ui−1,j ui,j
⊕ unknown value
u
ui+1,j
u
u
6 k hu? u
u
u
u
u
u-
x
t = 0, u = f (x)
Figure 1.1: Known and unknown meshes of heat equation j = 5, t = 0.05
0.0
0.15
j = 4, t = 0.04
0.0
0.12
j = 3, t = 0.03
0.0
0.09
j = 2, t = 0.02
0.0
0.06
j = 1, t = 0.01
0.0
0.03
j = 0, t = 0.00
0.0
0.3
0.6
0.9
1.2
0.00
x=0
x = 0.2
x = 0.4
x = 0.6
x = 0.8
x = 1.0
i=0
i=1
i=2 i=3 i=4 ui,j+1 − ui,j ∂u Using the finite difference approximations = and ∂t k 2 ui−1,j − 2ui,j + ui+1,j ∂ u = , the given PDE reduces to 2 ∂x h2
i=5
1 ui,j+1 = (ui−1,j + 2ui,j + ui+1,j ). 4 From this equation, we get 1 u1,1 = (u0,0 + 2u1,0 + u2,0 ) = 0.3 4 1 u2,1 = (u1,0 + 2u2,0 + u3,0 ) = 0.6 4 and so on. 5
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Partial Differential Equation: Parabolic The values of u for all meshes are shown in the following table.
1.3.2
j = 5, t = 0.05
0.0
0.28104
0.50125
0.56968
0.42396
0.15
j = 4, t = 0.04
0.0
0.29264
0.53888
0.63460
0.47062
0.12
j = 3, t = 0.03
0.0
0.30000
0.57656
0.71438
0.53906
0.09
j = 2, t = 0.02
0.0
0.30000
0.60000
0.80625
0.64500
0.06
j = 1, t = 0.01
0.0
0.30000
0.60000
0.90000
0.82500
0.03
j = 0, t = 0.00
0.0
0.30000
0.60000
0.90000
1.20000
0.00
x=0
x = 0.2
x = 0.4
x = 0.6
x = 0.8
x = 1.0
i=0
i=1
i=2
i=3
i=4
i=5
Crank-Nicolson implicit method
The above explicit method is very simple and it has limitation. This method is stable if 0 < r ≤ 1/2, i.e. 0 < αk/h2 ≤ 1/2, or αk ≤ h2 /2. That is, the value of k must be chosen very small, and it takes time to get the result at a particular mesh. In 1947, Crank and Nicolson have developed an implicit method which reduces the total computation time. Also, the method is applicable for all finite values of r. In this method, the given PDE is approximated by replacing both space and time derivatives by their central difference approximations at the midpoint of the points (ih, jk) and (ih, (j + 1)k), i.e. at the point (ih, (j + 1/2)k). To use approximation at this point, we write the equation (1.15) in the following form 2 2 ∂ u α ∂2u ∂ u ∂u =α = + . ∂t i,j+1/2 ∂x2 i,j+1/2 2 ∂x2 i,j ∂x2 i,j+1
(1.18)
Then using central difference approximation the above equation reduces to ui,j+1 − ui,j ui+1,j+1 − 2ui,j+1 + ui−1,j+1 α ui+1,j − 2ui,j + ui−1,j = + . k 2 h2 h2 After simplification, we get the following equation −rui−1,j+1 + (2 + 2r)ui,j+1 − rui+1,j+1 = rui−1,j + (2 − 2r)ui,j + rui+1,j , (1.19) where r = αk/h2 and j = 0, 1, 2, . . ., i = 1, 2, . . . , (N − 1), N is the number of subdivisions of x. 6
...................................................................................... In general, the left hand side of the above equation contains three unknowns values, but the right hand side has three known values of u. The known (circle) and unknown (filled circle) meshes are shown in Figure 1.2. For j = 0 and i = 1, 2, . . . , N −1, equation (1.19) generates N simultaneous equations for N − 1 unknown u1,1 , u2,1 , . . . , uN −1,1 (of first row) in terms of known initial values u1,0 , u2,0 , . . . , uN −1,0 and boundary values u0,0 and uN,0 . Thus, for this problem initial and boundary conditions are required. Similarly, for j = 1 and i = 1, 2, . . . , N − 1 we obtain another system of equations containing the unknowns u1,2 , u2,2 , . . . , uN −1,2 in terms of known values obtained in previous step. Thus for each value of j (j = 2, 3, . . .), there is a system of equations.
t
6 u
u
u
e
e
e
u
u
u e
u
u
u
j+1
j
u u
i=0
⊕ unknown values u known values
u u
u
i−1
u
i
u
i+1
u
u-
x
i=N
Figure 1.2: Meshes of Crank-Nicolson implicit method
In this method, the value of ui,j+1 is not expressed directly in terms of known values of u’s obtained in earlier step, it is written as unknown values, and hence the method is implicit. The system of equations (1.19) for a fixed j, can be expressed as the following matrix notation. 7
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Partial Differential Equation: Parabolic
2+2r −r
u1,j+1
u2,j+1 −r 2+2r −r · · · · · · · · · · · · · · · · · · · · · u3,j+1 .. −r 2+2r −r . −r 2+2r uN −1,j+1
d1,j
d2,j d = 3,j .. . dN −1,j
(1.20)
where d1,j = ru0,j + (2 − 2r)u1,j + ru2,j + ru0,j+1 di,j = rui−1,j + (2 − 2r)ui,j + rui+1,j ;
i = 2, 3, . . . , N − 2
dN −1,j = ruN −2,j + (2 − 2r)uN −1,j + ruN,j + ruN,j+1 . Note that the right hand side of equation (1.20) is known. This is a tri-diagonal system of linear equations and it can be solved by any method discussed in Chapter 5 or the special method for tri-diagonal equations.
Example 1.2 Solve the following problem by the Crank-Nicolson method, taking h = 1/4 and k = 1/8, 1/16:

$$\frac{\partial u}{\partial t}=\frac{\partial^2 u}{\partial x^2},\quad 0<x<1,\ t>0,$$

where u(0, t) = u(1, t) = 0 for t > 0, and u(x, 0) = 3x at t = 0.

Solution. Case I. Let h = 1/4 and k = 1/8. Therefore, r = k/h² = 2. With this value of r, the Crank-Nicolson scheme (1.19), after division by 2, is

$$-u_{i-1,j+1}+3u_{i,j+1}-u_{i+1,j+1}=u_{i-1,j}-u_{i,j}+u_{i+1,j}.$$

The initial and boundary conditions are shown in Figure 1.3. The initial values are u_{0,0} = 0, u_{1,0} = 0.75, u_{2,0} = 1.5, u_{3,0} = 2.25, u_{4,0} = 3 and the boundary values are u_{0,1} = 0, u_{4,1} = 0.
Figure 1.3: Boundary and initial values when h = 1/4, k = 1/8 (rows j = 0, t = 0 and j = 1, t = 1/8; columns i = 0, . . . , 4 at x = 0, 1/4, 1/2, 3/4, 1).

The unknown meshes are A(i = 1, j = 1), B(i = 2, j = 1) and C(i = 3, j = 1). Hence the system of equations is

$$\begin{aligned}
-u_{0,1}+3u_{1,1}-u_{2,1}&=u_{0,0}-u_{1,0}+u_{2,0}\\
-u_{1,1}+3u_{2,1}-u_{3,1}&=u_{1,0}-u_{2,0}+u_{3,0}\\
-u_{2,1}+3u_{3,1}-u_{4,1}&=u_{2,0}-u_{3,0}+u_{4,0}.
\end{aligned}$$

Using the initial and boundary values, the above system of equations becomes

$$\begin{aligned}
0+3u_{1,1}-u_{2,1}&=0-0.75+1.5=0.75\\
-u_{1,1}+3u_{2,1}-u_{3,1}&=0.75-1.5+2.25=1.5\\
-u_{2,1}+3u_{3,1}+0&=1.5-2.25+3.0=2.25.
\end{aligned}$$

This is a system of three linear equations in three unknowns. The solution of this system is u_{1,1} = u(0.25, 0.125) = 0.60714,
u_{2,1} = u(0.50, 0.125) = 1.07143, u_{3,1} = u(0.75, 0.125) = 1.10714.

Case II. Let h = 1/4 and k = 1/16. Therefore, r = 1. To find the value of u at t = 1/8, we have to solve two systems of tri-diagonal equations in two steps. The Crank-Nicolson scheme is

$$-u_{i-1,j+1}+4u_{i,j+1}-u_{i+1,j+1}=u_{i-1,j}+u_{i+1,j}.$$

For this case, the initial and boundary conditions are shown in Figure 1.4.
Figure 1.4: Boundary and initial values when h = 1/4, k = 1/16 (rows j = 0, t = 0; j = 1, t = 1/16; j = 2, t = 1/8).

That is, the initial values are u_{0,0} = 0, u_{1,0} = 0.75, u_{2,0} = 1.5, u_{3,0} = 2.25, u_{4,0} = 3 and the boundary values are u_{0,1} = 0, u_{4,1} = 0; u_{0,2} = 0, u_{4,2} = 0. Here r = 1, so the middle term on the right hand side of the Crank-Nicolson equation vanishes. Thus the Crank-Nicolson equations for the first step, i.e. for the mesh points A(i = 1, j = 1), B(i = 2, j = 1) and C(i = 3, j = 1), are respectively

$$\begin{aligned}
-u_{0,1}+4u_{1,1}-u_{2,1}&=u_{0,0}+u_{2,0}\\
-u_{1,1}+4u_{2,1}-u_{3,1}&=u_{1,0}+u_{3,0}\\
-u_{2,1}+4u_{3,1}-u_{4,1}&=u_{2,0}+u_{4,0}.
\end{aligned}$$

That is,

$$\begin{aligned}
4u_{1,1}-u_{2,1}&=0+1.5=1.5\\
-u_{1,1}+4u_{2,1}-u_{3,1}&=0.75+2.25=3.0\\
-u_{2,1}+4u_{3,1}&=1.5+3=4.5.
\end{aligned}$$

The solution of this system of equations is u_{1,1} = u(0.25, 0.0625) = 0.69643, u_{2,1} = u(0.50, 0.0625) = 1.28571, u_{3,1} = u(0.75, 0.0625) = 1.44643.

Again, the Crank-Nicolson equations for the mesh points D(i = 1, j = 2), E(i = 2, j = 2) and F(i = 3, j = 2) are respectively

$$\begin{aligned}
-u_{0,2}+4u_{1,2}-u_{2,2}&=u_{0,1}+u_{2,1}\\
-u_{1,2}+4u_{2,2}-u_{3,2}&=u_{1,1}+u_{3,1}\\
-u_{2,2}+4u_{3,2}-u_{4,2}&=u_{2,1}+u_{4,1}.
\end{aligned}$$
Using the boundary conditions and the values obtained in the first step, the above system becomes

$$\begin{aligned}
4u_{1,2}-u_{2,2}&=0+1.28571=1.28571\\
-u_{1,2}+4u_{2,2}-u_{3,2}&=0.69643+1.44643=2.14286\\
-u_{2,2}+4u_{3,2}&=1.28571+0=1.28571.
\end{aligned}$$
The solution of this system is u_{1,2} = u(0.25, 0.125) = 0.52041, u_{2,2} = u(0.50, 0.125) = 0.79592, u_{3,2} = u(0.75, 0.125) = 0.52041.
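The two 3 × 3 systems of Case II are small enough to check directly. The following sketch is an added illustration (assuming NumPy) that verifies the values obtained above:

```python
import numpy as np

# Case II of Example 1.2: r = 1, two Crank-Nicolson steps of size k = 1/16.
A = np.array([[ 4.0, -1.0,  0.0],
              [-1.0,  4.0, -1.0],
              [ 0.0, -1.0,  4.0]])

# Step 1: right hand sides u_{i-1,0} + u_{i+1,0}, built from u(x, 0) = 3x.
b1 = np.array([0.0 + 1.5, 0.75 + 2.25, 1.5 + 3.0])
u1 = np.linalg.solve(A, b1)          # -> [0.69643, 1.28571, 1.44643]

# Step 2: right hand sides u_{i-1,1} + u_{i+1,1}, with zero boundary values.
b2 = np.array([0.0 + u1[1], u1[0] + u1[2], u1[1] + 0.0])
u2 = np.linalg.solve(A, b2)          # -> [0.52041, 0.79592, 0.52041]
print(np.round(u1, 5), np.round(u2, 5))
```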
Chapter 9 Numerical Solution of Partial Differential Equations
Module No. 2 Partial Differential Equations: Hyperbolic
In this module, two other types of partial differential equations are considered: hyperbolic and elliptic PDEs. The finite difference method is again used to solve these problems. First, we consider hyperbolic equations.
2.1 Hyperbolic equations

The simplest problem of this class is the one-dimensional wave equation. This problem occurs in many real-life situations, for example, the transverse vibration of a stretched string and the propagation of light, sound and water waves. It arises in different fields such as acoustics, electromagnetics, fluid dynamics, etc. The simplest form of the wave equation is given below. Let u = u(r, t), r ∈ Rⁿ, be a scalar function which satisfies

$$\frac{\partial^2 u}{\partial t^2}=c^2\nabla^2 u, \tag{2.1}$$

where ∇² is the Laplacian in Rⁿ and c is the constant speed of wave propagation. This equation can also be written as □u = 0, where

$$\Box=\nabla^2-\frac{1}{c^2}\frac{\partial^2}{\partial t^2}. \tag{2.2}$$
The operator □ is called the d'Alembertian. In the one-dimensional case, the above equation is

$$\frac{\partial^2 u}{\partial t^2}=c^2\frac{\partial^2 u}{\partial x^2},\quad t>0,\ 0<x<1. \tag{2.3}$$

The initial conditions are

$$u(x,0)=f(x)\quad\text{and}\quad\left.\frac{\partial u}{\partial t}\right|_{(x,0)}=g(x),\quad 0<x<1 \tag{2.4}$$

and the boundary conditions are

$$u(0,t)=\varphi(t)\quad\text{and}\quad u(1,t)=\psi(t),\quad t\ge 0. \tag{2.5}$$
In the finite difference method, the partial derivatives u_xx and u_tt are approximated at the mesh points (x_i, t_j) = (ih, jk) by the central difference schemes

$$u_{xx}=\frac{1}{h^2}(u_{i-1,j}-2u_{i,j}+u_{i+1,j})+O(h^2)$$
$$u_{tt}=\frac{1}{k^2}(u_{i,j-1}-2u_{i,j}+u_{i,j+1})+O(k^2),$$
where i, j = 0, 1, 2, . . .. Using these approximations, the equation (2.3) reduces to

$$\frac{1}{k^2}(u_{i,j-1}-2u_{i,j}+u_{i,j+1})=\frac{c^2}{h^2}(u_{i-1,j}-2u_{i,j}+u_{i+1,j}).$$

That is,

$$u_{i,j+1}=r^2u_{i-1,j}+2(1-r^2)u_{i,j}+r^2u_{i+1,j}-u_{i,j-1}, \tag{2.6}$$
where r = ck/h. Note that the value of u_{i,j+1} depends on the values of u at the two time levels j − 1 and j, and u_{i,j+1} can be determined once the four values u_{i−1,j}, u_{i,j}, u_{i+1,j}, u_{i,j−1} are known. The known (filled circle) and unknown (circle) values of u are shown in Figure 2.1.

Figure 2.1: Known and unknown meshes for hyperbolic equations (unknown value at level j + 1; known values at levels j and j − 1, at the meshes i − 1, i, i + 1).

When j = 0, from the equation (2.6) we get

$$u_{i,1}=r^2u_{i-1,0}+2(1-r^2)u_{i,0}+r^2u_{i+1,0}-u_{i,-1}.$$

Since u(x, 0) = f(x), we have u_{i,0} = f(x_i) = f_i. Using this notation, the above equation reduces to

$$u_{i,1}=r^2f_{i-1}+2(1-r^2)f_i+r^2f_{i+1}-u_{i,-1}. \tag{2.7}$$

Now, by central difference approximation, the initial condition (2.4) becomes

$$\frac{1}{2k}(u_{i,1}-u_{i,-1})=g_i,\quad\text{i.e.}\quad u_{i,-1}=u_{i,1}-2kg_i.$$
Substituting this value of u_{i,−1} into the equation (2.7), we get

$$u_{i,1}=\frac{1}{2}\left[r^2f_{i-1}+2(1-r^2)f_i+r^2f_{i+1}+2kg_i\right]. \tag{2.8}$$
Thus, from equation (2.8) we obtain the values of u_{i,1} for all values of i. The truncation error of this method is O(h² + k²) and the formula (2.6) is convergent for 0 < r ≤ 1.

Example 2.1 Consider the following wave equation

$$\frac{\partial^2 u}{\partial t^2}=c^2\frac{\partial^2 u}{\partial x^2}.$$

The boundary conditions are u(0, t) = 0, u(1, t) = 0, t > 0, and the initial conditions are u(x, 0) = 4x², ∂u/∂t|_(x,0) = 0, 0 ≤ x ≤ 1. Find the value of u for x = 0, 0.2, 0.4, . . . , 1.0 and t = 0, 0.1, 0.2, . . . , 0.5, when c = 1.

Solution. Using the central difference approximation, the explicit formula for the given equation is

$$u_{i,j+1}=r^2u_{i-1,j}+2(1-r^2)u_{i,j}+r^2u_{i+1,j}-u_{i,j-1}. \tag{2.9}$$
Let h = 0.2 and k = 0.1, so r = ck/h = 0.5 < 1. The boundary conditions become u_{0,j} = 0, u_{5,j} = 0, and the initial conditions become u_{i,0} = 4x_i², i = 1, 2, 3, 4, and (u_{i,1} − u_{i,−1})/(2k) = 0, therefore u_{i,−1} = u_{i,1}.

Since r = 0.5, the difference equation (2.9) becomes

$$u_{i,j+1}=0.25u_{i-1,j}+1.5u_{i,j}+0.25u_{i+1,j}-u_{i,j-1}. \tag{2.10}$$

When j = 0, then u_{i,1} = 0.25u_{i−1,0} + 1.5u_{i,0} + 0.25u_{i+1,0} − u_{i,−1}, i.e. [using u_{i,−1} = u_{i,1}]

$$u_{i,1}=0.125u_{i-1,0}+0.75u_{i,0}+0.125u_{i+1,0}=0.125(u_{i-1,0}+u_{i+1,0})+0.75u_{i,0}.$$

This formula gives the values of u for j = 1. For the other values of j (j = 2, 3, . . .) the values of u are calculated from the formula (2.10). The initial and boundary values are shown in the following table.
j = 5, t = 0.5    0                                             0
j = 4, t = 0.4    0                                             0
j = 3, t = 0.3    0                                             0
j = 2, t = 0.2    0                                             0
j = 1, t = 0.1    0                                             0
j = 0, t = 0.0    0      0.16     0.64     1.44     2.56        0
                  x = 0  x = 0.2  x = 0.4  x = 0.6  x = 0.8  x = 1.0
                  i = 0  i = 1    i = 2    i = 3    i = 4    i = 5
The values of the first row, i.e. u_{i,1}, i = 1, 2, 3, 4, are calculated as follows:

$$\begin{aligned}
u_{1,1}&=0.125(u_{0,0}+u_{2,0})+0.75u_{1,0}=0.20\\
u_{2,1}&=0.125(u_{1,0}+u_{3,0})+0.75u_{2,0}=0.68\\
u_{3,1}&=0.125(u_{2,0}+u_{4,0})+0.75u_{3,0}=1.48\\
u_{4,1}&=0.125(u_{3,0}+u_{5,0})+0.75u_{4,0}=2.10.
\end{aligned}$$

The other values are written in the following table.
j = 5, t = 0.5    0    0.74328   0.89226   −0.50500   −1.25125    0
j = 4, t = 0.4    0    0.62906   1.05875    0.45250   −1.10375    0
j = 3, t = 0.3    0    0.46500   0.96625    1.17250   −0.29125    0
j = 2, t = 0.2    0    0.31000   0.80000    1.47500    0.96000    0
j = 1, t = 0.1    0    0.20000   0.68000    1.48000    2.10000    0
j = 0, t = 0.0    0    0.16000   0.64000    1.44000    2.56000    0
                x = 0  x = 0.2   x = 0.4    x = 0.6    x = 0.8  x = 1.0
                i = 0  i = 1     i = 2      i = 3      i = 4    i = 5
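The whole computation of this example can be reproduced with a few lines of code. The sketch below is an added illustration (assuming NumPy): it applies the first-row formula derived from (2.8) and then marches with (2.10); the variable names are chosen for exposition only.

```python
import numpy as np

# Example 2.1 with c = 1, h = 0.2, k = 0.1, hence r = ck/h = 0.5, r^2 = 0.25.
h, k, r2 = 0.2, 0.1, 0.25
x = np.linspace(0.0, 1.0, 6)
u_prev = 4 * x**2                 # u(x, 0) = 4x^2 ...
u_prev[0] = u_prev[-1] = 0.0      # ... with u = 0 at x = 0 and x = 1

# First row from (2.8) with g_i = 0:
# u_{i,1} = (r^2/2)(f_{i-1} + f_{i+1}) + (1 - r^2) f_i
u_curr = np.zeros_like(u_prev)
u_curr[1:-1] = 0.5 * r2 * (u_prev[:-2] + u_prev[2:]) + (1 - r2) * u_prev[1:-1]

rows = [u_prev.copy(), u_curr.copy()]
for j in range(2, 6):             # remaining rows from (2.10)
    u_next = np.zeros_like(u_curr)
    u_next[1:-1] = (r2 * (u_curr[:-2] + u_curr[2:])
                    + 2 * (1 - r2) * u_curr[1:-1] - u_prev[1:-1])
    rows.append(u_next.copy())
    u_prev, u_curr = u_curr, u_next

for j, row in enumerate(rows):    # one line per time level t = jk
    print(f"t = {j * k:.1f}:", np.round(row, 5))
```

Running this reproduces the table above row by row, e.g. t = 0.1 gives 0.20, 0.68, 1.48, 2.10 at the interior meshes.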
2.1.1 Implicit difference methods
Generally, implicit methods generate a tri-diagonal system of algebraic equations. It is therefore suggested that implicit methods not be used, without simplifying assumptions, to solve pure boundary value problems, because these methods generate a large number of equations for small h and k. But these methods may well be used for initial-boundary value problems. Two such implicit methods are described below.
Implicit Method-I

The right hand side of the equation (2.3) is divided into two parts. By central difference approximation at the mesh point (ih, jk), the given equation reduces to

$$\left.\frac{\partial^2 u}{\partial t^2}\right|_{i,j}=\frac{c^2}{2}\left[\left.\frac{\partial^2 u}{\partial x^2}\right|_{i,j+1}+\left.\frac{\partial^2 u}{\partial x^2}\right|_{i,j-1}\right].$$

That is,

$$\frac{1}{k^2}[u_{i,j+1}-2u_{i,j}+u_{i,j-1}]=\frac{c^2}{2h^2}\left[(u_{i+1,j+1}-2u_{i,j+1}+u_{i-1,j+1})+(u_{i+1,j-1}-2u_{i,j-1}+u_{i-1,j-1})\right]. \tag{2.11}$$
Implicit Method-II

Again, we divide the right hand side of the given equation into three parts as

$$\left.\frac{\partial^2 u}{\partial t^2}\right|_{i,j}=\frac{c^2}{4}\left[\left.\frac{\partial^2 u}{\partial x^2}\right|_{i,j+1}+2\left.\frac{\partial^2 u}{\partial x^2}\right|_{i,j}+\left.\frac{\partial^2 u}{\partial x^2}\right|_{i,j-1}\right].$$

By central difference approximation the given equation reduces to

$$\frac{1}{k^2}[u_{i,j+1}-2u_{i,j}+u_{i,j-1}]=\frac{c^2}{4h^2}\left[(u_{i+1,j+1}-2u_{i,j+1}+u_{i-1,j+1})+2(u_{i+1,j}-2u_{i,j}+u_{i-1,j})+(u_{i+1,j-1}-2u_{i,j-1}+u_{i-1,j-1})\right]. \tag{2.12}$$

The above equation can be written as

$$-u_{i-1,j+1}+2\left(1+\frac{2}{r^2}\right)u_{i,j+1}-u_{i+1,j+1}=2\left[u_{i-1,j}-2\left(1-\frac{2}{r^2}\right)u_{i,j}+u_{i+1,j}\right]+\left[u_{i-1,j-1}-2\left(1+\frac{2}{r^2}\right)u_{i,j-1}+u_{i+1,j-1}\right], \tag{2.13}$$
where r = ck/h. This is a tri-diagonal system of linear equations and it can be solved by any method. The above system of equations can also be written as
$$\begin{bmatrix}
2s & -1 & & & \\
-1 & 2s & -1 & & \\
 & \ddots & \ddots & \ddots & \\
 & & -1 & 2s & -1\\
 & & & -1 & 2s
\end{bmatrix}
\begin{bmatrix} u_{1,j+1}\\ u_{2,j+1}\\ \vdots\\ u_{N-2,j+1}\\ u_{N-1,j+1}\end{bmatrix}
=
\begin{bmatrix} d_{1,j}\\ d_{2,j}\\ \vdots\\ d_{N-2,j}\\ d_{N-1,j}\end{bmatrix},$$
where s = 1 + 2/r² and

$$\begin{aligned}
d_{1,j}&=u_{0,j+1}+2[u_{0,j}-2(1-2/r^2)u_{1,j}+u_{2,j}]+[u_{0,j-1}-2(1+2/r^2)u_{1,j-1}+u_{2,j-1}]\\
d_{i,j}&=2[u_{i-1,j}-2(1-2/r^2)u_{i,j}+u_{i+1,j}]+[u_{i-1,j-1}-2(1+2/r^2)u_{i,j-1}+u_{i+1,j-1}],\quad i=2,3,\ldots,N-2\\
d_{N-1,j}&=u_{N,j+1}+2[u_{N-2,j}-2(1-2/r^2)u_{N-1,j}+u_{N,j}]+[u_{N-2,j-1}-2(1+2/r^2)u_{N-1,j-1}+u_{N,j-1}].
\end{aligned}$$

For each time level j = 1, 2, . . ., one can thus find all the values u_{i,j+1}, i = 1, 2, . . . , N − 1. Both formulae are valid for all values of r = ck/h > 0.
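As an illustration of how Implicit Method-II might be organised in code, the following sketch performs one time step by assembling the right hand sides d_{i,j} and solving the tri-diagonal system in banded form. It is an added example, assuming NumPy and SciPy are available; the function name and interface are chosen for exposition.

```python
import numpy as np
from scipy.linalg import solve_banded

def implicit_II_step(u_prev, u_curr, r, left, right):
    """One step of Implicit Method-II, equation (2.13).
    u_prev, u_curr are float arrays holding levels j-1 and j (boundaries
    included); left/right are the boundary values at level j+1."""
    s = 1 + 2 / r**2
    n = len(u_curr) - 2
    # d_{i,j} from the text: 2[...] at level j plus [...] at level j-1.
    d = (2 * (u_curr[:-2] - 2 * (1 - 2 / r**2) * u_curr[1:-1] + u_curr[2:])
         + (u_prev[:-2] - 2 * (1 + 2 / r**2) * u_prev[1:-1] + u_prev[2:]))
    d[0] += left                 # known boundary term u_{0,j+1} ...
    d[-1] += right               # ... and u_{N,j+1}
    # Banded storage: row 0 superdiagonal, row 1 diagonal, row 2 subdiagonal.
    ab = np.zeros((3, n))
    ab[0, 1:] = -1.0
    ab[1, :] = 2 * s
    ab[2, :-1] = -1.0
    interior = solve_banded((1, 1), ab, d)
    return np.concatenate(([left], interior, [right]))
```

Starting from the levels j = 0 (initial values) and j = 1 (computed from (2.8)), repeated calls march the solution forward in time.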
Chapter 9 Numerical Solution of Partial Differential Equations
Module No. 3 Partial Differential Equations: Elliptic
The elliptic PDE is one of the most widely used PDEs. In this module, the finite difference method is described for solving elliptic PDEs.
3.1 Elliptic equations

Elliptic PDEs occur in many practical situations. The simplest elliptic PDEs are the Laplace and Poisson equations: the Laplace equation is ∇²u = 0 and the Poisson equation is ∇²u = g(r). One physical example of such equations is the following. Let the function ρ represent the electric charge density in some open bounded set Ω ⊂ R^d. If the permittivity ε is constant in Ω, the distribution of the electric potential φ in Ω is governed by the Poisson equation

$$-\varepsilon\Delta\varphi=\rho.$$

This equation does not have a unique solution, because if φ is a solution, then the function φ + c is also a solution, where c is any constant. To obtain a unique solution, every elliptic equation must be supplied with suitable boundary conditions. Let us consider the two-dimensional Laplace equation

$$\frac{\partial^2 u}{\partial x^2}+\frac{\partial^2 u}{\partial y^2}=0\ \text{ within the region }R,\qquad u=f(x,y)\ \text{ on the boundary }C. \tag{3.1}$$
In this problem, both independent variables are space variables. We now approximate this PDE by central difference approximations. The finite difference approximation of the above equation is

$$\frac{u_{i-1,j}-2u_{i,j}+u_{i+1,j}}{h^2}+\frac{u_{i,j-1}-2u_{i,j}+u_{i,j+1}}{k^2}=0. \tag{3.2}$$
It is assumed that the lengths of the subintervals along the x and y directions are equal, i.e. h = k. Then the above equation becomes

$$u_{i,j}=\frac{1}{4}[u_{i+1,j}+u_{i-1,j}+u_{i,j+1}+u_{i,j-1}]. \tag{3.3}$$
From this expression, it is seen that the value of u_{i,j} is the average of the values of u at the four neighbouring meshes: north (i, j + 1), east (i + 1, j), south (i, j − 1) and west (i − 1, j).
Figure 3.1: Known and unknown meshes in the standard five-point formula (known values at (i−1, j), (i+1, j), (i, j−1), (i, j+1); unknown value at (i, j)).

The known values (filled circles) and the unknown value (circle) are shown in Figure 3.1. This formula is known as the standard five-point formula. It can be proved that the Laplace equation remains invariant when the coordinate axes are rotated through an angle of 45°. Under this rotation, the equation (3.3) becomes

$$u_{i,j}=\frac{1}{4}[u_{i-1,j-1}+u_{i+1,j-1}+u_{i+1,j+1}+u_{i-1,j+1}]. \tag{3.4}$$
This is another formula to calculate u_{i,j}, known as the diagonal five-point formula. The known and unknown meshes for this formula are shown in Figure 3.2.

Figure 3.2: Known and unknown meshes in the diagonal five-point formula (known values at (i−1, j−1), (i+1, j−1), (i−1, j+1), (i+1, j+1); unknown value at (i, j)).

When the right hand side of the Laplace equation is nonzero, the equation is known as the Poisson equation. The Poisson equation in two dimensions has the following form:

$$u_{xx}+u_{yy}=g(x,y), \tag{3.5}$$

with the boundary condition u = f(x, y) along the boundary C.
Here we also assume that the mesh spacings in the x and y directions are equal. Under this assumption, the central difference approximation of the equation (3.5) reduces to

$$u_{i,j}=\frac{1}{4}[u_{i-1,j}+u_{i+1,j}+u_{i,j-1}+u_{i,j+1}-h^2g_{i,j}],\quad\text{where }g_{i,j}=g(x_i,y_j). \tag{3.6}$$
Let u = 0 along the boundary C and i, j = 0, 1, 2, 3, 4. Then u_{0,j} = 0, u_{4,j} = 0 for j = 0, 1, 2, 3, 4 and u_{i,0} = 0, u_{i,4} = 0 for i = 0, 1, 2, 3, 4. The boundary values (filled circles) are shown in Figure 3.3.

Figure 3.3: The 5 × 5 meshes for the elliptic equation (u = 0 on all four sides; interior unknowns u_{i,j}, i, j = 1, 2, 3).

For this particular case, i.e. for i, j = 1, 2, 3, the equation (3.6) becomes a system of nine equations in nine unknowns. These equations are written in matrix notation as

$$\begin{bmatrix}
4&-1&0&-1&0&0&0&0&0\\
-1&4&-1&0&-1&0&0&0&0\\
0&-1&4&0&0&-1&0&0&0\\
-1&0&0&4&-1&0&-1&0&0\\
0&-1&0&-1&4&-1&0&-1&0\\
0&0&-1&0&-1&4&0&0&-1\\
0&0&0&-1&0&0&4&-1&0\\
0&0&0&0&-1&0&-1&4&-1\\
0&0&0&0&0&-1&0&-1&4
\end{bmatrix}
\begin{bmatrix}u_{1,1}\\u_{1,2}\\u_{1,3}\\u_{2,1}\\u_{2,2}\\u_{2,3}\\u_{3,1}\\u_{3,2}\\u_{3,3}\end{bmatrix}
=
\begin{bmatrix}-h^2g_{1,1}\\-h^2g_{1,2}\\-h^2g_{1,3}\\-h^2g_{2,1}\\-h^2g_{2,2}\\-h^2g_{2,3}\\-h^2g_{3,1}\\-h^2g_{3,2}\\-h^2g_{3,3}\end{bmatrix} \tag{3.7}$$
This indicates that the equation (3.6) leads to a system of (N − 1)² equations, where N is the number of subintervals along each of the x and y directions. Note that the coefficient matrix is symmetric, positive definite and sparse (many elements are 0). Since the coefficient matrix is sparse, it is suggested to use an iterative method rather than a direct method to solve the above system of equations. Commonly used iterative methods are Jacobi's method, Gauss-Seidel's method, the successive over-relaxation method, the alternating direction implicit method, etc.
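To see how the system (3.7) arises mechanically, the sketch below assembles the 9 × 9 coefficient matrix and right hand side for an arbitrary g. It is an added illustration (assuming NumPy, and taking the region to be the unit square so that h = 1/4); the direct solve at the end is included only as a check, since for larger meshes the iterative methods just mentioned are preferred.

```python
import numpy as np

def assemble_poisson_5x5(g, h=0.25):
    """Assemble the 9x9 system (3.7) on a 4x4 subdivision of the unit
    square with u = 0 on the boundary.  Unknowns are ordered
    u11, u12, u13, u21, u22, u23, u31, u32, u33 as in (3.7)."""
    n = 3                                    # interior meshes per direction
    A = np.zeros((n * n, n * n))
    b = np.zeros(n * n)
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            p = (i - 1) * n + (j - 1)        # row index of unknown u_{i,j}
            A[p, p] = 4.0
            for ii, jj in [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]:
                if 1 <= ii <= n and 1 <= jj <= n:
                    A[p, (ii - 1) * n + (jj - 1)] = -1.0
                # a neighbour on the boundary contributes nothing: u = 0 there
            b[p] = -h**2 * g(i * h, j * h)
    return A, b

A, b = assemble_poisson_5x5(lambda x, y: 1.0)   # e.g. g(x, y) = 1
u = np.linalg.solve(A, b)                       # direct solve, as a check
```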
3.1.1 Method to find first approximate values of Laplace's equation
Let us consider the Laplace equation u_xx + u_yy = 0. Let the region R be a square divided into N × N small squares, each of side h. The boundary values are u_{0,j}, u_{N,j}, u_{i,0}, u_{i,N}, where i, j = 0, 1, 2, . . . , N; the case N = 4 is shown in Figure 3.4.

Figure 3.4: Known and unknown meshes for the Laplace equation. Red meshes are calculated by the diagonal five-point formula and blue meshes are determined by the standard five-point formula.

At first the diagonal five-point formula is used to compute the values of u in the order u_{2,2}, u_{1,3}, u_{3,3}, u_{1,1}, u_{3,1} (red meshes in the figure). That is,

$$u_{2,2}=\frac{1}{4}(u_{0,0}+u_{4,4}+u_{0,4}+u_{4,0})$$
$$u_{1,3}=\frac{1}{4}(u_{0,2}+u_{2,4}+u_{0,4}+u_{2,2})$$
$$u_{3,3}=\frac{1}{4}(u_{2,2}+u_{4,4}+u_{2,4}+u_{4,2})$$
$$u_{1,1}=\frac{1}{4}(u_{0,0}+u_{2,2}+u_{0,2}+u_{2,0})$$
$$u_{3,1}=\frac{1}{4}(u_{2,0}+u_{4,2}+u_{2,2}+u_{4,0}).$$

In the second step, the remaining values, viz. u_{2,3}, u_{1,2}, u_{3,2} and u_{2,1}, are evaluated using the standard five-point formula (blue meshes in the figure). Thus,

$$u_{2,3}=\frac{1}{4}(u_{1,3}+u_{3,3}+u_{2,2}+u_{2,4})$$
$$u_{1,2}=\frac{1}{4}(u_{0,2}+u_{2,2}+u_{1,1}+u_{1,3})$$
$$u_{3,2}=\frac{1}{4}(u_{2,2}+u_{4,2}+u_{3,1}+u_{3,3})$$
$$u_{2,1}=\frac{1}{4}(u_{1,1}+u_{3,1}+u_{2,0}+u_{2,2}).$$

Note that these are the first approximate values of u at the different meshes. These values can be updated by using any of the iterative methods mentioned earlier.

Example 3.1 Let us consider the following Dirichlet problem: u_xx + u_yy = 0, u(x, 0) = 0, u(0, y) = 0, u(x, 1) = 5x, u(1, y) = 5y. Find the first approximate values at the interior meshes by dividing the square region into 4 × 4 squares.

Solution. For this problem, the region R is 0 ≤ x, y ≤ 1. Let h = k = 0.25 and x_i = ih, y_j = jk, i, j = 0, 1, 2, 3, 4. The meshes are shown in Figure 3.5. The values of u are calculated in two steps. In the first step, the diagonal five-point formula is used to find the values of u_{2,2}, u_{1,3}, u_{3,3}, u_{1,1}, u_{3,1}, and in the second step the standard five-point formula is used to find the values of u_{2,3}, u_{1,2}, u_{3,2}, u_{2,1}. The diagonal five-point formula is

$$u_{i,j}=\frac{1}{4}[u_{i-1,j-1}+u_{i+1,j-1}+u_{i+1,j+1}+u_{i-1,j+1}]$$
Figure 3.5: Meshes for the Dirichlet problem (boundary values: u = 0 on x = 0 and y = 0; u_{4,1} = 1.25, u_{4,2} = 2.5, u_{4,3} = 3.75, u_{4,4} = 5.0 on x = 1; u_{1,4} = 1.25, u_{2,4} = 2.50, u_{3,4} = 3.75 on y = 1).

and the standard five-point formula is

$$u_{i,j}=\frac{1}{4}[u_{i+1,j}+u_{i-1,j}+u_{i,j+1}+u_{i,j-1}].$$

Therefore,

$$\begin{aligned}
u_{2,2}&=\frac{1}{4}(u_{0,0}+u_{4,4}+u_{0,4}+u_{4,0})=\frac{1}{4}(0+5.0+0+0)=1.25\\
u_{1,3}&=\frac{1}{4}(u_{0,2}+u_{2,4}+u_{0,4}+u_{2,2})=\frac{1}{4}(0+2.5+0+1.25)=0.9375\\
u_{3,3}&=\frac{1}{4}(u_{2,2}+u_{4,4}+u_{2,4}+u_{4,2})=\frac{1}{4}(1.25+5+2.5+2.5)=2.8125\\
u_{1,1}&=\frac{1}{4}(u_{0,0}+u_{2,2}+u_{0,2}+u_{2,0})=\frac{1}{4}(0+1.25+0+0)=0.3125\\
u_{3,1}&=\frac{1}{4}(u_{2,0}+u_{4,2}+u_{2,2}+u_{4,0})=\frac{1}{4}(0+2.5+1.25+0)=0.9375.
\end{aligned}$$
The values of u_{2,3}, u_{1,2}, u_{3,2} and u_{2,1} are obtained by using the standard five-point formula:

$$\begin{aligned}
u_{2,3}&=\frac{1}{4}(u_{1,3}+u_{3,3}+u_{2,2}+u_{2,4})=\frac{1}{4}(0.9375+2.8125+1.25+2.5)=1.875\\
u_{1,2}&=\frac{1}{4}(u_{0,2}+u_{2,2}+u_{1,1}+u_{1,3})=\frac{1}{4}(0+1.25+0.3125+0.9375)=0.625\\
u_{3,2}&=\frac{1}{4}(u_{2,2}+u_{4,2}+u_{3,1}+u_{3,3})=\frac{1}{4}(1.25+2.5+0.9375+2.8125)=1.875\\
u_{2,1}&=\frac{1}{4}(u_{1,1}+u_{3,1}+u_{2,0}+u_{2,2})=\frac{1}{4}(0.3125+0.9375+0+1.25)=0.625.
\end{aligned}$$
These are the first approximate values of u at the interior meshes.
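The two-step computation of this example can be verified with the short sketch below. It is an added illustration (assuming NumPy); the array u[i, j] stores the mesh value at (ih, jh).

```python
import numpy as np

# Example 3.1: u = 0 on x = 0 and y = 0, u = 5x on y = 1, u = 5y on x = 1.
u = np.zeros((5, 5))                 # u[i, j] ~ u(i/4, j/4)
for m in range(5):
    u[m, 4] = 5 * m / 4              # top boundary:   u(x, 1) = 5x
    u[4, m] = 5 * m / 4              # right boundary: u(1, y) = 5y

# Step 1: diagonal five-point formula; u(2,2) first from the four domain
# corners (stride 2), then (1,3), (3,3), (1,1), (3,1) with stride 1.
u[2, 2] = 0.25 * (u[0, 0] + u[4, 4] + u[0, 4] + u[4, 0])
for i, j in [(1, 3), (3, 3), (1, 1), (3, 1)]:
    u[i, j] = 0.25 * (u[i-1, j-1] + u[i+1, j-1] + u[i+1, j+1] + u[i-1, j+1])

# Step 2: standard five-point formula at the remaining interior meshes.
for i, j in [(2, 3), (1, 2), (3, 2), (2, 1)]:
    u[i, j] = 0.25 * (u[i-1, j] + u[i+1, j] + u[i, j-1] + u[i, j+1])

print(u[1:4, 1:4])   # rows i = 1..3: [0.3125, 0.625, 0.9375],
                     #                [0.625, 1.25, 1.875],
                     #                [0.9375, 1.875, 2.8125]
```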
3.2 Iterative methods

If the first approximate values of u are known, then these values can be updated by applying any well-known iterative method. Several iterative methods with different rates of convergence are available; some of them are discussed below. The standard five-point finite difference formula for the Poisson equation (3.5) is

$$u_{i,j}=\frac{1}{4}(u_{i-1,j}+u_{i+1,j}+u_{i,j-1}+u_{i,j+1}-h^2g_{i,j}). \tag{3.8}$$
Let u_{i,j}^{(r)} be the r-th iterated value of u_{i,j}, r = 1, 2, . . ..

Jacobi's method

The Jacobi iteration scheme to solve the system of equations (3.8) for the interior meshes is

$$u_{i,j}^{(r+1)}=\frac{1}{4}\left[u_{i-1,j}^{(r)}+u_{i+1,j}^{(r)}+u_{i,j-1}^{(r)}+u_{i,j+1}^{(r)}-h^2g_{i,j}\right]. \tag{3.9}$$
This formula evaluates the (r + 1)-th iterated value of u_{i,j} when the r-th iterated values of u are known at the meshes (i − 1, j), (i + 1, j), (i, j − 1), (i, j + 1).

Gauss-Seidel's method

It is well known (discussed in Chapter 5) that the latest updated values are used in Gauss-Seidel's method. The values of u along each row are computed systematically from left to right. The iteration formula is

$$u_{i,j}^{(r+1)}=\frac{1}{4}\left[u_{i-1,j}^{(r+1)}+u_{i+1,j}^{(r)}+u_{i,j-1}^{(r+1)}+u_{i,j+1}^{(r)}-h^2g_{i,j}\right]. \tag{3.10}$$
The rate of convergence of this method is twice that of Jacobi's method.

Successive Over-Relaxation (SOR) method

In this method, the iteration scheme is accelerated by introducing a scalar called the relaxation factor. The acceleration is made by applying a correction proportional to $[u_{i,j}^{(r+1)}-u_{i,j}^{(r)}]$. Suppose $u_{i,j}^{(r+1)}$ is the value obtained from any iteration method, such as Jacobi's or Gauss-Seidel's method. Then the updated value of u_{i,j} at the (r + 1)-th iteration is given by

$$u_{i,j}^{(r+1)}=wu_{i,j}^{(r+1)}+(1-w)u_{i,j}^{(r)}, \tag{3.11}$$

where w is called the relaxation factor. If w > 1 then the method is called the over-relaxation method; if w = 1 the method reduces to the Gauss-Seidel iteration method. Thus, for the Poisson equation, the Jacobi over-relaxation scheme is

$$u_{i,j}^{(r+1)}=\frac{w}{4}\left[u_{i-1,j}^{(r)}+u_{i+1,j}^{(r)}+u_{i,j-1}^{(r)}+u_{i,j+1}^{(r)}-h^2g_{i,j}\right]+(1-w)u_{i,j}^{(r)} \tag{3.12}$$

and the Gauss-Seidel over-relaxation scheme is

$$u_{i,j}^{(r+1)}=\frac{w}{4}\left[u_{i-1,j}^{(r+1)}+u_{i+1,j}^{(r)}+u_{i,j-1}^{(r+1)}+u_{i,j+1}^{(r)}-h^2g_{i,j}\right]+(1-w)u_{i,j}^{(r)}. \tag{3.13}$$
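A compact implementation of the scheme (3.13) might look as follows. This is an added sketch, assuming NumPy: the function name and stopping rule (sweeps until the largest change falls below a tolerance) are illustrative choices, and the row-by-row sweep order may differ from the order used in the worked examples below without affecting the limit.

```python
import numpy as np

def sor_poisson(u, g, h, w=1.0, tol=1e-5, max_sweeps=200):
    """Gauss-Seidel over-relaxation (3.13) on a square mesh.
    u holds the boundary values and an initial guess at the interior
    meshes; g is the right hand side of Poisson's equation.
    w = 1 gives plain Gauss-Seidel.  Returns (u, number of sweeps)."""
    n = u.shape[0]
    for r in range(1, max_sweeps + 1):
        change = 0.0
        for j in range(1, n - 1):          # sweep the interior meshes
            for i in range(1, n - 1):
                new = (w / 4) * (u[i-1, j] + u[i+1, j] + u[i, j-1] + u[i, j+1]
                                 - h**2 * g(i * h, j * h)) + (1 - w) * u[i, j]
                change = max(change, abs(new - u[i, j]))
                u[i, j] = new              # latest value reused at once
        if change < tol:
            return u, r
    return u, max_sweeps

# Example 3.2: u = 12.4 on the two interior meshes of the top edge.
u0 = np.zeros((4, 4))                      # u0[i, j] ~ u(i/3, j/3)
u0[1, 3] = u0[2, 3] = 12.4
u, sweeps = sor_poisson(u0, lambda x, y: 0.0, h=1/3, w=1.1)
print(np.round(u[1:3, 1:3], 5), sweeps)    # -> u11 = u21 = 1.55, u12 = u22 = 4.65
```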
The rate of convergence of the above schemes depends on the value of w, which lies between 1 and 2, but the choice of a suitable w is a difficult task.

Example 3.2 Solve the Laplace equation u_xx + u_yy = 0, defined within the square region 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 with the boundary conditions shown in Figure 3.6, by (a) Jacobi's method, (b) Gauss-Seidel's method, and (c) Gauss-Seidel's successive over-relaxation method.

Figure 3.6: Boundary conditions of the Laplace equation (u = 12.4 at the two interior mesh points of the top edge, u = 0 at all other boundary mesh points; interior unknowns u_{1,2}, u_{2,2}, u_{1,1}, u_{2,1}).

Solution. (a) Jacobi's method

Let the initial values be u_{2,1} = u_{1,2} = u_{2,2} = u_{1,1} = 0 and h = k = 1/3. The Jacobi iteration scheme is
$$\begin{aligned}
u_{1,1}^{(r+1)}&=\frac{1}{4}\left[u_{2,1}^{(r)}+u_{1,2}^{(r)}+0+0\right]=\frac{1}{4}\left[u_{2,1}^{(r)}+u_{1,2}^{(r)}\right]\\
u_{2,1}^{(r+1)}&=\frac{1}{4}\left[u_{1,1}^{(r)}+u_{2,2}^{(r)}+0+0\right]=\frac{1}{4}\left[u_{1,1}^{(r)}+u_{2,2}^{(r)}\right]\\
u_{1,2}^{(r+1)}&=\frac{1}{4}\left[u_{1,1}^{(r)}+u_{2,2}^{(r)}+0+12.4\right]=\frac{1}{4}\left[u_{1,1}^{(r)}+u_{2,2}^{(r)}+12.4\right]\\
u_{2,2}^{(r+1)}&=\frac{1}{4}\left[u_{1,2}^{(r)}+u_{2,1}^{(r)}+12.4+0\right]=\frac{1}{4}\left[u_{1,2}^{(r)}+u_{2,1}^{(r)}+12.4\right].
\end{aligned}$$

The first iterated values are u_{1,1}^{(1)} = 0, u_{2,1}^{(1)} = 0, u_{1,2}^{(1)} = 3.1, u_{2,2}^{(1)} = 3.1. All the other iterated values are given below.
 r      u1,1      u2,1      u1,2      u2,2
 0    0.00000   0.00000   0.00000   0.00000
 1    0.00000   0.00000   3.10000   3.10000
 2    0.77500   0.77500   3.87500   3.87500
 3    1.16250   1.16250   4.26250   4.26250
 4    1.35625   1.35625   4.45625   4.45625
 5    1.45312   1.45312   4.55312   4.55312
 6    1.50156   1.50156   4.60156   4.60156
 7    1.52578   1.52578   4.62578   4.62578
 8    1.53789   1.53789   4.63789   4.63789
 9    1.54395   1.54395   4.64395   4.64395
10    1.54697   1.54697   4.64697   4.64697
11    1.54849   1.54849   4.64849   4.64849
12    1.54924   1.54924   4.64924   4.64924
13    1.54962   1.54962   4.64962   4.64962
14    1.54981   1.54981   4.64981   4.64981
15    1.54991   1.54991   4.64991   4.64991
16    1.54995   1.54995   4.64995   4.64995
Therefore, u_{1,1} = 1.5500, u_{2,1} = 1.5500, u_{1,2} = 4.6500, u_{2,2} = 4.6500, correct up to four decimal places.

(b) Gauss-Seidel's method

Let u_{2,1} = u_{1,2} = u_{2,2} = u_{1,1} = 0 be the initial values. Also, h = k = 1/3.
The Gauss-Seidel iteration scheme is

$$\begin{aligned}
u_{1,1}^{(r+1)}&=\frac{1}{4}\left[u_{2,1}^{(r)}+u_{1,2}^{(r)}\right]\\
u_{2,1}^{(r+1)}&=\frac{1}{4}\left[u_{1,1}^{(r+1)}+u_{2,2}^{(r)}\right]\\
u_{1,2}^{(r+1)}&=\frac{1}{4}\left[u_{1,1}^{(r+1)}+u_{2,2}^{(r)}+12.4\right]\\
u_{2,2}^{(r+1)}&=\frac{1}{4}\left[u_{1,2}^{(r+1)}+u_{2,1}^{(r+1)}+12.4\right].
\end{aligned}$$

When r = 0, then

$$\begin{aligned}
u_{1,1}^{(1)}&=\frac{1}{4}(0+0)=0\\
u_{2,1}^{(1)}&=\frac{1}{4}(0+0)=0\\
u_{1,2}^{(1)}&=\frac{1}{4}(0+0+12.4)=3.1\\
u_{2,2}^{(1)}&=\frac{1}{4}(3.1+0+12.4)=3.875.
\end{aligned}$$

These are the first iterated values. The results of all the other iterations are shown below.
 r      u1,1      u2,1      u1,2      u2,2
 0    0.00000   0.00000   0.00000   0.00000
 1    0.00000   0.00000   3.10000   3.87500
 2    0.77500   1.16250   4.26250   4.45625
 3    1.35625   1.45312   4.55312   4.60156
 4    1.50156   1.52578   4.62578   4.63789
 5    1.53789   1.54395   4.64395   4.64697
 6    1.54697   1.54849   4.64849   4.64924
 7    1.54924   1.54962   4.64962   4.64981
 8    1.54981   1.54991   4.64991   4.64995
 9    1.54995   1.54998   4.64998   4.64999
10    1.54999   1.54999   4.64999   4.65000
11    1.55000   1.55000   4.65000   4.65000
Hence, u_{1,1} = 1.55000, u_{2,1} = 1.55000, u_{1,2} = 4.65000, u_{2,2} = 4.65000, correct up to five decimal places.
(c) Gauss-Seidel's successive over-relaxation method

Let the initial values be u_{2,1} = u_{1,2} = u_{2,2} = u_{1,1} = 0. The SOR scheme for the interior meshes is

$$u_{i,j}^{(r+1)}=\frac{w}{4}\left[u_{i-1,j}^{(r+1)}+u_{i+1,j}^{(r)}+u_{i,j-1}^{(r+1)}+u_{i,j+1}^{(r)}\right]+(1-w)u_{i,j}^{(r)}.$$

For j = 1, 2 and i = 1, 2, the formulae are

$$\begin{aligned}
u_{1,1}^{(r+1)}&=\frac{w}{4}\left[u_{2,1}^{(r)}+u_{1,2}^{(r)}\right]+(1-w)u_{1,1}^{(r)}\\
u_{2,1}^{(r+1)}&=\frac{w}{4}\left[u_{1,1}^{(r+1)}+u_{2,2}^{(r)}\right]+(1-w)u_{2,1}^{(r)}\\
u_{1,2}^{(r+1)}&=\frac{w}{4}\left[u_{2,2}^{(r)}+u_{1,1}^{(r+1)}+12.4\right]+(1-w)u_{1,2}^{(r)}\\
u_{2,2}^{(r+1)}&=\frac{w}{4}\left[u_{1,2}^{(r+1)}+u_{3,2}^{(r)}+u_{2,1}^{(r+1)}+12.4\right]+(1-w)u_{2,2}^{(r)},
\end{aligned}$$

where the boundary value u_{3,2} = 0. Let w = 1.1. Then the successive values of u are listed below.
 r      u1,1      u2,1      u1,2      u2,2
 1    0.00000   0.00000   3.41000   4.34775
 2    0.93775   1.45351   4.52251   4.61863
 3    1.54963   1.55092   4.65402   4.65450
 4    1.55140   1.55153   4.65122   4.65031
 5    1.55062   1.55010   4.65013   4.65003
 6    1.55000   1.55000   4.65000   4.65000
 7    1.55000   1.55000   4.65000   4.65000
Hence, the solution is u_{1,1} = 1.55000, u_{2,1} = 1.55000, u_{1,2} = 4.65000, u_{2,2} = 4.65000, correct up to five decimal places. The SOR iteration scheme gives the result in 6 iterations for w = 1.1, while the Gauss-Seidel and Jacobi iteration schemes take 11 and 16 iterations respectively. For the SOR method the number of iterations depends on the value of w.

Example 3.3 Solve the Poisson equation u_xx + u_yy = 5x²y on the square region 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 with h = 1/3, where the values of u on the boundary are everywhere zero. Use (a) Gauss-Seidel's method, and (b) Gauss-Seidel's SOR method.
Solution. In this problem, g(x, y) = 5x²y, h = k = 1/3 and the boundary conditions are u_{0,0} = u_{1,0} = u_{2,0} = u_{3,0} = 0, u_{0,1} = u_{0,2} = u_{0,3} = 0, u_{1,3} = u_{2,3} = u_{3,3} = 0, u_{3,1} = u_{3,2} = 0.
(a) The Gauss-Seidel iteration scheme is

$$u_{i,j}^{(r+1)}=\frac{1}{4}\left[u_{i-1,j}^{(r+1)}+u_{i+1,j}^{(r)}+u_{i,j-1}^{(r+1)}+u_{i,j+1}^{(r)}-h^2g(ih,jk)\right].$$

Now, g(ih, jk) = 5h³i²j = (5/27)i²j, so h²g(ih, jk) = (5/243)i²j. Thus

$$\begin{aligned}
u_{1,1}^{(r+1)}&=\frac{1}{4}\left[u_{0,1}^{(r+1)}+u_{2,1}^{(r)}+u_{1,0}^{(r+1)}+u_{1,2}^{(r)}-\frac{5}{243}\right]=\frac{1}{4}\left[u_{2,1}^{(r)}+u_{1,2}^{(r)}-\frac{5}{243}\right]\\
u_{2,1}^{(r+1)}&=\frac{1}{4}\left[u_{1,1}^{(r+1)}+u_{3,1}^{(r)}+u_{2,0}^{(r+1)}+u_{2,2}^{(r)}-\frac{20}{243}\right]=\frac{1}{4}\left[u_{1,1}^{(r+1)}+u_{2,2}^{(r)}-\frac{20}{243}\right]\\
u_{1,2}^{(r+1)}&=\frac{1}{4}\left[u_{0,2}^{(r+1)}+u_{2,2}^{(r)}+u_{1,1}^{(r+1)}+u_{1,3}^{(r)}-\frac{10}{243}\right]=\frac{1}{4}\left[u_{2,2}^{(r)}+u_{1,1}^{(r+1)}-\frac{10}{243}\right]\\
u_{2,2}^{(r+1)}&=\frac{1}{4}\left[u_{1,2}^{(r+1)}+u_{3,2}^{(r)}+u_{2,1}^{(r+1)}+u_{2,3}^{(r)}-\frac{40}{243}\right]=\frac{1}{4}\left[u_{1,2}^{(r+1)}+u_{2,1}^{(r+1)}-\frac{40}{243}\right].
\end{aligned}$$

Let u_{2,1}^{(0)} = u_{2,2}^{(0)} = u_{1,2}^{(0)} = 0. All the values are shown in the following table.
 r      u1,1       u2,1       u1,2       u2,2
 1    −0.00514   −0.02186   −0.01157   −0.04951
 2    −0.01350   −0.03633   −0.02604   −0.05675
 3    −0.02074   −0.03995   −0.02966   −0.05855
 4    −0.02255   −0.04085   −0.03056   −0.05901
 5    −0.02300   −0.04108   −0.03079   −0.05912
 6    −0.02311   −0.04113   −0.03085   −0.05915
 7    −0.02314   −0.04115   −0.03086   −0.05915
Hence, the solution correct up to five decimal places is u_{1,1} = −0.02314, u_{2,1} = −0.04115, u_{1,2} = −0.03086, u_{2,2} = −0.05915.
(b) The SOR scheme is

$$u_{i,j}^{(r+1)}=\frac{w}{4}\left[u_{i-1,j}^{(r+1)}+u_{i+1,j}^{(r)}+u_{i,j-1}^{(r+1)}+u_{i,j+1}^{(r)}-h^2g(ih,jh)\right]+(1-w)u_{i,j}^{(r)}
=\frac{w}{4}\left[u_{i-1,j}^{(r+1)}+u_{i+1,j}^{(r)}+u_{i,j-1}^{(r+1)}+u_{i,j+1}^{(r)}-\frac{5}{243}i^2j\right]+(1-w)u_{i,j}^{(r)}.$$

Let the initial values be u_{2,1}^{(0)} = u_{1,2}^{(0)} = u_{2,2}^{(0)} = 0 and let the relaxation factor be w = 1.1. Then the values of u_{1,1}, u_{2,1}, u_{1,2} and u_{2,2} are computed below.
 r      u1,1       u2,1       u1,2       u2,2
 1    −0.00566   −0.02419   −0.01287   −0.05546
 2    −0.01528   −0.03967   −0.02948   −0.05874
 3    −0.02315   −0.04119   −0.03089   −0.05921
 4    −0.02316   −0.04117   −0.03088   −0.05916
 5    −0.02316   −0.04115   −0.03087   −0.05916
 6    −0.02315   −0.04115   −0.03086   −0.05916
The solution obtained by the SOR method is u_{1,1} = −0.02315, u_{2,1} = −0.04115, u_{1,2} = −0.03086, u_{2,2} = −0.05916, correct up to five decimal places. Note that the Gauss-Seidel iteration method needs 7 iterations whereas the SOR method takes only 6 iterations for w = 1.1.
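Part (a) of this example can be checked with the following added sketch (assuming NumPy), which performs seven Gauss-Seidel sweeps in the same order as the text:

```python
import numpy as np

# Example 3.3(a): Gauss-Seidel for u_xx + u_yy = 5x^2 y with u = 0 on the
# boundary and h = 1/3, so h^2 g(ih, jh) = 5 i^2 j / 243.
u = np.zeros((4, 4))                 # u[i, j] ~ u(i/3, j/3); boundary stays 0
for sweep in range(7):
    for j in (1, 2):                 # same order as the text: u11, u21, u12, u22
        for i in (1, 2):
            u[i, j] = 0.25 * (u[i-1, j] + u[i+1, j] + u[i, j-1] + u[i, j+1]
                              - 5 * i**2 * j / 243)

print(np.round([u[1, 1], u[2, 1], u[1, 2], u[2, 2]], 5))
# -> approximately [-0.02314, -0.04115, -0.03086, -0.05915] after 7 sweeps
```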