Acta Appl Math (2009) 106: 325–335 DOI 10.1007/s10440-008-9300-9
Numerical Solutions for Ito Coupled System A.M. Kawala
Received: 6 November 2007 / Accepted: 19 August 2008 / Published online: 10 October 2008 © Springer Science+Business Media B.V. 2008
Abstract In this paper, the classical fourth-order Runge-Kutta method and Heun's method are applied to initial value problems for the systems of ordinary differential equations, in nonlinear cases, which we reach by Painlevé analysis, focusing our interest on the Ito coupled nonlinear partial differential system. The equations are solved by one-step schemes. Numerical results are obtained and reported graphically at various times to show interesting aspects of the solutions.

Keywords Ito coupled system · Painlevé analysis · Fourth-order Runge-Kutta · Heun's method
1 Introduction

A numerical method is demonstrated for solving nonlinear problems. The study of nonlinear problems is of crucial importance in all areas of mathematics and physics. Some of the most interesting features of physical systems are hidden in their nonlinear behaviour, and can only be studied with appropriate methods designed to tackle nonlinear problems. Over the past several decades, many authors have paid attention to solutions of nonlinear equations obtained by various methods [1], among them Bäcklund transformations, the inverse scattering method, and the tanh-function method. The aim of this paper is to extend some iterations by one-step methods, based on the classical fourth-order Runge-Kutta method (CRK4) proposed in [2, 3] and Heun's method (HM) proposed in [3], to solve the nonlinear Ito coupled system [4], which is reduced by Painlevé analysis [6, 9] to a system of ordinary differential equations. We discuss the application of the methods to several cases and compare with results obtained previously by exact solution [10, 11].

A.M. Kawala, Mathematics Department, Faculty of Science, Helwan University, Cairo, Egypt. e-mail:
[email protected]
2 Application: The Ito Coupled System [4]

Consider the nonlinear Ito coupled system

$u_t = v_x$,
$v_t = -2(v_{xxx} + 3uv_x + 3vu_x) - 12ww_x$,   (1)
$w_t = w_{xxx} + 3uw_x$.

Let us consider a one-parameter Lie group of infinitesimal transformations [7, 8] of the form

$U = u + \varepsilon\eta_1(x,t,u,v,w)$,
$V = v + \varepsilon\eta_2(x,t,u,v,w)$,
$W = w + \varepsilon\eta_3(x,t,u,v,w)$,   (2)
$X = x + \varepsilon\zeta_1(x,t,u,v,w)$,
$T = t + \varepsilon\zeta_2(x,t,u,v,w)$,   $\varepsilon \ll 1$.

The infinitesimals $\eta_1$, $\eta_2$, $\eta_3$, $\zeta_1$ and $\zeta_2$ for (1) under transformation (2) are

$\eta_1 = -2au$,  $\eta_2 = -4av$,  $\eta_3 = -3aw$,  $\zeta_1 = ax + b$,  $\zeta_2 = 3at + c$,   (3)
where $a$, $b$ and $c$ are arbitrary constants. Solving the characteristic equation associated with the infinitesimal symmetries (3), one obtains:

Case 1 [4]. The similarity variables take the form

$z = \dfrac{ax+b}{(3at+c)^{1/3}}$,  $w_1 = (3at+c)^{2/3}\,u$,  $w_2 = (3at+c)^{4/3}\,v$,  $w_3 = (3at+c)\,w$.   (4)

Under the similarity transformation (4) we can reduce the system of PDEs (1) to a system of ODEs of the form

$2w_1 + zw_1' + w_2' = 0$,
$-4w_2 - zw_2' + 2a^2 w_2''' + 6w_1 w_2' + 6w_2 w_1' + 12 w_3 w_3' = 0$,   (5)
$3w_3 + zw_3' + a^2 w_3''' - 3w_1 w_3' = 0$.

Here the prime denotes differentiation with respect to $z$. Using Painlevé analysis of system (5), the solution of the system of ODEs is

$w_1 = \tfrac{1}{3}z$,  $w_2 = -\tfrac{1}{2}z^2$,  $w_3 = 0$,

and hence the solution of the system of PDEs (1) is

$u = \dfrac{ax+b}{3(3at+c)}$,  $v = -\dfrac{a^2x^2 + 2abx + b^2}{2(3at+c)^2}$,  $w = 0$.   (6)
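As a quick consistency check (a sketch, not part of the paper), the closed forms $w_1 = z/3$, $w_2 = -z^2/2$, $w_3 = 0$ can be substituted into system (5); the derivatives of these polynomials are coded by hand.

```python
# Residuals of the three ODEs in (5) evaluated on the Painlevé solution;
# a is the arbitrary constant from (3), set to 1.0 by default for illustration.

def residuals(z, a=1.0):
    w1, dw1 = z / 3, 1 / 3
    w2, dw2, d3w2 = -z**2 / 2, -z, 0.0          # w2'' = -1, so w2''' = 0
    w3 = dw3 = d3w3 = 0.0
    r1 = 2 * w1 + z * dw1 + dw2
    r2 = (-4 * w2 - z * dw2 + 2 * a**2 * d3w2
          + 6 * w1 * dw2 + 6 * w2 * dw1 + 12 * w3 * dw3)
    r3 = 3 * w3 + z * dw3 + a**2 * d3w3 - 3 * w1 * dw3
    return r1, r2, r3

for z in (1.0, 2.5, 5.0):
    assert all(abs(r) < 1e-9 for r in residuals(z))
```

All three residuals vanish (up to rounding) for every $z$ and every $a$, since $w_3 \equiv 0$ removes the third equation and the $a^2$ terms.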
In solving the characteristic equation associated with the infinitesimal symmetries (3), the derivation of the general similarity reduction assumed $a \neq 0$ and $c \neq 0$. We obtain the generic sub-cases:

Case 2: $a = b = 0$, $c$ arbitrary [4]. The similarity variables take the form

$z = x$,  $w_1 = u$,  $w_2 = v$,  $w_3 = w$.   (7)

Under the similarity transformation (7) we can reduce the system of PDEs (1) to a system of ODEs of the form

$w_2' = 0$,
$2w_2''' + 6w_1 w_2' + 6w_2 w_1' + 12 w_3 w_3' = 0$,   (8)
$w_3''' - 3w_1 w_3' = 0$.

Here the prime denotes differentiation with respect to $z$. Using Painlevé analysis of system (8), the solution of the system of PDEs (1) is

$u = \dfrac{2}{x^2}$,  $v = -\dfrac{1}{2}c_0^2$,  $w = \dfrac{c_0}{x}$.   (9)
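The closed forms (9) can likewise be checked against the stationary system (8); a sketch, with the free constant $c_0 = 0.01$ chosen purely for illustration and the exact derivatives coded by hand.

```python
# Residuals of the three ODEs in (8) on the solution u = 2/x**2,
# v = -c0**2/2, w = c0/x; c0 is an illustrative choice.

def residuals(x, c0=0.01):
    w1, dw1 = 2 / x**2, -4 / x**3
    w2, dw2, d3w2 = -c0**2 / 2, 0.0, 0.0        # v is constant, so w2' = 0
    w3, dw3, d3w3 = c0 / x, -c0 / x**2, -6 * c0 / x**4
    r1 = dw2
    r2 = 2 * d3w2 + 6 * w1 * dw2 + 6 * w2 * dw1 + 12 * w3 * dw3
    r3 = d3w3 - 3 * w1 * dw3
    return r1, r2, r3

for x in (2.0, 5.0):
    assert all(abs(r) < 1e-9 for r in residuals(x))
```

In the second residual the two nonzero terms, $6w_2w_1' = 12c_0^2/x^3$ and $12w_3w_3' = -12c_0^2/x^3$, cancel exactly.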
Case 3: $b = c = 0$, $a$ arbitrary [4]. The similarity variables take the form

$z = \dfrac{x}{t^{1/3}}$,  $w_1 = t^{2/3}\,u$,  $w_2 = t^{4/3}\,v$,  $w_3 = t\,w$.   (10)

Under the similarity transformation (10) we can reduce the system of PDEs (1) to a system of ODEs of the form

$2w_1 + zw_1' + 3w_2' = 0$,
$-4w_2 - zw_2' + 6w_2''' + 18w_1 w_2' + 18w_2 w_1' + 36 w_3 w_3' = 0$,   (11)
$3w_3 + zw_3' + 3w_3''' - 9w_1 w_3' = 0$.

We apply the Ablowitz-Ramani-Segur (ARS) algorithm to the system of ODEs (11). Then we get

$w_1 = \tfrac{1}{9}z$,  $w_2 = -\tfrac{1}{18}z^2$,  $w_3 = 0$,

and hence the solution of the system of PDEs (1) is

$u = \dfrac{x}{9t}$,  $v = -\dfrac{x^2}{18t^2}$,  $w = 0$.   (12)
3 Classical Fourth-Order Runge-Kutta Method

To solve the system of ODEs (5), we consider

$y^{(n)} = f(t, y, y', y'', \ldots, y^{(n-1)})$,  $y(t_0) = y_0$,  $y'(t_0) = y_0'$, \ldots, $y^{(n-1)}(t_0) = y_0^{(n-1)}$,
which can be solved for its highest derivative $y^{(n)}$ and, with its initial conditions, transformed by the simple strategy called degree reduction [5, 12] from an $n$th-order IVP into a first-order coupled system. To solve the resulting system by means of the fourth-order Runge-Kutta method [2, 3], we use the scheme

$y_{n+1} = y_n + \dfrac{h}{6}(k_1 + 2k_2 + 2k_3 + k_4)$,   (13)

where

$k_1 = f(x_n, y_n)$,
$k_2 = f\left(x_n + \tfrac{h}{2},\ y_n + \tfrac{h}{2}k_1\right)$,
$k_3 = f\left(x_n + \tfrac{h}{2},\ y_n + \tfrac{h}{2}k_2\right)$,
$k_4 = f(x_n + h,\ y_n + hk_3)$.
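The scheme (13) can be sketched for a general first-order vector system $y' = f(z, y)$; the helper `rk4_step` and the demo problem $y' = y$ below are illustrative choices, not from the paper.

```python
import math

# One classical fourth-order Runge-Kutta step (13) for a system given as a
# list of components; f maps (z, y) to the list of derivatives.
def rk4_step(f, z, y, h):
    k1 = f(z, y)
    k2 = f(z + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(z + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(z + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

# Demo on y' = y, y(0) = 1: one step of size 0.1 is O(h^5)-close to e**0.1.
y = rk4_step(lambda z, y: [y[0]], 0.0, [1.0], 0.1)
assert abs(y[0] - math.exp(0.1)) < 1e-6
```

The local truncation error is $O(h^5)$, consistent with the fourth-order accuracy quoted for CRK4.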
4 Heun's Method

To solve the previous system (5) by means of Heun's method [3], we use the scheme

$\bar y_{n+1} = \bar y_n + \dfrac{h}{2}\left[\bar f(t_n, \bar y_n) + \bar f(t_{n+1}, \bar y_p)\right]$,   (14)

where $\bar y_p = \bar y_n + h\bar f(t_n, \bar y_n)$. In our study, we investigate the three cases of the Ito coupled system.

5 Numerical Solution for the Reduced System of the Ito Coupled System (Case 1)

To illustrate the fourth-order Runge-Kutta iteration for solving the system (5), we start with initial values at $z = 5$ given by the initial approximation (6), and use (13), where, with $y_1 = w_1$, $y_2 = w_2$, $y_3 = w_2'$, $y_4 = w_2''$, $y_5 = w_3$, $y_6 = w_3'$, $y_7 = w_3''$,

$y_{0,j} = (y_1, y_2, y_3, y_4, y_5, y_6, y_7)^T = (1.666666667,\ -12.5,\ -5,\ -1,\ 0,\ 0,\ 0)^T$,

$\bar f(z, \bar y) = \left( \dfrac{-2y_1 - y_3}{z},\ \ y_3,\ \ y_4,\ \ \dfrac{4y_2 + zy_3 - 6y_1y_3 - 6y_2\frac{-2y_1 - y_3}{z} - 12y_5y_6}{2a^2},\ \ y_6,\ \ y_7,\ \ \dfrac{-3y_5 - zy_6 + 3y_1y_6}{a^2} \right)^T$,

$z_n = z_0 + nh$,  $j = 1, 2$,
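Under the first-order reduction above, with $a = 1$ chosen for illustration, the right-hand side $\bar f$ can be sketched and checked against the exact data (6) at $z = 5$.

```python
# Right-hand side of the reduced case-1 system (5), written for
# y = (w1, w2, w2', w2'', w3, w3', w3''); a = 1.0 is an illustrative choice.
def f(z, y, a=1.0):
    y1, y2, y3, y4, y5, y6, y7 = y
    dy1 = (-2 * y1 - y3) / z
    dy4 = (4 * y2 + z * y3 - 6 * y1 * y3 - 6 * y2 * dy1
           - 12 * y5 * y6) / (2 * a**2)
    dy7 = (-3 * y5 - z * y6 + 3 * y1 * y6) / a**2
    return [dy1, y3, y4, dy4, y6, y7, dy7]

# Exact initial data from (6) at z = 5 (w1 = z/3, w2 = -z**2/2, w3 = 0):
z0 = 5.0
y0 = [z0 / 3, -z0**2 / 2, -z0, -1.0, 0.0, 0.0, 0.0]

# On the exact solution, w1' = 1/3 and w2''' = 0, so the corresponding
# components of f should reproduce those values.
assert abs(f(z0, y0)[0] - 1 / 3) < 1e-9
assert abs(f(z0, y0)[3]) < 1e-9
```

Marching this `f` with `rk4_step` from Sect. 3, or with the Heun predictor-corrector (14), reproduces the iterations described in the text.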
By the iteration formula (13), the remaining components were obtained in the same manner using the Maple software package [3] on a personal computer. To illustrate Heun's iteration for solving the system (5), we start with initial values at $z = 5$ given by the initial approximation (6), and use (14), where

$\bar y_p = \left( y_1 + h\dfrac{-2y_1 - y_3}{z},\ \ y_2 + hy_3,\ \ y_3 + hy_4,\ \ y_4 + h\dfrac{4y_2 + zy_3 - 6y_1y_3 - 6y_2\frac{-2y_1 - y_3}{z} - 12y_5y_6}{2a^2},\ \ y_5 + hy_6,\ \ y_6 + hy_7,\ \ y_7 + h\dfrac{-3y_5 - zy_6 + 3y_1y_6}{a^2} \right)^T.$

By the iteration formula (14), the remaining components were obtained in the same manner using the Maple software package [3]. In order to verify numerically whether the proposed methodology leads to high accuracy, we evaluate the numerical solutions using $n$th approximations; in most cases the $n$th approximation $y^{(n)}$ is accurate for $n = 20$. The behaviour of the solutions obtained by fourth-order Runge-Kutta, Heun's method, and the analytic solution is shown for the given value of $h$, selected values of $z$, and different times in Figs. 1 and 2, respectively. The numerical results are given along with Richardson's error [2, 10, 11], estimated by

$y_j(z_n) - y_{j,k}(z_n) \approx y_{j,k}(z_n) - y_{j,2k}(z_n)$,  $j = 1, 2$.

Fig. 1 Comparison between the analytic solution of system (5) and the numerical solutions at several times, with $h = 0.5$

Fig. 2 Comparison between the analytic solution of system (5) and the numerical solutions at several times, with $h = 0.5$
Fig. 3 Comparison between the analytic solution of system (5) and the two numerical solutions at several times and step sizes

Fig. 4 Comparison between the analytic solution of system (5) and the numerical solutions at several times and step sizes

From these results we conclude that, although the Runge-Kutta method exhibits some truncation error, it captures the general shape of the analytical solution of system (5). With Heun's method the solution manifests oscillations in some intervals. To improve the solution, we reduce the step size to $h = 0.05$ and then halve it to $h = 0.025$. The behaviour of the solutions obtained by fourth-order Runge-Kutta, Heun's method, and the analytic solution is shown for these values of $h$, selected values of $z$, and different times in Figs. 3 and 4, respectively. Halving the step size reduces, as expected, the absolute errors, by between 25.05% and 25.45% for several times (in CRK4) and by between 3.55% and 14.28% for several times (in Heun's method). From these results we conclude that the fourth-order Runge-Kutta and Heun's methods give remarkable accuracy in comparison with our analytic solution (6). The behaviour of the true and estimated truncation errors obtained by the numerical methods and the analytic solution is shown for this value of $h$ and selected values of $z$ in Fig. 5; the estimate is calculated by Richardson extrapolation [2],

$y(z_n) - y_h(z_n) \approx \dfrac{1}{2^p - 1}\left(y_h(z_n) - y_{2h}(z_n)\right)$,

where $p$ is the order of the method.
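The Richardson estimate, $y(z_n) - y_h(z_n) \approx (y_h(z_n) - y_{2h}(z_n))/(2^p - 1)$ for a method of order $p$, can be illustrated on a toy problem; Euler's method ($p = 1$) on $y' = y$, $y(0) = 1$, integrated to $z = 1$, is an illustrative choice, not the paper's solver.

```python
import math

# Forward Euler integration of y' = y from y(0) = 1 over `steps` steps.
def euler(h, steps):
    y = 1.0
    for _ in range(steps):
        y += h * y
    return y

y_h, y_2h = euler(0.01, 100), euler(0.02, 50)      # both reach z = 1
estimate = (y_h - y_2h) / (2**1 - 1)               # Richardson, p = 1
true_err = math.e - y_h

# The estimate should agree with the true error to leading order.
assert abs(estimate - true_err) / abs(true_err) < 0.05
```

For this run the estimate agrees with the true error to within a few percent, which is the leading-order accuracy one expects of the extrapolation.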
Fig. 5 Comparison of true and estimated truncation error for numerical methods where step size is 0.05
6 Numerical Solution for the Reduced System of the Ito Coupled System (Case 2)

To illustrate the fourth-order Runge-Kutta iteration for solving the system (8), we start with initial values at $z = 5$ given by the initial approximation (9) in [4], and use (13), where, with $y_1 = w_1$, $y_2 = w_2$, $y_3 = w_3$, $y_4 = w_3'$, $y_5 = w_3''$,

$y_{0,j} = (y_1, y_2, y_3, y_4, y_5)^T = (0.08,\ -5\times10^{-5},\ 2\times10^{-3},\ -4\times10^{-4},\ 1.6\times10^{-4})^T$,

$\bar f(z, \bar y) = \left( \dfrac{-2y_3y_4}{y_2},\ \ 0,\ \ y_4,\ \ y_5,\ \ 3y_1y_4 \right)^T$,  $z_n = z_0 + nh$,  $j = 1, 3$,

and by the iteration formula (13) the remaining components were obtained in the same manner using the Maple software package [3]. To illustrate Heun's iteration for solving the system (8), we start with initial values at $z = 5$ given by the initial approximation (9), and use (14), where

$\bar y_p = \left( y_1 + h\dfrac{-2y_3y_4}{y_2},\ \ y_2,\ \ y_3 + hy_4,\ \ y_4 + hy_5,\ \ y_5 + h(3y_1y_4) \right)^T$

(the second component is unchanged since $\bar f_2 = 0$). By the iteration formula (14), the remaining components were obtained in the same manner using the Maple software package [3]. In order to verify numerically whether the proposed methodology leads to high accuracy, we evaluate the numerical solutions using $n$th approximations; in most cases the $n$th approximation $y^{(n)}$ is accurate for $n = 20$. The behaviour of the solutions obtained by fourth-order Runge-Kutta, Heun's method, and the analytic solution is shown for $h = 0.05$ and for the halved step size $h = 0.025$, at selected values of $z$, in Figs. 6 and 7, respectively. Halving the step size reduces, as expected, the absolute errors, by between 25.05% and 25.45% for several times (in CRK4) and by between 3.55% and 14.28% for several times (in Heun's method). From these results we conclude that the fourth-order Runge-Kutta and Heun's methods give remarkable accuracy in comparison with our analytic solution (9). The behaviour of the true and estimated truncation errors obtained by the numerical methods and the analytic solution is shown for this value of $h$ and selected values of $z$ in Fig. 8, calculated by the Richardson extrapolation error [2].
Fig. 6 Comparison between the analytical solution of system (8) and the two numerical solutions at several step sizes

Fig. 7 Comparison between the analytical solution of system (8) and the two numerical solutions at several step sizes
Fig. 8 Comparison of true and estimated truncation error for numerical methods where step size is 0.05
7 Numerical Solution for the Reduced System of the Ito Coupled System (Case 3)

To illustrate the fourth-order Runge-Kutta iteration for solving the system (11), we start with initial values at $z = 5$ given by the initial approximation (12) in [4], and use (13), where, with $y_1 = w_1$, $y_2 = w_2$, $y_3 = w_2'$, $y_4 = w_2''$, $y_5 = w_3$, $y_6 = w_3'$, $y_7 = w_3''$,

$y_{0,j} = (y_1, \ldots, y_7)^T = (0.555555555,\ -1.388888889,\ -0.555555555,\ -0.111111111,\ 0,\ 0,\ 0)^T$,

$\bar f(z, \bar y) = \left( \dfrac{-2y_1 - 3y_3}{z},\ \ y_3,\ \ y_4,\ \ \dfrac{4y_2 + zy_3 - 18y_1y_3 - 18y_2\frac{-2y_1 - 3y_3}{z} - 36y_5y_6}{6},\ \ y_6,\ \ y_7,\ \ \dfrac{-3y_5 - zy_6 + 9y_1y_6}{3} \right)^T$,

$z_n = z_0 + nh$,  $j = 1, 2$,

and by the iteration formula (13) the remaining components were obtained in the same manner using the Maple software package [3]. To illustrate Heun's iteration for solving the system (11), we start with initial values at $z = 5$ given by the initial approximation (12), and use (14), where

$\bar y_p = \left( y_1 + h\dfrac{-2y_1 - 3y_3}{z},\ \ y_2 + hy_3,\ \ y_3 + hy_4,\ \ y_4 + h\dfrac{4y_2 + zy_3 - 18y_1y_3 - 18y_2\frac{-2y_1 - 3y_3}{z} - 36y_5y_6}{6},\ \ y_5 + hy_6,\ \ y_6 + hy_7,\ \ y_7 + h\dfrac{-3y_5 - zy_6 + 9y_1y_6}{3} \right)^T.$

By the iteration formula (14), the remaining components were obtained in the same manner using the Maple software package [3]. In order to verify numerically whether the proposed methodology leads to high accuracy, we evaluate the numerical solutions using $n$th approximations; in most cases the $n$th approximation $y^{(n)}$ is accurate for $n = 20$. The behaviour of the solutions obtained by fourth-order Runge-Kutta, Heun's method, and the analytic solution is shown for $h = 0.05$ and for the halved step size $h = 0.025$, at selected values of $z$ and different times, in Figs. 9 and 10, respectively. Halving the step size reduces, as expected, the absolute errors, by between 5.21% and 81.83% for several times (in CRK4) and by between 25.03% and 25.81% for several times (in Heun's method). From these results we conclude that the fourth-order Runge-Kutta and Heun's methods give remarkable accuracy in comparison with our analytic solution (12). The behaviour of the true and estimated truncation errors obtained by the numerical methods and the analytic solution is shown for this value of $h$ and selected values of $z$ in Fig. 11, calculated by the Richardson extrapolation error [2].

Fig. 9 Comparison between the analytic solution of system (11) and the two numerical solutions at several times and step sizes

Fig. 10 Comparison between the analytic solution of system (11) and the two numerical solutions at several times and step sizes

Fig. 11 Comparison of true and estimated truncation errors for the numerical methods where the step size is 0.05
8 Conclusions

In this paper the fourth-order Runge-Kutta and Heun's methods have been successfully used to find solutions of the Ito coupled nonlinear partial differential system. The results reported here provide further evidence of the usefulness of these methods for solving the Ito coupled system. Numerical results were obtained for the $n$th approximation and compared with the known analytical solutions; we achieve an excellent approximation to the actual solution of the equations using $n = 20$ iterations. For both CRK4 and Heun's method, the step size $h = 0.05$ lies at the stability limit: when the step size is halved to $h = 0.025$, or otherwise reduced below the stability limit $h = 0.05$, the numerical solutions become more stable, while when the step size is increased to $h = 0.5$, above the stability limit, the solution manifests oscillations. Richardson extrapolation was used to estimate the errors and to accelerate the convergence. The solutions obtained are shown graphically. In our work we used the Maple package to compute the iterations obtained from the fourth-order Runge-Kutta and Heun's methods.
References

1. Ames, W.F.: Numerical Methods for Partial Differential Equations. Academic Press, New York (1992)
2. Atkinson, K.: Elementary Numerical Analysis. Wiley, New York (1985)
3. Chapra, S.C., Canale, R.P.: Numerical Methods for Engineers. McGraw-Hill, New York (2003)
4. El-Hadidi, A.M.: Lie symmetry and Painlevé analysis for the nonlinear Ito coupled system. J. Sci. Ain Shams Univ. 41/4 (2006)
5. Maron, M.J.: Numerical Analysis: A Practical Approach. Macmillan, London (1987)
6. Zayed, E.M.E., Zedan, H.A.: On the solutions of the nonlinear Schrödinger equation. Chaos Solitons Fractals 16, 133–145 (2003)
7. Zayed, E.M.E., Zedan, H.A., Gepreel, K.A.: On the solitary wave solutions for nonlinear Hirota-Satsuma coupled KdV equations. Chaos Solitons Fractals 22, 285–303 (2004)
8. Zayed, E.M.E., Zedan, H.A., Gepreel, K.A.: On the solitary wave solutions for nonlinear Euler equations. Appl. Anal. 11, 1101–1132 (2004)
9. Zayed, E.M.E., Zedan, H.A., Gepreel, K.A.: Group analysis and modified extended tanh-function to find the invariant solutions and soliton solutions for nonlinear Euler equations. Int. J. Nonlinear Sci. Numer. Simul. 5, 221–234 (2004)
10. Zedan, H.A., Kawala, A.M.: Classes of solution for a system of one-dimensional motion of a gas. Appl. Math. Comput. 142, 271–282 (2003)
11. Zedan, H.A., Kawala, A.M.: The symmetry and numerical solutions for the Euler's equations. Int. J. Nonlinear Sci. Numer. Simul. (2007, submitted)
12. Zill, D.G.: A First Course in Differential Equations with Modeling Applications. Brooks/Cole, Pacific Grove (2001)
Acta Appl Math (2009) 106: 337–348 DOI 10.1007/s10440-008-9301-8
Inner Product Space and Concept Classes Induced by Bayesian Networks Youlong Yang · Yan Wu
Received: 18 July 2008 / Accepted: 21 August 2008 / Published online: 12 September 2008 © Springer Science+Business Media B.V. 2008
Abstract Bayesian networks have become one of the major models used for statistical inference. In this paper we discuss the properties of the inner product spaces and concept classes induced by some special Bayesian networks, and the problem of whether there exists a Bayesian network such that the lower bound on the dimension of the induced inner product space is exactly a given positive integer. We focus on two-label classification tasks over the Boolean domain. As main results we show that the lower bound on the dimension of the inner product space induced by a class of Bayesian networks without v-structures is $\sum_{i=1}^{n} 2^{m_i} + 1$, where $m_i$ denotes the number of parents of the $i$th variable. When the number of variables of the Bayesian network is $n \le 5$, we also show that for each integer $m \in [n+1, 2^n - 1]$ there is a Bayesian network $N$ such that the VC dimension of the concept class and the lower bound on the dimension of the inner product space induced by $N$ are both $m$.

Keywords Bayesian networks · VC dimension · Inner product space · Concept classes
1 Introduction

A Bayesian network is a statistical model that uses a directed acyclic graph to represent the conditional independence structure between collections of random variables. As a class of probabilistic graphical models, Bayesian networks have seen significant theoretical development within areas such as Artificial Intelligence and Statistics [1–7] in recent years. They are used frequently in machine learning and in many application fields. Studying
This work was supported by National Natural Science Foundation of China (60574075). Y. Yang () · Y. Wu School of Science, Xidian University, Xi’an 710071, China e-mail:
[email protected] Y. Yang e-mail:
[email protected] Y. Wu e-mail:
[email protected]
their complexity is quite important, since the capacity measures given above are required for many theoretical analyses of inference methods. For example, from generalization bounds one can conclude that the size of the training set required to ensure good generalization scales linearly with the VC dimension. Bayesian networks are particularly useful for dealing with high-dimensional statistical problems. In recent years, there has been remarkable interest in learning systems based on kernel methods [8–10] and probabilistic graphical models, for example [11–13] and [14]. Altun et al. [14] proposed a kernel for the Hidden Markov Model, which is a special case of a Bayesian network. In this paper, we consider Bayesian networks as computational models that perform two-label classification tasks over the Boolean domain. We aim at finding the simplest inner product space that is able to express the concept class, that is, the class of decision functions, induced by a given Bayesian network. Here, simplest refers to a space with as few dimensions as possible. Given a Bayesian network $N$, let $\mathrm{Edim}(N)$ [12] denote the smallest dimension $d$ such that the decisions represented by $N$ can be implemented as inner products in the $d$-dimensional Euclidean space. Nakamura et al. [12] establish upper and lower bounds on the dimension of the inner product space for Bayesian networks (see Theorem 2.1). We (Yang et al. [13]) have shown that for the full Bayesian network and the almost full Bayesian network with $n$ variables, the VC dimension and the smallest dimension of the inner product space of the concept classes induced by them are $2^n - 1$. We [13] have also obtained that the VC dimension and the smallest dimension of the inner product space of the concept classes induced by a Bayesian network with $n$ variables belong to the closed interval $[n+1, 2^n - 1]$.
One contribution of this paper is to show that the lower bound on the dimension of the inner product space induced by some Bayesian networks without v-structures is $\sum_{i=1}^{n} 2^{m_i} + 1$, where $m_i$ denotes the number of parents of the $i$th variable. Moreover, when the number of variables of the Bayesian network is $n \le 5$, for each integer $m \in [n+1, 2^n - 1]$ there is a Bayesian network $N_m$ such that the VC dimension of the concept class and the lower bound on the dimension of the inner product space induced by $N_m$ are both $m$, that is, $\mathrm{VCdim}(N_m) = \mathrm{Edim}(N_m) = m$. The rest of this paper is organized as follows. In Sect. 2 we give some basic concepts and related results. In Sect. 3 we provide the main results and detailed proofs for the lower bound on the dimension of the inner product space induced by some special Bayesian networks. In Sect. 4 we demonstrate that for any positive integer $m \in [n+1, 2^n - 1]$, where $n \le 5$, there exists a Bayesian network $N_m$ with $n$ variables such that the VC dimension of the concept class and the lower bound on the dimension of the inner product space induced by $N_m$ are both $m$. Finally, we conclude the paper in Sect. 5.
2 The Basic Concepts and Related Results

A Bayesian network $N$ for a set of variables $V = \{X_1, X_2, \ldots, X_n\}$ represents a joint probability distribution over those variables. It consists of (1) a network structure $G$ that encodes assertions of conditional independence in the distribution, and (2) a set of conditional probability distributions $P$ corresponding to that structure. The network structure is a directed acyclic graph (DAG for short) such that each variable $X_i$ has a corresponding node $X_i$ in the structure. So let $(G, P)$ denote the Bayesian network $N$, where $G = (V, E)$, that is, $N = (G, P)$. A topological sort of the nodes (or variables) in a DAG $G = (V, E)$ is any total ordering of the nodes such that, for any pair of nodes $X_i$ and $X_j$ in $G$, if $X_i$ is an ancestor of $X_j$, then $X_i$ must precede $X_j$ in the ordering. We assume that every edge $(i, j) \in E$ satisfies $i < j$,
that is, $E$ induces a topological ordering on $X_1, X_2, \ldots, X_n$. Given $(i, j) \in E$, $X_i$ is called a parent of $X_j$ and $X_j$ is called a child of $X_i$. We use $PA_i$ to denote the set of parents of node $X_i$ and let $m_i = |PA_i|$ be the number of parents. Let $CH_i$ denote the set of children of node $X_i$. A network $N$ is said to be fully connected if $PA_i = \{X_1, \ldots, X_{i-1}\}$ holds for every node $X_i$. We associate with every node $X_i$ a Boolean variable $x_i$ with values in $\{0, 1\}$. Let $X = (X_1, X_2, \ldots, X_n)$ be an $n$-dimensional binary vector. Then $\{x \mid x = (x_1, x_2, \ldots, x_n)\} = \{0,1\}^n$. The class of distributions induced by $N$, denoted $D_N$, consists of all distributions on $\{0,1\}^n$ of the form

$P(X) = \prod_{i=1}^{n} p(X_i \mid PA_i)$,

where $P \in D_N$ and $p(X_i \mid PA_i)$ represents the conditional probability for the event $X_i$ given the parent variables $PA_i$. In fact, $P(X)$ is a product of conditional probabilities. Thus for every assignment of values from the open interval $(0, 1)$ to the parameters of $N$, one obtains a specific distribution from $D_N$. Therefore, in this paper, a distribution $P \in D_N$ has every parameter $0 < p(X_i \mid PA_i) < 1$, where $p(X_i \mid PA_i) \in P$. For each distribution $P \in D_N$, the total number of parameters, that is, the number of independent variables that express $P(X)$, is $\sum_{i=1}^{n} 2^{m_i}$. Thus we can use the real values $\bigcup_{i=1}^{n} \{p_{i,\alpha} = p(X_i = 1 \mid PA_i = \alpha) \mid \alpha \in \{0,1\}^{m_i}\}$ to denote the distribution $P$, that is,

$P = \bigcup_{i=1}^{n} \{p_{i,\alpha} = p(X_i = 1 \mid PA_i = \alpha) \mid \alpha \in \{0,1\}^{m_i}\}$,

and $P = \{p_{i,\alpha}\}$ for short. Let $X = (X_1, X_2, \ldots, X_n)$ denote a topological ordering on $n$ random variables. Further, let $X^i = \{X_{i_1}, X_{i_2}, \ldots, X_{i_r}\}$ denote a sub-topological ordering from $X$. For every $n$-dimensional vector $x = (x_1, x_2, \ldots, x_n)$ there is an $r$-dimensional vector $x^i = (x_{i_1}, x_{i_2}, \ldots, x_{i_r})$. Then let $x|_{X^i}$ denote the vector $x^i$. For instance, for a Bayesian network $N$ with $n$ random variables one has $x|_{PA_{X_i}} = (x_{i_1}, x_{i_2}, \ldots, x_{i_r})$ ($x|_i$ for short), where $PA_{X_i} = \{X_{i_1}, X_{i_2}, \ldots, X_{i_r}\}$ and $x = (x_1, x_2, \ldots, x_n)$. A concept class $C$ over a domain $\mathcal{X}$ is a family of functions of the form $f : \mathcal{X} \to \{1, -1\}$. Each $f \in C$ is called a concept. In fact, a concept over a domain $\mathcal{X}$ is a total Boolean function over $\mathcal{X}$. A finite set $S = \{s_1, s_2, \ldots, s_m\} \subseteq \mathcal{X}$ is said to be shattered by $C$ if for every $m$-dimensional binary vector $b \in \{1,-1\}^m$ there exists some concept $f \in C$ such that $f(s_i) = b_i$ for $i = 1, 2, \ldots, m$. Let $(S^+, S^-)$ denote a dichotomy of $S$ if $S^+ \cup S^- = S$ and $S^+ \cap S^- = \emptyset$. The Vapnik-Chervonenkis (VC) dimension of $C$ is given by

$\mathrm{VCdim}(C) = \sup\{m \mid \text{there is some } S \subseteq \mathcal{X} \text{ shattered by } C \text{ and } |S| = m\}$.

We use the sign function for mapping a real-valued function $g$ to a $\pm1$-valued concept $\mathrm{sign} \circ g$. Thus the class of concepts induced by $N$, denoted $C_N$, consists of all $\pm1$-valued functions on $\{0,1\}^n$ of the form $\mathrm{sign}(\log(P(X)/Q(X)))$ for $P, Q \in D_N$, where $P = \{p_{i,\alpha}\}$ and $Q = \{q_{i,\alpha}\}$. Note that the function $\mathrm{sign}(\log(P(X)/Q(X)))$ attains the value 1 if $P(X) \ge Q(X)$ and the value $-1$ otherwise. We use $\mathrm{VCdim}(N)$ to denote the VC dimension of $C_N$. We often say that a set $S \subseteq \{0,1\}^n$ is shattered by the Bayesian network $N$ when we mean that $S$ is shattered by the concept class induced by $N$. Thus

$D_N = \{P \mid P = \{p_{i,\alpha}\},\ p_{i,\alpha} \in (0, 1)\}$,
$C_N = \{f = \mathrm{sign}(\log(P(X)/Q(X))) \mid P, Q \in D_N\}$.
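The factorization $P(X) = \prod_i p(X_i \mid PA_i)$ can be illustrated on a toy network; the chain $X_1 \to X_2 \to X_3$ and the parameter values below are arbitrary choices in $(0,1)$, not from the paper.

```python
from itertools import product

p1 = 0.3                                   # p(X1 = 1)
p2 = {0: 0.6, 1: 0.2}                      # p(X2 = 1 | X1)
p3 = {0: 0.7, 1: 0.4}                      # p(X3 = 1 | X2)

def bern(p, x):                            # p(X = x) for a Bernoulli(p)
    return p if x == 1 else 1 - p

def joint(x1, x2, x3):                     # P(x) as a product of conditionals
    return bern(p1, x1) * bern(p2[x1], x2) * bern(p3[x2], x3)

# A legal distribution: the eight atoms of {0,1}^3 sum to 1.
assert abs(sum(joint(*x) for x in product((0, 1), repeat=3)) - 1) < 1e-12
```

Here the network has $\sum_i 2^{m_i} = 1 + 2 + 2 = 5$ free parameters $p_{i,\alpha}$, matching the count in the text.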
VC dimension is a measure of the capacity of a statistical classification algorithm in computational learning theory. Note that if the VC dimension is $m$, then there exists at least one set of $m$ data points that can be shattered, but it is not necessarily true that every set of $m$ data points can be shattered. For example, let $N$ be a Bayesian network with three variables $X_1, X_2, X_3$ and without any edge between them. Then $\mathrm{VCdim}(N) = \mathrm{Edim}(N) = 4$ by . . . . However, the subset $S = \{(1,1,1), (1,1,0), (1,0,1), (1,0,0)\}$ is not shattered by $C_N$, because $\mathrm{VCdim}(N|_{X_1=1}) = 3$, that is, given the condition $X_1 = 1$.

Let $u, v$ denote two vectors in the $d$-dimensional Euclidean space $\mathbb{R}^d$ with the standard dot product $u \cdot v = \sum_{i=1}^{d} u_i v_i$, where $u = (u_1, u_2, \ldots, u_d)$ and $v = (v_1, v_2, \ldots, v_d)$. It is an interesting topic to embed concept classes into finite-dimensional Euclidean spaces. A $d$-dimensional linear arrangement for a concept class $C$ over a domain $\mathcal{X}$ is given by collections $(u_f)_{f \in C}$ and $(v_x)_{x \in \mathcal{X}}$ of vectors in $\mathbb{R}^d$ such that

$\forall f \in C,\ x \in \mathcal{X}:\ f(x) = \mathrm{sign}(u_f \cdot v_x)$.

The smallest $d$ such that there exists a $d$-dimensional linear arrangement for $C$ is denoted $\mathrm{Edim}(C)$. If there is no finite-dimensional linear arrangement for $C$, $\mathrm{Edim}(C)$ is defined to be infinite. The following results about $\mathrm{Edim}(C)$ and $\mathrm{VCdim}(C)$ are easy to obtain from the definitions or from results in [12]. For more information see, for example, [12] and [13].

Theorem 2.1 Every Bayesian network $N$ with $n$ variables $X_1, \ldots, X_n$ satisfies

$\sum_{i=1}^{n} 2^{m_i} \le \mathrm{Edim}(N) \le \left|\bigcup_{i=1}^{n} 2^{PA_i \cup \{X_i\}}\right| \le 2\sum_{i=1}^{n} 2^{m_i}$,

where $PA_i$ denotes the set of parents of the $i$th variable $X_i$ and $m_i = |PA_i|$.

Theorem 2.2 Let $N_F$ be a full Bayesian network with $n$ variables. Then

$\mathrm{VCdim}(N_F) = \mathrm{Edim}(N_F) = \sum_{i=1}^{n} 2^{m_i} = 2^n - 1$.

Theorem 2.3 Let $N$ be a non-full Bayesian network with $n$ variables. Then

$\mathrm{VCdim}(N) \ge \sum_{i=1}^{n} 2^{m_i} + 1$.

Theorem 2.4 Let $N$ be a Bayesian network with $n$ variables. Then

$n + 1 \le \mathrm{VCdim}(N) \le \mathrm{Edim}(N) \le 2^n - 1$.
3 VC Dimension and Inner Product Space for Special Bayesian Networks

Definition 3.1 A Bayesian network with a well-defined Euclidean space $\mathbb{R}^d$ is a Bayesian network with the following constraint: for each $f = \mathrm{sign}(\log\frac{P(x)}{Q(x)})$ and $x \in \{0,1\}^n$, where $P, Q \in D_N$, there exist $d$-dimensional vectors $v_x$ and $u_f$ such that $v_x \cdot u_f = \log\frac{P(x)}{Q(x)}$.
Obviously, $\mathrm{Edim}(N) \le d$ if $N$ is a Bayesian network with a well-defined Euclidean space $\mathbb{R}^d$.

Lemma 3.1 Let $N$ be a Bayesian network with $n$ variables $X_1, X_2, \ldots, X_n$ and let $d = \left|\bigcup_{i=1}^{n} 2^{PA_i \cup \{X_i\}}\right|$. Then $N$ is a Bayesian network with a well-defined Euclidean space $\mathbb{R}^d$.

Proof For the $i$th variable $X_i$, let $PA_i = \{X_{i_1}, X_{i_2}, \ldots, X_{i_{m_i}}\}$. Then $|2^{PA_i \cup \{X_i\}}| = 2^{m_i+1}$. Without loss of generality, suppose that $2^{PA_i \cup \{X_i\}} = \{A_{i1}, A_{i2}, \ldots, A_{i2^{m_i+1}}\}$ with $|A_{ij}| \le |A_{ik}|$ if $j \le k$. Further, let $M_{A_{ij}}$ denote the algebraic expression on the elements of $A_{ij}$, $M_{A_{ij}} = \prod_{X \in A_{ij}} x$; for instance, $M_{A_{ij}} = x_{i_2} x_{i_3} x_{i_5}$ if $A_{ij} = \{X_{i_2}, X_{i_3}, X_{i_5}\}$. In particular, let $M_{A_{ij}} = 1$ if $A_{ij} = \emptyset$. For an $m_i$-dimensional vector $\alpha = (a_1, a_2, \ldots, a_{m_i}) \in \{0,1\}^{m_i}$, let $p_{i,\alpha}$ (or $q_{i,\alpha}$) denote the probability of $X_i = 1$ given the condition $X_{i_1} = a_1, X_{i_2} = a_2, \ldots, X_{i_{m_i}} = a_{m_i}$, that is,

$p_{i,\alpha} = p(X_i = 1 \mid X_{i_1} = a_1, \ldots, X_{i_{m_i}} = a_{m_i}) = p(X_i = 1 \mid PA_i = \alpha)$.

We use $M_\alpha$ to denote the algebraic expression $M_\alpha = \prod_{j=1}^{m_i} b_j$ on the letters $x_{i_1}, x_{i_2}, \ldots, x_{i_{m_i}}$, where $b_j = x_{i_j}$ if $a_j = 1$ and $b_j = 1 - x_{i_j}$ otherwise. Then

$\log\dfrac{p(x_i \mid PA_i)}{q(x_i \mid PA_i)} = \sum_{j=1}^{2^{m_i}} M_{\alpha_j}\left[x_i \log\dfrac{p_{i,\alpha_j}}{q_{i,\alpha_j}} + (1 - x_i)\log\dfrac{1 - p_{i,\alpha_j}}{1 - q_{i,\alpha_j}}\right]$.

Let $\bigcup_{i=1}^{n} 2^{PA_i \cup \{X_i\}} = \{A_1, A_2, \ldots, A_d\}$. Therefore, for each pair $P, Q \in D_N$ and every $x = (x_1, x_2, \ldots, x_n) \in \{0,1\}^n$, there are positive real numbers $r_1, r_2, \ldots, r_d$ such that

$\log\dfrac{P(x)}{Q(x)} = \log\left[\dfrac{p(x_1)}{q(x_1)} \cdot \dfrac{p(x_2 \mid PA_2)}{q(x_2 \mid PA_2)} \cdots \dfrac{p(x_n \mid PA_n)}{q(x_n \mid PA_n)}\right] = \sum_{i=1}^{n} \log\dfrac{p(x_i \mid PA_i)}{q(x_i \mid PA_i)}$

$= \sum_{i=1}^{n} \sum_{j=1}^{2^{m_i}} M_{\alpha_j}\left[x_i \log\dfrac{p_{i,\alpha_j}}{q_{i,\alpha_j}} + (1 - x_i)\log\dfrac{1 - p_{i,\alpha_j}}{1 - q_{i,\alpha_j}}\right]$

$= \sum_{i=1}^{d} M_{A_i} \log r_i = (M_{A_1}, M_{A_2}, \ldots, M_{A_d}) \cdot (\log r_1, \log r_2, \ldots, \log r_d)$.

Hence, for each $f = \mathrm{sign}(\log\frac{P(x)}{Q(x)})$ and $x = (x_1, x_2, \ldots, x_n) \in \{0,1\}^n$, where $P, Q \in D_N$, there exist $d$-dimensional vectors $v_x = (M_{A_1}, M_{A_2}, \ldots, M_{A_d})$ and $u_f = (\log r_1, \log r_2, \ldots, \log r_d)$ such that $v_x \cdot u_f = \log\frac{P(x)}{Q(x)}$. □

Fig. 1 Eight Bayesian networks with $n$ variables, where $n \le 3$. The numbers in parentheses denote $\mathrm{VCdim}(N_i)$ and $\mathrm{Edim}(N_i)$
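The data embedding $v_x = (M_{A_1}, \ldots, M_{A_d})$ from the proof of Lemma 3.1 can be sketched concretely; `feature_map` below is an illustrative helper that lists the monomials $M_A = \prod_{X_i \in A} x_i$ over $A \in \bigcup_i 2^{PA_i \cup \{X_i\}}$ (with $M_\emptyset = 1$).

```python
from itertools import combinations

def powerset(s):
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def feature_map(parents):
    """parents: dict node -> set of parent nodes.
    Returns the index sets A_1..A_d (sorted by size) and the map x -> v_x."""
    union = set()
    for node, pa in parents.items():
        union.update(powerset(pa | {node}))
    sets = sorted(union, key=lambda A: (len(A), sorted(A)))
    def v(x):                     # x: dict node -> 0/1; M_A = product of x_i, i in A
        return [int(all(x[i] for i in A)) for A in sets]
    return sets, v

# Chain X1 -> X2 -> X3: d = 6 monomials 1, x1, x2, x3, x1*x2, x2*x3.
sets, v = feature_map({1: set(), 2: {1}, 3: {2}})
assert len(sets) == 6
assert v({1: 1, 2: 1, 3: 0}) == [1, 1, 1, 0, 1, 0]
```

Any $\log(P(x)/Q(x))$ for this chain is then a fixed linear functional of $v_x$, which is exactly the linear arrangement the lemma constructs.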
Theorem 3.1 Let $N$ be a non-full Bayesian network with $n$ variables such that $\sum_{i=1}^{n} 2^{m_i} + 1 = \left|\bigcup_{i=1}^{n} 2^{PA_i \cup \{X_i\}}\right|$. Then $\mathrm{VCdim}(N) = \mathrm{Edim}(N) = \sum_{i=1}^{n} 2^{m_i} + 1$.

Proof By Lemma 3.1, $\mathrm{Edim}(N) \le \left|\bigcup_{i=1}^{n} 2^{PA_i \cup \{X_i\}}\right| = \sum_{i=1}^{n} 2^{m_i} + 1$, while Theorem 2.3 gives $\mathrm{VCdim}(N) \ge \sum_{i=1}^{n} 2^{m_i} + 1$; since $\mathrm{VCdim}(N) \le \mathrm{Edim}(N)$, the theorem holds. □

Corollary 3.1 Let $N_{k1}$ be a Bayesian network with $n$ variables and $PA_2 = \cdots = PA_k = \{X_1\}$, where $n > 2$ and $2 \le k \le n$. Then $\mathrm{VCdim}(N_{k1}) = \mathrm{Edim}(N_{k1}) = n + k = \sum_{i=1}^{n} 2^{m_i} + 1$.

Proof The corollary follows from Theorem 3.1. □
A V-structure in a DAG G is an ordered triple of nodes (A, B, C) such that (1) G contains the arcs A → B and B ← C, and (2) A and C are not adjacent in G. For example, the Bayesian network N_7 in Fig. 1 contains a V-structure.

Lemma 3.2 Let N be a Bayesian network with n variables. Then N contains no V-structure if and only if Σ_{i=1}^n 2^{m_i} + 1 = |⋃_{i=1}^n 2^{PA_i ∪ {X_i}}|.

Proof Clearly the lemma holds for n ≤ 3 (see Fig. 1). Suppose that it holds for a Bayesian network N' with n − 1 variables. The remainder of the proof demonstrates that it holds for the Bayesian network N = N' ∪ {X_n}.

'Only if': Let X_{n_i}, X_{n_j} ∈ PA_n denote two parent variables of X_n. Then (X_{n_i}, X_n, X_{n_j}) is not a V-structure in N, that is, there exists an edge between X_{n_i} and X_{n_j}. Therefore the variables in PA_n ∪ {X_n} are fully connected. Further, one has 2^{PA_n ∪ {X_n}} = 2^{PA_n} ∪ {S ∪ {X_n} | S ∈ 2^{PA_n}}. It is easy to see that 2^{PA_n} ⊆ ⋃_{i=1}^{n−1} 2^{PA_i ∪ {X_i}} and |{S ∪ {X_n} | S ∈ 2^{PA_n}}| = 2^{m_n}. So one has Σ_{i=1}^n 2^{m_i} + 1 = |⋃_{i=1}^n 2^{PA_i ∪ {X_i}}|.

'If': Assume that a V-structure (X_{n_i}, X_n, X_{n_j}) arises when the variable X_n is added to the Bayesian network N'. Then there is no edge between X_{n_i} and X_{n_j} in N', that is, {X_{n_i}, X_{n_j}} ∉ ⋃_{i=1}^{n−1} 2^{PA_i ∪ {X_i}}. Because 2^{PA_n ∪ {X_n}} = 2^{PA_n} ∪ {S ∪ {X_n} | S ∈ 2^{PA_n}} with |{S ∪ {X_n} | S ∈ 2^{PA_n}}| = 2^{m_n}, and the set {X_{n_i}, X_{n_j}} ∈ 2^{PA_n} contributes an extra element, one has (Σ_{i=1}^n 2^{m_i} + 1) + 1 ≤ |⋃_{i=1}^n 2^{PA_i ∪ {X_i}}|, which implies a contradiction. Therefore N contains no V-structure. □

Theorem 3.2 Let N be a non-full Bayesian network with n variables containing no V-structure. Then VCdim(N) = Edim(N) = Σ_{i=1}^n 2^{m_i} + 1.

Proof The theorem holds by Lemma 3.2 and Theorem 3.1. □
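The equivalence in Lemma 3.2 can be tested mechanically on small DAGs. The sketch below is our illustration (the two example graphs, a collider X_1 → X_3 ← X_2 and a chain X_1 → X_2 → X_3, are ours): it detects V-structures and checks that Σ_i 2^{m_i} + 1 = |⋃_i 2^{PA_i ∪ {X_i}}| holds exactly when no V-structure is present.

```python
from itertools import combinations

def has_v_structure(parents):
    """parents: dict node -> set of parent nodes of a DAG."""
    adjacent = {frozenset((u, v)) for v, ps in parents.items() for u in ps}
    for child, ps in parents.items():
        for a, c in combinations(sorted(ps), 2):
            if frozenset((a, c)) not in adjacent:
                return True          # a -> child <- c with a, c non-adjacent
    return False

def counts(parents):
    """Return (Sum_i 2^{m_i} + 1, |union_i 2^{PA_i u {X_i}}|)."""
    union = set()
    for x, ps in parents.items():
        fam = sorted(ps | {x})
        for r in range(len(fam) + 1):
            union |= {frozenset(s) for s in combinations(fam, r)}
    formula = sum(2 ** len(ps) for ps in parents.values()) + 1
    return formula, len(union)

collider = {1: set(), 2: set(), 3: {1, 2}}   # X1 -> X3 <- X2
chain    = {1: set(), 2: {1}, 3: {2}}        # X1 -> X2 -> X3

for g in (collider, chain):
    formula, union_size = counts(g)
    # Lemma 3.2: equality holds exactly when there is no V-structure.
    assert (formula == union_size) == (not has_v_structure(g))
print("Lemma 3.2 equality test matches V-structure presence")
```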
Inner Product Space and Concept Classes Induced by Bayesian
343
Let N_AF denote the Bayesian network obtained by deleting the edge X_1 → X_2 from the full Bayesian network N_F; we call N_AF the almost-full Bayesian network. Obviously, there are V-structures X_1 → X_i ← X_2 (3 ≤ i ≤ n) in N_AF. Nevertheless, one has the following Theorem 3.3.

Theorem 3.3 (See [13]) Let N_AF be an almost-full Bayesian network with n variables. Then VCdim(N_AF) = Edim(N_AF) = 2^n − 1 = Σ_{i=1}^n 2^{m_i} + 1.
For the Bayesian networks N_1, N_3 and N_8 in Fig. 1, Theorem 2.2 gives VCdim(N_1) = Edim(N_1) = 2^1 − 1 = 1, VCdim(N_3) = Edim(N_3) = 2^2 − 1 = 3 and VCdim(N_8) = Edim(N_8) = 2^3 − 1 = 7. Applying Theorem 3.3, VCdim(N_7) = Edim(N_7) = 2^3 − 1 = 7 is obtained. And according to Theorem 3.2, one has

VCdim(N_2) = Edim(N_2) = 3, VCdim(N_4) = Edim(N_4) = 4, VCdim(N_5) = Edim(N_5) = 5, VCdim(N_6) = Edim(N_6) = 6.

Therefore, for n ≤ 3 and each positive integer m ∈ [n + 1, 2^n − 1] there exists some Bayesian network N such that VCdim(N) = Edim(N) = m.
4 The Existence of Bayesian Networks

Lemma 4.1 Let N' be a Bayesian network with n − 1 variables and a well-defined Euclidean space R^d. Suppose that N = N' ∪ {X_n} is a Bayesian network with PA_n = {X_{r_1}, X_{r_2}, ..., X_{r_i}}. Then N has a well-defined Euclidean space R^{d + 2^i} if the Bayesian network constituted by PA_n is a full Bayesian network.

Proof One has 2^{PA_n} ⊆ ⋃_{i=1}^{n−1} 2^{PA_i ∪ {X_i}} because the Bayesian network constituted by PA_n is a full Bayesian network. Then

⋃_{i=1}^n 2^{PA_i ∪ {X_i}} − ⋃_{i=1}^{n−1} 2^{PA_i ∪ {X_i}} = {S ∪ {X_n} | S ∈ 2^{PA_n}}, and
|{S ∪ {X_n} | S ∈ 2^{PA_n}}| = |2^{PA_n ∪ {X_n}}| − |2^{PA_n}| = 2^{i+1} − 2^i = 2^i.

It is easy to see that the lemma holds by the proof of Lemma 3.1. □

Theorem 4.1 Let N' be a non-full Bayesian network with n − 1 variables X_1, X_2, ..., X_{n−1} and Σ_{i=1}^{n−1} 2^{m_i} + 1 = |⋃_{i=1}^{n−1} 2^{PA_i ∪ {X_i}}|. Suppose that N = N' ∪ {X_n} is a Bayesian network with PA_n = {X_{r_1}, X_{r_2}, ..., X_{r_i}}. Then VCdim(N) = Edim(N) = Σ_{i=1}^n 2^{m_i} + 1 if the Bayesian network constituted by PA_n is a full Bayesian network.

Proof The theorem holds by Lemma 4.1, Theorems 2.3 and 3.1. □
Fig. 2 Three Bayesian networks with N^4 = N^3 ∪ {X_4} and N^3 = N_F^2 ∪ {X_3}
Lemma 4.2 Let a, b be two real numbers with 0 < a < 1 < b. Then there exist real numbers 0 < p_i, q_i < 1, i = 1, 2, ..., 8, such that

(1) (p_1/q_1) a > 1 and ((1 − p_1)/(1 − q_1)) b > 1;
(2) (p_2/q_2) b > 1 and ((1 − p_2)/(1 − q_2)) a > 1;
(3) (p_3/q_3) a > 1 and ((1 − p_3)/(1 − q_3)) b < 1;
(4) (p_4/q_4) b < 1 and ((1 − p_4)/(1 − q_4)) a > 1;
(5) (p_5/q_5) a < 1 and ((1 − p_5)/(1 − q_5)) b > 1;
(6) (p_6/q_6) b > 1 and ((1 − p_6)/(1 − q_6)) a < 1;
(7) (p_7/q_7) a < 1 and ((1 − p_7)/(1 − q_7)) b < 1;
(8) (p_8/q_8) b < 1 and ((1 − p_8)/(1 − q_8)) a < 1.
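Lemma 4.2 can be made plausible by a brute-force search. The sketch below is our illustration (the choice a = 0.5, b = 2 is an arbitrary pair with 0 < a < 1 < b): it grid-searches (p, q) ∈ (0,1)² for a witness to each of the eight sign patterns.

```python
# For a = 0.5 < 1 < b = 2, search for (p, q) in (0,1)^2 realizing each
# of the eight sign patterns of Lemma 4.2.
a, b = 0.5, 2.0
grid = [i / 20 for i in range(1, 20)]   # 0.05, 0.10, ..., 0.95

conditions = [
    lambda p, q: p / q * a > 1 and (1 - p) / (1 - q) * b > 1,   # (1)
    lambda p, q: p / q * b > 1 and (1 - p) / (1 - q) * a > 1,   # (2)
    lambda p, q: p / q * a > 1 and (1 - p) / (1 - q) * b < 1,   # (3)
    lambda p, q: p / q * b < 1 and (1 - p) / (1 - q) * a > 1,   # (4)
    lambda p, q: p / q * a < 1 and (1 - p) / (1 - q) * b > 1,   # (5)
    lambda p, q: p / q * b > 1 and (1 - p) / (1 - q) * a < 1,   # (6)
    lambda p, q: p / q * a < 1 and (1 - p) / (1 - q) * b < 1,   # (7)
    lambda p, q: p / q * b < 1 and (1 - p) / (1 - q) * a < 1,   # (8)
]

witnesses = []
for cond in conditions:
    # next() raises StopIteration if no witness exists on the grid
    witnesses.append(next((p, q) for p in grid for q in grid if cond(p, q)))
assert len(witnesses) == 8
print("found (p, q) witnesses for all eight conditions")
```

A search over one fixed (a, b) is of course no proof of the lemma, only a quick consistency check of the eight conditions as stated.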
Let N_F^2, N^3 and N^4 denote the three Bayesian networks in Fig. 2, where N^3 = N_F^2 ∪ {X_3} and N^4 = N^3 ∪ {X_4}. Applying Theorems 2.2 and 3.2, one has VCdim(N_F^2) = Edim(N_F^2) = 3 and VCdim(N^3) = Edim(N^3) = 6. The following proposition claims that VCdim(N^4) = Edim(N^4) = 11.

Proposition 4.1 Let N^4 be the Bayesian network in Fig. 2. Then VCdim(N^4) = Edim(N^4) = |⋃_{i=1}^4 2^{PA_i ∪ {X_i}}| = Σ_{i=1}^4 2^{m_i} + 2 = 11.
Proof By Fact 2.1, one has VCdim(N^4) ≤ Edim(N^4) ≤ |⋃_{i=1}^4 2^{PA_i ∪ {X_i}}| = 11 = Σ_{i=1}^4 2^{m_i} + 2. Next one shows that VCdim(N^4) ≥ 11.

Let x^1 = (1,1,1), x^2 = (1,1,0), x^3 = (1,0,1) and x^4 = (1,0,0). Further, let x^5 = (0,1,1), x^6 = (0,1,0), x^7 = (0,0,1) and x^8 = (0,0,0). Obviously, S' = {x^1, x^2, x^3, x^5, x^6, x^7} is shattered by the Bayesian network N^3. For every vector b ∈ {1,−1}^4 − {(1,−1,−1,1), (−1,1,1,−1), (−1,−1,−1,−1)}, there is a concept f = sign(log P(x^i)/Q(x^i)) such that f(x^i) = b_i, where b = (b_1, b_2, b_3, b_4) and 1 ≤ i ≤ 4. The same result holds for the subset {x^5, x^6, x^7, x^8}.

Let S = S_1 ∪ S_2 where S_1 = {(x^1,1), (x^1,0), (x^2,1), (x^3,1), (x^3,0), (x^4,1)} and S_2 = {(x^5,1), (x^6,1), (x^6,0), (x^7,1), (x^8,0)}. Then S ⊆ {0,1}^4 and |S| = 11. Let (S^+, S^−) be a dichotomy of S, and let (S'^+, S'^−) be the dichotomy of S' defined by S'^+ = {x^i ∈ S' | (x^i,1) ∈ S^+} and S'^− = {x^i ∈ S' | (x^i,1) ∈ S^−}. There are two distributions P', Q' ∈ D_{N^3} such that the concept f'(x^k) = sign(log P'(x^k)/Q'(x^k)) outputs 1 for elements of S'^+ and −1 for elements of S'^−.

Case (1): f'(x^4) ≠ f'(x^8). Applying Lemma 4.2, there are real numbers p_{4,ij} = p(x_4 = 1 | x_2 = i, x_3 = j) and q_{4,ij} = q(x_4 = 1 | x_2 = i, x_3 = j), where i, j ∈ {0,1}, such that P = P' ∪ {p_{4,ij}}, Q = Q' ∪ {q_{4,ij}} ∈ D_{N^4} and the concept f(x) = sign(log P(x)/Q(x)) outputs 1 for elements of S^+ and −1 for elements of S^−.

Case (2): f'(x^4) = f'(x^8). Suppose that f'(x^4) = f'(x^8) = −1. Then (x^5,1) ∈ S^+ and (x^6,1), (x^6,0), (x^7,1) ∈ S^−, that is, x^5 ∈ S'^+ and x^6, x^7 ∈ S'^−. Let S'^{*+} = S'^+ − {x^5} and S'^{*−} = S'^− ∪ {x^5}. Obviously, (S'^{*+}, S'^{*−}) is a dichotomy of S'. Therefore there is a concept f^*(x^k) = sign(log P^*(x^k)/Q^*(x^k)) that outputs 1 for elements of S'^{*+} and −1 for elements of S'^{*−}, with f^*(x^4) = −1 and f^*(x^8) = 1. The claim now follows immediately from the proof of Case (1). Next assume that f'(x^4) = f'(x^8) = 1. Then (x^5,1) ∈ S^−. Let S'^{*+} = S'^+ ∪ {x^5} and S'^{*−} = S'^− − {x^5}. Obviously, (S'^{*+}, S'^{*−}) is also a dichotomy of S'. Therefore there is a concept f^*(x^k) = sign(log P^*(x^k)/Q^*(x^k)) that outputs 1 for elements of S'^{*+} and −1 for elements of S'^{*−}, with f^*(x^4) = 1 and f^*(x^8) = −1. The claim again follows immediately from the proof of Case (1).

Therefore, S = {1111, 1110, 0111} ∪ {1011, 1010, 0011} ∪ {1101, 0101, 0100} ∪ {1001, 0000} is shattered by N^4, that is, VCdim(N^4) ≥ 11. □

Theorem 4.2 Let N' be a Bayesian network with n − 1 variables X_1, ..., X_{n−1} and VCdim(N') = Edim(N') = m. Let N = N' ∪ {X_n} also be a Bayesian network, with PA_{X_n} = {X_1, X_2, ..., X_{n−1}}. Then VCdim(N) = Edim(N) = m + 2^{n−1}.

Proof (1) By VCdim(N') = m there is a set S' ⊆ {0,1}^{n−1} shattered by N' with |S'| = m. Let S_1 = {(x',0), (x',1) | x' ∈ S'}, S_2 = {(x',1) | x' ∈ {0,1}^{n−1} − S'} and S = S_1 ∪ S_2. Then S ⊆ {0,1}^n and |S| = m + 2^{n−1}. In the following we show that the subset S is shattered by N. Let (S^+, S^−) be a dichotomy of S, and let (S'^+, S'^−) be the dichotomy of S' defined by S'^+ = {x' ∈ S' | (x',1) ∈ S^+} and S'^− = {x' ∈ S' | (x',1) ∈ S^−}. Since S' is shattered by N', there exist distributions P', Q' ∈ D_{N'} such that N' with these distributions induces this dichotomy on S'. For each pair (x',0), (x',1) ∈ S_1, let p_{n,x'} = q_{n,x'} = 0.5 if (x',0), (x',1) ∈ S^+ or (x',0), (x',1) ∈ S^−. Otherwise we specify the parameters p_{n,x'} and q_{n,x'} as follows. For the case (x',1) ∈ S^+ and (x',0) ∈ S^−, that is, x' ∈ S'^+, one can get the result since there are parameters p_{n,x'} and q_{n,x'} such that log((P'(x')/Q'(x')) · (p_{n,x'}/q_{n,x'})) and log((P'(x')/Q'(x')) · ((1 − p_{n,x'})/(1 − q_{n,x'}))) take the required signs. Similar techniques lead to the result for the case (x',1) ∈ S^− and (x',0) ∈ S^+. For each vector (x',1) ∈ S_2, it is easy to obtain that (x',1) ∈ S^+ or (x',1) ∈ S^− by the choice of p_{n,x'} and q_{n,x'}. Therefore the subset S is shattered by N, that is, VCdim(N) ≥ m + 2^{n−1}.

(2) By Edim(N') = m one can get vectors v_{x'}, u_{f'} ∈ R^m for each x' ∈ {0,1}^{n−1} and each f' = sign(log P'/Q') with P', Q' ∈ D_{N'} such that f'(x') = sign(v_{x'} · u_{f'}). Let {x'_1, x'_2, ..., x'_{2^{n−1}}} denote the set {0,1}^{n−1} and let x_{i1} = (x'_i, 1), x_{i0} = (x'_i, 0) for 1 ≤ i ≤ 2^{n−1}. Further, let e_i denote the ith 2^{n−1}-dimensional unit vector.

Fig. 3 Twelve Bayesian networks with 4 variables. The numbers in parentheses denote VCdim(N_i) and Edim(N_i)

Then one can form the (m + 2^{n−1})-dimensional real vectors v_{x_{i1}} = (v_{x'_i}, e_i) and v_{x_{i0}} = (v_{x'_i}, −e_i) for each pair of vectors x_{i1} and x_{i0}. Suppose that f is a concept induced by a pair P, Q ∈ D_N, where P = P' ∪ {p_{n,x'} | x' ∈ {0,1}^{n−1}} and Q = Q' ∪ {q_{n,x'} | x' ∈ {0,1}^{n−1}}. Let a = (a_1, a_2, ..., a_{2^{n−1}}) and let u_f = (u_{f'}, a) be the corresponding (m + 2^{n−1})-dimensional real vector. Obviously, v_{x_{i1}} · u_f = v_{x'_i} · u_{f'} + a_i and v_{x_{i0}} · u_f = v_{x'_i} · u_{f'} − a_i. The remainder of the proof demonstrates the existence of a such that f(x) = sign(v_x · u_f):

Case (1): v_{x'_i} · u_{f'} ≥ 0, f(x_{i1}) = 1 and f(x_{i0}) = 1. One specifies a_i = 0.
Case (2): v_{x'_i} · u_{f'} ≥ 0, f(x_{i1}) = 1 and f(x_{i0}) = −1. One chooses a_i > v_{x'_i} · u_{f'}.
Case (3): v_{x'_i} · u_{f'} ≥ 0, f(x_{i1}) = −1 and f(x_{i0}) = 1. Let a_i < −(v_{x'_i} · u_{f'}).
Case (4): v_{x'_i} · u_{f'} < 0, f(x_{i1}) = −1 and f(x_{i0}) = −1. One specifies a_i = 0.
Case (5): v_{x'_i} · u_{f'} < 0, f(x_{i1}) = −1 and f(x_{i0}) = 1. One defines a_i < v_{x'_i} · u_{f'}.
Case (6): v_{x'_i} · u_{f'} < 0, f(x_{i1}) = 1 and f(x_{i0}) = −1. Let a_i > −(v_{x'_i} · u_{f'}).

Then one has Edim(N) ≤ m + 2^{n−1}. By the fact VCdim(N) ≤ Edim(N), one obtains VCdim(N) = Edim(N) = m + 2^{n−1}. □
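The six-case choice of the offsets a_i can be checked exhaustively. The following sketch is our illustration (the function name pick_ai and the margin parameter are ours, not the paper's); it selects a_i according to the cases in the proof of Theorem 4.2 and verifies that sign(s + a_i) and sign(s − a_i) reproduce the required labels f(x_{i1}) and f(x_{i0}) whenever that pair of labels is compatible with the sign of s.

```python
# Given s = v_{x'} . u_{f'} and target labels (f1, f0) for x_{i1}, x_{i0},
# choose a_i with sign(s + a_i) = f1 and sign(s - a_i) = f0, following the
# six cases in the proof of Theorem 4.2.
def pick_ai(s, f1, f0, margin=1.0):
    if f1 == f0:
        # cases (1) and (4); other equal-label combinations cannot occur
        return 0.0 if (s >= 0) == (f1 == 1) else None
    if s >= 0:
        return (s + margin) if f1 == 1 else -(s + margin)   # cases (2), (3)
    return (s - margin) if f1 == -1 else -(s - margin)      # cases (5), (6)

def sign(x):
    return 1 if x >= 0 else -1

for s in (-2.5, -0.3, 0.0, 0.7, 3.1):
    for f1 in (1, -1):
        for f0 in (1, -1):
            a = pick_ai(s, f1, f0)
            if a is None:        # label pair incompatible with sign(s)
                continue
            assert sign(s + a) == f1 and sign(s - a) == f0
print("a_i choice realizes all compatible label pairs")
```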
Applying Theorem 3.2, Proposition 4.1 and Theorem 4.2, one has that for each positive integer m ∈ [5, 15] there exists some Bayesian network N with 4 variables (see Fig. 3) such that VCdim(N) = Edim(N) = m. Similar techniques as used in the proof of Theorem 4.2 lead to the following Theorem 4.3.

Theorem 4.3 Let N' be a Bayesian network with n − 1 variables and VCdim(N') = m, and suppose that for each dichotomy (S'^+, S'^−) of a set S' shattered by N' with |S'| = m there exists a concept f = sign(log P'(x')/Q'(x')) such that log P'(x')/Q'(x') > 0 for x' ∈ S'^+. Let N = N' ∪ {X_n} be a Bayesian network with PA_n = {X_{r_1}, X_{r_2}, ..., X_{r_i}}. Then VCdim(N) ≥ m + 2^i.
Fig. 4 Twelve Bayesian networks with 5 variables. The numbers in parentheses denote VCdim(Ni ) and Edim(Ni )
By Theorems 4.3 and 2.1, for the Bayesian network N_14 in Fig. 4 one has the following corollary.

Corollary 4.1 Let N_14 be the Bayesian network in Fig. 4. Then VCdim(N_14) = Edim(N_14) = 19.

Applying Theorem 3.2, Corollary 4.1 and Theorem 4.2, one has that for each positive integer m ∈ [6, 31] there exists some Bayesian network N with 5 variables (see Fig. 4) such that VCdim(N) = Edim(N) = m. Although a partial result of Theorem 4.4 has been presented in [12], the theorem clearly also follows by applying our Theorem 3.2.
Theorem 4.4 Let N_0 be a Bayesian network with n variables and PA_i = ∅ for each 1 ≤ i ≤ n, where n ≥ 2. Then VCdim(N_0) = Edim(N_0) = n + 1 = Σ_{i=1}^n 2^{m_i} + 1.
5 Conclusions

Bayesian networks have become one of the major models used for statistical inference. Like other probabilistic models, Bayesian networks can be used to represent inhomogeneous data with possibly overlapping features and missing values in a uniform manner. Garcia et al. [15] and Sullivant [3] placed Bayesian networks in the realm of algebraic statistics to study the algebraic varieties defined by the conditional independence statements of Bayesian networks or Gaussian Bayesian networks. Quite elaborate methods dealing with Bayesian networks have been developed for solving problems in pattern classification, and there is a growing interest in using Bayesian networks for classification [1, 2, 8].

In this paper we focus on two-label classification tasks over the Boolean domain. As the main results, we show that the lower bound on the dimension of the inner product space induced by a class of special Bayesian networks without V-structures is Σ_{i=1}^n 2^{m_i} + 1, where m_i denotes the number of parents of the ith variable. Moreover, for Bayesian networks with n ≤ 5 variables and each integer m ∈ [n + 1, 2^n − 1], there is a Bayesian network N such that the VC dimension of the concept class and the lower bound on the dimension of the inner product space induced by N are both m.

References

1. Chen, S., Gordon, G.J., Murphy, R.F.: Graphical models for structured classification, with an application to interpreting images of protein subcellular location patterns. J. Mach. Learn. Res. 9, 651–682 (2008)
2. Oualia, A., Cherifb, A.R., Krebsa, M.: Data mining based Bayesian networks for best classification. Comput. Stat. Data Anal. 51, 1278–1292 (2006)
3. Sullivant, S.: Algebraic geometry of Gaussian Bayesian networks. Adv. Appl. Math. 40, 485–513 (2008)
4. Jansen, R., Yu, H., Greenbaum, D., et al.: A Bayesian networks approach for predicting protein-protein interactions from genomic data. Science 302(5644), 449–453 (2003)
5. Friedman, N.: Inferring cellular networks using probabilistic graphical models. Science 303(5659), 799–805 (2004)
6. Cobb, B.R., Shenoy, P.P.: Operation for inference in continuous Bayesian networks with linear deterministic variables. Int. J. Approx. Reason. 42(1–2), 21–36 (2006)
7. Likforman-Sulem, L., Sigelle, M.: Recognition of degraded characters using dynamic Bayesian networks. Pattern Recogn. 41, 3092–3103 (2008)
8. Xu, Y., Zhang, H.: Refinable kernels. J. Mach. Learn. Res. 8, 2083–2120 (2007)
9. Gurwicz, Y., Lerner, B.: Bayesian network classification using spline-approximated kernel density estimation. Pattern Recogn. Lett. 26, 1761–1771 (2005)
10. Tsuda, K., Akaho, S., Kawanabe, M., Muller, K.: Asymptotic properties of the Fisher kernel. Neural Comput. 16, 115–137 (2004)
11. Chechik, G., Heitz, G., Galel, G.E., Koller, D.: Max-margin classification of data with absent features. J. Mach. Learn. Res. 9, 1–21 (2008)
12. Nakamura, A., Schmitt, M., Schmitt, N., Simon, H.U.: Inner product space of Bayesian networks. J. Mach. Learn. Res. 6, 1383–1403 (2005)
13. Yang, Y., Wu, Y.: VC dimension and inner product space induced by Bayesian networks. Int. J. Approx. Reason. (2008, to appear)
14. Altun, Y., Tsochantaridis, I., Hofmann, T.: Hidden Markov support vector machines. In: Proceedings of the 20th International Conference on Machine Learning, pp. 3–10. AAAI Press, Menlo Park (2003)
15. Garcia, L.D., Stillman, M., Sturmfels, B.: Algebraic geometry of Bayesian networks. J. Symb. Comput. 39, 331–355 (2005)
Acta Appl Math (2009) 106: 349–358 DOI 10.1007/s10440-008-9302-7
Quantum Uncertainty and the Spectra of Symmetric Operators R.T.W. Martin · A. Kempf
Received: 8 July 2008 / Accepted: 21 August 2008 / Published online: 5 September 2008 © Springer Science+Business Media B.V. 2008
Abstract In certain circumstances, the uncertainty, ΔS[φ], of a quantum observable, S, can be bounded from below by a finite overall constant ΔS > 0, i.e., ΔS[φ] ≥ ΔS for all physical states φ. For example, a finite lower bound to the resolution of distances has been used to model a natural ultraviolet cutoff at the Planck or string scale. In general, the minimum uncertainty of an observable can depend on the expectation value, t = ⟨φ, Sφ⟩, through a function ΔS_t of t, i.e., ΔS[φ] ≥ ΔS_t for all physical states φ with ⟨φ, Sφ⟩ = t. An observable whose uncertainty is finitely bounded from below is necessarily described by an operator that is merely symmetric rather than self-adjoint on the physical domain. Nevertheless, on larger domains, the operator possesses a family of self-adjoint extensions. Here, we prove results on the relationship between the spacing of the eigenvalues of these self-adjoint extensions and the function ΔS_t. We also discuss potential applications in quantum and classical information theory.

Keywords Self-adjoint extensions of symmetric operators · Generalized observables · Finite minimum uncertainty · Spectra of symmetric operators

Mathematics Subject Classification (2000) 81Q10 · 47A10
1 Introduction

The uncertainty, ΔS[φ], of a quantum observable, S, can possess a finite lower bound ΔS > 0, i.e., ΔS[φ] ≥ ΔS for all physical states φ. A simple example is the momentum operator of a particle confined to a box with Dirichlet boundary conditions. Since the position uncertainty is bounded from above, the uncertainty relation implies that the momentum uncertainty is finitely bounded from below. Another example arises from general arguments of quantum gravity and string theory [1–5], which point towards corrections to the uncertainty

R.T.W. Martin · A. Kempf
Department of Applied Mathematics, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
e-mail:
[email protected]
350
R.T.W. Martin, A. Kempf
relations which are of the type Δx Δp ≥ (ℏ/2)(1 + β(Δp)² + ···). For positive β, this type of uncertainty relation implies a finite lower bound to the position uncertainty. A Hilbert space representation and functional analytic investigation of the underlying type of generalized commutation relations first appeared in the context of quantum group symmetric quantum mechanics and quantum field theory [6–8], followed by representations in quantum mechanics and quantum field theory with undeformed symmetries [9, 10]. This made it possible to implement this type of ultraviolet cutoff in various quantum mechanical systems, see e.g. [11], as well as in quantum field theory, with applications, e.g., in the study of black hole radiation and inflationary cosmology, see e.g. [12–14].

Our aim in the present paper is to extend the basic functional analytic understanding of observables whose uncertainty is finitely bounded from below. We will consider the general case where ΔS can be a function ΔS_t of the expectation value, t = ⟨φ, Sφ⟩, i.e., ΔS[φ] ≥ ΔS_t for all physical states φ with t = ⟨φ, Sφ⟩. As we will explain below, such observables are necessarily described by operators that are merely symmetric rather than self-adjoint on their domain in the Hilbert space. Each such symmetric operator possesses, nevertheless, a family of self-adjoint extensions to larger domains in the Hilbert space. The spectra of these self-adjoint extensions are discrete. Our aim here will be to prove results on the close relationship between the spacing of the eigenvalues of these self-adjoint extensions and the function ΔS_t.

2 Symmetric Operators

Let S be a closed, symmetric operator defined on a dense domain, Dom(S), in a separable Hilbert space H. Recall that the deficiency indices (n_+, n_−) of S are defined as the dimensions of the subspaces Ran(S − z̄)^⊥ = Ker(S* − z) and Ran(S − z)^⊥ = Ker(S* − z̄), respectively, where z belongs to the open complex upper half plane (UHP). The dimensions of these two subspaces are constant for z within the upper and lower half-plane, respectively ([15], Sect. 78). For z = i we will call D_+ := Ker(S* − i) and D_− := Ker(S* + i) the deficiency subspaces of S.

We will let σ(S), σ_p(S), σ_c(S), σ_r(S) and σ_e(S) denote the spectrum, and the point, continuous, residual and essential spectrum of S, respectively. Recall that σ(S) is defined as the set of all λ ∈ C such that (S − λ) does not have a bounded inverse defined on all of H. The point spectrum σ_p(S) is defined as the set of all eigenvalues, σ_c(S) is here defined as the set of all λ such that Ran(S − λ) is not closed, σ_r(S) is defined as the set of all λ such that λ ∉ σ_p(S) and Ran(S − λ) is not dense, and σ_e(S) is the set of all λ such that S − λ is not Fredholm. Recall that a closed, densely defined operator T is called Fredholm if Ran(T) is closed and if the dimension of Ker(T) and the co-dimension of Ran(T) are both finite. If T is unbounded, we include the point at infinity as part of the essential spectrum. Clearly all the above sets are subsets of σ(S), and σ(S) = σ_p(S) ∪ σ_c(S) ∪ σ_r(S).

If S is symmetric and z ∈ C \ R, then it is easy to see that S − z is bounded below by |Im(z)|, i.e., ‖(S − z)φ‖ ≥ |Im(z)| ‖φ‖, so that (S − z)^{−1}, defined on Ran(S − z), is bounded by 1/|Im(z)|. This shows that any non-real z ∈ σ(S) must belong to the residual spectrum σ_r(S) of S. If S has finite deficiency indices, then the orthogonal complement of Ran(S − z) is finite-dimensional for any z ∈ C \ R, which shows that σ_e(S) ⊂ R.

The domain of the adjoint S* of S can be decomposed as ([15], p. 98):

Dom(S*) = Dom(S) ∔ D_+ ∔ D_−.   (1)

Here the linear manifolds Dom(S), D_+ and D_− are non-orthogonal, linearly independent, non-closed subspaces of H. The notation ∔ denotes the non-orthogonal direct sum of these
linear subspaces. If S has finite deficiency indices and the co-dimension of Ran(S − λ) is infinite, then λ ∈ R. Furthermore, any element of Ran(S − λ)^⊥ is then an eigenvector of S* with eigenvalue λ. This, and the fact that the dimension of Dom(S*) modulo Dom(S) is finite (by the above equation (1)), allows one to conclude that λ must be an eigenvalue of infinite multiplicity of S. Hence if λ ∈ σ_e(S), then either it is an eigenvalue of infinite multiplicity or it belongs to the continuous spectrum of S.

Claim 1 If S is a symmetric operator with finite and equal deficiency indices, then σ_e(S) = σ_e(S') and σ_c(S) = σ_c(S') for any self-adjoint extension S' of S within H.

The domain of any self-adjoint extension S' of S can be written as ([15], p. 100):

Dom(S') = Dom(S) ∔ (U − 1)D_+.   (2)

Here, U is the isometry from D_+ onto D_− that defines the self-adjoint extension S'. Since the domains of S' and S differ by a finite-dimensional subspace, so do the ranges of S' and S. Using these facts it is straightforward to establish the claim.
3 Minimum Uncertainty and Spectra of Self-adjoint Extensions

As was first pointed out in [16, 17], there exists a close relationship between the finite lower bound ΔS_t on the uncertainty of a symmetric operator and the spectra of its self-adjoint extensions. Our aim now is to refine those results, and to include new results, in particular concerning the dependence of the density of eigenvalues on the operator's deficiency indices.

Definition 1 We denote the expectation value and the uncertainty of a symmetric operator S with respect to a unit-length vector φ ∈ Dom(S) by S̄_φ := ⟨φ, Sφ⟩ and ΔS[φ] := (⟨Sφ, Sφ⟩ − ⟨Sφ, φ⟩²)^{1/2}, respectively. For a fixed expectation value t ∈ R, the quantity ΔS_t := inf{ΔS[φ] : φ ∈ Dom(S), ⟨Sφ, φ⟩ = t, ‖φ‖ = 1} will be called the minimum uncertainty of S at t. The overall lower bound on the uncertainty of S will be denoted by ΔS := inf_{t∈R} ΔS_t.

Recall that a symmetric operator S is said to be simple if there is no subspace of H on which S restricts to a self-adjoint operator. A point z ∈ C is said to be a point of regular type for S if (S − z) has a bounded inverse defined on Ran(S − z). As discussed above, every point z ∈ C \ R is a point of regular type for a symmetric operator S. S is said to be regular if every z ∈ C is a point of regular type for S.

It is clear that if φ is an eigenvector of S, then ΔS[φ] = 0. Furthermore, if λ ∈ R belongs to σ_c(S), then ΔS_λ = 0. Hence, if ΔS > 0, then S has no continuous or point spectrum on the real line. This means, in particular, that such an S is not self-adjoint, and is both simple and regular. In addition, the theorem below shows that if ΔS > 0, then S must have equal deficiency indices.

Theorem 1 If S is a symmetric operator with unequal deficiency indices, then ΔS = 0. Furthermore, ΔS_t = 0 for all t ∈ R.

In the proof of this theorem, it will be convenient to use the Cayley transform of the symmetric operator S. Given λ in the upper half plane (UHP), consider κ_λ(z) := (z − λ)/(z − λ̄). If S is self-adjoint, then κ_λ(S) is unitary by the functional calculus. More generally, if S is symmetric, κ_λ(S) is a partially defined transformation which is an isometry from
Ran(S − λ̄) onto Ran(S − λ) ([15], Sects. 67 and 79). For convenience we will take λ = i and write κ_i(z) =: κ(z). The linear map V := κ(S) : D_+^⊥ → D_−^⊥ is called the Cayley transform of S. One can further show that if κ^{−1}(z) := i(1 + z)/(1 − z) and V = κ(S), then κ^{−1}(κ(z)) = z and S = κ^{−1}(V). Recall that all symmetric extensions of the symmetric operator S can be constructed by taking the inverse Cayley transforms of partial isometric extensions of the Cayley transform V = κ(S). For example, if S has deficiency indices (n, m), one can define an arbitrary partial isometry W from D_+ into D_−, and the inverse Cayley transform κ^{−1}(V') =: S' of V' := V ⊕ W on H = D_+^⊥ ⊕ D_+ will be a symmetric extension of S.

The proof of the above theorem will also make use of the Wold decomposition for isometries. Recall that the Wold decomposition theorem states that any isometry on a Hilbert space H is isometrically isomorphic to an operator U ⊕ (⊕_{α∈Ω} R) on H_0 ⊕ (⊕_{α∈Ω} l²(N)), where U is some unitary on H_0, Ω is some index set, and R is the right shift operator on the Hilbert space of square summable sequences, l²(N) (see e.g. [18], p. 2).

Proof of Theorem 1 Suppose S has deficiency indices (n_+, n_−), n_+ ≠ n_−, n_± = dim(D_±). Then S has a symmetric extension S' with deficiency indices either (0, j) or (j, 0), where j := |n_+ − n_−|. Recall that such an extension is obtained as follows. Take the Cayley transform V of S and, in the case where n_− > n_+, extend it by an arbitrary isometry from D_+ into D_− to obtain an isometry V' with dim(Ran(V')^⊥) = j. The inverse Cayley transform of V' yields the desired symmetric extension S' with deficiency indices (0, j). In the case where n_+ > n_−, extend V by an arbitrary isometry from an arbitrary (n_−)-dimensional subspace of D_+ onto D_− to obtain a partial isometry V' with dim(Ker(V')) = j and dim(Ran(V')^⊥) = 0. Again, the inverse Cayley transform of V' yields the desired symmetric extension S' for this case, with deficiency indices (j, 0).

Accordingly, the Cayley transform V' of S' is either an isometry with dim(Ran(V')^⊥) = j or the adjoint of an isometry with dim(Ker(V')) = j. By the Wold decomposition theorem, V' is isometrically isomorphic to the direct sum of a unitary operator U and either j copies of the right shift operator or j copies of the left shift operator on H_0 ⊕ (⊕_{i=1}^j l²(N)). It follows that σ(V') ⊃ σ(R) or σ(V') ⊃ σ(L), respectively, where R and L are the right and left shift operators on l²(N). It is known that the unit circle lies in the continuous spectrum of both the right and left shift operators. It is not difficult to see that λ ∈ σ_c(V') \ {1}, where V' is the Cayley transform of S', if and only if κ^{−1}(λ) ∈ σ_c(S') = σ_c(S). It follows that the continuous spectrum of S (which is a subset of R) is non-empty, and hence there exist φ ∈ Dom(S) for which ΔS[φ] is arbitrarily small. Furthermore, the above shows that the continuous spectrum of S is all of R.

Using this fact, it is not difficult to show that ΔS_t = 0 for all t ∈ R. First, given any fixed t ∈ R, since t ∈ σ_c(S), one can find a sequence (φ_n)_{n∈N} ⊂ Dom(S) such that (S − t)φ_n → 0, ‖φ_n‖ = 1 and t_n := ⟨Sφ_n, φ_n⟩ ≥ t. Again, since σ_c(S) = R, one can find a unit-norm ψ ∈ Dom(S) such that S̄_ψ = t' < t. Let P_n denote the projectors onto the orthogonal complements of the one-dimensional subspaces spanned by the φ_n. Then each vector P_nψ ∈ Dom(S), and it is easy to verify that if ψ_n := P_nψ/‖P_nψ‖, then S̄_{ψ_n} =: t'_n ≤ t. For each n ∈ N, let ϕ_n be the linear combination of ψ_n and φ_n such that ‖ϕ_n‖ = 1 and S̄_{ϕ_n} = t. It is straightforward to verify that (S − t)ϕ_n → 0, so that ΔS_t = 0. □

Note that the above theorem implies that if S is any symmetric operator with unequal deficiency indices that represents a quantum mechanical observable, then even though S is not self-adjoint, it is possible to measure that observable as precisely as one likes, in the sense that ΔS_t = 0 for all t ∈ R. Nevertheless, despite the fact that ΔS_t = 0 for all t ∈ R,
the situation is physically different from the case of a self-adjoint observable. This is because, in this case, the formal, non-normalizable quasi-eigenstates of the symmetric operator S are non-orthogonal. For example, consider the case of the symmetric derivative operator D := i d/dx defined with domain Dom(D) := {φ ∈ L²[0,∞) | φ ∈ AC_loc[0,∞); Dφ ∈ L²[0,∞); φ(0) = 0} in L²[0,∞). Here, AC_loc[0,∞) denotes the set of all functions which are absolutely continuous on any compact subinterval of [0,∞). It is straightforward to check that D has deficiency indices (0, 1). If φ_λ(x) := e^{−iλx}/√(2π) for x ∈ [0,∞), then φ_λ can be thought of as a formal, non-normalizable quasi-eigenstate for D, since if f ∈ L²[0,∞), the formal inner product of f with the φ_λ,

∫_0^∞ f(x) (1/√(2π)) e^{iλx} dx =: F(λ),   (3)

generates a unitary transformation, i.e. the Fourier transform, of L²[0,∞) onto a subspace of L²(R), and under this transformation D is transformed into multiplication by the independent variable. These quasi-eigenstates are non-orthogonal in the following sense. For ε > 0 and λ ∈ R, let

φ(ε, λ; x) := (1/√(2π)) (e^{−iλx − εx} − e^{−x/ε}).

Then φ(ε, λ; ·) ∈ Dom(D) for any ε > 0, and as ε → 0, φ(ε, λ; ·) converges to φ_λ in L² norm on any compact subinterval of [0,∞). Furthermore, it is straightforward to check that

⟨Dφ(ε,λ;·), φ(ε,λ;·)⟩ / ‖φ(ε,λ;·)‖² → λ and ⟨Dφ(ε,λ;·), Dφ(ε,λ;·)⟩ / ‖φ(ε,λ;·)‖² → λ² as ε → 0.
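The ε → 0 behavior of the regularized states φ(ε, λ; ·) can be evaluated in closed form, since every inner product reduces to integrals ∫_0^∞ e^{−cx} dx = 1/c with Re c > 0. The following sketch is our illustration (the values of λ_1, λ_2 and ε are arbitrary): it computes the overlap ⟨φ(ε,λ_1;·), φ(ε,λ_2;·)⟩, with the convention that the second argument is conjugated, and confirms that for λ_1 ≠ λ_2 it approaches the non-zero constant 1/(2πi(λ_1 − λ_2)).

```python
import math

# phi(eps, lam; x) = (exp(-(i*lam + eps)*x) - exp(-x/eps)) / sqrt(2*pi);
# <f, g> := int_0^inf f(x) * conj(g(x)) dx expands into four integrals of
# the form int_0^inf exp(-c x) dx = 1/c with Re c > 0.
def overlap(eps, lam1, lam2):
    pref = 1 / (2 * math.pi)
    return pref * (
        1 / (2 * eps + 1j * (lam1 - lam2))   # main term: survives as eps -> 0
        - 1 / (eps + 1 / eps + 1j * lam1)    # cross terms: vanish as eps -> 0
        - 1 / (eps + 1 / eps - 1j * lam2)
        + 1 / (2 / eps)                      # = eps/2 -> 0
    )

lam1, lam2 = 1.0, 3.0
target = 1 / (2 * math.pi * 1j * (lam1 - lam2))
errs = [abs(overlap(e, lam1, lam2) - target) for e in (0.1, 0.01, 0.001)]
assert errs[0] > errs[1] > errs[2] and errs[2] < 1e-3
print("overlap tends to 1/(2*pi*i*(lam1 - lam2)) as eps -> 0")
```

With the opposite inner-product convention (first argument conjugated), the limit is the complex conjugate; its modulus 1/(2π|λ_1 − λ_2|) is convention independent.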
However, if λ_1 ≠ λ_2 ∈ R, then the inner product ⟨φ(ε,λ_1;·), φ(ε,λ_2;·)⟩ converges to 1/(2πi(λ_1 − λ_2)) ≠ 0 in the limit as ε → 0. In this sense, the formal non-normalizable quasi-eigenstates φ_λ are not orthogonal. Compare this to the case of the self-adjoint derivative operator D' := i d/dx in L²(R). In this case the non-normalizable eigenstates to eigenvalues λ ∈ R are again φ_λ(x) = (1/√(2π)) e^{−iλx}, x ∈ R. If one chooses, for example, φ(ε, λ; x) := (1/√(2π)) e^{−iλx − εx²} ∈ Dom(D'), then it is straightforward to check that the inner product ⟨φ(ε,λ_1;·), φ(ε,λ_2;·)⟩ vanishes as ε → 0 if λ_1 ≠ λ_2, so that the non-normalizable eigenstates of this self-adjoint operator are indeed orthogonal.

For a concrete physical example, consider a telescope with some finite aperture. The accurate measurement of the arriving photons' momentum orthogonal to the telescope is essential for the production of a sharp image. This amounts to the measurement of the momentum of a particle in a box. The momentum operator (which is just i times the first derivative operator) acting on a particle in a box is a symmetric operator with deficiency indices (1, 1). In this case the finite aperture of the telescope causes a minimum uncertainty in the angle measurements [16, 17]. The case of a telescope is the case of light being diffracted as it passes through a slit. The case of the symmetric derivative operator D on the half line, L²[0,∞), is that of light being diffracted at a single edge, i.e., passing a single wall. The fact that the quasi-eigenstates are not orthogonal means, physically, that there is a diffraction pattern in this case as well.

If S is a simple symmetric operator with deficiency indices (j, 0) or (0, j), then it is straightforward to verify that S is isometrically isomorphic to j copies of the differentiation operator i d/dx on L²(0,∞) or L²(−∞,0), respectively. This fact was first proven by von Neumann, see for example ([15], Sect. 82). Hence, if S has deficiency indices (j, 0) or (0, j), it
generates a semi-group of isometries or co-isometries which is isometrically isomorphic to j copies of right translation on L²(0,∞) or L²(−∞,0), respectively. It follows that if S is a symmetric quantum mechanical Hamiltonian operator with deficiency indices (m, n) and j := |n − m|, then any maximal symmetric extension of S will generate either an isometric or co-isometric time evolution of the quantum mechanical system. Furthermore, the Hilbert space can be decomposed into j + 1 subspaces such that the time evolution on the first subspace is unitary, and such that the time evolution on each of the other subspaces is purely isometric or co-isometric. If the state of the system begins in one of these subspaces, its image at any later time will be confined to that subspace, so that there are, in general, subspaces of the Hilbert space which will be inaccessible to the time evolution of the system once the initial state is fixed.

Theorem 2 Let S be a densely defined, closed symmetric operator with finite and equal deficiency indices (n, n). If ΔS > 0, then any self-adjoint extension S' of S has a purely discrete spectrum, σ(S') = σ_p(S'). In particular, if ΔS_t > ε > 0 for all t ∈ I ⊂ R, then S' can have no more than n eigenvalues (including multiplicities) in any interval J ⊂ I of length less than or equal to ε, and if n = 1, then S' can have no more than one eigenvalue in any interval J ⊂ I of length less than or equal to 2ε.

This theorem shows, in particular, that if ΔS > ε, then any self-adjoint extension of S has no more than n eigenvalues in any interval of length ε. The authors are currently investigating whether the improved result that holds for the n = 1 case generalizes to the case of higher deficiency indices.

Proof of Theorem 2 If ΔS > 0, then as in the discussion preceding the proof of Theorem 1, we conclude that every z ∈ C is a point of regular type for S.
Since S has finite and equal deficiency indices, if S′ is any self-adjoint extension of S, it follows that σ_e(S′) = σ_e(S) consists only of the point at infinity. This implies that the spectrum of S′ consists solely of eigenvalues of finite multiplicity with no finite accumulation point. Suppose that there is a self-adjoint extension S′ of S which has n + 1 eigenvectors φ_i to eigenvalues λ_i, where λ_i ∈ J ⊂ I and the length of J is less than or equal to ε. Then, since the dimension of Dom(S′) modulo Dom(S) is n, there is a non-zero linear combination ψ = Σ_{i=1}^{n+1} c_i φ_i of these orthogonal eigenvectors which has unit norm and which belongs to Dom(S). The expectation value of the symmetric operator S in the state ψ lies in J, t := ⟨S⟩_ψ ∈ J, since ψ is a linear combination of eigenvectors of S′ whose eigenvalues all lie in J. Now it is straightforward to verify, since |λ_i| < |t| + ε for all 1 ≤ i ≤ n + 1, that

(ΔS[ψ])² = Σ_{i=1}^{n+1} λ_i² |c_i|² − t² ≤ Σ_{i=1}^{n+1} (|t| + ε)² |c_i|² − t² = 2ε|t| + ε².   (4)
Now first suppose that 0 ∈ J and that t := ⟨S⟩_ψ = 0. Then in this case equation (4) contradicts the fact that ΔS_0 > ε, proving the claim for this case. If t ≠ 0, then consider the symmetric operator S(t) := S − t on Dom(S). Given any φ ∈ Dom(S) which has unit norm and expectation value ⟨S⟩_φ = ⟨Sφ, φ⟩ = t, it is not hard to see that ⟨S(t)⟩_φ = ⟨S(t)φ, φ⟩ = 0 and that

(ΔS(t)[φ])² = ⟨S(t)φ, S(t)φ⟩ = ⟨Sφ, Sφ⟩ − 2t⟨Sφ, φ⟩ + t² = ⟨Sφ, Sφ⟩ − t² = (ΔS[φ])².   (5)
Quantum Uncertainty and the Spectra of Symmetric Operators
355
This shows that ΔS(t)_0 = ΔS_t > ε. Now let S′ be any self-adjoint extension of S. Applying the above result for the expectation value 0 to the symmetric operator S(t), we conclude that the self-adjoint extension S′(t) := S′ − t of S(t) can have no more than n eigenvalues in the interval J − t. This in turn implies that S′ can have no more than n eigenvalues in the interval J.

Now suppose that n = 1. As in the above, if S′ is a self-adjoint extension of S that has two eigenvectors φ₁, φ₂ to eigenvalues λ₁, λ₂ in a subinterval J ⊂ I of length less than or equal to 2ε, then there is a unit-norm vector ψ = c₁φ₁ + c₂φ₂ that belongs to Dom(S). The expectation value of ψ, t := ⟨Sψ, ψ⟩, will also belong to J. The expectation value t and the fact that ψ has unit norm uniquely determine the constants c₁ and c₂ up to complex numbers of modulus one:

|c₁| = √(|λ₂ − t| / |λ₁ − λ₂|)  and  |c₂| = √(|λ₁ − t| / |λ₁ − λ₂|).   (6)

It is now straightforward to calculate that ΔS[ψ] = √(|λ₁ − t||λ₂ − t|). Assume, without loss of generality, that λ₁ < λ₂, so that λ₁ < t < λ₂. Since λ₁, λ₂ belong to the same interval J with length less than or equal to 2ε, it follows that |λ₂ − t| ≤ 2ε − |λ₁ − t|, so that ΔS[ψ] ≤ √(2ε|λ₁ − t| − |λ₁ − t|²). It is simple to check that the function f(x) = 2εx − x² has a global maximum of ε² when x = ε, so that ΔS[ψ] ≤ ε. Since t ∈ I, this contradicts the assumption that ΔS_t > ε.

Corollary 1 If S is a symmetric operator with finite deficiency indices such that ΔS = ε > 0, then S is simple, regular, the deficiency indices (n, n) of S are equal, and the spectrum of any self-adjoint extension of S is purely discrete and consists of eigenvalues of finite multiplicity at most n with no finite accumulation point.

It is known ([15], Sect. 83) that if S is a closed, densely defined simple symmetric operator with equal and finite deficiency indices (n, n), then the multiplicity of any eigenvalue of any self-adjoint extension S′ of S does not exceed n.
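The two-eigenvector construction used above for the n = 1 case is easy to verify numerically: restricting S′ to the span of φ₁, φ₂, the coefficients in (6) produce a unit vector with expectation value t and uncertainty √(|λ₁ − t||λ₂ − t|). A minimal sketch with hypothetical eigenvalues (the numbers are illustrative only, not taken from the text):

```python
import numpy as np

lam1, lam2, t = 0.3, 1.1, 0.65          # hypothetical eigenvalues and target expectation value
c1 = np.sqrt(abs(lam2 - t) / abs(lam1 - lam2))
c2 = np.sqrt(abs(lam1 - t) / abs(lam1 - lam2))

S = np.diag([lam1, lam2])               # S' restricted to span{phi_1, phi_2}
psi = np.array([c1, c2])                # psi = c1*phi_1 + c2*phi_2

norm = psi @ psi                        # should be 1
mean = psi @ S @ psi                    # should equal t
std = np.sqrt(psi @ S @ S @ psi - mean**2)   # should equal sqrt(|lam1-t| |lam2-t|)
```

Running this confirms that ψ has unit norm, expectation value t, and the claimed uncertainty.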
Corollary 1 is an immediate consequence of this fact and Theorems 1 and 2.

For example, consider the symmetric differential operator S′ := −(d/dx)(x (d/dx) ·) + x defined on the dense domain C₀^∞(0, ∞) ⊂ L²[0, ∞) of infinitely differentiable functions with compact support in (0, ∞). Let S be the closure of S′. Let D be the closed symmetric derivative operator on L²[0, ∞) which is the closure of the symmetric derivative operator D′ := i d/dx on the domain D := C₀^∞(0, ∞). It follows that D is a core for both D and for S. Recall that a dense set of vectors D is called a core for a closable operator T if the closure of T restricted to D equals the closure of T. For all φ ∈ D, it is easy to verify that [D, S]φ := (DS − SD)φ = i(D² + 1)φ. By the uncertainty principle, it follows that for any φ ∈ D,

ΔS[φ] ΔD[φ] ≥ (1/2)|⟨φ, [D, S]φ⟩| = (1/2)⟨φ, (D² + 1)φ⟩ = (1/2)(ΔD[φ]² + ⟨φ, Dφ⟩² + ⟨φ, φ⟩).   (7)
Using the fact that the function f(t) = (t² + 1)/(2t) is concave up for all t ∈ (0, ∞) and has a global minimum f(1) = 1, we conclude that ΔS[φ] ≥ 1 for any φ ∈ D. Since D is a core for S, given any ψ ∈ Dom(S) we can find a sequence ψ_n ∈ D such that ψ_n → ψ and Sψ_n → Sψ. It follows that ΔS[ψ] = lim_{n→∞} ΔS[ψ_n] ≥ 1. This shows that ΔS ≥ 1. Now S is a second
order symmetric differential operator. It is known that the deficiency indices of such an operator are equal and do not exceed (2, 2) ([19], Sect. 17). Since ΔS ≥ 1, Corollary 1 also implies that the deficiency indices of S must be equal and non-zero. Hence S has deficiency indices (1, 1) or (2, 2). Applying Theorem 2, one can now conclude that any self-adjoint extension of S can have at most two eigenvalues in any interval of length 1.

Conversely, if S has finite deficiency indices and is simple and regular, then ΔS_t > 0.

Theorem 3 Suppose that S is a regular, simple symmetric operator with finite and equal deficiency indices. Let S denote the set of all self-adjoint extensions of S within H. Then,

ΔS_t ≥ max_{S′∈S} ΔS′_t = max_{S′∈S} min_{λ,μ∈σ(S′)} √(|λ − t| |μ − t|).   (8)
Proof Note that if S is simple and regular with finite deficiency indices (m, n), then these indices must be equal; otherwise S would have continuous spectra and would not be regular. Since Dom(S) ⊂ Dom(S′) and S′|_Dom(S) = S for any S′ ∈ S, it is clear that ΔS_t ≥ max_{S′∈S} ΔS′_t. It remains to prove that ΔS′_t = min_{λ,μ∈σ(S′)} √(|λ − t| |μ − t|) for any S′ ∈ S.

Since we assume S is regular, simple, and has finite deficiency indices, the essential spectrum of S is empty. Hence, by Claim 1, σ_e(S′) is empty for any S′ ∈ S. This shows that the spectrum of any S′ consists solely of eigenvalues of finite multiplicity with no finite accumulation point. Order the eigenvalues as a non-decreasing sequence (t_n)_{n∈M} and let {b_n}_{n∈M} be the corresponding orthonormal eigenbasis, so that S′b_n = t_n b_n. Here M = ±ℕ or ℤ, depending on whether S′ is bounded above, below, or neither bounded above nor below. To calculate ΔS′_t, we need to minimize the functional

Λ[φ] := ⟨S′φ, S′φ⟩ − t²   (9)

over the set of all unit-norm φ ∈ Dom(S′) which satisfy ⟨S′φ, φ⟩ = t. Let us assume that t is not an eigenvalue of S′, as in this case ΔS′_t = 0 and (8) holds trivially. Expanding φ in the basis b_n, φ = Σ_{n∈M} φ_n b_n, we see that to find the extreme points of Λ subject to these constraints we need to set the functional derivative of

Λ̃[φ] := Σ_{n∈M} φ_n φ̄_n (t_n² − t² − α₁ t_n − α₂)   (10)

to zero. Here α₁, α₂ are Lagrange multipliers. Setting the functional derivative of Λ̃ with respect to φ̄_n to 0 yields

0 = φ_n (t_n² − t² − α₁ t_n − α₂).   (11)

Formula (11) leads to the conclusion that if φ is an extreme point, it must be a linear combination of two eigenvectors of S′ corresponding to two distinct eigenvalues. To see this, note that if the decomposition of φ in the eigenbasis {b_n} had three non-zero coefficients, say φ_{j_i}, i = 1, 2, 3, all of which correspond to eigenvectors b_{j_i} with distinct eigenvalues, then (11) would lead to the conclusion that α₁ = t_{j_1} + t_{j_2} = t_{j_2} + t_{j_3}, which would imply that t_{j_1} = t_{j_3}, a contradiction. Furthermore, φ cannot be a linear combination of eigenvectors b_j to just one eigenvalue, as such a linear combination cannot satisfy the constraint ⟨S′φ, φ⟩ = t. So let λ := t_i and μ := t_j for any i, j ∈ M for which t_i ≠ t_j. Choose ϕ ∈ Ker(S′ − λ) and ψ ∈ Ker(S′ − μ). We have shown that φ has the form φ = c₁ϕ + c₂ψ. As in the proof of Theorem 2, the constraints that ⟨φ, φ⟩ = 1 and
⟨S′φ, φ⟩ = t uniquely determine c₁ and c₂ up to complex numbers of modulus one:

|c₁| = √(|μ − t| / |λ − μ|)  and  |c₂| = √(|λ − t| / |λ − μ|).   (12)
The phases of c₁ and c₂ do not affect the value of Λ[φ]. It follows that if φ extremizes Λ, then ΔS′[φ] = √(|μ − t| |λ − t|), so that ΔS′_t = min_{μ,λ∈σ(S′)} √(|μ − t| |λ − t|).

Observe that the curve f(t) = √(|μ − t| |λ − t|) describes the upper half of a circle of radius |λ − μ|/2 centred at the point (λ + μ)/2.

Corollary 2 If S is a symmetric operator with deficiency indices (n, n) such that ΔS > 0, then max_{S′∈S} ΔS′_t ≥ ΔS/2. If n = 1, then max_{S′∈S} ΔS′_t ≥ ΔS, so that

ΔS = inf_{t∈ℝ} max_{S′∈S} ΔS′_t = inf_{t∈ℝ} max_{S′∈S} min_{λ,μ∈σ(S′)} √(|λ − t| |μ − t|).   (13)
Proof It is known that if λ is a regular point of a symmetric operator S with deficiency indices (n, n), then there exists a self-adjoint extension of S for which λ is an eigenvalue of multiplicity n ([15], p. 109). Given any λ for which |λ − t| < ΔS, let S′ be the self-adjoint extension of S for which λ is an eigenvalue of multiplicity n. By Theorem 2, if μ ≠ λ belongs to σ(S′), it must be that |μ − t| ≥ ΔS − |λ − t|. Again, by formula (8), ΔS′_t ≥ √(|λ − t|ΔS − (λ − t)²). It is a simple calculus exercise to show that this is maximized when |λ − t| = ΔS/2. Choosing λ so that this condition is satisfied proves the first part of the claim.

Using the result of Theorem 2 in the case where n = 1 and repeating the above arguments shows that, in the case where n = 1, max_{S′∈S} ΔS′_t ≥ ΔS. By Theorem 3, ΔS_t ≥ max_{S′∈S} ΔS′_t, so that ΔS_t ≥ max_{S′∈S} ΔS′_t ≥ ΔS. Taking the infimum over t ∈ ℝ of both sides yields ΔS = inf_{t∈ℝ} max_{S′∈S} ΔS′_t. Combining this with formula (8) now yields formula (13).

If the improved result of Theorem 2 that holds for the n = 1 case could be established for all values of n, then the stronger result of the above theorem for the n = 1 case would also hold for all n.
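The minimization over pairs of distinct eigenvalues appearing in (8) and (13) can be explored numerically for a finite model spectrum. The sketch below uses a hypothetical spectrum (t is assumed not to be an eigenvalue) and, for a two-point spectrum {0, 1}, illustrates the semicircular shape of the resulting bound:

```python
import numpy as np
from itertools import combinations

def extension_uncertainty(eigs, t):
    # min over pairs of distinct eigenvalues of sqrt(|lam - t| |mu - t|), cf. (8);
    # t is assumed not to be an eigenvalue itself (otherwise the minimum is 0)
    return min(np.sqrt(abs(l - t) * abs(m - t)) for l, m in combinations(eigs, 2))

# Two eigenvalues 0 and 1: the bound as a function of t is sqrt(t(1 - t)),
# the upper half of a circle of radius 1/2 centred at t = 1/2.
semicircle = [extension_uncertainty([0.0, 1.0], t) for t in (0.25, 0.5, 0.75)]
```

The model spectrum here is illustrative only; in the paper the eigenvalues are those of a self-adjoint extension S′.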
4 Outlook Our new results on operators whose uncertainty is bounded from below are of potential interest in quantum gravity. This is because our results improve on the results of [16, 17], where it was first pointed out that physical fields in theories with a finite lower bound Δx on spatial resolution possess the so-called sampling property: a field is fully determined by its amplitude samples taken at the eigenvalues of any one of the self-adjoint extensions of the position operator x. With this ultraviolet cutoff, physical theories can therefore be written, equivalently, as living on a continuous space or as living on any one of a family of discrete lattices of points. This provides a new approach to reconciling general relativity's requirement that space-time be a continuous manifold with the fact that quantum field theories tend to be well-defined only on lattices. This approach has been extended to quantum field theory on flat and curved space, see [20, 21]. Representing Hamiltonians as symmetric operators
with unequal deficiency indices may also be of physical significance, since the co-isometric time evolution of a quantum system generated by such a Hamiltonian could be useful for describing information vanishing beyond horizons, e.g. the horizon of a black hole [22]. Finally, we remark that our results are also of potential interest in information theory: the theory of spaces of functions which are determined by their amplitudes on discrete points of sufficient density is a long-established field, called sampling theory. Sampling theory plays a central role in information theory, where it serves as the crucial link between discrete and continuous representations of information, see [23, 24]. Our results on the relationship between the varying uncertainty bound ΔS_t and the varying density of the eigenvalues of the self-adjoint extensions of S therefore contribute new tools for handling the difficult case of sampling and reconstruction with a variable Nyquist rate.
References
1. Gross, D.J., Mende, P.F.: String theory beyond the Planck scale. Nucl. Phys. B 303, 407 (1988)
2. Amati, D., Ciafaloni, M., Veneziano, G.: Can space-time be probed beyond the string size? Phys. Lett. B 216, 41 (1989)
3. Garay, L.J.: Quantum gravity and minimum length. Int. J. Mod. Phys. A 10, 145 (1995)
4. Witten, E.: Reflections on the fate of space-time. Phys. Today 49(4), 24 (1996)
5. Amelino-Camelia, G., Ellis, J., Mavromatos, N.E., Nanopoulos, N.: Planckian scattering and black holes. Mod. Phys. Lett. A 12, 2029 (1997). gr-qc/9806028
6. Kempf, A.: Quantum group-symmetric Fock spaces with Bargmann-Fock representation. Lett. Math. Phys. 26, 1 (1992)
7. Kempf, A.: Uncertainty relation in quantum mechanics with quantum group symmetry. J. Math. Phys. 35, 4483 (1994). hep-th/9311147
8. Kempf, A.: Quantum field theory with nonzero minimal uncertainties in positions and momenta. J. Math. Phys. 38, 1347 (1997). hep-th/9405067
9. Kempf, A., Mangano, G., Mann, R.B.: Hilbert space representation of the minimal length uncertainty relation. Phys. Rev. D 52, 1108 (1995)
10. Kempf, A., Mangano, G.: Minimal length uncertainty relation and ultraviolet regularization. Phys. Rev. D 55, 7909 (1997)
11. Brau, F.: Minimal length uncertainty relation and the hydrogen atom. J. Phys. A 32, 7691 (1999)
12. Brout, R., Gabriel, Cl., Lubo, M., Spindel, P.: Minimal length uncertainty principle and the trans-Planckian problem of black hole physics. Phys. Rev. D 59, 044005 (1999)
13. Ahluwalia, D.V.: Wave-particle duality at the Planck scale: freezing of neutrino oscillations. Phys. Lett. A 275, 31 (2000). gr-qc/0002005
14. Kempf, A.: Mode generating mechanism in inflation with a cutoff. Phys. Rev. D 63, 083514 (2001)
15. Akhiezer, N.I., Glazman, I.M.: Theory of Linear Operators in Hilbert Space. Dover, Mineola (1993)
16. Kempf, A.: Fields over unsharp coordinates. Phys. Rev. Lett. 85, 2873 (2000). hep-th/9905114
17. Kempf, A.: On fields with finite information density. Phys. Rev. D 69, 124014 (2004). hep-th/0404103
18. Rosenblum, M., Rovnyak, J.: Hardy Classes and Operator Theory. Courier Dover, Mineola (1997)
19. Naimark, M.A.: Linear Differential Operators in Hilbert Space, Part II. Frederick Ungar, New York (1968)
20. Kempf, A.: Covariant information-density cutoff in curved space-time. Phys. Rev. Lett. 92, 221301 (2004). gr-qc/0310035
21. Kempf, A., Martin, R.: On information theory, spectral geometry and quantum gravity. Phys. Rev. Lett. 100, 021304 (2008)
22. Kempf, A.: On the only three short distance structures which can be described by linear operators. Rep. Math. Phys. 43, 171–177 (1999). hep-th/9806013
23. Shannon, C.E.: Communication in the presence of noise. Proc. IRE 37, 10 (1949)
24. Benedetto, J.J., Ferreira, P.J.S.G.: Modern Sampling Theory. Birkhäuser, Basel (2001)
Acta Appl Math (2009) 106: 359–367 DOI 10.1007/s10440-008-9303-6
Application of He’s Variational Iteration Method for Solving Nonlinear BBMB Equations and Free Vibration of Systems D.D. Ganji · H. Babazadeh · M.H. Jalaei · H. Tashakkorian
Received: 22 July 2008 / Accepted: 21 August 2008 / Published online: 5 September 2008 © Springer Science+Business Media B.V. 2008
Abstract A new analytical method, He's variational iteration method (VIM), is applied in this article to solve nonlinear Benjamin-Bona-Mahony-Burgers (BBMB) equations and the free vibration of a nonlinear system having combined linear and nonlinear springs in series. In this method, general Lagrange multipliers are introduced to construct correction functionals for the problems; the multipliers can be identified optimally via the variational theory. The results are compared with those of the homotopy analysis method and with the exact solution. For these problems, He's variational iteration method performs better than the homotopy analysis method and agrees closely with the exact solutions. Keywords He's variational iteration method · Benjamin-Bona-Mahony-Burgers (BBMB) equations · Serial linear and nonlinear stiffness · Nonlinear free vibration · Motion equation
1 Introduction Few phenomena in science behave linearly; most occur nonlinearly. Apart from a limited number of these problems, precise analytical solutions are not available, so nonlinear problems have to be solved by approximate methods. Many new techniques have recently been presented to eliminate the need for a small parameter; examples include the homotopy analysis method, the variational iteration method (VIM) [5, 6, 8, 9, 15, 16], the Adomian decomposition method (ADM) [1, 2], the homotopy perturbation method [10–14, 18, 21], and the Exp-function method [25], which is a powerful method for solving nonlinear equations [17, 24], among others [3, 23]. In this research the basic idea of the VIM is introduced and then its application to nonlinear equations is studied. Telli and Kopmaz [22] have attempted to solve the motion of a mechanical system with combined linear and nonlinear properties using analytical and
D.D. Ganji () · H. Babazadeh · M.H. Jalaei · H. Tashakkorian Faculty of Mechanical and Electrical Engineering, Babol University of Technology, P.O. Box 47135-484, Babol, Iran e-mail:
[email protected]
360
D.D. Ganji et al.
numerical techniques. Their work deals with the vibration of a conservative oscillating system with an attached mass grounded by linear and nonlinear springs. The linkage of the linear and nonlinear springs in series has been derived with cubic nonlinear characteristics in the equations of motion [22]. As mentioned in [22], although there exists a vast literature on discrete systems including either linear or nonlinear springs [7, 20], one does not encounter publications on single-degree-of-freedom mechanical systems containing a flexible component consisting of linear and nonlinear springs in series, which occurs in technical applications. One similar mechanical model is the conservative Duffing equation with linear and cubic characteristics, governed by a second-order differential equation. In the present letter, the resulting nonlinear differential equation is solved by VIM to evaluate the nonlinear Benjamin-Bona-Mahony-Burgers (BBMB) equations and compared with results provided by the homotopy analysis method and the exact solution.
2 Basic Idea of He's Variational Iteration Method

To clarify the basic ideas of VIM, we consider the following differential equation:

Lu + Nu = g(t),   (1)

where L is a linear operator, N a nonlinear operator and g(t) an inhomogeneous term. According to VIM, we can write down a correction functional as follows:

u_{n+1}(t) = u_n(t) + ∫₀ᵗ λ (Lu_n(τ) + Nũ_n(τ) − g(τ)) dτ,   (2)

where λ is a general Lagrangian multiplier which can be identified optimally via the variational theory. The subscript n indicates the nth approximation and ũ_n is considered as a restricted variation, δũ_n = 0.
3 Implementation of the Method

Consider the general form of the Benjamin-Bona-Mahony-Burgers (BBMB) equations:

u_t − u_{xxt} − αu_{xx} + βu_x + g(u)_x = 0,   x ∈ ℝ, t ≥ 0,   (3)

with the initial condition

u(x, 0) = f(x) → 0,   x → ±∞,   (4)

where u(x, t) represents the velocity of the fluid in the horizontal direction x, α is a positive constant, β ∈ ℝ and g(u) is a C²-smooth nonlinear function [8]. In order to assess the advantages and accuracy of VIM for solving nonlinear partial differential equations, we will consider the following example. First we consider a nonlinear example, i.e.

u_t − u_{xxt} + u_x + (u²/2)_x = 0,   (5)
with the initial condition:

u(x, 0) = sech²(x/4).   (6)
Its correction variational functional in x and t can then be expressed as follows:

u_{n+1}(x, t) = u_n(x, t) + ∫₀ᵗ λ [ ∂u_n(x, τ)/∂τ − ∂³u_n(x, τ)/∂x²∂τ + ∂u_n(x, τ)/∂x + u_n(x, τ) ∂u_n(x, τ)/∂x ] dτ.   (7)

Its stationary conditions can be obtained as follows:

λ′(τ) = 0,   (8)

λ(τ)|_{τ=t} = −1,   (9)

therefore the Lagrangian multiplier can be identified as

λ = −1.   (10)

As a result, we obtain the following iteration formula:

u_{n+1}(x, t) = u_n(x, t) − ∫₀ᵗ [ ∂u_n(x, τ)/∂τ − ∂³u_n(x, τ)/∂x²∂τ + ∂u_n(x, τ)/∂x + u_n(x, τ) ∂u_n(x, τ)/∂x ] dτ.   (11)

We start with the initial approximation of u(x, 0) given by (6),

u₀(x, t) = u(x, 0) = sech²(x/4).   (12)

Using the above iteration formula (11), we can directly obtain the next component:

u₁(x, t) = [2 cosh³(x/4) + t sinh(x/4) cosh²(x/4) + t sinh(x/4)] / [2 cosh⁵(x/4)].   (13)
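The iterate (13) can be reproduced symbolically by applying the iteration formula (11) to the initial approximation sech²(x/4); the following SymPy sketch is an independent check (not part of the original computation):

```python
import sympy as sp

x, t, tau = sp.symbols('x t tau')

def vim_step(u):
    # One application of iteration formula (11) with lambda = -1 for
    # u_t - u_{xxt} + u_x + u u_x = 0; u may depend on x and tau
    # (the initial approximation below depends on x only)
    integrand = (sp.diff(u, tau) - sp.diff(u, x, 2, tau)
                 + sp.diff(u, x) + u * sp.diff(u, x))
    return u.subs(tau, t) - sp.integrate(integrand, (tau, 0, t))

u0 = sp.sech(x / 4)**2                      # initial approximation (12)
u1 = vim_step(u0)

# Closed form claimed in (13)
u1_claimed = (2*sp.cosh(x/4)**3 + t*sp.sinh(x/4)*sp.cosh(x/4)**2
              + t*sp.sinh(x/4)) / (2*sp.cosh(x/4)**5)
residual = sp.simplify(u1 - u1_claimed)     # should vanish identically
```

Evaluating the residual at sample points confirms that the two expressions agree.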
In the same manner, the rest of the components of the iteration formula can be obtained; for example,

u₂(x, t) = u₁(x, t) − ∫₀ᵗ [ ∂u₁(x, τ)/∂τ − ∂³u₁(x, τ)/∂x²∂τ + ∂u₁(x, τ)/∂x + u₁(x, τ) ∂u₁(x, τ)/∂x ] dτ,   (14)

whose explicit closed form is a lengthy combination of powers of cosh(x/4), sinh(x/4) and t.

Fig. 1 (a) Comparison of u(x, t) for the VIM, HAM [4] and exact solutions for t = 0.1. (b) Comparison of u(x, t) for the VIM, HAM and exact solutions for t = 0.2

Fig. 2 Nonlinear free vibration of a system of mass with serial linear and nonlinear stiffness on a frictionless contact surface [20]
Comparing the curves of the VIM solution with another validated approximate solution, i.e. the homotopy analysis method (HAM) [4], and with the exact solution, we see that they are close to each other. According to Fig. 1b, as time increases from t = 0.1 to t = 0.2 the error of the HAM solution grows, while that of the VIM solution decreases.
4 Governing Equation of Motion and Formulation Consider free vibration of a conservative, single-degree-of-freedom system with a mass attached to linear and nonlinear springs in series as shown in Fig. 2. After transformation, the
Application of He’s Variational Iteration Methods for Solving
motion is governed by a nonlinear differential equation of motion [22]:

(1 + 3εzν²) d²ν/dt² + 6εzν (dν/dt)² + ωₑ²ν + εωₑ²ν³ = 0,   (15)
where

ε = β/k₂,   (16)

ξ = k₂/k₁,   (17)

z = ξ/(1 + ξ),   (18)

ωₑ = √(k₂/(m(1 + ξ))),   (19)
with the initial conditions

ν(0) = A,   dν(0)/dt = 0,   (20)
in which ε, β, ν, ωₑ, m and ξ are the perturbation parameter (not restricted to a small parameter), the coefficient of the nonlinear spring force, the deflection of the nonlinear spring, the natural frequency, the mass, and the ratio of the linear portion k₂ of the nonlinear spring constant to the linear spring constant k₁, respectively. Note that the notations in (15)–(19) follow those in Telli and Kopmaz [22]. The deflection of the linear spring y₁(t) and the displacement of the attached mass y₂(t) can be expressed in terms of the deflection of the nonlinear spring ν through the simple relationships [18]:

y₁(t) = ξν(t) + εξ[ν(t)]³   (21)

and

y₂(t) = ν(t) + y₁(t).   (22)
5 Implementation of VIM

First, we write down (15) for m = 1, A = 0.5, ε = 0.5 and ξ = 0.1 (k₁ = 50, k₂ = 5):

(1 + 0.1365ν²) d²ν/dt² + 0.2730ν (dν/dt)² + 4.5454ν + 2.2727ν³ = 0.   (23)

Then, dividing (23) by 1 + 0.1365ν² and writing it in simplified form yields

ν̈ + 0.2730ν ν̇²/(1 + 0.1365ν²) + 4.5454ν/(1 + 0.1365ν²) + 2.2727ν³/(1 + 0.1365ν²) = 0,   (24)

with the initial conditions:

ν(0) = 0.5,   (25a)

ν̇(0) = 0.   (25b)
Fig. 3 Comparison of deflection of nonlinear spring ν(t) for the VIM and other various analytical approximations for m = 1, A = 0.5, ε = 0.5 and ξ = 0.1 (k₁ = 50, k₂ = 5)
where a dot denotes differentiation with respect to t. Its correction variational functional can be expressed as follows:

ν_{n+1}(t) = ν_n(t) + ∫₀ᵗ λ(τ) [ ν̈_n + 0.2730ν_n ν̇_n²/(1 + 0.1365ν_n²) + 4.5454ν_n/(1 + 0.1365ν_n²) + 2.2727ν_n³/(1 + 0.1365ν_n²) ] dτ,   (26)

where λ is a general Lagrangian multiplier, which can be identified as λ = τ − t. Therefore the variational iteration formula is obtained in the form

ν_{n+1}(t) = ν_n(t) + ∫₀ᵗ (τ − t) [ ν̈_n + 0.2730ν_n ν̇_n²/(1 + 0.1365ν_n²) + 4.5454ν_n/(1 + 0.1365ν_n²) + 2.2727ν_n³/(1 + 0.1365ν_n²) ] dτ.   (27)

We start with the initial approximation of ν(t) given by (25a),

ν₀(t) = 0.5.   (28)
Using the iteration formula (27), the other components can be obtained:

ν₁(t) = 0.5000 − 1.2362t²,   (29)

and, at the next step,

ν₂(t) = 0.5000 − 1.2362t² + ··· ,   (30)

where the omitted terms of ν₂ form a lengthy combination of polynomial, arctangent and logarithmic functions of t.

Fig. 4 (a) Comparison of deflection of linear spring y₁(t) for the VIM and other various analytical approximations for m = 1, A = 0.5, ε = 0.5 and ξ = 0.1 (k₁ = 50, k₂ = 5). (b) Comparison of displacement of attached mass y₂(t) for the VIM and other various analytical approximations for m = 1, A = 0.5, ε = 0.5 and ξ = 0.1 (k₁ = 50, k₂ = 5)
Comparing the curves of the VIM solution with other validated approximate solutions, i.e. the Lindstedt–Poincaré perturbation (LP), harmonic balance (HB) and accurate approximate analytical (AAA) solutions [19], we see that they are close to each other. The curves of the deflection of the linear spring y₁(t) and of the displacement of the attached mass y₂(t), drawn using the VIM and some other methods, are also presented for comparison.
6 Conclusions In this paper, He's variational iteration method has been successfully applied to find the solution of the nonlinear BBMB equations and of the free vibration of systems with serial linear and nonlinear stiffness. All the curves drawn using the VIM show that the results of the present method are in excellent agreement with the exact solutions, and the obtained solutions are shown graphically. In our work, we use the Maple package to calculate the functions obtained from the variational iteration method. Among the advantages of VIM are that the initial solution can be freely chosen with some unknown parameters and that these unknown parameters can be determined easily. This paper also applies the variational iteration method to a nonlinear wave equation; the solution process is simple but effective, and the result is of high accuracy. Compared with the HAM proposed in [4], the present method has many advantages: HAM requires the calculation of its auxiliary parameter, and only a suitable choice of that parameter leads to ideal results, otherwise it results in wrong solutions. The present paper completely overcomes this difficulty. An interesting point about VIM is that with only a few iterations, or even a single iteration in some cases, it can converge to correct results.
Nomenclature
t — Time
k₁ — Linear spring constant
k₂ — Linear portion of the nonlinear spring constant
y₁ — Deflection of linear spring
y₂ — Displacement of attached mass
m — Mass
VIM — Variational iteration method
LHB — Linearized harmonic balance

Greek Symbols
ε — Perturbation parameter
ν — Deflection of nonlinear spring
ωₑ — Natural frequency
β — Coefficient of nonlinear spring force
ξ — Ratio of linear portion of the nonlinear spring constant to that of linear spring constant
λ — Lagrangian multiplier
References
1. Adomian, G.: J. Math. Anal. Appl. 135, 501 (1988)
2. Adomian, G.: Solving Frontier Problems of Physics: The Decomposition Method. Kluwer Academic, Boston (1994)
3. El-Sayed, S.M., Kaya, D.: Exact and numerical traveling wave solutions of Whitham–Broer–Kaup equations. J. Comput. Appl. Math. 167, 1339–1349 (2005)
4. Fakhari, A., Domairry, G., Ebrahimpour: Approximate explicit solutions of nonlinear BBMB equations by homotopy analysis method and comparison with the exact solution. Phys. Lett. A 368, 64–68 (2007)
5. Ganji, D.D., Sadighi, A.: Application of homotopy-perturbation and variational iteration methods to nonlinear heat transfer and porous media equations. J. Comput. Appl. Math. (2006)
6. Ganji, D.D., Jannatabadi, M., Mohseni, E.: Application of He's variational iteration method to nonlinear Jaulent–Miodek equations and comparing it with ADM. J. Comput. Appl. Math. (2006)
7. Hagedorn, P.: Non-linear Oscillations. Clarendon, Oxford (1988). (Translated by Wolfram Stadler)
8. He, J.H.: Variational iteration method: a kind of nonlinear analytical technique: some examples. Int. J. Non-Linear Mech. 34(4), 699–708 (1999)
9. He, J.H.: Variational iteration method for autonomous ordinary differential systems. Appl. Math. Comput. 114, 115–123 (2000)
10. He, J.H.: A coupling method of a homotopy technique and a perturbation technique for non-linear problems. Int. J. Non-Linear Mech. 35(1), 37–43 (2000)
11. He, J.H.: Homotopy perturbation method: a new nonlinear analytical technique. Appl. Math. Comput. 135(1), 73–79 (2003)
12. He, J.H.: Homotopy perturbation method for bifurcation of nonlinear problems. Int. J. Nonlinear Sci. Numer. Simul. 6(2), 207–208 (2005)
13. He, J.H.: New interpretation of homotopy-perturbation method. Int. J. Mod. Phys. B 20(18), 2561–2568 (2006)
14. He, J.H.: Homotopy perturbation method for solving boundary value problems. Phys. Lett. A 350(1–2), 87–88 (2006)
15. He, J.H.: Variational iteration method—some recent results and new interpretations. J. Comput. Appl. Math. 207, 67 (2008)
16. He, J.H.: Variational iteration method: new development and applications. Comput. Math. Appl. 54, 881 (2008)
17. He, J.H., Zhang, L.N.: Generalized solitary solution and compacton-like solution of the Jaulent–Miodek equations using the Exp-function method. Phys. Lett. A 372(7), 1044–1047 (2008)
18. Hosein Nia, S.H., Ranjbar, N.A., Soltani, H., Ghasemi, J.: Effect of the initial approximation on stability and convergence in homotopy perturbation method. Int. J. Nonlinear Dyn. Eng. Sci. 1, 79 (2008)
19. Lai, S.K., Lim, C.W.: Accurate approximate analytical solutions for nonlinear free vibration of systems with serial linear and nonlinear stiffness. J. Sound Vib. 307, 720–736 (2007)
20. Nayfeh, A.H., Mook, D.T.: Nonlinear Oscillations. Wiley, New York (1979)
21. Siddiqui, A.M., Mahmood, R., Ghori, Q.K.: Thin film flow of a third grade fluid on a moving belt by He's homotopy perturbation method. Int. J. Nonlinear Sci. Numer. Simul. 7(1), 7–14 (2006)
22. Telli, S., Kopmaz, O.: Free vibrations of a mass grounded by linear and nonlinear springs in series. J. Sound Vib. 289, 689–710 (2006)
23. Whitham, G.B.: Proc. R. Soc. Lond. Ser. A 299, 6 (1967)
24. Wu, H.X., He, J.H.: Solitary solutions, periodic solutions and compacton-like solutions using the Exp-function method. Comput. Math. Appl. 54(7–8), 966–986 (2007)
25. Xie, F., Yan, Z., Zhang, H.: Phys. Lett. A 285, 76 (2001)
Acta Appl Math (2009) 106: 369–420 DOI 10.1007/s10440-008-9304-5
Generalizing the Reciprocal Logarithm Numbers by Adapting the Partition Method for a Power Series Expansion Victor Kowalenko
Received: 26 June 2008 / Accepted: 21 August 2008 / Published online: 13 September 2008 © Springer Science+Business Media B.V. 2008
Abstract Recently, a novel method based on the coding of partitions was used to determine a power series expansion for the reciprocal of the logarithmic function, viz. z/ln(1 + z). Here we explain how this method can be adapted to obtain power series expansions for other intractable functions. First, the method is adapted to evaluate the Bernoulli numbers and polynomials. As a result, new integral representations and properties are determined for the former. Then via another adaptation of the method we derive a power series expansion for the function z^s/ln^s(1 + z), whose polynomial coefficients A_k(s) are referred to as the generalized reciprocal logarithm numbers because they reduce to the reciprocal logarithm numbers when s = 1. In addition to presenting a general formula for their evaluation, this paper presents various properties of the generalized reciprocal logarithm numbers, including general formulas for specific values of s, a recursion relation and a finite sum identity. Other representations in terms of special polynomials are also derived for the A_k(s), which yield general formulas for the highest order coefficients. The paper concludes by deriving new results involving infinite series of the A_k(s) for the Riemann zeta and gamma functions and other mathematical quantities. Keywords Absolute convergence · Bernoulli numbers · Bernoulli polynomials · Conditional convergence · Divergent series · Equivalence · Gamma function · Generalized reciprocal logarithm number · Harmonic number · Partition · Partition method for a power series expansion · Pochhammer polynomials · Polynomials · Recursion relation · Regularization · Riemann zeta function · Stirling numbers
V. Kowalenko, School of Physics, University of Melbourne, Parkville, Victoria 3010, Australia. e-mail: [email protected]

1 Introduction
In a recent work [1] a relatively novel number theoretic/graphical method was used to derive a new power series expansion for the reciprocal of the logarithmic function, viz. z/ln(1 + z). The resulting expansion consisted of powers of z, i.e. z^k, multiplied by coefficients A_k,
which were referred to as the reciprocal logarithm numbers. These numbers were found to be fractions with special properties, one of which was that they converged to zero as the order k tended to infinity. Hence, the reciprocal of the logarithmic function represents the generating function for the Ak , much like the reciprocal of the hyperbolic function, i.e. 1/ sinh(z), represents the generating function for the Bernoulli numbers, Bk . Because of this similarity it was intimated that the method used in Ref. [1] could be adapted to calculate the Bernoulli numbers despite the fact that they diverge as k → ∞. In this work we aim to describe how this method, which shall be referred to as the partition method for a power series expansion, can be adapted to obtain the Bernoulli numbers. By another adaptation of the method we shall show that the Bernoulli numbers can be expressed as elementary integrals of special polynomials, denoted here by gk (y), multiplied by an exponential factor. These new polynomials possess interesting properties of their own. Adapting the method in a different manner will enable us to calculate the Bernoulli polynomials, thereby demonstrating that the coefficients in the resulting power series expansion can be dependent upon variables rather than numerical. In so doing, we shall not only be introducing an entirely different method for calculating power series expansions from the standard Taylor series approach, but we shall also be demonstrating its power and versatility in situations, where the latter cannot be applied. Another example where the Taylor series approach cannot be applied is in the derivation of a power series expansion for the function zs / lns (1 + z), which grew out of a desire to derive new representations for the Riemann zeta and gamma functions. Perhaps, the reason why the Riemann hypothesis has not been solved yet is that we do not have sufficient material on the properties of the zeta function. 
Basically, all the existing results do not seem to reveal its logarithmic nature. Since the Riemann and Hurwitz zeta functions can be expressed in terms of a simple integral involving ln^s(1 + z) as we shall see later in this work, we can determine such behaviour by introducing the power series derived by adapting the partition method into these integrals. In deriving this power series expansion, we will be generalizing the reciprocal logarithm numbers presented in Ref. [1]. This paper is arranged as follows. In Sect. 2 it is shown how the partition method for a power series expansion can be adapted to calculate the numbers a_k, which are given by B_k/k!. Then the generating function is written as an exponential integral so that the partition method for a power series expansion can be adapted. As a result of this adaptation, the a_k are expressed as exponential integrals involving new polynomials g_k(y), which have interesting properties of their own. In Sect. 3 the method is adapted so that the Bernoulli polynomials, or rather a_k(x) (= B_k(x)/k!), can be evaluated from their generating function. Unlike the previous adaptation whereby a multinomial factor in the partition method was altered, this adaptation affects the coding of the elements, which are no longer assigned numerical values, but values that are dependent upon the variable x. In Sect. 4 we adapt the partition method by introducing the variable s in the multinomial factor to derive a new power series expansion for z^s/ln^s(1 + z). Since the coefficients of the resulting power series expansion reduce to the A_k in Ref. [1] when s = 1, we refer to them as the generalized reciprocal logarithm numbers and denote them by A_k(s). Furthermore, for s = −1, they become the coefficients of the standard Taylor series expansion for ln(1 + z)/z. In Sect. 5 we derive several properties of the A_k(s) including a recursion relation, which also involves the reciprocal logarithm numbers, and a finite sum identity for A_k(s + t). It is also shown that A_k(−s) can be expressed in terms of the coefficients of new polynomials h_k(t) and the Pochhammer polynomials, while A_k(s) can be expressed in terms of the coefficients of different polynomials p_k(t) and the Pochhammer polynomials. By developing general formulas for the lowest and highest orders of these polynomials and expressing the Pochhammer polynomials in terms of the Stirling
numbers of the first kind, we derive general formulas for several of the highest order terms in the A_k(s). In Sect. 6 we present various applications where the A_k(s) arise in infinite series, in particular the gamma and zeta functions. As a consequence, new results for Apéry's constant and ζ(4) are derived in terms of the harmonic numbers. Finally, the principal results in this work are summarized in the concluding section.
2 Bernoulli Numbers
Our starting point for the analysis in this work is the generating function for the Bernoulli numbers, which we write as

\[ \frac{t}{e^t - 1} \equiv \sum_{k=0}^{\infty} \frac{B_k}{k!}\, t^k. \tag{1} \]
Because the lhs of this result can also be written as

\[ \frac{t}{e^t - 1} = \frac{t\, e^{-t/2}}{2 \sinh(t/2)}, \tag{2} \]
we can regard the reciprocal of the hyperbolic sine function as the generating function for the Bernoulli numbers, while the result itself can be obtained by putting f(x) = exp(−xt), h = 1, x_1 = 0 and x_N = ∞ in the following version of the Euler-Maclaurin summation formula:

\[ \int_{x_1}^{x_N} dx\, f(x) \equiv h \sum_{k=1}^{N} f_k + \frac{h}{2}\, (f_N - f_1) - \sum_{k=1}^{\infty} \frac{B_{2k}\, h^{2k}}{(2k)!}\, \bigl( f_N^{(2k-1)} - f_1^{(2k-1)} \bigr). \tag{3} \]
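As a quick numerical sketch of (3) (our own illustration, not from the paper), one can take f(x) = exp(−xt) with h = 1, x_1 = 0 and x_N = ∞, so that the lhs is 1/t. The grid convention x_k = k − 1 and both truncation orders below are our assumptions:

```python
import math

# Numerical sketch of (3) with f(x) = exp(-x*t), h = 1, x_1 = 0, x_N = infinity,
# so that f_k = exp(-(k-1)*t), f_1 = 1, f_N -> 0 and f^{(2k-1)}(0) = -t**(2*k-1).
t = 0.5
lhs = 1.0 / t                                  # integral of exp(-x*t) over [0, infinity)

bernoulli = {2: 1/6, 4: -1/30, 6: 1/42}        # B_2, B_4, B_6
fk_sum = sum(math.exp(-k * t) for k in range(0, 200))   # h * sum of f_k, truncated
rhs = fk_sum + 0.5 * (0.0 - 1.0)               # plus (h/2) * (f_N - f_1)
for two_k, b in bernoulli.items():
    # minus (B_2k h^{2k}/(2k)!) * (f_N^{(2k-1)} - f_1^{(2k-1)}), with f_N^{(2k-1)} = 0
    rhs -= b / math.factorial(two_k) * (0.0 - (-t ** (two_k - 1)))

print(lhs, rhs)   # both sides close to 2.0
```

With only three Bernoulli terms the two sides already agree to better than one part in 10^6, which is consistent with the rapid decay of B_{2k}/(2k)! noted below.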
The major difference between the above results and those appearing in the mathematical literature is that equivalence symbols have replaced equals signs. This is because the rhs's of both equivalences can become divergent. According to p. 138 of Press et al. [2], Equivalence (3) and hence, Equivalence (1) are not convergent because the Bernoulli numbers diverge. This statement, however, is incorrect because although the B_k diverge as k → ∞, B_{2k}/(2k)! converge. To observe this, from No. 9.616 in Ref. [3] for k ≥ 1, we have

\[ \frac{B_{2k}}{(2k)!} = 2\, \frac{(-1)^{k+1}}{(2\pi)^{2k}}\, \zeta(2k). \tag{4} \]
As k increases, ζ(2k) decreases, which means that B_{2k}/(2k)! converges to zero rapidly, courtesy of the (2π)^{2k} factor in the denominator of (4). Thus, the divergence of the Bernoulli numbers has no effect on the convergence of either of the two equivalences. Instead, it is the power series that determines whether the rhs of each equivalence is convergent or not. In this paper we shall be concerned with applying the partition method for a power series expansion in the calculation of B_k/k! rather than the direct evaluation of the Bernoulli numbers since it is the former that appear in the generating function and in the Euler-Maclaurin summation formula. As a consequence, we shall set a_k = B_k/k! and concentrate more on the properties of this infinite set of converging rationals. Furthermore, when we consider the evaluation of the Bernoulli polynomials, we shall concentrate on the corresponding polynomials a_k(x) = B_k(x)/k!.
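Identity (4) can be checked directly; the following small sketch (ours) uses the textbook values B_2 = 1/6, B_4 = −1/30, B_6 = 1/42 and ζ(2) = π²/6, ζ(4) = π⁴/90, ζ(6) = π⁶/945:

```python
import math

# Check of (4): B_{2k}/(2k)! = 2*(-1)**(k+1)*zeta(2k)/(2*pi)**(2*k)
pairs = [(1, 1/6, math.pi**2 / 6),
         (2, -1/30, math.pi**4 / 90),
         (3, 1/42, math.pi**6 / 945)]
for k, b2k, zeta2k in pairs:
    lhs = b2k / math.factorial(2 * k)
    rhs = 2 * (-1) ** (k + 1) * zeta2k / (2 * math.pi) ** (2 * k)
    print(k, lhs, rhs)   # the two columns agree and shrink rapidly with k
```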
To determine when the power series in Equivalence (1) converges, we introduce (4) into the rhs and replace the zeta function by its Dirichlet series form, thereby obtaining

\[ \sum_{k=0}^{\infty} a_k t^k = 1 - \frac{t}{2} + 2 \sum_{l=1}^{\infty} \sum_{k=1}^{\infty} (-1)^{k+1} \left( \frac{t}{2\pi l} \right)^{2k}. \tag{5} \]
The inner series in the above equation is merely a variant of the geometric series, Σ_{k=0}^∞ z^k. For |z| < 1, it is well-known that this series is absolutely convergent with a limit value of 1/(1 − z). However, what is not well-known is that the series is conditionally convergent for ℜ(z) < 1 and |z| > 1, and divergent for ℜ(z) > 1. Although this issue has been discussed in Refs. [4] and [5], since it is critical for understanding the concepts presented in this work, we present its proof in the following lemma.

Lemma. The geometric series Σ_{k=0}^∞ z^k is absolutely convergent for |z| < 1, conditionally convergent for ℜ(z) < 1 and |z| > 1, and divergent for ℜ(z) > 1. For the latter case, the series must be regularized by removing the infinity in the remainder in order to obtain a finite value. When regularized, the series yields the same limit value of 1/(1 − z) as when the series is convergent.

Proof. We begin by writing the geometric series as

\[ \sum_{k=0}^{\infty} z^k = \sum_{k=0}^{\infty} \Gamma(k+1)\, \frac{z^k}{k!} = \lim_{p \to \infty} \sum_{k=0}^{\infty} \frac{z^k}{k!} \int_0^p dt\, e^{-t}\, t^k. \tag{6} \]
Since the integral in (6) is finite, technically, we can interchange the order of the summation and integration. In reality, an impropriety is occurring when we do this, which we shall discuss in more detail shortly. For now, interchanging the summation and integration yields

\[ \sum_{k=0}^{\infty} z^k = \lim_{p \to \infty} \int_0^p dt\, e^{-t} \sum_{k=0}^{\infty} \frac{(zt)^k}{k!} = \lim_{p \to \infty} \int_0^p dt\, e^{-t(1-z)} = \lim_{p \to \infty} \left( \frac{e^{-p(1-z)}}{z-1} + \frac{1}{1-z} \right). \tag{7} \]
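The behaviour of the last member of (7) can be seen numerically; in this sketch (ours) the sample point z = −1 − i satisfies ℜ(z) < 1 but |z| > 1, so the exponential term dies off and the finite value 1/(1 − z) is recovered:

```python
import cmath

# Last member of (7): e^{-p(1-z)}/(z-1) + 1/(1-z)
def last_member(z, p):
    return cmath.exp(-p * (1 - z)) / (z - 1) + 1 / (1 - z)

z = -1 - 1j                       # Re(z) < 1 but |z| = sqrt(2) > 1
for p in (1, 5, 20, 50):
    print(p, last_member(z, p))   # tends to 1/(1 - z) = 0.4 - 0.2i
```

For ℜ(z) > 1 the same expression blows up as p grows, which is the infinity that regularization removes.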
For ℜ(z) < 1, the first term in the last member of the above equation vanishes and the series yields a finite value of 1/(1 − z). Therefore, we see that the same value is obtained for the series when ℜ(z) < 1 as for when z lies in the circle of absolute convergence given by |z| < 1. According to the definition on p. 18 of Ref. [6], this means that the series is conditionally convergent for ℜ(z) < 1 and |z| > 1. For ℜ(z) > 1, however, the first term in the last member of (7) is infinite. We define regularization as the process of removing the infinity in the remainder so as to make it summable. Then we are left with a finite part that equals 1/(1 − z) in the same manner as when a divergent integral is regularized in the theory of generalized functions [7, 8]. Hence, for all complex values of z except z = 1, we have

\[ \sum_{k=0}^{\infty} z^k \;
\begin{cases} \equiv \dfrac{1}{1-z}, & \Re(z) > 1, \\[4pt] = \dfrac{1}{1-z}, & \Re(z) < 1. \end{cases} \tag{8} \]

One important property of the above result is that regardless of whether the geometric series is divergent or convergent, the value of 1/(1 − z) is unique for all z lying in the principal
branch of the complex plane, which is critical for the development of a theory of divergent series. Frequently, we will not know for which values of z an asymptotic series is convergent and for which it is divergent. So, we shall replace the equals sign by the less stringent equivalence symbol on the understanding that we may be dealing with a series that is absolutely or conditionally convergent for some values of z. As a result, we shall adopt the shorthand notation of

\[ \sum_{k=N}^{\infty} z^k = z^N \sum_{k=0}^{\infty} z^k \equiv \frac{z^N}{1-z}. \tag{9} \]
We shall refer to such results as equivalences since they cannot be regarded strictly as equations. In fact, it is simply incorrect to refer to the above as an equation since we have seen for ℜ(z) > 1 that the lhs is infinite. Furthermore, the above notation is only applicable when the form for the regularized value of the divergent series is identical to the form of the limiting value of the convergent series, which is not always the case as explained in Ref. [5]. At the barrier of ℜ(z) = 1, the situation appears to be unclear. For z = 1 the last member of (7) vanishes, which is consistent with removing the infinity from 1/(1 − z). For other values with ℜ(z) = 1, the last member of (7) is clearly undefined. This is to be expected as this line forms the border between the domains of convergence and divergence for the series. Because the finite value remains the same to the right and to the left of the barrier at ℜ(z) = 1 and in keeping with the fact that regularization is effectively the removal of the first term in the last member of (7), we take 1/(1 − z) to be the finite or regularized value when ℜ(z) = 1. Hence, Equivalence (8) becomes

\[ \sum_{k=0}^{\infty} z^k \;
\begin{cases} \equiv \dfrac{1}{1-z}, & \Re(z) \ge 1, \\[4pt] = \dfrac{1}{1-z}, & \Re(z) < 1. \end{cases} \tag{10} \]

As discussed in Refs. [4, 5] regularization is a mathematical abstraction. However, when an asymptotic method such as Laplace's method, the method of steepest descent or the method of iterating a differential equation is employed, an impropriety has occurred, which results in an infinity appearing in the remainder. In the above instance, we have altered the order of integration and summation without paying strict attention to the fact that the geometric series is the Taylor series expansion of 1/(1 − z), which is only valid for |z| < 1.
That is, by taking the Taylor series expansion for exp(zt) and carrying out the integration with z outside the circle of absolute convergence for the geometric series, we have committed an impropriety, as is the case when applying any asymptotic method to a function or integral. In committing this impropriety, we have seen that it does not mean that the series is always divergent outside the domain of absolute convergence. In fact, there is a barrier, where to one side the series is either absolutely or conditionally convergent and to the other side, it is divergent. Even though the original function is still finite, as a result of applying an asymptotic method, we can obtain an infinity in the remainder. Regularization is, therefore, a necessary step for correcting the impropriety. It should also be mentioned that in Ref. [9] an alternative technique of regularizing a divergent series known as Mellin-Barnes regularization [4, 5, 10] is applied to the geometric series. In this approach the regularized value for a divergent series is expressed as a Mellin-Barnes integral. For example, the regularized value for the geometric series becomes

\[ \sum_{k=0}^{\infty} z^k \equiv - \int_{c - i\infty}^{c + i\infty} ds\, \frac{z^s}{e^{2 i \pi s} - 1}, \qquad -1 < c < 0. \tag{11} \]
When this result is evaluated numerically for z equal to −1, −1 + i, 1 + i and −1 − i, it gives the identical values as would be obtained by evaluating 1/(1 − z) within the machine precision of the computing system. E.g., using Mathematica 4.1 on a Pentium computer with z = −1 − i gives a value of 0.39999999999999991 − 0.2000000000000003i, which is well within the accuracy and precision goals set in the integration routine. This completes the proof of the lemma.

From the lemma we see that the inner series on the rhs of (5) is absolutely convergent whenever |t/2πl| < 1. Since l ranges from unity to infinity, the equation is absolutely convergent for |t/2π| < 1. In addition, the inner series is conditionally convergent whenever ℜ(−t^2/4π^2 l^2) < 1 and |t^2/4π^2 l^2| > 1. In this case, there will be a certain number of l values before the second condition is no longer valid and then the series for l values above this value will be absolutely convergent. However, if there is only one value of l where ℜ(−t^2/4π^2 l^2) > 1, then (5) is divergent in which case the series will have to be regularized to yield a finite value. Therefore, the formal version of Equivalence (1) is given by

\[ \sum_{k=0}^{\infty} a_k t^k \;
\begin{cases} = \dfrac{t}{e^t - 1}, & \Re(t^2/4\pi^2) > -1, \\[4pt] \equiv \dfrac{t}{e^t - 1}, & \Re(t^2/4\pi^2) \le -1. \end{cases} \tag{12} \]

The divergent behaviour of the series on the lhs of the above result seems to have been overlooked by a great number of mathematical texts including Refs. [2] and [6], although Gradshteyn et al. provide the condition for absolute convergence, viz. |t/2π| < 1, when presenting the generating function for the Bernoulli numbers as No. 9.621 in Ref. [3]. However, the condition does not appear in earlier editions. As mentioned in the proof of the lemma, frequently we shall not need to be concerned with the domains over which a series is convergent or divergent. Hence, we replace the equals sign by the less restrictive equivalence symbol to connect both sides of a mathematical statement.
Then the ensuing result becomes an equivalence as in Equivalence (1). We now apply the partition method for a power series expansion in the evaluation of the a_k. As mentioned in the introduction, in this paper we are going to study several problems involving different adaptations of the method, thereby demonstrating its versatility. The first and simplest adaptation will be a direct numerical evaluation of the a_k, while the second will express the a_k in terms of exponential integrals of special polynomials. To apply the partition method for a power series expansion to Equivalence (1), we begin by expanding the denominator on the rhs as a power series, which yields

\[ \sum_{k=0}^{\infty} a_k t^k \equiv \frac{1}{1 + t/2! + t^2/3! + t^3/4! + t^4/5! + \cdots}. \tag{13} \]
We can regard the rhs of this equivalence as the regularized value for the geometric series, where the variable z is equal to an infinite series, viz. Σ_{k=1}^∞ t^k/(k+1)!. If we replace the regularized value by the power series representation for the geometric series, then we arrive at

\[ \sum_{k=0}^{\infty} a_k t^k = \sum_{k=0}^{\infty} (-1)^k \left( \frac{t}{2!} + \frac{t^2}{3!} + \frac{t^3}{4!} + \frac{t^4}{5!} + \cdots \right)^k. \tag{14} \]

Note that because both series yield the same regularized value, they are basically equal to one another and hence, an equals sign appears in the above result.
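The expansion (14) can be carried out mechanically with exact rational arithmetic; the sketch below (our own helper names, with an arbitrary truncation order K) recovers the first few a_k:

```python
from fractions import Fraction
from math import factorial

K = 8  # truncation order in t, an arbitrary choice for this sketch

def mul(a, b):
    # product of two truncated power series in t (coefficient lists)
    c = [Fraction(0)] * (K + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= K:
                c[i + j] += ai * bj
    return c

# w = t/2! + t^2/3! + ... ; the rhs of (14) is sum_k (-w)^k
w = [Fraction(0)] + [Fraction(1, factorial(j + 2)) for j in range(K)]
neg_w = [-x for x in w]
a = [Fraction(0)] * (K + 1)
power = [Fraction(1)] + [Fraction(0)] * K
for _ in range(K + 1):          # (-w)^k starts at order k, so K+1 terms suffice
    a = [x + y for x, y in zip(a, power)]
    power = mul(power, neg_w)

print(a[:5])   # a_0..a_4 equal 1, -1/2, 1/12, 0, -1/720
```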
The first few a_k can be easily evaluated since they require only small values of k on the rhs of (14). For example, we find that by putting k = 0, a_0 = 1, while putting k = 1 gives a_1 = −1/2. For the coefficient of t^2 or a_2 there are two contributions, one coming from the k = 1 term and the other from the k = 2 term in (14). Hence, we find that a_2 = (1/2!)^2 − 1/3! = 1/12 or B_2/2!. The calculation of the coefficients of higher order terms becomes progressively more difficult as there is an increasing number of contributions to the a_k as k increases. Therefore, we need to apply the partition method for a power series expansion. As stated in the introduction the partition method for a power series expansion represents a number theoretic/graphical method. This is because it relies on a tree diagram to determine all the distinct partitions that sum to k when we wish to evaluate a_k. Later we shall present a general formula for the a_k, which does not require a knowledge of the partitions, although such a knowledge is valuable in reducing the redundancy in the general formula. As discussed in Refs. [1, 11, 12], the tree diagram for the a_k is constructed by drawing branch lines to all pairs of numbers that can be summed to k, where the first number in the tuple is an integer less than or equal to [k/2]. For example, the tree diagram for a_6 possesses branch lines to (0, 6), (1, 5), (2, 4) and (3, 3). Whenever a zero appears as the first element of a tuple, the path stops, as evidenced by (0, 6). For the other pairs, one draws branch lines to all pairs with integers that sum to the second number under the prescription that the first member of each new tuple is now less than or equal to half its second member. For the tuple (1, 5) we get paths branching out to (0, 5), (1, 4) and (2, 3). This recursive approach continues until all paths are terminated with a tuple containing a zero as its first member.
It is obvious that all the first members plus the second member of the final tuple in each path represents a partition for k. For example, the path consisting of (1, 5), (1, 4), (2, 2) and (0, 2) represents the partition {1, 1, 2, 2}. Unfortunately, this is not the end of the matter as there may be duplicated paths which involve the same partition. Whenever a duplicated path occurs, it must be removed from the tree diagram, thereby ensuring that each distinct partition appears only once in a tree diagram. Finally, one ends up with the tree diagram given in Ref. [1]. Although the construction of a tree diagram is not necessary for the evaluation of the a_k, it does ensure that all partitions are captured whereas there is a danger that partitions can be omitted when one attempts to write them down directly. Another advantage of the method is that it is a simple matter to exclude elements from the diagram, which would have been necessary if one or more of the coefficients in the power series in the large parentheses in (14) were zero. For example, if the coefficient of t^3 in the power series in the large parentheses in (14) had been zero instead of −1/4!, then the tree diagram could still be constructed in the same manner except there would be no branches containing a three in a tuple. The most important step in the partition method is to determine the contributions due to each partition in the tree diagram. These are not only dependent upon the total number of elements or parts in the partitions, N, but also on the values of the elements, l_i. Furthermore, we let n_i represent the number of occasions or frequency with which each l_i appears in a particular partition. If there are j elements in a partition, then Σ_{i=1}^j n_i = N, while Σ_{i=1}^j n_i l_i = k. Up till now the evaluation of the a_k has mirrored the evaluation of the reciprocal logarithm numbers in Ref. [1]. Now comes the difference.
Instead of assigning a value of (−1)^{l_i+1}/(l_i + 1) to each l_i in the partition as we did in Ref. [1], we now assign a value of −1/(l_i + 1)!. The multinomial factor, which arises because each n_i represents a power of the series in (14), remains the same as in Ref. [1] and is given by N!/(n_1! · · · n_j!). Therefore, if the contribution from our general partition to the a_k is denoted by C([l_1, n_1], [l_2, n_2], . . . , [l_j, n_j]), where
1 ≤ j ≤ k, then we arrive at

\[ C\bigl([l_1, n_1], \ldots, [l_j, n_j]\bigr) = \frac{N!}{n_1!\, n_2! \cdots n_j!} \prod_{i=1}^{j} \left( \frac{-1}{(l_i + 1)!} \right)^{n_i}. \tag{15} \]
For the partition involving the path (1, 5), (1, 4), (2, 2) and (0, 2), l_1 = 1, n_1 = 2, l_2 = 2 and n_2 = 2, while i ranges from 1 to 2 in the product. Thus, the contribution from this partition to a_6 is given by

\[ C\bigl([1, 2], [2, 2]\bigr) = \frac{4!}{2! \cdot 2!} \left( \frac{-1}{2!} \right)^2 \left( \frac{-1}{3!} \right)^2, \tag{16} \]
which yields a value of 1/24. To determine a_6, we require the contributions from all partitions, whose sum is equal to 6. These are {6}, {1, 5}, {1, 1, 4}, {1, 1, 1, 3}, {1, 1, 1, 1, 2}, {1, 1, 1, 1, 1, 1}, {1, 2, 3}, {2, 4}, {2, 2, 2}, {1, 1, 2, 2} and {3, 3}. Hence, a_6 is given by

\[ \begin{aligned} a_6 = {} & -\frac{1}{7!} + \frac{2!}{1! \cdot 1!} \left( \frac{-1}{2!} \right) \left( \frac{-1}{6!} \right) + \frac{3!}{2! \cdot 1!} \left( \frac{-1}{2!} \right)^2 \left( \frac{-1}{5!} \right) + \frac{4!}{3! \cdot 1!} \left( \frac{-1}{2!} \right)^3 \left( \frac{-1}{4!} \right) \\ & + \frac{5!}{4! \cdot 1!} \left( \frac{-1}{2!} \right)^4 \left( \frac{-1}{3!} \right) + \frac{6!}{6!} \left( \frac{-1}{2!} \right)^6 + \frac{3!}{1! \cdot 1! \cdot 1!} \left( \frac{-1}{2!} \right) \left( \frac{-1}{3!} \right) \left( \frac{-1}{4!} \right) \\ & + \frac{2!}{1! \cdot 1!} \left( \frac{-1}{3!} \right) \left( \frac{-1}{5!} \right) + \frac{3!}{3!} \left( \frac{-1}{3!} \right)^3 + \frac{4!}{2! \cdot 2!} \left( \frac{-1}{2!} \right)^2 \left( \frac{-1}{3!} \right)^2 + \frac{2!}{2!} \left( \frac{-1}{4!} \right)^2 = \frac{1}{42 \cdot 6!}. \end{aligned} \tag{17} \]
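The bookkeeping in (15)-(17) is easily automated; the following minimal sketch (ours) enumerates the distinct partitions of k and applies the assigned value −1/(l + 1)! to each element l:

```python
from fractions import Fraction
from math import factorial

def partitions(n, largest=None):
    # generate the distinct partitions of n as {element: frequency} dictionaries
    if largest is None:
        largest = n
    if n == 0:
        yield {}
        return
    for part in range(min(n, largest), 0, -1):
        for rest in partitions(n - part, part):
            freq = dict(rest)
            freq[part] = freq.get(part, 0) + 1
            yield freq

def contribution(freq):
    # multinomial N!/(n_1! ... n_j!) times the product over elements, cf. (15)
    N = sum(freq.values())
    c = Fraction(factorial(N))
    for l, n in freq.items():
        c /= factorial(n)
        c *= Fraction(-1, factorial(l + 1)) ** n
    return c

a6 = sum(contribution(f) for f in partitions(6))
print(a6, factorial(6) * a6)   # a_6 = 1/30240 and B_6 = 6! a_6 = 1/42
```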
As B_6 = 6! a_6, we see that B_6 = 1/42. The difference between evaluating the a_k via the partition method and evaluating the reciprocal logarithm numbers in Ref. [1] is simply that a value of −1/(l + 1)! has been assigned to each element in a partition instead of (−1)^{l+1}/(l + 1). Therefore, just by altering the assigned values to the elements in the partitions, one obtains entirely different mathematical quantities. An interesting feature of the partition method is that although the tree diagram is recursive in nature, it is different from the standard recursion relation for the Bernoulli numbers, which can simply be obtained by multiplying Equivalence (1) by exp(t) − 1. Hence, the equivalence becomes

\[ t \equiv \sum_{k=0}^{\infty} \frac{B_k}{k!}\, t^k \left( \sum_{k=0}^{\infty} \frac{t^k}{k!} - 1 \right) = \sum_{k=0}^{\infty} t^k \left( \sum_{j=0}^{k} \frac{B_{k-j}}{(k-j)!\, j!} - \frac{B_k}{k!} \right). \tag{18} \]
We have already seen that if |t/2π| < 1, then we can replace the equivalence symbol by an equals sign. Thus, under this condition we have an equation between the lhs and the rightmost expression. Since t is arbitrary in the disk of absolute convergence, we can equate like powers of t on both sides of the resulting equation, which gives

\[ \frac{B_k}{k!} = \sum_{j=0}^{k} \frac{B_{k-j}}{(k-j)!\, j!}, \tag{19} \]
or better still,

\[ \sum_{j=1}^{k} \frac{a_{k-j}}{j!} = 0. \tag{20} \]
Equation (19) is more commonly written as B_k = \sum_{j=0}^{k} \binom{k}{j} B_{k-j}. It should be noted that (20) is not the only recursion relation that can be derived from Equivalence (1). If we divide by t and differentiate the equivalence, then we obtain

\[ -\frac{1}{e^t - 1} - \frac{1}{(e^t - 1)^2} \equiv -\frac{1}{t^2} + \sum_{k=1}^{\infty} (k-1)\, a_k t^{k-2}. \tag{21} \]
By using Equivalence (1) we can write the lhs of the above equivalence as

\[ -\frac{1}{e^t - 1} - \frac{1}{(e^t - 1)^2} \equiv -\sum_{k=0}^{\infty} a_k t^{k-1} - \left( \sum_{k=0}^{\infty} a_k t^{k-1} \right)^2. \tag{22} \]
Therefore, the divergent series on the rhs's of the two equivalences give the same regularized value. Hence, they must equal one another. If the reader is uncomfortable with this notion, we could have easily restricted t to the disk of absolute convergence and then we would have been dealing with equations in which we would arrive at the same conclusion that the rhs's of Equivalences (21) and (22) are equal to one another. Thus, we arrive at

\[ -\frac{1}{t^2} + \sum_{k=1}^{\infty} (k-1)\, a_k t^{k-2} = -\sum_{k=0}^{\infty} a_k t^{k-1} - \sum_{k=0}^{\infty} t^{k-2} \sum_{j=0}^{k} a_j a_{k-j}. \tag{23} \]
Equating powers of t yields

\[ (k-1)\, a_k + a_{k-1} + \sum_{j=0}^{k} a_j a_{k-j} = 0. \tag{24} \]
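As a cross-check (ours), coefficients generated by the linear recursion (20) should also satisfy the quadratic recursion (24):

```python
from fractions import Fraction
from math import factorial

# a_k = B_k/k! from the linear recursion (20)
K = 10
a = [Fraction(1)]
for m in range(1, K + 1):
    a.append(-sum(a[m + 1 - j] / factorial(j) for j in range(2, m + 2)))

# residual of (24): (k-1)*a_k + a_{k-1} + sum_j a_j a_{k-j}
for k in range(2, K + 1):
    residual = (k - 1) * a[k] + a[k - 1] + sum(a[j] * a[k - j] for j in range(k + 1))
    print(k, residual)   # every residual is 0
```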
Now that we have derived the standard recursion relation as given by either (19) or (20), let us see how different it is from the partition method when evaluating a_4. Putting k = 5 in (20) yields

\[ a_4 = -\frac{a_2}{3!} - \frac{a_1}{4!} - \frac{a_0}{5!} = -\frac{a_0}{2! \cdot 2! \cdot 3!} + \frac{a_0}{3! \cdot 3!} + \frac{a_0}{2! \cdot 4!} - \frac{a_0}{5!}. \tag{25} \]
In arriving at this result, we have used (20) repeatedly to express a_2 and a_1 in terms of a_0. Since a_0 = 1, the above result yields

\[ a_4 = -\frac{1}{24} + \frac{1}{36} + \frac{1}{48} - \frac{1}{120} = -\frac{1}{720} = -\frac{1}{30 \cdot 4!} = \frac{B_4}{4!}. \tag{26} \]
If we calculate a_4 via the partition method, then we need to consider the partitions {4}, {1, 3}, {1, 1, 2}, {2, 2} and {1, 1, 1, 1}. Hence, we find that

\[ a_4 = -\frac{1}{5!} + \frac{2!}{1! \cdot 1!} \left( \frac{-1}{2!} \right) \left( \frac{-1}{4!} \right) + \frac{3!}{2! \cdot 1!} \left( \frac{-1}{2!} \right)^2 \left( \frac{-1}{3!} \right) + \frac{2!}{2!} \left( \frac{-1}{3!} \right)^2 + \frac{4!}{4!} \left( \frac{-1}{2!} \right)^4 \]
\[ = -1/120 + 1/24 - 1/8 + 1/36 + 1/16 = -1/720. \tag{27} \]
In comparing (27) with (26), we see that there is not only a different number of contributions required to evaluate a_4, but that the numerical values of the contributions are for the most part different. This demonstrates that the partition method for a power series expansion operates in a different manner from the standard recursion relation for the Bernoulli numbers. In Ref. [1] we presented a general formula for the reciprocal logarithm numbers arising out of the partition method. We can do the same for the a_k after making appropriate modifications. Although the general formula does not require knowledge of the specific partitions, much redundancy occurs in the general formula if they are known and as a result, the calculation of the coefficients is made easier. The first modification to the general formula presented in Ref. [1] is that we do not have to consider a_{2k+1} for k > 0 since these coefficients are zero. Hence, we only need to consider the evaluation of a_{2k}, which means, in turn, that we only need to consider the partitions that sum to 2k. Then the elements range from 1 to 2k with frequencies n_1 to n_{2k}. This is a major difference from the previous application of the partition method for a power series expansion in that now n_1 represents the number of ones in a partition, n_2 the number of twos, n_3 the number of threes, etc. Each frequency n_i ranges from 0 to [2k/i] such that n_1 ranges from 0 to 2k, n_2 from 0 to k, n_3 from 0 to [2k/3], and so on. As a result of this modification, we need to code the partitions differently with each element i in a partition now assigned a value of −1/(i + 1)!. Therefore, the a_{2k} are given by
\[ a_{2k} = \sum_{\substack{n_1, n_2, n_3, \ldots, n_{2k} = 0 \\ n_1 + 2 n_2 + 3 n_3 + \cdots + 2k\, n_{2k} = 2k}}^{2k,\, k,\, [2k/3],\, \ldots,\, 1} \frac{(n_1 + n_2 + \cdots + n_{2k})!}{n_1!\, n_2! \cdots n_{2k}!} \left( -\frac{1}{2!} \right)^{n_1} \left( -\frac{1}{3!} \right)^{n_2} \left( -\frac{1}{4!} \right)^{n_3} \cdots \left( -\frac{1}{(2k+1)!} \right)^{n_{2k}}. \tag{28} \]
By redundancy it is meant that many of the n_i will equal zero. Hence, frequently 0! and (−1/(i + 1)!)^0 will occur in the above formula, which are taken to equal unity. We can also simplify the notation in this result, which will be done when we develop a general formula for the Bernoulli polynomials in the following section. For the time being, however, it is more instructive to leave the result for the a_{2k} in this form. For |t| < 2π, we can replace the equivalence symbol by an equals sign and write Equivalence (1) as

\[ e^t = 1 + \frac{t}{1 + \sum_{k=1}^{\infty} a_k t^k}. \tag{29} \]
We can treat the second term on the rhs of (29) as the regularized value for the geometric series. Then the above result yields

\[ e^t \equiv 1 + t \sum_{k=0}^{\infty} \left( -\sum_{j=1}^{\infty} a_j t^j \right)^k = 1 + t \sum_{k=0}^{\infty} \left( -a_1 t - a_2 t^2 - a_3 t^3 - \cdots \right)^k. \tag{30} \]
Expanding the rhs of Equivalence (30) for the lowest order terms in t yields

\[ e^t \equiv 1 + t - a_1 t^2 + (a_1^2 - a_2)\, t^3 + (2 a_1 a_2 - a_1^3)\, t^4 + O(t^5). \tag{31} \]

Since a_1 = −1/2 and a_2 = 1/12, we find that the rhs yields 1 + t + t^2/2! + t^3/3! + t^4/4! + O(t^5). In other words we obtain the Taylor series expansion for exp(t). Since the Taylor series
expansion for exp(t) is convergent, we can replace the equivalence symbol by an equals sign in the above result, which means that it is an equation now. Thus, employing the partition method does not necessarily imply that a divergent power series expansion will result. Let us apply the partition method to (30). The first property we note is that a_{2k+1} is zero for k > 0. This means that we can remove all elements 2i + 1, where i > 0, from the tree diagrams for each a_k. In addition, if we let n_i represent the number of occasions the element i appears in a partition, then n_{2i+1} = 0 for i > 0. Previously, we assigned a value of −1/(i + 1)! for each element i when we applied the partition method in developing a general formula for the a_k. If we compare the series in (14) with that in (30), then we see that the coefficient of t^i is −a_i instead of −1/(i + 1)!. Furthermore, when we develop a power series in t, we need to take into account the additional factor of t outside the series in (30). That is, the coefficient of t^i obtained when we apply the partition method to the series in (30) will represent the coefficient of t^{i+1} overall. Therefore, application of the partition method to the power series on the rhs of (30) yields

\[ \sum_{k=0}^{\infty} \frac{t^k}{k!} = 1 + \sum_{k=0}^{\infty} \alpha_k t^{k+1}, \tag{32} \]
where

\[ \alpha_k = \sum_{\substack{n_1, n_2, n_4, \ldots, n_k = 0 \\ n_1 + 2 n_2 + 4 n_4 + \cdots + k\, n_k = k}}^{k,\, [k/2],\, [k/4],\, \ldots,\, 1} \frac{(n_1 + n_2 + n_4 + \cdots + n_k)!}{n_1!\, n_2!\, n_4! \cdots n_k!}\, (-a_1)^{n_1} (-a_2)^{n_2} (-a_4)^{n_4} \cdots (-a_k)^{n_k}. \tag{33} \]
Since t is arbitrary, we can equate like powers of t on both sides of the above equation. Furthermore, if k is odd, then n_k = 0 and the coefficient α_k will stop at n_{k−1}. As a result, we can separate even and odd values of k, thereby obtaining

\[ \frac{1}{(2k+1)!} = \sum_{\substack{n_1, n_2, n_4, \ldots, n_{2k} = 0 \\ n_1 + 2 n_2 + 4 n_4 + \cdots + 2k\, n_{2k} = 2k}}^{2k,\, k,\, [k/2],\, \ldots,\, 1} \frac{(n_1 + n_2 + n_4 + \cdots + n_{2k})!}{n_1!\, n_2!\, n_4! \cdots n_{2k}!}\, (-a_1)^{n_1} (-a_2)^{n_2} (-a_4)^{n_4} \cdots (-a_{2k})^{n_{2k}} \tag{34} \]

and

\[ \frac{1}{(2k+2)!} = \sum_{\substack{n_1, n_2, n_4, \ldots, n_{2k} = 0 \\ n_1 + 2 n_2 + 4 n_4 + \cdots + 2k\, n_{2k} = 2k+1}}^{2k+1,\, k,\, [k/2],\, \ldots,\, 1} \frac{(n_1 + n_2 + n_4 + \cdots + n_{2k})!}{n_1!\, n_2!\, n_4! \cdots n_{2k}!}\, (-a_1)^{n_1} (-a_2)^{n_2} (-a_4)^{n_4} \cdots (-a_{2k})^{n_{2k}}. \tag{35} \]
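The inverse identities can be checked by summing over the partitions of k whose elements are 1 and the even integers, with each element l weighted by −a_l as in (33); the result should be the Taylor coefficient 1/(k + 1)! of (exp(t) − 1)/t. A sketch (ours, with our own helper names):

```python
from fractions import Fraction
from math import factorial

K = 7
a = [Fraction(1)]                                # a_l = B_l/l! via the recursion (20)
for m in range(1, K + 1):
    a.append(-sum(a[m + 1 - j] / factorial(j) for j in range(2, m + 2)))

def partitions(n, parts):
    # partitions of n with parts drawn from the ascending list `parts`
    if n == 0:
        yield {}
        return
    for idx, p in enumerate(parts):
        if p <= n:
            for rest in partitions(n - p, parts[idx:]):
                freq = dict(rest)
                freq[p] = freq.get(p, 0) + 1
                yield freq

allowed = [1] + list(range(2, K + 1, 2))         # elements 1, 2, 4, 6, ...
alphas = []
for k in range(K + 1):
    alpha = Fraction(0)
    for freq in partitions(k, allowed):
        term = Fraction(factorial(sum(freq.values())))
        for l, n in freq.items():
            term /= factorial(n)
            term *= (-a[l]) ** n
        alpha += term
    alphas.append(alpha)

print(alphas)   # 1, 1/2!, 1/3!, ..., i.e. alpha_k = 1/(k+1)!
```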
The above equations represent inverse identities of (28) in that powers of the a_k appear in the sum on the rhs to yield 1/k!, whereas previously we had powers of 1/(k + 1)! yielding the a_k. An analogous situation was uncovered in Ref. [1]. There we found general formulas for the reciprocal logarithm numbers in terms of powers of 1/(k + 1) and for 1/(k + 1) in terms of powers of the reciprocal logarithm numbers. Now we adapt the partition method for a power series expansion by presenting an example where the multinomial factor is different from the previous situation and from that in Ref. [1]. As a
result, we shall see that the Bernoulli numbers can be written in terms of exponential integrals of special polynomials g_k(y). To accomplish this, we express the reciprocal of the denominator on the rhs of Equivalence (13) as an exponential integral, which yields

\[ \sum_{k=0}^{\infty} a_k t^k \equiv \int_0^{\infty} dy\, \exp\bigl( -y\, (1 + t/2! + t^2/3! + t^3/4! + \cdots) \bigr). \tag{36} \]
At this stage we introduce the asymptotic method of expanding most of the exponential as discussed on p. 113 of Ref. [13]. We retain the leading exponential term in the final form of Equivalence (36), viz. exp(−y), and expand all the remaining exponential terms in powers of t. Then we find that

\[ \int_0^{\infty} dy\, \exp\bigl( -y\, (1 + t/2! + t^2/3! + t^3/4! + \cdots) \bigr) \equiv \int_0^{\infty} dy\, e^{-y} \sum_{k=0}^{\infty} \frac{1}{k!} \bigl( -yt/2! - yt^2/3! - yt^3/4! - \cdots \bigr)^k. \tag{37} \]
Since an asymptotic method has been employed, the rhs can become divergent as discussed in the proof of the lemma and as a consequence, we must replace the equals sign by an equivalence symbol. Furthermore, the series on the rhs of the above equivalence can be expressed as a power series in t involving polynomials in powers of y, which we write as

\[ \sum_{k=0}^{\infty} \frac{1}{k!} \bigl( -yt/2! - yt^2/3! - yt^3/4! - \cdots \bigr)^k = \sum_{k=0}^{\infty} g_k(y)\, t^k, \tag{38} \]
where the g_k(y) are of order k with g_0(y) = 1, g_1(y) = −y/2!, g_2(y) = y^2/8 − y/3!, etc. Alternatively, we may write the above equation as

$$\sum_{k=0}^{\infty} g_k(y)\,t^k = \exp\bigl(-y(e^t - 1)/t + y\bigr). \tag{39}$$
By putting y = 0 we see immediately that g_k(0) = 0 for k ≥ 1 and g_0(0) = 1. Introducing the g_k(y) into Equivalence (38) yields

$$\sum_{k=0}^{\infty} a_k t^k = \int_0^{\infty} dy\, e^{-y}\Bigl[1 - \frac{y}{2!}\,t + \Bigl(\frac{y^2}{8} - \frac{y}{3!}\Bigr)t^2 - \Bigl(\frac{y^3}{8\cdot 3!} - \frac{y^2}{2\cdot 3!} + \frac{y}{4!}\Bigr)t^3 + \cdots\Bigr] = 1 - \frac{t}{2} + \Bigl(\frac{\Gamma(3)}{8} - \frac{\Gamma(2)}{3!}\Bigr)t^2 - \Bigl(\frac{\Gamma(4)}{8\cdot 3!} - \frac{\Gamma(3)}{2\cdot 3!} + \frac{\Gamma(2)}{4!}\Bigr)t^3 + \cdots. \tag{40}$$

The coefficients in (40) yield the first four Bernoulli numbers, i.e. B_0 = 1, B_1 = −1/2, B_2 = 1/6 and B_3 = 0. Thus, we can obtain specific values by continuing the expansion. More generally, we can write the above as

$$\sum_{k=0}^{\infty} a_k t^k = \int_0^{\infty} dy\, e^{-y} \sum_{k=0}^{\infty} g_k(y)\,t^k. \tag{41}$$

Since t is arbitrary, we can equate powers of t, which gives

$$\int_0^{\infty} dy\, e^{-y} g_k(y) = a_k. \tag{42}$$
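The construction behind (38)–(42) can be checked numerically. The sketch below (plain Python with exact rationals; the helper names are our own, not from the paper) builds the g_k(y) by exponentiating the series −y(t/2! + t²/3! + ⋯) and confirms that the moments ∑_j j!·g_{k,j} reproduce a_k = B_k/k!:

```python
from fractions import Fraction
from math import factorial

K = 8  # highest order of t computed

def padd(a, b):
    """Add two polynomials in y (coefficient lists, index = power of y)."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) for i in range(n)]

def pmul(a, b):
    """Multiply two polynomials in y."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

# f(t) = -y*(t/2! + t^2/3! + ...): the coefficient of t^k is the polynomial -y/(k+1)!
f = [[Fraction(0)]] + [[Fraction(0), Fraction(-1, factorial(k + 1))] for k in range(1, K + 1)]

# Exponentiate the series: if E = exp(f) with f(0) = 0, then E_0 = 1 and
# k*E_k = sum_{j=1}^{k} j * f_j * E_{k-j}  (the standard power series recursion)
g = [[Fraction(1)]]
for k in range(1, K + 1):
    acc = [Fraction(0)]
    for j in range(1, k + 1):
        acc = padd(acc, [Fraction(j, k) * c for c in pmul(f[j], g[k - j])])
    g.append(acc)

# Moment formula (42): a_k = int_0^inf e^{-y} g_k(y) dy = sum_j j! * g_{k,j}
a = [sum(factorial(j) * c for j, c in enumerate(gk)) for gk in g]
```

The computed g_2(y) = −y/6 + y²/8 agrees with the expansion quoted after (38), and the a_k match B_k/k! for the known Bernoulli numbers.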
Generalizing the Reciprocal Logarithm Numbers
381
Equation (42) is not the only result involving an exponential integral of the g_k(y). For example, if we multiply (39) by y^ν exp(−y), where ν > −1, and integrate w.r.t. y between zero and infinity, then we find

$$\sum_{k=0}^{\infty} t^k \int_0^{\infty} dy\, y^{\nu} e^{-y} g_k(y) = \int_0^{\infty} dy\, y^{\nu} \exp\bigl(-y(e^t-1)/t\bigr) = \frac{\Gamma(\nu+1)\,t^{\nu+1}}{(e^t-1)^{\nu+1}}. \tag{43}$$

For ν = 1 the rhs of (43) can be written as

$$\frac{t^2}{(e^t-1)^2} = -t^2\,\frac{d}{dt}\Bigl(\frac{1}{e^t-1}\Bigr) - \frac{t^2}{e^t-1}. \tag{44}$$
If we assume |t| < 2π, then we can replace the equivalence symbol by an equals sign in Equivalence (1). Introducing the ensuing equation into (44) yields

$$\sum_{k=0}^{\infty} t^k \int_0^{\infty} dy\, y\,e^{-y} g_k(y) = -\sum_{k=0}^{\infty} (k-1)\,a_k t^k - \sum_{k=0}^{\infty} a_k t^{k+1}. \tag{45}$$
Again, since t is fairly arbitrary, we can equate powers of t, thereby obtaining

$$\int_0^{\infty} dy\, y\,e^{-y} g_k(y) = -(k-1)\,a_k - a_{k-1} = \sum_{j=0}^{k} a_j a_{k-j}. \tag{46}$$
The final result follows from (24). For ν = 2, the rhs of (43) becomes

$$\frac{2t^3}{(e^t-1)^3} = t^3\,\frac{d^2}{dt^2}\Bigl(\frac{1}{e^t-1}\Bigr) + 3t^3\,\frac{d}{dt}\Bigl(\frac{1}{e^t-1}\Bigr) + \frac{2t^3}{e^t-1}. \tag{47}$$
Now assuming |t| < 2π again, we can introduce the “equation form” of Equivalence (1) into the above, which gives

$$\sum_{k=0}^{\infty} t^k \int_0^{\infty} dy\, y^2 e^{-y} g_k(y) = \sum_{k=0}^{\infty} a_k t^k\bigl[(k-1)(k-2) + 3(k-1)\,t + 2t^2\bigr]. \tag{48}$$
Equating powers of t yields

$$\int_0^{\infty} dy\, y^2 e^{-y} g_k(y) = (k-1)(k-2)\,a_k + 3(k-2)\,a_{k-1} + 2a_{k-2}. \tag{49}$$
So far, we have managed only to calculate the first few Bernoulli numbers by explicit expansion as in (40). Obviously, this becomes very unwieldy as we progress to higher orders. Therefore, we need to adapt the partition method so that we can evaluate any Bernoulli number via the gk (y). As we have seen already, there are two steps in the partition method: the first is to develop a method of processing all the partitions that sum to k and the second is to determine the contribution from each partition so that when all the contributions due to the partitions are summed, they yield the value for the quantity we wish to calculate. In fact, the only difference between the method for calculating reciprocal logarithm numbers
Table 1  The polynomials g_k(y) given by (39)

k | g_k(y)
--|-------
0 | 1
1 | −y/2
2 | −y/6 + y^2/8
3 | −y/24 + y^2/12 − y^3/48
4 | −y/120 + 5y^2/144 − y^3/48 + y^4/384
5 | −y/720 + y^2/90 − 7y^3/576 + y^4/288 − y^5/3840
6 | (−576y + 8568y^2 − 15344y^3 + 7560y^4 − 1260y^5 + 63y^6)/2903040
7 | (−144y + 3936y^2 − 10920y^3 + 8288y^4 − 2310y^5 + 252y^6 − 9y^7)/5806080
in Ref. [1] and the Bernoulli numbers here occurs in the second step, where the elements are coded differently or assigned different values. To evaluate the contributions made by the partitions, let l_1, . . . , l_i be the distinct elements in a partition and n_1, . . . , n_i be the frequencies of each distinct element in the partition. If j represents the total number of elements in the partition, then ∑_{m=1}^{i} n_m = j and ∑_{m=1}^{i} n_m l_m = k. To obtain the polynomials g_k(y), each element l_m corresponds to a value of −y/(l_m + 1)!. In this application of the partition method for a power series expansion, not only are the elements assigned different values than before, but the multinomial factor is also different. The reason for the latter difference is that we are now dealing with the power series representation for an exponential in Equivalence (36), whereas previously we were dealing with the geometric series. As a consequence, we have to contend with an extra factor of k! appearing in the denominator of the summand. This factor cancels the numerator of the multinomial factor for the geometric series. Hence, for each partition the multinomial factor becomes 1/(n_1! · · · n_i!) and the total contribution by the partition to g_k(y) will be (−1)^j y^j/((l_1 + 1)!^{n_1} · · · (l_i + 1)!^{n_i}\, n_1! · · · n_i!). The partition involving the path (1, 4), (1, 3) and (0, 3) in the tree diagram, or {1, 1, 3}, contributes (−y/2!)^2(−y/4!)/2! to g_5(y). As before, summing all the contributions from the partitions that sum to k yields the final value for g_k(y). Hence, g_5(y), which consists of the contributions from the partitions {5}, {1, 4}, {1, 1, 3}, {2, 3}, {1, 1, 1, 2}, {1, 2, 2} and {1, 1, 1, 1, 1}, is found to be

$$g_5(y) = -\frac{y}{6!} + \Bigl(-\frac{y}{2!}\Bigr)\Bigl(-\frac{y}{5!}\Bigr) + \frac{1}{2!}\Bigl(-\frac{y}{2!}\Bigr)^2\Bigl(-\frac{y}{4!}\Bigr) + \Bigl(-\frac{y}{3!}\Bigr)\Bigl(-\frac{y}{4!}\Bigr) + \frac{1}{3!}\Bigl(-\frac{y}{2!}\Bigr)^3\Bigl(-\frac{y}{3!}\Bigr) + \frac{1}{2!}\Bigl(-\frac{y}{2!}\Bigr)\Bigl(-\frac{y}{3!}\Bigr)^2 + \frac{1}{5!}\Bigl(-\frac{y}{2!}\Bigr)^5. \tag{50}$$

Simplifying the above yields

$$g_5(y) = -\frac{y}{720} + \frac{y^2}{90} - \frac{7y^3}{576} + \frac{y^4}{288} - \frac{y^5}{3840}. \tag{51}$$
More g_k(y) appear in Table 1. From this analysis we see that the g_k(y) are of order k, with the highest order coefficient determined by the partition composed only of k ones. The lowest order term in the polynomials is linear in y, and its coefficient is the contribution made by the partition with the single element k. If we modify the above so that n_i represents the frequency with which the element i occurs in a partition, then we can present a general formula for the g_k(y), as we did for a_{2k} in (28).
Thus, the g_k(y) can be written as

$$g_k(y) = \sum_{\substack{n_1,n_2,n_3,\dots,n_k=0\\ n_1+2n_2+3n_3+\cdots+kn_k=k}}^{k,[k/2],[k/3],\dots,1} \frac{1}{n_1!\,n_2!\cdots n_k!}\Bigl(-\frac{y}{2!}\Bigr)^{n_1}\Bigl(-\frac{y}{3!}\Bigr)^{n_2}\cdots\Bigl(-\frac{y}{(k+1)!}\Bigr)^{n_k}. \tag{52}$$
Again, it should be stressed that this result incorporates much redundancy, since many of the n_i will often be equal to zero. Furthermore, if we let g_k(y) = ∑_{j=1}^{k} g_{k,j}\,y^j, then the coefficients of the polynomials are given by

$$g_{k,j} = \sum_{\substack{n_1,n_2,n_3,\dots,n_k=0\\ \sum_{i=1}^{k} i\,n_i = k,\ \sum_{i=1}^{k} n_i = j}}^{k,[k/2],[k/3],\dots,1} \frac{(-1)^j}{n_1!\,n_2!\cdots n_k!}\Bigl(\frac{1}{2!}\Bigr)^{n_1}\Bigl(\frac{1}{3!}\Bigr)^{n_2}\cdots\Bigl(\frac{1}{(k+1)!}\Bigr)^{n_k}. \tag{53}$$
In terms of the tree diagrams, j represents the number of branches along each path. For example, g_{6,3} represents all the paths with three branches, or all the partitions that sum to 6 with three elements in them. Those partitions are {1, 1, 4}, {1, 2, 3} and {2, 2, 2}. For the first partition, n_1 = 2, n_4 = 1 and all the other n_i values, viz. n_2, n_3, n_5 and n_6, are zero, while for the second partition n_1 = 1, n_2 = 1, n_3 = 1 and n_4 to n_6 are equal to zero. For the third partition only n_2 = 3 is non-zero. According to (53) the third order term in y for g_6(y) is

$$g_{6,3} = -\frac{1}{2!\cdot 1!}\Bigl(\frac{1}{2!}\Bigr)^2\frac{1}{5!} - \frac{1}{1!\cdot 1!\cdot 1!}\,\frac{1}{2!}\,\frac{1}{3!}\,\frac{1}{4!} - \frac{1}{3!}\Bigl(\frac{1}{3!}\Bigr)^3 = -\frac{137}{36\cdot 6!}. \tag{54}$$
We can also develop general formulas for some of the coefficients g_{k,j}. Because the highest order term in the polynomials is due to the contribution by the partition composed only of ones, we find that

$$g_{k,k} = \frac{(-1)^k}{2^k\,k!}. \tag{55}$$
The next highest order term in the polynomials is due to the partition whose elements are all equal to one except for one equal to 2. That is, we have k − 2 ones and one two, which means that

$$g_{k,k-1} = \frac{(-1)^{k+1}}{6\cdot 2^{k-2}\,(k-2)!}, \tag{56}$$
where k ≥ 3. On the other hand, the lowest order term is the contribution made by the single element k. Hence, the coefficient g_{k,1} is given by

$$g_{k,1} = -\frac{1}{(k+1)!}. \tag{57}$$
The next lowest order term is slightly more complex since it involves a sum of all the contributions from the tuples whose elements sum to k. That is, we need to evaluate the contributions from all tuples {j, k − j } where j ranges from unity to [k/2]. Initially, we need to consider the cases of odd and even k values separately because of the upper limit [k/2] in the
sum. For even values of k we find that the multinomial factor is different for {[k/2], [k/2]} compared with all other tuples. In particular, for k = 2m we find that

$$g_{2m,2}\,y^2 = \sum_{j=1}^{m}\Bigl(-\frac{y}{(j+1)!}\Bigr)\Bigl(-\frac{y}{(2m-j+1)!}\Bigr) - \frac{y^2}{2\,((m+1)!)^2}. \tag{58}$$

By introducing No. 4.2.1.6 in Ref. [14] into the above equation, we cancel the final term, which gives the same result for odd integer values of k. Therefore, we arrive at

$$g_{k,2} = \frac{2^{k+1} - k - 3}{(k+2)!}, \tag{59}$$
which is valid for k ≥ 2. Since the g_k(y) can be expressed as polynomials in powers of y with coefficients g_{k,j}, where j ranges from unity to k, (42) yields

$$a_{2m} = \int_0^{\infty} dy\, e^{-y} g_{2m}(y) = \sum_{j=1}^{2m} \Gamma(j+1)\,g_{2m,j}, \tag{60}$$

for k = 2m and m a non-negative integer. For odd integer values of k, i.e. k = 2m + 1 with m ≥ 1, we obtain

$$\sum_{j=1}^{2m+1} \Gamma(j+1)\,g_{2m+1,j} = 0. \tag{61}$$
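The closed forms (55)–(57) and (59) can be checked against the k = 6 and k = 7 rows of Table 1. A minimal sketch (our own script, assuming nothing beyond the quoted formulas and table entries):

```python
from fractions import Fraction
from math import factorial

def g_kk(k):        # (55): top coefficient, from the partition of k ones
    return Fraction((-1) ** k, 2 ** k * factorial(k))

def g_kkm1(k):      # (56): k-2 ones and a single two, valid for k >= 3
    return Fraction((-1) ** (k + 1), 6 * 2 ** (k - 2) * factorial(k - 2))

def g_k1(k):        # (57): the single element k
    return Fraction(-1, factorial(k + 1))

def g_k2(k):        # (59): all two-element tuples {j, k-j}, valid for k >= 2
    return Fraction(2 ** (k + 1) - k - 3, factorial(k + 2))

# Rows k = 6 and k = 7 of Table 1, stored as {power of y: coefficient}
table1 = {
    6: {j: Fraction(c, 2903040) for j, c in
        enumerate([-576, 8568, -15344, 7560, -1260, 63], start=1)},
    7: {j: Fraction(c, 5806080) for j, c in
        enumerate([-144, 3936, -10920, 8288, -2310, 252, -9], start=1)},
}
```

Each closed form agrees with the corresponding tabulated coefficient, e.g. g_{6,2} = (2^7 − 9)/8! = 119/40320 = 8568/2903040.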
We can also develop recursion relations for the coefficients of the g_k(y) by differentiating (39) w.r.t. y. Then we find that

$$\sum_{k=0}^{\infty} g_k'(y)\,t^k = -\sum_{k=0}^{\infty} \frac{t^{k+1}}{(k+2)!}\,\sum_{k=0}^{\infty} g_k(y)\,t^k, \tag{62}$$

where we have introduced the power series expansion for (1 + t − exp(t))/t. Multiplying the two series on the rhs and equating like powers of t gives

$$g_{k+1}'(y) = -\sum_{j=0}^{k} \frac{g_j(y)}{(k-j+2)!}. \tag{63}$$
Equation (63) is a rather important result because the recursion relation for the Bernoulli numbers, viz. (19), can be obtained from it by multiplying by exp(−y) and integrating from zero to infinity. By introducing the polynomial representation for the g_k(y) into (63), one arrives at

$$\sum_{j=1}^{k+1} j\,g_{k+1,j}\,y^{j-1} = -\sum_{j=0}^{k} \frac{1}{(k-j+2)!}\sum_{l=1}^{j} g_{j,l}\,y^l. \tag{64}$$

Since y is arbitrary, we can equate like powers of y, which leads to the following recursion relation:

$$(k-j)\,g_{k+1,k-j} = -\sum_{i=0}^{j+1} \frac{1}{(i+2)!}\,g_{k-i,k-j-1}. \tag{65}$$
The above result is valid for −1 ≤ j ≤ k − 1. On the other hand, if we differentiate (39) w.r.t. t, then after introducing the power series representation for exp(t) and carrying out a little algebra we find that

$$\sum_{k=1}^{\infty} k\,g_k(y)\,t^{k-1} = y\sum_{k=0}^{\infty} t^k \sum_{j=0}^{k}\Bigl(\frac{1}{(k-j+2)!} - \frac{1}{(k-j+1)!}\Bigr)g_j(y). \tag{66}$$

Equating like powers of t yields

$$(k+1)\,g_{k+1}(y) = y\sum_{j=0}^{k}\Bigl(\frac{1}{(k-j+2)!} - \frac{1}{(k-j+1)!}\Bigr)g_j(y). \tag{67}$$
By introducing (63) into the above result, we obtain

$$(k+1)\,g_{k+1}(y) = -y\,g_{k+1}'(y) - y\,g_k(y) + y\,g_k'(y). \tag{68}$$

Now applying the integrating factor method [15] to the above result, we arrive at

$$g_{k+1}(y) = g_k(y) - y^{-k-1}\int_0^{y} dt\, g_k(t)\,\bigl(t^{k+1} + (k+1)\,t^{k}\bigr). \tag{69}$$
If we introduce the polynomial form for the g_k(y) into (69), then we obtain

$$g_{k+1,k+1} = -\frac{g_{k,k}}{2(k+1)} \tag{70}$$

and

$$g_{k+1,j} = \frac{j\,g_{k,j} - g_{k,j-1}}{k+j+1}. \tag{71}$$
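The two-term recursion (71), seeded with g_1(y) = −y/2, generates the whole of Table 1. A minimal sketch with exact rationals (variable names are ours):

```python
from fractions import Fraction
from math import factorial

# g[k] maps the power j to the coefficient g_{k,j}; start from g_1(y) = -y/2
g = {1: {1: Fraction(-1, 2)}}
for k in range(1, 7):
    g[k + 1] = {}
    for j in range(1, k + 2):
        # (71): g_{k+1,j} = (j*g_{k,j} - g_{k,j-1}) / (k + j + 1), with g_{k,0} = 0
        val = (j * g[k].get(j, Fraction(0)) - g[k].get(j - 1, Fraction(0))) / (k + j + 1)
        if val:
            g[k + 1][j] = val

# (60): a_k = sum_j Gamma(j+1) * g_{k,j} = sum_j j! * g_{k,j}
def a(k):
    return sum(factorial(j) * c for j, c in g[k].items())
```

Row k = 5 reproduces (51), and the moments give a_4 = B_4/4! = −1/720 and a_5 = 0, as they should.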
Equation (70) is merely another form of (55), while (71) is a much simpler formula for evaluating the coefficients of the g_k(y) than (65), since it requires only the two preceding coefficients. As a consequence, it might be easier to calculate the a_k using a combination of (60) and (71) rather than (20) for large k. It has already been stated that the equivalence symbol in Equivalence (1) can be replaced by an equals sign for |t| < 2π. If for these values of t we multiply both sides of the resulting equation by exp(t) − 1 and introduce (38), then we find that

$$\int_0^{\infty} dy\, e^{-y}\sum_{k=0}^{\infty} t^k \sum_{j=0}^{k} \frac{g_j(y)}{(k-j)!} - \int_0^{\infty} dy\, e^{-y}\sum_{k=0}^{\infty} g_k(y)\,t^k = t. \tag{72}$$
Since t remains fairly arbitrary, we can equate like powers of t again, thereby obtaining

$$\int_0^{\infty} dy\, e^{-y}\sum_{j=1}^{k-1} \frac{g_j(y)}{(k-j)!} = -\frac{1}{k!}, \tag{73}$$

where k > 1. Since the g_k(y) possess coefficients g_{k,j}, (73) yields

$$\sum_{j=1}^{k-1} \frac{1}{(k-j)!}\sum_{i=1}^{j} \Gamma(i+1)\,g_{j,i} = -\frac{1}{k!}. \tag{74}$$
As an aside, because partitions have been used to obtain general formulas for the Bernoulli numbers in this section, the reader may be under the impression that the partition method for a power series expansion is equivalent to Bell numbers [16]. However, this is far from the case. Bell numbers, which are denoted here by Be_k, represent the number of ways a set of k elements can be partitioned into non-empty sets and are determined from the generating function given by

$$\exp(e^x - 1) = \sum_{k=0}^{\infty} \frac{Be_k}{k!}\,x^k. \tag{75}$$
The first few Be_k are 1, 1, 2, 5, 15, 52, . . . . However, the partition method for a power series expansion does not involve the number of ways a set of k elements can be partitioned, but instead codes each partition that sums to a value k and sums all the contributions from the coding process to render the coefficient of the order k term in the resulting power series. In fact, we can apply the partition method to the lhs of (75) by writing it as

$$\sum_{k=0}^{\infty} \frac{Be_k}{k!}\,t^k = \sum_{k=0}^{\infty} \frac{1}{k!}\bigl(t + t^2/2! + t^3/3! + \cdots\bigr)^k. \tag{76}$$
Comparing (76) with (38), we see that the Be_k/k! are similar to the g_k(y), except that each element i in a partition has a coded value of 1/i!, not −y/(i+1)!. That is, the multinomial factor remains the same as for the g_k(y). Hence, modifying (52) yields the following result for the Bell numbers:

$$Be_k = k!\sum_{\substack{n_1,n_2,n_3,\dots,n_k=0\\ n_1+2n_2+\cdots+kn_k=k}}^{k,[k/2],[k/3],\dots,1} \frac{1}{n_1!\,n_2!\cdots n_k!}\Bigl(\frac{1}{1!}\Bigr)^{n_1}\Bigl(\frac{1}{2!}\Bigr)^{n_2}\cdots\Bigl(\frac{1}{k!}\Bigr)^{n_k}. \tag{77}$$
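Formula (77) translates directly into code. The sketch below (our own helpers, assuming nothing beyond the formula itself) enumerates the partitions of k with their multiplicities and reproduces the Bell numbers:

```python
from fractions import Fraction
from math import factorial

def partitions(k, largest=None):
    """Yield the partitions of k as {element: multiplicity} dictionaries."""
    if largest is None:
        largest = k
    if k == 0:
        yield {}
        return
    for p in range(min(k, largest), 0, -1):
        for rest in partitions(k - p, p):
            part = dict(rest)
            part[p] = part.get(p, 0) + 1
            yield part

def bell(k):
    # (77): Be_k = k! * sum over partitions of prod_i (1/i!)^{n_i} / n_i!
    total = Fraction(0)
    for part in partitions(k):
        term = Fraction(1)
        for i, n in part.items():
            term *= Fraction(1, factorial(i)) ** n / factorial(n)
        total += term
    return factorial(k) * total
```

For instance, the partitions {3}, {1, 2} and {1, 1, 1} contribute 1/3! + 1/2 + 1/3! = 5/6, so Be_3 = 3!·5/6 = 5.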
3 Bernoulli Polynomials

In the previous section we were able to adapt the partition method for a power series expansion to express the Bernoulli numbers as an exponential integral of the special polynomials g_k(y). There we showed that the multinomial factor in the partition method had to be altered compared with the situation when the a_k were evaluated directly from Equivalence (1). Then the g_k(y) were generated by assigning the linear variable y to the elements in the partitions. Here we adapt the partition method for a power series expansion to calculate the Bernoulli polynomials, which will involve coding the partitions with a more complicated functional form than a linear variable. The Bernoulli polynomials are simply defined by multiplying the lhs of Equivalence (1) by exp(xt) and replacing B_k on the rhs by B_k(x), which gives

$$\frac{t\,e^{xt}}{e^t - 1} \equiv \sum_{k=0}^{\infty} a_k(x)\,t^k. \tag{78}$$
At this stage, all we have done is provide a definition. We have not even established whether the quantity on the rhs is convergent or divergent and if so, for which values of x. Since the quantity on the rhs is a power series in t, it must be equal to the power series in t obtained when we multiply the power series for exp(xt) by the series given in Equivalence (1). Therefore,

$$\sum_{k=0}^{\infty} a_k(x)\,t^k = \sum_{k=0}^{\infty} \frac{(xt)^k}{k!}\,\sum_{k=0}^{\infty} a_k t^k = \sum_{k=0}^{\infty} t^k \sum_{j=0}^{k} \frac{a_{k-j}}{j!}\,x^j. \tag{79}$$
Equating powers of t yields

$$a_k(x) = \sum_{j=0}^{k} \frac{a_{k-j}}{j!}\,x^j. \tag{80}$$
By expressing a_k and a_k(x) in terms of the Bernoulli numbers and polynomials, we obtain the standard definition

$$B_k(x) = \sum_{j=0}^{k} \binom{k}{j}\,B_{k-j}\,x^j. \tag{81}$$
On the other hand, if we multiply both sides of Equivalence (78) by exp(−xt) and introduce its power series representation, then we obtain

$$\frac{t}{e^t - 1} \equiv \sum_{k=0}^{\infty} t^k \sum_{j=0}^{k} \frac{(-1)^{k-j}}{(k-j)!}\,a_j(x)\,x^{k-j}. \tag{82}$$
Now we replace the lhs of the above equivalence by the rhs of Equivalence (1). Since the lhs is a divergent power series in t, it must be equal to the rhs of the above equivalence. As t is arbitrary, we can equate like powers on both sides, thereby arriving at

$$\sum_{j=0}^{k} \frac{(-1)^{k-j}}{(k-j)!}\,a_j(x)\,x^{k-j} = a_k. \tag{83}$$
If we express the a_j(x) and a_k in terms of the Bernoulli polynomials and numbers respectively, then (83) becomes

$$\sum_{j=0}^{k} (-1)^j \binom{k}{j}\,B_j(x)\,x^{-j} = (-1)^k B_k\,x^{-k}. \tag{84}$$
From the preceding analysis we can determine the conditions under which Equivalence (78) is convergent and divergent. In obtaining (80) we have used the power series for exp(xt), which is convergent for all values of x and t. However, we have also used Equivalence (1), which according to the regularized result given by (12) is divergent when t²/4π² ≤ −1 and convergent when t²/4π² > −1. Thus the same conditions apply to Equivalence (78), which means that we can write it as

$$\sum_{k=0}^{\infty} a_k(x)\,t^k \;\begin{cases} = \dfrac{t\,e^{xt}}{e^t-1}, & t^2 > -4\pi^2 \ \text{and} \ \forall x,\\[4pt] \equiv \dfrac{t\,e^{xt}}{e^t-1}, & t^2 \le -4\pi^2 \ \text{and} \ \forall x. \end{cases} \tag{85}$$
Next we investigate how the partition method for a power series expansion can be applied to Equivalence (78). By introducing the exponential factor on the lhs into the denominator and substituting the exponential power series, we find that

$$\frac{t\,e^{xt}}{e^t - 1} = \frac{1}{1 - \sum_{k=1}^{\infty} (-1)^k\bigl((x-1)^{k+1} - x^{k+1}\bigr)\,t^k/(k+1)!}. \tag{86}$$
The rhs of the above equation can be viewed as the regularized value for the geometric series. Hence, we arrive at

$$\frac{t\,e^{xt}}{e^t-1} \equiv \sum_{k=0}^{\infty}\Bigl(\sum_{j=1}^{\infty} \frac{(-1)^{j}}{(j+1)!}\bigl((x-1)^{j+1} - x^{j+1}\bigr)\,t^j\Bigr)^k. \tag{87}$$
Alternatively, we can write the above equivalence as

$$\frac{t\,e^{xt}}{e^t-1} \equiv 1 - \Bigl(\bigl((x-1)^2 - x^2\bigr)\frac{t}{2!} - \bigl((x-1)^3 - x^3\bigr)\frac{t^2}{3!} + \cdots\Bigr) + \Bigl(\bigl((x-1)^2 - x^2\bigr)\frac{t}{2!} - \bigl((x-1)^3 - x^3\bigr)\frac{t^2}{3!} + \cdots\Bigr)^2 - \cdots. \tag{88}$$

From this result we can determine the lowest order terms in a t-expansion, which are

$$\frac{t\,e^{xt}}{e^t-1} \equiv 1 + \Bigl(x - \frac{1}{2}\Bigr)t + \Bigl(\frac{x^2}{2} - \frac{x}{2} + \frac{1}{12}\Bigr)t^2 + \Bigl(\frac{x^3}{6} - \frac{x^2}{4} + \frac{x}{12}\Bigr)t^3 + O(t^4). \tag{89}$$
The rhs of the above equivalence is now of the same form as the rhs of Equivalence (78). Hence, we can set both rhs's equal to one another, whereupon we identify a_1(x) = x − 1/2, a_2(x) = x²/2 − x/2 + 1/12 and a_3(x) = x³/6 − x²/4 + x/12. These are, of course, B_1(x), B_2(x)/2! and B_3(x)/3!, respectively. It becomes more difficult to write down the higher order coefficients by glancing at the rhs of Equivalence (88). This is, of course, where the partition method for a power series expansion comes into its own. In applying the method we note from (86) that we are basically dealing with the geometric series again. Hence, the multinomial factor remains the same as it was when we calculated the Bernoulli numbers via the first approach in the previous section. In fact, the only difference occurs in the coding of the elements in the partitions. Previously, we assigned a value of −1/(i + 1)! to each element i, which was due to the fact that the coefficient of t^i of the denominator in Equivalence (13) is 1/(i + 1)!. In (86) the coefficient of t^i is (−1)^{i+1}((x − 1)^{i+1} − x^{i+1})/(i + 1)!, which means that the value we now assign to each element i in a partition is minus this value. For x = 0 this reduces to the value used to calculate the Bernoulli numbers. Hence, we observe that a_k(0) = a_k, or B_k(0) = B_k. When evaluating the Bernoulli numbers via the first approach in the previous section we presented a general result for each contribution from a partition. We did this in terms of the values l_i in the partition and their frequencies n_i. E.g., for the partition {1, 1, 3} that arises in the evaluation of a_5(x), l_1 and l_2 are 1 and 3, while n_1 and n_2 are equal to 2 and 1, respectively. The multinomial factor was given by N!/(n_1! n_2! · · · n_j!), where N is the sum
Table 2  Coefficients of the polynomial contributions in the evaluation of a_5(x) via the partition method

Term              | x^5   | x^4   | x^3    | x^2    | x      | x^0
h_5(x)            | 1/120 | −1/48 | 1/36   | −1/48  | 1/120  | −1/720
2h_1(x)h_4(x)     | −1/12 | 5/24  | −1/4   | 1/6    | −7/120 | 1/120
3h_1(x)^2 h_3(x)  | 1/2   | −5/4  | 11/8   | −13/16 | 1/4    | −1/32
2h_2(x)h_3(x)     | −1/6  | 5/12  | −17/36 | 7/24   | −7/72  | 1/72
4h_1(x)^3 h_2(x)  | −2    | 5     | −31/6  | 11/4   | −3/4   | 1/12
3h_1(x)h_2(x)^2   | 3/4   | −15/8 | 2      | −9/8   | 1/3    | −1/24
h_1(x)^5          | 1     | −5/2  | 5/2    | −5/4   | 5/16   | −1/32
Total             | 1/120 | −1/48 | 1/72   | 0      | −1/720 | 0
of all the n_i and j represents the number of distinct elements in a partition. In fact, (15) is the same for the Bernoulli polynomials except for the values in the product, which must be altered according to the prescription in the previous paragraph. Hence, we obtain

$$C\bigl([l_1,n_1],\dots,[l_j,n_j]\bigr) = \frac{N!}{n_1!\,n_2!\cdots n_j!}\,\prod_{i=1}^{j}\Bigl(\frac{(-1)^{l_i}\bigl((x-1)^{l_i+1} - x^{l_i+1}\bigr)}{(l_i+1)!}\Bigr)^{n_i}. \tag{90}$$
For the partition {1, 1, 3} the multinomial factor is simply 3!/(2!·1!), whilst the values assigned to the ones and the three are −((x−1)² − x²)/2! and −((x−1)⁴ − x⁴)/4! respectively. Then the contribution to a_5(x) from this partition is

$$C\bigl([1,2],[3,1]\bigr) = \frac{3!}{2!\cdot 1!}\Bigl(-\frac{(x-1)^2 - x^2}{2!}\Bigr)^2\Bigl(-\frac{(x-1)^4 - x^4}{4!}\Bigr) = \frac{x^5}{2} - \frac{5}{4}\,x^4 + \frac{11}{8}\,x^3 - \frac{13}{16}\,x^2 + \frac{x}{4} - \frac{1}{32}. \tag{91}$$
We can now determine a_5(x) by evaluating all the contributions due to the partitions whose elements sum to 5. These are {5}, {1, 4}, {1, 1, 3}, {2, 3}, {1, 1, 1, 2}, {1, 2, 2} and {1, 1, 1, 1, 1}. If we let h_j(x) = (−1)^j((x − 1)^{j+1} − x^{j+1})/(j + 1)!, then by summing all the contributions via (90), we obtain

$$a_5(x) = h_5(x) + 2h_1(x)h_4(x) + 3h_1(x)^2 h_3(x) + 2h_2(x)h_3(x) + 4h_1(x)^3 h_2(x) + 3h_1(x)h_2(x)^2 + h_1(x)^5. \tag{92}$$
Each term on the rhs of (92) yields a fifth order polynomial in x whose coefficients are listed in the various columns of Table 2. The sums of the coefficients in each column are presented in the bottom row of this table. Since the totals form the coefficients of a_5(x) according to (92), we find that

$$a_5(x) = \frac{x^5}{120} - \frac{x^4}{48} + \frac{x^3}{72} - \frac{x}{720} = \frac{1}{5!}\Bigl(x^5 - \frac{5}{2}\,x^4 + \frac{5}{3}\,x^3 - \frac{x}{6}\Bigr), \tag{93}$$

or in other words, a_5(x) = B_5(x)/5!.
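The bookkeeping behind Table 2 and (92)–(93) is easy to mechanize. The following sketch (exact rational polynomial arithmetic; all helper names are ours) builds the h_j(x) and confirms that the seven contributions sum to B_5(x)/5!:

```python
from fractions import Fraction
from math import comb, factorial

def pmul(a, b):
    """Multiply two polynomials in x (coefficient lists, index = power of x)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def padd(a, b):
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) for i in range(n)]

def h(j):
    """h_j(x) = (-1)^j ((x-1)^{j+1} - x^{j+1})/(j+1)! as a coefficient list."""
    n = j + 1
    # In (x-1)^n - x^n the x^n terms cancel, leaving binomial terms up to x^{n-1}
    diff = [Fraction(comb(n, m) * (-1) ** (n - m)) for m in range(n)]
    return [Fraction((-1) ** j, factorial(n)) * c for c in diff]

def term(coeff, *factors):
    out = [Fraction(coeff)]
    for fac in factors:
        out = pmul(out, fac)
    return out

# (92): a_5(x) as the sum over the seven partitions of 5
a5 = [Fraction(0)]
for t in (term(1, h(5)),
          term(2, h(1), h(4)),
          term(3, h(1), h(1), h(3)),
          term(2, h(2), h(3)),
          term(4, h(1), h(1), h(1), h(2)),
          term(3, h(1), h(2), h(2)),
          term(1, h(1), h(1), h(1), h(1), h(1))):
    a5 = padd(a5, t)
```

The resulting coefficient list matches the bottom row of Table 2 and hence (93).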
We can also develop an interesting relationship involving the a_k(x) and the α_k as given by (33) by introducing (32) into Equivalence (78). Then we obtain

$$e^{xt} \equiv \sum_{k=0}^{\infty} \alpha_k t^k \sum_{k=0}^{\infty} a_k(x)\,t^k. \tag{94}$$
From the analysis presented earlier in this section we know that the rhs is absolutely convergent for |t| < 2π and hence, for these values of t, the equivalence symbol can be replaced by an equals sign. By introducing the Taylor series expansion for the exponential term on the lhs of the above equivalence, we arrive at

$$\sum_{k=0}^{\infty} \frac{(xt)^k}{k!} = \sum_{k=0}^{\infty} t^k \sum_{j=0}^{k} \frac{1}{(k-j+1)!}\,a_j(x), \tag{95}$$
where |t| < 2π. Since t is still arbitrary, we can equate like powers of t, which yields

$$\sum_{j=0}^{k} \frac{1}{(k-j+1)!}\,a_j(x) = \frac{x^k}{k!}. \tag{96}$$
Alternatively, the above result can be written as

$$\sum_{j=0}^{k} \binom{k+1}{j}\,B_j(x) = (k+1)\,x^k. \tag{97}$$
The above equation represents the generalization of (19). It has also been obtained independently by using the umbral calculus of Ref. [17], which is an entirely different approach from the partition method for a power series expansion and is based on the study of a class of Sheffer sequences. In this approach “shadow” identities result when polynomials with powers x^k are changed to discrete values and the exponent x^k is changed to (x)_k, while in the method presented here partitions are coded to specific values derived from expanding generating functions. It should be noted that (95) is an interesting result in that it can be used to obtain general formulas for the coefficients of the a_k(x), and hence the Bernoulli polynomials, without a knowledge of the Bernoulli numbers as in (81). If we write a_k(x) = ∑_{j=0}^{k} a_{k,j}\,x^j, then we see immediately that the only term with x^k in the sum comes from j = k. Equating this term with the term on the rhs of (95) gives a_{k,k} = 1/k!, or B_{k,k} = 1. The terms with x^{k−1} can be extracted by considering only the j = k and j = k − 1 terms on the lhs of (95). Hence, we arrive at

$$a_{k,k-1}\,x^{k-1} + \frac{1}{2!}\,a_{k-1,k-1}\,x^{k-1} = 0. \tag{98}$$
If we replace a_{k−1,k−1} by 1/(k − 1)!, then we find that

$$a_{k,k-1} = -\frac{1}{2\,(k-1)!}, \tag{99}$$
or B_{k,k−1} = −k/2. The terms with x^{k−2} can be extracted by considering the j = k, j = k − 1 and j = k − 2 terms on the lhs of (95). Therefore, we obtain

$$a_{k,k-2} + \frac{1}{2!}\,a_{k-1,k-2} + \frac{1}{3!}\,a_{k-2,k-2} = 0. \tag{100}$$
We now replace a_{k−1,k−2} and a_{k−2,k−2} by −1/(2(k − 2)!) and 1/(k − 2)! respectively, which gives

$$a_{k,k-2} = \frac{1}{12\,(k-2)!}, \tag{101}$$
or B_{k,k−2} = k(k − 1)/12. Similarly, we find that a_{k,k−3} = B_{k,k−3} = 0 and a_{k,k−4} = −1/(6!\,(k − 4)!). More generally, the coefficients of the a_k(x) are given by

$$a_{k,k-l} = -\sum_{j=1}^{l} \frac{a_{k-j,k-l}}{(j+1)!}, \tag{102}$$
for k ≥ 1. Furthermore, from (80) we know that a_{k,j} = a_{k−j}/j!. By introducing this result into the above equation, after a little algebra we obtain

$$\sum_{j=0}^{l} \frac{a_j}{(l-j+1)!} = 0, \tag{103}$$
which appears in terms of Bernoulli numbers as (34) in Ref. [18]. As in the previous section for the Bernoulli numbers, we can adapt the partition method for a power series expansion to present a general formula for the a_k(x). We simply let n_1 represent the frequency of ones in a partition, n_2 the number of twos in a partition, and so on up to n_k, which represents the number of k's in a partition. Then we sum over all the possible values that the n_k can take. In fact, the general formula has the same form as (28) except that now we code each element i with a value of h_i(x). Hence, we arrive at

$$a_k(x) = \sum_{\substack{n_1,n_2,n_3,\dots,n_k=0\\ n_1+2n_2+3n_3+\cdots+kn_k=k}}^{k,[k/2],[k/3],\dots,1} \frac{(n_1+n_2+\cdots+n_k)!}{n_1!\,n_2!\cdots n_k!}\,\prod_{j=1}^{k} h_j(x)^{n_j}. \tag{104}$$
Like the other general formulas in the previous section, the above result incorporates much redundancy, since many of the n_i are equal to zero in the partitions. Furthermore, for x = 0, (104) reduces to (28) as expected. If we assume that |t| < 2π, then we can introduce an equals sign into Equivalence (78) and invert it to obtain

$$t^{-1} e^{-xt}\bigl(e^t - 1\bigr) = \frac{1}{1 + \sum_{k=1}^{\infty} a_k(x)\,t^k}. \tag{105}$$
Now we can regard the rhs of the above equation as the regularized value of a geometric series involving powers of the series in the denominator. Therefore, we find that

$$t^{-1}e^{-xt}\bigl(e^t - 1\bigr) \equiv \sum_{k=0}^{\infty} (-1)^k\Bigl(\sum_{j=1}^{\infty} a_j(x)\,t^j\Bigr)^k = 1 + \sum_{k=1}^{\infty} (-1)^k\bigl(a_1(x)\,t + a_2(x)\,t^2 + \cdots\bigr)^k. \tag{106}$$
Specifically, for ℜ(t exp(xt)/(exp(t) − 1) − 1) > −1, the series in the above equivalence is convergent. Therefore, for these values of x and t we can replace the equivalence symbol by an equals sign. In actual fact, the equivalence symbol can be replaced by an equals sign since the lhs can be expanded into a convergent series for all values of x and t. Since it is obvious that the lhs can be expressed as a power series in t, we can write the rhs as a power series in t with coefficients r_k(x). At a glance we see that r_0(x) = 1, r_1(x) = −a_1(x) and r_2(x) = a_1(x)² − a_2(x) = x²/2 − x/2 + 1/6. From here on, it is best to use the partition method to evaluate the r_k(x). As we are dealing with the geometric series again, the multinomial factor remains the same as when we were evaluating the Bernoulli polynomials. The difference, however, occurs in the coding of the elements in the partitions, which is dependent on the coefficients of the series in the denominator of (105). Hence, each element i in a partition is now assigned a value of −a_i(x). Therefore, if there are j distinct elements in a partition with values and frequencies of l_i and n_i respectively, then the contribution C_1 from the partition to r_k(x) is given by

$$C_1\bigl([l_1,n_1],\dots,[l_j,n_j]\bigr) = \frac{N!}{n_1!\,n_2!\cdots n_j!}\,\prod_{i=1}^{j}\bigl(-a_{l_i}(x)\bigr)^{n_i}, \tag{107}$$
where ∑_{i=1}^{j} n_i l_i = k and ∑_{i=1}^{j} n_i = N. By modifying the n_i to be the frequency of the element i in a partition, so that ∑_{i=1}^{k} i\,n_i = k, we arrive at a general formula for the r_k(x), which is

$$r_k(x) = \sum_{\substack{n_1,n_2,n_3,\dots,n_k=0\\ n_1+2n_2+3n_3+\cdots+kn_k=k}}^{k,[k/2],[k/3],\dots,1} \frac{(n_1+n_2+\cdots+n_k)!}{n_1!\,n_2!\cdots n_k!}\,\prod_{i=1}^{k}\bigl(-a_i(x)\bigr)^{n_i}. \tag{108}$$
If we expand the lhs of Equivalence (106) into a power series, then we find that

$$\sum_{k=0}^{\infty} r_k(x)\,t^k = \sum_{k=0}^{\infty} \frac{(-1)^k}{k!}\bigl((x-1)^k - x^k\bigr)\,t^{k-1}. \tag{109}$$
Since t is arbitrary, we can equate like powers of t, thereby obtaining r_k(x) = −h_k(x). Consequently, we observe that (108) is the inverse of (104). Finally, we can derive an analogous result to (96) by writing Equivalence (106) as

$$e^{-xt} \equiv \sum_{k=0}^{\infty} a_k t^k \sum_{k=0}^{\infty} r_k(x)\,t^k. \tag{110}$$
We also know that there will be an infinite number of values of t where the above equivalence becomes an equation. If for these values of t we introduce the Taylor series expansion for exp(−xt) and equate powers of t, then we find that

$$\sum_{j=0}^{k} a_j\,r_{k-j}(x) = \frac{(-1)^k}{k!}\,x^k. \tag{111}$$
Introducing the form for r_j(x) in terms of h_j(x) yields

$$\sum_{j=0}^{k} \frac{(-1)^{k-j+1}}{(k-j+1)!}\,a_j\bigl((x-1)^{k-j+1} - x^{k-j+1}\bigr) = \frac{(-1)^k}{k!}\,x^k. \tag{112}$$
If we substitute (81) into the above result and replace k and −x by k − 1 and x respectively, then after some algebra we arrive at the important result

$$B_k(x+1) - B_k(x) = k\,x^{k-1}, \tag{113}$$
which is discussed on p. 126 of Ref. [6]. At the end of the previous section we discussed how the partition method for a power series expansion could be applied to the evaluation of the Bell numbers. To complete this section, we explain how the method can be used to calculate the Bell polynomials, which are defined by

$$\sum_{k=0}^{\infty} \frac{Be_k(x)}{k!}\,t^k = \exp\bigl(x(e^t - 1)\bigr). \tag{114}$$
Expanding the rhs yields

$$\sum_{k=0}^{\infty} \frac{Be_k(x)}{k!}\,t^k = \sum_{k=0}^{\infty} \frac{1}{k!}\bigl(xt + xt^2/2! + xt^3/3! + \cdots\bigr)^k. \tag{115}$$
If we compare (115) with (76), then we see that the only difference is that each power of t is now multiplied by x. This means that we can adapt (77) to obtain a general formula for the Bell polynomials simply by introducing a factor of x into the numerator of each of the terms in the large parentheses. Therefore, the Bell polynomials are given by

$$Be_k(x) = k!\sum_{\substack{n_1,n_2,n_3,\dots,n_k=0\\ n_1+2n_2+\cdots+kn_k=k}}^{k,[k/2],[k/3],\dots,1} \frac{1}{n_1!\,n_2!\cdots n_k!}\Bigl(\frac{x}{1!}\Bigr)^{n_1}\Bigl(\frac{x}{2!}\Bigr)^{n_2}\cdots\Bigl(\frac{x}{k!}\Bigr)^{n_k}. \tag{116}$$
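Before moving on, the chain running from the recursion (103) through the definition (81) to the identity (97) can be verified numerically. A hedged sketch (our own code; nothing here is claimed to be the paper's notation):

```python
from fractions import Fraction
from math import comb, factorial

K = 8
# Bernoulli numbers via (103): sum_{j=0}^{l} a_j/(l-j+1)! = 0 for l >= 1, with a_0 = 1
a = [Fraction(1)]
for l in range(1, K + 1):
    a.append(-sum(a[j] / factorial(l - j + 1) for j in range(l)))
B = [factorial(k) * a[k] for k in range(K + 1)]  # B_k = k! * a_k

def bernoulli_poly(j):
    """(81): B_j(x) = sum_i C(j,i) B_{j-i} x^i, as a coefficient list in x."""
    return [comb(j, i) * B[j - i] for i in range(j + 1)]

def lhs(k):
    """The lhs of (97): sum_{j=0}^{k} C(k+1,j) B_j(x)."""
    out = [Fraction(0)] * (k + 1)
    for j in range(k + 1):
        for i, c in enumerate(bernoulli_poly(j)):
            out[i] += comb(k + 1, j) * c
    return out
```

For every k tested, the polynomial collapses to the single monomial (k+1)x^k, exactly as (97) asserts.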
4 Generalized Reciprocal Logarithm Numbers

In the previous sections we were concerned with introducing and adapting the partition method for a power series expansion to the Bernoulli numbers and polynomials in order to make the method familiar to the reader before considering a problem where the outcome is not known. We have seen that the method not only generates power series expansions with numerical coefficients, but it can also generate power series expansions with polynomial coefficients. In this section we present the main result of this paper, which is the derivation of a power series expansion for ln(1 + z)^{−s}, where s is complex. As mentioned in the introduction, the aim for doing this is to investigate how the resulting expansion can be used to uncover new properties of the Riemann zeta function, which will be discussed in a later section. Previously, the s = 1 case was considered in Ref. [1], where we obtained a power series expansion whose coefficients were referred to as the reciprocal logarithm numbers A_k. Here, we shall obtain polynomial coefficients A_k(s), which will be referred to as the generalized reciprocal logarithm numbers. Now that we have developed an understanding of the partition method for a power series expansion, let us present the following theorem:

Theorem 1 For s complex, the power series expansion for z^s/ln^s(1 + z) can be written as

$$\frac{z^s}{\ln^s(1+z)} \equiv \sum_{k=0}^{\infty} A_k(s)\,z^k, \tag{117}$$
Table 3  Values for the generalized reciprocal logarithm numbers A_k(s)

k  | A_k(s)
---|-------
0  | 1
1  | s/2
2  | s^2/8 − 5s/24
3  | s^3/48 − 5s^2/48 + s/8
4  | s^4/384 − 5s^3/192 + 97s^2/1152 − 251s/2880
5  | s^5/3840 − 5s^4/1152 + 61s^3/2304 − 401s^2/5760 + 19s/288
6  | (63s^6 − 1575s^5 + 15435s^4 − 73801s^3 + 171150s^2 − 152696s)/2903040
7  | (9s^7 − 315s^6 + 4515s^5 − 33817s^4 + 139020s^3 − 295748s^2 + 252336s)/5806080
8  | (135s^8 − 6300s^7 + 124110s^6 − 1334760s^5 + 8437975s^4 − 31231500s^3 + 62333204s^2 − 51360816s)/1393459200
9  | (15s^9 − 900s^8 + 23310s^7 − 339752s^6 + 3040975s^5 − 17065540s^4 + 58415444s^3 − 110941776s^2 + 88864128s)/2786918400
10 | (99s^10 − 7425s^9 + 244530s^8 − 4634322s^7 + 55598235s^6 − 436886945s^5 + 2242194592s^4 − 7220722828s^3 + 13175306672s^2 − 10307425152s)/367873228800
where the coefficients A_k(s) represent the generalization of the reciprocal logarithm numbers A_k(1), or A_k in Ref. [1], and are polynomials of order k. Here, A_0(s) = 1, A_1(s) = s/2 and A_2(s) = s²/8 − 5s/24, while Table 3 displays the values of the A_k(s) up to and including k = 10. The general form for the A_k(s) can be written as

$$A_k(s) = \sum_{\substack{n_1,n_2,n_3,\dots,n_k=0\\ n_1+2n_2+3n_3+\cdots+kn_k=k}}^{k,[k/2],[k/3],\dots,1} (s)_N \prod_{i=1}^{k}\Bigl(\frac{(-1)^{i+1}}{i+1}\Bigr)^{n_i}\frac{1}{n_i!}, \tag{118}$$

where N = n_1 + n_2 + · · · + n_k and (s)_N denotes the Pochhammer symbol.
In addition, the above power series expansion represents a small |z| expansion with a finite disk of absolute convergence given at least by |z| < 1. For these values of z the equivalence symbol may be replaced by an equals sign.
Proof From the lemma we have seen that 1/(1 − z) represents the regularized value for the geometric series when ℜz ≥ 1. If one replaces z in Equivalence (10) by −x and integrates over x between zero and z, then one obtains

$$\sum_{k=0}^{\infty} \frac{(-1)^k}{k+1}\,z^{k+1} \;\begin{cases} = \ln(1+z), & \Re z > -1,\\ \equiv \ln(1+z), & \Re z \le -1. \end{cases} \tag{119}$$

Hence, ln(1 + z) represents the regularized value of the series on the lhs of the above equivalence for ℜz ≤ −1. Furthermore, we can replace the equals sign in Equivalence (119) by an equivalence symbol for ℜz > −1. Therefore, if we introduce the ensuing equivalence into z^s/ln^s(1 + z), then we find that

$$\frac{z^s}{\ln^s(1+z)} \equiv \frac{1}{\bigl(1 - z/2 + z^2/3 - z^3/4 + z^4/5 - \cdots\bigr)^s}, \tag{120}$$

which is valid for all z.
Generalizing the Reciprocal Logarithm Numbers
395
In Ref. [1] we applied the partition method for a power series expansion to the s = 1 version of Equivalence (120). We were able to do this because we could treat the rhs of Equivalence (120) as the regularized value of the geometric series. Now that s ≠ 1, we cannot use the geometric series to derive Equivalence (117). Instead, we note that the rhs of Equivalence (120) can be treated as the regularized value of the binomial series according to
$$ {}_1F_0(s;z) = \sum_{k=0}^{\infty}\frac{\Gamma(k+s)}{\Gamma(s)\,k!}\,z^k \;\begin{cases} = (1-z)^{-s}, & z < 1,\\ \equiv (1-z)^{-s}, & z \ge 1. \end{cases} \tag{121}$$
This important result, which is valid for all values of s, is proved in Ref. [5]. For z < 1 we can replace the equals sign by the equivalence symbol, in which case we have an equivalence statement for all values of z. Then Equivalence (120) becomes
$$\frac{z^s}{\ln^s(1+z)} \equiv \sum_{k=0}^{\infty}\frac{\Gamma(k+s)}{\Gamma(s)\,k!}\left(\frac{z}{2} - \frac{z^2}{3} + \frac{z^3}{4} - \cdots\right)^k. \tag{122}$$
The series inside the large parentheses in the above result converges absolutely to 1 − ln(1 + z)/z for |z| < 1. On the other hand, the series over k on the rhs of Equivalence (122) is absolutely convergent for |ln(1 + z)/z − 1| < 1. The latter condition follows if the first condition is obeyed. Therefore, to guarantee absolute convergence of the above result we require only that |z| < 1 for all values of s. A different analysis is required to determine the other values of z for which the series on the rhs of Equivalence (122) is convergent or divergent. For |z| < 1, however, we can replace the equivalence symbol by an equals sign as stated in the theorem. In addition, expanding Equivalence (122) yields
$$\ln^{-s}(1+z) \equiv z^{-s}\Biggl[1 + \frac{\Gamma(s+1)}{\Gamma(s)}\,\frac{z}{2} + \left(-\frac{\Gamma(s+1)}{\Gamma(s)}\,\frac{1}{3} + \frac{\Gamma(s+2)}{2!\,\Gamma(s)}\,\frac{1}{4}\right)z^2 + \left(\frac{\Gamma(s+1)}{\Gamma(s)}\,\frac{1}{4} - \frac{\Gamma(s+2)}{\Gamma(s)}\,\frac{1}{6} + \frac{\Gamma(s+3)}{3!\,\Gamma(s)}\,\frac{1}{8}\right)z^3 + \cdots\Biggr] = \sum_{k=0}^{\infty}A_k(s)\,z^{k-s}. \tag{123}$$
Hence, we find that A_0(s) = 1, A_1(s) = s/2, A_2(s) = s^2/8 − 5s/24, while A_3(s) is given by
$$A_3(s) = \frac{(s+2)(s+1)s}{48} - \frac{(s+1)s}{6} + \frac{s}{4} = \frac{s^3}{48} - \frac{5s^2}{48} + \frac{s}{8}\,. \tag{124}$$
To evaluate the higher order coefficients, we will need to adapt the partition method for a power series expansion again. When the reciprocal logarithm numbers or A_k(1) were evaluated in Ref. [1], a value of (−1)^{l_i+1}/(l_i + 1) was assigned to each element l_i in a partition. As we have seen from the examples in the previous sections, the values assigned to the elements of a partition depend upon the coefficients of the power series that becomes the variable of a second power series expansion. In the case of Equivalence (122) the values assigned to the elements will be based on the coefficients of the series inside the large parentheses. Since this series is the same one used in calculating the reciprocal logarithm numbers, we again assign a value of (−1)^{l_i+1}/(l_i + 1) to each element l_i.
396
V. Kowalenko
For most examples in the previous sections the second power series was the geometric series. This meant that the multinomial factor was equal to N!/(n_1!n_2!⋯n_j!), where n_i represents the frequency of an element l_i, j is the number of distinct elements in a partition and N is the total number of elements in the partition. Hence, Σ_{i=1}^{j} n_i = N, while Σ_{i=1}^{j} n_i l_i = k, where k is the order of the resulting power series expansion. In the example where the Bernoulli numbers were expressed in terms of exponential integrals involving the polynomials g_k(y), the multinomial factor was different in that it lacked a factor of N! in the numerator. This was because for this case the second power series was the exponential power series, not the geometric series. As a consequence, an extra factor of k! appears in the denominator of its power series expansion to cancel the N! factor in the multinomial factor for the geometric series. Thus, we see that k in the second power series plays the role of N in the partition method. In Equivalence (122), whilst each element l_i in a partition is assigned a value of (−1)^{l_i+1}/(l_i + 1) as in Ref. [1], the multinomial factor is that for the geometric series multiplied by an extra factor due to the factor of Γ(k + s)/(k!Γ(s)) outside the first power series. Since N replaces k in the second power series, the multinomial factor for applying the partition method for a power series expansion to Equivalence (122) is composed of N!/(n_1!n_2!⋯n_j!) multiplied by Γ(N + s)/(N!Γ(s)). In other words, the multinomial factor equals (s)_N/(n_1!n_2!⋯n_j!), where (s)_N is the Pochhammer notation for Γ(s + N)/Γ(s). Let us now consider the evaluation of A_4(s) to clarify the situation. First, the partitions that sum to 4 are {4}, {3, 1}, {2, 2}, {2, 1, 1} and {1, 1, 1, 1}. According to the prescription in the previous paragraph we assign values of 1/2, −1/3, 1/4 and −1/5 respectively to the elements 1, 2, 3 and 4.
For the partition involving one element the multinomial factor is composed of 1!/1! multiplied by Γ(s + 1)/Γ(s) or (s)_1. As there are two elements in the partition {3, 1}, the multinomial factor for the partition consists of the standard factor of 2!/(1!·1!) multiplied by the extra factor of (s)_2/2!. Generally, the contribution to the A_k(s) due to the partition which can be represented as {[l_1, n_1], [l_2, n_2], …, [l_j, n_j]}, with n_i ≥ 1, is given by
$$C\bigl([l_1,n_1],\dots,[l_j,n_j]\bigr) = \frac{(s)_N}{n_1!\,n_2!\cdots n_j!}\,\prod_{i=1}^{j}\left(\frac{(-1)^{l_i+1}}{l_i+1}\right)^{n_i}. \tag{125}$$
Hence, each partition with N elements contributes a polynomial of O(s^N) to the A_k(s). The partition with the greatest number of elements, viz. k ones, contributes the highest order polynomial, which is O(s^k). Since the A_k(s) are determined by summing all the contributions due to the partitions that sum to k, the A_k(s) are O(s^k) to highest order and O(s) to lowest order. The general form for the A_k(s) can be represented by
$$A_k(s) = \sum_{l_1n_1+l_2n_2+\cdots+l_jn_j=k} C\bigl([l_1,n_1],[l_2,n_2],\dots,[l_j,n_j]\bigr), \tag{126}$$
where it is implied that the sum is over all the possible combinations that are subject to the constraint. For example, summing all the contributions due to the partitions that sum to 4 according to the above formula yields
$$A_4(s) = (s)_1\left(-\frac{1}{5}\right) + (s)_2\left(\frac{1}{4}\cdot\frac{1}{2} + \frac{1}{2!}\left(\frac{1}{3}\right)^2\right) - \frac{(s)_3}{2!}\,\frac{1}{3}\left(\frac{1}{2}\right)^2 + \frac{(s)_4}{4!}\left(\frac{1}{2}\right)^4 = \frac{s^4}{384} - \frac{5s^3}{192} + \frac{97s^2}{1152} - \frac{251s}{2880}\,. \tag{127}$$
More values of the A_k(s) as evaluated by the above method are displayed in Table 3. As expected, for s = 1 these results reduce to the values of A_k given in Ref. [1]. Hence, we refer to the A_k(s) as the generalized reciprocal logarithm numbers. Also as expected, for s = −1 they yield the coefficients of the Taylor series expansion for ln(1 + z), i.e. A_k(−1) = (−1)^k/(k + 1). As stated previously, there is no need to determine the partitions when deriving a general form for the coefficients via the partition method, although such a list will reduce much redundancy in the form itself. To derive a general form for the A_k(s), we let the elements from 1 to k possess frequencies n_1 to n_k for all the partitions. The frequency corresponding to element i, viz. n_i, in any partition will lie in the range from zero to [k/i], where [x] denotes the greatest integer less than or equal to x. The redundancy in the general form arises from the fact that many of the n_i will equal zero in the partitions, whereas previously we dealt only with non-zero frequencies. As a consequence of this slight modification, the partitions that provide a contribution to the generalized reciprocal logarithm numbers are constrained by
$$\sum_{i=1}^{k} n_i\,i = n_1 + 2n_2 + 3n_3 + \cdots + kn_k = k. \tag{128}$$
From (128) we see that n_k equals zero for all partitions except {k}, in which case it will equal unity, while the other n_i will equal zero for this partition. Similarly, for k > 3, n_{k−1} equals zero for all partitions except {k − 1, 1}, in which case it equals unity. The highest value that the n_i can take is k, which corresponds to the partition with k ones. Accordingly, we can write (125) as
$$C(n_1, n_2, \dots, n_k) = (s)_N \prod_{i=1}^{k}\left(\frac{(-1)^{i+1}}{i+1}\right)^{n_i}\frac{1}{n_i!}\,, \tag{129}$$
where the number of elements in a partition is now given by N = Σ_{i=1}^{k} n_i. Since the generalized reciprocal logarithm numbers are obtained by summing over all the partitions, we arrive at
$$A_k(s) = \sum_{\substack{n_1,n_2,n_3,\dots,n_k=0\\ n_1+2n_2+3n_3+\cdots+kn_k=k}}^{k,[k/2],[k/3],\dots,1} C(n_1, n_2, \dots, n_k). \tag{130}$$
Combining (129) and (130) gives the result for the A_k(s) in the theorem, viz. (118). Although this form incorporates much redundancy, it is superior to (126) because it specifies limits for each n_i. This completes the proof of the theorem. As a result of the theorem, we have
$$z^{-s}\ln^s(1+z) = \frac{1}{z^s/\ln^s(1+z)} \equiv \frac{1}{1 + \sum_{k=1}^{\infty}A_k(s)\,z^k}\,. \tag{131}$$
The lhs of the above equivalence can be replaced by the series representation in Equivalence (117) with A_k(−s), while we treat the result on the rhs as the regularized value of the geometric series. Hence, we arrive at
$$\sum_{k=0}^{\infty}A_k(-s)\,z^k = \sum_{k=0}^{\infty}(-1)^k\bigl(A_1(s)z + A_2(s)z^2 + A_3(s)z^3 + \cdots\bigr)^k. \tag{132}$$
By expanding the rhs of the above equation, we see that A_1(−s) = −A_1(s), A_2(−s) = A_1(s)^2 − A_2(s) and A_3(−s) = 2A_1(s)A_2(s) − A_3(s) − A_1(s)^3. To calculate the higher orders, we apply the partition method for a power series as we did in the proof of the theorem. Comparing the power series in (132) with that in Equivalence (122), we see that the coefficients of z^k are −A_k(s) rather than (−1)^{k+1}/(k + 1). In addition, there is no factor of (s)_k/k! in the summand. As a consequence, the multinomial factor will be the same as when we calculated the Bernoulli numbers directly from Equivalence (14). In other words, for a partition possessing j distinct elements with values l_i, the multinomial factor is simply N!/(n_1!n_2!⋯n_j!), where N = Σ_{i=1}^{j} n_i. Then the contribution to A_k(−s) from the partition is given by
$$C_2\bigl([l_1,n_1],\dots,[l_j,n_j]\bigr) = \frac{N!}{n_1!\,n_2!\cdots n_j!}\,\prod_{i=1}^{j}\bigl(-A_{l_i}(s)\bigr)^{n_i}. \tag{133}$$
To obtain A_k(−s), we simply sum over the contributions C_2 as given above. Hence, we find that
$$A_k(-s) = \sum_{l_1n_1+l_2n_2+\cdots+l_jn_j=k} N!\,\prod_{i=1}^{j}\frac{\bigl(-A_{l_i}(s)\bigr)^{n_i}}{n_i!}\,. \tag{134}$$
On the other hand, if we adopt the approach that n_1 represents the number of ones in a partition, n_2 the number of twos and so on, then the above result can be written as
$$A_k(-s) = \sum_{\substack{n_1,n_2,n_3,\dots,n_k=0\\ n_1+2n_2+3n_3+\cdots+kn_k=k}}^{k,[k/2],[k/3],\dots,1} N!\,\prod_{i=1}^{k}\frac{\bigl(-A_i(s)\bigr)^{n_i}}{n_i!}\,, \tag{135}$$
where N = Σ_{i=1}^{k} n_i. For s = 1 the A_k(s) reduce to the reciprocal logarithm numbers A_k in Ref. [1], while the A_k(−1) represent the coefficients of the Taylor series expansion for ln(1 + z), i.e. A_k(−1) = (−1)^k/(k + 1). From (135) we obtain
$$\frac{(-1)^k}{k+1} = \sum_{\substack{n_1,n_2,n_3,\dots,n_k=0\\ n_1+2n_2+3n_3+\cdots+kn_k=k}}^{k,[k/2],[k/3],\dots,1} N!\,\prod_{i=1}^{k}\frac{(-A_i)^{n_i}}{n_i!}\,. \tag{136}$$
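Identity (136) lends itself to a direct numerical check. In the sketch below (our own code), the reciprocal logarithm numbers A_k are generated from the convolution implied by (z/ln(1 + z))·(ln(1 + z)/z) = 1, and the partition sum is compared with (−1)^k/(k + 1) using exact rationals:

```python
# Numerical sketch of identity (136) with exact rational arithmetic.
# Here A_k denotes the reciprocal logarithm numbers A_k = A_k(1).
from fractions import Fraction
from itertools import product
from math import factorial

def gregory(n):
    """A_0,...,A_n from the convolution (sum_j A_j z^j)(ln(1+z)/z) = 1."""
    A = [Fraction(1)]
    for k in range(1, n + 1):
        A.append(-sum(A[j] * Fraction((-1) ** (k - j), k - j + 1)
                      for j in range(k)))
    return A

def rhs_136(k, A):
    """Partition sum on the rhs of (136)."""
    total = Fraction(0)
    ranges = [range(k // i + 1) for i in range(1, k + 1)]
    for ns in product(*ranges):
        if sum(i * n for i, n in enumerate(ns, start=1)) != k:
            continue
        term = Fraction(factorial(sum(ns)))
        for i, n in enumerate(ns, start=1):
            term *= (-A[i]) ** n / factorial(n)
        total += term
    return total

A = gregory(8)
for k in range(1, 9):
    assert rhs_136(k, A) == Fraction((-1) ** k, k + 1)
```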
If we set s = −1 in (135), however, then we find that
$$A_k = \sum_{\substack{n_1,n_2,n_3,\dots,n_k=0\\ n_1+2n_2+3n_3+\cdots+kn_k=k}}^{k,[k/2],[k/3],\dots,1} N!\,\prod_{i=1}^{k}\left(\frac{(-1)^{i+1}}{i+1}\right)^{n_i}\frac{1}{n_i!}\,. \tag{137}$$
Equations (136) and (137) were first presented in Ref. [1], but with (135) we can go even further. In Ref. [1] we obtained values for A_k(2), which were denoted by F_k = Σ_{j=0}^{k} A_{k−j}A_j. Later in Theorem 2 of this work we show that
$$A_k(2) = -(k-2)A_{k-1} - (k-1)A_k\,. \tag{138}$$
To employ the above result in (135), we require an expression for A_k(−2), which can be obtained by squaring the Taylor series expansion for ln(1 + z)/z. This gives
$$z^{-2}\ln^2(1+z) \equiv \sum_{k=0}^{\infty}(-z)^k\sum_{j=0}^{k}\frac{1}{(k-j+1)(j+1)}\,. \tag{139}$$
By equating the rhs of the above equivalence with the power series representation in terms of the A_k(−2) in Theorem 1, one obtains
$$A_k(-2) = \frac{(-1)^k\,2}{k+2}\sum_{j=0}^{k}\frac{1}{j+1} = \frac{(-1)^k\,2}{k+2}\,H_{k+1}\,, \tag{140}$$
where H_{k+1} denotes the (k + 1)-th harmonic number [19]. If we introduce the results for A_k(2) and A_k(−2) into (135), then we obtain
$$(k-2)A_{k-1} + (k-1)A_k = -\sum_{\substack{n_1,n_2,n_3,\dots,n_k=0\\ n_1+2n_2+3n_3+\cdots+kn_k=k}}^{k,[k/2],[k/3],\dots,1} N!\,\prod_{i=1}^{k}\left(\frac{(-1)^{i+1}\,2H_{i+1}}{i+2}\right)^{n_i}\frac{1}{n_i!}\,, \tag{141}$$
and
$$\frac{(-1)^k\,2H_{k+1}}{k+2} = \sum_{\substack{n_1,n_2,n_3,\dots,n_k=0\\ n_1+2n_2+3n_3+\cdots+kn_k=k}}^{k,[k/2],[k/3],\dots,1} N!\,\prod_{i=1}^{k}\frac{\bigl((i-2)A_{i-1} + (i-1)A_i\bigr)^{n_i}}{n_i!}\,. \tag{142}$$
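The harmonic-number evaluation (140) used above, with its (−1)^k factor made explicit, follows directly from the convolution in (139). A short self-check (our own sketch, not the paper's code):

```python
# Check (140): the coefficient of z^k in (ln(1+z)/z)^2 equals
# (-1)^k * 2 * H_{k+1} / (k + 2), where H_n is the n-th harmonic number.
from fractions import Fraction

def coeff_139(k):
    """Convolution of ln(1+z)/z = sum_j (-1)^j z^j/(j+1) with itself."""
    return sum(Fraction((-1) ** j, j + 1) * Fraction((-1) ** (k - j), k - j + 1)
               for j in range(k + 1))

def harmonic(n):
    return sum(Fraction(1, j) for j in range(1, n + 1))

for k in range(12):
    assert coeff_139(k) == Fraction((-1) ** k * 2, k + 2) * harmonic(k + 1)
```

For instance, k = 1 gives A_1(−2) = −1, in agreement with A_1(s) = s/2 at s = −2.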
5 Properties of the A_k(s)

In this section we present various properties of the generalized reciprocal logarithm numbers, which enable us to calculate them by different means, viz. recursion relations, and to derive general formulas for their highest order coefficients. We begin this section by proving the following identity.

Theorem 2 The generalized reciprocal logarithm numbers obey the finite sum
$$A_k(s+t) = \sum_{j=0}^{k}A_{k-j}(s)\,A_j(t). \tag{143}$$
Proof From Theorem 1 we have
$$\frac{z^{s+t}}{\ln^{s+t}(1+z)} \equiv \sum_{k=0}^{\infty}A_k(s+t)\,z^k. \tag{144}$$
Furthermore, we can express the lhs of Equivalence (144) via Theorem 1 as
$$\frac{z^s}{\ln^s(1+z)}\cdot\frac{z^t}{\ln^t(1+z)} \equiv \sum_{k=0}^{\infty}A_k(s)\,z^k\,\sum_{k=0}^{\infty}A_k(t)\,z^k. \tag{145}$$
Multiplying both series on the rhs of the above equivalence yields a power series in z, whose regularized value is the same as the power series on the rhs of Equivalence (144). Hence, we arrive at
$$\sum_{k=0}^{\infty}A_k(s+t)\,z^k = \sum_{k=0}^{\infty}z^k\sum_{j=0}^{k}A_{k-j}(s)\,A_j(t). \tag{146}$$
Since z is arbitrary, we can equate like powers of z, which results in (143). This completes the proof.

There are interesting consequences arising from (143). If we let s = −z and t = z, then we find that
$$\sum_{j=0}^{k}A_{k-j}(-z)\,A_j(z) = \begin{cases} 0, & k \ge 1,\\ 1, & k = 0. \end{cases} \tag{147}$$
For s = 1 − z and t = z, (143) yields
$$\sum_{j=0}^{k}A_{k-j}(1-z)\,A_j(z) = A_k\,, \tag{148}$$
while for s = −1 − z and t = z, we find that
$$\sum_{j=0}^{k}A_{k-j}(-1-z)\,A_j(z) = A_k(-1) = \frac{(-1)^k}{k+1}\,. \tag{149}$$
Alternatively, if we put t = z and s = 1, then we obtain
$$\sum_{j=0}^{k-1}A_{k-j}\,A_j(z) = A_k(z+1) - A_k(z). \tag{150}$$
For t = z and s = −1, (143) yields
$$\sum_{j=0}^{k-1}\frac{(-1)^j A_j(z)}{k-j+1} = (-1)^k\bigl(A_k(z-1) - A_k(z)\bigr). \tag{151}$$
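The convolution identity (143) is easy to confirm symbolically. The sketch below (our own code; the A_k(s) are rebuilt from the partition sum (118)) verifies it for the first few k:

```python
# Symbolic check of Theorem 2: A_k(s+t) = sum_{j=0}^{k} A_{k-j}(s) A_j(t).
from itertools import product
from math import factorial

import sympy as sp

s, t = sp.symbols('s t')

def A(k, var):
    """A_k(var) via the partition sum (118)."""
    if k == 0:
        return sp.Integer(1)
    total = sp.Integer(0)
    ranges = [range(k // i + 1) for i in range(1, k + 1)]
    for ns in product(*ranges):
        if sum(i * n for i, n in enumerate(ns, start=1)) != k:
            continue
        term = sp.rf(var, sum(ns))
        for i, n in enumerate(ns, start=1):
            term *= sp.Rational((-1) ** (i + 1), i + 1) ** n / factorial(n)
        total += term
    return sp.expand(total)

for k in range(5):
    conv = sum(A(k - j, s) * A(j, t) for j in range(k + 1))
    assert sp.expand(A(k, s + t) - conv) == 0
```

For example, at k = 2 both sides expand to (s + t)^2/8 − 5(s + t)/24.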
Note also that by putting z = 1 in (151) we recover the first recursion relation for the reciprocal logarithm numbers, viz. (30), in Ref. [1]. With the aid of (150), we can derive a recursion relation for the generalized reciprocal logarithm numbers, as exemplified by the following theorem.

Theorem 3 For s ≠ 0 the generalized reciprocal logarithm numbers obey the recursion relation
$$k\,A_k(s) = -\left(k - \frac{s}{2} - 1\right)A_{k-1}(s) - s\sum_{j=0}^{k-2}A_{k-j}\,A_j(s). \tag{152}$$
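Before turning to the proof, here is a minimal sketch of how (152) can be programmed (ours, in Python/sympy rather than the Mathematica implementation discussed later; the reciprocal logarithm numbers A_k and A_0(s) = 1 are the assumed inputs):

```python
# Generate A_k(s) from the recursion (152):
#   k A_k(s) = -(k - s/2 - 1) A_{k-1}(s) - s * sum_{j=0}^{k-2} A_{k-j} A_j(s),
# starting from A_0(s) = 1, with A_k = A_k(1) the reciprocal logarithm numbers.
from fractions import Fraction

import sympy as sp

s = sp.symbols('s')

def gregory(n):
    """Reciprocal logarithm numbers A_0,...,A_n via their convolution recursion."""
    A = [Fraction(1)]
    for k in range(1, n + 1):
        A.append(-sum(A[j] * Fraction((-1) ** (k - j), k - j + 1)
                      for j in range(k)))
    return A

def frac(q):
    return sp.Rational(q.numerator, q.denominator)

def A_poly(n):
    """List [A_0(s), ..., A_n(s)] built from (152)."""
    Ak = gregory(n)
    polys = [sp.Integer(1)]
    for k in range(1, n + 1):
        rhs = -(k - s/2 - 1) * polys[k - 1] \
              - s * sum(frac(Ak[k - j]) * polys[j] for j in range(k - 1))
        polys.append(sp.expand(rhs / k))
    return polys

polys = A_poly(4)
# polys[2] = s**2/8 - 5*s/24; polys[4] at s = 1 gives -19/720
```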
Generalizing the Reciprocal Logarithm Numbers
401
Proof For s ≠ 0, the derivative of ln^{−s}(1 + z) w.r.t. z is given by
$$\frac{d}{dz}\,\ln^{-s}(1+z) = -\frac{s}{1+z}\,\ln^{-s-1}(1+z). \tag{153}$$
Multiplying both sides by z^{s+1} and introducing Theorem 1 into (153) yields
$$z^{s+1}\frac{d}{dz}\left(z^{-s}\sum_{k=0}^{\infty}A_k(s)\,z^k\right) = -\frac{s}{1+z}\sum_{k=0}^{\infty}A_k(s+1)\,z^k. \tag{154}$$
Note that since the same divergent quantity has been introduced into both sides of (154), there is no need to replace the equals sign by an equivalence symbol. That is, both sides are convergent and divergent for the same values of z. Furthermore, if we multiply both sides of the above equation by 1 + z and equate like powers of z, then we find that
$$(k-s)A_k(s) + (k-s-1)A_{k-1}(s) = -s\,A_k(s+1). \tag{155}$$
If we introduce (150) into (155), then we obtain the recursion relation given by (152). Equation (152) indicates that the reciprocal logarithm numbers A_k are required to evaluate the A_k(s), which is another reason why they are referred to as the generalized reciprocal logarithm numbers. This completes the proof.

The recursion relation given by (152), together with putting z = 1 in (151), can be implemented in Mathematica [20] to obtain values of the generalized reciprocal logarithm numbers. This is by far much quicker than using the unoptimized partition method for a power series expansion of the previous section. For example, by programming the recursion relations and utilising the Simplify routine in the package, one finds that it takes only 0.062 and 2.094 CPU seconds to evaluate A_10(s) and A_15(s) respectively with Mathematica 4.1 on a Pentium computer. These CPU times are based on the fact that all the preceding A_k(s) have been determined.

We can also modify the partition method for a power series expansion so that only the reciprocal logarithm numbers are required in the evaluation of the A_k(s). To accomplish this, we need to write the integral representation for the gamma function as
$$I = \int_0^{\infty}dt\;t^{s-1}\,e^{-zt/\ln(1+z)} = \Gamma(s)\,z^{-s}\ln^s(1+z), \tag{156}$$
where Re(s) > 0. Now we expand z/ln(1 + z) in terms of the power series whose coefficients are the reciprocal logarithm numbers. As shown in Ref. [1], this series is absolutely convergent for |z| < 1. For these values of z the integral I becomes
$$I = \int_0^{\infty}dt\;t^{s-1}\exp\left(-t - t\sum_{k=1}^{\infty}A_k z^k\right). \tag{157}$$
Before we can employ the partition method for a power series expansion, we express the final exponential factor in I as
$$\exp\left(-t\sum_{k=1}^{\infty}A_k z^k\right) = \sum_{k=0}^{\infty}\frac{(-1)^k t^k}{k!}\bigl(A_1 z + A_2 z^2 + A_3 z^3 + \cdots\bigr)^k. \tag{158}$$
Table 4 The polynomials h_k(t) as given by (160)

k | h_k(t)
0 | 1
1 | −t/2
2 | t/12 + t^2/8
3 | −t/24 − t^2/24 − t^3/48
4 | 19t/720 + 7t^2/288 + t^3/96 + t^4/384
5 | −3t/160 − t^2/60 − t^3/144 − t^4/576 − t^5/3840
6 | 863t/60480 + 43t^2/3456 + 133t^3/25920 + t^4/768 + t^5/4608 + t^6/46080
7 | −275t/24192 − 79t^2/8064 − 139t^3/34560 − 107t^4/103680 − 5t^5/27648 − t^6/46080 − t^7/645120
8 | 33953t/3628800 + 717t^2/89600 + 4759t^3/1451520 + 2111t^4/2488320 + 127t^5/829440 + 11t^6/552960 + t^7/552960 + t^8/10321920
This form is analogous to the situation in Sect. 2, where we introduced the g_k(y) to calculate the Bernoulli numbers. There we adapted the partition method by modifying the multinomial factor because the second power series expansion was the exponential power series. Thus, the multinomial factor for this situation is simply 1/(n_1!n_2!⋯n_k!), where for each partition n_1 represents the number of ones, n_2 the number of twos, etc., up to n_k. Meanwhile, coding the elements in the partitions depends upon the coefficients in the first power series in large parentheses on the rhs of (158). Hence, each element i is assigned a value of −A_i t. Then (158) becomes
$$\exp\left(-t\sum_{k=1}^{\infty}A_k z^k\right) = \sum_{k=0}^{\infty}h_k(t)\,z^k, \tag{159}$$
where the polynomials h_k(t) are given by
$$h_k(t) = \sum_{\substack{n_1,n_2,n_3,\dots,n_k=0\\ n_1+2n_2+3n_3+\cdots+kn_k=k}}^{k,[k/2],[k/3],\dots,1} t^N \prod_{i=1}^{k}\frac{(-A_i)^{n_i}}{n_i!}\,, \tag{160}$$
and N = Σ_{j=1}^{k} n_j as before. As N ranges from 1 to k, we see that the polynomials h_k(t) are of order k. From (160) we find that h_0(t) = 1, h_1(t) = −t/2, h_2(t) = t^2/8 + t/12 and h_3(t) = −t^3/48 − t^2/24 − t/24. More of these polynomials are displayed in Table 4. Now we introduce (159) into the rhs of (157). Because an exponential term has been replaced by a power series, evaluation of the resulting integral may produce a divergent power series expansion. In fact, this procedure, whereby part of an exponential is replaced by a power series representation, is an example of the asymptotic method known as the method of expanding most of the exponential, which is discussed on p. 113 of Ref. [13]. Since we have now obtained an asymptotic series for I, we must introduce an equivalence symbol instead of using an equals sign. Therefore, we arrive at
$$I \equiv \Gamma(s) + \sum_{k=1}^{\infty}z^k\int_0^{\infty}dt\;h_k(t)\,t^{s-1}e^{-t}. \tag{161}$$
Because the polynomials h_k(t) are O(t^k), we can represent them by h_k(t) = Σ_{i=1}^{k} h_{k,i} t^i, where the coefficients h_{k,i} represent the sum of all the contributions from the partitions that sum to k with i elements. Introducing this form for the h_k(t) into the above result yields
$$z^{-s}\ln^s(1+z) \equiv 1 + \sum_{k=1}^{\infty}z^k\sum_{i=1}^{k}h_{k,i}\,(s)_i\,, \tag{162}$$
where Re(s) > 0. The latter condition arises from identifying the resulting integrals as the integral representation for the gamma function. By replacing the lhs of the above equivalence by the power series expansion in Theorem 1 and equating like powers of z, we find that
$$A_k(-s) = \sum_{i=1}^{k}h_{k,i}\,(s)_i\,, \tag{163}$$
for k ≥ 1. For example, h_{2,1} = 1/12 and h_{2,2} = 1/8, which means that
$$A_2(-s) = \frac{s}{12} + \frac{s(s+1)}{8} = \frac{s^2}{8} + \frac{5s}{24}\,. \tag{164}$$
As expected, (164) agrees with the corresponding result in Table 3. Although the above analysis has been carried out for Re(s) > 0, (163) can be analytically continued to s < 0. Therefore, for s = −1 it yields
$$A_k = \sum_{i=1}^{k}h_{k,i}\,(-1)_i = -h_{k,1}\,, \tag{165}$$
which also follows from (160). For s = −2, (163) yields
$$A_k(2) = -2h_{k,1} + 2h_{k,2}\,. \tag{166}$$
Using (138) and (165), one finds that
$$h_{k,2} = -\left(\frac{k}{2} - 1\right)A_{k-1} - \left(\frac{k}{2} + \frac{1}{2}\right)A_k\,. \tag{167}$$
From here on, it becomes a formidable exercise to express the remaining h_{k,i} in terms of the reciprocal logarithm numbers. For example, for s = −3, (163) yields
$$h_{k,3} = -\frac{k}{2}\,A_k - \left(\frac{k}{2} - 1\right)A_{k-1} - \frac{A_k(3)}{6}\,. \tag{168}$$
Whilst the above equation appears simple, the problem lies in expressing A_k(3) in terms of the reciprocal logarithm numbers. Higher order expressions for the coefficients of h_k(t), i.e. the h_{k,i} with i > 3, will involve A_k(i), which are even more formidable to express in terms of the reciprocal logarithm numbers. To express A_k(3) in terms of the reciprocal logarithm numbers, we differentiate z^2/ln^2(1 + z), which yields
$$\sum_{k=1}^{\infty}k\,A_k(2)\,z^{k-1} \equiv \frac{2z}{\ln^2(1+z)} - \frac{2z^2}{(1+z)\ln^3(1+z)}\,. \tag{169}$$
Multiplying by z and introducing the equivalence in Theorem 1 into the rhs of the above result, we arrive at
$$2\sum_{k=0}^{\infty}A_k(3)\,z^k = \sum_{k=0}^{\infty}(2-k)A_k(2)\,z^k - \sum_{k=1}^{\infty}(k-3)A_{k-1}(2)\,z^k. \tag{170}$$
Note the appearance of the equals sign in the above result. Since z is arbitrary, we can equate like powers of z in (170), thereby obtaining
$$A_k(3) = \left(1 - \frac{k}{2}\right)A_k(2) - \left(\frac{k}{2} - \frac{3}{2}\right)A_{k-1}(2). \tag{171}$$
On introducing (138) into (171), we find that
$$A_k(3) = (k-1)\left(\frac{k}{2} - 1\right)A_k + (k-2)\left(k - \frac{5}{2}\right)A_{k-1} + \frac{1}{2}(k-3)^2 A_{k-2}\,. \tag{172}$$
Finally, if we introduce (172) into (168), then after a little algebra we obtain
$$h_{k,3} = -\frac{(k+2)(k+1)}{12}\,A_k - \frac{(k-2)(2k+1)}{12}\,A_{k-1} - \frac{(k-3)^2}{12}\,A_{k-2}\,. \tag{173}$$
It should be mentioned that (172) has been obtained independently in Ref. [21]. There the authors refer to the reciprocal logarithm numbers of Ref. [1] as the Bernoulli numbers of the second kind. However, according to the definition in Refs. [17] and [22], the Bernoulli numbers of the second kind, which are denoted by b_k, are equal to k!A_k. Therefore, unlike the reciprocal logarithm numbers, the Bernoulli numbers of the second kind diverge as k → ∞. It was largely as a result of the convergence behaviour of the reciprocal logarithm numbers that we were able to develop the numerous results in Ref. [1], whereas the authors of Ref. [21] do not discuss the properties of their "b_k". Furthermore, in their Theorem 1 they present an expansion for z^N/ln^N(1 + z), where N is a positive integer. Not only is the result given in their theorem restricted to positive integral powers, whereas our Theorem 1 is valid for complex s, but their final expansion is also very unwieldy. Nevertheless, they do give the results for A_k(2) and A_k(3), although these are expressed in terms of the "b_k". This is technically incorrect because the factorial relating them to the reciprocal logarithm numbers has been neglected in their results. More importantly, throughout the entire work they do not discuss the conditions under which any of their series expansions are convergent and divergent. As a consequence, they use equals signs throughout their paper when they are not valid.

A faster and better method of determining the coefficients h_{k,i} is via the recursion relation for the h_k(t) that emanates from (160). This result is derived in Theorem 4 below.

Theorem 4 The polynomials h_k(t) as given by (160) obey the recursion relation
$$h_k'(t) = -\sum_{j=0}^{k-1}A_{k-j}\,h_j(t). \tag{174}$$
Proof Taking the derivative w.r.t. t of (159) gives
$$-\left(\frac{z}{\ln(1+z)} - 1\right)\sum_{k=0}^{\infty}h_k(t)\,z^k = \sum_{k=0}^{\infty}h_k'(t)\,z^k. \tag{175}$$
If we introduce the power series expansion for z/ln(1 + z) from Ref. [1] under the assumption that |z| < 1, then (175) becomes
$$\sum_{k=0}^{\infty}h_k(t)\,z^k - \sum_{k=0}^{\infty}z^k\sum_{j=0}^{k}A_j\,h_{k-j}(t) = \sum_{k=1}^{\infty}h_k'(t)\,z^k. \tag{176}$$
Note that since |z| < 1, there is no need for the equivalence symbol to appear in the above result. Nevertheless, z is still fairly arbitrary. Therefore, we can equate like powers of z, which yields (174). This completes the proof.

If the h_k(t) are expressed in terms of their coefficients h_{k,i}, then (174) becomes
$$-A_k + \sum_{i=1}^{k}h_{k,i}\,t^i - \sum_{j=0}^{k-1}A_j\sum_{i=1}^{k-j}h_{k-j,i}\,t^i = \sum_{i=1}^{k}i\,h_{k,i}\,t^{i-1}. \tag{177}$$
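The recursion of Theorem 4 can be confirmed symbolically; in this sketch (ours) the h_k(t) are rebuilt from the partition sum (160) and the derivative identity (174) is tested with sympy:

```python
# Check the recursion (174): h_k'(t) = -sum_{j=0}^{k-1} A_{k-j} h_j(t).
from fractions import Fraction
from itertools import product
from math import factorial

import sympy as sp

t = sp.symbols('t')

def gregory(n):
    A = [Fraction(1)]
    for k in range(1, n + 1):
        A.append(-sum(A[j] * Fraction((-1) ** (k - j), k - j + 1)
                      for j in range(k)))
    return A

def frac(q):
    return sp.Rational(q.numerator, q.denominator)

def h(k, A):
    if k == 0:
        return sp.Integer(1)
    total = sp.Integer(0)
    ranges = [range(k // i + 1) for i in range(1, k + 1)]
    for ns in product(*ranges):
        if sum(i * n for i, n in enumerate(ns, start=1)) != k:
            continue
        term = t ** sum(ns)
        for i, n in enumerate(ns, start=1):
            term *= frac(-A[i]) ** n / factorial(n)
        total += term
    return sp.expand(total)

nmax = 6
A = gregory(nmax)
hs = [h(k, A) for k in range(nmax + 1)]
for k in range(1, nmax + 1):
    rhs = -sum(frac(A[k - j]) * hs[j] for j in range(k))
    assert sp.expand(sp.diff(hs[k], t) - rhs) == 0
```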
Already we have determined the h_{k,i} for i ≤ 3. Therefore, let us consider the derivation of a general formula for h_{k,4} in terms of the reciprocal logarithm numbers, which we have indicated previously may be quite formidable. From (177) we obtain
$$4h_{k,4} = h_{k,3} - \sum_{j=0}^{k-1}A_j\,h_{k-j,3}\,. \tag{178}$$
After carrying out some algebraic manipulation we arrive at
$$4h_{k,4} = -\frac{(k^2+3k+4)}{12}A_k - \frac{(2k^2-3k-2)}{12}A_{k-1} - \frac{(k-3)^2}{12}A_{k-2} + \sum_{j=0}^{k}\frac{(j^2+3j+2)}{12}A_j A_{k-j} + \sum_{j=0}^{k-1}\frac{(2j^2+j-3)}{12}A_j A_{k-j-1} + \sum_{j=0}^{k-2}\frac{(j^2-2j+1)}{12}A_j A_{k-j-2}\,. \tag{179}$$
In (179) the sums involving products of the reciprocal logarithm numbers are easily evaluated, since we know from Theorem 2 and also from Ref. [1] that Σ_{j=0}^{k} A_{k−j}A_j = A_k(2), which can in turn be expressed in terms of the reciprocal logarithm numbers via (138). Furthermore, we can express the finite sums with the summand jA_{k−j}A_j in terms of the reciprocal logarithm numbers since, for |z| < 1,
$$\frac{d}{dz}\,\frac{z^2}{\ln^2(1+z)} = 2\sum_{k=0}^{\infty}A_k z^k\,\sum_{k=0}^{\infty}k\,A_k z^{k-1} = \frac{d}{dz}\sum_{k=0}^{\infty}z^k\sum_{j=0}^{k}A_j A_{k-j}\,. \tag{180}$$
Once again, z is fairly arbitrary. So we can equate like powers of both series in the above result, thereby obtaining
$$2\sum_{j=0}^{k}j\,A_j A_{k-j} = k\sum_{j=0}^{k}A_j A_{k-j}\,. \tag{181}$$
Introducing (138) into (181) yields
$$\sum_{j=0}^{k}j\,A_j A_{k-j} = -\frac{k}{2}\bigl((k-2)A_{k-1} + (k-1)A_k\bigr). \tag{182}$$
However, an expression for Σ_{j=0}^{k} j^2 A_j A_{k−j} in terms of the reciprocal logarithm numbers has yet to be found, which demonstrates that it becomes more formidable to express the h_{k,i} purely in terms of the A_k for i > 3.

For |t| ≫ 1, the lowest order terms are no longer an effective approximation for the h_k(t). Therefore, we need to determine the coefficients of the highest order terms. Unfortunately, the preceding material is of no benefit. This is where the partition method for a power series expansion comes into its own. To determine the highest order coefficient in the h_k(t), viz. h_{k,k}, we need to consider the contributions from all partitions with k elements. There is, of course, only one such partition, which is the partition consisting of k ones. Hence, h_{k,k} is given by
$$h_{k,k} = \frac{1}{k!}\,(-A_1)^k = \frac{(-1)^k}{2^k\,k!}\,. \tag{183}$$
To determine the next highest order coefficient, we need to evaluate all the contributions due to the partitions that possess k − 1 elements. Again, there is only one such partition, which consists of k − 2 ones and a two. The contribution due to this partition is
$$h_{k,k-1} = \frac{1}{(k-2)!}\,(-A_1)^{k-2}(-A_2) = \frac{(-1)^k}{3\cdot 2^k\,(k-2)!}\,, \tag{184}$$
where k ≥ 2. The next highest order coefficient consists of the contributions due to partitions possessing k − 2 elements. In this case there are two such partitions: one with k − 3 ones and a three and the other with k − 4 ones and two twos. The sum of the contributions due to these partitions is
$$h_{k,k-2} = \frac{1}{(k-3)!}\,(-A_1)^{k-3}(-A_3) + \frac{1}{2!\,(k-4)!}\,(-A_1)^{k-4}(-A_2)^2 = \frac{(-1)^k}{2^k\,(k-3)!}\cdot\frac{k+3}{18}\,, \tag{185}$$
where k ≥ 3. For k = 3 the above gives h_{3,1} = −1/24, which agrees with the coefficient of t for k = 3 in Table 4. Since the coefficients of the four lowest orders of the h_k(t) have been evaluated, let us conclude our study of the coefficients by evaluating the fourth highest coefficient, h_{k,k−3}. This involves evaluating the contributions from the partitions with k − 3 elements, which are: (1) k − 4 ones and a four, (2) k − 5 ones, a two and a three, and (3) k − 6 ones and three twos. Then we find that
$$h_{k,k-3} = \frac{(-A_1)^{k-4}}{(k-4)!}\,(-A_4) + \frac{(-A_1)^{k-5}}{(k-5)!}\,(-A_2)(-A_3) + \frac{(-A_1)^{k-6}}{3!\,(k-6)!}\,(-A_2)^3\,, \tag{186}$$
where k ≥ 4. Equation (186) reduces to
$$h_{k,k-3} = \frac{5k^2 + 45k + 82}{810\,(k-4)!}\left(-\frac{1}{2}\right)^k. \tag{187}$$
For k = 4, we find that h_{4,1} = 19/720, which agrees with the coefficient of t for k = 4 in Table 4. The preceding results can now be used to determine the large |s| behaviour of the generalized reciprocal logarithm numbers or A_k(s). First, from (163) we have
$$A_k(-s) = h_{k,k}(s)_k + h_{k,k-1}(s)_{k-1} + h_{k,k-2}(s)_{k-2} + \cdots + h_{k,1}(s)_1\,. \tag{188}$$
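The closed forms (183)–(187) for the four highest coefficients can be checked against the h_k(t) built from (160); a sketch (ours):

```python
# Check (183)-(187): closed forms for h_{k,k}, h_{k,k-1}, h_{k,k-2}, h_{k,k-3}.
from fractions import Fraction
from itertools import product
from math import factorial

import sympy as sp

t = sp.symbols('t')

def gregory(n):
    A = [Fraction(1)]
    for k in range(1, n + 1):
        A.append(-sum(A[j] * Fraction((-1) ** (k - j), k - j + 1)
                      for j in range(k)))
    return A

def frac(q):
    return sp.Rational(q.numerator, q.denominator)

def h(k, A):
    if k == 0:
        return sp.Integer(1)
    total = sp.Integer(0)
    ranges = [range(k // i + 1) for i in range(1, k + 1)]
    for ns in product(*ranges):
        if sum(i * n for i, n in enumerate(ns, start=1)) != k:
            continue
        term = t ** sum(ns)
        for i, n in enumerate(ns, start=1):
            term *= frac(-A[i]) ** n / factorial(n)
        total += term
    return sp.expand(total)

A = gregory(8)
for k in (5, 6):
    hk = h(k, A)
    assert hk.coeff(t, k) == sp.Rational((-1) ** k, 2 ** k * factorial(k))
    assert hk.coeff(t, k - 1) == sp.Rational((-1) ** k, 3 * 2 ** k * factorial(k - 2))
    assert hk.coeff(t, k - 2) == sp.Rational((-1) ** k * (k + 3), 2 ** k * 18 * factorial(k - 3))
    assert hk.coeff(t, k - 3) == sp.Rational(5 * k * k + 45 * k + 82,
                                             810 * factorial(k - 4)) * sp.Rational(-1, 2) ** k
```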
According to Chaps. 18 and 24 of Refs. [23] and [24] respectively, the Pochhammer polynomials can be written as
$$(s)_k = \frac{\Gamma(s+k)}{\Gamma(s)} = (-1)^k\sum_{j=0}^{k}(-1)^j S_k^{(j)}\,s^j, \tag{189}$$
where the integers S_k^{(j)} are known as the Stirling numbers of the first kind and satisfy the following recursion relation:
$$S_{k+1}^{(j)} = S_k^{(j-1)} - k\,S_k^{(j)}\,. \tag{190}$$
If we introduce (189) into (188), then we obtain
$$A_k(-s) = \frac{S_k^{(k)}}{k!}\left(-\frac{s}{2}\right)^k + \left(\frac{S_k^{(k-1)}}{2\,k!} - \frac{S_{k-1}^{(k-1)}}{6\,(k-2)!}\right)\left(-\frac{s}{2}\right)^{k-1} + \left(\frac{S_k^{(k-2)}}{4\,k!} - \frac{S_{k-1}^{(k-2)}}{12\,(k-2)!} + \frac{(k+3)\,S_{k-2}^{(k-2)}}{72\,(k-3)!}\right)\left(-\frac{s}{2}\right)^{k-2} + \cdots. \tag{191}$$
In the appendix we present general formulas, i.e. in terms of k, for the highest order coefficients of s in (s)_k. On introducing the results given by (250) into (191), we find that the generalized reciprocal logarithm numbers can be written as
$$A_k(s) = \frac{1}{k!}\left(\frac{s}{2}\right)^k\Theta(k-1) - \frac{5}{12\,(k-2)!}\left(\frac{s}{2}\right)^{k-1}\Theta(k-2) + \frac{(25k-3)}{12\cdot 4!\,(k-3)!}\left(\frac{s}{2}\right)^{k-2}\Theta(k-3) - \frac{(625k^2 - 225k - 64)}{72\cdot 6!\,(k-4)!}\left(\frac{s}{2}\right)^{k-3}\Theta(k-4) + O\!\left(\left(\frac{s}{2}\right)^{k-4}\right)\Theta(k-5), \tag{192}$$
where Θ(z) is the Heaviside step-function. The integral representation for the gamma function can also be written as
$$I_2 = \int_0^{\infty}dt\;t^{s-1}\,e^{-\ln(1+z)\,t/z} = \Gamma(s)\,z^s\ln^{-s}(1+z), \tag{193}$$
where, as in (156), Re(s) > 0. If we assume that |z| < 1, then we can introduce the result in Theorem 1, with the equivalence symbol replaced by an equals sign, into the above equation,
Table 5 The polynomials p_k(t) as given by (195)

k | p_k(t)
0 | 1
1 | t/2
2 | −t/3 + t^2/8
3 | t/4 − t^2/6 + t^3/48
4 | −t/5 + 13t^2/72 − t^3/24 + t^4/384
5 | t/6 − 11t^2/60 + 17t^3/288 − t^4/144 + t^5/3840
6 | −t/7 + 29t^2/160 − 59t^3/810 + 7t^4/576 − t^5/1152 + t^6/46080
7 | t/8 − 223t^2/1260 + 241t^3/2880 − 229t^4/12960 + 25t^5/13824 − t^6/11520 + t^7/645120
8 | −t/9 + 481t^2/2800 − 929t^3/10080 + 7207t^4/311040 − 157t^5/51840 + 29t^6/138240 − t^7/138240 + t^8/10321920
thereby obtaining
$$\int_0^{\infty}dt\;t^{s-1}e^{-t}\exp\left(-t\sum_{k=1}^{\infty}A_k(-1)\,z^k\right) = \Gamma(s)\sum_{k=0}^{\infty}A_k(s)\,z^k. \tag{194}$$
For |z| ≥ 1 the equals sign in (194) must be replaced by an equivalence symbol. We have already seen how to apply the partition method for a power series expansion to the second exponential factor in the above integral when A_k(1) appeared instead of A_k(−1). In (194), all we need to do is follow the same procedure, except that now we replace the reciprocal logarithm numbers in (158)–(160) by A_k(−1) or (−1)^k/(k + 1). As a consequence, we define new polynomials p_k(t) instead of the h_k(t). Hence, the new polynomials are given by
$$p_k(t) = \sum_{\substack{n_1,n_2,n_3,\dots,n_k=0\\ n_1+2n_2+3n_3+\cdots+kn_k=k}}^{k,[k/2],[k/3],\dots,1} t^N \prod_{i=1}^{k}\frac{1}{n_i!}\left(\frac{(-1)^{i+1}}{i+1}\right)^{n_i}, \tag{195}$$
where N = Σ_{j=1}^{k} n_j again. As in the case of the h_k(t), the new polynomials are of order k in t. Hence, we can represent them by p_k(t) = Σ_{j=1}^{k} p_{k,j} t^j. Using (195), we find that p_0(t) = 1, p_1(t) = t/2 and p_2(t) = t^2/8 − t/3. More of these polynomials are displayed in Table 5, where we see that the coefficients oscillate in sign. This is unlike the h_k(t), whose coefficients are homogeneous in sign for each value of k. If like powers of z are equated in (194), then one obtains
$$\Gamma(s)\,A_k(s) = \int_0^{\infty}dt\;t^{s-1}e^{-t}\,p_k(t). \tag{196}$$
In terms of the coefficients p_{k,j}, (196) reduces to
$$A_k(s) = \sum_{j=1}^{k}p_{k,j}\,(s)_j\,, \tag{197}$$
where k ≥ 1. Whilst (197) has been determined under the condition that Re(s) > 0, it can be analytically continued to s < 0, as we did previously for the A_k(−s) and h_{k,i}. Hence, we find by putting s = −1 in (197) that p_{k,1} = −A_k(−1) or rather p_{k,1} = (−1)^{k+1}/(k + 1), which is also verified by the first order coefficients in Table 5. Furthermore, by putting s = 1 in the same equation we see that the reciprocal logarithm numbers are given by A_k = Σ_{j=1}^{k} p_{k,j}\,j! = −h_{k,1}, where the last result follows from (165). Now let us differentiate the analogous version of (159) with the A_k replaced by A_k(−1) and h_k(t) by p_k(t). By equating like powers of z, we arrive at
$$p_k'(t) = \sum_{j=0}^{k-1}\frac{(-1)^{k-j+1}}{k-j+1}\,p_j(t). \tag{198}$$
An alternative version of this equation is
$$l\,p_{k,l} = \sum_{j=l-1}^{k-1}\frac{(-1)^{k-j+1}}{k-j+1}\,p_{j,l-1}\,. \tag{199}$$
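The polynomials p_k(t) of (195), together with the representation (197) for the A_k(s), can be checked with a short sketch (our own code):

```python
# Build p_k(t) from (195) and reconstruct A_k(s) via (197):
#   A_k(s) = sum_j p_{k,j} (s)_j.
from itertools import product
from math import factorial

import sympy as sp

s, t = sp.symbols('s t')

def p(k):
    """p_k(t) via the partition sum (195)."""
    if k == 0:
        return sp.Integer(1)
    total = sp.Integer(0)
    ranges = [range(k // i + 1) for i in range(1, k + 1)]
    for ns in product(*ranges):
        if sum(i * n for i, n in enumerate(ns, start=1)) != k:
            continue
        term = t ** sum(ns)
        for i, n in enumerate(ns, start=1):
            term *= sp.Rational((-1) ** (i + 1), i + 1) ** n / factorial(n)
        total += term
    return sp.expand(total)

def A_from_p(k):
    """A_k(s) via (197): replace t^j in p_k(t) by the Pochhammer symbol (s)_j."""
    pk = p(k)
    return sp.expand(sum(pk.coeff(t, j) * sp.rf(s, j) for j in range(1, k + 1)))

print(p(2))        # = t**2/8 - t/3
print(A_from_p(2)) # = s**2/8 - 5*s/24
```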
For l = 1, (199) yields the result for p_{k,1} given in the previous paragraph, while for l = 2 we find that

p_{k,2} = \frac{(-1)^k}{k+2}\left(H_1(k+1) - 1\right), (200)

where H_l(k) = \sum_{j=1}^{k-1} j^{-l} from Ref. [1]. Hence, p_{k,2} is related to a harmonic number. To determine general formulas for the coefficients of the highest order terms in the p_k(t), we use the result derived via the partition method for a power series expansion, viz. (195). The highest order term is the contribution due to the greatest number of elements that sum to k, or in other words, the contribution due to the partition with k ones. Then we find that

p_{k,k} = \frac{1}{2^k\, k!} = (-1)^k h_{k,k}. (201)
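As an illustrative cross-check, (195) can be transcribed directly into a short routine that enumerates the partitions of k and accumulates each partition's contribution exactly. This is a sketch in Python using exact rational arithmetic; the function name `pk_coeffs` is ours, not the paper's.

```python
from fractions import Fraction
from math import factorial

def pk_coeffs(k):
    """Coefficients {j: p_{k,j}} of p_k(t), built by summing over all
    partitions of k as in (195): an element i occurring n_i times
    contributes ((-1)^(i+1)/(i+1))^{n_i}/n_i!, and the whole partition
    contributes to the power t^N with N = sum of the n_i."""
    coeffs = {}
    def rec(remaining, max_part, counts):
        if remaining == 0:
            n_total = sum(counts.values())
            term = Fraction(1)
            for i, n in counts.items():
                term *= Fraction((-1) ** (i + 1), i + 1) ** n / factorial(n)
            coeffs[n_total] = coeffs.get(n_total, Fraction(0)) + term
            return
        for part in range(min(max_part, remaining), 0, -1):
            counts[part] = counts.get(part, 0) + 1
            rec(remaining - part, part, counts)
            counts[part] -= 1
            if counts[part] == 0:
                del counts[part]
    rec(k, k, {})
    return coeffs

# p_1(t) = t/2 and p_2(t) = t^2/8 - t/3, as quoted in the text
assert pk_coeffs(1) == {1: Fraction(1, 2)}
assert pk_coeffs(2) == {1: Fraction(-1, 3), 2: Fraction(1, 8)}

# A_k = sum_j p_{k,j} j! reproduces the reciprocal logarithm numbers
A = lambda k: sum(c * factorial(j) for j, c in pk_coeffs(k).items())
assert A(1) == Fraction(1, 2) and A(2) == Fraction(-1, 12)
```

The same routine also confirms the leading- and lowest-order coefficients p_{k,k} and p_{k,1} for small k.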
The next highest order coefficient is given by the contribution due to the partition consisting of k - 2 ones and a two. From (195) we obtain

p_{k,k-1} = \frac{1}{(k-2)!}\left(\frac{1}{2}\right)^{k-2}\left(-\frac{1}{3}\right) = 4(-1)^{k+1} h_{k,k-1}, (202)

where k ≥ 2. For p_{k,k-2} there are two partitions that we need to consider. The first consists of k - 3 ones and a three, while the second consists of k - 4 ones and two twos. The first partition occurs for k ≥ 3, while the second occurs for k ≥ 4. Nevertheless, the contributions can be combined into one general result, which is given by

p_{k,k-2} = \frac{1}{(k-3)!}\left(\frac{1}{2}\right)^{k-3}\frac{4k-3}{36} = 4(-1)^k\,\frac{4k-3}{k+3}\, h_{k,k-2}. (203)
Another method of determining the p_{k,j} is to replace s by -s in (197) and equate the rhs of the resulting equation with the rhs of (163). Then equating like powers of s yields

\sum_{j=l}^{k} (-1)^{j+l} S_j^{(l)}\, p_{k,j} = \sum_{j=l}^{k} (-1)^j S_j^{(l)}\, h_{k,j}. (204)
410
V. Kowalenko
For l = k - 3 this equation reduces to

p_{k,k-3} = \frac{(k-2)(k-3)}{2}\left((-1)^{k+1} h_{k,k-2} - p_{k,k-2}\right) + \frac{(k-1)(k-2)(k-3)(3k-4)}{24}\left((-1)^{k+1} h_{k,k-1} - p_{k,k-1}\right) + \frac{k^2 (k-1)^2 (k-2)(k-3)}{48}\left((-1)^{k+1} h_{k,k} - p_{k,k}\right) + (-1)^{k+1} h_{k,k-3}. (205)

On introducing the general formulas for the various quantities into the rhs of (205), we eventually find that

p_{k,k-3} = -\frac{8}{405}\,\frac{20k^2 - 45k + 22}{2^k\,(k-4)!}. (206)
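The recursion (199) offers an independent route to the same coefficients and provides a concrete check of the closed forms (201), (202), (203) and (206). A hedged Python sketch, assuming that p_{j,0} vanishes except for p_{0,0} = 1 (i.e. only p_0(t) has a constant term):

```python
from fractions import Fraction
from math import factorial

# Build p_{k,l} from the recursion (199), starting from p_0(t) = 1.
K = 10
p = [[Fraction(0)] * (K + 1) for _ in range(K + 1)]
p[0][0] = Fraction(1)
for k in range(1, K + 1):
    for l in range(1, k + 1):
        acc = sum(Fraction((-1) ** (k - j + 1), k - j + 1) * p[j][l - 1]
                  for j in range(l - 1, k))
        p[k][l] = acc / l

# Closed forms for the highest-order coefficients
for k in range(1, K + 1):
    assert p[k][k] == Fraction(1, 2 ** k * factorial(k))          # (201)
for k in range(2, K + 1):
    assert p[k][k - 1] == -Fraction(1, 3 * 2 ** (k - 2) * factorial(k - 2))  # (202)
for k in range(3, K + 1):
    assert p[k][k - 2] == Fraction(4 * k - 3, 36 * 2 ** (k - 3) * factorial(k - 3))  # lhs of (203)
for k in range(4, K + 1):
    assert p[k][k - 3] == -Fraction(8 * (20 * k * k - 45 * k + 22),
                                    405 * 2 ** k * factorial(k - 4))  # (206)
```

For k = 4 the last identity reduces to p_{4,1} = -1/5, in agreement with p_{k,1} = (-1)^{k+1}/(k+1).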
This completes our investigation into the properties of the Ak (s) and their associated polynomials. In the next section we consider applications of Theorem 1 to derive other fascinating results.
6 Applications

As mentioned in the introduction, this paper grew out of a desire to derive new properties for the Riemann zeta function, which may be useful for proving the Riemann hypothesis. At present there does not seem to be sufficient material about the properties of the zeta function to prove the Riemann hypothesis. In this section we aim to produce new forms for both the Riemann zeta and gamma functions in terms of the generalized reciprocal logarithm numbers. Whether these will be useful in proving the Riemann hypothesis remains to be seen, but at least here we will be able to present another version of the hypothesis. Before considering the Riemann zeta and gamma functions, there are other applications for the generalized reciprocal logarithm numbers. Therefore, as our first application we consider No. 2.6.5.10 in Ref. [14], which for μ = 1 is
\int_0^1 dx\, \frac{\ln^{2n-1}(x)}{1-x} = \frac{(-1)^n}{4n}\,(2\pi)^{2n} B_{2n}. (207)

In (207) n is a positive integer while B_{2n} denotes a Bernoulli number. We can also write the above integral as
\int_0^1 dx\,(1-x)^{2n-2}\, \frac{\ln^{2n-1}(1+x-1)}{(x-1)^{2n-1}} = \frac{(-1)^{n+1}}{4n}\,(2\pi)^{2n} B_{2n}. (208)
We now introduce the result in Theorem 1 into the lhs of the above equation. Since |x - 1| ranges from zero to unity, we replace the equivalence symbol in this result by an equals sign. Then we obtain

\sum_{k=0}^{\infty} (-1)^k A_k(1-2n) \int_0^1 dx\,(1-x)^{2n+k-2} = \frac{(-1)^{n+1}}{4n}\,(2\pi)^{2n} B_{2n}. (209)
Recognizing that the integral in (209) is merely the integral representation for the beta function, we arrive at

B_{2n} = \frac{4(-1)^{n+1}\, n}{(2\pi)^{2n}} \sum_{k=0}^{\infty} \frac{(-1)^k A_k(1-2n)}{2n+k-1}. (210)
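For n = 1, (210) collapses to Euler's series, since A_k(-1) = (-1)^k/(k+1) makes the summand 1/(k+1)^2. A quick numerical sketch:

```python
import math

# n = 1 in (210): with A_k(-1) = (-1)^k/(k+1) the summand is 1/(k+1)^2,
# so B_2 = 4/(2*pi)^2 * sum_k 1/(k+1)^2 should come out as 1/6.
K = 200_000
s = sum(1.0 / (k + 1) ** 2 for k in range(K))
B2 = 4.0 / (2.0 * math.pi) ** 2 * s
```

The partial sum differs from pi^2/6 by roughly 1/K, so a few hundred thousand terms recover B_2 to several digits.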
We shall generalize (210) later in this section. Since B_{2n} is related to ζ(2n) via No. 9.616 in Ref. [3], we can regard (210) as an expression for ζ(2n) in terms of an infinite sum involving negative odd integer values of the generalized reciprocal logarithm numbers. For n = 1 we have B_2 = 1/6, while from the previous section we found that A_k(-1) = (-1)^k/(k+1). Introducing these results into (210) yields \sum_{k=1}^{\infty} 1/k^2 = \pi^2/6, which was first obtained by Euler. On the other hand, for n = 2 we have B_4 = -1/30. Then (210) reduces to

\frac{1}{6} \sum_{k=0}^{\infty} \frac{|A_k(-3)|}{k+3} = \zeta(4) = \sum_{k=1}^{\infty} \frac{1}{k^4}. (211)
This result is interesting as the lhs is a completely different series from the Dirichlet series for ζ(4). In actual fact, we can derive a general formula for A_k(-3) by the repeated use of (1), which gives

A_k(-3) = \sum_{j=0}^{k} A_j(-1)\, A_{k-j}(-2) = \sum_{j=0}^{k} A_j(-1) \sum_{i=0}^{k-j} A_i(-1)\, A_{k-j-i}(-1). (212)
By introducing the general form for A_k(-1) into the above equation, we find after some algebra that

A_k(-3) = \frac{2(-1)^k}{k+3} \sum_{j=0}^{k} \left(\frac{1}{j+1} + \frac{1}{k-j+2}\right) \sum_{i=0}^{k-j} \frac{1}{i+1}. (213)
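Equation (213) lends itself to an exact numerical check against the convolution form (212), using only A_k(-1) = (-1)^k/(k+1). A Python sketch with exact rationals (the function names are ours):

```python
from fractions import Fraction

def A_minus1(k):               # A_k(-1) = (-1)^k/(k+1)
    return Fraction((-1) ** k, k + 1)

def A_minus3_conv(k):          # the double convolution in (212)
    return sum(A_minus1(j) * A_minus1(i) * A_minus1(k - j - i)
               for j in range(k + 1) for i in range(k - j + 1))

def A_minus3(k):               # the closed form (213)
    H = lambda m: sum(Fraction(1, i + 1) for i in range(m + 1))
    acc = sum((Fraction(1, j + 1) + Fraction(1, k - j + 2)) * H(k - j)
              for j in range(k + 1))
    return Fraction(2 * (-1) ** k, k + 3) * acc

# both routes to A_k(-3) agree exactly
assert all(A_minus3(k) == A_minus3_conv(k) for k in range(12))
```

The routine reproduces the values quoted below for A_1(-3) through A_4(-3).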
The above equation gives A_1(-3) = -3/2, A_2(-3) = 7/4, A_3(-3) = -15/8 and A_4(-3) = 29/15, which can be verified by putting s = -3 in the corresponding results in Table 3. Then (211) becomes

\sum_{k=0}^{\infty} \frac{1}{(k+3)^2} \sum_{j=0}^{k} \left(\frac{1}{j+1} + \frac{1}{k-j+2}\right) H_{k-j+1} = 3\zeta(4), (214)

where the harmonic number H_{k-j+1} = \sum_{i=1}^{k-j+1} 1/i. According to No. 2.6.5.11 in Ref. [14], we also have for μ = 1
\int_0^1 dx\, \frac{x^{-1/2}}{1-x}\, \ln^{2n-1}(x) = (-1)^{n+1}\, \frac{(2\pi)^{2n}}{4n}\left(1 - 2^{2n}\right) B_{2n}. (215)
By introducing the power series expansion in Theorem 1 with the equivalence symbol replaced by an equals sign as in (208), one obtains

\sum_{k=0}^{\infty} (-1)^k A_k(1-2n) \int_0^1 dx\,(1-x)^{2n+k-2}\, x^{-1/2} = (-1)^n\, \frac{(2\pi)^{2n}}{4n}\left(1 - 2^{2n}\right) B_{2n}. (216)
Like the first application, the integral in (216) is a version of the beta function integral. Hence, (216) becomes

\sum_{k=0}^{\infty} (-1)^k A_k(1-2n)\, \frac{\Gamma(2n+k-1)\,\Gamma(1/2)}{\Gamma(2n+k-1/2)} = (-1)^n\, \frac{(2\pi)^{2n}}{4n}\left(1 - 2^{2n}\right) B_{2n}. (217)
For n = 1 this equation yields

\sum_{k=0}^{\infty} \frac{k!}{k+1}\, \frac{\Gamma(3/2)}{\Gamma(k+3/2)} = {}_3F_2(1,1,1;3/2,2;1) = \frac{\pi^2}{4}, (218)
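The 3F2 value in (218) can be spot-checked by summing the series with the term ratio t_{k+1}/t_k = (k+1)^2/((k+2)(k+3/2)), starting from t_0 = 1. A rough numerical sketch; the terms decay only like k^{-3/2}, so a million terms recover just a few digits:

```python
import math

# Partial sum of the series in (218) via the ratio of consecutive terms.
s218, t = 0.0, 1.0
for k in range(10 ** 6):
    s218 += t
    t *= (k + 1) ** 2 / ((k + 2) * (k + 1.5))
```

The tail after K terms is of order 2 Gamma(3/2)/sqrt(K), i.e. about 2e-3 here, consistent with the target pi^2/4.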
which is a result that does not appear in Ref. [25]. For our next application we consider the integral representation for the gamma function, which is given by

\Gamma(s) = \int_0^{\infty} dt\, e^{-t}\, t^{s-1}. (219)
Equation (219) is defined for s > 0. If we make the substitution y = exp(-t), then we find that

\Gamma(s) = \int_0^1 dy\,(1-y)^{s-1}\, \frac{\ln^{s-1}(1+y-1)}{(y-1)^{s-1}}. (220)

We now introduce the power series expansion in Theorem 1 into (220) with the equivalence symbol replaced by an equals sign, since the range of integration is between zero and unity. The resulting integral is a simple algebraic integral and so we arrive at

\Gamma(s) = \sum_{k=0}^{\infty} \frac{(-1)^k A_k(1-s)}{k+s}. (221)
Better still, we can write (221) as

\Gamma(s+1) = \sum_{k=0}^{\infty} \frac{(-1)^k A_k(-s)}{k+s+1}. (222)
For the particular case where s = l and l is a positive integer, (222) reduces to

l! = \sum_{k=0}^{\infty} \frac{(-1)^k A_k(-l)}{k+l+1}. (223)
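For l = 1 and l = 2, (223) can be probed numerically; A_k(-2) is generated from A_k(-1) = (-1)^k/(k+1) via the addition identity A_k(s+t) = Σ_j A_j(s) A_{k-j}(t). A Python sketch, with the truncation chosen so the partial sums fall within the stated tolerances:

```python
# l! = sum_k (-1)^k A_k(-l)/(k+l+1), checked for l = 1 and l = 2.
K = 1500
A1 = [(-1) ** k / (k + 1) for k in range(K)]   # A_k(-1)
# A_k(-2) by convolving A(-1) with itself (the addition identity)
A2 = [sum(A1[j] * A1[k - j] for j in range(k + 1)) for k in range(K)]
s1 = sum((-1) ** k * A1[k] / (k + 2) for k in range(K))   # should tend to 1!
s2 = sum((-1) ** k * A2[k] / (k + 3) for k in range(K))   # should tend to 2!
```

For l = 1 the partial sum is exactly 1 - 1/(K+1); for l = 2 the tail decays like (ln K)/K.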
We have already seen that the generalized reciprocal logarithm numbers are defined over the entire complex plane. Therefore, although (221) has been derived for s > 0, it can be
analytically continued over the entire complex plane. In addition, replacing s by 1 - s we obtain

\Gamma(1-s) = \sum_{k=0}^{\infty} \frac{(-1)^k A_k(s)}{k+1-s}. (224)
Multiplying (224) with (222) yields

\frac{s\pi}{\sin(s\pi)} = \sum_{k=0}^{\infty} (-1)^k \sum_{j=0}^{k} \frac{A_j(s)\, A_{k-j}(-s)}{(j+1-s)(k-j+s+1)}, (225)
where we have used the reflection formula for the gamma function, which is given as No. 8.334(3) in Ref. [3]. Alternatively, this equation can be written as

\frac{s\pi}{\sin(s\pi)} = \sum_{k=0}^{\infty} \frac{(-1)^k}{k+2}\left(f_k(s) + f_k(-s)\right), (226)

where

f_k(s) = \sum_{j=0}^{k} \frac{A_j(s)\, A_{k-j}(-s)}{j+1+s}. (227)
Since the lhs of (226) is even, adding f_k(s) to f_k(-s) must result in an expression in powers of s^2 only. In actual fact, if we define w_k(s) by

w_k(s) = \sum_{i=0}^{k} \frac{(-1)^i}{i+2}\left(f_i(s) + f_i(-s)\right), (228)
then we find via Mathematica that

w_k(s) = \frac{d_k(s)}{\prod_{j=1}^{k+1}\left(1 - s^2/j^2\right)}, (229)

where d_k(s) = 1 + \sum_{j=1}^{k} (-1)^{k+1} d_{k,j}\, s^{2j}. Some of the d_k(s) are displayed in Table 6. From this table we see that the coefficients d_{k,j} increase monotonically in magnitude as j increases, which is surprising since we know that

\frac{s\pi}{\sin(s\pi)} = \prod_{j=1}^{\infty} \frac{1}{1 - s^2/j^2}. (230)
Therefore, we expect the coefficients dk,j to converge to zero in accordance with the above equation, but we observe the opposite. This may not be a problem, however, since the coefficients alternate in sign. From (229) we see that the zeros up to ±(k + 1)π in sin s appear in the denominator of wk (s). That is, as k increases, more zeros of sin(sπ) appear in the denominator of wk (s) while the order of the dk (s) also increases. In addition, as k increases, the wk (s) become a more accurate approximation to sπ csc(sπ) as demonstrated in Table 7 for s = 1/13 and 9/13.
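The limit that the w_k(s) approach can be checked against the classical product for sπ csc(sπ) over the zeros of sin(sπ), together with the bottom row of Table 7. A small numerical sketch (function names are ours):

```python
import math

def sine_product(s, terms):
    """Partial product for s*pi/sin(s*pi) = prod_{j>=1} (1 - s^2/j^2)^(-1)."""
    prod = 1.0
    for j in range(1, terms + 1):
        prod /= 1.0 - s * s / (j * j)
    return prod

exact = lambda s: s * math.pi / math.sin(s * math.pi)
```

With 10^5 factors the partial product agrees with sπ csc(sπ) to about five decimal places for both s = 1/13 and s = 9/13, matching the final row of Table 7.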
Table 6 The polynomials d_k(s) as given by (229)

  k   d_k(s)
  1   1
  2   1 + s^2/432 - s^4/432
  3   1 + 7s^2/1728 - 29s^4/6912 + s^6/6912
  4   1 + 419s^2/81000 - 1573s^4/288000 + 127s^6/432000 - 13s^8/2592000
  5   1 + 7609s^2/1296000 - 292591s^4/46656000 + 12797s^6/31104000 - 89s^8/7776000 + 11s^10/93312000
  6   1 + 7460407s^2/1185408000 - 288988487s^4/42674688000 + 3523601s^6/7112448000 - 123421s^8/7112448000 + 6421s^10/21337344000 - 29s^12/14224896000
Table 7 Approximating sπ csc(sπ) with successive values of w_k(s) as defined by (228)

               s = 1/13     s = 9/13
  w_2(s)       1.0074426    2.1818945
  w_3(s)       1.0085024    2.3781553
  w_4(s)       1.0087478    2.4252766
  w_5(s)       1.0089178    2.4583976
  w_6(s)       1.0090421    2.4829207
  w_7(s)       1.0091369    2.5017917
  w_8(s)       1.0092113    2.5167521
  w_9(s)       1.0092713    2.5288961
  w_10(s)      1.0093206    2.5389461
  sπ csc(sπ)   1.0098000    2.6427599
It should be mentioned, however, that using the truncated power series expansion for sin(sπ) and dividing it into sπ is a far more accurate approach for obtaining values of the quantity on the lhs of (226). In a future publication we shall adapt the partition method for a power series expansion to the lhs of (226), thereby obtaining a new power series expansion whose coefficients will be referred to as the cosecant numbers. Thus, we shall not need to divide a truncated power series expansion to obtain accurate values for the cosecant. In addition, we shall show that the new numbers are not only related to the Bernoulli numbers, but also possess numerous interesting properties of their own. Another disadvantage with using the above approximation scheme involving w_k(s) is that one needs to ensure that the nearest zeros to the value of s are included when the series in (225) is truncated. For example, if s = l + x, where l is large and x is less than unity, then we need to ensure that all the zeros in the vicinity of l are included in order to achieve a satisfactory approximation. Truncating the sum at values less than l will yield a useless approximation because the dominant terms are due to the factors of 1 - (l+x)^2/(l+1)^2 and 1 - (l+x)^2/l^2 in the denominator, which will not be included in the sum when it is truncated below l. Unfortunately, this means that we require larger k values of the generalized reciprocal logarithm numbers.

We now consider the generalization of the first application in this section. From No. 9.511 in Ref. [3] we have

\zeta(s) = \frac{1}{\Gamma(s)} \int_0^{\infty} dt\, \frac{t^{s-1} e^{-t}}{1 - e^{-t}}. (231)
If we make the change of variable y = exp(-t), then we find that

\zeta(s) = \frac{1}{\Gamma(s)} \int_0^1 dy\, \frac{(-\ln(1+y-1))^{s-1}}{1-y}. (232)
At this stage we introduce the result given in Theorem 1, except that we can replace the equivalence symbol by an equals sign since the range of integration is between zero and unity. Then we obtain

\zeta(s) = \frac{1}{\Gamma(s)} \int_0^1 dy\,(1-y)^{s-2} \sum_{k=0}^{\infty} A_k(1-s)\,(y-1)^k. (233)
Interchanging the order of the integration and summation and evaluating the resulting integral, we arrive at

\zeta(s) = \frac{1}{\Gamma(s)} \sum_{k=0}^{\infty} \frac{(-1)^k A_k(1-s)}{k+s-1}. (234)
As an aside, if we had used the integral representation for the Hurwitz zeta function, which is given as No. 9.511 in Ref. [3], and followed through with the same analysis, then we would find that

\zeta(s,a) = \frac{1}{\Gamma(s)} \sum_{k=0}^{\infty} (-1)^k A_k(1-s)\, \frac{\Gamma(k+s-1)\,\Gamma(2-a)}{\Gamma(k+s+1-a)}, (235)
which, as expected, reduces to (234) when a = 1. For s = 3 we can introduce (140) into (234), thereby obtaining the following result for Apéry's constant:

\zeta(3) = \sum_{k=0}^{\infty} \frac{H_{k+1}}{(k+2)^2}. (236)
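Since A_k(-2) = (-1)^k 2H_{k+1}/(k+2) (obtainable from the addition identity), inserting it into (234) at s = 3 makes the alternating factors cancel, leaving a positive-term series. A partial-sum sketch:

```python
# Partial sums of the Apery-constant series: sum_k H_{k+1}/(k+2)^2 -> zeta(3).
K = 20_000
H, s = 0.0, 0.0
for k in range(K):
    H += 1.0 / (k + 1)          # running harmonic number H_{k+1}
    s += H / (k + 2) ** 2
```

The tail behaves like (ln K)/K, so twenty thousand terms give roughly three correct digits of ζ(3) ≈ 1.2020569.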
For s = 2l, where l is a positive integer, ζ(2l) can be expressed in terms of B_{2l}. Hence, the above is a generalization of (210). In addition, if we denote the l-th zero along the critical line by χ_l, i.e. s = 1/2 ± iχ_l, then (234) reduces to

\sum_{k=0}^{\infty} (-1)^k\, \frac{A_k(1/2 \mp i\chi_l)}{k - 1/2 \pm i\chi_l} = 0. (237)
Alternatively, we can paraphrase the Riemann hypothesis by stating that one needs to show that within the critical strip the sum \sum_{k=0}^{\infty} (-1)^k A_k(1-s)/(k+s-1) is zero if and only if \Re(s) = 1/2. From No. 2.6.5.11 of Ref. [14], we also have

\int_0^1 dx\, \frac{x^{-1/2}}{1-x}\, \ln^{2n}(x) = (2n)!\left(2^{2n+1} - 1\right)\zeta(2n+1). (238)

In order to introduce the power series expansion in Theorem 1, we re-write the above result as

\int_0^1 dx\, x^{-1/2}\,(1-x)^{2n-1}\, \frac{\ln^{2n}(1+x-1)}{(x-1)^{2n}} = (2n)!\left(2^{2n+1} - 1\right)\zeta(2n+1). (239)
Introducing Equivalence (117), but with the equivalence symbol replaced by an equals sign, into (239) yields

\sum_{k=0}^{\infty} (-1)^k A_k(-2n)\, \frac{\Gamma(k+2n)\,\Gamma(1/2)}{\Gamma(k+2n+1/2)} = (2n)!\left(2^{2n+1} - 1\right)\zeta(2n+1). (240)

In obtaining this result we have used the integral representation for the beta function. For n = 1, we find that

\sum_{k=0}^{\infty} \frac{\Gamma(k+2)\, H_{k+1}}{(k+2)\,\Gamma(k+5/2)} = \frac{7\,\zeta(3)}{\sqrt{\pi}}. (241)
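Equation (241) can be probed numerically. To avoid overflowing the gamma functions, the ratio Γ(k+2)/Γ(k+5/2) is updated term by term; with our reading of the constants (7ζ(3)/√π on the rhs, an assumption where the extracted text is ambiguous) the partial sums approach the target slowly, like (ln K)/√K:

```python
import math

K = 10 ** 6
r = math.gamma(2.0) / math.gamma(2.5)   # Gamma(k+2)/Gamma(k+5/2) at k = 0
H, s241 = 0.0, 0.0
for k in range(K):
    H += 1.0 / (k + 1)                  # running harmonic number H_{k+1}
    s241 += r * H / (k + 2)
    r *= (k + 2) / (k + 2.5)            # advance the Gamma ratio
target = 7 * 1.2020569031595943 / math.sqrt(math.pi)   # 7 zeta(3)/sqrt(pi)
```

After a million terms the partial sum is still a few hundredths short of the target, consistent with a tail of order 2(ln K)/√K.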
For our final application we consider No. 2.6.13.18 in Ref. [14], which is

I = \int_1^{\infty} dx\, \ln^s\!\left(\frac{x+1}{x-1}\right) = 2\,\Gamma(s+1)\,\zeta(s). (242)
The integral on the lhs of (242) can also be written as

I = \int_1^{\infty} dx \left(-\ln\!\left(1 - \frac{2}{1+x}\right)\right)^{s}. (243)
If we multiply the integrand by (2/(1+x))^s (2/(1+x))^{-s}, then we can introduce Equivalence (117) with an equals sign replacing the equivalence symbol as we did in the previous application. Therefore, the integral becomes

I = 2^s \int_1^{\infty} dx\,(1+x)^{-s} \sum_{k=0}^{\infty} A_k(-s)\left(-\frac{2}{1+x}\right)^k. (244)
Interchanging the order of the integration and summation and making the change of variable y = 1/(1+x), we find that

I = 2^s \sum_{k=0}^{\infty} (-2)^k A_k(-s) \int_0^{1/2} dy\, y^{k+s-2}. (245)
Evaluating the integral yields

\zeta(s) = \frac{1}{\Gamma(s+1)} \sum_{k=0}^{\infty} \frac{(-1)^k A_k(-s)}{k+s-1}. (246)

We can again paraphrase the Riemann hypothesis by stating that one needs to show that within the critical strip \sum_{k=0}^{\infty} (-1)^k A_k(-s)/(k+s-1) is equal to zero if and only if \Re(s) = 1/2. Furthermore, if we compare the above result with (234), then we arrive at the interesting identity of

\sum_{k=0}^{\infty} \frac{(-1)^k}{k+s-1}\left(A_k(1-s) - \frac{A_k(-s)}{s}\right) = 0. (247)
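At s = 2 both (234) and (246) become elementary, since A_k(-1) = (-1)^k/(k+1) and A_k(-2) = (-1)^k 2H_{k+1}/(k+2) (the latter assumed from the addition identity). The two partial sums should then agree, and both should tend to ζ(2), which also exercises the identity comparing the two representations. A numerical sketch:

```python
import math

K = 20_000
H, s234, s246 = 0.0, 0.0, 0.0
for k in range(K):
    H += 1.0 / (k + 1)                       # running harmonic number H_{k+1}
    s234 += 1.0 / (k + 1) ** 2               # (-1)^k A_k(-1)/(k+1) at s = 2
    s246 += 2.0 * H / ((k + 1) * (k + 2))    # (-1)^k A_k(-2)/(k+1) at s = 2
zeta2_from_234 = s234            # Gamma(2) = 1
zeta2_from_246 = s246 / 2.0      # Gamma(3) = 2
```

Both estimates converge to π²/6, the first with error 1/K and the second with error of order (ln K)/K.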
7 Conclusion

To familiarize the reader with the partition method for a power series expansion, which was first introduced in Ref. [11], we have explained in this work how it can be adapted to calculate the Bernoulli numbers and polynomials. The method involves evaluating all distinct partitions that sum to the order of the power of the variable and then assigning specific values to the elements in the partitions. The contribution due to a particular partition is a product of the assigned values as well as a multinomial factor, which, in turn, is based on the frequencies or numbers of occurrences of each element in the partition. The sum of all the contributions from the partitions that sum to the order of the power yields the coefficient for that order in the resulting power series expansion, which need not necessarily be convergent. An interesting feature of this method is that although the number of partitions increases exponentially as the order k increases, the coefficients in the resulting power series expansion often converge rapidly to zero, as in the calculation of the a_k in Sect. 2, which were set equal to B_k/k!.

In order to apply the partition method for a power series expansion, a quantity must first be expanded as a power series beginning at least at first order. This power series then becomes the variable in a second power series expansion. In calculating a_k and a_k(x), where in the latter case the elements of the partitions were assigned values that were dependent upon the variable x as opposed to the numerical values assigned for the a_k, we first expanded the denominators of their exponential generating functions, which resulted in power series expansions appearing in the denominators. Then the denominators were treated as regularized values of the geometric series according to the lemma in Sect. 2. In such cases the multinomial factor is equal to (n_1 + n_2 + \cdots + n_k)!/(n_1!\, n_2! \cdots n_k!), where n_1 represents the number of ones in a partition, n_2 the number of twos, etc. However, the partition method for a power series expansion need not only be applied to cases where the second series is the geometric series. In the second calculation of the a_k, where we were able to express them in terms of exponential integrals involving the polynomials g_k(y), the exponential power series became the second power series and, as a consequence, the multinomial factor simplified to 1/(n_1!\, n_2! \cdots n_k!). It should also be mentioned that the partition method for a power series expansion is more general than a Taylor series expansion and thus can be used to determine power series where the latter method cannot be used. However, when a Taylor series expansion can be derived, the method will yield the same result, although via a different perspective. Thus, the method should be viewed as complementary to the Taylor series approach in these cases.

This paper grew out of a desire to obtain a power series expansion for z^s \ln^{-s}(1+z), which could then be used to obtain new forms for the Riemann zeta and gamma functions. In deriving this power series expansion in Theorem 1 we again adapted the partition method so that the arbitrary power s became a variable in the multinomial factor. As in the calculation of the a_k(x), the coefficients in the resulting power series become polynomials. We referred to the coefficients as the generalized reciprocal logarithm numbers, since for s = 1 they reduce to the reciprocal logarithm numbers presented in Ref. [1]. Theorem 1 presents the general form for the generalized reciprocal logarithm numbers A_k(s) as obtained from the partition method for a power series expansion, while Table 3 displays them up to k = 10. There, we see that the A_k(s) are polynomials of order k, although this can be surmised when applying the partition method. Section 5 derives numerous properties of the generalized reciprocal logarithm numbers.
We begin with an identity relating Ak (s + t) to a finite sum involving the product Ak−j (s)Aj (t), which is proved in Theorem 2. Then a recursion relation for the Ak (s) is derived in Theorem 3. In order to evaluate the Ak (s) with this relation, one also requires the
reciprocal logarithm numbers or A_k. Nevertheless, it is significantly faster than using (118), although it remains to be seen whether an optimized version of the latter can be constructed that will outperform the recursion relation for very large values of k. If the partition method for a power series expansion is applied to the factor of exp(-zt/\ln(1+z)) in an integral representation of the gamma function given by (156), then one can express the generating function for z^s/\ln^s(1+z) in terms of an exponential integral involving another set of special polynomials referred to as h_k(t) in Sect. 5. By evaluating the resulting integral and equating like powers of z with the result in Theorem 1, we show that the generalized reciprocal logarithm numbers for -s, i.e. A_k(-s), can be expressed in terms of a finite sum, whose summand is the product of the coefficients of the h_k(t) multiplied by the Pochhammer polynomials as given by (163). When the partition method for a power series expansion is applied to the integral representation of the gamma function with the exponential factor altered to exp(-\ln(1+z)t/z), a power series expansion is obtained for z^{-s}\ln^s(1+z) in terms of another set of polynomials, which are referred to as p_k(t). Then one finds that the generalized reciprocal logarithm numbers can be expressed in terms of a finite sum with the summand being the product of the coefficients of the p_k(t) and the Pochhammer polynomials as given by (197). We obtain general formulas for the highest and lowest order coefficients of both sets of polynomials in Sect. 5. By using the results for the coefficients of these polynomials together with the general results for the Stirling numbers of the first kind in the appendix, we were able to derive general formulas for the four highest orders of the A_k(s) as in (192). By applications in Sect. 6 we mean the derivation of new results involving the A_k(s) in infinite series.
Basically, the idea has been to introduce the power series expansion given by Equivalence (117) into known integrals involving lns (z) in the integrand. As a result, we were able to derive infinite series involving the generalized reciprocal logarithm numbers for the Bernoulli numbers and the gamma and Riemann zeta functions. The infinite series for the Bernoulli numbers are given by (210) and (217), while that for the gamma function is given by (221). We present two results for the Riemann zeta function, viz. (234) and (246), while the Hurwitz zeta function is given by (235). Acknowledgement The author wishes to thank Professor M.L. Glasser of Clarkson University for his interest in this work and for alerting him to Ref. [21].
Appendix

To develop general forms for the coefficients of the generalized reciprocal logarithm numbers A_k(s), we require general formulas for the Stirling numbers of the first kind. These can be derived by multiplying out all the terms in (s)_k, which yields

(s)_k = s^k + \sum_{i=1}^{k-1} i\, s^{k-1} + \sum_{i=2}^{k-1} \sum_{j=1}^{i-1} i\,j\, s^{k-2} + \sum_{i=3}^{k-1} \sum_{j=2}^{i-1} \sum_{l=1}^{j-1} i\,j\,l\, s^{k-3} + O(s^{k-4}). (248)

Evaluating the sums in (248) yields

(s)_k = s^k + \frac{k(k-1)}{2}\, s^{k-1} + \frac{3k^4 - 10k^3 + 9k^2 - 2k}{24}\, s^{k-2} + \frac{k^2 (k-1)^2 (k-2)(k-3)}{48}\, s^{k-3} + O(s^{k-4}). (249)
By comparing the above equation with (189) we see that

S_k^{(k)} = 1, \qquad S_k^{(k-1)} = -\binom{k}{2}, \qquad S_k^{(k-2)} = \frac{3k-1}{4}\binom{k}{3},

S_k^{(k-3)} = -\frac{k(k-1)}{2}\binom{k}{4} \qquad \text{and} \qquad S_k^{(k-4)} = \frac{15k^3 - 30k^2 + 5k + 2}{48}\binom{k}{5}. (250)
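The formulas in (250) can be verified against the coefficients obtained by multiplying out (s)_k directly. A Python sketch, with the sign convention that the coefficient of s^{k-j} in the rising factorial equals (-1)^j S_k^{(k-j)}:

```python
from math import comb

def rising_coeffs(k):
    """Coefficients c[m] of s^m in the rising factorial (s)_k = s(s+1)...(s+k-1)."""
    poly = [1]
    for i in range(k):
        new = [0] * (len(poly) + 1)
        for m, c in enumerate(poly):
            new[m + 1] += c      # contribution of s * (c s^m)
            new[m] += i * c      # contribution of i * (c s^m)
        poly = new
    return poly

# The coefficient of s^(k-j) equals (-1)^j S_k^{(k-j)}; check (250).
for k in range(5, 13):
    c = rising_coeffs(k)
    S = lambda j, c=c, k=k: (-1) ** j * c[k - j]
    assert S(0) == 1
    assert S(1) == -comb(k, 2)
    assert 4 * S(2) == (3 * k - 1) * comb(k, 3)
    assert 2 * S(3) == -k * (k - 1) * comb(k, 4)
    assert 48 * S(4) == (15 * k ** 3 - 30 * k ** 2 + 5 * k + 2) * comb(k, 5)
```

The same coefficients reproduce the familiar unsigned Stirling numbers, e.g. (s)_5 = 24s + 50s^2 + 35s^3 + 10s^4 + s^5.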
Although the above results have been obtained independently, the first four results agree with the coefficients in the power series expansion given by (27) in Ref. [26]. From (248) we can write down a more general result for the Stirling numbers of the first kind, which is

S_k^{(k-j)} = (-1)^j \sum_{i_j=j}^{k-1} i_j \sum_{i_{j-1}=j-1}^{i_j-1} i_{j-1} \sum_{i_{j-2}=j-2}^{i_{j-1}-1} i_{j-2} \cdots \sum_{i_1=1}^{i_2-1} i_1. (251)

From the results (250) we see that S_k^{(k-j)} will equal a polynomial of O(j-1) in k multiplied by \binom{k}{j+1}.
References

1. Kowalenko, V.: Properties and applications of the reciprocal logarithm numbers (submitted for publication)
2. Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical Recipes in C, 2nd edn. Cambridge University Press, New York (1992)
3. Gradshteyn, I.S., Ryzhik, I.M.: Table of Integrals, Series and Products, 5th edn. Academic, London (1994). Alan Jeffrey (Ed.)
4. Kowalenko, V.: Towards a theory of divergent series and its importance to asymptotics. In: Recent Research Developments in Physics, vol. 2, pp. 17–68. Transworld Research Network, Trivandrum, India (2001)
5. Kowalenko, V.: The Stokes Phenomenon, Borel Summation and Mellin-Barnes Regularisation. Bentham e-books (submitted for publication)
6. Whittaker, E.T., Watson, G.N.: A Course of Modern Analysis, 4th edn. Cambridge University Press, Cambridge (1973), p. 252
7. Lighthill, M.J.: Fourier Analysis and Generalised Functions, Student's edn. Cambridge University Press, Cambridge (1975)
8. Gel'fand, I.M., Shilov, G.E.: Generalized Functions, vol. 1—Properties and Operations. Academic, New York (1964)
9. Kowalenko, V.: Exactification of the asymptotics for Bessel and Hankel functions. Appl. Math. Comput. 133, 487–518 (2002)
10. Kowalenko, V., Frankel, N.E., Glasser, M.L., Taucher, T.: Generalised Euler-Jacobi Inversion Formula and Asymptotics beyond All Orders. London Mathematical Society Lecture Note, vol. 214. Cambridge University Press, Cambridge (1995)
11. Kowalenko, V., Frankel, N.E.: Asymptotics for the Kummer function of Bose plasmas. J. Math. Phys. 35, 6179–6198 (1994)
12. Kowalenko, V.: The non-relativistic charged Bose gas in a magnetic field II. Quantum properties. Ann. Phys. (N.Y.) 274, 165–250 (1999)
13. Dingle, R.B.: Asymptotic Expansions: Their Derivation and Interpretation. Academic, London (1973)
14. Prudnikov, A.P., Marichev, O.I., Brychkov, Y.A.: Integrals and Series, vol. 1: Elementary Functions. Gordon and Breach, New York (1986)
15. Munkhammar, J.: Integrating Factor. In: MathWorld—A Wolfram Web Resource. http://mathworld.wolfram.com//IntegratingFactor.html
16. Weisstein, E.W.: Bell Number. In: MathWorld—A Wolfram Web Resource. http://mathworld.wolfram.com/BellNumber.html
17. Roman, S.: The Umbral Calculus. Academic, New York (1984)
18. Weisstein, E.W.: Bernoulli Number. In: MathWorld—A Wolfram Web Resource. http://mathworld.wolfram.com/BernoulliNumber.html
19. Sondow, J., Weisstein, E.W.: Harmonic Number. In: MathWorld—A Wolfram Web Resource. http://mathworld.wolfram.com//HarmonicNumber.html
20. Wolfram, S.: Mathematica—A System for Doing Mathematics by Computer. Addison-Wesley, Reading (1992)
21. Wu, M., Pan, H.: Sums of products of Bernoulli numbers of the second kind. arXiv:0709.2947v1 [math.NT], 19 Sep. 2007
22. Weisstein, E.W.: Bernoulli Number of the Second Kind. In: MathWorld—A Wolfram Web Resource. http://mathworld.wolfram.com/BernoulliNumberoftheSecondKind.html
23. Spanier, J., Oldham, K.B.: An Atlas of Functions. Hemisphere, New York (1987)
24. Abramowitz, M., Stegun, I.A.: Handbook of Mathematical Functions. Dover, New York (1965)
25. Prudnikov, A.P., Marichev, O.I., Brychkov, Y.A.: Integrals and Series, vol. 3: More Special Functions. Gordon and Breach, New York (1990)
26. Weisstein, E.W.: Stirling Number of the First Kind. In: MathWorld—A Wolfram Web Resource. http://mathworld.wolfram.com//StirlingNumberoftheFirstKind.html
Acta Appl Math (2009) 106: 421–432 DOI 10.1007/s10440-008-9305-4
Practical Study on the Fuzzy Risk of Flood Disasters Lihua Feng · Gaoyuan Luo
Received: 15 May 2008 / Accepted: 21 August 2008 / Published online: 11 September 2008 © Springer Science+Business Media B.V. 2008
Abstract The simplest way to perform a fuzzy risk assessment is to calculate the fuzzy expected value and convert fuzzy risk into non-fuzzy risk, i.e., a crisp value. In doing so, there is a transition from the fuzzy set to the crisp set. Therefore, the first step is to define an α level value, and then select the elements x with a subordinate degree A(x) ≥ α. The higher the value of α, the lower the degree of uncertainty: the probability is closer to its true value. The lower the value of α, the higher the degree of uncertainty: this results in a lower probability serviceability. The possibility level α is dependent on technical conditions and knowledge. A fuzzy expected value of the possibility-probability distribution is a set whose boundaries are the lower expected value E_α(x) and the upper expected value Ē_α(x); these fuzzy expected values represent the fuzzy risk values being calculated. Therefore, we can obtain a conservative risk value, a venture risk value and a maximum probability risk value. Under such an α level, three risk values can be calculated. As α adopts all values throughout the set [0, 1], it is possible to obtain a series of risk values. Therefore, the fuzzy risk may be a multi-valued or set-valued risk. Calculation of the fuzzy expected value of flood risk in the Jinhua River basin has been performed based on the interior-outer set model. Selection of an α value depends on the confidence of different groups of people, while selection of a conservative risk value or venture risk value depends on the risk preference of these people. Keywords Interior-outer set model · α level · Fuzzy risk · Fuzzy expected value · Flood
1 Introduction Among natural disasters, flood is one of the most common and destructive. In recent years, floods have become increasingly serious as social and economic development has progressed rapidly [12]. It is estimated that losses caused by floods account for 40% of the total losses attributed to all disasters [22]. Floods occur with high frequency, affect vast areas, are L.H. Feng () · G.Y. Luo Department of Geography, Zhejiang Normal University, Jinhua 321004, China e-mail:
[email protected]
difficult to contain and cause the greatest destruction, representing a huge threat to lives and property, and a constant risk facing mankind [15]. Risk is always accompanied by uncertainty [14]. If there were no uncertainty, there would be no risk. Due to many factors, human beings cannot accurately predict many incidents. Uncertainty always exists, and therefore risk is inevitable [11]. Flood variations in the future are an issue of great uncertainty, and risk analysis of floods is absolutely necessary. Two main reasons for the inaccurate estimation of flood risk are: (1) People do not fully understand the distribution of the random rules of floods. This is because the normal distribution assumption regarding random variations and the exponential distribution assumption, as well as the Markov assumption regarding process complexity, are far-fetched. (2) The related assumptions are possibly similar to the actual conditions in form; however, the available data usually provide no help to the analysis of the related distribution or process parameters. To avoid the adverse effects created by improper man-made assumptions, we suggest adopting a non-parametric estimation algorithm. The simplest non-parametric estimation model is the histogram model, while the most theoretically complete is the kernel estimation model. The former is simple and rough, while the latter is theoretically strict; however, since there is no criterion of engineering interest for the selection of the kernel function, it can hardly be used for flood risk analysis [9]. The proposition of fuzzy risk provides an alternative way to improve the accuracy of flood risk estimation. By any means, describing the probability distribution of the flood risk with a given sample estimation is always inaccurate, and the error is quite significant. Therefore, the risk recognized by people is not a true risk, but a fuzzy risk [9].
It is similar to the example that if there were no true records of a person, judging his or her age solely from his appearance would always be inaccurate, and the error would be quite significant. It is quite normal that the estimated age differs from the actual age by several years. A correct estimation is a coincidence, while an incorrect estimation is certain. Taking the incorrect estimation of flood risk as the basis of decision-making will inevitably cause the investment project to face a major risk without being aware of it. Just like the age estimated from the appearance would not be taken as the actual age, the incorrectly estimated flood risk may not be taken as the actual risk. If the difference between the estimated value and the actual value can be proved insignificant, such estimated value can be used without hesitation. Otherwise, a more flexible method is preferable for the estimation of flood risk. Just as when you cannot exactly estimate a person’s age, it is preferable to express his age by “he is a young man”, “he is about 20”, etc.; an absolute expression would only conceal the age estimation’s true attribute, fuzziness [9]. Fuzzy mathematics is a powerful tool for dealing with uncertainty problems [5]. After the fuzzy set theory was established by Zadeh [23, 24], many scholars started to perform risk analysis by applying fuzzy set methods [18, 21]. Schmucker [19] proposed a technique to calculate the fuzzy risk of a whole system from the fuzzy risk of consolidated sub-systems. Joblonowski [10] provided a so-called “fuzzy risk contour” and called for the application of artificial neural network models in discovering fuzzy risk. Delgado et al. [3] presented the basic principles of decisions based on the model interval. Machias and Skikos [13] applied fuzzy risk index models to some extent to divide wind velocity-probability space into risk zones and non-risk zones. All these studies paved the way for risk analysis using fuzzy techniques [1]. 
A less common approach is a risk calculation model in which the probabilities themselves receive a fuzzy treatment [17]. Huang and Shi [8] proposed the theory of fuzzy risk together with its mathematical model through the use of information diffusion technology, highlighting its effectiveness for small-sample optimization in connection with the fuzzy uncertainty of risk values.
Practical Study on the Fuzzy Risk of Flood Disasters
423
Fuzzy risk differs from probability risk in terms of the probability distribution introduced for exceeding probability, indicating the possibility of occurrence of some probabilities [4]. This not only reveals the inaccuracy of the exceeding probability estimation, but also provides a way for the model to accommodate fuzzy information. Huang [6] established the interior-outer set model by using the information distribution method to calculate the fuzzy risk as an expression of the fuzzy nature of probability estimation. The interior-outer set model, due to the complexity of the combined calculation involved in its traditional expression, is not easy to popularize in practical applications. However, the matrix arithmetic method may be used for this purpose. Moraga [16] provided another method of presentation for the interior-outer set model that has proven to be quite easy to compute. Since no flood analysis has been performed using the theory of fuzzy risk [20], studying future flood variation using the interior-outer set model is recommended.
2 Methods

2.1 Interior-Outer Set Model

Let X = {xi | i = 1, 2, . . . , n} represent the sample of the incident, where each sample point xi is a real number. For instance, X consists of precipitation records in a certain historic period with a domain U; when X represents a precipitation sample, the domain is [0, 1800]. Let u1, u2, . . . , um represent discrete points with a given step length Δ, and write U = {uj | j = 1, 2, . . . , m} for this discrete domain. From the point of view of information distribution, a sample point xi allocates information with value qij to the point uj, as expressed by the following equation:

qij = 1 − |xi − uj|/Δ,  for |xi − uj| ≤ Δ;
qij = 0,               for |xi − uj| > Δ.   (1)
where xi is an observation value, uj is a controlled point, and Δ is the step length of the controlled points. The interior-outer set model can be used to estimate the fuzzy probability of an incident occurring in the following intervals:

Ij = [uj − Δ/2, uj + Δ/2),  j = 1, 2, . . . , m.   (2)
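The triangular weighting (1) is easily sketched in code (the function and variable names below are ours, not the paper's):

```python
def info_distribution(x_i, u_j, delta):
    """Information q_ij allocated by observation x_i to controlled
    point u_j under (1): linear decay to zero within one step length."""
    d = abs(x_i - u_j)
    return 1.0 - d / delta if d <= delta else 0.0
```

With Δ = 800, the sample point 2790 allocates 1 − 490/800 ≈ 0.39 to u = 2300 and 1 − 310/800 ≈ 0.61 to u = 3100, matching the first column of Table 2.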
Intervals are selected so that every sample point xi lies within exactly one interval Ij. When a sample point is subjected to random disturbance, it may depart from the interval Ij, while a point outside may also enter the interval. Changes in the relationship between a sample point and the interval due to random disturbance are referred to as leaving in the former case, and joining in the latter. The possibilities of sample point xi leaving or joining the interval Ij are denoted qij− and qij+, respectively. For the purpose of computation, the interior set and outer set of Ij are defined as follows. Interior set: Xin−j = X ∩ Ij, i.e. the set of all sample points contained in Ij. Outer set: Xout−j = X\Xin−j, i.e. the set of all sample points not contained in Ij. Let Sj denote the index set of Xin−j (the interior index set) and, similarly, Tj the index set of Xout−j (the outer index set).
424
L.H. Feng, G.Y. Luo
Let |Sj| = nj; then |Tj| = n − nj, where nj represents the volume of Sj or Xin−j. Let

qij− = 1 − qij,  for xi ∈ Xin−j;   qij− = 0,  otherwise.   (3)

qij+ = qij,  for xi ∈ Xout−j;   qij+ = 0,  otherwise.   (4)
Consequently, a possibility-probability risk distribution of a random incident in the interval Ij can be calculated corresponding to the domain of intervals I = {Ij | j = 1, 2, . . . , m} and the domain of discrete probabilities P = {pk | k = 0, 1, 2, . . . , n} = {k/n | k = 0, 1, 2, . . . , n}:

ΠI,P = {πIj(p) | Ij ∈ I, p ∈ P}.   (5)
where πIj(p) represents the possibility that an incident in the interval Ij occurs with probability p. A simplified calculation method for (5) was given by Huang et al. [7] using the following technique. First, calculate the leaving value set Qj− = {qij−} for interval Ij from its interior set Xin−j, and then the joining value set Qj+ = {qij+} for the same interval from its outer set Xout−j. Define

↑Qj− = (q−j0,j, q−j1,j, . . . , q−j(nj−1),j), where q−js,j ≤ q−jt,j (∀s < t),

i.e. the elements of Qj− arranged in ascending order, and

↓Qj+ = (q+j(nj+1),j, q+j(nj+2),j, . . . , q+jn,j), where q+js,j ≥ q+jt,j (∀s < t),

i.e. the elements of Qj+ arranged in descending order. Finally, let pk = k/n (k = 0, 1, 2, . . . , n), and (5) simplifies as follows:
πIj(p) =
    q−j0,j,        p = p0
    q−j1,j,        p = p1
    · · ·
    q−j(nj−1),j,   p = p(nj−1)
    1,             p = pnj
    q+j(nj+1),j,   p = p(nj+1)
    q+j(nj+2),j,   p = p(nj+2)
    · · ·
    q+jn,j,        p = pn
   (6)

where q−j0,j is the 1st element of ↑Qj−, q−j1,j the 2nd, and q−j(nj−1),j the last element of ↑Qj−; q+j(nj+1),j is the 1st element of ↓Qj+, q+j(nj+2),j the 2nd, and q+jn,j the last element of ↓Qj+. Equation (6) is a form of the interior-outer set model suitable for simplified computation.
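The steps of the simplified model (6) — distribute information, split the sample into interior and outer sets, sort the leaving values ascending and the joining values descending, and splice them around the value 1 at position p_{nj} — can be sketched as follows (a minimal sketch; names are ours, and the information rule (1) is assumed for qij):

```python
def possibility_probability(sample, u_j, delta):
    """Simplified interior-outer set model (6): returns the possibilities
    pi_Ij(p_k) for k = 0..n, where p_k = k/n and n = len(sample)."""
    q = lambda x: max(0.0, 1.0 - abs(x - u_j) / delta)    # information rule (1)
    lo, hi = u_j - delta / 2, u_j + delta / 2             # interval I_j from (2)
    interior = [x for x in sample if lo <= x < hi]
    outer = [x for x in sample if not (lo <= x < hi)]
    leaving = sorted(1.0 - q(x) for x in interior)        # ascending  Q_j^-
    joining = sorted((q(x) for x in outer), reverse=True) # descending Q_j^+
    return leaving + [1.0] + joining                      # possibility 1 at p_{n_j}
```

With the ten discharges of Table 1, u_j = 2300 and Δ = 800, this reproduces row I2 of Table 7 up to two-decimal rounding of the printed values.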
2.2 Fuzzy Expected Value Calculation Based on the Possibility-Probability Distribution

The possibility-probability risk calculated using the interior-outer set model is referred to as fuzzy risk. The simplest way to perform fuzzy risk assessment is to calculate the fuzzy expected value and convert the fuzzy risk into a non-fuzzy risk, i.e. a crisp value. In doing so, there must be a transition from the fuzzy set to the crisp set. In order to obtain a crisp set from a fuzzy set, one must first define a standard α (a level or threshold value), and then select the elements x with subordinate degree A(x) ≥ α (0 ≤ α ≤ 1). The level value α is therefore a key concept in this transition.

Definition Let Ω represent the space of incidents x, and P the probability domain. The fuzzy probability distribution is

p(x) = {μx(p) | x ∈ Ω, p ∈ P}.   (7)
For any α ∈ [0, 1], select the elements p with subordinate degree μx(p) ≥ α, and let

p_α(x) = min{p | p ∈ P, μx(p) ≥ α},
p̄_α(x) = max{p | p ∈ P, μx(p) ≥ α},   (8)
where p_α(x) is referred to as the minimum probability with regard to x in the α-level cut set, and p̄_α(x) as the maximum probability accordingly. For example, in Fig. 1 the minimum probability with regard to x in the α-level cut set is p2, and the maximum probability is p3. A sketch of the fuzzy cut set of the possibility-probability distribution is shown in Fig. 1. In this paper, a triangular function is adopted as the subordinate function. The finite closed interval

pα(x) = [p_α(x), p̄_α(x)]   (9)

is referred to as the α-level cut set of the fuzzy set p(x) with regard to x. The transition from the fuzzy set p(x) to the crisp set pα(x) is achieved with α as the smallest value of the subordinate degree, i.e.,

πx(p) = {p | p ∈ P, μx(p) ≥ α}.   (10)

Fig. 1 Fuzzy cut set of possibility-probability distribution
Here, the level value α is referred to as the possibility level of probability. pα(x) is an interval derived from the possibility distribution πx(p) with given values of x and α, not a function with π as a variable. pα(x) has an alterable boundary for 0 ≤ α ≤ 1, and is a set with an elastic boundary. That is to say, for any given possibility level α there is a corresponding pα(x); the higher the value of α, the greater the corresponding possibility of the probability. It can be seen from Fig. 1 that the triangular curve is just the subordinate function μx(p). The conversion from p(x) to pα(x) begins at α. This means that the higher the possibility level α, the fewer the elements in the set pα(x); the lower the level α, the more elements in the set pα(x). Hence, the higher the value of α, the lower the uncertainty and the closer the probability to its true value; the lower the value of α, the higher the uncertainty and the less practical use the value of probability will have. The possibility level α is dependent on technical conditions and knowledge. Let

p_α*(x) = p_α(x) / ∫Ω p_α(x)dx,   p̄_α*(x) = p̄_α(x) / ∫Ω p̄_α(x)dx,   (11)

where p_α*(x) and p̄_α*(x) represent the normalizations of p_α(x) and p̄_α(x), respectively. Let

E_α(x) = ∫Ω x p_α*(x)dx,   Ē_α(x) = ∫Ω x p̄_α*(x)dx.   (12)

Then

Eα(x) = [E_α(x), Ē_α(x)]   (13)

represents the expected interval of the α-level cut set of p(x) with respect to x, where E_α(x) and Ē_α(x) are referred to as the fuzzy expected value of minimum probability and the fuzzy expected value of maximum probability for pα(x), respectively. Especially when α = 1, the occurrence of incident Ij is the only possibility; hence E1(x) for α = 1 is called the fuzzy expected value of maximum-possibility probability for pα(x). A fuzzy expected value of the possibility-probability distribution is a set with E_α(x) and Ē_α(x) as its boundaries. Here, to simplify calculation, E_α(x) and Ē_α(x) are selected
as fuzzy expected values of the possibility-probability distribution. In fact, the number of expected values of a possibility-probability distribution may be 3 or more. For any possibility-probability distribution with a given non-empty α cut set, there must be corresponding E_α(x) and Ē_α(x). Therefore, under the α level, the fuzzy expected values are E_α(x) and Ē_α(x). When the cut set technique for fuzzy sets is applied with α taking all values in the range [0, 1], it is possible to obtain the whole hierarchical structure of E_α(x) and Ē_α(x). The fuzzy expected values E_α(x) and Ē_α(x) of the possibility-probability distribution are really the fuzzy risk values. Generally speaking, incidents with high probability values occur with less intensity, while incidents with low probability values occur with stronger intensity. Therefore, we refer to the fuzzy expected value E_α(x) as the conservative risk value (RC), and the fuzzy expected value Ē_α(x) as the venture risk value (RV). The fuzzy expected value E1(x) is referred to as the maximum probability risk value (RM). For each such α level, 3 risk values can be obtained. With α taking all values throughout [0, 1], it is possible to obtain a series of risk values. Therefore, the fuzzy risk may be a multi-valued or set-valued risk.
3 Results and Discussion

3.1 Flood Risk Analysis

The application of fuzzy risk to the study of flood disasters is dealt with in this paper, using the Jinhua River basin as an example. Located in the center of Zhejiang Province, China, the Jinhua River basin covers an area of some 15,100 square kilometers, and has a humid climate and abundant light and heat resources. It has long been an important grain production base in Zhejiang Province. However, because it lies in a semi-tropical monsoon climate zone and its precipitation varies significantly within the year and between years, it suffers from serious floods. Since 1949, floods have occurred at an average frequency of 0.6 times per year, and a maximum frequency of 3 times in a single year. The most serious flood occurred from the 18th to the 22nd of June 1955, when precipitation over 24 hours, 3 days and 7 days all exceeded recorded highs. A peak discharge of 5500 m3/s at a recurrence interval of 100 years was recorded by the Jinhua Hydrological Station. Large areas of farmland, including 593,000 hm2 in Jinhua city alone, were submerged in this catastrophic flood, leading to lost grain production of 2.49 million tons. Therefore, it is necessary to analyze flood risks to ensure healthy and sustainable economic development. The type of probability pattern (for instance normal or exponential distribution) that flood variations in the Jinhua River basin follow is still unknown, and the sample actually measured is relatively small, with the characteristics of fuzzy information. Since probability estimation based on such samples is inaccurate, it is advisable to determine future flood risk using the possibility-probability distribution. Floods in the Jinhua River basin derive solely from atmospheric precipitation, and the extent of disasters is determined by peak discharge.
Hence, it is proposed to estimate future flood variations using the yearly maximum peak discharge Q published by the Jinhua Hydrological Station (Table 1). Taking the intervals I1 = [1100, 1900), I2 = [1900, 2700), I3 = [2700, 3500), I4 = [3500, 4300), with corresponding discrete domain U = {uj | j = 1, 2, 3, 4} = {1500, 2300, 3100, 3900} and controlled-point step length Δ = 800, the calculations proceed in tabular form as follows:
Table 1 Yearly maximum peak discharge Q since 1998 at Jinhua Hydrological Station

Year   Q (m3/s)   Sample point      Year   Q (m3/s)   Sample point
1998   2790       x1                2003   2410       x6
1999   3810       x2                2004   1610       x7
2000   4200       x3                2005   1410       x8
2001   1200       x4                2006   1670       x9
2002   3380       x5                2007   3040       x10
Table 2 Distributed information of each sample point

Ij   x1     x2     x3     x4     x5     x6     x7     x8     x9     x10
I1   0      0      0      0.63   0      0      0.86   0.89   0.79   0
I2   0.39   0      0      0      0      0.88   0.14   0      0.21   0.07
I3   0.61   0.11   0      0      0.65   0.13   0      0      0      0.93
I4   0      0.89   0.63   0      0.35   0      0      0      0      0
Table 3 Leaving information qij−

Qj−   x1     x2     x3     x4     x5     x6     x7     x8     x9     x10
Q1−                        0.38                 0.14   0.11   0.21
Q2−                                      0.13
Q3−   0.39                        0.35                               0.07
Q4−          0.11   0.38

Table 4 Ascending magnitude with respect to leaving information ↑Qj−

↑Qj−   Leaving information
↑Q1−   0.11  0.14  0.21  0.38
↑Q2−   0.13
↑Q3−   0.07  0.35  0.39
↑Q4−   0.11  0.38
Step 1: Calculate the distributed information qij using (1) for the individual sample points. The results are shown in Table 2.
Step 2: Calculate the leaving information qij− (Table 3) using (3). Arrange qij− in ascending order over the given interval Ij to obtain the leaving information ↑Qj− of ascending magnitude, as shown in Table 4.
Step 3: Calculate the joining information qij+ (Table 5) using (4). Arrange qij+ in descending order over the given interval Ij to obtain the joining information ↓Qj+ of descending magnitude, as shown in Table 6.
Step 4: Let

P = {pk = k/n}.   (14)
Table 5 Joining information qij+

Qj+   x1     x2     x3   x4   x5     x6     x7     x8   x9     x10
Q1+   0      0      0    0    0      0      0      0    0      0
Q2+   0.39   0      0    0    0      0      0.14   0    0.21   0.07
Q3+   0      0.11   0    0    0      0.13   0      0    0      0
Q4+   0      0      0    0    0.35   0      0      0    0      0
Table 6 Descending magnitude with respect to joining information ↓Qj+

↓Qj+   Joining information
↓Q1+   0     0     0     0     0     0
↓Q2+   0.39  0.21  0.14  0.07  0     0     0     0     0
↓Q3+   0.13  0.11  0     0     0     0     0
↓Q4+   0.35  0     0     0     0     0     0     0
Table 7 Fuzzy risk represented by a possibility-probability distribution

P    p0     p1     p2     p3     p4     p5     p6     p7     p8     p9     p10
     0      0.10   0.20   0.30   0.40   0.50   0.60   0.70   0.80   0.90   1.00
I1   0.11   0.14   0.21   0.38   1      0      0      0      0      0      0
I2   0.13   1      0.39   0.21   0.14   0.07   0      0      0      0      0
I3   0.07   0.35   0.39   1      0.13   0.11   0      0      0      0      0
I4   0.11   0.38   1      0.35   0      0      0      0      0      0      0
Here k = 0, 1, 2, . . . , 10, where n = 10 is the total number of samples. Substituting the figures in Tables 4 and 6 into (6) gives the fuzzy risk expressed by the possibility-probability distribution (Table 7).

3.2 Fuzzy Expected Value of Flood Risk

The α cut set technique can be used to obtain the fuzzy expected value of flood risk. First, take a level cut set with α = 0.1. From Table 7, based on (8) and (9), we obtain:

p0.1(I1) = [0, 0.40],   p0.1(I2) = [0, 0.40],
p0.1(I3) = [0.10, 0.50],   p0.1(I4) = [0, 0.30].   (15)

Then:

p_0.1(I) = {0, 0, 0.10, 0},
p̄_0.1(I) = {0.40, 0.40, 0.50, 0.30}.   (16)
Table 8 Fuzzy expected value of the flood risk in the Jinhua River basin

α            0.1    0.3    0.5    1
RC (m3/s)    3100   2700   2540   2540
RV (m3/s)    2650   2630   2540   2540
Second, we obtain the following sets based on (11) by normalizing p_0.1(Ij) and p̄_0.1(Ij):

p_0.1*(I) = {0, 0, 1, 0},
p̄_0.1*(I) = {0.25, 0.25, 0.31, 0.19}.   (17)

Finally, we calculate the fuzzy expected values E_0.1(I) and Ē_0.1(I) based on (12), where Ij is replaced by its interval center point uj:

E_0.1(I) = 0 × 1500 + 0 × 2300 + 1 × 3100 + 0 × 3900 = 3100,
Ē_0.1(I) = 0.25 × 1500 + 0.25 × 2300 + 0.31 × 3100 + 0.19 × 3900 ≈ 2650.   (18)
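The whole α-cut computation (15)–(18) can be sketched in a few lines (a sketch under our own naming; the rows are taken from Table 7 and the centers are the interval midpoints):

```python
def fuzzy_expected_values(pi_rows, centers, probs, alpha):
    """Alpha-cut fuzzy expected values from a possibility-probability
    distribution: pi_rows[j][k] = pi_Ij(p_k)."""
    p_min, p_max = [], []
    for row in pi_rows:                          # cut set per interval, (8)
        cut = [p for p, mu in zip(probs, row) if mu >= alpha]
        p_min.append(min(cut))
        p_max.append(max(cut))
    def expect(ps):                              # normalize (11), integrate (12)
        total = sum(ps)
        return sum(c * p / total for c, p in zip(centers, ps))
    return expect(p_min), expect(p_max)          # (R_C, R_V)

probs = [k / 10 for k in range(11)]              # p_0 .. p_10
table7 = [
    [0.11, 0.14, 0.21, 0.38, 1, 0, 0, 0, 0, 0, 0],    # I1
    [0.13, 1, 0.39, 0.21, 0.14, 0.07, 0, 0, 0, 0, 0], # I2
    [0.07, 0.35, 0.39, 1, 0.13, 0.11, 0, 0, 0, 0, 0], # I3
    [0.11, 0.38, 1, 0.35, 0, 0, 0, 0, 0, 0, 0],       # I4
]
centers = [1500, 2300, 3100, 3900]
rc, rv = fuzzy_expected_values(table7, centers, probs, 0.1)
```

Here rc evaluates to 3100 and rv to 2650, reproducing the α = 0.1 column of Table 8 (the paper rounds the normalized weights to two decimals before summing, hence its 2650).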
From the fuzzy expected values of the flood risk in the Jinhua River basin listed in Table 8, we obtain, for the α = 0.1 level cut set, a conservative risk value (RC) of 3100 m3/s and a venture risk value (RV) of 2650 m3/s, together with a maximum probability risk value (RM) of 2540 m3/s.

3.3 Benefit of Using the Fuzzy Risk Instead of the Probability Risk

The core task of probability risk analysis is to calculate the probability of the occurrence of various floods. In a real flood disaster system, owing to the incompleteness of information, the estimated probability differs significantly from the actual probability; in particular, when the volume of the given sample is too small to judge the type of probability distribution, flood risk cannot be estimated accurately at all. Due to the flood system's complexity and uncertainty, people usually must analyze it with incomplete information, and if the data for recognizing the objective law are incomplete, a rougher model is used to estimate the related functions [2]. Created on the basis of such cognition, the interior-outer set model may be used to estimate the possibility-probability distribution of the occurrence of a flood event when the samples are incomplete. The natural way of dealing with "inaccurate estimation" is to find a more practicable assessment model that significantly improves the estimation accuracy under the existing conditions. However, it is always impossible to estimate a risk value accurately, and an alternative is to reflect the fuzzy uncertainty of the risk value in a proper way. The flood system contains a large amount of uncertainty, comprising both occasional factors and obscure factors, corresponding to the randomness and fuzziness of the flood respectively. Therefore, when a sample is incomplete, the risk value estimated with the probability risk model is necessarily unreliable [25].
Therefore, the fuzzy risk can be a multi-valued or set-valued risk. That is to say, one flood event may correspond to several probability values that are possible to different extents. Using the fuzzy risk instead of the probability risk may therefore not only improve the accuracy of risk value estimation, but also reflect the fuzzy uncertainty of the risk value. The calculation of the fuzzy risk of floods is aimed mainly at risk assessment under the complexity of the flood system and the incompleteness of the data. It has the advantage of being
able to represent the fuzzy uncertainty of the risk value, provide more information for sorting out disaster alleviation solutions, and allow for adjustment while making decisions.

4 Conclusions

It can be seen from the above that the fuzzy risk of an incident calculated with the fuzzy cut set technique is a multi-valued risk, allocating many risk values to a given unit. These risk values are organized in a hierarchy over the different values of α. For each value of α, the expected values of the possibility-probability risk are E_α(x), Ē_α(x), and E1(x). The simplest way to perform fuzzy risk analysis is to calculate the fuzzy expected value and convert the fuzzy risk into a non-fuzzy risk to obtain a crisp value. This involves the transformation from a fuzzy set to a crisp set; it is therefore necessary to fix the α level value in advance, and then select the elements x with subordinate degree A(x) ≥ α. The fuzzy expected value of a possibility-probability distribution is a set with E_α(x) and Ē_α(x) as its boundaries, and these fuzzy expected values represent the fuzzy risk value. Therefore, we can obtain a conservative risk value, a venture risk value and a maximum probability risk value. With different values of α, the results fall into different conceptual categories such as "low-risk area", "high-risk area", and "acceptable-risk area" within the same geographical area. "Inaccurate estimation of probability" is a fatal weakness of the existing risk zoning map, and such objective information cannot be represented in the currently existing risk zoning map. The interior-outer set model is capable of calculating the possibility-probability distribution for the occurrence of floods, thus working out multi-valued probabilities. A risk zoning map plotted with the possibility-probability distribution can reflect the fuzzy uncertainty of the risk value.
This study is scientifically significant for creating a brand-new theory of flood risk zoning and has obvious practical value for improving the quality of risk zoning maps and effectively avoiding risks. This case study has been performed separately for α = 0.1, 0.3, 0.5 and 1.0. The selection of the value of α depends on the extent of confidence of different groups of people: the lower the confidence, the bigger the difference between the conservative risk value and the venture risk value. In the case of full confidence, both results reduce to one, i.e. the maximum probability risk value (RM = 2540 m3/s). The choice between the conservative risk value and the venture risk value depends on the risk preference of different groups of people. For instance, an investment activity with small investment and significant benefits (regardless of high risk probability) in an area where the flood risk is high may appeal to a tourism project investor, who might be interested in the venture risk value; in the building of a nuclear power station, an investor with a fairly large amount of capital may give up an activity with "possibly" high benefits for an activity with low risk probability, and select the conservative risk value. This result, which has very clear practical significance, can facilitate the study and application of regional risk planning theory, help scientists and decision-makers, and is in the public interest as it makes full use of information about the risk being studied.

Acknowledgements This work was supported by the National Natural Science Foundation of China (No. 40771044) and the Zhejiang Provincial Science and Technology Foundation of China (No. 2006C23066).
References 1. Allahviranloo, T.: Successive over relaxation iterative method for fuzzy system of linear equations. Appl. Math. Comput. 162(1), 189–196 (2005)
2. Baudrit, C., Couso, I., Dubois, D.: Joint propagation of probability and possibility in risk analysis: Towards a formal framework. Int. J. Approx. Reason. 45(1), 82–105 (2007)
3. Delgado, M., Verdegay, J.L., Vila, M.A.: A model for linguistic partial information in decision-making problems. Int. J. Intell. Syst. 9, 365–378 (1994)
4. Dikmen, I., Birgonul, M.T., Han, S.: Using fuzzy risk assessment to rate cost overrun risk in international construction projects. Int. J. Proj. Manag. 25(5), 494–505 (2007)
5. Elsalamony, G.: A note on fuzzy neighbourhood base spaces. Fuzzy Sets Syst. 157(20), 2725–2738 (2006)
6. Huang, C.F.: Concepts and methods of fuzzy risk analysis. In: Proceedings of the First China-Japan Conference on Risk Assessment and Management, pp. 12–23. International Academic Publishers, Beijing (1998)
7. Huang, C.F., Moraga, C., Chen, Z.F.: A simple algorithm of interior-outer set model. J. Nat. Disasters 13(4), 15–20 (2004)
8. Huang, C.F., Shi, Y.: Towards Efficient Fuzzy Information Processing—Using the Principle of Information Diffusion. Springer, Heidelberg (2002)
9. Huang, C.F., Zhang, J.X., Chen, Z.F., Zong, T.: Toward a new kind of natural disaster risk zoning map. J. Nat. Disasters 13(2), 9–15 (2004)
10. Joblonowski, M.: Fuzzy risk analysis, using AI systems. AI Expert 9(12), 34–37 (1994)
11. Karimi, I., Hüllermeier, E.: Risk assessment system of natural hazards: A new approach based on fuzzy probability. Fuzzy Sets Syst. 158(9), 987–999 (2007)
12. Kenyon, W., Hill, G., Shannon, P.: Scoping the role of agriculture in sustainable flood management. Land Use Policy 25(3), 351–360 (2008)
13. Machias, A.V., Skikos, G.D.: Fuzzy risk index of wind sites. IEEE Trans. Energy Convers. 7(4), 638–643 (1992)
14. Matos, M.A.: Decision under risk as a multicriteria problem. Eur. J. Oper. Res. 181(3), 1516–1529 (2007)
15. Mikhailov, V.N., Morozov, V.N., Cheroy, N.I., Mikhailova, M.V.: Extreme flood on the Danube River in 2006. Russ. Meteorol. Hydrol. 33(1), 48–54 (2008)
16. Moraga, C., Huang, C.F.: Learning subjective probabilities from a small data set. In: Proceedings of 33rd International Symposium on Multiple-Valued Logic, pp. 355–360. IEEE Comput. Soc., Los Alamitos (2003)
17. Reyna, V.F., Brainerd, C.J.: Numeracy, ratio bias, and denominator neglect in judgments of risk and probability. Learn. Individ. Differ. 18(1), 89–107 (2008)
18. Riečan, B.: On the Dobrakov submeasure on fuzzy sets. Fuzzy Sets Syst. 151(3), 635–641 (2005)
19. Schmucker, K.J.: Fuzzy Sets, Natural Language Computations, and Risk Analysis. Comput. Sci. Press, Rockville (1984)
20. Thavaneswaran, A., Thiagarajah, K., Appadoo, S.S.: Fuzzy coefficient volatility (FCV) models with applications. Math. Comput. Model. 45(7–8), 777–786 (2007)
21. Wu, H.C.: Using fuzzy sets theory and Black–Scholes formula to generate pricing boundaries of European options. Appl. Math. Comput. 185(1), 136–146 (2007)
22. Xia, F.Q., Kang, X.W., Wu, S.H.: Research on dike breach risk of the hanging reach under different flood conditions in the Lower Yellow River. Geogr. Res. 27(1), 229–239 (2008)
23. Zadeh, L.A.: Fuzzy sets. Inf. Control 8, 338–353 (1965)
24. Zadeh, L.A.: Fuzzy sets as a basis for a theory of possibility. Fuzzy Sets Syst. 1, 3–28 (1978)
25. Zhang, J.X., Huang, C.F.: Study on pattern of soft risk zoning map of natural disasters. J. Nat. Disasters 14(6), 20–25 (2005)
Acta Appl Math (2009) 106: 433–454 DOI 10.1007/s10440-008-9306-3
Stable-Range Approach to Short Wave and Khokhlov-Zabolotskaya Equations Xiaoping Xu
Received: 14 February 2008 / Accepted: 21 August 2008 / Published online: 18 September 2008 © Springer Science+Business Media B.V. 2008
Abstract Short wave equations were introduced in connection with the nonlinear reflection of weak shock waves. They also relate to the modulation of a gas-fluid mixture. The Khokhlov-Zabolotskaya equation is used to describe the propagation of a diffraction sound beam in a nonlinear medium. We give a new algebraic method of solving these equations by using certain finite-dimensional stable ranges of the nonlinear terms, and obtain large families of new explicit exact solutions parameterized by several functions. These parameter functions enable one to find solutions of some related practical models and boundary value problems.

Keywords Stable range · Short wave · Khokhlov-Zabolotskaya · Symmetry transformation · Exact solution

Mathematics Subject Classification (2000) Primary 35C05 · 35Q35 · Secondary 35C10 · 35C15
1 Introduction

Khristianovich and Rizhov [8] discovered the equations of short waves in connection with the nonlinear reflection of weak shock waves. The equations are mathematically equivalent to the following equation for their potential function u of the velocity vector:

2utx − 2(x + ux)uxx + uyy + 2kux = 0,   (1.1)
where k is a real constant. For convenience, we call the above equation "the short wave equation".

Research supported by China NSF 10871193.
X. Xu () Institute of Mathematics, Academy of Mathematics & System Sciences, Chinese Academy of Sciences, Beijing 100190, China. e-mail: [email protected]

The symmetry group and conservation laws of (1.1) were first studied by Kucharczyk [14] and later by Khamitova [6]. Bagdoev and Petrosyan [2] showed that the modulation equation of a gas-fluid mixture coincides in its main orders with the corresponding short-wave equation. Roy, Roy and De [23] found a loop algebra in the Lie symmetries of the short-wave equation. Kraenkel, Manna and Merle [13] studied nonlinear short-wave propagation in ferrites, and Ermakov [3] investigated short-wave interaction in film slicks. Khokhlov and Zabolotskaya [7] found the equation

2utx + (uux)x − uyy = 0   (1.2)
for quasi-plane waves in the nonlinear acoustics of bounded bundles. More specifically, the equation describes the propagation of a diffraction sound beam in a nonlinear medium (cf. [4, 20]). Kupershmidt [15] constructed a geometric Hamiltonian form for the Khokhlov-Zabolotskaya equation (1.2). Certain group-invariant solutions of (1.2) were found by Korsunskii [12], and by Lin and Zhang [16]. Akiyama and Kamakura [1] gave a connection between (1.2) and acoustic lenses emitting strongly focused finite-amplitude beams. Moreover, Koshvaga, Makavarests, Grimalsky, Kotsarenko and Eriquez [10] used (1.2) to investigate the spectrum of the seismic electromagnetic wave caused by seismic and volcanic activity. The three-dimensional generalization

2utx + (uux)x − uyy − uzz = 0
(1.3)
and its symmetries were studied by Krasil'shchik, Lychagin and Vinogradov [17] and by Schwarz [25]. Martinez-Moras and Ramos [18] showed that the higher-dimensional classical W-algebras are the Poisson structures associated with a higher-dimensional version of the Khokhlov-Zabolotskaya hierarchy. Kacdryavtsev and Sapozknikov [9] found the symmetries of a generalized Khokhlov-Zabolotskaya equation. Sanchez [24] studied long waves in ferromagnetic media via the Khokhlov-Zabolotskaya equation. Morozov [19] derived two non-equivalent coverings for the modified Khokhlov-Zabolotskaya equation from the Maurer-Cartan forms of its symmetry pseudo-group. Rozanova [21, 22] studied the closely related Khokhlov-Zabolotskaya-Kuznetsov equation from an analytic point of view. Kostin and Panasenko [11] investigated nonlinear acoustics in heterogeneous media via a Khokhlov-Zabolotskaya-Kuznetsov-type equation. All the above equations are similar nonlinear algebraic partial differential equations. Observe that the nonlinear terms in these equations keep some finite-dimensional polynomial space in x stable. In this paper, we present a new algebraic method of solving these equations by using this stability. We obtain a family of solutions of (1.1) with k = 1/2, 2, which blow up on a moving line y = f(t); they may reflect partial phenomena of gusts. Moreover, we obtain another family of smooth solutions parameterized by six smooth functions of t for any k. Similar results for (1.2) are also given. Furthermore, we find a family of solutions of (1.3) blowing up on a rotating and translating plane cos α(t)y + sin α(t)z = f(t), which may reflect partial phenomena of sound shocks, and a family of solutions parameterized by time-dependent harmonic functions in y and z, whose special cases are smooth solutions. Since our solutions contain parameter functions, they can be used to solve certain related practical models and boundary-value problems for these equations.
On the list of the Lie point symmetries of (1.1) in the works of Kucharczyk [14] and of Khamitova [6] (e.g. cf. p. 301 in [5]), the most sophisticated ones are those with respect to the following vector fields:

X1 = −α′y∂x + α∂y + [xy(α″ + α′) − (y³/3)(α‴ + (k + 1)α″ + kα′)]∂u,   (1.4)
Stable-Range Approach to Short Wave and Khokhlov-Zabolotskaya
435
X2 = β∂x + [y²(β″ + (k + 1)β′ + kβ) − x(β′ + β)]∂u,   (1.5)
where α and β are arbitrary functions of t. Among the known Lie point symmetries of the Khokhlov-Zabolotskaya equation (1.2) in the works of Vinogradov and Vorob'ev [26], and of Schwarz [25] (e.g. cf. p. 299 in [5]), the most interesting ones are those with respect to the following vector fields:

X3 = (α′/2)y∂x + α∂y − (α″/2)y∂u,   (1.6)

X4 = β∂t + [(2β′x + β″y²)/6]∂x + (2β′y/3)∂y − [(4β′u + 2β″x + β‴y²)/6]∂u.   (1.7)

The symmetries of the three-dimensional Khokhlov-Zabolotskaya equation (1.3) attracting our attention are those with respect to the vector fields (e.g. cf. p. 301 in [5]):
X5 = 10t 2 ∂t + (4tx + 3y 2 + 3z2 )∂x + 12ty∂y + 12tz∂z − (4x + 16tu)∂u , 1 1 X6 = α y∂x + α∂z − α y∂u , 2 2
(1.8)
(1.9)
X_7 = (1/2)β′z∂_x + β∂_y − (1/2)β″z∂_u.  (1.10)
We find that the group-invariant solutions with respect to the above vector fields X_1–X_7 are polynomial in x. This motivates us to find more exact solutions of the equations (1.1)–(1.3) that are polynomial in x. In Sect. 2, we solve the short-wave equation (1.1). Although (1.2) can be viewed as a special case of (1.3), we first solve (1.2) in Sect. 3 for simplicity, because our approach to (1.3) involves time-dependent harmonic functions and sophisticated integrals. The exact solutions of (1.3) will be given in Sect. 4.
2 Short Wave Equation
In this section, we study solutions polynomial in x of the short wave equation (1.1). By comparing the terms of highest degree in x, we find that such a solution must be of the form:
u = f(t, y) + g(t, y)x + h(t, y)x² + ξ(t, y)x³,  (2.1)
where f(t, y), g(t, y), h(t, y) and ξ(t, y) are suitably differentiable functions to be determined. Note
u_x = g + 2hx + 3ξx², u_tx = g_t + 2h_t x + 3ξ_t x², u_xx = 2h + 6ξx,  (2.2)
u_yy = f_yy + g_yy x + h_yy x² + ξ_yy x³.  (2.3)
Now (1.1) becomes
2(g_t + 2h_t x + 3ξ_t x²) − 2(g + (2h + 1)x + 3ξx²)(2h + 6ξx) + f_yy + g_yy x + h_yy x² + ξ_yy x³ + 2k(g + 2hx + 3ξx²) = 0,  (2.4)
436
X. Xu
which is equivalent to the following system of partial differential equations:
ξ_yy = 36ξ²,  (2.5)
h_yy = 6ξ(6h + 2 − k) − 6ξ_t,  (2.6)
g_yy = 8h² + 4(1 − k)h + 12ξg − 4h_t,  (2.7)
f_yy = 4gh − 2g_t − 2kg.  (2.8)
First we observe that
ξ = 1/(√6 y + β(t))²  (2.9)
is a solution of (2.5) for any differentiable function β of t. Substituting (2.9) into (2.6), we get
h_yy = 12β′(t)/(√6 y + β(t))³ + 6(6h + 2 − k)/(√6 y + β(t))².  (2.10)
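The claim that (2.9) solves (2.5) is easy to confirm by hand, and it can also be checked mechanically. The following sketch assumes the SymPy library is available; since (2.5) involves only y-derivatives, β(t) enters as a constant with respect to y:

```python
import sympy as sp

y, beta = sp.symbols('y beta')  # beta(t) acts as a constant in y
xi = 1 / (sp.sqrt(6)*y + beta)**2          # candidate solution (2.9)
residual = sp.diff(xi, y, 2) - 36*xi**2    # equation (2.5): xi_yy = 36 xi^2
assert sp.simplify(residual) == 0
```

The same check works for any shift β, which is why β may be an arbitrary function of t here.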
Denote by Z the ring of integers. Write
h(t, y) = Σ_{i∈Z} a_i(t)(√6 y + β(t))^i.  (2.11)
Then
h_yy = Σ_{i∈Z} 6(i + 2)(i + 1)a_{i+2}(t)(√6 y + β(t))^i.  (2.12)
Substituting (2.11) and (2.12) into (2.10), we obtain
Σ_{i∈Z} 6[(i + 2)(i + 1) − 6]a_{i+2}(t)(√6 y + β(t))^i = 12β′(t)/(√6 y + β(t))³ + 6(2 − k)/(√6 y + β(t))².  (2.13)
So
−24a_{−1}(t) = 12β′(t), −36a_0(t) = 6(2 − k)  (2.14)
and
6(i + 4)(i − 1)a_{i+2}(t) = 0,  i ≠ −2, −3.  (2.15)
Thus √ α β k−2 h= √ + γ ( 6y + β)3 , − √ + 2 6 ( 6y + β) 2( 6y + β)
(2.16)
where α and γ are arbitrary differentiable functions of t . Note −2αβ 2α + (β )2 β ht = √ + √ − √ ( 6y + β)3 2( 6y + β)2 2( 6y + β) √ √ + 3γβ ( 6y + β)2 + γ ( 6y + β)3
(2.17)
Stable-Range Approach to Short Wave and Khokhlov-Zabolotskaya
437
and α2 αβ 3(β )2 + 4(k − 2)α − √ + h2 = √ √ ( 6y + β)4 ( 6y + β)3 12( 6y + β)2 √ √ (2 − k)β (k − 2)2 + 2αγ ( 6y + β) − β γ ( 6y + β)2 + √ 36 6( 6y + β) √ (k − 2)γ √ ( 6y + β)3 + γ 2 ( 6y + β)6 . + 3 +
(2.18)
Substituting the above two equations into (2.7), we have: 12g 8α 2 4[(k + 1)α + 3α ] 2((k + 1)β + 3β ) = √ − + gyy − √ √ √ ( 6y + β)2 ( 6y + β)4 3( 6y + β)2 3( 6y + β) √ √ 2(k − 2)(1 − 2k) + 16αγ ( 6y + β) − 20β γ ( 6y + β)2 + 9 √ 4[(k + 1)γ + 3γ ] √ ( 6y + β)3 + 8γ 2 ( 6y + β)6 . − (2.19) 3 Write g(t, y) =
√ bi (t)( 6y + β)i .
(2.20)
i∈Z
Then
√ 6[(i + 2)(i + 1) − 2]bi+2 (t)( 6y + β)i
i∈Z
8α 2 4[(k + 1)α + 3α ] 2[(k + 1)β + 3β ] = √ − + √ √ ( 6y + β)4 3( 6y + β)2 3( 6y + β) √ √ 2(k − 2)(1 − 2k) + 16αγ ( 6y + β) − 20β γ ( 6y + β)2 + 3 √ 4[(k + 1)γ + 3γ ] √ ( 6y + β)3 + 8γ 2 ( 6y + β)6 . − 3 Comparing the constant terms, we get k = 1/2, 2. Moreover, the coefficients of the other terms give b−2 =
b3 =
α2 , 3
2αγ , 3
b0 =
b4 = −
(k + 1)α + 3α , 9 β γ , 3
b5 = −
b1 = −
(k + 1)β + 3β , 18
(k + 1)γ + 3γ , 81
b8 =
2γ 2 81
(2.21)
(2.22)
and (i + 3)ibi+2 = 0
for i = −4, −2, −1, 1, 2, 3, 6.
(2.23)
Therefore g=
σ (k + 1)α + 3α (k + 1)β + 3β √ α2 − ( 6y + β) +√ + √ 9 18 3( 6y + β)2 6y + β
438
X. Xu
√ 2αγ √ β γ √ ( 6y + β)3 − ( 6y + β)4 + ρ( 6y + β)2 + 3 3 2γ 2 √ (k + 1)γ + 3γ √ ( 6y + β)5 + ( 6y + β)8 , − 81 81
(2.24) (2.25)
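As a spot check, the particular solution (2.16) of (2.10) can be verified symbolically. The sketch below assumes SymPy, with α, β, γ left as undefined functions of t:

```python
import sympy as sp

t, y, k = sp.symbols('t y k')
alpha, beta, gamma = [sp.Function(n)(t) for n in ('alpha', 'beta', 'gamma')]
w = sp.sqrt(6)*y + beta
# candidate h from (2.16)
h = alpha/w**2 - sp.diff(beta, t)/(2*w) + (k - 2)/6 + gamma*w**3
# residual of equation (2.10): h_yy = 12*beta'/w^3 + 6*(6h + 2 - k)/w^2
residual = sp.diff(h, y, 2) - 12*sp.diff(beta, t)/w**3 - 6*(6*h + 2 - k)/w**2
assert sp.simplify(residual) == 0
```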
where σ and ρ are arbitrary differentiable functions of t. Observe that
g_t = −2α²β′/(3(√6 y + β)³) + (2αα′ − 3σβ′)/(3(√6 y + β)²) + σ′/(√6 y + β) + (k + 1)(2α′ − (β′)²)/18 + (2α″ − β′β″)/6 + ([36β′ρ − (k + 1)β″ − 3β‴]/18)(√6 y + β) + (ρ′ + 2αβ′γ)(√6 y + β)² + ([2αγ′ + 2α′γ − 4(β′)²γ]/3)(√6 y + β)³ − ([42β′γ′ + 5(k + 1)β′γ + 27β″γ]/81)(√6 y + β)⁴ − ([(k + 1)γ′ + 3γ″]/81)(√6 y + β)⁵ + (16β′γ²/81)(√6 y + β)⁷ + (4γγ′/81)(√6 y + β)⁸  (2.26)
and
gh =
α3 6ασ − α 2 β kα 2 + 2αα − 3β σ + √ + √ √ 3( 6y + β)4 6( 6y + β)3 6( 6y + β)2 3(k − 2)σ − 3αβ − 2(k + 1)αβ − 3α β (k + 1)(β )2 + 3β β + αρ + √ 36 18( 6y + β) (k − 2)((k + 1)α + 3α ) β ρ (k − 2)((k + 1)β + 3β ) √ + + α2 γ − − ( 6y + β) 54 2 108 2 (k − 2)ρ + 6γ σ − 4αβ γ √ (17k − 10)αγ − 3αγ (β ) γ 2 + ( 6y + β) + + 6 6 81 α γ √ (10 − 17k)β γ + 3β γ − 27β γ √ + ( 6y + β)4 ( 6y + β)3 + 3 162 (k − 2)((k + 1)γ + 3γ ) √ 56αγ 2 √ ( 6y + β)6 + γρ − ( 6y + β)5 + 486 81 +
−
28β γ 2 √ (k − 2)γ 2 √ 2γ 3 √ ( 6y + β)7 + ( 6y + β)8 + ( 6y + β)11 . 81 243 81
(2.27)
Substituting (2.25), (2.26) and (2.27) into (2.8), we obtain fyy =
4α 3 2[6(k + 1)σ + 3αβ + 2(k + 1)αβ + 3α β + 9σ ] − √ √ 3( 6y + β)4 9( 6y + β) 2[β β − (k + 1)α − α ] 2(k + 1)(β )2 12ασ + 2α 2 β + + 4αρ + √ 3 9 3( 6y + β)3 2 2 √ 4(k + 1) α 2(k + 1) β β + (k + 1)β − + 4α 2 γ − 6β ρ + + ( 6y + β) 27 3 27
+
4(k + 1)ρ + 20αβ γ √ 40((k + 1)αγ + 3αγ ) + 4γ σ − 2ρ − ( 6y + β)2 + − 3 81 2 √ √ 10(β ) γ 10(k + 1)β γ + 30β γ + ( 6y + β)4 + 4γρ ( 6y + β)3 + 3 27 2(k + 1)γ + 2γ √ 4(k + 1)2 γ 224αγ 2 √ + ( 6y + β)6 + ( 6y + β)5 + 243 27 81 16β γ 2 √ 8γ ((k + 1)γ + 3γ ) √ ( 6y + β)7 − ( 6y + β)8 9 243 8γ 3 √ ( 6y + β)11 . + 81 −
(2.28)
Thus f =
α3 6ασ + α 2 β (k + 1)(β )2 2 y + + θ + ϑy + 2αρy 2 + √ √ 2 9 27( 6y + β) 18( 6y + β) √ 6(k + 1)σ + 3αβ + 2(k + 1)αβ + 3α β + 9σ √ ( 6y + β)[ln( 6y + β) − 1] 27 2 β ρ β + (k + 1)β α γ β β − (k + 1)α − α 2 2(k + 1)2 α 2 y − y + − + + 3 27 9 6 108 2 √ √ (k + 1) β (k + 1)ρ + 5αβ γ 2γ σ − ρ + − ( 6y + β)3 + ( 6y + β)4 486 36 54 2 (β ) γ (k + 1)αγ + 3αγ √ (k + 1)β γ + 3β γ √ + − ( 6y + β)5 + ( 6y + β)6 36 243 486 (k + 1)γ + γ √ γρ (k + 1)2 γ 2αγ 2 √ + + ( 6y + β)8 + ( 6y + β)7 + 63 15309 3402 243 −
β γ 2 √ 2γ ((k + 1)γ + 3γ ) √ ( 6y + β)9 − ( 6y + β)10 243 32805 γ3 √ ( 6y + β)13 , + 9477 −
(2.29)
where θ and ϑ are arbitrary functions of t . Theorem 2.1 When √ k = 1/2, 2, we have the following solution of the equation (1.1) blowing up on the surface 6y + β(t) = 0: √ x3 α β k−2 u= √ + γ ( 6y + β)3 x 2 + √ − √ + 6 ( 6y + β)2 ( 6y + β)2 2( 6y + β) 2 (k + 1)β + 3β √ α σ (k + 1)α + 3α − ( 6y + β) + +√ + √ 9 18 3( 6y + β)2 6y + β √ 2αγ √ β γ √ (k + 1)γ + 3γ √ +ρ( 6y + β)2 + ( 6y + β)3 − ( 6y + β)4 − ( 6y + β)5 3 3 81
+
2γ 2 √ 6ασ + α 2 β (k + 1)(β )2 2 α3 ( 6y + β)8 x + y + + 2αρy 2 + √ √ 81 9 27( 6y + β)2 18( 6y + β)
√ 6(k + 1)σ + 3αβ + 2(k + 1)αβ + 3α β + 9σ √ ( 6y + β)[ln( 6y + β) − 1] 27 2 β ρ β + (k + 1)β α γ β β − (k + 1)α − α 2 2(k + 1)2 α 2 y − y + − + + 3 27 9 6 108 (k + 1)2 β √ 2γ σ − ρ (k + 1)ρ + 5αβ γ √ + − ( 6y + β)3 + ( 6y + β)4 486 36 54
−
+ θ + ϑy 2 (k + 1)αγ + 3αγ √ (k + 1)β γ + 3β γ √ (β ) γ − ( 6y + β)6 + ( 6y + β)5 + 36 243 486 (k + 1)γ + γ √ γρ (k + 1)2 γ 2αγ 2 √ + + ( 6y + β)8 + ( 6y + β)7 + 63 15309 3402 243 −
β γ 2 √ 2γ ((k + 1)γ + 3γ ) √ γ3 √ ( 6y + β)9 − ( 6y + β)10 + ( 6y + β)13 , (2.30) 243 32805 9477
where α, β, γ, σ, ρ, θ and ϑ are arbitrary functions of t, whose derivatives appearing in the above exist in a certain open set of R. When α = γ = σ = ρ = θ = ϑ = 0, the above solution becomes
u = x³/(√6 y + β)² + [(k − 2)/6 − β′/(2(√6 y + β))]x² − ([(k + 1)β′ + 3β″]/18)(√6 y + β)x + [(k + 1)(β′)²/9]y² + (β′β″/3)y² + ([β‴ + (k + 1)β″]/108 + (k + 1)²β′/486)(√6 y + β)³.  (2.31)
Take the trivial solution ξ = 0 of (2.5), which is the only solution polynomial in y. Then (2.6) and (2.7) become
h_yy = 0, g_yy = 8h² + 4(1 − k)h − 4h_t.  (2.32)
Thus
h = α(t) + β(t)y.  (2.33)
Hence
g_yy = 4(2α² + (1 − k)α − α′) + 4(4αβ + (1 − k)β − β′)y + 8β²y².  (2.34)
So
g = γ + σy + 2(2α² + (1 − k)α − α′)y² + (2/3)(4αβ + (1 − k)β − β′)y³ + (2/3)β²y⁴,  (2.35)
where γ and σ are arbitrary functions of t. Now (2.8) yields
f_yy = 4(α + βy)[γ + σy + 2(2α² + (1 − k)α − α′)y² + (2/3)(4αβ + (1 − k)β − β′)y³ + (2/3)β²y⁴] − 2[γ′ + σ′y + 2(4αα′ + (1 − k)α′ − α″)y² + (2/3)(4α′β + 4αβ′ + (1 − k)β′ − β″)y³ + (4/3)ββ′y⁴] − 2k[γ + σy + 2(2α² + (1 − k)α − α′)y² + (2/3)(4αβ + (1 − k)β − β′)y³ + (2/3)β²y⁴]
= 4αγ − 2γ′ − 2kγ + 2(2ασ + 2βγ − σ′ − kσ)y + 8((1 − k)α − α′)βy³ + 4(4α³ + 2(1 − 2k)α² − 6αα′ + k(k − 1)α + (2k − 1)α′ + α″ + βσ)y² + (4/3)(20α²β + 2(1 − 3k)αβ − 6αβ′ − 4α′β + (2k − 1)β′ + β″ − k(1 − k)β)y³ + (4/3)(10αβ² + (2 − 3k)β² − 4ββ′)y⁴ + (8/3)β³y⁵.  (2.36)
Therefore,
f = τ + ρy + (2αγ − γ′ − kγ)y² + [(2ασ + 2βγ − σ′ − kσ)/3]y³ + (2/5)((1 − k)α − α′)βy⁵ + (1/3)(4α³ + 2(1 − 2k)α² − 6αα′ + k(k − 1)α + (2k − 1)α′ + α″ + βσ)y⁴ + (1/15)(20α²β + 2(1 − 3k)αβ − 6αβ′ − 4α′β + (2k − 1)β′ + β″ − k(1 − k)β)y⁵ + (2/45)(10αβ² + (2 − 3k)β² − 4ββ′)y⁶ + (4/63)β³y⁷.  (2.37)
Theorem 2.2 The following is a solution of the equation (1.1):
u = (α + βy)x² + [γ + σy + 2(2α² + (1 − k)α − α′)y² + (2/3)(4αβ + (1 − k)β − β′)y³ + (2/3)β²y⁴]x + τ + ρy + (2αγ − γ′ − kγ)y² + [(2ασ + 2βγ − σ′ − kσ)/3]y³ + (2/5)((1 − k)α − α′)βy⁵ + (1/3)(4α³ + 2(1 − 2k)α² − 6αα′ + k(k − 1)α + (2k − 1)α′ + α″ + βσ)y⁴ + (1/15)(20α²β + 2(1 − 3k)αβ − 6αβ′ − 4α′β + (2k − 1)β′ + β″ − k(1 − k)β)y⁵ + (2/45)(10αβ² + (2 − 3k)β² − 4ββ′)y⁶ + (4/63)β³y⁷,  (2.38)
where α, β, γ, σ, ρ and τ are arbitrary functions of t, whose derivatives appearing in the above exist in a certain open set of R. Moreover, any solution polynomial in x and y of (1.1) must be of the above form. The above solution is smooth (analytic) if all of α, β, γ, σ, ρ and τ are smooth (analytic) functions of t.
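Theorem 2.2 can be verified mechanically by rebuilding g and f from the reduced equations and substituting into the short wave equation. Equation (1.1) itself is not displayed in this excerpt; the sketch below (assuming SymPy) uses the form 2u_tx − 2(x + u_x)u_xx + u_yy + 2ku_x = 0 reconstructed from the expansion (2.4), and takes the free terms τ = ρ = 0, which drop out of the equation anyway since f enters only through f_yy:

```python
import sympy as sp

t, y, x, k = sp.symbols('t y x k')
alpha, beta, gamma, sigma = [sp.Function(n)(t) for n in ('alpha', 'beta', 'gamma', 'sigma')]

h = alpha + beta*y                                          # (2.33)
g_yy = 8*h**2 + 4*(1 - k)*h - 4*sp.diff(h, t)               # (2.32)
g = gamma + sigma*y + sp.integrate(sp.integrate(g_yy, y), y)  # (2.35)
f_yy = 4*g*h - 2*sp.diff(g, t) - 2*k*g                      # (2.8)
f = sp.integrate(sp.integrate(f_yy, y), y)                  # (2.37) with tau = rho = 0
u = f + g*x + h*x**2                                        # (2.1) with xi = 0

# short wave equation, as reconstructed from the expansion (2.4):
# 2*u_tx - 2*(x + u_x)*u_xx + u_yy + 2*k*u_x = 0
pde = (2*sp.diff(u, t, x) - 2*(x + sp.diff(u, x))*sp.diff(u, x, 2)
       + sp.diff(u, y, 2) + 2*k*sp.diff(u, x))
assert sp.simplify(sp.expand(pde)) == 0
```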
Remark 1 In addition to the nonzero solution (2.9) of (2.5), the other nonzero solutions are of the form
ξ = ℘_ι(√6 y + β(t)),  (2.39)
where ℘_ι(w) is the Weierstrass elliptic function such that
℘_ι′(w)² = 4(℘_ι(w)³ − ι),  (2.40)
ι is a nonzero constant and β is any function of t. When β is not a constant, the solutions of (2.6)–(2.8) are extremely complicated. If β is constant, we can take β = 0 by adjusting ι. Any solution of (2.6)–(2.8) with h ≠ 0 is also very complicated. Thus the only simple solution of (1.1) in this case is
u = ℘_ι(√6 y)x³.  (2.41)
3 2-D Khokhlov-Zabolotskaya Equation
The solution of (1.2) polynomial in x must be of the form
u = f(t, y) + g(t, y)x + ξ(t, y)x².  (3.1)
Then
u_x = g + 2ξx, u_tx = g_t + 2ξ_t x, u_yy = f_yy + g_yy x + ξ_yy x²,  (3.2)
(uu_x)_x = ∂_x(fg + (g² + 2fξ)x + 3gξx² + 2ξ²x³) = g² + 2fξ + 6gξx + 6ξ²x².  (3.3)
Substituting them into (1.2), we get
2(g_t + 2ξ_t x) + g² + 2fξ + 6gξx + 6ξ²x² − f_yy − g_yy x − ξ_yy x² = 0,  (3.4)
equivalently,
ξ_yy = 6ξ²,  (3.5)
g_yy − 6gξ = 4ξ_t,  (3.6)
f_yy − 2fξ = 2g_t + g².  (3.7)
First we observe that
ξ = 1/(y + β(t))²  (3.8)
is a solution of (3.5) for any differentiable function β of t. Substituting (3.8) into (3.6), we obtain
g_yy − 6g/(y + β(t))² = −8β′(t)/(y + β(t))³.  (3.9)
Write
g(t, y) = Σ_{i∈Z} a_i(t)(y + β(t))^i.  (3.10)
Then (3.9) becomes
Σ_{i∈Z} [(i + 2)(i + 1) − 6]a_{i+2}(t)(y + β(t))^i = −8β′(t)/(y + β(t))³.  (3.11)
Thus
a_{−1} = 2β′, (i + 4)(i − 1)a_{i+2} = 0  for i ≠ −3.  (3.12)
Hence
g = α(t)/(y + β(t))² + 2β′(t)/(y + β(t)) + γ(t)(y + β(t))³,  (3.13)
where α and γ are arbitrary differentiable functions of t. Note
g_t = −2αβ′/(y + β)³ + (α′ − 2(β′)²)/(y + β)² + 2β″/(y + β) + 3γβ′(y + β)² + γ′(y + β)³  (3.14)
and
g² = α²/(y + β)⁴ + 4αβ′/(y + β)³ + 4(β′)²/(y + β)² + 2αγ(y + β) + 4γβ′(y + β)² + γ²(y + β)⁶.  (3.15)
Substituting the above two equations into (3.7), we have:
f_yy − 2f/(y + β)² = α²/(y + β)⁴ + 2α′/(y + β)² + 4β″/(y + β) + 2αγ(y + β) + 10γβ′(y + β)² + 2γ′(y + β)³ + γ²(y + β)⁶.  (3.16)
Write
f(t, y) = Σ_{i∈Z} b_i(t)(y + β)^i.  (3.17)
Then (3.16) becomes
Σ_{i∈Z} [(i + 2)(i + 1) − 2]b_{i+2}(y + β)^i = α²/(y + β)⁴ + 2α′/(y + β)² + 4β″/(y + β) + 2αγ(y + β) + 10β′γ(y + β)² + 2γ′(y + β)³ + γ²(y + β)⁶.  (3.18)
Thus
b_{−2} = α²/4, b_0 = −α′, b_1 = −2β″, b_3 = αγ/2,  (3.19)
b_4 = β′γ, b_5 = γ′/9, b_8 = γ²/54,  (3.20)
(i + 3)i b_{i+2} = 0  for i ≠ −4, −2, −1, 1, 2, 3, 6.  (3.21)
Therefore,
f = α²/(4(y + β)²) + σ/(y + β) − α′ − 2β″(y + β) + ρ(y + β)² + (αγ/2)(y + β)³ + β′γ(y + β)⁴ + (γ′/9)(y + β)⁵ + (γ²/54)(y + β)⁸,  (3.22)
where σ and ρ are arbitrary functions of t.
Theorem 3.1 We have the following solution of (1.2) blowing up on the line y + β(t) = 0:
u = x²/(y + β)² + [α/(y + β)² + 2β′/(y + β) + γ(y + β)³]x + α²/(4(y + β)²) + σ/(y + β) − α′ − 2β″(y + β) + ρ(y + β)² + (αγ/2)(y + β)³ + β′γ(y + β)⁴ + (γ′/9)(y + β)⁵ + (γ²/54)(y + β)⁸,  (3.23)
where α, β, γ, σ and ρ are arbitrary functions of t, whose derivatives appearing in the above exist in a certain open set of R. When α = γ = σ = ρ = 0, the above solution becomes
u = x²/(y + β)² + 2β′x/(y + β) − 2β″(y + β).  (3.24)
Take the trivial solution ξ = 0 of (3.5), which is the only solution polynomial in y. Then (3.6) and (3.7) become
g_yy = 0, f_yy = 2g_t + g².  (3.25)
Thus
g = α(t) + β(t)y.  (3.26)
Hence
f_yy = α² + 2α′ + 2(β′ + αβ)y + β²y².  (3.27)
So
f = γ + σy + [(α² + 2α′)/2]y² + [(β′ + αβ)/3]y³ + (β²/12)y⁴,  (3.28)
where γ and σ are arbitrary functions of t.
Theorem 3.2 The following is a solution of (1.2):
u = (α + βy)x + γ + σy + [(α² + 2α′)/2]y² + [(β′ + αβ)/3]y³ + (β²/12)y⁴,  (3.29)
where α, β, γ and σ are arbitrary functions of t, whose derivatives appearing in the above exist in a certain open set of R. Moreover, any solution polynomial in x and y of (1.2) must
be of the above form. The above solution is smooth (analytic) if all of α, β, γ and σ are smooth (analytic) functions of t.
Remark 2 In addition to the solutions in Theorems 3.1 and 3.2, (1.2) has the following simple solution:
u = ℘_ι(y)x²,  (3.30)
where ℘_ι(w) is the Weierstrass elliptic function satisfying (2.40).
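Theorem 3.2 admits a quick symbolic verification. Since (1.2) itself is not displayed in this excerpt, the sketch below (assuming SymPy) uses the form 2u_tx + (uu_x)_x − u_yy = 0 reconstructed from the expansion (3.4):

```python
import sympy as sp

t, y, x = sp.symbols('t y x')
alpha, beta, gamma, sigma = [sp.Function(n)(t) for n in ('alpha', 'beta', 'gamma', 'sigma')]
# the polynomial solution (3.29)
u = ((alpha + beta*y)*x + gamma + sigma*y
     + (alpha**2 + 2*sp.diff(alpha, t))/2 * y**2
     + (sp.diff(beta, t) + alpha*beta)/3 * y**3
     + beta**2/12 * y**4)
# 2-D Khokhlov-Zabolotskaya equation, reconstructed from (3.4):
# 2*u_tx + (u*u_x)_x - u_yy = 0
pde = 2*sp.diff(u, t, x) + sp.diff(u*sp.diff(u, x), x) - sp.diff(u, y, 2)
assert sp.simplify(sp.expand(pde)) == 0
```

All terms cancel identically for arbitrary α, β, γ, σ, in line with the theorem.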
4 3-D Khokhlov-Zabolotskaya Equation
By comparing the terms of highest degree, we find that a solution polynomial in x of (1.3) must be of the form:
u = f(t, y, z) + g(t, y, z)x + ξ(t, y, z)x²,  (4.1)
where f(t, y, z), g(t, y, z) and ξ(t, y, z) are suitably differentiable functions to be determined. As in (3.2)–(3.7), (1.3) is equivalent to:
ξ_yy + ξ_zz = 6ξ²,  (4.2)
g_yy + g_zz − 6gξ = 4ξ_t,  (4.3)
f_yy + f_zz − 2fξ = 2g_t + g².  (4.4)
First we observe that
ξ = 1/(y cos α(t) + z sin α(t) + β(t))²  (4.5)
is a solution of (4.2), where α and β are suitably differentiable functions of t. With the above ξ, (4.3) becomes
g_yy + g_zz − 6g/(y cos α + z sin α + β)² = −8(α′(−y sin α + z cos α) + β′)/(y cos α + z sin α + β)³.  (4.6)
In order to solve (4.6), we change variables:
ζ = y cos α + z sin α + β, η = −y sin α + z cos α.  (4.7)
Then
∂_y = cos α ∂_ζ − sin α ∂_η, ∂_z = sin α ∂_ζ + cos α ∂_η.  (4.8)
Thus
∂_y² + ∂_z² = (cos α ∂_ζ − sin α ∂_η)² + (sin α ∂_ζ + cos α ∂_η)² = ∂_ζ² + ∂_η².  (4.9)
Note
∂_t(ζ) = α′η + β′, ∂_t(η) = α′(β − ζ).  (4.10)
Equation (4.6) can be rewritten as:
g_ζζ + g_ηη − 6ζ⁻²g = −8(α′η + β′)ζ⁻³.  (4.11)
In order to solve the above equation, we assume
g = Σ_{i∈Z} a_i(t, η)ζ^i.  (4.12)
Now (4.11) becomes
Σ_{i∈Z} [((i + 2)(i + 1) − 6)a_{i+2} + a_{iηη}]ζ^i = −8(α′η + β′)ζ⁻³,  (4.13)
which is equivalent to
−4a_{−1} + a_{−3,ηη} = −8(α′η + β′), (i + 4)(i − 1)a_{i+2} + a_{iηη} = 0  for −3 ≠ i ∈ Z.  (4.14)
Hence
a_{−1} = (1/4)a_{−3,ηη} + 2(α′η + β′), (i + 4)(i − 1)a_{i+2} = −a_{iηη}  for −3 ≠ i ∈ Z.  (4.15)
When i = −4 and i = 1, we get a_{−4,ηη} = a_{1,ηη} = 0. Moreover, a_{−2} and a_3 can be any functions. Take
a_3 = σ, a_{−2} = ρ, a_{−1} = 2(α′η + β′),  (4.16)
a_1 = a_{−1−2i} = a_{−2−2i} = 0  for 0 < i ∈ Z  (4.17)
in order to avoid an infinite number of negative powers of ζ in (4.12), where σ and ρ are arbitrary functions of t and η differentiable in a certain domain. By (4.15),
a_{3+2k} = (−1)^k ∂_η^{2k}(σ)/∏_{i=1}^k (2i + 5)(2i) = (−1)^k 15∂_η^{2k}(σ)/[(2k + 5)(2k + 3)(2k + 1)!],  (4.18)
a_{−2+2k} = (−1)^k ∂_η^{2k}(ρ)/∏_{i=1}^k (2i)(2i − 5) = (−1)^k (2k − 1)(2k − 3)∂_η^{2k}(ρ)/(3(2k)!).  (4.19)
Therefore,
g = 2(α′η + β′)ζ⁻¹ + Σ_{k=0}^∞ (−1)^k [15∂_η^{2k}(σ)ζ³/((2k + 5)(2k + 3)(2k + 1)!) + (2k − 1)(2k − 3)∂_η^{2k}(ρ)ζ⁻²/(3(2k)!)]ζ^{2k}  (4.20)
is a solution of (4.11). By (4.9), (4.4) is equivalent to
f_ζζ + f_ηη − 2ζ⁻²f = 2g_t + g².  (4.21)
Note
g_t = 2(α″η + β″ + (α′)²β)ζ⁻¹ − 2(α′)² − 2(α′η + β′)²ζ⁻² + Σ_{k=0}^∞ (−1)^k ζ^{2k} { 15∂_η^{2k}(σ_t + α′(β − ζ)σ_η)ζ³/[(2k + 5)(2k + 3)(2k + 1)!] + (2k − 1)(2k − 3)∂_η^{2k}(ρ_t + α′(β − ζ)ρ_η)ζ⁻²/(3(2k)!) + (α′η + β′)[15∂_η^{2k}(σ)ζ²/((2k + 5)(2k + 1)!) + (2k − 1)(2k − 2)(2k − 3)∂_η^{2k}(ρ)ζ⁻³/(3(2k)!)] }.  (4.22)
For convenience of solving (4.21), we denote
2g_t + g² = Σ_{i=−4}^∞ b_i(t, η)ζ^i  (4.23)
by (4.20) and (4.22). In particular,
b_{−4} = ρ², b_{−3} = 0,  (4.24)
b_{−2} = 2(ρ_t + α′βρ_η) + ρ_ηηρ/3,  (4.25)
b_{−1} = 4[α″η + β″ + (α′)²β] − 2α′ρ_η + (2/3)(α′η + β′)ρ_ηη,  (4.26)
b_0 = −4(α′)² + (1/3)(ρ_tηη + α′βρ_ηηη) + (1/12)∂_η⁴(ρ)ρ + (1/36)ρ_ηη².  (4.27)
Suppose that
f = Σ_{i∈Z} c_i(t, η)ζ^i  (4.28)
is a solution of (4.21). Then
Σ_{i∈Z} [((i + 2)(i + 1) − 2)c_{i+2} + c_{iηη}]ζ^i = Σ_{r=−4}^∞ b_r ζ^r,  (4.29)
equivalently,
(i + 3)i c_{i+2} = b_i − c_{iηη}, (r + 3)r c_{r+2} = −c_{rηη},  r < −4 ≤ i.  (4.30)
By the second equation above, we take
c_r = 0  for r < −4  (4.31)
to avoid an infinite number of negative powers of ζ in (4.28). Letting i = −3, 0, we get
b_{−3} = c_{−3,ηη}, b_0 = c_{0,ηη}.  (4.32)
The first equation is naturally satisfied because c_{−3} = −c_{−5,ηη}/10 = 0. Taking i = −2, −4 and r = −6 in (4.30), we obtain
c_0 = (1/2)c_{−2,ηη} − (1/2)b_{−2}, c_{−2} = (1/4)b_{−4}.  (4.33)
So
c_0 = (1/8)∂_η²(b_{−4}) − (1/2)b_{−2}.  (4.34)
Thus we get a constraint:
b_0 = (1/8)∂_η⁴(b_{−4}) − (1/2)∂_η²(b_{−2}),  (4.35)
equivalently,
−4(α′)² + (1/3)(ρ_tηη + α′βρ_ηηη) + (1/12)∂_η⁴(ρ)ρ + (1/36)ρ_ηη² = (1/8)∂_η⁴(ρ²) − ρ_tηη − α′βρ_ηηη − (1/6)∂_η²(ρ_ηηρ).  (4.36)
Thus
96(ρ_tηη + α′βρ_ηηη) + 6∂_η⁴(ρ)ρ + 2ρ_ηη² − 9∂_η⁴(ρ²) + 12∂_η²(ρ_ηηρ) = 288(α′)².  (4.37)
It can be proved by considering the terms of highest degree that any solution of (4.37) polynomial in η must be of the form
ρ = γ_0(t) + γ_1(t)η + γ_2(t)η².  (4.38)
Then (4.37) becomes
6γ_2′ − 5γ_2² = 9(α′)².  (4.39)
So
α′ = (ε/3)√(6γ_2′ − 5γ_2²)  ⟹  α = (ε/3)∫√(6γ_2′ − 5γ_2²) dt,  (4.40)
where ε = ±1. Replacing β by −β if necessary, we can take ε = 1. Under the assumption (4.38),
g = ρζ⁻² + 2(α′η + β′)ζ⁻¹ + γ_2/3 + Σ_{k=0}^∞ (−1)^k 15∂_η^{2k}(σ)ζ^{3+2k}/[(2k + 5)(2k + 3)(2k + 1)!]  (4.41)
and
b_{−2} = 2(ρ_t + α′βρ_η) + (2/3)γ_2ρ,  (4.42)
b_{−1} = 4[α″η + β″ + (α′)²β] − 2α′ρ_η + (4/3)(α′η + β′)γ_2,  (4.43)
b_0 = −4(α′)² + (2/3)γ_2′ + γ_2²/9.  (4.44)
Denote β,ρ,σ (t, η, ζ ) =
∞
bi ζ i .
(4.45)
i=1
For any real function F(t, η) analytic at η = η_0, we define
F(t, η_0 + √−1 ζ) = Σ_{r=0}^∞ [∂_η^r(F)(t, η_0)/r!] (√−1 ζ)^r.  (4.46)
Note ∞ (−1)k
15∂η2k (σ )ζ 3+2k
(2k + 5)(2k + 3)(2k + 1)!
ζ ∞ ∂η2k (σ )τ12k 2 k (−1) = 15ζ dτ1 (2k + 5)(2k + 3)(2k)! 0 k=0
ζ τ2 ∞ ∂η2k (σ )τ12k k dτ1 dτ2 = 15 τ2 (−1) (2k + 5)(2k)! 0 0 k=0
ζ τ3 τ2 ∞ ∂ 2k (σ )τ12k −2 k η dτ1 dτ2 dτ3 = 15ζ τ3 τ2 (−1) (2k)! 0 0 0 k=0
k=0
15 = ζ −2 2 ∞ ∂t (−1)k
0
√ √ −1τ1 ) + σ (t, η − −1τ1 )]dτ1 dτ2 dτ3 , (4.47)
[σ (t, η +
0
(2k + 5)(2k + 3)(2k + 1)! 15∂η2k (σt )ζ 3+2k (2k + 5)(2k + 3)(2k + 1)!
∞ (−1)k k=0
15 −2 ζ 2
τ2
τ2
0
k=0
=
τ3
τ3
∞ (−1)k
− α
ζ
15∂η2k (σ )ζ 3+2k
k=0
=
(2k + 5)(2k + 3)(2k + 1)!
0
τ3
τ3
τ2
τ2 0
[σt (t, η +
0
15α (ζ − β) √ + −1ζ −2 2ζ 2
ζ
τ2 0
√
τ2
∞ (−1)k k=0
15∂η2k+1 (σ )ζ 4+2k
ζ
+ α β
15∂η2k+1 (σ )ζ 3+2k (2k + 5)(2k + 3)(2k + 1)!
+ (α η + β )
−1τ1 ) + σt (t, η −
τ1 [σ (t, η +
√
∞ 15∂η2k (σ )ζ 2+2k (−1)k (2k + 5)(2k + 1)! k=0
√
−1τ1 )]dτ1 dτ2 dτ3
−1τ1 ) − σ (t, η −
√
−1τ1 )]dτ1 dτ2
0
15 (α η + β )ζ −3 2 ζ τ2 √ √ × τ23 [σ (t, η + −1τ1 ) + σ (t, η − −1τ1 )]dτ1 dτ2 . +
0
0
(4.48)
Hence
g = ρζ⁻² + 2(α′η + β′)ζ⁻¹ + γ_2/3 + (15/2)ζ⁻² ∫_0^ζ ∫_0^{τ_3} ∫_0^{τ_2} τ_3τ_2 [σ(t, η + √−1 τ_1) + σ(t, η − √−1 τ_1)] dτ_1dτ_2dτ_3  (4.49)
by (4.41) and (4.47). According to (4.23) and (4.45), we have β,ρ,σ (t, η, ζ ) ζ τ3 τ2 2 √ √ 225 −4 ζ τ3 τ2 [σ (t, η + −1τ1 ) + σ (t, η − −1τ1 )]dτ1 dτ2 dτ3 = 4 0 0 0 ζ τ3 τ2 √ √ + 15ζ −2 τ3 τ2 [σt (t, η + −1τ1 ) + σt (t, η − −1τ1 )]dτ1 dτ2 dτ3 0
0
15α (ζ − β) √ + −1 ζ2 −3 + 15(α η + β )ζ + 15
τ3
0
0
τ23 τ2
τ2 0
τ2
τ1 [σ (t, η +
√
−1τ1 ) − σ (t, η −
√ −1τ1 )]dτ1 dτ2
0
ζ
τ3
ζ
τ2
0
ζ
0
τ2
[σ (t, η +
√
−1τ1 ) + σ (t, η −
√ −1τ1 )]dτ1 dτ2
0
[σ (t, η +
0
× ζ −2 ρζ −2 + (α η + β )ζ −1 +
√
−1τ1 ) + σ (t, η −
√
−1τ1 )]dτ1 dτ2 dτ3
γ2 . 6
(4.50)
Now
c_{−2} = ρ²/4  (4.51)
by (4.24) and (4.33). According to (4.30) with i = −3, 0, c_{−1} and c_2 can be arbitrary. For convenience, we redenote
c_{−1} = κ(t, η), c_2 = ω(t, η).  (4.52)
Moreover, (4.24), (4.34) and (4.42) imply
c_0 = ρ_η²/4 − ρ_t − α′βρ_η + γ_2ρ/6.  (4.53)
Furthermore, (4.30) and (4.43) yield
c_1 = κ_ηη/2 − 2(α″η + β″ + (α′)²β) + α′ρ_η − (2/3)(α′η + β′)γ_2.  (4.54)
In addition, (4.30) and (4.52) give
c_{2k+3} = (−1)^{k+1}∂_η^{2k+4}(κ)/[2(k + 2)(2k + 2)!] + Σ_{i=0}^k [(−1)^{k−i}(i + 1)(2i)!/((k + 2)(2k + 2)!)] ∂_η^{2(k−i)}(b_{2i+1}),  (4.55)
c_{2k+4} = (−1)^{k+1}3∂_η^{2k+2}(ω)/[(2k + 5)(2k + 3)!] + Σ_{i=0}^k [(−1)^{k−i}(2i + 3)(2i + 1)!/((2k + 5)(2k + 3)!)] ∂_η^{2(k−i)}(b_{2i+2})  (4.56)
for 0 ≤ k ∈ Z. Set β,ρ,σ,κ,ω (t, η, ζ ) ∞ κηη ζ + ωζ 2 + ci ζ i 2 i=3 ∞ ∞ ∂ 2k (κ)ζ 2k 3∂η2k (ω)ζ 2k −1 k η = −ζ ∂ζ ζ (−1) + ζ2 (−1)k (2k)! (2k + 3)(2k + 1)! k=0 k=0
= κζ −1 +
+
∞ k (−1)k−i (i + 1)(2i)! k=0 i=0
+
(k + 2)(2k + 2)!
∂η2(k−i) (b2i+1 )ζ 2k+3
∞ k (−1)k−i (2i + 3)(2i + 1)!
(2k + 5)(2k + 3)!
k=0 i=0
∂η2(k−i) (b2i+2 )ζ 2k+4 .
(4.57)
Note ζ2
∞ (−1)k k=0
3 = ζ −1 2
3∂η2k (ω)ζ 2k (2k + 3)(2k + 1)!
ζ
τ2
τ2 0
[ω(t, η +
√
−1τ1 ) + ω(t, η −
√ −1τ1 )]dτ1 dτ2 .
(4.58)
for 0 < i ∈ Z.
(4.59)
0
Moreover, β,ρ,σ (t, η, 0) = 0,
bi =
∂ζi ( β,ρ,σ )(t, η, 0) i!
Thus β,ρ,σ,κ,ω (t, η, ζ ) ζ τ2 √ √ 3 τ2 [ω(t, η + −1τ1 ) + ω(t, η − −1τ1 )]dτ1 dτ2 = ζ −1 2 0 0 √ √ 1 − ζ ∂ζ ζ −1 [κ(t, η + −1τ1 ) + κ(t, η − −1τ1 )] 2 +
k ∞ (−1)k−i (i + 1)∂η2(k−i) ∂ζ2i+1 ( β,ρ,σ )(t, η, 0) k=0 i=0
+
(2i + 1)(k + 2)(2k + 2)!
ζ 2k+3
∞ k (−1)k−i (2i + 3)∂η2(k−i) ∂ζ2i+2 ( β,ρ,σ )(t, η, 0) k=0 i=0
(2i + 2)(2k + 5)(2k + 3)!
ζ 2k+4 ,
(4.60)
in which the summations are finite if σ (t, η) is polynomial in η. According to (4.51)–(4.57) and (4.60), ρ 2 −2 ρη2 γ2 ρ ζ + − ρt − α βρη + 4 4 6 2 − 2(α η + β + (α )2 β) − α ρη + (α η + β )γ2 ζ. 3
f = β,ρ,σ,κ,ω (t, η, ζ ) +
(4.61)
Theorem 4.1 In terms of the notions in (4.7), we have the following solution of (1.3) blowing up on the hypersurface cos α(t)y + sin α(t)z + β(t) = 0 (ζ = 0): u = x 2 ζ −2 + [ρζ −2 + 2(α η + β )ζ −1 +
ζ
× 0
τ3
τ3
τ2
τ2 0
[σ (t, η +
√
γ2 15 −2 + ζ 6 2
−1τ1 ) + σ (t, η −
√
−1τ1 )]dτ1 dτ2 dτ3 ]x
0
ρ 2 −2 ρη2 γ2 ρ ζ + − ρt − α βρη + 4 4 6 2 − [2(α η + β + (α )2 β) − α ρη + (α η + β )γ2 ]ζ, 3
+ β,ρ,σ,κ,ω (t, η, ζ ) +
(4.62)
where the involved parametric functions ρ is given in (4.38), α is given in (4.40) and β is any function of t . Moreover, σ, κ, ω are real functions in real variable t and η, and β,ρ,σ,κ,ω (t, η, ζ ) is given in (4.60) via (4.50). When σ = κ = ω = 0, the above solution becomes: ρη2 γ2 ρ2 2 −2 −2 −1 u = x ζ + ρζ + 2(α η + β )ζ + − ρt x + ζ −2 + 6 4 4 γ2 ρ 2 − α βρη + − 2(α η + β + (α )2 β) − α ρη + (α η + β )γ2 ζ, (4.63) 6 3 Next we consider ξ = 0, which is the only solution polynomial in y and z of (4.2). In this case, (4.3) and (4.4) becomes: gyy + gzz = 0,
fyy + fzz = 2gt + g 2 .
(4.64)
The first equation above is the classical two-dimensional Laplace equation, whose solutions are called harmonic functions. In order to find simpler expressions for the solutions of the above equations, we introduce a new notion. A complex function G(μ) is called bar-homomorphic if
G(μ̄) = conj(G(μ)).  (4.65)
For instance, trigonometric functions, polynomials with real coefficients and elliptic functions with bar-invariant periods are bar-homomorphic functions. The extended function F(t, μ) in (4.46) is bar-homomorphic in μ. As with (4.20), it can be proved by power series that the general solution of the first equation in (4.64) is:
g = (σ + √−1 ρ)(t, y + √−1 z) + (σ − √−1 ρ)(t, y − √−1 z),  (4.66)
where σ(t, μ) and ρ(t, μ) are complex functions in the real variable t and bar-homomorphic in the complex variable μ. Set
w = y + √−1 z, w̄ = y − √−1 z.  (4.67)
Then the Laplace operator
∂_y² + ∂_z² = 4∂_w∂_w̄.  (4.68)
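The identity (4.68) is what makes sums of the form F(w) + conj(F(w)) harmonic, which is the pattern of (4.66). A small SymPy sketch with a sample polynomial (a stand-in for the bar-homomorphic functions above):

```python
import sympy as sp

y, z = sp.symbols('y z', real=True)
w = y + sp.I*z
G = w**3 + 2*w                         # sample function with real coefficients
g = sp.expand(G + sp.conjugate(G))     # F(w) + conj(F)(wbar); real by construction
# harmonic: it is annihilated by the Laplacian d_y^2 + d_z^2 = 4*d_w*d_wbar
assert sp.simplify(sp.diff(g, y, 2) + sp.diff(g, z, 2)) == 0
```

Here g expands to a real polynomial in y and z, and the two second derivatives cancel exactly.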
The second equation in (4.64) is equivalent to:
∂_w∂_w̄(f) = (1/2)g_t + (1/4)g² = (1/2)[(σ_t + √−1 ρ_t)(t, w) + (σ_t − √−1 ρ_t)(t, w̄)] + (1/4)[(σ + √−1 ρ)(t, w) + (σ − √−1 ρ)(t, w̄)]².  (4.69)
Hence the general solution of the second equation in (4.64) is:
f = ∫_{w̄_1}^{w̄} ∫_{w_1}^{w} { (1/2)[(σ_t + √−1 ρ_t)(t, μ_1) + (σ_t − √−1 ρ_t)(t, μ̄_1)] + (1/4)[(σ + √−1 ρ)(t, μ_1) + (σ − √−1 ρ)(t, μ̄_1)]² } dμ_1 dμ̄_1 + (κ + √−1 ω)(t, w) + (κ − √−1 ω)(t, w̄),  (4.70)
where κ(t, μ) and ω(t, μ) are complex functions in the real variable t and bar-homomorphic in the complex variable μ, and w_1 is a complex constant.
Theorem 4.2 In terms of the notions in (4.66), the following is a solution polynomial in x of (1.3):
u = [(σ + √−1 ρ)(t, w) + (σ − √−1 ρ)(t, w̄)]x + ∫_{w̄_1}^{w̄} ∫_{w_1}^{w} { (1/2)[(σ_t + √−1 ρ_t)(t, μ_1) + (σ_t − √−1 ρ_t)(t, μ̄_1)] + (1/4)[(σ + √−1 ρ)(t, μ_1) + (σ − √−1 ρ)(t, μ̄_1)]² } dμ_1 dμ̄_1 + (κ + √−1 ω)(t, w) + (κ − √−1 ω)(t, w̄),  (4.71)
where σ(t, μ), ρ(t, μ), κ(t, μ) and ω(t, μ) are complex functions in the real variable t and bar-homomorphic in the complex variable μ (cf. (4.65)). Moreover, the above solution is smooth (analytic) if all of σ, ρ, κ and ω are smooth (analytic) functions. In particular, any solution of (1.3) polynomial in x, y, z must be of the form (4.71) in which σ, ρ, κ and ω are polynomial in μ.
Remark 3 In addition to the solutions in Theorems 4.1 and 4.2, (1.3) has the following simple solution:
u = ℘_ι(ay + bz)x²,  (4.72)
where ℘_ι(w) is the Weierstrass elliptic function satisfying (2.40) and a, b are real constants such that a² + b² = 1.
References
1. Akiyama, M., Kamakura, T.: Elliptically curved acoustic lens emitting strongly focused finite-amplitude beams: application of the spherical beam equation model to the theoretical prediction. Acoust. Sci. Technol. 26, 179–284 (2005)
2. Bagdoev, A.G., Petrosyan, L.G.: Justification of the applicability of short wave equations in obtaining an equation for modulation of a gas-fluid mixture. Izv. Akad. Nauk Armyan. SSR Ser. Mek. 38(4), 58–66 (1985)
3. Ermakov, S.: Short wave/long wave interaction and amplification of decimeter-scale wind waves in film slicks. Geophys. Res. Abs. 8, 00469 (2006)
4. Gibbons, J.: The Khokhlov-Zabolotskaya equation and the inverse scattering problem of classical mechanics. In: Dynamical Problems in Soliton Systems, Kyoto, 1984, pp. 36–41. Springer, Berlin (1985)
5. Ibragimov, N.H.: Lie Group Analysis of Differential Equations. CRC Handbook, vol. 1. CRC Press, Boca Raton (1995)
6. Khamitova, R.S.: Group structure and a basis of conservation laws. Teor. Mat. Fiz. 52(2), 244 (1982)
7. Khokhlov, R.V., Zabolotskaya, E.A.: Quasi-plane waves in nonlinear acoustics of bounded bundles. Akust. Zh. 15(1), 40 (1969)
8. Khristianovich, S.A., Ryzhov, O.S.: On nonlinear reflection of weak shock waves. Prikl. Mat. Mekh. 22(5), 586 (1958)
9. Kudryavtsev, A., Sapozhnikov, V.: Symmetries of the generalized Khokhlov-Zabolotskaya equation. Acoust. Phys. 4, 541–546 (1998)
10. Koshvaga, S., Makavarests, N., Grimalsky, V., Kotsarenko, A., Enriquez, R.: Spectrum of the seismic-electromagnetic and acoustic wave caused by seismic and volcano activity. Nat. Hazards Earth Syst. Sci. 5, 203–209 (2005)
11. Kostin, I., Panasenko, G.: Khokhlov-Zabolotskaya-Kuznetsov-type equation: nonlinear acoustics in heterogeneous media. SIAM J. Math. Anal. 40, 699–715 (2008)
12. Korsunskii, S.V.: Self-similar solutions of two-dimensional equations of Khokhlov-Zabolotskaya type. Mat. Fiz. Nelinein. Mekh. 16, 81–87 (1991)
13. Kraenkel, R., Manna, M., Merle, V.: Nonlinear short-wave propagation in ferrites. Phys. Rev. E 61, 976–979 (2000)
14. Kucharczyk, P.: Group properties of the "short waves" equations in gas dynamics. Bull. Acad. Pol. Sci. Ser. Sci. Technol. XIII(5), 469 (1965)
15. Kupershmidt, B.A.: Geometric-Hamiltonian forms for the Kadomtsev-Petviashvili and Khokhlov-Zabolotskaya equations. In: Geometry in Partial Differential Equations, pp. 155–172. World Scientific, River Edge (1994)
16. Lin, J., Zhang, J.: Similarity reductions for the Khokhlov-Zabolotskaya equation. Commun. Theor. Phys. 24(1), 69–74 (1995)
17. Lychagin, V.V., Krasil'shchik, I.S., Vinogradov, A.M.: Introduction to Geometry of Nonlinear Differential Equations. Nauka, Moscow (1986)
18. Martinez-Moras, F., Ramos, E.: Higher dimensional classical W-algebras. Commun. Math. Phys. 157, 573–589 (1993)
19. Morozov, O.: Cartan's structure theory of symmetry pseudo-groups for the Khokhlov-Zabolotskaya equation. Acta Appl. Math. 101, 231–241 (2008)
20. Roy, C., Nasker, M.: Towards the conservation laws and Lie symmetries for the Khokhlov-Zabolotskaya equation in three dimensions. J. Phys. A 19(10), 1775–1781 (1986)
21. Rozanova, A.: The Khokhlov-Zabolotskaya-Kuznetsov equation. C. R. Math. Acad. Sci. Paris 344, 337–342 (2007)
22. Rozanova, A.: Qualitative analysis of the Khokhlov-Zabolotskaya equation. Math. Models Methods Appl. Sci. 18, 781–812 (2008)
23. Roy, S., Roy, C., De, M.: Loop algebra of Lie symmetries for a short-wave equation. Int. J. Theor. Phys. 27(1), 47–55 (1988)
24. Sanchez, D.: Long waves in ferromagnetic media, Khokhlov-Zabolotskaya equation. J. Differ. Equ. 210, 263–289 (2005)
25. Schwarz, F.: Symmetries of the Khokhlov-Zabolotskaya equation. J. Phys. A 20(6), 1613 (1987)
26. Vinogradov, A.M., Vorob'ev, E.M.: Application of symmetries for finding of exact solutions of the Khokhlov-Zabolotskaya equation. Akust. Zh. 22(1), 22 (1976)
Acta Appl Math (2009) 106: 455–472 DOI 10.1007/s10440-008-9307-2
Gaussian DCT Coefficient Models Saralees Nadarajah
Received: 5 May 2008 / Accepted: 21 August 2008 / Published online: 11 September 2008 © Springer Science+Business Media B.V. 2008
Abstract It has been known that the distribution of the discrete cosine transform (DCT) coefficients of most natural images follows a Laplace distribution. However, recent work has shown that the Laplace distribution may not be a good fit for certain types of images and that the Gaussian distribution is a more realistic model in such cases. Assuming this alternative model, we derive a comprehensive collection of formulas for the distribution of the actual DCT coefficient. The corresponding estimation procedures are derived by the method of moments and the method of maximum likelihood. Finally, the superior performance of the derived distributions over the Gaussian model is illustrated. It is expected that this work could serve as a useful reference and lead to improved modeling with respect to image analysis and image coding. Keywords Discrete cosine transform (DCT) · Gaussian distribution · Generalized hypergeometric function · Image analysis · Image coding · Incomplete gamma function · Kummer function · Modified Bessel function
1 Introduction A discrete cosine transform (DCT) expresses a sequence of finitely many data points in terms of a sum of cosine functions oscillating at different frequencies. The DCT was first introduced by Ahmed et al. [1]. Later Wang and Hunt [2] introduced a complete set of variants of the DCT. The DCT is included in many mathematical packages, such as Matlab, Mathematica and GNU Octave. DCTs are important to numerous applications in science and engineering, from lossy compression of audio and images (where small high-frequency components can be discarded), to spectral methods for the numerical solution of partial differential equations to Chebyshev approximation of arbitrary functions by series of Chebyshev polynomials. The use of cosine rather than sine functions is critical in these applications: for compression, it S. Nadarajah () University of Manchester, Manchester M13 9PL, UK e-mail:
[email protected]
turns out that cosine functions are much more efficient, whereas for differential equations the cosines express a particular choice of boundary conditions. We refer the readers to Jain [3] and Rao and Yip [4] for comprehensive accounts of the theory and applications of the DCT. For a tutorial account see Duhamel and Vetterli [5]. The DCT is widely used in image coding and processing systems, especially for lossy data compression, because it has a strong “energy compaction” property (Rao and Yip [4]). DCT coding relies on the premise that pixels in an image exhibit a certain level of correlation with their neighboring pixels. Similarly in a video transmission system, adjacent pixels in consecutive frames show very high correlation. (Frames usually consist of a representation of the original data to be transmitted, together with other bits which may be used for error detection and control. In simplistic terms, frames can be referred to as consecutive images in a video transmission.) Consequently, these correlations can be exploited to predict the value of a pixel from its respective neighbors. The DCT is, therefore, defined to map this spatial correlated data into transformed uncorrelated coefficients. Clearly, the DCT utilizes the fact that the information content of an individual pixel is relatively small, i.e. to a large extent visual contribution of a pixel can be predicted using its neighbors. A typical image/video transmission system is outlined in Fig. 1. The objective of the source encoder is to exploit the redundancies in image data to provide compression. In other words, the source encoder reduces the entropy, which in our case means decrease in the average number of bits required to represent the image. On the contrary, the channel encoder adds redundancy to the output of the source encoder in order to enhance the reliability of the transmission. 
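The energy-compaction property mentioned above is easy to demonstrate. The sketch below (assuming NumPy is available; the 8 × 8 gradient block is an illustrative stand-in for correlated pixel data) implements a naive orthonormal 2-D DCT-II and shows the DC coefficient dominating:

```python
import numpy as np

def dct2(block):
    """Orthonormal 2-D DCT-II of a square block (the transform used in image coding)."""
    n = block.shape[0]
    k = np.arange(n)
    # 1-D DCT-II basis matrix: C[f, s] = sqrt(2/n) * cos(pi*(2s+1)*f/(2n)), row 0 rescaled
    C = np.sqrt(2.0/n) * np.cos(np.pi * (2*k[None, :] + 1) * k[:, None] / (2*n))
    C[0] /= np.sqrt(2.0)
    return C @ block @ C.T

block = np.add.outer(np.arange(8.0), np.arange(8.0))  # smooth, highly correlated 8x8 block
coeffs = dct2(block)
# energy compaction: the DC coefficient dominates for correlated pixel data
assert abs(coeffs[0, 0]) == np.max(np.abs(coeffs))
```

For uncorrelated noise the coefficients would be spread out; it is precisely the inter-pixel correlation that the DCT exploits.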
Fig. 1 Components of a typical image/video transmission system

The source encoder has three sub-blocks: the transformation sub-block, the quantizer sub-block and the entropy sub-block. The transformation sub-block refers to the DCT. The quantizer sub-block exploits the fact that the human eye is unable to perceive some visual information in an image; such information is deemed redundant and can be discarded without introducing noticeable visual artifacts. The entropy encoder employs its knowledge
Gaussian DCT Coefficient Models
457
of the DCT and quantization processes to reduce the number of bits required to represent each symbol at the quantizer output. The source and channel decoders reconstruct the image by performing all the above operations in reverse. Efficient encoders and decoders are based on source models. Reininger and Gibson [6] used Kolmogorov-Smirnov tests to show that most DCT coefficients are reasonably modeled by the Laplace distribution. This knowledge has been used to improve decoder design. However, recent work has shown that the Laplace model does not give a good fit for certain images such as text documents (see Lam [7] and Lam and Goodman [8]). In these cases, it is suggested that the Gaussian distribution is a realistic model. In fact, using a doubly stochastic model, Lam and Goodman [8] argue that within an 8 × 8 block used for the DCT, assuming that the pixels are identically distributed, the DCT coefficient is approximately Gaussian. The probability density function (pdf) of the Gaussian distribution is given by

f(y) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{y^2}{2\sigma^2}\right)   (1)

for -\infty < y < \infty, where \sigma^2 > 0 represents the variance of the block. When modeling the DCT densities of a big image, the block variance \sigma^2 in (1) is likely to vary over different parts of the image, and so one should consider \sigma^2 itself to be a random variable. This means that the actual DCT coefficient distribution will be given by the compound form

f(y) = \frac{1}{\sqrt{2\pi}} \int_0^\infty \frac{1}{\sigma} \exp\left(-\frac{y^2}{2\sigma^2}\right) g(\sigma^2)\,d\sigma^2,   (2)

where g(\cdot) denotes the pdf of \sigma^2. For convenience, letting x = y^2/2 and \lambda = \sigma^2, one can rewrite (2) as

f(y) = \frac{1}{\sqrt{2\pi}} \int_0^\infty \frac{1}{\sqrt{\lambda}} \exp\left(-\frac{x}{\lambda}\right) g(\lambda)\,d\lambda.   (3)

A number of forms for g(\cdot) have been used in the literature. Lam [7], Lam and Goodman [8] and Lam [9] considered g(\cdot) to have the exponential distribution. Lam [7], Teichroew [10] and Nadarajah and Kotz [11] considered g(\cdot) to have the gamma distribution. The uniform distribution has also been used.
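As a quick numerical sanity check of the compound form (3), one can take g(·) to be exponential, in which case the scale mixture is known to reduce to the Laplace distribution with scale sqrt(mu/2). The sketch below uses only the Python standard library; the helper names and quadrature settings are our own choices:

```python
import math

def compound_pdf(y, g, upper=100.0, n=100_000):
    """Midpoint-rule evaluation of the scale mixture (3):
    f(y) = (2*pi)**-0.5 * integral_0^inf lam**-0.5 * exp(-x/lam) * g(lam) dlam,
    with x = y**2 / 2."""
    x = y * y / 2.0
    h = upper / n
    total = 0.0
    for i in range(n):
        lam = (i + 0.5) * h          # midpoints avoid the lam = 0 endpoint
        total += math.exp(-x / lam) * g(lam) / math.sqrt(lam)
    return total * h / math.sqrt(2.0 * math.pi)

# Exponential mixing g(lam) = exp(-lam/mu)/mu: the mixture is the
# Laplace density with scale sqrt(mu/2).
mu = 2.0
g = lambda lam: math.exp(-lam / mu) / mu
laplace = lambda y: math.exp(-math.sqrt(2.0 / mu) * abs(y)) / math.sqrt(2.0 * mu)
for y in (0.5, 1.0, 3.0):
    assert abs(compound_pdf(y, g) - laplace(y)) < 1e-3
```

The same quadrature routine can be reused with any of the mixing densities g(·) considered below to cross-check the closed-form expressions numerically.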
The aim of this note is to derive the most comprehensive list of forms for (3) by taking g(\cdot) to belong to some sixteen flexible families. For each g(\cdot), we derive the corresponding f(\cdot) given by (3) and provide estimators of the associated parameters obtained by the method of moments and the method of maximum likelihood; see Sect. 2. An application of the derived distributions is illustrated in Sect. 3, where it is shown that they are better models for the DCT coefficients than the Gaussian distribution given by (1). The calculations of this note use several special functions, including the incomplete gamma function defined by

\gamma(a, x) = \int_0^x t^{a-1} \exp(-t)\,dt,

the modified Bessel function of the third kind defined by

K_\nu(x) = \frac{\Gamma(1/2)\,(x/2)^\nu}{\Gamma(\nu + 1/2)} \int_1^\infty \exp(-xt)\,(t^2 - 1)^{\nu - 1/2}\,dt,
the generalized hypergeometric function defined by

{}_pF_q(a_1, \ldots, a_p; b_1, \ldots, b_q; x) = \sum_{k=0}^{\infty} \frac{(a_1)_k (a_2)_k \cdots (a_p)_k}{(b_1)_k (b_2)_k \cdots (b_q)_k} \frac{x^k}{k!},

and the Kummer function defined by

\Psi(a, b; x) = \frac{\Gamma(1 - b)}{\Gamma(1 + a - b)}\,{}_1F_1(a; b; x) + \frac{\Gamma(b - 1)}{\Gamma(a)}\,x^{1-b}\,{}_1F_1(1 + a - b; 2 - b; x),

where (f)_k = f(f + 1) \cdots (f + k - 1) denotes the ascending factorial. The properties of these special functions can be found in Prudnikov et al. [12] and Gradshteyn and Ryzhik [13]. Estimation by the method of moments and the method of maximum likelihood requires the following notation: for a random sample \lambda_1, \ldots, \lambda_n of \lambda, define

\beta_1 = \frac{\{2\mu_1^3 - 3\mu_1\mu_2 + \mu_3\}^2}{\{\mu_2 - \mu_1^2\}^3}   (4)

and

\beta_2 = \frac{-3\mu_1^4 + 6\mu_1^2\mu_2 - 4\mu_1\mu_3 + \mu_4}{\{\mu_2 - \mu_1^2\}^2}   (5)

as the sample skewness and sample kurtosis, respectively, where \mu_j = (1/n)\sum_{i=1}^n \lambda_i^j is the j-th sample moment for j = 1, 2, 3, 4.
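In code, (4) and (5) are straightforward to evaluate from the first four sample moments; the following standard-library sketch (function name ours) computes both:

```python
def sample_skewness_kurtosis(lams):
    """Sample beta1 (eq. 4) and beta2 (eq. 5) from the first four sample
    moments mu_j = (1/n) * sum(lam_i**j)."""
    n = len(lams)
    m1, m2, m3, m4 = (sum(l ** j for l in lams) / n for j in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    beta1 = (2.0 * m1 ** 3 - 3.0 * m1 * m2 + m3) ** 2 / var ** 3
    beta2 = (-3.0 * m1 ** 4 + 6.0 * m1 ** 2 * m2 - 4.0 * m1 * m3 + m4) / var ** 2
    return beta1, beta2

# A symmetric sample has zero skewness; {1, 2, 3} has kurtosis 1.5.
b1, b2 = sample_skewness_kurtosis([1.0, 2.0, 3.0])
```

Note that the numerators in (4) and (5) are exactly the squared third and the fourth central sample moments, which is what the code exploits.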
2 Models for the Actual DCT Coefficient

In this section, we provide a collection of formulas for f(\cdot) in (3) by taking g(\cdot) to belong to sixteen flexible families. The estimators of the parameters of f(\cdot) determined by the method of moments and the method of maximum likelihood are also given.

One Parameter Exponential Distribution: If g takes the form g(\lambda) = (1/\mu)\exp(-\lambda/\mu) for \lambda > 0 and \mu > 0 then

f(y) = \frac{\sqrt{2}\,x^{1/4}}{\sqrt{\pi}\,\mu^{3/4}}\,K_{1/2}\left(2\sqrt{\frac{x}{\mu}}\right).

Note that \mu is the scale parameter. The moment estimator of \mu is \mu_1, the sample mean. This is also the maximum likelihood estimator of \mu.

Two Parameter Gamma Distribution: If g takes the form

g(\lambda) = \frac{\lambda^{\beta-1}\exp(-\lambda/\mu)}{\mu^{\beta}\,\Gamma(\beta)}

for \lambda > 0, \beta > 0 and \mu > 0 then

f(y) = \frac{\sqrt{2}\,x^{\beta/2 - 1/4}}{\sqrt{\pi}\,\Gamma(\beta)\,\mu^{(1/2+\beta)/2}}\,K_{\beta - 1/2}\left(2\sqrt{\frac{x}{\mu}}\right).   (6)
Note that \beta and \mu are the shape and scale parameters, respectively. The moment estimators of \mu and \beta are \mu_1\beta_1/4 and 4/\beta_1, respectively, where \mu_1 is the sample mean and \beta_1 is the sample skewness given by (4). The maximum likelihood estimator of \beta is the solution of the equation

n\log\mu_1 - n\log\beta + n\psi(\beta) = \sum_{i=1}^{n}\log\lambda_i,

where \psi(x) = d\log\Gamma(x)/dx is the digamma function. The maximum likelihood estimator of \mu is \mu_1/\hat{\beta}.

One Parameter Half Logistic Distribution: If g takes the form

g(\lambda) = \frac{2\mu\exp(-\lambda\mu)}{\{1 + \exp(-\lambda\mu)\}^2}

for \lambda > 0 and \mu > 0 then

f(y) = \frac{2\sqrt{2}\,\mu^{3/4}x^{1/4}}{\sqrt{\pi}}\sum_{k=0}^{\infty}\binom{-2}{k}(k+1)^{-1/4}\,K_{1/2}\left(2\sqrt{\mu(k+1)x}\right).

Note that \mu is the scale parameter. The moment estimator of \mu is 2\ln 2/\mu_1, where \mu_1 is the sample mean. The maximum likelihood estimator of \mu is the solution of the equation

\frac{1}{\mu} + \frac{2}{n}\sum_{i=1}^{n}\frac{\lambda_i\exp(-\mu\lambda_i)}{1 + \exp(-\mu\lambda_i)} = \mu_1.
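Returning to the gamma model above: the likelihood equation for the shape \beta has a unique root (the left side minus the right side is increasing in \beta), so it can be solved by simple bisection. A hedged sketch assuming only the standard library; the numerical digamma and all names are our own:

```python
import math, random

def digamma(x, h=1e-5):
    # numerical psi(x) = d/dx log Gamma(x) via central difference of lgamma
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2.0 * h)

def gamma_mle_shape(lams, lo=1e-3, hi=1e3):
    """Bisection for the gamma-shape likelihood equation
    n*log(mu1) - n*log(beta) + n*psi(beta) = sum(log lam_i)."""
    n = len(lams)
    mu1 = sum(lams) / n
    s = sum(math.log(l) for l in lams)
    def G(b):
        return n * math.log(mu1) - n * math.log(b) + n * digamma(b) - s
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if G(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Demo on synthetic data: a shape-2 gamma variate is the sum of two unit
# exponentials, so the fitted shape should land near 2.
random.seed(7)
data = [-(math.log(random.random()) + math.log(random.random()))
        for _ in range(5000)]
beta_hat = gamma_mle_shape(data)
mu_hat = sum(data) / len(data) / beta_hat   # scale MLE mu1/beta_hat
```

The same bisection pattern applies to the other one-dimensional likelihood and moment equations in this section whenever the relevant function is monotone in the unknown parameter.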
Two Parameter Inverse Gaussian Distribution: If g takes the form

g(\lambda) = \sqrt{\frac{\mu\phi}{2\pi}}\,\exp(\phi)\,\lambda^{-3/2}\exp\left(-\frac{\phi}{2}\left(\frac{\lambda}{\mu} + \frac{\mu}{\lambda}\right)\right)

for \lambda > 0, \phi > 0 and \mu > 0 then

f(y) = \frac{\exp(\phi)}{\pi}\sqrt{\frac{\phi}{\mu}}\left(1 + \frac{2x}{\mu\phi}\right)^{-1/2}K_{-1}\left(\sqrt{\phi\left(\phi + \frac{2x}{\mu}\right)}\right).

Note that both \phi and \mu are scale parameters. The moment estimators of \phi and \mu are 9/\beta_1 and \mu_1, respectively, where \mu_1 is the sample mean and \beta_1 is the sample skewness given by (4). The maximum likelihood estimators of \mu and \phi are the simultaneous solutions of the equations

\frac{1}{2\mu} = \frac{\phi}{2n}\sum_{i=1}^{n}\frac{1}{\lambda_i} - \frac{\phi\mu_1}{2\mu^2}

and

1 + \frac{1}{2\phi} = \frac{\mu_1}{2\mu} + \frac{\mu}{2n}\sum_{i=1}^{n}\frac{1}{\lambda_i}.
Two Parameter Weibull Distribution: If g takes the form

g(\lambda) = \beta\lambda^{\beta-1}\mu^{-\beta}\exp\{-(\lambda/\mu)^\beta\}

for \lambda > 0, \beta > 0 and \mu > 0 then

f(y) = \frac{\beta\,x^{\beta-1/2}}{\mu^\beta\sqrt{2\pi}}\left[\sum_{j=0}^{q-1}\frac{(-A)^j}{j!}\,\Gamma\left(\frac{1}{2} - \beta - \beta j\right)C_{1j} + \sum_{h=0}^{p-1}\frac{(-1)^h A^{(1/2-\beta+h)/\beta}}{h!\,\beta}\,\Gamma\left(\frac{\beta - 1/2 - h}{\beta}\right)C_{2h}\right]

provided that \beta = p/q, where p \ge 1 and q \ge 1 are co-prime integers,

C_{1j} = {}_1F_{p+q}\left(1; \left(p, \frac{1}{2} + \beta + \beta j\right), (q, 1+j); z\right)

and

C_{2h} = {}_1F_{p+q}\left(1; \left(q, \frac{1/2+h}{\beta}\right), (p, 1+h); z\right).

Furthermore, A = (x/\mu)^\beta, z = (-1)^{p+q}A^q/\{p^p q^q\} and (k, a) = (a/k, (a+1)/k, \ldots, (a+k-1)/k). Note that \beta and \mu are the shape and scale parameters, respectively. The moment estimator of \beta is the root of the equation

\beta_1^{1/2}\left\{\Gamma\left(1+\frac{2}{\beta}\right) - \Gamma^2\left(1+\frac{1}{\beta}\right)\right\}^{3/2} = \Gamma\left(1+\frac{3}{\beta}\right) - 3\Gamma\left(1+\frac{1}{\beta}\right)\Gamma\left(1+\frac{2}{\beta}\right) + 2\Gamma^3\left(1+\frac{1}{\beta}\right),

where \beta_1 is the sample skewness given by (4). The moment estimator of \mu is \mu_1\{\Gamma(1+1/\beta)\}^{-1}, where \mu_1 is the sample mean. The maximum likelihood estimators of \mu and \beta are the simultaneous solutions of the equations

\frac{n}{\beta} + \sum_{i=1}^{n}\log\lambda_i = n\log\mu + \sum_{i=1}^{n}\left(\frac{\lambda_i}{\mu}\right)^\beta\log\frac{\lambda_i}{\mu}

and

\beta\mu^{-\beta-1}\sum_{i=1}^{n}\lambda_i^\beta = \frac{n\beta}{\mu}.
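For the Weibull case, the moment equation for \beta simply equates the model skewness with the sample skewness; since the Weibull skewness is decreasing in the shape, bisection again applies. A hedged standard-library sketch (names and bracket are ours; it assumes a positively skewed sample so that a root exists in the bracket):

```python
import math

def weibull_skewness(b):
    # skewness of the Weibull with shape b (the scale cancels)
    g1 = math.gamma(1.0 + 1.0 / b)
    g2 = math.gamma(1.0 + 2.0 / b)
    g3 = math.gamma(1.0 + 3.0 / b)
    return (g3 - 3.0 * g1 * g2 + 2.0 * g1 ** 3) / (g2 - g1 ** 2) ** 1.5

def weibull_moment_shape(beta1, lo=0.2, hi=50.0):
    """Moment estimator of the Weibull shape: solve
    skewness(beta) = sqrt(beta1) by bisection.  The scale estimator is
    then mu1 / Gamma(1 + 1/beta)."""
    target = math.sqrt(beta1)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if weibull_skewness(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Self-consistency check: feed back the skewness implied by shape 2.
beta1 = weibull_skewness(2.0) ** 2
shape = weibull_moment_shape(beta1)
```

For shape 1 (the exponential case) the skewness is exactly 2, which provides a quick check on the Gamma-function algebra.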
Three Parameter Stacy Distribution: If g takes the form

g(\lambda) = \frac{c\lambda^{c\gamma-1}\exp\{-(\lambda/\beta)^c\}}{\beta^{c\gamma}\,\Gamma(\gamma)}

for \lambda > 0, c > 0, \gamma > 0 and \beta > 0 then

f(y) = \frac{c\,x^{c\gamma-1/2}}{\beta^{c\gamma}\sqrt{2\pi}\,\Gamma(\gamma)}\left[\sum_{j=0}^{q-1}\frac{(-A)^j}{j!}\,\Gamma\left(\frac{1}{2} - c\gamma - cj\right)C_{3j} + \sum_{h=0}^{p-1}\frac{(-1)^h A^{(1/2-c\gamma+h)/c}}{h!\,c}\,\Gamma\left(\frac{c\gamma - 1/2 - h}{c}\right)C_{4h}\right]

provided that c = p/q, where p \ge 1 and q \ge 1 are co-prime integers,

C_{3j} = {}_1F_{p+q}\left(1; \left(p, \frac{1}{2} + c\gamma + cj\right), (q, 1+j); z\right)

and

C_{4h} = {}_1F_{p+q}\left(1; \left(q, 1 - \gamma + \frac{1/2+h}{c}\right), (p, 1+h); z\right).

Furthermore, A = (x/\beta)^c, z = (-1)^{p+q}A^q/\{p^p q^q\} and (k, a) = (a/k, (a+1)/k, \ldots, (a+k-1)/k). Note that c and \gamma are the shape parameters while \beta is the scale parameter. The moment estimators of c and \gamma are the solutions of the equations

\beta_1^{1/2}\left\{\frac{\Gamma(\gamma+2/c)}{\Gamma(\gamma)} - \frac{\Gamma^2(\gamma+1/c)}{\Gamma^2(\gamma)}\right\}^{3/2} = 2\frac{\Gamma^3(\gamma+1/c)}{\Gamma^3(\gamma)} - 3\frac{\Gamma(\gamma+1/c)\Gamma(\gamma+2/c)}{\Gamma^2(\gamma)} + \frac{\Gamma(\gamma+3/c)}{\Gamma(\gamma)}

and

\beta_2\left\{\frac{\Gamma(\gamma+2/c)}{\Gamma(\gamma)} - \frac{\Gamma^2(\gamma+1/c)}{\Gamma^2(\gamma)}\right\}^2 = -3\frac{\Gamma^4(\gamma+1/c)}{\Gamma^4(\gamma)} + 6\frac{\Gamma^2(\gamma+1/c)\Gamma(\gamma+2/c)}{\Gamma^3(\gamma)} - 4\frac{\Gamma(\gamma+1/c)\Gamma(\gamma+3/c)}{\Gamma^2(\gamma)} + \frac{\Gamma(\gamma+4/c)}{\Gamma(\gamma)},

where \beta_1 and \beta_2 are the sample skewness and the sample kurtosis given by (4) and (5), respectively. The moment estimator of \beta is \mu_1\Gamma(\gamma)/\Gamma(\gamma+1/c), where \mu_1 is the sample mean. The maximum likelihood estimators of c, \gamma and \beta are the simultaneous solutions of the equations

\frac{n}{c} + \gamma\sum_{i=1}^{n}\log\lambda_i = n\gamma\log\beta + \sum_{i=1}^{n}\left(\frac{\lambda_i}{\beta}\right)^c\log\frac{\lambda_i}{\beta},
c\sum_{i=1}^{n}\log\lambda_i = nc\log\beta + n\psi(\gamma)

and

c\beta^{-c-1}\sum_{i=1}^{n}\lambda_i^c = \frac{nc\gamma}{\beta}.

One Parameter Half Gaussian Distribution: If g takes the form

g(\lambda) = \frac{2}{\sqrt{2\pi}\,\mu}\exp\left(-\frac{\lambda^2}{2\mu^2}\right)

for \lambda > 0 and \mu > 0 then

f(y) = \frac{\sqrt{x}}{\pi\mu}\left[\Gamma\left(-\frac{1}{2}\right){}_1F_3\left(1; \left(2, \frac{3}{2}\right), (1, 1); z\right) + \sum_{h=0}^{1}\frac{(-1)^h A^{(-1/2+h)/2}}{2\,h!}\,\Gamma\left(\frac{1/2-h}{2}\right){}_1F_3\left(1; \left(1, \frac{3/2+h}{2}\right), (2, 1+h); z\right)\right],

where A = x^2/(2\mu^2), z = -A/4 and (k, a) = (a/k, (a+1)/k, \ldots, (a+k-1)/k). Note that \mu is the scale parameter. The moment estimator of \mu is \mu_1\sqrt{\pi/2}, where \mu_1 is the sample mean. The maximum likelihood estimator of \mu is \sqrt{\mu_2}, where \mu_2 is the second sample moment.

Two Parameter Fréchet Distribution:
If g takes the form

g(\lambda) = \frac{k\theta^k}{\lambda^{k+1}}\exp\left(-\left(\frac{\theta}{\lambda}\right)^k\right)

for \lambda > 0, k > 0 and \theta > 0 then

f(y) = \frac{k\theta^k\sqrt{x}}{\sqrt{2\pi}}\sum_{j=0}^{q-1}\frac{(-A)^j}{j!}\,\Gamma\left(-\frac{1}{2} + kj\right)\,{}_{p+1}F_q\left(1; \left(p, -\frac{1}{2} + kj\right), (q, 1+j); (-1)^q z\right)

provided that 0 < k < 1 and k = p/q, where p \ge 1 and q \ge 1 are co-prime integers, A = (x/\theta)^k, z = (-1)^{p+q}A^q/\{p^p q^q\} and (k, a) = (a/k, (a+1)/k, \ldots, (a+k-1)/k). On the other hand, if k > 1 then

f(y) = \frac{\theta^k\sqrt{x}}{\sqrt{2\pi}}\sum_{h=0}^{p-1}\frac{(-1)^h A^{(1/2-h)/k}}{h!\,k}\,\Gamma\left(\frac{h - 1/2}{k}\right)C_{5h},

where

C_{5h} = {}_{q+1}F_p\left(1; \left(q, \frac{h - 1/2}{k}\right), (p, 1+h); \frac{(-1)^p}{z}\right).

Note that k and \theta are the shape and scale parameters, respectively. The moment estimator of k is the root of the equation

\beta_1^{1/2}\left\{\Gamma\left(1 - \frac{2}{k}\right) - \Gamma^2\left(1 - \frac{1}{k}\right)\right\}^{3/2} = \Gamma\left(1 - \frac{3}{k}\right) - 3\Gamma\left(1 - \frac{1}{k}\right)\Gamma\left(1 - \frac{2}{k}\right) + 2\Gamma^3\left(1 - \frac{1}{k}\right),

where \beta_1 is the sample skewness given by (4). The moment estimator of \theta is \mu_1/\Gamma(1 - 1/k), where \mu_1 is the sample mean. The maximum likelihood estimators of k and \theta are the simultaneous solutions of the equations

\frac{n}{k} + n\log\theta - \sum_{i=1}^{n}\log\lambda_i = \sum_{i=1}^{n}\left(\frac{\theta}{\lambda_i}\right)^k\log\frac{\theta}{\lambda_i}

and

k\theta^{k-1}\sum_{i=1}^{n}\lambda_i^{-k} = \frac{nk}{\theta}.

Two Parameter Pareto Distribution: If g takes the form

g(\lambda) = \frac{ak^a}{\lambda^{a+1}}

for \lambda > k and a > 0 then

f(y) = \frac{ak^a}{\sqrt{2\pi}\,x^{a+1/2}}\,\gamma\left(a + \frac{1}{2}, \frac{x}{k}\right).

The moment estimator of a is the root of the equation

\sqrt{\beta_1 a}\,(a - 3) = 2(a + 1)\sqrt{a - 2},

where \beta_1 is the sample skewness given by (4). Note that a and k are the shape and scale parameters, respectively. The moment estimator of k is \mu_1(a-1)/a, where \mu_1 is the sample mean. The maximum likelihood estimators of k and a are \min\lambda_i and \{(1/n)\sum_{i=1}^n\log\lambda_i - \log\min\lambda_i\}^{-1}, respectively.

Four Parameter Two Sided Power Distribution: If g takes the form

g(\lambda) = \begin{cases} \dfrac{p}{b-a}\left(\dfrac{\lambda - a}{m - a}\right)^{p-1}, & a \le \lambda \le m,\\[1mm] \dfrac{p}{b-a}\left(\dfrac{b - \lambda}{b - m}\right)^{p-1}, & m \le \lambda \le b \end{cases}

for 0 < a \le \lambda \le b < \infty, 0 < a \le m \le b < \infty and p > 0, then

f(y) = \frac{p}{(b-a)\sqrt{2\pi}}\left[(m-a)^{1-p}\sum_{k=0}^{p-1}\binom{p-1}{k}(-a)^k x^{p-1/2-k}C_{6k} + (b-m)^{1-p}\sum_{k=0}^{p-1}\binom{p-1}{k}(-b)^k x^{p-1/2-k}C_{7k}\right],
where

C_{6k} = \gamma\left(\frac{1}{2} - p + k, \frac{x}{a}\right) - \gamma\left(\frac{1}{2} - p + k, \frac{x}{m}\right)

and

C_{7k} = \gamma\left(\frac{1}{2} - p + k, \frac{x}{m}\right) - \gamma\left(\frac{1}{2} - p + k, \frac{x}{b}\right).

Note that p is the shape parameter while a, b and m are the scale parameters. The moment estimators of p and \theta = (m-a)/(b-a) are the solutions of the equations

\beta_1^{1/2}\{e_2 - e_1^2\}^{3/2} = 2e_1^3 - 3e_1e_2 + e_3

and

\beta_2\{e_2 - e_1^2\}^2 = -3e_1^4 + 6e_1^2e_2 - 4e_1e_3 + e_4,

where

e_k = \frac{p\theta^{k+1}}{p+k} - \sum_{i=0}^{k}\binom{k}{k-i}\frac{p(\theta-1)^{i+1}}{p+i},

and \beta_1 and \beta_2 are the sample skewness and the sample kurtosis given by (4) and (5), respectively. For the maximum likelihood estimators, see van Dorp and Kotz [14, 15].

Two Parameter Beta Distribution:
If g takes the form

g(\lambda) = \frac{\lambda^{a-1}(1-\lambda)^{b-1}}{B(a, b)}

for 0 < \lambda < 1, a > 0 and b > 0 then

f(y) = \frac{\Gamma(b)\exp(-x)}{\sqrt{2\pi}\,B(a, b)}\,\Psi\left(b, \frac{3}{2} - a; x\right).

Note that both a and b are shape parameters. The moment estimators of a and b are the solutions of the equations

\sqrt{\beta_1 ab}\,(a+b+2) = 2(b-a)\sqrt{a+b+1}

and

(\beta_2 - 3)\,ab(a+b+2)(a+b+3) = 6\{a^3 - a^2(2b-1) + b^2(b+1) - 2ab(b+2)\},

where \beta_1 and \beta_2 are the sample skewness and the sample kurtosis given by (4) and (5), respectively. The maximum likelihood estimators of a and b are the simultaneous solutions of the equations

\sum_{i=1}^{n}\log\lambda_i = n\psi(a) - n\psi(a+b)

and

\sum_{i=1}^{n}\log(1-\lambda_i) = n\psi(b) - n\psi(a+b).
Two Parameter Inverted Beta Distribution: If g takes the form

g(\lambda) = \frac{\lambda^{\gamma-1}}{B(\gamma, \beta)(1+\lambda)^{\gamma+\beta}}

for \lambda > 0, \gamma > 0 and \beta > 0 then

f(y) = \frac{\Gamma(1/2+\beta)}{\sqrt{2\pi}\,B(\gamma, \beta)}\,\Psi\left(\frac{1}{2}+\beta, \frac{3}{2}-\gamma; x\right).

Note that both \gamma and \beta are shape parameters. The moment estimators of \beta and \gamma are the solutions of the equations

\beta_1^{1/2}\{\gamma(\gamma+\beta-1)\}^{3/2} = 2\gamma^3(\beta-2)^{3/2} - 3\gamma^2(\gamma+1)(\beta-1)\sqrt{\beta-2} + \gamma(\gamma+1)(\gamma+2)(\beta-1)^2\sqrt{\beta-2}\,(\beta-3)^{-1}

and

\beta_2\,\gamma^2(\gamma+\beta-1)^2 = -3\gamma^4(\beta-2)^2 + 6\gamma^3(\gamma+1)(\beta-1)(\beta-2) - 4\gamma^2(\gamma+1)(\gamma+2)(\beta-1)^2(\beta-2)(\beta-3)^{-1} + \gamma(\gamma+1)(\gamma+2)(\gamma+3)(\beta-1)^3(\beta-2)(\beta-3)^{-1}(\beta-4)^{-1},

where \beta_1 and \beta_2 are the sample skewness and the sample kurtosis given by (4) and (5), respectively. The maximum likelihood estimators of \beta and \gamma are the simultaneous solutions of the equations

\sum_{i=1}^{n}\log\lambda_i - \sum_{i=1}^{n}\log(1+\lambda_i) = n\psi(\gamma) - n\psi(\gamma+\beta)

and

\sum_{i=1}^{n}\log(1+\lambda_i) = n\psi(\gamma+\beta) - n\psi(\beta).
Two Parameter Lomax Distribution: If g takes the form

g(\lambda) = \frac{ac^a}{(c+\lambda)^{a+1}}

for \lambda > 0, a > 0 and c > 0 then

f(y) = \frac{a\,\Gamma(1/2+a)}{\sqrt{2\pi c}}\,\Psi\left(\frac{1}{2}+a, \frac{1}{2}; \frac{x}{c}\right).

The moment estimator of a is the root of the equation

\beta_1^{1/2}\,a^{3/2} = 2(a-2)^{3/2} - 6(a-1)\sqrt{a-2} + 6(a-1)^2\sqrt{a-2}\,(a-3)^{-1},

where \beta_1 is the sample skewness given by (4). Note that a and c are the shape and scale parameters, respectively. The moment estimator of c is (a-1)\mu_1, where \mu_1 is the sample mean. The maximum likelihood estimators of a and c are the simultaneous solutions of the equations

\sum_{i=1}^{n}\log(c+\lambda_i) = \frac{n}{a} + n\log c

and

(a+1)\sum_{i=1}^{n}\frac{1}{c+\lambda_i} = \frac{na}{c}.
Two Parameter Generalized Pareto Distribution: If g takes the form

g(\lambda) = \frac{1}{k}\left(1 - \frac{c\lambda}{k}\right)^{1/c - 1}

for \lambda > 0, -\infty < c < \infty and k > 0 then

f(y) = \begin{cases} \dfrac{\Gamma(1/2 - 1/c)\,(-c)^{-1/2}}{\sqrt{2k\pi}}\,\Psi\left(\dfrac{1}{2} - \dfrac{1}{c}, \dfrac{1}{2}; -\dfrac{cx}{k}\right), & c \le 0,\\[2mm] \dfrac{\Gamma(1/c)\,c^{-1/2}}{\sqrt{2k\pi}}\exp\left(-\dfrac{cx}{k}\right)\Psi\left(\dfrac{1}{c}, \dfrac{1}{2}; \dfrac{cx}{k}\right), & c > 0. \end{cases}

Note that c and k are the shape and scale parameters, respectively. The moment estimator of c is the root of the equation

\beta_1^{1/2}\left\{-\frac{1}{c^3}B\left(-\frac{1}{c}-2, 3\right) - \frac{1}{c^4}B^2\left(-\frac{1}{c}-1, 2\right)\right\}^{3/2} = \frac{2}{c^6}B^3\left(-\frac{1}{c}-1, 2\right) + \frac{3}{c^5}B\left(-\frac{1}{c}-1, 2\right)B\left(-\frac{1}{c}-2, 3\right) + \frac{1}{c^4}B\left(-\frac{1}{c}-3, 4\right)

for c \le 0, or

\beta_1^{1/2}\left\{\frac{1}{c^3}B\left(3, \frac{1}{c}\right) - \frac{1}{c^4}B^2\left(2, \frac{1}{c}\right)\right\}^{3/2} = \frac{2}{c^6}B^3\left(2, \frac{1}{c}\right) - \frac{3}{c^5}B\left(2, \frac{1}{c}\right)B\left(3, \frac{1}{c}\right) + \frac{1}{c^4}B\left(4, \frac{1}{c}\right)

for c > 0, where \beta_1 is the sample skewness given by (4). The moment estimator of k is given by

k = \begin{cases} c^2\mu_1\left\{B\left(-\frac{1}{c}-1, 2\right)\right\}^{-1}, & c \le 0,\\ c^2\mu_1\left\{B\left(2, \frac{1}{c}\right)\right\}^{-1}, & c > 0, \end{cases}

where \mu_1 is the sample mean. The maximum likelihood estimators of c and k are the simultaneous solutions of the equations

(1-c)\sum_{i=1}^{n}\frac{\lambda_i}{k^2}\left(1 - \frac{c\lambda_i}{k}\right)^{-1} = \frac{n}{k}

and

c(c-1)\sum_{i=1}^{n}\frac{\lambda_i}{k}\left(1 - \frac{c\lambda_i}{k}\right)^{-1} = \sum_{i=1}^{n}\log\left(1 - \frac{c\lambda_i}{k}\right).

Two Parameter Burr III Distribution:
If g takes the form

g(\lambda) = kc\lambda^{-c-1}(1 + \lambda^{-c})^{-k-1}

for \lambda > 0, c > 0 and k > 0 then one can write

f(y) = \frac{kc}{\sqrt{2\pi}}\,I,

where

I = \int_0^\infty \frac{w^{c-1/2}\exp(-wx)}{(1 + w^c)^{k+1}}\,dw.

This integral cannot be reduced to an explicit form for general c; it can be reduced for particular values of c. For instance, if c = 2 (no physical motivation) then

I = x^{2k-1/2}\,\Gamma\left(\frac{1}{2} - 2k\right)\,{}_1F_2\left(k+1; k+\frac{1}{4}, k+\frac{3}{4}; -\frac{x^2}{4}\right) + \frac{1}{2}B\left(k-\frac{1}{4}, \frac{5}{4}\right)\,{}_1F_2\left(\frac{5}{4}; \frac{1}{2}, \frac{5}{4}-k; -\frac{x^2}{4}\right) - \frac{x}{2}B\left(k-\frac{3}{4}, \frac{7}{4}\right)\,{}_1F_2\left(\frac{7}{4}; \frac{3}{2}, \frac{7}{4}-k; -\frac{x^2}{4}\right).

Note that both c and k are shape parameters. The moment estimators of c and k are the solutions of the equations

\beta_1^{1/2}\left\{kB\left(1-\frac{2}{c}, k+\frac{2}{c}\right) - k^2B^2\left(1-\frac{1}{c}, k+\frac{1}{c}\right)\right\}^{3/2} = 2k^3B^3\left(1-\frac{1}{c}, k+\frac{1}{c}\right) - 3k^2B\left(1-\frac{1}{c}, k+\frac{1}{c}\right)B\left(1-\frac{2}{c}, k+\frac{2}{c}\right) + kB\left(1-\frac{3}{c}, k+\frac{3}{c}\right)

and

\beta_2\left\{kB\left(1-\frac{2}{c}, k+\frac{2}{c}\right) - k^2B^2\left(1-\frac{1}{c}, k+\frac{1}{c}\right)\right\}^2 = -3k^4B^4\left(1-\frac{1}{c}, k+\frac{1}{c}\right) + 6k^3B^2\left(1-\frac{1}{c}, k+\frac{1}{c}\right)B\left(1-\frac{2}{c}, k+\frac{2}{c}\right) - 4k^2B\left(1-\frac{1}{c}, k+\frac{1}{c}\right)B\left(1-\frac{3}{c}, k+\frac{3}{c}\right) + kB\left(1-\frac{4}{c}, k+\frac{4}{c}\right),

where \beta_1 and \beta_2 are the sample skewness and the sample kurtosis given by (4) and (5), respectively. The maximum likelihood estimators of k and c are the simultaneous solutions of the equations

\frac{n}{k} = \sum_{i=1}^{n}\log\left(1 + \lambda_i^{-c}\right)

and

\sum_{i=1}^{n}\log\lambda_i - (k+1)\sum_{i=1}^{n}\frac{\lambda_i^{-c}\log\lambda_i}{1 + \lambda_i^{-c}} = \frac{n}{c}.

Two Parameter Burr XII Distribution:
If g takes the form

g(\lambda) = kc\lambda^{c-1}(1 + \lambda^c)^{-k-1}

for \lambda > 0, c > 0 and k > 0 then one can write

f(y) = \frac{kc}{\sqrt{2\pi}}\,I,

where

I = \int_0^\infty \frac{w^{kc-1/2}\exp(-wx)}{(1 + w^c)^{k+1}}\,dw.

Again, this integral cannot be reduced to an explicit form for general c. For particular values such as c = 2 (no physical motivation) one can reduce the integral to

I = x^{3/2}\,\Gamma\left(-\frac{3}{2}\right)\,{}_1F_2\left(k+1; \frac{5}{4}, \frac{7}{4}; -\frac{x^2}{4}\right) + \frac{1}{2}B\left(\frac{1/2+2k}{2}, \frac{3}{4}\right)\,{}_1F_2\left(\frac{1/2+2k}{2}; \frac{1}{4}, \frac{1}{2}; -\frac{x^2}{4}\right) - \frac{x}{2}B\left(\frac{2k+3/2}{2}, \frac{1}{4}\right)\,{}_1F_2\left(\frac{2k+3/2}{2}; \frac{3}{4}, \frac{3}{2}; -\frac{x^2}{4}\right).

Note that both c and k are shape parameters. The moment estimators of c and k are the solutions of the equations

\beta_1^{1/2}\left\{kB\left(k-\frac{2}{c}, 1+\frac{2}{c}\right) - k^2B^2\left(k-\frac{1}{c}, 1+\frac{1}{c}\right)\right\}^{3/2} = 2k^3B^3\left(k-\frac{1}{c}, 1+\frac{1}{c}\right) - 3k^2B\left(k-\frac{1}{c}, 1+\frac{1}{c}\right)B\left(k-\frac{2}{c}, 1+\frac{2}{c}\right) + kB\left(k-\frac{3}{c}, 1+\frac{3}{c}\right)

and

\beta_2\left\{kB\left(k-\frac{2}{c}, 1+\frac{2}{c}\right) - k^2B^2\left(k-\frac{1}{c}, 1+\frac{1}{c}\right)\right\}^2 = -3k^4B^4\left(k-\frac{1}{c}, 1+\frac{1}{c}\right) + 6k^3B^2\left(k-\frac{1}{c}, 1+\frac{1}{c}\right)B\left(k-\frac{2}{c}, 1+\frac{2}{c}\right) - 4k^2B\left(k-\frac{1}{c}, 1+\frac{1}{c}\right)B\left(k-\frac{3}{c}, 1+\frac{3}{c}\right) + kB\left(k-\frac{4}{c}, 1+\frac{4}{c}\right),

where \beta_1 and \beta_2 are the sample skewness and the sample kurtosis given by (4) and (5), respectively. The maximum likelihood estimators of k and c are the simultaneous solutions of the equations

\frac{n}{k} = \sum_{i=1}^{n}\log\left(1 + \lambda_i^c\right)

and

\frac{n}{c} + \sum_{i=1}^{n}\log\lambda_i = (k+1)\sum_{i=1}^{n}\frac{\lambda_i^c\log\lambda_i}{1 + \lambda_i^c}.
3 Application

As mentioned in Sect. 1, a popular model for the DCT coefficients of images such as text documents is the Gaussian distribution. Here, we show that the distributions derived in Sect. 2 are better models for the DCT coefficients whether or not the coefficients follow the Gaussian distribution. To show this, we simulated 100 samples, each of size 10, from each of the following distributions: (1) the standard Gaussian distribution (Model 1); (2) the Student's t distribution with \nu = 1 degrees of freedom (Model 2); (3) the standard logistic distribution given by the pdf f(x) = \exp(-x)/\{1 + \exp(-x)\}^2 for -\infty < x < \infty (Model 3); (4) the standard Laplace distribution given by the pdf f(x) = (1/2)\exp(-|x|) for -\infty < x < \infty (Model 4). For each of the 100 × 4 = 400 samples, we fitted the distribution given by (6) as well as the standard Gaussian distribution, using the method of maximum likelihood described in Sect. 2. We computed 2(\log L_2 - \log L_1) for each fit, where L_1 and L_2 denote the maximized likelihoods of the two distributions. Figure k+1 shows the box plot of the values of 2(\log L_2 - \log L_1) for the 100 samples from Model k, k = 1, 2, 3, 4.
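The experiment can be reproduced in outline. The sketch below, using only the Python standard library, substitutes the Laplace distribution (the exponential-mixing special case of (3)) for the gamma-mixture density (6), fits both it and the Gaussian by maximum likelihood, and computes 2(log L2 - log L1). The sample size, seed and function names are our own choices, not those used in the paper:

```python
import math, random, statistics

def loglik_gaussian(xs):
    # Gaussian log-likelihood at the MLE (mean, 1/n variance)
    n = len(xs)
    m = sum(xs) / n
    v = sum((x - m) ** 2 for x in xs) / n
    return -0.5 * n * math.log(2.0 * math.pi * v) - 0.5 * n

def loglik_laplace(xs):
    # Laplace log-likelihood at the MLE (median, mean absolute deviation)
    n = len(xs)
    m = statistics.median(xs)
    b = sum(abs(x - m) for x in xs) / n
    return -n * math.log(2.0 * b) - n

random.seed(1)
xs = []
for _ in range(1000):                 # Laplace(0, 1) data via inverse CDF
    u = random.random() - 0.5
    xs.append(-math.copysign(math.log(1.0 - 2.0 * abs(u)), u))

# Expected to fall below zero for Laplace data, mirroring Fig. 5.
stat = 2.0 * (loglik_gaussian(xs) - loglik_laplace(xs))
```

A negative statistic means the heavier-tailed mixture model attains the higher likelihood, which is the qualitative pattern reported in Figs. 2 to 5.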
Fig. 2 Box plot of the values of 2(log L2 − log L1 ) for the 100 simulated samples each of size 10 from the standard Gaussian distribution
Fig. 3 Box plot of the values of 2(log L2 − log L1 ) for the 100 simulated samples each of size 10 from the Student’s t distribution with degrees of freedom ν = 1
A box plot is a convenient way of graphically depicting numerical data through their five-number summaries: the smallest observation, lower quartile, median, upper quartile, and the largest observation (from bottom to top). Observations considered "outliers" or "extreme observations" are indicated by open dots. The box plot was invented by the American statistician John Tukey; we refer the readers to Tukey [16] for details. The box plots in Figs. 2, 3, 4 and 5 show that the distribution of 2(log L2 − log L1) lies entirely below zero. In other words, the likelihood L1 for the fit of (6) is always greater than the likelihood L2 for the fit of the standard Gaussian distribution. So, one can infer that the
Fig. 4 Box plot of the values of 2(log L2 − log L1 ) for the 100 simulated samples each of size 10 from the standard logistic distribution
Fig. 5 Box plot of the values of 2(log L2 − log L1 ) for the 100 simulated samples each of size 10 from the standard Laplace distribution
model given by (6) performs better than the Gaussian distribution when the DCT coefficients are in fact Gaussian distributed, see Fig. 2. More importantly, when the DCT coefficients are not Gaussian distributed the model given by (6) performs better when compared to the Gaussian model, see Figs. 3–5. The results were similar when the other distributions derived in Sect. 2 were used and for larger sample sizes of 100, 1000 and 10,000. Hence, the derived distributions in Sect. 2 provide versatile models whether the DCT coefficients are Gaussian distributed or not.
4 Conclusions

We have derived sixteen flexible models for the distribution of the actual DCT coefficient given by (3) and the corresponding estimation procedures by the method of moments and the method of maximum likelihood. We have established the superior performance of the derived models over the Gaussian distribution given by (1). The models can be useful for the DCT coefficients of large text documents or images of that type (see Lam [7] and Lam and Goodman [8]). An extension of this work is to consider the generalized Gaussian distribution in place of the Gaussian distribution, i.e. to replace (1) by

f(x) = \frac{1}{2\beta\,\Gamma(1+\alpha)}\exp\left(-\left(\frac{|x|}{\beta}\right)^{1/\alpha}\right)

for -\infty < x < \infty, where \beta > 0 is the scale parameter and \alpha > 0 is the shape parameter. In this case, the actual DCT coefficient distribution can be given by

f(x) = \frac{1}{2\Gamma(1+\alpha)}\int_0^\infty \frac{1}{\beta}\exp\left(-\left(\frac{|x|}{\beta}\right)^{1/\alpha}\right)g(\beta)\,d\beta.   (7)

Expressions for (7) similar to those in Sect. 2 can be derived by taking g(\cdot) to belong to the sixteen families. We hope to present these expressions and show their applicability in a future paper.

Acknowledgements The author would like to thank the Editor and the two referees for carefully reading the paper and for their comments, which greatly improved the paper.
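As a quick check on the generalized Gaussian kernel, note that the substitution t = (|x|/\beta)^{1/\alpha} turns the normalizing integral into \alpha\Gamma(\alpha) = \Gamma(1+\alpha), so each density integrates to one. A numerical confirmation, standard library only (function names and quadrature settings ours):

```python
import math

def gen_gaussian_pdf(x, alpha, beta):
    """Generalized Gaussian pdf
    f(x) = exp(-(|x|/beta)**(1/alpha)) / (2 * beta * Gamma(1 + alpha)).
    alpha = 1 recovers the Laplace distribution."""
    return (math.exp(-((abs(x) / beta)) ** (1.0 / alpha))
            / (2.0 * beta * math.gamma(1.0 + alpha)))

def integrate(f, a, b, n=100_000):
    # simple midpoint rule
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# each density integrates to 1 (wide truncation handles the heavy tails)
for alpha, beta in [(0.5, 1.0), (1.0, 2.0), (2.0, 1.0)]:
    total = integrate(lambda x: gen_gaussian_pdf(x, alpha, beta),
                      -200.0 * beta, 200.0 * beta)
    assert abs(total - 1.0) < 2e-3
```

The same quadrature could be reused to evaluate the mixture (7) numerically for any of the sixteen choices of g(·) while closed forms are being derived.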
References

1. Ahmed, N., Natarajan, T., Rao, K.R.: Discrete cosine transform. IEEE Trans. Comput. 23, 90–93 (1974)
2. Wang, Z., Hunt, B.: The discrete W transform. Appl. Math. Comput. 16, 19–48 (1985)
3. Jain, A.K.: Fundamentals of Digital Image Processing. Prentice Hall, New York (1989)
4. Rao, K.R., Yip, P.: Discrete Cosine Transform: Algorithms, Advantages, Applications. Academic Press, Boston (1990)
5. Duhamel, P., Vetterli, M.: Fast Fourier transforms: a tutorial review and a state of the art. Signal Process. 19, 259–299 (1990)
6. Reininger, R.C., Gibson, J.D.: Distribution of the two-dimensional DCT coefficients for images. IEEE Trans. Commun. 31, 835–839 (1983)
7. Lam, E.Y.: Analysis of the DCT coefficient distributions for document coding. IEEE Signal Process. Lett. 11, 97–100 (2004)
8. Lam, E.Y., Goodman, J.W.: A mathematical analysis of the DCT coefficient distributions for images. IEEE Trans. Image Process. 9, 1661–1666 (2000)
9. Lam, E.Y.: Statistical modelling of the wavelet coefficients with different bases and decomposition levels. IEE Proc. Vis. Image Signal Process. 151, 203–206 (2004)
10. Teichroew, D.: The mixture of normal distributions with different variances. Ann. Math. Stat. 28, 510–512 (1957)
11. Nadarajah, S., Kotz, S.: On the DCT coefficient distributions. IEEE Signal Process. Lett. 13, 601–603 (2006)
12. Prudnikov, A.P., Brychkov, Y.A., Marichev, O.I.: Integrals and Series, vol. 1. Gordon and Breach Science, Amsterdam (1986)
13. Gradshteyn, I.S., Ryzhik, I.M.: Table of Integrals, Series, and Products, 6th edn. Academic Press, San Diego (2000)
14. van Dorp, J.R., Kotz, S.: A novel extension of the triangular distribution and its parameter estimation. J. R. Stat. Soc. Ser. D Stat. 51, 63–79 (2002)
15. van Dorp, J.R., Kotz, S.: The standard two-sided power distribution and its properties: with applications in financial engineering. Am. Stat. 56, 90–99 (2002)
16. Tukey, J.W.: Exploratory Data Analysis. Addison-Wesley, Reading (1977)
Acta Appl Math (2009) 106: 473–499 DOI 10.1007/s10440-008-9308-1
Dynamical Systems Gradient Method for Solving Nonlinear Equations with Monotone Operators N.S. Hoang · A.G. Ramm
Received: 28 June 2008 / Accepted: 26 August 2008 / Published online: 18 September 2008 © Springer Science+Business Media B.V. 2008
Abstract A version of the Dynamical Systems Gradient Method for solving ill-posed nonlinear monotone operator equations is studied in this paper. A discrepancy principle is proposed and justified. A numerical experiment was carried out with the new stopping rule. Numerical experiments show that the proposed stopping rule is efficient. Equations with monotone operators are of interest in many applications. Keywords Dynamical systems method (DSM) · Nonlinear operator equations · Monotone operators · Discrepancy principle Mathematics Subject Classification (2000) 47J05 · 47J06 · 47J35 · 65R30
1 Introduction

In this paper we study a version of the Dynamical Systems Method (DSM) (see [10]) for solving the equation

F(u) = f,   (1)

where F is a nonlinear, twice Fréchet differentiable, monotone operator in a real Hilbert space H, and (1) is assumed solvable, possibly nonuniquely. Monotonicity means that

\langle F(u) - F(v), u - v\rangle \ge 0 \quad \forall u, v \in H.   (2)
N.S. Hoang · A.G. Ramm, Mathematics Department, Kansas State University, Manhattan, KS 66506-2602, USA. e-mail: [email protected] (A.G. Ramm); [email protected] (N.S. Hoang)

Equations with monotone operators are important in many applications and were studied extensively; see, for example, [5, 7, 21, 24] and references therein. One encounters many
technical and physical problems with such operators in the cases where dissipation of energy occurs. For example, in [9] and [8], Chap. 3, pp. 156–189, a wide class of nonlinear dissipative systems is studied, and the basic equations of such systems can be reduced to (1) with monotone operators. Numerous examples of equations with monotone operators can be found in [5] and the references mentioned above. In [19] and [20] it is proved that any solvable linear operator equation with a closed, densely defined operator in a Hilbert space H can be reduced to an equation with a monotone operator and solved by a convergent iterative process. In this paper, apparently for the first time, the convergence of the Dynamical Systems Gradient method is proved under natural assumptions, and the convergence of a corresponding iterative method is established. No special assumptions of smallness of the nonlinearity or other special properties of the nonlinearity are imposed. No source-type assumptions are used. Consequently, our result is quite general and widely applicable. It is well known that, without extra assumptions (usually a source-type assumption about the right-hand side, or some assumption concerning the smoothness of the solution), one cannot get a specific rate of convergence even for linear ill-posed equations (see, for example, [10], where one can find a proof of this statement). On the other hand, such assumptions are often difficult to verify and often they do not hold. For this reason we do not make such assumptions. The result of this paper is useful both because of its many possible applications and because of its general nature. Our novel technique consists of an application of some new inequalities. Our main results are formulated in Theorems 17 and 19, and also in several lemmas, for example, in Lemmas 3, 4, 8, 9, 11, 12. Lemmas 3, 4, 11, 12 may be useful in many other problems. In [23] a stationary equation F(u) = f with a nonlinear monotone operator F was studied.
The assumptions A1–A3 on p. 197 in [23] are more restrictive than ours, and the Rule R2 on p.199, formula (4.1) in [23] for the choice of the regularization parameter is quite different from our rule and is more difficult to use computationally: one has to solve a nonlinear equation (equation (4.1) in [23]) in order to find the regularization parameter. To use this equation one has to invert an ill-conditioned linear operator A + αI for small values of α. Assumption A1 in [23] is not verifiable practically, because the solution x † is not known. Assumption A3 in [23] requires F to be constant in a ball Br (x † ) if F (x † ) = 0. Our method does not require these assumptions, and, in contrast to equation (4.1) in [23], it does not require inversion of ill-conditioned linear operators and solving nonlinear equations for finding the regularization parameter. The stopping time is chosen numerically in our method without extra computational effort by a discrepancy-type principle formulated and justified in Theorem 17, in Sect. 3. We give a convergent iterative process for stable solution of (1.1) and a stopping rule for this process. In [23] the “source-type assumption” is made, that is, it is assumed that the right-hand side of the equation F (u) = f belongs to the range of a suitable operator. This usually allows one to get some convergence rate. In our paper, as was already mentioned above, such an assumption is not used because, on the one hand, numerically it is difficult to verify such an assumption, and, on the other hand, such an assumption may be not satisfied in many cases, even in linear ill-posed problems, for example, in the case when the solution does not have extra smoothness. We assume the nonlinearity to be twice locally Fréchet differentiable. This assumption, as we mention below, does not restrict the global growth of the nonlinearity. In many practical and theoretical problems the nonlinearities are smooth and given analytically. 
In these cases one can calculate F′ analytically. This is the case in the example considered in Sect. 4. This example is a simple model problem for nonlinear Wiener-type filtering (see [18]).
If one drops the nonlinear cubic term in the equation Bu + u^3 = f of this example, then the resulting equation Bu = f does not have integrable solutions, in general, even for very smooth f, for example, for f \in C^\infty([0, 1]), as shown in [18]. It is, therefore, of special interest to solve this equation numerically. It is known (see, e.g., [10]) that the set N := \{u : F(u) = f\} is closed and convex if F is monotone and continuous. A closed and convex set in a Hilbert space has a unique minimal-norm element. This element of N we denote by y, F(y) = f. We assume that

\sup_{\|u - u_0\| \le R} \|F^{(j)}(u)\| \le M_j(R), \quad 0 \le j \le 2,   (3)
where u_0 \in H is an element of H, R > 0 is arbitrary, and f = F(y) is not known but f_\delta, the noisy data, are known, with \|f_\delta - f\| \le \delta. Assumption (3) simplifies our arguments and does not restrict the global growth of the nonlinearity. In [12] this assumption is weakened to hemicontinuity in the problems related to the existence of the global solutions of the equations generated by the DSM. In many applications the nonlinearity F is given analytically, and then one can calculate F'(u) analytically. If F'(u) is not boundedly invertible, then solving (1) for u given noisy data f_\delta is often (but not always) an ill-posed problem. When F is a linear bounded operator, many methods for stable solution of (1) were proposed (see [2, 4–10] and references therein). However, when F is nonlinear the theory is less complete. The DSM consists of finding a nonlinear map \Phi(t, u) such that the Cauchy problem

\dot{u} = \Phi(t, u), \quad u(0) = u_0,

has a unique solution for all t \ge 0, there exists \lim_{t\to\infty} u(t) := u(\infty), and F(u(\infty)) = f:

\exists!\,u(t) \quad \forall t \ge 0; \qquad \exists u(\infty); \qquad F(u(\infty)) = f.   (4)
Various choices of \Phi were proposed in [10] for (4) to hold. Each such choice yields a version of the DSM. The DSM for solving equation (1) was extensively studied in [10–17]. In [10], the following version of the DSM was investigated for monotone operators F:

\dot{u}_\delta = -(F'(u_\delta) + a(t)I)^{-1}(F(u_\delta) + a(t)u_\delta - f_\delta), \quad u_\delta(0) = u_0.   (5)
Here I denotes the identity operator in H. The convergence of this method was justified with an a priori choice of stopping rule. In [22] a continuous gradient method for solving (1) was studied. A stopping rule of discrepancy type was introduced and justified under the assumption that F satisfies the following condition:

\|F(\tilde{x}) - F(x) - F'(x)(\tilde{x} - x)\| \le \eta\,\|F(\tilde{x}) - F(x)\|, \quad \eta < 1,   (6)

for all x, \tilde{x} in some ball B(x_0, R) \subset H. This very restrictive assumption is not satisfied even for monotone operators. Indeed, if F'(x) = 0 for some x \in B(x_0, R), then (6) implies F(\tilde{x}) = f for all \tilde{x} \in B(x_0, R), provided that B(x_0, R) contains a solution of (1). In this paper we consider a gradient-type version of the DSM for solving (1):

\dot{u}_\delta = -(F'(u_\delta)^* + a(t)I)(F(u_\delta) + a(t)u_\delta - f_\delta), \quad u_\delta(0) = u_0,   (7)
476
N.S. Hoang, A.G. Ramm
where F is a monotone operator and A* denotes the adjoint of a linear operator A. If F is monotone then A := F′(·) ≥ 0. If a bounded linear operator A is defined on all of the complex Hilbert space H and A ≥ 0, i.e., ⟨Au, u⟩ ≥ 0 ∀u ∈ H, then A = A*, so A is selfadjoint. In a real Hilbert space H, a bounded linear operator defined on all of H and satisfying ⟨Au, u⟩ ≥ 0, ∀u ∈ H, is not necessarily selfadjoint. Example: H = ℝ²,

 A = [2 1; 0 2],  ⟨Au, u⟩ = 2u₁² + u₁u₂ + 2u₂² ≥ 0,

but

 A* = [2 0; 1 2] ≠ A.
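This example is easy to verify numerically; the following sketch (NumPy assumed) checks positivity of the quadratic form and the lack of selfadjointness:

```python
import numpy as np

# The matrix from the example: <Au, u> >= 0 on R^2, yet A is not selfadjoint.
A = np.array([[2.0, 1.0],
              [0.0, 2.0]])

# <Au, u> = 2*u1^2 + u1*u2 + 2*u2^2 is a positive definite quadratic form:
# the symmetric part (A + A^T)/2 has eigenvalues 2 +/- 1/2 > 0.
rng = np.random.default_rng(0)
for _ in range(1000):
    u = rng.standard_normal(2)
    assert A @ u @ u >= 0.0

# A differs from its adjoint (transpose), so A is not selfadjoint.
assert not np.allclose(A.T, A)
```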
The convergence of the method (7) for any initial value u₀ is proved for a stopping rule based on a discrepancy principle. This a posteriori choice of the stopping time tδ is justified provided that a(t) is suitably chosen. The advantage of method (7), a modified version of the gradient method, over the Gauss-Newton method and the version (5) of the DSM is the following: no inversion of matrices is needed in (7). Although the convergence rate of the DSM (7) may be slower than that of the DSM (5), the DSM (7) might be faster than the DSM (5) for large-scale systems due to its lower computational cost at each iteration. In this paper we investigate a stopping rule based on a discrepancy principle (DP) for the DSM (7). The main results of this paper are Theorem 17 and Theorem 19, in which a DP is formulated, the existence of a stopping time tδ is proved, and the convergence of the DSM with the proposed DP is justified under some natural assumptions.
2 Auxiliary Results

The inner product in H is denoted ⟨u, v⟩. Let us consider the following equation:

 F(Vδ) + aVδ − fδ = 0,  a > 0,  (8)

where a = const. It is known (see, e.g., [10, 25]) that equation (8) with a monotone continuous operator F has a unique solution for any fδ ∈ H.

Let us recall the following result from [10]:

Lemma 1 Assume that (1) is solvable, y is its minimal-norm solution, assumption (2) holds, and F is continuous. Then

 lim_{a→0} ‖Va − y‖ = 0,

where Va solves (8) with δ = 0.

Of course, under our assumption (3), F is continuous.

Lemma 2 If (2) holds and F is continuous, then ‖Vδ‖ = O(1/a) as a → ∞, and

 lim_{a→∞} ‖F(Vδ) − fδ‖ = ‖F(0) − fδ‖.  (9)
Dynamical Systems Gradient Method for Solving Nonlinear Equations
477
Proof Rewrite (8) as F(Vδ) − F(0) + aVδ + F(0) − fδ = 0. Multiply this equation by Vδ, use the inequality ⟨F(Vδ) − F(0), Vδ − 0⟩ ≥ 0, and get a‖Vδ‖² ≤ ‖fδ − F(0)‖‖Vδ‖. Therefore ‖Vδ‖ = O(1/a). This and the continuity of F imply (9).
Let a = a(t) be a strictly monotonically decaying continuous positive function on [0, ∞), 0 < a(t) ↘ 0, and assume a ∈ C¹[0, ∞). These assumptions hold throughout the paper and often are not repeated. Then the solution Vδ of (8) is a function of t, Vδ = Vδ(t). From the triangle inequality one gets:

 ‖F(Vδ(0)) − fδ‖ ≥ ‖F(0) − fδ‖ − ‖F(Vδ(0)) − F(0)‖.

From Lemma 2 it follows that for large a(0) one has:

 ‖F(Vδ(0)) − F(0)‖ ≤ M₁‖Vδ(0)‖ = O(1/a(0)).

Therefore, if ‖F(0) − fδ‖ > Cδ, then ‖F(Vδ(0)) − fδ‖ ≥ (C − ε)δ, where ε > 0 is sufficiently small and a(0) > 0 is sufficiently large. Below the words decreasing and increasing mean strictly decreasing and strictly increasing.

Lemma 3 Assume ‖F(0) − fδ‖ > 0. Let 0 < a(t) ↘ 0, and let F be monotone. Denote

 ψ(t) := ‖Vδ(t)‖,  φ(t) := a(t)ψ(t) = ‖F(Vδ(t)) − fδ‖,

where Vδ(t) solves (8) with a = a(t). Then φ(t) is decreasing, and ψ(t) is increasing.

Proof Since ‖F(0) − fδ‖ > 0, one has ψ(t) ≠ 0, ∀t ≥ 0. Indeed, if ψ(t)|_{t=τ} = 0, then Vδ(τ) = 0, and (8) implies ‖F(0) − fδ‖ = 0, which is a contradiction. Note that φ(t) = a(t)‖Vδ(t)‖. One has

 0 ≤ ⟨F(Vδ(t₁)) − F(Vδ(t₂)), Vδ(t₁) − Vδ(t₂)⟩
  = ⟨−a(t₁)Vδ(t₁) + a(t₂)Vδ(t₂), Vδ(t₁) − Vδ(t₂)⟩
  = (a(t₁) + a(t₂))⟨Vδ(t₁), Vδ(t₂)⟩ − a(t₁)‖Vδ(t₁)‖² − a(t₂)‖Vδ(t₂)‖².  (10)

Thus,

 0 ≤ (a(t₁) + a(t₂))‖Vδ(t₁)‖‖Vδ(t₂)‖ − a(t₁)‖Vδ(t₁)‖² − a(t₂)‖Vδ(t₂)‖²
  = (a(t₁)‖Vδ(t₁)‖ − a(t₂)‖Vδ(t₂)‖)(‖Vδ(t₂)‖ − ‖Vδ(t₁)‖)
  = (φ(t₁) − φ(t₂))(ψ(t₂) − ψ(t₁)).  (11)

If ψ(t₂) > ψ(t₁) then (11) implies φ(t₁) ≥ φ(t₂), so

 a(t₁)ψ(t₁) ≥ a(t₂)ψ(t₂) > a(t₂)ψ(t₁).
Thus, if ψ(t₂) > ψ(t₁) then a(t₂) < a(t₁) and, therefore, t₂ > t₁, because a(t) is strictly decreasing. Similarly, if ψ(t₂) < ψ(t₁) then φ(t₁) ≤ φ(t₂). This implies a(t₂) > a(t₁), so t₂ < t₁. Suppose ψ(t₁) = ψ(t₂), i.e., ‖Vδ(t₁)‖ = ‖Vδ(t₂)‖. From (10), one has

 ‖Vδ(t₁)‖² ≤ ⟨Vδ(t₁), Vδ(t₂)⟩ ≤ ‖Vδ(t₁)‖‖Vδ(t₂)‖ = ‖Vδ(t₁)‖².

This implies Vδ(t₁) = Vδ(t₂), and then equation (8) implies a(t₁) = a(t₂). Hence t₁ = t₂, because a(t) is strictly decreasing. Therefore φ(t) is decreasing and ψ(t) is increasing.

Lemma 4 Suppose that ‖F(0) − fδ‖ > Cδ, C > 1, and a(0) is sufficiently large. Then there exists a unique t₁ > 0 such that ‖F(Vδ(t₁)) − fδ‖ = Cδ.

Proof The uniqueness of t₁ follows from Lemma 3, because ‖F(Vδ(t)) − fδ‖ = φ(t), and φ is decreasing. We have F(y) = f, and

 0 = ⟨F(Vδ) + aVδ − fδ, F(Vδ) − fδ⟩
  = ‖F(Vδ) − fδ‖² + a⟨Vδ − y, F(Vδ) − fδ⟩ + a⟨y, F(Vδ) − fδ⟩
  = ‖F(Vδ) − fδ‖² + a⟨Vδ − y, F(Vδ) − F(y)⟩ + a⟨Vδ − y, f − fδ⟩ + a⟨y, F(Vδ) − fδ⟩
  ≥ ‖F(Vδ) − fδ‖² + a⟨Vδ − y, f − fδ⟩ + a⟨y, F(Vδ) − fδ⟩.

Here the inequality ⟨Vδ − y, F(Vδ) − F(y)⟩ ≥ 0 was used. Therefore,

 ‖F(Vδ) − fδ‖² ≤ −a⟨Vδ − y, f − fδ⟩ − a⟨y, F(Vδ) − fδ⟩
  ≤ a‖Vδ − y‖‖f − fδ‖ + a‖y‖‖F(Vδ) − fδ‖
  ≤ aδ‖Vδ − y‖ + a‖y‖‖F(Vδ) − fδ‖.  (12)

On the other hand, we have

 0 = ⟨F(Vδ) − F(y) + aVδ + f − fδ, Vδ − y⟩
  = ⟨F(Vδ) − F(y), Vδ − y⟩ + a‖Vδ − y‖² + a⟨y, Vδ − y⟩ + ⟨f − fδ, Vδ − y⟩
  ≥ a‖Vδ − y‖² + a⟨y, Vδ − y⟩ + ⟨f − fδ, Vδ − y⟩,

where the inequality ⟨Vδ − y, F(Vδ) − F(y)⟩ ≥ 0 was used. Therefore,

 a‖Vδ − y‖² ≤ a‖y‖‖Vδ − y‖ + δ‖Vδ − y‖.

This implies

 a‖Vδ − y‖ ≤ a‖y‖ + δ.  (13)

From (12) and (13), and the elementary inequality ab ≤ εa² + b²/(4ε), ∀ε > 0, one gets:

 ‖F(Vδ) − fδ‖² ≤ δ² + a‖y‖δ + a‖y‖‖F(Vδ) − fδ‖
  ≤ δ² + a‖y‖δ + ε‖F(Vδ) − fδ‖² + (1/(4ε))a²‖y‖²,  (14)
where ε > 0 is fixed, independent of t, and can be chosen arbitrarily small. Let t → ∞, so that a = a(t) ↘ 0. Then (14) implies lim_{t→∞} (1 − ε)‖F(Vδ) − fδ‖² ≤ δ². This, the continuity of F, the continuity of Vδ(t) on [0, ∞), and the assumption ‖F(0) − fδ‖ > Cδ imply that the equation ‖F(Vδ(t)) − fδ‖ = Cδ must have a solution t₁ > 0. The uniqueness of this solution has already been established.

Remark 5 From the proof of Lemma 4 one obtains the following claim: if tₙ ↗ ∞, then there exists a unique n₁ > 0 such that

 ‖F(V_{n₁+1}) − fδ‖ ≤ Cδ < ‖F(V_{n₁}) − fδ‖,  Vₙ := Vδ(tₙ).

Remark 6 From Lemmas 2 and 3 one concludes that

 aₙ‖Vₙ‖ = ‖F(Vₙ) − fδ‖ ≤ ‖F(0) − fδ‖,  aₙ := a(tₙ), ∀n ≥ 0.
Remark 7 Let V := Vδ(t)|_{δ=0}, so F(V) + a(t)V − f = 0, and let y be the minimal-norm solution to (1). We claim that

 ‖Vδ − V‖ ≤ δ/a.  (15)

Indeed, from (8) one gets F(Vδ) − F(V) + a(Vδ − V) = f − fδ. Multiply this equality by (Vδ − V) and use the monotonicity of F to get a‖Vδ − V‖² ≤ δ‖Vδ − V‖. This implies (15). Similarly, multiplying the equation F(V) + aV − F(y) = 0 by V − y, one derives the inequality:

 ‖V‖ ≤ ‖y‖.  (16)

Similar arguments can be found in [10]. From (15) and (16) one gets the following estimate:

 ‖Vδ‖ ≤ ‖V‖ + δ/a ≤ ‖y‖ + δ/a.  (17)
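Estimates (15)–(17) can be illustrated on the scalar monotone operator F(u) = u³; the bisection root-finder below is only an illustrative device, not part of the paper's method:

```python
# Numerical check of (15)-(17) for the scalar monotone operator F(u) = u**3.
# Solves F(V) + a*V = f and F(V_d) + a*V_d = f_d; y = f**(1/3) solves F(u) = f.

def solve_regularized(a, rhs, lo=-10.0, hi=10.0):
    """Unique root of u**3 + a*u - rhs = 0 (strictly increasing in u)."""
    g = lambda u: u ** 3 + a * u - rhs
    for _ in range(200):          # plain bisection
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

f, delta, a = 8.0, 1e-3, 0.5
f_delta = f + delta                   # noisy data, |f_delta - f| = delta
y = f ** (1.0 / 3.0)                  # minimal-norm solution of u^3 = f

V = solve_regularized(a, f)
Vd = solve_regularized(a, f_delta)

assert abs(Vd - V) <= delta / a + 1e-9           # estimate (15)
assert abs(V) <= abs(y) + 1e-9                   # estimate (16)
assert abs(Vd) <= abs(y) + delta / a + 1e-9      # estimate (17)
```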
Lemma 8 Suppose a(t) = d/(c+t)^b and ϕ(t) = ∫₀ᵗ (a²(s)/2) ds, where b ∈ (0, 1/4] and d, c are positive constants. Then

 ∫₀ᵗ e^{ϕ(s)} (d²/(2(s+c)^{3b})) (1 − 2b/(c^θ d²)) ds < e^{ϕ(t)}/(c+t)^b,  ∀t > 0,  θ := 1 − 2b > 0.  (18)

Proof We have

 ϕ(t) = ∫₀ᵗ d²/(2(c+s)^{2b}) ds = (d²/(2(1−2b)))[(c+t)^{1−2b} − c^{1−2b}] = p(c+t)^θ − C₃,  (19)

where θ := 1 − 2b, p := d²/(2θ), C₃ := pc^θ. One has

 (d/dt)[e^{p(c+t)^θ}/(c+t)^b] = pθ e^{p(c+t)^θ}/(c+t)^{b+1−θ} − b e^{p(c+t)^θ}/(c+t)^{b+1}
  = (e^{p(c+t)^θ}/(c+t)^b)[d²/(2(c+t)^{2b}) − b/(c+t)]
  ≥ (e^{p(c+t)^θ}/(c+t)^b)(d²/(2(c+t)^{2b}))(1 − 2b/(c^θ d²)).

Therefore,

 ∫₀ᵗ e^{p(c+s)^θ} (d²/(2(s+c)^{3b}))(1 − 2b/(c^θ d²)) ds ≤ ∫₀ᵗ (d/ds)[e^{p(c+s)^θ}/(c+s)^b] ds
  = e^{p(c+t)^θ}/(c+t)^b − e^{pc^θ}/c^b ≤ e^{p(c+t)^θ}/(c+t)^b.

Multiplying this inequality by e^{−C₃} and using (19), one obtains (18). Lemma 8 is proved.

Lemma 9 Let a(t) = d/(c+t)^b and ϕ(t) := ∫₀ᵗ (a²(s)/2) ds, where d, c > 0, b ∈ (0, 1/4], and c^{1−2b}d² ≥ 6b. One has

 e^{−ϕ(t)} ∫₀ᵗ e^{ϕ(s)} |ȧ(s)| ‖Vδ(s)‖ ds ≤ (1/2) a(t) ‖Vδ(t)‖,  t ≥ 0.  (20)

Proof From Lemma 8, one has

 ∫₀ᵗ e^{ϕ(s)} (d³/(2(s+c)^{3b})) (1 − 2b/(c^θ d²)) ds < (d/(c+t)^b) e^{ϕ(t)},  θ = 1 − 2b > 0.  (21)

Since c^{1−2b}d² ≥ 6b, i.e., 6b/(c^θ d²) ≤ 1, one has

 1 − 2b/(c^θ d²) ≥ 4b/(c^θ d²) ≥ 4b/((c+s)^{1−2b} d²),  s ≥ 0.

This implies

 (a³(s)/2)(1 − 2b/(c^θ d²)) = (d³/(2(c+s)^{3b}))(1 − 2b/(c^θ d²)) ≥ (d³/(2(c+s)^{3b}))(4b/((c+s)^{1−2b} d²)) = 2db/(c+s)^{b+1} = 2|ȧ(s)|,  s ≥ 0.  (22)

Multiplying (21) by ‖Vδ(t)‖, using inequality (22) and the fact that ‖Vδ(t)‖ is increasing, one gets, for all t ≥ 0,

 e^{ϕ(t)} a(t) ‖Vδ(t)‖ > ∫₀ᵗ e^{ϕ(s)} (a³(s)/2)(1 − 2b/(c^θ d²)) ‖Vδ(t)‖ ds ≥ 2 ∫₀ᵗ e^{ϕ(s)} |ȧ(s)| ‖Vδ(s)‖ ds.

This implies inequality (20). Lemma 9 is proved.

Let us recall the following lemma, which is basic in our proofs.
Lemma 10 ([10], p. 97) Let α(t), β(t), γ(t) be continuous nonnegative functions on [t₀, ∞), where t₀ ≥ 0 is a fixed number. Suppose there exists a function μ ∈ C¹[t₀, ∞), μ > 0, lim_{t→∞} μ(t) = ∞, such that

 0 ≤ α(t) ≤ (μ(t)/2)[γ(t) − μ̇(t)/μ(t)],  μ̇ := dμ/dt,  (23)
 β(t) ≤ (1/(2μ(t)))[γ(t) − μ̇(t)/μ(t)],  (24)
 μ(0)g(0) < 1,  (25)

and g(t) ≥ 0 satisfies the inequality

 ġ(t) ≤ −γ(t)g(t) + α(t)g²(t) + β(t),  t ≥ t₀.  (26)

Then g(t) exists on [t₀, ∞) and

 0 ≤ g(t) < 1/μ(t) → 0,  as t → ∞.  (27)

If inequalities (23)–(25) hold on an interval [t₀, T), then g(t) exists on this interval and inequality (27) holds on [t₀, T).

Lemma 11 Suppose M₁, c₀, and c₁ are positive constants and 0 ≠ y ∈ H. Then there exist λ > 0 and a function a(t) ∈ C¹[0, ∞), 0 < a(t) ↘ 0, such that |ȧ(t)| ≤ a³(t)/4 and the following conditions hold:

 M₁/λ ≤ ‖y‖,  (28)
 c₀(M₁ + a(t)) ≤ (λ/(2a²(t)))[a²(t) − 2|ȧ(t)|/a(t)],  (29)
 c₁ |ȧ(t)|/a(t) ≤ (a²(t)/(2λ))[a²(t) − 2|ȧ(t)|/a(t)],  (30)
 (λ/a²(0)) g(0) < 1.  (31)
Proof Take

 a(t) = d/(c+t)^b,  0 < b ≤ 1/4,  4b ≤ c^{1−2b}d²,  c ≥ 1.  (32)

Note that |ȧ| = −ȧ. We have

 |ȧ|/a³ = b/(d²(c+t)^{1−2b}) ≤ b/(d²c^{1−2b}) ≤ 1/4.

Hence,

 2|ȧ(t)|/a(t) ≤ a²(t)/2 ≤ a²(t) − 2|ȧ(t)|/a(t).  (33)

Thus, inequality (29) is satisfied if

 c₀(M₁ + a(0)) ≤ λ/4.  (34)

Take

 λ ≥ max{8c₀M₁, M₁/‖y‖}.  (35)

Then (28) is satisfied and

 c₀M₁ ≤ λ/8.  (36)

For any given g(0), choose a(0) sufficiently large so that (λ/a²(0))g(0) < 1. Then inequality (31) is satisfied. Choose κ ≥ 1 such that

 κ > max{4λc₁b/d⁴, 8c₀a(0)/λ, 1}.  (37)

Define

 ν(t) := κa(t),  λ_κ := κ²λ.  (38)

Using inequalities (36), (37) and (38), one gets

 c₀(M₁ + ν(0)) ≤ λ_κ/8 + c₀ν(0) ≤ λ_κ/8 + λ_κ/8 = λ_κ/4.

Thus, (34) holds for a(t) = ν(t), λ = λ_κ. Consequently, (29) holds for a(t) = ν(t), λ = λ_κ, since (33) holds as well under this transformation, i.e.,

 ν²(t)/2 ≤ ν²(t) − 2|ν̇(t)|/ν(t).

Using the inequalities (37) and c ≥ 1 and the definition (38), one obtains

 4λ_κ c₁ |ν̇(t)|/ν⁵(t) = 4λc₁ b/(κ²d⁴(c+t)^{1−4b}) ≤ 4λc₁ b/(κ²d⁴) ≤ 1.  (39)

This implies

 c₁ |ν̇|/ν ≤ ν⁴(t)/(4λ_κ) ≤ (ν²(t)/(2λ_κ))[ν²(t) − 2|ν̇|/ν].

Thus, one can replace the function a(t) by ν(t) = κa(t) and λ by λ_κ = κ²λ in the inequalities (28)–(31).

Lemma 12 Suppose M₁, c₀, c₁ and α̃ are positive constants and 0 ≠ y ∈ H. Then there exist λ > 0 and a sequence 0 < (aₙ)_{n=0}^∞ ↘ 0 such that the following conditions hold:

 aₙ/aₙ₊₁ ≤ 2,  (40)
 ‖fδ − F(0)‖ ≤ a₀³/λ,  (41)
 M₁/λ ≤ ‖y‖,  (42)
 c₀(M₁ + a₀)/λ ≤ 1/2,  (43)
 (aₙ² − aₙ₊₁²)/λ + c₁(aₙ − aₙ₊₁)/aₙ₊₁ ≤ α̃ aₙ⁴/(2λ).  (44)
Proof Let us show that, if a₀ > 0 is sufficiently large, then the sequence

 aₙ = a₀/(1+n)^b,  b = 1/4,  (45)

satisfies conditions (41)–(44), provided that

 λ ≥ max{M₁/‖y‖, 4c₀M₁}.  (46)

Condition (40) is satisfied by the sequence (45). Inequality (42) is satisfied since (46) holds. Choose a₀ so that

 a₀ ≥ (λ‖fδ − F(0)‖)^{1/3};  (47)

then (41) is satisfied. Assume that (aₙ)_{n=0}^∞ and λ satisfy (40), (41) and (42). Choose κ ≥ 1 such that

 κ ≥ max{4c₀a₀/λ, (4/(2√2 α̃ a₀²))^{1/2}, (λc₁/(α̃ a₀⁴))^{1/2}}.  (48)

It follows from (48) that

 4/(κ²a₀² 2√2) ≤ α̃,  λc₁/(κ²a₀⁴) ≤ α̃.  (49)

Define

 (bₙ)_{n=0}^∞ := (κaₙ)_{n=0}^∞,  λ_κ := κ²λ.  (50)

Using inequalities (46), (48) and the definitions (50), one gets

 c₀(M₁ + b₀)/λ_κ ≤ 1/4 + c₀a₀/(κλ) ≤ 1/4 + 1/4 = 1/2.

Thus, inequality (43) holds for a₀ replaced by b₀ = κa₀ and λ replaced by λ_κ = κ²λ, where κ satisfies (48). For all n ≥ 0 one has

 (aₙ² − aₙ₊₁²)/aₙ⁴ = √(n+1)/(a₀²√(n+2)(√(n+1) + √(n+2))) ≤ 1/(2a₀²√(n+2)) ≤ 1/(2√2 a₀²).  (51)

Since aₙ is decreasing, one has

 (aₙ − aₙ₊₁)/(aₙ⁴aₙ₊₁) = (aₙ⁴ − aₙ₊₁⁴)/(aₙ⁴aₙ₊₁(aₙ + aₙ₊₁)(aₙ² + aₙ₊₁²))
  ≤ (aₙ⁴ − aₙ₊₁⁴)/(4aₙ⁴aₙ₊₁⁴) = ((n+2) − (n+1))/(4a₀⁴) = 1/(4a₀⁴),  ∀n ≥ 0.  (52)

Using inequalities (51) and (49), one gets

 4(aₙ² − aₙ₊₁²)/(κ²aₙ⁴) ≤ 4/(κ²a₀² 2√2) ≤ α̃.  (53)

Similarly, using inequalities (52) and (49), one gets

 4λc₁(aₙ − aₙ₊₁)/(κ²aₙ⁴aₙ₊₁) ≤ λc₁/(κ²a₀⁴) ≤ α̃.  (54)

Inequalities (53) and (54) imply

 (bₙ² − bₙ₊₁²)/λ_κ + c₁(bₙ − bₙ₊₁)/bₙ₊₁ = (aₙ² − aₙ₊₁²)/λ + c₁(aₙ − aₙ₊₁)/aₙ₊₁
  = (κ²aₙ⁴/(4λ))·[4(aₙ² − aₙ₊₁²)/(κ²aₙ⁴)] + (κ²aₙ⁴/(4λ))·[4λc₁(aₙ − aₙ₊₁)/(κ²aₙ⁴aₙ₊₁)]
  ≤ (κ²aₙ⁴/(4λ))α̃ + (κ²aₙ⁴/(4λ))α̃ = (κ²aₙ⁴/(2λ))α̃ = α̃bₙ⁴/(2λ_κ).

Thus, inequality (44) holds for aₙ replaced by bₙ = κaₙ and λ replaced by λ_κ = κ²λ, where κ satisfies (48). Inequalities (40)–(42) hold as well under this transformation. Thus, the choices aₙ = bₙ and λ := κ² max{M₁/‖y‖, 4c₀M₁}, where κ satisfies (48), satisfy all the conditions of Lemma 12.

Remark 13 The constants c₀ and c₁ used in Lemmas 11 and 12 will be used in Theorems 17 and 19; these constants are defined in (67). The constant α̃, used in Lemma 12, is the one from Theorem 19; it is defined in (94).
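The elementary bounds used in this proof are easy to check numerically for the sequence aₙ = a₀/(1+n)^{1/4}; the value of a₀ below is an arbitrary illustration:

```python
# Check condition (40) and the bounds (51), (52) for a_n = a0/(1+n)**0.25.
a0 = 2.0
a = [a0 / (1 + n) ** 0.25 for n in range(1001)]

for n in range(1000):
    an, an1 = a[n], a[n + 1]
    assert an / an1 <= 2.0                                               # (40)
    assert (an**2 - an1**2) / an**4 <= 1 / (2 * 2**0.5 * a0**2) + 1e-12  # (51)
    assert (an - an1) / (an**4 * an1) <= 1 / (4 * a0**4) + 1e-12         # (52)
```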
Remark 14 Using similar arguments one can show that the sequence aₙ = d/(c+n)^b, where c ≥ 1 and 0 < b ≤ 1/4, satisfies all the conditions of Lemma 12, provided that d is sufficiently large and λ is chosen so that inequality (46) holds.

Remark 15 In the proofs of Lemmas 11 and 12 the numbers a₀ and λ can be chosen so that a₀²/λ is uniformly bounded as δ → 0, regardless of the rate of growth of the constant M₁ = M₁(R) from formula (3) when R → ∞, i.e., regardless of the strength of the nonlinearity F(u). To satisfy (46) one can choose λ = M₁(1/‖y‖ + 4c₀). To satisfy (47) one can choose

 a₀ = [λ(‖f − F(0)‖ + ‖f‖)]^{1/3} ≥ [λ‖fδ − F(0)‖]^{1/3},

where we have assumed without loss of generality that 0 < ‖fδ − f‖ < ‖f‖. With this choice of a₀ and λ, the ratio a₀²/λ is bounded uniformly with respect to δ ∈ (0, 1) and does not depend on R. The dependence of a₀ on δ is seen from (47), since fδ depends on δ. In practice one has ‖fδ − f‖ < ‖f‖. Consequently, [λ‖fδ − F(0)‖]^{1/3} ≤ [λ(‖f − F(0)‖ + ‖f‖)]^{1/3}. Thus, we can practically choose a(0) independent of δ from the inequality

 a₀ ≥ [λ(‖f − F(0)‖ + ‖f‖)]^{1/3}.

Indeed, with the above choice one has a₀²/λ = (‖f − F(0)‖ + ‖f‖)^{2/3} λ^{−1/3} ≤ c, where c > 0 is a constant independent of δ, and one can assume λ ≥ 1 without loss of generality. This Remark is used in the proof of the main result in Sect. 3. Specifically, it is used to prove that the iterative process (93) generates a sequence which stays in the ball B(u₀, R) for all n ≤ n₀ + 1, where the number n₀ is defined by formula (104) (see below), and R > 0 is sufficiently large. An upper bound on R is given in the proof of Theorem 19, below formula (117).

Remark 16 One can choose u₀ ∈ H such that

 g₀ := ‖u₀ − V₀‖ ≤ ‖F(0) − fδ‖/a₀.  (55)

Indeed, if, for example, u₀ = 0, then by Remark 6 one gets

 g₀ = ‖V₀‖ = a₀‖V₀‖/a₀ ≤ ‖F(0) − fδ‖/a₀.

If (41) and (55) hold, then g₀ ≤ a₀²/λ.

3 Main Results

3.1 Dynamical Systems Gradient Method

Assume:

 0 < a(t) ↘ 0,  lim_{t→∞} ȧ(t)/a(t) = 0,  |ȧ(t)|/a³(t) ≤ 1/4.  (56)
Denote

 A := F′(uδ(t)),  Aₐ := A + aI,  a = a(t),

where I is the identity operator, and uδ(t) solves the following Cauchy problem:

 u̇δ = −A*_{a(t)}[F(uδ) + a(t)uδ − fδ],  uδ(0) = u₀.  (57)
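For a scalar monotone example F(u) = u³, the flow (57) can be integrated by forward Euler; the step size and the constants d, c, b below are ad hoc illustrative choices, not the paper's:

```python
# Forward-Euler integration of the gradient DSM (57) for F(u) = u**3 on R:
#   u' = -(F'(u) + a(t)) * (F(u) + a(t)*u - f_delta),  u(0) = 0,
# with a(t) = d/(c + t)**b.  With delta = 0 the trajectory should approach
# y = f**(1/3) as a(t) decays.

f = 8.0
y = f ** (1.0 / 3.0)
d, c, b = 2.0, 1.0, 0.25
u, t, dt = 0.0, 0.0, 5e-3
for _ in range(400_000):          # integrate up to t = 2000
    a = d / (c + t) ** b
    u -= dt * (3 * u * u + a) * (u ** 3 + a * u - f)
    t += dt

assert abs(u - y) < 0.1           # u tracks the regularized solution V(t) -> y
```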
Theorem 17 Assume that F : H → H is a monotone operator, twice Fréchet differentiable, sup_{u∈B(u₀,R)} ‖F^(j)(u)‖ ≤ Mⱼ(R), 0 ≤ j ≤ 2, B(u₀, R) := {u : ‖u − u₀‖ ≤ R}, where u₀ is an element of H satisfying inequality (88) (see below). Let a(t) satisfy the conditions of Lemma 11; for example, one can choose a(t) = d/(c+t)^b, where b ∈ (0, 1/4], c ≥ 1, and d > 0 are constants, with d sufficiently large. Assume that the equation F(u) = f has a solution in B(u₀, R), possibly nonunique, and that y is the minimal-norm solution to this equation. Let f be unknown but fδ be given, ‖fδ − f‖ ≤ δ. Then the solution uδ(t) to problem (57) exists on an interval [0, Tδ], lim_{δ→0} Tδ = ∞, and there exists tδ ∈ (0, Tδ), not necessarily unique, such that

 ‖F(uδ(tδ)) − fδ‖ = C₁δ^ζ,  lim_{δ→0} tδ = ∞,  (58)

where C₁ > 1 and 0 < ζ ≤ 1 are constants. If ζ ∈ (0, 1) and tδ satisfies (58), then

 lim_{δ→0} ‖uδ(tδ) − y‖ = 0.  (59)
C1 + 1 . 2
(60)
Let w := uδ − Vδ , One has
g(t) := w .
w˙ = −V˙δ − A∗a(t) F (uδ ) − F (Vδ ) + a(t)w .
(61)
We use Taylor’s formula and get: F (uδ ) − F (Vδ ) + aw = Aa w + K,
K ≤
M2 w 2 , 2
(62)
where K := F (uδ ) − F (Vδ ) − Aw, and M2 is the constant from the estimate (3). Multiplying (61) by w and using (62) one gets g g˙ ≤ −a 2 g 2 +
M2 (M1 + a) 3 g + V˙δ g, 2
(63)
where the estimates ⟨A*ₐAₐw, w⟩ ≥ a²g² and ‖Aₐ‖ ≤ M₁ + a were used. Note that the inequality ⟨A*ₐAₐw, w⟩ ≥ a²g² is true if A ≥ 0. Since F is monotone and differentiable (see (3)), one has A := F′(uδ) ≥ 0. Let t₀ > 0 be such that

 δ/a(t₀) = ‖y‖/(C − 1),  C > 1.  (64)

This t₀ exists and is unique since a(t) > 0 monotonically decays to 0 as t → ∞. By Lemma 4, there exists t₁ such that

 ‖F(Vδ(t₁)) − fδ‖ = Cδ,  F(Vδ(t₁)) + a(t₁)Vδ(t₁) − fδ = 0.  (65)

We claim that t₁ ∈ [0, t₀]. Indeed, from (8) and (17) one gets

 Cδ = a(t₁)‖Vδ(t₁)‖ ≤ a(t₁)(‖y‖ + δ/a(t₁)) = a(t₁)‖y‖ + δ,  C > 1,

so

 δ ≤ a(t₁)‖y‖/(C − 1).

Thus,

 δ/a(t₁) ≤ ‖y‖/(C − 1) = δ/a(t₀).

Since a(t) ↘ 0, the above inequality implies t₁ ≤ t₀. Differentiating both sides of (8) with respect to t, one obtains

 A_{a(t)}V̇δ = −ȧVδ.

This implies

 ‖V̇δ‖ ≤ |ȧ|‖A⁻¹_{a(t)}Vδ‖ ≤ (|ȧ|/a)‖Vδ‖ ≤ (|ȧ|/a)(‖y‖ + δ/a) ≤ (|ȧ|/a)‖y‖(1 + 1/(C − 1)),  ∀t ≤ t₀.  (66)

Since g ≥ 0, inequalities (63) and (66) imply

 ġ ≤ −a²(t)g(t) + c₀(M₁ + a(t))g² + c₁|ȧ(t)|/a(t),  c₀ := M₂/2,  c₁ := ‖y‖(1 + 1/(C − 1)).  (67)

Inequality (67) is of the type (26) with

 γ(t) = a²(t),  α(t) = c₀(M₁ + a(t)),  β(t) = c₁|ȧ(t)|/a(t).

Let us check assumptions (23)–(25). Take

 μ(t) = λ/a²(t),  λ = const.

By Lemma 11 there exist λ and a(t) such that conditions (23)–(25) hold. Thus, Lemma 10 yields

 g(t) < a²(t)/λ,  ∀t ≤ t₀.  (68)
Therefore,

 ‖F(uδ(t)) − fδ‖ ≤ ‖F(uδ(t)) − F(Vδ(t))‖ + ‖F(Vδ(t)) − fδ‖ ≤ M₁g(t) + ‖F(Vδ(t)) − fδ‖
  ≤ M₁a²(t)/λ + ‖F(Vδ(t)) − fδ‖,  ∀t ≤ t₀.  (69)

It follows from Lemma 3 that ‖F(Vδ(t)) − fδ‖ is decreasing. Since t₁ ≤ t₀, one gets

 ‖F(Vδ(t₀)) − fδ‖ ≤ ‖F(Vδ(t₁)) − fδ‖ = Cδ.  (70)

This, inequality (69), the inequality M₁/λ ≤ ‖y‖ (see (35)), the relation (64), and the definition C₁ = 2C − 1 (see (60)) imply

 ‖F(uδ(t₀)) − fδ‖ ≤ M₁a²(t₀)/λ + Cδ ≤ (M₁/λ)·δ(C − 1)/‖y‖ + Cδ ≤ (2C − 1)δ = C₁δ.  (71)

We have used the inequality

 a²(t₀) ≤ a(t₀) = δ(C − 1)/‖y‖,

which is true if δ is sufficiently small, or, equivalently, if t₀ is sufficiently large. Thus, if

 ‖F(uδ(0)) − fδ‖ ≥ C₁δ^ζ,  0 < ζ ≤ 1,

then there exists tδ ∈ (0, t₀) such that

 ‖F(uδ(tδ)) − fδ‖ = C₁δ^ζ  (72)

for any given ζ ∈ (0, 1] and any fixed C₁ > 1.

Let us prove (59). If this is done, then Theorem 17 is proved. First, we prove that lim_{δ→0} δ/a(tδ) = 0. From (69) with t = tδ, and from (17), one gets

 C₁δ^ζ ≤ M₁a²(tδ)/λ + a(tδ)‖Vδ(tδ)‖ ≤ M₁a²(tδ)/λ + ‖y‖a(tδ) + δ.

Thus, for sufficiently small δ, one gets

 C̃δ^ζ ≤ a(tδ)(M₁a(0)/λ + ‖y‖),  C̃ > 0,

where C̃ < C₁ is a constant. Therefore,

 lim_{δ→0} δ/a(tδ) ≤ lim_{δ→0} (δ^{1−ζ}/C̃)(M₁a(0)/λ + ‖y‖) = 0,  0 < ζ < 1.  (73)
Secondly, we prove that

 lim_{δ→0} tδ = ∞.  (74)

Using (57), one obtains:

 (d/dt)(F(uδ) + auδ − fδ) = Aₐu̇δ + ȧuδ = −AₐA*ₐ(F(uδ) + auδ − fδ) + ȧuδ.

This and (8) imply:

 (d/dt)[F(uδ) − F(Vδ) + a(uδ − Vδ)] = −AₐA*ₐ[F(uδ) − F(Vδ) + a(uδ − Vδ)] + ȧuδ.  (75)

Denote

 v := F(uδ) − F(Vδ) + a(uδ − Vδ),  h := ‖v‖.

Multiplying (75) by v and using the monotonicity of F, one obtains

 hḣ = −⟨AₐA*ₐv, v⟩ + ⟨v, ȧ(uδ − Vδ)⟩ + ȧ⟨v, Vδ⟩
  ≤ −h²a² + h|ȧ|‖uδ − Vδ‖ + |ȧ|h‖Vδ‖,  h ≥ 0.  (76)

Again, we have used the inequality AₐA*ₐ ≥ a², which holds for A ≥ 0, i.e., for monotone operators F. Thus,

 ḣ ≤ −ha² + |ȧ|‖uδ − Vδ‖ + |ȧ|‖Vδ‖.  (77)

Since ⟨F(uδ) − F(Vδ), uδ − Vδ⟩ ≥ 0, one obtains the two inequalities

 a‖uδ − Vδ‖² ≤ ⟨v, uδ − Vδ⟩ ≤ ‖uδ − Vδ‖h  (78)

and

 ‖F(uδ) − F(Vδ)‖² ≤ ⟨v, F(uδ) − F(Vδ)⟩ ≤ h‖F(uδ) − F(Vδ)‖.  (79)

Inequalities (78) and (79) imply:

 a‖uδ − Vδ‖ ≤ h,  ‖F(uδ) − F(Vδ)‖ ≤ h.  (80)

Inequalities (77) and (80) imply

 ḣ ≤ −h(a² − |ȧ|/a) + |ȧ|‖Vδ‖.  (81)

Since a² − |ȧ|/a ≥ 3a²/4 > a²/2 by the last inequality in (56), it follows from inequality (81) that

 ḣ ≤ −(a²/2)h + |ȧ|‖Vδ‖.  (82)

Inequality (82) implies:

 h(t) ≤ h(0)e^{−∫₀ᵗ (a²(s)/2) ds} + e^{−∫₀ᵗ (a²(s)/2) ds} ∫₀ᵗ e^{∫₀ˢ (a²(ξ)/2) dξ} |ȧ(s)|‖Vδ(s)‖ ds.  (83)
Denote

 ϕ(t) := ∫₀ᵗ (a²(s)/2) ds.

From (83) and (80), one gets

 ‖F(uδ(t)) − F(Vδ(t))‖ ≤ h(0)e^{−ϕ(t)} + e^{−ϕ(t)} ∫₀ᵗ e^{ϕ(s)} |ȧ(s)|‖Vδ(s)‖ ds.  (84)

Therefore,

 ‖F(uδ(t)) − fδ‖ ≥ ‖F(Vδ(t)) − fδ‖ − ‖F(Vδ(t)) − F(uδ(t))‖
  ≥ a(t)‖Vδ(t)‖ − h(0)e^{−ϕ(t)} − e^{−ϕ(t)} ∫₀ᵗ e^{ϕ(s)} |ȧ|‖Vδ‖ ds.  (85)

From Lemma 9 it follows that there exists an a(t) such that

 (1/2)a(t)‖Vδ(t)‖ ≥ e^{−ϕ(t)} ∫₀ᵗ e^{ϕ(s)} |ȧ|‖Vδ(s)‖ ds.  (86)

For example, one can choose

 a(t) = c₁/(c+t)^b,  b ∈ (0, 1/4],  c₁²c^{1−2b} ≥ 6b,  (87)

where c₁, c > 0. Moreover, one can always choose u₀ such that

 h(0) = ‖F(u₀) + a(0)u₀ − fδ‖ ≤ (1/4)a(0)‖Vδ(0)‖,  (88)

because the equation F(u₀) + a(0)u₀ − fδ = 0 is solvable. If (88) holds, then

 h(0)e^{−ϕ(t)} ≤ (1/4)a(0)‖Vδ(0)‖e^{−ϕ(t)},  t ≥ 0.  (89)

If (87) holds, c ≥ 1 and 2b ≤ c₁², then it follows that

 e^{−ϕ(t)}a(0) ≤ a(t).  (90)

Indeed, the inequality a(0) ≤ a(t)e^{ϕ(t)} is obviously true for t = 0, and (d/dt)(a(t)e^{ϕ(t)}) ≥ 0, provided that c ≥ 1 and 2b ≤ c₁². Inequalities (89) and (90) imply

 e^{−ϕ(t)}h(0) ≤ (1/4)a(t)‖Vδ(0)‖ ≤ (1/4)a(t)‖Vδ(t)‖,  t ≥ 0,  (91)

where we have used the inequality ‖Vδ(t)‖ ≤ ‖Vδ(t′)‖ for t ≤ t′, established in Lemma 3. From (72) and (85)–(91), one gets

 C₁δ^ζ = ‖F(uδ(tδ)) − fδ‖ ≥ (1/4)a(tδ)‖Vδ(tδ)‖.
Thus,

 lim_{δ→0} a(tδ)‖Vδ(tδ)‖ ≤ lim_{δ→0} 4C₁δ^ζ = 0.

Since ‖Vδ(t)‖ is increasing, this implies lim_{δ→0} a(tδ) = 0. Since 0 < a(t) ↘ 0, it follows that (74) holds. From the triangle inequality and inequalities (68) and (15) one obtains:

 ‖uδ(tδ) − y‖ ≤ ‖uδ(tδ) − Vδ(tδ)‖ + ‖V(tδ) − Vδ(tδ)‖ + ‖V(tδ) − y‖
  ≤ a²(tδ)/λ + δ/a(tδ) + ‖V(tδ) − y‖.  (92)

From (73), (74), inequality (92) and Lemma 1, one obtains (59). Theorem 17 is proved.

3.2 An Iterative Scheme

Let V_{n,δ} solve the equation:

 F(V_{n,δ}) + aₙV_{n,δ} − fδ = 0.

Denote Vₙ := V_{n,δ}. Consider the following iterative scheme:

 uₙ₊₁ = uₙ − αₙA*ₙ[F(uₙ) + aₙuₙ − fδ],  Aₙ := F′(uₙ) + aₙI,  u₀ = u₀,  (93)

where u₀ is chosen so that inequality (55) holds, and {αₙ}_{n=1}^∞ is a positive sequence such that

 0 < α̃ ≤ αₙ ≤ 2/(aₙ² + (M₁ + aₙ)²),  ‖Aₙ‖ ≤ M₁ + aₙ.  (94)

It follows from this condition that

 ‖I − αₙA*ₙAₙ‖ = sup_{aₙ² ≤ λ ≤ (M₁+aₙ)²} |1 − αₙλ| ≤ 1 − αₙaₙ².  (95)
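Condition (94) and the contraction estimate (95) can be checked numerically in the selfadjoint case A ≥ 0; the dimension and the constants M₁, a below are arbitrary illustrations:

```python
import numpy as np

# Check ||I - alpha * Aa^T Aa|| <= 1 - alpha*a^2 (cf. (95)) for a random
# symmetric positive-semidefinite A with ||A|| = M1, Aa = A + a*I, and
# alpha = 2/(a^2 + (M1 + a)^2) as in (94).
rng = np.random.default_rng(1)
n, M1, a = 8, 3.0, 0.5
B = rng.standard_normal((n, n))
A = B @ B.T                        # symmetric positive semidefinite
A *= M1 / np.linalg.norm(A, 2)     # scale so the spectral norm is M1
Aa = A + a * np.eye(n)
alpha = 2.0 / (a**2 + (M1 + a)**2)

step = np.eye(n) - alpha * (Aa.T @ Aa)
assert np.linalg.norm(step, 2) <= 1 - alpha * a**2 + 1e-9
```

With this α the two extreme values |1 − αa²| and |1 − α(M₁ + a)²| coincide, which is why the upper bound in (94) is the natural step-size cap.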
Note that F′(uₙ) ≥ 0, since F is monotone. Let aₙ and λ satisfy conditions (40)–(44). Assume that the equation F(u) = f has a solution in B(u₀, R), possibly nonunique, and that y is the minimal-norm solution to this equation. Let f be unknown but fδ be given, ‖fδ − f‖ ≤ δ. We prove the following result:

Theorem 19 Assume aₙ = d/(c+n)^b, where c ≥ 1, 0 < b ≤ 1/4, and d is sufficiently large so that conditions (40)–(44) hold. Let uₙ be defined by (93). Assume that u₀ is chosen so that (55) holds. Then there exists a unique nδ such that

 ‖F(u_{nδ}) − fδ‖ ≤ C₁δ^ζ,  C₁δ^ζ < ‖F(uₙ) − fδ‖, ∀n < nδ,  (96)

where C₁ > 1, 0 < ζ ≤ 1.
Let 0 < (δₘ)_{m=1}^∞ be a sequence such that δₘ → 0. If the sequence {nₘ := n_{δₘ}}_{m=1}^∞ is bounded, and {n_{m_j}}_{j=1}^∞ is a convergent subsequence, then

 lim_{j→∞} u_{n_{m_j}} = ũ,  (97)

where ũ is a solution to the equation F(u) = f. If

 lim_{m→∞} nₘ = ∞,  (98)

where ζ ∈ (0, 1), then

 lim_{m→∞} ‖u_{nₘ} − y‖ = 0.  (99)
Proof Denote

 C := (C₁ + 1)/2.  (100)

Let

 zₙ := uₙ − Vₙ,  gₙ := ‖zₙ‖.

We use Taylor's formula and get:

 F(uₙ) − F(Vₙ) + aₙzₙ = Aₙzₙ + Kₙ,  ‖Kₙ‖ ≤ (M₂/2)‖zₙ‖²,  (101)

where Kₙ := F(uₙ) − F(Vₙ) − F′(uₙ)zₙ and M₂ is the constant from (3). From (93) and (101) one obtains

 zₙ₊₁ = zₙ − αₙA*ₙAₙzₙ − αₙA*ₙKₙ − (Vₙ₊₁ − Vₙ).  (102)

From (102), (101), (95), and the estimate ‖Aₙ‖ ≤ M₁ + aₙ, one gets

 gₙ₊₁ ≤ gₙ‖I − αₙA*ₙAₙ‖ + (αₙM₂(M₁ + aₙ)/2)gₙ² + ‖Vₙ₊₁ − Vₙ‖
  ≤ gₙ(1 − αₙaₙ²) + (αₙM₂(M₁ + aₙ)/2)gₙ² + ‖Vₙ₊₁ − Vₙ‖.  (103)

Since 0 < aₙ ↘ 0, for any fixed δ > 0 there exists n₀ such that

 δ/a_{n₀+1} > ‖y‖/(C − 1) ≥ δ/a_{n₀},  C > 1.  (104)

By (40), one has aₙ/aₙ₊₁ ≤ 2, ∀n ≥ 0. This and (104) imply

 2‖y‖/(C − 1) ≥ 2δ/a_{n₀} ≥ δ/a_{n₀+1} > ‖y‖/(C − 1) ≥ δ/a_{n₀},  C > 1.  (105)

Thus,

 2‖y‖/(C − 1) > δ/aₙ,  ∀n ≤ n₀ + 1.  (106)

The number n₀, satisfying (106), exists and is unique since aₙ > 0 monotonically decays to 0 as n → ∞. By Remark 5, there exists a number n₁ such that

 ‖F(V_{n₁+1}) − fδ‖ ≤ Cδ < ‖F(V_{n₁}) − fδ‖,  (107)

where Vₙ solves the equation F(Vₙ) + aₙVₙ − fδ = 0.
We claim that n₁ ∈ [0, n₀]. Indeed, one has ‖F(V_{n₁}) − fδ‖ = a_{n₁}‖V_{n₁}‖, and ‖V_{n₁}‖ ≤ ‖y‖ + δ/a_{n₁} (cf. (17)), so

 Cδ < a_{n₁}‖V_{n₁}‖ ≤ a_{n₁}(‖y‖ + δ/a_{n₁}) = a_{n₁}‖y‖ + δ,  C > 1.  (108)

Therefore,

 δ < a_{n₁}‖y‖/(C − 1).  (109)

Thus, by (105),

 δ/a_{n₁} < ‖y‖/(C − 1) < δ/a_{n₀+1}.  (110)

Here the last inequality is a consequence of (105). Since aₙ decreases monotonically, inequality (110) implies n₁ ≤ n₀. One has

 aₙ₊₁‖Vₙ − Vₙ₊₁‖² = ⟨(aₙ₊₁ − aₙ)Vₙ − F(Vₙ) + F(Vₙ₊₁), Vₙ − Vₙ₊₁⟩
  ≤ ⟨(aₙ₊₁ − aₙ)Vₙ, Vₙ − Vₙ₊₁⟩ ≤ (aₙ − aₙ₊₁)‖Vₙ‖‖Vₙ − Vₙ₊₁‖.  (111)

By (17), ‖Vₙ‖ ≤ ‖y‖ + δ/aₙ, and, by (106), δ/aₙ ≤ 2‖y‖/(C − 1) for all n ≤ n₀ + 1. Therefore,

 ‖Vₙ‖ ≤ ‖y‖(1 + 2/(C − 1)),  ∀n ≤ n₀ + 1,  (112)

and, by (111),

 ‖Vₙ − Vₙ₊₁‖ ≤ ((aₙ − aₙ₊₁)/aₙ₊₁)‖Vₙ‖ ≤ ((aₙ − aₙ₊₁)/aₙ₊₁)‖y‖(1 + 2/(C − 1)),  ∀n ≤ n₀ + 1.  (113)
Inequalities (103) and (113) imply

 gₙ₊₁ ≤ (1 − αₙaₙ²)gₙ + αₙc₀(M₁ + aₙ)gₙ² + ((aₙ − aₙ₊₁)/aₙ₊₁)c₁,  ∀n ≤ n₀ + 1,  (114)

where the constants c₀ and c₁ are defined in (67). By Lemma 12 and Remark 14, the sequence (aₙ)_{n=1}^∞ satisfies conditions (40)–(44), provided that a₀ is sufficiently large and λ > 0 is chosen so that (46) holds. Let us show by induction that

 gₙ < aₙ²/λ,  0 ≤ n ≤ n₀ + 1.  (115)

Inequality (115) holds for n = 0 by Remark 16. Suppose (115) holds for some n ≥ 0. From (114), (115) and (44), one gets

 gₙ₊₁ ≤ (1 − αₙaₙ²)(aₙ²/λ) + αₙc₀(M₁ + aₙ)(aₙ²/λ)² + ((aₙ − aₙ₊₁)/aₙ₊₁)c₁
  = aₙ²/λ − αₙ(aₙ⁴/λ)[1 − c₀(M₁ + aₙ)/λ] + ((aₙ − aₙ₊₁)/aₙ₊₁)c₁
  ≤ −αₙaₙ⁴/(2λ) + aₙ²/λ + ((aₙ − aₙ₊₁)/aₙ₊₁)c₁
  ≤ aₙ₊₁²/λ,  (116)

where the last step uses (44) and αₙ ≥ α̃.
Thus, by induction, inequality (115) holds for all n in the region 0 ≤ n ≤ n₀ + 1.

From (17) one has ‖Vₙ‖ ≤ ‖y‖ + δ/aₙ. This and the triangle inequality imply

 ‖u₀ − uₙ‖ ≤ ‖u₀‖ + ‖zₙ‖ + ‖Vₙ‖ ≤ ‖u₀‖ + ‖zₙ‖ + ‖y‖ + δ/aₙ.  (117)

Inequalities (112), (115), and (117) guarantee that the sequence uₙ, generated by the iterative process (93), remains in the ball B(u₀, R) for all n ≤ n₀ + 1, where R ≤ a₀²/λ + ‖u₀‖ + ‖y‖ + δ/aₙ. This inequality and the estimate (106) imply that the sequence uₙ, n ≤ n₀ + 1, stays in the ball B(u₀, R), where

 R ≤ a₀²/λ + ‖u₀‖ + ((C + 1)/(C − 1))‖y‖.  (118)

By Remark 15, one can choose a₀ and λ so that a₀²/λ is uniformly bounded as δ → 0, even if M₁(R) → ∞ as R → ∞ at an arbitrarily fast rate. Thus, the sequence uₙ stays in the ball B(u₀, R) for n ≤ n₀ + 1 when δ → 0. An upper bound on R is given above; it does not depend on δ as δ → 0. One has:

 ‖F(uₙ) − fδ‖ ≤ ‖F(uₙ) − F(Vₙ)‖ + ‖F(Vₙ) − fδ‖ ≤ M₁gₙ + ‖F(Vₙ) − fδ‖
  ≤ M₁aₙ²/λ + ‖F(Vₙ) − fδ‖,  ∀n ≤ n₀ + 1,  (119)

where (115) was used and M₁ is the constant from (3). Since ‖F(Vₙ) − fδ‖ is decreasing, by Lemma 3, and n₁ ≤ n₀, one gets

 ‖F(V_{n₀+1}) − fδ‖ ≤ ‖F(V_{n₁+1}) − fδ‖ ≤ Cδ.  (120)

From (42), (119), (120), the relation (104), and the definition C₁ = 2C − 1 (see (100)), one concludes that

 ‖F(u_{n₀+1}) − fδ‖ ≤ M₁a²_{n₀+1}/λ + Cδ ≤ (M₁/λ)·δ(C − 1)/‖y‖ + Cδ ≤ (2C − 1)δ = C₁δ.  (121)

Thus, if

 ‖F(u₀) − fδ‖ > C₁δ^ζ,  0 < ζ ≤ 1,

then one concludes from (121) that there exists nδ, 0 < nδ ≤ n₀ + 1, such that

 ‖F(u_{nδ}) − fδ‖ ≤ C₁δ^ζ < ‖F(uₙ) − fδ‖,  0 ≤ n < nδ,  (122)
for any given ζ ∈ (0, 1] and any fixed C₁ > 1.

Let us prove (97). If n > 0 is fixed, then u_{δ,n} is a continuous function of fδ. Denote

 ũ := ũ_N = lim_{δ→0} u_{δ,n_{m_j}},  (123)

where lim_{j→∞} n_{m_j} = N. From (123) and the continuity of F, one obtains:

 ‖F(ũ) − fδ‖ = lim_{j→∞} ‖F(u_{n_{m_j}}) − fδ‖ ≤ lim_{δ→0} C₁δ^ζ = 0.

Thus, ũ is a solution to the equation F(u) = f, and (97) is proved.

Let us prove (99) assuming that (98) holds. From (96) and (119) with n = nδ − 1, and from (122), one gets

 C₁δ^ζ ≤ M₁a²_{nδ−1}/λ + a_{nδ−1}‖V_{nδ−1}‖ ≤ M₁a²_{nδ−1}/λ + ‖y‖a_{nδ−1} + δ.

If δ > 0 is sufficiently small, then the above inequality implies

 C̃δ^ζ ≤ a_{nδ−1}(M₁a₀/λ + ‖y‖),  C̃ > 0,

where C̃ < C₁ is a constant, and the inequality a²_{nδ−1} ≤ a_{nδ−1}a₀ was used. Therefore, by (40),

 lim_{δ→0} δ/(2a_{nδ}) ≤ lim_{δ→0} δ/a_{nδ−1} ≤ lim_{δ→0} (δ^{1−ζ}/C̃)(M₁a₀/λ + ‖y‖) = 0,  0 < ζ < 1.  (124)

In particular, for δ = δₘ, one gets

 lim_{δₘ→0} δₘ/a_{nₘ} = 0.  (125)

From the triangle inequality and inequalities (15) and (115) one obtains:

 ‖u_{nₘ} − y‖ ≤ ‖u_{nₘ} − V_{nₘ}‖ + ‖V_{nₘ} − V_{nₘ,0}‖ + ‖V_{nₘ,0} − y‖
  ≤ a²_{nₘ}/λ + δₘ/a_{nₘ} + ‖V_{nₘ,0} − y‖.  (126)

From (98), (125), inequality (126) and Lemma 1, one obtains (99). Theorem 19 is proved.
4 Numerical Experiments

Let us do a numerical experiment solving the nonlinear equation (1) with

 F(u) := B(u) + u³/6 := ∫₀¹ e^{−|x−y|}u(y) dy + u³/6,  f(x) := 13/6 − e^{−x} − eˣ/e.  (127)
This equation is a model nonlinear equation in Wiener-type filtering theory, see [18]. One can check that $u(x) \equiv 1$ solves the equation $F(u) = f$. The operator $B$ is compact in $H = L^2[0,1]$. The operator $u \mapsto u^3$ is defined on a dense subset $D$ of $L^2[0,1]$, for example, on $D := C[0,1]$. If $u, v \in D$, then

$$\langle u^3 - v^3, u - v\rangle = \int_0^1 (u^3 - v^3)(u - v)\,dx \ge 0.$$

Moreover,

$$e^{-|x|} = \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{e^{i\lambda x}}{1+\lambda^2}\,d\lambda.$$

Therefore $\langle B(u-v), u-v\rangle \ge 0$, so

$$\langle F(u) - F(v), u - v\rangle \ge 0, \qquad \forall u, v \in D.$$

Note that $D$ does not contain subsets open in $H = L^2[0,1]$, i.e., it does not contain interior points of $H$. This is a reflection of the fact that the operator $G(u) = \frac{u^3}{6}$ is unbounded on any open subset of $H$. For example, in any ball $\|u\| \le C$, $C = \mathrm{const} > 0$, where $\|u\| := \|u\|_{L^2[0,1]}$, there is an element $u$ such that $\|u^3\| = \infty$. As such an element one can take, for example, $u(x) = c_1 x^{-b}$, $\frac13 < b < \frac12$, where $c_1 > 0$ is a constant chosen so that $\|u\| \le C$.

The operator $u \mapsto F(u) = G(u) + B(u)$ is maximal monotone on $D_F := \{u : u \in H,\ F(u) \in H\}$ (see [1, p. 102]), so that (8) is uniquely solvable for any $f_\delta \in H$. The Fréchet derivative of $F$ is

$$F'(u)h = \frac{u^2 h}{2} + \int_0^1 e^{-|x-y|}\,h(y)\,dy. \tag{128}$$

If $u(x)$ vanishes on a set of positive Lebesgue measure, then $F'(u)$ is obviously not boundedly invertible. If $u \in C[0,1]$ vanishes even at one point $x_0$, then $F'(u)$ is not boundedly invertible in $H$. Let us use the iterative process (93):

$$u_{n+1} = u_n - \alpha_n \big(F'(u_n)^* + a_n I\big)\big(F(u_n) + a_n u_n - f_\delta\big), \qquad u_0 = 0. \tag{129}$$
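As a quick sanity check, the two facts stated above, namely that $u \equiv 1$ solves $F(u) = f$ for the pair (127) and that (128) gives the derivative of $F$, can be verified numerically. The sketch below is our illustration, not the authors' code; it discretizes the integral operator by the trapezoidal rule (consistent with the quadrature described later in this section), with $N$, the test direction, and the step `eps` chosen arbitrarily.

```python
import numpy as np

# Verify numerically: (a) u(x) = 1 satisfies F(u) = f for F, f in (127);
# (b) formula (128) matches a finite-difference directional derivative.
N = 200
x = np.linspace(0.0, 1.0, N)
h = x[1] - x[0]
w = np.full(N, h); w[0] = w[-1] = h / 2           # trapezoidal weights
B = np.exp(-np.abs(x[:, None] - x[None, :])) * w  # discretized B(u)

def F(u):      return B @ u + u**3 / 6
def Fprime(u): return B + np.diag(u**2 / 2)       # discrete analogue of (128)

f = 13.0 / 6.0 - np.exp(-x) - np.exp(x) / np.e
err_sol = np.max(np.abs(F(np.ones(N)) - f))       # quadrature-size error only

u = 1.0 + np.sin(np.pi * x)                       # arbitrary test point
v = np.cos(2 * np.pi * x)                         # arbitrary direction
eps = 1e-6
err_der = np.max(np.abs((F(u + eps * v) - F(u)) / eps - Fprime(u) @ v))
print(err_sol, err_der)
```

Both printed errors are small: the first is the $O(h^2)$ trapezoidal quadrature error, the second the $O(\varepsilon)$ finite-difference error.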
We stop iterations at $n := n_\delta$ such that the following inequalities hold:

$$\|F(u_{n_\delta}) - f_\delta\| < C\delta^{\zeta}, \qquad \|F(u_n) - f_\delta\| \ge C\delta^{\zeta}, \quad n < n_\delta, \qquad C > 1,\ \zeta \in (0,1). \tag{130}$$
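A minimal sketch of iteration (129) with stopping rule (130) is given below. It is an illustrative reconstruction, not the authors' code: it assumes the discretization and parameter choices reported later in this section (trapezoidal quadrature, $u_0 = 0$, $\alpha_n = 1$, $C = 1.01$, $\zeta = 0.99$, $a_n = C_0\delta^{0.99}/(n+1)^{0.25}$ with $C_0 = 0.5$, relative noise level $0.01$), and the plain matrix transpose stands in for the adjoint $F'(u_n)^*$.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100
x = np.linspace(0.0, 1.0, N)
h = x[1] - x[0]
w = np.full(N, h); w[0] = w[-1] = h / 2             # trapezoidal weights
B = np.exp(-np.abs(x[:, None] - x[None, :])) * w    # discretized B(u)

def F(u):      return B @ u + u**3 / 6
def Fprime(u): return B + np.diag(u**2 / 2)         # discrete analogue of (128)
def nrm(v):    return np.sqrt(np.sum(w * v * v))    # discrete L^2[0,1] norm

f = 13.0 / 6.0 - np.exp(-x) - np.exp(x) / np.e      # data from (127); solution u = 1
delta_rel = 0.01
f_noise = rng.standard_normal(N)
kappa = delta_rel * nrm(f) / nrm(f_noise)
f_delta = f + kappa * f_noise
delta = kappa * nrm(f_noise)                        # noise level

C, zeta, C0 = 1.01, 0.99, 0.5
u = np.zeros(N)                                     # u_0 = 0
for n in range(3000):
    if nrm(F(u) - f_delta) < C * delta**zeta:
        break                                       # stopping rule (130)
    a_n = C0 * delta**0.99 / (n + 1)**0.25
    res = F(u) + a_n * u - f_delta
    u = u - (Fprime(u).T + a_n * np.eye(N)) @ res   # step (129), alpha_n = 1

err = nrm(u - 1.0)
print(err)
```

Each step costs only matrix-vector products, which is the $O(N^2)$ per-iteration cost discussed at the end of this section.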
Integrals of the form $\int_0^1 e^{-|x-y|}\,h(y)\,dy$ in (127) and (128) are computed by using the trapezoidal rule. The noisy function used in the test is

$$f_\delta(x) = f(x) + \kappa f_{\mathrm{noise}}(x), \qquad \kappa > 0,\ \kappa = \kappa(\delta).$$

The noise level $\delta$ and the relative noise level are determined by

$$\delta = \kappa \|f_{\mathrm{noise}}\|, \qquad \delta_{\mathrm{rel}} := \frac{\delta}{\|f\|}.$$
Dynamical Systems Gradient Method for Solving Nonlinear Equations
In the test, $\kappa$ is computed in such a way that the relative noise level $\delta_{\mathrm{rel}}$ equals some desired value, i.e.,

$$\kappa = \frac{\delta}{\|f_{\mathrm{noise}}\|} = \frac{\delta_{\mathrm{rel}}\,\|f\|}{\|f_{\mathrm{noise}}\|}.$$
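In code, this scaling of $\kappa$ can be sketched as follows (our illustration; the random seed and $N$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100
x = np.linspace(0.0, 1.0, N)
f = 13.0 / 6.0 - np.exp(-x) - np.exp(x) / np.e   # exact data from (127)
f_noise = rng.standard_normal(N)                 # mean 0, variance 1

delta_rel = 0.01                                 # prescribed relative noise level
kappa = delta_rel * np.linalg.norm(f) / np.linalg.norm(f_noise)
f_delta = f + kappa * f_noise                    # noisy data
delta = kappa * np.linalg.norm(f_noise)          # achieved noise level

print(delta / np.linalg.norm(f))                 # equals delta_rel by construction
```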
We have used the relative noise level as an input parameter in the test. The version of the DSM developed in this paper, denoted by DSMG, is compared with the version of the DSM in [3], denoted by DSMN. The DSMN is the following iterative scheme:

$$u_{n+1} = u_n - A_n^{-1}\big(F(u_n) + a_n u_n - f_\delta\big), \qquad u_0 = u_0, \quad n \ge 0, \tag{131}$$

where $a_n = \frac{a_0}{1+n}$. This iterative scheme is used with a stopping time $n_\delta$ defined by (96). The existence of this stopping time and the convergence of the method are proved in [3].

As we have proved, the DSMG converges when $a_n = \frac{a_0}{(1+n)^b}$, $b \in (0, \frac14]$, and $a_0$ is sufficiently large. However, in practice, if we choose $a_0$ too large then the method will use too many iterations before reaching the stopping time $n_\delta$ in (130). This means that the computation time will be large.
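The DSMN sweep (131) can be sketched on the same discretized problem as follows. This is our illustration, not the code from [3]: noise-free data and the value $a_0 = 0.1$ are ad-hoc choices for this demo. Each step inverts $A_n = F'(u_n) + a_n I$, a dense linear solve.

```python
import numpy as np

N = 100
x = np.linspace(0.0, 1.0, N)
h = x[1] - x[0]
w = np.full(N, h); w[0] = w[-1] = h / 2
B = np.exp(-np.abs(x[:, None] - x[None, :])) * w    # discretized B(u)

def F(u):      return B @ u + u**3 / 6
def Fprime(u): return B + np.diag(u**2 / 2)         # discrete analogue of (128)

f = 13.0 / 6.0 - np.exp(-x) - np.exp(x) / np.e      # noise-free data; solution u = 1
a0 = 0.1                                            # ad-hoc choice for this demo
u = np.zeros(N)
for n in range(15):
    a_n = a0 / (1 + n)                              # a_n = a_0/(1+n) as in (131)
    A = Fprime(u) + a_n * np.eye(N)
    u = u - np.linalg.solve(A, F(u) + a_n * u - f)  # O(N^3) Newton-type sweep

print(np.max(np.abs(u - 1.0)))
```

The regularizer $a_n I$ keeps $A_n$ invertible even where $u$ vanishes, which matters here because $F'(u)$ is not boundedly invertible at such points.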
Since

$$\|F(V_\delta) - f_\delta\| = a(t)\|V_\delta\| \quad \text{and} \quad \|V_\delta(t_\delta) - u_\delta(t_\delta)\| = O(a(t_\delta)),$$

we have

$$C\delta^{\zeta} = \|F(u_\delta(t_\delta)) - f_\delta\| \sim a(t_\delta).$$

Thus, we choose $a_0 = C_0 \delta^{\zeta}$, $C_0 > 0$.
The parameter $a_0$ used in the DSMN is also chosen by this formula. In all figures the x-axis represents the variable $x$; by DSMG we denote the numerical solutions obtained by the DSMG, by DSMN the solutions obtained by the DSMN, and by exact the exact solution. In experiments we found that the DSMG works well with $a_0 = C_0\delta^{\zeta}$, $C_0 \in [0.2, 1]$. Indeed, in the test the DSMG is implemented with $a_n := C_0 \frac{\delta^{0.99}}{(n+1)^{0.25}}$, $C_0 = 0.5$, while the DSMN is implemented with $a_n := C_0 \frac{\delta^{0.99}}{n+1}$, $C_0 = 1$. For $C_0 > 1$ the convergence rate of the DSMG is much slower, while the DSMN still works well if $C_0 \in [1, 4]$.

Figure 1 plots the solutions using relative noise levels $\delta_{\mathrm{rel}} = 0.01$ and $\delta_{\mathrm{rel}} = 0.001$. The exact solution used in these experiments is $u = 1$. In the test the DSMG is implemented with $C = 1.01$, $\zeta = 0.99$ and $\alpha_n = 1$, $\forall n \ge 0$. The numbers of iterations of the DSMG for $\delta_{\mathrm{rel}} = 0.01$ and $\delta_{\mathrm{rel}} = 0.001$ were 49 and 50, while the numbers of iterations of the DSMN were 9 and 9, respectively. The number of nodal points used in computing the integrals in (127) and (128) was $N = 100$. The noise function $f_{\mathrm{noise}}$ in this experiment is a vector with random entries, normally distributed with mean 0 and variance 1. Figure 1 shows that the solutions by the DSMN and the DSMG are nearly the same.

Figure 2 presents the numerical results when $N = 100$ with $\delta_{\mathrm{rel}} = 0.01$, $u(x) = \sin(2\pi x)$, $x \in [0,1]$ (left) and with $\delta_{\mathrm{rel}} = 0.01$, $u(x) = \sin(\pi x)$, $x \in [0,1]$ (right). In these cases, the DSMN took 11 and 7 iterations to give the numerical solutions, while the DSMG took 512 and 94 iterations for $u(x) = \sin(2\pi x)$ and $u(x) = \sin(\pi x)$, respectively. Figure 2 shows that the numerical results of the DSMG are better than those of the DSMN. Numerical experiments agree with the theory that the convergence rate of the DSMG is slower than that of the DSMN. It is because the rate of decay of the sequence $\{\frac{1}{(1+n)^{1/4}}\}_{n=1}^{\infty}$
Fig. 1 Plots of solutions obtained by the DSMN and DSMG when N = 100, u = 1, x ∈ [0, 1], δrel = 0.01 (left) and N = 100, u = 1, x ∈ [0, 1], δrel = 0.001 (right)
Fig. 2 Plots of solutions obtained by the DSMN and DSMG when N = 100, u(x) = sin(2π x), x ∈ [0, 1], δrel = 0.01 (left) and N = 100, u(x) = sin(π x), x ∈ [0, 1], δrel = 0.01 (right)
is much slower than that of the sequence $\{\frac{1}{1+n}\}_{n=1}^{\infty}$. However, if the costs of evaluating $F$ and $F'$ are not counted, then the computation cost of one iteration of the DSMG is $O(N^2)$, while that of one iteration of the DSMN is $O(N^3)$. Here $N$ is the dimension of $u$. Thus, for large-scale problems, the DSMG might be an alternative to the DSMN. Also, as shown in Fig. 2, the DSMG might yield solutions with better accuracy. Experiments show that the DSMN still works with $a_n = \frac{a_0}{(1+n)^b}$ for $\frac14 \le b \le 1$. So in practice one might use a faster decaying sequence $a_n$ to reduce the computation time. From the numerical results we conclude that the proposed stopping rule yields good results for this problem.
References

1. Deimling, K.: Nonlinear Functional Analysis. Springer, Berlin (1985)
2. Hoang, N.S., Ramm, A.G.: Solving ill-conditioned linear algebraic systems by the dynamical systems method. Inverse Probl. Sci. Eng. 16(N5), 617–630 (2008)
3. Hoang, N.S., Ramm, A.G.: An iterative scheme for solving nonlinear equations with monotone operators (2008, submitted)
4. Ivanov, V., Tanana, V., Vasin, V.: Theory of Ill-posed Problems. VSP, Utrecht (2002)
5. Lions, J.L.: Quelques Méthodes de Résolution des Problèmes aux Limites non Linéaires. Dunod, Gauthier-Villars, Paris (1969)
6. Morozov, V.A.: Methods of Solving Incorrectly Posed Problems. Springer, New York (1984)
7. Pascali, D., Sburlan, S.: Nonlinear Mappings of Monotone Type. Noordhoff, Leyden (1978)
8. Ramm, A.G.: Theory and Applications of Some New Classes of Integral Equations. Springer, New York (1980)
9. Ramm, A.G.: Stationary regimes in passive nonlinear networks. In: Uslenghi, P. (ed.) Nonlinear Electromagnetics, pp. 263–302. Acad. Press, New York (1980)
10. Ramm, A.G.: Dynamical Systems Method for Solving Operator Equations. Elsevier, Amsterdam (2007)
11. Ramm, A.G.: Global convergence for ill-posed equations with monotone operators: the dynamical systems method. J. Phys. A 36, L249–L254 (2003)
12. Ramm, A.G.: Dynamical systems method for solving nonlinear operator equations. Int. J. Appl. Math. Sci. 1(N1), 97–110 (2004)
13. Ramm, A.G.: Dynamical systems method for solving operator equations. Commun. Nonlinear Sci. Numer. Simul. 9(N2), 383–402 (2004)
14. Ramm, A.G.: DSM for ill-posed equations with monotone operators. Commun. Nonlinear Sci. Numer. Simul. 10(N8), 935–940 (2005)
15. Ramm, A.G.: Discrepancy principle for the dynamical systems method. I, II. Commun. Nonlinear Sci. Numer. Simul. 10, 95–101 (2005); 13, 1256–1263 (2008)
16. Ramm, A.G.: Dynamical systems method (DSM) and nonlinear problems. In: Lopez-Gomez, J. (ed.) Spectral Theory and Nonlinear Analysis, pp. 201–228. World Scientific, Singapore (2005)
17. Ramm, A.G.: Dynamical systems method (DSM) for unbounded operators. Proc. Am. Math. Soc. 134(N4), 1059–1063 (2006)
18. Ramm, A.G.: Random Fields Estimation. World Sci., Singapore (2005)
19. Ramm, A.G.: Iterative solution of linear equations with unbounded operators. J. Math. Anal. Appl. 1338–1346
20. Ramm, A.G.: On unbounded operators and applications. Appl. Math. Lett. 21, 377–382 (2008)
21. Skrypnik, I.V.: Methods for Analysis of Nonlinear Elliptic Boundary Value Problems. American Mathematical Society, Providence (1994)
22. Tautenhahn, U.: On the asymptotical regularization method for nonlinear ill-posed problems. Inverse Probl. 10, 1405–1418 (1994)
23. Tautenhahn, U.: On the method of Lavrentiev regularization for nonlinear ill-posed problems. Inverse Probl. 18, 191–207 (2002)
24. Vainberg, M.M.: Variational Methods and Method of Monotone Operators in the Theory of Nonlinear Equations. Wiley, London (1973)
25. Zeidler, E.: Nonlinear Functional Analysis. Springer, New York (1985)