Applied Numerical Mathematics 61 (2011) 1–23
A finite volume spectral element method for solving magnetohydrodynamic (MHD) equations
Fatemeh Shakeri, Mehdi Dehghan ∗
Department of Applied Mathematics, Faculty of Mathematics and Computer Science, Amirkabir University of Technology, No. 424, Hafez Avenue, Tehran 15914, Iran
Article info
Article history: Received 14 July 2009; received in revised form 10 May 2010; accepted 26 July 2010; available online 1 August 2010
Keywords: Unsteady magnetohydrodynamic equations; Finite volume method; Spectral element method; Hermite interpolation; Rectangular mesh; Triangular mesh
Abstract
In this paper, the coupled equations in velocity and magnetic field for unsteady magnetohydrodynamic (MHD) flow through a pipe of rectangular section are solved using a combined finite volume and spectral element technique, improved by means of Hermite interpolation. The transverse applied magnetic field may have an arbitrary orientation relative to the section of the pipe. The velocity and induced magnetic field are studied for various values of the Hartmann number, wall conductivity and orientation of the applied magnetic field. Comparisons with the exact solution and with some other numerical methods are made in the special cases where the exact solution exists. The numerical results for these sample problems compare very well with the analytical results. © 2010 IMACS. Published by Elsevier B.V. All rights reserved.
1. Introduction
Magnetohydrodynamics (MHD) is the study of the interaction of electrically conducting fluids and electromagnetic forces. The field of MHD was initiated by the Swedish physicist Hannes Alfvén, for which he received the Nobel Prize in 1970 [1]. The official birth of incompressible fluid magnetohydrodynamics is 1936–1937. In 1937, Hartmann and Lazarus [37] studied the influence of a transverse uniform magnetic field on the flow of a viscous incompressible electrically conducting fluid between two infinite parallel stationary and insulating plates. The most appropriate name for the phenomena would be MagnetoFluidMechanics, but the original name Magnetohydrodynamics is still generally used. The study of MHD developed especially after 1950, when plasma could be reproduced in the laboratory. Smith [81] obtained upper and lower bounds for the mass flow rate for the steady flow of a conducting liquid along a pipe of arbitrary cross section under the influence of a uniform transverse magnetic field. Walker [91] and Holroyd [39] also made substantial contributions to this field. Branover and Gershon [11] reported experimental work on flows at high Hartmann number and interaction parameter. In 1979, Holroyd [40] presented the results of an experimental investigation of the flow of mercury along circular and rectangular non-conducting ducts in a non-uniform magnetic field at high Hartmann number. This author [41] also carried out a theoretical and experimental study on the flow of a liquid metal along a straight rectangular duct, whose pairs of opposite walls are highly conducting and insulating, situated in a planar non-uniform magnetic field parallel to the conducting walls. Magnitudes of the flux density and mean velocity were taken to be such that the Hartmann number and interaction parameter had very large values and the magnetic Reynolds number was extremely small.
MHD problems arise in a wide variety of situations ranging from the explanation of the origin of Earth’s magnetic field and the prediction of space weather to the damping of turbulent fluctuations in semiconductor melts during crystal growth
* Corresponding author. E-mail addresses: [email protected] (F. Shakeri), [email protected], [email protected] (M. Dehghan).
0168-9274/$30.00 © 2010 IMACS. Published by Elsevier B.V. All rights reserved. doi:10.1016/j.apnum.2010.07.010
F. Shakeri, M. Dehghan / Applied Numerical Mathematics 61 (2011) 1–23
and even the measurement of the flow rates of beverages in the food industry. The description of MHD flows involves both the equations of fluid dynamics, the Navier–Stokes equations, and the equations of electrodynamics, Maxwell’s equations, which are mutually coupled through the Lorentz force and Ohm’s law for moving electrical conductors [21]. Magnetohydrodynamic channel flows have considerable theoretical and practical importance because of their widespread applications in designing cooling systems with liquid metals, MHD generators, accelerators, nuclear reactors, blood flow measurements, pumps and flow-meters. Due to the coupling of the equations of fluid mechanics and electrodynamics, the equations governing MHD flows are rather cumbersome, and exact solutions are therefore available only for some simple geometries subject to simple boundary conditions [17,27,34,37,48,72]. Consequently, numerical techniques have been used to obtain approximate solutions of MHD flow problems. Singh and Lal [74–78] obtained numerical solutions of MHD flows through pipes of various cross-sections for Hartmann numbers less than 10 using the finite difference and finite element methods. Yagawa and Masuda [96] used an incremental finite element technique to analyze MHD flow and its application to the liquid lithium blanket design of a fusion reactor. Later, Winowich and W.F. Hughes [93] analyzed a DC electromagnetic pump with insulating side walls using the finite element method; they also solved the complete Navier–Stokes equations with a non-uniform applied magnetic field. The finite difference method was employed by Ramos and Winowich [66] to simulate an electromagnetic pump. These authors [67] also applied finite element methodology to study MHD channel flow fields as a function of the Reynolds number, electrode length and the wall conductivity.
It was shown that the axial velocity profile is distorted into M-shapes by the applied electromagnetic field and that the distortion increases as the Reynolds number and the electrode length increase. Hua and Walker [42] investigated the steady, laminar flow of a liquid metal in a circular duct with an electrically insulating wall in the presence of a strong transverse non-uniform field. The interaction parameter and Hartmann number were assumed to be large, whereas the magnetic Reynolds number was assumed to be small. Sterl [82] used finite difference codes to investigate the influence of the Hartmann number, interaction parameter, wall conductance ratio and changing magnetic field on the flow. Sezgin and Köksal [87] used the finite element method with linear and quadratic elements for moderate Hartmann numbers (up to 100). Meir [54] dealt with magnetohydrodynamic flows in pipes with arbitrary cross-section and arbitrary wall conductivities under the influence of a transverse magnetic field. Existence, uniqueness and finite element approximation of solutions to the equations of steady-state magnetohydrodynamics with arbitrary boundary conditions, which describe this phenomenon, were investigated in [54]. Seungsoo and Dulikravich [71] proposed a finite difference scheme [24,25] for three-dimensional unsteady MHD flow together with the temperature field. They used an explicit Runge–Kutta method for step-by-step computations in time. M. Hughes, Pericleous and Cross [45] considered some models for three interesting magnetohydrodynamic flows, whereby predicted numerical solutions were compared with their analytical counterparts. The mathematical formulations adopted and the analytical solutions obtained were described together with the results of the computations. These authors also [46] used two-dimensional models under an externally imposed permanent magnetic field to simulate the MHD flow at macroscopic scale.
Later, Suwon Cho [83] studied several analytical solutions using a two-dimensional magnetic field analysis based on an equivalent current sheet model. These theoretical studies and numerical simulations predicted serious problems linked to the transformation of the velocity profile yielding the electromagnetic driving force in the opposite direction to the fluid motion, especially under turbulent flow conditions. In [33], bi-cubic B-spline finite elements, which are memory consuming, were used for solving the MHD equations. In [26], an analytic finite element method was proposed for solving the governing equations of steady magnetohydrodynamic duct flows in the range of low and moderate Hartmann numbers M < 1000. Meir and Schmidt [55] considered the steady flow of a conducting fluid, confined to a bounded region of space and driven by a combination of body forces, externally generated magnetic fields, and currents entering and leaving the fluid through electrodes attached to the surface. By means of the Biot–Savart law, they reduced the problem to a system of integrodifferential equations in the fluid region, derived a mixed variational formulation, proved its well-posedness under a small-data assumption, and then studied the finite element approximation of solutions. Sheu and Lin [73] presented a convection-diffusion-reaction model for solving unsteady MHD flow, applying a finite difference method on non-staggered grids with a transport scheme in each ADI (predictor–corrector) spatial sweep. Sezgin and Han Aydın [86] used the fundamental solution of the Laplace equation in the dual reciprocity boundary element method solution of uncoupled MHD equations and approximated convective terms with osculatory functions. Barrett [4] obtained solutions for high values of the Hartmann number by using the finite element method with a very fine mesh within the Hartmann layers, which is computationally very expensive in both time and memory.
The authors of [52] presented an interpolatory formulation for moving-least-squares approximants to study the two-dimensional magnetohydrodynamic flow problem, which allows the direct introduction of boundary conditions, reducing the processing time and improving the condition numbers. In [85], the magnetohydrodynamic flow of an incompressible, viscous, electrically conducting fluid in a rectangular duct, with an external magnetic field applied transverse to the flow, was investigated, and an analytical solution was developed for the velocity field and magnetic field by reducing the problem to the solution of a Fredholm integral equation of the second kind. Neslitürk and Sezgin [58,59] solved MHD flow equations in rectangular ducts by using a stabilized finite element method with residual free bubble functions for M < 1000 and general wall conductivities. The authors of [35] introduced a finite element technique for solving the Maxwell equations in the MHD limit in heterogeneous domains. The authors of [69] presented a nonlinearly implicit, conservative numerical method for integration of the single-fluid resistive MHD equations. In [98], the space–time conservation element and solution element method was applied to solve the ideal MHD equations with special emphasis on satisfying the divergence-free constraint of the magnetic field. Han and Tang [36] presented an adaptive moving mesh algorithm for the two-dimensional ideal magnetohydrodynamics equations that utilizes a staggered constrained transport technique to keep the magnetic field divergence-free. Mignone [56] proposed a new approximate Riemann solver for the equations of magnetohydrodynamics with an isothermal equation of state. In [10], the solution of magnetohydrodynamic duct flow problems with arbitrary wall conductivity was given using the boundary element method based on the time-dependent fundamental solution. Dehghan and Mirzaei [22] proposed a meshless local Petrov–Galerkin method to obtain the numerical solution of the coupled unsteady MHD flow through a pipe of rectangular section having arbitrary conducting walls. These authors also [23] applied the meshless local boundary integral equation method to solve the coupled unsteady MHD flow with non-conducting walls. In [32], finite volume schemes for the equations of ideal magnetohydrodynamics, based on splitting these equations into a fluid part and a magnetic induction part, were designed. For more studies on MHD, the interested reader is referred to [6–9,43,44,46,49,60,64,65,68,70,84,97].
This paper presents an investigation in which the spectral element method and the finite volume technique are combined for solving MHD equations. The finite volume method has a long history as a class of important numerical tools for solving differential equations. This method, which was investigated in the early literature [88,89] and was known as the integral finite difference method, has proved particularly popular for simulating a wide range of important applications and physical processes such as fluid mechanics, meteorology, electromagnetics, semi-conductor device simulation, mass transfer or petroleum engineering and numerous models of biological processes [2,3,5,15,16,28,38,53,61,80,92,94,99].
Finite volume approximations rely on the local conservation property expressed by the differential equation and inherit the physical conservation laws of the original problems locally; hence this method can be expected to simulate the corresponding physical phenomena effectively [31,51,90]. For many physical and engineering applications, such as heat transfer, transport phenomena and flows in porous media, this numerical conservation property is crucial. The finite volume element (FVE) method [12,13,18,29,30,47,95] is a combination of the standard finite volume method and the standard Galerkin finite element method. In the finite volume element method, the governing partial differential equation is posed as a local conservation law on a set of control or finite volumes that partition the problem domain. The unknown FVE solution and known data are systematically discretized by a $C^0$ piecewise polynomial trial space that is defined on a finite element triangulation [18]. The FVE method can be viewed either as a systematic generalization of the finite volume method or as a locally conservative (Petrov–Galerkin) variant of the standard finite element method. In this study, for the purpose of increasing the accuracy of the solution, we combine the spectral element method with the finite volume technique in a manner similar to the FVE method. The spectral element method [62] is a high order weighted residual technique which combines the geometrical flexibility of the finite element method and the high accuracy of the spectral method. The spectral element method provides exponential convergence for problems with smooth solutions. The structure of the remainder of this paper is as follows: In Section 2, the governing equations under consideration are described. Section 3 presents a brief description of the general triangulation and volumization of the problem domain for applying the proposed method.
In Section 4, the applied procedure is discussed, and a time stepping method and the resulting linear system are outlined in Section 5. Section 6 reports some numerical results and comparisons to show the efficiency and applicability of the new method for the current problem. A conclusion is drawn in Section 7.
2. Mathematical formulation of MHD equations
The equations governing the motion of a conducting fluid in a straight pipe of uniform cross-section in the presence of a transverse magnetic field are [79]
\[ \rho \frac{\partial V_Z}{\partial T} - K F(T) = \eta \nabla^2 V_Z + \frac{B_0}{\mu_0}\left( \cos\theta\, \frac{\partial B_Z}{\partial X} + \sin\theta\, \frac{\partial B_Z}{\partial Y} \right), \tag{2.1} \]
\[ \frac{\partial B_Z}{\partial T} = \frac{1}{\mu_0 \sigma} \nabla^2 B_Z + B_0 \left( \cos\theta\, \frac{\partial V_Z}{\partial X} + \sin\theta\, \frac{\partial V_Z}{\partial Y} \right), \tag{2.2} \]
where
$\rho$, $\eta$, $\sigma$: density, viscosity and conductivity of the fluid,
$\mu_0$: a constant $= 4\pi \times 10^{-7}$ in the MKS system,
$X$, $Y$, $Z$: coordinates of a point with the $Z$-axis along the axis of the pipe and the $X$-axis parallel to the applied magnetic field,
$T$: time variable,
$\theta$: orientation of the applied magnetic field with the $X$-axis,
$B_0$: applied magnetic field,
$V_Z$, $B_Z$: axial velocity and induced magnetic field,
$-K F(T)$: pressure gradient,
$\nabla^2$: the two-dimensional Laplacian operator.
Fig. 1. A rectangular duct flow with general BC.
The boundary conditions (BC) depend strongly on the material of the bounding walls. The general BC which are suitable in engineering practice can be formulated as [17]
\[ V_Z = 0, \qquad \frac{\partial B_Z}{\partial N} + \frac{\sigma}{\sigma' h'}\, B_Z = 0, \tag{2.3} \]
where $N$ is the outward normal to the boundary of the domain, and $\sigma'$ and $h'$ are the wall conductance and the small wall thickness, respectively. The initial conditions depend upon how the motion starts initially. If it is assumed that initially the fluid is at rest and the motion starts by applying the constant pressure gradient, the initial conditions become
\[ V_Z(X, Y, 0) = 0, \qquad B_Z(X, Y, 0) = 0. \tag{2.4} \]
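Introducing the non-dimensional variables and parameters by
\[ V = \frac{V_Z}{V_0}, \quad V_0 = \frac{K a^2}{\eta}, \quad B = \frac{B_Z}{V_0 \mu_0 \sqrt{\sigma \eta}}, \quad x = \frac{X}{a}, \quad y = \frac{Y}{a}, \quad t = \frac{T V_0}{a}, \]
\[ R = \frac{\rho a V_0}{\eta}, \quad M^2 = \frac{B_0^2 a^2 \sigma}{\eta}, \quad R_m = V_0 a \mu_0 \sigma, \quad \lambda = \frac{\sigma a}{\sigma' h'}, \quad f(t) = F\!\left( \frac{a t}{V_0} \right), \]
the governing equations are reduced to
\[ R \frac{\partial V}{\partial t} - f(t) = \nabla^2 V + M \left( \cos\theta\, \frac{\partial B}{\partial x} + \sin\theta\, \frac{\partial B}{\partial y} \right), \tag{2.5} \]
\[ R_m \frac{\partial B}{\partial t} = \nabla^2 B + M \left( \cos\theta\, \frac{\partial V}{\partial x} + \sin\theta\, \frac{\partial V}{\partial y} \right), \tag{2.6} \]
in $\Omega \times [0, \infty)$ with boundary conditions
\[ V = 0 \quad \text{on } \partial\Omega, \tag{2.7} \]
\[ \frac{\partial B}{\partial n} + \lambda B = 0 \quad \text{on } \partial\Omega, \tag{2.8} \]
and initial conditions
\[ V(x, y, 0) = 0, \quad (x, y) \in \Omega, \tag{2.9} \]
\[ B(x, y, 0) = 0, \quad (x, y) \in \Omega, \tag{2.10} \]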
where $\Omega$ represents the section of the pipe in non-dimensional form with $\partial\Omega$ as its boundary. The parameters $M$, $R$ and $R_m$ are called the Hartmann number, Reynolds number and magnetic Reynolds number, respectively. In Fig. 1, a rectangular duct flow with the introduced BC is presented. In the limiting cases of perfectly insulating (wall conductance $\sigma' = 0$, $\lambda = \infty$) and perfectly conducting ($\sigma' = \infty$, $\lambda = 0$) walls, the BC become $V = B = 0$ and $V = \partial B / \partial n = 0$, respectively.
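As an illustrative sketch of the non-dimensionalization above, the dimensionless groups can be evaluated from the physical data as follows. The function name `mhd_dimensionless_groups`, its argument names and the use of SI units are assumptions made for this example only; the wall conductance and thickness are passed as `sigma_w` and `h_w`.

```python
import math

def mhd_dimensionless_groups(rho, eta, sigma, a, K, B0, sigma_w, h_w):
    """Evaluate V0, R, M, Rm and lambda from physical parameters.

    Hypothetical helper: argument names are illustrative. SI units are assumed.
    sigma_w and h_w denote the wall conductance and wall thickness.
    """
    mu0 = 4.0e-7 * math.pi                 # magnetic constant in the MKS system
    V0 = K * a**2 / eta                    # characteristic velocity
    R = rho * a * V0 / eta                 # Reynolds number
    M = B0 * a * math.sqrt(sigma / eta)    # Hartmann number (M^2 = B0^2 a^2 sigma / eta)
    Rm = V0 * a * mu0 * sigma              # magnetic Reynolds number
    # Perfectly insulating walls (sigma_w = 0) correspond to lambda = infinity.
    lam = math.inf if sigma_w == 0 else sigma * a / (sigma_w * h_w)
    return {"V0": V0, "R": R, "M": M, "Rm": Rm, "lambda": lam}

# Example call with made-up unit values and insulating walls.
groups = mhd_dimensionless_groups(rho=1.0, eta=1.0, sigma=1.0, a=1.0,
                                  K=1.0, B0=5.0, sigma_w=0.0, h_w=0.01)
```

Passing `sigma_w=math.inf` instead yields `lambda = 0`, the perfectly conducting limit.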
Fig. 2. The control volume corresponding to the point z (the colored polygon).
3. General triangulation and volumization
In order to introduce the fundamental concepts of the proposed method, let $\mathcal{T}_h$ denote a triangulation of the domain $\Omega$, that is, a partitioning of $\Omega$ into triangular or rectangular elements, assumed to be regular in the usual sense [20,50]. The mesh parameter $h$ is the maximum of the diameters $h_k$ over all elements, and $\mathcal{T}_h^p$ is the $p$-fold refinement of $\mathcal{T}_h$ that connects the interpolation (nodal) points defined on each element in $\mathcal{T}_h$ in the manner of [57]. Let $\mathcal{M}_h = \{ m_i : i \in I \}$ and $\mathcal{M}_h^p = \{ n_j : j \in J \}$ be the sets of all nodal points of $\mathcal{T}_h$ and $\mathcal{T}_h^p$, respectively, and let $\mathcal{M}_h^p(T) = \{ n_j : j \in J,\ n_j \in T \}$ be the vertices of $T \in \mathcal{T}_h^p$, where $I$ and $J$ are suitable index sets.
Now, we can construct a dual partition $\mathcal{D}_h^p = \{ D_z : z \in \mathcal{M}_h^p \}$ corresponding to the triangulation $\mathcal{T}_h^p$ in the following way [18,19]. Let $z_T$ be the circumcenter or barycenter of $T \in \mathcal{T}_h^p$. We connect $z_T$ with line segments to the midpoints of the edges of $T$, thus partitioning $T$ into quadrilaterals $K_z$, $z \in \mathcal{M}_h^p(T)$. Then with each vertex $z \in \mathcal{M}_h^p$ we associate a control volume $D_z$, which is a closed polygon and consists of the union of the subregions $K_z$ sharing the vertex $z$ (Fig. 2), so there is a one-to-one correspondence between the interpolation points and the finite volumes.
4. Methodology fundamentals
The finite volume-spectral element (FVSE) method for solving MHD equations is based on the fact that these equations arise from conservation laws. Finite volume approximations rely on the local conservation property expressed by the differential equation. The FVSE method is based on the Petrov–Galerkin formulation in which the solution space consists of continuous piecewise polynomial functions over the primal partition, $\mathcal{T}_h$, and the test space consists of piecewise constant functions over the dual partition, $\mathcal{D}_h^p$. The test space essentially ensures the local conservation property of the method. The approximate spaces we use are
\[ \mathcal{V}_h := \{ v \in C^0(\Omega) : v|_T \in P_p(T),\ \forall T \in \mathcal{T}_h \}, \qquad \mathcal{W}_h^p := \{ w \in L^2(\Omega) : w|_{D_z} \in P_0(D_z),\ \forall D_z \in \mathcal{D}_h^p \}, \]
where $P_p(K)$ is the set of polynomials of degree $p$ within each spectral element or control volume $K$, and in the cases of rectangular and triangular partitions is specified as
\[ P_p(K) = \operatorname{span}\{ x^i y^j : 0 \le i, j \le p,\ (x, y) \in K \}, \tag{4.1} \]
and
\[ P_p(K) = \operatorname{span}\{ x^i y^j : 0 \le i + j \le p,\ (x, y) \in K \}, \tag{4.2} \]
respectively. Obviously, $\mathcal{W}_h^p = \operatorname{span}\{ \chi_z : z \in \mathcal{M}_h^p \}$, where $\chi_z$ is the characteristic function of the volume $D_z$ defined by
\[ \chi_z(x, y) = \begin{cases} 1, & \text{if } (x, y) \in D_z, \\ 0, & \text{otherwise}, \end{cases} \]
and $\mathcal{V}_h = \operatorname{span}\{ \psi_z : z \in \mathcal{M}_h^p \}$, where $\psi_z$ is the nodal basis function associated with the node $z$, which will be defined later in this section. Given an interpolation node $z \in \mathcal{M}_h^p$, integrating (2.5) and (2.6) over the associated control volume $D_z$ and using Green’s formula, we obtain
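\[ R \int_{D_z} \frac{\partial V}{\partial t}\, dD = \int_{\partial D_z} \nabla V \cdot n\, d\Gamma + \int_{D_z} \left[ M \left( \cos\theta\, \frac{\partial B}{\partial x} + \sin\theta\, \frac{\partial B}{\partial y} \right) + f(t) \right] dD, \tag{4.3} \]
\[ R_m \int_{D_z} \frac{\partial B}{\partial t}\, dD = \int_{\partial D_z} \nabla B \cdot n\, d\Gamma + \int_{D_z} M \left( \cos\theta\, \frac{\partial V}{\partial x} + \sin\theta\, \frac{\partial V}{\partial y} \right) dD, \tag{4.4} \]
where $n$ denotes the unit outward normal vector on $\partial D_z$, the boundary of $D_z$. Now, the FVSE method is to find $V_p, B_p \in \mathcal{V}_h$ such that
\[ R \int_{D_z} \frac{\partial V_p}{\partial t}\, dD = \int_{\partial D_z} \nabla V_p \cdot n\, d\Gamma + \int_{D_z} \left[ M \left( \cos\theta\, \frac{\partial B_p}{\partial x} + \sin\theta\, \frac{\partial B_p}{\partial y} \right) + f(t) \right] dD, \tag{4.5} \]
\[ R_m \int_{D_z} \frac{\partial B_p}{\partial t}\, dD = \int_{\partial D_z} \nabla B_p \cdot n\, d\Gamma + \int_{D_z} M \left( \cos\theta\, \frac{\partial V_p}{\partial x} + \sin\theta\, \frac{\partial V_p}{\partial y} \right) dD. \tag{4.6} \]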
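The passage from the differential equations to the control-volume balances rests on Green's formula, which replaces the integral of the Laplacian over $D_z$ by the flux of the gradient through $\partial D_z$. The following minimal sketch checks this identity numerically on a single rectangular control volume with Gauss–Legendre quadrature. The test function $u = x^3 + x y^2$, the box and all helper names are assumptions chosen for illustration; they are not taken from the method itself.

```python
import numpy as np

def _gauss_on(a, b, n=4):
    """Gauss-Legendre nodes and weights mapped from [-1, 1] to [a, b]."""
    x, w = np.polynomial.legendre.leggauss(n)
    return 0.5 * (b - a) * x + 0.5 * (a + b), 0.5 * (b - a) * w

def boundary_flux(grad_u, x0, x1, y0, y1, n=4):
    """Integral of grad(u) . n over the boundary of [x0, x1] x [y0, y1]."""
    xs, wx = _gauss_on(x0, x1, n)
    ys, wy = _gauss_on(y0, y1, n)
    flux = np.sum(wy * grad_u(np.full_like(ys, x1), ys)[0])    # right edge, n = (+1, 0)
    flux -= np.sum(wy * grad_u(np.full_like(ys, x0), ys)[0])   # left edge,  n = (-1, 0)
    flux += np.sum(wx * grad_u(xs, np.full_like(xs, y1))[1])   # top edge,   n = (0, +1)
    flux -= np.sum(wx * grad_u(xs, np.full_like(xs, y0))[1])   # bottom edge, n = (0, -1)
    return float(flux)

def volume_integral(g, x0, x1, y0, y1, n=4):
    """Integral of g(x, y) over [x0, x1] x [y0, y1] by tensor-product quadrature."""
    xs, wx = _gauss_on(x0, x1, n)
    ys, wy = _gauss_on(y0, y1, n)
    X, Y = np.meshgrid(xs, ys, indexing="ij")
    return float(np.sum(wx[:, None] * wy[None, :] * g(X, Y)))

# Illustrative field u = x^3 + x*y^2: grad u = (3x^2 + y^2, 2xy), laplacian u = 8x.
grad_u = lambda x, y: (3.0 * x**2 + y**2, 2.0 * x * y)
lap_u = lambda x, y: 8.0 * x
box = (0.2, 0.7, -0.3, 0.4)   # x0, x1, y0, y1 of a sample control volume
difference = boundary_flux(grad_u, *box) - volume_integral(lap_u, *box)
```

For this polynomial field the four-point rule is exact, so `difference` vanishes up to rounding.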
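The FVSE method is viewed as a perturbation of the spectral element method with the help of an interpolation operator $I_h^* : \mathcal{V}_h \to \mathcal{W}_h^p$ defined by
\[ I_h^* u = \sum_{z \in \mathcal{M}_h^p} u(z)\, \chi_z. \]
The discrete FVSE problem is then written as follows: Find $V_p, B_p \in \mathcal{V}_h$ such that for all $w \in \mathcal{V}_h$
\[ a(V_p, B_p, I_h^* w) = -(f, I_h^* w), \qquad b(V_p, B_p, I_h^* w) = 0. \tag{4.7} \]
Here, $a(\cdot, \cdot, I_h^* \cdot)$, $b(\cdot, \cdot, I_h^* \cdot)$ and $(f, I_h^* \cdot)$ are defined as follows:
\[ a(u, v, I_h^* w) = \sum_{z \in \mathcal{M}_h^p} \int_{\partial D_z} \nabla u \cdot n\, I_h^* w\, d\Gamma + \sum_{z \in \mathcal{M}_h^p} \int_{D_z} \left[ M \left( \cos\theta\, \frac{\partial v}{\partial x} + \sin\theta\, \frac{\partial v}{\partial y} \right) - R \frac{\partial u}{\partial t} \right] I_h^* w\, dD, \quad \forall u, v, w \in \mathcal{V}_h, \tag{4.8} \]
\[ b(u, v, I_h^* w) = \sum_{z \in \mathcal{M}_h^p} \int_{\partial D_z} \nabla v \cdot n\, I_h^* w\, d\Gamma + \sum_{z \in \mathcal{M}_h^p} \int_{D_z} \left[ M \left( \cos\theta\, \frac{\partial u}{\partial x} + \sin\theta\, \frac{\partial u}{\partial y} \right) - R_m \frac{\partial v}{\partial t} \right] I_h^* w\, dD, \quad \forall u, v, w \in \mathcal{V}_h, \tag{4.9} \]
\[ (f, I_h^* u) = \int_{\Omega} f\, I_h^* u\, d\Omega, \quad \forall u \in \mathcal{V}_h. \tag{4.10} \]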
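To make the action of the operator $I_h^*$ concrete, the sketch below implements a one-dimensional analogue: each node owns a control volume bounded by the midpoints to its neighbours, and $I_h^* u$ takes the constant value $u(z)$ on the volume of node $z$. The one-dimensional setting and the helper names `dual_edges` and `apply_I_star` are assumptions for illustration only.

```python
import numpy as np

def dual_edges(nodes):
    """Edges of the 1D control volumes: domain ends plus midpoints between nodes."""
    nodes = np.asarray(nodes, dtype=float)
    mids = 0.5 * (nodes[:-1] + nodes[1:])
    return np.concatenate(([nodes[0]], mids, [nodes[-1]]))

def apply_I_star(u, nodes, x):
    """Evaluate the piecewise-constant function sum_z u(z) * chi_z at the points x."""
    nodes = np.asarray(nodes, dtype=float)
    edges = dual_edges(nodes)
    # Index of the control volume containing each x (the right end is clipped inside).
    idx = np.searchsorted(edges, np.asarray(x, dtype=float), side="right") - 1
    idx = np.clip(idx, 0, len(nodes) - 1)
    return u(nodes)[idx]

# Non-uniform nodes; the control volumes are [0, 0.125], [0.125, 0.625], [0.625, 1].
nodes = [0.0, 0.25, 1.0]
values = apply_I_star(lambda z: z**2, nodes, [0.1, 0.2, 0.7, 1.0])
```

Here `values` holds the nodal value of $u$ attached to the control volume that contains each sample point.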
The FVSE procedure is usually easier to implement than the spectral element procedure and offers most of the advantages of flexibility for handling complicated domain geometries. More importantly, the test space $\mathcal{W}_h^p$ ensures the local conservation of the numerical flux over each computational element. Thus, this method is highly desirable for computational conservation laws. For the unsteady MHD flow through a pipe of rectangular section $\Omega = I_x \times I_y$, $I_x = [a_x, b_x]$, $I_y = [a_y, b_y]$, which we consider here, we briefly describe non-uniform rectangular and triangular partitions of $\Omega$ in the following subsections.
4.1. Rectangular mesh
Firstly, we consider the rectangular partition of $\Omega$. In order to take advantage of the properties of spectral methods, we have divided $\Omega$ into $N_x \times N_y$ non-overlapping rectangular elements using Gauss–Lobatto–Legendre (GLL) points $\{x_i\}_{i=1}^{N_x+1}$ and $\{y_j\}_{j=1}^{N_y+1}$ in the $x$ and $y$ directions, respectively. GLL points are placed at the zeros of the completed Lobatto polynomials [63], defined on the interval $\zeta \in [-1, 1]$ as
\[ L^c_{m+1}(\zeta) = (1 - \zeta^2)\, L_m'(\zeta), \]
where $L_m'(\zeta)$ denotes the derivative of the $m$th Legendre polynomial. To obtain these points in an arbitrary interval $[a, b]$, the shifted Legendre polynomials are used [14]. Regarding (4.1), the total number of nodal points in each element in this case is $(p+1)^2$; hence, once the base triangulation $\mathcal{T}_h$ is defined, $(p+1) \times (p+1)$ interpolation points are considered in each element $[x_i, x_{i+1}] \times [y_j, y_{j+1}]$, again using GLL points $\{x_i^r\}_{r=1}^{p+1}$ and $\{y_j^s\}_{s=1}^{p+1}$, and therefore each rectangle in $\mathcal{T}_h$ is further subdivided into $p \times p$ non-overlapping rectangles which form $\mathcal{T}_h^p$, the $p$-fold refinement of $\mathcal{T}_h$. Note that, considering the nodal points as described above, we have $x_i^1 = x_i$, $x_i^{p+1} = x_{i+1}$, $y_j^1 = y_j$, $y_j^{p+1} = y_{j+1}$.
Let $(x_i^r, y_j^s)$ be an interpolation point of $\mathcal{T}_h^p$. Consider $h_i^r = x_i^{r+1} - x_i^r$, $k_j^s = y_j^{s+1} - y_j^s$, $x_i^{r-\frac12} = x_i^r - \frac{h_i^{r-1}}{2}$, $x_i^{r+\frac12} = x_i^r + \frac{h_i^r}{2}$, $y_j^{s-\frac12} = y_j^s - \frac{k_j^{s-1}}{2}$, $y_j^{s+\frac12} = y_j^s + \frac{k_j^s}{2}$; then
\[ \Lambda_{ij}^{rs} = \left\{ (x, y) : x \in \left[ x_i^{r-\frac12}, x_i^{r+\frac12} \right],\ y \in \left[ y_j^{s-\frac12}, y_j^{s+\frac12} \right] \right\} \]
is a control volume or dual element of the node $(x_i^r, y_j^s)$. For boundary nodes, the control volumes should be modified correspondingly. For example,
\[ \Lambda_{11}^{11} = \left\{ (x, y) : x \in \left[ x_1^1, x_1^{1+\frac12} \right],\ y \in \left[ y_1^1, y_1^{1+\frac12} \right] \right\}. \]
The control volumes $\Lambda_{ij}^{rs}$, $i = 1, \ldots, N_x + 1$, $j = 1, \ldots, N_y + 1$, $r = 1, \ldots, p + 1$ and $s = 1, \ldots, p + 1$, form the dual partition $\mathcal{D}_h^p$. In Fig. 3, the problem domain has first been divided into $4 \times 4$ rectangles using the filled squares, corresponding to $5 \times 5$ GLL points ($\mathcal{T}_h$), and then each element has been partitioned into $2 \times 2$ elements using $3 \times 3$ GLL points ($\mathcal{T}_h^p$). The control volume associated with each point, constructed as described above, is also shown.
Fig. 3. The filled squares and the filled squares together with the filled circles represent the nodal points related to the $\mathcal{T}_h$ and $\mathcal{T}_h^p$ meshes, respectively. The control volume associated to each point is denoted by a black lined rectangle.
One may consider $\mathcal{T}_h^p$ as the tensor product of $\Gamma_x$ and $\Gamma_y$, the partitions of the intervals $I_x$ and $I_y$, respectively, which are another representation of the nodal points $x_i^r$ and $y_j^s$, as follows:
\[ \Gamma_\sigma : a_\sigma = \beta_1^\sigma < \beta_2^\sigma < \beta_3^\sigma < \cdots < \beta_{M_\sigma}^\sigma = b_\sigma, \]
where $\sigma = x, y$ and $M_\sigma = N_\sigma p + 1$. Let $\theta_{k,q}^p(\zeta)$, $k = 1, \ldots, N_\sigma$ and $q = 1, \ldots, p + 1$, be the Lagrange interpolation polynomials of degree $p$ in the interval $(\sigma_k, \sigma_{k+1})$, corresponding to the point $\sigma_k^q$, through the $p + 1$ GLL points $\{\sigma_k^l\}_{l=1}^{p+1}$ in this interval, defined by
\[ \theta_{k,q}^p(\zeta) = \frac{1}{p(p+1)\, L_{k,p}(\sigma_k^q)} \cdot \frac{(\zeta^2 - 1)\, L_{k,p}'(\zeta)}{\zeta - \sigma_k^q}, \]
8
F. Shakeri, M. Dehghan / Applied Numerical Mathematics 61 (2011) 1–23
where $\sigma = x, y$ and $L'_{n,m}$ denotes the derivative of the $m$th shifted Legendre polynomial in the interval $(\sigma_n, \sigma_{n+1})$. We may then write down discrete representations $V^p_{N_x N_y}$ and $B^p_{N_x N_y}$ for the velocity and induced magnetic field, respectively, as follows:
\[
V^p_{N_x N_y}(\xi, \eta, t) = \sum_{i=1}^{M_x} \sum_{j=1}^{M_y} v_{ij}(t)\, \psi_i^x(\xi)\, \psi_j^y(\eta), \tag{4.11}
\]
\[
B^p_{N_x N_y}(\xi, \eta, t) = \sum_{i=1}^{M_x} \sum_{j=1}^{M_y} b_{ij}(t)\, \psi_i^x(\xi)\, \psi_j^y(\eta), \tag{4.12}
\]
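where $\psi_k^\sigma$ is the global interpolation function corresponding to $\beta_k^\sigma$ which, according to the spectral element method [63], is a continuous piecewise polynomial function obtained by piecing together the local basis functions $\theta^p_{i,j}$, and is defined as follows. Let $1 \le k' \le N_\sigma$ and $1 \le k'' \le p + 1$ be such that $\beta_k^\sigma = \sigma_{k'}^{k''}$: if $k'' = 1$ and $k \ne 1$,
\[
\psi_k^\sigma(\zeta) =
\begin{cases}
\theta^p_{k'-1,\,p+1}(\zeta), & \sigma_{k'-1} < \zeta < \sigma_{k'}, \\
\theta^p_{k',\,1}(\zeta), & \sigma_{k'} < \zeta < \sigma_{k'+1}, \\
0, & \text{otherwise},
\end{cases} \tag{4.13}
\]
if $k'' = p + 1$ and $k \ne M_\sigma$,
\[
\psi_k^\sigma(\zeta) =
\begin{cases}
\theta^p_{k',\,p+1}(\zeta), & \sigma_{k'} < \zeta < \sigma_{k'+1}, \\
\theta^p_{k'+1,\,1}(\zeta), & \sigma_{k'+1} < \zeta < \sigma_{k'+2}, \\
0, & \text{otherwise},
\end{cases} \tag{4.14}
\]
else
\[
\psi_k^\sigma(\zeta) =
\begin{cases}
\theta^p_{k',\,k''}(\zeta), & \sigma_{k'} < \zeta < \sigma_{k'+1}, \\
0, & \text{otherwise}.
\end{cases} \tag{4.15}
\]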
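The one-dimensional ingredients used above (GLL points, Lagrange interpolation polynomials through them, and control-volume edges at the midpoints between neighbouring nodes) can be illustrated with a minimal Python sketch. It works on the reference interval $[-1, 1]$ and uses the generic product form of the Lagrange polynomials rather than the closed form given in the text; the function names `gll_nodes`, `lagrange` and `dual_edges` are hypothetical and are not taken from the paper.

```python
import math

def legendre_pair(p, x):
    """Return (P_p(x), P_{p-1}(x)) using the three-term Legendre recurrence."""
    if p == 0:
        return 1.0, 0.0
    prev, cur = 1.0, x
    for n in range(1, p):
        prev, cur = cur, ((2 * n + 1) * x * cur - n * prev) / (n + 1)
    return cur, prev

def gll_nodes(p, iters=50):
    """p + 1 Gauss-Lobatto-Legendre nodes on [-1, 1]: the endpoints and the roots of P_p'."""
    nodes = [-1.0]
    for k in range(1, p):
        x = -math.cos(math.pi * k / p)  # Chebyshev-Lobatto initial guess
        for _ in range(iters):          # Newton iteration on P_p'(x) = 0
            Pp, Pm = legendre_pair(p, x)
            d1 = p * (x * Pp - Pm) / (x * x - 1.0)                   # P_p'(x)
            d2 = (2.0 * x * d1 - p * (p + 1) * Pp) / (1.0 - x * x)   # P_p''(x)
            x -= d1 / d2
        nodes.append(x)
    nodes.append(1.0)
    return nodes

def lagrange(nodes, q, z):
    """q-th Lagrange polynomial through `nodes` (0-based index), evaluated at z."""
    val = 1.0
    for m, xm in enumerate(nodes):
        if m != q:
            val *= (z - xm) / (nodes[q] - xm)
    return val

def dual_edges(points):
    """Control-volume edges: midpoints between neighbours, clipped at both ends."""
    mids = [(a + b) / 2.0 for a, b in zip(points, points[1:])]
    return [points[0]] + mids + [points[-1]]
```

For a global node list, `dual_edges` returns one more edge than there are nodes, so the control volume of node `i` is the interval between entries `i` and `i + 1`; the two end intervals are the shortened boundary cells.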
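Using the above notation and (4.5) and (4.6) with $D_z = \Lambda_{ij}^{rs}$, $i = 1, \ldots, N_x + 1$, $j = 1, \ldots, N_y + 1$, $r = 1, \ldots, p + 1$ and $s = 1, \ldots, p + 1$, we have
\[
\begin{aligned}
R \sum_{k=1}^{M_x} \sum_{l=1}^{M_y} \frac{\partial v_{kl}}{\partial t} \int_{\Lambda_{ij}^{rs}} \psi_k^x(\xi)\, \psi_l^y(\eta)\, d\Lambda
={}& \sum_{k=1}^{M_x} \sum_{l=1}^{M_y} v_{kl}(t) \int_{\partial\Lambda_{ij}^{rs}} \nabla\big(\psi_k^x(\xi)\, \psi_l^y(\eta)\big) \cdot n \, d\Gamma \\
&+ M \cos\theta \sum_{k=1}^{M_x} \sum_{l=1}^{M_y} b_{kl}(t) \int_{\Lambda_{ij}^{rs}} (\psi_k^x)'(\xi)\, \psi_l^y(\eta)\, d\Lambda \\
&+ M \sin\theta \sum_{k=1}^{M_x} \sum_{l=1}^{M_y} b_{kl}(t) \int_{\Lambda_{ij}^{rs}} \psi_k^x(\xi)\, (\psi_l^y)'(\eta)\, d\Lambda
+ f(t) \int_{\Lambda_{ij}^{rs}} d\Lambda,
\end{aligned} \tag{4.16}
\]
\[
\begin{aligned}
R_m \sum_{k=1}^{M_x} \sum_{l=1}^{M_y} \frac{\partial b_{kl}}{\partial t} \int_{\Lambda_{ij}^{rs}} \psi_k^x(\xi)\, \psi_l^y(\eta)\, d\Lambda
={}& \sum_{k=1}^{M_x} \sum_{l=1}^{M_y} b_{kl}(t) \int_{\partial\Lambda_{ij}^{rs}} \nabla\big(\psi_k^x(\xi)\, \psi_l^y(\eta)\big) \cdot n \, d\Gamma \\
&+ M \cos\theta \sum_{k=1}^{M_x} \sum_{l=1}^{M_y} v_{kl}(t) \int_{\Lambda_{ij}^{rs}} (\psi_k^x)'(\xi)\, \psi_l^y(\eta)\, d\Lambda \\
&+ M \sin\theta \sum_{k=1}^{M_x} \sum_{l=1}^{M_y} v_{kl}(t) \int_{\Lambda_{ij}^{rs}} \psi_k^x(\xi)\, (\psi_l^y)'(\eta)\, d\Lambda,
\end{aligned} \tag{4.17}
\]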
Fig. 4. The filled squares represent the nodal points in parent element.
where $(\psi_i^\sigma)'$ is the derivative of $\psi_i^\sigma$ with respect to $\sigma$, $\sigma = x, y$. To facilitate the computation of these integrals, we can map the element $K = \{(x, y) \mid a_1 \le x \le b_1,\; a_2 \le y \le b_2\}$ onto a parent element
\[
\hat{K} = \{(\xi, \eta) \mid -1 \le \xi, \eta \le 1\},
\]
using the following mapping:
\[
\varphi: K \longrightarrow \hat{K}, \qquad
\varphi(x, y) = \left( \frac{2x - a_1 - b_1}{b_1 - a_1},\; \frac{2y - a_2 - b_2}{b_2 - a_2} \right). \tag{4.18}
\]
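As a small illustration of the mapping (4.18), the following Python sketch maps a point of a rectangular element to the parent element and back. The inverse map and the constant area factor are elementary consequences of (4.18) that are not written out in the text; the function names are hypothetical.

```python
def to_parent(x, y, a1, b1, a2, b2):
    """Map (x, y) in K = [a1, b1] x [a2, b2] to (xi, eta) in [-1, 1]^2, as in (4.18)."""
    xi = (2.0 * x - a1 - b1) / (b1 - a1)
    eta = (2.0 * y - a2 - b2) / (b2 - a2)
    return xi, eta

def from_parent(xi, eta, a1, b1, a2, b2):
    """Inverse of to_parent: map (xi, eta) in [-1, 1]^2 back to K."""
    x = 0.5 * ((b1 - a1) * xi + a1 + b1)
    y = 0.5 * ((b2 - a2) * eta + a2 + b2)
    return x, y

def area_factor(a1, b1, a2, b2):
    """Constant Jacobian determinant: dx dy = area_factor * dxi deta."""
    return (b1 - a1) * (b2 - a2) / 4.0
```

An integral over $K$ then becomes the same integral over $\hat{K}$ multiplied by `area_factor`, so quadrature rules need to be set up only once on the parent element.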
4.2. Triangular mesh

In this case, we partition the domain $\Omega$ into $N$ triangular elements. For simplicity, we map each physical triangle from the $xy$ plane to a right isosceles (parent) triangle in the $\xi\eta$ parametric plane, where $\xi, \eta > -1$ and $\xi + \eta < 0$, and then, using the Jacobian matrix and the computations obtained for the parent element, we get the computations for the physical triangle [63]. Considering (4.2), the total number of nodal points in each element of the triangular mesh is $l = \frac12 (p + 1)(p + 2)$. To generate the nodal points on the parent triangle, consider a one-dimensional master grid $v_i$, $i = 1, 2, \ldots, p + 1$, where the $v_i$ are positioned at the zeros of the $(p + 1)$-degree completed Lobatto polynomial. The nodes are then identified by the coordinates $(\xi_k, \eta_q) = (v_k, v_q)$, where $k = 1, 2, \ldots, p + 1$ and $q = 1, 2, \ldots, p + 2 - k$. In Fig. 4, the parent element and nodal points for $p = 4$ are shown. We also assign to each index $(k, q)$, $k = 1, 2, \ldots, p + 1$, $q = 1, 2, \ldots, p + 2 - k$, the node number
\[
i = q + \sum_{s=0}^{k-2} (p + 1 - s).
\]
The local interpolation function $\theta^p_{k,q}$ corresponding to the nodal point $(\xi_k, \eta_q)$ is a $p$th degree polynomial in $\xi$ and $\eta$, required to satisfy the $l$ interpolation conditions
\[
\theta^p_{k,q}(\xi_{k'}, \eta_{q'}) = \delta_{kk'}\, \delta_{qq'}, \qquad k' = 1, 2, \ldots, p + 1, \quad q' = 1, 2, \ldots, p + 2 - k'. \tag{4.19}
\]
In the most general approach, the local interpolation functions are expressed as linear combinations of a set of $l$ independent polynomials $\phi_j(\xi, \eta)$, $j = 1, 2, \ldots, l$, that form a complete basis of the $p$th order expansions in the $\xi\eta$ plane, i.e.
\[
\theta^p_{k,q}(\xi, \eta) = \sum_{j=1}^{l} c^j_{k,q}\, \phi_j(\xi, \eta),
\]
where the $c^j_{k,q}$ comprise a set of $l$ expansion coefficients for node $(\xi_k, \eta_q)$. Enforcing the Kronecker delta condition (4.19), we obtain the linear system $V c = e_i$, where $i$ is the node number corresponding to $(k, q)$, $V$ is the generalized Vandermonde matrix whose $(r, s)$ entry is the value of $\phi_s$ at the $r$th node, $c = [c^1_{k,q}\; c^2_{k,q}\; \ldots\; c^{l-1}_{k,q}\; c^l_{k,q}]^T$ and $e_i$ is an $l \times 1$ vector whose $i$th entry is 1 and whose other entries are zero. Solving this system identifies the local interpolation function $\theta^p_{k,q}$.
To construct the control volume corresponding to node $(\xi_k, \eta_q)$: for $q \ne p + 2 - k$, consider the control volume in the same manner as in the rectangular partition; for $q = p + 2 - k$, $k \ne 1$ and $k \ne p + 1$, let
\[
\Lambda_{kq} = \left\{ (\xi, \eta):\; \xi_k - \frac{\xi_k - \xi_{k-1}}{2} \le \xi,\;\; \eta_q - \frac{\eta_q - \eta_{q-1}}{2} \le \eta,\;\; \xi + \eta < 0 \right\}, \tag{4.20}
\]
Fig. 5. A global interpolation function corresponding to the ith point, located on the boundary of an element of $T_h$, before applying Hermite interpolation.
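for $q = p + 2 - k$ and $k = 1$, let
\[
\Lambda_{kq} = \left\{ (\xi, \eta):\; \xi_k \le \xi,\;\; \eta_q - \frac{\eta_q - \eta_{q-1}}{2} \le \eta,\;\; \xi + \eta < 0 \right\}, \tag{4.21}
\]
and for $q = p + 2 - k$ and $k = p + 1$, let
\[
\Lambda_{kq} = \left\{ (\xi, \eta):\; \xi_k - \frac{\xi_k - \xi_{k-1}}{2} \le \xi,\;\; \eta_q \le \eta,\;\; \xi + \eta < 0 \right\}. \tag{4.22}
\]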
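The generalized Vandermonde construction of the local interpolation functions described in this subsection can be sketched in Python as follows. This is only an illustration under stated assumptions: the monomials $\xi^a \eta^b$, $a + b \le p$, are used as the complete basis $\phi_j$ (the text does not fix a particular basis), the master grid is passed in by the caller, and the system $V c = e_i$ is solved by a plain Gaussian elimination. All function names are hypothetical.

```python
def triangle_nodes(v):
    """Nodes (xi_k, eta_q) = (v_k, v_q), k = 1..p+1, q = 1..p+2-k, in node-number order."""
    p = len(v) - 1
    return [(v[k], v[q]) for k in range(p + 1) for q in range(p + 1 - k)]

def monomial_exponents(p):
    """Exponents (a, b) of the monomials xi^a * eta^b with a + b <= p (assumed basis)."""
    return [(a, b) for a in range(p + 1) for b in range(p + 1 - a)]

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def local_basis(v):
    """Return the nodes and a function theta(i, xi, eta) for the i-th local basis function."""
    nodes = triangle_nodes(v)
    exps = monomial_exponents(len(v) - 1)
    # Generalized Vandermonde matrix: entry (r, s) is phi_s evaluated at node r.
    V = [[x ** a * y ** b for (a, b) in exps] for (x, y) in nodes]
    coeffs = []
    for i in range(len(nodes)):
        e_i = [1.0 if r == i else 0.0 for r in range(len(nodes))]
        coeffs.append(solve(V, e_i))

    def theta(i, x, y):
        return sum(c * x ** a * y ** b for c, (a, b) in zip(coeffs[i], exps))

    return nodes, theta
```

For $p = 2$ the master grid is $\{-1, 0, 1\}$, so `local_basis([-1.0, 0.0, 1.0])` yields the six nodes of the parent triangle and six quadratic functions satisfying (4.19). For larger $p$ a better-conditioned basis than monomials would normally be preferred.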
As in the previous section, for the nodal points located on the boundary of an element, the union of the interpolation functions corresponding to adjacent elements, and for the interior nodes, the interpolation functions themselves, yield the global interpolation functions. Then we can consider
\[
V^p_N(\xi, \eta, t) = \sum_{i=1}^{M} v_i(t)\, \psi_i(\xi, \eta), \qquad
B^p_N(\xi, \eta, t) = \sum_{i=1}^{M} b_i(t)\, \psi_i(\xi, \eta),
\]
as approximations of $V$ and $B$, respectively, where $\psi_i$ is the global interpolation function corresponding to the $i$th point and $M$ is the total number of nodal points in the whole domain.
Notice that Eqs. (4.16) and (4.17) are satisfied for the triangular mesh by replacing $V^p_{N_x N_y}(\xi, \eta, t) = \sum_{i=1}^{M_x} \sum_{j=1}^{M_y} v_{ij}(t)\, \psi_i^x(\xi)\, \psi_j^y(\eta)$ and $B^p_{N_x N_y}(\xi, \eta, t) = \sum_{i=1}^{M_x} \sum_{j=1}^{M_y} b_{ij}(t)\, \psi_i^x(\xi)\, \psi_j^y(\eta)$ with $V^p_N(\xi, \eta, t) = \sum_{i=1}^{M} v_i(t)\, \psi_i(\xi, \eta)$ and $B^p_N(\xi, \eta, t) = \sum_{i=1}^{M} b_i(t)\, \psi_i(\xi, \eta)$, respectively.
Remark. It is expected that increasing the number of elements in $T_h$ increases the accuracy of the solution, but the numerical results show that the error on the boundaries of the elements of $T_h$ grows as the number of elements in $T_h$ increases. The reason is that the global interpolation functions, as described above, have sharp points on the boundaries of the elements of $T_h$, which affects the differentiability of the global basis functions and produces errors at these points. To handle this problem, we use a Hermite interpolation on a strip of width $2\epsilon$ around the boundaries of the elements of $T_h$ to smooth the behavior of the basis functions at these points and make them differentiable. The value of $\epsilon$ is chosen so small that it has little effect on the interpolation basis functions. The numerical results in our simulations show that a smaller $\epsilon$ gives more accurate results, especially when $p$ increases, but $\epsilon$ can be reduced only as long as no underflow error occurs in the computations. A global interpolation function before and after applying this improvement, for the case of a rectangular mesh where it is the product of global interpolation functions in one dimension, is displayed in Figs. 5 and 6, respectively.

5. Time discretization

In this section, we describe the time discretization for the case of the rectangular partition. The time discretization for the triangular mesh is quite similar and requires only small changes in notation. The time discretization is performed with the Crank–Nicolson scheme, which is an implicit method. The main reason for choosing this method is its second-order accuracy in time. Also, in theory, the Crank–Nicolson method is unconditionally stable. In this technique, a time step $\Delta t$ is selected and then the differential equations (4.16) and (4.17) are evaluated at time $t + \frac12 \Delta t$. The time derivative is approximated with a centered finite difference. For the remaining terms, the average of their values at times $t$ and $t + \Delta t$ is taken. Therefore Eqs. (4.16) and (4.17) can be written as:
Fig. 6. A global interpolation function corresponding to the ith point, located on the boundary of an element of $T_h$, after applying Hermite interpolation.

Table 1. Comparison of velocity field of Shercliff’s problem at M = 20 using MLPG method [22], FVE method [12,13,18,29] and FVSE method with p = 5 in rectangular partition.
(x, y)        Exact value  FVE          FVE          FVE          FVSE         FVSE         MLPG
                           Nx=Ny=7      Nx=Ny=10     Nx=Ny=30     Nx=Ny=7      Nx=Ny=10     N=1681
(0.00, 0.00)  0.049918641  0.060535196  0.050154180  0.049949831  0.049904402  0.049904635  0.049921141
(0.25, 0.00)  0.049880236  0.049914241  0.049908275  0.049931721  0.049869511  0.049866591  0.049883013
(0.50, 0.00)  0.049760102  0.042771026  0.049399719  0.049810629  0.049777626  0.049745566  0.049764512
(0.00, 0.25)  0.049662783  0.059340793  0.049763373  0.049673904  0.049652578  0.049646821  0.049684382
(0.25, 0.25)  0.049570651  0.049190306  0.049474403  0.049653541  0.049563709  0.049555008  0.049593666
(0.50, 0.25)  0.049299034  0.042257986  0.048822959  0.049311106  0.049319260  0.049282439  0.049327163
(0.00, 0.50)  0.047716857  0.056345686  0.047917937  0.047654403  0.047704620  0.047697841  0.047785558
(0.25, 0.50)  0.047452918  0.047031664  0.047508053  0.047395908  0.047443914  0.047434150  0.047520638
(0.50, 0.50)  0.046677531  0.040559079  0.046447175  0.046513455  0.046693784  0.046657809  0.046745300
(0.00, 0.75)  0.037657703  0.040699955  0.037972159  0.037475349  0.037646269  0.037637608  0.037763549
(0.50, 0.75)  0.036166028  0.030529006  0.036241337  0.036002088  0.036174089  0.036145676  0.036262629
Table 2. CPU time for solving Shercliff’s problem at M = 20 using MLPG method [22], FVE method [12,13,18,29] and FVSE method with p = 5 in rectangular partition.
Method                        CPU time
FVE method with Nx = Ny = 7   17.375000000 s
FVE method with Nx = Ny = 10  27.218750000 s
FVE method with Nx = Ny = 30  128.125000000 s
FVSE method with Nx = Ny = 7  36.046875000 s
FVSE method with Nx = Ny = 10 126.765625000 s
MLPG method with N = 1681     919.846000000 s
Table 3. Velocity field of Shercliff’s problem at M = 20 using FVSE method with Nx = Ny = 10 and different values of p in rectangular partition.
(x, y)        Exact value  p = 3        p = 4        p = 5
(0.00, 0.00)  0.049918641  0.049897425  0.049894502  0.049904635
(0.25, 0.00)  0.049880236  0.049864621  0.049853383  0.049866591
(0.50, 0.00)  0.049760102  0.049668870  0.049723799  0.049745566
(0.00, 0.25)  0.049662783  0.049675969  0.049631527  0.049646821
(0.25, 0.25)  0.049570651  0.049590600  0.049536675  0.049555008
(0.50, 0.25)  0.049299034  0.049249341  0.049255528  0.049282439
(0.00, 0.50)  0.047716857  0.047756709  0.047672973  0.047697841
(0.25, 0.50)  0.047452918  0.047499074  0.047406598  0.047434150
(0.50, 0.50)  0.046677531  0.046659524  0.046623227  0.046657809
(0.00, 0.75)  0.037657703  0.037710746  0.037619383  0.037637608
(0.50, 0.75)  0.036166028  0.036190621  0.036120422  0.036145676
Table 4. CPU time for solving Shercliff’s problem at M = 20 using FVSE method with Nx = Ny = 10 and different values of p in rectangular partition.
p      CPU time
p = 3  36.468750000 s
p = 4  43.406250000 s
p = 5  126.765625000 s
Table 5. Comparison of velocity field of Shercliff’s problem at M = 5 using MLPG method [22], FVE method [12,13,18,29] and FVSE method with p = 5 in triangular partition.
(x, y)        Exact value  FVE          FVE          FVE          FVSE         FVSE         MLPG
                           N=98         N=162        N=722        N=98         N=162        N=441
(0.00, 0.00)  0.171601814  0.170217438  0.170829389  0.171508731  0.171600899  0.171556364  0.170849580
(0.25, 0.00)  0.168372009  0.165892367  0.166395134  0.166994125  0.168369637  0.168326676  0.167642203
(0.50, 0.00)  0.155787639  0.154134853  0.154428858  0.153954518  0.155783037  0.155742364  0.155128896
(0.00, 0.25)  0.164754886  0.158711395  0.159481686  0.160365891  0.164755121  0.164718369  0.164090329
(0.25, 0.25)  0.161621571  0.154749438  0.155395527  0.156257834  0.161621098  0.161585326  0.161035982
(0.50, 0.25)  0.149458308  0.143480875  0.143952256  0.143932112  0.149455924  0.149422849  0.148874152
(0.00, 0.50)  0.141207698  0.130773635  0.131991731  0.132898732  0.141209328  0.141184658  0.140772330
(0.25, 0.50)  0.138504000  0.127904263  0.128900469  0.129751728  0.138505348  0.138481830  0.138115184
(0.50, 0.50)  0.128048264  0.119247020  0.119616812  0.119841829  0.128048405  0.128027365  0.127831234
(0.25, 0.75)  0.089937761  0.079360255  0.080439469  0.080865884  0.089940573  0.089930550  0.089835973
(0.50, 0.75)  0.083498372  0.074544113  0.075135557  0.075216935  0.083502002  0.083492387  0.083378769
(0.75, 0.75)  0.064544256  0.063581548  0.060219517  0.059579241  0.064442661  0.064547312  0.064740823
Table 6. CPU time for solving Shercliff’s problem at M = 5 using MLPG method [22], FVE method [12,13,18,29] and FVSE method with p = 5 in triangular partition.
Method                    CPU time
FVE method with N = 98    3.828125000 s
FVE method with N = 162   8.718750000 s
FVE method with N = 722   134.343750000 s
FVSE method with N = 98   113.968750000 s
FVSE method with N = 162  132.171875000 s
MLPG method with N = 441  31.628000000 s
Table 7. Velocity field of Shercliff’s problem at M = 5 using FVSE method with N = 162 and different values of p in triangular partition.
(x, y)        Exact value  FVSE, p = 3  FVSE, p = 4  FVSE, p = 5
(0.00, 0.00)  0.171601814  0.171361449  0.171525725  0.171556364
(0.25, 0.00)  0.168372009  0.168054414  0.168300692  0.168326676
(0.50, 0.00)  0.155787639  0.155169239  0.155724385  0.155742364
(0.00, 0.25)  0.164754886  0.164501303  0.164681586  0.164718369
(0.25, 0.25)  0.161621571  0.161333434  0.161546525  0.161585326
(0.50, 0.25)  0.149458308  0.148947405  0.149391160  0.149422849
(0.00, 0.50)  0.141207698  0.140937387  0.141145483  0.141184658
(0.25, 0.50)  0.138504000  0.138240774  0.138439836  0.138481830
(0.50, 0.50)  0.128048264  0.127716857  0.127985406  0.128027365
(0.25, 0.75)  0.089937761  0.089742452  0.089896087  0.089930550
(0.50, 0.75)  0.083498372  0.083310281  0.083451611  0.083492387
(0.75, 0.75)  0.064544256  0.064580340  0.064493341  0.064547312
Table 8. CPU time for solving Shercliff’s problem at M = 5 using FVSE method with N = 162 and different values of p in triangular partition.
p      CPU time
p = 3  14.828125000 s
p = 4  60.640625000 s
p = 5  132.171875000 s
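\[
\begin{aligned}
&\sum_{k=1}^{M_x} \sum_{l=1}^{M_y} \Big[ \Big( R\,\omega H_{kl}^{ij,rs} - \tfrac12 F_{kl}^{ij,rs} \Big) v_{kl}^{n+1}
- \tfrac12 M \big( \cos(\theta)\, S_{kl}^{ij,rs} + \sin(\theta)\, K_{kl}^{ij,rs} \big)\, b_{kl}^{n+1} \Big] - \tfrac12 L^{ij,rs} f^{n+1} \\
&\quad = \sum_{k=1}^{M_x} \sum_{l=1}^{M_y} \Big[ \Big( R\,\omega H_{kl}^{ij,rs} + \tfrac12 F_{kl}^{ij,rs} \Big) v_{kl}^{n}
+ \tfrac12 M \big( \cos(\theta)\, S_{kl}^{ij,rs} + \sin(\theta)\, K_{kl}^{ij,rs} \big)\, b_{kl}^{n} \Big] + \tfrac12 L^{ij,rs} f^{n},
\end{aligned} \tag{5.1}
\]
\[
\begin{aligned}
&\sum_{k=1}^{M_x} \sum_{l=1}^{M_y} \Big[ \Big( R_m\,\omega H_{kl}^{ij,rs} - \tfrac12 F_{kl}^{ij,rs} \Big) b_{kl}^{n+1}
- \tfrac12 M \big( \cos(\theta)\, S_{kl}^{ij,rs} + \sin(\theta)\, K_{kl}^{ij,rs} \big)\, v_{kl}^{n+1} \Big] \\
&\quad = \sum_{k=1}^{M_x} \sum_{l=1}^{M_y} \Big[ \Big( R_m\,\omega H_{kl}^{ij,rs} + \tfrac12 F_{kl}^{ij,rs} \Big) b_{kl}^{n}
+ \tfrac12 M \big( \cos(\theta)\, S_{kl}^{ij,rs} + \sin(\theta)\, K_{kl}^{ij,rs} \big)\, v_{kl}^{n} \Big],
\end{aligned} \tag{5.2}
\]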
Table 9. Velocity field of Shercliff’s problem at M = 20 using FVSE method and rectangular partition with Nx = Ny = 10, p = 5 and different values of $\epsilon$ in Hermite interpolation.
(x, y)        Exact value  ε = .005     ε = .001     ε = .00001
(0.00, 0.00)  0.049918641  0.042932354  0.048337155  0.049904635
(0.25, 0.00)  0.049880236  0.042315264  0.048161076  0.049866591
(0.50, 0.00)  0.049760102  0.040363575  0.047586800  0.049745566
(0.75, 0.00)  0.049227495  0.036801070  0.046255595  0.049220916
(0.00, 0.25)  0.049662783  0.041969000  0.047905480  0.049646821
(0.25, 0.25)  0.049570651  0.041306934  0.047674085  0.049555008
(0.50, 0.25)  0.049299034  0.039245376  0.046947189  0.049282439
(0.75, 0.25)  0.048549629  0.035566689  0.045405510  0.048540801
(0.00, 0.50)  0.047716857  0.038520710  0.045572049  0.047697841
(0.25, 0.50)  0.047452918  0.037778903  0.045183013  0.047434150
(0.50, 0.50)  0.046677531  0.035515679  0.044003650  0.046657809
(0.75, 0.50)  0.045177836  0.031627987  0.041812838  0.045165251
(0.00, 0.75)  0.037657703  0.028674921  0.035486822  0.037637608
(0.25, 0.75)  0.037296934  0.028063657  0.035052056  0.037277029
(0.50, 0.75)  0.036166028  0.026148522  0.033682656  0.036145676
(0.75, 0.75)  0.033933507  0.022733414  0.031057899  0.033919548
Fig. 7. The contour plots of velocity (left) and induced magnetic field (right) for λ = ∞, M = 20 with N x = N y = 14 and p = 5 in rectangular partition.
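where $i$, $r$, $j$ and $s$ are indices such that $(x_i^r, y_j^s)$ is an interior node of $\Omega$, $v_{kl}^n = v_{kl}(n \Delta t)$, $b_{kl}^n = b_{kl}(n \Delta t)$, $f^n = f(n \Delta t)$, $\omega = \frac{1}{\Delta t}$, and
\[
H_{kl}^{ij,rs} = \int_{\Lambda_{ij}^{rs}} \psi_k^x(\xi)\, \psi_l^y(\eta)\, d\Lambda, \tag{5.3}
\]
\[
F_{kl}^{ij,rs} = \int_{\partial\Lambda_{ij}^{rs}} \nabla\big(\psi_k^x(\xi)\, \psi_l^y(\eta)\big) \cdot n \, d\Gamma, \tag{5.4}
\]
\[
S_{kl}^{ij,rs} = \int_{\Lambda_{ij}^{rs}} (\psi_k^x)'(\xi)\, \psi_l^y(\eta)\, d\Lambda, \tag{5.5}
\]
\[
K_{kl}^{ij,rs} = \int_{\Lambda_{ij}^{rs}} \psi_k^x(\xi)\, (\psi_l^y)'(\eta)\, d\Lambda, \tag{5.6}
\]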
Fig. 8. The contour plots of velocity (left) and induced magnetic field (right) for λ = ∞, M = 300 with N x = N y = 14 and p = 5 in rectangular partition.
Fig. 9. Plot of the velocity along the x-axis ( y = 0) with N x = N y = 14, p = 5 and for θ = 0, λ = ∞ and various values of M.
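\[
L^{ij,rs} = \int_{\Lambda_{ij}^{rs}} d\Lambda. \tag{5.7}
\]
If $(x_i^r, y_j^s) = (\beta_k^x, \beta_l^y) \in \partial\Omega$, in the case of the Dirichlet boundary condition (2.7) (or the boundary condition (2.8) when $\lambda = \infty$), it is not necessary to carry out the corresponding integration on $\Lambda_{ij}^{rs}$ in (5.1) ((5.2)) and we set $v_{kl}(t) = 0$ ($b_{kl}(t) = 0$) in the representation (4.11) ((4.12)). But considering the boundary condition (2.8), when $\lambda \ne \infty$, Eq. (5.2) is reduced to
\[
\begin{aligned}
&\sum_{k=1}^{M_x} \sum_{l=1}^{M_y} \Big[ \Big( R_m\,\omega H_{kl}^{ij,rs} - \tfrac12 Q_{kl}^{ij,rs} + \tfrac12 P_{kl}^{ij,rs} \Big) b_{kl}^{n+1}
- \tfrac12 M \big( \cos(\theta)\, S_{kl}^{ij,rs} + \sin(\theta)\, K_{kl}^{ij,rs} \big)\, v_{kl}^{n+1} \Big] \\
&\quad = \sum_{k=1}^{M_x} \sum_{l=1}^{M_y} \Big[ \Big( R_m\,\omega H_{kl}^{ij,rs} + \tfrac12 Q_{kl}^{ij,rs} - \tfrac12 P_{kl}^{ij,rs} \Big) b_{kl}^{n}
+ \tfrac12 M \big( \cos(\theta)\, S_{kl}^{ij,rs} + \sin(\theta)\, K_{kl}^{ij,rs} \big)\, v_{kl}^{n} \Big],
\end{aligned} \tag{5.8}
\]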
Fig. 10. Plot of the induced magnetic field along the x-axis ( y = 0) with N x = N y = 14, p = 5 and for θ = 0, λ = ∞ and various values of M.
Fig. 11. Sparseness structure of coefficient matrix for λ = ∞, M = 300 with N x = N y = 14, p = 5 in rectangular partition.
where
\[
Q_{kl}^{ij,rs} = \int_{\partial\Lambda_{ij}^{rs} - \partial\Lambda_{ij}^{rs} \cap \partial\Omega} \nabla\big(\psi_k^x(\xi)\, \psi_l^y(\eta)\big) \cdot n \, d\Gamma, \tag{5.9}
\]
\[
P_{kl}^{ij,rs} = \lambda \int_{\partial\Lambda_{ij}^{rs} \cap \partial\Omega} \psi_k^x(\xi)\, \psi_l^y(\eta)\, d\Gamma. \tag{5.10}
\]
Eqs. (5.1) and (5.2) (or (5.8)) constitute a system of linear algebraic equations in the unknowns $v_{kl}^{n+1}$ and $b_{kl}^{n+1}$. Notice that at first, using the initial conditions (2.9) and (2.10), we have $v_{kl}^0 = b_{kl}^0 = 0$. Then, solving the obtained linear system repeatedly, we get $v_{kl}^n$ and $b_{kl}^n$, for $n = 1, 2, \ldots$, until the steady state is achieved.
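The time-marching procedure described in this section (Crank–Nicolson steps from a zero initial state, repeated until two consecutive time levels differ by less than a tolerance) can be sketched in Python on a deliberately simplified stand-in problem. The sketch below is an assumption-laden illustration, not the FVSE system: it solves the one-dimensional model equation $u_t = u_{xx} + f(t)$ on $(-1, 1)$ with $u = 0$ at both ends, uses second-order finite differences in space instead of the matrices $H$, $F$, $S$, $K$, and solves the resulting tridiagonal system with the Thomas algorithm. The function names and default parameters are hypothetical.

```python
def thomas(a, b, c, d):
    """Solve a tridiagonal system with sub-diagonal a, diagonal b, super-diagonal c, rhs d."""
    n = len(d)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def crank_nicolson_to_steady(n=19, dt=0.01, tol=1e-6, f=lambda t: 1.0, max_steps=100000):
    """March u_t = u_xx + f(t) on (-1, 1), u(+-1) = 0, from u = 0 until steady state.

    Returns the interior nodal values and the number of time steps taken.
    """
    h = 2.0 / (n + 1)
    r = dt / (2.0 * h * h)
    u = [0.0] * n                      # zero initial condition
    a, b, c = [-r] * n, [1.0 + 2.0 * r] * n, [-r] * n
    t = 0.0
    for step in range(1, max_steps + 1):
        rhs = []
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < n - 1 else 0.0
            # Explicit half of the spatial operator plus the averaged source term.
            rhs.append(u[i] + r * (left - 2.0 * u[i] + right) + 0.5 * dt * (f(t) + f(t + dt)))
        new = thomas(a, b, c, rhs)
        t += dt
        if max(abs(p - q) for p, q in zip(new, u)) < tol:
            return new, step
        u = new
    return u, max_steps
```

With the defaults, the steady state of the model problem is $u = (1 - x^2)/2$, so the middle nodal value approaches 0.5. Note that the stopping test bounds the change per step, not the distance to the steady state, which can be considerably larger when the decay is slow.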
6. Case studies

In the following test problems, we present our procedure for a square pipe $|x| \le 1$, $|y| \le 1$. The wall conductivity, $\lambda$, is taken uniform over the entire boundary. The computations have been carried out for various values of $\theta$, $\lambda$ and $M$. For all cases, we have chosen $R = R_m = 1$. Also, for the transient flow with constant pressure gradient considered here, we have $f(t) = 1$. However, for pulsating flows $f(t)$ may be taken as a periodic function of time. The numerical results are obtained at steady state, which is considered to be reached when the absolute difference between the solutions of two consecutive time
Fig. 12. Sparseness structure of coefficient matrix for λ = ∞, M = 20 with N = 162, p = 5 in triangular partition.
Fig. 13. The contour plots of velocity (left) and induced magnetic field (right) for λ = 0, M = 20, with N x = N y = 18, p = 4.
levels is less than $10^{-6}$. The numerical calculations are performed with $\Delta t = 0.01$. (All numerical simulations are run on a standard desktop computer, Pentium IV 2.8 GHz.)

6.1. Problem 1: duct with insulating walls, λ = ∞ and horizontal magnetic field, θ = 0

Shercliff [72] obtained the exact solution for the flow in an insulated square duct with an applied magnetic field parallel to one pair of sides. In the current paper, this solution is selected as a benchmark problem, which can be defined as the solution of (2.5) and (2.6) when θ = 0 and B = V = 0 along the channel boundary, i.e. λ = ∞. Tables 1–2 and 5–6 display the numerical results and CPU times obtained by the Meshless Local Petrov–Galerkin (MLPG) [22], FVE [12,13,18,29,30,47] and FVSE methods for solving the MHD equations in rectangular and triangular partitions (in the MLPG method, N is the number of randomly located nodes in the domain Ω; for more details, see [22]). We can see from these tables that our steady-state solutions by FVSE for the velocity and the induced magnetic field agree very well with the exact solution given by Shercliff [72] and are more accurate than those of the FVE and MLPG methods, even when the FVE method is applied with Nx = Ny = 30 (N = 722), for which the CPU time is approximately equal to the FVSE method’s CPU time with Nx = Ny = 10 (N = 162) in the rectangular (triangular) mesh. In Tables 3 and 7, the exact value of the velocity and the FVSE approximation with Nx = Ny = 10 (N = 162) in the rectangular (triangular) partition for different values of p are reported. The CPU time computed in applying this procedure is shown in
Fig. 14. Plot of the velocity along the x-axis ( y = 0) with N x = N y = 14, p = 5 for M = 20, θ = 0 and various values of λ.
Fig. 15. Plot of the induced magnetic field along the x-axis ( y = 0) with N x = N y = 14, p = 5 for M = 20, θ = 0 and various values of λ.
Tables 4 and 8. In Table 9, the effect of different values of $\epsilon$ in constructing the Hermite interpolation in the FVSE method is presented. The contour graphs of velocity and induced magnetic field for small and large values of the Hartmann number, M = 20 and M = 300, with Nx = Ny = 14 and p = 5 in the rectangular partition are plotted in Figs. 7 and 8, respectively. These figures show that the velocity is symmetric with respect to both the x- and y-axes. The induced magnetic field is antisymmetric with respect to the y-axis and hence the current lines change their direction in the left and right parts of the duct. The graphs for the triangular mesh in this case and in the other cases are similar to those for the rectangular partition. The velocity and induced magnetic field profiles along the x-axis, in the y = 0 plane of the duct, are plotted in Figs. 9 and 10, respectively, for θ = 0, λ = ∞ and different values of the Hartmann number M. It is noted from these figures that as M increases, the velocity becomes uniform and almost stagnant at the center of the duct and the boundary layer formation starts for both the velocity and the induced magnetic field. The velocity decreases, taking a uniform value at the center of the duct, and V has its maximum value through the center. This flattening tendency is also observed in the magnetic field. These are the well-known characteristics of MHD flow. The location of nonzero entries in the coefficient matrix of the obtained linear algebraic system for λ = ∞, M = 300 (M = 20) with Nx = Ny = 14 (N = 162), p = 5 in the rectangular (triangular) partition is presented in Fig. 11 (Fig. 12). For the other test problems, the sparseness structure of the coefficient matrix is similar to this case.
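The accuracy comparison discussed for Problem 1 can be checked directly from the tabulated values. The short Python sketch below takes the "Exact value", "FVE method with Nx = Ny = 30" and "FVSE method with Nx = Ny = 10" columns of Table 1 and computes the maximum absolute error over the eleven sample points; the helper name `max_abs_error` is hypothetical, and the maximum over these sample points is only one of several possible error measures.

```python
# Columns copied from Table 1 (M = 20, rectangular partition, p = 5 for FVSE).
exact = [0.049918641, 0.049880236, 0.049760102, 0.049662783, 0.049570651,
         0.049299034, 0.047716857, 0.047452918, 0.046677531, 0.037657703,
         0.036166028]
fve_30 = [0.049949831, 0.049931721, 0.049810629, 0.049673904, 0.049653541,
          0.049311106, 0.047654403, 0.047395908, 0.046513455, 0.037475349,
          0.036002088]
fvse_10 = [0.049904635, 0.049866591, 0.049745566, 0.049646821, 0.049555008,
           0.049282439, 0.047697841, 0.047434150, 0.046657809, 0.037637608,
           0.036145676]

def max_abs_error(approx, reference):
    """Largest pointwise absolute difference between two equally long lists."""
    return max(abs(a - r) for a, r in zip(approx, reference))

err_fve_30 = max_abs_error(fve_30, exact)
err_fvse_10 = max_abs_error(fvse_10, exact)
```

On these sample points the FVSE error with Nx = Ny = 10 is roughly an order of magnitude smaller than the FVE error with Nx = Ny = 30, which have comparable CPU times according to Table 2.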
Fig. 16. The contour plots of velocity (left) and induced magnetic field (right) for M = 200, λ = ∞, θ = π/4 with Nx = Ny = 14 and p = 5.
Fig. 17. The contour plots of velocity (left) and induced magnetic field (right) for M = 200, λ = ∞, θ = π/3 with Nx = Ny = 14 and p = 5.
Fig. 18. Plot of the velocity along the x-axis ( y = 0) for M = 20, λ = ∞ and various values of θ with N x = N y = 14 and p = 5.
Fig. 19. Plot of the induced magnetic field along the x-axis ( y = 0) for M = 20, λ = ∞ and various values of θ with N x = N y = 14 and p = 5.
6.2. Problem 2: duct with arbitrary wall conductivity, λ, 0 ≤ λ < ∞ and horizontal magnetic field, θ = 0

In this problem, we first consider the unsteady MHD flow in a duct with perfectly conducting walls, i.e. λ = 0. The FVSE method is applied to the governing equations at M = 20, with Nx = Ny = 18 and p = 4. The contour graphs are displayed in Fig. 13. One can observe that when λ = 0, which is the highly conducting wall case, the induced magnetic field contours are perpendicular to the walls. To see the effect of an increase in λ, we present Figs. 14 and 15 for M = 20, θ = 0 and various values of the conductivity parameter λ. As λ increases, the graphs along the x-axis (y = 0) converge to the case λ = ∞ and show the behavior of the solution of MHD flow with insulated walls.

6.3. Problem 3: duct under oblique magnetic field, θ > 0

Here, the MHD flow problem (2.5) and (2.6) with an externally applied magnetic field making a positive angle θ with the x-axis is considered. The contour graphs for M = 200, λ = ∞, θ = π/4 and θ = π/3 with Nx = Ny = 14, p = 5 are plotted in Figs. 16 and 17, respectively. The boundary layers are concentrated near the corners in the direction of the applied oblique magnetic field for both the velocity and the induced magnetic field. Figs. 18 and 19 demonstrate the effect of various values of θ along the x-axis (y = 0), for M = 20, λ = ∞. As can be seen from these figures, the absolute value of the induced magnetic field along the x-axis is reduced when θ is increased from 0 to π/2.
Table 10. Comparison of velocity field of Shercliff’s problem at M = 0 using FVE method [12,13,18,29] and FVSE method with p = 4 in rectangular partition.
(x, y)        Exact value  FVE          FVE          FVE          FVSE         FVSE
                           Nx=Ny=7      Nx=Ny=10     Nx=Ny=20     Nx=Ny=7      Nx=Ny=10
(0.00, 0.00)  0.294685413  0.302207785  0.284537549  0.292268617  0.293524315  0.294639646
(0.25, 0.00)  0.278882332  0.271056627  0.270541100  0.277749251  0.277822912  0.278837322
(0.50, 0.00)  0.229339626  0.231701020  0.225332898  0.227641472  0.228399600  0.229310695
(0.75, 0.00)  0.139729128  0.134706103  0.137869228  0.138616943  0.139165561  0.139718764
(0.00, 0.25)  0.278882332  0.271056627  0.270541100  0.277749251  0.277822912  0.278826724
(0.25, 0.25)  0.264148031  0.243847323  0.257379628  0.264140337  0.263184058  0.264093068
(0.50, 0.25)  0.217799304  0.209248350  0.214822788  0.217023600  0.216940686  0.217760813
(0.75, 0.25)  0.133327705  0.122734219  0.131999258  0.132722756  0.132813676  0.133311067
(0.00, 0.50)  0.229339629  0.231701020  0.225332898  0.227641472  0.228399600  0.229284974
(0.25, 0.50)  0.217799304  0.209248350  0.214822788  0.217023600  0.216940686  0.217744528
(0.50, 0.50)  0.181144632  0.180457147  0.180701477  0.179895157  0.180367959  0.181102961
(0.75, 0.50)  0.112736689  0.107135393  0.112813110  0.111849409  0.112252718  0.112715493
(0.00, 0.75)  0.139729128  0.134706103  0.137869228  0.138616943  0.139165561  0.139689585
(0.25, 0.75)  0.133327705  0.122734219  0.131999258  0.132722756  0.132813676  0.133288009
(0.50, 0.75)  0.112736685  0.107135393  0.112813110  0.111849409  0.112252718  0.112704285
(0.75, 0.75)  0.072819791  0.066055943  0.073174537  0.072124775  0.072497677  0.072803408
Table 11. CPU time for solving Shercliff’s problem at M = 0 using FVE method [12,13,18,29] and FVSE method with p = 4 in rectangular partition.
Method                         CPU time
FVE method with Nx = Ny = 7    13.296875000 s
FVE method with Nx = Ny = 10   23.546875000 s
FVE method with Nx = Ny = 20   46.062500000 s
FVSE method with Nx = Ny = 7   25.218750000 s
FVSE method with Nx = Ny = 10  49.703125000 s
Table 12. Comparison of velocity field of Shercliff’s problem at M = 0 using FVE method [12,13,18,29] and FVSE method with p = 4 in triangular partition.
(x, y)        Exact value  FVE          FVE          FVE          FVSE         FVSE
                           N=98         N=162        N=512        N=98         N=162
(0.00, 0.00)  0.294685413  0.272737607  0.276831690  0.283129869  0.294674456  0.294685929
(0.25, 0.00)  0.278882332  0.270070641  0.274342165  0.280684993  0.278880091  0.278888301
(0.50, 0.00)  0.229339626  0.232178096  0.236691537  0.242644202  0.229340906  0.229347874
(0.75, 0.00)  0.139729128  0.147922581  0.151786413  0.156145942  0.139730493  0.139736015
(0.00, 0.25)  0.278882332  0.246426412  0.250426204  0.256324801  0.278880005  0.278888173
(0.25, 0.25)  0.264148031  0.246761771  0.249691638  0.254652832  0.264143559  0.264155549
(0.50, 0.25)  0.217799304  0.214940575  0.217414055  0.221439015  0.217799796  0.217809201
(0.75, 0.25)  0.133327705  0.138844587  0.141155358  0.143927349  0.133319897  0.133333345
(0.00, 0.50)  0.229339629  0.193414412  0.196930994  0.201775194  0.229340630  0.229347469
(0.25, 0.50)  0.217799304  0.195047613  0.197288581  0.200917054  0.217799601  0.217808893
(0.50, 0.50)  0.181144632  0.174818965  0.174774201  0.176534715  0.181146698  0.181154245
(0.75, 0.50)  0.112736689  0.116039194  0.116176105  0.117148959  0.112721197  0.112738381
(0.00, 0.75)  0.139729128  0.112446646  0.114695307  0.117599170  0.139730163  0.139735503
(0.25, 0.75)  0.133327705  0.113918187  0.115420152  0.117417161  0.133319626  0.133332849
(0.50, 0.75)  0.112736685  0.103958632  0.104121521  0.104909256  0.112721037  0.112738061
(0.75, 0.75)  0.072819791  0.077761183  0.073605998  0.072498974  0.073013319  0.072811937
6.4. Problem 4: duct with M = 0, λ, θ = arbitrary

In this case, the velocity and induced magnetic field are calculated for M = 0, i.e. the hydrodynamic case, in which the results are independent of λ and θ. The exact solution of this problem has been given by Singh and Lal [76]. In Tables 10–13, the exact value of the velocity for this case, the FVE and FVSE approximations, and the CPU times for rectangular and triangular meshes are reported. The results for both methods with Nx = Ny = 7 and Nx = Ny = 10 (N = 98 and N = 162) in the rectangular (triangular) mesh are presented. In these two cases, the accuracy of the FVSE method is higher than that of the FVE method, but the FVE method uses less CPU time. There is also another column with Nx = Ny = 20 (N = 512) for the FVE method, for which the CPU time is approximately equal to that of the FVSE method with Nx = Ny = 10 (N = 162) in the rectangular (triangular) partition, but as can be seen, the accuracy of FVE has still not reached that of FVSE.
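For orientation, the hydrodynamic reference values can be reproduced with a classical Fourier series. The Python sketch below rests on an assumption that is not spelled out in this section: with $M = 0$, $R = 1$ and $f(t) = 1$, the steady velocity is taken to satisfy $\nabla^2 V = -1$ in the square $|x| \le 1$, $|y| \le 1$ with $V = 0$ on the boundary. The series used is the standard separation-of-variables solution of that Poisson problem and is not necessarily the form given by Singh and Lal [76]; the function name `poisson_square` is hypothetical.

```python
import math

def poisson_square(x, y, terms=40):
    """Series solution of V_xx + V_yy = -1 on |x| <= 1, |y| <= 1 with V = 0 on the boundary."""
    s = 0.0
    for n in range(terms):
        k = (2 * n + 1) * math.pi / 2.0
        s += ((-1) ** n / (2 * n + 1) ** 3
              * math.cos(k * x) * math.cosh(k * y) / math.cosh(k))
    # Particular solution (1 - x^2)/2 minus the harmonic correction that enforces V = 0 at y = +-1.
    return 0.5 * (1.0 - x * x) - 16.0 / math.pi ** 3 * s

center_value = poisson_square(0.0, 0.0)
```

Under the stated assumption, `center_value` can be compared with the entry 0.294685413 listed for the point (0.00, 0.00) in the "Exact value" column of Tables 10 and 12.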
Table 13. CPU time for solving Shercliff’s problem at M = 0 using FVE method [12,13,18,29] and FVSE method with p = 4 in triangular partition.
Method                    CPU time
FVE method with N = 98    5.500000000 s
FVE method with N = 162   8.031250000 s
FVE method with N = 512   55.359375000 s
FVSE method with N = 98   42.281250000 s
FVSE method with N = 162  56.218750000 s
7. Conclusion

In this study, a FVSE method is developed for solving the MHD duct problem with various values of θ, the orientation of the applied magnetic field with respect to the x-axis, λ, the wall conductivity, and M, the Hartmann number. The effects of various values of these parameters are visualized in terms of graphics showing the characteristics of MHD duct flow. Comparisons with two other numerical methods in the cases for which the exact solution exists show that this technique is more accurate and agrees very well with the exact solution.

Acknowledgements

The authors are very grateful to the three reviewers for carefully reading this paper and for their comments and suggestions, which have improved the paper very much. The authors are also thankful to the Editor, Professor Spencer Sherwin, for his useful comments. We would also like to thank the Editor-in-Chief, Professor Robert Beauwens, for managing the review process for this paper.

References

[1] H. Alfvén, Existence of electromagnetic-hydrodynamic waves, Nature 150 (1942) 405–406.
[2] H. Al Moatassime, D. Esselaoui, A finite volume approach for unsteady viscoelastic fluid flows, Int. J. Numer. Meth. Fluid 39 (2002) 939–959.
[3] C. Bailey, G.A. Taylor, M. Cross, P. Chow, Discretisation procedures for multi-physics phenomena, J. Comput. App. Math. 103 (1999) 3–17.
[4] K.E. Barrett, Duct flow with a transverse magnetic field at high Hartmann numbers, Int. J. Numer. Meth. Eng. 50 (2001) 1893–1906.
[5] T. Barth, Aspects of Unstructured Grids and Finite Volume Solvers for the Euler and Navier–Stokes Equations, 25th Computational Fluid Dynamics Lecture Series, Von Karman Institute, 1994.
[6] V. Bojarevics, V.I. Sharamkin, MHD flows due to current spreading in an axisymmetric layer of finite thickness, Magnetohydrodynamics 13 (1977) 172–177.
[7] V. Bojarevics, Magnetohydrodynamic interface waves and the distribution of heat caused by the dynamic interaction of currents in an aluminium electrolyte cell, Magnetohydrodynamics 28 (1992) 360–367.
[8] V. Bojarevics, K. Pericleous, Magnetic levitation fluid dynamics, Magnetohydrodynamics 37 (2001) 93–102.
[9] V. Bojarevics, K. Pericleous, Comparison of MHD models for aluminium reduction cells, Light Metals 2 (2006) 347–352.
[10] N. Bozkaya, M. Tezer-Sezgin, Time-domain BEM solution of convection-diffusion-type MHD equations, Int. J. Numer. Meth. Fluids 56 (2008) 1969–1991.
[11] H. Branover, P. Gershon, MHD turbulence study, Ben-Gurion University, Rept. BGUN-RDA-100-76, 1976.
[12] Z. Cai, On the finite volume element method, Numer. Math. 58 (1991) 713–735.
[13] Z. Cai, J. Mandel, S. McCormick, The finite volume element method for diffusion equations on general triangulations, SIAM J. Numer. Anal. 28 (2) (1991) 392–402.
[14] C. Canuto, M.Y. Hussaini, A. Quarteroni, T.A. Zang, Spectral Methods in Fluid Dynamics, Springer-Verlag, New York, 1988.
[15] C. Chainais-Hillairet, Second-order finite-volume schemes for a non-linear hyperbolic equation: Error estimate, Math. Method. Appl. Sci. 23 (2000) 467–490.
[16] C.T. Chan, K. Anastasiou, Solution of incompressible flows with or without a free surface using the finite volume method on unstructured triangular meshes, Int. J. Numer. Meth. Fluids 29 (1999) 35–57.
[17] C.C. Chang, T.S. Lundgren, Duct flow in magnetohydrodynamics, ZAMP 12 (1961) 100–114.
[18] P. Chatzipantelidis, A finite volume method based on the Crouzeix–Raviart element for elliptic PDE’s in two dimensions, Numer. Math. 82 (1999) 409–432.
[19] P. Chatzipantelidis, R.D. Lazarov, V. Thomée, Error estimates for a finite volume element method for parabolic equations in convex polygonal domains, Numer. Meth. Partial Differential Eq. 20 (2004) 650–674.
[20] P.G. Ciarlet, The Finite Element Method for Elliptic Problems, North-Holland, Amsterdam, 1978.
[21] J. Davidson, A. Thess, P.A. Davidson, Magnetohydrodynamics, Springer, Vienna, 2002.
[22] M. Dehghan, D. Mirzaei, Meshless Local Petrov–Galerkin (MLPG) method for the unsteady magnetohydrodynamic (MHD) flow through pipe with arbitrary wall conductivity, Appl. Numer. Math. 59 (2009) 1043–1058.
[23] M. Dehghan, D. Mirzaei, Meshless local boundary integral equation (LBIE) method for the unsteady magnetohydrodynamic (MHD) flow in rectangular and circular pipes, Comput. Phys. Commun. 180 (2009) 1458–1466.
[24] M. Dehghan, Finite difference procedures for solving a problem arising in modeling and design of certain optoelectronic devices, Math. Comput. Simulation 71 (2006) 16–30.
[25] M. Dehghan, On the solution of an initial-boundary value problem that combines Neumann and integral condition for the wave equation, Numer. Methods Partial Differential Eq. 21 (2005) 24–40.
[26] Z. Demendy, T. Nagy, M.-E. Hungary, A new algorithm for solution of equations of MHD channel flows at moderate Hartmann numbers, Acta Mech. 123 (1997) 135–149.
[27] L. Dragos, Magneto-Fluid Dynamics, Abacus Press, England, 1975.
[28] K.S. Erduran, V. Kutija, C.J.M. Hewett, Performance of finite volume solutions to the shallow water equations with shock–capturing schemes, Int. J. Numer. Meth. Fluids 40 (2002) 1237–1273.
[29] R.E. Ewing, R.D. Lazarov, Y. Lin, Finite volume element approximations of nonlocal in time one-dimensional flows in porous media, Computing 64 (2000) 157–182.
22
F. Shakeri, M. Dehghan / Applied Numerical Mathematics 61 (2011) 1–23
[30] R.E. Ewing, T. Lin, Y. Lin, On the accuracy of the finite volume element method based on piecewise linear polynomials, SIAM J. Numer. Anal. 39 (6) (2002) 1865–1888. [31] R. Eymard, T. Gallouët, R. Herbin, Finite volume methods, in: Handbook of Numerical Analysis, vol. VII, North-Holland, Amsterdam, 2000, pp. 713–1020. [32] F.G. Fuchs, S. Mishra, N.H. Risebro, Splitting based finite volume schemes for ideal MHD equations, J. Comput. Phys. 228 (2009) 641–660. [33] L.R.T. Gardner, G.A. Gardner, A two-dimensional bi-cubic B-spline finite element used in a study of MHD duct flow, Comput. Methods Appl. Mech. Engrg. 124 (1995) 365–375. [34] R.R. Gold, Magnetohydrodynamic pipe flow. Part 1, J. Fluid Mech. 13 (1962) 505–512. [35] J.L. Guermond, R. Laguerre, J. Léorat, C. Nore, An interior penalty Galerkin method for the MHD equations in heterogeneous domains, J. Comput. Phys. 221 (2007) 349–369. [36] J. Han, H. Tang, An adaptive moving mesh method for two-dimensional ideal magnetohydrodynamics, J. Comput. Phys. 220 (2007) 791–812. [37] J. Hartmann, I. Hg-Dynamics, Theory of the laminar flow of an electrically conducting liquid in a homogeneous magnetic field, K. Dan. Vidensk. Selsk. Mat.-Fys. Medd. 15 (1937) 1–27. [38] R.E. Harris, Z.J. Wang, High-order adaptive quadrature-free spectral volume method on unstructured grids, Computers and Fluids 38 (2009) 2006–2025. [39] E.T. Holroyd, J.C.R. Hunt, A review of MHD flows in ducts with changing cross section areas and non-uniform magnetic fields, Euromech Colloquium 70 (1976) 16–19. [40] R.J. Holroyd, An experimental study of the effects of wall conductivity, non-uniform magnetic field and variable-area ducts on liquid metal flow at high Hartmann number, Part 1: Ducts with non-conducting walls, J. Fluid Mech. 93 (1979) 609–630. [41] R.J. Holroyd, MHD flow in a rectangular duct with pairs of conducting and non-conducting walls in the presence of a non-uniform magnetic field, J. Fluid Mech. 96 (1980) 335–353. [42] T.Q. 
Hua, J.S. Walker, Three-dimensional MHD flow in insulating circular ducts in non-uniform transverse magnetic fields, Int. J. Engrg. Sci. 27 (1989) 1079–1091. [43] W.F. Hughes, F.J. Young, The Electromagnetohydrodynamics of Fluid, John Wiley and Sons, New York, 1966. [44] W.F. Hughes, I.R. McNAB, A quasi one dimensional analysis of an electromagnetic pump including end effects, in: H. Branover, P.S. Lykoudis, A. Yakhot (Eds.), Liquid-Metal Flows and Magnetohydrodynamics, Progress in Astronautics and Aeronautics 84 (1983) 287–312. [45] M. Hughes, K.A. Pericleous, M. Cross, The CFD analysis of simple parabolic and elliptic MHD flows, Appl. Math. Modelling 18 (1994) 150–155. [46] M. Hughes, K.A. Pericleous, M. Cross, The numerical modelling of DC electromagnetic pump and brake flow, Appl. Math. Model. 19 (1995) 713–723. [47] J.G. Huang, S.T. Xi, On the finite volume element method for general self-adjoint elliptic problems, SIAM J. Numer. Anal. 35 (1998) 1762–1774. [48] J.C.R. Hunt, Magnetohydrodynamic flow in rectangular ducts, J. Fluid Mech. 21 (1965) 577–590. [49] A. Kao, G. Djambazov, K. Pericleous, V. Voller, Thermoelectric MHD in dendritic solidification, Magnetohydrodynamics 45 (2009) 305–316. [50] N. Kechkar, D. Silvester, Analysis of locally stabilized mixed finite element methods for the Stokes problem, Math. Comput. 58 (1992) 1–10. [51] R.J. LeVeque, Finite Volume Methods for Hyperbolic Problems, University Press, Cambridge, 2002. [52] S.L. Lopes Verardi, J.M. Machado, Y. Shiyou, The application of interpolating MLS approximations to the analysis of MHD flows, Source Finite Elements in Analysis and Design 39 (2003) 1173–1187. [53] M. Lukáˇcová-Medvid’ová, Z. Vlk, Well-balanced finite volume evolution Galerkin methods for the shallow water equations with source terms, Int. J. Numer. Meth. Fluids 47 (2005) 1165–1171. [54] A.J. Meir, Finite element analysis of magnetohydrodynamic pipe flow, Appl. Math. Comput. 57 (1993) 177–196. [55] A.J. Meir, P.G. 
Schmidt, Analysis and numerical approximation of a stationary MHD flow problem with nonideal boundary, SIAM J. Numer. Anal. 36 (1999) 1304–1332. [56] A. Mignone, A simple and accurate Riemann solver for isothermal MHD, J. Comput. Phys. 225 (2007) 1427–1441. [57] W.R. Mitchell, Optimal multilevel iterative methods for adaptive grids, SIAM J. Sci. Stat. Comp. 13 (1992) 146–167. [58] A.I. Neslitürk, M. Tezer-Sezgin, The finite element method for MHD flow at high Hartmann numbers, Comput. Methods Appl. Mech. Engrg. 194 (2005) 1201–1224. [59] A.I. Neslitürk, M. Tezer-Sezgin, Finite element method solution of electrically driven magnetohydrodynamic flow, J. Comput. Appl. Math. 192 (2006) 339–352. [60] M.J. Ni, R. Munipalli, N.B. Morley, P. Huang, M.A. Abdou, A current density conservative scheme for incompressible MHD flows at a low magnetic Reynolds number, Part 1: On a rectangular collocated grid system, J. Comput. Phys. 227 (2007) 174–204. [61] P.J. Oliveira, On the numerical implementation of nonlinear viscoelastic models in a finite-volume method, Numer. Heat Transfer, Part B 40 (2001) 283–301. [62] A.T. Patera, A spectral element method for fluid dynamics: Laminar flow in a channel expansion, J. Comput. Phys. 54 (1984) 468–488. [63] C. Pozrikidis, Introduction to Finite and Spectral Element Methods Using Matlab, Chapman and Hall/CRC, 2005. [64] K.A. Pericleous, V. Bojarevics, Pseudo-spectral solutions for fluid flow and heat transfer in electro-metallurgical applications, Progr. Comput. Fluid Dynam. 7 (2007) 118–127. [65] K.A. Pericleous, M. Hughes, M. Cross, The CFD analysis of simple parabolic and elliptic MHD flows, Appl. Math. Modelling 18 (1994) 150–155. [66] J.I. Ramos, N.S. Winowich, Magnetohydrodynamic channel flow study, Phys. Fluids 29 (1986) 992–997. [67] J.I. Ramos, N.S. Winowich, Finite difference and finite element methods for MHD channel flows, Int. J. Numer. Meth. Fluids 11 (1990) 907–934. [68] S.S. 
Ravindran, Linear feedback control and approximation for a system governed by unsteady MHD equations, Comput. Methods Appl. Mech. Engrg. 198 (2008) 524–541. [69] D.R. Reynolds, R. Samtaney, C.S. Woodward, A fully implicit numerical method for single-fluid resistive magnetohydrodynamics, J. Comput. Phys. 219 (2006) 144–162. [70] N.B. Salah, A. Soulaimani, W.G. Habashi, A finite element method for magnetohydrodynamics, Comput. Methods Appl. Mech. Engrg. 190 (2001) 5867– 5892. [71] L. Seungsoo, G.S. Dulikravich, Magnetohydrodynamic steady flow computation in three dimensions, Int. J. Numer. Meth. Fluids 13 (1991) 917–936. [72] J.A. Shercliff, Steady motion of conducting fluids in pipes under transverse magnetic fields, Proc. Camb. Phil. Soc. 49 (1953) 136–144. [73] T.W.H. Sheu, R.K. Lin, Development of a convection-diffusion-reaction magnetohydrodynamic solver on nonstaggered grids, Int. J. Numer. Meth. Fluids 45 (2004) 1209–1233. [74] B. Singh, J. Lal, MHD axial flow in a triangular pipe under transverse magnetic field, Indian J. Pure Appl. Math. 9 (1978) 101–115. [75] B. Singh, J. Lal, MHD axial flow in a triangular pipe under transverse magnetic field parallel to a side of the triangle, Ind. J. Tech. 17 (1979) 184–189. [76] B. Singh, J. Lal, FEM in MHD channel flow problems, Int. J. Numer. Meth. Eng. 18 (1982) 1104–1111. [77] B. Singh, J. Lal, Heat transfer for MHD flow through a rectangular pipe with discontinuity in wall temperatures, J. Heat Moss Transfer 25 (1982) 1523–1529. [78] B. Singh, J. Lal, FEM for unsteady MHD flow through pipes with arbitrary wall conductivity, Int. J. Numer. Meth. Fluids 4 (1984) 291–302. [79] B. Singh, J. Lal, P.K. Agarwal, Finite element method for unsteady MHD channel flow with arbitrary wall conductivity and orientation of applied magnetic field, Indian J. Pure Appl. Math. 16 (11) (1985) 1390–1398. [80] A.K. Slone, C. Bailey, M. Cross, Dynamic solid mechanics using finite volume methods, Appl. Math. Model. 27 (2003) 69–87.
F. Shakeri, M. Dehghan / Applied Numerical Mathematics 61 (2011) 1–23
23
[81] P. Smith, Some asymptotic extremum principles for magnetohydrodynamic pipe flow, Appl. Sci. Res. 24 (1971) 452–466. [82] A. Sterl, Numerical simulation of liquid-metal MHD flows in rectangular ducts, J. Fluid Mech. 216 (1990) 161–191. [83] C. Suwon, H.H. Sang, The magnetic field and performance calculations for an electromagnetic pump of a liquid metal, J. Phys. D: Appl. Phys. 31 (1998) 2754–2759. [84] H.S. Takhar, A.K. Singh, G. Nath, Unsteady MHD flow and heat transfer on a rotating disk in an ambient fluid, Int. J. Therm. Sci. 41 (2002) 147–155. [85] M. Tezer-Sezgin, Magnetohydrodynamic flow in a rectangular duct, Int. J. Numer. Meth. Fluids 7 (2005) 697–718. [86] M. Tezer-Sezgin, S. Han Aydın, Dual reciprocity boundary element method for magnetohydrodynamic flow using radial basis functions, Int. J. Comput. Fluid Dyn. 16 (1) (2002) 88–92. [87] M. Tezer-Sezgin, S. Köksal, Finite element method for solving MHD flow in a rectangular duct, Int. J. Numer. Meth. Eng. 28 (1989) 445–459. [88] A.N. Tikhonov, A.A. Samarskii, Homogeneous difference schemes, USSR Comput. Math. Math. Phys. 1 (1962) 5–67. [89] A.N. Tikhonov, A.A. Samarskii, Homogeneous difference schemes on nonuniform nets, USSR Comput. Math. Math. Phys. 2 (1963) 927–953. [90] H. Versteeg, W. Malalasekra, An Introduction to Computational Fluid Dynamics: The Finite Volume Method, Prentice Hall, 2007. [91] J.S. Walker, G.S.S. Ludford, MHD flow in insulating circular expansions with strong transverse magnetic fields, Int. J. Eng. Sci. 12 (1974) 1045–1061. [92] Q. Wan, H. Wan, C. Zhou, Y. Wu, Simulating the hydraulic characteristics of the lower Yellow River by the finite-volume technique, Hydrol. Process. 16 (2002) 2767–2779. [93] N.S. Winowich, W.F. Hughes, J.I. Ramos, Numerical simulation of electromagnetic pump flow, Numer. Methods Laminar Turbulent Flow 5 (1987) 1228– 1240. [94] G. Xia, C.L. Lin, An unstructured finite volume approach for structural dynamics in response to fluid motions, Comput. 
Struct. 86 (2008) 684–701. [95] Z. Xiong, Y. Chen, Finite volume element method with interpolated coefficients for two-point boundary value problem of semilinear differential equations, Comput. Methods Appl. Mech. Engrg. 196 (2007) 3798–3804. [96] G. Yagawa, M. Masuda, Finite element analysis of magnetohydrodynamics and its application to lithium blanket design of fusion reactor, Nucl. Eng. Design 71 (1982) 121–136. [97] H.C. Yee, B. Sjögreen, Development of low dissipative high order filter schemes for multiscale Navier–Stokes/MHD systems, J. Comput. Phys. 225 (2007) 910–934. [98] M. Zhang, S.T. John Yu, S.C. Henry Lin, S.C. Chang, I. Blankson, Solving the MHD equations by the space–time conservation element and solution element method, J. Comput. Phys. 214 (2006) 599–617. [99] Z.P. Zang, B. Teng, W. Bai, L. Cheng, A finite volume solution of wave forces on submarine pipelines, Ocean Eng. 34 (2007) 1955–1964.
Applied Numerical Mathematics 61 (2011) 24–37
Adaptive sparse grid algorithms with applications to electromagnetic scattering under uncertainty

Meilin Liu a,b,1, Zhen Gao b,c,1, Jan S. Hesthaven b,∗

a College of Information Science and Technology, Nanjing University of Aeronautics and Astronautics, China
b Division of Applied Mathematics, Brown University, USA
c Research Center for Applied Mathematics, Ocean University of China, China
Article info

Article history: Received 2 January 2010. Received in revised form 9 August 2010. Accepted 9 August 2010. Available online 14 August 2010.

Keywords: Sparse grids; Stochastic collocation; Adaptivity; Maxwell’s equations

Abstract

We discuss adaptive sparse grid algorithms for stochastic differential equations with a particular focus on applications to electromagnetic scattering by structures with holes of uncertain size, location, and quantity. Stochastic collocation (SC) methods are used in combination with an adaptive sparse grid approach based on nested Gauss–Patterson grids. As an error estimator we demonstrate how the nested structure allows an effective error estimation through Richardson extrapolation. This is shown to allow excellent error estimation and it also provides an efficient means by which to estimate the solution at the next level of the refinement. We introduce an adaptive approach for the computation of problems with discrete random variables and demonstrate its efficiency for scattering problems with a random number of holes. The results are compared with results based on Monte Carlo methods and with Stroud based integration, confirming the accuracy and efficiency of the proposed techniques. © 2010 IMACS. Published by Elsevier B.V. All rights reserved.
1. Introduction

With the increasing need to quantify the impact of random input parameters on different types of physical systems or engineering applications, often described by stochastic ordinary or partial differential equations, comes the need for new and efficient computational techniques to deal with such problems. The classic Monte Carlo based methods are increasingly inadequate and a number of alternatives have begun to emerge. It is reasonable to categorize the majority of these methods into two groups: sampling based statistical methods and probabilistic techniques.

In the first category one finds the classic Monte Carlo (MC) method [4] with the clear advantage of being simple and non-intrusive, i.e., one needs only a deterministic solver. The simplicity, however, comes at the cost of very slow convergence, as $O(M^{-1/2})$, where $M$ is the number of samples. This quickly becomes prohibitive even if only reasonable accuracy is required, in particular if the interest is in higher moments such as variance/sensitivity. A notable exception to this is for very high dimensional problems where the advantage of the dimensionally independent convergence rate eventually becomes important. However, we shall not consider this limit here. To accelerate convergence of the MC method, several techniques have been proposed, e.g., Latin hypercube sampling [15], the quasi-MC (QMC) method [5], and the Markov chain MC (MCMC) method [6]. However, additional restrictions are often imposed by these methods and their applicability is limited.
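The $O(M^{-1/2})$ rate quoted above can be observed in a small experiment. The following is a minimal sketch and not part of the original work: the integrand $\xi^2$ with $\xi$ uniform on $[-1, 1]$, the sample sizes, and the helper names `mc_mean` and `rms_error` are all illustrative assumptions.

```python
import math
import random


def mc_mean(f, M, rng):
    """Plain Monte Carlo estimate of E[f(xi)] for xi uniform on [-1, 1]."""
    return sum(f(rng.uniform(-1.0, 1.0)) for _ in range(M)) / M


def rms_error(f, exact, M, repeats, rng):
    """Root-mean-square error of the M-sample estimator over several runs."""
    squared = [(mc_mean(f, M, rng) - exact) ** 2 for _ in range(repeats)]
    return math.sqrt(sum(squared) / repeats)


rng = random.Random(0)           # fixed seed so the run is reproducible
f = lambda xi: xi * xi           # hypothetical integrand, E[xi^2] = 1/3
exact = 1.0 / 3.0

err_small = rms_error(f, exact, 100, 100, rng)
err_large = rms_error(f, exact, 6400, 100, rng)

# With 64 times more samples the error should shrink by roughly sqrt(64) = 8.
ratio = err_small / err_large
print(f"RMS error, M=100:  {err_small:.2e}")
print(f"RMS error, M=6400: {err_large:.2e}")
print(f"ratio: {ratio:.1f}")
```

Because the error shrinks only by the square root of the extra work, each additional digit of accuracy costs about a factor of 100 in samples, which is the motivation for the alternatives discussed next.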
∗ Corresponding author. E-mail address: [email protected] (J.S. Hesthaven).
1 The two lead authors have contributed equally and substantially to the completion of the published work.
0168-9274/$30.00 © 2010 IMACS. Published by Elsevier B.V. All rights reserved. doi:10.1016/j.apnum.2010.08.002
M. Liu et al. / Applied Numerical Mathematics 61 (2011) 24–37
An alternative to sampling based techniques has recently received substantial attention. These methods, known as Stochastic Galerkin or Polynomial Chaos (PC) methods [10,21], are probabilistic in nature and are based on a generalization of the Wiener–Hermite PC expansion [20]. In this approach, the randomness is represented by the Wiener expansion and the unknown expansion coefficients are found by a Galerkin procedure in the inner product associated with the random variables used in the Wiener expansion. Substantial recent work has confirmed the accuracy and efficiency of this approach, in particular for problems with low to moderate dimensionality and for problems with sufficient smoothness in observation space, resulting in very efficient representations through the Wiener expansion. However, a substantial disadvantage of the Galerkin approach lies in the need to develop entirely new software to solve the large coupled equations resulting from this procedure. This represents a significant problem as validated existing software cannot be used directly to model the impact of randomness and uncertainty.

To address this shortcoming of an otherwise successful approach, several authors have proposed a slight modification of this traditional approach. As mentioned above, the bottleneck in the stochastic Galerkin approach is the creation of a new large coupled system through the required inner product. It is natural to attempt to satisfy the high-dimensional problem in a collocation fashion instead, resulting in a large number of decoupled small problems, much in the sense of an MC approach. However, in the collocation approach, the sampling points are deterministic and associated with integration formulas for the evaluation of high-dimensional integrals, in contrast to MC based techniques where the sampling points are drawn randomly from some a priori distribution. This approach, now known as stochastic collocation, was first proposed by Tatang et al. [17], more recently revisited and extended in [22], and subsequently considered in more detail by numerous authors; see [24] for a recent review. A clear advantage of this approach over the stochastic Galerkin formulation is its non-intrusive nature, enabling one to use validated software much in the same way as for MC based techniques.

A central component of the efficiency and accuracy of these techniques is the construction of efficient and accurate integration methods for high-dimensional problems. In [22,24] several options are discussed in detail, including Stroud’s cubature points [16], resulting in an efficient approach at moderate accuracy, and sparse grids constructed through Smolyak’s algorithm [14] combined with the Clenshaw–Curtis integration method [3]. This latter approach improves accuracy but is costly due to the moderate accuracy of the quadrature.

In this work we discuss the tradeoff between accuracy and computational efficiency in a few different ways. We first consider the use of hierarchical Gauss–Patterson integration formulas as a more accurate alternative to the Clenshaw–Curtis nodes. A complementary discussion of some of this can be found in [9]. To further decrease computational cost we propose to use Richardson extrapolation between the levels to estimate errors and ultimately predict the results of an increased accuracy at very little additional cost. Most of the past work discussed above focuses on problems with continuous random variables. In this work we also discuss basic approaches for the adaptive solution of problems with discrete random variables and demonstrate how to use a priori information effectively. Throughout, we illustrate the efficiency and accuracy of the methods on a set of benchmark problems, often referred to as the Genz problems [7,8]. However, we conclude this work with the consideration of problems of electromagnetic scattering in which Maxwell’s equations are solved in the time-domain using a discontinuous Galerkin method. It is important to note, however, that the adaptive techniques proposed here are problem independent and can be applied to general stochastic problems, solved using a variety of computational techniques.

What remains of the paper is organized as follows. In Section 2 we provide a brief overview of polynomial chaos techniques with a focus on stochastic collocation methods. This sets the stage for Section 3 where we discuss high-dimensional integration schemes and introduce Gauss–Patterson integration methods in combination with the Smolyak construction. Section 4 discusses in detail the strategies for adaptivity and the use of Richardson extrapolation as an error estimator in this context. We also briefly discuss ideas that allow the efficient adaptive solution of problems in which discrete random variables are used to model the random behavior. In Section 5 we provide a brief overview of the application cases and the computational technique used to solve Maxwell’s equations and then illustrate some additional results in support of the generality of the proposed techniques. Section 6 contains a few concluding remarks.

2. Stochastic collocation methods

Let us adopt the notation of [22]. $(\Omega, \mathcal{A}, P)$ is a complete probability space, where $\Omega$ is the event space, $\mathcal{A} \subseteq 2^{\Omega}$ the $\sigma$-algebra, and $P$ the probability measure. Assume a $d$-dimensional bounded domain $D \subset \mathbb{R}^d$ ($d = 1, 2, 3$), with boundary $\partial D$, and focus on the following problem: find a stochastic function, $u \equiv u(\omega, x) : \Omega \times \bar{D} \to \mathbb{R}$, such that for $P$-almost every $\omega \in \Omega$, the following equation holds,
$$\mathcal{L}(\omega, x; u) = f(\omega, x), \quad x \in D, \tag{1}$$
subject to the boundary condition
$$\mathcal{B}(\omega, x; u) = g(\omega, x), \quad x \in \partial D, \tag{2}$$
where $x = (x_1, \dots, x_d, t)$, $\mathcal{L}$ is a differential operator, and $\mathcal{B}$ is a boundary operator. Note that we do not distinguish between spatial and temporal dimensions at this stage to keep the notation simple.
We assume that the randomness can be represented by $p$ independent variables with zero mean and unit variance, each depending on the random event $\omega$. There are several ways to achieve this depending on details of the problem [24]. We assume validity of the standard technique of a Karhunen–Loève expansion to express the randomness as
$$\Xi(\omega) = \mu_0 + \sum_{k=1}^{p} \lambda_k \psi_k \xi_k(\omega),$$

where $\mu_0$ represents the mean of the random field, $\psi_k$ are the orthogonal eigenfunctions associated with the eigenvalues, $\lambda_k$, of the correlation function of the random process being represented, and $\xi_k$ are the random variables (see e.g. [10,24] for details). Representing the randomness in this generic form in Eqs. (1)–(2) yields a $(p + d)$-dimensional differential equation in strong form as

$$\mathcal{L}(\xi, x; u) = f(\xi, x), \quad (\xi, x) \in \Gamma \times D,$$

subject to the boundary condition

$$\mathcal{B}(\xi, x; u) = g(\xi, x), \quad (\xi, x) \in \Gamma \times \partial D,$$
where Γ is the p-dimensional random space. To account for the impact of the uncertainty, it is natural to consider moments of the solutions over the probabilistic space. In other words, we need to evaluate multi-dimensional integrals of the form
$$I[f] = \int_{\Gamma} f(\xi)\, d\mu(\xi).$$
The simplest way to achieve this is through a Monte Carlo approach like
$$I[f] \approx \frac{1}{M} \sum_{m=1}^{M} f(\xi_m)$$
with the $M$ instances, $\xi_m$, being drawn from the distribution, $\mu(\xi)$. As mentioned previously, the disadvantage of this approach is its low convergence rate which, however, is independent of the dimension $p$ of the random space.

Realizing that all we need is to be able to evaluate integrals accurately, it seems reasonable to utilize more accurate integration techniques. At least for problems of moderate dimensionality one would expect these to be superior in terms of accuracy vs cost. This line of argument was first explored in [17] for relatively simple ordinary differential equations and discussed in much more detail in [22]. We refer to these and [24] for further aspects of this.

The essence of the stochastic collocation approach is to abandon the random sampling approach and consider the use of more advanced integration approaches and, in this work, adaptive hierarchical integration techniques. In other words we shall solve the deterministic problems
$$\mathcal{L}(\xi_k, x; u) = f(\xi_k, x), \quad x \in D,$$

with boundary condition

$$\mathcal{B}(\xi_k, x; u) = g(\xi_k, x), \quad x \in \partial D,$$
where $\xi_k \in \Gamma$ are specific instances chosen with an integration formula in mind. Clearly, an objective in identifying this integration approach is to minimize the number of samples needed to achieve a given accuracy in evaluating the integral.

3. Integration for high-dimensional problems

For the multi-dimensional integration, we consider a number of different approaches, the simplest of which are Stroud’s cubature points [16]. These are useful when computing integrals of the form
$$I[f] = \int_{[-1,1]^p} f(x)\, dx. \tag{3}$$
The first of these, the Stroud-2 method, is based on $(p + 1)$ points, is exact for polynomials of degree two, and is given as
$$I[f] \approx \sum_{i=1}^{n} \omega_i f(x_i), \tag{4}$$
where the $n = p + 1$ cubature points $x_i = (x_i^1, x_i^2, \dots, x_i^n)$ are given by

$$x_i^{2r-1} = \sqrt{\frac{2}{3}} \cos\frac{2r(i-1)\pi}{n+1}, \qquad x_i^{2r} = \sqrt{\frac{2}{3}} \sin\frac{2r(i-1)\pi}{n+1}, \tag{5}$$

for $r = 1, \dots, \lfloor n/2 \rfloor$. If $n$ is odd, $x_i^n = (-1)^{(i-1)}/\sqrt{3}$. The weights in (4) are all equal to $2^n/(n + 1)$. Similarly, we have the Stroud-3 method based on $2p$ points which is exact for polynomials of degree three:
$$I[f] \approx \sum_{i=1}^{n} \omega_i f(x_i), \tag{6}$$

where the $n = 2p$ cubature points $x_i = (x_i^1, x_i^2, \dots, x_i^n)$ are now defined by

$$x_i^{2r-1} = \sqrt{\frac{2}{3}} \cos\frac{(2r-1)i\pi}{n}, \qquad x_i^{2r} = \sqrt{\frac{2}{3}} \sin\frac{(2r-1)i\pi}{n}, \tag{7}$$
for $r = 1, \dots, \lfloor n/2 \rfloor$. The weights in (6) are all equal to $2^n/2n$. It can be shown [3,23] that the Stroud-2 and Stroud-3 methods use the minimal number of points for their corresponding integration accuracy. These very simple schemes have recently been extended to general weights in [23].

While the Stroud schemes are efficient and may suffice to compute the expectation, their limited accuracy is often a problem. The most straightforward way to extend the many known one-dimensional integration methods to higher ($p$) dimensions is through the use of simple tensor products. However, this quickly becomes prohibitive with the number of samples growing like $N^p$ for a quadrature of order $N$ used in $p$ dimensions. A valuable and often superior alternative to this is the use of sparse grid methods, of which the most notable ones are those based on the Smolyak construction.

In [14], Smolyak proposed a construction of sparse multivariate quadrature formulas based on sparse tensor products of one-dimensional quadrature formulas. Let us consider the numerical integration of functions $f(x)$ over a $p$-dimensional unit hypercube $\Gamma := [-1, 1]^p$,
$$I^p[f] := \int_{\Gamma} f(x)\, dx,$$

by a sequence of $n_l^p$-point quadrature formulas with level $l \in \mathbb{N}$ and $n_l^p < n_{l+1}^p$,

$$Q_l^p f := \sum_{i=1}^{n_l^p} \omega_{li} f(x_{li}) \tag{8}$$

using the weights $\omega_{li}$ and abscissas $x_{li}$. Moreover, we define the underlying grids of a quadrature formula by

$$\Gamma_l^p := \left\{ x_{li} \in [-1, 1]^p : 1 \le i \le n_l^p \right\}. \tag{9}$$

Now, define the difference quadrature as

$$\Delta_k^1 f := \left( Q_k^1 - Q_{k-1}^1 \right) f \quad \text{with } Q_0^1 f := 0. \tag{10}$$

Smolyak’s construction for the integration of $p$-dimensional functions $f$ is

$$Q_l^p f := \sum_{|k|_1 \le l+p+1} \left( \Delta_{k_1}^1 \otimes \cdots \otimes \Delta_{k_p}^1 \right) f \tag{11}$$

where $l \in \mathbb{N}$ and $k \in \mathbb{N}^p$. An essential feature of this construction is that the sparse quadrature formulas are nested if the corresponding one-dimensional quadrature nodes are nested. Notably, this rules out classic Gauss quadratures which are not nested and the impact of this is illustrated in Fig. 1 and will be discussed shortly.
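To make the construction in Eqs. (10)–(11) concrete, the following minimal sketch assembles a sparse quadrature from one-dimensional difference rules. It is an illustration under stated assumptions, not the authors' implementation: it uses nested Clenshaw–Curtis rules (level 1 taken as the midpoint rule, level $k \ge 2$ with $2^{k-1} + 1$ points), a generic index bound `q` on $|k|_1$ rather than the specific bound printed in Eq. (11), and hypothetical function names (`cc_rule`, `difference_rule`, `smolyak`, `integrate`). Nodes are keyed by the exact fraction $j/n$ of the angle so that points shared between levels merge exactly.

```python
import math
from fractions import Fraction
from itertools import product


def cc_rule(level):
    """1D Clenshaw-Curtis rule on [-1, 1] as {Fraction(j, n): weight}.

    The node belonging to key t is cos(pi * t). Level 1 is assumed to be the
    midpoint rule; level k >= 2 uses n + 1 points with n = 2**(k - 1).
    """
    if level == 1:
        return {Fraction(1, 2): 2.0}
    n = 2 ** (level - 1)
    rule = {}
    for j in range(n + 1):
        c = 1.0 if j in (0, n) else 2.0
        s = 0.0
        for k in range(1, n // 2 + 1):
            b = 1.0 if k == n // 2 else 2.0
            s += b / (4 * k * k - 1) * math.cos(2 * k * j * math.pi / n)
        rule[Fraction(j, n)] = c / n * (1.0 - s)
    return rule


def difference_rule(level):
    """Delta_k = Q_k - Q_{k-1} with Q_0 = 0, as in Eq. (10)."""
    diff = dict(cc_rule(level))
    if level > 1:
        for key, weight in cc_rule(level - 1).items():
            diff[key] -= weight      # nestedness guarantees the key exists
    return diff


def smolyak(p, q):
    """Sum tensor products of difference rules over k >= 1 with |k|_1 <= q."""
    grid = {}
    for k in product(range(1, q - p + 2), repeat=p):
        if sum(k) > q:
            continue
        diffs = [difference_rule(ki) for ki in k]
        for combo in product(*(d.items() for d in diffs)):
            key = tuple(node for node, _ in combo)
            weight = math.prod(w for _, w in combo)
            grid[key] = grid.get(key, 0.0) + weight
    return grid


def integrate(f, grid):
    """Apply the sparse rule to f, mapping each key t to the node cos(pi t)."""
    return sum(w * f(*(math.cos(math.pi * t) for t in key))
               for key, w in grid.items())


grid = smolyak(2, 4)
approx = integrate(lambda x, y: x * x * y * y, grid)
print(len(grid), approx)   # exact value of the integral is 4/9
```

With `p = 2` and `q = 4` the grid has 13 distinct points, the same count as the second C-C entry in Table 1, whereas the full tensor product of the five-point rule would need 25.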
Fig. 1. Two-dimensional Smolyak based sparse grids based on Clenshaw–Curtis (left), Gauss–Patterson (middle) and Gauss–Legendre (right) for 6 levels in the hierarchical integration.
Table 1
The number of grid points required in a two-dimensional Smolyak sparse grid based on Clenshaw–Curtis (C-C), Gauss–Patterson (G-P) or Gauss–Legendre (G-L) integration nodes.

Level   C-C   G-P   G-L
1       5     5     5
2       13    17    21
3       29    49    73
4       65    129   221
5       145   321   609
6       321   769   1573
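Before turning to nested rules, the Stroud formulas (4)–(7) from the start of this section can be checked numerically. The sketch below is a hedged illustration: it reads the symbol $n$ in the angles and weights of Eqs. (5) and (7) as the dimension $p$ (so the Stroud-2 rule has $p + 1$ points with weight $2^p/(p+1)$ and the Stroud-3 rule has $2p$ points with weight $2^p/(2p)$), and for odd $p$ in the Stroud-3 rule it sets the last coordinate to $(-1)^i/\sqrt{3}$, which is taken from the standard form of Stroud's formula rather than from the text above. The names `stroud2`, `stroud3` and `max_exactness_error` are hypothetical.

```python
import math
from itertools import product


def stroud2(p):
    """Degree-2 rule on [-1, 1]^p: p + 1 points, equal weights 2^p / (p + 1)."""
    pts = []
    for k in range(p + 1):                      # k plays the role of i - 1
        x = [0.0] * p
        for r in range(1, p // 2 + 1):
            angle = 2 * r * k * math.pi / (p + 1)
            x[2 * r - 2] = math.sqrt(2 / 3) * math.cos(angle)
            x[2 * r - 1] = math.sqrt(2 / 3) * math.sin(angle)
        if p % 2 == 1:
            x[p - 1] = (-1) ** k / math.sqrt(3)
        pts.append(x)
    return pts, 2.0 ** p / (p + 1)


def stroud3(p):
    """Degree-3 rule on [-1, 1]^p: 2p points, equal weights 2^p / (2p)."""
    pts = []
    for i in range(1, 2 * p + 1):
        x = [0.0] * p
        for r in range(1, p // 2 + 1):
            angle = (2 * r - 1) * i * math.pi / p
            x[2 * r - 2] = math.sqrt(2 / 3) * math.cos(angle)
            x[2 * r - 1] = math.sqrt(2 / 3) * math.sin(angle)
        if p % 2 == 1:
            x[p - 1] = (-1) ** i / math.sqrt(3)  # assumed, see lead-in
        pts.append(x)
    return pts, 2.0 ** p / (2 * p)


def max_exactness_error(rule, p, degree):
    """Largest quadrature error over all monomials of total degree <= degree."""
    pts, w = rule(p)
    worst = 0.0
    for a in product(range(degree + 1), repeat=p):
        if sum(a) > degree:
            continue
        # Exact integral of prod x_j^a_j over [-1, 1]^p.
        exact = math.prod(0.0 if ai % 2 else 2.0 / (ai + 1) for ai in a)
        approx = w * sum(math.prod(xj ** ai for xj, ai in zip(x, a))
                         for x in pts)
        worst = max(worst, abs(approx - exact))
    return worst


for p in range(2, 6):
    print(p, max_exactness_error(stroud2, p, 2), max_exactness_error(stroud3, p, 3))
```

Under these assumptions both rules reproduce every monomial up to their stated degree to rounding error, using only $p + 1$ and $2p$ function evaluations respectively.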
3.1. Gauss–Patterson quadrature rules

Seeking nested one-dimensional integration formulas, simple trapezoidal rules immediately come to mind. However, the limited accuracy of these makes this a less interesting choice. A more appropriate, and widely used, approach is based on the Clenshaw–Curtis rule [3] which is exact for polynomials of order $n$ when using $n + 1$ points. This is considerably better than the second order accuracy of the trapezoidal rules but falls short of the $2n + 1$ polynomial exactness of the Gaussian quadrature. The natural question to raise is whether there are nested quadratures which are better than the Clenshaw–Curtis rules, but perhaps not quite as good as the classic quadratures.

This question was first addressed by Kronrod [3] who extended an $n$-point Gauss–Legendre quadrature formula by $n + 1$ points such that the resulting quadrature formula has polynomial degree of exactness $3n + 1$ ($n$ even) or $3n + 2$ ($n$ odd). Patterson iterated Kronrod’s scheme recursively and obtained a sequence of nested quadrature formulas with maximal degree of exactness. We refer to [3] for a discussion of the details of this construction. For $f \in C^r$, the error bound for the integration of $f$ using the Gauss–Patterson sparse grid formula is [9]
$$E_l^1 f = O\left(2^{-lr}\right),$$
where $l$ refers to the number of levels in the hierarchical construction. When considering the efficiency of the integration measured through polynomial exactness, it is well known that using a quadrature with $n$ points, the Clenshaw–Curtis rule is exact for polynomials up to order $n - 1$ and the Gauss–Legendre quadrature is exact for orders up to $2n - 1$. For the general Gauss–Patterson rule, one can show exactness up to order $(3n - 1)/2$, confirming that this is truly a compromise between the two alternatives [3].

The nested structure of the Gauss–Patterson quadrature grids in combination with Smolyak’s construction results in a natural hierarchical structure for computing the integrals. Hence, to improve the accuracy one needs only compute those new additional grid points required to increase from level $l$ to level $l + 1$. This is an important property, in particular for high-dimensional problems.

For the one-dimensional Clenshaw–Curtis rule, the number of points grows like $2^{l-1} + 1$, whereas the growth for the Gauss–Patterson rule is $2^l - 1$ since the rule is based on the Gauss quadrature. Hence, when comparing the cost of the two methods, it is most appropriate to compare the Clenshaw–Curtis rule at level $l$ with the Gauss–Patterson rule at level $l - 1$. When the dimensionality of the problem increases this becomes more pronounced because the number of quadrature points grows as $O(2^l l^{d-1})$ in the sparse grid [2]. This is illustrated in Table 1, which lists the number of grid points required for the two-dimensional example with an increasing number of levels. The importance of the nested structure is evident for all levels but the first one. As a further illustration of this aspect, we plot in Fig. 1 the Smolyak based sparse grids for the Clenshaw–Curtis, Gauss–Patterson and Gauss–Legendre formulas for two variables, with 6 levels in the Clenshaw–Curtis based scheme and 5 levels in the two other cases.
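The C-C and G-P columns of Table 1 can be recovered by a short counting argument for nested rules: each multi-index contributes the product of the nodes that are new at each one-dimensional level. The sketch below is a hedged illustration, not the authors' code. It assumes one-dimensional point counts of 1, 3, 5, 9, ... for Clenshaw–Curtis (a single midpoint at level 1, then $2^{k-1} + 1$) and $2^k - 1$ for Gauss–Patterson, and it assumes the index set $|k|_1 \le l + p$ with $k_i \ge 1$, chosen here because it reproduces the tabulated values; this bound differs from the one printed in Eq. (11). The Gauss–Legendre column is not covered because those nodes are not nested.

```python
from itertools import product


def cc_points(k):
    """Assumed 1D Clenshaw-Curtis point count at level k."""
    return 1 if k == 1 else 2 ** (k - 1) + 1


def gp_points(k):
    """1D Gauss-Patterson point count at level k."""
    return 2 ** k - 1


def sparse_grid_size(points, p, level):
    """Number of distinct nodes in a nested Smolyak grid in p dimensions."""
    def new_points(k):
        # Nodes present at level k that were not already present at k - 1.
        return points(k) - (points(k - 1) if k > 1 else 0)

    q = level + p                      # assumed bound on |k|_1
    total = 0
    for k in product(range(1, q - p + 2), repeat=p):
        if sum(k) <= q:
            count = 1
            for ki in k:
                count *= new_points(ki)
            total += count
    return total


cc_counts = [sparse_grid_size(cc_points, 2, l) for l in range(1, 7)]
gp_counts = [sparse_grid_size(gp_points, 2, l) for l in range(1, 7)]
print("C-C:", cc_counts)
print("G-P:", gp_counts)
```

Calling `sparse_grid_size` with a larger `p` gives a quick feel for how the cost of the two nested families diverges as the dimension grows.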
The final question to address is whether the Gauss–Patterson based approach, with its improved accuracy but with more quadrature points at a given level, is competitive with the more traditional Clenshaw–Curtis scheme when one compares cost vs accuracy, i.e., which of the two schemes requires the fewest function evaluations to achieve a given accuracy in the integral.
M. Liu et al. / Applied Numerical Mathematics 61 (2011) 24–37
Fig. 2. Accuracy vs level for integration of the test-function for dimensions p = 2, 5, 10.
Fig. 3. Cost vs accuracy for integration of the test function for dimensions p = 2, 5, 10.
To address this question, we consider the test function [8] for $x \in [0, 1]^p$

\[ f(x_1, \ldots, x_p) = \cos\left(2\pi\omega_1 + \sum_{i=1}^{p} c_i x_i\right), \]
where $\omega_1$ and $c = (c_1, \ldots, c_p)$ are randomly generated and the $c_i$’s sum to 9. In Fig. 2 we show the convergence of the different high-dimensional integration schemes discussed previously. We note that the accuracy of the Gauss–Patterson scheme is comparable to that of the scheme based on the full Gauss integration. We also note that the sparse grid based on the Clenshaw–Curtis formula experiences convergence problems for the high-dimensional case. This is a known problem [9] and is likely associated with the kink phenomenon discussed in detail in [19]. It is noteworthy that there are numerous other examples of high-dimensional cases where the Clenshaw–Curtis formula converges as expected. A more important question is, however, which of these methods is preferred when taking into account the work needed to achieve a specific accuracy. In Fig. 3 we address this question by showing the time taken to achieve a specific accuracy. From this, it is clear that the sparse grid scheme based on the Gauss–Patterson rule is superior except when only very moderate accuracy is needed, in which case the Stroud based schemes may well be preferred. Similar conclusions have been reached by considering other test functions and we will therefore focus on the Gauss–Patterson based sparse grid schemes going forward. This conclusion was also reached in [9], although the problems being considered were different and less focused on the cost vs accuracy considerations introduced here.

4. Error estimation and adaptivity in the sparse grid construction

With the computational work growing exponentially with the dimensionality, it is essential that we seek to minimize the cost by carefully using the required degrees of freedom only where they are needed. This suggests that an adaptive approach is warranted and, therefore, that an effective error estimator must be formulated.
The hierarchical nature of the Smolyak based integration immediately suggests that some kind of extrapolation be used to predict the value of the integral at the next level using results obtained at the previous levels [1]. We here propose the use of Richardson extrapolation to achieve this. It is known that for functions $f \in C^r$, the error in the Smolyak algorithm is $|E_l^p f| = O\left(n^{-r/p} (\log n)^{(p-1)(r/p+1)}\right)$ when the integration utilizes a sparse grid with $n = 2^l l^{p-1}$ integration points at level l to evaluate a p-dimensional integral [13]. With this knowledge of the local error behavior as a function of the level in the Smolyak grid, we can apply classic Richardson extrapolation in the following way.
Assume we have computed an estimate to the integral, I[f], at two different levels. We then know that

\[ I[f]_l = I[f] + C_{rp} \left(2^l l^{p-1}\right)^{-r/p} \left(\log\left(2^l l^{p-1}\right)\right)^{(p-1)(r/p+1)}, \]
\[ I[f]_{l+1} = I[f] + C_{rp} \left(2^{l+1} (l+1)^{p-1}\right)^{-r/p} \left(\log\left(2^{l+1} (l+1)^{p-1}\right)\right)^{(p-1)(r/p+1)}. \]

Through simple algebra one recovers that

\[ I[f] = \frac{\alpha_1 I[f]_{l+1} - \alpha_2 I[f]_l}{\alpha_1 - \alpha_2}, \]

where

\[ \alpha_1 = \left(2^l \cdot l^{(p-1)}\right)^{(-r/p)} \cdot \left(\log\left(2^l \cdot l^{(p-1)}\right)\right)^{(p-1)\cdot(r/p+1)}, \]
\[ \alpha_2 = \left(2^{(l+1)} \cdot (l+1)^{(p-1)}\right)^{(-r/p)} \cdot \left(\log\left(2^{(l+1)} \cdot (l+1)^{(p-1)}\right)\right)^{(p-1)\cdot(r/p+1)}. \]
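The extrapolation formula above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors’ implementation: the function names are hypothetical, the level is assumed to satisfy l ≥ 1 so that the logarithm is defined, and the default r = 3 mirrors the value used in the experiments reported in this section.

```python
import math


def smolyak_rate(level: int, p: int, r: float) -> float:
    """alpha(l) = n^(-r/p) * (log n)^((p-1)(r/p+1)) with n = 2^l * l^(p-1); requires level >= 1."""
    n = 2.0 ** level * float(level) ** (p - 1)
    return n ** (-r / p) * math.log(n) ** ((p - 1) * (r / p + 1.0))


def richardson_extrapolate(i_l: float, i_lp1: float, level: int,
                           p: int, r: float = 3.0) -> float:
    """Combine the estimates at levels l and l+1 into an extrapolated integral value."""
    a1 = smolyak_rate(level, p, r)      # alpha_1, error factor at level l
    a2 = smolyak_rate(level + 1, p, r)  # alpha_2, error factor at level l + 1
    return (a1 * i_lp1 - a2 * i_l) / (a1 - a2)


if __name__ == "__main__":
    # Synthetic check: data that follow the error model exactly are recovered exactly.
    exact, const, p, r, level = 0.5, 0.3, 4, 3.0, 3
    i_l = exact + const * smolyak_rate(level, p, r)
    i_lp1 = exact + const * smolyak_rate(level + 1, p, r)
    print(richardson_extrapolate(i_l, i_lp1, level, p, r))
```

The synthetic check only confirms the algebra; for real integrands the quality of the extrapolated value depends on how well the assumed error model and the chosen r describe the actual convergence.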
This improved estimate of the integral depends on r, which we recall is a measure of the smoothness of $f \in C^r$. This is generally unknown. However, extensive tests have shown that the efficiency of the error estimator has limited sensitivity to this number, and we have found that values of r between 2 and 4 generally yield excellent results, as we will illustrate shortly. In some cases, there may be benefits to estimating a different value; this can simply be done by evaluating the accuracy/efficiency of the extrapolated values at the coarse grid levels and identifying the optimal value of r. In our approach we will not compute the next level in the Smolyak grid if the extrapolation result is close to the result at the current level to within a given tolerance. This is vastly advantageous over previously used methods which simply compare solutions at the two levels, often resulting in having to compute an additional level at substantial cost (typically the cost of a new level is comparable to the combined cost of all previous levels). Instead of this, we use the Richardson extrapolation to estimate the result at the next level, hence dramatically reducing the overall cost without impacting the accuracy. To test the validity of this approach as a way to accelerate the adaptive sparse grid algorithm and reduce the computational cost, we consider a set of high-dimensional test functions proposed in [7,8]. These functions are all defined on $[0, 1]^p$ and we seek to integrate them as accurately as possible. The functions being considered have different characteristics:

1. Oscillatory:
\[ f_1(x) = \cos\left(2\pi\omega_1 + \sum_{i=1}^{p} c_i x_i\right); \tag{12} \]

2. Product peak:
\[ f_2(x) = \prod_{i=1}^{p} \left(c_i^{-2} + (x_i - \omega_i)^2\right)^{-1}; \tag{13} \]

3. Corner peak:
\[ f_3(x) = \left(1 + \sum_{i=1}^{p} c_i x_i\right)^{-(d+1)}; \tag{14} \]

4. Gaussian:
\[ f_4(x) = \exp\left(-\sum_{i=1}^{p} c_i^2 (x_i - \omega_i)^2\right); \tag{15} \]

5. Continuous:
\[ f_5(x) = \exp\left(-\sum_{i=1}^{p} c_i |x_i - \omega_i|\right). \tag{16} \]
Different test functions can be obtained by varying the parameters $c = (c_1, \ldots, c_p)$ and $\omega = (\omega_1, \ldots, \omega_p)$. The parameters $\omega_i$ act as shift parameters, and the difficulty of the functions increases with $c_i > 0$. We test the integration with the dimension p = 10 and use parameters $c_i$ such that

\[ \sum_{i=1}^{p} c_i = b_j, \tag{17} \]

where $b_j$ depends on the family $f_j$ and is given in Table 2.
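The five test families and the normalization (17) can be written down compactly with NumPy. The sketch below is illustrative only: all function names are hypothetical, the exponent of the corner peak is taken to be the dimension plus one (an assumption about the meaning of d in (14)), and the closed-form integral of the oscillatory family follows from integrating $e^{i c_j x_j}$ over [0, 1] in each variable.

```python
import numpy as np


def scale_c(raw, b):
    """Rescale positive raw coefficients so that sum(c_i) = b, as in (17)."""
    raw = np.asarray(raw, dtype=float)
    return b * raw / raw.sum()


def f_oscillatory(x, c, w):
    return np.cos(2.0 * np.pi * w[0] + x @ c)


def f_product_peak(x, c, w):
    return np.prod(1.0 / (c ** -2.0 + (x - w) ** 2), axis=-1)


def f_corner_peak(x, c, w):
    # Assumption: the exponent d + 1 in (14) uses the dimension of the problem.
    return (1.0 + x @ c) ** (-(c.size + 1))


def f_gaussian(x, c, w):
    return np.exp(-np.sum(c ** 2 * (x - w) ** 2, axis=-1))


def f_continuous(x, c, w):
    return np.exp(-np.sum(c * np.abs(x - w), axis=-1))


def exact_oscillatory(c, w):
    """Closed form of the integral of f_oscillatory over the unit cube."""
    factors = (np.exp(1j * c) - 1.0) / (1j * c)  # integral of exp(i c_j x) on [0, 1]
    return float(np.real(np.exp(2j * np.pi * w[0]) * np.prod(factors)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p = 10
    c = scale_c(rng.random(p), 9.0)   # b_1 = 9.0 for the oscillatory family
    w = rng.random(p)
    print("exact integral of f_1:", exact_oscillatory(c, w))
```

Each function accepts either a single point of shape (p,) or an array of points of shape (m, p), so the same definitions can be reused with any quadrature or sampling rule.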
Table 2. Parameter $b_j$ for the five different test functions, $f_j$.

| j     | 1   | 2    | 3    | 4    | 5    |
| $b_j$ | 9.0 | 7.25 | 1.85 | 7.03 | 20.4 |
Table 3. Integrals computed for functions $f_1$–$f_4$ for p = 10 using sparse grid (SG) and Richardson extrapolation (RE) at levels one to six. Note that the integral was also computed at level zero to enable the extrapolation but is not shown due to its poor accuracy.

| Function | Method | Level 1    | Level 2    | Level 3    | Level 4    | Level 5    | Level 6    |
| $f_1$    | SG     | −0.1191    | −0.1378    | −0.1359    | −0.1360    | −0.1360    | −0.1360    |
| $f_1$    | RE     | –          | −0.1191    | −0.1378    | −0.1359    | −0.1360    | −0.1360    |
| $f_2$    | SG     | 4.1472e-06 | 4.3598e-06 | 4.3452e-06 | 4.3458e-06 | 4.3458e-06 | 4.3458e-06 |
| $f_2$    | RE     | –          | 4.1472e-06 | 4.3601e-06 | 4.3451e-06 | 4.3458e-06 | 4.3458e-06 |
| $f_3$    | SG     | 0.0014     | 0.0017     | 0.0018     | 0.0018     | 0.0018     | 0.0018     |
| $f_3$    | RE     | –          | 0.0014     | 0.0017     | 0.0018     | 0.0018     | 0.0018     |
| $f_4$    | SG     | 0.3970     | 0.4085     | 0.4091     | 0.4091     | 0.4091     | 0.4091     |
| $f_4$    | RE     | –          | 0.3970     | 0.4085     | 0.4091     | 0.4091     | 0.4091     |
Table 4. Integrals computed for function $f_5$ for dimensions p = 2, 5, 10 using sparse grid (SG) and Richardson extrapolation (RE) at levels zero to seven.

| Dimension | Method | Level 0 | Level 1 | Level 2 | Level 3 | Level 4 | Level 5 | Level 6 | Level 7 |
| p = 2     | SG     | 0.0029  | 0.0208  | 0.0201  | 0.0452  | 0.0326  | 0.0330  | 0.0338  | 0.0338  |
| p = 2     | RE     | –       | –       | 0.0214  | 0.1337  | 0.0099  | 0.0336  | 0.0349  | 0.0338  |
| p = 5     | SG     | 0.0047  | 0.0114  | 0.0102  | 0.0066  | 0.0068  | 0.0073  | 0.0072  | 0.0072  |
| p = 5     | RE     | –       | –       | 0.0114  | 0.0112  | 0.0063  | 0.0056  | 0.0078  | 0.0072  |
| p = 10    | SG     | 0.0092  | 0.0069  | 0.0021  | 0.0034  | 0.0031  | 0.0034  | 0.0035  | 0.0035  |
| p = 10    | RE     | –       | –       | 0.0069  | 0.0021  | 0.0035  | 0.0030  | 0.0033  | 0.0035  |
In Table 3 we illustrate the accuracy of the extrapolation for p = 10 for the first four test functions above. It is evident that the extrapolation works very well and offers an accurate estimate of the integral at the next level once it is reasonably well approximated at previous levels. We have used r = 3 as an estimator of the smoothness in this example, but similar results were obtained with r = 4, confirming the relative insensitivity of the performance of the scheme to this estimate. The final test function, $f_5$, is more challenging and in fact violates the smoothness assumption required to derive the error estimates for the Richardson extrapolation. We show in Table 4 the results of the extrapolation for different dimensions and observe that, in spite of the more complex function, the results confirm the accuracy of the proposed approach for local error estimation, which thus enables accurate adaptive integration. Naturally, the exact same idea can be explored as a dimensional error estimator to enable anisotropic adaptivity, although we have not explored this further in this work. Note that in the above results, we used r = 3 for all tests, confirming the lack of sensitivity to this parameter.

4.1. Predictive sampling for discrete variables

For discrete random variables, sparse grids and extrapolation are not directly applicable and we need to seek an alternative approach. We will, in this work, simply use the knowledge of the a priori density of the discrete random variables. Let us, for instance, assume a discrete random variable k with a density
\[ f(k; \lambda) = e^{-\lambda} \frac{\lambda^k}{k!}, \qquad k = 0, 1, 2, \ldots \tag{18} \]
where the mean and variance of k are $\mu_k = \bar{k} = \lambda > 0$ and $\sigma_k^2 = \lambda$, respectively. We use the a priori density to select which samples to compute first. Hence, if k = 2 is the most likely case, this will be computed first, followed by the next most likely value of the discrete variable. The convergence of the integral is used to decide whether additional instances are needed. We will demonstrate the computational advantage of this simple idea in the following section.

5. Applications to electromagnetic scattering problems

To evaluate the benefits of the techniques discussed above for problems of a more practical character, we consider electromagnetic scattering by a two-dimensional cylinder with holes, where the size, the location, and ultimately the number of holes are considered uncertain and are described by continuous and discrete random variables.
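The physical model is Maxwell’s equations

\[ \varepsilon \frac{\partial \mathbf{E}}{\partial t} = \nabla \times \mathbf{H}, \tag{19} \]
\[ \mu \frac{\partial \mathbf{H}}{\partial t} = -\nabla \times \mathbf{E}, \tag{20} \]

in the general three-dimensional domain Λ without sources. E is the electric field intensity, H is the magnetic field, and ε and μ are the permittivity and the permeability of the domain. As an output measure of interest we use the Radar Cross Section (RCS) defined as

\[ \mathrm{RCS}_{\mathrm{dB}}(\phi) = 10 \log_{10}\left(2\pi \frac{|\mathbf{F}(\phi)|^2}{|\mathbf{E}_i|^2}\right), \tag{21} \]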
where $\mathbf{E}_i$ is the incident field and F(φ) is a function of E and H, computing the scattered far field as a function of the polar angle, φ. In this particular case, F(φ) is a near-to-far-field transformation along some closed contour [18].

5.1. Nodal discontinuous Galerkin finite element method

Discontinuous Galerkin methods are a general and flexible way of solving Maxwell’s equations. We follow the formulation in [11] and refer to [12] for a more general and in-depth account of these techniques. We assume the computational domain Λ is approximated by K elements $D^k$ as
\[ \Lambda \simeq \bigcup_{k=1}^{K} D^k, \tag{22} \]

where $D^k$ is a two-dimensional simplex. In each element, we approximate E and H by Lagrange polynomials as

\[ [\mathbf{E}_h, \mathbf{H}_h] = \sum_{j=1}^{N} \left[\mathbf{E}(x_j, t), \mathbf{H}(x_j, t)\right] l_j(x), \tag{23} \]

where $x_j$ are the interpolation points. The number of nodes N is given as

\[ N = \frac{(n + 1)(n + 2)}{2} \]
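for the nth order polynomial in two dimensions. We insert the approximate solution into Maxwell’s equations and require that the local residual is orthogonal to all nth order polynomials. Integrating by parts twice, this yields the scheme

\[ \int_{D^k} \left(\varepsilon \frac{\partial \mathbf{E}_h}{\partial t} - \nabla \times \mathbf{H}_h\right) l_i(x)\, dx = -\oint_{\partial D^k} \hat{\mathbf{n}} \times \left(\mathbf{H}^k - \mathbf{H}^*\right) l_i(x)\, dx, \tag{24} \]
\[ \int_{D^k} \left(\mu \frac{\partial \mathbf{H}_h}{\partial t} + \nabla \times \mathbf{E}_h\right) l_i(x)\, dx = \oint_{\partial D^k} \hat{\mathbf{n}} \times \left(\mathbf{E}^k - \mathbf{E}^*\right) l_i(x)\, dx, \tag{25} \]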
where $[\mathbf{E}^*, \mathbf{H}^*]$ denotes the numerical flux of the corresponding vector quantities and $\hat{\mathbf{n}}$ is the unit outward normal vector along $\partial D^k$. The numerical flux is responsible for the coupling of the elements, for the stability of the scheme, and for the imposition of boundary conditions. We use a standard upwind flux with the explicit form given in [11]. We use a low storage Runge–Kutta scheme [11] for the temporal integration. It should be noted that the method used to solve Maxwell’s equations in this work is less important since the focus is on the efficiency of the techniques dealing with the stochastic elements of the problem. The computational setup of the problem is a plane wave that impinges on a 2-D cylinder from some specific direction. The cylinder has an uncertain number of holes in it and their sizes and locations may also be uncertain. Examples of meshes of the computational model with different numbers of holes, hole sizes and locations are shown in Fig. 4. The bistatic RCS has an exact solution for the case of a plane wave impinging on a 2-D cylinder without holes. We consider the case with ka = π, where a is the radius and k = 2π/λ is the wave number associated with the wavelength λ of the incoming wave. We employ a perfectly matched layer (PML) to absorb the reflected wave in our computation [18]. The comparison of the exact solution and the numerical solution is shown in Fig. 5, confirming the accuracy of the solver. A complete analysis of the solution approach can be found in [11].
Fig. 4. Examples of meshes of the cylinder with different number of holes, hole size, and hole location. (a) has one hole, (b) three holes, (c) four holes, and (d) six holes.
Fig. 5. Comparison between numerical and exact bistatic RCS for a ka = π metallic cylinder.
5.2. Low-dimensional examples

To verify our algorithm, we first compare the results using the three methods for low-dimensional problems, i.e., we restrict the problem to having one or two random parameters, all assumed to be uniformly distributed random variables. The three methods are a standard Monte Carlo (MC) method, Stroud’s method at second and third order, and the sparse grid method. In all cases we recover the mean of the RCS as

\[ \overline{\mathrm{RCS}} = \sum_{i=1}^{Q} \omega_i\, \mathrm{RCS}_i, \tag{26} \]
Fig. 6. Mean RCS computed using four different sampling methods.
Fig. 7. Mean RCS and sensitivity for a two-parameter random problem computed using the adaptive sparse grid method.
and the variance of RCS as

\[ \mathrm{var}(\mathrm{RCS}) = \sum_{i=1}^{Q} \omega_i \left(\mathrm{RCS}_i - \overline{\mathrm{RCS}}\right)^2, \tag{27} \]
where the number of terms, Q, and the integration weights, $\omega_i$, depend on the specific integration technique used. We first consider the problem with one random parameter, taken to be a cylinder with one hole of random size. The size of the hole is assumed to be in the range $(\pi/12, \pi/6)$ in the polar angle. In Fig. 6 we show the mean RCS computed using the four methods. The level of the sparse grid method is 6 (31 quadrature points) and the Monte Carlo method employs 2000 uniformly distributed points. The difference between the level 5 and level 6 results is of order $10^{-3}$. The Stroud methods of degrees 2 and 3 yield the same results since they are identical in this simple case. We observe excellent agreement between the four methods and, in particular, between the sparse grid result and the MC computation. Next, we consider 2 random parameters: the hole size and the angle of the incident plane wave. The incident wave impinges from the left on the cylinder and its angle varies in the range $(35\pi/36, 37\pi/36)$. The RCS mean and the mean plus/minus one standard deviation, computed using the sparse grid method, are presented in Fig. 7, illustrating the value of these techniques by enabling the computation of sensitivities of output measures of interest. We use 5 levels in the sparse grid computation and estimate the error to be of the order of $10^{-3}$ based on the extrapolation of the RCS.
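The weighted sums (26) and (27) are independent of the particular rule that supplies the nodes and weights. The following Python sketch illustrates them for a single uniformly distributed parameter; it uses a plain Gauss–Legendre rule as a stand-in for the sparse grid, and both function names and the toy output function are hypothetical.

```python
import numpy as np


def uniform_gauss_legendre(a: float, b: float, q: int):
    """Nodes on (a, b) and weights summing to one for a uniform random parameter."""
    z, w = np.polynomial.legendre.leggauss(q)
    nodes = 0.5 * (b - a) * z + 0.5 * (a + b)
    weights = 0.5 * w  # includes the uniform density 1 / (b - a)
    return nodes, weights


def weighted_moments(values, weights):
    """Mean and variance as in (26) and (27); values may be (Q,) or (Q, m)."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    mean = weights @ values
    var = weights @ (values - mean) ** 2
    return mean, var


if __name__ == "__main__":
    # Toy stand-in for the RCS at one observation angle as a function of hole size.
    theta, w = uniform_gauss_legendre(np.pi / 12, np.pi / 6, 8)
    mean, var = weighted_moments(np.cos(theta), w)
    print(mean, var)
```

Passing a two-dimensional array of shape (Q, m), with one column per observation angle φ, returns the mean and variance curves in a single call.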
Fig. 8. RCS computed for cylinders illustrated in Fig. 4, with 2 (left), 3 (middle), and 4 (right) holes with uniformly distributed hole size and angle of incidence of the illuminating wave.
Fig. 9. Poisson density associated with the number of holes in the cylinder.
5.3. Example of a higher-dimensional problem

To simulate a more realistic random wave problem, we need to consider higher-dimensional problems. We achieve this in two different ways. In the first example we assume the number of holes is deterministic but that their size and illumination are uniformly distributed random variables. In Fig. 8 we present the results for the cylinder with 2 to 4 holes, obtained using the sparse grid method with 5 levels in the integration. We note in particular the impact of the number of holes on the sensitivity of the RCS. To further add to the complexity, let us also assume that the number of holes is a discrete random variable with a Poisson probability distribution, illustrated in Fig. 9. We assume that the number of holes ranges between one and nine but that their location is fixed. The RCS mean and its sensitivity computed with the adaptive sparse grid method are presented in Fig. 10. We use this case to demonstrate the value of the approach discussed in Section 4.1. To illustrate this, we compute the $L^2$-error of the RCS with different numbers of holes and list the error as the number of holes increases. Here the reference solution is assumed to be the one where all variations of one to nine holes are accounted for, i.e., for each discrete number of holes, the randomness in the hole size is accounted for as discussed previously and the moments in the discrete variables are computed subsequently. The results, simply reflecting the $L^2$-error of the mean of the RCS, are shown in Table 5. As expected, we quickly see convergence with as few as three to four holes and can terminate the computation, thus saving substantially in the overall computational cost.
Fig. 10. RCS mean and sensitivity for problem with random number of holes and uniformly distributed hole size.
Table 5. The $L^2$-error of the mean RCS as a function of the number of holes, assuming that the number of holes is a Poisson distributed random discrete variable.

| Hole number                  | 1      | 2      | 3      | 4      | 5      | 6 |
| $L^2$-error of the mean RCS  | 4.2345 | 2.4427 | 1.0191 | 0.4481 | 0.1181 | 0 |
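The predictive sampling idea of Section 4.1 can be sketched as follows. This is a minimal illustration, not the authors’ code: the function names, the callable `sample` (standing in for an expensive per-k computation such as the mean RCS for k holes), and the renormalization of the partial sum by the accumulated probability mass are all assumptions made here for the sake of a runnable example.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson density (18)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def predictive_order(lam: float, k_values):
    """Candidate values of k sorted from most to least likely under the a priori density."""
    return sorted(k_values, key=lambda k: -poisson_pmf(k, lam))


def predictive_mean(sample, lam: float, k_values, tol: float):
    """Accumulate the weighted mean over k, most likely first, until it changes by less than tol."""
    acc, mass, prev, used = 0.0, 0.0, None, []
    estimate = float("nan")
    for k in predictive_order(lam, k_values):
        weight = poisson_pmf(k, lam)
        acc += weight * sample(k)   # sample(k) is the (hypothetical) expensive solve
        mass += weight
        used.append(k)
        estimate = acc / mass       # assumption: renormalize by the mass seen so far
        if prev is not None and abs(estimate - prev) < tol:
            break
        prev = estimate
    return estimate, used


if __name__ == "__main__":
    est, used = predictive_mean(lambda k: 1.0 / (1 + k), 2.5, range(1, 10), 1e-2)
    print(est, used)
```

Because the most probable values are visited first, the loop typically stops before the unlikely tail values are ever computed, which is the source of the savings reported in Table 5.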
6. Concluding remarks

In this paper we have discussed the development of adaptive sparse grid methods in the context of stochastic collocation methods for solving partial differential equations with uncertainty. The emphasis has been on identifying methods which deliver maximum accuracy at minimal cost. We find that the combination of Gauss–Patterson quadratures and Smolyak sparse grid constructions is an effective way to reach this and results in a computationally robust approach. In particular we confirm, in agreement with related work, that the widely used Clenshaw–Curtis Smolyak based approach may have problems with convergence for certain high-dimensional test functions. No such problem was observed with the Gauss–Patterson based scheme. Similar behavior has also been observed by other authors [9], but the detailed study of the cost vs accuracy question is a new direction of research. Furthermore, we demonstrate how the strict hierarchical structure of the Smolyak sparse grid lends itself very well to the use of Richardson extrapolation as an efficient estimator for the computed value of the integral at the next level. This is important as it allows for significant savings in the overall computational expense. We demonstrated that this is a robust and general approach, as illustrated for several standard test functions. This suggests an accurate and fast adaptive sparse grid scheme, and we illustrated its use and flexibility by considering electromagnetic scattering problems with the scatterer having randomized holes. We also demonstrate a simple but effective way to deal with discrete random variables in an error controlled manner. In combination with the adaptive sparse grid approach, this yields substantial computational savings. An immediate next step is to look more carefully at the theory of the overall procedure and, in particular, the accuracy of the Richardson extrapolation procedure. We hope to report on this in the near future.
Acknowledgement The first author acknowledges the support of the China Scholarship Committee and National Science Foundation of China (No. 60771017) for this research. The second author also acknowledges the support of China Scholarship Committee and National Science Foundation of China (No. 2008633049) for this research. The last author acknowledges partial support by OSD/AFOSR FA9550-09-1-0613, and by NSF, and DoE.
References

[1] H. Bungartz, M. Griebel, U. Rüde, Extrapolation, combination, and sparse grid techniques for elliptic boundary value problems, Comput. Methods Appl. Mech. Engrg. 116 (1994) 243–252.
[2] H. Bungartz, M. Griebel, Sparse grids, Acta Numer. 13 (2004) 147–269.
[3] P.J. Davis, P. Rabinowitz, Methods of Numerical Integration, Academic Press, New York, 1975.
[4] G. Fishman, Monte Carlo, Concepts, Algorithms, and Applications, Springer-Verlag, New York, 1996.
[5] B. Fox, Strategies for Quasi-Monte Carlo, Kluwer, Dordrecht, The Netherlands, 1999.
[6] D. Gamerman, Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference, Chapman and Hall, London, 1997.
[7] A.C. Genz, Testing multidimensional integration routines, in: B. Ford, J.C. Rault, F. Thomasset (Eds.), Tools, Methods, and Languages for Scientific, Engineering Computation, North-Holland, Amsterdam, 1984, pp. 81–94.
[8] A.C. Genz, A package for testing multiple integration subroutines, in: P. Keast, G. Fairweather (Eds.), Numerical Integration, Kluwer, Dordrecht, 1987, pp. 337–340.
[9] T. Gerstner, M. Griebel, Numerical integration using sparse grid, Numer. Algorithms 18 (1998) 209–232.
[10] R.G. Ghanem, P.D. Spanos, Stochastic Finite Elements: A Spectral Approach, Springer-Verlag, New York, 1991.
[11] J.S. Hesthaven, T. Warburton, Nodal high-order methods on unstructured grids I. Time-domain solution of Maxwell’s equations, J. Comput. Phys. 181 (2002) 186–221.
[12] J.S. Hesthaven, T. Warburton, Nodal Discontinuous Galerkin Methods: Algorithms, Analysis, and Applications, Springer-Verlag, New York, 2007.
[13] E. Novak, K. Ritter, The curse of dimension and a universal method for numerical integration, in: G. Nurnberger, J.W. Schmidt, G. Walz (Eds.), Multivariate Approximation and Splines, Birkhauser, Boston/Basel, 1997, pp. 177–188.
[14] S.A. Smolyak, Quadrature and interpolation formulas for tensor products of certain classes of functions, Dokl. Akad. Nauk SSSR 4 (1963) 240–243.
[15] M. Stein, Large sample properties of simulations using Latin hypercube sampling, Technometrics 29 (1987) 143–151.
[16] A. Stroud, Remarks on the disposition of points in numerical integration formulas, Math. Comp. 11 (1957) 257–261.
[17] M.A. Tatang, W.W. Pan, R.G. Prinn, G.J. McRae, An efficient method for parametric uncertainty analysis of numerical geophysical model, J. Geophys. Res. 102 (1997) 21925–21932.
[18] A. Taflove, Computational Electrodynamics: The Finite-Difference Time-Domain Method, Artech House, Boston, 1995.
[19] J.A.C. Weideman, L.N. Trefethen, The kink phenomenon in Fejer and Clenshaw–Curtis quadrature, Report no. 6/16, Oxford University Computing Laboratory, 2006.
[20] N. Wiener, The homogeneous chaos, Amer. J. Math. 60 (1938) 897–936.
[21] D. Xiu, G.E. Karniadakis, The Wiener–Askey polynomial chaos for stochastic differential equations, SIAM J. Sci. Comput. 24 (2002) 619–644.
[22] D. Xiu, J.S. Hesthaven, High-order collocation methods for differential equations with random inputs, SIAM J. Sci. Comput. 27 (2005) 1118–1139.
[23] D. Xiu, Numerical integration formulas of degree two, Appl. Numer. Math. 58 (2008) 1515–1520.
[24] D. Xiu, Fast numerical methods for stochastic computations: A review, Commun. Comput. Phys. 5 (2009) 242–272.
Applied Numerical Mathematics 61 (2011) 38–52
Applied Numerical Mathematics www.elsevier.com/locate/apnum
On Numerov’s method for a class of strongly nonlinear two-point boundary value problems ✩

Yuan-Ming Wang a,b,∗

a Department of Mathematics, East China Normal University, Shanghai 200241, People’s Republic of China
b Scientific Computing Key Laboratory of Shanghai Universities, Division of Computational Science, E-Institute of Shanghai Universities, Shanghai Normal University, Shanghai 200234, People’s Republic of China
Article info

Article history: Received 15 September 2009. Received in revised form 15 June 2010. Accepted 14 August 2010. Available online 18 August 2010.

Keywords: Strongly nonlinear two-point boundary value problem; Numerov’s method; Fourth-order accuracy; Monotone iterations; Upper and lower solutions

Abstract

The purpose of this paper is to give a numerical treatment for a class of strongly nonlinear two-point boundary value problems. The problems are discretized by fourth-order Numerov’s method, and a linear monotone iterative algorithm is presented to compute the solutions of the resulting discrete problems. All processes avoid constructing explicitly an inverse function as is often needed in the known treatments. Consequently, the full potential of Numerov’s method for strongly nonlinear two-point boundary value problems is realized. Some applications and numerical results are given to demonstrate the high efficiency of the approach. © 2010 IMACS. Published by Elsevier B.V. All rights reserved.
1. Introduction

Nonlinear two-point boundary value problems arise from many fields of applied sciences and have been investigated extensively in the literature both analytically and numerically (cf. [1,3,7,31–33]). In this paper, we seek a high-accuracy numerical solution of the following strongly nonlinear two-point boundary value problem:

\[ \begin{cases} -\dfrac{d}{dx}\left(k(u)\dfrac{du}{dx}\right) = f(x, u), & 0 < x < 1, \\ u(0) = \alpha, \quad u(1) = \beta, \end{cases} \tag{1.1} \]

where α, β are given constants, and the functions f(x, u) and k(u) (which, in general, are nonlinear in u) are prescribed smooth functions of their respective arguments. The consideration of problem (1.1) is motivated by some heat-conduction problems and diffusion problems. For example, if the ends of a rod are kept at given temperatures and the thermal conductivity is temperature dependent, then the steady-state temperature distribution u(x) in the rod is governed by the above problem (1.1); in this case, k(u) is the thermal conductivity and f(x, u) is the internal source, which may also be temperature dependent. Various aspects of such heat-conduction problems, such as the qualitative analysis of the equations and the computation of the solutions, have been
✩ This work was supported in part by the National Natural Science Foundation of China, No. 10571059, E-Institutes of Shanghai Municipal Education Commission, No. E03004, the Natural Science Foundation of Shanghai, No. 10ZR1409300 and Shanghai Leading Academic Discipline Project No. B407.
∗ Address for correspondence: Department of Mathematics, East China Normal University, Shanghai 200241, People’s Republic of China. E-mail address: [email protected].
0168-9274/$30.00 © 2010 IMACS. Published by Elsevier B.V. All rights reserved. doi:10.1016/j.apnum.2010.08.003
Y.-M. Wang / Applied Numerical Mathematics 61 (2011) 38–52
investigated in the literature (cf. [6,20,25,26,29–33]). For some other applications of problem (1.1) in the theory of diffusion see [1,10–12,20,31–34] and the references therein. As we know, it is difficult to give the analytical solution of problem (1.1), even if the function f(·, u) is linear in u. Therefore, numerical methods are of considerable practical interest. If k(u) ≡ 1, problem (1.1) is reduced to a semilinear problem. For solving such a semilinear problem, various numerical methods have been developed in the literature. These methods include, for example, finite difference methods in [4,7,15,18,23,43–46], Petrov–Galerkin methods in [21], variation methods in [14], shooting methods in [22,36], spline methods in [5,16,24] and multiderivative methods in [41]. In the context of finite difference discretizations, one of the well-known methods is Numerov’s method (cf. [3,23,28]). Because Numerov’s method possesses fourth-order accuracy and a compact property, it has attracted considerable attention and has been extensively used in practical computations (cf. [2,3,13,15,18,19,37,43–46]). The compact property of Numerov’s method means that the difference stencil in this method only utilizes mesh points directly adjacent to the mesh point at which a difference approximation is being made. This property allows the boundary conditions to be treated as easily as in the standard second-order method. In other words, Numerov’s method requires only a regular three-point difference stencil similar to that used in the standard second-order method but still possesses fourth-order accuracy. For the developments of Numerov’s method, we refer to the survey paper [4]. If k(u) depends on u, problem (1.1) becomes more complicated due to the nonlinearity of the function k(u). In this case, Numerov’s method cannot be directly applied, even if k(u) is linear in u.
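To make the three-point stencil and the fourth-order accuracy concrete, the following Python sketch applies the standard Numerov discretization to the simplest case k(u) ≡ 1 with a source that does not depend on u, namely −v'' = f(x) with v(0) = v(1) = 0. The function name and the model problem are assumptions chosen for illustration; a dense solver is used only for brevity, since the matrix is tridiagonal.

```python
import numpy as np


def numerov_linear(f, n: int):
    """Solve -v'' = f(x) on (0, 1), v(0) = v(1) = 0, with Numerov's scheme on n subintervals."""
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)
    fx = f(x)
    # -(v[i-1] - 2 v[i] + v[i+1]) = (h^2 / 12) * (f[i-1] + 10 f[i] + f[i+1]),  1 <= i <= n-1
    a = 2.0 * np.eye(n - 1) - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1)
    rhs = h ** 2 / 12.0 * (fx[:-2] + 10.0 * fx[1:-1] + fx[2:])
    v = np.zeros(n + 1)
    v[1:-1] = np.linalg.solve(a, rhs)
    return x, v


if __name__ == "__main__":
    source = lambda x: np.pi ** 2 * np.sin(np.pi * x)  # exact solution: sin(pi x)
    for n in (8, 16, 32):
        x, v = numerov_linear(source, n)
        print(n, np.max(np.abs(v - np.sin(np.pi * x))))
```

Halving the mesh size should reduce the maximum error by a factor of roughly 16, consistent with fourth-order accuracy, even though only the three neighbouring mesh points enter each equation.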
However, in the article [44], the author showed that after a proper transformation of (1.1), one can use Numerov’s method to find indirectly an accurate numerical solution of (1.1). The numerical method proposed there consists of three steps:

Step 1. Using a transformation T (see (2.1)) to transform (1.1) into a semilinear problem;
Step 2. Applying Numerov’s method to solve the resulting semilinear problem;
Step 3. Making use of the inverse $T^{-1}$ to obtain the numerical solution of problem (1.1).

Although the above method possesses the same accuracy as Numerov’s method regardless of the strong nonlinearity of problem (1.1), it is only valid when the inverse $T^{-1}$ can be constructed explicitly. This limits its applications since the construction of the inverse $T^{-1}$ is not always easy to do. The purpose of this paper is to seek a new technique for developing a direct method that avoids constructing the inverse $T^{-1}$ and directly offers the numerical solution of (1.1) but still maintains the same accuracy as Numerov’s method. Specifically, our approach here is to formulate (1.1) as a coupled system of a semilinear two-point boundary value problem and a nonlinear functional equation and then to discretize the coupled system by Numerov’s method. To solve the resulting discrete problem we develop a linear monotone iterative algorithm by the method of upper and lower solutions and its associated monotone iterations. This approach makes it possible to compute accurately the numerical solution of (1.1) by using Numerov’s method without finding the inverse $T^{-1}$. Consequently, the full potential of Numerov’s method for the strongly nonlinear two-point boundary value problem (1.1) is realized.
The outline of the paper is as follows: In the next section, we formulate the strongly nonlinear problem (1.1) as a coupled system of a semilinear two-point boundary value problem and a nonlinear functional equation and then apply Numerov’s method to discretize the coupled system. Section 3 is devoted to a linear monotone iterative algorithm for the resulting discrete problem. It is shown that using an upper solution and a lower solution as initial iterations the iterative algorithm yields two sequences which converge monotonically from above and below, respectively, to a unique solution of the resulting discrete problem. The convergence of the method is investigated in Section 4 where the fourth-order accuracy of the method is proven. In Section 5, we give some applications to several model problems arising from heat-conduction, population dynamics and chemical engineering. Some numerical results are presented to demonstrate the monotone property of the iterative algorithm and the high accuracy of the numerical solution. The final section is for some concluding remarks.

2. Numerov’s method

Let I be an interval in R such that α, β ∈ I. Throughout this paper, we assume that the nonlinear functions f(·, u) and k(u) satisfy the following basic hypothesis:
(H )
f (·, u ) and k(u ) are C 1 -functions of u , and there exists a positive constant k0 such that k(u ) k0 > 0 for all u ∈ I .
In order to apply Numerov’s method to (1.1), we form the problem as a coupled system of a semilinear two-point boundary value problem and a nonlinear functional equation. Let δ ∈ I . Define
$$v = T(u) = \int_{\delta}^{u} k(s)\,ds, \qquad \forall u \in I. \tag{2.1}$$

Using the transformation $T$ we transform problem (1.1) into the following coupled system:
Y.-M. Wang / Applied Numerical Mathematics 61 (2011) 38–52
$$\begin{cases} -\dfrac{d^2 v}{dx^2} = f(x, u), \quad T(u(x)) = v(x), & 0 \le x \le 1, \\ v(0) = T(\alpha), \quad v(1) = T(\beta). \end{cases} \tag{2.2}$$
It is clear that if $u(x)$ is a solution of (1.1) in $I$ (i.e., $u(x) \in I$ for each $x$), then $(u(x), v(x)) = (u(x), T(u(x)))$ is a solution of the coupled system (2.2), and vice versa. As compared with the treatment in [44], a new feature of the above system is that it avoids the inverse $T^{-1}$. This makes it possible to compute the numerical solution of (1.1) accurately without finding the inverse $T^{-1}$. Let $\gamma(x)$ be any nonnegative function satisfying

$$\gamma(x) \ge -f_u(x, u)/k_0 \quad \text{for all } 0 \le x \le 1 \text{ and } u \in I, \tag{2.3}$$

where $f_u = df/du$. Since $\gamma(x) \ge 0$ and $k(u) \ge k_0$, we then have

$$\gamma(x) k(u) + f_u(x, u) \ge 0 \quad \text{for all } 0 \le x \le 1 \text{ and } u \in I. \tag{2.4}$$

In other words, the function

$$g(x, u) = \gamma(x) T(u) + f(x, u) \quad (0 \le x \le 1) \tag{2.5}$$
is monotone nondecreasing in u for u ∈ I . By adding γ (x) v (x) on both sides of the first equation in (2.2) and using the relation (2.1), we obtain the following equivalent system:
$$\begin{cases} -\dfrac{d^2 v}{dx^2} + \gamma(x) v(x) = g(x, u), \quad T(u(x)) = v(x), & 0 \le x \le 1, \\ v(0) = T(\alpha), \quad v(1) = T(\beta). \end{cases} \tag{2.6}$$
Our Numerov's scheme for the numerical solution of (2.2) is based on the above equivalent form (2.6) (for the reason why we don't discretize (2.2) directly, see Remark 2.1). Let $h = 1/N$ be the mesh size, and let $x_i = ih$ $(0 \le i \le N)$ be the mesh points. For convenience, we introduce the finite difference operators $\delta_h^2$ and $P_h$ as follows:

$$\begin{aligned} \delta_h^2 u(x_i) &= u(x_{i-1}) - 2u(x_i) + u(x_{i+1}), & 1 \le i \le N-1, \\ P_h u(x_i) &= \frac{h^2}{12}\bigl(u(x_{i-1}) + 10u(x_i) + u(x_{i+1})\bigr), & 1 \le i \le N-1. \end{aligned} \tag{2.7}$$
Using the following Numerov's formula (cf. [4,28])

$$\delta_h^2 u(x_i) = P_h u^{(2)}(x_i) + O(h^6), \quad 1 \le i \le N-1, \qquad \text{where } u^{(2)}(x_i) = \frac{d^2 u(x_i)}{dx^2}, \tag{2.8}$$
we have from (2.6) that

$$\begin{cases} -\delta_h^2 v(x_i) + P_h\bigl(\gamma(x_i) v(x_i)\bigr) = P_h g(x_i, u(x_i)) + O(h^6), \quad T(u(x_i)) = v(x_i), & 1 \le i \le N-1, \\ v(x_0) = T(\alpha), \quad v(x_N) = T(\beta). \end{cases} \tag{2.9}$$
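The $O(h^6)$ remainder in Numerov's formula (2.8) can be observed numerically. The following Python sketch is an illustration only; the test function $u = \sin \pi x$, the evaluation point $x = 0.3$ and the helper name `numerov_residual` are assumptions made for this check and do not come from the paper. It evaluates $\delta_h^2 u - P_h u^{(2)}$ for two mesh sizes and estimates the decay order, which should be close to 6.

```python
import numpy as np

def numerov_residual(h, x=0.3):
    """Return |delta_h^2 u(x) - P_h u''(x)| for the test function u = sin(pi x)."""
    u = lambda t: np.sin(np.pi * t)
    d2u = lambda t: -np.pi**2 * np.sin(np.pi * t)
    # Second central difference, as in (2.7)
    delta2 = u(x - h) - 2.0 * u(x) + u(x + h)
    # Numerov averaging operator applied to u'', as in (2.7)
    ph = h**2 / 12.0 * (d2u(x - h) + 10.0 * d2u(x) + d2u(x + h))
    return abs(delta2 - ph)

# Halving h should reduce the residual by a factor of about 2^6.
r1, r2 = numerov_residual(0.1), numerov_residual(0.05)
observed_order = np.log2(r1 / r2)
print(observed_order)
```

After division by $h^2$ this remainder gives the $O(h^4)$ consistency that underlies the fourth-order accuracy of the scheme.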
After dropping the $O(h^6)$ term, we derive a finite difference scheme of (2.6) as follows:

$$\begin{cases} -\delta_h^2 v_i + P_h(\gamma_i v_i) = P_h g_i(u_i), \quad T(u_i) = v_i, & 1 \le i \le N-1, \\ v_0 = T(\alpha), \quad v_N = T(\beta), \end{cases} \tag{2.10}$$

where $u_i$ and $v_i$ represent the approximations of $u(x_i)$ and $v(x_i)$, respectively, and

$$\gamma_i = \gamma(x_i), \qquad g_i(u_i) = g(x_i, u_i).$$
To write (2.10) in a vector form, we let $J = (J_{i,j})$ and $B = (B_{i,j})$ be the $(N-1)$-order symmetric tridiagonal matrices with the following elements:

$$J_{i,i} = 2, \quad J_{i,i-1} = J_{i,i+1} = -1, \qquad B_{i,i} = 5/6, \quad B_{i,i-1} = B_{i,i+1} = 1/12, \qquad 1 \le i \le N-1,$$

and let $\Gamma$ be the diagonal matrix given by

$$\Gamma = \operatorname{diag}(\gamma_1, \gamma_2, \ldots, \gamma_{N-1}). \tag{2.11}$$

Also we define the following $(N-1)$-dimensional column vectors:

$$\begin{aligned} U &= (u_1, u_2, \ldots, u_{N-1})^T, \qquad V = (v_1, v_2, \ldots, v_{N-1})^T, \\ G(U) &= \bigl(g_1(u_1), g_2(u_2), \ldots, g_{N-1}(u_{N-1})\bigr)^T, \\ T(U) &= \bigl(T(u_1), T(u_2), \ldots, T(u_{N-1})\bigr)^T, \\ H &= \bigl(T(\alpha) + h^2 (g(0, \alpha) - \gamma_0 T(\alpha))/12,\ 0, \ldots, 0,\ T(\beta) + h^2 (g(1, \beta) - \gamma_N T(\beta))/12\bigr)^T. \end{aligned} \tag{2.12}$$
Then system (2.10) can be written in the vector form:

$$(J + h^2 B \Gamma) V = h^2 B G(U) + H, \qquad T(U) = V. \tag{2.13}$$
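As an illustration of the vector form (2.13), the matrices $J$ and $B$ and the boundary vector $H$ can be assembled as in the following Python sketch. The function names `numerov_matrices` and `boundary_vector` are hypothetical helpers introduced here, and dense NumPy arrays are used only for brevity; a tridiagonal storage would be preferable in practice.

```python
import numpy as np

def numerov_matrices(N):
    """Return the (N-1)x(N-1) tridiagonal matrices J and B defined before (2.11)."""
    n = N - 1
    off = np.eye(n, k=1) + np.eye(n, k=-1)     # ones on the two off-diagonals
    J = 2.0 * np.eye(n) - off                  # J_ii = 2, J_{i,i-1} = J_{i,i+1} = -1
    B = (5.0 / 6.0) * np.eye(n) + off / 12.0   # B_ii = 5/6, off-diagonals 1/12
    return J, B

def boundary_vector(N, T_alpha, T_beta, g0, gN, gamma0, gammaN):
    """Vector H of (2.12), with g0 = g(0, alpha) and gN = g(1, beta)."""
    h = 1.0 / N
    H = np.zeros(N - 1)
    H[0] += T_alpha + h * h * (g0 - gamma0 * T_alpha) / 12.0
    H[-1] += T_beta + h * h * (gN - gammaN * T_beta) / 12.0
    return H

# Consistency check: an interior row of h^2 B w reproduces P_h w(x_i) from (2.7).
N = 8
h = 1.0 / N
J, B = numerov_matrices(N)
w = np.arange(1.0, N)                          # sample values w_1, ..., w_{N-1}
row = 3                                        # an interior row (0-based index)
ph_direct = h * h / 12.0 * (w[row - 1] + 10.0 * w[row] + w[row + 1])
ph_matrix = (h * h * (B @ w))[row]
```

In the first and last rows the neighbouring boundary values are not part of $V$ or $U$; their contributions are exactly the two nonzero entries of $H$.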
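This is a coupled system of a nonlinear algebraic equation and a nonlinear functional equation. To analyze it and develop a linear monotone iterative algorithm for its solution, we review some well-known properties about the matrices $J$ and $B$. For two constants $\underline{M}$ and $\overline{M}$ satisfying $\overline{M} \ge \underline{M} > -\pi^2$ we define

$$h(\underline{M}, \overline{M}) = \begin{cases} \sqrt{12/\overline{M}}, & \underline{M} > -8,\ \overline{M} > 0, \\ 1, & \underline{M} > -8,\ \overline{M} \le 0, \\ \min\Bigl\{\sqrt{12/\overline{M}},\ \sqrt{\dfrac{12}{\pi^2}\Bigl(1 + \dfrac{\underline{M}}{\pi^2}\Bigr)}\Bigr\}, & \underline{M} \le -8,\ \overline{M} > 0, \\ \sqrt{\dfrac{12}{\pi^2}\Bigl(1 + \dfrac{\underline{M}}{\pi^2}\Bigr)}, & \underline{M} \le -8,\ \overline{M} \le 0. \end{cases} \tag{2.14}$$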
Lemma 2.1. (See Lemma 2.3 of [45] or Lemma 3.1 of [46].) Let $M = \operatorname{diag}(M_1, \ldots, M_{N-1})$ be a diagonal matrix and let $\underline{M}$ and $\overline{M}$ be two constants such that

$$-\pi^2 < \underline{M} \le M_i \le \overline{M}, \qquad i = 1, 2, \ldots, N-1. \tag{2.15}$$

Then the inverse $(J + h^2 B M)^{-1}$ exists and is nonnegative provided that $h < h(\underline{M}, \overline{M})$.

The following lemma will be used in Section 4 to prove the convergence of the resulting finite difference scheme (2.10) (or (2.13)).

Lemma 2.2. (See Lemma 2.4 of [45].) Let $\underline{M}$, $\overline{M}$ and $M_i$ $(i = 1, 2, \ldots, N-1)$ be the given constants satisfying (2.15), and let $M = \operatorname{diag}(M_1, \ldots, M_{N-1})$. Also let $Z$ and $R$ be the vectors in $\mathbb{R}^{N-1}$ such that

$$(J + h^2 B M) Z = R. \tag{2.16}$$

Then when $h < h(\underline{M}, \overline{M})$,

$$\|Z\|_\infty \le \begin{cases} (\|R\|_\infty / 8)\, h^{-2}, & \underline{M} \ge 0, \\ (\|R\|_\infty / (8 + \underline{M}))\, h^{-2}, & -8 < \underline{M} < 0, \\ (\|R\|_\infty / (2\pi(1 - \theta)))\, h^{-2}, & -\pi^2 < \underline{M} \le -8,\ h \le H, \end{cases} \tag{2.17}$$

where $H < h(\underline{M}, \overline{M})$ and

$$\theta = -\frac{\underline{M}}{\pi^2 (1 - \pi^2 H^2 / 12)} < 1.$$
Remark 2.1. The formulation of the system (2.13) is based on the equivalent form (2.6). This system can be clearly written as

$$J V = h^2 B F(U) + H, \qquad T(U) = V, \tag{2.18}$$

where

$$F(U) = \bigl(f_1(u_1), f_2(u_2), \ldots, f_{N-1}(u_{N-1})\bigr)^T, \qquad f_i(u_i) = f(x_i, u_i) \quad (i = 1, 2, \ldots, N-1). \tag{2.19}$$
System (2.18) is the discretization of the original problem (2.2) by Numerov’s method. As compared with (2.18), an important property of (2.13) is that the function G (U ) in this system is monotone nondecreasing in U (see (2.4)). This property, as we shall see later, makes it more convenient to design a linear monotone iterative algorithm for the computation of the solutions. This is our main reason for considering (2.13) instead of (2.18).
Remark 2.2. Much research has been done on the numerical solution of initial value problems (IVPs) for second-order ordinary differential equations. Various numerical methods have appeared in the literature, including linear multistep methods and hybrid methods, which are often used in practical applications (see [8,9,17,27,35,37–40,42]). A detailed survey of recent developments of these methods is given in [9]. It would be interesting to examine whether the techniques in these methods can be applied to the present two-point boundary value problems. Further investigations will be made in the future.

3. Linear monotone iterative algorithm

To develop a linear monotone iterative algorithm for the solutions of (2.13) we use the method of upper and lower solutions and its associated monotone iterations. A pair of ordered upper and lower solutions of (2.13) is defined as follows:

Definition 3.1. A vector $(\tilde{U}, \tilde{V}) \in \mathbb{R}^{N-1} \times \mathbb{R}^{N-1}$ is called an upper solution of (2.13) if

$$(J + h^2 B \Gamma) \tilde{V} \ge h^2 B G(\tilde{U}) + H, \qquad T(\tilde{U}) \ge \tilde{V}. \tag{3.1}$$
Similarly, $(\hat{U}, \hat{V}) \in \mathbb{R}^{N-1} \times \mathbb{R}^{N-1}$ is called a lower solution of (2.13) if it satisfies the above inequalities in the reversed order. A pair of upper and lower solutions $(\tilde{U}, \tilde{V})$ and $(\hat{U}, \hat{V})$ are said to be ordered if $(\tilde{U}, \tilde{V}) \ge (\hat{U}, \hat{V})$. In the above definition, inequalities between vectors are understood componentwise. It is clear that every solution of (2.13) is an upper solution as well as a lower solution. For any $W$ in $\mathbb{R}^{N-1}$, we denote by $w_i$ the $i$-th component of $W$. Given a pair of ordered upper and lower solutions $(\tilde{U}, \tilde{V})$ and $(\hat{U}, \hat{V})$, we define the sectors

$$S = \bigl\{(U, V) \in \mathbb{R}^{N-1} \times \mathbb{R}^{N-1};\ (\hat{U}, \hat{V}) \le (U, V) \le (\tilde{U}, \tilde{V})\bigr\}, \qquad \langle \hat{u}_i, \tilde{u}_i \rangle = \{u_i \in \mathbb{R};\ \hat{u}_i \le u_i \le \tilde{u}_i\}, \tag{3.2}$$

and define the interval $I_0$ by

$$I_0 = \Bigl[\min_i \hat{u}_i,\ \max_i \tilde{u}_i\Bigr].$$
We assume that $I_0 \subset I$ throughout the paper, where $I$ is defined in hypothesis (H). To compute the solutions of (2.13) we use the following linear iterative scheme:

$$\begin{cases} (J + h^2 B \Gamma) V^{(m+1)} = h^2 B G(U^{(m)}) + H, \\ \Gamma^{(m)} U^{(m+1)} = \Gamma^{(m)} U^{(m)} - T(U^{(m)}) + V^{(m+1)}, \end{cases} \qquad m = 0, 1, 2, \ldots, \tag{3.3}$$

where the initial iteration $(U^{(0)}, V^{(0)})$ is either $(\tilde{U}, \tilde{V})$ or $(\hat{U}, \hat{V})$, and

$$\Gamma^{(m)} = \operatorname{diag}\bigl(\gamma_1^{(m)}, \ldots, \gamma_{N-1}^{(m)}\bigr), \qquad \gamma_i^{(m)} = \max_s \bigl\{k(s);\ \underline{u}_i^{(m)} \le s \le \overline{u}_i^{(m)}\bigr\}. \tag{3.4}$$

The functions $\overline{u}_i^{(m)}$, $\underline{u}_i^{(m)}$ in the definition of $\gamma_i^{(m)}$ are the respective components of $\overline{U}^{(m)}$ and $\underline{U}^{(m)}$ which are obtained from (3.3) with $(U^{(0)}, V^{(0)}) = (\tilde{U}, \tilde{V})$ and $(U^{(0)}, V^{(0)}) = (\hat{U}, \hat{V})$, respectively.
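To show that the sequences given by (3.3) are well-defined it is crucial that the sequences $\{\overline{U}^{(m)}\}$ and $\{\underline{U}^{(m)}\}$ possess the property $\overline{U}^{(m)} \ge \underline{U}^{(m)}$ for every $m$. It is clear from (3.4) that

$$\Gamma^{(m)} (U - U') - T(U) + T(U') \ge 0 \quad \text{whenever } \underline{U}^{(m)} \le U' \le U \le \overline{U}^{(m)}. \tag{3.5}$$

Moreover, by the monotone nondecreasing property of $g(x, u)$ in $u$,

$$G(U) \ge G(U') \quad \text{whenever } \hat{U} \le U' \le U \le \tilde{U}. \tag{3.6}$$

We observe from Lemma 2.1 and the nonnegative property of $\Gamma$ that the inverse $(J + h^2 B \Gamma)^{-1}$ exists and is a nonnegative matrix provided that $h < h(0, \|\Gamma\|_\infty)$. These properties lead to the following well-defined and monotone properties of the sequences from (3.3).

Lemma 3.1. Let hypothesis (H) be satisfied and let $h < h(0, \|\Gamma\|_\infty)$. Also let $(\tilde{U}, \tilde{V})$ and $(\hat{U}, \hat{V})$ be a pair of ordered upper and lower solutions of (2.13). Then the sequences $\{(\overline{U}^{(m)}, \overline{V}^{(m)})\}$, $\{(\underline{U}^{(m)}, \underline{V}^{(m)})\}$ and $\{\Gamma^{(m)}\}$ given by (3.3) and (3.4) with $(\overline{U}^{(0)}, \overline{V}^{(0)}) = (\tilde{U}, \tilde{V})$ and $(\underline{U}^{(0)}, \underline{V}^{(0)}) = (\hat{U}, \hat{V})$ are all well-defined and possess the monotone property

$$(\hat{U}, \hat{V}) \le (\underline{U}^{(m)}, \underline{V}^{(m)}) \le (\underline{U}^{(m+1)}, \underline{V}^{(m+1)}) \le (\overline{U}^{(m+1)}, \overline{V}^{(m+1)}) \le (\overline{U}^{(m)}, \overline{V}^{(m)}) \le (\tilde{U}, \tilde{V}) \tag{3.7}$$

for every $m = 0, 1, 2, \ldots$.

Proof. Let $m = 0$ in (3.3). Since $\overline{U}^{(0)} = \tilde{U}$, $\underline{U}^{(0)} = \hat{U}$ and $\tilde{U} \ge \hat{U}$, the elements $\gamma_i^{(0)}$ of the diagonal matrix $\Gamma^{(0)}$ are well-defined. By hypothesis (H), the inverse $(\Gamma^{(0)})^{-1}$ exists and is a nonnegative matrix. Hence the first iterations $(\overline{U}^{(1)}, \overline{V}^{(1)})$ and $(\underline{U}^{(1)}, \underline{V}^{(1)})$ exist and satisfy the relation

$$\begin{cases} (J + h^2 B \Gamma)(\overline{V}^{(0)} - \overline{V}^{(1)}) = (J + h^2 B \Gamma)\tilde{V} - h^2 B G(\tilde{U}) - H, \\ (J + h^2 B \Gamma)(\underline{V}^{(1)} - \underline{V}^{(0)}) = h^2 B G(\hat{U}) + H - (J + h^2 B \Gamma)\hat{V}, \\ (J + h^2 B \Gamma)(\overline{V}^{(1)} - \underline{V}^{(1)}) = h^2 B \bigl(G(\tilde{U}) - G(\hat{U})\bigr). \end{cases} \tag{3.8}$$

By (3.1) and (3.6), the right-hand side of the above relation is nonnegative, and thus by the nonnegative property of $(J + h^2 B \Gamma)^{-1}$, $\overline{V}^{(0)} \ge \overline{V}^{(1)} \ge \underline{V}^{(1)} \ge \underline{V}^{(0)}$. Knowing this relation we have from (3.1), (3.3) and (3.5) that

$$\begin{aligned} \Gamma^{(0)} (\overline{U}^{(0)} - \overline{U}^{(1)}) &= T(\tilde{U}) - \overline{V}^{(1)} \ge T(\tilde{U}) - \tilde{V} \ge 0, \\ \Gamma^{(0)} (\underline{U}^{(1)} - \underline{U}^{(0)}) &= -T(\hat{U}) + \underline{V}^{(1)} \ge -T(\hat{U}) + \hat{V} \ge 0, \\ \Gamma^{(0)} (\overline{U}^{(1)} - \underline{U}^{(1)}) &= \Gamma^{(0)} (\tilde{U} - \hat{U}) - T(\tilde{U}) + T(\hat{U}) + \overline{V}^{(1)} - \underline{V}^{(1)} \ge 0. \end{aligned}$$

In view of $(\Gamma^{(0)})^{-1} \ge 0$, $\overline{U}^{(0)} \ge \overline{U}^{(1)} \ge \underline{U}^{(1)} \ge \underline{U}^{(0)}$. This proves that (3.7) holds for $m = 0$. Assume, by induction, that for some $m \ge 1$, $(\overline{U}^{(m)}, \overline{V}^{(m)})$, $(\underline{U}^{(m)}, \underline{V}^{(m)})$, $(\overline{U}^{(m-1)}, \overline{V}^{(m-1)})$ and $(\underline{U}^{(m-1)}, \underline{V}^{(m-1)})$ exist and (3.7) holds when $m$ is replaced by $m - 1$. It follows from $\hat{U} \le \underline{U}^{(m)} \le \overline{U}^{(m)} \le \tilde{U}$ that $\Gamma^{(m)}$ is well-defined and its inverse $(\Gamma^{(m)})^{-1}$ is nonnegative. This ensures that $(\overline{U}^{(m+1)}, \overline{V}^{(m+1)})$ and $(\underline{U}^{(m+1)}, \underline{V}^{(m+1)})$ exist. By (3.3) and (3.6),

$$\begin{aligned} (J + h^2 B \Gamma)(\overline{V}^{(m)} - \overline{V}^{(m+1)}) &= h^2 B \bigl(G(\overline{U}^{(m-1)}) - G(\overline{U}^{(m)})\bigr) \ge 0, \\ (J + h^2 B \Gamma)(\underline{V}^{(m+1)} - \underline{V}^{(m)}) &= h^2 B \bigl(G(\underline{U}^{(m)}) - G(\underline{U}^{(m-1)})\bigr) \ge 0, \\ (J + h^2 B \Gamma)(\overline{V}^{(m+1)} - \underline{V}^{(m+1)}) &= h^2 B \bigl(G(\overline{U}^{(m)}) - G(\underline{U}^{(m)})\bigr) \ge 0. \end{aligned}$$

The nonnegative property of $(J + h^2 B \Gamma)^{-1}$ implies that $\overline{V}^{(m)} \ge \overline{V}^{(m+1)} \ge \underline{V}^{(m+1)} \ge \underline{V}^{(m)}$. This relation together with (3.3) and (3.5) yields

$$\begin{aligned} \Gamma^{(m)} (\overline{U}^{(m)} - \overline{U}^{(m+1)}) &\ge \Gamma^{(m-1)} (\overline{U}^{(m-1)} - \overline{U}^{(m)}) - T(\overline{U}^{(m-1)}) + T(\overline{U}^{(m)}) \ge 0, \\ \Gamma^{(m)} (\underline{U}^{(m+1)} - \underline{U}^{(m)}) &\ge \Gamma^{(m-1)} (\underline{U}^{(m)} - \underline{U}^{(m-1)}) - T(\underline{U}^{(m)}) + T(\underline{U}^{(m-1)}) \ge 0, \\ \Gamma^{(m)} (\overline{U}^{(m+1)} - \underline{U}^{(m+1)}) &\ge \Gamma^{(m)} (\overline{U}^{(m)} - \underline{U}^{(m)}) - T(\overline{U}^{(m)}) + T(\underline{U}^{(m)}) \ge 0, \end{aligned}$$

which implies that $\overline{U}^{(m)} \ge \overline{U}^{(m+1)} \ge \underline{U}^{(m+1)} \ge \underline{U}^{(m)}$. The conclusion of the lemma follows from the principle of induction. □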
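In view of the monotone property (3.7), the limits

$$\lim_{m \to \infty} (\overline{U}^{(m)}, \overline{V}^{(m)}) = (\overline{U}, \overline{V}), \qquad \lim_{m \to \infty} (\underline{U}^{(m)}, \underline{V}^{(m)}) = (\underline{U}, \underline{V}) \tag{3.9}$$

exist and satisfy

$$(\hat{U}, \hat{V}) \le (\underline{U}^{(m)}, \underline{V}^{(m)}) \le (\underline{U}^{(m+1)}, \underline{V}^{(m+1)}) \le (\underline{U}, \underline{V}) \le (\overline{U}, \overline{V}) \le (\overline{U}^{(m+1)}, \overline{V}^{(m+1)}) \le (\overline{U}^{(m)}, \overline{V}^{(m)}) \le (\tilde{U}, \tilde{V}), \qquad m = 0, 1, 2, \ldots. \tag{3.10}$$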
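The following theorem shows that the limits $(\overline{U}, \overline{V})$ and $(\underline{U}, \underline{V})$ are the maximal and minimal solutions of (2.13) in $S$, respectively, in the sense that if $(U, V)$ is any solution of (2.13) in $S$ then $(\underline{U}, \underline{V}) \le (U, V) \le (\overline{U}, \overline{V})$.

Theorem 3.1. Let the conditions in Lemma 3.1 hold. Then the sequences $\{(\overline{U}^{(m)}, \overline{V}^{(m)})\}$ and $\{(\underline{U}^{(m)}, \underline{V}^{(m)})\}$ converge monotonically to the maximal solution $(\overline{U}, \overline{V})$ and the minimal solution $(\underline{U}, \underline{V})$ of (2.13) in $S$, respectively. Moreover, the relation (3.10) holds.

Proof. Let $\overline{\gamma}_i = \max_s \{k(s);\ \underline{u}_i \le s \le \overline{u}_i\}$ and $\overline{\Gamma} = \operatorname{diag}(\overline{\gamma}_1, \ldots, \overline{\gamma}_{N-1})$, where $\overline{u}_i$ and $\underline{u}_i$ are the respective components of the limits $\overline{U}$ and $\underline{U}$ in (3.9). Then by the definition of $\Gamma^{(m)}$, $\Gamma^{(m)} \ge \Gamma^{(m+1)} \ge \overline{\Gamma}$ for every $m$, and so the sequence $\{\Gamma^{(m)}\}$ converges as $m \to \infty$. Letting $m \to \infty$ in (3.3) shows that both $(\overline{U}, \overline{V})$ and $(\underline{U}, \underline{V})$ are solutions of (2.13) in $S$. We next show that these solutions are the maximal and minimal solutions of (2.13) in $S$. Let $(U, V)$ be any solution of (2.13) in $S$. Then by (2.13) and (3.3),

$$\begin{cases} (J + h^2 B \Gamma)(\overline{V}^{(m+1)} - V) = h^2 B \bigl(G(\overline{U}^{(m)}) - G(U)\bigr), \\ \Gamma^{(m)} (\overline{U}^{(m+1)} - U) = \Gamma^{(m)} (\overline{U}^{(m)} - U) - T(\overline{U}^{(m)}) + T(U) + \overline{V}^{(m+1)} - V, \end{cases} \qquad m = 0, 1, 2, \ldots. \tag{3.11}$$

Since $(U, V) \in S$, the above relation for $m = 0$ gives

$$\begin{cases} (J + h^2 B \Gamma)(\overline{V}^{(1)} - V) = h^2 B \bigl(G(\tilde{U}) - G(U)\bigr) \ge 0, \\ \Gamma^{(0)} (\overline{U}^{(1)} - U) = \Gamma^{(0)} (\tilde{U} - U) - T(\tilde{U}) + T(U) + \overline{V}^{(1)} - V \ge \overline{V}^{(1)} - V. \end{cases} \tag{3.12}$$

This yields $(\overline{U}^{(1)}, \overline{V}^{(1)}) \ge (U, V)$ by using the nonnegativity of $(J + h^2 B \Gamma)^{-1}$ and $(\Gamma^{(0)})^{-1}$. A similar argument gives $(\underline{U}^{(1)}, \underline{V}^{(1)}) \le (U, V)$. It follows from an induction argument that $(\underline{U}^{(m)}, \underline{V}^{(m)}) \le (U, V) \le (\overline{U}^{(m)}, \overline{V}^{(m)})$ for every $m$. This shows $(\underline{U}, \underline{V}) \le (U, V) \le (\overline{U}, \overline{V})$, which proves the maximal and minimal property of $(\overline{U}, \overline{V})$ and $(\underline{U}, \underline{V})$. □

It is obvious from the maximal and minimal property of $(\overline{U}, \overline{V})$ and $(\underline{U}, \underline{V})$ that if $(\overline{U}, \overline{V}) = (\underline{U}, \underline{V})$ $(\equiv (U^*, V^*))$ then $(U^*, V^*)$ is the unique solution of (2.13) in $S$. To ensure $(\overline{U}, \overline{V}) = (\underline{U}, \underline{V})$, it is necessary to impose some additional conditions on $f(\cdot, u)$. To give such a sufficient condition, we define

$$\overline{\sigma} = \max_i \max_{\xi, \eta \in \langle \hat{u}_i, \tilde{u}_i \rangle} \Bigl(-\frac{f_u(x_i, \xi)}{k(\eta)}\Bigr), \qquad \underline{\sigma} = \min_i \min_{\xi, \eta \in \langle \hat{u}_i, \tilde{u}_i \rangle} \Bigl(-\frac{f_u(x_i, \xi)}{k(\eta)}\Bigr) \tag{3.13}$$

and assume

$$\underline{\sigma} > -\pi^2, \qquad h < h(\underline{\sigma}, \overline{\sigma}). \tag{3.14}$$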
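Theorem 3.2. Let the conditions in Lemma 3.1 hold. If, in addition, condition (3.14) holds, then the sequences $\{(\overline{U}^{(m)}, \overline{V}^{(m)})\}$ and $\{(\underline{U}^{(m)}, \underline{V}^{(m)})\}$ converge monotonically from above and below, respectively, to the unique solution $(U^*, V^*)$ of (2.13) in $S$.

Proof. It suffices to show $(\overline{U}, \overline{V}) = (\underline{U}, \underline{V})$, where $(\overline{U}, \overline{V})$ and $(\underline{U}, \underline{V})$ are the respective limits of the sequences $\{(\overline{U}^{(m)}, \overline{V}^{(m)})\}$ and $\{(\underline{U}^{(m)}, \underline{V}^{(m)})\}$. By (2.13) (or (2.18)),

$$J (\overline{V} - \underline{V}) = h^2 B \bigl(F(\overline{U}) - F(\underline{U})\bigr), \qquad T(\overline{U}) - T(\underline{U}) = \overline{V} - \underline{V}. \tag{3.15}$$

Applying the mean-value theorem we obtain

$$T(\overline{U}) - T(\underline{U}) = T_u(\Theta')(\overline{U} - \underline{U}), \qquad F(\overline{U}) - F(\underline{U}) = F_u(\Theta)(\overline{U} - \underline{U}), \tag{3.16}$$

where

$$\begin{aligned} F_u(\Theta) &= \operatorname{diag}\bigl(f_u(x_1, \theta_1), \ldots, f_u(x_{N-1}, \theta_{N-1})\bigr), & (\theta_1, \theta_2, \ldots, \theta_{N-1}) &\in \langle \hat{U}, \tilde{U} \rangle, \\ T_u(\Theta') &= \operatorname{diag}\bigl(k(\theta'_1), \ldots, k(\theta'_{N-1})\bigr), & (\theta'_1, \theta'_2, \ldots, \theta'_{N-1}) &\in \langle \hat{U}, \tilde{U} \rangle. \end{aligned} \tag{3.17}$$

Using the relation (3.16) in (3.15) leads to

$$\bigl(J - h^2 B F_u(\Theta) (T_u(\Theta'))^{-1}\bigr)(\overline{V} - \underline{V}) = 0. \tag{3.18}$$

The condition (3.14) implies that the matrix $-F_u(\Theta)(T_u(\Theta'))^{-1}$ satisfies the condition of Lemma 2.1, and therefore by this lemma, the inverse $(J - h^2 B F_u(\Theta)(T_u(\Theta'))^{-1})^{-1}$ exists. It follows from (3.18) that $\overline{V} = \underline{V}$. This implies $T(\overline{U}) - T(\underline{U}) = 0$, which ensures $\overline{U} = \underline{U}$. This proves $(\overline{U}, \overline{V}) = (\underline{U}, \underline{V})$. □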
Remark 3.1. (a) If $k(u) \equiv 1$ then all the conclusions in Theorems 3.1 and 3.2 hold true for the semilinear problem (1.1). In this situation, the iteration process (3.3) is reduced to that in [43] and the uniqueness condition (3.14) becomes the one in [46]. (b) For each $m$, the iterations $V^{(m)}$ and $U^{(m)}$ from (3.3) can be computed, respectively, by solving a tridiagonal system of linear algebraic equations and a diagonal system of linear algebraic equations, and thus (3.3) provides a linear monotone iterative algorithm.

Remark 3.2. We see from the monotone property (3.10) that for each $m$, $(\overline{U}^{(m)}, \overline{V}^{(m)})$ is an upper bound of the maximal solution $(\overline{U}, \overline{V})$, while $(\underline{U}^{(m)}, \underline{V}^{(m)})$ gives a lower bound of the minimal solution $(\underline{U}, \underline{V})$. Moreover, these bounds are improved, step-by-step, as $m$ increases. If $(\overline{U}, \overline{V}) = (\underline{U}, \underline{V})$ $(\equiv (U^*, V^*))$ then they become improved upper and lower bounds of $(U^*, V^*)$. This demonstrates the superiority of the monotone convergence over the ordinary convergence.

The linear iterative scheme (3.3) can be replaced by the following form

$$\begin{cases} (J + h^2 B \Gamma) V^{(m+1)} = h^2 B G(U^{(m)}) + H, \\ \Gamma^{(0)} U^{(m+1)} = \Gamma^{(0)} U^{(m)} - T(U^{(m)}) + V^{(m)}, \end{cases} \qquad m = 0, 1, 2, \ldots \tag{3.19}$$
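which results in less computation in the iteration process but still maintains the monotone convergence (3.10) of the sequences. The latter result follows along the same lines as the proofs of Lemma 3.1 and Theorem 3.1. However, the sequence given by (3.19) converges more slowly than the one given by (3.3). Specifically, we have the following comparison result.

Theorem 3.3. Let the conditions in Lemma 3.1 hold, and let $(\tilde{U}, \tilde{V})$ and $(\hat{U}, \hat{V})$ be a pair of ordered upper and lower solutions of (2.13). Denote by $\{(\overline{U}^{(m)}, \overline{V}^{(m)})\}$ and $\{(\underline{U}^{(m)}, \underline{V}^{(m)})\}$ the sequences from (3.3) with $(\overline{U}^{(0)}, \overline{V}^{(0)}) = (\tilde{U}, \tilde{V})$ and $(\underline{U}^{(0)}, \underline{V}^{(0)}) = (\hat{U}, \hat{V})$, and by $\{(\overline{U}_*^{(m)}, \overline{V}_*^{(m)})\}$ and $\{(\underline{U}_*^{(m)}, \underline{V}_*^{(m)})\}$ the sequences from (3.19) with $(\overline{U}_*^{(0)}, \overline{V}_*^{(0)}) = (\tilde{U}, \tilde{V})$ and $(\underline{U}_*^{(0)}, \underline{V}_*^{(0)}) = (\hat{U}, \hat{V})$. Then we have

$$(\overline{U}^{(m)}, \overline{V}^{(m)}) \le (\overline{U}_*^{(m)}, \overline{V}_*^{(m)}), \qquad (\underline{U}^{(m)}, \underline{V}^{(m)}) \ge (\underline{U}_*^{(m)}, \underline{V}_*^{(m)}), \qquad m = 0, 1, 2, \ldots. \tag{3.20}$$

Proof. By (3.3) and (3.19),

$$(J + h^2 B \Gamma)(\overline{V}_*^{(m+1)} - \overline{V}^{(m+1)}) = h^2 B \bigl(G(\overline{U}_*^{(m)}) - G(\overline{U}^{(m)})\bigr), \tag{3.21}$$

and

$$\Gamma^{(0)} (\overline{U}_*^{(m+1)} - \overline{U}^{(m+1)}) = \Gamma^{(0)} (\overline{U}_*^{(m)} - \overline{U}^{(m)}) + (\Gamma^{(0)} - \Gamma^{(m)})(\overline{U}^{(m)} - \overline{U}^{(m+1)}) - T(\overline{U}_*^{(m)}) + T(\overline{U}^{(m)}) + \overline{V}_*^{(m)} - \overline{V}^{(m+1)}. \tag{3.22}$$

Since $\Gamma^{(0)} \ge \Gamma^{(m)}$, $\overline{U}^{(m)} \ge \overline{U}^{(m+1)}$ and $\overline{V}^{(m)} \ge \overline{V}^{(m+1)}$, the equality (3.22) implies

$$\Gamma^{(0)} (\overline{U}_*^{(m+1)} - \overline{U}^{(m+1)}) \ge \Gamma^{(0)} (\overline{U}_*^{(m)} - \overline{U}^{(m)}) - T(\overline{U}_*^{(m)}) + T(\overline{U}^{(m)}) + \overline{V}_*^{(m)} - \overline{V}^{(m)}. \tag{3.23}$$

In view of $(\overline{U}^{(0)}, \overline{V}^{(0)}) = (\overline{U}_*^{(0)}, \overline{V}_*^{(0)}) = (\tilde{U}, \tilde{V})$, we have from the above relations (3.21) and (3.23) for $m = 0$ that

$$(J + h^2 B \Gamma)(\overline{V}_*^{(1)} - \overline{V}^{(1)}) = 0, \qquad \Gamma^{(0)} (\overline{U}_*^{(1)} - \overline{U}^{(1)}) \ge 0.$$

The nonnegative property of $(J + h^2 B \Gamma)^{-1}$ and $(\Gamma^{(0)})^{-1}$ yields that $\overline{V}^{(1)} = \overline{V}_*^{(1)}$ and $\overline{U}^{(1)} \le \overline{U}_*^{(1)}$. Using this result in (3.21) and (3.23) we get from (3.5) and (3.6) that

$$(J + h^2 B \Gamma)(\overline{V}_*^{(2)} - \overline{V}^{(2)}) \ge 0, \qquad \Gamma^{(0)} (\overline{U}_*^{(2)} - \overline{U}^{(2)}) \ge 0.$$

This proves $\overline{V}^{(2)} \le \overline{V}_*^{(2)}$ and $\overline{U}^{(2)} \le \overline{U}_*^{(2)}$. An induction argument shows that

$$(\overline{U}^{(m)}, \overline{V}^{(m)}) \le (\overline{U}_*^{(m)}, \overline{V}_*^{(m)}), \qquad m = 0, 1, 2, \ldots.$$

The other inequality in (3.20) can be proved in the same manner. □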
Remark 3.3. The comparison result in Theorem 3.3 implies that with the same initial iteration, which is either an upper solution or a lower solution, the sequence given by (3.3) converges faster than the one by (3.19).

4. Convergence of the method

In this section, we deal with the convergence of finite difference scheme (2.10) (or (2.13)), and show its fourth-order accuracy. Let $(u(x_i), v(x_i))$ be the value of the solution of (2.2) at $x_i$, and let $(u_i, v_i)$ stand for the solution of (2.10). Since, by (2.2), (2.5) and (2.10),

$$g(x_i, u(x_i)) = \gamma(x_i) v(x_i) + f(x_i, u(x_i)), \qquad g_i(u_i) = \gamma_i v_i + f(x_i, u_i),$$

we have from (2.9) and (2.10) that the solutions $(u(x_i), v(x_i))$ and $(u_i, v_i)$ satisfy, respectively, the systems

$$\begin{cases} -\delta_h^2 v(x_i) = P_h f(x_i, u(x_i)) + O(h^6), \quad T(u(x_i)) = v(x_i), & 1 \le i \le N-1, \\ v(x_0) = T(\alpha), \quad v(x_N) = T(\beta), \end{cases} \tag{4.1}$$

and

$$\begin{cases} -\delta_h^2 v_i = P_h f(x_i, u_i), \quad T(u_i) = v_i, & 1 \le i \le N-1, \\ v_0 = T(\alpha), \quad v_N = T(\beta). \end{cases} \tag{4.2}$$

Based on (4.1) and (4.2), we show the convergence of $(u_i, v_i)$ to $(u(x_i), v(x_i))$ as $h \to 0$.
Theorem 4.1. Assume that $u_i, u(x_i) \in \Lambda \subset I$ for each $i$ and some interval $\Lambda$ in $\mathbb{R}$, and assume

$$\frac{f_u(x, \xi)}{k(\eta)} < \pi^2, \qquad 0 \le x \le 1,\ (\xi, \eta) \in \Lambda \times \Lambda. \tag{4.3}$$

Then for sufficiently small $h$,

$$\max_i |u(x_i) - u_i| \le c h^4, \qquad \max_i |v(x_i) - v_i| \le c h^4, \tag{4.4}$$

where $c$ is a positive constant independent of $h$.

Proof. Let $e_i = u(x_i) - u_i$ and $\overline{e}_i = v(x_i) - v_i$. A subtraction of the corresponding equations in the systems (4.1) and (4.2) and using the mean-value theorem lead to

$$\begin{cases} -\delta_h^2 \overline{e}_i = P_h \bigl(f_u(x_i, \xi_i) e_i\bigr) + O(h^6), \quad k(\xi'_i) e_i = \overline{e}_i, & 1 \le i \le N-1, \\ \overline{e}_0 = \overline{e}_N = 0, \end{cases} \tag{4.5}$$

where $\xi_i$ and $\xi'_i$ are two intermediate values between $u(x_i)$ and $u_i$. Define

$$\begin{aligned} \overline{E} &= (\overline{e}_1, \overline{e}_2, \ldots, \overline{e}_{N-1})^T, \qquad E = (e_1, e_2, \ldots, e_{N-1})^T, \\ F_u(\Xi) &= \operatorname{diag}\bigl(f_u(x_1, \xi_1), f_u(x_2, \xi_2), \ldots, f_u(x_{N-1}, \xi_{N-1})\bigr), \\ T_u(\Xi') &= \operatorname{diag}\bigl(k(\xi'_1), k(\xi'_2), \ldots, k(\xi'_{N-1})\bigr). \end{aligned}$$

Then the vector form of (4.5) is given by

$$J \overline{E} = h^2 B F_u(\Xi) E + O(h^6), \qquad T_u(\Xi') E = \overline{E}. \tag{4.6}$$

This yields

$$\bigl(J - h^2 B F_u(\Xi)(T_u(\Xi'))^{-1}\bigr) \overline{E} = O(h^6). \tag{4.7}$$

By (4.3), the matrix $-F_u(\Xi)(T_u(\Xi'))^{-1}$ satisfies the condition (2.15). An application of Lemma 2.2 to (4.7) shows that for sufficiently small $h$, $\|\overline{E}\|_\infty \le c_1 h^4$ where $c_1$ is a positive constant independent of $h$. Hence $\|E\|_\infty \le c_1 \|(T_u(\Xi'))^{-1}\|_\infty h^4 \le c_1 k_0^{-1} h^4$. This proves the estimate (4.4). □

Remark 4.1. (a) Theorem 4.1 shows that as $h \to 0$, the solution $(u_i, v_i)$ of (2.10) converges to the solution $(u(x_i), v(x_i))$ of (2.2) with the convergence order $O(h^4)$ in the $L^\infty$-norm. (b) If $k(u) \equiv 1$, the convergence result in Theorem 4.1 coincides with that in [45] for the semilinear problem.

5. Applications and numerical results

In this section, we apply the results of the previous sections to several model problems arising from heat-conduction, population dynamics and chemical engineering. We present some numerical results to demonstrate the monotone property of the linear iterative algorithm (3.3) and the high accuracy of the numerical solution. All computations are carried out by using a MATLAB subroutine on a Pentium-4 computer with 2G memory.

Since the solution of (1.1) is usually nonnegative for practical problems, we assume the boundary values $\alpha \ge 0$ and $\beta \ge 0$ in this section. Also we assume that the functions $f(\cdot, u)$ and $k(u)$ are $C^1$-functions of $u$, and there exists a positive constant $\rho \ge \max\{\alpha, \beta\}$ such that $k(u) \ge k_0 > 0$ for all $u \in [0, \rho]$. This implies that the functions $f(x, u)$ and $k(u)$ satisfy hypothesis (H) with the set $I = [0, \rho]$. For this situation, we always take $\delta = 0$ in the definition of transformation $T$. Under these assumptions, the zero vector $(\hat{U}, \hat{V}) = (0, 0) \in \mathbb{R}^{N-1} \times \mathbb{R}^{N-1}$ is a lower solution of (2.13) if
$$f(0, \alpha) \ge 0, \qquad f(1, \beta) \ge 0, \qquad f(x, 0) \ge 0 \quad (0 < x < 1). \tag{5.1}$$

To construct an upper solution of (2.13), we let $(\tilde{U}, \tilde{V}) = (\rho E, T(\rho) E)$, where $E = (1, 1, \ldots, 1)^T \in \mathbb{R}^{N-1}$. A simple calculation shows that this vector is an upper solution if

$$J T(\rho) E \ge h^2 B F(\rho E) + H,$$

where $F(U)$ is defined by (2.19). The above inequality is satisfied by any $\rho$ satisfying

$$\begin{cases} f(x, \rho) \le 0 \quad (0 \le x \le 1), \\ \displaystyle\int_0^\rho k(s)\,ds \ge \max\Bigl\{\int_0^\alpha k(s)\,ds + h^2 f(0, \alpha)/12,\ \int_0^\beta k(s)\,ds + h^2 f(1, \beta)/12\Bigr\}. \end{cases} \tag{5.2}$$
We now consider three examples to illustrate the applications of the above construction and the results in the previous sections.

Example 5.1. Consider the heat transfer in a rod with the given temperatures $\alpha \ge 0$ and $\beta \ge 0$ at the two ends. If the thermal conductivity is temperature dependent and the form of the source function is based on the so-called Boltzmann fourth-power law, the equation governing the steady-state temperature distribution $u(x)$ in the rod is given by (1.1) with

$$f(x, u) = \sigma \bigl(a^4(x) - u^4(x)\bigr), \tag{5.3}$$

where $\sigma$ is a positive constant and $a(x) \ge 0$ denotes the surrounding temperature (see [20,30]). Assume that $a(0) \ge \alpha$ and $a(1) \ge \beta$, and let $\overline{a} \ge \max_{0 \le x \le 1} a(x)$. Since $f_u(x, u) = -4\sigma u^3$, the function $\gamma(x)$ in (2.3) can be chosen as $\gamma(x) \equiv \gamma = 4\sigma\rho^3/k_0$. In this case, the function $g(x, u)$ in (2.5) and the difference scheme (2.10) are, respectively, reduced to

$$g(x, u) = \gamma T(u) + f(x, u), \tag{5.4}$$

and

$$\begin{cases} -\delta_h^2 v_i + \gamma P_h v_i = P_h g_i(u_i), \quad T(u_i) = v_i, & 1 \le i \le N-1, \\ v_0 = T(\alpha), \quad v_N = T(\beta). \end{cases} \tag{5.5}$$

The vector form of (5.5) is given by

$$(J + \gamma h^2 B) V = h^2 B G(U) + H, \qquad T(U) = V, \tag{5.6}$$

where $H = \bigl(T(\alpha) + h^2 (g(0, \alpha) - \gamma T(\alpha))/12,\ 0, \ldots, 0,\ T(\beta) + h^2 (g(1, \beta) - \gamma T(\beta))/12\bigr)^T$. Since $a(0) \ge \alpha$, $a(1) \ge \beta$ and $a(x) \ge 0$, the condition (5.1) is satisfied by the function $f(x, u)$ in (5.3). This ensures that $(\hat{U}, \hat{V}) = (0, 0)$ is a lower solution of (5.6). On the other hand, the condition (5.2) for the present problem holds if

$$\begin{cases} \rho \ge \overline{a}, \\ \displaystyle\int_0^\rho k(s)\,ds \ge \max\Bigl\{\int_0^\alpha k(s)\,ds + h^2 \sigma \bigl(a^4(0) - \alpha^4\bigr)/12,\ \int_0^\beta k(s)\,ds + h^2 \sigma \bigl(a^4(1) - \beta^4\bigr)/12\Bigr\}. \end{cases} \tag{5.7}$$

This requirement can be satisfied by a sufficiently large $\rho$ or by taking the mesh size $h$ suitably small. Therefore, $(\tilde{U}, \tilde{V}) = (\rho E, T(\rho) E)$ and $(\hat{U}, \hat{V}) = (0, 0)$ form a pair of ordered upper and lower solutions of (5.6). With respect to this pair, the constants $\overline{\sigma}$ and $\underline{\sigma}$ in (3.13) are given by

$$\overline{\sigma} = 4\sigma\rho^3 \max_{\eta \in \langle 0, \rho \rangle} k^{-1}(\eta), \qquad \underline{\sigma} = 0. \tag{5.8}$$

Let $h < \min\{\sqrt{12/\gamma}, \sqrt{12/\overline{\sigma}}\}$. Using $(\overline{U}^{(0)}, \overline{V}^{(0)}) = (\rho E, T(\rho) E)$, $(\underline{U}^{(0)}, \underline{V}^{(0)}) = (0, 0)$ and the functions in (5.3) and (5.4) in the linear iterative scheme (3.3) with $\Gamma = \gamma I$ (where $I$ denotes the identity matrix), we can compute the sequences $\{(\overline{U}^{(m)}, \overline{V}^{(m)})\}$ and $\{(\underline{U}^{(m)}, \underline{V}^{(m)})\}$. By Theorem 3.2, these sequences converge monotonically to the unique solution $(U^*, V^*)$ of (5.6) in $S$ where

$$S = \bigl\{(U, V) \in \mathbb{R}^{N-1} \times \mathbb{R}^{N-1};\ (0, 0) \le (U, V) \le (\rho E, T(\rho) E)\bigr\}. \tag{5.9}$$

Since $f_u(x, u) \le 0$ for all $u \ge 0$ and all $0 \le x \le 1$, the condition (4.3) is trivially satisfied. By Theorem 4.1, the solution $(u_i^*, v_i^*)$ of (5.5) converges to the continuous solution $(u^*(x_i), v^*(x_i))$ of (2.2), (5.3) at every mesh point $x_i$ with the convergence order $O(h^4)$, as $h \to 0$.

To give some numerical results, we consider a special case of the above model problem where $\alpha = \beta = 0$, $\sigma = 1$ and

$$k(u) = \frac{1}{1 + u^3}, \qquad a(x) = \Bigl(\frac{\pi^2 (\sin \pi x + \sin^4 \pi x + 3 \sin^2 \pi x \cos^2 \pi x)}{(1 + \sin^3 \pi x)^2} + \sin^4 \pi x\Bigr)^{1/4}. \tag{5.10}$$
Fig. 1. The monotone convergence of the sequences at xi = 0.5 for Example 5.1.
Table 1
Numerical solution (U*, V*) of Example 5.1.

x_i    u_i*                v_i*
0.1    0.30901814073003    0.30677611425942
0.2    0.58778747153044    0.56098264809778
0.3    0.80901821366068    0.72574878708737
0.4    0.95105728817513    0.81026505270532
0.5    1.00000076672804    0.83564923162852
It can be checked that $u = \sin \pi x$ is a solution of this example, and the transformation $T$ in (2.1) is given by

$$v = T(u) = \int_0^u \frac{1}{1 + s^3}\,ds = \frac{1}{6} \ln \frac{(1 + u)^2}{u^2 - u + 1} + \frac{1}{\sqrt{3}} \arctan \frac{2u - 1}{\sqrt{3}} + \frac{\sqrt{3}\,\pi}{18}. \tag{5.11}$$
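The closed form (5.11) can be cross-checked against a direct quadrature of $\int_0^u (1 + s^3)^{-1}\,ds$. The Python sketch below is an illustration only; the names `T_closed` and `T_quad` and the use of composite Simpson's rule with 2000 subintervals are choices made here, not part of the paper.

```python
import numpy as np

def T_closed(u):
    """Closed form (5.11) of the integral of 1/(1+s^3) from 0 to u."""
    return (np.log((1.0 + u) ** 2 / (u * u - u + 1.0)) / 6.0
            + np.arctan((2.0 * u - 1.0) / np.sqrt(3.0)) / np.sqrt(3.0)
            + np.sqrt(3.0) * np.pi / 18.0)

def T_quad(u, n=2000):
    """Composite Simpson approximation of the same integral (n must be even)."""
    s = np.linspace(0.0, u, n + 1)
    y = 1.0 / (1.0 + s ** 3)
    step = u / n
    # Weights 1, 4, 2, 4, ..., 2, 4, 1
    return step / 3.0 * (y[0] + y[-1] + 4.0 * y[1:-1:2].sum() + 2.0 * y[2:-1:2].sum())

for u in (0.5, 1.0, 1.8):
    print(u, T_closed(u), T_quad(u))
```

Evaluating `T_closed(1.0)` gives a value close to the entry $v_i^*$ at $x_i = 0.5$ in Table 1, as expected since $u(0.5) = 1$ for the exact solution.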
We see that it is very difficult to obtain the inverse $T^{-1}$. Since $0 \le a(x) \le 1.8$ for all $0 \le x \le 1$, the constants appearing above may be taken as

$$k_0 = 1/(1 + \rho^3), \qquad \overline{a} = \rho = 1.8, \qquad \gamma = \overline{\sigma} = 4\rho^3 (1 + \rho^3). \tag{5.12}$$
With $\Gamma = \gamma I$ in the linear iterative scheme (3.3), we compute the sequences $\{(\overline{U}^{(m)}, \overline{V}^{(m)})\}$ and $\{(\underline{U}^{(m)}, \underline{V}^{(m)})\}$ with $(\overline{U}^{(0)}, \overline{V}^{(0)}) = (\rho E, T(\rho) E)$ and $(\underline{U}^{(0)}, \underline{V}^{(0)}) = (0, 0)$ for various values of $h$. In all the numerical computations, the basic feature of monotone convergence of the sequences was observed. Let $h = 1/40$. In Fig. 1, we present some numerical results of these sequences at $x_i = 0.5$, where the solid line denotes the sequence $\{\overline{U}^{(m)}\}$ or $\{\overline{V}^{(m)}\}$ and the dash-dot line stands for the sequence $\{\underline{U}^{(m)}\}$ or $\{\underline{V}^{(m)}\}$. As expected from our theoretical analysis in Theorem 3.2, the monotone convergence of the sequences holds and their common limit $(U^*, V^*)$ is the unique solution of (5.6) in $S$ given by (5.9). More numerical results of $(U^*, V^*)$ at various $x_i$ are explicitly given in Table 1.

To demonstrate the accuracy of the numerical solution $(U^*, V^*)$, we calculate the order of maximum error which is defined by

$$\operatorname{order}_\alpha(h) = \log_2 \frac{\operatorname{error}_\alpha(h)}{\operatorname{error}_\alpha(h/2)}, \quad \alpha = u, v, \qquad \operatorname{error}_u(h) = \max_i |u_i^* - u^*(x_i)|, \quad \operatorname{error}_v(h) = \max_i |v_i^* - v^*(x_i)|, \tag{5.13}$$

where $(u_i^*, v_i^*)$ is the $i$-th component of $(U^*, V^*)$ and $(u^*(x_i), v^*(x_i))$ is the continuous solution of (2.2), (5.3) at $x_i$. The termination criterion of iterations is given by

$$\|\overline{U}^{(m)} - \underline{U}^{(m)}\|_\infty + \|\overline{V}^{(m)} - \underline{V}^{(m)}\|_\infty < 10^{-13}. \tag{5.14}$$

In Table 2, we list the maximum errors $\operatorname{error}_u(h)$ and $\operatorname{error}_v(h)$ as well as their orders for various values of $h$. The data in this table demonstrate that the numerical solution $(U^*, V^*)$ has fourth-order accuracy. This agrees very well with the analysis.
Table 2
The accuracy of the numerical solution (U*, V*) of Example 5.1.

h        error_u(h)               order_u(h)          error_v(h)               order_v(h)
1/4      2.81395566738641e-002    4.49661221054842    2.04635438672907e-002    4.11611159332585
1/8      1.24652816805571e-003    3.78156986465225    1.18007005963311e-003    3.93091257322739
1/16     9.06433894494185e-005    4.03930692923534    7.73722447332537e-005    4.03925838585124
1/32     5.51294410744418e-006    4.00966787470430    4.70594950063852e-006    3.98548563403334
1/64     3.42257746477337e-007    4.00239749843280    2.97095818502235e-007    4.00267485004741
1/128    2.13555905181906e-008    4.00130691012906    1.85340933711586e-008    4.00131466422679
1/256    1.33351585329677e-009    4.04110625795553    1.15732573524596e-009    4.03827698371076
1/512    8.10035372111884e-011                        7.04389879757628e-011
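The order column of Table 2 follows from the error column through definition (5.13). The short Python sketch below recomputes $\operatorname{order}_u(h)$ from the tabulated values of $\operatorname{error}_u(h)$; the helper name `observed_orders` is hypothetical, and the error values are copied from Table 2.

```python
import math

# Maximum errors error_u(h) copied from Table 2, for h = 1/4, 1/8, ..., 1/512.
error_u = [
    2.81395566738641e-02, 1.24652816805571e-03, 9.06433894494185e-05,
    5.51294410744418e-06, 3.42257746477337e-07, 2.13555905181906e-08,
    1.33351585329677e-09, 8.10035372111884e-11,
]

def observed_orders(errors):
    """order(h) = log2(error(h) / error(h/2)) for successive halvings, as in (5.13)."""
    return [math.log2(e_h / e_half) for e_h, e_half in zip(errors, errors[1:])]

orders_u = observed_orders(error_u)
for k, p in enumerate(orders_u):
    print(f"h = 1/{4 * 2 ** k}: order_u = {p:.6f}")
```

Since each order compares two consecutive mesh sizes, eight error values yield seven orders, which is why the last row of the table has no order entry.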
Example 5.2. Our second example is a logistic equation in population dynamics problems where the diffusion is density dependent. This equation in the one-dimensional domain $[0, 1]$ is governed by (1.1) with

$$f(x, u) = \sigma u (1 - u) + q(x), \tag{5.15}$$

where $\sigma$ is a positive constant and $q(x) \ge 0$ is a possible internal source (see [7,11]). For this example, the function $\gamma(x)$ in (2.3) can be chosen as $\gamma(x) \equiv \gamma = \sigma(2\rho - 1)/k_0$. In this case, the function $g(x, u)$ and the difference scheme (2.13) are reduced to (5.4) and (5.6), respectively, where $f(x, u)$ is replaced by (5.15). On the other hand, the conditions (5.1) and (5.2) for this example are, respectively, reduced to

$$\sigma\alpha(1 - \alpha) + q(0) \ge 0, \qquad \sigma\beta(1 - \beta) + q(1) \ge 0, \tag{5.16}$$

and

$$\begin{cases} \sigma\rho(1 - \rho) + q(x) \le 0 \quad (0 \le x \le 1), \\ \displaystyle\int_0^\rho k(s)\,ds \ge \max\Bigl\{\int_0^\alpha k(s)\,ds + h^2 f(0, \alpha)/12,\ \int_0^\beta k(s)\,ds + h^2 f(1, \beta)/12\Bigr\}. \end{cases} \tag{5.17}$$

We now consider the above problem with $\alpha = \beta = 0$, $\sigma = 2$ and

$$k(u) = \arctan(1 + u), \qquad q(x) = 2\arctan(1 + z) - \frac{(2x - 1)^2}{2 + 2z + z^2} - 2z(1 - z), \tag{5.18}$$

where $z = x(1 - x)$. In this case, $u = z$ is a solution of the problem, and the transformation $T$ in (2.1) is given by

$$v = T(u) = \int_0^u \arctan(1 + s)\,ds = (1 + u)\arctan(1 + u) - \frac{1}{2}\ln(2 + 2u + u^2) - \frac{\pi}{4} + \ln\sqrt{2}. \tag{5.19}$$
Clearly, it is difficult to find the inverse $T^{-1}$ explicitly. With respect to the above parameters, the condition (5.16) is obviously satisfied, and the condition (5.17) holds for $\rho = 2$. This implies that $(\tilde{U}, \tilde{V}) = (2E, T(2)E)$ and $(\hat{U}, \hat{V}) = (0, 0)$ form a pair of ordered upper and lower solutions. Since $\rho = 2$ and $\arctan(1 + u) \ge \pi/4$ for all $u \ge 0$, we have $k_0 = \pi/4$ and $\gamma = 24/\pi$. Using $(\overline{U}^{(0)}, \overline{V}^{(0)}) = (2E, T(2)E)$ and $(\underline{U}^{(0)}, \underline{V}^{(0)}) = (0, 0)$, we compute the corresponding sequences $\{(\overline{U}^{(m)}, \overline{V}^{(m)})\}$ and $\{(\underline{U}^{(m)}, \underline{V}^{(m)})\}$ from (3.3) with $\Gamma = \gamma I$ for the present problem, where $\gamma = 24/\pi$. Let $h = 1/40$. Some numerical results of these sequences at $x_i = 0.5$ are plotted in Fig. 2, where the solid line denotes the sequence $\{\overline{U}^{(m)}\}$ or $\{\overline{V}^{(m)}\}$ and the dash-dot line stands for the sequence $\{\underline{U}^{(m)}\}$ or $\{\underline{V}^{(m)}\}$. As expected from our theoretical analysis in Theorem 3.2, the monotone convergence of the sequences holds and their common limit $(U^*, V^*)$ is the unique solution of the present problem in $S$ given by (5.9) where $\rho = 2$. The maximum errors $\operatorname{error}_u(h)$ and $\operatorname{error}_v(h)$ of the numerical solution $(U^*, V^*)$ as well as their orders for various values of $h$ are presented in Table 3. In our computations, the termination criterion of iterations is still determined by (5.14). We see from Table 3 that the numerical solution $(U^*, V^*)$ has fourth-order accuracy.

Example 5.3. Our last example is given by (1.1) with
$$f(x, u) = \sigma (1 - u) \exp\bigl(-\nu/(1 + u)\bigr) + q(x), \tag{5.20}$$

where $\sigma$ and $\nu$ are positive constants, and $q(x)$ is a nonnegative function. This problem arises from a chemical reactor with first-order reaction (cf. [10,12,20,31,32,34]). To find $\gamma(x)$ in (2.3) we observe from (5.20) that

$$f_u(x, u) = -\sigma \exp\bigl(-\nu/(1 + u)\bigr) \Bigl(1 - \frac{\nu (1 - u)}{(1 + u)^2}\Bigr). \tag{5.21}$$
Fig. 2. The monotone convergence of the sequences at x_i = 0.5 for Example 5.2.

Table 3. The accuracy of the numerical solution (U*, V*) of Example 5.2.

h        error_u(h)               order_u(h)          error_v(h)               order_v(h)
1/4      1.58684599293107e-004    3.96014245092926    1.42185276064216e-004    3.96009579872077
1/8      1.01956076546672e-005    3.99096234202004    9.13580885505971e-006    3.99095934055636
1/16     6.41229860487425e-007    3.99781374138991    5.74577388973152e-007    3.99781355033980
1/32     4.01376447634050e-008    3.99947474754768    3.59655524528879e-008    3.99947477046760
1/64     2.50951628921747e-009    3.99996791204800    2.24866553155678e-009    3.99996826760600
1/128    1.56848256605002e-010    4.04518256310889    1.40544686999533e-010    4.04516643685877
1/256    9.50076128880539e-012    4.13608892220934    8.51330117512816e-012    4.13597285658087
1/512    5.40345546085064e-013    -                   4.84223772190262e-013    -
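The orders listed in Table 3 can be reproduced from the tabulated errors. The following is a minimal sketch, assuming that the order is the base-2 logarithm of the ratio of the errors on two consecutive grids; the function name `observed_orders` is introduced here for illustration only.

```python
import math

# Maximum errors error_u(h) from Table 3, for h = 1/4, 1/8, ..., 1/512.
errors_u = [
    1.58684599293107e-04, 1.01956076546672e-05, 6.41229860487425e-07,
    4.01376447634050e-08, 2.50951628921747e-09, 1.56848256605002e-10,
    9.50076128880539e-12, 5.40345546085064e-13,
]


def observed_orders(errors):
    """Return log2(e(h) / e(h/2)) for each pair of consecutive errors."""
    return [math.log2(coarse / fine) for coarse, fine in zip(errors, errors[1:])]


orders_u = observed_orders(errors_u)
for order in orders_u:
    print(f"{order:.10f}")
```

Under this assumption the seven computed values agree with the order_u(h) column to the digits shown, and all of them stay close to four.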
This implies that $-f_u(x,u) \le \sigma(1+\nu)$ for all $u \ge 0$, and therefore, $\gamma(x)$ can be chosen as $\gamma(x) \equiv \gamma = \sigma(1+\nu)/k_0$. Then the function $g(x,u)$ and the difference scheme (2.13) for this example are reduced to (5.4) and (5.6), respectively, where $f(x,u)$ is replaced by (5.20). To accommodate the exact solution $u = 1 - x^2/2$, we let
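\[ k(u) = \sqrt{1+u^2}, \qquad q(x) = \sqrt{1+z^2} - \frac{x^2 z}{\sqrt{1+z^2}} - \sigma(1-z)\exp\bigl(-\nu/(1+z)\bigr), \qquad z = 1 - \frac{x^2}{2}. \tag{5.22} \]
The above choice implies that $\alpha = 1$, $\beta = 1/2$, $k_0 = 1$ and $q(x) \ge 0$ for all $x \in [0,1]$ if $\sigma \le 3\exp(\nu/2)/\sqrt{5}$. Moreover, the transformation $T$ in (2.1) is given by
\[ v = T(u) = \int_0^u \sqrt{1+s^2}\, ds = \frac{1}{2}\Bigl[ u\sqrt{1+u^2} + \ln\bigl(u + \sqrt{1+u^2}\bigr) \Bigr]. \tag{5.23} \]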
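A quick numerical sanity check of (5.22) and (5.23) can be sketched as follows. The helper names `T_closed`, `T_quadrature` and `q` are introduced here for illustration; the Simpson rule and the sampling grid are assumptions of this sketch, not part of the method in the paper.

```python
import math


def T_closed(u):
    """Closed form (5.23): T(u) = (u*sqrt(1+u^2) + ln(u + sqrt(1+u^2))) / 2."""
    root = math.sqrt(1.0 + u * u)
    return 0.5 * (u * root + math.log(u + root))


def T_quadrature(u, panels=2000):
    """Composite Simpson approximation of the integral of sqrt(1+s^2) over [0, u]."""
    h = u / panels
    total = 1.0 + math.sqrt(1.0 + u * u)  # endpoint values k(0) and k(u)
    for j in range(1, panels):
        weight = 4.0 if j % 2 else 2.0
        total += weight * math.sqrt(1.0 + (j * h) ** 2)
    return total * h / 3.0


def q(x, sigma, nu):
    """Source term (5.22) with z = 1 - x^2/2."""
    z = 1.0 - 0.5 * x * x
    root = math.sqrt(1.0 + z * z)
    return root - x * x * z / root - sigma * (1.0 - z) * math.exp(-nu / (1.0 + z))


# Values sigma = 1, nu = 0.5 are the ones used later in this example.
smallest_q = min(q(i / 100.0, 1.0, 0.5) for i in range(101))
print(T_closed(2.0), T_quadrature(2.0), smallest_q)
```

The two evaluations of T should agree closely, and the sampled values of q stay nonnegative for σ = 1, ν = 0.5, in line with the condition stated above.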
As in the previous examples, we can prove that for any
\[ \rho \ge \max\left\{ \frac{\sigma + \sqrt{2}\exp(\nu/2)}{\sigma},\ \frac{13\sqrt{2}}{12},\ \frac{21\sqrt{5}}{40} \right\}, \]
the pair $(\tilde U, \tilde V) = (\rho E, T(\rho)E)$ and $(\hat U, \hat V) = (0, 0)$ are a pair of ordered upper and lower solutions of the present problem. Let $\sigma = 1$, $\nu = 0.5$ and $\rho = 1 + \sqrt{2}\exp(1/4)$. We compute the corresponding numerical solution $(U^*, V^*)$ by the linear iterative scheme (3.3) with the initial iterations $(\overline U^{(0)}, \overline V^{(0)}) = (\rho E, T(\rho)E)$ and $(\underline U^{(0)}, \underline V^{(0)}) = (0, 0)$. The termination criterion is the same as in Example 5.1. Fig. 3 shows the monotone convergence of the sequences $\{(\overline U^{(m)}, \overline V^{(m)})\}$ and $\{(\underline U^{(m)}, \underline V^{(m)})\}$ at $x_i = 0.5$, where the solid line denotes the sequence $\{\overline U^{(m)}\}$ or $\{\overline V^{(m)}\}$ and the dash-dot line stands for the sequence $\{\underline U^{(m)}\}$ or $\{\underline V^{(m)}\}$ as before. The data in Table 4 show the maximum errors $\mathrm{error}_u(h)$ and $\mathrm{error}_v(h)$ of the numerical solution $(U^*, V^*)$ as well as their orders for various values of $h$. We see that the fourth-order accuracy of the numerical solution $(U^*, V^*)$ is demonstrated.

6. Concluding remarks

In this paper, we gave a numerical treatment for a class of strongly nonlinear two-point boundary value problems by Numerov’s method. The main contribution is to develop precisely a technique for computing accurately the numerical
Fig. 3. The monotone convergence of the sequences at x_i = 0.5 for Example 5.3.

Table 4. The accuracy of the numerical solution (U*, V*) of Example 5.3.

h        error_u(h)               order_u(h)          error_v(h)               order_v(h)
1/4      2.54113143148826e-005    3.91167506288797    3.23999380957840e-005    3.90204567565112
1/8      1.68847875603451e-006    3.99730525735689    2.16726185686689e-006    3.99730581562862
1/16     1.05727220867102e-007    3.99543651708451    1.35707057924428e-007    3.99937013513628
1/32     6.62888643976345e-009    3.99986674925372    8.48539494224809e-009    3.99980843898597
1/64     4.14343670485096e-010    4.00060471650773    5.30407606724737e-010    4.00059411265700
1/128    2.58856269752528e-011    4.00270654236108    3.31368266159871e-011    4.00080260662483
1/256    1.61481938931729e-012    3.82261112679379    2.06989980711114e-012    3.86224325092020
1/512    1.14130926931466e-013    -                   1.42330591756945e-013    -
solutions of the problems by Numerov’s method without finding any inverse function. We generalized the method of upper and lower solutions to a coupled system of a nonlinear algebraic equation and a nonlinear functional equation. A linear monotone iterative algorithm was presented for the solutions of the resulting discrete problems. The monotone property of the iterations gives improved upper and lower bounds of the solution in each iteration. The proposed method complements the known method in [44] where it is necessary to find explicitly an inverse function, and so the full potential of Numerov’s method for strongly nonlinear two-point boundary value problems is realized. The numerical results coincide with the analysis very well. We conclude by noting that the present study serves as an initial exploration of the application of higher-order compact finite difference methods to strongly nonlinear differential equations. The analysis is limited to an ordinary differential equation. The extension to partial differential equations is an interesting issue, and more challenging problems concerning this extension will be studied in the future.

Acknowledgements

The author would like to thank the referees for their valuable comments and suggestions which improved the presentation of the paper.

References

[1] E. Adams, W.F. Ames, On contracting interval iteration for nonlinear problems in Rn: Part II: Applications, Nonlinear Anal. 5 (1981) 525–542.
[2] R.P. Agarwal, On Numerov’s method for solving two point boundary value problems, Util. Math. 28 (1985) 159–174.
[3] R.P. Agarwal, Difference Equations and Inequalities, Marcel Dekker, New York, 1992.
[4] R.P. Agarwal, Y.-M. Wang, The recent developments of the Numerov’s method, Comput. Math. Appl. 42 (2001) 561–592.
[5] J.H. Ahlberg, E.N. Nilson, J.L. Walsh, The Theory of Splines and Their Applications, Academic Press, New York, 1967.
[6] W.F. Ames, Nonlinear Partial Differential Equations in Engineering, Academic Press, New York, 1972.
[7] W.F. Ames, Numerical Methods for Partial Differential Equations, Academic Press, New York, 1977.
[8] Z.A. Anastassi, T.E. Simos, A family of two-stage two-step methods for the numerical integration of the Schrödinger equation and related IVPs with oscillating solution, J. Math. Chem. 45 (2009) 1102–1129.
[9] Z.A. Anastassi, T.E. Simos, Numerical multistep methods for the efficient solution of quantum mechanics and related problems, Phys. Rep. 482–483 (2009) 1–240.
[10] R. Aris, The Mathematical Theory of Diffusion and Reaction in Permeable Catalysts, Clarendon Press, Oxford, 1975.
[11] D.G. Aronson, H.F. Weinberger, Nonlinear diffusion in population genetics, combustion, and nerve propagation, in: Partial Differential Equations and Related Topics, in: Lecture Notes in Math., vol. 446, Springer, New York, 1975, pp. 5–49.
[12] J.W. Bebernes, D. Eberly, Mathematical Problems from Combustion Theory, Springer-Verlag, New York, 1989.
[13] L.K. Bieniasz, Use of the Numerov method to improve the accuracy of the spatial discretization in finite-difference electrochemical kinetic simulations, Comput. Chem. 26 (2002) 633–644.
[14] R.L. Burden, J.D. Faires, Numerical Analysis, Prindle, Weber and Schmidt, Boston, 1993.
[15] M.M. Chawla, P.N. Shivakumar, Numerov’s method for non-linear two-point boundary value problems, Int. J. Comput. Math. 17 (1985) 167–176.
[16] M.M. Chawla, P.N. Shivakumar, A new fourth order cubic spline method for non-linear two-point boundary value problems, Int. J. Comput. Math. 22 (1987) 321–341.
[17] J.P. Coleman, L.Gr. Ixaru, P-stability and exponential-fitting methods for y'' = f(x, y), IMA J. Numer. Anal. 16 (1996) 179–199.
[18] L. Collatz, The Numerical Treatment of Differential Equations, third ed., Springer, Berlin, 1960.
[19] Y. Fang, X. Wu, A trigonometrically fitted explicit Numerov-type method for second-order initial value problems with oscillating solutions, Appl. Numer. Math. 58 (2008) 341–351.
[20] D.A. Frank-Kamenetskii, Diffusion and Heat Transfer in Chemical Kinetics, Plenum Press, New York, 1969.
[21] B.-Y. Guo, J.J.H. Miller, Iterative and Petrov–Galerkin methods for solving a system of one-dimensional nonlinear elliptic equations, Math. Comp. 58 (1992) 531–547.
[22] R.C. Gupta, R.P. Agarwal, A new shooting method for multi-point discrete boundary value problems, J. Math. Anal. Appl. 112 (1985) 210–220.
[23] P. Henrici, Discrete Variable Methods in Ordinary Differential Equations, John Wiley, New York, 1962.
[24] S.R.K. Iyengar, P. Jain, Spline finite difference methods for singular two point boundary value problems, Numer. Math. 50 (1987) 363–376.
[25] H.B. Keller, Elliptic boundary value problems suggested by nonlinear diffusion processes, Arch. Ration. Mech. Anal. 35 (1969) 363–381.
[26] J.B. Keller, W.E. Olmstead, Temperature of a nonlinear radiating semi-infinite solid, Quart. Appl. Math. 29 (1972) 559–566.
[27] A. Konguetsof, T.E. Simos, An exponentially fitted and trigonometrically fitted method for the numerical solution of periodic initial value problems, Comput. Math. Appl. 45 (2003) 547–554.
[28] B.V. Numerov, A method of extrapolation of perturbations, Monthly Notices Royal Astron. Soc. 84 (1924) 592–601.
[29] W.E. Olmstead, Temperature distribution in a convex solid with nonlinear radiation boundary condition, J. Math. Mech. 15 (1966) 899–908.
[30] M.N. Özisik, Boundary Value Problems of Heat Conduction, Dover, New York, 1989.
[31] C.V. Pao, Monotone iterative methods for finite difference system of reaction diffusion equations, Numer. Math. 46 (1985) 571–586.
[32] C.V. Pao, Nonlinear Parabolic and Elliptic Equations, Plenum Press, New York, 1992.
[33] C.V. Pao, Finite difference reaction diffusion equations with nonlinear boundary conditions, Numer. Methods Partial Differential Equations 11 (1995) 355–374.
[34] S.V. Parter, Solutions of a differential arising in chemical reactor process, SIAM J. Appl. Math. 26 (1974) 687–716.
[35] G. Psihoyios, T.E. Simos, A fourth algebraic order trigonometrically fitted predictor-corrector scheme for IVPs with oscillating solutions, J. Comput. Appl. Math. 175 (2005) 137–147.
[36] S.M. Roberts, J.S. Shipman, Two Point Boundary Value Problems: Shooting Methods, American Elsevier, New York, 1972.
[37] T.E. Simos, A Numerov-type method for the numerical-solution of the radial Schrödinger-equation, Appl. Numer. Math. 7 (1991) 201–206.
[38] T.E. Simos, A high-order predictor-corrector method for periodic IVPS, Appl. Math. Lett. 6 (1993) 9–12.
[39] T.E. Simos, Exponentially-fitted and trigonometrically-fitted symmetric linear multistep methods for the numerical integration of orbital problems, Phys. Lett. A 315 (2003) 437–446.
[40] S. Stavroyiannis, T.E. Simos, Optimization as a function of the phase-lag order of nonlinear explicit two-step P-stable method for linear periodic IVPs, Appl. Numer. Math. 59 (2009) 2467–2474.
[41] E.H. Twizell, S.I.A. Tirmizi, Multiderivative methods for non-linear second-order boundary value problems, J. Comput. Appl. Math. 17 (1987) 299–307.
[42] H. Van de Vyver, Phase-fitted and amplification-fitted two-step hybrid methods for y'' = f(x, y), J. Comput. Appl. Math. 209 (2007) 33–53.
[43] Y.-M. Wang, Monotone methods for a boundary value problem of second-order discrete equation, Comput. Math. Appl. 36 (1998) 77–92.
[44] Y.-M. Wang, Numerov’s method for strongly nonlinear two-point boundary value problems, Comput. Math. Appl. 45 (2003) 759–763.
[45] Y.-M. Wang, The extrapolation of Numerov’s scheme for nonlinear two-point boundary value problems, Appl. Numer. Math. 57 (2007) 253–269.
[46] Y.-M. Wang, B.-Y. Guo, On Numerov scheme for nonlinear two-points boundary value problem, J. Comput. Math. 16 (1998) 345–366.
Applied Numerical Mathematics 61 (2011) 53–65
Applied Numerical Mathematics www.elsevier.com/locate/apnum
Higher order optimization and adaptive numerical solution for optimal control of monodomain equations in cardiac electrophysiology

Chamakuri Nagaiah ∗, Karl Kunisch

Institute of Mathematics and Scientific Computing, University of Graz, Heinrichstr. 36, Graz, A-8010, Austria
Article info

Article history: Received 14 December 2009. Received in revised form 29 July 2010. Accepted 9 August 2010. Available online 27 August 2010.

Keywords: Reaction–diffusion equations; Monodomain model; PDE constraint optimization; Adaptive FEM; Newton’s optimization algorithm

Abstract

In this work adaptive and high resolution numerical discretization techniques are demonstrated for solving optimal control of the monodomain equations in cardiac electrophysiology. A monodomain model, which is a well established model for describing the wave propagation of the action potential in the cardiac tissue, will be employed for the numerical experiments. The optimal control problem is considered as a PDE constrained optimization problem. We present an optimal control formulation for the monodomain equations with an extracellular current as the control variable, which must be determined in such a way that excitations of the transmembrane voltage are damped in an optimal manner. The focus of this work is on the development and implementation of an efficient numerical technique to solve an optimal control problem related to a reaction–diffusion system arising in cardiac electrophysiology. Specifically, a Newton-type method for the monodomain model is developed. The numerical treatment is enhanced by using a second order time stepping method and adaptive grid refinement techniques. The numerical results clearly show that super-linear convergence is achieved in practice. © 2010 IMACS. Published by Elsevier B.V. All rights reserved.
1. Introduction

Optimization problems governed by partial differential equations arise in many application areas of natural science and engineering. These problems, now called PDE constrained optimization problems, still present a significant challenge to efficient numerical realization because of their size and/or complexity. One such interesting application is the optimal control of the bidomain model equations in cardiac electrophysiology. The bidomain model [8,18,22] describes the electrical behavior of the cardiac tissue by a reaction–diffusion system coupled with a system of ordinary differential equations which model the ionic currents associated with the reaction terms. The equations model the fact that currents leaving the extracellular space by traversing the cellular membranes are the sources of the intracellular current density and vice versa. That is, currents leaving the intracellular space act as sources of the extracellular current density. The numerical solution of the bidomain equations is computationally expensive, see [25,24], due to the high spatio-temporal resolution required to resolve the fast transients and steep gradients governing wavefront propagation in the heart. Assuming that the anisotropy ratios of the two spaces are equal leads to a reduced model, referred to as the monodomain model, which can be solved at a much lower computational expense by avoiding the time consuming solution of an elliptic PDE [17]. Under many circumstances of practical relevance the monodomain model can be considered to approximate the bidomain model fairly well [19,15]. Cardiac arrhythmia refers to sudden, irregular patterns in the heart rhythm that may cause the heart to stop beating completely or slow down to the point where life is unsustainable. This must be treated immediately to avoid sudden cardiac
* Corresponding author: C. Nagaiah.
0168-9274/$30.00 © 2010 IMACS. Published by Elsevier B.V. All rights reserved. doi:10.1016/j.apnum.2010.08.004
C. Nagaiah, K. Kunisch / Applied Numerical Mathematics 61 (2011) 53–65
death. The only way to reestablish a normal rhythm is to apply a strong electrical shock, a process called defibrillation. During defibrillation extracellular currents are injected via electrodes to establish an extracellular potential distribution which acts to reduce the complexity of the activity. This is achieved either by extinguishing all electrical activity, i.e. the entire tissue returns to its quiescent state, or by smoothing out gradients in V_m to drive the system to a less heterogeneous state, which reduces the likelihood of triggering new wavefronts via “break” mechanisms when switching off the applied field. In the context of optimally controlling cardiac arrhythmias, it is essential to determine the control response to an applied electric field as well as the optimal extracellular current density that acts to damp gradients of the transmembrane voltage in the system. The main objective of this work is the development and computer implementation of an efficient numerical technique to solve optimal control problems for monodomain equations. The optimal control approach is based on minimizing a properly chosen cost functional J(v, I_e) depending on the extracellular current I_e as input and on the transmembrane potential v as one of the state variables. First- and second-order derivatives of the reduced cost are derived using a Lagrangian based formalism. A numerical technique for solving optimal control problems requires combining a discretization technique with an optimization method. A traditional way to solve such problems is the optimize before discretize approach. In this case one expresses the continuous optimality system first before discretizing it. A second approach is to first discretize the differential equations and the cost and to solve the resulting nonlinear programming problem with an efficient method. Clearly it is desirable to follow an approach where these two strategies commute or at least lead to very similar results.
For the problem class under consideration here we carried out a detailed comparison for the 1D case in Nagaiah et al. [13]. In the present work we stick to the optimize before discretize technique. In the numerical simulations there are many important factors which put a high demand on the computing time. These include different length and time scales of the reaction terms, strong nonlinearities caused by ionic currents, anisotropy related to the fiber orientation, and rapid changes of the potentials [7,24]. We have chosen the finite element method for the spatial and higher order linearly implicit Runge–Kutta time stepping methods for the temporal discretization. The mesh has to be adapted, during the primal and dual solves, at each time step in all regions in order to preserve the accuracy of the solution. To control the spatial discretization error, a-posteriori error estimators are computed to steer the mesh improvement by refinement and coarsening in each time step during the primal and dual solves in the optimization algorithm. The adaptivity strategy in our work is based on the ZZ-error estimator [26]. Various other forms of adaptive mesh refinement techniques were applied successfully for excitable media by varying the spatial or temporal resolution or both [5,7,12,21]. At present our numerical experiments are based on allowing adaptivity in the spatial grids, while the time step is kept constant during the primal and dual solves. More details will be given in Section 4.3. Our numerical realization is based on the public domain software package DUNE [1]. The article is organized as follows: in the next section the governing equations for the action potential and the behavior of the ionic current variables using ionic models are described. In Section 3 the control problem is posed for the monodomain equations and the first and second order derivatives of the reduced cost are characterized. Also a brief description of Newton’s algorithm is given.
The numerical approach for solving the primal and the adjoint state equations is presented in Section 4. Numerical results for two test cases are discussed in Section 5. Finally concluding remarks are given.

2. The governing equations

The monodomain model consists of a parabolic reaction–diffusion equation for the transmembrane potential $v$ coupled with a system of ODEs for the gating variables. We set $Q = \Omega \times [0,T]$, where $\Omega \subset \mathbb{R}^d$, $d = 2$, denotes the cardiac tissue sample domain with Lipschitz boundary $\partial\Omega$. The system is given by
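\[ \frac{\partial v}{\partial t} = \nabla\cdot(\bar\sigma_i \nabla v) - I_{ion}(v,w) + I_e(x,t) \quad \text{in } Q, \tag{1} \]
\[ \frac{\partial w}{\partial t} = g(v,w) \quad \text{in } Q, \tag{2} \]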
where $v : Q \to \mathbb{R}$ is the transmembrane voltage, $w : Q \to \mathbb{R}^n$ represents the ionic current variables, $\bar\sigma_i : \Omega \to \mathbb{R}^{d\times d}$ is the intracellular conductivity tensor, $I_e(x,t)$ is an extracellular current density stimulus, $I_{ion}(v,w)$ is the current density flowing through the ionic channels and the function $g(v,w)$ determines the evolution of the gating variables. Eq. (1) is a parabolic equation and Eq. (2) is a set of ordinary differential equations which can be solved independently for each node. In the absence of a conductive bath both intracellular and extracellular domains are electrically isolated along the tissue boundaries, and homogeneous Neumann boundary conditions are appropriate to reflect this fact. The initial values of the transmembrane voltage and the ionic current variables are prescribed by given constant values. Here the initial and boundary conditions are chosen as
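\[ \bar\sigma_i \nabla v \cdot \eta = 0 \quad \text{on } \partial\Omega \times [0,T], \tag{3} \]
\[ v(x,0) = v_0 \quad \text{and} \quad w(x,0) = w_0 \quad \text{on } \Omega, \tag{4} \]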
where $v_0 : \Omega \to \mathbb{R}$ denotes the initial transmembrane potential and $w_0 : \Omega \to \mathbb{R}^n$ the initial ionic current variables at time $t = 0$.

Ionic model

In our numerical experiments, we considered a phenomenological model, namely a variant of the Fitzhugh–Nagumo (FHN) model [20], which is constructed to reproduce the macroscopically observed behavior of the cells. The ionic activity is modeled by ordinary differential equations. In this case, the $I_{ion}(v,w)$ term is a cubic polynomial in the transmembrane potential $v$ and linear in the gating variable $w$:
\[ I_{ion}(v,w) = G v \left(1 - \frac{v}{v_{th}}\right)\left(1 - \frac{v}{v_p}\right) + \eta_1 v w, \tag{5} \]
\[ g(v,w) = \eta_2 \left( \frac{v}{v_p} - \eta_3 w \right), \tag{6} \]
where $G$, $\eta_1$, $\eta_2$, $\eta_3$ are positive real coefficients, $v_{th}$ is a threshold potential and $v_p$ the peak potential.

3. Optimal control problem

In this section we specify the optimal control problem that is under consideration. We consider
\[ (P) \qquad \min J(v, I_e) \quad \text{subject to} \quad e(v, w, I_e) = 0 \ \text{in } Q, \tag{7} \]
where $v$ and $w$ are the state variables, and $I_e$ is the control variable of the optimal control problem. Here $Q = \Omega \times (0,T)$ denotes the space-time cylinder and the coupled system of PDE and ODE constraints is expressed as $e(v,w,I_e) = 0$, where
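\[ e(v,w,I_e) = \begin{pmatrix} \nabla\cdot(\bar\sigma_i \nabla v) - \dfrac{\partial v}{\partial t} - I_{ion}(v,w) + I_e(x,t) \\[2mm] \dfrac{\partial w}{\partial t} - g(v,w) \end{pmatrix}. \tag{8} \]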
The initial conditions in (4) are explicitly enforced and the homogeneous Neumann boundary condition in (3) will be realized in the variational setting of the FE discretization. We refrain here from entering a function space setting. This requires independent investigations which will be reported elsewhere. The control variable $I_e$ is chosen such that it is nontrivial only on the control domain $\Omega_{con}$ of $\Omega$, i.e. $I_e : \Omega_{con} \times (0,T) \to \mathbb{R}$, and $I_e$ is extended by zero on $(\Omega \setminus \Omega_{con}) \times (0,T)$. It will not be necessary to introduce extra notation for this purpose. It can be argued that for each $I_e \in L^2((0,T) \times \Omega_{con}; \mathbb{R})$ there exists a unique
\[ (v,w) \in \bigl( L^2(0,T;H^1(\Omega)) \cap C(0,T;L^2(\Omega)) \cap L^4(Q) \bigr) \times \bigl( L^2(0,T;H^1(\Omega)) \cap C(0,T;L^2(\Omega)) \bigr) \]
with $v_t, w_t \in L^2(0,T;H^1(\Omega)^*) + L^{4/3}(Q)$ such that $e(v(I_e), w(I_e), I_e) = 0$, see [4]. With $v(I_e)$ thus defined we introduce the reduced cost functional
\[ \hat J(I_e) := J\bigl(v(I_e), I_e\bigr). \tag{9} \]
Next we turn to the choice of the cost functional. For this purpose we introduce the observation domain $\Omega_{obs} \subset \Omega$. The control objective consists in dampening out the excitation wave of the transmembrane voltage in $\Omega_{obs}$. We therefore set
\[ J(v, I_e) = \frac{1}{2} \int_0^T \left( \int_{\Omega_{obs}} |v|^2 \, d\Omega_{obs} + \alpha \int_{\Omega_{con}} |I_e|^2 \, d\Omega_{con} \right) dt, \tag{10} \]
where α is the weight of the cost of the control, which is used to determine the influence of the corresponding components, Ωobs is the observation domain. The inclusion of the tracking type term in the cost functional serves code-validation purposes, see Nagaiah et al. [14] for more numerical results and further discussion. To derive the first derivative of the reduced cost we use the Lagrange functional
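\[ \mathcal{L}(v,w,I_e,p,q) = J(v,I_e) + \int_0^T\!\!\int_\Omega \left( \nabla\cdot\bar\sigma_i\nabla v - \frac{\partial v}{\partial t} - I_{ion}(v,w) + I_e \right) p \, d\Omega \, dt + \int_0^T\!\!\int_\Omega \left( \frac{\partial w}{\partial t} - g(v,w) \right) q \, d\Omega \, dt. \]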
The first order optimality system is obtained by formally setting the partial derivatives of L equal to 0. We find
\[ \mathcal{L}_v: \quad v|_{\Omega_{obs}} + \nabla\cdot\bar\sigma_i\nabla p + p_t - (I_{ion})_v\, p - g_v\, q = 0, \tag{11} \]
\[ \mathcal{L}_w: \quad -(I_{ion})_w\, p - q_t - g_w(v,w)\, q = 0, \tag{12} \]
where the subscripts $v$ and $w$ denote the partial derivatives with respect to these variables. Further we obtain the
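\[ \text{terminal conditions:} \quad p(T) = 0, \quad q(T) = 0, \tag{13} \]
\[ \text{boundary conditions:} \quad \bar\sigma_i \nabla p \cdot \eta = 0 \quad \text{on } \partial\Omega \times [0,T], \tag{14} \]
and the optimality condition:
\[ \mathcal{L}_{I_e}: \quad \alpha I_e + p = 0 \quad \text{on } \Omega_{con}. \tag{15} \]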
The first order necessary optimality conditions consist of the coupled system of primal equations (1)–(2), adjoint equations (11)–(12), together with initial conditions (4), terminal conditions (13) and the optimality condition (15).

3.1. Newton’s method

In this subsection, we present the inexact Newton method for solving the reduced optimization problem. In our case it is infeasible to set up the Hessian matrix. Rather we explain the necessary steps for the computation of “the action of the Hessian of the reduced cost” on a given vector. Once this is achieved, approximate Hessian directions of appropriately discretized problems can be computed. This procedure was advocated e.g. in [9], Hinze et al. [10] for solving optimization problems governed by evolutionary partial differential equations. We use the Lagrangian functional based approach to derive the second derivative of the reduced cost functional. Proceeding in formal terms it can be expressed as follows [10]:
\[ \hat J'' = \mathcal{L}_{I_e I_e} + \delta u^* \mathcal{L}_{u I_e} + \mathcal{L}_{I_e u}\, \delta u + \delta u^* \mathcal{L}_{uu}\, \delta u, \]
where $\delta u = -e_u^{-1} e_{I_e}$. In this case $e_u^{-1}$ is only formal since it requires inverting a differential operator. Here, for brevity, we denote $u = (v,w)$ and $y = (u, I_e)$. Here and below $\mathcal{L}$ and its derivatives are evaluated at $(v,w,I_e)$. We now introduce the matrix operator $T(I_e)$ and the second derivative of $\mathcal{L}$ as follows,
\[ T(I_e) = \begin{pmatrix} -e_u^{-1} e_{I_e} \\ \mathrm{Id}_{I_e} \end{pmatrix} \quad \text{and} \quad \mathcal{L}_{yy} = \begin{pmatrix} \mathcal{L}_{uu} & \mathcal{L}_{u I_e} \\ \mathcal{L}_{I_e u} & \mathcal{L}_{I_e I_e} \end{pmatrix}, \]
where $\mathrm{Id}_{I_e}$ is the identity operator in the control space. From these quantities the representation of the Hessian can be constructed by using the following formula
\[ \hat J''(I_e) = T^*(I_e)\, \mathcal{L}_{yy}\, T(I_e), \tag{16} \]
where T ∗ ( I e ) is the adjoint operator to T ( I e ). Now we carry out these steps for the calculation of the second derivative of the reduced cost functional associated to the monodomain problem. In this process we have to evaluate the first derivatives, i.e. the sensitivities. We calculate these derivatives w.r.t. state and control variables as follows:
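\[ e_u(\delta v, \delta w, \delta I_e) = \begin{pmatrix} \nabla\cdot(\bar\sigma_i \nabla \delta v) - \delta v_t - [I_{ion}]_v \delta v - [I_{ion}]_w \delta w \\[1mm] \delta w_t - \dfrac{\eta_2}{v_p} \delta v + \eta_2 \eta_3 \delta w \end{pmatrix} \tag{17} \]
and
\[ e_{I_e}(\delta I_e) = \begin{pmatrix} \delta I_e \\ 0 \end{pmatrix}, \tag{18} \]
where
\[ [I_{ion}]_v = \frac{G}{v_p v_{th}} \bigl( (v_{th} - v)(v_p - v) - v (v_p - v) - v (v_{th} - v) \bigr) + \eta_1 w, \qquad [I_{ion}]_w = \eta_1 v. \]
The computation of the second derivative operator $\mathcal{L}_{yy}$ applied to the vector $(\delta v, \delta w, \delta I_e)$ can be expressed as
\[ \mathcal{L}_{yy}(\delta v, \delta w, \delta I_e) = \begin{pmatrix} \delta v|_{\Omega_{obs}} - [I_{ion}]_{vv}\, p - \eta_1 \delta w\, p \\ -\eta_1 \delta v\, p \\ \alpha\, \delta I_e \end{pmatrix}, \tag{19} \]
where $[I_{ion}]_{vv} = \dfrac{2G}{v_p v_{th}}\, \delta v\, \bigl( v - (v_p - v) - (v_{th} - v) \bigr)$.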
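Since the Hessian action relies on the derivatives $[I_{ion}]_v$ and $[I_{ion}]_{vv}$, a finite-difference comparison is a cheap way to validate an implementation of these formulas. The sketch below is an illustration only: the function names and all parameter values are hypothetical and are not taken from the paper.

```python
def i_ion(v, w, G, v_th, v_p, eta1):
    """Ionic current (5): cubic in v, linear in w."""
    return G * v * (1.0 - v / v_th) * (1.0 - v / v_p) + eta1 * v * w


def i_ion_v(v, w, G, v_th, v_p, eta1):
    """Partial derivative of I_ion with respect to v."""
    bracket = (v_th - v) * (v_p - v) - v * (v_p - v) - v * (v_th - v)
    return G / (v_p * v_th) * bracket + eta1 * w


def i_ion_vv(v, G, v_th, v_p):
    """Second partial derivative of I_ion with respect to v (without the factor delta v)."""
    return 2.0 * G / (v_p * v_th) * (v - (v_p - v) - (v_th - v))


def central_difference(f, x, step=1e-5):
    """Second-order central difference approximation of f'(x)."""
    return (f(x + step) - f(x - step)) / (2.0 * step)


# Hypothetical parameter values, chosen only to exercise the formulas.
params = dict(G=1.5, v_th=13.0, v_p=100.0, eta1=4.4)
v0, w0 = 20.0, 0.3

fd_first = central_difference(lambda v: i_ion(v, w0, **params), v0)
fd_second = central_difference(lambda v: i_ion_v(v, w0, **params), v0)
print(fd_first, i_ion_v(v0, w0, **params))
print(fd_second, i_ion_vv(v0, params["G"], params["v_th"], params["v_p"]))
```

Each printed pair should agree up to the truncation and rounding error of the difference quotient.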
We note that in order to obtain the action of the Hessian on a given vector, one linearized primal problem and one linearized adjoint equation have to be solved. The basic steps to compute the action of the Hessian are summarized next.

1. Compute the first derivative $\hat J'(u_n) = \alpha I_e + p$, which requires one primal and one dual solve.
2. In each step iteratively evaluate the action of $\hat J''(u_n)$ on $\delta_n$, which is done using the following sequence of computations.
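(a) solve the linearized primal equation for $\delta v$, $\delta w$ using $\delta I_e^k$
\[ \begin{pmatrix} \nabla\cdot(\bar\sigma_i \nabla \delta v) - (\delta v_t + [I_{ion}]_v \delta v + [I_{ion}]_w \delta w) \\[1mm] \delta w_t - \dfrac{\eta_2}{v_p} \delta v + \eta_2 \eta_3 \delta w \end{pmatrix} = \begin{pmatrix} -\delta I_e^k \\ 0 \end{pmatrix}; \]
(b) evaluate $(z_1, z_2) := \mathcal{L}_{yy}(\delta v, \delta w, \delta I_e)$ from Eq. (19);
(c) solve the adjoint equation with $(z_1, z_2)$ as r.h.s., i.e.
\[ \begin{pmatrix} \nabla\cdot\bar\sigma_i \nabla w_1 + (w_1)_t - [I_{ion}]_v w_1 - \dfrac{\eta_2}{v_p} w_2 \\[1mm] -[I_{ion}]_w w_1 - (w_2)_t + \eta_2 \eta_3 w_2 \end{pmatrix} = \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}; \]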
(d) finally compute the action of $\hat J''(u_k)$ on $\delta I_e^k$, i.e. $-w_1 + \alpha\, \delta I_e$.

In this way one can compute the action of the Hessian of the reduced cost. Consider the system of step (7) in Algorithm 1, given in Appendix A. Its dimension is that of the control space. To evaluate this step one must use an iterative method, e.g. a linear conjugate gradient method.

4. Numerical discretization

For the approximation of the optimality system (1)–(2) and (11)–(12) we use a finite element method for the spatial discretization. This results in an initial value problem for a system of ordinary differential equations. The time discretization is based on an explicit Euler method for the ODEs and a linearly implicit Runge–Kutta method for the parabolic equations.

4.1. Space discretization using FEM

In this subsection we give a brief description of the piecewise linear finite element discretization to solve the monodomain equations. We commence with the primal problem in variational form: find $v : [0,T] \to H^1(\Omega)$ and $w : [0,T] \to L^2(\Omega)$ such that for a.e. $t \in (0,T)$
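\[ \bigl\langle \nabla\cdot\bar\sigma_i \nabla v, \varphi \bigr\rangle = \Bigl\langle \frac{\partial v}{\partial t} + I_{ion}(v,w) - I_e, \varphi \Bigr\rangle, \qquad \Bigl\langle \frac{\partial w}{\partial t}, \varphi \Bigr\rangle = \bigl\langle g(v,w), \varphi \bigr\rangle \qquad \text{for all } \varphi \in H^1(\Omega). \tag{20} \]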
Let $V_h \subset H^1(\Omega)$ be the finite-dimensional subspace of piecewise linear basis functions with respect to the spatial grid. The approximate solution $\mathbf{v}$ is expressed in the form $\mathbf{v}(t) = \sum_{i=1}^{N} v_i \omega_i$, where $\{\omega_i\}_{i=1}^{N}$ denote the basis functions. This results in the following system of nonlinear ordinary differential equations:
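\[ \mathbf{M} \frac{\partial \mathbf{v}}{\partial t} = -\mathbf{A}_i \mathbf{v} - \mathbf{I}_{ion}(\mathbf{v}, \mathbf{w}) + \mathbf{I}_e, \tag{21} \]
\[ \frac{\partial \mathbf{w}}{\partial t} = \mathbf{g}(\mathbf{v}, \mathbf{w}), \tag{22} \]
\[ \mathbf{v}(0) = 0, \quad \mathbf{w}(0) = 0, \tag{23} \]
where $\mathbf{A}_i$ is the stiffness matrix and $\mathbf{M}$ is the mass matrix, and $\mathbf{I}_{ion}$ and $\mathbf{I}_e$ are vectors defined by $\mathbf{I}_{ion} = \{\langle I_{ion}, \omega_j \rangle\}_{j=1}^{N}$ and $\mathbf{I}_e = \{\langle I_e, \omega_j \rangle\}_{j=1}^{N}$, respectively.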
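To make the roles of the mass and stiffness matrices in (21) concrete, the following sketch assembles them for piecewise linear elements on a one-dimensional grid. This is a deliberate simplification: the paper works on a two-dimensional domain with the DUNE package, and the function `assemble_p1_1d`, the scalar conductivity `sigma` and the grid are assumptions made for illustration.

```python
def assemble_p1_1d(nodes, sigma=1.0):
    """Assemble dense P1 mass and stiffness matrices on a 1D grid.

    One-dimensional stand-in for the matrices M and A_i of (21).
    """
    n = len(nodes)
    mass = [[0.0] * n for _ in range(n)]
    stiff = [[0.0] * n for _ in range(n)]
    for e in range(n - 1):
        h = nodes[e + 1] - nodes[e]
        # Element matrices for the two hat functions on an interval of length h.
        local_mass = [[h / 3.0, h / 6.0], [h / 6.0, h / 3.0]]
        local_stiff = [[sigma / h, -sigma / h], [-sigma / h, sigma / h]]
        for a in range(2):
            for b in range(2):
                mass[e + a][e + b] += local_mass[a][b]
                stiff[e + a][e + b] += local_stiff[a][b]
    return mass, stiff


# A small non-uniform grid on [0, 1], as an adaptive mesh would produce.
nodes = [0.0, 0.1, 0.3, 0.6, 1.0]
M, A = assemble_p1_1d(nodes)
print(sum(sum(row) for row in M))   # total mass equals the domain length
print([sum(row) for row in A])      # constants lie in the kernel of A
```

No boundary rows are modified, which corresponds to the homogeneous Neumann condition (3) being natural in the variational setting.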
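Space discretization of dual problem

We use an analogous derivation as for the primal problem and obtain the following semi-discrete form of the dual equations:
\[ \mathbf{M} \frac{\partial \mathbf{p}}{\partial t} = -\mathbf{M}_{obs} \mathbf{v} + \mathbf{A}_i \mathbf{p} + \mathbf{M} (\mathbf{I}_{ion})_v \mathbf{p} + \mathbf{M} \mathbf{q}\, \mathbf{g}_v, \tag{24} \]
\[ \frac{\partial \mathbf{q}}{\partial t} = -\mathbf{g}_w^T(v,w)\, \mathbf{q} - (I_{ion})_w \mathbf{p}, \tag{25} \]
\[ \mathbf{p}(T) = 0, \quad \mathbf{q}(T) = 0, \tag{26} \]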
where $\mathbf{M}_{obs}$ is the mass matrix on the observation domain. To solve the semi-discretized primal and dual problems (21)–(23) and (24)–(26), we first approximate the ODE system solution at the current time step. This gives the ionic current variable update $\mathbf{w}$ while solving the primal problem,
which is subsequently used to update Eq. (21). Analogously, the adjoint variable update q is used in Eq. (24). In our numerical computations the primal problem is solved by decoupling the system as follows:
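\[ \text{step-1:} \quad \mathbf{w}^{n+1} = \mathbf{w}^n + \Delta t\, \mathbf{g}(\mathbf{v}^n, \mathbf{w}^n), \tag{27} \]
\[ \text{step-2:} \quad \mathbf{M} \frac{\partial \mathbf{v}^n}{\partial t} = -\mathbf{A}_i \mathbf{v}^n - \mathbf{I}_{ion}(\mathbf{v}^n, \mathbf{w}^{n+1}) + \mathbf{I}_e. \tag{28} \]
Analogously, the dual problem is decoupled as follows:
\[ \text{step-1:} \quad \mathbf{q}^n = (1 - \Delta t\, \eta_2 \eta_3)\, \mathbf{q}^{n+1} + \Delta t\, \eta_1 \mathbf{v}^{n+1} \mathbf{p}^{n+1}, \tag{29} \]
\[ \text{step-2:} \quad \mathbf{M} \frac{\partial \mathbf{p}^n}{\partial t} = -\mathbf{M}_{obs} \mathbf{v}^{n+1} + \mathbf{A}_i \mathbf{p}^n + \mathbf{M} (\mathbf{I}_{ion})_v \mathbf{p}^n + \mathbf{g}_v \mathbf{M} \mathbf{q}^{n+1}. \tag{30} \]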
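The primal decoupling (27)–(28) can be illustrated for a single cell, i.e. without the diffusion term. The sketch below is an assumption-laden illustration: it advances w by explicit Euler as in (27) and then advances v with one linearized implicit Euler step as a simple stand-in for the Rosenbrock scheme of Section 4.2. The function `fhn_step` and all parameter values are hypothetical.

```python
def fhn_step(v, w, i_e, dt, G, v_th, v_p, eta1, eta2, eta3):
    """Advance one cell by dt: explicit Euler for w, then a linearized implicit step for v."""
    # Step 1, cf. (27): gating variable update using the old potential.
    w_new = w + dt * eta2 * (v / v_p - eta3 * w)
    # Step 2, cf. (28) without diffusion: the right-hand side uses the updated w.
    i_ion = G * v * (1.0 - v / v_th) * (1.0 - v / v_p) + eta1 * v * w_new
    rhs = -i_ion + i_e
    # Derivative of the right-hand side with respect to v, used for the linearization.
    d_i_ion_dv = (G / (v_p * v_th)) * (
        (v_th - v) * (v_p - v) - v * (v_p - v) - v * (v_th - v)
    ) + eta1 * w_new
    jac = -d_i_ion_dv
    v_new = v + dt * rhs / (1.0 - dt * jac)
    return v_new, w_new


# Hypothetical parameter values, for illustration only.
params = dict(G=1.5, v_th=13.0, v_p=100.0, eta1=4.4, eta2=0.012, eta3=1.0)

v, w = 0.0, 0.0
for n in range(100):
    stimulus = 10.0 if n < 20 else 0.0  # short current pulse, then no input
    v, w = fhn_step(v, w, stimulus, 0.01, **params)
print(v, w)
```

In the full scheme the scalar division in step 2 becomes a linear solve involving the mass and stiffness matrices.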
4.2. Time discretization using linearly implicit RK methods

In this subsection we give a brief description of the time discretization for solving the systems of ordinary differential equations, which we now express in the following form:
\[ \mathbf{M} \frac{\partial \mathbf{u}}{\partial t} = \mathbf{F}(\mathbf{u}), \qquad \mathbf{u}(t_0) = \mathbf{u}_0. \tag{31} \]
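As an illustration of a linearly implicit scheme for a system of the form (31), the sketch below implements one commonly used two-stage, second-order Rosenbrock method for a scalar problem with M = 1. The coefficients, including γ = 1 + 1/√2, are an assumption of this sketch; the ROS2 variant referenced in this paper may use different ones. The logistic test equation is likewise chosen only for illustration.

```python
import math

GAMMA = 1.0 + 1.0 / math.sqrt(2.0)


def ros2_step(f, dfdu, u, tau):
    """One step of a two-stage Rosenbrock scheme for the scalar ODE u' = f(u)."""
    denom = 1.0 - GAMMA * tau * dfdu(u)  # scalar analogue of (M - gamma*tau*J)
    k1 = f(u) / denom
    k2 = (f(u + tau * k1) - 2.0 * k1) / denom
    return u + 1.5 * tau * k1 + 0.5 * tau * k2


def integrate(f, dfdu, u0, t_end, steps):
    """Apply ros2_step with a constant step size t_end / steps."""
    u, tau = u0, t_end / steps
    for _ in range(steps):
        u = ros2_step(f, dfdu, u, tau)
    return u


def f(u):
    return u * (1.0 - u)


def df(u):
    return 1.0 - 2.0 * u


# Logistic equation with u(0) = 0.1; exact solution at t = 1.
exact = 1.0 / (1.0 + 9.0 * math.exp(-1.0))
err_coarse = abs(integrate(f, df, 0.1, 1.0, 40) - exact)
err_fine = abs(integrate(f, df, 0.1, 1.0, 80) - exact)
print(err_coarse / err_fine)
```

Each stage requires only a linear solve with the same matrix, which is the property that makes such schemes attractive here; halving the step should reduce the error by roughly a factor of four.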
To solve (31), we introduce discrete steps in the time interval [0, T ]:
\[ 0 = t_0, t_1, \ldots, t_n = T, \]
which are not necessarily equidistant. We further set $\tau_i = t_{i+1} - t_i$ and denote by $\mathbf{u}_i$ the numerical solution at time $t_i$. For the time discretization, linearly implicit Runge–Kutta methods, specifically Rosenbrock methods, are used. These belong to a large class of methods which try to avoid the nonlinear system and replace it by a sequence of linear ones. For the construction of the Jacobian matrix we used exact derivatives of the vector $\mathbf{F}(\mathbf{u})$. For our computations the ROS2 method was employed, which has two internal stages to solve in each step, see [11]. After the time discretization one ends up with a system of linear equations. For solving this linear system the BiCGSTAB method with ILU preconditioning is used.

4.3. Spatial grid adaptivity

The adaptive mesh refinement (AMR) algorithm uses a hierarchy of properly nested levels. It automatically refines or coarsens the mesh to achieve a solution having a specified accuracy in an optimal fashion. Here we used the AMR technique based on the $Z^2$ error indicator of Zienkiewicz and Zhu [26]. See also [23] for a more detailed description of error estimators. Here we will recall the $Z^2$ error indicator. We denote by $W_h$ the space of all piecewise linear vector-fields and set $X_h := W_h \cap C(\Omega, \mathbb{R}^2)$. Denote by $v$ and $v_h$ the unique solutions of problems (1) and (20), respectively. In this case $\|G v_h - \nabla v_h\|_{L^2(T)}$ can be used as an error estimator, where $G v_h$ is an easily computable approximation of $\nabla v_h$. Let $G v_h \in X_h$ be the $\langle\cdot,\cdot\rangle_h$-projection of $\nabla v_h$ onto $X_h$. It can be computed by a local averaging of $\nabla v_h|_T(x_i)$ as follows
G v h (xi ) =
|T | ∇ v h |T (xi ). |D x|
(32)
T ⊂Dx
Here, $D_x$ is the union of the triangles having $x$ as a vertex. We finally set the error indicator locally and globally, respectively, as follows:

$$\eta_{Z,T} := \|G v_h - \nabla v_h\|_{L^2(T)}, \tag{33}$$

and

$$\eta_Z := \Big( \sum_{T \in \mathcal{T}_h} \eta_{Z,T}^2 \Big)^{1/2}. \tag{34}$$
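The local averaging (32) and the indicators (33)–(34) can be illustrated with a minimal sketch in plain Python. This is not the authors' DUNE implementation: the function name `z2_indicators`, the list-based mesh format, and the two-triangle test mesh are assumptions made for illustration, and every node is assumed to belong to at least one triangle. The element integral uses the exact formula for the square of a linear function over a triangle.

```python
import math

def z2_indicators(nodes, triangles, values):
    """Return (per-triangle eta_{Z,T}, global eta_Z) for a P1 function.

    nodes:     list of (x, y) vertex coordinates
    triangles: list of (i, j, k) vertex indices
    values:    nodal values of the piecewise linear function v_h
    """
    areas, grads = [], []
    for i, j, k in triangles:
        (x1, y1), (x2, y2), (x3, y3) = nodes[i], nodes[j], nodes[k]
        det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)  # twice the signed area
        dv2, dv3 = values[j] - values[i], values[k] - values[i]
        gx = (dv2 * (y3 - y1) - dv3 * (y2 - y1)) / det  # constant gradient on T
        gy = (dv3 * (x2 - x1) - dv2 * (x3 - x1)) / det
        areas.append(abs(det) / 2.0)
        grads.append((gx, gy))

    # Recovered gradient G v_h at each vertex: area-weighted patch average, Eq. (32).
    acc = [[0.0, 0.0, 0.0] for _ in nodes]  # sum |T| gx, sum |T| gy, |D_x|
    for t, tri in enumerate(triangles):
        for n in tri:
            acc[n][0] += areas[t] * grads[t][0]
            acc[n][1] += areas[t] * grads[t][1]
            acc[n][2] += areas[t]
    recovered = [(a / w, b / w) for a, b, w in acc]

    etas = []
    for t, tri in enumerate(triangles):
        sq = 0.0
        for c in (0, 1):  # x- and y-components of G v_h - grad v_h
            e = [recovered[n][c] - grads[t][c] for n in tri]
            # exact integral over T of the square of a linear function
            sq += areas[t] / 12.0 * (sum(v * v for v in e) + sum(e) ** 2)
        etas.append(math.sqrt(sq))  # Eq. (33)
    return etas, math.sqrt(sum(eta * eta for eta in etas))  # Eq. (34)

# Hypothetical test mesh: the unit square split into two triangles.
nodes = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangles = [(0, 1, 2), (0, 2, 3)]
etas_lin, eta_lin = z2_indicators(nodes, triangles, [2 * x + 3 * y for x, y in nodes])
etas_kink, eta_kink = z2_indicators(nodes, triangles, [0.0, 1.0, 0.0, 0.0])
print(eta_lin, eta_kink)
```

For a globally linear function the recovered gradient coincides with the elementwise gradient, so the indicator vanishes; for the kinked function the two element gradients differ and the indicator is positive.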
The $Z^2$ indicator $\eta_{Z,T}$ is an estimate for $\|\nabla v_h(\cdot, t_i) - \nabla v(\cdot, t_i)\|_{L^2(T)}$; see Verfürth [23] for complete details. Let $\lambda(T) \in \mathbb{N}_0$ be the refinement level of triangle $T \in \mathcal{T}$, let $\lambda_{\max} \in \mathbb{N}_0$ be a given maximum refinement level, and let $\phi_1, \ldots, \phi_{\lambda_{\max}}$ be given real numbers satisfying $0 \le \phi_1 \le \cdots \le \phi_{\lambda_{\max}}$. We set $\phi_0 = 0$ and $\phi_{\lambda_{\max}} = \infty$. With the choice of $\phi_1, \ldots, \phi_{\lambda_{\max}}$ one controls the structure of the grid. If we set $\phi_1 = \cdots = \phi_{\lambda_{\max}} = 0$, this leads to a uniform triangulation of level $\lambda_{\max}$. We set $\phi_1 = \cdots = \phi_{\lambda_{\max}} = 10^{-2}$ in our numerical computations. Here we used the scaled indicator

$$\phi_T := \eta_{Z,T} / |T|. \tag{35}$$

Suppose that an initial coarse triangular grid is constructed using a grid generator. A triangle $T$ is marked for
C. Nagaiah, K. Kunisch / Applied Numerical Mathematics 61 (2011) 53–65
Fig. 1. Control and excitation region at the cardiac domain.
1. refinement if $\phi_T > \phi_{\lambda(T)}$ and $\lambda(T) < i$ for $i = 0, \ldots, \lambda_{\max}$,
2. coarsening if $\phi_T < \phi_{\lambda(T)}/100$ and $\lambda(T) > i$ for $i = 0, \ldots, \lambda_{\max}$,

where $\phi_T$ is calculated according to Eq. (35).

So far we have explained adaptivity for the direct problem. Let us turn to the optimal control problem next. One prominent approach in the literature is the dual weighted residual methodology presented in [3] as a technique to derive a posteriori error estimates for both time-dependent and stationary problems governed by partial differential equations. Another approach uses grid adaptivity based on a posteriori error estimates of the cost functional [2,6].

In our computational work we proceeded as follows. We used AMR in space only and assume that, during the optimization iterations, the control variable always lives on the coarse grid on which we start the simulations. This can be related to the fact that in practice it is difficult to manipulate the control mechanism spatially. The initial coarse grid, denoted by $L_0$, is constructed by a grid generator. Finer levels $L_i$ for $i > 0$ are constructed recursively from the coarser levels $L_{i-1}$. The tolerance for spatial grid refinement is set to $\mathrm{tol}_x = 10^{-2}$. The $Z^2$ error estimator is called every 3 time intervals during the primal solve. It adjusts the spatial grid by refining and coarsening, depending on the estimated spatial solution error of the elements. Accordingly, the control variable is interpolated onto the newly constructed grid. We used the same strategy to solve the adjoint problem, with the spatial grid being adjusted according to the adjoint solution. In the Newton optimization algorithm, to evaluate the matrix–vector product in each iteration of the inner loop (see Algorithm 1), the coarse grid dimension is used to update the control solution. Thus our Newton direction is an inexact one: it is determined by solving the Newton system on the coarse grid iteratively up to a certain tolerance $\mathrm{tol}_{\mathrm{New}}$.
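The marking rule stated at the start of this passage can be sketched as a small decision function. This is one possible reading of the rule, not the authors' code: the name `mark_triangle`, the string return values, and the interpretation of the level conditions as "below the maximum level" for refinement and "above level 0" for coarsening are assumptions. The threshold list follows the values quoted in the text ($\phi_0 = 0$, remaining thresholds $10^{-2}$), with a hypothetical maximum level of 2.

```python
def mark_triangle(phi_T, level, thresholds, coarsen_factor=100.0):
    """Decide what to do with one triangle.

    phi_T:      scaled indicator of the triangle, Eq. (35)
    level:      current refinement level lambda(T)
    thresholds: [phi_0, phi_1, ..., phi_lambda_max]
    """
    lam_max = len(thresholds) - 1
    # Refine when the indicator exceeds the threshold of the current level
    # and the triangle is not yet at the maximum level (assumed reading).
    if phi_T > thresholds[level] and level < lam_max:
        return "refine"
    # Coarsen when the indicator is 100 times smaller than the threshold
    # and the triangle is above the coarse level 0 (assumed reading).
    if phi_T < thresholds[level] / coarsen_factor and level > 0:
        return "coarsen"
    return "keep"

# phi_0 = 0 and phi_1 = phi_2 = 1e-2, as in the text; lambda_max = 2 is hypothetical.
thresholds = [0.0, 1e-2, 1e-2]
decisions = [mark_triangle(phi, lev, thresholds)
             for phi, lev in [(0.5, 1), (0.5, 2), (1e-5, 1), (5e-3, 1), (1e-3, 0)]]
print(decisions)
```

The factor 100 between the refinement and coarsening thresholds leaves a band in which a triangle is kept unchanged, which helps to avoid refining and coarsening the same element in alternation.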
We set $\mathrm{tol}_{\mathrm{New}} = 10^{-3}\, \|\hat J'(I_{e,k})\|$. Since $\|\hat J'(I_{e,k})\| \to 0$ in case of convergence to a critical point, we can expect to obtain superlinear convergence of the overall numerical optimization strategy. Of course, this can be achieved only if function and gradient evaluations are sufficiently accurate. For our numerical experiments, we developed an optimization code based on the public domain software package DUNE [1].

5. Numerical results

In this section we present numerical results based on the adaptive and second order methods. We shall see that the proposed methodology is capable of dampening an excitation wave of the transmembrane potential by properly applying an extracellular potential, even if the control domain is relatively small. The numerical experiments were carried out using different static grid sizes as well as the AMR technique. Here we consider 4 msec of simulation time. The computational domain $\Omega = [0, 1] \times [0, 1]$ and various relevant subdomains are depicted in Fig. 1. The excitation domain and control domains are $\Omega_{exi}$ and $\Omega_{con} = \Omega_{con1} \cup \Omega_{con2}$. Further, $\Omega_{f1}$, $\Omega_{f2}$ are neighborhoods of $\Omega_{con1}$, $\Omega_{con2}$, and the observation domain is $\Omega_{obs} = \Omega \setminus (\Omega_{f1} \cup \Omega_{f2})$. In the computations equidistant time steps are used. For the spatial adaptive grid refinement, a uniform 60 × 60 triangular grid is taken as the coarse grid, and the subsequent multilevel grids are constructed based on the primal or adjoint solution. In the simulations the weight of the cost of the control is fixed to be $\alpha = 1 \times 10^{-3}$, and the iterations were terminated when $\|\nabla J_k\|_\infty \le 10^{-3}(1 + |J_k|)$ is satisfied or when the difference between the cost functional values of two successive optimization iterations is less than $10^{-3}$. If this condition is not met within a prescribed number of iterations, say 100 optimization iterations, then we terminate the simulations.
The line search algorithm starts with initial step length $\alpha = 1$ for the inexact Newton method, which is reduced by a factor of 2 for each subsequent rejected step size; see Nocedal and Wright [16]. For the computational setup, we considered the domains $\Omega_{exi} = [0.498, 0.502] \times [0.498, 0.504]$, $\Omega_{con1} = [0.3, 0.4] \times [0.45, 0.55]$, $\Omega_{con2} = [0.6, 0.7] \times [0.45, 0.55]$, $\Omega_{f1} = [0.28, 0.42] \times [0.43, 0.57]$ and $\Omega_{f2} = [0.58, 0.72] \times [0.43, 0.57]$.

Parameters used in the simulation

The following parameter values [7] are used for our numerical experiments: $\sigma_{il} = 3 \times 10^{-3}\ \Omega^{-1}\,\mathrm{cm}^{-1}$, $\sigma_{it} = 3.1525 \times 10^{-4}\ \Omega^{-1}\,\mathrm{cm}^{-1}$, $G = 1.5$ S/cm², $v_{th} = 13$ mV, $v_p = 100$ mV, $\eta_1 = 4.4$ S/cm², $\eta_2 = 0.012$, $\eta_3 = 1$.
Fig. 2. The norm of the gradient and minimum value of the cost functional are shown on left and right respectively for T = 4 msec of simulation time.
Fig. 3. The minimum values of $\int\!\!\int |V_m|^2 \, dx \, dt$ and $\int\!\!\int |I_e|^2 \, dx \, dt$ are shown on the left and right, respectively, for T = 4 msec of simulation time.
Test case 1

First we discuss the results based on different static grid sizes and the AMR grid. We use a 60 × 60 triangular grid, consisting of 13,924 elements and 7081 nodal points, an 80 × 80 triangular grid, consisting of 24,964 elements and 12,641 nodal points, a 100 × 100 triangular grid, consisting of 40,000 elements and 20,201 nodal points, and an AMR grid whose coarse grid is a 60 × 60 triangular grid. The corresponding $L^2$ norms of the gradients of the cost functional and the minimum values of the cost functional themselves are depicted in Fig. 2. A logarithmic scale is used for the presentation of the $L^2$ norms of the gradients of the cost functional. As expected for a Newton method, the optimal solutions are obtained within a rather small number of iterations. For each of the respective grids, convergence was obtained within 8 or 9 iterations. The values of the two additive terms of the cost, $\frac{1}{2}\int\!\!\int |V_m|^2 \, dx \, dt$ and $\frac{\alpha}{2}\int\!\!\int |I_e|^2 \, dx \, dt$, are depicted in Fig. 3. Note that the two terms are of approximately the same order of magnitude. We observed that the control action has a strong impact at the beginning of the iterations, where the excitation of the wave front is dampened out, and decreases with time; see Fig. 6. The total CPU time for solving the optimization problem, as well as the CPU time for all primal solves, all dual solves, and for all setup and evaluation times of the Hessian steps, is given in the first 5 columns of Table 1. The last column contains the average number of CG steps required for computing the inexact Newton step. We note that computing the gradient of the cost, which requires one primal and one dual solve per iteration, is much cheaper than updating the control variable on the basis of second order information. For the static grids, updating the control variable on the basis of the second order system needs approximately 18 times more CPU time than computing the first derivative. This is the expensive step in our calculations.
However, as we experienced in our earlier work, minimization on the basis of only first order information requires at least several hundred iterations before comparable accuracy can be achieved. Let us also mention that additional Newton
Table 1
Computational time and average number of inner CG iterations in Newton's algorithm for all grids.

Grids        CPU time             Primal solves (s)   Dual solves (s)   Opt. solves (s)   Avg inner CG iters
60 × 60      58 min 51 sec        93.64               87.24             3264.63           15.7
80 × 80      2 h 2 min 54 sec     197.03              180.46            6813.89           17.7
100 × 100    3 h 13 min 14 sec    323.89              292.27            10 759.4          20
AMR grid     1 h 13 min 19 sec    598.52              551.26            3187.07           17
Table 2
Optimization iterations, norm of the gradient of the cost functional, and order of convergence for the different grid constructions, using the inexact Newton algorithm. For each grid, the first column is $\|\nabla J\|$ and the second column is the ratio $\|\nabla J_{i+1}\| / \|\nabla J_i\|$.

60 × 60               80 × 80               100 × 100             AMR grid
‖∇J‖       ratio      ‖∇J‖       ratio      ‖∇J‖       ratio      ‖∇J‖       ratio
59.9128    –          62.5567    –          59.3120    –          66.5305    –
27.9293    0.4661     25.8370    0.4130     18.9710    0.3199     30.5083    0.4586
19.8697    0.7114     18.8618    0.7300     5.0035     0.2637     5.8358     0.1913
3.40178    0.1712     3.20784    0.1700     2.64203    0.5280     3.07498    0.5269
3.47448    1.0213     2.62626    0.8187     2.31805    0.8774     0.67403    0.2192
1.60263    0.4612     1.23709    0.4710     1.95699    0.8442     0.05105    0.0758
0.31174    0.1945     0.22913    0.1852     0.18502    0.0945     0.00180    0.0353
0.02236    0.0717     0.01267    0.0553     0.00700    0.0378     0.00010    0.0556
0.00050    0.0226     0.00024    0.0191     –          –          –          –
Fig. 4. Uncontrolled solution (V m ) at 0.3 msec, 2 msec and 4 msec of simulation time respectively.
steps, beyond those which are documented, allow a further decrease of $\|\nabla J\|$, which we could not observe with the nonlinear CG algorithm [14]. The AMR technique takes 598 seconds to solve the primal problem, of which 96.87 seconds are needed for grid adaption. For the dual solve the grid adaption process takes 81.96 seconds. We can observe that AMR takes less overall CPU time and achieves results comparable to the 100 × 100 grid. The AMR technique thus shows a good improvement over the static grids for the current problem. Table 2 confirms superlinear convergence of the inexact Newton method for all grid constructions at the end of the iterations. The line search algorithm takes small step lengths during the initial iterations and full step lengths after 3 or 4 iterations, which leads to superlinear convergence.

The 2D contour snapshots of the uncontrolled solution and of the optimally controlled solution are shown in Figs. 4 and 5 at times 0.3 msec, 2 msec and 4 msec for the 100 × 100 mesh. We can see that the uncontrolled wave front of the transmembrane voltage spreads almost over the entire domain during the time interval from 0 msec to 4 msec; see Fig. 4. The corresponding controlled action potential is first slowly dampened at 0.3 msec and is almost completely damped at 4 msec. The control action has a tremendous effect on the action potential at the beginning, where it dampens the excitation wave. It has much less effect at 4 msec; see Fig. 6. The AMR grid of the uncontrolled solution can be seen in Fig. 7. The refinement/coarsening strategy follows the interface of the wave propagation. For the first optimization iteration the fine grid for the primal solve consists of 34,038 elements and 17,138 nodal points. At the last optimization iteration the fine grid consists of 21,700 elements and 10,969 nodes, where the maximum level of refinement is 5.
The AMR grids corresponding to the optimal state solution are shown in Fig. 8. We can note that the grid at time t = 4 msec is equivalent to the coarse grid, which we considered as the level-0 grid. This is consistent with the fact that there is no further wave propagation within the domain. Finally, the presented numerical results evidently show that the excitation wave propagation is dampened out by properly adjusting the extracellular potential using the optimization algorithm. Further, the AMR method shows a good computational improvement over the static grids.
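Two of the quantitative statements in this test case can be rechecked directly from the tabulated data. The following sketch, in plain Python with the numbers copied from Tables 1 and 2, recomputes the ratios of successive gradient norms and the cost of the second order step relative to one primal plus one dual solve. The variable names are illustrative, and treating "primal + dual" as the cost of a gradient evaluation is an assumption taken from the text.

```python
# Gradient norms per optimization iteration, copied from Table 2.
grad_norms = {
    "60x60": [59.9128, 27.9293, 19.8697, 3.40178, 3.47448, 1.60263, 0.31174, 0.02236, 0.00050],
    "80x80": [62.5567, 25.8370, 18.8618, 3.20784, 2.62626, 1.23709, 0.22913, 0.01267, 0.00024],
    "100x100": [59.3120, 18.9710, 5.0035, 2.64203, 2.31805, 1.95699, 0.18502, 0.00700],
    "AMR": [66.5305, 30.5083, 5.8358, 3.07498, 0.67403, 0.05105, 0.00180, 0.00010],
}

def successive_ratios(seq):
    """Ratios ||grad J_{i+1}|| / ||grad J_i|| of consecutive entries."""
    return [b / a for a, b in zip(seq, seq[1:])]

ratios = {grid: successive_ratios(seq) for grid, seq in grad_norms.items()}

# CPU times in seconds, copied from Table 1: (primal, dual, second order "Opt." solves).
times = {
    "60x60": (93.64, 87.24, 3264.63),
    "80x80": (197.03, 180.46, 6813.89),
    "100x100": (323.89, 292.27, 10759.4),
    "AMR": (598.52, 551.26, 3187.07),
}
# Cost of the second order step relative to one gradient evaluation (primal + dual).
cost_ratio = {grid: opt / (primal + dual) for grid, (primal, dual, opt) in times.items()}

for grid in grad_norms:
    print(grid, [round(r, 4) for r in ratios[grid]], round(cost_ratio[grid], 2))
```

The ratios of successive gradient norms fall well below 0.1 in the last iterations, which is the behavior read off as superlinear convergence. The cost ratio comes out near 18 for the three static grids; for the AMR row the same arithmetic gives a value below 3, which would be consistent with the Newton system being solved on the coarse grid while the primal and dual solves use the refined grids.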
Fig. 5. Controlled solution (( V m )opt ) at 0.3 msec, 2 msec and 4 msec of simulation time respectively.
Fig. 6. Controlled (I e ) at 0.3 msec, 2 msec and 4 msec of simulation time respectively.
Fig. 7. Uncontrolled solution (V m ) grid view at 0.3 msec, 2 msec and 4 msec of simulation time respectively for AMR technique.
Fig. 8. Controlled solution (V m ) grid view at 0.3 msec, 2 msec and 4 msec of simulation time respectively for AMR technique.
Fig. 9. The norm of the gradient and minimum value of the cost functional are shown on left and right respectively for T = 4 msec of simulation time.
Fig. 10. The minimum values of $\int\!\!\int |V_m|^2 \, dx \, dt$ and $\int\!\!\int |I_e|^2 \, dx \, dt$ are shown on the left and right, respectively, for T = 4 msec of simulation time.
Test case 2

In this test case the placement of the control domains is changed and isotropic conductivity tensors are considered. With respect to the control domains, the observation domains are chosen in an asymmetric manner. More specifically, $\Omega_{con1} = [0.3, 0.4] \times [0.45, 0.55]$, $\Omega_{con2} = [0.5, 0.6] \times [0.53, 0.6]$, $\Omega_{obs1} = [0.25, 0.45] \times [0.40, 0.60]$ and $\Omega_{obs2} = [0.45, 0.65] \times [0.47, 0.65]$. In this test case the weight of the cost of the control is reduced to $\alpha = 5 \times 10^{-4}$ to achieve the dampening of the excitation wave propagation. The $L^2$ norms of the gradient and the minimum value of the cost functional are depicted in Fig. 9 for the different grid constructions. For all static grids 7 optimization iterations are required, while the AMR method takes 6 iterations. In this case the AMR method shows a clear improvement over the static grids in terms of the speed of convergence to the optimal solution. The corresponding values of the additive terms $\frac{1}{2}\int\!\!\int |V_m|^2 \, dx \, dt$ and $\frac{\alpha}{2}\int\!\!\int |I_e|^2 \, dx \, dt$ are depicted in Fig. 10. We observed that the control term has an especially strong impact during the first minimization iterations, during which the excitation of the wave front is dampened out. This can be seen on the right-hand side of Fig. 10. 2D contour snapshots of the uncontrolled solution and the optimally controlled solution are shown in Figs. 11 and 12 at times 0.3 msec, 2 msec and 4 msec for the 100 × 100 mesh. We can see that the uncontrolled wave front of the transmembrane voltage spreads uniformly in the x- and y-directions over the domain during the time interval from 0 msec to 4 msec; see Fig. 11. The corresponding controlled action potential is shown in Fig. 13 at times 0.3 msec, 2 msec and 4 msec, respectively. In contrast to Test case 1, we can observe that the controlled action potential is dampened much more rapidly, which is due to the isotropic conductivity tensors, $\sigma_{il} = \sigma_{it} = 3 \times 10^{-3}$, and the fact that the control cost parameter is reduced by a factor of 2.
Also observe that the control domain that is closer to the excitation domain has much more effect than the other one. We also carried out an experiment with the isotropic conductivity tensor of the form σil = σit = 3 × 10−4 . In this case the excitation wave is not dampened out by the chosen control configuration, due to the domination of the reaction
Fig. 11. Uncontrolled solution (V m ) at 0.3 msec, 2 msec and 4 msec of simulation time respectively.
Fig. 12. Controlled solution (( V m )opt ) at 0.3 msec, 2 msec and 4 msec of simulation time respectively.
Fig. 13. Controlled (I e ) at 0.3 msec, 2 msec and 4 msec of simulation time respectively.
part. We remark that for long time horizons this procedure is effective from the point of view of dampening the wave. For short time horizons, however, it is less successful when compared to locally distributed (w.r.t. space) control action. Again we can observe superlinear convergence of the optimization algorithm.

6. Conclusions

We discussed solution strategies based on the Newton method and adaptive grid refinement to solve the optimal control problem for the action potential in cardiac electrophysiology based on the monodomain model. The numerical results show the capability of dampening the excitation wave propagation by properly applying an extracellular current as a control variable. The presented inexact Newton optimization algorithm exhibits superlinear convergence. Moreover, the overall performance is more efficient when compared to first order methods; see [14]. The results motivate us to continue our investigations for the bidomain model. Adaptive time-stepping should be considered to further save computational time. We also need to strive for more insight with respect to longer time horizons, more realistic geometries, and finer meshes.
Acknowledgements

The authors gratefully acknowledge the Austrian Science Foundation (FWF) for financial support under SFB 032, “Mathematical Optimization and Applications in Biomedical Sciences”.

Appendix A. Optimization algorithm

Here is a brief outline of the complete Newton optimization algorithm, utilizing spatial grid adaptivity during the primal and dual solves, which is used to carry out the numerical computations.

Algorithm 1 Line search Newton-CG optimization algorithm.
1: primal variables: $v$, $w$.
2: dual variables: $p$, $q$.
3: repeat
4: set $v(0) := v_0$, $w(0) := w_0$ and solve the primal problem to obtain $v(t)$, $w(t)$, utilizing the spatial grid adaptation during the intermediate time steps.
5: solve the dual problem for $p(t)$, $q(t)$ using the terminal conditions $p(T) := 0$, $q(T) := 0$, utilizing the spatial grid adaptation during the intermediate time steps.
6: update the gradient using the adjoint solution: $\hat J' = p + \alpha I_e^k$.
7: solve the system $\hat J'' \, \delta I_e = -\hat J'$ by the linear conjugate gradient method.
8: set the step length $\beta_k := 1.0$ and compute the optimal $\beta_k$ using the backtracking method by checking the strong Wolfe conditions, see [16].
9: update $I_e^{k+1} := I_e^k + \beta_k \, \delta I_e$ using the modified $\beta_k$.
10: $k \leftarrow k + 1$.
11: until $\|\nabla J_k\|_\infty \le 10^{-3}(1 + |J_k|)$
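The structure of Algorithm 1 can be illustrated on a small finite-dimensional model problem. The sketch below is not the authors' PDE-constrained solver: the toy objective, the names `newton_cg`, `hess_vec`, `target`, and the forcing term `eta` are assumptions for illustration. The primal and dual solves of steps 4–5 are replaced by closed-form gradient and Hessian-vector evaluations, the inner CG loop is stopped with a relative tolerance proportional to the gradient norm (one reading of $\mathrm{tol}_{\mathrm{New}}$), and the line search uses Armijo backtracking with step halving only, without the curvature part of the strong Wolfe conditions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def axpy(a, x, y):
    """Return a*x + y for lists x, y."""
    return [a * xi + yi for xi, yi in zip(x, y)]

def newton_cg(J, grad, hess_vec, x, max_iter=100):
    """Line search Newton-CG; returns (x, number of iterations used)."""
    for k in range(max_iter):
        g = grad(x)
        # Stopping rule of step 11: ||grad J||_inf <= 1e-3 (1 + |J|).
        if max(abs(gi) for gi in g) <= 1e-3 * (1.0 + abs(J(x))):
            return x, k
        gnorm = math.sqrt(dot(g, g))
        eta = min(0.5, 1e-3 * gnorm)  # assumed forcing term, shrinks with ||g||
        # Inner loop (step 7): CG for H d = -g, using Hessian-vector products only.
        d = [0.0] * len(x)
        r = [-gi for gi in g]
        p = r[:]
        rr = dot(r, r)
        for _ in range(10 * len(x)):
            if math.sqrt(rr) <= eta * gnorm:
                break
            Hp = hess_vec(x, p)
            a = rr / dot(p, Hp)
            d = axpy(a, p, d)
            r = axpy(-a, Hp, r)
            rr_new = dot(r, r)
            p = axpy(rr_new / rr, p, r)
            rr = rr_new
        # Steps 8-9: backtracking from beta = 1, halving on rejection (Armijo only).
        beta, J0, slope = 1.0, J(x), dot(g, d)
        while J(axpy(beta, d, x)) > J0 + 1e-4 * beta * slope and beta > 1e-8:
            beta *= 0.5
        x = axpy(beta, d, x)
    return x, max_iter

# Hypothetical convex model problem standing in for the reduced cost functional.
alpha = 1e-3
target = [1.0, -2.0, 3.0]

def J(x):
    return sum(0.5 * (xi - ti) ** 2 + 0.5 * alpha * xi ** 2 + 0.25 * xi ** 4
               for xi, ti in zip(x, target))

def grad(x):
    return [(xi - ti) + alpha * xi + xi ** 3 for xi, ti in zip(x, target)]

def hess_vec(x, v):
    return [(1.0 + alpha + 3.0 * xi ** 2) * vi for xi, vi in zip(x, v)]

x_opt, iters = newton_cg(J, grad, hess_vec, [0.0, 0.0, 0.0])
print(iters, x_opt)
```

In the PDE setting each call to `grad` corresponds to one primal and one dual solve, and each call to `hess_vec` to additional linearized solves, which is why the inner CG iterations dominate the cost reported in Table 1.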
References

[1] P. Bastian, M. Blatt, A. Dedner, C. Engwer, R. Klöfkorn, R. Kornhuber, M. Ohlberger, O. Sander, A generic grid interface for parallel and adaptive scientific computing. Part II: implementation and tests in DUNE, Computing 82 (2008) 121–138.
[2] R. Becker, H. Kapp, R. Rannacher, Adaptive finite element methods for optimal control of partial differential equations: Basic concept, SIAM J. Control Optim. 39 (2000) 113–132.
[3] R. Becker, R. Rannacher, An optimal control approach to a posteriori error estimation in finite element methods, Acta Numer. 10 (2001) 1–102.
[4] Y. Bourgault, Y. Coudière, C. Pierre, Existence and uniqueness of the solution for the bidomain model used in cardiac electrophysiology, Nonlinear Analysis: Real World Applications 10 (2009) 458–482.
[5] E.M. Cherry, H.S. Greenside, C.S. Henriquez, A space–time adaptive method for simulating complex cardiac dynamics, Phys. Rev. Lett. 84 (2000) 1343–1346.
[6] L. Dede’, A. Quarteroni, Optimal control and numerical adaptivity for advection–diffusion equations, M2AN 39 (2005) 1019–1040.
[7] P.C. Franzone, P. Deuflhard, B. Erdmann, J. Lang, L.F. Pavarino, Adaptivity in space and time for reaction–diffusion systems in electrocardiology, SIAM Journal on Numerical Analysis 28 (2006) 942–962.
[8] C.S. Henriquez, Simulating the electrical behavior of cardiac tissue using the bidomain model, Crit. Rev. Biomed. Eng. 21 (1) (1993) 77.
[9] M. Hinze, K. Kunisch, Second order methods for optimal control of time-dependent fluid flow, SIAM J. Control Optim. 40 (2001) 925–946.
[10] M. Hinze, R. Pinnau, M. Ulbrich, S. Ulbrich, Optimization with PDE Constraints, Springer, 2008.
[11] J. Lang, Adaptive Multilevel Solution of Nonlinear Parabolic PDE Systems, Lecture Notes in Computational Science and Engineering, vol. 16, Springer-Verlag, Berlin, 2001.
[12] P.K. Moore, An adaptive finite element method for parabolic differential systems: Some algorithmic considerations in solving in three space dimensions, SIAM J. Sci. Comput. 21 (1999) 1567–1586.
[13] C. Nagaiah, K. Kunisch, G. Plank, Numerical solutions for optimal control of monodomain equations in cardiac electrophysiology, in: Proceedings of BFG-09, Recent Advances in Optimization and its Applications in Engineering, in press.
[14] C. Nagaiah, K. Kunisch, G. Plank, Numerical solution for optimal control of the reaction–diffusion equations in cardiac electrophysiology, Comput. Optim. Appl. (2009) 1–30.
[15] B.F. Nielsen, T.S. Ruud, G.T. Lines, A. Tveito, Optimal monodomain approximations of the bidomain equations, Appl. Math. Comput. 184 (2007) 276–290.
[16] J. Nocedal, S.J. Wright, Numerical Optimization, 2nd ed., Springer Verlag, New York, 2006.
[17] G. Plank, M. Liebmann, R.W. dos Santos, E. Vigmond, G. Haase, Algebraic multigrid preconditioner for the cardiac bidomain model, IEEE Trans. Biomed. Eng. 54 (4) (2007) 585–596.
[18] R. Plonsey, Bioelectric sources arising in excitable fibers (ALZA lecture), Ann. Biomed. Eng. 16 (1988) 519–546.
[19] M. Potse, B. Dube, J. Richer, A. Vinet, R. Gulrajani, A comparison of monodomain and bidomain reaction–diffusion models for action potential propagation in the human heart, IEEE Trans. Biomed. Eng. 53 (2006) 2425–2435.
[20] J.M. Rogers, A.D. McCulloch, A collocation-Galerkin finite element model of cardiac action potential propagation, IEEE Trans. Biomed. Eng. 41 (1994) 743–757.
[21] J.A. Trangenstein, C. Kim, Operator splitting and adaptive mesh refinement for the Luo–Rudy I model, J. Comput. Phys. 196 (2004) 645–679.
[22] L. Tung, A bi-domain model for describing ischemic myocardial DC potentials, Ph.D. thesis, MIT, Cambridge, MA, 1978.
[23] R. Verfürth, A Review of a Posteriori Error Estimation and Adaptive Mesh-Refinement Techniques, Wiley & Teubner, 1996.
[24] E.J. Vigmond, R. Weber dos Santos, A.J. Prassl, M. Deo, G. Plank, Solvers for the cardiac bidomain equations, Prog. Biophys. Mol. Biol. 96 (2008) 3–18.
[25] R. Weber dos Santos, G. Plank, S. Bauer, E. Vigmond, Parallel multigrid preconditioner for the cardiac bidomain model, IEEE Trans. Biomed. Eng. 51 (2004) 1960–1968.
[26] O.C. Zienkiewicz, J.Z. Zhu, A simple error estimator and adaptive procedure for practical engineering analysis, Int. J. Num. Meth. Eng. 24 (1987) 337–357.
Applied Numerical Mathematics 61 (2011) 66–76
Applied Numerical Mathematics www.elsevier.com/locate/apnum
A dimensional split preconditioner for Stokes and linearized Navier–Stokes equations

Michele Benzi a,∗,1, Xue-Ping Guo b,2

a Department of Mathematics and Computer Science, Emory University, Atlanta, GA 30322, USA
b Department of Mathematics, East China Normal University, Shanghai, 200062, PR China

Article info

Article history: Received 23 March 2009. Received in revised form 2 August 2010. Accepted 9 August 2010. Available online 19 August 2010.

Keywords: Saddle point problems; Matrix splittings; Iterative methods; Preconditioning; Stokes problem; Oseen problem; Stretched grids

Abstract

In this paper we introduce a new preconditioner for linear systems of saddle point type arising from the numerical solution of the Navier–Stokes equations. Our approach is based on a dimensional splitting of the problem along the components of the velocity field, resulting in a convergent fixed-point iteration. The basic iteration is accelerated by a Krylov subspace method like restarted GMRES. The corresponding preconditioner requires at each iteration the solution of a set of discrete scalar elliptic equations, one for each component of the velocity field. Numerical experiments illustrating the convergence behavior for different finite element discretizations of Stokes and Oseen problems are included.

© 2010 IMACS. Published by Elsevier B.V. All rights reserved.
1. Introduction

We consider the solution of the incompressible Navier–Stokes equations governing the flow of viscous Newtonian fluids. For an open bounded domain $\Omega \subset \mathbb{R}^d$ ($d = 2, 3$) with boundary $\partial\Omega$, time interval $[0, T]$, and data $f$, $g$ and $u_0$, the goal is to find a velocity field $u = u(x, t)$ and pressure field $p = p(x, t)$ such that

$$\frac{\partial u}{\partial t} - \nu \Delta u + (u \cdot \nabla) u + \nabla p = f \quad \text{on } \Omega \times (0, T], \tag{1}$$

$$\operatorname{div} u = 0 \quad \text{on } \Omega \times [0, T], \tag{2}$$

$$u = g \quad \text{on } \partial\Omega \times [0, T], \tag{3}$$

$$u(x, 0) = u_0(x) \quad \text{on } \Omega, \tag{4}$$

where $\nu$ is the kinematic viscosity, $\Delta$ is the vector Laplacian, $\nabla$ is the gradient and div the divergence. Implicit time discretization and linearization of the Navier–Stokes system by Picard fixed-point iteration result in a sequence of (generalized) Oseen problems of the form
*
Corresponding author. E-mail addresses:
[email protected] (M. Benzi),
[email protected] (X.-P. Guo). 1 The work of this author was supported in part by the National Science Foundation Grants DMS-0511336 and by a grant by the University Research Committee at Emory University. 2 Part of this work was performed while visiting CERFACS, 42 Ave. Gaspard Coriolis, 31057 Toulouse Cedex, France. 0168-9274/$30.00 © 2010 IMACS. Published by Elsevier B.V. All rights reserved. doi:10.1016/j.apnum.2010.08.005
M. Benzi, X.-P. Guo / Applied Numerical Mathematics 61 (2011) 66–76
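$$\sigma u - \nu \Delta u + (v \cdot \nabla) u + \nabla p = f \quad \text{in } \Omega, \tag{5}$$

$$\operatorname{div} u = 0 \quad \text{in } \Omega, \tag{6}$$

$$u = g \quad \text{on } \partial\Omega, \tag{7}$$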
where v is a known velocity field from a previous iteration or time step (the ‘wind’) and σ is proportional to the reciprocal of the time step (σ = 0 for a steady problem). When v = 0 we have a (generalized) Stokes problem. Spatial discretization of the Stokes or Oseen problem using LBB-stable finite elements (cf. [18,19]) results in large, sparse linear systems in saddle point form:
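$$\begin{bmatrix} A & B^T \\ -B & 0 \end{bmatrix} \begin{bmatrix} u \\ p \end{bmatrix} = \begin{bmatrix} f \\ -g \end{bmatrix}, \quad \text{or} \quad \mathcal{A} x = b, \tag{8}$$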
where now $u$ and $p$ represent the discrete velocity and pressure, respectively, $A$ is the discretization of the diffusion, convection, and time-dependent terms, $B^T$ is the discrete gradient, $B$ the (negative) discrete divergence, and $f$ and $g$ contain forcing and boundary terms. The efficient solution of (8) calls for rapidly convergent iterative methods. Much work has been done in developing efficient preconditioners for Krylov subspace methods applied to this problem; see, e.g., [4,5,8,14,15,17,18,22]. The ultimate goal is to develop robust solvers with optimal complexity. In particular, the rate of convergence should be independent of the mesh size $h$. For the Oseen problem, the rate of convergence should also depend only weakly on the kinematic viscosity $\nu$ (equivalently, on the Reynolds number $Re = O(\nu^{-1})$), although this goal is difficult to achieve in practice.

2. Dimensional splitting

For simplicity, in this paper we limit ourselves to the 2D case. Extension to the 3D case is possible (see Section 7), but will not be described here. The system matrix $\mathcal{A}$ admits the following splitting:

$$\mathcal{A} = \begin{bmatrix} A_1 & 0 & B_1^T \\ 0 & A_2 & B_2^T \\ -B_1 & -B_2 & 0 \end{bmatrix} = \begin{bmatrix} A_1 & 0 & B_1^T \\ 0 & 0 & 0 \\ -B_1 & 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 \\ 0 & A_2 & B_2^T \\ 0 & -B_2 & 0 \end{bmatrix} = \mathcal{A}_1 + \mathcal{A}_2. \tag{9}$$
Here each diagonal submatrix $A_i$ is a scalar discrete convection-diffusion-reaction operator:

$$A_i = \sigma M + \nu L + N_i \quad (i = 1, 2), \tag{10}$$

and $B_1^T$, $B_2^T$ are discretizations of the partial derivatives $\partial/\partial x$, $\partial/\partial y$, respectively. Note that $A_1$ and $A_2$ act, respectively, on $u$ (the x-component of the velocity field $\mathbf{u}$) and on $v$ (the y-component of $\mathbf{u}$). Denoting by $n_1$, $n_2$ and $m$ the number of degrees of freedom of $u$, $v$ and $p$, respectively, we have $A_1 \in \mathbb{R}^{n_1 \times n_1}$, $A_2 \in \mathbb{R}^{n_2 \times n_2}$, $B_1 \in \mathbb{R}^{m \times n_1}$ and $B_2 \in \mathbb{R}^{m \times n_2}$. Thus, $\mathcal{A} \in \mathbb{R}^{(n+m) \times (n+m)}$ with $n = n_1 + n_2$. In (10), $M$ denotes the velocity mass matrix, $L$ the discrete (negative) Laplacian, and $N_i$ the convective terms. Note that for the discrete Stokes problem the convective terms are absent ($N_i = 0$), so that each $A_i$ is symmetric and positive definite. For the discrete Oseen problem $A_i \ne A_i^T$, but $A_i + A_i^T$ is positive definite ($i = 1, 2$). As a consequence, $\mathcal{A}_1$ and $\mathcal{A}_2$ in (9) are nonsymmetric but positive semidefinite, in the sense that $\mathcal{A}_1 + \mathcal{A}_1^T$ and $\mathcal{A}_2 + \mathcal{A}_2^T$ are both symmetric positive semidefinite. In particular, $\mathcal{A}_1$ and $\mathcal{A}_2$ are singular.

We refer to (9) as a dimensional splitting, since $\mathcal{A}_1$ contains terms that correspond to the x-component of the solution and $\mathcal{A}_2$ contains terms that correspond to the y-component of the solution. Although this splitting is somewhat reminiscent of ADI (alternating direction implicit) methods [29], it is actually quite different since we do not split the operators $A_i$ into their constituent components. To distinguish it from ADI splitting, we refer to (9) as dimensional splitting (or DS for short). We further mention that our approach is also different from previous ADI schemes for saddle point problems, such as those described in [11] and [13].

Let now $\alpha > 0$ be a parameter, and denote by $I$ the identity matrix of order $n_1 + n_2 + m$. Then $\mathcal{A}_1 + \alpha I$ and $\mathcal{A}_2 + \alpha I$ are both nonsingular, nonsymmetric, and positive definite. Consider the two splittings of $\mathcal{A}$,

$$\mathcal{A} = (\mathcal{A}_1 + \alpha I) - (\alpha I - \mathcal{A}_2) \quad \text{and} \quad \mathcal{A} = (\mathcal{A}_2 + \alpha I) - (\alpha I - \mathcal{A}_1).$$

Associated to these splittings is the alternating iteration

$$(\mathcal{A}_1 + \alpha I)\, x^{k+\frac{1}{2}} = (\alpha I - \mathcal{A}_2)\, x^k + b, \tag{11}$$

$$(\mathcal{A}_2 + \alpha I)\, x^{k+1} = (\alpha I - \mathcal{A}_1)\, x^{k+\frac{1}{2}} + b \tag{12}$$

($k = 0, 1, \ldots$). Eliminating $x^{k+\frac{1}{2}}$ from these, we can rewrite (11)–(12) as the stationary scheme

$$x^{k+1} = \mathcal{T}_\alpha x^k + c, \quad k = 0, 1, \ldots$$

where $\mathcal{T}_\alpha$ is the iteration matrix

$$\mathcal{T}_\alpha = (\mathcal{A}_2 + \alpha I)^{-1} (\alpha I - \mathcal{A}_1)(\mathcal{A}_1 + \alpha I)^{-1} (\alpha I - \mathcal{A}_2) \tag{13}$$
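and $c$ is a certain vector. As in [9], one can show that there is a unique splitting $\mathcal{A} = \mathcal{P}_\alpha - \mathcal{Q}_\alpha$ with $\mathcal{P}_\alpha$ nonsingular such that the iteration matrix $\mathcal{T}_\alpha$ is the matrix induced by that splitting, i.e., $\mathcal{T}_\alpha = \mathcal{P}_\alpha^{-1} \mathcal{Q}_\alpha = I - \mathcal{P}_\alpha^{-1} \mathcal{A}$. Furthermore, $c = \mathcal{P}_\alpha^{-1} b$. Matrices $\mathcal{P}_\alpha$ and $\mathcal{Q}_\alpha$ are given by

$$\mathcal{P}_\alpha = \frac{1}{2\alpha} (\mathcal{A}_1 + \alpha I)(\mathcal{A}_2 + \alpha I), \qquad \mathcal{Q}_\alpha = \frac{1}{2\alpha} (\alpha I - \mathcal{A}_1)(\alpha I - \mathcal{A}_2). \tag{14}$$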
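The splitting (9), the identity $\mathcal{A} = \mathcal{P}_\alpha - \mathcal{Q}_\alpha$ with the factors (14), and one sweep of the alternating iteration (11)–(12) can be checked numerically on a tiny example. The sketch below uses plain Python with dense lists; the 5 × 5 matrices (with $n_1 = n_2 = 2$, $m = 1$), the value $\alpha = 1$, and all helper names are hypothetical choices for illustration and do not come from a finite element discretization.

```python
def eye(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def lin(a, A, b, B):
    """Entrywise a*A + b*B."""
    return [[a * x + b * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def solve(A, b):
    """Gaussian elimination with partial pivoting (dense, tiny systems only)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

# Hypothetical blocks: A1 = [[2,-1],[-1,2]], A2 = [[3,-1],[-1,3]], B1 = [1,-1], B2 = [1,2].
calA1 = [[2.0, -1.0, 0.0, 0.0, 1.0],
         [-1.0, 2.0, 0.0, 0.0, -1.0],
         [0.0, 0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0, 0.0],
         [-1.0, 1.0, 0.0, 0.0, 0.0]]
calA2 = [[0.0, 0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 3.0, -1.0, 1.0],
         [0.0, 0.0, -1.0, 3.0, 2.0],
         [0.0, 0.0, -1.0, -2.0, 0.0]]
alpha, I = 1.0, eye(5)
calA = lin(1.0, calA1, 1.0, calA2)  # Eq. (9)

# Eq. (14): P and Q, and the defect of the identity A = P - Q.
P = [[v / (2 * alpha) for v in row]
     for row in matmul(lin(1.0, calA1, alpha, I), lin(1.0, calA2, alpha, I))]
Q = [[v / (2 * alpha) for v in row]
     for row in matmul(lin(alpha, I, -1.0, calA1), lin(alpha, I, -1.0, calA2))]
split_err = max(abs(p - q - a) for rp, rq, ra in zip(P, Q, calA)
                for p, q, a in zip(rp, rq, ra))

def ds_sweep(x, b):
    """One step of the alternating iteration (11)-(12)."""
    rhs = [y + bi for y, bi in zip(matvec(lin(alpha, I, -1.0, calA2), x), b)]
    x_half = solve(lin(1.0, calA1, alpha, I), rhs)
    rhs = [y + bi for y, bi in zip(matvec(lin(alpha, I, -1.0, calA1), x_half), b)]
    return solve(lin(1.0, calA2, alpha, I), rhs)

# The exact solution of A x = b should be a fixed point of the sweep.
x_star = [1.0, -2.0, 0.5, 3.0, -1.0]
b = matvec(calA, x_star)
fixed_err = max(abs(u - v) for u, v in zip(ds_sweep(x_star, b), x_star))
print(split_err, fixed_err)
```

Both printed defects should be at the level of rounding error. Note that each half step only requires a solve with $\mathcal{A}_1 + \alpha I$ or $\mathcal{A}_2 + \alpha I$, each of which couples a single velocity component with the pressure.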
We refer to iteration (11)–(12) as the DS iteration, and to $\mathcal{P}_\alpha$ as the DS preconditioner. Besides the already mentioned resemblance to the Peaceman–Rachford ADI method, the DS iteration bears some resemblance to another alternating method, the Hermitian and skew-Hermitian splitting (HSS) iteration [2–4,6]. While the HSS method has proved quite successful in solving such problems as the generalized Stokes problem and the rotation form of the Navier–Stokes equations (see [6]), it is not well-suited for the standard (convection) form (1)–(2). This limitation of HSS was one of the main motivations for introducing the DS approach.

3. Convergence of the fixed-point iteration

We now prove that, under standard assumptions on the saddle point problem (8), the alternating iteration (11)–(12) converges to the solution of (8) for any choice of $\alpha > 0$ and for all initial guesses. First we state two auxiliary results. The first one is classical, and is known as Kellogg’s Lemma; see [21]. In the following, a (not necessarily Hermitian) matrix $A \in \mathbb{C}^{n \times n}$ is said to be positive definite (semidefinite) if the Hermitian matrix $A + A^*$ is positive definite (resp., semidefinite) in the usual sense.

Lemma 1. Let $A \in \mathbb{C}^{n \times n}$ be positive semidefinite. Then

$$\|(\alpha I_n + A)^{-1} (\alpha I_n - A)\|_2 \le 1$$

for all $\alpha > 0$. Furthermore, if $A$ is positive definite then

$$\|(\alpha I_n + A)^{-1} (\alpha I_n - A)\|_2 < 1$$

for all $\alpha > 0$.

Lemma 2. Assume that the (1, 1) block $A$ in (8) has positive definite symmetric part and that $B$ has full row rank. Then the following are equivalent:

(i) The matrix

$$\mathcal{C}_\alpha := \begin{bmatrix} A_1 & \frac{1}{\alpha^2} B_1^T B_2 A_2 & B_1^T + \frac{1}{\alpha^2} B_1^T B_2 B_2^T \\ 0 & A_2 & B_2^T \\ -B_1 & -B_2 & 0 \end{bmatrix} \tag{15}$$

has no purely imaginary eigenvalues.

(ii) The spectral radius of the iteration matrix $\mathcal{T}_\alpha$ in (13) is strictly less than unity.

Proof. First we note that under the assumptions made on $A$ and $B$, the matrix $\mathcal{A}$ in (8) is nonsingular; see [4, Lemma 1.1]. Let $\lambda$ be an eigenvalue of the iteration matrix $\mathcal{T}_\alpha = I - \mathcal{P}_\alpha^{-1} \mathcal{A}$. Then $\lambda = 1 - \mu$ where $\mu$ is a generalized eigenvalue of the matrix pencil $(\mathcal{A}, \mathcal{P}_\alpha)$; that is, there exists a vector $x \ne 0$ such that $\mathcal{A} x = \mu \mathcal{P}_\alpha x$. Expanding the right-hand side, we get
Ax =
μ 2α
(A1 A2 + α A + α 2 I )x.
Collecting terms in A, we rewrite this as
1−
1 2
μ Ax =
μα 2
1 I + 2 A1 A2 x .
(16)
α
Since both A and Pα are nonsingular, it must be μ = 0. Also, it must be 1 − (I + α12 A1 A2 )x = 0 has a nonzero solution, but this is impossible since
⎡
G := I +
1
α2
I n1
A1 A2 = ⎣ 0
is clearly nonsingular. Hence,
0
− α12 B 1T B 2 I n2 0
0
⎤
0 ⎦ Im
μ = 2 and we can set (as in [25, p. 378])
μα 2α 2θ θ := , from which μ = 2 − = . 2−μ θ +α θ +α
1 2
μ = 0 for otherwise (16) implies that
M. Benzi, X.-P. Guo / Applied Numerical Mathematics 61 (2011) 66–76
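Hence, the generalized eigenproblem (16) can be reformulated as 𝒜x = θGx, that is,
\[
  \begin{bmatrix} A_1 & 0 & B_1^T \\ 0 & A_2 & B_2^T \\ -B_1 & -B_2 & 0 \end{bmatrix}
  \begin{bmatrix} u \\ v \\ p \end{bmatrix}
  = \theta
  \begin{bmatrix} I_{n_1} & -\frac{1}{\alpha^2} B_1^T B_2 & 0 \\ 0 & I_{n_2} & 0 \\ 0 & 0 & I_m \end{bmatrix}
  \begin{bmatrix} u \\ v \\ p \end{bmatrix},
\]
or G^{-1}𝒜x = θx, where
\[
  G^{-1} = \begin{bmatrix} I_{n_1} & \frac{1}{\alpha^2} B_1^T B_2 & 0 \\ 0 & I_{n_2} & 0 \\ 0 & 0 & I_m \end{bmatrix}.
\]
This eigenproblem is precisely C_α x = θx, where C_α = G^{-1}𝒜 is the matrix in (15). Note that C_α is necessarily nonsingular. Recall now that the eigenvalues of P_α^{-1}𝒜 are of the form μ = 2θ/(θ + α). It must be |1 − μ| ≤ 1, since λ = 1 − μ is an eigenvalue of the iteration matrix
\[
  T_\alpha = I - P_\alpha^{-1}\mathcal{A} = (\alpha I + \mathcal{A}_2)^{-1}(\alpha I - \mathcal{A}_1)(\alpha I + \mathcal{A}_1)^{-1}(\alpha I - \mathcal{A}_2),
\]
which is similar to
\[
  \hat{T}_\alpha = (\alpha I - \mathcal{A}_1)(\alpha I + \mathcal{A}_1)^{-1}(\alpha I - \mathcal{A}_2)(\alpha I + \mathcal{A}_2)^{-1},
\]
hence ρ(T_α), the spectral radius of T_α, satisfies
\[
  \rho(T_\alpha) = \rho(\hat{T}_\alpha) \le \|(\alpha I - \mathcal{A}_1)(\alpha I + \mathcal{A}_1)^{-1}\|_2\, \|(\alpha I - \mathcal{A}_2)(\alpha I + \mathcal{A}_2)^{-1}\|_2 \le 1. \tag{17}
\]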
The last inequality is an immediate consequence of Kellogg's Lemma. Denoting by ℜ(θ) and ℑ(θ) the real and imaginary parts of θ, respectively, we claim that ρ(T_α) < 1 if and only if ℜ(θ) ≠ 0 for every eigenvalue θ of C_α; equivalently, ρ(T_α) = 1 if and only if there exists at least one θ with ℜ(θ) = 0. Indeed, we have
\[
  |1-\mu| = 1 \iff \left|1 - \frac{2\theta}{\theta+\alpha}\right| = 1 \iff \frac{|\theta-\alpha|}{|\theta+\alpha|} = 1 \iff |\theta-\alpha| = |\theta+\alpha|.
\]
The last equality can be rewritten as (ℜ(θ) − α)^2 + ℑ(θ)^2 = (ℜ(θ) + α)^2 + ℑ(θ)^2, or
\[
  (\Re(\theta)-\alpha)^2 = (\Re(\theta)+\alpha)^2 \iff 4\alpha\,\Re(\theta) = 0 \iff \Re(\theta) = 0.
\]
Therefore, ρ(T_α) = 1 if and only if C_α has at least one purely imaginary eigenvalue. The proof is complete. □
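The proof of Lemma 2 rests on the map θ ↦ λ = 1 − 2θ/(θ + α) = (α − θ)/(α + θ), which sends the open right half-plane into the open unit disc and the imaginary axis onto the unit circle. A minimal Python sketch of this correspondence follows; the function name `iteration_eigenvalue` and the sample values of θ and α are illustrative assumptions, not taken from the paper.

```python
# Sketch: how an eigenvalue theta of C_alpha maps to an eigenvalue
# lambda = 1 - mu of T_alpha, with mu = 2*theta/(theta + alpha).
# The function name and the sample values are illustrative assumptions.

def iteration_eigenvalue(theta: complex, alpha: float) -> complex:
    """Return lambda = 1 - 2*theta/(theta + alpha)."""
    mu = 2 * theta / (theta + alpha)
    return 1 - mu

alpha = 0.5
samples = {
    "Re(theta) > 0": complex(0.3, 2.0),
    "Re(theta) = 0": complex(0.0, 2.0),
    "Re(theta) < 0": complex(-0.3, 2.0),
}
moduli = {label: abs(iteration_eigenvalue(theta, alpha))
          for label, theta in samples.items()}

for label, modulus in moduli.items():
    print(f"{label}: |lambda| = {modulus:.6f}")
```

Only the sign of ℜ(θ) decides on which side of the unit circle λ falls, which is why the convergence question reduces to excluding purely imaginary eigenvalues of C_α.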
We are now in a position to prove the following convergence result.

Theorem 3. Under the assumptions of Lemma 2, the iteration (11)–(12) is unconditionally convergent; that is, ρ(T_α) < 1 for all α > 0.

Proof. By Lemma 2, it suffices to show that for all α > 0, the matrix C_α in (15) has no purely imaginary eigenvalues. We will argue by contradiction. Recall that C_α is nonsingular. Thus, let θ ≠ 0 be an eigenvalue of C_α corresponding to an eigenvector x = [u; v; p], where u ∈ ℂ^{n_1}, v ∈ ℂ^{n_2} and p ∈ ℂ^m are not all equal to zero. Expanding C_α x = θx we obtain
\begin{align}
  A_1 u + \frac{1}{\alpha^2} B_1^T B_2 A_2 v + B_1^T p + \frac{1}{\alpha^2} B_1^T B_2 B_2^T p &= \theta u, \tag{18}\\
  A_2 v + B_2^T p &= \theta v, \tag{19}\\
  -B_1 u - B_2 v &= \theta p. \tag{20}
\end{align}
Assuming that the eigenvector x has been normalized so that ‖x‖_2 = 1, we have
\[
  \theta = x^* C_\alpha x, \qquad \bar{\theta} = x^* C_\alpha^T x \qquad\text{and}\qquad \Re(\theta) = \frac{\theta + \bar{\theta}}{2} = \frac{1}{2}\, x^* (C_\alpha + C_\alpha^T)\, x.
\]
Therefore, letting H_1 = (A_1 + A_1^T)/2 and H_2 = (A_2 + A_2^T)/2, we find after some easy algebraic manipulations
\[
  \Re(\theta) = u^* H_1 u + v^* H_2 v + \frac{1}{2\alpha^2}\Bigl[ u^* B_1^T B_2 (A_2 v + B_2^T p) + (v^* A_2^T + p^* B_2) B_2^T B_1 u \Bigr].
\]
First we observe that at least one of u and v must be nonzero, for otherwise (20), together with the fact that θ ≠ 0, implies p = 0 and thus x = 0, a contradiction. Since H_1 and H_2 are symmetric positive definite we have that u^* H_1 u + v^* H_2 v > 0, therefore
\[
  \Re(\theta) = 0 \;\Rightarrow\; u^* B_1^T B_2 (A_2 v + B_2^T p) + (v^* A_2^T + p^* B_2) B_2^T B_1 u < 0, \tag{21}
\]
showing that if θ is purely imaginary, it must necessarily be u ≠ 0. Next, consider the case where v = 0. Then Eqs. (18)–(20) reduce to
\begin{align}
  A_1 u + B_1^T p &= \theta u, \tag{22}\\
  B_2^T p &= 0, \tag{23}\\
  -B_1 u &= \theta p, \tag{24}
\end{align}
that is, 𝒜x = θx where x = [u; 0; p] ≠ 0. Hence, θ is an eigenvalue of 𝒜; but then ℜ(θ) > 0 by virtue of Lemma 1.1 in [4]. So it must be v ≠ 0. Furthermore, if p = 0, then Eq. (19) becomes A_2 v = θv, hence ℜ(θ) > 0 since A_2 is positive definite (has positive definite symmetric part) by assumption, and therefore all its eigenvalues have positive real part. Thus, it must be u ≠ 0, v ≠ 0 and p ≠ 0. Now, using (19) we rewrite the necessary condition in (21) in the form
\[
  \theta\, u^* B_1^T B_2 v + \bar{\theta}\, v^* B_2^T B_1 u = \theta\, (B_1 u)^* B_2 v + \bar{\theta}\, (B_2 v)^* B_1 u < 0. \tag{25}
\]
Now, from (20) we obtain B_2 v = −B_1 u − θp which substituted into (25) yields
\[
  \theta\, u^* B_1^T (-B_1 u - \theta p) + \bar{\theta}\, (-u^* B_1^T - \bar{\theta} p^*) B_1 u < 0,
\]
or, equivalently,
\[
  -\theta\, u^* B_1^T B_1 u - \theta^2 u^* B_1^T p - \bar{\theta}\, u^* B_1^T B_1 u - \bar{\theta}^2 p^* B_1 u < 0.
\]
Now, if ℜ(θ) = 0 then θ = iξ for some ξ ∈ ℝ, ξ ≠ 0, where i = √−1. After simplification, we find
\[
  -\theta\, u^* B_1^T B_1 u - \theta^2 u^* B_1^T p - \bar{\theta}\, u^* B_1^T B_1 u - \bar{\theta}^2 p^* B_1 u = \xi^2 \bigl( u^* B_1^T p + p^* B_1 u \bigr),
\]
therefore condition (21) becomes u^* B_1^T p + p^* B_1 u < 0, and since u^* B_1^T p + p^* B_1 u = 2ℜ(u^* B_1^T p), we conclude that
\[
  \Re(\theta) = 0 \;\Rightarrow\; \Re(u^* B_1^T p) < 0.
\]
Likewise, from (20) we obtain B_1 u = −B_2 v − θp; substituting this into (25) and going through the same algebraic operations as before, we also find that
\[
  \Re(\theta) = 0 \;\Rightarrow\; \Re(v^* B_2^T p) < 0.
\]
Therefore,
\[
  \Re\bigl((B_1 u + B_2 v)^* p\bigr) = \Re\bigl((u^* B_1^T + v^* B_2^T)\, p\bigr) < 0,
\]
but together with (20) this implies ℜ((−θp)^* p) < 0, or ℜ(iξ‖p‖_2^2) < 0, which is clearly absurd since iξ‖p‖_2^2 is imaginary. This proves that C_α cannot have purely imaginary eigenvalues. □

The restriction in Theorem 3 that A have positive definite symmetric part is not essential. If A + A^T is only positive semidefinite (and singular), the alternating iteration (11)–(12) is still well defined. Moreover, the spectral radius of the iteration matrix cannot exceed 1. Indeed, the iteration matrix T_α still satisfies ρ(T_α) ≤ 1, and if the symmetric part of A and the matrix B have no null vectors in common, the coefficient matrix 𝒜 in (8) is still nonsingular; see again Lemma 1.1 in [4]. Hence, 1 is not an eigenvalue of T_α = I − P_α^{-1}𝒜. However, it may happen that ρ(T_α) = 1 for some choices of α > 0. A simple example is given by
\[
  \mathcal{A} = \begin{bmatrix} I & 0 & 0 \\ 0 & 0 & I \\ 0 & -I & 0 \end{bmatrix}.
\]
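The example matrix can be checked by hand in its smallest instance, where every block is 1 × 1 (so A_1 = 1, A_2 = 0, B_1 = 0, B_2 = 1). The Python sketch below forms T_α for α = 1 in exact rational arithmetic and computes the coefficients of its characteristic polynomial, which comes out as λ^3 + λ = λ(λ − i)(λ + i). The helper names (`matmul`, `inverse`, `shifted`) are illustrative assumptions, and the 3 × 3 size is chosen only to keep the check small.

```python
# Sketch: the 3x3 instance of the example (all blocks 1x1), alpha = 1.
# Exact rational arithmetic via fractions.Fraction; helper names are illustrative.
from fractions import Fraction

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse(M):
    """Gauss-Jordan inverse, exact over Fractions (pivot = first nonzero entry)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def shifted(M, alpha, sign):
    """Return alpha*I + sign*M."""
    n = len(M)
    return [[alpha * int(i == j) + sign * M[i][j] for j in range(n)]
            for i in range(n)]

# Splitting of the example matrix: A_1 = 1, A_2 = 0, B_1 = 0, B_2 = 1.
A1 = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
A2 = [[0, 0, 0], [0, 0, 1], [0, -1, 0]]
alpha = 1

# T = (alpha I + A2)^{-1} (alpha I - A1) (alpha I + A1)^{-1} (alpha I - A2)
T = matmul(matmul(inverse(shifted(A2, alpha, +1)), shifted(A1, alpha, -1)),
           matmul(inverse(shifted(A1, alpha, +1)), shifted(A2, alpha, -1)))

# Characteristic polynomial  l^3 - c2*l^2 + c1*l - c0.
c2 = sum(T[i][i] for i in range(3))                        # trace
c1 = sum(T[i][i] * T[j][j] - T[i][j] * T[j][i]
         for i in range(3) for j in range(i + 1, 3))       # principal 2x2 minors
c0 = (T[0][0] * (T[1][1] * T[2][2] - T[1][2] * T[2][1])
      - T[0][1] * (T[1][0] * T[2][2] - T[1][2] * T[2][0])
      + T[0][2] * (T[1][0] * T[2][1] - T[1][1] * T[2][0]))  # determinant

print("T =", T)
print("(c2, c1, c0) =", (c2, c1, c0))
```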
Note that this matrix is nonsingular. It is easy to see that for α = 1, the iteration matrix T_α has only three distinct eigenvalues: λ = 0, λ = i and λ = −i. Hence, the spectral radius is 1. Nevertheless, a simple modification of the basic algorithm yields a convergent iteration. To this end, recall that ρ(T_α) ≤ 1 for all α > 0; see (17). Let γ ∈ (0, 1) be a parameter; then the matrix (1 − γ)I + γT_α has spectral radius less than 1 for all α > 0. Indeed, the eigenvalues of (1 − γ)I + γT_α are of the form 1 − γ + γλ, where λ is an eigenvalue of T_α. It is easy to see that since |λ| ≤ 1 and λ ≠ 1, all the quantities 1 − γ + γλ have magnitude strictly less than 1. In practice, however, this modification is seldom used. In the next section we discuss Krylov subspace acceleration, which is much more effective and is applicable whether or not ρ(T_α) < 1. Nevertheless, knowing that ρ(T_α) < 1 is useful because it implies that the spectrum of the preconditioned matrix lies entirely in the right half-plane, a desirable property for Krylov subspace acceleration. Moreover, the smaller ρ(T_α) is, the more clustered the spectrum of the preconditioned matrix is around 1.

4. Krylov subspace acceleration

The basic method (11)–(12), although unconditionally convergent, is not competitive as a solver for problem (8), mainly due to the fact that convergence is generally slow. Fortunately, the rate of convergence can be greatly improved by Krylov subspace acceleration. In other words, P_α can be used as a preconditioner for GMRES [24] or any other nonsymmetric Krylov method. It should be noted that when P_α is used as a preconditioner, the pre-factor 1/(2α) in (14) is irrelevant and can be neglected. In this paper we use the restarted GMRES algorithm with restart parameter m. Preconditioning is applied on the right.

The rate of convergence of nonsymmetric Krylov iterations (like GMRES) preconditioned by P_α depends on the particular choice of α. Finding the value of α that optimizes the rate of convergence appears to be a difficult problem in general. Indeed, in practice the convergence rate depends to a large extent on the size, shape, and location of the entire spectrum of the preconditioned matrix P_α^{-1}𝒜, and not just on the spectral radius of T_α = I − P_α^{-1}𝒜. (The rate of convergence may also be affected by the conditioning of the eigenbasis of the preconditioned matrix, but this is usually difficult to estimate; see, e.g., [26, p. 17].)
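Two spectral facts used above lend themselves to a quick numerical illustration: damping maps every λ on the unit circle other than λ = 1 strictly inside the unit disc, and eigenvalues μ = 1 − λ of the preconditioned matrix have positive real part whenever |λ| < 1. The following Python sketch checks both on sample points; the value γ = 0.5, the sampling of the circle, and the radius 0.9 are arbitrary illustrative choices.

```python
# Sketch: (i) damped eigenvalues 1 - gamma + gamma*lam for lam on the unit circle,
# (ii) real parts of mu = 1 - lam for |lam| < 1. All sample values are illustrative.
import cmath
import math

def damped(lam: complex, gamma: float) -> complex:
    """Eigenvalue of (1 - gamma) I + gamma T for an eigenvalue lam of T."""
    return 1 - gamma + gamma * lam

gamma = 0.5
# Twelve equispaced points on the unit circle, excluding lam = 1 (angle 0).
angles = [2 * math.pi * k / 12 for k in range(1, 12)]
circle = [cmath.exp(1j * t) for t in angles]
damped_moduli = [abs(damped(lam, gamma)) for lam in circle]

# Eigenvalues strictly inside the disc (radius 0.9) and the corresponding mu = 1 - lam.
inside = [0.9 * lam for lam in circle]
real_parts = [(1 - lam).real for lam in inside]

print("largest damped modulus:", max(damped_moduli))
print("smallest Re(mu):", min(real_parts))
```

Since |μ − 1| = |λ|, a smaller spectral radius of T_α confines the preconditioned spectrum to a smaller disc around 1, which is the clustering property mentioned in the text.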
Numerical experiments (see below) suggest that the value (or values) α_* of α for which the number of preconditioned iterations is minimized is a rather small number (0 < α_* ≪ 1). Moreover, the convergence rate is not overly sensitive to small relative changes in α.

5. Implementation aspects

For the proposed approach to be successful, it is imperative that the action of the DS preconditioner be computed efficiently within each GMRES iteration. Written out explicitly, system (11) reads
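\[
  \begin{bmatrix} A_1 + \alpha I_{n_1} & 0 & B_1^T \\ 0 & \alpha I_{n_2} & 0 \\ -B_1 & 0 & \alpha I_m \end{bmatrix}
  \begin{bmatrix} u^{k+1/2} \\ v^{k+1/2} \\ p^{k+1/2} \end{bmatrix}
  =
  \begin{bmatrix} \alpha u^k + f_1 \\ (\alpha I_{n_2} - A_2) v^k - B_2^T p^k + f_2 \\ B_2 v^k + \alpha p^k - g \end{bmatrix}, \tag{26}
\]
while system (12) becomes
\[
  \begin{bmatrix} \alpha I_{n_1} & 0 & 0 \\ 0 & A_2 + \alpha I_{n_2} & B_2^T \\ 0 & -B_2 & \alpha I_m \end{bmatrix}
  \begin{bmatrix} u^{k+1} \\ v^{k+1} \\ p^{k+1} \end{bmatrix}
  =
  \begin{bmatrix} (\alpha I_{n_1} - A_1) u^{k+1/2} - B_1^T p^{k+1/2} + f_1 \\ \alpha v^{k+1/2} + f_2 \\ B_1 u^{k+1/2} + \alpha p^{k+1/2} - g \end{bmatrix}. \tag{27}
\]
Both systems (26) and (27) are highly reducible. Indeed, the second equation in (26) immediately yields v^{k+1/2} = (1/α)[(αI_{n_2} − A_2)v^k − B_2^T p^k + f_2] together with the reduced system
\[
  \begin{bmatrix} A_1 + \alpha I_{n_1} & B_1^T \\ -B_1 & \alpha I_m \end{bmatrix}
  \begin{bmatrix} u^{k+1/2} \\ p^{k+1/2} \end{bmatrix}
  =
  \begin{bmatrix} \alpha u^k + f_1 \\ B_2 v^k + \alpha p^k - g \end{bmatrix}. \tag{28}
\]
Likewise, system (27) is equivalent to u^{k+1} = (1/α)[(αI_{n_1} − A_1)u^{k+1/2} − B_1^T p^{k+1/2} + f_1] together with the reduced system
\[
  \begin{bmatrix} A_2 + \alpha I_{n_2} & B_2^T \\ -B_2 & \alpha I_m \end{bmatrix}
  \begin{bmatrix} v^{k+1} \\ p^{k+1} \end{bmatrix}
  =
  \begin{bmatrix} \alpha v^{k+1/2} + f_2 \\ B_1 u^{k+1/2} + \alpha p^{k+1/2} - g \end{bmatrix}. \tag{29}
\]
Both systems (28) and (29) can be further reduced. Let c^k = αu^k + f_1 and d^k = B_2 v^k + αp^k − g. For (28), the second equation yields
\[
  p^{k+1/2} = \frac{1}{\alpha}\bigl(d^k + B_1 u^{k+1/2}\bigr) \tag{30}
\]
which, substituted in the first one, yields
\[
  \Bigl(A_1 + \alpha I_{n_1} + \frac{1}{\alpha} B_1^T B_1\Bigr) u^{k+1/2} = c^k - \frac{1}{\alpha} B_1^T d^k. \tag{31}
\]
Once this equation has been solved for u^{k+1/2}, we recover p^{k+1/2} from (30). Similarly, let c^{k+1/2} = αv^{k+1/2} + f_2 and d^{k+1/2} = B_1 u^{k+1/2} + αp^{k+1/2} − g, then the second equation of (29) yields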
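\[
  p^{k+1} = \frac{1}{\alpha}\bigl(d^{k+1/2} + B_2 v^{k+1}\bigr) \tag{32}
\]
which, substituted in the first one, yields
\[
  \Bigl(A_2 + \alpha I_{n_2} + \frac{1}{\alpha} B_2^T B_2\Bigr) v^{k+1} = c^{k+1/2} - \frac{1}{\alpha} B_2^T d^{k+1/2}. \tag{33}
\]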
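The reduction (26)–(33) can be sketched as a single function that performs one full DS sweep. To stay dependency-free, the Python sketch below takes A_1, A_2, B_1, B_2 to be scalars (n_1 = n_2 = m = 1), so the reduced systems (31) and (33) become scalar divisions; in a real discretization these two steps would be sparse linear solves. All numerical values and the names `ds_sweep` and `residual` are illustrative assumptions.

```python
# Sketch: one sweep of the DS iteration via the reduced systems (30)-(33),
# for a toy problem in which A_1, A_2, B_1, B_2 are scalars (n_1 = n_2 = m = 1).
# All numerical values and function names are illustrative assumptions.

A1, A2 = 2.0, 3.0          # positive definite (1,1) block A = diag(A1, A2)
B1, B2 = 1.0, 1.0          # B = [B1 B2] has full row rank
f1, f2, g = 1.0, 2.0, 3.0  # right-hand side is [f1, f2, -g]

def ds_sweep(u, v, p, alpha):
    # First half step, system (26): v decouples, then solve (31) and use (30).
    v_half = ((alpha - A2) * v - B2 * p + f2) / alpha
    c = alpha * u + f1
    d = B2 * v + alpha * p - g
    u_half = (c - B1 * d / alpha) / (A1 + alpha + B1 * B1 / alpha)   # (31)
    p_half = (d + B1 * u_half) / alpha                               # (30)
    # Second half step, system (27): u decouples, then solve (33) and use (32).
    u_new = ((alpha - A1) * u_half - B1 * p_half + f1) / alpha
    c_half = alpha * v_half + f2
    d_half = B1 * u_half + alpha * p_half - g
    v_new = (c_half - B2 * d_half / alpha) / (A2 + alpha + B2 * B2 / alpha)  # (33)
    p_new = (d_half + B2 * v_new) / alpha                                    # (32)
    return u_new, v_new, p_new

def residual(u, v, p):
    """Max-norm residual of the saddle point system with right-hand side [f1, f2, -g]."""
    return max(abs(A1 * u + B1 * p - f1),
               abs(A2 * v + B2 * p - f2),
               abs(-B1 * u - B2 * v + g))

u = v = p = 0.0
for _ in range(100):
    u, v, p = ds_sweep(u, v, p, alpha=1.0)

print("u, v, p =", u, v, p, " residual =", residual(u, v, p))
```

For this toy data the exact solution of the 3 × 3 system is (u, v, p) = (1.6, 1.4, −2.2), and the stationary iteration reaches it from the zero initial guess, in line with Theorem 3.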
Once this equation has been solved, the new value p^{k+1} of the pressure can be obtained from (32). Hence, the bulk of the work in applying the preconditioner is in the solution of the two reduced systems (31) and (33). Each of these is a discrete analogue of a scalar, second-order, elliptic, anisotropic convection–diffusion–reaction equation. The anisotropy is in a sense artificial, since it depends on the size of the algorithmic parameter α: the smaller α is, the stronger the anisotropy in the diffusion terms. Note that for the Stokes and generalized Stokes problems, the convection terms are missing in (31) and (33) and the coefficient matrices are symmetric and positive definite. Some remarks on the solution of these two subsystems are given in the next section.

On the basis of the foregoing discussion, it is clear that the DS approach can be regarded as a dimensionally segregated method, i.e., a method where the values of the velocity components u and v (or u, v and w in 3D) are updated separately through a decoupling process; the new value of the pressure p is obtained at very low cost from the new velocity values.

We conclude this section with a discussion of diagonal scaling. In the alternating iteration (11)–(12), it is possible to replace the (n + m)-by-(n + m) identity matrix I with an arbitrary symmetric positive definite matrix D, leading to a preconditioner of the form
\[
  \hat{P}_\alpha = \frac{1}{2\alpha}\,(\mathcal{A}_1 + \alpha D)(\mathcal{A}_2 + \alpha D).
\]
It is easy to check that this is equivalent to applying the original DS preconditioner P_α to the linear system \hat{𝒜}x̂ = b̂, where \hat{𝒜} := D^{-1/2}𝒜D^{-1/2}, x̂ = D^{1/2}x, and b̂ = D^{-1/2}b. A natural choice for finite element problems is to take D to be a block diagonal matrix with the velocity and pressure mass matrices on the main diagonal. In order to reduce the cost of applying the preconditioner, the mass matrices can be lumped or replaced by their main diagonals. In this paper we form D from the diagonals of the mass matrices. We found that this scaling, which is used in all numerical experiments in the next section, is quite effective in improving the convergence rate of DS preconditioning, especially for problems on stretched grids. Incidentally, it was noted in [4] that diagonal scaling is very beneficial for the HSS preconditioner as well.

6. Numerical experiments

In this section we report the results of numerical experiments on linear systems from Stokes and Oseen models of incompressible flow. We consider Q2–Q1 finite element discretizations of two standard model problems: the leaky-lid driven cavity problem, and the backward facing step problem (see [18]). For the driven cavity problem, we consider both uniform and stretched grids of increasing size. All test problems were generated under Matlab using the IFISS software package [16] (see also [18]). We use the DS preconditioner in conjunction with restarted GMRES with restart parameter m = 30. In all cases the initial guess was the zero vector, and the stopping criterion was a reduction of at least six orders of magnitude of the initial residual norm. We discuss experiments for both steady and unsteady cases.

6.1. Steady problems

Application of the DS preconditioner requires, at each iteration, the solution of the linear systems (31) and (33), where the general form of A_i is given in (10). For the steady Stokes problems, σ = 0 and N_i = 0; hence, the coefficient matrices in these two systems are symmetric positive definite. The systems can be solved very efficiently with a sparse Cholesky factorization with an approximate minimum degree (AMD) ordering; see [1,12].
The factorization is computed once and for all at the outset, and only forward and backward triangular solves need to be performed at each GMRES iteration. For the steady Oseen problem (σ = 0, N_i ≠ 0 in (10)) the two systems (31) and (33) are nonsymmetric, although structurally symmetric. We compute sparse LU factorizations [12] using again an AMD reordering. These direct methods are much faster than iterative methods in the case of 2D problems; in the solution of large 3D problems iterative methods will have to be used instead, necessitating the use of a flexible Krylov method (like flexible GMRES [23]) for the outer iteration.

The first set of experiments is aimed at assessing the performance of the DS preconditioner on steady Stokes and Oseen problems, in particular to investigate the dependence on the discretization parameter h and on the viscosity ν. We begin with the leaky-lid driven cavity problem. In Table 1 we show iteration counts for DS-preconditioned GMRES(30) applied to the steady Stokes problem on a sequence of uniform grids. We report results for the optimal choice of α, determined experimentally. We see that the iteration count is independent of mesh size. Our tests show that for the Stokes problem, small changes in the value of α do not have a dramatic effect on the number of iterations. In Table 2 we report iteration counts for the steady Oseen problem on a sequence of uniform grids and for different values of ν, using optimal or near-optimal values of α. We found that for small values of the viscosity, scaling (as described

Table 1
Preconditioned GMRES iterations for Stokes problem with the optimal α.

Grid         Its   α_opt
16 × 16      11    0.006
32 × 32      12    0.001
64 × 64      12    0.0006
128 × 128    10    0.0002

Table 2
Preconditioned GMRES iterations for Oseen problem for different values of ν, optimal α (lid driven cavity, uniform grids).

Grid         ν = 0.1   ν = 0.01   ν = 0.001
16 × 16      14        19         44
32 × 32      15        20         57
64 × 64      15        19         50
128 × 128    14        18         39

Table 3
Preconditioned GMRES iterations for Oseen problem for different values of ν, optimal α (backward facing step, uniform grids).

Grid         ν = 0.1   ν = 0.01   ν = 0.005
16 × 48      19        21         22
32 × 96      19        22         27
64 × 192     20        24         30
128 × 384    19        25         32

Table 4
Preconditioned GMRES iterations for Stokes problem with the optimal α (lid driven cavity, stretched grids).

Grid         Its   α_opt
16 × 16      9     0.2
32 × 32      9     0.3
64 × 64      9     0.3
128 × 128    9     0.4

Table 5
Preconditioned GMRES iterations for Oseen problem for different values of ν, optimal α (lid driven cavity, stretched grids).

Grid         ν = 0.1   ν = 0.01   ν = 0.001
16 × 16      14        30         137
32 × 32      14        37         166
64 × 64      14        39         177
128 × 128    14        39         189

in the previous section) dramatically improves the rate of convergence of preconditioned GMRES iteration. For example, for ν = 0.001 the preconditioned iteration without scaling requires over 200 iterations on the fine grid. One can clearly see again that DS preconditioning results in h-independent convergence rates. There is a mild dependence of the rate of convergence on the viscosity. Note, however, that the convergence rate on the finest grid remains excellent. Results for the backward facing step problem (Oseen only) are presented in Table 3. Note that for this problem, the number of cells is different in the horizontal and vertical directions. Here the smallest value of the viscosity we consider is ν = 0.005, since the flow is unsteady for ν ≈ 0.001. The experiments show a fairly robust convergence behavior with respect to both h and ν . Next, we present some results using stretched grids, using the default stretch factors provided by IFISS. These are 1.2712 for the 16 × 16 grid, 1.1669 for the 32 × 32 grid, 1.0977 for the 64 × 64 grid, and 1.056 for the 128 × 128 grid. The stretching is done in both the horizontal and vertical directions starting at the center of the domain, resulting in rather fine grids near the boundaries. In practice, stretched grids of this kind are often used to resolve boundary layers, if present. However, using stretched grids typically results in linear systems that are considerably more difficult to solve with iterative methods. Tables 4–5 report results on numerical experiments for the driven cavity Stokes and Oseen problems (respectively) discretized on a sequence of stretched grids. Clearly, the results for the Stokes problem are very good—indeed, even better than on the uniform grids. It is important to mention that without the diagonal scaling (using the diagonals of the mass matrices), the

Table 6
Preconditioned GMRES iterations for generalized Stokes driven cavity problem (ν = 1, σ = h^{-1}) with the optimal α.

Grid         Its   α_opt
16 × 16      11    0.006
32 × 32      12    0.001
64 × 64      11    0.0005
128 × 128    12    0.0001

Table 7
Preconditioned GMRES iterations for generalized Oseen driven cavity problem (σ = h^{-1}) for different values of ν, optimal α.

Grid         ν = 0.1   ν = 0.01   ν = 0.001
16 × 16      12        13         13
32 × 32      14        15         15
64 × 64      16        18         19
128 × 128    19        22         23

convergence behavior is much worse; in particular, the number of iterations increases as the mesh is refined. Also note that the optimal value of α is much larger now than for the uniform grid case. Table 5 shows that mesh independent convergence is observed also for the steady Oseen problem, although there is a noticeable dependence on the viscosity. We note that without scaling with the mass matrices, the rate of convergence deteriorates very rapidly for decreasing mesh size and viscosity. Hence, diagonal scaling helps mitigate the negative impact of stretched meshes and removes the dependence on mesh size; however, it does not appear to be enough to achieve robustness for very small values of ν. In our experience, this kind of degradation in solver performance is observed also with other preconditioners when stretched grids are used.

6.2. Unsteady problems

Next, we report on analogous experiments involving the generalized Stokes problem (with ν = 1) and the generalized Oseen problem (for several values of ν). A sequence of linear systems of this type needs to be solved when the time-dependent Stokes or Navier–Stokes equations are integrated numerically using implicit time-stepping schemes. Now the matrices A_1 and A_2 in (31)–(33) are of the form (10) where σ = h^{-1} and M is the velocity mass matrix; also, N_i = 0 for generalized Stokes and N_i ≠ 0 for generalized Oseen. For brevity, we only consider the driven cavity problem on square meshes. As one can see from Table 6, for the generalized Stokes problem the results are virtually the same as those obtained in the steady case. The behavior, however, is somewhat different for the generalized Oseen problem. Indeed, we can see from the results in Table 7 that the rate of convergence for DS-preconditioned GMRES(20) is essentially independent of viscosity, while showing a mild dependence on h.

6.3. Choosing α

As with all parameter dependent preconditioners, some guidelines need to be provided for the choice of α.
The analytic determination of the value of α which results in the fastest convergence of the preconditioned GMRES iteration appears to be quite difficult, especially in the case of the Oseen problem. Our numerical experiments indicate that for problems posed on uniform meshes, α should be taken very small; for the Q2–Q1 discretization used in our experiments, the best value of α is often of the order of 10^{-3} or even smaller. Moreover, the best value of α gets smaller as the mesh is refined. Note that taking too small a value of α can lead to excessive ill-conditioning in the subsystems (31)–(33) to be solved at each GMRES iteration; in our experiments, however, this was never a problem. For problems posed on stretched meshes, on the other hand, our experiments show that with proper diagonal scaling the optimal α is often of the order of 10^{-1}.

A possible rule of thumb, applicable in case of uniform meshes, is to tie α to the discretization parameter h. Taking α ≈ h^2 was found to give pretty good results in most cases, at least for h small enough. In the case of a uniform mesh, the rate of convergence of DS-preconditioned GMRES does not appear to be overly sensitive to the choice of α, in the sense that small relative changes in the size of α do not usually cause the number of iterations to change too drastically. In Fig. 1 we show the total number of GMRES and GMRES(30) iterations for the solution of the steady Oseen problem (with ν = 0.01) as a function of α, for two choices of h. Fig. 2 displays the corresponding data for the unsteady case. From these plots, it appears that slightly overestimating the parameter α does not lead to drastic changes in the number of iterations, especially in the steady case. Underestimating the optimal α can be more harmful, but this is easy to avoid. One way to do this is to find a (near-)optimal value of α on a coarse grid, and then use the same value of α on finer grids.
Since the optimal α tends to decrease as the grid is refined, this strategy will generally overestimate the optimal α. This strategy, again, assumes uniform meshes are used.
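The idea of tuning α on a small problem by a direct scan can be sketched as follows. This is only a toy under stated assumptions: it reuses a scalar-block saddle point system (n_1 = n_2 = m = 1), and it counts plain stationary DS sweeps instead of the preconditioned GMRES iterations reported in the paper, so the selected value says nothing about the optimal α for the Stokes or Oseen test problems. All numbers, the candidate list, and the function names are hypothetical.

```python
# Sketch: choose alpha on a small toy problem by scanning candidate values and
# counting stationary DS sweeps to convergence. Plain fixed-point sweeps stand in
# for preconditioned GMRES here; every value and name below is an assumption.

A1, A2, B1, B2 = 2.0, 3.0, 1.0, 1.0
f1, f2, g = 1.0, 2.0, 3.0

def ds_sweep(u, v, p, alpha):
    """One DS sweep using the reduced scalar systems (30)-(33)."""
    v_half = ((alpha - A2) * v - B2 * p + f2) / alpha
    d = B2 * v + alpha * p - g
    u_half = (alpha * u + f1 - B1 * d / alpha) / (A1 + alpha + B1 * B1 / alpha)
    p_half = (d + B1 * u_half) / alpha
    u_new = ((alpha - A1) * u_half - B1 * p_half + f1) / alpha
    d_half = B1 * u_half + alpha * p_half - g
    v_new = (alpha * v_half + f2 - B2 * d_half / alpha) / (A2 + alpha + B2 * B2 / alpha)
    p_new = (d_half + B2 * v_new) / alpha
    return u_new, v_new, p_new

def residual(u, v, p):
    return max(abs(A1 * u + B1 * p - f1),
               abs(A2 * v + B2 * p - f2),
               abs(-B1 * u - B2 * v + g))

def sweeps_needed(alpha, tol=1e-10, max_sweeps=100_000):
    """Number of sweeps from a zero start until the residual drops below tol."""
    u = v = p = 0.0
    for k in range(1, max_sweeps + 1):
        u, v, p = ds_sweep(u, v, p, alpha)
        if residual(u, v, p) < tol:
            return k
    return max_sweeps

candidates = [0.5, 1.0, 2.0, 4.0]
counts = {alpha: sweeps_needed(alpha) for alpha in candidates}
best_alpha = min(counts, key=counts.get)
print(counts, "-> chosen alpha:", best_alpha)
```

In the setting of the paper, the analogous scan would be run once on a coarse grid with the actual GMRES solver, and the resulting α reused on finer uniform grids.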

Fig. 1. Number of GMRES iterations vs. the value of α, steady Oseen problem, ν = 0.01. Uniform mesh. Panels: (a) 16 × 16 grid; (b) 64 × 64 grid.

Fig. 2. Number of GMRES iterations vs. the value of α, unsteady Oseen problem, ν = 0.01. Uniform mesh. Panels: (a) 16 × 16 grid; (b) 64 × 64 grid.

7. Conclusions and future work

In this paper we have introduced a “dimensional splitting” approach for the solution of saddle point systems in which the (1, 1) block can be partitioned into a two-by-two block diagonal structure. Saddle point systems of this kind arise in a number of applications: in this paper we focused on linear systems arising from the discretization of two-dimensional incompressible flow problems. We have established the convergence of the fixed-point iteration, and investigated experimentally its use as a preconditioner for restarted GMRES on a set of Stokes and Oseen problems (both steady and unsteady) discretized by Q2–Q1 finite elements on both uniform meshes and stretched meshes. The numerical experiments indicate that for uniform meshes, the preconditioner results in fast convergence for a wide range of mesh sizes and Reynolds numbers. Diagonal scaling was found to be essential to achieve robustness for small values of the viscosity. For stretched meshes, on the other hand, we found that diagonal scaling (using the diagonals of the velocity and pressure mass matrices) is absolutely necessary to retain mesh-independent convergence for steady Stokes and Oseen problems. On stretched meshes, however, the performance of the preconditioner deteriorates for small values of the viscosity. This problem occurs with other preconditioners as well.

Future work should include further analysis of the preconditioned iteration, including using Local Fourier Analysis [28,3] for estimating the optimal value of the relaxation parameter α, and extension to the 3D case. We observe here that the basic alternating iteration (11)–(12) is of Peaceman–Rachford type and cannot be directly extended to the case of three splittings. Extension to the 3D case requires the alternating iteration to be of Douglas–Rachford type; see [27, pp. 244–245].
From the viewpoint of implementation, the 3D case necessitates using (inexact) inner iterative solves for the subproblems that occur in the application of the preconditioner. The effect of inexact solves on the performance of the DS preconditioner needs to be investigated. We mention that a promising approach for solving systems of the type (31)–(33) was recently described
in [10]. Another possibility would be to use the scalable algebraic multilevel solvers for scalar elliptic PDEs provided in the state-of-the-art Trilinos software package [20]. Finally, we mention that the method presented in this paper forms the starting point for a new preconditioning scheme, called Relaxed Dimensional Factorization, which is currently under development; see [7].

Acknowledgements

We would like to thank Zhen Wang and Qiang Niu for useful comments and for help with some of the numerical experiments. We are also indebted to Daniel Szyld and two anonymous referees for constructive criticism that resulted in several improvements.

References

[1] P.R. Amestoy, T.A. Davis, I.S. Duff, An approximate minimum degree ordering algorithm, SIAM J. Matrix Anal. Appl. 17 (1996) 886–905.
[2] Z.Z. Bai, G.H. Golub, M.K. Ng, Hermitian and skew-Hermitian splitting methods for non-Hermitian positive definite linear systems, SIAM J. Matrix Anal. Appl. 24 (2003) 603–626.
[3] M. Benzi, M.J. Gander, G.H. Golub, Optimization of the Hermitian and skew-Hermitian splitting iteration for saddle-point problems, BIT Numer. Math. 43 (2003) 881–900.
[4] M. Benzi, G.H. Golub, A preconditioner for generalized saddle point problems, SIAM J. Matrix Anal. Appl. 26 (2004) 20–41.
[5] M. Benzi, G.H. Golub, J. Liesen, Numerical solution of saddle point problems, Acta Numer. 14 (2005) 1–137.
[6] M. Benzi, J. Liu, An efficient solver for the incompressible Navier–Stokes equations in rotation form, SIAM J. Sci. Comput. 29 (2007) 1959–1981.
[7] M. Benzi, M.K. Ng, Q. Niu, Z. Wang, A relaxed dimensional factorization preconditioner for the incompressible Navier–Stokes equations, Technical Report TR-2010-010, Department of Mathematics and Computer Science, Emory University, 2010.
[8] M. Benzi, M.A. Olshanskii, An augmented Lagrangian-based approach to the Oseen problem, SIAM J. Sci. Comput. 28 (2006) 2095–2113.
[9] M. Benzi, D.B. Szyld, Existence and uniqueness of splittings for stationary iterative methods with applications to alternating methods, Numer. Math. 76 (1997) 309–321.
[10] D. Bertaccini, G.H. Golub, S. Serra-Capizzano, Spectral analysis of a preconditioned iterative method for the convection-diffusion equation, SIAM J. Matrix Anal. Appl. 29 (2007) 260–278.
[11] D.C. Brown, Alternating-direction iterative schemes for mixed finite element methods for second order elliptic problems, PhD thesis, Department of Mathematics, University of Chicago, Chicago, IL, 1982.
[12] T.A. Davis, Direct Methods for Sparse Linear Systems, Society for Industrial and Applied Mathematics, Philadelphia, PA, 2006.
[13] J. Douglas Jr., R. Durán, P. Pietra, Alternating-direction iteration for mixed finite element methods, in: Computing Methods in Applied Sciences and Engineering VII, Versailles, France, 1985, Springer-Verlag, New York, 1986, pp. 181–196.
[14] H.C. Elman, Preconditioning for the steady-state Navier–Stokes equations with low viscosity, SIAM J. Sci. Comput. 20 (1999) 1299–1316.
[15] H.C. Elman, Preconditioners for saddle point problems arising in computational fluid dynamics, Appl. Numer. Math. 43 (2002) 75–89.
[16] H.C. Elman, A. Ramage, D.J. Silvester, IFISS: A Matlab toolbox for modelling incompressible flow, ACM Trans. Math. Software 33 (2007), Article 14.
[17] H.C. Elman, D.J. Silvester, A.J. Wathen, Performance and analysis of saddle point preconditioners for the discrete steady-state Navier–Stokes equations, Numer. Math. 90 (2002) 665–688.
[18] H. Elman, D. Silvester, A. Wathen, Finite Elements and Fast Iterative Solvers with Applications in Incompressible Fluid Dynamics, Numer. Math. Sci. Comput., Oxford University Press, Oxford, UK, 2005.
[19] R. Glowinski, Numerical Methods for Fluids (Part 3). Finite Element Methods for Incompressible Viscous Flow, in: P.G. Ciarlet, J.L. Lions (Eds.), Handb. Numer. Anal., vol. 9, Elsevier/North-Holland, Amsterdam, 2003.
[20] M.A. Heroux, R.A. Bartlett, V.E. Howle, R.J. Hoekstra, J.J. Hu, T.G. Kolda, R.B. Lehoucq, K.R. Long, R.P. Pawlowski, E.T. Phipps, A.G. Salinger, H.K. Thornquist, R.S. Tuminaro, J.S. Willenbring, A. Williams, K.S. Stanley, An overview of the Trilinos project, ACM Trans. Math. Software 31 (2005) 397–423.
[21] R.B. Kellogg, Another alternating-direction-implicit method, J. Soc. Ind. Appl. Math. 11 (1963) 976–979.
[22] M.A. Olshanskii, Y.V. Vassilevski, Pressure Schur complement preconditioners for the discrete Oseen problem, SIAM J. Sci. Comput. 29 (2007) 2686–2704.
[23] Y. Saad, A flexible inner-outer preconditioned GMRES algorithm, SIAM J. Sci. Comput. 14 (1993) 461–469.
[24] Y. Saad, M.H. Schultz, GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Stat. Comput. 7 (1986) 856–869.
[25] V. Simoncini, M. Benzi, Spectral properties of the Hermitian and skew-Hermitian splitting preconditioner for saddle point problems, SIAM J. Matrix Anal. Appl. 26 (2004) 377–389.
[26] V. Simoncini, D.B. Szyld, Recent computational developments in Krylov subspace methods for linear systems, Numer. Linear Algebra Appl. 14 (2007) 1–59.
[27] R.S. Varga, Matrix Iterative Analysis, Prentice-Hall, Englewood Cliffs, NJ, 1962.
[28] R. Wienands, W. Joppich, Practical Fourier Analysis for Multigrid Methods, Chapman & Hall, New York, 2005.
[29] D.M. Young, Iterative Solution of Large Linear Systems, Academic Press, New York, NY, 1971.

Applied Numerical Mathematics 61 (2011) 77–91
Iterative adaptive RBF methods for detection of edges in two-dimensional functions

Jae-Hun Jung a,∗, Sigal Gottlieb b, Saeja Oh Kim b

a Department of Mathematics, The State University of New York at Buffalo, Buffalo, NY 14260-2900, United States
b Department of Mathematics, University of Massachusetts Dartmouth, North Dartmouth, MA 02747-2300, United States

Article info

Article history: Received 23 November 2009. Received in revised form 30 June 2010. Accepted 9 August 2010. Available online 20 August 2010.

Keywords: Two-dimensional edge detection; Gibbs phenomenon; Radial basis function methods; Adaptive methods

Abstract

Radial basis functions have gained popularity for many applications including numerical solution of partial differential equations, image processing, and machine learning. For these applications it is useful to have an algorithm which detects edges or sharp gradients and is based on the underlying basis functions. In our previous research, we proposed an iterative adaptive multiquadric radial basis function method for the detection of local jump discontinuities in one-dimensional problems. The iterative edge detection method is based on the observation that the absolute values of the expansion coefficients of multiquadric radial basis function approximation grow exponentially in the presence of a local jump discontinuity with fixed shape parameters but grow only linearly with vanishing shape parameters. The different growth rate allows us to accurately detect edges in the radial basis function approximation. In this work, we extend the one-dimensional iterative edge detection method to two-dimensional problems. We consider two approaches: the dimension-by-dimension technique and the global extension approach. In both cases, we use a rescaling method to avoid ill-conditioning of the interpolation matrix. The global extension approach is less efficient than the dimension-by-dimension approach, but is applicable to truly scattered two-dimensional points, whereas the dimension-by-dimension approach requires tensor product grids. Numerical examples using both approaches demonstrate that the two-dimensional iterative adaptive radial basis function method yields accurate results.

© 2010 IMACS. Published by Elsevier B.V. All rights reserved.

1. Introduction
Radial basis function (RBF) methods have been actively investigated for a broad range of application areas, including machine learning and neural networks [15], image processing and reconstruction [3,13], and numerical approximations of partial differential equations (PDEs) [10,12]. RBF methods work on a wide variety of meshes including scattered points, which makes them attractive for problems on complex geometries and for problems with a large degree of localization, which benefit from irregularly distributed points. The grid points used for the RBF method are called centers. Centers can be distributed arbitrarily based on the given geometry, without any constraints from the method itself. For this reason RBF methods are referred to as meshless methods. Furthermore, the RBF approximation yields fast convergence when the function to be approximated is smooth enough [1,2]. However, when the function is non-smooth, the fast rate of convergence deteriorates significantly unless the centers or the shape parameters of the RBFs are adapted appropriately. Such an adaptive approach [4] is necessary to obtain a stable RBF approximation. To apply the adaptive method, it is important to be able to
∗ Corresponding author. E-mail addresses: [email protected] (J.-H. Jung), [email protected] (S. Gottlieb), [email protected] (S.O. Kim).
0168-9274/$30.00 © 2010 IMACS. Published by Elsevier B.V. All rights reserved. doi:10.1016/j.apnum.2010.08.006
J.-H. Jung et al. / Applied Numerical Mathematics 61 (2011) 77–91
identify the location of the discontinuity or sharp gradient in the RBF approximation. That is, one needs an accurate RBF edge detection algorithm. Several edge detection algorithms for high order methods have been developed. Examples include the Fourier conjugate sum and the enhanced conjugate sum edge detection methods developed by Gelb and Tadmor [6]. These methods are based on the Fourier method and have been applied successfully to many practical problems. They are particularly useful for problems which have Fourier data as an inherent part of the method. For problems which feature RBFs as their basis functions, it is more efficient to exploit the properties of the RBFs to determine the edges. The numerical approximation of discontinuous problems with the RBF method is highly oscillatory near the local jump discontinuities, a feature known as the Gibbs phenomenon [5,8]. Interestingly, the Gibbs phenomenon of the RBF method can be used to provide an accurate and efficient edge detection algorithm [9]. In [9], an iterative adaptive RBF method was proposed for one-dimensional edge detection. It was shown that the proposed method performs well with multiple local jump discontinuities. As explained in the following sections, the iterative adaptive RBF method exploits the different growth/decay rates of the expansion coefficients. At or near the discontinuity, the absolute values of the expansion coefficients attain local maxima and grow exponentially with N, the total number of center points, if the shape parameter is fixed for all N and is greater than a certain value. The magnitudes of the expansion coefficients also decay exponentially away from the local discontinuity. This exponential growth rate changes dramatically into a linear one when the shape parameters at or near the discontinuity are set to vanish.
The success of the iterative adaptive method is due to the finding that the growth rate of the expansion coefficients can change significantly when the shape parameters vary adaptively. This dramatic change in the growth rate makes it easy to determine whether the region considered contains a discontinuity or not. In this way, the iterative RBF method can serve as an efficient edge detection method for RBF approximations. In this paper, we extend the 1D iterative adaptive RBF method to 2D problems, using two approaches: a slice-by-slice approach and a global extension approach. The slice-by-slice approach finds edges in the x-direction first with a given y value, and then finds edges in the y-direction with a given x value. The final edge map is the union of these two edge sets. The advantage of this approach is that it is simple to implement and computationally inexpensive. However, it requires tensor product grids, so that the points may be scattered in one dimension but not in the other. On a truly scattered 2D grid, we require a more sophisticated approach, the global extension method, in which we use the global expansion coefficients of the 2D RBF basis, and follow the same edge detection approach as the 1D method. The maximum absolute values of the expansion coefficients are used to define the edge maps. This approach can be applied to scattered grids, and yields more rigorous results than the slice-by-slice approach because it considers the variations of the 2D function in both the x- and y-directions simultaneously. However, this approach is computationally costly. This paper presents preliminary numerical results which verify that the 2D iterative adaptive RBF method, using both the slice-by-slice and the global extension approaches, yields accurate results on 2D data. The paper has the following structure. In Section 2, a brief explanation of the RBF approximation is given.
In Section 3, both the global extension method and the slice-by-slice method are explained. In Section 4, various numerical examples are given. In Section 5, a brief summary is presented and future research directions are discussed.
2. 1D iterative adaptive RBF methods
2.1. Radial basis function approximation
Radial basis functions are defined using a set of points, known as centers, and a set of corresponding shape parameters. Consider a center set $X$ composed of the center points $x_i$ such that $X = \{x_i \mid x_i \in [-1, 1],\ i = 1, \dots, N\}$, and let $f$ denote the set of function values at the centers, $f = \{f_i \mid f_i = f(x_i) \in \mathbb{R},\ x_i \in X\}$ for the real-valued function $f(x)$. In this work, we use the multiquadric (MQ) RBFs, defined by
$$\phi_j(x) = \sqrt{(x - x_j)^2 + \varepsilon_j^2}, \qquad x \in [-1, 1],\ x_j \in X,$$
where the shape parameters $\varepsilon_j$ may be determined a priori for the approximation or can be adaptively determined depending on the regularity of the given function $f(x)$ [11]. In the special case that the shape parameter vanishes ($\varepsilon_i = 0$) for all the given centers, the corresponding RBF basis function becomes the piecewise linear function $|x - x_i|$ and its first derivative has a jump at $x = x_i$. Although the strength of RBF methods is their ability to handle irregular grid points, in the following discussion we assume for simplicity that we have uniform centers $x_i = -1 + (i - 1)\Delta x$ for all $i = 1, \dots, N$, where $\Delta x = \frac{2}{N - 1}$. The MQ RBF approximation $s_{f,X}(x)$ to the function $f(x)$ is given by the linear combination of MQ RBFs [1,7]:
$$s_{f,X}(x) = \sum_{j=1}^{N} \lambda_j \sqrt{(x - x_j)^2 + \varepsilon_j^2}, \tag{1}$$
where $\lambda_j$ are the expansion coefficients, which are obtained by the interpolation condition $s_{f,X}(x_i) = f(x_i)$, $\forall x_i \in X$, that is,
$$\boldsymbol{\lambda} = \mathbf{M}^{-1} \mathbf{f}, \tag{2}$$
where the interpolation matrix $\mathbf{M}$ has the elements $M_{ij} = \sqrt{(x_i - x_j)^2 + \varepsilon_j^2}$, $\mathbf{f} = (f_1, \dots, f_N)^T$ and $\boldsymbol{\lambda} = (\lambda_1, \dots, \lambda_N)^T$. Unfortunately, it is well known that the above linear system, Eq. (2), is ill-conditioned for large $N$. The RBF approximation in Eq. (1) allows a simple computation of the derivative $s'_{f,X}(x)$, given by the linear combination of the derivatives of the RBF basis functions. At each center $x = x_i$, the derivative is given by
$$s'_{f,X}(x_i) = \sum_{j=1}^{N} \lambda_j \frac{x_i - x_j}{\sqrt{(x_i - x_j)^2 + \varepsilon_j^2}}. \tag{3}$$
Here note that if $\varepsilon_j = 0$ the derivative is not defined at $x_i = x_j$. For the adaptive method used in the following discussion to be well posed, we will let $s'_{f,X}(x_i) = 0$ at $x_i = x_j$ when $\varepsilon_j = 0$. Let $\mathbf{D}$ denote the derivative matrix with elements $D_{ij} = (x_i - x_j)/\sqrt{(x_i - x_j)^2 + \varepsilon_j^2}$ for non-zero $\varepsilon_j$; then the derivative of $s_{f,X}(x)$ at the center points is given by
$$\mathbf{s}' = \mathbf{D} \boldsymbol{\lambda}, \tag{4}$$
where $\mathbf{s}' = (s'_{f,X}(x_1), \dots, s'_{f,X}(x_N))^T$. We note that one has to exercise caution when using the RBF derivative approximation.
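As a rough illustration of Eqs. (1) to (4), the following Python/NumPy sketch builds M and D, solves for the expansion coefficients and evaluates the derivative at the centers. It is written in Python rather than the pseudo MATLAB used later in the paper. The helper name `mq_matrices`, the sample data tanh(2x) and the choice N = 21 are assumptions made for this example only, and the entries of D where the quotient is undefined are simply set to zero, which is one reading of the convention stated above.

```python
import numpy as np

def mq_matrices(x, eps):
    """Build the MQ interpolation matrix M and derivative matrix D.

    M[i, j] = sqrt((x_i - x_j)^2 + eps_j^2)      (matrix in Eq. (2))
    D[i, j] = (x_i - x_j) / M[i, j]              (matrix in Eq. (4))
    Entries of D with M[i, j] == 0 (x_i == x_j and eps_j == 0) are set to 0.
    """
    x = np.asarray(x, dtype=float)
    eps = np.asarray(eps, dtype=float)
    diff = x[:, None] - x[None, :]              # diff[i, j] = x_i - x_j
    M = np.sqrt(diff**2 + eps[None, :]**2)      # column j carries eps_j
    D = np.zeros_like(M)
    np.divide(diff, M, out=D, where=M > 0)      # leave singular entries at 0
    return M, D

N = 21
x = np.linspace(-1.0, 1.0, N)                   # uniform centers on [-1, 1]
eps = np.full(N, 0.1)                           # fixed shape parameter
M, D = mq_matrices(x, eps)

f = np.tanh(2.0 * x)                            # smooth sample data (assumed)
lam = np.linalg.solve(M, f)                     # Eq. (2) without forming M^{-1}
s_prime = D @ lam                               # Eq. (4)

print("max interpolation residual:", np.max(np.abs(M @ lam - f)))
print("s'(0) from Eq. (4):", s_prime[N // 2])
```

Solving the linear system directly, instead of forming the inverse explicitly, is the usual choice for a matrix that becomes ill-conditioned as N grows.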
In general, this operation is not exact: applying the RBF derivative operation $m + 1$ times to a polynomial of degree $m$ does not give zero, because RBFs are not polynomials. For example, if $f(x)$ is a constant function, say $f(x) = 1$, we expect its derivative to be zero. However, the matrices $\mathbf{D}$ and $\mathbf{M}$ are non-singular, so that setting $\mathbf{D}\mathbf{M}^{-1}\mathbf{f} = 0$ yields $f(x) = 0$, which is a contradiction.
2.2. 1D iterative adaptive RBF methods
The iterative adaptive RBF edge detection method uses the difference in the rate of growth or decay of the RBF coefficients. To understand this method, we use three properties of RBF approximations which were observed numerically and discussed in [9]:
Property 1 (Linear growth). Given the sign function $f(x) = \mathrm{sgn}(x)$ with $x \in [-1, 1]$, and the set of centers $X = \{x_i \mid x_i = -1 + (i - 1)\Delta x,\ i = 1, \dots, N\}$ where $\Delta x = \frac{2}{N - 1}$ and $N$ is even, the MQ expansion coefficients are given by
$$\lambda_i = \begin{cases} 0, & i \in I \setminus \{N/2,\ N/2 + 1\}, \\ \dfrac{N - 1}{2}, & i = N/2, \\ -\dfrac{N - 1}{2}, & i = N/2 + 1, \end{cases} \tag{5}$$
if the shape parameters are given by $\varepsilon_i = \varepsilon > 0$, $i \in I \setminus \{N/2,\ N/2 + 1\}$ and $\varepsilon_i = 0$, $i \in \{N/2,\ N/2 + 1\}$, where $I = \{1, \dots, N\}$ (proved in [9]).
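Property 1 can be checked numerically on a small grid. The sketch below, in Python/NumPy, sets the shape parameter to zero at the two centers next to the jump and compares the computed coefficients with Eq. (5). The values N = 16 and ε = 0.1 are illustrative choices, not values prescribed by the property.

```python
import numpy as np

N = 16                                          # even number of centers
x = np.linspace(-1.0, 1.0, N)                   # x_i = -1 + (i - 1) * dx
eps = np.full(N, 0.1)
eps[N // 2 - 1 : N // 2 + 1] = 0.0              # 1-based centers N/2 and N/2 + 1

# Interpolation matrix M[i, j] = sqrt((x_i - x_j)^2 + eps_j^2)
M = np.sqrt((x[:, None] - x[None, :])**2 + eps[None, :]**2)
lam = np.linalg.solve(M, np.sign(x))

expected = np.zeros(N)
expected[N // 2 - 1] = (N - 1) / 2              # lambda_{N/2}
expected[N // 2] = -(N - 1) / 2                 # lambda_{N/2 + 1}
print("max deviation from Eq. (5):", np.max(np.abs(lam - expected)))
```

The reason Eq. (5) works is visible in the code: the two columns with vanishing shape parameter are the piecewise linear functions |x − x_{N/2}| and |x − x_{N/2+1}|, and their scaled difference reproduces sgn(x) exactly at every center.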
Property 2 (Exponential growth). With the same conditions used in Property 1, except for a uniform shape parameter $\varepsilon_i = \varepsilon > 0$, $\forall i \in I$, it was observed in [9] that for some $\varepsilon$ the maximum absolute value of the expansion coefficients grows exponentially with the number of centers,
$$\max_{i \in I} |\lambda_i| \approx q^N, \tag{6}$$
where the constant $q > 1$ is independent of $N$.
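The growth described in Property 2 can be observed with a few lines of Python/NumPy. This is only a sketch: the helper name `max_coefficient` and the sizes N = 8, 16, 32, 64 are assumptions for illustration, and the printed values are not meant to reproduce any particular constant q.

```python
import numpy as np

def max_coefficient(N, eps=0.1):
    """max_i |lambda_i| for f(x) = sgn(x) with a uniform shape parameter eps."""
    x = np.linspace(-1.0, 1.0, N)
    M = np.sqrt((x[:, None] - x[None, :])**2 + eps**2)
    lam = np.linalg.solve(M, np.sign(x))
    return np.max(np.abs(lam))

sizes = [8, 16, 32, 64]
maxima = [max_coefficient(N) for N in sizes]
for N, m in zip(sizes, maxima):
    print(f"N = {N:3d}   max|lambda_i| = {m:.3e}")
```

Comparing these values with (N − 1)/2 from Property 1 shows the contrast between the fixed and the vanishing shape parameter that the edge detection method relies on.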
Property 3 (Multiple edges). Let $f(x) = \sum_{i=1}^{N_s} \alpha_i(x) H_{c_i}(x)$ with $c_i \in \Omega = (-1, 1)$. Here $H_{c_i}(x)$ is the Heaviside function whose jump is located at $x = c_i$, and the $\alpha_i(x)$ are analytic functions in $\Omega$ which are permitted to be constant. The function $f(x)$ contains at most $N_s$ discontinuities, due to the singular feature of the Heaviside functions $H_{c_i}$. Properties 1 and 2 yield Property 3: the local maxima of $|\lambda_i|$ occur at the centers where the $c_i$ are located (see [9]).
These three properties are used to define the iterative adaptive RBF edge detection method, which is based on the growth/decay properties of the expansion coefficients $\lambda_i$. A coefficient $\lambda_i$ whose absolute value is larger than a certain tolerance level suggests a discontinuity near $x_i$. The 1D iterative method defines a concentration map $C$ as
$$C = \{ C_i \mid C_i = |\lambda_i|,\ x_i \in X,\ i \in I \}. \tag{7}$$
An edge lies near a center point where the corresponding element of $C$ is large. The higher the contrast between the largest element of $C$ and the others, the easier it is to determine the edge location. In fact, the concentration map itself represents the edges as a function on the centers. To improve the method and obtain sharper edges, we can incorporate derivative information as well [9] to define the concentration map $C$ for $s'_{f,X}(x_i) \neq 0$:
Table 1
Pseudo MATLAB code for 1D RBF edge detection with ε = 0.1 and η = 1/2.

    function [edge] = EdgeMap(f, M, D)
    N = length(f);
    ε = 0.1 · ones(N, 1);
    edge = zeros(N, 1);
    λ_previous = zeros(N, 1);
    η = 1/2;
    Residue = 1;
    while Residue ≥ 10^(−10)
        λ = M^(−1) f;
        f′ = D · λ;
        C = |f′||λ| / max(|f′||λ|);
        for i = 1 : N
            if C(i) ≥ η
                edge(i) = 1;
                ε(i) = 0;
            end
        end
        Update M and D;
        Residue = ‖λ − λ_previous‖;
        λ_previous = λ;
    end
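$$C = \{ C_i \mid C_i = |\lambda_i\, s'_{f,X}(x_i)|,\ x_i \in X,\ i \in I \}, \tag{8}$$
where $C_i$ is equivalent to $C_i = |\lambda_i \sum_j D_{ij} \lambda_j|$. We then define the normalized concentration map $\hat{C}$ as
$$\hat{C} = \{ \hat{C}_i \mid \hat{C}_i = C_i / \max_{i \in I} C_i,\ i \in I \}. \tag{9}$$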
Note that, for Eq. (7), $C_i = 0$ for all $i$ if and only if $f(x_i) = 0$ at every $x_i \in X$. In that case the problem is trivial and the edge detection method is not applied. For Eq. (9), we need that if not every $f(x_i)$ vanishes, there exists at least one non-vanishing element $C_i$ of Eq. (8); then the normalized concentration map is well defined. Otherwise the problem becomes trivial. The concentration map reflects the smoothness of the function and indicates the possible location of the local jump. We define the edge set $S$ such that
$$S = \{ S_i \mid S_i = x_i \text{ for } \hat{C}_i \ge \eta,\ x_i \in X,\ i \in I \}, \tag{10}$$
where $0 < \eta < 1$ is the tolerance level given a priori. Since the expansion coefficients decay exponentially away from the local jump, $\eta = 1/2$ is a reasonable value for the tolerance level.
Iterative adaptive RBF methods: The iterative method first detects the edge set $S$ based on the expansion coefficients $\lambda_i$ with the fixed shape parameters $\varepsilon_i$ (Property 2). Then the adaptive method is applied at the center points in $S$. According to Property 1, the expansion coefficients at the centers $x_i$ corresponding to $S_i$ then become algebraically small. If new expansion coefficients are computed after the adaptation of the shape parameters, the maximum of the expansion coefficients occurs at the second jump discontinuity. By repeating this procedure, all the possible jumps are detected (Property 3). A pseudo MATLAB code for this algorithm is given in Table 1. The pseudo code generates a new function edge at the centers $x_i$ whose values are either 0 or 1. The value 1 indicates that a possible edge is located on or near the corresponding center point.
3. Iterative adaptive RBF method for 2D edges
To extend the 1D RBF iterative edge detection method to 2D problems we can adopt two possible approaches. The first approach is the global extension method. The global extension method relies on the 2D RBF basis functions, and uses the 2D global expansion coefficients. The second approach is the dimension-by-dimension or slice-by-slice application of the 1D RBF edge detection method. The slice-by-slice method treats each dimension separately and the final edges are obtained by taking the union of the edges obtained from each dimension. The slice-by-slice method requires a more structured 2D grid, or a 2D reconstruction from an arbitrary grid, in order to define the slices, whereas the global method has no such requirement on the grid.
The global method, however, suffers from ill-conditioning due to the large size of the interpolation matrix and requires us to separate the problem into multiple subdomain problems and to search for the edges in each subdomain. The second approach reduces this complication, but can only be applied on tensor product grids. The two approaches are described below.
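Both approaches reuse the 1D routine of Table 1, so it may help to see that routine as runnable code first. The following is a minimal Python/NumPy sketch of the pseudo code, not the authors' implementation. The names `mq_matrices` and `edge_map_1d`, the iteration cap `max_iter`, and the convention that the undefined entries of D are set to zero are assumptions of this sketch.

```python
import numpy as np

def mq_matrices(x, eps):
    """M[i, j] = sqrt((x_i - x_j)^2 + eps_j^2) and D[i, j] = (x_i - x_j) / M[i, j]."""
    diff = x[:, None] - x[None, :]
    M = np.sqrt(diff**2 + eps[None, :]**2)
    D = np.zeros_like(M)
    np.divide(diff, M, out=D, where=M > 0)      # assumed convention: 0 where M == 0
    return M, D

def edge_map_1d(x, f, eps0=0.1, eta=0.5, tol=1e-10, max_iter=50):
    """Iterative adaptive MQ RBF edge detection following Table 1.

    Returns a boolean array that is True at centers flagged as edges.
    max_iter is a safety cap that is not part of Table 1.
    """
    x = np.asarray(x, dtype=float)
    f = np.asarray(f, dtype=float)
    eps = np.full(x.size, eps0)
    edge = np.zeros(x.size, dtype=bool)
    lam_prev = np.zeros(x.size)
    for _ in range(max_iter):
        M, D = mq_matrices(x, eps)              # "Update M and D"
        lam = np.linalg.solve(M, f)             # lambda = M^{-1} f
        C = np.abs(lam * (D @ lam))             # Eq. (8)
        if C.max() == 0.0:                      # trivial data, nothing to detect
            break
        flagged = C / C.max() >= eta            # Eqs. (9) and (10)
        edge |= flagged
        eps[flagged] = 0.0                      # adapt shape parameters at edges
        if np.linalg.norm(lam - lam_prev) < tol:
            break
        lam_prev = lam
    return edge

x = np.linspace(-1.0, 1.0, 16)
edges = edge_map_1d(x, np.sign(x))
print("edge centers:", x[edges])
```

For the sign function on 16 uniform centers the loop stops after the coefficients settle at the values of Eq. (5), with the two centers adjacent to the jump flagged.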
3.1. 2D global extension method
The 2D global approach uses the global two-dimensional MQ RBFs based on a distribution of points $\{x_i\}$ and $\{y_j\}$ in 2D, and the basis functions
$$\phi_{i,j}(x, y) = \sqrt{(x - x_i)^2 + (y - y_j)^2 + \varepsilon_{ij}^2}.$$
The 2D MQ RBF approximation $s_{f,Z}(x, y)$ is given by
$$s_{f,Z}(x, y) = \sum_{i=1}^{N_x} \sum_{j=1}^{N_y} \lambda_{ij} \sqrt{(x - x_i)^2 + (y - y_j)^2 + \varepsilon_{ij}^2}, \tag{11}$$
where $Z$ is the set of the 2D centers, $\lambda_{ij}$ are the expansion coefficients, $\varepsilon_{ij}$ are the shape parameters, and $N_x$ and $N_y$ are the numbers of centers in the x- and y-directions, respectively. Suppose that the real-valued function $f(x, y)$ is defined at the 2D center points $(x_i, y_j)$. Let the expansion coefficient vector be
$$\Lambda = (\lambda_{11}, \lambda_{12}, \dots, \lambda_{N_x N_y - 1}, \lambda_{N_x N_y})^T,$$
the function vector
$$\mathbf{f} = \big( f(x_1, y_1), f(x_1, y_2), \dots, f(x_{N_x}, y_{N_y - 1}), f(x_{N_x}, y_{N_y}) \big)^T,$$
and the interpolation matrix $\mathbf{M}$
$$M_{(ij)(kl)} = \sqrt{(x_i - x_k)^2 + (y_j - y_l)^2 + \varepsilon_{kl}^2}, \tag{12}$$
where $(ij) = (i - 1) N_y + j$ and $(kl) = (k - 1) N_y + l$ with $i, k = 1, \dots, N_x$ and $j, l = 1, \dots, N_y$. With these notations the 2D expansion coefficients are found by
$$\Lambda = \mathbf{M}^{-1} \mathbf{f}. \tag{13}$$
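The index map $(ij) = (i - 1) N_y + j$ is the step that is easiest to get wrong in an implementation. The Python/NumPy sketch below assembles the matrix of Eq. (12) on a tensor grid and solves Eq. (13). The function name `mq_matrix_2d`, the grid sizes and the sample data are assumptions for illustration. With 0-based indices the flat index becomes `i * Ny + j`, which is what NumPy's default (row-major) `ravel` produces.

```python
import numpy as np

def mq_matrix_2d(xs, ys, eps):
    """Interpolation matrix of Eq. (12) on the tensor grid xs x ys.

    eps has shape (Nx, Ny). Point (x_i, y_j) gets the 0-based flat index
    i * Ny + j, the counterpart of the 1-based (i - 1) * N_y + j in the text.
    """
    X, Y = np.meshgrid(xs, ys, indexing="ij")   # X[i, j] = x_i, Y[i, j] = y_j
    px, py = X.ravel(), Y.ravel()               # C order gives i * Ny + j
    e = np.asarray(eps, dtype=float).ravel()
    dx = px[:, None] - px[None, :]
    dy = py[:, None] - py[None, :]
    return np.sqrt(dx**2 + dy**2 + e[None, :]**2)

Nx, Ny = 5, 4
xs = np.linspace(-1.0, 1.0, Nx)
ys = np.linspace(-1.0, 1.0, Ny)
eps = np.full((Nx, Ny), 0.1)
M = mq_matrix_2d(xs, ys, eps)

F = np.cos(xs)[:, None] * np.sin(ys)[None, :]   # sample data f(x_i, y_j) (assumed)
Lam = np.linalg.solve(M, F.ravel())             # Eq. (13)
print(M.shape, np.max(np.abs(M @ Lam - F.ravel())))
```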
The size of the interpolation matrix $\mathbf{M}$ is $(N_x N_y) \times (N_x N_y)$. The complexity of the direct method for the interpolation is about $O(N_x^3 N_y^3)$. If $N_x = N_y = N$, then it becomes $O(N^6)$. This makes the 2D global extension method costly. For this reason, a domain decomposition method should be considered to implement the global approach if $N_x$ and $N_y$ are large [8]. Similar ideas have been proposed such as the RBF domain decomposition techniques for elliptic PDEs found in [16]. To understand the domain decomposition technique, consider the 2D MQ RBFs, where the center set $Z$ as in Eq. (11) is denoted by
$$Z = \{ (x_i, y_j) \mid x_i \in [-1, 1],\ y_j \in [-1, 1],\ i \in I_x,\ j \in I_y \}$$
with $I_x = \{ i \mid i = 1, \dots, N_x,\ x \in \mathbb{Z}^{*} \}$ and $I_y = \{ j \mid j = 1, \dots, N_y,\ y \in \mathbb{Z}^{*} \}$. For the domain decomposition, the given domain $\Omega$ is split into $L$ subdomains $\Omega_i$, $i = 1, \dots, L$, where $\Omega = \bigcup_{i=1}^{L} \Omega_i$, and for some $i \neq j$, $i, j = 1, \dots, L$, $\Omega_i \cap \Omega_j$ can be a non-empty set, that is, overlapping subdomains are allowed. We also decompose a given function $f(x, y)$ into $f_i(x, y)$, $(x, y) \in \Omega_i$, where $f(x, y) = \bigcup_{i=1}^{L} f_i(x, y)$, $(x, y) \in \Omega_i$. Once this domain decomposition is performed, it is as though the method is performed on smaller domains. This approach has the advantage of creating smaller systems that are more efficient to solve and better conditioned. Whether using one domain or multiple smaller domains, the next step is to apply a procedure similar to the 1D method. We assume that the behavior of the 2D expansion coefficients in Eq. (13) is similar to that of the 1D coefficients. However, we observed that in the global 2D case the use of the derivative (or, more accurately, the gradient) does not improve the concentration map, and is a costly procedure which makes the 2D edge detection less efficient. Thus, we return to the original definition of the concentration map which relies only on the expansion coefficients. Next, the critical step is to make the shape parameters vanish at or near the local discontinuity. Table 2 shows the pseudo MATLAB code for the 2D RBF method. The overall algorithm is the same as in 1D. The input variable is a 2D array $f$. The 2D array is converted into a 1D string, and the same 1D algorithm is then applied. In the subroutine EdgeMap, the construction procedure for $\mathbf{M}$ should be modified to correspond to the 1D string made of $f$. It is interesting to note that with the vanishing shape parameter, the 2D RBF also has a jump discontinuity in its derivative in both the x- and y-directions, as in the 1D case. Fig. 1 shows the 2D RBF function with the vanishing shape parameter (left) and a non-zero shape parameter (right).
The only difference is that the 1D RBF with vanishing shape parameter is a piecewise linear function, whereas the 2D RBF with vanishing shape parameter, $\varepsilon_{ij} = 0$, is not linear unless one of the two directions is fixed, as shown in Fig. 1.
Fig. 1. The 2D MQ RBF. Left: With the vanishing shape parameter. Right: With the non-zero shape parameter, ε = 0.1.
Table 2
A pseudo MATLAB code for the 2D RBF global edge detection.

    Input a 2D array f;
    Convert f into a 1D string g;
    Construct the corresponding M;
    edge = zeros(N_x N_y, 1);
    [edge] = EdgeMap(g, M);
    Convert edge into a 2D array;
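The steps of Table 2 can be sketched in Python/NumPy as follows. This is a hedged illustration, not the authors' code: the function name `global_edge_map_2d`, the iteration cap `max_iter` and the small test image are assumptions. Following the discussion in Sections 3.1 and 4.1, the sketch uses only the magnitudes of the expansion coefficients in the concentration map, applies the threshold of Eq. (15), and returns the edges flagged at the last iteration.

```python
import numpy as np

def global_edge_map_2d(F, xs, ys, eps0=0.1, eta=0.5, tol=1e-8, max_iter=100):
    """2D global edge detection in the spirit of Table 2.

    F[i, j] = f(x_i, y_j). The concentration map uses |lambda| only, and the
    returned map holds the edges flagged at the last iteration, as Section 4.1
    reports. max_iter is a safety cap added for this sketch.
    """
    F = np.asarray(F, dtype=float)
    X, Y = np.meshgrid(xs, ys, indexing="ij")
    px, py, g = X.ravel(), Y.ravel(), F.ravel()     # 2D array -> 1D vector
    r2 = (px[:, None] - px[None, :])**2 + (py[:, None] - py[None, :])**2
    eps = np.full(g.size, eps0)
    lam_prev = np.zeros(g.size)
    flagged = np.zeros(g.size, dtype=bool)
    for _ in range(max_iter):
        M = np.sqrt(r2 + eps[None, :]**2)           # Eq. (12)
        lam = np.linalg.solve(M, g)                 # Eq. (13)
        C = np.abs(lam)
        if C.max() == 0.0:                          # trivial data
            break
        flagged = C / C.max() >= eta                # criterion of Eq. (15)
        eps[flagged] = 0.0                          # vanish shape parameters
        if np.linalg.norm(lam - lam_prev) <= tol:
            break
        lam_prev = lam
    return flagged.reshape(F.shape)                 # 1D vector -> 2D array

n = 10
xs = ys = np.linspace(-1.0, 1.0, n)
img = np.zeros((n, n))
img[3:7, 3:7] = 1.0                                 # small square test image
edges = global_edge_map_2d(img, xs, ys)
print(edges.astype(int))
```

Because every iteration solves a dense system of size $N_x N_y$, this sketch is only practical for small arrays, which is the cost issue the domain decomposition is meant to address.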
3.2. Edge detection by slice-by-slice
The global extension approach in the previous section is the true 2D generalization of the 1D method. However, it is a costly approach, requiring the inversion of a much larger matrix. The slice-by-slice approach is a straightforward extension of the 1D RBF edge detection method in a dimension-by-dimension manner, with the edge map determined by the union of the edge maps obtained from the x- and y-directions. This approach is straightforward, less costly, and easy to implement, as long as the 2D points are given on a tensor product grid. The numerical examples in the next section also verify that this approach yields accurate results. Each array has an arbitrary domain size in the x- and y-directions, and the major concern is the ill-conditioning of the interpolation matrix. One approach that resolves this issue is the rescaling of the domain.
3.2.1. Ill-conditioning and rescaling domain intervals
To combat the exponential growth with $N$ of the condition number $\kappa(\mathbf{M})$ of the MQ RBF interpolation matrix $\mathbf{M}$, we first rescale the domain interval for the interpolation matrix. For example, when dealing with 2D images, the domain intervals of the x- and y-directions can be arbitrarily chosen for the reconstruction. If we use a fixed domain interval for any size of array, e.g. $x \in [-1, 1]$ and $y \in [-1, 1]$, and a fixed shape parameter $\varepsilon_i$ (e.g. $\varepsilon_i = 0.1$ as used for the 1D iterative edge detection method), the interpolation matrix becomes highly ill-conditioned as $N$ becomes large. This ill-conditioning can be reduced by simply stretching the domain, which suggests that the domain size for the edge detection should be determined as a function of the given number of grid points in each direction and the initial domain interval. Numerical studies allow us to investigate the optimal size of the domain for a given number of center points. Consider a simple sign function, $f(x) = \mathrm{sgn}(x)$, in the x-dimension.
We seek the best reconstruction domain $\Omega^* = [-x_c, x_c]$ such that, with the given even $N$ and $X$, the edge set $S$ is accurately detected as the best edge set $S^* = \{x_{N/2}, x_{N/2+1}\}$ for the uniform distribution. Let $x_c^{\min}$ be the minimum domain size needed for the best domain $\Omega^*$ for the best edge set $S^*$. For the numerical experiments, the shape parameter is fixed for every value of $x_c$ and $N$. Figs. 2 and 3 show the numerical experiments on the relation between $x_c$ and the best edge set. For the numerical computation, the MATLAB "\" operation is used with machine accuracy of about $10^{-16}$. Fig. 2 shows the edge sets versus $N$ with different $x_c$. From left to right, top to bottom, $x_c = 1.0, 1.12, 1.24, 1.36, 1.48, 1.6$, respectively. The figure shows that the maximum $N$ that allows the best edge set increases if $x_c$ increases. For $x_c = 1.0$, the maximum $N$ for the best edge set is $N \sim 200$. For $N \gtrsim 200$, many wrong edges are detected due to the ill-conditioning of the interpolation matrix $\mathbf{M}$. Fig. 3 shows the instability of the interpolation matrix on the $x_c$–$N$ plane. The left figure shows the stable and unstable regions. The stable region is where the best edge set $S^*$ is detected and is given in blue, while the unstable region is where false edges are detected. It is interesting that there are blue spots even in the unstable region. The right figure shows the contour plot of the total number of elements of the edge set. As the figure shows, the number of elements of the detected edge set increases rapidly after a certain value of $N$ for a fixed $x_c$. Fig. 4 shows the histogram of the total number of elements of the detected edge set, $\#(S)$, versus $N$ for several values of $x_c = 1, 1.303, 1.4545, 1.6$ in logarithmic scale with base 2. In the figure, the value of 1 on the y-axis means that two edge elements are found. Note that the best edge set is composed of only two elements.
The figure shows that for large values of $N$ more than two edge elements are found, due to the ill-conditioning of the interpolation matrix beyond a certain value of $N$, say $N^*$. Also note that some large values of $N$ yield the best edge map although $N > N^*$. This is confirmed by the blue stable dots in the unstable region clearly shown in the left figure of Fig. 3.
Fig. 2. Edge sets detected with the MQ RBF iterative method versus N for different domain intervals xc, xc = 1.0, 1.12, 1.24, 1.36, 1.48, 1.6. A uniform shape parameter ε = 0.1 is used for every N and xc.
Fig. 3. Left: Stability of the iterative method for the fixed shape parameter ε = 0.1 on the xc–N plane. Stable region (blue area) is where the best edge set is obtained. Right: The contour plot of the total number of elements of the edge set. (For interpretation of the reference to color in this figure legend, the reader is referred to the web version of this article.)
Based on these numerical experiments, we can roughly estimate the relation between $x_c^{\min}$ and $N$ in the range of at least $N \in [100, 340]$ as
$$x_c^{\min} \approx \frac{6}{1000} N - 0.2. \tag{14}$$
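Eq. (14) is simple enough to wrap in a small helper. The Python sketch below evaluates the estimate for two array sizes used elsewhere in the paper. The function names are hypothetical, and the rule of never shrinking the interval below the original half-width is an assumption of this sketch rather than something stated in the text. The fit is only reported for N between 100 and 340.

```python
def min_half_width(n):
    """Empirical estimate of x_c^min from Eq. (14), fitted for N in [100, 340]."""
    return 6.0 / 1000.0 * n - 0.2

def rescaled_half_width(n, base=1.0):
    """Half-width x_c for n centers; never shrinks below the original half-width.

    Keeping at least `base` is an assumption of this sketch, not a rule from the text.
    """
    return max(base, min_half_width(n))

for n in (128, 256):
    print(n, round(min_half_width(n), 3), rescaled_half_width(n))
```

For N = 128 the estimate stays below 1, so the interval [−1, 1] is kept, while for N = 256 it gives about 1.34, in line with the values quoted in Sections 3.2.2 and 4.2.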
Using these values, we can rescale the domain to enhance stability.
3.2.2. 1D domain decomposition methods
Besides the rescaling method, one can also use the domain decomposition method for large $N$ in one direction when the slice-by-slice method is applied. For example, we know that $\Omega^* = [-1, 1]$ for $N = 128$, according to our 1D numerical experiments, Eq. (14). If we have $N > 128$, then we decompose the domain into smaller subdomains, either overlapping or non-overlapping. The final edge map is the union of the edge maps from each subdomain.
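The decomposition and the final union can be sketched independently of the detector that runs on each subdomain. In the Python/NumPy example below, `subdomain_slices` and `decomposed_edges` are hypothetical helper names, the subdomain size 128 follows the example in the text, and the overlap of 16 points is an arbitrary illustrative choice. The `jump_detector` function is only a stand-in so that the example runs on its own. It is not the RBF method, and in practice the 1D routine of Table 1 would be passed in its place.

```python
import numpy as np

def subdomain_slices(n, size, overlap):
    """Index ranges [start, stop) that cover 0..n-1 with the given overlap."""
    step = size - overlap
    return [(s, min(s + size, n)) for s in range(0, max(n - overlap, 1), step)]

def decomposed_edges(x, f, detect, size=128, overlap=16):
    """Union of the edge maps returned by `detect` on each subdomain."""
    edge = np.zeros(len(f), dtype=bool)
    for start, stop in subdomain_slices(len(f), size, overlap):
        edge[start:stop] |= detect(x[start:stop], f[start:stop])
    return edge

def jump_detector(xs, fs):
    """Stand-in detector for this demo only (not the RBF method)."""
    return np.abs(np.gradient(fs)) > 0.5

x = np.linspace(-1.0, 1.0, 300)
edges = decomposed_edges(x, np.sign(x), jump_detector)
print(subdomain_slices(300, 128, 16), np.flatnonzero(edges))
```

Overlapping subdomains reduce the chance that a jump sits exactly on a subdomain boundary, at the price of solving slightly more small systems.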
Fig. 4. Number of elements in the edge set S versus N for different xc = 1.0, 1.303, 1.4545, 1.6 in logarithmic scale based on base 2.
Fig. 5. Broadening effect. Left: The original square image with 40 × 40 resolution. Right: Edges detected during the whole iteration.
4. Numerical results
4.1. Global extension method
In this section, we provide several simple numerical examples of the 2D iterative global edge detection method. For this method, we assume that $N_x = N_y = N$, so the interpolation matrix has a size of $N^2 \times N^2$. The complexity is, at worst, about $O(N^6)$. The 2D global edge detection process is different from the 1D case. For the 1D case, the edges are the union of the edges detected at each iteration step. For the 2D case, however, our numerical results show that the final edges should be those found at the last iteration step. If all the edges from each iteration step are collected as the final edge set, the final edge map exhibits a broadening effect: the edge set is not sharp but contains broad neighborhoods of the real edges. Fig. 5 shows this broadening effect. For the numerical example, a square box inside a 40 × 40 array is used. The left figure shows the original square image and the right figure shows all the edges detected from every iteration step. For the edge criterion, we use
$$|\lambda_{ij}| / \max_{ij} |\lambda_{ij}| \ge \eta = \frac{1}{2}. \tag{15}$$
Fig. 6. Final edge map on 60 × 60 array. Left: The original square image. Right: The edge map detected at the last iteration step.
Fig. 7. Final edge map. Left: The original circle image. Right: The edge map detected at the last iteration step.
To stop the iteration, we require that the difference between the current expansion coefficients and those from the previous iteration step be small enough in a certain norm. That is, the iteration stops if
$$\| \lambda^{i+1} - \lambda^{i} \| \le \varepsilon_t,$$
where $\lambda^{i+1}$ and $\lambda^{i}$ are the coefficient vectors at the $(i + 1)$th and $i$th iteration steps and $\varepsilon_t$ is the tolerance level, for which $\varepsilon_t = 10^{-8}$ was used for the numerical experiments. Here $\| \cdot \|$ denotes the vector norm and we use the vector 2-norm $\| \cdot \|_2$ for the numerical experiments. In the right figure of Fig. 5 all the edges detected at each step are plotted. As a result, the broadening effect is clearly seen in the edge map. Figs. 6 and 7 show the edges detected at the final iteration step for the square box image (Fig. 6) and the circle image (Fig. 7). In each, the left figure shows the original image and the right figure shows the edge map obtained with the RBF method. For the numerical experiments, the same condition, Eq. (15), was used. Compared with Fig. 5, we observe that a sharper image is obtained when the edge map is constructed based on the edges found at the last iteration step only. To explain the broadening effect, we show the edge map at each intermediate iteration step in Figs. 8 and 9 for the square and circle images, respectively. As shown in the figures, the 2D global RBF edge detection method yields interesting intermediate patterns. Unlike the 1D case, in which each iteration step does not depend on the previous iteration, each iteration of the 2D method depends on the previous edge detection result. For the 1D case, once the edges are detected, the corresponding shape parameters are set to vanish and the image is smoothed at and near these edges. Once smoothed, the detected edges are no longer edges and the next iteration only detects the next edges. In this way, the smoothing does not affect the next edge detection iteration. This is because the 1D case has only one direction of gradient. For the 2D case, however, after the shape parameters are set to vanish, the smoothed image may cause a new edge. The reconstructed image is smoothed near the detected edges, but possibly in a particular direction only, which creates a new edge in the reconstruction.
Thus the next iteration is affected by the wrong edges generated in the previous iteration. For example, in Fig. 6, the 2D method detects edges at the 4 corners of the square at the first iteration. We observed that the edges at the next iteration center around those 4 edges from the first iteration. In Fig. 7, we also observe that the edges at the second iteration are found along the alignment of edges from the first iteration. Thus the 2D global method finds and generates wrong edges at each iteration, and the collection of all these false edges yields the broadening effect. Despite the wrong edge detection during the intermediate iteration steps, we also observe that the edges detected at the last iteration are close to the real edges.
Fig. 8. Edge map for the square image at each iteration with the first iteration through the last iteration from left top to right bottom.
The 2D global method captures edges satisfactorily at the last iteration, including fine features in the test images, but is costly and requires additional techniques to enhance its efficiency. The full development and testing of this approach is left to future work.
4.2. Dimension by dimension method
In this section, we provide numerical examples of the slice-by-slice approach. First, we repeat the simple image used in the previous section, a square of size 10 × 10 pixels on a 40 × 40 pixel domain. Although this is a simple image, the problem itself has low resolution (the image itself is very small) and so is challenging from the standpoint of edge detection. Fig. 10 shows the edge detections applied to the x- and y-slices and the final figure, which is the union of the two edge maps from the x- and y-directions, from left to right, respectively. The method easily and efficiently identifies the edges in each slice, and this procedure produces a reliable edge image. We also note that if a discontinuity does not exist inside the sliced domain, the domain boundaries are included in the edge map. The boundary edges are excluded from the edge map otherwise. The slice-by-slice approach is inexpensive and can handle larger data sets efficiently, which allows us to consider more difficult problems without applying the 2D domain decomposition techniques. For the second numerical example, we consider the Shepp–Logan phantom image. The Shepp–Logan phantom image is the simplified version of the human brain phantom composed of ten ellipses. Each ellipse has different major and minor axes with a different rotation angle and is filled by a single gray level. That is, the 2D Shepp–Logan phantom image is a 2D piecewise constant function. To generate the Shepp–Logan phantom image, we use the MATLAB intrinsic command
P = phantom(‘Modified Shepp–Logan’, N ), where P is the two-dimensional square array with the dimension of N × N. For the numerical example, N = 256 is used. We use two different domain intervals xc , i.e. xc = 1 and xc = 3. Fig. 11 shows the Shepp–Logan phantom image with N = 256. The image was created using the MATLAB command imagesc. Fig. 12 shows edges detected by the iterative RBF method. The figure shows the edges in x- and y-directions (left and middle columns, respectively) and the addition of edges (right column). The top row figure shows the results with xc = 1 and the bottom with xc = 3. As shown in the figure, the RBF
Fig. 9. Edge map for the circle image at each iteration with the first iteration through the last iteration from left top to right bottom.
Fig. 10. The x-slice edges detected (left), y-slice edges detected (middle), and the union of the two (right).
method with xc = 1 detects many false edges in addition to the real ones, due to round-off errors. We adopt the formula derived in Eq. (14) in the previous section, which suggests that xc > 1.34. We used xc = 3, and the bottom row of Fig. 12 shows the results. The RBF method with xc = 3 properly detects the real edges without the false edges found in the results with xc = 1.
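The slicing and the union of the two directional edge maps can be sketched in a few lines. This is only an illustration under stated assumptions: `detect_edges_1d` is a hypothetical stand-in (a thresholded first difference) used in place of the iterative MQ-RBF detector, the names `edge_map_by_slices`, `square_image` and `threshold` are introduced here, and the rule about including domain boundaries for slices without a discontinuity is not modeled.

```python
import numpy as np

def detect_edges_1d(f, threshold=0.5):
    """Hypothetical stand-in for the 1D detector: flag index i when
    |f[i+1] - f[i]| exceeds the threshold (NOT the iterative MQ-RBF method)."""
    f = np.asarray(f, dtype=float)
    edges = np.zeros(f.shape, dtype=bool)
    edges[:-1] = np.abs(np.diff(f)) > threshold
    return edges

def edge_map_by_slices(image, detector=detect_edges_1d):
    """Apply a 1D detector to every x-slice and y-slice and take the union."""
    edges_x = np.array([detector(row) for row in image])        # slices along x (rows)
    edges_y = np.array([detector(col) for col in image.T]).T    # slices along y (columns)
    return edges_x, edges_y, edges_x | edges_y

def square_image(size=40, side=10):
    """A square of ones (side x side) centered on a size x size domain of zeros."""
    image = np.zeros((size, size))
    start = (size - side) // 2
    image[start:start + side, start:start + side] = 1.0
    return image

edges_x, edges_y, edges = edge_map_by_slices(square_image())
```

Any 1D detector with the same signature (a slice in, a boolean mask out) can be passed as `detector`, which is the point of the dimension-by-dimension design: the 2D edge map only requires 1D solves.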
Fig. 11. Shepp–Logan phantom image for N = 256.
Fig. 12. Edges detected by the iterative RBF method in x-direction, y-direction and their addition from left to right, respectively. Top: Edges with xc = 1. Bottom: Edges with xc = 3.
The following numerical examples take simple sample images of a flower and a maple leaf with N = 125. These images have very clear, naturally defined edges, and the edge detection algorithm easily identifies the edges, thus recreating the images quite clearly (Figs. 13 and 14).

In our final examples, we use standard images taken from the USC Signal and Image Processing Institute database [14]. The first two images are a picture of a clock and an aerial photo of an airplane, both with N = 256. The edge detection method, with a rescaling of xc = 5, reliably captures the edges in these images, as seen in Figs. 15 and 16. Finally, we study a more complicated image, known as a resolution test (N = 256). The edge detection method successfully identifies these edges as well (Fig. 17). The performance of the RBF edge detection method is better for the resolution test image than for the clock and aerial images.

5. Conclusions

In this paper, we extended the previously developed 1D iterative adaptive MQ RBF edge detection method to 2D problems based on the global extension and the dimension-by-dimension approaches. In the dimension-by-dimension approach, we obtain the edge map by collecting the edges found in each direction. To obtain the best edge map and improve the conditioning of the linear system, we rescale the domain interval in each direction. Several numerical examples are given to demonstrate that the dimension-by-dimension multiquadric iterative edge detection approach yields accurate edge maps. Future work will focus on the development of RBF edge detection
Fig. 13. Left: Flower image. Right: Edge map.
Fig. 14. Left: Maple image. Right: Edge map.
Fig. 15. Left: Clock image. Right: Edge map of the image.
Fig. 16. Left: Aerial photograph of a plane on a runway. Right: Edge map of the image.
Fig. 17. Left: Resolution test image. Right: Edge map of the image.
methods based on other RBF bases, including Gaussians and Wendland functions, which are localized and thus lead to sparse interpolation matrices that are more efficient to implement.

While the dimension-by-dimension approach works well for problems with a tensor product grid, for truly scattered images a global method will be necessary. For this reason, we explored a global extension method based on 2D MQ RBFs. We studied this method for images with a small number of pixels and demonstrated that the global extension method performs well. For this method, we observe a broadening effect in the final edge maps if the 1D method is directly extended to 2D. Unlike the 1D method, the 2D global extension method needs to use the final profile of the expansion coefficients obtained in the final iteration to avoid the broadening effect. We also observe that the 2D global extension method produces interesting intermediate edge patterns during the iterations. These patterns arise from the successive detection and elimination of edges at each iteration step. If the domain is large and contains many centers, a domain decomposition approach together with rescaling will need to be considered. A direct implementation of the global extension method has a computational complexity of $O(N^6)$ for N × N array data. Thus the domain decomposition technique becomes critical for the efficiency of the 2D global method. This will comprise our future work.

The motivation of this work is the need for efficient edge detection in RBF approximations of PDEs. The RBF approximations may contain local jump discontinuities such as shocks, or sharp gradients which appear as discontinuities when the grid is not sufficiently refined. For PDEs, the continuity of the derivative(s) of the solution is also important for enhancing the accuracy and stability of the RBF approximations. In this paper we focused only on the continuity of the given function. In our future work, we will further develop the RBF method for the detection of jumps in the derivative(s) of the given function or of solutions of PDEs.
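The $O(N^6)$ count can be made explicit under one assumption that is not stated in the text: the global interpolation matrix is dense and the system is solved by a direct method such as Gaussian elimination, whose cost is cubic in the number of unknowns. With one center per pixel, the arithmetic is:

```latex
\[
  N \times N \text{ pixels} \;\Longrightarrow\; M = N^{2} \text{ centers},
  \qquad
  \text{cost} \sim O(M^{3}) = O\big((N^{2})^{3}\big) = O(N^{6}).
\]
\[
  N = 256:\quad M = 65\,536,\qquad M^{2} \approx 4.3 \times 10^{9} \text{ matrix entries},\qquad M^{3} \approx 2.8 \times 10^{14}.
\]
```

By contrast, the dimension-by-dimension approach only ever works with systems whose size is proportional to N, the length of one slice.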
Acknowledgement

This work was supported by the National Science Foundation under the grant DMS 0608844.

References

[1] M.D. Buhmann, Radial Basis Functions: Theory and Implementations, first ed., Cambridge Univ. Press, Cambridge, 2003.
[2] M.D. Buhmann, N. Dyn, Spectral convergence of multiquadric interpolation, Proc. Edinb. Math. Soc. 36 (1993) 319–333.
[3] J. Carr, R. Beatson, J. Cherrie, T. Mitchell, W. Fright, B. McCallum, T. Evans, Reconstruction and representation of 3D objects with radial basis functions, in: SIGGRAPH, 2001, pp. 67–76.
[4] T.A. Driscoll, A.R.H. Heryudono, Adaptive residual subsampling methods for radial basis function interpolation and collocation problems, Comput. Math. Appl. 53 (2007) 927–939.
[5] B. Fornberg, N. Flyer, The Gibbs phenomenon for radial basis functions, in: A. Jerri (Ed.), Advances in the Gibbs Phenomenon with Detailed Introduction, in: The Gibbs Phenomenon in Various Representations and Applications, Sampling Publishing, Potsdam, NY, 2008, pp. 201–224.
[6] A. Gelb, E. Tadmor, Detection of edges in spectral data II: Nonlinear enhancement, SIAM J. Numer. Anal. 38 (2000) 1389–1408.
[7] R.L. Hardy, Multiquadric equations of topography and other irregular surfaces, J. Geophys. Res. 176 (1971) 1905–1915.
[8] J.-H. Jung, A note on the Gibbs phenomenon with multiquadric radial basis functions, Appl. Numer. Math. 57 (2007) 213–229.
[9] J.-H. Jung, V. Durante, An iteratively adaptive multiquadric radial basis function method for the detection of local jump discontinuities, Appl. Numer. Math. 59 (2009) 1449–1466.
[10] E.J. Kansa, Multiquadrics - A scattered data approximation scheme with applications to computational fluid dynamics: II. Solutions to parabolic, hyperbolic, and elliptic partial differential equations, Comput. Math. Appl. 19 (1990) 147–161.
[11] E.J. Kansa, R.E. Carlson, Improved accuracy of multiquadric interpolation using variable shape parameters, Comput. Math. Appl. 24 (1992) 99–120.
[12] E. Larsson, B. Fornberg, A numerical study of some radial basis function based solution methods for elliptic PDEs, Comput. Math. Appl. 46 (2003) 891–902.
[13] Y. Ohtake, A. Belyaev, H.-P. Seidel, 3D scattered data approximation with adaptive compactly supported radial basis functions, in: Proceeding of International Conference on Shape Modeling and Applications, Genova, Italy, 2004, IEEE Computer Society, 2004, pp. 31–39.
[14] USC Signal and Image Processing Institute Data base, http://sipi.usc.edu/database/database.cgi?volume=misc.
[15] P.V. Yee, S. Haykin, Regularized Radial Basis Function Networks: Theory and Applications, John Wiley & Sons, Inc., New York, 2001.
[16] X. Zhou, Y.C. Hon, J. Li, Overlapping domain decomposition method by radial basis functions, Appl. Numer. Math. 44 (2003) 241–255.
Applied Numerical Mathematics 61 (2011) 92–107
Applied Numerical Mathematics www.elsevier.com/locate/apnum
Numerical solutions of a two-phase membrane problem

F. Bozorgnia
Faculty of Sciences, Persian Gulf University, Boushehr 75168, Iran

Article info
Article history: Received 21 April 2009; received in revised form 13 August 2010; accepted 13 August 2010; available online 19 August 2010.
Keywords: Free boundary problems; Two-phase membrane; Finite element method; Error estimate; Regularization

Abstract
In this paper different numerical methods for a two-phase free boundary problem are discussed. In the first method a novel iterative scheme for the two-phase membrane is considered. We study the regularization method and give an a posteriori error estimate which is needed for the implementation of the regularization method. Moreover, an efficient algorithm based on the finite element method is presented. It is shown that the sequence constructed by the algorithm is monotone and converges to the solution of the given free boundary problem. These methods can be applied to the one-phase obstacle problem as well. © 2010 IMACS. Published by Elsevier B.V. All rights reserved.
0168-9274/$30.00 © 2010 IMACS. Published by Elsevier B.V. All rights reserved. doi:10.1016/j.apnum.2010.08.007

1. Introduction

A free boundary problem is a partial differential equation where the equation changes qualitatively across a level set of the solution $u$ of the equation. A general form of elliptic free boundary problems with the Laplacian can be written as

\[ \Delta u = f(x, u, \nabla u) \quad \text{in } \Omega, \tag{1} \]

where the right-hand side is supposed to be piecewise continuous, having jumps at some values of the arguments $u$ and $\nabla u$, and where Dirichlet boundary conditions are considered.

Suppose that $\lambda^{\pm} : \Omega \to \mathbb{R}$ are positive and Lipschitz continuous functions and $\Omega$ is a bounded open subset of $\mathbb{R}^n$ with smooth boundary. Assume further that $g \in W^{1,2}(\Omega) \cap L^{\infty}(\Omega)$ and that $g$ changes sign on $\partial\Omega$. Let $K = \{ v \in W^{1,2}(\Omega) : v - g \in W_0^{1,2}(\Omega) \}$. The functional

\[ I(v) = \int_{\Omega} \Big( \tfrac{1}{2} |\nabla v|^2 + \lambda^{+} \max(v, 0) - \lambda^{-} \min(v, 0) \Big)\, dx \tag{2} \]

is convex, coercive on $K$ and weakly lower semi-continuous, hence it attains its infimum in $K$. The Euler–Lagrange equation corresponding to the functional (2) is (see [17])

\[ \Delta u = \lambda^{+} \chi_{\{u>0\}} - \lambda^{-} \chi_{\{u<0\}} \quad \text{in } \Omega, \qquad u = g \quad \text{on } \partial\Omega. \tag{3} \]

Problem (3) is related to the time dependent equation

\[ \alpha\, \partial_t \max(v, 0) + \beta\, \partial_t \min(v, 0) - \Delta v = 0 \quad \text{in } (0, T) \times \Omega, \]
which has been used to describe an instantaneous and complete reaction of two substances coming into contact at a surface $\Gamma$ (see [16]). In problem (3), $\chi_A$ denotes the characteristic function of the set $A$. The boundary

\[ \big( \partial\{x \in \Omega : u(x) > 0\} \cup \partial\{x \in \Omega : u(x) < 0\} \big) \cap \Omega \]

is called the free boundary. Properties of the solution, regularity of the solution, Hausdorff dimension and the regularity of the free boundary have been derived in [14–17]. In [2] a perturbation of coefficients is considered by the author. Note that if we let $\lambda^{-} = 0$ and let $g$ be non-negative on the boundary, we obtain the one-phase obstacle problem.

There are numerous papers about numerical solutions of elliptic free boundary problems, variational inequalities and the one-phase obstacle problem; see for instance [1,5,8,10–13]. In the next section we present numerical methods for (3), and then we discuss the regularization method and an error estimate which is needed for the implementation of the regularization method. Finally, an algorithm based on the finite element method is described.

2. Numerical approximation of the two-phase membrane problem

Let $\psi \in C_0^2(\Omega)$, and consider the following equation:

\[ \Delta u = \lambda^{+} \chi_{\{u>\psi\}} - \lambda^{-} \chi_{\{u<\psi\}} + \Delta\psi\, \chi_{\{u=\psi\}} \quad \text{in } \Omega, \qquad u = g \quad \text{on } \partial\Omega. \tag{4} \]

This problem is a generalization of (3). In (4), let $w = u - \psi$; then, provided $\lambda^{+} - \Delta\psi$ and $\lambda^{-} + \Delta\psi$ are positive, we obtain the following equation:

\[ \Delta w = (\lambda^{+} - \Delta\psi)\, \chi_{\{w>0\}} - (\lambda^{-} + \Delta\psi)\, \chi_{\{w<0\}} \quad \text{in } \Omega, \qquad w = g \quad \text{on } \partial\Omega. \]
2.1. First method

In problem (3) let

\[ u_1 = u^{+}, \qquad u_2 = u^{-}, \qquad g_1 = g^{+}, \qquad g_2 = g^{-}, \]

where $u^{+} = \max(u, 0)$ and $u^{-} = \max(-u, 0)$. Then

\[ u_1 \cdot u_2 = 0, \qquad u_1, u_2 \ge 0. \]

Obviously $u_1$ and $u_2$ solve the following equations, respectively:

\[ \begin{cases} \Delta u_1 = \lambda^{+} & \text{in } \{u_1 > 0\}, \\ u_1 = g_1 & \text{on } \partial\Omega \cap \partial\{u_1 > 0\}, \\ u_1 = 0 & \text{on } \partial\{u_1 > 0\} \setminus \partial\Omega, \end{cases} \]

and

\[ \begin{cases} \Delta u_2 = \lambda^{-} & \text{in } \{u_2 > 0\}, \\ u_2 = g_2 & \text{on } \partial\Omega \cap \partial\{u_2 > 0\}, \\ u_2 = 0 & \text{on } \partial\{u_2 > 0\} \setminus \partial\Omega. \end{cases} \]

Since $u$ is also a solution to (3),

\[ \Delta(u_1 - u_2) = \lambda^{+} \chi_{\{u_1>0\}} - \lambda^{-} \chi_{\{u_2>0\}}. \tag{5} \]

Eq. (5) shows that $u_1 - u_2 \in C^{1,1}$ (see [14,15]). Therefore, on the free boundary we have

\[ \nabla u_1 = -\nabla u_2. \]

Now for a given uniform mesh on $\Omega \subset \mathbb{R}^2$, let $\Delta x = \Delta y = h$, and use standard finite difference methods for Eq. (5), to obtain

\[
\begin{aligned}
&\frac{1}{h^2}\big[ u_1(x_{i-1}, y_j) + u_1(x_{i+1}, y_j) - 4u_1(x_i, y_j) + u_1(x_i, y_{j-1}) + u_1(x_i, y_{j+1}) \big] \\
&\quad - \frac{1}{h^2}\big[ u_2(x_{i-1}, y_j) + u_2(x_{i+1}, y_j) - 4u_2(x_i, y_j) + u_2(x_i, y_{j-1}) + u_2(x_i, y_{j+1}) \big] \\
&\quad = \lambda^{+} \chi_{\{u_1(x_i, y_j)>0\}} - \lambda^{-} \chi_{\{u_2(x_i, y_j)>0\}}.
\end{aligned}
\tag{6}
\]
If $u_1(x_i, y_j)$ and $u_2(x_i, y_j)$ are obtained from (6) and the conditions

\[ u_1(x_i, y_j) \cdot u_2(x_i, y_j) = 0 \quad \text{and} \quad u_1(x_i, y_j) \ge 0, \quad u_2(x_i, y_j) \ge 0, \]

are imposed, then the iterative method for $u_1$ and $u_2$ will be as follows.

• Initialization:

\[ u_1^{(0)}(x_i, y_j) = \begin{cases} 0 & \text{if } (x_i, y_j) \text{ is an interior point}, \\ g_1(x_i, y_j) & \text{if } (x_i, y_j) \text{ is a boundary point}, \end{cases} \qquad u_2^{(0)}(x_i, y_j) = \begin{cases} 0 & \text{if } (x_i, y_j) \text{ is an interior point}, \\ g_2(x_i, y_j) & \text{if } (x_i, y_j) \text{ is a boundary point}. \end{cases} \]

• Step $k + 1$, $k \ge 0$: Let $\bar{u}_1^{k}(x_i, y_j)$ and $\bar{u}_2^{k}(x_i, y_j)$ denote the average of $u_1^{k}$ and $u_2^{k}$, respectively, over all neighbors of the point $(x_i, y_j)$. Then we iterate over all interior points by setting

\[ u_1^{k+1}(x_i, y_j) = \max\Big( \frac{-\lambda^{+} h^2}{4} + \bar{u}_1^{k}(x_i, y_j) - \bar{u}_2^{k}(x_i, y_j),\; 0 \Big), \]

and

\[ u_2^{k+1}(x_i, y_j) = \max\Big( \frac{-\lambda^{-} h^2}{4} + \bar{u}_2^{k}(x_i, y_j) - \bar{u}_1^{k}(x_i, y_j),\; 0 \Big). \]
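The two update formulas above translate almost directly into array code. The following is a minimal sketch rather than the author's implementation: it uses simultaneous (Jacobi-style) updates, constant $\lambda^{\pm}$, and the boundary data of Example 4.1 on $[-1, 1]^2$; the names `example_boundary`, `neighbor_average`, `first_method`, the tolerance and the choice $\lambda^{+} = \lambda^{-} = 1$ are assumptions made for illustration.

```python
import numpy as np

def example_boundary(n):
    """Boundary data g of Example 4.1 on an n-by-n grid over [-1, 1]^2.
    g[i, j] is the value at (x[i], x[j]); interior entries are left at 0."""
    x = np.linspace(-1.0, 1.0, n)
    g = np.zeros((n, n))
    g[:, -1] = 1.0 + x      # side y = +1: g = 1 + x
    g[-1, :] = 1.0 + x      # side x = +1: g = 1 + y
    g[:, 0] = x - 1.0       # side y = -1: g = x - 1
    g[0, :] = x - 1.0       # side x = -1: g = y - 1
    return g, x[1] - x[0]

def neighbor_average(u):
    """Average of the four grid neighbors at every interior point."""
    return 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] + u[1:-1, :-2] + u[1:-1, 2:])

def first_method(g, h, lam_plus=1.0, lam_minus=1.0, tol=1e-10, max_iter=100_000):
    """Iterate u1 = max(-lam_plus h^2/4 + avg(u1) - avg(u2), 0) and the
    analogous formula for u2 until the largest change drops below tol."""
    u1, u2 = np.maximum(g, 0.0), np.maximum(-g, 0.0)   # g1 = g^+, g2 = g^-
    u1[1:-1, 1:-1] = 0.0                                # zero initial guess inside
    u2[1:-1, 1:-1] = 0.0
    for k in range(1, max_iter + 1):
        a1, a2 = neighbor_average(u1), neighbor_average(u2)
        new1 = np.maximum(-lam_plus * h**2 / 4 + a1 - a2, 0.0)
        new2 = np.maximum(-lam_minus * h**2 / 4 + a2 - a1, 0.0)
        change = max(np.abs(new1 - u1[1:-1, 1:-1]).max(),
                     np.abs(new2 - u2[1:-1, 1:-1]).max())
        u1[1:-1, 1:-1], u2[1:-1, 1:-1] = new1, new2
        if change < tol:
            break
    return u1, u2, k

g, h = example_boundary(21)
u1, u2, iterations = first_method(g, h)
u = u1 - u2   # approximation of the two-phase solution
```

Because at most one of the two arguments of `max` can be positive at a given point, the constraint $u_1 \cdot u_2 = 0$ holds automatically after every sweep.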
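2.2. Regularization method

A major difficulty in solving problem (2) numerically is that the terms $\max(u, 0)$ and $\min(u, 0)$ are non-differentiable. In the case of the one-phase obstacle problem, the method of regularization is discussed in [8,9]. The idea of the regularization method is to approximate the non-differentiable terms by a sequence of differentiable terms. Convergence is obtained when the parameter $\varepsilon$ tends to 0. One should note that when $\varepsilon$ tends to 0 the conditioning of the regularized problem deteriorates. On the other hand, to obtain a more accurate approximation we have to choose $\varepsilon$ small, but when $\varepsilon$ is too small the numerical solution of the regularized problem cannot be computed accurately. Therefore one needs a posteriori error estimates to compute error bounds for the approximate solution. If the estimated error is within the given error tolerance, the solution of the regularized problem is accepted as the exact solution, but if the estimated error is large, then we need to use a smaller parameter $\varepsilon$. We consider a family of regularized equations

\[ \Delta u^{\varepsilon} = \lambda^{+} \chi^{\varepsilon}(u^{\varepsilon}) - \lambda^{-} \chi^{\varepsilon}(-u^{\varepsilon}) \quad \text{in } \Omega, \qquad u^{\varepsilon} = g \quad \text{on } \partial\Omega, \]

where $\chi^{\varepsilon}(t)$ is a non-decreasing, smooth approximation of the Heaviside function such that

\[ \chi^{\varepsilon}(t) = \begin{cases} 1, & t \ge \varepsilon, \\ 0, & t \le -\varepsilon. \end{cases} \]

A solution $u^{\varepsilon}$ can be obtained by minimizing the following functional $I^{\varepsilon}$ on $K$:

\[ I^{\varepsilon}(v) = \int_{\Omega} \Big( \tfrac{1}{2} |\nabla v|^2 + \lambda^{+} \phi^{\varepsilon}(v) + \lambda^{-} \phi^{\varepsilon}(-v) \Big)\, dx, \]

where $\phi^{\varepsilon}(t) = \int_{-\infty}^{t} \chi^{\varepsilon}(s)\, ds$. (The sign of the last term is chosen so that the Euler–Lagrange equation of $I^{\varepsilon}$ is the regularized equation above and $I^{\varepsilon}$ approximates (2).) By the maximum principle of Alexandrov (see [7]),

\[ \| u^{\varepsilon} \|_{L^{\infty}(\Omega)} \le \| g \|_{L^{\infty}(\Omega)} + C(n, \Omega) \max\big\{ \| \lambda^{+} \|_{L^{\infty}}, \| \lambda^{-} \|_{L^{\infty}} \big\}. \]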
Consequently, applying the interior $L^p$-estimates, we get

\[ \| u^{\varepsilon} \|_{W^{2,p}(K)} \le C(p, K, \Omega, \lambda^{+}, \lambda^{-}, g) \quad \text{for any } K \Subset \Omega \text{ and } 1 < p < \infty. \]

Thus the family $\{u^{\varepsilon}\}$ is uniformly bounded in $W^{2,p}(K)$, and there exists a subsequence, denoted in the same way, such that

\[ u^{\varepsilon} \rightharpoonup u \quad \text{weakly in } W^{2,p}_{\mathrm{loc}}(\Omega). \]

We have $u \in W^{2,p}_{\mathrm{loc}}(\Omega)$ for any $1 < p < \infty$. In addition,

\[ u^{\varepsilon} \to u \quad \text{strongly in } W^{1,p}_{\mathrm{loc}}(\Omega). \]

Since $u^{\varepsilon} - g \in W_0^{1,2}(\Omega)$ and $W_0^{1,2}(\Omega)$ is a closed subspace of the Hilbert space $W^{1,2}(\Omega)$, one gets $u - g \in W_0^{1,2}(\Omega)$. Applying Fatou's lemma and the dominated convergence theorem,

\[ I(u) \le \liminf I^{\varepsilon}(u^{\varepsilon}) \le \liminf I^{\varepsilon}(v) = I(v), \]

for any $v$ such that $v - g \in W_0^{1,2}(\Omega)$. Thus $u$ is the minimizer of the functional (2). Since the minimizer is unique, the whole sequence $\{u^{\varepsilon}\}$ converges to $u$ in $W^{1,p}_{\mathrm{loc}}(\Omega)$. By the Sobolev embedding theorem, we have

\[ u^{\varepsilon} \to u \quad \text{in } C^{1,\alpha}_{\mathrm{loc}}(\Omega) \text{ as } \varepsilon \to 0. \]

Consequently, the locally uniform convergence implies that

\[ \Delta u = \begin{cases} \lambda^{+} & \text{a.e. in } \{u > 0\}, \\ -\lambda^{-} & \text{a.e. in } \{u < 0\}. \end{cases} \]

Since $u \in W^{2,2}(\Omega)$, we have $\Delta u = 0$ a.e. in the set $\{x \in \Omega : u(x) = 0\} \cap \{x \in \Omega : \nabla u(x) = 0\}$. Furthermore, the set $\{x \in \Omega : \nabla u(x) \ne 0\} \cap \{x \in \Omega : u(x) = 0\}$ is a $C^1$-surface and thus has measure zero, hence

\[ \Delta u = 0 \quad \text{a.e. in } \{u = 0\}. \]

Therefore $u$ satisfies (3).

Different estimates for the solution of the regularized problem can be obtained. We consider approximating problems
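\[ \Delta u^{\varepsilon} = f^{\varepsilon}(u^{\varepsilon}) \quad \text{in } \Omega, \qquad u^{\varepsilon} = g \quad \text{on } \partial\Omega, \]

where $f^{\varepsilon}$ are smooth non-decreasing functions such that

\[ f^{\varepsilon}(t) = \begin{cases} \lambda^{+}, & t \ge \varepsilon, \\ -\lambda^{-}, & t \le -\varepsilon. \end{cases} \]

By the maximum principle we have

\[ \inf_{\partial\Omega} \min\{g, 0\} \le u^{\varepsilon} \le \sup_{\partial\Omega} \max\{g, 0\}. \]

Also, for given $\varepsilon_1$ and $\varepsilon_2$, if we choose $\varepsilon = \max\{\varepsilon_1, \varepsilon_2\}$, then we obtain

\[ f^{\varepsilon_2}(s - 2\varepsilon) = \begin{cases} \lambda^{+}, & s - 2\varepsilon \ge \varepsilon_2, \\ -\lambda^{-}, & s - 2\varepsilon \le -\varepsilon_2, \end{cases} = \begin{cases} \lambda^{+}, & s \ge \varepsilon_2 + 2\varepsilon, \\ -\lambda^{-}, & s \le -\varepsilon_2 + 2\varepsilon, \end{cases} \le \begin{cases} \lambda^{+}, & s \ge \varepsilon_1, \\ -\lambda^{-}, & s \le -\varepsilon_1, \end{cases} = f^{\varepsilon_1}(s). \]

Therefore, $u^{\varepsilon_1} - 2\varepsilon$ is a subsolution of the $\varepsilon_2$-problem. Hence,

\[ \Delta(u^{\varepsilon_1} - 2\varepsilon) \ge f^{\varepsilon_2}(u^{\varepsilon_1} - 2\varepsilon). \]

The monotonicity of $f$ and the maximum principle imply

\[ u^{\varepsilon_1} - 2\varepsilon \le u^{\varepsilon_2}. \]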
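Changing the roles of $u^{\varepsilon_1}$ and $u^{\varepsilon_2}$ we obtain

\[ u^{\varepsilon_2} - 2\varepsilon \le u^{\varepsilon_1}. \]

Hence,

\[ -2\varepsilon \le u^{\varepsilon_1} - u^{\varepsilon_2} \le 2\varepsilon. \]

Therefore the estimate

\[ \sup_{\Omega} | u^{\varepsilon_1} - u^{\varepsilon_2} | \le 2 \max\{\varepsilon_1, \varepsilon_2\} \tag{7} \]

follows.¹ Note that we have

\[ \Delta(u^{\varepsilon_1} - u^{\varepsilon_2}) = f^{\varepsilon_1}(u^{\varepsilon_1}) - f^{\varepsilon_2}(u^{\varepsilon_2}). \]

Multiplying by $(u^{\varepsilon_1} - u^{\varepsilon_2})$ and integrating by parts gives

\[ \int_{\Omega} | \nabla(u^{\varepsilon_1} - u^{\varepsilon_2}) |^2\, dx = \int_{\Omega} (u^{\varepsilon_2} - u^{\varepsilon_1}) \big( f^{\varepsilon_1}(u^{\varepsilon_1}) - f^{\varepsilon_2}(u^{\varepsilon_2}) \big)\, dx. \]

From the estimate (7) it follows that

\[ \| \nabla(u^{\varepsilon_1} - u^{\varepsilon_2}) \|^2_{L^2(\Omega)} \le C \cdot \max\{\varepsilon_1, \varepsilon_2\} \max\{\lambda^{+}, \lambda^{-}\}. \]

In numerical experiments, the non-differentiable terms $\max(u, 0)$ and $\min(u, 0)$ can be approximated by many different differentiable sequences. In functional (2) set

\[ \max(u, 0) = \frac{u + |u|}{2} \quad \text{and} \quad \min(u, 0) = \frac{u - |u|}{2}. \]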
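The term $|u|$ can be regularized by the following sequences:

1. $\phi_1^{\varepsilon}(u) = \sqrt{u^2 + \varepsilon^2}$.

2. $\phi_2^{\varepsilon}(u) = \dfrac{|u|^{\varepsilon+1}}{\varepsilon + 1}$.

3. $\phi_3^{\varepsilon}(u) = \begin{cases} u, & u \ge \varepsilon, \\ \frac{1}{2}\big( \frac{u^2}{\varepsilon} + \varepsilon \big), & |u| \le \varepsilon, \\ -u, & u \le -\varepsilon. \end{cases}$

4. $\phi_4^{\varepsilon}(u) = \begin{cases} u - \frac{\varepsilon}{2}, & u \ge \varepsilon, \\ \frac{u^2}{2\varepsilon}, & |u| \le \varepsilon, \\ -u - \frac{\varepsilon}{2}, & u \le -\varepsilon. \end{cases}$

In order to obtain a posteriori error estimates of the approximate solution, we use the duality method by conjugate functions [6,9,12]. Let $V$ and $Q$ be two normed spaces, $V^*$ and $Q^*$ their dual spaces, and let $\langle \cdot, \cdot \rangle$ denote the duality pairing. Assume that there exists a continuous linear operator $A$ from $V$ to $Q$, $A \in L(V, Q)$. The adjoint $A^* \in L(Q^*, V^*)$ of the operator $A$ is defined through the relation

\[ \langle A^* q^*, v \rangle = \langle q^*, A v \rangle \quad \forall v \in V,\ q^* \in Q^*. \]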
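Returning to the four regularizations of $|u|$ listed above, a short numerical check shows how closely each one tracks $|u|$. This is an illustrative sketch only; the function names `phi1` to `phi4` and the sample grid are assumptions introduced here. On $[-1, 1]$ the deviation from $|u|$ stays below $\varepsilon$ for all four choices.

```python
import numpy as np

def phi1(u, eps):
    """sqrt(u^2 + eps^2)"""
    return np.sqrt(u**2 + eps**2)

def phi2(u, eps):
    """|u|^(eps+1) / (eps+1)"""
    return np.abs(u) ** (eps + 1.0) / (eps + 1.0)

def phi3(u, eps):
    """|u| for |u| >= eps, and (u^2/eps + eps)/2 for |u| <= eps."""
    return np.where(np.abs(u) <= eps, 0.5 * (u**2 / eps + eps), np.abs(u))

def phi4(u, eps):
    """|u| - eps/2 for |u| >= eps, and u^2/(2 eps) for |u| <= eps."""
    return np.where(np.abs(u) <= eps, u**2 / (2.0 * eps), np.abs(u) - 0.5 * eps)

u = np.linspace(-1.0, 1.0, 2001)
for eps in (1e-1, 1e-2, 1e-3):
    gaps = [np.max(np.abs(phi(u, eps) - np.abs(u))) for phi in (phi1, phi2, phi3, phi4)]
    print(eps, gaps)   # each gap is bounded by eps on [-1, 1]
```

Note that $\phi_3^{\varepsilon}$ and $\phi_4^{\varepsilon}$ agree with $|u|$ (up to a constant shift for $\phi_4^{\varepsilon}$) outside $[-\varepsilon, \varepsilon]$, whereas $\phi_1^{\varepsilon}$ and $\phi_2^{\varepsilon}$ perturb $|u|$ everywhere.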
Let $J$ be a function from $V \times Q$ to $\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty, +\infty\}$. Consider the minimization problem

\[ \inf_{v \in V} J(v, A v). \tag{8} \]

Define its dual by

\[ \sup_{q^* \in Q^*} \big[ -J^*(A^* q^*, -q^*) \big], \tag{9} \]

where the convex conjugate function of $J$ is given by

\[ J^*(v^*, q^*) = \sup_{v \in V,\ q \in Q} \big[ \langle v, v^* \rangle + \langle q, q^* \rangle - J(v, q) \big], \qquad v^* \in V^*,\ q^* \in Q^*. \]

The relation between (8) and (9) is stated in the following theorem (see [6]).

¹ The estimate (2.4) is communicated to us by Arshak Petrosyan.
Theorem 2.1. Assume that $V$ is a reflexive Banach space and $Q$ is a normed vector space, and let $A \in L(V, Q)$. Let $J : V \times Q \to \overline{\mathbb{R}}$ be proper, l.s.c. and strictly convex such that

1. There exists $v_0 \in V$ such that $J(v_0, A v_0) < \infty$ and $q \mapsto J(v_0, q)$ is continuous at $A v_0$.
2. $J(v, A v) \to +\infty$ as $\|v\| \to \infty$, $v \in V$.

Then problem (8) has a solution $u \in V$, problem (9) has a solution $q^* \in Q^*$, and

\[ J(u, A u) = -J^*(A^* q^*, -q^*). \tag{10} \]

In the case that the function $J$ is of a separated form, i.e.,

\[ J(v, q) = F(v) + G(q), \qquad v \in V,\ q \in Q, \]

then the conjugate of $J$ is

\[ J^*(v^*, q^*) = F^*(v^*) + G^*(q^*), \]

where $F^*$ and $G^*$ are the conjugate functions of $F$ and $G$, respectively. To calculate the conjugate function when the functional is defined by an integral, we use the following theorems, which can be found in [6,9].

Theorem 2.2. Assume $h : \Omega \times \mathbb{R}^n \to \mathbb{R}$ is a Carathéodory function with $h \in L^1(\Omega)$ and suppose

\[ G(q) = \int_{\Omega} h(x, q(x))\, dx. \]

Then the conjugate function of $G$ is

\[ G^*(q^*) = \int_{\Omega} h^*(x, q^*(x))\, dx \quad \forall q^* \in Q^*, \]

where

\[ h^*(x, y) = \sup_{\xi \in \mathbb{R}^n} \big[ y \cdot \xi - h(x, \xi) \big]. \]

Let $u \in V$ be a solution of the minimization problem (8) and $q^*$ be defined as in (10). For any $v \in V$, define the energy difference

\[ ED(u, v) = J(v, A v) - J(u, A u). \]

Theorem 2.3. Suppose the assumptions in Theorem 2.1 are satisfied. Then

\[ ED(u, v) \le J(v, A v) + J^*(A^* q^*, -q^*) \quad \forall v \in V,\ q^* \in Q^*. \]

Note that by the relations $\max(u, 0) = \frac{u + |u|}{2}$ and $\min(u, 0) = \frac{u - |u|}{2}$, we can rewrite functional (2) as

\[ I(v) = \int_{\Omega} \Big( \tfrac{1}{2} |\nabla v|^2 + \tfrac{1}{2} (\lambda^{+} + \lambda^{-}) |v| + \tfrac{1}{2} (\lambda^{+} - \lambda^{-}) v \Big)\, dx. \]

We introduce the following
\[ V = H^1(\Omega), \qquad V^* = (H^1(\Omega))^*, \qquad Q = Q^* = (L^2(\Omega))^n \times L^2(\Omega), \]

\[ A v = (\nabla v, v), \qquad J(v, A v) = F(v) + G(A v), \]

\[ F(v) = \begin{cases} \displaystyle\int_{\Omega} \tfrac{1}{2} (\lambda^{+} - \lambda^{-})\, v\, dx, & v \in g + H_0^1(\Omega), \\ \infty, & \text{otherwise}, \end{cases} \]

\[ G(q) = \int_{\Omega} \Big( \tfrac{1}{2} |q_1|^2 + \tfrac{1}{2} (\lambda^{+} + \lambda^{-}) |q_2| \Big)\, dx, \]

where $q = (q_1, q_2)$, $q_1 \in (L^2(\Omega))^n$, $q_2 \in L^2(\Omega)$. We use Theorem 2.3 to find a bound for $u - u^{\varepsilon}$. We have
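\[
F^*(A^* q^*) = \sup_{v \in g + H_0^1(\Omega)} \int_{\Omega} \Big( q_1^* \cdot \nabla v + v q_2^* - \tfrac{1}{2} (\lambda^{+} - \lambda^{-}) v \Big)\, dx
= \begin{cases} \displaystyle\int_{\Omega} \Big( q_1^* \cdot \nabla g + g q_2^* - \tfrac{1}{2} (\lambda^{+} - \lambda^{-}) g \Big)\, dx & \text{if } q^* \in \tilde{Q}^*, \\ \infty & \text{otherwise}, \end{cases}
\]

where

\[ \tilde{Q}^* = \Big\{ q^* \in Q^* : \int_{\Omega} \Big( q_1^* \cdot \nabla v + v q_2^* - \tfrac{1}{2} (\lambda^{+} - \lambda^{-}) v \Big)\, dx = 0 \ \ \forall v \in H_0^1 \Big\}, \]

and

\[
G^*(-q^*) = \sup_{q \in Q} \int_{\Omega} \Big( -q_1^* \cdot q_1 - q_2^* q_2 - \tfrac{1}{2} |q_1|^2 - \tfrac{1}{2} (\lambda^{+} + \lambda^{-}) |q_2| \Big)\, dx
= \begin{cases} \displaystyle\int_{\Omega} \tfrac{1}{2} |q_1^*|^2\, dx & \text{if } |q_2^*| \le \tfrac{1}{2} (\lambda^{+} + \lambda^{-}), \\ \infty & \text{otherwise}. \end{cases}
\]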
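Denoting the constraint set

\[ Q_c^* = \Big\{ q^* \in Q^* : \int_{\Omega} \Big( q_1^* \cdot \nabla v + v q_2^* - \tfrac{1}{2} (\lambda^{+} - \lambda^{-}) v \Big)\, dx = 0,\ \ |q_2^*| \le \tfrac{1}{2} (\lambda^{+} + \lambda^{-}) \text{ a.e. in } \Omega \Big\}, \]

where the integral condition should hold for all $v \in H_0^1$, the term $J^*(A^* q^*, -q^*)$ can be written as

\[ J^*(A^* q^*, -q^*) = \begin{cases} \displaystyle\int_{\Omega} \Big( \tfrac{1}{2} |q_1^*|^2 + q_1^* \cdot \nabla g + g q_2^* - \tfrac{1}{2} (\lambda^{+} - \lambda^{-}) g \Big)\, dx, & q^* \in Q_c^*, \\ \infty, & \text{otherwise}. \end{cases} \]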
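Now consider the energy difference

\[ ED(u, u^{\varepsilon}) = \int_{\Omega} \Big( \tfrac{1}{2} |\nabla u^{\varepsilon}|^2 - \tfrac{1}{2} |\nabla u|^2 + \tfrac{1}{2} (\lambda^{+} + \lambda^{-}) \big( |u^{\varepsilon}| - |u| \big) + \tfrac{1}{2} (\lambda^{+} - \lambda^{-}) (u^{\varepsilon} - u) \Big)\, dx. \]

We show that the following inequality holds:

\[ \tfrac{1}{2} \| \nabla(u^{\varepsilon} - u) \|^2_{L^2(\Omega)} \le ED(u, u^{\varepsilon}). \tag{11} \]

This is equivalent to showing

\[ \int_{\Omega} \Big( |\nabla u|^2 + \lambda^{+} \max(u, 0) - \lambda^{-} \min(u, 0) \Big)\, dx \le \int_{\Omega} \Big( \nabla u \cdot \nabla u^{\varepsilon} + \lambda^{+} \max(u^{\varepsilon}, 0) - \lambda^{-} \min(u^{\varepsilon}, 0) \Big)\, dx. \]

The difference between the two integrals is

\[
\begin{aligned}
&\int_{\Omega} \Big( \nabla u \cdot \nabla(u - u^{\varepsilon}) + \lambda^{+} \max(u, 0) - \lambda^{-} \min(u, 0) - \lambda^{+} \max(u^{\varepsilon}, 0) + \lambda^{-} \min(u^{\varepsilon}, 0) \Big)\, dx \\
&\quad = \int_{\Omega} \Big( \Delta u\, (u^{\varepsilon} - u) + \lambda^{+} \max(u, 0) - \lambda^{-} \min(u, 0) - \lambda^{+} \max(u^{\varepsilon}, 0) + \lambda^{-} \min(u^{\varepsilon}, 0) \Big)\, dx \\
&\quad = \int_{\Omega} \Big( \big( \lambda^{+} \chi_{\{u>0\}} - \lambda^{-} \chi_{\{u<0\}} \big) (u^{\varepsilon} - u) + \lambda^{+} \max(u, 0) - \lambda^{-} \min(u, 0) - \lambda^{+} \max(u^{\varepsilon}, 0) + \lambda^{-} \min(u^{\varepsilon}, 0) \Big)\, dx \\
&\quad = \int_{\Omega} \Big( \lambda^{+} u^{\varepsilon} \big( \chi_{\{u>0\}} - \chi_{\{u^{\varepsilon}>0\}} \big) + \lambda^{-} u^{\varepsilon} \big( \chi_{\{u^{\varepsilon}<0\}} - \chi_{\{u<0\}} \big) \Big)\, dx.
\end{aligned}
\]

The domain $\Omega$ can be decomposed as $\Omega = \Omega_1 \cup \Omega_2 \cup \Omega_3 \cup \Omega_4 \cup \Omega_5 \cup \Omega_6$, where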
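\[
\begin{aligned}
\Omega_1 &= \{ x \in \Omega : u^{\varepsilon}(x) \ge u(x) \ge 0 \}, & \Omega_2 &= \{ x \in \Omega : u(x) \ge u^{\varepsilon}(x) \ge 0 \}, \\
\Omega_3 &= \{ x \in \Omega : u^{\varepsilon}(x) \le u(x) \le 0 \}, & \Omega_4 &= \{ x \in \Omega : u(x) \le u^{\varepsilon}(x) \le 0 \}, \\
\Omega_5 &= \{ x \in \Omega : u(x) \le 0 \le u^{\varepsilon}(x) \}, & \Omega_6 &= \{ x \in \Omega : u^{\varepsilon}(x) \le 0 \le u(x) \}.
\end{aligned}
\]

In all of these cases we obtain that

\[ \lambda^{+} u^{\varepsilon} \big( \chi_{\{u>0\}} - \chi_{\{u^{\varepsilon}>0\}} \big) + \lambda^{-} u^{\varepsilon} \big( \chi_{\{u^{\varepsilon}<0\}} - \chi_{\{u<0\}} \big) \le 0, \]

which shows that the inequality (11) holds. Theorem 2.3 implies

\[
ED(u, u^{\varepsilon}) \le \int_{\Omega} \Big( \tfrac{1}{2} |\nabla u^{\varepsilon}|^2 + \tfrac{1}{2} (\lambda^{+} + \lambda^{-}) |u^{\varepsilon}| + \tfrac{1}{2} (\lambda^{+} - \lambda^{-}) u^{\varepsilon} + \tfrac{1}{2} |q_1^*|^2 + q_1^* \cdot \nabla g + g q_2^* - \tfrac{1}{2} (\lambda^{+} - \lambda^{-}) g \Big)\, dx,
\tag{12}
\]

for any $q^* \in Q_c^*$. Therefore, we obtain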
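\[
\tfrac{1}{2} \| \nabla(u^{\varepsilon} - u) \|^2_{L^2(\Omega)} \le \int_{\Omega} \Big( \tfrac{1}{2} |\nabla u^{\varepsilon}|^2 + \tfrac{1}{2} (\lambda^{+} + \lambda^{-}) |u^{\varepsilon}| + \tfrac{1}{2} (\lambda^{+} - \lambda^{-}) u^{\varepsilon} + \tfrac{1}{2} |q_1^*|^2 + q_1^* \cdot \nabla g + g q_2^* - \tfrac{1}{2} (\lambda^{+} - \lambda^{-}) g \Big)\, dx.
\]

If $|(\phi^{\varepsilon})'(t)| \le 1$ for all $t \in \mathbb{R}$, then considering the constraint set $Q_c^*$, we can choose $q^*$ as follows:

\[ q_1^* = -\nabla u^{\varepsilon}, \qquad q_2^* = -\tfrac{1}{2} (\lambda^{+} + \lambda^{-}) (\phi^{\varepsilon})'(u^{\varepsilon}). \]

Then from (12), it follows that

\[
\tfrac{1}{2} \| \nabla(u^{\varepsilon} - u) \|^2_{L^2(\Omega)} \le \int_{\Omega} \Big( \tfrac{1}{2} |\nabla u^{\varepsilon}|^2 + \tfrac{1}{2} (\lambda^{+} + \lambda^{-}) |u^{\varepsilon}| + \tfrac{1}{2} (\lambda^{+} - \lambda^{-}) u^{\varepsilon} + \tfrac{1}{2} |\nabla u^{\varepsilon}|^2 - \nabla u^{\varepsilon} \cdot \nabla g - \tfrac{1}{2} (\lambda^{+} + \lambda^{-}) (\phi^{\varepsilon})'(u^{\varepsilon})\, g - \tfrac{1}{2} (\lambda^{+} - \lambda^{-}) g \Big)\, dx.
\]

Hence,

\[
\tfrac{1}{2} \| \nabla(u^{\varepsilon} - u) \|^2_{L^2(\Omega)} \le \int_{\Omega} \Big( \nabla u^{\varepsilon} \cdot \nabla(u^{\varepsilon} - g) + \tfrac{1}{2} (\lambda^{+} + \lambda^{-}) \big( |u^{\varepsilon}| - (\phi^{\varepsilon})'(u^{\varepsilon})\, g \big) - \tfrac{1}{2} (\lambda^{+} - \lambda^{-}) (g - u^{\varepsilon}) \Big)\, dx.
\tag{13}
\]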
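The regularized problem can be written in the form

\[
\begin{cases}
-\Delta u^{\varepsilon} + \tfrac{1}{2} (\lambda^{+} + \lambda^{-}) (\phi^{\varepsilon})'(u^{\varepsilon}) + \tfrac{1}{2} (\lambda^{+} - \lambda^{-}) = 0 & \text{in } \Omega, \\
u^{\varepsilon} = g & \text{on } \partial\Omega.
\end{cases}
\tag{14}
\]

Here $u^{\varepsilon}$ is the weak solution of

\[
\int_{\Omega} \nabla u^{\varepsilon} \cdot \nabla v\, dx + \int_{\Omega} \Big( \tfrac{1}{2} (\lambda^{+} + \lambda^{-}) (\phi^{\varepsilon})'(u^{\varepsilon})\, v + \tfrac{1}{2} (\lambda^{+} - \lambda^{-})\, v \Big)\, dx = 0 \quad \forall v \in H_0^1.
\tag{15}
\]

Taking $v = u^{\varepsilon} - g \in H_0^1$ in (15) gives

\[
\int_{\Omega} \nabla u^{\varepsilon} \cdot \nabla(u^{\varepsilon} - g)\, dx + \int_{\Omega} \Big( \tfrac{1}{2} (\lambda^{+} + \lambda^{-}) (\phi^{\varepsilon})'(u^{\varepsilon}) (u^{\varepsilon} - g) + \tfrac{1}{2} (\lambda^{+} - \lambda^{-}) (u^{\varepsilon} - g) \Big)\, dx = 0.
\tag{16}
\]

Therefore,

\[
\int_{\Omega} \nabla u^{\varepsilon} \cdot \nabla(u^{\varepsilon} - g)\, dx = -\int_{\Omega} \Big( \tfrac{1}{2} (\lambda^{+} + \lambda^{-}) (\phi^{\varepsilon})'(u^{\varepsilon}) (u^{\varepsilon} - g) + \tfrac{1}{2} (\lambda^{+} - \lambda^{-}) (u^{\varepsilon} - g) \Big)\, dx,
\]

and substituting in (13) then gives the following a posteriori error estimate, which we formulate as the following theorem.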
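Theorem 2.4. Assume $u$ and $u^{\varepsilon}$ are solutions of the two-phase problem (3) and the regularized problem (14), respectively. Then

\[
\tfrac{1}{2} \| \nabla(u^{\varepsilon} - u) \|^2_{L^2(\Omega)} \le \int_{\Omega} \tfrac{1}{2} (\lambda^{+} + \lambda^{-}) \big( |u^{\varepsilon}| - u^{\varepsilon} (\phi^{\varepsilon})'(u^{\varepsilon}) \big)\, dx.
\tag{17}
\]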
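As an illustration of how the right-hand side of (17) might be evaluated in practice, the sketch below uses the first regularization $\phi_1^{\varepsilon}(u) = \sqrt{u^2 + \varepsilon^2}$, for which $(\phi_1^{\varepsilon})'(u) = u / \sqrt{u^2 + \varepsilon^2}$. The function names, the Riemann-sum quadrature and the sample values are assumptions, not part of the paper. For this choice the integrand equals $\tfrac{1}{2}(\lambda^{+} + \lambda^{-})\, |u| \varepsilon^2 / \big(s (s + |u|)\big)$ with $s = \sqrt{u^2 + \varepsilon^2}$, which is non-negative and at most $\tfrac{1}{2}(\lambda^{+} + \lambda^{-})\, \varepsilon$.

```python
import numpy as np

def dphi1(u, eps):
    """Derivative of phi_1(u) = sqrt(u^2 + eps^2)."""
    return u / np.sqrt(u**2 + eps**2)

def estimator_integrand(u_eps, eps, lam_plus, lam_minus):
    """Pointwise integrand of the right-hand side of (17) for phi_1."""
    return 0.5 * (lam_plus + lam_minus) * (np.abs(u_eps) - u_eps * dphi1(u_eps, eps))

def estimator(u_eps, eps, lam_plus, lam_minus, cell_size):
    """Riemann-sum approximation of the right-hand side of (17);
    cell_size is the length (1D) or area (2D) associated with each sample."""
    return float(np.sum(estimator_integrand(u_eps, eps, lam_plus, lam_minus)) * cell_size)

# Hypothetical sample values standing in for a computed u^eps on a 1D grid.
x = np.linspace(-2.0, 2.0, 4001)
bound = estimator(x, 1e-3, 1.0, 1.0, cell_size=x[1] - x[0])
```

With constant $\lambda^{\pm}$ this gives $\tfrac{1}{2} \| \nabla(u^{\varepsilon} - u) \|^2_{L^2(\Omega)} \le \tfrac{1}{2}(\lambda^{+} + \lambda^{-})\, \varepsilon\, |\Omega|$ for $\phi_1^{\varepsilon}$, so the estimated error decreases at least linearly in $\varepsilon$.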
Now for different choices of $\phi^{\varepsilon}$, the a posteriori error can be computed from (17).

2.3. Steady state problem

Since the solution $u$ of (3) minimizes the convex functional $I(v)$, (3) can be viewed as a steady state of the parabolic equation

\[
\begin{cases}
u_t - \Delta u = -\lambda^{+} \chi_{\{u>0\}} + \lambda^{-} \chi_{\{u<0\}} & \text{in } \Omega \times (0, T), \\
u = g & \text{on } \partial\Omega \times [0, T], \\
u = u_0 & \text{on } \Omega \times \{t = 0\}.
\end{cases}
\tag{18}
\]

One can see that

\[ \frac{d}{dt} \| u(\cdot, t) \|^2_{L^2} \le 0. \]

Therefore, $u_t \to 0$ as $t \to \infty$. So given a mesh $(x_i, y_j)$ with $\Delta x = \Delta y$, then using standard finite differences we obtain:

\[
\frac{u^{n+1}(x_i, y_j) - u^{n}(x_i, y_j)}{\Delta t} - \frac{u^{n}(x_{i+1}, y_j) + u^{n}(x_{i-1}, y_j) - 4u^{n}(x_i, y_j) + u^{n}(x_i, y_{j+1}) + u^{n}(x_i, y_{j-1})}{(\Delta x)^2} = -\lambda^{+} \chi_{\{u^{n}(x_i, y_j)>0\}} + \lambda^{-} \chi_{\{u^{n}(x_i, y_j)<0\}}.
\tag{19}
\]
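One time step of (19) can be sketched as follows. The names `explicit_step` and `march`, the constant $\lambda^{\pm}$, the zero initial guess and the step-size factor `cfl` are assumptions for illustration; the restriction $\Delta t \le (\Delta x)^2 / 4$ is the usual stability condition for the explicit five-point heat scheme and is not stated in the text.

```python
import numpy as np

def explicit_step(u, h, dt, lam_plus, lam_minus):
    """Advance u by one step of scheme (19); boundary values are kept fixed."""
    c = u[1:-1, 1:-1]
    lap = (u[2:, 1:-1] + u[:-2, 1:-1] - 4.0 * c + u[1:-1, 2:] + u[1:-1, :-2]) / h**2
    rhs = -lam_plus * (c > 0) + lam_minus * (c < 0)   # evaluated at time level n
    new = u.copy()
    new[1:-1, 1:-1] = c + dt * (lap + rhs)
    return new

def march(g, h, lam_plus=1.0, lam_minus=1.0, n_steps=500, cfl=0.2):
    """Run n_steps of the explicit scheme from a zero interior initial guess.
    g holds the Dirichlet data on its boundary entries."""
    dt = cfl * h**2                 # cfl <= 0.25 assumed for stability
    u = g.copy()
    u[1:-1, 1:-1] = 0.0
    for _ in range(n_steps):
        u = explicit_step(u, h, dt, lam_plus, lam_minus)
    return u
```

Because the right-hand side is discontinuous in $u$, values near the zero set may keep changing sign by an amount of order $\Delta t\, \lambda^{\pm}$ from step to step, so a stopping test based on the change between iterates should use a tolerance no smaller than that.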
Note that the right-hand side of (19) is evaluated at $u^{n}(x_i, y_j)$, which gives an explicit method.

3. Approximation by the finite element method

The finite element method is an efficient method for solving partial differential equations, in particular on complex domains, when the domain changes, when the desired precision varies over the domain, or when the solution lacks smoothness. The finite element method has been studied extensively [3,4,8]. In this method, an infinite-dimensional function space is approximated by a finite set of functions. These functions form the basis of a discrete finite-dimensional subspace of some function space and are usually defined on simple geometrical elements.

Let $\mathcal{T}$ denote either a discretization of $\Omega$ in one dimension or a triangulation of $\Omega$ in two dimensions. Let $S$ be the set of all nodes (vertices) of $\mathcal{T}$ which belong to the interior of $\Omega$ and let $N = \mathrm{Card}(S)$. The affine subspace $K$ is approximated by

\[ V_h(g) = \big\{ v \in H^1(\Omega) \cap C^0(\Omega) : v|_{\partial\Omega} = g,\ v|_T \in P_1,\ \forall T \in \mathcal{T} \big\}, \tag{20} \]

where $v|_{\partial\Omega}$ is the trace of $v$ on $\partial\Omega$, $v|_T$ denotes the restriction of $v$ to $T$, and $P_1$ is the space of polynomials of degree less than or equal to one. Let $\{\phi_i\}_{i=1}^{N}$ be the basis functions of $V_h$ such that $\phi_i(x_j) = \delta_{ij}$, where $x_j$ is a vertex. Consider the approximated problem

\[ u \in V_h(g): \quad \int_{\Omega} \nabla u \cdot \nabla\phi\, dx = -\int_{\Omega} \lambda^{+} \phi\, \chi_{\{u>0\}}\, dx + \int_{\Omega} \lambda^{-} \phi\, \chi_{\{u<0\}}\, dx, \quad \forall \phi \in V_h(0). \tag{21} \]

Define $U = (u_1, u_2, \ldots, u_N) \in \mathbb{R}^N$ by $u_i = u(x_i)$. Note that by Eq. (21), there exist three disjoint subsets of $\{1, 2, \ldots, N\}$, denoted by $I_1, I_2, I_3$, such that

\[
\begin{aligned}
&u_i < 0 \ \text{ and } \ (AU)_i = F_i^{-} && \forall i \in I_1, \\
&u_i = 0 && \forall i \in I_2, \\
&u_i > 0 \ \text{ and } \ (AU)_i = F_i^{+} && \forall i \in I_3,
\end{aligned}
\]

where $A$ denotes the $N \times N$ matrix whose elements are determined by

\[ a_{ij} = \int_{\Omega} \nabla\phi_i \cdot \nabla\phi_j\, dx, \tag{22} \]

so that

\[ (AU)_i = \sum_{j=1}^{N} u_j \int_{\Omega} \nabla\phi_i \cdot \nabla\phi_j\, dx, \tag{23} \]

and

\[ F_i^{+} = -\int_{\Omega} \lambda^{+} \phi_i\, dx, \qquad F_i^{-} = \int_{\Omega} \lambda^{-} \phi_i\, dx. \tag{24} \]

Then (21) can be written as

\[
\begin{cases}
(AU)_i = F_i^{-} & \forall i \in I_1, \\
u_i = 0 & \forall i \in I_2, \\
(AU)_i = F_i^{+} & \forall i \in I_3.
\end{cases}
\tag{25}
\]
With the notation introduced above, the algorithm is as follows:

• Initialization: let $I_1^{(0)}, I_2^{(0)}, I_3^{(0)}$ be a partition of $I = \{1, 2, \ldots, N\}$.

• Step $(j)$, $j \ge 0$. For the given sets $I_1^{(j)}, I_2^{(j)}, I_3^{(j)}$, let $U^{j} = (u_1^{(j)}, u_2^{(j)}, \ldots, u_N^{(j)})$ be the solution of the following equation:

\[
\begin{cases}
(AU^{j})_i = F_i^{-} & \forall i \in I_1^{(j)}, \\
u_i^{(j)} = 0 & \forall i \in I_2^{(j)}, \\
(AU^{j})_i = F_i^{+} & \forall i \in I_3^{(j)}.
\end{cases}
\tag{26}
\]

Set

\[
\begin{aligned}
I_1^{(j,0)} &= \{ i \in I_1^{(j)} : u_i^{(j)} < 0 \}, \\
I_1^{(j,1)} &= I_1^{(j)} \setminus I_1^{(j,0)}, \\
I_2^{(j,0)} &= \{ i \in I_2^{(j)} : (AU^{j})_i > F_i^{-} \}, \\
I_2^{(j,1)} &= \{ i \in I_2^{(j)} : (AU^{j})_i < F_i^{+} \}, \\
I_2^{(j,2)} &= I_2^{(j)} \setminus \big( I_2^{(j,0)} \cup I_2^{(j,1)} \big), \\
I_3^{(j,0)} &= \{ i \in I_3^{(j)} : u_i^{(j)} \le 0 \}.
\end{aligned}
\]

Define

\[
\begin{cases}
I_1^{(j+1)} = I_1^{(j,0)} \cup I_2^{(j,0)}, \\
I_2^{(j+1)} = I_1^{(j,1)} \cup I_2^{(j,2)} \cup I_3^{(j,0)}, \\
I_3^{(j+1)} = I \setminus \big( I_1^{(j+1)} \cup I_2^{(j+1)} \big).
\end{cases}
\tag{27}
\]
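The index-set iteration (26) and (27) can be tried out on a one-dimensional model problem. The sketch below is not the author's code and rests on several assumptions introduced for illustration: $\Omega = (-1, 1)$ with a uniform $P_1$ mesh, constant $\lambda^{+} = \lambda^{-} = 8$, boundary values $g(\mp 1) = \mp 1$, the initial partition $I_2^{(0)} = I$, and a dense solve of one reduced system per step. For these data the exact solution is $u = 4(x - \tfrac12)^2$ for $x > \tfrac12$, $u = -4(x + \tfrac12)^2$ for $x < -\tfrac12$ and $u = 0$ in between, which gives something to compare against.

```python
import numpy as np

def two_phase_fem_1d(n_nodes=41, lam_plus=8.0, lam_minus=8.0,
                     g_left=-1.0, g_right=1.0, max_steps=200):
    """Index-set iteration (26)-(27) for P1 elements on a uniform mesh of (-1, 1).
    state[i] = -1, 0, +1 encodes membership of node i in I1, I2, I3."""
    x = np.linspace(-1.0, 1.0, n_nodes)
    h = x[1] - x[0]
    # P1 stiffness entries; only the rows of interior nodes are ever used.
    K = (2.0 * np.eye(n_nodes) - np.eye(n_nodes, k=1) - np.eye(n_nodes, k=-1)) / h
    interior = np.arange(1, n_nodes - 1)
    F_plus, F_minus = -lam_plus * h, lam_minus * h   # (24): interior hat integrates to h
    state = np.zeros(n_nodes, dtype=int)             # initial partition: I2 = I
    u = np.zeros(n_nodes)
    u[0], u[-1] = g_left, g_right
    for _ in range(max_steps):
        # Solve (26): u = 0 on I2, (AU)_i = F_i^- on I1, (AU)_i = F_i^+ on I3.
        free = interior[state[interior] != 0]
        u[interior] = 0.0
        if free.size:
            known = np.setdiff1d(np.arange(n_nodes), free)
            rhs = np.where(state[free] < 0, F_minus, F_plus)
            rhs = rhs - K[np.ix_(free, known)] @ u[known]
            u[free] = np.linalg.solve(K[np.ix_(free, free)], rhs)
        residual = K @ u                              # (AU)_i at interior nodes
        # Update the index sets according to (27).
        new_state = state.copy()
        for i in interior:
            if state[i] < 0 and not u[i] < 0:
                new_state[i] = 0                      # I1^(j,1) joins I2
            elif state[i] > 0 and u[i] <= 0:
                new_state[i] = 0                      # I3^(j,0) joins I2
            elif state[i] == 0 and residual[i] > F_minus:
                new_state[i] = -1                     # I2^(j,0) joins I1
            elif state[i] == 0 and residual[i] < F_plus:
                new_state[i] = 1                      # I2^(j,1) joins I3
        if np.array_equal(new_state == 0, state == 0):
            break                                     # I2 unchanged: stop
        state = new_state
    return x, u, state

x, u, state = two_phase_fem_1d()
```

Starting from $I_2^{(0)} = I$, the negative and positive sets grow by one node per step from each end of the interval in this example, which matches the monotone behavior described for the iterates.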
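The algorithm stops when $I_2^{(n)} = I_2^{(n+1)}$, and this condition is equivalent to

\[ \forall i \in I_2 : \quad (AU)_i \le F_i^{-} \ \text{ and } \ (AU)_i \ge F_i^{+}. \]

Remark. Assume $B_r \subseteq \operatorname{supp}(\phi_i)$. If we multiply $\Delta u = -\lambda^{-}$ by a basis function $\phi_i$ and integrate by parts we have

\[ -\int_{B_r} \nabla u \cdot \nabla\phi_i\, dx + \int_{\partial\{u<0\} \cap B_r} \frac{\partial u}{\partial n}\, ds = -\int_{B_r} \lambda^{-} \phi_i\, dx. \tag{28} \]

Therefore, by (23) and (24),

\[ \int_{\partial\{u<0\} \cap B_r} \frac{\partial u}{\partial n}\, ds = (AU)_i - F_i^{-}. \tag{29} \]

One can interpret (29) as moving the free boundary in the direction of the normal to the free boundary.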
Remark. Note that if $(AU)_i \cdot F_i^{+} < 0$ or if $(AU)_i \cdot F_i^{-} < 0$ then one should decrease the number of points in the zero set $I_2$. In fact these conditions tell us how the free boundary points should move.

Remark. In each iteration we do not solve the full $N \times N$ system in (26). We solve two linear subsystems of size smaller than $N$.

The next lemma shows that the monotone convergence of the iterates is a consequence of the maximum principle.

Lemma 3.1. The elements of the sequence $(U^{j})$ constructed by the above algorithm satisfy

\[ (u_i^{j+1})^{+} \ge (u_i^{j})^{+} \quad \text{and} \quad (u_i^{j+1})^{-} \ge (u_i^{j})^{-}. \]

Equivalently, the sequence of functions $u^{(j)}(x)$ defined by

\[ u^{(j)}(x) = \sum_{i=1}^{N} u_i^{(j)} \phi_i(x), \]

for all $x \in \Omega$ satisfies

\[ \big( u^{(j+1)}(x) \big)^{+} \ge \big( u^{(j)}(x) \big)^{+} \quad \text{and} \quad \big( u^{(j+1)}(x) \big)^{-} \ge \big( u^{(j)}(x) \big)^{-}, \]

where $u^{+} = \max(u, 0)$, $u^{-} = \max(-u, 0)$.

Proof. Set $w(x) = (u^{(j+1)}(x))^{+} - (u^{(j)}(x))^{+}$. Then $w(x)$ can be written as

\[ w(x) = \sum_{i=1}^{N} w_i \phi_i(x), \]

with $w_i = w_i^{+} - w_i^{-}$. One can obtain

\[ \nabla w(x) = \sum_{i=1}^{N} (w_i^{+} - w_i^{-}) \nabla\phi_i(x), \]

which shows

\[ \int_{\Omega} | \nabla w^{-}(x) |^2\, dx = -\sum_{i=1}^{N} w_i^{-} \int_{\Omega} \nabla w(x) \cdot \nabla\phi_i(x)\, dx. \tag{30} \]

• If $i \in I_1^{j} \cap I_1^{j+1}$ then $w_i^{-} = 0$.
• If $i \in I_1^{j} \cap I_2^{j+1}$ then $w_i^{-} = 0$.
• If $i \in I_3^{j} \cap I_2^{j+1}$ then $w_i^{-} = 0$.
• If $i \in I_3^{j} \cap I_3^{j+1}$ then

\[ \int_{\Omega} \nabla u^{j} \cdot \nabla\phi_i\, dx = \int_{\Omega} \nabla u^{j+1} \cdot \nabla\phi_i\, dx. \]

Therefore $\int_{\Omega} \nabla(u^{j+1} - u^{j}) \cdot \nabla\phi_i\, dx = 0$, which gives $\int_{\Omega} \nabla w(x) \cdot \nabla\phi_i\, dx = 0$.
• If $i \in I_2^{j} \cap I_3^{j+1}$ then

\[ (AU^{j})_i < F_i^{+} \quad \text{and} \quad (AU^{j+1})_i = F_i^{+}. \]

Hence

\[ \int_{\Omega} \nabla w(x) \cdot \nabla(w_i^{-} \phi_i)\, dx \ge 0. \]

The case $i \in I_2^{j} \cap I_1^{j+1}$ is similar. Now the above inequalities and (30) show that

\[ \int_{\Omega} | \nabla w^{-}(x) |^2\, dx = 0, \]

that is, $w^{-}(x) = c$, and since $w^{-}(x) \in H_0^1(\Omega)$ we have $w(x) \ge 0$. The monotonicity of the negative part can be proved in the same way with

\[ w(x) = \big( u^{(j+1)}(x) \big)^{-} - \big( u^{(j)}(x) \big)^{-}. \qquad \square \]

Fig. 1. Free boundary, Ω+ and Ω−.
Remark. It should be noted that the discretized problem based on the finite element method can be given a complementarity formulation. Thus the proposed algorithm is an extended version of Chandrasekaran’s algorithm to two-phase obstacle problems. For Chandrasekaran’s algorithm for linear complementarity problems see the books [4,13]. For the Chandrasekaran’s algorithm applied to the classical obstacle problem see [12]. 4. Numerical examples Example 4.1. In Fig. 1 the first method was applied with Ω = [−1, 1] × [−1, 1] and the boundary value g is defined by
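$$g(x, y) = \begin{cases} 1 + x, & -1 \le x \le +1 \text{ and } y = 1, \\ 1 + y, & -1 \le y \le +1 \text{ and } x = 1, \\ x - 1, & -1 \le x \le +1 \text{ and } y = -1, \\ y - 1, & -1 \le y \le +1 \text{ and } x = -1. \end{cases}$$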
For implementation aspects, at each point we use the updated values of the neighboring points, as in the Gauss–Seidel method. The Gauss–Seidel method uses the already updated values calculated in the same iteration to update $u_1(x_i, y_j)$, which depends on the direction of propagation of information from the boundary. For instance, the iterations for $u_1$ and $u_2$ (see Fig. 2) will be

$$u_1^{(k+1)}(x_i, y_j) = \frac{1}{4}\Big(-\lambda^+ h^2 + u_1^{(k+1)}(x_{i-1}, y_j) + u_1^{(k)}(x_{i+1}, y_j) + u_1^{(k+1)}(x_i, y_{j-1}) + u_1^{(k)}(x_i, y_{j+1})\Big) - \frac{1}{4}\Big(u_2^{(k)}(x_{i-1}, y_j) + u_2^{(k)}(x_{i+1}, y_j) + u_2^{(k)}(x_i, y_{j-1}) + u_2^{(k)}(x_i, y_{j+1})\Big),$$

$$u_2^{(k+1)}(x_i, y_j) = \frac{1}{4}\Big(-\lambda^- h^2 + u_2^{(k+1)}(x_{i-1}, y_j) + u_2^{(k)}(x_{i+1}, y_j) + u_2^{(k+1)}(x_i, y_{j-1}) + u_2^{(k)}(x_i, y_{j+1})\Big) - \frac{1}{4}\Big(u_1^{(k)}(x_{i-1}, y_j) + u_1^{(k)}(x_{i+1}, y_j) + u_1^{(k)}(x_i, y_{j-1}) + u_1^{(k)}(x_i, y_{j+1})\Big).$$
A general way to accelerate the convergence of the method is provided by the multigrid method. An alternative is to solve the problem on a coarse mesh and then use the obtained solution as the initial value on a finer mesh, and so on.

Example 4.2. Figs. 3 and 4 show the first method and the regularization method with $\lambda^+ = \lambda^- = 1$, $\varepsilon = 10^{-7}$. The regularized equation

$$\Delta u_\varepsilon = \frac{u_\varepsilon}{\sqrt{u_\varepsilon^2 + 0.0000001}}$$

is solved by the finite element method where the mesh size is $h = 0.01$. The boundary value $g$ is given by
Fig. 2. The surfaces of u 1 and u 2 .
Fig. 3. The level set of the solution.
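$$g(x, y) = \begin{cases} \left(\dfrac{1 - x}{2}\right)^2, & -1 \le x \le +1 \text{ and } y = 1, \\ y^2, & 0 \le y \le +1 \text{ and } x = -1, \\ -y^2, & -1 \le y \le 0 \text{ and } x = -1, \\ -\left(\dfrac{1 - x}{2}\right)^2, & -1 \le x \le +1 \text{ and } y = -1, \\ 0, & -1 \le y \le +1 \text{ and } x = +1. \end{cases}$$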
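Example 4.3. The regularization method was applied in Fig. 5 with

$$\varphi_\varepsilon(u) = \frac{u + \sqrt{u^2 + \varepsilon^2}}{2},$$

where $u$ satisfies

$$\Delta u = 4\chi_{\{u > 0\}} - 4\chi_{\{u < 0\}} \ \text{in } B_{1/2}, \qquad u = g \ \text{on } \partial B_{1/2}. \qquad (31)$$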
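To illustrate how such a regularization behaves, the following Python sketch evaluates a function of the form $(u + \sqrt{u^2 + \varepsilon^2})/2$ and compares it with the positive part $u^+ = \max(u, 0)$. The function names `phi_eps` and `max_gap` and the sampling interval $[-1, 1]$ are illustrative assumptions, not taken from the paper. Under this reading of the formula, the gap to $u^+$ is largest at $u = 0$, where it equals $\varepsilon/2$.

```python
import math


def phi_eps(u: float, eps: float) -> float:
    """Smooth approximation of the positive part u^+ = max(u, 0)."""
    return 0.5 * (u + math.sqrt(u * u + eps * eps))


def max_gap(eps: float, samples: int = 2001) -> float:
    """Largest value of phi_eps(u) - max(u, 0) on a uniform grid over [-1, 1].

    The grid size is chosen (hypothetically) so that u = 0 is a grid point.
    """
    gap = 0.0
    for n in range(samples):
        u = -1.0 + 2.0 * n / (samples - 1)
        gap = max(gap, phi_eps(u, eps) - max(u, 0.0))
    return gap


if __name__ == "__main__":
    for eps in (1e-1, 1e-3, 1e-6):
        # The printed gap should shrink proportionally to eps.
        print(eps, max_gap(eps))
```

The sketch only probes the scalar function; it does not solve the regularized boundary value problem.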
Fig. 4. Regularized solution with $\varepsilon = 10^{-6}$.
Fig. 5. Regularized solution with $\varepsilon = 10^{-6}$.
The boundary value $g$ is defined by

$$g = \begin{cases} 0.25 - x^2 & \text{if } y = \sqrt{0.25 - x^2}, \\ -0.25 + x^2 & \text{if } y = -\sqrt{0.25 - x^2}. \end{cases}$$
The remaining examples deal with the method introduced in Section 3. For any initial guess, the monotonicity of the algorithm implies that there exists $n$ such that $I_2^{n} = I_2^{n+1}$. It is also easy to check that $I_2^{n} = I_2^{n+1}$ implies that $I_1^{n} = I_1^{n+1}$ and $I_3^{n} = I_3^{n+1}$. Furthermore, the sequence $(U^n)_{n \in \mathbb{N}}$ constructed by the algorithm is such that $U^n$ is the exact solution to the discrete problem. Moreover, the integer $n$ satisfies

$$n \le N/2.$$

The algorithm is simple to implement. For instance, in one dimension, if we let $u_i^0 = 0$ for $i = 2, \ldots, N - 1$, then after solving (26) we only need to check if $(AU)_2 > F_2^-$ and $(AU)_{N-1} < F_{N-1}^+$ and so on. It means that one only needs to obtain $I_2^{(j,0)}$ and $I_2^{(j,1)}$ at each iteration. This fact is shown in Fig. 5.
Fig. 6. The first iteration and the 8th and 15th iterations.
Fig. 7. Modified method.
Example 4.4. Consider the following two-phase equation

$$\Delta u = 8\chi_{\{u > 0\}} - 8\chi_{\{u < 0\}}, \qquad u(+1) = +1, \quad u(-1) = -1.$$

The exact solution is

$$u = \begin{cases} 4x^2 - 4x + 1, & 0.5 \le x \le 1, \\ 0, & -0.5 \le x \le 0.5, \\ -4x^2 - 4x - 1, & -1 \le x \le -0.5. \end{cases}$$
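The exact solution of Example 4.4 can be cross-checked numerically. The Python sketch below is only an illustration (the names `u_exact` and `second_difference` and the step size `h` are assumptions): it evaluates the piecewise formula and confirms, with a central second difference, that the second derivative is 8 where $u > 0$, $-8$ where $u < 0$, and 0 on the zero set, and that the boundary values are met.

```python
def u_exact(x: float) -> float:
    """Piecewise exact solution of Example 4.4 on [-1, 1]."""
    if x >= 0.5:
        return 4.0 * x * x - 4.0 * x + 1.0
    if x <= -0.5:
        return -4.0 * x * x - 4.0 * x - 1.0
    return 0.0


def second_difference(x: float, h: float = 1e-3) -> float:
    """Central second difference; exact for quadratics up to rounding."""
    return (u_exact(x + h) - 2.0 * u_exact(x) + u_exact(x - h)) / (h * h)


if __name__ == "__main__":
    print("u(+1), u(-1):", u_exact(1.0), u_exact(-1.0))
    for x in (0.75, 0.0, -0.75):
        print(x, u_exact(x), second_difference(x))
```

The sample points are chosen away from the free boundary points $x = \pm 0.5$, where the second derivative jumps.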
Numerical tests show the efficiency of the algorithm introduced in Section 3 (see Fig. 6). Also one can see that the number of iterations depends on the initial guess for $I_1^{(0)}$, $I_2^{(0)}$ and $I_3^{(0)}$. The number of iterations can be significantly decreased if one modifies the algorithm and considers the initial guess for $I_1^{(0)}$, $I_2^{(0)}$, $I_3^{(0)}$ as below. For example, in one dimension, the interval is divided into two parts. Assume that $u$ is negative for all nodes in the left part of the interval, positive in the right part and zero at the middle point. This means that
Fig. 8. The first iteration and the 12th iteration.
$$I_1^{(0)} = \{1, 2, \ldots, [N/2] - 1\}, \qquad I_2^{(0)} = \{[N/2]\}, \qquad I_3^{(0)} = \{[N/2] + 1, \ldots, N\}.$$
Then after solving (26) we add half of the nodes in $I_1$ such that $u_i > 0$ to $I_2$. Similarly we add half of the points in $I_3$ such that $u_i < 0$ to $I_2$. Fig. 7 illustrates the iterations in the modified method.

Example 4.5. Consider

$$\Delta u = 2\chi_{\{u > 0\}} - \chi_{\{u < 0\}}, \qquad u(+1) = +1, \quad u(-1) = -1.$$
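The location of the free boundary in Example 4.5 can be cross-checked with a short calculation. The sketch below rests on assumptions that are not spelled out in the text: the solution is taken to have a single free boundary point $a$, to be $C^1$ there with slope $b$, and to equal $(x - a)^2 + b(x - a)$ for $x \ge a$ and $-\tfrac{1}{2}(x - a)^2 + b(x - a)$ for $x \le a$. The two boundary conditions then give two expressions for $b$, and $a$ is the point where they agree. The function names and the bisection bracket $[0, 0.5]$ are illustrative.

```python
def slope_mismatch(a: float) -> float:
    """Difference between the slopes b implied by the two boundary conditions."""
    b_right = (1.0 - (1.0 - a) ** 2) / (1.0 - a)        # from u(+1) = +1
    b_left = (1.0 - 0.5 * (1.0 + a) ** 2) / (1.0 + a)   # from u(-1) = -1
    return b_right - b_left


def free_boundary(lo: float = 0.0, hi: float = 0.5, tol: float = 1e-12) -> float:
    """Bisection for the root of slope_mismatch on [lo, hi]."""
    f_lo = slope_mismatch(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = slope_mismatch(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print("free boundary point:", free_boundary())
```

Under these assumptions the root agrees with the value 0.141215213 quoted for this example to the digits given.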
In this case the zero set has measure zero and the free boundary is the point 0.141215213 (see Fig. 8).

References
[1] A. Bogomolny, J.W. Hou, Shape optimization approach to numerical solution of the obstacle problem, Appl. Math. Optim. 12 (1984) 45–72.
[2] F. Bozorgnia, A perturbation formula of two-phase membrane problem, 2010, preprint.
[3] S.C. Brenner, L. Ridgway Scott, The Mathematical Theory of Finite Element Methods, Springer-Verlag, 2002.
[4] P.G. Ciarlet, J.L. Lions, Handbook of Numerical Analysis, vol. II, Finite Element Methods, North-Holland, Amsterdam, 1991.
[5] R.W. Cottle, J.S. Pang, R.E. Stone, The Linear Complementarity Problem, Cambridge Univ. Press, Cambridge, 2009.
[6] I. Ekeland, R. Temam, Convex Analysis and Variational Problems, North-Holland, Amsterdam, 1976.
[7] D. Gilbarg, N. Trudinger, Elliptic Partial Differential Equations of Second Order, Springer-Verlag, New York, 1983.
[8] R. Glowinski, Numerical Methods for Nonlinear Variational Problems, Springer-Verlag, 1981.
[9] W. Han, A Posteriori Error Analysis via Duality Theory, Springer-Verlag, 1981.
[10] K. Majava, X. Tai, A level set method for solving free boundary problems associated with obstacles, Int. J. Numer. Anal. Model. 1 (2) (2004) 157–171.
[11] U. Mosco, F. Scarpini, Complementarity systems and approximation of variational inequalities, Rev. Franc. Automat. Inform. Rech. Operat. 9, Anal. Numer. R-1 (1975) 83–104.
[12] K.G. Murty, Linear Complementarity, Linear and Nonlinear Programming, Helderman, Boston, 1988.
[13] R. Scholz, Numerical solution of the obstacle problem by the penalty method, Computing 32 (1984) 297–306.
[14] H. Shahgholian, N. Uraltseva, G.S. Weiss, Global solution of an obstacle-problem-like equation with two phases, Monatsh. Math. 142 (2004) 27–34.
[15] H. Shahgholian, G.S. Weiss, The two-phase membrane problem—an intersection-comparison approach to the regularity at branch points, Adv. Math. 205 (2006) 487–503.
[16] N. Uraltseva, Two-phase obstacle problem, function theory and phase transitions, J. Math. Sci. (N.Y.) 106 (3) (2001) 3073–3077.
[17] G.S. Weiss, An obstacle-problem-like equation with two phases: pointwise regularity of the solution and an estimate of the Hausdorff dimension of the free boundary, Interfaces Free Bound. 3 (2001) 121–128.
Applied Numerical Mathematics 61 (2011) 108–130
Optimization, resolution and application of composite compact finite difference templates
Stephen A. Jordan
Naval Undersea Warfare Center, Newport, RI 02842, United States
Article info
Article history: Received 1 June 2009. Received in revised form 18 February 2010. Accepted 9 August 2010. Available online 19 August 2010.
Keywords: Compact finite differencing; Multi-parameter optimization; Spatial resolution errors; Numerical stability

Abstract
Spectral-like compact finite differencing schemes are capable of achieving high spatial efficiency of complex physics in irregular domains with difficult boundary conditions. Their low-resolution errors are commonly reached through large stencil sizes and/or parameter optimization. The field stencils require boundary (and near boundary) stencils to close the composite template for implicit solution. Present practices often optimize each participating stencil individually with the aim of ensuring global stability and/or spectral-like characteristics. However, analyzing each stencil separately incorrectly quantifies the local resolution errors. A new process is proposed that properly quantifies the dispersive and dissipative errors of optimized templates in the spectral domain. The templates are optimized at the boundary and adjacent interior points. Both tri- (five-point) and penta-diagonal (seven-point) compact systems are treated in this fashion. A spectral eigenvalue analysis shows the resultant composites to be numerically stable. An a priori procedure is formulated that quantifies the expected reduction in the local predictive error due specifically to the improved template spatial resolution. Three test problems are selected from the Computational Aeroacoustics workshops to demonstrate their improved predictive accuracy. Finally, the present technique provides closure for exercising the three essential criteria of numerical accuracy, stability and resolution when developing composite compact finite difference templates for practical applications.
Published by Elsevier B.V. on behalf of IMACS.
1. Introduction

Compact finite differencing schemes can attain spectral-like spatial resolution of complex physics. This variety is an adequate substitute for spectral (and pseudo-spectral) methods when dealing with irregular domains and challenging boundary conditions. These conditions commonly appear in a wide array of practical applications. The stencils are implicit and theoretically raise the formal accuracy by two orders over their explicit counterpart when formulated over the same number of computational points. Moreover, these stencils commonly well-resolve sixty percent (or better) of the resolvable waves as compared to merely one-half of that resolution by the companion explicit scheme (usually less than thirty percent well-resolved). Given this improved resolution, one can experience higher spatial accuracy under the same spacing or still anticipate acceptable predictive error for expanded gridding.

Recently, much effort has been devoted to optimizing the boundary and near boundary stencils to reduce the spatial resolution errors. The optimization path introduces free parameters into the compact stencil at a cost of sacrificing its formal accuracy. New multi-parameter families have arisen that house large compact finite differencing stencils which are only moderately accurate. These stencils are coupled with the higher-order interior scheme to realize spectral-like spatial resolution throughout the entire solution domain. But before these newly coupled schemes can be universally implemented for
0168-9274/$30.00 Published by Elsevier B.V. on behalf of IMACS. doi:10.1016/j.apnum.2010.08.008
S.A. Jordan / Applied Numerical Mathematics 61 (2011) 108–130
practical application, their formal accuracy, stability and resolution characteristics must be carefully quantified by acceptable means. More importantly, does the complete solution domain in fact share an equivalent distribution of spatial resolution characteristics using these optimized templates?

One of the original optimized high-order stencils in compact form was developed by Lele [18] for approximating first-order derivatives at the interior points of discretized domains. His optimization strategy focused on reaching spectral-like resolution of a penta-diagonal system where a tenth-order central differencing stencil was reduced to fourth-order with three free parameters. He determined these parameters by constraining the modified complex wavenumber of the stencil to be spectral (exact) at three discrete points in Fourier space. The spatial resolution efficiency of the fourth-order stencil was improved by over eleven percent as compared to the tenth-order scheme. Besides this interior scheme, Lele also optimized several one-sided boundary stencils using an equivalent strategy. A well-known inherent property of all one-sided standard schemes is the inevitable introduction of numerical dissipation. Therefore, in one example, Lele reduced the one-sided fourth-order expression of Rogallo and Moin [19] to a second-order two-parameter family where the explicit-side free coefficient was evaluated to give zero dissipation while the implicit-side parameter was optimized to reduce dispersion. When coupled with the interior stencil, Lele focused on satisfying discrete global conservation and stability with practical application to a compressible mixing layer.

Shortly afterward, Tam and Webb [22] developed a high-order accurate dispersion-relation-preserving (DRP) scheme for aeroacoustic applications. The skeleton stencil held seven points (and flexible coefficients) whose formal accuracy was dependent on the user demand for high spatial resolution.
Outside the boundaries, they introduced ghost points that expanded the computational space to satisfy the seven-point stencil everywhere within the physical domain. Tam and Dong [20] minimized the requirement for the fictitious points by introducing seven-point backward differencing schemes that mimicked solid wall boundaries. They illustrated strong numerical accuracy of their modified DRP scheme by solving various problems with particular attention paid to the acoustic pressure near the solid walls. With focus on multi-parameter compact schemes that are asymptotically stable while simultaneously satisfying the Gustafsson–Kreiss–Sundstrom (GKS) stability theory [11], Carpenter et al. [4] derived a fourth-order three-parameter family and sixth-order four-parameter family of coupled stencils that also attained their formal field accuracy when solving the scalar wave equation. Specifically, an explicit three-parameter fourth-order boundary stencil and explicit two-parameter fifth-order boundary and adjacent boundary stencils were joined by compact fourth-order and sixth-order interior operators, respectively. Carpenter et al. noted that neither stencil was optimized except that each yielded roots from the stability analysis that lied inside the left-hand plane of the eigenvalue spectrum. Besides these two sample cases, they briefly examined the GKS stability of explicit one-parameter first- and third-order boundary stencils as linked with fourth- and sixth-order operators, respectively. Demuren et al. [6] implemented the sixth-order template of Carpenter et al. for solving the general class of turbulent flow problems. The template demonstrated strong pressure-velocity coupling when an analogous expression was devised for the separate pressure solution. 
Similarly, Cook and Riley [5] asymptotically stabilized a sixth-order field scheme with optimized fifth-order stencils at the boundary and adjacent boundary points for direct numerical simulations (DNS) of a turbulent reactive plume. Optimizing high-order upwinding stencils was originally tested by Adams and Shariff [3] for shock capturing. They evaluated two free parameters of their five-point fifth-order stencil by minimizing the L 2 -error norm of the complex exact and modified wavenumber difference in Fourier space. One parameter controlled dispersion while the other limited dissipation at and adjacent to the domain boundaries. When integrated into their hybrid essentially non-oscillatory (ENO) scheme, the resultant spatial resolution character maintained stable and accurate predictions of shock behavior. Zhong [24] also explored high-order upwinding stencils as shock-fitting operators. He adjusted the explicit floating mid-point coefficient of three central (4th, 6th and 8th) compact stencils to minimize numerical dissipation. By setting the coefficient to zero, the lower-ordered stencil defaulted back to the original compact centered scheme. Although Zhong did not specifically discuss the spatial resolution characteristics of each upwind compact stencil, he tested their accuracy as shock fitting operators when coupled with various explicit boundary expressions. Finally, Deng and Maekawa [7] extended their optimized upwind shock-capturing operators to a cell-centered solution molecule. A MacCormack-type solution (MAC) scheme that utilizes a prefactorization routine for decomposing high-order compact central stencils into easily solved upper and lower reduced-order bidiagonal systems was developed by Hixon and Turkel [14] for aeroacoustic applications. 
Later, Hixon [13] introduced an optimization strategy of their MAC scheme where the further-reduced size of the interior stencil required only compact (or explicit) boundary definitions to close the solution scheme. The stability and accuracy of his resultant operator was benchtested against three token problems from the First and Second Computational Aeroacoustics (CAA) workshops [12,21]. A recent alternate optimization concept was developed by Finkelstein and Kastner [8] who coined the development as the spectral order of accuracy. Their strategy operates in the spectral domain that targets a ubiquitous tool for methodical design of finite difference time domain schemes. Recently, Kim [16] optimized a fourth-order three-parameter penta-diagonal compact stencil for the interior points of the discretized domain in aeroacoustic applications. Like Adams and Shariff [3], he evaluated the free parameters by minimizing the difference between the complex modified and exact wavenumbers in Fourier space. He further introduced an integration limit and weighting function to concentrate the minimization procedure towards the higher resolvable wavenumbers. Given this large stencil, Kim developed extrapolation functions for replacing the exterior terms to close the approximation operator at and near the domain boundaries. He performed an eigenvalue analysis and convergence tests using several benchmark problems taken from the CAA workshops [12,21].
When one optimizes a compact finite differencing stencil that targets high resolution of complex physics, three salient ingredients must be suitably addressed before the general science community can universally accept the resultant operator. The developer must treat the accuracy and stability as well as the spatial resolution properties of the optimized scheme that includes the mixture influence by the associated boundary and near boundary stencils. Nearly all of the above advancements toward high-resolution stencils properly analyzed the stability and accuracy of the linked finite difference expressions. But ironically, each overlooked the true spatial resolution character of the composite scheme. Instead, only a one-to-one comparison of the improved spatial resolution properties was discussed between the individual standard and optimized stencils. No attention was given to a continued analysis of the actual spatial resolution properties inherent in the final compact system of equations. Jordan [15] touched on the coupling effects of both compact and explicit boundary stencils with various interior schemes, but no procedure was defined that sufficiently quantified the reduced predictive error due specifically to the improved spatial resolution. Additionally, his analyses did not examine linked schemes that housed both boundary and near boundary optimized stencils as well as their numerical stability. In the present paper, we will emphasize a simple procedure for quantifying the expected improvement in predictive accuracy due only to a reduced resolution error of an optimized compact differencing scheme. We will examine several of the previous compact composites that were largely developed for aeroacoustic applications as well as a separate set of new multi-parameter boundary (and near boundary) families after optimized and joined to the adjacent interior stencil. These new families will own both tri-diagonal and penta-diagonal structures. 
We will see that the improved spatial resolution through optimization can quickly decay to levels below the original operator once the analysis links the proposed boundary stencils. Herein, a corrective measure is introduced into the optimization strategy of the multi-parameter boundary (and near boundary) stencils whose goal is a consistent spatial resolution throughout the entire computational domain. The numerical stability of these new optimized compact composites is assessed by eigenvalue analysis. Using several benchmark problems from the CAA workshops, we will examine the predictive error of the new composite schemes and compare them to their standard counterparts. Finally, the proposed procedure supplies closure of the required analyses for sufficiently quantifying the intrinsic accuracy, stability and resolution characteristics of coupled finite difference systems including application of optimization techniques toward spectral-like characteristics.

2. Optimize compact finite difference stencils

Optimized compact finite differencing schemes commonly hold large stencil sizes whether approximating the boundary or interior derivatives. As the formal accuracy and/or free-parameter size of the interior schemes increases, direct application migrates further from the domain boundaries (exclusive of any outside ghost points). Fourth-order stencils owning two free parameters, one each on the explicit and implicit side to control the dissipation and dispersion levels separately, require a boundary and near boundary approximation to close the compact system. In all cases, deriving expressions for the stencil coefficients in terms of the free parameters starts with a Taylor series expansion of each constituent. On each side of the compact stencil, both the neighbors as well as the exact quantities are mathematically represented in the series expansions.
The respective terms are appropriately summed and simplified to uniquely define each coefficient in terms of the free parameters. Theoretically, the procedure has no ceiling on the number of free parameters and formal order of the resultant stencil.

Before proceeding with discussions on pertinent topics regarding optimized finite-difference template sizes, accuracies and their spatial resolutions, we must first introduce suitable notation. Single notations (i.e. 6c, 6e) specify the formal order of un-optimized compact and explicit finite difference stencils (sixth-order-accurate Padé- and explicit-type in this case). For the optimized composite templates, we will adopt the convenient register as coined and strengthened by Carpenter et al. [4]. While the base variable identifies the formal accuracy of the stencil, the accompanying exponent carries the number of optimized parameters. For instance, the notation $4^2$ characterizes a fourth-order-accurate stencil holding two optimized parameters. Placement of each optimized (or standard) stencil within the composite template notation gives its unique position on the discretized solution domain. The center stencil always identifies the intended field accuracy of the interior points. The computational space notations include the boundary points regardless of whether they are explicitly part of the overall final solution exercise. As an example, the composite template $3^2$-4-$3^2$ denotes an un-optimized fourth-order-accurate Padé-type scheme imposed at all interior points coupled with an optimized third-order-accurate two-parameter family stencil at both left and right boundaries. Continuing with this notation, the optimized composite $4^3$-$4^3$-$4^3$-$4^3$-$4^3$-$4^3$-$4^3$ of Kim [16] holds similar optimized fourth-order-accurate three-parameter stencils at both left and right boundaries, 1st interior points as well as 2nd interior points with a separate optimized fourth-order-accurate three-parameter operator at all remaining interior points.
Introduction of a second exponent in the composite template (or stencil) identification points to its origin. For example, the notation $(4^3)^{[17]}$ symbolizes a fourth-order-accurate three-parameter family stencil found in the work of the authors of reference [17]. Finally, (de-)coupled templates reflect the optimization strategy at the boundary, 1st interior and 2nd interior points of the computational domain. More specifically, the notation 'decoupled' means that each participating stencil within the composite template was optimized separately whereas 'coupled' denotes composite templates whose free parameters of each participating stencil were properly optimized together in a fully coupled fashion. As a reminder, these notations will be briefly repeated throughout the text.
Fig. 1. Real (dispersive error) distribution of the modified wavenumber ($\hat{k} = \hat{k}_r$) in Fourier space for fourth-order [$O(4c)$], tenth-order [$O(10c)$] and optimized fourth-order [$O(4^3)$] [16,18] and sixth-order [$O(6^1)$] [17] compact finite differences.
2.1. Quantifying the spatial resolution

The spatial resolution of a finite differencing scheme differs fundamentally from its formal accuracy in the sense that two stencils can own equivalent truncation errors, but resolve the physical waves quite differently. Gaining an understanding of the improved spatial resolution characteristics of an optimized compact finite differencing scheme is no different than the approach for standard stencils. Their spatial resolution is easily quantified by differentiating the fundamental Fourier wave $q(x) = e^{ikx}$ and distributing its compact differencing approximation over the range of resolvable wavenumbers (up to the $\pi$ wave). For brevity, we will assume that the exact complex wavenumber $k$ and coordinate $x$ are dimensionless according to the nomenclature established by Lele [18]. The exact differentiation (ex) of $q(x)$ with respect to $x$ gives $q'(x)_{ex} = ikq(x)$ while the compact finite difference (fd) approximate yields $q'(x)_{fd} = i\hat{k}q(x)$. In this latter definition, the quantity $\hat{k}$ is the dimensionless modified complex wavenumber; $\hat{k} = \hat{k}_r + i\hat{k}_i$ where $\hat{k}_r$ and $\hat{k}_i$ are the real and imaginary contributions. Departures of the real and imaginary components of $\hat{k}$ from $k$ are the key ingredients that quantify the dispersive (phase) and dissipative errors of the finite difference stencil.

A simple example that quickly quantifies the resolution errors of a compact finite difference stencil lies in the popular fourth-order Padé-type approximation of the first-order derivative ($q'$). This stencil requires only three points as centered over the $i$th point;

$$q'_{i+1} + 4q'_i + q'_{i-1} = 3(q_{i+1} - q_{i-1})/\Delta \qquad (2.1)$$

where the local grid spacing is denoted by $\Delta$. After substituting the Fourier representation of the continuous $q$ wave at the discrete points in this stencil, we can collect the real and imaginary components of the complex modified wavenumber in terms of the exact resolution over wavenumber space $[0, \pi]$. Inasmuch as the Padé-type stencils are all central approximations, $\hat{k}$ is only real valued ($\hat{k}_i = 0.0$) where the dimensionless (as scaled by $\Delta$) fourth-order definition becomes

$$\hat{k}_r = \frac{\tfrac{3}{2}\sin(k)}{1 + \tfrac{1}{2}\cos(k)} \qquad (2.2)$$

Without an imaginary complement, this stencil introduces only a dispersive error into its derivative approximation of the exact wave (see Fig. 1). Herein, we will adopt the resolving efficiency metric [$e(\delta)$] as devised by Lele [18] that is commonly used to report quantification of the fraction of well-resolved waves by a particular differencing stencil; $e(\delta) = (k)_\delta/\pi$ where $(k)_\delta$ is the exact wavenumber ($k$) in the resolution spectra for the user-defined efficiency. We will further clarify that the variable $\delta$ is the resolution expectancy; $\delta_{r,i} = 1 - |1 - (\hat{k}_{r,i})/(k)_\delta|$. In that latter definition, $(\hat{k}_{r,i})$ are the dimensionless real ($\hat{k}_r$) and imaginary ($\hat{k}_i$) components of the modified wavenumber. Likewise, the quantities $\delta_{r,i}$ are the real and imaginary elements of the error tolerance specified by the user that spells out the required minimum spatial resolution for establishing a well-resolved wave. For example, an error expectancy of $\delta_r = 0.1$ (notation $\delta_{90}$) shows that the fourth-order Padé-type stencil efficiently resolves 59% [$e(\delta_{90}) \approx 0.59$] of the resolvable waves 90% ($\delta_r \ge 0.90$) or better. Placing a more stringent expectancy of $\delta_r = 0.01$ (notation $\delta_{99}$) on the stencil reduces the resolving efficiency to just 35% [$e(\delta_{99}) \approx 0.35$]. This latter efficiency most likely calls for a higher resolution stencil or intelligent gridding strategies to suitably resolve the fine-scale physics.
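The resolving efficiency quoted for the fourth-order Padé-type stencil can be reproduced with a few lines of code. The Python sketch below is a hedged illustration: it evaluates the modified wavenumber $\hat{k}_r = \tfrac{3}{2}\sin k / (1 + \tfrac{1}{2}\cos k)$ on a fine grid over $(0, \pi]$ and reports the first wavenumber at which the relative error $|1 - \hat{k}_r/k|$ exceeds a tolerance. The function names, the grid size and the "first crossing" reading of the metric are assumptions made for this example.

```python
import numpy as np


def modified_wavenumber_pade4(k):
    """Real modified wavenumber of the fourth-order Pade-type stencil."""
    return 1.5 * np.sin(k) / (1.0 + 0.5 * np.cos(k))


def resolving_efficiency(khat, tol, n=200001):
    """Fraction of [0, pi] resolved with relative error |1 - khat/k| <= tol.

    Scans upward from small k and stops at the first wavenumber where the
    tolerance is exceeded (an assumed reading of the efficiency metric).
    """
    k = np.linspace(1e-6, np.pi, n)
    err = np.abs(1.0 - khat(k) / k)
    bad = np.nonzero(err > tol)[0]
    k_limit = k[bad[0]] if bad.size else np.pi
    return k_limit / np.pi


if __name__ == "__main__":
    print("e(delta_90):", resolving_efficiency(modified_wavenumber_pade4, 0.10))
    print("e(delta_99):", resolving_efficiency(modified_wavenumber_pade4, 0.01))
```

With these assumptions the two printed values land close to the 59% and 35% figures cited in the text.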
2.2. Interior and periodic boundary stencils

As previously mentioned, optimized stencils commonly accommodate large stencil sizes to reach spectral-like resolution. Although the corresponding stencil coefficients and/or free parameters realistically have no limits, their evaluation and implementation can become quite cumbersome for extremely high orders. Moreover, finding suitable stencils for the boundary and near boundary points to match the interior scheme is not an easy task especially when attempting to maintain the improved spatial resolution throughout the entire computational domain. A large generalized compact stencil up to tenth-order that is suitable for the interior domain (or periodic boundaries) would consume seven points and require a penta-diagonal solution routine. Its molecule appears as
$$a q'_{i+2} + c q'_{i+1} + b q'_i + d q'_{i-1} + e q'_{i-2} = \eta q_{i+3} + \delta q_{i+2} + \lambda q_{i+1} + \alpha q_i + \gamma q_{i-1} + \beta q_{i-2} + \mu q_{i-3} \qquad (2.3)$$
Evaluating all the coefficients in the Taylor series expansions produces a tenth-order stencil with $b = 600$, $a = e = 30$, $c = d = 300$, $\eta = \mu = 1$, $\delta = \beta = 101$, $\lambda = \gamma = 425$ and $\alpha = 0$ where 81% [$e(\delta_{90}) \approx 0.81$] of the resolvable scales are considered well-resolved when the resolution expectancy is 90% or better (Fig. 1). Releasing the off-diagonal implicit coefficients ($a$, $c$, $d$, $e$) and the furthest off-diagonal explicit coefficients ($\eta$, $\mu$) as free parameters for optimization reduces the stencil to a fourth-order three-parameter family with the additional expressions

$$3\lambda = 2 + c - 8a + 15\eta \quad\text{and}\quad 12\delta = -1 + 4c + 22a - 48\eta \qquad (2.4a)$$

for the remaining coefficients ($a = e$, $d = c$, $b = 1$, $\lambda = \gamma$, $\delta = \beta$, $\eta = \mu$, $\alpha = 0$). Enforcing three discrete real constraints ($\hat{k}_{r1,2,3} = k_{r1,2,3}$) at $k_{r1} = 2.2$, $k_{r2} = 2.3$ and $k_{r3} = 2.4$ in the exact-modified wavenumber spectrum locks in the free parameters having the values (Lele [18])

$$a = 0.0896406, \quad c = 0.5771439 \quad\text{and}\quad \eta = 0.006250408 \qquad (2.4b)$$
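The relations in (2.4a) can be checked against the standard tenth-order penta-diagonal scheme, whose coefficients in the $b = 1$ normalization are $a = 1/20$, $c = 1/2$ and $\eta = 1/600$. The Python sketch below is a minimal illustration under stated assumptions: unit grid spacing, and an explicit side that is antisymmetric about point $i$ (the $i - m$ terms enter with the opposite sign of the $i + m$ terms), so that the modified wavenumber is real. The function names and the grid-scan reading of the resolving efficiency are hypothetical choices for this example.

```python
import numpy as np


def explicit_coefficients(a, c, eta):
    """Explicit-side coefficients (lambda, delta) from relations (2.4a)."""
    lam = (2.0 + c - 8.0 * a + 15.0 * eta) / 3.0
    delta = (-1.0 + 4.0 * c + 22.0 * a - 48.0 * eta) / 12.0
    return lam, delta


def modified_wavenumber(k, a, c, eta):
    """Real modified wavenumber of the central penta-diagonal stencil (b = 1).

    Assumes an antisymmetric explicit side and unit grid spacing.
    """
    lam, delta = explicit_coefficients(a, c, eta)
    num = 2.0 * (lam * np.sin(k) + delta * np.sin(2.0 * k) + eta * np.sin(3.0 * k))
    den = 1.0 + 2.0 * c * np.cos(k) + 2.0 * a * np.cos(2.0 * k)
    return num / den


def resolving_efficiency(a, c, eta, tol=0.10, n=200001):
    """First-crossing estimate of the fraction of [0, pi] resolved within tol."""
    k = np.linspace(1e-6, np.pi, n)
    err = np.abs(1.0 - modified_wavenumber(k, a, c, eta) / k)
    bad = np.nonzero(err > tol)[0]
    return (k[bad[0]] if bad.size else np.pi) / np.pi


if __name__ == "__main__":
    tenth = (1.0 / 20.0, 1.0 / 2.0, 1.0 / 600.0)
    lele = (0.0896406, 0.5771439, 0.006250408)   # values of (2.4b)
    print("tenth-order lambda, delta:", explicit_coefficients(*tenth))
    print("tenth-order efficiency:", resolving_efficiency(*tenth))
    print("optimized (2.4b) efficiency:", resolving_efficiency(*lele))
```

For the tenth-order values the relations return $\lambda = 425/600$ and $\delta = 101/600$, matching the scaled coefficients 425 and 101 quoted for that stencil.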
This optimized stencil improves the spatial resolution to 90% [$e(\delta_{90}) \approx 0.90$] for the same resolution expectancy of 90% (Fig. 1). But as we will see in Section 3, linking this optimized result with three additional stencils to complete the compact system will not preserve the spectral-like resolution at the interior points near the boundaries that would otherwise be expected by the interior stencil itself.

Recognizing the strong demand for low-resolution errors when resolving shock wave physics, Kim [16] revisited the three-parameter family of Lele [18] where the evaluation procedure centered on minimizing the integrated resolution error in Fourier space. The free parameters of his optimized stencil are (truncated to single precision)

$$a = 0.095495336, \quad c = 0.58627040 \quad\text{and}\quad \eta = 0.0071409535 \qquad (2.5)$$
where $b = 1$, $e = a$ and $d = c$ with the remaining explicit side coefficients evaluated by the appropriate expressions. Kim improved the resolution efficiency of this fourth-order three-parameter family to nearly 92% [$e(\delta_{90}) \approx 0.915$] for an expectancy of 90% (Fig. 1), which clearly reduces the dispersive error to an attractive minimum with no added dissipation. Moreover, the stencil suitably satisfies even much higher demands for spectral-like properties where the efficiency mildly lowers to 86% [$e(\delta_{99}) \approx 0.86$] for a high expectancy of 99%. Comparatively, the fourth-order Padé-type stencil is only 36% [$e(\delta_{99}) \approx 0.356$] efficient given this same expectancy.

Exclusive of external ghost points, a seven-point operator requires three additional stencils at and near the boundaries to close the compact system. By relaxing the resolution expectancy, one can reduce the size of these stencils and possibly manage its compact form more easily toward a consistent spatial resolution of the resultant scheme as well as assure numerical stability and acceptable accuracy. After exercising an earlier version of the optimization procedure by Kim [16], Kim and Lee [17] optimized a seven-point one-parameter sixth-order stencil that required only a tri-diagonal solver. They focused the single parameter on minimizing the dispersion error (since the dissipation error was everywhere theoretically zero). Distribution of the dispersive error inherent in this sixth-order family is plotted in Fig. 1. The advantages of optimizing become clearly evident because the resultant five-point scheme gives the same resolution efficiency of 81% [$e(\delta_{90}) \approx 0.81$] when compared to the standard seven-point tenth-order stencil. But as their closure stencils approached the boundary, the formal accuracy of each reduced by two orders with two free parameters.
Because the boundary and adjacent boundary stencils were one-sided, two parameters (one each on the explicit and implicit sides of the stencil) were necessary to suppress the dissipative error as well as the excess dispersion. Later, we will assess the coupling effects of these closure stencils on the adjacent field resolution.

2.3. Non-periodic boundary and near boundary stencils

Without question, matching the resolution characteristics of the optimized field scheme at (and near) the boundaries is a challenging task. Like the interior stencil, their spatial order and number of free parameters are limitless (given enough points). In most cases, the initial aim of the optimization strategy deals with eliminating their unavoidable and sizeable dissipative error at the higher resolved scales. Consequently, several optimized boundary (and near boundary) stencils have evolved that are numerically compatible with larger-sized central (and upwind) schemes. But we know from many previous analyses [1,2,4,9,10] that the boundary stencils must be at least of order p − 1 to preserve pth-order field accuracy. Carpenter et
S.A. Jordan / Applied Numerical Mathematics 61 (2011) 108–130
al. [4] clearly demonstrated this requirement using various ordered boundary stencils coupled with the fourth-order Padé-type stencil as the field accuracy. Realizing this fact, we will study below both tri-diagonal and penta-diagonal compact composites that target consistent resolution errors throughout the entire computational domain as well as investigate their associated numerical stability. Several optimized stencils that close the seven-point (penta-diagonal) three-parameter scheme have been previously developed and tested for numerical accuracy and stability [16,2]. Unfortunately, the improved resolution properties of those stencils analyzed individually bear little resemblance to their dispersive and dissipative errors when coupled with the high-resolution interior scheme. Moreover, their errors propagate much further into the domain interior where only the seven-point scheme exists. Herein, a fresh look is taken at closing seven- and five-point field stencils. The plan requires optimizing both the boundary and adjacent interior stencils simultaneously. The procedure explores the optimization strategy by Jordan [15], who successfully reached spectral-like boundary stencils using two parameters for closing three-point field schemes. Finally, we will outline a process for quantifying the expected reduction in predictive error due specifically to the improved resolution characteristics of the composite scheme. Starting with the boundary derivative ($q'_1$), a six-point generalized compact one-sided approximation appears as
\[
c\,q'_2 + b\,q'_1 = \eta q_6 + \lambda q_5 + \delta q_4 + \alpha q_3 + \gamma q_2 + \beta q_1 \tag{2.6a}
\]
Evaluating each coefficient uniquely by solving the linear system of summed coefficients from a Taylor series expansion of each term produces a sixth-order-accurate stencil. Although this stencil is even-ordered, an undesirable dissipative error is introduced over the upper 1/3rd of resolvable scales. We can address this shortcoming by releasing coefficients on the explicit and implicit sides of the stencil for optimization where each free parameter reduces the truncation error by one order. Given one parameter on each side (c , η), the remaining coefficients of the fourth-order boundary stencil are evaluated by the expressions (b = 1)
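\[
\lambda = \frac{-3 + c - 60\eta}{12}, \quad
\delta = \frac{8 - 3c + 60\eta}{6}, \quad
\alpha = \frac{-6 + 3c - 20\eta}{2}, \quad
\gamma = \frac{24 - 5c + 30\eta}{6}, \quad
\beta = -\frac{25 + 3c + 12\eta}{12} \tag{2.6b}
\]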
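The expressions in (2.6b) can be checked numerically. The following is a minimal sketch, assuming unit grid spacing and b = 1; the function names `boundary_coeffs` and `taylor_residuals` are illustrative and not part of the original work. It evaluates the explicit coefficients for a given pair (c, η) and returns the residuals of the Taylor matching conditions, which should vanish through fourth order.

```python
import numpy as np

def boundary_coeffs(c, eta):
    """Explicit coefficients of the boundary stencil (2.6a) from (2.6b), with b = 1."""
    lam = (-3.0 + c - 60.0 * eta) / 12.0
    delta = (8.0 - 3.0 * c + 60.0 * eta) / 6.0
    alpha = (-6.0 + 3.0 * c - 20.0 * eta) / 2.0
    gamma = (24.0 - 5.0 * c + 30.0 * eta) / 6.0
    beta = -(25.0 + 3.0 * c + 12.0 * eta) / 12.0
    # Ordered by offset from the boundary node: q1, q2, ..., q6
    return np.array([beta, gamma, alpha, delta, lam, eta])

def taylor_residuals(c, eta, max_order=4):
    """Residuals of the Taylor matching conditions for orders 0..max_order."""
    e = boundary_coeffs(c, eta)
    m = np.arange(6.0)              # offsets 0..5 of q1..q6
    res = [e.sum()]                 # order 0: explicit weights must sum to zero
    for p in range(1, max_order + 1):
        # Implicit side: b at offset 0 contributes only when p = 1, c sits at offset 1
        implicit = (1.0 if p == 1 else 0.0) + c
        res.append(np.sum(e * m**p) - p * implicit)
    return np.array(res)

# Example with the optimized pair quoted later in (3.14)
print(taylor_residuals(-1.0001941, -0.0046976898))
```

Setting c = η = 0 recovers the familiar explicit one-sided weights (−25, 48, −36, 16, −3)/12, which is a convenient sanity check on the reconstruction.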
If our intent is only toward minimizing the dissipative error, we can reformulate the boundary stencil (2.6a) as a one-parameter (η) fifth-order stencil. For even lower-order interior schemes, we can create a two-parameter third-order family (c, λ, with η = 0) with
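\[
\delta = \frac{2 - c - 24\lambda}{6}, \quad
\alpha = \frac{-3 + 2c + 12\lambda}{2}, \quad
\gamma = \frac{6 - c - 8\lambda}{2}, \quad
\beta = \frac{-11 - 2c + 6\lambda}{6} \tag{2.7}
\]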
as the expressions to complete the coefficient evaluations. Later, we will use these expressions to optimize several boundary stencils with the aim of low resolution errors even after mixing with the field scheme. Moving to the first interior point, a generalized five-point stencil owning truncation errors up to sixth-order that is needed to close both seven- and five-point field stencils has the form
\[
c\,q'_3 + b\,q'_2 + d\,q'_1 = \lambda q_5 + \delta q_4 + \alpha q_3 + \gamma q_2 + \beta q_1 \tag{2.8a}
\]
where the off-center influence exists only on the explicit side. Herein, we will work with a fourth-order two-parameter (c , λ) version of this stencil with expressions (b = 1)
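\[
d = \frac{1 - c - 12\lambda}{3}, \quad
\gamma = \frac{1 - 4c - 32\lambda}{2}, \quad
\delta = \frac{-1 + 4c - 96\lambda}{18}, \quad
\alpha = \frac{1 + 2c + 24\lambda}{2}, \quad
\beta = \frac{-17 + 14c + 150\lambda}{18} \tag{2.8b}
\]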
for evaluating the remaining coefficients. A final stencil that is required to complete closure of penta-diagonal (seven-point) compact schemes is formed at the second interior point from the boundary. Herein, this fourth-order stencil appears as
\[
a\,(q'_5 + q'_1) + b\,q'_3 + c\,(q'_4 + q'_2) = \eta\,(q_5 - q_1) + \delta\,(q_4 - q_2) \tag{2.9a}
\]
where (a, η) are the two free parameters that are optimized to minimize the spatial resolution errors. The remaining coefficients in terms of these free parameters become (b = 1);
\[
c = \frac{1 - 22a + 12\eta}{4}, \qquad
\delta = \frac{3 - 18a + 4\eta}{4} \tag{2.9b}
\]
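The two interior closure families can be checked in the same spirit. The sketch below is illustrative only: it assumes unit grid spacing and b = 1, stores each stencil as dictionaries mapping a grid offset (relative to the node where the derivative is centered) to its weight, and uses hypothetical function names. The residuals of the Taylor matching conditions should vanish through fourth order for any choice of the free parameters.

```python
def first_interior(c, lam):
    """Stencil (2.8a) with coefficients from (2.8b); offsets are relative to node 2."""
    d = (1.0 - c - 12.0 * lam) / 3.0
    implicit = {-1: d, 0: 1.0, 1: c}
    explicit = {
        -1: (-17.0 + 14.0 * c + 150.0 * lam) / 18.0,  # beta
        0: (1.0 - 4.0 * c - 32.0 * lam) / 2.0,        # gamma
        1: (1.0 + 2.0 * c + 24.0 * lam) / 2.0,        # alpha
        2: (-1.0 + 4.0 * c - 96.0 * lam) / 18.0,      # delta
        3: lam,                                       # lambda (free)
    }
    return implicit, explicit

def second_interior(a, eta):
    """Stencil (2.9a) with coefficients from (2.9b); offsets are relative to node 3."""
    c = (1.0 - 22.0 * a + 12.0 * eta) / 4.0
    delta = (3.0 - 18.0 * a + 4.0 * eta) / 4.0
    implicit = {-2: a, -1: c, 0: 1.0, 1: c, 2: a}
    explicit = {-2: -eta, -1: -delta, 1: delta, 2: eta}
    return implicit, explicit

def order_residuals(implicit, explicit, max_order=4):
    """Residuals of the Taylor matching conditions for orders 0..max_order."""
    res = [sum(explicit.values())]
    for p in range(1, max_order + 1):
        lhs = sum(w * float(n) ** (p - 1) for n, w in implicit.items())
        rhs = sum(w * float(m) ** p for m, w in explicit.items())
        res.append(rhs - p * lhs)
    return res

# Example with the optimized parameters quoted later in (3.14) and (3.13)
print(order_residuals(*first_interior(0.40493011, 0.1574873)))
print(order_residuals(*second_interior(0.3173376, 0.9519348)))
```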
Optimizing these three stencils (boundary, 1st interior, and 2nd interior points) by performing separate Fourier analyses to quantify their resolution properties is fruitless because their reduced errors will most likely deteriorate after joining them
Fig. 2. Real (dispersive error) and imaginary (dissipative error) distributions of the modified wavenumbers for composite template 2^2-4^2-4^2-6^1-4^2-4^2-2^2 (Kim and Lee [17]); notation -4^2- refers to a compact stencil that is fourth-order-accurate with two optimized parameters.
together along with the field scheme. In the next section, we will demonstrate this fact and offer an alternate optimization procedure that centers on the resolution errors of the composite operator itself rather than treating the participating elements individually.

3. Optimizing composite templates

The final product housing an interior compact finite differencing scheme (standard or optimized) with another compact or explicit stencil for closure is hereinafter referred to as a composite template. As noted earlier, the resolution errors of the composite template at the boundary and adjacent field points are substantially dissimilar from those of the respective participating constituents. Moreover, these errors diffuse well into the domain, which contaminates the desirable resolution benefits of the optimized interior scheme. Their propagation length depends on the composite template size, optimization and final formal order. In this section, we propose an optimization strategy and subsequent Fourier analysis of resolution errors for closing seven-point field schemes. The Fourier analysis quantifies the actual dissipative and dispersive characteristics of composite templates. This development departs from the previous practices where each attached stencil was analyzed separately, leading to a misunderstanding of their salient resolution properties. We will restrict development of these composite templates to those holding only boundary, first interior, and second internal point stencils.

3.1. Spatial resolution properties

The first step toward quantifying the resolution errors of a composite template is representing the linear structure of all participating stencils. This composition is simply
\[
[M]\{q'\}_{fd} = \{R(q)\}_{fd}/\Delta \tag{3.1}
\]
where matrix [M] constitutes the implicit coefficients of the derivative $\{q'\}_{fd}$ and vector $\{R(q)\}_{fd}$ holds all the explicit terms with their respective coefficients. An equivalent representation of the Fourier system becomes
\[
i\,[P(k)]\{\hat{k}\} = \{S(k)\} \tag{3.2}
\]
where [P(k)] is a multi-diagonal N × N matrix and {S(k)} is a vector of length N. The integer N denotes the total number of nodal points, where each location defines a unique modified wavenumber spectrum over an equivalent range of resolvable scales (0 ≤ k ≤ π). Fig. 2 shows an example comparison between the real and imaginary components of the modified wavenumbers for the decoupled and coupled (3.2) versions of the optimized tri-diagonal composite template developed by Kim and Lee [17]. Clearly, the disparity in resolution errors between these versions is not subtle. While the resolution efficiency of each individual stencil appears appealing in the decoupled version, their actual dispersive and dissipative errors as given by solution of (3.2) are quite high over the upper one-third of resolvable wavenumbers. The boundary errors in particular are most misleading. The individual stencil indicates a resolution efficiency [e(δ_90) ≈ 0.615] that is over 20% better than that of the composite template at the boundary [e(δ_90) ≈ 0.51]. Moreover, a rather large destabilizing influence of the composite
template is present at and near the boundary. One final note is the observation of a measurable dissipative error beyond the three compact stencils that were optimized for closing the field scheme. The seven-point penta-diagonal compact field stencil coupled with the compact fourth-order two-parameter boundary (2.6), first interior (2.8) and second internal (2.9) point stencils has the generalized Fourier form (3.2) in wavenumber space as
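\[
[P(k)] =
\begin{bmatrix}
1 & c_1 e^{ik} & & & & & \\
d_2 e^{-ik} & 1 & c_2 e^{ik} & & & & \\
a_3 e^{-2ik} & c_3 e^{-ik} & 1 & c_3 e^{ik} & a_3 e^{2ik} & & \\
 & \ddots & \ddots & \ddots & \ddots & \ddots & \\
 & e_i e^{-2ik} & d_i e^{-ik} & 1 & c_i e^{ik} & a_i e^{2ik} & \\
 & \ddots & \ddots & \ddots & \ddots & \ddots & \\
 & & a_{N-2} e^{-2ik} & c_{N-2} e^{-ik} & 1 & c_{N-2} e^{ik} & a_{N-2} e^{2ik} \\
 & & & & c_{N-1} e^{-ik} & 1 & a_{N-1} e^{ik} \\
 & & & & & c_N e^{-ik} & 1
\end{bmatrix}
\tag{3.3a}
\]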
where the explicit scalar definitions at the forward boundary ($s_1^+$), first ($s_2$), second ($s_3$) and explicit vector field points ($s_i$) are
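\[
s_1^+ = \eta_1 e^{5ik} + \lambda_1 e^{4ik} + \delta_1 e^{3ik} + \alpha_1 e^{2ik} + \gamma_1 e^{ik} + \beta_1 \tag{3.3b}
\]
\[
s_2 = \lambda_2 e^{3ik} + \delta_2 e^{2ik} + \alpha_2 e^{ik} + \gamma_2 + \beta_2 e^{-ik} \tag{3.3c}
\]
\[
s_3 = \eta_3 \left(e^{2ik} - e^{-2ik}\right) + \delta_3 \left(e^{ik} - e^{-ik}\right) \tag{3.3d}
\]
\[
s_i = \eta_i \left(e^{3ik} - e^{-3ik}\right) + \delta_i \left(e^{2ik} - e^{-2ik}\right) + \lambda_i \left(e^{ik} - e^{-ik}\right) \tag{3.3e}
\]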
with μ_i = η_i, β_i = δ_i, γ_i = λ_i, α_i = 0, b = 1 and 1 ≤ i ≤ N. Implicit solutions of the system (3.3a) with explicit definitions (3.3b)–(3.3e) over discrete dimensionless wavenumbers within the range 0 ≤ k ≤ π provide a modified wavenumber spectrum where one can quantify the true resolution errors of the penta-diagonal composite template. We will explore these solutions in the next section.

3.2. Optimizing multi-parameter templates

Past optimization strategies concentrated on either guaranteeing numerical stability or minimizing the resolution errors between the individual approximating stencils and their spectral analog. Unfortunately, neither option identifies the true spatial resolution properties of the composite template. Below, we suggest an alternate approach that specifically follows the template spatial resolution. Each participant is optimized to match the resultant resolution errors of its adjacent stencil. Jordan [15] found success in this optimization strategy for application only to boundary stencils. We shall modify and extend his tactic to include the first and second interior point stencils for closing seven-point field schemes.

3.2.1. Boundary stencil

Reducing the resolution errors of the composite template at the boundaries begins by fine-tuning the associated free parameters of the respective multi-parameter stencils to best meet the spatial resolution of the field scheme. This approach imposes the general constraint $\hat{k}^{j}_{r,i} = \hat{k}^{j+1}_{r,i}$ on the modified wavenumber of the closure stencils over all resolvable scales (0 ≤ k ≤ π) in the Fourier space. The superscripts j and j + 1 refer to the modified wavenumbers of the present (j) and adjacent (j + 1) stencils toward the field scheme. Neighboring compact central difference stencils uniquely meet this constraint on their imaginary components. This advantage relaxes optimization of the free parameters that manipulate the dissipative error. 
But the boundary and adjacent stencils are usually asymmetric and require free parameters to manage both the dissipative and dispersive errors. The fourth-order two-parameter (c , η) boundary stencil has a simple complex form for the modified wavenumber as follows
\[
i\,\hat{k}\left(b + c\,e^{ik}\right) = \eta e^{5ik} + \lambda e^{4ik} + \delta e^{3ik} + \alpha e^{2ik} + \gamma e^{ik} + \beta \tag{3.4a}
\]
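For reference, the decoupled modified wavenumber of the boundary stencil follows from (3.4a) by dividing through by the implicit factor. The sketch below is illustrative only, assuming b = 1, unit grid spacing and the coefficient expressions of (2.6b); the function name is hypothetical. The real part of the result carries the dispersive behavior and the imaginary part the dissipative behavior of the stencil taken in isolation.

```python
import numpy as np

def boundary_modified_wavenumber(k, c, eta):
    """Decoupled modified wavenumber of the boundary stencil, solved from (3.4a), b = 1."""
    lam = (-3.0 + c - 60.0 * eta) / 12.0
    delta = (8.0 - 3.0 * c + 60.0 * eta) / 6.0
    alpha = (-6.0 + 3.0 * c - 20.0 * eta) / 2.0
    gamma = (24.0 - 5.0 * c + 30.0 * eta) / 6.0
    beta = -(25.0 + 3.0 * c + 12.0 * eta) / 12.0
    weights = [beta, gamma, alpha, delta, lam, eta]   # offsets 0..5
    k = np.asarray(k, dtype=float)
    explicit = sum(w * np.exp(1j * m * k) for m, w in enumerate(weights))
    return explicit / (1j * (1.0 + c * np.exp(1j * k)))

# Sample the spectrum away from k = 0, where the implicit factor 1 + c is tiny
# for the optimized value c_1 quoted in (3.14)
k = np.linspace(0.05, np.pi, 64)
khat = boundary_modified_wavenumber(k, -1.0001941, -0.0046976898)
dispersive, dissipative = khat.real, khat.imag
```

This is the decoupled view only; the coupled spectrum requires the solution of the full system (3.2).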
After imposing the wavenumber constraint in spectral space and substituting the respective definitions for the stencil coefficients, two expressions evolve that distribute the real and imaginary components of the adjacent modified wavenumber in terms of the boundary free parameters. These expressions (not simplified) arise after expanding and collecting the respective real and imaginary components:
\[
\begin{aligned}
&(-12\hat{k}_r \sin k - 12\hat{k}_i \cos k - \cos 4k + 6\cos 3k - 18\cos 2k + 10\cos k + 3)\,c \\
&\quad = 12\hat{k}_i - 3\cos 4k + 16\cos 3k - 36\cos 2k + 48\cos k - 25 \\
&\qquad + (12\cos 5k - 60\cos 4k + 120\cos 3k - 120\cos 2k + 60\cos k - 12)\,\eta
\end{aligned}
\tag{3.4b}
\]
\[
\begin{aligned}
&(12\hat{k}_r \cos k - 12\hat{k}_i \sin k - \sin 4k + 6\sin 3k - 18\sin 2k + 10\sin k)\,c \\
&\quad = -12\hat{k}_r - 3\sin 4k + 16\sin 3k - 36\sin 2k + 48\sin k \\
&\qquad + (12\sin 5k - 60\sin 4k + 120\sin 3k - 120\sin 2k + 60\sin k)\,\eta
\end{aligned}
\tag{3.4c}
\]
where ($\hat{k}_{r,i}$) represents the real ($\hat{k}_r$) and imaginary ($\hat{k}_i$) elements of the modified wavenumber distribution at the adjacent interior point. The boundary stencil can now be optimized according to the resolution errors intrinsic in the field scheme.

3.2.2. Adjacent boundary stencils

When the field stencil consumes three points, only a single boundary definition is necessary to close the compact system. But higher-order composite templates require additional stencils adjacent to the boundary. The fourth-order two-parameter family in (2.8a), (2.8b) for the first interior point has the complex form for the modified wavenumber as
\[
i\,\hat{k}\left(c\,e^{ik} + b + d\,e^{-ik}\right) = \lambda e^{3ik} + \delta e^{2ik} + \alpha e^{ik} + \gamma + \beta e^{-ik} \tag{3.5a}
\]
Constraining the respective wavenumber leads to the expressions (one each after collecting the real and imaginary pieces)
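\[
\begin{aligned}
&(-12\hat{k}_r \sin k - 6\hat{k}_i \cos k - 2\cos 2k - 16\cos k + 18)\,c \\
&\quad = 9\hat{k}_i + 3\hat{k}_i \cos k - 3\hat{k}_r \sin k + \tfrac{1}{2}(9 - \cos 2k) - 4\cos k \\
&\qquad + (-36\hat{k}_i \cos k + 36\hat{k}_r \sin k - 144 + 9\cos 3k - 48\cos 2k + 183\cos k)\,\lambda
\end{aligned}
\tag{3.5b}
\]
\[
\begin{aligned}
&(6\hat{k}_r \cos k - 12\hat{k}_i \sin k - 2\sin 2k - 2\sin k)\,c \\
&\quad = -9\hat{k}_r - 3\hat{k}_r \cos k - 3\hat{k}_i \sin k - \tfrac{1}{2}\sin 2k + 13\sin k \\
&\qquad + (36\hat{k}_r \cos k + 36\hat{k}_i \sin k + 9\sin 3k - 48\sin 2k + 33\sin k)\,\lambda
\end{aligned}
\tag{3.5c}
\]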
for uniquely evaluating the two free parameters (c, λ). Again, the notation $\hat{k}_{r,i}$ refers to the real and imaginary modified wavenumbers of the adjacent interior scheme. Like the boundary compact definitions, the free parameters of this fourth-order stencil are optimized toward minimizing the resolution errors consistently throughout the wavenumber space. Deriving a lower-order finite difference expression is senseless because the optimized parameters will not prevent degradation of the domain’s global accuracy toward this low-order stencil. The third stencil (2.9a), (2.9b) that is necessary to close a penta-diagonal system has the complex form in wavenumber space as
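\[
i\,\hat{k}\left[a\left(e^{2ik} + e^{-2ik}\right) + b + c\left(e^{ik} + e^{-ik}\right)\right] = \eta\left(e^{2ik} - e^{-2ik}\right) + \delta\left(e^{ik} - e^{-ik}\right) \tag{3.6a}
\]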
where the two free parameters (a, η) are evaluated by the simple expressions
\[
(-4\cos 2k + 22\cos k)\,a = 2 + \cos k + (12\cos k)\,\eta \tag{3.6b}
\]
\[
(4\hat{k}_r \cos 2k - 22\hat{k}_r \cos k + 18\sin k)\,a = -2\hat{k}_r - \hat{k}_r \cos k + 3\sin k + (-12\hat{k}_r \cos k + 4\sin 2k + 4\sin k)\,\eta \tag{3.6c}
\]
Because this third stencil is symmetric, the imaginary component $\hat{k}_i$ of the modified wavenumber is exactly zero. Picking two discrete values for the dimensionless wavenumber k in the spectral domain provides an easy approach for evaluating the respective free parameters in the equation system (3.6) as well as those in systems (3.4) and (3.5). However, experience shows that this choice does not yield the best resolution characteristics of the resultant composite template. In the next section, an alternate procedure is presented that evaluates the free parameters over all exact wavenumbers in the Fourier space.

3.3. Evaluating the free parameters

Once again, the impetus of our optimization strategy is reproduction of the field resolution characteristics in the closing stencils. The challenge lies in the evaluation process of the free parameters. Optimizing the closing stencils imposes a constraint on the modified wavenumber spectra, but as noted earlier, extracting discrete samples to explicitly evaluate each free parameter is a poor tactic for representing the full spectrum. Minimizing the dissipative error is trivial for adjoining compact central stencils because the imaginary components are everywhere zero. But a close match between the respective dispersive error profiles would demand optimizing so many parameters that the stencil size becomes excessive.
Thus, the present optimization process is vastly over-determined. One corrective measure to suitably handle this impasse is formulation of a least squares minimization technique. Given the two scalars or free parameters (c, η) of an optimized compact stencil, we can form the vector solutions
\[
\mathbf{q}\,c = \mathbf{b} + \mathbf{r}\,\eta \qquad \text{and} \qquad \mathbf{s}\,c = \mathbf{d} + \mathbf{t}\,\eta \tag{3.7}
\]
Each vector quantity (q, s, b, d, r and t) corresponds to the respective coefficients given by the real and imaginary modified wavenumber expressions. As an example, the vector quantities for optimizing the fourth-order two-parameter boundary stencil (3.4) are
\[
\begin{aligned}
\mathbf{q} &= -12\hat{k}_r \sin k - 12\hat{k}_i \cos k - \cos 4k + 6\cos 3k - 18\cos 2k + 10\cos k + 3 \\
\mathbf{s} &= 12\hat{k}_r \cos k - 12\hat{k}_i \sin k - \sin 4k + 6\sin 3k - 18\sin 2k + 10\sin k \\
\mathbf{b} &= 12\hat{k}_i - 3\cos 4k + 16\cos 3k - 36\cos 2k + 48\cos k - 25 \\
\mathbf{d} &= -12\hat{k}_r - 3\sin 4k + 16\sin 3k - 36\sin 2k + 48\sin k \\
\mathbf{r} &= 12\cos 5k - 60\cos 4k + 120\cos 3k - 120\cos 2k + 60\cos k - 12 \\
\mathbf{t} &= 12\sin 5k - 60\sin 4k + 120\sin 3k - 120\sin 2k + 60\sin k
\end{aligned}
\tag{3.8}
\]
Recasting (3.7) into unique formulations for each scalar produces the vector expressions
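\[
\mathbf{S}\,c = \mathbf{R} \qquad \text{and} \qquad \mathbf{T}\,\eta = -\mathbf{Q} \tag{3.9a}
\]
where
\[
\mathbf{S} = \mathbf{q}\cdot\mathbf{t} - \mathbf{r}\cdot\mathbf{s}, \quad
\mathbf{R} = \mathbf{b}\cdot\mathbf{t} - \mathbf{d}\cdot\mathbf{r}, \quad
\mathbf{T} = \mathbf{r}\cdot\mathbf{s} - \mathbf{q}\cdot\mathbf{t} \quad \text{and} \quad
\mathbf{Q} = \mathbf{s}\cdot\mathbf{b} - \mathbf{q}\cdot\mathbf{d} \tag{3.9b}
\]
with the products taken entry by entry over the sampled wavenumbers. We can now evaluate representative values for these scalars by minimizing (3.9a) using a least squares approach. After formulating the squared error vectors G and H as
\[
\mathbf{G} = (\mathbf{R} - \mathbf{S}\,c)^2 \qquad \text{and} \qquad \mathbf{H} = (\mathbf{Q} + \mathbf{T}\,\eta)^2 \tag{3.10a}
\]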
and applying the constraints
\[
\partial \mathbf{G}/\partial c = 0 \qquad \text{and} \qquad \partial \mathbf{H}/\partial \eta = 0 \tag{3.10b}
\]
each scalar (parameter) is evaluated according to
\[
c = (\mathbf{R}\cdot\mathbf{S})/\mathbf{S}^2 \qquad \text{and} \qquad \eta = -(\mathbf{T}\cdot\mathbf{Q})/\mathbf{T}^2 \tag{3.11}
\]
These expressions are suitable for optimizing only two-parameter family stencils as in equation systems (2.6), (2.8), and (2.9). Each scalar (free parameter) is unique because the respective denominator and numerator vectors can only vanish simultaneously. Solving the vector solutions (3.7) after substituting the fourth-order definitions (3.6b) and (3.6c) provides the minimization path necessary to sufficiently evaluate the two free parameters (a, η) at the second internal point that is adjacent to the field stencil (2.3). The respective vector quantities have the form
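\[
\begin{aligned}
\mathbf{q} &= -4\cos 2k + 22\cos k, &
\mathbf{s} &= 4\hat{k}_r^f \cos 2k - 22\hat{k}_r^f \cos k + 18\sin k \\
\mathbf{b} &= 2 + \cos k, &
\mathbf{d} &= -2\hat{k}_r^f - \hat{k}_r^f \cos k + 3\sin k \\
\mathbf{r} &= 12\cos k, &
\mathbf{t} &= -12\hat{k}_r^f \cos k + 4\sin 2k + 4\sin k
\end{aligned}
\tag{3.12}
\]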
where $\hat{k}_r^f$ depicts the real component distribution of the field modified wavenumber in the spectral space and k denotes the corresponding dimensionless wavenumber ranging as 0 ≤ k ≤ π. Choosing 100 discrete exact wavenumbers [k = (k)_i] converged the scalar optimization (3.11) for the free parameters (a, η) in (2.9) to error tolerances less than one percent; 1 ≤ i ≤ 100, where the subscript (i) denotes the discrete wavenumber series chosen to replicate the error spectrum.
After setting the optimized three-parameter stencil developed by Kim [16] (2.5) as the target field resolution ($\hat{k}_r^f$), the values for (a, η) became (single precision)
\[
a_3 = 0.3173376 \qquad \text{and} \qquad \eta_3 = 0.9519348 \tag{3.13}
\]
with the remaining two coefficients of this fourth-order stencil uniquely defined by the expressions in (2.9b).
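The least-squares evaluation in (3.7)–(3.11) can be sketched in a few lines. The example below is illustrative only and uses the boundary-stencil vectors (3.8) rather than (3.12); the function names are hypothetical, and the target spectrum is a synthetic stand-in (the decoupled boundary spectrum for a known parameter pair, assuming b = 1 and the coefficients of (2.6b)) because the field spectrum of the paper is not reproduced here. With such a target, the fit should return the pair that generated it.

```python
import numpy as np

def boundary_vectors(k, kr, ki):
    """Vector quantities of (3.8) sampled over wavenumbers k for target (kr, ki)."""
    q = (-12*kr*np.sin(k) - 12*ki*np.cos(k) - np.cos(4*k) + 6*np.cos(3*k)
         - 18*np.cos(2*k) + 10*np.cos(k) + 3)
    s = (12*kr*np.cos(k) - 12*ki*np.sin(k) - np.sin(4*k) + 6*np.sin(3*k)
         - 18*np.sin(2*k) + 10*np.sin(k))
    b = 12*ki - 3*np.cos(4*k) + 16*np.cos(3*k) - 36*np.cos(2*k) + 48*np.cos(k) - 25
    d = -12*kr - 3*np.sin(4*k) + 16*np.sin(3*k) - 36*np.sin(2*k) + 48*np.sin(k)
    r = (12*np.cos(5*k) - 60*np.cos(4*k) + 120*np.cos(3*k) - 120*np.cos(2*k)
         + 60*np.cos(k) - 12)
    t = (12*np.sin(5*k) - 60*np.sin(4*k) + 120*np.sin(3*k) - 120*np.sin(2*k)
         + 60*np.sin(k))
    return q, s, b, d, r, t

def fit_two_parameters(q, s, b, d, r, t):
    """Least-squares scalars of (3.11) built from the entrywise products of (3.9b)."""
    S = q*t - r*s
    R = b*t - d*r
    T = r*s - q*t
    Q = s*b - q*d
    return np.dot(R, S) / np.dot(S, S), -np.dot(T, Q) / np.dot(T, T)

def synthetic_target(k, c, eta):
    """Stand-in target: decoupled spectrum of (3.4a) for a known (c, eta)."""
    w = [-(25 + 3*c + 12*eta)/12, (24 - 5*c + 30*eta)/6, (-6 + 3*c - 20*eta)/2,
         (8 - 3*c + 60*eta)/6, (-3 + c - 60*eta)/12, eta]
    rhs = sum(wm * np.exp(1j*m*k) for m, wm in enumerate(w))
    return rhs / (1j * (1 + c*np.exp(1j*k)))

k = np.linspace(0.0, np.pi, 100)          # 100 samples, as in the text
target = synthetic_target(k, 0.5, 0.01)
c_fit, eta_fit = fit_two_parameters(*boundary_vectors(k, target.real, target.imag))
```

In practice the target would be the modified wavenumber of the adjacent stencil, in which case the two relations in (3.7) are no longer satisfied exactly and (3.11) returns a compromise over the sampled spectrum.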
Fig. 3. Comparison of the dispersive and dissipative error distributions at the second interior point using fourth-order stencils with two (present) and three [16] free parameters.
Fig. 4. Comparison of the dispersive error distributions at the boundary, 1st interior and 2nd interior points of composite templates 4^2-4^2-4^2-4^3-4^2-4^2-4^2 (present) and 4^3-4^3-4^3-4^3-4^3-4^3-4^3 as developed by Kim [16]; percentages in parentheses denote resolution efficiencies for 90% expectancy.
Comparison between the resolution errors of this fourth-order two-parameter stencil (notation 4^2) and the three-parameter version (4^3) after linking to the field scheme (2.5) is plotted in Fig. 3. The latter stencil (4^3) was optimized by Kim [16] in a similar least-squares fashion, but the optimization centered on matching the spectral wavenumbers with an allowable resolution error tolerance and weighting function toward the upper end of the spectrum. As illustrated by the spectral profiles in Fig. 3, operating on the modified wavenumbers of the field stencil itself without weighting efficiently resolves 77% [e(δ_90) ≈ 0.77] of the resolvable waves 90% (δ_r ≥ 0.90) or better, which is a 22% improvement over the three-parameter version (4^3). A small sacrifice by the present optimization strategy is a slightly higher dissipative error over the upper 1/4 of resolvable wavenumbers. Continuing with the same optimization strategy for the two-parameter boundary and adjacent boundary families ((2.6) and (2.8), respectively) produces the free parameters
\[
c_1 = -1.0001941, \quad \eta_1 = -0.0046976898, \quad c_2 = 0.40493011 \quad \text{and} \quad \lambda_2 = 0.1574873 \tag{3.14}
\]
where the remaining coefficients of these two closure stencils are given by the expressions in (2.6b) and (2.8b), respectively. Coupling the present three closure stencils with the three-parameter optimized field stencil by Kim [16] forms the final penta-diagonal composite template given the notation 4^2-4^2-4^2-4^3-4^2-4^2-4^2. The actual resolution errors of this composite
Fig. 5. Comparison of the dispersive error distributions at the boundary and 1st interior points of three composite templates owning compact third- and fourth-order standard and optimized stencils; percentages in parentheses denote resolution efficiencies for 90% expectancy.
are assessed by solving the equation system (3.2) with accompanying definitions (3.3). The dispersive errors are contrasted in Fig. 4 with those of the alternate composite template 4^3-4^3-4^3-4^3-4^3-4^3-4^3 as developed by Kim [16], where the resolution efficiencies for an expectancy of 90% are shown in parentheses to judge their respective differences. At and adjacent to the boundary, the 4^2-4^2-4^2-4^3-4^2-4^2-4^2 composite clearly yields significantly lower dispersive errors over more than one-half of the resolvable wavenumbers. The dissipative errors (not shown) also follow this same observation. But once distant from these points, the 4^3-4^3-4^3-4^3-4^3-4^3-4^3 template approaches the field spatial resolution more rapidly. Thus, practical applications that require low dispersive (and dissipative) errors near the boundaries, such as resolving a pressure quantity, may benefit from the present 4^2-4^2-4^2-4^3-4^2-4^2-4^2 penta-diagonal composite over the alternate 4^3-4^3-4^3-4^3-4^3-4^3-4^3 template. Beginning with the standard base composite template 3-4-6-4-3, tri-diagonal templates can be formed similarly that own various orders of optimized compact boundary and adjacent boundary stencils. A few examples are listed in Table 1. The boundary stencils are both third- and fourth-order two-parameter families, but the adjacent stencil is no less than fourth-order to preserve at least fifth-order field accuracy. Each stencil is optimized to its interior neighbor. The field operators are standard upwind (fifth-order) or central (sixth-order) five-point schemes. Fig. 5 shows a sample distribution of the modified wavenumbers in spectral space at the boundary and adjacent points. Scrutinizing the error profiles generated by the optimized boundary stencil itself (3^2) gives a synthetic perspective of the actual template resolution properties. The dispersive error at the boundary is in fact far removed from those present in the field operator. 
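The coupled analysis behind these comparisons, the solution of (3.2), can be sketched for a small tri-diagonal case. The example below is illustrative only and is not one of the optimized templates of Table 1: it assumes unit grid spacing and uses the standard 3-4-3 composite (the classical fourth-order Padé interior stencil closed by the classical third-order compact boundary stencil) stored as dense matrices. The function names are hypothetical. Entry (j, m) of [P(k)] is the implicit weight multiplied by the phase e^{ik(m − j)}, which reproduces the pattern of (3.3a), and each entry of {S(k)} is the phase-weighted sum of the explicit weights in that row.

```python
import numpy as np

def composite_modified_wavenumbers(M, R, k):
    """Solve i [P(k)] {khat} = {S(k)} for the nodal modified wavenumbers at one k."""
    n = M.shape[0]
    idx = np.arange(n)
    phase = np.exp(1j * k * (idx[None, :] - idx[:, None]))  # e^{ik(m - j)}
    P = M * phase
    S = (R * phase).sum(axis=1)
    return np.linalg.solve(1j * P, S)

def template_3_4_3(n):
    """Standard 3-4-3 composite: Pade interior rows, third-order compact closures."""
    M = np.eye(n)
    R = np.zeros((n, n))
    for j in range(1, n - 1):
        M[j, j - 1] = M[j, j + 1] = 0.25
        R[j, j - 1], R[j, j + 1] = -0.75, 0.75
    M[0, 1] = 2.0
    R[0, :3] = [-2.5, 2.0, 0.5]        # q'_1 + 2 q'_2 = (-5 q_1 + 4 q_2 + q_3) / 2
    M[-1, -2] = 2.0
    R[-1, -3:] = [-0.5, -2.0, 2.5]     # mirror image at the far boundary
    return M, R

M, R = template_3_4_3(41)
spectrum = np.array([composite_modified_wavenumbers(M, R, k)
                     for k in np.linspace(0.05, np.pi, 60)])
boundary_dispersive = spectrum[:, 0].real    # coupled real part at the boundary node
boundary_dissipative = spectrum[:, 0].imag   # coupled imaginary part at the boundary node
```

Far from the boundaries the coupled result reverts to the interior Padé value 3 sin k / (2 + cos k), while the first few nodes carry the mixed spectrum that the decoupled analysis misses.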
Interestingly, a comparison between the resolution efficiencies of templates 3^2-4-6-4-3^2 (Jordan [15]) and 3^2-4^2-6-4^2-3^2 reveals that optimizing the adjacent boundary point stencil degrades the global resolution consistency specifically at the boundary. In particular, the former template efficiently resolves 56% [e(δ_90) ≈ 0.56] of the resolvable waves 90% (δ_r ≥ 0.90) or better at the boundary while the latter composite drops that efficiency to only 37%. The optimized template 4^2-4^2-6-4^2-4^2, whose boundary stencil is one order higher, does not soften this decline [e(δ_90) ≈ 0.38]. The rewards of template optimization are plainly evident in the changed resolution errors at the first interior point (Fig. 5b). While the decoupled version suggests no apparent gain in resolution efficiency over the standard composite template, the true error profiles match the adjacent sixth-order field operator quite nicely. The new resolution efficiencies are 71% (3^2-4^2-6-4^2-3^2) and 72% (4^2-4^2-6-4^2-4^2) as compared to a 70% efficiency by the field operator (6c) for a 90% expectancy. The error profiles indicate a modest overall improvement in the spatial resolution by switching the boundary stencil to fourth-order. Evidently, gaining consistent spatial resolution by the composite template even at the boundaries requires a boundary stencil carrying free parameters on both the explicit and implicit sides to optimally control growth of the respective dispersive and dissipative errors. In the application section, we will test these optimized composite templates (and several others) for improving the predictive accuracy over their standard counterparts as well as isolate the direct reduction of solution error due specifically to their spatial resolution properties.

4. Numerical stability of optimized composite templates

A prerequisite to applying the proposed optimized composite templates is proof of their intrinsic numerical stability. 
We can quantify this characteristic by reformulating the linear convection equation into a system of ordinary differential equations (ODE) whose normal modes lead to a range of eigenvalues that denote unstable (growth rate) or stable (decay rate) solutions. This approach has become a standard for quantifying the numerical stability of composite templates (see
Template            Stencil        a           c          φ           η          λ           δ           α            γ           β
3^2-4^2-5-4^2-3^2   Boundary       6.591868    –          –           –          0.277210    −1.874153   6.755130     −1.404776   −3.753412
                    1st interior   0.288468    0.081969   –           0.038802   –           −0.198396   1.254092     −0.697768   −0.396730
3^2-4^2-6-4^2-3^2   Boundary       7.565600    –          –           –          0.379251    −2.444604   8.341106     −2.299804   −3.975949
                    1st interior   0.110884    0.121718   –           0.043663   –           −0.263786   1.134845     −0.420382   −0.494340
4^2-4^2-5-4^2-4^2   Boundary       −0.054967   –          –           0.438545   −2.447307   5.746269    −7.467903    6.238532    −2.508137
                    1st interior   0.024444    0.110468   –           0.053679   –           −0.336413   1.168595     −0.407756   −0.478105
4^2-4^2-6-4^2-4^2   Boundary       3.652111    –          –           0.000807   0.050308    −0.484652   2.470097     0.960609    −2.997168
                    1st interior   0.178418    0.109101   –           0.041190   –           −0.235587   1.172698     −0.515876   −0.462425
4^3-4^2-5-4^2-4^3   Boundary       −6.478879   –          −0.475096   2.953079   −8.428866   15.099735   −20.869802   12.762164   8.390744
                    1st interior   0.046667    0.110392   –           0.051846   –           −0.321699   1.168823     −0.422875   −0.476095
4^3-4^2-6-4^2-4^3   Boundary       −3.084739   –          −0.333219   2.021383   −5.615693   9.760777    −12.846088   8.680277    3.438685
                    1st interior   0.028975    0.134718   –           0.047239   –           −0.301060   1.095847     −0.313779   −0.528247
Table 1 List of coefficients for several five-point composite templates optimized at the boundary and 1st interior points.
Fig. 6. Numerical stability characteristics of several fourth-order-accurate standard and optimized composite templates.
Carpenter et al. [4] for further justification). Accordingly, the first-order derivatives in the equation $q_t = -c\,q_x$ are premultiplied by the implicit coefficient matrix [M] followed by the substitution of (3.1) to give the ODE matrix–vector system as
\[
[M]\{q_{fd}\}_t = -c\,[R]\{q_{fd}\}/\Delta \tag{4.1}
\]
The normal modes of this system are found by substituting the expression $\{q_{fd}\} = e^{\omega_i t}\{\hat{q}\}$, where the complex quantities $\omega_i$ are the eigenvalues and $\{\hat{q}\}$ are the respective eigenvectors. This new system becomes
\[
-\hat{\omega}_i\,[M]\{\hat{q}\} = [R]\{\hat{q}\} \tag{4.2}
\]
where $\hat{\omega}_i$ is the dimensionless set of eigenvalues defined as $\hat{\omega}_i = \omega_i/c$ that depend on the specific stencils of the composite template as well as the number of computational points. To declare the composite template numerically stable, all conjugate pairs of $\hat{\omega}_i$ must fall inside the left-half plane of the eigenvalue spectrum. Direct solutions of the equation system in (4.2), however, address only periodic boundary conditions. Inasmuch as q_0 is normally defined at the boundary, its derivative becomes redundant. For the tri-diagonal templates, the boundary and first interior stencils are combined to eliminate the first row in matrices [M] and [R]. The penta-diagonal templates require elimination of both the first and second rows by appropriately substituting the stencils at the boundary, first interior and second internal points. Further specifics of the elimination procedure and resultant coefficient matrices for the penta-diagonal systems are described by Kim [16]. The numerical stability of various standard and optimized fourth-order composites is illustrated in Fig. 6. Their respective eigenvalues (as well as the eigenvectors) were evaluated using the MATLAB tool and 49 computational points. Carpenter et al. [4] discussed and illustrated the inherent unstable nature of the standard base composite 4-4-4. Stabilizing this template is possible through parameter optimization. The advantage of optimizing only two free parameters according to the present procedure in comparison to the three-parameter composites becomes evident when comparing their respective eigenvalues. Both optimized templates 4^2-4-4^2 (Jordan [15]) and 4^2-4^2-4^2-4^3-4^2-4^2-4^2 own real elements that fall strictly within the left-half plane of the eigenvalue spectrum. This result suggests that these templates are asymptotically stable ($|e^{\omega_i t}| \le 1$) for all CFL values below the stability limit. 
Kim [16] admits that his template 4^3-4^3-4^3-4^3-4^3-4^3-4^3 is marginally unstable, but approaches neutral stability when exercised over finer grid spacing. This same behavior was demonstrated by the 4^3-4-4^3 composite where both vector solutions (3.7) housed three free parameters. In this case, the present optimization strategy does not yield unique determinations for each parameter. The destabilizing effect of optimizing three parameters in the closure stencils is not unique to only the potentially unstable 4-4-4 composite template. Starting with stable base composites 3-4-3 and 4-4-5-6-5-4-4, optimizing three parameters (one on the implicit and two on the explicit sides of the stencil) produces positive eigenvalues as plotted in Figs. 7a and 7b. This detriment is more pronounced for the lower fourth-order field scheme (Fig. 7a). A second notable observation of the fourth-order composite family shown in Fig. 6 is the large scatter among the eigenvalues. The number of optimized stencils and accompanying free parameters directly stimulates a shift in the maximum eigenvalue that suggests large differences among their relative rates of decay. Switching the field scheme to the sixth-order Padé-type with an adjacent standard fifth-order stencil minimized the relative scatter among the various optimized versions (Fig. 7b). One should note that the favorable spatial stability of these composite templates does not necessarily assure overall stability especially when advanced by a chosen temporal scheme.
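The eigenvalue evaluation in (4.2) can be sketched with NumPy in place of the MATLAB tool used for Figs. 6 and 7. The example below is illustrative only: it treats the periodic case, for which (4.2) can be solved directly, uses the standard fourth-order Padé stencil as the field scheme, and absorbs the grid spacing into the dimensionless eigenvalue (an assumption). The function names are hypothetical. For a non-periodic template, the boundary rows would first be eliminated as described above before the same eigenvalue call is applied.

```python
import numpy as np

def periodic_pade4(n):
    """Implicit [M] and explicit [R] matrices of the periodic fourth-order Pade scheme."""
    M = np.eye(n)
    R = np.zeros((n, n))
    for j in range(n):
        M[j, (j - 1) % n] = M[j, (j + 1) % n] = 0.25
        R[j, (j - 1) % n] = -0.75
        R[j, (j + 1) % n] = 0.75
    return M, R

def dimensionless_eigenvalues(M, R):
    """Eigenvalues omega_hat of -omega_hat [M]{q} = [R]{q}, i.e. of -[M]^{-1}[R]."""
    return np.linalg.eigvals(-np.linalg.solve(M, R))

M, R = periodic_pade4(49)                      # 49 points, as in the text
omega_hat = dimensionless_eigenvalues(M, R)
max_growth = omega_hat.real.max()              # positive values would indicate growth
```

For this central periodic scheme the eigenvalues sit on the imaginary axis, so `max_growth` is zero to rounding error; a closure that pushes any eigenvalue into the right-half plane would show up as a positive value.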
S.A. Jordan / Applied Numerical Mathematics 61 (2011) 108–130
Fig. 7. Numerical stability characteristics of several composite templates holding fourth- and sixth-order Padé-type interior stencils within the computational field.
5. Applications of optimized composite templates

Composite templates are optimized herein to specifically target their local dispersive and dissipative errors. But their optimization comes at the cost of a lower formal order for the same size template. Obviously, one can build larger stencils that hold both low truncation errors and high spatial resolution, but this latter upgrade may locally vanish once fed into the resultant composite scheme. Keeping the field operator manageably small (up to seven points) demands only boundary and adjacent boundary stencils to close the composite system. By optimizing these closing stencils, the spatial resolution errors of the composite template approached the field state once off the boundary.

In this section, we illustrate the predictive capability of the present composite templates, as well as several others, using three benchmark problems taken from the CAA workshops. Specifically, these templates resolve the physical wave of the linear convection and non-linear viscous Burgers equations as well as the acoustic scattering waves reflected off a two-dimensional cylinder. Notably, these three applications involve evaluating the predictive errors of single uniform waves. The predictive errors of more practical applications that involve multiple waves propagating at different speeds and impinging at non-zero angles of incidence on the boundary become difficult to quantify due to the absence of companion theories. Moreover, testing the spatial accuracy and temporal stability of the composite templates using practical problems dealing with non-uniform and non-orthogonal grid topologies is not presently appropriate due to the unknown coupling effects of the accompanying metric coefficients in the transformed form of the governing equations.
Evaluating these coefficients analytically (non-uniform gridding only) is not an option because the leading term of the truncation error in the first-order derivative will be reduced by at least one order [23].

Herein, each CAA application was time-advanced by the second-order MAC method. The chosen stability coefficient was very small (CFL = 0.001) to isolate the spatial resolution effects from any temporal errors. Practical applications require much higher CFL values to reach the final solution in a timely manner, but the coupling of the higher temporal and spatial errors can be (though is not always) detrimental to the predictive error. For the linear convection and non-linear Burgers waves, a transparent fictitious wall was inserted far downstream from the wave starting position. The predictive accuracy of each composite template was assessed once the advancing wave was centered over the fictitious wall. Knowing that each template should reach its field stencil resolution errors several points off the boundary, results were collected from only five points (four interior and one boundary) to quantify the predictive accuracy as compared to the exact solution. This spread covers the interior propagation of the higher resolution errors generated by the closing boundary and adjacent constituents in the composite template. We gauge the predictive error of each application in terms of the $L_2$-norm:
$$L_2(u) = \sum_{i=M-5}^{M} \left(u_e - u_{fd}\right)_i^2 \tag{5.1}$$
where $u_e$ and $u_{fd}$ are the exact and predicted discrete wave velocities. The variable $M$ is the number of computational points. We note that in the following exercises no attempt was made to report separate amplitude and phase errors of each template. The $L_2$-norm (as presently evaluated) represents both errors collectively.
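A minimal sketch of the boundary error measure in (5.1) follows. It assumes the solution is stored as a plain sequence whose last entry is the fictitious-wall point. The function name and the `n_points` argument are hypothetical. The default of five follows the text (four interior points and one boundary point), while the printed lower limit $i = M-5$ would span six points under one-based indexing, so the count is left adjustable.

```python
def boundary_l2_error(u_exact, u_fd, n_points=5):
    """Sum of squared errors over the last n_points grid values.

    The last entry of each sequence is taken to be the boundary point.
    """
    if len(u_exact) != len(u_fd):
        raise ValueError("exact and predicted solutions must have equal length")
    if not 0 < n_points <= len(u_exact):
        raise ValueError("n_points must lie between 1 and the number of points")
    pairs = zip(u_exact[-n_points:], u_fd[-n_points:])
    return sum((ue - uf) ** 2 for ue, uf in pairs)


# Hypothetical sample values, for illustration only.
u_exact = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
u_fd = [0.0, 1.0, 2.0, 3.1, 4.0, 5.2, 6.0]
error = boundary_l2_error(u_exact, u_fd)
print(error)
```

Since the differences are squared and summed without separating amplitude from phase, a single number of this kind carries both error types together, as noted in the text.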
Fig. 8. Solution error ($L_2$-norm) of the linear convection equation using fourth-order-accurate composite templates that include standard (tri-diagonal) and optimized (penta-diagonal) stencils at the fictitious boundary.
5.1. Linear convection equation

Numerical solution of a propagating linear wave is a classic start for assessing the predictive accuracy of the optimized penta-diagonal composite templates. This wave is mathematically represented by the one-dimensional time-dependent equation
$$u_t + c\,u_x = 0 \tag{5.2}$$
where the wave speed c is simply unity. Starting with the wave state
$$u(x, t = 0) = \frac{1}{2}\, e^{-\ln 2\,(x/n)^2} \tag{5.3a}$$

gives the exact solution for all times thereafter as

$$u(x, t) = \frac{1}{2}\, e^{-\ln 2\,(x'/n)^2} \tag{5.3b}$$
where the local coordinate is $x' = x - x_0 - t$ with $x_0$ being the origin. In the simulations, $\Delta = \Delta x$ is the uniform grid spacing and the ratio $\Delta/n$ quantifies the spatial resolution of each composite template. The fictitious wall represents the right-side boundary of the computational domain with spatial limits $-50 \le x \le 200$.

Numerical solution errors of the linear wave equation for the standard (tri-diagonal) and optimized (penta-diagonal) composite templates with two-parameter and three-parameter closure stencils are shown in Fig. 8. Each template is built around two basic operators holding standard (-4-) and optimized (-$4_3$-) fourth-order field accuracy. The figure includes convergence rates of the field operators ($4_c$) as measured five points off the boundary for the various grid sizes. As expected, the linear slope in log–log space matches the convergence toward fourth-order field accuracy. Inclusion of either the standard or the optimized closure stencils maintains the same convergence rate as the field accuracy. With the $L_2$-norm quantifying convergence at and near the boundary, this rate is not unexpected because each composite template is closed with stencils of the same order. The benefits of optimizing templates rest ostensibly on their solution errors when compared to the standard composite. But the impact of different resolution properties between the two- and three-parameter families is not readily reflected by their respective solution errors. This result should also be expected because the solution errors of similar composite templates for approximating the linear wave equation are only mildly affected by the respective improvements in the resolution properties. Thus, resolving linear wave propagation is an excellent indicator of numerical convergence in view of verifying the formal order, but the immediate subtleties of optimization toward low-resolution errors are not readily apparent in the local predictive error results.

5.2. Non-linear viscous Burgers equation

Finite difference approximation of the non-linear viscous Burgers equation provides a better metric for assessing the inherent resolution efficiency of the composite templates. In particular, the non-linear term is very sensitive to a high
Fig. 9. Solution error ($L_2$-norm) of the non-linear Burgers equation using fourth-order (penta-diagonal) and sixth-order (tri-diagonal) field schemes closed with two- and three-parameter optimized stencils at the fictitious boundary.
spatial resolution error where numerical symptoms such as aliasing can quickly corrupt the solution accuracy. The governing equation in one-dimension is
$$u_t + u\,u_x = \mu\,u_{xx} \tag{5.4}$$
where the non-linear term is given in non-conservative form. This advective form of Burgers equation permits easy treatment of both central and upwind field operators. The exact wave at initial time t = 0 is
$$u(x, t = 0) = 1 - \tanh\!\left(\frac{x}{2\mu}\right) \tag{5.5a}$$

and the advancing wave solution becomes

$$u(x, t) = 1 - \tanh\!\left(\frac{x'}{2\mu}\right) \tag{5.5b}$$
The local coordinate is still $x' = x - x_0 - t$. Knowing that the non-linear term gauges the resolving power of the composite template, we will use the exact form of the pseudo-viscous term $\mu u_{xx}$ in the computations:
$$\mu\,u_{xx} = \frac{1}{2\mu}\,\operatorname{sech}^2(v)\,\tanh(v) \tag{5.6}$$
where $v = x'/(2\mu)$. Resolving the non-linear Burgers wave by the various composite templates should supply sufficient evidence for correlating their resolution properties with the predictive errors in terms of the $L_2$-norm. In the computations, one has the option of changing either the grid spacing or the Burgers wave size to contrast the predictive errors of the composite template. Like the linear convection predictions, the Burgers wave was advanced using the MAC method with CFL = 0.001.

Convergence of the penta-diagonal composite templates $4_2$-$4_2$-$4_2$-$4_3$-$4_2$-$4_2$-$4_2$ and $4_3$-$4_3$-$4_3$-$4_3$-$4_3$-$4_3$-$4_3$ as applied for resolving the propagating Burgers wave to the fictitious wall boundary is plotted in Fig. 9. As before, the $L_2$-norm at this location is evaluated by summing the predictive errors at the boundary and four adjacent interior points. Unlike the predictive accuracy of the linear convective wave, each template application over coarse grid spacing gave similar values for the $L_2$-norm with non-linear convergence rates (in log–log space). This result directly coincides with their analogous high dispersive and dissipative errors at the finer resolvable scales as illustrated in Fig. 4. These resolution errors decrease sufficiently upon finer gridding to produce consistent convergence rates as well as to distinguish the level of predictive accuracy between these two composites. Notably, the degradation in the predictive error at the intermediate scales by the $4_3$-$4_3$-$4_3$-$4_3$-$4_3$-$4_3$-$4_3$ optimized template correlates well with the corresponding rise in dispersive error (see Fig. 4b). Their predictive errors become nearly indistinguishable once the grid's spatial resolution progressed toward 75% of the resolvable scales. Introducing a sixth-order scheme as the field accuracy lowers the resolution errors at the intermediate resolvable scales (Fig. 5) and concurrently the predictive error.
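As a consistency check on (5.4) through (5.6), the sketch below evaluates the exact Burgers wave and the closed-form viscous term, then measures the residual of the governing equation with central differences. The function names, the sample point and the step size `h` are illustrative assumptions, not part of the original computations.

```python
import math


def burgers_exact(x, t, mu, x0=0.0):
    """Exact travelling wave u = 1 - tanh(x' / (2 mu)), with x' = x - x0 - t."""
    return 1.0 - math.tanh((x - x0 - t) / (2.0 * mu))


def viscous_term_exact(x, t, mu, x0=0.0):
    """Closed form of mu * u_xx: sech^2(v) * tanh(v) / (2 mu), v = x' / (2 mu)."""
    v = (x - x0 - t) / (2.0 * mu)
    return math.tanh(v) / (2.0 * mu * math.cosh(v) ** 2)


def burgers_residual(x, t, mu, h=1.0e-4):
    """u_t + u u_x - mu u_xx, with u_t and u_x from central differences."""
    u = burgers_exact(x, t, mu)
    u_t = (burgers_exact(x, t + h, mu) - burgers_exact(x, t - h, mu)) / (2.0 * h)
    u_x = (burgers_exact(x + h, t, mu) - burgers_exact(x - h, t, mu)) / (2.0 * h)
    return u_t + u * u_x - viscous_term_exact(x, t, mu)


# Illustrative sample point and viscosity (assumed values).
residual = burgers_residual(x=0.3, t=0.2, mu=0.5)
print(residual)
```

The residual should vanish up to the truncation error of the central differences, which confirms that the wave travels at unit speed and that the sign and the $1/(2\mu)$ factor in the viscous term agree with the non-conservative form of the equation.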
But this improvement is only consistently realized for the optimized fourth-order boundary closures. Relative improvement in the wave predictions using the second-order boundary stencil degraded with finer grid spacing. This result stems from the reduced formal accuracy of the boundary stencil rather than its associated resolution error. Discrete side-by-side comparisons between the predictive accuracy and resolution errors of a specific composite template in their respective spatial and spectral domains lend a qualitative picture of the healthier predictive capability one can reach
Fig. 10. Comparison of third-order explicit, standard and optimized boundary templates (fourth-order field accuracy) as applied to the non-linear Burgers equation.
by optimizing the closing participants. This improvement is most apparent over the intermediate resolvable scales where the optimization strategy is most effective. But this understanding lacks a definable quantitative connection between the lowered resolution error and the superior predictive power of the optimized template. Herein, we seek a simple process that quantifies, a priori, the enhanced (or degraded) predictive accuracy due to varying the local resolution properties. We can begin forming this route by appropriately grouping the predictive errors of the composite templates holding constituents of similar formal order. Quantitative differences among these groupings at discrete resolution scales isolate their predictive accuracy from the truncation errors associated with dissimilar intrinsic stencils. An example is illustrated on the left side of Fig. 10 for the base composite template 3-4-3. The notation $|r|$ in the figure represents the magnitude of the integrated resolution expectancy error ($|\delta|$) averaged over the boundary and four interior points:
$$|r| = \frac{1}{5} \sum_{i=M-5}^{M} \int_0^{n} \left(1 - |\delta|\right) d(k\Delta) \tag{5.7}$$
where $0 \le n \le \pi$. The scaled grid spacings ($k\Delta$) of each grouping are displayed in the figure. Clearly, the trends give a qualitative picture outlining the connection between the local predictive and resolution errors. Like the previous observations, the most improved predictive accuracy occurs over the intermediate resolvable scales. Using this figure we can now correlate the predictive accuracy with the resolving power of each composite. The right side of Fig. 10 establishes the direct correlation between the expected change in predictive error, denoted $R(L_2)$, and the improved (or degraded) modified resolution error, denoted $I(r)$, for the base composite template 3-4-3. This correlation treats boundary closures varying from an explicit third-order definition ($3_e$) to a corresponding optimized three-parameter compact stencil ($3_3$). Collectively, the slope of their correlation from linear regression in log–log space is unity. A similar correlation is demonstrated in Fig. 11 for the 4-4-4 base template. For this particular composite, the correlation between the different resolution errors and the change in predictive accuracy is nearly one-to-one (log–log space). These correlations provide an a priori assessment of the expected upgrade in predictive accuracy due to template optimization (in terms of resolution error) at and near the domain boundaries. Conversely, degrading the template predictive accuracy by substituting an explicit stencil for the compact boundary closure is also quantified by this error correlation.

Processing the $L_2$-accuracy norms and their corresponding discrete resolution errors $|r|$ in the same fashion for the 4-4-5-4-4 base template is shown in Fig. 12. The boundary stencils in these tests vary from an explicit definition to a three-parameter optimized fourth-order expression. In view of the previous field accuracy analyses [1,2,6,10,11], only fourth-order stencils were coupled with the upwind fifth-order interior scheme.
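The correlation described above amounts to an ordinary least-squares line fitted to the logarithms of the two error measures. The sketch below shows that step alone. The function name and the sample data are hypothetical, and the inputs stand in for whatever values of $I(r)$ and $R(L_2)$ a study provides.

```python
import math


def loglog_fit(x, y):
    """Least-squares line through (log10 x, log10 y).

    Returns (slope, intercept, correlation coefficient). All inputs must be
    positive, and x must contain at least two distinct values.
    """
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    syy = sum((b - mean_y) ** 2 for b in ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical data obeying y = 3 x, so the expected log-log slope is one.
resolution_change = [0.1, 0.2, 0.4, 0.8]
error_change = [3.0 * v for v in resolution_change]
slope, intercept, r = loglog_fit(resolution_change, error_change)
print(slope, intercept, r)
```

A unit slope in log–log space means the relative change in predictive error is directly proportional to the relative change in resolution error, while the correlation coefficient reports how tightly the templates follow that line.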
Like the previous 3-4-3 and 4-4-4 base templates, reduction of the $L_2$-error norm for each 4-4-5-4-4 composite template correlates well with the improved spatial resolution. Similarly, the slope of the correlation line that quantifies their relative improvement is unity (log–log space).

Superimposing a variety of tri- and penta-diagonal composite templates in terms of their predictive and resolution errors as applied to resolving the propagating Burgers wave is illustrated in Fig. 13. The field operators are fourth- and sixth-order with various standard and optimized second- and fourth-order closure stencils. Results of the composite template $2_2$-$4_2$-$4_2$-$6_1$-$4_2$-$4_2$-$2_2$ appear through the stencil optimization strategy taken from Kim and Lee [17] and the composite
Fig. 11. Comparison of fourth-order explicit [6], standard and optimized boundary templates (fourth-order field accuracy) as applied to the non-linear Burgers equation.
Fig. 12. Comparison of fourth-order explicit, standard and optimized boundary templates (fifth-order field accuracy) as applied to the non-linear Burgers equation.
template $4_3$-$4_3$-$4_3$-$4_3$-$4_3$-$4_3$-$4_3$ belongs to Kim [16]. Apart from the former template, each composite tracks the same trends drawn from all previous applications. Optimizing the closing stencils and/or raising the field order clearly improved the predictive accuracy over the base composite 4-4-4 for the intermediate and finer grid spacing. Although the slope of the regression line departs from unity, the correlation coefficient among all templates exceeds 95%. The visible differences for the optimized template from Kim and Lee are the relatively higher resolution errors on the coarser grids. But as noted earlier, the lower-order boundary stencil presumably negates any expected gains in predictive accuracy compared to the other closing stencils, even though both are optimized to reduce the local resolution errors.

Finally, we note that improving the spatial resolution of the composite template almost always comes at the cost of more CPU solution time. This cost reflects inversion of the coefficient matrix [M] in Eq. (3.1) by either a tri- or penta-diagonal solver for the present composites. As an example ($k\Delta = 1.0$), switching the interior explicit fourth-order operator to a companion three-point compact variety lowered the $L_2$-error norm of the non-linear Burgers prediction by nearly an order of magnitude (for the same number of grid points). One drawback to this dramatic improvement in predictive accuracy was a 20% rise in CPU expense spent solely on numerically approximating the non-linear term. Likewise, an attractive 500%
Fig. 13. Comparison of various multi-parameter optimized templates (sixth-order field accuracy) as applied to the non-linear Burgers equation.
reduction in predictive error was realized when using a penta-diagonal mixture of similar formal accuracy. But again, this improvement raised the CPU expense by approximately 30%. One has the option of coarsening the grid spacing to lower the CPU cost given the higher resolving template. However, the resolving efficiency of the composite template will concurrently degrade such that one must ensure that the lower-resolving composite with fine grid spacing is the best choice.

5.3. Two-dimensional acoustic scattering

The final problem for testing the spatial accuracy of optimized composite templates is taken from the Second CAA Workshop [21]. Specifically, we seek reduction in the predictive error of the acoustic wall pressure through improvement in the local resolution efficiency. The problem itself involves solving the acoustic waves reflecting off a cylinder wall due to a pressure source positioned along the horizontal centerline (x-axis) of the domain annulus. The governing equations in polar coordinates are
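$$\frac{\partial}{\partial t}\begin{bmatrix} u_r \\ u_\theta \\ p \end{bmatrix} + \frac{\partial}{\partial r}\begin{bmatrix} p \\ 0 \\ u_r \end{bmatrix} + \frac{1}{r}\frac{\partial}{\partial \theta}\begin{bmatrix} 0 \\ p \\ u_\theta \end{bmatrix} + \frac{1}{r}\begin{bmatrix} 0 \\ 0 \\ u_r \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \tag{5.8}$$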
where $u_r$ is the radial velocity, $u_\theta$ is the circumferential velocity and $p$ is the acoustic pressure. The cylinder radius is $r = 0.5$ with an annulus outer limit $r = 10.5$. Flow conditions at the cylinder wall are zero for the normal velocity $u_r$, whereas both the circumferential velocity $u_\theta$ and pressure $p$ are computed using the governing equations (5.8). An initial pressure source is placed at polar coordinates (4, 0°) using the equation
$$p(x, y, 0) = \exp\!\left[-\ln(2)\,\frac{(x - 4)^2 + y^2}{0.04}\right] \tag{5.9}$$
One can find the exact solution for the time-dependent acoustic pressures and the annulus exterior boundary condition in the CAA Workshop publication [21]. A uniform point distribution of 201 × 101 (circumferential by radial directions) was generated to test the composite template field accuracy, whereas a relatively coarse 101 × 81 grid focused on the wall pressure predictions. A coarse spacing was needed adjacent to the cylinder wall to distinguish the template accuracies in terms of resolving the corresponding acoustic pressure. We chose the MAC solution scheme of Hixon and Turkel [14] to time-advance the governing equation system (5.8) with the stated boundary conditions. The interior stencils were both fourth- and sixth-order with their pre-factored form and respective coefficients given by Hixon and Turkel.

Predictions of the primary, secondary and tertiary waves within the domain annulus at time $t = 7$ are shown in Fig. 14. These results reflect solutions by the fourth-order field operator with an optimized fourth-order boundary closure (template $4_2$-4-$4_2$ with $e(\delta_{90})_{wall} \approx 0.42$ at the wall). As previously discussed, this particular template is stable for all CFL values below the stability limit. The outside acoustic pattern replicates the first wave of the initial pressure pulse while the second intermediate pattern captures the primary reflected wave off the cylinder wall that opposes the source. Their time histories at point (5, 90°) are compared to the exact solution where the peak amplitude of the dominant wave is under-predicted by
Fig. 14. Predictions of two-dimensional acoustic scattering problem by present optimized composite template $4_2$-4-$4_2$.
Fig. 15. Radial pressure distribution of the acoustic scatter problem (45°, $t = 4$); percents in the parentheses denote the absolute predictive error at the cylinder boundary point.
approximately 3% with a small phase error. Not shown in the time history is the weakest wave pattern that is closest to the cylinder wall. This tertiary wave materializes from the split portions of the dominant wave interacting on the wall opposite the pressure source. Fig. 15 shows the spatial distribution of the primary and secondary reflected waves along a radial line measured 45° from the annulus x-axis at time $t = 4$. The left half of this figure compares the $4_2$-4-$4_2$ template predictions to the exact solution over the finer 201 × 101 spacing while the right side is a close-up of the secondary wave using the coarse 101 × 81 grid. The peak amplitude of the secondary wave is under-predicted by 4% by the $4_2$-4-$4_2$ template with no discernible phase error. The expanded view near the cylinder wall shows the fourth interior point of the coarse grid coincident with the secondary wave peak. Apart from the 4-4-4 composite, all composite templates show nearly overlying predictions of this wave peak (and beyond), which is due to the diminished influence of the various boundary closures. The 4-4-4 composite, in particular, predicted a substantial phase variation that was expected due to its poor dissipative error properties.

The predictive errors of the instantaneous wall pressure differed significantly among the optimized stencils as listed in Fig. 15b. The largest error (11.5%) came from the 4-4-4 composite ($e(\delta_{90})_{wall} \approx 0.39$). Optimizing the closing stencil using two ($4_2$-4-$4_2$) and then three ($4_3$-4-$4_3$, $e(\delta_{90})_{wall} \approx 0.46$) parameters gradually improved the predictive accuracy of the wall pressure: 10.4% and 9.4%, respectively. But when a sixth-order field operator with an adjacent standard fourth-order stencil was introduced, then optimized, the predictive errors dropped to acceptable levels.
Considering the relatively coarse grid spacing used for this test problem, these particular templates are especially attractive even though their resolution efficiencies at the wall boundary are no better than those of their analogous fourth-order field schemes: $e(\delta_{90})_{wall} \approx 0.40$ and $e(\delta_{90})_{wall} \approx 0.45$ for the $4_2$-4-6-4-$4_2$ and $4_3$-4-6-4-$4_3$ templates, respectively. Finally, switching the numerical approximation from the 4-4-4 standard scheme to the $4_3$-4-6-4-$4_3$ optimized template raised the relative CPU requirement by approximately 13%.

6. Final remarks

Optimized composite templates comprise coupled multi-parameter compact finite differencing stencils that target low-resolution properties as well as global stability. Previous spectral analyses quantify their local resolution properties by treating each compact stencil separately. But as demonstrated herein, the true resolution errors can only be properly understood through spectral analyses of the final coupled version. One must be assured that the optimization strategy does not produce composite templates that house resolution errors worse than the corresponding standard form. At the boundary, the local resolution efficiencies of the decoupled template can appear 100% better than the true efficiency of the fully coupled template. The high dispersive and dissipative errors at the upper resolvable scales propagate into the domain interior where only the field operator exists in the composite template. However, users can reconcile this resolution inconsistency by imposing mild grid clustering towards the physical boundaries.

The present corrective least-squares measure for template optimization centers on ensuring consistent resolution errors throughout the entire computational domain. Both tri- and penta-diagonal composite templates were closed by boundary and adjacent boundary two- and three-parameter families. The minimization procedure works well except specifically at the boundary. The improvements are most significant along the intermediate resolvable scales. Interestingly, optimizing the first-point stencil in a tri-diagonal composite corrupted the favorable resolution characteristics at the adjacent boundary. But the larger templates (penta-diagonal) lead to better boundary (and near boundary) spectral resolution distributions similar to the field scheme.
While some composite templates stabilized upon optimization, others became mildly unstable. The unstable versions can reach neutral stability under fine grid spacing, but this answer counters the primary justification for stencil optimization towards favorable resolution properties.

A simple process is devised that correlates the increase in spatial accuracy with reduced resolution errors. In log–log space, the slope of the linear regression line is unity for composite templates holding equivalent formal orders among the contributing constituents. Like the established techniques for ensuring temporal stability and field accuracy, the procedure provides developers with a path towards a priori assessments of the expected change in predictive errors due to template optimization. Notably, the procedure is still useful even when the companion templates have dissimilar formal orders.

Acknowledgements

The author gratefully acknowledges the support of the Office of Naval Research (Dr. Ronald D. Joslin, Program Officer), Contract No. N0001408AF00002, and the In-House Laboratory Independent Research Program (Dr. Anthony A. Ruffa, Program Coordinator) at the Naval Undersea Warfare Center Division Newport.

References

[1] S.S. Abarbanel, A.E. Chertock, Strict stability of high-order compact implicit finite-difference schemes: the role of boundary conditions for hyperbolic PDEs I, Journal of Computational Physics 160 (2000) 42–66.
[2] S.S. Abarbanel, A.E. Chertock, Strict stability of high-order compact implicit finite-difference schemes: the role of boundary conditions for hyperbolic PDEs II, Journal of Computational Physics 160 (2000) 67–87.
[3] N.A. Adams, K. Shariff, A high-resolution hybrid compact-ENO scheme for shock-turbulence interaction problems, Journal of Computational Physics 127 (1996) 27–51.
[4] M.H. Carpenter, D. Gottlieb, S. Abarbanel, The stability of numerical boundary treatments for compact high-order finite-difference schemes, Journal of Computational Physics 108 (1993) 272–295.
[5] A.W. Cook, J.J. Riley, Direct numerical simulation of a turbulent reactive plume on a parallel computer, Journal of Computational Physics 129 (2) (1996) 263–283.
[6] A.O. Demuren, R.V. Wilson, M. Carpenter, Higher-order compact schemes for numerical simulation of incompressible flows, Part 1: Theoretical development, Numerical Heat Transfer Part B 103 (2001) 207–230.
[7] X. Deng, H. Maekawa, Compact high-order accurate nonlinear schemes, Journal of Computational Physics 130 (1997) 77–91.
[8] B. Finkelstein, R. Kastner, The spectral order of accuracy: A new unified tool in the design methodology of excitation-adaptive wave equation, Journal of Computational Physics 228 (24) (2009) 8958–8984.
[9] B. Gustafsson, Highly accurate compact implicit methods and boundary conditions, Mathematics of Computation 29 (130) (1975) 396–406.
[10] B. Gustafsson, The convergence rate for difference approximations to general mixed initial boundary value problems, SIAM Journal of Numerical Analysis 18 (1981) 179–190.
[11] B. Gustafsson, H.-O. Kreiss, A. Sundstrom, Stability theory of difference approximations for mixed initial boundary-value problems II, Mathematics of Computation 26 (1972) 649–686.
[12] C. Hardin, J.R. Ristorcelli, C.K.W. Tam, in: ICASE/LaRC Workshop on Benchmark Problems in Computational Aeroacoustics, NASA CP-3300, Hampton, VA, 1995.
[13] R. Hixon, Prefactored small stencil compact schemes, Journal of Computational Physics 165 (2000) 522–541.
[14] R. Hixon, E. Turkel, Compact implicit MacCormack-type schemes with high accuracy for acoustics, Journal of Computational Physics 158 (2000) 51–70.
[15] S.A. Jordan, The spatial resolution properties of composite compact finite differencing, Journal of Computational Physics 221 (2007) 558–576.
[16] J.W. Kim, Optimized boundary compact finite difference schemes for computational aeroacoustics, Journal of Computational Physics 225 (1) (2007) 995–1019.
[17] J.W. Kim, D.J. Lee, Implementation of boundary conditions for optimized high-order compact schemes, Journal of Computational Acoustics 5 (2) (1997) 177–191.
[18] S.K. Lele, Compact finite difference schemes with spectral-like resolution, Journal of Computational Physics 103 (1992) 16–42.
[19] R. Rogallo, P. Moin, Numerical simulation of turbulent flows, Annual Review of Fluid Mechanics 16 (1984) 99–137.
[20] C.K.W. Tam, Z. Dong, Wall boundary conditions for high-order finite-difference schemes in computational aeroacoustics, Theoretical and Computational Fluid Mechanics 6 (1994) 303–322.
[21] C.K.W. Tam, C. Hardin, in: Second Computational Aeroacoustics (CAA) Workshop on Benchmark Problems, NASA CP-3352, Hampton, VA, 1997.
[22] C.K.W. Tam, J.C. Webb, Dispersion-relation-preserving schemes for computational acoustics, Journal of Computational Physics 103 (1993) 262–281.
[23] J.F. Thompson, Z.U.A. Warsi, C.W. Mastin, Numerical Grid Generation, Elsevier Science, New York, 1985.
[24] X. Zhong, High-order finite-difference schemes for numerical simulation of hypersonic boundary-layer transition, Journal of Computational Physics 144 (1998) 662–709.
Applied Numerical Mathematics 61 (2011) 131–148
Applied Numerical Mathematics www.elsevier.com/locate/apnum
Jacobi spectral solution for integral algebraic equations of index-2
M. Hadizadeh ∗, F. Ghoreishi, S. Pishbin
Department of Mathematics, K.N. Toosi University of Technology, Tehran, Iran

Article info
Article history: Received 13 December 2009; Received in revised form 16 August 2010; Accepted 17 August 2010; Available online 24 August 2010
Keywords: Integral algebraic equations; Jacobi collocation method; System of Volterra equations; Index of IAEs; Error analysis

Abstract
This paper is concerned with obtaining the approximate solution of a class of semi-explicit Integral Algebraic Equations (IAEs) of index-2. A Jacobi collocation method including the matrix–vector multiplication representation is proposed for the IAEs of index-2. A rigorous analysis of the error bound in the weighted $L^2$ norm is also provided, which theoretically justifies the spectral rate of convergence when the kernels and the source functions are sufficiently smooth. Results of several numerical experiments are presented which support the theoretical results. © 2010 IMACS. Published by Elsevier B.V. All rights reserved.
1. Introduction

Coupled systems of Integral Algebraic Equations (IAEs) consisting of first and second kind Volterra Integral Equations (VIEs) naturally arise in many mathematical modeling processes, e.g. the kernel identification problems in heat conduction and viscoelasticity [21], the evolution of a chemical reaction within a small cell [14], the two-dimensional biharmonic equation in a semi-infinite strip [10], dynamic processes in chemical reactors [15] and Kirchhoff's laws. (For further applications see [2,22] and references therein.)

An initial investigation of these equations indicates that they have properties very similar to Differential Algebraic Equations (DAEs). A system of DAEs is characterized by its index, which is the number of differentiations required to convert it into a system of ODEs. The concept of "index" has been introduced in order to quantify the level of difficulty that is involved in solving a given DAE or IAE (see e.g. [1,12,13]). It must be stressed that numerical schemes which are applicable (i.e. convergent) for IAEs of a given index might not be useful for IAEs of higher index. Note that IAEs with index > 1 are generally hard to solve and are still under active research.

The theory of IAEs emerged from early attempts by Gear in 1990 to determine the difficulties of these equations. He introduced the "index reduction procedure" for IAE systems in [8], similar to that in [9] for DAEs, in which the index is determined if the process terminates. This means that under suitable conditions, there is a solution for the resulting regular system of integral equations. Since then, several authors have investigated the existence, uniqueness and numerical analysis of IAE systems. Bulatov [3], in 1997, gave existence and uniqueness results for the solution of IAE systems with convolution kernels and defined the index in analogy to Gear's approach. (See [4] for further details.)
Kauthen [16], in 2000, applied the polynomial spline collocation method to semi-explicit IAEs of index-1 and established global convergence as well as local superconvergence. Furthermore, Brunner [2] defined index-1 tractability for a semi-explicit form of IAEs and investigated the existence of a unique solution for these types of systems.
* Corresponding author. E-mail address: [email protected] (M. Hadizadeh).
0168-9274/$30.00 © 2010 IMACS. Published by Elsevier B.V. All rights reserved. doi:10.1016/j.apnum.2010.08.009
M. Hadizadeh et al. / Applied Numerical Mathematics 61 (2011) 131–148
In this paper, we study the numerical solvability of a mixed system of Volterra integral equations of the first and second kind. More precisely, we consider the semi-explicit system of integral algebraic equations
\[
\begin{aligned}
y(t) &= f(t) + (\nu_{11} y)(t) + (\nu_{12} z)(t), \\
0 &= g(t) + (\nu_{21} y)(t),
\end{aligned}
\tag{1.1}
\]
where the linear Volterra integral operators $\nu_{kl}$ are given by:
\[
(\nu_{kl}\varphi)(t) = \int_0^t K_{kl}(t,s)\,\varphi(s)\,ds, \qquad t \in I = [0,T] \quad (k,l = 1,2)
\]
and $y : I \to \mathbb{R}^{d_1}$, $z : I \to \mathbb{R}^{d_2}$, $K_{11}(\cdot,\cdot) \in L(\mathbb{R}^{d_1})$, $K_{12}(\cdot,\cdot) \in L(\mathbb{R}^{d_2}, \mathbb{R}^{d_1})$, $K_{21}(\cdot,\cdot) \in L(\mathbb{R}^{d_1}, \mathbb{R}^{d_2})$, where $L(\cdot,\cdot)$ denotes the space of linear transformations. Here, the word “algebraic” assumes a wider meaning, in that it refers to the “non-differential” constraints forming part of the system, in analogy to DAEs.

In order to give an application of these models, consider the following heat equation with initial and mixed boundary conditions, which represents a boundary reaction in the diffusion of chemicals, where $\alpha(t)u_x$ represents the diffusive transport of material to the boundary. Following [5, p. 79], for a continuously differentiable function $f$ and continuous functions $g$, $h$, $\alpha$, $\beta$ and $\gamma$, the solution of the problem
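\[
\begin{cases}
u_t = u_{xx}, & 0 < x < 1,\ 0 < t, \\
u(x,0) = f(x), & 0 < x < 1, \\
u_t(0,t) + \alpha(t)u_x(0,t) + \beta(t)u(0,t) = g(t), & 0 < t, \\
u_x(1,t) + \gamma(t)u(1,t) = h(t), & 0 < t,
\end{cases}
\tag{1.2}
\]
has the representation
\[
\begin{aligned}
u(x,t) ={}& \int_0^1 \big\{\theta(x-\xi,t) - \theta(x+\xi,t)\big\} f(\xi)\,d\xi
 - 2\int_0^t \frac{\partial\theta}{\partial x}(x,t-\tau)\Big\{ f(0) + \int_0^\tau \phi_1(\eta)\,d\eta \Big\}\,d\tau \\
& + 2\int_0^t \frac{\partial\theta}{\partial x}(x-1,t-\tau)\Big\{ f(1) + \int_0^\tau \phi_2(\eta)\,d\eta \Big\}\,d\tau,
\end{aligned}
\tag{1.3}
\]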
if and only if $\phi_1$ and $\phi_2$ are continuous functions that satisfy the coupled system of Volterra integral equations
\[
\begin{cases}
g(t) = \phi_1(t) + 2\alpha(t)\displaystyle\int_0^t \theta(\xi,t) f(\xi)\,d\xi - 2\alpha(t)\int_0^t \theta(0,t-\tau)\phi_1(\tau)\,d\tau \\
\qquad\quad {}+ 2\alpha(t)\displaystyle\int_0^t \theta(-1,t-\tau)\phi_2(\tau)\,d\tau + \beta(t) f(0) + \beta(t)\int_0^t \phi_1(\tau)\,d\tau, \\[6pt]
h(t) = 2\displaystyle\int_0^t \theta(1+\xi,t) f(\xi)\,d\xi - 2\int_0^t \theta(1,t-\tau)\phi_1(\tau)\,d\tau \\
\qquad\quad {}+ 2\displaystyle\int_0^t \theta(0,t-\tau)\phi_2(\tau)\,d\tau + \gamma(t) f(1) + \gamma(t)\int_0^t \phi_2(\tau)\,d\tau,
\end{cases}
\tag{1.4}
\]
where $\theta(x,t)$ is the well-known theta function given in [5]. However, in some special cases (e.g. $\gamma(t) = \dfrac{-2\int_0^t \theta(0,t-\tau)\phi_2(\tau)\,d\tau}{f(1) + \int_0^t \phi_2(\tau)\,d\tau}$, $u(1,t) = 0$), it is possible to reduce the system (1.4) to a system of the form (1.1) as follows:
\[
\begin{cases}
g(t) = \phi_1(t) + 2\alpha(t)\displaystyle\int_0^t \theta(\xi,t) f(\xi)\,d\xi - 2\alpha(t)\int_0^t \theta(0,t-\tau)\phi_1(\tau)\,d\tau \\
\qquad\quad {}+ 2\alpha(t)\displaystyle\int_0^t \theta(-1,t-\tau)\phi_2(\tau)\,d\tau + \beta(t) f(0) + \beta(t)\int_0^t \phi_1(\tau)\,d\tau, \\[6pt]
h(t) = 2\displaystyle\int_0^t \theta(1+\xi,t) f(\xi)\,d\xi - 2\int_0^t \theta(1,t-\tau)\phi_1(\tau)\,d\tau.
\end{cases}
\]
The outline of this paper is as follows: Section 2 is devoted to introducing IAEs and their preliminary concepts. In Section 3, we investigate the existence of a unique solution for the semi-explicit IAEs system (1.1). The numerical analysis and the error estimation of the Jacobi collocation method in the weighted $L^2$ norm are given in Sections 4 and 5. Finally, in Section 6 some numerical experiments are reported which confirm the theoretical results of the paper.

2. Preliminaries

A general system of integral algebraic equations takes the form:
\[
A(t)X(t) = G(t) + \int_0^t K\big(t,s,X(s)\big)\,ds,
\tag{2.1}
\]
where $X : I \to \mathbb{R}^d$, $G : I \to \mathbb{R}^d$, $K : I \times I \times \mathbb{R}^d \to \mathbb{R}^d$ are continuous and $A(t) \in L(\mathbb{R}^d)$ is a singular matrix with continuous entries ($\operatorname{rank}(A) \ge 1$, $\det(A) = 0$). Note that since $A$ is singular and the system has a structure similar to DAEs of index greater than zero, there is no guarantee that the above system is solvable (see e.g. [8,9]). The semi-explicit linear version of (2.1) is as follows:
\[
\begin{aligned}
y(t) &= f(t) + (\nu_{11} y)(t) + (\nu_{12} z)(t), \\
0 &= g(t) + (\nu_{21} y)(t) + (\nu_{22} z)(t),
\end{aligned}
\tag{2.2}
\]
where the Volterra integral operators $\nu_{kl}$ and the matrix kernels $K_{kl}$ ($k,l = 1,2$) are defined as in (1.1). The notion of index is crucial for the behavior of solutions of IAEs, and it must contain the properties of feasible numerical solutions. There exist several different definitions of index for IAEs in [2,4,8], which are often closely related. These definitions are conceptually based on the “index reduction procedure”, i.e. a differentiation process of the algebraic constraints which yields a system of regular VIEs, and on the “tractability index”, which requires that the algebraic constraints be locally solvable for the algebraic components of the IAEs solution. The following definition of index-1 tractability for the semi-explicit system (2.2) was given by Brunner and is analogous to that given for DAEs by März [17]:

Definition 1. (See [2].) The semi-explicit IAEs (2.2) is said to be index-1 tractable if the first-kind VIE
\[
\int_0^t K_{22}(t,s)\,w(s)\,ds = h(t), \qquad t \in I,
\]
is uniquely solvable in $C(I)$ whenever $h \in C^1(I)$ and $h(0) = 0$.

This means that the IAEs system (2.2) is index-1 tractable if its second equation (i.e. the linear first-kind VIE) is uniquely solvable for $z \in C(I)$ as an algebraic component. Following [2], assume that the functions $f$, $g$, $K_{ij}$ ($i,j = 1,2$) are sufficiently smooth and that, for all $t \in I$, $|\det K_{22}(t,t)| \ge k_0 > 0$, where $k_0$ is a positive constant, and $g(0) = 0$. Using the index reduction procedure [8] and Theorem (2.1.8) in [2, p. 64], we can show that the linear VIE possesses a unique solution on $I$, and so the solvability and regularity of its solution are obtained.

3. IAEs of index-2

As a consequence of the previous section, it is possible to define index-2 for the semi-explicit IAEs system (1.1) based on minimal regularity conditions and to investigate the existence of a unique solution. Considering the system (1.1) and differentiating its second equation with respect to $t$, we obtain:
\[
0 = g'(t) + K_{21}(t,t)\,y(t) + \int_0^t \frac{\partial K_{21}(t,s)}{\partial t}\,y(s)\,ds.
\]
Now, by replacing $y(t)$ from the first equation of (1.1):
\[
0 = g_1(t) + \int_0^t \Big( K_{21}(t,t)K_{11}(t,s) + \frac{\partial K_{21}(t,s)}{\partial t} \Big) y(s)\,ds + \int_0^t K_{21}(t,t)K_{12}(t,s)\,z(s)\,ds,
\tag{3.1}
\]
where $g_1(t) = g'(t) + K_{21}(t,t) f(t)$. Obviously, (3.1) together with the first equation of (1.1) has the same form as the index-1 semi-explicit IAEs system (2.2). In accordance with the terminology introduced by Gear [8], we will refer to the system (1.1) as a semi-explicit IAEs of index-2. However, it has to be pointed out that this reduction to IAEs of index-1 (2.2) is not practical from a numerical point of view.

We can also consider the tractability index-2 for the IAEs (1.1) from another viewpoint. The IAEs (1.1) is called index-2 tractable if the algebraic constraints are locally solvable for the algebraic component $z$, i.e. by substituting $y(t)$ from the first equation of (1.1) into its second equation we have
\[
0 = g(t) + f_1(t) + \int_0^t\!\!\int_0^s K_{21}(t,s)K_{12}(s,x)\,z(x)\,dx\,ds,
\tag{3.2}
\]
where
\[
f_1(t) = \int_0^t K_{21}(t,s) f(s)\,ds + \int_0^t\!\!\int_0^s K_{21}(t,s)K_{11}(s,x)\,y(x)\,dx\,ds.
\]
Exchanging the order of integration in Eq. (3.2), we obtain:
\[
0 = g(t) + f_1(t) + \int_0^t K(t,x)\,z(x)\,dx,
\tag{3.3}
\]
where $K(t,x) = \int_x^t K_{21}(t,s)K_{12}(s,x)\,ds$.

One of our results in this section is the following definition, which gives the tractability index-2 for the semi-explicit IAEs system (1.1) using (3.3) and an extension of Definition 1:

Definition 2. The semi-explicit IAEs (1.1) is said to be index-2 tractable if the first-kind VIE
\[
\int_0^t K(t,x)\,w(x)\,dx = h(t), \qquad t \in I,
\]
is uniquely solvable in $C(I)$ whenever $h \in C^1(I)$ and $h(0) = 0$.

Now, we make use of the result of Brunner [2], which gives existence and uniqueness results for the IAEs (1.1). The VIE (3.1) is uniquely solvable for $z \in C(I)$ if the given functions $f$, $g$, $K_{ij}$ ($i,j = 1,2$) and $\partial K_{21}(t,s)/\partial t$ are sufficiently smooth and
\[
g_1(0) = 0 \quad\text{and}\quad \big|\det\big(K_{21}(t,t)K_{12}(t,t)\big)\big| \ge k_0 > 0, \qquad \forall t \in I,
\]
where $k_0$ is a positive constant. The following theorem gives the relevant conditions on $K_{1l}$ ($l = 1,2$), $K_{21}$, $f$ and $g$ for the existence of a unique solution of the IAEs (1.1), analogous to Theorem (8.1.5) in [2, p. 472]:

Theorem 1. Let $\nu \ge 0$ and assume that
1. $K_{1l} \in C^{\nu}(D)$ for $l = 1,2$ and $D = I \times I$,
2. $K_{21} \in C^{\nu+1}(D)$ and $|\det(K_{21}(t,t)K_{12}(t,t))| \ge k_0 > 0$,
3. $f \in C^{\nu}(I)$, $g \in C^{\nu+1}(I)$ and $g_1(0) = 0$,
then the IAEs (1.1) possesses a unique solution $y, z \in C^{\nu}(I)$.

It is also possible to obtain the desired information on the regularity of the solutions in accordance with Theorem (8.1.6) of [2, p. 473]:
Theorem 2. Assume that the hypotheses given in Theorem 1 hold with $\nu \ge 0$. Then, for $K_{21}(t,t)K_{11}(t,s) + \dfrac{\partial K_{21}(t,s)}{\partial t} = 0$ on $D$, the unique solution of the IAEs (1.1) is given by the representation
\[
\begin{aligned}
y(t) &= f(t) + \int_0^t R_{11}(t,s) f(s)\,ds + k_{12}(t)\,g_1(t) + \int_0^t Q_{12}(t,s)\,g_1(s)\,ds, \\
z(t) &= k_{21}(t)\,g_1(t) - k^{-1} g_1'(t) + \int_0^t Q_{22}(t,s)\,g_1(s)\,ds,
\end{aligned}
\]
where
\[
k_{21} = -\frac{R_{22}(t,t)}{k}, \qquad
k_{12} = -\frac{2R_{11}(t,t) - R_{22}(t,t)}{k}, \qquad
Q_{22} = \frac{\partial}{\partial s}\,\frac{R_{22}(t,s)}{k}, \qquad
Q_{12} = \frac{\partial}{\partial s}\,\frac{2R_{11}(t,s) - R_{22}(t,s)}{k},
\]
and $R_{11}(t,s)$ denotes the resolvent kernel associated with the kernel $K_{11}(t,s)$ in (1.1).

Proof. The proof is mainly based on the proof of the corresponding theorem in [2] and we refrain from going into details. It suffices to take $k = K_{21}(t,t)K_{12}(t,t)$ and $H_{22}(t,s) = -k^{-1}\frac{\partial}{\partial t}\big(K_{21}(t,t)K_{12}(t,s)\big)$. Also, $R_{22}(t,s)$ is the resolvent kernel associated with the kernel $H_{22}(t,s)$. Under these conditions, the results of the theorem follow. □

To address the stability of the problem, we use the results of Gear [8], who defined the index of IAEs by considering the effect of perturbations of the equations on the solutions and showed that the differential index and the perturbation index are identical. Theorem 2 allows us to describe stability in the sense of the perturbation index as follows. The IAEs system (1.1) can be written in the compact form
\[
BY = G,
\tag{3.4}
\]
where $Y = (y,z)^T$, $G = (f,g)^T$ and
\[
B = \begin{pmatrix} I - \nu_{11} & -\nu_{12} \\ -\nu_{21} & 0 \end{pmatrix}.
\]
The corresponding perturbed system may be considered as
\[
B\tilde{Y} = G + \delta,
\]
with $\delta = (\delta_1,\delta_2)^T$. Differentiating the second equation of the perturbed system with respect to $t$ and substituting $\tilde{y}$ from the first equation yields
\[
B_1\tilde{Y} = G_1 + \bar{\delta},
\tag{3.5}
\]
where
\[
B_1 = \begin{pmatrix} I - \nu_{11} & -\nu_{12} \\ -V_1 & -V_2 \end{pmatrix}, \qquad
V_i = \int_0^t \mathcal{K}_i(t,s)\,ds, \quad i = 1,2,
\]
with
\[
\mathcal{K}_1(t,s) = K_{21}(t,t)K_{11}(t,s) + \frac{\partial K_{21}(t,s)}{\partial t}, \qquad
\mathcal{K}_2(t,s) = K_{21}(t,t)K_{12}(t,s),
\]
and
\[
\bar{\delta} = \big(\delta_1,\ K_{21}(t,t)\delta_1 + \delta_2'\big)^T, \qquad G_1 = (f, g_1)^T,
\]
with $g_1(t) = g'(t) + K_{21}(t,t) f(t)$.

Also, differentiating the second equation of (3.5) with respect to $t$ gives
\[
B_2\tilde{Y} = G_2 + \bar{\bar{\delta}},
\]
where
\[
B_2 = \begin{pmatrix} I - \nu_{11} & -\nu_{12} \\ -\bar{V}_1 & -K_{21}(t,t)K_{12}(t,t) - \bar{V}_2 \end{pmatrix}, \qquad
\bar{V}_i = \int_0^t \frac{\partial \mathcal{K}_i(t,s)}{\partial t}\,ds, \qquad G_2 = (f, g_2)^T,
\]
\[
\bar{\bar{\delta}} = \big(\delta_1,\ K_{21}'(t,t)\delta_1 + K_{21}(t,t)\delta_1' + \delta_2''\big)^T,
\]
and
\[
g_2(t) = g_1'(t) + \Big( K_{21}(t,t)K_{11}(t,t) + \frac{\partial K_{21}(t,t)}{\partial t} \Big)\tilde{y}(t).
\]
Furthermore, for the original system (3.4) we have
\[
B_2 Y = G_2.
\]
Applying Theorem 2, we can deduce that the operator $B_2$ is linear, bounded and bijective, so that $B_2^{-1}$ exists and is bounded. Then it follows from the above relations
\[
\|Y - \tilde{Y}\| \le C\,\|B_2^{-1}\|\,\big( \|\delta_1\| + \|\delta_1'\| + \|\delta_2''\| \big),
\]
where $C$ is a constant independent of $t$. This yields the stability estimate of the IAEs system (1.1) in the sense of the perturbation index.

4. The Jacobi collocation scheme

This section is devoted to applying the Jacobi collocation method to numerically solve the IAEs system of index-2. To do so, we consider a collocation method including the matrix–vector multiplication representation of the equations. Let $P_N(\Lambda)$ be the space of all polynomials of degree not exceeding $N$ on $\Lambda$, where $\Lambda$ stands for the open interval $(-1,1)$, and let $w^{\alpha,\beta}(x) = (1-x)^{\alpha}(1+x)^{\beta}$, for $\alpha,\beta > -1$, denote a weight function in the usual sense. It is well known that the set of Jacobi polynomials $\{J_N^{\alpha,\beta}\}_{N=0}^{\infty}$ forms a complete $L^2_{w^{\alpha,\beta}}(\Lambda)$-orthogonal system, where $L^2_{w^{\alpha,\beta}}(\Lambda)$ is the space of functions $f : [-1,1] \to \mathbb{R}$ with $\|f\|^2_{L^2_{w^{\alpha,\beta}}} < \infty$, and
\[
\|f\|^2_{L^2_{w^{\alpha,\beta}}} = \langle f, f\rangle_{L^2_{w^{\alpha,\beta}}} = \int_{-1}^{1} |f(x)|^2\,w^{\alpha,\beta}(x)\,dx.
\]
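So, any function $u(x)$ in the space $L^2_{w^{\alpha,\beta}}(\Lambda)$ admits the expansion $u(x) = \sum_{N=0}^{\infty} \tilde{u}_N J_N^{\alpha,\beta}(x)$ with
\[
\tilde{u}_N = \frac{1}{\|J_N^{\alpha,\beta}\|^2_{L^2_{w^{\alpha,\beta}}}} \int_{-1}^{1} u(\xi)\,J_N^{\alpha,\beta}(\xi)\,w^{\alpha,\beta}(\xi)\,d\xi.
\]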
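As a concrete illustration of the basis functions used above, the following minimal Python sketch evaluates $J_n^{\alpha,\beta}(x)$ by the classical three-term recurrence. It assumes the standard normalization $J_n^{\alpha,\beta}(1) = \Gamma(n+\alpha+1)/(n!\,\Gamma(\alpha+1))$, which the text does not state explicitly; the function name `jacobi` is purely illustrative.

```python
import math

def jacobi(n, alpha, beta, x):
    """Evaluate the Jacobi polynomial of degree n at x (alpha, beta > -1).

    Uses J_0 = 1, J_1 = (alpha - beta)/2 + (alpha + beta + 2) x / 2 and the
    classical three-term recurrence for higher degrees.
    """
    if n == 0:
        return 1.0
    p_prev = 1.0
    p_curr = 0.5 * (alpha - beta + (alpha + beta + 2.0) * x)
    for k in range(1, n):
        c = 2.0 * k + alpha + beta
        a1 = 2.0 * (k + 1) * (k + alpha + beta + 1.0) * c
        a2 = (c + 1.0) * (alpha * alpha - beta * beta)
        a3 = c * (c + 1.0) * (c + 2.0)
        a4 = 2.0 * (k + alpha) * (k + beta) * (c + 2.0)
        p_prev, p_curr = p_curr, ((a2 + a3 * x) * p_curr - a4 * p_prev) / a1
    return p_curr

# Legendre special case (alpha = beta = 0): J_2(x) = (3 x^2 - 1) / 2
legendre_check = jacobi(2, 0.0, 0.0, 0.3)
# Endpoint value for the parameters used later in the experiments
endpoint_value = jacobi(5, 0.5, 1.0 / 3.0, 1.0)
```

For $\alpha = \beta = 0$ the routine reduces to the Legendre polynomials, which gives a quick way to sanity-check it before using other parameter values.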
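Now, let $H^m_{w^{\alpha,\beta}}(\Lambda)$ denote the Sobolev space of all functions $u(x)$ on $\Lambda$ such that $u(x)$ and all its weak derivatives up to order $m$ are in $L^2_{w^{\alpha,\beta}}(\Lambda)$, with the norm and the semi-norm
\[
\|u\|^2_{H^m_{w^{\alpha,\beta}}(\Lambda)} = \sum_{k=0}^{m} \Big\| \frac{\partial^k u}{\partial x^k} \Big\|^2_{L^2_{w^{\alpha,\beta}}(\Lambda)},
\qquad
|u|_{H^{m,N}_{w^{\alpha,\beta}}(\Lambda)} = \Big( \sum_{j=\min(m,N+1)}^{m} \big\| u^{(j)} \big\|^2_{L^2_{w^{\alpha,\beta}}(\Lambda)} \Big)^{1/2}.
\tag{4.1}
\]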
Let $\Pi_N$ be the orthogonal projection operator from $L^2_{w^{\alpha,\beta}}(\Lambda)$ onto $P_N(\Lambda)$. This means that for any function $\varphi$ in $L^2_{w^{\alpha,\beta}}(\Lambda)$, $\Pi_N\varphi$ belongs to $P_N(\Lambda)$ and satisfies
\[
\int_{-1}^{1} (\varphi - \Pi_N\varphi)(\xi)\,\psi_N(\xi)\,w^{\alpha,\beta}(\xi)\,d\xi = 0, \qquad \forall \psi_N \in P_N(\Lambda).
\]
For any $u(x) \in C[-1,1]$, we can define the projection $I_N^{\alpha,\beta}$ as a Lagrange interpolating polynomial associated with the Jacobi polynomials (see e.g. [19])
\[
I_N^{\alpha,\beta} u(x) = \sum_{i=0}^{N} \tilde{u}_i J_i^{\alpha,\beta}(x) = \sum_{i=0}^{N} u(x_i) L_i(x),
\]
where the Lagrange polynomials $L_i(x)$ take the form
\[
L_i(x) = w_i \sum_{k=0}^{N} \frac{1}{\tilde{\gamma}_k}\,J_k^{\alpha,\beta}(x_i)\,J_k^{\alpha,\beta}(x) \qquad (i = 0,1,\dots,N)
\]
and the collocation points $\{x_i\}_{i=0}^{N}$ represent the Jacobi–Gauss quadrature points used to compute the discrete expansion coefficients $\tilde{u}_k$,
\[
\tilde{u}_k = \frac{1}{\tilde{\gamma}_k} \sum_{i=0}^{N} u(x_i)\,J_k^{\alpha,\beta}(x_i)\,w_i,
\tag{4.2}
\]
where $\tilde{\gamma}_k$ and $w_i$ are given in [11, p. 231]. Let us now turn our attention to the application of the Jacobi collocation method to the following linear semi-explicit IAEs system:
\[
A(t)X(t) = G(t) + \int_0^t K(t,s)X(s)\,ds, \qquad t \in [0,T]
\tag{4.3}
\]
where $X : [0,T] \to \mathbb{R}^d$, $G : [0,T] \to \mathbb{R}^d$, $K \in L(\mathbb{R}^d)$ are continuous, $A(t) = \operatorname{diag}(I_{d_1}, O_{d_2}) \in L(\mathbb{R}^d)$ is a singular block matrix, $X(t) = \{x_i(t)\}_{i=1}^{d}$ and $K(t,s) = \{k_{ij}(t,s)\}_{i,j=1}^{d}$. To use the theory of orthogonal Jacobi polynomials, we consider the change of variables
\[
s = \frac{T}{2}(\eta + 1), \quad -1 \le \eta \le \tau, \qquad t = \frac{T}{2}(\tau + 1), \quad -1 \le \tau \le 1
\tag{4.4}
\]
to rewrite the problem (4.3) as follows
\[
A(\tau)\hat{X}(\tau) = \hat{G}(\tau) + \int_{-1}^{\tau} \hat{K}(\tau,\eta)\hat{X}(\eta)\,d\eta, \qquad \tau \in [-1,1]
\tag{4.5}
\]
where
\[
\hat{G}(\tau) = G\Big(\frac{T}{2}(\tau+1)\Big), \qquad
\hat{X}(\tau) = X\Big(\frac{T}{2}(\tau+1)\Big), \qquad
\hat{K}(\tau,\eta) = \frac{T}{2}\,K\Big(\frac{T}{2}(\tau+1),\ \frac{T}{2}(\eta+1)\Big).
\]
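The effect of the change of variables (4.4) can be checked numerically. The sketch below is an illustration only: the scalar kernel, the function $X$, the value of $T$ and the helper names (`simpson`, `to_t`, `K_hat`, `X_hat`) are hypothetical choices, not taken from the paper. It verifies that the Volterra integral over $[0,t]$ equals the transformed integral over $[-1,\tau]$ once the factor $T/2$ is absorbed into the kernel.

```python
import math

def simpson(func, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

T = 1.5  # hypothetical end point of the interval [0, T]

def K(t, s):
    """Hypothetical smooth scalar kernel."""
    return math.exp(t + s)

def X(s):
    """Hypothetical solution component."""
    return math.sin(s)

def to_t(tau):
    """Map tau in [-1, 1] to t in [0, T], as in (4.4)."""
    return 0.5 * T * (tau + 1.0)

def K_hat(tau, eta):
    """Transformed kernel, including the Jacobian factor T/2."""
    return 0.5 * T * K(to_t(tau), to_t(eta))

def X_hat(eta):
    return X(to_t(eta))

tau = 0.4
t = to_t(tau)
original = simpson(lambda s: K(t, s) * X(s), 0.0, t)
transformed = simpson(lambda eta: K_hat(tau, eta) * X_hat(eta), -1.0, tau)
```

Both quadratures use the same nodes up to the affine map, so `original` and `transformed` agree to rounding error.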
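Approximating $\hat{K}(\tau_m,\eta)$ with Jacobi polynomials gives
\[
\hat{K}_N(\tau_m,\eta) = \big\{\hat{k}_{ij}(\tau_m,\eta)\big\}_{i,j=1}^{d}
= \Big\{ \sum_{k=0}^{N} (\hat{k}_{ij})_{mk}\,J_k^{\alpha,\beta}(\eta) \Big\}_{i,j=1}^{d}
= \big\{(\hat{\mathbf{k}}_{ij})_m\big\}_{i,j=1}^{d} \otimes VW,
\tag{4.6}
\]
where $V$ is a lower triangular coefficient matrix of the Jacobi polynomials with $\{J_i^{\alpha,\beta}(\eta)\}_{i=0}^{N} = VW$, and
\[
W = \big(1, \eta, \dots, \eta^N\big)^T, \qquad
(\hat{\mathbf{k}}_{ij})_m = \big((\hat{k}_{ij})_{m0}, \dots, (\hat{k}_{ij})_{mN}\big),
\]
\[
(\hat{k}_{ij})_{mk} = \frac{1}{\|J_k^{\alpha,\beta}\|^2_{L^2_{w^{\alpha,\beta}}}} \int_{-1}^{1} \hat{k}_{ij}(\tau_m,\xi)\,J_k^{\alpha,\beta}(\xi)\,w^{\alpha,\beta}(\xi)\,d\xi \qquad (k = 0,1,\dots,N)
\]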
and $\otimes$ is a Kronecker product. Consequently, we seek a solution
\[
I_N^{\alpha,\beta}\hat{X}(\eta) = \hat{X}_N(\eta)
= \Big\{ \sum_{k=0}^{N} (\tilde{x}_i)_k\,J_k^{\alpha,\beta}(\eta) \Big\}_{i=1}^{d}
= \Big\{ \sum_{k=0}^{N} \hat{x}_i(\tau_k)\,L_k(\eta) \Big\}_{i=1}^{d}
= \{\hat{\mathbf{x}}_i\}_{i=1}^{d} \otimes \bar{V}W,
\tag{4.7}
\]
where $\bar{V} = JV$ with $J = \big\{\frac{1}{\tilde{\gamma}_j}\,w_k\,J_j^{\alpha,\beta}(\tau_k)\big\}_{k,j=0}^{N}$, $\hat{\mathbf{x}}_i = \big(\hat{x}_i(\tau_0), \hat{x}_i(\tau_1), \dots, \hat{x}_i(\tau_N)\big)$, $\{\tau_k\}_{k=0}^{N}$ are the Gauss–Jacobi collocation points (i.e. the zeros of $J_{N+1}^{\alpha,\beta}$), the coefficients $\{(\tilde{x}_i)_k\}_{k=0}^{N}$ are also given by (4.2) and $\{L_k\}_{k=0}^{N}$ are the interpolating Lagrange polynomials. Inserting the collocation points $\{\tau_m\}_{m=0}^{N}$ in (4.5), we obtain the collocation equation
\[
A(\tau_m)\hat{X}(\tau_m) = \hat{G}(\tau_m) + \int_{-1}^{\tau_m} \hat{K}_N(\tau_m,\eta)\hat{X}_N(\eta)\,d\eta \qquad (m = 0,1,\dots,N)
\tag{4.8}
\]
and using (4.6), (4.7) and (4.8), we get
\[
A(\tau_m)\{\hat{\mathbf{x}}_i\}_{i=1}^{d} = \hat{G}(\tau_m) + \int_{-1}^{\tau_m} \Big( \big\{(\hat{\mathbf{k}}_{ij})_m\big\}_{i,j=1}^{d} \otimes VW \Big)\Big( \{\hat{\mathbf{x}}_i\}_{i=1}^{d} \otimes \bar{V}W \Big)\,d\eta \qquad (m = 0,1,\dots,N).
\tag{4.9}
\]
At this point, we give the following lemma, which will be instrumental in establishing the matrix–vector multiplication representation of the product $\hat{K}_N(\tau_m,\eta)\hat{X}_N(\eta)$ in (4.8):

Lemma 1. Let $P(\eta) = \sum_{i=0}^{N} p_i\eta^i$ and $Q(\eta) = \sum_{j=0}^{N} q_j\eta^j$ be two given polynomials. Then
\[
P(\eta)Q(\eta) = \mathbf{P}(\mathbf{Q} \otimes \mathbf{M})\mathbf{W},
\]
where $\mathbf{P} = (p_0, p_1, \dots, p_N)$, $\mathbf{Q} = (q_0, q_1, \dots, q_N)$, $\mathbf{W} = (1, \eta, \eta^2, \dots, \eta^{2N})^T$, and $\mathbf{M}$ is a block sparse matrix of the form $\mathbf{M} = (M_N^{(k)})_{k=0}^{N}$ with
\[
M_j^{(k)} = \Big( \underbrace{0_{(N+1)\times k}}_{k} \ \Big|\ \underbrace{I_{N+1}}_{N+1} \ \Big|\ \underbrace{0_{(N+1)\times(j-k)}}_{j-k} \Big)_{(N+1)\times(N+1+j)},
\]
i.e. the identity matrix $I_{N+1}$ preceded by $k$ zero columns and followed by $j-k$ zero columns.

Proof. First, we show that
\[
\eta^m \sum_{i=0}^{N} p_i\eta^i = \mathbf{P}M_m^{(m)}W_m,
\]
where $W_m = (1, \eta, \eta^2, \dots, \eta^{N+m})^T$. The validity of the equation for $m = 0$ is obvious. We proceed by induction: assume the equation holds for $m = k$ and pass to $m = k+1$ as follows
\[
\eta^{k+1}\sum_{i=0}^{N} p_i\eta^i = \eta^k \sum_{i=0}^{N} (p_i\eta)\eta^i = \{p_i\eta\}_{i=0}^{N} M_k^{(k)} W_k = \mathbf{P}M_k^{(k)}\big(\eta, \eta^2, \dots, \eta^{k+N+1}\big)^T.
\]
With some simple manipulations we conclude
\[
M_k^{(k)}\big(\eta, \eta^2, \dots, \eta^{k+N+1}\big)^T = M_{k+1}^{(k+1)}\big(1, \eta, \dots, \eta^{k+N+1}\big)^T = M_{k+1}^{(k+1)}W_{k+1}.
\]
From these relations, we can write
\[
\eta^{k+1}\sum_{i=0}^{N} p_i\eta^i = \mathbf{P}M_{k+1}^{(k+1)}W_{k+1},
\]
and this proves the identity for $m = k+1$. Now, consider the multiplication
\[
P(\eta)Q(\eta) = \sum_{j=0}^{N} q_j \Big( \sum_{i=0}^{N} p_i\eta^i \Big)\eta^j = \sum_{j=0}^{N} q_j\,\mathbf{P}M_j^{(j)}W_j.
\]
Due to the structure of $\mathbf{M} = (M_N^{(k)})_{k=0}^{N}$, we have
\[
M_N^{(j)}W_N = M_j^{(j)}W_j \qquad (j = 0,1,\dots,N).
\]
This leads to the following relation
\[
P(\eta)Q(\eta) = \mathbf{P}\Big( \sum_{j=0}^{N} q_j M_N^{(j)} \Big)W_N = \mathbf{P}(\mathbf{Q} \otimes \mathbf{M})\mathbf{W}. \qquad \Box
\]
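Lemma 1 can be illustrated with a few lines of Python. The sketch below is a minimal illustration under one stated assumption: the product $\mathbf{Q} \otimes \mathbf{M}$ is read as the weighted sum $\sum_{j} q_j M_N^{(j)}$ that appears in the last line of the proof. The helper names (`shift_block`, `q_kron_m`, `product_coeffs`) are illustrative and not part of the paper.

```python
def shift_block(N, k):
    """M_N^{(k)}: the (N+1) x (2N+1) matrix [0 | I_{N+1} | 0] with k leading zero columns."""
    return [[1 if col == row + k else 0 for col in range(2 * N + 1)]
            for row in range(N + 1)]

def q_kron_m(q):
    """Q (x) M, read here as sum_j q_j * M_N^{(j)} (an (N+1) x (2N+1) matrix)."""
    N = len(q) - 1
    out = [[0] * (2 * N + 1) for _ in range(N + 1)]
    for j, qj in enumerate(q):
        block = shift_block(N, j)
        for r in range(N + 1):
            for c in range(2 * N + 1):
                out[r][c] += qj * block[r][c]
    return out

def product_coeffs(p, q):
    """Monomial coefficients of P(eta) Q(eta), computed as the row vector P (Q (x) M)."""
    mat = q_kron_m(q)
    return [sum(p[r] * mat[r][c] for r in range(len(p)))
            for c in range(len(mat[0]))]

p = [1, 2, 3]    # P(eta) = 1 + 2 eta + 3 eta^2
q = [4, 0, -1]   # Q(eta) = 4 - eta^2
coeffs = product_coeffs(p, q)   # coefficients of 1, eta, ..., eta^4
```

Multiplying `coeffs` by $\mathbf{W} = (1, \eta, \dots, \eta^{2N})^T$ recovers $P(\eta)Q(\eta)$; the point of the lemma is that the product becomes linear in the monomial vector, which can then be integrated entry by entry.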
Algorithm 1. The construction of the Jacobi collocation method including the matrix–vector multiplication representation of the equations

Step 1. Choose $N$, form the orthogonal bases $J_i^{\alpha,\beta}(\eta)$ $(i = 0,1,\dots,N)$ and compute the nonsingular coefficient matrices $V$ and $\bar{V}$ as $\{J_i^{\alpha,\beta}(\eta)\}_{i=0}^{N} = VW$ and $\bar{V} = JV$, where $J = \big\{\frac{1}{\tilde{\gamma}_j}\,w_k\,J_j^{\alpha,\beta}(\tau_k)\big\}_{k,j=0}^{N}$ and $W = (1, \eta, \dots, \eta^N)^T$.
Step 2. Compute the approximations of $\hat{K}(\tau_m,\eta)$ and $\hat{X}(\eta)$ using Jacobi polynomials from (4.6) and (4.7).
Step 3. Compute the matrix $\mathbf{M}$ from Lemma 1 and take $\mathbf{W}_m = \int_{-1}^{\tau_m} \mathbf{W}\,d\eta$ with $\mathbf{W} = (1, \eta, \eta^2, \dots, \eta^{2N})^T$.
Step 4. Solve the linear system (4.11) and obtain the entries of the vector solution $\{\hat{\mathbf{x}}_i\}_{i=1}^{d}$.
Step 5. Set $\hat{X}_N(\eta) = \{\hat{\mathbf{x}}_i\}_{i=1}^{d} \otimes (\bar{V}W)$.
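Step 3 of the algorithm only requires integrals of monomials, which are available in closed form: the $k$-th entry of $\mathbf{W}_m$ is $(\tau_m^{k+1} - (-1)^{k+1})/(k+1)$. The following short Python sketch illustrates this; the function names are hypothetical, and the collocation point is passed in as a plain number rather than computed as a Gauss–Jacobi node.

```python
def integrated_monomials(tau, N):
    """Entries of W_m = int_{-1}^{tau} (1, eta, ..., eta^{2N})^T d(eta), in closed form."""
    return [(tau ** (k + 1) - (-1.0) ** (k + 1)) / (k + 1)
            for k in range(2 * N + 1)]

def integrate_polynomial(coeffs, tau):
    """int_{-1}^{tau} sum_k coeffs[k] eta^k d(eta), as the dot product coeffs . W_m.

    coeffs must have odd length 2N + 1, matching the vector W of Lemma 1.
    """
    N = (len(coeffs) - 1) // 2
    w_m = integrated_monomials(tau, N)
    return sum(c * w for c, w in zip(coeffs, w_m))

w_full = integrated_monomials(1.0, 2)                  # integrals over the whole interval
value = integrate_polynomial([0.0, 0.0, 3.0], 0.5)     # int_{-1}^{0.5} 3 eta^2 d(eta)
```

Because the integration acts only on the monomial vector, the integral in the collocation equations reduces to a matrix–vector product with precomputed data.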
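By Lemma 1 and relations (4.6) and (4.7), the matrix–vector multiplication representation of the product $\hat{K}_N(\tau_m,\eta)\hat{X}_N(\eta)$ can be obtained as follows:
\[
\begin{aligned}
\hat{K}_N(\tau_m,\eta)\hat{X}_N(\eta)
&= \Big\{ \sum_{j=1}^{d} \hat{k}_{ij}(\tau_m,\eta)\,\hat{x}_j(\eta) \Big\}_{i=1}^{d} \\
&= \Big\{ \sum_{j=1}^{d} \big((\hat{\mathbf{k}}_{ij})_m VW\big)\big(\hat{\mathbf{x}}_j \bar{V}W\big) \Big\}_{i=1}^{d} \\
&= \Big\{ \sum_{j=1}^{d} \Big( \sum_{l=0}^{N} \big((\hat{\mathbf{k}}_{ij})_m V\big)_l\,\eta^l \Big)\Big( \sum_{l=0}^{N} \big(\hat{\mathbf{x}}_j \bar{V}\big)_l\,\eta^l \Big) \Big\}_{i=1}^{d} \\
&= \Big\{ \sum_{j=1}^{d} \big((\hat{\mathbf{k}}_{ij})_m V\big)\big((\hat{\mathbf{x}}_j \bar{V}) \otimes \mathbf{M}\big)\mathbf{W} \Big\}_{i=1}^{d},
\end{aligned}
\tag{4.10}
\]
where $\mathbf{M} = (M_N^{(l)})_{l=0}^{N}$ is the block matrix defined in Lemma 1. Inserting (4.10) in (4.9), we obtain
\[
\begin{aligned}
A(\tau_m)\{\hat{\mathbf{x}}_i\}_{i=1}^{d}
&= \hat{G}(\tau_m) + \int_{-1}^{\tau_m} \Big\{ \sum_{j=1}^{d} \big((\hat{\mathbf{k}}_{ij})_m V\big)\big((\hat{\mathbf{x}}_j \bar{V}) \otimes \mathbf{M}\big)\mathbf{W} \Big\}_{i=1}^{d}\,d\eta \\
&= \hat{G}(\tau_m) + \Big\{ \sum_{j=1}^{d} \big((\hat{\mathbf{k}}_{ij})_m V\big)\big((\hat{\mathbf{x}}_j \bar{V}) \otimes \mathbf{M}\big)\mathbf{W}_m \Big\}_{i=1}^{d} \qquad (m = 0,1,\dots,N)
\end{aligned}
\tag{4.11}
\]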
where $\mathbf{W}_m = \int_{-1}^{\tau_m} \mathbf{W}\,d\eta$. Finally, we end up with a system of equations whose solution gives the unknown vectors $\{\hat{\mathbf{x}}_i\}_{i=1}^{d}$. Algorithm 1 summarizes the proposed Jacobi collocation method.

5. Convergence analysis

In this section, we provide an error analysis which theoretically justifies the spectral rate of convergence of the proposed method. In order to describe the key ideas without having to resort to complex notation involving Kronecker products of matrices and vectors, we consider the index-2 IAEs system (4.3) with $d_1 = d_2 = 1$. Our strategy is mainly based on the ideas in [20], together with the following auxiliary lemmas from [6] and [7]:

Lemma 2. (See [6].) Let $\phi \in P_N$, where $P_N$ denotes the space of all polynomials of degree not exceeding $N$. Then for any integer $r \ge 1$ and $0 \le p \le \infty$, there exists a positive constant $C$ independent of $N$ such that
\[
\big\|\phi^{(r)}\big\|_{L^p_{w^{\alpha,\beta}}(-1,1)} \le C N^{2r}\,\|\phi\|_{L^p_{w^{\alpha,\beta}}(-1,1)}.
\]
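Lemma 3. (See [6].) Assume that $u \in H^m_{w^{\alpha,\beta}}(\Lambda)$, let $P_N u = \sum_{k=0}^{N} \hat{u}_k J_k^{\alpha,\beta}$ be the truncated orthogonal Jacobi series of $u$, and let $I_N^{\alpha,\beta} u$ be the interpolant of $u$ at any of the three families of Jacobi–Gauss points (Gauss, Gauss–Radau or Gauss–Lobatto). Then for $\Lambda = (-1,1)$, the following estimates hold:
\[
\|u - P_N u\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} \le C N^{-m}\,|u|_{H^{m,N}_{w^{\alpha,\beta}}(\Lambda)},
\]
\[
\big\|u - I_N^{\alpha,\beta} u\big\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} \le C N^{-m}\,|u|_{H^{m,N}_{w^{\alpha,\beta}}(\Lambda)},
\tag{5.1}
\]
\[
\big\|\big(u - I_N^{\alpha,\beta} u\big)'\big\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} \le C N^{1-m}\,|u|_{H^{m,N}_{w^{\alpha,\beta}}(\Lambda)}.
\tag{5.2}
\]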
We are now ready to prove the following main theorem, which gives the convergence properties of the presented scheme:

Theorem 3. Consider the IAEs of index-2 (4.3) and its transformed system (4.5) with $A(\tau) = \operatorname{diag}(1,0)$, $\hat{G}(\tau) = (\hat{f}(\tau), \hat{g}(\tau))^T$ and $\hat{X}(\tau) = (\hat{y}, \hat{z})^T$. Assume that the hypotheses given in Theorem 1 hold with $\nu \ge 0$. If $\hat{X}_N = (\hat{y}_N, \hat{z}_N)^T$ is an approximate solution of the proposed Jacobi collocation scheme with the Gauss–Jacobi collocation points, then the Jacobi collocation approximation errors $\hat{y} - \hat{y}_N$ and $\hat{z} - \hat{z}_N$ for $m \ge 1$ satisfy:
\[
\|\hat{y} - \hat{y}_N\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} \le
\begin{cases}
O\big(N^{2-m}\log N\big), & -1 < \alpha,\beta \le -\tfrac12, \\
O\big(N^{\frac52+\gamma-m}\big), & \text{otherwise},
\end{cases}
\]
\[
\|\hat{z} - \hat{z}_N\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} \le
\begin{cases}
O\big(N^{4-2m}(\log N)^2\big), & -1 < \alpha,\beta \le -\tfrac12, \\
O\big(N^{5+2\gamma-2m}\big), & \text{otherwise},
\end{cases}
\]
where $\hat{y}, \hat{z} \in H^m(\Lambda)$, $\Lambda = (-1,1)$ and $\gamma = \max(\alpha,\beta)$.

Proof. The strategy we shall follow is to determine the error estimates for $\|\hat{y} - \hat{y}_N\|$ and then $\|\hat{z} - \hat{z}_N\|$. The spectral collocation solution $\hat{X}_N$ is considered as an interpolating polynomial of degree $N$ on the interval $(-1,1)$. It is defined through the collocation equation (4.11)
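\[
\hat{y}(\tau_i) = \hat{f}(\tau_i) + D_1(\tau_i) + D_2(\tau_i) \qquad (i = 1,2,\dots,N)
\tag{5.3}
\]
where
\[
D_1(\tau_i) = \big((\hat{\mathbf{k}}_{11})_m V\big)\big((\hat{\mathbf{y}}\bar{V}) \otimes \mathbf{M}\big)\mathbf{W}_m, \qquad
D_2(\tau_i) = \big((\hat{\mathbf{k}}_{12})_m V\big)\big((\hat{\mathbf{z}}\bar{V}) \otimes \mathbf{M}\big)\mathbf{W}_m.
\]
We rewrite Eq. (5.3) in the form:
\[
\hat{y}(\tau_i) = \hat{f}(\tau_i) + \int_{-1}^{\tau_i} K_{11}(\tau_i,\eta)\,\hat{y}_N(\eta)\,d\eta + \int_{-1}^{\tau_i} K_{12}(\tau_i,\eta)\,\hat{z}_N(\eta)\,d\eta + q_1(\tau_i) + q_2(\tau_i),
\tag{5.4}
\]
such that
\[
q_1(\tau_i) = D_1(\tau_i) - \int_{-1}^{\tau_i} K_{11}(\tau_i,\eta)\,\hat{y}_N(\eta)\,d\eta, \qquad
q_2(\tau_i) = D_2(\tau_i) - \int_{-1}^{\tau_i} K_{12}(\tau_i,\eta)\,\hat{z}_N(\eta)\,d\eta,
\]
and $\hat{y}_N$, $\hat{z}_N$ are the approximations of $\hat{y}$, $\hat{z}$ obtained with the Lagrange interpolating polynomials $L_k$:
\[
I_N^{\alpha,\beta}\hat{y}(\tau) = \hat{y}_N(\tau) = \sum_{k=0}^{N} \hat{y}_k L_k(\tau), \qquad
I_N^{\alpha,\beta}\hat{z}(\tau) = \hat{z}_N(\tau) = \sum_{k=0}^{N} \hat{z}_k L_k(\tau),
\]
where $\hat{y}_k = \hat{y}(\tau_k)$ and $\hat{z}_k = \hat{z}(\tau_k)$. It follows that
\[
\begin{aligned}
\hat{y}(\tau_i) ={}& \hat{f}(\tau_i) + \int_{-1}^{\tau_i} K_{11}(\tau_i,\eta)\,e(\eta)\,d\eta + \int_{-1}^{\tau_i} K_{11}(\tau_i,\eta)\,\hat{y}(\eta)\,d\eta \\
& + \int_{-1}^{\tau_i} K_{12}(\tau_i,\eta)\,\varepsilon(\eta)\,d\eta + \int_{-1}^{\tau_i} K_{12}(\tau_i,\eta)\,\hat{z}(\eta)\,d\eta + q_1(\tau_i) + q_2(\tau_i),
\end{aligned}
\tag{5.5}
\]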
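where $e(s) = \hat{y}_N(s) - \hat{y}(s)$ and $\varepsilon(s) = \hat{z}_N(s) - \hat{z}(s)$. Multiplying the $j$-th equation of (5.5) by $L_j(\tau)$ and summing up over $j$ from $0$ to $N$, we get
\[
\begin{aligned}
\hat{y}_N(\tau) ={}& I_N^{\alpha,\beta}\hat{f}(\tau) + I_N^{\alpha,\beta}\Big( \int_{-1}^{\tau} K_{11}(\tau,\eta)\,\hat{y}(\eta)\,d\eta \Big) + I_N^{\alpha,\beta}\Big( \int_{-1}^{\tau} K_{11}(\tau,\eta)\,e(\eta)\,d\eta \Big) \\
& + I_N^{\alpha,\beta}\Big( \int_{-1}^{\tau} K_{12}(\tau,\eta)\,\hat{z}(\eta)\,d\eta \Big) + I_N^{\alpha,\beta}\Big( \int_{-1}^{\tau} K_{12}(\tau,\eta)\,\varepsilon(\eta)\,d\eta \Big) + I_N^{\alpha,\beta} q_1(\tau) + I_N^{\alpha,\beta} q_2(\tau).
\end{aligned}
\tag{5.6}
\]
From the first equation of (4.5), we may write
\[
I_N^{\alpha,\beta}\Big( \int_{-1}^{\tau} K_{11}(\tau,\eta)\,\hat{y}(\eta)\,d\eta + \int_{-1}^{\tau} K_{12}(\tau,\eta)\,\hat{z}(\eta)\,d\eta \Big) = I_N^{\alpha,\beta}\big( \hat{y}(\tau) - \hat{f}(\tau) \big),
\]
and replacing the above relation in (5.6), we get
\[
0 = \int_{-1}^{\tau} K_{11}(\tau,\eta)\,e(\eta)\,d\eta + \int_{-1}^{\tau} K_{12}(\tau,\eta)\,\varepsilon(\eta)\,d\eta + I_N^{\alpha,\beta} q_1(\tau) + I_N^{\alpha,\beta} q_2(\tau) + P_1 + P_2,
\tag{5.7}
\]
where
\[
P_1 = I_N^{\alpha,\beta}\Big( \int_{-1}^{\tau} K_{11}(\tau,\eta)\,e(\eta)\,d\eta \Big) - \int_{-1}^{\tau} K_{11}(\tau,\eta)\,e(\eta)\,d\eta, \qquad
P_2 = I_N^{\alpha,\beta}\Big( \int_{-1}^{\tau} K_{12}(\tau,\eta)\,\varepsilon(\eta)\,d\eta \Big) - \int_{-1}^{\tau} K_{12}(\tau,\eta)\,\varepsilon(\eta)\,d\eta.
\]
Differentiating (5.7) with respect to $\tau$ yields:
\[
-K_{11}(\tau,\tau)\,e(\tau) - K_{12}(\tau,\tau)\,\varepsilon(\tau) = \int_{-1}^{\tau} \frac{\partial K_{11}(\tau,\eta)}{\partial\tau}\,e(\eta)\,d\eta + \int_{-1}^{\tau} \frac{\partial K_{12}(\tau,\eta)}{\partial\tau}\,\varepsilon(\eta)\,d\eta + \big(I_N^{\alpha,\beta} q_1(\tau)\big)' + \big(I_N^{\alpha,\beta} q_2(\tau)\big)' + P_1' + P_2'.
\tag{5.8}
\]
In order to investigate the convergence of the second equation of (4.5), we apply the matrix–vector multiplication representation (4.11) to this equation. So, we have
\[
0 = \hat{g}(\tau_i) + D_3(\tau_i) \qquad (i = 1,2,\dots,N)
\tag{5.9}
\]
where $D_3(\tau_i)$ can be written as:
\[
D_3(\tau_i) = \big((\hat{\mathbf{k}}_{21})_m V\big)\big((\hat{\mathbf{y}}\bar{V}) \otimes \mathbf{M}\big)\mathbf{W}_m.
\]
Using a similar procedure as outlined in the first part, Eq. (5.9) gives rise to
\[
0 = \int_{-1}^{\tau} K_{21}(\tau,\eta)\,e(\eta)\,d\eta + I_N^{\alpha,\beta} q_3(\tau) + P_3,
\tag{5.10}
\]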
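where $P_3$ and $q_3(\tau_i)$ are similarly defined as:
\[
P_3 = I_N^{\alpha,\beta}\Big( \int_{-1}^{\tau} K_{21}(\tau,\eta)\,e(\eta)\,d\eta \Big) - \int_{-1}^{\tau} K_{21}(\tau,\eta)\,e(\eta)\,d\eta, \qquad
q_3(\tau_i) = D_3(\tau_i) - \int_{-1}^{\tau_i} K_{21}(\tau_i,\eta)\,\hat{y}_N(\eta)\,d\eta.
\]
Also, differentiating (5.10) with respect to $\tau$ gives:
\[
-K_{21}(\tau,\tau)\,e(\tau) = \int_{-1}^{\tau} \frac{\partial K_{21}(\tau,\eta)}{\partial\tau}\,e(\eta)\,d\eta + \big(I_N^{\alpha,\beta} q_3(\tau)\big)' + P_3'.
\tag{5.11}
\]
Now, let $E(\tau) = (e(\tau), \varepsilon(\tau))^T$; then Eqs. (5.8) and (5.11) can be rewritten in matrix notation
\[
\mathbf{H}E(\tau) = \int_{-1}^{\tau} \tilde{\mathbf{K}}(\tau,\eta)E(\eta)\,d\eta + \mathbf{D},
\tag{5.12}
\]
with
\[
\mathbf{H} = \begin{pmatrix} -K_{11}(\tau,\tau) & -K_{12}(\tau,\tau) \\ -K_{21}(\tau,\tau) & 0 \end{pmatrix}, \qquad
\tilde{\mathbf{K}}(\tau,\eta) = \begin{pmatrix} \dfrac{\partial K_{11}(\tau,\eta)}{\partial\tau} & \dfrac{\partial K_{12}(\tau,\eta)}{\partial\tau} \\[8pt] \dfrac{\partial K_{21}(\tau,\eta)}{\partial\tau} & 0 \end{pmatrix},
\]
and
\[
\mathbf{D} = \begin{pmatrix} \big(I_N^{\alpha,\beta} q_1(\tau)\big)' + \big(I_N^{\alpha,\beta} q_2(\tau)\big)' + P_1' + P_2' \\[4pt] \big(I_N^{\alpha,\beta} q_3(\tau)\big)' + P_3' \end{pmatrix}.
\]
Due to the assumptions of Theorem 1, we have
\[
|K_{21}(t,t)K_{12}(t,t)| > 0, \qquad \forall t \in I,
\]
which shows that $\mathbf{H}$ is an invertible matrix with inverse of the form:
\[
\mathbf{H}^{-1} = \begin{pmatrix} 0 & -K_{21}^{-1}(\tau,\tau) \\ -K_{12}^{-1}(\tau,\tau) & K_{12}^{-1}(\tau,\tau)\big(K_{11}(\tau,\tau)K_{21}^{-1}(\tau,\tau)\big) \end{pmatrix}.
\]
Multiplying (5.12) by $\mathbf{H}^{-1}$ and using Gronwall’s inequality (see e.g. Lemma 3.4 from [20]), we have
\[
\|E\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} \le C\,\|F\|_{L^2_{w^{\alpha,\beta}}(\Lambda)},
\tag{5.13}
\]
where $F = \mathbf{H}^{-1}\mathbf{D}$. Then it follows from Lemma 2: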
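\[
\big\|\big(I_N^{\alpha,\beta} q_1(\tau)\big)'\big\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} \le C N^2\,\big\|I_N^{\alpha,\beta} q_1(\tau)\big\|_{L^2_{w^{\alpha,\beta}}(\Lambda)},
\tag{5.14}
\]
indeed
\[
\big\|I_N^{\alpha,\beta} q_1(\tau)\big\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} \le \max_{0 \le i \le N} |q_1(\tau_i)|\ \max_{\tau \in \Lambda} \sum_{i=0}^{N} |L_i(\tau)|.
\tag{5.15}
\]
Applying the Cauchy–Schwarz inequality (see e.g. [6]), we can write:
\[
|q_1(\tau_i)| = \Big| \int_{-1}^{\tau_i} \big( (K_{11})_N(\tau_i,\eta) - K_{11}(\tau_i,\eta) \big)\,\hat{y}_N(\eta)\,d\eta \Big|
\le \big\| (K_{11})_N(\tau_i,\eta) - K_{11}(\tau_i,\eta) \big\|_{L^2_{w^{\alpha,\beta}}(\Lambda)}\,\|\hat{y}_N\|_{L^2_{w^{\alpha,\beta}}(\Lambda)},
\]
and using (5.1) from Lemma 3, we have
\[
|q_1(\tau_i)| \le C N^{-m}\,|K_{11}(\tau_i,\eta)|_{H^{m,N}_{w^{\alpha,\beta}}(\Lambda)}\,\|\hat{y}_N\|_{L^2_{w^{\alpha,\beta}}(\Lambda)}.
\tag{5.16}
\]
At this point, we make use of the result of Chen and Tang [7] and also [18], which gives the Lebesgue constant for the Lagrange interpolating polynomials associated with the nodes of the Jacobi polynomials. As stated in Lemma 3.4 of [7], the following relation for (5.15) holds
\[
\max_{\eta \in [-1,1]} \sum_{i=0}^{N} |L_i(\eta)| =
\begin{cases}
O(\log N), & -1 < \alpha,\beta \le -\tfrac12, \\
O\big(N^{\gamma+\frac12}\big), \quad \gamma = \max(\alpha,\beta), & \text{otherwise},
\end{cases}
\]
so, we have
\[
\big\|I_N^{\alpha,\beta} q_1(\tau)\big\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} \le
\begin{cases}
C N^{-m}\log N\,\Theta_{11}, & -1 < \alpha,\beta \le -\tfrac12, \\
C N^{\frac12+\gamma-m}\,\Theta_{11}, & \text{otherwise},
\end{cases}
\]
where $\Theta_{11} = \max_{0 \le i \le N} |K_{11}(\tau_i,\eta)|_{H^{m,N}_{w^{\alpha,\beta}}(\Lambda)}\,\|\hat{y}_N\|_{L^2_{w^{\alpha,\beta}}(\Lambda)}$ and $\gamma = \max(\alpha,\beta)$. Now, by using this relation for (5.14), we have
\[
\big\|\big(I_N^{\alpha,\beta} q_1(\tau)\big)'\big\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} \le
\begin{cases}
C N^{2-m}\log N\,\Theta_{11}, & -1 < \alpha,\beta \le -\tfrac12, \\
C N^{\frac52+\gamma-m}\,\Theta_{11}, & \text{otherwise},
\end{cases}
\]
and similarly
\[
\big\|\big(I_N^{\alpha,\beta} q_2(\tau)\big)'\big\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} \le
\begin{cases}
C N^{2-m}\log N\,\Theta_{12}, & -1 < \alpha,\beta \le -\tfrac12, \\
C N^{\frac52+\gamma-m}\,\Theta_{12}, & \text{otherwise},
\end{cases}
\qquad
\big\|\big(I_N^{\alpha,\beta} q_3(\tau)\big)'\big\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} \le
\begin{cases}
C N^{2-m}\log N\,\Theta_{21}, & -1 < \alpha,\beta \le -\tfrac12, \\
C N^{\frac52+\gamma-m}\,\Theta_{21}, & \text{otherwise},
\end{cases}
\]
where
\[
\Theta_{12} = \max_{0 \le i \le N} |K_{12}(\tau_i,\eta)|_{H^{m,N}_{w^{\alpha,\beta}}(\Lambda)}\,\|\hat{z}_N\|_{L^2_{w^{\alpha,\beta}}(\Lambda)}, \qquad
\Theta_{21} = \max_{0 \le i \le N} |K_{21}(\tau_i,\eta)|_{H^{m,N}_{w^{\alpha,\beta}}(\Lambda)}\,\|\hat{y}_N\|_{L^2_{w^{\alpha,\beta}}(\Lambda)}.
\]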
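Indeed, using (5.2) and (4.1) for $m = 1$ and then applying Hardy’s inequality (see e.g. Lemma 3.7 from [7]), we can write
\[
\begin{aligned}
\|P_1'\|_{L^2_{w^{\alpha,\beta}}(\Lambda)}
&\le C\,\Big\| K_{11}(\tau,\tau)\,e(\tau) + \int_{-1}^{\tau} \frac{\partial K_{11}(\tau,\eta)}{\partial\tau}\,e(\eta)\,d\eta \Big\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} \\
&\le C\,\Big( \|K_{11}(\tau,\tau)\|_{L^2_{w^{\alpha,\beta}}(\Lambda)}\,\|e(\tau)\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} + \Big\| \int_{-1}^{\tau} \frac{\partial K_{11}(\tau,\eta)}{\partial\tau}\,e(\eta)\,d\eta \Big\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} \Big) \\
&\le C\,\|K_{11}(\tau,\tau)\|_{L^2_{w^{\alpha,\beta}}(\Lambda)}\,C N^{-m}\,|\hat{y}|_{H^{m,N}_{w^{\alpha,\beta}}(\Lambda)} + C\,\|e(\tau)\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} \\
&\le C\,\|K_{11}(\tau,\tau)\|_{L^2_{w^{\alpha,\beta}}(\Lambda)}\,C N^{-m}\,|\hat{y}|_{H^{m,N}_{w^{\alpha,\beta}}(\Lambda)} + C\,C N^{-m}\,|\hat{y}|_{H^{m,N}_{w^{\alpha,\beta}}(\Lambda)},
\end{aligned}
\]
thus
\[
\|P_1'\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} \le C N^{-m}\Big( \|K_{11}(\tau,\tau)\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} + C \Big)\,|\hat{y}|_{H^{m,N}_{w^{\alpha,\beta}}(\Lambda)} \le C N^{-m}\,|\hat{y}|_{H^{m,N}_{w^{\alpha,\beta}}(\Lambda)}.
\]
Similarly
\[
\begin{aligned}
\|P_2'\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} &\le C N^{-m}\Big( \|K_{12}(\tau,\tau)\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} + C \Big)\,|\hat{z}|_{H^{m,N}_{w^{\alpha,\beta}}(\Lambda)} \le C N^{-m}\,|\hat{z}|_{H^{m,N}_{w^{\alpha,\beta}}(\Lambda)}, \\
\|P_3'\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} &\le C N^{-m}\Big( \|K_{21}(\tau,\tau)\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} + C \Big)\,|\hat{y}|_{H^{m,N}_{w^{\alpha,\beta}}(\Lambda)} \le C N^{-m}\,|\hat{y}|_{H^{m,N}_{w^{\alpha,\beta}}(\Lambda)}.
\end{aligned}
\]
The above estimates together with (5.13) yield
\[
\|e\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} = \|\hat{y} - \hat{y}_N\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} \le C N^{-m}\,|\hat{y}|_{H^{m,N}_{w^{\alpha,\beta}}(\Lambda)} +
\begin{cases}
C N^{2-m}\log N\,\Theta_{21}, & -1 < \alpha,\beta \le -\tfrac12, \\
C N^{\frac52+\gamma-m}\,\Theta_{21}, & \text{otherwise}.
\end{cases}
\tag{5.17}
\]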
Also $\Theta_{11}$ can be written as:
\[
\Theta_{11} = \max_{0 \le i \le N} |K_{11}(\tau_i,\eta)|_{H^{m,N}_{w^{\alpha,\beta}}(\Lambda)}\,\|\hat{y}_N\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} \le \mathcal{K}\,\big( \|\hat{y}\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} + \|e\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} \big),
\]
where $\mathcal{K} = \max_{0 \le i \le N} |K_{11}(\tau_i,\eta)|_{H^{m,N}_{w^{\alpha,\beta}}(\Lambda)}$. Finally, the above estimates together with (5.13) and (5.17) give
\[
\begin{aligned}
\|\varepsilon\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} = \|\hat{z} - \hat{z}_N\|_{L^2_{w^{\alpha,\beta}}(\Lambda)}
\le{}& C N^{-m}\Big( |\hat{y}|_{H^{m,N}_{w^{\alpha,\beta}}(\Lambda)} + |\hat{z}|_{H^{m,N}_{w^{\alpha,\beta}}(\Lambda)} \Big) \\
& + \begin{cases}
C N^{2-m}\log N\,\Big( \mathcal{K}\big( \|\hat{y}\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} + \|e\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} \big) + \Theta_{12} + \Theta_{21} \Big), & -1 < \alpha,\beta \le -\tfrac12, \\[4pt]
C N^{\frac52+\gamma-m}\,\Big( \mathcal{K}\big( \|\hat{y}\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} + \|e\|_{L^2_{w^{\alpha,\beta}}(\Lambda)} \big) + \Theta_{12} + \Theta_{21} \Big), & \text{otherwise},
\end{cases}
\end{aligned}
\]
which leads to the estimate stated in the theorem, provided that $N$ is sufficiently large and $C$ is a constant independent of $N$. □

6. Numerical results and discussions

To illustrate our numerical approach, two semi-explicit IAEs systems of index-2 together with an applied problem are considered. These problems are solved using the proposed Jacobi collocation method for $\alpha = \frac12$ and $\beta = \frac13$, based on the matrix–vector multiplication representation of the equations. To examine the accuracy of the results, $L^2_{w^{\alpha,\beta}}$ errors are employed to assess the efficiency of the method. All the calculations were performed with the software Mathematica®.

Table 1
$L^2_{w^{\alpha,\beta}}$ errors for Example 1.

N     ‖x1 − u‖            ‖x2 − v‖            ‖x3 − w‖
4     1.79 × 10^-4        9.86 × 10^-5        3.72 × 10^-3
6     3.12 × 10^-6        1.84 × 10^-6        1.30 × 10^-4
8     7.66 × 10^-8        4.67 × 10^-8        4.61 × 10^-6
10    1.99 × 10^-9        1.46 × 10^-9        1.59 × 10^-7
12    5.33 × 10^-11       3.01 × 10^-11       5.38 × 10^-9
14    1.37 × 10^-12       7.73 × 10^-13       1.74 × 10^-10
Example 1. Consider the following semi-explicit linear IAEs system of index-2
\[
A(t)X(t) = g(t) + \int_0^t K(t,s)X(s)\,ds,
\]
where
\[
A(t) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
X(t) = \big(x_1(t), x_2(t), x_3(t)\big)^T, \qquad
g(t) = \big(1,\ 2e^t - 1,\ -1 + e^t(-t^2 + t + 1)\big)^T,
\]
\[
K(t,s) = \begin{pmatrix}
\dfrac{3-s}{2-s} & \dfrac{3-2s}{2-s} & 2(2-s) \\[8pt]
\dfrac{-1}{s-2} & -1 & 1 \\[8pt]
s+2 & s^2-4 & 0
\end{pmatrix}.
\]
The exact solutions of the system are:
\[
x_1(t) = x_2(t) = e^t, \qquad x_3(t) = -\frac{e^t}{2-t}.
\]
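The data of Example 1 can be cross-checked by substituting the stated exact solution into the system and evaluating the residual numerically. The sketch below is only a consistency check, not the collocation method itself; it assumes the kernel and source term as read from the example (row 1: $(3-s)/(2-s)$, $(3-2s)/(2-s)$, $2(2-s)$; row 2: $-1/(s-2)$, $-1$, $1$; row 3: $s+2$, $s^2-4$, $0$), and the helper names are illustrative.

```python
import math

def simpson(func, a, b, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def exact(t):
    """Stated exact solution (x1, x2, x3)."""
    return [math.exp(t), math.exp(t), -math.exp(t) / (2.0 - t)]

def kernel(s):
    """Kernel K(t, s) as read from Example 1 (it depends on s only)."""
    return [
        [(3 - s) / (2 - s), (3 - 2 * s) / (2 - s), 2 * (2 - s)],
        [-1 / (s - 2), -1.0, 1.0],
        [s + 2, s * s - 4, 0.0],
    ]

def source(t):
    """Source term g(t) as read from Example 1."""
    return [1.0, 2 * math.exp(t) - 1, -1 + math.exp(t) * (-t * t + t + 1)]

A_DIAG = [1.0, 1.0, 0.0]  # A(t) = diag(1, 1, 0)

def residuals(t):
    """Residual of A X - g - int_0^t K X ds for each of the three rows."""
    x_t, g_t = exact(t), source(t)
    res = []
    for i in range(3):
        integral = simpson(
            lambda s: sum(kernel(s)[i][j] * exact(s)[j] for j in range(3)), 0.0, t)
        res.append(A_DIAG[i] * x_t[i] - g_t[i] - integral)
    return res

max_residual = max(abs(r) for t in (0.25, 0.5, 1.0) for r in residuals(t))
```

A residual at the level of the quadrature error indicates that kernel, source term and exact solution are mutually consistent. Note also that $K_{21}(t,t)K_{12}(t,t) = 2(2-t)(t+2) + (t^2-4) = 4 - t^2$, which is nonzero for $0 \le t < 2$, in line with the index-2 condition of Theorem 1.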
Let $u$, $v$, $w$ be the approximations of the exact solutions $x_1$, $x_2$, $x_3$, respectively, given by (4.7). The discrete method described in Section 4 has been implemented for the problem and the $L^2_{w^{\alpha,\beta}}$ errors for different values of $N = 4, 6, 8, \dots$ are reported in Table 1. Graphs of the error functions and the error behavior for several values of $N$ are also given in Figs. 1 and 2. We observe that the approximate solution exhibits the expected convergence behavior as described in Theorem 3.

Fig. 1. Error functions of the Jacobi collocation approximations of orders N = 6, 10, 12 and 14 for $\alpha = \frac12$ and $\beta = \frac13$ in Example 1.

Fig. 2. $L^2_{w^{\alpha,\beta}}$ errors versus the number of collocation points for Example 1.

Example 2. Let us consider the IAEs system of index-2:
\[
A(t)X(t) = g(t) + \int_0^t K(t,s)X(s)\,ds,
\]
with
\[
A(t) = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
X(t) = \big(x_1(t), x_2(t)\big)^T, \qquad
K(t,s) = \begin{pmatrix} e^{t+s} & (s+1)^2 \\ s+t+2 & 0 \end{pmatrix}, \qquad
g(t) = \big(f_1(t), f_2(t)\big)^T,
\]
and
\[
\begin{aligned}
f_1(t) &= \sin t - \frac12 e^t\big(1 + e^t(-\cos t + \sin t)\big) - \frac14\big({-2} + 2(1+t)\cos 2t + (1 + 4t + 2t^2)\sin 2t\big), \\
f_2(t) &= -(2+t) + 2(1+t)\cos t - \sin t.
\end{aligned}
\]
The exact solutions of the system are:
\[
x_1(t) = \sin t, \qquad x_2(t) = \cos 2t.
\]
We assume $u$ and $v$ are the Jacobi collocation approximations of the exact solutions $x_1$ and $x_2$, respectively, which are defined by (4.7).
146
M. Hadizadeh et al. / Applied Numerical Mathematics 61 (2011) 131–148
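Table 2. $L^2_{w^{\alpha,\beta}}$ errors for Example 2.

N     $\|x_1 - u\|_{L^2_{w^{\alpha,\beta}}}$     $\|x_2 - v\|_{L^2_{w^{\alpha,\beta}}}$
4     $2.41 \times 10^{-5}$      $5.29 \times 10^{-4}$
6     $4.25 \times 10^{-8}$      $4.44 \times 10^{-6}$
8     $4.18 \times 10^{-11}$     $1.86 \times 10^{-8}$
10    $2.39 \times 10^{-14}$     $4.85 \times 10^{-11}$
12    $4.05 \times 10^{-15}$     $1.97 \times 10^{-13}$
14    $3.94 \times 10^{-15}$     $1.45 \times 10^{-13}$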
Fig. 3. Error functions of the Jacobi collocation approximations of orders $N = 6, 8, 10$ and $14$ for $\alpha = \frac{1}{2}$ and $\beta = \frac{1}{3}$ in Example 2.
Fig. 4. $L^2_{w^{\alpha,\beta}}$ errors versus the number of collocation points for Example 2.
The computational results are reported in Table 2. Figs. 3 and 4 show the graphs of the Jacobi collocation error functions with $\alpha = \frac{1}{2}$ and $\beta = \frac{1}{3}$. The errors are observed to decay exponentially with $N$ until they reach the level of rounding error.
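The exponential decay can be quantified from the tabulated errors. The sketch below fits a straight line to $\log_{10}(\text{error})$ against $N$ by ordinary least squares, using only the entries of Table 2 before the errors level off; the choice of which entries count as saturated, and the function name `fit_decay_rate`, are assumptions of this illustration rather than part of the paper.

```python
import math

# Errors copied from Table 2 (weighted L2 errors for Example 2).
N_VALUES = [4, 6, 8, 10, 12, 14]
ERR_X1 = [2.41e-5, 4.25e-8, 4.18e-11, 2.39e-14, 4.05e-15, 3.94e-15]
ERR_X2 = [5.29e-4, 4.44e-6, 1.86e-8, 4.85e-11, 1.97e-13, 1.45e-13]


def fit_decay_rate(ns, errs):
    """Least-squares slope of log10(err) against N.

    A roughly constant negative slope indicates error ~ C * 10**(slope * N),
    i.e. exponential (spectral) decay.
    """
    ys = [math.log10(e) for e in errs]
    n = len(ns)
    mean_x = sum(ns) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ns, ys))
    sxx = sum((x - mean_x) ** 2 for x in ns)
    return sxy / sxx


if __name__ == "__main__":
    # Assumed cut-offs: x1 saturates after N = 10, x2 after N = 12.
    print("x1 slope:", fit_decay_rate(N_VALUES[:4], ERR_X1[:4]))
    print("x2 slope:", fit_decay_rate(N_VALUES[:5], ERR_X2[:5]))
```

Both slopes come out negative and of order one, meaning each unit increase in $N$ removes roughly one or more decimal digits of error, which is the behavior one expects from a spectrally accurate method.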
Table 3. The absolute errors of Example 3 for different values of N.

N    $|u(0.95, 0.05) - \hat{u}(0.95, 0.05)|$    $|u(1, 0.05) - \hat{u}(1, 0.05)|$
2    $1.69 \times 10^{-3}$                      $5.53 \times 10^{-6}$
3    $1.59 \times 10^{-3}$                      $3.47 \times 10^{-6}$
4    $1.49 \times 10^{-3}$                      $2.54 \times 10^{-6}$
5    $1.48 \times 10^{-3}$                      $1.98 \times 10^{-6}$
Example 3. As an applied test problem, consider the controlled heat equation (1.2) with

\[
f(x) = 1, \qquad \alpha(t) = 1, \qquad \beta(t) = 0, \qquad
h(t) = \frac{1}{\sqrt{\pi t}}, \qquad \text{and} \qquad
g(t) = \frac{e^{-1/(4t)}(1 + 2t)}{2\sqrt{\pi}\, t^{3/2}}.
\]
The exact solution of the equation is:

\[
u(x,t) = \operatorname{erf}\Bigl(\frac{1-x}{2\sqrt{t}}\Bigr).
\]
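The stated exact solution is easy to examine numerically. The sketch below evaluates it at the two points used later in Table 3 and estimates $u_t - u_{xx}$ by central differences; that (1.2) is built on the standard heat operator $u_t = u_{xx}$ is an assumption of this sketch, and the function names are illustrative.

```python
import math


def u_exact(x, t):
    """Exact solution u(x, t) = erf((1 - x) / (2 sqrt(t))) quoted in Example 3."""
    return math.erf((1.0 - x) / (2.0 * math.sqrt(t)))


def heat_residual(x, t, h=1e-3):
    """Central-difference estimate of u_t - u_xx at (x, t).

    Assumes the underlying PDE is u_t = u_xx; the result is zero only up to
    O(h**2) discretization error.
    """
    u_t = (u_exact(x, t + h) - u_exact(x, t - h)) / (2.0 * h)
    u_xx = (u_exact(x + h, t) - 2.0 * u_exact(x, t) + u_exact(x - h, t)) / (h * h)
    return u_t - u_xx


if __name__ == "__main__":
    print("u(0.95, 0.05) =", u_exact(0.95, 0.05))
    print("u(1, 0.05)    =", u_exact(1.0, 0.05))
    print("residual at (0.5, 0.5):", heat_residual(0.5, 0.5))
```

Note that $u(1,t) = \operatorname{erf}(0) = 0$ for every $t > 0$, and that $u(x,t) \to 1 = f(x)$ as $t \to 0^+$ for $x < 1$, consistent with the initial datum.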
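It can be shown that, under the assumption

\[
\gamma(t) = \frac{e^{-49/(4t)}\bigl(7 - 12\,e^{13/(4t)} + 5\,e^{6/t}\bigr)\bigl(9 + \pi^2 t\bigr)}
{9\pi t\Bigl(-\operatorname{erf}\bigl(\tfrac{5}{2\sqrt{t}}\bigr) + 2\operatorname{erf}\bigl(\tfrac{3}{\sqrt{t}}\bigr) - \operatorname{erf}\bigl(\tfrac{7}{2\sqrt{t}}\bigr)\Bigr)},
\]

the system (1.4) can be reduced to a system of the form (1.1) as follows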
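\[
\begin{cases}
\dfrac{e^{-1/(4t)}(1 + 2t)}{2\sqrt{\pi}\, t^{3/2}} = \phi_1(t) - 2\displaystyle\int_0^t \theta(0, t-\tau)\,\phi_1(\tau)\,d\tau + 2\displaystyle\int_0^t \theta(-1, t-\tau)\,\phi_2(\tau)\,d\tau, \\[4mm]
\dfrac{1}{\sqrt{\pi t}} = -2\displaystyle\int_0^t \theta(1, t-\tau)\,\phi_1(\tau)\,d\tau,
\end{cases}
\]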
√ 3 √ 3 2 where θ(0, t − τ ) √ 1 (1 + π3 t ), θ(−1, t − τ ) 34 t π 2 and θ(1, t − τ ) 12 t π 2 . 4π t
Let $\hat{\phi}_1(t)$ and $\hat{\phi}_2(t)$ be the approximations of the exact solutions $\phi_1(t)$ and $\phi_2(t)$ given by (4.7). Substituting $\hat{\phi}_1(t)$ and $\hat{\phi}_2(t)$ into (1.3), the approximate solution of problem (1.2) is obtained as
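\[
\hat{u}(x,t) = \int_0^1 \bigl[\hat{\theta}(x-\xi, t) - \hat{\theta}(x+\xi, t)\bigr]\,d\xi
- 2\int_0^t \frac{\partial\hat{\theta}}{\partial x}(x, t-\tau)\Bigl(1 + \int_0^\tau \hat{\phi}_1(\eta)\,d\eta\Bigr)d\tau
+ 2\int_0^t \frac{\partial\hat{\theta}}{\partial x}(x-1, t-\tau)\Bigl(1 + \int_0^\tau \hat{\phi}_2(\eta)\,d\eta\Bigr)d\tau,
\qquad (6.1)
\]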
where $\hat{\theta}(x,t)$ is the truncated series of the Theta function

\[
\hat{\theta}(x,t) = \frac{1}{\sqrt{4\pi t}} \sum_{m=-3}^{3} \exp\Bigl(-\frac{(x+2m)^2}{4t}\Bigr).
\]
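The truncated Theta function is straightforward to evaluate, and the effect of keeping only the terms $m = -3, \ldots, 3$ can be gauged by comparing against a longer partial sum. The following is a minimal sketch; the names `theta_hat` and `truncation_gap` and the reference length `m_ref = 50` are choices made for this illustration.

```python
import math


def theta_hat(x, t, m_max=3):
    """Truncated Theta function: (4 pi t)^(-1/2) * sum_{m=-m_max}^{m_max}
    exp(-(x + 2m)^2 / (4t)), with m_max = 3 as in the text."""
    if t <= 0:
        raise ValueError("t must be positive")
    total = sum(
        math.exp(-((x + 2 * m) ** 2) / (4.0 * t))
        for m in range(-m_max, m_max + 1)
    )
    return total / math.sqrt(4.0 * math.pi * t)


def truncation_gap(x, t, m_max=3, m_ref=50):
    """Difference between the m_max-term truncation and a longer partial sum
    (m_ref terms on each side), used as a proxy for the truncation error."""
    return abs(theta_hat(x, t, m_ref) - theta_hat(x, t, m_max))


if __name__ == "__main__":
    for x, t in ((0.95, 0.05), (1.0, 0.05), (-1.0, 1.0)):
        print(x, t, theta_hat(x, t), truncation_gap(x, t))
```

For small times such as $t = 0.05$ the omitted terms are far below double-precision resolution, so the seven-term truncation is effectively exact there; the gap only becomes visible for larger $t$.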
The approximate values $\hat{u}(0.95, 0.05)$ and $\hat{u}(1, 0.05)$ are of special interest and are computed by applying the Euler product integration method to (6.1). The differences between the numerical and analytical solutions for different values of $N$ are listed in Table 3.

7. Conclusion

We have developed a spectral collocation method based on Jacobi orthogonal polynomials for the numerical solution of a class of IAEs systems of index-2. The method relies on a matrix–vector representation and on the solution of the resulting system of equations. Variable transformations are used to recast the equation as another IAEs system defined on the standard interval $[-1, 1]$, so that the theory of Jacobi orthogonal polynomials can be applied conveniently. The resulting index-2 system is solved directly, without an index reduction procedure, which simplifies the method. The spectral rate of convergence of the proposed method is established in the weighted $L^2$ norm. With this methodology available, it becomes possible to investigate the approximate solution of other classes of IAEs systems. Our convergence theory does not cover the nonlinear case, which involves additional complications and restrictions in establishing a convergence result similar to Theorem 3; this will be the subject of our future work.
Acknowledgement

The authors would like to acknowledge useful early discussions with Prof. Hermann Brunner (Hong Kong Baptist University) as well as two anonymous referees for their careful reading of the manuscript and constructive comments.

References

[1] U. Ascher, L.R. Petzold, Computer Methods for Ordinary Differential Equations and Differential-Algebraic Equations, SIAM, 1998.
[2] H. Brunner, Collocation Methods for Volterra Integral and Related Functional Equations, University Press, Cambridge, 2004.
[3] M.V. Bulatov, V.F. Chistyakov, The properties of differential-algebraic systems and their integral analogs, Memorial University of Newfoundland, preprint, 1997.
[4] M.V. Bulatov, Numerical methods of investigation and solution for integral algebraic equations, in: SciCADE09, May 25–29, 2009.
[5] J.R. Cannon, The One-Dimensional Heat Equation, University Press, Cambridge, 1984.
[6] C. Canuto, M.Y. Hussaini, A. Quarteroni, T.A. Zang, Spectral Methods Fundamentals in Single Domains, Springer-Verlag, 2006.
[7] Y. Chen, T. Tang, Convergence analysis of the Jacobi spectral collocation methods for Volterra integral equations with a weakly singular kernel, Math. Comp. 79 (2010) 147–167.
[8] C.W. Gear, Differential-algebraic equations, indices, and integral-algebraic equations, SIAM J. Numer. Anal. 27 (1990) 1527–1534.
[9] C.W. Gear, Differential-algebraic equations and index transformation, SIAM J. Stat. Comput. 1 (1988) 39–47.
[10] A.M. Gomilko, A Dirichlet problem for the biharmonic equation in a semi-infinite strip, J. Engng. Math. 46 (2003) 253–268.
[11] B. Guo, L. Wang, Jacobi interpolation approximations and their applications to singular differential equations, Adv. Comput. Math. 14 (2001) 227–276.
[12] M. Hanke, E.I. Macana, R. März, On asymptotics in case of linear index-2 differential-algebraic equations, SIAM J. Numer. Anal. 35 (1998) 1326–1346.
[13] I. Higueras, R. März, C. Tischendorf, Stability preserving integration of index-2 DAEs, Appl. Numer. Math. 45 (2/3) (2003) 201–229.
[14] B. Jumarhon, W. Lamb, S. McKee, T. Tang, A Volterra integral type method for solving a class of nonlinear initial-boundary value problems, Numer. Methods Partial Differential Equations 12 (1996) 265–281.
[15] V.V. Kafarov, B. Mayorga, C. Dallos, Mathematical method for analysis of dynamic processes in chemical reactors, Chem. Engng. Sci. 54 (1999) 4669–4678.
[16] J.P. Kauthen, The numerical solution of integral-algebraic equations of index-1 by polynomial spline collocation methods, Math. Comp. 236 (2000) 1503–1514.
[17] R. März, The index of linear differential-algebraic equations with properly stated leading term, Results Math. 42 (3/4) (2002) 308–338.
[18] C.K. Qu, R. Wong, Szegő's conjecture on Lebesgue constants for Legendre series, Pacific J. Math. 135 (1988) 157–188.
[19] J. Shen, T. Tang, Spectral and High-Order Methods with Applications, Science Press, Beijing, 2006.
[20] T. Tang, X. Xu, J. Cheng, On spectral methods for Volterra type integral equations and the convergence analysis, J. Comput. Math. 26 (2008) 825–837.
[21] L.V. Wolfersdorf, On identification of memory kernel in linear theory of heat conduction, Math. Methods Appl. Sci. 17 (1994) 919–932.
[22] A.I. Zenchuk, Combination of inverse spectral transform method and method of characteristics: deformed Pohlmeyer equation, J. Nonlinear Math. Physics 15 (2008) 437–448.