Multilevel Adaptive Methods for Partial Differential Equations
Frontiers in Applied Mathematics

Frontiers in Applied Mathematics is a series that presents new mathematical or computational approaches to significant scientific problems. Beginning with Volume 4, the series reflects a change in both philosophy and format. Each volume focuses on a broad application of general interest to applied mathematicians as well as engineers and other scientists. This unique series will advance the development of applied mathematics through the rapid publication of short, inexpensive books that lie on the cutting edge of research.

Vol. 1 Ewing, Richard E., The Mathematics of Reservoir Simulation
Vol. 2 Buckmaster, John D., The Mathematics of Combustion
Vol. 3 McCormick, Stephen F., Multigrid Methods
Vol. 4 Coleman, Thomas F. and Van Loan, Charles, Handbook for Matrix Computations
Vol. 5 Grossman, Robert, Symbolic Computation: Applications to Scientific Computing
Vol. 6 McCormick, Stephen F., Multilevel Adaptive Methods for Partial Differential Equations
Vol. 7 Bank, R. E., PLTMG: A Software Package for Solving Elliptic Partial Differential Equations. Users' Guide 6.0
Vol. 8 Castillo, José E., Mathematical Aspects of Numerical Grid Generation
Vol. 9 Van Huffel, Sabine and Vandewalle, Joos, The Total Least Squares Problem: Computational Aspects and Analysis
Vol. 10 Van Loan, Charles, Computational Frameworks for the Fast Fourier Transform
Vol. 11 Banks, H.T., Control and Estimation in Distributed Parameter Systems
Vol. 12 Cook, L. Pamela, Transonic Aerodynamics: Problems in Asymptotic Theory
Vol. 13 Rüde, Ulrich, Mathematical and Computational Techniques for Multilevel Adaptive Methods
Vol. 14 Moré, Jorge J. and Wright, Stephen J., Optimization Software Guide
Vol. 15 Bank, Randolph E., PLTMG: A Software Package for Solving Elliptic Partial Differential Equations. Users' Guide 7.0
Vol. 16 Kelley, C.T., Iterative Methods for Linear and Nonlinear Equations
Multilevel Adaptive Methods for Partial Differential Equations
Stephen F. McCormick
University of Colorado at Denver
Society for Industrial and Applied Mathematics
Philadelphia 1989

Library of Congress Catalog Card Number 89-22034
10 9 8 7 6 5 4 3 2
All rights reserved. Printed in the United States of America. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the Publisher. For information, write the Society for Industrial and Applied Mathematics, 3600 University City Science Center, Philadelphia, Pennsylvania 19104-2688.
Copyright © 1989 by the Society for Industrial and Applied Mathematics. SIAM is a registered trademark.
To Pop
Contents

Preface

Chapter 1 Introduction
1.1 Purpose
1.2 Motivation
1.3 Philosophy of Computation
1.4 Historical Remarks
1.5 Notation and Assumptions
1.6 Model Problems

Chapter 2 The Finite Volume Element Method
2.1 Introduction
2.2 Basic Approach
2.3 Boundary Conditions
2.4 Stencils
2.5 Composite Grids
2.6 Conservation and Singular Equations
2.7 Planar Cavity Flow; High Reynolds Number Flow
2.8 Rectangular and Rectilinear Elements
2.9 Time-Dependent Equations
2.10 Theory
2.11 Numerical Examples
2.12 Comments

Chapter 3 Multigrid Methods
3.1 Basic Concepts
3.2 Galerkin Operators and Singular Equations
3.3 Nonlinear Schemes
3.4 Full Multigrid and Computational Complexity
3.5 Parallel Implementation
3.6 Numerical Examples

Chapter 4 The Fast Adaptive Composite Grid Method
4.1 Basic Two-Level Schemes
4.2 Interpretations
4.3 Multilevel Schemes
4.4 Interface Treatment
4.5 Nonlinear Schemes
4.6 Computational Complexity and Direct FAC Solvers
4.7 Time-Dependent Equations
4.8 Self-Adaptive Techniques
4.9 Physically Conforming Grids
4.10 Theory for Variational FAC
4.11 Theory for FVE-Based FAC
4.12 Numerical Examples

Chapter 5 The Asynchronous Fast Adaptive Composite Grid Method
5.1 Motivation
5.2 Basic Two-Level Schemes
5.3 Interpretations
5.4 Multilevel Schemes
5.5 Parallel Implementation
5.6 Parallel Complexity
5.7 Theory for Variational AFAC
5.8 A Variant
5.9 Numerical Examples

Appendix
References
Index
Preface

The decision to write this book was a tough one. Adaptive methods cover such a broad area that it is virtually impossible to include a survey of the literature. Moreover, because the best way to introduce readers to multilevel methods depends so much on their backgrounds, choosing the right course was not at all obvious. Finally, the multilevel methodology is evolving so rapidly that deciding what topics to include became difficult, and a fast job of writing and publication became essential. Yet, despite these and other troubles, it seemed to me that every aspect of the field had finally come together: the fast adaptive composite grid methods provided the basic discretization structure and solution methods, multigrid methods offered fast subgrid solvers, and the finite volume element methods gave a flexible and efficient means for discretizing the equations and designing the interlevel transfers. This "unification" clinched my decision. However, the insecurities that mounted in the face of the troubles I encountered are reflected in the Introduction, and readers should take special heed of the warnings I have made there.

Much of this work was originally inspired by various developments and perspectives in the field of multigrid methods, for which I am indebted to its pioneer, Achi Brandt. I am also indebted to Liu Chaoqun and Dan Quinlan for the numerical results and several suggestions for algorithm improvement. I am grateful for helpful discussions with William Briggs, Richard Ewing, Gary Lewis, Jan Mandel, Raytcho Lazarov, Ulrich Rude, Jim Thomas, and Olof Widlund. Special thanks go to Debbie Beltz for her superb typing and infallible editing. Finally, I thank Pat Quinlan for her beautiful illustrations, Vickie Kearn and Tricia Manning for inspiration, and my wife Lynda for her great support and patience.

The research reflected in this book was sponsored in part by the Air Force Office of Scientific Research under grant AFOSR-86-0126 and by the National Science Foundation under grant DMS-8704169.

Stephen F. McCormick
University of Colorado at Denver
Chapter 1
Introduction
1.1 Purpose

This volume is intended as a practical handbook of the fundamental concepts for a class of multilevel adaptive methods designed for efficient adaptive discretization and solution of partial differential equations (PDEs). These so-called fast adaptive composite grid methods (FAC) are characterized by their use of a composite grid, which is nominally the union of various uniform grids. In effect, this nonuniform grid is where the PDE is discretized and solved, but the constituent uniform subgrids are where all of the actual computation takes place. In this way, these multilevel composite grid methods maintain the potential advantages of uniform grid discretizations (e.g., simple stencils, assured accuracy, and fast solution), while at the same time allowing effective adaptation to local phenomena. Said in a multigrid context, FAC is capable of producing a composite grid with tailored resolution, and a corresponding solution with commensurate accuracy, at a cost proportional to the number of composite grid points. Moreover, special asynchronous versions of the fast adaptive composite grid methods studied here have seemingly optimal complexity in a parallel computing environment.

These advantages come with some loss of flexibility. Since the multilevel methods treated here are based on uniform rectangular grids,
special measures must be taken to allow the composite grids to fit very irregular boundaries and internal interfaces and discontinuities. However, preliminary work on local grid generation and rotated patches is just beginning to show promise in restoring this flexibility. We will briefly discuss these techniques in Chapter 4.

FAC methods require discretization of the PDE not only on the individual levels, but in effect on the composite grid as well. While there are many techniques for doing this, the approach developed in Chapter 2 is a combined finite volume and finite element method, which is especially effective for many fluid flow applications.

The class of multilevel adaptive methods treated here has deep roots in the multigrid methodology. Indeed, much of their early development was inspired by the pioneering work of Achi Brandt, especially in the context of his multilevel adaptive techniques (MLAT; cf. [Brandt 1973, 1977]; for more recent work, see [Bai and Brandt 1987]). (See [McCormick and Thomas 1986] for some comments in this direction.) In fact, we borrow heavily from multigrid concepts and terminology and assume that the reader is familiar with its basic principles. (See [Briggs 1987] for a well-founded introduction to multigrid and its underlying principles.) However, the adaptive methods considered here can be interpreted in a more general context, where the grid "solvers" are allowed to be virtually any technique designed for PDEs on uniform grids. We discuss this further in Chapter 4.

In some ways this book is premature. Most of the methods we treat were discovered only within the last decade, and in many cases their development is still in its infancy. Because there are many issues yet to be settled, we have reserved full treatment of the more complex procedures in adaptive refinement for a possible later edition of this book.
However, in other areas we have in fact taken the liberty to introduce new concepts, in most cases to fill certain gaps, but occasionally to expose new avenues of research. Also, because adaptive refinement seems to demand a lot of attention to philosophical issues, we have freely and often brought personal perspectives into the discussion. Finally, the dynamic nature of this field currently prohibits a fixed foundation, so the reader should beware that many of the principles developed here may change with future progress. In any event, it is becoming increasingly critical to develop highly accurate, efficient, and reliable solution methods for very large scale and complex physical models. We hope that this book will contribute in a practical way towards achieving this goal. The presentation is organized as follows. The remainder of this
chapter develops the motivation, historical background, notation, and model equations. Chapter 2 describes the discretization method, the finite volume element technique (FVE). In Chapter 3, certain advanced aspects of multigrid methods are developed that will form components of the adaptive methods. Chapter 4 is devoted to the FAC methods for solving the discretized equations on composite grids. Finally, Chapter 5 treats asynchronous versions of FAC (AFAC) designed for multiprocessor applications. To provide a simple understanding of the basic methods introduced in subsequent chapters, FVE and FAC are applied in the Appendix to a one-dimensional model problem using a composite grid with two levels. The reader may wish to consult this development before proceeding.

1.2 Motivation

Most physical models have variations in scale on which numerical methods might capitalize. These variations usually appear in the coefficients, forcing terms, or boundary conditions. They may also be part of the computational objectives themselves: High accuracy may be needed only in limited parts of the domain, for example. If computational power were free and unlimited, we could simply resolve the physics on a global uniform grid with the smallest desired mesh size. Most applications, however, require either concession to coarser global scales with the attendant loss of important physics and its impact, or use of some sort of scheme for locally adapting resolution and accuracy. Unfortunately, local adaptation introduces several design challenges, including:

Grid structures. The grid structures must allow for easy addition and deletion of points, efficient mechanisms to determine and process neighbors, and assurance in the stability of the structures (e.g., to avoid small "aspect ratios" of elements, severe stretching, and tearing). The scheme must also support simple and manageable data structures that do not impede the efficiency of the overall discretization and solution process.
Error control. The local adaptation process must provide for relatively safe and efficient methods of error assessment and control.

Accurate discretizations. Discretizations must assure acceptable accuracy, but this is problematic on irregular grids. Nonuniform grids imply a loss in truncation error accuracy, on which finite difference approximations are usually based. Even the more
complicated finite element discretizations can assure only lower-order accuracy, albeit in a higher-order Sobolev norm.

Efficient solvers. The discrete equations must be solved efficiently. Unfortunately, the convergence rate and complexity of iterative methods usually deteriorate as the smallest mesh size decreases. These small scales may even hinder direct methods since they determine the condition number of the discretization matrix.

Parallelism. The advances in computers for large-scale computation make it critical to develop efficient parallel methods for local refinement. However, the requirements of adaptive methods are usually in conflict with efficiency in a parallel computing environment. Nonuniformity not only creates severe difficulties with vectorization, it also hinders multiprocessing, especially in terms of imbalanced loads.

Theory. It is important as a science that the adaptive refinement techniques be placed on a sound theoretical footing. The limited amount of founding theory is in fact a reflection of the difficulties that have impeded progress in the development of these methods. While some adaptive discretization schemes admit more or less realistic error estimates, many do not, and rigorous convergence estimates exist only for a few adaptive solvers.

Since the sacrifice of grid uniformity exacts such a high price, it can be an advantage to introduce as little nonuniformity into the method as possible. This is the guiding principle for the techniques considered in this book. It is achieved by way of a composite grid, which is the union of a nested sequence of uniform grids of varying scale. See Figure 1.1 for a simple example. (A patch here is a rectangular uniform grid. A level is the union of all patches of the same mesh size. Note that there is only one patch per level in this example.)
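The composite grid idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Patch` record, the helper names, and the choice of nesting every patch in the southwest corner of its parent are hypothetical and not taken from the text, and boundary points are counted along with interior points for simplicity. It builds five levels with one patch per level and refinement factor 2, then compares the number of composite grid points with a global uniform grid at the finest mesh size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    """A rectangular uniform grid, described in units of its own mesh size."""
    level: int  # 0 = coarsest; mesh size is h0 / 2**level
    i0: int     # x index of the southwest corner, in this level's mesh units
    j0: int     # y index of the southwest corner
    nx: int     # number of cells in x
    ny: int     # number of cells in y

    def points(self, finest_level):
        """Grid points as integer coordinates on the finest level's lattice."""
        stride = 2 ** (finest_level - self.level)
        return {((self.i0 + i) * stride, (self.j0 + j) * stride)
                for i in range(self.nx + 1)
                for j in range(self.ny + 1)}


def nested_corner_patches(levels, n0):
    """One patch per level; each covers the southwest quarter of its parent."""
    return [Patch(level=k, i0=0, j0=0, nx=n0, ny=n0) for k in range(levels)]


def composite_points(patches):
    """The composite grid is the union of the points of all its patches."""
    finest = max(p.level for p in patches)
    pts = set()
    for p in patches:
        pts |= p.points(finest)
    return pts


patches = nested_corner_patches(levels=5, n0=4)
composite = composite_points(patches)

# A global uniform grid with the finest mesh size has n0 * 2**(levels - 1) cells per side.
n_fine = 4 * 2 ** 4
uniform_count = (n_fine + 1) ** 2

print(len(composite), "composite points versus", uniform_count, "uniform points")
```

Each patch here stores the same small amount of data regardless of level, which is the sense in which the uniform subgrids keep the data structures simple.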
To be more specific, multilevel composite grid methods have three fundamental features for meeting the design challenges of local adaptation:

Composite grid structures. Since the composite grid is the union of a nested sequence of uniform grids, data structures are simplified. For example, a patch aligned with coarser grids can be specified by the relative location of its southwest corner, its dimensions, and its mesh sizes. Moreover, nonuniformity is controlled because there are relatively few irregular points, and even those points are quasi-regular, by which we mean there is some pattern to the stencils at the irregular points. For example, the coarse grid interface points for the two-level simple-patch example of Figure 1.2 are of only two types, sides and corners, and each type exhibits a fixed stencil pattern. Moreover, the fine grid interface points can be treated as if they were regular interior points simply by requiring that the discretization scheme produce fine grid equations that allow the coarse grid interface and slave interface points to be treated as the real boundary of the fine grid patch. A slave point is usually treated by requiring that the solution value there be an interpolant from its values at neighboring coarse grid interface points.

Figure 1.1. Composite grid: five patches, one patch per level, refinement factor = 2.

Figure 1.2. Composite grid interface points (coarse grid, fine grid, slave).

Uniform grid structures. Each level of the composite grid is a uniform grid. The computation in multilevel adaptive methods is dominated by processing on these uniform subgrids. The only significant impact of nonuniformity on computation is in the interpolation of slave point values and the evaluation of certain residuals at coarse grid interface points. Even these procedures can be made to appear like uniform grid processes.

Multilevel processing. Multilevel methods start by approximately solving individual equations on these uniform grids, then proceed by using the approximations to correct the coarse grid equations. The key to the success of these methods is their use of fully overlapping grids (of different scales) and their proper correction of the coarse grid equations by the composite grid residuals.

These features of multilevel methods provide: grid and data structures that are relatively simple to manipulate; grids that are stable and quasi-regular; effective error assessment and control mechanisms (via the presence of several levels of discretization); accurate discretization capabilities (because of the predominant use of uniform grids); very efficient solvers; a comparatively easy means for enhancing existing uniform grid software; a high degree of vectorizability and parallelizability (again by the predominant use of uniform grids); and substantial theoretical foundations. These aspects will be treated in the chapters that follow.

1.3 Philosophy of Computation

It is important to keep in mind the fundamental objectives and basic philosophy of computation in order to provide proper direction to algorithm development and analysis. This is especially critical for
adaptive methods because the PDE must be intimately involved in this process. For example, it may be encouraging to know that under certain conditions a given iterative method for solving a given discretization converges with an algebraic rate independent of any grid spacing, but this alone is not enough. Since the real goal is to approximate the PDE solution, we must ask how accurately this function is approximated by the final numerical result, and at what cost. Because adaptive methods must carefully manage discretization accuracy and computational cost, a purely algebraic approach is usually too limiting. For this reason, instead of restricting ourselves to the development of good preconditioners or fast algebraic solvers, we will take the overall perspective of developing efficient methods for approximating the solution of the PDE. According to this point of view, we adopt the following goals of computation (which are intentionally somewhat vague):

1. For a specified composite grid, compute a corresponding composite grid function that approximates the PDE solution to the level of discretization error, that is, so that the error is comparable to the error in the exact solution of the composite grid equation. Achieve this at optimal cost in the sense that the total time for computation is a small multiple of the time it takes just to compute a composite grid residual.

2. For a specified desired error tolerance, determine a composite grid and corresponding approximation that obtains this accuracy. Achieve this at an optimal cost with the following added proviso: The number of composite grid points is essentially minimal in the sense that any other composite grid with discretization error below the given tolerance must have approximately the same number of points.
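One way to read the first goal is through a triangle inequality. The notation below is an assumption made for illustration only: $\psi^*$ stands for the PDE solution evaluated at the grid points, $u^*$ for the exact solution of the composite grid equation, $u$ for the computed approximation, $\|\cdot\|$ for some grid norm, and $c$ for a modest constant.

```latex
\|u - \psi^*\| \;\le\; \|u - u^*\| + \|u^* - \psi^*\|
\;\le\; (1 + c)\,\|u^* - \psi^*\|
\qquad\text{whenever}\qquad
\|u - u^*\| \le c\,\|u^* - \psi^*\| .
```

In this reading, once the algebraic error $\|u - u^*\|$ is below a fixed multiple of the discretization error $\|u^* - \psi^*\|$, further iteration cannot improve the actual error by more than the factor $1 + c$, so additional algebraic accuracy is wasted effort.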
Our research has taken the approach that the performance of an algorithm must be tested against these objectives, and development is not complete until these tests are successful or it is well understood why these objectives cannot be achieved. Actually, because self-adaptive versions of the schemes we consider are in the early stages of development, most of the focus of this book will be on the first objective.

1.4 Historical Remarks

Here we give a very brief account of the historical development of multilevel adaptive methods and related techniques. For current developments, we have attempted throughout this book to reference work
that has had significant impact on our research. However, the science of adaptive methods is so broad and rapidly advancing that our references account for only a small fraction of the progress in this field. We apologize to the many researchers who have made important advancements in this and related fields, yet who are not properly referenced in this book. The use of global and local uniform grids for adaptive mesh refinement is no doubt a very old concept. The early work on these "classical mesh refinement methods" used individual processing of each level; but full interlevel efficiency was not obtained because either no iteration was used, the levels were not fully overlapping, or the interlevel communication used the emerging solution, not the equations. See [Ciment and Sweet 1973] for some of the first theory for these methods. This approach appears in many forms today, especially in the context of time-dependent problems. See [Berger and Oliger 1984] and [Caruso, Ferziger, and Oliger 1985] for examples of effective uses of this technique in the context of explicit time-stepping methods. Achi Brandt introduced many of the basic concepts of multileveling into adaptive methods in [Brandt 1973, 1977] with further improvements presented in [Bai and Brandt 1987]. The foundations for his MLAT are local truncation error estimation and fine grid correction of coarse grid equations by way of the full approximation scheme (FAS). The FAC method was introduced in [McCormick 1984] and [McCormick and Thomas 1986] as a method for generalizing MLAT — so that virtually any grid solver could be used to process each level — and for making it more systematic. FAC is driven by the choice of a composite grid discretization and is not limited by a truncation error perspective. The asynchronous version of FAC, which allows for simultaneous processing of all levels and is thus well suited for distributed memory multiprocessing systems, was introduced in [Hart and McCormick 1989]. 
A preconditioning method, which is quite similar to FAC, was introduced in [Bramble et al. 1988]. It is now called the BEPS preconditioner in reference to its developers. For an interesting comparison between BEPS and FAC considered as a preconditioner, see [Ewing, Lazarov, and Vassilevski 1988]. A technique related to FAC is the hierarchical basis method, first introduced for finite element applications in [Craig and Zienkiewicz 1985]. Loosely speaking, while its approach has some similarities to that of FAC, it uses a different fine grid equation: FAC discretizes the PDE on the fine grid using "hat" functions associated with every
fine grid point, while the hierarchical basis method uses only those hat functions associated with fine grid points that are not also coarse grid points. The hierarchical basis method is quite general, but it does not fit with the objectives here because its incomplete use of the fine grid basis functions leads to a mild dependence of the convergence rate on the number of refinement levels. There is an abundance of references on other methods for adaptive refinement and solution of PDEs. Rather than attempt to list all of them here, we just cite several representative ones that are of multigrid type: [Fuchs 1985], [Bank 1986], [Berger and Jameson 1985], [Gannon 1980], [Berger 1987], [Hackbusch 1984], [Forester 1982], [Hemker 1980], [McCormick 1985], [van Rosendale 1983], and [Rivara 1984]. For a more complete historical account, see the KWIK reference guide in the appendix of [McCormick 1987] under the term "adaptive." In the next chapter we introduce the finite volume element method [Liu and McCormick 1988a] for discretization of PDEs nominally in conservative form. FVE is related to the so-called control volume finite element method (CVFE) developed earlier in [Baliga and Patankar 1980]. See also [Minkowycz et al. 1988] and [Bank and Rose 1987]. The approaches of FVE and CVFE are basically the same; the differences are in certain aspects of their development and their intended use. For example, FVE was developed for use with composite grids, with volumes that are defined in a way that leads to convenience in the solution process. CVFE appears to have been developed for general grids, with volumes corresponding to a certain dual of the finite element mesh. There are also major differences in the way we use FVE to treat such aspects as boundary conditions, nonlinearities, and systems. 
Finally, we prefer using the term finite volume elements because of its compatibility with terms used for other discretization methods (e.g., finite differences, finite elements, and finite volumes).
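The difference noted above between the fine grid equations of FAC and those of the hierarchical basis method comes down to which fine grid points carry a hat function. A minimal sketch for a one-dimensional uniform fine grid, obtained by halving each coarse cell, is given below; the function name and the integer node numbering are illustrative assumptions, not notation from the text.

```python
def fine_level_basis_nodes(n_coarse_cells):
    """Interior fine grid nodes that carry a hat function under each approach."""
    n_fine_cells = 2 * n_coarse_cells
    fine_nodes = list(range(1, n_fine_cells))            # interior fine grid points
    coarse_nodes = [i for i in fine_nodes if i % 2 == 0]  # fine points that are also coarse points
    fac_nodes = list(fine_nodes)                          # one hat function per fine grid point
    hierarchical_nodes = [i for i in fine_nodes if i % 2 == 1]  # only points new to the fine level
    return fac_nodes, hierarchical_nodes, coarse_nodes


fac_nodes, hierarchical_nodes, coarse_nodes = fine_level_basis_nodes(4)
print("FAC:", fac_nodes)
print("hierarchical basis:", hierarchical_nodes)
print("shared with coarse grid:", coarse_nodes)
```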
1.5 Notation and Assumptions

Regions, spaces, operators, grid points, and sets use capital letters; constants and functions use lower case. Quantities associated with a continuum (e.g., PDE operators, spaces, and functions) use Greek, quantities associated with grids use Roman, and sets use calligraphic. Constants can be either lower case Greek or Roman. Exceptions and special symbols include:

$\mathcal{N}(\cdot)$   null space
$\mathcal{R}(\cdot)$   range
$\Re^n$   Euclidean n-space
$\emptyset$   empty set
$\rho(\cdot)$   spectral radius
$\nabla\cdot$   divergence operator; $\nabla = (\partial_x, \partial_y)^T$ in 2-D
$L^T$   matrix transpose
$\mathcal{L}^*$   operator adjoint
$\perp$   perpendicular
$n$   outward unit surface normal
$O(\cdot)$   order; $p = O(q)$ if $\limsup |p/q| < \infty$, where the limit and limit variable are usually understood (typically $h \to 0$ or $n \to \infty$)
$\ll$   uniformly less than; $p \ll q$ if there exists an $\alpha$ such that $p < \alpha < q$ for all values of some understood quantity (e.g., mesh size or independent variable)
$\underline{G}$   linear part; if $G$ is a stationary linear iterative method of the form $G(u) = Au + b$, then $\underline{G} = A$
$\mathbf{1}$   unit constant function; $\mathbf{1}$ can denote either a continuum or grid ("vector") function or a constant
To avoid iteration subscripts, approximations like $u$ are dynamic quantities that can change assignment in an algorithm by a statement of the form $u \leftarrow G(u)$. This is understood to mean that the new assignment of $u$ is the result of applying $G$ to the old one. Occasionally, when we need to be more specific, we will use expressions like $u_{(\mathrm{new})} \leftarrow G(u_{(\mathrm{old})})$.

Following is a description of the specific notation commonly used in this text. Since much of the development of concepts rests heavily on visualization, we have made several tacit assumptions about this notation that generally simplify the discussion, but are not otherwise essential. These assumptions are indicated in square brackets below; they are meant to hold unless otherwise indicated.

independent variable
$d$   space dimension [$d = 2$]
$h$   generic mesh size [mesh sizes are equal in each coordinate direction]; used in superscripts and subscripts, but may be dropped when understood
$2h$   coarse grid mesh size
$h_k$   mesh size of $k$th grid [$h_k = 2h_{k+1}$]; $k$ may replace $h_k$ in superscripts and subscripts
(open) regions in $\Re^d$ [regions are simply connected]; $\bar{\Omega}$ = closure; $\partial\Omega = \bar{\Omega} \setminus \Omega$ = boundary; $C\Omega$ = complement; nested means $\Omega_h \subset \Omega_{2h}$
continuum function spaces (including discrete finite element spaces) with respective elements $\psi$, $\phi$, $\psi^h$, $\phi^h$
grids [$\Omega^h$ is uniform; $\Omega^{\underline{h}}$ is the union of its uniform subgrids, $\Omega^k = \Omega^{h_k}$; $\Omega^h$ is aligned with $\Omega^{2h}$; grids "cover" their associated regions in the sense that the region is enclosed by the boundary of the grid]; grids exclude Dirichlet boundary points but contain Neumann boundary points; patch = rectangular uniform grid; level = uniform subgrid = union of patches of same mesh size; $\bar{\Omega}^h$ means $\Omega^h$ and its nearest neighbors (including diagonal neighbors, but excluding Dirichlet boundary points); $\partial\Omega^h = \bar{\Omega}^h \setminus \Omega^h$; $C\Omega^h$ = complement; nested means $\Omega^{2h} \subset \Omega^h$; locally nested means $\Omega^{2h} \cap \Omega_h \subset \Omega^h$
grid spaces with element $u^h$; entries of $u^h$ are written with grid point indices (with an additional time index for time-dependent problems)
grid points (nodes)
sets of "volumes" $V$ and "elements" $E$
"surface" of a volume $V$
multigrid notation (cf. [Briggs 1987])

A superscript asterisk (*) is used to denote an exact solution to a given equation. The following conventions are used for errors:

actual: continuum errors, grid errors
discrete: continuum errors, grid errors
algebraic: continuum errors, grid errors

Here, $P$ is a generic grid point in $\Omega^h$, and $u^h(P)$ and $\psi^*_P$ mean the respective values of $u^h$ and $\psi^*$ at $P$, for example. We also use the following inner product and norm conventions:
sup norm
$L_2$ or generic inner product, norm
($C^k(\Omega)$ is the space of continuously $k$-differentiable functions on $\Omega$)
$H^1(\Omega)$ norm ($H^1(\Omega)$ is the space of functions on which this norm is defined)
$H^1(\Omega)$ seminorm
discrete $H^1(\Omega)$ norm, where $\psi^h$ is the finite element interpolant of $\psi$ (when it exists)
discrete $H^1(\Omega)$ seminorm
energy inner product
energy norm
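The dynamic assignment $u \leftarrow G(u)$ and the "linear part" of a stationary linear iteration, both introduced in the notation above, can be sketched as follows. Jacobi iteration is used purely as an example of such a $G$; the system matrix `M`, the right-hand side `f`, and the helper names are assumptions for illustration and do not come from the text. Exact rational arithmetic keeps the recovered matrix free of rounding.

```python
from fractions import Fraction


def make_jacobi(M, f):
    """Jacobi iteration for M u = f, viewed as a map G(u) = A u + b."""
    n = len(f)

    def G(u):
        return [u[i] + (f[i] - sum(M[i][j] * u[j] for j in range(n))) / M[i][i]
                for i in range(n)]

    return G


def linear_part(G, n):
    """Recover A from G(u) = A u + b: column j of A is G(e_j) - G(0)."""
    g0 = G([Fraction(0)] * n)  # this is the constant term b
    A = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        e_j = [Fraction(int(i == j)) for i in range(n)]
        g = G(e_j)
        for i in range(n):
            A[i][j] = g[i] - g0[i]
    return A


M = [[Fraction(2), Fraction(-1), Fraction(0)],
     [Fraction(-1), Fraction(2), Fraction(-1)],
     [Fraction(0), Fraction(-1), Fraction(2)]]
f = [Fraction(1), Fraction(0), Fraction(1)]  # the exact solution of M u = f is (1, 1, 1)

G = make_jacobi(M, f)
A = linear_part(G, 3)

u = [Fraction(0)] * 3
for _ in range(60):
    u = G(u)  # u <- G(u): the name u is reassigned instead of carrying an iteration subscript

print([float(x) for x in u])
```

The matrix `A` recovered here governs how the algebraic error propagates from one sweep to the next, which is why the linear part is singled out in the notation.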
1.6 Model Problems

To illustrate many of the concepts of this book, we introduce the following model problems. The first three are static two-dimensional equations on the unit square $\Omega = (0,1) \times (0,1)$; we will use these models as the basis for our numerical experiments. The fourth model is a simple time-dependent equation on the $x$-$t$ strip $\Gamma = (0,1) \times (0,\infty)$. Let $\partial\Omega_W = \{(0,y) : 0 < y < 1\}$ be the west boundary of $\Omega$, and similarly for $\partial\Omega_S$, $\partial\Omega_E$, and $\partial\Omega_N$. (Throughout this book, such subscripts will be used to denote appropriate sections of boundaries and neighbors of grid points.) In the following, Re is a given parameter (the Reynolds number); $n$ is the outward unit normal on $\partial\Omega$; and $\rho$, $\eta$, 0, $\psi_0$, and $\psi_1$ are given functions.

Potential Flow (well-posed)
(Dirichlet boundary condition) (Neumann boundary condition) (1.1)
It will be useful to interpret these equations from a physical perspective. For this we assume that $\psi$ is the flow potential so that $\psi_x$ and $\psi_y$ are the fluid velocities in the respective $x$ and $y$ directions. Also, the functions $\rho$, $\eta$, $\psi_0$, and $\psi_1$ are the respective fluid density, interior source flow rate, boundary potential, and boundary source flow rate. We think of these quantities in terms of the following physical units: length per unit time; mass per unit volume; mass per unit time per unit volume; volume; surface area. (The spatial terms we use here such as "volume," "area," and "surface" are meant in a generic context, independent of dimension. In 2-D these terms really mean area, length, and perimeter, respectively.) Note that (1.1) is a scalar, static, linear PDE in conservative form and that it reduces to the familiar Poisson equation in the incompressible case, $\rho$ = constant. In general, we assume $\rho \gg 0$.

Potential Flow (singular, no flow boundary conditions)

To illustrate some of our concepts in the context of singular operator equations, we introduce the following equation with no flow (homogeneous Neumann) boundary conditions:

(Neumann boundary condition)

To ensure that a solution exists, we impose the so-called (analytic) compatibility condition on the data, given here by

$$\int_\Omega \eta \, dx \, dy = 0. \qquad (1.3)$$
This condition can be interpreted as a global conservation law which states that the net flow rate from internal sources must be zero (since there is no flow through the boundary). Equation (1.3) is needed to ensure that (1.2) has a solution, in which case there are infinitely many: If $\psi$ solves (1.2), then so does $\psi + c$, where $c$ is a constant function; all solutions are in fact of the form $\psi + c$, where $\psi$ is a fixed particular solution.
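The same two facts, that constants lie in the null space and that the data must sum to zero, have a simple discrete analogue. The sketch below is a generic cell-centered difference operator with no flux across the boundary, unit density, and unit mesh size; it is an illustrative assumption and not the FVE discretization developed in Chapter 2.

```python
def apply_no_flow_operator(u):
    """(A u)_ij = sum over existing neighbors of (u_ij - u_neighbor)."""
    rows, cols = len(u), len(u[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                k, l = i + di, j + dj
                if 0 <= k < rows and 0 <= l < cols:  # no flux across the boundary
                    out[i][j] += u[i][j] - u[k][l]
    return out


u = [[3, 1, 4],
     [1, 5, 9],
     [2, 6, 5]]
Au = apply_no_flow_operator(u)

# Every interior flux appears twice with opposite signs, so the entries of A u sum to zero.
# A discrete equation A u = f can therefore have a solution only if the entries of f sum to zero.
total_source = sum(sum(row) for row in Au)

# Adding a constant to u leaves A u unchanged, so solutions are determined only up to a constant.
shifted = [[value + 7 for value in row] for row in u]
A_shifted = apply_no_flow_operator(shifted)

print(total_source, A_shifted == Au)
```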
Planar Cavity Flow
(vorticity equation) in Ω
(stream function equation) in Ω
(Dirichlet boundary condition) on ∂Ω
(Neumann boundary condition) on ∂Ω
This model is useful because it is a system, it is nonlinear, and its basic character can be manipulated by the choice of Re. Note that its character as a system is emphasized by the boundary conditions: The first equation is usually associated with the vorticity, θ, and the second with the stream function, ψ, but they must really be treated as coupled equations because the boundary conditions are imposed only on ψ.
Time-Dependent Elliptic Equation
(initial condition)
(Neumann boundary condition)
While this model is only one-dimensional in space, we will use it for illustrating only those concepts that carry over directly to higher dimensions.
Chapter 2
The Finite Volume Element Method
2.1 Introduction
The classical finite volume method (FV) is in common use as a discretization method for computational fluid dynamics applications. Reasons for its popularity include its ability to be faithful to the physics in general and conservation in particular (cf. [Roache 1972]), to capture shocks, to produce simple stencils, to apply to a fairly wide range of fluid flow equations, to effectively treat boundary conditions and nonuniform grids, and to facilitate multigrid solution. Yet the FV approach is not fully systematic: Use of FV requires a scheme for approximating certain fluxes, which is often done in an effective but rather ad hoc and restrictive way that depends upon truncation error analysis. Evidence of the lack of fully developed guiding principles for the FV approach is its scarcity of founding theory. (See, however, [Samarskii, Lazaroff, and Makarov 1987] and [Heinrich 1987].) Contrast this with the status of theoretical foundations for finite elements (FE; cf. [Babuska, Chandra, and Flaherty 1983] and [Ciarlet 1978]). The finite volume element method (FVE) was developed as an attempt to use finite element ideas to create a more systematic FV methodology. The basic idea, which was first used in the control volume finite element method (CVFE; cf. [Baliga and Patankar 1980]), is to approximate the discrete fluxes needed in FV by replacing the unknown partial differential equation (PDE) solution by a finite element approximation. This means that the discretization design process can pay more attention to the local character of the solution (to choose accurate finite element spaces), and less to the equations. As the present chapter will show, this use of finite elements also leads to more effective treatment of problem complexities like boundary conditions, nonlinearities, systems, and irregular grids. As the next two chapters will show, FVE is ideal for multilevel methods because it provides a foundation for design of effective discretizations, interlevel transfers, scaling, inner products, and norms.
2.2 Basic Approach
Suppose V is a given "control volume" in Ω with surface S as shown in Figure 2.1. Integrating the potential equation (1.1) over V yields

$$\int_V \nabla\cdot(\rho\nabla\psi)\,dV = \int_V \eta\,dV. \qquad (2.1)$$
Using the Gauss Divergence Theorem, the left-hand side of (2.1) is transformed to a surface integral, yielding

$$\int_S (\rho\nabla\psi)\cdot n\,dS = \int_V \eta\,dV. \qquad (2.2)$$

Note that, in terms of the physical units introduced for potential flow, (2.2) reads

(mass / volume) · (length / time) · area = (mass / (time · volume)) · volume.
Each side therefore represents a flow rate in mass per unit time, and (ρ∇ψ)·n represents a flux across S. This equation can thus be interpreted as a conservation law for the volume V which states that the net flow rate of the fluid across the surface S balances with the net flow rate from the interior source. The use of control volumes in this way allows us to discretize the equation in (1.1). We simply choose a finite set, 𝒱, of volumes that partitions Ω as shown in Figure 2.2 and impose on each volume the integral condition (2.2). (Actually, it is important to be careful here to incorporate the boundary conditions, but this issue will be ignored until the next section.) The number, n, of discrete equations is therefore just the cardinality of 𝒱.
Figure 2.1. Control volume V (thick dashed lines) and its surface S.
To complete the task of discretization, the exact solution ψ* must now be replaced with an approximation u in ℝ^n, that is, we must discretize the unknown. The conventional FV approach uses finite differences in order to replace the fluxes (ρ∇ψ*)·n by differences of u at points neighboring S. Instead, FVE replaces ψ* by a finite element approximation v expressed in terms of its nodal values. Consider in particular the triangular finite element partition ℰ depicted in Figure 2.3. Let 𝒯 be the space of continuous piecewise linear functions on Ω associated with ℰ. (For the moment, we ignore the problem of incorporating the boundary conditions of problem (1.1) in our discretization.) Then the FVE discretization of (1.1) is: Find v ∈ 𝒯 such that (2.2) holds for all V in 𝒱. This is finally transformed to a problem in ℝ^n when v is expressed in terms of a nodal basis for 𝒯:

$$v = \sum_{k=1}^{n} u_k \phi_k, \qquad (2.3)$$
where u_k is the value of v at node k and φ_k is the so-called "hat" function associated with the kth node, N_k, of 𝒱, that is, φ_k ∈ 𝒯 and φ_k(N_l) = δ_kl, where δ_kl is the Kronecker delta. (Here we use a singly subscripted ordering of the nodes N_k. Later we will also use the notation N_ij to refer to the node at (x_i, y_j), where x_i and y_j will be given or implied. We may also allow the subscripts to take on fractional values, although these will always be given explicitly, as in N_{i+1/2,j}.) Substituting (2.3) into (2.2) then yields the matrix equation

$$L u = f, \qquad (2.4)$$
Figure 2.2. Volume partition of Ω.
Figure 2.3. Finite element partition of Ω.
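where

$$f_k = \int_{V_k} \eta\,dV \qquad (2.5)$$

(except for boundary contributions) and L is the n × n matrix with entries

$$L_{kl} = \int_{S_k} (\rho\nabla\phi_l)\cdot n\,dS. \qquad (2.6)$$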
(Here we have ordered the V_k in 𝒱 according to the ordering of the nodes, N_k.)
2.3 Boundary Conditions
There are several ways to treat the boundary conditions, depending on their type and the objectives of the discretization. This section will illustrate very straightforward schemes for implementing Neumann and Dirichlet conditions, which are analogous to finite element treatment of the respective "natural" and "essential" boundary conditions. Neumann conditions are perhaps the easiest to incorporate in the FVE discretization of (1.1): The Neumann condition can be imposed indirectly on ψ or its approximation v simply by substituting the flux value ψ_1 into the appropriate term in (2.2). Specifically, for the quarter-size volume V in the southwest corner of Ω, S_W ∪ S_S coincides with part of ∂Ω, so the integral condition for V is

$$\int_{S_N \cup S_E} (\rho\nabla\psi)\cdot n\,dS + \int_{S_W \cup S_S} \psi_1\,dS = \int_V \eta\,dV. \qquad (2.7)$$
The discrete approximation v uses (2.7) for this corner V in place of the interior condition (2.2). Note that with this approach, the u_k corresponding to nodes on ∂Ω_W ∪ ∂Ω_S are unknowns to be determined by the equations.
Dirichlet conditions are imposed directly on ψ and v. Hence, for (1.1), u_k takes on the value of ψ_0 at each Dirichlet node N_k, that is, each node on ∂Ω_N ∪ ∂Ω_E including the corner points (1,0) and (0,1). With this approach there are potentially fewer unknowns u_k than equations, but this is easily avoided by discarding the equations associated with the Dirichlet nodes (see Figure 2.4). This means that 𝒱 no longer partitions Ω, which slightly impairs conservation in the discretization (see §2.6). Fortunately, it is just where this occurs, at the Dirichlet points, that this loss of conservation is generally not a concern. This is because conservation is generally needed only to ensure accurate approximation of very smooth components that are not properly posed (e.g., the constant functions for problem (1.2)); Dirichlet conditions tend to reduce the ill-posedness of these components and, hence, the need for conservation. However, it may be important for more complex problems like (1.4) that 𝒱 be maintained as a partition. A simple method for doing this is to expand the volumes at the points neighboring the Dirichlet boundary as depicted in Figure 2.5. Note that the corners (1,0) and (0,1) of Ω are treated here as Dirichlet points; they could just as well have been Neumann points with the quarter-size volumes remaining in 𝒱. For simplicity, however, we assume from now on that the 𝒱 we use to discretize (1.1) consists of the volumes as shown in Figure 2.4. For full Neumann problems, this imbalance in the number of equations and unknowns does not arise. Thus, for (1.2) we henceforth use the original partition 𝒱 as displayed in Figure 2.2.
2.4 Stencils
Implementation of FVE requires a numerical rule for evaluating the integrals in (2.5) and (2.6).
Figure 2.4. Reduced volumes to accommodate Dirichlet points.
Figure 2.5. Modified volume partition of Ω to accommodate Dirichlet points.
Here we will illustrate how this is done with a fairly simple approach. In particular, for (2.5) we use the quadrature rule

$$\int_V \eta\,dV \approx \eta(N_V)\,|V|,$$

where N_V is the node associated with V and |V| = ∫_V dV is the "volume" (that is, area) of V. For (2.6) we use the following rule on each interior surface segment T = S_N, S_S, S_E, or S_W:

$$\int_T (\rho\nabla\phi)\cdot n\,dS \approx \rho(P_T)\,\bigl(\nabla\phi(P_T)\cdot n\bigr)\,|T|,$$

where P_T is the point of intersection of T and the grid lines passing through N_V. (Except for Neumann boundary nodes, P_T is the midpoint of T.) Finally, for Neumann boundary segments T = S_S or S_W we use

$$\int_T \psi_1\,dS \approx \psi_1(M_T)\,|T|,$$

where M_T is the midpoint of T and |T| is the "surface area" (that is, length) of T. To see what this produces in terms of the right-hand side f and the stencils for L defined by (2.6), consider the discretization on a uniform mesh with grid size h = 1/m in both coordinates. We use double subscripts i, j varying from 0 to m − 1, where 0 corresponds to the Neumann boundary nodes and m − 1 corresponds to the nodes neighboring the Dirichlet boundary. Written in stencil form, the equations in (2.4) are as follows, according to the type of associated node (Σ signifies the sum of the outer stencil entries):
General interior node (0 < i, j < m − 1)
Corner Neumann node (i = j = 0)
Here, the coefficient of u_00 is −Σ, which is the "center" of the stencil. (Ordinarily, a stencil has a true center whose entry corresponds to the grid point in question. Uncentered stencils occur at the boundary, where the correspondence between entries and grid points is obvious from the context.)
Side Neumann boundary node (0 < i < m − 1, j = 0; the case 0 < j < m − 1, i = 0 is similar)
Neighboring Dirichlet node (i = m − 1 or j = m − 1)
The equations here are the same as at other nodes except that the Dirichlet points are eliminated in the usual way, that is, the stencil entry reaching to a Dirichlet node value is removed and placed in the right-hand side as a coefficient of the boundary data. For example, at neighbors of ∂Ω_E (i = m − 1, 0 < j < m − 1) we have
where Σ' is the sum of the entries of the stencil without the boundary terms removed, that is, Σ' = ρ(1 − h/2, jh) + ρ(1 − 3h/2, jh) + ρ(1 − h, (j − 1/2)h) + ρ(1 − h, (j + 1/2)h).
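The node types listed above can be exercised with a small script. The following is only a sketch under stated assumptions: it takes (1.1) in the form ∇·(ρ∇ψ) = η with the flux (ρ∇ψ)·n = ψ_1 prescribed on the west and south sides, applies the point-evaluation rules of this section, and uses plain Gauss-Seidel sweeps as a stand-in solver. The function names and the sign and scaling conventions are hypothetical and need not match the book's stencils.

```python
def solve_fve_potential(m, rho, eta, psi0, psi1, sweeps=500):
    """Gauss-Seidel sweeps on FVE-type stencil equations for the mixed problem.

    Neumann data on the west and south sides, Dirichlet data on the east and
    north sides of the unit square; u[i][j] approximates psi(i*h, j*h).
    """
    h = 1.0 / m
    u = [[0.0] * (m + 1) for _ in range(m + 1)]
    for k in range(m + 1):            # Dirichlet nodes carry known values
        u[m][k] = psi0(1.0, k * h)
        u[k][m] = psi0(k * h, 1.0)
    for _ in range(sweeps):
        for i in range(m):
            for j in range(m):
                x, y = i * h, j * h
                wx = h if i > 0 else h / 2    # volume width (halved on west side)
                wy = h if j > 0 else h / 2    # volume height (halved on south side)
                # Outer stencil entries: rho at P_T times (segment length / h).
                ce = rho(x + h / 2, y) * wy / h
                cn = rho(x, y + h / 2) * wx / h
                cw = rho(x - h / 2, y) * wy / h if i > 0 else 0.0
                cs = rho(x, y - h / 2) * wx / h if j > 0 else 0.0
                rhs = eta(x, y) * wx * wy     # eta(N_V) * |V|
                if i == 0:                    # west Neumann segment, midpoint M_T
                    rhs -= psi1(0.0, y if j > 0 else h / 4) * wy
                if j == 0:                    # south Neumann segment, midpoint M_T
                    rhs -= psi1(x if i > 0 else h / 4, 0.0) * wx
                s = ce * u[i + 1][j] + cn * u[i][j + 1]
                if i > 0:
                    s += cw * u[i - 1][j]
                if j > 0:
                    s += cs * u[i][j - 1]
                # Center entry is minus the sum of the outer entries.
                u[i][j] = (s - rhs) / (ce + cn + cw + cs)
    return u


def max_error_for_linear_potential(m=4):
    """psi = x + 2y with rho = 1, eta = 0 should be reproduced at the nodes."""
    h = 1.0 / m
    exact = lambda x, y: x + 2.0 * y
    flux = lambda x, y: -1.0 if x == 0.0 else -2.0   # rho * grad(psi) . n on W / S
    u = solve_fve_potential(m, lambda x, y: 1.0, lambda x, y: 0.0, exact, flux)
    return max(abs(u[i][j] - exact(i * h, j * h))
               for i in range(m + 1) for j in range(m + 1))
```

Because the linear potential lies in the piecewise linear element space and the quadrature is exact for constant ρ, the computed nodal values should agree with the exact ones up to the solver tolerance, which gives a quick consistency check on the boundary treatment.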
One point to notice here is that FVE stencils can appear much like the usual finite difference stencils for simple problems on uniform grids. In such cases, the only apparent advantages are its more systematic treatment of the boundary conditions and its greater assurance for maintaining conservation. This must, of course, be balanced against how it appears in terms of ease of use. Another point is that, in this simple setting, the discretization is independent of the orientation of the triangular elements: Triangles formed by connecting the SW and NE corners of each grid box produce the same discretization. This will not be the case for problem (1.4), as we shall see in §2.7. It may be important to develop more accurate rules of integration, especially for more sophisticated equations. Such rules can probably best be developed by treating smaller segments of the surfaces individually (e.g., each linear segment contained wholly within an element). This leads to only a slight complication over the simpler scheme presented here.
2.5 Composite Grids
The purpose of our development of FVE is to provide an effective discretization of PDEs on adapted grids. To see how this is done for model problem (1.1), in this section we investigate the case of a two-level single patch composite grid as depicted in Figure 2.6. One of the practical objectives guiding our use of FVE is to produce equations on the patch that appear as if no coarser grid exists. In particular, the coarse grid and slave interface points should appear in the patch equations as if they were Dirichlet boundary points for the patch, and the equations at the fine grid interface points should not otherwise be special. To achieve this patch conformity, the patch volumes are chosen first, in a regular way, as if the patch were separate. The remaining volumes are then determined in a fairly straightforward way, with the volumes at interface points largely dictated by neighboring patch volumes.
Figure 2.6. Composite grid: two levels, one patch, refinement factor = 2.
Figure 2.7. Composite grid volumes.
Figure 2.8. Composite grid triangulation.
Figure 2.9. Construction of elements at a slave point.
The result is shown in Figure 2.7. Finally, we choose the triangulation ℰ in a similar way as the example in Figure 2.8. As indicated more clearly in Figure 2.9, the patch triangulation at the interface is formed by combining pairs of triangles of opposing orientation that abut at the slave points. Use of the resulting intermediate-size triangles at the interface is equivalent to using the original constituent triangles but imposing the rule that the solution value at each slave point be the linear interpolant of its values at the two neighboring coarse grid interface points. Note that by opposing the orientation of the constituent triangles, we have avoided coalescing their boundaries with the volume boundaries. Note also that the coarse grid triangulation in Figure 2.3 is actually a coarsening of the triangulation in Figure 2.8 in the sense that each coarse element is the union of composite grid elements. This generally means that the associated finite element spaces are nested in the sense that the coarse grid space is a subspace of the composite grid space. This will be convenient for certain aspects of our algorithms and their development.
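The slave point rule can be stated in a few lines of code. This is a minimal sketch, assuming a refinement factor r so that r − 1 slave points lie evenly spaced between two neighboring coarse grid interface points; the function name is hypothetical.

```python
def slave_values(u_a, u_b, r):
    """Values at the r - 1 slave points between two coarse interface points.

    u_a and u_b are the nodal values at the neighboring coarse grid interface
    points; each slave value is their linear interpolant, so slave points
    carry no unknowns of their own.
    """
    return [u_a + (u_b - u_a) * k / r for k in range(1, r)]
```

For a refinement factor of 2 there is a single slave point carrying the average of its two neighbors; for a factor of 4 there are three, at one, two, and three quarters of the way between them.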
Figure 2.10. Coarse grid interface point. The labels N, E, SE, and so on, are used to refer in a similar way to the points (○) where ρ is evaluated.
Figure 2.11. A coarse grid interface point, its volume, and the triangular elements for the case of refinement factor = 4. A similar labeling is used for the points (○) where ρ is evaluated.
Using numerical integration rules in a similar way as before, the FVE discretization is now completely determined. For example, the equation for the interface point shown in Figure 2.10 is
Here, h is the coarse grid size, the subscripts are used to indicate where ρ and η are evaluated, and u(P) signifies the entry of u corresponding to point P. Note that all points not on the interface have the usual equation
where h is the mesh size of the coarse or fine grid accordingly as P is a coarse or fine grid point. The fine grid interface points also have the same equations provided the slave points are used where appropriate. To see the effect of a larger refinement factor, consider the sample point depicted in Figure 2.11. Note the orientation of the elements at the interface, which again avoids having the boundaries of the elements and volumes coalesce. An approach similar to the above yields the equation
2.6 Conservation and Singular Equations
Most of the PDEs used for modeling fluid flows are derived from physical laws of conservation (cf. [Lax 1972]). Loosely speaking, these laws state that the net change of a physical quantity by way of fluxes through the boundary of a given region equals the net contribution to this quantity from the sources inside the region. This translates to a similar statement about the model: The mathematical conservation law for the PDE (1.1) is the integral form (2.2), for example. In fact, it is this form that is usually derived first from the physical system; the PDE is rather a consequence of the integral form, not the converse. Furthermore, at some locations of the region (especially at shocks; cf. [Lax 1972]), the PDE may not be valid in the usual sense, even though the integral laws still hold. The main point here is that the mathematical conservation law can generally be more appropriate than the PDE for characterizing the physical system. It can also provide a more direct form for discretization: Finite difference approximation of the PDE ostensibly relies on continuity of terms like ∇ψ, while it is often only the flux terms like ρ∇ψ in the integral form that are continuous. It is thus an important property of FV in general, and FVE in particular, that they can apply directly to the integral conservation laws.
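The flux balance described here is easiest to see in one space dimension. The sketch below is a one-dimensional analogue chosen for brevity, not the two-dimensional scheme of this chapter: each cell equation is a difference of face fluxes, so summing the equations over a block of adjacent cells leaves only the fluxes at the two ends of the block. The names `face_flux` and `net_outflow` are hypothetical.

```python
def face_flux(u, rho, h, i):
    """Discrete flux rho * du/dx at the face between nodes i and i + 1."""
    return rho((i + 0.5) * h) * (u[i + 1] - u[i]) / h


def net_outflow(u, rho, h, a, b):
    """Sum over cells a..b of (flux at right face) - (flux at left face).

    Interior face fluxes appear twice with opposite signs and cancel, so the
    result equals face_flux at face b minus face_flux at face a - 1.
    """
    return sum(face_flux(u, rho, h, i) - face_flux(u, rho, h, i - 1)
               for i in range(a, b + 1))
```

The cancellation holds for any nodal values u and any coefficient ρ, because both neighbors of a face use the same discrete flux there.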
Figure 2.12. An admissible discrete region, V_0, enclosed by its surface S_0 (dotted lines), formed from volumes at three fine grid and two coarse grid points.
Analogous to these physical and analytic laws are algebraic properties of conservation that are attributes of so-called conservative difference schemes. Although these properties are usually studied as global phenomena in the context of time-dependent hyperbolic equations (cf. [Berger 1984]), we start with the stronger local concept and concentrate primarily on our elliptic model (1.1) and its integral form (2.2). While a discretization cannot be expected to conserve quantities in completely arbitrary regions, we can ask for an analog to (2.2) on general "discrete" regions. To be specific, let 𝒱_0 be any nonempty subset of 𝒱 and let V_0 be the union of the volumes in 𝒱_0. Let S_0 be the surface of V_0, which is generally contained in but not equal to the union of the surfaces of the volumes in 𝒱_0 (see Figure 2.12). Any V_0 constructed in this way is called an admissible discrete region. Now the discrete solution v of (2.2), with node values u_k defined by (2.3), satisfies

$$\int_{S_0} (\rho\nabla v)\cdot n\,dS = \int_{V_0} \eta\,dV. \qquad (2.8)$$

This can be seen by summing (2.2) for the volumes V in 𝒱_0 and noticing that the flux terms that belong to common boundaries of these V agree in magnitude but have opposite sign, so they cancel. Equation (2.8) has the interpretation that this FVE method produces a solution that exactly satisfies the local analytic conservation law (2.2), provided the laws are restricted to admissible discrete regions. Of course, the discretization that is actually computed generally uses quadrature to approximate the integrals in (2.2), so this sense of conservation is contaminated by the quadrature error. Also, the fact that our 𝒱 does not partition Ω imposes an added restriction on the admissible discrete regions. For example, because the volumes at the Dirichlet boundary are not included in 𝒱, we cannot conclude that v satisfies the global conservation law

$$\int_{\partial\Omega} (\rho\nabla v)\cdot n\,dS = \int_\Omega \eta\,dV. \qquad (2.9)$$
For this to hold for problem (1.1), it would have been necessary to use something like the extended volumes as indicated in Figure 2.5. It is less important (at least for this simple model) to be conservative at Dirichlet points, so this is not a concern here. However, global conservation can be critical for certain singular or ill-posed equations. To see this and to expose important properties of algebraic conservation, we turn now to the singular potential flow equation in (1.2). Let H_0 be some space of functions satisfying the boundary conditions in (1.2) and on which the differential operator 𝒦 = ∇·ρ∇ in (1.2) is defined in an appropriate sense. Now 𝒦 is a singular operator because it has a nontrivial null space given by 𝒩(𝒦) = span{1}, where by 1 we mean the function that has the constant value 1. Similarly, 𝒩(𝒦*) = span{1}, where superscript * denotes operator adjoint. Since the range of 𝒦 satisfies ℛ(𝒦) ⊂ 𝒩^⊥(𝒦*), where ⊥ denotes orthogonal complement in the L_2 inner product <·,·>, (1.2) is solvable only if η is orthogonal to the function 1, that is, only if the compatibility condition (1.3) holds. When (1.3) is violated, we may perturb (1.2) to a problem for which (1.3) is satisfied by applying a projection to the right-hand side according to

$$\eta \leftarrow \eta - \frac{\langle \eta, 1\rangle}{\langle 1, 1\rangle}\,1. \qquad (2.10)$$
It is important that the discretization faithfully represents this singularity of 𝒦 in the sense that the approximations of the spaces 𝒩(𝒦) and 𝒩(𝒦*) are exact. Exact representation of 𝒩(𝒦) will avoid contamination of the other solution components by the singularity of 𝒦, and exact representation of 𝒩(𝒦*) will ensure solvability of the discrete equations. Together they facilitate theoretical analysis and numerical solution. To see how FVE accomplishes this for problem (1.2), note that the null space, 𝒩(L), of the resulting matrix L contains the n-vector 1, that is, the vector all of whose entries are 1. This can be seen directly from noting that the function 1 is in 𝒯 and that it satisfies (2.2) for η = 0. Thus, 𝒩(L) ⊃ span{1}. It is not hard to see now that 𝒩(L) = span{1}. This follows for simple composite grid structures from the fact that L is an irreducible, nonnegative M-matrix. Since the n-vector 1 here represents nodal values of the function 1 in 𝒯, the discrete approximation of 𝒩(𝒦) is therefore exact. One important consequence of algebraic conservation is that it implies that 𝒩(𝒦*) is faithfully represented: Setting η = 0 in the global conservation law (2.9) and using (2.3) and (2.6) shows that the n-vector 1 is in 𝒩(L^t), where superscript t denotes matrix transpose. Since rank(L) = rank(L^t) = n − 1, we must then have 𝒩(L^t) = span{1}. Thus, the discrete approximation of 𝒩(𝒦*) is also exact. This faithfulness to the "left" and "right" null vectors of 𝒦, which is due in part to local conservation, has the following important consequences:
Solvability. If η satisfies the analytic compatibility condition (1.3), then η must be in 𝒩^⊥(𝒦*). Hence, (1.2) must be solvable. Moreover, for such η, f defined by (2.5) must satisfy the discrete compatibility condition

$$\langle f, 1\rangle = \sum_{k=1}^{n} f_k = 0. \qquad (2.11)$$
Thus, f must be in 𝒩^⊥(L^t) = ℛ(L). Hence, (2.4) is solvable whenever (1.2) is. When (1.3) is violated, the projection

$$f \leftarrow f - \frac{\langle f, 1\rangle}{\langle 1, 1\rangle}\,1 \qquad (2.12)$$

can be used on (2.4) in place of (2.10) on (1.2). Also, in practice, when inexact integration is used to compute f, (2.12) may be applied to f to ensure solvability.
Accuracy. The accuracy of the discretization is essentially unaffected by the singularity of (1.2). Since the eigenfunction 1 is exactly represented, the discretization error can be analyzed on {1}^⊥. This is an invariant subspace of the operator in both the continuous space and 𝒯.
Discrete solver. Methods used to solve (2.4) may have little or no trouble with the singularity of L. Such is the case for multigrid, as will be shown in §3.2.
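These null space properties can be checked numerically on a small grid. The sketch below is an illustration under assumptions, not the book's code: it builds a dense matrix for the full Neumann problem with ρ = 1 on a uniform grid, using the half and quarter volumes at the boundary and the same point-evaluation rules as in §2.4, and then applies the discrete projection to an arbitrary right-hand side. The function names are hypothetical.

```python
def neumann_matrix(m):
    """Dense FVE-type matrix for rho = 1 on (m+1) x (m+1) nodes, no-flow boundary."""
    n1 = m + 1
    n = n1 * n1
    idx = lambda i, j: i * n1 + j
    L = [[0.0] * n for _ in range(n)]
    for i in range(n1):
        for j in range(n1):
            k = idx(i, j)
            # Segment length / h: 1 for interior volumes, 1/2 at the boundary.
            wx = 1.0 if 0 < i < m else 0.5
            wy = 1.0 if 0 < j < m else 0.5
            # East/west faces have length wy*h, north/south faces wx*h.
            for di, dj, w in ((1, 0, wy), (-1, 0, wy), (0, 1, wx), (0, -1, wx)):
                ii, jj = i + di, j + dj
                if 0 <= ii <= m and 0 <= jj <= m:
                    L[k][idx(ii, jj)] += w
                    L[k][k] -= w          # center entry is minus the outer sum
    return L


def project(f):
    """f <- f - (<f,1>/<1,1>) 1, the discrete counterpart of the projection."""
    mean = sum(f) / len(f)
    return [fk - mean for fk in f]


def row_and_column_sums(L):
    """Row sums test L*1 = 0; column sums test 1^t * L = 0."""
    n = len(L)
    rows = [sum(L[k]) for k in range(n)]
    cols = [sum(L[k][c] for k in range(n)) for c in range(n)]
    return rows, cols
```

Zero row sums express that the constant vector lies in the null space of L, and zero column sums express that it lies in the null space of the transpose, which is the algebraic form of global conservation; after projection the right-hand side sums to zero, so the discrete compatibility condition holds.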
Figure 2.13. Control volumes for stream function and vorticity equations, respectively.
Generally, when the discretization is nonconservative, error is introduced into the representation of the left null vector of 𝒦. This means that the solvability of (1.2) no longer guarantees that of (2.4), and a mechanism for detecting and correcting unsolvable discretizations is not forthcoming. But more seriously, inexact approximations to singular components can contaminate other components of the approximation. Maintaining conservation in the difference scheme can be a simple mechanism for relieving these troubles.
2.7 Planar Cavity Flow; High Reynolds Number Flow
Although FVE discretization of problem (1.4) is more complicated, the basic principles are the same. What should be emphasized here is the importance of adhering to these principles as closely as is practical: In the context of problem (1.4), the main point is that we do not separate the discretization of each equation in (1.4), but instead treat it as a fully integrated system. This can be critical to the development of accurate discretizations, especially for high Re flows. We illustrate this approach in this section by showing some of the steps in the construction of the discrete equations at sample interior and boundary points of a uniform grid. Because of the nonlinearities in (1.4), an approach different from that of §2.2 must be used. We start with the case of linear elements, which is designed for modest-size values of Re. Because of the nature of the boundary conditions in (1.4), we are led to different choices of volumes for each equation. The choices for the stream function and vorticity equations are the respective volume sets 𝒱_ψ and 𝒱_θ that are depicted in Figure 2.13. These partitions allow for the Dirichlet boundary conditions to be imposed directly on ψ and the Neumann boundary conditions indirectly via the equations, while providing a well-posed correspondence between equations and unknowns.
Figure 2.14. Interior volume for the vorticity and stream function equations with triangular elements.
First consider the vorticity equation at the sample interior point shown in Figure 2.14. Integrating this equation over the control volume V and using the Gauss Divergence Theorem yields
Consider the northern half segment of the east wall of V, which is labeled S_EN. The corresponding term in the left-hand side of (2.13) is the flux
Allowing a momentary "scratch-pad" approach to the notation, write
where we temporarily use a, b, c, α, β, γ as undetermined constants and, for convenience, place the x-y origin at the midpoint of S_EN. Substituting (2.15) into expression (2.14) yields the discrete flux
where h is the mesh size. For the triangles as they are oriented, we have
Treating the other segments in an analogous way, using quadrature on the right-hand side of (2.13), and letting u and v denote the nodal values of ψ and θ, respectively, we arrive at the discrete vorticity equation
where a_11 = ^(u(N) - u(W)), a_12 = 1 + %(2u(E) - u(P) - u(NW)), a_21 = 1 + %(u(P) + u(SE) - 2u(N)), a_23 = 1 + %(u(P) - u(NW) 2u(S)), a_32 = 1 + ^(2u(W) - u(P) - u(SE)), and a_33 = &(u(S) u(E)). Here we have taken some notational liberty by using stencils to express nonlinear equations. Note the NW-SE bias in these equations due to the orientation of the triangulation. (We have yet to find reason for doing so, but we could have avoided this bias by using either alternating triangle orientation, which might lead to two different types of stencils, or averaged stencils of opposite orientation, which would make the solver somewhat more cumbersome.) Taking a similar approach to the stream function equation at an interior point, we first use the Gauss Divergence Theorem on the ψ-term of this equation to get
Using (2.15) in each triangle then leads to the discrete stream function equation
Figure 2.15. Boundary volume for the stream function equation with triangular elements.
Figure 2.16. Alternate volume for the vorticity equation at the boundary.
To illustrate this process at the boundary, consider first the sample boundary volume in 𝒱_ψ depicted in Figure 2.15. The contributions to the discrete equation from the surface segments S_NW, S_SW, S_WN, and S_WS are the same except that the Dirichlet boundary condition is used to replace the values of ψ at N, P, and S by the corresponding values of ψ_0 there. On both boundary segments, we again use (2.15) in appropriate elements (e.g., the northern half boundary segment uses the triangle with vertices at NW, N, and P), but the Dirichlet conditions are imposed on ψ (e.g., ψ_y = (ψ_0N − ψ_0P)/h on the northern half boundary segment). This process yields the discrete stream function equation at the boundary given by
The dots are used here to emphasize that the unknown u does not include the Dirichlet boundary nodes. Note that the entry 1 in the stencil for u corresponds to the point W.
An alternate way to choose the volumes for the vorticity equation is to extend the volumes at the points neighboring the boundary as depicted in Figure 2.16. This means that 𝒱_θ as well as 𝒱_ψ would be partitions. However, use of these volumes requires evaluation of normal derivatives of the discrete θ on the boundary ∂Ω, and it is not clear how well-posed this might be. One virtue of the FVE method is that its analysis may rely less on finite difference-type estimates of local discretization error and more on finite element-type estimates of local interpolation error. Thus, the central design objective is to choose element basis functions that can in some smooth sense accurately approximate the target solution. For most of the problems we have thus far considered, linear elements suffice, even on composite grids. However, for more difficult problems like high Reynolds number flow, such elements cannot cope with the dramatic local variations exhibited by the solutions. Consider (1.4) again, but now under the assumption that Re is large. In particular, suppose that a uniform grid with mesh size h is given and that the mesh Reynolds number, hRe, is so large that the terms involving Re dominate the discretization (2.16). Then the FVE scheme developed above is in danger of becoming unstable with a total loss of accuracy. The culprit here is that high Reynolds number flows have strong exponential character, making piecewise polynomial approximation acceptable only on very fine grids. A natural remedy is to choose element basis functions that better match this exponential character. One of the more successful ways to determine such functions is to compute a basis for the null space of a linearization of the PDE operator on each element, with no boundary conditions specified.
For the vorticity equation in (1.4) using triangles, in our temporary notation this means that θ is assumed to be of the form θ = a e^{sx} + b e^{ty} + c in each element, where s and t are constants that in some way approximate Re ψ_y and −Re ψ_x, respectively. Examination of the stream function equation suggests that we take ψ of the same form: ψ = α e^{sx} + β e^{ty} + γ. The choices for s and t are tricky: They are meant to be representative values of the respective "coefficient" functions, Re ψ_y and −Re ψ_x, on each triangle, but taking partials of the form for ψ begs the question because the form itself involves s and t. However, since substantial error has already been introduced into the approximation by replacing the coefficient functions by constants, it seems reasonable simply to use divided differences of ψ for determining s and t. To illustrate this approach, consider the flux ∫_S (θ_x − Re ψ_y θ) dS across S = S_EN in Figure 2.14. Assuming that the x-y origin is at the midpoint of the P-E line segment, we have
Here,
and the remaining quantities are solutions of the 3 x 3 matrix equations
and
Thus,
The other fluxes and the discretization of the stream function equation are obtained in a similar way.
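The 3 × 3 systems mentioned above determine the element coefficients from nodal values. The sketch below shows one way such a fit could be set up for the exponential form θ = a e^{sx} + b e^{ty} + c; the vertex coordinates, the values of s and t, and the function names are all hypothetical, and the small elimination routine is only a stand-in for whatever solver is actually used.

```python
import math


def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, 3):
            factor = M[r][c] / M[c][c]
            for k in range(c, 4):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (M[r][3] - sum(M[r][k] * x[k] for k in range(r + 1, 3))) / M[r][r]
    return x


def fit_exponential(verts, vals, s, t):
    """Coefficients (a, b, c) of a*exp(s*x) + b*exp(t*y) + c matching vals at verts.

    The system is singular if s or t is zero, since exp(0) duplicates the
    constant column; in that limit the linear elements would be used instead.
    """
    A = [[math.exp(s * x), math.exp(t * y), 1.0] for x, y in verts]
    return solve3(A, vals)
```

Each of the three basis functions satisfies the constant-coefficient equation θ_xx + θ_yy − sθ_x − tθ_y = 0, which is the sense in which they span a null space of the linearized operator on the element, assuming the vorticity equation has the convection-diffusion form implied by the flux above.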
Figure 2.17. Interior volume for Poisson's equation with rectangular elements.
2.8 Rectangular and Rectilinear Elements
Although triangles are much simpler to use, rectilinear elements can sometimes provide greater accuracy. We begin with a uniform grid example to illustrate the basic approach. For simplicity, we consider only the Poisson problem, (1.1) with ρ = 1. Consider the flux term in (2.2), with ρ = 1, due to the northern half segment of the west wall of an interior volume as depicted in Figure 2.17:
With the temporary notation
and the x — y origin placed at the midpoint of SEN-, then substitution of (2.22) into (2.21) yields the discrete flux
But because of the assumed bilinearity of ty m (2.22), we know that
38
MULTILEVEL ADAPTIVE METHODS FOR PDES
Hence,
Since all segments are essentially the same, letting $u$ be the vector of node values of $\psi$, we are led to the following discrete Poisson equation at interior points:
Rectilinear elements require some care at composite grid interfaces. In fact, a straightforward implementation of FVE is ill-defined because element and volume boundaries coalesce, inhibiting integrability of the discrete fluxes. This difficulty may be avoided by defining the fluxes as one-sided limits of perturbed fluxes or by averaging such limits from both sides. This has the advantage of ensuring patch conformity (see §2.5). Another somewhat more complicated approach is to use different volumes like the one shown in Figure 2.18. This violates patch conformity (assuming that we would have used rectangular volumes for the fine grid alone), but it preserves most of the other aspects of triangular FVE discretization. We now briefly examine a third possibility that is based on using different elements. Consider the hybrid element partition shown in Figure 2.19, which consists of triangles everywhere except for the rectilinear elements at the interface. A side rectilinear element as depicted in Figure 2.20 has five degrees of freedom corresponding to the five nodes (e.g., W, P, SW, A, and S). Again using temporary scratch-pad notation, in each such element we therefore assume that $\psi$ has the quasi-quadratic form
Corner rectilinear elements (Figure 2.21) have six degrees of freedom, so we use
$\psi$ is assumed to be linear in each triangle. This hybrid scheme leads to the following stencils for Poisson's equation at the side interface points
Figure 2.18. Modified volume at coarse grid interface point for rectangular elements.
Figure 2.19. Hybrid element partition of composite grid.
Figure 2.20. Coarse grid side interface point for rectilinear elements.
Figure 2.21. Coarse grid corner interface point for rectilinear volumes.
of Figure 2.20:
This hybrid scheme seems to be a natural way to treat grid interfaces for the case of rectilinear elements, and it has proved useful in analysis (cf. [Cai and McCormick 1990]). However, it comes at the cost of losing symmetry, continuity of the finite elements (quasi-quadratics are generally nonlinear along element boundaries and thus differ from linears at common boundaries like the W-P line of Figure 2.20), nestedness of the element grids, and patch conformity.
2.9 Time-Dependent Equations
To illustrate the versatility of FVE, we apply it here to the time-dependent model problem (1.5). We first consider the uniform grid case as depicted in Figure 2.22, with mesh sizes $h_x$ and $h_t$ in space and time, respectively. Since (1.5) is actually in conservative form, the basic FVE approach to its discretization is essentially the same as for the other model problems. However, the different character of the operator and boundary conditions in time suggests different control volumes, namely, that they be "lagged" in time as shown in Figure 2.23. This construction means that the upper and lower control volume boundary segments coincide with element boundaries, but because the time derivative is only first order, there is no danger in applying the integral forms to finite element functions. Let $u_i^k$ represent the value of the FVE approximation to $\psi$ at the node $N_{ik}$ on space line $i$ and time line $k$. Then a straightforward calculation leads to the discrete equations
Figure 2.22. Uniform grid for (1.5).
Figure 2.23. Control volume V and triangular elements for (1.5). Labels are indexes of the space lines ($i - 1$, $i$, $i + 1$) and time lines ($k$, $k + 1$).
Here, to approximate $\int_V \eta\,dV$, we used the average values of $\eta$ at the two grid points of $V$. Note that (2.27) defines an implicit time-stepping method for the solution of (1.5). In fact, this is just the Crank-Nicolson method with time differencing averaged in space. The terms at time line $k + 1$ include the nearest neighbors of space line $i$ as a result of contributions from both the temporal and spatial derivatives. Note that computation of the approximation $u_i^k$ can be done in the usual implicit way by solving the tridiagonal linear equations for $u_i^{k+1}$ for each value of $k$ in turn, starting with $k = 0$. For higher dimensions, the equations defining the $(k + 1)$-line solution could be solved by multigrid methods (MG) using the $k$-line approximation as the initial guess. (In some cases, a better implicit equation solver would be full multigrid (FMG) properly modified to incorporate initial fine grid approximations.) To illustrate FVE for the case of local refinement in both space and time, consider the simple case displayed in Figure 2.24 of one refinement with a mesh factor of 2 on $(\frac{1}{2}, 1) \times (0, \infty)$. All points of this grid have equations analogous to (2.27) except the coarse grid interface points. (For the fine grid equations, the slave points are treated in the usual way by defining the solution values there as averages of the values at the two neighboring coarse grid interface points.) The equation at point
Figure 2.24. Simple space time level refinement of (1.5). Slave points are indicated by o.
Figure 2.25. Control volumes and triangular elements at an interface for
Figure 2.26. Sample volume V0 for illustrating time conservation.
An important consequence of the use of FVE for time-dependent problems is that the discrete equations that it produces are naturally conservative, even in the presence of grid interfaces. (For a different approach to conservation in the context of finite differences, see [Berger 1984].) For example, let $V_0$ be the union of all of the volumes between any two consecutive global time lines and let $S_0$ be its surface. See Figure 2.26 for an example using time lines $k$ and $k + 1$. As in §2.6, we have that
where $v = v(x,t)$ is the finite element function with nodal values $u_i^k$. Using the no-flow boundary conditions in (1.5) on the east and west boundaries of $S_0$, we conclude that
where $t_k = k h_t$ and $t_{k+1} = (k + 1) h_t$. We can interpret this relation to mean that the total mass at time $t_{k+1}$ is the total mass at time $t_k$ plus the total mass produced by the source in the time interval $[t_k, t_{k+1}]$. This is the usual concept of time conservation, but it is further supported by the local property that $V_0$ could have been chosen, as in Figure 2.12, to be the union of any set of control volumes associated with the grid. An FAC scheme for solving the composite grid equations for (1.5) using equations based on (2.27) and (2.28) will be described in §4.7.
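The time conservation property can be checked numerically on a uniform grid. The sketch below is hedged in several ways: it assumes that (1.5) has the diffusion form $\psi_t = \psi_{xx} + \eta$ on $(0,1)$ with no-flow boundaries, and it uses the standard Crank-Nicolson scheme with node-centered control volumes rather than the space-averaged time differencing of (2.27). All function names are hypothetical. Each step solves one tridiagonal system, as described above, and the discrete mass changes only by the integrated source.

```python
import numpy as np

def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system by forward elimination and back substitution.
    The entries lower[0] and upper[-1] do not affect the result."""
    n = len(diag)
    c = np.zeros(n)
    d = np.zeros(n)
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = np.zeros(n)
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def volume_lengths(n, h):
    """Lengths of the control volumes: h inside, h/2 at the two walls."""
    w = np.full(n, h)
    w[0] = w[-1] = 0.5 * h
    return w

def total_mass(u, h):
    """Discrete total mass, the sum of nodal values times volume lengths."""
    return float(volume_lengths(len(u), h) @ u)

def crank_nicolson_step(u, source, h, dt):
    """One step of W(u_new - u_old)/dt = -K(u_new + u_old)/2 + W*source,
    where K holds the net outward fluxes with zero flux at both walls."""
    n = len(u)
    w = volume_lengths(n, h)
    k_diag = np.full(n, 2.0 / h)
    k_diag[0] = k_diag[-1] = 1.0 / h
    k_off = -1.0 / h
    flux = k_diag * u                     # K applied to the old values
    flux[:-1] += k_off * u[1:]
    flux[1:] += k_off * u[:-1]
    rhs = w * u - 0.5 * dt * flux + dt * w * source
    off = np.full(n, 0.5 * dt * k_off)
    return thomas(off, w + 0.5 * dt * k_diag, off, rhs)
```

Because the columns of the flux matrix sum to zero, summing the discrete equations over all volumes cancels every interior flux, which is the discrete counterpart of the argument based on $V_0$ and $S_0$.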
2.10 Theory
Discretization error for a given FVE application can be rigorously estimated by local truncation error analysis in the same way that it is usually done for finite differences. However, this type of analysis can be much too pessimistic for composite grids: On nonuniform grids, some
formulas have actual errors of order $h^2$ but truncation errors of order 1, preventing bounds based on the latter from being realistic. Another way to obtain discretization error bounds is by existing finite element theory based on interpreting FVE as a Petrov-Galerkin method, where the test functions are characteristics of the control volumes (see §3.2). Yet another possibility is to use the fact that, in many cases, FVE and the classical Galerkin method produce nearly the same discretization, so their approximations are close in the (continuous) energy norm. This is the approach taken in [Bank and Rose 1987] for general diffusion equations. Unfortunately, because of a technical assumption about how the volumes are constructed, their theory applies to FVE only in the Poisson case ($p = 1$). In any event, even though interpreting FVE as a Petrov-Galerkin or approximate Galerkin method can yield $O(h)$-type error estimates, a more direct approach provides even stronger results. This will be illustrated in the following brief description of the specialized but developing theory on FVE. It is interesting, if not potentially fruitful, to start with a simple one-dimensional example. Consider the two-point boundary value problem
with the analytic compatibility condition
Let $0 = x_0 < x_1 < \cdots < x_m = 1$ be arbitrarily spaced mesh points with mesh size $h_k = x_k - x_{k-1}$, $1 \le k \le m$. Define the control volumes $V_k = (a_k, a_{k+1})$, $0 \le k \le m$, where $a_k = \frac{1}{2}(x_{k-1} + x_k)$, $1 \le k \le m$, $a_0 = 0$, and $a_{m+1} = 1$. Then integrating (2.29) over each $V_k$, using the familiar one-dimensional Gauss Divergence Theorem, and imposing the boundary conditions indirectly via the equations, we arrive at the discrete system
where $f_k = \int_{a_k}^{a_{k+1}} \eta(x)\,dx$. (We assume for simplicity that $\eta$ has been integrated exactly.) Let $v^*$ be the solution of the FVE discretization of
(2.29) using linear basis functions, that is, $v^*$ is the continuous piecewise linear function associated with the elements $(x_i, x_{i+1})$ that satisfies (2.31) with $\psi = v^*$. Let $\psi^*$ be the solution of (2.29), so it too satisfies (2.31). This implies that the discretization error $e^* = \psi^* - v^*$ satisfies
Thus, $(e^*)'(a_k) = 0$, $1 \le k \le m$, which means that the discrete solution has exact fluxes at grid midpoints. We can then use a Taylor series expansion to estimate the $H_1([0,1])$ semi-norm of the actual error as follows:
where ^ is some quantity (dependent on t) between 0 and /, ||(V'*)"||oo = maxo< x
have the 1-D discrete $H_1([0,1])$ semi-norm estimate
Here we assume that $\|(\psi^*)'''\|_\infty$ exists. (With a little more difficulty, we could have developed these estimates in terms of higher Sobolev semi-norms.) It is this type of error estimate that we seek for the general case. In higher dimensions, the fluxes are not generally exact at any particular point of the control surface, but the exactness of the total fluxes on each surface can be used to establish strong error estimates for certain cases. To illustrate this approach in its simplest form, consider the 2-D Dirichlet problem
on the unit square $\Omega = (0,1) \times (0,1)$. Here we use FVE discretization by triangular elements on a uniform grid, $\Omega^h$, as shown in Figure 2.27. Consider the 2-D discrete $H_1(\Omega)$ semi-norm defined on an appropriate subspace of $H_1(\Omega)$ by
where $\psi_h$ is the finite element interpolant of $\psi$. Define the norm
on the appropriate subspace of $H_1(\Omega)$. (Since an $H_1(\Omega)$ function need not be defined everywhere, its interpolant may not be defined. Strictly
Figure 2.27. Uniform grid for Dirichlet problem (2.34).
speaking, then, the discrete semi-norm exists only for functions that are defined at the nodes. However, this causes no trouble because our theory has been simplified by restricting the solution $\psi^*$ to a subspace of $H_1(\Omega)$ where $\|\psi^*\|_{3,\infty}$ is defined, namely, the "appropriate subspace" $C^3(\Omega)$.) We then have the following discretization error estimate.
THEOREM 2.1. FVE has $O(h^2)$ accuracy in the discrete $H_1(\Omega)$ semi-norm according to the error estimate
Proof. To simplify the proof, assume that the nodes have been linearly ordered using the indexing set $\mathcal{M}$:
For each $P_i^h \in \Omega^h$, define the index set corresponding to nearest neighbors of $P_i^h$ by $\mathcal{N}_i = \{j : P_j^h \in \bar\Omega^h,\ \mathrm{dist}(P_i^h, P_j^h) = h\}$, where $\mathrm{dist}(P,Q)$ denotes the Euclidean distance between points $P$ and $Q$ in $\Omega$. First note that
where $W$ is the set of unordered index pairs $\{i,j\}$ such that either $P_i^h \in \Omega^h$ and $j \in \mathcal{N}_i$ or $P_j^h \in \Omega^h$ and $i \in \mathcal{N}_j$. (By unordered we mean that
$\{i,j\} = \{j,i\}$.) Thus, $W$ is the indexing set corresponding to all unique nearest neighbor connections on grid $\Omega^h$, including interior-to-boundary connections. Now let $B$ be the operator defined on an appropriate subspace of $H_1(\Omega)$, with range in $U^h$ (the grid functions on $\Omega^h$), defined so that the value of $B\psi$ at grid point $P_i^h$ in $\Omega^h$ is the total flux
where $S_{P_i^h}$ is the surface of the control volume about $P_i^h$. For each $i \in \mathcal{M}$ and $j \in \mathcal{N}_i$, let $S_{ij} = S_{P_i^h} \cap S_{P_j^h}$ be the surface segment between $P_i^h$ and $P_j^h$ and define the segmental fluxes
so that
Then, for any piecewise linear function $v$, since $b_{ij}(v) = -b_{ji}(v)$, we have
Now since $b_{ij}(v) = v(P_j^h) - v(P_i^h)$, we have what can be called the discrete ellipticity condition
Since $\psi^*$ and $v^*$ both satisfy the discrete equations, $B\psi^* = Bv^*$, which implies $Be^*_h = -Be^\perp_h$, where $e^*_h$ is the finite element interpolant of $e^* = \psi^* - v^*$ and $e^\perp_h = \psi^* - \psi^*_h$. (Note that the finite element interpolant of $v^*$ is just itself.) By (2.35) and the Cauchy-Schwarz
inequality, we then have
Thus,
To estimate the right-hand side of (2.36), consider first a horizontal connection by supposing $P_j^h$ is the nearest east neighbor of $P_i^h$. With $Q^h = \frac{P_i^h + P_j^h}{2}$ and using Taylor series expansions, we thus have
Since there are $m^2 = h^{-2}$ horizontal connections in $W$, treating the vertical connections analogously yields the estimate
The theorem now follows from this and (2.36). This theory easily generalizes to the potential flow case, as exemplified by model problems (1.1) and (1.2), provided $p = p(x,y)$ has appropriate smoothness and $p \gg 0$ in $\Omega$. It can also be extended to the case of composite grids, but with some awkwardness at the interfaces. In fact, using a special choice of volumes (different from what we use in this book) and with the loss of a factor of $h$ in the bound on accuracy, it can be applied to fairly general nonuniform triangulations (cf. [Cai, Mandel, and McCormick 1989]). Other work (cf. [Cai and McCormick 1990]) more directly applies to the FVE discretizations considered in this book, but its proofs are far more intricate than is probably necessary. In any case, there is as yet no theory for FVE that applies to a more general class of problems, allows for more general choices for volumes and triangulations, and provides guidance for these choices.
2.11 Numerical Examples
This section contains results of numerical experiments with FVE applied to the first three model problems. These results, and those of subsequent chapters, are meant to serve only as illustrations, not as guides to developing highly efficient codes. In fact, the algorithms were implemented using a straightforward approach, with no concern for optimality. Therefore, except to understand the floating point environment, it is of little importance that these experiments were performed using Fortran in scalar mode on a Sequent Balance 2100. Unless otherwise noted, all tests used the discretization schemes described in §§2.1 through 2.7. In particular, the FVE approach generally involved triangular elements, piecewise linear basis functions, and control volumes constructed as described in §2.5. Note especially the volumes for planar cavity flow that are illustrated in Figure 2.13. To test the accuracy of FVE for well-posed potential flow, we considered (1.1) with the functions $p = 1$, $\psi_0 = \psi_1 = 0$, and the single point source $\eta = \delta_{(0,0)}$. Here, $\delta_{(x,y)}$ is the Dirac delta function defined by the condition that
for all appropriate functions $\psi$ and any volume $V \subset \mathbb{R}^2$ containing the point $(x,y)$. This means that the resulting discrete equation
l         errors for the four values of h_1
l = 1     0.205 E 0    0.797 E-1    0.268 E-1    0.703 E-2
l = 2     0.855 E-1    0.280 E-1    0.724 E-2    -
l = 3     0.288 E-1    0.770 E-2    -            -
Table 2.1. Discrete energy norm estimates for well-posed potential flow.
l         errors for the four values of h_1
l = 1     0.232 E 0    0.113 E 0    0.391 E-1    0.994 E-2
l = 2     0.174 E 0    0.580 E-1    0.148 E-1    -
l = 3     0.712 E-1    0.184 E-1    -            -
Table 2.2. Discrete energy norm estimates for singular potential flow.
has the source term defined by
which is the characteristic function of $P = (0,0)$ in $\Omega^{\underline{h}}$. The numerical experiments used a composite grid consisting of $l \ge 1$ levels, where level $k + 1$ has a mesh size $h_{k+1} = h_k/2$ and represents a refinement of the SW quadrant of level $k$. Thus, each level has the same number of grid points, namely, $h_1^{-2}$. Table 2.1 displays the results of estimating the size of the discretization error for various numbers of levels $l$ and for various global mesh sizes $h_1$. This error was estimated by measuring
where we used the discrete energy norm $|||u^{\underline{h}}||| = \langle L^{\underline{h}} u^{\underline{h}}, u^{\underline{h}} \rangle^{1/2}$ and where $u^{h_0}$ denotes the exact solution on the global uniform grid with mesh size $h_0$ = y^g. Data is included for the cases for which $h_l > h_0$. While these results are not extensive enough to make very confident conclusions, certain trends are suggested. First, by observing the errors for fixed $l$, they appear to behave very roughly like $O(h^{1.5})$. Second, local resolution appears to be very effective here because the composite grid errors are almost as good as those of the corresponding global fine grids. For example, compare entry $l = 3$, $h_1$ = -^ with entry
$l = 2$, $h_1$ = ^ and entry $l = 1$, $h_1$ = ^. Third and perhaps most significant, each composite grid appears to have much better accuracy than the global grid that consists of about the same number of points. For example, compare entry $l = 3$, $h_1$ = ^ with entry $l = 1$, $h_1$ = ^. The global grid here has one-third more points but about three-and-a-half times the error of the composite grid. (Here we count multiplicities in the composite grid due to two or more levels sharing the same point.) Similar tests were run for the no-flow potential equation given in (1.2)-(1.3). We again considered the Poisson case ($p = 1$), but used the two-point source term $\eta = \delta_{(0,0)} - \delta_{(1,1)}$. To resolve both singularities induced by $\eta$, we refined both the SW and NE quadrants of $\Omega$. In this case, each fine level $k \ge 2$ has twice as many points as the coarsest level $k = 1$. The results in Table 2.2 are similar to those for (1.1), although the behavior of the errors for fixed $l$ is more erratic and the advantages of local patches are less pronounced. For example, the global grid corresponding to entry $l = 1$, $h_1$ = ^ now has 20% fewer points and only about twice the error of the composite grid corresponding to entry $l = 3$, $h_1$ = -^. These results should not be taken too seriously. First, comparing errors for grids with approximately the same number of points does not fully take into account the cost of the solution process. We will see that multilevel methods are optimal for both global and composite grid problems, so their complexity is bounded by a small constant times the number of grid points. However, bounds leave much room for differences in performance that could shift the balance of comparison. In other words, more careful comparisons would be based on the actual accuracy produced by the solution method and what it costs to attain it. Second, there has been no serious attempt to fit the local grids to the solution in an efficient way.
A more realistic placement of the patches should give greater advantage to local refinement. But third and more to the point, the textbook model problems treated here are generally too idealized to fully reap the benefits of adaptive methods. It would be much more informative to experiment with more sophisticated models and actual solvers. However, the purpose here is concrete illustration of basic concepts, not extensive numerical analysis of performance on realistic, and therefore specialized, applications. To test the accuracy of FVE for planar cavity flow, we applied the scheme outlined in §2.7 to (1.4) with the functions $\eta = 0$, 0 = 0, $\psi_0 = 0$, and $\psi_1$ defined by
Re          pairs for the three values of h
Re = 0      (0.335 E 0, 0.241 E-2)    (0.295 E 0, 0.604 E-3)    (0.247 E 0, 0.122 E-3)
Re = 50     (0.344 E 0, 0.257 E-2)    (0.283 E 0, 0.587 E-3)    (0.247 E 0, 0.120 E-3)
Re = 100    (0.537 E 0, 0.426 E-2)    (0.231 E 0, 0.806 E-3)    (0.244 E 0, 0.156 E-3)
Table 2.3. Discrete energy norm estimates ($\mathrm{err}_\theta$, $\mathrm{err}_\psi$) for planar cavity flow.
These experiments are restricted for simplicity to the case of a single level, that is, a global grid with uniform mesh size $h$. We further restrict these tests to low Reynolds numbers because the current work [Liu and McCormick 1988a] for high Reynolds number flows is in its early stages. To measure the accuracy of the individual functions $\theta$ and $\psi$, we used a special discrete energy-like norm constructed as follows. The discretization of (1.4) typified by (2.16) and (2.18) can be written as
where we write L^h to emphasize its dependence on $u^h$ (the nodal values for $\psi$). Associating the first equation with vorticity and the second with stream function, we define the following approximate discrete error norms:
and
Here, $v^{h_0}$ and $u^{h_0}$ denote the exact solutions of (2.37) for $h_0$ = ^. Table 2.3 displays results for various fine grid mesh sizes $h$ and Reynolds numbers Re. Note that, while the secondary variable vorticity seems to have an accuracy of $O(1)$, the physically important variable stream function is apparently $O(h^2)$.
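The rates quoted in this section can be checked from the tabulated errors. The short Python sketch below computes observed orders from successive entries. It rests on an assumption that the tables themselves do not settle here, namely that adjacent columns correspond to halving the mesh size. The data are the $l = 1$ row of Table 2.1, read as 0.205, 0.0797, 0.0268, 0.00703, and the $\mathrm{Re} = 0$ row of Table 2.3, with the first member of each pair taken as the vorticity error and the second as the stream function error. The function and variable names are illustrative.

```python
import math

def observed_orders(errors, ratio=2.0):
    """Estimate p in err ~ C * h**p from errors on meshes whose size
    shrinks by `ratio` between consecutive entries."""
    return [math.log(coarse / fine) / math.log(ratio)
            for coarse, fine in zip(errors, errors[1:])]

# Row l = 1 of Table 2.1 (global grids only).
table_2_1_l1 = [0.205e0, 0.797e-1, 0.268e-1, 0.703e-2]

# Row Re = 0 of Table 2.3, split into its two components.
table_2_3_vorticity = [0.335e0, 0.295e0, 0.247e0]
table_2_3_stream = [0.241e-2, 0.604e-3, 0.122e-3]

orders_potential = observed_orders(table_2_1_l1)
orders_vorticity = observed_orders(table_2_3_vorticity)
orders_stream = observed_orders(table_2_3_stream)
```

Under the halving assumption, the potential flow orders lie between roughly 1.3 and 2, the vorticity orders stay well below 1, and the stream function orders are close to 2, which is consistent with the trends described in the text.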
2.12 Comments
Although FVE does not yet have a unifying theory that could clarify principles for guiding the choices of elements and control volumes and the treatment of boundary conditions, it has proved to be an accurate and versatile discretization method for fluid flow problems on composite grids. We will show in subsequent chapters how FVE and its physical interpretation simplify some of the aspects in the development of multilevel solvers.
While not theoretically founded, certain FVE design principles which we have already introduced have proved useful for our composite grid applications. We summarize them loosely as follows:
1. Choose elements and basis functions to represent the local character of the solution.
2. For scalar equations, select rectilinear control volumes for each grid point. For systems, strict one-to-one correspondence between grid points and volumes may not be appropriate, but there should be some local correspondence between the number of volumes and the number of unknowns. These volumes should partition the region when it is convenient, but with Dirichlet boundaries this may not be very critical. The volumes should be determined by choosing the "usual" rectangular volumes at all but the coarse grid interface points.
3. Be sure that the finite element function space is admissible in the sense that the volume integrals are well defined. The Gauss Divergence Theorem may be necessary here to transform volume integrals of derivative expressions to lower-order surface integrals. When this process leaves certain derivatives in the expression for the surface integral, it may be necessary to ensure that the control volumes and finite elements do not share common boundaries.
2.13 Remarks on Notation
Throughout this book, we will implicitly assume that the boundary conditions are treated as described in §2.3. Thus, the Neumann conditions are imposed indirectly via the equations, implying that the grid points on the Neumann boundary correspond to entries of the discrete unknown, $u$. The Dirichlet conditions are assumed to be imposed directly on the unknown, so the values of $u$ at Dirichlet boundaries are determined by the boundary data there. As in §2.4, we henceforth assume that this data is incorporated in the right-hand side of (2.4), which has the effect that the Dirichlet boundary conditions can then be considered as homogeneous.
Thus, while the grid points on the Dirichlet boundary do not correspond to explicit entries of $u$, we can think of $u$ as implicitly having zero values there. This interpretation will be needed in the design of interpolation and other multilevel processes. For consistency, we will assume that all grids $\Omega^h$ are "open" in the sense that they do not contain Dirichlet boundary points; we do assume that they contain Neumann boundary points, however. (The
figures usually assume a full Dirichlet boundary for simplicity. See Figure 1.2, for example; note that the interface does not include boundary points.) In general, if $\Omega^h$ is understood to be a subgrid of some larger grid, $\bar\Omega^h$ will mean the original grid and its nearest grid point neighbors, including those along the diagonal. As with $\Omega^h$, $\bar\Omega^h$ is not meant to include Dirichlet boundary points. Again for consistency, we will always assume that members of the finite element spaces (e.g., $T$) have zero values at Dirichlet boundaries. Finally, unless otherwise noted, we will assume that the elements defined on each of the various levels are related so that a coarser element space is contained in a finer one, whether they correspond to uniform or composite grids. We call these conforming spaces for later reference.
Chapter 3
Multigrid Methods
3.1 Basic Concepts
Multigrid (MG) serves as a basic component of fast adaptive composite grid techniques (FAC) because of its apparent optimal efficiency as an iterative method. We are interested in how MG behaves as a solver of discretizations on uniform grids because it will be used for the composite grid equations restricted to the global and patch subgrids. As such, for many types of problems, basic MG cycles can be developed that converge at a linear rate that is independent of mesh size, with typical factors of 0.2 or less, and at an arithmetic cost proportional to the number of unknowns, often just 2 or 3 times the cost of evaluating the discrete operator at a prospective solution. As we shall see in the next chapter, such grid solver efficiency is compatible with the two goals of computation elucidated in §1.3. The purpose of this chapter is to introduce the basic concepts necessary to fully appreciate the role that MG solvers play in the FAC methods. We will assume that the reader has a fundamental understanding of MG and when and why it works. An excellent source for this purpose is A Multigrid Tutorial [Briggs 1987], which cites references for more advanced topics. Because of its abundance, we will not discuss existing theory for MG methods, but instead refer the reader to Chapter 4 of [McCormick 1987] and the references cited therein.
To describe MG methods in their general form, suppose we are given a family of uniform grids $\Omega^h$ covering the region $\Omega$, on which we have the discrete problems
Here, $h$ is a generic discretization parameter which we may think of as the mesh size of $\Omega^h$ in the case that it is the same in each coordinate direction. Let $U^h$ denote the space of vectors considered as functions on $\Omega^h$. Thus, the operator $L^h$ in (3.1) is of the form $L^h : U^h \to U^h$. Suppose we choose a sequence of nested grids in this family, starting from the coarsest, $\Omega^{h_c}$, to the finest, $\Omega^{h_f}$, with successive mesh sizes differing by a power of 2. (By the term "nested grids" we mean $\Omega^{2h} \subset \Omega^h$ for all "admissible" $h$, $2h$ in the sequence.) To relate the grid spaces in this sequence, assume we are given coarse-to-fine and fine-to-coarse (inter)grid transfer operators of the respective forms $I_{2h}^h : U^{2h} \to U^h$ and $I_h^{2h} : U^h \to U^{2h}$. Finally, we assume some relaxation process given by the expression
Our notation here avoids iteration subscripts by interpreting the approximation $u^h$ as a dynamic variable that is allowed to change by assignment of the form expressed in (3.2). This expression should be interpreted as meaning that the old assignment of $u^h$ is used as the initial guess in the relaxation process denoted by $G^h$, which is replaced by the new assignment generated by $G^h$. (When absolutely necessary, however, we will use iteration subscripts closed in parentheses as in $u_{(\mathrm{old})}$, $u_{(\mathrm{new})}$, and $u_{(k)}$.) $G^h$ may represent one or more sweeps of some iterative process such as the Gauss-Seidel relaxation. Let $\nu_1$ and $\nu_2$ be two iteration parameters. Then one cycle of a linear, V-cycle MG process is represented by $u^h \leftarrow MG^h(u^h; f^h)$ and defined recursively by the following steps:
Step 1 means that a direct method is used on the coarsest grid, although all that is usually needed in practice is to compute an approximation
with an error that does not contaminate the overall error of the cycle; this can often be achieved by a few relaxation sweeps of Gh provided the coarsest grid has just a few points. We will generally assume v\ = v
rally provides the choice for interpolation. For example, discretizations based on linear elements would use linear interpolation to define $I_{2h}^h$ (taking the orientation of the element triangles into account). Note that because the grids are nested, so are the triangles. This means that a continuous $2h$-piecewise linear function is also a continuous $h$-piecewise linear function, that is, the "coarse" finite element space $T^{2h}$ is a subspace of the "fine" finite element space $T^h$. This in turn means that $I_{2h}^h$ corresponds to the "natural imbedding" (or "identity") from $T^{2h}$ to $T^h$. For the sample nodes shown in Figure 3.1, we have the following characterization of $u^h = I_{2h}^h u^{2h}$:
The only remaining choice is the restriction operator, for which we rely on the physics. $I_h^{2h}$ is used to transfer right-hand sides, $f^h$, and residuals, $r^h = f^h - L^h u^h$, to coarser levels. By our physical interpretation of §1.6, these are source flow rates integrated over control volumes. That is, $f^h_{ij}$ represents the total mass per unit time flowing into the control volume centered at node $N_{ij}$. Since the purpose of $I_h^{2h}$ is to lump these grid $h$ sources for representation on the scale of grid $2h$, it is reasonable to determine the coefficients in $I_h^{2h}$ by the fraction that the fine grid volumes intersect the coarse ones. For the example depicted in Figure 3.2, $u^{2h} = I_h^{2h} u^h$ is thus defined by the following:
There are several points to observe about these MG constructs. First, the FVE philosophy together with the physical interpretation fully guides these choices. This is especially true of coarse grid operator scaling, which can otherwise be tricky. Second, as we will see in the next section, FVE and the physics also guide our choice of inner products and error norms, on all levels. Other points to observe about these constructs concern a formal interpretation of FVE as a Petrov-Galerkin method, MG treatment of constant functions, and how these properties relate to singular equations of the type discussed in §2.6. These are the main topics of the next section.
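The constructs of this section can be tried out on the simplest possible case. The following Python sketch is an assumption-laden illustration, not the code used in this book: it treats the 1-D Poisson problem with zero Dirichlet data, uses the flux-scaled operator $(1/h)\,\mathrm{tridiag}(-1, 2, -1)$ with integrated sources on the right-hand side, interpolates linearly, and restricts with the weights $\frac{1}{2}, 1, \frac{1}{2}$ that a fine volume contributes according to the fraction of it lying in a coarse volume. The cycle follows the usual recursive V-cycle pattern (relax, restrict the residual, recurse from a zero guess, interpolate and correct, relax), and all names and the default $\nu_1 = \nu_2 = 1$ are illustrative choices.

```python
import numpy as np

def apply_L(u, h):
    """Flux-scaled 1-D Poisson operator (1/h)*tridiag(-1, 2, -1) acting on
    interior values, with zero Dirichlet values at both ends."""
    up = np.concatenate(([0.0], u, [0.0]))
    return (2.0 * up[1:-1] - up[:-2] - up[2:]) / h

def gauss_seidel(u, f, h, sweeps):
    """Lexicographic Gauss-Seidel relaxation, updating u in place."""
    n = len(u)
    for _ in range(sweeps):
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < n - 1 else 0.0
            u[i] = 0.5 * (h * f[i] + left + right)
    return u

def restrict(r):
    """Lump fine grid sources into coarse volumes: a fine volume centered at a
    coarse node counts fully, its two neighbors count by one half each."""
    return 0.5 * r[0:-2:2] + r[1:-1:2] + 0.5 * r[2::2]

def interpolate(uc):
    """Linear interpolation of coarse grid values to the fine grid."""
    uf = np.zeros(2 * len(uc) + 1)
    uf[1::2] = uc
    up = np.concatenate(([0.0], uc, [0.0]))
    uf[0::2] = 0.5 * (up[:-1] + up[1:])
    return uf

def v_cycle(u, f, h, nu1=1, nu2=1):
    """One recursive V-cycle for L^h u = f, starting from the guess u."""
    if len(u) == 1:
        u[0] = 0.5 * h * f[0]             # exact solve on the coarsest grid
        return u
    gauss_seidel(u, f, h, nu1)
    coarse_rhs = restrict(f - apply_L(u, h))
    correction = v_cycle(np.zeros(len(coarse_rhs)), coarse_rhs, 2.0 * h, nu1, nu2)
    u += interpolate(correction)
    gauss_seidel(u, f, h, nu2)
    return u
```

With these scalings no extra factor is needed on the coarse grid: restricting the fine operator applied to an interpolated coarse vector reproduces the same flux-scaled operator with mesh size $2h$, which is one way to see why the physical interpretation takes care of coarse grid operator scaling in this simple setting.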
Figure 3.1. Coarse (solid lines) and fine (solid and dotted lines) grid lines in 2-D space. The grid line indices vary on the fine grid scale so that the coarse grid point corresponding to the fine grid point $2i, 2j$ is just $i, j$, for example.
Figure 3.2. Relationship between typical coarse (dashed) and fine (dotted) grid control volumes.
3.2 Galerkin Operators and Singular Equations
Discretization methods can usually be viewed as attempts to approximate a finite number of components of the PDE solution that are distinguished by some measure of smoothness. In fact, the accuracy of the discretization must improve as the components exhibit greater smoothness. A useful measure of smoothness is the (continuous) energy norm $|||\psi||| = \langle \mathcal{K}\psi, \psi \rangle^{1/2}$, where $\mathcal{K}$ is the differential operator and $\langle \cdot, \cdot \rangle$ the $L_2$ inner product. (For the moment, we restrict our attention to those $\psi$ for which $|||\psi|||$ is well defined.) In this sense, components with small energy norms must be approximated very accurately. As
a limiting case, this means that null space components must be approximated exactly, as asserted in §2.6. (In a similar way, since MG uses coarser levels to eliminate smooth error components, increasingly singular components must be approximated on the coarser grids with increasingly greater accuracy. Whether such approximation requirements for MG are satisfied depends solely on the discretization upon which the MG coarse grid corrections are based.) To say this more concretely, consider model problem (1.2), denote $\mathcal{K} = \nabla \cdot p \nabla$, and let $\Psi$ be an appropriate space of functions on which $\mathcal{K}$ is defined in some sense. Note that (1.2) has the weak formulation: Find a function $\psi$ in $\Psi$ such that
for all functions $\phi$ in some appropriate space $\Phi$. Now let $T^h$ be the space of continuous piecewise linear functions on a given triangulation, $\mathcal{E}^h$, of $\Omega$ with mesh parameter $h$ and let $\Phi^h$ be the space of generally discontinuous functions that are piecewise constant on a corresponding volume partition of $\Omega$. Assume that $T^h \subset \Psi$ and $\Phi^h \subset \Phi$. The FVE discretization of (1.2) can now be interpreted formally as a Petrov-Galerkin method: Find $v^h$ in $T^h$ such that
for all <jth in $h. Define Th : Th -» tf as the identity on T\ Let Th be the adjoint of a similar imbedding of . Let r]h = Thrj. Then (3.4) can be written as
where )Ch is given by the Galerkin condition
Note that K,h : Th —> $h is just the finite element space operator corresponding to the matrix L = Lh given in (2.4): If vh has nodal values uh, then tChvh is a piecewise constant function with nodal values Lhuh. It is important to remember that this construction was purely formal. Actually, /C is not defined on the finite element space T'1, even in the sense of (3.4). To be more rigorous, (3.4) must be interpreted by way of the divergence theorem: When 4> is piecewise constant on
MULTIGRID METHODS
a volume partition $\mathcal{V}$, we may write $\langle \mathcal{K}\psi, \phi \rangle$ as a sum of terms of the form $\phi_V \int_V \mathcal{K}\psi \, dV$, where $\phi_V$ is the value of $\phi$ on $V \in \mathcal{V}$; when $\psi$ is sufficiently smooth, these terms can then be written as $-\phi_V \int_S p \nabla\psi \cdot n \, dS$; we thus take $\langle \mathcal{K} u^h, \phi^h \rangle$ in (3.4) to mean $-\sum_{V \in \mathcal{V}} \phi_V \int_S p \nabla u^h \cdot n \, dS$. Henceforth, we will refer to this interpretation as the weak form of (3.4). Note that (3.6) must also be interpreted in a weak sense: $\mathcal{K}^h$ is the operator mapping $T^h$ onto $\Phi^h$ such that
$$\langle \mathcal{K}^h u^h, \phi^h \rangle = \langle \mathcal{K} u^h, \phi^h \rangle$$
for all $u^h \in T^h$, $\phi^h \in \Phi^h$, where these forms are taken in the weak sense. We will henceforth refer to this interpretation as the weak form of (3.6). Now let $\Omega^{2h}$ be any coarsening of $\Omega^h$ in the sense that every element of $\Omega^{2h}$ can be written as a union of elements of $\Omega^h$. Thus, its associated space $T^{2h}$ is a subspace of $T^h$. Assume for the moment that $\Phi^{2h}$ is a subspace of $\Phi^h$. We may then define $\Gamma_{2h}^h : T^{2h} \to T^h$ and $\Gamma_h^{2h} : \Phi^h \to \Phi^{2h}$ implicitly by the conditions (see Figure 3.3)
$$\Gamma_{2h} = \Gamma_h \Gamma_{2h}^h, \qquad \Gamma^{2h} = \Gamma_h^{2h} \Gamma^h.$$
It is easy to see that this characterizes $\Gamma_{2h}^h$ and $\Gamma_h^{2h}$. From these considerations, we can easily see that the diagram in Figure 3.4 also commutes, that is, the various function space operators satisfy the discrete Galerkin condition
$$\mathcal{K}^{2h} = \Gamma_h^{2h} \mathcal{K}^h \Gamma_{2h}^h. \tag{3.7}$$
Suppose that $\Gamma_{2h}^h$ and $\Gamma_h^{2h}$ correspond to the grid transfers $I_{2h}^h$ and $I_h^{2h}$, respectively. For example, if $u^{2h}$ is the nodal vector for $v^{2h}$, then $I_{2h}^h u^{2h}$ is the nodal vector for $\Gamma_{2h}^h v^{2h}$. Then (3.7) implies the analogous relationship for the grid operators, namely,
$$L^{2h} = I_h^{2h} L^h I_{2h}^h. \tag{3.8}$$
Note that (3.7) and (3.8) hold for any $\Omega^h$ and any of its coarsenings, including the case where $\Omega^h$ represents a composite grid and $\Omega^{2h}$ represents one of its uniform subgrids. A consequence of these properties is that the ability of the grid $2h$ to approximate and correct smooth grid $h$ vectors is dictated by the ability of each grid to approximate smooth vectors of the PDE.
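The grid-level Galerkin relation (3.8) can be checked numerically in a setting where it is known to hold exactly. The following is a minimal sketch under stated assumptions that are not taken from the text: a 1-D analogue with unit coefficient and Dirichlet ends, a volume-integrated operator tridiag(-1, 2, -1)/h, linear interpolation, and restriction taken as the transpose of interpolation. The function names are hypothetical.

```python
# Sketch: check L^{2h} = I_h^{2h} L^h I_{2h}^h for a 1-D model operator.
# Assumptions (not from the text): 1-D grid, Dirichlet ends, unit coefficient.

def fve_operator(n, h):
    """Volume-integrated -u'' on n interior nodes: tridiag(-1, 2, -1) / h."""
    return [[(2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0) / h
             for j in range(n)] for i in range(n)]

def interpolation(m):
    """Linear interpolation from m coarse nodes to 2m + 1 fine nodes."""
    P = [[0.0] * m for _ in range(2 * m + 1)]
    for j in range(m):
        P[2 * j][j] = 0.5
        P[2 * j + 1][j] = 1.0  # coarse node j coincides with fine node 2j + 1
        P[2 * j + 2][j] = 0.5
    return P

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def galerkin_gap(m, h):
    """Largest entry of |R L^h P - L^{2h}|, with restriction R = P^T."""
    P = interpolation(m)
    R = transpose(P)
    coarse = matmul(R, matmul(fve_operator(2 * m + 1, h), P))
    direct = fve_operator(m, 2 * h)
    return max(abs(coarse[i][j] - direct[i][j])
               for i in range(m) for j in range(m))

if __name__ == "__main__":
    # 15 fine nodes with h = 1/16, 7 coarse nodes with 2h = 1/8.
    print(galerkin_gap(m=7, h=1.0 / 16))
```

In this 1-D setting the transpose restriction sums fine volumes into coarse volumes, so it also preserves constant density in the sense used in this section. The 2-D FVE case discussed in the text need not satisfy the relation exactly.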
Figure 3.3. Commutative diagram for function space transfer operators.
Figure 3.4. Commutative diagram showing the Galerkin relationship between operators of different levels.
Unfortunately, these discrete Galerkin conditions do not hold for the FVE discretizations defined in Chapter 2. In fact, $\Phi^{2h}$ is generally not a subspace of $\Phi^h$ because the use of a consistent grid-point-centered scheme for constructing the volumes means that elements of $\mathcal{V}^{2h}$ cannot be written as a union of elements of $\mathcal{V}^h$. However, the Galerkin condition (3.6) that relates each grid operator in the weak sense to the PDE operator does hold for each of the grid levels. Note that this Galerkin condition makes the standard Euclidean inner product and norm (in both the usual and the energy senses) a natural choice for numerical treatment of the discrete systems. Moreover, as we will see in §4.11, FVE discretizations satisfy approximate versions of (3.1) and (3.8), which are useful in developing convergence estimates for FVE-based FAC. An important property of the FVE intergrid transfers for problem (1.2) is that they are faithful to constants up to scale. Specifically,
interpolation preserves constants (i.e., $I_{2h}^h 1 = 1$) and restriction preserves constant density (i.e., if $u_{ij}^h/|V_{ij}^h| = 1$ for all fine nodes $N_{ij}^h$, then $(I_h^{2h}u^h)_{ij}/|V_{ij}^{2h}| = 1$ for all coarse nodes $N_{ij}^{2h}$). Physically speaking, this means that interpolation and restriction faithfully reproduce constant potentials and constant-density flow rates, respectively. The first property and the Galerkin condition (3.6) have the additional consequence that the grid operators are also faithful to the singularity of $\mathcal{K}$ in the sense that $\mathcal{N}(\mathcal{K}^h) = \mathcal{N}(\mathcal{K}) = \mathrm{span}\{1\}$. Since FVE is a conservative difference scheme, we also have $\mathcal{N}((\mathcal{K}^h)^*) = \mathcal{N}(\mathcal{K}^*) = \mathrm{span}\{1\}$. These properties agree with our observations in §2.6. An important consequence of the fact that FVE is faithful to the singularities of $\mathcal{K}$ and $\mathcal{K}^*$ and that its intergrid transfers preserve constants is that all MG levels exactly reproduce the null spaces of the fine grid operator and its adjoint. This means that singular components cannot have adverse effects on the solution process from the coarse grid correction alone. To illustrate this, suppose that the relaxation scheme is one sweep of Richardson's iteration given by $G^h(u^h, f^h) = u^h + \omega(f^h - L^h u^h)$, that is,
$$u^h \leftarrow u^h + \omega(f^h - L^h u^h), \tag{3.9}$$
where $\omega$ is some relaxation parameter. Assuming that $f^h \in \mathcal{R}(L^h)$, then (3.9) propagates the algebraic error $e^h = u^{h*} - u^h$ according to
$$e^h \leftarrow G^h e^h, \tag{3.10}$$
where $G^h = I - \omega L^h$ is the linear part of the relaxation scheme. Writing $e^h = e_N^h + e_\perp^h$, where $e_N^h$ is in $\mathcal{N}(L^h)$ and $e_\perp^h$ is in $\mathcal{N}^\perp(L^h)$, then (3.10) becomes
$$e_N^h + e_\perp^h \leftarrow e_N^h + G_\perp^h e_\perp^h, \tag{3.11}$$
where by $L_\perp^h$ we mean $L^h$ restricted to $\mathcal{N}^\perp(L^h)$ and by $G_\perp^h = I - \omega L_\perp^h$ we mean the linear part of relaxation restricted to $\mathcal{N}^\perp(L^h)$. Thus, singular error components of $L^h$ cannot affect and cannot be affected by any other error components, so the performance of relaxation can be analyzed on $\mathcal{N}^\perp(L^h)$: $\rho(I - \omega L^h) = \rho(I - \omega L_\perp^h)$, for example. The point here is that faithfulness of the coarse grid process to the singularity ensures that MG has the same attribute. Specifically, $MG^h$ propagates the algebraic error according to
$$e^h \leftarrow MG^h e^h, \tag{3.12}$$
where the linear part MGh is defined recursively by
Here, superscript $\dagger$ denotes the Moore-Penrose generalized inverse. Because of the special properties of FVE, we can then rewrite (3.12) as
where $MG_\perp^h$ is the linear part of the multigrid process restricted to $\{1\}^\perp$. This is given recursively by
Therefore, just as with $G^h$, we can analyze MG on $\{1\}^\perp$ with the knowledge that the singular and nonsingular error components have no influence on each other: $\rho(MG^h) = \rho(MG_\perp^h)$, for example.
3.3 Nonlinear Schemes
Most PDE models are nonlinear in the unknown function. Such problems can often be treated by some form of "outer" linearization scheme using MG as an "inner" iteration. But a generally more efficient approach is the full approximation scheme (FAS; cf. [Brandt 1977]), which uses a form of the coarse grid correction that does not require linearity. To explain the basic idea behind FAS, consider the equation
$$L^h(u^h) = f^h. \tag{3.15}$$
To approximate the smooth components of the algebraic error $e^h = u^{h*} - u^h$, we rewrite (3.15) as
$$L^h(u^h + e^h) - L^h(u^h) = f^h - L^h(u^h). \tag{3.16}$$
Assuming that $e^h$ is smooth, then the solution $e^h$ of (3.16) can be approximated by $I_{2h}^h e^{2h}$, where $e^{2h}$ solves
$$L^{2h}(I_h^{2h}u^h + e^{2h}) - L^{2h}(I_h^{2h}u^h) = I_h^{2h}(f^h - L^h(u^h)). \tag{3.17}$$
Given an applicable relaxation scheme, $G^h$, like nonlinear Gauss-Seidel, then (3.17) leads to the following definition of one FAS V-cycle, again represented by $u^h \leftarrow MG^h(u^h; f^h)$:
Here we assume that (3.15) has a unique solution for $h = h_c$, which we write as $(L^h)^{-1}(f^h)$. Note that the only possible linearization necessary in this FAS scheme is in relaxation. This is usually a much simpler process than global linearization, involving ordinary as opposed to general Fréchet derivative evaluations. We also point out that the mapping $I_h^{2h}$ used to transfer $u^h$ to grid $2h$ in Step 3 may be different from that used to transfer $f^h$ and $L^h(u^h)$. We have used the same notation for simplicity. Another nonlinear MG scheme uses a Galerkin approach. The basic idea is that (3.15) can be rewritten in a block form which, in the two-grid case, is given by
Although this "appended" system is singular, it is formally possible to apply relaxation directly to it. To this end, define $L_{u^h}^{2h}(u^{2h}) = I_h^{2h}L^h(I_{2h}^h u^{2h} + u^h)$ and $f^{2h} = I_h^{2h}f^h$. Then a two-grid Galerkin version of FAC applied to (3.15) is given by
The Galerkin scheme can be more effective than FAS for certain types of problems. This is especially true for certain discrete equations that arise from variational principles, including many self-adjoint eigenvalue problems (cf. the Rayleigh quotient MG method described in [Mandel and McCormick 1989c]). This effectiveness probably stems from the property that the Galerkin scheme is compatible with the discretization in the sense discussed in §3.2. The major limitation with this approach stems from the fact that the explicit use of the definition of $L_{u^h}^{2h}$ involves grid $h$ computation. Thus, for relaxation or residual computation on grid $2h$, $L_{u^h}^{2h}(u^{2h})$ may be evaluated by interpolating $u^{2h}$ to grid $h$, adding $u^h$, applying $L^h$, and transferring the result to
grid $2h$. This is prohibitively expensive. For linear problems, this can be avoided by noticing that $L_{u^h}^{2h}(u^{2h}) = L^{2h}u^{2h} + I_h^{2h}L^h u^h$, so the second term can be computed once, allowing all further computations to remain on grid $2h$. For certain nonlinear equations like eigenvalue problems, a similar approach can be used, but such a mechanism for the general case is not evident.
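The FAS coarse grid correction of this section can be sketched for a small nonlinear problem. This is a minimal two-grid illustration under assumptions that are not taken from the text: the 1-D model problem $-u'' + u^3 = f$ with zero Dirichlet data, pointwise injection to transfer $u^h$, full weighting to transfer the residual (the text notes that these two transfers may differ), and many relaxation sweeps standing in for the exact coarsest-grid solve. All function names are hypothetical.

```python
# Sketch of one FAS two-grid cycle for -u'' + u^3 = f on (0, 1), u(0) = u(1) = 0.
# The model problem, transfers, and sweep counts are illustrative assumptions.

def apply_op(u, h):
    """Nonlinear grid operator L(u)_i = (2u_i - u_{i-1} - u_{i+1}) / h^2 + u_i^3."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append((2.0 * u[i] - left - right) / h ** 2 + u[i] ** 3)
    return out

def relax(u, f, h, sweeps):
    """Nonlinear Gauss-Seidel: one scalar Newton step per point, in place."""
    n = len(u)
    for _ in range(sweeps):
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < n - 1 else 0.0
            r = f[i] - ((2.0 * u[i] - left - right) / h ** 2 + u[i] ** 3)
            u[i] += r / (2.0 / h ** 2 + 3.0 * u[i] ** 2)  # ordinary derivative only
    return u

def restrict_fw(v):
    """Full weighting from 2m + 1 fine values to m coarse values."""
    m = (len(v) - 1) // 2
    return [0.25 * v[2 * j] + 0.5 * v[2 * j + 1] + 0.25 * v[2 * j + 2]
            for j in range(m)]

def inject(v):
    """Pointwise injection of fine values at the coarse nodes."""
    m = (len(v) - 1) // 2
    return [v[2 * j + 1] for j in range(m)]

def interpolate(w):
    """Linear interpolation from m coarse values to 2m + 1 fine values."""
    v = [0.0] * (2 * len(w) + 1)
    for j, wj in enumerate(w):
        v[2 * j] += 0.5 * wj
        v[2 * j + 1] += wj
        v[2 * j + 2] += 0.5 * wj
    return v

def fas_two_grid(u, f, h, pre=2, post=1, coarse_sweeps=400):
    """One cycle: relax, solve the coarse full-approximation equation, correct, relax."""
    u = relax(list(u), f, h, pre)
    residual = [fi - li for fi, li in zip(f, apply_op(u, h))]
    u_c0 = inject(u)
    # Coarse right-hand side: L^{2h}(I u^h) + I (f^h - L^h(u^h)), cf. (3.17).
    f_c = [a + b for a, b in zip(apply_op(u_c0, 2 * h), restrict_fw(residual))]
    u_c = relax(list(u_c0), f_c, 2 * h, coarse_sweeps)
    correction = interpolate([a - b for a, b in zip(u_c, u_c0)])
    u = [a + b for a, b in zip(u, correction)]
    return relax(u, f, h, post)

def residual_norm(u, f, h):
    return sum((fi - li) ** 2 for fi, li in zip(f, apply_op(u, h))) ** 0.5

if __name__ == "__main__":
    n, h = 31, 1.0 / 32
    f, u = [1.0] * n, [0.0] * n
    for cycle in range(1, 7):
        u = fas_two_grid(u, f, h)
        print(cycle, residual_norm(u, f, h))
```

Note that the coarse grid unknown is the full approximation $I_h^{2h}u^h + e^{2h}$ rather than the error alone, and that the only derivative used is the scalar one inside the pointwise relaxation step.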
3.4 Full Multigrid and Computational Complexity
There is ample theoretical and numerical evidence that MG is an optimal iterative method for solving an increasingly large number of problems. Typically, the cost of each MG cycle is of the order of the number of unknowns, and the convergence rate is linear with a bound that is independent of the mesh size. This usually means that MG can improve the algebraic accuracy in an approximation by one decimal place at a cost equivalent to two or three relaxation sweeps. Such optimal iterative performance depends of course on the particular application and MG constructs, but we will assume that this optimality is achieved by the MG schemes used here. Such performance will be demonstrated for our model problems in §3.6. For use of MG only as a component of FAC adaptive methods, this type of complexity is all that we need. However, it will be important for later development to see how MG achieves essentially the same optimal complexity in its role as a "direct" solver. To simplify this explanation, we will develop the basic concepts loosely, using the function space setting as if the computations occurred there and making certain specific assumptions about the accuracy of the discretization and performance of the basic MG cycles. In particular, we suppose that the discretization on grid $h$ has an $O(h^2)$ accuracy, so that the discretization error $e^{h*} = \psi^* - v^{h*}$ satisfies
$$\|e^{h*}\| \le c h^2, \tag{3.18}$$
where $\psi^*$ solves (3.3), $v^{h*}$ solves (3.5), $c$ is some constant independent of $h$, and $\|\cdot\|$ is some norm on the appropriate space of functions. Suppose that $MG^h$ has an arithmetic cost of $c_0 n$, where $c_0$ is some constant independent of $h$ and $n = O(h^{-d})$ is the cardinality of $\Omega^h$. This complexity assumption about basic $MG^h$ cycles is reasonable because relaxation for typical applications costs $c_1 n$, in 2-D the coarse-level relaxations contribute about $c_1\frac{n}{4} + c_1\frac{n}{16} + \cdots = c_1\frac{n}{3}$ to the cost, and the cost of the other MG processes is marginal.
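The geometric series behind this cost estimate can be tallied level by level. The sketch below assumes, for illustration only, standard coarsening in which each coarser grid has $2^{-\mathrm{dim}}$ times as many points, and counts relaxation work alone; the function name and its parameters are hypothetical.

```python
def v_cycle_work(n_finest, dim=2, c1=1.0, coarsest=1):
    """Total relaxation cost of one V-cycle, summing c1 * n over all levels.

    Each coarsening is assumed to divide the number of points by 2**dim.
    """
    total = 0.0
    n = float(n_finest)
    while n >= coarsest:
        total += c1 * n
        n /= 2 ** dim
    return total

if __name__ == "__main__":
    n = 4 ** 10  # about one million fine grid points in 2-D
    # Ratio of total work to finest-level work; approaches 4/3 in 2-D.
    print(v_cycle_work(n) / n)
```

In 2-D the coarse levels add about one third to the finest-level cost, so the total approaches $\frac{4}{3} c_1 n$; under the same assumption the ratio in 3-D would approach $\frac{8}{7}$.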
(See [Briggs 1987] for details.) We could have used more complex cycling schemes like the W-cycle (cf. [Brandt 1977]), but V-cycles are critical to efficient
performance in distributed-memory parallel computing environments and they are most effective for our adaptive applications. In any case, we assume that the basic scheme MGh converges at an optimal rate which is given specifically by
Now if MG were required to solve (3.5) to some fixed algebraic accuracy $\mathrm{tol} > 0$ using some starting guess $v^h_{(0)}$, then the total cost would be $c_0 \nu n$, where $\nu$ is the number of required iterations. Let $v^h_{(k)}$ denote the $MG^h$ iterates defined by $v^h_{(k+1)} = MG^h(v^h_{(k)}; \eta^h)$, where $\eta^h$ is the $L^2$-orthogonal projection of $\eta$ onto $T^h$. (Here we allow $MG^h$ to apply to finite element functions as well as to their nodal vectors.) Then by (3.19) all we know is that
Thus, the requirement
implies that
where $\epsilon^h_{(0)} = v^{h*} - v^h_{(0)}$ is the initial algebraic error. This generally gives us an indeterminate bound on complexity because it depends on the ratio $\mathrm{tol}/\|\epsilon^h_{(0)}\|$. We could presume a fixed initial guess like $v^h_{(0)} = 0$, but then the complexity bound would still depend on the indeterminate quantity $\log(\mathrm{tol})$. This difficulty stems from the rather artificial convergence criterion in (3.20), which does not take into account the real objective of solving (3.3). A criterion that does account for the real objective is the requirement that the norm of the final algebraic error $\epsilon^h_{(\nu)} = v^{h*} - v^h_{(\nu)}$ be comparable to that of the discretization error $e^{h*}$. Specifically, first note that the real objective is presumably to achieve an actual error $e^h_{(\nu)} = \psi^* - v^h_{(\nu)}$ that satisfies
$$\|e^h_{(\nu)}\| \le \mathrm{tol}. \tag{3.22}$$
From the identity $e^h_{(\nu)} = e^{h*} + \epsilon^h_{(\nu)}$ and the triangle inequality, we can guarantee (3.22) by the requirements
$$\|e^{h*}\| \le \frac{\mathrm{tol}}{2} \tag{3.23}$$
and
$$\|\epsilon^h_{(\nu)}\| \le \frac{\mathrm{tol}}{2}. \tag{3.24}$$
It seems inefficient to demand a more stringent algebraic tolerance than (3.24): Since a zero algebraic error could only improve the bound in (3.22) by a factor of two, rather than using additional MG cycles on grid $h$, it would be more effective to compute with a smaller $h$. Now (3.23) and (3.18) dictate the choice of $h$, namely,
$$h = \left(\frac{\mathrm{tol}}{2c}\right)^{1/2}. \tag{3.25}$$
(In practice, we might adjust $h$ downward, e.g., to the nearest power of $\frac{1}{2}$, when this expression does not lead to a convenient grid.) Using (3.25) to replace the value of tol in (3.24), we then arrive at a convergence criterion that directly reflects our computational objective:
$$\|\epsilon^h_{(\nu)}\| \le c h^2. \tag{3.26}$$
We will say that the iterates have converged to the level of discretization error when (3.26) is satisfied. There are two problems associated with this criterion. First, the direct use of (3.26) requires an estimate for $c$. Second, to satisfy (3.26) with $\|\epsilon^h_{(0)}\|$ fixed, it can be expected from (3.21) that $\nu = O(-\log h) = O(\log n)$. Thus, the cost of achieving (3.26) by $MG^h$ is $O(n \log n)$, which violates our computational objectives. We now show how the so-called full multigrid algorithm (FMG; cf. [Brandt 1977]) achieves (3.26) while avoiding both of these difficulties. The culprit in this degradation of complexity by the factor $\log n$ is the rather naive way that the initial guess was chosen. What we need is a starting function $v^h_{(0)}$ with error of the same order as tol so that $\log(\mathrm{tol}/\|\epsilon^h_{(0)}\|)$ is of order one. The fact that the cost of coarse level relaxations is only $O(n)$ suggests that $v^h_{(0)}$ be obtained by approximating $v^{2h*}$. This can be done by using $MG^{2h}$ cycles. The grid $2h$ process would use $MG^{4h}$ to provide itself with a suitable starting guess, and so on beginning at the coarsest level. This rationale leads to the following recursively defined algorithm, which we denote by $u^h \leftarrow FMG^h(f^h)$:
Note that $FMG^h$ consists of one basic V-cycle on each grid, so its complexity in 2-D is about $c_0 n + c_0\frac{n}{4} + c_0\frac{n}{16} + \cdots = \frac{4}{3} c_0 n$. Returning to the function space setting, we now show that FMG attains the convergence criterion (3.26) without further iteration, that is, with $\nu = 1$ for $MG^h$ on the finest level. To prove by induction that (3.26) is attained by $FMG^h$ for all $h$, note by Step 1 that it is true for $h = h_c$. Assume it is true for $2h$. Since the grid $h$ cycle $MG^h$ starts with $u^h = I_{2h}^h u^{2h}$, we have $v^h_{(0)} = v^{2h}_{(1)}$. Hence,
$$\|\epsilon^h_{(0)}\| = \|v^{h*} - v^{2h}_{(1)}\| \le \|v^{h*} - \psi^*\| + \|\psi^* - v^{2h*}\| + \|v^{2h*} - v^{2h}_{(1)}\| \le c h^2 + c(2h)^2 + c(2h)^2 = 9 c h^2.$$
Here, (3.18) was used for h and 2h and (3.26) for 2h. Now by (3.19) we have
which proves that (3.26) is true for $\nu = 1$ with this $h$. By induction, (3.26) must be true in general for $\nu = 1$. The main point that was shown here is that FMG achieves our computational objective directly at a cost of approximately four-thirds the cost of a basic V-cycle, which usually translates to a cost of about 4 to 6 relaxation sweeps. This is true provided that the basic MG cycles converge at least as fast as assumed in (3.19), which is obtained for many applications by proper choice of the number of relaxation sweeps per level, usually one or two before and after the coarse grid correction. This optimal performance feature of FMG means that it plays the role of a direct solver in the truest sense of the term. Such direct solvers will not be necessary for the FAC algorithms because the role of obtaining a convergence criterion analogous to (3.26) will be played by the composite grid procedures, not the MG subgrid
solvers. The purpose of our discussion of FMG here is simply to familiarize the reader with the basic concepts related to the role of coarser grids for obtaining good starting guesses for fine grid iterative methods.
3.5 Parallel Implementation
The parallel asynchronous fast adaptive composite grid methods (AFAC) use MG as an inner loop solver, so this section is devoted to a discussion of MG designed for a distributed-memory multiprocessor system. For concreteness, we consider only hypercube systems with possibly large numbers of relatively strong processor nodes. We will avoid discussing the many details that arise in implementing such schemes, but concentrate instead on some of their general characteristics. Because this dynamic area of research is not yet settled, we will choose a somewhat conservative philosophical perspective that suits our interests for adaptive methods. For a more thorough description of the implementation details, general characteristics, and philosophy that are briefly introduced here, see [Briggs et al. 1988]. For a general survey of parallel MG methods, see [Chan and Tuminaro 1986]. For more recent work, see relevant papers in [McCormick 1989c]. The development of computers with many processing nodes is motivated by the dramatic increase in computing power required to handle the very large-scale computational problems now on the horizon. While a discussion of the implications that this has on adaptive refinement will be reserved primarily for Chapter 5, two points are worth noting here. First, for most applications, there are likely to be many more unknowns than available processors, regardless of the size of the computer system. This follows directly from the premise that demand always taxes capability. Second, for the foreseeable developments in computer technology and architecture, we believe that the most effective systems of this type will have processors that are fairly powerful compared to their number.
Interprocessor communication usually implies a large reduction in computing efficiency, so it seems appropriate that computational power should come as much from the capability of individual nodes as from how many there are. We can use these two points to argue that MG methods are highly parallelizable, maintaining their apparent optimality for most realistic applications. In fact, as in the scalar case, for many applications the elapsed parallel processing time for MG or even FMG is just that of a few parallel relaxation sweeps. To understand the basic reasoning here, it is important first to recognize that some sort of inefficiency of conventional MG algorithms
is inevitable in a multiprocessor environment. The coarse grid correction process is inherently sequential, so processing of the coarsest grid cannot occur concurrently with any other processing. Since for proper MG efficiency the coarsest grid problem cannot require more than just a few arithmetic operations, this means that most processor nodes might as well be idle. This inefficiency will occur on the coarsest levels, regardless of the number of points on the finest grid. Actually, in some situations it may be worthwhile to minimize interprocessor communication by having every processor solve the global problem on the coarsest levels, but this is still not a fully efficient use of parallelism.
The question is whether this type of inefficiency is really a serious concern. The current perspective seems to indicate that it is not for at least two reasons. The first is that the real question for a given machine and specific problem is whether the method is most efficient in terms of the total time it takes to deliver an acceptable solution. For an increasingly large class of problems, conventional MG schemes have established themselves in this sense. In some cases, unconventional MG algorithms have been developed to enhance parallel performance, but the results are not yet generally satisfying. The second and perhaps most important reason to be unconcerned about parallel inefficiency is that it is likely to be negligible for most real applications. To explain this further, in the remainder of this section we briefly describe the parallel MG algorithm, then summarize some illustrative results taken from [Briggs et al. 1988]. The algorithm used for experiment and analysis is based on the following components:
Model problem. The PDE is the two-dimensional Poisson's equation on the unit square with Dirichlet conditions on the entire boundary.
MG constructs. This is the standard fast Poisson solver form of MG that uses 5-point stencils on a uniform grid, red-black relaxation, half-injection, bilinear interpolation, and standard V-cycles and their full multigrid counterparts. See [Stüben and Trottenberg 1982] for details on this scheme.
Hypercube implementation. Using so-called Gray codes (cf. [Chan and Schreiber 1983]) for proper ordering, the hypercube nodes are assigned grid points by decomposing the domain into a nearly uniform rectangular array of boxes, which in the extreme case is a linear array of strips. See Figure 3.5 for the case of square boxes. Each node is assigned all of the grid points in its domain as well as
an artificial boundary of grid points that correspond to neighboring grid points in other domains. The codes were written in both C and Fortran for the 32-node Intel iPSC/1.
Figure 3.5. Processor box assignment of a 16-node hypercube ($d = 4$) to the $7 \times 7$ grid ($m = 3$) on $\Omega$. Shown are the subgrids assigned to processors $P_{1,1}$, $P_{2,2}$, and $P_{4,4}$ in the $4 \times 4$ array of nodes. The dotted lines indicate artificial grid boundaries.
These components for the MG algorithm have the following important implications:
C-level. The coarser grids have increasingly fewer points in each box so that, on a sufficiently coarse level, a substantial number of these boxes become empty. Such grids are said to be below C-level. The location of C-level depends on the number of processors and grid points and the rectangularity of the boxes. C-level is highest in the extreme case of strips.
Coding. The code for MG on the cube would be almost the same as the scalar code were it not for the accounting needed below C-level to accommodate idle processors and increased message path lengths. This requires a very substantial increase in the code length and coding effort.
Local communication. All interprocessor communication requirements are satisfied by one message passing phase performed after each half relaxation sweep (i.e., red or black step). This phase consists of a message packet being sent to and received from the four nodes assigned to the immediate neighboring domains. Above C-level, these nodes are immediate neighbors one communication pathlength away; below C-level, they are exactly two pathlengths away in the hypercube connection topology.
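The role of Gray codes in this box-to-node assignment can be illustrated with the binary-reflected Gray code. The sketch below is not the implementation of [Briggs et al. 1988]; it assumes a square $2^b \times 2^b$ array of boxes and a hypothetical labelling that concatenates the Gray codes of the row and column indices. Path length on the hypercube is taken to be the Hamming distance between node labels.

```python
def gray(i):
    """Binary-reflected Gray code of a nonnegative integer."""
    return i ^ (i >> 1)

def node_label(row, col, bits):
    """Hypothetical labelling: concatenate the Gray codes of the box indices."""
    return (gray(row) << bits) | gray(col)

def hops(a, b):
    """Hypercube path length between nodes a and b (Hamming distance)."""
    return bin(a ^ b).count("1")

def neighbor_hops(bits, stride=1):
    """Path lengths between boxes `stride` apart in a 2^bits x 2^bits array."""
    size = 1 << bits
    found = set()
    for r in range(size):
        for c in range(size):
            here = node_label(r, c, bits)
            if r + stride < size:
                found.add(hops(here, node_label(r + stride, c, bits)))
            if c + stride < size:
                found.add(hops(here, node_label(r, c + stride, bits)))
    return found

if __name__ == "__main__":
    # 4 x 4 array of boxes on a 16-node hypercube, as in Figure 3.5.
    print(neighbor_hops(bits=2))            # adjacent boxes
    print(neighbor_hops(bits=2, stride=2))  # boxes two apart
```

With this labelling, adjacent boxes land on directly connected nodes. Boxes two apart, which is one plausible picture of the active neighbors once every other box is empty, are two links apart, consistent with the statement above about path lengths below C-level.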
Global communication. Most other methods require innerproduct and norm evaluations for the computation of such quantities as parameters, step-lengths, orthogonalized directions, and convergence criteria. These usually require global interprocessor communication. No such computations are necessary for MG since its performance can be easily predetermined. This is especially true for FMG which, when properly implemented, requires no iterations.
Table 3.1 displays the elapsed time for a complete V-cycle using the C-code with approximately square boxes on hypercubes and grids of varying sizes. In parentheses are the percentages of time these cycles took in interprocessor communication and below C-level, respectively. We make the following two observations:
1. By examining the upper rows of Table 3.1, it is clear that, well above C-level, the cost of a V-cycle depends on the number of points per processor, and little else.
2. When the finest grid is at or just above C-level, the V-cycle spends a fairly large portion of its time below C-level where many processors are idle. But a few levels above this, where the number of fine grid points per processor just begins to be significant, this time is fairly modest.
         d = 5              d = 4              d = 3              d = 2              d = 1              d = 0
m = 10   24,367(18,0.34)    -                  -                  -                  -                  -
m = 9    7,239(37,1.1)      12,469(10,0.057)   23,768(60,0.034)   -                  -                  -
m = 8    2,606(37,3.4)      3,832(22,0.23)     6,120(12,0.10)     11,800(5.0,0)      22,664(3.0,0)      -
m = 7    1,225(58,8.0)      1,459(42,0.46)     2,176(24,0.34)     3,435(13,0)        6,116(5.0,0)       10,871(0.30,0)
m = 6    687(74,14)         757(65,0.92)       838(40,0.80)       1,146(27,0)        1,773(11,0)        2,822(0.90,0)
m = 5    466(83,20)         438(78,1.6)        402(57,1.7)        512(49,0)          585(21,0)          760(3.0,0)
m = 4    311(88,30)         278(84,2.5)        225(69,3.1)        243(63,0)          227(36,0)          220(10,0)
m = 3    203(90,46)         169(87,4.1)        115(72,5.9)        123(71,0)          95(49,0)           72(17,0)
m = 2    -                  81(88,8.4)         76(84,8.1)         67(82,0)           34(53,0)           22(27,0)
m = 1    -                  -                  -                  5(60,0)            5(60,0)            4(50,0)
Table 3.1. Execution time in milliseconds for a V-cycle with percentages in parentheses for interprocessor communication and below C-level times. Results are for $(2^m - 1)$ by $(2^m - 1)$ finest grids on a hypercube with $2^d$ nodes. Data is missing either because the problem was too large on individual cubes ($2m - d > 15$) or because the V-cycle started off below C-level ($2m < d$).
To predict what the observations might be for much larger hypercubes and grids, the following rough complexity estimates are made for the V-cycle using nearly square boxes. To do this, assume that the time that the V-cycle spends in arithmetic computation on a given level is $4cn$, where $n$ is the maximum number of points per processor on that level. The major costs here are the three relaxations and a residual calculation on each level; other costs are assumed to be negligible. Assume further that the time that the V-cycle spends in communication on a given level is $\alpha + \beta l$ at or above C-level and $2(\alpha + \beta l)$ below C-level, where $l$ is the maximum length of all messages sent from each processor on that level. ($\alpha$ reflects the start-up cost of sending a message and $\beta$ reflects the dependence of message passing cost on message length. Their values depend on the number of communication stages, the character of the hypercube, and other details of the environment.) Table 3.2 displays the estimates of the elapsed time for one V-cycle on a $(2^m - 1)$ by $(2^m - 1)$ grid in a cube with $2^d$ nodes, broken down by above and below C-level contributions. Table 3.3 uses these estimates to obtain the orders of the ratios for each coefficient $c$, $\alpha$, and $\beta$ in terms of the available number of processors, $p$, and maximum number of grid points per processor, $n$. For example, the time that the V-cycle spends in arithmetic computation below C-level is insignificant so long as $n \gg \log p$. Trouble can occur when $n$ is comparable to $\log p$, but this corresponds to a fairly small problem, even on very large cubes. A similar conclusion can be made with respect to $\alpha$ when we require $n \gg (\log p)^2$. For $\beta$, C-level can be significant when $n$ is comparable to $p$, but this case still represents a relatively small problem even when $p$ is very large. Moreover, judging from current trends, it can be expected that $\beta$ is much smaller than $\alpha$ for many of the future multiprocessor systems; so even when $n$ is comparable to $p$, the complexity of the V-cycle below C-level may still be negligible. Finally, Table 3.4 lists the estimates of the elapsed time for one V-cycle using strips. We include these estimates for the complexity analysis of AFAC given in §5.6.
The numerical results and complexity analyses suggest that, for problems that tax the capabilities of the machines we consider (e.g., where storage capacity on each node is substantially greater than the size of the hypercube), processor idleness on the coarser levels in MG is of negligible concern. However, even for more modest size problems where such idleness becomes a factor, and for a large class of PDEs, MG is still the most efficient in terms of the total elapsed time to compute an acceptable solution. For many users of parallel computers, this is likely to be the real issue in determining their choice of solvers.
Table 3.2. Approximate V-cycle complexity for a $(2^m - 1)$ by $(2^m - 1)$ finest grid on a hypercube with $2^d$ nodes allocated by approximately square boxes. (Columns: above C-level, below C-level.)
Table 3.3. Orders of the ratios of the below C-level to above C-level coefficients given in Table 3.2, where $p = 2^d$ is the number of processor nodes and $n = 2^{2m-d}$ is the approximate maximum number of fine grid points per node.
Table 3.4. Approximate V-cycle complexity for a $(2^m - 1)$ by $(2^m - 1)$ finest grid on a hypercube with $2^d$ nodes allocated by strips. (Columns: above C-level, below C-level.)
3.6 Numerical Examples
Here we describe numerical results with MG applied to the same test problems treated in §2.11. For model problem (1.1), we used lexicographic Gauss-Seidel as the smoother and a V(2,1) cycle based on the FVE constructs essentially as described in §3.1. We used the FAS form of MG for our linear as well as our nonlinear models. As an exception to the FVE constructs, we used bilinear instead of linear interpolation as the basis for the coarse-to-fine transfer $I_{2h}^h$. This was done because bilinear interpolation is much more common in practice and because the differences in performance were negligible in every case. Table 3.5 displays the results of running several cycles with zero as an initial guess. Displayed are the Euclidean norms of the residuals computed after each of five cycles for various fine grid mesh sizes $h$. Also tabulated are the geometrically averaged convergence factors, which are computed by taking the fourth root of the ratio of residual norms after cycles 5 and 1. Table 3.6 displays analogous data for model problem (1.2)-(1.3). Planar cavity flow requires more care in its treatment by multigrid. To understand the difficulties involved in systems, where the boundary conditions are imposed on only one of the functions, the reader is referred to [Linden 1985]. For (1.4), we used a V(4,2) FAS cycle where
                   h = ...      h = ...      h = ...      h = 1/128
cycle 1            0.237 E-2    0.123 E-2    0.620 E-3    0.313 E-3
cycle 2            0.166 E-3    0.847 E-4    0.426 E-4    0.215 E-4
cycle 3            0.117 E-4    0.601 E-5    0.300 E-5    0.151 E-5
cycle 4            0.829 E-6    0.429 E-6    0.211 E-6    0.107 E-6
cycle 5            0.582 E-7    0.309 E-7    0.150 E-7    0.762 E-8
average factor =   0.070        0.071        0.070        0.071
Table 3.5. Convergence history of $MG^h$ for well-posed potential flow. Displayed are the Euclidean norms of the fine grid residuals computed after each V(2,1) cycle and the geometrically averaged convergence factors.

                   h = ...      h = ...      h = ...      h = ...
cycle 1            0.118 E-1    0.608 E-2    0.309 E-2    0.156 E-2
cycle 2            0.752 E-3    0.388 E-3    0.197 E-3    0.991 E-4
cycle 3            0.593 E-4    0.306 E-4    0.155 E-4    0.782 E-5
cycle 4            0.533 E-5    0.275 E-5    0.139 E-5    0.702 E-6
cycle 5            0.535 E-6    0.276 E-6    0.140 E-6    0.705 E-7
average factor =   0.082        0.082        0.082        0.082
Table 3.6. Convergence history of $MG^h$ for singular potential flow.

                   h = ...                                h = ...                                h = ...
                   Re = 0       Re = 50      Re = 100     Re = 0       Re = 50      Re = 100     Re = 0       Re = 50      Re = 100
cycle 1            0.205 E 0    0.187 E 0    0.272 E 0    0.323 E 0    0.310 E 0    0.303 E 0    0.496 E 0    0.498 E 0    0.504 E 0
cycle 2            0.702 E-2    0.910 E-2    0.158 E-1    0.903 E-2    0.122 E-1    0.179 E-1    0.181 E-1    0.224 E-1    0.408 E-1
cycle 3            0.451 E-3    0.685 E-3    0.175 E-2    0.943 E-3    0.120 E-2    0.415 E-2    0.187 E-2    0.226 E-2    0.730 E-2
cycle 4            0.239 E-4    0.615 E-4    0.229 E-3    0.365 E-4    0.708 E-4    0.881 E-3    0.605 E-4    0.928 E-4    0.121 E-2
cycle 5            0.135 E-5    0.565 E-5    0.293 E-4    0.281 E-5    0.581 E-5    0.149 E-4    0.822 E-5    0.773 E-5    0.203 E-3
average factor =   0.051        0.074        0.102        0.054        0.066        0.149        0.064        0.063        0.142
Table 3.7. Convergence history of $MG^h$ for planar cavity flow.
each sweep was a lexicographic block Gauss-Seidel relaxation with the blocks defined in terms of the grid points as follows. At an interior point away from the boundary, the block consisted of the two equations (stream function and vorticity) and their two variables ($u^h$ and $v^h$) associated with this point. At an interior point neighboring the boundary but away from the corners, the block similarly included the associated pair of equations and variables, but it also included the stream function equation and variable at the neighboring boundary point. At an interior point in a corner of the region, the block consisted of the two equations and two variables associated with this point and the stream function equations and variables associated with the three neighboring boundary points. Table 3.7 contains the results obtained from testing this algorithm on (1.4) starting with a zero guess. Here we display the Euclidean norms of the residuals for the vorticity equation computed after each of five cycles and the convergence factor geometrically averaged over these cycles.
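The block relaxation just described visits the blocks in a fixed order and solves each block's equations exactly for that block's unknowns. The following is a minimal generic sketch of one such sweep, not the cavity flow code used for Table 3.7: the test matrix, the pairing of unknowns, and the name `block_gauss_seidel_sweep` are assumptions made for illustration.

```python
import numpy as np

def block_gauss_seidel_sweep(A, x, b, blocks):
    """One sweep: solve each block's equations for its own unknowns, in order."""
    for idx in blocks:
        idx = np.asarray(idx)
        residual = b[idx] - A[idx, :] @ x            # uses the latest values of x
        x[idx] += np.linalg.solve(A[np.ix_(idx, idx)], residual)
    return x

# Hypothetical test system: six unknowns grouped in pairs, standing in for the
# pair of unknowns (stream function, vorticity) attached to each grid point.
n = 6
A = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
blocks = [[0, 1], [2, 3], [4, 5]]

x = np.zeros(n)
for sweep in range(20):
    x = block_gauss_seidel_sweep(A, x, b, blocks)
print(np.linalg.norm(b - A @ x))
```

Larger blocks near the boundary, as in the text, would simply correspond to longer index lists in `blocks`.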
Chapter 4
The Fast Adaptive Composite Grid Method
4.1 Basic Two-Level Schemes

This chapter, which is the core of this book, is devoted to a fundamental treatment of the class of multilevel adaptive methods referred to as FAC (fast adaptive composite grid). This section develops a generic two-level version of FAC by viewing it as a natural extension of conventional multigrid methods applied to problems discretized on composite grids. The algorithm is then removed from its origin by replacing the relaxation scheme used on the uniform grids by exact solvers. This will have the effect of clarifying the roles of the various grids used in the solution process while providing a useful generalization of the method. In fact, it will be shown in the last chapter how this generalization allows FAC to be modified to achieve something that other schemes apparently cannot: completely asynchronous but efficient processing of all levels of refinement.

This section is not light reading. Since we introduce several different methods, requiring rather intricate notation, the casual reader may wish to skip to the next section. However, for those interested in practical use of FAC, an understanding of the multilevel principles and various subtleties exposed in this section should be worth the struggle.
To preview this development, consider the following methods:

$MG^h$: standard multigrid scheme
$MG^h_{\Omega_2}$: local multigrid scheme
$ML_+^{\underline{h}}$: bordered multilevel scheme
$ML^{\underline{h}}$: multilevel composite grid scheme
$FAC^{\underline{h}}$: fast adaptive composite grid scheme

$MG^h$ was developed in the last chapter as a global grid solver. It is too inefficient for our purpose here because we are assuming that fine grid resolution and accuracy is needed only in a local subdomain $\Omega_2 \subset \Omega$. We will thus be led to the local multigrid method $MG^h_{\Omega_2}$ that arises from $MG^h$ by suppressing the fine grid relaxations in $C\Omega_2^h$. Unfortunately, we will have gained very little from this method because the global fine grid is still present and used for interpolation and residual computations. To make $MG^h_{\Omega_2}$ more practical, we will introduce the concept of a bordered composite grid $\Omega_+^{\underline{h}}$ and show how all of the computations in $MG^h_{\Omega_2}$ can be translated to it, thereby avoiding excessive fine grid work outside of $\Omega_2$. This will result in the bordered multilevel scheme $ML_+^{\underline{h}}$, which is practical. To make $ML_+^{\underline{h}}$ somewhat more efficient, we will show how the borders of $\Omega_+^{\underline{h}}$ can be eliminated, that is, we will translate all of the computations in $ML_+^{\underline{h}}$ to the standard composite grid $\Omega^{\underline{h}}$. The resulting method is the multilevel composite grid scheme $ML^{\underline{h}}$. Finally, we will introduce the fast adaptive composite grid method $FAC^{\underline{h}}$, which results from replacing the relaxation scheme in $ML^{\underline{h}}$ by a direct solver.

The motivation for developing these various methods makes use of the Galerkin condition (3.8), even though the methods themselves do not depend on it. All that we really use the Galerkin condition for is to establish various forms of the multilevel adaptive method that are exactly equivalent. Thus, for grid operator constructs like those produced by finite volume element techniques (FVE) that do not exactly satisfy (3.8), only the equivalence of the forms is lost, not the validity of the methods themselves.
In this development, we will have in mind scalar elliptic equations like the model problems in (1.1) and (1.2), although the schemes have much broader applicability. We start with the basic premise that, while a certain accuracy is enough to correctly resolve the physical system in most of the global domain $\Omega = \Omega_1$, substantially greater accuracy is needed in a given local region, $\Omega_2 \subset \Omega_1$ (see Figure 4.1). More precisely, we assume that local phenomena in $\Omega_2$ demand a resolution with mesh size $h$, and the accuracy that goes with it, which is smaller than that needed in $C\Omega_2 = \Omega_1 \setminus \Omega_2$. If we were to allow the local phenomena in $\Omega_2$ to dictate the use of a global grid $\Omega^h$, then this would presumably lead to unnecessary computations outside of $\Omega_2^h = \Omega^h \cap (\Omega_2 \cup \partial\Omega_1)$. Noting Figure 4.2, this means that, while relaxations in $MG^h$ are needed on $\Omega_2^h$ to resolve the solution components to that scale, they would be wasted on $C\Omega_2^h$. A natural way to avoid this is simply to modify the standard multigrid scheme $MG^h$ by suppressing relaxation outside of $\Omega_2^h$. To this end, let $G^h_{\Omega_2}$ denote a restricted relaxation process: If we use the decomposition $u^h = (u_C^h, u_2^h)^T$ corresponding to the partition $\Omega^h = C\Omega_2^h \cup \Omega_2^h$, and similarly for $f^h$, then $G^h_{\Omega_2}$ leaves $u_C^h$ unchanged and applies to $u_2^h$ a relaxation scheme $G^h$ corresponding to the local problem on $\Omega_2^h$, so that it results only in changes to $u_2^h$. We can then define the local multigrid method $MG^h_{\Omega_2}$ expressed in its two-grid form as follows:
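The two steps of this local two-grid cycle, an exact coarse grid correction followed by relaxation restricted to the local fine grid, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the scheme as implemented for the book: it uses a one-dimensional Poisson problem with Dirichlet boundaries, linear interpolation, full weighting, a Galerkin coarse grid operator, and Gauss-Seidel as the local relaxation; all function and variable names are hypothetical.

```python
import numpy as np

def poisson_1d(n):
    """Fine grid operator for -u'' on n interior points of (0, 1), Dirichlet ends."""
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def linear_interpolation(nc):
    """Coarse-to-fine linear interpolation; coarse point j sits at fine index 2j+1."""
    P = np.zeros((2 * nc + 1, nc))
    for j in range(nc):
        P[2 * j + 1, j] = 1.0
        P[2 * j, j] = 0.5
        P[2 * j + 2, j] = 0.5
    return P

def local_two_grid_cycle(A, P, R, A2h, u, f, local, sweeps=1):
    # Step 1: exact coarse grid correction of the current approximation.
    u = u + P @ np.linalg.solve(A2h, R @ (f - A @ u))
    # Step 2: Gauss-Seidel relaxation restricted to the local fine grid points.
    for _ in range(sweeps):
        for i in local:
            u[i] += (f[i] - A[i] @ u) / A[i, i]
    return u

nc = 15                         # coarse interior points (mesh size 2h)
n = 2 * nc + 1                  # fine interior points (mesh size h)
A = poisson_1d(n)
P = linear_interpolation(nc)
R = 0.5 * P.T                   # full weighting restriction
A2h = R @ A @ P                 # Galerkin coarse grid operator (assumption)
local = range(16, n)            # local fine points; the interface is fine index 15

rng = np.random.default_rng(0)
f = rng.standard_normal(n)
u = local_two_grid_cycle(A, P, R, A2h, np.zeros(n), f, local)

# Residual that the next cycle would transfer to the coarse grid.
coarse_residual = R @ (f - A @ u)
print(np.abs(coarse_residual).round(10))
```

In this setup coarse points 0 through 6 lie outside the local region and coarse point 7 is the interface, so the printed vector shows which coarse grid equations are affected by the local relaxation.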
We call this a two-grid process because the coarse grid routine $MG^{2h}$ has been replaced by a direct solver, which is done only to simplify our presentation. Also for simplicity, we have eliminated relaxation before the coarse grid correction. The only serious modification to $MG^h$ here is thus the restriction of relaxation to the local grid, $\Omega_2^h$.

Figure 4.1. A global domain, $\Omega = \Omega_1$, and local one, $\Omega_2$.

Figure 4.2. Fine grid points (•) where relaxation is required (assuming Neumann boundary on W and S, Dirichlet on N and E).

Unfortunately, the local multigrid scheme $MG^h_{\Omega_2}$ makes little practical sense. Its motive is to eliminate unnecessary computations on $C\Omega_2^h$, but only the relaxation process has been suppressed there, not the intergrid transfers and residual calculations. Fortunately, these fine grid computations can also be suppressed without changing the results. To see this, let $u^h$ be the initial approximation for one $MG^h_{\Omega_2}$ cycle, let $u^{2h}$ denote the correction computed in Step 1, and let $v^h$ be the change in $u^h + I_{2h}^h u^{2h}$ due to the relaxation on $\Omega_2^h$ in Step 2. Thus, the approximation produced by $MG^h_{\Omega_2}$ is $u^h + I_{2h}^h u^{2h} + v^h$. Now, $v^h$ is zero on $C\Omega_2^h$, so $I_h^{2h} L^h v^h$ is zero on $C\Omega_2^{2h} = \Omega_1^{2h} \cap C\Omega_2$. (Recalling the remark on notation in §2.12, $C\Omega_2$ is the set of points in $\Omega_1$ that are not in $\Omega_2$ or its interface.) This follows because we are tacitly assuming that the stencils for $I_h^{2h}$ and $L^h$ are of nearest-neighbor type. (Broader stencils like those that arise in the discretization of the biharmonic can be treated using wider interfaces.) Thus, by (3.8) and the use of an exact solve in Step 1, we have

$$I_h^{2h}\bigl(f^h - L^h(u^h + I_{2h}^h u^{2h} + v^h)\bigr) = -I_h^{2h} L^h v^h = 0 \quad \text{on } C\Omega_2^{2h}.$$
This implies that the transferred residual $f^{2h}$ in the next cycle of $MG^h_{\Omega_2}$ will be zero at points of $C\Omega_2^{2h}$. In fact, in general, local fine grid relaxation has no immediate effect on the coarse grid equations in $C\Omega_2^{2h}$. In principle, then, there is no need to use the fine grid in that region (except perhaps to represent $u^{2h}$ with better resolution, but this is more efficiently done at the end of all computations). We could thus develop a modification of the local scheme $MG^h_{\Omega_2}$ that avoids unnecessary fine grid work away from $\Omega_2$. However, to implement such a scheme we would need to change the role of $u^{2h}$ in $C\Omega_2^{2h}$ from representing error corrections to representing the current approximation. It is therefore necessary that this new modification use $u^{2h}$ to accumulate corrections. Another important point about the modification of $MG^h_{\Omega_2}$ is to be sure that the fine grid is supplied with enough data to compute $I_h^{2h}(f^h - L^h u^h)$ at coarse grid interface points. This requires interpolation not only to $\Omega_2^h$, but to the two grid lines bordering $\Omega_2^h$ as well. Figure 4.3 shows the first border, $A_2$, and the extended grid, $\Omega_{2+}^h$, that includes both borders.

Figure 4.3. Local grid $\Omega_2^h$, its first border $A_2$ (indicated by •), and the extended local grid $\Omega_{2+}^h$.

These ingredients provide for a new algorithm that avoids all fine grid work away from $\Omega_2$. The notation gets quite cumbersome here, but the basic idea is easy to state: Residuals are computed on the local fine grid including its boundary and first border; these residuals are then transferred to the coarse grid points lying under the fine grid and at the interface; the correction is then computed and interpolated to the local fine grid including its boundary and both borders; relaxation is then performed at fine grid interior points. Specifically, let $\Omega_{2+}^{2h} = \Omega_1^{2h} \cap \Omega_{2+}^h$ be the $2h$ subgrid of $\Omega_{2+}^h$ and denote the restriction of $f^{2h}$ to $\Omega_{2+}^{2h}$ functions by $\hat{f}^{2h}$. Let $\hat{I}_h^{2h}$ be the operator on grid $\Omega_{2+}^h$ functions whose range is restricted to $\Omega_{2+}^{2h}$ functions. Note then that $\hat{f}^{2h} = \hat{I}_h^{2h}(f^h - L^h u^h)$ only involves computation of the residual on $\Omega_{2+}^h$ and transfer to $\Omega_{2+}^{2h}$. Finally, instead of determining the solution on all of $\Omega^h$, we seek an approximate bordered composite grid solution $u_+^{\underline{h}}$ represented on the bordered composite grid $\Omega_+^{\underline{h}} = \Omega_1^{2h} \cup \Omega_{2+}^h$ as follows:
Define $f_+^{\underline{h}}$ in a similar way. Suppose that local relaxation, $G^h$, applies to $u^h$ defined on $\Omega_{2+}^h$ but that it changes only those values of $u^h$ on $\Omega_2^h$. Then the two-level bordered multilevel algorithm is represented by $u_+^{\underline{h}} \leftarrow ML_+^{\underline{h}}(u_+^{\underline{h}}; f_+^{\underline{h}})$ and defined as follows:
The bordered multilevel scheme $ML_+^{\underline{h}}$ is certainly practical: All of its computations at each scale are on or about the regions where they are presumably needed. A major additional advantage of $ML_+^{\underline{h}}$ is that there is no special treatment at the interface. In fact, all of the processing in $ML_+^{\underline{h}}$ is essentially as if each grid were global. This makes $ML_+^{\underline{h}}$ easy to program and it is therefore frequently used in practice. However, it does require two additional grid lines bordering the local patches in order to compute the residual correction at the coarse grid interface points. By the Galerkin condition (3.8), this correction is just the composite grid residual at these coarse grid points. Thus, if we generate the composite grid operator, $L^{\underline{h}} : U^{\underline{h}} \to U^{\underline{h}}$, where $U^{\underline{h}}$ is the nodal space associated with the composite grid $\Omega^{\underline{h}} = \Omega_1^{2h} \cup \Omega_2^h$, then we can compute the residual correction more directly, without the need for borders. A direct approach like this also simplifies our notation. Specifically, we represent the multilevel composite grid scheme by $u^{\underline{h}} \leftarrow ML^{\underline{h}}(u^{\underline{h}}; f^{\underline{h}})$ and define it as follows:
Figure 4.4. Domain partition.
Figure 4.5. Composite grid partition with subgrids $\Omega_C^{\underline{h}}$ (°), $\Omega_I^{\underline{h}}$ (o), and $\Omega_F^{\underline{h}}$ (•).

Here we use intergrid transfers induced by FVE as discussed in §3.1. The matrix $L^h : U^h \to U^h$ is the FVE discretization on $U^h$, the nodal space corresponding to the local fine grid $\Omega_2^h$ with (homogeneous) Dirichlet conditions imposed on the coarse grid interface $\Omega_I^{2h} = \Omega_1^{2h} \cap \partial\Omega_2$.
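Although this form of the algorithm is much cleaner, our abstract notation has hidden several subtleties that need to be exposed. This will be done by taking some care to make the notation more concrete, starting with the following partition of the domain (see Figure 4.4):

$$\Omega = \Omega_C \cup \Omega_I \cup \Omega_F,$$

where $\Omega_C = \Omega_1 \setminus \Omega_2$, $\Omega_I = \partial\Omega_2 \setminus \partial\Omega_1$, and $\Omega_F = \Omega_2$. This corresponds to a partition of the composite grid (see Figure 4.5) as follows:

$$\Omega^{\underline{h}} = \Omega_C^{\underline{h}} \cup \Omega_I^{\underline{h}} \cup \Omega_F^{\underline{h}}, \qquad (4.2)$$

where $\Omega_C^{\underline{h}}$ is defined as $\Omega^{\underline{h}} \cap \Omega_C$ together with its neighboring Neumann boundary points and similarly for $\Omega_I^{\underline{h}}$ and $\Omega_F^{\underline{h}}$. Observe the following: $\Omega_C^{\underline{h}}$ = coarse grid points away from refinement region; $\Omega_I^{\underline{h}}$ = coarse grid interface points; and $\Omega_F^{\underline{h}}$ = fine grid patch.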
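The partition in (4.2) induces a block decomposition of the operators and grid functions illustrated by the composite grid equation

$$\begin{pmatrix} L_{CC}^{\underline{h}} & L_{CI}^{\underline{h}} & 0 \\ L_{IC}^{\underline{h}} & L_{II}^{\underline{h}} & L_{IF}^{\underline{h}} \\ 0 & L_{FI}^{\underline{h}} & L_{FF}^{\underline{h}} \end{pmatrix} \begin{pmatrix} u_C^{\underline{h}} \\ u_I^{\underline{h}} \\ u_F^{\underline{h}} \end{pmatrix} = \begin{pmatrix} f_C^{\underline{h}} \\ f_I^{\underline{h}} \\ f_F^{\underline{h}} \end{pmatrix}.$$

Similar representations hold for the uniform grid operators and intergrid transfers. Let $U^{2h}$ and $U^h$ be the grid spaces corresponding to $\Omega_1^{2h}$ and $\Omega_2^h$ with operators $L^{2h}$ and $L^h$, respectively. Let $I_{2h}^h$ and $I_h^{2h}$ be the grid transfer operators restricted to the patch $\Omega_2$. The following observations result from the way the grid operators are constructed (block entries of interest are displayed; asterisks indicate entries that need not be specified here):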
Interpretation: Interpolation is the identity outside the refinement region.
Interpretation: Residuals in $\Omega_C^{\underline{h}} \cup \Omega_I^{\underline{h}}$ are simply injected to correct $\Omega_C^{2h} \cup \Omega_I^{2h}$ equations, but these residuals do not affect $\Omega_F^{2h}$ equations; residuals in $\Omega_I^{\underline{h}} \cup \Omega_F^{\underline{h}}$ are used to correct $\Omega_I^{2h}$ equations; and residuals in $\Omega_F^{\underline{h}}$ are used to correct $\Omega_F^{2h}$ equations.
Interpretation: Interpolation from the fine grid patch is the trivial one.
Interpretation: Residuals are transferred from the composite grid to the fine grid patch in the trivial way.
Interpretation: The composite grid operator, which is block tridiagonal, agrees with the coarse grid operator on $\Omega_C$ and with the fine grid operator on $\Omega_F$.

The multilevel composite grid scheme $ML^{\underline{h}}$ clearly places the focus of the computation on the composite grid operator, $L^{\underline{h}}$. While the decomposition above shows that $L^{\underline{h}}$ is composed mostly of the coarse and fine grid operators, it does require a special evaluation at the coarse grid interface points. In other words, composite grid residuals in $\Omega_C$ and $\Omega_F$ can be computed using the respective coarse and fine grid stencils, but special treatment is necessary in $\Omega_I$. The bordered multigrid scheme $ML_+^{\underline{h}}$ uses patch borders and standard stencils for this purpose, but $ML^{\underline{h}}$ ostensibly requires that the stencils for $L^{\underline{h}}$ be generated for use on $\Omega_I$. (It can be convenient for this purpose to write the stencil as a sum of coarse and fine grid point stencils; see §4.4.)

We have written the multilevel composite grid scheme $ML^{\underline{h}}$ in its immediate correction form. The term "immediate" is used because both intermediate quantities $u^{2h}$ and $u^h$ attempt to approximate the current algebraic error ($e^{\underline{h}} = u^{\underline{h}*} - u^{\underline{h}}$) in $u^{\underline{h}}$, not the discrete solution ($u^{\underline{h}*}$). We use the term "correction" for $ML^{\underline{h}}$ because it attempts to solve the residual equation

$$L^{\underline{h}} e^{\underline{h}} = f^{\underline{h}} - L^{\underline{h}} u^{\underline{h}}. \qquad (4.3)$$
This is not necessarily the most efficient nor convenient scheme in practice, but it is useful for exposition. Most importantly, this form of $ML^{\underline{h}}$ focuses attention on the composite grid equations, which is fundamental to our approach. An equivalent but more natural delayed correction form of $ML^{\underline{h}}$ involves correcting $u^{\underline{h}}$ only at the end of each cycle as follows:
It is enlightening to contrast these two forms of the multilevel composite grid scheme $ML^{\underline{h}}$: Immediate correction attempts to solve (4.3) by transferring the composite grid residual to each level in its turn, computing a correction there, then immediately interpolating it back to the composite grid. Note that the initial guess for the fine grid correction is zero and that the fine grid equations use homogeneous boundary conditions on the interface. Delayed correction attempts to solve (4.3) by transferring the composite grid residual simultaneously to both levels, solving for the grid $2h$ correction, using this coarse grid correction to define boundary values on the fine grid (this is the role of the term $-L_{FI}^h u_I^{2h}$ in the definition of $f^h$), solving for the fine grid correction, then finally using the appropriate level solutions to correct the composite grid approximation.

One final modification to these basic schemes is now made by replacing fine grid relaxation with a direct solver. More precisely, defining $G^h$ in $ML^{\underline{h}}$ to be the exact solver of the fine grid equations yields the so-called fast adaptive composite grid method, which is represented by $u^{\underline{h}} \leftarrow FAC^{\underline{h}}(u^{\underline{h}}; f^{\underline{h}})$ and defined in its immediate correction form as follows:
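The immediate correction form described above (transfer the composite grid residual to the coarse grid, solve exactly, interpolate the correction back, then repeat on the fine patch) can be sketched in one dimension. This is a minimal sketch under assumptions that are not the book's: linear finite elements stand in for FVE, the model equation is $-u'' + cu = 1$ with an arbitrary reaction coefficient $c$, the subgrid operators are formed by the Galerkin products $P^T L P$, and every name in the code is hypothetical.

```python
import numpy as np

def fe_matrix(x, c):
    """Linear finite element matrix for -u'' + c*u on nodes x, Dirichlet ends removed."""
    n = len(x)
    A = np.zeros((n, n))
    for e in range(n - 1):
        he = x[e + 1] - x[e]
        A[e:e + 2, e:e + 2] += np.array([[1.0, -1.0], [-1.0, 1.0]]) / he
        A[e:e + 2, e:e + 2] += c * he / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
    return A[1:-1, 1:-1]

def fac_cycle(L, P2h, Ph, u, f):
    """Immediate correction form: exact coarse grid solve, then exact fine patch solve."""
    for P in (P2h, Ph):
        r = f - L @ u                                   # composite grid residual
        u = u + P @ np.linalg.solve(P.T @ L @ P, P.T @ r)
    return u

H = 1.0 / 8.0                                           # coarse mesh size 2h
coarse = np.linspace(0.0, 1.0, 9)                       # global coarse grid
patch = np.linspace(0.5, 1.0, 9)                        # fine patch on [1/2, 1]
x = np.concatenate([coarse[coarse < 0.5], patch])       # composite grid nodes
xi = x[1:-1]                                            # composite grid unknowns
L = fe_matrix(x, c=10.0)
f = 0.5 * (x[2:] - x[:-2])                              # load vector for right-hand side 1

# Coarse-to-composite interpolation: coarse hat functions evaluated at composite nodes.
P2h = np.maximum(0.0, 1.0 - np.abs(xi[:, None] - coarse[None, 1:-1]) / H)
# Patch-to-composite transfer: trivial injection of the patch interior values.
Ph = np.eye(len(xi))[:, np.flatnonzero(xi > 0.5)]
interface = int(np.flatnonzero(xi == 0.5)[0])

u_exact = np.linalg.solve(L, f)
u = fac_cycle(L, P2h, Ph, np.zeros_like(f), f)
first_residual = f - L @ u                              # residual after one cycle
errors = [np.max(np.abs(u - u_exact))]
for cycle in range(9):
    u = fac_cycle(L, P2h, Ph, u, f)
    errors.append(np.max(np.abs(u - u_exact)))
print(np.abs(first_residual).round(12))
print(errors)
```

Printing `first_residual` shows where the composite grid residual remains after one cycle, and `errors` records the maximum nodal error over ten cycles.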
FAC should not be interpreted as a method that requires exact solvers, even though we have defined it this way. In fact, while FAC allows for direct methods to resolve the subgrid equations, its predominant use in practice has been with iterative methods. Our main reasons for defining FAC as we have are to simplify discussion, to expose the role of the subgrid solvers, and to pave the way for asynchronous fast adaptive composite grid methods (AFAC, Chapter 5).

We leave this section with three important observations about $FAC^{\underline{h}}$. First, because the composite grid stencils agree with the coarse and fine grid stencils in $\Omega_C$ and $\Omega_F$, respectively, and the correction equations are solved exactly there, the residual $r^{\underline{h}} = f^{\underline{h}} - L^{\underline{h}} u^{\underline{h}}$ produced by an $FAC^{\underline{h}}$ cycle is nonzero only on $\Omega_I$. This observation simplifies the two-level theory, as will be seen in §4.10. Second, the linear part of $FAC^{\underline{h}}$ is given by

$$FAC^{\underline{h}} = \bigl(I - I_h^{\underline{h}} (L^h)^{-1} I_{\underline{h}}^h L^{\underline{h}}\bigr)\bigl(I - I_{2h}^{\underline{h}} (L^{2h})^{-1} I_{\underline{h}}^{2h} L^{\underline{h}}\bigr). \qquad (4.4)$$
Figure 4.6. One-dimensional example of grids used in FAC-MG processing (Dirichlet boundaries).
Figure 4.7. One-dimensional example of grids used in FAC-G processing.
The two terms in $FAC^{\underline{h}}$ are actually projections onto the appropriate grid subspaces. This will also be used in our analysis. Finally, $FAC^{\underline{h}}$ is defined in this exact-solver form to expose the role of the grids used in practical versions. In particular, FAC uses grid $2h$ to approximate smooth global components of the error in (4.3) and grid $h$ to approximate oscillatory local components. No other grids are needed per se. However, if MG is used as the actual grid solver (we refer to this combined scheme as FAC-MG), extra grids are automatically introduced on each level as depicted for the one-dimensional case in Figure 4.6. These extra grids are needed to approximate error components within the same domain as the level they support and with smoothness relative to the scale of that level. However, if a simple relaxation scheme, $G$, is used in place of MG (we refer to this as FAC-G), the weakness of $G$ as a stand-alone solver for the coarse grid equations demands additional global grids, as depicted for one dimension in Figure 4.7. Note that FAC-G is just $ML^{\underline{h}}$ extended to apply to multiple levels.

4.2 Interpretations

FAC was developed in the previous section by interpreting it as a natural extension of MG to composite grid equations. To solidify understanding, this section shows how FAC can be interpreted in several other ways.

As a preconditioner. One of the basic premises underlying the development of FAC is that computation on uniform grids is much more practical and efficient than it is on nonuniform grids. This gives the perspective that the subgrid solvers of FAC are meant to act as preconditioners for the composite grid operator. Specifically, let $M^h = I_h^{\underline{h}} (L^h)^{-1} I_{\underline{h}}^h$ and $M^{2h} = I_{2h}^{\underline{h}} (L^{2h})^{-1} I_{\underline{h}}^{2h}$, which are the inverses of the operators on the uniform subgrids translated to the composite grid space. From (4.4) it can be concluded that $FAC^{\underline{h}}$ acts on the algebraic error according to $e^{\underline{h}} \leftarrow (I - M^{\underline{h}} L^{\underline{h}}) e^{\underline{h}}$. Thus, the operator $M^{\underline{h}} = M^h + M^{2h} - M^h L^{\underline{h}} M^{2h}$ can be interpreted as an approximate inverse of $L^{\underline{h}}$. In this way, the inverses of the uniform subgrid operators precondition the composite grid operator. To motivate the three terms appearing in the definition of $M^{\underline{h}}$, note that $M^h$ and $M^{2h}$ are rank deficient. They must therefore be combined to properly precondition $L^{\underline{h}}$. On the other hand, while $M^h + M^{2h}$ is full rank, it overshoots computation of smooth local components (i.e., nodal vectors associated with functions in $T^{2h} \cap T^h$). The term $-M^h L^{\underline{h}} M^{2h}$ is used in $M^{\underline{h}}$ to compensate for this overshoot.

As an iterative mesh refinement method. The classical mesh refinement methods discussed in §1.4 start by solving the coarse grid equations, then solve the fine grid equations using the coarse grid solution to define Dirichlet boundary conditions at the interface. The problem is that the results of this two-grid process usually do not achieve an accuracy commensurate with the added resolution. The conceptual hurdle here is to determine how the coarse and fine grid processing steps can be reused to quickly obtain such accuracy. The key is to use the composite grid equations, which is just what FAC does. In fact, one cycle of the delayed correction version of $FAC^{\underline{h}}$ applied to the initial guess $u^{\underline{h}} = 0$ is just the classical mesh refinement method. Generally, then, FAC can be viewed as such a method applied to the residual equation (4.3). The significance of FAC in this context is that it uses the composite grid equations to guide improvement of the classical process.

As a block Gauss-Seidel method. The imbedded spaces $I_{2h}^{\underline{h}} U^{2h}$ and $I_h^{\underline{h}} U^h$ are subspaces of the composite grid space $U^{\underline{h}}$ that yield the decomposition

$$U^{\underline{h}} = I_{2h}^{\underline{h}} U^{2h} + I_h^{\underline{h}} U^h. \qquad (4.5)$$

This is generally not a direct sum because $I_{2h}^{\underline{h}} U^{2h} \cap I_h^{\underline{h}} U^h \neq \{0\}$: This common subspace consists of grid functions that correspond to the smooth local discrete functions like that depicted in Figure 4.8. Nevertheless, (4.5) gives rise to another block decomposition of the composite grid equations written as
Figure 4.8. Sample one-dimensional smooth local function (i.e., function with nodal values in $I_{2h}^{\underline{h}} U^{2h} \cap I_h^{\underline{h}} U^h$).
Here we understand a solution of (4.6) to represent nodal values for the composite grid solution, so $u^{\underline{h}*} = I_{2h}^{\underline{h}} u^{2h*} + I_h^{\underline{h}} u^{h*}$. Note that, because the sum is not direct in (4.5), the block system in (4.6) is singular: All solutions of (4.6) are of the form

$$\begin{pmatrix} u^{2h*} + v^{2h} \\ u^{h*} - v^h \end{pmatrix},$$

where $v^{2h}$ and $v^h$ are any grid functions satisfying $I_{2h}^{\underline{h}} v^{2h} = I_h^{\underline{h}} v^h$. However, the diagonal blocks in (4.6) are nonsingular. In fact, they are just $L^{2h}$ and $L^h$, respectively. This allows the formal use of block Gauss-Seidel applied to (4.6) as follows:

This can be seen as just another way to write $FAC^{\underline{h}}$.

As a domain decomposition method. The Schwarz alternating procedure [Schwarz 1870] is perhaps the first domain decomposition method developed for solving partial differential equations (PDEs). Its basic approach is to partition the domain into two or more subregions on which the localized problem is successively solved, using Dirichlet boundary data provided by the approximations from neighboring subregions. Convergence rates generally improve as the subregions increase in overlap, but so does the computational complexity (cf. [Cai and McCormick 1989]). FAC can be interpreted as a Schwarz-like domain decomposition method in terms of the subdomains $\Omega_1$ and $\Omega_2$. It achieves fast convergence by the use of fully overlapping subdomains and its low cost by the use of a coarse grid with substantially fewer points in the overlap. More precisely, the subdomain $\Omega_1$ fully overlaps $\Omega_2$ ($\Omega_2 \subset \Omega_1$), but the computation on $\Omega_1$ is done only with a coarse grid that adds comparatively little to the computational cost.
4.3 Multilevel Schemes

For each of the many ways there are to process several grids in MG, there is an analogous way to treat the multilevel case for FAC. For example, in analogy to the accommodation scheme [Brandt 1977], the order in which the grids are processed could be determined dynamically by estimating the evolving errors. This approach represents an attempt to ensure a certain error reduction rate per cycle, but it does nothing directly to curb computational complexity. A more effective approach for FAC is to predetermine the order in which the levels are processed so that computational complexity and convergence rates can both be assessed a priori. Unfortunately, there is much less freedom for FAC than there is for MG to choose efficient fixed cycling schemes: The number of grid points diminishes rapidly in descent from finer to coarser grids when each grid is global, but this is almost never the case for the local levels used in FAC; on the contrary, for many applications the coarse levels tend to be the largest. Thus, FAC fixed cycling schemes should not spend an inordinate amount of time on coarser levels, suggesting that FAC be limited to cycling schemes that correspond to the multigrid V- or slash-cycles (cf. Chapter 4 in [McCormick 1987]).

To describe these simple FAC cycling schemes, suppose we are given a family of uniform locally nested grids $\Omega^h$ covering the nested regions $\Omega_h$. (By "locally nested grids" we mean $\Omega^{2h} \cap \Omega_h \subset \Omega^h$; by "nested regions" we mean $\Omega_h \subset \Omega_{2h}$, which is not analogous to our definition of nested grids.) Suppose $h = h_c$ signifies the coarsest level, with successive mesh sizes differing by a power of 2. Remembering that $\underline{h}$ is a vector with components $h, 2h, \ldots, h_c$, by the term $\underline{2h}$ we mean a vector with components $2h, 4h, \ldots, h_c$. The relationship between the composite grids $\Omega^{\underline{h}} = \Omega^h \cup \Omega^{2h} \cup \cdots \cup \Omega^{h_c}$ and $\Omega^{\underline{2h}} = \Omega^{2h} \cup \Omega^{4h} \cup \cdots \cup \Omega^{h_c}$ is illustrated in Figure 4.9.
A V-cycle form of $FAC^{\underline{h}}$ is now given recursively as follows:

This form of $FAC^{\underline{h}}$ involves transfers between the $\underline{h}$ and $\underline{2h}$ composite grids. As indicated in Figure 4.10, this requires the usual intergrid transfers in the finest region, $\Omega_h$, and the trivial transfers elsewhere. Partial V-cycles can be defined by suppressing either Step 2 or Step 4, yielding the respective coarse-to-fine or fine-to-coarse slash-cycles. The former is especially natural when FAC is considered as a mesh refinement method.

Figure 4.9. Sample composite grid on $\Omega = (0,1)$ (the values of $h$ and $h_c$ are illegible). Note that $\Omega^{\underline{h}}$ consists of three levels while $\Omega^{\underline{2h}}$ has only two. Interfaces are marked in the figure.

Figure 4.10. Illustration of transfer interactions between $\Omega^{\underline{h}}$ and $\Omega^{\underline{2h}}$.
4.4 Interface Treatment

One of the most important basic issues of FAC implementation concerns the treatment of the grid interface. The main point that should be made in this respect is that this treatment is guided by the composite grid equations, although the equations themselves need not be used explicitly in the algorithm. More precisely, as with the bordered multigrid scheme $ML_+^{\underline{h}}$, all computations can be done on the uniform subgrids $\Omega^{2h}$ and $\Omega^h$ with the understanding that the composite grid approximation is represented by
This can be a major programming advantage since all of the significant effects on the algorithm due to the presence of local grids can be dealt with simply by manipulating array subscripts. However, this scheme requires two extra boundary grid lines for each patch, which can be inefficient especially when the patches are very small. The $FAC^{\underline{h}}$ scheme as described in §4.1 eliminates those borders, but now the composite grid residuals must be computed specially at coarse grid interface points. (Composite grid residuals use the usual uniform stencils at other points.) This can be accomplished by storing the associated right-hand sides and stencils at the interface points. A useful way to do this is to first compute the coarse grid residual at those points, perhaps as a natural result of the coarse grid solve, then correct this computation after the fine grid solve. This scheme consists of the following computations: After Step 1 of $FAC^{\underline{h}}$, form $r^{2h} = f^{2h} - L^{2h} u^{2h}$ and set $f^{2h} \leftarrow r^{2h}$ (this is the new $f^{2h}$ to be used for the next FAC cycle; here it is correct only at points outside of the refinement region, $\Omega_2$; actually, since $FAC^{\underline{h}}$ is presumed here to use exact solvers, $r^{2h} = 0$, so $f^{2h} = 0$ at this point); after Step 2, redefine the new $f^{2h}$ in $\Omega_2^{2h}$ as the fine-to-coarse transfer of the residual $r^h = f^h - L^h u^h$; finally, correct the value of $f^{2h}$ at each coarse grid interface point $P$ according to $f^{2h}(P) \leftarrow f^{2h}(P) + \delta(P)$, where $\delta(P) = ((f^{\underline{h}} - L^{\underline{h}} u^{\underline{h}}) - (f^{2h} - L^{2h} u^{2h}))(P)$. This scheme can be convenient, especially for parallel processing applications and for coding, because computation of $\delta$ may involve only fine grid data. For example, for problem (1.1) with $p = 1$ and remembering the stencils in §2.5, at the coarse grid interface point $P$ in Figure 2.10 we have

(Here, $h$ is the fine grid mesh size, so $u^{2h}(S)$ signifies the coarse grid function at the coarse grid point below $P$. Thus, $S$ here actually refers to the point just below the fine grid $S$ depicted in Figure 2.10.) Note that, with the delayed correction version of $FAC^{\underline{h}}$, all data needed to compute $\delta(P)$ is resident on the local grid, $\Omega^h$. Note also that this works in an analogous way for larger mesh factors (e.g., $h$ to $4h$).
4.5 Nonlinear Schemes

There are several effective ways to apply FAC to nonlinear problems. In this section, we briefly describe two that are analogous to the MG schemes presented in §3.3. We base both of them on the immediate correction version of FAC. The FAS version of FAC is given by

Using the immediate correction version of FAC makes this nonlinear FAC scheme a little subtle. It may be more enlightening to consider FAS for the delayed correction version, but it would require notation that we choose not to introduce here. Note, as with MG, that the fine-to-coarse transfers applied to the approximation $u^{\underline{h}}$ can be different from those used for the residuals $f^{\underline{h}} - L^{\underline{h}} u^{\underline{h}}$. The Galerkin scheme for FAC uses the subgrid operators $L^{2h}(u^{2h}) = I_{\underline{h}}^{2h} L^{\underline{h}}(I_{2h}^{\underline{h}} u^{2h} + u^{\underline{h}})$ and $L^h(u^h) = I_{\underline{h}}^h L^{\underline{h}}(I_h^{\underline{h}} u^h + u^{\underline{h}})$ and right-hand sides $f^{2h} = I_{\underline{h}}^{2h} f^{\underline{h}}$ and $f^h = I_{\underline{h}}^h f^{\underline{h}}$. It is given by

4.6 Computational Complexity and Direct FAC Solvers

As an iterative solver of the composite grid equations, FAC cycles have essentially the same computational complexity as the solvers used for the uniform subgrid equations. In particular, a coarse-to-fine FAC-MG scheme based on a V-cycle MG solver costs about $c_0 n$ arithmetic operations, where $n$ is the number of composite grid points and $c_0$ is the complexity constant for MG as defined in §3.4. For many applications, then, the complexity of FAC-MG is essentially the cost of performing three or four relaxation sweeps on each uniform subgrid. However, as a direct solver of the PDE, this complexity is not yet optimal. For this purpose, we develop a nested iteration FAC algorithm for composite grid equations in analogy to the full MG method described in §3.4. We will take a similar approach that uses the finite element space setting for simplicity.
Figure 4.11. Sample composite grid and its coarsening on $\Omega = (0,1)$ (the values of $h$ and $h_c$ are illegible). Note that each consists of three levels.
Suppose for each composite grid ft- = U^._ 1 ft /l * that the discretization satisfies
where e-* = $* — t>^* is the discretization error in the composite grid solution, t^1*, p > 0 is the order of approximation, c is some constant independent of h, \ \ - \ \ is some norm on T, and || • ||n fc \n fc+1 is possibly another norm restricted to functions on ftfc\ftjt+i. To obtain accuracy to the level of discretization error, the convergence criterion for the composite grid ft- thus becomes
where er ^ = v&* — vr\ is the algebraic error in the z/th FAC iterate as an approximation to tA*. As with full multigrid (FMG), we now attempt to determine an initial guess, f7^, so that (4.8) is achieved with z/ = 1. To do this, we will need a grid that is significantly coarser than ft- (so that computation is relatively inexpensive), but that delivers a reasonable discretization accuracy. The natural choice here is to globally coarsen the composite grid, ft-. Specifically, let ft2/l* be the usual coarse subgrid of ft'1* and define the coarse 2h composite grid ft2- = UJ k=1 ft 2/lfc (see Figure 4.11). (Contrast the definition of ft2- with that of ft— given in §4.3 and illustrated in Figure 4.9. Note that ft2- is formed by globally coarsening ft-, while ft— results by deleting the finest level of ft-.) Let I^h and /J- be the FVE transfers corresponding to these composite grids. (We assume the conforming case T2- C T- which holds for the model grids consid9h ered here.) Suppose the approximation v,^\ satisfies the convergence
FAC METHOD
99
criterion (4.8) relative to its level:
Then the initial guess $v_0^{\underline{h}} = v_1^{2\underline{h}}$ for grid $\underline{h}$ satisfies
Here we have used (4.7) for grid $\underline{h}$ and grid $2\underline{h}$. For the typical case $p = 2$, this inequality means that the initial guess is within an order of magnitude of convergence. As with MG, it is then enough to have the linear part of $FAC^{\underline{h}}$ satisfy
In general, we want $|||FAC^{\underline{h}}||| \le \gamma < 1$; it would then be sufficient to perform $\nu = \log(1 + 2^{p+1})/(-\log \gamma)$ cycles of FAC on each composite grid. In any event, the nested iteration form of FAC is written as $u^{\underline{h}} \leftarrow FACNI^{\underline{h}}(f^{\underline{h}})$ and defined recursively as follows:
Because the numbers of points of the successive composite grids $\Omega^{2\underline{h}}$ and $\Omega^{\underline{h}}$ differ by about a factor of 4, in analogy to FMG the complexity of this nested iteration scheme is about four thirds the cost of $FAC^{\underline{h}}$. For FAC-MG, this translates to the cost of about four relaxations on each uniform subgrid. For a more thorough complexity analysis of FAC, including nested iteration and the complete self-adaptive process, see [McCormick, McKay, and Thomas 1989].
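The cycle count and the four-thirds cost factor quoted above are easy to tabulate. The following Python sketch is illustrative only: the function names `cycles_needed` and `nested_iteration_work` are hypothetical, and it assumes that the FAC convergence factor is bounded by a known $\gamma < 1$ and that each globally coarsened composite grid has one quarter as many points as the next finer one.

```python
import math


def cycles_needed(p, gamma):
    """Smallest integer nu with gamma**nu * (1 + 2**(p + 1)) <= 1.

    p is the assumed order of approximation and gamma an assumed bound
    on the FAC convergence factor (0 < gamma < 1).
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    return math.ceil(math.log(1 + 2 ** (p + 1)) / (-math.log(gamma)))


def nested_iteration_work(levels, ratio=4.0):
    """Total work of nested iteration, in units of one cycle on the finest grid.

    Assumes the point count drops by `ratio` with each global coarsening,
    so the work is the geometric sum 1 + 1/ratio + 1/ratio**2 + ...
    """
    return sum(ratio ** (-k) for k in range(levels))


if __name__ == "__main__":
    for gamma in (0.1, 0.25, 0.5):
        print("gamma =", gamma, "cycles per grid =", cycles_needed(2, gamma))
    print("relative work with 10 grids:", nested_iteration_work(10))
```

With `ratio=4.0` the sum approaches 4/3 as the number of grids grows, which is the factor cited in the text; other coarsening ratios (for example 2 in one dimension) can be explored by changing the argument.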
Figure 4.12. An example of a nested sequence of composite grids with $l = 4$. Note the expansion of certain levels as composite grids are coarsened, which is necessary to provide coarse level support points for the interface. Note that some levels become redundant (e.g., level $k = 2$ for one of the grids shown), which means the interface there has a refinement factor larger than 2. Finally, note that levels disappear as their interiors become empty.

Coarsening subgrids to create coarser composite grids must be done carefully to ensure that each internal patch boundary is supported by coarse level points. See Figure 4.12, which illustrates the need to occasionally expand the region governed by a given level to obtain this support.

4.7 Time-Dependent Equations

The discretization developed in §2.9 for the parabolic equations in (1.5) could be solved by FAC in a way entirely analogous to the elliptic case: Use the standard time marching scheme ("relaxation") to solve (2.27) on the global grid up to the desired terminal time line; use the
global solution to fix fine grid boundary values at the interface line (Figure 2.24); solve the fine grid equations (2.27) by the same time marching scheme; compute composite grid residuals (if exact solves were used for the implicit equations in (2.27), the only nonzero residuals are from (2.28) at the coarse grid interface points); correct the coarse grid equations with these residuals using the FVE-based transfer operator (the staggered control volumes determine transfer weights); then repeat the coarse and fine grid solves.

However, the marching nature of relaxation allows us to improve this approach by having the fine grid accuracy be reflected more immediately in the coarse grid equation. To say this differently, note that one role of the fine grid is to impart its accuracy to the global solution at the final time line; since accuracy of the advancing global solution depends on previous time, the fine grid correction should occur as soon as possible, before the global solution is advanced. This yields the following FAS-type two-level scheme:

For $k = 0, 1, 2, \ldots$, do:

Step 1. Solve (2.27) for $u_i^{k+1}$ for all $i$ on the coarse grid.

Step 2. Compute $u_i^{k+1/2} = \frac{1}{2}(u_i^k + u_i^{k+1})$ for $i$ corresponding to the interface and solve the fine grid counterpart of (2.27) for $u_i^{k+1/2}$ and $u_i^{k+1}$ for all $i$ on the fine grid.

Step 3. Compute the composite grid residuals on time line $k + 1$ using (2.27) on the coarse grid, its counterpart on the fine grid, and (2.28) at the interface. Transfer these residuals to the coarse grid equations and add in the FAS way to the right-hand sides of (2.27).

Step 4. Recompute $u_i^{k+1}$ for all $i$ on the coarse grid using the corrected right-hand sides in (2.27).

Step 5. Repeat Step 2.

By implication of the notation here, there is no separate storage for the coarse and fine grid approximations: both use the composite grid vector $u_i^k$.
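The control flow of Steps 1 through 5 can be sketched as follows. This is only a structural outline in Python, not an implementation of (2.27) or (2.28): the four callables `coarse_solve`, `fine_solve`, `composite_residual`, and `fas_correct` are hypothetical placeholders that a user would have to supply, and the names `fac_time_step` and `fac_time_march` are likewise illustrative.

```python
def fac_time_step(u, rhs, coarse_solve, fine_solve, composite_residual, fas_correct):
    """Advance the shared composite grid vector u from time line k to k + 1.

    All four callables are hypothetical placeholders supplied by the caller.
    """
    u = coarse_solve(u, rhs)           # Step 1: coarse grid solve for time line k + 1
    u = fine_solve(u)                  # Step 2: interface half-step values, then fine grid solve
    r = composite_residual(u)          # Step 3: composite grid residuals on time line k + 1
    rhs_fas = fas_correct(rhs, r)      # Step 3: FAS-type correction of the coarse right-hand sides
    u = coarse_solve(u, rhs_fas)       # Step 4: recompute the coarse grid values
    u = fine_solve(u)                  # Step 5: repeat Step 2
    return u


def fac_time_march(u0, rhs_for_step, n_steps, **solvers):
    """Apply fac_time_step for k = 0, 1, ..., n_steps - 1."""
    u = u0
    for k in range(n_steps):
        u = fac_time_step(u, rhs_for_step(k), **solvers)
    return u
```

In this outline `u` is a single composite grid vector that is passed through every stage, so the coarse and fine grid approximations are never held in separate copies.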
This means that the fine grid values are automatically "injected" to the coarse grid whenever they are changed, which is just what was intended. For a related development of time-dependent FAC schemes, together with results from several numerical experiments, see [Heroux 1988] and [Ewing et al. 1989].

4.8 Self-Adaptive Techniques

For many of the physical models requiring local resolution, it is known at the outset where to place the local grids, and at what scale.
In many cases, however, this knowledge must be obtained dynamically from features of the emerging solution. This section focuses on the design of self-adaptive FAC techniques for this purpose. The brief comments here pertain more to general principles than concrete algorithms because many basic issues in this field have not been fully explored. Despite this unsettled state of research, efficient FAC software does exist (cf. [McCormick, McKay, and Thomas 1989]) that implements completely dynamic self-adaptive processes. Perhaps as much as any other area of numerical computation, the design of effective adaptive methods must be done in a very systematic way. One of the first tasks is to develop clear and realistic objectives. This requires specification of the type of error, how it will be measured, and what the computational goal is relative to errors and work. For example, in the context of FVE discretizations, it seems natural to focus on the actual error in the function space setting measured by an energy norm or one of the Sobolev norms. An error measure of this type can be written as a sum of local integrals of the error, so the ultimate objective could be that each of these component errors be smaller than some prescribed (weighted) tolerance, regardless of the work required to achieve it. Another possibility is to use pointwise estimates of both the error and the work necessary to achieve it, then adopt an objective that articulates the appropriate compromise here. (See [Brandt 1977] for an example of this approach.) In any case, a local error estimating procedure is essential. It should be fairly safe so that serious errors are not overlooked, but reasonably tight to keep a check on unnecessary refinement. 
An important advantage of multilevel methods is that they enhance the choice of conventional error estimators: In addition to using the data on the present level of refinement to solve local error approximation equations or measure smoothness of the solution, the approximations on the various refinement levels can be used to quantify the known form of the error. For example, if the pointwise error is known or presumed to be of the form $ch^2$, with $c$ an unknown constant, then approximations with two different mesh sizes can be used to determine $c$ and, hence, the error on the finest level. In the case that the error is of the general form $ch^p$ with $p$ unknown, three levels can be used for this determination. Care should be taken with this approach since it has a limitation shared by most extrapolation methods: In practice, the discretization error may not behave according to the form $ch^p$, even when this error is bounded by such a form. For these cases, a fortuitously accurate coarse approximation could lead to a severe underestimate of the error.
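The two-level and three-level determinations described above amount to a few lines of arithmetic. The Python sketch below works on a single pointwise value and assumes, as the text cautions may fail in practice, that the approximation on mesh size $H$ equals $u + cH^p$ exactly; the function names are hypothetical and the mesh sizes are taken to be $h$, $2h$, and $4h$.

```python
import math


def fine_error_estimate(u_h, u_2h, p=2):
    """Estimate the error u_h - u on the finest level.

    Assumes u_H = u + c * H**p, so that u_2h - u_h = c * h**p * (2**p - 1).
    """
    return (u_2h - u_h) / (2 ** p - 1)


def observed_order(u_h, u_2h, u_4h):
    """Estimate the unknown order p from three levels h, 2h, 4h.

    Under the same assumption, (u_4h - u_2h) / (u_2h - u_h) = 2**p.
    math.log2 raises ValueError if the ratio is not positive, which is one
    symptom of the error not following the assumed form.
    """
    return math.log2((u_4h - u_2h) / (u_2h - u_h))


if __name__ == "__main__":
    # Synthetic data that follows the assumed form exactly: u = 1, c = 3, h = 0.1, p = 2.
    u, c, h = 1.0, 3.0, 0.1
    u_h, u_2h, u_4h = (u + c * (m * h) ** 2 for m in (1, 2, 4))
    print("estimated order:", observed_order(u_h, u_2h, u_4h))
    print("estimated fine-level error:", fine_error_estimate(u_h, u_2h))
    print("true fine-level error:", u_h - u)
```

On data that do not follow the form $ch^p$, both functions still return numbers, which is exactly why such estimates should supplement rather than replace conventional error estimators.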
Upper bounds on the error cannot by themselves prevent misleading results. Therefore, for such cases, this multilevel error-estimation approach should be used not as a replacement but as a supplement to conventional error estimators. On the other hand, for a PDE with sufficient smoothness discretized on uniform grids, asymptotic expansions can be used to show that the error is indeed of the form $ch^p$. Multilevel error estimation can be very effective for such cases. This is true even for composite grid applications because the expansions apply locally in the uniform regions. This ability to use asymptotic expansions on the uniform subgrids is another advantage that multilevel composite grid methods have over conventional irregular grid techniques. A method that exploits these expansions to obtain higher-order accuracy with FAC is studied in [McCormick and Rude 1989b].

Error estimators can be used in the framework of the discretization objectives to "flag" grid points that indicate the need for further refinement. Once this is done, these flagged points must be organized into coherent uniform patches. It is critical that this process of organization be localized so that its cost is not out of balance with the cost of the solver. In [McCormick, McKay, and Thomas 1989], such a scheme is described that creates rectangular patches by starting from a single point, searching neighbors for flagged points to add to the patch, then assessing whether to join any other existing patches that are abutting. The cost of this "patching" scheme is substantially less than that of the FAC solver, although the theory has yet to show that it keeps a check on unnecessary refinement.

4.9 Physically Conforming Grids
Although the strict use of uniform subgrids gives the FAC process its efficiency, it creates some inflexibility in the treatment of very irregular features like nonrectilinear boundaries and internal shocks. In this section, we briefly describe two current directions of research attempting to overcome this limitation. Again because the exploration of this area is only just beginning, the discussion will be concerned mostly with general concepts. Grids that conform to irregular physical phenomena can provide for greater accuracy than those that do not. This is especially true when the phenomena allow discontinuities in the solution because conventional finite element functions themselves can be discontinuous only at irregular contours that coincide with element boundaries. In most cases, nonconforming grids must be very fine for the inaccuracy that this introduces not to matter. (An interesting alternative to physically
Figure 4.13. Internal boundary (airfoil) in the physical domain is connected to the east far field boundary by a cut (dotted line).
conforming grids or elements are physically conforming basis functions; this approach has only just begun to be explored in the context of FAC [Liu and McCormick 1988a].)

One way to fit composite grids to irregularities is to use a global mapping to transform the domain, including its irregular contours, to a rectilinear region with a rectilinear contour. Consider Figure 4.13, for example, where the irregular contour is an internal airfoil. A "cut" is made from the trailing edge of the airfoil to the "far field" east boundary so that the airfoil can then be viewed as part of the domain boundary. This allows the "physical" region to be suitably transformed to a square, the "computational" domain, where a uniform grid can be used for the discretization. The objectives of this approach are to use transformations that have some sense of smoothness, that do not significantly complicate the transformed PDE, that allow the location of the physical grid points corresponding to the computational grid to be easily computed, and that provide simple controls to direct these physical grid point locations. The work in [Liu and McCormick 1988b] studied such an approach based on elliptic grid generation for the full potential equations modeling flow over an airfoil. It used FAC in the dual role of generating smooth local grids and solving the transformed PDE discretized on the computational composite grids.

The main problem with such approaches is that local irregularities can be transformed by the mapping into global complexities of the PDE. An approach that attempts to remove this difficulty was first suggested in very general terms in [Brandt 1977]. The basic idea is to use transformations restricted to local grid patches.
However, while local transformations provide a potentially effective approach in principle, there are many ways that this concept could be realized in practice, and the perspectives that could guide development through these alternatives have yet to be clarified. Nevertheless, to get a closer view of the basic idea of local transformations, following is a brief description of two of the many
alternatives. Consider for simplicity the case, depicted in Figure 4.14, of a square region $\Omega$ that has been slightly distorted on a small segment of the boundary. A uniform global grid $\Omega^1$ is placed on $\Omega$ that uses a crude scheme for treating this distortion. For example, as indicated in Figure 4.15, the elements may extend to, or just outside of, $\Omega$ so that $\partial\Omega^1$ approximates $\partial\Omega$ as closely as possible. The interface for the refinement region is then placed near the irregular boundary to isolate it from the global domain and to define the refinement region, $\Omega_2$. (Actually, the interface need not be very close to this boundary; its main role is to separate $\Omega^2$ from the rest of $\Omega$ so that a local coordinate transformation can be used; thus, $\Omega^2$ need not be any finer than $\Omega^1$; a third and finer level $\Omega^3$ can be placed in $\Omega_2$ nearer the boundary if higher resolution is necessary.) Note that the interface in Figure 4.15 was chosen to lie along lines of the global grid. This can limit flexibility in determining the refinement region and complicate the mapping, but it greatly simplifies the definition of the composite grid space, the resulting discretization, and the implementation of FAC. But first a local transformation must be used to define the fine grid $\Omega^2$ in the refinement region so that it conforms to the domain boundary and the interface. The composite grid $\Omega^0$ is then defined as the union of $\Omega^2$ and the coarse grid $\Omega^1$ outside the refinement region (see Figure 4.16). This is an important but subtle choice for $\Omega^0$. The coarse grid points in the refinement region and outside of $\Omega$ are not included in $\Omega^0$, so the coarse grid space is generally nonconforming in the sense that it is not a subspace of the composite grid space.
(An alternative here would be to define the composite grid space as the union of the subgrid spaces, $\Omega^0 = \Omega^1 \cup \Omega^2$; but this would produce a more complicated algorithm because the composite grid operator would have many connecting coefficients between the coarse and fine points in the refinement region.) Nevertheless, the FAC algorithm can now be applied just as it is defined in §4.1. As in [Liu and McCormick 1988b], if some form of elliptic grid generation scheme is used to compute $\Omega^2$, then FAC can play a dual role of computing the mapping that defines $\Omega^2$ and solving the resulting composite grid equations. These roles are further compounded for problems where the location and shape of the physical phenomena are themselves unknown and must be determined as part of the solution process.

This approach has yet to be analyzed, so the effect of reentrant corners on the boundary $\partial\Omega^2$ is not really known. Until it is, it seems prudent to consider ways to avoid these potential singularities. One
Figure 4.14. Sample irregular boundary.
Figure 4.15. Coarse grid with triangular elements and interface indicated by dashed lines.
Figure 4.16. Composite grid conforming to the boundary.
possibility is to determine the coarse grid $\Omega^2$ without regard to $\Omega^1$, then form the composite grid by triangulating the interface between $\partial\Omega^2$ and the coarse region $\Omega^1 \setminus \Omega^2$. Although such an approach requires extra work and special care in the triangulation process, it avoids artificial corners on the interface and substantially simplifies the coordinate transformation.

4.10 Theory for Variational FAC

This section develops bounds on the convergence factors for FAC. We restrict most of the discussion to the two-level case, although we present a theorem for the multilevel case at the end of this section. Throughout this section, as with almost all existing theory for FAC (see, however, [Ewing, Lazarov, and Vassilevski 1988]), we will assume that the composite grid matrix $L^{\underline{h}}$ is symmetric positive definite (s.p.d.) and that the following variational conditions hold:
where $c_{2h}$ and $c_h$ are positive constants. Note that $L^{2h}$ and $L^h$ are also s.p.d. provided the intergrid transfers are full rank, as we assume. These conditions admit the use of an energy innerproduct on each level that relates in a monotonic way to the composite grid space, greatly simplifying the proofs. The Galerkin condition (4.11) holds for many practical applications, although it generally does not hold for FVE discretizations as noted in §3.2. The grid transfer condition (4.12) is not so common in practice, although it holds for all Galerkin-type FE discretizations and for certain FVE discretizations (e.g., those based on rectangular elements and corresponding bilinear basis functions). We will assume both conditions for all of the theory presented here and in §5.7. In §4.11, we will extend this theory to the nonvariational case of FVE-based FAC.

To simplify the discussion, we will assume further and without loss of generality that $c_{2h} = c_h = 1$. (These constants affect the scale of the subgrid equations, but have no effect on the composite grid iterates.) The convergence estimates are obtained in terms of the discrete energy innerproduct and norm defined for the composite grid space $U^{\underline{h}}$ by the respective forms
and
where $\langle\cdot,\cdot\rangle$ is the Euclidean innerproduct on $U^{\underline{h}}$. Here and in §5.7, the terms self-adjoint and orthogonality and the symbols $*$ (adjoint) and $\perp$ (perpendicular) will refer to the energy innerproduct, unless otherwise indicated. Define the operators $P_{U^{2h}}$ and $P_{U^h}$ for the respective coarse and fine grid spaces $U^{2h}$ and $U^h$ by
Note that $P_{U^h}$ is an orthogonal projection onto the subspace $I_h^{\underline{h}} U^h$: By (4.11),
so that $P_{U^h}$ is a projection; if $u^{\underline{h}} \perp I_h^{\underline{h}} U^h$, then (4.12) implies that
for any $u^h \in U^h$, which in turn implies that
so $P_{U^h}$ is orthogonal; and, finally, the range of $P_{U^h}$ is $I_h^{\underline{h}} U^h$ because
A similar argument shows that $P_{U^{2h}}$ is an orthogonal projection onto $I_{2h}^{\underline{h}} U^{2h}$. This now means that $P_{U^{2h}}^\perp = I - P_{U^{2h}}$ is an orthogonal projection onto $(I_{2h}^{\underline{h}} U^{2h})^\perp$, and similarly for $P_{U^h}^\perp$. Note by (4.4) that the algebraic error $e^{\underline{h}} = u^{\underline{h}*} - u^{\underline{h}}$ is transformed by one $FAC^{\underline{h}}$ cycle according to
Letting $\rho$ denote spectral radius, then because the projections are self-adjoint (in the energy innerproduct), the convergence factor of this
coarse-to-fine cycle of FAC is bounded sharply by
The last line follows from the fact that $P_{U^h}^\perp P_{U^{2h}}^\perp u^{\underline{h}} = \lambda u^{\underline{h}}$ implies $u^{\underline{h}} = P_{U^h}^\perp u^{\underline{h}}$, so that $\rho(P_{U^h}^\perp P_{U^{2h}}^\perp) = \rho(P_{U^h}^\perp P_{U^{2h}}^\perp P_{U^h}^\perp)$. Note that (4.16) implies that the asymptotic factor $\rho(P_{U^h}^\perp P_{U^{2h}}^\perp) = \kappa^2$. For the two-grid case considered here, this improved rate is attained by the second $FAC^{\underline{h}}$ cycle because the error is then in the range of $P_{U^h}^\perp$. Note further that
so that $\kappa$ is also a sharp bound on the convergence factor of the fine-to-coarse cycle of FAC. It also follows that a two-level V-cycle version of FAC has a convergence factor sharply bounded by
This follows because the linear part of the two-level V-cycle algorithm is $P_{U^h}^\perp P_{U^{2h}}^\perp P_{U^h}^\perp$. (Note that this iteration matrix is self-adjoint, which permits the use of V-cycle FAC algorithms as preconditioners for conjugate gradient acceleration.)

In special circumstances, $\kappa = 0$. To see this, first note that the range of $P_{U^h}^\perp$ consists of composite grid error functions with zero residuals in the refinement region: If $u^{\underline{h}} = P_{U^h}^\perp e^{\underline{h}}$, then $L^{\underline{h}} u^{\underline{h}}$ is zero in $\Omega_F$. Such functions are called composite grid harmonics. In cases where these harmonics are exactly reproduced on grid $2h$, $P_{U^{2h}}^\perp P_{U^h}^\perp = (I - P_{U^{2h}}) P_{U^h}^\perp = 0$, which implies that $\kappa = 0$. Such exact approximations occur when the null space of the PDE operator in $\Omega_F$ (values on the boundary $\partial\Omega_F$ are free) can be represented exactly by grid $2h$ functions. For example, 1-D diffusion operators with no 0th-order terms
have null spaces in $\Omega_F$ consisting of linear functions, and these can be exactly reproduced on any coarse level. The same is true in 2-D provided the refinement region $\Omega_F$ is wholly contained in a grid cell of $\Omega^{2h}$. FAC is therefore an exact solver for such special cases. For the general case, however, FAC is an iterative method with convergence rates that depend on how well composite grid harmonics are approximated on grid $2h$. Our first theorem establishes convergence estimates that involve the quantity
The value of $\delta$ relates to the approximation and regularity properties of the particular application, as we now show.

The quantity $\delta$ in (4.19) depends on properties of the PDE and its discretization about which we must be more specific. We therefore consider the case that the PDE has full regularity and admits a variational principle based on an energy functional. Specifically, we assume that the PDE is discretized on $\Omega^{2h}$ and $\Omega^{\underline{h}}$ by a Galerkin-type finite element method that produces the nodal matrix $L^{\underline{h}}$ and induces the intergrid transfers $I_{2h}^{\underline{h}}$ and $I_{\underline{h}}^{2h}$ and coarse grid operator $L^{2h}$ that naturally satisfy (4.11) and (4.12). Assume that the grids $\Omega^{2h}$ and $\Omega^h$ that constitute $\Omega^{\underline{h}}$ are uniform. The main point of these assumptions is that we can then appeal to standard finite element theory to obtain the estimate
where $u^{\underline{h}}$ is an eigenvector of $L^{\underline{h}}$ belonging to eigenvalue $\lambda^{\underline{h}}$, similarly for $u^{2h}$ and $\lambda^{2h}$, and $c_1$ is a constant independent of the mesh size $h$. Equation (4.20) can be derived from standard finite element estimates that relate the eigenvectors associated with any global grid to the eigenvectors of the PDE using an inequality analogous to (4.20); such estimates can be used in a triangle inequality involving grids $\Omega^{2h}$ and $\Omega^{\underline{h}}$ to establish (4.20). Note that (4.20) implies that
for all eigenvectors $u^{\underline{h}} \ne 0$. To see how (4.21) can be used to bound $\delta$, note first that
which is symmetric in the Euclidean sense. Hence, by the property of the projection $P_{U^{2h}}^\perp$, we have
Now by a careful argument based on the orthogonality of the $u^{\underline{h}}$, from (4.21) and (4.23) we can then conclude that
for some constant $c_2$ independent of $h$. Finally, an inspection of the row sums of $L^{2h}$ shows that
for some constant $c_3$ independent of $h$. (4.24) and (4.25) then combine to show that
that is, $\delta$ is bounded independent of $h$.

This bound seems to be based on the assumption that the mesh ratio between $\Omega^{2h}$ and $\Omega^h$ is 2, as the notation implies. However, as we said, (4.20) is derived from a triangle inequality involving an analogous relationship between grids $\Omega^{2h}$ and $\Omega^{\underline{h}}$ and the PDE; we made no assumption here about the mesh ratio. This means that $\Omega^h$ could have an arbitrarily small mesh size relative to the mesh size of $\Omega^{2h}$ and $\delta$ would still be bounded. In fact, as a limiting case, the fine "grid" problem could be the PDE itself localized to $\Omega_F$ and we would still have $\delta < \infty$. By the same reasoning, we need not assume $\Omega^h$ is uniform. Thus, while our notation continues to suggest that $\Omega^h$ is a local uniform refinement of $\Omega^{2h}$ by a mesh ratio of 2, the next theorem actually applies to the case that $\Omega^h$ is any refinement of $\Omega^{2h}$.
The following theorem is taken from [McCormick 1984]. THEOREM 4.1. The convergence factor for the coarse-to-fine cycle of a two-level exact solver version of $FAC^{\underline{h}}$ satisfies
Proof. Equation (4.26) is proved by using (4.17) and showing that
Dropping the superscript $\underline{h}$, to this end we let $e$ be the error before FAC is applied and define $v = P_{U^h}^\perp e$ and $r = Lv$. Since $P_{U^h}^\perp$ involves an exact solve in $\Omega_F$, then $r_F = 0$ so $\|r\| = \|I_{\underline{h}}^{2h} r\|$. Using this and the orthogonality of the projections yields
Hence,
But then the orthogonality of the projections again implies
This proves (4.27). By (4.17), this in turn proves (4.26) and, hence, the theorem.
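The projection form of the two-level error propagation used in this section, and the special case $\kappa = 0$ noted for 1-D diffusion operators with no 0th-order terms, can be checked numerically on a very small problem. The sketch below is not taken from the text. It assumes linear finite elements for $-u'' + \sigma u$ on $(0,1)$ with zero Dirichlet data, a coarse mesh size of 1/4, a refinement region $(1/2, 1)$ with mesh size 1/8, Galerkin subgrid operators, and projections of the form $I(I^T L I)^{-1} I^T L$; all function names are illustrative, and the dense pure-Python linear algebra is meant only for matrices of this size.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices only)."""
    n = len(A)
    M = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        d = M[c][c]
        M[c] = [x / d for x in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]


def composite_operator(nodes, sigma):
    """Linear finite element matrix for -u'' + sigma*u on (0,1), u(0) = u(1) = 0."""
    x = [0.0] + list(nodes) + [1.0]
    n = len(nodes)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        hl, hr = x[i + 1] - x[i], x[i + 2] - x[i + 1]
        L[i][i] = 1 / hl + 1 / hr + sigma * (hl + hr) / 3
        if i + 1 < n:
            L[i][i + 1] = L[i + 1][i] = -1 / hr + sigma * hr / 6
    return L


def projection(I_up, L):
    """Energy-orthogonal projection onto range(I_up), using a Galerkin subgrid operator."""
    It = transpose(I_up)
    L_sub = matmul(It, matmul(L, I_up))
    return matmul(I_up, matmul(inverse(L_sub), matmul(It, L)))


def fac_error_operator(sigma):
    """Return (E, L): E maps the error before a coarse-to-fine cycle to the error after."""
    nodes = [0.25, 0.5, 0.625, 0.75, 0.875]      # composite grid (interior nodes)
    L = composite_operator(nodes, sigma)
    # Linear interpolation from coarse nodes 0.25, 0.5, 0.75 to the composite grid.
    I_2h = [[1, 0, 0], [0, 1, 0], [0, 0.5, 0.5], [0, 0, 1], [0, 0, 0.5]]
    # Extension by zero from fine nodes 0.625, 0.75, 0.875 to the composite grid.
    I_h = [[0, 0, 0], [0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
    n = len(nodes)

    def complement(P):
        return [[float(i == j) - P[i][j] for j in range(n)] for i in range(n)]

    E = matmul(complement(projection(I_h, L)), complement(projection(I_2h, L)))
    return E, L


def energy_norm(v, L):
    Lv = [sum(a * b for a, b in zip(row, v)) for row in L]
    return max(0.0, sum(a * b for a, b in zip(v, Lv))) ** 0.5


if __name__ == "__main__":
    for sigma in (0.0, 10.0):
        E, L = fac_error_operator(sigma)
        e = [1.0] * 5
        Ee = [sum(a * b for a, b in zip(row, e)) for row in E]
        print("sigma =", sigma, "energy-norm reduction:", energy_norm(Ee, L) / energy_norm(e, L))
```

For $\sigma = 0$ the computed error operator vanishes up to rounding, since the composite grid harmonics are then linear in the refinement region and are reproduced exactly on the coarse grid; for $\sigma > 0$ the operator is nonzero and the reduction factor depends on how well the harmonics are approximated on the coarse grid.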
The convergence factor estimate in (4.26) has the advantage that it is bounded independent of the mesh size ratio, but the disadvantage that it is limited by its dependence on regularity of the PDE. Thus, while this theory applies to many other practical phenomena, it does not cover the case of real discontinuities in the PDE operator (e.g., discontinuous coefficients or boundary singularities). This gap is filled with the following regularity-free estimate, which we obtain by sacrificing the independence of the bounds on the mesh size ratios. To this end, let $U_0^h = \{u^h \in U^h : u^h(P) = 0 \text{ for } P \in \Omega^{2h} \cap \Omega^h\}$. For any nonzero vectors $u, v \in U^{\underline{h}}$, define
and, for any subspaces $U, V$ of $U^{\underline{h}}$, define
Let $U = I_{2h}^{\underline{h}} U^{2h}$ and $V = I_h^{\underline{h}} U_0^h$. The next theorem will bound $\kappa = |||FAC^{\underline{h}}|||$ by $|\cos(U,V)|$, which in turn can be bounded locally: To be specific, let $\langle u, v \rangle_E$ denote the energy innerproduct restricted to an element $E \in \mathcal{E}^{2h}$. Note that
If it can be shown that
for all $u \in U$, $v \in V$ and for some constant $\gamma < 1$, then
Estimating $\gamma$ locally on $E \in \mathcal{E}^{2h}$ for finite element discretizations reduces to the solution of a generalized eigenvalue problem for the local stiffness matrix of the element $E$. For example, the bound $\gamma = \sqrt{3/8}$ was proved in [Maitre and Musy 1982] for the model problem (1.1) using triangular elements and linear basis functions. Note that $\gamma$ is invariant under multiplication of the diffusion coefficient $p$ by a different positive constant within each coarse grid element.
The proof of the next theorem is based on the results presented in [Mandel and McCormick 1989b]. THEOREM 4.2. The convergence factor for the coarse-to-fine cycle of a two-level exact solver version of $FAC^{\underline{h}}$ satisfies
Proof. We first show that
where $X = (I_{2h}^{\underline{h}} U^{2h})^\perp$, $Y = (I_h^{\underline{h}} U^h)^\perp$, and $P_X = P_{U^{2h}}^\perp$ and $P_Y = P_{U^h}^\perp$ are the orthogonal projectors of $U^{\underline{h}}$ onto $X$ and $Y$, respectively. To prove (4.29), note for any $u$ in $X$ that
Hence,
and, similarly, for any $v$ in $Y$ we have
Now since $P_X P_Y w = \lambda w$ implies $w = P_X w$, we have $P_X P_Y P_X w = \lambda w$. Hence,
We have thus proved (4.29). The theorem would then follow from this and (4.17) if we could prove that
To prove this, first note that $X = U^\perp$ and $Y \subset V^\perp$. Thus,
We will therefore complete the proof by showing that
To this end, note that $U^{\underline{h}} = U \oplus V$, which implies that $U^\perp \cap V^\perp = \{0\}$. Let $u \in U^\perp$ be such that $|||u||| = 1$ and $|||u - P_{V^\perp} u|||$ is minimal. ($P_{U^\perp}$ and $P_{V^\perp}$ denote the orthogonal projections of $U^{\underline{h}}$ onto $U^\perp$ and $V^\perp$, respectively.) Let $v = P_{V^\perp} u / |||P_{V^\perp} u|||$ and $\alpha = \cos(u, v)$. Then, $|\alpha| = \cos(U^\perp, V^\perp)$, $P_{V^\perp} u = \alpha v$, and $P_{U^\perp} v = \alpha u$. Now from a simple geometric argument in the two-dimensional subspace of $U^{\underline{h}}$ spanned by $u$ and $v$, we can conclude that $u - P_{V^\perp} u \in V$, $v - P_{U^\perp} v \in U$, and $\cos(u - P_{V^\perp} u, v - P_{U^\perp} v) = \alpha$. This proves (4.30) and, hence, the theorem.
Direct methods for solving the uniform subgrid equations are too costly for most large-scale applications. Moreover, it seems wasteful to demand exact local accuracy when all that is required for the current composite grid approximation is a modest improvement in accuracy, typically one decimal place (see (4.10)). The next theorem obtains convergence bounds for the more practical version of FAC that uses approximate solvers. For this purpose, we assume that the solvers of the equations on the uniform subgrids $2h$ and $h$ deliver a relative accuracy of $\epsilon_{2h}$ and $\epsilon_h$, respectively; for simplicity, we assume that these solvers are self-adjoint stationary linear iterative methods. To be more precise, let the solver of the grid $h$ equation $L^h u^h = f^h$, beginning with initial guess $u^h$, be represented as $u^h \leftarrow G^h(u^h; f^h)$, where
and where $M^h$ is a symmetric matrix considered as an approximation to $(L^h)^{-1}$. Assume that the linear part $Q^h = I - M^h L^h$ satisfies $|||I_h^{\underline{h}}(I - M^h L^h)||| \le \epsilon_h$. Using analogous notation and assumptions for grid $2h$ and letting $\epsilon = (\epsilon_h, \epsilon_{2h})$, we represent the two-grid approximate solver version of FAC by $u^{\underline{h}} \leftarrow FAC_\epsilon^{\underline{h}}(u^{\underline{h}}; f^{\underline{h}})$ and define it by
THEOREM 4.3. The convergence factor of $FAC_\epsilon^{\underline{h}}$ satisfies
Proof. Because of the assumptions on the subgrid solvers, in analogy to (4.15) the algebraic error $e^{\underline{h}} = u^{\underline{h}*} - u^{\underline{h}}$ is transformed by one cycle of $FAC_\epsilon^{\underline{h}}$ according to
where
and
Thus, $Q^h$ is a self-adjoint linear operator satisfying
Since $P_{U^h}^\perp I_h^{\underline{h}} = 0$, we have
Analogous comments hold for $Q^{2h}$. Thus,
Equation (4.32) now follows from (4.17) and the theorem is proved.

A somewhat more general estimate was proved in [McCormick and Thomas 1986]. However, Theorem 4.3 is simpler and its estimate is sharper. Theorem 4.3 can be used to extend Theorem 4.1 to the case that $\Omega^{\underline{h}}$ consists of $l > 2$ levels. In particular, the coarse-to-fine version of $FAC^{\underline{h}}$ defined in §4.3 with $\underline{h} = (h, 2h, \ldots, h_c)$ can be treated by writing $\Omega^{\underline{h}} = \Omega^{\underline{h}'} \cup \Omega^{h_c}$, where $\underline{h}' = (h, 2h, \ldots, h_c/2)$, and viewing $FAC^{\underline{h}'}$ as an approximate solver for the "patch" $\Omega^{\underline{h}'}$. Since Theorem 4.1 does not actually require $\Omega^h$ to be uniform, we can conclude from Theorem 4.3 that
Now by recursively considering $\Omega^{\underline{h}'}$ in terms of its "global" level $\Omega^{h_c/2}$ and "patch" $\Omega^{\underline{h}''}$, where $\underline{h}'' = (h, 2h, \ldots, h_c/4)$, we can bound $|||FAC^{\underline{h}'}|||$ in a similar way. Continuing in this manner we can show that
Unfortunately, this gives a dependence of the rate on $l$, the number of levels in $\Omega^{\underline{h}}$. A more sophisticated approach is needed for the multilevel case, which is treated in Theorem 4.5 below.

Theorem 4.3 applies to the case that a self-adjoint MG scheme is used as the approximate subgrid solver. In this case, $\epsilon_{2h} = |||MG^{2h}|||$ and $\epsilon_h = |||MG^h|||$ which, for many applications, are both bounded by $\epsilon < 1$ independent of $h$. This in turn implies that the convergence rate of $FAC_\epsilon^{\underline{h}}$ satisfies
Hence, $\kappa_\epsilon$ is bounded less than 1 independent of $h$ whenever $|||FAC^{\underline{h}}|||$ is. For example, Theorem 4.1 implies
Theorem 4.3 also applies to the case that the subgrid solver is a self-adjoint relaxation scheme, yielding a convergence bound independent of $h$ provided the mesh ratio between the coarse and fine grids is bounded. For example, consider Richardson's iteration, which on subgrid $h$ is written as
This is of the form in (4.31) with $M^h = \frac{1}{\rho(L^h)} I$, which is symmetric. Thus, the linear part $I - M^h L^h$ is self-adjoint (in energy). Consider the case suggested by our notation of a mesh ratio of 2 and assume that grid $2h$ is solved exactly ($\epsilon_{2h} = 0$). Then the initial algebraic error $e^h = (L^h)^{-1} I_{\underline{h}}^h (f^{\underline{h}} - L^{\underline{h}} u^{\underline{h}})$ satisfies $\bar{I}_h^{2h} L^h e^h = 0$, where the overbar signifies the operator restricted to the refinement region $\Omega_F$. Thus, $e^h = (I - \bar{I}_{2h}^h (\bar{L}^{2h})^{-1} \bar{I}_h^{2h} L^h) e^h$, from which it follows that
where
Hence,
We thus have $\epsilon_h$ = (1 — -)1/2 which, by inequalities similar to (4.24) and (4.25), is typically bounded above by an $\epsilon < 1$ independent of $h$. Note by (4.32) that the convergence factor of $FAC_\epsilon^{\underline{h}}$ based on Richardson's iteration is thus bounded by
None of our first three theorems applies as stated to model problem (1.2) because its discretizations are not, or rather should not be, positive definite. The next theorem relaxes the assumption that $L^{\underline{h}}$ is positive definite in order to cover such cases. THEOREM 4.4. Suppose that $L^{\underline{h}}$ is nonnegative definite and that the interpolation operators are constructed so that its null space is contained in both of their ranges:
Suppose that $f^{\underline{h}}$ is orthogonal to $N(L^{\underline{h}})$ in the Euclidean sense so that (4.3) has a solution. Then Theorems 4.1 through 4.3 apply with the modification that inverses are everywhere replaced by Moore-Penrose generalized inverses. Proof. First note that (4.34) is equivalent to
Since the intergrid transfers are assumed to be of full rank, this is actually an equality. Note also that $f^{\underline{h}}$ is orthogonal to $N(L^{\underline{h}})$ in both the Euclidean and energy innerproducts. Let $P_N$ be the (energy)
orthogonal projector of $U^{\underline{h}}$ onto $N(L^{\underline{h}})$ and let $P_N^\perp = I - P_N$. Now it is easy to verify that if $u^{\underline{h}}_{(0)}$ is the starting approximation for $FAC^{\underline{h}}$, $u^{\underline{h}}_{(1/2)}$ is the approximation after Step 1, and $u^{\underline{h}}_{(1)}$ is the final approximation, then $u^{\underline{h}}_{(\nu)} = P_N^\perp u^{\underline{h}}_{(\nu)} + P_N u^{\underline{h}}_{(0)}$ for $\nu = 1/2, 1$. Moreover, $P_N^\perp u^{\underline{h}}_{(1)} = FAC^{\underline{h}}(P_N^\perp u^{\underline{h}}_{(0)}; f^{\underline{h}})$. Thus, $FAC^{\underline{h}}$ can be analyzed on $N^\perp(L^{\underline{h}})$ where Theorems 4.1 through 4.3 can be applied. This completes the proof.

Note that if $f^{\underline{h}}$ is Euclidean orthogonal to $N(L^{\underline{h}})$, then the transferred residuals $f^{2h}$ and $f^h$ are Euclidean orthogonal to $N(L^{2h})$ and $N(L^h)$, respectively, so each subgrid equation is automatically solvable.

The theory developed thus far shows that basic FAC convergence rates are independent of the number of grid points on each refinement level. A remarkable theorem in [Widlund 1989] established independence of the FAC rates on the number of these levels. One significance of this result is the consequence that direct FAC solvers (see §4.6) produce approximations that are accurate to the level of discretization error at a cost proportional to the number of composite grid points. We state this theorem in a form tailored to our applications, omitting the enlightening but rather lengthy proof. We pose the theorem in a special way, although it has much broader applicability.

THEOREM 4.5. [Widlund 1989] Consider model problem (1.1) discretized by the Galerkin scheme using continuous, piecewise linear elements. Assume that the composite grid consists of a locally nested sequence of uniform triangulations covering a nested sequence of subregions. Assume that the successive mesh sizes differ by a factor of two. Then the energy norm convergence factor for the V-cycle of a multilevel exact solver version of $FAC^{\underline{h}}$ is bounded by a constant which is less than one. The constant depends on the shape and size of the subregions, but is independent of the number of levels.
4.11 Theory for FVE-Based FAC
Since FVE does not exactly satisfy the variational conditions (4.10) and (4.11), the theory developed in the previous section does not directly apply. However, in this section we will show that FVE discretizations satisfy approximate Galerkin conditions which will allow us to relate FVE-based convergence factors to Galerkin-based factors. This approach and the following results are taken from [McCormick and Rude 1989a].
With this objective in mind, assume that the current symbols $\mathrm{FAC}^{\underline{h}}$, $L^{\underline{h}}$, $L^{2h}$, $I_{2h}^{\underline{h}}$, and $I_{\underline{h}}^{2h}$ refer to the operators based on FVE. Let $|||u^{\underline{h}}||| = (L^{\underline{h}}u^{\underline{h}}, u^{\underline{h}})^{1/2}$. For focus, assume that $\mathrm{FAC}^{\underline{h}}$ represents a fine-to-coarse two-level exact solver applied to model problem (1.1). Assume further that $p$ is Lipschitz continuous on $\Omega$. (It should be clear from the following development that our results can also be applied to coarse-to-fine FAC cycles, to general boundary conditions, and to diffusion equations under the weaker assumption that $p$ is Lipschitz continuous within each triangle; e.g., $p$ may be discontinuous across triangles.) Now let the subscript $G$ be used to denote operators based on the Galerkin discretization of (1.1) using $T$, the same finite element space used by FVE. Let $|||u^{\underline{h}}|||_G = (L_G^{\underline{h}}u^{\underline{h}}, u^{\underline{h}})^{1/2}$. Theorem 4.6 below will establish that $|||\mathrm{FAC}^{\underline{h}}||| \approx |||\mathrm{FAC}_G^{\underline{h}}|||_G$ in some sense. For its proof, we need a few lemmas, the first of which establishes that $L^{\underline{h}} \approx L_G^{\underline{h}}$.
LEMMA 4.1. There exists a constant $c$ independent of $h$ such that
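Proof. We first extend the domain of definition of $u^{\underline{h}}$ to include its Dirichlet boundary points and interface slave points. More precisely, let $\Gamma^{\underline{h}}$ be the set of points in $\partial\Omega_N \cup \partial\Omega_E$ having nearest (nondiagonal) neighbors in $\Omega^{\underline{h}}$. Extend the definition of $u^{\underline{h}}$ to $\Gamma^{\underline{h}}$ by setting $u^{\underline{h}}(P) = 0$ for $P \in \Gamma^{\underline{h}}$. Define $S^{\underline{h}}$ to be the set of grid $h$ slave points on the interface and extend the definition of $u^{\underline{h}}$ further by defining $u^{\underline{h}}(P)$ for $P \in S^{\underline{h}}$ to be the linear interpolant of $u^{\underline{h}}$ from the two neighbors of $P$ in $\Omega^{\underline{h}} \cup \Gamma^{\underline{h}}$. Now let $\bar{\Omega}^{\underline{h}} = \Omega^{\underline{h}} \cup \Gamma^{\underline{h}} \cup S^{\underline{h}}$. Let $W^{\underline{h}}$ be the set of unordered pairs, $\{i, j\}$, of indices corresponding to nearest neighbors in the linearly ordered grid $\bar{\Omega}^{\underline{h}}$. ($W^{\underline{h}}$ is meant to exclude index pairs corresponding to a slave and a coarse grid interface point, but include pairs corresponding either to two interface points or to a slave and a neighboring fine grid point.) Let $S_{ij}^{\underline{h}}$ be the common control volume surface between the points indexed by $\{i, j\} \in W^{\underline{h}}$. Then, analogous to (2.35), it is easy to verify that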
where $a_{ij}^{\underline{h}} = \frac{1}{h_{ij}} \int_{S_{ij}^{\underline{h}}} p\, dS$. Here, $h_{ij} = h$ if either $i$ or $j$ corresponds to a fine grid point and $h_{ij} = 2h$ otherwise. It is also easy to verify that
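where $b_{ij}^{\underline{h}} = \frac{1}{h_{ij}^2} \int_{E_{ij}^{\underline{h}}} p\, dV$. Here, $E_{ij}^{\underline{h}}$ is the union of triangles of the composite grid triangulation with the points corresponding to $i$ and $j$ as vertices. From the forms of $a_{ij}^{\underline{h}}$ and $b_{ij}^{\underline{h}}$, we can easily conclude that there exists a constant $c$ independent of $h$ such that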
This estimate, together with (4.36) and (4.37), proves the lemma.
Our next lemma shows that the FVE coarse grid operator $L^{2h}$ is approximately the same as the variational coarsening of $L^{\underline{h}}$ given by $L_V^{2h} = (I_{2h}^{\underline{h}})^T L^{\underline{h}} I_{2h}^{\underline{h}}$.
LEMMA 4.2. There exists a constant $c$ independent of $h$ such that
Proof. As in the proof of Lemma 4.1, let $\Gamma^{2h}$ be the set of points in $\partial\Omega_N \cup \partial\Omega_E$ having nearest (nondiagonal) neighbors in $\Omega^{2h}$. Let $\bar{\Omega}^{2h} = \Omega^{2h} \cup \Gamma^{2h}$ and define $u^{2h}$ on $\Gamma^{2h}$ by setting $u^{2h}(P) = 0$ for $P \in \Gamma^{2h}$. Let $S_{ij}^{2h}$ be the common control volume surface between points indexed by $\{i, j\} \in W^{2h}$, the set of ordered pairs of indices corresponding to points in its linearly ordered grid $\bar{\Omega}^{2h}$. Then, analogous to (4.35), we have
where $a_{ij}^{2h} = \frac{1}{2h} \int_{S_{ij}^{2h}} p\, dS$. The proof of this lemma will then be complete if we can show that
To accomplish this, first note that (4.38) is trivially satisfied for points that are both in $C\Omega_F$ but not both on $\partial\Omega_F$: since $I_{2h}^{\underline{h}}$ restricted to $C\Omega_F$ is the identity, $\hat{a}_{ij}^{2h} = a_{ij}^{2h}$. Assume now that $i$ and $j$ correspond to points that are both in $\bar{\Omega}_F^{2h}$ but not both in $\partial\Omega_F$. Assume without loss of generality that the $i$ and $j$ points have the same $x$-coordinate and that the element diagonals have a southwest-to-northeast bias. Then it is easy to verify that
where $\hat{S}_{ij}^{2h}$ is the union of two translates of $S_{ij}^{2h}$, one by $+Q$ and one by $-Q$ for a fixed offset $Q$. (Here we use the standard set notation $S \pm Q = \{P \pm Q : P \in S\}$.) Evidently, (4.38) holds for such $i$ and $j$. Finally, a relationship similar to (4.39) can be derived for the case that $i$ and $j$ both refer to points in $\partial\Omega_F$. For example, if these points are on the Neumann boundary $\partial\Omega_S$, then (4.39) holds with $\hat{S}_{ij}^{2h}$ taken as the union of two translates, in the $x$-direction, of a scaled copy $\alpha S_{ij}^{2h}$. (By $\alpha S$ we mean the set $\{\alpha P : P \in S\}$.) At interface points, we obtain a combination of such surface segments and the original half-segment of $S_{ij}^{2h}$ lying in $C\Omega_F$. This completes the proof.
Our next lemma shows that the worst-case convergence factor for $\mathrm{FAC}^{\underline{h}}$ is approximately the same as that for the variational scheme applied to $L^{\underline{h}}$.
LEMMA 4.3. There exists a constant $c$ independent of $h$ such that
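where $\kappa_V = \max\left\{ \min_{u^{2h} \in U^{2h}} \frac{|||u^{\underline{h}} - I_{2h}^{\underline{h}} u^{2h}|||}{|||u^{\underline{h}}|||} : 0 \ne u^{\underline{h}} \in U^{\underline{h}},\ L^{\underline{h}} u^{\underline{h}} \text{ is zero in } \Omega_F^h \right\}$.
Proof. Let $e^{\underline{h}}$ be an arbitrary initial algebraic error, $e^{\underline{h}}_{(1/2)}$ the error after Step 1 (the fine grid solve) of $\mathrm{FAC}^{\underline{h}}$, and $e^{\underline{h}}_{(1)} = \mathrm{FAC}^{\underline{h}} e^{\underline{h}}$ the final error. Step 1 is actually a variational coarsening, so $|||e^{\underline{h}}_{(1/2)}||| \le |||e^{\underline{h}}|||$.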
Thus, we can assume without loss of generality that $e^{\underline{h}}_{(1/2)} = e^{\underline{h}}$. Now the effect of Step 2 on the error can be written
Since $L^{\underline{h}} e^{\underline{h}}$ is zero in $\Omega_F^h$, by the properties of $I_{2h}^{\underline{h}}$ and $I_{\underline{h}}^{2h}$ we have
Hence,
where the maximum is taken over all unit vectors $e^{\underline{h}}$ such that $L^{\underline{h}} e^{\underline{h}}$ is zero in $\Omega_F^h$. Let $e^{\underline{h}}$ be the maximizing unit error. Now using Lemma 4.2 first to replace $L_V^{2h}$ by $(1 + ch)L^{2h}$, then to replace $(L^{2h})^{-1}$ by $(1 - ch)(L_V^{2h})^{-1}$, we have
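This certainly proves (4.40) (with possibly a larger constant $c$) and, hence, the lemma.
To show that $\kappa_V \approx |||\mathrm{FAC}_G^{\underline{h}}|||_G$, we must of course relate $L^{\underline{h}}$ and $L_G^{\underline{h}}$, which we do in the proof of Theorem 4.6 below. More subtly, we need to relate their composite grid harmonics (i.e., errors that give zero residuals in $\Omega_F^h$). This is the purpose of our last lemma.
LEMMA 4.4. If $u^{\underline{h}} \in U^{\underline{h}}$ is such that $L^{\underline{h}} u^{\underline{h}}$ is zero in $\Omega_F^h$, then there exists a $u_G^{\underline{h}} \in U^{\underline{h}}$ such that $L_G^{\underline{h}} u_G^{\underline{h}}$ is zero in $\Omega_F^h$ and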
where c is the constant satisfying (4.35).
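Proof. We first prove the more general proposition: Suppose that $u^{\underline{h}}$ is orthogonal to a subspace, $W^{\underline{h}}$, of $U^{\underline{h}}$ in the $L^{\underline{h}}$ energy inner product. Suppose further that $w_G^{\underline{h}}$ is the vector in $W^{\underline{h}}$ for which $u_G^{\underline{h}} = u^{\underline{h}} + w_G^{\underline{h}}$ is orthogonal to $W^{\underline{h}}$ in the $L_G^{\underline{h}}$ energy inner product. Then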
To prove (4.41), note by (4.35) that
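Now $(L^{\underline{h}} u^{\underline{h}}, w^{\underline{h}}) = 0$ for all $w^{\underline{h}} \in W^{\underline{h}}$ and, in particular, for $w^{\underline{h}} = u^{\underline{h}} - u_G^{\underline{h}}$. Hence,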
Thus, (4.41) is proved. The lemma now follows from this proposition using the subspace $W^{\underline{h}}$ of vectors in $U^{\underline{h}}$ with support contained in $\Omega_F^h$ and noting that $L^{\underline{h}} u^{\underline{h}}$ is zero in $\Omega_F^h$ if and only if $u^{\underline{h}}$ is orthogonal to $W^{\underline{h}}$ in the $L^{\underline{h}}$ energy inner product.
THEOREM 4.6. There exists a constant $c$ independent of $h$ such that
Proof. Let c be a constant large enough to satisfy the conditions of all of our lemmas. In view of Lemma 4.3, we need only show that
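To this end, let $u^{\underline{h}} \in U^{\underline{h}}$ be such that $L^{\underline{h}} u^{\underline{h}}$ is zero in $\Omega_F^h$, but is otherwise arbitrary. Let $u_G^{\underline{h}} \in U^{\underline{h}}$ be the vector, guaranteed to exist by Lemma 4.4, for which $L_G^{\underline{h}} u_G^{\underline{h}}$ is zero in $\Omega_F^h$ and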
Let $u_G^{2h} \in U^{2h}$ be the best approximation to $u_G^{\underline{h}}$ in the $L_G^{\underline{h}}$ energy norm. Then
Since $u^{\underline{h}}$ was an arbitrarily chosen $L^{\underline{h}}$ harmonic, (4.42) follows (with possibly a larger $c$) and the theorem is proved.
4.12 Numerical Examples
FAC-MG was applied to the composite grid test examples discussed in §2.11. For model problem (1.1), the scheme started with a zero guess and used a coarse-to-fine FAC process with one V(2,1) multigrid cycle as the approximate subgrid solver. The interface was treated as described in §4.4. Table 4.1 tabulates the Euclidean norms of the composite grid residuals using various numbers of levels $l$ and
                  l = 2,       l = 2,       l = 3,       l = 3,
                  h_1 = …      h_1 = …      h_1 = …      h_1 = …
cycle 1           0.388 E-3    0.335 E-3    0.475 E-3    0.262 E-3
cycle 2           0.431 E-4    0.386 E-4    0.750 E-4    0.335 E-4
cycle 3           0.471 E-5    0.387 E-5    0.115 E-4    0.417 E-5
cycle 4           0.561 E-6    0.396 E-6    0.184 E-5    0.576 E-6
cycle 5           0.747 E-7    0.492 E-7    0.284 E-6    0.102 E-6
average factor    0.12         0.11         0.16         0.15
Table 4.1. Convergence history of FAC-MG for well-posed potential flow. Displayed are the Euclidean norms of the composite grid residuals after each coarse-to-fine cycle for two global grid mesh sizes $h_1$ and for two and three levels ($l = 2$ and 3, respectively).
                        h_1 = …      h_1 = …      h_1 = …
discretization error    0.855 E-1    0.288 E-1    0.724 E-2
actual error            0.872 E 0    0.293 E-1    0.738 E-2
Table 4.2. Comparison of errors produced by FACNI against discretization errors for well-posed potential flow.
                  l = 2,       l = 2,       l = 3,       l = 3,
                  h_1 = …      h_1 = …      h_1 = …      h_1 = …
cycle 1           0.179 E-3    0.103 E-3    0.349 E-3    0.113 E-3
cycle 2           0.207 E-4    0.961 E-5    0.537 E-4    0.158 E-4
cycle 3           0.281 E-5    0.876 E-6    0.835 E-5    0.219 E-5
cycle 4           0.475 E-6    0.106 E-6    0.146 E-5    0.329 E-6
cycle 5           0.882 E-7    0.172 E-7    0.245 E-6    0.513 E-7
average factor    0.15         0.11         0.16         0.15
Table 4.3. Convergence history of FAC-MG for singular potential flow.
                        h_1 = …      h_1 = …      h_1 = …
discretization error    0.174 E 0    0.580 E-1    0.148 E-1
algebraic error         0.182 E 0    0.598 E-1    0.153 E-1
Table 4.4. Comparison of errors produced by FACNI against discretization errors for singular potential flow.
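The two rows of Table 4.4 can be compared directly. The short Python sketch below (the function name `error_ratios` is ours, introduced only for illustration) divides each FACNI error by the corresponding discretization error, using the values exactly as tabulated and in table order.

```python
# Rows of Table 4.4, one entry per mesh size, in table order.
discretization_error = [0.174e0, 0.580e-1, 0.148e-1]
algebraic_error = [0.182e0, 0.598e-1, 0.153e-1]


def error_ratios(computed, discretization):
    """Ratio of each computed-solution error to the discretization error."""
    return [c / d for c, d in zip(computed, discretization)]


ratios = error_ratios(algebraic_error, discretization_error)
for ratio in ratios:
    print(f"{ratio:.3f}")
```

Each ratio comes out only a few percent above one, so for this problem the FACNI error is essentially at the level of the discretization error.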
                  h = …                                    h = …
                  Re = 0       Re = 50      Re = 100       Re = 0       Re = 50      Re = 100
cycle 1           0.124 E-1    0.154 E-1    0.325 E-1      0.610 E-2    0.622 E-2    0.941 E-2
cycle 2           0.101 E-2    0.126 E-2    0.328 E-2      0.676 E-3    0.593 E-3    0.204 E-2
cycle 3           0.108 E-3    0.140 E-3    0.462 E-3      0.524 E-4    0.486 E-4    0.383 E-3
cycle 4           0.160 E-4    0.180 E-4    0.412 E-4      0.700 E-5    0.695 E-5    0.518 E-4
cycle 5           0.266 E-5    0.257 E-5    0.982 E-5      0.114 E-5    0.111 E-5    0.998 E-5
average factor    0.121        0.114        0.132          0.117        0.116        0.180
Table 4.5. Convergence history for FAC-MG for planar cavity flow.
various global grid mesh sizes $h_1$. Also included are the convergence factors geometrically averaged over four cycles.
To test the claim that FACNI, as defined in §4.6, is a "direct" solver, we applied it as described in §4.6 with one exception: To retain more accuracy in the transfer to finer levels in the nested iteration sequence, we used cubic interpolation for generating the initial iterate for the V-cycles. (The FAC correction process used bilinear interpolation as before.) The discretization error estimates were taken from Table 2.1. Restricting experiments to the two-level case, Table 4.2 shows that FACNI produces a composite grid approximation in one cycle whose error is at the level of discretization error: The actual error is smaller than twice the discretization error. Tables 4.3 and 4.4 display analogous results for model problem (1.2)-(1.3).
For planar cavity flow, we tested FAC performance for a two-level example by placing a single patch in the NE corner of $\Omega$. (The driven cavity example we tested actually has secondary vortices in the NW as well as the NE corner; however, here we have chosen simply to analyze the algebraic performance of FAC, not the accuracy of FACNI, so realistic placement of the refinement regions is not very significant.) Table 4.5 displays the convergence history of FAC coarse-to-fine cycles measured in terms of the Euclidean norms of the vorticity equation residuals and the convergence factors geometrically averaged over four cycles.
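The averaged factors quoted in the tables can be reproduced from the residual histories. The following minimal Python sketch (the function name `average_factor` is ours, not the book's) takes the geometric mean of the per-cycle reduction factors, which telescopes to the fourth root of the ratio of the last to the first residual norm. It is applied here to the first column of Table 4.1.

```python
def average_factor(residual_norms):
    """Geometric mean of the per-cycle residual reduction factors.

    With norms r_1, ..., r_n the per-cycle factors are r_{k+1} / r_k, and
    their product telescopes to r_n / r_1, so the geometric mean over the
    n - 1 cycle-to-cycle reductions is (r_n / r_1) ** (1 / (n - 1)).
    """
    if len(residual_norms) < 2:
        raise ValueError("need at least two residual norms")
    reductions = len(residual_norms) - 1
    return (residual_norms[-1] / residual_norms[0]) ** (1.0 / reductions)


# First column of Table 4.1 (residual norms after cycles 1 through 5).
table_4_1_first_column = [0.388e-3, 0.431e-4, 0.471e-5, 0.561e-6, 0.747e-7]
factor = average_factor(table_4_1_first_column)
print(round(factor, 2))
```

For this column the computed value is about 0.118, which rounds to the tabulated average factor of 0.12.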
Chapter 5
The Asynchronous Fast Adaptive Composite Grid Method
5.1 Motivation
The continuing advances in computers represent a response to the continuing increases in the demand of large-scale applications for greater computing power. The requirements for improved efficiency, higher resolution, and more sophisticated models have fueled efforts to create better hardware technologies and system architectures. Such requirements are also fueling efforts to create more powerful numerical algorithms. These trends in advancing machines and mathematics merge in a dramatic way in the area of parallel adaptive techniques for partial differential equations (PDEs).
Although it might appear at first glance that increased computer power will relieve the need for adaptive methods, just the opposite is probably true. Adaptive techniques are designed to achieve more capabilities from the computer than are possible with conventional methods. Expanding computer power is likely to give no more than temporary relief to the reliance on these techniques because demands for greater problem sophistication will quickly tax new capacities. This will recreate the need for efficient local resolution. But now this need will be dramatically intensified by the fact that enhancements in the physical model will significantly accentuate disparities in scale. This should make adaptive methods imperative for parallel computation.
Unfortunately, most adaptive methods seem ill-suited for parallel systems: Nonuniformity of the grids can inhibit vectorization and complicate process assignment, interprocess communication, and load balancing; and dynamic self-adaptive strategies can be cumbersome with respect to implementation of decision-making strategies and reassignment and rebalancing of loads. The use of uniform subgrids by the fast adaptive composite grid method (FAC) and other multilevel techniques alleviates most of these difficulties, but there remains a troublesome sequentialness in these algorithms that all but debilitates parallelism.
Figure 5.1. Assignment strategy for processors. Regions governed by individual processors are marked in the figure.
Figure 5.2. Active processors for FAC are marked in the figure. Fine level processors used for error corrections, coarse ones for residuals.
To explain this, suppose the host computer is a distributed memory multiprocessor system with a hypercube interprocessor communication topology (or some other topology supporting suitable local and global communication). Consider a composite grid with a general number of levels and suppose the hypercube is so large that each level may be assigned its own set of processors, as shown in Figure 5.1. When the composite grid has just two or three levels, one might be satisfied with the horizontal parallelism inherent in the FAC procedure: Each refinement patch can be treated by a suitable parallel solver (cf. §3.5) and multiple patches within a given level can be processed independently.
However, the lack of vertical parallelism can be a severe handicap even for just a few levels: As Figure 5.2 shows, most processors must wait for error corrections or residuals while one level is actively being solved, virtually eliminating the advantage of using processors on more than one level at a time. The very large scale models that massively parallel systems are beginning to support require a multilevel scheme that has vertical as well as horizontal parallelism, allowing many levels of refinement to be processed simultaneously with no significant sacrifice in convergence rates. The asynchronous fast adaptive composite grid method (AFAC) is designed to meet this need.
5.2 Basic Two-Level Schemes
To develop AFAC in a systematic way, we start by forcing the steps in FAC to be performed simultaneously. Our analysis of the failure of the resulting algorithm will then suggest an immediate cure. To this end, consider the following two-grid FAC scheme defined in §4.1, but modified so that the subgrids can be processed independently:
Note that Steps 1 and 2 are independent, as desired. Unfortunately, this algorithm generally does not converge. To see this, it is important first to understand the roles of the $2h$ and $h$ subgrids in the solution process. As Figure 5.3 shows, grid $2h$ computes global components varying on the coarse scale. Similarly, as Figure 5.4 shows, grid $h$ computes local components varying on the fine scale. For a general error, these subgrids will actually separate out their respective error components that are then combined in Step 3 to correct the composite grid approximation. The trouble is that these roles are not mutually exclusive: An error component that is both local and of coarse scale will be computed on both subgrids. This means that the correction $v^{\underline{h}} = I_{2h}^{\underline{h}} u^{2h} + I_h^{\underline{h}} u^h$ will overshoot the target error $e^{\underline{h}}$ by a factor of two so that $u^{\underline{h}}_{\mathrm{new}} = u^{\underline{h}} + v^{\underline{h}}$ will have an error $-e^{\underline{h}}$ (see Figure 5.5). This stagnation could be remedied by using an underrelaxation parameter $\omega = \frac{1}{2}$ as a coefficient of the correction: $u^{\underline{h}}_{\mathrm{new}} = u^{\underline{h}} + \omega v^{\underline{h}}$. However, this will impair correction of other error components, especially for the case of a larger number, $l$, of levels because we must then have $\omega \approx \frac{1}{l}$.
A more effective remedy is based on recognizing that troublesome local coarse-scale components can be inexpensively computed on a local coarse grid, as shown in Figure 5.6. Specifically, let $I_{2h}^h$ and $I_h^{2h}$ be the intergrid transfer operators restricted to the refinement region $\Omega_F^h$ and
Figure 5.3. The role of the global coarse grid is to represent global coarse-scale components of the composite grid error.
Figure 5.4. The role of the local fine grid is to represent local fine-scale components of the error.
Figure 5.5. A local coarse-scale component $e^{\underline{h}}$ is computed on both subgrids. The new correction $I_{2h}^{\underline{h}} u^{2h} + I_h^{\underline{h}} u^h$ thus overshoots the target error $e^{\underline{h}}$ by a factor of 2, producing a new error $e^{\underline{h}}_{\mathrm{new}} = -e^{\underline{h}}$.
Figure 5.6. A local coarse grid $\Omega_F^{2h}$ can be inserted between each pair of levels to be used for computing local coarse-scale components, which can then be subtracted from the correction to compensate for the overshoot.
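The overshoot of Figure 5.5 and the compensation of Figure 5.6 can be illustrated with a deliberately idealized bookkeeping model. The Python sketch below is not the AFAC algorithm itself: it assumes that the error splits cleanly into three component types and that each grid recovers exactly the components it can represent. The labels, the `VISIBLE` table, and the function `simultaneous_cycle` are all hypothetical names introduced for this illustration.

```python
# Which idealized error components each grid can represent (assumed labels).
VISIBLE = {
    "global_coarse_grid": {"global_coarse", "local_coarse"},
    "local_fine_grid": {"local_fine", "local_coarse"},
    "local_coarse_grid": {"local_coarse"},
}


def simultaneous_cycle(error, omega=1.0, compensate=False):
    """Apply one idealized simultaneous two-grid correction.

    error maps a component label to its amplitude. Each grid is assumed to
    recover exactly the components it can represent, so a component seen by
    both the global coarse and the local fine grid is corrected twice.
    """
    new_error = {}
    for name, amplitude in error.items():
        times_corrected = sum(
            name in VISIBLE[grid]
            for grid in ("global_coarse_grid", "local_fine_grid")
        )
        if compensate and name in VISIBLE["local_coarse_grid"]:
            times_corrected -= 1  # subtract the duplicated local coarse part
        new_error[name] = amplitude - omega * times_corrected * amplitude
    return new_error


error = {"global_coarse": 1.0, "local_fine": 1.0, "local_coarse": 1.0}
naive = simultaneous_cycle(error)
damped = simultaneous_cycle(error, omega=0.5)
compensated = simultaneous_cycle(error, compensate=True)
print(naive)
print(damped)
print(compensated)
```

In this toy setting the uncompensated cycle flips the sign of the local coarse-scale component, underrelaxation with a factor of one half removes that component but leaves half of the other two, and subtracting the duplicated part removes all three.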
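let $L_F^{2h} = I_h^{2h} L^h I_{2h}^h$. Similarly define the transfer operators between $\Omega_F^{2h}$ and the composite grid. Then two-grid AFAC is represented by $u^{\underline{h}} \leftarrow \mathrm{AFAC}^{\underline{h}}(u^{\underline{h}}; f^{\underline{h}})$ and defined by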
Note that Steps 1 through 3 can be performed independently of each other. Note also that the last term in Step 4 compensates for the overshoot in computing local coarse-scale errors, yet it has no impact on other error components.
The basic idea behind AFAC is to insert a grid between each pair of successive levels that has the resolution of the coarser subgrid and the domain of the finer subgrid. Each new grid is then used in an additional step to eliminate the components duplicated by the original pair of levels. It is important to recognize that adding this step does not significantly increase the cost of the algorithm. Since $\Omega_F^{2h}$ is both local and coarse, direct computation of $u_F^{2h}$ is much less costly than the computation of $u^h$. Moreover, when MG is to be used as the subgrid solver, $u_F^{2h}$ can be computed by an initial V-cycle on grid $\Omega_F^{2h}$, then eliminated by replacing $f^h$ with the residual $f^h - L^h I_{2h}^h u_F^{2h}$. The solution of this new grid $h$ problem is then just $u^h - I_{2h}^h u_F^{2h}$, which constitutes the last two terms in Step 4.
The cost of AFAC in terms of the total number of arithmetic operations is therefore comparable to that of FAC, but there remains the question of how well it performs, especially on a parallel computer. In §5.7, a simple two-level theorem will be established that shows that AFAC rates are directly proportional to those of FAC. Section 5.6 contains analytical results illustrating the performance of AFAC on a hypercube multiprocessor system. Its convergence properties will be studied numerically in §5.9.
5.3 Interpretations
In this section, AFAC is interpreted in ways similar to those made of FAC in §4.2.
As a preconditioner. Define
and
Then $\mathrm{AFAC}^{\underline{h}}$ acts on the algebraic error according to
Thus, the comments about FAC as a preconditioner hold here for AFAC with $M_F^{2h}$ replacing $M^h L^{\underline{h}} M^{2h}$.
As a block Jacobi method. The decomposition (4.5) allowed us to interpret FAC as a block Gauss-Seidel method. This observation suggests the use of a block Jacobi scheme to allow for simultaneous processing of each block (i.e., subgrid equation). As we observed, (4.5) leads to a system with a singularity that does not impair the block Gauss-Seidel process. It does, however, prevent the block Jacobi process from converging. (To see how singularities can impair Jacobi but not Gauss-Seidel, consider the simple case of the $2 \times 2$ scalar matrix
The equation $Ax = b$ is solved in one step with Gauss-Seidel relaxation provided $b \in R(A)$, but the error simply alternates in sign with every Jacobi sweep.) The key is to replace (4.5) with the decomposition
$$U^{\underline{h}} = I_{2h}^{\underline{h}} U^{2h} \oplus I_h^{\underline{h}} U_{2h}^h, \qquad (5.1)$$
where $U_{2h}^h = (I - I_{2h}^h (L_F^{2h})^{-1} I_h^{2h} L^h) U^h$. We write this decomposition as a direct sum because the subspaces are disjoint. Thus, the block system analogous to (4.6) that this produces can be solved by the block Jacobi method, which is just AFAC written in a different form. It is important to note that the decomposition (5.1) also works for FAC: Since the first step of FAC is to eliminate $I_{2h}^{\underline{h}} U^{2h}$ error components, projecting the error at the start of Step 2 onto $I_h^{\underline{h}} U^h$ actually projects it onto $I_h^{\underline{h}} U_{2h}^h$. In other words, FAC is also a block Gauss-Seidel method based on the decomposition (5.1). This gives us an immediate extension of the two-grid theory of FAC to that of AFAC (see §5.7).
As a domain decomposition method. As noted in §4.2, FAC can be viewed as a classical or multiplicative Schwarz alternating procedure with the feature that the subdomains fully overlap at their respective scales. In [Dryja 1989, Dryja and Widlund 1987], an additive Schwarz
method is developed that modifies the classical algorithm to allow for simultaneous processing of each subdomain. This modification has the same objectives as the changes made to FAC to produce AFAC. In this way, AFAC can be viewed as an additive domain decomposition method.
5.4 Multilevel Schemes
There is in principle no question of scheduling the order of the levels in AFAC processing because the presumption is that they may be treated simultaneously by separate sets of processors. Defining a multilevel version of AFAC is therefore straightforward. Since this definition is intrinsically nonrecursive, we introduce the following notation: Given the mesh sizes $h_{k+1} = 2^{-k} h_1$, $1 \le k \le l - 1$, with $h_1 > 0$, let $\Omega^{h_k}$ denote locally nested grids covering the nested regions $\Omega_k$, $1 \le k \le l$; let $I_{h_k}^{\underline{h}}$, $I_{\underline{h}}^{h_k}$, and $L^{h_k}$ be the operators associated with grid $k$, where $\underline{h}$ is the vector of entries $h_k$ (so that $\underline{h}$ refers to the composite grid $\Omega^{\underline{h}} = \bigcup_k \Omega^{h_k}$), $1 \le k \le l$; and let the corresponding intergrid transfer operators and $L_F^{2h_k}$ denote the operators associated with $\Omega_F^{2h_k} = \Omega^{h_k} \cap \Omega^{h_{k-1}}$, $2 \le k \le l$. Then the multilevel AFAC scheme is defined by
Note that the only synchronization required in this scheme is at Step 3: The individual $2l - 1$ grid solves in Steps 1 and 2 together with their corrections in Step 3 are completely independent tasks and may be treated by separate sets of processors, but a new cycle cannot begin until all steps have been completed for each subgrid. Actually, we could have eliminated even this synchronization by a process analogous to chaotic relaxation for scalar matrix equations. To do this, Steps 1 and 2 would have to be combined for each subgrid $\Omega^{h_k}$, but their solves would be allowed to freely access the current residuals $r^{\underline{h}} = f^{\underline{h}} - L^{\underline{h}} u^{\underline{h}}$ and form the correction $u^{\underline{h}} \leftarrow u^{\underline{h}} + I_{h_k}^{\underline{h}} u^{h_k} - I_{2h_k}^{\underline{h}} u_F^{2h_k}$, without coordinating these tasks with other subgrids. As with classical chaotic relaxation, this fully asynchronous AFAC would converge at a per "sweep" rate that is no worse than that of the "cycle-synchronized" scheme defined above,
assuming that the term "sweep" is defined so that it includes at least one complete correction from each subgrid. However, the remaining sections will consider only the cycle-synchronized scheme because of its simplicity and its suitability for the applications we have in mind.
5.5 Parallel Implementation
It is not possible to make very concrete observations about parallelism of an algorithm without specifying the computational environment. The diversity and changing nature of the field of advanced computers therefore make it difficult to develop general principles about parallel AFAC implementation and performance. For this reason, this and the next section of this chapter will serve primarily to illustrate various features of AFAC on a prototype hypercube system. This section consists of a few general comments on processor task assignment, load balancing, and other aspects of implementation; the next section contains estimates of the parallel complexity of AFAC on a model hypercube.
One of the most critical aspects of implementing a PDE solver on a multiprocessor system is processor task assignment. For iterative methods on distributed-memory multiprocessors, allocation of the work is usually done by decomposing the domain into subdomains that are then assigned to individual processors. This is how parallel MG was treated in §3.5. A major advantage in taking this approach is in situations where computation within the subdomains dominates communication between them. This is generally the case for AFAC, but the independence in its processing of the refinement levels offers another approach to processor task assignment that is often more advantageous. The basic idea is to allocate processors by level, with little regard to their geometric location. This can be done by linearly ordering the various levels, assigning units of weight to them based on their number of grid points and other factors, then doling out these levels to the processors in turn.
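One possible reading of this level-assignment strategy can be sketched in a few lines of Python. The rule used here (a processor keeps receiving whole levels until it holds at least an equal share of the total weight) is our assumption, since the text does not fix the details, and the function name `assign_levels` and the weights are hypothetical.

```python
def assign_levels(weights, num_procs):
    """Dole out linearly ordered levels to processors in turn.

    Each level stays intact on one processor. A processor keeps receiving
    levels until it holds at least an equal share of the total weight.
    Returns (owner, loads): owner[i] is the processor of level i.
    """
    if num_procs < 1:
        raise ValueError("need at least one processor")
    share = sum(weights) / num_procs
    loads = [0.0] * num_procs
    owner = []
    proc = 0
    for weight in weights:
        if loads[proc] >= share and proc < num_procs - 1:
            proc += 1  # current processor is full; move to the next one
        loads[proc] += weight
        owner.append(proc)
    return owner, loads


# Hypothetical weights (e.g., interior grid point counts in some unit).
owner, loads = assign_levels([6, 3, 3, 2, 2, 2], num_procs=3)
print(owner, max(loads))
# The same levels in a different order balance less well.
owner2, loads2 = assign_levels([3, 6, 2, 3, 2, 2], num_procs=3)
print(owner2, max(loads2))
```

With the first ordering every processor receives a load of 6, while the second ordering of the same levels leaves one processor with a load of 9. Since the elapsed time is governed by the most heavily loaded processor, the ordering of the levels matters.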
See Figure 5.7, which illustrates the results of such a process with a simple example. This level-assignment approach involves the following considerations:
Weights. The number of interior grid points can be used as a crude choice for the weights assigned to each level. However, performance might be improved by incorporating an actual estimate of the cost of the solver, which may even take into account whether the level is likely to be assigned to one or to several processors. An estimate of the communication costs might also influence the weights, although this can be difficult to incorporate for many applications.
Ordering and interlevel communication. The ordering of the levels can influence accuracy of the load balance, as shown in Figure 5.8. It can also affect interprocessor communication costs. Specifically, allocation by levels means that there will generally be a communication requirement between processors: Different levels that are located in a common geometric region must share data on composite grid residuals and approximations. It is difficult to order the levels to ensure that such processors are neighbors in the communication topology, so this will generally require global communication. Fortunately, this requirement often leads to a fairly small increment in AFAC processing time because of the dominance of level solver time and the potential for hiding message-passing latency in other computational tasks.
Figure 5.7. Processor task assignment by levels. The levels, $G_k$, are ordered by their assigned weights and displayed as squares with sizes proportional to those weights. Here, each level is kept intact so that it is assigned to exactly one processor.
Figure 5.8. Effects of reordering the levels of Figure 5.7.
Level partitioning. Effective balancing of the load may require
some of the individual levels to be shared among several processors. This is especially true of the coarser levels because for many applications they tend to have the most grid points. Although the level assignment approach minimizes this requirement, it can also make partitioning of the levels more awkward. Specifically, a given level may be assigned to several processors that are neighbors in the interprocessor communication topology, but it may be difficult to partition that level so as to maintain a local communication structure between the subdomains. For example, if a given grid is to be partitioned into boxes by a general sequence of contiguous processors, it is unlikely that the sequence would have the right number to admit a suitable rectangular processor array, and even less likely that neighboring boxes could be assigned to neighboring processors. However, for the case of strip partitioning, which will probably be adequate for many applications, nearest neighbor structure can be maintained, even on coarser MG levels. Figure 5.9 illustrates such a decomposition, where the nearest neighbor structure is clearly attained on the finest grids. To maintain the simplest possible communication structure on coarser levels, care must be taken in defining the coarser grids. Figure 5.10 illustrates this issue for the one-dimensional case, which is representative of strips in higher dimensions. Taking the usual coarse grid consisting of odd-indexed points will lead eventually to a loss of the feature that coarse grid regions below C-level belong to processors two pathlengths away in the hypercube. This trouble will not occur if the coarse grid is chosen by a coarsening based on the smallest subcube containing the processor sequence. Figure 5.10 shows the case where coarsening is based on the subcube consisting of $P_1$ through $P_{15}$. Note that this can cause irregular mesh spacing at the boundaries, which is usually easy to treat in an MG context.
Load balancing.
The elapsed time of AFAC processing is the maximum time of an individual processor to complete its assigned tasks. If the weights are good predictors of that time and the tasks have been fairly allocated by sharing these weights equally among the processors, then this elapsed processing time should be near the minimum. (Actually, equal allocation of the load may not be quite optimal if it comes at an increased cost in communication, as Figure 5.11 illustrates.) The allocation process for level partitioning is simpler to implement than for other partitioning schemes because it is one-dimensional in character. This simplicity is especially important for dynamic grid problems (e.g., self-adaptive applications and time-dependent equations), where the changing composite grids continually force reallocation of the loads. Actually, in some dynamic grid cases, rebalancing is not necessary. For example, if moving grids are used to follow an evolving front through a region, then it may be that these grids change only location, not size. This means that the processor assignments by level could remain the same; each processor would just need to determine to which processors it must now send composite grid residuals and approximations.
Figure 5.9. Partition with levels themselves decomposed. Each processor load here is six units.
Figure 5.10. Nonstandard coarse grid to maintain local communication structure. (Boundary or interface points are indicated by • and interior grid points are superscripted according to mesh size.)
Figure 5.11. If these weights assigned to the levels include an estimate of interprocessor communication costs, then they will be larger for levels assigned to more than one processor, as reflected by the increased size of the $G_k$ in (b). In (a), the loads are out of balance, but the levels are assigned intact to single processors. Here the maximal load is 9. In (b), the loads are in balance, but at the cost of partitioning the levels and increasing the maximal load.
5.6 Parallel Complexity
In order to get a general impression of the total time it takes for AFAC to solve a given composite grid problem on a parallel computer, in the following we develop a fairly crude parallel complexity analysis. This development is based on several assumptions about AFAC, multigrid (MG), the machine, and the problem. These assumptions, particularly those concerning the interprocessor communication features, are difficult to make in any generality. Moreover, any estimates that are made without statistical methods cannot be very realistic. These estimates are therefore not meant to be taken too seriously. With this perspective, the results will not be compromised by the fact that many of the assumptions are made to simplify the complexity analysis.
This analysis first considers the basic AFAC process, using MG as the subgrid solver. As the next two sections will show, AFAC asymptotic convergence factors are the square root of those of FAC. Typically, then, two AFAC iterations are necessary as the basic composite grid solver for nested iteration. We will estimate the cost of a "direct" AFAC solver based on this consideration.
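The arithmetic behind the "two AFAC iterations" remark can be checked directly. The Python sketch below takes the square-root relation between the AFAC and FAC factors from the text as given; the function names and the FAC factor of 0.12 (of the size observed in Table 4.1) are illustrative assumptions.

```python
import math


def afac_factor(fac_factor):
    """AFAC asymptotic factor, taken as the square root of the FAC factor."""
    return math.sqrt(fac_factor)


def reduction_after(factor, cycles):
    """Error reduction after a number of cycles at a fixed factor."""
    return factor ** cycles


fac = 0.12  # illustrative FAC factor, of the size seen in Table 4.1
afac = afac_factor(fac)
print(round(afac, 3), reduction_after(afac, 2))
```

Under this assumption one AFAC cycle reduces the error by a factor of about 0.35, and two AFAC cycles together give the same reduction as a single FAC cycle.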
Remembering the multilevel algorithm described in §5.4, the AFAC process involves solving a coarse grid ($\Omega^{2h_k}$) as well as a fine grid ($\Omega^{h_k}$) equation for each uniform subgrid. It is advantageous that an approximate solver like MG be used in such a way that the coarse and fine grid solves are analogous. This would ensure that the approximation to $u_F^{2h_k}$ would agree with the error component computed on $\Omega^{h_{k-1}}$ that is local to $\Omega^{h_k}$. Thus, Step 2 of the multilevel AFAC algorithm would exactly reproduce the local error component computed on the coarser subgrid; it would then be exactly eliminated in Step 3, preventing any contamination to this component computed on the fine grid. A convenient MG solver for this purpose is a V(0,2) cycle, which can be used to approximate $u^{h_k} - I_{2h_k}^{h_k} u_F^{2h_k}$ directly. In particular, this basic coarse-to-fine MG cycling scheme automatically computes an approximation to $u_F^{2h_k}$ just before it reaches the finest grid $\Omega^{h_k}$. At this point, instead of using this as an initial guess for $u^{h_k}$, it is used to compute a residual on $\Omega^{h_k}$ so that the new problem approximates $u^{h_k} - I_{2h_k}^{h_k} u_F^{2h_k}$, not $u^{h_k}$. More precisely, the AFAC-MG algorithm using this approach is defined as follows:
This section will focus on this algorithm for parallel complexity analysis. To this end, the following notation and simplifying assumptions are introduced:

- The target machine is a dedicated $d$-dimensional hypercube with $p$ processing nodes.
- Each node performs $q$ arithmetic operations in $aq$ units of time.
- A message packet of $w$ words sent from one node to another over $k$ pathlengths (i.e., through $k-1$ other nodes) takes $k(\alpha + \beta w)$ units of time. This time is unaffected by any other interprocessor communication, including messages sent concurrently in the opposite direction.
- The AFAC algorithm is synchronized so that subgrid processing and interlevel message passing phases for all processors are in step. Thus, for example, processors begin transmitting and receiving composite grid residuals and interpolants only after all of them have finished the subgrid processing phase.
- The only significant costs of the algorithm are in the subgrid solves and the composite grid interpolant evaluation and transfers. (The composite grid residual computation and transfer costs are ignored because the interface residuals only require partial stencil evaluation at just a few points to correct coarse grid stencils, the interior residuals are automatically computed by the subgrid V-cycles, and the transfers can be done simultaneously with interpolant transfers.)
- The subgrid solves are computed using Table 3.4 with $c = 15a$, which corresponds to the approximate cost of a V(0,2) cycle for 5-point operators.
- The composite grid interlevel communication costs are computed as an upper bound on the time it takes to compute an interpolant plus the time it takes for that interpolant to be transmitted from sending to receiving node. (This is a very crude estimate of the communication costs for an actual hypercube.)
- There are $l$ square subgrids, where the $k$th has $n_k = (2^{m_k} - 1)^2$ points. Each subgrid has a mesh size of half that of the coarser grid to which it corresponds. (This does not prevent a given coarse grid from having several regions of refinement.)
- There are $p_k$ contiguous processors assigned to level $k$ by strips and $\sum_{k=1}^{l} p_k = p$. (Note that no processor is assigned to more than one subgrid.)

The estimate for a V(0,2) cycle on subgrid $k$ using Table 3.4 is
Adding to this the time
for computation and global send of the composite grid interpolant yields the total time estimate for one AFAC-MG iteration of
This estimate is difficult to absorb without being more specific about the parameters $n_k$ and $p_k$. It seems reasonable to assume that the arithmetic load is in balance, so that each processor has the same number of grid points:
However, little can be said about $n_k$ in general. On the other hand, for many adaptive situations the dominant grid is the global one, $\Omega^{h_1}$, so for illustration we consider now the simplified case that
Using (5.3) and (5.4), letting $n = \sum_{k=1}^{l} n_k$, and dropping insignificant terms, (5.2) then becomes
An observation that can be made here is that AFAC increases the parallel complexity of MG on the global grid in two ways: by reducing the number of available processors on the global grid from $p$ to $p/2$ and by requiring a global message transmission for interpolation. The order of complexity is changed only in the $\beta$ term, from $O(\log n)$ to $O(n \log p)$. A similar analysis of the nested iteration version of AFAC-MG would arrive at a similar conclusion.

5.7 Theory for Variational AFAC

The two-grid theory for AFAC is greatly simplified by relating it to FAC as a block relaxation based on the decomposition (5.1). For this to be valid, we will assume that the variational conditions (4.11) and (4.12) hold. (Theory does not yet exist for the nonvariational case.) This relationship will allow us to use the FAC bounds (4.26) and (4.28) to obtain corresponding estimates for the exact solver version of AFAC. To develop approximate solver estimates similar to those for the approximate solver version of FAC in (4.32), let the grid solver in the approximate solver version of AFAC be represented by (4.31) with the more restrictive assumption that M2\ and M2h are symmetric matrices satisfying
Here, £ = ( £ 2h)'. The relationship $A \le B$ between two symmetric matrices $A$ and $B$ means that $B - A$ is nonnegative definite. The notation L^h signifies Lh restricted to f/2\. Other operators will be similarly labeled. Then the approximate solver version of AFAC is defined by
Note that we have combined the grid $\Omega^{h}$ and grid $\Omega_F^{2h}$ steps, assuming that the grid $\Omega_F^{2h}$ problem is solved exactly. For the general case of an approximate solver on $\Omega_F^{2h}$, the present theory applies simply by requiring that the approximate solvers on $\Omega_F^{2h}$ and $\Omega^{h}$ agree on $\Omega_F^{2h}$ components, that is, if $M^{h}$ and $M_F^{2h}$ are the respective approximate inverses, then

This is the reason for our choice of a V(0,2) cycle MG solver analyzed in the previous section.

THEOREM 5.1. Suppose $L^{\underline{h}}$ is positive definite. Then the spectral radii of the two-level exact solver versions of $\mathrm{AFAC}^{\underline{h}}$ and $\mathrm{FAC}^{\underline{h}}$ satisfy

Here, $\mathrm{FAC}^{\underline{h}}$ refers to either the coarse-to-fine or the fine-to-coarse algorithm. The convergence factor for one iteration of a two-level exact solver version of $\mathrm{AFAC}^{\underline{h}}$ satisfies both

and

The convergence factor of the approximate solver version of $\mathrm{AFAC}^{\underline{h}}$ satisfies

where £ = max(£ 2/l ,£2yi). Finally, suppose $L^{\underline{h}}$ is only nonnegative definite but that (4.34) is satisfied. If $f^{\underline{h}}$ is orthogonal to the null space of $L^{\underline{h}}$ in the Euclidean sense, then the bounds (5.8)-(5.10) apply with the modification that the inverses are everywhere replaced by Moore-Penrose generalized inverses.

Proof. The decomposition (5.1) yields a 2 x 2 block matrix for the composite grid equations. Since the matrix is symmetric and positive definite, it is two-cyclic. Equation (5.7) then follows from a theorem of Young [1950]. But $\rho(\mathrm{AFAC}^{\underline{h}}) = |||\mathrm{AFAC}^{\underline{h}}|||$. Thus, (5.8) and (5.9) now follow from (5.7), (4.17), (4.26), and (4.28). To prove (5.10), note that the energy norm of the approximate solver version of $\mathrm{AFAC}^{\underline{h}}$ equals $\rho(X - I)$, where
Here, P^ = (M)1/2/^^)-1/2 and /£ = /^(M)1/2, and similarly for the other operators. Note that all operators in (5.11) are symmetric in the Euclidean sense. Now (5.6) implies that and similarly for the 2h term. Hence, Since AFAC- = P^h + P^h - /, then 2/i
from which (5.10) follows. The final assertion follows from a proof analogous to the proof of Theorem 4.4.

Equation (5.7) implies that $\rho(\mathrm{AFAC}^{\underline{h}}) = 0$ whenever $\rho(\mathrm{FAC}^{\underline{h}}) = 0$. Thus, AFAC is an exact solver in the special case that composite grid harmonics can be reproduced exactly on grid $2h$. This theorem and those of §4.10 show that the energy bounds on the convergence factors for FAC and AFAC agree. But practical observations suggest that FAC factors are actually much better. This can be explained by observing that the energy factor for FAC is only an upper bound on the asymptotic convergence factor $\rho(\mathrm{FAC}^{\underline{h}})$: in fact, the asymptotic factor is quickly attained because the second and subsequent cycles of FAC can be written FACr^ = P^hP^hP^h so that IIIFAC^IH = p(P^hP^h). In other words, (5.7) is a better reflection of numerical performance than are (5.8) and (5.9).

As this analysis suggests, AFAC provides no apparent advantages over FAC in the two-level case. However, it was intended for use with many levels of refinement where parallelization of level processing becomes a significant factor. Thus, for a multilevel AFAC result analogous to Theorem 4.5, we appeal to another remarkable theorem, this one presented in [Dryja and Widlund 1989]. (For an alternate but specialized theorem, see [Mandel and McCormick 1989a].) Again, we tailor the statement of this theorem to our special interests and omit the proof.

THEOREM 5.2. [Dryja and Widlund 1989] Consider the damped AFAC scheme defined by multiplying each coarse grid correction by $\omega$. Then there exists an $\omega > 0$ such that the convergence factor of a multilevel exact solver version of $\mathrm{AFAC}^{\underline{h}}$ is bounded by a constant less than one. The constant depends on the shape and size of the subregions, but is independent of the number of levels.
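The square-root relationship that the proof of Theorem 5.1 draws from Young's theorem can be observed numerically on a small stand-in problem. The sketch below is not AFAC or FAC themselves: it assumes a generic symmetric positive definite matrix split into two blocks (a hypothetical substitute for the decomposition (5.1)) and compares additive (block Jacobi) with multiplicative (block Gauss-Seidel) relaxation. All names and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2 = 4, 3                      # hypothetical block sizes
n = n1 + n2

# Generic SPD matrix standing in for a composite grid operator (assumption).
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)
A11, A21, A22 = A[:n1, :n1], A[n1:, :n1], A[n1:, n1:]

# Block diagonal part D and block lower triangular part D + L of A.
D = np.block([[A11, np.zeros((n1, n2))], [np.zeros((n2, n1)), A22]])
DL = np.block([[A11, np.zeros((n1, n2))], [A21, A22]])

# Error propagation matrices: additive (both blocks corrected independently)
# and multiplicative (second block uses the updated first block).
additive = np.eye(n) - np.linalg.solve(D, A)
multiplicative = np.eye(n) - np.linalg.solve(DL, A)

def spectral_radius(M):
    """Largest eigenvalue modulus of M."""
    return np.max(np.abs(np.linalg.eigvals(M)))

rho_add = spectral_radius(additive)
rho_mult = spectral_radius(multiplicative)
print(rho_add, rho_mult, rho_add ** 2)
```

For a two-block splitting the squared additive radius matches the multiplicative radius, which mirrors the statement that AFAC factors are the square root of FAC factors.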
5.8 A Variant

The role of the AFAC coarse grid projection step in the refinement region $\Omega_F$ is to separate the levels $\Omega^{2h}$ and $\Omega^{h}$ enough to allow for independent processing. This step has the effect that, in subsequent AFAC cycles, the composite grid residual transferred to $\Omega^{2h}$ is zero in $\Omega_F$. We can eliminate this step but approximate its effect at the outset by simply setting the coarse grid source term to zero in $\Omega_F$. This yields the following simplified variant of AFAC:

Step 1. Set
This variant is a little less expensive than the standard AFAC scheme because it avoids the projection step in $\Omega_F$. In fact, in sequential mode, it is even a little less expensive than FAC because it avoids the transfer $I_h^{2h}$ in $\Omega_F$. In practice, this variant seems to have essentially the same performance characteristics as standard AFAC, at least for the limited set of problems used in our experiments. However, this variant has no supporting theory, which is no doubt a symptom of the fact that it is not yet well understood.

5.9 Numerical Examples

A complete analysis of the computational performance of AFAC should involve careful timing tests on parallel computers. However, we prefer to avoid such specialized tests here and instead refer the reader to [McCormick and Quinlan 1989]. For the present, we will be content with analyzing the algebraic convergence factors of AFAC applied to our three model problems. In particular, Table 5.1 displays the results of applying AFAC-MG to problem (1.1) in a way analogous to the FAC-MG experiments reflected in Table 4.1. Similarly, we report on the results for singular potential flow in Table 5.2, which is analogous to Table 4.3, and for planar cavity flow in Table 5.3, which is analogous to Table 4.5. Comparison of these tables suggests a very rough relationship between the convergence factors of FAC and AFAC predicted by the theory of §5.7, namely, that the FAC asymptotic convergence factors are the square of those for AFAC. The variance observed in the results is due to the facts that we have not carried out enough cycles to reach the asymptotic limits and that we are using inexact subgrid solvers (which slightly contaminate convergence, especially for FAC).

                  l = 2, h = ?   l = 2, h = ?   l = 3, h = ?   l = 3, h = ?
cycle 1           0.585 E-3      0.313 E-3      0.895 E-3      0.302 E-3
cycle 2           0.668 E-4      0.284 E-4      0.129 E-3      0.251 E-4
cycle 3           0.266 E-4      0.185 E-4      0.662 E-4      0.236 E-4
cycle 4           0.427 E-5      0.265 E-5      0.149 E-4      0.631 E-5
cycle 5           0.124 E-5      0.113 E-5      0.214 E-5      0.625 E-6
average factor    0.215          0.245          0.221          0.213

Table 5.1. Convergence history of AFAC-MG for well-posed potential flow.

                  l = 2, h = ?   l = 2, h = ?   l = 3, h = ?   l = 3, h = ?
cycle 1           0.615 E-3      0.272 E-3      0.113 E-2      0.189 E-3
cycle 2           0.144 E-3      0.562 E-4      0.877 E-3      0.439 E-3
cycle 3           0.240 E-4      0.712 E-5      0.198 E-3      0.667 E-4
cycle 4           0.136 E-4      0.481 E-5      0.669 E-4      0.337 E-4
cycle 5           0.519 E-5      0.174 E-5      0.157 E-4      0.395 E-5
average factor    0.303          0.283          0.343          0.380

Table 5.2. Convergence history of AFAC-MG for singular potential flow.

                  h = ?                                  h = ?
                  Re = 50     Re = 100    Re = 0      Re = 50     Re = 100    Re = 0
cycle 1           0.139 E-1   0.163 E-1   0.307 E-1   0.696 E-2   0.709 E-2   0.942 E-2
cycle 2           0.253 E-2   0.262 E-2   0.541 E-2   0.963 E-3   0.937 E-3   0.190 E-2
cycle 3           0.255 E-2   0.294 E-2   0.314 E-2   0.138 E-2   0.130 E-2   0.7% E-3
cycle 4           0.889 E-3   0.100 E-2   0.152 E-2   0.368 E-3   0.389 E-3   0.427 E-3
cycle 5           0.228 E-3   0.260 E-3   0.229 E-3   0.169 E-3   0.134 E-3   0.104 E-3
average factor    0.358       0.355       0.294       0.395       0.371       0.324

Table 5.3. Convergence history of AFAC-MG for planar cavity flow.
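The "average factor" rows of these tables appear consistent with the geometric mean of the cycle-to-cycle reduction ratios, although the text does not define the quantity. The sketch below makes that assumption explicit, reproduces the Table 5.1 averages from the tabulated cycle values, and squares the result to obtain the rough FAC factor suggested by §5.7. The function and variable names are illustrative only.

```python
def average_factor(history):
    """Geometric mean of successive reduction ratios over a convergence history
    (assumed definition; the text does not spell it out)."""
    steps = len(history) - 1
    return (history[-1] / history[0]) ** (1.0 / steps)

# Cycle 1..5 values from the four columns of Table 5.1.
table_5_1 = [
    [0.585e-3, 0.668e-4, 0.266e-4, 0.427e-5, 0.124e-5],
    [0.313e-3, 0.284e-4, 0.185e-4, 0.265e-5, 0.113e-5],
    [0.895e-3, 0.129e-3, 0.662e-4, 0.149e-4, 0.214e-5],
    [0.302e-3, 0.251e-4, 0.236e-4, 0.631e-5, 0.625e-6],
]
reported = [0.215, 0.245, 0.221, 0.213]   # "average factor" row of Table 5.1

computed = [average_factor(column) for column in table_5_1]
for afac, rep in zip(computed, reported):
    # afac**2 is the FAC factor the square-root relation would suggest.
    print(f"AFAC {afac:.3f} (reported {rep:.3f}), implied FAC about {afac**2:.3f}")
```

Squaring factors near 0.2 gives values near 0.05, which is the kind of rough comparison with the FAC tables that the text describes.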
Appendix
To illustrate the basic concepts of this book, in this Appendix we develop a finite volume element (FVE) discretization and fast adaptive composite grid (FAC) solver for a two-level composite grid approximation to a simple model problem: the one-dimensional Helmholtz equation with homogeneous Dirichlet boundary conditions given by
where $a$ is a constant. First consider the uniform grid given by $x_i = ih$, $0 \le i \le m$, $h = \frac{1}{m}$, with volumes $V_i = [x_{i-1/2}, x_{i+1/2}]$, $0 < i < m$. See Figure A.1. (By $x_{i \pm 1/2}$ we mean $(i \pm 1/2)h$.) Integrating (A.1) over $V_i$ yields the $m - 1$ equations
Now the 1-D Gauss Divergence Theorem is just the fundamental theorem of calculus of the form
Using this for the first term in (A.2) yields
where we have used the simple midpoint quadrature rule to approximate the source term. The basic idea of FVE is to replace $\psi$ in (A.3) by a continuous piecewise linear function on grid $h$. To this end, let $\phi_i(x) = \max\{0, 1 - \frac{|x - x_i|}{h}\}$, which is one of the hat functions for grid $h$ that form a basis for our space of element functions (see Figure A.2). Then replacing $\psi(x)$ in (A.3) by $\sum_{i=1}^{m-1} u_i^h \phi_i(x)$ yields

Here, $f_i^h = h\eta(x_i)$ and $u_i^h$ is the node value of the continuous piecewise linear approximation to $\psi$. Equation (A.4) together with the boundary conditions $u_0^h = u_m^h = 0$ form the global grid equations, which we write as $L^h u^h = f^h$. To develop corresponding FVE equations on a composite grid, let $m$ be an even integer and let $h = \frac{1}{m}$. Consider the composite grid consisting of a (global) grid on $[0,1]$ with mesh size $h$ and a (local) grid on $[\frac{1}{2},1]$ with mesh size $h/2$. Its points are
Using volumes defined by the midpoints of neighboring grid points (see Figure A.3), first consider the irregular interface point $x_{m/2}$ with volume $V_{m/2} = [x_{\frac{m}{2} - \frac{1}{2}}, x_{\frac{m}{2} + \frac{1}{4}}] = [\frac{1}{2} - \frac{h}{2}, \frac{1}{2} + \frac{h}{4}]$. Typical hat functions for this composite grid are depicted in Figure A.4, including the basis function at $x = x_{m/2}$ given by

The FVE equation at the interface point is thus
Figure A.1. Global grid points (indicated by x) and volume boundaries (indicated by |).
Figure A.2. Typical global grid hat function.
Figure A.3. Composite grid points and volume boundaries.
Figure A.4. Typical composite grid hat functions.

Here we use the uncentered quadrature rule to define
(For composite grid quantities, we suppress the superscript h for simplicity.) By analogy to (A.4), at coarse regular points we have
where

At local regular points we have

where

Equations (A.5), (A.7), and (A.9) together with the boundary conditions $u_0 = u_m = 0$ form the composite grid equations $Lu = f$. Finally, in analogy to (A.9), on the individual fine grid $h/2$ we have

where $f_i$ is given by (A.10). To use FAC, we will define the left boundary value in terms of the coarse grid solution. Generally, we will impose $u^{h/2}_{m/2} = v_0$ and $u^{h/2}_{m} = 0$, where $v_0$ is to be specified. Equation (A.11) together with these boundary conditions form the fine grid equations $L^{h/2} u^{h/2} = f^{h/2}$. Assume that we are given an initial composite grid approximation represented by its entries

with $u_0 = u_m = 0$. Assume that the composite grid source term $f$ has been constructed according to (A.6), (A.8), and (A.10). Then one cycle of FAC is given by the following:

Step 1. Compute the residual $r = f - Lu$ for the composite grid equation $Lu = f$ defined by (A.5), (A.7), and (A.9).
and solve the coarse grid equation $L^h u^h = f^h$ defined by (A.4).

Step 3. Set $f^{h/2}_{i/2} = r_{i/2}$, $m < i < 2m$, $u^{h/2}_{m/2} = u^{h}_{m/2}$, and $u^{h/2}_{m} = 0$, and solve the fine grid equations $L^{h/2} u^{h/2} = f^{h/2}$ defined by (A.11).

Step 4. Define the composite grid correction $v$ by
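The uniform grid FVE construction at the start of this Appendix can be sketched in a few lines. Because the displayed form of (A.1) is not reproduced here, the sketch assumes the model problem $-\psi'' + a\psi = \eta$ on $(0,1)$ with $\psi(0) = \psi(1) = 0$; the sign convention, the function name `solve_fve_uniform`, and the manufactured solution are all assumptions made for illustration. Integrating over $V_i$ with a piecewise linear $u^h$ gives a flux term $(2u_i - u_{i-1} - u_{i+1})/h$, a volume term $a h (u_{i-1} + 6u_i + u_{i+1})/8$, and the midpoint source $f_i^h = h\eta(x_i)$.

```python
import numpy as np

def solve_fve_uniform(m, a, eta):
    """Assemble and solve uniform grid FVE equations for the assumed model
    problem -psi'' + a*psi = eta on (0,1) with zero boundary values."""
    h = 1.0 / m
    x = np.linspace(0.0, 1.0, m + 1)
    n = m - 1                                  # number of interior nodes
    diag = 2.0 / h + a * h * 6.0 / 8.0         # flux part + volume integral of hat
    off = -1.0 / h + a * h / 8.0
    L = np.zeros((n, n))
    for i in range(n):
        L[i, i] = diag
        if i > 0:
            L[i, i - 1] = off
        if i < n - 1:
            L[i, i + 1] = off
    f = h * eta(x[1:-1])                       # midpoint rule: f_i = h * eta(x_i)
    u = np.zeros(m + 1)                        # boundary values stay zero
    u[1:-1] = np.linalg.solve(L, f)
    return x, u

a = 1.0                                        # hypothetical constant

def exact(x):
    return np.sin(np.pi * x)                   # manufactured solution

def eta(x):
    return (np.pi ** 2 + a) * np.sin(np.pi * x)

def max_error(m):
    x, u = solve_fve_uniform(m, a, eta)
    return np.max(np.abs(u - exact(x)))

err16, err32 = max_error(16), max_error(32)
print(err16, err32, err16 / err32)
```

Halving $h$ should cut the nodal error by about a factor of four for this smooth test case, which is the second-order behavior expected of the scheme under these assumptions.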
References
I. Babuska, J. Chandra, and J. E. Flaherty (1983), Adaptive Computational Methods for Partial Differential Equations, Society for Industrial and Applied Mathematics, Philadelphia.
D. Bai and A. Brandt (1987), Local mesh refinement multilevel techniques, SIAM J. Sci. Statist. Comput., 8, pp. 109-134.
B. R. Baliga and S. V. Patankar (1980), A new finite-element formulation for convection-diffusion problems, Numer. Heat Transfer, 3, pp. 393-409.
R. E. Bank (1986), A-posteriori error estimates, adaptive local mesh refinement, and multigrid iteration, in Proc. 2nd European Conference on Multigrid Methods, Cologne, October 1-4, 1985, W. Hackbusch and U. Trottenberg, eds., Lecture Notes in Mathematics, 1228, Springer-Verlag, Berlin, pp. 7-23.
R. Bank and D. Rose (1987), Some error estimates for the box method, SIAM J. Numer. Anal., 24, pp. 777-787.
M. Berger (1984), On conservation at grid interfaces, ICASE report no. 84-43.
M. Berger (1987), Adaptive finite difference methods in fluid dynamics, New York University report no. UC-32.
M. J. Berger and A. Jameson (1985), An adaptive multigrid method for the Euler equation, in Proc. Ninth International Conference on Numerical Methods in Fluid Dynamics, Soubbaramayer and J. P. Boujot, eds., Lecture Notes in Physics, 218, Springer-Verlag, Berlin, pp. 92-97.
M. Berger and J. Oliger (1984), An adaptive mesh refinement for hyperbolic partial differential equations, J. Comput. Phys., 53, pp. 484-512.
J. H. Bramble, R. E. Ewing, J. E. Pasciak, and A. H. Schatz (1988), A preconditioning technique for the efficient solution of problems with local grid refinement, Comput. Methods Appl. Mech. Engrg., 67, pp. 149-159.
A. Brandt (1973), Multi-level adaptive techniques (MLAT) for fast numerical solution to boundary value problems, in Proc. Third International Conference on Numerical Methods in Fluid Mechanics, Paris 1972, H. Cabannes and R. Temam, eds., Lecture Notes in Physics, 18, Springer-Verlag, Berlin, pp. 82-89.
A. Brandt (1977), Multi-level adaptive solutions to boundary-value problems, Math. Comp., 31, pp. 333-390.
W. Briggs (1987), A Multigrid Tutorial, Society for Industrial and Applied Mathematics, Philadelphia.
W. Briggs, L. Hart, S. F. McCormick, and D. Quinlan (1988), Multigrid methods on a hypercube, in Multigrid Methods: Theory, Applications, and Supercomputing, S. F. McCormick, ed., Lecture Notes in Pure and Appl. Math., 110, Marcel Dekker, New York, pp. 63-83.
Z. Cai and S. F. McCormick (1989), Computational complexity of the Schwarz alternating procedure, Internat. J. High-Speed Comput., to appear.
Z. Cai and S. F. McCormick (1990), On the accuracy of the finite volume element method for diffusion equations on composite grids, SIAM J. Numer. Anal., 27, to appear.
Z. Cai, J. Mandel, and S. F. McCormick (1989), The finite volume element method for elliptic equations on triangular meshes, Univ. of Colo. at Denver report.
S. C. Caruso, J. H. Ferziger, and J. Oliger (1985), Adaptive grid techniques for elliptic fluid-flow problems, Stanford University report.
T. F. Chan and R. Schreiber (1983), Parallel networks for multigrid algorithms: Architecture and complexity, Report no. 262, Department of Computer Science, Yale University, New Haven, CT.
T. F. Chan and R. S. Tuminaro (1986), A survey of parallel multigrid algorithms, in Parallel Computations and Their Impact on Mechanics, A. K. Noor, ed., AMD 86, American Society for Mechanical Engineering.
P. Ciarlet (1978), The Finite Element Method for Elliptic Problems, North-Holland, Amsterdam.
M. Ciment and R. Sweet (1973), Mesh refinements for parabolic equations, J. Comput. Phys., 12, pp. 513-525.
A. W. Craig and O. C. Zienkiewicz (1985), A multigrid algorithm using a hierarchical finite element basis, in Multigrid Methods for Integral and Differential Equations, D. J. Paddon and H. Holstein, eds., The Institute of Mathematics and Its Applications Conference Series, 3, Clarendon Press, Oxford, pp. 301-312.
J. E. Dendy, Jr. (1982), Black box multigrid, J. Comput. Phys., 48, pp. 366-386.
M. Dryja (1989), An additive Schwarz algorithm for two- and three-dimensional finite element elliptic problems, in Domain Decomposition Methods, T. F. Chan et al., eds., Society for Industrial and Applied Mathematics, Philadelphia, pp. 168-172.
M. Dryja and O. Widlund (1987), An additive variant of the Schwarz alternating method for the case of many subregions, Tech. Report 339, Department of Computer Science, Courant Institute.
M. Dryja and O. Widlund (1989), On the optimality of an additive iterative refinement method, in Proc. Fourth Copper Mountain Conference on Multigrid Methods, Society for Industrial and Applied Mathematics, Philadelphia, to appear.
R. E. Ewing, R. D. Lazarov, and P. S. Vassilevski (1988), Local refinement techniques for elliptic problems on cell-centered grids, Univ. Wyoming E.O.R.I. report no. 1988-16.
R. E. Ewing, R. D. Lazarov, J. E. Pasciak, and P. S. Vassilevski (1989), Finite element methods for parabolic problems with time steps variable in space, Univ. Wyoming E.O.R.I. report no. 1989-05.
C. K. Forester (1982), Error norms for the adaptive solution of the Navier-Stokes equations, NASA-CR-165828, National Aeronautics and Space Administration, Washington, DC.
L. Fuchs (1985), An adaptive multi-grid scheme for simulation of flows, in Proc. 2nd European Conference on Multigrid Methods, Cologne, October 1-4, 1985, W. Hackbusch and U. Trottenberg, eds., Lecture Notes in Mathematics, 1228, Springer-Verlag, Berlin, pp. 123-135.
D. B. Gannon (1980), Self adaptive methods for parabolic partial differential equations, Report UIUCDCS-R-80-1020, Department of Computer Science, University of Illinois at Urbana-Champaign.
W. Hackbusch (1984), Local defect correction method and domain decomposition techniques, in Defect Correction Methods: Theory and Applications, K. Böhmer and H. J. Stetter, eds., Computing Supplementum, 5, Springer-Verlag, Wien, pp. 89-113.
L. Hart and S. F. McCormick (1989), Asynchronous multilevel adaptive methods for solving partial differential equations on multiprocessors: basic ideas, Parallel Computing, to appear.
B. Heinrich (1987), Finite Difference Methods on Irregular Networks, Akademie-Verlag, Berlin.
P. W. Hemker (1980), On the structure of an adaptive multi-level algorithm, BIT, 20, pp. 289-301.
M. Heroux (1988), Ph.D. Thesis, Colorado State University.
P. Lax (1972), Hyperbolic Systems of Conservation Laws and the Mathematical Theory of Shock Waves, Society for Industrial and Applied Mathematics, Philadelphia.
J. Linden (1985), A multigrid method for solving the biharmonic equation on rectangular domains, Arbeitspapiere der GMD 143.
C. Liu and S. F. McCormick (1988a), The finite volume-element method for planar cavity flow, Proc. 11th International Conference on Numerical Methods in Fluid Dynamics, Williamsburg.
C. Liu and S. F. McCormick (1988b), Multigrid, elliptic grid generation and the fast adaptive composite grid method for solving transonic potential flow equations, in Multigrid Methods: Theory, Applications, and Supercomputing, S. F. McCormick, ed., Lecture Notes in Pure and Appl. Math., 110, Marcel Dekker, New York, pp. 365-387.
J. F. Maitre and F. Musy (1982), The contraction number of a class of two-level methods: an exact evaluation for some finite element subspaces and model problems, in Multigrid Methods, proceedings of a conference held at Köln-Porz, November 23-27, 1981, W. Hackbusch and U. Trottenberg, eds., Lecture Notes in Mathematics, 960, Springer-Verlag, Berlin, pp. 535-544.
J. Mandel and S. F. McCormick (1989a), Iterative solution of elliptic equations with refinement: the model multilevel case, in Domain Decomposition Methods, T. F. Chan et al., eds., Society for Industrial and Applied Mathematics, Philadelphia, pp. 93-102.
J. Mandel and S. F. McCormick (1989b), Iterative solution of elliptic equations with refinement: the two-level case, in Domain Decomposition Methods, T. F. Chan et al., eds., Society for Industrial and Applied Mathematics, Philadelphia, pp. 81-92.
J. Mandel and S. F. McCormick (1989c), A multilevel variational method for Au = λBu on composite grids, J. Comput. Phys., 80, pp. 442-450.
S. F. McCormick (1984), Fast adaptive composite grid (FAC) methods: theory for the variational case, in Defect Correction Methods: Theory and Applications, K. Böhmer and H. J. Stetter, eds., Computing Supplementum, 5, Springer-Verlag, Wien, pp. 115-122.
S. F. McCormick (1985), A variational theory for multi-level adaptive techniques (MLAT), in Multigrid Methods for Integral and Differential Equations, D. J. Paddon and H. Holstein, eds., The Institute of Mathematics and Its Applications Conference Series, 3, Clarendon Press, Oxford, pp. 225-230.
S. F. McCormick, ed. (1987), Multigrid Methods, SIAM Frontiers Series in Applied Mathematics, 3, Society for Industrial and Applied Mathematics, Philadelphia.
S. F. McCormick, ed. (1989), Proceedings of the Fourth Copper Mountain Conference on Multigrid Methods, Society for Industrial and Applied Mathematics, Philadelphia.
S. F. McCormick and D. Quinlan (1989), Asynchronous multilevel adaptive methods for solving partial differential equations on multiprocessors: performance results, Parallel Comput., to appear.
S. F. McCormick and U. Rüde (1989a), A nonvariational convergence theory for the fast adaptive composite grid methods, Univ. of Colo. at Denver report.
S. F. McCormick and U. Rüde (1989b), On local refinement higher order methods for elliptic partial differential equations, Univ. of Colo. at Denver report.
S. F. McCormick and J. Thomas (1986), The fast adaptive composite grid method (FAC) for elliptic boundary value problems, Math. Comp., 46, pp. 439-456.
S. F. McCormick, M. McKay, and J. Thomas (1989), Computational complexity of the fast adaptive composite (FAC) method, Appl. Numer. Math., Special Issue on Domain Decomposition, to appear.
W. J. Minkowycz, E. M. Sparrow, G. E. Schneider, and R. H. Pletcher, eds. (1988), Handbook of Numerical Heat Transfer, Chapter 10 by G. E. Schneider and Chapter 11 by B. R. Baliga and S. V. Patankar on the finite-element method, John Wiley and Sons, New York.
M. C. Rivara (1984), Algorithms for refining triangular grids suitable for adaptive and multigrid techniques, J. Numer. Meth. Eng., 20, pp. 745-756.
P. J. Roache (1972), Computational Fluid Dynamics, Hermosa, Albuquerque.
A. Samarskii, R. Lazaroff, and L. Makarov (1987), Finite Difference Schemes for Differential Equations with Weak Solutions, Moscow Vissaya Skola.
H. A. Schwarz (1870), Gesammelte Mathematische Abhandlungen, Vol. 2, Springer, Berlin, 1890; first published in Vierteljahreschrift der Naturforschenden Gesellschaft in Zurich, 15, pp. 272-286.
K. Stüben and U. Trottenberg (1982), Multigrid methods: Fundamental algorithms, model problem analysis and applications, in Multigrid Methods, Proc. conference held at Köln-Porz, November 23-27, 1981, W. Hackbusch and U. Trottenberg, eds., Lecture Notes in Mathematics, 960, Springer-Verlag, Berlin, pp. 1-176.
J. R. van Rosendale (1983), Algorithms and data structures for adaptive multigrid elliptic solvers, Appl. Math. Comput., 13, Proc. Internat. Multigrid Conference, April 6-8, 1983, Copper Mountain, CO, S. F. McCormick and U. Trottenberg, eds., North-Holland, Amsterdam, pp. 453-470.
O. Widlund (1989), Optimal iterative refinement methods, in Domain Decomposition Methods, T. F. Chan et al., eds., Society for Industrial and Applied Mathematics, Philadelphia, pp. 114-125.
D. Young (1950), Iterative methods for solving partial differential equations of elliptic type, Ph.D. Thesis, Harvard University.