Chapter 10
The Brezinski-Havie Protocol
10.1. Introduction and Derivation; Sequences in a Banach Space

Let s be a complex convergent sequence approaching its limit S in the following manner:

s_n ~ S + Σ_{r=1}^∞ c_r f_r(n),   (1)

where {f_r(n)} is an asymptotic scale. It is surprising that very often in practical problems the form of the functions f_r is known but the values of the coefficients c_r are not, or are at best difficult to compute. As a simple example, suppose S is the value of an integral whose nth approximation by the trapezoidal rule is s_n. Then f_r(n) = n^{-2r}, but the c_r, which depend on the higher derivatives of the integrand, may be impossible to compute with any accuracy since, generally, only tabular values of the integrand are known. Many other examples are given in Chapter 3.

What the Brezinski-Havie (BH) protocol does, loosely speaking, is to establish a deltoid algorithm that maps s into a sequence S^{(k)} where S_n^{(k)} has a representation similar to (1), but where the first k terms are accounted for. Thus the convergence of s can be accelerated with no knowledge whatsoever of the coefficients c_r. The importance and usefulness of the method can hardly be exaggerated.

The idea of the general representation (1) was, apparently, first articulated by Levin (1973), although special cases, such as the Romberg scheme for integration, are classical and form the basis for many of the algorithms in previous chapters. The discovery of the deltoid computation of S_n^{(k)} and its representation by a ratio of determinants, the real heart of the method, are
due to Havie (1979) and Brezinski, respectively. The latter was communicated to me privately in 1979.

We shall conduct the derivation of the algorithm for sequences in a Banach space. Although the formal aspects of the derivation can be carried through for sequences in any topological vector space, interesting applications of the algorithm seem to be possible only when the underlying space is metrizable and the dual has a reasonable supply of functionals. For normed spaces the Hahn-Banach theorem guarantees the latter and, of course, the norm provides a convenient metric.

In what follows B will be a nontrivial real or complex Banach space with norm ||·||, and B* will be its dual; s will be a sequence with members in B, i.e., s ∈ B_s, and Φ = (φ_0, φ_1, ...) a convergent (not to zero) sequence of functionals in B*, i.e., Φ ∈ B*_c, φ_n → φ ≠ 0 as n → ∞. To say Φ is a constant sequence means that φ_n = φ for all n ≥ 0.
The transformation E_k is defined by

S_n^{(k)} = E_k(s_n) =
| s_n                       f_1(n)                       ···  f_k(n)                       |
| Δ⟨φ_n, s_n⟩               Δ⟨φ_n, f_1(n)⟩               ···  Δ⟨φ_n, f_k(n)⟩               |
|   ⋮                                                                                      |
| Δ⟨φ_{n+k-1}, s_{n+k-1}⟩   Δ⟨φ_{n+k-1}, f_1(n+k-1)⟩     ···  Δ⟨φ_{n+k-1}, f_k(n+k-1)⟩     | ÷ |C_n|,   (2)

where Δ⟨φ_m, u_m⟩ = ⟨φ_{m+1}, u_{m+1}⟩ − ⟨φ_m, u_m⟩ and

C_n = [Δ⟨φ_{n+i-1}, f_j(n+i-1)⟩],  1 ≤ i, j ≤ k.

It is assumed here and in what follows that

|C_n| ≠ 0   (3)

for the values of n under consideration. Note that E_k is not exact for constant sequences nor even for all members of Lin{f_r}. Although it is possible to define a similar transformation that is exact for constant sequences, it does not have the very nice formal properties of the present one. At any rate, we may simply remark that E_k is exact for constants and members of Lin{f_r} whenever Φ is constant. The following easily demonstrated theorem describes the effect of E_k on sequences of the form (1) when the right-hand side converges.
In what follows let

f_j^{(k)}(n) = ⟨φ_n, E_k[f_j(n)]⟩,  S_n^{(k)} = E_k(s_n),  R_n^{(k)} = ⟨φ_n, S_n^{(k)} − S⟩.   (4)
Theorem 1. Let (1) hold, where the c_r are in the scalar field of B and the series on the right converges absolutely, i.e., Σ_r |c_r| ||f_r(n)|| < ∞. Then

S_n^{(k)} = E_k(S) + Σ_{r=k+1}^∞ c_r E_k[f_r(n)]   (5)

and

⟨φ_n, S_n^{(k)}⟩ = ⟨φ_n, E_k(S)⟩ + Σ_{r=k+1}^∞ c_r f_r^{(k)}(n).   (6)

Proof. Trivial. ∎
Note that E_k(S) = S when Φ is constant. To examine the convergence properties of E_k requires recursion formulas for S_n^{(k)} and R_n^{(k)}.
Theorem 2. For n, k ≥ 0,

S_n^{(k+1)} = [f_{k+1}^{(k)}(n) S_{n+1}^{(k)} − f_{k+1}^{(k)}(n+1) S_n^{(k)}] / [f_{k+1}^{(k)}(n) − f_{k+1}^{(k)}(n+1)],  S_n^{(0)} = s_n;   (7)

R_n^{(k+1)} = [f_{k+1}^{(k)}(n) R_{n+1}^{(k)} − f_{k+1}^{(k)}(n+1) R_n^{(k)}] / [f_{k+1}^{(k)}(n) − f_{k+1}^{(k)}(n+1)],  R_n^{(0)} = ⟨φ_n, s_n − S⟩;   (8)

f_i^{(k+1)}(n) = [f_{k+1}^{(k)}(n) f_i^{(k)}(n+1) − f_{k+1}^{(k)}(n+1) f_i^{(k)}(n)] / [f_{k+1}^{(k)}(n) − f_{k+1}^{(k)}(n+1)],  i ≥ 1, 0 ≤ k ≤ i − 2,  f_i^{(0)}(n) = ⟨φ_n, f_i(n)⟩.   (9)

Proof. Since the determinants defining S_n^{(k+1)} are bordered versions of those defining S_n^{(k)} and S_{n+1}^{(k)},   (10)

the first recursion formula is easily demonstrated by applying Sylvester's identity (Appendix, Section A.3). The proof for R_n^{(k)} follows by the substitution R_n^{(k)} = ⟨φ_n, S_n^{(k)} − S⟩. ∎
A computational tableau for computing f_j^{(k)}(n) will be discussed later along with the scalar case B = C. Now consider a path P. Provided the residual does not approach too close, relatively speaking, to the kernel of φ,

F = {g ∈ B | ⟨φ, g⟩ = 0},   (11)

convergence of S_n^{(k)} in norm and convergence of the scalars ⟨φ, S_n^{(k)}⟩ are equivalent. Here

d(g, F) = inf_{h ∈ F} ||g − h||,   (12)

r_n^{(k)} = S_n^{(k)} − S.   (13)

Theorem 3. For n + k sufficiently large on P, let

d(r_n^{(k)}, F)/||r_n^{(k)}|| ≥ ε > 0.   (14)

Then S_n^{(k)} converges in norm to S on P iff ⟨φ, S_n^{(k)}⟩ converges to ⟨φ, S⟩ on P, in the two following cases:

(i) Φ constant;
(ii) Φ convergent, with ||r_n^{(k)}|| bounded on P.

Proof. For n + k large,

d(g, F) = |⟨φ, g⟩|/||φ||   (15)

[see Jameson (1974, p. 188)]. Now observe that, since

ε ≤ d(r_n^{(k)}, F)/||r_n^{(k)}|| = |⟨φ, r_n^{(k)}⟩|/(||φ|| ||r_n^{(k)}||) ≤ 1,   (16)

it is clear we can find m_1 and m_2, 0 < m_1 < m_2, such that

m_1 ||r_n^{(k)}|| ≤ |⟨φ, r_n^{(k)}⟩| ≤ m_2 ||r_n^{(k)}||.   (17)

Thus ||r_n^{(k)}|| → 0 iff ⟨φ, r_n^{(k)}⟩ → 0, which settles the first case. Further,

⟨φ_n, r_n^{(k)}⟩ = ⟨φ, r_n^{(k)}⟩ + ⟨φ_n − φ, r_n^{(k)}⟩,   (18)

and combining this with (17), the assertion of the second case follows on taking n → ∞. ∎
What is desired, of course, is information about the convergence of S_n^{(k)} that does not require a knowledge of R_n^{(k)}. The following lemma is the link to a theorem that does this.

Lemma (Brezinski). Let s ∈ B_c, and for each k, let

ρ_nk = f_{k+1}^{(k)}(n+1)/f_{k+1}^{(k)}(n)   (19)

be bounded away from 1, 0 ≤ k ≤ K. Then S_n^{(k)} converges to S on all vertical paths, 0 ≤ k ≤ K + 1.

Proof. From (8),

R_n^{(k+1)} = [R_{n+1}^{(k)} − ρ_nk R_n^{(k)}]/(1 − ρ_nk).   (20)

Since ρ_nk/(1 − ρ_nk) is bounded, Eq. (20) furnishes an easy inductive proof on k that R_n^{(k)} → 0. It is then trivial that S_n^{(k)} → S. ∎

Paths where k is unbounded present a formidable problem. Aside from certain easy scalar cases, i.e., the summation by extrapolation formulas of Chapter 3, nothing has yet been accomplished on this problem. We have now proved the following result.

Theorem 4. Let the hypothesis (14) of Theorem 3 hold and for each fixed k, 0 ≤ k ≤ K, let ρ_nk be bounded away from 1. Then E_k is regular on all vertical paths, 0 ≤ k ≤ K + 1.

Example. Let B = C², φ_1 = φ_2 = ⋯ = φ, with

s_n = (a_n, b_n),   (21)

f_1(n) = Δs_n.   (22)

A sufficient condition for the regularity of E_1 in this case is that

ρ_n0 = ⟨φ, Δs_{n+1}⟩/⟨φ, Δs_n⟩ be bounded away from 1.   (23)

For instance, if Δb_n = o(b_n), then it is sufficient that a_n = o(b_n). E_k is then regular for this class of sequences.
10.2. The Case Φ Constant

For the case Φ = const the algorithm can be derived in the same way the Schmidt transformation was derived. One assumes that s_n behaves as

s_n = S + Σ_{r=1}^k c_r f_r(n).   (1)
Taking φ of Eq. (1), replacing n by m, n ≤ m ≤ n + k, and considering those equations and the above as k + 2 equations in the k + 1 unknowns ⟨φ, S⟩, c_1, c_2, ..., c_k produces the requirement

| S − s_n        0   f_1(n)        ···  f_k(n)        |
| ⟨φ, s_n⟩       1   ⟨φ, f_1(n)⟩   ···  ⟨φ, f_k(n)⟩   |
|   ⋮            ⋮       ⋮                  ⋮          |
| ⟨φ, s_{n+k}⟩   1   ⟨φ, f_1(n+k)⟩ ···  ⟨φ, f_k(n+k)⟩ | = 0,   (2)
but this is clearly equivalent to Eq. 10.1(2) when S = S_n^{(k)}, Φ = const. Conversely, when s_n has the form (1), then E_k will be exact, S_n^{(k)} ≡ S, provided the algorithm is defined.

For the study of this algorithm, there are two modes of regularity or accelerativeness to consider. One pertains to weak convergence, i.e., convergence in the seminorm |⟨φ, ·⟩|. The other is the usual strong convergence, convergence in the norm. The regularity result below, though based on pretty specific properties of f_r, is often applicable.

Theorem 1.
Let s ∈ B_c and

lim_n ⟨φ, f_r(n+1)⟩/⟨φ, f_r(n)⟩ = b_r ≠ 1,  1 ≤ r ≤ k,   (3)

where b_i ≠ b_j, i ≠ j. Define

η_n = ⟨φ, r_{n+1}⟩/⟨φ, r_n⟩   (4)

(which we assume exists for n sufficiently large) and denote by A_m the proposition

"||f_m(n)||/⟨φ, f_m(n)⟩ = O(1)."   (5)

Then along any vertical path:

(i) if η_n is bounded, then E_k is regular for s in the seminorm |⟨φ, ·⟩|;
(ii) if η_n → b_j for some j, then E_k accelerates s in seminorm;
(iii) if η_n is bounded and A_m holds, 1 ≤ m ≤ k, then E_k is regular for s in norm;
(iv) if η_n → b_j for some j, A_m holds, 1 ≤ m ≤ k, m ≠ j, and

||r_n||/|⟨φ, r_n⟩| = O(1),   (6)

then E_k accelerates s in norm.

Proof. All these statements are immediate consequences of the relationship

r_n^{(k)} ~ [⟨φ, r_n⟩ / V_{k+1}(1, b_1, b_2, ..., b_k)] ×

| r_n/⟨φ, r_n⟩              f_1(n)/⟨φ, f_1(n)⟩   ···  f_k(n)/⟨φ, f_k(n)⟩ |
| 1                          1                   ···  1                  |
| η_n                        b_1                 ···  b_k                |
|   ⋮                         ⋮                        ⋮                 |
| η_n η_{n+1} ··· η_{n+k-1}   b_1^k              ···  b_k^k              |   (7)
where V_{k+1} is a Vandermonde determinant (see Notation). Details are left to the reader with a hint: for (iv) subtract the first column of the determinant from the jth column. ∎

Example 1. Let

f_r(n) = x_n^r h,  h ∈ B,   (8)

and let x_{n+1}/x_n → b, b ≠ 1, b ≠ 0. Now ρ_nk ~ (x_{n+1}/x_n)^{k+1} → b^{k+1}. This provides a generalization to Banach spaces of the summation by extrapolation scheme of Section 3.3, but here there is no simple deltoid algorithm for the computation of S_n^{(k)}, only for ⟨φ, S_n^{(k)}⟩.

Example 2. Let

f_r(n) = x_r^n h,  x_r ∈ C,  h ∈ B,  h ∉ F,   (9)

and let the x_i be distinct numbers, none of which is 1. Here ρ_nk = x_{k+1}, and this gives a generalization of the deltoid of Section 3.2. Although there is no deltoid in the general case, S_n^{(k)} can be written out as a linear combination of the f_i(n) with closed-form coefficients that are Vandermonde determinants. Both of these examples generalize the Romberg and Richardson procedures.
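In the classical scalar specialization just mentioned, with x_r = 4^{-r} and s_n the composite trapezoidal values on 2^n panels (these concrete choices, and the integrand, are illustrative assumptions on my part), the extrapolation reduces to the familiar Romberg recursion; a minimal sketch:

```python
import math

def trapezoid(f, a, b, n):
    # composite trapezoidal rule with n panels
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))

def romberg(f, a, b, levels):
    # T[i][j]: trapezoid value on 2^i panels, extrapolated j times;
    # each step eliminates the next h^(2j) term of the error expansion.
    T = [[trapezoid(f, a, b, 2 ** i)] for i in range(levels)]
    for i in range(1, levels):
        for j in range(1, i + 1):
            T[i].append((4 ** j * T[i][j - 1] - T[i - 1][j - 1]) / (4 ** j - 1))
    return T[-1][-1]

print(romberg(math.exp, 0.0, 1.0, 5))  # close to e - 1
```

With only five levels (finest grid: 16 panels) the integral of exp over [0, 1] is already reproduced to many digits, illustrating how the x_r^n structure is exploited.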
10.3. The Topological Schmidt Transformation
In the algorithm of Section 10.1 take

f_i(n) = Δs_{n+i-1}.   (1)

This gives a generalization of the Shanks-Schmidt transformation. The most useful case is Φ constant, when E_k(s_n) can be written as a ratio of the determinants W_k of Eq. 6.1(8). Although the above is a nonlinear algorithm, its denominator is a scalar. Thus it requires nothing in the way of invertibility from the elements of B. Obviously it retains the properties of homogeneity and translativity of the scalar algorithm.

The first question one asks is, How good is this abstract version of the Schmidt transformation? Insight into this very difficult question can be obtained by answering a simpler question: For what sequences is the algorithm exact? This question is easily resolved. Since, for the scalar algorithm, there is an intimate relation between exactness and regularity, one expects this relation to hold in other Banach spaces. We shall show that, roughly speaking, the topological E_k can be exact only for sequences that are linear combinations of fixed elements of B, where all the dependence on n is restricted to the scalars in the linear combination. Unless B is finite dimensional, this is clearly a small class of sequences. Although more theoretical and numerical investigations are required, this probably means that E_k is regular for a disappointingly small subspace of B_c. In what follows Greek letters denote scalars. (Note that the arguments used carry through for any topological vector space.)

Theorem 1. Let k be fixed. Then S_n^{(k)} is defined and S_n^{(k)} ≡ S, n ≥ 0, iff ⟨φ, r_n⟩ ∈ 𝒳_k (see Section 6.3) and

s_n = S + Σ_{m=0}^{k-1} r_m τ_n^{(m)},   (2)

where the r_m are fixed elements of B and τ_n^{(m)}, 0 ≤ m ≤ k − 1, is a basis of solutions of the (scalar) equation 𝒫_k = 0 satisfying

τ_m^{(j)} = δ_{mj},  0 ≤ m, j ≤ k − 1.   (3)
Proof. ⇒:

S_n^{(k)} = W_k(s_n, ⟨φ, Δs_n⟩)/W_k(1, ⟨φ, Δs_n⟩).   (4)

Taking φ of both sides gives

σ_n^{(k)} = W_k(σ_n, Δσ_n)/W_k(1, Δσ_n),  σ_n = ⟨φ, s_n⟩.   (5)

By Theorem 6.3, the definition and exactness of σ_n^{(k)} imply σ_n ∈ 𝒳_k. Let {τ_n^{(0)}, τ_n^{(1)}, ..., τ_n^{(k-1)}} be a basis of solutions of 𝒫_k(σ_n) = 0 satisfying (3). Then

σ_n = σ + Σ_{m=0}^{k-1} ς_m τ_n^{(m)}.   (6)

But this means

r_n = Σ_{m=0}^{k-1} r_m τ_n^{(m)} + v_n,   (7)

where v_n ∈ F (the kernel of φ). Putting n = 0, 1, 2, ..., k − 1 shows v_0, v_1, ..., v_{k-1} = 0. Substituting (7) into (2) gives

0 = Σ_{m=0}^{k-1} r_m W_k(τ_n^{(m)})/W_k(1) + W_k(v_n)/W_k(1).   (8)

Now we have

𝒫_k(Δσ_n) = γ_0 Δσ_n + γ_1 Δσ_{n+1} + ⋯ + γ_k Δσ_{n+k} = 0   (9)

with, say, γ_k = 1. By Lemma 6.3(3) each term in the sum in (8) is proportional to 𝒫_k(τ_n^{(m)}) and hence equal to zero. Thus W_k(v_n) = 0, or v_n satisfies

γ_0 v_n + γ_1 v_{n+1} + ⋯ + γ_k v_{n+k} = 0.   (10)

Since v_0 = ⋯ = v_{k-1} = 0, v_n = 0 for all n.

⇐: This part is trivial, since (2) shows 𝒫_k(r_n) = 0, and this is proportional to W_k(r_n) (= 0). ∎
The following, more of an observation than a theorem, is a useful negative criterion.

Theorem 2. Let ⟨φ, r_n⟩ satisfy a homogeneous linear difference equation 𝒫_k = 0 and let

⟨φ, r_m⟩ = η_m,  0 ≤ m ≤ k − 1,

where η_n is a nonmaximal solution of 𝒫_k = 0. Then S_n^{(k)} is not defined.

Example 1 (k = 2). ⟨φ, r_n⟩ ∈ 𝒳_2 if, for example,

⟨φ, r_n⟩ = (c_1 n + c_2)λ^n,  c_1 ≠ 0,  λ ≠ 0, 1.   (11)
Then

τ_n^{(0)} = (1 − n)λ^n,  τ_n^{(1)} = nλ^{n-1}.   (12)

Thus E_2 will be exact for sequences of the form

s_n = S + r_0(1 − n)λ^n + r_1 nλ^{n-1},  λ ≠ 0, 1,   (13)

provided λr_0 − r_1 ∉ F.

Example 2 (k = 1: Aitken's δ²-process in a Banach space).

s̃_n = s_n − (⟨φ, Δs_n⟩/⟨φ, Δ²s_n⟩) Δs_n.   (14)

Theorem 3. Let s ∈ B_c. Then s̃_n in (14) is defined and s̃_n ≡ S iff r_0 ∉ F and s_n = S + λ^n r_0 for some λ ∈ C, λ ≠ 0, 1.

If s ∈ B_c, our previous work, Theorem 10.1(4), states that this algorithm converges provided

⟨φ, Δs_{n+1}⟩/⟨φ, Δs_n⟩ ∉ (α, β)   (15)

and

0 < α < 1 < β.   (16)
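Equation (14) is straightforward to implement. The sketch below takes B = R² with a fixed functional φ (here, the sum of components; this choice, and all concrete numbers, are illustrative assumptions) and checks the exactness asserted by Theorem 3 for r_n = λ^n r_0:

```python
def aitken_banach(s0, s1, s2, phi):
    # s_tilde = s_n - (<phi, ds_n>/<phi, d2s_n>) ds_n, Eq. (14),
    # where ds_n and d2s_n are the first and second forward differences.
    d1 = [b - a for a, b in zip(s0, s1)]     # delta s_n
    d1b = [b - a for a, b in zip(s1, s2)]    # delta s_{n+1}
    d2 = [b - a for a, b in zip(d1, d1b)]    # delta^2 s_n
    t = phi(d1) / phi(d2)
    return [a - t * d for a, d in zip(s0, d1)]

phi = lambda v: v[0] + v[1]                  # requires <phi, r_0> != 0
S, r0, lam = [1.0, 2.0], [1.0, -0.3], 0.5
seq = [[S[i] + lam ** n * r0[i] for i in range(2)] for n in range(3)]
print(aitken_banach(seq[0], seq[1], seq[2], phi))  # recovers S = [1.0, 2.0] up to roundoff
```

The single scalar division reflects the remark above: no invertibility of elements of B is needed, only one functional evaluation in the denominator.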
The acceleration properties of the algorithm are easily established.

Theorem 4. Let s ∈ B_c and

⟨φ, Δs_{n+1}⟩/⟨φ, Δs_n⟩ = ρ + o(1),  0 < |ρ| < 1.   (17)

Then s̃ converges to S more rapidly than s in the seminorm |⟨φ, ·⟩|. If, further,

||Δs_n||/|⟨φ, Δs_n⟩| = O(1),   (18)

then s̃ converges more rapidly than s in norm.

Proof. By Theorem 1.4(1),

⟨φ, r_{n+1}⟩/⟨φ, r_n⟩ = ρ + o(1),   (19)

so Theorem 10.2(1) may be applied. ∎
Brezinski (1975) has studied this algorithm. It may be used to generalize the Padé table in the following way. Let a ∈ B_s and z ∈ C. Then we may write the formal power series

f(z) = Σ_{j=0}^∞ a_j z^j   (20)

with partial sums

s_n = Σ_{j=0}^n a_j z^j.   (21)

S_n^{(k)} will then define a formal rational approximation to f whose numerator is a polynomial of degree n + k in z with coefficients in B and whose denominator is a scalar polynomial of degree k. For details, see Brezinski (1975). Germain-Bonne (1978) has also studied this algorithm.

The topological Schmidt transformation provides a construction for iteration functions for the solution of operator equations. Let f: B → B and define the iterates

f_0(x) = x,  f_{k+1}(x) = f(f_k(x)),  k ≥ 0.   (22)
Take, in (1),

Δs_{n+k} → f_{k+1}(s_n) − f_k(s_n),  s_{n+k} → f_k(s_n),  k ≥ 0.   (23)

Thus the k = 1 case of Eq. (1) produces

s_{n+1} = s_n − [⟨φ, f(s_n) − s_n⟩/⟨φ, f(f(s_n)) − 2f(s_n) + s_n⟩](f(s_n) − s_n)   (24)

for the solution of x = f(x). In contrast with often-used methods such as the generalized Newton iteration scheme, these formulas do not require the evaluation of the (Fréchet) derivatives of f. There are a multitude of other ways the BH protocol can be used to construct iteration functions. One could take Φ = const,

f_j(n + k) → f_{k+j}(s_n) − f_{k+j-1}(s_n),  s_{n+k} → f_k(s_n),  k ≥ 0, j > 0,   (25)

and replace S_n^{(k)} by s_{n+1} on the left-hand side of Eq. 10.1(2).
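In the scalar case, with φ the identity, the k = 1 iteration (24) is precisely Steffensen's method; a minimal sketch (the test equation x = cos x is an illustrative assumption):

```python
import math

def steffensen(f, x, steps=8, eps=1e-13):
    # Eq. (24) with phi the identity functional on R:
    # x <- x - <phi, f(x)-x> / <phi, f(f(x)) - 2f(x) + x> * (f(x) - x).
    for _ in range(steps):
        d1 = f(x) - x
        d2 = f(f(x)) - 2.0 * f(x) + x
        if abs(d2) < eps:          # fixed point reached (numerically)
            break
        x -= d1 * d1 / d2
    return x

root = steffensen(math.cos, 1.0)
print(root)  # the fixed point of cos, about 0.7390851332
```

Note that only evaluations of f appear, never its derivative, which is the point of the construction above.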
10.4. The Scalar Case

When B is its scalar field and φ_n ≡ I (the identity), 𝒮_n^{(k)} = S_n^{(k)} and there is a deltoid computational scheme for the computation of S_n^{(k)}. Then

S_n^{(k)} = E_k(s_n) = D[s]/D[1],   (1)

where D[u] denotes the (k + 1) × (k + 1) determinant with first row (u_n, u_{n+1}, ..., u_{n+k}) and (i + 1)th row (f_i(n), f_i(n+1), ..., f_i(n+k)), 1 ≤ i ≤ k, and the algorithm becomes

S_n^{(k+1)} = [f_{k+1}^{(k)}(n) S_{n+1}^{(k)} − f_{k+1}^{(k)}(n+1) S_n^{(k)}] / [f_{k+1}^{(k)}(n) − f_{k+1}^{(k)}(n+1)],  n ≥ 0,  S_n^{(0)} = s_n;   (2)

f_i^{(k+1)}(n) = [f_{k+1}^{(k)}(n) f_i^{(k)}(n+1) − f_{k+1}^{(k)}(n+1) f_i^{(k)}(n)] / [f_{k+1}^{(k)}(n) − f_{k+1}^{(k)}(n+1)],  i ≥ 1,  0 ≤ k ≤ i − 2,  f_i^{(0)}(n) = f_i(n);   (3)

and f_i^{(k)}(n) = E_k(f_i(n)), i.e., the ratio (1) with s_n replaced by f_i(n).

The computational tableau of the algorithm is as follows. The S_n^{(k)} array is filled out in diagonal lines, the kth line being {S_{k-1}^{(0)}, S_{k-2}^{(1)}, S_{k-3}^{(2)}, ..., S_0^{(k-1)}}:
S_0^{(0)}
S_1^{(0)}   S_0^{(1)}
S_2^{(0)}   S_1^{(1)}   S_0^{(2)}
  ⋮

To compute this kth diagonal line, k − 1 subsidiary arrays are needed. Each array has the following form:

ith Array

f_i^{(0)}(0) = f_i(0)
f_i^{(0)}(1) = f_i(1)         f_i^{(1)}(0)
  ⋮                             ⋮
f_i^{(0)}(k−1) = f_i(k−1)     f_i^{(1)}(k−2)   ···   f_i^{(i-1)}(k−i)

For instance, to compute the diagonal {S_3^{(0)}, S_2^{(1)}, S_1^{(2)}, S_0^{(3)}} the following three arrays are needed:

Array 1      Array 2                     Array 3
f_1(0)       f_2(0)                      f_3(0)
f_1(1)       f_2(1)   f_2^{(1)}(0)       f_3(1)   f_3^{(1)}(0)
f_1(2)       f_2(2)   f_2^{(1)}(1)       f_3(2)   f_3^{(1)}(1)   f_3^{(2)}(0)
f_1(3)       f_2(3)   f_2^{(1)}(2)       f_3(3)   f_3^{(1)}(2)   f_3^{(2)}(1)
Clearly, the amount of computer storage necessary at the completion of the computation of the kth diagonal is k(k + 1)(2k + 1)/6 ≈ k³/3. The computations are probably best done in the following order:

(a) Initialize S_0^{(0)} = s_0, f_1^{(0)}(0) = f_1(0), f_1^{(0)}(1) = f_1(1).
(b) Assume the (k − 1)th S_n^{(k)} diagonal has been filled out, k ≥ 2.
(c) For k > 2 compute new ascending diagonals of each of the k − 2 arrays; i.e., DO, for 1 ≤ i ≤ k − 2: generate f_i^{(0)}(k − 1) = f_i(k − 1); compute f_i^{(j)}(k − 1 − j) from (3), 1 ≤ j ≤ i − 1.
(d) For k > 2 fill out array k − 1, i.e., generate f_{k-1}^{(0)} and DO, for 1 ≤ i ≤ k − 1: generate f_{k-1}^{(0)}(i) = f_{k-1}(i); compute f_{k-1}^{(j)}(i − j) from (3), j = 1, 2, ..., i (i − 1 if i = k − 1).
(e) For k ≥ 2 fill out the kth diagonal of S_n^{(k)}; i.e., generate S_{k-1}^{(0)} = s_{k-1} and compute S_{k-1-i}^{(i)}, 1 ≤ i ≤ k − 1, from (2).
(f) Go back to (b).
Note that moving down one diagonal in the S_n^{(k)} table necessitates adding one more subsidiary array. The Brezinski-Havie protocol for scalar sequences is undoubtedly the most elegant and flexible computational procedure yet discovered for the transformation of sequences. The flexibility of the algorithm lies in the possible choices of the f_i. Generally speaking, the choices are made with a foreknowledge of the kinds of sequences one wishes to accelerate.
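The recursions (2) and (3) are easy to transcribe directly. The sketch below stores whole tables instead of the economical diagonal-by-diagonal arrays described above (a deliberate simplification; the function names and index conventions are mine), and checks exactness on a sequence of the form s_n = S + c_1 f_1(n) + c_2 f_2(n):

```python
def bh_scalar(s, f, K):
    """Scalar BH protocol via the recursions (2)-(3).

    s : values s_0, ..., s_N
    f : list of functions, f[i](n) giving f_{i+1}(n)
    K : number of columns to build (K <= len(f), N >= K)
    Returns the table S with S[k][n] = S_n^{(k)}.
    """
    N = len(s) - 1
    S = [list(s)]                                       # S[0][n] = s_n
    g = [[[fi(n) for n in range(N + 1)]] for fi in f]   # g[i][0][n] = f_{i+1}(n)
    for k in range(K):
        piv = g[k][k]                                   # pivot f_{k+1}^{(k)}(n)
        S.append([(piv[n] * S[k][n + 1] - piv[n + 1] * S[k][n])
                  / (piv[n] - piv[n + 1]) for n in range(len(S[k]) - 1)])
        for i in range(k + 1, len(f)):                  # update remaining scales, Eq. (3)
            gi = g[i][k]
            g[i].append([(piv[n] * gi[n + 1] - piv[n + 1] * gi[n])
                         / (piv[n] - piv[n + 1]) for n in range(len(gi) - 1)])
    return S

s = [1.0 + 0.5 ** n + 0.25 ** n for n in range(4)]
f = [lambda n: 0.5 ** n, lambda n: 0.25 ** n]
print(bh_scalar(s, f, 2)[2][0])  # 1.0: exact once both scales are eliminated
```

Each pass down a column eliminates one term of the model (1), so a sequence lying exactly in the span of the first k scales is transformed to its limit in column k.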
Before discussing particular cases of the scalar algorithm, let us gather together the convergence and acceleration results.

Theorem 1. Let, for each fixed k, ρ_nk [Eq. 10.1(19)] be bounded away from 1. Then E_k is regular on all vertical paths.

Theorem 2.
Let

s_n = S + Σ_{r=1}^∞ a_r f_r(n),   (4)

where for each n the series on the right converges absolutely. Then

S_n^{(k)} = S + Σ_{r=k+1}^∞ a_r f_r^{(k)}(n).   (5)

Proof. Follows from Theorem 10.1(1). This shows E_k is exact for constant sequences and Lin{f_1, f_2, ..., f_k} when f_j is independent of s. ∎
For the next three theorems, the common hypothesis is

lim_n f_r(n + 1)/f_r(n) = b_r ≠ 1,  1 ≤ r ≤ k,  b_i ≠ b_j, i ≠ j.   (6)

Theorem 3. Let s ∈ 𝒞_c and h_n = r_{n+1}/r_n.

(i) If h_n = O(1), then E_k is regular along vertical paths.
(ii) If h_n → b_j for some j, then E_k is accelerative along vertical paths.

Our last two results require the following.

Lemma.

f_j^{(k)}(n) ~ f_j(n)(b_j − b_1)⋯(b_j − b_k)/[(1 − b_1)⋯(1 − b_k)],  j > k ≥ 1.   (7)

Proof. Left to the reader. ∎

Theorem 4.
Let

s_n − S ~ a_{k+1} f_{k+1}(n),  k ≥ 0.   (8)

Then

S_n^{(k)} − S ~ a_{k+1} f_{k+1}^{(k)}(n),  k ≥ 0.   (9)

Proof. Left to the reader. ∎
Theorem 5.
For some A let |a_i| < A^i and let (6) and

f_{r+1}(n) = o(f_r(n))   (10)

hold uniformly in r. Let the representation (4) hold. Then

S_n^{(k)} − S ~ a_{k+1} f_{k+1}^{(k)}(n),  k ≥ 0.   (11)

Proof. Requirement (10) guarantees the absolute and uniform convergence of (4) for n > n_0. It is easily seen that r_n^{(k)} ~ a_{k+1} f_{k+1}^{(k)}(n), and the previous theorem may be invoked. ∎

Example 1. Let

s_n − S = λ^n/n,  λ ∈ [−1, 1).   (12)

Then

r_n^{(1)}/r_{n+1} = 1/[(λ − 1)n] = o(1).   (13)
The conditions of Theorems 1, 3, and 4 are satisfied but not those of Theorems 2 and 5, since there is no a_1 ≠ 0 such that λ^n/n = a_1 λ^n + ⋯.

Example 2. If f_r(n) = Δs_{n+r-1}, the result is the Schmidt transformation but, interestingly, the algorithm for computing the transformation is not the ε-algorithm. Work remains to be done in assessing the relative computational advantages of the two algorithms.

10.5. The Levin Transformations

In 1973, Levin gave a general transformation of series that is enormously useful in numerical analysis and that has been the subject of a wide literature. Essentially, the transformation is a special case of the previous transformation, although Levin did not develop an algorithm for computing S_n^{(k)} efficiently. There are a number of useful cases of his algorithm, and I will examine each of these in turn.

After making the assumption 10.4(4) Levin effected the specialization

f_j(n) = x_n^{j-1} ζ_n,  x_m ≠ x_n, m ≠ n.   (1)

Expanding by minors, one finds

S_n^{(k)} = [Σ_{m=0}^k (s_{n+m}/ζ_{n+m}) π_n^{(k,m)}] / [Σ_{m=0}^k (1/ζ_{n+m}) π_n^{(k,m)}],  π_n^{(k,m)} = Π_{r=0, r≠m}^k (x_{n+r} − x_{n+m})^{-1},   (2)
where it is assumed, of course, that all quantities are defined. Furthermore, the transform is exact whenever

s_n = S + ζ_n(β_0 + β_1 x_n + ⋯ + β_{k-1} x_n^{k-1}).   (3)

There are a number of ways to choose the x_n and ζ_n that make sense. One is as follows. If the sequence s converges rapidly enough and, say, the terms |a_j| are ultimately monotone in such a way that we may write

s_n − S ~ C a_{n+1},   (4)

then a good choice would seem to be ζ_n = a_{n+1}. (ζ_n can be multiplied, of course, by any constant without affecting the algorithm.)

For this choice of ζ_n, we wish to analyze the acceleration properties of the algorithm using the theorems of Chapter 5. Let, as usual,

r_n = s_n − S.   (5)

Note that S_n^{(k)} depends on s_n, s_{n+1}, ..., s_{n+k+1} and is translative and homogeneous. Thus we need to consider the functions

g_n = (S_n^{(k)} − s_n)/Δs_n.   (6)

An application of the Smith-Ford (1979) theorem gives
Theorem. In (2) let ζ_n = a_{n+1}, a_n ≠ 0. Let s ∈ 𝒮_ρ, 0 < |ρ| < 1, and for some u_n let

π_n^{(k,m)}/u_n → π^{(k,m)},  n → ∞,   (7)

where

Σ_{m=0}^k ρ^{-m} π^{(k,m)} ≠ 0.   (8)

Then S_n^{(k)} accelerates the convergence of s along any vertical path.
Proof. With p_j = a_{j+1}/a_j,

g_n = [Σ_{m=1}^k (1 + p_{n+1} + p_{n+1}p_{n+2} + ⋯ + p_{n+1}⋯p_{n+m-1}) π_n^{(k,m)} (p_{n+1}p_{n+2}⋯p_{n+m})^{-1}] × [Σ_{m=0}^k π_n^{(k,m)} (p_{n+1}p_{n+2}⋯p_{n+m})^{-1}]^{-1}.   (9)
Thus the numerator of (9), normalized by u_n, tends to

Σ_{m=1}^k [(1 − ρ^m)/(1 − ρ)] ρ^{-m} π^{(k,m)},   (10)

and the denominator to

Σ_{m=0}^k ρ^{-m} π^{(k,m)}.   (11)

By elementary properties of interpolation sums, Σ_{m=0}^k π_n^{(k,m)} ≡ 0. Thus

g_n(ρe) → 1/(1 − ρ).   (12)

By uniform convergence,

g(ρe) = 1/(1 − ρ),   (13)

and this concludes the proof. ∎
In this proof it was assumed that x is independent of s. However, x may depend on s if the dependency is such that the homogeneity and translativity of the transformation are maintained, e.g., x_n = Δs_n.
10.5.1. The t-Transform

In his analysis, Levin took ζ_n = a_n, but, as Smith and Ford point out, ζ_n = a_{n+1} makes better sense and simplifies the convergence analysis. One natural choice of the x_n would result by assuming that s converges as

s_n = S + a_{n+1} v_n,   (1)

where v_n is a Poincaré asymptotic series in 1/(n + 1), in other words, by taking

x_n = 1/(n + 1).   (2)

In what follows, it is assumed that a_n ≠ 0. This choice, and dividing numerator and denominator by common factors, amounts to the choices ζ_n = a_{n+1} and

π̃_n^{(k,m)} = (n + m + 1)^{k-1}(−1)^m C(k, m)
in Eq. 10.5(2). Then

t_k(s_n) = S_n^{(k)} = [Σ_{m=0}^k (s_{n+m}/a_{n+m+1})(n + m + 1)^{k-1}(−1)^m C(k, m)] / [Σ_{m=0}^k (1/a_{n+m+1})(n + m + 1)^{k-1}(−1)^m C(k, m)].   (3)

This is called the Levin t-transform.
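A direct transcription of (3), with `terms` holding a_0, a_1, ... so that s_n = a_0 + ⋯ + a_n (the test series below are illustrative assumptions):

```python
from math import comb, log

def levin_t(terms, k, n=0):
    # t_k(s_n) of Eq. (3): weights (-1)^m C(k,m) (n+m+1)^(k-1),
    # remainder estimate a_{n+m+1}; needs terms a_0 .. a_{n+k+1}.
    s = [sum(terms[: j + 1]) for j in range(len(terms))]
    num = den = 0.0
    for m in range(k + 1):
        w = (-1) ** m * comb(k, m) * (n + m + 1) ** (k - 1) / terms[n + m + 1]
        num += w * s[n + m]
        den += w
    return num / den

# alternating series for ln 2: eight terms suffice for ~8 digits
a = [(-1) ** j / (j + 1) for j in range(8)]
print(levin_t(a, 6))

# exactness on the Euler series sum_j j x^j (alpha = 1, so k = 2 is exact):
print(levin_t([j * 0.5 ** j for j in range(4)], 2))  # exactly 0.5/(1-0.5)^2 = 2.0
```

The second example anticipates the exactness theorem of Section 10.5.3: for s_n = Σ_{j≤n} j^α x^j the remainder is a_{n+1} times a terminating polynomial in 1/(n + 1), which t_k annihilates once k ≥ α + 1.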
Theorem 1. t_k, k ≥ 1, is accelerative for s ∈ 𝒮_ρ, 0 < |ρ| < 1, along any vertical path.

Proof. In Theorem 10.5 take u_n = n^{k-1}; then π^{(k,m)} = (−1)^m C(k, m). We have

Σ_{m=0}^k ρ^{-m} π^{(k,m)} = (1 − 1/ρ)^k ≠ 0.   (4)  ∎
t_k turns out to be regular for any path for another large and important class of sequences.

Theorem 2. If s ∈ 𝒞_c and a is alternating, S_n^{(k)} is defined and converges to S along any path.

Proof. Since a is alternating, the quantities (−1)^m/a_{n+m+1} all have the same sign, and we can write

r_n^{(k)} = Σ_{m=0}^k μ_{km} r_{n+m},  μ_{km} = (n + m + 1)^{k-1} |a_{n+m+1}|^{-1} C(k, m) / Σ_{r=0}^k (n + r + 1)^{k-1} |a_{n+r+1}|^{-1} C(k, r),   (5)

so that μ_{km} ≥ 0 and Σ_{m=0}^k μ_{km} = 1. First, assume n → ∞ on P. In this case, use

|r_n^{(k)}| ≤ sup_{m≥n} |r_m| Σ_{m=0}^k μ_{km} = sup_{m≥n} |r_m| → 0.   (6)

If n is bounded on P, use Theorem 5.2(1). Conditions (i) and (ii) of that theorem are satisfied. We can majorize μ_{km} by throwing away all the terms in its denominator except the last, so

μ_{km} ≤ k! |a_{n+k+1}| (n + m + 1)^{k-1} / [|a_{n+m+1}| (n + k + 1)^{k-1} m! (k − m)!],   (7)

and the right-hand side tends to zero as k → ∞ with n bounded; so μ_{km} = o(1) in k along P, and this establishes convergence along P. ∎

t does not work well on monotone sequences.
Theorem 3. Let Re θ < −1, c_0 ≠ 0, and

a_n ~ n^θ(c_0 + c_1/n + c_2/n² + ⋯).   (8)

Then

r_n^{(k)} ~ −k! n^{θ+1} c_0 / [(θ + 1)(−θ)_k],  k ≥ 0.   (9)

Proof. Exercise. ∎
10.5.2. The u-Transform

The t-transform was designed to be used on rapidly convergent alternating series. The u-transform is designed for monotonic series and has the following heuristic basis. Consider

s_n = Σ_{j=0}^n 1/(j + 1)^α,  α > 1.   (1)

Then, according to the work in Chapter 1 [Theorem 1.7(3)],

s_n − S ~ −(n + 1)^{1-α}/(1 − α),   (2)

or s_n − S ~ C(n + 1)a_{n+1}. Since the above sequence is such a typical one, it makes sense to take

ζ_n = (n + 1)a_{n+1}   (3)

and, as before, x_n = (n + 1)^{-1}. (This is not the precise choice Levin made, which was ζ_n = n a_n, x_n = n^{-1}, but it seems preferable since the transform is now defined for all n.) The result is called the Levin u-transform:

u_k(s_n) = S_n^{(k)} = [Σ_{m=0}^k (s_{n+m}/a_{n+m+1})(−1)^m (n + m + 1)^{k-2} C(k, m)] / [Σ_{m=0}^k (1/a_{n+m+1})(−1)^m (n + m + 1)^{k-2} C(k, m)].   (4)
Theorem 10.5 gives immediately the next result.

Theorem 1. The u_k transform, k ≥ 1, is accelerative for s ∈ 𝒮_ρ, 0 < |ρ| < 1, along any vertical path.

It is ironic that, despite its derivation, it has not been established that u_k is regular for monotone series. (In fact, I suspect this is not true.) Nevertheless, a result in the previous section continues to hold.

Theorem 2. If s ∈ 𝒞_c and a is alternating, S_n^{(k)} is defined and converges to S along any path.
For certain kinds of monotone series, however, the columns of the u-transform give excellent results. These are series whose general term has a Poincaré type of asymptotic expansion.

Theorem 3. Let Re θ < −1, c_0 ≠ 0, and

a_n ~ n^θ(c_0 + c_1/n + c_2/n² + ⋯).   (5)

Then for some m ≥ k,

r_n^{(k)} = O(n^{θ+1-m}).   (6)

Proof. Left to the reader. ∎
If s behaves as

s_n − S ~ Cλ^n n^θ,  0 < |λ| < 1,   (7)

then it is easily shown that

r_n^{(k)}/r_n = O(n^{-k}),  n → ∞,   (8)

for both t_k and u_k. Thus the Levin transformations enhance exponential convergence algebraically.

Levin defined another transform, called the v-transform, by taking e_1(s_n), the Aitken δ²-iterate, as an estimate for S; i.e., in Eq. 10.4(1), f_j(n) = [a_{n+1}/(ρ_n − 1)]x_n^{j-1}, x_n = (n + 1)^{-1}, ρ_n = a_{n+2}/a_{n+1}. Thus the v-transform is defined by

v_k(s_n) = S_n^{(k)} = [Σ_{m=0}^k (s_{n+m}(ρ_{n+m} − 1)/a_{n+m+1})(n + m + 1)^{k-1}(−1)^m C(k, m)] / [Σ_{m=0}^k ((ρ_{n+m} − 1)/a_{n+m+1})(n + m + 1)^{k-1}(−1)^m C(k, m)].   (9)

Obviously this idea can be elaborated ad absurdum, since any sequence transformation can be used for ζ_n. Smith and Ford, however, think that v has special advantages, and consider it one of the best practical transformations.

10.5.3. Exactness Theorems for t and u

t and u turn out to be exact for a surprisingly large and varied class of sequences. To explore this matter, we first demonstrate an exactness result for a general case of the scalar algorithm 10.4(1).
Theorem 1. Let k ≥ 1 and f_j(n) = a_{n+1} g_j(n), where the g_j are linearly independent and independent of s. Let r_n, a_n ≠ 0 and let the denominator of Eq. 10.4(1) vanish for no value of n. Then S_n^{(k)} ≡ S, n ≥ 0, iff

s_n = S + (s_0 − S) Π_{r=0}^{n-1} (1 + τ_r^{-1}),  n ≥ 0,   (1)

where τ is a nontrivial member of Lin[g_1, ..., g_k]. Furthermore, if g_1(n) ≡ 1 and {g_j(n)} is an asymptotic scale, the transform is exact for s ∈ 𝒞_c only if

r_n = Cλ^n(1 + ε_n),  λ = (c_j + 1)/c_j,   (2)

for some j, 1 ≤ j ≤ k, and some ε ∈ 𝒞_N.
Proof. ⇐: From Eq. 10.4(1),

r_n^{(k)} = S_n^{(k)} − S = D[r]/D[1],   (3)

where D[u] is the determinant of 10.4(1) with first row (u_n, ..., u_{n+k}) and (j + 1)th row (f_j(n), ..., f_j(n+k)). Let

v_n = Π_{r=0}^{n-1} (1 + τ_r^{-1}).   (4)

Differencing (1) shows

a_{n+1} = Δs_n = (s_0 − S) v_n τ_n^{-1},   (5)

and if a_n is defined and nonzero, then r_n = a_{n+1}τ_n; the first row of the numerator in (3) is then a linear combination of the rows below it, so S_n^{(k)} − S = 0, n ≥ 0.

⇒: We must have τ_n ≠ 0 and s_0 ≠ S. Thus r_n/a_{n+1} = τ_n with

τ_n = c_1 g_1(n) + c_2 g_2(n) + ⋯ + c_k g_k(n)   (6)

for some τ ∈ Lin[g_1, ..., g_k]. Since a_{n+1} = Δr_n, this can be written

r_n = (r_{n+1} − r_n)τ_n,   (7)

or

r_{n+1}/r_n = 1 + τ_n^{-1},   (8)

and taking products gives (1).
To prove the second part of the theorem, note that c_1 ≠ 0, since otherwise s is not convergent, and write

1 + τ_n^{-1} = 1 + [c_1 + c_2 g_2(n) + ⋯ + c_k g_k(n)]^{-1} = (1 + 1/c_1)(1 + δ_n),  δ ∈ 𝒞_N,  c_1 ≠ −1.   (9)

Substituting (9) in (8), taking products, and using Theorem 1.4(2) gives (2) (with j = 1). The case c_1 = −1 is handled similarly. ∎
Theorem 2 (Exactness for Euler Series). Let

s_n = Σ_{j=0}^n j^α x^j.   (10)

Then for t_k, S_n^{(k)} = S for x ≠ 1, k ≥ α + 1. For u_k, S_n^{(k)} = S for k ≥ α + 2.

Proof. We shall prove the first statement. According to (7), t is exact iff

Σ_{j=n+1}^∞ j^α x^j = (n + 1)^α x^{n+1}[c_0 + c_1/(n + 1) + ⋯ + c_{k-1}/(n + 1)^{k-1}]   (11)

for some constants c_0, c_1, ..., c_{k-1}. By the work in Section 1.7 this is certainly possible provided x ≠ 1 and k − 1 ≥ α. (Notice that in this case the asymptotic expansion for r_n terminates.) ∎

Theorem 3. If t_k is exact for some sequence s, then u_m is exact for s when m > k. If u_k is exact, u_m is exact, m ≥ k. If t_k is exact, t_m is exact, m ≥ k.

Proof. Trivial. ∎
Theorem 4. Let s_n = S + (s_0 − S)p_n, where (a + 1)_n denotes the Pochhammer symbol. Then u_1 is exact for p_n = (a + 1)_n/n!; u_2 is exact for p_n = (a + 1)_n/(b + 1)_n, or c^n; u_3 is exact for the previous sequences and for p_n = c^{-n}(c + 1)_n, or (b/a)^n(a + 1)_n/(b + 1)_n; t_1 is exact for c^n; t_2 is exact for p_n = c^{-n}(c + 1)_n, c^n, or (b/a)^n(a + 1)_n/(b + 1)_n.

Proof. Left to the reader. (Assume that the parameters are such that all quantities are defined and denominators of t or u are never zero.) ∎

10.5.4. Numerics
Table I. Levin Transforms

s_n = (GAM):
 k     t        u
 2     0.645    0.577621
 4     0.608    0.577268
 6     0.594    0.577216
 8     0.588    0.577215661
10     0.585    0.577215664926
γ = 0.577215664901533

s_n = (LN 2):
 k     t                  u
 2     0.692              0.694
 4     0.693144           0.693161
 6     0.693147186        0.693147203
 8     0.693147180584     0.693147180437
10     0.6931471805598    0.693147180559951
ln 2 = 0.693147180559945

s_n = (FAC) (divergent):
 k     t              u
 2     0.571          0.615
 4     0.595          0.598
 6     0.596399       0.596368
 8     0.596346       0.596341
10     0.596347283    0.596347823
∫_0^∞ e^{-t}(1 + t)^{-1} dt = 0.596347362

s_n = Σ_{j=1}^n j^{-2}:
 k     t        u
 2     1.521    1.639
 4     1.586    1.644676
 6     1.611    1.644931
 8     1.624    1.644934081
10     1.630    1.644934067
π²/6 = 1.644934066846

s_n = (IT 1):
 k     t              u
 2     1.369456       1.369734
 4     1.368860       1.368882
 6     1.368812       1.368813
 8     1.368808438    1.368808559
10     1.368808134    1.368808143
The root of x³ + 2x² + 10x − 20 is 1.368808107.

Diagonal modes of convergence seem to always be preferable with the Levin transforms. Table I shows the effect of t and u on typical sequences. t sums alternating series well but not monotone series; u sums both. Further, both sum the divergent 1 − 1! + 2! − 3! + ⋯. In almost every case, the
performance of u is spectacular, and it is clear why Smith and Ford call it one of the three best all-round practical summation methods (along with the θ-algorithm and the Levin v-transform). However, no method can do everything; apparently the t and u methods perform less satisfactorily on iteration sequences, convergent or divergent. True, they both sum (IT 1) well diagonally, but s_n itself is rapidly convergent also. For (IT 2)_n both methods fail. For t and u, respectively,

S_0^{(15)} = 1.2596,  S_0^{(15)} = 1.2606,

and the data seem to indicate S_0^{(k)} converges in either case, although it is not clear to what. (s_n has two limit points, 0.549 and 1.293.) The GBW algorithm is the only one I know of that will sum this kind of sequence.

The example of 7.1(4), (LUB)_n, can be used to show neither t nor u is regular for any path P = (n, k), k > 0. For t the sequence S_0^{(k)} is (1, 1.3 (exact), 3 (exact), 0.955, 0.876, 1.45, 1.316, ...) and for u it is (1, 1.214, 1.583, 0.870, 0.904, 1.917, ...).
10.6. Special Computational Procedures: The Trench Algorithm

Two of the drawbacks to the BH protocol are computing time and storage space. However, if a certain relationship prevails among the f_j(n), a very efficient algorithm due to Trench (1965) for the inversion of finite Hankel matrices can be used to compute S_n^{(k)}. It is surprising that when f_j(n) = Δs_{n+j-1}, yielding the iterates of the Schmidt transformation, the result is even more efficient than the ε-algorithm for the computation of S_n^{(k)}, requiring only one-third as many operations.

Recall that S_n^{(k)} results from solving the system

s_m = S + Σ_{r=1}^k c_r f_r(m),  n ≤ m ≤ n + k.   (1)
Differencing gives

Δs_m = Σ_{r=1}^k c_r Δf_r(m),  n ≤ m ≤ n + k − 1.   (2)

Now assume f_r has the property

f_r(n) = g(n + r − 1).   (3)

The system may be written

Δs_{m+n} = Σ_{r=0}^{k-1} c_{r+1} D_{r+m},  0 ≤ m ≤ k − 1,  D_j ≡ Δg(n + j).   (4)

Define the Hankel matrices

H_i = [D_{l+m}],  0 ≤ l, m ≤ i,  0 ≤ i ≤ k.   (5)
The algorithm for the inversion of the H_i proceeds as follows. Let

H_i^{-1} = [b_{lm}^{(i)}],  0 ≤ l, m ≤ i,   (6)

and assume H_m^{-1} is known, 0 ≤ m ≤ k, for some fixed n. (The algorithm generates a diagonal of S_n^{(k)}.) Initialize as follows:

γ_{-1} = 0;  u_{i,-2} = 0;  u_{0,-1} = 1,  u_{i,-1} = 0, i ≠ 0;  u_{-1,i} = 0;  u_{i+1,i} = 1.   (7)
Then compute

λ_k = Σ_{j=0}^k D_{j+k} u_{j,k-1},  γ_k = Σ_{j=0}^k D_{j+k+1} u_{j,k-1},

u_{r,k} = (λ_{k-1}^{-1}γ_{k-1} − λ_k^{-1}γ_k) u_{r,k-1} + u_{r-1,k-1} − λ_{k-1}^{-1}λ_k u_{r,k-2},  0 ≤ r ≤ k.   (8)
Next compute

λ_{k+1} = Σ_{j=0}^{k+1} D_{j+k+1} u_{jk},   (9)

and finally obtain b_{ij}^{(k+1)} from b_{i-1,j-1}^{(k)} and the u_{r,k} by the Trench update; by the symmetry of H_{k+1}^{-1} only the entries with 0 ≤ i ≤ j ≤ k + 1 need be computed.   (10)
S~) may be computed from (1):
(k) _ Sn -
Sn -
f [ l(n),
f2(n), ... , };.(n)]H - 1
l
k- 1
j
L'1sn L'1Sn + 1
:
Llsn + k -
.
(10)
(11)
1
Brezinski (1976) has shown that for the special case of the s-algorithm, the computation in (11) can be bypassed and the algorithm becomes more compact. An important fact is that cj = jth component of {H;_\(Lls n , ••• , L'1Sn+k_l)T},
and thus if it is known that kind
Sn
(12)
has a complete asymptotic expansion of the
s, '"
S
+
L c, f,.(n),
r= 1
then Eq. (12) gives a lozenge algorithm for the computation of each cj , j ~ 1; i.e., just label the left-hand side of (12) cJ~~.
Chapter 11
The Brezinski-Havie Protocol and Numerical Quadrature
11.1. Introduction; The G-Transform
The sequence transformations discussed in the previous chapters can be put to obvious use to compute approximate values of s = S~ g(x) dx. If any sequence s of approximants to s has been obtained (for instance, by the application of some standard numerical quadrature rule to progressively finer subdivisions of [a, b]), then any acceleration method can be applied to s. We shall not belabor this obvious approach. Brezinski (1978) discusses this approach exhaustively and gives many numerical examples. One problem in such an ad hoc approach is that there is usually no clear way of finding those functions for which the method yields exact answers and thus of characterizing the class of functions for which the method should be expected to provide good answers. However, there is another more intuitive way of proceeding. The underlying philosophy ofthis method, which is applied to infinite integrals (b = 00), was first set forth in a paper by Gray and Atchison (1967) and developed in subsequent papers by Gray, Atchison, and Clark. Their algorithm came to be known as the G-transformation, and is a special case-vertical convergence in the second column of the S~k) array-of the algorithm to be developed in this section. The latter method is one to which the BH protocol can easily be applied. The derivation will be informal; convergence theorems will come later. Suppose we have a way of computing approximate values of G(t)
=
fg(X) dx, 200
t
z
a,
(1)
11.1. Introduction; The G-Transform
20 I
for a sequence of values of t. Let p be a fixed number> 0 and write
c, =
G(t
gj = g(t
1=
+ jp), + jp),
(2)
LX) g(x ) dx
We have
= Goo.
L oo
G(t) = 1 -
Go = G(t),
j 2 0,
g(x
+ r) dx,
t 2
(3)
Q.
Now assume that a quadrature formula with equally spaced nodes is available for the evaluation of the above integral; in fact, for the purposes of the derivation, we assume the integral can be represented exactly for all t by such a formula, so that 1 - Go =
k-I
L c.q..
(4)
ro:=.O
Replacing t by t + mp, 0 ~ m ~ k, yields a system of k + 1 equations in the k unknowns Co, CI, ... , ck- I' For consistency, the augmented determinant of the system must vanish and that determinant can be solved for I. For general integrands, of course, the result will no longer be exact, but we can use it to obtain an approximation W) to 1 that looks like
,
I(k) --
Go go Gk
gk~ 1
?/k
?i2k ~ 1
Thus
nO) =
G(t),
,
-
1(1) -
/
1
1
?Jo
Yk-I
a,
Y2k- I
G(t)g(t + p) - G(t + p)g(t) , .... g(t + p) - g(t)
(5)
(6)
There are several ways the above formula can be used. One could assume, for instance, that G is tabulated at equally spaced points, to + jh, to 2 a, j 2 O. The BH protocol can then be applied by taking f,.(n) = g(t o
+ (n + r
- l)p),
s; = G(to
+ np),
(7)
and so S~k) yields an extrapolation to 1 in terms of the known values G(to +np), n 2 O. Of course, to may be chosen larger than a, the advantage being that 9 may have a singularity near Q and that the necessity of tabulating 9 near a can be avoided. Alternatively, one may take to = a and define, arbitrarily 1)(0) = 0 when 9 is singular at a.
202
11. The Brezlnskt-Havie Protocol and Numerical Quadrature Table I k
sn-.
0 1 2 3 4 5
-0.45939 -0.49006 -0.49634 -0.49829 -0.49909 -0.49948
k
sn-.
6 7 8 9 10
-0.49968 -0.49979 -0.49986 -0.49990 -0.50003
Example 1
+ ev'X)2, - !,
g(x) = -efiI2J~(l G(t)
= (1 + ev'l)-1
I =
50'" g(x) dx
=
-!,
x> 0, (8)
to = a = 0,
p = 1.
Here G is known explicitly, but, surprisingly, not more than ten or so tabular values are required to determine I to almost five places despite the fact that 9 is singular at zero. Thus we may assume, for the example11(0) = 0, that 11 values of G are known and tabulate the 11th ascending diagonal of S~k) (see Table I). Example 2 g(x) = -e-X(x G(t)
+
x> 0,
1)lx 2 ,
= e-tlt - lie,
I =
J'"
(9)
g(x) dx = e-
I
= 0.367879441,
to = a = p = 1.
The sixth ascending diagonal is tabulated in Table II. In this example double precision (16 significant figures) was used, and Sb1 5 ) is accurate to 16 significant figures. This indicates the method has great numerical stability, at least when applied to monotonic integrands. Table II k 0 I 2
5-'
k
S~~k
-0.367466316 -0.367863093 -0.367878981
3 4 5
- 0.367879339 -0.367879477 - 0.367879363
Sl')
203
11.1. Introduction; The G-Transform Table III
Example 3.
k
S\k~ _ k
k
S\k~ -k
0 2 4 6
1.04471 0.99818 1.00015 0,99968
8 10 12
0.99996 0.99967 0,99996
This has an oscillatory integrand, corresponding to
I = Then
f
OO
sin x - x cos x
o
x
G(t)
=
2
dx = 1.
(10)
1 - (sin t}/t.
(11)
Some elements on the 13th ascending diagonal are tabulated in Table III. The error in #5) is 1 X 10- 6 . Obviously, the algorithm was not designed for integrands that decay algebraically or logarithmically. For J~ x-l(ln x)" 2 dx, as another example, Sb1 5 ) = 1.262, while the true value of the integral is l/ln 2 = 1.443. We now look at the exactness problem for this algorithm. Theorem 1. For some complex constant do, d 1 , ••• , db let A. E f:(}c be a sequence of roots with negative real parts of the exponential polynomial H(A) = do
k-l
+ AI
r=O
(12)
dr+ le"rp •
Then if (13)
where Pm(t) is a polynomial of degree less than the multiplicity of Am' infinite sums being allowed subject to convergence conditions, the transformation (5) is exact for each t; i.e., Ilk) == I, t > a, provided the denominator of (5) does not vanish. Proof
Define !£(f) = dof(t)
k-l
+ I
r=O
dr+d'(t
+ rp).
(14)
If g satisfies the equation !£(g) = 0, then, by integration between t and
do[G(t) - 1]
k-l
+ I
r=O
dr+lg(t
+ rp) =
0,
t > a,
00,
(15)
204
II. The Brezinski-Havie Protocol and Numerical Quadrature
°
so the numerator of the determinantal expression of I~k) - I will vanish. Let ,10 be a root of H(A) of multiplicity m. We need only show 2(ti e AOI) = for O:5:j:5:m-l. We can write eAIH(A)
d
k-l
= co eAI + "c _ 1... r+ 1 dt r=O
eA(I+rp )
(16)
O:5:j:5:m-l,
(17)
'
so
which was to be shown.
•
Corollary 1 (k = 1). Let f E L(O, iff g(t) = Me-at, M -# 0, Re a > 0. Corollary 2.
00).
Then 1~IJ is defined and exact
For some complex constants db ... , db d,
+ dk -# 0, let A be a sequence of roots with negative real parts of
°
k- 1
"d 1... r+ 1 e Lrp .
r=O
Then
IlkJ, k
~
+ ... (18)
1, is defined and exact for
=
g(t)
L: Pm(t)e
Am l
(19)
,
where Pm is as in Theorem 1.
Proof Completion of the proof, which requires Heymann's theorem to guarantee the nonvanishing of the denominator of Ilk!, is left to the reader; see Section 6.3. • The following result on accelerativeness is easy to demonstrate.
Theorem 2. Let D(t) denote the denominator of I:k ) and Mr(t) the rth cofactor of the first column of D. Let Mr/D be bounded. Let I exist, g be bounded, and 1 :5: r :5: k.
(20)
Then lim {(Wl - l)/[G(t) - I]} = 0. t~oo
Proof
Left to the reader.
•
(21)
1l.2 The Computation or Fourier Coefficients
205
Let us take as an example the important case k = 1, III _
~
-
G(t
+ p)
- G(t)g(t + p)jg(t) . 1 - g(t + P)jg(t)
(22)
If g(t + p)jg(t) = A + 0(1), 0 < A < 1, the hypotheses of Theorem 2 are satisfied ~ in fact, in this case the conditions are necessary and sufficient for the accelerativeness of Ipl; see Gray and Atchison (1967). The algorithm is most suitable for integrands that behave exponentially. Obviously iff = o(t-a), the conditions of theorem are not satisfied; in fact, for k = 1, one has (23)
An algorithm suitable for cases in which f behaves algebraically can be obtained by making an exponential substitution in (2)-(6). This amounts to taking in the BH protocol f,(n)
= topn+r-lg(topn+r-I), to
~ a ~
1, p > 1.
(24)
However, these equations offer no clear computational advantage over (7), since tabular values of G for very large t are required. An exactness theorem analogous to Theorem 1 is easily established for the new algorithm. Details are left to the reader. Theorem 2 remains unchanged. For the important case k = 1, these results show the algorithm is exact for functions f(t) = Mt- a, M #- 0, Re IX > 1, and accelerative if f(t) = O(t- a), Re IX > 1. The papers by Gray, Atchison, and Clark detail many other properties of the k = 1 algorithm. 11.2. The Computation of Fourier Coefficients Suppose it is required to compute the Fourier coefficients I(m)
=
L
f(x) cos(2nmx) dx,
(1)
and that a sequence s of values of the trapezoidal sums (2)
is known. Further, assume that Romberg integration (Section 3.1) has been applied to Sn to produce a value of 1(0) accurate to as many figures as are required of 1(m).
206
II. The Brezinski-Havie Protocol and Numerical Quadrature
The BH protocol, combined with a method due to Lyness (1970, 1971) can be used to attack this problem. To be accurate, we should speak of a "class" of methods, since Lyness's theory has a great deal of flexibility, which allows one to take advantage of additional data, i.e., a knowledge of the derivatives off Here only the simplest form of his algorithm will be used. (It seems a pity that Lyness's work, uncomplicated and beautifully ingenious, has received almost no attention from the authors of books on numerical analysis.) Supposefhas the Fourier series development f(y)
=
1(0)
+
2JI f
f(x) cos[2nk(x - y)] dx.
Let y assume the values jim and sum from j = be expressed
2
°to m -
(3)
1. The result may
00
I
k=l
l(km) = rm ,
(4)
[For details, see Luke (1969, Vol. II, p. 215).] Now, the Mobius inversion formula (Hardy and Wright, 1959, p. 237) states that, subject to certain convergence conditions, the sum m
~
(5)
1,
may be inverted to yield 00
r; = I
k= 1
ilk G k·m,
m
~
1,
(6)
where ilk is the Mobius function, ilk
=
f~
1
(-1)'
k = 1 if k has a square factor if k is the product of r prime numbers.
(The first ten values of ilk are + 1, -1, -1,0, -1, applied this formula to the sum (4) to obtain l(m) =
1
(7)
+ 1, -1,0,0, + 1). Lyness
00
2 k;/krk.m.
(8)
This is the series from which we wish to compute l(m). We show how the BH protocol can be applied to the partial sums of this series. Let 1 n+ 1 lim) = -2 I ilkrk'm, k=l
n
~
0,
(9)
11.3. The tanh Rule
207
and define R; =: I n(m) - l(m) =
1
2
L 00
k;n+2
(10)
Ilkrk'm'
From the fact that (11) fj(n) =
L 00
k;n+ 2
Ilk
k2 j
'
(12)
However, (13)
so I
fj(n) = (2j) -
n
+
1
k~1
Ilk k2j
'
n ::::: 0, j
>
1,
(14)
and to complete the BH protocol one takes 1 n+ 1 s, = I n(m) = -2 L Ilkrk'm, k;l
rn =
T" =
T" -
1(0),
~n in f(~), n k;O
1(0)
=
f
(15) f(x) dx.
[The numbers (2j) are extensively tabulated; see e.g., Abramowitz and Stegun (1964).J One would expect, based on the representation 10.4(1), that rin ) = Oin" 2k- 2), n -> 00. (This has not been proved, of course.) The original series, Eq. (8), converges only as n- 2 . Iffhas derivatives, i.e., if the values of c., c 2 , ••• , c 2 r + I' are known, these may be used in an obvious way to make the process even more efficient, with T" minus the first several terms in the series (11) taken for T". 11.3. The tanh Rule The basis of the tanh rule is the approximation of a doubly infinite integral by means of a trapezoidal approximating sum. Thus the quadrature process is similar to the methods based on cardinal interpolation. However, there is an important difference, one that changes completely the nature ofthe
208
11. The Brezinski-Hiivie Protocol and Numerical Quadrature
error term: The infinite sum is truncated at ± N(h). The problem is, how should N be chosen to obtain optimal results? Following Schwartz (1969), we make a change of variable in the finite integral J~ 1 g(x) dx. Let ljJ be a reasonably smooth function that is monotone and maps ( -1, 1) into ( - 00, (0).
~ hrt_f'(rh)g(ljJ(rh)).
flg(X)dX = f:oog(ljJ(t))ljJ'(t)dt
(1)
How should ljJ and h == hen) be chosen? Schwartz suggested ljJ(t) = tanh(!t) (hence the name "tanh rule") and h = nj2FJ. For integrands 9 in Hardy class H 2 , Haber (1977) has computed the asymptotic form of the error norm and has shown that for the above choice of ljJ, the choice of h is optimal. [The functions in the Hardy class H 2 are iO 2 functions analytic in N for which I f(re ) 1 dO is bounded as r ---+ 1.] Let 9 E H 2 and define
gJr
( )= h Sn
9
It can be shown that S that
i
s(g) =
r= -n
-
f
1 g(x)
(2)
dx,
g(tanh(nh/2)) 2 cosh 2(nh/2) ,
h=
nj2FJ.
(3)
s; is a bounded linear functional on H 2 • Haber found (4)
Note that this seems to be considerably inferior to the bound obtained for the trapezoidal rule in Section 3.4. However, there the sum is not truncated and the class of functions is smaller. Haber's computations seem to indicate that a good choice for the BH protocol is Jj(n) = e-(Jr/J].lJri/(n
+ l)U- 1 )/ 2 .
(5)
The function g(x) = (l - x 2 y is in H 2 provided Re a> I = rca + l)fi/rca + ~) and
-i.
Then
n 2: 1 (6)
n 2: 1,
and
So
= O.
11.3. The tanh Rule Table IV BH Protocol Applied to 1
7
Ci
k
2
4 6 8
8
12
=
Sk
-t 1_ 13 =
(tanh rule)
2.611931003 2.586166070 2.586239244 2.586715520 2.586937436 2.587032111
2.587109559 s~)
2.266890051 2.563060233 2.586139159 2.587082817 2.587108878 2.587109544
2
=
IX
=
Sk
fO o
-
209
x 2 )' dx
i, 1_ L4
(tanh rule)
2.440806880 2.399070105 2.396475368 2.396260717 2.396257569 2.396267876
=
2.396280467 )'(kl
'0
2.048670072 2.371528094 2.395295728 2.396255106 2.396279761 2.396280440
Table IV displays Sk versus s~), i.e., vertical versus diagonal, convergence for the choice (5) and the cases (X = -t and (X = -t. Clearly, the BH protocol is a powerful tool to use in conjunction with the tanh rule.
Chapter 12
Probabilistic Methods
12.1. Introduction
Historically, the construction of summability methods has been based on the philosophy and techniques of classical analysis. Actually, the problem of accelerating the convergence of a sequence is more at home in a probabilistic setting. A formulation in terms of prediction theory or recursion filtering, for instance, immediately suggests the minimization of the expectation {E(lrnl)} of the transformed error sequence if the original sequence is interpreted as a sequence of random variables.t By assuming certain distribution functions for the {sn} and performing this minimization, one is led naturally to a class of methods for transforming sequences. Of course, the methods will depend on the parameters of the chosen distributions. If these parameters are unknown, any well-known estimation technique can be applied. Each estimation technique provides a different summation method. Although the construction of summation methods has not traditionally been based on probabilistic techniques, the methods themselves have been put to extensive probabilistic use. For example, Chow and Teicher (1971) represent the strong law as a trivial special case of the following Toeplitz summability. Let {X n}:'= 1 be independent identically distributed random variables with finite first moment. Suppose (1) «.> 0, n > 0, t Good sources for the theory of probability and stochastic processes needed in this chapter are Papoulis (1965) and Miller (1974). 210
12.2. Derivation of the Methods
211
and 11 ;::::
0,
(2)
diverges. Define the transformed sequence {1;,} by
1;, =
n
s;; 1 L a.x;
11 ;::::
j=O
O.
(3)
If 1;, - C, -+ 0 almost surely for some centering constants {Cn}, then {X n} is called an-summable with probability 1. Note that the strong law is obtained by using C, = EX,
11;:::: 0,
(4)
the common mean of the underlying distribution, and
11;:::: O.
(5)
The summation methods to be derived here are nonlinear and nonregular. They are simple to use. They are useful for summing classical series and also for summing "statistical" series whose terms are realizations of random sequences. Numerical examples of both kinds of applications are included here. The advantages the methods hold for statistical applications are clear: For series defined by complicated experiments in which obtaining data is difficult and expensive, the use of the proper summation method based on an appropriate probabilistic assumption can result in practical advantages. Finally, we shall show that for one large and important class of sequences, the methods are regular, namely, the sequence space of partial sums of alternating series whose terms in absolute value are monotone decreasing. No other nonregular method has been shown to be regular for this sequence space. 12.2. Derivation of the Methods To motivate our derivation, suppose that the series Lk='O a k is a realization of the following" experiment": Let {xdk'= 1 be a sequence of independent random variables with and where
Ipl < 1 and q <
E(xD 00.
=
q,
k ;:::: 1,
(1)
k ;:::: 1,
(2)
212
12. Probabilistic Methods
Defining ak
=
n k
aO
j= I
(3)
k ? 1,
Xj'
one finds that (4) Since (5) it follows that n ? 0,
(6)
(7)
E(s) = ao/(1 - p), and n
= 0, 1,2, ...
(8)
Now,
-aop
= -1-
-p
Pn(P),
n ? 0.
°
(9)
for n ? 1.
All methods will have the property that E(r n ) =
Definition. The summation method U is called Esadmissible if the characteristic polynomial has the form
(10)
°
where k b k 2 , ... are positive integers, 1 degree j in A., and dil) #- for any j.
~
k;
~
n, dp.. ) is a polynomial of
Clearly, IE(r n)I is minimized if and only if U is E-admissible. Perhaps the simplest example of an E-admissible method is n ? 1,
(11)
which leads to the following very simple choice.
Method I flnk fln.n-l
= 0,
°
= -p/O - p),
~ k ~ n - 2; flnn =
1/(1 - p).
(12)
12.2. Derivation of the Methods
213
For this matrix U, s; for each n is the expected value of s given So, Sl,"" Sn' Does there exist a U that minimizes both IE(r n) I and E(r~)? The answer is yes: It can be found as follows. Let k-1 (13) Wnk = L Ilnj' i s k :s; n, j=O
Then (14)
Now n
E(r;) = E(r~) - 2k~1 wnkE(rnak)
but for k :s; n, E(rnak) = E(-
I
j=n+1
aja k) = -a5
L
= - ao2(q)k -
n
f
j=n+1
)2 ,
(15)
pj-kqk
v:-1 , pl = - ao2(q)k - P 1- P
(16)
OCJ.
P j=n+ 1
E(r;) = E(r;)
(
+ E k~l wnkak
)2 .
(n + 2a1°_pn+1 Ln Wnk(q)k ~ + E L wnkak p k= 1
P
k= 1
(17)
For the last term, ECt1 WnkakY = E(t1 kt WnkWnlakal) 1
n
" 2 k - ao2L. wnkq k=l
_
Let
F= E(r~) - A(k=i
1
n
k- 1
k=l
1=1
+ 2ao2 L. " Wnk "L. WnlP k-l q.1 Wnkpk- 1 +
~). 1- P
(18)
(19)
Then (20)
214
o
12. Probabilistic Methods
Setting
of/ownj =
0 for 1 S j S n gives
=
W
=~~
~~ 1- P
-2-' -
2aoq)
or nk
r:':'
)..pj-l
w nj
-
j-l LW nk -
q
k= 1
pn+1-k (p2 ) --1 , I-p q
2sksn.
(21)
(22)
Since I E(r n ) 1 is a minimum, Pn(p)
= 0 = Wnl + (w n2 - w n1)p + ...
+ (w nn -
and this implies
~
Wn. n_l)pn-l
L. WnkP
k-l
+ (l
- wnn)pn,
_pn
(24)
= -1--'
k= 1
P
-
(23)
We can now determine Wn1 = IlnO and then, from (22), all the Ilnk: Wn1 = IlnO
- p" - ~ k- 1 = -1-L. WnkP . - P
(25)
k=2
This method minimizes both IE(r n) I and E(r;): 2 IlnO = + (n - 1)(pq -
Method II.
1)l
1-!~ [1
Ilnl
r'
= (1 _ p)
q-
(p2
Ilnk
= ( pq2 _ 1)pn-\
11
=
nn
) 1 [1
_1 (1 _p3). I-p
+ (n
- l)p]
+
pn 1 _ p;
(26) 2 S k S n - 1;
q
We shall be concerned with yet a third E-admissible method, namely, that arising from the choice Pn(A) = (A - p)"/(l - p)n,
n
~
0,
(27)
i.e., the choice that forces Pn(A) to vanish as strongly as possible at x = p. Method III.
This yields the weights Ilnk
=
(~)( - vr: /(1
- p)".
These are related to the Euler means of Section 2.3.2. (There p < 0.)
(28)
12.2. Derivation of the Methods
Let p E Cf/. Then Methods I-III define regular methods iff s 0, respectively.
Theorem.
p
215
i= I, Ipl < 1, and p Proof
Application of Theorem 2.2(1) to the weights of Methods I and
II is trivial. For Method III, note that for fixed k Illnk I '" M knkip 1 _ p In , so that lim n _
00
Ilnk =
°iff Ip/(l -
n i p In
Mk
independent of n,
p) I < 1, that is, Re p < n
k~olllnki = 11 - plnk~O
l
(29)
Further,
(n)k Ipl _k = IIIIp I-+pi1 In.
(30)
Now, the triangle with vertices {O, 1, p} in the complex plane has legs of length 1, Ipl, and 11 - pl. Thus [Ipl + IJ/ll - pi> 1 unless p is real and negative, in which case the ratio is 1 and ~ 0 Illnk I = 1. •
Lk
To obtain the final summation formulas, note that in almost all applications the variables {xdf~ 1 are identically distributed with the first moment p unknown. One could then estimate p by the method of moments
n
~
1,
(31)
or by the maximum likelihood estimate. The first estimate provides the more useful results. Thus, for instance, for Methods I and III one has (32) and (33) respectively, and a similar formula is obtained for Method II with p in (26) replaced by P« and q by qn = n- 1 Lk~ 1 aUa~-I' The form of the sequence of expected errors (8) is a fortunate consequence of the derivation since many of the sequences encountered in practice are at least approximately a constant plus an exponentially decreasing term. The above methods (like many nonlinear methods) cannot be applied to certain sequences. For instance, if s, = s, + 1 for some value of n, then an+ 2/ an + 1 is undefined and so is Pn+ l' The problem in definition is not resolved by considering only subsequences of {sn} containing no adjacent duplicate members since the possibility that P« = 1 is still not obviated.
216
12. Probabilistic Methods
In fact, it is possible to manufacture examples of convergent infinite series where P« = 1for an infinite number of n, for instance, by folding together the two absolutely convergent series
1,11 < 1,
(34)
to obtain ,1+ d
+
,12
+d
2
+ ....
(35)
The sequence an/an-I' n = 1,2,3, ... , is then e, so that
P2n
Aje, e,
2n ak 1 ( = -21 L -= -2 n e + -A) =
n k=
1
ak- I n c
Aje, ... ,
(e + -A) , e
(36)
n
= 1,2,3, ... , (37)
and for the choice (38)
e=I+~, P2n = 1, n = 1, 2, 3, .... 12.3. Properties of the Methods
For Method I there is a simple necessary and sufficient condition for convergence. We have
Theorem 1.
Let s E
f(ic.
S=s
Proof
Trivial.
Then for Method I iff
lim an/( 1 - Pn) =
o.
(2)
•
Method I is not regular because of its lack of definition for certain convergent sequences, but ifboth {s.} and {sn} converge then limn s, = limn sn so long as {Pn} is bounded away from 1. Corollary. For any convergent alternating series L:'=o an' Method I preserves convergence. Further if Ian I is monotone decreasing, then s; lies between Sn- I and Sn for all n.
Proof Since -1 < an/an- 1 < 0, -1 < P« < 0, and < 1, Eq. (1) immediately yields the corollary. •
1<
(l - Pn)-l
12.3. Properties of the Methods
217
Further, Methods I and II preserve convergence when {sn} is the sequence of partial sums of any series, Lk'=o ai, for which Raabe's test (Knopp, 1947, p. 285) is applicable. Let an > 0 for all n, and let s
Theorem 2.
n(an+da n -
n s.
with
E C(jc
(3)
-Yo
Then Method I is regular for s if - y < - 1 and Method II is regular for s if -y < -2. Proof
We may rewrite (3)
+
n(an+ dan - 1
lin)
:os::. -
f3 = y - 1,
f3 < 0,
(4)
so that (5) or
(n - l)a n
nan+ 12. f3a n .> O.
-
(6)
Thus ultimately {en - l)a n } is monotone decreasing. Therefore lim nan + 1 = lim nan
n--+ co
n- 00
exists. Now from (7)
we can conclude
ak!ak -
1
:OS::.l - YI(k - 1),
1 n ak 1[NL-+ ak Pn=-I-:os::In nk=l ak-l
n
k=l
ak-l
k=N+l
(8)
Y)J
(1 - - , k- 1
(9)
or
Pn
:os::
[n - yin n
+ M(n)]ln,
(10)
where {M(n)} is a bounded sequence. Here we have used the fact that n
1
I k_ k=N+l
+ 0(1).
(11)
1 - P« 2 [y In n - M(n)]ln,
(12)
1 = In n
Thus
218
12. Probabilistic Methods
and for n large enough, the right-hand side is positive. Thus
an
--<. 1 - Pn -
nan , yin n - M(n)
(13)
or lim aJ(1 - Pn) = O.
n->
(14)
00
This gives the result for Method I. Now
IPnln~.
[1 + M(n) ~ yin nT
=. exp [ n In ( 1 + =. ex p[n(M(n)
M(n) - yin n
~
yin n
n)J '
+ e~n»)
l
(15)
by a Taylor's series argument, where {sen)} is a null sequence. Thus
nlPnln {3n 2 - i' --<. , 1 - Pn - yin n - M(n) and since 2 - }' < 0, lim
IPnl n/(1
- Pn) =
and the result is established for Method II.
°
(16)
(17)
•
Both Methods I and II preserve convergence for series for which the ratio test shows convergence provided one further stipulation is added for Method II.
Theorem 3.
Let
SE
fll c with lim an/an-l < 1.
(18)
Then Method I is regular for s. If in addition (19)
the same holds for Method II.
219
12.3. Properties of the Methods
Proof Since Pn is the Cesaro means of an/an-I' lim an/an-I ~ lim Thus, for every
I:
o, ~ rrm: », ~ rrm: an/an-I'
(20)
> 0, lim an/an-I -
E
<. Pn <. Iilli an/un-I n-r ca
+ E.
(21)
By virtue of (18), Pn <. r < 1,
or or
(22)
(1 - Pn) >. 1 - r > 0,
(23)
(1 - Pn)-I <.(1 - r)-I.
(24)
Theorem 1 may be invoked to show regularity for Method I. Using (19) and (22) now gives (25)
-1 < rl <. Pn <. rz < 1,
so
(26)
Thus (27) Many sequences encountered in practice can be expressed (at least approximately) as a constant plus a linear combination of exponential terms. More precisely, one can define a sequence space Ye as follows. Let!/' be the space of complex sequences (28) Define Ye =
{SISn = S+ J/~u~), Ar E N, 1 > IAII > IAII, 2 ~ j
~ k,
u(r)
E
Y} (29)
Note Ye is a generalization of CC Ek(N) [see Eq. 2.2(12)]. Theorem 4.
Proof
Let
Method II is regular for Ye; Method I accelerates Ye. uUl E
Y. By an application of Theorem 1.4(2), (30)
220
12. Probabilistic Methods
Now, an + 1
k
L A~+lu~)[1 + o(1)J
=
r=1
-
k
k
r=1
r=1
L A~u~) = L A~u;;l[(Ar -
Thus .
hm
n-e co
rUn An (r) 1 .fn<TI -_ .hm
I
lUn
n-+oo
Uo I(I) Iexp [( n In I-, I + C (r)
A
Uo
(r) n
r
1\..1
(I)
Cn
)J
_ -
1)
0,
+ 0(1)].
(31)
2 :s; r :s; k, (32)
by virtue of the fact that In IArlAl I < O. Thus
lim an + dA~u~1)
= Al - 1.
(33)
Also, (34) Dividing these two limits gives lim an+ dan
= AI'
(35)
Since p is the Cesaro mean sequence of the sequence {anla n- d, it also converges to AI' and so (1) gives
sn =
s
+
±
An - l u (rn -)
r= 2 r
I
[(1 + ~) + 1 _ Al
O(1)J
+
AnI - 1U (nI- ) I 0(1) .
(36)
Thus (37) n~
00
From (32), (38) Dividing these limits shows that
= 0,
(39)
~ = ~ [1 + 0(1)J,
(40)
lim [(Sn - s)/(sn - s)J and this is the desired result. For the result for Method II, note that
I - P«
1 - Al
and this, used in (25), implies the method is regular for s.
•
12.3. Properties of the Methods
Note that all these methods map the partial sums of So
=
s; = 11(1 - x),
1,
I:=
n 2: 1.
0
221
x" into (41)
This property is shared by other nonlinear transformation, for example, the Shanks ef transformation (Shanks, 1955). In fact, a transformation related to Method I was mentioned in passing in Shanks (1955, pp. 25-26) under the name "geometric extrapolation." This transformation is defined by
sn =
Sn -
lim.(anlan-l)sn-l
n 2: 1.
-"---"-'-"-'-~"--"~
I - limn(anla n- 1 )
(42)
Method III is, for certain classes of sequences, the most effective method of all. We now make this more precise. Lemma.
Let
(- 00,0].
WE ~
be bounded and belong to a bounded subset S of
(43) it follows that
lim s,
Proof
Let sup
ZES
= s.
I-zI l-z
= d
(44)
< 1.
(45)
Note also that
\1-zl- 1 :::;; 1,
Z
E
S.
(46)
Since (47) one can write
Choose N such that WnE S, n > N, and ISn - s] rk is bounded, Irk \ < C
= Irnl < s, n> N. Since
222
12. Probabilistic Methods
or (50) Now (51) for fixed k, so lim
n-e co
(n)d. k
(52)
= 0,
taking lim sup term by term gives
rrm If.1 < s,
(53)
or, since c: was arbitrary, or
Theorem 5.
lim f.
= O. •
(54)
Method III is regular for all alternating series
for which sup bk/bk- 1 = M <
00.
(55)
k~l
Proof
In the lemma, let ak = (- 1)kbk, aJak-l = -bk/b k- 1 <0,
(56)
223
12.4. Numerics
so that P« < O. Also, 1
IPnl = -
I n
b,
nk=lb k- 1
s
(57)
M.
•
Thus P« belongs to a bounded set and the proof of the theorem is complete.
12.4. Numerics
As the reader will see, the three methods derived in the previous section have very unusual properties. We first discuss their application to probabilistic sequences. It must be kept in mind that, although the methods are used on individual random sequences, each method was designed for a space of random sequences. The space (and consequently the method) is identified by its parameters P (and perhaps q). The concept of applying one of the methods to an individual sequence is ambiguous. It really makes more sense to talk about the average value of the transformed sequence over a great number of trials s~, in other words,
'" N 1
-
N
.
-, L. sn'
(1)
i=1
This should, in some sense, be close to the expected value of Sn' Now suppose that {x.}~ 1 is a sequence of independent beta-distributed random variables. Recall the random variable x is beta distributed when its probability distribution function is the incomplete beta integral F(rx, [3, x*)
=
Pr(x
s
x")
= B(rx,1 [3)
IX' t 0
a
-
1
(2)
1(1 - t)fJ- dt
with corresponding probability density function 1 a f(rx, [3, x) = B(rx, [3) x - l (1 - x)fJ- 1 •
(3)
Note that E(x) =
and E(x 2 ) =
II o
I
I
o
t!(rx, [3, t) dt =
+ 1, [3) a ([3) = --[3 B o; o: +
B(rx
t 2f(rx, [3, t) dt = B(rx + 2, [3) = Bt«, [3) (rx
rx(rx + 1) + [3)(rx + [3 +
(4)
1)
.
(5)
224
12. Probabilistic Methods
Since each Xk is beta distributed with parameters (a, {3), (4) and (5) hold for each X k. When the Xk are independent
n21.
(6)
Taking {3 = 1 for the numerics, define sn(a) = 1 + tn(a) = 1 From (6),
so
Xl
+
+
Xl
+ ... + XIXZ ••. X n' + ... + (-ltx l,x z,···,x n·
XIX Z
XIX Z -
1 - (a/a + It E(sn) = I - (a/a + 1) ,
E(s)
while E(tn) =
=
a
+
E(t) = (a
+
1)/(2a
(8) (9)
1,
1 + ( -It(a/a + It+ I 1 + (a/a + 1) ,
so
(7)
+ 1).
(10) (11)
Computer generated values of Xk will be selected as follows. Pick Y I' Y2, ... , Yn from a uniform density on [0, IJ; then for k = 1,2, ... , n the value Xk is defined by (2), i.e., (12) It is possible that the sample values x, are atypical of beta-distributed random variables since the distribution function controls only the likelihood (probability) that particular values are obtained and not the possibility that particular values occur. For instance, when a ~ 1 the distribution function for X k is skewed toward X = 1 (see Fig. 1), so that one "expects"
Fig. I
12.4. Numerics
225
Table I Effect of Methods on Two Typical Probabilistic Sequences"
s, n
Sa
3 6 9 12 15 18
3.168 4.616 5.495 6.071 6.461 6.754
6.872 6.415 7.294 6.971 7.324 7.437
"s(a) and t(a), a
= 1'7 = 0.4737.
I
in
II
II
n
In
6.889 6.523 7.347 7.022 7.355 7.457
29.95 8333
3 6 9 12 15 18
0.019 0.881 0.160 0.695 0.228 0.596
= 8; s = 7.2766, t = 0.3360.
0.466 0.500 0.504 0.438 0.455 0.420
Expected value s
= 9;
II
III
0.475 0.487 0.515 0.473 0.468 0.406
0.479 0.471 0.483 0.492 0.493 0.484
expected value
most simulated values for X k to lie near 1 (and hence for the sequence Sn to converge more slowly than for smaller values of (X). However, it is still possible to generate values for Xk anywhere in [0, 1]. Even when (X is small, say, < (X < 1, so that any individual sequences sand t must converge rapidly, the inevitability of producing Xk arbitrarily close to 1 prevents one from concluding that lim an/an-t < 1 and thus using Theorem 12.3(3) to demonstrate regularity of the methods for such sequences; this occurs in spite of the fact that, for all practical purposes, the sequences converge exponentially. In fact, since t n is composed of alternating monotone decreasing terms the difference It - t n I may be bounded and t computed confidently to any desired accuracy. Table I elaborates on this strange phenomenon. For the given individual sequence t n the methods seem to be summing t to its expected, rather than its actual, value. Of course, this is precisely what one would expect, and indeed demand, of a method to be applied to a sequence of experiments arising from a fixed probability distribution. But this bizarre
°
Table II
Effect of Methods on Various Analytic Sequences

           (PI2)_n                       (LN2)_n                     (FAC)_n
   n      I        II        III        I         II        III         III
   5   1.5213   1.5116    1.5323    0.68586   0.66994   0.69327     0.608398
  10   1.5735   1.5671    2.9197    0.69619   0.70758   0.693147    0.61003
  15   1.5945   1.5780   -1747.9    0.69143   0.68257   0.693147    0.60863
  20   1.6058   1.5951              0.69427   0.70158   0.693147    0.60798
  25   1.6130   1.6052              0.69235   0.68608   0.693147
12. Probabilistic Methods
kind of nonregularity is clearly a hazard if the method is to be applied to any individual series, in particular, to an analytic sequence. The effect these methods have on analytic sequences is capricious; some examples are given in Table II. Methods I and II produced only so-so results on every analytic sequence on which they were tested. Method III is a disaster on monotone sequences but performs very well on certain sequences alternating about their limits. It even seems to sum to the usually ascribed value the sequence (FAC)_n = 1 - 1! + 2! - ... + (-1)^n n!.
Chapter 13
Multiple Sequences
13.1. Rectangular Transformations
Given a double sequence {s_nk}, define the transformed double array {s̄_nk} componentwise by

    s̄_nk = Σ_{i=0}^{n} Σ_{j=0}^{k} μ_ij^{nk} s_ij,   n, k ≥ 0.   (1)

The transformation is completely characterized by the four-dimensional array of weights

    U = [μ_ij^{nk}],   0 ≤ i ≤ n,  0 ≤ j ≤ k.   (2)

Obviously, convergence in the {s_nk} array can be path dependent. Let s_nk be located at the point (k, -n) of J⁰ × (-J⁰). We shall be concerned here with only two modes of convergence: horizontal, lim_{k→∞} s_nk, and vertical, lim_{n→∞} s_nk. (This designation differs slightly from the convention previously used for array transformations s → S^(k), but here it is more useful to think of {s_nk} as a rectangular, rather than a triangular, array.) We shall assume that
    lim_{k→∞} s_nk = β_n,   n ≥ 0;      lim_{n→∞} s_nk = γ_k,   k ≥ 0.   (3)

Definition. The transformation defined by (1) is called horizontally regular if

    lim_{k→∞} s̄_nk = β_n,   n ≥ 0,   (4)
and vertically regular if

    lim_{n→∞} s̄_nk = γ_k,   k ≥ 0.   (5)
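In code, (1) is a direct double summation against the weight array; the sketch below treats μ as a function of the four indices. The identity weights used in the example (all mass at i = n, j = k) are only a device to exercise the indexing, not a scheme from the text.

```python
def transform(s, mu):
    # apply (1): bar_s[n][k] = sum over 0<=i<=n, 0<=j<=k of mu(n,k,i,j)*s[i][j]
    N, K = len(s), len(s[0])
    return [[sum(mu(n, k, i, j) * s[i][j]
                 for i in range(n + 1) for j in range(k + 1))
             for k in range(K)] for n in range(N)]

# illustrative weights: mu_ij^{nk} = 1 when (i, j) = (n, k), else 0
identity = lambda n, k, i, j: 1.0 if (i, j) == (n, k) else 0.0
print(transform([[1, 2], [3, 4]], identity))
```

With the identity weights the transformation reproduces the input array, so both horizontal and vertical regularity hold trivially.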
The material in the remainder of this section is due to Higgins (1976).

Theorem 1. The transformation defined by (1) is horizontally regular iff

(i) given n ≥ 0,

    Σ_{i=0}^{n} Σ_{j=0}^{k} |μ_ij^{nk}| ≤ R_n

for some positive number R_n independent of k;

(ii) given n ≥ 0 and 0 ≤ r ≤ n,

    lim_{k→∞} Σ_{j=0}^{k} μ_rj^{nk} = δ_nr;

(iii) given n ≥ 0, 0 ≤ i ≤ n, j ≥ 0,

    lim_{k→∞} μ_ij^{nk} = 0.
Proof. ⇐: By virtue of (3), one can write s_ij = β_i + ε_ij, where lim_{j→∞} ε_ij = 0. Thus

    s̄_nk - β_n = Σ_{i=0}^{n} Σ_{j=0}^{k} μ_ij^{nk}(β_i + ε_ij) - β_n.   (6)

For n fixed, condition (ii) guarantees that the first two terms on the right-hand side of (6) can be made arbitrarily small for large k. Now separate the remaining term as

    Σ_{i=0}^{n} Σ_{j=0}^{k} μ_ij^{nk} ε_ij = Σ_{i=0}^{n} Σ_{j=0}^{J} μ_ij^{nk} ε_ij + Σ_{i=0}^{n} Σ_{j=J+1}^{k} μ_ij^{nk} ε_ij.   (7)

Since n is fixed and lim_{j→∞} ε_ij = 0 for 0 ≤ i ≤ n, given ε > 0 it is possible to pick J such that |ε_ij| < ε/2R_n for 0 ≤ i ≤ n and j > J. Thus for k sufficiently large, the second term on the right-hand side of (7) has modulus less than ε/2 when condition (i) applies. Now with n and J fixed, define

    M = max_{0≤i≤n, 0≤j≤J} |ε_ij|,   (8)
and pick K so large that k > K implies

    |μ_ij^{nk}| < ε/2MnJ   (9)

for 0 ≤ i ≤ n and 0 ≤ j ≤ J. Condition (iii) has obviously been used to guarantee (9). Thus for k sufficiently large, both terms on the right-hand side of (7) can be made arbitrarily small, so that the three conditions of the theorem imply lim_{k→∞}(s̄_nk - β_n) = 0 for each n.

⇒: First, let r be fixed and apply the transformation to the array

    s_nk = 1 if n = r, k ≥ 0;   s_nk = 0 otherwise,   (10)

obtaining

    s̄_nk = 0 if r > n;   s̄_nk = Σ_{j=0}^{k} μ_rj^{nk} if n ≥ r.   (11)

By the horizontal regularity,

    δ_nr = lim_{k→∞} s̄_nk = lim_{k→∞} Σ_{j=0}^{k} μ_rj^{nk},   n ≥ r.   (12)

Now let r vary to demonstrate the necessity of condition (ii). Second, for the necessity of condition (iii), fix i and j and define the double arrays D_ij = (d_nk), where d_nk = δ_in δ_jk. Applying the transformation, we obtain

    s̄_nk = μ_ij^{nk} if n ≥ i, k ≥ j;   s̄_nk = 0 otherwise.   (13)
Thus for i ≤ n,

    lim_{k→∞} μ_ij^{nk} = lim_{k→∞} s̄_nk = 0   (14)

by horizontal regularity. Varying i and j, we obtain the necessity of condition (iii). Third, the necessity of condition (i) will be demonstrated by contradiction. Suppose there is an integer n for which

    lim_{k→∞} Σ_{i=0}^{n} Σ_{j=0}^{k} |μ_ij^{nk}| = +∞.   (15)

Then there must be at least one integer i, 0 ≤ i ≤ n, such that

    lim_{k→∞} Σ_{j=0}^{k} |μ_ij^{nk}| = +∞.   (16)
If i = n, define the Toeplitz transformation with matrix (c_kj) by

    c_kj = μ_nj^{nk},   k ≥ 0,  0 ≤ j ≤ k.   (17)

This Toeplitz matrix represents a nonregular method since Σ_{j=0}^{k} |c_kj| is unbounded. Thus there is a sequence {x_j} with lim_j x_j = x such that Σ_{j=0}^{k} c_kj x_j does not converge to x as k goes to infinity. Consider now the double array that has zero entries except for row n, wherein lies the sequence {x_j}, and apply the transformation of the theorem to obtain

    s̄_nk = Σ_{j=0}^{k} c_kj x_j ↛ x   as k → ∞.   (18)

Therefore the method is not horizontally regular if i = n. If i < n, consider the Toeplitz transformation whose matrix (c_kj) is given by

    c_kj = μ_ij^{nk},   k ≥ 0,  0 ≤ j ≤ k.   (19)

By Hardy (1956, p. 43),

    lim_{k→∞} Σ_{j=0}^{k} |c_kj| = +∞

implies the existence of a bounded sequence {y_j} with the property that {Σ_{j=0}^{k} c_kj y_j}, k = 1, 2, ..., does not converge as k → ∞. Now apply the transformation of the theorem to the double array (s_nk) that has all zero entries except in row i, wherein lies the sequence {y_j}. Clearly,

    s̄_nk = Σ_{j=0}^{k} c_kj y_j,   (20)

which does not converge, although every row of (s_nk) has limit zero; thus if i < n, the method is not horizontally regular. The necessity of condition (i) is now established by contradiction. ∎

The proof actually substantiates some stronger results regarding the transformation (1). For example, note that with n fixed, the three conditions of the theorem are necessary and sufficient for lim_{k→∞} s̄_nk = β_n. Also, conditions (i) and (iii) are sufficient conditions that the transformation (1) map a double array with all row limits zero to a double array with all row limits zero. The three corresponding necessary and sufficient conditions for vertical regularity can be obtained by analogy. The existence of higher-dimensional analogs of these methods and of this theorem also is clear. An extension of these ideas that has practical application but will not be pursued in this book is the following.
Suppose f: J⁰ → J⁰ and g: J⁰ → J⁰ with lim_n f(n) = ∞ or lim_n g(n) = ∞ (or both). Call the double array (s_nk) (f, g)-convergent to s if

    lim_{n→∞} s_{f(n),g(n)} = s.   (21)

The natural question is: what are necessary and sufficient conditions on the weights μ_ij^{nk} in (1) to ensure that if (s_nk) is (f, g)-convergent to s, then (s̄_nk) is (f, g)-convergent to s?

If the double array (s_nk) has

    lim_{k→∞} s_nk = β,   n ≥ 0,   (22)

then condition (ii) of Theorem 1 can be somewhat relaxed and still maintain horizontal regularity of the transformation (1).
Theorem 2. Suppose that the double array (s_nk) enjoys property (22) and that conditions (i) and (iii) of Theorem 1 are satisfied. If

    lim_{k→∞} Σ_{i=0}^{n} Σ_{j=0}^{k} μ_ij^{nk} = 1,   n ≥ 0,   (23)

then

    lim_{k→∞} Σ_{i=0}^{n} Σ_{j=0}^{k} μ_ij^{nk} s_ij = β,   n ≥ 0.   (24)

Proof. Obvious in view of the proof of Theorem 1. ∎
This theorem aids in the design of transformations of double arrays. If we assume that the better approximations to the limit β appear for the larger indices n and k, the weights μ_ij^{nk} should put more mass on the larger indices i and j than on the smaller indices. Therefore, let f be a function from J⁰ × J⁰ to ℛ⁺ satisfying

    f(μ, j) > f(ν, j)   for μ > ν and all j,   (25)
    f(i, μ) > f(i, ν)   for μ > ν and all i,   (26)

such that Σ_{j=0}^{∞} f(0, j) diverges. We choose the weights μ_ij^{nk} by

    μ_ij^{nk} = f(i, j) / Σ_{u=0}^{n} Σ_{v=0}^{k} f(u, v),   (27)

which leads to Σ_{i=0}^{n} Σ_{j=0}^{k} μ_ij^{nk} = 1 for all n and k, so that (23) is satisfied. Condition (iii) of Theorem 1 is satisfied in view of the divergence of the sum Σ_j f(i, j). Condition (i) is satisfied because of the positivity of f. Thus, the weights (27) define a transformation for which lim_{k→∞} s_nk = β for all n implies lim_{k→∞} s̄_nk = β for all n. These weights will be computationally useful only when the function f reflects the input double array.
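A minimal sketch of the construction (27), with the illustrative choice f(i, j) = (i + 1)(j + 1) (my example, not one from the text): f is increasing in each index, as (25)-(26) require, and Σ_j f(0, j) diverges.

```python
def f(i, j):
    return (i + 1) * (j + 1)     # assumed example satisfying (25)-(26)

def bar_s(s, n, k):
    # weights (27): mu_ij^{nk} = f(i, j) / sum over u<=n, v<=k of f(u, v)
    total = sum(f(u, v) for u in range(n + 1) for v in range(k + 1))
    return sum(f(i, j) * s[i][j]
               for i in range(n + 1) for j in range(k + 1)) / total
```

Because the weights sum to 1, a constant array is reproduced exactly, and an array whose rows all converge to β is mapped to one whose rows converge to β, in accordance with Theorem 2.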
13.2. Crystal Lattice Sums
An important class of multidimensional sums arises in the theory of crystal lattices, specifically in the computation of the lattice energy per atom of a given crystalline material. Let f: J^p → ℛ, J^p = J × ... × J, M_p = (m_1, ..., m_p), A_p = (a_1, ..., a_p), O_p = (0, 0, ..., 0), ‖M_p‖ = (m_1² + ... + m_p²)^{1/2}. (Where there is no chance of misunderstanding, we omit the subscript p.) The sums of interest are generally of the form

    S = Σ_{M≠A} f(M) / ‖M - A‖^{2s},   s ∈ 𝒞.   (1)

f is usually quite simple, typical examples being

    f = 1,   (2)

although in so-called phase-modulated sums (Glasser, 1974) more complicated functions occur. Sometimes the m_j range over only even or odd numbers, but it is not useful to develop a special notation to deal with such cases. It is not at all obvious when (1) converges. The following theorem is often applicable.

Theorem. Let M - A ≠ 0 and let f be bounded. Then (1) converges and represents an analytic function of s for Re s > p/2, convergence being obtained regardless of the order in p-space in which the terms are added up.
Proof. The most elegant demonstration uses the theory of theta functions. This proof is given in Section 13.2.2. ∎
For a discussion of the physical context in which such sums arise, see the classic treatise by Born and Huang (1954). We shall take an approach with these sums that is fundamentally different from the procedures used previously in this book to accelerate the convergence of series or sequences. The techniques given here will not be general but will depend very much on the specific character of f. This is, of course, in sharp contrast to the previous work, where only the general form of the sequence or series was of interest (for instance, the fact that the remainder sequence possessed an asymptotic series of Poincare type). The present kind of endeavor might be called the analytic approach to sequence transformations. The arguments used will depend on known properties of mathematical functions, such as theta functions, and on the application of a powerful formula from classical analysis, the Poisson summation formula.
13.2.1. Exact Methods

Definition. Let f be locally L(0, ∞) and let the integral

    ℳ(f; s) = ∫₀^∞ x^{s-1} f(x) dx   (1)

converge for Re s = t₀ and Re s = t₁, t₀ < t₁. ℳ is called the Mellin transform of f.

Clearly the integral converges for

    α < Re s < β,   (2)

where α = inf t₀ and β = sup t₁; (2) is called the strip of absolute convergence of (1). The Mellin inversion theorem states that if f is of bounded variation in a neighborhood of x ∈ (0, ∞), then, for any α < c < β,

    [f(x+) + f(x-)]/2 = (1/2πi) lim_{R→∞} ∫_{c-iR}^{c+iR} ℳ(f; s) x^{-s} ds.   (3)

Usually ℳ may be continued analytically into a larger region 𝒟 of the complex s plane. For instance,

    Γ(s)/a^s = ∫₀^∞ x^{s-1} e^{-ax} dx,   Re a > 0.   (4)

Here α = 0, β = ∞, 𝒟 = 𝒞 - {0, -1, -2, ...}. The theta functions for x > 0 are defined as follows.
(5)
For some of the many beautiful properties of these now almost forgotten functions, consult Whittaker and Watson (1962), Hancock (1909, Vol. I), or Bellman's more recent book (1961), which is compulsively readable. A good collection of formulas is in Abramowitz and Stegun (1964). The following notation is standard:

    θ_j(0, q) = θ_j(0|τ),   q = e^{iπτ}.   (6)
Thus θ_j(0|ix/π) corresponds to taking q = e^{-x} in θ_j(0, q). Formulas such as the following can be found in Hancock (1909, Chapter XVIII): (7)

Similar formulas exist for θ₃, etc.; see Hancock or Jacobi (1829), who gives a list of 47 such relationships. It can be shown that θ_j(0|ix/π) has an algebraic singularity at x = 0; hence Mellin transforms of θ₂, θ₃ - 1, θ₄ - 1, etc., have a half-plane of convergence. The Mellin transforms of theta functions generally involve meromorphic functions such as Riemann's zeta function, defined for Re s > 1 by

    ζ(s) = Σ_{n=1}^∞ 1/n^s.   (8)

We shall need the formulas

    (1 - 2^{1-s})ζ(s) = Σ_{n=1}^∞ (-1)^{n-1}/n^s,   Re s > 1;
    (1 - 2^{-s})ζ(s) = Σ_{n=0}^∞ 1/(2n+1)^s,   Re s > 1.   (9)

Another useful function is

    L(s) = Σ_{n=0}^∞ (-1)^n/(2n+1)^s,   Re s > 1,   (10)

which satisfies the relationship

    L(1 - s) = (2/π)^s Γ(s) sin(πs/2) L(s).   (11)

Obviously, L(s) can be expressed in terms of the generalized zeta function

    ζ(s, a) = Σ_{n=0}^∞ 1/(n + a)^s,   -a ∉ J⁰,  Re s > 1.   (12)
The Mellin transforms of powers of the theta functions can be found from such formulas as (7). For instance,

    ℳ[θ₂²(0|ix/π); s] = 4 ℳ[Σ_{n,k=0}^∞ (-1)^n e^{-(n+1/2)(2k+1)x}; s]
      = 4Γ(s) Σ_{n,k=0}^∞ (-1)^n (n + ½)^{-s} (2k + 1)^{-s}
      = 4(2^s - 1)Γ(s)ζ(s)L(s).   (13)
Mellin transforms of products of theta functions can be found by using the Landen transformations,

    θ₂(0, q)θ₃(0, q) = ½θ₂²(0, q^{1/2}),
    θ₂(0, q)θ₄(0, q) = ½e^{-πi/4}θ₂²(0, iq^{1/2}),   (14)
    θ₃(0, q)θ₄(0, q) = θ₄²(0, q²),

and the formula

    ℳ{f(ax); s} = a^{-s} ℳ{f(x); s}.   (15)
Table I gives some of the Mellin transforms that can be found this way.

Table I
Mellin Transforms Involving θ_j = θ_j(0|ix/π)

f(x)              ℳ(f; s)
θ₂                2(2^{2s} - 1)Γ(s)ζ(2s)
θ₃ - 1            2Γ(s)ζ(2s)
θ₄ - 1            2(2^{1-2s} - 1)Γ(s)ζ(2s)
θ₂²               4(2^s - 1)Γ(s)ζ(s)L(s)
θ₃² - 1           4Γ(s)ζ(s)L(s)
θ₄² - 1           4(2^{1-s} - 1)Γ(s)ζ(s)L(s)
(θ₃ - 1)²         4Γ(s)[ζ(s)L(s) - ζ(2s)]
(θ₄ - 1)²         4(1 - 2^{1-2s})Γ(s)[ζ(2s) - ζ(s)L(s)]
θ₂θ₃              2^{s+1}(2^s - 1)Γ(s)ζ(s)L(s)
θ₂θ₄              -2^{2-s}Γ(s)[ζ(2s) + (1 - 2^{1-s})ζ(s)L(s)]
θ₃θ₄ - 1          2^{2-s}(2^{1-s} - 1)Γ(s)ζ(s)L(s)
θ₂θ₃θ₄            2^{s+1}Γ(s)L(2s - 1)
θ₂⁴               16(1 - 2^{1-s})(1 - 2^{-s})Γ(s)ζ(s)ζ(s - 1)
θ₃⁴ - 1           8(1 - 2^{2-2s})Γ(s)ζ(s)ζ(s - 1)
θ₄⁴ - 1           -8(1 - 2^{1-s})(1 - 2^{2-s})Γ(s)ζ(s)ζ(s - 1)
θ₂²θ₃²            2^{s+2}(1 - 2^{1-s})(1 - 2^{-s})Γ(s)ζ(s)ζ(s - 1)
θ₂²θ₄²            2^{s+2}Γ(s)L(s)L(s - 1)
θ₃²θ₄² - 1        -2^{3-s}(1 - 2^{2-s})(1 - 2^{1-s})Γ(s)ζ(s)ζ(s - 1)
To see how these formulas can be used to obtain closed-form expressions for lattice sums, consider

    Σ'_{-∞}^{∞} (m₁² + m₂² + m₃² + m₄²)^{-s} = (1/Γ(s)) ∫₀^∞ x^{s-1} Σ' e^{-(m₁²+m₂²+m₃²+m₄²)x} dx
      = (1/Γ(s)) ∫₀^∞ x^{s-1}[θ₃⁴(0|ix/π) - 1] dx
      = 8(1 - 2^{2-2s})ζ(s)ζ(s - 1).   (16)

[Later it is shown that this sum converges for Re s > p/2 = 2. Since ζ(s - 1) has a pole at s = 2, the result is sharp.] As another example, consider

    Σ_{m=1}^∞ Σ_{n=1}^∞ (m² + n²)^{-s} = (1/4Γ(s)) ∫₀^∞ x^{s-1}[θ₃(0|ix/π) - 1]² dx = ζ(s)L(s) - ζ(2s).   (17)
A short table (Table II) lists two-dimensional sums determined by Glasser.

Table II
S = Σ'_{-∞}^{∞} f(m, n)

f(m, n)                                     S
(m² + n²)^{-s}                              4ζ(s)L(s)
(-1)^{m+n}(m² + n²)^{-s}                    -4(1 - 2^{1-s})ζ(s)L(s)
(-1)^{n+1}(m² + n²)^{-s}                    2^{2-s}(1 - 2^{1-s})ζ(s)L(s)
[(2m + 1)² + (2n + 1)²]^{-s},  m, n ≥ 0     2^{-s}(1 - 2^{-s})ζ(s)L(s)
(m² + 4n²)^{-s}                             2(1 - 2^{-s} + 2^{1-2s})ζ(s)L(s)
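The first entry of the table can be checked by brute force: a truncated double sum over the square |m|, |n| ≤ N already agrees with 4ζ(s)L(s) to many digits at s = 3, where L(3) = π³/32 in closed form and ζ(3) is Apery's constant.

```python
import math

def lattice_sum(s, N=120):
    # truncation of sum' (m^2 + n^2)^(-s), the origin omitted
    return sum((m * m + n * n) ** (-s)
               for m in range(-N, N + 1) for n in range(-N, N + 1)
               if (m, n) != (0, 0))

zeta3 = 1.2020569031595943        # Apery's constant, zeta(3)
print(lattice_sum(3.0), 4 * zeta3 * math.pi ** 3 / 32)
```

The truncation error decays like N^{-4} here, so even modest N gives six or more correct figures.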
Certain other related sums have been obtained, i.e.,

    Σ'_{-∞}^{∞} (m² + mn + n²)^{-s} = 6ζ(s)g(s),   g(s) = Σ_{n=0}^∞ [(3n + 1)^{-s} - (3n + 2)^{-s}]   (18)

(Fletcher et al., 1962, p. 95), and (19), whose derivation is rather complicated (Glasser, 1973b). Obviously, the following case can be expressed by a single sum:

    Σ_{m_j ≥ 1} (m₁ + m₂ + ... + m_p)^{-s} = Σ_{k=0}^∞ C(k + p - 1, k) (k + p)^{-s}.   (20)

The difficulty in computing odd-dimensional sums by the use of theta functions is that most of the known theta function identities involve an even number of theta functions. Glasser (1973b) uses a number-theoretic approach to obtain additional sums, and the theory of basic hypergeometric series (Glasser, 1975) can be used to deduce the five-dimensional sum

    Σ_{m₁≥0; m₂,...,m₅≥1} (m₁m₂ + m₁m₃ + m₃m₄ + m₄m₁ + m₂m₅)^{-s} = ζ(s)ζ(s - 2) - ζ²(s - 1).   (21)
(The region of convergence of this sum cannot be deduced from the theorem of Section 13.2.)

13.2.2. Approximate Methods: The Poisson Summation Formula
Many approximation techniques have been developed to deal with lattice sums, beginning, perhaps, with Born's and Huang's approach, which uses values of the incomplete gamma function. That approach is not very adaptable to general values of s. Other approaches (van der Hoff and Benson, 1953; Benson and Schreiber, 1955; Hautot, 1974) use methods that convert the sum to a multidimensional sum involving the modified Bessel functions K_ν. This might, at first glance, seem to be compounding the problems. However, the transformed sums converge with extraordinary rapidity, and often the contributions at just a few lattice points serve to give six- or eight-place accuracy. Several approaches are possible, including one (Hautot, 1974) using Schlomilch series. My own preference is to begin with the following striking result, which can be found in any book on Fourier methods [e.g., Butzer and Nessel (1971, p. 202)].
Theorem. Let f ∈ L(-∞, ∞),

    F(x) = ∫_{-∞}^{∞} e^{-ixt} f(t) dt,   x ∈ ℛ.   (1)

Then, if f is of bounded variation,

    2π Σ_{k=-∞}^{∞} f(x + 2kπ) = lim_{n→∞} Σ_{k=-n}^{n} e^{ikx} F(k),   x ∈ ℛ,   (2)

where, at points of discontinuity, f(a) = ½[f(a+) + f(a-)].

Proof. See Butzer and Nessel (1971). ∎
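The theorem is easy to test numerically. For the Gaussian f(t) = e^{-at²} the transform (1) is F(x) = √(π/a) e^{-x²/4a}, and both sides of (2) converge after a handful of terms.

```python
import math

def lhs(x, a, K=12):
    # 2*pi times the periodized Gaussian on the left of (2)
    return 2 * math.pi * sum(math.exp(-a * (x + 2 * math.pi * k) ** 2)
                             for k in range(-K, K + 1))

def rhs(x, a, K=40):
    # Fourier side of (2); F(k) = sqrt(pi/a) * exp(-k^2 / (4a))
    F = lambda k: math.sqrt(math.pi / a) * math.exp(-k * k / (4.0 * a))
    return F(0) + 2 * sum(math.cos(k * x) * F(k) for k in range(1, K + 1))

print(lhs(0.7, 0.3), rhs(0.7, 0.3))
```

Both sums are dominated by Gaussian tails, so a dozen terms on either side already agree to full double precision.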
There follows a list of formulas that will subsequently be of use. For the computation of the integrals involved, consult Erdelyi et al. (1954, Vol. I).

f(t) = e^{-at²} cos bt,  a ∈ ℛ⁺, b ∈ ℛ:

    Σ_{k=-∞}^{∞} e^{-a(x+2kπ)²} cos[b(x + 2kπ)] = (1/2√(aπ)) Σ_{k=-∞}^{∞} e^{ikx} e^{-(k²+b²)/4a} cosh(bk/2a).   (S-1)

f(t) = |t|^{±μ} K_μ(a|t|),  a ∈ ℛ⁺:

    Σ_{k=-∞}^{∞} e^{ikx}(k² + a²)^{∓μ-1/2} = [2√π / ((2a)^{±μ} Γ(±μ + ½))] Σ_{k=-∞}^{∞} |x + 2kπ|^{±μ} K_μ(a|x + 2kπ|).   (S-2)

(By analytic continuation and use of the well-known asymptotic properties of K_μ, one finds that these sums are convergent and equal when Re(±μ) > 0.)

f(t) = (t² + a²)^{-1/2} e^{-b√(t²+a²)},  a, b ∈ ℛ⁺:

    π Σ_{k=-∞}^{∞} [(x + 2kπ)² + a²]^{-1/2} e^{-b√((x+2kπ)²+a²)} = Σ_{k=-∞}^{∞} e^{ikx} K₀(a√(b² + k²)).   (S-3)

f(t) = e^{-b√(t²+a²)},  a, b ∈ ℛ⁺:

    (π/ab) Σ_{k=-∞}^{∞} e^{-b√((x+2kπ)²+a²)} = Σ_{k=-∞}^{∞} e^{ikx} (b² + k²)^{-1/2} K₁(a√(b² + k²)).   (S-4)

f(t) = (t² + a²)^{±μ/2-1/4} K_{±μ-1/2}(b√(t²+a²)),  a, b ∈ ℛ⁺:

    √(2π) a^{∓μ} b^{1/2∓μ} Σ_{k=-∞}^{∞} [(x + 2kπ)² + a²]^{±μ/2-1/4} K_{±μ-1/2}(b√((x+2kπ)²+a²))
      = Σ_{k=-∞}^{∞} e^{ikx}(k² + b²)^{∓μ/2} K_μ(a√(b² + k²)),   μ ∈ 𝒞.   (S-5)
We are now in a position to complete the proof of the theorem in Section 13.2. Let

    S_N = Σ'_{|m_j|≤N} f(M) ‖M‖^{-2s}.   (3)

(Without loss of generality we may assume that A = 0.) Then

    S_N = (1/Γ(s)) ∫₀^∞ x^{s-1} g dx,   (4)

    g = Σ'_{|m_j|≤N} f(M) e^{-‖M‖²x}.   (5)

We get

    h = |x^{s-1} g| ≤ C x^{Re s - 1}[θ₃(0|ix/π)^p - 1]   (6)

and, by (S-1) with b = 0,

    h = O(x^{Re s - 1 - p/2}),   x → 0+,   (7)

while as x → ∞, h = O(e^{-ax}), a > 0. Thus, under the stated conditions, h is integrable and, by dominated convergence, lim_{N→∞} S_N exists. ∎

Expansion (S-2) will be the principal tool. In (S-2) take the upper sign, let k → m_p, μ → s - ½, and a = (m₁² + ... + m²_{p-1} + δ²)^{1/2};
multiply both sides by e^{ix(m₁+...+m_{p-1})} and sum:

    Σ_{-∞}^{∞} e^{ix(m₁+...+m_p)} [‖M_p‖² + δ²]^{-s}
      = [2^{3/2-s}√π/Γ(s)] Σ_{-∞}^{∞} e^{ix(m₁+...+m_{p-1})} |x + 2m_pπ|^{s-1/2} [‖M_{p-1}‖² + δ²]^{-(s/2-1/4)} K_{s-1/2}(√(‖M_{p-1}‖² + δ²) |x + 2m_pπ|).   (8)

Now let δ → 0.
The result can be written

    Σ' e^{ix(m₁+...+m_p)} ‖M_p‖^{-2s} = 2 Σ_{k=1}^∞ (cos kx)/k^{2s}
      + [2^{3/2-s}√π/Γ(s)] Σ_{M_{p-1}≠0} Σ_{m_p=-∞}^{∞} e^{ix(m₁+...+m_{p-1})} [|x + 2m_pπ|^{s-1/2}/‖M_{p-1}‖^{s-1/2}] K_{s-1/2}(‖M_{p-1}‖ |x + 2m_pπ|).   (9)

As it stands, this holds only for x ≠ 2jπ, j ∈ J. For x → 0+, the m_p = 0 term must be peeled off and the relationship

    lim_{z→0} z^ν K_ν(z) = 2^{ν-1}Γ(ν)   (10)
used. The result gives the original sum as a sum over one lower dimension plus a rapidly convergent series of Bessel functions:

    Σ' ‖M_p‖^{-2s} = 2ζ(2s) + [√π Γ(s - ½)/Γ(s)] Σ' ‖M_{p-1}‖^{1-2s}
      + [2π^s/Γ(s)] Σ_{M_{p-1}≠0} Σ_{m_p≠0} [|m_p|^{s-1/2}/‖M_{p-1}‖^{s-1/2}] K_{s-1/2}(2π|m_p| ‖M_{p-1}‖).   (11)
The Bessel function expansions on the right of (9) and (11), still expansions over p-space, converge with great rapidity. Also, for the values of x of greatest interest, the cosine series on the right of (9) can be evaluated in terms of zeta and related functions. For other values of s, it can be dealt with by the asymptotic techniques of Section 1.6. In many cases, s is an integer; the series on the right then becomes a series of exponentials. (An example is given later on.) In any event, the Bessel function K_ν can be considered a known quantity, its computation today being standard software. For s = ½ in the case of a three-dimensional sum, there is convergence provided x ≠ 2jπ, j ∈ J. The (m₁, m₂) sum can be expressed in terms of exponentials by (S-3), i.e.,
    Σ_{m₁=1}^∞ Σ_{m₂=-∞}^∞ e^{ix(m₁+m₂)} K₀(a√(m₁² + m₂²))
      = π Σ_{m₂=-∞}^∞ [(x + 2m₂π)² + a²]^{-1/2} (e^{√((x+2m₂π)²+a²) - ix} - 1)^{-1},   (12)
and this can be used in an obvious way in (11). The same applies, with (S-4), when p = 3 and s = 3/2. As an example of how an error analysis of these sums proceeds, let us examine (11). Assume the Bessel function sum is truncated, with all points inside the hypercube

    max_{1≤j≤p} |m_j| ≤ N   (13)

included. Let

    R_N = Σ_{|m_j|≥N+1} [|m_p|^{s-1/2}/‖M_{p-1}‖^{s-1/2}] K_{s-1/2}(2π|m_p| ‖M_{p-1}‖).   (14)
For an analysis of R_N, we shall need several preliminary results useful with sums of this kind.

Lemma 1. Let α, n > 0, β > α/n. Then

    Σ_{k=n}^∞ k^α e^{-βk} ≤ n^α e^{-βn} (1 - e^{α/n-β})^{-1}.   (15)
Proof. By calculus one finds that

    x^α ≤ (α/δ)^α e^{-α} e^{δx},   α, δ, x > 0.

Letting x → k, δ → α/n, and substituting the result in (15) proves the lemma. ∎
Lemma 2. For λ ≥ 1, Re ν > -½,

    |K_ν(λ)e^λ / λ^ν| ≤ [Γ(Re ν + ½)/|Γ(ν + ½)|] K_{Re ν}(1) e ≡ C_ν.   (16), (17)

Proof. This follows immediately from the integral

    K_ν(z)e^z / z^ν = [√π / (2^ν Γ(ν + ½))] ∫₀^∞ e^{-zt} [t(1 + t/2)]^{ν-1/2} dt,   Re ν > -½,  Re z > 0. ∎   (18)
Lemma 3.   (19)

Proof.   (20)

so

    (m₁² + ... + m_p²)^{1/2} ≥ (1/p)(m₁ + ... + m_p),   (21)

and the lemma follows. ∎
A straightforward application of all these results shows that for s ≥ ½,

    |R_N| ≤ [2^{s+p+1/2} π^{2s-1/2} C_{s-1/2}/Γ(s)] (N + 1)^{2s-1} exp{-[2π(p - 1)/p](N + 1)²}
      × {1 - exp[-2π(N + 1)/p]}^{1-p} {1 - exp[(2s - 1)/(N + 1) - 2π(p - 1)(N + 1)/p]}^{-1}
      ≈ [2^{s+p+1/2} π^{2s-1/2}/Γ(s)] K_{s-1/2}(1) exp{-[2π(p - 1)/p](N + 1)²},   N → ∞.   (22)

For instance, if N = 2, the truncated sum will contain 26 terms if p = 3. The exponential term above is 4.2 × 10^{-17}. If only seven terms are taken (N = 1), the exponential term is still only 5.3 × 10^{-8}.
13. Multiple Sequences
The case s = 1 of (9) is particularly important. It gives eix(m, + ... +mpl
v- 2 2=n -oomt+···+mp 00
'\' 00
L.
x
L.
eix(m'+"'+m p'
l )
-00
(ml ... ·.mp-,l"O
e-j;;'T+"'+~--;lx+2mp"l
00
cos kx
Jmi + ... + m;-t
k=t
k2
+ 2L
(23)
a rapidly convergent series of exponentials. Obviously the forgoing procedure is easily modified to account for sums with denominator 11M - Ails, A = (at, ... , a p ) . For many special cases, see Hautot's paper. 13.2.3. Laguerre Quadrature
This is an elementary but very accurate method for hand computations. It can be applied for certain functionsfwhen s - 1 - 1P is a value {3 for which the abscissas and weights for the Laguerre quadrature formula for xfJe- x have been tabulated, e.g., {3 = 0, -t, -1-. -1, etc (Concus et al., 1963.) This is illustrated for f == 1.
(1) h(x)
=
exx P/ 2[03(0Iix/ny - 1].
The integral on the right is easily evaluated by Laguerre quadrature, since the series for 0 3 converges with great rapidity. For example, let P = 2, s = l
(2)
Laguerre quadrature with just three abcissas yields S = 9.0352, while the true value is 9.0336.
Appendix
A.I. Lagrangian Interpolation
Let x, yEC(;'s, and denote by p~k)(Z) the polynomial of degree k that at assumes the values Yn, Yn+l'···' Yn+k' respectively. (It is assumed the x j are distinct.) Then X n, Xn-b .•• , Xn+k
(k)( ) _ " k
~
Z -
~h+m
m=O
Il k
Xn+i
Z -
(
i=O X n + m i*m
)
Xn+i
.
(1)
It is easily shown that p~k) satisfies the recursion relationship (k+1)_ Pn -
(
(k)
) X n-ZPn+l-
Xn -
(
) (k) X n+k+l- ZPn
Xn+k+ 1
,
n,
k>O
-
,
(0)_ Pn - Yn,
n 2:: 0, (2)
by putting z = Xi' n :s; i :s; n + k + 1. Another useful expression for p~k) comes from expanding the determinant
    | p_n^{(k)}   y_n      y_{n+1}    ...  y_{n+k}   |
    | 1           1        1          ...  1         |
    | z           x_n      x_{n+1}    ...  x_{n+k}   |
    | z²          x_n²     x²_{n+1}   ...  x²_{n+k}  |
    | ...                                            |
    | z^k         x_n^k    x^k_{n+1}  ...  x^k_{n+k} | = 0.   (3)
Let u_j ∈ 𝒞, and denote the Vandermonde determinant V_m by

    V_m(u₁, u₂, ..., u_m) = | 1  u₁  u₁²  ...  u₁^{m-1} |
                            | 1  u₂  u₂²  ...  u₂^{m-1} |
                            | ...                       |
                            | 1  u_m u_m² ...  u_m^{m-1}| = Π_{1≤i<j≤m} (u_j - u_i).   (4)
Expanding the determinant (3) by minors of the first column and using (4) shows that the determinantal expression is the same as the sum (1).
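The recursion (2) translates directly into a small routine: the zeroth row of the tableau is y itself, and each pass raises the interpolating degree by one.

```python
def neville(x, y, z):
    # evaluate at z the polynomial interpolating (x[i], y[i]) via recursion (2)
    m = len(x)
    p = list(y)                            # p_n^{(0)} = y_n
    for k in range(1, m):
        p = [((x[n] - z) * p[n + 1] - (x[n + k] - z) * p[n])
             / (x[n] - x[n + k]) for n in range(m - k)]
    return p[0]                            # p_0^{(m-1)}(z)

print(neville([0.0, 1.0, 2.0], [0.0, 1.0, 4.0], 3.0))   # y = x^2 gives 9.0
```

Each stage overwrites the previous row in place, so the storage is one list of length at most len(x).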
A.2. The Formula for the ε-Algorithm

The proof of Eqs. 6.7(1)-6.7(3) depends on two determinantal identities. It will be very useful to use Aitken's shorthand notation for determinants, writing only diagonal elements. For instance, |a₁b₃d₄e₇| denotes

    | a₁  a₃  a₄  a₇ |
    | b₁  b₃  b₄  b₇ |
    | d₁  d₃  d₄  d₇ |
    | e₁  e₃  e₄  e₇ |,   (1)

and so forth. The two identities are the obvious generalizations to n × n determinants of (2), which relates determinants with different first rows, and

    |a₁b₂c₃d₄||a₁b₂c₃e₅| - |a₁b₂c₃d₅||a₁b₂c₃e₄| = |a₁b₂c₃d₄e₅||a₁b₂c₃|,   (3)

which is an expression of the cross product of determinants whose last rows and columns differ in a certain way [see Aitken (1956, p. 108, No. 2; p. 49, No. 8)]. First, Eq. 6.8(1) is true when k = 1, for (4).
Next consider the case k = 2m, m ≥ 1. Let Q_n be the determinantal ratio (5) and Q'_n the determinantal ratio (6); both are ratios of determinants whose entries are the quantities s_{n+i}, Δs_{n+i}, and Δ²s_{n+i}, 0 ≤ i ≤ 2m. We must show these are the same. Rearranging the elements of the first gives (7), and applying the first determinantal identity (2) to numerator and denominator yields Eq. (8). The second quantity, Eq. (6), may be written

    Q'_n = (-1)^k D_n / ...,   (9)

where D_n is a determinant of the same kind. On D_n we use the second identity (3) to find (10).
Elementary determinant manipulations show that the first factor above is (-1)^k times the first factor in the denominator of Q_n. Thus Q_n = Q'_n. The proof for k = 2m + 1 is similar.

A.3. Sylvester's Expansion Theorem

Let A be an n × n determinant, n ≥ 3, with elements a_ij, and denote the minor of the element a_ij by M_ij. Let
    D = | M₁₁  M₁ₙ |
        | Mₙ₁  Mₙₙ |.   (1)

Then

    D = A · A_{1n;1n},   (2)

where A_{1n;1n} is the minor of A obtained by deleting rows 1 and n and columns 1 and n [see Muir (1960, p. 132)].
Bibliography
Abramowitz, M., and Stegun, I. A. (eds.) (1964). "Handbook of Mathematical Functions." National Bureau of Standards Applied Mathematics Series #55, Washington, D.C.
Agnew, R. P. (1952). Proc. Amer. Math. Soc. 3, 550-556.
Agnew, R. P. (1957). Michigan Math. J. 4, 105-128.
Aitken, A. C. (1926). Proc. Roy. Soc. Edinburgh Sect. A 46, 289-305.
Aitken, A. C. (1931). Proc. Roy. Soc. Edinburgh Sect. A 51, 80-90.
Aitken, A. C. (1956). "Determinants and Matrices." Oliver & Boyd, London.
Allen, G. D., Chui, C. K., Madych, W. R., Narcowich, F. J., and Smith, P. W. (1975). J. Approx. Theory 14, 302-316.
Atchison, T. A., and Gray, H. L. (1968). SIAM J. Numer. Anal. 5, 451-459.
Bajsanski, B., and Karamata, J. (1960). Acad. Serbe Sci. Publ. Inst. Math. 14, 109-114.
Baker, G. A., Jr., and Gammel, J. L. (eds.) (1970). "The Pade Approximant in Theoretical Physics." Academic Press, New York.
Banach, S. (1932). "Theorie des Operations Lineaires." Chelsea, New York.
Baranger, J. (1970). C.R. Acad. Sci. Paris Ser. A 271, 149-152.
Bauer, F. L. (1959). In "On Numerical Approximation" (R. E. Langer, ed.). Univ. of Wisconsin Press, Madison, Wisconsin.
Bauer, F. L. (1965). In "Approximation of Functions" (H. Garabedian, ed.). American Elsevier, New York.
Bauer, F. L. et al. (1963). Proc. Symp. Appl. Math. Amer. Math. Soc. 15.
Beardon, A. F. (1968). J. Math. Anal. Appl. 21, 344-346.
Beckenbach, E. F., and Bellman, R. (1961). "Inequalities." Springer-Verlag, Berlin and New York.
Bellman, R. (1961). "A Brief Introduction to Theta Functions." Holt, New York.
Bellman, R. (1970). "Introduction to Matrix Analysis," 2nd ed. McGraw-Hill, New York.
Bellman, R., and Cooke, K. L. (1963). "Differential-Difference Equations." Academic Press, New York.
Benson, G. C., and Schreiber, H. P. (1955). Canad. J. Phys. 33, 529-540.
Birkhoff, G. D. (1930). Acta Math. 54, 205-246.
Birkhoff, G. D., and Trjitzinsky, W. J. (1932). Acta Math. 60, 1-89.
Born, M., and Huang, K. (1954). "Dynamical Theory of Crystal Lattices." Oxford Univ. Press, London and New York.
250
Bibliography
Brezinski, C. (1970). C.R. Acad. Sci. Paris Ser. A 270,1252-1253. Brezinski, C. (1972). RAIRO RI, 61-66. Brezinski, C. (1975). Calcolo 12, 317-360. Brezinski, C. (1976). J. Comput. Appl. Math. 2,113-123. Brezinski, C. (1977). "'Acceleration de la Convergence en Analyse Nurnerique." SpringerVerlag, Berlin and New York. Brezinski, C. (1978). "Algorithmes d'Acceleration de la Convergence: Etude Numerique." Editions Technip, Paris. Bulirsch, R., and Stoer, J. (1966). Numer . Math. 8,1-13. Bulirsch, R., and Stoer, J. (1967). Numer. Math. 9, 271-278. Butzer, P. L., and Nessel, R. J. (1971). "Fourier Analysis and Approximation." Academic Press, New York. Chisolm, J. S. R. (1966). J. Math. Phys. 7, 39. Chow, Y. S., and Teicher, H. (1971). Ann. Math. Statist. 42, 401-404. Chrystal, G. (1959). "Algebra." Chelsea, New York. Concus, P., Cassatt, D., Jaehnig, G., and Melby, E. (1963). Math. Camp. 17, 245-256. Cooke, R. G. (1955). "Infinite Matrices and Sequence Spaces." Dover, New York. Cordellier, F. (1977). C.R. Acad. Sci. Paris Ser. A 284, 389-392. Cornyn, J. J., Jr. (1974). Direct Methods for Solving Systems of Linear Equations Involving Toeplitz or Hankel Matrices, NRL Memorandum Rep. 2920. Naval Research Laboratory, Washington, D.C. Cowling, V. F., and King, J. P. (1962/1963). J. Analyse Math. 10, 139-152. Davis, P. (1963). "Interpolation and Approximation." Ginn (Blaisdell), Waltham, Massachusetts. Dieudonne. J. (1969). "Foundations of Modern Analysis." Academic Press, New York. Erdelyi, A. (1956). "Asymptotic Expansions." Dover, New York. Erdelyi, A., Magnus, W., Oberhettinger, F., and Tricorni, F. G. (1953). "Higher Transcendental Functions," Vols. 1,2, and 3. McGraw-Hill, New York. Erdelyi, A .. Magnus. W., Oberhettinger, F., and Tricomi. F. G. (1954). "Tables of Integral Transforms," Vols. I and 2. McGraw-Hill. New York. Esser, H. (1975). Computing 14, 367-369. Fletcher, A., Miller, J. C. P., Rosenhead, L., and Comrie, L. J. (1962). 
"'An Index of Mathematical Tables." Vol. I. Oxford Univ. Press (Blackwell), London and New York. Freud, G. (1966). "Orthogonal Polynomials." Pergamon, Oxford. Gantrnacher, F. R. (1959). "The Theory of Matrices," Vols. I and 2, Chelsea, New York. Garreau, G. A. (1952). Nederl. Akad. Wetensch. Proc. Ser. A 14,237-244. Gekeler, E. (1972). Math. Camp. 26, 427-435. Germain-Bonne, B. (1973). RAIRO RI, 84-90. Germain-Bonne, B. (1978). Thesis, Univ. des Sciences et Techniques de Lille. Gilewicz, J. (1978). "Approximants de Pade," Lecture Notes in Mathematics #667. SpringerVerlag, Berlin and New York. Glasser, M. L. (l973a). 1. Malh. Phvs. 14,409-414. Glasser, M. L. (l973b). J. Math. Phys. 14,701-703. Glasser, M. L. (1974).1. Math. Phys. 15, 188-189. Glasser, M. L. (1975). J. Math. Phys. 16, 1237-1238. Goldsmith, D. L. (1965). Amer. Math. Monthly 72,523-525. Golomb, M. (1943). Bull. Amer. Math. Soc. 49, 581-592. Gordon, P. (1975). SIAM 1. Math. Anal. 6, 860-867. Gray, H. L., and Atchison, T. A. (1967). SIAM J. Numer. Anal. 4, 363-371. Gray, H. L.. and Atchison, T. A. (I 968a). J. Res. Nat. Bur. Standards 72B, 29-31. Gray, H. L., and Atchison, T. A. (l968b). Math. Camp. 22, 595-606. Gray, H. L., and Clark, W. D. (1969). J. Res. Nat. Bur. Standards 73B, 251-273.
Bibliography
251
Greville, T. N. E. (1968). Univ. of Wisconsin Math. Res. Center Rep. #877.
Haber, S. (1977). SIAM J. Numer. Anal. 14, 668-685.
Hadamard, J. (1892). J. Math. Pures Appl. 8, 101-186.
Hancock, H. (1909). "Lectures on the Theory of Elliptic Functions," Vol. 1. Dover, New York.
Hardy, G. H. (1956). "Divergent Series." Oxford Univ. Press, London and New York.
Hardy, G. H., and Rogosinski, W. W. (1956). "Fourier Series." Cambridge Univ. Press, London and New York.
Hardy, G. H., and Wright, E. M. (1954). "An Introduction to the Theory of Numbers." Oxford Univ. Press, London and New York.
Hautot, A. (1974). J. Math. Phys. 15, 1722-1727.
Havie, T. (1979). BIT 19, 204-213.
Henrici, P. (1977). "Applied and Computational Complex Analysis," Vols. 1 and 2. Wiley (Interscience), New York.
Higgins, R. L. (1976). Thesis, Drexel Univ.
Householder, A. S. (1953). "Principles of Numerical Analysis." McGraw-Hill, New York.
Iguchi, K. (1975). Inform. Process. Japan 15, 36-40.
Iguchi, K. (1976). Inform. Process. Japan 16, 89-93.
Isaacson, E., and Keller, H. B. (1966). "Analysis of Numerical Methods." Wiley, New York.
Jacobi, C. G. J. (1829). "Fundamenta Nova Theoriae Functionum Ellipticarum." Königsberg.
Jacobi, C. G. J. (1846). J. Reine Angew. Math. 30, 127-156.
Jakimovski, A. (1959). Michigan Math. J. 6, 277-290.
Jameson, G. J. O. (1974). "Topology and Normed Spaces." Chapman & Hall, London.
Jones, B. (1970). J. Inst. Math. Appl. 17, 27-36.
Kantorovich, L. V., and Akilov, G. P. (1964). "Functional Analysis in Normed Spaces" (transl. by D. E. Brown and A. P. Robertson). Pergamon, Oxford.
King, R. F. (1979). SIAM J. Numer. Anal. 16, 719-725.
Knopp, K. (1947). "Theory and Application of Infinite Series." Hafner, New York.
Kress, R. (1971). Computing 6, 274-288.
Kress, R. (1972). Math. Comp. 26, 925-933.
Krylov, V. I. (1962). "Approximate Calculation of Integrals." Macmillan, New York.
Kummer, E. E. (1837). J. Reine Angew. Math. 16, 206-214.
Lambert, J. D., and Shaw, B. (1965). Math. Comp. 19, 456-462.
Lambert, J. D., and Shaw, B. (1966). Math. Comp. 20, 11-20.
Laurent, P. J. (1964). Thesis, Grenoble.
Levin, D. (1973). Internat. J. Comput. Math. B3, 371-388.
Livingston, A. E. (1954). Duke Math. J. 21, 309-314.
Lorch, L., and Newman, D. J. (1961). Canad. J. Math. 13, 283-298.
Lorch, L., and Newman, D. J. (1962). Comm. Pure Appl. Math. 15, 109-118.
Lotockii, A. V. (1953). Ivanov. Gos. Ped. Inst. Uch. Zap. Fiz.-Mat. Nauki 4, 61-91.
Lubkin, S. (1952). J. Res. Nat. Bur. Standards Sect. B 48, 228-254.
Luke, Y. L. (1969). "The Special Functions and Their Approximations," Vols. 1 and 2. Academic Press, New York.
Luke, Y. L. (1979). On a Summability Method, notes, Univ. of Missouri, Kansas City, Missouri.
Luke, Y. L., Fair, W., and Wimp, J. (1975). Comput. Math. Appl. 1, 3-12.
Lyness, J. N. (1970). Math. Comp. 24, 101-135.
Lyness, J. N. (1971). Math. Comp. 25, 59-78.
McLeod, J. B. (1971). Computing 7, 17-24.
McNamee, J., Stenger, F., and Whitney, E. L. (1971). Math. Comp. 25, 141-154.
Miller, K. S. (1974). "Complex Stochastic Processes." Addison-Wesley, Reading, Massachusetts.
Milne-Thomson, L. M. (1960). "The Calculus of Finite Differences." Macmillan, London.
de Montessus de Ballore, R. (1902). Bull. Soc. Math. France 30, 28-36.
Moore, E. H. (1920). Bull. Amer. Math. Soc. 26, 394-395.
Muir, T. (1960). "A Treatise on the Theory of Determinants." Dover, New York.
Nikolskii, S. M. (1948). Izv. Akad. Nauk SSSR Ser. Mat. 12, 259-278.
Olevskii, A. M. (1975). "Fourier Series with Respect to General Orthogonal Systems." Springer-Verlag, Berlin and New York.
Olver, F. W. J. (1974). "Asymptotics and Special Functions." Academic Press, New York.
Ortega, J. M., and Rheinboldt, W. C. (1970). "Iterative Solution of Nonlinear Equations in Several Variables." Academic Press, New York.
Ostrowski, A. M. (1966). "Solutions of Equations and Systems of Equations." Academic Press, New York.
Ostrowski, A. M. (1973). "Solutions of Equations in Euclidean and Banach Spaces." Academic Press, New York.
Overholt, K. J. (1965). BIT 5, 122-132.
Papoulis, A. (1965). "Probability, Random Variables, and Stochastic Processes." McGraw-Hill, New York.
Pennacchi, R. (1968). Calcolo 5, 37-50.
Penrose, R. (1955). Proc. Cambridge Philos. Soc. 51, 406-413.
Perron, O. (1929). "Die Lehre von den Kettenbrüchen." Chelsea, New York.
Perron, O. (1957). "Die Lehre von den Kettenbrüchen," 3rd ed., Vols. 1 and 2. Teubner, Stuttgart.
Petersen, G. M. (1966). "Regular Matrix Transformations." McGraw-Hill, New York.
Peyerimhoff, A. (1969). "Lectures on Summability." Lecture Notes in Mathematics 107. Springer-Verlag, Berlin and New York.
Pollaczek, F. (1956). "Sur une Généralisation des Polynomes de Jacobi." Gauthier-Villars, Paris.
Pyle, L. D. (1967). Numer. Math. 10, 86-102.
Rainville, E. D. (1960). "Special Functions." Macmillan, New York.
Reich, S. (1970). Amer. Math. Monthly 77, 283-284.
Richtmyer, R. D. (1957). "Difference Methods for Initial Value Problems." Wiley (Interscience), New York.
Rutishauser, H. (1954). Z. Angew. Math. Phys. 5, 233-251.
Rutishauser, H. (1957). "Der Quotienten-Differenzen-Algorithmus." Birkhäuser-Verlag, Basel.
Salzer, H. E. (1955). J. Math. Phys. 33, 356-359.
Salzer, H. E. (1956). MTAC 10, 149-156.
Salzer, H. E., and Kimbro, G. M. (1961). Math. Comp. 15, 23-29.
Samuelson, P. A. (1945). J. Math. Phys. 24, 131-134.
Scheid, F. (1968). "Numerical Analysis, Schaum's Outline Series." McGraw-Hill, New York.
Schmidt, J. R. (1941). Philos. Mag. 32, 369-383.
Schur, I. (1921). J. Reine Angew. Math. 151, 79-111.
Schwartz, C. (1969). J. Comput. Phys. 4, 19-29.
Schwartz, L. (1961/1962). In "Séminaire Bourbaki," fasc. 3. Benjamin, New York.
Shanks, D. (1955). J. Math. Phys. 34, 1-42.
Shaw, B. (1967). J. Assoc. Comput. Mach. 14, 143-154.
Shohat, J. A., and Tamarkin, J. D. (1943). "The Problem of Moments." American Mathematical Society, Providence, Rhode Island.
Shoop, R. A. (1979). Pacific J. Math. 80, 255-262.
Slater, L. J. (1960). "Confluent Hypergeometric Functions." Cambridge Univ. Press, London and New York.
Smith, A. C. (1978). Utilitas Math. 13, 249-269.
Smith, D. A., and Ford, W. F. (1979). SIAM J. Numer. Anal. 16, 223-240.
Szegő, G. (1959). "Orthogonal Polynomials." American Mathematical Society, Providence, Rhode Island.
Titchmarsh, E. C. (1939). "The Theory of Functions." Oxford Univ. Press, London and New York.
Todd, J. (ed.) (1962). "Survey of Numerical Analysis." McGraw-Hill, New York.
Traub, J. F. (1964). "Iterative Methods for the Solution of Equations." Prentice-Hall, Englewood Cliffs, New Jersey.
Trench, W. F. (1964). SIAM J. Appl. Math. 12, 515-522.
Trench, W. F. (1965). SIAM J. Appl. Math. 13, 1102-1107.
Tucker, R. R. (1967). Pacific J. Math. 22, 349-359.
Tucker, R. R. (1969). Pacific J. Math. 28, 455-463.
Tucker, R. R. (1973). Faculty Res. Bull. N.C. A and T State Univ. 65, 60-63.
Uspensky, J. V. (1928). Trans. Amer. Math. Soc. 30, 542-559.
van der Hoff, B. M. E., and Benson, G. C. (1953). Canad. J. Phys. 31, 1087-1091.
Vučković, V. (1958). Acad. Serbe Sci. Publ. Inst. Math. 12, 125-136.
Wall, H. S. (1948). "Analytic Theory of Continued Fractions." Chelsea, New York.
Wasow, W. (1965). "Asymptotic Expansions for Ordinary Differential Equations." Wiley, New York.
Whittaker, E. T., and Watson, G. N. (1962). "A Course of Modern Analysis." Cambridge Univ. Press, London and New York.
Wilansky, A., and Zeller, K. (1957). J. London Math. Soc. 32, 397-408.
Wimp, J. (1970). SIAM J. Numer. Anal. 7, 329-334.
Wimp, J. (1972). Math. Comp. 26, 251-254.
Wimp, J. (1974a). Computing 13, 195-203.
Wimp, J. (1974b). J. Approx. Theory 10, 185-198.
Wimp, J. (1974c). Numer. Math. 23, 1-17.
Wimp, J. (1975). Acceleration methods. In "Encyclopedia of Computer Science and Technology," Vol. 1. Dekker, New York.
Wright, E. M. (1955). J. Reine Angew. Math. 194, 66-87.
Wynn, P. (1956a). J. Math. Phys. 35, 318-320.
Wynn, P. (1956b). Math. Tables Aids Comput. 10, 91-96.
Wynn, P. (1956c). Proc. Cambridge Philos. Soc. 52, 663-671.
Wynn, P. (1959). Numer. Math. 1, 142-149.
Wynn, P. (1961). Nieuw Arch. Wisk. 9, 117-119.
Wynn, P. (1962). Math. Comp. 16, 301-322.
Wynn, P. (1963). Nordisk Tidskr. Informat.-Behandl. 3, 175-195.
Wynn, P. (1966). SIAM J. Numer. Anal. 3, 91-122.
Wynn, P. (1966). Univ. of Wisconsin Math. Res. Center Rep. #626.
Wynn, P. (1967). Univ. of Wisconsin Math. Res. Center Rep. #750.
Wynn, P. (1972). C. R. Acad. Sci. Paris Sér. A 275, 1065-1068.
Zeller, K. (1952). Math. Z. 56, 18-20.
Zeller, K. (1958). "Theorie der Limitierungsverfahren." Springer-Verlag, Berlin and New York.
Zemansky, M. (1949). C. R. Acad. Sci. Paris 228, 1838-1840.
Index
A
Aitken δ²-process, 104, 149-152
  applied to power series, 151
  generalized, 105, 154, 167, 184

B
Birkhoff-Poincaré scales, 15-23

C
Continued fractions, 156-165
Convergence
  equivalence to, 33
  hyperlinear, 154
  linear, 6
  logarithmic, 6

D
Deltoid, 5, 71-80
Difference equations, analytic theory, 16

E
ε-algorithm, 138-148
  generalization of, 144-146
  stability of, 141-142
Equivalence, asymptotic, 1
Euler's constant, 75
Exponential polynomials, 203
Extrapolation, deltoids obtained by, 73-76

F
Fixed points of differentiable functions, 146-148
Fourier coefficients, computation of, 205-207
Fourier series, summation of, 48-53

G
G-transform, 200-205

H
Hankel determinants, 14, 157
Heat conduction, equation for, 100
Hilbertian subspace, 90-94

I
Implicit summation, 171-174
Interpolation, Neville-Aitken formula for, 73
Iteration functions
  abstract spaces, 118-119
  construction of, 112-118

L
Laguerre quadrature, 91
Lebesgue constants, 48-53
Lozenge algorithms, 3-5
  linear, 67-76
  nonlinear, 101-106

M
Means, see Transformation
Method, see Transformation
Modulus of numerical stability, see Numerical stability

N
Numerical analysis, rational formulas for, 142-144
Numerical stability, modulus of, 29

O
Order symbols, 1-2

P
Padé approximants, see Rational approximations
Path, 3
Poisson summation formula, 238
Pollaczek polynomials, 59-63
Polynomials, orthogonal, 40-44, 80-83
Products, partial, growth of, 8

Q
Quadrature, numerical, 69-71
  based on BH protocol, 200-209
  based on cardinal interpolation, 77-80
  based on G-transform, 200-205
  based on Romberg integration, 67-71
  based on tanh rule, 207-209
Quotient-difference algorithm, 156-159

R
Rational approximations, 53-59
  gamma function, 58
  Gaussian hypergeometric function, 56-57
  Padé, 54-57, 128-136
  for Stieltjes integrals, 132-136
Rhomboid, 5, 80-83
Richardson extrapolation, 67-71
Romberg integration, see Quadrature

S
Saturation, 51-53
Scale, asymptotic, 1-2
Sequences
  complex, properties of, 5-12
  iteration, 106-108
  Laplace moment, 84-90
  linearly convergent, 6
  logarithmically convergent, 6
  Taylor, 96
  totally monotone, 12-14
  totally oscillatory, 12-14
Stieltjes integrals, quadrature formulas for, see Quadrature
Summation methods, see Transformation
Sums, lattice, 232-242

T
T-matrix, Abel, 66
Taylor formula, generalized, 146-148
Transformation
  accelerative, 3
  Brezinski-Havie, 175-209
    quadrature by, 200-209
  ε-algorithm, 120-148
    multiparameter, 166-167
    η-algorithm for, 160
  GWB, 106-108
  homogeneous, 5
  implicit summation, 171-174
  Levin t and u, 189-198
  linear, 5
  Lubkin, 152-153
  multiple sequences, 227-231
  nonlinear, 5
  Overholt, 108-110
  probabilistic, 210-226
  ρ-algorithm for, 168-169
  regular, 3
  Schmidt, 120-147
    geometric interpretation of, 136-137
  topological, 182-185
  θ-algorithm for, 169-171
  Toeplitz, 24-26
    applied to series of variable terms, 48-53
    band, 28
    based on power series, 94-100
    characteristic polynomials for, 28
    Chebyshev weights, 43-44
    Euler (E, q) method, 99
    Euler means, 34
    (f, γk) means, 51
    Hausdorff, 34
    Higgins weights, 45-46
    Lotockii, 44
    measure of, 28
    nonregular, 38-40
    optimal, 90-94
    orthogonal, 40-43, 80-83
    positive, 27
    rational approximations obtained with, 54
    Richardson procedure, 67-71
      generalized, 181
    Riesz means, 65
    Romberg weights, 44-45
      generalized, 181
    Salzer means, 35-38
    weighted means, 33
  translative, 5
  W, 152-153
Trench algorithm, 198-199
Triangle, 27